Analysis of Economic Indicators among countries
- 1 Abstract
- 2 Introduction
- 3 Background
- 4 Introduction to data
- 5 Analysis objectives
- 6 Univariate descriptive statistics
- 7 Multivariate analysis
- 8 Regression analysis
- 9 Checking for Multicollinearity issues
- 10 Conclusion
- 11 References
- 12 Comments
Abstract
This paper analyses differences among countries with respect to selected economic indicators and evaluates the relationships among these indicators, with the objective of understanding, both qualitatively and quantitatively, their impact on socio-economic conditions.
The analysis shows a significant correlation among some of the observed variables, which may indicate that GDP per capita is considerably influenced by certain economic factors of a country. One of the relevant findings is how a combination of Current Account Balance, Total Energy Consumption per Capita and External Debt is associated with different levels of GDP per capita.
Introduction
The considerable gap in wealth distribution among the world's nations has always been an issue in terms of efficient utilization of resources, inequality in standards of living, economic and political power, and challenges to the global environment. Economic globalization includes a variety of developments, ranging from emerging economies joining the WTO (e.g. China) to the reduction of trade barriers, which allows greater movement of capital, industry and labour. As a result, many developing countries have recently succeeded in reducing the gap in the world's wealth distribution (e.g. China, Brazil, India, Singapore). Nevertheless, many other countries in Africa and Latin America are still lagging behind in terms of economic prosperity.
The objective of this analysis is to observe several economic indicators for a number of countries, to quantify the differences among these variables and to look for correlations among them. By studying the structure and economic behaviour of countries whose individuals currently enjoy a high GDP per capita, one can try to project this trend and understand what challenges might arise when developing countries push to increase theirs as well.
Background
There is the question of how emerging economies will achieve growth, and at what expense. For instance, when France, the USA or Great Britain went through their industrial revolutions, environmental concerns were not an issue. Nowadays they are a hot issue on world leaders' agendas, but this poses a question: how can emerging countries grow at a sustainable rate while carrying the extra cost burden of environmental regulation, which demands low carbon emissions and greener technologies that usually require heavy upfront investment? Their arguments have valid points. Why do they have to pick up the bill when rich countries have done most of the damage? How can they compete globally on matters such as resource efficiency, green regulation and environmentally friendly technologies when they are hundreds of years behind? And, most importantly, how realistic and economically feasible is it for emerging countries to make up for lost time without following at least part of the developed countries' history?
If there are concerns about how emerging countries will arrive at the level of rich countries, there are also important economic implications of what may happen once they do. For example, as more Chinese, African and Mexican people use cars, buy home appliances and demand higher standards of living, how will the additional demand push up the price of oil, electricity, minerals, etc.? How will this demand influence production functions (i.e. will costs go down because of economies of scale, or up because of diminishing returns)?
Perhaps all these concerns are alarmist, and what may well happen is that, as pressure grows to stop depending on current economic models, more investment will flow into alternative fuels, more efficient technology and programmes for higher productivity. For example, as there is pressure to cut carbon emissions, people and governments are more willing to reconsider nuclear plants as an option.
Introduction to data
The data set selected for this analysis comprises economic indicators for 52 countries for the year 2004. The data were taken from Fedstats, US Government (http://www.eia.doe.gov./emeu/cabs/contents.html), cross-referenced with World Bank (http://www.worldbank.org/) information.
The countries were selected from a wide range of geographic locations, population sizes and economic weights in order to obtain a representative sample of world economies. The number of countries selected per geographic region is listed below:
- North America: 3
- Europe: 6
- South/Central America: 10
- Africa: 9
- Middle East: 7
- Asia: 13
- Australia: 1
- Ex-USSR: 3
The variables under study were the following economic indicators:
|Population:||Expressed as Number of Inhabitants|
|Religion:||Divided in 4 groups|
|Predominant Religion:||Expressed in % of Population belonging to predominant religion|
|Independence:||Expressed as Years since independence was acquired|
|Gross Domestic Product:||Expressed as annual GDP in USD Billion|
|Unemployment Rate:||Expressed in decimals|
|Total Per Capita Energy Consumption:||Expressed in million BTU per individual per year|
|Real GDP Growth Rate:||From 2003 to 2004|
|Current Account Balance:||Expressed in US Dollars Billion|
|External Debt:||Expressed in US Dollars Billion|
|Inflation Rate:||Expressed in decimals|
|GDP Per Capita:||Expressed in USD|
Analysis objectives
The goal of studying this data is to gain deeper insight into the following issues:
- How is wealth distributed in terms of amount, percentage, concentration, etc.?
- Is there a trend or indicator showing which countries are growing fastest or slowest?
- Which pattern does energy consumption follow with respect to GDP per capita?
- Which role can capital markets play in assisting a country's economic growth?
- Environmental and economic sustainability:
- What challenges may exist for the future, both for developed and emerging countries.
Univariate descriptive statistics
In this section we take a closer look at the variables that are directly related to our main variable of interest, GDP per capita. We consider GDP in absolute terms and GDP growth before analysing our measure of wealth, GDP per capita. The graphs displayed for every variable combine 4 descriptive techniques. Clockwise, starting in the upper left corner, these are histograms, boxplots, QQ-plots and dotplots. An analysis of the results for every variable is given after the graphs and the printout of the descriptive statistics.
The XploRe code used to create the plots can be found in the table on the right.
Gross domestic product (GDP)
Comparing the mean (715.134) with the median (116.8), and considering the minimum and maximum values, we can conclude that wealth is largely concentrated in a few countries, which act as outliers and pull the mean up, while most countries remain at the low end. From the data, a graph of the cumulative share of wealth by country was created to illustrate the global distribution of wealth.
|Mean:||715.134|
|Std.Error:||1798.52|
|Variance:||3.23467e+06|
|Minimum:||7.2|
|Maximum:||11800|
|Range:||11792.8|
|Lowest cases:||24 (Gabon): 7.2; 10 (Paraguay): 8.06; 36 (Azerbaijan): 8.7; 5 (Bolivia): 9.6; 29 (Bahrain): 11.8|
|Highest cases:||19 (UK): 2200; 40 (China): 2230; 15 (Germany): 2700; 43 (Japan): 4700; 2 (US): 11800|
|Median:||116.8|
|25% Quantile:||32|
|75% Quantile:||689|
|Skewness:||4.74071|
|Kurtosis:||28.2801|
|Excess:||25.2801|
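The summary measures in the printout above can be recomputed with a few lines of code. The following is a minimal sketch, not the XploRe routine used for the paper: it assumes population (divide-by-n) moments, and the `describe` helper and the sample values are purely illustrative.

```python
import math
import statistics


def describe(values):
    """Return summary statistics similar to those in the printout."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments with divisor n. XploRe's exact convention is not
    # stated in the text, so this choice is an assumption.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    kurtosis = m4 / m2 ** 2
    return {
        "mean": mean,
        "median": statistics.median(values),
        "std": math.sqrt(m2),
        "range": max(values) - min(values),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": kurtosis,
        "excess": kurtosis - 3.0,  # kurtosis relative to a normal distribution
    }


# Illustrative values only (not the paper's data): one large outlier
# pulls the mean far above the median and produces positive skewness.
sample = [1.0, 2.0, 3.0, 4.0, 100.0]
stats = describe(sample)
print(stats["mean"], stats["median"], stats["skewness"] > 0)
```

Two relationships can be read directly off the printout: Excess is Kurtosis minus 3 (28.2801 - 3 = 25.2801), and the value labelled Std.Error is the square root of the Variance (the square root of 3.23467e+06 is about 1798.5), so it behaves like a standard deviation rather than a standard error of the mean.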
Global wealth distribution
There is a clearly perceptible imbalance in the distribution of wealth. A small number of countries hold a large proportion of global wealth: 10 countries hold roughly 80 % of it, leaving the majority (42 countries in our model world) with only 20 %. The richest country considered, the US, alone holds 31 % of global wealth.
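The US share can be checked from the GDP printout alone, since the total GDP of the sample equals the mean times the number of countries. The sketch below uses only figures quoted in that printout; the helper `cumulative_shares` is a hypothetical name showing how a cumulative wealth curve could be built from a full list of GDP values.

```python
# Figures taken from the GDP printout (USD billion).
N_COUNTRIES = 52
MEAN_GDP = 715.134
TOP_FIVE_GDP = {"US": 11800, "Japan": 4700, "Germany": 2700,
                "China": 2230, "UK": 2200}

total_gdp = MEAN_GDP * N_COUNTRIES  # total = mean x number of countries
us_share = TOP_FIVE_GDP["US"] / total_gdp                # about 0.317
top_five_share = sum(TOP_FIVE_GDP.values()) / total_gdp  # about 0.635


def cumulative_shares(values):
    """Cumulative share of the total, richest first (hypothetical helper)."""
    ordered = sorted(values, reverse=True)
    total = sum(ordered)
    running, shares = 0.0, []
    for value in ordered:
        running += value
        shares.append(running / total)
    return shares


print(round(us_share, 3), round(top_five_share, 3))
```

Under these figures the US accounts for just under a third of the sample's GDP, in line with the 31 % quoted above, and the five largest economies together account for roughly 64 %.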
GDP growth rate
Analysing this data, we see that the overall mean is 5.5% and the median 5%, but the extremes show that certain economies seem to be stagnating while others are booming. It is interesting that 3 of the 5 lowest growth rates belong to developed countries (France, Germany and Italy), with an average of about 1%. On the other hand, all of the 5 highest belong to developing countries. China is the best example of an emerging economy that is positioning itself as a global competitor, an influential political and monetary player, and a large market for both demand and supply.
Several factors could contribute to this:
- As emerging economies develop infrastructure, industrial capabilities and open markets, they gain momentum in leapfrogging steps.
- Developed countries face diminishing productivity returns, ageing populations and higher burdens (taxes, salaries, unions, regulation), which hold back growth rates.
- It is important to consider that, even though growth rates in these developing countries are higher than in developed countries, their inflation rates are also considerably higher (an extreme example is Angola with 37%). Combined with political instability and currency volatility, this offsets the potential for investment and growth.
|Mean:||0.0550769|
|Std.Error:||0.0400024|
|Variance:||0.00160019|
|Minimum:||0.009|
|Maximum:||0.26|
|Range:||0.251|
|Lowest cases:||15 (Germany): 0.009; 17 (Italy): 0.01; 22 (Cote d'Ivoire): 0.011; 14 (France): 0.012; 24 (Gabon): 0.014|
|Highest cases:||13 (Venezuela): 0.093; 40 (China): 0.1; 12 (Uruguay): 0.123; 21 (Angola): 0.144; 36 (Azerbaijan): 0.26|
|Median:||0.05|
|25% Quantile:||0.035|
|75% Quantile:||0.064|
|Skewness:||2.76096|
|Kurtosis:||14.1475|
|Excess:||11.1475|
GDP per Capita
Of the 52 countries analysed, 3 of the 5 with the highest GDP per capita (US, Japan and Norway) owe these levels mostly to high literacy rates, technological advantages and infrastructure, which foster market efficiency and productivity. The remaining 2 countries (Qatar and the United Arab Emirates) have a large state income from oil exports. At the other end, the countries with the lowest GDP per capita generally lack solid judicial and legal frameworks to enforce contracts, have poor transport and industrial infrastructure, and have a history of socio-political instability.
|Mean:||12141.4|
|Std.Error:||15247.4|
|Variance:||2.32482e+08|
|Minimum:||397.04|
|Maximum:||55518.8|
|Range:||55121.7|
|Lowest cases:||39 (Bangladesh): 397.04; 51 (Vietnam): 627.28; 28 (Sudan): 641.991; 26 (Nigeria): 687.261; 35 (Yemen): 733.34|
|Highest cases:||43 (Japan): 36886.7; 2 (US): 39900.7; 33 (Qatar): 43682.2; 34 (United Arab Emirates): 47674.6; 18 (Norway): 55518.8|
|Median:||4298.47|
|25% Quantile:||1269.71|
|75% Quantile:||17142.6|
|Skewness:||1.23724|
|Kurtosis:||3.14239|
|Excess:||0.142389|
Multivariate analysis
From the scatter plot, we can see that the variable which seems to be correlated with GDP per capita is total energy consumption per capita, with a positive relationship. This can be observed in the graph in the left corner. External Debt also seems to make a slight contribution to GDP per capita, which has yet to be confirmed by further investigation. In the other scatterplots no correlation is detectable. All of these effects are studied in more detail below.
We use two techniques to display more than 4 variables in one graph, namely Andrews curves and parallel coordinates plots. Out of our data set we chose 18 countries. 9 countries were chosen from Europe, North America and Japan; they represent the rich countries and are coloured black in our analysis. The other 9 countries belong to the region Africa; they represent the poor countries and are coloured red. The number of observations was limited to 18 to keep the plots readable: with more than 20 observations the curves overlap so heavily that a clear distinction between them becomes difficult.
In the Andrews curves, a slight separation into subgroups is detectable. Poor countries seem to vary less than rich ones and also display fewer outliers. The separation follows the colour of the curves, thereby indicating a difference between rich and poor.
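For readers unfamiliar with the technique, an Andrews curve maps each observation (x1, x2, x3, ...) to the function f(t) = x1/sqrt(2) + x2 sin(t) + x3 cos(t) + x4 sin(2t) + x5 cos(2t) + ..., evaluated for t between -pi and pi. The sketch below implements this standard definition; it is not the XploRe code used for the paper, and the function name is illustrative.

```python
import math


def andrews_curve(x, t):
    """Evaluate the Andrews curve of one observation x at angle t."""
    value = x[0] / math.sqrt(2.0)
    for i, coef in enumerate(x[1:], start=1):
        k = (i + 1) // 2  # frequencies 1, 1, 2, 2, 3, 3, ...
        if i % 2 == 1:
            value += coef * math.sin(k * t)  # x2, x4, ... use sine terms
        else:
            value += coef * math.cos(k * t)  # x3, x5, ... use cosine terms
    return value


# One curve per country: evaluate on a grid of t values in [-pi, pi].
grid = [-math.pi + 2.0 * math.pi * j / 100 for j in range(101)]
curve = [andrews_curve([1.0, 0.5, -0.2, 0.3], t) for t in grid]
```

Because the first variables enter with the lowest frequencies, they dominate the visual shape of the curve, so the ordering of the variables matters. It is also common (an assumption here, since the text does not say so) to standardise the variables first, so that no single indicator dominates purely because of its scale.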
Parallel coordinates plot
The second tool used for analysing multivariate data sets is the parallel coordinates plot. We chose the same countries as in the section on Andrews curves, to allow an easier comparison of two similar tools.
The result of this analysis shows that variable 14 (GDP per capita) divides the observations relatively well into 2 subgroups. As in the previous section, the black values display a higher fluctuation, thereby confirming the results from the Andrews curves. Furthermore, the overlaps and clusters observed in the first variables indicate that religion, region and independence have no influence on GDP per capita and are not correlated with it. One black outlier clearly sticks out from the others: this value is the negative current account balance of the US. Poor countries show some very perceptible outliers in the variable "Inflation". In contrast, rich countries hardly show any variation in this variable.
Regression analysis
By regressing our main variable of interest, GDP per Capita, on the other variables, we try to detect which variables have a significant influence on it. Variables with a minor contribution are dropped, so that in the end we obtain a model which explains most of the variation in the observations using the lowest possible number of explanatory variables. R2 is the measure used to evaluate our model. The closer R2 is to 1, the better the model fits the data.
ANOVA
|Source|SS|df|MSS|F-test|P-value|
|Regression|9766446838.722|11|887858803.520|16.991|0.0000|
|Residuals|2090145385.181|40|52253634.630| | |
|Total Variation|11856592223.903|51|232482200.469| | |
Multiple R = 0.90759
R^2 = 0.82371
Adjusted R^2 = 0.77524
Standard Error = 7228.66756
PARAMETERS
|Variable|Coefficient|Beta|SE|StandB|t-test|P-value|
|Intercept|b[ 0,]|2800.7858|5664.4249|0.0000|0.494|0.6237|
|Population|b[ 1,]|0.0000|0.0000|-0.1390|-1.738|0.0899|
|Religion|b[ 2,]|-334.4732|1241.4180|-0.0217|-0.269|0.7890|
|% Pred Religion|b[ 3,]|2080.1449|5532.5956|0.0294|0.376|0.7089|
|Independence|b[ 4,]|1.1147|3.6410|0.0378|0.306|0.7611|
|GDP|b[ 5,]|2.4083|1.7230|0.2841|1.398|0.1699|
|Unemployment|b[ 6,]|45.8795|205.1109|0.0158|0.224|0.8241|
|Energy per Capita|b[ 7,]|58.5546|6.1636|0.6821|9.500|0.0000|
|Real GDP Growth|b[ 8,]|-31337.6233|30288.2382|-0.0822|-1.035|0.3070|
|Current Account|b[ 9,]|39.0404|19.8472|0.3120|1.967|0.0561|
|External Debt|b[10,]|2.9052|1.2462|0.3178|2.331|0.0249|
|Inflation|b[11,]|-12293.6769|20468.4940|-0.0478|-0.601|0.5515|
The regression model has an overall P-value of 0.000, which indicates that at least one of the coefficients differs significantly from zero. The gap between R2 (0.82) and adjusted R2 (0.78) suggests that several variables contribute little to the model. At a significance level of α = 0.05 we drop all variables except External Debt (P-value = 0.0249) and Total Energy Consumption per Capita (P-value = 0.000).
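The relationship between R2, adjusted R2 and the overall F-test can be verified from the figures in the printout. The sketch below uses the standard textbook formulas with n = 52 observations and p = 11 regressors; the function names are illustrative.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n observations and p regressors (plus intercept)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def f_statistic(r2, n, p):
    """Overall F statistic of the regression, expressed through R^2."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


# Full model from the printout: R^2 = 0.82371, n = 52, p = 11
full_adj = adjusted_r2(0.82371, 52, 11)  # about 0.7752, as reported
full_f = f_statistic(0.82371, 52, 11)    # about 16.99, as reported
print(round(full_adj, 4), round(full_f, 2))
```

Adjusted R2 penalises every additional regressor, so a noticeable gap to the plain R2 is a hint that some of the 11 variables add little explanatory power.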
It is important to mention that Current Account Balance also had a small P-value. Even though it did not make the cut in this first removal round at α = 0.05, this variable would have been revisited if subsequent models had not provided good predictions. In this first analysis Current Account Balance had a positive relationship with GDP per capita. A possible explanation is that a surplus indicates sustainable use of state resources as well as the ability to meet debt interest. The exception was the US, with a huge deficit of USD 821 billion, which, in combination with the bursting of the housing market bubble, lower productivity rates and a falling currency, could potentially explain the difficulties the American economy is facing in growing. On the other hand, looking at the countries with the highest surplus (China with USD 160.8 billion and Japan with USD 162.2 billion), we can better understand why these countries hold the biggest dollar reserves worldwide, with China approaching the 1 trillion figure. These reserves also help China to keep the yuan exchange rate under control, which maintains a large cost advantage and continues to boost exports globally.
Linear Regression using Total Energy Consumption and External Debt
ANOVA
|Source|SS|df|MSS|F-test|P-value|
|Regression|7792468767.331|2|3896234383.665|46.976|0.0000|
|Residuals|4064123456.572|49|82941295.032| | |
|Total Variation|11856592223.903|51|232482200.469| | |
Multiple R = 0.81070
R^2 = 0.65723
Adjusted R^2 = 0.64324
Standard Error = 9107.21116
PARAMETERS
|Coefficient|Beta|SE|StandB|t-test|P-value|
|b[ 0,]|7811.0200|2451.7036|0.0000|3.186|0.0025|
|b[ 1,]|65.2039|7.2080|0.7596|9.046|0.0000|
|b[ 2,]|-85254.4408|32006.4419|-0.2237|-2.664|0.0104|
The second regression shows that the overall model P-value remains at 0.000, while the individual P-values confirm statistical significance. Nevertheless, the linear plot as well as the low R2 value (65%) suggest that something is still missing in our analysis. As suggested by our statistics professors and by the literature, we try a different approach: we apply a logarithmic transformation to the data and rerun the analysis.
ANOVA
|Source|SS|df|MSS|F-test|P-value|
|Regression|87.300|2|43.650|125.246|0.0000|
|Residuals|17.077|49|0.349| | |
|Total Variation|104.377|51|2.047| | |
Multiple R = 0.91454
R^2 = 0.83639
Adjusted R^2 = 0.82971
Standard Error = 0.59035
PARAMETERS
|Coefficient|Beta|SE|StandB|t-test|P-value|
|b[ 0,]|4.2518|0.2815|0.0000|15.106|0.0000|
|b[ 1,]|0.7992|0.0668|0.7454|11.963|0.0000|
|b[ 2,]|0.2257|0.0440|0.3200|5.135|0.0000|
The regression model obtained after transforming the data seems to predict the dependent variable, GDP per Capita, better. The linear plot shows that the fitted relationship leaves smaller residuals. In the ANOVA statistics, the individual P-values are now 0.000 and R2 has increased to 83%, which indicates a better model.
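The text does not state exactly which transformation was applied. A minimal sketch of one common choice, a log-log model in which the natural logarithm is taken of the dependent variable and of both regressors, is given below. This choice, the function name `fit_log_log` and the synthetic data are assumptions for illustration only and do not reproduce the paper's data set.

```python
import numpy as np


def fit_log_log(y, x1, x2):
    """OLS of log(y) on log(x1) and log(x2).

    Returns the coefficients (b0, b1, b2) and R^2 on the log scale.
    """
    log_y = np.log(np.asarray(y, dtype=float))
    design = np.column_stack([
        np.ones(len(log_y)),                  # intercept column
        np.log(np.asarray(x1, dtype=float)),
        np.log(np.asarray(x2, dtype=float)),
    ])
    coef, *_ = np.linalg.lstsq(design, log_y, rcond=None)
    residuals = log_y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((log_y - log_y.mean()) ** 2).sum())
    return coef, 1.0 - ss_res / ss_tot


# Synthetic, noise-free data generated from y = e^4 * x1^0.8 * x2^0.2
x1 = np.array([10.0, 50.0, 120.0, 300.0, 40.0, 200.0])
x2 = np.array([5.0, 80.0, 20.0, 400.0, 150.0, 60.0])
y = np.exp(4.0) * x1 ** 0.8 * x2 ** 0.2
coef, r2 = fit_log_log(y, x1, x2)
print(np.round(coef, 3), round(r2, 3))
```

In a log-log model of this kind the slope coefficients can be read as elasticities. Note also that an R2 computed on the log scale measures fit to log(y), so it is not directly comparable with the R2 of a model fitted to the untransformed variable.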
Checking for Multicollinearity issues
To check whether the model suffers from multicollinearity between the predictors, we used a rule of thumb: if the predictors are highly correlated with the dependent variable but only weakly correlated with each other, there is little "noise" and the predictors can be treated as approximately independent.
Pearson correlation coefficient
Not recommended as neither the regressors nor the dependent variables follow a normal distribution
Spearman’s rank correlation coefficient
Kendall’s rank correlation coefficients
In the Spearman rank correlations, External Debt and Total Energy Consumption are each highly correlated with GDP per Capita (>0.5), while the correlation between the two predictors is small (<0.5). Kendall's coefficients follow the same pattern with smaller values. Both are rank-based, non-parametric measures of monotonic association, and Kendall's tau is usually smaller in absolute value than Spearman's rho, so the two measures lead to the same conclusion: by the rule of thumb above, multicollinearity between the two predictors does not appear to be a problem.
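Both rank correlation coefficients are simple to compute by hand. The sketch below implements the textbook formulas for data without ties (a simplifying assumption); the function names and the toy data are illustrative and are not taken from the paper.

```python
def ranks(values):
    """Ranks 1..n of the values; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rho via the sum of squared rank differences."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def kendall_tau(x, y):
    """Kendall's tau: (concordant - discordant) / number of pairs."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Toy data: two swapped neighbouring pairs in an otherwise increasing order
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
rho = spearman_rho(x, y)  # 0.8
tau = kendall_tau(x, y)   # 0.6
print(rho, tau)
```

Even on this small example tau is smaller than rho although both describe the same ordering, which illustrates why smaller Kendall values alone say nothing about whether a relationship is linear.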
Conclusion
GDP per capita can be explained to a large extent by just these two variables. First, we see a positive correlation with External Debt, which at first sight might seem odd: does this imply that the more indebted a country is, the better off its individuals are? Several hypotheses could explain this. One is that countries with higher economic weight have greater access to credit and can therefore afford a larger debt, which a poor country could not carry. Another is that these markets are more mature and stable, so that it is safer to hold, for example, a US government bond than to invest in a volatile Asian country. A third explanation has to do with returns: as some governments (e.g. the USA) and their corporations are eager to make a profit, it seems logical for them to borrow money from countries with lower interest rates (Japan, for example). Government expenditure seeks to improve education, infrastructure, technology and competition, or to curb unemployment and inflation. By obtaining a lower cost of capital, both governments and corporations profit from being indebted at low cost while putting the money into niches with a higher ROI.
The challenge arises when other markets start to mature (for example, the NYSE has been losing ground to Hong Kong and London, so market capitalisation is more difficult to obtain) and when governments of emerging countries stop financing growth in developed economies in order to invest either in themselves or where interest rates are higher, thereby raising the cost of capital for developed economies. For instance, China could continue to buy US government bonds in order to keep its currency undervalued, but it may well realize that the more dollars it buys, the more it suffers when the dollar is low, and that once it starts to use its dollar reserves, the further the dollar will plummet.
Taken together, these factors could partially explain why growth rates are much higher in emerging economies, and why the US Federal Reserve may have to adjust interest rates and respond to inflation in order to prevent a recession.
All this leads to a question: if the "poor" have financed the growth of the "rich" in past years, who will finance both the "poor" and the "rich" as emerging economies continue to gain strength?
The second finding of the regression analysis is a positive relationship between energy consumption and GDP per capita. One explanation could be that greater production and industrialization mean higher consumption. It may equally mean that, as individuals attain higher standards of living, they demand more comfort (more cars, appliances, bigger houses, etc.).
It is difficult to determine how much energy is being wasted and how much is actually being used for good economic purposes. In this case causation and correlation are not easy to separate: is it that using more energy (hence more industrialization, production, etc.) leads to a higher GDP per capita? Or is it that, once individuals achieve a certain level of income, they start to consume and waste energy irresponsibly?
Whatever the reason may be, as emerging economies grow, their GDP per capita will grow along with them, provided that population growth remains stable. This will result in a considerably higher demand for energy, which will affect the prices of scarce resources (oil, minerals, gas) as well as the cost of producing or obtaining them, because extraction will only get harder (deeper oil wells, longer gas pipelines, etc.). It will also have a great impact on the environment: for example, 200 million new cars are expected to enter the Chinese market in the coming years, which is not far from the current number of cars in the US. Pollution may add up to levels that affect overall health, climate change and disposal expenditures. On the other hand, this pressure may well redirect investment towards renewable energy sources (such as the latest programmes of GE and petrol companies) and better regulations that encourage best practices (carbon emission markets, efficiency and productivity programmes).
To conclude, as the gap between certain economies narrows and GDP per capita increases in certain areas, there are economic and environmental challenges which need to be addressed in order to move in a sustainable and beneficial direction.
References
- Härdle, W./ Simar, L.: Applied Multivariate Statistical Analysis, Springer 2003
- Härdle, W./ Klinke, S./ Müller, M.: XploRe Learning Guide, Springer 2003
- The Economist
- The Heat is on. A special report on climate change. September 9th-15th 2006 issue.
- The dark side of debt. Why it matters that markets are going private. September 23rd issue.
- Green Dreams. The risky boom in the clean-energy business. November 18th issue.
Comments
- Well structured
- At least one XploRe program
- Does Norway not also have its high GDP per Capita from Oil exports?
- The parallel coordinate plot indicates that transforming some variables would be a good idea :)
- Based on the regression analysis I would have taken "Current Account" and "Inflation" as important variables, too
- Which transformation has been used? Is Y also transformed? If yes, is the R^2 retransformed to the original Y? The caption text is not enough.
- What is GE?
- References are incomplete
- I am not sure if the GDP growth is a good variable, the absolute growth may also play an important role.
- The questions concerning "Environmental and economic sustainability" are not answered from the data