Analysis of Economic Indicators among countries

From Teachwiki


This paper analyses differences among countries with respect to selected economic indicators and evaluates the relationships among these indicators, with the objective of understanding, qualitatively and quantitatively, their impact on socio-economic aspects.

The analysis shows a significant correlation among some of the observed variables, which may indicate that GDP per Capita is considerably influenced by certain economic characteristics of a country. One relevant finding is how a combination of Current Account Balance, Total Energy Consumption per Capita and External Debt relates to the different GDP per Capita levels that countries reach.


A considerable gap in the distribution of wealth among the world's nations has always been an issue in terms of the efficient utilization of resources, inequality in standards of living, economic and political power, and challenges to the global environment. Economic globalization, which includes events ranging from emerging economies joining the WTO (e.g. China) to the reduction of trade barriers that allows freer movement of capital, industry and labor, has recently helped many developing countries to narrow the gap in the world's wealth distribution (e.g. China, Brazil, India, Singapore). Nevertheless, many other countries in Africa and Latin America are still lagging behind in terms of economic prosperity.

The objective of this analysis is to observe several economic indicators for a number of countries, to quantify the differences among countries and to look for correlations among the indicators. By studying the structure and economic behavior of countries whose inhabitants currently enjoy a high GDP per capita, one can try to project this trend and understand what challenges might arise when developing countries push to increase theirs as well.


There is the question of how emerging economies will achieve growth, and at what expense. For instance, when France, the USA or Great Britain went through their industrial revolutions, environmental concerns were not an issue. Nowadays they are a prominent item on world leaders' agendas, which poses a question: how can emerging countries grow at a sustainable rate while carrying the additional cost burden of environmental regulation that demands low carbon emissions and greener technologies, which usually require heavy upfront investment? Their arguments have valid points. Why should they pick up the bill when rich countries have done most of the damage? How can they compete globally on matters such as resource efficiency, green regulation and environmentally friendly technologies when they start with a disadvantage of hundreds of years? And, most importantly, how realistic and economically feasible is it for emerging countries to make up for lost time without following at least part of the path taken by the developed countries?

Besides the concerns about how emerging countries will reach the level of rich countries, there are also important economic implications of the challenges that may arise once they do. For example, as more Chinese, African and Mexican people use cars, buy home appliances and demand higher standards of living: how far will the additional demand push up the price of oil, electricity, minerals, etc.? How will this demand influence production functions (i.e. lower unit costs because of economies of scale, or higher ones because of diminishing returns)?

Perhaps all these concerns are alarmist, and what may well happen is that, as pressure grows to stop depending on current economic models, more investment will flow into alternative fuels, more efficient technology and programs that raise productivity. For example, as there is pressure to cut carbon emissions, people and governments are more willing to reconsider nuclear plants as an option.

Introduction to data

The data set selected for this analysis comprises economic indicators for 52 countries for the year 2004. The data were taken from Fedstats (US Government) and cross-referenced with World Bank information.


The countries were selected from a wide range of geographic locations, population sizes and economic weights in order to obtain a representative sample of world economies. The number of countries selected per geographic region is listed below:

  • North America: 3
  • Europe: 6
  • South/Central America: 10
  • Africa: 9
  • Middle East: 7
  • Asia: 13
  • Australia: 1
  • Ex-USSR: 3


The variables under study were the following economic indicators:

Population: Expressed as Number of Inhabitants
Religion: Divided in 4 groups
Predominant Religion: Expressed in % of Population belonging to predominant religion
Independence: Expressed as Years since independence was acquired
Gross Domestic Product: Expressed as annual GDP in USD Billion
Unemployment Rate: Expressed in decimals
Total Per Capita Energy Consumption: Expressed in Mio. BTU per Individual per year
Real GDP Growth Rate: From 2003 to 2004
Current Account Balance: Expressed in US Dollars Billion
External Debt: Expressed in US Dollars Billion
Inflation Rate: Expressed in decimals
GDP Per Capita: Expressed in USD

Analysis objectives

The goal of studying this data is to get a deeper insight and understanding of the following issues:

How is wealth distributed in terms of amount, percentage, concentration, etc.?
Is there a trend or indicator showing which countries are growing the fastest/slowest?
Which pattern does energy consumption follow with respect to GDP per capita?
Which role can capital markets play in assisting a country's economic growth?
Environmental and economic sustainability: what challenges may exist in the future, both for developed and emerging countries?

Univariate descriptive statistics

  • x=read("gdp0811")
  • descriptive(x)
  • gr0=grash(x[,14])
  • gr0=setmask(gr0,"line","red")
  • gr1=grhist(x[,14])
  • gr2=grbox(x[,14])
  • gr3=grdot(x[,14])
  • gr4=grqqn(x[,14])
  • gr4=setmask(gr4,"line","blue")
  • d = createdisplay(2,2)
  • show(d,1,1,gr1,gr0)
  • show(d,1,2,gr2)
  • show(d,2,1,gr3)
  • show(d,2,2,gr4)

In this section we take a closer look at the variables that are directly related to our main variable of interest, GDP per capita. We consider GDP in absolute terms and GDP growth before analysing our measure of wealth, GDP per capita. The graphs displayed for every variable combine 4 different descriptive techniques. Clockwise, starting in the upper left corner, these are a histogram, a boxplot, a QQ-plot and a dotplot. An analysis of the results for every variable is given after the graphs and the printout of the descriptive statistics.

The XploRe code that was used for creating the plots is listed at the beginning of this section.

Gross domestic product (GDP)

Descriptive Statistics GDP

Looking at the difference between the mean (715.134) and the median (116.8), as well as at the minimum and maximum values, we can conclude that wealth is largely concentrated in a few countries, which act as outliers that pull the mean up, while most countries remain at the low end. From the data, a graph of the cumulative percentage of wealth by country was created in order to illustrate the global distribution of wealth.

[152,] " Mean              715.134"
[153,] " Std.Error         1798.52     Variance      3.23467e+06"
[155,] " Minimum               7.2     Maximum             11800"
[156,] " Range             11792.8"
[157,] " "
[158,] " 	Lowest cases                  Highest cases "
[159,] "         24 (Gabon):           7.2              19 (UK):         2200"
[160,] "         10 (Paraguay):       8.06              40 (China):      2230"
[161,] "         36 (Azerbaijan):    8.7              15 (Germany):      2700"
[162,] "          5 (Bolivia):           9.6              43 (Japan):    4700"
[163,] "         29 (Bahrain):         11.8               2 (US):        11800"
[164,] " "
[165,] " Median              116.8"
[166,] " 25% Quantile           32     75% Quantile          689"
[167,] " "
[168,] " Skewness          4.74071     Kurtosis          28.2801"
[169,] "                               Excess            25.2801"
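
The quantities in the printout can be reproduced outside XploRe. The following is a minimal Python sketch, not the XploRe implementation: the function name `describe` and the moment-based estimators of skewness and kurtosis (divisor n) are assumptions, and XploRe may apply different small-sample corrections. As input it uses only the ten extreme GDP values listed in the printout, so the results differ from the full-sample figures. It illustrates two points: a mean far above the median signals a right-skewed distribution, and the line labelled Excess is simply Kurtosis minus 3.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics using moment-based estimators."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments of order 2, 3 and 4 (divisor n is an assumption).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    kurtosis = m4 / m2 ** 2
    return {
        "mean": mean,
        "median": statistics.median(values),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": kurtosis,
        "excess": kurtosis - 3.0,  # same relation as in the printout
    }


# GDP in USD billion: the five lowest and five highest cases from the printout.
extreme_gdp = [7.2, 8.06, 8.7, 9.6, 11.8, 2200, 2230, 2700, 4700, 11800]
stats = describe(extreme_gdp)
for name, value in stats.items():
    print(f"{name:>9}: {value:.4f}")
```

The printout itself confirms the second relation (28.2801 - 3 = 25.2801), and its Std.Error line is the square root of the Variance line.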

Global wealth distribution

[Figure: Dist of wealth.jpg]

The distribution of wealth is clearly skewed: a small number of countries hold a large proportion of global wealth. 10 countries hold roughly 80 % of the wealth, leaving the majority (42 countries in our model world) with only 20 %. The richest country considered, the US, alone holds 31 % of global wealth.
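
The US share can be cross-checked from the GDP printout alone, because the sample total equals the mean times the number of countries. The short Python sketch below does this arithmetic; the variable names are illustrative, and only the five largest economies are included because the printout lists no others.

```python
# Figures taken from the GDP printout (USD billion).
n_countries = 52
mean_gdp = 715.134
top_five = {"US": 11800, "Japan": 4700, "Germany": 2700, "China": 2230, "UK": 2200}

# The sample total follows from the mean: total = mean * n.
total_gdp = mean_gdp * n_countries

us_share = top_five["US"] / total_gdp
top_five_share = sum(top_five.values()) / total_gdp

print(f"Total GDP in the sample: {total_gdp:.1f}")
print(f"US share: {us_share:.1%}")
print(f"Share of the five largest economies: {top_five_share:.1%}")
```

Under these figures the US accounts for a little under 32 % of the sample total and the five largest economies together for roughly 64 %.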

GDP growth rate

Descriptive Statistics GDP growth rate

Analysing this data, we can see that overall the mean is 5.5% and the median 5%, but looking at the extremes we can see that certain economies seem to stagnate while others are booming. It is interesting that 3 of the 5 lowest cases are developed countries (France, Germany and Italy), with an average of 1%. On the other hand, all of the 5 highest cases are developing countries, China being the best example of an emerging economy that is positioning itself as a global competitor, a politically and monetarily influential player, and a large market for both demand and supply.

Several factors could contribute to this:

  • As emerging economies develop infrastructure, industrial capabilities and open markets, they gain momentum in leapfrogging steps.
  • Developed countries face diminishing productivity returns, aging populations and higher burdens (taxes, salaries, unions, regulation), which do not allow high growth rates.
  • It is important to consider that even though the growth rates of these developing countries are higher than those of developed countries, their inflation rates are also considerably higher (an extreme example is Angola with 37%), which, combined with political instability and currency volatility, offsets the potential for investment and growth.
[242,] " Mean            0.0550769"
[243,] " Std.Error       0.0400024     Variance       0.00160019"
[244,] " "
[245,] " Minimum             0.009     Maximum              0.26"
[246,] " Range               0.251"
[247,] " "
[248,] " Lowest cases                  Highest cases "
[249,] "         15 (Germany):         0.009              13 (Venezuela):    0.093"
[250,] "         17 (Italy):           0.01               40 (China):        0.1"
[251,] "         22 (Cote d'Ivoire):   0.011              12 (Uruguay):      0.123"
[252,] "         14 (France):          0.012              21 (Angola):       0.144"
[253,] "         24 (Gabon):           0.014              36 (Azerbaijan):   0.26"
[254,] " "
[255,] " Median               0.05"
[256,] " 25% Quantile        0.035     75% Quantile        0.064"
[257,] " "
[258,] " Skewness          2.76096     Kurtosis          14.1475"
[259,] "                               Excess            11.1475"

GDP per Capita

Descriptive Statistics GDP per Capita

Of the 52 countries analysed, 3 of the 5 countries with the highest GDP per capita (US, Japan and Norway) mostly owe these levels to high literacy rates, technological advantages and infrastructure, which foster market efficiency and productivity. The remaining 2 countries (Qatar and the United Arab Emirates) have a large state income from oil exports. At the other end, the countries with the lowest GDP per capita generally lack solid judicial and legal frameworks that enforce contracts, have poor transport and industrial infrastructure, and have a history of socio-political instability.

[362,] " Mean              12141.4"
[363,] " Std.Error         15247.4     Variance      2.32482e+08"
[364,] " "
[365,] " Minimum            397.04     Maximum           55518.8"
[366,] " Range             55121.7"
[367,] " "
[368,] " Lowest cases                  Highest cases "
[369,] "         39 (Bangladesh):   397.04               43 (Japan):                  36886.7"
[370,] "         51 (Vietnam):      627.28                2 (US):                     39900.7"
[371,] "         28 (Sudan):        641.991              33 (Qatar):                  43682.2"
[372,] "         26 (Nigeria):      687.261              34 (United Arab Emirates):   47674.6"
[373,] "         35 (Yemen):        733.34               18 (Norway):                 55518.8"
[374,] " "
[375,] " Median            4298.47"
[376,] " 25% Quantile      1269.71     75% Quantile      17142.6"
[377,] " "
[378,] " Skewness          1.23724     Kurtosis          3.14239"
[379,] "                               Excess           0.142389"

Multivariate analysis



From the scatter plots we can see that the variable that appears to be correlated with GDP per capita is total energy consumption per capita, with a positive relationship. This can be observed in the graph in the left corner. External Debt also seems to make a slight contribution to GDP per capita, which has yet to be confirmed by further investigation. For the other scatter plots no correlation is detectable. All of these effects are studied in more detail below.

Andrews curve

[Figure: Andrews curves]

We use two techniques to display more than 4 variables in one graph, namely Andrews curves and the parallel coordinates plot. Out of our dataset we chose 18 countries. 9 countries were chosen from Europe, North America and Japan, representing the rich countries and colored black in our analysis. The other 9 countries belong to the region Africa, representing poor countries and colored red. The number of observations was limited to 18 in order to keep the signal-to-ink ratio high; with more than 20 observations the curves overlap so much that a clear distinction between them becomes difficult.

A slight separation into subgroups is detectable. Poor countries seem to vary less than rich ones and also display fewer outliers. The separation follows the colour of the curves, thereby indicating a difference between rich and poor.
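
For reference, an Andrews curve maps one observation x = (x1, ..., xp) to the function f(t) = x1/sqrt(2) + x2 sin(t) + x3 cos(t) + x4 sin(2t) + x5 cos(2t) + ..., evaluated for t between -pi and pi. The following Python sketch evaluates this definition; it is not the XploRe routine used for the plot, the function name `andrews_curve` is illustrative, and the two observations are hypothetical standardized values rather than data from the study.

```python
import math


def andrews_curve(x, t):
    """Evaluate the Andrews curve of observation x at angle t."""
    value = x[0] / math.sqrt(2)
    for k, coeff in enumerate(x[1:], start=1):
        freq = (k + 1) // 2  # frequencies 1, 1, 2, 2, 3, 3, ...
        if k % 2 == 1:
            value += coeff * math.sin(freq * t)
        else:
            value += coeff * math.cos(freq * t)
    return value


# Hypothetical standardized observations (not taken from the data set).
rich_country = [1.2, 0.8, -0.3, 0.5]
poor_country = [-0.9, -0.7, 0.1, -0.2]

# Evaluate both curves on a coarse grid over [-pi, pi].
grid = [-math.pi + i * math.pi / 4 for i in range(9)]
for t in grid:
    print(f"{t:6.2f}  {andrews_curve(rich_country, t):7.3f}  "
          f"{andrews_curve(poor_country, t):7.3f}")
```

Because the first variables enter with the lowest frequencies, they dominate the overall shape of the curve, so the order of the variables matters when reading such a plot.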

Parallel coordinates plot

[Figure: parallel coordinates plot]

The second tool used for analysing multivariate datasets is the parallel coordinates plot. We chose the same countries as in the section on Andrews curves to allow an easier comparison of the two similar tools.

The result of this analysis shows that variable 14 (GDP per capita) divides the observations fairly well into 2 subgroups. As in the previous section, the black values display a higher fluctuation, thereby confirming the results from the Andrews curves. Furthermore, the overlaps and clusters observed in the first variables indicate that religion, region and independence are not correlated with GDP per capita and have no visible influence on it. One black outlier clearly sticks out from the others: the negative current account balance of the US. Poor countries have some very perceptible outliers in the variable "Inflation", whereas the rich countries show hardly any variation in this variable.
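
Why a single outlier stands out so strongly can be seen from the way axes are usually scaled in a parallel coordinates plot. A common convention, assumed here and not confirmed for the XploRe routine, is to rescale every variable to the interval [0, 1]. The Python sketch below applies this min-max scaling to the three current account balances quoted in the regression section (US -821, China 160.8, Japan 162.2, in USD billion); the function name `min_max_scale` is illustrative.

```python
def min_max_scale(column):
    """Rescale a list of numbers linearly to the interval [0, 1]."""
    low, high = min(column), max(column)
    if high == low:
        # A constant variable carries no information; place it mid-axis.
        return [0.5 for _ in column]
    return [(value - low) / (high - low) for value in column]


# Current account balances in USD billion, as quoted in the text.
countries = ["US", "China", "Japan"]
current_account = [-821.0, 160.8, 162.2]

scaled = min_max_scale(current_account)
for name, value in zip(countries, scaled):
    print(f"{name:>6}: {value:.4f}")
```

The US ends up at the bottom of the axis while China and Japan are pushed together at the top, which is the kind of compression that a transformation of the variable would reduce.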

Regression analysis

By regressing our main variable of interest, GDP per Capita, on the other variables, we try to detect which variables have a significant influence on it. Variables with a minor contribution are dropped, so that in the end we keep a model that explains as much of the variation in the observations as possible using the lowest possible number of explanatory variables. R^2 is the measure used to evaluate our model: the closer R^2 is to 1, the better the fit.

[ 2,] "A  N  O  V  A                   SS      df     MSS       F-test   P-value"
[ 3,] "_________________________________________________________________________"
[ 4,] "Regression            9766446838.722    11   887858803.520    16.991   0.0000"
[ 5,] "Residuals             2090145385.181    40    52253634.630"
[ 6,] "Total Variation      11856592223.903    51   232482200.469"
[ 7,] ""
[ 8,] "Multiple R      = 0.90759"
[ 9,] "R^2             = 0.82371"
[10,] "Adjusted R^2    = 0.77524"
[11,] "Standard Error  = 7228.66756"
[14,] "PARAMETERS                         Beta            SE    StandB   t-test  P-value"
[15,] "________________________________________________________________________________"
[16,] "Constant          b[ 0,]=     2800.7858     5664.4249    0.0000    0.494   0.6237"
[17,] "Population        b[ 1,]=        0.0000        0.0000   -0.1390   -1.738   0.0899"
[18,] "Religion          b[ 2,]=     -334.4732     1241.4180   -0.0217   -0.269   0.7890"
[19,] "% Pred Religion   b[ 3,]=     2080.1449     5532.5956    0.0294    0.376   0.7089"
[20,] "Independence      b[ 4,]=        1.1147        3.6410    0.0378    0.306   0.7611"
[21,] "GDP               b[ 5,]=        2.4083        1.7230    0.2841    1.398   0.1699"
[22,] "Unemployment      b[ 6,]=       45.8795      205.1109    0.0158    0.224   0.8241"
[23,] "Energy per Capita b[ 7,]=       58.5546        6.1636    0.6821    9.500   0.0000"
[24,] "Real GDP Growth   b[ 8,]=   -31337.6233    30288.2382   -0.0822   -1.035   0.3070"
[25,] "Current Account   b[ 9,]=       39.0404       19.8472    0.3120    1.967   0.0561"
[26,] "External Debt     b[10,]=        2.9052        1.2462    0.3178    2.331   0.0249"
[27,] "Inflation         b[11,]=   -12293.6769    20468.4940   -0.0478   -0.601   0.5515"

The regression model has an overall P-value of 0.000, which implies that at least one of the coefficients is non-zero. The difference between R^2 and adjusted R^2 shows that many variables do not contribute much to the model. Using a significance level of α = 0.05, we drop all variables except External Debt (P-value = 0.0249) and Total Energy Consumption per Capita (P-value = 0.000).

It is important to mention that Current Account Balance also had a small P-value (0.0561). Although it did not make the cut at α = 0.05 in this first removal round, the variable would have been revisited if the subsequent models had not provided a good prediction. In this first analysis Current Account Balance had a positive relationship with GDP per capita. A potential explanation is that a surplus indicates sustainable use of state resources as well as the ability to service debt interest. The exception is the US, with a huge deficit of USD 821 Billion, which in combination with the bursting of the housing market bubble, lower productivity rates and a falling currency could potentially explain the difficulties the American economy is having in growing. On the other hand, looking at the countries with the highest surpluses, China with USD 160.8 Billion and Japan with USD 162.2 Billion, we can better understand why these countries hold the biggest dollar reserves worldwide, with China approaching the 1 trillion figure. These reserves also help China to keep the Yuan exchange rate under control in order to maintain the large cost advantage that continues to boost its exports globally.
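
The summary figures of the full model follow directly from the sums of squares in the ANOVA table. As a check, the Python sketch below recomputes R^2, adjusted R^2 and the F statistic from the regression and residual sums of squares, with 52 observations and 11 predictors. The function name `regression_summary` is illustrative; the formulas are the standard textbook ones, which are assumed (and here confirmed numerically) to match what XploRe reports.

```python
def regression_summary(ss_regression, ss_residual, n_obs, n_predictors):
    """Return (R^2, adjusted R^2, F statistic) from ANOVA sums of squares."""
    ss_total = ss_regression + ss_residual
    df_residual = n_obs - n_predictors - 1
    r2 = ss_regression / ss_total
    # Adjusted R^2 penalizes every additional predictor.
    adjusted_r2 = 1.0 - (1.0 - r2) * (n_obs - 1) / df_residual
    # F compares explained and unexplained variation per degree of freedom.
    f_stat = (ss_regression / n_predictors) / (ss_residual / df_residual)
    return r2, adjusted_r2, f_stat


# Sums of squares of the model with all 11 predictors (from the printout).
r2, adjusted_r2, f_stat = regression_summary(
    ss_regression=9766446838.722,
    ss_residual=2090145385.181,
    n_obs=52,
    n_predictors=11,
)
print(f"R^2 = {r2:.5f}, adjusted R^2 = {adjusted_r2:.5f}, F = {f_stat:.3f}")
```

The gap between the two R^2 values comes from the factor (n - 1)/(n - p - 1) = 51/40: with 11 predictors and only 52 observations the penalty is noticeable, whereas with 2 predictors the factor is 51/49 and the two values nearly coincide.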

Linear Regression using Total Energy Consumption and External Debt

[ 2,] "A  N  O  V  A                   SS      df     MSS       F-test   P-value"
[ 3,] "_________________________________________________________________________"
[ 4,] "Regression            7792468767.331     2  3896234383.665    46.976   0.0000"
[ 5,] "Residuals             4064123456.572    49    82941295.032"
[ 6,] "Total Variation      11856592223.903    51   232482200.469"
[ 7,] ""
[ 8,] "Multiple R      = 0.81070"
[ 9,] "R^2             = 0.65723"
[10,] "Adjusted R^2    = 0.64324"
[11,] "Standard Error  = 9107.21116"
[12,] ""
[13,] ""
[14,] "PARAMETERS         Beta         SE         StandB        t-test   P-value"
[15,] "________________________________________________________________________"
[16,] "b[ 0,]=       7811.0200     2451.7036       0.0000         3.186   0.0025"
[17,] "b[ 1,]=         65.2039       7.2080       0.7596         9.046   0.0000"
[18,] "b[ 2,]=     -85254.4408     32006.4419      -0.2237        -2.664   0.0104"
Left: External Debt / Right: Total Energy Consumption per Capita

A second regression analysis shows that the overall model P-value remains at 0.000, while the individual P-values confirm statistical significance. Nevertheless, looking at the linear plot as well as the low R^2 value (66%), we can say that something is still missing in our analysis. As suggested by statistics professors and the literature, we try a different approach: we transform the data by taking logarithms and rerun the analysis.

[ 2,] "A  N  O  V  A                   SS      df     MSS       F-test   P-value"
[ 3,] "_________________________________________________________________________"
[ 4,] "Regression                    87.300     2    43.650     125.246   0.0000"
[ 5,] "Residuals                     17.077    49     0.349"
[ 6,] "Total Variation              104.377    51     2.047"
[ 7,] ""
[ 8,] "Multiple R      = 0.91454"
[ 9,] "R^2             = 0.83639"
[10,] "Adjusted R^2    = 0.82971"
[11,] "Standard Error  = 0.59035"
[12,] ""
[13,] ""
[14,] "PARAMETERS         Beta         SE         StandB        t-test   P-value"
[15,] "________________________________________________________________________"
[16,] "b[ 0,]=          4.2518       0.2815       0.0000        15.106   0.0000"
[17,] "b[ 1,]=          0.7992       0.0668       0.7454        11.963   0.0000"
[18,] "b[ 2,]=          0.2257       0.0440       0.3200         5.135   0.0000"
Left: log-External Debt / Right: log-Total Energy Consumption per Capita

The regression model obtained after transforming the data seems to predict the dependent variable, GDP per Capita, better. Looking at the linear plot, we can see that the linear relationship leaves smaller residuals. In the regression output, the individual P-values are now all 0.000 and R^2 has increased to 84%, which indicates a better model. Note that the total variation in the ANOVA table has changed, so this R^2 refers to the transformed dependent variable and is not directly comparable with the R^2 of the untransformed model.
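
The printout does not state which transformation was applied. The Python sketch below assumes natural logarithms of GDP per Capita and of both predictors, and assumes that b[1] belongs to Total Energy Consumption per Capita and b[2] to External Debt. It fits such a log-log model by ordinary least squares via the normal equations. The data are synthetic, generated exactly from the coefficients in the printout, purely to show that the fit recovers them; the function names are illustrative.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_log_log(y, x1, x2):
    """OLS fit of log(y) = b0 + b1*log(x1) + b2*log(x2)."""
    rows = [[1.0, math.log(u), math.log(v)] for u, v in zip(x1, x2)]
    target = [math.log(value) for value in y]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Synthetic predictors (not from the data set).
energy = [10.0, 50.0, 100.0, 200.0, 350.0, 20.0]
debt = [5.0, 300.0, 40.0, 900.0, 60.0, 1500.0]
# Response generated without noise from the coefficients in the printout.
gdp_per_capita = [
    math.exp(4.2518 + 0.7992 * math.log(e) + 0.2257 * math.log(d))
    for e, d in zip(energy, debt)
]

coefficients = fit_log_log(gdp_per_capita, energy, debt)
print([round(b, 4) for b in coefficients])
```

In a log-log model the slopes read as elasticities. Under the assumptions above, b[1] = 0.7992 would mean that a 1% higher energy consumption per capita goes along with a roughly 0.8% higher GDP per capita, holding External Debt fixed.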

Checking for Multicollinearity issues

To check whether the model suffers from multicollinearity between the predictors, we used the rule of thumb that predictors which are highly correlated with the dependent variable but only weakly correlated with each other introduce little "noise" and can be treated as approximately independent.

Pearson correlation coefficient

Not recommended as neither the regressors nor the dependent variables follow a normal distribution

Spearman’s rank correlation coefficient

  • Total energy per Capita vs GDP per Capita: 0.87416
  • External Debt vs GDP per Capita: 0.53689
  • Total Energy per Capita vs External Debt: 0.39888

Kendall’s rank correlation coefficients

  • Total energy per Capita vs GDP per Capita: 0.69985
  • External Debt vs GDP per Capita: 0.38023
  • Total Energy per Capita vs External Debt: 0.25877

In Spearman's rank correlations, External Debt and Total Energy Consumption per Capita are each highly correlated with GDP per Capita (>0.5), while the correlation between the two predictors is low (<0.5). The Kendall coefficients follow the same pattern with smaller values. This is expected, since both are non-parametric, rank-based measures of monotone association and Kendall's coefficient is typically the smaller of the two in absolute value; the difference does not by itself tell us whether the relationship is linear.
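
Both coefficients are computed from ranks only. The following Python sketch shows the two definitions for data without tied values (an assumption; tied data need the corrected formulas). The function names are illustrative and the small samples are made up for demonstration, not taken from the data set.

```python
from itertools import combinations


def ranks(values):
    """Return ranks 1..n of the values, assuming there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rank correlation via the sum of squared rank differences."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def kendall_tau(x, y):
    """Kendall's tau: (concordant - discordant pairs) / number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


x = [1, 2, 3, 4, 5]
monotone_nonlinear = [1, 4, 9, 16, 25]  # y = x^2, perfectly monotone
partly_shuffled = [2, 1, 4, 3, 5]       # two neighbouring pairs swapped

print(spearman_rho(x, monotone_nonlinear), kendall_tau(x, monotone_nonlinear))
print(spearman_rho(x, partly_shuffled), kendall_tau(x, partly_shuffled))
```

The first pair of values shows that both coefficients reach 1 for a perfectly monotone but non-linear relationship; the second shows Kendall's value (0.6) falling below Spearman's (0.8) for the same data.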


As we can see, GDP per Capita can be explained to a large extent by just these two variables. First, we see a positive correlation with External Debt, which at first sight might seem odd: does this imply that the more indebted a country is, the better off its individuals are? Several hypotheses could explain this. One is that countries with higher economic weight have better credit and can therefore afford a larger debt, which a poor country could not cope with. Another explanation is that, as these markets are more mature and more stable, it is safer to hold a US government bond than to invest in a volatile Asian country, for example. A third explanation has to do with returns: as some governments (e.g. the USA) and their corporations are eager to make a profit, it seems logical for them to borrow money from other countries (Japan, for example) that have lower interest rates. Government expenditure seeks to improve education, infrastructure, technology and competition, or to curb unemployment and inflation. By obtaining a lower cost of capital, both governments and corporations profit from being indebted at low cost while putting the money into niches with a higher ROI.

The challenge comes when other markets start to mature (for example, the NYSE has been losing ground to Hong Kong and London, so that market capitalisation becomes more difficult to obtain) and when governments of emerging countries stop financing growth in developed economies in order to invest either in themselves or where interest rates are higher, thereby raising the cost of capital for developed economies. For instance, China could continue to buy US government bonds in order to keep its currency undervalued, but it may well realize that the more dollars it buys, the more it suffers when the dollar is low, and that once it starts to use its dollar reserves, the dollar will plummet further.

A consideration of all these factors could partially explain why growth rates are much higher in emerging economies, and why the US Federal Reserve may have to adjust interest rates, and thereby inflation, in order to prevent a recession.

All this leads to the question: if the "poor" have financed the growth of the "rich" in past years, who will finance both the "poor" and the "rich" as emerging economies continue to gain strength?

The second finding of this regression analysis is a positive relationship between energy consumption and GDP per capita. This could be explained by the fact that larger production and industrialization mean higher consumption; nevertheless, it may equally mean that as individuals obtain higher standards of living they demand more comfort (more cars, appliances, bigger houses, etc.).

It is difficult to determine how much energy is being wasted and how much is actually being used for productive economic purposes. In this case causation and correlation are not easy to separate: does using more energy (hence more industrialization, production, etc.) lead to a higher GDP per capita? Or do individuals start to consume and waste energy irresponsibly once they achieve a certain level of income?

Whatever the reason may be, as emerging economies grow, their GDP per capita will grow along with them, provided that population growth remains stable. This will result in a considerably higher demand for energy, which will affect the prices of scarce resources (oil, minerals, gas) as well as the cost of producing or obtaining them, because they will only get harder to obtain (deeper oil wells, longer gas pipelines, etc.). It will also have a great impact on the environment: for example, 200 million new cars are expected to be introduced to the Chinese market in the coming years, which is not far from the current number of cars in the US. Pollution may add up to levels where it affects overall health, climate change and disposal expenditures. On the other hand, this pressure may well lead to a shift towards renewable energy sources (such as the latest programs of GE and the petrol companies) and better regulations that encourage best practices (carbon emission markets, efficiency and productivity programs).

To conclude, as the gap between certain economies narrows and GDP per capita increases in certain areas, there are economic and environmental challenges which need to be addressed in order to move in a sustainable and beneficial direction.


  • Härdle, W./ Simar, L.: Applied Multivariate Statistical Analysis, Springer 2003
  • Härdle, W./ Klinke, S./ Müller, M.: XploRe Learning Guide, Springer 2003

The Economist
  • The Heat is on. A special report on climate change. September 9th-15th 2006 issue.
  • The dark side of debt. Why it matters that markets are going private. September 23rd issue.
  • Green Dreams. The risky boom in the clean-energy business. November 18th issue.


  • Well structured
  • At least one XploRe program
  • Does Norway not also have its high GDP per Capita from Oil exports?
  • The parallel coordinate plot indicates that transforming some variables would be a good idea :)
  • Based on the regression analysis I would have taken "Current Account" and "Inflation" as important variables, too
  • Which transformation has been used? Is Y also transformed? If yes, is the R^2 retransformed to the original Y? The caption text is not enough.
  • What is GE?
  • References are incomplete
  • I am not sure if the GDP growth is a good variable, the absolute growth may also play an important role.
  • The questions concerning "Environmental and economic sustainability" are not answered from the data