Analysing DAX Returns

From Teachwiki


This thesis explores the distribution of DAX stock returns and analyzes the influence of market capitalization, different time horizons and some phenomena of financial markets such as fat tails. Since the stock returns are calculated relative to a market index, the strategy under investigation is considered market neutral. Our results show that the distributions of our sample stock returns deviate quite substantially from the normal distribution. However, the returns converge to normal distributions with increasing time horizons. We also find that market capitalization does not influence the average returns in a way that can be described with a simple linear model.

The data set

The data set consists of 5-minute ticks for 28 stocks in the Deutscher Aktienindex (DAX). The two missing stocks, TUI and Hypo Real Estate, were excluded because data were unavailable. With a time frame of 4 months between April 2001 and July 2001, the data set comprises 10,000 data points per stock.

From the 28 available stocks, we chose 3 for closer consideration, corresponding to small, medium and large capitalization: Altana as the stock with the smallest capitalization in the DAX, Deutsche Börse with the median capitalization and Allianz with the largest capitalization.

Getting to know the data set

Figure: Timeplots of the stock price of three stocks over the period of April 2001 to July 2001 as well as of the market index.

We first explored the data set with basic XploRe commands such as 'dimensions', 'descriptive' and 'countNotNumbers'. The commands confirmed that the 10,000 data points for each of the 28 stocks were read in correctly. We therefore proceeded with the analysis without further modifications.

The plot of the three stocks and the market shows that Altana and Dt. Börse decreased over the time frame from 320 to 291 euros and from 55 to 49 euros respectively. The market price for Allianz, on the other hand, increased by more than 30 percent from April to August 2001. The calculated market index starts at its initial value of 0 and decreases to -12, which is in accordance with the bearish global market trend in Q2-Q3 2001.

At first glance, all stocks show periods of high variance. The market price for Altana shares increased significantly in the first days of May. This surge is due to the 194% EBT increase in Q1 that was announced at the shareholders' meeting held on 03/05/2001. At the same time, a 15% increase in sales and a 20% increase in earnings were forecast for the financial year, which encouraged stockholders.

The market price for Dt. Börse suffered some losses at the end of May (partly due to weak holiday trading; in addition, OM declared its interest in LSE on 29/05, which ended Deutsche Börse's plans to merge with LSE).

Although it declines continuously, the calculated market average develops rather smoothly over the time horizon compared to the single stocks, as a result of diversification.

Theoretical underpinning

Leptokurtosis or 'fat tails'

Figure: Platykurtosis (left) vs. leptokurtosis (right).

In financial time series, fat tails are generally assumed because large negative log returns have historically been more likely than a normal distribution would predict. The kurtosis is an indicator of such fat tails: it measures the probability mass in very high or very low values (the tails) as well as the peakedness of the distribution. It is measured as

kurt(x) = \frac{E[(X-\mu)^4]}{\sigma^4}

with a value of 3 for a normal distribution. A value above 3 indicates leptokurtosis, while a value below 3 indicates platykurtosis, as shown in the figure (left).

Usually the concept of excess kurtosis is used, which is defined as the fourth cumulant divided by the square of the variance of the probability distribution:

\gamma_2 = \frac{\kappa_4}{\kappa_2^2} = \frac{\mu_4}{\sigma^4} - 3

The "minus 3" at the end of this formula is often explained as a correction to make the kurtosis of the normal distribution equal to zero.
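
As a rough illustration of the two definitions above, the following Python sketch computes the kurtosis and the excess kurtosis from population (1/N) moments. It is not part of the original XploRe analysis; the function names and both samples are made up for the example.

```python
def kurtosis(xs):
    """Fourth central moment divided by the squared variance (1/N moments)."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    m4 = sum((x - mu) ** 4 for x in xs) / n
    return m4 / var ** 2


def excess_kurtosis(xs):
    """Kurtosis minus 3, so that a normal distribution scores zero."""
    return kurtosis(xs) - 3.0


# Made-up samples: evenly spread values versus mostly-zero values with two outliers.
flat = [-2.0, -1.0, 0.0, 1.0, 2.0]
spiky = [0.0] * 98 + [-10.0, 10.0]

print(excess_kurtosis(flat))   # negative: platykurtic
print(excess_kurtosis(spiky))  # positive: leptokurtic
```

The second sample mimics a return series that is quiet most of the time with a few extreme moves, which is the pattern that produces fat tails.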


Kurtosis and skewness both enter the Jarque-Bera test for normality, which we also applied to our calculated returns. The test statistic is computed as follows

JB = \frac{N}{6} s^2 + \frac{N}{24} (k-3)^2

where s denotes skewness, k denotes kurtosis and N is the number of observations. The test statistic approximately follows a \chi^2-distribution with 2 degrees of freedom; the critical value at the 10% significance level from the corresponding table is 4.61. A test statistic above this critical value implies that the null hypothesis H_0 of normality is rejected at the 10% significance level.
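
The test statistic can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the XploRe routine used for the tables: skewness and kurtosis are computed from 1/N sample moments, the function names are hypothetical, and the sample is made up.

```python
def jarque_bera(xs):
    """JB = N/6 * s^2 + N/24 * (k - 3)^2, using 1/N sample moments."""
    n = len(xs)
    mu = sum(xs) / n
    m2 = sum((x - mu) ** 2 for x in xs) / n
    m3 = sum((x - mu) ** 3 for x in xs) / n
    m4 = sum((x - mu) ** 4 for x in xs) / n
    s = m3 / m2 ** 1.5   # skewness
    k = m4 / m2 ** 2     # kurtosis
    return n / 6 * s ** 2 + n / 24 * (k - 3) ** 2


# Critical value at the 10% level quoted in the text.
CRITICAL_10_PERCENT = 4.61


def rejects_normality(xs):
    """True if the JB statistic exceeds the 10% critical value."""
    return jarque_bera(xs) > CRITICAL_10_PERCENT


# Made-up sample: mostly zeros with two symmetric outliers (fat tails, no skew).
spiky = [0.0] * 98 + [-10.0, 10.0]
print(jarque_bera(spiky), rejects_normality(spiky))
```

Because the statistic scales with N, even modest deviations in skewness or kurtosis lead to very large values for series with thousands of observations.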

The questions

  1. Does any of the three selected stocks perform better than the market?
  2. How much better?
  3. How do the distributions of returns above market vary across different time horizons, and by how much?

In order to answer these questions, this thesis is structured as follows: the next section describes the methods used to derive the absolute stock returns and the stock returns above market. The results section shows the timeplots of the cumulated log returns above market and the histograms of the log returns above market. These are substantiated with summary statistics of the distributions and extended to different time horizons. The results conclude with the correlation between the stocks. Finally, the results are discussed in the last section and some suggestions for an extended data analysis are made.


The methods consist of three steps leading to the logarithmic (log) returns above market. As a first step, the return of each stock was calculated. The return for each stock 'i' can be calculated as the relative return return_t^i = \frac{p_t^i - p_{t-1}^i}{p_{t-1}^i} or as the logarithmic return return_t^i = log(\frac{p_t^i}{p_{t-1}^i}); we chose the latter. The actual calculation is done, for example, with the quantlet 'vEarnings5min' for a time horizon of 5 minutes:

    proc(return) = vEarnings5min(x)
        return = (1:9999)*0
        t = 1
        while (t < 9999)
            t = t + 1
            return[t] = log(x[t,2]/x[t-1,2])
        endo
    endp

This function receives as input the time series 'x' and computes the output 'return'. The computation is done for every time step 't', beginning from t = 2 until t = 9999. The output is one data point shorter than the input, because no return can be calculated for the first data point.
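
To make the two return definitions concrete, here is a minimal Python sketch (not the XploRe quantlet) that computes both for a short, made-up price series. The relative return is taken with the previous price in the denominator, which is the usual convention; the function names are hypothetical.

```python
import math


def relative_returns(prices):
    """(p[t] - p[t-1]) / p[t-1] for every tick after the first."""
    return [(prices[t] - prices[t - 1]) / prices[t - 1]
            for t in range(1, len(prices))]


def log_returns(prices):
    """log(p[t] / p[t-1]) for every tick after the first."""
    return [math.log(prices[t] / prices[t - 1])
            for t in range(1, len(prices))]


# Made-up 5-minute prices for one stock.
prices = [100.0, 101.0, 99.5, 100.2]
rel = relative_returns(prices)
logr = log_returns(prices)

# Log returns are additive: their sum is the log of the total price ratio.
total_log_return = sum(logr)
print(rel, logr, total_log_return)
```

The additivity shown in the last step is what makes log returns convenient when returns are later cumulated or aggregated over longer horizons. For small price moves the two definitions give nearly the same numbers.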

As a second step, the market return is calculated as the average log return of all stocks, return_t = \frac{1}{28}\sum_i log(\frac{p_t^i}{p_{t-1}^i}), with stock price p of stock i at time t. The calculation is done with the quantlet 'mEarnings5min' for a time horizon of 5 minutes:

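    proc(return) = mEarnings5min(x)
        return = (1:9999)*0
        i = 0
        while (i < 28)
            i = i + 1
            ; log return of stock i for all time steps at once
            return = return + log(x[2:10000,i]/x[1:9999,i])
        endo
        return = return/28
    endp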

This function receives the time series of all stocks and computes the output 'return'. It sums the log returns over all stocks and then divides by the number of stocks. Here the returns are calculated with a vector division instead of a while loop over the time steps in order to decrease computation time.

The third step then calculates the log return of a stock above the market as return_t^i - return_t for each time step 't'. This procedure was repeated for time horizons of 5 min, 1 hour and 1 day, which equal 1 tick, 12 ticks and 108 ticks respectively. Thus the difference was calculated not only between subsequent data points but also over 12 and 108 data points.
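
The three steps can be sketched together in Python. This is an illustration, not a translation of the quantlets: the layout of the price table (one list of prices per stock), the `lag` parameter and all function names are assumptions made for the example, and the prices are made up.

```python
import math

# Time horizons expressed in 5-minute ticks, as stated in the text.
HORIZONS = {"5 min": 1, "1 hour": 12, "1 day": 108}


def log_returns(prices, lag=1):
    """Step 1: log return over `lag` ticks, log(p[t] / p[t - lag])."""
    return [math.log(prices[t] / prices[t - lag])
            for t in range(lag, len(prices))]


def market_returns(price_table, lag=1):
    """Step 2: equally weighted average of all stocks' log returns."""
    per_stock = [log_returns(prices, lag) for prices in price_table]
    n = len(per_stock)
    return [sum(column) / n for column in zip(*per_stock)]


def returns_above_market(price_table, stock, lag=1):
    """Step 3: log return of one stock minus the market return."""
    own = log_returns(price_table[stock], lag)
    market = market_returns(price_table, lag)
    return [r - m for r, m in zip(own, market)]


# Made-up table with two stocks and three ticks.
table = [[100.0, 110.0, 121.0], [100.0, 100.0, 100.0]]
print(returns_above_market(table, stock=0, lag=HORIZONS["5 min"]))
```

Because the market is the equally weighted average of the stocks, the returns above market sum to zero across all stocks at every time step.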

Given the methods to calculate the log returns of each stock absolute and above market, the next section analyses the results in timeplots of cumulated log returns and histograms for three different stocks and different time horizons.


Whether a stock performs better than the market can be analysed by plotting the cumulated return above market. The returns are cumulated in order to plot the development of an investment made at the beginning of the period in April 2001. If the cumulated return is positive at the end of the period, the investment performed better than the market.
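
A minimal Python sketch of this cumulation, with made-up returns above market and hypothetical names:

```python
import math
from itertools import accumulate


def cumulated(returns):
    """Running sum of log returns above market."""
    return list(accumulate(returns))


# Made-up log returns above market for four consecutive ticks.
excess = [0.002, -0.001, 0.003, -0.0005]
path = cumulated(excess)

# A positive final value means the stock beat the market over the period.
beat_market = path[-1] > 0

# Relative outperformance implied by the final cumulated log return.
relative_outperformance = math.exp(path[-1]) - 1
print(path, beat_market, relative_outperformance)
```

Plotting `path` against time gives the kind of timeplot discussed in this section.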

Timeplots of the cumulated log returns above market

Figure: Timeplots of the log returns above market for Altana, Dt. Börse and Allianz and for the time horizons of 5 minutes, 1 hour and 1 day between April and July 2001.

The figure plots the return of the three stocks over the period April 2001 to July 2001 for different time horizons. The plots show that Altana and Allianz performed better than the proposed market by 4.3 and 0.42 percent respectively. Dt. Börse performed worse than the market by 1.2 percent. Therefore a long investment in Altana (small cap) and Allianz (large cap) combined with a short investment in the market would have been profitable, while for Dt. Börse only the opposite strategy would have increased the value.

Interestingly, the different time horizons do not change this result. However, the time course of cumulated returns becomes smoother with increasing time horizons. This seems to hold for all stocks and for each increase in time horizon.

Compared to the timeplot of the stock prices above, however, general features of the time courses remain. For instance, the sharp increase for Altana in the middle of May and for Allianz in the middle of July remain in the plotted returns. This seems plausible, since such large changes cannot be absorbed by the market. However, given the market movement, the returns of Dt. Börse above market increase quite dramatically in June and July 2001, which was not that evident in the time series of stock prices above.

Histograms of the log returns above market

Figure: Average shifted histograms of the log returns.

This result is also reflected in the histograms of the log returns of each stock above market. Note that we examined the 5-minute returns here. For Allianz in particular, the histogram is slightly shifted towards positive returns in the time frame. The histogram for Dt. Börse, on the other hand, is shifted rather to the left. All three histograms differ noticeably from a normal distribution. This becomes even more evident in the following table, which shows several indicators of the distributions shown in the histograms.

             Altana     Dt. Börse  Allianz
Mean         0.00058    -0.00055   0.00502
Median       0.00039    -0.00089   0.00333
Variance     0.00013    0.00014    0.00063
Skewness     0.14728    0.35433    1.24129
Kurtosis     3.20132    4.97229    7.98768
JB-test      155669     162932     41521300
prob. value  0          0          0

The means of all three stocks are quite close to zero. The mean is negative for Dt. Börse, but positive for Allianz and Altana, which follows directly from the loss for Dt. Börse and the gains for Allianz and Altana seen in the timeplots above. The median is generally quite similar to the mean, but diverges more for Allianz than, for example, for Altana. Skewness and kurtosis are likewise highest for Allianz, indicating fat tails in the histogram, in particular for positive log returns. The variance is small for all three stocks, although it is several times larger for Allianz than for the other two. The evidence of leptokurtosis is therefore strongest in the Allianz log returns. The null hypothesis H_0 that the 5-minute returns above market are normally distributed is clearly rejected for each stock at the 10% level, since the test statistic far exceeds the critical value of 4.61 (see table).

Results for different time horizons

Looking at the different time horizons, the indicators for our example Altana change in interesting ways. The time course becomes smoother with a longer time horizon, as already apparent in the timeplot, although the variance of the returns themselves grows with the horizon (see table). The mean and the median likewise increase with the time horizon. The skewness changes from negative to positive values, and the kurtosis decreases with longer time horizons, approaching the value of 3. Accordingly, the JB test statistic decreases rapidly, which indicates that the distribution of returns converges to a normal distribution with increasing time horizon.

Example: Altana

          5 min      1 hour     1 day
Mean      0.000004   0.000046   0.000580
Median    0          0          0.00039
Variance  0.000002   0.00001    0.00013
Skewness  -0.73033   -0.35005   0.14728
Kurtosis  22.2724    8.28639    3.20132
JB-test   155669     11837.8    52.5306

Correlations between the three stocks

Finally, we also calculated the correlation between the three stocks and found very little correlation. In particular, the correlation of Altana with the other stocks is quite low, whereas the correlation between Allianz and Dt. Börse is more than twice as strong. There can be different reasons for this, and it is especially surprising since it is widely acknowledged that DAX stocks are positively correlated. The tables show that the returns of these three stocks are slightly negatively correlated over the longer horizons (1 hour, 1 day), but show a positive (Bravais-Pearson) correlation coefficient over the 5-minute horizon.

Correlation for 1 day returns
 1         -0.04214   -0.06021
-0.04214    1         -0.1533
-0.06021   -0.1533     1

Correlation for 1 hour returns
 1         -0.02706   -0.13003
-0.02706    1         -0.14557
-0.13003   -0.14557    1

Correlation for 5 min returns
 1          0.50168    0.08985
 0.50168    1          0.07251
 0.08985    0.07251    1
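
For reference, the Bravais-Pearson coefficient behind such matrices can be sketched in Python as follows. The function names and test series are made up; this is not the XploRe routine used for the tables.

```python
import math


def pearson(xs, ys):
    """Bravais-Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def correlation_matrix(series):
    """Symmetric matrix of pairwise correlations, ones on the diagonal."""
    return [[pearson(a, b) for b in series] for a in series]


# Made-up return series for three stocks.
series = [[0.01, -0.02, 0.03], [0.02, -0.01, 0.01], [-0.01, 0.02, 0.00]]
for row in correlation_matrix(series):
    print(row)
```

Each matrix above corresponds to one call of this kind, applied to the return series of the three stocks at the given time horizon.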


The first and most important point of criticism is that the sample of three stocks that we chose in the first place is too small to provide any reliable explanation. Furthermore, the assessed period is too short, and DAX companies might have some unusual features that are not valid for stock markets in general.

As for the effect of market capitalization and time horizons on returns and performance in general, a relationship cannot be deduced from 3 stocks alone. To overcome at least some of these points, we decided to extend parts of our analysis to all 28 stocks.

Generating returns for all 28 stocks

    proc(mx) = pickstock(x)
        mx = (1:28)*0
        i = 0
        while (i < 28)
            i = i + 1
            x1 = x[,i]
            return = sEarnings1d(x1)-mEarnings1d(x)
            mx[i] = mean(return)
        endo
    endp

This quantlet creates a 28x1 vector of the average stock returns above market. This vector can then be concatenated with a number or name vector and sorted in descending order with respect to the "return" column. The first element in the list is the best performing stock during the examined time frame, in our case Allianz, which achieved the best result not only in our sample of three but also compared to the other stocks in the market.

    s = (1:28)
    sx = s~mx
    y = sort(sx, -2)

Here, performance is measured in terms of the average return over the examined period. Strictly speaking, however, the optimal stock would need not only the highest \mu but also the lowest possible \sigma. Another interesting application would be to test the validity of the CAPM using this dataset.
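
The difference between the two criteria can be illustrated with a small Python sketch. Ranking by the ratio of mean to standard deviation is only one possible way to combine \mu and \sigma and is an assumption of this example, not a method used in the thesis; the stock names and returns are made up.

```python
import statistics


def rank_by_mean(excess_by_stock):
    """Sort stocks by average return above market, best first."""
    means = {name: statistics.fmean(r) for name, r in excess_by_stock.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


def rank_by_mean_over_sigma(excess_by_stock):
    """Sort stocks by mean divided by standard deviation, best first."""
    ratios = {name: statistics.fmean(r) / statistics.pstdev(r)
              for name, r in excess_by_stock.items()}
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Made-up returns above market: stock_a has the higher mean,
# stock_b a lower mean but far less variation.
data = {
    "stock_a": [0.01, 0.03, -0.02, 0.02],
    "stock_b": [0.004, 0.005, 0.006, 0.005],
}
print(rank_by_mean(data)[0][0])
print(rank_by_mean_over_sigma(data)[0][0])
```

In this made-up example the two criteria pick different stocks, which is why ranking by average return alone can be misleading.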

The model

Coming back to the initial question of whether market capitalization influences performance, we also read in the market capitalization of the 28 stocks as of November 2006. To visualize the relationship between market capitalization and performance (average returns), we ran a linear regression.

Figure: Linear regression for variables market capitalization and average return

The figure indicates that a (linear) relationship between the two variables is unlikely, and the more detailed analysis below indeed shows that the linear model cannot sufficiently explain the observations.

   [ 1,] 
   [ 2,] A  N  O  V  A                  SS      df     MSS       F-test   P-value
   [ 3,] 
   [ 4,] Regression                     0.000     1     0.000       3.245   0.0833
   [ 5,] Residuals                      0.000    26     0.000
   [ 6,] Total Variation                0.000    27     0.000
   [ 7,] 
   [ 8,] Multiple R      = 0.33309
   [ 9,] R^2             = 0.11095
   [10,] Adjusted R^2    = 0.07675
   [11,] Standard Error  = 0.00214
   [14,] PARAMETERS         Beta         SE         StandB        t-test   P-value
   [16,] b[ 0,]=         -0.0009       0.0007       0.0000        -1.432   0.1641
   [17,] b[ 1,]=          0.0000       0.0000       0.3331         1.801   0.0833
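
The quantities in this output can be reproduced for any pair of variables with a simple least-squares fit. The Python sketch below uses made-up data and hypothetical names; it is not the XploRe regression routine. For a regression with one explanatory variable the F statistic equals the squared t statistic of the slope, which matches the output above (1.801^2 is approximately 3.245).

```python
import math


def simple_ols(x, y):
    """Least-squares fit of y = b0 + b1 * x with R^2, t and F statistics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    ss_res = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    r2 = 1 - ss_res / ss_tot
    se_b1 = math.sqrt(ss_res / (n - 2) / sxx)  # standard error of the slope
    t_b1 = b1 / se_b1
    f_stat = r2 / (1 - r2) * (n - 2)
    return {"b0": b0, "b1": b1, "r2": r2, "t_b1": t_b1, "f": f_stat}


# Made-up data, standing in for market capitalization (x) and average return (y).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
fit = simple_ols(x, y)
print(fit)
```

With the thesis data, `x` would hold the 28 market capitalizations and `y` the 28 average returns above market.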

Running the procedure with the cumulated return as the variable instead yields a different stock (no. 17, Henkel KGaA) that clearly outperforms the market, and even less of a linear relationship.






  • Notation could have been unified: use \gamma_2 instead of k-3; \kappa_i is not explained
  • The quantlet 'vEarnings5min' could have been written much more efficiently
  • Who can invest in time horizons like 5 min, 1 hour and 1 day? I cannot...
  • Maybe a quadratic regression would be a good idea