Computing eigenvalues and eigenvectors


Introduction

The computation of eigenvalues and eigenvectors is very important in many areas of economics, especially in statistics, econometrics and macroeconomics. Several important concepts of economic analysis rely on eigenvalues and eigenvectors, e.g. dynamic analysis, stability analysis, and methods for reducing dimensionality.

In econometrics the concept of eigenvalues and eigenvectors (in econometric terms, characteristic roots and characteristic vectors) is an important part of vector ARMA models and of the principal component regression model [2].

In time series analysis it is common to use principal component analysis. This concept can be used to reduce the dimensionality of the data and is therefore helpful for modelling the covariance (or correlation) structure of the time series. This covariance structure modelling plays an important role in portfolio selection [6].

There are two different sets of techniques for the computation of eigenvalues and eigenvectors. The first set relies on specially designed transformations that are useful for specific tasks, for example zeroing a particular off-diagonal element (Jacobi's method) or zeroing a whole row or column (Householder method). The second set of techniques is connected with factorization methods (the QR and QL algorithms) [4].

In the following, Jacobi's method will be explained. There exist several competing versions of Jacobi's method; two basic versions will be introduced: the classical Jacobi's method and the threshold Jacobi's method [3].


Eigenvalues and Eigenvectors

In the  m -dimensional space  R^m the eigenvalues of a matrix are the roots of its characteristic (eigenvalue) equation. This equation is a polynomial of degree  m , so there are  m possible eigenvalues; some of them can, however, coincide. The eigenvectors are the corresponding vectors, one associated with each eigenvalue, that solve the eigenvalue equation.

We can distinguish between right and left eigenvalues and between right and left eigenvectors. Later in this chapter we will connect the concepts of computation of the right and left eigenvalues and eigenvectors.

Symmetric matrices have some nice properties that will be explained later in this chapter. In particular, any symmetric matrix can be decomposed according to the concept of spectral decomposition.

Thus, in this chapter the following concepts will be introduced:

  • Right eigenvalues and right eigenvectors,
  • Left eigenvalues and left eigenvectors, and
  • Characteristics of a symmetric matrix; spectral decomposition of a matrix.

This chapter is an introduction to the concept of computation of eigenvalues and eigenvectors. Therefore, some terms will be defined, and the concept of computation of eigenvalues and eigenvectors will be explained for matrices of order  m = 2 .


  • Right eigenvalues and right eigenvectors


Right eigenvalues of a square matrix can be computed according to rules (1) - (5), where  \Sigma is a square matrix of order  m ,  I is the identity matrix of order  m (with ones on the main diagonal),  \gamma represents one right eigenvector, and  \lambda represents one right eigenvalue. The expression (5) is in fact a polynomial of degree  m in  \lambda , so it can have  m possible solutions.


 (1 \,\!)
	{\Sigma	\gamma=\lambda	\gamma}\,\!


 (2 \,\!)
	{\Sigma	\gamma=\lambda	I	\gamma}\,\!


 (3 \,\!)
	\Sigma	\gamma-\lambda	I	\gamma=0\,\!


 (4 \,\!)
	(\Sigma -\lambda	I)	\gamma=0\,\!


 (5 \,\!)
	|\Sigma -\lambda	I|=0\,\!


As an example, consider the symmetric matrix  \Sigma in (6). It is of order 2  ( m = 2 ) and contains real numbers [5].


 (6 \,\!)
	\Sigma =\begin{bmatrix}2&1\\1&2\end{bmatrix}\,\!


The procedure for computing the right eigenvalues is given in expressions (7) - (9). Expression (8) is a quadratic polynomial with two solutions, in our case the two real right eigenvalues given in (9).


 (7 \,\!)
	\left|\begin{bmatrix}2-\lambda&1\\1&2-\lambda\end{bmatrix}\right|=0\,\!


 (8 \,\!)
	\lambda^2-4\lambda+3=0\,\!


 (9 \,\!)
	\lambda_{1}	=	1, \lambda_{2}	=	3.\,\!


If we consider the first right eigenvalue  \lambda_{1} = 1 , one can compute the corresponding first right eigenvector by rules (10) - (13). It is enough to solve only one of the equations that arise from expression (10). In (12) we use the fact that the eigenvectors are orthonormal, that is, their Euclidean norm (the square root of the sum of the squared elements of a vector) is equal to one and they are orthogonal to each other  ( \gamma_{1}^{\top} \gamma_{2} = 0 ) [1]. After solving, we finally get the following two elements of the first right eigenvector  \gamma_{1} , namely  \gamma_{11} and  \gamma_{12} in (13).


 (10 \,\!)
	\begin{bmatrix}2-1&1\\1&2-1\end{bmatrix}\begin{bmatrix}\gamma_{11}\\\gamma_{12}\end{bmatrix}=0\,\!


 (11 \,\!)
	\gamma_{11}=-\gamma_{12}\,\!


 (12 \,\!)
	\gamma_{11}^2+\gamma_{12}^2=1\,\!


 (13 \,\!)
	\gamma_{11}=0.70711,	\gamma_{12}=-0.70711\,\!


Similarly, for the second right eigenvalue  \lambda_{2} = 3 the corresponding right eigenvector can be found by expressions (14) - (17):


 (14 \,\!)
	\begin{bmatrix}2-3&1\\1&2-3\end{bmatrix}\begin{bmatrix}\gamma_{21}\\\gamma_{22}\end{bmatrix}=0\,\!


 (15 \,\!)
	\gamma_{21}=\gamma_{22}\,\!


 (16 \,\!)
	\gamma_{21}^2+\gamma_{22}^2=1\,\!


 (17 \,\!)
	\gamma_{21}=0.70711,	\gamma_{22}=0.70711.\,\!
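
The computation in (6) - (17) can be checked numerically. The following short sketch uses Python with NumPy (an illustration for this write-up, not part of the original text); numpy.linalg.eigh is the routine for symmetric matrices and returns orthonormal eigenvectors as columns, possibly with flipped signs.

```python
import numpy as np

Sigma = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

# eigh is meant for symmetric matrices; eigenvalues come back in ascending order
eigenvalues, eigenvectors = np.linalg.eigh(Sigma)
print(eigenvalues)    # [1. 3.], as in (9)
print(eigenvectors)   # columns are +-(0.70711, -0.70711) and +-(0.70711, 0.70711)

# check the defining relation (1): Sigma gamma = lambda gamma for every pair
for lam, gamma in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(Sigma @ gamma, lam * gamma)
```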


  • Left eigenvalues and left eigenvectors


For the computation of the left eigenvalues and left eigenvectors we have the following steps (18) - (23). There is no difference between left and right eigenvalues, but a difference does occur in the computation of the left and right eigenvectors.


 (18 \,\!)
	{x\Sigma =\lambda	x}\,\!


 (19 \,\!)
	{Ix\Sigma =\lambda	Ix}\,\!


 (20 \,\!)
	{\Sigma ^{\top}Ix^{\top}=\lambda	Ix^{\top}}\,\!


 (21 \,\!)
	{\Sigma ^{\top}Ix^{\top}-\lambda	Ix^{\top}}=0\,\!


 (23 \,\!)
	{(\Sigma ^{\top}-\lambda	I)(Ix^{\top})}=0\,\!


From (23) it follows that for a real symmetric matrix the left eigenvalues are equal to the right eigenvalues, and that the left eigenvectors are the transposed right eigenvectors.

For the matrix  \Sigma from our example we thus get left eigenvalues equal to the right eigenvalues in (9), and left eigenvectors given in (24) and (25).


 (24 \,\!)
	x_{1}=\gamma_{1}^{\top}=\begin{bmatrix} 0.70711&-0.70711\end{bmatrix}\,\!


 (25 \,\!)
	x_{2}=\gamma_{2}^{\top}=\begin{bmatrix} 0.70711& \ \ 0.70711\end{bmatrix}\,\!


  • Characteristics of a symmetric matrix; spectral decomposition of a matrix


If we consider a symmetric matrix of order  m with real entries then the characteristic polynomial of this matrix has only real roots (all eigenvalues are real) [5]. If all eigenvalues are nonnegative then the matrix is called positive semidefinite. Also, if all eigenvalues are greater than zero, the matrix is called positive definite.

Since in our example all eigenvalues of matrix  \Sigma are real and greater than zero, the matrix  \Sigma is positive definite.

According to the spectral decomposition, every symmetric matrix can be decomposed using its matrix of eigenvectors  \Gamma and the diagonal matrix of its eigenvalues  \Lambda . The matrix of eigenvalues contains all eigenvalues on the main diagonal. The expression for the spectral decomposition is given in (26) [1], and the spectral decomposition of our matrix  \Sigma is given in (27).


 (26 \,\!)
	\Sigma =\Gamma	\Lambda	\Gamma	^{\top}\,\!


 (27 \,\!)
	\Sigma =\begin{bmatrix}2&1\\1&2\end{bmatrix}=\begin{bmatrix} \ \ 0.70711& \ \ 0.70711\\-0.70711& \ \ 0.70711\end{bmatrix}\begin{bmatrix} 1& 0\\ 0& 3\end{bmatrix}\begin{bmatrix} \ \ 0.70711& \ \ 0.70711\\-0.70711& \ \ 0.70711\end{bmatrix}^{\top}\,\!


The matrix of eigenvalues  \Lambda has some nice properties. These properties are related to the trace and determinant of the matrix  \Sigma .

For any given symmetric matrix the trace (sum of the diagonal elements) is equal to the trace of its matrix of eigenvalues  \Lambda . Put otherwise, the trace of a symmetric matrix is equal to the sum of its eigenvalues. The proof for this property is given in (28).

On the other hand, the determinant of any symmetric matrix is equal to the determinant of its matrix of eigenvalues  \Lambda , so the determinant of a symmetric matrix is equal to the product of its eigenvalues. Proof for this statement is given in (29).


 (28 \,\!)
	tr(\Sigma) =tr(\Gamma	\Lambda	\Gamma	^{\top})=tr(\Lambda	\Gamma	\Gamma	^{\top})=tr(\Lambda	I)=tr(\Lambda)\,\!


 (29 \,\!)
	|\Sigma |=|\Gamma	\Lambda	\Gamma	^{\top}|=|\Lambda	\Gamma	\Gamma	^{\top}|=|\Lambda||\Gamma	\Gamma	^{\top}|=|\Lambda||I|=|\Lambda|\,\!


To show empirically that the statements about the trace and determinant of a symmetric matrix and its matrix of eigenvalues are correct, consider our matrix  \Sigma with the matrix of eigenvalues  \Lambda and the matrix of eigenvectors  \Gamma given in (27). Expressions (30) and (31) show these two properties.


 (30 \,\!)
	tr\left(\begin{bmatrix}2&1\\1&2\end{bmatrix}\right)=tr\left(\begin{bmatrix}1&0\\0&3\end{bmatrix}\right)=4\,\!


 (31 \,\!)
	\left|\begin{bmatrix}2&1\\1&2\end{bmatrix}\right|=\left|\begin{bmatrix}1&0\\0&3\end{bmatrix}\right|=3\,\!
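
The spectral decomposition (27) and the trace and determinant properties (30) - (31) can also be verified numerically. The sketch below is again Python/NumPy and only restates what was computed by hand above.

```python
import numpy as np

Sigma = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
lam, Gamma = np.linalg.eigh(Sigma)   # eigenvalues and orthonormal eigenvectors
Lambda = np.diag(lam)

# (26): Sigma = Gamma Lambda Gamma^T
assert np.allclose(Sigma, Gamma @ Lambda @ Gamma.T)

# (30): the trace of Sigma equals the sum of its eigenvalues (both are 4)
assert np.isclose(np.trace(Sigma), lam.sum())

# (31): the determinant of Sigma equals the product of its eigenvalues (both are 3)
assert np.isclose(np.linalg.det(Sigma), lam.prod())
```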


Jacobi's Method

In the previous chapter we introduced the computation of eigenvalues and eigenvectors for matrices of order  m = 2 . Difficulties arise when one has to compute eigenvalues and eigenvectors for matrices of higher order. To overcome these problems we introduce Jacobi's method, in particular the classical Jacobi's method and the threshold Jacobi's method.

The idea of the Jacobi's Method is to transform a symmetric matrix to a diagonal matrix by a sequence of similarity transformations. Each similarity transformation involves a rotation designed to increase the sum of squares of the diagonal entries of the matrix currently similar to our symmetric matrix [3].

After these similarity transformations we obtain a diagonal matrix. In each step the goal is to zero a particular off-diagonal element of the symmetric matrix. The procedure is finished when the similarity transformations yield a diagonal matrix. This diagonal matrix, as we will see later, is the matrix of eigenvalues  \Lambda .

In the following we will define:

  • Rotation matrix,  R ,
  • Reflection matrix,  R^{*} ,
  • Transformation matrix,  W ,
  • Similarity transformations,
  • Zeroing a particular off-diagonal element in the similarity transformations, and
  • Increasing the sum of squared elements on the main diagonal of the transformed matrices.

Thereafter, the classical Jacobi's Method and the threshold Jacobi's Method will be explained in detail.

At the end of the chapter, the theoretical concepts of the classical Jacobi's method will be illustrated with a numerical example. This example will be solved using the statistical software package XploRe.


  • Rotation matrix,  R


The rotation matrix  R in (32) rotates objects clockwise by the angle  \theta . For example, if we consider a vector and choose the angle  \theta = \pi / 2 , we get the rotated vector in (33).

Useful properties of the rotation matrix  R are that it is a nonsymmetric square matrix of order two (34), its trace is given in (35), and it has determinant 1, as we can see in (36).


 (32 \,\!)
	R=\begin{bmatrix}cos\theta&sin\theta\\-sin\theta&cos\theta\end{bmatrix}\,\!


 (33 \,\!)
	\begin{bmatrix}cos\theta&sin\theta\\-sin\theta&cos\theta\end{bmatrix}\begin{bmatrix}1\\2\end{bmatrix}=\begin{bmatrix}cos\theta+2sin\theta\\-sin\theta+2cos\theta\end{bmatrix}=\begin{bmatrix}2\\-1\end{bmatrix}\,\!


 (34 \,\!)
	r_{ij}\neq	r_{ji},	i	\neq	j\,\!


 (35 \,\!)
	tr(R)=2cos\theta\,\!


 (36 \,\!)
	|R| = cos^2 \theta + sin^2 \theta = 1\,\!


  • Reflection matrix,  R^{*}


The reflection matrix  R^{*} in (37) reflects objects about the line through the origin that makes the angle  \theta / 2 with the first axis. If we consider a vector and choose the angle  \theta = \pi / 2 , we get the reflected vector in (38).

Properties of the reflection matrix  R^{*} are that it is a symmetric square matrix of order two (39), its trace is 0 (given in (40)), and it has determinant -1, as we can see in (41).


 (37 \,\!)
	R^*=\begin{bmatrix}cos\theta&sin\theta\\sin\theta&-cos\theta\end{bmatrix}\,\!


 (38 \,\!)


	\begin{bmatrix}cos\theta&sin\theta\\sin\theta&-cos\theta\end{bmatrix}\begin{bmatrix}1\\2\end{bmatrix}=\begin{bmatrix}cos\theta+2sin\theta\\sin\theta-2cos\theta\end{bmatrix}=\begin{bmatrix}2\\1\end{bmatrix} \,\!


 (39 \,\!)
	r_{ij}	=	r_{ji},	i	\neq	j\,\!


 (40 \,\!)
	tr(R^*)=0\,\!


 (41 \,\!)
	|R^*|= -cos^2 \theta - sin^2 \theta = -1\,\!
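
As a small illustration (a Python/NumPy sketch for this write-up, not part of the original text), the rotation matrix of (32) and the reflection matrix of (37) can be evaluated at  \theta = \pi / 2 and applied to the vector (1, 2), reproducing (33) and (38):

```python
import numpy as np

theta = np.pi / 2
c, s = np.cos(theta), np.sin(theta)

R      = np.array([[ c, s], [-s,  c]])   # rotation matrix (32)
R_star = np.array([[ c, s], [ s, -c]])   # reflection matrix (37)

v = np.array([1.0, 2.0])
print(R @ v)        # [ 2. -1.]  clockwise rotation by pi/2, as in (33)
print(R_star @ v)   # [ 2.  1.]  reflection, as in (38)

# determinants +1 and -1, as in (36) and (41)
print(np.linalg.det(R), np.linalg.det(R_star))
```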


  • Transformation matrix,  W


The transformation matrix  W given in (42) is used to transform a matrix. In this expression  R represents a rotation matrix of order 2,  0 is a zero matrix with  2 rows and  (m-2) columns, and  I is an identity matrix of order  (m-2) .

In general, the transformation matrix  W for transforming (zeroing) the element  a_{12} of a symmetric matrix  A is given in (43).


 (42 \,\!)
	W=\begin{bmatrix}R&0\\0^{\top}&I_{m-2}\end{bmatrix}\,\!


 (43 \,\!)
	W=\begin{bmatrix}cos\theta&sin\theta&0&0&\cdots&0\\-sin\theta&cos\theta&0&0&\cdots&0\\0&0&1&0&\cdots&0\\0&0&0&1&\cdots&0\\\cdots&\cdots&\cdots&\cdots&\ddots&\cdots\\0&0&0&0&\cdots&1\end{bmatrix}\,\!


For zeroing the elements  a_{12} ,  a_{13} and  a_{23} of a symmetric matrix  A of order  m = 3 we use the transformation matrices  W given in expressions (44), (45) and (46), respectively. One can easily extend this concept to matrices of higher order.


 (44 \,\!)
	W=\begin{bmatrix}cos\theta&sin\theta&0\\-sin\theta&cos\theta&0\\0&0&1\end{bmatrix}\,\!


 (45 \,\!)
	W=\begin{bmatrix}cos\theta&0&sin\theta\\0&1&0\\-sin\theta&0&cos\theta \end{bmatrix}\,\!


 (46 \,\!)
	W=\begin{bmatrix}1&0&0\\0&cos\theta&sin\theta \\0&-sin\theta&cos\theta\end{bmatrix}\,\!
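
A possible way to build such a transformation matrix for an arbitrary order  m is sketched below in Python/NumPy; the helper name transformation_matrix is this write-up's own, and the indices p and q are zero-based.

```python
import numpy as np

def transformation_matrix(m, p, q, theta):
    """m x m identity with the rotation of (32) embedded in rows/columns p and q (p < q)."""
    W = np.eye(m)
    c, s = np.cos(theta), np.sin(theta)
    W[p, p] = c
    W[p, q] = s
    W[q, p] = -s
    W[q, q] = c
    return W

# m = 3, zeroing the element a_13 reproduces the pattern of (45)
print(transformation_matrix(3, 0, 2, 0.3))
```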


The most important characteristics of the transformation matrix  W are that it is a square matrix of order  m and orthonormal, as stated in expression (47). The proof of orthonormality is given in expressions (48) - (51).


 (47 \,\!)
	WW^{\top}=W^{\top}W=I\,\!


 (48 \,\!)


	WW^{\top}=\begin{bmatrix}cos\theta&sin\theta&0\\-sin\theta&cos\theta&0\\0&0&1\end{bmatrix}\begin{bmatrix}cos\theta&-sin\theta&0\\sin\theta&cos\theta&0\\0&0&1\end{bmatrix}\,\!


 (49 \,\!)
	WW^{\top}=\begin{bmatrix}cos^2\theta+sin^2\theta&0&0\\0&sin^2\theta+cos^2\theta&0\\0&0&1\end{bmatrix}=\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}=I\,\!


 (50 \,\!)
	W^{\top}W=\begin{bmatrix}cos\theta&-sin\theta&0\\sin\theta&cos\theta&0\\0&0&1\end{bmatrix}\begin{bmatrix}cos\theta&sin\theta&0\\-sin\theta&cos\theta&0\\0&0&1\end{bmatrix}\,\!


 (51 \,\!)
	W^{\top}W=\begin{bmatrix}cos^2\theta+sin^2\theta&0&0\\0&sin^2\theta+cos^2\theta&0\\0&0&1\end{bmatrix}=\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}=I\,\!


  • Similarity transformations


Similarity transformations transform a symmetric matrix  A under the assumption that the transformation matrix  W is orthonormal as stated in (47).

The first similarity transformation of a symmetric matrix  A of order  m is given by the matrix  B in (52). This similarity transformation (the matrix  B ) has the property that its trace is equal to the trace of the matrix  A ; this property is proven in (53). Also, the Euclidean norm of the matrix  B is equal to the Euclidean norm of the matrix  A ; the proof of this property is given in expressions (54) - (55).


 (52 \,\!)
	B=W^{\top}AW\,\!


 (53 \,\!)
	tr(B)=tr(W^{\top}AW)=tr(AW^{\top}W)=tr(AI)=tr(A)\,\!


 (54 \,\!)
 \|B\|_{E}^2=tr(B^{\top}B)=tr[(W^{\top}AW)^{\top}(W^{\top}AW)] =tr(W^{\top}A^{\top}WW^{\top}AW)=\,\!


 (55 \,\!)
 =tr(W^{\top}A^{\top}AW) =tr(WW^{\top}A^{\top}A)=tr(IA^{\top}A)=tr(A^{\top}A)=\|A\|_{E}^2\,\!


If we consider a symmetric matrix  A of order  3 and we are interested in zeroing the element  a_{12} , one similarity transformation yields the matrix  B in expression (56). After the calculation one gets all nine elements of the matrix  B in expressions (57) - (65). The elements of the matrix  B are in fact functions of the elements of the matrix  A and of the rotation angle  \theta .


 (56 \,\!)
	B= \begin{bmatrix}b_{11}&b_{12}&b_{13}\\b_{21}&b_{22}&b_{23}\\b_{31}&b_{32}&b_{33}\end{bmatrix}=\begin{bmatrix}cos\theta&-sin\theta&0\\sin\theta&cos\theta&0\\0&0&1\end{bmatrix}\begin{bmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{bmatrix}\begin{bmatrix}cos\theta&sin\theta&0\\-sin\theta&cos\theta&0\\0&0&1\end{bmatrix}\,\!


 (57 \,\!)
	b_{11}=a_{11}cos^2\theta-a_{21}cos\theta sin\theta-a_{12}cos\theta sin\theta+a_{22}sin^2\theta=a_{11}cos^2\theta-2a_{12}cos\theta sin\theta+a_{22}sin^2\theta\,\!


 (58 \,\!)
	b_{12}=a_{11}cos\theta sin\theta-a_{21}sin^2\theta+a_{12}cos^2\theta-a_{22}cos\theta sin\theta=(a_{11}-a_{22})cos\theta sin\theta+a_{12}(cos^2\theta-sin^2\theta)\,\!


 (59 \,\!)
	b_{13}=a_{13}cos\theta-a_{23}sin\theta\,\!


 (60 \,\!)
	b_{22}=a_{11}sin^2\theta+a_{21}cos\theta sin\theta+a_{12}cos\theta sin\theta+a_{22}cos^2\theta=a_{11}sin^2\theta+2a_{12}cos\theta sin\theta+a_{22}cos^2\theta\,\!


 (61 \,\!)
	b_{23}=a_{13}sin\theta+a_{23}cos\theta\,\!


 (62 \,\!)
	b_{33}=a_{33}\,\!


 (63 \,\!)
	b_{21}=b_{12}\,\!


 (64 \,\!)
	b_{31}=b_{13}\,\!


 (65 \,\!)
	b_{32}=b_{23}\,\!


  • Zeroing a particular off-diagonal element in the similarity transformations


Zeroing a particular off-diagonal element of the symmetric matrix  A in each similarity transformation leads, at the end of the sequence of transformations, to a diagonal matrix.

In the following we describe how to zero a particular off-diagonal element of the matrix  A . For this purpose we set the element  b_{12} to zero as in expression (66). Obviously one can set any off-diagonal element of the matrix  B to zero; the choice of this element is where the classical Jacobi's Method and the threshold Jacobi's Method differ.

To get an angle  \theta that sets the value of the element  b_{12} to zero, one uses the expressions (58) and (66). One thereafter easily gets expression (67) and, after some rearranging, expression (68). Using the trigonometric identity (69), it follows (70). If we choose the angle  \theta according to this rule, then (66) holds - we have successfully established the procedure for zeroing a particular off-diagonal element of a symmetric matrix. An important restriction for this procedure is that the element  a_{12} is not equal to zero.


 (66 \,\!)
	b_{12}=0\,\!


 (67 \,\!)
	a_{12}(cos^2\theta-sin^2\theta)=(a_{22}-a_{11})cos\theta sin\theta\,\!


 (68 \,\!)
	\frac{cos^2\theta-sin^2\theta}{2cos\theta sin\theta}=\frac{a_{22}-a_{11}}{2a_{12}}\,\!


 (69 \,\!)
	\frac{cos^2\theta-sin^2\theta}{2cos\theta sin\theta}=\frac{cos(2\theta)}{sin(2\theta)}=cot(2\theta)\,\!


 (70 \,\!)
	cot(2\theta)=\frac{a_{22}-a_{11}}{2a_{12}}\,\!
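
In a program, the angle can be obtained directly from (70). The sketch below (Python/NumPy; the helper name is chosen here) uses arctan2, which avoids an explicit division when  a_{11} and  a_{22} are (almost) equal and still satisfies  cot(2\theta)=(a_{22}-a_{11})/(2a_{12}) whenever  a_{12} \ne 0 .

```python
import numpy as np

def rotation_angle(a11, a22, a12):
    """Angle theta with cot(2*theta) = (a22 - a11) / (2*a12), assuming a12 != 0."""
    return 0.5 * np.arctan2(2.0 * a12, a22 - a11)

# for the matrix A of the numerical example below (a11 = 1, a22 = 3, a12 = 7)
print(rotation_angle(1.0, 3.0, 7.0))   # approximately 0.71445
```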


The next step is to find an expression for the element  b_{11} . In expression (57) we add an additional term, which is equal to zero by (66). That leads us to expression (71). Using the right-hand side of (58) for  b_{12} , we insert this into (71) and get (72). After rearranging the terms in  a_{11} and  a_{22} and some manipulation we end up with (75). Thus, under all assumptions and restrictions, one finally gets the expression for the main-diagonal element  b_{11} that is affected by the similarity transformation when zeroing the off-diagonal element  b_{12} .


 (71 \,\!)
	b_{11}=a_{11}cos^2\theta-2a_{12}cos\theta sin\theta+a_{22}sin^2\theta+b_{12}tan\theta\,\!


 (72 \,\!)
	b_{11}=a_{11}cos^2\theta-2a_{12}cos\theta sin\theta+a_{22}sin^2\theta+a_{12}cos^2\theta\frac{sin\theta}{cos\theta}-a_{12}sin^2\theta\frac{sin\theta}{cos\theta}+a_{11}sin^2\theta-a_{22}sin^2\theta\,\!


 (73 \,\!)
	b_{11}=a_{11}+a_{12}cos^2\theta tan\theta-a_{12}cos\theta sin\theta-a_{12}sin^2\theta tan\theta-a_{12}cos\theta sin\theta\,\!


 (74 \,\!)
	b_{11}=a_{11}+a_{12}cos\theta(cos\theta tan\theta-sin\theta)-a_{12}sin\theta(sin\theta tan\theta+cos\theta)\,\!


 (75 \,\!)
	b_{11}=a_{11}-a_{12}sin\theta\frac{sin^2\theta+cos^2\theta}{cos\theta}=a_{11}-a_{12}tan\theta\,\!


The next element to be explained is  b_{22} . We start with expression (53), which states that the trace of the similarity transformation  B is equal to the trace of the matrix  A . Using the fact (62) and inserting the expression (75) for  b_{11} , we finally get an expression for the element  b_{22} in (79). It remains to calculate the other elements of the matrix  B .


 (76 \,\!)
	b_{11}+b_{22}+b_{33}=a_{11}+a_{22}+a_{33}\,\!


 (77 \,\!)
	b_{11}+b_{22}=a_{11}+a_{22}\,\!


 (78 \,\!)
	a_{11}	-	a_{12}	tan	\theta	+	b_{22}	=	a_{11}	+a_{22}\,\!


 (79 \,\!)
	b_{22}	=	a_{22}	+	a_{12}	tan	\theta\,\!


Other elements of the matrix  B can be computed as before. In particular, the elements of the third row ( b_{31} ,  b_{32} and  b_{33} ) are given by expressions (64), (65) and (62), respectively. The elements  b_{13} and  b_{23} are given by expressions (59) and (61), respectively.
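
The closed forms (75), (79), (59), (61) and (62) can be compared directly with the full similarity transformation  B = W^{\top} A W . The following Python/NumPy sketch does this for the matrix  A used in the numerical example further below; it is an illustration for this write-up, not the original XploRe code.

```python
import numpy as np

A = np.array([[1.0, 7.0, 4.0],
              [7.0, 3.0, 2.0],
              [4.0, 2.0, 8.0]])

# angle from (70) for zeroing a_12
theta = 0.5 * np.arctan2(2.0 * A[0, 1], A[1, 1] - A[0, 0])
c, s, t = np.cos(theta), np.sin(theta), np.tan(theta)

W = np.array([[  c,   s, 0.0],
              [ -s,   c, 0.0],
              [0.0, 0.0, 1.0]])
B_full = W.T @ A @ W                            # similarity transformation (52)

B = A.copy()
B[0, 0] = A[0, 0] - A[0, 1] * t                 # (75)
B[1, 1] = A[1, 1] + A[0, 1] * t                 # (79)
B[0, 1] = B[1, 0] = 0.0                         # (66), (63)
B[0, 2] = B[2, 0] = A[0, 2] * c - A[1, 2] * s   # (59), (64)
B[1, 2] = B[2, 1] = A[0, 2] * s + A[1, 2] * c   # (61), (65)
# (62): B[2, 2] keeps the value of A[2, 2]

assert np.allclose(B, B_full)
print(B)   # approximately the matrix A_1 shown later in (101)
```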


  • Increasing the sum of squared elements on the mean diagonal of the transformed matrices


For a matrix  A of order  3 we can define the Euclidean norm as in expression (80), where  S(A) represents the sum of squared elements on the main diagonal and  L(A) represents the sum of squared off-diagonal elements. The extension to matrices of (higher) order  m is straightforward. For the matrix  B we get the similar expression (81).


 (80 \,\!)
	\|A\|_{E}^2 = tr ( A^{\top} A ) = S(A) + L(A) = \sum_{i=1}^3	a_{ii} ^2	+	2	a_{12}^2	+	2	a_{13}^2	+	2	a_{23}^2\,\!


 (81 \,\!)
	\|B\|_{E}^2 = tr ( B^{\top} B ) = S(B) + L(B) = \sum_{i=1}^3	b_{ii}^2	+	2	b_{12}^2	+	2	b_{13}^2	+	2	b_{23}^2\,\!


Using the expressions (59) and (61) for the elements  b_{13} and  b_{23} , respectively, we can conclude (82), and after some transformations we finally get (83). Thus, the sums of these squared elements of the original matrix  A and of the transformed matrix  B are equal and cancel out, as will be used below.


 (82 \,\!)
	2b_{13}^2+2b_{23}^2=2(a_{13}^2cos^2\theta-2a_{13}a_{23}cos\theta sin\theta+a_{23}^2sin^2\theta)+2(a_{13}^2sin^2\theta+2a_{13}a_{23}cos\theta sin\theta+a_{23}^2cos^2\theta)\,\!


 (83 \,\!)
	2b_{13}^2+2b_{23}^2=2(a_{13}^2cos^2\theta+a_{23}^2sin^2\theta)+2(a_{13}^2sin^2\theta+a_{23}^2cos^2\theta)=2a_{13}^2+2a_{23}^2\,\!


With the assumption  b_{12} = 0 and the condition  a_{12} \ne 0 we can rewrite the sum of squared diagonal elements of the matrix  B as in (84). We immediately conclude that the sum of squared diagonal elements of the transformed matrix  B is greater than the sum of squared diagonal elements of the initially given matrix  A , as stated in (85). This establishes the idea of Jacobi's method - increasing the sum of squared diagonal elements of the transformed matrix  B . The proof can easily be extended to matrices of order higher than  m = 3 .


 (84 \,\!)
	S(B)=\sum_{i=1}^3	b_{ii}^2 =	\sum_{i=1}^3	a_{ii}^2	+	2	a_{12}^2 = S(A)+2 a_{12}^2 \,\!


 (85 \,\!)
	S(B)>S(A)\,\!


Classical Jacobi's method

The classical Jacobi's method uses a transformation matrix which contains a rotation matrix to decrease the sum of squared off-diagonal elements of the transformed matrices. In each transformation it is necessary to find the largest off-diagonal element in absolute value and then, using the angle  \theta , set this element to  0 and recalculate the other elements of the transformed matrix. Later we will prove that the transformed matrices converge to the matrix of eigenvalues  \Lambda , and that the product of the transformation matrices converges to the matrix of eigenvectors  \Gamma . Put otherwise, the elements of the resulting diagonal matrix (after the similarity transformations) are the eigenvalues of the matrix  A , and the matrix of eigenvectors is obtained as the product of the transformation matrices.

The classical Jacobi's method is a simple algorithm (strategy) and not the fastest one. For tridiagonal matrices the procedure can be easier and faster; one can use, for example, the Householder algorithm to transform a symmetric matrix to a tridiagonal symmetric matrix.

Two concepts will be considered:

  • Decreasing the sum of squared off-diagonal elements of the transformed matrices and
  • Proof of convergence of the classical Jacobi's method [3].

At the end of this chapter a numerical example will be introduced to complete the presented concept for finding the eigenvalues and eigenvectors of a real symmetric matrix  A .


  • Decreasing the sum of squared off-diagonal elements of the transformed matrices


Since the matrices  A and  B have the same Euclidean norm (according to the proof in (54) and (55)), we can state the relationship (86), which can be written as in (87); the difference  S(A) - S(B) = -2a_{ij}^2 used there follows from (84). Expression (88) uses the fact that the sum of all squared off-diagonal elements of a symmetric matrix is at most the squared largest off-diagonal element multiplied by the number of off-diagonal elements; for a symmetric matrix of order  m = 3 there are 6 such elements.

Rearranging (88) to  a_{ij}^2 \ge L(A)/(m(m-1)) and plugging this into (87) gives (89), and therefore the result (90): the sum of squared off-diagonal elements of the transformed matrix is less than or equal to the sum of squared off-diagonal elements of the symmetric matrix before the transformation. Because the factor  1-2/(m(m-1)) in (89) is strictly smaller than one, the sum of squared off-diagonal elements of the transformed matrices tends to zero.


 (86 \,\!)
	\|B\|_{E}^2-S(B)=\|A\|_{E}^2-S(B)\,\!


 (87 \,\!)
	L(B)=L(A)+S(A)-S(B)=L(A)-2a_{ij}^2 \,\!


 (88 \,\!)
	L(A) \le m(m-1)a_{ij}^2 \,\!


 (89 \,\!)
	L(B)=L(A)-2a_{ij}^2 \le L(A)-\frac {2}{m(m-1)} L(A)= \left(1-\frac {2}{m(m-1)} \right) L(A)\,\!


 (90 \,\!)
	L(B) \le L(A)\,\!


Our transformed matrix  B will now be replaced by the matrix  A_{1} , which represents the matrix  A after the first transformation.

In general, the classical Jacobi's method starts with a symmetric matrix  A and uses a sequence of rotations  W_{n} (similarity transformations) to decrease the sum of squares of the off-diagonal elements of the transformed matrices.

If we consider a sequence of square matrices { A, A_{1}, A_{2},…, A_{n} } and a sequence of rotation matrices { W_{1}, W_{2},…, W_{n} }, we get expression (93), which is analogous to (86). The transformed matrix  A_{n} is given by the similarity transformations in (91). As we can see from this expression and from the previous result (90), we are indeed interested in decreasing the sum of squared off-diagonal elements of the resulting transformed matrices; in each step we use the element with the strongest impact on this decrease, i.e. the largest off-diagonal element in absolute value.

The sequence { X_{1}, X_{2},…, X_{n} } represents the product of transformation matrices { W_{1}, W_{2},…, W_{n} }. In particular, the matrix  X_{n} is given in expression (92).


 (91 \,\!)
	A_{n}=W_{n}	^{\top}	\cdots	W_{2} ^{\top}	W_{1} ^{\top}	A	W_{1}	W_{2}	\cdots	W_{n} \,\!


 (92 \,\!)
	X_{n}= W_{1}W_{2}\cdots	W_{n} \,\!


 (93 \,\!)
	\|A_{n}\|_{E}^2-S(A_{n})=\|A\|_{E}^2-S(A_{n})\,\!


  • Proof of convergence of the classical Jacobi's method [3]


The classical Jacobi's method generates a sequence of rotations { W_{1}, W_{2},…, W_{n} } with the property (94), which means that the limit of the transformed matrices exists and is equal to a diagonal matrix  \Lambda whose entries are the eigenvalues of  A , in some order [3]. It has already been shown that the off-diagonal elements tend to zero. If  A_{n_{k}} is a convergent subsequence with limit  \Lambda , we obtain expression (95); the last equality holds because every  A_{n_{k}} is similar to  A and therefore has the same characteristic polynomial.


 (94 \,\!)
	\lim_{n	\to	\infty}A_{n}	=\Lambda\,\!


 (95 \,\!)
	|\Lambda-\lambda I|=\lim_{k \to \infty}|A_{n_{k}}-\lambda I|=|A-\lambda I|\,\!


Thus,  \Lambda possesses the same eigenvalues, counting multiplicities, as matrix  A [3].

On the other hand, the sequence { X_{1}, X_{2},…, X_{n} } converges to the matrix of eigenvectors  \Gamma .

The sequence  X_{n} is bounded because of (97). If we suppose that  X_{n} has a convergent subsequence  X_{n_{k}} with limit  \Gamma , then one can show (98).


 (96 \,\!)
	\lim_{n	\to	\infty}X_{n}	=\Gamma\,\!


 (97 \,\!)
	||X_{n}||_{E}^{2}=tr\left(X_{n}^{\top}X_{n}\right)=tr(I)=m\,\!


 (98 \,\!)
	\lim_{n	\to	\infty}||X_{n+1}-X_{n}||_{E}=0


Threshold Jacobi's Method

The threshold Jacobi's Method is an algorithm that transforms a symmetric matrix to a diagonal matrix (matrix of its eigenvalues,  \Lambda ) according to a fixed schedule. The goal is to increase the sum of squared diagonal elements and to decrease the sum of squared off-diagonal elements in the transformed matrices.

In the threshold Jacobi's Method the cyclic strategy for searching through all off-diagonal entries is modified so that a rotation is undertaken only when the current off-diagonal element is sufficiently large in absolute value [3].

This method is, like the classical Jacobi's Method, slow and simple, and it can be improved.
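
One way to read this strategy is sketched below in Python/NumPy (a simplified interpretation with a fixed threshold; in practice the threshold is often lowered from sweep to sweep): the off-diagonal entries are visited cyclically and a rotation is performed only when the current entry exceeds the threshold in absolute value.

```python
import numpy as np

def threshold_jacobi(A, threshold=1e-10, max_sweeps=50):
    A = A.astype(float).copy()
    m = A.shape[0]
    X = np.eye(m)                       # accumulates the product of the rotations, as in (92)
    for _ in range(max_sweeps):
        rotated = False
        for p in range(m - 1):
            for q in range(p + 1, m):
                if abs(A[p, q]) <= threshold:
                    continue            # skip entries that are already small enough
                rotated = True
                theta = 0.5 * np.arctan2(2.0 * A[p, q], A[q, q] - A[p, p])   # (70)
                W = np.eye(m)
                c, s = np.cos(theta), np.sin(theta)
                W[p, p] = W[q, q] = c
                W[p, q], W[q, p] = s, -s
                A = W.T @ A @ W         # similarity transformation (52)
                X = X @ W
        if not rotated:                 # all off-diagonal entries are below the threshold
            break
    return np.diag(A), X                # approximate eigenvalues and eigenvectors

eigenvalues, eigenvectors = threshold_jacobi(np.array([[2.0, 1.0], [1.0, 2.0]]))
print(eigenvalues)   # approximately [1. 3.]
```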


Numerical Example

Using the statistical software XploRe we consider a numerical example for the computation of eigenvalues and eigenvectors. For this purpose we use the matrix  A of order  m = 3 given in (99). After every transformation it is necessary to recalculate the angle  \theta . In the first transformation we use the angle  \theta=0.71445 . The resulting first transformation matrix  W_{1} and the first transformed matrix  A_{1} are given in (100) and (101), respectively.

The first step is to find the largest squared off-diagonal element and then to zero it according to the procedure of Jacobi's method. Proceeding in this way, we obtain the largest decrease of the sum of squared off-diagonal elements.

In this example all results are rounded to five decimal places.


 (99 \,\!)
	A= \begin{bmatrix} \ \ 1.00000& \ \ 7.00000& \ \ 4.00000\\ \ \ 7.00000& \ \ 3.00000& \ \ 2.00000\\ \ \ 4.00000& \ \ 2.00000& \ \ 8.00000\end{bmatrix}
\,\!


 (100 \,\!)
	W_{1}= \begin{bmatrix} \ \ 0.75545& \ \ 0.65520& \ \ 0.00000\\ -0.65520& \ \ 0.75545& \ \ 0.00000 \\ \ \ 0.00000& \ \ 0.00000& \ \ 1.00000\end{bmatrix}
\,\!


 (101 \,\!)
	A_{1}= \begin{bmatrix} -5.07103& -0.00002& \ \ 1.71140\\-0.00002& \ \ 9.07099& \ \ 4.13170\\ \ \ 1.71140& \ \ 4.13170& \ \ 8.00000\end{bmatrix}
\,\!


After two sweeps through the off-diagonal elements (6 rotations in our case) we get the diagonal matrix of eigenvalues (103). Also, the matrix of corresponding eigenvectors is given in (104). Finally, the spectral decomposition of the real symmetric matrix  A is given in expression (105).


 (102 \,\!)
	W_{6}= \begin{bmatrix} \ \ 1.00000& \ \ 0.00000&-0.00002\\ \ \ 0.00000& \ \ 1.00000& \ \ 0.00000\\ -0.00002& \ \ 0.00000& \ \ 1.00000\end{bmatrix}
\,\!


 (103 \,\!)
	A_{6}= \begin{bmatrix}12.77412&\ \ 0.00000&\ \ 0.00000\\ \ \ 0.00000& \ \ 4.53849&\ \ 0.00000\\ \ \ 0.00000&\ \ 0.00000&-5.31261\end{bmatrix}
\,\!


 (104 \,\!)
	X_{6} = \begin{bmatrix} -0.53518& -0.33904& \ \ 0.77372\\ -0.51957& -0.59007& - 0.61795 \\ -0.66606 & \ \ 0.73271 & -0.13964 \end{bmatrix}
\,\!


 (105 \,\!)
	A = \begin{bmatrix} \ \ 1.00000& \ \ 7.00000& \ \ 4.00000\\ \ \ 7.00000& \ \ 3.00000& \ \ 2.00000\\ \ \ 4.00000& \ \ 2.00000& \ \ 8.00000\end{bmatrix} = \begin{bmatrix} -0.53518& -0.33904& \ \ 0.77372\\ -0.51957& -0.59007& - 0.61795 \\ -0.66606 & \ \ 0.73271 & -0.13964 \end{bmatrix} \cdot \,\!


	\cdot \begin{bmatrix} 12.77412&\ \ 0.00000&\ \ 0.00000\\ \ \ 0.00000& \ \ 4.53849&\ \ 0.00000\\ \ \ 0.00000&\ \ 0.00000&-5.31261\end{bmatrix} \begin{bmatrix} -0.53518& -0.33904& \ \ 0.77372\\ -0.51957& -0.59007& - 0.61795 \\ -0.66606 & \ \ 0.73271 & -0.13964 \end{bmatrix}^{\top} \,\!
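
The whole example can be reproduced with a few lines of code. The sketch below (Python/NumPy; the original computations were carried out in XploRe) implements the classical strategy: in every step the largest off-diagonal element in absolute value is zeroed. Eigenvalues and eigenvectors are returned up to ordering and sign.

```python
import numpy as np

def classical_jacobi(A, tol=1e-10, max_iter=100):
    A = A.astype(float).copy()
    m = A.shape[0]
    X = np.eye(m)
    for _ in range(max_iter):
        off = np.abs(A - np.diag(np.diag(A)))       # off-diagonal part in absolute value
        p, q = np.unravel_index(np.argmax(off), off.shape)
        if off[p, q] < tol:
            break
        theta = 0.5 * np.arctan2(2.0 * A[p, q], A[q, q] - A[p, p])   # (70)
        W = np.eye(m)
        c, s = np.cos(theta), np.sin(theta)
        W[p, p] = W[q, q] = c
        W[p, q], W[q, p] = s, -s
        A = W.T @ A @ W          # similarity transformation, as in (91)
        X = X @ W                # product of the rotations, as in (92)
    return A, X

A = np.array([[1.0, 7.0, 4.0],
              [7.0, 3.0, 2.0],
              [4.0, 2.0, 8.0]])
Lam, Gam = classical_jacobi(A)
print(np.round(np.diag(Lam), 5))   # 12.77412, 4.53849, -5.31261 in some order, compare (103)
print(np.round(Gam, 5))            # eigenvectors as columns, up to sign and order, compare (104)
assert np.allclose(Gam @ Lam @ Gam.T, A)   # spectral decomposition, compare (105)
```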


Conclusion

In the second chapter the computation of eigenvalues and eigenvectors was introduced. This part gives the necessary understanding of the problem of computing eigenvalues and eigenvectors.

In the third chapter Jacobi's method was explained in detail. This method is very useful for the computation of eigenvalues and eigenvectors of matrices of higher order. Two variants were introduced, the classical Jacobi's method and the threshold Jacobi's method. These methods are simple; a disadvantage is that they are slower than some other methods (e.g. the factorization methods, the QR and QL algorithms).

The computation can be further improved if we first transform the symmetric matrix to tridiagonal form with the Householder algorithm. The computation then becomes faster and easier. This advantage is particularly important for the computation of eigenvalues and eigenvectors of matrices of higher order.

Finally, the introduced Jacobi's method is a numerical algorithm that enables us to compute eigenvalues and eigenvectors of real symmetric matrices in the  m -dimensional space  R^m .


References

[1] Härdle, W., Simar, L. (2002) Applied Multivariate Statistical Analysis, Springer

[2] Judge, G.G. et al. (1985) The Theory and Practice of Econometrics, 2nd ed. John Wiley & Sons, Inc., New York

[3] Lange, K. (1998) Numerical Analysis for Statisticians, Springer-Verlag

[4] Press, W.H. et al. (1992) Numerical Recipes in C, 2nd ed. Cambridge University Press, New York

[5] Sydsaeter, K., Hammond, P.J. (1995) Mathematics for Economic Analysis, Prentice-Hall Inc., New Jersey

[6] Tsay, R.S. (2002) Analysis of Financial Time Series, John Wiley & Sons, Inc., New York