
inverse iteration than using forward iteration. For this reason and because a shift can be chosen to converge to any eigenpair, inverse iteration is much more important in practical analysis, and in the algorithms presented later, we always use inverse iteration whenever vector iteration is required.

11.2.4 Rayleigh Quotient Iteration

We discussed in Section 11.2.3 that the convergence rate in inverse iteration can be much improved by shifting. In practice, the difficulty lies in choosing an appropriate shift. One possibility is to use as a shift value the Rayleigh quotient calculated in (11.18), which is an approximation to the eigenvalue sought. If a new shift using (11.18) is evaluated in each iteration, we have the Rayleigh quotient iteration (see A. M. Ostrowski [A]). In this procedure we assume a starting iteration vector x_{1} , hence y_{1} = Mx_{1} , a starting shift \rho(\overline{x}_{1}) , which is usually zero, and then evaluate for k = 1, 2, \ldots :


[ \mathbf {K} - \rho (\overline {{{\mathbf {x}}}} _ {k}) \mathbf {M} ] \overline {{{\mathbf {x}}}} _ {k + 1} = \mathbf {y} _ {k} \tag {11.50}

\overline {{{\mathbf {y}}}} _ {k + 1} = \mathbf {M} \overline {{{\mathbf {x}}}} _ {k + 1} \tag {11.51}

\rho (\overline {{{\mathbf {x}}}} _ {k + 1}) = \frac {\overline {{{\mathbf {x}}}} _ {k + 1} ^ {T} \mathbf {y} _ {k}}{\overline {{{\mathbf {x}}}} _ {k + 1} ^ {T} \overline {{{\mathbf {y}}}} _ {k + 1}} + \rho (\overline {{{\mathbf {x}}}} _ {k}) \tag {11.52}

\mathbf {y} _ {k + 1} = \frac {\overline {{{\mathbf {y}}}} _ {k + 1}}{(\overline {{{\mathbf {x}}}} _ {k + 1} ^ {T} \overline {{{\mathbf {y}}}} _ {k + 1}) ^ {1 / 2}} \tag {11.53}

where now \mathbf{y}_{k + 1}\rightarrow \mathbf{M}\phi_{i} and \rho (\overline{\mathbf{x}}_{k + 1})\rightarrow \lambda_{i} as k\to \infty
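The steps (11.50) to (11.53) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `rayleigh_quotient_iteration`, the fixed step count, and the use of a dense solver are assumptions made here, not part of the text. A practical code would stop once the shift has converged, because the shifted matrix then becomes nearly singular.

```python
import numpy as np

def rayleigh_quotient_iteration(K, M, x1, rho=0.0, steps=3):
    """Illustrative sketch of (11.50)-(11.53); returns [(rho, y), ...] per step."""
    K = np.asarray(K, dtype=float)
    M = np.asarray(M, dtype=float)
    y = M @ np.asarray(x1, dtype=float)            # y_1 = M x_1
    history = []
    for _ in range(steps):
        x_bar = np.linalg.solve(K - rho * M, y)    # (11.50)
        y_bar = M @ x_bar                          # (11.51)
        denom = x_bar @ y_bar
        rho = (x_bar @ y) / denom + rho            # (11.52)
        y = y_bar / np.sqrt(denom)                 # (11.53)
        history.append((rho, y.copy()))
    return history

# Small diagonal test problem (eigenvalues 2 and 6), full starting vector.
for rho_k, y_k in rayleigh_quotient_iteration(np.diag([2.0, 6.0]), np.eye(2), [1.0, 1.0]):
    print(f"rho = {rho_k:.6f}, y = {y_k}")
```

With the starting vector [1, 1] the shift moves from 2.4 to about 2.00548 and then to 2.000000 within three steps, which illustrates the rapid final convergence.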

The eigenvalue \lambda_{i} and corresponding eigenvector \phi_{i} to which the iteration converges depend on the starting iteration vector \mathbf{x}_1 and the initial shift \rho(\overline{\mathbf{x}}_1) . If \mathbf{x}_1 has strong components of an eigenvector, say \phi_k , and \rho(\overline{\mathbf{x}}_1) provides a sufficiently close shift to the corresponding eigenvalue \lambda_k , then the iteration converges to the eigenpair (\lambda_k, \phi_k) and the ultimate order of convergence for both \lambda_k and \phi_k is cubic. Hence, in practice we need to ensure that \mathbf{x}_1 is reasonably close to the eigenvector of interest, and then convergence will always be cubic. This excellent convergence behavior is a most important observation. We may intuitively explain it by the fact that in inverse iteration the vector converges linearly, and with an error of order \epsilon in the vector the Rayleigh quotient predicts the eigenvalue with an error of order \epsilon^2 . Since the eigenvalue approximation used as a shift has a direct effect on the approximation to be obtained for the eigenvector, and vice versa, it seems probable that in Rayleigh quotient iteration the order of convergence is cubic for both the eigenvalue and eigenvector.

To analyze the convergence characteristics of Rayleigh quotient iteration we may proceed in the same way as in the analysis of inverse iteration; i.e., we consider the iteration in the basis of the eigenvectors. In this case we use the transformation in (11.24) and write the two basic equations of Rayleigh quotient iteration [i.e., (11.50) and (11.52), respectively] in the following form:


[ \mathbf {\Lambda} - \rho (\mathbf {z} _ {k}) \mathbf {I} ] \mathbf {z} _ {k + 1} = \mathbf {z} _ {k} \tag {11.54}

\rho (\mathbf {z} _ {k + 1}) = \frac {\mathbf {z} _ {k + 1} ^ {T} \mathbf {z} _ {k}}{\mathbf {z} _ {k + 1} ^ {T} \mathbf {z} _ {k + 1}} + \rho (\mathbf {z} _ {k}) \tag {11.55}

where the length normalization of the iteration vector has been omitted.

To consider the convergence characteristics of the iteration vector, let us perform an approximate convergence analysis that gives insight into the working of the algorithm. Assume that the current iteration vector \mathbf{z}_{l} is already close to the eigenvector \mathbf{e}_{1} ; i.e., we have


\mathbf {z} _ {l} ^ {T} = \left[ \begin{array}{l l l l l} 1 & o (\epsilon) & o (\epsilon) & \dots & o (\epsilon) \end{array} \right] \tag {11.56}

where o(\epsilon) denotes “of order \epsilon ” and \epsilon \ll 1 . We then obtain


\rho (\mathbf{z}_{l}) = \lambda_{1} + o(\epsilon^{2}) \tag{11.57}

Solving from (11.54) for \mathbf{z}_{l + 1} , we thus have


\mathbf{z}_{l+1}^{T} = \left[ \begin{array}{llll} \frac{1}{o(\epsilon^{2})} & \frac{o(\epsilon)}{\lambda_{2} - \lambda_{1}} & \dots & \frac{o(\epsilon)}{\lambda_{n} - \lambda_{1}} \end{array} \right] \tag{11.58}

In order to assess the convergence of the iteration vector we normalize to 1 the first component of \mathbf{z}_{l+1} , to obtain


\overline{\mathbf{z}}_{l+1}^{T} = \left[ \begin{array}{lllll} 1 & o(\epsilon^{3}) & o(\epsilon^{3}) & \dots & o(\epsilon^{3}) \end{array} \right] \tag{11.59}

Hence, the elements that in \overline{\mathbf{z}}_l have been of order \epsilon are now of order \epsilon^3 , indicating cubic convergence.
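The order estimates in (11.57) to (11.59) can be spelled out componentwise. The following is a sketch under two assumptions made here: \lambda_1 is a simple eigenvalue (so that \lambda_i - \lambda_1 \neq 0 for i \geq 2), and the Rayleigh quotient in the eigenvector basis is taken in its standard form \mathbf{z}^T \boldsymbol{\Lambda} \mathbf{z} / \mathbf{z}^T \mathbf{z} .

```latex
\begin{aligned}
\rho(\mathbf{z}_l)
  &= \frac{\mathbf{z}_l^T \boldsymbol{\Lambda} \mathbf{z}_l}{\mathbf{z}_l^T \mathbf{z}_l}
   = \frac{\lambda_1 + \sum_{i=2}^{n} \lambda_i \, o(\epsilon^2)}{1 + \sum_{i=2}^{n} o(\epsilon^2)}
   = \lambda_1 + o(\epsilon^2) \\
(\mathbf{z}_{l+1})_1
  &= \frac{1}{\lambda_1 - \rho(\mathbf{z}_l)} = \frac{1}{o(\epsilon^2)} \\
(\mathbf{z}_{l+1})_i
  &= \frac{o(\epsilon)}{\lambda_i - \rho(\mathbf{z}_l)}
   = \frac{o(\epsilon)}{\lambda_i - \lambda_1 + o(\epsilon^2)}
   \approx \frac{o(\epsilon)}{\lambda_i - \lambda_1},
   \qquad i = 2, \dots, n \\
\frac{(\mathbf{z}_{l+1})_i}{(\mathbf{z}_{l+1})_1}
  &= o(\epsilon^2) \, \frac{o(\epsilon)}{\lambda_i - \lambda_1} = o(\epsilon^3)
\end{aligned}
```

The second and third lines follow from reading (11.54) row by row, since the coefficient matrix is diagonal.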

Consider the following example to demonstrate the characteristics of Rayleigh quotient iteration.

EXAMPLE 11.7: Perform the Rayleigh quotient iteration on the problem \Lambda \phi = \lambda \phi , where


\boldsymbol {\Lambda} = \left[ \begin{array}{l l} 2 & 0 \\ 0 & 6 \end{array} \right]

Use as starting iteration vectors x_{1} the vectors


\mathbf {x} _ {1} = \left[ \begin{array}{l} 1 \\ 1 \end{array} \right]; \quad \mathbf {x} _ {1} = \left[ \begin{array}{l} 1 \\ 0. 1 \end{array} \right] \tag {1}

Using the relations given in (11.50) to (11.53) [with \rho(\overline{\mathbf{x}}_1) = 0.0 ], we obtain, in case 1,


\overline{\mathbf{x}}_{2} = \left[ \begin{array}{l} 0.500 \\ 0.166667 \end{array} \right]; \quad \rho(\overline{\mathbf{x}}_{2}) = 2.40

\mathbf{y}_{2} = \left[ \begin{array}{l} 0.94868 \\ 0.31623 \end{array} \right]

\overline{\mathbf{x}}_{3} = \left[ \begin{array}{c} -2.37171 \\ 0.08784 \end{array} \right]; \quad \rho(\overline{\mathbf{x}}_{3}) = 2.00548

\mathbf{y}_{3} = \left[ \begin{array}{c} -0.99931 \\ 0.03701 \end{array} \right]

\overline{\mathbf{x}}_{4} = \left[ \begin{array}{c} 182.37496 \\ 0.00927 \end{array} \right]; \quad \rho(\overline{\mathbf{x}}_{4}) = 2.000000

and

\mathbf{y}_{4} = \left[ \begin{array}{c} 1.0000 \\ 0.00005 \end{array} \right]

Hence, we see that in three steps of iteration we have obtained a good approximation to the required eigenvalue and eigenvector.

In case 2 we have


\overline{\mathbf{x}}_{2} = \left[ \begin{array}{l} 0.50000 \\ 0.016667 \end{array} \right]; \quad \rho(\overline{\mathbf{x}}_{2}) = 2.00444

\mathbf{y}_{2} = \left[ \begin{array}{l} 0.99944 \\ 0.033315 \end{array} \right]

\overline{\mathbf{x}}_{3} = \left[ \begin{array}{c} -225.125 \\ 0.00834 \end{array} \right]; \quad \rho(\overline{\mathbf{x}}_{3}) = 2.000001

and then

\mathbf{y}_{3} = \left[ \begin{array}{c} -1.00000 \\ 0.000037 \end{array} \right]

We observe that in this case two iterations are sufficient to obtain a good approximation to the required eigenvalue and eigenvector because the starting iteration vector was already closer to the required eigenvector.

As was pointed out in the preceding discussion, the Rayleigh quotient iteration can, in principle, converge to any eigenpair. Therefore, if we are interested in the smallest p eigenvalues and corresponding eigenvectors, we need to supplement the Rayleigh quotient iterations by another technique to ensure convergence to one of the eigenpairs sought. For example, to calculate the smallest eigenvalue and corresponding eigenvector, we may first use the inverse iteration in (11.16) to (11.19) without shifting to obtain an iteration vector that is a good approximation of \phi_{1} , and only then start with Rayleigh quotient iteration. However, the difficulty lies in assessing how many inverse iterations must be performed before Rayleigh quotient shifting can be started and yet convergence to \phi_{1} and \lambda_{1} is achieved. Unfortunately, this question can in general not be resolved, and it is necessary to use the Sturm sequence property to make sure that the required eigenvalue and corresponding eigenvector have indeed been calculated (see Section 11.4).

11.2.5 Matrix Deflation and Gram-Schmidt Orthogonalization

In Sections 11.2.1 to 11.2.4 we discussed how an eigenvalue and corresponding eigenvector can be calculated using vector iteration. The basic inverse iteration technique converges to \lambda_{1} and \phi_{1} (see Section 11.2.1), and the basic forward iteration can be used to calculate \lambda_{n} and \phi_{n} (see Section 11.2.2), but the methods can also be employed with shifting to calculate other eigenvalues and corresponding eigenvectors (see Section 11.2.3). Assume now that we have calculated a specific eigenpair, say (\lambda_{k}, \phi_{k}) , using either method and that we require the solution of another eigenpair. To ensure that we do not converge again to \lambda_{k} and \phi_{k} , we need to deflate either the matrices or the iteration vectors.

Matrix deflation has been applied extensively in the solution of standard eigenproblems. The problem may be K\phi = \lambda\phi , i.e., when M is the identity matrix in K\phi = \lambda M\phi , or may be \tilde{K}\tilde{\phi} = \lambda\tilde{\phi} , which is obtained by transforming the generalized eigenproblem into a standard form (see Section 10.2.5). We recall that this transformation is effective when M is diagonal and all diagonal elements are larger than zero because in such a case \tilde{K} has the same bandwidth as K.

Consider the deflation of K\phi = \lambda\phi because the deflation of \tilde{K}\tilde{\phi} = \lambda\tilde{\phi} would be obtained in the same way. A stable matrix deflation can be carried out by finding an orthogonal matrix P whose first column is the calculated eigenvector \phi_{k} .

Writing P as


\mathbf {P} = \left[ \boldsymbol {\phi} _ {k}, \mathbf {p} _ {2}, \dots , \mathbf {p} _ {n} \right] \tag {11.60}

we need to have \boldsymbol{\phi}_k^T\mathbf{p}_i = 0 for i = 2,\dots ,n . It then follows that


\mathbf {P} ^ {T} \mathbf {K P} = \left[ \begin{array}{c c} \lambda_ {k} & \mathbf {0} \\ \mathbf {0} & \mathbf {K} _ {1} \end{array} \right] \tag {11.61}

because \phi_k^T\phi_k = 1 . The important point is that \mathbf{P}^T\mathbf{K}\mathbf{P} has the same eigenvalues as \mathbf{K} , and therefore \mathbf{K}_1 must have all eigenvalues of \mathbf{K} except \lambda_k . In addition, denoting the eigenvectors of \mathbf{P}^T\mathbf{K}\mathbf{P} by \overline{\phi}_i , we have


\boldsymbol {\phi} _ {i} = \mathbf {P} \overline {{{\boldsymbol {\phi}}}} _ {i} \tag {11.62}

It is important to note that the matrix P is not unique and that various techniques can be used to construct an appropriate transformation matrix. Since K is banded, the transformation should preferably not destroy the band form (see, for example, H. Rutishauser [A]).

From this discussion it follows that once a second required eigenpair using K_{1} has been evaluated, the process of deflation can be repeated by working with K_{1} rather than with K. Therefore, we may continue deflating until all required eigenvalues and eigenvectors have been calculated. The disadvantage of matrix deflation is that the eigenvectors have to be calculated to very high precision to avoid the accumulation of errors introduced in the deflation process.
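As a concrete illustration of (11.60) and (11.61), one possible orthogonal matrix P with first column \phi_k is a Householder reflection. This is only one choice among the "various techniques" mentioned above and an assumption of this sketch; it does not preserve the band form of K. The function name `deflate_with_householder` is hypothetical.

```python
import numpy as np

def deflate_with_householder(K, phi):
    """Build an orthogonal P whose first column is phi and return (P, lambda_k, K_1)."""
    K = np.asarray(K, dtype=float)
    phi = np.asarray(phi, dtype=float)
    phi = phi / np.linalg.norm(phi)        # enforce phi^T phi = 1
    n = phi.size
    e1 = np.zeros(n)
    e1[0] = 1.0
    v = phi - e1
    if np.linalg.norm(v) < 1e-14:
        P = np.eye(n)                      # phi is already the first unit vector
    else:
        # Symmetric orthogonal reflection that maps e1 to phi, so P[:, 0] == phi.
        P = np.eye(n) - 2.0 * np.outer(v, v) / (v @ v)
    T = P.T @ K @ P                        # block form of (11.61)
    return P, T[0, 0], T[1:, 1:]

# Hypothetical 2x2 example: K has eigenvalues 1 and 3, and phi belongs to 1.
K = np.array([[2.0, -1.0], [-1.0, 2.0]])
P, lam, K1 = deflate_with_householder(K, np.array([1.0, 1.0]) / np.sqrt(2.0))
print(lam, K1)   # lam is close to 1 and K1 holds the remaining eigenvalue 3
```

If \phi_k is only approximate, the off-diagonal block of P^T K P is no longer zero, which is the error accumulation referred to above.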

Instead of matrix deflation, we may deflate the iteration vector in order to converge to an eigenpair other than (\lambda_{k}, \phi_{k}) . The basis of vector deflation is that in order for an iteration vector to converge in forward or inverse iteration to a required eigenvector, the iteration vector must not be orthogonal to it. Hence, conversely, if the iteration vector is orthogonalized to the eigenvectors already calculated, we eliminate the possibility that the iteration converges to any one of them, and, as we will see, convergence occurs instead to another eigenvector.

A particular vector orthogonalization procedure that is employed extensively is the Gram-Schmidt method. The procedure can be used in the solution of the generalized eigenproblem K\phi = \lambda M\phi , where M can take the different forms that we encounter in finite element analysis.

In order to consider a general case, assume that we have calculated in inverse iteration the eigenvectors \phi_{1}, \phi_{2}, \ldots, \phi_{m} and that we want to M-orthogonalize \mathbf{x}_{1} to these eigenvectors. In Gram-Schmidt orthogonalization a vector \tilde{\mathbf{x}}_{1} , which is M-orthogonal to the eigenvectors \phi_{i}, i = 1, \ldots, m , is calculated using


\tilde {\mathbf {x}} _ {1} = \mathbf {x} _ {1} - \sum_ {i = 1} ^ {m} \alpha_ {i} \phi_ {i} \tag {11.63}

where the coefficients \alpha_{i} are obtained using the conditions that \boldsymbol{\phi}_{i}^{T}\mathbf{M}\tilde{\mathbf{x}}_{1}=0, i=1,\ldots,m , and \boldsymbol{\phi}_{i}^{T}\mathbf{M}\boldsymbol{\phi}_{j}=\delta_{ij} . Premultiplying both sides of (11.63) by \boldsymbol{\phi}_{i}^{T}\mathbf{M} , we therefore obtain


\alpha_ {i} = \phi_ {i} ^ {T} \mathbf {M} \mathbf {x} _ {1}; \quad i = 1, \dots , m \tag {11.64}

In the inverse iteration we would now use \tilde{\mathbf{x}}_1 as the starting iteration vector instead of \mathbf{x}_1 and provided that \mathbf{x}_1^T\mathbf{M}\boldsymbol{\phi}_{m+1} \neq 0 , convergence occurs (at least in theory; see Section 11.2.6) to \boldsymbol{\phi}_{m+1} and \lambda_{m+1} .
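The deflation in (11.63) and (11.64) amounts to two matrix-vector products. A minimal sketch, assuming the known eigenvectors are stored as the columns of an array and are already M-orthonormal; the helper name `m_orthogonalize` and the small test data are made up for illustration.

```python
import numpy as np

def m_orthogonalize(x, Phi, M):
    """Return (x_tilde, alpha) with x_tilde M-orthogonal to the columns of Phi."""
    x = np.asarray(x, dtype=float)
    M = np.asarray(M, dtype=float)
    Phi = np.asarray(Phi, dtype=float).reshape(x.size, -1)
    alpha = Phi.T @ (M @ x)          # (11.64): alpha_i = phi_i^T M x_1
    x_tilde = x - Phi @ alpha        # (11.63)
    return x_tilde, alpha

# Hypothetical data: M = 2 I and one known eigenvector with phi1^T M phi1 = 1.
M = np.diag([2.0, 2.0])
phi1 = np.array([0.5, 0.5])
x_tilde, alpha = m_orthogonalize([1.0, 0.0], phi1, M)
print(alpha, x_tilde)                # alpha = [1.], x_tilde = [0.5, -0.5]
print(phi1 @ M @ x_tilde)            # 0.0, i.e. M-orthogonal to phi1
```

Note that the formula for \alpha_i relies on \phi_i^T M \phi_j = \delta_{ij} ; with unnormalized eigenvectors the coefficients would have to be divided by \phi_i^T M \phi_i .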

To prove the convergence given above, we consider as before the iteration process in the basis of eigenvectors; i.e., we analyze the iteration given in (11.25) when the Gram-Schmidt orthogonalization is included. In this case the eigenvectors corresponding to the smallest eigenvalues are e_{i}, i = 1, \ldots, m . Carrying out the deflation of the starting iteration vector z_{1} in (11.27), we obtain


\tilde {\mathbf {z}} _ {1} = \mathbf {z} _ {1} - \sum_ {i = 1} ^ {m} \alpha_ {i} \mathbf {e} _ {i} \tag {11.65}

with

\alpha_{i} = \mathbf{e}_{i}^{T}\mathbf{z}_{1} = 1; \qquad i = 1, \dots, m \tag{11.66}

Hence,

\tilde{\mathbf{z}}_{1}^{T} = \left[ \begin{array}{cccccc} 0 & \dots & 0 & 1 & \dots & 1 \end{array} \right] \tag{11.67}

where the first nonzero entry is element m + 1 .

Using now \tilde{z}_{1} as the starting iteration vector and performing the convergence analysis as discussed in Section 11.2.1, we find that if \lambda_{m+2} > \lambda_{m+1} , we have \tilde{z}_{l+1} \to e_{m+1} , as was required to prove. Furthermore, we find that the rate of convergence of the eigenvector is \lambda_{m+1}/\lambda_{m+2} , and when \lambda_{m+1} is a multiple eigenvalue, the rate of convergence is given by the ratio of \lambda_{m+1} to the next distinct eigenvalue.

Although so far we have discussed Gram-Schmidt orthogonalization in connection with vector inverse iteration, it should be realized that the orthogonalization procedure can be used equally well in the other vector iteration methods. All convergence considerations discussed in the presentation of inverse iteration, forward iteration, and Rayleigh quotient iteration are also applicable when Gram-Schmidt orthogonalization is included if it is taken into account that convergence to the eigenvectors already calculated is not possible.

EXAMPLE 11.8: Calculate, using Gram-Schmidt orthogonalization, an appropriate starting iteration vector for the solution of the problem K\phi = \lambda M\phi , where K and M are given in Example 11.4. Assume that the eigenpairs (\lambda_{1}, \phi_{1}) and (\lambda_{4}, \phi_{4}) are known as obtained in Example 11.5 and that convergence to another eigenpair is sought.

To determine an appropriate starting iteration vector, we want to deflate the unit full vector of the vectors \phi_{1} and \phi_{4} ; i.e., (11.63) reads


\tilde {\mathbf {x}} _ {1} = \left[ \begin{array}{l} 1 \\ 1 \\ 1 \\ 1 \end{array} \right] - \alpha_ {1} \boldsymbol {\phi} _ {1} - \alpha_ {4} \boldsymbol {\phi} _ {4}

where \alpha_{1} and \alpha_{4} are obtained using (11.64):


\alpha_ {1} = \phi_ {1} ^ {T} \mathbf {M} \mathbf {x} _ {1}; \quad \alpha_ {4} = \phi_ {4} ^ {T} \mathbf {M} \mathbf {x} _ {1}

Substituting for M, \phi_1 , and \phi_4 , we obtain


\alpha_{1} = 2.385; \quad \alpha_{4} = 0.1299

Then, to a few-digit accuracy,


\tilde{\mathbf{x}}_{1} = \left[ \begin{array}{c} 0.2683 \\ -0.2149 \\ -0.04812 \\ 0.2358 \end{array} \right]

11.2.6 Some Practical Considerations Concerning Vector Iterations

So far we have discussed the theory used in vector iteration techniques. However, for a proper computer implementation of the methods, it is important to interpret the theoretical results and relate them to practice. Of particular importance are practical convergence and stability considerations when any one of the techniques is used.

A first important point is that the convergence rates of the iterations may turn out to be rather theoretical when measured in practice. Namely, we assumed that the starting iteration vector z_{1} in (11.27) is a unit full vector, which corresponds to a vector x_{1} = \sum_{i=1}^{n} \phi_{i} . This means that the starting iteration vector is equally strong in each of the eigenvectors \phi_{i} . We chose this starting iteration vector to identify easily the theoretical convergence rate with which the iteration vector approaches the required eigenvector. However, in practice it is hardly possible to pick x_{1} = \sum_{i=1}^{n} \phi_{i} as the starting iteration vector, and instead we have


\mathbf {x} _ {1} = \sum_ {i = 1} ^ {n} \alpha_ {i} \boldsymbol {\phi} _ {i} \tag {11.68}

where the \alpha_{i} are arbitrary constants. This vector x_{1} corresponds to the following vector in the basis of eigenvectors:


\mathbf {z} _ {1} = \left[ \begin{array}{c} \alpha_ {1} \\ \vdots \\ \alpha_ {n} \end{array} \right] \tag {11.69}

To identify the effect of the constants \alpha_{i} , consider as an example the convergence analysis of inverse iteration without shifting when the starting vector in (11.68) is used and \lambda_{2} > \lambda_{1} . The conclusions reached will be equally applicable to other vector iteration methods. As before, we consider the iteration in the basis of eigenvectors \Phi and require \alpha_{1} \neq 0 in order to have x_{1}^{T}M\phi_{1} \neq 0 . After l inverse iterations we now have instead of (11.29),


\tilde {\mathbf {z}} _ {l + 1} = \left[ \begin{array}{c} 1 \\ \beta_ {2} \left(\frac {\lambda_ {1}}{\lambda_ {2}}\right) ^ {l} \\ \vdots \\ \beta_ {n} \left(\frac {\lambda_ {1}}{\lambda_ {n}}\right) ^ {l} \end{array} \right] \tag {11.70}

\beta_ {i} = \frac {\alpha_ {i}}{\alpha_ {1}}; \qquad i = 2, \dots , n \tag {11.71}

Therefore, the iteration vector obtained now has the multipliers \beta_{i} in its last n - 1 components. In the iteration the ith component is still decreasing with each iteration, as in (11.29), by the factor \lambda_{1}/\lambda_{i}, i = 2, \ldots, n , and the rate of convergence is \lambda_{1}/\lambda_{2} , as already derived in Section 11.2.1. However, in practical analysis the unknown coefficients \beta_{i} may produce the result that the theoretical convergence rate is not observed for many iterations. In practice, therefore, not only the order and rate of convergence but equally importantly the “quality” of the starting iteration vector determines the number of iterations required for convergence. Furthermore, it is important to use a sufficiently tight convergence tolerance to prevent premature acceptance of the iteration vector as an approximation to the required eigenvector.
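The influence of the multipliers \beta_{i} in (11.70) can be made tangible with a small estimate. The sketch below asks how many iterations l are needed until the second component, \beta_{2} (\lambda_{1}/\lambda_{2})^{l} , drops below a given tolerance. The function name, the tolerance, and the sample values are illustrative assumptions, and only the second component is considered.

```python
import math

def iterations_needed(beta, ratio, tol):
    """Smallest l with |beta| * ratio**l <= tol, for a convergence ratio 0 < ratio < 1."""
    if abs(beta) <= tol:
        return 0
    return math.ceil(math.log(tol / abs(beta)) / math.log(ratio))

# Same convergence rate lambda_1/lambda_2 = 0.5, different starting vectors.
print(iterations_needed(1.0, 0.5, 1e-6))   # 20: equally strong components
print(iterations_needed(1e4, 0.5, 1e-6))   # 34: weak component in phi_1
```

The rate is identical in both cases, yet the poorer starting vector costs 14 additional iterations in this estimate.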

Together with the vector iterations, we may use a matrix deflation procedure or Gram-Schmidt vector orthogonalization to obtain convergence to an eigenpair not already calculated (see Section 11.2.5). We have mentioned already that for matrix deflation, the eigenvectors have to be evaluated to relatively high precision to preserve stability. Considering the Gram-Schmidt orthogonalization, the method is sensitive to round-off errors and must also be used with care. If the technique is employed in inverse or forward iteration without shifting, it is necessary to calculate the eigenvectors to high precision in order that Gram-Schmidt orthogonalization will work. In addition, the iteration vector should be orthogonalized in each iteration to the eigenvectors already calculated.
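The recommendation to orthogonalize in every iteration can be sketched as follows. This is an illustrative inverse iteration without shifting, written in terms of the vector x rather than y = Mx; the function name, the stopping test on the Rayleigh quotient, and the default tolerance are assumptions of this sketch, and the known eigenvectors are assumed to be M-orthonormal.

```python
import numpy as np

def inverse_iteration_deflated(K, M, Phi_known, x1, tol=1e-12, max_iter=200):
    """Inverse iteration with Gram-Schmidt deflation applied in each iteration."""
    K = np.asarray(K, dtype=float)
    M = np.asarray(M, dtype=float)
    Phi_known = np.asarray(Phi_known, dtype=float).reshape(K.shape[0], -1)
    x = np.asarray(x1, dtype=float)
    rho_old = np.inf
    for _ in range(max_iter):
        # Remove components of the known eigenvectors, (11.63)-(11.64).
        x = x - Phi_known @ (Phi_known.T @ (M @ x))
        y = M @ x
        x_bar = np.linalg.solve(K, y)          # inverse iteration step
        y_bar = M @ x_bar
        rho = (x_bar @ y) / (x_bar @ y_bar)    # Rayleigh quotient of x_bar
        x = x_bar / np.sqrt(x_bar @ y_bar)     # M-normalize
        if abs(rho - rho_old) <= tol * abs(rho):
            break
        rho_old = rho
    return rho, x

# Hypothetical diagonal problem with eigenvalues 1, 2, 5; phi_1 = e_1 is known.
K = np.diag([1.0, 2.0, 5.0])
M = np.eye(3)
rho, x = inverse_iteration_deflated(K, M, np.array([1.0, 0.0, 0.0]), np.ones(3))
print(rho, x)   # rho approaches 2, x approaches the second unit vector
```

Repeating the deflation inside the loop removes the components of the known eigenvectors that round-off reintroduces.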

Let us now draw an important conclusion. We pointed out earlier in the presentation of the vector iteration techniques that it is difficult (and indeed theory shows impossible) to ensure convergence to a specific (but arbitrarily selected) eigenvalue and corresponding eigenvector. The discussion concerning practical aspects in this section substantiates those observations, and it is concluded that the vector iteration procedures and the Gram-Schmidt orthogonalization process must be employed with care if a specific eigenvalue and corresponding eigenvector are required. We will see in Sections 11.5 and 11.6 that, in fact, both techniques are best employed and are used very effectively in conjunction with other solution strategies.

11.2.7 Exercises

11.1. Consider the generalized eigenproblem


\left[ \begin{array}{r r r} 6 & - 1 & 0 \\ - 1 & 4 & - 1 \\ 0 & - 1 & 2 \end{array} \right] \boldsymbol {\phi} = \lambda \left[ \begin{array}{r r r} 2 & 0 & 0 \\ 0 & 2 & 1 \\ 0 & 1 & 1 \end{array} \right] \boldsymbol {\phi}

with the starting vector for iteration


\mathbf {x} _ {1} ^ {T} = \left[ \begin{array}{l l l} 1 & 1 & 1 \end{array} \right]

(a) Perform two inverse iterations and then use the Rayleigh quotient to calculate an approximation to \lambda_{1} .
(b) Perform two forward iterations and then use the Rayleigh quotient to calculate an approximation to \lambda_{3} .

11.2. Proceed as in Exercise 11.1, but for the following eigenproblem,


\left[ \begin{array}{r r r} 2 & -1 & 0 \\ -1 & 6 & -1 \\ 0 & -1 & 8 \end{array} \right] \boldsymbol{\phi} = \lambda \left[ \begin{array}{c c c} 1 & & \\ & \frac{1}{2} & \\ & & 2 \end{array} \right] \boldsymbol{\phi}

11.3. The eigenvectors corresponding to the two smallest eigenvalues \lambda_{1}=1 , \lambda_{2}=2 of the problem


\left[ \begin{array}{l l l} 2 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 2 \end{array} \right] \boldsymbol {\phi} = \lambda \boldsymbol {\phi}

are \phi_{1} = \frac{1}{\sqrt{3}}\left[ \begin{array}{c}1\\ -1\\ 1 \end{array} \right];\qquad \phi_{2} = \frac{1}{\sqrt{2}}\left[ \begin{array}{c}1\\ 0\\ -1 \end{array} \right]

Let \mathbf{x}_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}

Use the Gram-Schmidt orthogonalization procedure to extract from x_{1} a vector orthogonal to \phi_{1} and \phi_{2} . Show explicitly that this vector is the third eigenvector \phi_{3} and calculate \lambda_{3} .

11.4. Consider the eigenproblem


\left[ \begin{array}{r r r} 2 & - 1 & 0 \\ - 1 & 4 & - 1 \\ 0 & - 1 & 2 \end{array} \right] \boldsymbol {\phi} = \lambda \left[ \begin{array}{c c c} \frac {1}{2} & & \\ & 1 & \\ & & \frac {1}{2} \end{array} \right] \boldsymbol {\phi}

For this problem, \phi_{1} = \frac{1}{\sqrt{2}}\left[ \begin{array}{c}1\\ 1\\ 1 \end{array} \right];\qquad \phi_{3} = \frac{1}{\sqrt{2}}\left[ \begin{array}{c}1\\ -1\\ 1 \end{array} \right]

Use Gram-Schmidt orthogonalization to calculate \phi_{2} and calculate all eigenvalues.

11.3 TRANSFORMATION METHODS

We pointed out in Section 11.1 that the transformation methods comprise a group of eigensystem solution procedures that employ the basic properties of the eigenvectors in the matrix \Phi ,


\boldsymbol {\Phi} ^ {T} \mathbf {K} \boldsymbol {\Phi} = \boldsymbol {\Lambda} \tag {11.3}

and

\boldsymbol{\Phi}^{T} \mathbf{M} \boldsymbol{\Phi} = \mathbf{I} \tag{11.4}

Since the matrix \Phi , of order n \times n , which diagonalizes \mathbf{K} and \mathbf{M} in the way given in (11.3) and (11.4) is unique, we can try to construct it by iteration. The basic scheme is to reduce \mathbf{K} and \mathbf{M} to diagonal form using successive pre- and postmultiplication by matrices \mathbf{P}_k^T and \mathbf{P}_k , respectively, where k = 1, 2, \ldots . Specifically, if we define \mathbf{K}_1 = \mathbf{K} and \mathbf{M}_1 = \mathbf{M} , we form


\left. \begin{array}{c} \mathbf {K} _ {2} = \mathbf {P} _ {1} ^ {T} \mathbf {K} _ {1} \mathbf {P} _ {1} \\ \mathbf {K} _ {3} = \mathbf {P} _ {2} ^ {T} \mathbf {K} _ {2} \mathbf {P} _ {2} \\ \vdots \\ \mathbf {K} _ {k + 1} = \mathbf {P} _ {k} ^ {T} \mathbf {K} _ {k} \mathbf {P} _ {k} \\ \vdots \end{array} \right\} \tag {11.72}

Similarly,

\left. \begin{array}{c} \mathbf{M}_{2} = \mathbf{P}_{1}^{T} \mathbf{M}_{1} \mathbf{P}_{1} \\ \mathbf{M}_{3} = \mathbf{P}_{2}^{T} \mathbf{M}_{2} \mathbf{P}_{2} \\ \vdots \\ \mathbf{M}_{k+1} = \mathbf{P}_{k}^{T} \mathbf{M}_{k} \mathbf{P}_{k} \\ \vdots \end{array} \right\} \tag{11.73}

where the matrices \mathbf{P}_k are selected to bring \mathbf{K}_k and \mathbf{M}_k closer to diagonal form. Then for a proper procedure we apparently need to have


\mathbf {K} _ {k + 1} \rightarrow \Lambda \quad \text { and } \quad \mathbf {M} _ {k + 1} \rightarrow \mathbf {I} \quad \text { as } k \rightarrow \infty

in which case, with l being the last iteration,


\boldsymbol {\Phi} = \mathbf {P} _ {1} \mathbf {P} _ {2} \dots \mathbf {P} _ {l} \tag {11.74}

In practice, it is not necessary that \mathbf{M}_{k + 1} converges to \mathbf{I} and \mathbf{K}_{k + 1} to \pmb{\Lambda} , but they only need to converge to diagonal form. Namely, if


\mathbf {K} _ {k + 1} \rightarrow \operatorname{diag} (K _ {r}) \quad \text { and } \quad \mathbf {M} _ {k + 1} \rightarrow \operatorname{diag} (M _ {r}) \quad \text { as } k \rightarrow \infty

then with l indicating the last iteration and disregarding that the eigenvalues and eigenvectors may not be in the usual order,


\Lambda = \mathrm{diag} \left(\frac {K _ {r} ^ {(l + 1)}}{M _ {r} ^ {(l + 1)}}\right) \tag {11.75}

and

\boldsymbol{\Phi} = \mathbf{P}_{1} \mathbf{P}_{2} \dots \mathbf{P}_{l} \, \mathrm{diag}\left(\frac{1}{\sqrt{M_{r}^{(l+1)}}}\right) \tag{11.76}
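That (11.75) and (11.76) are consistent with (11.3) and (11.4) can be checked directly. In the short verification below, \mathbf{P} = \mathbf{P}_1 \mathbf{P}_2 \dots \mathbf{P}_l and \mathbf{D} = \mathrm{diag}(1/\sqrt{M_r^{(l+1)}}) are shorthand introduced here, and it is assumed that the off-diagonal elements have converged to zero and that all M_r^{(l+1)} > 0 .

```latex
\begin{aligned}
\boldsymbol{\Phi}^T \mathbf{M} \boldsymbol{\Phi}
  &= \mathbf{D} \, (\mathbf{P}^T \mathbf{M} \mathbf{P}) \, \mathbf{D}
   = \mathbf{D} \, \mathrm{diag}\left(M_r^{(l+1)}\right) \mathbf{D}
   = \mathbf{I} \\
\boldsymbol{\Phi}^T \mathbf{K} \boldsymbol{\Phi}
  &= \mathbf{D} \, (\mathbf{P}^T \mathbf{K} \mathbf{P}) \, \mathbf{D}
   = \mathbf{D} \, \mathrm{diag}\left(K_r^{(l+1)}\right) \mathbf{D}
   = \mathrm{diag}\left(\frac{K_r^{(l+1)}}{M_r^{(l+1)}}\right)
   = \boldsymbol{\Lambda}
\end{aligned}
```

The diagonal scaling thus only M-normalizes the columns of \mathbf{P} ; it does not change the directions of the eigenvectors.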

Using the basic idea described above, a number of different iteration methods have been proposed. We shall discuss in the next sections only the Jacobi and the Householder-QR methods, which are believed to be most effective in finite element analysis. However, before presenting the techniques in detail we should point out one important aspect. In the above introduction it was implied that iteration is started with pre- and postmultiplication by P_{1}^{T} and P_{1} , respectively, which is indeed the case in the Jacobi solution methods. However, alternatively, we may first aim to transform the eigenvalue problem K\phi = \lambda M\phi into a form that is more economical to use in the iteration. In particular, when M = I, the first m transformations in (11.72) may be used to reduce K into tridiagonal form without iteration, after which the matrices P_{i}, i = m + 1, \ldots, l , are applied in an iterative manner to bring K_{m+1} into diagonal form. In such a case the first matrices P_{1}, \ldots, P_{m} may be of different form than the later applied matrices P_{m+1}, \ldots, P_{l} . An application of this procedure is the Householder-QR method, in which Householder matrices are used to first transform K into tridiagonal form and then rotation matrices are employed in the QR transformations. The same solution strategy can also be used to solve the generalized eigenproblem K\phi = \lambda M\phi , M \neq I , provided that the problem is first transformed into the standard form.

11.3.1 The Jacobi Method

The basic Jacobi solution method has been developed for the solution of standard eigenproblems (M being the identity matrix), and we consider it in this section. The method was proposed over a century ago (see C. G. J. Jacobi [A]) and has been used extensively. A major advantage of the procedure is its simplicity and stability. Since the eigenvector properties in (11.3) and (11.4) (with M = I) are applicable to all symmetric matrices K with no restriction on the eigenvalues, the Jacobi method can be used to calculate negative, zero, or positive eigenvalues.

Considering the standard eigenproblem \mathbf{K}\boldsymbol{\phi} = \lambda \boldsymbol{\phi} , the k th iteration step defined in (11.72) reduces to


\mathbf {K} _ {k + 1} = \mathbf {P} _ {k} ^ {T} \mathbf {K} _ {k} \mathbf {P} _ {k} \tag {11.77}

where P_{k} is an orthogonal matrix; i.e., (11.73) gives


\mathbf {P} _ {k} ^ {T} \mathbf {P} _ {k} = \mathbf {I} \tag {11.78}

In the Jacobi solution the matrix P_{k} is a rotation matrix that is selected in such a way that an off-diagonal element in K_{k} is zeroed. If element (i,j) is to be reduced to zero, the corresponding orthogonal matrix P_{k} is


\mathbf{P}_{k} = \left[ \begin{array}{ccccccc} 1 & & & & & & \\ & \ddots & & & & & \\ & & \cos\theta & & -\sin\theta & & \\ & & & \ddots & & & \\ & & \sin\theta & & \cos\theta & & \\ & & & & & \ddots & \\ & & & & & & 1 \end{array} \right] \begin{array}{l} \\ \\ \leftarrow i\text{th row} \\ \\ \leftarrow j\text{th row} \\ \\ \\ \end{array} \tag{11.79}

where the four trigonometric entries lie in the ith and jth rows and columns, all other diagonal elements are 1, all other off-diagonal elements are zero, and \theta is selected from the condition that element (i,j) in \mathbf{K}_{k + 1} be zero. Denoting element (i,j) in \mathbf{K}_k by k_{ij}^{(k)} , we use


\tan 2 \theta = \frac {2 k _ {i j} ^ {(k)}}{k _ {i i} ^ {(k)} - k _ {j j} ^ {(k)}} \quad \text { for } k _ {i i} ^ {(k)} \neq k _ {j j} ^ {(k)} \tag {11.80}

and


\theta = \frac {\pi}{4} \quad \text { for } k _ {i i} ^ {(k)} = k _ {j j} ^ {(k)} \tag {11.81}

It should be noted that the numerical evaluation of K_{k+1} in (11.77) requires only the linear combination of two rows and two columns. In addition, advantage should also be taken of the fact that K_{k} is symmetric for all k; i.e., we should work on only the upper (or lower) triangular part of the matrix, including its diagonal elements.

An important point to emphasize is that although the transformation in (11.77) reduces an off-diagonal element in K_{k} to zero, this element will again become nonzero during the transformations that follow. Therefore, for the design of an actual algorithm, we have to decide which element to reduce to zero. One choice is to always zero the largest off-diagonal element in K_{k} . However, the search for the largest element is time-consuming, and it may be preferable to simply carry out the Jacobi transformations systematically, row by row or column by column, which is known as the cyclic Jacobi procedure. Running once over all off-diagonal elements is one sweep. The disadvantage of this procedure is that regardless of its size, an off-diagonal element is always zeroed; i.e., the element may already be nearly zero, and a rotation is still applied.
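A cyclic Jacobi procedure based on (11.77) to (11.81) can be sketched as follows. The sketch forms the full matrix P_{k} and the triple product for readability, whereas an efficient code would only combine two rows and two columns. The function name, the sweep limit, and the stopping test on the off-diagonal norm are assumptions made for this illustration.

```python
import numpy as np

def cyclic_jacobi(K, tol=1e-12, max_sweeps=50):
    """Cyclic Jacobi sketch for a symmetric K; returns (eigenvalues, Phi)."""
    A = np.array(K, dtype=float)
    n = A.shape[0]
    Phi = np.eye(n)
    for _ in range(max_sweeps):
        off_norm = np.linalg.norm(A - np.diag(np.diag(A)))
        if off_norm < tol:                    # assumed stopping test
            break
        for i in range(n - 1):                # one sweep, row by row
            for j in range(i + 1, n):
                if A[i, j] == 0.0:
                    continue                  # nothing to zero
                if A[i, i] != A[j, j]:
                    theta = 0.5 * np.arctan(2.0 * A[i, j] / (A[i, i] - A[j, j]))  # (11.80)
                else:
                    theta = np.pi / 4.0                                           # (11.81)
                c, s = np.cos(theta), np.sin(theta)
                P = np.eye(n)                 # rotation matrix (11.79)
                P[i, i], P[i, j] = c, -s
                P[j, i], P[j, j] = s, c
                A = P.T @ A @ P               # (11.77)
                Phi = Phi @ P                 # accumulate as in (11.74)
    return np.diag(A).copy(), Phi

# Hypothetical 2x2 example with eigenvalues 1 and 3.
lam, Phi = cyclic_jacobi(np.array([[2.0, -1.0], [-1.0, 2.0]]))
print(np.sort(lam))                           # approximately [1. 3.]
```

The eigenvalues are returned in the order in which they appear on the diagonal, which need not be ascending. Skipping only exactly zero elements keeps the sketch short; the drawback described above, rotating elements that are already nearly zero, remains.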