Files
김경종 4cc312954f
Tests / Hermetic test suite (push) Has been cancelled
Tests / Skill frontmatter validation (push) Has been cancelled
add wiki
2026-05-28 17:16:48 +09:00

29 KiB

  1. Use a simple scheme of bisection to identify intervals in which the individual eigenvalues lie. In this process, those intervals in which more than one eigenvalue are known to lie are successively bisected and the Sturm sequence check is carried out until all eigenvalues are isolated.
  2. Calculate the eigenvalues to the accuracy required and then obtain the corresponding eigenvectors by inverse iteration.

To obtain the eigenvalues accurately in step 4, the method of bisection is abandoned and a more efficient procedure is generally used. For example, the secant iteration presented in Section 11.4.2 can be employed once the eigenvalues have been isolated (see Example 11.20).

The above technique for calculating the required eigenvalues is straightforward. However, the method can be quite inefficient because each iteration in step 3 requires a triangular factorization and because many iterations may be needed. The number of iterations can be particularly high when multiple eigenvalues (see Fig. 11.4) or eigenvalue clusters must be evaluated, in which case additional strategies to accelerate the process must be included. In general, the method is effective if only a few factorizations are needed to identify intervals of individual eigenvalues and the smallest required eigenvalue is much larger than \lambda_{1} .

EXAMPLE 11.20: Use the bisection method followed by secant iteration to calculate \lambda_{2} in the problem \mathbf{K}\phi = \lambda \mathbf{M}\phi , where


\mathbf {K} = \left[ \begin{array}{r r r} 2 & - 1 & 0 \\ - 1 & 4 & - 1 \\ 0 & - 1 & 2 \end{array} \right]; \quad \mathbf {M} = \left[ \begin{array}{c c c} \frac {1}{2} & & \\ & 1 & \\ & & \frac {1}{2} \end{array} \right]

We considered this problem in Example 10.5, where we isolated the eigenvalues using the bisection technique. Specifically, we found that


\lambda_ {1} <   3 <   \lambda_ {2} <   5 <   \lambda_ {3}

Hence, we can start the secant iteration with \mu_{1} = 3 , \mu_{2} = 5 , and using the results of Example 10.5,


p (\mu_ {1}) = - \frac {3}{4}; \quad p (\mu_ {2}) = \frac {3}{4}

Using (11.119), we obtain


\mu_ {3} = 5 - \frac {\frac {3}{4}}{(\frac {3}{4}) - (- \frac {3}{4})} (5 - 3)

or


\mu_ {3} = 4

Next we need to evaluate


p (\mu_ {3}) = \det (\mathbf {K} - \mu_ {3} \mathbf {M})

and we find that p(\mu_{3}) = 0.0 . Hence, \lambda_{2} = \mu_{3} = 4 . Therefore, in this case one step of secant iteration was sufficient to calculate \lambda_{2} . But it may be noted that to evaluate p(\mu_{3}) by Gauss factorization of K - \mu_{3}M , a row interchange is needed (see Example 10.4).

It is important to note that we assumed in the above presentation that the factorization of K - \mu_{k}M into LDL^{T} can be obtained. However, we discussed in Sections 8.2.5 and 11.4.2 that if \mu_{k} > \lambda_{1} , interchanges may be required. If interchanges are needed, the same

considerations as mentioned in these sections are applicable and proper account has to be taken of the effect of each row interchange on the Sturm sequence property.

The bisection method has two major disadvantages: (1) it is necessary to complete the factorization of K - \mu_{k}M with as much accuracy as possible, although the coefficient matrix may be ill-conditioned (indeed, this may still be the case even when row interchanges are included), and (2) convergence can be very slow when a cluster of eigenvalues has to be solved. However, the Sturm sequence property can be employed in conjunction with other solution strategies, and it is in such cases that the property can be extremely useful. In particular, the Sturm sequence property is employed in the determinant search algorithm (see K. J. Bathe and E. L. Wilson [C]), in which the factorization of K - \mu_{k}M is carried out without interchanges but is completed only provided that no instability occurs. If the factorization proves to be unstable, a different \mu_{k} is selected. This is possible because the final accuracy with which the eigenvalues and eigenvectors are calculated does not depend on the specific shift \mu_{k} used in the solution.

11.4.4 Exercises

11.9. Perform an implicit secant polynomial iteration on the eigenproblem in Exercise 11.1 to calculate \lambda_{1} . Next use instead of p(\lambda) the characteristic polynomial deflated by \lambda_{1} , given by


p ^ {(1)} (\lambda) = \frac {p (\lambda)}{\lambda - \lambda_ {1}}

where p(\lambda) = \operatorname{det}(\mathbf{K} - \lambda \mathbf{M}) , to calculate \lambda_2 .

Plot p(\lambda) and p^{(1)}(\lambda) and show on these plots the iterative steps you have performed.

11.10. Consider the eigenproblem in Exercise 11.1 and develop a quadratic interpolation for an iterative scheme to solve for \lambda_{1} .

11.11. Use the Sturm sequence property to evaluate \lambda_{1} and \lambda_{2} in the eigenvalue problem in Exercise 11.8.

11.12. Use the Sturm sequence property to evaluate \lambda_{1} in the eigenvalue problem in Exercise 11.1. (If a zero pivot is encountered, use row and column interchanges.)

11.5 THE LANCZOS ITERATION METHOD

Very effective procedures for the solution of p eigenvalues and eigenvectors of finite element equations have been developed based on iterations with Lanczos transformations.

In his seminal work, C. Lanczos [A] proposed a transformation for the tridiagonalization of matrices. However, as already recognized by Lanczos, the tridiagonalization procedure has a major shortcoming in that the constructed vectors, which in theory should be orthogonal, are, as a result of round-off errors, not orthogonal in practice. A remedy is to use Gram-Schmidt orthogonalization, but such an approach is also sensitive to round-off errors and renders the process inefficient when a complete matrix is to be tridiagonalized. Other techniques such as the Householder method (see Section 11.3.3) are significantly more efficient.

If, on the other hand, the objective is to calculate only a few eigenvalues and corresponding eigenvectors of the problem K\phi = \lambda M\phi , an iteration based on the Lanczos transformation can be very efficient (see C. C. Paige [A, B] and T. Ericsson and A. Ruhe [A]).

In the following sections, we first present the basic Lanczos transformation with its important properties, and then we discuss the use of this transformation in an iterative manner for the solution of p eigenvalues and vectors of the problem K\phi = \lambda M\phi , where p \ll n and n is the order of the matrices.

11.5.1 The Lanczos Transformation

The basic steps of the Lanczos method transform, in theory, our generalized eigenproblem K\phi = \lambda M\phi into a standard form with a tridiagonal coefficient matrix. Let us summarize the steps of transformation.

Pick a starting vector x and calculate


\mathbf {x} _ {1} = \frac {\mathbf {x}}{\gamma}; \quad \gamma = (\mathbf {x} ^ {T} \mathbf {M} \mathbf {x}) ^ {1 / 2} \tag {11.122}

Let \beta_0 = 0 ; then calculate for i = 1, \ldots, n ,


\mathbf {K} \overline {{{\mathbf {x}}}} _ {i} = \mathbf {M} \mathbf {x} _ {i} \tag {11.123}

\alpha_ {i} = \overline {{{\mathbf {x}}}} _ {i} ^ {T} \mathbf {M} \mathbf {x} _ {i} \tag {11.124}

and if \mathbf{i} \neq \mathbf{n} , \tilde{\mathbf{x}}_i = \overline{\mathbf{x}}_i - \alpha_i \mathbf{x}_i - \beta_{i-1} \mathbf{x}_{i-1} (11.125)


\beta_ {i} = (\bar {\mathbf {x}} _ {i} ^ {T} \mathbf {M} \bar {\mathbf {x}} _ {i}) ^ {1 / 2} \tag {11.126}

and \mathbf{x}_{i + 1} = \frac{\tilde{\mathbf{x}}_i}{\beta_i} (11.127)

Theoretically, the vectors \mathbf{x}_i, i = 1, \ldots, n , generated using (11.122) to (11.127) are M-orthonormal


\mathbf {x} _ {i} ^ {T} \mathbf {M} \mathbf {x} _ {j} = \delta_ {i j} \tag {11.128}

and the matrix \mathbf{X}_n = [\mathbf{x}_1, \dots, \mathbf{x}_n] (11.129)

satisfies the relationship \mathbf{X}_n^T (\mathbf{MK}^{-1}\mathbf{M})\mathbf{X}_n = \mathbf{T}_n (11.130)

where \mathbf{T}_n = \begin{bmatrix} \alpha_1 & \beta_1 & & & & \\ \beta_1 & \alpha_2 & \beta_2 & & & \\ & & & \ddots & & \\ & & & & \alpha_{n-1} & \beta_{n-1} \\ & & & & \beta_{n-1} & \alpha_n \end{bmatrix} (11.131)

We can now relate the eigenvalues and vectors of T_{n} to those of the problem K\phi = \lambda M\phi , which can be written in the form


\mathbf {M} \mathbf {K} ^ {- 1} \mathbf {M} \boldsymbol {\phi} = \frac {1}{\lambda} \mathbf {M} \boldsymbol {\phi} \tag {11.132}

Using the transformation


\boldsymbol {\phi} = \mathbf {X} _ {n} \tilde {\boldsymbol {\phi}} \tag {11.133}

and (11.128) and (11.130), we obtain from (11.132),


\mathbf {T} _ {n} \tilde {\boldsymbol {\Phi}} = \frac {1}{\lambda} \tilde {\boldsymbol {\Phi}} \tag {11.134}

Hence, the eigenvalues of T_{n} are the reciprocals of the eigenvalues of K\phi = \lambda M\phi and the eigenvectors of the two problems are related as in (11.133).

As further discussed later, we assume in the above transformational steps that these steps can be completed. We defer the proof that the vectors x_{i} are M-orthonormal to Exercise 11.13 but show in the following example that (11.130) holds.

EXAMPLE 11.21: Show that \mathbf{T}_n in (11.131) is obtained by the transformation of (11.130).

Using (11.123), we obtain


\overline {{{\mathbf {x}}}} _ {i} = \mathbf {K} ^ {- 1} \mathbf {M} \mathbf {x} _ {i}

Substituting from (11.124) and (11.127), we obtain


\mathbf {K} ^ {- 1} \mathbf {M} \mathbf {x} _ {i} = \beta_ {i - 1} \mathbf {x} _ {i - 1} + \alpha_ {i} \mathbf {x} _ {i} + \beta_ {i} \mathbf {x} _ {i + 1}

Using this relation for i = 1, \ldots, j , we obtain


\mathbf {K} ^ {- 1} \mathbf {M} [ \mathbf {x} _ {1}, \dots , \mathbf {x} _ {j} ] = [ \mathbf {x} _ {1}, \dots , \mathbf {x} _ {j} ] \left[ \begin{array}{c c c c c} \alpha_ {1} & \beta_ {1} & & & \\ \beta_ {1} & \alpha_ {2} & \beta_ {2} & & \\ & & \ddots & & \\ & & & \alpha_ {j - 1} & \beta_ {j - 1} \\ & & & \beta_ {j - 1} & \alpha_ {j} \end{array} \right] + [ \mathbf {0}, \dots , \mathbf {0}, \beta_ {j} \mathbf {x} _ {j + 1} ]

Hence, \mathbf{K}^{-1}\mathbf{M}\mathbf{X}_j = \mathbf{X}_j\mathbf{T}_j + \beta_j\mathbf{x}_{j + 1}\mathbf{e}_j^T (a)

where e_{j} is a vector of length j,


\mathbf {e} _ {j} ^ {T} = [ 0 \quad \dots \quad 0 \quad 1 ]

Premultiplying (a) by X_{j}^{T}M and using the M-orthonormality of the vectors x_{i} , we obtain


\mathbf {T} _ {j} = \mathbf {X} _ {j} ^ {T} \mathbf {M} \mathbf {K} ^ {- 1} \mathbf {M} \mathbf {X} _ {j} \tag {b}

which for j = n proves the desired result.

Note that when j = n , we also have \tilde{\mathbf{x}}_n = \mathbf{0} in (11.125) because the complete space is spanned by \mathbf{X}_n and no vector \mathbf{M} -orthogonal to all vectors in \mathbf{X}_n exists.

We use the relation in (b) in Lanczos iteration when only a few eigenvalues and corresponding vectors are to be calculated. Also, the relation in (a) gives in that case error bounds on the calculated eigenvalue approximations.

As we have already mentioned, the tridiagonalization procedure in (11.122) to (11.127) does not, in practice, produce the desired M-orthonormal vectors because of round-off errors, and if additional Gram-Schmidt orthogonalizations are used, the method of tridiagonalization for a complete matrix is inefficient.

However, there are certain important properties of the transformation in (11.122) to (11.127) when i = 1, \ldots, q , with q \ll n , that can provide the basis of an effective iterative solution scheme. If we perform the Lanczos transformation in truncated form, we calculate, with i = 1, \ldots, q ,


\mathbf {X} _ {q} = \left[ \mathbf {x} _ {1}, \dots , \mathbf {x} _ {q} \right] \tag {11.135}

and the elements of the matrix T_{q} ,


\mathbf {T} _ {q} = \left[ \begin{array}{c c c c c} \alpha_ {1} & \beta_ {1} & & & \\ \beta_ {1} & \alpha_ {2} & \beta_ {2} & & \\ & & \ddots & & \\ & & & \alpha_ {q - 1} & \beta_ {q - 1} \\ & & & \beta_ {q - 1} & \alpha_ {q} \end{array} \right] \tag {11.136}

This matrix \mathbf{T}_q is actually the result of a Rayleigh-Ritz transformation on the eigenvalue problem (11.132). Namely, using


\overline {{{\boldsymbol {\Phi}}}} = \mathbf {X} _ {q} \mathbf {s} \tag {11.137}

in a Rayleigh-Ritz transformation (see Section 10.3.2) on the problem (11.132), equivalent to \mathbf{K}\phi = \lambda \mathbf{M}\phi , we obtain the eigenproblem


\mathbf {T} _ {q} \mathbf {s} = \nu \mathbf {s} \tag {11.138}

Hence, using the transformation from (11.132) to (11.134), all exact eigenvalues \lambda_{i} and eigenvectors \phi_{i} of the problem K\phi = \lambda M\phi are calculated by solving (11.134), while using (11.137) and (11.138) only approximations to eigenvalues and eigenvectors are computed.

However, we also realize that in (11.123) an inverse iteration step is performed, and therefore, the Ritz vectors in X_{q} should largely correspond to a space close to the least dominant subspace of K\phi = \lambda M\phi (i.e., the subspace corresponding to the smallest eigenvalues). For this reason, the solution of (11.138) may yield good approximations to the smallest eigenvalues and corresponding eigenvectors of K\phi = \lambda M\phi .

Of course, in the computations we never calculate the matrix MK^{-1}M but directly use the values \alpha_{i}, \beta_{i} obtained from (11.124) and (11.126) to form T_{q} . We also note that since in exact arithmetic, the inverses of the exact eigenvalues are calculated when q = n, in general we may expect better approximations to the inverses of the smallest eigenvalues as q increases.

During the calculation of the eigenvalues \nu_{i} in (11.138) we also directly obtain error bounds on the accuracy of these values. These error bounds are derived as follows.

Using the decomposition \mathbf{M} = \mathbf{SS}^T (see Section 10.2.5), we can transform the problem (11.132) to


\mathbf {S} ^ {T} \mathbf {K} ^ {- 1} \mathbf {S} \boldsymbol {\psi} = \frac {1}{\lambda} \boldsymbol {\psi} \tag {11.139}

where \pmb{\psi} = \mathbf{S}^T\pmb{\phi} (11.140)

We can now directly use the error bound formula (10.101). Assume (\nu_{i},\mathbf{s}_{i}) is an eigenpair of (11.138) and \overline{\Phi}_i is the corresponding vector obtained using (11.137). Then from (11.140),


\overline {{{\Psi}}} _ {i} = \mathbf {S} ^ {T} \overline {{{\Phi}}} _ {i} \tag {11.141}

and hence, \| \mathbf{r}_i\| = \| \mathbf{S}^T\mathbf{K}^{-1}\mathbf{S}\overline{\psi}_i - \nu_i\overline{\psi}_i\|


= \left\| \mathbf {S} ^ {T} \mathbf {K} ^ {- 1} \mathbf {S} \mathbf {S} ^ {T} \mathbf {X} _ {q} \mathbf {s} _ {i} - \nu_ {i} \mathbf {S} ^ {T} \mathbf {X} _ {q} \mathbf {s} _ {i} \right\|

= \left\| \mathbf {S} ^ {T} \left(\mathbf {K} ^ {- 1} \mathbf {M} \mathbf {X} _ {q} - \mathbf {X} _ {q} \mathbf {T} _ {q}\right) \mathbf {s} _ {i} \right\| \tag {11.142}

= \left\| \mathbf {S} ^ {T} \left(\beta_ {q} \mathbf {x} _ {q + 1} \mathbf {e} _ {q} ^ {T}\right) \mathbf {s} _ {i} \right\|

where we used the result (a) in Example 11.21. Since \| \mathbf{S}^T\mathbf{x}_{q + 1}\| = 1 , we thus have


\left\| \mathbf {r} _ {i} \right\| \leq \left| \beta_ {q} s _ {q i} \right| \tag {11.143}

where s_{qi} is the q th element, i.e. the last element, in the eigenvector \mathbf{s}_i of (11.138).

Using (10.101), we thus have


\left| \lambda_ {k} ^ {- 1} - \nu_ {i} \right| \leq \left| \beta_ {q} s _ {q i} \right| \tag {11.144}

for some value k. This bound hence requires only the calculation of \beta_{q} used for all values of \nu_{i} . In an actual solution we need to establish k, by a Sturm sequence check or otherwise, so as to know which eigenvalue has been approximated.

We demonstrate the truncated Lanczos transformation and solution of approximate eigenvalues in the following example.

EXAMPLE 11.22: Use the Lanczos transformation to calculate approximations to the two smallest eigenvalues of the eigenproblem K\phi = \lambda M\phi considered in Example 10.18.

Using the algorithm in (11.122) to (11.127) with x a full unit vector, we obtain


\gamma = 2. 1 2 1; \quad \mathbf {x} _ {1} = 0. 4 7 1 4 \left[ \begin{array}{l} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{array} \right]

\text { for   } i = 1: \quad \overline {{\mathbf {x}}} _ {1} = \left[ \begin{array}{l} 2. 1 2 1 \\ 3. 7 7 1 \\ 4. 9 5 0 \\ 5. 6 5 7 \\ 5. 8 9 3 \end{array} \right]; \quad \alpha_ {1} = 9. 1 6 7; \quad \tilde {\mathbf {x}} _ {1} = \left[ \begin{array}{l} - 2. 2 0 0 \\ - 0. 5 5 0 0 \\ 0. 6 2 8 5 \\ 1. 3 3 6 \\ 1. 5 7 1 \end{array} \right];

\beta_ {1} = 2. 9 2 5; \quad \mathbf {x} _ {2} = \left[ \begin{array}{l} -. 7 5 2 1 \\ -. 1 8 8 0 \\ 0. 2 1 4 9 \\ 0. 4 5 6 6 \\ 0. 5 3 7 2 \end{array} \right]

\text { for   } i = 2: \quad \overline {{\mathbf {x}}} _ {2} = \left[ \begin{array}{l} 0. 0 0 0 \\ 0. 7 5 2 1 \\ 1. 6 9 2 \\ 2. 4 1 7 \\ 2. 6 8 6 \end{array} \right]; \quad \alpha_ {2} = 2. 0 4 8

\mathbf {T} _ {2} = \left[ \begin{array}{l l} 9. 1 6 7 & 2. 9 2 5 \\ 2. 9 2 5 & 2. 0 4 8 \end{array} \right]

Approximations to the eigenvalues of \mathbf{K}\phi = \lambda \mathbf{M}\phi are obtained by solving


\mathbf {T} _ {2} \mathbf {s} = \frac {1}{\rho} \mathbf {s}; \quad \nu_ {1} = \frac {1}{\rho_ {2}}, \quad \nu_ {2} = \frac {1}{\rho_ {1}}

\text { which   gives } \quad \rho_ {1} = 0. 0 9 7 9 0; \quad \rho_ {2} = 1. 0 0 0

Comparing these values with the exact eigenvalues of K\phi = \lambda M\phi (see Example 10.18), we find that \rho_{1} is a good approximation to \lambda_{1} but \rho_{2} is not very close to \lambda_{2} . Of course, \rho_{i} \geq \lambda_{i}, i = 1, 2 . The smallest eigenvalue is well predicted in this solution because the starting vector x is relatively close to \phi_{1} . The error bound calculations of (11.144) give here


\left| \lambda_ {1} ^ {- 1} - \nu_ {2} \right| = 0. 0 0 1 6 \leq 0. 1 2 3

\left| \lambda_ {2} ^ {- 1} - \nu_ {1} \right| = 0. 2 1 3 \leq 0. 3 4 3

11.5.2 Iteration with Lanczos Transformations

As discussed in the previous section, a truncated Lanczos transformation is a Rayleigh-Ritz analysis and therefore the predicted eigenvalues and eigenvectors may or may not be accurate approximations of the values sought. It is a further step to develop, based on Lanczos transformations, an algorithm that in an iterative manner calculates the required eigenvalues and vectors to the required accuracy.

As an example, consider the following proposal of a simple iterative algorithm. Assume that we require the p smallest eigenvalues and corresponding eigenvectors of K\phi = \lambda M\phi , where n is the number of equations and n \gg p .

Perform the Lanczos transformation with q = 2p and solve for the largest p eigenvalues of \mathbf{T}_{2p} . [Note that we seek the smallest values of \lambda in (11.134).]

Then perform the Lanczos transformation with q = 3p and solve for the largest p eigenvalues of \mathbf{T}_{3p} .

Continue this process to q = rp , with r = 4, 5, \ldots , until the largest p eigenvalues satisfy the accuracy criterion |\beta_q s_{qi}| < tol for i = 1, \ldots, p [see (11.144) where tol is a selected tolerance]. Of course, there is no need to increase the number of vectors in each stage by p , but depending on the problem considered, a smaller increase may be used.

This simple process appears very attractive; unfortunately, however, it is unstable because in finite digit arithmetic, loss of M-orthogonality of the actually calculated vectors x_{i} occurs. When this loss of orthogonality occurs, some values obtained from the solution are approximations to spurious copies of the actual eigenvalues; in other words, for example, a single eigenvalue may be approximated a few times. Such a result is of course unacceptable, and we would have to sort out which calculated values are approximations to actual eigenvalues and whether any actual eigenvalue has been missed. For this purpose, Sturm sequence checks could be performed, but too many such checks render the complete solution inefficient.

A remedy to prevent spurious copies of eigenvalues is to use Gram-Schmidt orthogonalization of the Lanczos vectors. In some cases, selective reorthogonalization may be sufficient (see B. N. Parlett and D. S. Scott [A]). However, the Gram-Schmidt process is also sensitive to round-off errors (see Section 11.2.6), and it actually may be necessary to perform the orthogonalization completely on all earlier established vectors not only once but twice, as reported by H. Matthies [B].

Considering the Lanczos iteration method in exact arithmetic, we also notice that the vector \tilde{x}_{i} may be a null vector after some vectors have been established with x_{1} as the

starting vector. This phenomenon occurs when x_{1} contains only components of q eigenvectors, that is, x_{1} lies in a q-dimensional subspace of the entire n-dimensional space corresponding to the matrices K and M (then \tilde{x}_{q} = 0 ). Let us demonstrate this observation and what can happen when a multiple eigenvalue is present in the following simple example.

EXAMPLE 11.23: Use the Lanczos method in the solution of the eigenproblem \mathbf{K}\phi = \lambda \mathbf{M}\phi , where


\mathbf {K} = \left[ \begin{array}{c c c c c} \lambda_ {1} & & & & \\ & \lambda_ {2} & & & \\ & & \lambda_ {3} & & \\ & & & \lambda_ {4} & \\ & & & & \lambda_ {5} \end{array} \right]; \quad \mathbf {M} = \left[ \begin{array}{c c c c c} 1 & & & & \\ & 1 & & & \\ & & 1 & & \\ & & & 1 & \\ & & & & 1 \end{array} \right] \tag {a}

and \lambda_{1} = \lambda_{2} < \lambda_{3} < \lambda_{4} < \lambda_{5} .

(i) Let \mathbf{x}_1 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} and calculate the Lanczos vectors.

(ii) Let \mathbf{x}_1 = \frac{1}{\sqrt{2}}\begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix} and calculate the Lanczos vectors.

Although the matrices used here are of special diagonal form, the observations in this example are of a general nature. Namely, we can imagine that the matrices in (a) have been obtained by a transformation of more general matrices to their eigenvector basis, but this transformation is performed here merely to display more readily the ingredients of the solution algorithm (in the same way as we proceeded in the analysis of the vector iteration methods; see Section 11.2.1).

In case (i) we obtain, using (11.123) to (11.125),


\tilde {\mathbf {x}} _ {1} = \mathbf {0} \tag {b}

\mathbf {x} _ {2} = \left[ \begin{array}{c} 0 \\ 0 \\ \frac {1}{\sqrt {2}} \\ - \frac {1}{\sqrt {2}} \\ 0 \end{array} \right] \tag {c}

and \tilde{\mathbf{x}}_2 = \mathbf{0}

Hence (b) shows that because x_{1} is an eigenvector, we cannot continue with the process, and (c) shows that because x_{1} lies in the subspace corresponding to \lambda_{3} and \lambda_{4} , we cannot find more than two Lanczos vectors including x_{1} .

In practice, the solution process usually does not break down as mentioned above because of round-off errors. However, the preceding discussion shows that two features are important in a Lanczos iteration method, namely, Gram-Schmidt orthogonalization and, when necessary, restarting of the algorithm with a new Lanczos vector x.

To present a general solution approach, let us define a Lanczos step, a Lanczos stage, and restarting.

A Lanczos step is the use of (11.123) to (11.127), in which we now include reorthogonalization.

A Lanczos stage consists of q Lanczos steps and the calculation of the eigenvalues and eigenvectors of T_{q}s = \nu s . If one of the following conditions is satisfied, the eigenvalues and eigenvectors of T_{q}s = \nu s are calculated and hence the stage is completed.

  1. The preassigned maximum number of Lanczos steps, q_{\max} , is reached.
  2. The loss of orthogonality in the Lanczos vectors or between the Lanczos vectors and the converged eigenvectors is detected.

In case (1), q = q_{\max} , and in case (2), q is the number of Lanczos steps completed prior to the loss of orthogonality.

At the end of each Lanczos stage, we check whether all required eigenvalues and eigenvectors have been calculated. If the required eigenpairs have not yet been obtained, we restart for a new Lanczos stage with a new vector x in (11.122). Prior to the restart, it can be effective to introduce a shift \mu so that the inverse iteration step in (11.123) is performed with K - \mu M (see Section 11.2.3).

Without giving all details, a complete solution algorithm can therefore be summarized as follows.

Start of a new Lanczos stage: Choose a starting vector x that is orthogonal to all previously calculated eigenvector approximations and calculate


\mathbf {x} _ {1} = \frac {\mathbf {x}}{\gamma}; \quad \gamma = (\mathbf {x} ^ {T} \mathbf {M} \mathbf {x}) ^ {1 / 2} \tag {11.122}

Choose a shift \mu (usually \mu = 0 for the first Lanczos stage).

Perform the Lanczos steps, using i = 1, 2, \ldots ; \beta_0 = 0 ,


\begin{array}{l} (\mathbf {K} - \mu \mathbf {M}) \overline {{\mathbf {x}}} _ {i} = \mathbf {M} \mathbf {x} _ {i} \\ \alpha_ {i} = \overline {{{\mathbf {x}}}} _ {i} ^ {T} \mathbf {M} \mathbf {x} _ {i} \\ \tilde {\mathbf {x}} _ {i} ^ {\prime} = \overline {{{\mathbf {x}}}} _ {i} - \alpha_ {i} \mathbf {x} _ {i} - \beta_ {i - 1} \mathbf {x} _ {i - 1} \\ \tilde {\mathbf {x}} _ {i} = \tilde {\mathbf {x}} _ {i} ^ {\prime} - \sum_ {k = 1} ^ {i} \left(\tilde {\mathbf {x}} _ {i} ^ {\prime T} \mathbf {M} \mathbf {x} _ {k}\right) \mathbf {x} _ {k} - \sum_ {j = 1} ^ {n _ {c}} \left(\tilde {\mathbf {x}} _ {i} ^ {\prime T} \mathbf {M} \boldsymbol {\phi} _ {j}\right) \boldsymbol {\phi} _ {j} \tag {11.145} \\ \end{array}

\beta_ {i} = (\tilde {\mathbf {x}} _ {i} ^ {T} \mathbf {M} \tilde {\mathbf {x}} _ {i}) ^ {1 / 2}

\mathbf {x} _ {i + 1} = \frac {\tilde {\mathbf {x}} _ {i}}{\beta_ {i}}

where n_{c} is the number of converged eigenvalues in the preceding stages. Although a Gram-Schmidt orthogonalization has been performed, the vector x_{i+1} is checked for loss of orthogonality and if the loss occurs, this Lanczos stage is terminated with q = i; otherwise perform a maximum number of steps, q_{max} , and set q = q_{max} .

Compute r additional converged eigenvalues \lambda_{nc+1}, \ldots, \lambda_{nc+r} and corresponding eigenvectors \phi_{nc+1}, \ldots, \phi_{nc+r} by solving (using, for example, QR-inverse iteration, see Section 11.3.3)


\mathbf {T} _ {q} \mathbf {s} = \nu \mathbf {s} \tag {11.138}

Convergence is defined by satisfying the criterion (11.144). Reset nc to the new value.

If the required eigenvalues and eigenvectors have not yet been obtained, restart for an additional Lanczos stage.

Continue until all required eigenvalues and eigenvectors have been calculated or until the maximum number of assigned Lanczos steps has been reached.

These solution steps give the general steps (including a simple and single full re-orthogonalization) of a Lanczos iterative scheme. As mentioned earlier, the details of the actual implementation—which are not given here—are most important to render the method effective and reliable. Some important and subtle aspects of an actual Lanczos iterative scheme are to select the actual reorthogonalization to be performed, to identify efficiently and reliably the loss of orthogonality and then restart with an effective new vector x, to ensure that only converged values to true eigenvalues and not spuriously duplicated values are accepted as eigenvalues, to restart for a new stage when such is more effective than to continue with the present stage, to select an appropriate value for q_{max} , and to use an effective shifting strategy. Also, Sturm sequence checks need to be performed in order to ensure that all required eigenvalues have been calculated (see Section 11.6.4 for more information on such checking strategies). The rate of convergence of the eigenvalues depends on course on the actual algorithm used.

Finally, we should mention that the Lanczos algorithm has been developed also to work with blocks of vectors (instead of individual vectors only) (see, for example, G. H. Golub and R. Underwood [A] and H. Matthies [B]).

11.5.3 Exercises

11.13. Show that the vectors x_{i} generated in the Lanczos transformation are (in exact arithmetic) M-orthonormal. (Hint: Show that x_{1} , x_{2} , and x_{3} are M-orthonormal and then use the method of induction.)

11.14. Assume that the loss of vector orthogonality in the Lanczos transformation is only a result of the vector subtraction step (11.125). Then show that the loss of orthogonality can be predicted using the equation


\left| \mathbf {x} _ {i} ^ {T} \mathbf {M} \mathbf {x} _ {j + 1} \right| \lesssim \frac {f _ {i j}}{\beta_ {j}} \epsilon

where the symbol \lesssim stands for "approximately smaller than" and


j = \text { Lanczos   step   number };

1 \leq i \leq j

\left| \mathbf {x} _ {i} ^ {T} \mathbf {M} \mathbf {x} _ {k} \right| \leq \epsilon (1 \leq i, k \leq j; k \neq i)

f _ {i j} = \left\{ \begin{array}{l l} \left| \alpha_ {i} - \alpha_ {j} \right| + \beta_ {j - 1} + \beta_ {i - 1} + \beta_ {i} & (i \leq j - 2) \\ \left| \alpha_ {i} - \alpha_ {j} \right| + \beta_ {j - 2} & (i = j - 1) \\ \beta_ {j - 1} & (i = j) \end{array} \right.