In the above discussion we have merely stated the iteration scheme and its convergence. We then applied the method in two examples but did not formally prove convergence. In the following we derive the convergence properties because we believe that the proof is very instructive. The first step in the proof of convergence and of the convergence rate given here is similar to the procedure used in the analysis of direct integration methods (see Section 9.4). The fundamental equation used in inverse iteration is the relation in (11.13). Neglecting the scaling of the elements in the iteration vector, we basically use for $k = 1, 2, \ldots$ , $$ \mathbf {K} \mathbf {x} _ {k + 1} = \mathbf {M} \mathbf {x} _ {k} \tag {11.23} $$ where we stated that $x_{k+1}$ will now converge to a multiple of $\phi_{1}$ . To show convergence it is convenient (as in the analysis of direct integration procedures) to change basis from the finite element coordinate basis to the basis of eigenvectors; namely, we can write for any iteration vector $x_{k}$ , $$ \mathbf {x} _ {k} = \Phi \mathbf {z} _ {k} \tag {11.24} $$ where $\Phi$ is the matrix of eigenvectors $\Phi = [\phi_1, \ldots, \phi_n]$ . It should be realized that because $\Phi$ is nonsingular, there is a unique vector $\mathbf{z}_k$ for any vector $\mathbf{x}_k$ . Substituting for $\mathbf{x}_k$ and $\mathbf{x}_{k+1}$ from (11.24) into (11.23), premultiplying by $\Phi^T$ , and using the orthogonality relations $\Phi^T \mathbf{K} \Phi = \Lambda$ and $\Phi^T \mathbf{M} \Phi = \mathbf{I}$ , we obtain $$ \mathbf {\Lambda} \mathbf {z} _ {k + 1} = \mathbf {z} _ {k} \tag {11.25} $$ where $\Lambda = \mathrm{diag}(\lambda_i)$ . Comparing (11.25) with (11.23) we find that the iterations are of the same form with $\mathbf{K} = \Lambda$ and $\mathbf{M} = \mathbf{I}$ . We may wonder why the transformation in (11.24) is used since $\Phi$ is unknown. However, we should realize that the transformation is employed only to investigate the convergence behavior of inverse iteration. Namely, because in theory (11.25) is equivalent to (11.23), the convergence properties of (11.25) are also those of (11.23). But the convergence characteristics of (11.25) are relatively easy to investigate, since the eigenvalues are the diagonal elements of $\Lambda$ and the eigenvectors are the unit vectors $\mathbf{e}_i$ , where $$ \mathbf {e} _ {i} ^ {r} = \left[ \begin{array}{l l l l l l l} 0 & \dots & 0 & 1 & 0 & \dots & 0 \end{array} \right] \tag {11.26} $$ In the presentation of the inverse iteration algorithms given in (11.13) and (11.14) and (11.16) to (11.22), we stated that the starting iteration vector $x_{1}$ must not be M-orthogonal to $\phi_{1}$ . Equivalently, in (11.25) the iteration vector $z_{1}$ must not be orthogonal to $e_{1}$ . Assume that we use $$ \mathbf {z} [ \mathbf {\Gamma} = [ 1 \quad 1 \quad 1 \quad \dots \quad 1 ] \tag {11.27} $$ We discuss the effect of this assumption in Section 11.2.6. Then using (11.25) for $k = 1, \ldots, l$ , we obtain $$ \mathbf {z} _ {l + 1} ^ {T} = \left[ \left(\frac {1}{\lambda_ {1}}\right) ^ {l} \left(\frac {1}{\lambda_ {2}}\right) ^ {l} \dots \left(\frac {1}{\lambda_ {n}}\right) ^ {l} \right] \tag {11.28} $$ Let us first assume that $\lambda_1 < \lambda_2$ . To show that $\mathbf{z}_{l+1}$ converges to a multiple of $\mathbf{e}_1$ as $l \to \infty$ , we multiply $\mathbf{z}_{i+1}$ in (11.28) by $(\lambda_1)^i$ to obtain $$ \overline {{{\mathbf {z}}}} _ {l + 1} = \left[ \begin{array}{c} 1 \\ \left(\lambda_ {1} / \lambda_ {2}\right) ^ {l} \\ \vdots \\ \left(\lambda_ {1} / \lambda_ {n}\right) ^ {l} \end{array} \right] \tag {11.29} $$ and observe that $\overline{\mathbf{z}}_{l+1}$ converges to $\mathbf{e}_1$ as $l \to \infty$ . Hence, $\mathbf{z}_{l+1}$ converges to a multiple of $\mathbf{e}_1$ as $l \to \infty$ . To evaluate the order and rate of convergence, we use the convergence definition given in Section 2.7. For the iteration under consideration here we obtain $$ \lim _ {l \rightarrow \infty} \frac {\| \overline {{{\mathbf {z}}}} _ {l + 1} - \mathbf {e} _ {1} \| _ {2}}{\| \overline {{{\mathbf {z}}}} _ {l} - \mathbf {e} _ {1} \| _ {2}} = \frac {\lambda_ {1}}{\lambda_ {2}} \tag {11.30} $$ Hence convergence is linear, and the rate of convergence is $\lambda_1 / \lambda_2$ . This convergence rate is also shown in the iteration vector $\overline{\mathbf{z}}_{l + 1}$ in (11.29); i.e., those elements in the iteration vector that should tend to zero do so with at least the ratio $\lambda_1 / \lambda_2$ in each additional iteration. Thus, if $\lambda_2 > \lambda_1$ , it is the relative magnitude of $\lambda_1$ to $\lambda_2$ that determines how fast the iteration vector converges to the eigenvector $\phi_1$ . In this discussion we assumed that $\lambda_{1} < \lambda_{2}$ . Let us now consider the case of a multiple eigenvalue, namely, $\lambda_{1} = \lambda_{2} = \cdots = \lambda_{m}$ . Then we have in (11.29), $$ \overline {{{\mathbf {z}}}} _ {i + 1} ^ {T} = \left[ \begin{array}{l l l l l l l} 1 & 1 & \dots & 1 & \left(\frac {\lambda_ {1}}{\lambda_ {m + 1}}\right) ^ {l} & \dots & \left(\frac {\lambda_ {1}}{\lambda_ {n}}\right) ^ {l} \end{array} \right] \tag {11.31} $$ and the convergence rate of the iteration vector is $\lambda_{1}/\lambda_{m+1}$ . Therefore, in general, the rate of convergence of the iteration vector in inverse iteration is given by the ratio of $\lambda_{1}$ to the next distinct eigenvalue. In the iteration given in (11.16) to (11.22), we obtain an approximation to the eigenvalue $\lambda_{1}$ by evaluating the Rayleigh quotient. Corresponding to (11.18), the Rayleigh quotient calculated in the iteration of (11.25) would be $$ \rho (\mathbf {z} _ {k + 1}) = \frac {\mathbf {z} _ {k + 1} ^ {T} \mathbf {z} _ {k}}{\mathbf {z} _ {k + 1} ^ {T} \mathbf {z} _ {k + 1}} \tag {11.32} $$ Assume that we consider the last iteration in which $k = l$ . Then substituting for $\mathbf{z}_l$ and $\mathbf{z}_{l+1}$ from (11.28) into (11.32), we obtain $$ \rho (\mathbf {z} _ {l + 1}) = \frac {\lambda_ {1} \sum_ {i = 1} ^ {n} \left(\lambda_ {1} / \lambda_ {i}\right) ^ {2 l - 1}}{\sum_ {l = 1} ^ {n} \left(\lambda_ {1} / \lambda_ {i}\right) ^ {2 l}} \tag {11.33} $$ Hence we have for $\lambda_{1}$ being a simple or multiple eigenvalue, $$ \rho (\mathbf {z} _ {l + 1}) \rightarrow \lambda_ {1} \quad \text { as } l \rightarrow \infty $$ Also, convergence is linear with the rate equal to $(\lambda_{1}/\lambda_{m+1})^{2}$ , where $\lambda_{m+1}$ is defined as in (11.31). This convergence rate substantiates the observation that if an eigenvector is known with an error $\epsilon$ , then the Rayleigh quotient yields an approximation to the corresponding eigenvalue with error $\epsilon^{2}$ (see Section 2.6). Before demonstrating the results by means of a brief example, it should be recalled that we assumed in the above analysis a full unit starting iteration vector as given in (11.27). The convergence properties derived hold for any starting iteration vector that is not orthogonal to the eigenvector of interest, but the convergence rates can in many practical analyses be observed only as the number of iterations becomes large. The same observation also holds for any of the other convergence analyses that are presented in the following sections. We discuss this observation with other important practical aspects in Section 11.2.6. EXAMPLE 11.3: For the problem considered in Example 11.2, calculate the ultimate convergence rates of the iteration vector and the Rayleigh quotient. Compare the ultimate convergence rates with those actually observed in the inverse iteration carried out in Example 11.2. For the evaluation of the theoretical convergence rates, we need $\lambda_{1}$ and $\lambda_{2}$ . We calculated the eigenvalues in Example 10.12 and found $$ \lambda_ {1} = \frac {1}{2} - \frac {\sqrt {2}}{4} $$ $$ \lambda_ {2} = \frac {1}{2} + \frac {\sqrt {2}}{4} $$ Hence, the ultimate convergence rate of the iteration vector is $$ \frac {\lambda_ {1}}{\lambda_ {2}} = 0. 1 7 $$ and the ultimate convergence rate of the Rayleigh quotient is $$ \left(\frac {\lambda_ {1}}{\lambda_ {2}}\right) ^ {2} = 0. 0 2 9 $$ The actual vector convergence obtained is observed by evaluating the ratio $r_{k+1}, k = 1, 2, \ldots$ , where $$ r _ {k + 1} = \frac {\left\| \mathbf {x} _ {k + 1} - \boldsymbol {\phi} _ {1} \right\| _ {2}}{\left\| \mathbf {x} _ {k} - \boldsymbol {\phi} _ {1} \right\| _ {2}} $$ and we assume that $\phi_{1}$ is obtained in the last iteration [see (11.22)]. For the iteration in Example 11.2, we thus obtain $$ r _ {2} = 0. 0 2 6 0 8 3; \quad r _ {3} = 0. 1 7 0 5 5 9; \quad r _ {4} = 0. 1 6 7 1 3 4; \quad r _ {5} = 0. 1 4 4 2 5 1 $$ Ignoring $r_2$ because the iteration just started, we see that the theoretical and actual convergence rates compare quite well. Similarly, the actual convergence of the Rayleigh quotient calculated in Example 11.2 is observed by evaluating $$ \epsilon_ {k + 1} = \frac {\left| \rho \left(\overline {{\mathbf {x}}} _ {k + 1}\right) - \lambda_ {1} \right|}{\left| \rho \left(\overline {{\mathbf {x}}} _ {k}\right) - \lambda_ {1} \right|} $$ where we use the converged value of the Rayleigh quotient for $\lambda_{1}$ . In the iteration of Example 11.2 we have $$ \epsilon_ {3} = 0. 0 2 8 7 6 8; \quad \epsilon_ {4} = 0. 0 2 7 7 7 8; \quad \epsilon_ {5} = 0 $$ Hence, we see that the theoretical and observed convergence rates again agree quite well in this solution. # 11.2.2 Forward Iteration The method of forward iteration is complementary to the inverse iteration technique in that the method yields the eigenvector corresponding to the largest eigenvalue. Whereas we assumed in inverse iteration that K is positive definite, we assume in this section that M is positive definite; otherwise, a shift must be used (see Section 11.2.3). Having chosen a starting iteration vector $x_{1}$ , in forward iteration we evaluate, for $k = 1, 2, \ldots$ , $$ \mathbf {M} \overline {{{\mathbf {x}}}} _ {k + 1} = \mathbf {K} \mathbf {x} _ {k} \tag {11.34} $$ and $\mathbf{x}_{k + 1} = \frac{\overline{\mathbf{x}}_{k + 1}}{(\overline{\mathbf{x}}_{k + 1}^T\mathbf{M}\overline{\mathbf{x}}_{k + 1})^{1 / 2}}$ (11.35) where provided that $x_{1}$ is not M-orthogonal to $\phi_{n}$ , we have $$ \mathbf {x} _ {k + 1} \rightarrow \boldsymbol {\phi} _ {n} \quad \text { as } k \rightarrow \infty $$ The analogy to inverse iteration should be noted; the only difference is that we solve (11.34) rather than (11.13) to obtain an improved eigenvector. This means, in practice, that in the inverse iteration we need to triangularize the matrix K and in the forward iteration we decompose M. A more effective forward iteration procedure than that in (11.34) and (11.35) would be obtained by using equations that are analogous to those in (11.16) to (11.22). Assuming that $y_{1} = Kx_{1}$ , we evaluate for $k = 1, 2, \ldots$ , $$ \mathbf {M} \overline {{{\mathbf {x}}}} _ {k + 1} = \mathbf {y} _ {k} \tag {11.36} $$ $$ \overline {{{\mathbf {y}}}} _ {k + 1} = \mathbf {K} \overline {{{\mathbf {x}}}} _ {k + 1} \tag {11.37} $$ $$ \rho (\overline {{{\mathbf {x}}}} _ {k + 1}) = \frac {\overline {{{\mathbf {x}}}} _ {k + 1} ^ {T} \overline {{{\mathbf {y}}}} _ {k + 1}}{\overline {{{\mathbf {x}}}} _ {k + 1} ^ {T} \mathbf {y} _ {k}} \tag {11.38} $$ $$ \mathbf {y} _ {k + 1} = \frac {\overline {{{\mathbf {y}}}} _ {k + 1}}{(\overline {{{\mathbf {x}}}} _ {k + 1} ^ {T} \mathbf {y} _ {k}) ^ {1 / 2}} \tag {11.39} $$ where provided that $\phi_n^T\mathbf{y}_1\neq 0$ $$ \mathbf {y} _ {k + 1} \rightarrow \mathbf {K} \boldsymbol {\phi} _ {n} \quad \text { and } \quad \rho (\overline {{\mathbf {x}}} _ {k + 1}) \rightarrow \lambda_ {n} \quad \text { as } k \rightarrow \infty $$ Convergence in the iteration could again be measured as given in (11.20), and denoting the last iteration by $l$ , we have $$ \lambda_ {n} \doteq \rho (\overline {{{\mathbf {x}}}} _ {l + 1}) \tag {11.40} $$ and $\Phi_n = \frac{\overline{\mathbf{x}}_{l + 1}}{(\overline{\mathbf{x}}_{l + 1}^T\mathbf{y}_l)^{1 / 2}}$ (11.41) Considering the analysis of convergence of the iteration vector to $\phi_{n}$ , it can be carried out following the same procedure that was used in the evaluation of the convergence characteristics of inverse iteration. Alternatively, we may use the results that we obtained in the analysis of inverse iteration. Namely, assume that we write the eigenproblem $K\phi = \lambda M\phi$ in the form $M\phi = \lambda^{-1}K\phi$ ; then using inverse iteration to solve for an eigenvector and corresponding eigenvalue is equivalent to performing forward iteration on the problem $\mathbf{K}\boldsymbol{\Phi} = \lambda \mathbf{M}\boldsymbol{\Phi}$ . But since we converge in the inverse iteration of (11.16) to (11.22) to the smallest eigenvalue and corresponding eigenvector, and since for the problem $\mathbf{M}\boldsymbol{\Phi} = \lambda^{-1}\mathbf{K}\boldsymbol{\Phi}$ this eigenvalue is $\lambda_n^{-1}$ , where $\lambda_n$ is the largest eigenvalue of $\mathbf{K}\boldsymbol{\Phi} = \lambda \mathbf{M}\boldsymbol{\Phi}$ , we converge in the forward iteration of (11.36) to (11.41) to $\lambda_n$ and $\boldsymbol{\Phi}_n$ and the convergence rate of the iteration vector is $\lambda_{n-1} / \lambda_n$ . We should note that the Rayleigh quotient evaluated in (11.38) is $\overline{\mathbf{x}}_{k+1}^T\mathbf{K}\overline{\mathbf{x}}_{k+1} / \overline{\mathbf{x}}_{k+1}^T\mathbf{M}\overline{\mathbf{x}}_{k+1}$ , i.e., just the inverse of the Rayleigh quotient for calculating an approximation to $\lambda_n^{-1}$ in the problem $\mathbf{M}\boldsymbol{\Phi} = \lambda^{-1}\mathbf{K}\boldsymbol{\Phi}$ . We demonstrate the iteration and convergence in the following example. EXAMPLE 11.4: Use forward iteration as given in (11.36) to (11.41) with $tol = 10^{-6}$ in (11.20) to evaluate $\lambda_{4}$ and $\phi_{4}$ of the eigenproblem $K\phi = \lambda M\phi$ , where $$ \mathbf {K} = \left[ \begin{array}{r r r r} 5 & - 4 & 1 & 0 \\ - 4 & 6 & - 4 & 1 \\ 1 & - 4 & 6 & - 4 \\ 0 & 1 & - 4 & 5 \end{array} \right]; \quad \mathbf {M} = \left[ \begin{array}{c c c c} 2 & & & \\ & 2 & & \\ & & 1 & \\ & & & 1 \end{array} \right] $$ The physical problem considered in this example is the free-vibration response of the simply supported beam shown in Fig. 8.1 with the above mass matrix. Starting the iteration with $$ \mathbf {x} _ {1} = \left[ \begin{array}{l} 1 \\ 1 \\ 1 \\ 1 \end{array} \right] $$ we calculate in the inverse iteration the values summarized in Table E11.4. Hence, we need 10 iterations for a convergence tolerance of $10^{-6}$ in (11.20), and we then use, as given in (11.40) and (11.41), $$ \lambda_ {4} \doteq 1 0. 6 3 8 4 5; \quad \Phi_ {4} \doteq \left[ \begin{array}{c} - 0. 1 0 7 3 1 \\ 0. 2 5 5 3 9 \\ - 0. 7 2 8 2 7 \\ 0. 5 6 2 2 7 \end{array} \right] $$ Comparing after iteration 10 the predicted value of $\lambda_{4}$ with the exact value, we have $$ \frac {\left| \lambda_ {4} ^ {\text { exact }} - \rho (\overline {{{\mathbf {x}}}} _ {1 1}) \right|}{\lambda_ {4} ^ {\text { exact }}} = 1. 9 2 \times 1 0 ^ {- 7} $$ Also, the right-hand side in (10.107) gives $5.24 \times 10^{-4}$ . TABLE E11.4
| k | $\overline{x}_{k+1}$ | $\overline{y}_{k+1}$ | $\rho(\overline{x}_{k+1})$ | $y_{k+1}$ | $\frac{|\lambda_{4}^{(k+1)} - \lambda_{4}^{(k)}|}{\lambda_{4}^{(k+1)}}$ |
| 1 | 1 | 6 | 5.93333 | 2.1909 | --- |
| -0.5 | -1 | -0.3651 | |||
| -1 | -11 | -4.0166 | |||
| 2 | 13.5 | 4.9295 | |||
| 2 | 1.0954 | 2.1909 | 8.57887 | 0.3345 | 0.3084 |
| -0.1826 | 15.5188 | 2.3694 | |||
| -4.0166 | -41.9921 | -6.4112 | |||
| 4.9295 | 40.5315 | 6.1882 | |||
| 3 | 0.1672 | -10.3137 | 10.15966 | -1.1372 | 0.1556 |
| 1.1847 | 38.2720 | 4.2198 | |||
| -6.4112 | -67.7914 | -7.4745 | |||
| 6.1882 | 57.7704 | 6.3696 | |||
| 8 | -1.1285 | -24.2083 | 10.63838 | -2.2756 | 0.00003304 |
| 2.7044 | 57.7298 | 5.4267 | |||
| -7.7481 | -82.4222 | -7.7478 | |||
| 5.9969 | 63.6811 | 5.9861 | |||
| 9 | -1.1378 | -24.2902 | 10.63844 | -2.2833 | 0.000005584 |
| 2.7133 | 57.8086 | 5.4340 | |||
| -7.7478 | -82.4224 | -7.7476 | |||
| 5.9861 | 63.6351 | 5.9816 | |||
| 10 | -1.1416 | -24.3237 | 10.63845 | -2.2864 | 0.0000009437 |
| 2.7170 | 57.8405 | 5.4369 | |||
| -7.7476 | -82.4219 | -7.7476 | |||
| 5.9816 | 63.6157 | 5.9798 |