<!-- source-page: 911 -->

In the above discussion we have merely stated the iteration scheme and its convergence. We then applied the method in two examples but did not formally prove convergence. In the following we derive the convergence properties because we believe that the proof is very instructive.

The first step in the proof of convergence and of the convergence rate given here is similar to the procedure used in the analysis of direct integration methods (see Section 9.4). The fundamental equation used in inverse iteration is the relation in (11.13). Neglecting the scaling of the elements in the iteration vector, we basically use for $k = 1, 2, \ldots$ ,

$$
\mathbf {K} \mathbf {x} _ {k + 1} = \mathbf {M} \mathbf {x} _ {k} \tag {11.23}
$$

where we stated that $x_{k+1}$ will now converge to a multiple of $\phi_{1}$ . To show convergence it is convenient (as in the analysis of direct integration procedures) to change basis from the finite element coordinate basis to the basis of eigenvectors; namely, we can write for any iteration vector $x_{k}$ ,

$$
\mathbf {x} _ {k} = \Phi \mathbf {z} _ {k} \tag {11.24}
$$

where $\Phi$ is the matrix of eigenvectors $\Phi = [\phi_1, \ldots, \phi_n]$ . It should be realized that because $\Phi$ is nonsingular, there is a unique vector $\mathbf{z}_k$ for any vector $\mathbf{x}_k$ . Substituting for $\mathbf{x}_k$ and $\mathbf{x}_{k+1}$ from (11.24) into (11.23), premultiplying by $\Phi^T$ , and using the orthogonality relations $\Phi^T \mathbf{K} \Phi = \Lambda$ and $\Phi^T \mathbf{M} \Phi = \mathbf{I}$ , we obtain

$$
\mathbf {\Lambda} \mathbf {z} _ {k + 1} = \mathbf {z} _ {k} \tag {11.25}
$$

where $\Lambda = \mathrm{diag}(\lambda_i)$ . Comparing (11.25) with (11.23) we find that the iterations are of the same form with $\mathbf{K} = \Lambda$ and $\mathbf{M} = \mathbf{I}$ . We may wonder why the transformation in (11.24) is used since $\Phi$ is unknown. However, we should realize that the transformation is employed only to investigate the convergence behavior of inverse iteration. Namely, because in theory (11.25) is equivalent to (11.23), the convergence properties of (11.25) are also those of (11.23). But the convergence characteristics of (11.25) are relatively easy to investigate, since the eigenvalues are the diagonal elements of $\Lambda$ and the eigenvectors are the unit vectors $\mathbf{e}_i$ , where

$$
\mathbf {e} _ {i} ^ {T} = \left[ \begin{array}{l l l l l l l} 0 & \dots & 0 & 1 & 0 & \dots & 0 \end{array} \right] \tag {11.26}
$$
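For readers who want the intermediate step spelled out, the passage from (11.23) to (11.25) can be written as follows. This is only a restatement using the relations already given above, $\mathbf{x}_k = \Phi \mathbf{z}_k$, $\Phi^T \mathbf{K} \Phi = \Lambda$ and $\Phi^T \mathbf{M} \Phi = \mathbf{I}$; the component index $i$ in the last line is notation introduced here for illustration.

$$
\begin{aligned}
\mathbf{K} \Phi \mathbf{z}_{k+1} &= \mathbf{M} \Phi \mathbf{z}_{k} \\
\Phi^{T} \mathbf{K} \Phi \, \mathbf{z}_{k+1} &= \Phi^{T} \mathbf{M} \Phi \, \mathbf{z}_{k} \\
\Lambda \mathbf{z}_{k+1} &= \mathbf{z}_{k} \\
(\mathbf{z}_{k+1})_i &= \frac{(\mathbf{z}_{k})_i}{\lambda_i}, \qquad i = 1, \ldots, n
\end{aligned}
$$

The last line shows that each component of the iteration vector is simply divided by its own eigenvalue in every step, which is why the component belonging to the smallest eigenvalue eventually dominates.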

In the presentation of the inverse iteration algorithms given in (11.13) and (11.14) and (11.16) to (11.22), we stated that the starting iteration vector $x_{1}$ must not be M-orthogonal to $\phi_{1}$ . Equivalently, in (11.25) the iteration vector $z_{1}$ must not be orthogonal to $e_{1}$ . Assume that we use

$$
\mathbf {z} _ {1} ^ {T} = [ 1 \quad 1 \quad 1 \quad \dots \quad 1 ] \tag {11.27}
$$

We discuss the effect of this assumption in Section 11.2.6. Then using (11.25) for $k = 1, \ldots, l$ , we obtain

$$
\mathbf {z} _ {l + 1} ^ {T} = \left[ \left(\frac {1}{\lambda_ {1}}\right) ^ {l} \left(\frac {1}{\lambda_ {2}}\right) ^ {l} \dots \left(\frac {1}{\lambda_ {n}}\right) ^ {l} \right] \tag {11.28}
$$

Let us first assume that $\lambda_1 < \lambda_2$ . To show that $\mathbf{z}_{l+1}$ converges to a multiple of $\mathbf{e}_1$ as $l \to \infty$ , we multiply $\mathbf{z}_{l+1}$ in (11.28) by $(\lambda_1)^l$ to obtain

$$
\overline {{{\mathbf {z}}}} _ {l + 1} = \left[ \begin{array}{c} 1 \\ \left(\lambda_ {1} / \lambda_ {2}\right) ^ {l} \\ \vdots \\ \left(\lambda_ {1} / \lambda_ {n}\right) ^ {l} \end{array} \right] \tag {11.29}
$$

and observe that $\overline{\mathbf{z}}_{l+1}$ converges to $\mathbf{e}_1$ as $l \to \infty$ . Hence, $\mathbf{z}_{l+1}$ converges to a multiple of $\mathbf{e}_1$ as $l \to \infty$ .

To evaluate the order and rate of convergence, we use the convergence definition given in Section 2.7. For the iteration under consideration here we obtain

$$
\lim _ {l \rightarrow \infty} \frac {\| \overline {{{\mathbf {z}}}} _ {l + 1} - \mathbf {e} _ {1} \| _ {2}}{\| \overline {{{\mathbf {z}}}} _ {l} - \mathbf {e} _ {1} \| _ {2}} = \frac {\lambda_ {1}}{\lambda_ {2}} \tag {11.30}
$$

Hence convergence is linear, and the rate of convergence is $\lambda_1 / \lambda_2$ . This convergence rate is also shown in the iteration vector $\overline{\mathbf{z}}_{l + 1}$ in (11.29); i.e., those elements in the iteration vector that should tend to zero do so with at least the ratio $\lambda_1 / \lambda_2$ in each additional iteration. Thus, if $\lambda_2 > \lambda_1$ , it is the relative magnitude of $\lambda_1$ to $\lambda_2$ that determines how fast the iteration vector converges to the eigenvector $\phi_1$ .
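The linear convergence with rate $\lambda_1 / \lambda_2$ can be checked numerically in the basis of eigenvectors. The following is a minimal sketch, not part of the original presentation: the eigenvalues `[1.0, 4.0, 9.0]` are hypothetical values chosen for illustration (assumed sorted in ascending order), and the function names are invented here.

```python
import math

def scaled_iterates(eigenvalues, steps):
    """Iterate Lambda z_{k+1} = z_k from z_1 = [1, ..., 1] and return the
    scaled vectors zbar_{l+1} = lambda_1**l * z_{l+1} for l = 1..steps.
    Assumes eigenvalues are positive and sorted in ascending order."""
    lam1 = eigenvalues[0]
    z = [1.0] * len(eigenvalues)                           # z_1 as in (11.27)
    out = []
    for l in range(1, steps + 1):
        z = [zi / lam for zi, lam in zip(z, eigenvalues)]  # one step of (11.25)
        out.append([lam1 ** l * zi for zi in z])           # scaling as in (11.29)
    return out

def error_norm(zbar):
    """Euclidean distance between zbar and the unit vector e_1."""
    e1 = [1.0] + [0.0] * (len(zbar) - 1)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(zbar, e1)))

eigenvalues = [1.0, 4.0, 9.0]        # hypothetical eigenvalues, lambda_1 < lambda_2
errors = [error_norm(v) for v in scaled_iterates(eigenvalues, 12)]
ratios = [errors[i + 1] / errors[i] for i in range(len(errors) - 1)]

# The last ratio should be close to lambda_1 / lambda_2, as stated in (11.30).
print(ratios[-1], eigenvalues[0] / eigenvalues[1])
```

The successive error ratios approach $\lambda_1/\lambda_2$ from above, because the components belonging to the higher eigenvalues die out faster than the one belonging to $\lambda_2$.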

In this discussion we assumed that $\lambda_{1} < \lambda_{2}$ . Let us now consider the case of a multiple eigenvalue, namely, $\lambda_{1} = \lambda_{2} = \cdots = \lambda_{m}$ . Then we have in (11.29),

$$
\overline {{{\mathbf {z}}}} _ {l + 1} ^ {T} = \left[ \begin{array}{l l l l l l l} 1 & 1 & \dots & 1 & \left(\frac {\lambda_ {1}}{\lambda_ {m + 1}}\right) ^ {l} & \dots & \left(\frac {\lambda_ {1}}{\lambda_ {n}}\right) ^ {l} \end{array} \right] \tag {11.31}
$$

and the convergence rate of the iteration vector is $\lambda_{1}/\lambda_{m+1}$ . Therefore, in general, the rate of convergence of the iteration vector in inverse iteration is given by the ratio of $\lambda_{1}$ to the next distinct eigenvalue.

In the iteration given in (11.16) to (11.22), we obtain an approximation to the eigenvalue $\lambda_{1}$ by evaluating the Rayleigh quotient. Corresponding to (11.18), the Rayleigh quotient calculated in the iteration of (11.25) would be

$$
\rho (\mathbf {z} _ {k + 1}) = \frac {\mathbf {z} _ {k + 1} ^ {T} \mathbf {z} _ {k}}{\mathbf {z} _ {k + 1} ^ {T} \mathbf {z} _ {k + 1}} \tag {11.32}
$$

Assume that we consider the last iteration in which $k = l$ . Then substituting for $\mathbf{z}_l$ and $\mathbf{z}_{l+1}$ from (11.28) into (11.32), we obtain

$$
\rho (\mathbf {z} _ {l + 1}) = \frac {\lambda_ {1} \sum_ {i = 1} ^ {n} \left(\lambda_ {1} / \lambda_ {i}\right) ^ {2 l - 1}}{\sum_ {i = 1} ^ {n} \left(\lambda_ {1} / \lambda_ {i}\right) ^ {2 l}} \tag {11.33}
$$

Hence we have for $\lambda_{1}$ being a simple or multiple eigenvalue,

$$
\rho (\mathbf {z} _ {l + 1}) \rightarrow \lambda_ {1} \quad \text { as } l \rightarrow \infty
$$

Also, convergence is linear with the rate equal to $(\lambda_{1}/\lambda_{m+1})^{2}$ , where $\lambda_{m+1}$ is defined as in (11.31). This convergence rate substantiates the observation that if an eigenvector is known with an error $\epsilon$ , then the Rayleigh quotient yields an approximation to the corresponding eigenvalue with error $\epsilon^{2}$ (see Section 2.6).
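The squared rate of the Rayleigh quotient can be observed in the same way. The sketch below evaluates (11.32) during the iteration (11.25), again with hypothetical eigenvalues `[1.0, 4.0, 9.0]` and a function name introduced only for this illustration.

```python
def rayleigh_history(eigenvalues, steps):
    """Run Lambda z_{k+1} = z_k from a full unit vector and return the
    Rayleigh quotients of (11.32) for each step."""
    z_old = [1.0] * len(eigenvalues)
    rhos = []
    for _ in range(steps):
        z_new = [zi / lam for zi, lam in zip(z_old, eigenvalues)]
        num = sum(a * b for a, b in zip(z_new, z_old))   # z_{k+1}^T z_k
        den = sum(a * a for a in z_new)                  # z_{k+1}^T z_{k+1}
        rhos.append(num / den)
        z_old = z_new
    return rhos

eigenvalues = [1.0, 4.0, 9.0]        # hypothetical, lambda_1 is a simple eigenvalue
rhos = rayleigh_history(eigenvalues, 7)
errors = [abs(r - eigenvalues[0]) for r in rhos]
ratio = errors[-1] / errors[-2]

# ratio should be close to (lambda_1 / lambda_2)**2
print(ratio, (eigenvalues[0] / eigenvalues[1]) ** 2)
```

Only a few steps are used on purpose: the error in the Rayleigh quotient shrinks so quickly that after many more steps it would fall below floating-point resolution and the ratio would no longer be meaningful.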

<!-- source-page: 913 -->

Before demonstrating the results by means of a brief example, it should be recalled that we assumed in the above analysis a full unit starting iteration vector as given in (11.27). The convergence properties derived hold for any starting iteration vector that is not orthogonal to the eigenvector of interest, but the convergence rates can in many practical analyses be observed only as the number of iterations becomes large. The same observation also holds for any of the other convergence analyses that are presented in the following sections. We discuss this observation with other important practical aspects in Section 11.2.6.

EXAMPLE 11.3: For the problem considered in Example 11.2, calculate the ultimate convergence rates of the iteration vector and the Rayleigh quotient. Compare the ultimate convergence rates with those actually observed in the inverse iteration carried out in Example 11.2.

For the evaluation of the theoretical convergence rates, we need $\lambda_{1}$ and $\lambda_{2}$ . We calculated the eigenvalues in Example 10.12 and found

$$
\lambda_ {1} = \frac {1}{2} - \frac {\sqrt {2}}{4}
$$

$$
\lambda_ {2} = \frac {1}{2} + \frac {\sqrt {2}}{4}
$$

Hence, the ultimate convergence rate of the iteration vector is

$$
\frac {\lambda_ {1}}{\lambda_ {2}} = 0. 1 7
$$

and the ultimate convergence rate of the Rayleigh quotient is

$$
\left(\frac {\lambda_ {1}}{\lambda_ {2}}\right) ^ {2} = 0. 0 2 9
$$
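The two theoretical rates follow directly from the eigenvalues quoted above and can be checked with a few lines of arithmetic (variable names are chosen here for illustration):

```python
import math

lam1 = 0.5 - math.sqrt(2) / 4
lam2 = 0.5 + math.sqrt(2) / 4

vector_rate = lam1 / lam2            # ultimate rate of the iteration vector
rayleigh_rate = vector_rate ** 2     # ultimate rate of the Rayleigh quotient

print(round(vector_rate, 4), round(rayleigh_rate, 4))   # prints 0.1716 0.0294
```

To two significant digits these agree with the values 0.17 and 0.029 given in the text.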

The actual vector convergence obtained is observed by evaluating the ratio $r_{k+1}, k = 1, 2, \ldots$ , where

$$
r _ {k + 1} = \frac {\left\| \mathbf {x} _ {k + 1} - \boldsymbol {\phi} _ {1} \right\| _ {2}}{\left\| \mathbf {x} _ {k} - \boldsymbol {\phi} _ {1} \right\| _ {2}}
$$

and we assume that $\phi_{1}$ is obtained in the last iteration [see (11.22)].

For the iteration in Example 11.2, we thus obtain

$$
r _ {2} = 0. 0 2 6 0 8 3; \quad r _ {3} = 0. 1 7 0 5 5 9; \quad r _ {4} = 0. 1 6 7 1 3 4; \quad r _ {5} = 0. 1 4 4 2 5 1
$$

Ignoring $r_2$ because the iteration just started, we see that the theoretical and actual convergence rates compare quite well.

Similarly, the actual convergence of the Rayleigh quotient calculated in Example 11.2 is observed by evaluating

$$
\epsilon_ {k + 1} = \frac {\left| \rho \left(\overline {{\mathbf {x}}} _ {k + 1}\right) - \lambda_ {1} \right|}{\left| \rho \left(\overline {{\mathbf {x}}} _ {k}\right) - \lambda_ {1} \right|}
$$

where we use the converged value of the Rayleigh quotient for $\lambda_{1}$ . In the iteration of Example 11.2 we have

$$
\epsilon_ {3} = 0. 0 2 8 7 6 8; \quad \epsilon_ {4} = 0. 0 2 7 7 7 8; \quad \epsilon_ {5} = 0
$$

Hence, we see that the theoretical and observed convergence rates again agree quite well in this solution.

<!-- source-page: 914 -->

# 11.2.2 Forward Iteration

The method of forward iteration is complementary to the inverse iteration technique in that the method yields the eigenvector corresponding to the largest eigenvalue. Whereas we assumed in inverse iteration that K is positive definite, we assume in this section that M is positive definite; otherwise, a shift must be used (see Section 11.2.3). Having chosen a starting iteration vector $x_{1}$ , in forward iteration we evaluate, for $k = 1, 2, \ldots$ ,

$$
\mathbf {M} \overline {{{\mathbf {x}}}} _ {k + 1} = \mathbf {K} \mathbf {x} _ {k} \tag {11.34}
$$

and

$$
\mathbf {x} _ {k + 1} = \frac {\overline {{{\mathbf {x}}}} _ {k + 1}}{(\overline {{{\mathbf {x}}}} _ {k + 1} ^ {T} \mathbf {M} \overline {{{\mathbf {x}}}} _ {k + 1}) ^ {1 / 2}} \tag {11.35}
$$

where provided that $x_{1}$ is not M-orthogonal to $\phi_{n}$ , we have

$$
\mathbf {x} _ {k + 1} \rightarrow \boldsymbol {\phi} _ {n} \quad \text { as } k \rightarrow \infty
$$

The analogy to inverse iteration should be noted; the only difference is that we solve (11.34) rather than (11.13) to obtain an improved eigenvector. This means, in practice, that in the inverse iteration we need to triangularize the matrix K and in the forward iteration we decompose M.

A more effective forward iteration procedure than that in (11.34) and (11.35) would be obtained by using equations that are analogous to those in (11.16) to (11.22). Assuming that $y_{1} = Kx_{1}$ , we evaluate for $k = 1, 2, \ldots$ ,

$$
\mathbf {M} \overline {{{\mathbf {x}}}} _ {k + 1} = \mathbf {y} _ {k} \tag {11.36}
$$

$$
\overline {{{\mathbf {y}}}} _ {k + 1} = \mathbf {K} \overline {{{\mathbf {x}}}} _ {k + 1} \tag {11.37}
$$

$$
\rho (\overline {{{\mathbf {x}}}} _ {k + 1}) = \frac {\overline {{{\mathbf {x}}}} _ {k + 1} ^ {T} \overline {{{\mathbf {y}}}} _ {k + 1}}{\overline {{{\mathbf {x}}}} _ {k + 1} ^ {T} \mathbf {y} _ {k}} \tag {11.38}
$$

$$
\mathbf {y} _ {k + 1} = \frac {\overline {{{\mathbf {y}}}} _ {k + 1}}{(\overline {{{\mathbf {x}}}} _ {k + 1} ^ {T} \mathbf {y} _ {k}) ^ {1 / 2}} \tag {11.39}
$$

where provided that $\phi_n^T\mathbf{y}_1\neq 0$

$$
\mathbf {y} _ {k + 1} \rightarrow \mathbf {K} \boldsymbol {\phi} _ {n} \quad \text { and } \quad \rho (\overline {{\mathbf {x}}} _ {k + 1}) \rightarrow \lambda_ {n} \quad \text { as } k \rightarrow \infty
$$

Convergence in the iteration could again be measured as given in (11.20), and denoting the last iteration by $l$ , we have

$$
\lambda_ {n} \doteq \rho (\overline {{{\mathbf {x}}}} _ {l + 1}) \tag {11.40}
$$

and

$$
\boldsymbol {\phi} _ {n} = \frac {\overline {{{\mathbf {x}}}} _ {l + 1}}{(\overline {{{\mathbf {x}}}} _ {l + 1} ^ {T} \mathbf {y} _ {l}) ^ {1 / 2}} \tag {11.41}
$$

The convergence of the iteration vector to $\boldsymbol{\phi}_{n}$ can be analyzed by following the same procedure that was used in the evaluation of the convergence characteristics of inverse iteration. Alternatively, we may use the results that we obtained in the analysis of inverse iteration. Namely, assume that we write the eigenproblem $\mathbf{K}\boldsymbol{\phi} = \lambda \mathbf{M}\boldsymbol{\phi}$ in the form $\mathbf{M}\boldsymbol{\phi} = \lambda^{-1}\mathbf{K}\boldsymbol{\phi}$ ; then using inverse iteration on this second form to solve for an eigenvector and corresponding eigenvalue is equivalent to performing forward iteration on the problem $\mathbf{K}\boldsymbol{\phi} = \lambda \mathbf{M}\boldsymbol{\phi}$ . But since we converge in the inverse iteration of (11.16) to (11.22) to the smallest eigenvalue and corresponding eigenvector, and since for the problem $\mathbf{M}\boldsymbol{\phi} = \lambda^{-1}\mathbf{K}\boldsymbol{\phi}$ this eigenvalue is $\lambda_n^{-1}$ , where $\lambda_n$ is the largest eigenvalue of $\mathbf{K}\boldsymbol{\phi} = \lambda \mathbf{M}\boldsymbol{\phi}$ , we converge in the forward iteration of (11.36) to (11.41) to $\lambda_n$ and $\boldsymbol{\phi}_n$ , and the convergence rate of the iteration vector is $\lambda_{n-1} / \lambda_n$ . We should note that the Rayleigh quotient evaluated in (11.38) is $\overline{\mathbf{x}}_{k+1}^T\mathbf{K}\overline{\mathbf{x}}_{k+1} / \overline{\mathbf{x}}_{k+1}^T\mathbf{M}\overline{\mathbf{x}}_{k+1}$ , i.e., just the inverse of the Rayleigh quotient for calculating an approximation to $\lambda_n^{-1}$ in the problem $\mathbf{M}\boldsymbol{\phi} = \lambda^{-1}\mathbf{K}\boldsymbol{\phi}$ .

We demonstrate the iteration and convergence in the following example.

EXAMPLE 11.4: Use forward iteration as given in (11.36) to (11.41) with $tol = 10^{-6}$ in (11.20) to evaluate $\lambda_{4}$ and $\phi_{4}$ of the eigenproblem $K\phi = \lambda M\phi$ , where

$$
\mathbf {K} = \left[ \begin{array}{r r r r} 5 & - 4 & 1 & 0 \\ - 4 & 6 & - 4 & 1 \\ 1 & - 4 & 6 & - 4 \\ 0 & 1 & - 4 & 5 \end{array} \right]; \quad \mathbf {M} = \left[ \begin{array}{c c c c} 2 & & & \\ & 2 & & \\ & & 1 & \\ & & & 1 \end{array} \right]
$$

The physical problem considered in this example is the free-vibration response of the simply supported beam shown in Fig. 8.1 with the above mass matrix.

Starting the iteration with

$$
\mathbf {x} _ {1} = \left[ \begin{array}{l} 1 \\ 1 \\ 1 \\ 1 \end{array} \right]
$$

we calculate in the forward iteration the values summarized in Table E11.4.

Hence, we need 10 iterations for a convergence tolerance of $10^{-6}$ in (11.20), and we then use, as given in (11.40) and (11.41),

$$
\lambda_ {4} \doteq 10.63845; \quad \boldsymbol {\phi} _ {4} \doteq \left[ \begin{array}{c} -0.10731 \\ 0.25539 \\ -0.72827 \\ 0.56227 \end{array} \right]
$$

Comparing after iteration 10 the predicted value of $\lambda_{4}$ with the exact value, we have

$$
\frac {\left| \lambda_ {4} ^ {\text { exact }} - \rho (\overline {{{\mathbf {x}}}} _ {1 1}) \right|}{\lambda_ {4} ^ {\text { exact }}} = 1. 9 2 \times 1 0 ^ {- 7}
$$

Also, the right-hand side in (10.107) gives $5.24 \times 10^{-4}$ .
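The forward iteration of (11.36) to (11.41) applied to this example can be sketched in a few lines. This is an illustrative implementation, not the program used to produce the table: it assumes a diagonal mass matrix stored as a list `m_diag` (so that solving with $\mathbf{M}$ is an elementwise division), and it assumes that the convergence test of (11.20) is the relative change in successive Rayleigh quotients, as suggested by the last column of Table E11.4. All function and variable names are chosen here.

```python
import math

def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def forward_iteration(K, m_diag, x1, tol=1e-6, max_iter=200):
    """Forward iteration for the largest eigenpair of K phi = lambda M phi,
    with M = diag(m_diag). Returns (lambda_n, phi_n, history of rho)."""
    y = matvec(K, x1)                                   # y_1 = K x_1
    rho_old = None
    history = []
    for _ in range(max_iter):
        x_bar = [yi / mi for yi, mi in zip(y, m_diag)]  # (11.36), M diagonal
        y_bar = matvec(K, x_bar)                        # (11.37)
        scale = dot(x_bar, y)                           # x_bar^T y_k
        rho = dot(x_bar, y_bar) / scale                 # (11.38)
        history.append(rho)
        y = [v / math.sqrt(scale) for v in y_bar]       # (11.39)
        if rho_old is not None and abs(rho - rho_old) / abs(rho) <= tol:
            break
        rho_old = rho
    phi = [v / math.sqrt(scale) for v in x_bar]         # (11.41)
    return rho, phi, history

K = [[5, -4, 1, 0],
     [-4, 6, -4, 1],
     [1, -4, 6, -4],
     [0, 1, -4, 5]]
m_diag = [2.0, 2.0, 1.0, 1.0]

lam4, phi4, history = forward_iteration(K, m_diag, [1.0, 1.0, 1.0, 1.0])
print(lam4, phi4)
```

The list `history` holds the Rayleigh quotient of each iteration, so its first entries can be compared with the $\rho(\overline{\mathbf{x}}_{k+1})$ column of Table E11.4.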

<!-- source-page: 916 -->

TABLE E11.4 

<table><tr><td>k</td><td> $\overline{x}_{k+1}$ </td><td> $\overline{y}_{k+1}$ </td><td> $\rho(\overline{x}_{k+1})$ </td><td> $y_{k+1}$ </td><td> $\frac{|\lambda_{4}^{(k+1)} - \lambda_{4}^{(k)}|}{\lambda_{4}^{(k+1)}}$ </td></tr><tr><td rowspan="4">1</td><td>1</td><td>6</td><td>5.93333</td><td>2.1909</td><td>---</td></tr><tr><td>-0.5</td><td>-1</td><td></td><td>-0.3651</td><td></td></tr><tr><td>-1</td><td>-11</td><td></td><td>-4.0166</td><td></td></tr><tr><td>2</td><td>13.5</td><td></td><td>4.9295</td><td></td></tr><tr><td rowspan="4">2</td><td>1.0954</td><td>2.1909</td><td>8.57887</td><td>0.3345</td><td>0.3084</td></tr><tr><td>-0.1826</td><td>15.5188</td><td></td><td>2.3694</td><td></td></tr><tr><td>-4.0166</td><td>-41.9921</td><td></td><td>-6.4112</td><td></td></tr><tr><td>4.9295</td><td>40.5315</td><td></td><td>6.1882</td><td></td></tr><tr><td rowspan="4">3</td><td>0.1672</td><td>-10.3137</td><td>10.15966</td><td>-1.1372</td><td>0.1556</td></tr><tr><td>1.1847</td><td>38.2720</td><td></td><td>4.2198</td><td></td></tr><tr><td>-6.4112</td><td>-67.7914</td><td></td><td>-7.4745</td><td></td></tr><tr><td>6.1882</td><td>57.7704</td><td></td><td>6.3696</td><td></td></tr><tr><td rowspan="4">8</td><td>-1.1285</td><td>-24.2083</td><td>10.63838</td><td>-2.2756</td><td>0.00003304</td></tr><tr><td>2.7044</td><td>57.7298</td><td></td><td>5.4267</td><td></td></tr><tr><td>-7.7481</td><td>-82.4222</td><td></td><td>-7.7478</td><td></td></tr><tr><td>5.9969</td><td>63.6811</td><td></td><td>5.9861</td><td></td></tr><tr><td rowspan="4">9</td><td>-1.1378</td><td>-24.2902</td><td>10.63844</td><td>-2.2833</td><td>0.000005584</td></tr><tr><td>2.7133</td><td>57.8086</td><td></td><td>5.4340</td><td></td></tr><tr><td>-7.7478</td><td>-82.4224</td><td></td><td>-7.7476</td><td></td></tr><tr><td>5.9861</td><td>63.6351</td><td></td><td>5.9816</td><td></td></tr><tr><td 
rowspan="4">10</td><td>-1.1416</td><td>-24.3237</td><td>10.63845</td><td>-2.2864</td><td>0.0000009437</td></tr><tr><td>2.7170</td><td>57.8405</td><td></td><td>5.4369</td><td></td></tr><tr><td>-7.7476</td><td>-82.4219</td><td></td><td>-7.7476</td><td></td></tr><tr><td>5.9816</td><td>63.6157</td><td></td><td>5.9798</td><td></td></tr></table>

# 11.2.3 Shifting in Vector Iteration

The convergence analysis of inverse iteration in Section 11.2.1 showed that, assuming $\lambda_{1} < \lambda_{2}$ , the iteration vector converges with a rate $\lambda_{1}/\lambda_{2}$ to the eigenvector $\phi_{1}$ . Therefore, depending on the magnitudes of $\lambda_{1}$ and $\lambda_{2}$ , convergence can be arbitrarily slow, say with $\lambda_{1}/\lambda_{2} = 0.99999$ , or very fast, say with $\lambda_{1}/\lambda_{2} = 0.01$ . Similarly, in forward iteration convergence can be slow or fast. A natural question is therefore how to improve the convergence rate in the vector iterations. We show in this section that the convergence rate can be much improved by shifting. In addition, a shift can be used to obtain convergence to an eigenpair other than $(\lambda_{1}, \phi_{1})$ and $(\lambda_{n}, \phi_{n})$ in inverse and forward iterations, respectively, and a shift is used effectively in inverse iteration when K is positive semidefinite and in forward iteration when M is diagonal with some zero diagonal elements (see Example 11.6).

Assume that a shift $\mu$ is applied as described in Section 10.2.3; then we consider the eigenproblem

$$
(\mathbf {K} - \mu \mathbf {M}) \phi = \eta \mathbf {M} \phi \tag {11.42}
$$

<!-- source-page: 917 -->

where the eigenvalues of the original problem $K\phi = \lambda M\phi$ and of the problem in (11.42) are related by $\eta_{i} = \lambda_{i} - \mu, i = 1, \ldots, n$ . To analyze the convergence properties of inverse and forward iteration when applied to the problem in (11.42), we follow in all respects the procedure used in Section 11.2.1. The first step is to consider the problem in the basis of eigenvectors $\Phi$ . Using the transformation

$$
\boldsymbol {\phi} = \Phi \boldsymbol {\psi} \tag {11.43}
$$

we obtain for the convergence analysis the equivalent eigenproblem

$$
(\mathbf {\Lambda} - \mu \mathbf {I}) \boldsymbol {\psi} = \eta \boldsymbol {\psi} \tag {11.44}
$$

Consider first inverse iteration and assume that all eigenvalues are distinct. In that case we obtain, using the notation in Section 11.2.1,

$$
\mathbf {z} _ {l + 1} ^ {T} = \left[ \frac {1}{(\lambda_ {1} - \mu) ^ {l}} \frac {1}{(\lambda_ {2} - \mu) ^ {l}} \dots \frac {1}{(\lambda_ {n} - \mu) ^ {l}} \right] \tag {11.45}
$$

where it is assumed that all $\lambda_{i} - \mu$ are nonzero, but they may be positive or negative. Assume that $\lambda_{i} - \mu$ is smallest in absolute magnitude when $i = j$ ; then multiplying $\mathbf{z}_{l+1}$ by $(\lambda_{j} - \mu)^{l}$ , we obtain

$$
\overline {{{\mathbf {z}}}} _ {l + 1} = \left[ \begin{array}{c} \left(\frac {\lambda_ {j} - \mu}{\lambda_ {1} - \mu}\right) ^ {l} \\ \vdots \\ \left(\frac {\lambda_ {j} - \mu}{\lambda_ {j - 1} - \mu}\right) ^ {l} \\ 1 \\ \left(\frac {\lambda_ {j} - \mu}{\lambda_ {j + 1} - \mu}\right) ^ {l} \\ \vdots \\ \left(\frac {\lambda_ {j} - \mu}{\lambda_ {n} - \mu}\right) ^ {l} \end{array} \right] \tag {11.46}
$$

where $\left| (\lambda_j - \mu) / (\lambda_p - \mu) \right| < 1$ for all $p \neq j$ . Hence, in the iteration we have $\overline{\mathbf{z}}_{l+1} \to \mathbf{e}_j$ , meaning that in inverse iteration to solve (11.42), the iteration vector converges to $\phi_j$ . Furthermore, we obtain $\lambda_j = \eta_j + \mu$ . The convergence rate in the iteration is determined by the element $(\lambda_j - \mu) / (\lambda_p - \mu)$ which is largest in absolute magnitude, $p \neq j$ ; i.e., the convergence rate $r$ is

$$
r = \max _ {p \neq j} \left| \frac {\lambda_ {j} - \mu}{\lambda_ {p} - \mu} \right| \tag {11.47}
$$

Since $\lambda_{j}$ is nearest $\mu$ , the convergence rate of the iteration vector in (11.42) to the eigenvector $\phi_{j}$ is either

$$
\left| \frac {\lambda_ {j} - \mu}{\lambda_ {j - 1} - \mu} \right| \quad \text { or } \quad \left| \frac {\lambda_ {j} - \mu}{\lambda_ {j + 1} - \mu} \right|
$$

whichever is larger. The convergence rate for a typical case is shown in Fig. 11.1.
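The rate in (11.47) is easy to evaluate once the eigenvalues are known. The sketch below is for illustration only: the eigenvalues `[1.0, 2.0, 4.0, 8.0]` are hypothetical, the function name is invented here, and the code assumes distinct eigenvalues with no eigenvalue equal to the shift.

```python
def inverse_iteration_rate(eigenvalues, mu):
    """Return (j, r): the index of the eigenvalue nearest to the shift mu and
    the vector convergence rate r of (11.47) for shifted inverse iteration."""
    dist = [abs(lam - mu) for lam in eigenvalues]
    j = min(range(len(dist)), key=dist.__getitem__)          # lambda_j nearest mu
    r = max(dist[j] / d for p, d in enumerate(dist) if p != j)
    return j, r

eigenvalues = [1.0, 2.0, 4.0, 8.0]     # hypothetical, distinct eigenvalues

for mu in (0.0, 3.5, 3.9):
    j, r = inverse_iteration_rate(eigenvalues, mu)
    print(f"mu = {mu}: converges to eigenvalue {eigenvalues[j]}, rate {r:.4f}")
```

With no shift the rate is $\lambda_1/\lambda_2 = 0.5$; moving the shift toward the third eigenvalue both changes the eigenpair that is found and reduces the rate, in line with the discussion above.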

Using the results of the above convergence analysis and of the analysis of inverse iteration without shifting (see Section 11.2.1), two additional conclusions are reached.

<!-- source-page: 918 -->

![](images/page-918_d672cbc57527d7db74664dff1493c3e8cb80f59e514c9b078f212f7097ff7feb.jpg)

Figure 11.1 Example of vector convergence rate r in inverse iteration

First, we observe that the convergence rate of the Rayleigh quotient, which for $\mu$ nearest to $\lambda_{j}$ converges to $\lambda_{j} - \mu$ , is

$$
\left| \frac {\lambda_ {j} - \mu}{\lambda_ {j - 1} - \mu} \right| ^ {2} \quad \text { or } \quad \left| \frac {\lambda_ {j} - \mu}{\lambda_ {j + 1} - \mu} \right| ^ {2}
$$

whichever is larger.

The second observation concerns the case of $\lambda_{j}$ being a multiple eigenvalue. The analysis in Section 11.2.1 and the conclusions above show that if $\lambda_{j} = \lambda_{j+1} = \cdots = \lambda_{j+m-1}$ , the rate of convergence of the iteration vector is

$$
\max _ {p \neq j, j + 1, \dots , j + m - 1} \left| \frac {\lambda_ {j} - \mu}{\lambda_ {p} - \mu} \right|
$$

and convergence occurs to a vector in the subspace corresponding to $\lambda_{j}$ .

The important point in inverse iteration with shifting is that by choosing a shift near enough the specific eigenvalue of interest, we can, in theory, have a convergence rate that is as high as required; i.e., we would only need to make $\left|\lambda_{j}-\mu\right|$ small enough in relation to $\left|\lambda_{p}-\mu\right|$ defined above. However, in an actual solution scheme the difficulty is to find an appropriate $\mu$ , for which we consider various methods in the next sections.

EXAMPLE 11.5: Use inverse iteration as given in (11.16) to (11.22) in order to calculate $(\lambda_{1}, \phi_{1})$ of the problem $K\phi = \lambda M\phi$ , where K and M are given in Example 11.4. Then impose the shift $\mu = 10$ and show that in the inverse iteration convergence occurs toward $\lambda_{4}$ and $\phi_{4}$ .

Using inverse iteration on $K\phi = \lambda M\phi$ as in Example 11.2 gives convergence after three iterations with a tolerance of $10^{-6}$ ,

$$
\lambda_ {1} \doteq 0. 0 9 6 5 4; \quad \phi_ {1} \doteq \left[ \begin{array}{l} 0. 3 1 2 6 \\ 0. 4 9 5 5 \\ 0. 4 7 9 1 \\ 0. 2 8 9 8 \end{array} \right]
$$

Now imposing a shift of $\mu = 10$ , we obtain

$$
\mathbf {K} - \mu \mathbf {M} = \left[ \begin{array}{c c c c} - 1 5 & - 4 & 1 & 0 \\ - 4 & - 1 4 & - 4 & 1 \\ 1 & - 4 & - 4 & - 4 \\ 0 & 1 & - 4 & - 5 \end{array} \right]
$$

<!-- source-page: 919 -->

Using inverse iteration on the problem $(\mathbf{K} - \mu\mathbf{M})\boldsymbol{\phi} = \eta\mathbf{M}\boldsymbol{\phi}$ , we obtain convergence after six iterations with

$$
\rho (\overline {{{\mathbf {x}}}} _ {7}) = 0. 6 3 8 5; \quad \mathbf {x} _ {7} = \left[ \begin{array}{c} - 0. 1 0 7 6 \\ 0. 2 5 5 6 \\ - 0. 7 2 8 3 \\ 0. 5 6 2 0 \end{array} \right]
$$

Since we imposed a shift, we do know that $\mu + \rho(\overline{\mathbf{x}}_{7})$ is an approximation to an eigenvalue and $x_{7}$ is an approximation to the corresponding eigenvector. But we do not know which eigenpair has been approximated. By comparing $x_{7}$ with the results obtained in Example 11.4 we find that

$$
\lambda_ {4} \doteq \mu + \rho (\overline {{{\mathbf {x}}}} _ {7}) \doteq 10.6385; \quad \boldsymbol {\phi} _ {4} \doteq \mathbf {x} _ {7}
$$
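The shifted iteration of this example can be reproduced with a small self-contained sketch. Two assumptions are made here and should be read as such: the inverse iteration scheme of (11.16) to (11.22), which is not restated in this section, is taken to be the counterpart of (11.36) to (11.39) with the roles of $\mathbf{K}$ and $\mathbf{M}$ interchanged, and the linear equations are solved by plain Gauss elimination with partial pivoting rather than by the factorization a production code would use. The mass matrix is assumed diagonal and all names are chosen for illustration.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def solve(A, b):
    """Solve A x = b by Gauss elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(A, b)]      # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                      # back-substitution
        s = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def shifted_inverse_iteration(K, m_diag, mu, x1, tol=1e-6, max_iter=100):
    """Inverse iteration on (K - mu M) phi = eta M phi with M = diag(m_diag).
    Returns (mu + eta, M-normalized eigenvector approximation)."""
    n = len(x1)
    A = [[K[i][j] - (mu * m_diag[i] if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    y = [m * x for m, x in zip(m_diag, x1)]             # y_1 = M x_1
    rho_old = None
    for _ in range(max_iter):
        x_bar = solve(A, y)                             # (K - mu M) x_bar = y_k
        y_bar = [m * x for m, x in zip(m_diag, x_bar)]  # y_bar = M x_bar
        scale = dot(x_bar, y_bar)                       # x_bar^T M x_bar > 0
        rho = dot(x_bar, y) / scale                     # Rayleigh quotient -> eta_j
        y = [v / math.sqrt(scale) for v in y_bar]
        if rho_old is not None and abs(rho - rho_old) <= tol * abs(rho):
            break
        rho_old = rho
    return mu + rho, [v / math.sqrt(scale) for v in x_bar]

K = [[5, -4, 1, 0], [-4, 6, -4, 1], [1, -4, 6, -4], [0, 1, -4, 5]]
m_diag = [2.0, 2.0, 1.0, 1.0]

lam, x = shifted_inverse_iteration(K, m_diag, 10.0, [1.0, 1.0, 1.0, 1.0])
print(lam, x)
```

As in the text, the iteration itself does not reveal which eigenpair has been found; that identification still requires a comparison with the results of Example 11.4 or some other independent check.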

EXAMPLE 11.6: Consider the unsupported beam element depicted in Fig. E8.13. Show that the usual inverse iteration algorithm of calculating $\lambda_{1}$ and $\phi_{1}$ does not work, but that after imposing a shift the standard algorithm can again be applied.

The first step in the inverse iteration defined in (11.16) is, in this case with $\mathbf{M} = \mathbf{I}$ and $\mathbf{x}_1$ a full unit vector,

$$
\left[ \begin{array}{c c c c} 1 2 & - 6 & - 1 2 & - 6 \\ - 6 & 4 & 6 & 2 \\ - 1 2 & 6 & 1 2 & 6 \\ - 6 & 2 & 6 & 4 \end{array} \right] \overline {{{\mathbf {x}}}} _ {2} = \left[ \begin{array}{l} 1 \\ 1 \\ 1 \\ 1 \end{array} \right] \tag {a}
$$

Using Gauss elimination to solve the equations, we arrive at

$$
\left[ \begin{array}{c c c c} 12 & -6 & -12 & -6 \\ & 1 & 0 & -1 \\ & & 0 & 0 \\ & & & 0 \end{array} \right] \overline {{\mathbf {x}}} _ {2} = \left[ \begin{array}{c} 1 \\ \frac {3}{2} \\ 2 \\ 3 \end{array} \right]
$$

and hence the equations in (a) have no solution. They have a solution only if the right-hand side [i.e., $\mathbf{x}_{1}$ in (11.16)] is a null vector. There would be no difficulty in modifying the solution procedure when a singular coefficient matrix is encountered, and the advantage would be that the eigenvector would be calculated in one iteration. On the other hand, if we impose a shift, we can use the standard iteration procedure, and stability problems are avoided in the calculation of other eigenvalues and eigenvectors. Assume that we use $\mu = -6$ so that all shifted eigenvalues $\lambda_{i} - \mu$ are positive. Then we have

$$
\mathbf {K} - \mu \mathbf {I} = \left[ \begin{array}{r r r r} 1 8 & - 6 & - 1 2 & - 6 \\ - 6 & 1 0 & 6 & 2 \\ - 1 2 & 6 & 1 8 & 6 \\ - 6 & 2 & 6 & 1 0 \end{array} \right]
$$

The inverse iteration can now be performed in the standard manner using a full unit starting iteration vector. Convergence is achieved after five iterations to a tolerance of $10^{-6}$ , and we have

$$
\rho (\overline {{{\mathbf {x}}}} _ {6}) = 6. 0 0 0 0 0 0; \quad \mathbf {x} _ {6} = \left[ \begin{array}{l} 0. 7 3 7 8 4 \\ 0. 4 2 1 6 5 \\ 0. 3 1 6 2 5 \\ 0. 4 2 1 6 5 \end{array} \right]
$$

<!-- source-page: 920 -->

Hence, taking account of the shift, we have

$$
\lambda_ {1} \doteq 0.0; \quad \boldsymbol {\phi} _ {1} \doteq \mathbf {x} _ {6}
$$
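A quick way to confirm this result is to substitute the computed vector back into the unshifted problem. The check below uses only the matrix in (a) and the vector $\mathbf{x}_6$ quoted above, with $\mathbf{M} = \mathbf{I}$; the variable names are chosen here, and the residual is expected to be small but not zero because $\mathbf{x}_6$ is given to five digits only.

```python
K = [[12, -6, -12, -6],
     [-6, 4, 6, 2],
     [-12, 6, 12, 6],
     [-6, 2, 6, 4]]
x6 = [0.73784, 0.42165, 0.31625, 0.42165]

# Residual of K x = lambda x with lambda = 0, i.e. simply K x6.
residual = [sum(k * x for k, x in zip(row, x6)) for row in K]

# Rayleigh quotient x^T K x / x^T x and the squared length of x6.
norm_sq = sum(x * x for x in x6)
rho = sum(x * r for x, r in zip(x6, residual)) / norm_sq

print(max(abs(r) for r in residual), rho, norm_sq)
```

A residual close to zero means that $\mathbf{x}_6$ lies, to the printed accuracy, in the null space of $\mathbf{K}$, which is consistent with $\lambda_1 = 0$ for the unsupported element.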

We showed before that the rate of convergence in inverse iteration can be greatly increased by shifting. We may now wonder whether the convergence rate in forward iteration can be increased in a similar way. In analogy to the convergence proof of inverse iteration with a shift, we can generalize the convergence analysis of forward iteration when a shift $\mu$ is used. The final result is that the iteration vector converges to the eigenvector $\phi_{j}$ that corresponds to the largest eigenvalue $\left|\lambda_{j}-\mu\right|$ of the problem in (11.42), where

$$
\left| \lambda_ {j} - \mu \right| = \max _ {\text { all } i} \left| \lambda_ {i} - \mu \right| \tag {11.48}
$$

The convergence rate of the iteration vector is given by

$$
r = \max _ {p \neq j} \left| \frac {\lambda_ {p} - \mu}{\lambda_ {j} - \mu} \right| \tag {11.49}
$$

which, in fact, is the ratio of the second largest eigenvalue to the largest eigenvalue (both measured in absolute values) of the problem $(\mathbf{K}-\mu\mathbf{M})\phi=\eta\mathbf{M}\phi$ . In the case of $\lambda_{j}$ being a multiple eigenvalue, say $\lambda_{j}=\lambda_{j+1}=\cdots=\lambda_{j+m-1}$ , the iteration vector converges to a vector in the subspace corresponding to $\lambda_{j}$ , and the rate of convergence is

$$
\max _ {p \neq j, j + 1, \dots , j + m - 1} \left| \frac {\lambda_ {p} - \mu}{\lambda_ {j} - \mu} \right|
$$
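The forward iteration rate of (11.49) can be evaluated in the same manner as the inverse iteration rate. The sketch below is illustrative: the eigenvalues `[1.0, 2.0, 4.0, 8.0]` are hypothetical, the function name is invented here, and a unique eigenvalue farthest from the shift is assumed.

```python
def forward_iteration_rate(eigenvalues, mu):
    """Return (j, r): the index of the eigenvalue farthest from the shift mu
    and the vector convergence rate r of (11.49) for shifted forward iteration."""
    dist = [abs(lam - mu) for lam in eigenvalues]
    j = max(range(len(dist)), key=dist.__getitem__)          # largest |lambda_j - mu|
    r = max(d / dist[j] for p, d in enumerate(dist) if p != j)
    return j, r

eigenvalues = [1.0, 2.0, 4.0, 8.0]     # hypothetical, distinct eigenvalues

for mu in (0.0, 2.5, 6.0):
    j, r = forward_iteration_rate(eigenvalues, mu)
    print(f"mu = {mu}: converges to eigenvalue {eigenvalues[j]}, rate {r:.4f}")
```

For these values the unshifted rate is 0.5, the shift 2.5 lowers it to about 0.27 while still converging to the largest eigenvalue, and the shift 6.0 switches convergence to the smallest eigenvalue with the poor rate 0.8. No choice of shift makes an interior eigenvalue the one farthest from $\mu$.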

The main difference between the convergence rates in (11.47) and (11.49) is that in (11.47), $\lambda_{p}$ is in the denominator, whereas in (11.49), $\lambda_{p}$ is in the numerator. This limits the convergence rate in forward iteration, and by means of shifting, convergence can be obtained only to the eigenpair $(\lambda_{n}, \phi_{n})$ or to the eigenpair $(\lambda_{1}, \phi_{1})$ . To achieve the highest convergence rates to $\phi_{n}$ and $\phi_{1}$ , we need to choose $\mu = (\lambda_{1} + \lambda_{n-1})/2$ and $\mu = (\lambda_{2} + \lambda_{n})/2$ , respectively, and have the corresponding convergence rates

$$
\left| \frac {\lambda_ {n - 1} - \frac {\lambda_ {1} + \lambda_ {n - 1}}{2}}{\lambda_ {n} - \frac {\lambda_ {1} + \lambda_ {n - 1}}{2}} \right| \quad \text { and } \quad \left| \frac {\lambda_ {2} - \frac {\lambda_ {2} + \lambda_ {n}}{2}}{\lambda_ {1} - \frac {\lambda_ {2} + \lambda_ {n}}{2}} \right|
$$

(see Fig. 11.2). Therefore, a much higher convergence rate can be obtained with shifting in

![](images/page-920_9234966c6504600293d910735133d3ecb3e96c88c05c9c5a441a103dfa867a86.jpg)

Figure 11.2 Shifting to obtain best convergence rate r in forward iteration for $\lambda_{6}$ ( $\lambda_{6} =$ largest eigenvalue)