Files
김경종 4cc312954f
Tests / Hermetic test suite (push) Has been cancelled
Tests / Skill frontmatter validation (push) Has been cancelled
add wiki
2026-05-28 17:16:48 +09:00

27 KiB

problem \mathbf{A}\mathbf{v} = \lambda \mathbf{v} , we consider the problems


\mathbf {A} ^ {(m)} \mathbf {v} ^ {(m)} = \lambda^ {(m)} \mathbf {v} ^ {(m)} \tag {2.132}

where \mathbf{A}^{(m)} is obtained by omitting the last m rows and columns of \mathbf{A} . Hence \mathbf{A}^{(m)} is a square-symmetric matrix of order (n - m) . Using also the notation \mathbf{A}^{(0)} = \mathbf{A} , \lambda^{(0)} = \lambda , \mathbf{v}^{(0)} = \mathbf{v} , the eigenvalue separation property states that the eigenvalues of the problem \mathbf{A}^{(m+1)}\mathbf{v}^{(m+1)} = \lambda^{(m+1)}\mathbf{v}^{(m+1)} separate the eigenvalues of the problem in (2.132); i.e., we have


\lambda_ {1} ^ {(m)} \leq \lambda_ {1} ^ {(m + 1)} \leq \lambda_ {2} ^ {(m)} \leq \lambda_ {2} ^ {(m + 1)} \leq \dots \leq \lambda_ {n - m - 1} ^ {(m)} \leq \lambda_ {n - m - 1} ^ {(m + 1)} \leq \lambda_ {n - m} ^ {(m)} \tag {2.133}

\text { for } m = 0, \dots , n - 2

For the proof of (2.133) we consider the problems \mathbf{A}\mathbf{v} = \lambda \mathbf{v} and \mathbf{A}^{(1)}\mathbf{v}^{(1)} = \lambda^{(1)}\mathbf{v}^{(1)} . If we can show that the eigenvalue separation property holds for these two problems, it will hold also for m = 1, 2, \ldots, n - 2 . Specifically, we therefore want to prove that


\lambda_ {r} \leq \lambda_ {r} ^ {(1)} \leq \lambda_ {r + 1} \quad r = 1, \dots , n - 1 \tag {2.134}

Using the minimax characterization, we have


\left. \begin{array}{c} \lambda_ {r + 1} = \max \left\{\min \frac {\mathbf {v} ^ {T} \mathbf {A} \mathbf {v}}{\mathbf {v} ^ {T} \mathbf {v}} \right\} \\ \mathbf {v} ^ {T} \mathbf {w} _ {i} = 0; \quad i = 1, \dots , r; \text { all } \mathbf {w} _ {i} \text { arbitrary } \end{array} \right\} \tag {2.135}

Similarly, we have


\left. \begin{array}{l} \lambda_ {r} ^ {(1)} = \max \left\{\min \frac {\mathbf {v} ^ {T} \mathbf {A} \mathbf {v}}{\mathbf {v} ^ {T} \mathbf {v}} \right\} \\ \mathbf {v} ^ {T} \mathbf {w} _ {i} = 0; \quad i = 1, \dots , r \\ \mathbf {w} _ {i} \text {   arbitrary   for   } i = 1, \dots , r - 1 \\ \mathbf {w} _ {r} = \mathbf {e} _ {n} \end{array} \right\} \tag {2.136}

where w_{r} is constrained to be equal to e_{n} to ensure that the last element in v is zero because e_{n} is the last column of the n \times n identity matrix I. However, since the constraint for \lambda_{r+1} can be more severe and includes that for \lambda_{r}^{(1)} , we have


\lambda_ {r} ^ {(1)} \leq \lambda_ {r + 1} \tag {2.137}

To determine \lambda_r we use


\left. \begin{array}{l} \lambda_ {r} = \max \left\{\min \frac {\mathbf {v} ^ {T} \mathbf {A} \mathbf {v}}{\mathbf {v} ^ {T} \mathbf {v}} \right\} \\ \mathbf {v} ^ {T} \mathbf {w} _ {i} = 0; \quad i = 1, \dots , r - 1 \\ \text { all } \mathbf {w} _ {i} \text { arbitrary } \end{array} \right\} \tag {2.138}

Comparing the characterizations of \lambda_{r}^{(1)} and \lambda_{r} , i.e., (2.136) with (2.138), we observe that to calculate \lambda_{r}^{(1)} we have the same constraints as in the calculation of \lambda_{r} plus one more (namely, v^{T}e_{n}=0 ), and hence


\lambda_ {r} \leq \lambda_ {r} ^ {(1)} \tag {2.139}

But (2.137) and (2.139) together establish the required result given in (2.134).

The eigenvalue separation property now yields the following result. If we write the eigenvalue problems in (2.132) including the problem A v = \lambda v in the form


p ^ {(m)} (\lambda^ {(m)}) = \det \left(\mathbf {A} ^ {(m)} - \lambda^ {(m)} \mathbf {I}\right); \quad m = 0, \dots , n - 1 \tag {2.140}

where p^{(0)} = p , we see that the roots of the polynomial p(\lambda^{(m+1)}) separate the roots of the polynomial p(\lambda^{(m)}) . However, a sequence of polynomials p_i(x), i = 1, \ldots, q , form a Sturm sequence if the roots of the polynomial p_{j+1}(x) separate the roots of the polynomial p_j(x) . Hence the eigenvalue separation property states that the characteristic polynomials of the problems \mathbf{A}^{(m)}\mathbf{v}^{(m)} = \lambda^{(m)}\mathbf{v}^{(m)}, n = 0, 1, \ldots, n - 1 , form a Sturm sequence. It should be noted that in the presentation we considered all symmetric matrices; i.e., the minimax characterization of eigenvalues and the Sturm sequence property are applicable to positive definite and indefinite matrices. We shall use the Sturm sequence property extensively in later chapters (see Sections 8.2.5, 10.2.2, 11.4.3, and 11.6.4). Consider the following example.

EXAMPLE 2.38: Consider the eigenvalue problem \mathbf{A}\mathbf{v} = \lambda \mathbf{v} , where


\mathbf {A} = \left[ \begin{array}{r r r} 5 & - 4 & - 7 \\ - 4 & 2 & - 4 \\ - 7 & - 4 & 5 \end{array} \right]

Evaluate the eigenvalues of \mathbf{A} and of the matrices \mathbf{A}^{(m)} , m = 1, 2 . Show that the separation property given in (2.133) holds and sketch the characteristic polynomials p(\lambda) , p^{(1)}(\lambda^{(1)}) , and p^{(2)}(\lambda^{(2)}) .

We have


p (\lambda) = \det (\mathbf {A} - \lambda \mathbf {I}) = (5 - \lambda) [ (2 - \lambda) (5 - \lambda) - 1 6 ]

+ 4 [ - 4 (5 - \lambda) - 2 8 ] - 7 [ 1 6 + 7 (2 - \lambda) ]

Hence p(\lambda) = (-6 - \lambda)(6 - \lambda)(12 - \lambda)

and the eigenvalues are


\lambda_ {1} = - 6; \quad \lambda_ {2} = 6; \quad \lambda_ {3} = 1 2

Also, p^{(1)}(\lambda^{(1)}) = \det (\mathbf{A}^{(1)} - \lambda^{(1)}\mathbf{I})


= (5 - \lambda^ {(1)}) (2 - \lambda^ {(1)}) - 1 6

or p^{(1)}(\lambda^{(1)}) = \lambda^{(1)^2} - 7\lambda^{(1)} - 6

Hence \lambda_1^{(1)} = \frac{7}{2} -\frac{1}{2}\sqrt{73} = -0.7720


\lambda_ {2} ^ {(1)} = \frac {7}{2} + \frac {1}{2} \sqrt {7 3} = 7. 7 7 2

Finally, p^{(2)}(\lambda^{(2)}) = \det (A^{(2)} - \lambda^{(2)}\mathbf{I})


= 5 - \lambda^ {(2)}

Hence \lambda_{1}^{(2)} = 5

The separation property holds because


\lambda_ {1} \leq \lambda_ {1} ^ {(1)} \leq \lambda_ {2} \leq \lambda_ {2} ^ {(1)} \leq \lambda_ {3}


\lambda^ {(1)} <   \lambda^ {(2)} <   \lambda_ {2} ^ {(1)}

Figure E2.38 Characteristic polynomials

The characteristic polynomials are sketched in Fig. E2.38.

2.7 VECTOR AND MATRIX NORMS

We have discussed vectors, matrices, eigenvalues, and eigenvectors of symmetric matrices and have investigated the deeper significance of the elements in these entities. However, one important aspect has not been discussed so far. If we deal with single numbers, we can identify a number as being large or small. Vectors and matrices are functions of many elements, but we also need to measure their “size.” Specifically, if single numbers are used in iterative processes, the convergence of a series of numbers, say x_{1}, x_{2}, \ldots, x_{k} , to a number x is simply measured by


\lim _ {k \rightarrow \infty} | x _ {k} - x | = 0 \tag {2.141}

or, in words, convergence is obtained if the residual y_{k} = |x_{k} - x| approaches zero as k \to \infty . Furthermore, if we can find constants p \geq 1 and c > 0 such that


\lim _ {k \rightarrow \infty} \frac {\left| x _ {k + 1} - x \right|}{\left| x _ {k} - x \right| ^ {p}} = c \tag {2.142}

we say that convergence is of order p . If p = 1 , convergence is linear and the rate of convergence is c , in which case c must be smaller than 1.

In iterative solution processes using vectors and matrices we also need a measure of convergence. Realizing that the size of a vector or matrix should depend on the magnitude

of all elements in the arrays, we arrive at the definition of vector and matrix norms. A norm is a single number that depends on the magnitude of all elements in the vector or matrix.

Definition: A norm of a vector v of order n written as \|v\| is a single number. The norm is a function of the elements of v, and the following conditions are satisfied:

  1. \| \mathbf{v}\| \geq 0 and \| \mathbf{v}\| = 0 if and only if \mathbf{v} = \mathbf{0} . (2.143)
  2. \| c\mathbf{v}\| = |c|\| \mathbf{v}\| for any scalar c . (2.144)
  3. \| \mathbf{v} + \mathbf{w}\| \leq \| \mathbf{v}\| +\| \mathbf{w}\| for vectors \mathbf{v} and \mathbf{w} . (2.145)

The relation (2.145) is the triangle inequality. The following three vector norms are commonly used and are called the infinity, one, and two vector norms:


\| \mathbf {v} \| _ {\infty} = \max _ {i} | v _ {i} | \tag {2.146}

\| \mathbf {v} \| _ {1} = \sum_ {i = 1} ^ {n} | v _ {i} | \tag {2.147}

\| \mathbf {v} \| _ {2} = \left(\sum_ {i = 1} ^ {n} | v _ {i} | ^ {2}\right) ^ {1 / 2} \tag {2.148}

\| \mathbf{v} \|_2 is also known as the Euclidean vector norm. Geometrically, this norm is equal to the length of the vector \mathbf{v} . All three norms are special cases of the vector norm \sqrt[4]{\sum_i |v_i|^p} , where for (2.146), (2.147), and (2.148), p = \infty , 1, and 2, respectively. It should be noted that each of the norms in (2.146) to (2.148) satisfies the conditions in (2.143) to (2.145).

We can now measure convergence of a sequence of vectors x_{1}, x_{2}, x_{3}, \ldots, x_{k} to a vector x. That is, for the sequence to converge to x it is sufficient and necessary that


\lim _ {k \rightarrow \infty} \| \mathbf {x} _ {k} - \mathbf {x} \| = 0 \tag {2.149}

for any one of the vector norms. The order of convergence p, and in case p = 1, the rate of convergence c, are calculated in an analogous manner as in (2.142) but using norms; i.e., we have


\lim _ {k \rightarrow \infty} \frac {\left\| \mathbf {x} _ {k + 1} - \mathbf {x} \right\|}{\left\| \mathbf {x} _ {k} - \mathbf {x} \right\| ^ {p}} = c \tag {2.150}

Looking at the relationship between the vector norms, we note that they are equivalent in the sense that for any two norms \|\cdot\|_{s_{1}} and \|\cdot\|_{s_{2}} there exist two positive constants \alpha_{1} and \alpha_{2} such that


\left\| \mathbf {v} \right\| _ {s _ {1}} \leq \alpha_ {1} \left\| \mathbf {v} \right\| _ {s _ {2}} \tag {2.151}

and \| \mathbf{v}\|_{s_2}\leq \alpha_2\| \mathbf{v}\|_{s_1} (2.152)

where s_{1} and s_{2} denote the \infty -, 1-, or 2-norms. Hence it follows that


c _ {1} \left\| \mathbf {v} \right\| _ {s _ {1}} \leq \left\| \mathbf {v} \right\| _ {s _ {2}} \leq c _ {2} \left\| \mathbf {v} \right\| _ {s _ {1}} \tag {2.153}

where c_{1} and c_{2} are two positive constants that may depend on n , and of course also


\frac {1}{c _ {2}} \| \mathbf {v} \| _ {s _ {2}} \leq \| \mathbf {v} \| _ {s _ {1}} \leq \frac {1}{c _ {1}} \| \mathbf {v} \| _ {s _ {2}}

EXAMPLE 2.39: Give the constants c_{1} and c_{2} in (2.153) if, first, the norms s_{1} and s_{2} are the \infty - and 1-norms, and then, second, the \infty - and 2-norms. Then show that in each case (2.153) is satisfied using the vector


\mathbf {v} = \left[ \begin{array}{c} 1 \\ - 3 \\ 2 \end{array} \right]

In the first case we have


\| \mathbf {v} \| _ {\infty} \leq \| \mathbf {v} \| _ {1} \leq n \| \mathbf {v} \| _ {\infty} \tag {a}

with c_{1} = 1, c_{2} = n , and in the second case we have


\| \mathbf {v} \| _ {\infty} \leq \| \mathbf {v} \| _ {2} \leq \sqrt {n} \| \mathbf {v} \| _ {\infty} \tag {b}

with c_{1} = 1 and c_{2} = \sqrt{n} . These relations show that the 1- and 2-norms are equivalent to the \infty -norm. We can easily show that lower and upper bounds on \| \mathbf{v} \|_{1} in (a) and \| \mathbf{v} \|_{2} in (b) cannot be closer because the equal signs are reached for the vectors \mathbf{v}^{T} = [11\ldots 1] and \mathbf{v}^{T} = \mathbf{e}_{i} (and any scalar multiples thereof).

If we apply (a) and (b) to the given vector \mathbf{v} , we have


\begin{array}{l} \| \mathbf {v} \| _ {\infty} = 3 \\ \| \mathbf {v} \| _ {1} = 1 + 3 + 2 = 6 \\ \| \mathbf {v} \| _ {2} = \sqrt {1 + 9 + 4} = \sqrt {1 4} \\ \end{array}

and the relations in (a) and (b) read


3 \leq 6 \leq (3) (3); \quad 3 <   \sqrt {1 4} \leq (\sqrt {3}) (3)

In analogy with the definition of a vector norm, we also define a matrix norm.

Definition: A norm of a matrix \mathbf{A} of order n \times n , written as \| \mathbf{A} \| , is a single number. The norm is a function of the elements of \mathbf{A} , and the following relations hold:

  1. \| \mathbf{A}\| \geq 0 and \| \mathbf{A}\| = 0 if and only if \mathbf{A} = \mathbf{0} . (2.154)
  2. \| c\mathbf{A}\| = |c|\| \mathbf{A}\| for any scalar c . (2.155)
  3. \| \mathbf{A} + \mathbf{B}\| \leq \| \mathbf{A}\| +\| \mathbf{B}\| for matrices \mathbf{A} and \mathbf{B} . (2.156)
  4. \| \mathbf{A}\mathbf{B}\| \leq \| \mathbf{A}\|\| \mathbf{B}\| for matrices \mathbf{A} and \mathbf{B} . (2.157)

The relation in (2.156) is the triangle inequality equivalent to (2.145) . The additional condition in (2.157) , which was not postulated in the definition of a vector norm, must be satisfied in order to be able to use matrix norms when matrix products occur.

The following are frequently used matrix norms:


\| \mathbf {A} \| _ {\infty} = \max _ {i} \sum_ {j = 1} ^ {n} \left| a _ {i j} \right| \tag {2.158}

\| \mathbf {A} \| _ {1} = \max _ {j} \sum_ {i = 1} ^ {n} \left| a _ {i j} \right| \tag {2.159}

\| \mathbf {A} \| _ {2} = \sqrt {\tilde {\lambda} _ {n}}; \quad \tilde {\lambda} _ {n} = \text { maximum   eigenvalue   of   } \mathbf {A} ^ {T} \mathbf {A} \tag {2.160}

where for a symmetric matrix \mathbf{A} we have \| \mathbf{A}\|_{\infty} = \| \mathbf{A}\|_{1} and \| \mathbf{A}\|_{2} = \max_{i}|\lambda_{i}| (see Exercise 2.21). The norm \| \mathbf{A}\|_{2} is called the spectral norm of \mathbf{A} . Each of these norms satisfies

the relations in (2.154) to (2.157). The proof that the relation in (2.157) is satisfied for the infinity norm is given in Example 2.41.

EXAMPLE 2.40: Calculate the \infty -, 1-, and 2-norms of the matrix \mathbf{A} , where \mathbf{A} was given in Example 2.38.

The matrix A considered is


\mathbf {A} = \left[ \begin{array}{r r r} 5 & - 4 & - 7 \\ - 4 & 2 & - 4 \\ - 7 & - 4 & 5 \end{array} \right]

Using the definitions given in (2.158) to (2.160), we have


\begin{array}{l} \| \mathbf {A} \| _ {\infty} = 5 + 4 + 7 = 1 6 \\ \| \mathbf {A} \| _ {1} = 5 + 4 + 7 = 1 6 \\ \end{array}

The 2-norm is equal to |\lambda_3| , and hence (see Example 2.38) \| \mathbf{A} \|_2 = 12 .

EXAMPLE 2.41: Show that for two matrices A and B, we have


\| \mathbf {A B} \| _ {\infty} \leq \| \mathbf {A} \| _ {\infty} \| \mathbf {B} \| _ {\infty}

Using the definition of the infinity matrix norm in (2.158), we have


\| \mathbf {A B} \| _ {\infty} = \max _ {i} \sum_ {j = 1} ^ {n} \left| \sum_ {k = 1} ^ {n} a _ {i k} b _ {k j} \right|

\begin{array}{l} = \max _ {i} \sum_ {k = 1} ^ {n} \left\{\left| a _ {i k} \right| \sum_ {j = 1} ^ {n} \left| b _ {k j} \right| \right\} \\ \leq \left\{\max _ {i} \sum_ {k = 1} ^ {n} \left| a _ {i k} \right| \right\} \left\{\max _ {k} \sum_ {j = 1} ^ {n} \left| b _ {k j} \right| \right\} \\ \end{array}

but then \| \mathbf{AB}\|_{\infty}\leq \max_{i}\sum_{j = 1}^{n}\sum_{k = 1}^{n}\left|a_{ik}\right|\left|b_{kj}\right|

This proves the desired result.

As in the case of a sequence of vectors, we can now measure the convergence of a sequence of matrices A_{1}, A_{2}, A_{3}, \ldots, A_{k} to a matrix A. For convergence it is necessary and sufficient that


\lim _ {k \rightarrow \infty} \| \mathbf {A} _ {k} - \mathbf {A} \| = 0 \tag {2.161}

for any one of the given matrix norms.

In the definition of a matrix norm we needed relation (2.157) to be able to use norms when we encounter matrix products. Similarly, we also want to use norms when products of matrices with vectors occur. In such a case, in order to obtain useful information by applying norms, we need to employ only specific vector norms with specific matrix norms. Which matrix and vector norms should only be used together is determined by the condition that the following relation hold for any matrix A and vector v:


\| \mathbf {A} \mathbf {v} \| \leq \| \mathbf {A} \| \| \mathbf {v} \| \tag {2.162}

where \|Av\| and \|v\| are evaluated using the vector norm and \|A\| is evaluated using the matrix norm. We may note the close relationship to the condition (2.157), which was required to hold for a matrix norm. If (2.162) holds for a specific vector and matrix norm, the two norms are said to be compatible and the matrix norm is said to be subordinate to the vector norm. The 1-, 2-, and \infty -norms of a matrix, as defined previously, are subordinate, respectively, to the 1-, 2-, and \infty -norms of a vector given in (2.146) to (2.148). In the following example we give the proof that the \infty -norms are compatible and subordinate. The compatibility of the vector and matrix 1- and 2-norms is proved similarly.

EXAMPLE 2.42: Show that for a matrix A and vector v, we have


\left\| \mathbf {A} \mathbf {v} \right\| _ {\infty} \leq \left\| \mathbf {A} \right\| _ {\infty} \left\| \mathbf {v} \right\| _ {\infty} \tag {a}

Using the definitions of the infinity norms, we have


\begin{array}{l} \left\| \mathbf {A} \mathbf {v} \right\| _ {\infty} = \max _ {i} \left| \sum_ {j = 1} ^ {n} a _ {i j} v _ {j} \right| \\ \leq \max _ {i} \sum_ {j = 1} ^ {n} \left| a _ {i j} \right| \left| v _ {j} \right| \\ \leq \left\{\max _ {i} \sum_ {j = 1} ^ {n} \left| a _ {i j} \right| \right\} \left\{\max _ {j} \left| v _ {j} \right| \right\} \\ \end{array}

This proves (a).

To show that equality can be reached, we need only to consider the case where \mathbf{v} is a full unit vector and a_{ij} \geq 0 . In this case, \| \mathbf{v} \|_{\infty} = 1 and \| \mathbf{A} \mathbf{v} \|_{\infty} = \| \mathbf{A} \|_{\infty} .

In later chapters we shall encounter various applications of norms. One valuable application arises in the calculation of eigenvalues of a matrix: if we consider the problem Av = \lambda v , we obtain, taking norms on both sides,


\| \mathbf {A} \mathbf {v} \| = \| \lambda \mathbf {v} \| \tag {2.163}

and hence using (2.144) and (2.162), we have


\| \mathbf {A} \| \| \mathbf {v} \| \geq | \lambda | \| \mathbf {v} \| \tag {2.164}

or


\left| \lambda \right| \leq \left\| \mathbf {A} \right\| \tag {2.165}

Therefore, every eigenvalue of \mathbf{A} is in absolute magnitude smaller than or equal to any norm of \mathbf{A} . Defining the spectral radius \rho(\mathbf{A}) as ^3


\rho (\mathbf {A}) = \max _ {i} | \lambda_ {i} | \tag {2.166}

we have


\rho (\mathbf {A}) \leq \| \mathbf {A} \| \tag {2.167}

In practice, the \infty -norm of A is calculated most conveniently and thus used effectively to obtain an upper bound on the largest absolute value reached by the eigenvalues.

^{3} Note that for a symmetric matrix A we have \rho(\mathbf{A}) = \|\mathbf{A}\|_{2} , but this does not hold in general for a nonsymmetric matrix; consider, for example, A = \begin{bmatrix} 1 & \alpha \\ 0 & 1 \end{bmatrix} , \alpha \neq 0 .

EXAMPLE 2.43: Calculate the spectral radius of the matrix \mathbf{A} considered in Example 2.38. Then show that \rho(\mathbf{A}) \leq \|\mathbf{A}\|_{\infty} .

The spectral radius is equal to max |\lambda_i| . The eigenvalues of \mathbf{A} have been calculated in Example 2.38.


\lambda_ {1} = - 6; \quad \lambda_ {2} = 6; \quad \lambda_ {3} = 1 2

Hence \rho (\mathbf{A}) = 12

In Example 2.40 we calculated \| \mathbf{A}\|_{\infty} = 16 . Thus the relation \rho (\mathbf{A})\leq \| \mathbf{A}\|_{\infty} is satisfied.

Another important application of norms is encountered when considering the stability of finite element formulations (see Section 4.5). Assume that we have a sequence of finite element discretizations using a specific element and that a typical discretization gives the equation


\mathbf {A} \mathbf {x} = \mathbf {b} \tag {2.168}

Then, roughly speaking, for stability we want a small change in \mathbf{b} to result in only a small change in \mathbf{x} . To measure the magnitude of these changes, assume that we choose a norm \| \cdot \|_L for measuring the size of solutions and a norm \| \cdot \|_R for measuring the size of the right-hand side terms.

Definition: Let \mathbf{A} be a nonsingular matrix of size n \times n . We define the stability constant of \mathbf{A} with respect to the norms \| \cdot \|_L and \| \cdot \|_R as the smallest possible constant S_{LR} such that


\frac {\| \Delta \mathbf {x} \| _ {L}}{\| \mathbf {x} \| _ {L}} \leq S _ {L R} \frac {\| \Delta \mathbf {b} \| _ {R}}{\| \mathbf {b} \| _ {R}} \tag {2.169}

for all vectors \mathbf{x} and perturbations \Delta \mathbf{x} which satisfy \mathbf{A}\mathbf{x} = \mathbf{b} and \mathbf{A}\Delta \mathbf{x} = \Delta \mathbf{b} .

This relation bounds the relative change in the solution \mathbf{x} (in the norm \| \cdot \|_L ) as a consequence of a change in the forcing vector \mathbf{b} (in the norm \| \cdot \|_R ), and we say that a sequence of discretizations is stable with respect to the norms \| \cdot \|_L and \| \cdot \|_R if the constant S_{LR} is uniformly bounded irrespective of how large n is (see Section 4.5.2).

In accordance with (2.162), let ^{4}


\| \mathbf {A} \| _ {L R} = \sup _ {\mathbf {y}} \frac {\| \mathbf {A} \mathbf {y} \| _ {R}}{\| \mathbf {y} \| _ {L}} \tag {2.170}

and \| \mathbf{A}^{-1}\|_{RL} = \sup_{\mathbf{z}}\frac{\| \mathbf{A}^{-1}\mathbf{z}\|_L}{\| \mathbf{z}\|_R} (2.171)

Using \mathbf{y} = \mathbf{x} in (2.170), we obtain


\left\| \mathbf {A} \right\| _ {L R} \geq \frac {\left\| \mathbf {b} \right\| _ {R}}{\left\| \mathbf {x} \right\| _ {L}} \tag {2.172}

and using \mathbf{z} = \Delta \mathbf{b} in (2.171), we obtain


\left\| \mathbf {A} ^ {- 1} \right\| _ {R L} \geq \frac {\left\| \Delta \mathbf {x} \right\| _ {L}}{\left\| \Delta \mathbf {b} \right\| _ {R}} \tag {2.173}

^{4} In the following presentation “sup” means “supremum” and “inf” means “infimum” (see Table 4.5).

Therefore,


\frac {\left\| \Delta \mathbf {x} \right\| _ {L}}{\left\| \mathbf {x} \right\| _ {L}} \leq \left\| \mathbf {A} \right\| _ {L R} \left\| \mathbf {A} ^ {- 1} \right\| _ {R L} \frac {\left\| \Delta \mathbf {b} \right\| _ {R}}{\left\| \mathbf {b} \right\| _ {R}} \tag {2.174}

and hence


S _ {L R} = \left\| \mathbf {A} \right\| _ {L R} \left\| \mathbf {A} ^ {- 1} \right\| _ {R L} \tag {2.175}

In the evaluation of S_{LR} it is crucial to use appropriate norms, and given a norm \| \cdot \|_L a natural choice for the R norm is the dual norm of \| \cdot \|_L defined as


\| \mathbf {z} \| _ {D L} = \sup _ {\mathbf {y}} \frac {\mathbf {y} ^ {T} \mathbf {z}}{\| \mathbf {y} \| _ {L}} \tag {2.176}

With this choice we obtain for a symmetric matrix A (see Exercise 2.22)


\begin{array}{l} \| \mathbf {A} \| _ {L R} = \sup _ {\mathbf {x}, \mathbf {y}} \frac {\mathbf {x} ^ {T} \mathbf {A} \mathbf {y}}{\| \mathbf {x} \| _ {L} \| \mathbf {y} \| _ {L}} \tag {2.177} \\ = k _ {A} \\ \end{array}

and (\| \mathbf{A}^{-1}\|_{RL})^{-1} = \inf_{\mathbf{x}}\sup_{\mathbf{y}}\frac{\mathbf{x}^T\mathbf{A}\mathbf{y}}{\|\mathbf{x}\|_L\|\mathbf{y}\|_L} = \gamma_{A} (2.178)

The stability constant S_{LR} is then given by


S _ {L R} = \frac {k _ {A}}{\gamma_ {A}} \tag {2.179}

As we mentioned earlier, for stability of a discretization we need to show that S_{LR} in (2.179) remains bounded as the finite element mesh is refined. This is a rather general result. Our discussion in Section 4.5 is concerned with a particular form of A, namely, the form arising in our mixed displacement/pressure (u/p) formulations. In this case the stability condition leads to specific expressions that pertain specifically to the u/p formulations, and we give these expressions in Section 4.5.2.

2.8 EXERCISES

2.1. Evaluate the following required result in the most efficient way, that is, with the least number of multiplications. Count the number of multiplications used.

Let


\mathbf {A} = \left[ \begin{array}{l l l} 3 & 4 & 1 \\ 4 & 6 & 2 \\ 1 & 2 & 3 \end{array} \right]

\mathbf {B} ^ {T} = \left[ \begin{array}{l l l} 1 & 3 & 2 \end{array} \right]

k = 4

\mathbf {C} = \left[ \begin{array}{r r r} 4 & 1 & - 2 \\ 1 & 8 & - 1 \\ - 2 & - 1 & 6 \end{array} \right]

and calculate B^{T}AkCB .

2.2. (a) Evaluate A^{-1} when


\mathbf {A} = \left[ \begin{array}{c c} 3 & - 1 \\ - 1 & 2 \end{array} \right]

and when


\mathbf {A} = \left[ \begin{array}{l l l} 2 & 0 & 1 \\ 0 & 4 & 0 \\ 1 & 0 & 2 \end{array} \right]

(b) Evaluate the determinants of these two matrices.

2.3. Consider the following three vectors.


\mathbf {x} _ {1} = \left[ \begin{array}{c} 1 \\ 3 \\ 4 \\ - 1 \\ 2 \end{array} \right]; \quad \mathbf {x} _ {2} = \left[ \begin{array}{c} 4 \\ 1 \\ - 1 \\ 0 \\ 1 \end{array} \right]; \quad \mathbf {x} _ {3} = \left[ \begin{array}{c} - 7 \\ 1 \\ 6 \\ - 1 \\ - 1 \end{array} \right]

Use these vectors as the columns of a matrix A and determine the rank and kernel of A.

2.4. Consider the following matrix A. Determine the constant k such that the rank of A is 2 and then determine the kernel of A.


\mathbf {A} = \left[ \begin{array}{c c c} 1 & - 1 & 0 \\ - 1 & 1 + k & - 1 \\ 0 & - 1 & 1 \end{array} \right]

2.5. Consider the following two vectors defined in the three-dimensional Cartesian frame with basis vectors \mathbf{e}_i .


\mathbf {u} = \left[ \begin{array}{l} 2 \\ 3 \\ 4 \end{array} \right]; \quad \mathbf {v} = \left[ \begin{array}{l} 1 \\ 2 \\ 3 \end{array} \right]

(a) Evaluate the angle between these vectors.
(b) Assume that a new basis is to be used, namely, the primed basis in Example 2.24. Evaluate the components of the two vectors in this basis.
(c) Evaluate the angle between the vectors in this new basis.

2.6. A reflection matrix is defined as \mathbf{P} = \mathbf{I} - \alpha \mathbf{v}\mathbf{v}^T , \alpha = \frac{2}{\mathbf{v}^T\mathbf{v}} where \mathbf{v} is a vector (of order n ) normal to the plane of reflection.

(a) Show that \mathbf{P} is an orthogonal matrix.
(b) Consider the vector \mathbf{Pu} where \mathbf{u} is also a vector of order n . Show that the action of \mathbf{P} on \mathbf{u} is that the component of \mathbf{u} normal to the plane of reflection has its direction reversed and the component of \mathbf{u} in the plane of reflection is not changed.

2.7. The components of the stress tensor in the x_{1}, x_{2} coordinate system of Fig. E2.25 are at a point


\boldsymbol {\tau} = \left[ \begin{array}{c c} 1 0 & - 6 \\ - 6 & 2 0 \end{array} \right]

(a) Establish a new basis in which the off-diagonal components are zero, and give the new diagonal components.