29 KiB
4.3.3 The Monotonically Convergent Finite Element Solution: A Ritz Solution
We observed earlier that the application of the principle of virtual work is identical to using the stationarity condition of the total potential of the system (see Example 4.4). Considering also the discussion of the Ritz method in Section 3.3.3, we can conclude that monotonically convergent displacement-based finite element solutions are really only applications of this method. In the finite element analysis the Ritz functions are contained in the element displacement interpolation matrices \mathbf{H}^{(m)} , m = 1, 2, \ldots , and the Ritz parameters are the unknown nodal point displacements stored in U. As we discuss further below, the mathematical conditions on the displacement interpolation functions in the matrices \mathbf{H}^{(m)} , in order that the finite element solution be a Ritz analysis, are exactly those that we identified earlier using physical reasoning. The correspondence between the analysis methods is illustrated in Examples 3.22 and 4.5.
Considering the Ritz method of analysis with the finite element interpolations, we have
\boldsymbol {\Pi} = \frac {1}{2} \mathbf {U} ^ {T} \mathbf {K} \mathbf {U} - \mathbf {U} ^ {T} \mathbf {R} \tag {4.68}
where \Pi is the total potential of the system. Invoking the stationarity of \Pi with respect to the Ritz parameters U_{i} stored in U and recognizing that the matrix K is symmetric, we obtain
\mathbf {K U} = \mathbf {R} \tag {4.69}
The solution of (4.69) yields the Ritz parameters, and then the displacement solution in the domain considered is
\mathbf {u} ^ {(m)} = \mathbf {H} ^ {(m)} \mathbf {U}; \quad m = 1, 2, \dots \tag {4.70}
The relations in (4.68) to (4.70) represent a Ritz analysis provided the functions used satisfy certain conditions. We defined in Section 3.3.2 a C^{m-1} variational problem as one in which the variational indicator of the problem contains derivatives of order m and lower. We then noted that for convergence the Ritz functions must satisfy the essential (or geometric) boundary conditions of the problem involving derivatives up to order (m - 1) , but that the functions do not need to satisfy the natural (or force) boundary conditions involving derivatives of order m to (2m - 1) because these conditions are implicitly contained in the variational indicator \Pi . Therefore, in order for a finite element solution to be a Ritz analysis, the essential boundary conditions must be completely satisfied by the finite element nodal point displacements and the displacement interpolations between the nodal points. However, in selecting the finite element displacement functions, no special attention need be given to the natural boundary conditions because these conditions are imposed with the load vector and are satisfied approximately in the Ritz solution. The accuracy with which the natural or force boundary conditions are satisfied depends on the specific Ritz functions employed, but this accuracy can always be increased by using a larger number of functions, i.e., a larger number of finite elements to model the problem.
In the classical Ritz analysis the Ritz functions extend over the complete domain considered, whereas in the finite element analysis the individual Ritz functions extend only over subdomains (finite elements) of the complete region. Hence, there must be a question as to what conditions must be fulfilled by the finite element interpolations with regard to
continuity requirements between adjacent subdomains. To answer this question we consider the integrations that must be performed to evaluate the coefficient matrix K. We recognize that in considering a C^{m-1} problem we need continuity in at least the (m - 1) st derivatives of the Ritz trial functions in order that we can perform the integrations across the element boundaries. However, this continuity requirement corresponds entirely to the element compatibility conditions that we discussed in Section 4.3.2. For example, in the analysis of fully three-dimensional problems only the displacements between elements must be continuous, whereas in the analysis of plate problems formulated using the Kirchhoff plate theory we also need continuity in the first derivatives of the displacement functions.
In summary, therefore, for a C^{m-1} problem [C^{m-1} \equiv continuity on trial functions and their derivatives up to order \( (m - 1)] , in the classical Ritz analysis the trial functions are selected to satisfy exactly all boundary conditions that involve derivatives up to order (m - 1) . The same holds in finite element analysis, but in addition, continuity in the trial functions and their derivatives up to order (m - 1) must be satisfied between elements in order for the finite element solution to correspond to a Ritz analysis.
Although the classical Ritz analysis procedure and the displacement-based finite element method are theoretically identical, in practice, the finite element method has important advantages over a conventional Ritz analysis. One disadvantage of the conventional Ritz analysis is that the Ritz functions are defined over the whole region considered. For example, in the analysis of the cantilever in Example 3.24, the Ritz functions spanned from x = 0 to x = L. Therefore, in the conventional Ritz analysis, the matrix K is a full matrix, and as pointed out in Section 8.2.3, the numerical operations required for solution of the resulting algebraic equations are considerable if many functions are used.
A particular difficulty in a conventional Ritz analysis is the selection of appropriate Ritz functions since the solution is a linear combination of these functions. In order to solve accurately for large displacement or stress gradients, many functions may be needed. However, these functions also unnecessarily extend over the regions in which the displacements and stresses vary rather slowly and where not many functions are needed.
Another difficulty arises in the conventional Ritz analysis when the total region of interest is made up of subregions with different kinds of strain distributions. As an example, consider a plate that is supported by edge beams and columns. In such a case, the Ritz functions used for one region (e.g., the plate) are not appropriate for the other regions (i.e., the edge beams and columns), and special displacement continuity conditions or boundary relations must be introduced.
The few reasons given already show that the conventional Ritz analysis is, in general, not particularly computer-oriented, except in some cases for the development of special-purpose programs. On the other hand, the finite element method has to a large extent removed the practical difficulties while retaining the advantageous properties of the conventional Ritz method. With regard to the difficulties mentioned above, the selection of Ritz functions is handled by using an adequate element library in the computer program. The use of relatively many functions in regions of high stress and displacement gradients is possible simply by using many elements, and the combination of domains with different kinds of strain distributions is possible by using different kinds of elements to idealize the domains. It is this generality of the finite element method, and the good mathematical foundation, that have made the finite element method the very widely used analysis tool in today's engineering environments.
4.3.4 Properties of the Finite Element Solution
Let us consider the general linear elasticity problem and its finite element solution and identify certain properties that are useful for an understanding of the finite element method. We shall use the notation summarized in Table 4.5.
The elasticity problem can be written as follows (see, for example, G. Strang and G. J. Fix [A], P. G. Ciarlet [A], or F. Brezzi and M. Fortin [A]).
Find \mathbf{u} \in V such that
a (\mathbf {u}, \mathbf {v}) = (\mathbf {f}, \mathbf {v}) \quad \forall \mathbf {v} \in V \tag {4.71}
where the space V is defined as
V = \left\{\mathbf {v} \mid \mathbf {v} \in L ^ {2} (\mathrm{Vol}); \frac {\partial v _ {i}}{\partial x _ {j}} \in L ^ {2} (\mathrm{Vol}), i, j = 1, 2, 3; v _ {i} | _ {s _ {u}} = 0, i = 1, 2, 3 \right\} \tag {4.72}
Here L^{2}(\text{Vol}) is the space of square integrable functions in the volume, “Vol”, of the body being considered,
L ^ {2} (\text { Vol }) = \left\{\mathbf {w} \mid \mathbf {w} \text { is defined in Vol and } \int_ {\text { Vol }} \left(\sum_ {i = 1} ^ {3} (w _ {i}) ^ {2}\right) d \text { Vol } = \left\| \mathbf {w} \right\| _ {L ^ {2} (\text { Vol })} ^ {2} < + \infty \right\} \tag {4.73}
TABLE 4.5 Notation used in discussion of the properties and convergence of finite element solutions
| Symbol | Meaning |
| $a(.,.)$ | Bilinear form corresponding to model problem being considered (see Example 4.22) |
| $\mathbf{f}$ | Load vector |
| $\mathbf{u}$ | Exact displacement solution to mathematical model; an element of the space $V$ |
| $\mathbf{v}$ | Displacements; an element of the space $V$ |
| $\mathbf{u}_{h}$ | Finite element solution, an element of the space $V_{h}$ |
| $\mathbf{v}_{h}$ | Finite element displacements; an element of the space $V_{h}$ |
| $\forall$ | For all |
| $\in$ | An element of |
| $V, V_{h}$ | Spaces of functions [see (4.72) and (4.84)] |
| Vol | Volume of body considered |
| $L^{2}$ | Space of a square integrable functions [see (4.73)] |
| $\mathbf{e}_{h}$ | Error between exact and finite element solution, $\mathbf{e}_{h} = \mathbf{u} - \mathbf{u}_{h}$ |
| $\exists$ | There exists |
| $\subset$ | Contained in |
| $\subsetneq$ | Contained in but not equal to |
| $\parallel \parallel_{E}$ | Energy norm [see (4.74)] |
| inf | We take the infimum. |
| sup | We take the supremum. |
Hence, (4.72) defines a space of functions corresponding to a general three-dimensional analysis. The functions in the space vanish on the boundary S_{u} , and the squares of the functions and of their first derivatives are integrable. Corresponding to V, we use the energy norm
\| \mathbf {v} \| _ {E} ^ {2} = a (\mathbf {v}, \mathbf {v}) \tag {4.74}
which actually corresponds to twice the strain energy stored in the body when the body is subjected to the displacement field v.
We assume in our discussion that the structure considered in (4.71) is properly supported, corresponding to the zero displacement conditions on S_{u} , so that \| \mathbf{v} \|_E^2 is greater than zero for any \mathbf{v} different from zero.
In addition, we shall also use the Sobolev norms of order m = 0 and m = 1 defined as
m = 0;
\left(\| \mathbf {v} \| _ {0}\right) ^ {2} = \int_ {\operatorname{Vol}} \left(\sum_ {i = 1} ^ {3} \left(v _ {i}\right) ^ {2}\right) d \operatorname{Vol} \tag {4.75}
m = 1:
\left(\left\| \mathbf {v} \right\| _ {1}\right) ^ {2} = \left(\left\| \mathbf {v} \right\| _ {0}\right) ^ {2} + \int_ {\mathrm{Vol}} \left[ \sum_ {i = 1, j = 1} ^ {3} \left(\frac {\partial v _ {i}}{\partial x _ {j}}\right) ^ {2} \right] d \mathrm{Vol} \tag {4.76}
For our elasticity problem the norm of order 1 is used, ^{11} and we have the following two important properties for our bilinear form a.
Continuity:
\exists M > 0 \text { such that } \forall \mathbf {v} _ {1}, \mathbf {v} _ {2} \in V, \quad | a (\mathbf {v} _ {1}, \mathbf {v} _ {2}) | \leq M \| \mathbf {v} _ {1} \| _ {1} \| \mathbf {v} _ {2} \| _ {1} \tag {4.77}
Ellipticity:
\exists \alpha > 0 \text { such that } \forall \mathbf {v} \in V, \quad a (\mathbf {v}, \mathbf {v}) \geq \alpha \| \mathbf {v} \| _ {1} ^ {2} \tag {4.78}
where the constants \alpha and M depend on the actual elasticity problem being considered, including the material constants used, but are independent of v.
^{11} In our discussion, we shall also use the Poincaré-Friedrichs inequality, namely, that for the analysis problems we consider, for any v we have
\int_ {\mathrm{Vol}} \left(\sum_ {i = 1} ^ {3} (v _ {i}) ^ {2}\right) d \mathrm{Vol} \leq c \int_ {\mathrm{Vol}} \left(\sum_ {i, j = 1} ^ {3} \left(\frac {\partial v _ {i}}{\partial x _ {j}}\right) ^ {2}\right) d \mathrm{Vol}
where c is a constant (see, for example, P. G. Ciarlet [A]).
The continuity property is satisfied because reasonable norms are used in (4.77) , and the ellipticity property is satisfied because a properly supported (i.e., stable) structure is being considered (see P. G. Ciarlet [A] for a mathematical proof). Based on these properties we have
c _ {1} \left\| \mathbf {v} \right\| _ {1} \leq (a (\mathbf {v}, \mathbf {v})) ^ {1 / 2} \leq c _ {2} \left\| \mathbf {v} \right\| _ {1} \tag {4.79}
where c_{1} and c_{2} are constants independent of v, and we therefore have that the energy norm is equivalent to the 1-norm (see Section 2.7). In mathematical analysis the Sobolev norms are commonly used to measure rates of convergence (see Section 4.3.5), but in practice the energy norm is frequently more easily evaluated [see (4.97)]. Because of (4.79), we can say that convergence can also be defined, instead of using (4.64), as
\| \mathbf {u} - \mathbf {u} _ {h} \| _ {1} \rightarrow 0 \quad \text { as } h \rightarrow 0
and the energy norm in problem solutions will converge with the same order as the 1-norm. We examine the continuity and ellipticity of a bilinear form a in the following example.
EXAMPLE 4.24: Consider the problem in Example 4.22. Show that the bilinear form a satisfies the continuity and ellipticity conditions.
Continuity follows because ^{12}
\begin{array}{l} a (w _ {1}, w _ {2}) = \int_ {A} T \left(\frac {\partial w _ {1}}{\partial x} \frac {\partial w _ {2}}{\partial x} + \frac {\partial w _ {1}}{\partial y} \frac {\partial w _ {2}}{\partial y}\right) d x d y \\ \leq \int_ {A} T \left[ \left(\frac {\partial w _ {1}}{\partial x}\right) ^ {2} + \left(\frac {\partial w _ {1}}{\partial y}\right) ^ {2} \right] ^ {1 / 2} \left[ \left(\frac {\partial w _ {2}}{\partial x}\right) ^ {2} + \left(\frac {\partial w _ {2}}{\partial y}\right) ^ {2} \right] ^ {1 / 2} d x d y \\ \leq \left\{\int_ {A} T \left[ \left(\frac {\partial w _ {1}}{\partial x}\right) ^ {2} + \left(\frac {\partial w _ {1}}{\partial y}\right) ^ {2} \right] d x d y \right\} ^ {1 / 2} \\ \times \left\{\int_ {A} T \left[ \left(\frac {\partial w _ {2}}{\partial x}\right) ^ {2} + \left(\frac {\partial w _ {2}}{\partial y}\right) ^ {2} \right] d x d y \right\} ^ {1 / 2} \leq c \| w _ {1} \| _ {1} \| w _ {2} \| _ {1} \\ \end{array}
Ellipticity requires that
\begin{array}{l} a (w, w) = \int_ {A} T \left[ \left(\frac {\partial w}{\partial x}\right) ^ {2} + \left(\frac {\partial w}{\partial y}\right) ^ {2} \right] d x d y \tag {a} \\ \geq \alpha \int_ {A} \left[ w ^ {2} + \left(\frac {\partial w}{\partial x}\right) ^ {2} + \left(\frac {\partial w}{\partial y}\right) ^ {2} \right] d x d y = \alpha \| w \| _ {1} ^ {2} \\ \end{array}
However, the Poincaré-Friedrichs inequality,
\int_ {A} w ^ {2} d x d y \leq c \int_ {A} \left[ \left(\frac {\partial w}{\partial x}\right) ^ {2} + \left(\frac {\partial w}{\partial y}\right) ^ {2} \right] d x d y
where c is a constant, ensures that (a) is satisfied.
The above statements on the elasticity problem encompass one important point already mentioned earlier: the exact solution to the problem must correspond to a finite strain energy, see (4.64) and (4.79). Therefore, for example,—strictly—we do not endeavor to solve general two- or three-dimensional elasticity problems with the mathematical idealization of point loads (the solution for a point load on a half space corresponds to infinite strain energy, see for instance S. Timoshenko and J. N. Goodier [A]). Instead, we represent the loads in the elasticity problem closer to how they actually act in nature, namely as smoothly distributed loads, which however can have high magnitudes and act over very small areas. Then the solution of the variational formulation in (4.71) is the same as the solution of the differential formulation. Of course, in our finite element analysis so long as the finite elements are much larger than the area of load application, we can replace the distributed load over the area with an equivalent point load, merely for efficiency of solution; see Section 1.2 and the example in Fig. 1.4.
An important observation is that the exact solution to our elasticity problem is unique. Namely, assume that u_{1} and u_{2} are two different solutions; then we would have
a \left(\mathbf {u} _ {1}, \mathbf {v}\right) = (\mathbf {f}, \mathbf {v}) \quad \forall \mathbf {v} \in V \tag {4.80}
and a(\mathbf{u}_2,\mathbf{v}) = (\mathbf{f},\mathbf{v})\quad \forall \mathbf{v}\in V (4.81)
Subtracting, we obtain
a \left(\mathbf {u} _ {1} - \mathbf {u} _ {2}, \mathbf {v}\right) = 0 \quad \forall \mathbf {v} \in V \tag {4.82}
and taking \mathbf{v} = \mathbf{u}_1 - \mathbf{u}_2 , we have a(\mathbf{u}_1 - \mathbf{u}_2, \mathbf{u}_1 - \mathbf{u}_2) = 0 . Using (4.79) with \mathbf{v} = \mathbf{u}_1 - \mathbf{u}_2 , we obtain \| \mathbf{u}_1 - \mathbf{u}_2 \|_1 = 0 , which means \mathbf{u}_1 \equiv \mathbf{u}_2 , and hence we have proven that our assumption of two different solutions is untenable.
Now let V_{h} be the space of finite element displacement functions (which correspond to the displacement interpolations contained in all element displacement interpolation matrices \mathbf{H}^{(m)} ) and let v_{h} be any element in that space (i.e., any displacement pattern that can be obtained by the displacement interpolations). Let u_{h} be the finite element solution; hence u_{h} is also an element in V_{h} and the specific element that we seek. Then the finite element solution of the problem in (4.71) can be written as
Find \mathbf{u}_h\in V_h such that
a \left(\mathbf {u} _ {h}, \mathbf {v} _ {h}\right) = \left(\mathbf {f}, \mathbf {v} _ {h}\right) \quad \forall \mathbf {v} _ {h} \in V _ {h} \tag {4.83}
The space V_{h} is defined as
V _ {h} = \left\{\mathbf {v} _ {h} \mid \mathbf {v} _ {h} \in L ^ {2} (\text {Vol}); \frac {\partial \left(v _ {h}\right) _ {i}}{\partial x _ {j}} \in L ^ {2} (\text {Vol}), i, j = 1, 2, 3; \left(v _ {h}\right) _ {i} \mid_ {s _ {u}} = 0, i = 1, 2, 3 \right\} \tag {4.84}
and for the elements of this space we use the energy norm (4.74) and the Sobolev norm (4.76). Of course, V_{h} \subset V .
The relation in (4.83) is the principle of virtual work for the finite element discretization corresponding to V_{h} . With this solution space, the continuity and ellipticity conditions (4.77) and (4.78) are satisfied, using v_{h} \in V_{h} , and a positive definite stiffness matrix is obtained for any V_{h} .
We should note that V_{h} corresponds to a given mesh, where h denotes the generic element size, and that in the discussion of convergence we of course consider a sequence of spaces V_{h} (a sequence of meshes with decreasing h). We illustrate in Figure 4.11 the elements of V_{h} for the discretization dealt with in Example 4.6.

Figure 4.11 Aerial view of basis functions for space V_{h} used in analysis of cantilever plate of Example 4.6. The displacement functions are plotted upwards for ease of display, but each function shown is applicable to the u and v displacements. An element of V_{h} is any linear combination of the 12 displacement functions. Note that the functions correspond to the element displacement interpolation matrices \mathbf{H}^{(m)} , discussed in Example 4.6, and that the displacements at nodes 1, 2 and 3 are zero.
Considering the finite element solution u_{h} and the exact solution u to the problem, we have the following important properties.
Property 1. Let the error between the exact solution \mathbf{u} and the finite element solution \mathbf{u}_h be \mathbf{e}_h ,
\mathbf {e} _ {h} = \mathbf {u} - \mathbf {u} _ {h} \tag {4.85}
Then the first property is
\boxed {a (\mathbf {e} _ {h}, \mathbf {v} _ {h}) = 0 \quad \forall \mathbf {v} _ {h} \in V _ {h}} \tag {4.86}
The proof is obtained by realizing that the principle of virtual work gives
a (\mathbf {u}, \mathbf {v} _ {h}) = (\mathbf {f}, \mathbf {v} _ {h}) \quad \forall \mathbf {v} _ {h} \in V _ {h} \tag {4.87}
and a(\mathbf{u}_h,\mathbf{v}_h) = (\mathbf{f},\mathbf{v}_h) \forall \mathbf{v}_h\in V_h (4.88)
so that by subtraction we obtain (4.86). We may say that the error is “orthogonal in a(. , .) ” to all v_{h} in V_{h} . Clearly, as the space V_{h} increases, with the larger space always containing the smaller space, the solution accuracy will increase continuously. The next two properties are based on Property 1.
Property 2. The second property is
a (\mathbf {u} _ {h}, \mathbf {u} _ {h}) \leq a (\mathbf {u}, \mathbf {u}) \tag {4.89}
We prove this property by considering
\begin{array}{l} a (\mathbf {u}, \mathbf {u}) = a (\mathbf {u} _ {h} + \mathbf {e} _ {h}, \mathbf {u} _ {h} + \mathbf {e} _ {h}) \\ = a \left(\mathbf {u} _ {h}, \mathbf {u} _ {h}\right) + 2 a \left(\mathbf {u} _ {h}, \mathbf {e} _ {h}\right) + a \left(\mathbf {e} _ {h}, \mathbf {e} _ {h}\right) \tag {4.90} \\ = a \left(\mathbf {u} _ {h}, \mathbf {u} _ {h}\right) + a \left(\mathbf {e} _ {h}, \mathbf {e} _ {h}\right) \\ \end{array}
where we have used (4.86) with \mathbf{v}_h = \mathbf{u}_h . The relation (4.89) follows because a(\mathbf{e}_h, \mathbf{e}_h) > 0 for any \mathbf{e}_h \neq \mathbf{0} (since for the properly supported structure \| \mathbf{v} \|_E > 0 for any nonzero \mathbf{v} ).
Hence, the strain energy corresponding to the finite element solution is always smaller than or equal to the strain energy corresponding to the exact solution.
Property 3. The third property is
a \left(\mathbf {e} _ {h}, \mathbf {e} _ {h}\right) \leq a \left(\mathbf {u} - \mathbf {v} _ {h}, \mathbf {u} - \mathbf {v} _ {h}\right) \quad \forall \mathbf {v} _ {h} \in V _ {h} \tag {4.91}
For the proof we use that for any w_{h} in V_{h} , we have
a (\mathbf {e} _ {h} + \mathbf {w} _ {h}, \mathbf {e} _ {h} + \mathbf {w} _ {h}) = a (\mathbf {e} _ {h}, \mathbf {e} _ {h}) + a (\mathbf {w} _ {h}, \mathbf {w} _ {h}) \tag {4.92}
Hence, a(\mathbf{e}_h,\mathbf{e}_h)\leq a(\mathbf{e}_h + \mathbf{w}_h,\mathbf{e}_h + \mathbf{w}_h) (4.93)
Choosing \mathbf{w}_h = \mathbf{u}_h - \mathbf{v}_h gives (4.91).
This third property says that the finite element solution u_{h} is chosen from all the possible displacement patterns v_{h} in V_{h} such that the strain energy corresponding to u - u_{h} is the minimum. Hence, in that sense, the “energy distance” between u and the elements in V_{h} is minimized by the solution u_{h} in V_{h} .
Using (4.91) and the ellipticity and continuity of the bilinear form, we further obtain
\begin{array}{l} \alpha \left\| \mathbf {u} - \mathbf {u} _ {h} \right\| _ {1} ^ {2} \leq a (\mathbf {u} - \mathbf {u} _ {h}, \mathbf {u} - \mathbf {u} _ {h}) \\ = \inf _ {\mathbf {v} _ {h} \in V _ {h}} a (\mathbf {u} - \mathbf {v} _ {h}, \mathbf {u} - \mathbf {v} _ {h}) \tag {4.94} \\ \leq M \inf _ {\mathbf {v} _ {h} \in V _ {h}} \| \mathbf {u} - \mathbf {v} _ {h} \| _ {1} \| \mathbf {u} - \mathbf {v} _ {h} \| _ {1} \\ \end{array}
where “inf” denotes the infimum (see Table 4.5). If we let d(\mathbf{u}, V_{h}) = \inf_{\mathbf{v}_{h} \in V_{h}} \|\mathbf{u} - \mathbf{v}_{h}\|_{1} , we recognize that we have the property
\left\| \mathbf {u} - \mathbf {u} _ {h} \right\| _ {1} \leq c d (\mathbf {u}, V _ {h}) \tag {4.95}
where c is a constant, c = \sqrt{M / \alpha} , independent of h but dependent on the material properties. ^{13} This result is referred to as Cea's lemma (see, for example, P. G. Ciarlet [A]).
The above three properties give valuable insight into how the finite element solution is chosen from the displacement patterns possible within a given finite element mesh and what we can expect as the mesh is refined.
We note, in particular, that (4.95), which is based on Property 3, states that a sufficient condition for convergence with our sequence of finite element spaces is that for any \mathbf{u} \in V we have \lim_{h \to 0} \inf \| \mathbf{u} - \mathbf{v}_h \|_1 = 0 . Also, (4.95) can be used to measure the rate of convergence as the mesh is refined by introducing an upper bound on how d(\mathbf{u}, V_h) changes with the mesh refinement (see Section 4.3.5).
Further, Properties 2 and 3 say that at the finite element solution the error in strain energy is minimized within the possible displacement patterns of a given mesh and that the strain energy corresponding to the finite element solution will approach the exact strain energy (from below) as increasingly finer meshes are used (with the displacement patterns of the finer mesh containing the displacement patterns of the previous coarser mesh).
We can also relate these statements to earlier observations that in a finite element solution the stationarity of the total potential is established (see Section 4.3.2). That is, for a given mesh and any nodal displacements U_{any} , we have
\Pi \left| \mathrm{U} _ {\text {any}} = \frac {1}{2} \mathrm{U} _ {\text {any}} ^ {T} \mathrm{K} \mathrm{U} _ {\text {any}} - \mathrm{U} _ {\text {any}} ^ {T} \mathrm{R} \right. \tag {4.96}
The finite element solution U is obtained by invoking the stationarity of \Pi to obtain
\mathbf {K U} = \mathbf {R}
At the finite element displacement solution U we have the total potential \Pi and strain energy U
\boldsymbol {\Pi} = - \frac {1}{2} \mathbf {U} ^ {T} \mathbf {R}; \quad \mathcal {U} = \frac {1}{2} \mathbf {U} ^ {T} \mathbf {R} \tag {4.97}
Therefore, to evaluate the strain energy corresponding to the finite element solution, we only need to perform a vector multiplication.
To show with this notation that within the given possible finite element displacements (i.e., within the space V_{h} ) \Pi is minimized at the finite element solution U, let us calculate \Pi at U + \epsilon , where \epsilon is any arbitrary vector,
\begin{array}{l} \Pi \left| _ {U + \epsilon} = \frac {1}{2} (U + \epsilon) ^ {T} K (U + \epsilon) - (U + \epsilon) ^ {T} R \right. \\ = \Pi \left| _ {\mathrm{U}} + \epsilon^ {T} (\mathbf {K} \mathbf {U} - \mathbf {R}) + \frac {1}{2} \epsilon^ {T} \mathbf {K} \epsilon \right. \tag {4.98} \\ = \Pi | _ {U} + \frac {1}{2} \epsilon^ {T} K \epsilon \\ \end{array}
where we used that \mathbf{KU} = \mathbf{R} and the fact that \mathbf{K} is a symmetric matrix. However, since \mathbf{K} is positive definite, \Pi \mid_{\mathbf{U}} is the minimum of \Pi for the given finite element mesh. As the mesh is refined, \Pi will decrease and according to (4.97) {}^{\mathfrak{U}} will correspondingly increase.
Considering (4.89), (4.91), and (4.97), we observe that in the finite element solution the displacements are (on the “whole”) underestimated and hence the stiffness of the mathematical model is (on the “whole”) overestimated. This overestimation of the stiffness is (physically) a result of the “internal displacement constraints” that are implicitly imposed on the solution as a result of the displacement assumptions. As the finite element discretization is refined, these “internal displacement constraints” are reduced, and convergence to the exact solution (and stiffness) of the mathematical model is obtained.
To exemplify the preceding discussion, Figure 4.12 shows the results obtained in the analysis of an ad hoc test problem for two-dimensional finite element discretizations. The problem is constructed so as to have no singularities. As we discuss in the next section, in this case the full (maximum) order of convergence is obtained with a given finite element in a sequence of uniform finite element meshes (in each mesh all elements are of equal square size).
Figure 4.12 shows the convergence in strain energy when a sequence of uniform meshes of nine-node elements is employed for the solutions. The meshes are constructed by starting with a 2 \times 2 mesh of square elements of unit side length (for which h = 1), then subdividing each element into four equal square elements (for which h = \frac{1}{2} ,) to obtain the second mesh, and continuing this process. We clearly see that the error in the strain energy decreases with decreasing element size h, as we would expect according to (4.91). We compare the order of convergence seen in the finite element computations with a theoretically established value in the next section.