Layton W., Sussman M. Numerical Linear Algebra 2020
World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI • TOKYO
Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance
Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy
is not required from the publisher.
Printed in Singapore
Preface
This book presents numerical linear algebra for students from a diverse
audience of senior level undergraduates and beginning graduate students in
mathematics, science and engineering. Typical courses it serves include:
A one term, senior level class on Numerical Linear Algebra.
Typically, some students in the class will be good programmers but have
never taken a theoretical linear algebra course; some may have had many
courses in theoretical linear algebra but cannot find the on/off switch on a
computer; some have been using methods of numerical linear algebra for a
while but have never seen any of its background and want to understand
why methods fail sometimes and work sometimes.
Part of a graduate “gateway” course on numerical methods.
This course gives an overview in two terms of useful methods in compu-
tational mathematics and includes a computer lab teaching programming
and visualization connected to the methods.
Part of a one term course on the theory of iterative meth-
ods. This class is normally taken by students in mathematics who want to
study numerical analysis further or to see deeper aspects of multivariable
advanced calculus, linear algebra and matrix theory as they meet applica-
tions.
This wide but highly motivated audience presents an interesting chal-
lenge. In response, the material is developed as follows: Every topic in
Contents

Preface  v

1. Introduction  1
   1.1 Sources of Arithmetical Error  4
   1.2 Measuring Errors: The Trademarked Quantities  9

3. Gaussian Elimination  33
   3.1 Elimination + Backsubstitution  33
   3.2 Algorithms and Pseudocode  36
   3.3 The Gaussian Elimination Algorithm  38
       3.3.1 Computational Complexity and Gaussian Elimination  41
   3.4 Pivoting Strategies  43
   3.5 Tridiagonal and Banded Matrices  48
   3.6 The LU Decomposition  52

Bibliography  259
Index  261
Chapter 1
Introduction
The origin of numerical linear algebra lies in a 1947 paper of von Neu-
mann and Goldstine [von Neumann and Goldstine (1947)]. Its table of
contents, given below, is quite modern in all respects except for the omis-
sion of iterative methods:
PREFACE
CHAPTER I. The sources of errors in a computation
1.1. The sources of errors.
(A) Approximations implied by the mathematical model.
(B) Errors in observational data.
(C) Finitistic approximations to transcendental and implicit
mathematical formulations.
(D) Errors of computing instruments in carrying out elemen-
tary operations: “Noise.” Round off errors. “Analogy”
and digital computing. The pseudo-operations.
1.2. Discussion and interpretation of the errors (A)–(D). Stability.
1.3. Analysis of stability. The results of Courant, Friedrichs, and
Lewy.
1.4. Analysis of “noise” and round off errors and their relation to
high speed computing.
1.5. The purpose of this paper. Reasons for the selection of its
problem.
1.6. Factors which influence the errors (A)–(D). Selection of the
elimination method.
1.7. Comparison between “analogy” and digital computing meth-
ods.
CHAPTER II. Round off errors and ordinary algebraical processes.
2.1. Digital numbers, pseudo-operations. Conventions regarding
their nature, size and use: (a), (b).
2.2. Ordinary real numbers, true operations. Precision of data.
Conventions regarding these: (c), (d).
2.3. Estimates concerning the round off errors:
6.6. Continuation.
6.7. Continuation.
6.8. Continuation. The estimates connected with the inverse of A.
6.9. The general AI . Various estimates.
6.10. Continuation.
6.11. Continuation. The estimates connected with the inverse of
AI .
CHAPTER VII. Evaluation of the results.
7.1. Need for a concluding analysis and evaluation.
7.2. Restatement of the conditions affecting A and AI : (A) − (D).
7.3. Discussion of (A), (B): Scaling of A and AI .
7.4. Discussion of (C): Approximate inverse, approximate singu-
larity.
7.5. Discussion of (D): Approximate definiteness.
7.6. Restatement of the computational prescriptions. Digital char-
acter of all numbers that have to be formed.
7.7. Number of arithmetical operations involved.
7.8. Numerical estimates of precision.
Errors using inadequate data are much less than those us-
ing no data at all.
— Babbage, Charles (1792–1871)
On two occasions I have been asked [by members of Par-
liament], ‘Pray, Mr. Babbage, if you put into the machine
wrong figures, will the right answers come out?’ I am not
able rightly to apprehend the kind of confusion of ideas
that could provoke such a question.
— Babbage, Charles (1792–1871)
of how numbers are represented in computers and the fact that computer
arithmetic is only a close approximation to exact arithmetic. Integers, for
example, are typically represented in a computer in binary form, with a fi-
nite number of binary digits (bits), most commonly 32 or 64 bits, with one
bit reserved for the sign of the integer. Exceeding the maximum number of
digits can result in anomalies such as the sum of two large positive integers
being a negative integer.
Real numbers are typically stored in computers in essentially scientific
notation, base 2. As with integers, real numbers are limited in precision
by the necessity of storing them with a limited number of bits. Typical
precisions are listed in Table 1.1. In Fortran, single precision numbers are
called “real”, double precision numbers are called “double precision”, and
quadruple and other precisions are specified without special names. In C
and related languages, single precision numbers are called “float”, double
precision numbers are called “double”, and quadruple precision numbers
are called “long double”. In Matlab, numbers are double precision by
default; other precisions are also available when required.
Machine epsilon. The finite precision of computer numbers means
that almost all computer operations with numbers introduce additional
numerical errors. For example, there are numbers that are so small that
adding them to the number 1.0 will not change its value! The largest of
these is often called "machine epsilon", ε, and satisfies the property that
1 + ε = 1
in computer arithmetic.1 This error and other consequences of the finite
length of computer numbers are called “roundoff errors ”. Generation and
propagation of these roundoff errors contains some unpleasant surprises.
Everyone who writes computer programs should be aware of the possibilities
of roundoff errors as well as other numerical errors.
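To see machine epsilon concretely, here is a minimal sketch in Python (used only for illustration; the book's own examples are in Fortran and Matlab) that estimates it by repeated halving.

import numpy as np

# Find the largest power of 2 that does not change 1.0 when added to it
# (the definition of machine epsilon used above).
eps = 1.0
while 1.0 + eps != 1.0:
    eps = eps / 2.0

print(eps)                  # about 1.1e-16 in IEEE double precision
print(1.0 + eps == 1.0)     # True

# NumPy reports the other convention mentioned in the footnote on machine
# epsilon: the gap between 1.0 and the next larger double, about 2.2e-16.
print(np.finfo(float).eps)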
Common sources of numerical errors. The following five types of
error are among the most common sources of numerical errors in computer
programs.
1 The precise definition of machine epsilon varies slightly among sources. Some include
a factor of 2 so machine epsilon represents the smallest number that changes 1 when
added to it instead of the largest which doesn’t change the value of 1.
Another error can arise when performing integer arithmetic and, especially, when mixing integer and real arithmetic. The following Fortran example program seems to be intended to print the value 0.5, but it does not.
2 In C, numbers are assumed to be double precision, but in Fortran, numbers must have
Example 1.3.
integer j,k
real x
j=1
k=2
x=j/k
print *,x
end
This program will first perform the quotient 1/2, which is chopped to
zero because integer division results in an integer. Then it will set x=0,
so it will print the value 0.0 even though the programmer probably
expected it to print the value 0.5.
A good way to cause this example program to print the value 0.5 would
be to replace the line x=j/k with the line x=real(j)/real(k) to con-
vert the integers to single precision values before performing the divi-
sion. Analogous programs written in C, C++ and Java can be modified
in an analogous way.
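For readers more familiar with Python, the same pitfall and the same fix look like this (a sketch of ours, not from the text; in Python the integer-division operator is //):

j = 1
k = 2

x = j // k                 # integer division truncates: x is 0
y = float(j) / float(k)    # convert first, as in the Fortran fix: y is 0.5

print(x, y)                # prints: 0 0.5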
(3) Subtracting nearly equal numbers.
This is a frequent cause of roundoff error since subtraction causes a loss
of significant digits. This source arises in many applications, such as
numerical differentiation.
Thus the effect of the small addend is lost on the calculated value of
the sum.
(5) Dividing by a small number.
This has the effect of magnifying errors: a small percent error can
become a large percent error when divided by a small number.
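As an illustration of items (3) and (5), the short Python sketch below (our own example) approximates f'(1) for f(x) = e^x by a forward difference quotient: the numerator subtracts nearly equal numbers and the result is then divided by the small number h.

import math

def forward_difference(f, a, h):
    # (f(a+h) - f(a)) / h: a subtraction of nearly equal numbers,
    # followed by a division by a small number
    return (f(a + h) - f(a)) / h

exact = math.exp(1.0)          # f'(1) = e for f(x) = exp(x)
for h in [1e-1, 1e-5, 1e-9, 1e-13, 1e-16]:
    approx = forward_difference(math.exp, 1.0, h)
    print(f"h = {h:.0e}   error = {abs(approx - exact):.2e}")

The error first decreases as h shrinks and then grows again once the cancellation and the division by h let roundoff dominate.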
We generally have:
• Error: essential but unknowable. Indeed, if we know the error and the
approximate value, adding them gives the true value. If the true value
Exercise 1.1. What are the 5 main causes of serious roundoff error? Give
an example of each.
Chapter 2
Vector addition and scalar multiplication share many of the usual prop-
erties of addition and multiplication of real numbers. One of the most
important vector operations is the dot product or the scalar product of two
vectors.
Definition 2.2. Given vectors x⃗, y⃗, the dot product or scalar product is the real number (written in any of the three common notations)
$$\vec{x} \cdot \vec{y} \;=\; \langle \vec{x}, \vec{y}\rangle \;=\; (\vec{x}, \vec{y}) \;:=\; x_1 y_1 + x_2 y_2 + \ldots + x_n y_n$$
and the usual (euclidean) length of the vector x⃗ is
$$\|\vec{x}\|_2 = \sqrt{\vec{x} \cdot \vec{x}} = \sqrt{x_1^2 + x_2^2 + \ldots + x_n^2}.$$
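A quick numerical illustration of the definition in Python/NumPy (ours, not part of the text):

import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, -1.0, 2.0])

dot = x @ y                        # x1*y1 + x2*y2 + x3*y3 = 8.0
length = np.sqrt(x @ x)            # Euclidean length of x
print(dot, length, np.linalg.norm(x))   # the last two agree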
The i,j entry in AB is the dot product: (The ith row vector in A)· (the jth
column vector in B).
1 The same formula is also interpreted as the correlation between x and y, depending
on intended application.
Exercise 2.1.
definition of derivative and dot product prove the versions of the product
rule of differentiation
Exercise 2.4. Find two 2×2 matrices A and B so that AB = 0 but neither
A = 0 nor B = 0.
Exercise 2.5. Let x(t), y(t) be N-vectors that depend smoothly on t. For A an N × N matrix, g(t) := x(t)ᵗAy(t) is a differentiable function : R → R. Prove the following version of the product rule of differentiation
Finding λ, φ⃗ for an N × N real matrix A by hand:
$$A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \qquad \det[A - \lambda I] = \lambda^2 + 1 = 0, \qquad \lambda_1 = i,\ \lambda_2 = -i.$$
$$A = \begin{bmatrix} +\varepsilon & 1 \\ -1 & -\varepsilon \end{bmatrix}.$$
dot products give zero) and normal (their lengths are normalized to be one).
Exercise 2.10. Pick two (nonzero) 3-vectors and calculate the 3×3 matrix
xy t . Find its eigenvalues. You should get 0,0, and something nonzero.
the non-zero entries in a matrix in cases where either their exact value does not affect
the result or where the non-zero pattern is the key issue.
Definition 2.6. Let Ax = b and let x̂ be any vector. The error (vector) is e := x − x̂ and the residual (vector) is r := b − Ax̂.
Obviously, the error is zero if and only if the residual is also zero. Errors
and residuals have a geometric interpretation:
For general N ×N systems, the error is essential but, in a very real sense
unknowable. Indeed, if we knew the exact error then we could recover the
exact solution by x = x + e. If we could find the exact solution, then
we wouldn’t be approximating it in the first place! The residual is easily
computable so it is observable. It also gives some indication about the
error as whenever r = 0, then necessarily e = 0. Thus much of numerical
linear algebra is about using the observable residual to infer the size of
the unknowable error. The connection between residual and error is given
$$\|\vec{x}\|_\infty := \max_{1 \le i \le n} |x_i|.$$
1.01x + 0.99y = 2
0.99x + 1.01y = 2.
Let the approximate solution be (2, 0). Compute the following quantities.
Exercise 2.13. Suppose you are given a matrix, A, a right hand side vec-
tor, b, and an approximate solution vector, x. Write a computer program
to compute each of the following quantities.
Test your program with numbers from the previous exercise. Hint: If
you are using Matlab, the norm function can be used to compute the
unscaled quantity || · ||2 .
Exercise 2.14. Given a point (x0 , y0 ) and two lines in the plane. Calculate
the distance to the lines and relate it to the residual vector. Show that
Ax = b
has a unique solution for every right hand side b. The correct condition is
absolutely clear for 2 × 2 linear systems. Consider therefore a 2 × 2 linear
system
$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}.$$
The reason to call the variables x and y (and not x1 , x2 ) is that the 2 × 2
case is equivalent to looking in the x − y plane for the intersection of the
two lines (and the “solution” is the x − y coordinates of the intersection
point of the 2 lines)
Plotting two lines in the x − y plane the three possible cases are shown in
Figure 2.2.
(a) If L1 and L2 are not parallel, then a unique solution exists for all RHS.
(b) If L1 is on top of L2, then an infinite number of solutions exist for that particular RHS and no solution for any other RHS.
(c) If L1 is parallel to L2 and they are not the same line, then no solution
exists. Otherwise, there are an infinite number of solutions.
[Figure 2.2: the three possible cases for two lines in the x–y plane: intersecting at one point, coincident, and parallel.]
Unique solvability thus depends on the angle between the two lines: If
it is not 0 or 180 degrees a unique solution exists for every possible right
hand side.
For the general N × N linear system, the following is known.
det(A) = 0.
Goldstine, von Neumann and Wilkinson found the correct path by look-
ing at 2×2 linear systems (we have been following their example). Consider
therefore a 2 × 2 linear system
$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} \qquad \Longleftrightarrow$$
Line L1: a11 x + a12 y = b1
Line L2: a21 x + a22 y = b2 .
Plotting two lines in the x − y plane, geometrically it is clear that the right
definition for 2 × 2 systems of almost singular or numerically singular is as
follows.
Definition 2.8. For the 2 × 2 linear system above, the matrix A is almost
or numerically singular if the angle between the lines L1 and L2 is almost
zero or zero to numerical precision.
Exercise 2.16.
Chapter 3
Gaussian Elimination
Substep 1: Examine the entry a11 . If it is zero or too small, find another
matrix entry and interchange rows or columns to make this the entry
a11 . This process is called “pivoting” and a11 is termed the “pivot
entry”. Details of pivoting will be discussed in a later section, so for
now, just assume a11 is already suitably large.
With the pivot entry non-zero, add a multiple of row 1 to row 2 to
make a21 zero:
Compute: m21 := a21/a11,
Then compute: Row 2 ⇐ Row 2 − m21 · Row 1.
This zeroes out the 2, 1 entry and gives
$$\begin{bmatrix} a_{11} & a_{12} & \dots & a_{1N} \\ 0 & a_{22} - m_{21}a_{12} & \dots & a_{2N} - m_{21}a_{1N} \\ a_{31} & a_{32} & \dots & a_{3N} \\ \vdots & \vdots & \ddots & \vdots \\ a_{N1} & a_{N2} & \dots & a_{NN} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_N \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 - m_{21}b_1 \\ b_3 \\ \vdots \\ b_N \end{bmatrix}.$$
Note that the 2, 1 entry (and all the entries in the second row and
second component of the RHS) are now replaced by new values. Often
2 It is known that operation 1 multiplies det(A) by the scalar, operation 2 does not
where the second row now contains different numbers than before
step 1.
Substep 1 continued: Continue down the first column, zeroing out the
values below the diagonal (the pivot) in column 1:
Compute: m31 := a31/a11,
Then compute: Row 3 ⇐ Row 3 − m31 · Row 1,
... ...
Compute: mN1 := aN1/a11,
Then compute: Row N ⇐ Row N − mN1 · Row 1.
Step 2: Examine the entry a22 . If it is zero or too small, find another
matrix entry below (and sometimes to the right of) a22 and interchange
rows (or columns) to make this entry a22 . Details of this pivoting
process will be discussed later.
With the pivot entry non zero, add a multiple of row 2 to row 3 to
make a32 zero:
Compute 3,2 multiplier: m32 := a32/a22,
Then compute: Row 3 ⇐ Row 3 − m32 · Row 2.
Step 2 continued: Continue down column 2, zeroing out the values below
the diagonal (the pivot):
Compute: m42 := a42/a22,
Then compute: Row 4 ⇐ Row 4 − m42 · Row 2,
... ...
Compute: mN2 := aN2/a22,
Then compute: Row N ⇐ Row N − mN2 · Row 2.
Careful analysis of algorithms requires some way to make them more pre-
cise. While the description of the Gaussian elimination algorithm provided
in the previous section is clear and complete, it does not provide a straight-
forward roadmap to writing a computer program. Neither does it make
certain aspects of the algorithm obvious: for example it is hard to see why
the algorithm requires O(N³) time for an N × N matrix.
In contrast, a computer program would provide an explicit implementa-
tion of the algorithm, but it would also include details that add nothing to
understanding the algorithm itself. For example, the algorithm would not
change if the matrix were written using single precision or double precision
numbers, but the computer program would. Further, printed computer
code is notoriously difficult for readers to understand. What is needed
is some intermediate approach that marries the structural precision of a
computer program with human language descriptions and mathematical
notation.
This intermediate approach is termed “pseudocode”. A recent
Wikipedia article3 describes pseudocode in the following way.
The term “pseudocode” does not refer to a specific set of rules for ex-
pressing and formatting algorithms. Indeed, the Wikipedia article goes
on to give examples of pseudocode based on the Fortran, Pascal, and C
3 From: Wikipedia contributors, “Pseudocode”, Wikipedia, The Free Encyclopedia.
http://en.wikipedia.org/w/index.php?title=Pseudocode&oldid=564706654 (accessed
July 18, 2013). This article cites: Justin Zobel (2004). “Algorithms” in Writing for
Computer Science (second edition). Springer. ISBN 1-85233-802-4.
Notice that Gaussian elimination does not use the x values in computa-
tions in any way. They are only used in the final step of back substitution
to store the solution values. Thus we work with the augmented matrix:
an N × (N + 1) matrix with the RHS vector in the last column:
$$W_{N \times (N+1)} := \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} & b_1 \\ a_{21} & a_{22} & \dots & a_{2n} & b_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} & b_n \end{bmatrix}.$$
Further, its backsubstitution phase does not refer to any of the zeroed out
values in the matrix W . Because these are not referred to, their positions
can be used to store the multipliers mij .
error(’singular!’)
end
Each of the rows, columns and diagonals of A sum to the same values, and
similarly for B. Gaussian elimination is written above for an augmented
matrix W that is N × N + 1. Modify it so that it can be applied to a square
matrix. Then write a computer program to do Gaussian elimination on
square matrices, apply it to the matrices A and B, and use the resulting
reduced matrix to compute the determinants of A and B. (det(A) = −360
and det(B) = 0.)
x(N)=W(N,N+1)/W(N,N)
for i=(N-1):-1:1
    Compute the sum s = sum over j=i+1,...,N of W(i,j)*x(j)
    x(i)=(W(i,N+1)-s)/W(i,i)
end

The sum $s = \sum_{j=i+1}^{N} W_{i,j}x_j$ can be accumulated using a loop; a standard programming approach to computing a sum is the following algorithm.
Algorithm 3.3 (Accumulating the sum $\sum_{j=i+1}^{N} W_{i,j}x_j$).

s=0
for j=(i+1):N
    s=s+W(i,j)*x(j)
end

x(N)=W(N,N+1)/W(N,N)
for i=(N-1):-1:1
    Next, accumulate s = sum over j=i+1,...,N of W(i,j)*x(j):
    s=0
    for j=(i+1):N
        s=s+W(i,j)*x(j)
    end
    x(i)=(W(i,N+1)-s)/W(i,i)
end
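To make the elimination and backsubstitution phases above concrete, here is a short, self-contained sketch in Python/NumPy (not the book's Matlab-style pseudocode). It works on the augmented matrix W, does no pivoting, and is checked on the 4 × 4 matrix that appears later in Example 3.3 with a right-hand side of our own choosing.

import numpy as np

def gauss_elim_backsub(A, b):
    """Solve Ax = b by Gaussian elimination on the augmented matrix,
    followed by backsubstitution.  No pivoting, so a zero pivot aborts."""
    N = len(b)
    W = np.hstack([A.astype(float), b.reshape(N, 1).astype(float)])

    # Elimination: zero out the entries below each pivot.
    for i in range(N - 1):
        if W[i, i] == 0.0:
            raise ZeroDivisionError("zero pivot: pivoting is required")
        for k in range(i + 1, N):
            m = W[k, i] / W[i, i]          # multiplier m_{k,i}
            W[k, i:] -= m * W[i, i:]       # Row k <= Row k - m * Row i

    # Backsubstitution, from the last row up.
    x = np.zeros(N)
    for i in range(N - 1, -1, -1):
        s = W[i, i + 1:N] @ x[i + 1:N]     # accumulate sum_{j>i} W_ij x_j
        x[i] = (W[i, N] - s) / W[i, i]
    return x

A = np.array([[3., 1., -2., -1.],
              [2., -2., 2., 3.],
              [1., 5., -4., -1.],
              [3., 1., 2., 3.]])
b = np.array([1., 2., 3., 4.])
print(np.allclose(A @ gauss_elim_backsub(A, b), b))   # True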
Backsubstitution has two nested loops. The innermost loop contains one add and one multiply, for two operations, and there are roughly N(N − 1)/2 passes through this innermost loop. Thus, it is clear that, on the whole, O(N²) floating point operations are done inside backsubstitution for an N × N matrix.
The cost (in time to execute) of each of these is highly computer de-
pendent. Traditionally arithmetic operations on real numbers have been
considered to take the most time. Memory access actually takes much
more time than arithmetic and there are elaborate programming strate-
gies to minimize the effect of memory access time. Since each arithmetic
operation generally requires some memory access, numerical analysts tra-
ditionally have rolled an average memory access time into the time for the
arithmetic for the purpose of estimating run time. Thus one way to estimate
run time is to count the number of floating point operations performed (or
even just the number of multiplies and divides). This is commonly called
a “FLOP count” for FLoating point OPeration count. More elegantly it
is called “estimating computational complexity”. Counting floating point
operations gives
Exercise 3.5. Verify the claimed FLOP count for Gaussian elimination
and back substitution.
Simple partial pivoting is a common strategy, but there are better ones.
Scaled partial pivoting: Interchange only rows so that the pivot entry
Wii is the element in column i on or below the diagonal which is largest
relative to the size of the whole row that entry is in.
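A minimal Python sketch of this selection rule (our own helper, with hypothetical names; it scales each candidate entry by the largest entry in the active part of its row):

import numpy as np

def scaled_partial_pivot_row(W, i):
    """Return the index of the row (>= i) whose column-i entry is largest
    relative to the largest entry in the active part of that row."""
    N = W.shape[0]
    best_row, best_ratio = i, -1.0
    for k in range(i, N):
        scale = np.max(np.abs(W[k, i:N]))   # size of the active part of row k
        if scale == 0.0:
            continue                         # a zero row cannot supply a pivot
        ratio = abs(W[k, i]) / scale
        if ratio > best_ratio:
            best_row, best_ratio = k, ratio
    return best_row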
There may be zero or more return values and zero or more arguments.
If there are zero or one return values, the brackets (“[” and “]”) can
be omitted.
end
end
Exercise 3.7. Multiply the first equation in (3.3) by 10,000. Show that
Scaled partial pivoting yields the correct answer in four-digit arithmetic,
but partial pivoting does not.
• Partial pivoting: Row 2 swap with Row 4 since 3.0 is the largest entry below 10⁻¹⁰.
• Scaled partial pivoting: Row 2 swap with Row 3 since 2.0/3.0 >
3.0/5.0.
• Full pivoting: Row 2 swap with Row 4 and Column 2 swap with
column 4 since −5.0 is the largest entry in absolute value in the active
submatrix.
temporary = p(i)
p(i) = p(j)
p(j) = temporary
Exercise 3.9. Solve the 3 × 3 linear system with augmented matrix given
below by hand executing the Gaussian elimination with scaled partial piv-
oting algorithm:
$$W = \begin{bmatrix} -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & 1 \\ 2 & -1 & 0 & 0 \end{bmatrix}.$$
“The longer I live, the more I read, the more patiently I think,
and the more anxiously I inquire, the less I seem to know.”
— John Adams
Clearly, tridiagonal Gaussian elimination has one loop. Inside the loop,
roughly five arithmetic operations are performed. Thus, it is clear that, on
the whole, O(N ) floating point operations (more precisely 5N-5 FLOPS)
are done inside tridiagonal Gaussian elimination for an N × N matrix.
The backsubstitution algorithm is as follows.4
decreases to 1.
for i = 2:N
if W(3,i-1) is zero
error(’the matrix is singular or pivoting is required’)
end
m = W(4,i)/W(3,i-1)
W(3,i) = W(3,i) - m*W(2,i-1)
W(1,i) = W(1,i) - m*W(1,i-1)
end
if W(3,N) is zero
error(’the matrix is singular.’)
end
Exercise 3.13. Extend the algorithms given here to general banded sys-
tems with half bandwidth p < N/2.
long it would take to solve using the full GE plus full backsubstitution
algorithms. (This explains why it makes sense to look at the special case
of tridiagonal matrices.)
“Vakmanschap is meesterschap.”
— (Motto of Royal Grolsch NV, brewery.)
$$\ell_{11}y_1 = b_1 \Rightarrow y_1 = b_1/\ell_{11}, \qquad \ell_{21}y_1 + \ell_{22}y_2 = b_2 \Rightarrow y_2 = (b_2 - \ell_{21}y_1)/\ell_{22},$$
and so on.
Step 2. Backward solve Ux = y for x:
$$Ux = y \quad\Longleftrightarrow\quad \begin{bmatrix} u_{11} & u_{12} & u_{13} & \dots & u_{1,N-1} & u_{1N} \\ 0 & u_{22} & u_{23} & \dots & u_{2,N-1} & u_{2N} \\ \vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \dots & u_{N-1,N-1} & u_{N-1,N} \\ 0 & 0 & 0 & \dots & 0 & u_{N,N} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_{N-1} \\ x_N \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_{N-1} \\ y_N \end{bmatrix},$$
so
$$u_{NN}x_N = y_N \;\Rightarrow\; x_N = y_N/u_{NN}$$
and
$$u_{N-1,N-1}x_{N-1} + u_{N-1,N}x_N = y_{N-1} \;\Rightarrow\; x_{N-1} = (y_{N-1} - u_{N-1,N}x_N)/u_{N-1,N-1}.$$
Thus, once we compute a factorization A = LU we can solve linear systems
relatively cheaply. This is especially important if we must solve many linear
systems with the same A = LU and different RHS’s b. First consider the
case without pivoting.
Exercise 3.19. Prove Theorem 3.1 for the 3 × 3 case using the following
steps.
Exercise 3.20. Prove Theorem 3.1 for the general case, using Exercise 3.19
as a model.
Remark 3.1. When solving systems with multiple RHS’s, it is common
to compute L and U in double precision and store in the precision sought
in the answer (either single or double). This gives extra accuracy without
extra storage. Precisions beyond double are expensive, however, and are
used sparingly.
Remark 3.2. Implementations of Gaussian Elimination combine the two matrices L and U together instead of storing the ones on the diagonal of L and all the zeros of L and U, a savings of storage for N² real numbers. The combined matrix is
$$W = \begin{bmatrix} u_{11} & u_{12} & u_{13} & \dots & u_{1,N-1} & u_{1N} \\ m_{21} & u_{22} & u_{23} & \dots & u_{2,N-1} & u_{2N} \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ m_{N-1,1} & m_{N-1,2} & m_{N-1,3} & \dots & u_{N-1,N-1} & u_{N-1,N} \\ m_{N,1} & m_{N,2} & m_{N,3} & \dots & m_{N,N-1} & u_{N,N} \end{bmatrix}.$$
Example 3.3. Suppose A is the 4 × 4 matrix below.
$$A = \begin{bmatrix} 3 & 1 & -2 & -1 \\ 2 & -2 & 2 & 3 \\ 1 & 5 & -4 & -1 \\ 3 & 1 & 2 & 3 \end{bmatrix}.$$
Performing Gauss elimination without pivoting (exactly as in the algorithm) and storing the multipliers gives
$$W = \begin{bmatrix} 3 & 1 & -2 & -1 \\ \tfrac{2}{3} & -\tfrac{8}{3} & \tfrac{10}{3} & \tfrac{11}{3} \\ \tfrac{1}{3} & -\tfrac{7}{4} & \tfrac{5}{2} & \tfrac{23}{4} \\ 1 & 0 & \tfrac{8}{5} & -\tfrac{26}{5} \end{bmatrix}.$$
Thus, A = LU where
$$L = \begin{bmatrix} 1 & 0 & 0 & 0 \\ \tfrac{2}{3} & 1 & 0 & 0 \\ \tfrac{1}{3} & -\tfrac{7}{4} & 1 & 0 \\ 1 & 0 & \tfrac{8}{5} & 1 \end{bmatrix} \quad\text{and}\quad U = \begin{bmatrix} 3 & 1 & -2 & -1 \\ 0 & -\tfrac{8}{3} & \tfrac{10}{3} & \tfrac{11}{3} \\ 0 & 0 & \tfrac{5}{2} & \tfrac{23}{4} \\ 0 & 0 & 0 & -\tfrac{26}{5} \end{bmatrix}.$$
Exercise 3.23. Algorithm 3.8 describes the algorithm for Gaussian elimi-
nation with scaled partial pivoting for an augmented matrix W , but it does
not employ the combined matrix factor storage described in Remark 3.2.
Modify Algorithm 3.8 so that
Note that
$$P_1 \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_2 \\ x_1 \end{bmatrix} \quad\text{and}\quad P_1^{-1} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_2 \\ x_1 \end{bmatrix}.$$
If p⃗ = (2, 1) then we compute y⃗ = P⁻¹x⃗ by
for i = 1:2
y(i) = x( p(i) )
end
More generally, we compute y⃗ = P⁻¹x⃗ by
for i=1:N
y(i)=x( p(i) )
end
Proof. The essential part of the proof can be seen in the 3 × 3 case, so
that case will be presented here.
The first step in Algorithm 3.8 is to find a pivot for the first column.
Call this pivot matrix P1 . Then row reduction is carried out for the first
column, with the result
A = (P1−1 P1 )A = P1−1 (P1 A) = P1−1 L1 U1 ,
where
$$L_1 = \begin{bmatrix} 1 & 0 & 0 \\ m_{21} & 1 & 0 \\ m_{31} & 0 & 1 \end{bmatrix} \quad\text{and}\quad U_1 = \begin{bmatrix} u_{11} & * & * \\ 0 & * & * \\ 0 & * & * \end{bmatrix}$$
where the asterisks indicate entries that might be non-zero.
The next step is to pivot the second column of U1 from the diagonal
down, and then use row-reduction to factor it:
A = P1−1 L1 (P2−1 L2 U )
where
$$L_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & m_{32} & 1 \end{bmatrix} \quad\text{and}\quad U = \begin{bmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{bmatrix}.$$
There are only two possibilities for the permutation matrix P2 . It can be
the identity, or it can be
$$P_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}.$$
If P2 is the identity, then P2−1 (P2 L1 P2−1 )L2 = L1 L2 and is easily seen to
be lower triangular. If not,
$$P_2 L_1 P_2^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ m_{21} & 1 & 0 \\ m_{31} & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ m_{31} & 1 & 0 \\ m_{21} & 0 & 1 \end{bmatrix}.$$
Hence,
$$P_2 L_1 P_2^{-1} L_2 = \begin{bmatrix} 1 & 0 & 0 \\ m_{31} & 1 & 0 \\ m_{21} & m_{32} & 1 \end{bmatrix},$$
a lower triangular matrix.
for k=1:N
d(k)=b(p(k))
end
(2) Forward solve Ly⃗ = d⃗.
(3) Backsolve Ux⃗ = y⃗.
Example 3.4. Suppose that, in solving a 3 × 3 system, elimination swaps row 1 and row 2. Then p = (1, 2, 3) is changed to p = (2, 1, 3) at the end of elimination. Let b = (1, 3, 7)ᵗ, p = (2, 1, 3)ᵗ. Then d = P⁻¹b = (3, 1, 7)ᵗ.
Remark 3.3.
2x + 3y = 0
8x + 11y = 1.
Chapter 4
Theorem 4.1 (FENLA). Let A be N × N, b be N × 1 and let x be the true solution to Ax = b. Let x̂ be some other vector. The error e := x − x̂ and residual r := b − Ax̂ are related by
Ae = r.
Given a candidate for a solution x̂, if we could find its error ê(= x − x̂),
then we would recover the true solution
x = x̂ + ê (since x̂ + ê = x̂ + (x − x̂) = x).
Thus we can say the following two problems are equivalent:
Problem 1: Solve Ax = b.
Problem 2: Guess x̂, compute r̂ = b−Ax̂, solve Aê = r̂, and set x = x̂+ê.
r = b − Ax̂ in extended precision   (4.1)
Solve Aê = r (by doing 2 backsolves in working precision) for an approximate error ê
Replace x̂ with x̂ + ê
if ||ê|| ≤ 10⁻ᵗ ||x̂||
    return
end
end
error('The iteration did not achieve the required error.')
Using extended precision for the residual may require several iteration
steps, and the number of steps needed increases as A becomes more ill-
conditioned, but in all cases, it is much cheaper than computing the LU
decomposition of A itself in extended precision. Thus, it is almost always
performed in good packages.
Example 4.1. Suppose the matrix A is so ill conditioned that solving with it only gives 2 significant digits of accuracy. Stepping through iterative improvement we have:
x̂ : 2 significant digits
Calculate r = b − Ax̂
Solve Aê = r
ê : 2 significant digits
Then x̂ ⇐ x̂ + ê : 4 significant digits.
Calculate r = b − Ax̂ and solve Aê = r again; ê : 2 significant digits
Then x̂ ⇐ x̂ + ê : 6 significant digits, and so on until the desired accuracy is attained.
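A minimal Python/NumPy sketch of this loop; for brevity it calls NumPy's solver in place of reusing stored LU factors and keeps everything in ordinary double precision, so it only shows the structure of the iteration, not the extended-precision residual.

import numpy as np

def iterative_improvement(A, b, x_hat, tol=1e-12, itmax=10):
    """Refine an approximate solution x_hat of A x = b."""
    for _ in range(itmax):
        r = b - A @ x_hat                # residual of the current guess
        e_hat = np.linalg.solve(A, r)    # approximate error from A e = r
        x_hat = x_hat + e_hat            # improved guess
        if np.linalg.norm(e_hat) <= tol * np.linalg.norm(x_hat):
            return x_hat
    raise RuntimeError("the iteration did not achieve the required error")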
and
$$A = \begin{bmatrix} 1.00 & 1.20 & 1.50 \\ 1.20 & 1.50 & 2.00 \\ 1.50 & 2.00 & 3.00 \end{bmatrix}.$$
The exact solution of Ax = b is
Exercise 4.2. Show that, when double precision is desired, it can be more
efficient for large N to compute the LU factorization in single precision and
use iterative refinement instead of using double precision for the factoriza-
tion and solution. The algorithm can be described as:
Estimate the FLOP count for the algorithm as outlined above, assuming
ten iterations are necessary for convergence. Count each double precision
operation as two FLOPs and count each change of precision as one FLOP.
Compare this value with the operation count for double precision factor-
ization with a pair of double precision backsolves.
RMS norm comes from an inner product. Other weights are possible, such as
$$|||x||| := \sqrt{\sum_{j=1}^{N} \omega_j x_j^2}, \quad\text{where } \omega_j > 0 \text{ and } \sum_{j=1}^{N} \omega_j = 1.$$
Weighted norms are used in cases where different components have different significance, uncertainty, impact on the final answer, etc.
Dot products open geometry as a tool for analysis and for understanding
since the angle1 between two vectors x, y can be defined through the dot
product by
$$\cos(\theta) = \frac{x \cdot y}{\|x\|_2 \, \|y\|_2}.$$
Thus norms that are induced by dot products are special because they
increase the number of tools available for analysis.
point the same way and are thus perfectly correlated. If it is −1 they are said to be anti-correlated.
2 Every proof involving a double sum seems to be done by switching their order then
The property that Ax, y = x, Ay is called self-adjointness with re-
spect to the given inner product ·, ·. If the inner product changes, the ma-
trices that are self-adjoint change and must be redetermined from scratch.
Often the problem under consideration will induce the norm one is
forced to work with. One common example occurs with SPD matrices.
Definition 4.4. An N × N matrix A is symmetric positive definite, SPD for short, if
• A is symmetric: Aᵗ = A, and
• A is positive definite: for all x ≠ 0, xᵗAx > 0.
SPD matrices can be used to induce inner products and norms as follows.
Definition 4.5. Suppose A is SPD. The A inner product and A norm are
$$\langle x, y\rangle_A := x^t A y, \quad\text{and}\quad \|x\|_A := \sqrt{\langle x, x\rangle_A}.$$
The A inner product is of special importance for solutions of Ax = b when A is SPD. Indeed, using the equation Ax = b, ⟨x, y⟩_A can be calculated when A is SPD without knowing the vector x as follows:
$$\langle x, y\rangle_A = x^t A y = (Ax)^t y = b^t y.$$
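A small Python/NumPy illustration of this identity with an SPD matrix of our own choosing:

import numpy as np

A = np.array([[4., 1.], [1., 3.]])     # an SPD matrix
b = np.array([1., 2.])
x = np.linalg.solve(A, b)              # the (normally unknown) solution
y = np.array([0.5, -1.0])              # any test vector

lhs = x @ A @ y                        # <x, y>_A computed from x
rhs = b @ y                            # b^t y: no knowledge of x needed
print(np.isclose(lhs, rhs))            # True
x_A_norm = np.sqrt(x @ A @ x)          # ||x||_A, also equal to sqrt(b^t x)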
Exercise 4.4. Prove that if ⟨·, ·⟩∗ is an inner product then ‖x‖∗ = √⟨x, x⟩∗ is a norm.
Exercise 4.5. Prove that if A is SPD then ⟨x, y⟩_A := xᵗAy is an inner product. Show that A, A², A³, · · · are self adjoint with respect to the A inner product: ⟨Aᵏx, y⟩_A = ⟨x, Aᵏy⟩_A.
Exercise 4.6. If ⟨·, ·⟩∗ satisfies two but not all three conditions of an inner product, find which conditions in the definition of a norm are satisfied and which are violated. Apply your analysis to ⟨x, y⟩_A := xᵗAy when A is not SPD.
Exercise 4.7. The unit ball is {x : ||x||∗ ≤ 1}. Sketch the unit ball in R2
for the 1, 2 and infinity norms. Note that the only ball that looks ball-like
is the one for the 2-norm. Sketch the unit ball in the weighted 2 norm induced by the inner product ⟨x, y⟩ := (1/4)x₁y₁ + (1/9)x₂y₂.
Exercise 4.8. An N × N matrix is orthogonal if its columns are N
orthonormal (with respect to the usual euclidean inner product) vec-
tors. Show that if O is an orthogonal matrix then OT O = I, and that
||Ox||2 = ||x||2 .
However, most such norms are not useful. Matrices multiply vectors.
Thus, a useful norm is one which can be used to bound how much a vector
grows when multiplied by A. Thus, under y = Ax we seek a notion of ‖A‖ under which
$$\|y\| = \|Ax\| \le \|A\|\,\|x\|.$$
Starting with the essential function a matrix norm must serve and working
backwards gives the following definition.
Proof. Exercise!
Other features follow from the fact that matrix norms split products
apart, such as the following.
Proof. We will prove some of these to show how ‖Ax‖ ≤ ‖A‖‖x‖ is used in getting bounds on the action of a matrix. For example, note that A⁻¹Ax = x. Thus
$$\|x\| \le \|A^{-1}\|\,\|Ax\|,$$
so
$$\frac{\|x\|}{\|A^{-1}\|} \le \|Ax\| \le \|A\|\,\|x\|.$$
For (5), A⁻¹A = I so ‖I‖ = 1 ≤ ‖A⁻¹‖‖A‖ (using (2)), and ‖A⁻¹‖ ≥ 1/‖A‖. For number 6, since Aφ = λφ, we have |λ|‖φ‖ = ‖Aφ‖ ≤ ‖A‖‖φ‖.
Remark 4.1. The fundamental property that ‖AB‖ ≤ ‖A‖‖B‖ for all A, B shows the key to using it to structure proofs. As a first example, consider the above proof of ‖x‖/‖A⁻¹‖ ≤ ‖Ax‖. How is one to arrive at this proof? To begin, rearrange so it becomes ‖x‖ ≤ ‖A⁻¹‖‖Ax‖. The top (upper) side of such an inequality must come from splitting a product apart. This suggests starting with ‖A⁻¹Ax‖ ≤ ‖A⁻¹‖‖Ax‖. Next observe the LHS is just ‖x‖.
This implies Ot O = I.
The proof of the formulas for the 1 and infinity norms is a calculation.
$$\|A\|_1 = \max_{1\le j\le N} \sum_{i=1}^{N} |a_{ij}|.$$
Then we have
$$Ax = x_1\vec{a}_1 + \cdots + x_N\vec{a}_N.$$
To prove equality, we take j* to be the index (of the largest column vector) for which $\max_j \|\vec{a}_j\|_1 = \|\vec{a}_{j^*}\|_1$ and choose $x = e_{j^*}$. Then $Ae_{j^*} = 1\cdot\vec{a}_{j^*}$ and
$$\frac{\|Ae_{j^*}\|_1}{\|e_{j^*}\|_1} = \frac{\|\vec{a}_{j^*}\|_1}{\|e_{j^*}\|_1} = \|\vec{a}_{j^*}\|_1 = \max_{1\le j\le N}\sum_{i=1}^{N}|a_{ij}|.$$
We leave the proof for $\|A\|_\infty = \max_{1\le i\le N}\sum_{j=1}^{N}|a_{ij}|$ as an exercise.
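The column-sum and row-sum formulas are easy to check numerically against library routines; a Python/NumPy sketch (ours):

import numpy as np

A = np.array([[1., -7.,  2.],
              [3.,  4., -5.],
              [-2., 6.,  0.]])

one_norm = np.abs(A).sum(axis=0).max()   # max column sum of |a_ij|
inf_norm = np.abs(A).sum(axis=1).max()   # max row sum of |a_ij|

print(one_norm, np.linalg.norm(A, 1))       # both 17.0
print(inf_norm, np.linalg.norm(A, np.inf))  # both 12.0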
“He (Gill) climbed up and down the lower half of the rock over
and over, memorizing the moves... He says that ‘... going up and
down, up and down eventually... your mind goes blank and you
climb by well cultivated instinct’. ”
— J. Krakauer, from his book Eiger Dreams.
Exercise 4.13. Show that $\|A\|_\infty = \max_{1\le i\le n}\sum_{j=1}^{n}|a_{ij}|$. Hint:
$$|(Ax)_i| = \Big|\sum_{j=1}^{n} a_{ij}x_j\Big| \le \sum_{j=1}^{n} |a_{ij}|\,|x_j| \le \Big(\max_j |x_j|\Big)\sum_{j=1}^{n} |a_{ij}| = \|x\|_\infty \cdot (\text{Sum of row } i).$$
error := e = x − x̂,
residual := r = b − Ax̂.
Recall that, while e = 0 if and only if r = 0, there are cases where small
residuals and large errors coexist. For example, the point P = (0.5, 0.7)
and the 2 × 2 linear system
x−y = 0
−0.8x + y = 1/2
Usually the norm in question will be clear from the context in which
cond(A) occurs so usually the subscript of the norm is omitted. The con-
dition number of a matrix is also often denoted by the Greek letter kappa:
κ(A) = cond(A) = condition number of A.
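In Python/NumPy the definition cond(A) = ‖A‖‖A⁻¹‖ can be checked directly against the built-in routine; here we use the coefficient matrix of the slightly ill conditioned 2 × 2 system from the Chapter 2 exercises (a sketch of ours):

import numpy as np

A = np.array([[1.01, 0.99],
              [0.99, 1.01]])

cond_by_definition = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2)
print(cond_by_definition, np.linalg.cond(A, 2))   # both approximately 100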
Remark 4.2. The manipulations in the above proof are typical of ones in
numerical linear algebra and the actual result is a cornerstone of the field.
Thus,
cond(A) = 3001.02
and the relative residual can be 3000× smaller than the relative error. Indeed, we find
$$\frac{\|x - \hat{x}\|_\infty}{\|x\|_\infty} = 1, \quad\text{and}\quad \frac{\|r\|_\infty}{\|b\|_\infty} = 0.0045.$$
Example 4.10. Calculate cond(HN ), for N = 2, 3, · · · , 13. (This is easily
done in Matlab.) Plot cond(H) vs N various ways and try to find its
growth rate. Do a literature search and find it.
Exercise 4.18. Show that for any square matrix (not necessarily symmet-
ric) cond2 (A) ≥ |λ|max /|λ|min .
Exercise 4.19.
which it is not yet proven and it is an open question if it holds for all matrices (i.e., for
those rare examples) without some adjustments.
The most important other result involving cond(A) is for the perturbed
system when there are perturbations in both A and b.
This result holds for any matrix norm. Thus, various norms of A can be calculated and the smallest used as an inclusion radius for the eigenvalues of A.
$$\|B^n\| \le \|B\|^n \to 0 \text{ as } n \to \infty.$$
We shall use the following special case of the spectral mapping theorem.
Lemma 4.2. The eigenvalues of (I − B)⁻¹ are (1 − λ)⁻¹ where λ is an eigenvalue of B.
S = 1 + α + α² + · · · + α^N
αS = α + · · · + α^N + α^{N+1}
(1 − α)S = 1 − α^{N+1}.
6 Briefly: Exercise 4.22 shows that given a matrix B and any ε > 0, there exists a norm with ‖B‖ within ε of spr(B). With this result, if there does not exist a norm with ‖B‖ < 1, then there is a λ(B) with |λ| = spr(B) > 1. Picking x = eigenvector of λ, we calculate: |Bⁿx| = |λⁿx| → ∞.
To apply this idea, note that since |λ| ≤ ‖B‖, |λ| < 1. Further, λ(I − B) = 1 − λ(B) by the spectral mapping theorem.7 Since |λ(B)| < 1, λ(I − B) ≠ 0 and (I − B)⁻¹ exists. We verify that the inverse is as claimed. To begin, note that
$$(I - B)(I + B + \cdots + B^N) = I - B^{N+1}.$$
Since B^N → 0 as N → ∞,
$$I + B + \cdots + B^N = (I - B)^{-1}(I - B^{N+1}) = (I - B)^{-1} - (I - B)^{-1}BB^N \to (I - B)^{-1}.$$
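A quick numerical illustration of the Neumann series in Python/NumPy, with a small matrix of our own whose norm is below 1:

import numpy as np

B = np.array([[0.2, 0.1],
              [0.0, 0.3]])            # ||B|| < 1, so the series converges

partial = np.zeros_like(B)
power = np.eye(2)
for n in range(60):                    # accumulate I + B + B^2 + ... + B^59
    partial += power
    power = power @ B

print(np.allclose(partial, np.linalg.inv(np.eye(2) - B)))   # True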
$$\|(A+E)^{-1}\| \le \frac{\|A^{-1}\|}{1 - \|A^{-1}\|\,\|E\|}.$$
The ingredients are now in place. We give the proof of the general case.
Ax = b, (A + E)x̂ = b + f.
Then
$$\frac{\|x - \hat{x}\|}{\|x\|} \le \frac{\mathrm{cond}(A)}{1 - \|A^{-1}\|\,\|E\|}\left( \frac{\|E\|}{\|A\|} + \frac{\|f\|}{\|b\|} \right).$$
7 An elementary proof is because the eigenvalues of λ(B) are roots of the polynomial
Proof. The proof uses the same ideas but is a bit more delicate in the order of steps. First,8
Ax = b ⟺ (A + E)x = b + Ex
(A + E)x̂ = b + f
(A + E)e = Ex − f
e = (A + E)⁻¹(Ex − f)
Remark 4.4 (How big is the RHS?). If ‖A⁻¹‖‖E‖ ≪ 1, we can estimate (e.g., 1/(1 − α) ≈ 1 + α)
$$\frac{1}{1 - \|A^{-1}\|\|E\|} \sim 1 + \|A^{-1}\|\|E\| = 1 + \text{small}$$
so that up to O(‖A⁻¹‖‖E‖) the first order error is governed by cond(A).
8 The other natural way to start is to rewrite
Ax = b ⟺ (A + E)x = b + Ex
Ax̂ = b + f − Ex̂
e = A⁻¹(Ex̂ − f).
Since there are 2 natural starting points, the strategy is to try one and if it fails, figure out why, then try the other.
$$A = \begin{bmatrix} 1 & -1 \\ 1 & -1.00001 \end{bmatrix}, \quad\text{and}\quad B = \begin{bmatrix} 1 & -1 \\ -1 & 1.00001 \end{bmatrix},$$
$$\frac{|\lambda|_{\max}(A)}{|\lambda|_{\min}(A)} \sim 1, \quad\text{while}\quad \frac{|\lambda|_{\max}(B)}{|\lambda|_{\min}(B)} \sim 4 \cdot 10^5.$$
There are many other results related to Theorem 4.12. For example, all the above upper bounds as relative errors can be complemented by lower bounds, such as the following.
801. Kahan attributes the theorem to “Gastinel” without reference, but does not seem
to be attributing the proof. Possibly the Gastinel reference is: Noël Gastinel, Matrices
du second degré et normes générales en analyse numérique linéaire, Publ. Sci. Tech.
Ministére de l’Air Notes Tech. No. 110, Paris, 1962.
To summarize:
• If cond(A) = 10ᵗ then at most t significant digits are lost when solving Ax = b.
• cond(A) = AA−1 is the correct measure of ill-conditioning; in par-
ticular, it is scale invariant whereas det(A) is not.
• For 2 × 2 linear systems representing two lines in the x1 , x2 plane,
cond(A) is related to the angle between the lines.
• The effects of roundoff errors and finite precision arithmetic can be
reduced to studying the sensitivity of the problem to perturbations.
Exercise 4.27. For B an N × N matrix, show that for a > 0 small enough, I − aB is invertible. What is the infinite sum in that case:
$$\sum_{n=0}^{\infty} a^n B^n\,?$$
Exercise 4.28. Verify that the determinant gives no insight into condi-
tioning. Calculate the determinant of the coefficient matrix of the system
1x1 + 1x2 = 2
10.1x1 + 10x2 = 20.1.
Recalculate after the first equation is multiplied by 10:
10x1 + 10x2 = 20
10.1x1 + 10x2 = 20.1.
For more information see the articles and books of Wilkinson [W61],
[W63].
Chapter 5
5.1 Derivation
Assumption: Q is conserved.
This suggests that Q often will flow from regions of high concentration
to low concentration. For example, Fourier’s law of heat conduction and
Newton’s law of cooling state that
Heat Flux = −k∇T
where k is the (material dependent) thermal conductivity. The analogous
assumption for Q is
Problem (5.2) is the model problem. What about the time dependent
problem however? One common way to solve it is by the “method of lines”
or time stepping. Pick a Δt (small) and let uⁿ(x) ∼ u(x, t)|_{t=nΔt}. Then
$$\frac{\partial u}{\partial t}(t_n) \doteq \frac{u^n(x) - u^{n-1}(x)}{\Delta t}.$$
Replacing ∂u/∂t by the difference approximation on the above RHS gives a sequence of problems
$$\frac{u^n - u^{n-1}}{\Delta t} - k\Delta u^n = f^n$$
or, solve for n = 1, 2, 3, · · · ,
$$-\Delta u^n + \frac{1}{k\Delta t}\,u^n = \frac{1}{k}\,f^n + \frac{1}{k\Delta t}\,u^{n-1},$$
• Solving a time dependent problem can require solving the Poisson prob-
lem (or its ilk) thousands or tens of thousands of times.
• The cost of solving the Poisson problem increases exponentially in the
dimension from 1D to 2D to 3D.
[Figure: the forward, backward, and centered difference quotients D₊u(a), D₋u(a), D₀u(a) as slopes of secant lines to u(x) at x = a.]
$$D_0 u(a) = \big(D_+ u(a) + D_- u(a)\big)/2 = \frac{u(a+h) - u(a-h)}{2h}.$$
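The three difference quotients are easy to experiment with; a small Python sketch (ours, using u(x) = sin x at a = 1):

import math

def D_plus(u, a, h):   return (u(a + h) - u(a)) / h
def D_minus(u, a, h):  return (u(a) - u(a - h)) / h
def D_zero(u, a, h):   return (u(a + h) - u(a - h)) / (2 * h)

u, a, exact = math.sin, 1.0, math.cos(1.0)
for h in [0.1, 0.01, 0.001]:
    print(h,
          abs(D_plus(u, a, h) - exact),    # error O(h)
          abs(D_minus(u, a, h) - exact),   # error O(h)
          abs(D_zero(u, a, h) - exact))    # error O(h^2)

The one-sided quotients are first order accurate while the centered quotient is second order, consistent with the averaging observation above.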
We will use this observation about averaging to prove that A−1 exists.
We can also assume uJ > 0; if uJ < 0 then note that A(−u) = 0 has Jth
component (−uJ ) > 0. Then, if J is not 1 or N , the J th equation in Au = 0
is
−uJ+1 + 2uJ − uJ−1 = 0
or
uJ+1 + uJ−1
uJ =
2
implying that uJ is between uJ−1 and uJ+1 . Thus either they are all zero
or
uJ = uJ−1 = uJ+1 ≡ umax.
Continuing across the interval (a, b) we get
u1 = u2 = . . . = uN ≡ umax.
Consider the equation at x1: 2u1 − u2 = 0. Thus 2umax − umax = umax = 0, so umax ≡ 0 and uJ ≡ 0. We leave the case when J = 1 and J = N as
exercises.
Exercise 5.4. This exercise will calculate the eigenvalues of the 1D matrix A = tridiag(−1, 2, −1) exactly, based on methods for solving difference equations. If Au = λu then, for j = 1, 2, . . . , N, we have the difference equation
u₀ = 0,
−u_{j+1} + 2u_j − u_{j−1} = λu_j,  j = 1, 2, . . . , N,
u_{N+1} = 0.
Its general solution is
u_j = C₁R₁ʲ + C₂R₂ʲ,
where R₁, R₂ are the roots of
−R² + 2R − 1 = λR.
For λ to be an eigenvalue this quadratic equation must have two real roots and there must be nonzero values of C₁, C₂ for which u₀ = u_{N+1} = 0. Now find the eigenvalues!
where f(x), g₁ and g₂ are given. Let u″(a), u′(a) be replaced by the difference approximations
$$u''(a) \doteq D_+D_-u(a) = \frac{u(a+h) - 2u(a) + u(a-h)}{h^2},$$
$$u'(a) \doteq D_0u(a) = \frac{u(a+h) - u(a-h)}{2h}.$$
With these approximations, the CDEqn is reduced to a linear system in the same way as the MPP. Divide [a, b] into N + 1 subintervals of width h := (b − a)/(N + 1),
x_j = a + jh,  j = 0, 1, · · · , N + 1,
a = x₀ < x₁ < x₂ < · · · < x_N < x_{N+1} = b.
At each meshpoint x_j we will compute a u_j ∼ u(x_j). We will, of course, need one equation for each variable, meaning one equation for each meshpoint. Approximate the equation at each x_j by using D₊D₋u_j and D₀u_j:
$$-\varepsilon D_+D_-u_j + D_0u_j = f(x_j), \quad\text{for } j = 1, 2, 3, \cdots, N.$$
(a) Find the system of linear equations that results. (b) Investigate invert-
ibility of the matrix A that results. Prove invertibility under the condition
$$Pe := \frac{h}{2\varepsilon} < 1.$$
Pe is called the cell Peclet number.
Exercise 5.6. Repeat the analysis of the 1D discrete CDEqn from the last
exercise. This time use the approximation
$$u'(a) \doteq D_-u(a) = \frac{u(a) - u(a-h)}{h}.$$
This is a perfect result: The cost in both storage and floating point
operations is proportional to the resolution sought. If we want to see the
solution on scales 10× finer (so h ⇐ h/10) the total costs increases by a
factor of 10.
The two-dimensional model problem is the first one that reflects some complexities of real problems. The domain is taken to be the unit square (to simplify the problem), Ω = (0, 1) × (0, 1). The problem is, given f(x, y) and g(x, y), to approximate the solution u(x, y) of
You can think of u(x, y) as the deflection of a membrane stuck at its edges and loaded by f(x, y). Figure 5.2 below gives a solution where g(x, y) = 0 and where f(x, y) > 0 and so pushes up on the membrane.
Fig. 5.3 A coarse mesh on the unit square, with interior nodes indicated by larger dots
and boundary nodes by smaller ones.
To have a square linear system, we need one equation for each variable.
There is one unknown (uij ) at each mesh point on Ω. Thus, we need one
equation at each mesh point. The equation for each mesh point on the
boundary is clear:
uij = g(xi , yj ) ( here g ≡ 0) for each xi , yj on ∂Ω. (5.8)
Thus, we need an equation for each xi , yj inside Ω. For a typical (xi ,
yj ) inside Ω we use the approximations (5.6) and (5.7). This gives
$$-\left( \frac{u_{i+1\,j} - 2u_{ij} + u_{i-1\,j}}{h^2} + \frac{u_{i\,j+1} - 2u_{ij} + u_{i\,j-1}}{h^2} \right) = f(x_i, y_j) \qquad (5.9)$$
for all (x_i, y_j) inside of Ω.
The equations (5.8) and (5.9) give a square (N + 2)2 × (N + 2)2 linear
system for the uij ’s. Before developing the system, we note that (5.9) can
be simplified to read
−ui+1j − ui−1j + 4uij − uij+1 − uij−1 = h2 f (xi , yj ).
This is often denoted using the “difference molecule” represented by Fig-
ure 5.4,
−u(N ) − u(S) + 4u(P ) − u(E) − u(W ) = h2 f (P )
[Figure 5.4: the five-point difference molecule, with 4 at the center point and −1 at each of the four neighboring points.]
and by Figure 5.5 using the “compass” notation, where P is the mesh point
and N, S, E and W are the mesh points immediately above, below, to the
right and to the left of the point P.
The equation, rewritten in terms of the stencil notation, becomes
−u(N ) − u(S) + 4u(C) − u(E) − u(W ) = h2 f (C).
The discrete Laplacian, denoted Δ_h, in 2D is thus
$$-\Delta_h u_{ij} := \frac{-u_{i\,j+1} - u_{i\,j-1} + 4u_{ij} - u_{i+1\,j} - u_{i-1\,j}}{h^2},$$
Fig. 5.5 A sample mesh showing interior points and indicating a five-point Poisson
equation stencil and the “compass” notation, where C is the mesh point and N, S, E, W
are the mesh points immediately above, below, to the right and the left of C.
[Figure: the interior mesh points numbered lexicographically, column by column; boundary points are left unnumbered.]
Thus, the typical k th row (associated with an interior mesh point (xi , yj )
not adjacent to a boundary point) of the matrix A will read:
[Figure 5.7 shows this stencil: the five nonzero entries in row k multiply u_{k−N}, u_{k−1}, u_k, u_{k+1} and u_{k+N}.]
Fig. 5.7 The 2D difference stencil at the kth point in a lexicographic ordering with N
numbered points in each direction.
The exact structure of A can easily be written down because all the choices were made to keep A as simple as possible. A is an N² × N² block tridiagonal matrix (N × N blocks with each block an N × N matrix) of the form:
$$A = \begin{bmatrix} T & -I & & 0 \\ -I & T & \ddots & \\ & \ddots & \ddots & -I \\ 0 & & -I & T \end{bmatrix} \qquad (5.13)$$
where I is the N × N identity matrix and T = tridiag(−1, 4, −1) is the N × N tridiagonal matrix with 4 on its diagonal and −1 on its sub- and superdiagonals.
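The block structure is easy to generate and inspect numerically; the following Python/NumPy sketch (ours, using Kronecker products) assembles A for a given N and checks that it is symmetric positive definite.

import numpy as np

def poisson_2d_matrix(N):
    """Assemble the N^2 x N^2 matrix of the 2D model Poisson problem."""
    T = np.diag(4.0 * np.ones(N)) \
      + np.diag(-1.0 * np.ones(N - 1), 1) \
      + np.diag(-1.0 * np.ones(N - 1), -1)          # T = tridiag(-1, 4, -1)
    S = np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)
    # Block tridiagonal: T blocks on the diagonal, -I blocks off the diagonal.
    return np.kron(np.eye(N), T) - np.kron(S, np.eye(N))

A = poisson_2d_matrix(4)
print(A.shape)                              # (16, 16)
print(np.allclose(A, A.T))                  # symmetric
print(np.all(np.linalg.eigvalsh(A) > 0))    # positive definite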
and
$$(\vec{u}_{n,m})_{j,k} = \sin\frac{jn\pi}{N+1}\,\sin\frac{km\pi}{N+1},$$
where j and k vary from 1, . . . , N. Verify these expressions by calculating $A\vec{u}_{n,m}$ and showing it is equal to $\lambda_{n,m}\vec{u}_{n,m}$.
Exercise 5.10. Let the domain be the triangle with vertices at (0, 0), (1, 0),
and (0, 1). Write down the linear system arising from the MPP on this
domain with f(x, y) = x + y, g = 0 and N = 5.
system: one equation for each unknown variable! Let the approximation at
the meshpoint (xi , yj , zk ) be denoted (as usual) by
uijk := approximation to u(xi , yj , zk ).
The discrete Laplacian in 3D is
$$\Delta_h u_{ijk} := \frac{u_{i+1\,jk} - 2u_{ijk} + u_{i-1\,jk}}{h^2} + \frac{u_{i\,j+1\,k} - 2u_{ijk} + u_{i\,j-1\,k}}{h^2} + \frac{u_{ij\,k+1} - 2u_{ijk} + u_{ij\,k-1}}{h^2}.$$
Collecting terms we get
$$\Delta_h u_{ijk} = \frac{u_{i+1\,jk} + u_{i\,j+1\,k} + u_{ij\,k+1} - 6u_{ijk} + u_{i-1\,jk} + u_{i\,j-1\,k} + u_{ij\,k-1}}{h^2}.$$
The 3D discrete model Poisson problem is thus
−Δh uijk = f (xi , yj , zk ), at all meshpoints (xi , yj , zk ) inside Ω
uijk = g(xi , yj , zk ) = 0, at all meshpoints (xi , yj , zk ) on ∂Ω.
In the above “at all meshpoints (xi , yj , zk ) inside Ω”means for 1 ≤ i, j, k ≤
N and “at all meshpoints (xi , yj , zk ) on ∂Ω” means for i or j or k = 0 or
N + 1. We thus have the following square (one variable for each meshpoint
and one equation at each meshpoint) system of linear equations (where
fijk := f (xi , yj , zk )).
For 1 ≤ i, j, k ≤ N ,
−ui+1jk − uij+1k − uijk+1 + 6uijk − ui−1jk − uij−1k − uijk−1 = h2 fijk .
And for i or j or k = 0 or N + 1,
uijk = 0. (5.14)
The associated difference stencil is sketched in Figure 5.8.
Counting is good!
This system has one unknown per meshpoint and one equation per mesh-
point. In this form it is a square (N + 2)3 × (N + 2)3 linear system. Since
uijk = 0 for all boundary meshpoints we can also eliminate these degrees
of freedom and get a reduced2 N 3 × N 3 linear system, for 1 ≤ i, j, k ≤ N :
−ui+1jk − uij+1k − uijk+1 + 6uijk − ui−1jk − uij−1k − uijk−1 = h2 fijk ,
2 We shall do this reduction herein. However, there are serious reasons not to do it if you are solving more general problems: including these gives a negligibly larger system and it is easy to change the boundary conditions. If one eliminates these unknowns,
then changing the boundary conditions can mean reformatting all the matrices and pro-
gramming again from scratch. On the other hand, this reduction results in a symmetric
matrix while keeping Dirichlet boundary conditions in the matrix destroys symmetry
and complicates the solution method.
[Figure 5.8: the 3D seven-point difference stencil, with 6 at the center point and −1 at each of its six neighbors.]
Fig. 5.10 Geometry of a 3D uniform mesh. Each point has six neighbors, indicated by
heavy dots.
$$\dots, -1, \underbrace{0, \dots, 0}_{N-1}, -1, \underbrace{0, \dots\dots, 0}_{N^2-N-1}, -1, 0, \dots, 0$$
where the value 6 is the diagonal entry. If the mesh point is adjacent to the
boundary then this row is modified. (In 3D adjacency happens often.)
To summarize, some basic facts about the coefficient matrix A of linear
system derived from the 3D model Poisson problem on an N × N × N mesh
with h = 1/(N + 1):
that
Storage requirements increase 100,000 times, and
solution by banded sparse GE takes 10,000,000 times longer!
Exercise 5.11. Prove the claimed estimates of cond(A) from the formula
for the eigenvalues of A.
“Let us first understand the facts, and then we may seek for the
causes.”
— Aristotle.
The right way to compare the costs in storage and computer time of
solving a BVP is in terms of the resolution desired, i.e., in terms of the
meshwidth h. The previous estimates for storage and solution are sum-
marized in Table 5.1. Comparing these we see the curse of dimensionality
clearly: As the dimension increases, the exponent increases rather than the
constant or parameter being raised to the exponent. In other words:
The cost of storing the data and solving the linear system using
direct methods for the model problem increases exponentially
with the dimension.
for i=1:N
    r(i) = h^2*f(i)-(-u(i+1)+2*u(i)-u(i-1))
end
for i=1:N
for j=1:N
        r(i,j)=h^2*f(i,j)-(-u(i,j+1)-u(i,j-1)+4*u(i,j) ...
               -u(i+1,j)-u(i-1,j) )
end
end
This gives a total of only 10h⁻² FLOPs and requires only 2h⁻² real numbers to be stored.
The 3D case: The 3D case takes 7 multiplies and adds per row for h⁻³ rows:
for i=1:N
for j=1:N
for k=1:N
            r(i,j,k)=h^2*f(i,j,k) - ( ...
                -u(i+1,j,k)-u(i,j+1,k)-u(i,j,k+1) ...
                +6*u(i,j,k) ...
                -u(i-1,j,k)-u(i,j-1,k)-u(i,j,k-1) )
end
end
end
This gives a total of only 14h⁻³ FLOPs and requires only 2h⁻³ real numbers to be stored.
3 When i=1, the expression “u(i-1) ” is to be interpreted as the boundary value at
the left boundary, and when i=N, the expression “u(i+1)” is to be interpreted as the
boundary value at the right boundary.
To summarize,
The matrix A does not need to be stored for the MPP since we already
know the nonzero values and the components they multiply. More generally
we would only need to store the nonzero entries and a pointer vector to tell
which entry in the matrix is to be multiplied by that value. Thus the only
hope to break the curse of dimensionality is to use algorithms where the
work involves computing residuals instead of elimination! These special
methods are considered in the next chapter.
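As one concrete example of such a scheme, here is a small Python sketch of compressed sparse row (CSR) storage: only the nonzero entries, their column indices, and pointers to the start of each row are kept, and a residual or matrix-vector product touches only these. The helper names are our own.

import numpy as np

def to_csr(A):
    """Store the nonzero entries of A in compressed sparse row form."""
    values, col_index, row_ptr = [], [], [0]
    for row in A:
        for j, a in enumerate(row):
            if a != 0.0:
                values.append(a)
                col_index.append(j)
        row_ptr.append(len(values))        # where the next row starts
    return np.array(values), np.array(col_index), np.array(row_ptr)

def csr_matvec(values, col_index, row_ptr, x):
    """Compute y = A x using only the stored nonzeros."""
    y = np.zeros(len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_index[k]]
    return y

A = np.array([[2., -1., 0.], [-1., 2., -1.], [0., -1., 2.]])
x = np.array([1., 2., 3.])
vals, cols, ptrs = to_csr(A)
print(csr_matvec(vals, cols, ptrs, x), A @ x)   # the two agree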
(1) Find the smallest h (largest N ) for which the program will execute
without running out of memory.
(2) Next, from this, estimate (and explain how you did it) the smallest h (largest N) for which this can be done in 2D in banded sparse storage mode. The same question in 1D. The same question in 3D.
(3) Make a chart of your findings and draw conclusions.
Exercise 5.14. Same setting as the last problem. Now however, estimate
how long it would take to compute a residual in 1D, 2D and 3D. Explain
how you did the estimate.
after finding one answer, you will be able to quickly grasp the point of
the various sparse storage schemes. Next look in the Templates book,
Barrett, Berry, et al. [Barrett et al. (1994)] and compare your method to
Compressed Row Storage. Explain the differences.
Chapter 6
Iterative Methods
Compute residual: r = b − Ax̂
for i = 1:itmax
    Use r to improve x̂
    Compute residual using improved x̂: r = b − Ax̂
Use residual and update to estimate accuracy
if accuracy is acceptable, exit with converged solution
end
Signal failure if accuracy is not acceptable.
$$\rho(x - x) = b - Ax,$$
$$\rho(x^{n+1} - x^n) = b - Ax^n, \quad\text{or}$$
$$x^{n+1} = \left(I - \frac{1}{\rho}A\right)x^n + \frac{1}{\rho}\,b.$$
Algorithm 6.2 (FOR = First Order Richardson). Given ρ > 0, tar-
get accuracy tol, maximum number of steps itmax and initial guess x0 :
h=1/N
for it=1:itmax
    % initialize solution, delta, residual and rhs norms
    delta=0
    unorm=0
    resid=0
    bnorm=0
    for i=2:N
        for j=2:N
            for k=2:N
                % compute increment
                au=-( uold(i+1,j,k) + uold(i,j+1,k) ...
                    + uold(i,j,k+1) + uold(i-1,j,k) ...
                    + uold(i,j-1,k) + uold(i,j,k-1) )
                unew(i,j,k)=(h^2*f(i,j,k) - au)/6
                delta=delta + (unew(i,j,k)-uold(i,j,k))^2
                resid=resid + (h^2*f(i,j,k) - au - 6*uold(i,j,k))^2
                bnorm=bnorm + (h^2*f(i,j,k))^2
            end
        end
    end
    if convergence is satisfied, exit
    Copy unew to uold
end
error(’convergence failed’)
approximations. Remarkably, this does not require that the coefficient ma-
trix be stored at all! Thus, provided it converges rapidly enough, we have
a method for overcoming the curse of dimensionality. Unfortunately, this
“provided ” is the key question: an iterative method's utility depends on its
speed of convergence and, doubly unfortunately, we shall see that the Jacobi
method does not converge fast enough, as the next example begins to indicate.
with h = 1/5 and f (x) = x/5 leads to the 4 × 4 tridiagonal linear system
2u1 −u2 =1
−u1 +2u2 −u3 =2
−u2 +2u3 −u4 =3
−u3 +2u4 = 4.
The true solution is
u1 = 4, u2 = 7, u3 = 8, u4 = 6.
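As a quick illustration of the point this example is making (a hedged sketch, not part of the text), one can run Jacobi on this 4 × 4 system, here using M = diag(A) = 2I, and watch the error shrink by only a modest factor per sweep:

A = [2 -1 0 0; -1 2 -1 0; 0 -1 2 -1; 0 0 -1 2];
b = (1:4)';
uexact = [4; 7; 8; 6];
u = zeros(4,1);                          % initial guess
for it = 1:40
    u = u + (b - A*u)/2;                 % Jacobi step: M = diag(A) = 2*I
    fprintf('%2d  error = %9.3e\n', it, norm(u - uexact))
end

The slow, steady decay seen in the printed errors is exactly the behavior quantified in the convergence analysis below.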
(1) Write a computer program to solve this problem for h = 1/N using
FOR with ρ = 4 (Jacobi iteration). This amounts to modifying Algo-
rithm 6.3 for 2D instead of 3D.
(2) It is easy to see that the solution to this problem is u(x, y) = x −
y. Remarkably, this continuous solution is also the discrete solution.
Verify that your code reproduces the continuous solution to within the
convergence tolerance for the case h = 1/3 (N = 3).
(3) Verify that your code reproduces the continuous solution to within the
convergence tolerance for the case h = 1/100 (N = 100).
n=0
while convergence is not satisfied
Obtain xn+1 as the solution of M (xn+1 − xn ) = b − Axn
n=n+1
end
The matrix M does not depend on the iteration counter n, hence the
name “stationary.” This algorithm results in a new iterative method for
each new choice of M , called a “preconditioner.” For FOR (which takes
very many steps to converge) M = ρI. At the other extreme, if we
pick M = A then the method converges in 1 step but that one step is just
solving a linear system with A so no simplification is obtained. From these
two extreme examples, it is expected that some balance must be struck
between the cost per step (less with simpler M ) and the number of steps
(the closer M is to A the fewer steps expected).
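In Matlab the whole family can be driven by one short loop; the sketch below is illustrative (the function name and stopping test are ours, not the text's), with the preconditioner M passed in as a matrix:

function [x, n] = stationary_solve(A, M, b, x, tol, itmax)
% Stationary iteration M*(x_{n+1} - x_n) = b - A*x_n.
% M = rho*eye(size(A)) gives FOR, M = diag(diag(A)) gives Jacobi,
% and M = A converges in one (expensive) step.
for n = 1:itmax
    r = b - A*x;
    if norm(r) <= tol*norm(b)
        return
    end
    x = x + M\r;                    % one solve with the preconditioner per step
end
error('convergence failed')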
There are three standard ways to write any stationary iterative method.
The residual and updates are used to give indications of the error and
to decide when to stop an iterative method.
The program should terminate if either the first test or both the second
and third tests are satisfied. Usually other computable heuristics are also
monitored to check for convergence and speed of convergence. One example
is the experimental contraction constant
α_n := ‖r^{n+1}‖/‖r^n‖   or   ‖Δ^{n+1}‖/‖Δ^n‖.
This is monitored because αn > 1 suggests divergence and αn < 1 but very
close to 1 suggests very slow convergence.
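A sketch of how such a monitor might be added to the residual loop (a small made-up system; the variable names are ours):

A = [2 -1; -1 2];  b = [1; 1];           % tiny example system
M = diag(diag(A));                       % Jacobi preconditioner
x = zeros(2,1);
rnorm_old = norm(b - A*x);
for n = 1:20
    x = x + M\(b - A*x);                 % one step of the stationary method
    rnorm_new = norm(b - A*x);
    alpha_n = rnorm_new/rnorm_old;       % experimental contraction constant
    fprintf('step %2d: alpha_n = %6.3f\n', n, alpha_n)
    rnorm_old = rnorm_new;
end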
To summarize, the important points
• Basic iterative methods are easy to program.3 The programs are short
and easy to debug and often are inherently parallel.
• An iterative method's convergence can be fast or can fail entirely. The questions
of convergence (at all) and speed of convergence are essential ones that
determine if an iterative method is practical or not.
iterative methods are more complicated. Further, often the requirements of rapid con-
vergence adds layers of complexity to what started out as a simple implementation of a
basic iterative method.
Theorem 6.4. Given any N × N matrix T and any ε > 0 there exists a
matrix norm ‖·‖ with ‖T‖ ≤ spr(T) + ε.
Proof. That it suffices follows from the previous two theorems. It is easy to
prove that it is necessary. Indeed, suppose ρ(T) ≥ 1 so T has an eigenvalue λ:
    Tφ = λφ with |λ| ≥ 1.
Pick e0 = φ. Then, e1 = T e0 = T φ = λφ, e2 = T e1 = λ2 φ, . . . ,
en = λn φ.
Since |λ| ≥ 1, en clearly does not approach zero as n → ∞.
B and P BP −1
In general, f (A) is well defined (by its power series as in eA ) for any analytic
function. The next theorem, known as the Spectral Mapping Theorem, is
extremely useful. It says that
Exercise 6.7. Let A be invertible and f (z) = 1/z. Give a direct proof of
the SMT for this particular f (·). Repeat for f (z) = z 2 .
This section gives a complete and detailed proof that the First Order
Richardson iteration
Ax = b.
the spectrum of A.
ρ(en+1 − en ) = −Aen , en = x − x n ,
or
From Section 6.2, we know that en → 0 for any e0 if and only if |λ(T )| < 1
for every eigenvalue λ of the matrix T . If f (x) = 1 − x/ρ, note that
T = f (A). Thus, by the spectral mapping theorem
λ(T ) = 1 − λ(A)/ρ.
−ρ < ρ − x < +ρ
6.3.1 Optimization of ρ
If you optimize everything, you will always be unhappy.
— Donald Knuth
Clearly, the smaller ‖T‖_2 the faster e^n → 0. Now, from the above proof
    ‖T‖_2 = max |λ(T)| = max |1 − λ(A)/ρ|.
The eigenvalues λ(A) are a discrete set on [a, b]. A simple sketch (see the
next subsection) shows that
• ρ = λmax(A):  ‖T‖_2 = 1 − 1/κ,
• ρ = (λmax + λmin)/2:  ‖T‖_2 = 1 − 2/(κ + 1).
However, estimating λmin (A) is often difficult. The shape of α(ρ) also
suggests that it is better to overestimate ρ than underestimate ρ. Thus,
often one simply takes ρ = A rather than the “optimal” value of ρ. The
cost of this choice is that it roughly doubles the number of steps required.
To simplify this we suppose that only the largest and smallest eigenvalues
(or estimates thereof) are known. Thus, let
0 < a = λmin (A) ≤ λ ≤ b = λmax (A) < ∞
so that the simplified parameter optimization problem is
    min_ρ  max_{a ≤ λ ≤ b} |1 − λ/ρ|.
Fix one eigenvalue λ and consider in the y − ρ plane the curve y = 1 − λ/ρ,
as in Figure 6.1. The plot also has small boxes on the ρ axis indicating
a = λmin (A), b = λmax (A) and the chosen intermediate value of λ.
The next step is for this same one eigenvalue λ to consider in the y − ρ
plane the curve y = |1 − λ/ρ|, as shown in Figure 6.2. This just reflects up
the portion of the curve below the rho axis in the previous figure. We also
begin including the key level y = 1.
Checking which individual curve is the active one in the maximum, we find:
• Convergence: ||T (ρ)||2 < 1 if and only if ρ is bigger than the value
of rho at the point where 1 = −(1 − λmax /ρ). Solving this equation for
rho we find the condition
The above analysis also gives insight on the expected number of itera-
tions for FOR to converge. Since
    e^n = T e^{n−1}, so we have ‖e^n‖ ≤ ‖T‖^n ‖e^0‖.
Write ‖T‖ = 1 − α, where α is small, and ln(1 − α) = −α + O(α²) (by Taylor
series). This gives
    1/ln(1/‖T‖) ≈ α^{-1}, where ‖T‖ = 1 − α.
Conclusions
Exercise 6.9. Consider error in FOR yet again. Suppose one chooses
2 values of ρ and alternates with ρ1 , ρ2 , ρ1 , ρ2 , etc. Relabel the steps as
follows:
ρ1 (xn+1/2 − xn ) = b − Axn
ρ2 (xn+1 − xn+1/2 ) = b − Axn+1/2 .
Eliminate the half step to write this as a stationary iterative method [i.e.,
relate xn+1 to xn ]. Analyze convergence for SPD A. Can this converge
faster with two different values of ρ than with 2 steps of one value of ρ?
If you prefer, you can explore this computationally instead of theoretically
[Choose one approach: analysis or computations. It will be most exciting if
you work with someone on this problem with one person doing the analysis
and the other the numerical explorations].
(1) Take h = 1/3 and write down the 4 × 4 linear system in matrix vector
form.
(2) Given an N × N mesh, let u(i, j) denote an N × N array of approxima-
tions at each (xi, yj). Give pseudocode for computing the residual
r(i, j) (an N × N array) and its norm. Suppose the largest N for which
the coefficient matrix can be stored in banded sparse form (to be solved
by Gaussian elimination) is N = 150.
(3) Estimate the largest value of N the problem can be stored to be solved
by First Order Richardson. Explain carefully!
FOR has a huge savings in storage over Gaussian elimination but not in
time to calculate the solution. There are many better iterative methods; we
consider a few such algorithms in this section: the Gauss–Seidel method,
over-relaxation and the SOR method. These are still used today — not
as solvers but as preconditioners for the Conjugate Gradient method of
Chapter 7.
pick M; then:
    M(x^{n+1} − x^n) = b − Ax^n.
Costs for the MPP: In most cases, the Gauss–Seidel iteration takes
approximately 1/2 as many steps as Jacobi iteration. This is because, intu-
itively speaking, each time (*) in Algorithm 6.6 is executed, it involves half
old values and half updated values. Thus, using Gauss–Seidel over FOR
cuts execution time roughly in half. However, the model problem still needs
(1/2)O(h^{-2}) iterations. Cutting costs by 50% is always good. However, the
essential problem is how the costs grow as h → 0. In other words, the goal
should be to cut the exponent as well as the constant!
6.4.2 Relaxation
“The time to relax is when you don’t have time for it.”
— Sydney J. Harris
The second point (cost reduction) happens in cases where the number
of steps is sensitive to the precise choice of the parameter. However, it is
not appealing because
The idea is simple: pick the relaxation parameter ω, then add one line
to an existing iterative solver as follows.
for n=1:itmax
    Compute x^{n+1}_temp by some iterative method
    Compute x^{n+1} = ω x^{n+1}_temp + (1 − ω) x^n
Since the assignment operator “=” means “replace the value on the left
with the value on the right”, in a computer program there is sometimes
no need to allocate extra storage for the temporary variable x^{n+1}_temp. Under-
relaxation means 0 < ω < 1 and is a good choice if the underlying iteration
Exercise 6.11. In this exercise, you will see a simple example of how over-
relaxation or under-relaxation can accelerate convergence of a sequence.
For a number r with |r| < 1, consider the sequence6 {en = (r)n }∞ n=0 .
This sequence satisfies the recursion
en = ren−1 (6.2)
and converges to zero at a rate r. Equation (6.2) can be relaxed as
en = ωre_{n−1} + (1 − ω)e_{n−1} = (1 + ω(r − 1))e_{n−1}. (6.3)
(1) Assume that 0 < r < 1 is real, so that the sequence {en } is of one
sign. Show that there is a value ω0 so that if 1 < ω < ω0 , then (6.3)
converges more rapidly than (6.2).
(2) Assume that −1 < r < 0 is real, so that the sequence {en } is of
alternating sign. Show that there is a value ω0 so that if 0 < ω0 < ω <
1, then (6.3) converges more rapidly than (6.2).
(3) Assume that r is real, find the value ω0 and show that, in this very
special case, the relaxed expression converges in a single iteration.
Exercise 6.12. Show that FOR with relaxation does not improve conver-
gence. It just corresponds to a different value of ρ in FOR.
iterate.
Compute r0 = b − Ax0
for n=1:itmax
7 Personal communication.
    Compute x^{n+1}_temp by one GS step:
        (D + U)(x^{n+1}_temp − x^n) = b − Ax^n
    Compute x^{n+1} = ω x^{n+1}_temp + (1 − ω) x^n
For the 2D MPP the vector x is the array u(i, j) and the action of D, U
and A can be computed directly using the stencil. That D + U is upper
triangular means just use the most recent value for any u(i, j). It thus
simplifies as follows.
h=1/N
for it=1:itmax
for i=2:N
for j=2:N
uold=u(i,j)
u(i,j)=( h^2*f(i,j) ...
       + u(i+1,j)+u(i-1,j)+u(i,j+1)+u(i,j-1) )/4
u(i,j)=omega*u(i,j)+(1-omega)*uold
end
end
if convergence is satisfied, exit
end
Convergence results for SOR are highly developed. For example, the fol-
lowing is known.
For ω = ω_optimal and T_SOR, T_GaussSeidel the iteration matrices for SOR
and Gauss–Seidel respectively, we have
    spr(T_SOR) = ω_optimal − 1 < spr(T_GaussSeidel) ≤ (spr(T_Jacobi))² < 1.
The dramatic reason SOR was the method of choice for ω = ω_optimal is
that it reduces the exponent in the complexity estimate for the MPP
from O(h^{-2}) to O(h^{-1}).
Exercise 6.14. Theorem 5.3 presents the eigenvalues of the 3D MPP ma-
trix, and the analogous expression for the 2D MPP (A) is
    λ_pq = 4 [ sin²( pπ/(2(N + 1)) ) + sin²( qπ/(2(N + 1)) ) ],
for 1 ≤ p, q ≤ N.
Using this expression along with the observation that the diagonal of A is a
multiple of the identity, find spr(TJacobi ) and spr(TSOR ) for ω = ωoptimal .
How many iterations will it take to reduce the error from 1 to 10−8 using:
(a) Jacobi, and (b) SOR with ω = ωoptimal for the case that N = 1000?
u^{n+1} = ω u^{n+1}_TEMP + (1 − ω) u^{n−1}
for its=1:itmax
for i=2:N
for j=2:N
au = - uold(i+1,j) - uold(i-1,j) ...
+ 4.0*uold(i,j) ...
- uold(i,j+1) - uold(i,j-1)
r(i,j) = h^2*f(i,j) - au
unow(i,j) = uold(i,j) + (1/rho)*r(i,j)
unew(i,j) = omega*unow(i,j) + (1-omega)*uold(i,j)
end
end
Test for convergence
if convergence not satisfied
Copy unow to uold and unew to unow
for i=2:N
for j=2:N
uold(i,j)=unow(i,j)
unow(i,j)=unew(i,j)
end
end
else
Exit with converged result
end
end
rnorm=0
for i=2:N
for j=2:N
au=4*u(i,j)-(u(i+1,j)+u(i-1,j)+u(i,j+1)+u(i,j-1))
r(i,j)=h^2*f(i,j)-au
rnorm=rnorm+r(i,j)^2
end
end
rnorm=sqrt(rnorm/(N-1)^2)
8 N.K. Nichols, On the convergence of two-stage iterative processes for solving linear
Note that because the nonzero entries are known and regular the above
did not even need to store the nonzero entries in A. We give one impor-
tant example of a storage scheme for more irregular patterned matrices:
CRS=Compressed Row Storage.
To use A we need to first store the nonzero entries. In CRS this is done,
row by row, in a long vector. If the matrix has M nonzero entries we store
them in an array of length M named value
value = [2, −1, 3, 2, 1, 5, −1, 2, −1, 1, 3, 2, 1, 1, . . . ].
Next we need to know in the above vector the index where each row starts.
For example, the first 3 entries, 2, −1, 3, come from row 1 in A. Row 2
starts with the next (4th in this example) entry. This metadata can be
stored in an array of length M named row, containing indices where each
row starts. Of course, the first row always starts with the first value in
value, so there is no need to store the first index, 1, leaving (M − 1) row
indices to be stored. By convention, the final index in value is (M + 1).
row = [4, 7, 11, . . . , M ).
Now that we know Row 1 contains entries 1, 2, 3 (because Row 2 starts with
entry 4), we still need to store the column number that each entry in value
corresponds to in the global matrix A. This information can be stored
in a vector of length M named col.
col = [1, 2, 5, 2, 4, 7, . . . ].
With these three arrays we can calculate the matrix vector product as
follows.
first=1
for i=1:N
y(i)=0
for j=first:row(i)-1
k=col(j)
y(i)= y(i) + value(j)*x(k)
end
first=row(i)
end
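As a small self-contained illustration (the 3 × 3 matrix below is made up, not the matrix from the example above), the three CRS arrays and the product loop fit together like this:

% CRS storage of A = [4 0 1; 0 2 0; 3 0 5], which has M = 5 nonzeros
value = [4 1 2 3 5];       % nonzero entries, row by row
col   = [1 3 2 1 3];       % column of each stored entry
row   = [3 4 6];           % where rows 2 and 3 start, then the sentinel M+1
x = [1; 2; 3];
N = 3;
y = zeros(N,1);
first = 1;
for i = 1:N
    for j = first:row(i)-1
        y(i) = y(i) + value(j)*x(col(j));
    end
    first = row(i);
end
% y is now A*x = [7; 4; 18]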
Theorem 6.9. Let A be SPD. Then for any initial guess the unique so-
lution to (IVP) x(t) converges to the unique solution of the linear system
Ax = b:
x(t) → A−1 b as t → ∞.
Thus one way to solve the linear system is to use any explicit method
for the IVP and time step to steady state. There is in fact a 1 − 1 corre-
spondence between time stepping methods for some initial value problem
associated with Ax = b and stationary iterative methods for solving Ax = b.
While this sounds like a deep meta-theorem it is not. Simply identify the it-
eration number n with a time step number and the correspondence emerges.
For example, consider FOR
ρ(xn+1 − xn ) = b − Axn .
Rearrange FOR as follows:
    (x^{n+1} − x^n)/Δt + Ax^n = b, where Δt := ρ^{-1}.
This shows that FOR is exactly the forward Euler method for IVP with
timestep and pseudo-time
Δt := ρ^{-1} and t_n = nΔt and x^n ≈ x(t_n).
Similarly, the linear system Ax = b can be embedded into a second order
equation with damping
x''(t) + a x'(t) + Ax(t) = b, for a > 0 and 0 < t < ∞,
x(0) = x^0, x'(0) = x^1 the initial guesses.
Timestepping gives an iterative method with 2 parameters (a, Δt) and thus
resembles second order Richardson.
    (x^{n+1} − 2x^n + x^{n−1})/Δt² + a (x^{n+1} − x^{n−1})/(2Δt) + Ax^n = b.
The reasons this approach is not competitive, if programmer time is not
counted, include:
Exercise 6.16. Find the IVP associated with the stationary iterative
methods Gauss–Seidel and SOR.
Exercise 6.18. Show that the solution of both IVP’s converges to the
solution of Ax = b as t → ∞.
The classic and often very effective use of dynamic relaxation is in splitting
methods. Splitting methods have a rich history; entire books have been
written to develop aspects of them so we shall give one central and still
important example, the Peaceman–Rachford method. Briefly, the N × N
matrix A is split as
A = A1 + A2
(A1 + A2 )x = b,
    (d/dt) x(t) + (A1 + A2)x(t) = f(t).
We consider the first two problems. We stress that splitting methods in-
volve two separate steps and each is important for the success of the whole
method:
The first splitting method and in many ways still the best is the
Peaceman–Rachford method.
Thus
spr(AB) = spr(BA).
Proof. Exercise!
confusing and standard in the area (so just get used to it).
Proof. We have
spr(TP R ) = spr [T (A1 )T (A2 )] .
By Kellogg’s lemma we have for both i = 1, 2, ||T (Ai )||2 ≤ 1 with one of
||T (Ai )||2 < 1. Thus, ||T (A1 )T (A2 )||2 < 1 and
spr(T (A1 )T (A2 )) ≤ ||T (A1 )T (A2 )||2 < 1.
Exercise 6.19. The following iteration is silly in that each step costs as
much as just solving Ax = b. Nevertheless, (and ignoring this aspect of it)
prove convergence for matrices A with xT Ax > 0 and analyze the optimal
parameter:
(ρI + A)x^{n+1} = b + ρx^n.
12 a := x^T A^{ss} x = (x^T A^{ss} x)^T = x^T (A^{ss})^T x ≡ −x^T A^{ss} x. Thus a = −a so a = 0.
Indeed, consider the 2D MPP. Recall that the domain is the unit square,
Ω = (0, 1) × (0, 1). Approximate uxx and uyy by
    u_xx(a, b) ≈ [u(a + Δx, b) − 2u(a, b) + u(a − Δx, b)]/Δx²,   (6.4)
    u_yy(a, b) ≈ [u(a, b + Δy) − 2u(a, b) + u(a, b − Δy)]/Δy².   (6.5)
Introduce a uniform mesh on Ω with N + 1 points in both directions: Δx =
Δy = 1/(N + 1) =: h and
    x_i = ih, y_j = jh, i, j = 0, 1, . . . , N + 1.
Let uij denote the approximation to u(xi , yj ) we will compute at each mesh
point. On the boundary use
uij = g(xi , yj ) ( here g ≡ 0) for each xi , yj on ∂Ω
and eliminate the boundary points from the linear system. For a typical
(xi , yj ) inside Ω we use
    −[ (u_{i+1,j} − 2u_{i,j} + u_{i−1,j})/h² + (u_{i,j+1} − 2u_{i,j} + u_{i,j−1})/h² ] = f(x_i, y_j)   (6.6)
for all (x_i, y_j) inside of Ω,
    u_{i,j} = g(x_i, y_j) (≡ 0) at all (x_i, y_j) on ∂Ω.   (6.7)
The boundary unknowns can be eliminated giving an N 2 ×N 2 linear system
for the N 2 unknowns:
AN 2 ×N 2 uN 2 ×1 = fN 2 ×1 .
To split A with the ADI = Alternating Direction Implicit splitting we use
the directional splitting already given above:
A = A1 + A2, where
    A1 = −(u_{i+1,j} − 2u_{i,j} + u_{i−1,j})/h²,
    A2 = −(u_{i,j+1} − 2u_{i,j} + u_{i,j−1})/h².
Exercise 6.21. If one full PR-ADI step costs the same as 6 FOR steps,
is it worth doing PR-ADI? Answer this question using results on condition
numbers of tridiag(−1, 2, −1) and the estimates of number of steps per
significant digit for each method.
Chapter 7
Solving Ax = b by Optimization
x = α1 φ1 + α2 φ2.
Exercise 7.1. For A either (i) not symmetric, or (ii) indefinite, consider
x, y → x, yA .
    J(x) = (1/2) x^t A x − x^t b.
Theorem 7.1. Let A be SPD. The solution of Ax = b is the unique mini-
mizer of J(x). There holds
    J(x + y) = J(x) + (1/2) y^t A y > J(x) for any y ≠ 0.
Further, if x̂ is any other vector in R^N then
    ‖x̂ − x‖²_A = 2 (J(x̂) − J(x)).   (7.1)
Indeed,
    ‖x̂ − x‖²_A = (x̂ − x)^t A (x̂ − x) = x̂^t A x̂ − x̂^t A x − x^t A x̂ + x^t A x
               = (since Ax = b) = x̂^t A x̂ − x̂^t b − x̂^t b + x^t b = x̂^t A x̂ − 2 x̂^t b + x^t b
and
    2 (J(x̂) − J(x)) = x̂^t A x̂ − 2 x̂^t b − x^t A x + 2 x^t b
                    = (since Ax = b) = x̂^t A x̂ − 2 x̂^t b − x^t [Ax − b] + x^t b = x̂^t A x̂ − 2 x̂^t b + x^t b,
which are obviously equal. Each step is reversible so the result is proven.
Compute r0 = b − Ax0
for n=1:itmax
(∗) Choose a direction vector dn
Find α = αn by solving the 1D minimization problem:
(∗∗) αn = arg minα Φ(xn + αdn )
x^{n+1} = x^n + α_n d^n
r^{n+1} = b − Ax^{n+1}
if converged, exit, end
end
These choices yield the steepest descent method. Because the functional
J(x) is quadratic, there is a very simple formula for αn in step (∗∗) for
steepest descent:
    α_n = (d^n · r^n)/(d^n · Ad^n), where r^n = b − Ax^n.   (7.2)
It will be convenient to use the ·, · notation for dot products so this formula
is equivalently written
    α_n = ⟨d^n, r^n⟩ / ⟨d^n, Ad^n⟩ = ⟨d^n, r^n⟩ / ⟨d^n, d^n⟩_A.
The difference between descent methods arises from:
Many choices of descent direction and functionals have been tried. Ex-
amples of other choices include the following:
Choice of descent direction dn :
• FOR: M = ρI, N = ρI − A
• Jacobi: M = diag(A)
• Gauss–Seidel: M = D + L (the lower triangular part of A).
Householder (an early giant in numerical linear algebra and matrix the-
ory) proved a very simple identity for (7.3) when A is SPD.
Proof. This is an identity: expand both sides and cancel to check that is
true. Next reverse the steps to give the proof.
Proof. The proof is easy but there are so many tools at hand it is also
easy to start on the wrong track and get stuck there. Note that ‖e^n‖_A
is monotone decreasing and bounded below by zero. Thus it has a non-
negative limit. Since it converges to something, the Cauchy criterion implies
that
    (e^n)^t A e^n − (e^{n+1})^t A e^{n+1} → 0.
Now reconsider the Householder Relation (7.4). Since the LHS → 0 we
must have the RHS → 0 too.1 Since P > 0, this means
    ||x^{n+1} − x^n||_P → 0.
Finally the iteration itself,
M(x^{n+1} − x^n) = b − Ax^n,
implies that if xn+1 −xn → 0 (the LHS), then the RHS does also: b−Axn →
0, so convergence follows.
1 This step is interesting to the study of human errors. Since we spend our lifetime
reading and writing L to R, top to bottom, it is common for our eyes and brain to
process the mathematics = sign as a one directional relation ⇒ when we are in the
middle of a proof attempt.
Proof. This follows easily from Householder’s result as follows. For Jacobi,
M = diag(A), so
    P = M + M^t − A = 2 diag(A) − A > 0 if diag(A) > (1/2)A.
For GS, since A = D + L + U where (since A is symmetric) U = Lt and
P = M + M t − A = (D + L) + (D + L)t − A =
D + L + D + Lt − (D + L + Lt ) = D > 0
for SPD A. For SOR we calculate (as above using Lt = U )
    P = M + M^t − A = M + M^t − (M − N) = M^t + N
      = ω^{-1}D + L^t + ((1 − ω)/ω)D − U = ((2 − ω)/ω)D > 0,
for 0 < ω < 2. Convergence of FOR in the A-norm is left as an exercise.
monotonic convergence in the A-norm means the errors satisfy ‖e^{n+1}‖_A < ‖e^n‖_A for all
n.
Find the form of A1 and A2 . Show that they are diagonally semi-dominant.
Look up the definition and show they are also irreducibly diagonally dom-
inant. Show that the entries in C are nonnegative.
Exercise 7.10. Repeat the above for block FOR and for Block Gauss–
Seidel.
Exercise 7.13. Consider the proof of convergence when P > 0. This proof
goes back and forth between the minimization structure of the iteration and
the algebraic form of it. Try to rewrite the proof entirely in terms of the
functional J(x) and ∇J(x).
Proof.
x^{n+1} = x^n + ρ^{-1} r^n.
Multiply by “−A” and add b to both sides. This gives
b − Axn+1 = b − Axn − ρ−1 Arn ,
which is the claimed iteration.
r1 = b − Ax1
for n=1:itmax
ρ_n = ⟨Ar^n, r^n⟩ / ⟨r^n, r^n⟩
x^{n+1} = x^n + (ρ_n)^{-1} r^n
if satisfied, exit, end
rn+1 = b − Axn+1
end
(At A)x = At b.
    J̃(x) = (1/2) x^t A^t A x − x^t A^t b.
If we are solving Ax = b with A an N × N nonsingular matrix then we can
convert it to the normal equations by multiplication by At :
(At A)x = At b.
Thus, minimizing the residual is equivalent to passing to the normal equa-
tions and minimizing J(·). Unfortunately, the bandwidth of At A is (typi-
cally) double the bandwidth of A. Further, passing to the normal equations
squares the condition number of the associated linear system.
The relation cond2(A^t A) = [cond2(A)]² explains why Option 2 is bet-
ter. Option 1 implicitly converts the system to the normal equations and
thus squares the condition number of the system being solved then applies
Option 2. This results in a very large increase in the number of iterations.
(∗) d = −∇J(xn ).
x = xn + αd.
dn := −∇J(xn ) = rn = b − Axn ,
for n=0:itmax
rn = b − Axn
αn = rn , rn /rn , Arn
xn+1 = xn + αn rn
if converged, exit, end
end
Example 7.4. N = 2, i.e., the unknown vector is (x, y)^t. Let A = [2 0; 0 50],
b = (2, 0)^t. Then
    J(x, y) = (1/2)[x, y] [2 0; 0 50] [x; y] − [x, y] [2; 0]
            = (1/2)(2x² + 50y²) − 2x = (x² − 2x + 1) + 25y² − 1   (level sets are ellipses)
            = (x − 1)²/1² + y²/(1/5)² − 1.
Fig. 7.2 The first minimization steps for Example 7.4, starting from x^0 = (11, 1); the
limit is (1, 0) and x^4 ≈ (6.37, 0.54). The points x^0, . . . , x^4, . . . are indicated with dots,
the level curves of J are ellipses centered at (1, 0) and construction lines indicate search
directions and tangents.
and
    J(x^n) − J(x) ≤ ((κ − 1)/(κ + 1))^n (J(x^0) − J(x)).
Proof. We shall give a short proof that for one step of steepest descent
    ‖x − x^n‖_A ≤ ((κ − 1)/(κ + 1)) ‖x − x^{n−1}‖_A.
If this holds for one step then the claimed result follows for n steps. We
observe that this result has already been proven! Indeed, since steepest
descent picks ρ to reduce J(·) maximally and thus the A norm of the error
maximally going from xn−1 to xn it must also reduce it more than for any
other choice of ρ including ρoptimal for FOR. Let xnF OR be the result from
one step from xn−1 of First Order Richardson with optimal parameter. We
have proven that
    ‖x − x^n_FOR‖_A ≤ ((κ − 1)/(κ + 1)) ‖x − x^{n−1}‖_A.
Thus
    ‖x − x^n‖_A ≤ ‖x − x^n_FOR‖_A ≤ ((κ − 1)/(κ + 1)) ‖x − x^{n−1}‖_A,
completing the proof for the error. The second result for J(xn ) − J(x) is
left as an exercise.
λmax = O(1) while λmin = O(h²) and thus κ = O(h^{-2}), so steepest descent
requires O(h^{-2}) directions to converge.
Theorem 7.5. The convergence rate (κ − 1)/(κ + 1) of steepest descent is sharp. It
is exactly the rate of convergence when the initial error is e^0 = φ1 + φ2 and
when e^0 = φ1 − φ2 where φ1,2 are the eigenvectors of λmin(A) and λmax(A)
respectively.
Proof. Let φ1,2 be the eigenvectors of λmin (A) and λmax (A) respectively.
Consider two possible selections of initial guesses: Pick
x0 = x − (φ1 + φ2 ) or x0 = x − (φ1 − φ2 ).
We proceed by direct calculations (which are not short but routine step by
step): if we choose x^0 = x − (φ1 + φ2) then e^0 = φ1 + φ2. We find
    x^1 = x^0 + α_0 (b − Ax^0) = x^0 + α_0 Ae^0   (since Ae = r),
so
    x − x^1 = x − x^0 − α_0 Ae^0.
Proceeding by induction,
    e^n = ((κ − 1)/(κ + 1))^n (φ1 either ± or ∓ φ2),
in the two cases, which is exactly the predicted rate of convergence.
Exercise 7.17. Suppose you must solve a very large sparse linear system
Ax = b by some iterative method. Often one does not care about the in-
dividual millions of entries in the solution vector but one only wants a few
statistics [i.e., numbers] such as the average. Obviously, the error in the
averages can be much smaller than the total error in every component or
just as large as the total error. Your goal is to try to design iterative meth-
ods which will produce accurate statistics more quickly than an accurate
answer.
To make this into a math problem, let the (to fix ideas) statistic be a
linear functional of the solution. Define a vector l and compute
L = lt x = l, x
if, e.g., L = average(x) then
l = (1/N, 1/N, ..., 1/N )t .
Problem:
Solve : Ax = b,
Compute : L = l, x
Exercise 7.18. The standard test problem for nonsymmetric systems is the
2D CDEqn = 2D model discrete Convection Diffusion equation. Here ε is a
small to very small positive parameter. (Recall that you have investigated
the 1D CDEqn in Exercise 5.5, page 98.)
−εΔu + ux = f, inside (0, 1) × (0, 1)
u = g, on the boundary.
Discretize the Laplacian by the usual 5-point star and approximate ux by
    u_x(x_I, y_J) ≈ [u(I + 1, J) − u(I − 1, J)]/(2h).
Find the associated difference stencil. This problem has 2 natural parameters:
    h = 1/(N + 1), the meshwidth; and,
    Pe := h/(2ε), the “cell Péclet number”.
The interesting case is when the cell Péclet3 number Pe ≫ 1, i.e., when
ε ≪ h.
Hint: You have already written programs for the 2D MPP in Exercises 7.15
and 7.16. You can modify one of those programs for this exercise.
(1) Debug your code using h = 1/5, g(x, y) = x − y, and f (x, y) = 1. The
exact solution in this case is u(x, y) = x − y. Starting from the exact
solution, convergence to 1.e-3 should be achieved in a single iteration
of a method such as Jacobi (FOR with ρ = 4).
(2) Fix h = 1/50, f (x, y) = x + y, and g(x, y) = 0. Pick three iterative
methods (your choice). Solve the nonsymmetric linear system for a
variety of values4 of ε = 1, 1/10, 1/100, 1/1000, 1/10000, starting from
u(x, y) = 0, to an accuracy of 10−3 . Report the results, consisting of
convergence with the number of iterations or nonconvergence. Describe
the winners and losers for small cell P e and for large cell P e.
is given by Length × Velocity / Diffusion coefficient. In our simple example, the velocity
is the vector (1,0) and the diffusion coefficient is ε. The cell Peclet number, also denoted
by Pe, is the Peclet number associated with one mesh cell so the length is taken to be
the meshwidth.
4 For ε = 1, your solution should appear much like the MPP2D solution with the same
right side and boundary conditions. For smaller ε, the peak of the solution is pushed to
larger x locations. Nonconvergence is likely for very small ε.
r0 = b − Ax0
for n=0:itmax
Choose dn
(∗) Pick αn to minimize b − A(xn + αdn )2
xn+1 = xn + αn dn
(∗∗) rn+1 = b − Axn+1
if converged, return, end
end
(1) Show that step (∗∗) can be replaced by: (∗∗) rn+1 = rn − αn Adn .
(2) Find an explicit formula for the optimal value of α in step (∗).
Chapter 8
The conjugate gradient method is the best possible method1 for solving
Ax = b for A an SPD matrix. We thus consider the solution of
1 “Best possible” has a technical meaning here with equally technical qualifiers. We shall
see that the kth step of the CG method computes the projection (the best approximation)
with respect to the A-norm into a k dimensional subspace.
The A-norm is
    ‖x‖_A = √⟨x, x⟩_A = √(x^t A x).
The quadratic functional associated with Ax = b is
    J(x) = (1/2) x^t A x − x^t b.
The conjugate gradient method (hereafter: CG) is a descent method.
Thus, it takes the general form.
CG differs from the slow steepest descent method by step (∗) the choice
of search directions. In Steepest Descent dn = rn while in CG dn is calcu-
lated by a two term recursion that A orthogonalizes the search directions.
The CG algorithm is very simple to write down and easy to program.
It is given as follows:2
2 We shall use fairly standard conventions in descent methods; we will use roman letters
r0 = b − Ax0
d0 = r 0
for n=1:itmax
αn−1 = dn−1 , rn−1 /dn−1 , Adn−1
xn = xn−1 + αn−1 dn−1
rn = b − Axn
if converged, stop, end
βn = rn , rn /rn−1 , rn−1
dn = rn + βn dn−1
end
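A direct Matlab transcription of this algorithm might look as follows (a hedged sketch: the function name, the stopping test and the function-handle option are ours). Here A may be passed either as a matrix or as a function handle that applies A to a vector.

function [x, n] = cg_solve(A, b, x, tol, itmax)
% Conjugate gradient method for SPD A, following the algorithm above.
if isnumeric(A)
    Amat = A; A = @(v) Amat*v;      % wrap a stored matrix as a handle
end
r = b - A(x);
d = r;
for n = 1:itmax
    Ad = A(d);
    alpha = (d'*r)/(d'*Ad);
    x = x + alpha*d;
    rnew = r - alpha*Ad;            % same as b - A*x in exact arithmetic
    if norm(rnew) <= tol*norm(b)
        return
    end
    beta = (rnew'*rnew)/(r'*r);
    d = rnew + beta*d;
    r = rnew;
end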
(1) Apply your program to the matrix A1 using the exact solution xexact =
[1, 2, 3, 4, 5]^t and b1 = A1 xexact, starting with x^0 = xexact. Demonstrate
convergence to xexact in a single iteration with tolerance ε = 10^{-4}.
3 All previous dn must be stored and used in order to compute xn+1 .
(2) Apply your program to A1 and b1 with tolerance ε = 10^{-4} but with
initial guess x0 = 0. Demonstrate convergence to xexact in no more
than five iterations.
(3) Repeat the previous two cases with the matrix
A2 = [  2 −1  0 −1  0
       −1  3 −1  0 −1
        0 −1  2 −1  0
       −1  0 −1  3 −1
        0 −1  0 −1  3 ].
Exercise 8.2. Recall that Exercises 7.15 (page 169) and 7.16 (page 170)
had you write programs implementing iterative methods for the 2D MPP.
Write a computer program to solve the 2D MPP using conjugate gra-
dients Algorithm 8.2. How many iterations does it take to converge when
N = 100 and = 1.e − 8?
Recommendation: You have already written and tested a conjugate
gradient code in Exercise 8.1 and a 2D MPP code in Exercise 7.16. If
you replace the matrix-vector products Axn appearing in your conjugate
gradient code with function or subroutine calls that use 2D MPP code to
effectively compute the product without explicitly generating the matrix
A, you can leverage your earlier work and save development and debugging
time.
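One hedged sketch of such a matrix-free product for the 2D MPP, with the unknowns stored as an N × N array of interior values and zero Dirichlet boundary values assumed (the function name is ours):

function Au = mpp2d_apply(u)
% Apply the five-point MPP operator to the interior array u,
% assuming zero boundary values around it.
N = size(u,1);
up = zeros(N+2, N+2);
up(2:N+1, 2:N+1) = u;               % pad with the boundary zeros
Au = 4*up(2:N+1,2:N+1) ...
   - up(1:N,2:N+1) - up(3:N+2,2:N+1) ...
   - up(2:N+1,1:N) - up(2:N+1,3:N+2);

In a conjugate gradient code written for vectors, one could then pass something like A = @(v) reshape(mpp2d_apply(reshape(v, N, N)), [], 1) in place of a stored matrix.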
There are many algorithmic options (we will list two below) but the above
is a good, stable and efficient form of CG.
    α_n = ⟨r^n, r^n⟩ / ⟨d^n, Ad^n⟩.
    r^{n+1} = r^n − α_n Ad^n.
as claimed above.
Exercise 8.3. Consider the CG method, Algorithm 8.2. Show that it can
be written as a three term recursion of the general form xn+1 = αn xn +
βn xn−1 + cn .
“All sorts of computer errors are now turning up. You’d be sur-
prised to know the number of doctors who claim they are treating
pregnant men.”
— Anonymous Official of the Quebec Health Insurance Board,
on Use of Computers in Quebec Province’s Comprehensive Medical-
care system. F. 19, 4:5. In Barbara Bennett and Linda Am-
ster, Who Said What (and When, and Where, and How) in 1971:
December–June, 1971 (1972), Vol. 1, 38.
Thus
    x^2 = x^1 + α d^1 = x^0 + α r^0 + α̂{r^0 − αAr^0 + βr^0}, so
    x^2 ∈ x^0 + span{r^0, Ar^0}, and similarly
    r^1 ∈ r^0 + A · span{r^0, Ar^0}.
Continuing, we easily find the following.
Proposition 8.1. The CG iterates xj , residuals rj and search directions
dj satisfy
    x^j ∈ x^0 + span{r^0, Ar^0, ···, A^{j−1} r^0},
    r^j ∈ r^0 + A · span{r^0, Ar^0, ···, A^{j−1} r^0}
and
    d^j ∈ span{r^0, Ar^0, ···, A^{j−1} r^0}.
Proof. Induction.
Theorem 8.1. Let A be SPD. Then the CG method satisfies the following:
(i) The nth residual is globally optimal over the affine subspace Kn in
the A−1 -norm
    ||r^n||_{A^{-1}} = min_{r ∈ r^0 + AX_n} ||r||_{A^{-1}}.
Proof. Since the residuals {r0 , r1 , . . . , rN −1 } are orthogonal they are lin-
early independent. Thus, rl = 0 for some l ≤ N .
Using the properties (i) through (iv), the error in the nth CG step will
be linked to an analytic problem: the error in Chebychev interpolation.
The main result of it is the second big convergence theorem for CG.
Art has a double face, of expression and illusion, just like science
has a double face: the reality of error and the phantom of truth.
— René Daumal
‘The Lie of the Truth’. (1938) translated by Phil Powrie (1989).
In Carol A. Dingle, Memorable Quotations (2000).
    ‖x − x^n‖_* = min_{x̃ ∈ X} ‖x − x̃‖_*,
    ‖x − x_K‖_* = min_{x̃ ∈ K} ‖x − x̃‖_*.
The two best approximations are related. Given x, xK , the best approxi-
mation in K, is given by xK = x0 + xX where xX is the best approximation
in X to x − x0 .
    ‖x − x^n‖_* = min_{x̃ ∈ X} ‖x − x̃‖_*
is determined by
    ⟨x − x^n, x̃⟩_* = 0,  ∀ x̃ ∈ X.
Further, we have
    ‖x − x^n‖_* = min_{x̃ ∈ K} ‖x − x̃‖_*,
    ⟨x − x^n, x̃⟩_* = 0,  ∀ x̃ ∈ X.
Definition 8.4. Let {φ1 , φ2 , . . . , φn } be a basis for X and ·, ·∗ an inner
product on X. The associated Gram matrix G of the basis is
Gij = φi , φj ∗ .
• How to calculate all the inner products fj if x is the sought but unknown
solution of Ax = b?
• Are there cases when the best approximation in K can be computed
at less cost than constructing a basis, assembling G and then solving
Gc = f ?
For the first question there is a clever finesse that works when A is SPD.
Indeed, if we pick ·, ·∗ = ·, ·A then for Ax = b,
x, yA = xt Ay = (Ax)t y = bt y = b, y
which is computable without knowing x. For the second question, there is a
case when calculating the best approximation is easy: when an orthogonal
basis is known for X. This case is central to many mathematical algorithms
including CG.
Theorem 8.4. If {φ1 , . . . , φj } are A-orthogonal and ·, ·∗ is the A-inner
product, then the approximations produced by the descent method choosing
{φ1 , . . . , φj } for descent directions (i.e., if choosing di = φi ) are the same
as those produced by summing the orthogonal series in Algorithm 8.3 above.
Thus, with A-orthogonal search directions, the approximations produced by
the descent algorithm satisfy
    ||x − x^j||_A = min_{x̃ ∈ x^0 + span{φ1,...,φj}} ||x − x̃||_A,
Proof. Thus, consider the claim of equivalence of the two methods. The
general step of each takes the form xj+1 = xj + αdj with the same xj , dj .
We thus need to show equivalence of the two formulas for the stepsize:
    descent:  α_j = ⟨r^j, φ_j⟩ / ⟨φ_j, φ_j⟩_A
    orthogonal series:  α_j = ⟨x − x^0, φ_j⟩_A / ⟨φ_j, φ_j⟩_A.
Since the denominators are the same we begin with the first numerator and
show it is equal to the second. Indeed,
rj , φj = b − Axj , φj = Ax − Axj , φj
= x − xj , φj A .
Consider the form of xj produced by the descent algorithm. We have (both
obvious and easily proven by induction) that xj takes the general form
xj = x0 + a1 φ1 + · · · + aj−1 φj−1 .
Thus, by A-orthogonality of {φ1 , . . . , φj }
xj , φj A = x0 + a1 φ1 + · · · + aj−1 φj−1 , φj A = x0 , φj A .
Thus we have
rj , φj = x − xj , φj A = x − x0 , φj A ,
which proves equivalence. The error estimate is just restating the error
estimate of the Pythagorean theorem. From the work on descent methods
we know that A-norm optimality of the error is equivalent to minimization
of J(·) over the same space. Hence the last claim follows.
Thus:
The focus now shifts to how to generate the orthogonal basis. The
classic method is the Gram–Schmidt algorithm.
φ1 = e1
for j=1:n
    for i=1:j
        α_i = ⟨e_{j+1}, φ_i⟩_* / ⟨φ_i, φ_i⟩_*
    end
    φ_{j+1} = e_{j+1} − Σ_{i=1}^{j} α_i φ_i
end
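A Matlab rendering of the loop above for a weighted inner product ⟨x, y⟩ = x'Wy is sketched below (illustrative; the function name is ours, and passing W = eye(n) gives the ordinary dot product):

function Phi = gram_schmidt(E, W)
% Orthogonalize the columns of E with respect to <x,y> = x'*W*y.
[n, m] = size(E);
if nargin < 2, W = eye(n); end
Phi = zeros(n, m);
Phi(:,1) = E(:,1);
for j = 1:m-1
    phi = E(:,j+1);
    for i = 1:j
        alpha = (E(:,j+1)'*W*Phi(:,i)) / (Phi(:,i)'*W*Phi(:,i));
        phi = phi - alpha*Phi(:,i);
    end
    Phi(:,j+1) = phi;
end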
Exercise 8.6. Prove that the Gram matrix Gij = ei , ej ∗ is SPD provided
e1 , . . . , en is a basis for X and diagonal provided e1 , . . . , en are orthogonal.
r0 = b − Ax0
d0 = r 0
for n=1:itmax
Descent step:
αn−1 = rn−1 , dn−1 /dn−1 , dn−1 A
xn = xn−1 + αn−1 dn−1
rn = b − Axn
OM step:
Calculate new A-orthogonal search direction dn so that
span{d0 , d1 , . . . , dn } = span{r0 , Ar0 , A2 r0 , . . . , An r0 }
end
φ1 = e1
for n=1:N-1
α = φn , Aφn A /φn , φn A
if n==1
φ2 = Aφ1 − αφ1
else
β = φn−1 , Aφn A /φn−1 , φn−1 A
φn+1 = Aφn − αφn − βφn−1
end
end
span{e1 , e2 , e3 , · · · , ej } = span{φ1 , φ2 , φ3 , · · · , φj }.
Proof. Preliminary remarks: First note that the equation for φn+1
takes the form
The step φn+1 = Aφn + αφn + βφn−1 contains two parameters. It is easy
to check that the parameters α and β are picked (respectively) to make the
two equations hold:
φn+1 , φn A = 0,
φn+1 , φn−1 A = 0.
Indeed
    0 = ⟨φ^{n+1}, φ^n⟩_A = ⟨Aφ^n + αφ^n + βφ^{n−1}, φ^n⟩_A
      = ⟨Aφ^n, φ^n⟩_A + α⟨φ^n, φ^n⟩_A + β⟨φ^{n−1}, φ^n⟩_A
      = ⟨Aφ^n, φ^n⟩_A + α⟨φ^n, φ^n⟩_A
and the same for φn+1 , φn−1 A = 0 gives two equations for α, β whose
solutions are exactly the values chosen on Orthogonalization of Moments.
The key issue in the proof is thus to show that
φn+1 , φj A = 0, for j = 1, 2, · · · , n − 2. (8.2)
This will hold precisely because span{e1 , e2 , e3 , · · · , ej } is a Krylov subspace
determined by moments of A.
Details of the proof: We show from (8.1) that (8.2) holds. The proof
is by induction. To begin, from the choice of α, β it follows that the theorem
holds for j = 1, 2, 3. Now suppose the theorem holds for j = 1, 2, · · · , n.
From (8.1) consider φn+1 , φj A . By the above preliminary remarks, this
is zero for j = n, n − 1. Thus consider j ≤ n − 2. We have
φn+1 , φj A = Aφn , φj A + αφn , φj A + βφn−1 , φj A
for j = 1, 2, · · ·, n − 2.
By the induction hypothesis
φn , φj A = φn−1 , φj A = 0
thus it simplifies to
φn+1 , φj A = Aφn , φj A .
Consider thus Aφn , φj A . Note that A is self adjoint with respect to the
A-inner product. Indeed, we calculate
    ⟨Aφ^n, φ^j⟩_A = (Aφ^n)^t Aφ^j = (φ^n)^t A^t Aφ^j = (φ^n)^t A Aφ^j = ⟨φ^n, Aφ^j⟩_A.
Thus, Aφn , φj A = φn , Aφj A . By the induction hypothesis (and because
we are dealing with a Krylov subspace): for j ≤ n − 2
φj ∈ span{e1 , e2 , e3 , · · ·, en−2 }
thus
    Aφ^j ∈ span{e1, e2, e3, ···, e_{n−2}, e_{n−1}}.
Exercise 8.9. If A is not symmetric, where does the proof break down? If
A is not positive definite, where does it break down?
r^0 = b − Ax^0
d^0 = r^0/‖r^0‖
First descent step:
    α_0 = ⟨d^0, r^0⟩ / ⟨d^0, d^0⟩_A
    x^1 = x^0 + α_0 d^0
    r^1 = b − Ax^1
First step of OM:
    γ_0 = ⟨d^0, Ad^0⟩_A / ⟨d^0, d^0⟩_A
    d^1 = Ad^0 − γ_0 d^0
    d^1 = d^1/‖d^1‖   (normalize5 d^1)
for n=1:∞
    Descent Step:
        α_n = ⟨r^n, d^n⟩ / ⟨d^n, d^n⟩_A
        x^{n+1} = x^n + α_n d^n
        r^{n+1} = b − Ax^{n+1}
        if converged, STOP, end
    OM step:
        γ_n = ⟨d^n, Ad^n⟩_A / ⟨d^n, d^n⟩_A
        β_n = ⟨d^{n−1}, Ad^n⟩_A / ⟨d^{n−1}, d^{n−1}⟩_A
        d^{n+1} = Ad^n − γ_n d^n − β_n d^{n−1}
        d^{n+1} = d^{n+1}/‖d^{n+1}‖   (normalize5 d^{n+1})
end
Algorithm 8.7, while not the most efficient form for computations, cap-
tures the essential features of the method. The differences between the
above version and the highly polished one given in the first section, Algo-
rithm 8.2, take advantage of the various orthogonality properties of CG.
These issues, while important, will be omitted to move on to the error
analysis of the method.
Exercise 8.12. Consider the above version of CG. Show that it can be writ-
ten as a three term recursion of the general form xn+1 = an xn +bn xn−1 +cn .
Proof. As Ae^n = r^n and
    r^n ∈ r^0 + span{Ar^0, A²r^0, A³r^0, ···, A^n r^0}
this implies
    Ae^n ∈ A[ e^0 + span{Ae^0, A²e^0, A³e^0, ···, A^n e^0} ],
which proves part (i). For (ii) note that since ||en ||2A = 2(J(xn ) − J(x))
minimizing J(x) is equivalent to minimizing the A-norm of e. Thus, part
(ii) follows. For part (iii), note that part (i) implies
en = [I + a1 A + a2 A2 + · · · + an An ]e0 = p(A)e0 ,
where p(x) is a real polynomial of degree ≤ n and p(0) = 1. Thus, from
this observation and part (ii),
    ||x − x^n||_A = min_{p ∈ Π_n, p(0)=1} ||p(A)e^0||_A
                 ≤ ( min_{p ∈ Π_n, p(0)=1} ||p(A)||_A ) ||e^0||_A.
The result follows by calculating using the spectral mapping theorem that
    ||p(A)||_A = max_{λ ∈ spectrum(A)} |p(λ)| ≤ max_{λmin ≤ x ≤ λmax} |p(x)|
              ≤ max_{λmin ≤ x ≤ λmax} |p̃_n(x)|.
The problem now shifts to finding the “best” points to interpolate zero, best
being in the sense of the min-max approximation error. This problem is a
classical problem of approximation theory and was also solved by Cheby-
chev, and the resulting polynomials are called “Chebychev polynomials”,
one of which is depicted in Figure 8.1.
Fig. 8.1 We make p(x) small by interpolating zero at points on the interval. In this
illustration, the minimum and maximum values of p(x) are computed on the interval
[λmin, λmax].
Proof. For the proof and development of the beautiful theory of Chebychev
approximation see any general approximation theory book.
Exercise 8.15. Show that if A has M < N distinct eigenvalues then the
CG method converges to the exact solution in at most M (< N ) steps.
Recall that
    ||x − x^n||_A = ( min_{p ∈ Π_n, p(0)=1}  max_{λ ∈ spectrum(A)} |p(λ)| ) ||e^0||_A.
8.5 Preconditioning
r0 = b − Ax0
Solve M d0 = r0
z 0 = d0
for n=0:itmax
αn = rn , z n /dn , Adn
xn+1 = xn + αn dn
rn+1 = b − Axn+1 (∗)
if converged, stop end
Solve M z n+1 = rn+1
βn+1 = rn+1 , z n+1 /rn , z n
dn+1 = z n+1 + βn+1 dn (∗∗)
end
Note that the extra cost is exactly one solve with M each step. There is
a good deal of art in picking preconditioners that are inexpensive to apply
and that reduce cond(A) significantly.
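The simplest example is the Jacobi preconditioner M = diag(A): the two solves with M in the algorithm above become componentwise divisions. A hedged sketch on a small made-up SPD matrix:

% Illustrative: with M = diag(A) the preconditioner solves are divisions.
A  = [4 -1 0; -1 4 -1; 0 -1 4];      % small SPD example matrix
dA = diag(A);                        % for the 2D MPP every diagonal entry is 4
Msolve = @(r) r./dA;                 % replaces "Solve M z = r" in Algorithm 8.8
r0 = [1; 2; 3];
z0 = Msolve(r0);                     % used wherever the algorithm asks for M\r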
Exercise 8.17. (a) Find 2×2 examples where A SPD and M SPD does not
imply M −1 A is SPD. (b) Show however M −1/2 AM −1/2 is SPD. (c) Show
that if A or B is invertible then AB is similar to BA. Using this show that
M^{-1/2} A M^{-1/2} has the same eigenvalues as M^{-1} A.
Exercise 8.18. Write down CG for M −1/2 AM −1/2 y = M −1/2 b. Re-
verse the change of variable everywhere y ⇐ M 1/2 x and eliminate all the
M^{±1/2} to give the PCG algorithm as stated.
Exercise 8.19. In Exercise 8.1 (page 182), you wrote a program to apply
CG to the 2D MPP. Modify that code to use PCG, Algorithm 8.8. Test
You must take your opponent into a deep dark forest where
2+2=5, and the path leading out is only wide enough for one.
— Mikhail Tal
If A is SPD then the CG method is provably the best possible one. For
a general linear system the whole beautiful structure of the CG method
collapses. In the SPD case CG has the key properties that
method will vary from one system to another. We shall restrict ourselves
to the case where A is square (N × N ). The following is known about the
normal equations.
Thus, any method using the normal equation will pay a large price in
increasing condition numbers and numbers of iterations required. Beyond
that, if A is sparse, forming At A directly shows that At A will have roughly
double the number of nonzero entries per row as A. Thus, any algorithm
working with the normal equations avoids forming them explicitly. Resid-
uals are calculated by multiplying by A and then multiplying that by At .
is very complex. On the one hand, one can phrase the question (indi-
rectly) that if method X is applied and A happens to be SPD then, does
method X reduce to CG? Among the methods mentioned, only biCG
has this property. General (worst case) convergence results for these
methods give no improvement over CGNE: they predict O(cond2 (A))
steps per significant digit. Thus the question is usually studied by com-
putational tests which have shown that there are significant examples
of nonsymmetric systems for which each of the methods mentioned is
the best and requires significantly fewer than the predicted worst case
number of steps.
Exercise 8.20. The goal of this exercise is for you to design and analyze
(reconstruct as much of the CG theory as you can) your own Krylov sub-
space iterative method that will possibly be better than CG. So consider
solving Ax = b where A is nonsingular. Given xn , dn the new iterate is
computed by
xn+1 = xn + αn dn
αn = arg min ||b − Axn+1 ||22 .
(a) Find a formula for αn . Can this formula ever break down? Is there a
zero divisor ever? Does the formula imply that xn+1 is a projection
[best approximation] with respect to some inner product and norm?
Prove it.
(b) Next consider your answer to part (a) carefully. Suppose the search
directions are orthogonal with respect to this inner product. Prove a
global optimality condition for your new method.
(c) What is the appropriate Krylov subspace to consider for the new
method? Reconsider the Orthogonalization of Moments algorithm.
Adapt it to give an algorithm and its proof for generating such an or-
thogonal basis.
(d) For this part you may choose: Either test the method and compare it
with CG for various h’s for the MPP or complete the error estimate for
the method adapting the one for CG.
Chapter 9
Eigenvalue Problems
problem becomes 1/2 step closer to a real problem from science or engineering, such as
buckling of a 2D shell, it cannot be solved exactly. Then the only recourse is to discretize,
replace it by an EVP for a matrix and solve that.
    λ2 = −1,  φ2 = (−1, 2)^t.
where p1 + p2 + . . . + pk = n.
Each λj has at least one eigenvector φj and possibly as many as pj
linearly independent eigenvectors.
If each λj has pj linearly independent eigenvectors then all the eigen-
vectors together form a basis for RN .
(ii) there exist N orthonormal2 eigenvectors φ1, . . . , φN of A:
    ⟨φi, φj⟩ = 1 if i = j, and 0 if i ≠ j.
(iii) if C is the N × N matrix with eigenvector φj in the jth column, then
    C^{-1} = C^t and C^{-1}AC = diag(λ1, λ2, . . . , λN).
Proposition 9.4. If an eigenvector φ is known, the corresponding eigen-
value is given by the Rayleigh quotient
    λ = (φ* A φ)/(φ* φ), where φ* = conjugate transpose = (φ̄)^tr.
Proof. If Aφ = λφ, we have
    φ* A φ = λ φ* φ,
from which the formula for λ follows.
Proof. Since
    λφ = Aφ
we have
    |λ| ‖φ‖ = ‖λφ‖ = ‖Aφ‖ ≤ ‖A‖ ‖φ‖.
2 Orthonormal means orthogonal (meaning their dot products give zero) and normal (meaning their length is normalized to be one).
What can we tell about the eigenvalues of A from the entries in the
matrix A?
• σ(A) ⊂ R(A)
• R(A) is compact and convex (and hence simply connected).
• If A is normal matrix (i.e., A commutes with At ) then R(A) is the
convex hull of σ(A).
Example 9.3.
A_{3×3} = [  1  2 −1
             2  7  0
            −1  0  5 ].
We calculate
    r1 = 2 + 1 = 3,  r2 = 2 + 0 = 2,  r3 = 1 + 0 = 1,
    c1 = 2 + 1 = 3,  c2 = 2 + 0 = 2,  c3 = 1 + 0 = 1.
The eigenvalues must belong to the three disks in Figure 9.1. Since A = At ,
they must also be real. Thus
−2 ≤ λ ≤ 8.
μk = a + ω k ε1/N , k = 0, 1, · · ·, N − 1
Proof. In this case note that H is orthogonal and thus ||H||2 = ||H −1 ||2 =
1.
The power method is used to find the dominant (meaning the largest in
complex modulus) eigenvalue of a matrix A. It is specially appropriate
when A is large and sparse so multiplying by A is cheap in both storage
and in floating point operations. If a complex eigenvalue is sought, then
the initial guess in the power method must also be complex. In this case
the inner product of complex vectors is the conjugate transpose:
    ⟨x, y⟩ := x* y := x̄^T y = Σ_{i=1}^{N} x̄_i y_i.
Remark 9.2. The step (∗) in which the eigenvalue is recovered can be
rewritten as
    λ^{n+1} = (x̃^{n+2})* x^{n+1}
since x̃^{n+2} = Ax^{n+1}. Thus it can be computed without additional cost.
    (1/λ1^k) x̃^k = c1 φ1 + c2 (λ2/λ1)^k φ2 + . . . + cN (λN/λ1)^k φN.
3 The x̃ here are different from those in the algorithm because the normalization is
different.
Each term except the first → 0 since |λ2/λ1| < 1. Thus,
    (1/λ1^k) x̃^k = c1 φ1 + (terms that → 0 as k → ∞),
or,
    x̃^k ≈ c1 λ1^k φ1,   φ1 = eigenvector of λ1,
so
    A x̃^k ≈ A(c1 λ1^k φ1) = c1 λ1^{k+1} φ1,  or  A x̃^k ≈ λ1 x̃^k,
and so we have found λ1, φ1 approximately.
Example 9.5. A = [2 4; 3 13], x^0 = (1, 0)^t. Then
    x̃^1 = Ax^0 = [2 4; 3 13](1, 0)^t = (2, 3)^t,
    x^1 = x̃^1/‖x̃^1‖ = [2, 3]^t/√(2² + 3²) = (.5547, .8321)^t,
    x̃^2 = Ax^1 = (4.438, 12.48)^t,
    x^2 = x̃^2/‖x̃^2‖ = . . . , and so on.
Exercise 9.4. Write a computer program to implement the power method,
Algorithm 9.1. Regard the algorithm as converged when |Ax^{n+1} − λx^{n+1}| <
ε, with ε = 10^{-4}. Test your program by computing x̃^1, x^1 and x̃^2 in
Example 9.5 above. What are the converged eigenvalue and eigenvector in
this example? How many steps did it take?
If x^0 = c1 φ1 + c2 φ2 + . . . + cN φN then, as in the previous case
    x̃^k = c1 λ1^k φ1 + . . . + cN λN^k φN, and thus
    x̃^{k+1} = c1 λ1^{k+1} φ1 + . . . + cN λN^{k+1} φN.
In the symmetric case the eigenvectors are mutually orthogonal:
    φ_i^t φ_j = 0,  i ≠ j.
Using orthogonality we calculate
    (x^k)^t x^{k+1} = (c1 λ1^k φ1 + . . . + cN λN^k φN)^t (c1 λ1^{k+1} φ1 + . . . + cN λN^{k+1} φN)
                    = . . . = c1² λ1^{2k+1} + c2² λ2^{2k+1} + . . . + cN² λN^{2k+1}.
Similarly
    (x^k)^t x^k = c1² λ1^{2k} + . . . + cN² λN^{2k}
and we find
    μk = (c1² λ1^{2k+1} + . . . + cN² λN^{2k+1}) / (c1² λ1^{2k} + . . . + cN² λN^{2k})
       = (c1² λ1^{2k+1})/(c1² λ1^{2k}) + O(|λ2/λ1|^{2k})
       = λ1 + O(|λ2/λ1|^{2k}),
which is twice as fast as the non-symmetric case!
Exercise 9.5. Take A2×2 given below. Find the eigenvalues of A. Take
x0 = (1, 2)t and do 2 steps of the power method. If it is continued, to which
eigenvalue will it converge? Why?
    A = [ 2 −1; −1 2 ].
for n=0:itmax
    (∗) Solve A x̃^{n+1} = x^n
    x^{n+1} = x̃^{n+1}/‖x̃^{n+1}‖
    if converged, break, end
end
% The converged eigenvalue is given by
μ = (x̃^{n+1})* x^n
λ = 1/μ
For large sparse matrices step (∗) is done by using some other iterative
method for solving a linear system with coefficient matrix A. Thus the total
cost of the inverse power method is:
This product can be large. Thus various ways to accelerate the inverse
power method have been developed. Since the number of steps depends on
the separation of the dominant eigenvalue from the other eigenvalues, most
methods do this by using shifts to get further separation. If α is fixed, then
the largest eigenvalue of (A − αI)−1 is related to the eigenvalue of A closest
to α, λ_α, by
    λmax((A − αI)^{-1}) = 1/(λ_α(A) − α).
The inverse power method with shift finds the eigenvalue closest to α.
for n=0:itmax
    (∗) Solve (A − αI) x̃^{n+1} = x^n
    x^{n+1} = x̃^{n+1}/‖x̃^{n+1}‖
    if converged, break, end
end
% The converged eigenvalue is given by
μ = (x̃^{n+1})* x^n
λ = α + 1/μ
The Power Method and the Inverse Power Method are related to (and
combine to form) Rayleigh Quotient Iteration. Rayleigh Quotient Iteration
finds very quickly the eigenvalue closest to the initial shift for symmetric
matrices. It is given by:
for n=0:itmax
    (∗) Solve (A − λ_n I) x̃^{n+1} = x^n
    x^{n+1} = x̃^{n+1}/‖x̃^{n+1}‖
    λ_{n+1} = (x^{n+1})^t A x^{n+1}
    if converged, return, end
end
i.e., the number of significant digits triples at each step in Rayleigh quotient
iteration.
“But when earth had covered this generation also, Zeus the son of Cronos
made yet another, the fourth, upon the fruitful earth, which was nobler
and more righteous, a god-like race of hero-men who are called demi-gods,
the race before our own, throughout the boundless earth. Grim war and
dread battle destroyed a part of them, some in the land of Cadmus at
seven-gated Thebe when they fought for the flocks of Oedipus, and some,
when it had brought them in ships over the great sea gulf to Troy for
rich-haired Helen’s sake: there death’s end enshrouded a part of them.
But to the others father Zeus the son of Cronos gave a living and an abode
apart from men, and made them dwell at the ends of earth. And they live
untouched by sorrow in the islands of the blessed along the shore of deep
swirling Ocean, happy heroes for whom the grain-giving earth bears
honey-sweet fruit flourishing thrice a year, far from the deathless
gods. . . ”
— Hesiod, Works and Days
such that
A = QR.
a1 = r11 q1
a2 = r12 q1 + r22 q2
···
aN = r_{1N} q1 + r_{2N} q2 + · · · + r_{NN} qN.
Thus the entries in R are just the coefficients generated by the Gram–
Schmidt process! This proves existence when A is invertible.
for n=1:itmax
Factor An = Qn Rn
Form An+1 = Rn Qn
if converged, return, end
end
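In Matlab the basic (unshifted) iteration can be written directly with the built-in qr factorization; a hedged sketch on a small symmetric example:

A = [2 -1 0; -1 2 -1; 0 -1 2];      % example symmetric matrix
for n = 1:100
    [Q, R] = qr(A);                 % factor A_n = Q_n * R_n
    A = R*Q;                        % form A_{n+1} = R_n * Q_n
end
disp(diag(A))                       % approximates the eigenvalues of the original A

Practical QR codes first reduce A to Hessenberg (or tridiagonal) form and use shifts; the loop above only illustrates the bare iteration.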
Appendix A
An Omitted Proof
The proof of this theorem is beyond the scope of this text, but can be
found in any text including elementary linear algebra, such as the beautiful
book of Herstein [Herstein (1964)].
Theorem A.2. Given any N × N matrix T and any ε > 0 there exists a
matrix norm · with T ≤ ρ(T ) + ε.
Remark A.1. If it happens that each of the eigenvalues with |λi | = ρ(T )
is simple, then each of the corresponding Jordan blocks is 1 × 1 and T =
ρ(T ).
Appendix B
B.1 Objective
The purpose of this appendix is to introduce the reader to the basics of the
Matlab programming (or scripting) language. By “basics” is meant the
basic syntax of the language for arithmetical manipulations. The intent of
this introduction is twofold:
(1) Make the reader sufficiently familiar with Matlab that the pseudocode
used in the text is transparent.
(2) Provide the reader with sufficient syntactical detail to expand pseu-
docode used in the text into fully functional programs.
For our purposes, the best way to use Matlab is to use its scripting facility.
With sequences of Matlab commands contained in files, it is easy to see
what calculations were done to produce a certain result, and it is easy to
show that the correct values were used in producing a result. It is terribly
embarrassing to produce a very nice plot that you show to your teacher or
advisor only to discover later that you cannot reproduce it or anything like
it for similar conditions or parameters. When the commands are in clear
text files, with easily read, well-commented code, you have a very good idea
of how a particular result was obtained. And you will be able to reproduce
it and similar calculations as often as you please.
The Matlab comment character is a percent sign (%). Lines starting
with % are not read as Matlab commands and can contain any text.
Similarly, any text following a % later on a line is treated as a comment
and is not interpreted as a Matlab command.
A Matlab script file is a text file with the extension .m. Matlab script
files should start off with comments that identify the author, the date, and
a brief description of the intent of the calculation that the file performs.
Matlab script files are invoked by typing their names without the .m at
the Matlab command line or by using their names inside another Matlab
file. Invoking the script causes the commands in the script to be executed,
in order.
1 For example, if the variable A represents a matrix A, its components A_ij are represented
by A(i,j) in Matlab and Fortran but by A[i][j] in C, C++ and Java.
Matlab function files are also text files with the extension .m, but the
first non-comment line must start with the word function and be of the
form

function outputVariable = functionName(inputParameter)
This defining line is called the “signature” of the function. If there is more
than one input parameter, the parameters are separated by commas. If a func-
tion has no input parameters, they, and the parentheses, can be omitted.
Similarly, a function need not have output variables. A function can have
several output variables, in which case they are separated by commas and
enclosed in brackets as

function [output1, output2] = functionName(input1, input2)
The name of the function must be the same as the file name. Comment
lines can appear either before or after the signature line, but not both, and
should include the following.
(1) The first line following the signature (or the first line of the file) should
repeat the signature (I often leave out the word “function”) to provide
a reminder of the usage of the function.
(2) Brief description of the mathematical task the function performs.
(3) Description of all the input parameters.
(4) Description of all the output parameters.
(5) Any special declarations.
Variables should be given names longer than a few letters, and the names should indi-
cate the meaning of the quantity. For example, if you were using Matlab
to generate a matrix containing a table of squares of numbers, you might
name the table, for example, tableOfSquares or table of squares.
Once you have used a variable name, it is bad practice to re-use it to
mean something else. It is sometimes necessary to do so, however, and the
statement

clear varOne varTwo
should be used to clear the two variables varOne and varTwo before they
are re-used. This same command is critical if you re-use a variable name
but intend it to have smaller dimensions.
Matlab has a few reserved names. You should not use these as variable
names in your files. If you do use such variables as i or pi, they will lose
their special meaning until you clear them. Reserved names include ans,
pi, eps, i, j, Inf, NaN, realmax, and realmin.
Exercise B.1. Start up Matlab or Octave and use it to answer the fol-
lowing questions.
(a) What are the values of the reserved variables pi, eps, realmax, and
realmin?
(b) Use the “format long” command to display pi in full precision and
“format short” to return Matlab to its default, short, display.
(c) Set the variable a=1, the variable b=1+eps, the variable c=2, and the
variable d=2+eps. What is the difference in the way that Matlab
displays these values?
(d) Do you think the values of a and b are different? Is the way that
Matlab formats these values consistent with your idea of whether
they are different or not?
(e) Do you think the values of c and d are different? Explain your answer.
(f) Choose a value and set the variable x to that value.
(g) What is the square of x? Its cube?
(h) Choose an angle θ and set the variable theta to its value (a number).
(i) What is sin θ? cos θ? Angles can be measured in degrees or radians.
Which of these has Matlab used?
Matlab treats all its variables as though they were matrices. Important
subclasses of matrices include row vectors (matrices with a single row and
possibly several columns) and column vectors (matrices with a single col-
umn and possibly several rows). One important thing to remember is that
you don’t have to declare the size of your variable; Matlab decides how big
the variable is when you try to put a value in it. The easiest way to define
a row vector is to list its values inside of square brackets, and separated by
spaces or commas:
rowVector = [ 0, 1, 3, 6, 10 ]
The easiest way to define a column vector is to list its values inside of square
brackets, separated by semicolons or line breaks.
columnVector1 = [ 0; 1; 3; 6; 10 ]
columnVector2 = [ 0
1
9
36
100 ]
(It is not necessary to line the entries up, but it makes it look nicer.) Note
that rowVector is not equal to columnVector1 even though each of their
components is the same.
Matlab has a special notation for generating a set of equally spaced
values, which can be useful for plotting and other tasks. The format is:

start : increment : finish

or

start : finish

where the increment defaults to 1 in the second form. For example, the
even numbers from 10 to 20 can be generated as
evens = 10 : 2 : 20
Sometimes, you’d prefer to specify the number of items in the list, rather
than their spacing. In that case, you can use the linspace function, which
has the form

linspace(firstValue, lastValue, numberOfValues)

in which case we could generate six even numbers with the command:

evens = linspace(10, 20, 6)
As a general rule, use the colon notation when the increment is an integer
or when you know what the increment is and use linspace when you know
the number of values but not the increment.
Another nice thing about Matlab vector variables is that they are
flexible. If you decide you want to add another entry to a vector, it’s very
easy to do so. To add the value 22 to the end of our evens vector:
evens = [ evens, 22 ]
and you could just as easily have inserted a value 8 before the other entries,
as well.
Even though the number of elements in a vector can change, Matlab
always knows how many there are. You can request this value at any time
by using the numel function. For instance,
numel ( evens )
should yield the value 7 (the 6 original values of 10, 12, ... 20, plus the value
22 tacked on later). In the case of matrices with more than one nontrivial
dimension, the numel function returns the product of the dimensions. The
numel of the empty vector is zero. The size function returns a vector
containing two values: the number of rows and the number of columns (or
the numbers along each of the dimensions for arrays with more than two
dimensions). To get the number of rows of a variable v, use size(v,1) and
to get the number of columns use size(v,2). For example, since evens is
a row vector, size( evens, 1)=1 and size( evens, 2)=7, one row and
seven columns.
To specify an individual entry of a vector, you need to use index no-
tation, which uses round parentheses enclosing the index of an entry. The
first element of an array has index 1 (as in Fortran, but not C and Java).
Thus, if you want to alter the third element of evens, you could say
evens(3) = 7
(a) Use the linspace function to create a row vector called meshPoints
containing exactly 500 values with values evenly spaced between -1 and
1. Do not print all 500 values!
(b) What expression will yield the value of the 55th element of meshPoints?
(c) Use the numel function to confirm the vector has length 500.
(d) Produce a plot of a sinusoid on the interval [−1, 1] using the command
plot(meshPoints,sin(2*pi*meshPoints))
In its very simplest form, the signature of the plot function is
plot(array of x values, array of y values)
The arrays, of course, need to have the same numbers of elements.
The plot function has more complex forms that give you considerable
control over the plot. Use doc plot for further documentation.
Matlab provides a large assembly of tools for matrix and vector manip-
ulation. The following exercise illuminates the use of these operations by
example.
(d) The sum of a row vector and a column vector makes no sense in the
context of this book. Old versions of Matlab flag attempts to add
row vectors to column vectors as errors, but current versions allow
it. The result is a matrix. In the context of this book, this matrix
result is always wrong and is a recurring source of programming errors.
Compute
Look carefully at the result. When you are testing a program, be alert
for unexpected appearances of matrices and search for sums of row and
column vectors that might cause these appearances!
(e) You can do row-column matrix multiplication. Compute
mat1Transpose = mat1’
rowVec2 = colVec3’
(h) Matrix operations such as determinant and trace are available, too.
(i) You can pick certain elements out of a vector, too. Use the following
command to find the smallest element in a vector rowVec1.
min(rowVec1)
(j) The min and max functions work along one dimension at a time. They
produce vectors when applied to matrices.
max(mat1)
(k) You can compose vector and matrix functions. For example, use the
following expression to compute the max norm of a vector.
max(abs(rowVec1))
(l) How would you find the single largest element of a matrix?
(m) As you know, a magic square is a matrix all of whose row sums, column
sums and the sums of the two diagonals are the same. (One diagonal of
a matrix goes from the top left to the bottom right, the other diagonal
goes from top right to bottom left.) Show by direct computation that
if the matrix A is given by

A = magic(100)

then it has 100 row sums (one for each row), 100 column sums (one
for each column) and two diagonal sums. These 202 sums should all
be exactly the same, and you could verify that they are the same by
printing them and “seeing” that they are the same. It is easy to miss
small differences among so many numbers, though. Instead, verify that
A is a magic square by constructing the 100 column sums (without
printing them) and computing the maximum and minimum values of
the column sums. Do the same for the 100 row sums, and compute the
two diagonal sums. Check that these six values are the same. If the
maximum and minimum values are the same, the flyswatter principle
says that all values are the same.
Hints:
• Use the Matlab min and max functions.
• Recall that sum applied to a matrix yields a row vector whose values
are the sums of the columns.
• The Matlab function diag extracts the main diagonal of a matrix, so
sum(diag(A)) gives one of the diagonal sums; the composition of functions
sum(diag(fliplr(A))) computes the sum of the other diagonal.
(n) Suppose we want a table of integers from 0 to 9, their squares and
cubes. We could start with
integers = 0 : 9
but now we’ll get an error when we try to multiply the entries of
integers by themselves.
squareIntegers = integers * integers
Realize that Matlab deals with vectors, and the default multiplication
operation with vectors is row-by-column multiplication. What we want
here is element-by-element multiplication, so we need to place a period
in front of the operator:
squareIntegers = integers .* integers
Now we can define cubeIntegers and fourthIntegers in a similar
way.
cubeIntegers = squareIntegers .* integers
fourthIntegers = squareIntegers .* squareIntegers
Finally, we would like to print them out as a table. integers,
squareIntegers, etc. are row vectors, so make a matrix whose columns
consist of these vectors and allow Matlab to print out the whole ma-
trix at once.
tableOfPowers=[integers’, squareIntegers’, ...
cubeIntegers’, fourthIntegers’]
(The “. . . ” tells Matlab that the command continues on the next
line.)
(o) Compute the squares of the values in integers alternatively using the
exponentiation operator as:
sqIntegers = integers .^ 2
and check that the two calculations agree with the command
norm(sqIntegers-squareIntegers)
that should result in zero.
(p) You can add constants to vectors and matrices. Compute
squaresPlus1=squareIntegers+1;
(q) Watch out when you use vectors. The multiplication, division and
exponentiation operators all have two possible forms, depending on
whether you want to operate on the arrays, or on the elements in the
arrays. In all these cases, you need to use the period notation to force
element-by-element operations. Compute
squareIntegers./squaresPlus1
and also
squareIntegers/squaresPlus1
(r) You can pick out rows and columns of a matrix using index vectors and
the colon notation. Compute

tableOfCubes = tableOfPowers(:,[1,3])
tableOfOddCubes = tableOfPowers(2:2:end,[1,3])
tableOfEvenFourths = tableOfPowers(1:2:end,1:3:4)
(s) You have already seen the Matlab function magic(n). Use it to con-
struct a 10 × 10 matrix.
A = magic(10)
The general form of the Matlab if statement is:

if logical condition
Matlab statement . . .
...
elseif logical condition
...
else
...
end
Note that elseif is one word! Using two words else if changes the
statement into two nested if statements with possibly a very different
meaning, and a different number of end statements.
If v is a Matlab vector, then the Matlab function numel gives its number
of elements, and the following code will compute the infinity norm. Note
how indentation helps make the code understandable. (Matlab already
has a norm function to compute norms, but this is how it could be done.)
N = numel(v);
nrm = abs(v(1));
for n=2:N
if abs(v(n)) > nrm
nrm=abs(v(n)); % largest value up to now
end
end
nrm % no semicolon: value is printed
(d) What is the first value that nrm takes on? (5)
(e) How many times is the statement with the comment “largest value
up to now” executed? (3)
(f) What are all the values taken by the variable nrm? (5,6,7,10)
(g) What is the final value of nrm? (10)
If you have to type everything at the command line, you will not get very
far. You need some sort of scripting capability to save the trouble of typing,
to make editing easier, and to provide a record of what you have done. You
also need the capability of making functions or your scripts will become too
long to understand. In the next exercise, you will write a script file.
Exercise B.5.
(a) Copy the code given above for the infinity norm into a file
named infnrm.m. Recall you can get an editor window from the
File→New→M-file menu or from the edit command in the command
windowpane. Don’t forget to save the file.
(b) Redefine the vector
v = [-35 -20 38 49 4 -42 -9 0 -44 -34];
(c) Execute the script m-file you just created by typing just its name
(infnrm) without the .m extension in the command windowpane. What
is the infinity norm of this vector? (49)
(d) The usual Euclidean or 2-norm is defined as

    ‖v‖₂ = ( Σ_{n=1}^{N} v_n² )^{1/2} .                    (B.2)
Copy the following Matlab code to compute the 2-norm into a file
named twonrm.m.
% find the two norm of a vector v
% your name and the date
N = numel(v);
nrm = v(1)^2;
for n=2:N
nrm = nrm + v(n)^2;
end
nrm=sqrt(nrm) % no semicolon: value is printed
(e) Using the same vector v, execute the script twonrm. What are the first
four values the variable nrm takes on? (1625, 3069, 5470, 5486) What
is its final value? (102.0931)
(f) Look carefully at the mathematical expression (B.2) and the Matlab
code in twonrm.m. The way one translates a mathematical summa-
tion into Matlab code is to follow the steps:
(i) Set the initial value of the sum variable (nrm in this case) to zero
or to the first term.
(ii) Put an expression adding subsequent terms, one at a time, inside
a loop. In this case it is of the form nrm=nrm+something.
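As a small illustration of this pattern (the particular sum is an assumption,
not from the text), the following lines accumulate the partial sums of 1/n²
for n = 1, . . . , 100:

S = 0;               % step (i): initialize the sum variable
for n = 1:100
    S = S + 1/n^2;   % step (ii): add one term per trip through the loop
end
S                    % no semicolon: value is printed (close to pi^2/6)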
Script files are very convenient, but they have drawbacks. For example,
if you had two different vectors, v and w, for which you wanted norms, it
would be inconvenient to use infnrm or twonrm. It would be especially
inconvenient if you wanted to get, for example, ‖v‖₂ + 1/‖w‖∞. This in-
convenience is avoided by using function m-files. Function m-files define
your own functions that can be used just like Matlab functions such as
sin(x), etc. In the following exercise, you will write two function m-files.
Exercise B.6.
(a) Copy the file infnrm.m to a file named infnorm.m. (Look carefully, the
names are different! You can use “save as” or cut-and-paste to do the
copy.) Add the following lines to the beginning of the file:

function nrm = infnorm(v)
% nrm = infnorm(v)
% compute the infinity norm (largest absolute entry) of the vector v
% your name and the date
(b) The first line of a function m-file is called the “signature” of the function.
The first comment line repeats the signature in order to explain the
“usage” of the function. Subsequent comments explain the parameters
(such as v) and the output (such as norm) and, if possible, briefly explain
the methods used. The function name and the file name must agree.
(c) Place a semicolon on the last line of the file so that nothing will normally
be printed by the function.
(d) In the command windowpane, type

help infnorm
This command will repeat the first lines of comments (up to a blank
line or a line of code) and provides a quick way to refresh your memory
of how the function is to be called and what it does.
(e) Invoke the function in the command windowpane by typing
infnorm(v)
(f) Repeat the above steps to define a function named twonorm.m from the
code in twonrm.m. Be sure to put comments in.
(g) Define two vectors
a = [ -43 -37 24 27 37 ];
b = [ -5 -4 -29 -29 30 ];
and find the infinity norm of a and the two-norm of b with the
commands
aInfinity = infnorm(a)
bTwo = twonorm(b)
Note that you no longer need to use the letter v to denote the vector,
and it is easy to manipulate the values of the norms.
Matlab provides several functions related to the condition number of a
matrix.
(1) The function cond computes the condition number of a matrix as pre-
sented in Definition 4.7.
(2) The function condest is an estimate of the 1-norm condition number.
(3) The function rcond is an estimate of the reciprocal of the condition
number.
B.9 Debugging
You are urged to adopt these and any other practices that you find help
you avoid bugs.
Values All debuggers allow you to query the current value of variables in
the current function. In Matlab and in several other debuggers, this
can be accomplished by placing the cursor over the variable and holding
it stationary.
Step Execute one line of source code from the current location. If the line
is a function call, complete the function call and continue in the current
function.
Step in If the next line of source code is a function call, step into that
function, so that the first line of the function is the line that is displayed.
You would normally use this for functions you suspect contribute to the
bug but not for Matlab functions or functions you are confident are
correct.
Breakpoints It is usually inconvenient to follow a large program from its
beginning until the results of a bug become apparent. Instead, you set
“breakpoints”, which are places in the code that cause the program to
stop and display source code along with values of variables. If you find
a program stopping in some function, you can set a breakpoint near the
beginning of that function and then track execution from that point on.
Conditional breakpoints Matlab provides for breakpoints based on
conditions. Numerical programs sometimes fail because the result of
some calculation is unreasonable or unexpected; a conditional breakpoint
can stop execution as soon as such a value appears.
The remarks in this section are specific to Matlab and, to some extent,
Octave. These remarks cannot be generalized to languages such as C, C++
and Fortran, although Fortran shares the array notation and use of the
colon with Matlab.
It is sometimes possible to substantially reduce execution times for some
Matlab code by reformulating it in a mathematically equivalent manner
or by taking advantage of Matlab’s array notation. In this section, a few
strategies are presented for speeding up programs similar to the pseudocode
examples presented in this book.
The simplest timing tools in Matlab are the tic and toc commands.
These commands are used by calling tic just before the segment of code or
function that is being timed, and toc just after the code is completed. The
toc call results in the elapsed time since the tic call being printed. Care
must be taken to place them inside a script or function file or on the same
line as the code to be timed, or else it will be your typing speed that is
being measured. Consider, for example, the following loop, which fills a
vector with values of sin one component at a time (with N a large integer):
3 Division by zero results in a special illegal value denoted inf. The result of 0/0, and
of some arithmetic involving inf (such as inf-inf or inf/inf), is a different illegal value
denoted NaN for “Not a Number”.
g=zeros(N,1);
for i=1:N
g(i)=sin(i);
end
This loop takes about 3.93 seconds on the computer mentioned above. A
speedup of almost a factor of two is available with the simple trick of cre-
ating a vector, i=(1:N), consisting of the consecutive integers from 1 to N,
as in the following code.
g=zeros(N,1);
i=1:N; % i is a vector
g(i)=sin(i); % componentwise application of sin
This code executes in 2.04 seconds. Once the loop has been eliminated, the
code can be streamlined to pick up another 10%.
g=sin(1:N);
and this code executes in 1.82 seconds, for a total improvement of more
than a factor of two.
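As a concrete illustration of how such timings might be collected (the value
of N below is an assumption, not the one behind the times quoted above),
the two extreme variants can be wrapped in tic and toc inside a script:

N = 10000000;          % assumed problem size

tic                    % time the componentwise loop
g = zeros(N,1);
for i = 1:N
    g(i) = sin(i);
end
toc

tic                    % time the fully vectorized version
g = sin(1:N);
toc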
Sometimes dramatic speed improvements are available through careful
consideration of what the code is doing. The MPP2D matrix is available in
Matlab through the gallery function. This function provides a “rogues
gallery” of matrices that can be used for testing algorithms. Recall that the
MPP2D matrix is tridiagonal, and hence quite sparse. An LU factorization
is available using Matlab’s lu function, and, given a right hand side vector
b, the forward and backward substitutions can be done using mldivide (the
“\” operator).
Consider the following code:
N=4000;
A=gallery(’tridiag’,N);
[L,U]=lu(A);
b=ones(N,1);
tic;x=U\L\b;toc
tic;y=U\(L\b);toc
On the same computer mentioned above, the computation of x takes 1.06
seconds, dramatically slower than the 0.0005 seconds needed to compute
y. The reason is that U\L\b means the same as (U\L)\b. In this case,
both U and L are bidiagonal, but (U\L) has nonzeros everywhere above the
diagonal and also on the lower subdiagonal. It is quite large and it takes
a long time to compute. Once it is computed, a further solve with right
hand side b reduces the result back to a vector. In contrast, U\(L\b) first
computes the vector L\b by a simple bidiagonal forward substitution and
then computes the vector y with another bidiagonal back substitution.
Bibliography
Barrett, R., Berry, M., Chan, T. F., Demmel, J., Donato, J., Dongarra, J.,
Eijkhout, V., Pozo, R., Romine, C., and van der Vorst, H. (1994). Templates
for the Solution of Linear Systems: Building Blocks for Iterative Methods
(SIAM, Philadelphia, PA).
Davis, T. (2004). Algorithm 832: UMFPACK V4.3 — an unsymmetric-pattern mul-
tifrontal method, ACM Transactions on Mathematical Software (TOMS)
30, 2, pp. 196–199.
Faber, V. and Manteuffel, T. (1984). Necessary and sufficient conditions for the
existence of a conjugate gradient method, SIAM Journal on Numerical
Analysis 21, 2, pp. 352–362.
Gilbert, J. R., Moler, C., and Schreiber, R. (1992). Sparse matrices in MATLAB:
Design and implementation, SIAM Journal on Matrix Analysis and Appli-
cations 13, 1, pp. 333–356.
Hageman, L. A. and Young, D. M. (1981). Applied iterative methods (Academic
Press, New York), ISBN 0123133408; 9780123133403.
Herstein, I. N. (1964). Topics in algebra, 1st edn. (Blaisdell Pub. Co, New York,
N.Y).
Patterson, M. R. (2011). Grace Murray Hopper, Rear Admiral, United States Navy,
http://www.arlingtoncemetery.net/ghopper.htm.
Voevodin, V. V. (1983). The question of non-self-adjoint extension of the con-
jugate gradients method is closed, USSR Computational Mathematics and
Mathematical Physics 23, 2, pp. 143–144.
von Neumann, J. and Goldstine, H. H. (1947). Numerical inverting of matrices of
high order, Bulletin of the American Mathematical Society 53, 11, pp. 1021–
1100.
Watkins, D. S. (1982). Understanding the QR algorithm, SIAM Review 24, 4,
pp. 427–440.