MATH3230B Numerical Analysis
Tutorial 4
1 Recall:
1. Newton’s method for general nonlinear systems
Consider the following system of nonlinear equations:
\[
\begin{aligned}
f_1(x_1, x_2, \dots, x_n) &= 0,\\
f_2(x_1, x_2, \dots, x_n) &= 0,\\
&\;\;\vdots\\
f_n(x_1, x_2, \dots, x_n) &= 0,
\end{aligned}
\tag{1}
\]
where each $f_i(x_1, x_2, \dots, x_n)$ is a nonlinear function of the $n$ variables $x_1, x_2, \dots, x_n$. The system can be written
simply as
\[
F(x) = 0,
\]
where $F(x) = (f_1(x), f_2(x), \dots, f_n(x))^T$ and $x = (x_1, x_2, \dots, x_n)^T$.
Newton’s method for solving $F(x) = 0$ is as follows: given $x_0$, for $k = 0, 1, 2, \dots$, do the following, where $F'(x_k)$ denotes the Jacobian matrix of $F$ at $x_k$.
(a) Compute $d_k$ by solving
\[
F'(x_k)\, d_k = -F(x_k).
\]
(b) Update $x_{k+1}$ by
\[
x_{k+1} = x_k + d_k.
\]
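The two steps above can be sketched in Python as follows. This is a minimal illustration, not part of the original notes: the function name `newton_system`, the stopping tolerance, and the 2×2 test system are all assumptions chosen for the example.

```python
import numpy as np

def newton_system(F, J, x0, tol=1e-12, max_iter=50):
    """Newton's method for F(x) = 0, where J(x) returns the Jacobian F'(x)."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:      # stop once the residual is tiny
            return x, k
        d = np.linalg.solve(J(x), -Fx)    # step (a): solve F'(x_k) d_k = -F(x_k)
        x = x + d                         # step (b): x_{k+1} = x_k + d_k
    return x, max_iter

# Hypothetical test system (not from the notes): x^2 + y^2 = 4, x*y = 1.
def F(v):
    x, y = v
    return np.array([x**2 + y**2 - 4.0, x * y - 1.0])

def J(v):
    x, y = v
    return np.array([[2 * x, 2 * y], [y, x]])

root, iters = newton_system(F, J, [2.0, 0.5])
print(root, iters)
```

Note that the step $d_k$ is obtained from a linear solve, so the inverse of the Jacobian is never formed.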
2. Vector norm:
Think of $\|\cdot\|$ as a function whose input, marked by the dot, is a vector.
Given an $n$-dimensional vector
\[
x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix},
\]
a vector norm is a function $\|\cdot\| : \mathbb{R}^n \to [0, \infty)$ that satisfies the following properties:
(a) $\|x\| > 0$ when $x \neq 0$, and $\|x\| = 0$ iff $x = 0$;
(b) $\|\lambda x\| = |\lambda|\,\|x\|$ for any $\lambda \in \mathbb{R}$;
(c) $\|x + y\| \le \|x\| + \|y\|$.
(Here we simplify the discussion and assume that vectors are in $\mathbb{R}^n$.)
For example, the Euclidean norm (the “distance” we use in daily life) in $\mathbb{R}^n$ is
\[
\|x\|_2 = \left( \sum_{i=1}^{n} x_i^2 \right)^{1/2}.
\]
Similarly, the sup-norm is defined by
\[
\|x\|_\infty = \max_{1 \le i \le n} |x_i|,
\]
and the 1-norm by
\[
\|x\|_1 = \sum_{i=1}^{n} |x_i|.
\]
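As a quick illustration of the three norms just defined, here is a small sketch in plain Python. The helper names `norm_1`, `norm_2`, `norm_inf` and the sample vectors are assumptions made for this example only.

```python
import math

def norm_1(x):
    # 1-norm: sum of absolute values
    return sum(abs(xi) for xi in x)

def norm_2(x):
    # Euclidean norm: square root of the sum of squares
    return math.sqrt(sum(xi * xi for xi in x))

def norm_inf(x):
    # sup-norm: largest absolute component
    return max(abs(xi) for xi in x)

x = [3.0, -4.0, 0.0]
y = [1.0, 2.0, -2.0]
print(norm_1(x), norm_2(x), norm_inf(x))   # 7.0 5.0 4.0

# Property (c), the triangle inequality, checked on this one pair of vectors.
s = [xi + yi for xi, yi in zip(x, y)]
for norm in (norm_1, norm_2, norm_inf):
    print(norm(s) <= norm(x) + norm(y))    # True
```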
3. Matrix norm:
The matrix norm in this course always means the matrix norm induced by a vector norm, that is, the matrix norm
associated with a given vector norm. The matrix norm $\|\cdot\|$ is defined to be
\[
\|A\| = \sup \{ \|Au\| : u \in \mathbb{R}^n, \ \|u\| = 1 \}
\]
or, equivalently,
\[
\|A\| = \sup \left\{ \frac{\|Au\|}{\|u\|} : u \in \mathbb{R}^n, \ \|u\| \neq 0 \right\}.
\]
Since we are working in finite dimensions, the supremum is attained, so we can replace the supremum “sup” by the maximum “max” as
follows:
\[
\|A\| = \max \{ \|Au\| : u \in \mathbb{R}^n, \ \|u\| = 1 \}
\]
or
\[
\|A\| = \max \left\{ \frac{\|Au\|}{\|u\|} : u \in \mathbb{R}^n, \ \|u\| \neq 0 \right\}.
\]
Intuitively, the norm of a matrix measures the “distance of the matrix from the zero matrix”,
or the largest factor by which the transformation $u \mapsto Au$ can stretch a vector.
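Similar to the vector norm case, the matrix norm has the following properties for $A, B \in \mathbb{R}^{n \times n}$ and $\alpha \in \mathbb{R}$:
(a) $\|A\| \ge 0$, and $\|A\| = 0$ if and only if $A = 0$;
(b) $\|A + B\| \le \|A\| + \|B\|$;
(c) $\|\alpha A\| = |\alpha|\,\|A\|$;
(d) $\|AB\| \le \|A\|\,\|B\|$ (submultiplicativity).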
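To make the definition concrete, the sketch below works with the matrix norm induced by the sup-norm, for which the maximum in the definition is known to equal the largest absolute row sum. The matrices `A` and `B`, the helper `induced_inf_norm`, and the random sampling are assumptions for illustration; sampling only gives a lower estimate of the maximum, not a proof.

```python
import numpy as np

A = np.array([[1.0, -2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [-1.0, 2.0]])

def induced_inf_norm(M):
    # Known closed form for the norm induced by ||.||_inf: max absolute row sum.
    return np.abs(M).sum(axis=1).max()

# Definition: max of ||Au||_inf over ||u||_inf = 1, estimated by random sampling.
rng = np.random.default_rng(0)
U = rng.uniform(-1.0, 1.0, size=(2, 1000))
U = U / np.abs(U).max(axis=0)          # scale every column to ||u||_inf = 1
sampled = np.abs(A @ U).max()          # largest ||Au||_inf over the sample

print(sampled <= induced_inf_norm(A))                                        # True
print(induced_inf_norm(A @ B) <= induced_inf_norm(A) * induced_inf_norm(B))  # True
```

For this `A` the maximum is attained at $u = (1, 1)^T$, where $Au = (-1, 7)^T$ and $\|Au\|_\infty = 7$.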
4. Convergence of Newton’s method for general nonlinear systems:
Assume $F$ satisfies the following assumptions:
(a) There is a solution $x^*$ to the equation $F(x) = 0$;
(b) The Jacobian matrix $F'(x^*)$ is nonsingular;
(c) The Jacobian $F' : \Omega \to \mathbb{R}^{n \times n}$ is Lipschitz continuous with Lipschitz constant $\gamma$, i.e. $\|F'(x) - F'(y)\| \le \gamma \|x - y\|$ for all $x, y \in \Omega$; in particular,
\[
\|F'(x) - F'(x^*)\| \le \gamma \|x - x^*\|.
\]
Theorem 1 (Quadratic convergence of Newton’s method). Under the three assumptions above, there exist
constants $K > 0$ and $\delta > 0$ such that for any $x_0 \in B_\delta(x^*)$, the sequence $\{x_n\}$ generated by Newton’s method
satisfies $x_n \in B_\delta(x^*)$ and
\[
\|x_{n+1} - x^*\| \le K \|x_n - x^*\|^2, \qquad n = 0, 1, 2, \dots
\]
So Newton’s method converges quadratically.
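The estimate in Theorem 1 can be observed numerically by tracking the ratios $\|x_{n+1} - x^*\| / \|x_n - x^*\|^2$, which should stay bounded. The sketch below does this for a hypothetical 2×2 system (not from the notes) whose root is known in closed form; the starting point and the cutoff `1e-7` used to skip rounding-dominated steps are likewise assumptions.

```python
import numpy as np

# Hypothetical test system: x^2 + y^2 = 4, x*y = 1.
def F(v):
    x, y = v
    return np.array([x**2 + y**2 - 4.0, x * y - 1.0])

def J(v):
    x, y = v
    return np.array([[2 * x, 2 * y], [y, x]])

# One exact root of the test system, used only to measure the error.
x_star = np.array([(np.sqrt(6) + np.sqrt(2)) / 2, (np.sqrt(6) - np.sqrt(2)) / 2])

x = np.array([2.0, 0.5])
errors = [np.linalg.norm(x - x_star)]
for _ in range(4):
    x = x + np.linalg.solve(J(x), -F(x))      # one Newton step
    errors.append(np.linalg.norm(x - x_star))

# Ratios e_{n+1} / e_n^2; skip steps where e_n is already near rounding level.
ratios = [errors[n + 1] / errors[n] ** 2 for n in range(4) if errors[n] > 1e-7]
print(errors)
print(ratios)
```

The errors drop roughly by squaring at each step, while the ratios stay of moderate size, which is the behaviour the constant $K$ describes.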
2 Exercises:
Please do the star problem (*) in tutorial class and finish the rest after class.
1. Consider the following system of linear equations:
\[
\begin{aligned}
x - 3z &= 2,\\
2x - 2y + z &= 1,\\
-y + 3z &= -2.
\end{aligned}
\]
(a) Formulate Newton’s method for solving the above system of linear equations.
(b) Find the number of iterations Newton’s method needs to return the exact solution with initial guess
$X_0 = (0, 0, 0)^T$.
(c) What does Newton’s method reduce to for the linear system $Ax = b$ given by
\[
\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \dots + a_{1n} x_n &= b_1,\\
a_{21} x_1 + a_{22} x_2 + \dots + a_{2n} x_n &= b_2,\\
&\;\;\vdots\\
a_{n1} x_1 + a_{n2} x_2 + \dots + a_{nn} x_n &= b_n,
\end{aligned}
\]
where $A$ is a nonsingular matrix?
Solution. (a) Let
\[
\begin{aligned}
f_1(x, y, z) &= x - 3z - 2,\\
f_2(x, y, z) &= 2x - 2y + z - 1,\\
f_3(x, y, z) &= -y + 3z + 2.
\end{aligned}
\]
Denote $X = (x, y, z)^T$ and $F(X) = (f_1(X), f_2(X), f_3(X))^T$.
Note that
\[
F'(X) = \begin{pmatrix} 1 & 0 & -3 \\ 2 & -2 & 1 \\ 0 & -1 & 3 \end{pmatrix} := A.
\]
Given an initial value $X_0 = (x_0, y_0, z_0)^T$, Newton’s method gives
\[
X_{k+1} = X_k - F'(X_k)^{-1} F(X_k) = X_k - A^{-1} F(X_k).
\]
(b) With $X_0 = (0, 0, 0)^T$ we have $F(X_0) = (-2, -1, 2)^T$, so the first iteration of Newton’s method is
\[
X_1 = X_0 - A^{-1} F(X_0) = (5, 5, 1)^T,
\]
which is the solution of the system of linear equations. So one iteration returns the exact solution.
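(c) Since $f_j(x_1, \dots, x_n) = a_{j1} x_1 + a_{j2} x_2 + \dots + a_{jn} x_n - b_j$, we have $\frac{\partial f_j}{\partial x_i} = a_{ji}$. Hence,
\[
F'(X) = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix} := A.
\]
Further,
\[
F(X_0) = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix} X_0 - \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}
= A X_0 - b.
\]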
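Thus, given $X_0$, we have
\[
\begin{aligned}
X_1 &= X_0 - A^{-1}(A X_0 - b)\\
&= X_0 - A^{-1} A X_0 + A^{-1} b\\
&= A^{-1} b.
\end{aligned}
\]
So for any $X_0$, the first iterate $X_1$ is the exact solution of the linear system: Newton’s method reduces to solving $Ax = b$ directly.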
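The conclusion of Exercise 1 can be checked numerically with a short sketch. The matrix `A` and vector `b` are taken from the system in Exercise 1; the helper name `newton_step` and the second starting point are assumptions for illustration.

```python
import numpy as np

A = np.array([[1.0, 0.0, -3.0],
              [2.0, -2.0, 1.0],
              [0.0, -1.0, 3.0]])
b = np.array([2.0, 1.0, -2.0])

def newton_step(X):
    # Here F(X) = AX - b and F'(X) = A, so the Newton step solves A d = -(AX - b).
    d = np.linalg.solve(A, -(A @ X - b))
    return X + d

X1 = newton_step(np.zeros(3))
print(X1)                                  # approximately [5. 5. 1.]

X0_other = np.array([10.0, -7.0, 3.5])     # arbitrary starting point
print(newton_step(X0_other))               # approximately [5. 5. 1.] again
```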
2. (a) Please suggest a way to implement Newton’s method so that computing the inverse of the
Jacobian matrix can be avoided.
(b) Please explain briefly why we want to avoid computing the inverse of the Jacobian matrix.
Solution. (a) For $k = 0, 1, 2, \dots$, do the following:
i. Compute $d_k$ by solving
\[
F'(x_k)\, d_k = -F(x_k).
\]
ii. Update $x_{k+1}$ by
\[
x_{k+1} = x_k + d_k.
\]
(b) When the Jacobian matrix is large, computing its inverse at each iteration is inefficient and can be numerically unstable; solving the linear system in (a) directly, for example by Gaussian elimination, is cheaper and generally more accurate.
3. (a) This is part of the proof of the Banach Lemma. Consider a matrix $C \in \mathbb{R}^{n \times n}$ such that $\|C\| < 1$.
i. Show that
\[
\lim_{n \to \infty} C^n = 0,
\]
where $0$ is the zero matrix.
ii. Show that $I - C$ is invertible and
\[
(I - C)^{-1} = I + C + C^2 + \cdots
\]
(b) Please use the above results to finish the proof of the Banach Lemma.
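Solution. (a) i. Applying the submultiplicativity property (d) of the matrix norm from the Recall section repeatedly, we
have
\[
\|C^n\| = \|C\, C^{n-1}\| \le \|C\|\,\|C^{n-1}\| \le \|C\|^2 \|C^{n-2}\| \le \dots \le \|C\|^n.
\]
Since $\|C\| < 1$, we have $\|C\|^n \to 0$ as $n \to \infty$. Thus,
\[
\lim_{n \to \infty} \|C^n\| = 0.
\]
Therefore,
\[
\lim_{n \to \infty} C^n = 0.
\]
ii. A direct computation yields
\[
(I - C)(I + C + C^2 + \dots + C^n) = I + C + C^2 + \dots + C^n - (C + C^2 + \dots + C^{n+1}) = I - C^{n+1}.
\]
In view of the results above,
\[
I = I - \lim_{n \to \infty} C^n = \lim_{n \to \infty} (I - C^n) = \lim_{n \to \infty} (I - C)(I + C + C^2 + \dots + C^{n-1}) = (I - C)(I + C + C^2 + \cdots),
\]
where the series converges because $\sum_{k \ge 0} \|C^k\| \le \sum_{k \ge 0} \|C\|^k < \infty$. Since $I - C$ is a square matrix with a right inverse, it is invertible and $(I - C)^{-1} = I + C + C^2 + \cdots$.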
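(b) By the assumption of the Banach Lemma, $\|I - BA\| < 1$. Applying the result of (a)ii with $C = I - BA$, we know that
$BA = I - (I - BA)$ is invertible, so both $A$ and $B$ are invertible. Then we can write
\[
A^{-1} = (BA)^{-1} B = (I - (I - BA))^{-1} B = (I + (I - BA) + (I - BA)^2 + \cdots) B.
\]
Taking norms and summing the geometric series $\sum_{k \ge 0} \|I - BA\|^k = \frac{1}{1 - \|I - BA\|}$, this implies
\[
\|A^{-1}\| \le \frac{\|B\|}{1 - \|I - BA\|}.
\]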
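Both parts of Exercise 3 can be sanity-checked numerically. The sketch below is only an illustration under assumptions: the matrices `C`, `A`, `B` are hypothetical examples, the induced infinity-norm is used as the matrix norm, and `np.linalg.inv` appears purely to verify the results.

```python
import numpy as np

def norm(M):
    return np.linalg.norm(M, np.inf)   # induced infinity-norm (max row sum)

# Part (a): Neumann series for a hypothetical C with ||C|| < 1.
C = np.array([[0.2, -0.3], [0.1, 0.4]])
S, P = np.eye(2), np.eye(2)
for _ in range(60):
    P = P @ C          # P = C^k
    S = S + P          # S = I + C + ... + C^k
print(norm(C), norm(P))                              # ||C|| = 0.5, ||C^60|| is tiny
print(np.allclose(S, np.linalg.inv(np.eye(2) - C)))  # True

# Part (b): Banach Lemma bound for a hypothetical A and approximate inverse B.
A = np.array([[4.0, 1.0], [2.0, 3.0]])
B = np.diag([0.25, 1.0 / 3.0])
q = norm(np.eye(2) - B @ A)
print(q < 1, norm(np.linalg.inv(A)) <= norm(B) / (1 - q))   # True True
```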
4. Consider the following system of nonlinear equations
\[
F(x) = 0, \tag{2}
\]
where $F : \mathbb{R}^n \to \mathbb{R}^n$ is a vector-valued function.
(a) Please write down the three standard assumptions on $F$ under which Newton’s method works and
converges quadratically.
(b) Let $x^*$ denote a solution, so that $F(x^*) = 0$. Consider the function
\[
g(t) = F(x^* + (x - x^*)t), \qquad t \in [0, 1],
\]
and show that
\[
F(x) - F(x^*) = \int_0^1 F'(x^* + (x - x^*)t)(x - x^*)\, dt.
\]
(c) Please prove that there exists a $\delta > 0$ such that for all $x \in B_\delta(x^*)$ the following hold:
i. $\|F'(x) - F'(x^*)\| \le \gamma \|x - x^*\|$,
ii. $\|F'(x)\| \le 2\|F'(x^*)\|$,
iii. $\|F'(x)^{-1}\| \le 2\|F'(x^*)^{-1}\|$,
iv. $\frac{1}{2} \|F'(x^*)^{-1}\|^{-1} \|x - x^*\| \le \|F(x)\| \le 2\|F'(x^*)\|\,\|x - x^*\|$,
where $\gamma$ is the Lipschitz constant of the Jacobian $F'$.
(d) Based on the results from (b), (c) and your assumptions in (a), please prove the quadratic convergence
of Newton’s method for systems of nonlinear equations.
Solution. (a) The assumptions are:
1. There is a solution $x^*$ to the equation $F(x) = 0$.
2. The Jacobian matrix $F'(x^*)$ is nonsingular.
3. The Jacobian $F' : \Omega \to \mathbb{R}^{n \times n}$ is Lipschitz continuous.
(b) By the fundamental theorem of calculus,
\[
g(1) - g(0) = \int_0^1 \frac{dg}{dt}\, dt.
\]
Substituting $g(t) = F(x^* + (x - x^*)t)$, so that $g(1) = F(x)$, $g(0) = F(x^*)$ and, by the chain rule, $\frac{dg}{dt} = F'(x^* + (x - x^*)t)(x - x^*)$, we have
\[
F(x) - F(x^*) = \int_0^1 F'(x^* + (x - x^*)t)(x - x^*)\, dt.
\]
(c) Part of (iv): Note that if $x \in B_\delta(x^*)$ then $x^* + t(x - x^*) \in B_\delta(x^*)$ for all $0 \le t \le 1$. Using the result
from (b), $F(x^*) = 0$ and inequality (ii), we obtain the right side of (iv):
\[
\|F(x)\| \le \int_0^1 \|F'(x^* + t(x - x^*))\|\,\|x - x^*\|\, dt \le 2\|F'(x^*)\|\,\|x - x^*\|.
\]
For the left side, please refer to page 40 in the lecture notes.
Also, please read the proof of (i)-(iii) in the notes.
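(d) By the definition of Newton’s method, $F(x^*) = 0$ and the result from (b), we have
\[
\begin{aligned}
x_{n+1} - x^* &= x_n - x^* - F'(x_n)^{-1} F(x_n)\\
&= F'(x_n)^{-1} \left( F'(x_n)(x_n - x^*) - F(x_n) \right)\\
&= F'(x_n)^{-1} \left( F'(x_n)(x_n - x^*) - \int_0^1 F'(x^* + (x_n - x^*)t)(x_n - x^*)\, dt \right)\\
&= F'(x_n)^{-1} \int_0^1 \left( F'(x_n) - F'(x^* + (x_n - x^*)t) \right)(x_n - x^*)\, dt.
\end{aligned}
\]
By the Lipschitz continuity of $F'$, $\|F'(x_n) - F'(x^* + (x_n - x^*)t)\| \le \gamma (1 - t) \|x_n - x^*\|$, and $\int_0^1 (1 - t)\, dt = \frac{1}{2}$. Together with (c)iii for $x_n \in B_\delta(x^*)$, we have
\[
\|x_{n+1} - x^*\| \le \left( 2\|F'(x^*)^{-1}\| \right) \gamma \|x_n - x^*\|^2 / 2 = \gamma \|F'(x^*)^{-1}\|\,\|x_n - x^*\|^2.
\]
This completes the proof.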
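The integral identity from (b), on which the proof in (d) rests, can be checked numerically. The sketch below uses a hypothetical 2×2 system (not from the notes) with a root known in closed form, and approximates the integral by the midpoint rule with an assumed number of subintervals `m`; since the Jacobian of this particular system is linear in $t$ along the segment, the midpoint rule is exact here up to rounding.

```python
import numpy as np

# Hypothetical test system: x^2 + y^2 = 4, x*y = 1.
def F(v):
    x, y = v
    return np.array([x**2 + y**2 - 4.0, x * y - 1.0])

def J(v):
    x, y = v
    return np.array([[2 * x, 2 * y], [y, x]])

x_star = np.array([(np.sqrt(6) + np.sqrt(2)) / 2, (np.sqrt(6) - np.sqrt(2)) / 2])
x = np.array([2.0, 0.5])

# Midpoint rule on [0, 1] for the integral of F'(x* + t(x - x*)) (x - x*) dt.
m = 200
ts = (np.arange(m) + 0.5) / m
integral = sum(J(x_star + t * (x - x_star)) @ (x - x_star) for t in ts) / m

print(integral)
print(F(x) - F(x_star))   # agrees with the integral up to rounding
```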