Stiff Systems
1 Introduction. What is stiffness?
The main purpose of the following lectures is twofold. Firstly, to discuss from a user's point of view the concept of stiff problems, which appear so often in practical situations. Secondly, to give recipes that help to choose the right method to solve the stiff problem at hand. Following Lambert (1991), one should consider stiffness as a phenomenon exhibited by a system, rather than a property of it, because the word property is associated with the existence of a definition which is both comprehensive and precise, whereas it is difficult to come up with a satisfactory definition for the concept of stiffness. A convenient and constructive way of introducing the discussion about stiffness is in relation to the concept of linear stability of a numerical method used to calculate the numerical solution of the well-posed initial value problem (IVP)
y' = f(t, y),   t ∈ (0, T],
y(0) = a, prescribed,   (1.1)

where y ∈ C^m, m being the dimension of the system. Generalizing the technique of the linear stability analysis of a numerical method, we substitute (1.1) by the problem
y' = A(y − p(t)) + p'(t),   t ∈ (0, T],
y(0) = a, prescribed,   (1.2)

where A = diag(a_{ii}), i = 1, ..., m, a_{ii} ∈ C. Note that (1.2) is theoretically more relevant than it apparently seems to be, because if we wish to study the solution of (1.1) near a particular solution g(t) we should apply a Taylor series expansion and get
y' = J(t, g(t))(y − g(t)) + f(t, g(t)) = J(t, g(t))(y − g(t)) + g'(t),   (1.3)
where J(t, g(t)) is the Jacobian matrix, which is assumed to be slowly varying in t, so that, locally, J(t, g(t)) can be taken as a constant. For ease of exposition, we shall further assume that J is diagonalizable; hence, after application of the diagonalization procedure to (1.3), we obtain system (1.2). We remark that the latter hypothesis on J is not restrictive, because in the general case J admits a Jordan canonical form, so that (1.2) can be obtained from (1.3) with the matrix A now being the Jordan canonical form of J.
The solution of (1.2) is

y(t) = (a − p(0)) e^{At} + p(t).   (1.4)
The qualitative behavior of y(t) depends mainly on A. Thus, we can distinguish three cases. i) If for all i, Re(a_{ii}) > 0 and large, then the solution curves for various a will fan out as t increases and the problem will be difficult for any numerical method. We say that the problem is unstable. ii) If for all i, Re(a_{ii}) > 0 but small, the problem can be easily handled by any conventional numerical method, since the solution curves are more or less parallel. The IVP (1.1) is neutrally stable. iii) If for all i, Re(a_{ii}) < 0 and there are i and j, j ≠ i, such that Re(a_{jj})/Re(a_{ii}) is small, then the solution will tend to p(t) after a given time t called the initial transient. In this case the IVP is stable.
The characteristics that distinguish a stiff problem from a non-stiff one are encapsulated in the following statement (SG):
Statement 1 We say that a problem is stiff if the following conditions are fulfilled. A) No solution component is unstable, or equivalently, no eigenvalue of the Jacobian matrix has a real part which is at all large and positive, and at least some component is very stable, that is, at least one eigenvalue has a real part which is negative and large in magnitude. B) The solution is slowly varying with respect to the negative real part of the eigenvalues.
Some comments on the meaning of this statement are now in order.
1) Roughly speaking, condition B) of this statement means that the solution is smooth and the norm of its derivatives is much smaller than the norm of the derivatives of exp(At).
2) Implicitly recognized in Statement 1 is the fact that a problem may be stiff in some intervals of t and not in others. For example, if for all i, Re(a_{ii}) ≪ 0 (very negative), the problem will be stiff after the initial transient exp(At) has died out, but the problem is not stiff during the transient interval. This also applies to a linear problem with a constant Jacobian such that Re(a_{ii}) ≪ 0.
3) A stiff problem can exhibit several periods of rapid change (or can have various rapid transients at different time intervals), because the term p(t) may suddenly change.
4) A symptom of the potential presence of stiffness in a stable IVP (1.1) is the existence of components which change much faster than others, although we must point out that such a symptom is not necessarily an indication of stiffness, because stiffness depends on the differential equation rather than on the behavior of the solution itself.
1.1 Examples of stiff problems
The areas of chemical engineering, nonlinear mechanics, biochemistry and the life sciences are sources of stiff problems.
1.1.1 Chemical reaction systems.
A famous chemical reaction is the Oregonator reaction between HBrO_2, Br^-, and Ce(IV), described by Field and Noyes in 1974. The Oregonator is expressed mathematically by the following IVP
y_1' = 77.27 (y_2 + y_1 (1 − 8.375×10^{-6} y_1 − y_2)),
y_2' = (1/77.27) (y_3 − (1 + y_1) y_2),
y_3' = 0.161 (y_1 − y_3).   (1.5)
The stiffness of (1.5) is due to the fast variation of the components y_1 and y_3 as compared to y_2. If T is sufficiently large, the stiffness phenomenon may appear several times in the interval [0, T].
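To see the phenomenon in practice, the following minimal sketch (assuming NumPy and SciPy are available; the initial data, interval and tolerances are illustrative choices, not taken from these lectures) integrates (1.5) with an explicit Runge-Kutta method and with the implicit Radau method, and compares the number of steps each one needs:

    import numpy as np
    from scipy.integrate import solve_ivp

    def oregonator(t, y):
        # Right-hand side of (1.5)
        y1, y2, y3 = y
        return [77.27 * (y2 + y1 * (1.0 - 8.375e-6 * y1 - y2)),
                (y3 - (1.0 + y1) * y2) / 77.27,
                0.161 * (y1 - y3)]

    y0, T = [1.0, 2.0, 3.0], 30.0            # illustrative initial data
    for method in ("RK45", "Radau"):         # explicit vs implicit
        sol = solve_ivp(oregonator, (0.0, T), y0, method=method,
                        rtol=1e-6, atol=1e-9)
        print(method, "steps:", sol.t.size)

The explicit method typically needs orders of magnitude more steps, its step size being limited by stability rather than by accuracy.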
1.1.2 Reaction-diffusion systems
Problems in which the diffusion is modeled via the Laplace operator may become stiff when they are discretized in space by finite differences or finite elements. A typical example of such systems, which appear so often in mathematical biology, is the following:
u_t = α u_{xx} + a + u(uv − (b + 1)),
v_t = α v_{xx} + u(b − uv),   0 ≤ x ≤ 1, t ∈ (0, T],   (1.6a)

with initial and boundary conditions

u(x, 0) = 1 + sin(2πx),   v(x, 0) = c,
u(0, t) = u(1, t) = 1;   v(0, t) = v(1, t) = c,   (1.6b)
where a, b and c are real constants, and α is another positive constant which is called the diffusion coefficient. If we discretize the spatial derivatives by second-order finite differences on a grid of I points x_i = i/(I + 1), 1 ≤ i ≤ I, with spatial discretization parameter h = 1/(I + 1), we obtain the following IVP
u_i' = (α/h²)(u_{i+1} − 2u_i + u_{i−1}) + a + u_i(u_i v_i − (b + 1)),
v_i' = (α/h²)(v_{i+1} − 2v_i + v_{i−1}) + u_i(b − u_i v_i),
u_i(0) = 1 + sin(2πx_i),   v_i(0) = c,   i = 1, ..., I,
u_0(t) = u_{I+1}(t) = d;   v_0(t) = v_{I+1}(t) = f.   (1.7)
Setting u := (u_1, ..., u_I)^T, v := (v_1, ..., v_I)^T and y := (u, v)^T, it follows from (1.7) that

y' = f(y),

where

f(y) := (f_1(u, v), f_2(u, v))^T = (α/h²) diag(T, T) (u, v)^T + (R_1, R_2)^T,
with T the I×I tridiagonal matrix

T = tridiag(1, −2, 1) = [ −2   1
                           1  −2   1
                               ⋱    ⋱   ⋱
                                    1  −2 ],
R_1 = diag(a + u_i(u_i v_i − (b + 1))) and R_2 = diag(u_i(b − u_i v_i)).
Thus, the Jacobian matrix J = ∂(f_1, f_2)/∂(u, v) is the sum of a diffusion matrix and a reaction matrix, as

J = (α/h²) diag(T, T) + [ diag(2u_i v_i − b − 1)   diag(u_i²)
                          diag(b − 2u_i v_i)       diag(−u_i²) ].
In many problems of interest the reaction matrix represents a perturbation to the real symmetric diffusion matrix, whose eigenvalues are given by (Thomas)

λ_k = −(4α/h²) (sin(kπh/2))²,   1 ≤ k ≤ I.

Notice that λ_k takes values between 0 and −4α/h²; so that, for h small, the stiffness of this problem is caused by the diffusion term.
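As a quick numerical check of this claim, the following sketch (assuming NumPy; the values of α and I are arbitrary illustrations) builds the diffusion block (α/h²)T and compares its eigenvalues with the formula above:

    import numpy as np

    alpha, I = 0.02, 50                      # illustrative values
    h = 1.0 / (I + 1)
    # T = tridiag(1, -2, 1) of size I
    T = (np.diag(-2.0 * np.ones(I)) + np.diag(np.ones(I - 1), 1)
         + np.diag(np.ones(I - 1), -1))
    eigs = np.linalg.eigvalsh(alpha / h**2 * T)
    k = np.arange(1, I + 1)
    formula = -4.0 * alpha / h**2 * np.sin(k * np.pi * h / 2.0) ** 2
    print(np.allclose(np.sort(eigs), np.sort(formula)))  # True
    print("most negative eigenvalue:", eigs.min())       # O(-4*alpha/h**2)

Halving h roughly quadruples the most negative eigenvalue, which is precisely the growth of stiffness under spatial refinement.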
2 Why is it difficult to solve stiff problems?
In this section we shall analyze some of the reasons that make stiff problems difficult for conventional explicit numerical methods. A first approach to measuring the efficiency of a numerical method is to look at its accuracy and stability properties; so we shall examine the consequences of these two attributes as we solve stiff problems by conventional explicit methods. First, accuracy. To carry out our analysis, we consider the model problem (1.1) which, for ease of exposition, is solved by the explicit Euler scheme with a prescribed tolerance ε. Assuming the solution is sufficiently smooth, we implement our computer code in such a way that it can select the time step length Δt in order to give a numerical solution with the prescribed accuracy. Therefore,
y_{n+1} = y_n + Δt f(t_n, y_n) = y_n + Δt y_n'.   (2.1)
Taking y_n = y(t_n), the local truncation error (LTE) is expressed by

LTE = (Δt²/2) y''(t_n) + O(Δt³).   (2.2)
We further assume that the tolerance ε is such that

ε = ‖(Δt²/2) y''(t_n)‖,

hence

Δt_n ≤ (2ε / ‖y''(t_n)‖)^{1/2}.   (2.3)
Clearly, using (1.4) in (2.3) we can distinguish the following two cases:
1) For t small, (2.3) yields

Δt_n ≤ (2ε / ‖(a − p(0)) A²‖)^{1/2}.   (2.4a)

2) For t large, (2.3) yields

Δt_n ≤ (2ε / ‖p''(t_n)‖)^{1/2},   (2.4b)
because the exponential becomes very small. Since ‖p''‖ is small and ‖A²‖ is large, (2.4) tells us that in stiff problems we can achieve the prescribed accuracy using small time steps Δt during the initial transient, and large time steps after the initial transients have died out. Next, let us examine stability. We recall that stability is the property of a numerical method to keep the errors bounded as the calculation advances. Therefore, if we consider the global error at time t_n, e_n := y_n − y(t_n), it follows for the Euler method that

e_{n+1} = (1 + Δt_n A) e_n + LTE.
Hence, in order to keep e_n bounded we must have

‖1 + Δt_n A‖ ≤ 1,   or   −2 ≤ Δt_n a_{ii} ≤ 0.   (2.5)
This means that for stiff problems, for which ‖A‖ is too large, the selection of the length of the time step in an explicit method is made by the stability restriction, which, outside the transient region, imposes the use of a Δt that is clearly inefficient. But, recalling the definitions of the region of absolute stability S of a method and its associated stability function R(z), i.e.,

S := {z ∈ C : |R(z)| ≤ 1},   (2.6)

we see that (2.5) means that Δt_n is chosen in such a way that z = Δt_n max_i |λ_i|, λ_i ∈ C being the eigenvalues of A, is in S. Since S is small for explicit methods and large for implicit ones, another heuristic way to characterize stiff problems is the following:
Statement 2. Stiff problems are those for which explicit methods are inefficient.
Hence, the following recommendation makes sense:
Recommendation 1. In general, for a stiff problem it is better to use implicit schemes with time step selection based on accuracy.
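The step size restriction (2.5) is easy to observe numerically. The following sketch (plain Python; the values of λ and Δt are illustrative) applies the explicit Euler scheme (2.1) and the implicit Euler scheme, which appears later as (4.5a), to y' = λy with λ very negative, deliberately violating (2.5):

    # dt*lam = -10, far outside the explicit Euler stability interval [-2, 0]
    lam, dt, N = -1.0e4, 1.0e-3, 100
    y_exp = y_imp = 1.0
    for _ in range(N):
        y_exp = y_exp + dt * lam * y_exp   # explicit Euler: factor (1 + dt*lam)
        y_imp = y_imp / (1.0 - dt * lam)   # implicit Euler: factor 1/(1 - dt*lam)
    print(y_exp)   # (1 + dt*lam)**N = (-9)**100, astronomically large
    print(y_imp)   # (1/11)**100, decays like the exact solution

The explicit iterate grows like |1 + Δt λ|^n, whereas the implicit one is damped for any Δt > 0, which is the point of Recommendation 1.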
3 Linear stability definitions pertinent to stiffness
From the previous analysis on linear stability and stiff problems, we come to the conclusion that if the method employed to integrate a stiff problem has an absolute stability region S containing the whole left half-plane, then there will be no stability restriction on the time step, and therefore we can select the time step length based on accuracy considerations. At this point, it is convenient to characterize the linear stability requirements for stiff problems. Following the linear stability analysis of previous lectures, we consider the model problem (Dahlquist, 1963)

y' = λy,   λ ∈ C,   Re(λ) ≤ 0,   (3.1)

and set z = λΔt.
A-STABILITY
A method is said to be A-stable if S ⊇ C⁻ := {z : Re(z) ≤ 0}.
A-stability is a strong requirement, in particular for linear multistep methods, so it is natural to restrict the class of problems in some way and seek alternative definitions of stability that remove the restriction on the time step length for that class of problems.
A(α)-STABILITY
A method is said to be A(α)-stable, α ∈ (0, π/2), if S ⊇ {z : π − α < arg z < π + α}; it is said to be A(0)-stable if it is A(α)-stable for some α ∈ (0, π/2).
A_0-STABILITY
A method is said to be A_0-stable if S ⊇ {z : Re(z) < 0, Im(z) = 0}.
Under the observation that for many problems the eigenvalues responsible for the fastest transients all lie to the left of a line Re(z) = −a, where a is positive, while the remaining eigenvalues, responsible for the slower transients, have small negative real parts and are clustered close to the origin, Gear (1969) gives the following definition.
STIFFLY STABLE
Let R_1 and R_2 be two regions of the complex plane defined as R_1 := {z : Re(z) < −a}, R_2 := {z : −a ≤ Re(z) < 0, −c ≤ Im(z) ≤ c}, where a and c are positive constants. A method is said to be stiffly stable if S ⊇ R_1 ∪ R_2.
There are methods with a rational stability function R(z) for which A-stability is not as desirable a property as it seems to be, because for rational stability functions R(z) we may have that, for z real and very negative, |R(z)| is less than 1 but very close to 1, so that the stiff components are damped out very slowly. This motivates the following definition.
L-STABILITY
A method is said to be L-stable if it is A-stable and if, in addition,

R(z) → 0 as z → −∞.

This property is sometimes called stiff A-stability or strong A-stability.
It is worth noting the stability hierarchy

L-stability ⇒ A-stability ⇒ stiff stability ⇒ A(α)-stability ⇒ A(0)-stability ⇒ A_0-stability.   (3.2)
In the next sections, we shall examine some relevant implications of the stability hierarchy for members of the families of Runge-Kutta and multistep methods. This will help to establish practical criteria to choose a good method for the integration of stiff problems.
4 A-stability and Runge-Kutta methods
The important fact we must point out is given in the following statement (HW II):
Statement 3. No explicit Runge-Kutta method is A-stable.
Let us examine then some interesting relations between implicit methods and A-stability. First, we recall the general formula of an implicit Runge-Kutta (IRK) method of s stages:
Z_i = y_n + Δt Σ_{j=1}^{s} a_{ij} f(t_n + c_j Δt, Z_j),   1 ≤ i ≤ s,
y_{n+1} = y_n + Δt Σ_{i=1}^{s} b_i f(t_n + c_i Δt, Z_i),   (4.1a)
or, using the Butcher tableau,

c | A          c_1 | a_11  a_12  ...  a_1s
--+----   :=   c_2 | a_21  a_22  ...  a_2s
  | b^T        ... | ...
               c_s | a_s1  a_s2  ...  a_ss
               ----+----------------------
                   | b_1   b_2   ...  b_s     (4.1b)
To construct IRK methods the following simplifying assumptions are used (HNW I):

B(p):  Σ_{i=1}^{s} b_i c_i^{q−1} = 1/q,   1 ≤ q ≤ p,
C(η):  Σ_{j=1}^{s} a_{ij} c_j^{q−1} = c_i^q / q,   1 ≤ i ≤ s, 1 ≤ q ≤ η,
D(ζ):  Σ_{i=1}^{s} b_i c_i^{q−1} a_{ij} = (b_j/q)(1 − c_j^q),   1 ≤ j ≤ s, 1 ≤ q ≤ ζ.   (4.1c)
The relevance of these conditions is established in a theorem due to Butcher (1965), which says that if the coefficients b_i, c_i, a_{ij} of an RK method satisfy B(p), C(η) and D(ζ) with p ≤ η + ζ + 1 and p ≤ 2η + 2, then the method is of order p.
It is easy to see that the stability function of an IRK method of s stages is

R(z) = 1 + z b^T (I − zA)^{−1} 1,   1 = (1, ..., 1)^T ∈ R^s,

which can be written as a rational function (HW II)

R(z) = det(I − zA + z 1 b^T) / det(I − zA).   (4.2a)
It is usual to write R(z) as

R(z) = P(z) / Q(z),   (4.2b)

with P(z) and Q(z) polynomials of degree at most s, deg P = k and deg Q = j. We have the following properties (HW II).
Proposition i) The IRK (4.1) is A-stable if and only if

for all real y, |R(iy)| ≤ 1, and R(z) is analytic for Re(z) < 0.   (4.3)

ii) If the matrix A of the IRK (4.1) is nonsingular and

a_{sj} = b_j,   j = 1, ..., s,   (4.4a)

or

a_{i1} = b_1,   i = 1, ..., s,   (4.4b)

then the method is L-stable, that is, R(z) → 0 as z → −∞. The method is said to be stiffly accurate if (4.4a) is satisfied.
Next, we formulate the most frequently used IRK methods in stiff system codes.
LOW ORDER IRK METHODS
1) Implicit Euler (s = 1). Order 1.

y_{n+1} = y_n + Δt f(t_{n+1}, y_{n+1}),   (4.5a)

R(z) = 1 / (1 − z).   (4.5b)

Note that this method is also L-stable.
2) Mid-point rule (s = 1). Order 2.

Z_1 = y_n + (Δt/2) f(t_n + Δt/2, Z_1),
y_{n+1} = y_n + Δt f(t_n + Δt/2, Z_1),   (4.6a)

R(z) = (1 + z/2) / (1 − z/2).   (4.6b)

This method is A-stable. Note that the method is not L-stable.
HIGHER ORDER METHODS
There are many IRK methods of order higher than 2; the so-called Gauss and Radau methods are frequently used in codes designed for stiff problems due to the following property (HW II).
Theorem i) The s-stage Gauss method is A-stable and of order 2s, with stability function R(z) given by the (s, s) Padé approximation to exp(z). ii) The s-stage Radau IA and Radau IIA methods are A-stable and of order 2s − 1, with stability function given by the (s−1, s) subdiagonal Padé approximation to exp(z).
The interesting fact of this theorem is that Gauss and Radau methods represent a good balance between computational effort and order of the method, for with a low number of stages the methods may achieve a highly accurate numerical solution. We shall write down the Butcher tableaux of some of these methods which are used in canned codes for stiff problems. But before doing so, we give a brief idea of how these methods are constructed. For this purpose, we recall that if we apply a general Runge-Kutta method to the scalar equation y' = f(t), we obtain the quadrature formula
∫_{t_n}^{t_{n+1}} f(t) dt ≈ y_{n+1} − y_n = Δt Σ_{j=1}^{s} b_j f(t_n + c_j Δt),

which in essence is a Gaussian quadrature formula with weights b_j and integration points (abscissae) t_n + c_j Δt. Computing the coefficients c_j, 1 ≤ j ≤ s, as the zeros of the shifted Legendre polynomial of degree s,

(d^s/dt^s) [ t^s (t − 1)^s ],
the result is the Gauss method of order 2s. But if the coefficients c_j are computed as the zeros of

(d^{s−1}/dt^{s−1}) [ t^s (t − 1)^{s−1} ]   (Radau I),   or   (d^{s−1}/dt^{s−1}) [ t^{s−1} (t − 1)^s ]   (Radau II),

and the weights b_i, 1 ≤ i ≤ s, are chosen such that condition B(2s − 1) in (4.1c) is satisfied, the result that follows is the Radau I and Radau II methods.
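As a sketch of this construction (assuming NumPy), the abscissae and weights of the s-stage Gauss method can be obtained from the standard Gauss-Legendre rule on [−1, 1] by shifting it to [0, 1]:

    import numpy as np

    s = 3
    x, w = np.polynomial.legendre.leggauss(s)   # nodes/weights on [-1, 1]
    c, b = (x + 1.0) / 2.0, w / 2.0             # shifted to [0, 1]
    print(np.sort(c))   # (5-sqrt(15))/10, 1/2, (5+sqrt(15))/10
    print(b)            # 5/18, 4/9, 5/18

These values match the first column and the bottom row of the order-6 tableau (4.8a) below.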
1) Gauss method with s = 2 and order 4.
The Butcher tableau of this method is:

(3−√3)/6 | 1/4          (3−2√3)/12
(3+√3)/6 | (3+2√3)/12   1/4
---------+------------------------
         | 1/2          1/2           (4.7a)
with

R(z) = (1 + z/2 + z²/12) / (1 − z/2 + z²/12).   (4.7b)
2) Gauss method with s = 3 and order 6.
The Butcher tableau of this method is

(5−√15)/10 | 5/36            (10−3√15)/45   (25−6√15)/180
1/2        | (10+3√15)/72    2/9            (10−3√15)/72
(5+√15)/10 | (25+6√15)/180   (10+3√15)/45   5/36
-----------+---------------------------------------------
           | 5/18            4/9            5/18          (4.8a)
with

R(z) = (1 + z/2 + z²/10 + z³/120) / (1 − z/2 + z²/10 − z³/120).   (4.8b)
As a remark, note that the mid-point rule is a Gauss method with s=1.
Radau I and Radau II methods with s = 2 and order 3:

0   | 1/4   −1/4                   1/3 | 5/12  −1/12
2/3 | 1/4    5/12                  1   | 3/4    1/4
----+-------------   (Radau I),    ----+--------------   (Radau II)   (4.9)
    | 1/4    3/4                       | 3/4    1/4
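As a small illustration of how (4.2a) is used in practice, the sketch below (assuming NumPy) evaluates the stability function of the two-stage Radau II tableau in (4.9) directly from the determinant formula and checks its L-stable behavior:

    import numpy as np

    A = np.array([[5/12, -1/12],
                  [3/4,   1/4]])
    b = np.array([3/4, 1/4])
    I2, one = np.eye(2), np.ones(2)

    def R(z):
        # R(z) = det(I - zA + z*1*b^T) / det(I - zA), cf. (4.2a)
        return (np.linalg.det(I2 - z * A + z * np.outer(one, b))
                / np.linalg.det(I2 - z * A))

    print(R(0.0))     # 1.0 (consistency)
    print(R(-1e6))    # ~ 0: R(z) -> 0 as z -> -infinity, so the stiff
                      # components are damped strongly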
ROSENBROCK METHODS
These methods are very competitive for stiff systems of moderate size, for example, systems of dimension 10, and when the required accuracy is not too high, say 10^{-4}-10^{-5}. Of course, one can use Rosenbrock methods under conditions that are more demanding, but in such cases the methods will be less efficient. The most attractive feature of these methods is that they are as easy to implement as explicit RK methods. We give Kaps and Rentrop's formulation of an s-stage Rosenbrock method to integrate (1.1). Thus, an s-stage Rosenbrock method is given by the formula
k_i = Δt f(t_n + α_i Δt, y_n + Σ_{j=1}^{i−1} α_{ij} k_j) + γ_i Δt² f_t(t_n, y_n) + Δt f_y(t_n, y_n) Σ_{j=1}^{i} γ_{ij} k_j,   i = 1, ..., s,
y_{n+1} = y_n + Σ_{j=1}^{s} b_j k_j,   (4.10a)

where the coefficients α_i and γ_i satisfy the relations

α_i = Σ_{j=1}^{i−1} α_{ij}   and   γ_i = Σ_{j=1}^{i} γ_{ij}.   (4.10b)
Note that at each stage of these methods a linear system of equations of the form

( I − Δt γ_{ii} f_y(t_n, y_n) ) k_i = R_i   (4.10c)

has to be solved. Of special interest are those methods which choose the coefficients γ_{ii} = γ for all i, because in that case only one LU decomposition is necessary per step.
Kaps and Rentrop's implementation of (4.10a)-(4.10c) is via an embedded formula with step size control, with s = 4 for the high order solution and s = 3 for the low order solution. For further details on the order conditions, stability and coding of Rosenbrock methods see HW II.
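To make the structure of (4.10a)-(4.10c) concrete, here is a minimal sketch (not the Kaps-Rentrop pair; just the simplest one-stage member of the family, the linearly implicit Euler method with γ_{11} = 1 and a user-supplied Jacobian) showing that only one linear solve per step is needed, with no Newton iteration:

    import numpy as np

    def rosenbrock_euler(f, jac, y0, t0, T, dt):
        # One-stage Rosenbrock scheme: (I - dt*J) k = dt*f(t_n, y_n),
        # y_{n+1} = y_n + k, with J = f_y evaluated at (t_n, y_n).
        y, t = np.asarray(y0, dtype=float), t0
        n = y.size
        while t < T:
            J = jac(t, y)
            k = np.linalg.solve(np.eye(n) - dt * J, dt * np.asarray(f(t, y)))
            y, t = y + k, t + dt
        return y

This scheme is only first order, but it inherits the key practical property of Rosenbrock methods: the implicitness is confined to one linear system per step.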
5 A-stability and multistep methods
We recall that a general multistep method of k steps is formulated as

Σ_{j=0}^{k} α_j y_{n+j} = Δt Σ_{j=0}^{k} β_j f_{n+j},   (5.1)
where f_m := f(t_m, y_m). The linear stability analysis of the method is performed by applying (5.1) to the test problem

y' = λy,

yielding

Σ_{j=0}^{k} (α_j − z β_j) y_{n+j} = 0,   z := λΔt.   (5.2)
To solve (5.2) we use the Lagrange method: setting y_j = ζ^j and dividing by ζ^n, we obtain

Σ_{j=0}^{k} (α_j − z β_j) ζ^j ≡ ρ(ζ) − z σ(ζ) = 0,   (5.3a)

with (the already familiar polynomials)

ρ(ζ) = Σ_{j=0}^{k} α_j ζ^j,   σ(ζ) = Σ_{j=0}^{k} β_j ζ^j.   (5.3b)
The difference equation (5.2) has stable solutions for arbitrary starting values if and only if all roots of (5.3a) are at most 1 in modulus, with multiple roots of modulus strictly less than 1. Accordingly, we define the absolute stability region S of (5.1) as

S = { z ∈ C : all roots ζ_i of (5.3a) satisfy |ζ_i(z)| ≤ 1, multiple roots satisfy |ζ_i(z)| < 1 }.   (5.4)
Considering (5.4), we particularize the general definition of A-stability to multistep methods as follows.
Lemma The multistep method (5.1) is A-stable if for all z ∈ C⁻ it holds that
i) all roots of (5.3a) satisfy |ζ_i(z)| ≤ 1,
ii) and multiple roots satisfy |ζ_i(z)| < 1.
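The root condition in (5.4) is straightforward to check numerically. The following sketch (assuming NumPy; the BDF2 coefficients are taken from table (5.8) below) tests whether a given z belongs to S, ignoring for simplicity the multiple-root case:

    import numpy as np

    def in_S(z, alpha, beta, tol=1e-8):
        # Roots of rho(zeta) - z*sigma(zeta); np.roots wants the
        # coefficients ordered from highest degree to lowest.
        coeffs = (np.asarray(alpha) - z * np.asarray(beta))[::-1]
        return np.all(np.abs(np.roots(coeffs)) <= 1 + tol)

    alpha = [1/3, -4/3, 1.0]   # BDF2: alpha_0, alpha_1, alpha_2
    beta  = [0.0, 0.0, 2/3]    # beta_2 = 2/3
    print(in_S(-100.0, alpha, beta))   # True: deep in the left half-plane
    print(in_S(0.5, alpha, beta))      # False: right half-plane point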
The Adams and the predictor-corrector families of multistep methods are the most popular multistep methods, but they possess the sad property that none of them is A-stable (so they are not convenient for stiff problems), except the implicit Adams method of order 2, also known as the trapezoidal rule

y_{n+1} = y_n + (Δt/2) [f_n + f_{n+1}].   (5.5)
This observation was raised to the category of a theorem by Dahlquist in 1963 (see HW II).
Theorem An A-stable multistep method must be of order p ≤ 2. If the order is 2, then the error constant satisfies

C ≤ −1/12.

The trapezoidal rule is the only A-stable multistep method of order 2 with C = −1/12.
This theorem is also known by the descriptive name of Dahlquist's second barrier, and it seems to convey the rather deceptive message that multistep methods are inferior to RK methods as far as A-stability is concerned; therefore, one may arrive at the conclusion that multistep methods are not good for stiff problems. However, this conclusion is not entirely true, because one can break the barrier either i) by weakening the condition of A-stability, or ii) by strengthening the method. Weakening the condition of A-stability has led us to introduce the different definitions of stability in section 3, which are relevant to many stiff problems. This opens the possibility of using a family of multistep methods, known as the BDF methods, for stiff problems.
THE BACKWARD DIFFERENTIATION FORMULAE (BDF)
The general expression of these formulae is (HNW I)

Σ_{j=0}^{k} α_j y_{n+j} = Δt β_k f_{n+k}.   (5.6)
Two important properties of the BDF are given in the following propositions.
Proposition 1 The BDF with k ≤ 6 are all stiffly stable, with the following values of a (in the stiffly stable definition) and α (in the A(α)-stability definition):

k  |  1     2     3        4        5        6
α  |  90°   90°   86.03°   73.35°   51.84°   17.84°
a  |  0     0     0.083    0.667    2.327    6.075      (5.7)

Note that for k = 1 and 2, the BDF are also A-stable, because a = 0 and α = π/2 imply that S ⊇ C⁻.
Proposition 2 For k ≥ 7, all BDF are unstable.
For completeness of these notes, we give next the table with the coefficients and the error constant of the stable BDF (normalized so that α_k = 1):

k | α_k   α_{k−1}   α_{k−2}   α_{k−3}    α_{k−4}   α_{k−5}   α_{k−6} | β_k    | p | C_p
1 |  1    −1                                                         | 1      | 1 | −1/2
2 |  1    −4/3      1/3                                              | 2/3    | 2 | −2/9
3 |  1    −18/11    9/11      −2/11                                  | 6/11   | 3 | −3/22
4 |  1    −48/25    36/25     −16/25     3/25                        | 12/25  | 4 | −12/125
5 |  1    −300/137  300/137   −200/137   75/137    −12/137           | 60/137 | 5 | −10/137
6 |  1    −360/147  450/147   −400/147   225/147   −72/147   10/147  | 60/147 | 6 | −20/343
(5.8)
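The order p and the error constant C_p in (5.8) can be verified directly from the linear multistep order conditions C_0 = Σ_j α_j = 0 and C_q = Σ_j α_j j^q / q! − Σ_j β_j j^{q−1} / (q−1)! = 0 for q = 1, ..., p, with C_{p+1} ≠ 0. A small sketch (assuming NumPy), shown here for BDF3:

    import numpy as np
    from math import factorial

    def lmm_order(alpha, beta, tol=1e-9):
        # Returns the order p and the first nonzero constant C_{p+1}.
        j = np.arange(len(alpha), dtype=float)
        assert abs(alpha.sum()) < tol          # C_0 = 0 (consistency)
        q = 1
        while True:
            c = ((alpha * j**q).sum() / factorial(q)
                 - (beta * j**(q - 1)).sum() / factorial(q - 1))
            if abs(c) > tol:
                return q - 1, c
            q += 1

    # BDF3 row of (5.8): alpha_0..alpha_3 and beta_3 = 6/11
    alpha = np.array([-2/11, 9/11, -18/11, 1.0])
    beta = np.array([0.0, 0.0, 0.0, 6/11])
    print(lmm_order(alpha, beta))   # (3, -3/22), matching p and C_p

Note that 0**0 evaluates to 1 here, which is the convention the q = 1 condition needs.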
6 Characteristics of the solution methods
In the previous sections, we have introduced a list of methods which are good for computing the numerical solution of stiff systems. All those methods share the property of being implicit; this means that at each step an equation of the form

w = Δt g(w) + ψ(y_{n−s}; t_{n−s}, Δt),   (6.1)

has to be solved, where the vectors g(w) and ψ(y_{n−s}; t_{n−s}, Δt) are known and g is also differentiable. If the problem is linear (that is, if g is linear), then a linear equation must be solved, but if the problem is nonlinear a nonlinear equation must be solved at each step. In the latter case, we must consider whether the problem is stiff or not. For non-stiff problems an efficient method to solve (6.1) may be the functional iteration

w^{(k+1)} = Δt g(w^{(k)}) + ψ,   (6.2a)
which converges if and only if

Δt ‖∂g/∂w‖ < 1.   (6.2b)
This condition, which is equivalent to the absolute stability condition of the numerical method since ∂g/∂w is the Jacobian matrix, is not restrictive for non-stiff problems because the norm of the Jacobian matrix is not large; however, this argument shows that for stiff problems this procedure is inefficient because ‖∂g/∂w‖ is very large. So when (6.1) represents a stiff problem, the Newton-Raphson algorithm or one of its variants is currently used to calculate a numerical solution of (6.1). The Newton-Raphson algorithm as used in many codes is

[ I − Δt J(w^{(k)}) ] Δw^{(k)} = −w^{(k)} + Δt g(w^{(k)}) + ψ,   (6.3)
where

I is the unit matrix,   J(w^{(k)}) = ∂g(w^{(k)})/∂w,

and

Δw^{(k)} = w^{(k+1)} − w^{(k)}.
Note that in terms of CPU time and computer storage, (6.3) is much more expensive to carry out than (6.2a), because at each iteration we have a) to compute and store the Jacobian matrix, and b) to solve a linear system of equations whose matrix is, in general, nonsymmetric and full. Despite these unavoidable tasks, the Newton-Raphson algorithm is in general (for stiff problems) more efficient than functional iteration for the following reasons: (i) as we mentioned above (see (6.2a)), the size of Δt has to be much smaller in the functional iteration than in the Newton-Raphson algorithm; and (ii) the rate of convergence of the Newton-Raphson algorithm is higher than that of the functional iteration if a good starting value is available. In fact, it can be proved (IK) that as long as Δt is sufficiently small the rate of convergence of the Newton-Raphson algorithm is quadratic, that is, there exists a positive constant K such that if w is the exact solution of (6.1) then
constant K such that if w is the exact solution of (6.1) then
w
(k+1)
w K w
(k+1)
w
2
, for k sufficiently l arg e,
whereas the rate of convergence of the functional iteration is linear, that
is, there exists a positive constant K such that
w
(k+1)
w K w
(k+1)
w , for k sufficiently l arg e.
17
Point (i) implies that the Newton-Raphson algorithm needs to take fewer time steps to compute the solution at a given time, say t = T, whereas (ii) means that the Newton-Raphson algorithm needs fewer iterations per step to compute the solution with a prescribed accuracy. So from (i) and (ii) we can conclude that the Newton-Raphson algorithm is more efficient than the functional iteration for a large number of strongly stiff problems. Unfortunately, one can come across stiff problems for which the use of the Newton-Raphson algorithm may represent in several respects a heavy burden due, in particular, to the computation and storage of the Jacobian matrix at each iteration. In such cases, a variant of the Newton-Raphson algorithm is employed, usually a linear one that consists in replacing at each step the Jacobian J(w^{(k)}) by J(w^{(0)}); this means that the Jacobian is calculated once per step (or maybe once every several steps). In doing so, we can see after some algebraic manipulation, setting J_0 := J(w^{(0)}), that (6.3) becomes
w^{(k+1)} = (I − Δt J_0)^{−1} [ Δt ( g(w^{(k)}) − J_0 w^{(k)} ) + ψ ],   (6.4a)

which is a functional iteration with

g̃(w) = (I − Δt J_0)^{−1} ( g(w) − J_0 w ).   (6.4b)
Of course, (6.4a) does not possess the quadratic convergence of the true Newton-Raphson algorithm (6.3), but assuming that g(w) is sufficiently smooth and applying the mean value theorem it can be shown that the rate of convergence of (6.4a) is given by

‖w^{(k+1)} − w‖ ≤ ‖Δt (J(z) − J_0) (I − Δt J_0)^{−1}‖ ‖w^{(k)} − w‖,

where z ∈ (w^{(0)}, w^{(k)}). Unless the Jacobian matrix varies strongly with time, the term ‖J(z) − J_0‖ is likely to be small. On the other hand, for stiff problems, ‖(I − Δt J_0)^{−1}‖ is likely also to be small, so that stiffness will help to make ‖Δt (J(z) − J_0)(I − Δt J_0)^{−1}‖ small, contributing in this way to accelerate the rate of convergence. There is still one more point to comment on, which is related to the fact that in the Newton-Raphson algorithm, or any of its variants, a linear system has to be solved. The rule of thumb is that if such a system is not too large, LU decomposition is a good choice as a solver, but for midsize or large systems the GMRES iterative method is becoming the favorite solver nowadays.
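As an illustration of (6.1)-(6.3), the following sketch (assuming NumPy, with user-supplied f and its Jacobian) performs one implicit Euler step, for which g(w) = f(t_{n+1}, w) and ψ = y_n, solving the nonlinear equation by the Newton-Raphson iteration (6.3):

    import numpy as np

    def implicit_euler_step(f, jac, t_new, y_n, dt, tol=1e-10, maxit=20):
        w = np.asarray(y_n, dtype=float).copy()   # starting guess w^(0) = y_n
        for _ in range(maxit):
            F = w - dt * np.asarray(f(t_new, w)) - y_n   # residual of (6.1)
            if np.linalg.norm(F) < tol:
                break
            A = np.eye(w.size) - dt * jac(t_new, w)      # I - dt*J(w^(k))
            w = w + np.linalg.solve(A, -F)               # update (6.3)
        return w

Replacing jac(t_new, w) inside the loop by a Jacobian frozen at the first iterate gives the modified iteration (6.4a).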
7 Basic Bibliography
[HNW] Hairer, E., S.P. Nørsett and G. Wanner (1987): Solving Ordinary Differential Equations I (Nonstiff Problems). Springer-Verlag.
[HW] Hairer, E. and G. Wanner (1991): Solving Ordinary Differential Equations II (Stiff and Differential-Algebraic Problems). Springer-Verlag.
[IK] Isaacson, E. and H. Keller (1966): Analysis of Numerical Methods. Dover.
[L] Lambert, J. (1991): Numerical Methods for Ordinary Differential Systems. John Wiley and Sons, Chichester.
[SG] Shampine, L.F. and W. Gear (1979): A user's view of solving stiff ordinary differential equations. SIAM Review 21(1), 1-17.