Practical Numerical and Scientific Computing with MATLAB® and Python
Eihab B. M. Bashier
MATLAB® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks does not warrant the accuracy of the text or exercises in this book. This book's use or discussion of MATLAB® software or related products does not constitute endorsement or sponsorship by The MathWorks of a particular pedagogical approach or particular use of the MATLAB® software.
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable
efforts have been made to publish reliable data and information, but the author and publisher
cannot assume responsibility for the validity of all materials or the consequences of their use. The
authors and publishers have attempted to trace the copyright holders of all material reproduced
in this publication and apologize to copyright holders if permission to publish in this form has
not been obtained. If any copyright material has not been acknowledged please write and let us
know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center,
Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit
organization that provides licenses and registration for a variety of users. For organizations that
have been granted a photocopy license by the CCC, a separate system of payment has been
arranged.
Preface
The past few decades have witnessed tremendous development in computer hardware and software, and scientific computing has become an important tool for finding solutions to problems that arise in various branches of science and engineering. Nowadays, scientific computing is one of the most important means of research and learning in these fields, indispensable to any researcher, teacher, or student of science or engineering.
One of the most important branches of scientific computing is numerical analysis, which deals with finding approximate numerical solutions to such problems and analyzing the errors of the approximation methods. Both the MATLAB® and Python programming languages provide many libraries that can be used to solve scientific problems and to visualize the solutions. Thanks to their ease of use, these two languages have become the ones that most scientists who use computers to solve scientific problems care about.
The idea of this book came after I taught courses on scientific computing for physics students, as well as introductory and advanced courses on mathematical software and mathematical computer applications, at several universities in Africa and the Gulf area. I also conducted workshops for mathematics and science students interested in computational mathematics at some Sudanese universities. In these courses and workshops, MATLAB and Python were used to implement the numerical approximation algorithms. Hence, the purpose of this book is to provide the student with a practical guide to solving mathematical problems with MATLAB and Python, without the need for third-party assistance. Since numerical analysis is concerned with approximation and with the analysis of the errors of the associated numerical methods, this book is mainly concerned with how these two aspects are handled in practice by software: illustrations and tables are used to clarify the approximate solutions, the errors, and the speed of convergence, and their relation to method parameters such as the step size and the tolerance. MATLAB and Python are the most popular programming languages among mathematicians, scientists, and engineers. Both languages possess various libraries for numerical and symbolic computations and for data representation and visualization. Proficiency with the computer programs contained in this book requires prior knowledge of the basics of MATLAB and Python, such as branching, loops, symbolic packages, and the graphical libraries. The MATLAB version used for this book is 2017b and the Python version is 3.7.4.
The book consists of 11 chapters divided into three parts. The first part discusses numerical solutions of linear and nonlinear systems, the numerical difficulties facing these types of problems, and how to overcome them. The second part deals with methods for interpolating functions, numerical differentiation and integration, and the solution of differential equations. The last part of the book discusses methods for solving linear and nonlinear programming and optimal control problems. The book also uses some specialized Python packages to solve certain problems numerically. These packages must be downloaded from a third party: Gekko, which is used for the solution of differential equations and of linear and nonlinear programming problems in addition to optimal control problems; PuLP, which is used to solve linear programming problems; and Pyomo, a package for solving linear and nonlinear programming problems. How to install and run these packages is also presented in the book.
What distinguishes this book from many other numerical analysis books is that it contains some topics that are not usually found elsewhere, such as nonstandard finite difference methods for solving differential equations and the solution of optimal control problems. In addition, the book discusses implementations of methods with high convergence rates, such as the Gauss integration methods discussed in the numerical differentiation and integration chapter, and the exact finite difference schemes discussed in the nonstandard finite differences chapter. It also uses efficient Python-based software for solving some kinds of mathematical problems numerically.
The parts of the book are largely independent of each other, so that the student can study any part without having to read the preceding parts. The exception is the optimal control chapter in the third part, which requires the numerical methods for solving differential equations discussed in the second part.
After reading this book and implementing the programs contained in it, a student will be able to deal with and solve many kinds of mathematical problems, such as differential equations and static and dynamic optimization problems, and to apply the methods to real-life problems.
Acknowledgment
I am very grateful to the African Institute of Mathematical Sciences (AIMS), Cape Town, South Africa, which hosted me on a research visit during which some parts of this book were written. I would also like to thank the editorial team of this book, under the leadership of the publisher, Randi Cohen, for their continuous assistance in formatting, coordinating, editing, and directing the book throughout all stages. Special thanks go to all professors who taught me the courses of numerical analysis in the various stages of my under- and postgraduate studies, and, in particular, I thank Dr. Mohsin Hashim, University of
Part I
1
Solving Linear Systems Using Direct Methods
Abstract
Linear systems of equations have many applications in mathematics and sci-
ence. Many of the numerical methods used for solving mathematics problems
such as differential or integral equations, polynomial approximations of tran-
scendental functions and solving systems of nonlinear equations arrive at a
stage of solving linear systems of equations. Hence, solving a linear system of
equations is a fundamental problem in numerical computing.
This chapter discusses the direct methods for solving linear systems of
equations, using Gauss and Gauss-Jordan elimination techniques and the
matrix factorization approach. MATLAB® and Python implementations of
such algorithms are provided.
>> A = [1 2 3; 4 5 6; 7 8 9]
A =
1 2 3
4 5 6
7 8 9
>> b = [1; 1; 1]
b =
1
1
1
>> r1 = rank(A)
r1 =
2
>> r2 = rank([A b])
r2 =
2
If the linear system has a unique solution, it can formally be written as
$$x = A^{-1} b.$$
Hence, finding the solution of the linear system requires the inversion of matrix $A$. In the special case where $A$ is a diagonal matrix, the linear system
$$Ax = b$$
has the solution components
$$x_i = \frac{b_i}{a_{ii}}, \quad i = 1, \ldots, n.$$
The MATLAB code to compute this solution is given by:
1 function x = SolveDiagonalLinearSystem(A, b)
2 % This function solves the linear system Ax = b, where A is a diagonal matrix,
3 % b is a known vector and n is the dimension of the problem.
4 n = length(b) ;
5 x = zeros(n, 1) ;
6 for j = 1: n
7     x(j) = b(j)/A(j, j) ;
8 end
The corresponding Python function is:
1 import numpy as np
2 def SolveDiagonalLinearSystem(A, b):
3     n = len(b)
4     x = np.zeros((n, 1), 'float')
5     for j in range(n):
6         x[j] = b[j]/A[j, j]
7     return x
In this case we use the back-substitution method to find the solution of system (1.3). The MATLAB function SolveUpperLinearSystem.m solves the linear system (1.3) using the back-substitution method.
1 function x = SolveUpperLinearSystem(A, b)
2 % This function uses the backward substitution method for solving
3 % the linear system Ax = b, where A is an upper triangular matrix,
4 % b is a known vector and n is the dimension of the problem.
5 n = length(b) ;
6 x = zeros(n, 1) ;
7 x(n) = b(n)/A(n, n) ;
8 for j = n-1: -1 : 1
9     x(j) = b(j) ;
10     for k = j+1 : n
11         x(j) = x(j) - A(j, k)*x(k) ;
12     end
13     x(j) = x(j)/A(j, j) ;
14 end
1 import numpy as np
2 def SolveUpperLinearSystem(A, b):
3 n = len(b)
4 x = np.zeros((n, 1), 'float')
5 x[n-1] = b[n-1]/A[n-1, n-1]
6 for i in range(n-2, -1, -1):
7 x[i] = b[i]
8 for j in range(i+1, n):
9 x[i] -= A[i, j]*x[j]
10 x[i] /= A[i, i]
11 return x
The forward-substitution method is used to find the solution of system (1.4). The MATLAB function SolveLowerLinearSystem.m solves the linear system (1.4) using the forward-substitution method.
1 function x = SolveLowerLinearSystem(A, b)
2 % This function uses the forward substitution method for solving
3 % the linear system Ax = b, where A is a lower triangular matrix,
4 % b is a known vector and n is the dimension of the problem.
5 n = length(b) ;
6 x = zeros(n, 1) ;
7 x(1) = b(1)/A(1, 1) ;
8 for j = 2 : n
9     x(j) = b(j) ;
10     for k = 1 : j-1
11         x(j) = x(j) - A(j, k)*x(k) ;
12     end
13     x(j) = x(j)/A(j, j) ;
14 end
Start by finding the row echelon form of the given matrix:
$$A = \begin{pmatrix} 4 & -1 & -1 \\ -1 & 4 & -1 \\ -1 & -1 & 4 \end{pmatrix} \;\xrightarrow[\;R_3 \leftarrow 4R_3 + R_1\;]{R_2 \leftarrow 4R_2 + R_1}\; \begin{pmatrix} 4 & -1 & -1 \\ 0 & 15 & -5 \\ 0 & -5 & 15 \end{pmatrix} \;\xrightarrow{R_3 \leftarrow 3R_3 + R_2}\; \begin{pmatrix} 4 & -1 & -1 \\ 0 & 15 & -5 \\ 0 & 0 & 40 \end{pmatrix}$$
1 clear ; clc ;
2 A = input('Enter the matrix A: ') ;   % Reading matrix A from the user
3 b = input('Enter the vector b: ') ;   % Reading vector b from the user
4 [m, n] = size(A) ;                    % m and n are the matrix dimensions
5 r1 = rank(A) ;                        % the rank of matrix A is assigned to r1
6 r2 = rank([A b]) ;                    % the rank of the augmented system [A b] is assigned to r2
7 if r1 ~= r2                           % testing whether rank(A) is not equal to rank([A b])
8     disp(['Rank(A) = ' num2str(r1) ' ~= ' num2str(r2) ' = Rank([A b]).']) ;
9     fprintf('There is no solution.\n') ;   % No solution in this case
10 end
11 if r1 == r2                          % testing whether rank(A) = rank([A b])
A = A1 · A2 · . . . · An
U =
4.0000 -1.0000 -1.0000
0 3.7500 -1.2500
0 0 3.3333
In Python, the function lu is located in the scipy.linalg sub-package
and can be used to find the LU factors of matrix A.
In [1]: import numpy as np, scipy.linalg as lg
In [2]: A = np.array([[4, -1, -1], [-1, 4, -1], [-1, -1, 4]])
In [3]: P, L, U = lg.lu(A)
In [4]: print(’L = \n’, L, ’\nU = \n’, U)
L =
[[ 1. 0. 0. ]
[-0.25 1. 0. ]
[-0.25 -0.33333333 1. ]]
U =
[[ 4. -1. -1. ]
[ 0. 3.75 -1.25 ]
[ 0. 0. 3.33333333]]
However, Python can compact both the L and U factors of matrix A using the function lu_factor.
In [5]: LU = lg.lu_factor(A)
In [6]: print(’LU = \n’, LU)
LU =
(array([[ 4. , -1. , -1. ],
[-0.25 , 3.75 , -1.25 ],
[-0.25 , -0.33333333, 3.33333333]]), array([0, 1, 2],
dtype=int32))
Writing $Ax = L(Ux) = b$ and setting $y = Ux$, the system can be solved in two steps: first solve
$$Ly = b$$
for $y$, and then solve $Ux = y$ for $x$.
Example 1.1 In this example, the LU factors will be used to solve the linear system:
$$\begin{pmatrix} 4 & -1 & -1 \\ -1 & 4 & -1 \\ -1 & -1 & 4 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 2 \\ 2 \\ 2 \end{pmatrix}$$
In MATLAB, the following commands can be used:
L =
1.0000 0 0
-0.2500 1.0000 0
-0.2500 -0.3333 1.0000
U =
4.0000 -1.0000 -1.0000
0 3.7500 -1.2500
0 0 3.3333
>> y = SolveLowerLinearSystem(L, b)
y =
2.0000
2.5000
3.3333
>> x = SolveUpperLinearSystem(U, y)
x =
1.0000
1.0000
1.0000
In Python, similar steps can be followed to solve the linear system Ax = b
using the LU factors of matrix A.
In [7]: y = lg.solve(L, b)
In [8]: x = lg.solve(U, y)
In [9]: print(’x = \n’, x)
x =
[[ 1.]
 [ 1.]
 [ 1.]]
Python has the LU solver lu_solve, located in the scipy.linalg sub-package. It receives the factorization LU obtained from the lu_factor function, together with the right-hand side b, and returns the solution of the given linear system.
In [10]: x = lg.lu_solve(LU, b)
In [11]: print(x)
[[ 1.]
 [ 1.]
 [ 1.]]
The Python’s symbolic package sympy can also be used to find the LU
factors of a matrix A. This can be done as follows:
In [10]: import sympy as smp
In [11]: A = smp.Matrix([[4., -1., -1.], [-1., 4., -1.],
[-1., -1., 4.]])
In [12]: LU = A.LUdecomposition()
In [13]: LU
Out[13]:
(Matrix([
[ 1, 0, 0],
[-0.25, 1, 0],
[-0.25, -0.333333333333333, 1]]), Matrix([
[4.0, -1.0, -1.0],
[ 0, 3.75, -1.25],
[ 0, 0, 3.33333333333333]]), [])
In [14]: LU[0]
Out[14]:
Matrix([
[ 1, 0, 0],
[-0.25, 1, 0],
[-0.25, -0.333333333333333, 1]])
In [15]: LU[1]
Out[15]:
Matrix([
[4.0, -1.0, -1.0],
[ 0, 3.75, -1.25],
[ 0, 0, 3.33333333333333]])
The symbolic package sympy can also be used to solve a linear system using the LU factors.
In [16]: b = smp.Matrix([[2.0], [2.0], [2.0]])
In [17]: A.LUsolve(b)
Out[17]:
Matrix([
[1.0],
[1.0],
[1.0]])
Q =
-0.9428 -0.1421 0.3015
0.2357 -0.9239 0.3015
0.2357 0.3553 0.9045
R =
-4.2426 1.6499 1.6499
0 -3.9087 2.4873
0 0 3.0151
In Python, the function qr located in scipy.linalg can be used to find
the QR factors of matrix A.
In [18]: Q, R = lg.qr(A)
In [19]: print(’Q = \n’, Q, ’\nR =\n’, R)
Q =
[[-0.94280904 -0.14213381 0.30151134]
[ 0.23570226 -0.92386977 0.30151134]
[ 0.23570226 0.35533453 0.90453403]]
R =
[[-4.24264069 1.64991582 1.64991582]
[ 0. -3.9086798 2.48734169]
[ 0. 0. 3.01511345]]
The symbolic Python can also be used to find the QR-factors of matrix A
In [20]: QR = A.QRdecomposition()
In [21]: QR
Out[21]:
(Matrix([
[ 0.942809041582063, 0.14213381090374, 0.301511344577764],
[-0.235702260395516, 0.923869770874312, 0.301511344577764],
[-0.235702260395516, -0.355334527259351, 0.904534033733291]]),
Matrix([[4.24264068711928, -1.64991582276861, -1.64991582276861],
[ 0, 3.90867979985286, -2.48734169081546],
[ 0, 0, 3.01511344577764]]))
In [22]: QR[0]
Out[22]:
Matrix([
[ 0.942809041582063, 0.14213381090374, 0.301511344577764],
[-0.235702260395516, 0.923869770874312, 0.301511344577764],
[-0.235702260395516, -0.355334527259351, 0.904534033733291]])
In [23]: QR[1]
Out[23]:
Matrix([
[4.24264068711928, -1.64991582276861, -1.64991582276861],
[ 0, 3.90867979985286, -2.48734169081546],
[ 0, 0, 3.01511344577764]])
To solve the linear system (1.1) using the QR factorization technique, the following steps can be used:
1. Find the QR factors of matrix A and rewrite the system Ax = b as Q · Rx = b.
2. Multiply both sides of the equation Q · Rx = b by $Q^T$, giving:
$$Q^T Q R x = Q^T b \;\Rightarrow\; Rx = Q^T b$$
Example 1.2 In this example, the QR factors will be used to solve the linear system:
$$\begin{pmatrix} 4 & -1 & -1 \\ -1 & 4 & -1 \\ -1 & -1 & 4 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 2 \\ 2 \\ 2 \end{pmatrix}$$
>> A = [4, -1, -1; -1, 4, -1; -1, -1, 4] ;
>> b = [2; 2; 2] ;
>> [Q, R] = qr(A) ;
>> y = Q’*b
y =
-0.9428
-1.4213
3.0151
>> x = SolveUpperLinearSystem(R, y)
x =
1.0000
1.0000
1.0000
Python can be used to solve the above linear system, using the QR factors
as follows.
In [24]: import numpy as np, scipy.linalg as lg
In [25]: A = np.array([[4, -1, -1], [-1, 4, -1], [-1, -1, 4]])
In [26]: b = np.array([[2.0],[2.0],[2.0]])
In [27]: Q, R = lg.qr(A)
In [28]: y = np.matmul(Q.T, b)
In [29]: x = lg.solve(R, y)
In [30]: print(’x = \n’, x)
x =
[[ 1.0]
[ 1.0]
[ 1.0]]
Another method is to use the symbolic package:
In [31]: import sympy as smp
In [32]: A = smp.Matrix([[4.0, -1.0, -1.0], [-1.0, 4.0, -1.0],
[-1.0, -1.0, 4.0]])
In [33]: b = smp.Matrix([[2.0], [2.0], [2.0]])
In [34]: x = A.QRsolve(b)
In [35]: x
Out[35]:
Matrix([
[1.0],
[1.0],
[1.0]])
$$A = U \cdot S \cdot V^T,$$
U =
0.0000 -0.8165 -0.5774
-0.7071 0.4082 -0.5774
0.7071 0.4082 -0.5774
S =
5 0 0
0 5 0
0 0 2
V =
0 -0.8165 -0.5774
-0.7071 0.4082 -0.5774
0.7071 0.4082 -0.5774
In Python, the function svd is used to find the svd decomposition of
matrix A.
In [36]: U, S, V = lg.svd(A)
In [37]: print(’U = \n’, U, ’\nS = \n’, S, ’\nV = \n’, V)
U =
[[ 2.69618916e-17 -8.16496581e-01 -5.77350269e-01]
[ -7.07106781e-01 4.08248290e-01 -5.77350269e-01]
[ 7.07106781e-01 4.08248290e-01 -5.77350269e-01]]
S =
[ 5. 5. 2.]
V =
[[ 0. -0.70710678 0.70710678]
[-0.81649658 0.40824829 0.40824829]
[-0.57735027 -0.57735027 -0.57735027]]
Now, solving the system Ax = b is equivalent to finding the solution of the linear system
$$U S V^T x = b, \qquad (1.7)$$
hence, multiplying both sides of Equation (1.7) by $V \cdot S^{-1} U^T$ gives
$$x = V \cdot S^{-1} U^T b$$
Example 1.3 The svd will be used to solve the linear system:
$$\begin{pmatrix} 4 & -1 & -1 \\ -1 & 4 & -1 \\ -1 & -1 & 4 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 2 \\ 2 \\ 2 \end{pmatrix}$$
>> x = V*inv(S)*U’*b
x =
1.0000
1.0000
1.0000
In Python, the linear system is solved with the svd components as follows:
In [38]: x = lg.solve(V, lg.solve(np.diag(S), lg.solve(U,b)))
In [39]: print(’x = \n’, x)
x =
[[ 1.]
 [ 1.]
 [ 1.]]
2
Solving Linear Systems with Iterative and
Least Squares Methods
Abstract
The direct methods for solving a linear system Ax = b try to find the exact solution of the system by inverting the matrix A directly or indirectly. The iterative methods aim at finding an approximate solution of the linear system by constructing a sequence of vectors that converges to the exact solution. In the case that the linear system does not have a solution, the problem turns into a least squares problem.
This chapter aims to find approximate and least squared solutions of linear
systems. It is divided into three sections. In the first section, basic concepts
such as error norm and convergence of vector sequences are introduced. Then,
in the second section three iterative methods for finding approximate solutions
of linear systems of equations are discussed and implemented in MATLAB®
and Python. When a linear system does not have a solution, the problem
turns into searching for a least squares solution that minimizes the error norm.
Examples of least squares problems and best approximations of functions by
polynomials are discussed and implemented in MATLAB and Python, in the
third section.
$\{a_n\}_{n=N}^{\infty} = a_N, a_{N+1}, a_{N+2}, \ldots$ of the sequence $\{a_n\}_{n=0}^{\infty}$. To express the convergence of the sequence $\{a_n\}_{n=0}^{\infty}$ to $a$ mathematically, we write
$$|a_n - a| < \varepsilon, \quad \forall n \geq N.$$
For example, the sequence
$$\left\{\frac{n}{n+1}\right\}_{n=0}^{\infty} \to 1 \text{ as } n \to \infty,$$
because, if we make any choice of $\varepsilon > 0$, then
$$\left|\frac{n}{n+1} - 1\right| = \left|\frac{-1}{n+1}\right| = \frac{1}{n+1} < \varepsilon \;\Rightarrow\; n > \frac{1}{\varepsilon} - 1 = \frac{1-\varepsilon}{\varepsilon},$$
It satisfies $\|x\|_1 \geq \|x\|_2 \geq \ldots \geq \|x\|_\infty$. The norm $\|x\|_2$ gives the classical Euclidean distance:
$$\|x\|_2 = \sqrt{x_1^2 + x_2^2 + \ldots + x_n^2}$$
The Python function norm (located in the numpy.linalg library) receives a vector x and an integer p or np.inf, and returns $\|x\|_p$.
In [27]: x = np.array([1, -1, 2, -2, 3, -3])
In [28]: from numpy.linalg import norm
In [29]: n1, n2, n3 = norm(x, 1), norm(x, 2), norm(x, np.inf)
In [30]: print(n1, n2, n3)
12.0
5.29150262213
3.0
$$\lim_{k \to \infty} x^{(k)} = x^*$$
$$Sx = b - Tx$$
$$x^* = Bx^* + c$$
The fixed point is approached iteratively, starting from the given initial point $x^{(0)}$, by using the iterative relationship:
$$x^{(k+1)} = Bx^{(k)} + c$$
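A generic iteration of this form can be sketched in Python as follows; this is a minimal illustration (the function name, the stopping rule and the use of numpy.linalg.solve to apply S⁻¹ are assumptions, not code from the book):

import numpy as np

def splitting_iteration(S, T, b, x0, tol=1e-8, maxit=1000):
    # iterate x^(k+1) = B x^(k) + c, with B = -S^{-1} T and c = S^{-1} b
    B = np.linalg.solve(S, -T)
    c = np.linalg.solve(S, b)
    x = x0.copy()
    for k in range(maxit):
        x_new = B @ x + c
        if np.linalg.norm(x_new - x, np.inf) < tol:
            return x_new, k + 1
        x = x_new
    return x, maxit

The Jacobi, Gauss-Seidel and SOR methods discussed below correspond to different choices of the splitting A = S + T.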
>> U = triu(A) - D;
In Python, the functions tril and triu are implemented in the scipy.linalg package, and diag is implemented in both scipy and numpy. The splitting matrices can be obtained through the following commands:
Generally, the solution at iteration $k+1$ can be written componentwise in the form:
$$x_i^{(k+1)} = \frac{1}{a_{ii}}\left(b_i - \sum_{\substack{j=1 \\ j\neq i}}^{n} a_{ij} x_j^{(k)}\right), \quad i = 1, \ldots, n \qquad (2.4)$$
Example 2.1 Write the first three iterations of the Jacobi method for the linear system:
$$\begin{pmatrix} 2 & -1 & 1 \\ -2 & 5 & -1 \\ 1 & -2 & 4 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \\ 3 \end{pmatrix}$$
starting from the zero vector
$$x^{(0)} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$
Solution:
We write:
$$x_1^{(k+1)} = \frac{1}{2}\left(-1 + x_2^{(k)} - x_3^{(k)}\right), \quad x_2^{(k+1)} = \frac{1}{5}\left(1 + 2x_1^{(k)} + x_3^{(k)}\right), \quad x_3^{(k+1)} = \frac{1}{4}\left(3 - x_1^{(k)} + 2x_2^{(k)}\right)$$
1. First iteration, k = 0:
$$x_1^{(1)} = \frac{1}{2}\left(-1 + x_2^{(0)} - x_3^{(0)}\right) = \frac{1}{2}(-1 + 0 - 0) = -\frac{1}{2}$$
$$x_2^{(1)} = \frac{1}{5}\left(1 + 2x_1^{(0)} + x_3^{(0)}\right) = \frac{1}{5}(1 + 2(0) + 0) = \frac{1}{5}$$
$$x_3^{(1)} = \frac{1}{4}\left(3 - x_1^{(0)} + 2x_2^{(0)}\right) = \frac{1}{4}(3 - 0 + 2(0)) = \frac{3}{4}$$
2. Second iteration, k = 1:
$$x_1^{(2)} = \frac{1}{2}\left(-1 + \frac{1}{5} - \frac{3}{4}\right) = -\frac{31}{40}$$
$$x_2^{(2)} = \frac{1}{5}\left(1 + 2\cdot\left(-\frac{1}{2}\right) + \frac{3}{4}\right) = \frac{3}{20}$$
$$x_3^{(2)} = \frac{1}{4}\left(3 - \left(-\frac{1}{2}\right) + 2\cdot\frac{1}{5}\right) = \frac{39}{40}$$
3. Third iteration, k = 2:
$$x_1^{(3)} = \frac{1}{2}\left(-1 + \frac{3}{20} - \frac{39}{40}\right) = -\frac{73}{80}$$
$$x_2^{(3)} = \frac{1}{5}\left(1 + 2\cdot\left(-\frac{31}{40}\right) + \frac{39}{40}\right) = \frac{17}{200}$$
$$x_3^{(3)} = \frac{1}{4}\left(3 - \left(-\frac{31}{40}\right) + 2\cdot\frac{3}{20}\right) = \frac{163}{160}$$
Example 2.2 The Jacobi method will be applied for solving the linear system:
$$\begin{pmatrix} -5 & 1 & -2 \\ 1 & 6 & 3 \\ 2 & -1 & -4 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 13 \\ 1 \\ -1 \end{pmatrix}$$
By calling the function JacobiSolve to solve the given linear system, we obtain:
In [3]: A = np.array([[-5, 1, -2], [1, 6, 3], [2, -1, -4]])
In [4]: b = np.array([[13], [1], [-1]])
In [5]: x = JacobiSolve(A, b, Eps)
In [6]: print(’x = \n’, x)
Out[6]:
x =
[[-2. ],
[ 1.00000002],
[-1.00000002]]
The Jacobi method can also be implemented in matrix form, iterating $x^{(k+1)} = Bx^{(k)} + c$ directly.
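A sketch of such a matrix-form implementation in Python, consistent with the JacobiSolve_VectorForm call used below (the splitting B = -D⁻¹(L+U), c = D⁻¹b and the argument list are assumptions), is:

import numpy as np
from scipy.linalg import tril, triu

def JacobiSolve_VectorForm(A, b, x0, Eps):
    D = np.diag(np.diag(A))           # diagonal part of A
    L = tril(A) - D                   # strictly lower triangular part
    U = triu(A) - D                   # strictly upper triangular part
    B = np.linalg.solve(-D, L + U)    # B = -D^{-1} (L + U)
    c = np.linalg.solve(D, b)         # c = D^{-1} b
    Iterations = 1
    x = B @ x0 + c
    while np.linalg.norm(x - x0, np.inf) >= Eps:
        x0 = x.copy()
        x = B @ x0 + c
        Iterations += 1
    return x, Iterations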
From the IPython console, the following commands can be used to solve the
linear system:
In [7]: x, Iterations = JacobiSolve_VectorForm(A, b, x0, Eps)
In [8]: print(’Iterations = ’, Iterations, ’\nx = \n’, x)
Iterations = 24
x =
[[ -2. ]
[ 1.00000002]
[-1.00000002]]
Applying the above code to the linear systems in examples 2.1 and 2.2:
>> Eps = 1e-8 ; x0 = [0;0;0] ;
>> A = [2 -1 1; -2 5 -1; 1 -2 4] ; b = [-1;1;3] ;
>> [x, Iters] = GaussSeidel(A, b, Eps, x0)
x =
-1.0000
0
1.0000
Iters =
11
>> A = [-5 1 -2; 1 6 3; 2 -1 -4] ; b = [13; 1; -1] ;
>> [x, Iters] = GaussSeidel(A, b, Eps, x0)
x =
-2.0000
1.0000
-1.0000
Iters =
15
1 import numpy as np
2 def GaussSeidelSolve(A, b, x0, Eps):
3 n = len(b)
4 x = np.ones((n, 1), 'float')
5 Iterations = 1
6 while np.linalg.norm(x-x0, np.inf) >= Eps:
7 x0 = x.copy()
8 for i in range(n):
9 x[i] = b[i]
10 for j in range(i):
11 x[i] -= A[i][j]*x[j]
12 for j in range(i+1, n):
13 x[i] -= A[i][j]*x0[j]
14 x[i] /= A[i][i]
15 Iterations += 1
16 return x, Iterations
17
18 A = np.array([[-5, 1, -2], [1, 6, 3], [2, -1, -4]])
19 b = np.array([[13], [1], [-1]])
20 x0 = np.zeros((3, 1), 'float')
21 Eps = 1e-8
22 x, Iterations = GaussSeidelSolve(A, b, x0, Eps)
23 print('Iterations = ', Iterations, '\nx = \n', x)
-1.0000
0
1.0000
Iters =
11
>> A = [-5 1 -2; 1 6 3; 2 -1 -4] ; b = [13; 1; -1] ;
>> [x, Iters] = GaussSeidelIter(A, b, Eps, x0)
x =
-2.0000
1.0000
-1.0000
Iters =
15
The function GaussSeidelIter can be implemented in python as:
from which,
$$(D + \omega L)x = \left((1 - \omega)D - \omega U\right)x + \omega b.$$
The iterative method is obtained by replacing x on the left-hand side by $x^{(k+1)}$ and on the right-hand side by $x^{(k)}$, giving the formula
x =
   -1.0000
    0.0000
    1.0000
>> A = [-5 1 -2; 1 6 3; 2 -1 -4] ; b = [13; 1; -1] ;
>> [x, Iters] = SOR(A, b, w, x0, Eps)
x =
-2.0000
1.0000
-1.0000
Iters =
33
The Python code SORIter.py applies the SOR to solve the linear systems
in examples 2.1 and 2.2.
1 import numpy as np
2 from scipy.linalg import tril, solve
3 def SOR(A, b, w, x0, Eps):
4 D = np.diag(np.diag(A))
5 L = tril(A) - D
6 U = A - (L+D)
7 B = solve(-(D+w*L), (w-1)*D+w*U)
8 c = solve((D+w*L), w*b)
9 Iters = 1
10 x = np.matmul(B, x0) + c
11 while np.linalg.norm(x-x0, np.inf) >= Eps:
12 x0 = x.copy()
13 x = np.matmul(B, x0) + c
14 Iters += 1
15 return x, Iters
16
17 print('Solving the first linear system:')
18 A = np.array([[2, -1, 1], [-2, 5, -1], [1, -2, 4]])
19 b = np.array([[-1], [1], [3]])
20 w, x0 = 1.25, np.zeros((3, 1), 'float')
21 Eps = 1e-8
22 x, Iterations = SOR(A, b, w, x0, Eps)
23 print( 'x = \n', x, '\nIterations = ', Iterations)
24
25 print('Solving the second linear system:')
26 A = np.array([[-5, 1, -2], [1, 6, 3], [2, -1, -4]])
27 b = np.array([[13], [1], [-1]])
28 x, Iterations = SOR(A, b, w, x0, Eps)
29 print( 'x = \n', x, '\nIterations = ', Iterations)
[-1.94732286e-10]
[ 1.00000000e+00]]
Iterations = 16
Solving the second linear system:
x =
[[-2.]
[ 1.]
[-1.]]
Iterations = 32
The Gauss-Seidel method solved Example 2.2 in 16 iterations, while the SOR method with ω = 1.25 solved it in 33 iterations. From this example, it is clear that the SOR method is not guaranteed to converge faster than the Gauss-Seidel method if the parameter ω is not selected carefully. The selection of the relaxation parameter ω plays a key role in the convergence rate of the SOR method. When the optimal value of the parameter is selected, the SOR method achieves its best convergence rate.
In the following example, 15 values in [0.1, 1.5] are selected for parameter
w and the corresponding numbers of iterations are computed. Then the values
of w are plotted against the corresponding numbers of iterations.
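A short Python sketch of this experiment, reusing the SOR function defined above and applying it to the system of Example 2.2 (the plotting details are assumptions):

import numpy as np
import matplotlib.pyplot as plt

A = np.array([[-5., 1., -2.], [1., 6., 3.], [2., -1., -4.]])
b = np.array([[13.], [1.], [-1.]])
x0 = np.zeros((3, 1), 'float')
Eps = 1e-8
ws = np.linspace(0.1, 1.5, 15)        # 15 values of the relaxation parameter
Iters = []
for w in ws:
    x, it = SOR(A, b, w, x0, Eps)
    Iters.append(it)
plt.plot(ws, Iters, '-o')
plt.xlabel('w')
plt.ylabel('Iterations')
plt.grid(True, ls=':')
plt.show()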
Executing this code gives Figure 2.1. From Figure 2.1, the optimal value of the parameter ω lies somewhere between 0.8 and 1.1. Hence, zooming further into this interval, the optimal value of ω lies somewhere between 0.92 and 0.98, as shown in Figure 2.2.
FIGURE 2.1: Plot of the data (w, Iters) for ω in [0.1, 1.5] vs. the corresponding numbers of iterations.
FIGURE 2.2: Plot of the data (w, Iters) for ω in [0.8, 1.1] vs. the corresponding numbers of iterations.
In the case that b does not lie in col(A), the problem of solving Ax = b becomes an approximation problem, in which we look for some $\hat{x} \in \mathbb{R}^n$ such that $\|b - A\hat{x}\| \leq \|b - Ax\|$ for every $x \in \mathbb{R}^n$, or in other words:
$$\text{find } \hat{x} : \|b - A\hat{x}\| = \min_{x \in \mathbb{R}^n}\left\{\|b - Ax\|\right\}. \qquad (2.9)$$
3. the least squares problem, where the minimization is under the classical Euclidean distance in $\mathbb{R}^m$:
$$\text{find } \hat{x} : \|b - A\hat{x}\|_2 = \min_{x \in \mathbb{R}^n}\left(\sum_{j=1}^{m} |b_j - (Ax)_j|^2\right)^{\frac{1}{2}}.$$
Example 2.3 Find the least squares solution of the inconsistent linear system Ax = b, where:
$$A = \begin{pmatrix} 1 & 2 & 4 \\ 3 & 1 & 5 \\ 1 & 1 & 1 \\ 2 & 2 & 1 \\ 3 & 1 & 3 \end{pmatrix} \quad \text{and} \quad b = \begin{pmatrix} 5 \\ 3 \\ 3 \\ 1 \\ 4 \end{pmatrix}$$
Solution: The normal equations are given by $A^T A\hat{x} = A^T b$, where
$$A^T A = \begin{pmatrix} 24 & 13 & 31 \\ 13 & 11 & 19 \\ 31 & 19 & 52 \end{pmatrix} \quad \text{and} \quad A^T b = \begin{pmatrix} 31 \\ 22 \\ 51 \end{pmatrix}$$
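A least-squares solution can then be computed by solving the normal equations; a minimal NumPy sketch (the variable name xh is illustrative):

import numpy as np

A = np.array([[1., 2., 4.], [3., 1., 5.], [1., 1., 1.], [2., 2., 1.], [3., 1., 3.]])
b = np.array([[5.], [3.], [3.], [1.], [4.]])
xh = np.linalg.solve(A.T @ A, A.T @ b)   # solve A^T A xh = A^T b
print(xh)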
[[-0.16388616]
 [ 0.8969578 ]
 [ 0.75073602]]
Remark 2.1 The inconsistent linear system Ax = b has a unique least-squares solution if $A^T A$ is a full-rank matrix, that is, if $A^T A$ is nonsingular [30].
Remark 2.2 If $\hat{x}$ is a least-squares solution of the inconsistent linear system Ax = b, then $\|b - A\hat{x}\|_2$ defines the least squares error [53].
For example, the least squares error in the above example is obtained by
using the following MATLAB command:
>> LSError = norm(b-A*xh, 2)
LSError = 2.6570
In Python, the least squares error is obtained by using the command:
In [14]: LSError = np.linalg.norm(b-A@xh, 2)
In [15]: print(LSError)
2.6570401973629143
i Fitting a line to data: Given a table of data points
$x$: $x_1, x_2, \ldots, x_n$
$y$: $y_1, y_2, \ldots, y_n$
find $\hat{\alpha}$ and $\hat{\beta}$ in $\mathbb{R}$, such that the linear model $y = \hat{\alpha} + \hat{\beta}x$ gives the best fit of the linear model to the data $(x_i, y_i)$, $i = 1, \ldots, n$. This problem is equivalent to the problem:
$$\text{Find } \hat{\alpha}, \hat{\beta} \in \mathbb{R} : \sum_{j=1}^{n}\left(y_j - (\hat{\alpha} + \hat{\beta}x_j)\right)^2 = \min_{\alpha, \beta \in \mathbb{R}} \sum_{j=1}^{n}\left(y_j - (\alpha + \beta x_j)\right)^2.$$
$$Ax = y \qquad (2.14)$$
where
$$A = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}, \quad x = \begin{pmatrix} \alpha \\ \beta \end{pmatrix} \quad \text{and} \quad y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}$$
The regression coefficients $\hat{\alpha}$ and $\hat{\beta}$ can be obtained from the normal equations
$$A^T A\hat{x} = A^T y,$$
where
$$A^T A = \begin{pmatrix} 1 & 1 & \ldots & 1 \\ x_1 & x_2 & \ldots & x_n \end{pmatrix}\begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix} = \begin{pmatrix} n & \sum_{j=1}^{n} x_j \\ \sum_{j=1}^{n} x_j & \sum_{j=1}^{n} x_j^2 \end{pmatrix}$$
and
$$A^T y = \begin{pmatrix} 1 & 1 & \ldots & 1 \\ x_1 & x_2 & \ldots & x_n \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} \sum_{j=1}^{n} y_j \\ \sum_{j=1}^{n} x_j y_j \end{pmatrix}$$
Now,
$$\left(A^T A\right)^{-1} = \frac{1}{\det(A^T A)}\,\mathrm{adj}(A^T A) = \frac{1}{n\sum_{j=1}^{n} x_j^2 - \left(\sum_{j=1}^{n} x_j\right)^2}\begin{pmatrix} \sum_{j=1}^{n} x_j^2 & -\sum_{j=1}^{n} x_j \\ -\sum_{j=1}^{n} x_j & n \end{pmatrix}$$
Hence, $\hat{x} = \left(A^T A\right)^{-1} A^T y$ gives the regression coefficients as follows:
$$\hat{\alpha} = \frac{\sum_{j=1}^{n} x_j^2 \sum_{j=1}^{n} y_j - \sum_{j=1}^{n} x_j \sum_{j=1}^{n} x_j y_j}{n\sum_{j=1}^{n} x_j^2 - \left(\sum_{j=1}^{n} x_j\right)^2} \qquad (2.15)$$
$$\hat{\beta} = \frac{n\sum_{j=1}^{n} x_j y_j - \sum_{j=1}^{n} x_j \sum_{j=1}^{n} y_j}{n\sum_{j=1}^{n} x_j^2 - \left(\sum_{j=1}^{n} x_j\right)^2} \qquad (2.16)$$
Now, we test the function LinearRegCoefs, using data points of heights (in meters) and weights (in kilograms) taken for nine students, as in Table 2.1.
TABLE 2.1: Heights (in meters) vs. weights (in kilograms) for nine male students in secondary school
Height | 1.65 | 1.67 | 1.68 | 1.72 | 1.77 | 1.82 | 1.86 | 1.89 | 1.90
Weight | 57.0 | 61.0 | 64.0 | 69.0 | 75.0 | 83.0 | 90.0 | 97.0 | 100.0
1 import numpy as np
2 import matplotlib.pylab as plt
3 def LinearRegCoefs(x, y):
4 n = len(x)
5 ahat = (sum(x*x)*sum(y)-sum(x)*sum(x*y))/(n*sum(x*x)-sum(x)**2)
6 bhat = (n*sum(x*y)-sum(x)*sum(y))/(n*sum(x*x)-sum(x)**2)
7 return ahat, bhat
8
9 H = np.array([1.65, 1.67, 1.68, 1.72, 1.77, 1.82, 1.86, ...
1.89, 1.90])
10 W = np.array([57.0, 61.0, 64.0, 69.0, 75.0, 83.0, 90.0, ...
97.0, 100.0])
11 ahat, bhat = LinearRegCoefs(H, W) ;
12 WW = ahat + bhat * H ;
Executing the above Python code will give a graph similar to Figure 2.3.
ii Fitting a polynomial to data: Given a table of data points:
x1 x2 ... xn
y1 y2 ... yn
we look for a polynomial model of the form
$$y = \alpha_0 + \alpha_1 x + \alpha_2 x^2 + \cdots + \alpha_k x^k, \quad 1 \leq k \leq n-1, \qquad (2.17)$$
which leads to the system of equations
$$y_1 = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_1^2 + \cdots + \alpha_k x_1^k$$
$$y_2 = \alpha_0 + \alpha_1 x_2 + \alpha_2 x_2^2 + \cdots + \alpha_k x_2^k$$
$$\vdots$$
$$y_n = \alpha_0 + \alpha_1 x_n + \alpha_2 x_n^2 + \cdots + \alpha_k x_n^k$$
FIGURE 2.3: The linear model vs. the scattered data of heights and weights.
x 5 15 25 35 45 55 65 75 85 95
y 2 6 10 16 25 21 10 5 3 2
The above code uses piecewise cubic Hermite interpolating polynomials (pchip) to draw a smooth curve through the computed points, instead of linear interpolation, in which the curve between two points is approximated by a straight line. By executing the above code, we obtain Figure 2.4.
The Python code to fit polynomials of degrees 2, 4, 7 and 9 to the tabular data and plot the least-squares curves is:
1 import numpy as np
2 from scipy.interpolate import pchip
3 import matplotlib.pyplot as plt
4 x = np.array([5., 15., 25., 35., 45., 55., 65., 75., 85., 95.])
5 y = np.array([2., 6., 10., 16., 25., 21., 10., 5., 3., 2.])
6 xx = np.arange(1., 101.)
7 deg = np.array([2, 4, 7, 9])
8 Model = np.zeros((4, len(x)), float)
9 for i in range(4):
10     C = np.polyfit(x, y, deg[i])       # least-squares coefficients of degree deg[i]
11     Model[i, :] = np.polyval(C, x)     # fitted values at the data points
12     plt.plot(xx, np.polyval(C, xx), label='k = ' + str(deg[i]))
13 plt.plot(xx, pchip(x, y)(xx), '--', label='Data')
14 plt.xlabel('x') ; plt.ylabel('y')
15 plt.legend() ; plt.grid(True, ls=':')
FIGURE 2.4: The tabular data and the least-squares polynomial fits of degrees k = 2, 4, 7 and 9.
Degree   | 2             | 4       | 7       | 9
LS Error | 1.29467 × 10¹ | 6.89477 | 3.01662 | 1.50627 × 10⁻³
Both MATLAB and Python have a function polyfit that receives two vec-
tors x = (x1 , . . . , xn ), y = (y1 , . . . , yn ) and a positive integer k and returns
the optimal parameters α̂1 , . . . , α̂k+1 . In MATLAB, the code lines:
1 A = zeros(length(x), k+1) ;
2 for j = 1 : k+1
3 A(:, j) = x(:).^(j-1) ;
4 end
5 Q = (A'*A)\(A'*y(:)) ;
can be replaced by the single line:
1 Q = polyfit(x, y, k) ;
and in Python:
1 Q = np.polyfit(x, y, k)
The problem is to find the optimal coefficients $\hat{\alpha}_0, \hat{\alpha}_1, \ldots, \hat{\alpha}_n$ such that the $n$th-degree polynomial $P_n(x) = \sum_{j=0}^{n} \hat{\alpha}_j x^j$ is a least squares approximation of $f$ over $[a, b]$. Setting the partial derivative of the squared error with respect to $\hat{\alpha}_k$ to zero gives
$$2\int_a^b x^k f(x)\,dx - 2\sum_{j=0}^{n} \hat{\alpha}_j\,\frac{b^{j+k+1} - a^{j+k+1}}{j+k+1} = 0, \quad k = 0, 1, \ldots, n \qquad (2.23)$$
The optimal coefficients α̂0 , . . . , α̂n are obtained by solving the normal
equations (2.24).
We write a function FunApproxCoef that receives a function f, limits of
interval a and b and the degree of the least-squares approximating poly-
nomial n. The function returns a vector alph whose components are the
optimal coefficients α̂0 , . . . , α̂n .
The MATLAB code of the function FunApproxCoef:
14 for k = 1 : n+1
15 y(k) = integral(@(x) x.ˆ(k-1).*f(x), a, b) ;
16 end
17 alp = A\y ;
18 alph = zeros(1, length(alp)) ;
19 for j = 1 : length(alp)
20 alph(j) = alp(n-j+2) ;
21 end
1 import numpy as np
2 from scipy import integrate
3 import matplotlib.pyplot as plt
4
5 def FunApproxCoef(f, a, b, n):
6 A = np.zeros((n+1, n+1), float)
7 y = np.zeros((n+1, 1), float)
8 j = 0
9 while j <= n:
10 k = 0 ;
11 while k <= j:
12 A[j, k] = (b**(j+k+1)-a**(j+k+1))/(j+k+1)
13 A[k, j] = A[j, k]
14 k += 1
15 j += 1
16 for k in range(n+1):
17 y[k] = integrate.quad(lambda x: x**k*f(x), a, b)[0]
18 alp = np.linalg.solve(A,y)
19 alph = list(alp)
20 alph.reverse()
21 alph = np.array(alph)
22 return alph
1 a = 0; b = 1 ;
2 t = linspace(0, 3) ;
3 f = @(x) exp(x) ;
4 P = zeros(4, length(t)) ;
5 for l = 1 : 4
6     n = l ;
7     alph = FunApproxCoef(f, a, b, n) ;
8     P(l, :) = polyval(alph, t) ;
9 end
10 plot(t, f(t), '-b', t, P(1,:), '--r', t, P(2, :), '-.m', t, P(4, :), ':k', 'LineWidth', 2) ;
11 xlabel('x', 'fontweight','bold') ;
12 ylabel('y', 'fontweight','bold') ;
13 legend('y = exp(x)', 'n = 1', 'n = 2', 'n = 4') ;
14 grid on ;
15 ax = gca ;
16 ax.FontWeight = 'bold' ;
17 ax.FontSize = 14 ;
FIGURE 2.5: The function y = exp(x) and its least-squares polynomial approximations of degrees n = 1, 2 and 4.
1 a, b = 0, 1
2 t = np.linspace(a, b, 101)
3 f = lambda t: np.cos(np.pi*t)
4 P = np.zeros((7, len(t)), float)
5 print('n \t ||f(t)-P_n(t)||_2\n')
6 print('---------------------------------\n')
7 for l in range(7):
8 n = 2*l+1
9 alph = FunApproxCoef(f, a, b, n)
10 P[l, :] = np.polyval(alph, t)
11 print(n, '\t', '{:7.4e}'.format(np.linalg.norm(f(t)-P[l, :])))
12 plt.plot(t, f(t), '-m', lw=2, label='cos(pi*t)')
13 plt.plot(t, P[0, :], '--r', lw = 2, label = 'n = 1')
14 plt.plot(t, P[2, :], ':b', lw = 4, label = 'n = 3')
15 plt.xlabel('x', fontweight = 'bold')
16 plt.ylabel('y', fontweight= 'bold')
17 plt.xticks(np.arange(0.0, 1.1, 0.1))
18 plt.legend()
19 plt.grid(True, ls = '--')
FIGURE 2.6: Approximations of f(x) = cos(πx), x ∈ [0, 1], with polynomials of degrees 1 and 3.
8 for k = 1 : 8
9 n = 2*k-1 ;
10 alph = FunApproxCoef(f, a, b, n) ;
11 P(k, :) = polyval(alph, t) ;
12 fprintf('%i\t\t%7.4e\n', n, norm(f(t)-P(k, :), 2))
13 end
14 plot(t, f(t), '-m', t, P(1, :), '--r', t, P(2, :), ':b', 'LineWidth', 2) ;
15 xlabel('x', 'fontweight','bold') ;
16 ylabel('y', 'fontweight','bold') ;
17 legend('y = cos(pi*x)', 'n = 1', 'n = 3') ;
18 grid on ;
19 ax = gca ;
20 ax.FontWeight = 'bold' ;
21 ax.FontSize = 14 ;
13 1.6090e-07
> In FunApproxCoef (line 17)
In PlotApproxcos (line 10)
Warning: Matrix is close to singular or badly scaled.
Results may be inaccurate. RCOND = 5.742337e-20.
15 3.1356e-07
The reason behind these error messages will be discussed in the next chapter.
MATLAB has a function hilb, which receives an integer n and returns the corresponding Hilbert matrix Hn.
>> hilb(3)
ans =
1.0000 0.5000 0.3333
0.5000 0.3333 0.2500
0.3333 0.2500 0.2000
>> hilb(5)
ans =
1.0000 0.5000 0.3333 0.2500 0.2000
0.5000 0.3333 0.2500 0.2000 0.1667
0.3333 0.2500 0.2000 0.1667 0.1429
0.2500 0.2000 0.1667 0.1429 0.1250
0.2000 0.1667 0.1429 0.1250 0.1111
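In Python, Hilbert matrices can be generated with the function hilbert from scipy.linalg (used again later in this chapter); a short sketch:

from scipy.linalg import hilbert
H3 = hilbert(3)       # 3 x 3 Hilbert matrix with entries 1/(i + j - 1)
print(H3)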
3
Ill-Conditioning and Regularization
Techniques in Solutions of Linear Systems
Abstract
If a small perturbation is introduced to either the coefficient matrix A or the vector b, it might lead to a big change in the solution vector x. Hence, neither the direct nor the iterative methods are guaranteed to give an accurate solution of the given linear system.
This chapter is divided into two sections. The first section presents the
concept of ill-conditioning in linear systems and how to use MATLAB® and
Python to measure the condition numbers of matrices. In the second section,
some regularization techniques are presented to stabilize the solutions of ill-
conditioned systems.
5 1 1
1 5 1
1 1 5
>> x = A\b
x =
1.0000
-1.0000
1.0000
In Python:
In [1]: import numpy as np
In [2]: A = np.array([[5., 1., 1.], [1., 5., 1.], [1., 1., 5.]])
In [3]: b = np.array([[5.], [-3.], [5.]])
In [4]: x = np.linalg.solve(A, b)
In [5]: print(’x = \n’, x)
x =
[[ 1.]
[-1.]
[ 1.]]
Now, the first perturbation we consider will be on the component A11 ,
where we set B = A and B11 = A11 + 10−4 = A11 + 0.0001
>> B = A
B =
5 1 1
1 5 1
1 1 5
>> B(1, 1) = B(1, 1) + 0.0001
B =
5.0001 1.0000 1.0000
1.0000 5.0000 1.0000
1.0000 1.0000 5.0000
>> y = B\b
y =
1.0000
-1.0000
1.0000
>> disp(’x - y = ’), disp(num2str(x-y))
x - y =
2.1428e-05
-3.5714e-06
-3.5714e-06
>> disp([’|| x-y||_2 = ’ num2str(norm(x-y, 2))])
|| x-y||_2 = 2.2015e-05
In Python,
In [7]: B = A.copy()
In [8]: B[0, 0] += 1e-4
In [9]: y = np.linalg.solve(B, b)
In [10]: print(’y = \n’, y)
y =
[[ 0.99997857]
[-0.99999643]
[ 1.00000357]]
In [11]: print(’x - y = \n’, x-y)
x - y =
[[ 2.14281123e-05]
[-3.57135204e-06]
[-3.57135204e-06]]
In [12]: print(’||x-y||_2 = \n’, np.linalg.norm(x-y, 2))
||x-y||_2 =
2.201529253996403e-05
The second perturbation under consideration will be on the right-hand
side, where b3 will be replaced by b3 + 10−4 giving a vector c in the right-
hand side. The original coefficient matrix A will remain without a change and
the linear system Az = c will be solved.
>> c = b ;
>> c(3) = c(3) + 1e-4
c =
5.0000
-3.0000
5.0001
>> z = A\c
z =
1.0000
-1.0000
1.0000
In Python,
In [12]: c = b.copy()
In [13]: c[-1] += 1e-4
In [14]: z = np.linalg.solve(A, c)
In [15]: print(’z = \n’, z)
z =
[[ 0.99999643]
[-1.00000357]
[ 1.00002143]]
In [16]: print(’x - z = \n’, x-z)
x - z =
[[ 3.57142857e-06]
[ 3.57142857e-06]
[-2.14285714e-05]]
In [17]: print(’||x-z||_2 = \n’, np.linalg.norm(x-z, 2))
||x-z||_2 =
2.2015764296112095e-05
From Example 3.1 it can be noticed that small changes in some components of the coefficient matrix or of the vector at the right-hand side lead to small changes in the solution. Hence, the given linear system is not sensitive to small perturbations. A system that is not sensitive to small perturbations is called a well-posed system.
The purpose of the next example is to show how a small change in one entry of a coefficient matrix F or a vector d can cause a drastic change in the solution of the linear system Fx = d.
MATLAB is used to solve the linear system Fx = d, with the commands:
>> F = [1001, -999, 999; 1, 1, -1; 1000, -1000, 1000] ;
>> d = [1001; 1; 1000] ;
>> x = F\d
x =
1
0.4147
0.4147
The warning indicates that the matrix is close to being singular, hence the results might be inaccurate. As can be seen, the MATLAB solution of the linear system is far from the exact solution [1.0, 1.0, 1.0]ᵀ.
In Python, the above linear system can be solved by using the Python
commands:
In [1]: import numpy as np
In [2]: A = np.array([
[1.001e+03, -9.990e+02, 9.990e+02],
[1.000e+00, 1.000e+00, -1.000e+00],
[1.000e+03, -1.000e+03, 1.000e+03]])
In [3]: b = np.array([[1001], [1], [1000]])
In [4]: x = np.linalg.solve(A, b)
In [5]: print(’x =\n’, x)
x =
[[1. ]
[0.93969727]
[0.93969727]]
Although Python happens to produce a somewhat more accurate solution of the linear system than MATLAB here, the error in solving the linear system is still not small.
Again, small perturbations will be introduced to the coefficient matrix
F and vector d. First, a perturbation 10−5 will be added to F11 giving a
matrix G. Then, computing the solution of the linear system Gy = d and
ky − xk2 :
>> G = F ;
>> format long g
>> G(1, 1) = G(1, 1) + 1e-5
G =
1001.00001 -999 999
0.9999999999999 1.0000000000001 -0.9999999999999
1000 -1000 1000
>> y = G\d
Warning: Matrix is close to singular or badly scaled. Results
may be inaccurate.
RCOND = 1.092847e-16.
y =
0.999997714598233
22853063.4012923
22853063.4012946
>> disp(norm(y-x, 2))
32319111.6174026
In Python, the commands will be used:
In [6]: G = F.copy()
In [7]: G[0, 0] += 1e-5
In [8]: y = np.linalg.solve(G, d)
In [9]: print(’y = ’, y)
y =
[[9.99997654e-01]
[2.34012849e+07]
[2.34012849e+07]]
In [10]: print(’y-x = \n’, y-x)
y-x =
[[-2.34606334e-06]
[ 2.34012840e+07]
[ 2.34012840e+07]]
In [11]: print("%20.16e"% np.linalg.norm(x-y, 2))
3.3094413214227043e+07
Second, a small value 10−5 is added to the third component of vector d to
get a vector g and solve the linear system F z = g. In MATLAB, the following
commands are used:
>> g = d ;
>> g(3) = g(3) + 1e-5
g =
1001
1
1000.00001
>> z = F\g
Warning: Matrix is close to singular or badly scaled. Results
may be inaccurate.
RCOND = 1.067985e-16.
z =
0.999997659959457
23385189.5233867
23385189.523389
>> format long e
>> disp(’||z - x||_2 = ’), disp(norm(z-x, 2))
||z - x||_2 =
3.307165159616157e+07
In Python:
In [12]: g = d.copy()
In [13]: g[-1] += 1e-5
In [14]: z = np.linalg.solve(F, g)
In [15]: print(’z = \n’, z)
z =
[[ 9.99977257e-01]
[-2.94647268e+11]
[-2.94647268e+11]]
In [16]: print(’||x-z||_2 = \n’, "%20.16e"%
np.linalg.norm(x-z, 2))
||x-z||_2 =
4.1669416392824341e+11
Python does not issue a warning about the closeness of the matrix to being singular, but the matrix is indeed close to singular.
The results obtained in Example 3.2 show that a small change in either the coefficient matrix or the vector at the right-hand side leads to huge changes in the solution of the linear system. Sensitivity to small changes indicates that such a linear system is ill-conditioned [42].
x =
3918.1
-21520
35375
-46361
2.1855e+05
-4.3689e+05
52491
5.8691e+05
-4.1416e+05
In Python:
In [17]: import numpy as np
In [18]: H = np.array([1.65, 1.67, 1.68, 1.72, 1.77, 1.82, 1.86,
1.89, 1.9])
In [19]: W = np.array([57, 61, 64, 69, 75, 83, 90, 97, 100])
In [20]: V = np.fliplr(np.vander(H))
In [21]: A = V.T@V
In [22]: b = V.T@W
In [23]: x = np.linalg.solve(A, b)
In [24]: print(’x = \n’, x)
x =
[-1.26559701e+06 3.39786878e+06 -3.76621053e+06 2.29662532e+06
-8.88505759e+05 2.23976721e+05 -1.87100975e+04 -9.10775041e+03
2.24200821e+03]
Now we introduce a small change in the third component of the height vector and see how this change affects the resulting regression coefficients.
>> H1 = H ;
>> H1(3) = H1(3) + 0.01 ;
>> V1 = vander(H1) ;
>> B = V1’*V1 ;
>> b1 = V1’*W ;
>> y = B\b1
Warning: Matrix is close to singular or badly scaled. Results
may be inaccurate. RCOND = 1.439393e-20.
y =
2214
-13856
28739
-31202
90976
-2.4644e+05
2.4675e+05
-20840
-63193
>> disp(norm(x-y))
7.6365e+05
In Python:
In [25]: H1 = H.copy()
In [26]: H1[2] += 0.01
In [27]: V1 = np.fliplr(np.vander(H1))
In [28]: B = V1.T@V1
In [29]: b1 = V1.T@W
In [30]: y = np.linalg.solve(B, b1)
In [31]: print(’y = \n’, y)
y =
[-1.67279152e+05 2.06762159e+05 1.09086145e+05 -2.55301791e+05
8.12931633e+04 3.93681517e+04 -2.63131683e+04 3.30116939e+03
2.38312295e+02]
In [32]: print(’||x - y||_2 = ’, ’{0:1.6e}’.format(np.linalg
.norm(x-y)))
||x - y||_2 = 5.821901e+06
Example 3.4 In this example, we approximate the function $y(x) = 5xe^{-2x^2}$ by a polynomial P(x) of degree 12 on an interval [α, β]. The coefficients of the polynomial are computed by the function FunApproxCoef developed in the previous chapter.
The coefficient matrix A is of type 13 × 13, defined by
$$A_{ij} = \frac{\beta^{i+j-1} - \alpha^{i+j-1}}{i+j-1}, \quad i, j = 1, \ldots, 13,$$
and the right-hand side is a vector b of type 13 × 1, defined by
$$b_j = \int_{\alpha}^{\beta} x^{j-1} y(x)\,dx, \quad j = 1, \ldots, 13.$$
At the beginning the problem is solved without any change in either matrix
A or vector b. Then we make a small change in the last component in b such
that b13 = b13 + 10−4 .
The MATLAB code to compute the polynomials is:
13 C = A\b(:) ; C = wrev(C) ;
14 b1 = b ; b1(end) = b1(end) + 1e-4 ;
15 C1 = A\b1(:) ; C1 = wrev(C1) ;
16 t = linspace(al, bt, 401) ; F = f(t) ;
17 Pe = polyval(C, t) ; Pp = polyval(C1, t) ;
18
19 subplot(1, 2, 1) ; plot(t, F, '--b', 'LineWidth', 3) ;
20 xlabel('x', 'fontweight','bold') ; ylabel('y', 'fontweight','bold') ;
21 legend('y(x) = 5xe^{-2x^2}') ; grid on ; ax = gca ;
22 ax.FontWeight = 'bold' ; ax.FontSize = 12 ;
23 set(gca, 'XTick', linspace(al, bt, 9)) ; set(gca, 'YTick', linspace(1, 5, 9)) ;
24 axis([al, bt, 1, 5]) ;
25
26 subplot(1, 2, 2) ; plot(t, Pe, '--r', t, Pp, ':k', 'LineWidth', 3) ;
27 xlabel('x', 'fontweight','bold') ; ylabel('y', 'fontweight','bold') ;
28 legend('Without perturbation', 'With perturbation') ; grid on ;
29 ax = gca ; ax.FontWeight = 'bold' ; ax.FontSize = 12 ;
30 set(gca, 'XTick', linspace(al, bt, 9)) ; set(gca, 'YTick', linspace(1, 5, 9)) ;
31 axis([al, bt, 1, 5]) ;
1 import numpy as np
2 import matplotlib.pyplot as plt
3 from scipy import integrate
4 f = lambda x: 2.+5*x*np.exp(-2*x**2)
5 A, b = np.zeros((13, 13), 'float'), np.zeros((13, 1), 'float')
6 al, bt = 0., 2.
7 for i in range(13):
8 for j in range(i+1):
9 k = i+j+1
10 A[i, j] = (bt**k-al**k)/k
11 A[j, i] = A[i, j]
12 b[i] = integrate.quad(lambda x: x**i*f(x), al, bt)[0]
13 C = np.linalg.solve(A, b)
14 c = list(C.T[0])
15 c.reverse()
16 C = np.array(c)
17 t = np.linspace(al, bt+(bt-al)/400, 401)
18 F = f(t)
19 Pe = np.polyval(C, t)
20 b[-1] += 1e-4
21 C1 = np.linalg.solve(A, b)
22 c = list(C1.T[0])
23 c.reverse()
24 C1 = np.array(c)
25 Pp = np.polyval(C1, t)
26
27 plt.figure(1, figsize=(20, 8))
28 plt.subplot(1, 2, 1)
29 plt.plot(t, F, '--b', lw = 3, label = 'y(x) = 5xe^{-2x^2}')
30 plt.xlabel('x', fontweight='bold')
31 plt.ylabel('y', fontweight='bold')
32 plt.legend()
33 plt.xticks(np.arange(al, bt+(bt-al)/8, (bt-al)/8), fontweight='bold')
34 plt.yticks(np.arange(1., 5., 0.5), fontweight='bold')
35 plt.grid(True, ls=':')
36 plt.axis([al, bt, 1., 5.])
37
38 plt.subplot(1, 2, 2)
39 plt.plot(t, Pe, '--b', lw = 3, label = 'Without perturbation')
40 plt.plot(t, Pp, ':k', lw = 3, label = 'With perturbation')
41 plt.xlabel('x', fontweight='bold')
42 plt.ylabel('y', fontweight='bold')
43 plt.legend(loc="upper center")
44 plt.xticks(np.arange(al, bt+(bt-al)/8, (bt-al)/8), fontweight='bold')
45 plt.yticks(np.arange(1., 5., 0.5), fontweight='bold')
46 plt.grid(True, ls=':')
47 plt.axis([al, bt, 1., 5.])
Figure 3.1 contains two subgraphs. In the first subgraph (at the left side) the original function $y(x) = 5xe^{-2x^2}$ is plotted. In the second subgraph (at the right side) the coefficients resulting from the linear systems without and with the change in the right-hand side are used to graph the approximating polynomials.
FIGURE 3.1: Left: the exact function. Right: the approximating polynomials
of degree 12 without and with changes in the last component of vector b.
(b) $\|A\|_\infty = \max\left\{\sum_{j=1}^{n} |a_{1,j}|,\; \sum_{j=1}^{n} |a_{2,j}|,\; \ldots,\; \sum_{j=1}^{n} |a_{m,j}|\right\}$
(c) $\|A\|_2 = \max\left\{\sqrt{\sigma_1}, \ldots, \sqrt{\sigma_m}\right\}$, where $\sigma_j$ is the $j$th eigenvalue of $A \cdot A^T$.
We can use the Python command norm to compute $\|A\|_p$ of matrix A.
Both MATLAB and Python adopt IEEE double precision numbers as the default type of their floating-point numerical variables. The smallest binary fraction that can be represented by 52 bits is $2^{-52} \approx 2.2204 \times 10^{-16}$. This number is called the machine precision and is denoted by ε. In MATLAB the machine precision can be seen by typing eps in the command window:
>> disp(eps)
2.2204e-16
>> disp(sin(0.0))
0
>> disp(sin(2*pi))
-2.4493e-16
because Ax = b. Then,
$$\|\Delta x\| = \|A^{-1}\Delta b\| \leq \|A^{-1}\|\,\|\Delta b\|. \qquad (3.2)$$
Dividing both sides by $\|x + \Delta x\|$ and multiplying the right-hand side by $\frac{\|A\|}{\|A\|}$ gives:
$$\frac{\|\Delta x\|}{\|x + \Delta x\|} \leq \kappa(A)\,\frac{\|\Delta A\|}{\|A\|} \qquad (3.4)$$
Equation (3.4) relates the relative change in the solution vector x to the relative change in matrix A, in terms of the condition number of matrix A.
It can be seen that the smaller the condition number of a matrix A, the less
the linear system will be sensitive to perturbations, and the resulting system is
well-posed. Also, the larger the condition number, the more sensitive the linear
system is to perturbations, and the resulting system is ill-posed. The condition
number of the 3 × 3 zero matrix is ∞ and for the 3 × 3 identity matrix is 1.
The MATLAB command cond can be used for finding the condition number
of a matrix A.
0 2.0000 4.0030
>> cond(A)
ans =
1.6676e+05
>> cond(zeros(3))
ans =
Inf
>> cond(eye(3))
ans =
1
10 1.602503e+13
11 5.220207e+14
12 1.621164e+16
13 4.786392e+17
14 2.551499e+17
15 2.495952e+17
This shows that as the dimensions of Hilbert matrices increase, they become
more ill-conditioned. It is also noticed that MATLAB and Python do agree
on the values of condition numbers as long as rcond(Hn ) ≥ eps. When
rcond(Hn ) < eps they could have different roundoff errors, causing them to
produce different condition numbers.
Another example is the Vandermonde matrix, used with least-squares approximations. As the number of data points increases, the condition number of the corresponding Vandermonde matrix increases, so it becomes more ill-conditioned. The following MATLAB commands show the condition numbers of the Vandermonde matrix for different numbers of data points:
>> fprintf(’n\t\t ||V_n||\n’); for n = 2 : 13
fprintf(’%i\t\t%10.6e\n’, n, cond(vander(H(1:n)))) ; end
n ||V_n||
2 3.755673e+02
3 1.627586e+05
4 3.051107e+07
5 3.481581e+09
6 2.480023e+19
7 2.970484e+21
8 6.929557e+24
9 7.174378e+27
10 8.795704e+31
11 2.868767e+35
12 1.380512e+39
13 1.532255e+42
In Python:
In [44]: for n in range(2, len(H)): print(n, ’\t\t’,
’{0:1.6e}’.format(cond(vander(H[:n]))))
2 3.755673e+02
3 1.627586e+05
4 3.051107e+07
5 3.481581e+09
6 2.479949e+19
7 2.970484e+21
8 6.929557e+24
9 7.174382e+27
10 8.796918e+31
11 2.862293e+35
74 Ill-Conditioning and Regularization Techniques
12 1.375228e+39
13 1.740841e+42
If the condition number of a matrix A is of order $10^{\ell}$, then when solving a linear system Ax = b, up to $\ell$ of the rightmost decimal places of the computed solution can be inaccurate (noting that about 16 decimal places are correctly represented by double-precision numbers). To see this, we consider a Vandermonde matrix generated by a random vector v:
>> v = rand(10, 1)
v =
4.4559e-01
6.4631e-01
7.0936e-01
7.5469e-01
2.7603e-01
6.7970e-01
6.5510e-01
1.6261e-01
1.1900e-01
4.9836e-01
>> V = fliplr(vander(v)) ;
In Python, the vector v and the Vandermonde matrix V of v can be gen-
erated by using the Python commands:
In [45]: v = np.random.rand(10)
In [46]: v
Out[46]:
array([0.47585697, 0.31429675, 0.73920316, 0.45044728, 0.16221156,
0.8241245, 0.9038605, 0.28001448, 0.85937663, 0.07834397])
In [47]: V = np.fliplr(np.vander(v))
We can measure the condition number of matrix V in MATLAB:
>> Cv = cond(V)
Cv =
1.730811304916736e+10
In Python:
In [48]: cV = np.linalg.cond(V)
In [49]: print(’{0:1.6e}’.format(cV))
8.671331e+07
Let x be a column vector of ones of dimension 10 and b = V x.
>> x = ones(10, 1) ;
>> b = V * x ;
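The rest of the experiment can be sketched in Python: build the right-hand side from the exact solution of ones, solve the system, and compare (the exact error printed depends on the random vector v):

import numpy as np

v = np.random.rand(10)
V = np.fliplr(np.vander(v))           # Vandermonde matrix of v
x = np.ones((10, 1))
b = V @ x                             # right-hand side built from the exact solution
z = np.linalg.solve(V, b)             # computed solution
print('cond(V)     :', np.linalg.cond(V))
print('||x - z||_2 :', np.linalg.norm(x - z, 2))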
where $U_i^T$ and $V_i$ are the $i$th columns of the matrices U and V, and the $\sigma_i$'s are the singular values of F, $i = 1, \ldots, n$, with $\sigma_1 \geq \sigma_2 \geq \ldots \geq \sigma_n \geq 0$.
Therefore, the solution x is a linear combination of $\{V_1, \ldots, V_n\}$, the columns of V, with coefficients $\frac{U_i^T y}{\sigma_i}$, $i = 1, \ldots, n$.
When noisy data $\bar{y} = y + e$ is used instead of the exact data y, we get
$$F^{-1}\bar{y} = \sum_{i=1}^{n} \frac{U_i^T \bar{y}}{\sigma_i} V_i = \sum_{i=1}^{n} \frac{U_i^T y}{\sigma_i} V_i + \sum_{i=1}^{n} \frac{U_i^T e}{\sigma_i} V_i = x + \sum_{i=1}^{n} \frac{U_i^T e}{\sigma_i} V_i \qquad (3.9)$$
From equation (3.9), we find that $\|F^{-1}\bar{y} - x\|_2^2 = \sum_{i=1}^{n} \left(\frac{U_i^T e}{\sigma_i}\right)^2$. As the matrix F tends to be singular, some of its singular values tend to zero, and hence some of the coefficients $\frac{U_i^T e}{\sigma_i}$ become very large. This tells us that the error norm is affected not only by how small the noise is, but also by how small the singular values of the matrix F are.
As an explanation, let us look at the singular values of the 20 × 20 Hilbert
matrix:
>> H = hilb(20) ;
>> [U, S, V] = svd(H) ;
>> format short e ;
>> D = diag(S)
D =
1.9071e+00
Ill-Conditioning in Solutions of Linear Systems 77
4.8704e-01
7.5596e-02
8.9611e-03
8.6767e-04
7.0334e-05
4.8305e-06
2.8277e-07
1.4140e-08
6.0361e-10
2.1929e-11
6.7408e-13
1.7384e-14
3.7318e-16
1.5057e-17
1.2511e-17
7.3767e-18
5.4371e-18
1.7279e-18
5.8796e-19
We see that the last 6 singular values of H (= USVᵀ) are below ε (= 2.2204 × 10⁻¹⁶), and hence cannot be distinguished from 0 on 64-bit systems. Dividing by such a singular value when computing H⁻¹ = VS⁻¹Uᵀ causes huge errors, since the diagonal elements of S⁻¹ are the reciprocals of the diagonal elements of S.
According to the distribution of the singular values of F, ill-posed problems (whose coefficient matrices are ill-conditioned) are divided into two classes, namely the rank-deficient problems and the discrete ill-posed problems. These two classes of problems can be distinguished by using the following properties [36]:
1. in the rank deficient problems, there is a small cluster of small singular
values, while in discrete ill-posed problems there is a large cluster of small
singular values.
2. in the rank deficient problems there is a clear gap between the large and
small singular values, while in discrete ill-posed problems the singular
values decay gradually without gaps between the large and small singular
values.
3. for the rank deficient problems, there could be a formulation which can eliminate the ill-conditioning, but no such formulation exists for the discrete ill-posed problems.
$$Fx = y, \quad F \in \mathbb{R}^{m \times n},\; x \in \mathbb{R}^n \text{ and } y \in \mathbb{R}^m,$$
such that, for each $\bar{y}$ satisfying $\|\bar{y} - y\| \leq \epsilon$, there exists $\alpha \in (0, \infty)$ such that
$$x_\alpha \overset{\mathrm{def}}{=} \Gamma_\alpha \bar{y} \to x \quad \text{as } \epsilon \to 0.$$
Some regularization techniques include the truncated SVD method, the Tikhonov regularization method, the L-curve method and the Morozov discrepancy principle [8, 9].
In this section we discuss these methods.
$$Fx = b, \quad F \in \mathbb{R}^{n \times n},\; x \in \mathbb{R}^n \text{ and } b \in \mathbb{R}^n,$$
and supposing that U, Σ and V are the SVD factors of matrix F, such that $F = U\Sigma V^T$ and hence $F^{-1} = V\Sigma^{-1}U^T$, the unique solution of the linear system $Fx = b$ is given by:
$$x = V\Sigma^{-1}U^T b = \sum_{j=1}^{n} \frac{u_j^T b}{\sigma_j} v_j = \frac{u_1^T b}{\sigma_1}v_1 + \cdots + \frac{u_n^T b}{\sigma_n}v_n$$
1 H = hilb(20) ;
2 y = ones(20, 1) ;
3 b = H * y ;
4 z = H\b ;
5 subplot(1, 2, 1) ;
6 plot(1:20, y, '-b', 1:20, z, 'r:', 'LineWidth', 3) ;
7 legend('Exact Solution', 'Unregularized Solution') ;
8 xlabel('Solution Component') ;
9 ylabel('Component Value') ;
10 grid on ;
11 format short e
12 disp('Error norm of unregularized solution ||y-z||_2'), disp(norm(y-z))
FIGURE 3.2: Left: the exact solution vs. the unregularized solution of Hx = b, by component. Right: the exact solution vs. the regularized (TSVD) solution.
13 w = SolveWithTSVD(H, b, eps) ;
14 subplot(1, 2, 2) ;
15 plot(1:20, y, '-b', 1:20, w, 'r:', 'LineWidth', 3) ;
16 legend('Exact Solution', 'Regularized Solution') ;
17 xlabel('Solution Component') ;
18 ylabel('Component Value') ;
19 grid on
20 disp('Error norm of regularized solution ||y-w||_2'), disp(norm(y-w))
21
22 x = zeros(20, 15) ;
23 Err = zeros(15, 2) ;
24 disp('    Alpha        ||x-y||_2') ;
25 disp('-------------------------') ;
26 for n = 1 : 15
27 Alpha = 10ˆ(n-1)*eps ;
28 x(:, n) = SolveWithTSVD(H, b, Alpha) ;
29 Err(n, 1) = Alpha ;
30 Err(n, 2) = norm(y-x(:, n), inf) ;
31 end
32 disp(Err) ;
Alpha ||x-y||_2
------------------------
2.2204e-16 1.4427e-01
2.2204e-15 2.6146e-02
2.2204e-14 3.3896e-04
2.2204e-13 3.3896e-04
2.2204e-12 2.4002e-06
2.2204e-11 7.8511e-06
2.2204e-10 7.8511e-06
2.2204e-09 4.5607e-05
2.2204e-08 2.1812e-04
2.2204e-07 2.1812e-04
2.2204e-06 1.0205e-03
2.2204e-05 4.2420e-03
2.2204e-04 1.5665e-02
Regularization of Solutions in Linear Systems 81
2.2204e-03 5.1211e-02
2.2204e-02 1.6937e-01
1 import numpy as np
2 from scipy.linalg import hilbert as hilb
3 from numpy.linalg import norm, svd
4 import matplotlib.pyplot as plt
5 def SolveWithTSVD(A, b, Alpha):
6     U, S, V = svd(A)
7     x = np.zeros_like(b)
8     n = len(S)
9     for j in range(n):
10         if S[j] >= Alpha:
11             x += np.dot(U[:,j], b)/S[j]*V.T[:,j]
12         else:
13             continue
14     return x
15
16 H = hilb(20)
17 y = np.ones((20,), 'float')
18 b = H@y
19 z = np.linalg.solve(H, b)
20 Eps = np.spacing(1.0)
21 Alpha=Eps
22 w = SolveWithTSVD(H, b, Alpha)
23 print('Error norm for Unregularized solution = ', norm(y-z))
24 print('Error norm for Regularized solution = ', norm(y-w))
25 plt.figure(1)
26 plt.subplot(1, 2, 1)
27 t = np.arange(1, len(b)+1)
28 plt.plot(t, y, '-b', lw=3, label='Exact solution')
29 plt.plot(t, z, '-.r', lw=3, label='Unregularized solution')
30 plt.xlabel('Solution component', fontweight='bold')
31 plt.ylabel('Component value', fontweight='bold')
32 plt.grid(True, ls=':')
33 plt.legend()
34
35 plt.subplot(1, 2, 2)
36 t = np.arange(1, len(b)+1)
37 plt.plot(t, y, '-b', lw=3, label='Exact solution')
38 plt.plot(t, w, ':m', lw=3, label='Regularized solution')
39 plt.xlabel('Solution component', fontweight='bold')
40 plt.ylabel('Component value', fontweight='bold')
41 plt.grid(True, ls=':')
42
43 Err = np.zeros((16, 2), 'float')
44 for j in range(16):
45 Alpha = 10**j*np.spacing(1.)
46 w = SolveWithTSVD(H, b, Alpha)
47 Err[j, 0] = Alpha
48 Err[j, 1] = norm(y-w)
49 print(Err)
x_α = ∑_{i=1}^{n} (σ_i / (α + σ_i²)) (U_i^T y) V_i        (3.11)
1 import numpy as np
2 from scipy.linalg import hilbert as hilb
3 from numpy.linalg import norm, svd
4 import matplotlib.pyplot as plt
5 def SolveWithTikhonov(A, b, Alpha):
6     U, S, V = svd(A)
7     x = np.zeros_like(b)
8     n = len(S)
9     for j in range(n):
10         x += np.dot(U[:,j], b)*S[j]/(S[j]**2+Alpha)*V.T[:,j]
11     return x
12
13 H = hilb(20)
14 y = np.ones((20,), 'float')
15 b = H@y
16 z = np.linalg.solve(H, b)
17 Eps = np.spacing(1.0)
18 Alpha=Eps
19 w = SolveWithTikhonov(H, b, Alpha)
20 print('Error norm for Unregularized solution = ', ...
'{0:1.8e}'.format(norm(y-z)))
21 print('Error norm for Regularized solution = ', ...
'{0:1.8e}'.format(norm(y-w)))
22 plt.figure(1)
23 plt.subplot(1, 2, 1)
24 t = np.arange(1, len(b)+1)
25 plt.plot(t, y, '-b', marker='s', lw=3, label='Exact solution')
26 plt.plot(t, z, '-.r', marker='o', lw=3, label='Unregularized ...
solution')
27 plt.xlabel('Solution component', fontweight='bold')
28 plt.ylabel('Component value', fontweight='bold')
29 plt.grid(True, ls=':')
30 plt.legend()
31
32 plt.subplot(1, 2, 2)
33 t = np.arange(1, len(b)+1)
34 plt.plot(t, y, '-.b', marker='s', lw=3, label='Exact solution')
35 plt.plot(t, w, ':m', marker='o', lw=3, label='Regularized ...
solution')
36 plt.xlabel('Solution component', fontweight='bold')
37 plt.ylabel('Component value', fontweight='bold')
38 plt.grid(True, ls=':')
39 plt.legend()
40
41 Err = np.zeros((16, 2), 'float')
42 for j in range(16):
43 Alpha = 10**j*np.spacing(1.)
44 w = SolveWithTikhonov(H, b, Alpha)
45 Err[j, 0] = Alpha
46 Err[j, 1] = norm(y-w)
47 print('{0:1.6e}'.format(Err[j, 0]), '\t', ...
'{0:1.8e}'.format(Err[j, 1]))
FIGURE 3.3: Unregularized solution and regularized solution with reg. param
α = ε of Hx = b.
The error norms are computed through the above Python code, and the outputs are as follows:
Error norm for Unregularized solution = 1.36970281e+02
Error norm for Regularized solution = 2.96674549e-04
2.220446e-16 2.96674549e-04
2.220446e-15 4.95587019e-04
2.220446e-14 7.25227169e-04
2.220446e-13 1.76193763e-03
2.220446e-12 2.40230468e-03
2.220446e-11 5.05832063e-03
2.220446e-10 8.76991238e-03
2.220446e-09 1.41598120e-02
2.220446e-08 2.96964698e-02
2.220446e-07 4.37394775e-02
2.220446e-06 9.44807071e-02
2.220446e-05 1.42053519e-01
2.220446e-04 2.95683135e-01
2.220446e-03 4.69592530e-01
2.220446e-02 9.24797653e-01
2.220446e-01 1.61864335e+00
The MATLAB code is:
1 H = hilb(20) ;
2 y = ones(20, 1) ;
3 b = H * y ;
4 z = H\b ;
5 subplot(1, 2, 1) ;
6 plot(1:20, y, '-b', 1:20, z, '-.m', 'LineWidth', 3) ;
7 legend('Exact Solution', 'Unregularized Solution') ;
8 xlabel('Solution Component') ;
9 ylabel('Component Value') ;
10 grid on ;
11 format short e
12 disp('Error norm of unregularized solution ||y-z||_2'), ...
   disp(norm(y-z))
13 w = SolveWithTikhonov(H, b, eps) ;
14 subplot(1, 2, 2) ;
15 plot(1:20, y, '-b', 1:20, w, 'r:', 'LineWidth', 3) ;
16 legend('Exact Solution', 'Regularized Solution') ;
17 xlabel('Solution Component') ;
18 ylabel('Component Value') ;
19 grid on
20 disp('Error norm of regularized solution ||y-w||_2'), ...
   disp(norm(y-w))
21
22 x = zeros(20, 15) ;
23 Err = zeros(15, 2) ;
24 disp('-------------------------') ;
25 disp('    Alpha          ||x-y||_2') ;
26 disp('-------------------------') ;
27 for n = 1 : 15
28 Alpha = 10ˆ(n-1)*eps ;
29 x(:, n) = SolveWithTikhonov(H, b, Alpha) ;
30 Err(n, 1) = Alpha ;
31 Err(n, 2) = norm(y-x(:, n), inf) ;
32 end
33 disp(Err) ;
34
35 function x = SolveWithTikhonov(A, b, Alpha)
36 [U, S, V] = svd(A) ;
37 D = diag(S) ;
38 [m, n] = size(A) ;
39 x = zeros(m, 1) ;
40 for i = 1 : n
41 x = x + U(:, i)'*b*D(i)/((D(i))ˆ2+Alpha)*V(:, i) ;
42 end
43 end
Minimizing ϕ_α(x), which appears in equation (3.12), is equivalent to solving the normal equations

(F^T F + α L^T L) x_α = F^T b.
Hansen [22] considered the use of the generalized singular value decomposi-
tion (GSVD) to obtain the solution of problem (3.15). The generalized singular
value decomposition of a pair (F, L), where F ∈ R^{m×n} and L ∈ R^{p×n}, with p ≤ n ≤ m, is given by the form

F = U [ Σ  0 ;  0  I_{n−p} ] X^{-1},    L = V ( M  0 ) X^{-1}        (3.16)

where Σ = diag(σ_1, σ_2, . . . , σ_p) and M = diag(µ_1, µ_2, . . . , µ_p).
‖L x‖ ≤ M
‖F x − b‖ ≤ ε        (3.19)

(F^T F + α B^T B) x_α = F^T b        (3.20)

subject to

‖F x_α − b‖ = ε        (3.21)

and the Morozov discrepancy principle is the problem of selecting α such that (3.20)–(3.21) are satisfied.
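As an illustration of how the regularization parameter might be selected in practice, the following short Python sketch (an illustration only, not one of the book's listings) performs a simple bisection on log α, reusing the SolveWithTikhonov function defined earlier in this section, until the residual norm ‖F x_α − b‖ matches a prescribed noise level ε:

import numpy as np

def discrepancy_alpha(F, b, eps_noise, lo=1e-16, hi=1e2, iters=60):
    # Morozov's discrepancy principle: the residual norm ||F x_alpha - b||
    # grows monotonically with alpha, so bisect on log(alpha) until the
    # residual matches the assumed noise level eps_noise.
    for _ in range(iters):
        mid = np.sqrt(lo*hi)                     # geometric midpoint of [lo, hi]
        x_mid = SolveWithTikhonov(F, b, mid)     # Tikhonov solution for alpha = mid
        if np.linalg.norm(F @ x_mid - b) < eps_noise:
            lo = mid                             # residual too small: increase alpha
        else:
            hi = mid                             # residual too large: decrease alpha
    return np.sqrt(lo*hi)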
4
Solving a System of Nonlinear Equations
Abstract
This chapter discusses the solutions of nonlinear systems using MATLAB®
and Python. It is divided into two sections. The first section presents four numerical methods for solving a single nonlinear equation. The second section
discusses numerical methods for solving a system of nonlinear equations.
f (x) = 0
FIGURE 4.1: Graphs of the functions e^{−x} + 0.5 (dashed curve) and cos(x) (dash-dotted curve). The x-coordinate of the circle at the intersection of the two curves is the root of e^{−x} − cos(x) + 0.5 = 0 in [4, 6].
The root of the function f either lies in [a, c] or in [c, b]. If it lies in [a, c], then f(a) · f(c) < 0; in this case, we know that the root does not lie in [c, b] and we look for it in [a, c]. If it lies in [c, b], then f(c) · f(b) < 0, and in this case we know that the root does not lie in [a, c], so we look for it in [c, b]. The bisection method continues dividing the interval that contains the root into two sub-intervals, such that one sub-interval contains the root whereas the other does not; the method keeps only the half that contains the root and drops the other half.
The MATLAB code that implements the bisection method is as follows:
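The listing itself falls outside this excerpt; a minimal MATLAB sketch, consistent with the call to Bisection(f, a, b, Epsilon) shown below, would be:

function r = Bisection(f, a, b, Epsilon)
% Bisection method; assumes f(a)*f(b) < 0 on the initial interval [a, b].
while (b - a)/2 > Epsilon
    c = (a + b)/2 ;
    if f(a)*f(c) < 0
        b = c ;      % the root lies in [a, c]
    else
        a = c ;      % the root lies in [c, b]
    end
end
r = (a + b)/2 ;
end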
Now, we can call the function Bisection from the command prompt:
>> format long
>> Epsilon = 1e-8 ;
>> f = @(x) xˆ2 - 3 ;
>> r = Bisection(f, 1, 2, Epsilon)
r =
1.732050813734531
Now, if x1 is a root of the function f(x), then f(x1) = 0. That is:

x_1 = x_0 − f(x_0)/f′(x_0)

and, in general, the Newton-Raphson iteration is

x_{n+1} = x_n − f(x_n)/f′(x_n)
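The Python function NewtonRaphson used in the session below is not reproduced in this excerpt; a minimal sketch consistent with the calls (returning the approximate root and the number of iterations) would be:

def NewtonRaphson(f, fp, x0, Eps, MaxIters=100):
    # Newton-Raphson iteration x_{n+1} = x_n - f(x_n)/f'(x_n),
    # stopped when successive iterates differ by less than Eps.
    Iters = 0
    x = x0 - f(x0)/fp(x0)
    while abs(x - x0) >= Eps and Iters < MaxIters:
        x0 = x
        x = x0 - f(x0)/fp(x0)
        Iters += 1
    return x, Iters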
Running the above code with x0 = 1.0 one time and x0 = −1.0 another time:
In [6]: x, Iters = NewtonRaphson(f, fp, x0, Eps)
In [7]: print(’Approximate root is:’, x, ’\nIterations:’, Iters)
Approximate root is: 1.7320508100147276
Iterations: 4
In [8]: x, Iters = NewtonRaphson(f, fp, -x0, Eps)
In [9]: print(’Approximate root is:’, x, ’\nIterations:’, Iters)
Approximate root is: -1.7320508100147276
Iterations: 4
f′(x_n) ≈ (f(x_n) − f(x_{n−1})) / (x_n − x_{n−1}).
Starting from some interval [a, b] that contains a root for f (x), the secant
method iteration approaches the zero of f (x) in [a, b].
The MATLAB function Secant implements the secant method. It receives a function f, the limits of the interval [a, b] that contains the root of f, and a tolerance ε > 0. It applies the secant method and returns an approximate solution x and the number of iterations Iterations.
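The listing of Secant itself is not reproduced in this excerpt; a minimal MATLAB sketch consistent with that description would be:

function [x, Iterations] = Secant(f, a, b, Epsilon)
% Secant iteration: the derivative in Newton's formula is replaced by
% the difference quotient (f(xn) - f(xn-1))/(xn - xn-1).
x0 = a ; x1 = b ;
Iterations = 0 ;
x = x1 - f(x1)*(x1 - x0)/(f(x1) - f(x0)) ;
while abs(x - x1) >= Epsilon
    x0 = x1 ; x1 = x ;
    x = x1 - f(x1)*(x1 - x0)/(f(x1) - f(x0)) ;
    Iterations = Iterations + 1 ;
end
end

Calling it (for example on the same function f(x) = x² − 3 used above) produces output of the form: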
x =
1.732050807565499
Iterations =
5
If the function f(x) is written in the form

f(x) = g(x) − x,

then a root x1 of f(x) satisfies g(x1) = x1; that means x1 is a fixed point of the function g(x). The idea behind the iterative method towards a fixed point is to write the function f(x) in the shape g(x) − x and, starting from some initial guess x0, to generate a sequence of numbers x0, x1, x2, . . . that converges to the fixed point of the function g(x), using the iterative rule:

x_{n+1} = g(x_n)
To show how the iterative method works, it will be applied to find the roots of the function

f(x) = x² − 3.

First, the function x² − 3 is written in the form g(x) − x. One possible choice is to write:

f(x) = x² − 3 = x² + 2x + 1 − 2x − 4 = (x + 1)² − 2x − 4,

from which

x* = ((x* + 1)² − 4) / 2.

Starting from an initial point x0, the iterative method to find a root of f(x) is:

x^(n+1) = ((x^(n) + 1)² − 4) / 2
The following MATLAB code computes a root of the function f(x) = x² − 3 using the iterative method:
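The MATLAB listing itself falls outside this excerpt; a minimal Python sketch consistent with the IterativeMethod calls shown below would be:

def IterativeMethod(g, x0, Eps, MaxIters=1000):
    # Fixed-point iteration x_{n+1} = g(x_n), stopped when successive
    # iterates differ by less than Eps (or after MaxIters iterations).
    Iters = 0
    x = g(x0)
    while abs(x - x0) >= Eps and Iters < MaxIters:
        x0 = x
        x = g(x0)
        Iters += 1
    return x, Iters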
Running the function IterativeMethod with g(x) = ((x + 1)² − 4)/2 one time and g(x) = (4 − (x − 1)²)/2 another time:
In [15]: x0, Eps = 1.0, 1e-8
In [16]: g = lambda x: ((x+1)**2-4)/2.0
In [17]: x, Iters = IterativeMethod(g, x0, Eps)
In [18]: print(’Approximate root is:’, x, ’\nIterations:’, Iters)
Approximate root is: -1.732050811416889
Iterations: 59
In [19]: g = lambda x: (4-(x-1)**2)/2.0
In [20]: x, Iters = IterativeMethod(g, x0, Eps)
In [21]: print(’Approximate root is:’, x, ’\nIterations:’, Iters)
Approximate root is: 1.732050811416889
Iterations: 59
Note: It is worth noticing that the selection of the function g(x) is not unique. For example, for the function f(x) = x² − 3, the following iterative forms can be used:
1. x^(n+1) = (4 − (x^(n) − 1)²) / 2
2. x^(n+1) = ((3 − x^(n))(1 + x^(n))) / 2
-3ˆ(1/2)
>> x = solve(’exp(-x)-sin(x)’)
x =
0.5885327439818610774324520457029
>> x = solve(’xˆ3-cos(x)+log(x)’)
x =
0.89953056480788905732035721409122
Note: The Python symbolic library does not look as mature as the MAT-
LAB symbolic toolbox. The second and third problems cannot be solved with
Python, but MATLAB solves them. Python raises the exception:
In [25]: solve(exp(-x)-sin(x), x)
Traceback (most recent call last):
File "<ipython-input-43-7d4dd4404520>", line 1, in <module>
solve(exp(-x)-sin(x), x)
raise NotImplementedError(’\n’.join([msg, not_impl_msg % f]))
NotImplementedError: multiple generators [exp(x), sin(x)]
No algorithms are implemented to solve equation -sin(x) + exp(-x)
In [26]: solve(x**3-cos(x)+log(x))
Traceback (most recent call last):
...
NotImplementedError: multiple generators [x, cos(x), log(x)]
No algorithms are implemented to solve equation x**3 + log(x) - cos(x)
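A practical workaround (an aside, not part of the original text) is to fall back on sympy's numerical solver nsolve, which only needs an initial guess:

In [27]: from sympy import nsolve, symbols, exp, sin, cos, log
In [28]: x = symbols('x')
In [29]: nsolve(exp(-x) - sin(x), x, 0.5)        # ~0.58853, matching the MATLAB result above
In [30]: nsolve(x**3 - cos(x) + log(x), x, 1.0)  # ~0.89953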
f_1(x_1, x_2, . . . , x_n) = 0
f_2(x_1, x_2, . . . , x_n) = 0
⋮
f_n(x_1, x_2, . . . , x_n) = 0
where J(x*) is the Jacobian matrix, whose (i, j) component is defined by:

(J(x*))_{ij} = ∂f_i(x)/∂x_j evaluated at x = x*

Now, setting f(x) = 0, we obtain the equation:

x = x* − J^{-1}(x*) f(x*)

Starting from an initial guess x^(0) for the solution of f(x) = 0, the iteration:

x^(n+1) = x^(n) − J^{-1}(x^(n)) f(x^(n))

converges to the solution of f(x) = 0 closest to the initial point x^(0).
In MATLAB, the function ’jacobian’ can be used to find the Jacobian
matrix of the nonlinear system of equations f (x).
x² + y² = 30
−x² + y² = 24

We write:

f(x, y) = [x² + y² − 30, −x² + y² − 24]^T

The Jacobian matrix of the nonlinear system is given by:

J(x, y) = [ 2x  2y ; −2x  2y ]
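For this system, the jacobian function mentioned above (available with MATLAB's Symbolic Math Toolbox) reproduces this matrix; a short illustrative session might be:

>> syms x y
>> J = jacobian([x^2 + y^2 - 30; -x^2 + y^2 - 24], [x, y])   % returns [2*x, 2*y; -2*x, 2*y]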
The following MATLAB code implements the solution of the above given
problem, starting from the initial guess [1, 1]T .
1 % Newton_sys.m
2 function [z, Iterations] = Newton_sys(f, J, z0, Eps)
3     iJ = @(z) inv(J(z)) ;
4     z = z0 - iJ(z0)*f(z0) ;
5     Iterations = 0 ;
6     while norm(z-z0, inf) >= Eps
7         z0 = z ;
8         z = z0 - iJ(z0)*f(z0) ;
9         Iterations = Iterations + 1 ;
10    end
It is also possible to use the MATLAB function ’solve’ to find all the
solutions of f (x) = 0, by using the MATLAB command:
[x1, x2, ..., xn] = solve([f1, f2, ..., fn], [x1, x2, ..., xn]);
For example:
>> [x, y] = solve(’xˆ2+yˆ2-30’, ’-xˆ2+yˆ2-24’, [x y])
x =
3ˆ(1/2)
3ˆ(1/2)
-3ˆ(1/2)
-3ˆ(1/2)
y =
3*3ˆ(1/2)
-3*3ˆ(1/2)
3*3ˆ(1/2)
-3*3ˆ(1/2)
Using the sympy library, the problem can be solved through the following
code:
In [34]: from sympy import *
In [35]: x, y = symbols(’x, y’)
In [36]: solve([x**2+y**2-30, -x**2+y**2-24], [x, y])
Out[37]:
[(-sqrt(3), -3*sqrt(3)),
(-sqrt(3), 3*sqrt(3)),
(sqrt(3), -3*sqrt(3)),
(sqrt(3), 3*sqrt(3))]
Example 4.2 We use Newton's method to solve the nonlinear system [47]:

3x_1 − cos(x_2 x_3) − 1/2 = 0
x_1² − 81(x_2 + 0.1)² + sin(x_3) + 1.06 = 0
e^{−x_1 x_2} + 20 x_3 + (10π − 3)/3 = 0

We write:

f(x) = [ 3x_1 − cos(x_2 x_3) − 1/2,  x_1² − 81(x_2 + 0.1)² + sin(x_3) + 1.06,  e^{−x_1 x_2} + 20 x_3 + (10π − 3)/3 ]^T
The MATLAB code to compute the solution using the above iteration is
given by:
>> clear ; clc ;
>> f = @(z)[3*z(1)-cos(z(2)*z(3))-1/2 ;
z(1)ˆ2-81*(z(2)+0.1)ˆ2+sin(z(3))+1.06 ;
exp(-z(1)*z(2))+20*z(3)+(10*pi-3)/3] ;
>> J = @(z) [3, z(3)*sin(z(2)*z(3)), z(2)*sin(z(2)*z(3));
2*z(1), - 162*z(2) - 81/5, cos(z(3));
-z(2)*exp(-z(1)*z(2)), -z(1)*exp(-z(1)*z(2)), 20] ;
>> z0 = [0.1; 0.1; -0.1] ;
>> Eps = 1e-8 ;
>> [z, Iterations] = Newton_sys(f, J, z0, Eps) ;
>> fprintf(’Iterations = %i\n’, Iterations) ;
Iterations = 4
>> fprintf(’The solution of the system is given by:\n\n\t\t\t\t x1 =
%18.15f\n\t\t\t\t x2 = %18.15f\n\t\t\t\t x3 = %18.15f\n’, z(1), z(2),
z(3)) ;
The solution of the system is given by:
x1 = 0.500000000000000
x2 = -0.000000000000000
x3 = -0.523598775598299
5
Data Interpolation
Abstract
Data interpolation means to use a given set of n+1 data points to approximate
a function f (x) by a polynomial Pn (x) = an xn + an−1 xn−1 + . . . + a1 x + a0 (of
degree not exceeding n), such that Pn (xi ) = f (xi ), i = 0, . . . , n, where a0 , . . . , an
are constants. The data points are given by the table:
x x0 x1 ... xn
f (x) f (x0 ) f (x1 ) ... f (xn )
ψ(x) = (x − x_0)(x − x_1) · · · (x − x_n)

L̃(x) = ψ(x)/(x − x_i) = (x − x_0)(x − x_1) · · · (x − x_{i−1})(x − x_{i+1}) · · · (x − x_n),
1 function p = LagrangeInterp(t, x, y)
2 n = length(x) ;
3 p = 0 ;
4 for i = 1 : n
5 s = 1 ;
6 for j = 1 : n
7 if j ~= i
8 s = s*(t-x(j))/(x(i)-x(j)) ;
9 else
10 continue
11 end
12 end
13 p = p + s*y(i) ;
14 end
1 # LagInterp.py
2 import numpy as np
3 def LagrangeInterp(t, x, y):
4 n = len(x)
5 p = 0.0
6 for i in range(n):
7 s = 1
8 for j in range(n):
9 if j != i:
10 s *= (t-x[j])/(x[i]-x[j])
11 else:
12 continue
13 p += s*y[i]
14 return p
15
16 x = np.linspace(0.0, np.pi, 11)
17 y = np.sin(x)
18 p = LagrangeInterp(np.pi/6, x, y)
19 print(p)
Since the first n + 1 derivatives exist for both ψ(x) and E_n(x), they also exist for G(x). Moreover, G(x) has n + 2 distinct roots in [a, b]. From Rolle's theorem, G′(x) has n + 1 distinct roots in (a, b) and G″(x) has n distinct roots in (a, b). By mathematical induction, G^(k) has n − k + 2 distinct roots in (a, b). Hence, G^(n+1) has at least one root in (a, b). Let ξ be a root of G^(n+1) in (a, b), that is, G^(n+1)(ξ) = 0.
From the definitions of E_n(x) and ψ(x), E_n^(n+1)(x) = f^(n+1)(x) and ψ^(n+1)(x) = (n + 1)!. Then,

G^(n+1)(x) = f^(n+1)(x) − ((n + 1)!/ψ(t)) E_n(t).

At x = ξ,

G^(n+1)(ξ) = f^(n+1)(ξ) − ((n + 1)!/ψ(t)) E_n(t) = 0,

from which

E_n(t) = (ψ(t)/(n + 1)!) f^(n+1)(ξ)
The Lagrange basis polynomials

L_i(x) = ∏_{j=0, j≠i}^{n} (x − x_j)/(x_i − x_j),

suffer from the problem that it is not possible to obtain L_{i+1}(x) from L_i(x) (that is, there is no iterative method to compute the polynomials L_i(x), i = 0, . . . , n). This makes the complexity of the algorithm high. Hence, Newton's interpolating polynomials can be seen as an alternative to the Lagrange polynomials.
P0 (x) = c0 = f (x0 ).
1 function c = ComputeNewtonCoefs(x, f)
2 n = length(x) ;
3 c = zeros(size(x)) ;
4 c(1) = f(1) ;
5 for j = 2 : n
6 z = x(j) - x(1:j-1) ;
7 pm1 = c(1) ;
8     for k = 2 : j-1
9         pm1 = pm1 + c(k)*prod(z(1:k-1)) ;
10 end
11 c(j) = (f(j)-pm1)/prod(z(1:j-1)) ;
12 end
13 end
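The opening lines of the companion function NewtonInterp, which evaluates the Newton form at a point t, fall outside this excerpt; a sketch consistent with the fragment below and with the Python version given later would be:

1 function yy = NewtonInterp(x, y, t)
2 c = ComputeNewtonCoefs(x, y) ;
3 n = length(x) ;
4 z = zeros(1, n) ; w = zeros(1, n) ; z(1) = 1 ;
5 for k = 1 : n-1
6     z(k+1) = t - x(k) ;
7 end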
8 for k = 1 : n
9 w(k) = prod(z(1:k)) ;
10 end
11 yy = sum(c.*w) ;
12 end
To test whether the functions give true results, they have been tested to
interpolate data sampled from the function
f(x) = cos(4x)/(1 + x),   0 ≤ x ≤ 2.
A total of 10 equidistant data points (xj , f (xj )), j = 0, . . . , 9 are used to gen-
erate a Newton’s polynomial and then it is used to approximate the val-
ues of the function at 101 equidistant data points. The MATLAB script NewtonInterpCos.m is used to do the task and plot the figure. Its code is:
1 clear ; clc ;
2 x = linspace(0, 2, 10) ;
3 y = cos(4*x)./(1+x) ;
4 xx = linspace(0, 2, 101) ;
5 yy = zeros(size(xx)) ;
6 for j = 1 : length(yy)
7 yy(j) = NewtonInterp(x, y, xx(j)) ;
8 end
9 plot(x, y, 'bo', xx, yy, '-.m', 'LineWidth', 3) ;
10 xlabel('x') ;
11 ylabel('y') ;
12 legend('Data points (x, y)', 'Newton Polynomial') ;
13 axis([-0.1, 2.1, -0.8, 1.1]) ;
14 set(gca, 'fontweight', 'bold') ;
15 grid on ;
16 set(gca, 'XTick', 0:0.2:2) ;
17 set(gca, 'YTick', -0.8:0.2:1) ;
f(x) = sin(4x)/(1 + x),   0 ≤ x ≤ 2,

to approximate the values of the function at 101 equidistant data points. The full code of NewtonInterpSin.py is:
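The first lines of the Python implementation fall outside this excerpt; a sketch consistent with the fragment below (mirroring the MATLAB ComputeNewtonCoefs above) would be:

1 from numpy import linspace, zeros, prod, sin, sum
2 def ComputeNewtonCoefs(x, f):
3     n = len(x)
4     c = zeros(len(x), 'float')
5     # z holds 1 and the factors (x[j] - x[k]) used to build the Newton basis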
6 z = zeros(len(x), 'float')
7 c[0] = f[0]
8 for j in range(1, n):
9 z[0] = 1.0 ;
10 for k in range(j):
11 z[k+1] = x[j]-x[k]
12 pm1 = 0.0
13 w = zeros(j, 'float')
14 for k in range(j):
15 w[k] = prod(z[:k+1])
16 pm1 += (c[k]*w[k])
17 c[j] = (f[j]-pm1)/prod(z[:j+1])
18 return c
19
20 def NewtonInterp(x, y, xi):
21 c = ComputeNewtonCoefs(x, y)
22 n = len(x)
23 z = zeros(n, 'float') ;
24 w = zeros(n, 'float') ;
25 z[0] = 1 ;
26 for k in range(n-1):
27 z[k+1] = xi - x[k]
28 for k in range(n):
29 w[k] = prod(z[:k+1])
30 yy = sum(c*w)
31 return yy
32
33 x = linspace(0, 2, 10)
34 y = sin(4*x)/(1+x)
35 xx = linspace(0, 2, 101) ;
36 yy = zeros(len(xx), 'float') ;
37 for j in range(len(yy)):
38 yy[j] = NewtonInterp(x, y, xx[j]) ;
39
Newton’s Interpolation 113
By executing the code, the data points and Newton’s interpolating poly-
nomials are shown in Figure 5.2.
f[x_k] = f(x_k) = y_k,   k = 0, . . . , n
f[x_k, x_{k+1}] = (f[x_{k+1}] − f[x_k]) / (x_{k+1} − x_k),   k = 0, . . . , n − 1
f[x_k, x_{k+1}, x_{k+2}] = (f[x_{k+1}, x_{k+2}] − f[x_k, x_{k+1}]) / (x_{k+2} − x_k),   k = 0, . . . , n − 2

and generally,

f[x_k, . . . , x_{k+j}] = (f[x_{k+1}, . . . , x_{k+j}] − f[x_k, . . . , x_{k+j−1}]) / (x_{k+j} − x_k),   k = 0, . . . , n − j.
c_0 = f(x_0) = f[x_0]
c_1 = (f[x_1] − f[x_0]) / (x_1 − x_0) = f[x_0, x_1]
c_2 = (f[x_1, x_2] − f[x_0, x_1]) / (x_2 − x_0) = f[x_0, x_1, x_2]

and generally,

c_j = f[x_0, x_1, . . . , x_j],   j = 0, . . . , n.
0 0 0 0 0 0
0.2500 3.7577 15.0306 0 0 0
0.5000 0.0000 -15.0306 -60.1224 0 0
0.7500 -2.2791 -9.1165 11.8282 95.9341 0
1.0000 -0.0000 9.1165 36.4661 32.8506 -63.0836
Coef =
0 15.0306 -60.1224 95.9341 -63.0836
The Newton’s interpolating polynomial obtained by using these coefficients
is shown in Figure 5.3.
FIGURE 5.3: Approximation of the function 4e^{−x²} sin(2πx) at 101 points, using a Newton's interpolating polynomial constructed from 5 data points.
FIGURE 5.4: The original data (x, y) and the data obtained by interpolation
(s, P (s)).
polynomial consisting of the union of the straight lines joining the points (x_j, y_j), j = 0, . . . , 7. At any given point s_k, k = 0, . . . , 99, the polynomial is evaluated by determining i such that x_i ≤ s_k ≤ x_{i+1} and using the equation of the line joining (x_i, y_i) and (x_{i+1}, y_{i+1}). It finally returns the result in P.
In MATLAB the data and the interpolating polynomial are plotted by
using:
>> plot(x, y, ’ro’, s, P, ’:b’, ’LineWidth’, 3) ;
>> xlabel(’x’) ;
>> ylabel(’y’) ;
>> legend(’Data’, ’(s, P(s))’)
The data points and interpolating polynomial are plotted in Figure 5.4.
FIGURE 5.5: The original data (x, y) and the data obtained by interpolation
(xx, P (xx)).
By typing:
>> Q = spline(x, y)
The result is:
Q =
form: ’pp’
breaks: [0 0.4286 0.8571 1.2857 1.7143 2.1429 2.5714 3]
coefs: [7x4 double]
pieces: 7
order: 4
dim: 1
The structure Q contains information about the interpolating polynomial. For 8 data points, 7 pieces of cubic splines are required to join each two consecutive data points (x_i, y_i) and (x_{i+1}, y_{i+1}), i = 0, . . . , 6. The field “breaks” contains the x-coordinates of the data points. The field “coefs” contains the coefficients of each cubic spline piece between the two data points (x_i, y_i) and (x_{i+1}, y_{i+1}), i = 0, . . . , 6. The error in approximating the original function by the interpolating spline polynomial is O(h⁴), where h = max(x_{i+1} − x_i), i = 0, . . . , 6. Therefore, the order of convergence is 4.
Then, Q can be evaluated at the points of the vector xx by using the ppval
function. This is done as follows:
>> S = ppval(Q, xx) ;
MATLAB’s Interpolation Tools 119
Now, S and P are identical. This can be seen through using the command:
>> Error = norm(P-S, inf)
Error =
0
x
FIGURE 5.6: The original data (x, y) and the data obtained by interpolation
(xx, P (xx)) using the pchip function.
f(x) = cos(4x)/(1 + x)

will be used to interpolate the function with linear, quadratic and cubic piecewise polynomials, based on the interp1d function. The Python script interpwp1d.py implements the interpolation with interp1d:
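The script itself is not reproduced in this excerpt; a minimal sketch of interpolation with scipy's interp1d (the sampling of f(x) = cos(4x)/(1 + x) at 11 equally spaced points on [0, 3] is an assumption made for illustration) would be:

import numpy as np
from scipy.interpolate import interp1d
import matplotlib.pyplot as plt

x = np.linspace(0, 3, 11)
y = np.cos(4*x)/(1 + x)
xx = np.linspace(0, 3, 101)

for kind in ['linear', 'quadratic', 'cubic']:
    P = interp1d(x, y, kind=kind)       # interp1d returns a callable interpolant
    plt.plot(xx, P(xx), label=kind)
plt.plot(x, y, 'bo', label='Data points (x, y)')
plt.legend()
plt.show()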
[Figure: the data points (x, y) interpolated with pchip_interpolate and CubicSpline; x-axis: x, y-axis: y.]
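The opening lines of the script that produced this comparison are not included in this excerpt; a sketch consistent with the plotting commands below (the sampled function and the number of data points are assumptions) would be:

1 from numpy import linspace, cos, arange
2 from scipy.interpolate import pchip_interpolate, CubicSpline
3 import matplotlib.pyplot as plt
4 x = linspace(0, 3, 11)
5 y = cos(4*x)/(1 + x)
6 xx = linspace(0, 3, 101)
7 yy1 = pchip_interpolate(x, y, xx)   # shape-preserving piecewise cubic Hermite values
8 CS = CubicSpline(x, y)
9 yy2 = CS(xx)                        # cubic spline values (default not-a-knot end conditions)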
10
11 plt.plot(x, y, 'bo', label='Data points (x, y)', lw = 4)
12 plt.plot(xx, yy1, ls='-', color='orangered', ...
label='pchip interpolate', lw= 2)
13 plt.plot(xx, yy2, ls='--', color='purple', ...
label='CubicSpline', lw= 2)
14 plt.xlabel('x', fontweight='bold')
15 plt.ylabel('y', fontweight='bold')
16 plt.legend()
17 plt.grid(True, ls=':')
18 plt.xticks(arange(0, 3.3, 0.3), fontweight='bold')
19 plt.yticks(arange(-0.6, 1.2, 0.2), fontweight='bold')
5 y = (sin(5*x)+cos(5*x))/(2*(1+x**2))
6 xx = linspace(0, 3, 101) ;
7 LG = lagrange(x, y)
8 print(LG)
9 yy1 = LG(xx)
10
11 plt.plot(x, y, 'bo', label='Data points (x, y)', markersize = 8)
12 plt.plot(xx, yy1, ls='-', color='purple', label='Lagrange', ...
lw= 2)
13
14 plt.xlabel('x', fontweight='bold')
15 plt.ylabel('y', fontweight='bold')
16 plt.legend()
17 plt.grid(True, ls=':')
18 plt.xticks(arange(0, 3.3, 0.3), fontweight='bold')
19 plt.yticks(arange(-0.6, 1.2, 0.2), fontweight='bold')
Figure 5.9 shows the graph of the function f (x) approximated at 101 points
using Lagrange interpolation.
6
Numerical Differentiation and Integration
Abstract
This chapter discusses numerical methods for approximating derivatives and integrals of functions. The chapter is divided into two sections: the first section discusses the numerical differentiation of functions based on finite difference formulas, and the second discusses numerical integration based on Newton-Cotes and Gauss methods. The numerical differentiation and integration algorithms are implemented using both MATLAB® and Python.
f(x_0 + h) = f(x_0) + h f′(x_0) + (h²/2!) f″(x_0) + · · · + (h^k/k!) f^(k)(x_0) + (h^{k+1}/(k+1)!) f^(k+1)(ξ),   ξ ∈ [x_0, x_0 + h]        (6.1)

and

f(x_0 − h) = f(x_0) − h f′(x_0) + (h²/2!) f″(x_0) − · · · + ((−h)^k/k!) f^(k)(x_0) + ((−h)^{k+1}/(k+1)!) f^(k+1)(η),   η ∈ [x_0 − h, x_0]        (6.2)
By setting k = 1 in Equation (6.1) and solving for f′(x_0), the first derivative of f(x) at x = x_0 is given by:

f′(x_0) = (f(x_0 + h) − f(x_0))/h − (h/2) f″(ξ),   ξ ∈ [x_0, x_0 + h]        (6.3)

Hence, by taking h as small as possible with h ≥ ε > 0, where ε denotes the machine precision, f′(x_0) can be approximated by:

f′(x_0) ≈ (f(x_0 + h) − f(x_0))/h        (6.4)

If h is taken to be less than ε, round-off error can affect the accuracy of the approximate derivative at x_0. The approximate formula (6.4) is called the forward difference formula.
By setting k = 1 in Equation (6.2) and solving for f′(x_0), the first derivative of f(x) at x = x_0 is given by:

f′(x_0) = (f(x_0) − f(x_0 − h))/h + (h/2) f″(η),   η ∈ [x_0 − h, x_0]        (6.5)

from which f′(x_0) can be approximated by:

f′(x_0) ≈ (f(x_0) − f(x_0 − h))/h        (6.6)

The approximate formula (6.6) is called the backward difference formula [31].
The last terms of equations (6.3) and (6.5),

R_2^f(ξ) = −(h/2) f″(ξ),   ξ ∈ [x_0, x_0 + h]

and

R_2^b(η) = (h/2) f″(η),   η ∈ [x_0 − h, x_0],

give the remainders on the intervals [x_0, x_0 + h] and [x_0 − h, x_0], respectively. The error made by ignoring the remainder while approximating the derivative is known as the truncation error. If f″(x) is bounded by a constant M_1 ∈ R⁺ in [x_0, x_0 + h] and by M_2 ∈ R⁺ in [x_0 − h, x_0], then

|R_2^f| ≤ (M_1/2) h   and   |R_2^b| ≤ (M_2/2) h.

This indicates that the truncation error in both the forward and backward difference formulas is O(h).
Example 6.1 In this example, Equation (6.4) will be used to find an approximation of the derivative of f(x) = e^{−sin(x³)/4} for h = 10⁻¹, . . . , 10⁻¹⁵. The exact derivative of f(x) is

f′(x) = −(3x² cos(x³)/4) e^{−sin(x³)/4}

The error for each value of h will also be shown.
The Python code is:
1 #fpapproxim.py
2 import numpy as np
3 f = lambda x: np.exp(-np.sin(x**3)/4)
4 fp = lambda x: -3*x**2*np.cos(x**3)/4*f(x)
5 h = 0.1
6 fpapprox = []
7 Eps = np.spacing(1.0)
8 while h ≥ Eps:
9 fp1 = (f(1.+h)-f(1.))/h
10 fpapprox.append([h, fp1, np.abs(fp(1.)-fp1)])
11 h /= 10
12 print('---------------------------------------------------------')
13 print(' h', '\t\t\t', 'Approx Der', '\t\t ', 'Approx Error')
14 print('---------------------------------------------------------')
15 for x in fpapprox:
16 print('{0:1.3e}'.format(x[0]), '\t', ...
'{0:1.15e}'.format(x[1]), '\t\t', ...
'{0:1.15e}'.format(x[2]))
1 %fpapproxim.m
2 f = @(x) exp(-sin(xˆ3)/4) ;
3 fp = @(x) -3*xˆ2*cos(xˆ3)/4*f(x) ;
4 fpapp = zeros(15, 3) ;
5 h = 1e-1 ;
6 j = 1 ;
7 while h ≥ eps
8 fp1 = (f(1.0+h)-f(1.0))/h ;
9 fpapp(j,:) = [h, fp1, abs(fp(1)-fp1)] ;
10 h = h / 10 ;
11 j = j + 1 ;
12 end
13 fprintf('----------------------------------------------------\n') ;
14 fprintf(' h \t\t\t\t Approx Der \t\t\t Approx Error\n') ;
15 fprintf('----------------------------------------------------\n') ;
16 for j = 1 : 15
17 fprintf('%5.3e\t\t%16.15e\t\t%16.15e\n', fpapp(j,1), ...
fpapp(j,2), fpapp(j,3)) ;
18 end
[Figure: error of the forward difference approximation of f′(1) versus h, on a log-log scale.]
The backward difference formula (6.6) can be applied by replacing the forward difference line in the codes of Example (6.1) with:

fp1 = (f(1.0)-f(1.0-h))/h

Similar results can be obtained by running a code fpbdfapprox.py that is based on Equation (6.6):
runfile(’D:/PyFiles/fpbdfapprox.py’, wdir=’D:/PyFiles’)
--------------------------------------------------------------
h Approx Der Approx Error
--------------------------------------------------------------
1.000e-01 -3.631033359769975e-01 3.475370461356703e-02
1.000e-02 -3.332305430411631e-01 4.880911677732580e-03
1.000e-03 -3.288531422600549e-01 5.035108966244262e-04
1.000e-04 -3.284001380376989e-01 5.050667426836908e-05
1.000e-05 -3.283546835874951e-01 5.052224064605593e-06
1.000e-06 -3.283501365247687e-01 5.051613382045517e-07
1.000e-07 -3.283496807782171e-01 4.941478659592491e-08
1.000e-08 -3.283496363692961e-01 5.005865610918647e-09
1.000e-09 -3.283495697559146e-01 6.160751592210190e-08
1.000e-10 -3.283495697559146e-01 6.160751592210190e-08
1.000e-11 -3.283484595328899e-01 1.171830540547258e-06
1.000e-12 -3.282929483816587e-01 5.668298177174957e-05
1.000e-13 -3.275157922644211e-01 8.338390990093592e-04
1.000e-14 -3.219646771412953e-01 6.384954222135142e-03
1.000e-15 -2.220446049250313e-01 1.063050264383992e-01
By setting k = 2 in equations (6.1) and (6.2), subtracting Equation (6.2) from (6.1) and solving for f′(x_0) gives the central difference formula:

f′(x_0) ≈ (f(x_0 + h) − f(x_0 − h))/(2h),

whose truncation error is O(h²).
FIGURE 6.2: Comparing the forward differencing errors to the central differ-
ence errors in the loglog scale.
1 import numpy as np
2 import matplotlib
3 import matplotlib.pyplot as plt
4 matplotlib.rc('text', usetex=True)
5 matplotlib.rcParams['text.latex.preamble'] ...
=[r"\usepackage{amsmath}"]
6 a, b = 0.0, 2.0
7 N = 1000
8 x = np.linspace(a, b, N+1)
9 f = lambda x: np.exp(-np.sin(x**3)/4)
10 df = np.diff(f(x))
FIGURE 6.3: Graph of the exact derivative of f(x) = e^{−sin(x³)/4} plotted on the interval [0, π/2] against the graph of the approximate derivative obtained by a forward difference formula.
11 dx = (b-a)/N
12 dfdx = df/dx
13 plt.plot(x[:N], dfdx, '-m', ...
lw=2,label=r'$\mathbf{-\frac{3xˆ2\cos(xˆ3)}{4} ...
eˆ{-\frac{\sin(xˆ3)}{4}}}$')
14 plt.xlabel('x', fontweight='bold')
15 plt.ylabel('y', fontweight='bold')
16 plt.grid(True, ls=':')
17 plt.legend(loc='upper left')
Figure 6.4 shows the graph of f′(x), x ∈ [0, π/2], using the function diff.
FIGURE 6.5: Graph of f″(x) = 2(1 − 2x²) e^{−x²} on the interval [−3, 3].
If F(x) is the anti-derivative of f(x) in [a, b], then from the fundamental theorem of calculus

I = ∫_a^b f(x) dx = F(b) − F(a)

However, there are many functions defined over finite intervals whose anti-derivatives are unknown, although they do exist. Such examples include:

I_1 = ∫_0^π (sin(x)/x) dx   and   I_2 = ∫_0^3 e^{−x²} dx
∫_a^b f(x) dx ≈ ∑_j w_j f(x_j)        (6.12)

where the different numerical integration methods differ from each other in the way in which the points x_j are selected and the coefficients w_j are calculated. Equation (6.12) is called a numerical quadrature. Hence, the term numerical quadrature refers to a formula in which an integral is approximated by a finite sum.
In this chapter, we discuss two classes of numerical integration methods, namely the Newton-Cotes and Gauss quadratures.
P_1(x) = ((x − b)/(a − b)) f(a) + ((x − a)/(b − a)) f(b) = (1/h)(−(x − b) f(a) + (x − a) f(b)),

hence,

∫_a^b f(x) dx ≈ (1/h) ∫_a^b [−(x − b) f(a) + (x − a) f(b)] dx = (h/2)(f(a) + f(b))        (6.14)
FIGURE 6.6: Approximating the area under curve by the trapezium area.
The formula in equation (6.14) gives the trapezoidal rule, where the
area under the f (x) curve from x = a to x = b is approximated by area
of a trapezium whose bases are f (a) and f (b) and height is b − a.
In Figure 6.6 the area under the curve of f (x) is approximated by the
area of the trapezium (shaded area).
To find an estimate of the integration error, it is convenient to start from the formula:

f(x) = P_1(x) + ((x − a)(x − b)/2) f″(ξ),   ξ ∈ [a, b]

from which the approximation error is given by:

| ∫_a^b (f(x) − P_1(x)) dx | = | ∫_a^b ((x − a)(x − b)/2) f″(ξ) dx | ≤ ((b − a)³/12) |f″(ξ)| = (h³/12) |f″(ξ)|

If f″(x) is bounded by M ∈ R⁺ in [a, b], then

| ∫_a^b (f(x) − P_1(x)) dx | ≤ ∫_a^b |f(x) − P_1(x)| dx ≤ (M/12) h³        (6.15)

The error formula (6.15) shows that a large value of h = b − a causes a large error.
In Python:
In [1]: a, b, f = 0.0, 1.0, lambda x: 1/(1+x**2)
In [2]: I = (b-a)/2*(f(b)+f(a))
In [3]: print(I)
0.75
where c = (a + b)/2.
Let x = a + th, 0 ≤ t ≤ 2, be a parametric representation of x in terms of t. Then, P_2(x) can be rewritten as:

P_2(x) = ((t − 1)(t − 2)/2) f(a) − t(t − 2) f(c) + (t(t − 1)/2) f(b)

Then,

∫_a^b f(x) dx ≈ h ∫_0^2 [ ((t − 1)(t − 2)/2) f(a) − t(t − 2) f(c) + (t(t − 1)/2) f(b) ] dt
             = h [ (1/3) f(a) + (4/3) f((a + b)/2) + (1/3) f(b) ]
             = ((b − a)/6) [ f(a) + 4 f((a + b)/2) + f(b) ]        (6.16)
Figure 6.7 shows the areas under trapeziums, which are in total very
close to the area under the function’s curve.
The following MATLAB code uses the composite trapezoidal rule with N = 10 × 2^k, k = 0, . . . , 14 to approximate ∫_0^1 1/(1 + x²) dx:
1 %comptrapz.m
2 clear ; clc ;
3 format short e
4 a = 0.0; b = 1.0 ; f = @(x) 1./(1+x.ˆ2) ;
5 N = 10 ; Approx = [] ;
6 fprintf('------------------------------------------------\n')
7 fprintf(' N\t\t Approx Int.\t Error\n') ;
8 fprintf('------------------------------------------------\n')
9 while N ≤ 200000
10 h = (b-a)/N ;
11 x = linspace(a, b, N+1) ;
12 I = h/2*(f(x(1))+f(x(N+1))) ;
13 I = I + h*sum(f(x(2:N))) ;
14 Err = abs(pi/4-I) ;
15 Approx = [Approx; [N I Err]] ;
16 fprintf('%6i\t%1.12f\t%1.12e\n', N, I, Err) ;
17 N = 2*N ;
18 end
19 fprintf('------------------------------------------------\n')
1 # comptrapz.py
2 import numpy as np
3 a, b, f = 0.0, 1.0, lambda x: 1/(1+x**2)
4 ApproxTable = []
5 N = 10
6 print('-----------------------------------------------------')
7 print(' N\t Approx Int.\t Error')
8 print('-----------------------------------------------------')
9 while N ≤ 200000:
10 x, h = np.linspace(a, b, N+1), (b-a)/N
11 I = h/2*(f(x[0])+f(x[-1]))
12 I = I + h*sum(f(x[1:N]))
13 Err = np.abs(np.pi/4-I)
14 print('{0:6.0f}'.format(N), '\t', ...
'{0:1.12f}'.format(I), '\t', '{0:1.12e}'.format(Err))
15 N = 2*N
16 print('-----------------------------------------------------')
∫_a^b f(x) dx = ∫_{x_0}^{x_2} f(x) dx + · · · + ∫_{x_{2k−2}}^{x_{2k}} f(x) dx
             ≈ (h/3)(f(x_0) + 4f(x_1) + f(x_2)) + · · · + (h/3)(f(x_{2k−2}) + 4f(x_{2k−1}) + f(x_{2k}))
             = (h/3)(f(x_0) + f(x_N)) + (4h/3) ∑_{j=1}^{k} f(x_{2j−1}) + (2h/3) ∑_{j=1}^{k−1} f(x_{2j})        (6.19)
The following MATLAB code uses the composite Simpson's rule to compute ∫_0^1 1/(1 + x²) dx with N = 10 × 2^k, k = 0, . . . , 9:
1 %compsimp.m
2 clear ; clc ;
3 format short e
4 a = 0.0; b = 1.0 ; f = @(x) 1./(1+x.ˆ2) ; IExact = pi/4 ;
5 N = 10 ; Approx = [] ;
6 fprintf('-------------------------------------------------\n')
7 fprintf(' N\t\t Approx Int.\t\t Error\n') ;
8 fprintf('-------------------------------------------------\n')
9 while N ≤ 10000
10 [I, Err] = CompSimpson(f, a, b, N, IExact) ;
11 Approx = [Approx; [N I Err]] ;
12 fprintf('%6i\t\t%1.12f\t\t%1.15e\n', N, I, Err) ;
13 N = 2*N ;
14 end
15 fprintf('-------------------------------------------------\n')
16
17 function [I, Error] = CompSimpson(f, a, b, N, IExact)
18 h = (b-a)/N ;
19 x = linspace(a, b, N+1) ;
20 I = h/3*(f(x(1))+f(x(N+1))) ;
21 for j = 2 : N
22 if mod(j, 2) == 0
23 I = I + 4*h/3*f(x(j)) ;
24 else
25 I = I + 2*h/3*f(x(j)) ;
26 end
27 end
28 Error = abs(IExact-I) ;
29 end
1 #CompositeSimpson.py
2 import numpy as np
3 def CompSimpson(f, a, b, N, IExact):
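The remainder of CompositeSimpson.py falls on a page not reproduced here; a minimal sketch of the missing function body, mirroring the MATLAB CompSimpson above, would be:

import numpy as np

def CompSimpson(f, a, b, N, IExact):
    # Composite Simpson's rule on N subintervals of [a, b] (N even).
    x, h = np.linspace(a, b, N+1), (b-a)/N
    I = h/3*(f(x[0]) + f(x[-1]))
    for j in range(1, N):
        if j % 2 == 1:          # odd-indexed interior points get weight 4h/3
            I += 4*h/3*f(x[j])
        else:                   # even-indexed interior points get weight 2h/3
            I += 2*h/3*f(x[j])
    Error = np.abs(IExact - I)
    return I, Error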
>> I = trapz(x, f)
I =
0.74681800146797
Python also contains a trapz function as part of the scipy.integrate library. It differs from MATLAB's trapz function by receiving the vector of f before the vector x:
In [10]: from scipy.integrate import trapz
In [11]: import numpy as np
In [12]: x = np.linspace(0, np.pi, 500)
In [13]: g = np.zeros_like(x)
In [14]: g[0], g[1:] = 1.0, np.sin(x[1:])/x[1:]
In [15]: I = trapz(g, x)
In [16]: print(I)
1.8519360005832526
Also, the MATLAB function quad applies the Simpson's rule to evaluate ∫_a^b f(x) dx. It receives a function handle to f, a and b, and then it returns the value of the integral:
>> x = linspace(0, pi, 501) ;
>> h = @(x) sin(x)./x ;
>> I = quad(h, 0, pi)
I =
1.8519e+00
The Python quad function, located in the scipy.integrate library, receives the same arguments as MATLAB's quad function, but it returns two values: the integration value and the approximation error.
In [17]: from scipy.integrate import quad
In [18]: h = lambda x: np.exp(-x**2)
In [19]: I = quad(h, 0.0, 1.0)
In [20]: print(’Integral = ’,I[0], ’, Error = ’, I[1])
Integral = 0.7468241328124271 , Error = 8.291413475940725e-15
where c_0, c_1, . . . , c_{2n−1} are constants.
The purpose is to find an optimal set of points a ≤ x_1 < . . . < x_n ≤ b (through finding −1 ≤ s_1 < s_2 < . . . < s_n ≤ 1) and weights w_1, w_2, . . . , w_n such that

∫_a^b f(x) dx = ((b − a)/2) ∫_{−1}^{1} f( ((b − a)/2) s + (b + a)/2 ) ds ≈ ((b − a)/2) ∑_{j=1}^{n} w_j f( ((b − a)/2) s_j + (b + a)/2 )        (6.21)

is exact for f(x) = c_0 + c_1 x + · · · + c_{2n−1} x^{2n−1}.
Integrating Equation (6.20) from a to b gives:

∫_a^b f(x) dx = ∫_a^b (c_0 + c_1 x + · · · + c_{2n−1} x^{2n−1}) dx
             = c_0 (b − a) + c_1 (b² − a²)/2 + · · · + c_{2n−1} (b^{2n} − a^{2n})/(2n)        (6.22)
From Equation (6.20), equating the quadrature sum and the exact integral for each of the monomials 1, x, x², . . . , x^{2n−1} gives:

w_1 + w_2 + · · · + w_n = b − a
w_1 x_1 + w_2 x_2 + · · · + w_n x_n = (b² − a²)/2
w_1 x_1² + w_2 x_2² + · · · + w_n x_n² = (b³ − a³)/3        (6.24)
⋮
w_1 x_1^{2n−1} + w_2 x_2^{2n−1} + · · · + w_n x_n^{2n−1} = (b^{2n} − a^{2n})/(2n)
The points x1 , . . . , xn and weights w1 , . . . , wn are found by solving the sys-
tem of nonlinear equations (6.24).
Following is the derivation of the Gauss quadratures for n = 1 and n = 2:
1. One point Gauss quadrature: In the case that n = 1, there is one point
x1 and one weight w1 to be found. Since 2n − 1 = 1, it is assumed that
f (x) = c0 + c1 x. The equations in x1 and w1 are:
w_1 = b − a
w_1 x_1 = (b² − a²)/2

Solving this system of equations gives:

w_1 = b − a   and   x_1 = (b + a)/2.

The Gauss quadrature is:

∫_a^b f(x) dx ≈ (b − a) f((b + a)/2)        (6.25)
2. Two-points Gauss quadrature: In the case that n = 2, there are two points x_1, x_2 and two weights w_1, w_2 to be found, and the system of equations (6.24) becomes:

w_1 + w_2 = b − a
w_1 x_1 + w_2 x_2 = (b² − a²)/2
w_1 x_1² + w_2 x_2² = (b³ − a³)/3
w_1 x_1³ + w_2 x_2³ = (b⁴ − a⁴)/4
The following Python code can be used to find the solution of the nonlinear system:
In [21]: import sympy as smp
In [22]: from sympy.solvers import solve
In [23]: w1, w2, x1, x2, a, b = smp.symbols(’w1, w2, x1, x2,
a, b’, cls=smp.Symbol)
In [24]: Syst = [w1+w2-(b-a), w1*x1+w2*x2-(b**2-a**2)/2, \
...: w1*x1**2+w2*x2**2-(b**3-a**3)/3,
w1*x1**3+w2*x2**3-(b**4-a**4)/4]
In [25]: vrs = [w1, w2, x1, x2]
In [26]: Sol = solve(Syst, vrs)
In [27]: print(Sol[0])
(-(a - b)/2, -(a - b)/2, -sqrt(3)*a/6 + a/2
+ sqrt(3)*b/6 + b/2, a/2 + b/2 + sqrt(3)*(a - b)/6)
In [28]: print(Sol[1])
(-(a - b)/2, -(a - b)/2, sqrt(3)*a/6 + a/2
- sqrt(3)*b/6 + b/2, a/2 + b/2 - sqrt(3)*(a - b)/6)
The Python symbolic library gives two solutions of the nonlinear system, in which the values of x_1 and x_2 are exchanged. If we impose the condition x_1 < x_2, the solution of the nonlinear system is:

w_1 = w_2 = (b − a)/2,   x_1 = (b + a)/2 − ((b − a)/6)√3   and   x_2 = (b + a)/2 + ((b − a)/6)√3
The two-points Gauss quadrature is:

∫_a^b f(x) dx ≈ ((b − a)/2) [ f( (b + a)/2 − ((b − a)/6)√3 ) + f( (b + a)/2 + ((b − a)/6)√3 ) ]
Example 6.3 In this example, the one- and two-points Gauss quadratures will be used to evaluate the integral:

∫_0^2 x/(1 + 2x²) dx
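Before turning to the composite versions, the two rules can be applied once over the whole interval [0, 2] as a quick check (a short Python sketch added here; the exact value of the integral is ln(9)/4 ≈ 0.5493):

import numpy as np

f = lambda x: x/(1 + 2*x**2)
a, b = 0.0, 2.0

# one-point Gauss quadrature, Equation (6.25)
I1 = (b - a)*f((a + b)/2)

# two-points Gauss quadrature
d = (b - a)*np.sqrt(3)/6
I2 = (b - a)/2*(f((a + b)/2 - d) + f((a + b)/2 + d))

print(I1, I2, np.log(9.0)/4)   # both are rough on a single interval; I2 is closer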
∫_{x_j}^{x_{j+1}} f(x) dx ≈ h f((x_j + x_{j+1})/2)

Then,

∫_a^b f(x) dx ≈ ∑_{j=0}^{N−1} ∫_{x_j}^{x_{j+1}} f(x) dx = h ∑_{j=0}^{N−1} f((x_j + x_{j+1})/2)
The 2-points Gauss quadrature approximates ∫_{x_j}^{x_{j+1}} f(x) dx as

∫_{x_j}^{x_{j+1}} f(x) dx ≈ (h/2) [ f( (x_j + x_{j+1})/2 − ((x_{j+1} − x_j)/6)√3 ) + f( (x_j + x_{j+1})/2 + ((x_{j+1} − x_j)/6)√3 ) ]

Then,

∫_a^b f(x) dx ≈ ∑_{j=0}^{N−1} ∫_{x_j}^{x_{j+1}} f(x) dx
             = (h/2) ∑_{j=0}^{N−1} [ f( (x_j + x_{j+1})/2 − ((x_{j+1} − x_j)/6)√3 ) + f( (x_j + x_{j+1})/2 + ((x_{j+1} − x_j)/6)√3 ) ]
The following MATLAB code shows the approximations and errors of the 1- and 2-points Gauss quadratures for different values of N (= 10 × 2^k, k = 0, . . . , 10).
1 clear ; clc ;
2 f = @(x) x./(1+2*x.ˆ2) ;
3 a = 0.0 ; b = 2.0 ;
4 N = 10 ;
5 IExact = log(9)/4 ;
6 GL1Approx = [] ; GL2pApprox = [] ;
7
8 while N < 20000
9 x = linspace(a, b, N+1) ;
10 h = (b-a)/N ;
11 I1 = h*sum(f((x(1:N)+x(2:N+1))/2)) ;
12 Err1 = abs(I1-IExact) ;
13 GL1Approx = [GL1Approx; [N I1 Err1]] ;
14     I2 = h/2*sum(f((x(1:N)+x(2:N+1))/2-(x(2:N+1)-x(1:N))*sqrt(3)/6) ...
15          + f((x(1:N)+x(2:N+1))/2+(x(2:N+1)-x(1:N))*sqrt(3)/6)) ;
16 Err2 = abs(I2-IExact) ;
17 GL2pApprox = [GL2pApprox; [N I2 Err2]] ;
18 N = 2*N ;
19 end
20
21 fprintf('Integration with 1-point Gauss quadrature\n')
22 fprintf('--------------------------------------------------\n')
23 fprintf(' N\t\t Approx Int.\t\t Error\n') ;
24 fprintf('--------------------------------------------------\n')
25 [m, L] = size(GL1Approx) ;
26 for j = 1 : m
27 fprintf('%6i\t\t%1.12f\t\t%1.15e\n', GL1Approx(j,1), ...
28 GL1Approx(j,2), GL1Approx(j,3)) ;
29 end
30 fprintf('--------------------------------------------------\n\n\n')
31
32 fprintf('Integration with 2-points Gauss quadrature\n')
33 fprintf('--------------------------------------------------\n')
34 fprintf(' N\t\t Approx Int.\t\t Error\n') ;
35 fprintf('--------------------------------------------------\n')
36 [m, L] = size(GL2pApprox) ;
37 for j = 1 : m
38 fprintf('%6i\t\t%1.12f\t\t%1.15e\n', GL2pApprox(j,1), ...
39 GL2pApprox(j,2), GL2pApprox(j,3)) ;
40 end
41 fprintf('--------------------------------------------------\n\n\n')
1 import numpy as np
2 f = lambda x: x/(1.0+2.0*x**2)
3 a, b = 0.0, 2.0
4 N = 10
5 IExact = np.log(9)/4
6 GL1Approx, GL2Approx = [], []
7 while N < 20000:
8 I1, I2 = 0.0, 0.0
9 x = np.linspace(a, b, N+1)
10 h = (b-a)/N
11 for j in range(N):
12 I1 += h*(f((x[j]+x[j+1])/2.0))
13 I2 += ...
h/2*(f((x[j]+x[j+1])/2-(x[j+1]-x[j])*np.sqrt(3.)/6.)+\
14 f((x[j]+x[j+1])/2+(x[j+1]-x[j])*np.sqrt(3.)/6.))
15 Err1 = abs(I1-IExact) ;
16 GL1Approx.append([N, I1, Err1])
17 Err2 = abs(I2-IExact)
18 GL2Approx.append([N, I2, Err2])
19 N = 2*N
20
21 print('Integration with 1-point Gauss quadrature\n')
22 print('----------------------------------------------------\n')
23 print(' N\t\t Approx Int.\t\t Error\n') ;
24 print('----------------------------------------------------\n')
25 m = len(GL1Approx)
26 for j in range(m):
27 print('{0:6.0f}'.format(GL1Approx[j][0]), '\t', \
28 '{0:1.12f}'.format(GL1Approx[j][1]), '\t', \
29 '{0:1.12e}'.format(GL1Approx[j][2]))
30 print('----------------------------------------------------\n\n\n')
31
32 print('Integration with 2-points Gauss quadrature\n')
150 Numerical Differentiation and Integration
33 print('----------------------------------------------------\n')
34 print(' N\t\t Approx Int.\t\t Error\n') ;
35 print('----------------------------------------------------\n')
36 m = len(GL2Approx) ;
37 for j in range(m):
38 print('{0:6.0f}'.format(GL2Approx[j][0]), '\t', \
39 '{0:1.12f}'.format(GL2Approx[j][1]), '\t', \
40 '{0:1.12e}'.format(GL2Approx[j][2]))
41 print('----------------------------------------------------\n\n\n')
Since the number of subintervals is doubled in the above example, the rates of convergence for the 1- and 2-points Gauss quadratures can be computed using the rule:

R_N = log_2( Error_N / Error_{2N} )

The following Python code can be used to see the rates of convergence of the two methods:
1 RConv1 = []
2 print('Rates of convergence for 1-point Gauss quadrature: \n')
3 print('-----------------------\n')
4 print(' N\t Conv. Rate\n')
5 print('-----------------------\n')
6 for k in range(m-1):
7 RConv1.append(np.log2(GL1Approx[k][2]/GL1Approx[k+1][2]))
8 print('{0:4.0f}'.format(GL1Approx[k][0]), '\t ...
{0:2.3f}'.format(RConv1[-1]))
9 print('-----------------------\n')
10
11 RConv2 = []
12 print('Rates of convergence for 2-points Gauss quadrature: \n')
13 print('-----------------------\n')
14 print(' N\t Conv. Rate\n')
15 print('-----------------------\n')
16 for k in range(m-1):
17 RConv2.append(np.log2(GL2Approx[k][2]/GL2Approx[k+1][2]))
18 print('{0:4.0f}'.format(GL2Approx[k][0]), '\t ...
{0:2.3f}'.format(RConv2[-1]))
19 print('-----------------------\n')
320 2.000
640 2.000
1280 2.000
2560 2.000
5120 2.000
-----------------------
[Figure: positions of the Legendre-Gauss points in [−1, 1] for different numbers of points n.]
The matrix at the left-hand side of Equation (6.27) is the transpose matrix
of the Vandermonde matrix of type n×n, constructed from the vector of points
x1 , . . . , x n .
The following Python code (FindLGParams.py) includes a function LGpw
that receives a parameter n and returns the corresponding Legendre-Gauss
points and weights.
1 #FindLGParams.py
2
3 from scipy.special import legendre
4 import numpy as np
5
6 def LGpw(n):
7 s = list(np.sort(np.roots(legendre(n))))
8 X = (np.fliplr(np.vander(s))).T
9 Y = np.array([(1-(-1)**j)/j for j in range(1, n+1)])
10 w = np.linalg.solve(X, Y)
11 return s, w
12
13 s, w = LGpw(3)
14 print('n = 3:\n')
15 print('Points are: ', s, '\n Weights are:', w)
16
17 print('n = 6:\n')
18 s, w = LGpw(6)
19 print('Points are: ', s, '\n Weights are:', w)
n = 6:
Points are: [-0.9324695142031514, -0.6612093864662644,
-0.23861918608319704, 0.23861918608319688, 0.6612093864662634,
0.9324695142031519]
Weights are: [0.17132449 0.36076157 0.46791393 0.46791393
0.36076157 0.17132449]
Example 6.4 This example uses the function LGpw to find the points and
weights of a 5-points Legendre-Gauss quadrature, then applies it to find the
integration in Example 6.3:
∫_0^2 x/(1 + 2x²) dx
20 I = 0.0
21 for j in range(N):
22 xm = (x[j]+x[j+1])/2.0
23 I += h/2.0*sum([w[k]*f(xm+h/2*s[k]) for k in ...
range(len(s))])
24 Err = abs(I-IExact)
25 GL5Approx.append([N, I, Err])
26 N = 2*N
27
28 print('Integration with 5-point Gauss quadrature\n')
29 print('----------------------------------------------------\n')
30 print(' N\t\t Approx Int.\t\t Error\n') ;
31 print('----------------------------------------------------\n')
32 m = len(GL5Approx)
33 for j in range(m):
34 print('{0:6.0f}'.format(GL5Approx[j][0]), '\t', \
35 '{0:1.12f}'.format(GL5Approx[j][1]), '\t', \
36 '{0:1.12e}'.format(GL5Approx[j][2]))
37 print('----------------------------------------------------\n\n\n')
-----------------------------------------------------
N Approx Int. Error
-----------------------------------------------------
5 0.549306144553503 3.995001813766e-10
10 0.549306144334415 6.564653283378e-13
20 0.549306144334055 6.063411283909e-16
40 0.549306144334055 2.021137094636e-16
80 0.549306144334055 2.021137094636e-16
160 0.549306144334055 4.042274189272e-16
320 0.549306144334055 4.042274189272e-16
640 0.549306144334054 1.414795966245e-15
-----------------------------------------------------
The MATLAB code LGApprox is:
1 n = 5 ;
2 [s, w] = LGpw(n) ;
3 a = 0.0 ; b = 2.0 ; f = @(x) x./(1+2*x.ˆ2) ;
4 N = 5 ;
5 IExact = log(9.0)/4 ;
6 LG5Approx = [] ;
7 while N < 1000
8 x = linspace(a, b, N+1) ;
9 h = (b-a)/N ;
10 I = 0.0 ;
11 for j = 1 : N
12 xm = (x(j)+x(j+1))/2 ;
Numerical Integration 155
13 I = I + h/2.0*sum(w.*f(xm+h/2*s)) ;
14 end
15 Err = abs(I - IExact) ;
16 LG5Approx = [LG5Approx; [N, I, Err]] ;
17 N = 2*N ;
18 end
19
20 fprintf('Integration with 5-point Gauss quadrature\n')
21 fprintf('-----------------------------------------------------\n')
22 fprintf(' N\t\t Approx Int.\t\t Error\n') ;
23 fprintf('-----------------------------------------------------\n')
24 m = length(LG5Approx) ;
25 for j = 1 : m
26 fprintf('%4i\t%1.15f\t%1.12e\n', LG5Approx(j, 1), ...
27 LG5Approx(j, 2), LG5Approx(j, 3)/IExact) ;
28 end
29 fprintf('-----------------------------------------------------\n')
30
31 function [s, w] = LGpw(n)
32 s = sort(roots(legendreV(n))) ;
33 X = fliplr(vander(s))' ;
34 Y = zeros(n, 1) ;
35 for j = 1 : n
36 Y(j) = (1-(-1)ˆj)/j ;
37 end
38 w = X\Y ;
39 end
40
41 function Legn = legendreV(n)
42
43 %legendreV.m
44 %
45 % This function receives a parameter n representing the ...
polynomial degree
46 % and returns the cofficients of Legendre polynomial of ...
degree n using the
47 % recursive relation
48 % P n(x) = ...
((2n-1)/n)xP {n-1}(x)-((n-1)/n)P {n-2}(x)
49 %
50 % Written by Eihab B.M. Bashier, [email protected]
51
52 if n == 0
53 Legn = [0, 1] ;
54 elseif n == 1
55 Legn = [1, 0] ;
56 else
57 L1 = conv([1, 0], legendreV(n-1)) ;
58 L2 = legendreV(n-2) ;
59 if length(L2) < length(L1)
60 L2 = [zeros(1, length(L1)-length(L2)) L2] ;
61 end
62 Legn = (2*n-1)/n*L1-(n-1)/n*L2 ;
63 end
64 end
156 Numerical Differentiation and Integration
[Figure: positions of the Legendre-Gauss-Lobatto points in [−1, 1] for different numbers of points n.]
1 import numpy as np
2 from scipy.special import legendre
3 def LGLpw(n):
4 s = list(np.sort(np.roots([-1, 0, ...
1]*np.polyder(legendre(n-1)))))
5 X = (np.fliplr(np.vander(s))).T
6 Y = np.array([(1-(-1)**j)/j for j in range(1, n+1)])
7 w = np.linalg.solve(X, Y)
8 return s, w
9
10 n = 4
11 s, w = LGLpw(n)
12
13 a, b = 0.0, 2.0
14 f = lambda x: x/(1.0+2.0*x**2)
15 N = 5
16 IExact = np.log(9.0)/4.0
17 LGL5Approx = []
18 while N < 1000:
19 x, h = np.linspace(a, b, N+1), (b-a)/N
20 I = 0.0
21 for j in range(N):
22 xm = (x[j]+x[j+1])/2.0
23 I += h/2.0*sum([w[k]*f(xm+h/2*s[k]) for k in ...
range(len(s))])
24 Err = abs(I-IExact)
25 LGL5Approx.append([N, I, Err])
26 N = 2*N
27
28 print('Integration with 5-point Gauss quadrature\n')
29 print('----------------------------------------------------\n')
30 print(' N\t\t Approx Int.\t\t Error\n') ;
31 print('----------------------------------------------------\n')
32 m = len(LGL5Approx)
33 for j in range(m):
34 print('{0:6.0f}'.format(LGL5Approx[j][0]), '\t', \
35 '{0:1.15f}'.format(LGL5Approx[j][1]), '\t', \
36 '{0:1.12e}'.format(LGL5Approx[j][2]/IExact))
37 print('----------------------------------------------------\n\n\n')
38
39 RConv5 = []
40 print('Rates of convergence for '+str(n)+'-points ...
Legendre-Gauss-Lobbato quadrature: \n')
41 print('-----------------------\n')
42 print(' N\t Conv. Rate\n')
43 print('-----------------------\n')
44 for k in range(m-1):
45 RConv5.append(np.log2(LGL5Approx[k][2]/LGL5Approx[k+1][2]))
46 print('{0:4.0f}'.format(LGL5Approx[k][0]), '\t ...
{0:2.3f}'.format(RConv5[-1]))
47 print('-----------------------\n')
-----------------------------------------------------
N Approx Int. Error
-----------------------------------------------------
5 0.549303859258719 4.159930413361e-06
10 0.549306121271021 4.198575552244e-08
20 0.549306144007535 5.944232913986e-10
40 0.549306144329063 9.088447173451e-12
80 0.549306144333977 1.410753692056e-13
160 0.549306144334054 2.021137094636e-15
320 0.549306144334055 4.042274189272e-16
640 0.549306144334054 1.212682256782e-15
-----------------------------------------------------
1 n = 4 ;
2 [s, w] = LGLpw(n) ;
3 a = 0.0 ; b = 2.0 ; f = @(x) x./(1+2*x.ˆ2) ;
4 N = 5 ;
5 IExact = log(9.0)/4 ;
6 LGL4Approx = [] ;
7 while N < 1000
8 x = linspace(a, b, N+1) ;
9 h = (b-a)/N ;
10 I = 0.0 ;
11 for j = 1 : N
12 xm = (x(j)+x(j+1))/2 ;
13 I = I + h/2.0*sum(w.*f(xm+h/2*s)) ;
14 end
15 Err = abs(I - IExact) ;
16 LGL4Approx = [LGL4Approx; [N, I, Err]] ;
17 N = 2*N ;
18 end
Numerical Integration 159
19
20 fprintf('Integration with 4-point Gauss quadrature\n')
21 fprintf('-----------------------------------------------------\n')
22 fprintf(' N\t\t Approx Int.\t\t Error\n') ;
23 fprintf('-----------------------------------------------------\n')
24 m = length(LGL4Approx) ;
25 for j = 1 : m
26 fprintf('%4i\t%1.15f\t%1.12e\n', LGL4Approx(j, 1), ...
27 LGL4Approx(j, 2), LGL4Approx(j, 3)/IExact) ;
28 end
29 fprintf('-----------------------------------------------------\n')
30
31 RConv4 = [] ;
32 fprintf('Rates of convergence for 4-points Gauss quadrature: \n')
33 fprintf('-----------------------\n')
34 fprintf(' N\t Conv. Rate\n')
35 fprintf('-----------------------\n')
36 for k = 1:m-1
37 RConv4 =[RConv4;(log2(LGL4Approx(k, 3)/LGL4Approx(k+1, ...
3)))] ;
38 fprintf('%4i\t%1.12e\n', LGL4Approx(k, 1), RConv4(end)) ;
39 end
40 fprintf('-----------------------\n')
41
42 function [s, w] = LGLpw(n)
43 s = sort(roots(conv([-1, 0, 1],polyder(legendreV(n-1))))) ;
44 X = fliplr(vander(s))' ;
45 Y = zeros(n, 1) ;
46 for j = 1 : n
47 Y(j) = (1-(-1)ˆj)/j ;
48 end
49 w = X\Y ;
50 end
51
52 function Legn = legendreV(n)
53 if n == 0
54 Legn = [0, 1] ;
55 elseif n == 1
56 Legn = [1, 0] ;
57 else
58 L1 = conv([1, 0], legendreV(n-1)) ;
59 L2 = legendreV(n-2) ;
60 if length(L2) < length(L1)
61 L2 = [zeros(1, length(L1)-length(L2)) L2] ;
62 end
63 Legn = (2*n-1)/n*L1-(n-1)/n*L2 ;
64 end
65 end
of degree n. The positions of the n LGR-points are shown in Figure 6.10 for
n = 1, . . . , 11.
A MATLAB function PolyAdd that receives the vectors of coefficients of two polynomials P1(x) of degree n and P2(x) of degree m and returns the vector of coefficients of P1(x) + P2(x) is used here. The MATLAB function LGRpw receives a positive integer n and uses the function PolyAdd to return the LGR points and weights.
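Neither listing is reproduced in this excerpt; a minimal MATLAB sketch of the two functions, reusing the legendreV helper defined earlier in the chapter, might be:

function P = PolyAdd(P1, P2)
% Adds two polynomials given as coefficient vectors (highest degree first),
% padding the shorter vector with leading zeros.
if length(P1) < length(P2)
    P1 = [zeros(1, length(P2)-length(P1)) P1] ;
else
    P2 = [zeros(1, length(P1)-length(P2)) P2] ;
end
P = P1 + P2 ;
end

function [s, w] = LGRpw(n)
% Legendre-Gauss-Radau points: the n roots of P_{n-1}(x) + P_n(x), which
% include the endpoint x = -1; the weights follow from the same Vandermonde
% moment system used in LGpw and LGLpw.
s = sort(roots(PolyAdd(legendreV(n-1), legendreV(n)))) ;
X = fliplr(vander(s))' ;
Y = zeros(n, 1) ;
for j = 1 : n
    Y(j) = (1-(-1)^j)/j ;
end
w = X\Y ;
end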
1 % LGRApp.m
2 clear ; clc ;
3 n = 3 ;
4 [s, w] = LGRpw(n) ;
5 a = 0.0 ; b = 2.0 ; f = @(x) x./(1+2*x.ˆ2) ;
6 N = 5 ;
7 IExact = log(9.0)/4 ;
8 LGRApprox = [] ;
9 while N < 1000
10 x = linspace(a, b, N+1) ;
11 h = (b-a)/N ;
12 I = 0.0 ;
13 for j = 1 : N
14 xm = (x(j)+x(j+1))/2 ;
15 I = I + h/2.0*sum(w.*f(xm+h/2*s)) ;
16 end
17 Err = abs(I - IExact) ;
18 if Err < eps
19 break ;
20 end
21 LGRApprox = [LGRApprox; [N, I, Err]] ;
22 N = 2*N ;
23 end
24
25 fprintf('%s%i%s\n', 'Integration with Gauss quadrature based ...
on', n, '-LGR points:')
26 fprintf('-----------------------------------------------------\n')
27 fprintf(' N\t\t Approx Int.\t\t Error\n') ;
28 fprintf('-----------------------------------------------------\n')
29 m = length(LGRApprox) ;
30 for j = 1 : m
31 fprintf('%4i\t%1.15f\t%1.12e\n', LGRApprox(j, 1), ...
32 LGRApprox(j, 2), LGRApprox(j, 3)/IExact) ;
33 end
34 fprintf('-----------------------------------------------------\n')
35
36 RConv4 = [] ;
37 fprintf('%s%i%s\n', 'Rates of convergence for Gauss ...
quadrature based on ', n, '-LGR points')
38 fprintf('-----------------------\n')
39 fprintf(' N\t Conv. Rate\n')
40 fprintf('-----------------------\n')
41 for k = 1:m-1
42 RConv4 =[RConv4;(log2(LGRApprox(k, 3)/LGRApprox(k+1, 3)))] ;
43 fprintf('%4i\t%3.3f\n', LGRApprox(k, 1), RConv4(end)) ;
44 end
45 fprintf('-----------------------\n')
162 Numerical Differentiation and Integration
7
Solving Systems of Nonlinear Ordinary Differential Equations
Abstract
Differential equations have wide applications in modelling real phenomena [61, 62]. Analytical solutions of differential equations can be found only for a few special and simple cases. Hence, numerical methods are more appropriate for finding solutions of such differential equations.
This chapter discusses some of the numerical methods for solving a system
of initial value problems:
dy/dx = f(x, y),   y(a) = y_0,        (7.1)
where y : R → Rn , f : R × Rn → Rn and x ∈ [a, b]. Here, we assume that the
functions fj (x, y) are Lipschitz continuous in [a, b] for all j = 1, . . . , n and at
least one function fk (x, y) is nonlinear for some k ∈ {1, 2, . . . , n}.
It is divided into five sections. The first, second and third sections discuss
the general idea of Runge-Kutta methods, explicit and implicit Runge-Kutta
methods. The fourth section discusses the MATLAB® built-in functions for
solving systems of differential equations. The fifth section discusses the scipy
functions and gekko Python methods for solving initial value problems.
The weighted average of the slope over the interval [x_i, x_{i+1}] is then a convex combination of the slopes f(x_i^(j), y(x_i^(j))), j = 1, . . . , M, given by

∑_{j=1}^{M} b_j f(x_i^(j), y(x_i^(j))),   with   ∑_{j=1}^{M} b_j = 1.
The formulas (7.3) and (7.4) define the M-stage Runge-Kutta method. If the coefficients a_jl = 0 for all l ≥ j, the resulting Runge-Kutta method is explicit. If a_jl = 0 for all l > j, but a_jj ≠ 0 for some j, the method is semi-explicit. Otherwise, it is implicit [14].
The quantity ξ_i = y(x_i^(1)) + h_i ∑_{j=1}^{M} b_j f(x_i^(j), y(x_i^(j))) − y(x_i^(M)) defines the local truncation error of the M-stage Runge-Kutta method. The M-stage Runge-Kutta method is said to be of order P ∈ N if its local truncation error is of O(h^{P+1}), where h = max_{i=0,...,N−1} h_i.
The order of the M-stage Runge-Kutta method is achieved by satisfying conditions on the coefficients a_ij, b_i and c_i, i, j ∈ {1, . . . , M}. Butcher [14] stated the number of conditions for an implicit M-stage Runge-Kutta method to be of order P, as in Table (7.1).
Order (P ) 1 2 3 4 5 6 7 8
Number of Restrictions 1 2 4 8 17 37 85 200
In Table 7.2 we state the conditions on the coefficients (A, b, c) for the M-stage Runge-Kutta methods up to order four. A method is of order P if it satisfies all the conditions from 1 to P.
TABLE 7.2: The relationships between the order of the M-stage Runge-Kutta method and the conditions on the triplet (A, b, c)

Order   Conditions
1       Σ_{i=1}^{M} b_i = 1
2       Σ_{i=1}^{M} Σ_{j=1}^{M} b_i a_{ij} = 1/2
3       Σ_{i=1}^{M} b_i c_i^2 = 1/3,   Σ_{i=1}^{M} Σ_{j=1}^{M} b_i a_{ij} c_j = 1/6
4       Σ_{i=1}^{M} b_i c_i^3 = 1/4,   Σ_{i=1}^{M} Σ_{j=1}^{M} b_i a_{ij} c_j^2 = 1/12,
        Σ_{i=1}^{M} Σ_{j=1}^{M} b_i c_i a_{ij} c_j = 1/8,   Σ_{i=1}^{M} Σ_{j=1}^{M} Σ_{l=1}^{M} b_l a_{li} a_{ij} c_j = 1/24
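As a quick illustration (a sketch), the following Python snippet verifies numerically that the coefficients of the classical fourth-order Runge-Kutta method satisfy the order conditions of Table 7.2:

import numpy as np

# Butcher coefficients of the classical fourth-order Runge-Kutta method
A = np.array([[0, 0, 0, 0],
              [1/2, 0, 0, 0],
              [0, 1/2, 0, 0],
              [0, 0, 1, 0]])
b = np.array([1/6, 1/3, 1/3, 1/6])
c = A.sum(axis=1)                                   # c_i = sum_j a_ij

print(np.isclose(b.sum(), 1))                       # order 1
print(np.isclose(b @ c, 1/2))                       # order 2
print(np.isclose(b @ c**2, 1/3),
      np.isclose(b @ (A @ c), 1/6))                 # order 3
print(np.isclose(b @ c**3, 1/4),
      np.isclose((b*c) @ (A @ c), 1/8),
      np.isclose(b @ (A @ c**2), 1/12),
      np.isclose(b @ (A @ (A @ c)), 1/24))          # order 4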
The Butcher table

    c | A
    --+----
      | b^T

of an explicit M-stage Runge-Kutta method has the form

    0    |
    c_2  | a_21
    c_3  | a_31   a_32
    .    | .       .      .
    c_M  | a_M1    a_M2   ...   a_{M,M-1}
    -----+----------------------------------
         | b_1     b_2    ...   b_{M-1}    b_M
In an explicit Runge-Kutta method, the slope at a point x_i^{(ℓ)} depends only on the slopes at the previous points x_i^{(1)}, . . . , x_i^{(ℓ−1)}. The most important explicit Runge-Kutta methods are Euler's method, Heun's method and the classical fourth-order Runge-Kutta method [16].
The exact solution of the initial value problem of Example 7.1,

    y'(t) = t − 2t y(t),   y(0) = 1,   t ∈ [0, 2],

is

    y(t) = (1/2)(1 + e^{−t^2}),

and the error in approximating y(t_j) by y_j is

    Error = max{ |y(t_j) − y_j| : j = 0, 1, . . . , N }.
The Python code SolveIVPWithEuler solves the given initial value prob-
lem using Euler’s method:
1 # SolveIVPWithEuler.py
2 from numpy import linspace, zeros, exp, inf, log2
3 from numpy.linalg import norm
4 def EulerIVP(f, a, b, N, y0):
5 t = linspace(a, b, N+1)
6 yexact = 0.5*(1+exp(-t**2))
7 f = lambda t, y: t-2*t*y
8 h = (b-a)/N
9 y = zeros((N+1,), 'float')
10 y[0] = y0
11 for j in range(N):
12 y[j+1] = y[j] + h*f(t[j], y[j])
13 return t, y, norm(y-yexact, inf)
14
15 Rows = 15
16 f = lambda t, y: t-2*t*y
17 a, b = 0.0, 2.0
18 y0 = 1.0
19 EulerErrors = zeros((Rows, 2), 'float')
20 print('------------------------------------------')
21 print(' N\t\t Error')
22 print('------------------------------------------')
23 for j in range(Rows):
24 N = 2**(4+j)
25 [t, y, Error] = EulerIVP(f, a, b, N, y0)
26 EulerErrors[j, 0] = N
27 EulerErrors[j, 1] = Error
28 print('{0:8.0f}'.format(N), '\t', '{0:1.12e}'.format(Error))
29 print('------------------------------------------')
30
31 RatesOfConvergence = zeros((Rows-1, 2), 'float')
32 print('Rates of convergence of Eulers method:')
33 print('--------------------------------------')
34 print(' N\t\t Conv. Rate')
35 print('--------------------------------------')
36 for j in range(Rows-1):
37 RatesOfConvergence[j, 0] = EulerErrors[j, 0]
38 RatesOfConvergence[j, 1] = log2(EulerErrors[j, ...
1]/EulerErrors[j+1, 1])
39 print('{0:6.0f}'.format(RatesOfConvergence[j, 0]), '\t\t',\
40 '{0:1.3f}'.format(RatesOfConvergence[j, 1]))
41 print('--------------------------------------')
runfile('D:/PyFiles/SolveIVPWithEuler.py', wdir='D:/PyFiles')
------------------------------------------
N Error
------------------------------------------
16 2.212914947992e-02
32 1.061531476368e-02
64 5.201658072192e-03
128 2.570740890148e-03
256 1.278021559005e-03
512 6.371744244492e-04
1024 3.181315727203e-04
2048 1.589510709855e-04
4096 7.944690307737e-05
8192 3.971629161525e-05
16384 1.985635490032e-05
32768 9.927729750725e-06
65536 4.963752954890e-06
131072 2.481848491165e-06
262144 1.240917262835e-06
------------------------------------------
From the table of rates of convergence, it can be seen that Euler's method is a first-order method (O(h)).
1 clear ; clc ;
2 f = @(t, y) t-2*t*y ;
3 a = 0.0 ; b = 2.0 ; y0 = 1 ;
4 yexact = @(t) 0.5*(1+exp(-t.ˆ2)) ;
5 Rows = 15 ;
6 EulerErrors = zeros(Rows, 2) ;
7 fprintf('--------------------------------------\n') ;
8 fprintf(' N\t\t\t Error\n') ;
9 fprintf('--------------------------------------\n') ;
10 for j = 1 : Rows
11 N = 2ˆ(3+j) ;
12 [t, y, Error] = EulerIVP(f, a, b, N, y0) ;
13 EulerErrors(j, 1) = N ;
14 EulerErrors(j, 2) = Error ;
15 fprintf('%8i\t%1.10e\n', N, Error) ;
16 end
17 fprintf('--------------------------------------\n') ;
18
19 fprintf('--------------------------------------\n\n') ;
20 RatesOfConvergence = zeros(Rows-1, 2) ;
21 fprintf('Rates of convergence of Eulers method:\n') ;
22 fprintf('--------------------------------------\n') ;
23 fprintf(' N\t\t Conv. Rate\n') ;
24 fprintf('--------------------------------------\n') ;
25 for j = 1 : Rows - 1
26 RatesOfConvergence(j, 1) = EulerErrors(j, 1) ;
27 RatesOfConvergence(j, 2) = log2(EulerErrors(j, ...
2)/EulerErrors(j+1, 2)) ;
28 fprintf('%8i\t %1.12e\n', RatesOfConvergence(j, 1), ...
29 RatesOfConvergence(j, 2)) ;
30 end
31 fprintf('--------------------------------------\n') ;
32
33 function [t, y, Error] = EulerIVP(f, a, b, N, y0)
34 t = linspace(a, b, N+1) ;
35 h = (b-a)/N ;
36 y = zeros(1, N+1) ;
37 yexact = 0.5*(1+exp(-t.ˆ2)) ;
38 y(1) = y0 ;
39 for j = 1 : N
40 y(j+1) = y(j) + h*f(t(j), y(j)) ;
41 end
42 Error = norm(y-yexact, inf) ;
43 end
The following MATLAB script solves the same initial value problem using Heun's method:
1 clear ; clc ;
2 f = @(t, y) t-2*t*y ;
3 a = 0.0 ; b = 2.0 ; y0 = 1 ;
4 yexact = @(t) 0.5*(1+exp(-t.ˆ2)) ;
5 Rows = 15 ;
6 HeunErrors = zeros(Rows, 2) ;
7 fprintf('--------------------------------------\n') ;
8 fprintf(' N\t\t\t Error\n') ;
9 fprintf('--------------------------------------\n') ;
10 for j = 1 : Rows
11 N = 2ˆ(3+j) ;
12 [t, y, Error] = HeunIVP(f, a, b, N, y0) ;
13 HeunErrors(j, 1) = N ;
14 HeunErrors(j, 2) = Error ;
15 fprintf('%8i\t%1.10e\n', N, Error) ;
16 end
17 fprintf('--------------------------------------\n\n') ;
18 RatesOfConvergence = zeros(Rows-1, 2) ;
19 fprintf('Rates of convergence of Heuns method:\n') ;
20 fprintf('--------------------------------------\n') ;
21 fprintf(' N\t\t Conv. Rate\n') ;
22 fprintf('--------------------------------------\n') ;
23 for j = 1 : Rows - 1
24 RatesOfConvergence(j, 1) = HeunErrors(j, 1) ;
25 RatesOfConvergence(j, 2) = log2(HeunErrors(j, ...
2)/HeunErrors(j+1, 2)) ;
26 fprintf('%8i\t %1.12e\n', RatesOfConvergence(j, 1), ...
27 RatesOfConvergence(j, 2)) ;
28 end
29 fprintf('--------------------------------------\n') ;
30
31 function [t, y, Error] = HeunIVP(f, a, b, N, y0)
32 t = linspace(a, b, N+1) ;
33 h = (b-a)/N ;
34 y = zeros(1, N+1) ;
35 yexact = 0.5*(1+exp(-t.ˆ2)) ;
36 y(1) = y0 ;
37 for j = 1 : N
38 k1 = f(t(j), y(j)) ;
39 k2 = f(t(j+1), y(j)+h*k1) ;
40 y(j+1) = y(j) + h/2*(k1+k2) ;
41 end
42 Error = norm(y-yexact, inf) ;
43 end
--------------------------------------
N Error
--------------------------------------
16 1.6647420306e-03
32 3.8083683419e-04
64 9.1490315453e-05
128 2.2439310337e-05
256 5.5575532305e-06
512 1.3830109670e-06
1024 3.4496196688e-07
2048 8.6142039946e-08
4096 2.1523228200e-08
8192 5.3792741372e-09
16384 1.3446276315e-09
32768 3.3613245520e-10
65536 8.4030005176e-11
131072 2.1010082563e-11
262144 5.2486903712e-12
--------------------------------------
The following MATLAB script uses the classical fourth-order Runge-Kutta method to solve a predator-prey system:
1 % SolvePredPrey.m
2 clear ; clc ;
3 f = @(t, z) [0.7*z(1)-1.3*z(1)*z(2);
4 -z(2)+z(1)*z(2)] ;
5 a = 0.0 ; b = 50 ; y0 = [0.9; 0.1] ;
6 N = 500 ;
7 x = linspace(a, b, N+1) ;
8 h = (b-a)/N ;
9 y = zeros(2, N+1) ;
10 y(:, 1) = y0 ;
11 for j = 1 : N
12 k1 = f(x(j), y(:, j)) ;
13 k2 = f(x(j)+h/2, y(:, j)+h/2*k1) ;
14 k3 = f(x(j)+h/2, y(:, j)+h/2*k2) ;
15 k4 = f(x(j)+h, y(:, j)+h*k3) ;
16 y(:, j+1) = y(:, j) + h/6.0*(k1+2*k2+2*k3+k4) ;
17 end
18 figure(1) ;
19 plot(x, y(1, :), '-b', x, y(2, :), '-.m', 'LineWidth', 2)
20 xlabel('Time (t)') ;
21 ylabel('Population') ;
22 legend('Prey', 'Predator') ;
23 set(gca, 'FontSize', 12)
24 set(gca, 'Fontweight', 'bold') ;
25 grid on ;
26 set(gca, 'XTick', linspace(a, b, 11)) ;
27 set(gca, 'YTick', 0:0.3:3) ;
28 set(gca, 'GridLineStyle', ':')
29 grid on ;
FIGURE 7.1: Dynamics of the predator and prey populations obtained by the
fourth-order Runge-Kutta method.
FIGURE 7.2: Solution of the initial value problem of Example 7.1, given by y(x) = (1 + e^{−x^2})/2.
16384 1.000
32768 1.000
65536 1.000
131072 1.000
--------------------------------------
The graph of the problem solution is shown in Figure 7.2.
The MATLAB code is:
1 clear ; clc ;
2 f = @(t, y) t-2*t*y ;
3 a = 0.0 ; b = 2.0 ; y0 = 1 ;
4 yexact = @(t) 0.5*(1+exp(-t.ˆ2)) ;
5 Rows = 15 ;
6 BackwardEulerErrors = zeros(Rows, 2) ;
7 fprintf('--------------------------------------\n') ;
8 fprintf(' N\t\t\t Error\n') ;
9 fprintf('--------------------------------------\n') ;
10 for j = 1 : Rows
11 N = 2ˆ(3+j) ;
12 [t, y, Error] = BackwardEulerIVP(f, a, b, N, y0) ;
13 BackwardEulerErrors(j, 1) = N ;
14 BackwardEulerErrors(j, 2) = Error ;
15 fprintf('%8i\t%1.10e\n', N, Error) ;
16 end
17 fprintf('--------------------------------------\n\n') ;
18
19 plot(t, y, 'm', 'LineWidth', 2) ;
20 xlabel('x') ; ylabel('y') ;
21 grid on ;
22 set(gca, 'XTick', linspace(a, b, 11)) ;
23 set(gca, 'YTick', 0.5:0.05:1.0) ;
24 set(gca, 'fontweight', 'bold') ;
25 legend('y(x) = (1+eˆ{-xˆ2})/2') ;
26
27 function [t, y, Error] = BackwardEulerIVP(f, a, b, N, y0)
28 t = linspace(a, b, N+1) ;
29 h = (b-a)/N ;
30 y = zeros(1, N+1) ;
31 yexact = 0.5*(1+exp(-t.ˆ2)) ;
32 y(1) = y0 ;
33 for j = 1 : N
34 y(j+1) = y(j) + h*f(t(j+1), y(j)+h*f(t(j), y(j))) ;
35 end
36 Error = norm(y-yexact, inf) ;
37 end
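For comparison, a minimal Python sketch of the backward Euler method for the same test problem, in which the implicit equation for y_{j+1} is solved by fixed-point iteration, is:

import numpy as np

f = lambda t, y: t - 2*t*y
a, b, N, y0 = 0.0, 2.0, 128, 1.0
t = np.linspace(a, b, N+1)
h = (b-a)/N
y = np.zeros(N+1); y[0] = y0
for j in range(N):
    ynew = y[j] + h*f(t[j], y[j])            # forward Euler predictor
    for _ in range(50):                      # fixed-point iteration for y_{j+1}
        ynew = y[j] + h*f(t[j+1], ynew)
    y[j+1] = ynew
yexact = 0.5*(1 + np.exp(-t**2))
print('max error =', np.abs(y - yexact).max())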
and satisfies

    P_M(x_i) = y_i,   i = 0, 1, . . . , n.

Collocation Runge-Kutta methods are deeply related to the Gauss quadrature methods discussed in the previous chapter. The collocation points are obtained by transforming the Gauss points (in [−1, 1]), such as the LG, LGL and LGR points, to the interval [0, 1], using the transformation

    x(s) = (1 + s)/2.
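For instance, the following short Python sketch maps Legendre-Gauss (LG) points and weights from [−1, 1] to [0, 1] using this transformation:

import numpy as np

n = 3
s, w = np.polynomial.legendre.leggauss(n)   # LG points and weights on [-1, 1]
x = (1 + s)/2                               # collocation points on [0, 1]
wx = w/2                                    # weights rescaled to [0, 1]
print(x)
print(wx, wx.sum())                         # the rescaled weights sum to 1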
7.3.2.1 Legendre-Gauss Methods
The first three Gauss-Legendre methods have the following Butcher tables:

(i) M = 1:

    1/2 | 1/2
    ----+-----
        |  1

This method is called the mid-point rule.

(ii) M = 2:

    1/2 − √3/6 | 1/4          1/4 − √3/6
    1/2 + √3/6 | 1/4 + √3/6   1/4
    -----------+------------------------
               | 1/2          1/2

(iii) M = 3:

    1/2 − √15/10 | 5/36            2/9 − √15/15   5/36 − √15/30
    1/2          | 5/36 + √15/24   2/9            5/36 − √15/24
    1/2 + √15/10 | 5/36 + √15/30   2/9 + √15/15   5/36
    -------------+----------------------------------------------
                 | 5/18            4/9            5/18
Example 7.3 In this example the mid-point rule will be used to solve the logistic-growth model

    dP(t)/dt = rP(t)(1 − P(t)),   P(0) = P_0,

with r = 0.25, P_0 = 0.2 and t ∈ [0, 20].
By executing the code, the errors and rates of convergence for different
values of mesh points are shown below.
runfile('D:/PyFiles/SolveLogisticWithMidPoint.py', wdir='D:/PyFiles')
------------------------------------------
N Error
------------------------------------------
100 1.948817140496e-05
200 4.811971295204e-06
400 1.195559782174e-06
800 2.979672797387e-07
1600 7.437647475683e-08
3200 1.857971743124e-08
6400 4.643131656934e-09
12800 1.160556650781e-09
25600 2.901111573195e-10
51200 7.252876077501e-11
102400 1.812816563529e-11
204800 4.535261055594e-12
409600 1.134092819655e-12
819200 2.786659791809e-13
1638400 6.639133687258e-14
------------------------------------------
1 clear ; clc ;
2 Rows = 15 ;
3 r = 0.25 ;
4 f = @(t, P) r*P*(1-P) ;
5 a = 0.0; b = 20.0 ;
6 P0 = 0.2 ;
7 GaussMidErrors = zeros(Rows, 2) ;
8 fprintf('------------------------------------------\n') ;
9 fprintf(' N\t\t Error\n') ;
10 fprintf('------------------------------------------\n') ;
11 N = 100 ;
12 for j = 1 : Rows
13 [t, P, Error] = GaussMidIVP(f, a, b, N, P0) ;
14 GaussMidErrors(j, 1) = N ;
15 GaussMidErrors(j, 2) = Error ;
16 fprintf('%8i\t%1.12e\n', N, Error) ;
17 N = 2 * N ;
18 end
19 fprintf('------------------------------------------\n')
20
21 figure(1)
22 plot(t, P, '-m', 'LineWidth', 2) ;
23 legend('P(t) = P_0/(P_0+(1-P_0)e^{-rt})') ;
24 xlabel('Time (t)') ;
25 ylabel('Population (P(t))') ;
26 grid on
27 set(gca, 'fontweight', 'bold') ;
28 set(gca, 'fontsize', 12) ;
29 set(gca, 'XTick', a:(b-a)/10:b) ;
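The script calls the function GaussMidIVP; a minimal MATLAB sketch of such a function, implementing the implicit mid-point rule with a simple fixed-point iteration (the growth rate r = 0.25 and the exact logistic solution used for the error computation are assumptions), is:

function [t, P, Error] = GaussMidIVP(f, a, b, N, P0)
% Implicit mid-point (one-stage Gauss) rule with fixed-point iteration (sketch).
r = 0.25 ;                                  % assumed growth rate of the logistic model
t = linspace(a, b, N+1) ;
h = (b-a)/N ;
P = zeros(1, N+1) ;
P(1) = P0 ;
Pexact = P0./(P0+(1-P0)*exp(-r*t)) ;        % exact logistic solution
for j = 1 : N
    Pmid = P(j) + h/2*f(t(j), P(j)) ;       % initial guess for the stage value
    for it = 1 : 50                         % fixed-point iteration for the stage value
        Pmid = P(j) + h/2*f(t(j)+h/2, Pmid) ;
    end
    P(j+1) = P(j) + h*f(t(j)+h/2, Pmid) ;
end
Error = norm(P-Pexact, inf) ;
end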
FIGURE 7.3: Solution of the logistic growth model using the Gauss mid-point
method.
The first method is the implicit trapezoidal method, which has the Butcher table:

    0 | 0     0
    1 | 1/2   1/2
    --+----------
      | 1/2   1/2

and has the form

    y_{n+1} = y_n + (h/2) [ f(t_n, y_n) + f(t_{n+1}, y_{n+1}) ].

The implicit trapezoidal method is of second order.
1 % ITSolCompSpec.m
2 clear ; clc ;
3 t0 = 0 ; T = 50.0 ; r1 = 0.3 ; r2 = 0.2 ; a = 1.0 ; b = 0.9 ;
4 f = @(t, z) [0.3*z(1)*(1-z(1))-z(1)*z(2); ...
0.2*z(2)*(1-z(2))-0.9*z(1)*z(2)] ;
5 z0 = [0.5; 0.5] ; Dim = length(z0) ;
6 N = 100*ceil(T-t0) ;
7 t = linspace(t0, T, N+1) ;
8 h = (T-t0)/N ;
9 Epsilon = 1e-15 ;
10 n = length(t) ;
11 z = zeros(Dim, N+1) ;
12 YPrime = zeros(n, Dim) ;
13 z(:, 1) = z0 ;
14 i = 1 ;
15 while (i ≤ N)
16 k1 = f(t(i), z(:, i)) ;
17 k2 = feval(f, t(i+1),z(:, i)+h*k1) ;
18 YPrime(i, :) = k1' ;
19 yest = z(:, i) + h*k1 ;
20 z(:, i+1) = z(:, i) + h/2.0*(k1+k2) ;
21 while(norm(z(:, i+1)-yest)≥Epsilon)
22 yest = z(:, i+1) ;
23 k2 = feval(f, t(i+1),yest) ;
24 z(:, i+1) = z(:, i) + (h/2.0)*(k1+k2) ;
25 end
26 i = i + 1 ;
27 end
28 x = z(1, :) ;
29 y = z(2, :) ;
30 plot(t, x, '-b', t, y, ':m', 'LineWidth', 2) ;
31 grid on
32 set(gca, 'fontweight', 'bold')
33 legend('Population 1', 'Population 2') ;
34 axis([t0, T, -0.1, 1.1])
35 xlabel('Time (years)') ; ylabel('Population Density') ;
36 set(gca, 'XTick', t0:(T-t0)/10:T) ;
37 set(gca, 'YTick', 0:0.1:1) ;
1 import numpy as np
2 t0 = 0 ; T = 50.0 ; r1 = 0.3 ; r2 = 0.2 ; a = 1.0 ; b = 0.9 ;
3 f = lambda t, z: np.array([0.3*z[0]*(1-z[0])-z[0]*z[1], ...
0.2*z[1]*(1-z[1])-0.9*z[0]*z[1]])
4 z0 = [0.5, 0.5]
5 Dim = len(z0)
6 N = 100*int(T-t0)
7 t = np.linspace(t0, T, N+1)
8 h = (T-t0)/N
9 Epsilon = 1e-15
10 n = len(t)
11 z = np.zeros((Dim, N+1), 'float')
12 YPrime = np.zeros((n, Dim), 'float')
13 z[:, 0] = z0
14 i = 0
15 while i < N:
16 k1 = f(t[i], z[:, i])
17 k2 = f(t[i+1],z[:, i]+h*k1)
18 YPrime[i, :] = k1
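    # The rest of this listing is not shown above; the lines below are a sketch
    # that completes the loop, mirroring the MATLAB version.
19     yest = z[:, i] + h*k1
20     z[:, i+1] = z[:, i] + h/2.0*(k1+k2)
21     while np.linalg.norm(z[:, i+1]-yest) >= Epsilon:
22         yest = z[:, i+1]
23         k2 = f(t[i+1], yest)
24         z[:, i+1] = z[:, i] + (h/2.0)*(k1+k2)
25     i = i + 1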
The second method is the Hermite-Simpson method, which has the Butcher table:

    0   | 0      0     0
    1/2 | 5/24   1/3   −1/24
    1   | 1/6    2/3   1/6
    ----+--------------------
        | 1/6    2/3   1/6

and has the form

    y_{n+1} = y_n + (h/6) [ f(t_n, y_n) + 4 f(t_{n+1/2}, y_{n+1/2}) + f(t_{n+1}, y_{n+1}) ],

where t_{n+1/2} = t_n + h/2 and y_{n+1/2} ≈ y(t_n + h/2). The Hermite-Simpson method is a fourth-order method.
1 clear ;
2 t0 = 0 ; tf = 200 ;
3 s = 10 ;
4 mu_T = 0.02 ;
5 r = 0.03 ;
6 Tmax = 1500 ;
7 k1 = 2.4e-5 ;
8 k2 = 2.0e-5 ;
9 mu_I = 0.26 ;
10 Nv = 850 ;
11 NC = 200 ;
12 Q = 2 ;
13 mu_b = 0.24 ;
14 mu_V = 2.4 ;
15
16 f = @(t, x) [s-mu_T*x(1)+r*x(1)*(1-(x(1)+x(2))/Tmax)-k1*x(1)*x(3);
17 k2*x(1)*x(3)-mu_I*x(2) ;
18 Nv*mu_b*x(2)-k1*x(1)*x(3)-mu_V*x(3)] ;
19 x0 = [1e+3; 0; 1e-3] ;
20 N = 10*(tf-t0) ;
21 Epsilon = 1e-13 ;
22 x = zeros(1+N, 3) ;
23 t = linspace(t0, tf, 1+N) ;
24 h = (tf-t0)/N ;
25 x(1,:) = x0 ;
26
27 for i = 1 : N
28 K1 = f(t(i), x(i, :)) ;
29 xest = x(i, :) + h*K1' ;
30 K3 = f(t(i+1), xest) ;
31 xmid = 0.5*(x(i, :)+xest)+h/8*(K1'-K3') ;
32 K2 = f(t(i)+h/2, xmid) ;
33 x(i+1,:) = x(i,:) + h/6*(K1'+4*K2'+K3') ;
34 end
35 T = x(:, 1) ; I = x(:, 2) ; V = x(:, 3) ;
36 figure(1) ;
37 subplot(3, 1, 1) ;
38 plot(t, T, '-b', 'LineWidth', 2) ;
39 xlabel('Time') ;ylabel('Susceptible CD4+ T-Cells') ;
40 grid on ;
41 set(gca, 'fontweight', 'bold') ;
42 set(gca, 'XTick', linspace(0, tf, 11)) ;
43 grid on ;
44 subplot(3, 1, 2) ;
45 plot(t, I, '--r', 'LineWidth', 2) ;
46 xlabel('Time') ;ylabel('Infected CD4+ T-Cells') ;
47 grid on ;
48 set(gca, 'fontweight', 'bold') ;
49 set(gca, 'XTick', linspace(0, tf, 11)) ;
50 grid on ;
51 subplot(3, 1, 3) ;
52 plot(t, V, '-.m', 'LineWidth', 2) ;
53 xlabel('Time') ;ylabel('Viral load') ;
54 grid on ;
1 import numpy as np
2 t0 = 0
3 tf = 200
4 s = 10
5 mu_T = 0.02
6 r = 0.03
7 Tmax = 1500
8 c1 = 2.4e-5
9 c2 = 2.0e-5
10 mu_I = 0.26
11 Nv = 850
12 mu_b = 0.24
13 mu_V = 2.4
14
15 f = lambda t, x: np.array([s-mu_T*x[0]+r*x[0]*(1-(x[0]+x[1])/Tmax)-c1*x[0]*x[2],\
16     c2*x[0]*x[2]-mu_I*x[1],\
17     Nv*mu_b*x[1]-c1*x[0]*x[2]-mu_V*x[2]])
18 z0 = [1e+3, 0.0, 1e-3] ;
19 Dim = len(z0)
20 N = 100*int(tf-t0)
21 t = np.linspace(t0, tf, N+1)
22 h = (tf-t0)/N
23 Epsilon = 1e-15
24 n = len(t)
25 z = np.zeros((Dim, N+1), 'float')
26 zmid = np.zeros((Dim, N), 'float')
27 YPrime = np.zeros((n, Dim), 'float')
28 z[:, 0] = z0
29 i = 0
30 while i < N:
31 k1 = f(t[i], z[:, i])
32 zest = z[:, i]+h*k1
33 k3 = f(t[i+1], zest)
34 zmid[:, i] = 0.5*(z[:, i]+zest) + h/8*(k1-k3)
35 k2 = f(t[i]+h/2.0,zmid[:, i])
36 z[:, i+1] = z[:, i] + h/6.0*(k1+4*k2+k3)
37 i = i + 1
38
39 T, I, V = z[0, :], z[1, :], z[2, :]
40 import matplotlib.pyplot as plt
41 plt.figure(1)
42 plt.plot(t, T, color='orangered', lw=2)
43 plt.subplot(3, 1, 1)
44 plt.plot(t, T, color='orangered', lw=2)
45 plt.xlabel('Time')
46 plt.ylabel('Susceptible CD4+ T-Cells', fontweight='bold')
47 plt.grid(True, ls=':')
(Figure: time courses of the susceptible CD4+ T-cells, the infected CD4+ T-cells and the viral load over the interval [0, 200], produced by the scripts above.)
To solve the IVP (7.6)-(7.7), we have to write a function that returns the right-hand side of Equation (7.6). This can be done as follows:
1 function z = fp(t, y)
2 z = f(t, y) ;
Now from the command window (or a MATLAB script), we can define the
time period, the initial condition and call the ode45 function to solve the given
IVP. We do this as follows:
1 tspan = [t_0, t_1] ;
2 IC = y0 ;
3 [t, y] = ode45(@fp, tspan, y0) ;
4 plot(t, y) ;
The symbol @ in front of the name of the function informs MATLAB that what follows is the name of the function which defines the slope.
1 % LogisticGrowthSlope.m
2 function z = LogisticGrowthSlope(t, y)
3 z = 0.01*y*(1-y) ;
1 % SolveLogisticWithAlee.m
2 clear ; clc ;
3 tspan = [0, 700] ;
4 y0 = 0.3 ;
5 x0 = 0.2 ;
6 LogGrowth = @(t, y) 0.01*y*(1-y)*(4*y-1) ;
7 [t, y] = ode45(LogGrowth, tspan, y0) ;
8 [s, x] = ode45(LogGrowth, tspan, x0) ;
9 plot(t, y, '-b', s, x, '-.r', 'LineWidth', 2) ;
10 legend('y_0 = 0.3 > Threshold = 0.25', 'x_0 = 0.2 < Threshold ...
= 0.25') ;
11 hold on ;
12 plot([0.0, 700], [0.25, 0.25], '--k') ;
13 xlabel('Time') ;
14 ylabel('Population') ;
15 grid on ;
16 axis([0, 700, 0, 1.1]) ;
17 set(gca, 'fontweight', 'bold') ;
18 set(gca, 'GridLineStyle', ':') ;
19 set(gca, 'YTick', 0:0.125:1) ;
FIGURE 7.6: Solution of the logistic growth model with Allee effect using the ode45 MATLAB solver.
The second step is to write a function, which can evaluate the right-hand side
of the system, where the output is an n-dimensional vector. This is done as
follows:
1 function z = fp(t, z)
2 z = [ f_1(t, z) ;
3 f_2(t, z) ;
4 ...
5 f_n(t, z) ;
6 ] ;
Finally, we write a script that defines the initial condition and the time span and calls the ode45 function to solve the system.
T = [t_0, t_1] ;
z0 = [y_10; y_20; ...; y_n0] ;
[t, z] = ode45(@fp, T, z0) ;
The ode45 function discretizes the time interval [t_0, t_1] into m (not necessarily equidistant) points and returns them in the vector t. The matrix z then has m rows and n columns: the first column of z (z(:, 1)) is the solution vector y_1, the second column (z(:, 2)) is the solution vector y_2, etc.
1 % SIRSSlope
2 function z = SIRSSlope(t, x)
3 S = x(1) ; I = x(2) ; R = x(3) ;
4 z = [-0.03*S*I+0.02*R; 0.03*S*I-0.01*I; 0.01*I-0.02*R] ;
5 end
1 % SolveSIRS.m
2 clear ; clc ;
3 tspan = [0, 300] ;
4 z0 = [0.85; 0.15; 0] ;
5 [t, z] = ode45(@SIRSSlope, tspan, z0) ;
6 S = z(:, 1) ; I = z(:, 2) ; R = z(:, 3) ;
7 plot(t, S, '--b', t, I, ':r', t, R, '-.m', 'LineWidth', 2) ;
8 grid on ;
9 xlabel('Time') ;
FIGURE 7.7: Solution of the SIRS model with the ode45 MATLAB solver.
10 ylabel('Population') ;
11 legend('Susceptibles', 'Infected', 'Recovered') ;
12 axis([0, 300, 0, 1]) ;
13 set(gca, 'fontweight', 'bold') ;
14 set(gca, 'XTick', linspace(0, 300, 11)) ;
15 set(gca, 'GridLineStyle', ':') ;
If we let y_1(x) = y(x) and y_2(x) = y'(x), the Van der Pol equation can be written in the form:
The MATLAB ode45 function becomes impractically slow on the stiff Van der Pol system (7.9) for large σ; we can use the MATLAB stiff solver ode23s instead to solve the system (7.9). We will declare σ as a global variable and then solve the system as follows:
1 %vdpsig.m
2 function z = vdpsig(t, y)
3 global sigma
4 z = zeros(2, 1) ;
5 z(1) = y(2);
6 z(2) = sigma*(1-y(1)ˆ2)*y(2)-y(1) ;
1 % SolveVDPSystem.m
2 clear ; clc ;
3 global sigma ;
4 tspan = [0 200] ; y0 = [2; 1] ;
5 figure(1) ;
6 sigma = 1 ;
7 [t,y] = ode23s(@vdpsig, tspan, y0);
8 subplot(3, 2, 1) ;
9 plot(t,y(:,1),'-b', 'LineWidth', 2) ;
10 xlabel('x') ; ylabel('y(x)') ;
11 title(['\sigma = ' num2str(sigma)]) ;
12 sigma = 5 ;
13 [t,y] = ode23s(@vdpsig, tspan, y0);
14 subplot(3, 2, 2) ;
15 plot(t,y(:,1),'--b', 'LineWidth', 2) ;
16 xlabel('x') ; ylabel('y(x)') ;
17 title(['\sigma = ' num2str(sigma)]) ;
18 sigma = 10 ;
19 [t,y] = ode23s(@vdpsig, tspan, y0);
20 subplot(3, 2, 3) ;
21 plot(t,y(:,1),'--b', 'LineWidth', 2) ;
22 xlabel('x') ; ylabel('y(x)') ;
23 title(['\sigma = ' num2str(sigma)]) ;
24 sigma = 20 ;
25 [t,y] = ode23s(@vdpsig, tspan, y0);
26 subplot(3, 2, 4) ;
27 plot(t,y(:,1),'--b', 'LineWidth', 2) ;
28 xlabel('x') ; ylabel('y(x)') ;
29 title(['\sigma = ' num2str(sigma)]) ;
30 sigma = 40 ;
31 [t,y] = ode23s(@vdpsig, tspan, y0);
32 subplot(3, 2, 5) ;
33 plot(t,y(:,1),'--b', 'LineWidth', 2) ;
34 xlabel('x') ; ylabel('y(x)') ;
35 title(['\sigma = ' num2str(sigma)]) ;
36 sigma = 80 ;
37 [t,y] = ode23s(@vdpsig, tspan, y0);
38 subplot(3, 2, 6) ;
39 plot(t,y(:,1),'--b', 'LineWidth', 2) ;
40 xlabel('x') ; ylabel('y(x)') ;
41 title(['\sigma = ' num2str(sigma)]) ;
By executing the above MATLAB code, we get the solutions for different
values of σ as shown in Figure 7.8.
FIGURE 7.8: Solution of the Van der Pol system for σ = 1, 5, 10, 20, 40 and 80.
The scipy odeint function can be used to solve an initial value problem in Python; for example, the following code solves x'(t) = 4 sin(2t) x(t) − 3x^2(t), x(0) = 0.1, on [0, 20]:
1 import numpy as np
2 from scipy.integrate import odeint
3
4 f = lambda x, t: (4*(np.sin(2*t))*x-3*x**2)
5 t = np.linspace(0.0, 20.0, 401)
6 x0 = 0.1
7
8 x = odeint(f, x0, t)
9
10 import matplotlib.pyplot as plt
11 plt.figure(1, figsize=(10, 10))
12 plt.plot(t, x, color='orangered', lw = 2)
13 plt.grid(True, ls = ':')
14 plt.xlabel('t', fontweight='bold')
15 plt.ylabel('x(t)', fontweight='bold')
16 plt.xticks(np.arange(0, 22, 2), fontweight='bold')
17 plt.yticks(np.arange(0.0, 1.0, 0.1), fontweight='bold')
FIGURE 7.9: Solution of the initial value problem using the odeint function.
1 import numpy as np
2 from scipy.integrate import odeint
3 import matplotlib.pyplot as plt
4 global sigma ;
5 tspan = [0, 200] ; y0 = [2, 1] ;
6 t = np.linspace(0.0, 200.0, 10001)
7 vdpsig = lambda z, t: [z[1], sigma*(1-z[0]**2)*z[1]-z[0]]
8 plt.figure(1, figsize = (20, 10)) ;
9 sigma = 1 ;
10 y = odeint(vdpsig, y0, t)
11 y1= y[:, 0]
12 plt.subplot(3, 2, 1) ;
13 plt.plot(t,y1,color = 'crimson', lw = 2, label = r'$\sigma = ...
$'+str(sigma))
14 plt.xlabel('x', fontweight='bold') ; plt.ylabel('y(x)', ...
fontweight='bold')
15 plt.legend()
16 plt.grid(True, ls=':')
17
18 plt.xticks(np.arange(0.0, 220.0, 20), fontweight='bold')
19 plt.yticks(np.arange(-2, 3), fontweight='bold')
20
21 sigma = 5 ;
22 y = odeint(vdpsig, y0, t)
23 y1= y[:, 0]
24 plt.subplot(3, 2, 2) ;
25 plt.plot(t,y1,color = 'crimson', lw = 2, label = r'$\sigma = ...
$'+str(sigma))
26 plt.xlabel('x', fontweight='bold') ; plt.ylabel('y(x)', ...
fontweight='bold')
27 plt.legend()
28 plt.grid(True, ls=':')
29 plt.xticks(np.arange(0.0, 220.0, 20), fontweight='bold')
30 plt.yticks(np.arange(-2, 3), fontweight='bold')
31
32 sigma = 10 ;
33 y = odeint(vdpsig, y0, t)
34 y1= y[:, 0]
35 plt.subplot(3, 2, 3) ;
36 plt.plot(t,y1,color = 'crimson', lw = 2, label = r'$\sigma = ...
$'+str(sigma))
37 plt.xlabel('x', fontweight='bold') ; plt.ylabel('y(x)', ...
fontweight='bold')
38 plt.legend()
39 plt.grid(True, ls=':')
40 plt.xticks(np.arange(0.0, 220.0, 20), fontweight='bold')
41 plt.yticks(np.arange(-2, 3), fontweight='bold')
42
43 sigma = 20 ;
44 y = odeint(vdpsig, y0, t)
45 y1= y[:, 0]
46 plt.subplot(3, 2, 4) ;
47 plt.plot(t,y1,color = 'crimson', lw = 2, label = r'$\sigma = ...
$'+str(sigma))
48 plt.xlabel('x', fontweight='bold') ; plt.ylabel('y(x)', ...
fontweight='bold')
49 plt.legend()
50 plt.grid(True, ls=':')
51 plt.xticks(np.arange(0.0, 220.0, 20), fontweight='bold')
52 plt.yticks(np.arange(-2, 3), fontweight='bold')
53
54 sigma = 40 ;
55 y = odeint(vdpsig, y0, t)
56 y1= y[:, 0]
57 plt.subplot(3, 2, 5) ;
The solutions of the system for different values of σ are shown in Figure 7.10.
FIGURE 7.10: Solution of the Van der Pol system for σ = 1, 5, 10, 20, 40 and 80.
Assigning the value False to the parameter remote informs gekko to set
the server to be the local machine.
If the model m contains a variable x that is bounded from below by x_min, from above by x_max and has an initial value x_0, then it can be declared as follows:
x = m.Var(value = x_0, lb = x_min, ub = x_max)
Example 7.9 This example uses gekko to solve the SIR model:

    Ṡ(t) = Λ − αS(t)I(t) − µS(t),        S(0) = S_0,   t ∈ [0, T]
    İ(t) = αS(t)I(t) − (µ + β)I(t),       I(0) = I_0,   t ∈ [0, T]
    Ṙ(t) = βI(t) − µR(t),                 R(0) = R_0,   t ∈ [0, T]

with Λ = 0.05, µ = 0.05, α = 2, β = 0.6 and T = 25.
The gekko Python code is:
1 #SolveSIRgekko.py
2 from gekko import GEKKO
3 import numpy as np
4 import matplotlib.pyplot as plt
5
6 Lambda = 0.05 ; mu = 0.05 ; alpha = 2 ; beta = 0.6
7 t0 = 0
8 T = 25
9 N = 1000
10 m = GEKKO(remote=False)
11 m.time = np.linspace(t0, T, N+1)
12 S, I, R = m.Var(value=0.9), m.Var(value=0.1), m.Var(value=0.0)
13 # Equations
14 m.Equation(S.dt() == Lambda-alpha*S*I-mu*S)
15 m.Equation(I.dt() == alpha*S*I-(mu+beta)*I)
16 m.Equation(R.dt() == beta*I-mu*R)
17
18 m.options.IMODE = 4 # simulation mode
19 m.solve()
20 t = m.time
21
22 plt.figure(1, figsize=(8, 8))
23
24 plt.plot(t, S.value, color='darkblue', ls = '-', lw = 3, ...
label='Susceptible Population (S(t))')
25 plt.plot(t, I.value, color='crimson', ls = '--', lw = 3, ...
label='Infected Population (I(t))')
26 plt.plot(t, R.value, color='darkmagenta', ls = ':', lw = 3, ...
label='Recovered Population (R(t))')
27 plt.xlabel('Time (t)', fontweight='bold')
28 plt.ylabel('Population', fontweight='bold')
29 plt.xticks(np.arange(0, T+T/10, T/10), fontweight='bold')
30 plt.yticks(np.arange(0, 1.1, 0.1), fontweight='bold')
31 plt.grid(True, ls='--')
32 plt.axis([0.0, T, 0.0, 1.1])
1 1.50866E-13 1.14988E-01
2 4.75681E-15 3.14821E-01
3 1.74925E-15 3.13396E-01
4 1.39990E-13 2.80934E+01
5 6.11632E-16 2.59675E+00
6 1.54265E-16 2.30466E+00
7 3.91236E-17 1.83967E+00
8 9.97549E-18 9.28526E-01
9 2.46346E-18 6.43451E-01
---------------------------------------------------
Solver : IPOPT (v3.12)
Solution time : 6.480000000010477E-002 sec
Objective : 0.000000000000000E+000
Successful solution
---------------------------------------------------
The solution graph is shown in Figure 7.11.
FIGURE 7.11: Solution of the SIR model in [0, 25] using the gekko package.
Example 7.10 In this example, gekko will be used to solve the stiff system of differential equations:

    dy_1(t)/dt = −1002 y_1(t) + 1000 y_2^2(t),   y_1(0) = 1,
    dy_2(t)/dt = y_1(t) − y_2(t)(1 + y_2(t)),     y_2(0) = 1,

for 0 ≤ t ≤ 5.
The code is:
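(The listing below is a sketch following the pattern of the SIR example in Example 7.9; details such as the mesh size and plotting commands are assumptions.)

# A sketch following the pattern of Example 7.9; details are assumed.
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt

m = GEKKO(remote=False)
m.time = np.linspace(0, 5, 501)
y1, y2 = m.Var(value=1.0), m.Var(value=1.0)
m.Equation(y1.dt() == -1002*y1 + 1000*y2**2)
m.Equation(y2.dt() == y1 - y2*(1 + y2))
m.options.IMODE = 4                          # simulation mode
m.solve(disp=False)

plt.plot(m.time, y1.value, '-', lw=2, label='y1(t)')
plt.plot(m.time, y2.value, '--', lw=2, label='y2(t)')
plt.xlabel('Time (t)'); plt.ylabel('y(t)')
plt.legend(); plt.grid(True, ls='--')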
FIGURE 7.12: Solution of the stiff system of ODEs in [0, 5] using the gekko
package.
Abstract
Standard numerical methods are initially designed to solve a class of gen-
eral problems without considering the structure of any individual problems.
Hence, they seldom produce reliable numerical solutions to problems with
complex structures such as nonlinearity, stiffness, singular perturbations and
high oscillations. While the explicit difference schemes can solve such prob-
lems with low computational cost, they suffer the problems of small stability
regions and hence they suffer severe restrictions on step sizes to achieve con-
venient results. On the other-hand, the implicit finite difference schemes enjoy
wide stability regions but suffer high associated computational costs and their
convergence orders cannot exceed one order above the explicit methods for the
same number of stages [1].
This chapter highlights some of the cases in which the standard finite
difference schemes fail to find reliable solutions, then it discusses the rules upon
which the nonstandard schemes stand with several examples to explain the
idea. MATLAB® and Python are used to implement the solution algorithms
in all the sections.
The chapter is organized as follows. The first section discusses some numer-
ical cases in which the standard finite difference methods give inappropriate
solutions. In the second section, the construction rules of nonstandard finite
difference methods are introduced. Exact finite difference schemes based on
nonstandard methods are presented in Section 3, for solving some given ini-
tial value problems. Finally, in the fourth section, the design of nonstandard finite difference schemes, for the case when exact finite difference schemes are hard to find, is presented.
    (P_{j+1} − P_{j−1})/(2h) = P_j(1 − P_j)   ⇒   P_{j+1} = P_{j−1} + 2hP_j(1 − P_j).
Figure 8.1 shows the solution of the logistic model Ṗ(t) = P(t)(1 − P(t)), P(0) = 0.5, using the central, forward and backward difference schemes. The figure shows that the solutions obtained by both the forward and backward difference schemes have the same behavior as the original model, whereas the central finite difference scheme bears no relation to the exact solution.
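A short Python sketch that reproduces this comparison for the logistic model (an illustration; the step size h = 0.1 is an assumption) is:

import numpy as np
import matplotlib.pyplot as plt

T, h = 100.0, 0.1
N = int(T/h)
t = np.linspace(0, T, N+1)
Pf, Pb, Pc = np.zeros(N+1), np.zeros(N+1), np.zeros(N+1)
Pf[0] = Pb[0] = Pc[0] = 0.5
Pc[1] = Pc[0] + h*Pc[0]*(1-Pc[0])            # one Euler step to start the central scheme

for j in range(N):
    # forward difference: P_{j+1} = P_j + h P_j (1 - P_j)
    Pf[j+1] = Pf[j] + h*Pf[j]*(1-Pf[j])
    # backward difference: P_{j+1} = P_j + h P_{j+1}(1 - P_{j+1}),
    # solved here by a few fixed-point iterations
    p = Pb[j]
    for _ in range(50):
        p = Pb[j] + h*p*(1-p)
    Pb[j+1] = p
    # central difference: P_{j+1} = P_{j-1} + 2 h P_j (1 - P_j)
    if j >= 1:
        Pc[j+1] = Pc[j-1] + 2*h*Pc[j]*(1-Pc[j])

for P, name in [(Pf, 'forward'), (Pb, 'backward'), (Pc, 'central')]:
    plt.plot(t, P, lw=2, label=name + ' difference')
plt.xlabel('Time (t)'); plt.ylabel('Population (P(t))')
plt.legend(); plt.grid(True, ls=':')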
(ii) Using the standard denominator function h [40]. The first and second derivatives at a point t_j are approximated by

    dy/dt |_{t=t_j} ≈ (y_{j+1} − y_j)/h   and   d^2y/dt^2 |_{t=t_j} ≈ (y_{j+1} − 2y_j + y_{j−1})/h^2.
FIGURE 8.1: Solution of the logistic model Ṗ (t) = P (t)(1 − P (t)), P (0) = 0.5
using the forward, backward and central finite difference schemes.
The following Python code compares Euler's method, the trapezoidal rule and the classical fourth-order Runge-Kutta method on the test problem y'(t) = A y(t) + t, y(0) = 0, t ∈ [0, 1], for increasing values of A:
1 import numpy as np
2 import matplotlib.pyplot as plt
3
4 f = lambda s, v, B: B*v+s
5 N = 1000
6 h, t = 1.0/N, np.linspace(0, 1, N+1)
7 y0 = 0
8 MaxRows = 12
9 Eul = np.zeros((MaxRows, len(t)), float)
10 rk4 = np.zeros((MaxRows, len(t)), float)
11 itr = np.zeros((MaxRows, len(t)), float)
12 EulError = np.zeros((MaxRows, ), float)
13 rk4Error = np.zeros((MaxRows, ), float)
14 itrError = np.zeros((MaxRows, ), float)
15 A = np.zeros((MaxRows,), float)
16 row = 0
17 A[row] = 5.0
18 i = 0
19 while row < MaxRows:
20 # y is the exact solution of the problem
21 y = 1/A[row]**2*(np.exp(A[row]*t)-(A[row]*t+1))
22 plt.plot(t, y, lw=2, label='A = '+str(A[row]))
23 plt.legend()
24 Eul[i, 0] = y0
25 rk4[i, 0] = y0
26 itr[i, 0] = y0
27 for j in range(N):
28 # Solving with Euler's method
29 Eul[i, j+1] = Eul[i, j] + h*f(t[j], Eul[i, j], A[row])
30 # Solving with implicit trapezoidal rule
31 k1 = f(t[j], itr[i, j], A[row])
32 k2 = f(t[j]+h, itr[i, j]+h*k1, A[row])
33 itr[i, j+1] = itr[i, j] + h/2*(k1+k2)
34 # Solving with the classical fourth-order Runge-Kutta ...
method
35 k1 = f(t[j], rk4[i, j], A[row])
36 k2 = f(t[j]+h/2, rk4[i, j]+h/2*k1, A[row])
37 k3 = f(t[j]+h/2, rk4[i, j]+h/2*k2, A[row])
38 k4 = f(t[j]+h, rk4[i, j]+h*k3, A[row])
39 rk4[i, j+1] = rk4[i, j] +h/6*(k1+2*k2+2*k3+k4)
40 # computing the norm-infinity error for the three methods
41 EulError[i] = np.linalg.norm(y-Eul[i, :], np.inf)
42 itrError[i] = np.linalg.norm(y-itr[i, :], np.inf)
43 rk4Error[i] = np.linalg.norm(y-rk4[i, :], np.inf)
44 i += 1
45 row += 1
46 if row >= MaxRows:
47 break
48 else:
49 A[row] = A[row-1] + 2.0
50 print('-----------------------------------------------------')
51 print(' A\t Euler Error\t\t Imp. Trapz Err\t\t RK4 Error')
52 print('-----------------------------------------------------')
53 for row in range(MaxRows):
54 print('{0:4.0f}'.format(A[row]), ...
'\t','{0:1.8e}'.format(EulError[row]), \
55 '\t', '{0:1.8e}'.format(itrError[row]), '\t', ...
'{0:1.8e}'.format(rk4Error[row]))
56 print('-----------------------------------------------------')
For different values of the step size h, the resulting solutions are shown
in Figure 8.2.
The solutions obtained for the exponential decay model include solution profiles with asymptotically stable, periodic and unstable dynamics, where the periodic and unstable solutions do not correspond to any true solution of the exponential decay model. Only the first graph, in which the population size decays to zero, agrees with the true solution of the differential equation.
Another example is the use of Heun's method to solve the logistic model

    dx(t)/dt = x(t)(1 − x(t)),   x(0) = 0.5.
(Figure 8.2: computed solutions of the exponential decay model for step sizes h = 0.5, 1.5, 2.0 and a larger step size.)
1 import numpy as np
2 import matplotlib.pyplot as plt
3 steps = [0.5, 2.75, 3.0, 3.35]
4 plt.figure(1, figsize=(12, 12))
5 T = 100
6 for k in range(4):
7 h = steps[k]
8 N = round(T/h)
9 t = np.linspace(0, T, N+1)
10 P = np.zeros_like(t)
11 P[0] = 0.5
12 for j in range(N):
13 k1 = P[j] * (1.0-P[j])
14 Pest = P[j]+h*k1
15 k2 = Pest*(1.0-Pest)
16 P[j+1] = P[j] + h/2*(k1+k2)
17 plt.subplot(2, 2, k+1)
18 plt.plot(t, P, color = 'blue', lw=3, label = 'h = ...
'+str(h))
19 plt.xlabel('Time (t)', fontweight='bold')
20 plt.ylabel('Population (P(t))', fontweight = 'bold')
21 plt.legend()
22 plt.grid(True, ls = '--')
23 plt.xticks(np.arange(0, 105, 10.0), fontweight = 'bold')
FIGURE 8.3: Solution of the logistic model using Heun’s method for different
values of step-size.
Figure 8.3 shows the solution of the logistic model with Heun's method for different values of the step-size h.
From Figure 8.3, for some values of the step-size h the numerical solution of the logistic model can have periodic and chaotic behaviors, which do not correspond to any solution of the logistic model.
(I) Because using a discrete derivative whose order differs from the order of the differential equation can lead to numerical instability, the order of the discrete model shall be the same as the order of the differential equation. Under this rule the central finite difference scheme cannot be used as an approximation of the first derivative in the discrete model of a first-order differential equation. Either the forward or backward difference scheme can be used.
(II) Nonstandard denominator functions have to be used for the discrete representation of the continuous derivative. A nonstandard discrete representation of dy/dt at t = t_j is built from a denominator function of the form

    φ(h) = (1 − e^{−h})/h,

hence the resulting discrete model (for the exponential decay example below) is P_{j+1} = (1 − φ(h)) P_j.
1 #expdecnsden.py
2 import numpy as np
3 import matplotlib.pyplot as plt
4 steps = [0.5, 1.05, 2.5, 5.0]
5 plt.figure(1, figsize=(12, 12))
6 T = 105
7 for k in range(4):
8 h = steps[k]
9 phi = (1-np.exp(-h))/h
10 N = round(T/h)
11 t = np.linspace(0, T, N+1)
12 P = np.zeros_like(t)
13 P[0] = 0.5
14 for j in range(N):
15 P[j+1] = (1.0-phi)*P[j]
16 plt.subplot(2, 2, k+1)
17 plt.plot(t, P, color = 'blue', lw=2, label = 'h = ...
'+str(h))
18 plt.xlabel('Time (t)', fontweight='bold')
19 plt.ylabel('Population (P(t))', fontweight = 'bold')
20 plt.legend()
21 plt.grid(True, ls = '--')
22 plt.xticks(np.arange(0, 110, 15), fontweight = 'bold')
23 mnp, mxp = np.floor(10*min(P))/10, np.ceil(10*max(P))/10
24 plt.yticks(np.arange(mnp, mxp+(mxp-mnp)/10, ...
(mxp-mnp)/10), fontweight = 'bold')
The result of executing the code is Figure 8.4, in which the exponential
decay model is solved with a nonstandard denominator function for
different values of the step size h.
(III) Nonlocal approximations must be used for the nonlinear terms. For the logistic model, approximating P^2(t) at t_j by P_j P_{j+1} leads to the discrete scheme P_{j+1} = (1 + h)P_j/(1 + hP_j), implemented in the following Python code.
1 # nonlocapplog.py
2 import numpy as np
3 import matplotlib.pyplot as plt
4 steps = [0.75, 3.75, 7.5, 15.0]
5 plt.figure(1, figsize=(12, 12))
6 T = 75
7 for k in range(4):
8 h = steps[k]
9 N = round(T/h)
10 t = np.linspace(0, T, N+1)
11 P = np.zeros_like(t)
12 P[0] = 0.5
13 for j in range(N):
14 P[j+1] = (1.0+h)*P[j]/(1+h*P[j])
15 plt.subplot(2, 2, k+1)
16 plt.plot(t, P, color = 'blue', lw=3, label = 'h = ...
'+str(h))
17 plt.xlabel('Time (t)', fontweight='bold')
18 plt.ylabel('Population (P(t))', fontweight = 'bold')
19 plt.legend()
20 plt.grid(True, ls = '--')
21 plt.xticks(np.arange(0, 80, 7.5), fontweight = 'bold')
22 mnp, mxp = np.floor(10*min(P))/10, np.ceil(10*max(P))/10
23 plt.yticks(np.arange(mnp, mxp+(mxp-mnp)/10, ...
(mxp-mnp)/10), fontweight = 'bold')
Figure 8.5 is obtained by executing the Python code. It shows the solutions of the logistic model using different values of the step-size. All these solutions correspond to the true solution of the differential equation model. Hence, for all values of the step-size, the discrete model does not suffer from numerical instability.
(IV) Dynamical consistency between the solutions of the discrete model and the differential equation. That means all the properties possessed by the solution of the differential equation must also be possessed by the solution of the discrete model. Particular properties include positivity, monotonicity, boundedness, limit cycles and other periodic solutions, etc.
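As a quick numerical illustration (a sketch, with assumed step sizes), the nonlocal logistic scheme P_{j+1} = (1 + h)P_j/(1 + hP_j) can be checked for positivity, boundedness and monotonicity for arbitrary step sizes:

import numpy as np

for h in [0.5, 5.0, 50.0]:
    P = [0.5]
    for j in range(200):
        P.append((1 + h)*P[-1]/(1 + h*P[-1]))
    P = np.array(P)
    print('h = %5.1f: min = %.6f, max = %.6f, monotone increasing: %s'
          % (h, P.min(), P.max(), bool(np.all(np.diff(P) >= 0))))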
and y_j ≈ y(t_j). Suppose that the solution of the discrete finite difference scheme is

    y_j = G(t_j, h, y_0, λ).                                    (8.4)

According to Mickens [37], Equation (8.3) is an exact finite difference scheme of the differential equation (8.1) if its solution (8.4) is the same as the solution of the associated differential equation (8.2). That means

    det[ y_j, e^{−αt_j} ; y_{j+1}, e^{−αt_{j+1}} ] = e^{−αt_j} det[ y_j, 1 ; y_{j+1}, e^{−αh} ] = 0   ⇒   y_{j+1} = e^{−αh} y_j.

Subtracting y_j from the two sides and multiplying and dividing the right-hand side by α gives the discrete scheme

    y_{j+1} − y_j = −α ( (1 − e^{−αh})/α ) y_j,

from which

    (y_{j+1} − y_j) / ( (1 − e^{−αh})/α ) = −α y_j.
Hence, instead of using the standard denominator function h, using the denominator function (1 − e^{−αh})/α will result in an exact finite difference scheme.
Example 8.1 (First order linear ODE [39]) In this example, an exact finite difference scheme will be derived for the exponential decay model

    dy(t)/dt = −(π/4) y(t),   y(0) = 1/2.
The following Python code solves the exponential decay model using the
exact finite difference scheme with different values of the step size h and shows
the corresponding infinity norm of the difference between the exact solution
and the solution of the discrete model.
1 import numpy as np
2 import matplotlib.pyplot as plt
3 plt.figure(1, figsize=(12, 12))
4 alpha = np.pi/4
5 T = 100
6 steps = np.array([0.1, 0.5, 1.0, 2.0, 4.0, 5.0, 6.25, 10.0, ...
12.5, 20.0, 25.0, 50.0, 100.0])
7 Errors = []
8 for k in range(1, len(steps)):
9 h = steps[k]
10 phi = (1-np.exp(-alpha*h))/alpha
11 N = int(round(T/h))
12 t = np.linspace(0, T, N+1)
13 P, PEx = np.zeros_like(t), np.zeros_like(t)
14 P[0] = 0.5
15 PEx[0] = 0.5
16 for j in range(N):
17 P[j+1] = (1.0-phi*alpha)*P[j]
18 PEx[j+1] = P[0]*np.exp(-alpha*t[j+1])
19 MaxError = np.linalg.norm(PEx-P, np.inf)
20 Errors.append([h, MaxError])
21 print('---------------------')
22 print(' h\t\t Error')
23 print('---------------------')
24 for j in range(len(Errors)):
25 print('{0:3.2f}'.format(Errors[j][0]) + '\t' + ...
'{0:1.8e}'.format(Errors[j][1]))
26 print('---------------------')
5.00 5.20417043e-18
6.25 1.30104261e-17
10.00 4.24465151e-17
12.50 3.25836634e-17
20.00 1.24790452e-18
25.00 2.25704533e-17
50.00 4.40824356e-18
100.00 3.88652225e-35
----------------------------
    d^2u/dt^2(t) + ω^2 u(t) = 0,   u(0) = u_0,   u(2) = u_1.

The characteristic equation corresponding to the given BVP is

    r^2 + ω^2 = 0.

Using the trigonometric identity sin(a)cos(b) − cos(a)sin(b) = sin(a − b), and remembering that t_{j+1} − t_j = t_j − t_{j−1} = h and t_{j+1} − t_{j−1} = 2h, the determinant is expanded into

    sin(ωh)u_{j−1} − sin(2ωh)u_j + sin(ωh)u_{j+1} = 0   ⇒   u_{j−1} − 2cos(ωh)u_j + u_{j+1} = 0.

Subtracting 2u_j from the two sides and using the trigonometric identity cos(ωh) = 1 − 2sin^2(ωh/2) gives

    u_{j−1} − 2u_j + u_{j+1} = −4 sin^2(ωh/2) u_j   ⇒   u_{j−1} − 2u_j + u_{j+1} = −( (2/ω) sin(ωh/2) )^2 ω^2 u_j.
Hence, the exact finite difference scheme is obtained by selecting the denominator function

    φ(ω, h) = (2/ω) sin(ωh/2).

As h → 0, sin(ωh/2) → ωh/2 and hence φ(ω, h) → h.
The exact solution of the problem is

    u_exact(t) = (3/2) cos(ωt) + ( (1/2 − (3/2)cos(2ω)) / sin(2ω) ) sin(ωt).
To solve the given BVP using MATLAB for u_0 = 1.5 and u_1 = 0.5, the time interval [0, 2] is divided into N subintervals, where N is a positive integer. The solution is obtained by solving the linear system whose first and last equations impose the boundary conditions u^0 = u_0 and u^N = u_1, and whose interior equations are

    (1/φ^2) u^{j−1} + (ω^2 − 2/φ^2) u^j + (1/φ^2) u^{j+1} = 0,   j = 1, . . . , N − 1.
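The following Python sketch assembles and solves this tridiagonal system with the nonstandard denominator function; the value ω = 1 is an assumption:

import numpy as np

w, u0, u1 = 1.0, 1.5, 0.5                    # w (omega) is an assumed value
for N in [10, 20, 40, 80]:
    t = np.linspace(0.0, 2.0, N+1)
    h = 2.0/N
    phi = (2.0/w)*np.sin(w*h/2)              # nonstandard denominator function
    M = np.zeros((N+1, N+1)); rhs = np.zeros(N+1)
    M[0, 0] = 1.0; rhs[0] = u0               # boundary condition u(0) = u0
    M[N, N] = 1.0; rhs[N] = u1               # boundary condition u(2) = u1
    for j in range(1, N):
        M[j, j-1] = 1/phi**2
        M[j, j]   = w**2 - 2/phi**2
        M[j, j+1] = 1/phi**2
    u = np.linalg.solve(M, rhs)
    B = (0.5 - 1.5*np.cos(2*w))/np.sin(2*w)
    uex = 1.5*np.cos(w*t) + B*np.sin(w*t)    # exact solution
    print(N, np.linalg.norm(u - uex, np.inf))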
By executing the MATLAB code, the solutions of the BVP for the different
values of N are shown in Figure 8.6 and the infinity norm errors are also shown.
-------------------------------
N Error
-------------------------------
10 3.330669073875e-15
20 6.883382752676e-15
40 4.996003610813e-15
80 1.088018564133e-14
-------------------------------
If the standard denominator function h is used instead of φ(ω, h), then the
table of errors will be:
-------------------------------
N Error
-------------------------------
10 1.308207805131e+00
20 3.921038479674e+00
40 1.694674604840e+00
80 5.894499384984e-01
-------------------------------
FIGURE 8.6: Solution of the harmonic oscillator system using an exact finite
difference scheme for different numbers of subintervals.
where

    z(t) = [x(t); y(t)]

and a, b, c and d are real numbers.
Matrix A is similar to its Jordan form J ∈ R^{2×2}; hence there exists an invertible matrix P ∈ R^{2×2} such that J = P^{−1}AP. The diagonal elements of J are the eigenvalues of A and the columns of P are the eigenvectors or generalized eigenvectors of matrix A. The Jordan form J of A has one of three forms:

    J = [λ_1  0 ]      or      [λ  0]      or      [λ  1]
        [0    λ_2]             [0  λ]              [0  λ]

with λ_1, λ_2 ≠ 0 and λ ≠ 0.
In MATLAB, the Jordan form of a matrix A can be found by using the
MATLAB function jordan, which returns the matrices P and J.
>> A = [2, 0, 0; 1, 1, -3; -1, -1, -1] ;
>> [P, J] = jordan(A) ;
>> disp('P = ') ; disp(P) ;
P =
0 0 1.0000e+00
1.2500e-01 1.5000e+00 -1.2500e-01
1.2500e-01 -5.0000e-01 -1.2500e-01
>> disp('J = ') ; disp(J) ;
J =
-2 0 0
0 2 1
0 0 2
In Python, the Jordan form of a matrix A can be found by using the method jordan_form of a sympy symbolic matrix.
In [1]: import numpy as np
In [2]: import sympy as smp
In [3]: A = np.array([[2, 0, 0], [1, 1, -3], [-1, -1, -1]])
In [4]: P, J = (smp.Matrix(A)).jordan_form()
In [5]: P, J = np.array(P), np.array(J)
In [6]: print('P = \n', P)
P =
[[0 0 -2]
[1 -3 1]
[1 1 0]]
In [7]: print('J = \n', J)
J =
[[-2 0 0]
[0 2 1]
[0 0 2]]
    dz/dt = P J P^{−1} z.

Multiplying the two sides of equation (8.5) by P^{−1} and using the linear transformation

    u(t) = P^{−1} z(t),

Equation (8.5) can be written as

    du/dt = J u.                                                 (8.6)

The solution of the linear system (8.5) is obtained by solving the linear system (8.6) and using the linear transformation

    z(t) = P u(t).

In [48], the exact finite difference schemes are derived for the three kinds of Jordan forms. The author proved that the linear system (8.6) has an exact finite difference scheme of the form

    (u_{j+1} − u_j)/φ = J ( θ u_{j+1} + (1 − θ) u_j ),           (8.7)

where φ and θ are to be determined in terms of the step-size h and the eigenvalues of A. The functions φ and θ are as follows:
I. In the case that matrix A has two distinct eigenvalues λ_1 and λ_2, that is

       J = [λ_1  0; 0  λ_2],

   φ and θ are

       φ = (λ_1 − λ_2)(e^{λ_1 h} − 1)(e^{λ_2 h} − 1) / ( λ_1 λ_2 (e^{λ_1 h} − e^{λ_2 h}) )

   and

       θ = ( λ_2 (e^{λ_1 h} − 1) − λ_1 (e^{λ_2 h} − 1) ) / ( (λ_1 − λ_2)(e^{λ_1 h} − 1)(e^{λ_2 h} − 1) ).

II. In the case that A has a repeated eigenvalue λ and Dim(A − λI) = 2, that is

       J = [λ  0; 0  λ],

   then

       φ = (e^{λh} − 1) / ( λθ(e^{λh} − 1) + λ );

   for example, taking θ = 1/2 gives

       φ = 2(e^{λh} − 1) / ( λ(e^{λh} + 1) ),

   and if

       θ = (1 − e^{λh} + λh e^{λh}) / (e^{λh} − 1)^2,

   then

       φ = (e^{λh} − 1)^2 / (λ^2 h e^{λh}).

III. In the case that A has a repeated eigenvalue λ and Dim(A − λI) = 1, that is

       J = [λ  1; 0  λ],

   then φ and θ have the forms

       θ = (1 − e^{λh} + λh e^{λh}) / (e^{λh} − 1)^2

   and

       φ = (e^{λh} − 1)^2 / (λ^2 h e^{λh}).
Example 8.3 In this example, an exact finite difference scheme will be established to solve the linear system

    ẋ(t) = −2x(t) + y(t),   x(0) = 1,
    ẏ(t) = x(t) − y(t),     y(0) = 0.5.

To solve this linear system of equations, let z(t) = [x(t), y(t)]^T; then the matrix form of the system is

    ż(t) = [−2  1; 1  −1] z(t),   z(0) = [1; 0.5].

The eigenvectors of the coefficient matrix are

    v_{1,2} = [−1/2 ± √5/2; 1].
1 %SolveLinSysExact.m
2 clear ; clc ; clf ;
3 a = 0.0 ; b = 10.0 ;
4 A = [-2, 1; 1, -1] ;
5 N = 40 ;
6 t = linspace(a, b, N+1) ;
7 h = (b-a)/N ;
8 [phi, tht, P, J] = CompPhiThet(A, h) ;
9 z0 = [2; 1] ;
10 z = zeros(2, N+1) ; u = zeros(2, N+1) ;
11 z(:, 1) = z0 ;
12 u(:, 1) = P\z(:, 1) ;
13 B = eye(2)-tht*phi*J ; C = eye(2)+(1-tht)*phi*J ;
14 for j = 1 : N
15 u(:, j+1) = B\(C*u(:, j)) ;
16 z(:, j+1) = P*u(:, j+1) ;
17 end
18
19 uex = zeros(size(u)) ; %Exact transformed solution
20 zex = zeros(size(z)) ; %Exact solution
21 uex(:, 1) = u(:, 1) ; zex(:, 1) = z(:, 1) ;
22 iP = inv(P) ;
23 for j = 1 : N
24 uex(:, j+1) = expm(J*t(j+1))*uex(:, 1) ;
25 zex(:, j+1) = P*uex(:, j+1) ;
26 end
27 Errors = abs(z-zex)' ;
28 fprintf('------------------------------------------------\n') ;
29 fprintf(' t\t |x(t)-xexact(t)|\t |y(t)-yexact(t)|\n') ;
30 fprintf('------------------------------------------------\n') ;
31 for j = 1 : length(t)
32 fprintf(' %2i\t %8.7e\t\t %8.7e\n', t(j), Errors(j, ...
1), Errors(j, 2)) ;
33 end
34 fprintf('------------------------------------------------\n') ;
35
36 plot(t, z(1, :), '-b', t, z(2, :), '--m', 'LineWidth', 2)
37 xlabel('t') ;
38 set(gca, 'XTick', linspace(a, b, 11)) ;
1 import numpy as np
2 from sympy import Matrix
3 from scipy.linalg import expm
4 import matplotlib.pyplot as plt
5
6 def CompPhiThet(A, h):
7     eps = np.spacing(1.0)
8     P, J = Matrix(A).jordan_form()
9     P, J = np.array(P, float), np.array(J, float)
10     L1, L2 = J[0, 0], J[1, 1]
11     if isinstance(L1, float) and isinstance(L2, float):
12         if np.abs(L1) > eps and np.abs(L2) > eps:
13             if np.abs(L1-L2) > eps:
14                 phi = (L1-L2)*(np.exp(L1*h)-1)*(np.exp(L2*h)-1)\
15                     /(L1*L2*(np.exp(L1*h)-np.exp(L2*h)))
16                 tht = (L2*(np.exp(L1*h)-1)-L1*(np.exp(L2*h)-1))\
17                     /((L1-L2)*(np.exp(L1*h)-1)*(np.exp(L2*h)-1))
18             else:
19                 phi = (np.exp(L1*h)-1)**2/(L1**2*h*np.exp(L1*h))
20                 tht = (1-np.exp(L1*h)+L1*h*np.exp(L1*h))\
21                     /(np.exp(L1*h)-1)**2
22         if np.abs(L1) >= eps and np.abs(L2) < eps:
23             phi = h
24             tht = (np.exp(L1*h)-1-L1*h)/(L1*h*(np.exp(L1*h)-1))
25         if np.abs(L1) < eps and np.abs(L2) >= eps:
26             phi = h
27             tht = (np.exp(L2*h)-1-L2*h)/(L2*h*(np.exp(L2*h)-1))
28         if np.abs(L1) < eps and np.abs(L2) < eps:
29             phi = h
30             tht = 0.5
31     return phi, tht, P, J
32
33 a, b = 0.0, 10.0
34 A = np.array([[-3, 2], [1, -1]])
35 N = 10
36 t = np.linspace(a, b, N+1)
37 h = (b-a)/N
38 phi, tht, P, J = CompPhiThet(A, h)
39 z0 = np.array([1, 0.5])
40 z, u = np.zeros((2, N+1), float), np.zeros((2, N+1), float)
41 z[:, 0] = z0
42 u[:, 0] = np.linalg.solve(P, z[:, 0])
43 B = np.eye(2)-tht*phi*J ; C = np.eye(2)+(1-tht)*phi*J
44 for j in range(N):
45 u[:, j+1] = np.linalg.solve(B,C@u[:, j])
46 z[:, j+1] = P@u[:, j+1]
47
48 uex, zex = np.zeros((2, N+1), float), np.zeros((2, N+1), float)
49 zex[:, 0] = z0 # Exact solution of the linear system
50 uex[:, 0] = u[:, 0]
51 for j in range(N):
52 uex[:, j+1] = expm(J*t[j+1])@uex[:, 0]
53 zex[:, j+1] = P@uex[:, j+1]
54 Errors = np.abs(z-zex)
55 print('------------------------------------------------') ;
56 print(' t\t |x(t)-xexact(t)|\t |y(t)-yexact(t)|') ;
57 print('------------------------------------------------') ;
58 for j in range(len(t)):
59 print('{0:2.0f}'.format(t[j]), '\t', ...
'{0:8.7e}'.format(Errors[0, j]), '\t\t' \ ...
'{0:8.7e}'.format(Errors[1, j]))
60 print('------------------------------------------------') ;
61
62 plt.plot(t, z[0, :], '-b', label='x(t)', lw = 3)
63 plt.plot(t, z[1, :], '--m', label='y(t)', lw= 3)
64 plt.xlabel('t', fontweight='bold') ;
65 plt.xticks(np.linspace(a, b, 11), fontweight='bold')
66 plt.yticks(np.linspace(0, 1, 11), fontweight='bold')
67 plt.grid(True, ls=':')
68 plt.legend()
    C = ( 1 − (n − 1) α t_0 u_0^{n−1} ) / ( (n − 1) u_0^{n−1} ).

To derive the exact finite difference scheme of Equation (8.8), the substitutions

    t_0 → t_k,   t → t_{k+1},   u_0 → u_k   and   u → u_{k+1}

are used, with the notice that t_{k+1} − t_k = h. Then,

    u_{k+1} = [ (n − 1) u_k^{n−1} / ( (n − 1)(1 + α(n − 1) h u_k^{n−1}) ) ]^{1/(n−1)}.
Raising the two sides to the power n − 1, dividing the numerator and denominator of the right-hand side by n − 1, and doing a few manipulations gives the form

    u_{k+1}^{n−1} − u_k^{n−1} = −α(n − 1) h u_k^{n−1} u_{k+1}^{n−1},

or

    (u_{k+1} − u_k)( u_{k+1}^{n−2} + u_{k+1}^{n−3} u_k + · · · + u_{k+1} u_k^{n−3} + u_k^{n−2} ) = −α(n − 1) h u_k^{n−1} u_{k+1}^{n−1}.
1 # ExactFDMupown.py
2 import numpy as np
3
4 def ExactSol(t, n, u0):
5 U = (n-1)*u0**(n-1)/((n-1)*(1+(n-1)*t*u0**(n-1)))
6 u = np.array([v**(1/(n-1)) for v in U])
7 return u
8
9 def ExactFDM(t, n, h, u0):
10 u = np.zeros_like(t)
11 u[0] = u0
12 for k in range(len(u)-1):
13 u[k+1] = ...
((n-1)*u[k]**(n-1)/((n-1)*(1+(n-1)*h*u[k]**(n-1)))) ...
**(1/(n-1))
14 return u
15
16 a, b, N = 0.0, 10.0, 5
17 t = np.linspace(a, b, N+1)
18 h, u0 = (b-a)/N, 0.5
19 ns = [2, 4, 8, 10]
20 for n in ns:
21 uex = ExactSol(t, n, u0)
22 ufdm = ExactFDM(t, n, h, u0)
23 Errors = np.zeros((2, N+1), float)
24 Errors[0, :] = t
25 Errors[1, :] = np.abs(uex-ufdm)
26 Errors = Errors.T
27
28 print('Errors corresponding to n = '+str(n))
29 print('-------------------------------')
30 print(' h \t Error')
31 print('-------------------------------')
32 for j in range(len(Errors)):
33 print('{0:3.2f}'.format(Errors[j][0]) + '\t' + ...
'{0:1.8e}'.format(Errors[j][1]))
34 print('-------------------------------\n\n')
Errors corresponding to n = 5
-------------------------------
t Error
-------------------------------
0.00 0.00000000e+00
2.00 0.00000000e+00
4.00 0.00000000e+00
6.00 0.00000000e+00
8.00 0.00000000e+00
10.00 0.00000000e+00
-------------------------------
Errors corresponding to n = 8
-------------------------------
t Error
-------------------------------
0.00 0.00000000e+00
2.00 0.00000000e+00
4.00 5.55111512e-17
6.00 5.55111512e-17
8.00 5.55111512e-17
10.00 1.11022302e-16
-------------------------------
Errors corresponding to n = 10
-------------------------------
t Error
-------------------------------
0.00 0.00000000e+00
2.00 0.00000000e+00
4.00 0.00000000e+00
6.00 5.55111512e-17
8.00 5.55111512e-17
10.00 5.55111512e-17
-------------------------------
The code of the MATLAB script ExactFDMupown.m is:
1 a = 0.0 ; b = 10.0 ;
2 N = 5 ;
3 t = linspace(a, b, N+1) ;
4 h = (b-a)/N;
5 u0 = 0.5 ;
6 ns = [2, 4, 8, 10] ;
7 for n = ns
8 uex = ExactSol(t, n, u0) ;
9 ufdm = ExactFDM(t, n, h, u0) ;
10 Errors = zeros(2, N+1) ;
11 Errors(1, :) = t ;
12 Errors(2, :) = abs(uex-ufdm) ;
13 Errors = Errors' ;
14 fprintf('Errors corresponding to n = %i\n', n) ;
15 fprintf('-------------------------------\n') ;
16 fprintf(' t \t Error\n') ;
17 fprintf('-------------------------------\n') ;
18 for j = 1 : N+1
19 fprintf('%2i\t%8.7e\n', Errors(j, 1), Errors(j, 2)) ;
20 end
21 fprintf('-------------------------------\n\n') ;
22 end
23
24 function u = ExactSol(t, n, u0)
25 U = (n-1)*u0ˆ(n-1)./((n-1)*(1+(n-1)*t*u0ˆ(n-1))) ;
26 u = U.ˆ(1/(n-1)) ;
27 end
28
29 function u = ExactFDM(t, n, h, u0)
30 u = zeros(size(t)) ;
31 u(1) = u0 ;
32 for k = 1 : length(t)-1
33 u(k+1) = ...
((n-1)*u(k)ˆ(n-1)/((n-1)*(1+(n-1)*h*u(k)ˆ(n-1)))) ...
ˆ(1/(n-1)) ;
34 end
35 end
Here the denominator function

    φ(h, α) = (e^{αh} − 1)/α

is used, and the nonlocal approximation is used for the nonlinear term −βu^n(t) as

    −β(n − 1) u_k^{n−1} u_{k+1}^{n−1} / ( u_{k+1}^{n−2} + u_{k+1}^{n−3} u_k + · · · + u_{k+1} u_k^{n−3} + u_k^{n−2} ).

Then, the exact finite difference scheme for the differential equation (8.11) will be

    (u_{k+1} − u_k)/φ = α u_k − β u_k u_{k+1},

from which

    u_{k+1} = (1 + φα) u_k / (1 + φβ u_k).
The Python script ExactFDMLogistic.py solves the logistic model, using
an exact finite difference scheme:
1 import numpy as np
2 import matplotlib.pyplot as plt
3 a, b, N = 0.0, 50.0, 100
4 t = np.linspace(a, b, N+1)
5 h = (b-a)/N
6 u0 = 0.1
7
8 u = np.zeros_like(t)
9 u[0] = u0
10 alpha, beta = 0.25, 0.25
11 phi = (np.exp(alpha*h)-1)/alpha
12 for k in range(N):
13     u[k+1] = (1+phi*alpha)*u[k]/(1+phi*beta*u[k])
14 uex = alpha*u0/((alpha-beta*u0)*np.exp(-alpha*t)+beta*u0)  # exact solution
15 MaxError = np.linalg.norm(u-uex, np.inf)
16 print('Maximum error = ', MaxError)
17 plt.plot(t, u, color='brown', lw=3)
18 plt.xlabel('Time (t)', fontweight = 'bold')
19 plt.ylabel('Population u(t)', fontweight = 'bold')
20 plt.grid(True, ls=':')
21 s = (b-a)/10.
22 plt.xticks(np.arange(a, b+s, s), fontweight = 'bold')
23 plt.yticks(np.arange(0., 1.1, 0.1), fontweight = 'bold')
By executing the code, the maximum error is displayed and the solution of the model is shown in Figure 8.7.
runfile(’D:/PyFiles/ExactFDMLogistic.py’, wdir=’D:/PyFiles’)
Maximum error = 3.3306690738754696e-16
1 %ExactFDMLogistic.m
2 a = 0.0 ; b = 50.0 ; N = 100 ;
3 t = linspace(a, b, N+1) ; h = (b-a)/N ;
4 u0 = 0.1 ;
5 u = zeros(1, N+1) ;
6 alpha = 0.25 ; beta = 0.25 ;
7 phi = (exp(alpha*h)-1)/alpha ;
8 u(1) = u0 ;
9 uex = zeros(1, N+1) ; uex(1) = u0 ;
FIGURE 8.7: Solution of the logistic model, using an exact finite difference scheme.
10 for k = 1 : N
11 u(k+1) = (1+phi*alpha)*u(k)/(1+phi*beta*u(k)) ;
12 uex(k+1) = ...
alpha*u0/((alpha-beta*u0)*exp(-alpha*t(k+1))+beta*u0) ;
13 end
14
15 MaxError = norm(u-uex, inf) ;
16 disp(['Max Error = ' num2str(MaxError)]) ;
17
18 plot(t, u, 'r', 'LineWidth', 3) ;
19 xlabel('Time (t)') ;
20 ylabel('Population u(t)') ;
21 set(gca, 'fontweight', 'bold') ;
22 grid on ;
23 s = (b-a)/10 ;
24 set(gca, 'XTick', a:s:b) ;
25 set(gca, 'YTick', linspace(0, 1.1, 12)) ;
In [37], Mickens set out the rules for selecting the denominator function and
dealing with the nonlinear terms. Given an initial value problem
$$\frac{du}{dt} = f(u), \qquad u(0) = u_0,$$
the denominator function is obtained from the equilibrium points of the differential
equation, found by solving $f(u) = 0$ for $u$. Let $u_1, u_2, \ldots, u_n$ be the equilibrium
points of the differential equation, and let
$$r_k = \left.\frac{df}{du}\right|_{u = u_k}, \qquad k = 1, \ldots, n.$$
With $r = \max_k |r_k|$, the denominator function is taken as
$$\varphi(r, h) = \frac{1 - e^{-rh}}{r}.$$
The linear and nonlinear terms on the right-hand side can be approximated
using nonlocal approximations. For example, a term $u(t)$ in the differential
equation can be approximated at $t_k$ by $2u_k - u_{k+1}$, a term $u^2(t)$ can be
approximated at $t_k$ by $2u_k^2 - u_k u_{k+1}$, etc.
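As a quick computational illustration of these rules, the following short Python sketch (it relies on sympy, which the surrounding examples do not otherwise use, and the variable names are illustrative) recovers the equilibria, the rates $r_k$ and the denominator function for the right-hand side $f(u) = u^2 - u^3$ treated in Example 8.4 below:

import sympy as sp

u, h = sp.symbols('u h')
f = u**2 - u**3                                  # right-hand side of the ODE
equilibria = sp.solve(sp.Eq(f, 0), u)            # [0, 1]
rates = [sp.diff(f, u).subs(u, ue) for ue in equilibria]  # r_k = df/du at each equilibrium
R = max(abs(r) for r in rates)                   # R = max_k |r_k| = 1
phi = (1 - sp.exp(-R*h))/R                       # denominator function, here 1 - exp(-h)
print(equilibria, rates, phi)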
Example 8.4 This example is taken from [37] and constructs a nonstandard
finite difference scheme for the differential equation:
$$\frac{du}{dt} = u^2(t) - u^3(t), \qquad u(0) = u_0$$
The equilibrium points of the model are $u_1 = u_2 = 0$ and $u_3 = 1$. The denominator
function is $\varphi(h) = 1 - e^{-h}$. The term $u^2(t)$ is approximated at $t_k$ by
$2u_k^2 - u_k u_{k+1}$ and the nonlinear term $-u^3(t)$ is approximated by $-u_k^2 u_{k+1}$.
The nonstandard finite difference scheme is:
$$\frac{u_{k+1} - u_k}{\varphi} = 2u_k^2 - u_k u_{k+1} - u_k^2 u_{k+1}$$
which after a few manipulations gives:
$$u_{k+1} = \frac{(1 + 2\varphi u_k)\,u_k}{1 + \varphi(u_k + u_k^2)}$$
1 import numpy as np
2 import matplotlib.pyplot as plt
3 a, b, N = 0.0, 25.0, 100
4 t = np.linspace(a, b, N+1)
5 h = (b-a)/N
6 u0 = 0.1
7
8 u = np.zeros_like(t)
9 u[0] = u0
10 phi = 1-np.exp(-h)
11 for k in range(N):
12 u[k+1] = (1+2*phi*u[k])*u[k]/(1+phi*(u[k]+u[k]**2))
13 plt.plot(t, u, color='orangered', lw=3)
14 plt.xlabel('t', fontweight = 'bold')
15 plt.ylabel('u(t)', fontweight = 'bold')
16 plt.grid(True, ls=':')
17 s = (b-a)/10.
18 plt.xticks(np.arange(a, b+s, s), fontweight = 'bold')
19 plt.yticks(np.arange(0., 1.1, 0.1), fontweight = 'bold')
FIGURE 8.8: Solution of the model $u'(t) = u^2(t) - u^3(t)$ using Mickens' nonstandard finite difference scheme.
1 % parameters and grid (setup lines reconstructed)
2 a = 0.0 ; b = 25.0 ; N = 100 ;
3 t = linspace(a, b, N+1) ; h = (b-a)/N ;
4 u0 = 0.1 ;
5 u = zeros(1, N+1) ; u(1) = u0 ;
6 phi = 1 - exp(-h) ;
7 for k = 1 : N
8 u(k+1) = (1+2*phi*u(k))*u(k)/(1+phi*(u(k)+u(k)ˆ2)) ;
9 end
10
11 plot(t, u, 'r', 'LineWidth', 3) ;
12 xlabel('Time (t)') ;
13 ylabel('Population u(t)') ;
14 set(gca, 'fontweight', 'bold') ;
15 grid on ;
16 s = (b-a)/10 ;
17 set(gca, 'XTick', a:s:b) ;
18 set(gca, 'YTick', linspace(0, 1.1, 12)) ;
Part III
9
Solving Optimization Problems: Linear and Quadratic Programming
Abstract
Linear programming is one of the most important methods of mathematical
programming and one of the most widely applied in practice, because it ensures the
optimal use of limited resources. Examples include choosing the optimal mix of
products produced by a particular plant to achieve the maximum profit with the
available labor and raw materials, or moving products from production areas
to consumption centers so that each consumption center satisfies its demand at
the lowest possible cost.
This chapter is organized as follows. The first section discusses the form
of a linear programming problem. The second and third sections discuss
the solutions of linear programming problems using the built-in functions in
MATLAB® and Python. Section 4 uses the Python pulp package for solving
linear programming problems. The package Pyomo is another Python package
for solving linear programming problems, and is discussed in Section 5. Section
6 uses the gekko Python package for solving linear programming problems.
Section 7 is devoted to solving quadratic programming problems using many
MATLAB and Python packages.
min α1 x1 + α2 x2 + · · · + αn xn (9.1)
inequality constraints
LHS ≤ RHS.
min αT x (9.5)
subject to constraints
Ax = b (9.6)
Cx ≤ d (9.7)
x ∈ X (9.8)
[Figure 9.1: graphs of $y = f(x)$ and $y = -f(x)$.]
where
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{m1} & a_{m2} & \cdots & a_{mn}\end{pmatrix},\quad
x = \begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n\end{pmatrix},\quad
b = \begin{pmatrix} b_1\\ b_2\\ \vdots\\ b_m\end{pmatrix},$$
$$C = \begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1n}\\ c_{21} & c_{22} & \cdots & c_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ c_{l1} & c_{l2} & \cdots & c_{ln}\end{pmatrix},\quad
d = \begin{pmatrix} d_1\\ d_2\\ \vdots\\ d_l\end{pmatrix},\quad
X = \begin{pmatrix} (X_1^{\min}, X_1^{\max})\\ (X_2^{\min}, X_2^{\max})\\ \vdots\\ (X_n^{\min}, X_n^{\max})\end{pmatrix}.$$
The problem of maximizing some objective function f (x) is equivalent to
the problem of minimizing −f (x), as the maximization problem of f (x) and
the minimization problem of −f (x) have the same optimal solution xopt . In
Figure 9.1 the graphs of the functions $f(x) = e^{-(x-0.5)}\sin\left(\frac{\pi x}{2}\right)$ and $-f(x)$ are
plotted. From the graphs of the two functions, it is clear that the maximum
value of f (x) is obtained at the same point that minimizes −f (x).
Then, the problem:
max αT x
subject to constraints
Ax = b
Cx ≤ d
x ∈ X
is equivalent to the problem:
min β T x
subject to constraints
Ax = b
Cx ≤ d
x ∈ X
with β = −α, without further change to the equality or inequality constraints.
In Example 9.1, the objective $3x_1 + x_2 + 2x_3$ is maximized subject to $3x_1 + x_2 \le 40$, $x_1 + 2x_3 \le 60$, $x_2 + 2x_3 \le 60$ and $x_1, x_2, x_3 \ge 0$; in minimization form the objective vector is $\alpha = (-3, -1, -2)^T$. The following MATLAB instructions are used to find the optimal solution:
>> objfun = [-3; -1; -2] ;
>> C = [3, 1, 0; 1, 0, 2; 0, 1, 2] ;
>> d = [40; 60; 60] ;
>> lb = [0; 0; 0] ;
>> [xopt, fval] = linprog(objfun, C, d, [], [], lb, [])
xopt =
10
10
25
fval =
-90
The MATLAB linprog function uses by default the dual-simplex algorithm.
The choice of the algorithm can be changed through the optimoptions struct.
For example, to switch to the interior-point solver and solve Example 9.1,
the following instructions can be used [17].
>> Options = optimoptions('linprog', 'Algorithm', 'interior-point');
>> [xopt, fval] = linprog(objfun, C, d, [], [], lb, [], Options)
xopt =
10.0000
10.0000
25.0000
fval =
-90.0000
>> Options = optimoptions('linprog', 'Algorithm', 'interior-point-legacy', 'Display', 'iter') ;
>> [xopt, fval] = linprog(objfun, C, d, [], [], lb, [], Options)
Optimization terminated.
xopt =
10.0000
10.0000
25.0000
fval =
-90.0000
Python also has a linprog function, located in the scipy.optimize library.
To use the linprog function, it must be imported from the optimize library of
scipy [18]. This can be done as follows:
In [1]: from scipy.optimize import linprog
The form of the Python linprog function is close to that of MATLAB's linprog.
Its syntax is:
OptSol = linprog(objfun, A_ub = C, b_ub = d, A_eq = A, b_eq = b,
bounds = bnds, method='optmethod', options = optoptions)
To solve Example 9.1 with Python, the following Python instructions can
be used:
In [1]: import numpy as np
In [2]: import scipy.optimize as opt
In [3]: objfun = np.array([-3, -1, -2])
In [4]: C = np.array([[3, 1, 0], [1, 0, 2], [0, 1, 2]])
In [5]: d = np.array([40, 60, 60])
In [6]: bnds = [(0., None), (0., None), (0., None)]
In [7]: OptSol = opt.linprog(objfun, A_ub = C, b_ub = d, bounds = bnds, \
options = ({"disp":True}))
Primal Feasibility Dual Feasibility Duality Gap Step Path Parameter Objective
1.0 1.0 1.0 - 1.0 -6.0
0.1157362093423 0.1157362093423 0.1157362093423 0.8915842403063 0.1157362093423 -31.98924052603
0.01711151690354 0.0171115169036 0.0171115169036 0.8667218148033 0.01711151690367 -76.59885574796
0.0001832497415752 0.0001832497416004 0.0001832497416007 0.9929871273485 0.0001832497416554 -89.85504503568
9.462525147506e-09 9.462525023246e-09 9.462524985793e-09 0.9999484444422 9.462524789912e-09 -89.99999255556
Optimization terminated successfully.
Current function value: -89.999993
Iterations: 4
In [8]: print(OptSol)
con: array([], dtype=float64)
fun: -89.99999255555932
message: ’Optimization terminated successfully.’
nit: 4
slack: array([2.26564158e-06, 5.24875590e-06, 7.23457025e-06])
status: 0
success: True
x: array([ 9.99999993, 9.99999794, 24.99999741])
In [10]: print(OptSol)
con: array([], dtype=float64)
fun: -90.0
message: ’Optimization terminated successfully.’
nit: 3
slack: array([0., 0., 0.])
status: 0
success: True
x: array([10., 10., 25.])
In [12]: print(OptSol)
con: array([], dtype=float64)
fun: -90.0
message: ’Optimization terminated successfully.’
nit: 3
slack: array([0., 0., 0.])
status: 0
success: True
x: array([10., 10., 25.])
Also, an initial guess x0, the linear inequality and equality constraints, and
the variable bounds are passed to the fmincon function.
The following MATLAB instructions can be used to solve Example 9.1
using the fmincon function:
>> f = @(x) objfun’*x ;
>> x0 = [0.; 0.; 0.] ;
>> C = [3, 1, 0; 1, 0, 2; 0, 1, 2] ;
>> d = [40; 60; 60] ;
>> lb = [0; 0; 0] ;
>> [xopt, fval] = fmincon(f, [0.;0; 0.], C, d, [], [], lb, [])
xopt =
10.0000
10.0000
25.0000
fval =
-90.0000
Then, some variable (for example, LPP) shall be used to define the problem and
solve it. pulp enables the user to name the problem and define its type
(maximization or minimization) through the function LpProblem. To define a
maximization problem, the parameter LpMaximize shall be passed as the second
argument to LpProblem, and to define a minimization problem, the parameter
LpMinimize shall be passed. For example:
In [15]: LPP = plp.LpProblem("Problem of maximizing profit",
plp.LpMaximize)
The variables of the problem shall be defined by using the pulp function
LpVariable. It receives a string and bounds of the variable. For example if
the problem contains a variable 0 ≤ x ≤ 5:
x = plp.LpVariable("x", 0., 5.)
Those variables can be used with problem instance LPP.
Next, the equality and inequality constraints of the problem are added to
the problem instance. If the problem contains an inequality constraint 2x+y ≤
10 then this constraint is added to the problem using an instruction:
LPP += 2*x + y <= 10
In pulp the user does not have to change the form of the inequality constraints.
If the problem contains a constraint x + y ≥ 2, it can be added to the
problem instance directly, without transforming it to another form:
LPP += x + y >= 2
Finally, after defining the whole problem, the solve() method of the LPP
instance can be used to solve the problem and display the results.
The Python script SolveExLinProg.py is used to solve the problem of
Example 9.1 and show the results:
1 # SolveExLinProg.py
2 import pulp as plp
3 LPP = plp.LpProblem('Problem statement: \n', plp.LpMaximize)
4 x, y, z = plp.LpVariable("x", lowBound=0.), ...
plp.LpVariable("y", lowBound=0.), plp.LpVariable("z", ...
lowBound=0.)
5 LPP += 3*x + y + 2*z
6 LPP += 3*x + y <= 40
7 LPP += x + 2*z <= 60
8 LPP += y + 2*z <= 60
9 LPP += x >= 0.
10 LPP += y >= 0.
11 LPP += z >= 0.
12 print(LPP)
13 status = LPP.solve()
14 print('Status of the optimization process: ' + ...
plp.LpStatus[status])
_C2: x + 2 z <= 60
_C3: y + 2 z <= 60
_C4: x >= 0
_C5: y >= 0
_C6: z >= 0
VARIABLES
x Continuous
y Continuous
z Continuous
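After LPP.solve() returns, the optimal variable values and objective can be read back from the problem instance. A short continuation of the script above (not part of the original listing) is:

print('Objective value =', plp.value(LPP.objective))
for var in LPP.variables():
    print(var.name, '=', var.varValue)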
data are passed to the model during the runtime of the optimization pro-
cess. This section presents the creation of concrete models for solving linear
programming problems.
In anaconda, pyomo can be installed by using the command:
conda install -c conda-forge pyomo
It is important to be sure that the targeted optimization solver is installed
before proceeding to the use of pyomo. For example, if the ’glpk’ solver is
not installed, it can be installed by typing the command:
conda install -c conda-forge glpk
The first step for solving a linear programming problem is to prepare the
pyomo environment and solver, through importing both. This can be done by:
import pyomo.environ as pym
from pyomo.opt import SolverFactory
Then a variable of either type ConcreteModel or AbstractModel is constructed
by using:
Model = pym.ConcreteModel()
or
Model = pym.AbstractModel()
If for example 1 ≤ x ≤ 5 and 0 ≤ y < ∞ are two variables of the optimization
problem, they can be declared by using:
Model.x = pym.Var(bounds=(1., 5.))
Model.y = pym.Var(within=pym.NonNegativeReals)
The objective function is defined by using the pym.Objective method which
receives the mathematical expression of the objective function (for example
10 x + y/100):
Model.objfun = pym.Objective(expr = 10.*Model.x+Model.y/100)
If the problem contains a constraint 10x + y ≤ 100, this constraint can be
added to the model by typing:
Model.con1 = pym.Constraint(expr=10*Model.x + Model.y <= 100)
After completely defining the model, an optimization solver is used to solve
the problem defined by the model.
To solve Example 9.1 using pyomo, a Python script SolveEx0WithPyomo.py
is used. Its code is:
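The full SolveEx0WithPyomo.py listing builds on the ingredients above; a minimal concrete-model sketch for Example 9.1 (assuming the glpk solver is installed, with variable and constraint names that are illustrative rather than taken from that script) is:

import pyomo.environ as pym
from pyomo.opt import SolverFactory

Model = pym.ConcreteModel()
Model.x = pym.Var(within=pym.NonNegativeReals)
Model.y = pym.Var(within=pym.NonNegativeReals)
Model.z = pym.Var(within=pym.NonNegativeReals)
# maximize 3x + y + 2z
Model.objfun = pym.Objective(expr=3*Model.x + Model.y + 2*Model.z,
                             sense=pym.maximize)
Model.con1 = pym.Constraint(expr=3*Model.x + Model.y <= 40)
Model.con2 = pym.Constraint(expr=Model.x + 2*Model.z <= 60)
Model.con3 = pym.Constraint(expr=Model.y + 2*Model.z <= 60)

results = SolverFactory('glpk').solve(Model)
print(pym.value(Model.x), pym.value(Model.y), pym.value(Model.z))
print('Objective value =', pym.value(Model.objfun))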
10 m.options.IMODE = 3
11 xopt=m.solve()
12 print('Solution found at \n')
13 print('{0:10.7f}'.format(x[0]))
14 print('{0:10.7f}'.format(y[0]))
15 print('{0:10.7f}'.format(z[0]))
16 print('Value of the objective function: ', ...
'{0:10.7f}'.format(3*x[0]+y[0]+2*z[0]))
---------------------------------------------------
Solver : IPOPT (v3.12)
Solution time : 9.100000001126318E-003 sec
Objective : -89.9999999911064
Successful solution
---------------------------------------------------
Solution found at
10.0000000
10.0000000
25.0000000
Value of the objective function: 90.0000000
Example 9.2 Consider the quadratic programming problem of minimizing $\frac{1}{2}x^T H x + \alpha^T x$, with the matrix $H$ and vector $\alpha$ as given in the scripts below, subject to:
x1 + x4 ≥ 10
2x1 + x2 + x3 + 2x4 ≥ 24
x2 + x3 ≥ 20
x1 + x2 + x3 + x4 ≤ 30
0 ≤ x1 , x 2 , x 3 , x 4 ≤ 20
1 % SolveQuadProg.m
2 clear ; clc ;
3 H = [2, 0, 4, 0; 0, 8, 0, 4; 4, 0, 8, 0; 0, 4, 0, 2] ;
4 alp = [2; 1; 1; 2] ;
5 C = [-1, -0, -0, -1; -2, -1, -1, -2; 0, -1, -1, 0; 1, 1, 1, 1] ;
6 d = [-10; -24; -20; 30] ;
7 lb = [0; 0; 0; 0] ;
8 ub = [20; 20; 20; 20] ;
9 [xopt, fval] = quadprog(H, alp, C, d, [], [], lb, ub) ;
10 fprintf('Optimal solution found at: \n') ; disp(xopt) ;
11 fprintf('Objective function at optimal point: %10.7f\n', fval) ;
1 clear ; clc ;
2 H = [2, 0, 4, 0; 0, 8, 0, 4; 4, 0, 8, 0; 0, 4, 0, 2] ;
3 alp = [2; 1; 1; 2] ;
4 objfun = @(x) 0.5*x'*H*x + alp'*x ;
5 C = [-1, -0, -0, -1; -2, -1, -1, -2; 0, -1, -1, 0; 1, 1, 1, 1] ;
6 d = [-10; -24; -20; 30] ;
7 lb = [0; 0; 0; 0] ;
8 ub = [20; 20; 20; 20] ;
9 x0 = [1; 1; 1; 1] ;
10 [xopt, fval] = fmincon(objfun, x0, C, d, [], [], lb, ub) ;
11 fprintf('Optimal solution found at: \n') ; disp(xopt) ;
12 fprintf('Objective function at optimal point: %10.7f\n', fval) ;
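The results printed below come from an IPOPT run; a gekko sketch of the same quadratic program (with the objective $\frac{1}{2}x^T H x + \alpha^T x$ written out term by term, and a script layout that is illustrative rather than the book's original listing) might look like:

from gekko import GEKKO

m = GEKKO(remote=False)
x1, x2, x3, x4 = [m.Var(1, lb=0, ub=20) for _ in range(4)]
m.Equation(x1 + x4 >= 10)
m.Equation(2*x1 + x2 + x3 + 2*x4 >= 24)
m.Equation(x2 + x3 >= 20)
m.Equation(x1 + x2 + x3 + x4 <= 30)
# 0.5*x'*H*x + alp'*x expanded for the H and alp of Example 9.2
m.Obj(x1**2 + 4*x2**2 + 4*x3**2 + x4**2 + 4*x1*x3 + 4*x2*x4
      + 2*x1 + x2 + x3 + 2*x4)
m.options.IMODE = 3
m.solve()
print(x1.value[0], x2.value[0], x3.value[0], x4.value[0])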
Results of execution:
EXIT: Optimal Solution Found.
---------------------------------------------------
Solver : IPOPT (v3.12)
Solution time : 1.630000000295695E-002 sec
Objective : 1289.99999909229
Successful solution
---------------------------------------------------
Solution found at
4.9999999
10.0000000
10.0000000
5.0000000
Value of the objective function: 1289.9999991
10
Solving Optimization Problems: Nonlinear Programming
Abstract
Nonlinear programming problems are divided into two kinds of problems:
unconstrained and constrained [19, 20]. An unconstrained optimization
problem is a problem of the form
$$\min_{x\in\mathbb{R}^n} f(x), \qquad (10.1)$$
while a constrained problem additionally imposes equality constraints
$$E(x) = 0, \qquad (10.2)$$
and inequality constraints
$$I(x) \le 0. \qquad (10.3)$$
For the unconstrained problem, the first- and second-order necessary conditions
for a point $x^*$ to be a local minimizer are
$$g(x^*) = 0 \qquad (10.4)$$
and
$$y^T H(x^*)\, y \ge 0, \quad \forall\, y \in \mathbb{R}^n, \qquad (10.5)$$
where $g(x) = \nabla f(x)$ is the gradient vector and $H(x)$ is the Hessian matrix,
defined by
$$g(x) = \nabla f(x) = \begin{pmatrix} \dfrac{\partial f(x)}{\partial x_1}\\ \dfrac{\partial f(x)}{\partial x_2}\\ \vdots\\ \dfrac{\partial f(x)}{\partial x_n}\end{pmatrix},
\qquad H(x) = \nabla^2 f(x) = \left(\dfrac{\partial^2 f(x)}{\partial x_i\,\partial x_j}\right)_{i,j=1}^{n}.$$
(ii) Search direction problem: given a function $f(x)$ and a point $x^{(k)}$ at
iteration $k$, find a unit vector $p^{(k)}$ such that $f(x^{(k)} + \alpha p^{(k)})$ is a decreasing
function for $0 < \alpha < \alpha_{\max}$; that is, $p^{(k)}$ is a descent direction.
The numerical optimization methods differ from each other by the methods
through which the search directions are determined and the gradient vector
and Hessian matrices are approximated and (or) updated.
1 import numpy as np
2 def LineSearch(f, g, x, p):
3 a, b = 0.3, 0.9
4 alpha = 1.0
5 while f(x+alpha*p) > f(x) + a*alpha*np.dot(g(x), p):
6 alpha *= b
7 return alpha
$$x^{(k+1)} = x^{(k)} - \alpha^{(k)}\,\frac{g(x^{(k)})}{\|g(x^{(k)})\|}$$
1 % SolveEx1withSteepDesc.m
2 clear ; clc ; clf ;
3 f = @(x) x(1)ˆ2 -x(1)*x(2)+ 4*x(2)ˆ2 + 1 ;
4 g = @(x) [2*x(1)-x(2); -x(1)+8*x(2)] ;
5 x0 = [1; 1] ; Eps = 1e-8 ;
6 [x, Iterations] = GradDec(f, g, x0, Eps) ;
7 disp('Optimum solution = ') ; disp(x) ;
8 disp(['Iterations = ' num2str(Iterations)]) ;
9
10 function [x, Iterations] = GradDec(f, g, x0, Eps)
11 x = x0 ;
12 Iterations = 0 ;
13 while norm(g(x), 2) >= Eps
14 p = -g(x)/norm(g(x), 2) ;
15 alpha = LineSearch(f, g, x, p) ;
16 x = x + alpha * p ;
17 fprintf('%3i\t\t%14.9f\t\t%12.10e\n', Iterations, f(x), ...
norm(g(x), 2)) ;
18 Iterations = Iterations + 1 ;
19 end
20 end
21
22 function alpha = LineSearch(f, g, x, p)
23 a = 1-2/(1+sqrt(5)) ;
24 b = 2/(1+sqrt(5)) ;
25 alpha = 1.0 ;
26 while f(x+alpha*p) > f(x) + a*alpha*g(x)'*p
27 alpha = b*alpha ;
28 end
29 end
29 1.000000000 3.1130274879e-08
30 1.000000000 4.5572657815e-09
---------------------------------------------------
Optimum solution =
1.0e-09 *
-0.306552140415857
0.513653144168223
Iterations = 31
The corresponding Python script SolveEx1withSteepDesc.py is:
1 # SolveEx1withSteepDesc.py
2 import numpy as np
3 def LineSearch(f, g, x, p):
4 a, b = 1-2/(1+np.sqrt(5)), 2/(1+np.sqrt(5))
5 alpha = 1.0
6 while f(x+alpha*p) > f(x) + a*alpha*np.dot(g(x), p):
7 alpha *= b
8 return alpha
9
10 def GradDec(f, g, x0, Eps):
11 x = x0 ;
12 Iterations = 0 ;
13 print('--------------------------------------------')
14 print('Iteration\t f(x)\t\t ||g(x)||')
15 print('--------------------------------------------')
16 while np.linalg.norm(g(x), 2) >= Eps:
17 p = -g(x)/np.linalg.norm(g(x), 2)
18 alpha = LineSearch(f, g, x, p)
19 x = x + alpha * p
20 print('{0:5.0f}'.format(Iterations), \
21 '{0:12.10f}'.format(f(x)),\
22 '{0:10.8e}'.format(np.linalg.norm(g(x))))
23 Iterations += 1
24 print('--------------------------------------------')
25 return x, Iterations
26
27 f = lambda x: x[0]**2-x[0]*x[1]+4*x[1]**2 + 1
28 g = lambda x: np.array([2*x[0]-x[1], -x[0]+8*x[1]])
29 x0 = np.array([1, 1])
30 Eps = 1e-8
31 x, Iterations = GradDec(f, g, x0, Eps)
32 print('x = ', x)
33 print('Iterations = ', Iterations)
The disadvantage of the steepest descent method is that at the beginning
it converges quickly towards the optimum solution, but as it comes closer to
the solution its convergence rate drops rapidly and its progress towards the
optimum solution becomes very slow [3].
and
$$x^{(k+1)} = x^{(k)} + \alpha^{(k)} p^{(k)} = x^{(k)} - \alpha^{(k)} H^{-1}(x^{(k)})\, g(x^{(k)})$$
Example 10.1 In this example, Newton's method will be used to find the
optimum solution of the unconstrained minimization problem:
$$\min_{x\in\mathbb{R}^2}\ 5x_1^2 + \frac{x_2^2}{2} + 5\log\left(1 + e^{-x_1 - x_2}\right)$$
1 % MiWithPureNewton.m
2 clear ; clc ; clf ;
3 f = @(x) 5 *x(1)ˆ2 + x(2)ˆ2/2 + 5*log(1+exp(-x(1)-x(2))) ;
4 g = @(x) [10*x(1) - 5*exp(-x(1) - x(2))/(exp(-x(1) - x(2)) + 1);
5 x(2) - 5*exp(-x(1) - x(2))/(exp(-x(1) - x(2)) + 1)] ;
6 H = @(x) [5*(5*exp(x(1) + x(2)) + 2*exp(2*x(1) + 2*x(2)) + ...
2)/(2*exp(x(1) + x(2)) + exp(2*x(1) + 2*x(2)) + 1) ...
7 5*exp(x(1) + x(2))/(2*exp(x(1) + x(2)) + exp(2*x(1) + 2*x(2)) ...
+ 1);
8 5*exp(x(1) + x(2))/(2*exp(x(1) + x(2)) + exp(2*x(1) + 2*x(2)) ...
+ 1) ...
9 (7*exp(x(1) + x(2)) + exp(2*x(1) + 2*x(2)) + 1)/(2*exp(x(1) + ...
x(2)) + exp(2*x(1) + 2*x(2)) + 1)] ;
10 x0 = [10; 10] ; Eps = 1e-8 ;
11 [x, Iterations] = PureNewton(f, g, H, x0, Eps) ;
12 disp('Optimum solution = ') ; fprintf('%15.14e\n%15.14e\n', ...
x(1), x(2)) ;
13 disp(['Iterations = ' num2str(Iterations)]) ;
14
15 function [x, Iterations] = PureNewton(f, g, H, x0, Eps)
16 x = x0 ;
17 Iterations = 0 ;
18 fprintf('--------------------------------------------\n') ;
19 fprintf('Iteration\t\t f(x)\t\t\t ||g(x)||\n') ;
20 fprintf('--------------------------------------------\n') ;
21 fprintf('%5i\t\t%14.9f\t\t%12.10e\n', Iterations, f(x0), ...
norm(g(x0), 2)) ;
22 while norm(g(x), 2) >= Eps
23 p = -H(x)\g(x) ;
24 alpha = LineSearch(f, g, x, p) ;
25 x = x + alpha*p ;
26 Iterations = Iterations + 1 ;
27 fprintf('%5i\t\t%14.9f\t\t%12.10e\n', Iterations, f(x), ...
norm(g(x), 2)) ;
28 end
29 fprintf('--------------------------------------------\n') ;
30 end
31
32 function alpha = LineSearch(f, g, x, p)
33 a = 1-2/(1+sqrt(5)) ;
34 b = 2/(1+sqrt(5)) ;
35 alpha = 1.0 ;
36 while f(x+alpha*p) > f(x) + a*alpha*g(x)'*p
37 alpha = b*alpha ;
38 end
39 end
Iterations = 4
The code of the Python script MinEx2WithPureNewton.py is:
1 import numpy as np
2
3 def LineSearch(f, g, x, p):
4 a, b = 1-2/(1+np.sqrt(5)), 2/(1+np.sqrt(5))
5 alpha = 1.0
6 while f(x+alpha*p) > f(x) + a*alpha*np.dot(g(x), p):
7 alpha *= b
8 return alpha
9
10 def PureNewton(f, g, H, x0, Eps):
11 x = x0 ;
12 Iterations = 0 ;
13 print('--------------------------------------------')
14 print('Iteration\t f(x)\t\t ||g(x)||')
15 print('--------------------------------------------')
16 while np.linalg.norm(g(x), 2) >= Eps:
17 p = -np.linalg.solve(H(x), g(x))
18 alpha = LineSearch(f, g, x, p)
19 x = x + alpha * p
20 print('{0:5.0f}'.format(Iterations), '\t\t', ...
'{0:12.10f}'.format(f(x)),\
21 '\t', '{0:10.8e}'.format(np.linalg.norm(g(x))))
22 Iterations += 1
23 print('--------------------------------------------')
24 return x, Iterations
25
26 f = lambda x: (10*x[0]**2 + x[1]**2)/2 + ...
5*np.log(1+np.exp(-x[0]-x[1]))
27 g = lambda x: np.array([10*x[0] - 5*np.exp(-x[0] - ...
x[1])/(np.exp(-x[0] - x[1]) + 1),
28 x[1] - 5*np.exp(-x[0] - x[1])/(np.exp(-x[0] - x[1]) + 1)])
1 import numpy as np
2
3 def LineSearch(f, g, x, p):
4 a, b = 1-2/(1+np.sqrt(5)), 2/(1+np.sqrt(5))
5 alpha = 1.0
6 while f(x+alpha*p) > f(x) + a*alpha*np.dot(g(x), p):
7 alpha *= b
8 return alpha
9
10 def BFGS(f, g, x0, Eps):
11 x = x0 ;
12 Iterations = 0 ;
13 print('--------------------------------------------')
14 print('Iteration\t f(x)\t\t ||g(x)||')
15 print('--------------------------------------------')
16 H = np.eye(len(x0), dtype=float)
17 while np.linalg.norm(g(x), 2) >= Eps:
18 p = -np.linalg.solve(H, g(x))
19 alpha = LineSearch(f, g, x, p)
20 s = alpha * p
21 y = g(x+s) - g(x)
22 x = x + s
23 H = H + np.outer(y, y)/np.inner(y, s) - (H@np.outer(s, s)@H.T)/(s.T@H@s)
24 print('{0:5.0f}'.format(Iterations), '\t ', ...
'{0:12.10f}'.format(f(x)),\
25 '\t', '{0:10.8e}'.format(np.linalg.norm(g(x))))
26 Iterations += 1
27 print('--------------------------------------------')
28 return x, Iterations
29
30 f = lambda x: (10*x[0]**2 + x[1]**2)/2 + ...
5*np.log(1+np.exp(-x[0]-x[1]))
31 g = lambda x: np.array([10*x[0] - 5*np.exp(-x[0] - ...
x[1])/(np.exp(-x[0] - x[1]) + 1),
32 x[1] - 5*np.exp(-x[0] - x[1])/(np.exp(-x[0] - x[1]) + 1)])
33 x0 = np.array([1, 1])
34 Eps = 1e-8
35 x, Iterations = BFGS(f, g, x0, Eps)
36 print('x = ', x)
37 print('Iterations = ', Iterations)
10 1.9697255747 4.32879304e-08
11 1.9697255747 3.93521430e-08
12 1.9697255747 1.62355225e-10
--------------------------------------------
x = [0.11246719 1.12467185]
Iterations = 13
The corresponding MATLAB code is:
1 clear ; clc ;
2 f = @(x) 5 *x(1)ˆ2 + x(2)ˆ2/2 + 5*log(1+exp(-x(1)-x(2))) ;
3 g = @(x) [10*x(1) - 5*exp(-x(1) - x(2))/(exp(-x(1) - x(2)) + 1);
4 x(2) - 5*exp(-x(1) - x(2))/(exp(-x(1) - x(2)) + 1)] ;
5 x0 = [1.0; 1.0] ; Eps = 1e-8 ;
6 [x, Iterations] = BFGS(f, g, x0, Eps) ;
7 disp('Optimum solution = ') ; disp(x) ;
8 disp(['Iterations = ' num2str(Iterations)]) ;
9
10 function [x, Iterations] = BFGS(f, g, x0, Eps)
11 x = x0 ;
12 Iterations = 0 ;
13 fprintf('----------------------------------------\n') ;
14 fprintf('Iteration\t\t f(x)\t\t\t ||g(x)||\n') ;
15 fprintf('----------------------------------------\n') ;
16 fprintf('%5i\t\t%14.9f\t\t%12.10e\n', Iterations, f(x0), ...
norm(g(x0), 2)) ;
17 H = eye(length(x0)) ;
18 while norm(g(x), 2) >= Eps
19 p = -H\g(x) ;
20 alpha = LineSearch(f, g, x, p) ;
21 s = alpha*p ;
22 y = g(x+s)-g(x) ;
23 x = x + alpha*p ;
24 H = H + y*y'/(y'*s)-H*(s*s')*H'/(s'*H*s) ;
25 Iterations = Iterations + 1 ;
26 fprintf('%5i\t\t%14.9f\t\t%12.10e\n', Iterations, ...
f(x), norm(g(x), 2)) ;
27 end
28 fprintf('----------------------------------------\n') ;
29 end
30
31 function alpha = LineSearch(f, g, x, p)
32 a = 1-2/(1+sqrt(5)) ;
33 b = 2/(1+sqrt(5)) ;
34 alpha = 1.0 ;
35 while f(x+alpha*p) > f(x) + a*alpha*g(x)'*p
36 alpha = b*alpha ;
37 end
38 end
1 import numpy as np
2
3 def LineSearch(f, g, x, p):
4 a, b = 1-2/(1+np.sqrt(5)), 2/(1+np.sqrt(5))
5 alpha = 1.0
6 while f(x+alpha*p) > f(x) + a*alpha*np.dot(g(x), p):
7 alpha *= b
8 return alpha
9
10 def DFP(f, g, x0, Eps):
11 x = x0 ;
12 Iterations = 0 ;
13 print('--------------------------------------------')
14 print('Iteration\t f(x)\t\t ||g(x)||')
15 print('--------------------------------------------')
16 H = np.eye(len(x0), dtype=float)
17 while np.linalg.norm(g(x), 2) >= Eps:
18 p = -np.linalg.solve(H, g(x))
19 alpha = LineSearch(f, g, x, p)
20 s = alpha * p
21 y = g(x+s) - g(x)
22 x = x + s
23 H = H + np.outer(s, s)/np.inner(y, s) - (H@np.outer(y, y)@H.T)/(y.T@H@y)
24 print('{0:5.0f}'.format(Iterations), '\t ', ...
'{0:12.10f}'.format(f(x)),\
25 '\t', '{0:10.8e}'.format(np.linalg.norm(g(x))))
26 Iterations += 1
27 print('--------------------------------------------')
28 return x, Iterations
29
30 f = lambda x: ((x[0]-2)**2+(1+x[2])**2)+(1+x[1]**2)
31 g = lambda x: np.array([2*x[0] - 4, 2*x[1], 2*x[2] + 2])
32 x0 = np.array([1, 1, 1])
33 Eps = 1e-8
34 x, Iterations = DFP(f, g, x0, Eps)
35 print('x = ', x)
36 print('Iterations = ', Iterations)
--------------------------------------------
Iteration f(x) ||g(x)||
--------------------------------------------
0 1.3343685400 1.15649218e+00
1 1.0010384216 6.44491002e-02
2 1.0000032249 3.59162526e-03
3 1.0000000100 2.00154416e-04
4 1.0000000000 1.11542233e-05
5 1.0000000000 6.21603559e-07
6 1.0000000000 3.46407792e-08
7 1.0000000000 1.93046439e-09
--------------------------------------------
x = [ 2.00000000e+00 3.94054416e-10 -9.99999999e-01]
Iterations = 8
The MATLAB code is:
1 clear ; clc ;
2 f = @(x) (x(1)-2)ˆ2 + (1+x(3))ˆ2 + x(2)ˆ2+1 ;
3 g = @(x) [2*(x(1)-2); 2*x(2); 2*(x(3)+1)] ;
4 x0 = [1.0; 1.0; 1] ; Eps = 1e-8 ;
5 [x, Iterations] = DFP(f, g, x0, Eps) ;
6 disp('Optimum solution = ') ; format long e ; disp(x) ;
7 disp(['Iterations = ' num2str(Iterations)]) ;
8
9 function [x, Iterations] = DFP(f, g, x0, Eps)
10 x = x0 ;
11 Iterations = 0 ;
12 fprintf('------------------------------------------------\n') ;
13 fprintf('Iteration\t\t f(x)\t\t\t ||g(x)||\n') ;
14 fprintf('------------------------------------------------\n') ;
15 fprintf('%5i\t\t%14.9f\t\t%12.10e\n', Iterations, f(x0), ...
norm(g(x0), 2)) ;
16 H = eye(length(x0)) ;
17 while norm(g(x), 2) >= Eps
18 p = -H\g(x) ;
19 alpha = LineSearch(f, g, x, p) ;
20 s = alpha*p ;
21 y = g(x+s)-g(x) ;
22 x = x + alpha*p ;
23 H = H + s*s'/(y'*s)-H*(y*y')*H'/(y'*H*y) ;
24 Iterations = Iterations + 1 ;
Example 10.3 In this example the MATLAB function fminunc will be used
to solve the unconstrained minimization problem:
$$\min_{x\in\mathbb{R}^2}\ \frac{10x_1^2 + x_2^2}{2} + 5\log\left(1 + e^{-x_1 - x_2}\right)$$
5 18 1.96973 1 4.72e-05
6 21 1.96973 1 1.67e-06
7 24 1.96973 1 2.98e-08
x =
0.1125
1.1247
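A Python counterpart of this computation, using scipy.optimize.minimize with the BFGS method (an illustrative sketch, not one of the book's listings), is:

import numpy as np
from scipy.optimize import minimize

f = lambda x: (10*x[0]**2 + x[1]**2)/2 + 5*np.log(1 + np.exp(-x[0] - x[1]))
res = minimize(f, x0=np.array([1.0, 1.0]), method='BFGS')
print(res.x)    # approximately [0.1125, 1.1247], matching the fminunc result above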
Example 10.4 In this example Gekko will be used to solve the unconstrained
minimization problem:
10 print('{0:12.10e}'.format(x1[0]))
11 print('{0:12.10e}'.format(x2[0]))
---------------------------------------------------
Solver : IPOPT (v3.12)
Solution time : 4.699999990407377E-003 sec
Objective : -16.3750000000000
Successful solution
---------------------------------------------------
Solution found at x =
2.2500000000e+00
-4.7500000000e+00
---------------------------------------------------
Solver : IPOPT (v3.12)
Solution time : 1.490000000922009E-002 sec
Objective : 7.912533725331136E-015
Successful solution
---------------------------------------------------
Solution found at x =
9.9999991265e-01
9.9999982362e-01
E(x) = 0 (10.9)
where E ∈ Rp , p ≤ n.
To derive the necessary and sufficient optimality conditions, construct the
Lagrangian
$$L(x, \lambda) = f(x) + E^T(x)\,\lambda, \qquad (10.10)$$
where $\lambda \in \mathbb{R}^p$ is the vector of Lagrange multipliers. Then, the first-order
necessary conditions for $(x^*, \lambda^*)$ to be optimal are:
$$\nabla_x L(x^*, \lambda^*) = \nabla f(x^*) + \nabla E^T(x^*)\,\lambda^* = 0, \qquad \nabla_\lambda L(x^*, \lambda^*) = E(x^*) = 0. \qquad (10.11)$$
The second-order necessary condition (10.12) requires the Hessian of the Lagrangian at $(x^*, \lambda^*)$, restricted to the null space of the constraint Jacobian $\nabla E(x^*)$, to be positive semi-definite, and the second-order sufficient condition (10.13) requires it to be positive definite.
If ∇x E T (x) ∈ Rn×p is a full-rank matrix, it has a QR factorization of the
form:
$$\nabla E^T(x) = \begin{pmatrix} \tilde{Q} & Z \end{pmatrix}\begin{pmatrix} R \\ 0 \end{pmatrix}$$
where Q̃ ∈ Rn×p , Z ∈ Rn×(n−p) , R ∈ Rp×p and 0 is a matrix of zeros of type
(n−p)×p. The matrix Q̃ is a basis for the column space of ∇x E T (x), whereas
Z is a basis for its null space.
At the optimum solution $x^*$, the gradient of the objective function is
orthogonal to the constraint surface; that is, the projection of the gradient
vector onto the constraint surface is zero. This can be expressed as
$Z^{*T}\nabla f(x^*) = 0$, which is equivalent to the first-order necessary conditions
(10.11). The second-order necessary condition for optimality of $(x^*, \lambda^*)$ that
is equivalent to (10.12) is that $Z^{*T} H^* Z^*$ be positive semi-definite, and the
second-order sufficient condition of optimality of $(x^*, \lambda^*)$ that is equivalent to
(10.13) is that $Z^{*T} H^* Z^*$ be positive definite [20].
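The null-space basis $Z$ can be computed in practice from a full QR factorization; the following small numpy sketch (the Jacobian and gradient values are illustrative, not from the text) checks the projected-gradient condition numerically:

import numpy as np

# illustrative Jacobian  grad E^T(x)  of a single (p = 1) constraint in R^3
JE_T = np.array([[1.0], [2.0], [0.5]])
Qfull, R = np.linalg.qr(JE_T, mode='complete')
p = JE_T.shape[1]
Q_tilde, Z = Qfull[:, :p], Qfull[:, p:]     # column-space and null-space bases

grad_f = np.array([0.3, 0.6, 0.15])         # gradient parallel to the constraint normal
print(Z.T @ grad_f)                         # approximately zero: projected gradient vanishes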
Newton-type optimization methods iterate to find a couple $(x^*, \lambda^*) \in \mathbb{R}^n \times \mathbb{R}^p$
such that the necessary conditions (10.11) and (10.12) are fulfilled. At iteration $k$,
the Karush-Kuhn-Tucker (KKT) system is composed from a previously computed couple
$(x^{(k)}, \lambda^{(k)})$, and $(x^{(k+1)}, \lambda^{(k+1)})$ is found by solving the KKT system:
$$\begin{pmatrix} H(x^{(k)}) & \nabla E^T(x^{(k)}) \\ \nabla E(x^{(k)}) & 0 \end{pmatrix}
\begin{pmatrix} s^{(k)} \\ \lambda^{(k+1)} \end{pmatrix} =
\begin{pmatrix} -\nabla f(x^{(k)}) \\ -E(x^{(k)}) \end{pmatrix} \qquad (10.14)$$
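As a concrete illustration of one step of (10.14), the following numpy sketch assembles and solves the KKT system for a small equality-constrained quadratic problem (the matrices are illustrative and not taken from the text):

import numpy as np

# illustrative problem: f(x) = 0.5 x'Qx + c'x  with one constraint  E(x) = Ax - b = 0
Q = np.array([[4.0, 1.0], [1.0, 2.0]])
c = np.array([-1.0, -1.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

x = np.zeros(2)                  # current iterate x^(k)
grad = Q @ x + c                 # gradient of f at x^(k)
H = Q                            # Hessian of f (constant here)
E = A @ x - b

KKT = np.block([[H, A.T], [A, np.zeros((1, 1))]])
rhs = np.concatenate([-grad, -E])
sol = np.linalg.solve(KKT, rhs)
s, lam = sol[:2], sol[2:]        # step s^(k) and multiplier lambda^(k+1)
print(x + s, lam)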
min f (x), f : Rn → R, x ∈ Rn
E(x) = 0, E : Rn → Rp
I(x) ≤ 0, I : Rn → Rq .
the active set at $\tilde{x}$. At the optimal solution $x^*$ all the equality constraints shall satisfy
$$E^j(x^*) = 0, \quad j = 1, \ldots, p.$$
The inequality constraints are divided between active and inactive. Let $S^*$ be the
index set of the active inequality constraints, $S^* = \{j : I^j(x^*) = 0,\ j \in \{1, \ldots, q\}\}$.
Therefore, the active set $A^*$ at the optimal solution $x^*$ consists of
$E(x^*) \cup \{I^j(x^*) : j \in S^*\}$.
The Lagrangian of the constrained minimization problem (10.1)-(10.3) is defined by
$$L(x, \lambda, \mu) = f(x) + \lambda^T E(x) + \mu^T I(x),$$
where $\lambda \in \mathbb{R}^p$ and $\mu \in \mathbb{R}^q$. The first-order necessary conditions of
optimality for a point $(x^*, \lambda^*, \mu^*)$ are the Karush-Kuhn-Tucker conditions:
stationarity of the Lagrangian in $x$, feasibility of $x^*$, non-negativity of $\mu^*$ and
complementary slackness $\mu^{*T} I(x^*) = 0$. The second-order necessary condition
requires the Hessian of the Lagrangian, restricted to the null space of the Jacobian
of the active constraints, to be positive semi-definite, and the second-order
sufficient optimality condition requires it to be positive definite.
1 % MinEx1Withfmincon.m
2 clear ; clc ;
3 x0 = [1; 5; 5; 1]; % Make a starting guess at the solution
4 Algorithms = ["sqp", "active-set", "interior-point", ...
"sqp-legacy"] ;
5 options = optimoptions(@fmincon,'Algorithm', 'sqp', 'Disp', ...
'Iter', 'PlotFcns','optimplotfval') ;
6 [x,fval] = fmincon(@objfun,x0,[],[],[],[],[],[],@confun,options);
7 fprintf('Optimal solution found at \nx ...
=\n%12.6e\n%12.6e\n%12.6e\n%12.6e\n\n', x(1), x(2), x(3), ...
x(4)) ;
8 fprintf('Value of objective function = %10.7f\n\n', fval) ;
9
10 function f = objfun(x)
11 f = x(1)*x(4)*(x(1)+x(2)+x(3))+x(3) ;
12 end
13
14 function [ic, ec] = confun(x)
15 % Nonlinear inequality constraints
16 ic(1) = -x(1)*x(2)*x(3)*x(4)+25.0 ;
17 ic(2) = -x(1) + 1 ;
18 ic(3) = x(1) - 5 ;
19 ic(4) = -x(2) + 1 ;
20 ic(5) = x(2) - 5 ;
21 ic(6) = -x(3) + 1;
22 ic(7) = x(3) - 5;
23 ic(8) = -x(4) + 1;
24 ic(9) = x(4) - 5 ;
25 % Nonlinear equality constraints
26 ec = -sum(x.*x) + 40 ;
27 end
Example 10.7 In this example, fmincon will use the algorithms sqp,
interior-point, active-set and sqp-legacy to solve the constrained minimization
problem:
$$\min_{x\in\mathbb{R}^3}\ x_2\left(1 + 2x_1^2 - x_3^3\right)^2 + \frac{4}{x_1 x_2}$$
subject to:
$$-6 \le x_1 + x_2^2 + x_3 \le 6,$$
and
$$-1 \le x_2 \le 1.$$
The MATLAB script MinEx2Withfmincon.m solves the above minimization
problem. Its code is:
1 % MinEx2Withfmincon.m
2 clear ; clc ;
3 x0 = [-1;-1; -1]; % Make a starting guess at the solution
4 Algorithms = ["sqp", "active-set", "interior-point", ...
"sqp-legacy"] ;
5
6 for n = 1 : length(Algorithms)
7 options = optimoptions(@fmincon,'Algorithm',Algorithms(n), ...
'Display', 'off');
x =
2.575362e+00
1.000000e+00
2.424638e+00
Value of objective function = 1.5532974
1 # MinEx1Withminimize
2 import numpy as np
3 from scipy.optimize import minimize
4
5 def objfun(x):
6 return x[0]*x[3]*(x[0]+x[1]+x[2])+x[2]
7
8 def ic(x):
9 return x[0]*x[1]*x[2]*x[3]-25.0
10
11 def ec(x):
12 return 40.0 - sum(x*x)
13
14 x0 = np.array([1., 5., 5., 1.])
15
16 cons = [{'type': 'ineq', 'fun': ic}, {'type': 'eq', 'fun': ec}]
17 lubs = [(1.0, 5.0), (1.0, 5.0), (1.0, 5.0), (1.0, 5.0)]
18 Sol = minimize(objfun,x0, method='SLSQP', bounds=lubs, ...
constraints=cons)
19 x = Sol.x
20 fval = Sol.fun
21 print('Solution found at x = ')
22 print('{0:12.10e}'.format(x[0]))
23 print('{0:12.10e}'.format(x[1]))
24 print('{0:12.10e}'.format(x[2]))
25 print('{0:12.10e}'.format(x[3]))
26 print('Value of the Objective function = ' + ...
str(('{0:10.7f}'.format(fval))))
1 import numpy as np
2 from scipy.optimize import minimize
3
4 def objfun(x):
5 return 4*x[1]*(1+2*x[0]**2 - x[2]**3)**2 + x[0]/x[1]
6
7 def ic(x):
8 icon = np.zeros(4)
9 icon[0] = -x[0] - x[1]**2 - x[2] + 6
10 icon[1] = x[0] + x[1]**2 + x[2] - 6
11 icon[2] = -x[1] + 1
12 icon[3] = x[1] - 1
13 return icon
14
15 x0 = np.array([1., 1., 1.])
16
17 cons = [{'type': 'ineq', 'fun': ic}]
18 bnds = [(-np.inf, np.inf), (-1.0, 1.0), (-np.inf, np.inf)]
19 OptSol = minimize(objfun,x0, method='SLSQP', bounds=bnds, ...
constraints=cons)
20 print(OptSol)
21 x = OptSol.x
22 print('Solution found at x = ')
23 print('{0:12.10e}'.format(x[0]))
24 print('{0:12.10e}'.format(x[1]))
25 print('{0:12.10e}'.format(x[2]))
26 print('Value of the Objective function = ' + ...
str(('{0:10.7f}'.format(OptSol.fun))))
nfev: 46
nit: 8
njev: 8
status: 0
success: True
x: array([2.57481362, 1. , 2.42518638])
Solution found at x =
2.5748136179e+00
1.0000000000e+00
2.4251863821e+00
Value of the Objective function = 2.5748937
1 # MinEx1WithGekko.py
2 from gekko import GEKKO
3 m = GEKKO ( )
4 x1 = m.Var(1, lb=1, ub=5)
5 x2 = m.Var(5, lb=1, ub=5)
6 x3 = m.Var(5, lb=1, ub=5)
7 x4 = m.Var(1, lb=1, ub=5)
8 m.Equation(x1 * x2 * x3 * x4 >= 25)
9 m.Equation (x1**2+x2**2+x3**2+x4**2 == 40)
10 m.Obj(x1 * x4 * (x1 + x2 + x3) + x3)
11 m.options.IMODE = 3
12 m.solve()
13 print('Solution found at x = ')
14 print('{0:12.10e}'.format(x1[0]))
15 print('{0:12.10e}'.format(x2[0]))
16 print('{0:12.10e}'.format(x3[0]))
17 print('{0:12.10e}'.format(x4[0]))
---------------------------------------------------
Solver : IPOPT (v3.12)
Solution time : 1.000000000931323E-002 sec
Objective : 17.0140171270735
Successful solution
---------------------------------------------------
Solution found at x =
1.0000000570e+00
4.7429996300e+00
3.8211500283e+00
1.3794081795e+00
The Python script MinEx2WithGekko.py is used to solve Example 10.7.
1 # MinEx2WithGekko.py
2 from gekko import GEKKO
3 m = GEKKO ( )
4 x1 = m.Var(1, lb=0, ub=6.)
5 x2 = m.Var(1, lb=-1., ub=6.)
6 x3 = m.Var(1, lb=0, ub=6.)
7 m.Equation(x1 + x2**2 + x3 >= -6)
8 m.Equation(x1 + x2**2 + x3 <= 6)
9
10 m.Obj(x2*(1+2*x1**2 - x3**3)**2 + 4./(x1*x2))
11 m.options.IMODE = 3
12 m.solve()
13 print('Solution found at x = ')
14 print('{0:12.10e}'.format(x1[0]))
15 print('{0:12.10e}'.format(x2[0]))
16 print('{0:12.10e}'.format(x3[0]))
17 fval = x2[0]*(1+2*x1[0]**2 - x3[0]**3)**2 + 4./(x1[0]*x2[0])
18 print('Value of the Objective function = ' + ...
str(('{0:10.7f}'.format(fval))))
---------------------------------------------------
Solver : IPOPT (v3.12)
Solution time : 1.930000001448207E-002 sec
Objective : 1.43580274955049
Successful solution
---------------------------------------------------
Solution found at x =
2.1288804479e+00
1.3087777155e+00
2.1582204319e+00
Value of the Objective function = 1.4358027
11
Solving Optimal Control Problems
Abstract
Optimal Control Problems (OCPs) represent a model for a wide variety of real-
life phenomena. Some of these applications include the control of infectious
diseases, the continuous stirred tank reactor (CSTR), biological populations,
population harvesting, etc.
This chapter presents the basics of optimal control. It then discusses the
uses of indirect and direct transcription methods for solving them numerically
using MATLAB® and Python. It also discusses the use of the gekko Python package
for solving optimal control problems.
The chapter consists of six sections, organized as follows. The statement of
the problem is in the first section. The second and third sections discuss the
necessary conditions for the optimal control. Ideas about some of the numerical
methods for solving the optimal control problems are presented in Section 4.
Section 5 discusses the numerical solution of optimal control problems based
on the indirect transcription methods. The numerical methods for solving opti-
mal control problems based on the direct transcription methods are presented
in Section 6.
11.1 Introduction
This chapter considers an optimal control problem (OCP) of the form:
$$\min_{u\in U}\ \varphi\big(x(t_f)\big) + \int_{t_0}^{t_f} L_0\big(t, x(t), u(t)\big)\,dt \qquad (11.1)$$
The first-order necessary conditions for the optimality are found by apply-
ing the variational principle to the augmented cost function, also referred to
as Lagrangian [24, 11]. The augmented cost function is given by
$$J(u) = \varphi(t_f, x(t_f)) + \nu^T \Psi(t_f, x(t_f)) + \int_{t_0}^{t_f}\Big[L_0[t] + p(t)^T\big(f[t] - \dot{x}(t)\big)\Big]\,dt \qquad (11.7)$$
where $p(t) \in \mathbb{R}^n$ are the adjoint (co-state) variables, $\nu \in \mathbb{R}^l$ are
Lagrange multipliers, and the final time $t_f$ may be fixed or free.
The minimum of $J(u)$ is determined by computing the variation $\delta J$ with
respect to all the free variables and then equating it to zero [15], i.e.,
$\delta J = 0$.
$$\delta J = \frac{\partial \varphi(t_f, x(t_f))}{\partial t_f}\,\delta t_f + \frac{\partial \varphi(t_f, x(t_f))}{\partial x(t_f)}\,\delta x_f
+ \nu^T\left(\frac{\partial \Psi}{\partial t_f}\,\delta t_f + \frac{\partial \Psi}{\partial x(t_f)}\,\delta x_f\right)
+ (\delta \nu^T)\,\Psi + \big(H[t_f] - p^T(t_f)\dot{x}(t_f)\big)\,\delta t_f
+ \int_{t_0}^{t_f}\left(\frac{\partial H}{\partial x}\,\delta x + \frac{\partial H}{\partial u}\,\delta u + \frac{\partial H}{\partial p}\,\delta p\right) dt \qquad (11.8)$$
(iv) equating the coefficient of $\delta x(t_f)$ to zero gives the transversality conditions
$$p(t_f) = \frac{\partial \varphi(t_f, x(t_f))}{\partial x(t_f)} + \nu^T\,\frac{\partial \Psi(t_f, x(t_f))}{\partial x(t_f)} \qquad (11.14)$$
(vi) equating the coefficient of $\delta t_f$ to zero gives the condition on the Hamiltonian
at the terminal time,
Equations (11.11, 11.12, 11.13) define the Euler-Lagrange equations [54, 15]
for the optimal control problems.
$$p(T) = \frac{\partial x(T)}{\partial x(T)} = 1$$
If u∗ (t) and x∗ (t) are the optimal control and the resulting optimal trajectory
and p∗ (t) is the co-state variable corresponding to the optimal trajectory,
then the tuple (u∗ (t), x∗ (t), p∗ (t)) can be found by solving the Euler-Lagrange
equations as follows.
By solving the second equation in p(t) together with the transversality
condition, we get
$$p^*(t) = e^{a(T-t)}$$
By solving the first equation in u(t), we get
3.
$$x(s_{i+1}) = x(s_i) + h\,\frac{\partial H}{\partial p}[s_i] = x(s_i) + h\,f[s_i], \qquad s_0 \le s_i \le s_{KN} \qquad (11.20)$$
4.
$$p(s_{KN}) = \frac{\partial \varphi(s_{KN}, x(s_{KN}))}{\partial x(s_{KN})} \qquad (11.21)$$
An Illustrative Example
Example 11.2
$$\min_u\ J(u) = I(T) + \int_0^T \big(b\,u^2(t) + I(t)\big)\,dt$$
subject to:
$$\dot{I}(t) = \beta\big(1 - I(t)\big)I(t) - \gamma I(t) - \alpha u(t) I(t), \qquad I(0) = I_0, \qquad t \in [0, T]$$
1 import numpy as np
2 import matplotlib.pylab as plt
3
4 def UpdateControl(x, p):
5 return np.array([min(u, umax) for u in [max(u, umin) for u ...
in a*p*x/(2.0*b**2)]])
6 def SolveStateEquation(u):
7 from scipy import interpolate
8 f = lambda z, v: (bt-bt*z-gm-a*v)*z
9 th = np.arange(h/2.0, T, h)
10 uh = interpolate.pchip(t, u)(th)
11 x[0] = x0
12 for j in range(len(t)-1):
13 k1 = f(x[j], u[j])
14 k2 = f(x[j]+h*k1/2, uh[j])
15 k3 = f(x[j]+h*k2/2, uh[j])
16 k4 = f(x[j]+h*k3, u[j+1])
17 x[j+1] = x[j] + h/6*(k1+2*k2+2*k3+k4)
18 return x
19
20 def SolveCostateEquation(x, u):
21 p[N] = pT
22 th = np.arange(h/2.0, T, h)
23 uh = interpolate.pchip(t, u)(th)
24 xh = interpolate.pchip(t, x)(th)
25 g = lambda z, lm, v: -1.0+(2*bt*z+gm+a*v-bt)*lm
26 for j in [N-int(i)-1 for i in list(range(len(t)-1))]:
27 k1 = g(x[j+1], p[j+1], u[j+1])
28 k2 = g(xh[j], p[j+1]-h*k1/2, uh[j])
29 k3 = g(xh[j], p[j+1]-h*k2/2, uh[j])
30 k4 = g(x[j], p[j+1]-h*k3, u[j])
31 p[j] = p[j+1] - h/6.0*(k1+2*k2+2*k3+k4)
32 return p
33
34 b, a, gm, bt = 1.0, 0.5, 0.2, 1.0
35 T = 20.0
36 x0, pT = 0.01, 1.0
37 umin, umax = 0.0, 1.0
38 N = 10000
39 t = np.linspace(0.0, T, N+1)
40 h = T/N
41 x, p, u = np.ones_like(t), np.ones_like(t), 0.5*np.ones_like(t)
42
43 x = SolveStateEquation(u)
44 from scipy import interpolate
45 p = SolveCostateEquation(x, u)
46 uold = u
47 u = UpdateControl(x, p)
48 Error = np.linalg.norm(u-uold, np.inf)
49 Iterations = 1
50 while Error >= 1e-15:
51 uold = u
52 x = SolveStateEquation(u)
53 p = SolveCostateEquation(x, u)
54 u = UpdateControl(x, p)
55 Error = np.linalg.norm(u-uold, np.inf)
56 print(Iterations, Error)
57 Iterations += 1
58 print(Iterations, '\t', Error)
59 u = UpdateControl(x, p)
60
61 plt.figure(1)
62 plt.subplot(1, 2, 1)
63 plt.plot(t, u, color='crimson', lw=2)
64 plt.xlabel('Time (t)', fontweight='bold')
65 plt.ylabel('Optimal Treatment (u(t))', fontweight='bold')
66 plt.grid(True, ls = '--')
67 plt.xticks(np.arange(0, T*1.1, T/10), fontweight='bold')
68 mnu, mxu = np.floor(1000*min(u))/1000, np.ceil(1010*max(u))/1000
69 hu = (mxu-mnu)/10
70 plt.yticks(np.arange(mnu, mxu+hu, hu), fontweight='bold')
71 plt.axis([0, T, mnu, mxu])
72 plt.subplot(1, 2, 2)
73 plt.plot(t, x, color = 'purple', lw = 2)
74 plt.xlabel('Time (t)', fontweight='bold')
75 plt.ylabel('Infective Population (I(t))', fontweight='bold')
76 plt.grid(True, ls = '--')
77 plt.xticks(np.arange(0, T*1.1, T/10), fontweight='bold')
78 mnx, mxx = np.floor(10*min(x))/10, np.ceil(10*max(x))/10
79 hx = (mxx-mnx)/10
80 plt.yticks(np.arange(mnx, mxx+hx, hx), fontweight='bold')
81 plt.grid(True, ls = '--')
82 plt.axis([0, T, mnx, mxx])
1 0.03923191725605521
2 0.0033203169324064197
3 0.0002443004136510607
4 1.7888301100443815e-05
5 1.3080226375083992e-06
6 9.563564312697892e-08
7 6.9922647949471894e-09
8 5.112286149966394e-10
9 3.737765652545022e-11
10 2.7327029528123603e-12
11 1.9961809982760315e-13
12 1.432187701766452e-14
13 1.1102230246251565e-15
14 1.1102230246251565e-16
FIGURE 11.1: The optimal control and optimal trajectory of Example 11.1.
The optimal control and state variables are computed iteratively, until at some
iteration k, the condition |u(k) − u(k−1) | < ε is fulfilled, where ε is a small
arbitrary constant (selected to be 10−15 in the example). In this example, it
is noticed that fourteen iterations were needed to compute the optimal control
and optimal trajectory.
subject to:
1 import numpy as np
2 import matplotlib.pylab as plt
3
4 def UpdateControl(p):
5 return np.array([max(u, 0) for u in -p[:, 1]/(2.0*r)])
6
7 def SolveStateEquation(u):
8 from scipy import interpolate
9 f = lambda z, v: np.array([-m1*z[0]-m2*z[1], -m3*z[1]+v])
10 th = np.arange(h/2.0, T, h)
11 uh = interpolate.pchip(t, u)(th)
12 x = np.zeros((len(t), 2), 'float')
13 x10, x20 = 300.0, 0.0
14 x[0, :] = np.array([x10, x20])
15 for j in range(len(t)-1):
16 k1 = f(x[j], u[j])
17 k2 = f(x[j]+h*k1/2, uh[j])
18 k3 = f(x[j]+h*k2/2, uh[j])
19 k4 = f(x[j]+h*k3, u[j+1])
20 x[j+1] = x[j] + h/6*(k1+2*k2+2*k3+k4)
21 return x
22
23 def SolveCostateEquation(x):
24 p = np.zeros((len(t), 2), 'float')
25 th = np.arange(h/2.0, T, h)
26 xh = interpolate.pchip(t, x)(th)
27 g = lambda z, lm: np.array([m1*lm[0]-2*z[0]+2*xd, ...
m2*lm[0]+m3*lm[1]])
28 for j in [N-int(i)-1 for i in list(range(len(t)-1))]:
29 k1 = g(x[j+1], p[j+1])
30 k2 = g(xh[j], p[j+1]-h*k1/2)
31 k3 = g(xh[j], p[j+1]-h*k2/2)
32 k4 = g(x[j], p[j+1]-h*k3)
33 p[j] = p[j+1] - h/6.0*(k1+2*k2+2*k3+k4)
34 return p
35
36 xd, r, m1, m2, m3 = 100.0, 10.0, 9.0e-4, 3.1e-3, 4.15e-2
37 T = 90.0
38 umin, umax = 0.0, 1.0
39 N = 30000
40 t = np.linspace(0.0, T, N+1)
41 h = T/N
42 u = np.ones_like(t)
43
44 x = SolveStateEquation(u)
45 from scipy import interpolate
46 p = SolveCostateEquation(x)
47 uold = u
48 u = UpdateControl(p)
49 Error = np.linalg.norm(u-uold, np.inf)
50 print('Iteration\t\t Error\n')
51 Iterations = 1
52
53 while Error >= 1e-15:
54 uold = u
55 x = SolveStateEquation(u)
56 p = SolveCostateEquation(x)
57 u = UpdateControl(p)
58 Error = np.linalg.norm(u-uold, np.inf)
59 print(Iterations, '\t\t', Error)
60 Iterations += 1
61
62 plt.figure(1)
63 plt.plot(t, u, color='crimson', lw=2)
64 plt.xlabel('Time (t)', fontweight='bold')
65 plt.ylabel('Optimal Treatment (u(t))', fontweight='bold')
66 plt.grid(True, ls = '--')
67 plt.xticks(np.arange(0, T*1.1, T/10), fontweight='bold')
68 mnu, mxu = np.floor(0.1*min(u))*10, np.ceil(0.1*max(u))*10
69 hu = (mxu-mnu)/10
70 plt.yticks(np.arange(mnu, mxu+hu, hu), fontweight='bold')
71 plt.axis([0, T, mnu, mxu])
72 plt.savefig('OCPGI1.eps')
73 plt.savefig('OCPGI1.png')
74
75 plt.figure(2)
76
77 plt.subplot(1, 2, 1)
78 plt.plot(t, x[:, 0], color = 'purple', lw = 2, label = 'Glucose')
79 plt.xlabel('Time (t)', fontweight='bold')
80 plt.ylabel('Glucose (x1(t) mg/dl)', fontweight='bold')
81 plt.grid(True, ls = '--')
82 plt.xticks(np.arange(0, T*1.1, T/10), fontweight='bold')
83 mnx, mxx = np.floor(0.1*min(x[:, 0]))*10, np.ceil(0.1*max(x[:, ...
0]))*10
84 hx = (mxx-mnx)/10
85 plt.yticks(np.arange(mnx, mxx+hx, hx), fontweight='bold')
86 plt.grid(True, ls = '--')
87 plt.axis([0, T, mnx, mxx])
88
89 plt.subplot(1, 2, 2)
90 plt.plot(t, x[:, 1], color = 'orangered', lw = 2, label = ...
'Insulin')
91 plt.xlabel('Time (t)', fontweight='bold')
92 plt.ylabel('Insulin (x2(t) mg/dl)', fontweight='bold')
93 plt.grid(True, ls = '--')
94 plt.xticks(np.arange(0, T*1.1, T/10), fontweight='bold')
95 mnx, mxx = np.floor(0.1*min(x[:, 1]))*10, np.ceil(0.1*max(x[:, ...
1]))*10
96 hx = (mxx-mnx)/10
97 plt.yticks(np.arange(mnx, mxx+hx, hx), fontweight='bold')
98 plt.grid(True, ls = '--')
99 plt.axis([0, T, mnx, mxx])
100 plt.savefig('OCPGI2.eps')
101 plt.savefig('OCPGI2.png')
[Figure: (a) optimal insulin infusion rate $u(t)$; (b) glucose and insulin dynamics, left: glucose $x_1(t)$, right: insulin $x_2(t)$.]
The MATLAB code for solving the optimal control problem of the glucose-
insulin interaction is as follows:
1 % glucose-insulin optimal control (setup lines reconstructed)
2 clear ; clc ;
3 global xd b m1 m2 m3 x10 x20 N h t T ;
4 xd = 100.0 ; b = 10.0 ; m1 = 9.0e-4 ; m2 = 3.1e-3 ; m3 = 4.15e-2 ;
5 x10 = 300.0 ; x20 = 0.0 ; T = 90.0 ; N = 30000 ;
6 h = T/N ;
7 t = linspace(0, T, N+1) ;
8 uold = 0.5*ones(N+1, 1) ; %p = ones(1, N+1) ; x = ones(1, N+1) ;
9 x = SolveStateEquationGI(uold) ;
10 p = SolveCostateEquationGI(x) ;
11 u = UpdateControlGI(p) ;
12
13 Iterations = 1 ;
14 Error = norm(u-uold, inf) ;
15 while Error >= 1e-15
16 uold = u ;
17 x = SolveStateEquationGI(u) ;
18 p = SolveCostateEquationGI(x) ;
19 u = UpdateControlGI(p) ;
20 Error = norm(u-uold, inf) ;
21 disp([num2str(Iterations) '   ' num2str(Error)]) ;
22 Iterations = Iterations + 1 ;
23 end
24
25 figure(1) ;
26 plot(t, u, '-b', 'LineWidth', 2) ;
27 xlabel('Time (t)') ;
28 ylabel('Optimal Infusion Rate (u(t))') ;
29 grid on ;
30
31 figure(2) ;
32 subplot(1, 2, 1) ;
33 plot(t, x(:, 1), '-r', 'LineWidth', 2) ;
34 xlabel('Time (t)') ;
35 ylabel('Glucose (x 1(t))') ;
36 grid on ;
37
38 subplot(1, 2, 2) ;
39 plot(t, x(:, 2), '-r', 'LineWidth', 2) ;
40 xlabel('Time (t)') ;
41 ylabel('Insulin (x 2(t))') ;
42 grid on ;
43
44 function u = UpdateControlGI(p)
45 global b ;
46 u = max(-p(:, 2)/(2*b), 0) ;
47 end
48
49 function x = SolveStateEquationGI(u)
50 global m1 m2 m3 x10 x20 N h t T ;
51 x = ones(N+1, 2) ;
52 f = @(z, v) [-m1*z(1)-m2*z(2), -m3*z(2)+v] ;
53 th = h/2:h:T-h/2 ;
54 uh = pchip(t, u, th) ;
55 x(1, :) = [x10, x20] ;
56 for j = 1 : N
57 k1 = f(x(j, :), u(j)) ;
58 k2 = f(x(j, :)+h*k1/2, uh(j)) ;
59 k3 = f(x(j, :)+h*k2/2, uh(j)) ;
60 k4 = f(x(j, :)+h*k3, u(j+1)) ;
61 x(j+1, :) = x(j, :) + h/6*(k1+2*k2+2*k3+k4) ;
62 end
63 end
64
65 function p = SolveCostateEquationGI(x)
66 global xd m1 m2 m3 N h t T;
67 p = ones(N+1, 2) ;
68 g = @(z, lm) [m1*lm(1)-2*z(1)+2*xd, m2*lm(1)+m3*lm(2)] ;
69 th = h/2:h:T-h/2 ;
70 xh = pchip(t(:), x', th(:))' ;
71 p(N+1, :) = zeros(1, 2) ;
72 for j = N : -1 : 1
73 k1 = g(x(j+1, :), p(j+1, :)) ;
74 k2 = g(xh(j, :), p(j+1, :)-h*k1/2) ;
75 k3 = g(xh(j, :), p(j+1, :)-h*k2/2) ;
76 k4 = g(x(j, :), p(j+1, :)-h*k3) ;
77 p(j, :) = p(j+1, :) - h/6.0*(k1+2*k2+2*k3+k4) ;
78 end
79 end
I(x(t)) ≤ 0 (11.28)
E(x(t)) = 0 (11.29)
then,
$$\int_{t_0}^{t_f} L_0\big(t, x(t), u(t)\big)\,dt \approx \frac{h}{3}\big(L_0^0 + L_0^N\big)
+ \frac{4h}{3}\sum_{i=1}^{\lfloor N/2 \rfloor} L_0^{2i-1}
+ \frac{2h}{3}\sum_{i=1}^{\lfloor N/2 \rfloor} L_0^{2i},$$
so that
$$J(u(t)) \approx \varphi\big(x^N\big) + \frac{h}{3}\big(L_0^0 + L_0^N\big)
+ \frac{4h}{3}\sum_{i=1}^{\lfloor N/2 \rfloor} L_0^{2i-1}
+ \frac{2h}{3}\sum_{i=1}^{\lfloor N/2 \rfloor} L_0^{2i} \qquad (11.30)$$
$$k_1 = f(x^i, u^i), \quad k_2 = f\big(x^i + \tfrac{h}{2}k_1,\, u^{i+\frac12}\big), \quad k_3 = f\big(x^i + \tfrac{h}{2}k_2,\, u^{i+\frac12}\big), \quad k_4 = f\big(x^i + h k_3,\, u^{i+1}\big),$$
$$x^{i+1} = x^i + \frac{h}{6}\big(k_1 + 2k_2 + 2k_3 + k_4\big) \qquad (11.33)$$
where $u^{i+\frac12} = u(t_i + h/2)$ can be approximated by some interpolation
method, such as the piecewise cubic Hermite interpolating polynomials.
$$\min_{u\in\mathbb{R}^{m(1+N_c)}}\ \varphi\big(x^N\big) + \frac{h}{3}\big(L_0^0 + L_0^N\big)
+ \frac{4h}{3}\sum_{i=1}^{\lfloor N/2 \rfloor} L_0^{2i-1}
+ \frac{2h}{3}\sum_{i=1}^{\lfloor N/2 \rfloor} L_0^{2i} \qquad (11.34)$$
subject to:
$$x^{i+1} - x^i - \frac{h}{6}\big(k_1 + 2k_2 + 2k_3 + k_4\big) = 0, \qquad (11.35)$$
subject to the initial condition
$$x(0) - x_0 = 0, \qquad (11.36)$$
$$E(x^i) = 0, \quad i = 0, 1, 2, \ldots, N, \qquad (11.37)$$
$$I(x^i) \le 0, \quad i = 0, 1, 2, \ldots, N. \qquad (11.38)$$
In compact form, collecting the discretized states and controls into a single vector $Y$, the problem is
$$\min_{Y\in\mathbb{R}^{n(1+N)+m(1+N_c)}} \Psi(Y), \qquad \bar{E}(Y) = 0, \qquad \bar{I}(Y) \le 0,$$
with the initial conditions imposed as
$$Y\big(i - 1 + (i-1)N\big) - x_{0i} = 0, \quad i = 1, 2, \ldots, n.$$
11.6.2.1 Examples
Example 11.4 This example is taken from [26].
$$\min_{u(t),\,v(t)}\ J(u, v) = S(T) + I(T) - R(T) + \int_0^T\big(S(t) + I(t) - R(t) + C_1 u^2(t) + C_2 v^2(t)\big)\,dt$$
subject to:
$$\dot{S}(t) = \Lambda - \alpha S(t)I(t) - \big(\mu + u(t)\big)S(t),$$
$$\dot{I}(t) = \alpha S(t)I(t) - (\mu + \beta)I(t) - v(t)I(t),$$
$$\dot{R}(t) = \beta I(t) + u(t)S(t) - \mu R(t) + v(t)I(t),$$
where
0 ≤ S(t), I(t), R(t) ≤ 1
and
umin ≤ u(t) ≤ umax and vmin ≤ v(t) ≤ vmax
To solve this problem with MATLAB, the model's parameters C1, C2, Λ, α, µ
and β shall be declared as global variables.
Based on Simpson's rule, a function SIRObjective is implemented for
the evaluation of the objective function.
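As an illustration of what such a Simpson's-rule evaluation does, the following Python sketch (illustrative only, not the book's SIRObjective function) computes the discretized cost (11.30) from the integrand values at the grid points:

import numpy as np

def simpson_cost(phi_T, L0, h):
    # phi_T: terminal cost phi(x^N); L0: integrand values L0^0..L0^N (N even); h: step size
    N = len(L0) - 1
    return (phi_T + h/3*(L0[0] + L0[N])
            + 4*h/3*np.sum(L0[1:N:2])
            + 2*h/3*np.sum(L0[2:N:2]))

# For Example 11.4 the integrand at the grid points would be
#   L0 = S + I - R + C1*u**2 + C2*v**2
# and the terminal cost  phi_T = S[-1] + I[-1] - R[-1].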
7 v = x(5+3*N+NC: 5+3*N+2*NC) ;
8
9 th = t(1:N)+h/2.0 ;
10 Sh = pchip(t, S, th) ;
11 Ih = pchip(t, I, th) ;
12 Rh = pchip(t, R, th) ;
13 uh = pchip(t, u(1+floor((0:N)/Q)), th) ;
14 vh = pchip(t, v(1+floor((0:N)/Q)), th) ;
15 umin = 0.05 ;
16 umax = 0.60 ;
17 vmin = 0.1 ;
18 vmax = 0.9 ;
19 S0 = 0.9 ;
20 I0 = 0.07 ;
21 R0 = 0.03 ;
22 ic = zeros(4+4*NC, 1) ;
23 ec = zeros(3*N+5, 1) ;
24 ic(1:1+NC) = umin - u ;
25 ic(2+NC:2+2*NC) = u - umax ;
26 ic(3+2*NC:3+3*NC) = vmin-v ;
27 ic(4+3*NC:4+4*NC) = v - vmax ;
28 f1 = @(x, y, z, v) Lambda-alpha*x*y-(mu+v)*x ;
29 f2 = @(x, y, z, v) alpha*x*y-(mu+beta)*y-v*y ;
30 f3 = @(x, y, z, u, v) beta*y+u*x-mu*z+v*y ;
31 for j = 1 : N
32 k11 = f1(S(j), I(j), R(j), u(1+floor(j/Q))) ;
33 k12 = f1(S(j)+h/2*k11, Ih(j), Rh(j), uh(j)) ;
34 k13 = f1(S(j)+h/2*k12, Ih(j), Rh(j), uh(j)) ;
35 k14 = f1(S(j)+h*k13, I(j+1), R(j+1), ...
u(1+floor((j+1)/Q))) ;
36
37 k21 = f2(S(j), I(j), R(j), v(1+floor(j/Q))) ;
38 k22 = f2(Sh(j), I(j)+h/2*k21, Rh(j), vh(j)) ;
39 k23 = f2(Sh(j), I(j)+h/2*k22, Rh(j), vh(j)) ;
40 k24 = f2(S(j+1), I(j)+h*k23, R(j+1), ...
v(1+floor((j+1)/Q))) ;
41
42 k31 = f3(S(j), I(j), R(j), u(1+floor(j/Q)), ...
v(1+floor(j/Q))) ;
43
44 k32 = f3(Sh(j), Ih(j), R(j)+h/2*k31, uh(j), vh(j)) ;
45 k33 = f3(Sh(j), Ih(j), R(j)+h/2*k32, uh(j), vh(j)) ;
46 k34 = f3(S(j+1), I(j+1), R(j)+h*k33, ...
u(1+floor((j+1)/Q)), v(1+floor((j+1)/Q))) ;
47 ec(j) = S(j+1) - S(j) - h/6 * ...
(k11+2*k12+2*k13+k14) ;
48 ec(N+j) = I(j+1) - I(j) - h/6 * ...
(k21+2*k22+2*k23+k24) ;
49 ec(2*N+j) = R(j+1) - R(j) - h/6 * ...
(k31+2*k32+2*k33+k34) ;
50 end
51 ec(3*N+1) = S(1)-S0 ;
52 ec(3*N+2) = I(1)-I0 ;
53 ec(3*N+3) = R(1)-R0 ;
54 ec(3*N+4) = u(end)-u(end-1) ;
55 ec(3*N+5) = v(end)-v(end-1) ;
1 clear ; clc ;
2 global N NC Q h Lambda beta mu alpha C1 C2 t ;
3 Lambda = 0.05 ; mu = 0.05 ; alpha = 2 ; beta = 0.6 ;
4 t0 = 0 ;
5 T = 5 ;
6 C1 = 2 ;
7 C2 = 0.5 ;
8 h = 0.025 ;
9 N = T/h ;
10 Q = 2 ;
11 NC = N/Q ;
12 t = linspace(t0, T, 1+N) ;
13 tc = linspace(t0, T, 1+NC) ;
14 S0 = 0.9 ;
15 I0 = 0.07 ;
16 R0 = 0.03 ;
17 x0 = zeros(3*N+2*NC+5, 1) ;
18
19 Options = optimset('LargeScale', 'off', 'Algorithm', ...
'active-set', 'Display', 'Iter', 'MaxIter', inf, ...
'MaxFunEvals', inf, 'TolFun', 1e-6, 'TolX', 1e-6, ...
'TolFun', 1e-6) ;
20 [x, fval] = fmincon(@SIRObjective, x0, [],[],[],[],[],[], ...
@SIRConstsRK4, Options) ;
21
22 S = x(1:1+N) ;
23 I = x(2+N:2+2*N) ;
24 R = x(3+2*N:3+3*N) ;
25 u = x(4+3*N:4+3*N+NC) ;
26 v = x(5+3*N+NC:5+3*N+2*NC) ;
27 tc = linspace(t0, T, 1+NC) ;
28 us = pchip(tc, u, t) ;
29 vs = pchip(tc, v, t) ;
30
31 figure(1) ;
32
33 subplot(1, 2, 1) ;
34 plot(t, us, '-b', 'LineWidth',2) ;
35 xlabel('Time (t)') ;
36 ylabel('Vaccination Strategy (u(t))') ;
37 grid on ;
38 set(gca, 'XTick', 0:0.5:T) ;
39 axis([0, T, 0, 1]) ;
40
41 subplot(1, 2, 2) ;
42 plot(t, vs, '-r', 'LineWidth', 2) ;
43 axis([0, T, -0.1, 1]) ;
44 xlabel('Time (t)') ;
45 ylabel('Treatment (v(t))') ;
46 grid on ;
47 set(gca, 'XTick', 0:0.5:T) ;
48 axis([0, T, 0, 0.2]) ;
49
50 figure(2) ;
51
52 subplot(1, 3, 1) ;
53 plot(t, S, 'b', 'LineWidth', 2) ;
54 xlabel('t') ;
55 ylabel('Susceptibles (S(t))') ;
56 grid on ;
57 set(gca, 'XTick', 0:0.5:T) ;
58 axis([0, T, 0.0, 1]) ;
59
60 subplot(1, 3, 2) ;
61 plot(t, I, 'r', 'LineWidth',2) ;
62 xlabel('Time (t)') ;
63 ylabel('Infected Population (I(t))') ;
64 grid on ;
65 set(gca, 'XTick', 0:0.5:T) ;
66 axis([0, T, 0.0, 0.2]) ;
67
68 subplot(1, 3, 3) ;
69 plot(t, R, 'm', 'LineWidth', 2) ;
70 xlabel('Time (t)') ;
71 ylabel('Recovered Population (R(t))') ;
72 grid on ;
73 set(gca, 'XTick', 0:0.5:T) ;
74 axis([0, T, 0.0, 0.8]) ;
Example 11.5 We consider the optimal control problem of the SIR model
described in Example 11.4, and use the Gekko Python package to solve it.
The Python script SolveGekSIR.py uses Gekko to solve the problem of Example 11.4.
FIGURE 11.3: Solution of the optimal control problem with dynamics gov-
erned by an SIR model, using the MATLAB optimization toolbox.
1 # SolveGekSIR.py
2 from gekko import GEKKO
3 import numpy as np
4 Lambda = 0.05 ; mu = 0.05  # model parameters (reconstructed setup)
5 alpha = 2.0 ; beta = 0.6
6 t0 = 0
7 T = 5
8 C1 = 2
9 C2 = 0.5
10 N = 200 #number of subintervals [t j, t {j+1}]
11 m = GEKKO(remote=False) #no remote server is required
12 m.time = np.linspace(t0, T, N+1) #discretizating the interval ...
[t0, T] by N+1 points
13 S, I, R = m.Var(value=0.9), m.Var(value=0.07), ...
m.Var(value=0.03) #Initializing the state variables
14 u, v = m.Var(lb=0.0, ub=0.8), m.Var(lb=0.05, ub=1.0) ...
#Initializing the control variables
15 X = m.Var(value=0.0) #The bolza part of the objective function
16 p = np.zeros(N+1)
17 p[N] = 1.0
18 final = m.Param (value=p)
19 # Equations
20 m.Equation(S.dt() == Lambda-alpha*S*I-(mu+u)*S) #First state ...
equation in S
21 m.Equation(I.dt() == alpha*S*I-(mu+beta)*I-v*I) #Second state ...
equation in I
22 m.Equation(R.dt() == beta*I+u*S-mu*R+v*I) #Third state ...
equation in R
Solving Optimal Control Problems Using Direct Methods 315
In Figure 11.4 the optimal vaccination and treatment strategies and the
corresponding susceptible, infected and recovered populations, obtained by
the Gekko Python package are shown.
Example 11.6 In this example we consider the pest control problem studied in [51]. The purpose is to find the optimal spraying schedule that reduces the population of insect pests, which are the preys of spiders in agroecosystems. The optimal control problem is:
\[
\min_{u(t)} \; J(u) = \int_0^T \left( z(t) + \frac{\xi}{2}\, u^2(t) \right) dt
\]
FIGURE 11.4: Gekko solution of the optimal control model with dynamics
governed by an SIR model.
subject to:
\[
\begin{aligned}
\dot{x}(t) &= r\,x(t)\left(1-\frac{x(t)}{W}\right) - c\,x(t)\,y(t) - \alpha(1-q)\,u(t), & x(0) &= x_0, \quad t \in [0, T],\\
\dot{y}(t) &= y(t)\bigl(-a + k\,b\,z(t) + k\,c\,x(t)\bigr) - \alpha K q\,u(t), & y(0) &= y_0, \quad t \in [0, T],\\
\dot{z}(t) &= e\,z(t)\left(1-\frac{z(t)}{V}\right) - b\,y(t)\,z(t) - \alpha q\,u(t), & z(0) &= z_0, \quad t \in [0, T],
\end{aligned}
\]
where
\[
u_{\min} \le u(t) \le u_{\max}.
\]
The values of the model parameters are: r = 1, e = 2.5, a = 3.1, b = 1.2, c = 0.2, α = 0.7, q = 0.9, k = 1.0, V = 1000, W = 5, K = 0.01, T = 50, and ξ ∈ {0.05, 0.1}.
The Python code is:
#!/usr/bin/env python3
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt

r = 1.0 ; e = 2.5 ; a = 3.1 ; b = 1.2 ; c = 0.2 ; al = 0.7 ;
q = 0.9 ; k = 1.0 ; V = 1000 ; W = 5 ; K = 0.01 ; D = 0.5 ;
T = 50 ;
t0 = 0
N = 300
m = GEKKO(remote=False)
m.time = np.linspace(t0, T, N+1)
x, y, z = m.Var(value=3.1), m.Var(value=3.7), m.Var(value=2.2)
u = m.Var(lb=0.00, ub=1.0)
X = m.Var(value=0.0)
p = np.zeros(N+1)       # mark final time point
p[N] = 1.0
final = m.Param(value=p)
# Equations
m.Equation(x.dt() == r*x*(1-x/W)-c*x*y-al*(1-q)*u)
m.Equation(y.dt() == y*(-a+k*b*z+k*c*x)-al*K*q*u)
m.Equation(z.dt() == e*z*(1-z/V)-b*y*z-al*q*u)
m.Equation(X.dt() == z+D/2*u**2)
m.Obj(X*final)              # objective function
m.options.IMODE = 6         # optimal control mode
m.solve()
t = m.time

plt.figure(1, figsize=(16, 16))
plt.subplot(2, 2, 1)
plt.plot(t[1:], u[1:], color='purple', lw=2)
plt.xlabel('Time (t)', fontweight='bold')
plt.ylabel('Optimal Spray Schedule (u(t))', fontweight='bold')
plt.xticks(np.arange(0, T+T/10, T/10), fontweight='bold')
mnu = np.floor(1000*min(u))/1000
mxu = np.ceil(1000*max(u))/1000
FIGURE 11.5: Gekko solution of the pest control optimal control problem.
The solution of the pest optimal control problem obtained with Gekko is shown in Figure 11.5.
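Since two values of the control weight are considered (ξ = 0.05 and ξ = 0.1), the script can be re-run for each weight and the attained costs compared. The short snippet below is one way to read the results back from Gekko after m.solve(); it is an illustrative addition, not part of the listing above.

# Illustrative post-processing (assumed; not part of the original listing)
print('Optimal cost J     =', m.options.OBJFCNVAL)   # objective value reported by the solver
print('Final pest level   =', z.value[-1])           # z(T), preys in the vineyard
print('Final spider level =', y.value[-1])           # y(T)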
Bibliography
[1] Eihab B.M. Bashier. Fitted numerical methods for delay differential equa-
tions arising in biology. PhD thesis, University of the Western Cape, Cape
Town, South Africa, September 2009.
[2] Logan Beal, Daniel Hill, R. Martin, and John Hedengren. GEKKO optimization suite. Processes, 6(8):106, 2018.
[3] Amir Beck. Introduction to Nonlinear Optimization. SIAM - Society for
Industrial and Applied Mathematics, 2014.
[4] M.D. Benchiboun. Linear convergence for vector sequences and some
applications. Journal of Computational and Applied Mathematics,
55(1):81–97, Oct 1994.
[5] David Benson. A Gauss Pseudospectral Transcription for Optimal Control. PhD thesis, Massachusetts Institute of Technology, 2005.
[6] F. Benyah and L. S. Jennings. A comparison of the ill-conditioning of two optimal control computation methods. Technical report, The University of Western Australia, Dept. of Mathematics and Statistics, Nedlands 6907 WA, 2000.
[7] F. Benyah and L. S. Jennings. Ill-conditioning in optimal control com-
putation. In Proceedings of the Eighth International Colloquium on Dif-
ferential Equations (ed. D. Bainov), pages 81–88, VSP BV, Utrecht, The
Netherlands, 1998.
[8] F. Benyah and L. S. Jennings. Regularization of optimal control computation. In International Symposium on Intelligent Automation Control (ISIAC'98), World Automation Congress, pages ISIAC 091.1–091.6, Anchorage, Alaska, 1998. TSI Press.
[9] F. Benyah and L. S. Jennings. A review of ill-conditioning and regularization in optimal control computation. In X. Yang, K. L. Teo, and L. Caccetta, editors, Optimization Methods and Applications, pages 23–44. Kluwer Academic Publishers, Dordrecht, The Netherlands, 2001.
[10] J. T. Betts. Practical Methods for Optimal Control Using Nonlinear Pro-
gramming. Society for Industrial and Applied Mathematics, 2001.
[11] J. T. Betts and W. P. Hoffman. Exploring sparsity in the direct transcription method for optimal control. Computational Optimization and Applications, 14:179–201, 1999.
[12] P. T. Boggs and J. W. Tolle. Sequential quadratic programming. Acta
Numerica, pages 1–48, 1996.
[13] J. C. Butcher. Numerical Methods for Ordinary Differential Equations.
John Wiley and Sons Ltd, 2016.
[14] J. C. Butcher. Implicit Runge-Kutta processes. Mathematics of Computation, 18(85):50–64, 1964.
[15] B.C. Chachuat. Nonlinear and Dynamic Optimization: From Theory to
Practice - IC-32: Spring Term 2009. Polycopiés de l’EPFL. EPFL, 2009.
[16] Stephen J. Chapman. MATLAB Programming for Engineers. Cengage Learning, 2015.
[17] Thomas Coleman, Mary Ann Branch, and Andrew Grace. Optimization Toolbox for Use with MATLAB, 1999.
[18] SciPy Community. SciPy Reference Guide. SciPy Community, 1.0.0 edition, October 2017.
[19] David G. Luenberger and Yinyu Ye. Linear and Nonlinear Programming. Springer-Verlag GmbH, 2015.
[20] R. Fletcher. Practical Methods of Optimization, 2nd edition. John Wiley & Sons, 2000.
[21] Grégoire Allaire and Sidi Mahmoud Kaber. Numerical Linear Algebra. Springer-Verlag New York Inc., 2007.
[22] Per Christian Hansen. Regularization tools: A MATLAB package for analysis and solution of discrete ill-posed problems. Numerical Algorithms, 6(1):1–35, Mar 1994.
[23] William E. Hart, Carl Laird, Jean-Paul Watson, and David L. Woodruff.
Pyomo – Optimization Modeling in Python. Springer US, 2012.
[41] Stuart Mitchell, Stuart Mitchell Consulting, and Iain Dunning. PuLP: A linear programming toolkit for Python, 2011.
[42] A. Neumaier. Solving ill-conditioned and singular linear systems: A tutorial on regularization. SIAM Rev., 40(3):636–666, 1998.
[43] V. A. Patel. Numerical Analysis. Harcourt Brace College Publishers, 1994.
[44] Jesse A. Pietz. Pseudospectral collocation methods for the direct tran-
scription of optimal control problems. Master’s thesis, Rice University,
2003.
[45] Promislow. Functional Analysis. John Wiley & Sons, 2008.
[46] Charles C. Pugh. Real Mathematical Analysis. Springer-Verlag GmbH,
2015.
[47] Richard L. Burden and J. Douglas Faires. Numerical Analysis. Cengage Learning, Inc., 2015.
[48] Lih-Ing W. Roeger. Exact finite-difference schemes for two-dimensional linear systems with constant coefficients. Journal of Computational and Applied Mathematics, 219(1):102–109, Sep 2008.
[49] Alan Rothwell. Optimization Methods in Structural Design. Springer-
Verlag GmbH, 2017.
[50] A. L. Schwartz. Theory and Implementation of Numerical Methods Based on Runge-Kutta Integration for Solving Optimal Control Problems. PhD thesis, Electronics Research Laboratory, UC Berkeley, 1996.
[51] C. J. Silva, D. F. M. Torres, and E. Venturino. Optimal spraying in biological control of pests. Math. Model. Nat. Phenom., 12(3):51–64, 2017.
[52] James Stewart. Calculus: Early Transcendentals. Brooks Cole Publishing Co., 2015.
[53] Gilbert Strang. Linear Algebra and Its Applications. Brooks Cole Publishing Co., 2005.
[54] H. Sussmann and J. C. Willems. 300 years of optimal control: From the brachystochrone to the maximum principle. IEEE Control Systems, pages 32–44, 1997.
[55] Suzanne Lenhart and John T. Workman. Optimal Control Applied to Biological Models. Taylor & Francis Ltd., 2007.
[56] G. W. Swan. An optimal control model of diabetes mellitus. Bulletin of Mathematical Biology, 44(6):793–808, 1982.