Numerical Methods in Finance. Part A. (2010-2011)
October 6, 2010
[email protected]
Contents

0 Preface
0.1 Aims, objectives, and organisation of the course
2.4 Numerical computations with matrices
2.5 Matlab exercises for week 4 (Linear Algebra)
2.5.1 Ill-posed systems
2.5.2 MATLAB capabilities investigation: sparse matrices
2.5.3 Solving Ax = b by iteration
4.1.1 Existence and Uniqueness
4.1.2 Autonomous linear ODEs
4.1.3 Examples of non-linear differential equations
4.1.4 Numerical methods for systems of ODEs
4.2 Stochastic Differential Equations
4.2.1 Black-Scholes SDE and Ito's lemma
4.2.2 The derivation of the Black-Scholes pricing formula as an exercise in Ito calculus
4.2.3 Numerical schemes for solving SDEs
4.2.4 Numerical example: the effect of stochastic volatility
4.2.5 Some popular models of stochastic volatility
Chapter 0
Preface
• The law of large numbers, central limit theorem, large deviations theory
and risk, Markowitz portfolio theory
The course was originally written by Sebastian van Strien and completely redesigned in 2007 by Oleg Zaboronski. Paul Clifford made a significant contribution to designing the course project. We shall mainly use the following texts, and you will not regret buying the first two.
• W.H. Press, S.A. Teukolsky, W.T. Vetterling and B.P. Flannery, Nu-
merical Recipes in C, Cambridge (1992)
• D.J. Higham and N.J. Higham, Matlab Guide, SIAM (Society for Industrial & Applied Mathematics) (2005)
• J.Y. Campbell, A.W. Lo and A.C. MacKinlay, The Econometrics of Financial Markets, Princeton (1997)
Notes will be handed out during each teaching session. At some point the notes will also become available on the web site:
http://www2.warwick.ac.uk/fac/sci/maths/people/staff/oleg_zaboronski/fm
Chapter 1
The purpose of this chapter is to give a first introduction to Matlab, and to explain why it is particularly useful when considering linear models (so, for example, a vector describing a portfolio)1.
Matlab Examples.
Let us show how to use Matlab to do matrix multiplication. Enter the matrices in the following way:

> A = [1 2; 3 4]
A =
1 2
3 4
> B = [1 1; 1 1];
Note that the semi-colon after the last command suppresses the output. Now use * to multiply the two matrices.
1 Portfolio: a collection of investments held by an institution or a private individual (Wikipedia)
> A * B - B*A
-1 -3
3 1
> A^2*B
17 17
37 37

For powers of A high enough to exceed the range of double-precision numbers, the entries overflow to Inf:

Inf Inf
Inf Inf
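The same computations can be reproduced outside Matlab; here is an equivalent check in Python with NumPy (an illustration added to these notes, not part of the original Matlab session; B = [1 1; 1 1] is the unique matrix consistent with the two outputs shown above):

```python
import numpy as np

# A as entered above; B reproduces the two outputs of the Matlab session.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.ones((2, 2))

print(A @ B - B @ A)                      # the commutator of A and B
print(np.linalg.matrix_power(A, 2) @ B)   # A^2 * B

# A power of A large enough to exceed the double-precision range
# overflows to inf entries:
print(np.linalg.matrix_power(A, 1000))
```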
A number t is called an eigenvalue of the matrix A if

Av = tv and v ≠ 0,

or equivalently if

(A − tI)v = 0,

where I is the identity matrix (v is called the eigenvector associated to t). Eigenvalues and eigenvectors in general need not be real. For the equation (A − tI)v = 0 to have a non-zero solution v, the matrix A − tI must be singular, i.e.

det(A − tI) = 0,
This is a polynomial equation of degree n, so A has n eigenvalues (counted with multiplicities), but a matrix does not necessarily have n linearly independent eigenvectors (if some of the eigenvalues appear with multiplicity).
Matlab can easily compute the eigenvalues of a matrix:
> A=[5,-1;3,1]; eig(A)
ans =
4
2
Eigenvectors together with the corresponding eigenvalues can be computed
as follows:
> [V,D]=eig(A)
V =
0.7071 0.3162
0.7071 0.9487
D =
4 0
0 2
The entries on the diagonal of the matrix D are the eigenvalues of A, while the columns of V give the eigenvectors (the first column of V gives the first eigenvector, and so on):
> A * [0.7071; 0.7071]
2.8284
2.8284

which is 4 times the initial vector (0.7071, 0.7071). The 2nd eigenvector is mapped to 2 times itself. N.B. Matlab always normalises eigenvectors to have unit L2 norm, sum_k |v_k|^2 = 1. Such a normalisation is not necessarily the best choice in the context of stochastic matrices, see below.
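NumPy's eigenvalue routine behaves the same way: it also returns unit-norm eigenvectors, so the Matlab output above can be cross-checked (a Python sketch added to these notes):

```python
import numpy as np

A = np.array([[5.0, -1.0], [3.0, 1.0]])
evals, V = np.linalg.eig(A)   # columns of V are unit-norm eigenvectors

# Eigenvalues 4 and 2 (the ordering may differ from Matlab's).
print(np.sort(evals.real))

# Each column v of V satisfies A v = t v and has unit L2 norm.
print(np.allclose(A @ V, V @ np.diag(evals)))
print(np.allclose(np.linalg.norm(V, axis=0), 1.0))
```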
The equations A v_k = t_k v_k for k = 1, 2, ..., n can be written in matrix form:

AV = VD,

where the k-th column of V is the eigenvector v_k and D is the diagonal matrix with the eigenvalues t_k on its diagonal. If the eigenvectors are linearly independent, the matrix V is invertible and we have the following matrix decomposition:

A = V D V^{-1}.
Note that even if matrix elements of A are real, its eigenvalues can be com-
plex, in which case V and D are complex. If A does not have a basis of
eigenvectors (which can happen if several eigenvalues coincide) then one has
to allow D to be of a more general (Jordan) form (Jordan decomposition
theorem). D will then have a block-diagonal form. Diagonal entries of the
Jordan block are equal to an eigenvalue of A, the superdiagonal consists of
1’s, the rest of block elements are zero.
The expression A = V D V^{-1} implies

A^n = V D^n V^{-1}.
But since D is such a special matrix, it is easy to compute its powers, and
from this we can prove the theorem below.
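Concretely, one can verify A^n = V D^n V^{-1} numerically; the sketch below (Python/NumPy, added for illustration) uses the 2 × 2 example from above:

```python
import numpy as np

A = np.array([[5.0, -1.0], [3.0, 1.0]])
evals, V = np.linalg.eig(A)

n = 7
# D^n is trivial to compute: just raise the diagonal entries to the n-th power.
An = V @ np.diag(evals ** n) @ np.linalg.inv(V)
print(np.allclose(An, np.linalg.matrix_power(A, n)))
```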
Theorem 1.1.1. If the norm of each eigenvalue of the diagonalisable matrix A is < 1, then for each vector v one has A^n v → 0 as n → ∞.
If there is an eigenvalue of A with norm > 1, then for 'almost all' vectors v (i.e., for all vectors which do not lie in the span of the eigenspaces corresponding to the eigenvalues with norm ≤ 1) one has |A^n v| → ∞ as n → ∞.
If A has an eigenvalue equal to one, then different things can happen depending on the matrix.
Idea of the proof (part 1): Let λ_max be an eigenvalue of A with the largest norm, |λ_max| < 1. For v ≠ 0,

||A^n v|| ≤ ||A^n|| · ||v|| ≤ ||v|| · ||V|| · ||V^{-1}|| · ||D^n|| ≤ ||v|| · ||V|| · ||V^{-1}|| · |λ_max|^n → 0 as n → ∞.
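The contraction in part 1 is easy to see numerically. The matrix below is an arbitrary example chosen for this illustration (not from the notes), with eigenvalues 0.5 and 0.25, both of norm < 1:

```python
import numpy as np

A = np.array([[0.5, 1.0],
              [0.0, 0.25]])
v = np.array([1.0, 1.0])
for _ in range(200):
    v = A @ v          # repeated application of A shrinks any vector
print(np.linalg.norm(v))   # essentially zero
```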
Important remarks:
• If A is a symmetric n × n matrix (for example, a covariance matrix), then it always has n real eigenvalues and n real eigenvectors which form an orthogonal basis. In this case we can choose V so that its columns are orthonormal eigenvectors of A. Then V is an orthogonal matrix: V^{-1} is equal to V^T.
• Eigenvectors of a non-symmetric matrix do not necessarily form a basis. For example, the non-symmetric matrix A = [1 1; 0 1] has only 1 as an eigenvalue, and A has only one linearly independent eigenvector. Even if A has a basis of eigenvectors, they need not be orthogonal, and so in general V^{-1} is not equal to V^T.
• Stochastic matrices
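The remark about symmetric matrices above is easy to verify numerically; a sketch in Python/NumPy on a randomly generated covariance-type matrix (an illustration, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 3))
C = X.T @ X / 10               # symmetric (a sample covariance matrix)

evals, V = np.linalg.eigh(C)   # eigh is the routine for symmetric matrices
print(evals)                                     # all real
print(np.allclose(V.T @ V, np.eye(3)))           # V is orthogonal: V^{-1} = V^T
print(np.allclose(V @ np.diag(evals) @ V.T, C))  # C = V D V^T
```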
Suppose that on average 3/4 of the investment into the new companies
is lost due to bankruptcy in the first year. Suppose also that only 1/2 of
the capital tied in the one-year old companies is transferred to two-year old
companies due to bankruptcy or early exit in their second year of operation.
Then N1 (n + 1) = N0 (n)/4 and N2 (n + 1) = N1 (n)/2.
Assume that the return from one-year old companies due to an early
exit is 2N1 (n) dollars in year n, whereas the return from the two-year old
companies which must exit in the end of their third year of operation is
4N2 (n) dollars. Then the total amount we re-invest in new companies in the
end of year n is N0 (n + 1) = 2N1 (n) + 4N2 (n) dollars. In matrix form
[ N0(n+1) ]   [  0   2   4 ] [ N0(n) ]
[ N1(n+1) ] = [ 1/4  0   0 ] [ N1(n) ]     (1.2.1)
[ N2(n+1) ]   [  0  1/2  0 ] [ N2(n) ]
This transformation determines the ‘age structure’ of the population in
year n + 1 once it is known what it is in year n. Let us give the above matrix
a name:
    [  0   2   4 ]
L = [ 1/4  0   0 ] .
    [  0  1/2  0 ]
Iterating expression (1.2.1) we get

N(n) = L^n N(0),

where N(n) is the column vector [N0(n), N1(n), N2(n)]′. The vector N(0) describes the initial distribution of the capital between the start-ups we manage. For example, N(0) = [C, 0, 0]′ corresponds to our fund investing C dollars in new start-ups.
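A quick simulation of the model (a Python/NumPy sketch added for illustration; C = 100 is an arbitrary choice) shows how the capital moves through the age groups:

```python
import numpy as np

L = np.array([[0.0,  2.0, 4.0],
              [0.25, 0.0, 0.0],
              [0.0,  0.5, 0.0]])

N = np.array([100.0, 0.0, 0.0])   # N(0) = [C, 0, 0]' with C = 100
for year in range(1, 6):
    N = L @ N                     # N(n+1) = L N(n)
    print(year, N)
```

After five years the capital is spread over all three age groups.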
So what is the growth rate associated with this model? If this year the total capital tied up in the start-ups is N0(n) + N1(n) + N2(n) dollars, then next year it is 2N1(n) + 4N2(n) + (1/4)N0(n) + (1/2)N1(n) dollars. So the growth rate is

[2N1(n) + 4N2(n) + (1/4)N0(n) + (1/2)N1(n)] / [N0(n) + N1(n) + N2(n)].

The survival of our investment fund depends on whether this rate is less than or greater than one. Unfortunately the answer is not obvious from the above
formula. We will develop the machinery needed for answering such questions
in the next section.
Notice that one can also use similar models to predict the age structure of the population of a country (the prediction that in the West the average age of the population will continue to grow over the next few decades is based on similar models; in this case the model specifies reproduction and death rates for each age group of the population).
Moreover, since in the limit each of the columns is a multiple of

[ 8 ]
[ 2 ]
[ 1 ]

the sizes of the age groups are eventually approximately in this proportion. So in these models one can only make predictions if one knows the age structure of the population, not just its size.
Why do we get this limit, and why are all the columns of the matrix multiples of each other? This is easy to explain by writing the matrix in terms of its basis of eigenvectors:
[V,D]=eig(L)
V =
-0.9631 0.9177 0.9177
-0.2408 -0.2294 - 0.2294i -0.2294 + 0.2294i
-0.1204 -0.0000 + 0.2294i -0.0000 - 0.2294i
D =
1.0000 0 0
0 -0.5000 + 0.5000i 0
0 0 -0.5000 - 0.5000i
Since L = V D V^{-1}, we have L^n = V D^n V^{-1}, and since the other eigenvalues −0.5 ± 0.5i have norm 1/√2 < 1, D^n tends to

1 0 0
0 0 0
0 0 0

We can now answer the question about the growth rate posed at the end of the previous section:
As n → ∞, N(n) ≈ w(N(0)) v, where v = [8, 2, 1]′ is an eigenvector corresponding to the eigenvalue λ = 1 and w(N(0)) is a scalar which depends on the initial vector. Therefore the total growth rate is asymptotically 1 (the largest eigenvalue), i.e. there is no growth. However, the total growth achieved by some large time n is
Γ = Σ_k (L^n N(0))_k / C = Σ_k (L^n [p0, p1, p2]′)_k ,

where C is the total initial investment and p_i is the share of the initial investment in companies which are i years old. As is easy to see from the limiting form of L^n, Γ = 0.55 is minimal for p0 = 1, p1 = 0, p2 = 0, and Γ = 2.2 is maximal for p0 = 0, p1 + p2 = 1.
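These two values of Γ can be read off numerically as the column sums of a high power of L (a Python/NumPy check of the claim above, added for illustration):

```python
import numpy as np

L = np.array([[0.0,  2.0, 4.0],
              [0.25, 0.0, 0.0],
              [0.0,  0.5, 0.0]])

# For large n, L^n is essentially the limiting matrix Q, and the k-th
# column sum of Q is the total growth for capital placed entirely in
# age group k.
Q = np.linalg.matrix_power(L, 200)
print(Q.sum(axis=0))   # approximately [0.55, 2.2, 2.2]
```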
Our analysis was very easy as it happened that the matrix L had a real eigenvalue whose norm was larger than the norm of any other eigenvalue. As a result we didn't really have to calculate large powers of L explicitly. Were we particularly lucky, or is this situation generic? The answer is the latter, provided we are dealing with irreducible matrices.
We say that a non-negative matrix P is irreducible if there exists an
integer k ≥ 1 so that P k has only strictly positive coefficients.
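The definition can be checked by brute force: raise P to successive powers and test for strict positivity. The sketch below (Python/NumPy, added for illustration; it uses Wielandt's bound (n−1)² + 1 on the exponent that needs to be checked) confirms that the matrix L from this section is irreducible:

```python
import numpy as np

def is_irreducible(P):
    """Return True if some power P^k has strictly positive entries.

    For an n x n matrix, if such a k exists at all, then
    k = (n-1)^2 + 1 suffices (Wielandt's bound), so we only
    check up to that exponent.
    """
    n = P.shape[0]
    M = np.eye(n)
    for _ in range((n - 1) ** 2 + 1):
        M = M @ P
        if np.all(M > 0):
            return True
    return False

L = np.array([[0.0,  2.0, 4.0],
              [0.25, 0.0, 0.0],
              [0.0,  0.5, 0.0]])
print(is_irreducible(L))          # True: L^5 > 0
print(is_irreducible(np.eye(3)))  # False: every power of I has zeros
```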
Theorem 1.2.1. If P is an irreducible non-negative matrix, then:
(1) There is one real positive eigenvalue ρ > 0 which is larger in norm than all others.
(2) If ρ > 1 then for almost all vectors v one gets that |P^n v| → ∞ as n → ∞;
(3) If ρ ∈ (0, 1) then for all vectors v, P^n v → 0 as n → ∞;
(4) If ρ = 1 then P^n tends to a matrix Q each column of which is some multiple of one column vector w with positive entries. The column vector w is the eigenvector of P associated with eigenvalue 1.
The matrix L from this section is irreducible: L^5 > 0. So L^n converges to a non-zero matrix because the largest (in norm) eigenvalue of L is equal to one. If instead of L we consider another irreducible matrix

> A=[0,2,3;1/4,0,0;0,1/2,0]
0 2 3
1/4 0 0
0 1/2 0
then, because the eigenvalue of A with the largest norm is 0.9466, the matrix A^n tends to zero as n → ∞ at the rate 0.9466^n.
Such a remarkable statement as Theorem 1.2.1 deserves some comments. Parts (2) and (3) follow easily from Theorem 1.1.1. Parts (1) and (4) are a consequence of the famous Perron-Frobenius Theorem:
Theorem 1.2.2. Let A = (aij ) be a real n × n matrix with positive entries
aij > 0. Then the following statements hold:
(1) There is a positive real eigenvalue r of A such that any other eigenvalue λ satisfies |λ| < r.
(2) The eigenvalue r is simple: r is a simple root of the characteristic
polynomial of A. In particular, both the right and the left eigenspaces asso-
ciated with r are one-dimensional.
(3) There are left and right eigenvectors associated with r which have
positive entries.
(4) min_j Σ_i a_ij ≤ r ≤ max_j Σ_i a_ij.
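Part (4) gives a cheap numerical sanity check: the dominant eigenvalue of a positive matrix must lie between its smallest and largest column sums. A sketch on a random positive matrix (Python/NumPy, an illustration not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.uniform(0.1, 1.0, size=(4, 4))   # a matrix with positive entries

r = np.abs(np.linalg.eigvals(A)).max()   # the Perron eigenvalue
col_sums = A.sum(axis=0)
print(col_sums.min() <= r <= col_sums.max())   # True, by part (4)
```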
For the proof of the Perron-Frobenius (PF) theorem, see the course's web page. Here we will simply show how parts (1) and (4) of Theorem 1.2.1 follow from Theorem 1.2.2. Let k be such that the matrix A^k has only positive entries. As A has only non-negative entries and is irreducible, A^{k+1} also has only positive entries. The largest-in-norm eigenvalues of A^k and A^{k+1} have the form r^k and r^{k+1}, where r is an eigenvalue of A. By the PF theorem, r^k and r^{k+1} are real and positive; therefore r is real and positive. Also, any other eigenvalue λ of A satisfies |λ|^k < r^k by PF applied to A^k. Therefore r > |λ| for any eigenvalue λ of A distinct from r. Part (1) is proved.
To prove part (4), notice that by the Jordan decomposition theorem there is a basis in which the matrix A has the form

A = [ 1  0^T
      0  A_0 ],

where all eigenvalues of A_0 lie inside the unit circle. Hence, by virtue of Theorem 1.1.1,

lim_{k→∞} A^k = [ 1  0^T
                  0  0 ].

In an arbitrary basis,

lim_{k→∞} A^k = V [ 1  0^T
                    0  0 ] V^{-1},

which is a rank-1 matrix whose columns are proportional to the eigenvector associated with the eigenvalue 1. Part (4) is proved.
The above expression constitutes a well-known probabilistic sum rule conveniently re-written in matrix form. To be definite, we assume that an optimistic day is followed by an optimistic day with probability 2/3 and by a pessimistic day with probability 1/3. After a pessimistic day these probabilities are 1/4 and 3/4 respectively. The transition matrix of our Markov model of the mood of the market takes the form

P = [ 2/3  1/4
      1/3  3/4 ].
The matrix P is also called a Markov or a stochastic matrix. The latter term is used for non-negative matrices such that the sum of the coefficients of each column is equal to one. It is easy to see that the matrix P built out of conditional probabilities is stochastic. We want to compute the probability that the stock market is in a certain state on day n, i.e. the numbers o(n) and p(n). Since these are probabilities, o(n) + p(n) = 1 and each of these numbers is greater than or equal to zero. The vector [o(n); p(n)] is also called a probability vector. Question: can you check that our model is consistent, in the sense that a stochastic matrix maps a probability vector to a probability vector?
Iterating (1.3.2) we get

[ o(n) ]        [ o(0) ]
[      ] = P^n  [      ] .
[ p(n) ]        [ p(0) ]
To study stochastic matrices we need the following corollary of the Perron-Frobenius Theorem 1.2.2.

Theorem 1.3.1. If P is an irreducible stochastic matrix, then precisely one of the eigenvalues of P is equal to 1 and all others have norm < 1. As n → ∞ the matrices P^n converge to a matrix Q each column of which is equal to the eigenvector w corresponding to eigenvalue 1; the coefficients of w are positive and add up to 1 (a probability vector).
First, we have to prove that the largest eigenvalue of P is equal to 1. This follows from part (4) of the PF theorem, as for any stochastic matrix

min_j Σ_i p_ij = max_j Σ_i p_ij = 1.

Secondly, we need to verify that the columns of the limiting matrix are not just proportional to, but equal to, the (stochastic) eigenvector associated with eigenvalue 1. This is true as the limiting matrix is also stochastic; therefore the matrix elements in each column must sum to 1.
are 0.98 and 1.) Note that irreducibility of P is crucial for the uniqueness of the equilibrium state. If, for example,

P = [ 1   0   0
      0  0.9  0
      0  0.1  1 ],

then

lim_{n→∞} P^n = [ 1 0 0
                  0 0 0
                  0 1 1 ],

and so in particular lim_{n→∞} P^n v really does depend on v.
Note that in the example from the beginning of this section P is irreducible, and so P^n tends to a matrix with all columns the same. In fact,

lim_{n→∞} P^n = [ 0.4286  0.4286
                  0.5714  0.5714 ].

Hence for the example at hand, the market in its equilibrium state is up with probability 0.4286 and down with probability 0.5714.
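This equilibrium is easy to confirm numerically: every column of a high power of P converges to the stationary vector w = [3/7, 4/7] ≈ [0.4286, 0.5714] (a Python/NumPy check added for illustration):

```python
import numpy as np

P = np.array([[2/3, 1/4],
              [1/3, 3/4]])

Q = np.linalg.matrix_power(P, 50)
print(Q)                  # both columns close to [0.4286, 0.5714]'

w = np.array([3/7, 4/7])  # exact stationary probability vector
print(np.allclose(Q[:, 0], w), np.allclose(Q[:, 1], w))
```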
It is often convenient to describe Markov models using trellis diagrams:

Up   --2/3-->  Up
Up   --1/3-->  Down
Down --1/4-->  Up
Down --3/4-->  Down
where S0 is the initial value of the asset, K is the exercise price, N_u(n) is the number of upward ('optimistic') days among n days, and N_d(n) is the number of 'pessimistic' days. The model can be generalized by allowing the probability of the move on day n to depend on what happened on the previous day. We see that the pay-off depends only on the numbers of 'ups' and 'downs'. These are exactly the quantities whose statistics we studied in the 'market mood' model.
0.0000  0       0       0       0
0.3332  0.3334  0.3332  0.3333  0.3333
0.1112  0.1111  0.1112  0.1111  0.1111
0.3332  0.3334  0.3332  0.3333  0.3333
0.2223  0.2222  0.2223  0.2222  0.2222

0.0000  0       0       0       0
0.4164  0.3334  0.3332  0.3333  0.3333
0.1391  0.1111  0.1112  0.1111  0.1111
0.4164  0.3334  0.3332  0.3333  0.3333
0.2779  0.2222  0.2223  0.2222  0.2222
So in this case it would be best to put all your money in the first fund, despite the fact that eventually this fund dies out. It is the first fund that actually makes money, because the sum of the elements of the first column is > 1 (the sum of the first column is 1.1, while the sum of the others is 1).
This last example can serve as a light-hearted description of the economy of the former Soviet Union: the government subsidizes industry, the subsidies get stolen and moved abroad, and the state eventually collapses.
Later in the course we shall discuss Monte Carlo methods, and see why it matters that one of the other eigenvalues of P can be close to 1: in that case P^n is not very close to the limit Q even for large n, which leads to slow convergence of Monte-Carlo-based numerical schemes.