
COMPUTATIONAL FINANCE

Chapter 1: The Solution of Nonlinear Equations


An essential component in the quantitative modeling of financial processes is the
solution of equations and inequalities. If the model leads to a system of linear equations
of the form
Ax = b,

where A is an m × n matrix, then the tools of linear algebra can be brought to bear to
analyze this system, and the highly developed software of numerical linear algebra can be
employed to find the actual solution or an approximation to it.
The situation is usually considerably more complicated when the model leads to m
nonlinear equations in n unknowns, written conveniently as

F(x) = 0

where x = (x1, . . . , xn) and F = (f1, . . . , fm). In general there is no guarantee that such
a system has a solution, nor, if solutions exist, that any of them can be found numerically.
If the system can in fact be solved, the usual methods are iterative and repeatedly require
the solution of linear systems.
Many of the analytical and numerical problems disappear when we have the special
case of one equation in one unknown, written as

f(x) = 0.

We shall begin our discussion by considering this case.

1. The numerical solution of f(x) = 0.

In order to have a concrete case in mind let us consider the following example from the
world of equity options:
A “European call” is an option which gives its holder (owner) the right to buy an asset
for a specified amount $K at a specified time T . K is known as the strike price and T is
the time of maturity of the option. If at time T the asset can be bought at a cheaper price

than $K then the option will not be exercised. But if the asset trades for more than $K
then the holder will exercise the option. The option confers a right but not an obligation
and hence has value. The writer (seller) of the option, on the other hand, is obligated to
sell the asset on demand at time T for $K regardless of its value at that time. This creates
risk for which the writer must be compensated by charging for the option.
Pricing options is one of the central topics of computational finance and will dominate
this course. For the above European call there is the famous Black-Scholes formula which
explicitly gives its price
C = S N(d1) − K e^(−rT) N(d2)

where S is the price of the asset at the time t = 0 when the option is sold. Here

d1 = [ln(S/K) + (r + σ²/2)T] / (σ√T)

d2 = [ln(S/K) + (r − σ²/2)T] / (σ√T)

N(x) = (1/√(2π)) ∫_(−∞)^x e^(−s²/2) ds.
The parameter r is the riskless interest rate at which money can be invested today with
payout at time T and is generally assumed known. The second parameter σ is the so-called
volatility of the asset and is a measure of the day-to-day changes in the value S of the
asset. There is a great deal of uncertainty about how to choose σ because it is not an observable
quantity. What is observable is the market price C of the option. So one possible approach
to determining σ is to find that value of σ which for given K, r, T and S produces a value
C which is identical to the quoted market price. This value of σ is called the “implied
volatility.” Mathematically, we need to solve the problem

f(σ) ≡ C − S N(d1(σ)) + K e^(−rT) N(d2(σ)) = 0

where K, r, T, S and C are given. It is clear that this is a highly nonlinear equation in
the unknown σ which can only be solved numerically. Implied volatility calculations are
said to take place around the clock in the financial derivatives industry.
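Before turning to root-finding methods, it may help to have the pricing formula in executable form. Below is a minimal Python sketch of the Black-Scholes call price; the function names and the use of the error function erf to evaluate the normal distribution N are implementation choices, not part of the text.

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    # standard normal CDF N(x), written via the error function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_scholes_call(S, K, r, T, sigma):
    """Black-Scholes call price C = S N(d1) - K e^(-rT) N(d2)."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)
```

For example, with S = K = 100, r = 0.05, T = 1 and σ = 0.2 this gives a call price of about 10.45.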

With this example in the background let us now consider feasible numerical methods
for solving the general problem

f(x) = 0.

We shall denote a root of this equation by x∗.


If this equation has to be solved only a few times then efficiency in its solution is of
secondary concern. One may as well plot f versus x, read off the screen an approximate
value for x∗ and then refine this value by replotting f over a small interval centered at
this approximation. It is clear that such an interactive approach is expensive in terms of
man-hours and function evaluations.
This intuitive approach is made formal by the method of bisection which under mild
conditions on the function f is guaranteed to give a numerical solution arbitrarily close to
the analytic solution x∗ .
Bisection algorithm: Let f be continuous on the interval [a, b]. Suppose that f(a)f(b) ≤ 0. Then
i) f(x) = 0 has at least one solution x∗ ∈ [a, b].
ii) let xL = a, xR = b; for n = 1, 2, . . . set

zn = (xL + xR)/2.

Then |zn − x∗| < (b − a)/2^n.

If f(zn) ≠ 0 and zn is not acceptably close to x∗ then set

xL = zn   if f(zn)f(xR) ≤ 0
xR = zn   if f(zn)f(xR) > 0

and go to n + 1.
The continuity of f and a sign change over the interval [a, b] guarantee that f has at
least one root in the interval. In every step of the algorithm we ensure that f changes sign
over [xL, xR]. Since every step of the algorithm halves the interval, we see that at step n

xR − xL = (b − a)/2^n

so that

|zn − x∗| < (b − a)/2^n.

Suppose we wish to ensure that |zn − x∗| < 10^(−6). This is guaranteed if

(b − a)/2^n < 10^(−6)

or

n > [ln(b − a) + 6 ln 10] / ln 2.

For example, if b − a = 1 then

n > 19.93

so that as many as 20 function evaluations may be required to achieve the stated accuracy.
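The bisection algorithm translates directly into code. Here is a Python sketch; the stopping rule based on the interval half-length matches the error bound just derived, and the helper name bisect is an implementation choice.

```python
def bisect(f, a, b, tol=1e-6, max_iter=100):
    """Bisection for f(x) = 0 on [a, b], assuming f(a)*f(b) <= 0."""
    fa, fb = f(a), f(b)
    assert fa * fb <= 0, "f must change sign on [a, b]"
    for _ in range(max_iter):
        z = 0.5 * (a + b)          # z_n, the midpoint
        fz = f(z)
        if fz == 0 or 0.5 * (b - a) < tol:
            return z
        if fz * fb <= 0:           # sign change now on [z, b]
            a, fa = z, fz
        else:                      # sign change on [a, z]
            b, fb = z, fz
    return 0.5 * (a + b)

# Example: the root of x^2 - 2 on [1, 2]; about 20 iterations are
# needed for 1e-6 accuracy, as computed above.
root = bisect(lambda x: x * x - 2.0, 1.0, 2.0)
```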
Suppose we wish to apply the bisection method to the implied volatility calculation.
By inspection d1 and d2 are continuous functions of σ, and since N is a continuous function
of x it follows that f is a continuous function of σ. Moreover,

lim_(σ→∞) N(d1(σ)) = lim_(x→∞) N(x) = 1

lim_(σ→∞) N(d2(σ)) = lim_(x→−∞) N(x) = 0

so that

lim_(σ→∞) f(σ) = C − S.

This is a negative quantity since if the call were more expensive than the asset one may as
well buy the asset itself.
The condition as σ → 0 is a little trickier. We note that

lim_(σ→0) d1(σ) = lim_(σ→0) d2(σ) = lim_(σ→0) [ln(S/K) + rT] / (σ√T).

If ln(S/K) + rT < 0 then d1 and d2 → −∞ as σ → 0 and

lim_(σ→0) f(σ) = C > 0.

If ln(S/K) + rT > 0 then d1 and d2 → ∞ as σ → 0 and

lim_(σ→0) f(σ) = C − (S − Ke^(−rT)).

For a correctly priced call this quantity is also positive, for if

ln(S/K) + rT > 0,   i.e.,   S > Ke^(−rT),

and

C < S − Ke^(−rT),

then an investor can sell the asset short for $S and buy the call for $C. The value of the
proceeds at time T is then

(S − C)e^(rT),

which would exceed the strike price K required to repurchase the asset sold short before.
Hence

lim_(σ→0) f(σ) > 0   and   lim_(σ→∞) f(σ) < 0,

so that the bisection method will succeed.


A search method like the method of bisection is not useful if efficiency of the computation
is of concern. Then more sophisticated methods must be employed which, unfortunately,
tend to be more delicate and require deeper mathematical insight to ensure that they work
and work well.
Let us again consider the problem

f(x) = 0.

We shall rewrite the equation generically in the form of a so-called fixed point equation

x = g(x)

where g is chosen such that any x∗ which satisfies

x∗ = g(x∗)

is also a solution of f(x) = 0. The solution x∗ is known as a fixed point of g.

There are infinitely many fixed point equations associated with a given f(x) = 0. For
example, one may write trivially

x = g(x) ≡ x − αf(x)

for any non-zero scalar α. More commonly, a g is obtained by solving f for x in terms of a
function of x. For example,

f(x) ≡ x − x² = 0

can be rewritten as the fixed point equations

x = x²

or

x = √x.

The numerical solution of

x = g(x)

is obtained by simple substitution or, what is the same, a fixed point iteration. We assume
that we have an initial guess x0 and compute iteratively the sequence {xn} from

xn+1 = g(xn),   n = 0, 1, 2, . . .

If the sequence converges to some x∗ then we have found a root of f.


Convergence of the iteration requires a particular structure of g. If g does not have the
right properties then the iteration will not converge regardless of how close x0 is to a root
of f. For example,

f(x) ≡ x − x² = 0

has a root at x∗ = 1 and at x∗ = 0. But if we try to solve

x = x²

with an initial guess of

x0 = 1 + ε,   ε > 0,

then xn → ∞ as n → ∞ no matter how small ε is chosen. The desirable property for a fixed
point x∗ is that it be a point of attraction, which is defined as follows.
Definition: Let x∗ be a fixed point of g. x∗ is a point of attraction of the fixed point
iteration if there is a neighborhood N(x∗, δ) of x∗ (i.e., an interval of radius δ around x∗)
such that for any x0 ∈ N(x∗, δ) the fixed point iteration converges to x∗.

The discussion above shows that x∗ = 1 is not a point of attraction of

g(x) = x².

A moment’s reflection will show that the fixed point x∗ = 0 is a point of attraction. Similarly,
x∗ = 1 is a point of attraction of the alternate fixed point equation

x = g(x) = √x,

but x∗ = 0 is not. In many applications one has a reasonable idea of x∗, and it is important
to have a fixed point formulation of f(x) = 0 for which x∗ is a point of attraction, so that
a good initial guess will lead to convergence. This raises the question of what property of g
makes a fixed point a point of attraction. We have the following theoretical criterion.
Theorem: Let x∗ be a fixed point of the function g. Suppose that g is continuously
differentiable in a neighborhood of x∗ and that

|g′(x∗)| < 1.

Then x∗ is a point of attraction.


Proof: If g′ is continuous and |g′(x∗)| < 1 then |g′(x)| ≤ 1 − α for some α > 0 and all x in
some interval N(x∗, δ) around x∗ of radius δ. (We don’t know x∗ or the radius δ, but we
know from analysis that such a δ > 0 exists.) By the mean value theorem for differentiable
functions we also know that

g(x) − g(x∗) = g′(ξ)(x − x∗)

for some ξ between x and x∗. Hence if x ∈ N(x∗, δ) then

|g(x) − g(x∗)| ≤ (1 − α)|x − x∗|.

Now suppose that x0 is an arbitrary point in N(x∗, δ). Then

|x1 − x∗| = |g(x0) − g(x∗)| ≤ (1 − α)|x0 − x∗|

so that x1 ∈ N(x∗, δ); similarly |xn+1 − x∗| ≤ (1 − α)|xn − x∗| and hence

|xn − x∗| ≤ (1 − α)^n |x0 − x∗|

which guarantees that xn → x∗ as n → ∞.
If we examine the two fixed point equations

x = g1(x) = x²

and

x = g2(x) = √x

associated with

f(x) = x − x² = 0,

then we see that

g1′(1) = 2

g2′(1) = 1/2

so x∗ = 1 is not a point of attraction for g1 but is a point of attraction for g2.
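A small Python sketch makes the contrast concrete: iterating g1(x) = x² from a point just above 1 diverges, while iterating g2(x) = √x converges to the fixed point 1. The helper fixed_point and its crude divergence cutoff are illustrative choices, not part of the text.

```python
def fixed_point(g, x0, tol=1e-10, max_iter=200):
    """Iterate x_{n+1} = g(x_n); return None if the iterates blow up."""
    x = x0
    for _ in range(max_iter):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new
        if abs(x_new) > 1e12:      # crude divergence cutoff
            return None
        x = x_new
    return x

diverged = fixed_point(lambda x: x * x, 1.1)       # g1: x* = 1 repels
converged = fixed_point(lambda x: x ** 0.5, 2.0)   # g2: converges to 1
```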

Newton’s method
For the method of bisection we only required a continuous function f and an interval
over which f changes sign. The algorithm itself asks for the evaluation of f at given points
and does not demand that f be given analytically. For example, the evaluation of f may
involve a table look-up, or f may require the solution of a differential equation whose solution
depends on the independent variable x. Hence f may be very general and complicated.
If f is given analytically and f′ is continuous then we can solve

f(x) = 0

by a fixed point method called Newton’s method (sometimes the Newton-Raphson method),
which is in general much more efficient than bisection. The idea is as follows: given an
initial guess x0, for n = 0, 1, 2, . . . we linearize f around xn and find xn+1 as the solution
of the linear problem. The linearization is obtained from the first two terms of the Taylor
expansion of f, i.e. the linearization of f around xn is

Ln x = f(xn) + f′(xn)(x − xn)

so that xn+1 is the solution of Ln x = 0, or

xn+1 = xn − f(xn)/f′(xn),   n = 0, 1, 2, . . .

There is, of course, a natural geometric interpretation of this method. The equation

y = Ln x

is the tangent to the graph of f at xn, and xn+1 is the point where this tangent crosses the
x-axis. The hope is that xn+1 is a better approximation to x∗ than the preceding iterate xn.
Newton’s method has two exceedingly important properties. Under mild conditions we
are guaranteed convergence from a good initial x0 and once xn is sufficiently close to x∗ the
convergence is very rapid. These properties are easy to establish in view of our discussion
of fixed point iterations.
We observe that Newton’s method is a fixed point iteration for

x = g(x)

where

g(x) = x − f(x)/f′(x).

Let x∗ be a root of f and suppose, as is usually the case, that f″ exists and that f′(x∗) ≠ 0.
Then it follows that

g′(x∗) = 1 − [f′(x∗)² − f(x∗)f″(x∗)] / f′(x∗)² = 0.
Hence x∗ is a point of attraction and Newton’s method will converge from a good initial
guess. Moreover, it follows from the identity

g(y) = g(x) + g′(x)(y − x) + (1/2) g″(ξ)(y − x)²

for some ξ between x and y that

|xn+1 − x∗| = |g(xn) − g(x∗)| = |g(x∗) + g′(x∗)(xn − x∗) + (1/2) g″(ξ)(xn − x∗)² − g(x∗)|

so that in view of g′(x∗) = 0 we obtain

|xn+1 − x∗| ≤ K|xn − x∗|²

for some constant K related to g″. Thus, if in the iteration the error |xn − x∗| is 10^(−2),
then the next iterate has an error of order 10^(−4). This convergence is called quadratic
and usually ensures that only two or three iterations are needed to obtain an acceptable
approximation to x∗.
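In code, Newton’s method is only a few lines. The Python sketch below stops when the Newton step is small; with quadratic convergence the step size is an excellent proxy for the error. The helper name newton is an implementation choice.

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton iteration x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x = x - step
        if abs(step) < tol:
            return x
    return x

# Example: sqrt(2) as the root of f(x) = x^2 - 2; a handful of
# iterations suffice, versus roughly 20 bisection steps for 1e-6.
root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0)
```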
The dominant drawback of Newton’s method is that a good initial guess is required,
since convergence to a point of attraction is a local property. If one were to provide, say, a
Newton’s-method-based program for solving the implied volatility problem, one would have
to guard against a bad choice of the initial guess σ0, unless, of course, one can establish
that Newton’s method will automatically converge. This requires additional conditions
on f.
Theorem: Let f be defined on the interval [a, b]. Suppose that

f(x∗) = 0 for some x∗ ∈ [a, b],

f′(x) > 0 on [a, b],

f″(x) ≥ 0 on [a, b].

Then Newton’s method will converge from x0 = b.


The proof of this result can be given rigorously, but a geometric argument will make
clear what happens. First of all, the root x∗ is unique in [a, b] because f′ > 0. Then
we observe that the tangent to f at x0 = b lies below the graph of f because f″ ≥ 0
implies that f is convex (concave upward). Hence the tangent crosses the x-axis at a point
x1 which satisfies

x∗ ≤ x1 < x0

because the tangent has positive slope. The same observation applies to all subsequent
tangents. As a consequence we generate a decreasing sequence of numbers {xn} which is
bounded below by x∗. Hence the sequence must converge, and from

xn+1 = xn − f(xn)/f′(xn)

it follows that xn must converge to a root of f, hence to x∗. Similar arguments apply if
f is decreasing or concave. The geometry tells us whether monotone convergence can be
guaranteed.

Let us look at an application of this result to the implied volatility calculation,

f(σ) ≡ C − S N(d1(σ)) + K e^(−rT) N(d2(σ)) = 0

where

d1(σ) = A/σ + bσ,   d2(σ) = d1(σ) − 2bσ

with

A = [ln(S/K) + rT] / √T   and   b = √T / 2.
T
Our discussion of the problem in connection with the bisection method already established
that under reasonable assumptions

f(σ) = 0

has a solution. We compute:

f′(σ) = −S N′(d1) d1′ + K e^(−rT) N′(d2) d2′.

From the definition of N we find

N′(d2) = (1/√(2π)) e^(−d2²/2),

and if we substitute d2 = d1 − 2bσ then simple algebra leads to

N′(d2) = N′(d1) (S/K) e^(rT)

so that

f′(σ) = −2bS N′(d1) < 0.

Furthermore,

f″(σ) = −(2bS/√(2π)) (−d1 d1′) e^(−d1²/2).

But

d1′ = −(1/σ) d2

so that

f″(σ) = −(2bS/(σ√(2π))) d1 d2 e^(−d1²/2).

From the definitions of d1 and d2 we see that

d1 d2 = (A/σ + bσ)(A/σ − bσ).

If we set

σ0 = √(|A/b|)

then f″ is negative for σ < σ0 and positive for all σ > σ0. A look at successive tangents
to the graph of f shows that starting from σ0 Newton’s method will converge. If f(σ0) > 0
we obtain a monotone increasing sequence; if f(σ0) < 0 we generate a monotone decreasing
sequence. Hence this choice of initial value is sufficient to guarantee monotone convergence,
as long as the call is correctly priced so that f(σ) = 0 has a solution. While in general
quadratic convergence only sets in close to the correct volatility, in practice this approach is
quite efficient compared to the bisection method.
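The analysis above suggests a concrete algorithm: start Newton’s method at σ0 = √(|A/b|) and iterate using the derivative f′(σ) = −2bS N′(d1) computed above. Below is a self-contained Python sketch; the fallback guess used when A = 0 (where σ0 would be zero) is an ad hoc assumption, and the helper name is illustrative.

```python
from math import log, sqrt, exp, erf, pi

def implied_vol_newton(C, S, K, r, T, tol=1e-12, max_iter=50):
    """Newton's method for f(sigma) = C - S N(d1) + K e^(-rT) N(d2),
    started at the inflection point sigma0 = sqrt(|A/b|)."""
    N = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))          # normal CDF
    Nprime = lambda x: exp(-0.5 * x * x) / sqrt(2.0 * pi)   # its density
    A = (log(S / K) + r * T) / sqrt(T)
    b = sqrt(T) / 2.0
    # starting guess sigma0; 0.2 is an arbitrary fallback when A = 0
    sigma = sqrt(abs(A / b)) if A != 0.0 else 0.2
    for _ in range(max_iter):
        d1 = A / sigma + b * sigma
        d2 = d1 - 2.0 * b * sigma
        f = C - S * N(d1) + K * exp(-r * T) * N(d2)
        fprime = -2.0 * b * S * Nprime(d1)   # f'(sigma), derived above
        step = f / fprime
        sigma -= step
        if abs(step) < tol:
            return sigma
    return sigma
```

A round trip (price at a known σ, then recover it) again serves as a sanity check.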
A few final comments: The strengths and weaknesses of Newton’s method carry
over to the solution of the system

F(x) = 0.

Given xn, the system is again linearized around xn:

Ln x = F(xn) + F′(xn)(x − xn)

where F′ is the n × n Jacobian matrix

F′(x) = (∂fi/∂xj).

xn+1 is the solution of

Ln x = 0,

i.e., formally

xn+1 = xn − F′(xn)^(−1) F(xn),

but computed from

F′(xn)δ = −F(xn)

xn+1 = xn + δ

in order to avoid forming the inverse of F′(xn). The multi-dimensional Newton method will
again converge from a good initial guess as long as F′(x∗) is non-singular, and convergence
close to the solution remains quadratic. In general, the choice of the initial guess is more
critical than in the scalar case, and this has led to some sophisticated methods for choosing x0.
Lack of convergence of Newton’s method from a good initial guess is usually due to an
incorrect calculation of F′(x). In addition, F may not be given explicitly but only in terms of
an input-output algorithm (given x one can calculate F(x)), so that F′ is not calculable. One
can avoid F′(x) altogether if its entries are replaced by difference quotients. For example,
the jth column of F′(x) can be approximated by

[F(x + h êj) − F(x)] / h

for small h, where êj is the jth unit vector. By definition the limit of this difference quotient
as h → 0 is the jth column of F′(x). This discrete approximation to Newton’s method and
its many variations are closely related to secant methods and interpolation methods, which
are discussed in texts on the numerical solution of nonlinear systems.
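To illustrate, here is a compact Python sketch of this discrete Newton method for small dense systems: the Jacobian is approximated column by column with the forward difference quotient above, and the linear system F′(xn)δ = −F(xn) is solved by Gaussian elimination. The step size h = 1e-7 and all helper names are illustrative choices.

```python
def jacobian_fd(F, x, h=1e-7):
    """Forward-difference approximation to the Jacobian, column j at a time."""
    n = len(x)
    Fx = F(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xh = list(x)
        xh[j] += h                      # x + h e_j
        Fxh = F(xh)
        for i in range(n):
            J[i][j] = (Fxh[i] - Fx[i]) / h
    return J

def solve(J, b):
    """Gaussian elimination with partial pivoting for J d = b
    (assumes J is non-singular, as F' must be near the root)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(J)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            m = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= m * M[k][j]
    d = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * d[j] for j in range(i + 1, n))
        d[i] = (M[i][n] - s) / M[i][i]
    return d

def newton_system(F, x0, tol=1e-10, max_iter=50):
    """Solve F(x) = 0: linearize, solve F'(x_n) d = -F(x_n), update."""
    x = list(x0)
    for _ in range(max_iter):
        Fx = F(x)
        d = solve(jacobian_fd(F, x), [-v for v in Fx])
        x = [xi + di for xi, di in zip(x, d)]
        if max(abs(di) for di in d) < tol:
            return x
    return x

# Example: intersect the unit circle with the line x = y;
# the root near the initial guess (1, 0.5) is (sqrt(2)/2, sqrt(2)/2).
sol = newton_system(lambda v: [v[0]**2 + v[1]**2 - 1.0, v[0] - v[1]],
                    [1.0, 0.5])
```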
