
Marcus Overhaus, Ana Bermudez, Hans Buehler, Andrew Ferraris, Christopher Jordinson, Aziz Lamnouar - Equity Hybrid Derivatives-Wiley (2007)

Equity Hybrid

Derivatives

MARCUS OVERHAUS
ANA BERMÚDEZ
HANS BUEHLER
ANDREW FERRARIS
CHRISTOPHER JORDINSON
AZIZ LAMNOUAR

John Wiley & Sons, Inc.


Copyright © 2007 by Marcus Overhaus, Aziz Lamnouar, Ana Bermúdez, Hans Buehler,
Andrew Ferraris, and Christopher Jordinson. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey.


Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form
or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as
permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior
written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to
the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax
(978) 646-8600, or on the Web at www.copyright.com. Requests to the Publisher for permission should
be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ
07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in
preparing this book, they make no representations or warranties with respect to the accuracy or
completeness of the contents of this book and specifically disclaim any implied warranties of
merchantability or fitness for a particular purpose. No warranty may be created or extended by sales
representatives or written sales materials. The advice and strategies contained herein may not be suitable
for your situation. You should consult with a professional where appropriate. Neither the publisher nor
author shall be liable for any loss of profit or any other commercial damages, including but not limited
to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our
Customer Care Department within the United States at (800) 762-2974, outside the United States at
(317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may
not be available in electronic books. For more information about Wiley products, visit our Web site at
www.wiley.com.

Library of Congress Cataloging-in-Publication Data:


Equity hybrid derivatives / Marcus Overhaus [et al.].
p. cm. — (Wiley finance series)
Includes bibliographical references and index.
ISBN-13: 978-0-471-77058-9 (cloth)
ISBN-10: 0-471-77058-2 (cloth)
1. Derivative securities. 2. Convertible securities. I. Overhaus,
Marcus. II. Title. III. Series.
HG6024.A3E684 2006
332.64 57—dc22
2006005369

Printed in the United States of America.


10 9 8 7 6 5 4 3 2 1
Contents

Preface ix
PART ONE
Modeling Volatility
CHAPTER 1
Theory 3
1.1 Concepts of Equity Modeling 3
1.1.1 The Forward 5
1.1.2 The Shape of Dividends to Come 6
1.1.3 European Options on the Pure Stock Process 10
1.2 Implied Volatility 11
1.2.1 Sticky Volatilities 13
1.3 Fitting the Market 16
1.3.1 Arbitrage-Free Option Price Surfaces 16
1.3.2 Implied Local Volatility 17
1.3.3 European Payoffs 21
1.3.4 Fitting the Market with Discrete Martingales 23
1.4 Theory of Replication 27
1.4.1 Replication in Diffusion-Driven Markets 30

CHAPTER 2
Applications 35
2.1 Classic Equity Models 35
2.1.1 Heston 35
2.1.2 SABR 43
2.1.3 Scott’s Exponential Ornstein-Uhlenbeck Model 45
2.1.4 Other Stochastic Volatility Models 45
2.1.5 Extensions of Heston’s Model 46
2.1.6 Cliquets 49
2.1.7 Forward-Skew Propagation 52
2.2 Variance Swaps, Entropy Swaps, Gamma Swaps 56
2.2.1 Variance Swaps 58
2.2.2 Entropy Swaps 68
2.2.3 Gamma Swaps 69
2.3 Variance Swap Market Models 71
2.3.1 Finite Dimensional Parametrizations 76
2.3.2 Examples 79
2.3.3 Fitting to the Market 83


PART TWO
Equity Interest Rate Hybrids
CHAPTER 3
Short-Rate Models 91
3.1 Introduction 91
3.2 Ornstein-Uhlenbeck Models 94
3.3 Calibrating to the Yield Curve 95
3.3.1 Hull-White Model 95
3.3.2 Generic Ornstein-Uhlenbeck Models 98
3.4 Calibrating the Volatility 100
3.4.1 Hull-White/Vasicek 101
3.4.2 Generic Ornstein-Uhlenbeck Models 104
3.5 Pricing Hybrids 105
3.5.1 Finite Differences 106
3.5.2 Monte Carlo 107
3.6 Appendix: Least-Squares Minimization 109
3.6.1 Newton-Raphson Method 110
3.6.2 Broyden’s Method 110

CHAPTER 4
Hybrid Products 112
4.1 The Effects of Assuming Stochastic Rates 112
4.2 Conditional Trigger Swaps 115
4.3 Target Redemption Notes 118
4.3.1 Structure 118
4.3.2 Back-Testing 120
4.3.3 Valuation Approach 123
4.3.4 Hedging 127
4.4 Convertible Bonds 128
4.4.1 Introduction 128
4.4.2 The Governing Equation 131
4.4.3 Detailed Specification of the Model 134
4.4.4 Analytical Solutions for a Special CB 137
4.5 Exchangeable Bonds 138
4.5.1 The Valuation PDE 138
4.5.2 Coordinate Transformations for Numerical Solution 140

CHAPTER 5
Constant Proportion Portfolio Insurance 145
5.1 Introduction to Portfolio Insurance 145
5.2 Classical CPPI 146
5.3 Restricted CPPI 149
5.3.1 Constraints on the Investment Level 149
5.3.2 Constraints on the Floor 149
5.3.3 An Example Structure 151
5.4 Options on CPPI 152
5.4.1 The Pricing 152
5.4.2 Delta, Gamma, and Vega Exposures 152
5.4.3 Hedging 152
5.5 Nonstandard CPPIs 153
5.5.1 Complex Fee Structures 153
5.5.2 Dynamic Gearing 154
5.5.3 Perpetual CPPI 154
5.5.4 Flexi-Portfolio CPPI 155
5.5.5 Off-Balance-Sheet CPPI 156
5.6 CPPI as an Underlying 158
5.7 Other Issues Related to the CPPI 158
5.7.1 Liquidity Issues (Hedge Funds) 158
5.7.2 Assets Suitable for CPPIs 158
5.8 Appendixes 159
5.8.1 Appendix A 159
5.8.2 Appendix B 160
5.8.3 Appendix C 161

PART THREE
Equity Credit Hybrids
CHAPTER 6
Credit Modeling 167
6.1 Introduction 167
6.2 Background on Credit Modeling 167
6.2.1 Structural Approach 168
6.2.2 Reduced-Form Approach 171
6.3 Modeling Equity Credit Hybrids 175
6.3.1 Dynamics of the Hazard Rate 175
6.3.2 Model Choice 176
6.4 Pricing 180
6.4.1 Credit Default Swap 180
6.4.2 Credit Default Swaption 181
6.4.3 European Call 184
6.5 Calibration 186
6.5.1 Stripping of Hazard Rate 186
6.5.2 Calibration of the Hazard Rate Process 187
6.5.3 Calibration of the Equity Volatility 188
6.5.4 Discussion 188
6.6 Introduction of Discontinuities 188
6.6.1 The New Framework 189
6.6.2 Dynamics of the Survival Probability 189
6.6.3 Pricing of European Options 190
6.6.4 Fourier Pricing 194
6.7 Equity Default Swaps 196
6.7.1 Modeling Equity Default Swaps 198
6.7.2 Single-Name EDSs in a Deterministic Hazard Rate Model 198
6.8 Conclusion 203

PART FOUR
Advanced Pricing Techniques
CHAPTER 7
Copulas Applied to Derivatives Pricing 207
7.1 Introduction 207
7.2 Theoretical Background of Copulas 207
7.2.1 Definitions 207
7.2.2 Measures of Dependence 209
7.2.3 Copulas and Stochastic Processes 211
7.2.4 Some Popular Copulas 213
7.3 Factor Copula Framework 217
7.4 Applications to Derivatives Pricing 218
7.4.1 Equity Derivatives: The Altiplano 218
7.4.2 Credit Derivatives: Basket and Tranche Pricing 223
7.5 Conclusion 228

CHAPTER 8
Forward PDEs and Local Volatility Calibration 229
8.1 Introduction 229
8.1.1 Local and Implied Volatilities 229
8.1.2 Dupire’s Formula and Its Problems 231
8.1.3 Dupire-like Formula in Multifactor Models 232
8.2 Forward PDEs 233
8.3 Pure Equity Case 235
8.4 Local Volatility with Stochastic Interest Rates 238
8.5 Calibrating the Local Volatility 242
8.6 Special Case: Vasicek Plus a Term Structure of Equity Volatilities 244

CHAPTER 9
Numerical Solution of Multifactor Pricing Problems Using
Lagrange-Galerkin with Duality Methods 248
9.1 Introduction 248
9.2 The Modeling Framework: A General D-factor Model 250
9.2.1 Strong Formulation of the Linear Problem:
Partial Differential Equations 251
9.2.2 Truncation of the Domain and Boundary Conditions 253
9.2.3 Strong Formulation of the Nonlinear Problem: Partial
Differential Inequalities 254
9.2.4 Weak Formulation of the Nonlinear Problem:
Variational Inequalities 256
9.3 Numerical Solution of Partial Differential Inequalities
(Variational Inequalities) 259
9.3.1 A Duality (or Lagrange Multiplier) Method 260
9.4 Numerical Solution of Partial Differential Equations (Variational
Equalities): Classical Lagrange-Galerkin Method 262
9.4.1 Semi-Lagrangian Time Discretization: Method
of Characteristics 262
9.4.2 Space Discretization: Galerkin Finite Element Method 265
9.4.3 Order of Classical Lagrange-Galerkin Method 270
9.5 Higher-Order Lagrange-Galerkin Methods 271
9.5.1 Crank-Nicolson Characteristics/Finite Elements 272
9.6 Application to Pricing of Convertible Bonds 279
9.6.1 Numerical Solution 280
9.6.2 Numerical Results 280
9.7 Appendix: Lagrange Triangular Finite Elements 285
9.7.1 Lagrange Triangular Finite Elements 285
9.7.2 Coefficients Matrix and Independent Term in Two
Dimensions 287

CHAPTER 10
American Monte Carlo 297
10.1 Introduction 297
10.2 Broadie and Glasserman 299
10.3 Regularly Spaced Restarts 299
10.4 The Longstaff and Schwartz Algorithm 301
10.4.1 The Algorithm 301
10.4.2 Example: A Call Option with Monthly
Bermudan Exercise 303
10.5 Accuracy and Bias 305
10.5.1 Extension: Regressing on In-the-Money Paths 306
10.5.2 Linear Regression 308
10.5.3 Other Regression Schemes 310
10.5.4 Upper Bounds 310
10.6 Parameterizing the Exercise Boundary 311

Bibliography 313

Index 323
Preface

Equity hybrid derivatives are a very young class of structures which have drawn a
lot of attention over the past two years for many different reasons. Equity hybrid
derivatives combine all existing, and therefore established, asset classes like equity,
credit, interest rate, foreign exchange, and commodity derivatives. Hence, they
present a very interesting challenge to combining different modeling techniques
and thereby forming a solid hybrid model framework. This is why we have again
decided to publish a book entirely concerned with this very interesting topic.
Hybrid derivatives are a strategic and profitable business that every serious top-tier
investment bank needs to offer to its client base and are therefore an integral part of
its derivatives business.
In this volume, we have not tried to write an introductory text: we have assumed
some prior familiarity with mathematics and finance. Part One of this book gives
insight into different volatility models (Heston, SABR, etc.) and their applications to
equity markets. It also contains some very recent developments such as variance
swap market models. Part Two gives a brief review of short rate models and their
incorporation into equity-interest rate hybrid structures. Important examples are
discussed, such as the conditional trigger swap (CTS), convertible bonds, and the
very popular CPPI structures. Part Three contains a thorough introduction to credit
modeling and its importance to equity-credit hybrid derivative structures. Pricing
and calibration techniques are also discussed in detail, and important examples
like the EDS (equity default swap) are given. Part Four is dedicated to advanced
pricing techniques applied to various hybrid and callable structures. We start with
copulas applied to equity and credit derivatives (Altiplanos and default baskets),
then discuss forward PDEs and local volatility calibration techniques and their
application to equity-rate hybrids. This is followed by a thorough presentation
of numerical solutions for multi-factor pricing problems, including an important
example, the convertible bond. Finally, we conclude with an exposition of American
Monte Carlo techniques for derivative pricing.
We would like to offer our special thanks to Professor Alexander Schied for
careful reading of the manuscript and valuable comments. We would also like to
express our gratitude to Kenji Felgenhauer, Eric Bensoussan, Peter Carr, and Maria
Noguieras.

The Authors

London
February 2006

PART ONE
Modeling Volatility
CHAPTER 1
Theory

In this chapter, we will introduce some basic concepts of equity modeling. We will
discuss how the stock price can be modeled in a framework with deterministic
interest rates, dividends, and default probabilities and how a given implied volatility
surface can be matched with Dupire’s ‘‘implied local volatility.’’ We also mention
alternatives and how European payoffs whose value depends only on the stock
price on a single maturity can be priced independent of further model assumptions
by hedging with European options. We also make a few remarks on theoretical
aspects of replication. This chapter is the foundation of chapter 2, where we
will discuss applications: various stochastic volatility models, pricing of Cliquets,
variance swaps, and related products and models to price options on variance. The
assumptions of deterministic interest rates and default risk probabilities are then
subsequently relaxed in the later chapters of this book.

1.1 CONCEPTS OF EQUITY MODELING


Since the main focus of this chapter is the modeling of the pure equity risk, we
will work with a framework where interest rates, dividends, and default risk are
deterministic. We will model the stock price on a stochastic base
$(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$. The measure $\mathbb{P}$ is the “historic” measure. We denote by
$r = (r_t)_{t \ge 0}$ the deterministic interest rates and we use $\mu = (\mu_t)_{t \ge 0}$ to refer to a
deterministic repo rate (it represents the gains we make from lending out a share). We will
also use $B$ for the cash bond (or “money market account”), that is,

$$B_t := e^{\int_0^t r_u \, du}$$

and

$$P(t, T) := e^{-\int_t^T r_s \, ds}$$

for the price at time t of a zero bond with maturity T (see chapter 3 for the case of
stochastic interest rates). We also assume that the stock can default: In the event of
default, whose time we denote by $\tau$, we stipulate that the value of the share drops
to zero. We also assume that corporate zero bonds of the company we want to
model are traded for all maturities, and that the value of any outstanding bonds will
also drop to zero in the event of default (in practice, this rarely happens: usually, a
bond will have some “recovery value,” which represents the fraction of the notional
that the defaulted company is still able to pay).¹ Since the “risky” corporate zero
bond can default, its price at any time prior to $\tau$ must be less than the price of the
“riskless” government zero bond with same maturity and notional: the zero bond
is trading at a spread. We will assume that this spread, or hazard rate, $h = (h_t)_{t \ge 0}$,
of the risky bond interest rate over the riskless rate r, is deterministic, such that the
price of the risky zero bond with maturity T at time t is given as

$$P^S(t, T) := 1_{\tau > t} \, e^{-\int_t^T (r_u + h_u) \, du}$$

(Our restriction to deterministic hazard rates will be lifted in chapter 6, where we
discuss approaches to model h as a stochastic process.) The default event itself
is then modeled as an inaccessible exponentially distributed stopping time with
intensity h, which is assumed to be independent from the filtration (i.e., of stock
price, interest rates, volatility, etc.).² Inaccessibility means that the default cannot
be foretold by observations of the stock price: It excludes, for example, stopping
times that are the result of the stock price crossing some barrier.³ In this setting, we
assume, under any pricing measure $\mathbb{Q}$ equivalent to $\mathbb{P}$, that

$$\mathbb{Q}[\tau > T \mid \mathcal{F}_t] = 1_{\tau > t} \, e^{-\int_t^T h_s \, ds}, \tag{1.1}$$

which implies the intuitive relation $P^S(t, T) = \mathbb{Q}[\tau > T \mid \mathcal{F}_t] \, P(t, T)$ (i.e., the price
of the risky bond is the price of the riskless bond times the probability of survival).
Moreover, our assumption that zero bonds are available for all maturities implies
that we can roll over a capital investment of 1 into the risky zero bonds and thereby
generate a risky cash bond,

$$B^S_t := 1_{\tau > t} \, e^{\int_0^t (r_u + h_u) \, du},$$

which will also drop to zero in the event of default.
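The relation between the riskless bond, the risky bond, and the survival probability can be checked numerically. The following is a minimal sketch that assumes constant rates r and h (the text allows general deterministic functions); the function names and the numerical values are illustrative only.

```python
import math

def riskless_zero_bond(t, T, r):
    """P(t, T) for a constant short rate r (simplifying assumption)."""
    return math.exp(-r * (T - t))

def survival_probability(t, T, h):
    """Probability of no default up to T, given survival up to t, for a constant hazard rate h."""
    return math.exp(-h * (T - t))

def risky_zero_bond(t, T, r, h, defaulted=False):
    """P^S(t, T) with zero recovery: worthless once default has occurred."""
    if defaulted:
        return 0.0
    return math.exp(-(r + h) * (T - t))

r, h = 0.03, 0.02  # illustrative values, not taken from the text
p = riskless_zero_bond(0.0, 5.0, r)
ps = risky_zero_bond(0.0, 5.0, r, h)
q = survival_probability(0.0, 5.0, h)
print(ps, p * q)  # the two numbers should agree up to rounding
```

With time-dependent rates, the products r * (T - t) and h * (T - t) would be replaced by the corresponding integrals.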


As mentioned above, the availability of both the risky and the riskless bond
allows us to synthesize a payoff of 1 if the company has defaulted up to some
maturity T. As a consequence, we can hedge out the risk of default when pricing an
option. A put written on the stock $S = (S_t)_{t \ge 0}$, for example, can be decomposed into

$$(K - S_T)^+ = (K - S_T)^+ \, 1_{\tau > T} + K \, 1_{\tau \le T}$$

Hence, we can split its value into the “default value” $K 1_{\tau \le T}$, which we can hedge by
entering into a long position in a riskless zero bond and a short position in a risky
zero bond, and the “survival value” $(K - S_T)^+ 1_{\tau > T}$, whose hedge we can approach
using standard replication theory, as explained in section 1.4.
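The decomposition of the put payoff can be illustrated with a few lines of code. This is a sketch under the assumptions of the text (the share is worth zero after default) plus constant r and h for the bond prices; all function names are hypothetical.

```python
import math

def put_payoff(K, S_T):
    """Plain put payoff (K - S_T)^+."""
    return max(K - S_T, 0.0)

def decomposed_put_payoff(K, S_T_if_alive, defaulted):
    """Survival value (K - S_T)^+ 1_{tau > T} plus default value K 1_{tau <= T}."""
    survival = 0.0 if defaulted else max(K - S_T_if_alive, 0.0)
    default = K if defaulted else 0.0
    return survival + default

def default_value_price(K, T, r, h):
    """Time-0 price of K 1_{tau <= T}: long K riskless and short K risky zero bonds.

    Constant r and h are an assumption made for this illustration.
    """
    return K * (math.exp(-r * T) - math.exp(-(r + h) * T))
```

Since the stock is zero after default, evaluating `put_payoff(K, 0.0)` gives the same number as the default branch of `decomposed_put_payoff`.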

¹ Section 9.6 on page 279 covers the pricing of convertible bonds under various more detailed
assumptions.
² Blanchet-Scalliet/Jeanblanc [1] provide a good introduction to intensity models.
³ Mathematically speaking, an accessible stopping time $\tau$ can be approximated by an increasing
sequence of stopping times $(\tau_n)_n$ with $\tau_n \uparrow \tau$ for $n \to \infty$.

1.1.1 The Forward


We also assume that the stock pays dividends. On each of the dividend dates
$0 =: \tau_0 < \tau_1 < \tau_2 < \cdots$, we assume that first a proportional dividend of $\beta_k = 1 - e^{-d_k}$
and then a cash dividend of $\alpha_k$ is paid. As a result, the stock price at a dividend
date will first be multiplied by the relative factor $e^{-d_k} = 1 - \beta_k$ and then drop by the
absolute amount $\alpha_k$. Of course, dividends are only paid if the stock did not yet default.
Hence, if we assume that the stock price process $S = (S_t)_{t \ge 0}$ only jumps due to dividends,
at each dividend date $\tau_k < \tau$, we have

$$S_{\tau_k} = S_{\tau_k -} \, e^{-d_k} - \alpha_k \tag{1.2}$$

In this setting, let us derive the value of a forward contract with delivery time T:
Assume first we buy $\Delta$ shares today and that we short $\Delta S_0$ riskless zero bonds in
order to borrow the required initial capital.⁴ Since we hold the stock we will earn
repo and receive the dividends it pays. To handle them, we decide to reinvest all
proportional dividends and proceeds from repo contracts into the stock. Since we
receive as many cash dividends as we hold units of stock, this implies that at any
time $\tau_k$ before default, we receive an amount of

$$\Delta \, \alpha_k \, e^{\int_0^{\tau_k} \mu_u \, du + \sum_{j \le k} d_j}$$

We will use these proceeds to buy back our initially issued debt. If the stock does
not default, this implies that at time T, we hold

$$\Delta \, e^{\int_0^T \mu_u \, du + \sum_{k: \tau_k \le T} d_k}$$

units of stock and that we are short

$$\Delta S_0 \, e^{\int_0^T r_u \, du} - \sum_{k: \tau_k \le T} \Delta \, \alpha_k \, e^{\int_0^{\tau_k} \mu_u \, du + \sum_{j \le k} d_j} \, e^{\int_{\tau_k}^T r_u \, du}
= \Delta \left( S_0 - \sum_{k: \tau_k \le T} \alpha_k \, e^{-\int_0^{\tau_k} (r_u - \mu_u) \, du + \sum_{j \le k} d_j} \right) e^{\int_0^T r_u \, du}$$

units of the zero bond. In order to be able to deliver exactly one share at time T, we
choose $\Delta = e^{-\int_0^T \mu_u \, du - \sum_{k: \tau_k \le T} d_k}$ such that our terminal capital reads

$$K_{\text{no default}} = S_0 \, e^{\int_0^T (r_u - \mu_u) \, du - \sum_{k: \tau_k \le T} d_k} - \sum_{k: \tau_k \le T} \alpha_k \, e^{\int_{\tau_k}^T (r_u - \mu_u) \, du - \sum_{j: \tau_k < \tau_j \le T} d_j},$$

which is therefore the fair strike conditional on no default. However, if there is a
default at some time $0 < \tau \le T$, then we will forgo the dividends thereafter. Hence,
if we receive K for the share at T, we will be short the missing dividend amounts.

⁴ We implicitly assume that we ourselves cannot default.

To protect ourselves against default, we need a mechanism which ensures that our
terminal bank account always has the same value at time T, be there default or
not. This can be achieved if we “forward-sell” the proceeds of the dividends. To
this end, we sell “risky” (corporate) zero bonds with maturities $0 < \tau_1 < \cdots < \tau_N$
(where $\tau_N \le T < \tau_{N+1}$). Each bond has a notional of

$$\alpha_k \, e^{-\int_{\tau_k}^T \mu_u \, du - \sum_{j: \tau_k < \tau_j \le T} d_j},$$

and since we hold the appropriate amount of shares at any time before default,
we will always be able to fulfill our obligations arising from shorting the bonds.
However, since the bond is risky, we have to pay a risk premium of h to the buyers
of these bonds, so shorting the bonds yields only

$$\alpha_k \, e^{-\int_0^{\tau_k} (r_u + h_u) \, du} \, e^{-\int_{\tau_k}^T \mu_u \, du - \sum_{j: \tau_k < \tau_j \le T} d_j}$$

Summing up, we find that the forward strike on S with maturity t is given as

$$F_t = S_0 \, e^{\int_0^t (r_u - \mu_u) \, du - \sum_{k: \tau_k \le t} d_k} - \sum_{k: \tau_k \le t} \alpha_k \, e^{-\int_0^{\tau_k} h_u \, du} \, e^{\int_{\tau_k}^t (r_u - \mu_u) \, du - \sum_{j: \tau_k < \tau_j \le t} d_j} \tag{1.3}$$

Hence, in the absence of cash dividends, the fair forward strike for an asset does not
depend on the default risk involved. Note that F must in all cases be non-negative
due to no-arbitrage constraints.
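The forward strike can be evaluated directly once the rates and the dividend schedule are given. The sketch below assumes constant r, repo rate mu, and hazard rate h, and represents each dividend as a hypothetical tuple (date, proportional log-dividend d_k, cash dividend alpha_k); it is meant as an illustration of formula (1.3), not as a production implementation.

```python
import math

def forward(S0, t, r, mu, h, dividends):
    """Forward strike (1.3) for constant rates r, mu, h (simplifying assumption).

    dividends: list of (tau_k, d_k, alpha_k) tuples, one per dividend date.
    """
    paid = [(tau, d, a) for (tau, d, a) in dividends if tau <= t]
    total_d = sum(d for (_, d, _) in paid)
    value = S0 * math.exp((r - mu) * t - total_d)
    for (tau_k, _, alpha_k) in paid:
        # Proportional dividends strictly after tau_k and up to t.
        later_d = sum(d for (tau_j, d, _) in paid if tau_j > tau_k)
        value -= (alpha_k * math.exp(-h * tau_k)
                  * math.exp((r - mu) * (t - tau_k) - later_d))
    return value
```

Calling `forward` with purely proportional dividends (all alpha_k equal to zero) returns the same value for any h, which is the observation made in the text; with cash dividends, a larger h increases the forward because the forward-sold dividends are worth less.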

1.1.2 The Shape of Dividends to Come


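Given the form of the forward (1.3), what are the implications for potential stock
price processes? Between the discrete cash dividends, standard no-arbitrage arguments
show that if there is “no free lunch with vanishing risk,”⁵ then there exists a
measure $\mathbb{Q}$, equivalent to $\mathbb{P}$, and a local $\mathbb{Q}$-martingale Y such that

$$S_t = S_{\tau_{k-1}} \, \frac{Y_t}{Y_{\tau_{k-1}}} \, e^{\int_{\tau_{k-1}}^t (r_u + h_u - \mu_u) \, du} \, 1_{\tau > t}$$

holds for $t \in [\tau_{k-1}, \tau_k)$. In $\tau_k$, equation (1.2) applies and we get, added up,

$$S_t = S_0 \, e^{\int_0^t (r_u + h_u - \mu_u) \, du - \sum_{k: \tau_k \le t} d_k} \, Y_t - \sum_{k: \tau_k \le t} \alpha_k \, e^{\int_{\tau_k}^t (r_u + h_u - \mu_u) \, du - \sum_{j: \tau_k < \tau_j \le t} d_j} \, \frac{Y_t}{Y_{\tau_k}} \quad \text{on } t < \tau. \tag{1.4}$$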

⁵ The notion of “no free lunch with vanishing risk” is a stronger form of “no arbitrage.”
Only the former is equivalent to the existence of a local martingale measure, and we will
always assume we are in this setting. See Delbaen/Schachermayer [2] for a detailed analysis
and examples.

However, Y is also subject to the constraint that the process S cannot become
negative. We will now investigate the impact of this property. The following
discussion holds for local martingales in general, but we focus on the relevant true
martingale case. For ease of exposition, let us briefly assume that $S_0 = 1$, $r \equiv 0$, $\mu \equiv 0$,
$h \equiv 0$ and $d_i \equiv 0$, that is, with $\tau = \infty$,

$$S_t = Y_t - \sum_{k: \tau_k \le t} \alpha_k \, \frac{Y_t}{Y_{\tau_k}} \tag{1.5}$$
At any point t, the forward of the stock to a later date $T > t$ must remain non-negative.
This implies that at any dividend date $\tau_k$, the stock price must exceed the
value of all forthcoming dividends,

$$S_t = S_{\tau_k} \, \frac{Y_t}{Y_{\tau_k}} \ge \sum_{j > k} \alpha_j, \qquad t \in [\tau_k, \tau_{k+1})$$

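Since a martingale M with unit mean that is bounded from below by some $c \in [0, 1]$
can be written as $M_t = \tilde{M}_t (1 - c) + c$ in terms of a non-negative martingale $\tilde{M}$
which has again unit mean, we can write Y in terms of some non-negative martingale
X with unit mean as

$$\frac{Y_t}{Y_{\tau_k}} = \frac{X_t}{X_{\tau_k}} \left( 1 - \frac{\sum_{j > k} \alpha_j}{S_{\tau_k}} \right) + \frac{\sum_{j > k} \alpha_j}{S_{\tau_k}}, \qquad t \in [\tau_k, \tau_{k+1})$$

By induction over k it follows that

$$S_t = \left( 1 - \sum_{k \ge 1} \alpha_k \right) X_t + \sum_{k: \tau_k > t} \alpha_k$$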

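In the case of nonzero interest rates, repo rates, and default intensity, we have more
generally:

Result 1 There exists a non-negative martingale X with unit mean such that

$$S_t = F_t X_t + A_t \quad \text{on } t < \tau \tag{1.6}$$

with

$$F_t := \left( S_0 - \sum_{k \ge 1} \tilde{\alpha}_k \right) R_t \tag{1.7}$$

$$A_t := R_t \sum_{k: \tau_k > t} \tilde{\alpha}_k \tag{1.8}$$

$$R_t := e^{\int_0^t (r_u + h_u - \mu_u) \, du - \sum_{k: \tau_k \le t} d_k}$$

and

$$\tilde{\alpha}_k := \frac{\alpha_k}{R_{\tau_k}}$$

We call X the “pure” stock price process.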
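The affine representation of Result 1 can be turned into a small calculator for the stock price given a value of the pure process. The following sketch assumes constant rates and a finite list of dividends given as hypothetical tuples (date, d_k, alpha_k); the function names `R`, `F`, `A`, and `stock` simply mirror the symbols of the result.

```python
import math

def R(t, r, mu, h, dividends):
    """Growth factor R_t for constant rates; dividends = [(tau_k, d_k, alpha_k), ...]."""
    total_d = sum(d for (tau, d, _) in dividends if tau <= t)
    return math.exp((r + h - mu) * t - total_d)

def alpha_tilde(k, r, mu, h, dividends):
    """Cash dividend k discounted with R to time 0."""
    tau_k, _, alpha_k = dividends[k]
    return alpha_k / R(tau_k, r, mu, h, dividends)

def F(t, S0, r, mu, h, dividends):
    """Scaling factor (1.7) of the pure stock price."""
    total = sum(alpha_tilde(k, r, mu, h, dividends) for k in range(len(dividends)))
    return (S0 - total) * R(t, r, mu, h, dividends)

def A(t, r, mu, h, dividends):
    """Floor (1.8): value of all cash dividends still to come after t."""
    future = sum(alpha_tilde(k, r, mu, h, dividends)
                 for k, (tau, _, _) in enumerate(dividends) if tau > t)
    return R(t, r, mu, h, dividends) * future

def stock(t, X_t, S0, r, mu, h, dividends):
    """Stock price before default for a given value X_t of the pure process."""
    return F(t, S0, r, mu, h, dividends) * X_t + A(t, r, mu, h, dividends)
```

Evaluating `stock` just before and at a dividend date, for the same value of X, reproduces the jump condition (1.2), and `stock(0.0, 1.0, ...)` returns S0.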



The implication of the previous result is that we can focus on the modeling of the
pure martingale part X instead of modeling S itself. This will be the subject of this
part of the book. Extension to the case of stochastic interest rates or stochastic
default intensities is presented in the later chapters of this book.
The previous remarks also allow us to derive the form of the total return version
of the stock: Here, we reinvest the proceeds from repo rate and dividends directly
back into the asset, as soon as they occur.

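Result 2 The total return process $S^{(TR)}$ of the stock is given as

$$S^{(TR)}_t = \left( S_0 - \sum_{k: \tau_k > t} \alpha_k \, e^{-\int_0^{\tau_k} (r_u + h_u) \, du} \right) e^{\int_0^t (r_u + h_u) \, du} \, X_t \, 1_{\tau > t} + \sum_{k: \tau_k > t} \alpha_k \, e^{\int_{\tau_k}^t (r_u + h_u) \, du} \, 1_{\tau > t}$$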

We can also go a step further: Since we are sure of the dividends we will receive,
we may forward-sell them. To this end, assume that we buy one share and that we
write risky zero bonds for $\tau_1 < \tau_2 < \cdots$ with notionals
$a_k := \alpha_k \, e^{\int_t^{\tau_k} \mu_u \, du + \sum_{j: t < \tau_j \le \tau_k} d_j}$.
We will be able to honor the respective bond obligations if we reinvest the proceeds
from repo rates and continuous dividends into the stock. The gain from forward-selling
the dividends will be precisely $A_0$, as defined in (1.8). Hence, the overall price
process of this asset is

$$S^{(\text{plain})}_t = S_0 \, e^{\int_0^t (r_u + h_u) \, du} \, X_t \, 1_{\tau > t} = S_0 B^S_t X_t \tag{1.9}$$

The crucial observation is that $S^{(\text{plain})}_t$ is tradable (i.e., available for hedging purposes).
This will be used in section 1.4.

Ito in the Presence of Dividends The process (1.6) exhibits jumps, in which case
the standard Ito formula does not hold anymore. In our case these jumps are of
finite variation, which essentially implies that if we apply Ito to some $f \in C^2$, then
the second derivative of f will be integrated over the quadratic variation of the
“continuous part” of S only. For convenience, let us reformulate (1.6) in terms of
purely proportional but then stochastic discrete dividends. To this end we first define
the deterministic functions

$$\alpha_t := \sum_k \alpha_k \, 1_{\tau_k = t} \quad \text{and} \quad d_t := \sum_k d_k \, 1_{\tau_k = t}, \tag{1.10}$$

which are nonzero only on the dividend dates $(\tau_k)_{k = 1, \ldots}$. They represent the fixed
and proportional dividends paid at each time t. Accordingly,

$$S_t = S_{t-} \, e^{-d_t} - \alpha_t \tag{1.11}$$

We can then define the stochastic “proportional dividend process” by accumulating
the cash dividends into the exponential drift of the stock as

$$D_t := -\log \frac{S_t}{S_{t-}} = -\log \left( e^{-d_t} - \frac{\alpha_t}{S_{t-}} \right),$$

which gives

$$S_t = X_t \, e^{\int_0^t (r_u - \mu_u) \, du - \sum_{u \le t} D_u} \tag{1.12}$$

Of course, if there are no fixed cash dividends $\alpha$, then $D = d$ is deterministic. The
SDE of S can now be written as

$$\frac{dS_t}{S_{t-}} = (r_t - \mu_t) \, dt - (1 - e^{-D_t}) \, \delta_t(dt) + \frac{dX_t}{X_t}, \tag{1.13}$$

where $\delta_t(\cdot)$ denotes the Dirac measure in t. If X is continuous, and if its quadratic
variation is absolutely continuous with respect to the Lebesgue measure, then there
exists an integrable short-variance process $\zeta = (\zeta_t)_{t \ge 0}$ and a Brownian motion B
such that

$$X_t = \mathcal{E}_t \left( \int_0^{\cdot} \sqrt{\zeta_u} \, dB_u \right),$$

where we have used the Doléans-Dade exponential

$$\mathcal{E}_t(Z) := e^{Z_t - \frac{1}{2} \langle Z \rangle_t}$$

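In this case, Ito's formula for S and $f \in C^2$ (or finite and convex) becomes

$$df(S_t) = f'(S_{t-}) \, dS_t + \frac{1}{2} f''(S_{t-}) \, S_{t-}^2 \zeta_t \, dt - f'(S_{t-}) \, \Delta S_t + \Delta f(S_t) \tag{1.14}$$

where $S_{t-}^2 \zeta_t \, dt$ is the quadratic variation of the continuous part of S.⁶ In integral
form, (1.14) reads

$$f(S_T) = f(S_0) + \int_0^T f'(S_{t-}) \, S_{t-} \left( (r_t - \mu_t) \, dt + \sqrt{\zeta_t} \, dB_t \right) + \frac{1}{2} \int_0^T f''(S_{t-}) \, S_{t-}^2 \zeta_t \, dt + \sum_{t \le T} \left( f(S_{t-} e^{-D_t}) - f(S_{t-}) \right)$$

Also note that the quadratic variation of S is given as

$$\langle S \rangle_T = \int_0^T S_{t-}^2 \zeta_t \, dt + \sum_{t \le T} \left( e^{-D_t} - 1 \right)^2 S_{t-}^2$$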
⁶ If f is finite and convex, $f''$ exists as a positive measure. For example, the second derivative
of $f(x) := x^+$ is the Dirac measure in zero, $\delta_0$.
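The conversion of a mixed proportional and cash dividend into a single, state-dependent proportional dividend can be sketched as follows. The function names are hypothetical, and the example only treats one dividend date at a time.

```python
import math

def effective_proportional_dividend(S_before, d_t, alpha_t):
    """D_t = -log(e^{-d_t} - alpha_t / S_{t-}) for one dividend date.

    The ex-dividend price must remain positive for the logarithm to exist.
    """
    ratio = math.exp(-d_t) - alpha_t / S_before
    if ratio <= 0.0:
        raise ValueError("cash dividend exceeds the stock price after the proportional dividend")
    return -math.log(ratio)

def jump_quadratic_variation(S_before, D_t):
    """Contribution (e^{-D_t} - 1)^2 S_{t-}^2 of a single dividend date to the quadratic variation."""
    return (math.exp(-D_t) - 1.0) ** 2 * S_before ** 2
```

By construction, `S_before * math.exp(-D_t)` equals the ex-dividend price from (1.11), and without a cash dividend the function returns d_t itself.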

1.1.3 European Options on the Pure Stock Process


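Since S is an affine transformation (1.6) of the pure stock price X, we can express
the prices of European options on the former in terms of prices of European options
on the latter. Indeed, let

$$C(T, K) := B_T^{-1} \, \mathbb{E}_{\mathbb{Q}} \left[ (S_T - K)^+ \right] \tag{1.15}$$

Then,

$$C(T, K) = P(0, T) \, F_T \, \mathbb{E}_{\mathbb{Q}} \left[ 1_{\tau > T} \left( X_T - \frac{K - A_T}{F_T} \right)^+ \right] = P^S(0, T) \, F_T \, \mathcal{C} \left( T, \frac{K - A_T}{F_T} \right),$$

where we define

$$\mathcal{C}(T, k) := \mathbb{E}_{\mathbb{Q}} \left[ (X_T - k)^+ \right], \tag{1.16}$$

which is the price of a call on the pure stock price with strike k.⁷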

Result 3 The call price on a stock S is given in terms of a call on the pure stock
price X as

$$C(T, K) = P^S(0, T) \, F_T \, \mathcal{C} \left( T, \frac{K - A_T}{F_T} \right) \tag{1.17}$$

Hence, if call prices $C(T, K)$ are available for all strikes and maturities, we can derive
the respective prices $\mathcal{C}(T, k)$ for all “pure strikes” k from the market via

$$\mathcal{C}(T, k) := \frac{1}{P^S(0, T) \, F_T} \, C(T, k F_T + A_T) \tag{1.18}$$

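By put/call parity, the price of a put on S with strike K and maturity T is given as

$$\mathrm{Put}(T, K) := B_T^{-1} \, \mathbb{E}_{\mathbb{Q}} \left[ (K - S_T)^+ \right] = C(T, K) + P(0, T) K - P(0, T) F_T,$$

with $F_T$ here denoting the forward (1.3), which implies the obvious lower bound

$$B_T^{-1} \, \mathbb{E}_{\mathbb{Q}} \left[ (K - S_T)^+ \right] \ge \left( P(0, T) - P^S(0, T) \right) K = P(0, T) \, \mathbb{Q}[\tau \le T] \, K$$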

⁷ Strictly speaking, we can call $\mathcal{C}(T, k)$ the price of the respective call on X only if either
the market is complete (i.e., $\mathbb{Q}$ is unique) or if the call on S with strike $K = k F_T + A_T$ and
maturity T is quoted in the market, in which case its price is given under any $\mathbb{P}$-equivalent
martingale measure by (1.15).

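Consequently, the “pure” put $\mathcal{P}(T, k) := \mathbb{E}_{\mathbb{Q}} \left[ (k - X_T)^+ \right]$ on X is given in terms of
the put $\mathrm{Put}(T, K)$ on the original stock S as

$$\mathcal{P}(T, k) = \frac{1}{P^S(0, T) \, F_T} \left[ \mathrm{Put}(T, k F_T + A_T) - \left( P(0, T) - P^S(0, T) \right) \left( k F_T + A_T \right) \right] \tag{1.19}$$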

The above results imply that as long as we consider markets where only the
stock price process S and European options are liquidly traded, we can focus entirely
on the process X. The above equations, (1.17) and (1.18), respectively, allow us to
convert one representation into the other. We will frequently switch between the
two objects S and X, depending on the application.
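The switch between the two representations amounts to an affine change of strike and a rescaling of the price. A minimal sketch, in which the scaling factor F_T, the floor A_T, and the risky discount factor P^S(0, T) are passed in as plain numbers and all function names are hypothetical:

```python
def pure_strike(K, F_T, A_T):
    """Map a strike K on S to the pure strike k = (K - A_T) / F_T."""
    return (K - A_T) / F_T

def stock_strike(k, F_T, A_T):
    """Inverse map K = k * F_T + A_T."""
    return k * F_T + A_T

def stock_call_from_pure(pure_call, T, K, F_T, A_T, PS_0T):
    """Equation (1.17); pure_call(T, k) is any pricing function for calls on X."""
    return PS_0T * F_T * pure_call(T, pure_strike(K, F_T, A_T))

def pure_call_from_stock(stock_call, T, k, F_T, A_T, PS_0T):
    """Equation (1.18); stock_call(T, K) is a call price function on S, e.g. market quotes."""
    return stock_call(T, stock_strike(k, F_T, A_T)) / (PS_0T * F_T)
```

Applying `stock_call_from_pure` and then `pure_call_from_stock` with the same F_T, A_T, and P^S(0, T) returns the original pure call price, which is a convenient consistency check when moving between S and X.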

1.2 IMPLIED VOLATILITY

The most famous stock price model is the Black & Scholes model. In our framework
(1.6), it is given under the unique risk-neutral measure $\mathbb{Q}$ by assuming that X
is a geometric Brownian motion; that is,

$$\frac{dX_t}{X_t} = \sigma_t \, dW_t \tag{1.20}$$

for some non-negative function $\sigma$ and a $\mathbb{Q}$-Brownian motion W. The solution
to (1.20) is

$$X_t = e^{\int_0^t \sigma_u \, dW_u - \frac{1}{2} \int_0^t \sigma_u^2 \, du} \tag{1.21}$$

In fact, this model was introduced by Samuelson [3] and the time-dependent
version above is due to Merton [4], but in practice most people refer to it as the
Black-Scholes model (usually in the case without discrete dividends, though). The
crucial contribution by Black and Scholes [5] was not so much the model itself,
but the fundamental insight that any contingent claim $H(S_T)$ for a sufficiently well-behaved
function H can be replicated perfectly by continuous trading in the stock.
The impact of this insight cannot be overestimated: Ever since Black and Scholes
published their work, a huge industry has evolved at whose core lies the idea of
replication of otherwise risky payoffs. The bottom line of the idea is that since we
can replicate the payoff, there is no risk in selling a contingent claim. Hence, the
costs of replication are certain, and it is justified to call this cost the price of the
contingent claim (we will discuss this in more detail in section 1.4).
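In the Black-Scholes model, it is particularly easy to compute the prices of many
standard payoffs. A standard example is the price of a European call on X with
maturity T and “pure strike” k as defined in (1.16) (recall result 3, which shows
that it is sufficient to consider X rather than S). Its value is given as

$$\mathcal{C}(T, k) = \mathrm{BS} \left( T, k, \sqrt{\frac{1}{T} \int_0^T \sigma_u^2 \, du} \right)$$

in terms of the famous Black-Scholes formula

$$\mathrm{BS}(T, k, \sigma) := \mathcal{N}(d_+) - k \, \mathcal{N}(d_-) \quad \text{with} \quad d_\pm := \frac{-\ln k \pm \frac{1}{2} \sigma^2 T}{\sigma \sqrt{T}},$$

where $\mathcal{N}$ denotes the standard normal distribution function.
To price an option on S in Black and Scholes's framework, note that the price of a
call with maturity T and strike K is given as

$$\mathrm{BS}^S(T, K, \sigma^S) := P^S(0, T) \, F_T \, \mathrm{BS} \left( T, \frac{K - A_T}{F_T}, \sigma^S \right)$$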

Since the Black and Scholes formula is strictly increasing in $\sigma$, it is possible to
solve for the latter given a market call price $\hat{\mathcal{C}}(T, k)$. This yields the common measure
of implied volatility for the price of an option:

Definition 1.2.1 We call

$$\hat{\sigma}(T, k) := \mathrm{BS}(T, k, \cdot)^{-1} \left( \hat{\mathcal{C}}(T, k) \right) \tag{1.22}$$

the implied volatility of X at (T, k) and

$$\hat{\sigma}^S(T, K) := \mathrm{BS}^S(T, K, \cdot)^{-1} \left( \hat{C}(T, K) \right)$$

the implied volatility of S for (T, K). Note that by construction
$\hat{\sigma}(T, k) = \hat{\sigma}^S(T, k F_T + A_T)$.
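Because the Black-Scholes price is increasing in the volatility, the inversion in the definition can be carried out with any bracketing root finder. The sketch below uses plain bisection on the pure call price; the choice of bisection, the bracket [1e-8, 5.0], and the function names are assumptions made for this illustration, not the authors' method.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_pure_call(T, k, sigma):
    """Black-Scholes price of a call on X (unit forward, no discounting)."""
    if sigma <= 0.0 or T <= 0.0:
        return max(1.0 - k, 0.0)
    s = sigma * math.sqrt(T)
    d_plus = (-math.log(k) + 0.5 * s * s) / s
    d_minus = d_plus - s
    return norm_cdf(d_plus) - k * norm_cdf(d_minus)

def implied_vol(T, k, price, lo=1e-8, hi=5.0, tol=1e-10, max_iter=200):
    """Invert bs_pure_call in sigma by bisection (the price is increasing in sigma)."""
    if not (bs_pure_call(T, k, lo) <= price <= bs_pure_call(T, k, hi)):
        raise ValueError("price outside the attainable range")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if bs_pure_call(T, k, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

To obtain the implied volatility of S at a strike K, one would first convert the market price and strike to pure quantities via (1.18) and then call `implied_vol` with the pure strike.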

Interpretation It should be stressed that the notion of “implied volatility” does not
imply that we are actually using the Black-Scholes model. Indeed, it is evident from
quoted market prices that their model is no longer sufficient to evaluate contingent
claims. To see this, consider figure 1.1, where we have plotted implied volatilities
of STOXX50E.⁸
The effect that implied volatility $\hat{\sigma}(T, k)$ is a decreasing function of strike is called
skew. Most equity markets have such a shape, but some are less pronounced than
STOXX50E; for example, the Japanese N225, which is shown in figure 1.2.⁹ The
point is that implied volatility depends strongly on the strike across all maturities.
This means that the underlying stock price process cannot be explained using the
Black-Scholes model, for which the implied volatility does not depend on the strike.
Rather, we need to find a convenient model for X that is able to produce implied
volatility surfaces such as the ones displayed in the figures. When considering
alternative models, we should take into account the fact that the general shape of
implied volatility is remarkably stable: Figure 1.3 on page 14 shows how the implied
volatility surface of STOXX50E has changed in the last few years.

⁸ We refer to underlyings by their Reuters code.

⁹ Symmetric “smiles” are a common feature in FX markets. In other markets, such as
commodities, the skew might actually be upward sloping.
[Figure 1.1: surface plot titled “.STOXX50E Implied Volatility 09/12/2005”. Axes: implied
volatility (0 to 50), strike/spot (10% to 210%), maturity (09/06/2006 to 09/06/2024).]

FIGURE 1.1 Implied volatilities for different strikes and maturities for STOXX50E. The
graph shows a strong “skew” in strike direction for all maturities.

[Figure 1.2: surface plot titled “.N225 Implied Volatility 09/12/2005”. Axes: implied
volatility (0 to 50), strike/spot (10% to 210%), maturity (09/06/2006 to 09/06/2024).]

FIGURE 1.2 N225 features nearly a “smile”-type shape in strike.

1.2.1 Sticky Volatilities


[Figure 1.3: six surface plots of .STOXX50E implied volatility, dated 02/12/2002,
01/12/2003, 01/06/2004, 29/11/2004, 01/06/2005, and 05/12/2005. Axes: implied volatility
(0 to 50), strike/spot (50% to 140%), maturity.]

FIGURE 1.3 Historic STOXX50E implied volatility during the last few years.

Another interesting question is how implied volatility moves on an instantaneous
time scale when the stock price moves. Following Balland [6], we consider sticky
strike and sticky delta markets. In a sticky strike market, the implied volatility
$\hat\sigma^S_t(T,K)$ of an option on S with cash strike K is deterministic. In a sticky
delta market, on the other hand, the implied volatility “relative to the forward” or,
in our case, $\hat\sigma_t(T,k)$, is deterministic. The impact of the actual behavior of the
implied volatility can best be seen in the effect on the delta of a European option
(for notational simplicity, assume that $S \equiv X$). To this end, we write the implied
volatility $\hat\sigma^S(T,K)$ of S at time $t = 0$ as a function of $S_0$ as
$\hat\sigma^S_0(S_0,T,K)$. The price of a call with maturity T and strike K is then

$$\hat{\mathbb{C}}(S_0,T,K) \equiv \mathbb{C}^S\big(S_0,T,K,\hat\sigma^S_0(S_0,T,K)\big) \qquad (1.23)$$

In a sticky strike market, the function $\hat\sigma^S_0$ does not depend on $S_0$, that is,
$\hat\sigma^S_0(S_0,T,K) \equiv \Sigma(T,K)$ for some function $\Sigma$. In a sticky delta
situation, on the other hand, we have $\hat\sigma^S_0(S_0,T,K) \equiv \hat\sigma_0(T,K/S_0)$.
Consequently, the total derivative of (1.23) with respect to $S_0$ is

$$\begin{aligned}
\partial_{S_0}\hat{\mathbb{C}}(S_0,T,K)
 &= \partial_{S_0}\mathbb{C}^S(\cdots) + \partial_{\sigma}\mathbb{C}^S(\cdots)\;\partial_{S_0}\hat\sigma^S_0(S_0,T,K) \\
 &= \Delta^{BS}(T,K) + \mathcal{V}^{BS}(T,K)\;\partial_{S_0}\hat\sigma^S_0(S_0,T,K)
\end{aligned}$$

The symbol $\Delta^{BS}$ denotes Black and Scholes's delta (the derivative of $\mathbb{C}^S$
in $S_0$) and the symbol $\mathcal{V}^{BS}$ denotes Black and Scholes's vega (the derivative
in volatility). In a sticky strike situation, the derivative
$\partial_{S_0}\hat\sigma^S_0(S_0,T,K)$ is zero (i.e., the delta of the call is given as
Black and Scholes's delta). In contrast, consider a sticky delta market. In this case, we
have $\partial_{S_0}\hat\sigma^S_0(S_0,T,K) = -\frac{K}{S_0^2}\,\partial_k\hat\sigma_0(T,K/S_0)$.
Since implied volatility is typically decreasing in strike, this means that
$\partial_{S_0}\hat\sigma^S_0(S_0,T,K) > 0$: the implied volatility for a fixed cash strike K
rises if the stock rises (note that it is actually possible to compute the sticky delta
purely from market data; cf. remark 2.3.2 on page 78). This is in contrast to market
experience: At least for strikes around ATM, an increasing spot level will usually lead to
a decline in volatility levels.¹⁰ Hence, a sticky delta assumption is not compatible with
market behavior.
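The difference between the two regimes can be made concrete with a finite-difference
delta. The sketch below is illustrative only: it assumes zero rates and dividends
($S \equiv X$ up to scaling), and the linear skew function `skew`, the reference spot
`S_REF`, and all numerical values are hypothetical choices, not market data.

```python
import math


def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_on_s(s0: float, T: float, K: float, sigma: float) -> float:
    """Black-Scholes call on S with zero rates and dividends."""
    s = sigma * math.sqrt(T)
    d_plus = (math.log(s0 / K) + 0.5 * s * s) / s
    return s0 * norm_cdf(d_plus) - K * norm_cdf(d_plus - s)


def skew(k: float) -> float:
    """Hypothetical implied volatility, decreasing in the relative strike k."""
    return 0.20 - 0.10 * (k - 1.0)


S_REF = 100.0  # spot level at which the sticky strike surface was observed


def price_sticky_strike(s0, T, K):
    # Volatility depends on the cash strike only, not on the current spot.
    return call_on_s(s0, T, K, skew(K / S_REF))


def price_sticky_delta(s0, T, K):
    # Volatility depends on the strike relative to the current spot.
    return call_on_s(s0, T, K, skew(K / s0))


def delta(price_fn, s0, T, K, h=1e-3):
    """Central finite difference of the price in the spot."""
    return (price_fn(s0 + h, T, K) - price_fn(s0 - h, T, K)) / (2.0 * h)


delta_ss = delta(price_sticky_strike, 100.0, 1.0, 100.0)  # plain Black-Scholes delta
delta_sd = delta(price_sticky_delta, 100.0, 1.0, 100.0)   # delta plus vega * d(vol)/d(spot)
```

With a downward-sloping skew, `delta_sd` exceeds `delta_ss` by the vega term of the total
derivative, in line with the sign argument in the text.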
Interestingly, both sticky strike and sticky delta behavior of the implied volatility
can be characterized neatly following Balland [6]. Under the assumption that the
driving stock price is a square-integrable martingale, he shows that the stock price in
a sticky delta market must have independent increments, while the only stock price
process that is compatible with a sticky strike market is Black and Scholes (i.e., the
case without skew).
An intuitive argument for the latter result goes as follows: Assume the market
is sticky strike, and that there are two calls with different strikes and the same
maturity, each with a different implied volatility. However, only one of the two
implied volatilities can actually be realized, which means at least in continuous time
processes that only one of the two hedges can work (see also section 2.2.1). (For a
thorough derivation of the result refer to Balland [6].)

Remark 1.2.1 This means that a volatility surface that has skew and is arbitrage
free for a given spot value S0 will no longer be arbitrage free for any other spot
value.

To see why a sticky delta model implies that the stock price process has independent
increments, note that in a sticky delta model, the price of a forward started call with
payoff

$$\left(\frac{S_{T_2}}{S_{T_1}} - k\right)^+$$

for $0 < T_1 < T_2$ is given at time $T_1$ as

$$\mathcal{BS}\big(T,\,k,\,\hat\sigma_{T_1}(T,k)\big)$$

This is a deterministic quantity; hence, the price today of the forward started call is
equal to its value at $T_1$ (recall that we have assumed that there are no interest rates).
Since it is possible to extract the forward distribution of $S_{T_2}/S_{T_1}$ from the
forward started call prices by taking their second derivatives, it follows that the stock
price has independent increments. Examples of such processes include exponential Lévy
processes. In contrast, stochastic volatility models such as Heston's (2.1) are not
sticky delta because the implied volatility in such models moves not only with the spot
but also with the other state variables (i.e., the short volatility). Also compare remark
2.3.2 on page 78, where the delta in (very general) stochastic volatility models is
computed from the market.

¹⁰ It can also lead to an increase in the skew for downside strikes; hence, out-of-the-money
put implied volatilities may actually rise.

1.3 FITTING THE MARKET

In this section we make the idealizing assumption that European call options with prices
$\hat{\mathbb{C}}(T,k)$ on X (or S, equivalently) are traded for all strikes and maturities.
In such a situation, it is very natural to ask whether the observed market prices are in
some way “free of arbitrage” in that they can be reproduced with a martingale that has the
required marginal distribution.

1.3.1 Arbitrage-Free Option Price Surfaces


In general, absence of arbitrage proves to be a tricky concept when it comes
to continuous time processes. While in discrete time the former is equivalent to
the existence of an equivalent martingale measure, this is no longer true in
continuous time, and examples of markets exist that are free of arbitrage but
where not even a local martingale measure exists (the standard reference on this
topic is Delbaen/Schachermayer [2]). To avoid technical difficulties, we will therefore
introduce a stronger notion of absence of arbitrage:

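Definition 1.3.1 We say the market of European call prices
$\hat{\mathbb{C}} = (\hat{\mathbb{C}}(T,k))_{T\ge 0,\,k\ge 0}$ is strongly free of arbitrage
if there exists a non-negative true martingale X on some stochastic base
$(\Omega,\mathcal{F},\mathbb{F},\mathbb{P})$ which reprices the market, that is,

$$\hat{\mathbb{C}}(T,k) = \mathbb{E}\big[(X_T-k)^+\big] \qquad (1.24)$$

for all $(T,k)\in\mathbb{R}^2_{\ge 0}$.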

The key contribution in this context is due to Kellerer [7]:¹¹

Theorem 1.3.1 The market $\hat{\mathbb{C}} = (\hat{\mathbb{C}}(T,k))_{T\ge 0,\,k\ge 0}$ is
strongly free of arbitrage if and only if

(a) For all T, the function $\hat{\mathbb{C}}(T,\cdot)$ satisfies:

    (i) It is continuous, strictly decreasing and convex in k.

    (ii) Its right-hand derivative in k satisfies $0 \ge \partial_k\hat{\mathbb{C}}(T,k) \ge -1$.

    (iii) $\hat{\mathbb{C}}(T,0) = 1$ and $\lim_{k\uparrow\infty}\hat{\mathbb{C}}(T,k) = 0$.¹²

(b) For all k, the function $\hat{\mathbb{C}}(\cdot,k)$ is increasing.

(c) $\hat{\mathbb{C}}(0,k) = (1-k)^+$.

The martingale that reprices the market can be chosen to be Markov.

¹¹ See Föllmer/Schied [9] for a proof.

Note that the above conditions allow that $\partial_k\hat{\mathbb{C}}(T,0) > -1$. This is the
case if the random variable $X_T$ has a nontrivial probability mass in zero. Since X is
non-negative, the state zero must be absorbing; hence, X can “default” without
being triggered (which can be interpreted as the company still serving its
debt obligations). However, we regard this as an undesirable property and will
understand, if not mentioned otherwise, that $\partial_k\hat{\mathbb{C}}(T,0) = -1$.

Example 1 The “constant elasticity of variance” (CEV) model by Cox [8] is given
as the unique strong solution to the SDE

$$dX_t = X_t^{\beta}\,dW_t \qquad (1.25)$$

where $\beta\in[\tfrac{1}{2},1]$.¹³ This model is occasionally used as a “local volatility”
approach to incorporate skew (the resulting implied volatility exhibits an upward-sloping
downside skew).
For all $\beta < 1$, the process X can reach zero with a nonzero probability and then
“dies” there.

Theorem 1.3.1 is a convenient tool to assess whether a given market or an interpolation
scheme for market prices is strongly free of arbitrage. However, it does not
describe how the process X can actually be computed. The best-known approach
in this direction is Dupire's “implied local volatility” for continuous market price
processes, which requires the knowledge of European option prices for all strikes and
maturities. Madan/Yor discuss alternatives to construct pure jump processes [10].
For the discrete case where only a finite number of European options is provided,
time and state discrete martingales can also be constructed, as we will show below.

1.3.2 Implied Local Volatility


¹² The condition that $\hat{\mathbb{C}}(T,0) = 1$ ensures that any process with the correct
marginals is a true martingale.

¹³ For $\beta\in[0,\tfrac{1}{2})$, equation (1.25) has infinitely many solutions.

The core idea of implied local volatility is due to Dupire [11]. His idea is intriguingly
simple: Given observed market prices $\hat{\mathbb{C}}(T,k)$ for all $k \ge 0$ and $T \ge 0$,
we ask: is it possible to find a function
$\sigma : \mathbb{R}^2_{\ge 0} \to \mathbb{R}_{\ge 0}$ such that the solution to the SDE

$$\frac{dX_t}{X_t} = \sigma_t(X_t)\,dW_t$$

for a Brownian motion W exists, is unique, has the martingale property, and reprices
the market? That this is indeed possible can be derived using the following theorem
due to Gyöngy [12] (the original work [11] used an approach via the Fokker-Planck
equation):

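Theorem 1.3.2 If $\hat Y$ is an m-dimensional continuous semi-martingale of the form

$$d\hat Y_t = \mu_t\,dt + \sum_{j=1}^{n}\sigma^j_t\,d\hat W^j_t,$$

with predictable bounded and integrable drift $\mu$ and volatility matrix $\sigma$, then the
solution to

$$dY_t = a(t,Y_t)\,dt + \sum_{j=1}^{n} b_j(t,Y_t)\,d\hat W^j_t, \qquad Y_0 := \hat Y_0$$

with

$$a(t,y) := \mathbb{E}\big[\mu_t \,\big|\, \hat Y_t = y\big] \quad\text{and}\quad b_j(t,y)^2 := \mathbb{E}\big[(\sigma^j_t)^2 \,\big|\, \hat Y_t = y\big]$$

exists, is unique, and has the same marginal distributions as $\hat Y$.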

Let us assume that the “real market” price process $\hat X = (\hat X_t)_{t\ge 0}$ is a true
strictly positive martingale under some measure $\hat{\mathbb{P}}$. In this case (and if the
quadratic variation of the stock is absolutely continuous with respect to the Lebesgue
measure),¹⁴ there exists a $\hat{\mathbb{P}}$-Brownian motion $\hat B$ and a stochastic
“short variance” process $\hat\zeta = (\hat\zeta_t)_{t\ge 0}$, such that $\hat X$ satisfies

$$\frac{d\hat X_t}{\hat X_t} = \sqrt{\hat\zeta_t}\;d\hat B_t$$

The market price of a call with strike k and maturity T is then given as

$$\hat{\mathbb{C}}(T,k) := \hat{\mathbb{E}}\big[(\hat X_T - k)^+\big]$$

Theorem 1.3.2 implies that given some Brownian motion B on some stochastic base,
the SDE

$$\frac{dX_t}{X_t} = \sigma_t(X_t)\,dB_t$$

with

$$\sigma_t(x)^2 := \hat{\mathbb{E}}\big[\hat\zeta_t \,\big|\, \hat X_t = x\big]$$

has a unique strong solution which has the same marginal distribution as $\hat X$.
Therefore, X reprices all European options; in particular,
$\mathbb{E}[X_t] = \hat{\mathbb{E}}[\hat X_t] = 1$ for all t (i.e., X is a true martingale).

¹⁴ Cf. propositions 3.8 (p. 202) and 1.5 (p. 328) in Revuz/Yor [13].
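To obtain an analytic form for $\sigma$, we use Ito's formula for convex payoffs:

$$\begin{aligned}
(\hat X_T - k)^+ &= (\hat X_0 - k)^+ + \int_0^T 1_{\hat X_t > k}\,d\hat X_t + \frac{1}{2}\int_0^T \delta_{\hat X_t = k}\,d\langle\hat X\rangle_t \\
 &= (\hat X_0 - k)^+ + \int_0^T 1_{\hat X_t > k}\,d\hat X_t + \frac{1}{2}\int_0^T \delta_{\hat X_t = k}\,\hat\zeta_t\,\hat X_t^2\,dt
\end{aligned}$$

Taking expectations and the derivative in T yields

$$\begin{aligned}
\partial_T\hat{\mathbb{C}}(T,k)
 &= \frac{1}{2}\,\hat{\mathbb{E}}\big[\delta_{\hat X_T = k}\,\hat\zeta_T\,\hat X_T^2\big] \\
 &= \frac{1}{2}\,\hat{\mathbb{E}}\big[\delta_{\hat X_T = k}\,\hat{\mathbb{E}}[\hat\zeta_T \,|\, \hat X_T]\,\hat X_T^2\big] \\
 &= \frac{1}{2}\,\hat{\mathbb{E}}\big[\hat X_T^2\,\sigma_T(\hat X_T)^2\,\delta_{\hat X_T = k}\big] \\
 &= \frac{1}{2}\,k^2\,\sigma_T(k)^2\;\hat{\mathbb{P}}[\hat X_T = k]
\end{aligned}$$

Since the density of $\hat X_T$ can be computed as
$\hat{\mathbb{P}}[\hat X_T = k] := \partial^2_{kk}\hat{\mathbb{C}}(T,k)$, we obtain
Dupire's formula: given

$$\sigma_t(k)^2 := \frac{2\,\partial_t\hat{\mathbb{C}}(t,k)}{k^2\,\partial^2_{kk}\hat{\mathbb{C}}(t,k)}, \qquad (1.26)$$

the unique solution X to

$$\frac{dX_t}{X_t} = \sigma_t(X_t)\,dW_t \qquad (1.27)$$

exists, is a martingale, and reprices the market. An example can be found in
figure 1.4.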
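As a sanity check of Dupire's formula (1.26), one can feed it a call price surface
generated by a constant Black-Scholes volatility and verify that the local volatility
comes back flat. The sketch below approximates the derivatives by central finite
differences; the step sizes and the helper names are assumptions of this illustration,
and a production implementation would work on a smoothed, arbitrage-free surface instead.

```python
import math


def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(T: float, k: float, sigma: float) -> float:
    """Black-Scholes call on the pure stock X (X_0 = 1, zero rates)."""
    s = sigma * math.sqrt(T)
    d_plus = (-math.log(k) + 0.5 * s * s) / s
    return norm_cdf(d_plus) - k * norm_cdf(d_plus - s)


def dupire_local_vol(call, T, k, dT=1e-4, dk=1e-3):
    """Local volatility from (1.26), with finite-difference derivatives of call(T, k)."""
    dC_dT = (call(T + dT, k) - call(T - dT, k)) / (2.0 * dT)
    d2C_dk2 = (call(T, k + dk) - 2.0 * call(T, k) + call(T, k - dk)) / (dk * dk)
    if dC_dT < 0.0 or d2C_dk2 <= 0.0:
        # Violates monotonicity in T or convexity in k (theorem 1.3.1).
        raise ValueError("call prices are not arbitrage free at this point")
    return math.sqrt(2.0 * dC_dT / (k * k * d2C_dk2))


def flat_surface(T, k):
    """Hypothetical market: every option trades at 20% implied volatility."""
    return bs_call(T, k, 0.20)


local_vols = [dupire_local_vol(flat_surface, 1.0, k) for k in (0.8, 1.0, 1.2)]
```

The explicit check on the signs of the two derivatives mirrors the requirement that the
input surface satisfies the no-arbitrage conditions; otherwise the square root is not
defined.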

Remark 1.3.1 The above formula is given in terms of the calls on the pure stock
price $\hat X$. This is much more robust than using the call prices on $\hat S$ via (1.18)
since the effect of discontinuities in the forward (resulting from discrete dividends) is
eliminated.
It is also possible to write (1.26) in terms of implied volatility (which has the
same advantage of being robust with respect to jumps in the forward, etc.). To this
end, one simply replaces the call prices in (1.26) by their equivalent values in terms
of the Black and Scholes formula and the implied volatilities.

[Figure 1.4: two surface plots titled “.SPX Implied Volatility 09/12/2005” and “.SPX Local
Volatility 09/12/2005”. Axes: volatility (0 to 70), strike/spot (30% to 150%), maturity.]

FIGURE 1.4 Implied volatility and implied local volatility of SPX. The local volatility is
computed using Dupire's formula and then interpolated by a smooth spline.

Conceptually, implied local volatility is a very neat approach: starting from
the observable implied distribution of the underlying martingale $\hat X$, a diffusion X
is constructed that has the same marginal distributions as the original process.
This approach ensures that the resulting process X reprices all European claims
correctly. In particular, skew exposure for European knock-out options and the
like is taken into account properly: as an example, consider a European digital
call that pays 1 if the stock is above the strike at maturity. The impact of skew
on such a product is severe. The graph in figure 1.5 shows the difference between
plain BS prices (computed with the strike-implied volatility) and the local volatility
price. We have also provided the price given by a tight call spread,
$1_{X_T > K} \approx \frac{1}{2\varepsilon}\big(\mathbb{C}(T,K-\varepsilon) - \mathbb{C}(T,K+\varepsilon)\big)$.
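The effect of skew on a digital call can be reproduced in a few lines by comparing the
naive Black-Scholes digital price with the tight call spread. This is a sketch under
stated assumptions: pure-stock setting with zero rates, a hypothetical linear skew
function `skew`, and a spread half-width `eps` chosen for illustration.

```python
import math


def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(T: float, k: float, sigma: float) -> float:
    """Black-Scholes call on the pure stock X (X_0 = 1, zero rates)."""
    s = sigma * math.sqrt(T)
    d_plus = (-math.log(k) + 0.5 * s * s) / s
    return norm_cdf(d_plus) - k * norm_cdf(d_plus - s)


def bs_digital(T: float, k: float, sigma: float) -> float:
    """Black-Scholes price of a digital call, N(d_minus)."""
    s = sigma * math.sqrt(T)
    return norm_cdf((-math.log(k) - 0.5 * s * s) / s)


def skew(k: float) -> float:
    """Hypothetical implied volatility, decreasing in strike."""
    return 0.20 - 0.10 * (k - 1.0)


def digital_by_call_spread(call, T, k, eps=1e-3):
    """Approximate the digital payoff by (C(T, k - eps) - C(T, k + eps)) / (2 eps)."""
    return (call(T, k - eps) - call(T, k + eps)) / (2.0 * eps)


def flat_market(T, k):
    return bs_call(T, k, 0.20)


def skewed_market(T, k):
    return bs_call(T, k, skew(k))


flat_spread = digital_by_call_spread(flat_market, 1.0, 1.0)
skew_spread = digital_by_call_spread(skewed_market, 1.0, 1.0)
naive_digital = bs_digital(1.0, 1.0, skew(1.0))
```

Without skew the two prices agree; with a downward-sloping skew the call spread picks up
an additional vega term, so the naive Black-Scholes digital underprices the contract.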
The issue with (1.26) in practice is that it is very difficult to use directly.
The main problem is that we usually have only a finite number of traded European
options. In order to obtain a local volatility function using Dupire's formula (1.26),
we therefore need to intra- and extrapolate option prices (or implied volatilities). The
resulting European call price surface then needs to satisfy the no-arbitrage conditions
of theorem 1.3.1 to ensure that (1.26) is finite and not imaginary. Moreover, the
volatility function itself must ensure that the solution to (1.27) is unique and
nonexplosive.¹⁵ This is highly nontrivial and makes an extra- and interpolation
algorithm for discretely quoted market prices difficult to implement in practice. A
far more robust approach is calibration of a local volatility function via forward
PDEs, as described in chapter 8.

¹⁵ A sufficient condition for the existence of a global unique solution to (1.27) is
Lipschitz continuity; see Protter [15].

Example 2 Assume that the market price process is a “jump diffusion” (cf. Merton [14])

$$\hat X_t = \exp\!\Big(\sigma W_t - \tfrac{1}{2}\sigma^2 t + \sum_{i=1}^{N_t}\xi_i - h m t\Big), \qquad (1.28)$$

where N is a Poisson process with intensity h and where $(\xi_i)_i$ is an iid sequence
of random variables independent of N with a nontrivial distribution; moreover,
$m := \hat{\mathbb{E}}\big[e^{\xi_1} - 1\big] \ne 0$. The process W is a Brownian motion and
$\sigma$ is a constant.
In this case, (1.26) is not well defined at $T = 0$; hence, no solution to the
problem of fitting the market with a diffusion of the type (1.27) exists.

[Figure 1.5: line chart titled “European digital call prices (3m)”. Axes: price (0% to
100%), strike/spot (90% to 110%). Series: Black&Scholes, Call Spread, Local volatility.]

FIGURE 1.5 The prices of digital options for various strikes, computed with the
Black-Scholes model, by approximation via call prices and by using implied local volatility.
The example shows the importance of capturing the skew correctly when pricing nontrivial
European options.

1.3.3 European Payoffs


While implied local volatility is a very valuable tool to price path-dependent options,
nonvanilla European options by themselves can be priced more straightforwardly
by directly using quoted vanilla options.
To this end, note that any twice differentiable function
$H : \mathbb{R}_{\ge 0} \to \mathbb{R}$ can be written as

$$H(x) = H(x_0) + H'(x_0)(x - x_0) + \int_0^{x_0} H''(k)\,(k - x)^+\,dk + \int_{x_0}^{\infty} H''(k)\,(x - k)^+\,dk, \qquad (1.29)$$

that is, as long as we can trade European puts $\hat{\mathcal{P}}$ and calls
$\hat{\mathbb{C}}$ with all strikes at the maturity T, and under suitable integrability
assumptions, we can compute

$$\mathbb{E}\big[B_T^{-1} H(S_T)\big] = P(0,T)\Big\{H(\hat K) + H'(\hat K)\,(F_T - \hat K)\Big\} + \int_0^{\hat K} H''(K)\,\hat{\mathcal{P}}(T,K)\,dK + \int_{\hat K}^{\infty} H''(K)\,\hat{\mathbb{C}}(T,K)\,dK,$$

which holds for any potential martingale measure and also covers the possibility
of default where the payoff at maturity is H(0). The strike $\hat K$ is arbitrary and
can be set to the forward. Note that (1.29) also holds for convex functions with
their generalized derivatives. In particular, H does not need to be defined in 0: for
example, the formula is also valid for the convex function $H(x) = x - 1 - \log(x)$,
the price for which is infinite if the stock has a nonzero probability of default.
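The static replication formula can be checked numerically in a model where the answer is
known. The sketch below prices $H(x) = x^2$ and $H(x) = x - 1 - \log(x)$ on the pure stock
in the Black-Scholes model (zero rates, $X_0 = 1$, expansion point $x_0 = 1$) by
integrating put and call prices against $H''$. The midpoint rule, the truncation strike
`k_max`, and all function names are assumptions of this illustration.

```python
import math


def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(T: float, k: float, sigma: float) -> float:
    """Black-Scholes call on the pure stock X (X_0 = 1, zero rates)."""
    s = sigma * math.sqrt(T)
    d_plus = (-math.log(k) + 0.5 * s * s) / s
    return norm_cdf(d_plus) - k * norm_cdf(d_plus - s)


def bs_put(T: float, k: float, sigma: float) -> float:
    """Put price from put-call parity, using E[X_T] = 1."""
    return bs_call(T, k, sigma) - (1.0 - k)


def midpoint(f, a, b, n):
    """Midpoint rule; it never evaluates f at the endpoints (k = 0 is avoided)."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def replicate(H, dH, d2H, T, sigma, x0=1.0, k_max=6.0, n=4000):
    """Price E[H(X_T)] from vanilla prices via the expansion (1.29) around x0."""
    puts = midpoint(lambda k: d2H(k) * bs_put(T, k, sigma), 0.0, x0, n)
    calls = midpoint(lambda k: d2H(k) * bs_call(T, k, sigma), x0, k_max, n)
    forward_term = dH(x0) * (1.0 - x0)  # E[X_T] - x0, zero for x0 = 1
    return H(x0) + forward_term + puts + calls


T, sigma = 1.0, 0.20
square = replicate(lambda x: x * x, lambda x: 2.0 * x, lambda x: 2.0, T, sigma)
log_contract = replicate(
    lambda x: x - 1.0 - math.log(x),
    lambda x: 1.0 - 1.0 / x,
    lambda x: 1.0 / (x * x),
    T,
    sigma,
)
```

In this model the exact values are $e^{\sigma^2 T}$ for the squared payoff and
$\frac{1}{2}\sigma^2 T$ for the log-type payoff, which gives a direct check of the
quadrature.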
The advantage of using (1.29) instead of implied local volatility to price the
payoff H is that (1.29) also yields a hedge for H: by construction, the formula
will tell us how many European options of each strike we have to buy to perfectly
replicate the payoff H. This is of great advantage, since an implied local volatility
model in itself gives a hedging strategy only in terms of the spot (cf. section 1.4).
Of course, in practice we will neither be willing nor able to invest in infinitely many
options. Instead, we will limit ourselves to a reasonable discretization of the real
line. The first step is to super-replicate H; we concentrate on convex functions since
most financial payoffs are convex functions or combinations thereof.
A convex function H can be approximated from above by piecewise linear functions. That
means that if we select two sequences $\hat K = K^p_0 > K^p_1 > \cdots$ and
$\hat K = K^c_0 < K^c_1 < \cdots$ of strikes with $\lim_{n\uparrow\infty} K^p_n = 0$ and
$\lim_{n\uparrow\infty} K^c_n = \infty$, respectively, then an approximation
$H^{\sup}_T$ of H from above, $H^{\sup}_T \ge H_T$, is given by

$$H^{\sup}_T := H(\hat K) + H'(\hat K)(S_T - \hat K) + \sum_{n\ge 1} w^p_n\,(K^p_n - S_T)^+ + \sum_{n\ge 1} w^c_n\,(S_T - K^c_n)^+ \qquad (1.30)$$

with

$$w^c_n := \frac{H(K^c_n) - H(K^c_{n-1})}{K^c_n - K^c_{n-1}} - \sum_{k=1}^{n-1} w^c_k$$

and

$$w^p_n := \frac{H(K^p_n) - H(K^p_{n-1})}{K^p_n - K^p_{n-1}} - \sum_{k=1}^{n-1} w^p_k$$

A similar formula holds for a subreplication strategy.

If H is a function that is finite in zero and linear beyond some $K^*$ (in the sense that
the restriction of H to $[K^*,\infty)$ is a linear function of x), we can use only a finite
number of strikes: the corresponding super-replicating payoff is given by

$$\hat H^{\sup}_T := H(\hat K) + H'(\hat K)(S_T - \hat K) + \sum_{n=1}^{n_p} w^p_n\,(K^p_n - S_T)^+ + \sum_{n=1}^{n_c} w^c_n\,(S_T - K^c_n)^+ \qquad (1.31)$$

where $0 = K^p_{n_p} < \cdots < K^p_0 = \hat K = K^c_0 < \cdots < K^c_{n_c} =: K^*$. The
condition that H is linear beyond some strike is necessary to be able to limit ourselves to
some maximal strike. Alternatively, we could postulate that for some large strike $K^*$,
the value of the respective call is practically zero, and will remain zero for the life of
the contract we want to price. We hence assume that there is no probability mass beyond this
“zero price strike” $K^*$. Then, (1.31) gives a super-replication price and indeed a
super-hedging position for all convex payoffs.
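The idea of dominating a convex payoff by a portfolio of vanilla options on a strike grid
can be illustrated as follows. Note that this sketch does not use the put/call split and
indexing of (1.30) and (1.31): as a simplifying assumption it builds the chord
interpolation of H from the lowest strike upward, using a cash and forward position plus
calls only, with weights equal to the changes in chord slope. The strike grid and function
names are hypothetical.

```python
import math


def H(x: float) -> float:
    """Convex example payoff."""
    return x - 1.0 - math.log(x)


def super_hedge(payoff, strikes):
    """Chord interpolation of a convex payoff on an increasing strike grid.

    Returns the call weights (one per interior strike) and the dominating payoff,
    which is valid for stock values between the first and the last strike.
    """
    slopes = [
        (payoff(b) - payoff(a)) / (b - a) for a, b in zip(strikes, strikes[1:])
    ]
    # Weight of the call struck at an interior strike = jump in the chord slope there.
    weights = [slopes[i] - slopes[i - 1] for i in range(1, len(slopes))]

    def hedge(s: float) -> float:
        value = payoff(strikes[0]) + slopes[0] * (s - strikes[0])
        for strike, w in zip(strikes[1:-1], weights):
            value += w * max(s - strike, 0.0)
        return value

    return weights, hedge


strikes = [0.4 + 0.2 * i for i in range(8)]  # 0.4, 0.6, ..., 1.8
weights, hedge = super_hedge(H, strikes)
grid = [0.4 + 1.4 * i / 200 for i in range(201)]
shortfall = min(hedge(s) - H(s) for s in grid)  # non-negative up to rounding
```

For a convex payoff all weights are non-negative, so the hedge is a long position in
calls, and the chords lie above the payoff between any two neighboring strikes.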
[Figure 1.6: plot titled “Approximation of x-1-log(x) from above” for x between 0.4 and 1.8.]

FIGURE 1.6 The super-hedge for the function $H(x) := x - 1 - \log(x)$.

Interpretation It should be noted that this approach is tantamount to assuming that the
stock price will attain only discrete values in T, namely the strikes, which we collect in
increasing order as $0 = K_0 < K_1 < \cdots < K_{n_c+n_p} = K^*$: this is because (1.30) is
nothing but the price of H computed under the probability measure

$$\mathbb{P}[X_T = k_\ell] := \frac{\hat{\mathbb{C}}(T,k_{\ell+1}) - \hat{\mathbb{C}}(T,k_\ell)}{k_{\ell+1} - k_\ell} - \frac{\hat{\mathbb{C}}(T,k_\ell) - \hat{\mathbb{C}}(T,k_{\ell-1})}{k_\ell - k_{\ell-1}} \qquad (1.32)$$

(with the first quotient set to 0 for the largest strike and the second set to $-1$ for
$\ell = 0$), where

$$k_\ell := \frac{K_\ell - A_T}{F_T}$$

(recall from (1.24) that $\hat{\mathbb{C}}$ are the market prices of calls on X). It is an
attractive idea because it will always compute an upper bound for convex payoffs.
Accordingly, we call a stock price model with (1.32) a “most expensive” model. Since the
assumptions made (such as the existence of $K^*$) are relatively weak, it provides a good
framework to assess the value of a European payoff H (of course pricing via (1.32) is not
limited to convex payoffs). If we want to follow this approach for options that depend on
more than one maturity, though, we also have to construct the transition probabilities
between the marginal distributions (1.32). This is the subject of the next section.

1.3.4 Fitting the Market with Discrete Martingales


We will now discuss an alternative to implied local volatility which follows the
construction (1.32) above. The idea here is not to assume that a smooth surface of
European option prices is quoted in the market, but to accept the fact that only finitely
many of these options are traded. We saw in the previous section that such an
assumption implies that we can replicate European payoffs using the liquid vanilla
instruments. The same will not hold true if we construct transition densities between
the marginal distributions (1.32), because these transition densities are not uniquely
defined by the observable market prices. However, following Buehler [16], we will
give a constructive approach identifying suitable transition matrices by imposing
“secondary information.” More precisely, we will choose kernels that match the
prices of more exotic payoffs such as forward started options.
To be precise, assume that a discrete number of call prices
$(\hat{\mathbb{C}}(T,k))_{(T,k)\in\mathcal{M}}$ can be observed in the market; we denote by
$\mathcal{T}$ the set of maturities for which we have at least one call price, and we
enumerate them as $0 < T_1 < \cdots < T_m$. For each maturity $T_j$, we denote by
$\mathcal{K}_j := \{k : (T_j,k)\in\mathcal{M}\}$ the set of strikes for which calls are
quoted. To avoid the case of local martingales we assume that $0\in\mathcal{K}_j$ with
$\hat{\mathbb{C}}(T_j,0) = 1$. We also assume as in the previous section that there is some
(artificial and large) “zero price” strike $k^*\in\mathcal{K}_j$ such that
$\hat{\mathbb{C}}(T_j,k^*) = 0$ for all j. To ease notation, we write
$0 =: k^j_0 < \cdots < k^j_{d_j} := k^*$ for the strikes of calls with maturity $T_j$. We
will also make use of the first differences,

$$\Delta^j_\ell := \frac{\hat{\mathbb{C}}(T_j,k^j_{\ell+1}) - \hat{\mathbb{C}}(T_j,k^j_\ell)}{k^j_{\ell+1} - k^j_\ell}$$

for $\ell = 0,\ldots,d_j - 1$ and $\Delta^j_{d_j} := 0$. Finally, we will also need the lower
convex hull of all call prices beyond some maturity, that is,

$$H_j := \sup\big\{ f : f \text{ is convex and } f(k) \le \hat{\mathbb{C}}(T_i,k) \text{ for all } i \ge j \text{ and } k\in\mathcal{K}_i \big\}$$

Note that this function is just the lower convex linear hull of all call prices with
maturities $T_i$ with $i \ge j$.
The following corollary is a direct consequence of theorem 1.3.1:

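Corollary 1.3.1 The discrete set of call prices
$(\hat{\mathbb{C}}(T,k))_{(T,k)\in\mathcal{M}}$ is strongly free of arbitrage if, and only
if, the following conditions hold:

(a) Convexity: For all $j = 1,\ldots,m$,

$$-1 \le \Delta^j_0 \le \cdots \le \Delta^j_{d_j} = 0$$

(b) Monotonicity: For all $j = 1,\ldots,(m-1)$ and all $k\in\mathcal{K}_j$,

$$\hat{\mathbb{C}}(T_j,k) \le \hat{\mathbb{C}}(T_{j+1},k)$$

In particular, there exists a non-negative martingale X with states $\mathcal{K}_j$ per
maturity $T_j$ with a marginal distribution given by

$$\mathbb{P}[X_{T_j} = k^j_\ell] := \Delta^j_\ell - \Delta^j_{\ell-1} \qquad (1.33)$$

for $\ell = 0,\ldots,d_j$ (with the convention $\Delta^j_{-1} := -1$).
If, in addition, $\Delta^j_0 = -1$, then X is strictly positive.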
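The conditions of the corollary are easy to check mechanically. The following sketch
computes the first differences, tests convexity and monotonicity, and builds the implied
discrete marginal distribution. As simplifying assumptions, both maturities are quoted on
the same strike grid, and the call quotes are made-up numbers chosen for illustration.

```python
def first_differences(strikes, calls):
    """Chord slopes of the call prices in strike, with the last one set to 0."""
    slopes = [
        (calls[i + 1] - calls[i]) / (strikes[i + 1] - strikes[i])
        for i in range(len(strikes) - 1)
    ]
    return slopes + [0.0]


def is_convex(strikes, calls, tol=1e-12):
    """Condition (a): -1 <= Delta_0 <= ... <= Delta_d = 0."""
    chain = [-1.0] + first_differences(strikes, calls)
    return all(a <= b + tol for a, b in zip(chain, chain[1:]))


def is_monotone(calls_early, calls_late, tol=1e-12):
    """Condition (b) on a common strike grid: prices increase with maturity."""
    return all(a <= b + tol for a, b in zip(calls_early, calls_late))


def marginal(strikes, calls):
    """Probabilities of X_T sitting exactly on each strike."""
    d = first_differences(strikes, calls)
    return [1.0 + d[0]] + [d[i] - d[i - 1] for i in range(1, len(d))]


# Hypothetical quotes on pure strikes; strike 0 has price 1, the last strike price 0.
strikes = [0.0, 0.5, 1.0, 1.5, 2.0]
calls_t1 = [1.0, 0.52, 0.12, 0.01, 0.0]
calls_t2 = [1.0, 0.55, 0.18, 0.03, 0.0]

probs_t1 = marginal(strikes, calls_t1)
mean_t1 = sum(k * p for k, p in zip(strikes, probs_t1))
```

The probabilities sum to one by telescoping, and the mean equals the call price at strike
zero, which is why the quote at strike zero rules out strict local martingales.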

It is quite straightforward to check these conditions for real market data. (In
Buehler [16], it is also discussed how to turn data that do not satisfy the above
conditions into a “close” fit which then satisfies the conditions.) The crucial point is
that we can actually construct a martingale X as described in the above corollary.
Before we discuss this, let us recall from the discussion in section 1.3.3 that all
martingales that realize a density (1.33) are “most expensive” in the following
sense: Let Y on $(\Omega_2,\mathcal{F}_2,\mathbb{F}_2,\mathbb{P}_2)$ be any other martingale
(possibly with continuous state space and time) which reprices the market, that is, for all
$j = 1,\ldots,m$ and $k^j\in\mathcal{K}_j$ we have

$$\mathbb{E}_2\big[(Y_{T_j} - k^j)^+\big] = \hat{\mathbb{C}}(T_j,k^j)$$

Then X is more expensive than Y for any convex payoff
$H : \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0}$, that is,

$$\mathbb{E}\big[H(X_{T_j})\big] \ge \mathbb{E}_2\big[H(Y_{T_j})\big]$$

for all $j = 1,\ldots,m$.

Constructing Discrete Transition Densities Let us now denote by
$p^j = (p^j_0,\ldots,p^j_{d_j})$ the row vector of probabilities of $X_{T_j}$ being exactly
$k^j_\ell$, that is,

$$p^j_\ell := \Delta^j_\ell - \Delta^j_{\ell-1}$$

for $\ell = 1,\ldots,d_j$ and $p^j_0 := 1 + \Delta^j_0$. For any discrete martingale X with
states $\mathcal{K}_j$ at maturity $T_j$, there exists a transition kernel

$$\Pi^j = \begin{pmatrix} \pi^j_{0,0} & \cdots & \pi^j_{0,d_j} \\ \vdots & & \vdots \\ \pi^j_{d_{j-1},0} & \cdots & \pi^j_{d_{j-1},d_j} \end{pmatrix} \in \mathbb{R}^{(d_{j-1}+1)\times(d_j+1)}$$

with transition probabilities $\pi^j_{u,\ell}$ “from $k^{j-1}_u$ to $k^j_\ell$”,

$$\pi^j_{u,\ell} := \mathbb{P}\big[X_{T_j} = k^j_\ell \,\big|\, X_{T_{j-1}} = k^{j-1}_u\big]$$

Hence, the problem at hand is how to find a sequence of kernels
$(\Pi^j)_{j=1,\ldots,m}$ such that

$$p^{j-1}\,\Pi^j = p^j$$

for all $j = 1,\ldots,m$. More precisely, we search for a sequence of matrices
$(\Pi^j)_j$ for which the following properties hold:

(a) Each element of $\Pi^j$ is non-negative.

(b) Each row of $\Pi^j$ is a probability distribution,¹⁶

$$\Pi^j\,1_{d_j+1} = 1_{d_{j-1}+1}$$

(c) $\Pi^j$ has the martingale property, that is, for
$k^j = (k^j_0,\ldots,k^j_{d_j})^T$ we have

$$\Pi^j\,k^j = k^{j-1}$$

(d) $\Pi^j$ is compatible with the marginal distributions,

$$p^{j-1}\,\Pi^j = p^j$$

¹⁶ We used $1_n$ to denote the vector $(1,\ldots,1)^T$ in $\mathbb{R}^n$.

The key is now that the above conditions (a)–(d) are all linear in the coefficients of
$\Pi^j$. That implies that we can formulate the quest for $\Pi^j$ as a linear programming
problem,

$$A^j\,\pi^j = y^j, \qquad \pi^j \ge 0 \qquad (1.34)$$

where $\pi^j$ is the vector of all $(d_j+1)(d_{j-1}+1)$ elements of $\Pi^j$ and where the
matrix $A^j$ and the vector $y^j$ collect the $(d_j+1) + 2(d_{j-1}+1) - 2$ linear conditions
(b)–(d) (two of the linear conditions are redundant). For such problems, very efficient
algorithms exist, and we can solve even large systems very quickly; the existence of a
solution is guaranteed by corollary 1.3.1. Note that this means that the space of solutions
for (1.34) is very large: while the number of elements of $\Pi^j$ grows quadratically, the
number of conditions grows only linearly. This simply reflects the fact that many possible
transition densities are compatible with the observed marginal distributions.
To put it positively, this means that we can select a transition kernel that satisfies
“secondary requirements.”
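Before handing the problem to a linear programming solver, it helps to have an
independent check of conditions (a)–(d) for a candidate kernel. The sketch below performs
that check in plain Python. It does not solve (1.34); the two-maturity example, its state
grids (which for brevity omit the states 0 and $k^*$), and the function name
`check_kernel` are hypothetical.

```python
def check_kernel(kernel, p_prev, k_prev, p_next, k_next, tol=1e-12):
    """Report which of the conditions (a)-(d) hold for a candidate transition kernel.

    kernel[u][l] is the probability of moving from state k_prev[u] to state k_next[l].
    """
    rows, cols = range(len(k_prev)), range(len(k_next))
    non_negative = all(kernel[u][l] >= -tol for u in rows for l in cols)
    row_sums = all(abs(sum(kernel[u]) - 1.0) <= tol for u in rows)
    martingale = all(
        abs(sum(kernel[u][l] * k_next[l] for l in cols) - k_prev[u]) <= tol
        for u in rows
    )
    marginals = all(
        abs(sum(p_prev[u] * kernel[u][l] for u in rows) - p_next[l]) <= tol
        for l in cols
    )
    return {"a": non_negative, "b": row_sums, "c": martingale, "d": marginals}


# Hypothetical marginals at two maturities, both with mean 1.
k_prev, p_prev = [0.8, 1.2], [0.5, 0.5]
k_next, p_next = [0.6, 1.0, 1.4], [0.25, 0.5, 0.25]

# A kernel that spreads each state symmetrically over its two neighbors.
good_kernel = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]
# A kernel with valid rows that violates the martingale and marginal conditions.
bad_kernel = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

good = check_kernel(good_kernel, p_prev, k_prev, p_next, k_next)
bad = check_kernel(bad_kernel, p_prev, k_prev, p_next, k_next)
```

All four conditions are linear in the kernel entries, which is exactly what makes the
search for a kernel a linear program.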

Repricing Forward Started Options Any of the solutions to (1.34) will perfectly
reprice the observed European option prices. However, it is often desirable also
to impose certain assumptions on the prices of forward started options: a fixed
strike forward started option with “reset date” $T_1$ and maturity $T_2 > T_1$ has the
payoff

$$\left(\frac{X_{T_2}}{X_{T_1}} - x\right)^+$$

While it is trivial to compute the value of such a payoff in the Black-Scholes model,
it is far from clear what the fair value of such a contract should be in the market.
We will come back to this type of product at a later stage in section 2.1.6. Here, we
assume that we have a good idea of the price of a few strikes for each of the options
between $T_{j-1}$ and $T_j$. Accordingly, we denote by $\hat C_j(x)$ the price of an option
with payoff

$$\left(\frac{X_{T_j}}{X_{T_{j-1}}} - x\right)^+ \qquad (1.35)$$

Its price given a transition kernel $\Pi^j$ is

$$C_j(x) := \sum_{u=0}^{d_{j-1}} \sum_{\ell=0}^{d_j} p^{j-1}_u\,\pi^j_{u,\ell} \left(\frac{k^j_\ell}{k^{j-1}_u} - x\right)^+,$$

which is once more just a linear expression in the elements of $\Pi^j$. The idea is now to
choose a transition kernel $\Pi^j$, which minimizes for a range $x_1 < \cdots < x_n$ of
strikes the distance between model prices and assumed market prices, that is,

$$\left\| \begin{pmatrix} C_j(x_1) - \hat C_j(x_1) \\ \vdots \\ C_j(x_n) - \hat C_j(x_n) \end{pmatrix} \right\|_w$$

under an appropriate norm. This leads to an optimization problem of the form

$$\min_{\pi^j}\ \big\| B^j\pi^j - z^j \big\|_w \quad\text{subject to}\quad A^j\pi^j = y^j,\ \ \pi^j \ge 0 \qquad (1.36)$$

where the matrix $B^j$ collects the coefficients of the model prices $C_j(x_i)$ and the
vector $z^j$ the assumed market prices $\hat C_j(x_i)$. For $w = 1$ and $w = \infty$, this
reduces again to linear programming. For $w = 2$, it is a linear least-squares problem,
which can also be solved efficiently.¹⁷
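That the forward started option price is linear in the kernel entries can be verified
directly. The sketch below evaluates the double sum for a given kernel and checks that a
mixture of two kernels prices to the mixture of the prices. The state grids and kernels
are hypothetical, and the previous states are assumed strictly positive so that the ratio
is defined.

```python
def forward_start_price(kernel, p_prev, k_prev, k_next, x):
    """Price of the payoff (X_Tj / X_Tj-1 - x)^+ under a given transition kernel."""
    total = 0.0
    for u, (prob_u, state_u) in enumerate(zip(p_prev, k_prev)):
        for l, state_l in enumerate(k_next):
            payoff = max(state_l / state_u - x, 0.0)
            total += prob_u * kernel[u][l] * payoff
    return total


def mix(kernel_a, kernel_b, weight):
    """Entry-wise convex combination of two kernels."""
    return [
        [weight * a + (1.0 - weight) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(kernel_a, kernel_b)
    ]


k_prev, p_prev = [0.8, 1.2], [0.5, 0.5]
k_next = [0.6, 1.0, 1.4]

# Two kernels with the martingale property (each row has mean equal to its start state).
kernel_a = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]
kernel_b = [[0.75, 0.0, 0.25], [0.25, 0.0, 0.75]]

price_a = forward_start_price(kernel_a, p_prev, k_prev, k_next, 1.0)
price_b = forward_start_price(kernel_b, p_prev, k_prev, k_next, 1.0)
price_mix = forward_start_price(mix(kernel_a, kernel_b, 0.3), p_prev, k_prev, k_next, 1.0)
```

Because the price is linear in the kernel, the residuals in (1.36) are linear in the
unknowns, which is what keeps the calibration a linear or least-squares program.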

Pricing and Hedging In a model with discrete states, it is straightforward to evaluate
options whose value depends only on the dates $T_1,\ldots,T_m$. A discrete-state Monte
Carlo engine, for example, just needs to invert the conditional transition probabilities,
which can be accomplished very efficiently; cf. Glasserman [17]. For multi-asset
applications, an ad hoc approach is to use copulas to model their interdependency
(see chapter 7). The method lends itself also to backward pricing, since the necessary
(expensive) inversion of the transition matrices needs to be done only once during
the life of the model.
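A discrete-state Monte Carlo step amounts to inverting the conditional distribution
function of each kernel row. The following sketch precomputes the cumulative rows once
and then samples paths by binary search. The two-step example (kernels, state grids, seed,
and path count) is hypothetical, and the helper names are illustrative.

```python
import bisect
import itertools
import random


def cumulative_rows(kernel):
    """Conditional distribution functions, one per starting state (computed once)."""
    return [list(itertools.accumulate(row)) for row in kernel]


def simulate_path(states, cdfs, rng):
    """Sample one path; states[0] holds the single initial state X_0."""
    index, path = 0, [states[0][0]]
    for j, cdf_rows in enumerate(cdfs, start=1):
        u = rng.random()
        # Smallest index whose cumulative probability exceeds u (inverse transform).
        index = min(bisect.bisect_right(cdf_rows[index], u), len(states[j]) - 1)
        path.append(states[j][index])
    return path


states = [[1.0], [0.8, 1.2], [0.6, 1.0, 1.4]]
kernels = [
    [[0.5, 0.5]],                        # from X_0 = 1 to the states at T_1
    [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]],  # from the states at T_1 to those at T_2
]
cdfs = [cumulative_rows(kernel) for kernel in kernels]

rng = random.Random(7)
paths = [simulate_path(states, cdfs, rng) for _ in range(20000)]
terminal_mean = sum(path[-1] for path in paths) / len(paths)
```

Using `bisect_right` ensures that states with zero transition probability are never
selected, and the sample mean of the terminal value stays close to 1 because every kernel
has the martingale property.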

1.4 THEORY OF REPLICATION


In section 1.3.2, we introduced the concept of implied local volatility, where the
pure stock price process X under a martingale measure is given as

$$\frac{dX_t}{X_t} = \sigma_t(X_t)\,dW_t \qquad (1.37)$$

for a function $\sigma$, which can in theory be implied from market quotes of European
prices. One striking feature of such a model is that it is generally complete: we can
replicate the payoff of an option by continuous trading in the stock. This is in stark
contrast to the discrete model above, where such a strategy is not available except for
European payoffs.

¹⁷ Other methods of selecting an appropriate transition kernel include the use of a “mean
variance” criterion (which is also linear in the underlying probabilities). See [16] for
more details.

Replicating Trading Strategies To illustrate the idea of replication, assume that we
sell a European claim with payoff¹⁸

$$H_T \equiv H(S_T)$$

To ease notation in the following, we will concentrate solely on the case of zero
dividends, default probability, and interest rates (i.e., on the case where $S \equiv X$).
We comment on the general case afterward.
The question is now the following: Can we trade in X such that the result of
trading plus a potential initial capital has at T the same value as $H_T$? This requires
the concept of a trading strategy: a trading strategy $\Delta = (\Delta_t)_{t\in[0,T]}$ is a
(random) process whose value $\Delta_t$ denotes the number of shares we should hold at time
t. This value may depend on past information, in particular the path of X up to time t, but
it obviously cannot include any future information of the value of X. Mathematically,
we say that the process $\Delta$ must be predictable. We also require that the process
is suitably integrable (i.e., $\int_0^T \Delta_t^2\,d\langle X\rangle_t$ is almost surely
finite) and bounded from below.¹⁹
To execute the trading strategy, assume we have an initial capital of $H_0$ and
that we start at time 0 with buying $\Delta_0$ shares. We borrow the required amount
$C_0 = \Delta_0 X_0 - H_0$ from the bank,²⁰ so that the value of our portfolio at time 0 is
$V_0 = H_0$.
Let us consider first discrete time trading, that is, that the hedging strategy $\Delta$
is constant on intervals $[(k-1)\delta, k\delta]$ for $k = 1,\ldots,n$ with
$\delta := T/n$, i.e. $n\delta = T$. After the end of the first interval, the value of our
position in X has changed due to the movement of the stock (i.e., it is now
$\Delta_0 X_\delta$), while our debt of $\Delta_0 X_0 - H_0$ did not change since we assumed
that interest rates are zero. Now we rebalance our position in X according to our hedging
strategy, which tells us now to hold $\Delta_\delta$ units of X. Accordingly, we have to buy
$(\Delta_\delta - \Delta_0)$ shares for the price of $(\Delta_\delta - \Delta_0)X_\delta$,
the excess of which we need to borrow again from the bank. The overall cost to
hold $\Delta_\delta$ shares is therefore

$$C_\delta = (\Delta_\delta - \Delta_0)X_\delta + \Delta_0 X_0 - H_0 = \Delta_\delta X_\delta - \Delta_0 (X_\delta - X_0) - H_0$$

Proceeding further in time, the accumulated cost to hold $\Delta_{k\delta}$ at time
$k\delta$ is

$$C_{k\delta} = \Delta_{k\delta} X_{k\delta} - H_0 - \sum_{j=1}^{k} \Delta_{(j-1)\delta}\big(X_{j\delta} - X_{(j-1)\delta}\big)$$

The value of our portfolio including the shares is

$$V_{k\delta} := \Delta_{k\delta} X_{k\delta} - C_{k\delta} = H_0 + \sum_{j=1}^{k} \Delta_{(j-1)\delta}\big(X_{j\delta} - X_{(j-1)\delta}\big)$$
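The bookkeeping above can be replayed step by step to confirm that tracking the bank debt
gives the same portfolio value as the closed-form sum of trading gains. The sketch
assumes zero interest rates as in the text; the price path, the holdings, and the initial
capital are made-up numbers.

```python
def value_by_bookkeeping(h0, holdings, path):
    """Portfolio value at the final date, tracking the bank debt at each rebalancing.

    holdings[j] is the number of shares held over the interval starting at step j,
    path[j] is the stock price at step j (one more entry than holdings).
    """
    debt = holdings[0] * path[0] - h0  # borrow what the initial capital does not cover
    for j in range(1, len(holdings)):
        # Rebalance at step j: buy (or sell) the difference at the current price.
        debt += (holdings[j] - holdings[j - 1]) * path[j]
    return holdings[-1] * path[-1] - debt


def value_by_gains(h0, holdings, path):
    """Initial capital plus the sum of holdings times price increments."""
    gains = sum(
        holdings[j - 1] * (path[j] - path[j - 1]) for j in range(1, len(path))
    )
    return h0 + gains


h0 = 0.10
path = [1.00, 1.10, 0.90, 1.05]
holdings = [0.5, 0.7, 0.2]

v_book = value_by_bookkeeping(h0, holdings, path)
v_gain = value_by_gains(h0, holdings, path)
```

Both functions return the same value because each rebalancing trade is financed entirely
by borrowing, so it changes the composition of the portfolio but not its value.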

¹⁸ For technical reasons, assume that H is bounded from below.
¹⁹ This excludes the “suicide strategy”: double your bets until you lose.
²⁰ Borrowing a negative amount means investing it in the bank.

In case of a continuous trading strategy, the same arguments hold: the right-hand
sum converges to the stochastic integral of $\Delta$ with respect to X, so we have

$$V_t = H_0 + \int_0^t \Delta_u\,dX_u$$

for all $t\in[0,T]$. We now call the strategy replicating, if the value of $V_T$ matches
the value of $H_T$, that is, if

$$H_T = H_0 + \int_0^T \Delta_t\,dX_t \qquad (1.38)$$

The point here is that the cost of replicating $H_T$ is covered by the constant $H_0$, which
justifies calling it the fair price of $H_T$. If such a replication strategy is possible for
all payoffs of some set $\mathcal{H}$, then we say that the market $(\mathcal{H}, X)$ is
complete.²¹
The most natural market is what we call the market of relevant payoffs: assume
we are allowed to trade in X and in some liquid instruments $C = (C^1,\ldots,C^n)$. Then
the only economically relevant payoffs are those that depend functionally on X and
C; this means that we will consider only payoffs that are measurable with respect
to $\mathcal{F}^{X,C}_T$ for some finite T (in contrast to payoffs measurable with respect to
the larger $\sigma$-algebra $\mathcal{F}_T$). As usual, we also limit ourselves to payoffs
that are bounded from below. We now simply say the market (X, C) is complete, if any payoff
$H_T$ that is measurable with respect to $\mathcal{F}^{X,C}_T$ and bounded from below can be
replicated by some trading strategy $(\Delta,\Phi)$²² in the sense that

$$H_T = \mathbb{E}[H_T] + \int_0^T \Delta_u\,dX_u + \sum_{\ell=1}^{n} \int_0^T \Phi^\ell_u\,dC^\ell_u \qquad (1.39)$$

Of course, it is generally not clear that such a replication strategy exists. The
reason why replication works in a local volatility model is that we have assumed
that there is an equivalent measure under which the process X, defined by
equation (1.37), is a Markovian martingale. To this end, consider now again a
payoff $H_T = H(X_T)$ given in terms of a smooth, bounded function H with bounded
derivatives. We can then define, somewhat ad hoc, the bounded martingale $(H_t)_{t \in [0,T]}$

$$H_t := \mathbb{E}[\, H(X_T) \mid \mathcal{F}_t \,]$$

Because of the Markov property of X, this can be written as

$$H_t = \mathbb{E}[\, H(X_T) \mid X_t \,] =: h_t(X_t)$$

21 It is an important point that the notion of a "fair price" implies the existence of a replication
strategy. In incomplete markets, for example, some payoffs cannot be replicated, and therefore
do not have a unique price.
22 With $\int_0^T \Delta_u^2 \, d\langle X \rangle_u + \sum_{\ell=1}^{n} \int_0^T (\Gamma^\ell_u)^2 \, d\langle C^\ell \rangle_u < \infty$.

If we assume that h is a $C^{1,2}$ function, we can apply Ito and find, using the martingale
property of $H_t$, that

$$H(X_T) = \mathbb{E}[\, H(X_T) \,] + \int_0^T \partial_X h_t(X_t) \, dX_t$$

Hence, we have found a trading strategy $\Delta_t := \partial_X h_t(X_t)$, which replicates our payoff,
and the price $H_0 := \mathbb{E}[H(X_T)]$. Similarly, at any later time $t < T$, we can write

$$H_T = H_t + \int_t^T \partial_X h_u(X_u) \, dX_u,$$

that is, the value $H_t$ is the fair price of $H_T$ at t.
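
To illustrate the replication argument, the following sketch delta-hedges a call at discrete dates and measures how far the hedged portfolio ends from the payoff. It assumes the simplest local volatility model, a constant volatility (Black-Scholes) with zero rates, for which $h_t$ and $\partial_X h_t$ are known in closed form; all parameter values and function names are illustrative.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def norm_cdf(x):
    return 0.5 * (1.0 + _erf(np.asarray(x, dtype=float) / math.sqrt(2.0)))

def bs_price(x, k, vol, tau):
    """h_t(x): zero-rate Black-Scholes call price with time to maturity tau > 0."""
    d1 = (np.log(x / k) + 0.5 * vol**2 * tau) / (vol * math.sqrt(tau))
    return x * norm_cdf(d1) - k * norm_cdf(d1 - vol * math.sqrt(tau))

def bs_delta(x, k, vol, tau):
    """d/dx h_t(x), the hedge ratio used as trading strategy."""
    d1 = (np.log(x / k) + 0.5 * vol**2 * tau) / (vol * math.sqrt(tau))
    return norm_cdf(d1)

def hedge_error_std(n_steps, n_paths=2000, x0=1.0, k=1.0, vol=0.2, T=1.0, seed=0):
    """Std deviation of V_T - H(X_T) when rebalancing n_steps times."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.full(n_paths, x0)
    value = np.full(n_paths, float(bs_price(x0, k, vol, T)))   # V_0 = H_0
    for j in range(n_steps):
        delta = bs_delta(x, k, vol, T - j * dt)
        shock = rng.standard_normal(n_paths)
        x_new = x * np.exp(vol * math.sqrt(dt) * shock - 0.5 * vol**2 * dt)
        value += delta * (x_new - x)        # gain Delta_{j-1} (X_j - X_{j-1})
        x = x_new
    return float(np.std(value - np.maximum(x - k, 0.0)))

err_coarse = hedge_error_std(10)
err_fine = hedge_error_std(200)
print(err_coarse, err_fine)
```

The replication error shrinks as the rebalancing frequency increases, in line with the convergence of the discrete sums to the stochastic integral.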

1.4.1 Replication in Diffusion-Driven Markets


The above considerations can now also be applied to more general cases: we will
now show how hedging works in a framework where a range of market instruments
is driven by an underlying Markov process (we will concentrate on diffusions here).
This will be put to use in section 2.3 when we discuss hedging of options on
variance with variance swaps. The idea is as follows: if we want to hedge a payoff
$H_T = H(X_T)$, we will try to use the stock price X to hedge it, but if the market is not a
local volatility model, we will need additional traded instruments to cover us against
changes in the value of $H_T$. To this end, we assume that there are liquid instruments
$C = (C^1, \ldots, C^n)$ that are traded alongside the stock. For example, think of a finite
number of European options on S. We assume without loss of generality that
the price processes $C^\ell = (C^\ell_t)_{t \in [0,T]}$ for $\ell = 1, \ldots, n$ are defined until T; for an option
with an earlier maturity $T' < T$, we simply set $C^\ell_t := C^\ell_{T'}$ for $t \in [T', T]$. We also
assume that $C^1, \ldots, C^n$ are bounded from below.
To apply the same idea as for the case of local volatility, we now stipulate that the
vector $(X, C^1, \ldots, C^n)$ of market instruments is given in terms of a finite-dimensional
diffusion $Z = (Z_t)_{t \in [0,T]}$ with open state space $\mathcal{Z} \subseteq \mathbb{R}^{m+1}_{\geq 0}$ by a function $\Phi$ as

$$(X_t, C^1_t, \ldots, C^n_t) = \Phi(Z_t)$$

The function $\Phi: \mathcal{Z} \to \mathbb{R}^{n+1}_{\geq 0}$ is assumed to be invertible and differentiable. For
all applications it will be appropriate to assume that X itself is among the state
variables Z, and we set $Z^0 := X$ accordingly. The process Z is thought to represent
the "state factors" of the market. The inherent assumption is that the relevant
information available in the entire market is incorporated in a finite set of states Z;
in the end, we could well assume that $Z = (X, C^1, \ldots, C^d)$, but it is often tricky to
model, say, European options along with the stock price.
We limit our attention to diffusions and assume that Z is the unique strong
solution to an SDE

$$dZ_t = \mu(Z_t) \, dt + \sum_{j=1}^{d} \sigma^j(Z_t) \, dW^j_t, \qquad Z_0 = z, \tag{1.40}$$

where $W = (W^1, \ldots, W^d)$ is a d-dimensional Brownian motion. The drift vector
$\mu = (\mu_0, \ldots, \mu_m): \mathcal{Z} \to \mathbb{R}^{m+1}$ and the volatilities $\sigma^j = (\sigma^j_0, \ldots, \sigma^j_m): \mathcal{Z} \to \mathbb{R}^{m+1}_{\geq 0}$ for
$j = 1, \ldots, d$ are not explicitly time dependent, but imposing, say, $z_m := 0$, $\mu_m := 1$
and $\sigma^1_m = \cdots = \sigma^d_m = 0$ allows us to set $Z^m_t = t$.
Note, in particular, that in contrast to standard assumptions, we do not require
that $\sigma$ has full rank. We merely require that the SDE (1.40) has a unique strong
solution. A sufficient but not necessary criterion is that $\mu$ and $\sigma$ are Lipschitz
continuous.

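Example 3 Let

$$\frac{dX_t}{X_t} = \sqrt{\zeta_t} \left( \rho \, dW^1_t + \sqrt{1 - \rho^2} \, dW^2_t \right), \qquad d\zeta_t = a(\zeta_t) \, dt + b(\zeta_t) \, dW^1_t$$

such that $\zeta$ is well defined and set $Z_t := (X_t, \zeta_t, t)$. Such a model satisfies the above
assumptions with

$$C_t := (X_t, C^1_t),$$

where $C^1_t$ is the value of a European option with maturity $T' \leq T$, for example,

$$C^1_t \equiv \Phi^1(t; X_t, \zeta_t) := \mathbb{E}\left[ (X_{T'} - K)^+ \mid X_t, \zeta_t \right]$$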

Example 4 In the same setting as in the example before, define $Z^0 := X$, $Z^1 := \zeta$,
$Z^3_t := t$ and, additionally, $Z^2_t := V_t(t) := \int_0^t \zeta_u \, du$; see also section 2.3. Then,

$$C^1_t \equiv \Phi^1(t; X_t, \zeta_t, V_t(t)) := \mathbb{E}\left[ \int_0^T \zeta_u \, du \;\Big|\; X_t, \zeta_t, V_t(t) \right] = \mathbb{E}\left[ \int_t^T \zeta_u \, du \;\Big|\; \zeta_t \right] + V_t(t)$$

satisfies the assumptions made before. The contract $C^1$ is called the variance swap
with maturity T on X. We will see in section 2.3 that these instruments are very
natural hedging instruments for options on realized variance.

Delta Hedging Works Now assume as before that we want to hedge a smooth,
bounded European payoff $H_T = H(X_T)$. As before, we define the martingale
$(H_t)_{t \in [0,T]}$ via

$$H_t := \mathbb{E}[\, H(X_T) \mid \mathcal{F}_t \,]$$

Note that because we have assumed that H is bounded, this is a true martingale.
Using the Markov property of Z, we can write again

$$H_t = h_t(Z_t) := \mathbb{E}[\, H(X_T) \mid Z_t \,]$$

Since $\Phi$ is invertible, we can set $g_t(x, c^1, \ldots, c^n) := h_t\left( \Phi^{-1}(x, c^1, \ldots, c^n) \right)$, such that
given X and $C = (C^1, \ldots, C^n)$, we have

$$H_t = g_t(X_t, C_t)$$

Assumption 1 For smooth bounded functions $H \in C_K^\infty$ with bounded derivatives
and compact support, the function

$$h_t(z) := \mathbb{E}\left[ H(X_t, C^1_t, \ldots, C^n_t) \mid Z_0 = z \right]$$

is $C^1$ in z for all t and continuous in t.

If the assumption holds, then h is differentiable in z, and so is g (via the inverse
function theorem applied to $\Phi$). An application of Ito (possibly to an approximation
of g by $C^{1,2}$ functions)23 shows that "delta hedging works,"

$$H_T = H_0 + \int_0^T \partial_X g_t(X_t, C_t) \, dX_t + \sum_{\ell=1}^{n} \int_0^T \partial_{C^\ell} g_t(X_t, C_t) \, dC^\ell_t \tag{1.41}$$

General Contingent Claims To handle more general payoffs, note that we can
approximate nonsmooth European payoffs by bounded smooth payoffs, and that
general path-dependent payoffs of the form $H_T = H\left( (X_t)_{t \in [0,T]}, (C_t)_{t \in [0,T]} \right)$ can be
approximated by payoffs that depend on finitely many states of X and C. The latter payoffs,
in turn, can be approximated by payoffs that are products of payoffs of the
form $H_k(X_{t_k}, C_{t_k})$ for a finite number of dates $t_1, \ldots, t_n$; see also [18]. The crucial
condition to ensure completeness of the market is that assumption 1 holds.

Theorem 1.4.1 Under assumption 1, the market of relevant payoffs on (X, C) is
complete.

For example, option payoffs with prices $H_t = h_t(Z_t; A_t)$, which depend not only on
Z but also on some finite variation process $A = (A^1, \ldots, A^q)$, can still be delta-hedged
with Z,

$$H_T = H_0 + \int_0^T \partial_X g_t(X_t, C_t; A_t) \, dX_t + \sum_{\ell=1}^{n} \int_0^T \partial_{C^\ell} g_t(X_t, C_t; A_t) \, dC^\ell_t, \tag{1.42}$$

where $g_t(x, c; a) := h_t\left( \Phi^{-1}(x, c); a \right)$.

23 Technical details and tighter results can be found in Buehler [18].


Deterministic Dividends, Interest Rates, and Default Risk Until now, we have focused
solely on the case where S = X, that is, we have abandoned the deterministic
market data of the first section. Let us briefly comment on the impact of using

St Ft Xt 1 t At 1 t

according to (1.6). The aim is to replicate H(ST ).


As a first step, we distinguish between default and no default. Since S drops to
zero upon default,

H(ST ) H Ft Xt At 1 T H(0)1 T

Hence,

P(0, T) [ H(ST ) ] PS (0, T) H FT XT AT P(0, T) 1 BST 1


H(0)

PS (0, T) H XT P(0, T) PS (0, T) H(0)

with H(x) : H(FT x AT ). Since we can lock in P(0, T) PS (0, T) H(0) by enter-
ing into a static position of risky and riskless bonds, we need to concentrate only on
the replication of H XT . We hence define

Ht : P(t, T) H(XT ) t 1 t (1.43)

As before, we assume that $Z = (Z^0, \ldots, Z^m)$ with $Z^0 = X$ is uniquely given by the
SDE (1.40), and that a range of traded instruments $C = (C^1, \ldots, C^n)$ is given as a
function of Z. Without loss of generality we can assume with the same trick as above
that each instrument $C^\ell$ attains zero value upon default; hence, we set
t
Ct (Zt )e 0 (ru hu ) du
1 BSt (Zt )
t

We will also strip S of all its dividends and the repo by using the (tradable) process
S(plain) defined in (1.9). The key is that we will use the risky bond BS as the cash
account, that is, we aim to construct a replication strategy such that

T n T T
(plain) S
H(ST )1 T H0 u dSu u dCu u dBu
0 1 0 0

As usual, the value of is determined on the set t by the self-financing


requirement as

(plain) n
Ht t St 1 t Ct
t :
BSt

The next step is to express, as before, the conditional expectation (1.43) in terms of
Z, rewrite it by the invertibility assumption in terms of C, and then apply Ito. Let

ht (x, z) : H(XT ) Xt x, Zt z and gt (x, c) : ht x, 1 x, c , to which

our previous results apply if g is sufficiently smooth. Defining, moreover,

(plain)
(plain) St Ct
gt St , Ct : gt , ,
BSt BSt

yields that on t,
n
(plain) (plain) (plain)
dHt (rt ht )Ht dt S gt St , Ct dSt C gt St , Ct dCt
1

In other words, the market is complete.


CHAPTER 2
Applications

While chapter 1 highlighted the principles of equity pricing from a rather theoretical
point of view, we want to focus now on practical aspects: we will discuss a
few commonly used stochastic volatility models and applications to Cliquet pricing;
we will also address the pricing of payoffs that depend on the realized variance of
an asset. In particular, ‘‘variance swaps’’ have become very liquid instruments and
trading volumes are set to grow even further. The respective options on variance are
an attractive new class of products on which to work.

2.1 CLASSIC EQUITY MODELS


In section 1.3.2, we discussed how we can construct martingales that fit a given
initial option price surface, the most popular approach being Dupire’s implied local
volatility. We have already mentioned that in practice, it is rarely possible to obtain
a continuum of option prices. Another problem with using an ‘‘implied’’ model is
that it does not allow us to control the specific dynamics of the resulting actual
stock price process. In this sense, we want to stress that a model that fits very well
to some market does not at all guarantee that it produces acceptable prices: for
example, consider a stock for which only forwards are traded, but no options. Then
a ‘‘perfectly fitting’’ model would be given by a deterministic stock price process.1 In
this case it is obvious that this ‘‘model’’ cannot be correct if we want to price options
on the stock. This argument can be carried over to volatility models: The mere fit
of a model to European option data does not imply that it gives sensible hedges or
prices for exotic payoffs. For this reason, it makes sense to take a ‘‘structural’’ point
of view and model the stock and its volatility directly, using a particular assumption
on the SDE it satisfies. We will review here a few of such classical stochastic volatility
models.

2.1.1 Heston
Probably the most popular model is Heston's stochastic volatility model [19].
It is given as a solution to the SDE

$$\begin{aligned}
d\zeta_t &= \kappa(\theta - \zeta_t) \, dt + \nu \sqrt{\zeta_t} \, dW^1_t \\
dX_t &= X_t \sqrt{\zeta_t} \, dB_t \\
dB_t &= \rho \, dW^1_t + \sqrt{1 - \rho^2} \, dW^2_t,
\end{aligned} \tag{2.1}$$

where $W = (W^1, W^2)$ is a two-dimensional standard Brownian motion. We call
$\kappa$ the "speed of mean reversion" or "mean reversion speed," $\sqrt{\theta}$ the "long vol,"
$\nu$ the "vol of vol," $\rho$ the "correlation," and the initial value $\sqrt{\zeta_0}$ the "short vol." We
also refer to $\theta$ as "level of mean reversion." The two parameters vol of vol and
correlation can be thought of as being responsible for the skew. This is illustrated
in figure 2.1: vol of vol controls the volume of the smile and correlation its "tilt."
A negative correlation produces the desired downward skew of implied volatility.
The other three parameters control the term structure of the model:2 In figure 2.2,
the impact of changing short vol, long vol, and mean reversion speed on the term
structure of ATM implied volatility is illustrated. It can be seen that short vol lives
up to its name and controls the level of the short-dated implied volatilities, while
long vol controls the long end. Reversion speed controls the skewness or "decay" of
the curve from the short vol level to the long vol level.

1 This example is due to Peter Carr.

[Figure 2.1: 1y implied volatility (0 to 30) against Strike/Forward (60.7% to 164.9%, log scale) for Black-Scholes, Heston with zero correlation, and Heston; arrows mark the effects of "VolOfVol" and "Correlation".]
FIGURE 2.1 Stylized effects of changing vol of vol and correlation in Heston's model on the
1y implied volatility. The "Heston" parameters are $\zeta_0 = 15\%^2$, $\theta = 20\%^2$, $\kappa = 1$, $\rho = -70\%$
and $\nu = 35\%$.
Note, however, that the distinction of the parameters by their effect on term
structure and strike structure above was made for illustration purposes only: In
particular, $\kappa$ and $\nu$ are strongly interdependent if the model is used in the form (2.1).
Indeed, $\kappa$ is meant to be the "speed" of the process, but it does not feature in
the volatility term of the variance. This is counterintuitive in the following sense:

2 Note that the parameters $\rho$, $\nu$ and $\zeta_0$, $\theta$, $\kappa$ are not really "orthogonal"; we group them here
just for illustration purposes.

[Figure 2.2: three panels (Short Vol, Long Vol, Reversion Speed), each showing the ATM implied volatility term structure for maturities from 0 to 12 years; curves for ShortVol 10, 15, 20; LongVol 15, 20, 25; ReversionSpeed 0.5, 1, 1.5.]
FIGURE 2.2 The effects of changing short vol, long vol, and mean-reversion speed on the
ATM term structure of implied volatilities. Each graph shows the volatility term structure for
12 years. The reference Heston parameters are $\zeta_0 = 15\%^2$, $\theta = 20\%^2$, $\kappa = 1$, $\rho = -70\%$ and
$\nu = 35\%$.

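Consider the time change $\tau_t := t/\kappa$. The time-changed process $\tilde\zeta_t := \zeta_{\tau_t}$ then satisfies

$$d\tilde\zeta_t = (\theta - \tilde\zeta_t) \, dt + \frac{\nu}{\sqrt{\kappa}} \sqrt{\tilde\zeta_t} \, d\tilde W_t \tag{2.2}$$

for a Brownian motion $\tilde W$. The process $(\tilde\zeta_t)_t$ can be seen as being in "unit speed," $\kappa = 1$,
but its vol of vol still depends on $\kappa$. From this point of
view it would be more natural to parameterize the process in (2.1) as

$$d\zeta_t = \kappa(\theta - \zeta_t) \, dt + \nu \sqrt{\kappa \, \zeta_t} \, dW^1_t$$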

Properties of Heston’s Model One of the most attractive features of Heston’s model
is the fact that its variance is mean reverting. Such a mean-reverting feature is
commonly seen in real market data; see also figure 2.3. Moreover, its calibrated
correlation of around $-70\%$ is quite stable over time and produces, as we will show,
a relatively good fit to the market's implied volatilities, at least for maturities beyond
three months. (Figures 2.6, 2.7, and 2.10 show examples of calibrating Heston and
other models to market data.)
However, Heston’s popularity is probably mainly derived from the fact that
it is possible to price European options on X using a semiclosed-form Fourier
transformation, which in turn allows rapid calibration of the model parameters to
market data.
The underlying mathematical reason for the relative tractability of Heston's
model is that $\zeta$ is a squared Bessel process, which is well understood and reasonably
tractable (cf. Revuz/Yor [13]). In fact, a statistical estimation on SPX by
Aït-Sahalia/Kimmel [20] of $\alpha \in [1/2, 2]$ in the extended model

$$d\zeta_t = \kappa(\theta - \zeta_t) \, dt + \nu \, \zeta_t^{\alpha} \, dW^1_t$$

has shown that, depending on the observation frequency, a value around $\alpha \approx 0.7$ would
probably be more adequate. What is more, the square-root volatility term means
that unless

$$2\kappa\theta \geq \nu^2, \tag{2.3}$$

the process $\zeta$ can reach zero with nonzero probability. The crux is that this condition
is regularly violated if the model is calibrated freely to observed market data. While
a vanishing short variance is not a problem in itself (after all, a variance of zero just
implies that nobody trades), it makes numerical approximations more complicated.
In a Monte Carlo simulation, for example, we have to take the event of $\zeta$ being
negative into account. The same problem appears in a PDE solver: Heston's PDE
becomes degenerate if the short vol hits zero (cf. section 9.4). A violation of (2.3)
also implies that the distribution of short variance at some later time t is very wide
(see figure 2.4).

[Figure 2.3: "SPX spot level and 30-day realized volatility"; SPX price (left axis, 0 to 2000) and 30-day volatility (right axis, 0 to 100) from 1957 to 2007.]
FIGURE 2.3 Historic SPX quotes and estimated 30-day variance. Apart from occasional
spikes we can identify the mean-reverting nature of the variance. It should be noted that the
level of mean-reversion itself also varies over time.
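
As a small illustration of condition (2.3), the sketch below checks $2\kappa\theta \geq \nu^2$ for the reference parameters quoted in the figure captions ($\theta = 20\%^2$, $\kappa = 1$) and three vol-of-vol levels. The function name is hypothetical.

```python
def feller_condition_holds(kappa, theta, vol_of_vol):
    """Return True if 2*kappa*theta >= vol_of_vol**2, i.e. condition (2.3)."""
    return 2.0 * kappa * theta >= vol_of_vol ** 2

kappa, theta = 1.0, 0.20 ** 2          # reference parameters from the captions
for nu in (0.20, 0.35, 0.40):
    print(f"vol of vol {nu:.2f}: condition (2.3) holds = "
          f"{feller_condition_holds(kappa, theta, nu)}")
```

With these parameters the condition holds for a vol of vol of 20% but fails for 35% and 40%, which matches the two cases contrasted in figure 2.4.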
Additionally, if (2.3) does not hold, then the stock price X may fail to have
a second moment if the correlation is not negative enough in the sense detailed
in proposition 3.1 in Andersen/Piterbarg [21]. Again, this is not a problem from a
purely mathematical point of view, but it makes numerical schemes less efficient.
In particular, Monte Carlo simulations perform much worse. Although an Euler
scheme will still converge to the desired value, the speed of convergence deteriorates.
Moreover, we can no longer safely use control variates if the payoff is not
bounded.

[Figure 2.4: two panels, "Probability density of Heston's short vol for a vol of vol of 20%" and "... for a vol of vol of 40%"; densities at 1m, 3m and 6m against the short volatility level (-5 to 40).]
FIGURE 2.4 This graph shows the density of $\zeta_t$ for one, three, and six months for the case
where condition (2.3) is satisfied (left side) or not (right side). Apart from the vol of vol, the
parameters were $\zeta_0 = 15\%^2$, $\theta = 20\%^2$, and $\kappa = 1$.

Computing European Option Prices with Fourier Transforms To compute European
option prices, we focus on the call price. Following Carr/Madan [22], we will price
the call via Fourier inversion (see also Lewis [23] for a detailed overview of the
subject). Let, as before,

$$\mathbb{C}(T, e^k) := \mathbb{E}\left[ (X_T - e^k)^+ \right]$$

Since the call price itself is not an $L^2$ function in k, we define a dampened call

$$c(T, k) := e^{\alpha k} \, \mathbb{C}(T, e^k)$$

for an $\alpha > 0$ (see Carr/Madan [22] for a discussion on the choice of $\alpha$). We also
denote by $q_t$ the density and by $\varphi_t$ the characteristic function of $\log X_t$. Then,

$$\begin{aligned}
\psi_t(z) &:= \int_{-\infty}^{\infty} e^{ikz} \, c(t, k) \, dk \\
&= \int_{-\infty}^{\infty} e^{ikz} e^{\alpha k} \int_{-\infty}^{\infty} 1_{x \geq k} \left( e^x - e^k \right) q_t(x) \, dx \, dk \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{x} \left( e^{(iz + \alpha)k + x} - e^{(iz + \alpha + 1)k} \right) dk \; q_t(x) \, dx \\
&= \int_{-\infty}^{\infty} \frac{e^{(iz + \alpha + 1)x}}{(iz + \alpha)(iz + \alpha + 1)} \, q_t(x) \, dx
= \frac{\varphi_t\left( z - i(\alpha + 1) \right)}{(iz + \alpha)(iz + \alpha + 1)}
\end{aligned}$$

We can then price a call on X using

$$\mathbb{C}(T, e^k) = \frac{e^{-\alpha k}}{\pi} \int_0^{\infty} e^{-izk} \, \psi_T(z) \, dz,$$

where the real part of the integral is understood. The method also lends itself to the
fast Fourier transform if a range of option prices for a single maturity is required.
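
The inversion formula can be tried out on a model where the answer is known. The following sketch assumes a driftless log-normal X with $X_0 = 1$ and zero rates (not Heston), whose characteristic function is available in closed form, and compares the Fourier price with the Black-Scholes formula. The damping $\alpha = 1.5$, the truncation of the integral, and all function names are illustrative choices, not prescriptions from the text.

```python
import math
import numpy as np

def carr_madan_call(char_fn, k, alpha=1.5, z_max=100.0, n=20001):
    """E[(X_T - e^k)^+] from the characteristic function of log X_T."""
    z = np.linspace(0.0, z_max, n)
    psi = char_fn(z - 1j * (alpha + 1.0)) / ((1j * z + alpha) * (1j * z + alpha + 1.0))
    integrand = np.real(np.exp(-1j * z * k) * psi)
    dz = z[1] - z[0]
    # Trapezoidal rule on [0, z_max]; the integrand decays quickly here.
    integral = dz * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return math.exp(-alpha * k) / math.pi * integral

def lognormal_char_fn(vol, T):
    """Characteristic function of log X_T = vol*W_T - vol^2*T/2 (X_0 = 1)."""
    return lambda z: np.exp(-0.5 * vol * vol * T * (1j * z + z * z))

def black_scholes_call(k, vol, T):
    """Zero-rate Black-Scholes call with spot 1 and strike e^k."""
    sd = vol * math.sqrt(T)
    d1 = -k / sd + 0.5 * sd
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return cdf(d1) - math.exp(k) * cdf(d1 - sd)

vol, T = 0.2, 1.0
for k in (math.log(0.9), 0.0, math.log(1.1)):
    fourier = carr_madan_call(lognormal_char_fn(vol, T), k)
    print(k, fourier, black_scholes_call(k, vol, T))
```

For a stochastic volatility model only `char_fn` changes; the inversion step stays the same.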
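Heston's Characteristic Function Let us now show how we can compute Heston's
characteristic function,

$$\varphi_T(z) := \mathbb{E}\left[ e^{iz \log X_T} \right]$$

We present here an approach that is mathematically not rigorous, but very intuitive.
See Heston's original work for a more precise derivation of the characteristic
function. We have

$$\begin{aligned}
\varphi_T(z) &= \mathbb{E}\left[ e^{iz \log X_T} \right] \\
&= \mathbb{E}\left[ e^{iz \int_0^T \sqrt{\zeta_u} \, dB_u - \frac{iz}{2} \int_0^T \zeta_u \, du} \right] \\
&= \mathbb{E}^z\left[ e^{-\frac{iz + z^2}{2} \int_0^T \zeta_u \, du} \right],
\end{aligned}$$

where $\mathbb{P}^z$ is the complex measure associated with the density $e^{iz \int_0^T \sqrt{\zeta_u} \, dB_u + \frac{z^2}{2} \int_0^T \zeta_u \, du}$.
We have $B_t = B^z_t + iz \int_0^t \sqrt{\zeta_u} \, du$ for a $\mathbb{P}^z$-Brownian motion $B^z$. This implies that under
$\mathbb{P}^z$, the process $\zeta$ satisfies

$$d\zeta_t = \hat\kappa(\hat\theta - \zeta_t) \, dt + \nu \sqrt{\zeta_t} \, dW^z_t \qquad \text{with } \hat\kappa := \kappa - iz\rho\nu \text{ and } \hat\theta := \kappa\theta / \hat\kappa.$$

Here, $W^z$ is a $\mathbb{P}^z$-Brownian motion with a correlation of $\rho$ with respect to $B^z$. We
can therefore compute $\varphi$ using the more general function

$$\Lambda_T(\lambda, h; x_0) := \mathbb{E}\left[ e^{-\lambda x_T - h \int_0^T x_u \, du} \right]$$

for a process

$$dx_t = (m - k x_t) \, dt + \sigma \sqrt{x_t} \, dW_t$$

To this end, note that because of the Markov property of x, the process
$e^{-h \int_0^t x_u \, du} \, \Lambda_{T-t}(\lambda, h; x_t)$ is a martingale on [0, T]. Hence, by using Ito and division by
$e^{-h \int_0^t x_u \, du}$, we obtain the PDE (writing $\Lambda_{T-t}$ for $\Lambda_{T-t}(\lambda, h; x)$)

$$\begin{aligned}
0 &= -h x \, \Lambda_{T-t} - \partial_T \Lambda_{T-t} + (m - kx) \, \partial_x \Lambda_{T-t} + \tfrac{1}{2} \sigma^2 x \, \partial^2_{xx} \Lambda_{T-t} \\
&= -\partial_T \Lambda_{T-t} + m \, \partial_x \Lambda_{T-t} + x \left\{ -h \Lambda_{T-t} - k \, \partial_x \Lambda_{T-t} + \tfrac{1}{2} \sigma^2 \, \partial^2_{xx} \Lambda_{T-t} \right\}
\end{aligned}$$

with boundary condition $\Lambda_0(\lambda, h; x) = e^{-\lambda x}$. Since x is affine, we guess that $\Lambda$ is an
exponential of an affine function,

$$\Lambda_T(\lambda, h; x) = e^{-x A_T(\lambda, h) - m B_T(\lambda, h)} \tag{2.4}$$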

By solving the above PDE for this function, we obtain

ae t
AT ( , h) t
(2.5)
be

and
be t
T b t (a b) log b
Bt ( , h) At ( , h) dt (2.6)
0 b

with the constants

: h( b) 2
a: h( b) 2
2
: h k
2
b: h k
: k2 2 2

For the case where m is time-dependent, see section 2.1.5 below.
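
As a cross-check that does not rely on the closed-form constants, the affine ansatz can also be solved numerically. The sketch below assumes the sign convention $\Lambda_T(\lambda, h; x) = \mathbb{E}[e^{-\lambda x_T - h \int_0^T x_u du}] = e^{-x A_T - m B_T}$, under which inserting the ansatz into the PDE gives the Riccati system $A' = h - kA - \tfrac{1}{2}\sigma^2 A^2$, $B' = A$ with $A_0 = \lambda$, $B_0 = 0$. That convention, the Runge-Kutta integrator, and all function names are assumptions of this example.

```python
import cmath

def laplace_affine(lam, h, x0, m, k, sigma, T, n_steps=2000):
    """E[exp(-lam*x_T - h*int_0^T x_u du)] for dx = (m - k x) dt + sigma sqrt(x) dW,
    via RK4 on A' = h - k A - 0.5 sigma^2 A^2, B' = A, A(0) = lam, B(0) = 0."""
    def rhs(a):
        return h - k * a - 0.5 * sigma * sigma * a * a
    dt = T / n_steps
    a, b = complex(lam), 0j
    for _ in range(n_steps):
        k1 = rhs(a)
        k2 = rhs(a + 0.5 * dt * k1)
        k3 = rhs(a + 0.5 * dt * k2)
        k4 = rhs(a + dt * k3)
        # B' = A uses the same stage values of A as the A-equation.
        b += dt / 6.0 * (a + 2.0 * (a + 0.5 * dt * k1) + 2.0 * (a + 0.5 * dt * k2) + (a + dt * k3))
        a += dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return cmath.exp(-x0 * a - m * b)

def heston_char_fn(z, zeta0, kappa, theta, nu, rho, T):
    """E[exp(i z log X_T)] with X_0 = 1, using the measure change from the text."""
    k_hat = kappa - 1j * z * rho * nu
    h = 0.5 * (1j * z + z * z)
    return laplace_affine(0.0, h, zeta0, kappa * theta, k_hat, nu, T)

print(heston_char_fn(1.3, 0.15**2, 1.0, 0.20**2, 0.35, -0.70, 1.0))
```

For vanishing vol of vol and $\zeta_0 = \theta$ the result reduces to the log-normal characteristic function $e^{-h\theta T}$ with $h = (iz + z^2)/2$, which gives a simple sanity check.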

Simulating Heston Once we have calibrated the model using the aforementioned
semiclosed-form solution for European options, the question is how to evaluate
complex products. At our disposal are PDEs and Monte Carlo schemes. We
briefly comment on the Monte Carlo approach: we want to simulate the Heston
process (2.1) in an interval [0, T]. Since the conditional transition density of the
entire process is not known, we have to resort to solving a discretization of the
SDE (2.1). To this end, assume that we are given fixing dates $0 = t_0 < \cdots < t_N = T$
and let $\Delta t_i := t_{i+1} - t_i$ for $i = 0, \ldots, N-1$. Moreover, we denote by $\Delta W_i$ for
$i = 0, \ldots, N-1$ a sequence of independent normal variables with variance $\Delta t_i$, and
by $\Delta B_i$ a corresponding sequence where $\Delta B_i$ and $\Delta W_i$ have correlation $\rho$.
When using a straightforward Euler scheme, we face the problem that $\zeta$ can
become negative. It works well simply to reduce the volatility term of the variance
to the positive part of the variance, that is, to simulate

$$\zeta_{t_{i+1}} = \zeta_{t_i} + \kappa(\theta - \zeta_{t_i}) \, \Delta t_i + \nu \sqrt{\zeta_{t_i}^+} \, \Delta W_i$$

A flaw of this scheme is that it is biased. This is overcome by using the moment-
matching scheme

$$\zeta_{t_{i+1}} = \theta + (\zeta_{t_i} - \theta) \, e^{-\kappa \Delta t_i} + \nu \sqrt{\zeta_{t_i}^+ \, \frac{1 - e^{-2\kappa \Delta t_i}}{2\kappa \, \Delta t_i}} \; \Delta W_i, \tag{2.7}$$

which works well in practice, see figure 2.5. Higher-order schemes such as Milstein
cannot be used with this process since the square root is not differentiable at 0 (this
is not such a big problem if we ensure that (2.3) is satisfied). A similar approach
is used to compute the stock price: Here, we note that the expected integral over $\zeta$ in the
interval $[t_i, t_{i+1}]$ conditional on $\zeta_{t_i}$ is given as

$$\Delta_i V := \theta \, \Delta t_i + (\zeta_{t_i} - \theta) \, \frac{1 - e^{-\kappa \Delta t_i}}{\kappa};$$

hence, we set

$$X_{t_{i+1}} := X_{t_i} \exp\left( \sqrt{\frac{\Delta_i V}{\Delta t_i}} \, \Delta B_i - \frac{1}{2} \Delta_i V \right)$$

[Figure 2.5: "Removing the bias"; error in expected annualized quadratic variation (-0.40% to 0.80%) against maturity in years (0 to 5) for plain Euler with 5 to 200 MC steps per year and for the unbiased scheme.]
FIGURE 2.5 Plain Euler with various steps per year vs. the unbiased scheme. The model
parameters were $\zeta_0 = 30\%^2$, $\theta = 20\%^2$, $\kappa = 2$, $\rho = -70\%$, $\nu = 35\%$. The graph shows the
error between the true and the simulated value of $\mathbb{E}\left[ \int_0^T \zeta_t \, dt \right] / T$.
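
A compact simulation sketch of the simpler of the two schemes, the Euler step with the volatility term restricted to the positive part of the variance, is given below. It uses a uniform time grid and a log-Euler step for the stock, which are choices of this example rather than of the text; parameter values follow the caption of figure 2.5.

```python
import numpy as np

def simulate_heston(zeta0, theta, kappa, rho, nu, T, n_steps, n_paths, seed=0):
    """Simulate (X_T, zeta_T) with X_0 = 1 using a positive-part Euler scheme."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    zeta = np.full(n_paths, zeta0)
    log_x = np.zeros(n_paths)
    for _ in range(n_steps):
        dw1 = rng.standard_normal(n_paths) * np.sqrt(dt)
        dw2 = rng.standard_normal(n_paths) * np.sqrt(dt)
        db = rho * dw1 + np.sqrt(1.0 - rho**2) * dw2      # correlation rho with dw1
        zeta_pos = np.maximum(zeta, 0.0)                   # positive part of the variance
        log_x += np.sqrt(zeta_pos) * db - 0.5 * zeta_pos * dt
        zeta = zeta + kappa * (theta - zeta) * dt + nu * np.sqrt(zeta_pos) * dw1
    return np.exp(log_x), zeta

x_T, zeta_T = simulate_heston(zeta0=0.30**2, theta=0.20**2, kappa=2.0, rho=-0.70,
                              nu=0.35, T=1.0, n_steps=100, n_paths=20000)
expected_zeta = 0.20**2 + (0.30**2 - 0.20**2) * np.exp(-2.0)
print(x_T.mean(), zeta_T.mean(), expected_zeta)
```

The sample mean of $X_T$ should stay close to 1 (the martingale property) and the sample mean of $\zeta_T$ close to $\theta + (\zeta_0 - \theta)e^{-\kappa T}$, up to Monte Carlo and discretization error.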

A powerful tool to improve the convergence of the estimation of an expectation is the
use of control variates (for the case where (2.3) holds). The idea is as follows: Assume we
want to compute the expectation of a random variable X (the payoff) and denote by
$\hat{\mathbb{E}}_n[X]$ the estimate of $\mathbb{E}[X]$ using n Monte Carlo paths. The standard deviation
of the error in this estimate is given by $\sqrt{\mathrm{Var}[X]/n}$ (i.e., it is worthwhile to try
to reduce the variance of the variable we estimate). Now assume that there is a
second random variable Y (the control variate) whose expectation $\mathbb{E}[Y]$ we know
analytically.
The idea is that we estimate the value of $\mathbb{E}[X - hY]$ and add back the value of
$h \, \mathbb{E}[Y]$. It is clear that this scheme is unbiased if our original Monte Carlo scheme
was unbiased. To compute the ideal h, note that $X - hY$ has the variance
$\mathrm{Var}[X] - 2h \, \mathrm{Var}[X, Y] + h^2 \, \mathrm{Var}[Y]$, where $\mathrm{Var}[X, Y]$ denotes the covariance of X and Y.
This is minimized if we set

$$h := \frac{\mathrm{Var}[X, Y]}{\mathrm{Var}[Y]}$$

Since we usually do not know Var[X, Y] and Var[Y], we can replace the above
quantities by their estimates from the n simulated paths. Extension of this idea to a number of
control variates is straightforward (a good reference on Monte Carlo in practice is
Glasserman [17]).
An efficient control variate depends by construction on the actual payoff, but if
no other variance reduction techniques are used, using the integrated variance and
the stock price is usually a good choice. To this end, we track in addition to $\zeta$ and
X also $V_{i+1} := V_i + \Delta_i V$, which is an unbiased estimator of the expected integrated variance

$$\mathbb{E}\left[ \int_0^{t_i} \zeta_u \, du \right] = \theta \, t_i + (\zeta_0 - \theta) \, \frac{1 - e^{-\kappa t_i}}{\kappa}$$
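
The control variate recipe can be sketched in a few lines. To keep the example self-contained it assumes a log-normal stock with constant volatility instead of Heston, a call payoff as X, and the terminal stock price as control variate Y with known mean 1; these choices and all names are illustrative. Estimating h from the same sample introduces a small bias that vanishes as the number of paths grows.

```python
import math
import numpy as np

rng = np.random.default_rng(1)
vol, T, strike, n_paths = 0.2, 1.0, 1.0, 100_000

# Terminal stock price of a driftless log-normal martingale with S_0 = 1.
s_T = np.exp(vol * math.sqrt(T) * rng.standard_normal(n_paths) - 0.5 * vol**2 * T)
payoff = np.maximum(s_T - strike, 0.0)    # X: the payoff to be priced
control = s_T                             # Y: control variate with E[Y] = 1

# h = Cov[X, Y] / Var[Y], both estimated from the simulated paths.
h = np.cov(payoff, control)[0, 1] / np.var(control, ddof=1)

plain_estimate = payoff.mean()
cv_estimate = (payoff - h * control).mean() + h * 1.0
variance_ratio = np.var(payoff - h * control) / np.var(payoff)

cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
exact = cdf(0.5 * vol * math.sqrt(T)) - cdf(-0.5 * vol * math.sqrt(T))  # ATM Black-Scholes
print(plain_estimate, cv_estimate, exact, variance_ratio)
```

The variance ratio shows how much of the payoff variance the control variate removes; the standard error shrinks by the square root of that ratio.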

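2.1.2 SABR
The SABR model introduced by Hagan et al. [24] is given as

$$\begin{aligned}
d\sigma_t &= \nu \, \sigma_t \, dW^1_t \\
dX_t &= \sigma_t \, X_t^{\beta} \, dB_t \\
dB_t &= \rho \, dW^1_t + \sqrt{1 - \rho^2} \, dW^2_t,
\end{aligned} \tag{2.8}$$

for $\beta \in [\tfrac{1}{2}, 1]$ and $X_0 = x$ and $\sigma_0 > 0$. It is a blend between the CEV model
(cf. example 1) and a log-normal volatility model: the former is obtained from (2.8)
by using $\nu = 0$, while the latter corresponds to $\beta = 1$. This model is very popular
in interest rate modeling due to the fact that it is possible to derive approximations
for the implied volatility directly from the model parameters. These approximations
can then be used to interpolate the implied volatility surface in an arbitrage-free way
without the need to compute European option prices numerically with subsequent
computation of implied volatilities. The implied volatility for a strike k at maturity
T is approximated in [24] as

$$\hat\sigma(k, T) \approx \frac{\sigma_0}{(xk)^{(1-\beta)/2} \left( 1 + \frac{(1-\beta)^2}{24} \log^2 \frac{x}{k} + \frac{(1-\beta)^4}{1920} \log^4 \frac{x}{k} \right)} \cdot \frac{z}{\chi(z)} \cdot \left\{ 1 + \left[ \frac{(1-\beta)^2}{24} \frac{\sigma_0^2}{(xk)^{1-\beta}} + \frac{1}{4} \frac{\rho\beta\nu\sigma_0}{(xk)^{(1-\beta)/2}} + \frac{2 - 3\rho^2}{24} \nu^2 \right] T \right\} \tag{2.9}$$

with

$$z := \frac{\nu}{\sigma_0} (xk)^{(1-\beta)/2} \log \frac{x}{k} \qquad \text{and} \qquad \chi(z) := \log \frac{\sqrt{1 - 2\rho z + z^2} + z - \rho}{1 - \rho}$$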
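
The approximation (2.9) translates directly into code. The sketch below is a plain transcription of the formula, with the at-the-money limit $z/\chi(z) \to 1$ handled explicitly; the function name, argument order, and sample parameter values are illustrative.

```python
import math

def sabr_implied_vol(k, T, x, sigma0, beta, rho, nu):
    """Approximate implied volatility (2.9) for strike k, maturity T, and X_0 = x."""
    log_xk = math.log(x / k)
    xk_pow = (x * k) ** ((1.0 - beta) / 2.0)
    z = nu / sigma0 * xk_pow * log_xk
    if abs(z) < 1e-8:
        ratio = 1.0                                   # limit of z / chi(z) as z -> 0
    else:
        chi = math.log((math.sqrt(1.0 - 2.0 * rho * z + z * z) + z - rho) / (1.0 - rho))
        ratio = z / chi
    denominator = xk_pow * (1.0 + (1.0 - beta)**2 / 24.0 * log_xk**2
                            + (1.0 - beta)**4 / 1920.0 * log_xk**4)
    correction = ((1.0 - beta)**2 / 24.0 * sigma0**2 / xk_pow**2
                  + 0.25 * rho * beta * nu * sigma0 / xk_pow
                  + (2.0 - 3.0 * rho**2) / 24.0 * nu**2)
    return sigma0 / denominator * ratio * (1.0 + correction * T)

for strike in (0.8, 1.0, 1.2):
    print(strike, sabr_implied_vol(strike, T=1.0, x=1.0, sigma0=0.2,
                                   beta=1.0, rho=-0.5, nu=0.4))
```

With a negative correlation the implied volatility decreases in the strike, which is the downward skew seen in equity markets.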

While this model is convenient for marking implied volatilities, it has a few
drawbacks when used for pricing equity options. The first issue is that for the
case $\beta < 1$, the stock price itself becomes zero with a nonzero probability just as
the CEV process in example 1. While this might be acceptable for single stocks,
this is rarely a desirable feature for index price processes. Another issue is that
in the case $\beta = 1$ and $\rho > 0$, the stock price in this model is not a martingale,
as Jourdain shows in [25]. He also shows that the model has moments up to
order $1/(1 - \rho^2)$; hence, the second moment does not exist for $\rho > -1/\sqrt{2}$. These
problems stem from the fact that the model has a log-normal volatility, which
implies that volatility can grow exponentially. However, most historic data indicate
that an unbounded volatility process is rather unlikely, and that volatility should be
mean-reverting in some sense (to this end, see figure 2.3 on page 38). Nonetheless,
the model offers an alternative to the Heston model because it can be calibrated very
quickly to observed European market prices using (2.9). At the moment, however,
it does not seem to beat Heston in terms of fitting the market, as figure 2.6 shows.

[Figure 2.6: two panels, "SABR Calibration .STOXX50E (11/01/2006)" and "Heston Calibration .STOXX50E (11/01/2006)", each showing calibration errors (-0.50% to 0.50%) by Strike/Spot and maturity (3m to 2y).]
FIGURE 2.6 Calibration of SABR and unconstrained Heston to STOXX50E data for
maturities from 3m to 2y. Heston appears to fit better to most indices at the time of writing.
The calibrated values were $\sigma_0 = 15.9\%$, $\nu = 46.9\%$, $\rho = -78.0\%$, $\beta = 0.58$ and $X_0 = 0.75$
for SABR and $v_0 = 15.7\%^2$, $\theta = 40.2\%^2$, $\kappa = 0.30$, $\rho = -68.5\%$ and $\nu = 38.3\%$ for
Heston. The SABR fit is only marginally worse for fixed $\beta = 1$ and $X_0 = 1$, in which case the
remaining parameters become $\sigma_0 = 15.9\%$, $\nu = 46.9\%$ and $\rho = -78.0\%$.

The SABR model has been extended in several ways. In [26] Hagan et al. discuss
the model with a more general local volatility function F,

$$\begin{aligned}
d\sigma_t &= \nu \, \sigma_t \, dW^1_t \\
dX_t &= F(X_t) \, \sigma_t \, dB_t \\
dB_t &= \rho \, dW^1_t + \sqrt{1 - \rho^2} \, dW^2_t,
\end{aligned}$$

for which they also present analytical approximations. Moreover, Henry-Labordère
[27] discusses approximation formulas for much more general models than (2.8). In
particular, he introduces a mean-reverting drift into the SDE for $\sigma$ and, additionally,
shows how the local volatility function F in the above equations must be chosen to
perfectly match the short-end skew. In a recent paper, Bourgade and Croissant [28]
also work in this extended framework.

2.1.3 Scott’s Exponential Ornstein-Uhlenbeck Model


Scott [29] has proposed a short-variance process, which is modeled as an exponential
Ornstein-Uhlenbeck (OU) process,

$$\begin{aligned}
dv_t &= \kappa(\theta_t - v_t) \, dt + \nu \, dW^1_t \\
dX_t &= X_t \, e^{v_t} \, dB_t \\
dB_t &= \rho \, dW^1_t + \sqrt{1 - \rho^2} \, dW^2_t
\end{aligned} \tag{2.10}$$

This process has been investigated in depth by Fouque et al. in [30]. This model
shares with the preceding SABR model the loss of the martingale property for $\rho > 0$
and the limitations if the second moment is to be retained (in fact, Jourdain discusses
both models in [25]). From a practical point of view the problem with (2.10) is
that no straightforward method is available that allows the efficient computation of
European option prices or implied volatilities. It should be noted, however, that the
process v itself is very easy to simulate. The complication is to simulate the stock
price X, for which we have to revert to solving the SDE (2.10) via discretization.
The use of control variates as discussed above improves the convergence of a Monte
Carlo scheme, but again this limits us to the case where X has a second moment.
However, if we want to price European options, we can make use of the following
observation: let $\sigma_t := e^{v_t}$, then

$$\begin{aligned}
\mathbb{E}\left[ (X_T - k)^+ \right]
&= \mathbb{E}\left[ \mathbb{E}\left[ \left( e^{\rho \int_0^T \sigma_t \, dW^1_t + \sqrt{1 - \rho^2} \int_0^T \sigma_t \, dW^2_t - \frac{1}{2} \int_0^T \sigma_t^2 \, dt} - k \right)^+ \;\Big|\; (W^1_t)_{t \leq T} \right] \right] \\
&= \mathbb{E}\left[ Y_T \, \mathrm{BS}\left( T, \, k Y_T^{-1}; \, \sqrt{(1 - \rho^2) \, \frac{1}{T} \int_0^T \sigma_t^2 \, dt} \right) \right]
\end{aligned}$$

with

$$Y_T := e^{\rho \int_0^T \sigma_t \, dW^1_t - \frac{1}{2} \rho^2 \int_0^T \sigma_t^2 \, dt},$$

where $\mathrm{BS}(T, K; s)$ denotes the Black-Scholes price of a call with maturity T, strike K,
and volatility s on an underlying with unit forward. Hence, we have reduced the
computation of a European option to a one-factor
problem. This obviously works for all "pure" stochastic volatility models where the
volatility does not depend functionally on the stock price level.
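
The reduction to a one-factor problem can be sketched as a conditional Monte Carlo estimator: only $W^1$ is simulated, and the Black-Scholes formula takes care of $W^2$. The sketch assumes a constant mean-reversion level, an Euler step for v, zero rates, and $|\rho| < 1$; the parameter values and function names are illustrative.

```python
import math
import numpy as np

_cdf = np.vectorize(lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0))))

def bs_call(forward, strike, total_var):
    """Zero-rate Black-Scholes call; total_var is volatility^2 times maturity."""
    sd = np.sqrt(total_var)
    d1 = np.log(forward / strike) / sd + 0.5 * sd
    return forward * _cdf(d1) - strike * _cdf(d1 - sd)

def scott_call_mixing(k, T, v0, kappa, theta, nu, rho,
                      n_steps=200, n_paths=20000, seed=0):
    """E[(X_T - k)^+] in model (2.10) with X_0 = 1, conditioning on the path of W^1."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    v = np.full(n_paths, v0)
    int_sigma_dw = np.zeros(n_paths)      # int_0^T sigma_t dW^1_t
    int_sigma_sq = np.zeros(n_paths)      # int_0^T sigma_t^2 dt
    for _ in range(n_steps):
        dw1 = rng.standard_normal(n_paths) * math.sqrt(dt)
        sigma = np.exp(v)
        int_sigma_dw += sigma * dw1
        int_sigma_sq += sigma * sigma * dt
        v = v + kappa * (theta - v) * dt + nu * dw1
    y_T = np.exp(rho * int_sigma_dw - 0.5 * rho**2 * int_sigma_sq)
    # Y_T * BS(T, k / Y_T; s) equals a call on forward Y_T with strike k.
    return float(np.mean(bs_call(y_T, k, (1.0 - rho**2) * int_sigma_sq)))

price = scott_call_mixing(k=1.0, T=1.0, v0=math.log(0.2), kappa=1.0,
                          theta=math.log(0.2), nu=0.5, rho=-0.7)
print(price)
```

Because the inner expectation is computed in closed form, the estimator has a lower variance than simulating both Brownian motions and averaging the payoff directly.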

2.1.4 Other Stochastic Volatility Models


The list of stochastic volatility models that have been proposed for option pricing
is long. However, apart from Heston-type and SABR-type models, most stochastic
volatility models do not offer easy access to the pricing of European options
or their implied volatilities.3 In contrast, for many Lévy models proposed in the
literature (see, for example, Overhaus et al. [31] and Schoutens [32]), the characteristic
function is available, such that the approach discussed on page 38 can be used
to price Europeans. Numerical methods for such models tend to be more involved
than for diffusion-based models; see Cont/Tankov [33] for a good account of using
Lévy models in finance.

3 Schoebel/Zhu [34] have shown that it is possible to obtain the characteristic function of the
logarithm of the stock price if the short volatility itself is given as an OU process. This model,
however, is somewhat unnatural since the short volatility can become negative.

2.1.5 Extensions of Heston’s Model


Using Heston's model (2.1) as a basis, we can develop a range of related models that
still admit a characteristic function that can be computed more or less quickly. The
first extension is a model in which the level of mean reversion is time dependent:
assume that $(\theta_t)_{t \geq 0}$ is a non-negative function and set

$$\begin{aligned}
d\zeta_t &= \kappa(\theta_t - \zeta_t) \, dt + \nu \sqrt{\zeta_t} \, dW^1_t \\
dX_t &= X_t \sqrt{\zeta_t} \, dB_t \\
dB_t &= \rho \, dW^1_t + \sqrt{1 - \rho^2} \, dW^2_t
\end{aligned} \tag{2.11}$$

A good example, which we will pick up again in section 2.3.3, is $\theta_t := m + (\theta_0 - m) e^{-ct}$.
Following the computations for Heston's model, we find that we can still
write the characteristic function of log X as an exponential of an affine function as
in (2.4). Indeed, the only change is that now, instead of (2.6),

$$B_T(h, \lambda) := \int_0^T \theta_{T-t} \, A_t(h, \lambda) \, dt$$

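If time dependency of the other parameters of Heston’s model is required, we can
revert to the case of piecewise constant parameters. Indeed, let us set

$$
\begin{aligned}
d\zeta_t &= \kappa_t(\theta_t - \zeta_t)\,dt + \nu_t\sqrt{\zeta_t}\,dW^1_t \\
dX_t &= X_t\sqrt{\zeta_t}\,dB_t \\
dB_t &= \rho_t\,dW^1_t + \sqrt{1-\rho_t^2}\,dW^2_t
\end{aligned}
\qquad (2.12)
$$

with functions $\kappa$, $\nu$, and $\rho$, which are piecewise constant on $0 = t_0 < \dots < t_n$.
Assume that $t_k < T \le t_{k+1}$. The characteristic function of $\log X_T$ is then given as

$$
\mathbb{E}\left[e^{iz\log X_T}\right]
= \mathbb{E}\left[\mathbb{E}\left[e^{iz\log X_T}\,\middle|\,\mathcal{F}_{t_k}\right]\right]
= \mathbb{E}\left[e^{iz\log X_{t_k}}\,e^{v_{t_k}A_k(z) + m_{t_k}B_k(z)}\right]
$$

for some constants $A_k(z)$ and $B_k(z)$, where $v_{t_k}$ and $m_{t_k}$ denote the short variance
$\zeta_{t_k}$ and the mean-reversion level $\theta_{t_k}$. By iteration, we obtain once again an exponential
affine characteristic function of $\log X_T$.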
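The backward iteration over the intervals can be sketched numerically. The block below is a minimal illustration, not the authors' implementation: it assumes zero rates, a martingale $X$, and the standard Heston Riccati system written in time-to-maturity, and it integrates that system with a plain RK4 scheme instead of the closed-form coefficients. All names (`riccati_rhs`, `propagate`, `char_fn`) and the parameter layout are hypothetical.

```python
import cmath

def riccati_rhs(a, z, kappa, theta, nu, rho):
    """Heston Riccati system in time-to-maturity: returns (dA/dtau, dB/dtau)."""
    da = (-0.5 * (z * z + 1j * z)
          + (1j * rho * nu * z - kappa) * a
          + 0.5 * nu * nu * a * a)
    db = kappa * theta * a
    return da, db

def propagate(a, b, z, length, params, steps=400):
    """Advance (A, B) across one interval with constant parameters (RK4)."""
    kappa, theta, nu, rho = params
    h = length / steps
    for _ in range(steps):
        k1a, k1b = riccati_rhs(a, z, kappa, theta, nu, rho)
        k2a, k2b = riccati_rhs(a + 0.5 * h * k1a, z, kappa, theta, nu, rho)
        k3a, k3b = riccati_rhs(a + 0.5 * h * k2a, z, kappa, theta, nu, rho)
        k4a, k4b = riccati_rhs(a + h * k3a, z, kappa, theta, nu, rho)
        a += h * (k1a + 2 * k2a + 2 * k3a + k4a) / 6.0
        b += h * (k1b + 2 * k2b + 2 * k3b + k4b) / 6.0
    return a, b

def char_fn(z, T, zeta0, grid, params_list, log_x0=0.0):
    """E[exp(i z log X_T)] for parameters that are constant on each
    interval (grid[k], grid[k+1]]; iterate backwards from T to 0."""
    a, b = 0j, 0j
    for k in reversed(range(len(params_list))):
        lo, hi = grid[k], min(grid[k + 1], T)
        if hi > lo:
            a, b = propagate(a, b, z, hi - lo, params_list[k])
    return cmath.exp(1j * z * log_x0 + a * zeta0 + b)
```

Two simple sanity checks follow from the structure: splitting an interval without changing the parameters must not change the result, and evaluating at $z = -i$ returns $\mathbb{E}[X_T] = X_0$ because $X$ is a martingale.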
In a different direction, Heston’s model can be extended by adding jumps to the
return process. A popular example is Bates’s "Heston Jump Diffusion" [35], which
is a combination of Heston’s model and the jump diffusion model with normal
jumps in the return as in example 2 on page 20. Since the characteristic function
of the jump diffusion part can be computed easily and since the jumps and the
Brownian motions are independent, the characteristic function of Bates’s model is
just the product of the characteristic functions of Heston’s model and the Jump
Diffusion model with zero short volatility (i.e., with the volatility parameter in (1.28) set to zero). The parameters of
this model can also be made time dependent with piecewise constant values.
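The product structure can be sketched as follows. This is a minimal illustration under stated assumptions: jumps in the log-return are normal with mean `mu_j` and standard deviation `sigma_j`, arrive with intensity `lam`, and are compensated so that the price stays a martingale. The parametrization of (1.28) may differ, and `black_scholes_cf` is only a stand-in for the Heston characteristic function, used to keep the example self-contained.

```python
import cmath

def merton_jump_cf(z, T, lam, mu_j, sigma_j):
    """Characteristic function of the compensated pure-jump part of log X_T."""
    jump = cmath.exp(1j * z * mu_j - 0.5 * sigma_j ** 2 * z * z)
    compensator = cmath.exp(mu_j + 0.5 * sigma_j ** 2) - 1.0
    return cmath.exp(lam * T * (jump - 1.0 - 1j * z * compensator))

def bates_cf(z, T, diffusion_cf, lam, mu_j, sigma_j):
    """Independence of jumps and diffusion gives a product of the two parts."""
    return diffusion_cf(z, T) * merton_jump_cf(z, T, lam, mu_j, sigma_j)

def black_scholes_cf(z, T, sigma=0.2):
    """Stand-in diffusion part (constant volatility, zero rates)."""
    return cmath.exp(-0.5 * sigma ** 2 * T * (z * z + 1j * z))
```

In practice `diffusion_cf` would be replaced by the (time-dependent) Heston characteristic function.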
[Figure 2.7: four panels titled ".STOXX50E Heston Calibration 10/01/2006", ".STOXX50E Extended Heston Calibration 10/01/2006", ".STOXX50E Heston TD Calibration 10/01/2006", and ".STOXX50E Bates TD Calibration 10/01/2006"; each panel shows the calibration error (from -0.20% to 0.20%) against Strike/Spot (75% to 125%) for maturities from 1m to 5y.]

FIGURE 2.7 Various models fitted to STOXX50E for maturities from 1m to 5y. The
introduction of time dependency clearly improves the fit. Figure 2.8 on page 48 shows a
summary of the calibration for STOXX50 while figure 2.9 on page 48 and figure 2.10 on
page 49 show the summaries for SPX and FTSE, respectively.

When the number of parameters in a model increases, it will usually also fit
better to the implied volatility. In particular, the extension (2.11) is a good way of
improving the short-end fit of Heston’s model to the implied volatility market. If
a much better fit is required, the piecewise constant time-dependent Heston model
with or without jumps can be used, as is illustrated in figures 2.7 through 2.10.
However, it should be noted that by introducing piecewise constant time-dependent
data, we lose much of a model’s structure. It is turned from a time-homogeneous
model that "takes a view" on the actual evolution of the volatility via
its SDE into a kind of arbitrage-free interpolation of market data: If calibrated
without additional constraints to ensure smoothness of the parameters over time,
this is reflected in large discrepancies of the parameter values for distinct periods.
[Figure 2.8: ".STOXX50E Calibration Results 10/01/2006"; the chart shows (Model-Market)/Spot, from -0.25% to 0.25%, for maturity/strike pairs from 1m/98% to 5Y/125% and for the Heston, Extended Heston, Bates TD, and Heston TD models.]

FIGURE 2.8 A summary view of the calibration for STOXX50E. The extension of Heston
via (2.11) in particular improves the fit of Heston’s model to the short end, which is a
common problem of the original model.

[Figure 2.9: ".SPX Calibration Results 11/01/2006"; the chart shows (Model-Market)/Spot, from -0.25% to 0.25%, for maturity/strike pairs from 1m/98% to 5Y/125% and for the Heston, Extended Heston, Bates TD, and Heston TD models.]

FIGURE 2.9 Calibration results for SPX. The naive calibration for Heston gives a very bad
fit that frequently exceeds the desired 0.10% error threshold.

For example, the excellent fit of the time-dependent Heston model in figure 2.8 is
achieved with the following parameter values (the short volatility $\sqrt{\zeta_0}$ was 15.0%):

6m 1y 3y
Long vol 20.7% 23.6% 36.1% 46.5%
Reversion speed 5.0 3.2 0.4 0.3
Correlation 55.2% 70.9% 80.1% 69.4%
VolOfVol 78.7% 81.5% 35.3% 60.0
[Figure 2.10: ".FTSE Calibration Results 30/12/2005"; the chart shows (Model-Market)/Spot, from -0.25% to 0.25%, for maturity/strike pairs from 1m/98% to 5Y/125% and for the Heston, Extended Heston, Bates TD, and Heston TD models.]

FIGURE 2.10 Calibration results for FTSE.

Moreover, the increased number of parameters makes it more difficult to hedge
in such a model in practice. Even though both Heston and the time-dependent
Heston models create complete markets, as discussed in section 1.4.1, we will
always need to additionally protect our position against moves in the parameter
values of our model. Just as for vega in Black and Scholes, this is typically done by
computing "parameter greeks" and neutralizing the respective sensitivities. Clearly,
the more parameters are involved, and the less stable these are, the less reliable this
"parameter hedge" becomes.

2.1.6 Cliquets
A classic group of "volatility products" in equity markets is called Cliquets. The
term generally refers to contracts whose payoff depends one way or the other on
the performance of an asset over a future period of time. For example, a globally
floored Cliquet with a local floor of 2.5% and a cap of 5% over the reset dates
$0 = t_0 < \dots < t_n = T$ pays

n
Sti
2 5% 1 5% (2.13)
Sti 1
i 1

where we used the notation $a \vee b := \max\{a, b\}$ and $a \wedge b := \min\{a, b\}$. Other, more
exotic payoffs include:

■ Napoleons:

n
Sti
90% 110% 100% ,
Sti 1
i 1

■ Multiplicative Cliquets:

$$\prod_{i=1}^{n}\max\left(\frac{S_{t_i}}{S_{t_{i-1}}},\,1\right) - 1,$$

■ Reverse Cliquets:

n
Sti
C k
Sti 1
i 1

for C 0 and k 0.
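Path-dependent payoffs of this kind are easy to state in code. The sketch below evaluates a generic globally floored Cliquet (periodic returns clipped between a local floor and a local cap, summed, and then floored globally) and a multiplicative Cliquet on a list of fixings at the reset dates. The function names and argument conventions are illustrative assumptions; the exact floor and cap levels of a given contract should be taken from its term sheet.

```python
def globally_floored_cliquet(fixings, local_floor, local_cap, global_floor):
    """Sum of clipped periodic returns, floored at global_floor."""
    total = 0.0
    for prev, curr in zip(fixings, fixings[1:]):
        period_return = curr / prev - 1.0
        total += min(max(period_return, local_floor), local_cap)
    return max(total, global_floor)

def multiplicative_cliquet(fixings):
    """Product of max(S_ti / S_t(i-1), 1) over all periods, minus one."""
    product = 1.0
    for prev, curr in zip(fixings, fixings[1:]):
        product *= max(curr / prev, 1.0)
    return product - 1.0
```

For the fixings `[100, 110, 99, 100]` with a local floor of -2.5%, a cap of 5%, and a global floor of zero, the three clipped returns are 5%, -2.5%, and about 1.01%.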

The evaluation of such products is far from trivial and the market has not yet
settled on an agreed reference model. In fact, at least for single underlying products, a
big step forward would be if it were possible to price and, more importantly, actually
hedge plain forward-started options consistently. For example, a forward-started
call has the payoff

$$\left(\frac{S_{t_2}}{S_{t_1}} - k\right)^+ \qquad (t_1 < t_2) \qquad (2.14)$$

Puts are defined accordingly.4


If we want to price a forward-started option of the type above, it is clear that at
the reset date $t_1$, the contract turns into a plain European option. Since such options
are liquidly traded, this price must be very accurate. In other words, any model we
may propose should internally be able to produce future implied volatility shapes
(i.e., European option prices) that are consistent with historic behavior: we have
already discussed in section 1.2.1 that the general shape of the implied volatility
surface is similar over time. However, we do not necessarily need to fit the entire
implied volatility surface perfectly. Intuitively, the main importance is to fit and
explain well those implied volatilities at time-to-maturity of the length of the period
$\tau := t_i - t_{i-1}$, so typically one month, three months, six months, or one year.

Stochastic Implied Volatility Under these circumstances, the most natural modeling
approach is to model directly the implied volatility surface (or, equivalently, the
implied forward distribution or the European option prices). The first such stochastic
implied volatility model (to our knowledge) was proposed by Brace et al. in [36].5
It has also been discussed by Cont et al. [38] and Haffner [39]. The idea is relatively
straightforward: Let us denote by $\Sigma_t(T, k)$ the implied volatility in our model at time
t for a strike k and a maturity T. We now want to model this quantity directly as a
stochastic process. While it is possible to formulate this idea in terms of stochastic
functions in the spirit of Brace et al. [36], we consider here the more direct approach
of writing $\Sigma$ in terms of a sufficiently well-behaved function G and an m-dimensional
parameter process $Z = (Z_t)_{t\ge 0}$ as

$$\Sigma_t(T, k) := G\left(Z_t;\ T - t,\ \frac{k}{X_t}\right)$$

4 The form (2.14) is called a fixed notional forward started call; the variable notional form
has the payoff $(S_{t_2} - S_{t_1}k)^+$ $(t_1 < t_2)$.

5 In an earlier work, Schoenbucher [37] discusses an implied volatility model for a single
strike K.

For example, we use a d-dimensional Brownian motion $W = (W^1, \dots, W^d)$ and
assume that the m-dimensional process Z is the unique strong solution to an SDE

$$dZ_t = \mu(Z_t)\,dt + \sum_{j=1}^{d}\sigma^j(Z_t)\,dW^j_t$$

for vectors $\mu(z), \sigma^1(z), \dots, \sigma^d(z) \in \mathbb{R}^m$.

The function G is chosen such that it gives a reasonable shape of the implied
volatility for all possible parameter values z. This is why we have written
$G(z; x, c)$ as a function of the natural coordinates, time-to-maturity $x := T - t$ and
relative strike $c := k/X_t$, instead of fixed maturities and cash strikes. Ideally, the
parameters of the process Z would have a direct interpretation such as level, skew,
kurtosis, and term structure of the implied volatility surface. However, it should be
clear that the specification of such a function and the dynamics of Z are constrained
by no-arbitrage conditions: In particular, the price process of each European option
should be a local martingale.6
6 The existence of a local martingale measure is equivalent to "no free lunch with vanishing
risk"; see Delbaen/Schachermayer [2].

7 To see this, note that $\zeta$ is the derivative of the instantaneously maturing variance swap.
Moreover, the instantaneous squared implied volatility is equal to the instantaneous variance.

The price of a call with cash strike k and maturity T at time t is given by

$$C_t(T, k) = X_t\,\mathrm{BS}\left(T - t,\ \frac{k}{X_t},\ \Sigma_t(T, k)\right) =: \mathbb{C}(X_t, Z_t;\ T - t, k)$$

where $\mathrm{BS}(x, c, \sigma)$ stands for the Black-Scholes price of a call on a unit spot with
time-to-maturity x, relative strike c, and volatility $\sigma$. If the implied volatility surface is well defined, then it follows from the continuity
of the stock price process that X is given in the form $X_t = \mathcal{E}_t\left(\int_0^{\cdot}\sqrt{\zeta_s}\,dB_s\right)$ for some
Brownian motion B and with a short variance process $\zeta$, which is the square of the
instantaneously maturing implied volatility, $\zeta_t = \Sigma_t(0, X_t)^2$.7 In other words, the call
price is a function of X and Z, and as such we can apply Ito. As a result, we obtain
a regularity condition on the interplay between $\Sigma$, $\mu$, and $\sigma$.

1 2
0 x Xt2 t
2
m d n
1 j j 2
Zit i (Zt ) i (Zt ) k (Zt ) Zi ,Zk (Zt )
2 t t
i 1 j 1 i,k 1

d m
j j1 2
i (Zt ) t i (Zt )
2 Zt ,Xt
j 1 i 1

This expression can be expanded using the standard derivatives for the Black &
Scholes formula, which results in a complex PDE for $\mu$ and $\sigma$ (see Brace et al. [36]
for details). While this approach is very appealing, it has the unfortunate drawback
that no "stochastic implied volatility" model has yet been published that
is not from the start a stochastic volatility model. The main problem of the entire
approach is that it is very difficult to find a function G that actually ensures that
the European option prices at any time t are strongly arbitrage free in the sense of
definition 1.3.1 on page 16; if a model produces arbitrage situations in itself, then
the "price" of a derivative computed with this model is meaningless. Indeed, it seems
that the only functional forms for G so far known are those that stem from starting
with a price process X in the first place: this is one of the motivations for using the
SABR model discussed above, for which we have approximate formulas for the
implied volatilities. However, even if we use the implied volatility surface function
given by, say, a Heston model (2.1) and simply see it as a function

$$G : z := (\zeta_0, \theta, \kappa, \rho, \nu) \mapsto G(z;\ \cdot,\ \cdot)$$

which maps the parameters of the model to an implied volatility surface, the
restrictions imposed by the no-arbitrage equation derived above are severe (also see
the comments in example 5 on page 75).

Remark 2.1.1 Instead of modeling implied volatility, we could also consider alter-
natives such as the call prices on the stock, its implied distribution, or the implied
local volatility. The latter has been discussed by Derman/Kani in the related context
of their implied trees [40].

2.1.7 Forward-Skew Propagation


To price Cliquets, we have to revert to less ambitious approaches. Note that it is, of
course, possible to price a forward-started option using the Black-Scholes formula.
For a given flat volatility $\sigma$, the price of such a call (2.14) on X is given as

$$C^{BS}(t_1, t_2, k) := \mathrm{BS}(t_2 - t_1,\ k,\ \sigma)$$

Just as before, this allows us to define what is called the forward implied volatility
of a given market price $\hat{C}(t_1, t_2, k)$ for the call as

$$\hat\sigma(t_1, t_2, k) := \mathrm{BS}(t_2 - t_1,\ k,\ \cdot)^{-1}\left(\hat{C}(t_1, t_2, k)\right)$$

This quantity is often used as a way to quote the price of a forward-started option.
For example, we call $k \mapsto \hat\sigma(t_1, t_1 + \tau, k)$ the forward skew at $t_1$ for the period
$\tau := t_2 - t_1$. Given a particular model, this forward skew can be used to compare
the prices of forward-started options with the same reset period but with different
starting dates: see, for example, the fourth graph in figure 2.12 on page 82, which
shows how the forward skew for $\tau$ equal to three months changes with the start
date in a Heston model. We can clearly see that the skew becomes more and more
U-shaped.
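Quoting a forward-started option by its forward implied volatility amounts to inverting the Black-Scholes formula in the volatility argument. The following minimal sketch assumes zero rates and dividends, so that the forward-started call on the ratio $S_{t_2}/S_{t_1}$ is a Black-Scholes call on a unit spot with maturity $t_2 - t_1$; the inversion uses simple bisection. Function names are illustrative.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_forward_start_call(tau, k, sigma):
    """Black-Scholes value of (S_t2 / S_t1 - k)^+ for tau = t2 - t1."""
    s = sigma * math.sqrt(tau)
    d1 = (-math.log(k) + 0.5 * s * s) / s
    return norm_cdf(d1) - k * norm_cdf(d1 - s)

def forward_implied_vol(price, tau, k, lo=1e-6, hi=5.0, tol=1e-12):
    """Invert the price in sigma by bisection (price is increasing in sigma)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_forward_start_call(tau, k, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

Evaluating `forward_implied_vol` over a range of strikes k for fixed start date and period gives the forward skew.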
Sometimes it is required that a model "propagates the skew," that is, that the
forward skew matches the current skew for the same time-to-maturity as closely
as possible. One way to achieve this works as follows: As before, denote by $\tau$ the
period between two reset dates, and assume that we can extract the distribution
of $S_{t_1}$ from the market using the second derivative in strike of standard spot-started
European options. The idea is now to assume that

$$Y_i := \frac{S_{t_i}}{S_{t_{i-1}}}$$

is independent of $Y_j$, $j = i-1, \dots, 1$, and that it has exactly the same distribution as
$Y_1$. This implies that the discrete stock price is given as a product of independent
variables,

$$S_{t_i} = \prod_{j=1}^{i} Y_j$$

Such a model is called an independent increment model and by construction it
will perfectly "preserve the skew."8 Apart from the unrealistic assumption that
the increment of a stock price does not depend on its past behavior in any way,
this model also has the drawback that the prices of spot-started European options
with maturities $t_2, \dots, t_n$ are completely determined by the initial distribution of
$Y_1$. Consequently, the ATM spot-started options will usually not fit the market
prices. To alleviate this obvious drawback, it has been proposed to maintain the
ATM implied volatility for the forward-started options in Black and Scholes and
to apply a certain skew to them. These forward starts are then used to back out
the assumed distribution of $Y_i$, which is possible because of the assumption of
independent increments: If all forward-started call prices are known, the forward
distribution is as usual given by the second derivative of these prices in strike. Hence,
a simple model of this type can be realized by jumping independently between the
reset dates $t_i$ according to the forward distributions implied by the forward-started
call prices.9
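The two ingredients of this construction, backing out a distribution from the second strike derivative of call prices and then multiplying independent draws, can be sketched as follows. As a stand-in for market prices the example uses Black-Scholes calls on a unit spot with zero rates; the grid, the finite-difference scheme, and the inverse-CDF sampling on that grid are illustrative choices, not the authors' method.

```python
import bisect
import itertools
import math
import operator
import random

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(tau, k, sigma):
    """Stand-in for a market call price on a unit spot (zero rates)."""
    s = sigma * math.sqrt(tau)
    d1 = (-math.log(k) + 0.5 * s * s) / s
    return norm_cdf(d1) - k * norm_cdf(d1 - s)

def implied_density(call, strikes):
    """Second finite difference of call prices in strike on a uniform grid."""
    h = strikes[1] - strikes[0]
    return [(call(k + h) - 2.0 * call(k) + call(k - h)) / (h * h) for k in strikes]

def simulate_path(strikes, density, n_periods, rng):
    """Draw i.i.d. increments Y_i from the gridded density and multiply them."""
    h = strikes[1] - strikes[0]
    cdf = list(itertools.accumulate(d * h for d in density))
    increments = []
    for _ in range(n_periods):
        u = rng.random() * cdf[-1]
        idx = min(bisect.bisect_left(cdf, u), len(strikes) - 1)
        increments.append(strikes[idx])
    # S_{t_0} = 1, S_{t_i} = Y_1 * ... * Y_i
    return [1.0] + list(itertools.accumulate(increments, operator.mul))
```

A call such as `simulate_path(strikes, density, 8, random.Random(0))` then produces one path of fixings at eight reset dates, all of whose increments share the same implied distribution.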

8 There are many ways to obtain such a model. The easiest approach is to use a Lévy process
(CGMY or Merton’s model) and calibrate it only to options with maturity $\tau$. The resulting
fit for the spot-started options is usually good enough to obtain an idea of the approximate
price level of a Cliquet. An alternative approach is to use directly the distribution inferred by
the European options with maturity $\tau$.

9 To obtain an idea of the impact of such a model, calibrate a Merton model with
time-dependent volatility parameters: first, the jump parameters are calibrated to the $\tau$-maturity
options. Then, a time-dependent volatility coefficient for the Black and Scholes diffusion part
of the model is calibrated to the strip of ATM options.

Blending the Skew Instead of using purely independent increments, it is often
desirable to introduce some interdependency between the increments while retaining the
possibility of controlling closely the shape of the forward distribution. In fact, what
is needed is a model where each $Y_i$ is distributed according to some distribution $D_\pi$,
which is parameterized by a parameter vector $\pi$. If these parameters are the same
for all $i = 1, \dots, n$, then the model is an independent increment model.
We want to discuss such a model now: It allows us to blend between a pure
independent increment model and a real stochastic volatility model. The idea is to
use the distribution in Heston’s model for the forward distribution. Using previous
results, we can combine the various forward distributions such that it is possible to
blend between a pure Heston model and an independent increment model. Let us
therefore define for the first interval $t \in [0, t_1]$ the initial process (superscripts index
the interval and are not powers)

$$
\begin{aligned}
d\zeta^1_t &= \kappa^1(\theta^1 - \zeta^1_t)\,dt + \nu^1\sqrt{\zeta^1_t}\,dW_t \\
dX_t &= X_t\sqrt{\zeta^1_t}\,d\left(\rho^1 W_t + \sqrt{1 - (\rho^1)^2}\,\bar{W}_t\right)
\end{aligned}
$$

The distribution of $X_{t_1}$ is then controlled by the parameters $\pi^1 := (\zeta^1_0, \theta^1, \kappa^1, \rho^1, \nu^1)$.
To model the next increment, we again want to use Heston’s model. Hence, set for
$t \in (t_1, t_2]$

$$
\begin{aligned}
d\zeta^2_t &= \kappa^2(\theta^2 - \zeta^2_t)\,dt + \nu^2\sqrt{\zeta^2_t}\,dW_t \\
dX_t &= X_t\sqrt{\zeta^2_t}\,d\left(\rho^2 W_t + \sqrt{1 - (\rho^2)^2}\,\bar{W}_t\right)
\end{aligned}
$$

The key is that we can introduce a dependency on the values of the previous process
by letting

$$\zeta^2_{t_1} := \lambda^2\zeta^2_0 + (1 - \lambda^2)\,\zeta^1_{t_1},$$

where we usually set $\zeta^2_0 := \theta^1 + (\zeta^1_0 - \theta^1)e^{-\kappa^1 t_1}$ to avoid jumps in the
forward variance curve of the model. The blending parameter $\lambda^2$ allows us to
blend from the independent increment case ($\lambda^2 = 1$) to the pure (piecewise
time-dependent) Heston case ($\lambda^2 = 0$). The parameters for the second maturity are
$\pi^2 := (\lambda^2, \theta^2, \kappa^2, \rho^2, \nu^2)$. This process can then be iterated to yield a sequence
of semidependent short volatilities for each interval. Additionally, the sequence
$\theta^1, \dots, \theta^n$ can be used to fit the model to the ATM spot options.
While the other parameters could be chosen freely, it is in the spirit of the
approach (propagating the skew) to keep $\kappa$, $\rho$, and $\nu$ constant, because this implies
that the forward distribution of $X_{t_i}$ for $i = 2, \dots, n$ has the general properties of
the initial distribution for $X_{t_1}$. The parameter $\lambda$ can be varied to assess the impact
of co-correlation between the increments. Indeed, if $\lambda = 1$ and if $\theta$ and the start
values for each interval, $(\zeta^i_0)_{i=1,\dots,n}$, are kept constant, then the model simply is an
independent increment model with identically distributed increments.
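The blending step itself is a one-line convex combination. The sketch below spells it out, assuming the convention that a blending weight of 1 restarts the variance at a deterministic level (independent increments) while a weight of 0 carries the realized short variance over (pure Heston); the deterministic restart level is taken to be the expected variance at the reset date. Names and numbers are illustrative.

```python
import math

def forward_variance(zeta0, theta, kappa, t):
    """Expected short variance at time t in a Heston model started at zeta0."""
    return theta + (zeta0 - theta) * math.exp(-kappa * t)

def blended_start_variance(lam, restart_level, realized_variance):
    """Start value of the short variance for the next interval.

    lam = 1: deterministic restart (independent increments);
    lam = 0: carry over the realized short variance (pure Heston).
    """
    return lam * restart_level + (1.0 - lam) * realized_variance
```

For example, with `restart = forward_variance(0.0225, 0.04, 2.0, 0.25)` the call `blended_start_variance(0.3, restart, 0.05)` mixes 30% of the deterministic level with 70% of a realized variance of 0.05.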
[Figure 2.11: ".STOXX50E explicit 3m calibration 01/06/2005"; the chart shows the calibration error (ModelPrice-MarketPrice)/Spot, from -0.06% to 0.06%, for the instruments 3m95%, 3m97.5%, 3m100%, 3m102.5%, 3m105%, 1m100%, and 2m100%.]

FIGURE 2.11 The fit of the Heston model to the 3m skew. The calibrated parameters are
$\zeta_0 = 11.25\%^2$, $\theta = 17.39\%^2$, $\kappa = 2.75$, $\rho = -65\%$, and $\nu = 51.69\%$ (note that
condition (2.3) is violated).

Of course, the general idea of randomizing the parameters of the distribution
can be applied to any stock price model, but the "blended Heston skew" model
described here has the advantage that the characteristic function of the logarithm of
the stock price can be computed easily: in each interval, a formula of the type (2.4)
holds. For $i = 1, \dots, n$ we can find constants $A_i$ and $B_i$ such that

i t
ti h t i si ds ti 1 A
i i Bi i
e i 1
ti 1 e

Iteration yields a closed form for the characteristic function. To match the very
short-term options better it is possible to add a jump diffusion component along the
lines of Bates [35].10

Example As an example, assume we want to price a Cliquet structure with
three-month reset periods. We have calibrated a Heston model to the following options:
3m calls on 100%, 102.5%, and 105%; 3m puts on 95% and 97.5%; and 1m
and 2m calls on 100%. Since the reset period of the Cliquet we want to price is
three months, we have given the 3m options twice as much weight as the other two
options.
The resulting Heston model fits the calibration instruments very well, as
shown in figure 2.11.
As a next step, we have set up the above model with i : , i , i and
i i 1
0 : ti for i 1, , n. As a result, the model is just the calibrated Heston

10
An additional stochastic interdependency can be modeled by setting i : yti for an
y
independent square root diffusion y with SDE dyt c(mt yt ) dt yt dWt , which has
a piecewise constant mean-reversion level m in order to match the ATM-Europeans or the
variance swap term structure.
model as long as $\lambda^i = 0$, while it is an independent increment model if we set $\lambda^i = 1$;
note that the increments are not exactly identically distributed because the short vol
parameters $\zeta^i_0$ vary. The interesting point is now the impact on the forward skew of
changing $\lambda$ between these extreme values: Figure 2.12 shows how $\lambda$ blends between
a skew-preserving model and a true homogeneous Markovian model.
Finally, we can assess the impact of the blending of the skew when pricing a
Cliquet structure. As an example, we show in figure 2.13 what happens when we
price the globally floored Cliquet (2.13).

Remark 2.1.2 The last graph of figure 2.12 shows the usual effect that in stochastic
volatility models the forward skew for start dates that are farther away tends to
become more "U-shaped." The reason for this behavior can be explained as follows:
For a time-homogeneous stochastic volatility model such as Heston, the price of a
forward-started call on X with reset date $t_1$, maturity $t_2$, and strike k is given as

$$
\mathbb{E}\left[\left(\frac{X_{t_2}}{X_{t_1}} - k\right)^+\right]
= \mathbb{E}\left[\mathbb{E}\left[\left(\frac{X_{t_2}}{X_{t_1}} - k\right)^+\,\middle|\,\mathcal{F}_{t_1}\right]\right]
= \mathbb{E}\left[c(\zeta_{t_1};\ t_2 - t_1,\ k)\right]
$$

with

$$c(\zeta;\ \tau,\ k) := \mathbb{E}\left[\left(\frac{X_\tau}{X_0} - k\right)^+\,\middle|\,\zeta_0 = \zeta\right]$$

At time $t_1$, the implied volatility for the relative strike k and time-to-maturity
$\tau := t_2 - t_1$ is according to (1.22) given as

$$\hat\sigma_{t_1}(\tau, k) := \mathrm{BS}(\tau, k, \cdot)^{-1}\left(c(\zeta_{t_1};\ \tau,\ k)\right),$$

that is, it is a function of the random short variance $\zeta_{t_1}$. Due to the homogeneity of
the model, the skew $k \mapsto \hat\sigma_{t_1}(\tau, k)$ will be very similar in shape to $k \mapsto \hat\sigma_0(\tau, k)$ for all
reasonable values of $\zeta_{t_1}$. In particular, the "expected future skew" $k \mapsto \mathbb{E}[\hat\sigma_{t_1}(\tau, k)]$
is nearly the same as $k \mapsto \hat\sigma_0(\tau, k)$ (see figure 2.14). The quantity "forward skew,"
on the other hand, is given as

$$\hat\sigma(t_1, t_2, k) := \mathrm{BS}(\tau, k, \cdot)^{-1}\left(\mathbb{E}\left[c(\zeta_{t_1};\ \tau,\ k)\right]\right)$$

Since $\mathrm{BS}(\tau, k, \cdot)^{-1}$ is concave for out-of-the-money options, it follows from Jensen
that we obtain the observed U-shape. It seems theoretically more natural to preserve
the expected future skew instead of the forward skew. The former is a genuine
property of all homogeneous stochastic volatility models.
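The Jensen effect can be reproduced in a toy setting. The sketch below deliberately simplifies the model: instead of a Heston model, it assumes that at the reset date the option is priced by Black-Scholes with a flat volatility that is random as seen from today (two equally likely states). The forward implied volatility inverts the average price, whereas the expected future implied volatility averages the volatilities. All names and numbers are illustrative.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(tau, k, sigma):
    """Black-Scholes call on a unit spot with zero rates."""
    s = sigma * math.sqrt(tau)
    d1 = (-math.log(k) + 0.5 * s * s) / s
    return norm_cdf(d1) - k * norm_cdf(d1 - s)

def implied_vol(price, tau, k, lo=1e-6, hi=5.0):
    """Bisection inverse of bs_call in the volatility argument."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(tau, k, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def forward_vs_expected(k, tau, vols, probs):
    """Return (forward implied vol, expected future implied vol)."""
    mean_price = sum(p * bs_call(tau, k, v) for p, v in zip(probs, vols))
    forward_vol = implied_vol(mean_price, tau, k)
    expected_vol = sum(p * v for p, v in zip(probs, vols))
    return forward_vol, expected_vol
```

With volatilities of 10% and 30%, the forward implied volatility is close to 20% at the money but clearly above 20% for strikes of 75% or 130%, which is the U-shape described in the remark.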

2.2 VARIANCE SWAPS, ENTROPY SWAPS, GAMMA SWAPS


We have seen that under the assumption that sufficiently many European options
on the underlying S, or X, are traded, we can price European payoffs uniquely
using (1.29) or its discrete version (1.30). A particularly popular application of (1.29)
is the pricing of variance swaps, suggested first by Neuberger. We also present two
relatively new products, entropy swaps and gamma swaps.
[Figure 2.12: four panels titled "Forward Skew with 0% Blending (Independent Increments)", "Forward Skew with 30% Blending", "Forward Skew with 70% Blending", and "Forward Skew with 100% Blending (Heston)"; each panel plots Forward Implied Volatility (5 to 20) against Strike (94% to 106%) for the series 3m, 6m, 9m, 1y, 1y3m, and 1y6m.]

FIGURE 2.12 The impact of changing the blending parameter on the forward skew. We
can clearly see the usual increasingly upward-sloping forward skew in the classic Heston
model.
[Figure 2.13: "Impact of blending between Heston and independent increments"; bar chart (from -0.5% to 2.5%) for the Globally Floored Cliquet and the Fwd Call Spreads 3m, 6m, 9m, 1y, 1y3m, 1y6m, 1y9m, and 2y, with one series each for Blending 0%, 30%, 70%, and 100%.]

FIGURE 2.13 The price of the globally floored Cliquet (2.13) with maturity in two years
along with the values of the prices of the involved forward-started call spreads. The price
differences stem mostly from the difference in the prices of the forward-started options,
rather than the global floor.

[Figure 2.14: "Forward Skew vs. Expected Future Skew"; Volatility (5 to 20) against strike (94% to 106%) for the series Propagated Skew, Expected Future Skew, and Forward Skew.]

FIGURE 2.14 Forward skew and expected future skew in the Heston model.

2.2.1 Variance Swaps


A variance swap with maturity T is a contract that pays the realized variance of
the return of the stock over the period [0, T] in exchange for a previously agreed
strike, $K^2$ (the strike is usually quoted in "volatility," K). In the absence of any
proportional or fixed dividends, and no risk of default, the realized variance is
commonly defined as

$$RV^n(T) := \frac{252}{n-1}\sum_{k=1}^{n}\left(\log\frac{S_{t_k}}{S_{t_{k-1}}}\right)^2 \qquad (2.15)$$

where $0 = t_0 < \dots < t_n := T$ are the business days in the period [0, T]. The scaling
factor

$$\frac{1}{[T]} := \frac{252}{n-1}$$

"annualizes" the returned variance: the number 252 is the standardized number of
business days per year; we can think of [T] as being approximately T. If the stock
price pays dividends and is subject to default risk, then we use here

2
n
n 1 Stk tk edtk
(T) : log 1 tk , (2.16)
[T] Stk 1
k 1

where tk denotes the discrete cash dividend paid at tk and where 1 e dtk is the
proportional dividend for this date, cf. (1.10). The idea of this convention is that
we do not want to count movements of the stock price that are due to (previously
known) dividend payments. Indeed, if no further dividends are paid in (tk 1 , tk ), we
obtain (cf. equation (1.6) on page 7):

Stk tk edtk Stk Ftk 1 Xtk Atk 1 Rk


1 tk 1 tk 1 k
Stk 1
Stk 1 Ftk 1 Xtk 1
Atk 1
Rk 1

In practice, default risk is not excluded as in (2.16), but by imposing a cap on
the overall realized variance (discussed on page 64 ff). Moreover, dividends are in
practice taken out only for single stocks; for indices, (2.15) is used. See remark 2.2.1
for the impact of dividends. Let us first consider the case where dividends are taken
out (i.e., (2.16)).
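For the dividend-free convention (2.15), realized variance is a scaled sum of squared daily log-returns. The following minimal sketch computes it from a list of closing prices; the default annualization factor of 252 divided by the number of returns minus one is an assumption read off the formula above, and contracts may specify a different divisor, so it is exposed as a parameter.

```python
import math

def realized_variance(prices, annualization=None):
    """Annualized sum of squared log-returns of a price series.

    prices: closing prices S_{t_0}, ..., S_{t_n} on consecutive business days.
    annualization: scaling factor 1/[T]; defaults to 252 / (n - 1) (assumed).
    """
    n = len(prices) - 1  # number of returns
    if annualization is None:
        annualization = 252.0 / (n - 1)
    squared = sum(math.log(b / a) ** 2 for a, b in zip(prices, prices[1:]))
    return annualization * squared

def realized_volatility(prices, annualization=None):
    """Square root of realized variance, the unit in which strikes are quoted."""
    return math.sqrt(realized_variance(prices, annualization))
```

For a series with a constant daily log-return of 1% over five days, the sum of squared returns is 0.0005 and the default scaling gives a realized variance of 0.0315.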
Given dividend dates 0 0 m T, we have

m
n 1
(T) log S j log S j 1 (2.17)
[T]
j 1

We will assume that the right-hand side is in fact the definition of realized variance
(cf. remark 2.2.1 below). A variance swap pays the actual realized variance up to its
maturity T in exchange for a previously agreed strike $K^2$. Its payoff is therefore

$$RV^n(T) - K^2$$

We will denote by $\mathcal{V}_t(T, K)$ the value at time t of a variance swap with strike K
and maturity T. Since both $\frac{1}{[T]}$ and K are constants, it is sufficient to compute the
expectation of (2.17) for the purpose of evaluating a variance swap, which is given by

$$V_t(T) := [T]\left(\frac{\mathcal{V}_t(T, K)}{P(t, T)} + K^2\right)$$

If $\mathbb{P}$ is a pricing measure, and if there are no cash dividends, this means that

$$V_t(T) = \mathbb{E}_{\mathbb{P}}\left[\langle\log X\rangle_T\,\middle|\,\mathcal{F}_t\right]$$

The fair strike $K^*(T)$ for this maturity, which renders the initial value of the trade
zero, is therefore

$$K^*(T) := \sqrt{\frac{1}{[T]}\,V_0(T)}$$
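As a worked instance of the fair strike, consider a Heston model, where the expected short variance is known in closed form and can be integrated analytically. The sketch below uses the standard result $\mathbb{E}[\zeta_u] = \theta + (\zeta_0 - \theta)e^{-\kappa u}$ and approximates $[T]$ by $T$; the function names and the numbers in the usage note are illustrative.

```python
import math

def heston_expected_total_variance(T, zeta0, theta, kappa):
    """V_0(T) = integral over [0, T] of E[zeta_u] in a Heston model."""
    return theta * T + (zeta0 - theta) * (1.0 - math.exp(-kappa * T)) / kappa

def fair_variance_swap_strike(T, zeta0, theta, kappa):
    """Fair strike quoted in volatility, K = sqrt(V_0(T) / T), taking [T] = T."""
    return math.sqrt(heston_expected_total_variance(T, zeta0, theta, kappa) / T)
```

With a short volatility of 15%, a long volatility of 20%, and a reversion speed of 2, the one-year fair strike is about 18.01%, between the short and the long volatility.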

Remark 2.2.1 Note that approximation (2.17) works well if we want to price variance
swaps. However, the pathwise approximation of realized variance by quadratic
variation is not perfect, as is illustrated in figure 2.15. This is particularly important
if we price nonaffine payoffs of realized variance; see Barndorff-Nielsen et al. [41] for
a discussion of the properties of the error.

[Figure 2.15: "Realized Variance vs Quadratic Variation"; Annualized Volatility (0% to 30%) against Days (0 to 200) for the series Realized Variance and Quadratic Variation.]

FIGURE 2.15 The quality of the approximation of realized variance by quadratic variation.
The graph shows an example path of each of the two quantities for Heston’s model with the
calibrated parameters from figure 2.6.

Pricing and Hedging Following Demeterfi et al. [42], we henceforth assume that
the pure stock price X is continuous, and that $\langle X\rangle_t$ is absolutely continuous with
respect to the Lebesgue measure. We have mentioned already in section 1.1.2 that
this implies that there exists a stochastic short variance process $\zeta = (\zeta_t)_{t\ge 0}$ and a
Brownian motion B such that

$$X_t = \mathcal{E}_t(Z), \qquad Z_t := \int_0^t\sqrt{\zeta_u}\,dB_u,$$

where $\mathcal{E}_t(Y) := e^{Y_t - \frac{1}{2}\langle Y\rangle_t}$ denotes again the Doléans-Dade exponential. Accordingly,
the quadratic variation of the returns of X is given as

$$\langle\log X\rangle_T = \int_0^T\zeta_u\,du$$

On [ j 1, j) with j, we have t (F j 1
Xt B j 1
)Rt Rtj 1 . Hence,11

2
F j 1
Rtj 1
d log S t d XR t
F j 1
Xt Rt B j 1
Rt
F j 1
Rtj 1
2 d(XR)t 2d log(F Xt Rt B j 1
Rt )
F j 1
Xt Rt B j 1
Rt j 1

Hence,

j 2 B j 1 j
log S j
log S j 1
d St Rt 2 log 1 d j
j 1
St Rtj 1
S j
(2.18)

Let us focus for a moment on the case when there are no discrete cash dividends.
We obtain

$$RV^n(T) \approx \frac{1}{[T]}\,\langle\log X\rangle_T$$

and, using $X_t = S_t/F_t$,

$$\langle\log X\rangle_T = -2\log\frac{S_T}{F_T} + 2\int_0^T\frac{dX_t}{X_t} = -2\log\frac{S_T}{S_0} + 2\int_0^T\frac{dS_t}{S_t} \qquad (2.19)$$

This means that we can replicate realized variance by holding a static position in
a log-contract with payoff $-2\log S_T$ and by dynamic delta-hedging with a delta
of $\Delta_t := 2/S_t$ (for clarity of exposition we ignore discounting here). One particular
point is that the cash-delta $\Delta^\$_t := S_t\Delta_t = 2$ in (2.19) is constant: we hold at all times the
value 2 in the stock. Similarly, the gamma $\Gamma_t := 1/S_t^2$ implies that our cash gamma
of $\Gamma^\$_t := S_t^2\Gamma_t$ is constant, too. (In the light of the discussion below this makes a
variance swap particularly suited to "trade volatility.") For (2.18), the expression
is slightly more complicated, but it is still of the same basic structure. (Note that
additional terms are European-type payoffs on S, whose value can be computed
using formula (1.29).)

11 $\langle X + Y\rangle_t = \langle X\rangle_t + \langle Y\rangle_t + 2\langle X, Y\rangle_t$

[Figure 2.16: "DAX Realized Variance and Its Hedge (31 days)"; Realized Variance (0% to 60%, left axis) and Hedging Error (-1.00% to 1.00%, right axis) from 01/92 to 01/05 for the series Realized variance, Hedge, and Hedging Error.]

FIGURE 2.16 The quality of hedging variance swaps with (2.19). The graph shows daily the
realized variance over 31 business days, the return from the hedging strategy (2.19), and the
hedging error.
To assess the quality of the hedging strategy implied by this equation, we have
used historic DAX returns and priced a variance swap against two log-contracts plus
their daily delta-hedge. Figure 2.16 shows the impressive performance of this hedge.
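The replication argument can be checked on any discretely observed path: the sum of squared log-returns should be close to the payoff of minus two log-contracts plus the gains from holding a constant cash amount of 2 in the stock. The sketch below assumes zero rates and no dividends and uses a deterministic toy path instead of historic index data; it is an illustration of the identity, not a reproduction of the DAX study.

```python
import math

def sum_squared_log_returns(path):
    """Unscaled realized variance of a price path."""
    return sum(math.log(b / a) ** 2 for a, b in zip(path, path[1:]))

def log_contract_plus_hedge(path):
    """Static -2 log(S_T / S_0) plus daily rebalanced delta of 2 / S_t."""
    static = -2.0 * math.log(path[-1] / path[0])
    dynamic = 2.0 * sum((b - a) / a for a, b in zip(path, path[1:]))
    return static + dynamic

def toy_path(n_days=252, daily_scale=0.01, s0=100.0):
    """Deterministic path with daily log-returns daily_scale * cos(i)."""
    path = [s0]
    for i in range(n_days):
        path.append(path[-1] * math.exp(daily_scale * math.cos(i)))
    return path
```

The difference between the two quantities is of third order in the daily returns, which is why it stays small for daily rebalancing.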
To calculate the cost of exercising this strategy, note that under any equivalent
martingale measure $\mathbb{P}$, the expectation of the right-hand side of (2.19) is given as

$$V_0(T) := -2\,\mathbb{E}\left[\log X_T\right] \qquad (2.20)$$

To compute $\mathbb{E}[\log X_T]$, note that this value is equal to $-\mathbb{E}[H(X_T)]$ with
$H(x) := x - 1 - \log x$, the function shown in figure 1.6 on page 23.
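The expectation of $H(X_T)$ can be computed from a strip of out-of-the-money options weighted by $1/K^2$, since $H''(K) = 1/K^2$ and $H(1) = H'(1) = 0$. The sketch below checks this numerically in a Black-Scholes world with unit forward and zero rates, where the expected total variance is known to be $\sigma^2 T$; the strike range, grid size, and trapezoidal rule are illustrative choices.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def otm_option(k, sigma, T):
    """Black-Scholes price of the out-of-the-money option on a unit forward."""
    s = sigma * math.sqrt(T)
    d1 = (-math.log(k) + 0.5 * s * s) / s
    call = norm_cdf(d1) - k * norm_cdf(d1 - s)
    # for strikes below the forward, use the put obtained from put-call parity
    return call if k >= 1.0 else call - (1.0 - k)

def variance_from_strip(sigma, T, k_min=0.05, k_max=5.0, n=20000):
    """2 * E[H(X_T)] as a trapezoidal integral of OTM prices over K^2."""
    h = (k_max - k_min) / n
    total = 0.0
    for i in range(n + 1):
        k = k_min + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * otm_option(k, sigma, T) / (k * k)
    return 2.0 * h * total
```

Truncating the strike range too tightly biases the result downwards, which mirrors the corridor effect of a finite strip of traded strikes.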
This function can also be used to center the strip of options around some "reference
strike" $\hat{K}$. To this end, note that $x \mapsto H(x/\hat{K})$ has a minimum of zero in $\hat{K}$. We have

$$H\left(\frac{S_T}{\hat{K}}\right) - H\left(\frac{S_0}{\hat{K}}\right) = \frac{S_T - S_0}{\hat{K}} - \left(\log S_T - \log S_0\right),$$

that is,

$$RV^n(T) \approx \frac{2}{[T]}\left(H\left(\frac{S_T}{\hat{K}}\right) - H\left(\frac{S_0}{\hat{K}}\right) + \frac{S_0 - S_T}{\hat{K}}\right) + \frac{2}{[T]}\int_0^T\frac{dS_t}{S_t} \qquad (2.21)$$

Following this strategy, that is, taking a static position in $H(S_T/\hat{K})$ instead of
$\log S_T$, requires an additional position in a future. See Demeterfi et al. [42] for an
extensive discussion of this subject. Also, Carr/Madan discuss various extensions of
the idea of pricing volatility-sensitive options via hedging arguments similar to (2.21)
in [43]. For example, it can be shown that pricing H via (1.31) means that in actual
fact, a corridor variance swap is priced, that is, the returns in the sum (2.16) will
be counted only if the stock price is between the lowest and highest strike of (2.21).
Therefore, a sufficiently wide strike range should be used. Corridor variance swaps
and their hedging are discussed in Carr/Lewis [44].

Remark 2.2.2 In some contracts, in particular for indices, realized variance is
defined using equation (2.15), even if dividends are present. In that case, we have to
evaluate

$$\langle\log S\rangle_T = \int_0^T\zeta_t\,dt + \sum_{t\le T}D_t^2$$

The additional terms $D_t^2$ (of which there are only finitely many) can be hedged and
priced with European options using formula (1.29).

Trading Volatility Apart from the fact that variance swaps can be hedged and priced
using European options and a clearly defined delta-hedging strategy, what are the
reasons to trade this product?
One motivation to trade in volatility is that apart from the stock, the price
of an equity derivative is massively dependent on the volatility of the stock price.
Practitioners therefore seek to protect themselves against moves in volatility. A very
common method works as follows (assume that $X = S$): To price an option with
payoff $H(X_T)$, we use the Black-Scholes model with a constant volatility $\sigma$,

$$\frac{dX_t}{X_t} = \sigma\,dB_t, \qquad (2.22)$$

where we estimate a reasonable $\sigma$ from European options traded at maturity T.
For example, we might decide that the payoff H is sufficiently close to a call with
maturity T and strike k with a market price of $\hat C_0(T,k)$ at time zero. Its implied
volatility (cf. definition 1.2.1) is denoted by $\hat\sigma_0(T,k)$, and we choose to use this
implied volatility for our Black-Scholes model (2.22) by setting $\sigma := \hat\sigma_0(T,k)$. Let us
use $\hat X = (\hat X_t)_{t \ge 0}$ to denote the real market price process.
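Then, our price for H is given as

$$h_0(\hat X_0, \sigma) := \mathbb{E}^{BS}\!\left[\, H(X_T) \mid X_0 = \hat X_0 \,\right]$$

At some later time t, the value of H given the observed spot $\hat X_t$ is then computed as

$$h_t(\hat X_t, \sigma) := \mathbb{E}^{BS}\!\left[\, H(X_T) \mid X_t = \hat X_t \,\right] \qquad (2.23)$$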

That works well if the real price process $\hat X$ is a Black-Scholes diffusion with volatility
$\sigma$. In reality, though, that is unlikely. Assume, for example, that in fact

$$\frac{d\hat X_t}{\hat X_t} = \sqrt{\zeta_t}\,d\hat B_t \qquad (2.24)$$

for some stochastic short variance $\zeta = (\zeta_t)_{t \ge 0}$. Then, our price (2.23) evolves as

$$dh_t(\hat X_t,\sigma) = \partial_t h_t(\hat X_t,\sigma)\,dt + \frac{1}{2}\,\partial^2_{XX} h_t(\hat X_t,\sigma)\,\hat X_t^2\,\zeta_t\,dt + \partial_X h_t(\hat X_t,\sigma)\,d\hat X_t$$

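Using the fact that h is a Black-Scholes price for H and that it therefore satisfies the
Black-Scholes PDE

$$0 = \partial_t h(t,x,\sigma) + \frac{1}{2}\,\partial^2_{XX} h(t,x,\sigma)\,x^2\sigma^2,$$

we have

$$H(\hat X_T) = h_0(\hat X_0,\sigma) + \frac{1}{2}\int_0^T \partial^2_{XX} h_t(\hat X_t,\sigma)\,\hat X_t^2\,(\zeta_t - \sigma^2)\,dt + \int_0^T \partial_X h_t(\hat X_t,\sigma)\,d\hat X_t$$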

(See also the results from El Karoui, Jeanblanc-Picqué and Shreve [45].) The cost of
our strategy to replicate $H(X_T)$ via its Black-Scholes hedge is therefore not covered
by the initial price $h_0(\hat X_0,\sigma)$. The term

$$\frac{1}{2}\int_0^T \partial^2_{XX} h_t(\hat X_t,\sigma)\,\hat X_t^2\,(\zeta_t - \sigma^2)\,dt \qquad (2.25)$$

shows that we will have an additional contribution from the mismatch in volatility
weighted by cash gamma $\Gamma^{\$} := \partial^2_{XX} h_t(\hat X_t,\sigma)\,\hat X_t^2$.12 For convex payoffs, cash gamma
will be positive, so we see that we lose money if the real variance stays above $\sigma^2$,
and we will gain if our initial guess was larger than the real variance. Equation (2.25)
also reveals that it is not sufficient for a perfect hedge that the realized variance,
$\int_0^T \zeta_u\,du$, equals $\sigma^2 T$.

Vega Hedging To protect ourselves against the profit and loss swings arising from
a wrong volatility assumption in (2.25), it is natural to readjust the Black-Scholes
volatility during the life of the product. After all, if we price the call $C(T,k)$ itself, we
will not match the market as soon as its implied volatility changes.
Assume therefore that at some later time t, the call trades at some $\hat C_t \equiv \hat C_t(T,k)$.
We can then infer its implied volatility $\hat\sigma_t \equiv \hat\sigma_t(T,k)$ by inverting the Black-Scholes
price for the call,13

BS k 2
t Xt , ˆ t : Xt T, , ˆ (T t)
Xt t

12 The second derivative of the price with respect to the stock is called gamma, and we call its
product with the square of the stock price cash gamma.
13 Note that in contrast to the discussion in section 1.2, the current stock price level is not
based on unity here, hence the additional scaling by $X_t$.
Hence, our price process for H is now given as

$$h_t(\hat X_t, \hat\sigma_t)$$

A common practice is to protect the position against the change in volatility by vega
hedging. The idea is to buy as many calls $\hat C_t$ as needed so that the overall sensitivity of the
position to changes in both $\hat X_t$ and $\hat\sigma_t$ is zero (recall that the derivative of a price
with respect to volatility is called vega; hence the name vega hedging). In our case,
this means first to define the Black-Scholes delta-neutral portfolio

$$\Pi_t^{\text{neutral}} := C_t^{BS} - \partial_X C_t^{BS}(\hat X_t,\hat\sigma_t)\,\hat X_t,$$

and then build a hedging position

$$\partial_X h_t(\hat X_t,\hat\sigma_t)\,\hat X_t + \partial_\sigma h_t(\hat X_t,\hat\sigma_t)\,\frac{1}{\partial_\sigma C_t^{BS}(\hat X_t,\hat\sigma_t)}\,\Pi_t^{\text{neutral}}$$

The first observation is that this strategy applied to the payoff $H(X_T) := (X_T - k)^+$
will yield a perfect hedge: we simply hold $\hat C$. This is an advantage over the pure
delta-hedging strategy discussed initially.
However, it is clear that we still do not cover the cost of this hedge with our
initial price, $h_0(\hat X_0,\hat\sigma_0)$. Heuristically, we expect that the hedge above works better,
but it is not clear that this is actually true in practice. Another problem with this
approach is that it requires us, at least in this pure form, to select a reference option
that can be used for vega hedging. In light of today’s strong volatility skews, the
choice of a strike is a tricky problem and requires a good knowledge of the product
that we want to risk manage.14
Here is where the variance swaps come in: Their price does not depend on
a strike. Moreover, their payoff is directly the realized variance; hence, variance
swaps are a more natural instrument to hedge against changes in volatility. Indeed,
variance swap trades are in practice quoted in units of vega.
The idea behind trading vega is as follows: In terms of the variance swap
volatility $\sigma := K^*(T)$, a variance swap with maturity T pays out the quantity $\sigma^2$.
This payoff has a vega of

$$\partial_\sigma\,\frac{1}{[T]}\,V_0(T) = 2\sigma$$

If we now assume that we have an overall vega exposure $\nu$ in our trading book, we
can neutralize this exposure by buying

$$N := \frac{\nu}{2\sigma} \qquad (2.26)$$

units of variance swaps (the quantity N is the "notional" of a trade of $\nu$). This
approach is consistent with the idea of hedging volatility exposure with variance
swaps. (For a thorough account on this approach, see section 2.3). However, it

14
Since we can always revert to a time-dependent volatility in the Black-Scholes model, the
maturity of the option is not such an issue.

requires that the vega of the portfolio is the sensitivity of the portfolio with respect
to changes in the fair strike of the variance swap. In particular, it requires us to
compute all option payoffs with a model that at least reprices the Europeans in (1.30)
and therefore the variance swap itself.
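As a rough illustration of the sizing rule (2.26), the following Python sketch divides a book's vega exposure by the variance swap vega of 2σ. The function names, the example exposure, and the unit convention (currency units per 1.00 change in volatility) are assumptions made for this sketch only.

```python
def variance_swap_vega(sigma):
    """Vega of the payoff sigma^2 with respect to sigma, i.e. 2 * sigma."""
    return 2.0 * sigma


def variance_swap_notional(book_vega, sigma):
    """Notional N = book_vega / (2 * sigma), following (2.26)."""
    return book_vega / variance_swap_vega(sigma)


# Hypothetical inputs: vega exposure per 1.00 change in volatility,
# and a variance swap volatility of 20%.
book_vega = 150_000.0
sigma = 0.20
notional = variance_swap_notional(book_vega, sigma)
print(f"variance swap notional: {notional:,.0f}")
```

The sketch only reproduces the arithmetic; as the text notes, it is meaningful only if the book's vega is measured with respect to the fair strike of the variance swap.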
More commonly, though, the vega of a book is an accumulated sum of Black-
Scholes vegas across strikes (and possibly maturities), as discussed above. In this
case, it seems sensible to assign the Black-Scholes vegas per strike weights according
to (1.30). Of course, such an approach does not generally produce a perfect hedge,
and it also disrespects changes in skew and kurtosis of the implied volatility surface.

Volatility as an Asset Class Apart from the potential use of variance swaps for vega
hedging, they also offer the investor a way to invest in volatility. This can be
attractive for many reasons. One of the most interesting properties is that volatility
tends to be anticorrelated to movements of the market. Volatility increases if the
market is falling and often decreases if the market rallies. (Note, though, that during
the dotcom boom both price levels and volatility rose; cf. figure 2.3.) Now, most
market participants would probably prefer to trade implied volatility in some way.
The drawback of using plain implied volatility as an underlying, however, is
that once a strike of the respective option, to which the implied volatility refers, is
fixed (for example at-the-money), this strike can entirely change its characteristics
depending on the movements of the stock price. For example, if we start off with
a strike at-the-money and the market starts to fall, we end up with an out-of-the
money strike above current spot level. Implied volatility in this region often appears
to be ‘‘cheap.’’ (For most indices, upside implied volatility is lower than at-the-
money implied volatility.) Moreover, the farther out the strike, the less liquid the
corresponding option becomes, with the effect of increasing transaction costs.
Here, variance swaps are a good and relatively inexpensive alternative (in terms
of transaction costs). They offer exposure to volatility in a way that does not depend
on the level of the market in the sense above. Indeed, cash gamma of a variance swap
is simply constant 2, if we use the static replication strategy (2.21). In fact, we could
also define the variance swap as the contract that has a constant cash gamma, that is, as
the contract that always has the same sensitivity to changes in realized variance,
regardless of the level of the stock. See Demeterfi et al. [42] for this approach. A
linear cash gamma can be realized using gamma swaps, which are discussed below.

Remark 2.2.3 The market’s interest in trading volatility has led to the introduction
of ‘‘variance indices,’’ notably VIX for SPX and VDAX for the GDAXI. These
indices can be seen as rolling the square-root of variance swaps with a fixed
maturity, a property that makes them very costly to replicate.
It is also noteworthy that trading in options on VIX futures started on CBOE
in February 2006.

As soon as trading in variance swaps began, it became clear that variance swaps
on single names are very sensitive to large price moves in the underlying asset, as
can be seen easily from equation (2.15). In particular, the payout will be infinite if
the asset defaults (recall that in practice, the case of default is not excluded by using
definition (2.16)). For this reason, investors who sold variance swaps have requested
to impose a cap on the potential payout of a variance swap. Typically, this cap is
around 250% of $K^2$; that is, the payoff of such a capped variance swap is, in the
absence of dividends,

$$\min\left\{\mathrm{RV}^n(T),\; 2.5\,K^2\right\} - K^2$$

This is equivalent to

n
(T) K2 n
(T) 2 5K2 1 T 2 5 K2 1 T

The latter payoff is also valid in the presence of dividends if (2.16) is used plus the
additional payoff of 250%K2 in the event of default.
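The cap can be read as a short call on variance. The Python sketch below checks this for the no-default case only: it compares the capped payoff with the uncapped swap minus a call struck at 2.5 K². The realized variance estimator (annualized sum of squared daily log-returns with 252 days per year) is an assumption standing in for definition (2.15), which is not reproduced in this section, and the price path is made up.

```python
import math


def realized_variance(prices, days_per_year=252):
    """Annualized sum of squared log-returns (assumed convention)."""
    rets = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    return days_per_year / len(rets) * sum(r * r for r in rets)


def capped_variance_swap_payoff(rv, strike_vol, cap_multiple=2.5):
    """min(RV, cap * K^2) - K^2."""
    k2 = strike_vol ** 2
    return min(rv, cap_multiple * k2) - k2


def swap_minus_call(rv, strike_vol, cap_multiple=2.5):
    """(RV - K^2) - (RV - cap * K^2)^+, which should give the same payoff."""
    k2 = strike_vol ** 2
    return (rv - k2) - max(rv - cap_multiple * k2, 0.0)


# Hypothetical path with a crash on the last day, so that the cap binds.
path = [100.0, 102.0, 99.0, 101.0, 60.0]
rv = realized_variance(path)
capped = capped_variance_swap_payoff(rv, strike_vol=0.20)
decomposed = swap_minus_call(rv, strike_vol=0.20)
print(rv, capped, decomposed)
```

With the cap binding, the payoff is 1.5 K², independent of how large the realized variance becomes.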
By requesting protection against extreme stock price movements, investors
who sold the capped variance swaps essentially bought out-of-the-money calls on
variance. The availability of such products then spurred the development of more
standard options: common options on variance that are available today are simple
calls

$$\left(\mathrm{RV}^n(T) - K^2\right)^+, \qquad (2.27)$$

and puts

$$\left(K^2 - \mathrm{RV}^n(T)\right)^+ \qquad (2.28)$$

but also volatility swaps with payoff

$$\sqrt{\mathrm{RV}^n(T)} - K$$

(Note that the value of a zero-strike volatility swap is always less than the value of a
zero-strike variance swap.) More recently, options on forward variance swaps have
emerged. For example, a call on forward variance between $T_1$ and $T_2$ has at time $T_1$
the payoff

$$\left(\frac{V_{T_1}(T_1,T_2)}{[T_2 - T_1]} - K^2\right)^+$$

where $V_t(T_1,T_2)$ is the price at time t of the variance between $T_1$ and $T_2$, that is,

$$V_t(T_1,T_2) := V_t(T_2) - V_t(T_1)$$

It should be noted that this contract has a different nature than a forward starting
call on variance swap, which pays at $T_2$ the quantity

$$\left(\frac{\mathrm{RV}^n(T_2)\,[T_2] - \mathrm{RV}^n(T_1)\,[T_1]}{[T_2 - T_1]} - k\,\frac{V_{T_1}(T_1,T_2)}{[T_2 - T_1]}\right)^+,$$

where k is now a relative strike.


Remark 2.2.4 (Quoting Conventions) European options on variance such as (2.27)
and (2.28) are usually quoted in terms of "vol points,"

$$\frac{\text{Price}}{2\,K^*(T)}$$

As before, $K^*(T)$ denotes the variance swap volatility.

2.2.2 Entropy Swaps


Since variance swaps offer exposure to the realized volatility of the returns of the
stock X, they are relatively insensitive to the level of the stock price.15 As an
alternative measure of variance, it is possible to define the payoff of what we will
call an entropy swap as
$$\int_0^T X_t\,d\langle \log X\rangle_t = \int_0^T X_t\,\zeta_t\,dt \qquad (2.29)$$

Intuitively, this "entropy variance" has the convenient property that if stock price
and short variance are negatively correlated, then rises in one quantity are offset
by falls of the other. Moreover, if the market drifts sideways (i.e., the level of X
does not change much), then the payoff behaves roughly like a variance swap: If
the instantaneous correlation between X and $\zeta$ is zero, then the value of weighted
variance and standard variance are equal. Price and hedging strategy of such a swap
can be computed using the same ideas as above. To this end, note that

$$\int_0^T X_t\,d\langle \log X\rangle_t = \int_0^T \frac{1}{X_t}\,d\langle X\rangle_t = -2\int_0^T \log X_t\,dX_t + 2\left(X_T\log X_T - X_T\right) - 2\left(X_0\log X_0 - X_0\right).$$

Hence, pricing an entropy swap boils down to approximating the convex and bounded
function $H(x) := x\log x - x + 1$ via (1.29); while the weights for evaluating a variance
swap via (1.29) are given as $1/k^2$, they are $1/k$ in the case of an entropy swap.
Since X is a martingale with $X_0 = 1$, we can compute the value of an entropy swap
with maturity T at time 0 as

$$E_0(T) = 2\,\mathbb{E}[X_T \log X_T]$$

Let us define the stock price measure $\mathbb{P}^X$ by setting $\mathbb{P}^X[A] := \mathbb{E}[1_A X_T]$ for all $A \in \mathcal{F}_T$
and all $T < \infty$. This measure is given by using X itself as a numeraire, and the above
expression shows that $\frac{1}{2}E_0(T)$ is simply the relative entropy of $\mathbb{P}^X$ with respect to $\mathbb{P}$,
hence the name entropy swap.
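The difference between the two strip weights can be checked numerically. The Python sketch below integrates out-of-the-money Black-Scholes option prices against the weights 2/k² and 2/k. It assumes zero rates, X_0 = 1, a flat volatility of 20%, and the usual replication identity for a twice-differentiable payoff with H(1) = H'(1) = 0; the strike range and grid size are arbitrary choices for this sketch. Under these assumptions both strips should come out close to σ²T.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(k, total_var):
    """Call on X_T with strike k, X_0 = 1, zero rates; total_var = sigma^2 * T."""
    s = math.sqrt(total_var)
    d1 = (-math.log(k) + 0.5 * total_var) / s
    return norm_cdf(d1) - k * norm_cdf(d1 - s)


def bs_put(k, total_var):
    """Put via put-call parity with forward 1."""
    return bs_call(k, total_var) - 1.0 + k


def otm_strip(weight, total_var, y_max=3.0, n=6000):
    """Midpoint rule in log-strike y for the strip of OTM puts (k < 1) and calls (k > 1)."""
    h = 2.0 * y_max / n
    total = 0.0
    for i in range(n):
        k = math.exp(-y_max + (i + 0.5) * h)
        price = bs_put(k, total_var) if k < 1.0 else bs_call(k, total_var)
        total += weight(k) * price * k * h  # dk = k dy
    return total


total_var = 0.20 ** 2 * 1.0
variance_swap = otm_strip(lambda k: 2.0 / k ** 2, total_var)  # weights 1/k^2
entropy_swap = otm_strip(lambda k: 2.0 / k, total_var)        # weights 1/k
print(variance_swap, entropy_swap, total_var)
```

With a flat volatility the two values coincide; with a skewed implied volatility surface the 1/k weights put relatively less weight on low strikes, which is where the two products would differ.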

15 Indeed, in classical stochastic volatility models such as Heston’s (cf. (2.24)), where the short
variance is not functionally dependent on X, the delta of a variance swap is zero. This is not
true for local volatility models or other models where the volatility is functionally dependent
on the spot level.
Shadow Options The connection between an entropy swap and the measure $\mathbb{P}^X$ goes
further: we have

$$E_0(T) = \mathbb{E}\!\left[\int_0^T X_t\,\zeta_t\,dt\right] = \int_0^T \mathbb{E}[X_t\,\zeta_t]\,dt = \int_0^T \mathbb{E}^X[\zeta_t]\,dt = \mathbb{E}^X\!\left[\int_0^T \zeta_t\,dt\right]$$

In other words, the price of an entropy swap is the value of a variance swap under
$\mathbb{P}^X$. With regard to this measure, recall that we used $P(T,k)$ to denote a put on X
with strike k and maturity T. Hence,

$$P(T,k) = \mathbb{E}\!\left[(k - X_T)^+\right] = \mathbb{E}\!\left[k\,X_T\left(\frac{1}{X_T} - \frac{1}{k}\right)^+\right] = k\,\mathbb{E}^X\!\left[\left(\frac{1}{X_T} - \frac{1}{k}\right)^+\right] =: k\,C^X\!\left(T,\frac{1}{k}\right),$$

where we call $C^X$, following Lewis [23], the "shadow call" on X. It is the call on $X_T^{-1}$
under the numeraire X. The shadow put $P^X$ is defined similarly; together we have

$$C^X(T,k) := k\,P(T,1/k)$$
$$P^X(T,k) := k\,C(T,1/k)$$

Hence, the shadow option prices can be read from the market. So, in principle, we
could compute the value of an entropy swap, $E_0(T) = 2\,\mathbb{E}^X[\log X_T]$, using (1.29) in
terms of shadow options.
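A quick sanity check of the shadow call relation is possible in the Black-Scholes case. There, as an assumption of this sketch, 1/X_T under the stock measure is again lognormal with mean 1 and the same total variance, so the shadow call should equal an ordinary Black-Scholes call with the same strike. The Python sketch below compares that value with k P(T, 1/k); zero rates and X_0 = 1 are assumed, and all function names are illustrative.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(k, total_var):
    """Call on a lognormal martingale started at 1; total_var = sigma^2 * T."""
    s = math.sqrt(total_var)
    d1 = (-math.log(k) + 0.5 * total_var) / s
    return norm_cdf(d1) - k * norm_cdf(d1 - s)


def bs_put(k, total_var):
    return bs_call(k, total_var) - 1.0 + k


def shadow_call_from_market(k, total_var):
    """Shadow call read from ordinary puts: k * P(T, 1/k)."""
    return k * bs_put(1.0 / k, total_var)


total_var = 0.25 ** 2 * 2.0
for strike in (0.5, 0.8, 1.0, 1.25, 2.0):
    direct = bs_call(strike, total_var)          # call on 1/X_T under the stock measure
    from_puts = shadow_call_from_market(strike, total_var)
    print(strike, direct, from_puts)
```

In this special case the identity reduces to the familiar put-call symmetry; with a skewed market the shadow prices would differ from ordinary call prices, but they could still be read off the quoted puts in the same way.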

2.2.3 Gamma Swaps


While entropy swaps are an interesting alternative to variance swaps, they are not
particularly well suited for real-life investments, because they require us to strip
dividends, repo, and interest rates from the traded stock price, S, in order to obtain
X. This is very unnatural from an economic point of view and inconveniences the
investor. This drawback can be overcome by using what are called gamma swaps or
weighted variance swaps: A gamma swap pays at maturity the weighted variance of
the stock price,

n 2
252 Stk Stk
log 1 tk (2.30)
n 1 S0 Stk 1
k 1

Assuming that there are no cash dividends, we approximate (2.30) as


$$\int_0^T \frac{S_t}{S_0}\,d\langle \log X\rangle_t$$

A gamma swap has the same attractive property as the entropy swap of being
exposed to correlation between stock price and volatility. See figure 2.17 for past
performance of gamma swaps.

[Figure: Payoffs of rolling 1y Variance and Gamma Swap, STOXX50E. Series: Variance Swap, Gamma Swap, Annual Index Return; time axis 10/98 to 04/05.]
FIGURE 2.17 Past performance of 1y variance and gamma swaps on STOXX50E. We have
also plotted the return performance of the index.

Under the assumption of continuity of X, the price of a gamma swap is

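$$\Gamma_0(T) := \mathbb{E}\!\left[\int_0^T \frac{S_t}{S_0}\,d\langle \log X\rangle_t\right] = \mathbb{E}\!\left[\int_0^T \frac{S_t}{S_0}\,\zeta_t\,dt\right] = \int_0^T \frac{F_t}{S_0}\,\mathbb{E}[X_t\,\zeta_t]\,dt = \int_0^T \frac{F_t}{S_0}\,(\partial_T E_0)(t)\,dt$$

(recall the symbols $F_t$ and $A_t$ from page 7). In other words, a gamma swap
is a sequence of forward variance swaps and forward entropy swaps. We can
approximate its price as

$$\Gamma_0(T) \approx \sum_{i=1}^n \frac{F_{t_i}}{S_0}\left(E_0(t_i) - E_0(t_{i-1})\right)$$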
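The discretization above is straightforward to evaluate once a term structure of entropy swap prices and forwards is available. The Python sketch below uses made-up inputs: a flat Black-Scholes volatility, for which E_0(t) = σ²t, and forwards F_t = S_0 e^{rt}. Both are assumptions chosen only so that the sum can be compared with the closed form σ²(e^{rT} - 1)/r of the integral.

```python
import math


def gamma_swap_price(forwards, entropy_prices, spot):
    """Sum of F_{t_i}/S_0 * (E_0(t_i) - E_0(t_{i-1})), with E_0(t_0) = 0."""
    total = 0.0
    previous = 0.0
    for fwd, entropy in zip(forwards, entropy_prices):
        total += fwd / spot * (entropy - previous)
        previous = entropy
    return total


# Hypothetical market: flat volatility, constant rate, no dividends.
sigma, rate, maturity, n_steps, spot = 0.20, 0.03, 1.0, 1000, 100.0
times = [maturity * i / n_steps for i in range(1, n_steps + 1)]
forwards = [spot * math.exp(rate * t) for t in times]
entropy_prices = [sigma ** 2 * t for t in times]

approx = gamma_swap_price(forwards, entropy_prices, spot)
exact = sigma ** 2 * (math.exp(rate * maturity) - 1.0) / rate
print(approx, exact)
```

In practice the entropy swap prices would themselves come from a strip of options, so the grid t_i is usually limited to the quoted maturities.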

When it comes to hedging a gamma swap, let h 0 and define H(x) : x log x
x 1 as above. Let us also recall equation (1.12) and Ito’s formula (1.13). They give
us again a hedging program,
$$\int_0^T \frac{S_t}{S_0}\,\zeta_t\,dt = \frac{2}{S_0}\left\{ H(S_T) - H(S_0) - \int_0^T \log S_t\,dS_t \right\}, \qquad (2.31)$$

similar to (2.19). Here, we can see why the product is called gamma swap: The
cash gamma $\Gamma_t^{\$} := S_t^2\,\Gamma_t$ for this product is $\Gamma_t^{\$} \propto S_t/S_0$ (i.e., linear in spot). The
performance of this hedge for real-life gamma swaps is as good as it is for variance
swaps, as figure 2.18 shows.

[Figure: DAX Realized Weighted Variance and its Hedge (31 days). Series: Realized weighted variance, Hedge, Hedging Error; time axis 01/92 to 01/05.]
FIGURE 2.18 The quality of hedging weighted variance swaps with (2.31). The graph shows
the daily realized weighted variance over 31 business days, the return from the hedging
strategy (2.19), and the hedging error.

2.3 VARIANCE SWAP MARKET MODELS

While the evaluation of variance swaps, entropy swaps and gamma swaps is relatively
model independent, such formulas are not known for options on realized variance,
as introduced in section 2.2.1.16 To price and hedge a call (2.27) on realized variance
on a stock where only European options are traded, we have to use a particular stock
price model. In this section we will discuss a general modeling approach that is based
on the idea to hedge options on variance with variance swaps. As an illustration,
figure 2.19 shows the term structure of variance swap fair strikes K for a few major
indices. The aim is to model the entire curve of variance swaps as a random variable
and then derive in a second step the dynamics of a stock price process that realizes
the modeled variance. (We do not attempt to develop a model that prices variance
swaps; rather, their prices are input parameters for the model.) Of course, a model
that describes well the evolution of variance swap price curves cannot only be used
to hedge options on realized variance. Since we will also provide an ‘‘associated
stock price process’’ in the model (and an intuitive meaning of correlation), we
can use such a model to price and hedge any exotic derivative. For example, it is
natural to hedge Cliquet-type products as discussed in section 2.1.6 using forward
started variance swaps.17 This approach is particularly appealing in the light of
recent trading volumes in variance swaps.

16 In the particular situation where the skew is symmetric in the logarithm of the strike (i.e., if
the instantaneous correlation is zero), it is possible to infer the distribution of integrated
variance. See Carr/Lee [46].

[Figure: Variance swap prices 24/10/2005. Fair strike K* (10% to 26%) against maturity (28/05/2005 to 14/08/2013) for S&P500, STOXX50E, DAX, FTSE100.]
FIGURE 2.19 Variance swap fair strikes for major stock price markets.

The entire approach is very similar to the Heath-Jarrow-Morton (HJM) approach
[47] in interest rates. There, the dynamics of the forward interest rates are modeled
as stochastic variables; we will consider forward variance. The basic assumption is
that alongside the "pure" stock X, at any time t, (zero-strike) variance swaps for all
finite maturities with prices

$$\frac{1}{[T]}\,V_t(T)$$

are liquidly traded. Under the assumption of "no free lunch with vanishing risk,"
there exists an equivalent measure under which both X and all variance swap price
processes and therefore $V = (V(T))_{T \ge 0}$ are local martingales (for ease of exposition
we will frequently refer to V(T) as the price process of a variance swap even though,
strictly speaking, the price process is $V(T)/[T]$). While variance swap prices V are
readily available in the market, they are slightly difficult to model directly: Since the
prices $V_t = (V_t(T))_{T \ge t}$ of variance swaps have to be increasing in T at any time t, it
is more natural to work instead with the forward variances

$$v_t(T) := \partial_T V_t(T) \qquad (2.32)$$

Forward variance is "the market's expectation" at time t of the variance at time T,
just as the forward rate in interest rates is the expectation of the short interest rate
under the forward measure. (Note that in contrast to a forward rate, a forward
variance of zero is a natural state, for example, on weekends.) The main point is that
due to its definition (2.32), forward variance itself is tradable and must therefore be
a local martingale under a pricing measure, if such a measure exists.

17 This has also been proposed by Bergomi [48].

As with interest rates, it is much more natural to look at the evolution of the
forward variance curve over time in ‘‘fixed time-to-maturity,’’ rather than a fixed
maturity. We expect the properties of forward variance $v_t(T)$ to change markedly
during the remaining time to maturity $T - t$: for example, very long-term forward
variance should not be as volatile as short-term forward variance. It is therefore
more convenient to use the Musiela parametrization18 of forward variance,

$$u_t(x) := v_t(x + t) \qquad (2.33)$$

Accordingly, the price of a variance swap (modulo scaling by the inverse of time-to-
maturity) in Musiela-parametrization is

$$U_t(x) := \int_0^x u_t(y)\,dy$$

HJM Theory for Variance Swaps The idea of "variance curve models" as introduced
by Buehler [49] is now to start by specifying the dynamics of the family $u = (u(x))_{x \ge 0}$
itself, just as HJM-type interest rate models are specified by starting with the forward
rate dynamics. The additional complication in the case of forward variance is that we
do not only want to model the variance swap prices in this way, but we also need to
model a consistent stock price process whose expected realized variance is the price
of the respective variance swap. We ignore the effects of dividends in this section.
To formalize our setup, assume that we have a d-dimensional Brownian motion
$W = (W^1,\dots,W^d)$ under a measure $\mathbb{P}$, which creates the filtration $(\mathcal{F}_t)_{t \ge 0}$. We
will model the variance curves directly under their martingale measure; the ideas
from section 1.4 will then be used to derive conditions on market completeness.
Assume that $u = (u(x))_{x \ge 0}$ is a family of non-negative processes $u(x) = (u_t(x))_{t \ge 0}$
given by

$$du_t(x) = \alpha_t(x)\,dt + \sum_{j=1}^d \beta_t^j(x)\,dW_t^j \qquad (2.34)$$

for some integrable predictable processes $\alpha$ and $\beta = (\beta^1,\dots,\beta^d)$. Reversing the
construction above, we can then define the forward variance processes $v = (v(T))_{T \ge 0}$
by setting

$$v_t(T) := \begin{cases} u_t(T - t) & T \ge t \\ v_T(T) & T < t \end{cases} \qquad (2.35)$$

(note that $v_t(T)$ is well-defined for $t > T$). Equivalently, the variance swap price
processes for finite maturities T are defined as

$$V_t(T) := \int_0^T v_t(r)\,dr$$

18 Musiela introduced this concept for interest rates in [50].


Definition 2.3.1 We call u given by (2.34) a variance curve model if v(T) given
by (2.35) is a local martingale for all $T < \infty$ and if there exists a local martingale X
for the stock price such that

$$\mathbb{E}\!\left[\langle \log X\rangle_T \mid \mathcal{F}_t\right] = V_t(T)$$

for all t and all $T < \infty$.

Let us assess when a curve u is indeed a variance curve model.19 First of all, it is
natural to assume that all initial variance swap prices are finite, that is,

$$\int_0^x u_0(y)\,dy < \infty$$

for all $x < \infty$. Indeed, if this does not hold, the expected value of the logarithm of
X cannot exist. Second, we have to ensure that for each $T < \infty$, the process v(T) is
a local martingale. To this end, we require that $\beta$ is in $C^1$ and its derivative $\partial_x \beta(x)$
is integrable with respect to Brownian motion. Then,

$$dv_t(T) = \left(\alpha_t(T - t) - \partial_x u_t(T - t)\right)dt + \sum_{j=1}^d \beta_t^j(T - t)\,dW_t^j,$$

which implies that the following HJM drift condition for forward variance must
hold:

$$\alpha_t(x) = \partial_x u_t(x)$$

As a next step, note that the process

$$\zeta_t := u_t(0) \qquad (2.36)$$

is an adapted non-negative process. Since $\mathbb{E}[\int_0^T \zeta_t\,dt] = V_0(T) < \infty$, its square root
$\sqrt{\zeta}$ is integrable with respect to any Brownian motion B. Each such Brownian
motion B can be written in terms of W as

$$B_t = \sum_{j=1}^d \int_0^t \rho_s^j\,dW_s^j, \qquad (2.37)$$

where $\rho = (\rho^1,\dots,\rho^d)$ is some potentially stochastic "correlation vector" with
values in $[-1,1]^d$, which always has unit norm, $\|\rho_t\|_2 = 1$. This means that

$$X_t := \mathcal{E}_t\!\left(\int_0^{\cdot} \sqrt{\zeta_s}\,dB_s\right)$$

is a well-defined local martingale with the property that

$$\mathbb{E}\!\left[\langle \log X\rangle_T \mid \mathcal{F}_t\right] = \mathbb{E}\!\left[\int_0^T \zeta_s\,ds \,\Big|\, \mathcal{F}_t\right] = V_t(T),$$

just as required. We call X an associated stock price process to u.

19 For a more technically detailed exposition, refer to Buehler [49] and [18].


The Brownian motion B or, alternatively, the correlation vector $\rho$ was arbitrary
in the construction of X. Indeed, B plays the role of a "correlation" or "skew"
parameter: If the dynamics of u in the form of $\beta$ are given, then the specification
of $\rho$ links the movement of the variance curve with the stock price movement.
In particular, this implies that volatility structure of the variance curve and its
correlation with the stock price movement can be estimated one after the other.
However, the general formulation of a variance curve above in terms of
equation (2.34) plus the requirement of non-negativity is more subtle than it may
appear in the first place. Indeed, it is very difficult to assess whether a general
stochastic integral (2.34) will remain non-negative. In particular it means that we
cannot—as in the HJM-framework for interest rates—specify the volatility structure
independently of the initially observed forward variance curve u0 .
A natural approach to this problem is to model u as an exponential,

$$u_t(x) = u_0(x)\,e^{w_t(x)},$$

where w satisfies the integral equation

$$dw_t(x) = a_t(x)\,dt + \sum_{j=1}^d b_t^j(x)\,dW_t^j$$

Applying our previous results implies the HJM-type drift condition

$$a_t(x) = \partial_x w_t(x) - \frac{1}{2}\sum_{j=1}^d b_t^j(x)^2$$

This approach is well suited for statistical estimation of a volatility structure


independent of the initial state u0 of the variance curve, for example, via a PCA-type
estimation of the factors driving the curve. However, it should be noted that this
approach also excludes all those classical stochastic volatility models that allow the
volatility to reach zero, such as Heston’s. Moreover, it is usually more complicated
to ensure a true martingale property for the process X if u is given in the form above:
recall in particular Jourdain’s results [25] for the SABR model and for Scott’s model,
which we discussed in sections 2.1.2 and 2.1.3, respectively. Nonetheless, given a
‘‘volatility structure’’ w that ensures the martingale property for all initial values
u, the above formulation can be used to ‘‘fit the market.’’ This will be discussed in
section 2.3.3.

2.3.1 Finite Dimensional Parametrizations


One drawback of our approach so far is that we formulated the dynamics for u in
a very general way. But in practice, the formulation of u in terms of a predictable
integral equation (2.34) is inconvenient for numerical purposes. Moreover, this
formulation implies that the entire curve u is the state of the process, an object
difficult to handle on a computer. What we are really interested in is a finite-
dimensional representation of the curve u. Indeed, in real life, a finite number of
variance swap market quotes is usually interpolated or approximated by some
non-negative increasing functional $\mathcal{G}$, which itself depends on only a finite number of
parameters $z \in \mathbb{R}^m$. If $Z_t$ is the parameter vector at time t, this means that the
price at time t of a variance swap starting in t with time-to-maturity x is given as

$$U_t(x) = \mathcal{G}(Z_t; x)$$

Since the function $\mathcal{G}$ must be increasing in x, we can set $G(z;x) := \partial_x \mathcal{G}(z;x) \ge 0$
such that the forward variance process is given as

$$u_t(x) = G(Z_t; x) \qquad (2.38)$$

The process $Z = (Z_t)_{t \ge 0}$ is called the parameter process of the functional G. The
idea is to restrict the dynamics of Z to ensure that the forward variances $v_t(T) = G(Z_t; T - t)$
are local martingales.
To this end, recall the definition of the driving diffusion in section 1.4,
equation (1.40). There, we have assumed that the entire market of tradable instru-
ments has been given as a functional of a finite-dimensional diffusion (Z0 , , Zm )
where Z0 represented the stock price X itself. We have shown that such a framework
is naturally complete in the sense that ‘‘delta hedging works’’ if assumption 1 on
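page 32 holds. Consequently, we will use the last m parameters $Z = (Z^1,\dots,Z^m)$ to
drive the parameters of the function G, and incorporate the associated stock price
$X = Z^0$ afterwards. To this end, assume that on the open set $\mathcal{Z} \subset \mathbb{R}^m_{\ge 0}$, the SDE

$$dZ_t = \mu(Z_t)\,dt + \sum_{j=1}^d \sigma^j(Z_t)\,dW_t^j, \qquad Z_0 \in \mathcal{Z}$$

for a drift vector $\mu : \mathcal{Z} \to \mathbb{R}^m$ and volatility vectors $\sigma = (\sigma^1,\dots,\sigma^d)$ with
$\sigma^j : \mathcal{Z} \to \mathbb{R}^m_{\ge 0}$ has a unique, strong solution. Moreover, assume that the variance curve
functional $G : \mathcal{Z} \times \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0}$ is a $C^{2,2}$ function with finite variance swap prices
for all states $z \in \mathcal{Z}$, i.e. $\int_0^T G(z;x)\,dx < \infty$ for all $T < \infty$. A direct application of Ito
shows that the family u defined by (2.38) is a variance curve model if, and only if,
the "consistency condition"

$$\partial_x G(z;x) = \mu(z)\,\partial_z G(z;x) + \frac{1}{2}\,\sigma^2(z)\,\partial^2_{zz} G(z;x) \qquad (2.39)$$

holds for all $(z,x) \in \mathcal{Z} \times \mathbb{R}_{\ge 0}$ and if v(T) is a true martingale for all finite T.20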

20 For technical details cf. [49].


Remark 2.3.1 It should be noted that we look at the heat equation (2.39) here
in a nonclassical way: obviously, if the process Z is given, then (2.39) is satisfied
for all functions G defined as $G(z;x) := \mathbb{E}[\,g(Z_x) \mid Z_0 = z\,]$ in terms of a suitably
well-behaved function g.
In contrast, here we start with the function G and ask when a process Z
exists to satisfy (2.39): The idea is that we observe the variance swap market
data and then choose a suitable function G, which interpolates these data well.
Afterwards we use (2.39) to derive constraints on the dynamics on the parameters
that drive the curve to ensure that the resulting variance swap price processes are
local martingales. The entire approach is very closely related to the idea of a ‘‘finite
dimensional parametrization’’ of a variance curve, cf. [49]. This concept has been
developed in the context of interest rate theory by Björk/Svensson [51], Filipovic [52]
and Filipovic/Teichmann [53].

The Associated Stock Price Once we have obtained what we call a consistent pair
(G, Z), the next step is again to construct an associated stock price process X. From
the considerations of the previous section, we know that the short variance of X is
given by $u_t(0) = G(Z_t; 0)$. It remains to model an appropriate correlation structure;
to this end, assume that $\rho = (\rho^1,\dots,\rho^d)$ is a "local" correlation vector; that
is, $\rho^j$ for $j = 1,\dots,d$ is a measurable function $\rho^j : \mathcal{Z} \times \mathbb{R}_{\ge 0} \to [-1,1]$ such that
$\|\rho(z,s)\|_2 = 1$ for all (z, s). Then, the stock price X is the strong unique solution to

$$dX_t = \sum_{j=1}^d \rho^j(Z_t; X_t)\,X_t\,\sqrt{G(Z_t; 0)}\,dW_t^j, \qquad X_0 = 1$$

The solution exists and is unique because $\rho^j(Z_t; x)\,x$ is process Lipschitz for all
$j = 1,\dots,d$, hence X is a well-defined non-negative local martingale, and we call the triplet
$(G, Z, \rho)$ a variance curve market model. It models all relevant market instruments
jointly in an arbitrage-free way. This setting also includes local volatility models
(in which case X itself is part of the vector Z) and, naturally, stochastic volatility
models. Moreover, the current framework fits into the settings of section 1.4: The
vector $\hat Z = (X, Z^1,\dots,Z^m)$ is Markov by construction; the market instruments are
the stock X itself and the variance swaps with price processes

$$V_t(T) = \int_0^{T-t} G(Z_t; x)\,dx + V_t(t)$$

The process $V_t(t) = \int_0^t \zeta_s\,ds$ with $dV_t(t) = G(Z_t; 0)\,dt$ represents the running variance
of log X. Without loss of generality we can assume that $Z_t^m = V_t(t)$. Let us then
define the variance swap price functional

$$\hat{\mathcal{G}}(z;x) := z^m + \mathcal{G}(z;x) = z^m + \int_0^x G(z;y)\,dy,$$

which gives the price of a variance swap with maturity $T \ge t$ in terms of Z as
$V_t(T) = \hat{\mathcal{G}}(Z_t; T - t)$. If this functional can be inverted locally in the sense that
there is some 0 and some time-to-maturities xM x1 such that the


function

z (z, x1 t), , (z, xM t)

is invertible for 0 t , then it is possible to extract locally the vector Zt from


the observation of only Vt (t) and a finite number of variance swaps with maturities
Ti xi t. If this function and therefore also its inverse is C1 , then the results of
section 1.4 can be applied, which means that under assumption 1 the market given
by (G, Z, ) is complete. Moreover, all the payoffs depending on the value processes
of X and the variance swaps can be replicated by trading in stock and variance
swaps (see Buehler [54] for technical details).

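Remark 2.3.2 (Delta in Stochastic Volatility Models) In the particular case where the correlation vector $\rho$ does not depend on $X$, the stock price at some later time $T$ depends on the current level $X_t$ only through its initial value. This allows us to compute the delta of a European option directly from market data without the need to calibrate a model: We can write the price of a call with maturity $T$ and strike $k$ as

\[
\begin{aligned}
\mathbb{C}_t(T, k) &:= \mathbb{E}\big[ (X_T - k)^+ \,\big|\, X_t, \zeta_t \big]
= \mathbb{E}\Big[ \Big( X_t\, e^{\int_t^T \sqrt{\zeta_u}\, dB_u - \frac12 \int_t^T \zeta_u\, du} - k \Big)^+ \,\Big|\, X_t, \zeta_t \Big] \\
&= X_t\, \mathbb{E}\Big[ \Big( e^{\int_t^T \sqrt{\zeta_u}\, dB_u - \frac12 \int_t^T \zeta_u\, du} - \frac{k}{X_t} \Big)^+ \,\Big|\, \zeta_t \Big].
\end{aligned}
\]

The price is therefore homogeneous of degree one in $(X_t, k)$. Hence, the "stochastic volatility delta" for any model that is well fitted to the market is given as

\[ \partial_{X_t} \mathbb{C}_t(T, k) = \frac{1}{X_t} \Big( \mathbb{C}_t(T, k) - k\, \partial_k \mathbb{C}_t(T, k) \Big). \]

That implies, in particular, that two different stochastic volatility models of this type that fit the market prices perfectly will have the same delta. Hence, the only way "pure" two-factor models can be distinguished is via their "vega hedge."21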

It is sometimes assumed that stochastic volatility models have a sticky strike delta
due to the computation above. However, this is not the case since the implied
volatility given in such a model for a relative strike remains the same only in the
(zero-probability) case that all other state parameters remain constant.
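The homogeneity argument of this remark can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes that call prices are homogeneous of degree one in spot and strike, so that the delta equals (C - k dC/dk) / X, and it uses Black-Scholes prices with zero rates as a stand-in for observed market prices. The function names and the bump size are illustrative assumptions.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(spot, strike, vol, maturity):
    """Black-Scholes call with zero rates and dividends (stand-in for market prices)."""
    sd = vol * math.sqrt(maturity)
    d1 = math.log(spot / strike) / sd + 0.5 * sd
    return spot * norm_cdf(d1) - strike * norm_cdf(d1 - sd)

def homogeneous_delta(price_of_strike, spot, strike, bump=1e-4):
    """Delta implied by homogeneity: (C - k * dC/dk) / X.

    dC/dk is estimated by central differences in the strike direction,
    i.e. from prices of neighbouring strikes only.
    """
    h = bump * strike
    dc_dk = (price_of_strike(strike + h) - price_of_strike(strike - h)) / (2.0 * h)
    return (price_of_strike(strike) - strike * dc_dk) / spot

spot, strike, vol, maturity = 100.0, 110.0, 0.2, 1.5
delta = homogeneous_delta(lambda k: bs_call(spot, k, vol, maturity), spot, strike)

# Reference value: the analytic Black-Scholes delta N(d1).
sd = vol * math.sqrt(maturity)
bs_delta = norm_cdf(math.log(spot / strike) / sd + 0.5 * sd)
```

In the Black-Scholes case the two numbers agree because dC/dk = -N(d2); for a model with skew, the same recipe would use the market's strike slope instead.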

21 Also note that jump models in which the jump parameters do not depend on the stock price
level have the same delta.

2.3.2 Examples
Let us now assess a few examples of variance curve functionals. Obviously, a
rich source of such functionals is to start with a stochastic volatility model and
use the variance swap curve functional given by this model as a starting point.
The natural question is then which other processes can drive the same variance
curve.

Example 5 A consistent parameter process $Z$ for the "linearly mean-reverting" variance curve functional

\[ G(z; x) := z_2 + (z_1 - z_2)\, e^{-z_3 x} \]

must follow an SDE of the form

\[
\begin{aligned}
dZ^1_t &= Z^3_t\, (Z^2_t - Z^1_t)\, dt + \sigma_1(Z_t)\, dW_t \\
dZ^2_t &= \sigma_2(Z_t)\, dW_t \\
dZ^3_t &= 0
\end{aligned}
\]

One popular example is Heston's model (2.1).
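As a small illustration, the linearly mean-reverting functional and the variance swap price it implies can be sketched as follows. This is a sketch under stated assumptions: the parameter triple is read as (short variance, long-run level, reversion speed), the closed-form integral is elementary calculus rather than a formula quoted from the text, and all names and numbers are hypothetical.

```python
import math

def linear_mr_curve(z, x):
    """Forward variance G(z; x) = z2 + (z1 - z2) * exp(-z3 * x)."""
    z1, z2, z3 = z
    return z2 + (z1 - z2) * math.exp(-z3 * x)

def variance_swap_price(z, x, running_variance=0.0):
    """Running variance plus the integral of G(z; y) over y in [0, x], in closed form."""
    z1, z2, z3 = z
    return running_variance + z2 * x + (z1 - z2) * (1.0 - math.exp(-z3 * x)) / z3

def variance_swap_price_numeric(z, x, steps=2000):
    """Same quantity by the trapezoidal rule, used only as a cross-check."""
    h = x / steps
    total = 0.5 * (linear_mr_curve(z, 0.0) + linear_mr_curve(z, x))
    total += sum(linear_mr_curve(z, i * h) for i in range(1, steps))
    return total * h

# Hypothetical Heston-like state: short variance 0.03, long-run level 0.05, speed 2.
z = (0.03, 0.05, 2.0)
closed_form = variance_swap_price(z, 2.0)
numeric = variance_swap_price_numeric(z, 2.0)
```

Dividing the result by the time to maturity and taking the square root would give the fair strike in volatility terms.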


The interpretation of this observation is that if variance swaps are priced using
Heston’s model, which in turn is calibrated every day to market data, then the speed
of mean reversion, Z3 , must theoretically be kept constant. Using entropy swaps,
it can also be shown [54], that the product of ‘‘vol of vol’’ and ‘‘correlation’’ in
Heston’s model must in theory be kept constant.

Also note that this example covers by a simple coordinate transformation the
Nelson-Siegel interpolation function for interest rates, G(z x) z1 (z2 z3 )e z4 x .
More generally, assume that $G$ is a polynomial exponential, that is, that $G$ is of the form

\[ G(z; x) := \sum_{i=1}^{n} p_i(z; x)\, e^{-z_i x} \tag{2.40} \]

for polynomials in $x$, $p_i(z; x) = \sum_{k} a_{ik}(z)\, x^k$, and $n \le m$. Using (2.39), it is straightforward to show that the "speeds of mean reversion" $Z^1, \ldots, Z^n$ for any consistent parameter process must be constant. A similar result holds for functions of the form $G(z; x) := \exp\big\{ \sum_{i=1}^{n} p_i(z; x)\, e^{-z_i x} \big\}$, in which case the parameters $z_1, \ldots, z_n$ must not only be constant, but also need to come in pairs in which one is twice the value of the other parameter. The observation that speeds of mean reversion must generally be constant for interest rate models was first shown by Björk/Christensen [55] and further investigated by Filipovic [52].
Another example of functionals of the class (2.40) is given by, G(z x) z1
(z2 z3 )e z4 x z4 e z5 x . Following Buehler [49], we use the following reparametriza-
tion, which makes it easier to ensure that the function remains positive:
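
Example 6 The "double linearly mean-reverting" variance curve functional is defined for positive constants $\kappa$ and $c$ as

\[ G(z; x) := z_3 + (z_1 - z_3)\, e^{-\kappa x} + (z_2 - z_3)\, \kappa\, \frac{e^{-c x} - e^{-\kappa x}}{\kappa - c} \tag{2.41} \]

with a well-defined limit for $\kappa \to c$. A consistent parameter process follows an SDE of the form

\[
\begin{aligned}
dZ^1_t &= \kappa\, (Z^2_t - Z^1_t)\, dt + \sigma_1(Z_t)\, dW_t \\
dZ^2_t &= c\, (Z^3_t - Z^2_t)\, dt + \sigma_2(Z_t)\, dW_t \\
dZ^3_t &= \sigma_3(Z_t)\, dW_t
\end{aligned}
\]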

For the case in which Z1 , , Z3 are square roots of affine functions, such a process
fits in the affine framework of Duffie et al. [56]. The curve (2.41) has proven to be
a good interpolation for actual market data; an example is given in figure 2.20.
Also recall that we have shown in section 2.1.5 that in the case $\sigma_1(z) = \nu \sqrt{z_1}$ and $\sigma_2 = \sigma_3 = 0$, a semi-closed form for European option prices can be derived. We will discuss a model based on (2.41) below.
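The double mean-reverting curve is the expected short variance of a two-level mean-reverting system, which gives a simple way to sanity-check an implementation. The sketch below assumes the form z3 + (z1 - z3) e^{-kappa x} + (z2 - z3) kappa (e^{-c x} - e^{-kappa x}) / (kappa - c) for (2.41), handles the limit kappa -> c explicitly, and compares the curve with a numerical solution of the mean ODEs. Function names, the switching threshold, and parameter values are illustrative assumptions.

```python
import math

def double_mr_curve(z, x, kappa, c):
    """Double mean-reverting forward variance curve, with the kappa -> c limit."""
    z1, z2, z3 = z
    if abs(kappa - c) < 1e-8:
        # Limit of kappa * (exp(-c x) - exp(-kappa x)) / (kappa - c) as c -> kappa.
        weight = kappa * x * math.exp(-kappa * x)
    else:
        weight = kappa * (math.exp(-c * x) - math.exp(-kappa * x)) / (kappa - c)
    return z3 + (z1 - z3) * math.exp(-kappa * x) + (z2 - z3) * weight

def mean_short_variance(z, x, kappa, c, steps=4000):
    """Integrate the drift-only system with Heun's method.

    d(zeta)/dx = kappa * (theta - zeta), d(theta)/dx = c * (m - theta), m constant.
    """
    zeta, theta, m = z
    dt = x / steps
    for _ in range(steps):
        d_zeta = kappa * (theta - zeta)
        d_theta = c * (m - theta)
        zeta_pred = zeta + dt * d_zeta
        theta_pred = theta + dt * d_theta
        zeta += 0.5 * dt * (d_zeta + kappa * (theta_pred - zeta_pred))
        theta += 0.5 * dt * (d_theta + c * (m - theta_pred))
    return zeta

z = (0.03, 0.05, 0.04)   # hypothetical short, medium and long variance levels
curve_value = double_mr_curve(z, 2.0, kappa=3.0, c=0.5)
ode_value = mean_short_variance(z, 2.0, kappa=3.0, c=0.5)
limit_gap = abs(double_mr_curve(z, 1.0, 3.0, 3.0) - double_mr_curve(z, 1.0, 3.0, 3.0 + 1e-6))
```

The last line probes the continuity of the curve across the kappa = c branch, which is where naive implementations tend to divide by zero.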

While the linearly mean-reverting models admit a range of possible parameter


processes, this is not generally true. Here is an example of a curve that admits only
one parameter process:

FIGURE 2.20 Fit of the double mean-reverting functional (2.41) to FTSE and STOXX50E market data. [Two panels, ".FTSE Variance Swap Fair Strikes 11/01/2006" and ".STOXX50E Variance Swap Fair Strikes 11/1/2006": volatility (K*) from 0% to 25% against maturity dates from 28/05/05 to 14/08/13, market versus model.]

Example 7 For the "exponential linearly mean-reverting" variance curve functional

\[ G(z; x) := \exp\Big\{ z_2 + (z_1 - z_2)\, e^{-z_4 x} + \frac{z_3^2}{4 z_4} \big( 1 - e^{-2 z_4 x} \big) \Big\}, \]

any parameter process $Z$ is constant in $Z^2$, $Z^3$ and $Z^4$. The parameter $Z^1$ follows an Ornstein-Uhlenbeck process

\[ dZ^1_t = Z^4_t\, (Z^2_t - Z^1_t)\, dt + Z^3\, dW_t; \]

that is, $G$ is driven only by Scott's exponential OU model.
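Consistency means that the forward variance for a fixed maturity is a martingale: averaging the curve seen at time t over the distribution of the state must return today's curve at the longer time to maturity. The sketch below checks this for the exponential OU functional by quadrature. It assumes the functional exp{z2 + (z1 - z2) e^{-z4 x} + z3^2 (1 - e^{-2 z4 x}) / (4 z4)} and an OU state with volatility z3 and reversion speed z4; names and numbers are hypothetical.

```python
import math

def exp_ou_curve(z, x):
    """Forward variance curve of an exponential OU short variance."""
    z1, z2, z3, z4 = z
    mean = z2 + (z1 - z2) * math.exp(-z4 * x)
    half_var = z3 ** 2 * (1.0 - math.exp(-2.0 * z4 * x)) / (4.0 * z4)
    return math.exp(mean + half_var)

def ou_mean_and_variance(z, t):
    """Mean and variance of the OU state Z1 at time t, started at z1."""
    z1, z2, z3, z4 = z
    mean = z2 + (z1 - z2) * math.exp(-z4 * t)
    var = z3 ** 2 * (1.0 - math.exp(-2.0 * z4 * t)) / (2.0 * z4)
    return mean, var

def expected_future_curve(z, t, x, points=2001, width=10.0):
    """E[G(Z_t; x)] by an equally spaced quadrature over the Gaussian law of Z1_t."""
    _, z2, z3, z4 = z
    mean, var = ou_mean_and_variance(z, t)
    sd = math.sqrt(var)
    step = 2.0 * width * sd / (points - 1)
    total = 0.0
    for i in range(points):
        y = mean - width * sd + i * step
        density = math.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += density * exp_ou_curve((y, z2, z3, z4), x)
    return total * step

z0 = (math.log(0.04), math.log(0.05), 0.8, 1.5)   # hypothetical initial state
lhs = expected_future_curve(z0, t=0.5, x=1.0)      # expected forward variance seen at t
rhs = exp_ou_curve(z0, 1.5)                        # today's curve at time to maturity t + x
```

If the constant in front of the variance term were changed, the two sides would no longer match, which is the sense in which the curve pins down its parameter process.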


It is also interesting to see whether a functional admits a parameter process at all. To this end, note that sometimes functions like $g(z; x) = z_1 + z_2 \sqrt{x}$, $g(z; x) = z_1 + z_2 x^{\alpha}$ (for $\alpha > 0$) or $g(z; x) = z_1 + z_2 \log(1 + x)$ are used to interpolate the term structure of implied volatility. Applied to variance swap curves, though, it can be seen easily that such an interpolation of the variance swap volatility, that is, using $G(z; x) := \partial_x \big( x\, g^2(z; x) \big)$, does not admit a consistent parameter process. This observation means that at least in the case of flat skew, implied volatility cannot consistently be interpolated with such functions.

Remark 2.3.3 The results here are of a theoretical nature. In practice, the speed of
mean reversion of a Heston model must be calibrated to market data, and we cannot
enforce a constant value over a long period of time without considerably weakening
the fit of the model to the market. Moreover, it should be clear that a real trading
desk faces many more inconsistencies arising from trading in the real world.
From this point of view, the results here regarding a constant mean reversion
should be merely taken as advice to avoid strong movements of the parameter as a
result of the daily recalibration of the model. Indeed, in our experience, imposing a
penalty on movements of the speed of mean reversion during calibration leads to a
much more stable daily recalibration of, for example, Heston’s model.

A Double Mean-Reverting Model Following example 6, a convenient parametrization to drive the double linearly mean-reverting curve functional (2.41) is given by

\[
\begin{aligned}
d\zeta_t &= \kappa\, (\theta_t - \zeta_t)\, dt + \nu\, \zeta_t^{\alpha}\, dW^{\zeta}_t \\
d\theta_t &= c\, (m_t - \theta_t)\, dt + \mu\, \theta_t^{\beta}\, dW^{\theta}_t \\
dm_t &= \eta\, m_t\, dW^{m}_t
\end{aligned}
\tag{2.42}
\]

and

\[ \frac{dX_t}{X_t} = \sqrt{\zeta_t}\, dB_t. \]

The correlation structure of the involved Brownian motions is given in terms of the parameters $\rho_\zeta$, $\rho_\theta$, $r_{\theta,\zeta}$, and $\rho_m$ as

\[
\begin{aligned}
B_t &= W^1_t \\
W^{\zeta}_t &= \rho_\zeta B_t + \hat\rho_\zeta W^2_t \\
W^{\theta}_t &= \rho_\theta B_t + \hat\rho_\theta \big( r_{\theta,\zeta} W^2_t + \hat r_{\theta,\zeta} W^3_t \big) \\
W^{m}_t &= \rho_m B_t + \hat\rho_m W^4_t
\end{aligned}
\]

(we used the notation $\hat\rho := \sqrt{1 - \rho^2}$). The exponents $\alpha$ and $\beta$ are assumed to be from $(0.5, 1)$ to ensure that (2.42) has a unique strong solution. To ensure that $X$ is a true martingale, we assume that $\rho_\zeta$, $\rho_\theta$ and $\rho_m$ are negative.
The dynamics of this model are very intuitive: The short variance is a mean-reverting process whose mean-reversion level is itself stochastic. Such a behavior is often observed in real markets. The stochasticity of $m$ has been introduced to fit the market slightly better, but in general and in the interest of parsimony, we usually set $\eta = 0$.
The calibration of the initial states $\zeta_0$, $\theta_0$, and $m_0$, along with the reversion speeds $\kappa$ and $c$, can be done by fitting (2.41) to the observed variance swap market data. The remaining parameters $\nu$, $\mu$, $\alpha$, $\beta$, $\rho_\zeta$, and $\rho_\theta$ ($r_{\theta,\zeta}$ is usually set to zero), on the other hand, require quite an expensive calibration via Monte Carlo. This is numerically far less robust than the calibration of, say, a Heston model. Indeed, to reduce the time spent during the calibration, we typically calibrate to only five maturities with three options per maturity. (While being theoretically attractive, such a model is necessary only if we want to price spread-type products such as forward-started options on variance.) Figure 2.21 shows the calibration results for this model to STOXX50E and FTSE market data.

FIGURE 2.21 Calibration of the double mean-reverting model (2.42) to FTSE and STOXX50E market data. The variance swap fits are shown in figure 2.20. [Two panels, "Variance Curve Model .FTSE Fit to European option prices 11/1/2006" and "Variance Curve Model .STOXX50E Fit to European option prices 11/1/2006": (Model - Market)/Spot between -0.50% and 0.50%, by strike and for maturities from 1m to 2y6m.]
To assess the impact of the model choice, we also calibrated the model (2.11) with $\theta_t := m + (\theta_0 - m)\, e^{-ct}$ (where $m = m_0$) and piecewise constant vol of vol and correlation to the same market data. It can be written as

\[
\begin{aligned}
d\zeta_t &= \kappa\, (\theta_t - \zeta_t)\, dt + \nu_t \sqrt{\zeta_t}\, dW_t \\
d\theta_t &= c\, (m_0 - \theta_t)\, dt,
\end{aligned}
\tag{2.43}
\]

and it also has the variance curve (2.41). The calibration results are shown in figure 2.22. Given the calibrated model, we can now price arbitrary options on variance. Figures 2.23 and 2.24 display the prices of calls on variance computed with the two calibrated models (all prices here are computed using Monte Carlo simulation with control variates on the variance swaps).

FIGURE 2.22 Calibration of the extended time-dependent Heston model (2.43) to FTSE and STOXX50E market data. The variance swap fits are shown in figure 2.20. [Two panels, "Extended Heston TD .FTSE Fit to European option prices 11/1/2006" and "Extended Heston TD .STOXX50E Fit to European option prices 11/1/2006": (Model - Market)/Spot between -0.50% and 0.50%, by strike and for maturities from 1m to 2y6m.]

FIGURE 2.23 Prices of ATM calls on realized variances with the calibrated double mean-reverting and the calibrated extended Heston model (2.43). [Plot ".FTSE Options on Variance ATM 11/01/2006": price / 2K* for maturities from 3m to 3y.]

2.3.3 Fitting to the Market


The previous sections discussed how we can model consistently the idea of "interpolating the variance swaps": We assumed that if we find a suitable function that interpolates the variance swaps well at any time $t$, then we can derive no-arbitrage conditions on the dynamics of the parameters of this function. This approach is in spirit the idea of Björk/Christensen [55], who first introduced this concept of "consistency" for interest rates. However, far more popular are interest rate models that serve only two purposes: a perfect fit to the interpolated discount bonds, regardless of the interpolation method used, and a parsimonious specification of the volatility structure of the model. The best known models of this class are the one- and two-factor extended Vasicek or Hull-White models; see chapter 3.
We will now discuss similar approaches for variance curves and thereby forgo the consistency approach. The aim is now to fit the market and to be able to describe the volatility structure of variance in a parsimonious way. We assume that we observe a sufficiently smooth variance swap market curve $U_0 = (U_0(x))_{x \ge 0}$ with non-negative forward variance curve $u_0(x) := \partial_x U_0(x)$. Recall the fixed time-to-maturity quantities $v_t(T) := u_t(T - t)$ and $V_t(T) := U_t(T - t)$.

FIGURE 2.24 Prices of 1y calls on realized variances with the calibrated double mean-reverting and the calibrated extended Heston model (2.43). [Plot ".FTSE Options on Variance 1y maturity 11/01/2006": price / 2K* against strike from 75% to 125% of K*.]

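Example 8 Dupire [57] proposed a "fitting stochastic volatility" model based on an exponential representation of the forward variance curve, that is,

\[ v_t(T) := u_0(T)\, \mathcal{E}_t\Big( \int_0^{\cdot} \sigma_s\, dW^1_s \Big), \]

where $\sigma$ is a deterministic volatility function and $\mathcal{E}$ denotes the stochastic exponential.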

Indeed, this approach can easily be generalized. To this end, assume that we are given a variance swap curve model $(G, Z)$. Then,

\[ u_t(x) := \frac{u_0(x + t)}{G(Z_0; x + t)}\, G(Z_t; x) \tag{2.44} \]

can be seen to be a variance curve model that reprices the variance swap market (i.e., the model's initial curve coincides with the market curve $u_0$). A model very similar to Dupire's is therefore given by using Scott's exponential OU model,

\[ u_t(x) := u_0(x + t)\, \frac{e^{w_t}}{\mathbb{E}[e^{w_t}]}, \tag{2.45} \]

where $dw_t = -\kappa w_t\, dt + \nu\, dW_t$ is an Ornstein-Uhlenbeck process. We call this model "fitted log-normal." It can be extended to a sum of correlated Ornstein-Uhlenbeck processes, as proposed by Bergomi [48]. However, following Jourdain [25], care should be taken with (2.44) to ensure that the associated stock price process is a true martingale (the local martingale property is ensured if the original model yields a local martingale for the stock price).

Remark 2.3.4 Sin [58] makes the following observation: a local martingale

\[ \frac{dX_t}{X_t} = \sqrt{\zeta_t}\, dB_t \qquad (X_0 = 1) \]

with nonexplosive short variance $\zeta$ is a true martingale if, and only if, the process $\zeta$ does not explode under the measure $\mathbb{P}^X$ associated to the numeraire $X$.22
Using this result we can show that (2.44) applied to Heston's model will retain the martingale property of the associated stock price as long as the correlation is not positive.

The drawback of the "fitted log-normal" model is that it is, to our knowledge, not possible to efficiently compute European option prices. That implies that we have to revert to expensive numerical methods if the model is to be calibrated to European prices. A model that does not have this drawback can be constructed from example (2.43) given earlier: we have discussed that European prices for

\[
\begin{aligned}
d\zeta_t &= \kappa\, (\theta_t - \zeta_t)\, dt + \nu \sqrt{\zeta_t}\, dW_t \\
d\theta_t &= c\, (m_0 - \theta_t)\, dt
\end{aligned}
\tag{2.46}
\]

can be computed relatively efficiently using Fourier transforms provided $\theta$ is non-negative. Since

\[ \mathbb{E}[\zeta_t] = \zeta_0\, e^{-\kappa t} + \kappa \int_0^t e^{-\kappa (t - s)}\, \theta_s\, ds, \]

it is easy to see that if we set $\theta_t := u_0(t) + \frac{1}{\kappa}\, \partial_x u_0(x) \big|_{x = t}$ (together with $\zeta_0 = u_0(0)$), then we fit the market: $\mathbb{E}[\zeta_t] = u_0(t)$. The non-negativity condition on $\theta$ essentially implies that $u_0$ must have the form $u_0(x) = e^{-\kappa x} f(x)$ for some increasing function $f$. As long as this condition is satisfied, we obtain a "fitted Heston model" that reprices the initial variance curve, which has a parsimonious parameter structure and which allows the calibration of these "volatility parameters" $\nu$, $\kappa$ and $\rho$ via European options. Additionally, the volatility parameters can be made piecewise time dependent; cf. (2.12) and the discussion thereafter.23

22 To see this, let $\tau_n := \inf\{ t : \zeta_t \ge n \}$ such that $(X^n)_n$ with $X^n_t := X_{t \wedge \tau_n}$ is a true (discrete time) martingale on the filtration $(\mathcal{F}_{\tau_n \wedge T})_n$. Fix $T > 0$ and define $\mathbb{P}^n$ on $\mathcal{F}_{\tau_n \wedge T}$ by $\mathbb{P}^n[A] := \mathbb{E}[X^n_T 1_A]$. Assuming that $(\Omega, \mathcal{F}_T)$ is Polish, there exists by Kolmogorov extension a probability measure $\mathbb{P}^X$ on $\mathcal{F}_T$ such that $\mathbb{P}^X[A] = \mathbb{P}^n[A]$ for all $A \in \mathcal{F}_{\tau_n \wedge T}$, and for all $B \in \mathcal{F}_T$, we have then via Lebesgue decomposition that $\mathbb{P}^X[B] = \mathbb{E}[X_T 1_B] + \mathbb{P}^X[B \cap \{ \tau_\infty \le T \}]$, where $\tau_\infty := \lim_n \tau_n$. Using $B = \Omega$ yields the desired result.

FIGURE 2.25 We have adjusted the parameters for the fitted Heston and fitted log-normal model by hand to roughly match ATM calls on variance between 1y and 2y of the double mean-reverting model. The graph shows the quality of the match and the impact on the short and long end of the ATM curve. [Plot "Setup: ATM Calls on Variance": price / 2K* for maturities from 3m to 3y6m; double mean-reverting model, fitted Heston, fitted log-normal.]
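The fitting rule for the mean-reversion level can be illustrated with a short calculation. This is a sketch under stated assumptions, not the book's calibration code: it takes a hypothetical smooth market forward variance curve u0 with a known derivative, sets theta_t = u0(t) + u0'(t) / kappa, and integrates the ODE for the expected short variance, d E[zeta_t] / dt = kappa (theta_t - E[zeta_t]), starting from u0(0). If the rule is right, the integrated mean reproduces u0.

```python
import math

def fitted_theta(u0, du0, kappa, t):
    """Mean-reversion level that makes E[zeta_t] equal to the market curve u0(t)."""
    return u0(t) + du0(t) / kappa

def expected_short_variance(u0, du0, kappa, horizon, steps=4000):
    """Heun integration of d E[zeta]/dt = kappa * (theta_t - E[zeta_t]), E[zeta_0] = u0(0)."""
    dt = horizon / steps
    mean = u0(0.0)
    for i in range(steps):
        t = i * dt
        k1 = kappa * (fitted_theta(u0, du0, kappa, t) - mean)
        k2 = kappa * (fitted_theta(u0, du0, kappa, t + dt) - (mean + dt * k1))
        mean += 0.5 * dt * (k1 + k2)
    return mean

def u0(x):
    """Hypothetical market forward variance curve."""
    return 0.04 + 0.02 * math.exp(-x)

def du0(x):
    """Derivative of the hypothetical curve."""
    return -0.02 * math.exp(-x)

kappa = 2.0
model_mean = expected_short_variance(u0, du0, kappa, horizon=3.0)
# Non-negativity of theta is required for the Fourier pricing mentioned in the text.
theta_min = min(fitted_theta(u0, du0, kappa, 0.01 * i) for i in range(301))
```

For a steeply decreasing market curve and a small kappa, `theta_min` would turn negative, which is the situation the non-negativity condition rules out.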

Model Dependency If we use a specific model to price and hedge an exotic payoff, we are subject to model risk. Hence, it is important to assess the impact of the choice of a model. To this end, we present here a few results on the comparison between the fitted Heston model, the fitted log-normal model and the double mean-reverting model (2.42). To be able to compare the models, we interpolate the variance swaps using the variance swap curve function (2.41). Next, we price a 100% ATM call on variance using the double mean-reverting model with the parameters calibrated in the examples before. Then, we adjust the parameters $\nu$ and $\kappa$ in the fitted Heston and fitted log-normal model such that they both have very similar option prices for the 1y to 2y 100% ATM calls (the correlation parameters do not have a big impact on pricing of options on variance).
Having matched the models in this way, we can now compare the impact of the choice of a model first by comparing ATM calls with different maturities and second by comparing the prices of out-of-the-money calls. This is shown in figures 2.25 and 2.26. It is remarkable how similar the prices produced by the two fitted models are: once the fitted log-normal and the fitted Heston agree for the ATM option, they produce very similar OTM option prices. Because of this very similar fit, the two models also produce very similar "VarSwapVegas"; hence, both price and hedge of a European option on realized variance are relatively robust with respect to model choice once the ATM calls are matched.

FIGURE 2.26 This graph shows the prices of 1y calls for the three models shown in figure 2.25. [Plot "Setup: 1y Calls on Variance for various strikes": price / 2K* against strike from 75% to 125% of K*; double mean-reverting model, fitted Heston, fitted log-normal.]

23 In this section, we construct models that mainly serve the purpose of fitting the market. As in other fitting models, this can easily lead to economically counterintuitive calibration results.
PART TWO
Equity Interest Rate Hybrids

CHAPTER 3
Short-Rate Models

3.1 INTRODUCTION
When pricing equity derivatives, we generally need to model only a single market
instrument: the stock price.1 The interest rate world, on the other hand, consists of
many instruments: futures, swaps, and the like, all of which can move independently.
These are generally combined to form the yield curve, commonly expressed in terms
of zero coupon bond prices P(t, T) (i.e., the value seen at time t of 1 unit of currency
paid at time T) or the zero coupon rate R(t, T), defined by

\[ P(t, T) = \exp\big( -R(t, T)\, (T - t) \big). \]

Another useful representation is in terms of the forward rate, $f(t, T)$. This is defined as the rate, fixed at time $t$, for instantaneous borrowing at time $T$. If we agree at time $t$ that we will invest 1 at time $T$ for an infinitesimal period $\delta$, the amount we will get back at time $T + \delta$ is $1 + f(t, T)\,\delta$. We can hedge this by shorting the zero coupon bond with maturity $T$ and buying $1 + f(t, T)\,\delta$ units of the zero coupon bond with maturity $T + \delta$, making

\[ f(t, T) = \lim_{\delta \to 0} \frac{1}{\delta} \Big( \frac{P(t, T)}{P(t, T + \delta)} - 1 \Big) = -\frac{\partial \ln P(t, T)}{\partial T}. \]

The EUR yield curve is shown in terms of $R(0, T)$ and $f(0, T)$ in figure 3.1.
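The relationship between discount factors, zero rates and forward rates can be sketched as follows. The discount curve here is a hypothetical analytic one chosen so that the answers are known in closed form; the function names and bump sizes are illustrative assumptions.

```python
import math

def zero_rate(discount, T):
    """Continuously compounded zero rate R(0, T) from P(0, T) = exp(-R(0, T) * T)."""
    return -math.log(discount(T)) / T

def forward_rate(discount, T, bump=1e-5):
    """Instantaneous forward rate f(0, T) = -d ln P(0, T) / dT by central differences."""
    return -(math.log(discount(T + bump)) - math.log(discount(T - bump))) / (2.0 * bump)

def simple_forward_rate(discount, T, delta):
    """Simply compounded rate for the period [T, T + delta]; tends to f(0, T) as delta -> 0."""
    return (discount(T) / discount(T + delta) - 1.0) / delta

def discount(T):
    """Hypothetical curve with R(0, T) = 2% + 0.5% * T, hence f(0, T) = 2% + 1% * T."""
    return math.exp(-(0.02 * T + 0.005 * T * T))

R5 = zero_rate(discount, 5.0)
f5 = forward_rate(discount, 5.0)
f5_simple = simple_forward_rate(discount, 5.0, 1e-6)
```

On an upward-sloping curve the forward rate sits above the zero rate, which is the qualitative picture one would expect in a plot of both curves.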
Two different approaches to interest rate modeling are

■ Market models, where we model the market instruments such as LIBOR2 or CMS3 rates directly. Examples of market models include the well-known BGM model [59].
■ HJM models [60], where we model the evolution of the entire forward curve $f(t, T)$.

FIGURE 3.1 EUR yield curve in terms of the zero rate, $R(0, T)$, and the forward rate, $f(0, T)$. [Plot "EUR yield curve": rate between 2 and 4.5 against maturity $T$ from 0 to 20 years.]

When pricing equity-interest rate hybrids we will automatically have at least two stochastic factors (the equity and the interest rate), so to keep the problems tractable it is often convenient to use a simple one-factor model for the interest-rate component. One such family of models is the short-rate family (a subset of the HJM models). These are particularly tractable and allow us to use PDE methods for pricing derivatives, unlike some market models.
The short rate, $r_t$, is the instantaneous borrowing rate. It is not observed directly in the market, but can be expressed in terms of zero-coupon bonds as

\[ r_t = -\frac{\partial (\ln P(t, T))}{\partial T} \bigg|_{T = t} = f(t, t). \]

1 Treating volatility as an asset class in its own right.

2 The LIBOR (London Inter-Bank Offer Rate) is the uncompounded rate fixed at $t$ for a loan or investment at $t$, paid back at $T$. The value of one unit of currency at time $t$ is worth the promise of $1 + \mathrm{LIBOR}(t, T)\,(T - t)$ units at time $T$, so

\[ \mathrm{LIBOR}(t, T) = \frac{1}{T - t} \Big( \frac{1}{P(t, T)} - 1 \Big). \]

We will ignore the small differences between the LIBOR fixing date and the accrual start date, and the LIBOR payment date and accrual end date. For convenience, we will refer to LIBOR rates whatever the actual currency (instead of using terms like EURIBOR, etc.).

3 The $n$-year CMS (Constant Maturity Swap) rate at time $t$ is the rate that gives an $n$-year swap, starting at $t$, zero value. If the payment dates in the swap are $T_1$ to $T_m = t + n$ (in an annual swap, for example, we have $T_i = t + i$), we have

\[ \mathrm{CMS} = \frac{1 - P(t, T_m)}{\sum_{i=1}^{m} \delta_i\, P(t, T_i)}, \]

where $\delta_i$ is the day-count fraction $T_i - T_{i-1}$ (with $T_0 = t$).
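The LIBOR and swap rate definitions in these footnotes translate directly into code. The sketch below evaluates both at $t = 0$ from a discount function; the flat continuously compounded curve is a hypothetical input chosen because, for annual payments, both rates then equal $e^{r} - 1$. Function names are illustrative.

```python
import math

def libor(discount, T):
    """Spot LIBOR for [0, T]: (1 / P(0, T) - 1) / T."""
    return (1.0 / discount(T) - 1.0) / T

def par_swap_rate(discount, payment_times, start=0.0):
    """Par swap rate at time 0: (1 - P(0, T_m)) / sum_i delta_i P(0, T_i).

    delta_i is taken as the plain year fraction T_i - T_{i-1}, with T_0 = start.
    """
    annuity = 0.0
    previous = start
    for T in payment_times:
        annuity += (T - previous) * discount(T)
        previous = T
    return (1.0 - discount(payment_times[-1])) / annuity

def discount(T):
    """Hypothetical flat curve with a 3% continuously compounded rate."""
    return math.exp(-0.03 * T)

one_year_libor = libor(discount, 1.0)
five_year_swap = par_swap_rate(discount, [1.0, 2.0, 3.0, 4.0, 5.0])
```

With a non-flat curve the two numbers differ, and the gap between swap rates of different tenors is exactly what a CMS spread product pays on.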



In a short-rate model, the short rate is modeled as some specified stochastic process. For a general single-factor model we will have

\[ dr_t = \mu(t, r_t)\, dt + \sigma(t, r_t)\, dW_t, \]

where $W_t$ is a Brownian motion in the risk-neutral measure, $\mathbb{Q}$, where the money market account,

\[ B_t = \exp\Big( \int_0^t r_s\, ds \Big), \]

is the numeraire.
Three examples of popular short-rate models are the Hull-White or Vasicek model ([61], [62]):

\[ dr_t = (\theta_t - \kappa_t r_t)\, dt + \sigma_t\, dW_t, \tag{3.1} \]

the Black-Karasinski model ([63]):

\[ d \ln r_t = (\theta_t - \kappa_t \ln r_t)\, dt + \sigma_t\, dW_t, \]

and the Cox-Ingersoll-Ross model ([64]):

\[ dr_t = (\theta_t - \kappa_t r_t)\, dt + \sigma_t \sqrt{r_t}\, dW_t. \]

Each of these models is capable of fitting the entire term structure of interest rates if the yield curve obeys certain constraints. For example, the Black-Karasinski model requires that the forward rate,

\[ f(0, t) = -\frac{\partial \ln P(0, t)}{\partial t}, \]

is positive for all $t$. The function $\theta_t$ can be calibrated so that the models fit the initial yield curve, that is,

\[ P(0, t) = \mathbb{E}^{\mathbb{Q}}\Big[ \exp\Big( -\int_0^t r_s\, ds \Big) \Big]. \]

The volatility parameters, $\sigma_t$ and $\kappa_t$, can also be calibrated. While these parameters do affect the fit to the yield curve, they will generally be calibrated to swaption and/or cap prices. For any set of volatility parameters, we must adjust the drift term $\theta_t$ to fit the yield curve.
Since we only have a one-factor model, the ways in which the yield curve can evolve are limited, with changes to all forward rates being perfectly correlated. Figure 3.2 shows some possible changes to the forward curve in a simple Hull-White model, whereas figure 3.3 shows actual changes to the forward curve. While one-factor models may capture the dynamics of individual rates, they cannot capture the relationship between different rates. As a consequence, one-factor models are not suitable for pricing derivatives that depend on differences between two market rates, such as CMS spread options.

FIGURE 3.2 Possible changes to the forward curve from a single-factor Hull-White model using a mean reversion of 10%. A change to the short rate (the front end of the curve) decays away exponentially with maturity. [Plot "Perturbations to the EUR forward curve": forward rate against maturity $T$ from 0 to 20 years, for short-rate shifts of +1%, +0.5%, 0 (original), -0.5% and -1%.]

FIGURE 3.3 The EUR forward rate curve calculated from market data on five different dates, shown as a function of maturity. [Plot "Historic EUR forward rates": forward rate between 2 and 4.5 against maturity from 1/14/2006 to 1/14/2022, for 23-Sep-05, 21-Oct-05, 18-Nov-05, 16-Dec-05 and 13-Jan-06.]
3.2 ORNSTEIN-UHLENBECK MODELS

We will consider a useful family of short-rate models that can be constructed by expressing the short rate as some function of a variable, $x_t$, following an Ornstein-Uhlenbeck process:

\[ dx_t = -\kappa_t x_t\, dt + \sigma_t\, dW_t \tag{3.2} \]

as

\[ r_t = r_t(x_t, \bar x_t, t). \tag{3.3} \]

If we let $r_t = x_t + \bar x_t$, we recover the Hull-White or Vasicek model. If we let $r_t = \exp(x_t + \bar x_t)$, we get the Black-Karasinski model (and the Black-Derman-Toy model ([65]) if we set $\kappa = 0$). To have more control over the relationship between the short rate and its volatility, we can find some parameterization that interpolates between a normal and a log-normal model, such as

\[ r_t = \frac{1}{\beta} \Big( \exp\big( \beta\, (x_t + \bar x_t) \big) - 1 \Big). \]

In this model, the limit $\beta \to 0$ corresponds to the Hull-White or Vasicek model and the limit $\beta \to 1$ corresponds to the Black-Karasinski model.

3.3 CALIBRATING TO THE YIELD CURVE

3.3.1 Hull-White Model


The Hull-White or Vasicek model is particularly popular as it is the most analytically tractable nontrivial interest rate model. Closed form solutions exist for several options since we have closed forms for both the short-rate distribution and the money market account (our numeraire). In this model we can express the drift $\theta_t$ in terms of the initial yield curve $P(0, t)$. However, the expression involves the second derivative of $P(0, t)$, which means we must use some smooth function like a cubic spline for the yield curve. This may not always be ideal as these functions tend to have unwanted nonlocal behavior. However, with a simple change of variables we can remove the need to calculate $\theta_t$ and the need to use smooth functions for $P(0, t)$.
As mentioned above, we can rewrite the Hull-White model in terms of some variable $x_t$ whose simple SDE (given by equation (3.2)) does not involve $\theta_t$. We do this by letting

\[ r_t = x_t + \bar x_t, \]

where $\bar x_t$ obeys

\[ d\bar x_t = (\theta_t - \kappa_t \bar x_t)\, dt. \]

Adding this to equation (3.2) recovers the usual Hull-White SDE in equation (3.1). If we choose $\bar x_0 = f(0, 0)$, we have $x_0 = 0$ and $\mathbb{E}^{\mathbb{Q}}[x_t \mid \mathcal{F}_0] = 0$, so $\bar x_t$ is just the expected future short rate in the risk-neutral measure.
Integrating equation (3.2) we have

\[ x_s = x_t \exp(-\kappa_{ts}) + \int_t^s \exp(-\kappa_{us})\, \sigma_u\, dW_u, \]

where

\[ \kappa_{ts} = \int_t^s \kappa_u\, du. \]

To show how $\bar x_t$ relates to the yield curve, we need to price a zero-coupon bond:

\[
\begin{aligned}
P(t, T) &= \mathbb{E}^{\mathbb{Q}}\Big[ \exp\Big( -\int_t^T r_s\, ds \Big) \,\Big|\, \mathcal{F}_t \Big] \\
&= \exp\Big( -\int_t^T \bar x_s\, ds \Big)\, \mathbb{E}^{\mathbb{Q}}\Big[ \exp\Big( -\int_t^T x_s\, ds \Big) \,\Big|\, \mathcal{F}_t \Big] \\
&= \exp\Big( -\int_t^T \bar x_s\, ds - x_t \hat B(t, T) \Big)\, \mathbb{E}^{\mathbb{Q}}\Big[ \exp\Big( -\int_t^T \sigma_s \hat B(s, T)\, dW_s \Big) \,\Big|\, \mathcal{F}_t \Big] \\
&= \exp\Big( -\int_t^T \bar x_s\, ds - x_t \hat B(t, T) + \frac12 \int_t^T \sigma_s^2 \hat B(s, T)^2\, ds \Big),
\end{aligned}
\tag{3.4}
\]

where

\[ \hat B(t, T) = \int_t^T \exp(-\kappa_{ts})\, ds. \tag{3.5} \]
If we let $t = 0$ and differentiate the log of equation (3.4), we get

\[
\begin{aligned}
f(0, T) &= \bar x_T - \frac12 \frac{d}{dT} \int_0^T \sigma_s^2 \hat B(s, T)^2\, ds \\
&= \bar x_T - \int_0^T \sigma_s^2 \hat B(s, T) \exp(-\kappa_{sT})\, ds.
\end{aligned}
\]

We can use this equation to calculate $\bar x_t$ from the initial yield curve, should we need to. We can also use it to eliminate $\bar x_t$ from the expression for $P(t, T)$ given in equation (3.4), giving

\[ P(t, T) = \frac{P(0, T)}{P(0, t)} \exp\Big( -x_t \hat B(t, T) + \frac12 \int_0^t \sigma_s^2 \big[ \hat B(s, t)^2 - \hat B(s, T)^2 \big]\, ds \Big). \]

Since the short rate is not observable in the market, there is no reason why we should explicitly need $r_t$. Instead, we can use $x_t$ when simulating Monte Carlo paths or writing PDEs for derivatives prices. However, for finding closed-form solutions it is often simpler to work with a slightly different variable,

\[
\begin{aligned}
\tilde x_t &= r_t - f(0, t) \\
&= x_t + \int_0^t \sigma_s^2 \hat B(s, t) \exp(-\kappa_{st})\, ds.
\end{aligned}
\tag{3.6}
\]
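
This has zero expectation in the $t$ forward measure, $\mathbb{Q}^t$. We can price derivatives as

\[ V = P(0, t)\, \mathbb{E}^{\mathbb{Q}^t}\big[ \mathrm{Payoff}(\tilde x_t, t) \big], \]

for which we need the probability density of $\tilde x_t$ in $\mathbb{Q}^t$.
To calculate the distribution of $\tilde x_t$, we could use the dynamics of $x_t$ in $\mathbb{Q}$ (i.e., equation (3.2)), then change measure to $\mathbb{Q}^t$. The Radon-Nikodym derivative is just the ratio of the numeraires, so we need an expression for the money market account:

\[
\begin{aligned}
B_t &= \exp\Big( \int_0^t r_s\, ds \Big) \\
&= \exp\Big( \int_0^t (x_s + \bar x_s)\, ds \Big) \\
&= \frac{1}{P(0, t)} \exp\Big( \int_0^t x_s\, ds + \frac12 \int_0^t \sigma_s^2 \hat B(s, t)^2\, ds \Big) \\
&= \frac{1}{P(0, t)} \exp\Big( \int_0^t \int_0^s \sigma_u \exp(-\kappa_{us})\, dW_u\, ds + \frac12 \int_0^t \sigma_s^2 \hat B(s, t)^2\, ds \Big) \\
&= \frac{1}{P(0, t)} \exp\Big( \int_0^t \sigma_s \hat B(s, t)\, dW_s + \frac12 \int_0^t \sigma_s^2 \hat B(s, t)^2\, ds \Big).
\end{aligned}
\]

The Radon-Nikodym derivative is therefore

\[ \frac{d\mathbb{Q}^t}{d\mathbb{Q}} = \frac{1}{B_t\, P(0, t)} = \exp\Big( -\int_0^t \sigma_s \hat B(s, t)\, dW_s - \frac12 \int_0^t \sigma_s^2 \hat B(s, t)^2\, ds \Big). \]

The SDE followed by $x$ under $\mathbb{Q}^t$ is therefore

\[ dx_s = -\kappa_s x_s\, ds + \sigma_s\, dW^{\mathbb{Q}^t}_s - \sigma_s^2 \hat B(s, t)\, ds, \]

with solution

\[ x_t = -\int_0^t \sigma_s^2 \hat B(s, t) \exp(-\kappa_{st})\, ds + \int_0^t \sigma_s \exp(-\kappa_{st})\, dW^{\mathbb{Q}^t}_s. \]

Comparing this to equation (3.6), we see that

\[ \tilde x_t = \int_0^t \sigma_s \exp(-\kappa_{st})\, dW^{\mathbb{Q}^t}_s. \]

To price derivatives in $\mathbb{Q}^t$ we also need zero-coupon bond prices. Using equation (3.6) to substitute for $x_t$ in equation (3.4) gives

\[ P(t, T) = \frac{P(0, T)}{P(0, t)} \exp\Big( -\tilde x_t \hat B(t, T) - \frac12 \hat B(t, T)^2 \int_0^t \sigma_s^2 \exp(-2 \kappa_{st})\, ds \Big). \tag{3.7} \]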
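For constant mean reversion and volatility the bond price formula becomes fully explicit, and it can be sanity-checked: in the $t$ forward measure the expected value of $P(t, T)$ must equal the forward bond price $P(0, T) / P(0, t)$. The sketch below assumes constant parameters (written `kappa` and `sigma`), a hypothetical flat initial curve, and a Gaussian state with zero mean and variance $\sigma^2 (1 - e^{-2\kappa t}) / (2\kappa)$; the names are illustrative and the quadrature is only a check, not a pricing method from the text.

```python
import math

def b_hat(kappa, t, T):
    """B-hat(t, T) = int_t^T exp(-kappa * (s - t)) ds for constant mean reversion."""
    return (1.0 - math.exp(-kappa * (T - t))) / kappa

def state_variance(kappa, sigma, t):
    """int_0^t sigma^2 exp(-2 kappa (t - s)) ds: variance of the state in the t forward measure."""
    return sigma ** 2 * (1.0 - math.exp(-2.0 * kappa * t)) / (2.0 * kappa)

def bond_price(x_tilde, t, T, kappa, sigma, discount):
    """Zero-coupon bond price P(t, T) as a function of the state x-tilde_t = r_t - f(0, t)."""
    b = b_hat(kappa, t, T)
    convexity = 0.5 * b * b * state_variance(kappa, sigma, t)
    return discount(T) / discount(t) * math.exp(-x_tilde * b - convexity)

def forward_measure_mean(t, T, kappa, sigma, discount, points=2001, width=10.0):
    """E[P(t, T)] over x-tilde_t ~ N(0, v) by an equally spaced quadrature."""
    sd = math.sqrt(state_variance(kappa, sigma, t))
    step = 2.0 * width * sd / (points - 1)
    total = 0.0
    for i in range(points):
        x = -width * sd + i * step
        density = math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += density * bond_price(x, t, T, kappa, sigma, discount)
    return total * step

def discount(T):
    """Hypothetical flat 3% initial curve."""
    return math.exp(-0.03 * T)

kappa, sigma, t, T = 0.1, 0.01, 2.0, 7.0
expected_bond = forward_measure_mean(t, T, kappa, sigma, discount)
forward_bond = discount(T) / discount(t)
```

The convexity term is what makes the check pass: dropping it would leave the expectation too high by a factor of exp(B-hat^2 v / 2).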

3.3.2 Generic Ornstein-Uhlenbeck Models


In this section we look at fitting the generic short-rate model given in equation (3.3)
to the yield curve. Traditionally, this has been done by using forwards induction on
trees (see Jamshidian [66] and Hull and White [67]). We present a conceptually very
similar approach, using PDEs instead of trees.
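Let $V(x, t)$ be the price of a derivative depending on the stochastic interest rates, seen at time $t$ when the parameter governing the short rate is $x$. Since we are working in the risk-neutral measure, $V(x, t) / B_t$ is a martingale and so

\[ d\Big( \frac{V(x, t)}{B_t} \Big) = \frac{1}{B_t} \Big( \frac{\partial V}{\partial t} - r(x, t)\, V - \kappa_t x \frac{\partial V}{\partial x} + \frac12 \sigma_t^2 \frac{\partial^2 V}{\partial x^2} \Big)\, dt + \text{a martingale}. \]

It follows that the price $V(x, t)$ must obey the PDE

\[ \frac{\partial V}{\partial t} - r(x, t)\, V - \kappa_t x \frac{\partial V}{\partial x} + \frac12 \sigma_t^2 \frac{\partial^2 V}{\partial x^2} = 0. \tag{3.8} \]

Now define the $t$ forward measure probability density of $x$ as $\phi(x, t)$. This must obey

\[ V(x_0, 0) = P(0, t)\, \mathbb{E}^{\mathbb{Q}^t}[V(x, t)] = P(0, t) \int V(x, t)\, \phi(x, t)\, dx. \tag{3.9} \]

Note that the left-hand side of this equation does not depend on $t$. This can hold only if $\phi$ obeys certain conditions. Differentiating the above equation with respect to $t$ gives

\[ 0 = P(0, t) \int \Big( \frac{\partial V}{\partial t} \phi - f(0, t)\, V \phi + V \frac{\partial \phi}{\partial t} \Big)\, dx. \]

Substituting for $\partial V / \partial t$ using equation (3.8) and integrating by parts gives

\[ 0 = \int V \Big( \frac{\partial \phi}{\partial t} + \big( r(x, t) - f(0, t) \big) \phi - \kappa_t \frac{\partial (x \phi)}{\partial x} - \frac12 \frac{\partial^2 (\sigma_t^2 \phi)}{\partial x^2} \Big)\, dx + \text{boundary terms}. \]

The boundary terms vanish if we assume that $\phi$ goes to zero sufficiently quickly as $x \to \pm\infty$. Since the above equation must hold for any derivative payoff $V(x, t)$, the term in brackets must be zero.4 We have the following PDE for $\phi$:

\[ \frac{\partial \phi}{\partial t} + \big( r(x, t) - f(0, t) \big) \phi - \kappa_t \frac{\partial (x \phi)}{\partial x} - \frac12 \frac{\partial^2 (\sigma_t^2 \phi)}{\partial x^2} = 0. \tag{3.10} \]

4 This follows by setting $V$ equal to the term in brackets, making it the integral of $(\cdots)^2$, so $(\cdots)$ must be zero.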

Assuming all of the coefficients of the above equation are well behaved, we can always find some solution $\phi$. However, in order for the model to match the yield curve, we must be able to price zero-coupon bonds correctly. Going back to equation (3.9) and letting $V(x, t) = 1$ (and so $V(x_0, 0) = P(0, t)$) we have

\[ 1 = \int \phi(x, t)\, dx. \]

Obviously, if this is not satisfied, then $\phi$ cannot be the $t$ forward measure probability density. Differentiating the above equation with respect to $t$, using equation (3.10) and integrating by parts we get

\[ \int \big( r(x, t) - f(0, t) \big)\, \phi(x, t)\, dx = 0. \tag{3.11} \]

Again, we have used the properties of $\phi$ as $x \to \pm\infty$ to set the boundary terms to zero. Equations (3.10) and (3.11) together let us fit the model to the yield curve. Recall that the parameter for fitting the yield curve is embedded in the expression for $r(x, t)$. Going back to our earlier notation, we wrote

\[ r = r(x, \bar x_t, t), \]

so assuming we know $\phi(x, t)$ up to some time $t$, the problem of fitting the model to the yield curve is simply the problem of finding $\bar x_t$ so that equation (3.11) is satisfied.
As an example, in the Hull-White model we wrote

\[ r(x, t) = x + \bar x_t = \tilde x + f(0, t), \]

and so we have

\[ \int \tilde x\, \phi(x, t)\, dx = 0, \]

which we know to be true since $\tilde x$ has zero expectation in $\mathbb{Q}^t$.
We can bootstrap this calibration along since knowledge of $\phi(x, t)$ allows us to find $\bar x_t$, which in turn allows us to find $\phi(x, t + \delta t)$ using some numerical PDE solver. This simple bootstrapping will give us errors in the propagation from $t$ to $t + \delta t$ of order $(\delta t)^2$ since we only work out $\bar x_t$ using information at the start of the time-step, and so our approximation for the average $\bar x$ in $t$ to $t + \delta t$ has an error of order $\delta t$ and we are using it to propagate the PDE a distance $\delta t$. However, having done this first step, we then have a solution $\phi(x, t + \delta t)$ that is accurate to $O((\delta t)^2)$ and so we can calculate $\bar x_{t + \delta t}$ to order $(\delta t)^2$ and from this get an $O((\delta t)^2)$ solution for the average $\bar x$ in the period. We therefore can find $\phi(x, t + \delta t)$ with errors of $O(\delta t^3)$, so after $T / \delta t$ steps, we have an error in $\phi$ and $\bar x$ of $O(\delta t^2)$.
In figures 3.4 and 3.5, we show the results of fitting a BK model to the EUR yield curve. We used a constant volatility of 10% and a mean reversion of 1%. Note that $\bar x$ has discontinuities corresponding to the discontinuities in the forward rate in figure 3.1.
[Figure 3.4: "Probability density for BK model"; vertical axis: Density (0 to 0.45); horizontal axis: x (-3 to 3).]

FIGURE 3.4 The probability density ($\phi$) in the BK model using the EUR yield curve, a
volatility of 10% and a mean reversion of 1%.

[Figure 3.5: "Drift for BK model with the EUR yield curve"; vertical axis: x-bar (-4 to -3); horizontal axis: Time (years), 0 to 30.]

FIGURE 3.5 The integrated drift ($\bar{x}$) for a BK model using the EUR yield curve, a volatility of
10% and a mean reversion of 1%.

3.4 CALIBRATING THE VOLATILITY

In this section we discuss how to calibrate the parameters that govern the volatility
structure of short-rate models. In this case, that means the volatility of x and the
mean reversion. Mean reversion has the effect of reducing the overall volatility so
it must be calibrated alongside the volatility parameters. We will generally want to
calibrate the volatility parameters to fit some liquid volatility-dependent instruments
such as caps and swaptions.

3.4.1 Hull-White/Vasicek
As before, we will treat the case of the Hull-White/Vasicek model separately as it
allows for several closed-form or near-closed-form solutions. In particular, we have
closed forms for the zero-coupon bond and the distribution of rt (see section 3.3.1).
A cap is a string of caplets, which are options to receive LIBOR. We will assume
we have dates $T_i$, $0 \le i \le n$, describing n caplet periods. The i’th caplet runs from
$T_{i-1}$ to $T_i$, with the LIBOR being fixed (and the exercise decision being made) on
date $T_{i-1}$ and the payment being received on date $T_i$. The i’th LIBOR is

\[ L_i = \frac{1}{\tau_i}\left(\frac{1}{P(T_{i-1}, T_i)} - 1\right), \]

where $\tau_i = T_i - T_{i-1}$ is the day-count fraction for the i’th period. The value of the
i’th caplet seen on its exercise date is therefore

\[ c_i(T_{i-1}) = P(T_{i-1}, T_i)\,\tau_i\,(L_i - K)^+ = \left(1 - (1 + K\tau_i)\,P(T_{i-1}, T_i)\right)^+ \]

We can now use the results of section 3.3.1 to get a closed-form solution for the
price of a caplet. Substituting equation (3.7) into the above equation gives

\[
c_i(T_{i-1}) = \left(1 - \frac{(1 + K\tau_i)\,P(0, T_i)}{P(0, T_{i-1})}
\exp\left(-B(T_{i-1}, T_i)\,x_{T_{i-1}} - \frac{1}{2} B(T_{i-1}, T_i)^2 \int_0^{T_{i-1}} \sigma_s^2 \exp(-2\kappa_{sT_{i-1}})\,ds\right)\right)^+
\]

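To evaluate this, let $Z_i = -B(T_{i-1}, T_i)\,x_{T_{i-1}}$ so that $Z_i$ is normally distributed with
variance $u_i = B(T_{i-1}, T_i)^2 \int_0^{T_{i-1}} \sigma_s^2 \exp(-2\kappa_{sT_{i-1}})\,ds$. Since the exponential is monotonic
in $Z_i$, the caplet will be exercised if

\[ Z_i < A = -\ln\left(\frac{(1 + K\tau_i)\,P(0, T_i)}{P(0, T_{i-1})}\right) + \frac{1}{2} u_i, \]

giving a price of

\[ c_i(0) = P(0, T_{i-1})\,N(d_1^i) - (1 + K\tau_i)\,P(0, T_i)\,N(d_2^i), \]

where

\[
d_1^i = -\frac{1}{\sqrt{u_i}} \ln\left(\frac{(1 + K\tau_i)\,P(0, T_i)}{P(0, T_{i-1})}\right) + \frac{\sqrt{u_i}}{2},
\qquad
d_2^i = d_1^i - \sqrt{u_i}
\]

The price of the cap is therefore

\[ \text{Cap} = \sum_{i=1}^{n} \left[ P(0, T_{i-1})\,N(d_1^i) - (1 + K\tau_i)\,P(0, T_i)\,N(d_2^i) \right] \]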
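
The caplet and cap formulas above can be sketched in a few lines for the special case of constant volatility and mean reversion. The function names, the flat discount curve and the parameter values are hypothetical, and the sketch assumes that B(t, T) = (1 − exp(−κ(T − t)))/κ for constant κ; that closed form is an assumption about the definition in equation (3.5), which is not reproduced in this section.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def hw_caplet(p_start, p_end, strike, tau, sigma, kappa, t_start):
    """Caplet over [t_start, t_start + tau] with constant sigma and kappa.

    p_start = P(0, T_{i-1}), p_end = P(0, T_i). Requires t_start > 0 so that
    the variance u_i is strictly positive.
    """
    b = (1.0 - math.exp(-kappa * tau)) / kappa              # B(T_{i-1}, T_i), assumed form
    var_x = sigma ** 2 * (1.0 - math.exp(-2.0 * kappa * t_start)) / (2.0 * kappa)
    root_u = math.sqrt(b * b * var_x)                        # sqrt(u_i)
    a = (1.0 + strike * tau) * p_end / p_start
    d1 = -math.log(a) / root_u + 0.5 * root_u
    d2 = d1 - root_u
    return p_start * norm_cdf(d1) - (1.0 + strike * tau) * p_end * norm_cdf(d2)


def hw_cap(dates, discount, strike, sigma, kappa):
    """Sum of caplets over consecutive periods of `dates` (dates[0] > 0)."""
    return sum(
        hw_caplet(discount(dates[i - 1]), discount(dates[i]), strike,
                  dates[i] - dates[i - 1], sigma, kappa, dates[i - 1])
        for i in range(1, len(dates)))


# Hypothetical example: flat 3% continuously compounded curve, semiannual caplets.
flat = lambda t: math.exp(-0.03 * t)
cap_price = hw_cap([0.5, 1.0, 1.5, 2.0], flat, strike=0.03, sigma=0.01, kappa=0.05)
```

Because each caplet only needs the variance of x up to its own fixing date, the same routine can be called inside a bootstrap over caplet maturities when the volatility is piecewise constant; the constant-parameter version here is only the simplest case.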

For pricing swaptions, we can either use a closed-form approximation, or a
near-closed-form exact solution. We’ll deal with the exact solution first.
A swaption is an option on a swap. We will assume we have an option to pay
coupons of K and receive LIBOR. If we also assume the LIBOR fixing, accrual, and
payment dates are all aligned, then the i’th LIBOR payment is worth

\[
P(t, T_{i-1})\,\mathbb{E}^{T_{i-1}}\left[L_i \tau_i P(T_{i-1}, T_i)\right]
= P(t, T_{i-1})\,\mathbb{E}^{T_{i-1}}\left[1 - P(T_{i-1}, T_i)\right]
= P(t, T_{i-1}) - P(t, T_i),
\]

and so the floating side of the swap is worth $P(t, T_0) - P(t, T_n)$. The whole swap is
worth

\[ \text{Swap}(t) = P(t, T_0) - P(t, T_n) - K \sum_{i=1}^{n} \tau_i P(t, T_i) \]

On the exercise date, which we will assume is $T_0$, the swaption is worth

\[ \text{Swaption}(T_0) = \left(1 - P(T_0, T_n) - K \sum_{i=1}^{n} \tau_i P(T_0, T_i)\right)^+ \]

Substituting for $P(T_0, T_i)$ using equation (3.7) gives

\[ \text{Swaption}(T_0) = \left(1 - \sum_{i=1}^{n} a_i \exp\left(-\hat{B}(T_0, T_i)\,x_{T_0}\right)\right)^+, \tag{3.12} \]

where

\[
a_i = \frac{(\delta_{in} + K\tau_i)\,P(0, T_i)}{P(0, T_0)}
\exp\left(-\frac{1}{2}\hat{B}(T_0, T_i)^2 \int_0^{T_0} \sigma_s^2 \exp(-2\kappa_{sT_0})\,ds\right)
\]

and $\delta_{in}$ is the Kronecker delta. The swaption is therefore the sum of a series of
options on zero-coupon bonds, the strikes being determined by the solution of the
equation

\[ \sum_{i=1}^{n} a_i \exp\left(-\hat{B}(T_0, T_i)\,x^*\right) = 1 \]

As the left-hand side of this equation is clearly monotonically decreasing in $x^*$, we
can solve this very efficiently using a Newton-Raphson method. Once we have found
$x^*$, we can price the swaption as

\[ F(x^*, T_0, T_0) - F(x^*, T_0, T_n) - K \sum_{i=1}^{n} \tau_i F(x^*, T_0, T_i), \]

where $F(x^*, t, T)$ is the price of an option to receive a zero coupon bond P(t, T) at t
if $x_t > x^*$, that is,

\[
F(x^*, t, T) = \frac{P(0, T)}{\sqrt{2\pi V_t}} \int_{x^*}^{\infty}
\exp\left(-\hat{B}(t, T)\,x - \frac{1}{2}\hat{B}(t, T)^2 V_t\right) \exp\left(-\frac{x^2}{2V_t}\right) dx
= P(0, T)\,N\left(-\frac{x^*}{\sqrt{V_t}} - \hat{B}(t, T)\sqrt{V_t}\right),
\]

where $V_t$ is the variance of the short-rate distribution at time t,

\[ V_t = \int_0^t \sigma_s^2 \exp(-2\kappa_{st})\,ds \]

Alternatively, we can find a closed-form approximation for the swaption price as
was first done by Jamshidian [68]. We approximate the sum in equation (3.12) by a
single exponential as

\[ \sum_{i=1}^{n} a_i \exp\left(-\hat{B}(t, T_i)\,x\right) \approx G \exp\left(-Hx - \frac{1}{2}H^2 V_t\right) \tag{3.13} \]

If we want to match the expectation of this under the t-forward measure (and thus
the price of the swap), we have

\[ G = \frac{P(0, T_n) + K \sum_{i=1}^{n} \tau_i P(0, T_i)}{P(0, T_0)} \]

We can choose H to match the expectation of the slope of the function. Differentiating
both sides of equation (3.13) with respect to x and taking the expectation
gives

\[
H = \frac{P(0, T_n)\,\hat{B}(T_0, T_n) + K \sum_{i=1}^{n} \tau_i P(0, T_i)\,\hat{B}(T_0, T_i)}
{P(0, T_n) + K \sum_{i=1}^{n} \tau_i P(0, T_i)}
\]

In figure 3.6, we show an example for a 1y20y swaption with 0 1, 0 1, and


zero initial interest rates.
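Given this approximation, we can express the swaption price as

\[
P(0, T_0)\left[ N\left(\frac{-\ln G}{H\sqrt{V_{T_0}}} + \frac{H\sqrt{V_{T_0}}}{2}\right)
- G\,N\left(\frac{-\ln G}{H\sqrt{V_{T_0}}} - \frac{H\sqrt{V_{T_0}}}{2}\right) \right]
\]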
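
The exact decomposition and the single-exponential approximation can be compared numerically. The sketch below is a minimal illustration for constant volatility and mean reversion on a hypothetical flat curve; the function names are invented for the example, and the closed form used for the B-hat function, (1 − exp(−κ(T − t)))/κ, is an assumption about equation (3.5).

```python
import math


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def b_hat(kappa, t, T):
    # Assumed constant-kappa form of B-hat(t, T).
    return (1.0 - math.exp(-kappa * (T - t))) / kappa


def hw_payer_swaption(dates, discount, strike, sigma, kappa):
    """dates = [T_0, ..., T_n] with T_0 > 0; returns (exact, approximate) prices."""
    t0, n = dates[0], len(dates) - 1
    v = sigma ** 2 * (1.0 - math.exp(-2.0 * kappa * t0)) / (2.0 * kappa)   # V_{T_0}
    root_v = math.sqrt(v)
    p0 = discount(t0)
    bs, ws = [], []     # B-hat(T_0, T_i) and (delta_in + K tau_i) P(0, T_i)
    for i in range(1, n + 1):
        tau = dates[i] - dates[i - 1]
        bs.append(b_hat(kappa, t0, dates[i]))
        ws.append(((1.0 if i == n else 0.0) + strike * tau) * discount(dates[i]))
    a = [w / p0 * math.exp(-0.5 * b * b * v) for w, b in zip(ws, bs)]

    # Newton-Raphson for x*: sum_i a_i exp(-B_i x*) = 1 (left side is decreasing in x).
    x = 0.0
    for _ in range(50):
        g = sum(ai * math.exp(-b * x) for ai, b in zip(a, bs)) - 1.0
        dg = -sum(ai * b * math.exp(-b * x) for ai, b in zip(a, bs))
        step = g / dg
        x -= step
        if abs(step) < 1e-14:
            break

    # Exact price: F(x*,T_0,T_0) - F(x*,T_0,T_n) - K sum_i tau_i F(x*,T_0,T_i).
    exact = p0 * norm_cdf(-x / root_v) - sum(
        w * norm_cdf(-x / root_v - b * root_v) for w, b in zip(ws, bs))

    # Single-exponential approximation, equation (3.13).
    G = sum(ws) / p0
    H = sum(w * b for w, b in zip(ws, bs)) / sum(ws)
    s = H * root_v
    approx = p0 * (norm_cdf(-math.log(G) / s + 0.5 * s)
                   - G * norm_cdf(-math.log(G) / s - 0.5 * s))
    return exact, approx


# Hypothetical 1y5y payer swaption on a flat 3% curve with annual coupons.
flat = lambda t: math.exp(-0.03 * t)
exact, approx = hw_payer_swaption([1, 2, 3, 4, 5, 6], flat, 0.03, 0.01, 0.05)
```

Since the sum of exponentials is convex and decreasing in x, the Newton iteration converges monotonically after at most one overshoot, which is why a fixed, small iteration cap is adequate in this sketch.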
[Figure 3.6: Swap (vertical axis, 50 to 300) against x' (horizontal axis, −0.1 to 0.1); series: Exact, Approximation.]

FIGURE 3.6 Log-normal swap approximation for a 1y20y swap.

Note that the swaption price depends on $\sigma$ only up to the exercise date of
the swaption. This means that if we fix the mean reversion, we can bootstrap the
volatility term-structure. Alternatively, since we can find analytic expressions for the
derivatives of the swaption prices with respect to the volatility and mean reversions
we could use some Newton-Raphson-based minimization strategy.

3.4.2 Generic Ornstein-Uhlenbeck Models


In this section we discuss calibrating the volatility structure for models that do not
have closed-form solutions for swaptions/caps. The important thing is to be able to
price caps and swaptions as efficiently as possible. We can use PDE methods to get
accurate prices given a set of parameters (kappas, sigmas, and other parameters that
the model might have) and then embed the pricing in some minimization algorithm.
Traditionally, we would price a swaption with finite differences by propagating
the values of the payments in the swap back to the exercise date, calculating the
value of the swaption there, then propagating that price back to the evaluation date.
To price an nymy swaption (i.e., a swaption that is exercised after n years into an m
year swap) we would have to propagate for n + m years on the PDE grid. However,
for each new set of parameters, we must recalibrate $\bar{x}_t$, and in doing so we calculate
$\phi(x, t)$, the probability density of x in the t-forward measure. We therefore do not
have to propagate all the way back to the evaluation date, but can propagate the
swap price to the exercise date, then use $\phi$ to calculate the swaption price with

\[ \text{Swaption} = P(0, T_0) \int \phi(x, T_0)\,\text{Swap}(x, T_0)^+\,dx \]

This reduces the computational cost of pricing the swaption to just propagating m
years on the grid.
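
The final integration step can be sketched as a simple quadrature over the PDE grid. The function name and all inputs below are hypothetical: in practice the density and the swap values would come from the calibration and the backward PDE solve, whereas here a Gaussian density and a linear swap value are used purely so that the answer can be checked analytically.

```python
import math


def swaption_from_density(xs, phi, swap_values, p0):
    """P(0, T_0) * integral of phi(x, T_0) * max(Swap(x, T_0), 0) dx (trapezoid rule).

    xs          : increasing grid of x values at the exercise date
    phi         : T_0-forward density on the grid
    swap_values : swap value on the grid at the exercise date
    p0          : discount factor P(0, T_0)
    """
    integrand = [p * max(s, 0.0) for p, s in zip(phi, swap_values)]
    total = sum(0.5 * (integrand[i] + integrand[i + 1]) * (xs[i + 1] - xs[i])
                for i in range(len(xs) - 1))
    return p0 * total


# Hypothetical inputs: Gaussian density (std 0.01) and a swap value linear in x.
std = 0.01
xs = [-0.08 + 0.16 * i / 4000 for i in range(4001)]
phi = [math.exp(-0.5 * (x / std) ** 2) / (std * math.sqrt(2.0 * math.pi)) for x in xs]
swap = [4.0 * x for x in xs]
price = swaption_from_density(xs, phi, swap, p0=0.97)
```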
In the general problem, the price of the nymy swaption depends on the volatility
up to the end of the swap (through the drift term $\bar{x}_t$), so it is not possible to
bootstrap the volatility. Instead, we have to calibrate the entire volatility term
structure simultaneously with some appropriate nonlinear minimization algorithm
(see section 3.6).

3.5 PRICING HYBRIDS

In this section we assume we have a stochastic stock process as well as stochastic
interest rates. We will model the stock price as

\[ \frac{dS_t}{S_t} = (r_t - \mu_t)\,dt + \sigma^S(S, t)\,dW_t^S, \tag{3.14} \]

where $\mu_t$ incorporates the dividends (assumed to be proportional to the stock price)
and the repo rate. We assume the interest rates follow an Ornstein-Uhlenbeck model
given by equation (3.3). To distinguish between the stock price process and the
interest rate process we will write

\[ dx_t = -\kappa_t x_t\,dt + \sigma_t^r\,dW_t^r \]

We assume we have some correlation structure

\[ \langle dW_t^r, dW_t^S \rangle = \rho\,dt \]

Recall that we defined

\[ r_t = r_t(x_t, \bar{x}_t, t) \]

The volatility of the stock may depend on St (i.e., local volatility) or be just a
function of time. Calibrating the volatility is the subject of chapter 8, but for now we
just mention that the equity process volatility is affected by the interest rate volatility
assuming we are calibrating to a market of European options.
We can remove the dividends and repo from the problem by changing variables
as follows. Let

\[ S_t = S_0\,\frac{\exp\left(-\int_0^t \mu_s\,ds\right)}{P(0, t)}\,\exp(y_t) \]

We will work in terms of $y_t$ instead of $S_t$ as it is continuous. The analogous SDE to
equation (3.14) is

\[ dy_t = \left(r_t - f(0, t) - \frac{1}{2}(\sigma_t^S)^2\right) dt + \sigma_t^S\,dW_t^S \tag{3.15} \]

3.5.1 Finite Differences


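To find the PDE followed by the prices of the hybrid products we assume we have
some derivative whose price depends only on the short-rate driving variable ($x_t$) and
the stock price (or equivalently $y_t$): V(x, y, t). The value of the derivative discounted
by the money market account $B_t$ must be a martingale, so we have

\[
d\left(\frac{V}{B}\right) = \frac{1}{B_t}\left[
\frac{\partial V}{\partial t} - r_t V - \kappa_t x \frac{\partial V}{\partial x}
+ \left(r_t - f(0, t) - \frac{1}{2}(\sigma_t^S)^2\right)\frac{\partial V}{\partial y}
+ \frac{(\sigma_t^r)^2}{2}\frac{\partial^2 V}{\partial x^2}
+ \rho\,\sigma_t^S \sigma_t^r \frac{\partial^2 V}{\partial x\,\partial y}
+ \frac{(\sigma_t^S)^2}{2}\frac{\partial^2 V}{\partial y^2}
\right] dt + \text{a martingale part}
\]

This gives the following PDE for V:

\[
\frac{\partial V}{\partial t} - r_t V - \kappa_t x \frac{\partial V}{\partial x}
+ \left(r_t - f(0, t) - \frac{1}{2}(\sigma_t^S)^2\right)\frac{\partial V}{\partial y}
+ \frac{(\sigma_t^r)^2}{2}\frac{\partial^2 V}{\partial x^2}
+ \rho\,\sigma_t^S \sigma_t^r \frac{\partial^2 V}{\partial x\,\partial y}
+ \frac{(\sigma_t^S)^2}{2}\frac{\partial^2 V}{\partial y^2} = 0
\]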

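To improve the accuracy slightly, we can work with the deterministically discounted
value of the derivative by defining $U(x, y, t) = V(x, y, t)\,P(0, t)$. This gives the PDE

\[
\frac{\partial U}{\partial t} - (r_t - f(0, t))\,U - \kappa_t x \frac{\partial U}{\partial x}
+ \left(r_t - f(0, t) - \frac{1}{2}(\sigma_t^S)^2\right)\frac{\partial U}{\partial y}
+ \frac{(\sigma_t^r)^2}{2}\frac{\partial^2 U}{\partial x^2}
+ \rho\,\sigma_t^S \sigma_t^r \frac{\partial^2 U}{\partial x\,\partial y}
+ \frac{(\sigma_t^S)^2}{2}\frac{\partial^2 U}{\partial y^2} = 0
\]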

Unless we represent the yield curve by at least a cubic spline, the forward curve
f (0, t) will be discontinuous and so will the short rate $r_t$. The difference between
the two will generally have smaller discontinuities than the short rate itself and is
continuous in the Hull-White/Vasicek model and in the limit of zero interest rate
volatility. For this reason, U is generally better to work with than V. Note that the
PDE for the Vasicek case in terms of $r_t$ involves the drift term $\theta_t$. By writing the
PDE in terms of x instead, we have removed the need to calculate this term (which
depends on the second derivatives of the zero-coupon bonds). By using U instead
of V, we have also removed the need to calculate the forward rate f (0, t) (which
depends on the first derivative of the zero-coupon bonds) since $r_t - f(0, t)$ can be
expressed in terms of x using equation (3.6) as

\[ r_t - f(0, t) = x_t + \int_0^t \sigma_s^2\,\hat{B}(s, t) \exp(-\kappa_{st})\,ds \]

We therefore do not need twice-differentiable yield curves, or even
once-differentiable ones; we can get away with discontinuous forward rates.
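
As a worked special case, suppose $\sigma$ and $\kappa$ are constant (an illustrative assumption), so that $\kappa_{st} = \kappa (t - s)$, and suppose $\hat{B}(s, t) = (1 - e^{-\kappa (t - s)})/\kappa$, which is the constant-parameter form assumed here for the definition in equation (3.5). Substituting $u = t - s$, the integral can be evaluated in closed form:

\[
r_t - f(0, t) = x_t + \frac{\sigma^2}{\kappa} \int_0^t \left(e^{-\kappa u} - e^{-2\kappa u}\right) du
= x_t + \frac{\sigma^2}{2\kappa^2}\left(1 - e^{-\kappa t}\right)^2
\]

The correction term depends only on the model parameters, which illustrates why no derivative of the yield curve is needed to evaluate $r_t - f(0, t)$.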

3.5.2 Monte Carlo


An alternative method for pricing derivatives is to use Monte Carlo simulation. For
that we need to be able to simulate paths of the SDEs followed by x and y. We
will treat two cases here: the full problem with local volatility and non-Gaussian
interest rates, and the special case of the Hull-White/Vasicek model with a term
structure of equity volatility; in this case we have a closed form for the Green’s
function and can therefore take large steps in the simulation.
Vasicek Term Structure of Log-Normal Equity Volatilities. We need to simulate
paths of $x_t$ and $y_t$ given by equations (3.2) and (3.15), and the money market
account, which follows the process

\[ dB_t = r_t B_t\,dt \tag{3.16} \]

We will consider the changes from t to T. The solution of equation (3.2) is

\[ x_T = x_t \exp(-\kappa_{tT}) + \int_t^T \sigma_s^r \exp(-\kappa_{sT})\,dW_s^r \tag{3.17} \]

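We can rewrite equation (3.15) as

\[
dy_t = \left(x_t + \frac{1}{2}\frac{d}{dt}\int_0^t (\sigma_s^r)^2\,\hat{B}(s, t)^2\,ds - \frac{1}{2}(\sigma_t^S)^2\right) dt + \sigma_t^S\,dW_t^S,
\]

where $\hat{B}$ is defined in equation (3.5). Using equation (3.17) to substitute for $x_T$ we get

\[
y_T = y_t + x_t \hat{B}(t, T)
+ \frac{1}{2}\left(\int_0^T (\sigma_s^r)^2\,\hat{B}(s, T)^2\,ds - \int_0^t (\sigma_s^r)^2\,\hat{B}(s, t)^2\,ds\right)
- \frac{1}{2}\int_t^T (\sigma_s^S)^2\,ds
+ \int_t^T \sigma_u^r\,\hat{B}(u, T)\,dW_u^r + \int_t^T \sigma_s^S\,dW_s^S
\]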

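To simulate the money market, we rewrite equation (3.16) as

\[
d\ln(B_t P(0, t)) = \left(x_t + \frac{1}{2}\frac{d}{dt}\int_0^t (\sigma_s^r)^2\,\hat{B}(s, t)^2\,ds\right) dt
\]

Once again, we use equation (3.17) to substitute for $x_t$, giving

\[
\ln B_T = \ln B_t + \ln P(0, t) - \ln P(0, T) + x_t \hat{B}(t, T)
+ \frac{1}{2}\left(\int_0^T (\sigma_s^r)^2\,\hat{B}(s, T)^2\,ds - \int_0^t (\sigma_s^r)^2\,\hat{B}(s, t)^2\,ds\right)
+ \int_t^T \sigma_u^r\,\hat{B}(u, T)\,dW_u^r
\]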

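In order to simulate the steps from t to T, we must sample from the integrals

\[
I_1 = \int_t^T \sigma_s^r \exp(-\kappa_{sT})\,dW_s^r, \qquad
I_2 = \int_t^T \sigma_s^r\,\hat{B}(s, T)\,dW_s^r, \qquad
I_3 = \int_t^T \sigma_s^S\,dW_s^S
\]

We can sample from these if we know the covariance matrix $C_{ij}$, where

\[
\begin{aligned}
C_{11} &= \int_t^T (\sigma_s^r)^2 \exp(-2\kappa_{sT})\,ds \\
C_{22} &= \int_t^T (\sigma_s^r)^2\,\hat{B}(s, T)^2\,ds \\
C_{33} &= \int_t^T (\sigma_s^S)^2\,ds \\
C_{12} &= \int_t^T (\sigma_s^r)^2\,\hat{B}(s, T) \exp(-\kappa_{sT})\,ds \\
C_{13} &= \int_t^T \rho\,\sigma_s^r \sigma_s^S \exp(-\kappa_{sT})\,ds \\
C_{23} &= \int_t^T \rho\,\sigma_s^r \sigma_s^S\,\hat{B}(s, T)\,ds
\end{aligned}
\]

Let $D_{ij}$ be the Cholesky decomposition of C, so

\[ C = D D^T \]

If we sample three independent normal variables, $Z_1$, $Z_2$, $Z_3$, we can write

\[ I_i = \sum_j D_{ij} Z_j \]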

We can therefore simulate paths of the short rate, stock price, and money market
account.
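
The sampling step can be sketched as follows for the special case of constant parameters, where the six covariance integrals have closed forms. The function names and parameter values are hypothetical; the constant-kappa form of the B-hat function, (1 − exp(−κ(T − s)))/κ, is an assumption, and a hand-written Cholesky routine is used only to keep the example dependency-free.

```python
import math
import random


def step_covariance(sig_r, sig_s, kappa, rho, dt):
    """Covariance matrix of (I1, I2, I3) over a step of length dt, constant parameters."""
    e = math.exp(-kappa * dt)
    i_e = (1.0 - e) / kappa                   # integral of exp(-kappa u) on [0, dt]
    i_e2 = (1.0 - e * e) / (2.0 * kappa)      # integral of exp(-2 kappa u) on [0, dt]
    c11 = sig_r ** 2 * i_e2
    c22 = sig_r ** 2 * (dt - 2.0 * i_e + i_e2) / kappa ** 2
    c33 = sig_s ** 2 * dt
    c12 = sig_r ** 2 * (i_e - i_e2) / kappa
    c13 = rho * sig_r * sig_s * i_e
    c23 = rho * sig_r * sig_s * (dt - i_e) / kappa
    return [[c11, c12, c13], [c12, c22, c23], [c13, c23, c33]]


def cholesky(c):
    """Lower-triangular D with C = D D^T (C must be positive definite)."""
    n = len(c)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = c[i][j] - sum(d[i][k] * d[j][k] for k in range(j))
            d[i][j] = math.sqrt(s) if i == j else s / d[j][j]
    return d


def sample_integrals(d, rng):
    """Draw (I1, I2, I3) as I_i = sum_j D_ij Z_j with independent standard normals Z_j."""
    z = [rng.gauss(0.0, 1.0) for _ in d]
    return [sum(d[i][j] * z[j] for j in range(len(d))) for i in range(len(d))]


# Hypothetical parameters: 1% rate vol, 20% equity vol, 5% mean reversion, one-year step.
C = step_covariance(0.01, 0.2, 0.05, 0.3, 1.0)
D = cholesky(C)
i1, i2, i3 = sample_integrals(D, random.Random(0))
```

Given a draw, the state is advanced using equation (3.17) for x together with the expressions above for $y_T$ and $\ln B_T$; because the covariances are exact, the step length can be as long as the gap between observation dates.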
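Generic Ornstein-Uhlenbeck Models. For the more general model given by equations
(3.2) and (3.3), we can still simulate $x_t$ exactly, but not the money market account
or the stock price. We will therefore have to take small steps in the Monte Carlo
simulation where we can find the distribution approximately. Letting $T = t + \Delta t$, we
have the change in the money market account as

\[ \ln B_T = \ln B_t + \int_t^T r(x_s, s)\,ds \approx \ln B_t + r(x_t, t)\,\Delta t, \]

The change in the stock variable is given by

\[
\begin{aligned}
y_T &= y_t + \int_t^T \left(r_s - f(0, s) - \frac{1}{2}(\sigma_s^S)^2\right) ds + \int_t^T \sigma^S(s, y_s)\,dW_s^S \\
&\approx y_t + \left(r(x_t, t) - f(0, t) - \frac{1}{2}\sigma^S(t, y_t)^2\right)\Delta t
+ \sigma^S(t, y_t)\int_t^T dW_s^S
+ \sigma^S \frac{\partial \sigma^S}{\partial y}(t, y_t) \int_t^T\!\!\int_t^s dW_u^S\,dW_s^S \\
&= y_t + \left(r(x_t, t) - f(0, t) - \frac{1}{2}\sigma^S(t, y_t)^2\right)\Delta t
+ \sigma^S(t, y_t)\int_t^T dW_s^S
+ \frac{1}{2}\sigma^S \frac{\partial \sigma^S}{\partial y}(t, y_t) \left[\left(\int_t^T dW_u^S\right)^2 - \Delta t\right]
\end{aligned}
\]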

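We therefore need to sample from the following stochastic integrals:

\[
I_1 = \int_t^T \sigma_s^r \exp(-\kappa_{sT})\,dW_s^r, \qquad
I_4 = \int_t^T dW_s^S,
\]

with covariances

\[
C_{11} = \int_t^T (\sigma_s^r)^2 \exp(-2\kappa_{sT})\,ds, \qquad
C_{44} = \Delta t, \qquad
C_{14} = \int_t^T \rho\,\sigma_s^r \exp(-\kappa_{sT})\,ds
\]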

Overall, this simulation has strong order 1. The interest rate process is simulated
exactly and the Milstein scheme we use for the equity process has strong order 1, as
does the simulation of the money market account. See Kloeden and Platen [69] for
details of Milstein schemes.
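
A single Milstein step for the stock variable y, as described above, can be sketched as below. The function name and arguments are hypothetical: the caller is assumed to supply the current value of r − f(0, t), the local volatility function expressed in terms of y, its derivative with respect to y, and a Brownian increment that has been sampled with the appropriate correlation to the rate driver.

```python
def milstein_step_y(y, t, dt, dw, r_minus_f, sigma, dsigma_dy):
    """One Milstein step for y over [t, t + dt].

    dw         : Brownian increment of W^S over the step
    r_minus_f  : r(x_t, t) - f(0, t), frozen at the start of the step
    sigma      : callable sigma(t, y), local volatility in terms of y
    dsigma_dy  : callable returning d sigma / d y at (t, y)
    """
    vol = sigma(t, y)
    drift = r_minus_f - 0.5 * vol * vol
    milstein = 0.5 * vol * dsigma_dy(t, y) * (dw * dw - dt)   # second-order correction
    return y + drift * dt + vol * dw + milstein


# Hypothetical local volatility that is linear in y.
y_next = milstein_step_y(
    y=0.0, t=0.0, dt=0.01, dw=0.2, r_minus_f=0.01,
    sigma=lambda t, y: 0.2 + 0.1 * y,
    dsigma_dy=lambda t, y: 0.1)
```

When the volatility does not depend on y the correction term vanishes and the step reduces to the Euler scheme.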

3.6 APPENDIX: LEAST-SQUARES MINIMIZATION


When calibrating the parameters of an interest rate model to swaption/cap data,
the problem generally reduces to trying to fit m prices by adjusting n parameters.
If we have m > n, we cannot necessarily fit all of the prices simultaneously, so we
must try to minimize the error in the price in some norm. A commonly chosen
norm is the L2 norm, where we find the least-square error. If we let the parameters
be $x = (x_1, x_2, \ldots, x_n)$ and the differences between the market prices and the model
prices be $y(x) = (y_1, y_2, \ldots, y_m)(x)$, then the problem is to find the vector x that
minimizes

\[ \sum_j y_j(x)^2 \]

While we could use some general algorithm for minimizing a single function of
many variables, by reducing the vector y to a single number we throw away useful
information about the individual components of y. Many techniques exist for this
style of minimization, but here we will just describe two, Newton-Raphson and
Broyden’s methods, as these are particularly easy to implement. More details can be
found in Press et al. [70].

3.6.1 Newton-Raphson Method


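When the Jacobian $J_{ij} = \partial y_i / \partial x_j$ can be calculated simply, such as when calibrating the
Hull-White model to swaptions/caps, we can use the Newton-Raphson method to
minimize the L2 norm. If we have a good trial solution $x^k$, with residual errors
$y^k = y(x^k)$, we can get a better solution by linearizing the problem about this point,
giving

\[ y(x^k + \Delta x^k) \approx \hat{y}(x^k + \Delta x^k) = y^k + \sum_i \frac{\partial y}{\partial x_i}\,\Delta x_i^k \]

We want to minimize

\[
\sum_j \hat{y}_j^2 = \sum_j (y_j^k)^2 + 2\sum_{ij} y_j^k \frac{\partial y_j}{\partial x_i}\,\Delta x_i^k
+ \sum_{ijl} \frac{\partial y_j}{\partial x_i}\frac{\partial y_j}{\partial x_l}\,\Delta x_i^k\,\Delta x_l^k
= A + 2 v^T \Delta x^k + (\Delta x^k)^T M\,\Delta x^k,
\]

where $v_i = \sum_j y_j^k\,\partial y_j / \partial x_i$ and $M_{il} = \sum_j (\partial y_j / \partial x_i)(\partial y_j / \partial x_l)$.

Differentiating with respect to $\Delta x$ and setting the result to zero, we have

\[ \Delta x = -M^{-1} v \]

The new trial solution is

\[ x^{k+1} = x^k - M^{-1} v \]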

If we have a linear problem, this technique will solve it in one iteration; for nonlinear
problems, the number of iterations will depend on how far away from the linear
regime our starting solution is. To handle constraints on the parameters, we can use
the sequential quadratic programming method (see [71]) or re-express the original
problem in terms of unconstrained parameters. For instance, if we have one original
parameter, x, which we know must be strictly positive, we can re-express the
problem in terms of $\tilde{x} = \log(x)$ instead. The new parameter, $\tilde{x}$, is free to assume
any real value, and ensures that $x = \exp(\tilde{x}) > 0$.
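
One iteration of the scheme above can be sketched as follows. The function names are hypothetical, the residual and Jacobian callables are assumed to be supplied by the pricing code, and a small Gaussian-elimination solver is included only to keep the example self-contained; it is not meant as a recommendation over a library linear solver.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]      # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def newton_step(x, residuals, jacobian):
    """Return x - M^{-1} v with v = J^T y and M = J^T J, J[j][i] = d y_j / d x_i."""
    y = residuals(x)
    J = jacobian(x)
    n, m = len(x), len(y)
    v = [sum(J[j][i] * y[j] for j in range(m)) for i in range(n)]
    M = [[sum(J[j][i] * J[j][l] for j in range(m)) for l in range(n)] for i in range(n)]
    dx = solve(M, [-vi for vi in v])
    return [xi + di for xi, di in zip(x, dx)]


# Hypothetical linear problem with three residuals and two parameters.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 4.0]
res = lambda x: [sum(A[j][i] * x[i] for i in range(2)) - b[j] for j in range(3)]
jac = lambda x: A
x1 = newton_step([0.0, 0.0], res, jac)
```

Because the example problem is linear, a single step lands on the least-squares solution; for a nonlinear calibration the step would be repeated until the change in the parameters is negligible.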

3.6.2 Broyden’s Method


To handle the case where we do not know the derivatives, we can use Broyden’s
method to estimate them. Here at each step we only have an approximate estimate
of the Jacobian

\[ J_{ij} = \frac{\partial y_i}{\partial x_j} \approx B_{ij} \]

At each step, k, of our iterative procedure, we use the approximate Jacobian to
calculate the matrix M and vector v and get the new trial solution $x^{k+1}$, then
update the estimated Jacobian to be consistent with the previous step. Given that the
k’th step is $\Delta x^k = x^{k+1} - x^k$ and y changes by $\Delta y^k = y^{k+1} - y^k$, with an estimated
Jacobian of $B^k$, we can compute an updated Jacobian, $B^{k+1}$, that satisfies

\[ \Delta y^k = B^{k+1}\,\Delta x^k \]

There is no unique solution to this, but a good thing to use in practice is Broyden’s
method, where we let

\[ B^{k+1} = B^k + \frac{(\Delta y^k - B^k \Delta x^k) \otimes \Delta x^k}{\Delta x^k \cdot \Delta x^k}, \]

since

\[ \frac{(\Delta y^k - B^k \Delta x^k) \otimes \Delta x^k}{\Delta x^k \cdot \Delta x^k}\,\Delta x^k = \Delta y^k - B^k \Delta x^k \]

For more details, see Press et al. [70].
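
The rank-one update can be written directly from the formula above. This is a minimal sketch with a hypothetical function name, using plain lists of lists for the matrix.

```python
def broyden_update(B, dx, dy):
    """Return B + (dy - B dx) (outer) dx / (dx . dx) for a list-of-lists matrix B."""
    Bdx = [sum(bij * dxj for bij, dxj in zip(row, dx)) for row in B]
    denom = sum(d * d for d in dx)
    return [[B[i][j] + (dy[i] - Bdx[i]) * dx[j] / denom for j in range(len(dx))]
            for i in range(len(B))]


# Hypothetical step: three residuals, two parameters, starting from a crude Jacobian guess.
B0 = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
dx = [0.1, -0.2]
dy = [0.05, -0.1, 0.3]
B1 = broyden_update(B0, dx, dy)
```

The updated matrix reproduces the observed change exactly along the direction of the last step and leaves the estimate unchanged in directions orthogonal to it.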



CHAPTER 4
Hybrid Products

In this chapter we discuss when it is necessary to use stochastic rates to price a
derivative and what effects they have on the prices, giving the conditional trigger
swap, TARN, convertible bond and exchangeable bond as examples.
All options depend on interest rates through the discounting of future payments.
If we treat interest rates as stochastic, then the money market account (often used
as our numeraire) becomes stochastic. So apart from the explicit hybrid products,
where we receive payments based on both interest rate market observables (LIBOR
and CMS rates) and equities, we may also need to consider stochastic interest rates
when pricing options where the value of some equity/index affects the time when we
receive some payments. A good example of this is the target redemption note (see
section 4.3).

4.1 THE EFFECTS OF ASSUMING STOCHASTIC RATES

Whether or not we choose to price a particular option with stochastic rates will
depend on what risks we think are significant and against which we need to hedge
ourselves. Interest rates tend to be less volatile than equities, with typical short-rate
volatilities being around a few percent, whereas equity volatilities may be of the
order 10% to 100%. Often, the effect of stochastic rates will be swamped by the
effects of the more volatile equities, and it will not be necessary to use a more
CPU-intensive two-factor model.

Stochastic LIBOR and CMS rates. The most obvious effect of stochastic interest
rates is to make quantities such as LIBOR and CMS¹ rates stochastic. If we have
an option with a payoff dependent on a combination of these and an equity
performance, there is a good chance we will need to model the interest rates as
stochastic. However, as mentioned, the interest rate volatilities may be so low as to
make this unnecessary. Examples of derivatives that depend on stochastic interest
rates in this way are conditional trigger swaps (see section 4.2) and hybrid best-of
products, which pay coupons of the form

\[ \max\left(\text{LIBOR},\ a\,(S_t / S_0 - 1)\right) \]

¹ See footnote on page 91.

These derivatives tend to depend strongly on the assumed correlation between the
interest rate and equity processes.
Note that derivatives containing a stream of LIBOR payments that cannot be
terminated early do not necessarily need to be modeled using stochastic interest rates
as we can hedge the payments in a way that does not depend on what happens to
the LIBOR rates.²

Stochastic numeraires. The second effect of assuming stochastic rates is to make
the money market account and zero-coupon bond prices stochastic. These are often
used as numeraires, so the time value of money is affected. Any option where the
time of a given payment is uncertain will be affected by stochastic interest rates.
Examples of such options are

■ Bermudan/American callable options. Here the timing of the strike payment
depends on when the holder decides to exercise; that decision will depend on
what has happened to the interest rates.
■ Target redemption notes (TARNs; see section 4.3) where the overall level of
return is guaranteed, but how the return is distributed throughout the life of the
option depends on an equity performance.
■ Any option in which you receive a payment the first time an event happens
(e.g., a barrier is breached).
■ Any option with a stream of LIBOR payments that can be called/knocked out,
such as the floating side of a conditional trigger swap.

The longer the maturity of an option, the greater will be the effect of stochastic
rates. A 1% change in interest rates will have only a 1% effect on the price of a
one-year zero-coupon bond, but the effect on the price of a 30-year bond will be
compounded up to more like 30%. For this reason, many options will be considered
to be hybrid options once their maturity becomes large enough, say five or ten years.
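
As a rough check of these magnitudes, assume (for illustration only) a flat, continuously compounded yield y, so that $P(0, T) = e^{-yT}$. A parallel shift of $\Delta y = 1\%$ then multiplies the price by $e^{-T \Delta y}$:

\[
T = 1:\quad e^{-0.01} - 1 \approx -1.0\%, \qquad\qquad T = 30:\quad e^{-0.30} - 1 \approx -25.9\%
\]

The first-order estimate $-T\,\Delta y$ gives −1% and −30% respectively; the convexity of the exponential accounts for the difference at the long end.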

² If we have to make a payment of LIBOR, fixed at $T_1$ and paid at $T_2$, we can hedge this
by buying the zero-coupon bond with maturity $T_1$ and shorting the zero-coupon bond with
maturity $T_2$. At $T_1$, the $T_1$ bond is worth 1; we use this to buy more of the $T_2$ zero-coupon
bond, giving us

\[ \frac{1}{P(T_1, T_2)} = 1 + (T_2 - T_1)\,\text{LIBOR}(T_1, T_2) \]

units of the $T_2$ bond. At $T_2$, we use this to make the LIBOR payment.

If we have a stream of back-to-back LIBOR payments (as in the floating side of a swap) with
fixing dates $T_0, \ldots, T_{n-1}$ and payment dates $T_1, \ldots, T_n$, we can hedge this by buying the $T_0$
zero-coupon bond and shorting the $T_n$ zero-coupon bond. At $T_0$, we invest 1 in the $T_1$ ZCB;
at $T_1$ we sell this, invest 1 in the $T_2$ ZCB and use the remainder to pay back the LIBOR. We
repeat this until we reach $T_n$, where we have to pay back the LIBOR and buy back the $T_n$
ZCB for 1. In this way, paying a stream of LIBOR payments is equivalent to borrowing 1 at
$T_0$ and paying it back at $T_n$.

Adjusted local volatilities. The final effect of stochastic rates that we will mention
here is more subtle. It is not so much a direct effect of stochastic rates on the payoff
of a product as a breakdown of our usual modeling assumptions for long-dated
options. Interest rates affect the stock price process through the drift term; in the
risk-neutral measure the drift of the stock price is just the short rate rt . Ignoring
dividends, we can write the stock price at time t as

\[ S_t = S_0 \exp\left(\int_0^t r_s\,ds\right) \mathcal{E}_t\left(\int_0^t \sigma_s^S\,dW_s^S\right), \tag{4.1} \]

where $\sigma_s^S$ is the equity process volatility (which may be stochastic), $W^S$ is the
Brownian motion driving the stock price process in the risk-neutral measure and
$\mathcal{E}$ is the Doléans-Dade exponential.³
When we assume interest rates are deterministic, all of the volatility of the
terminal stock price, $S_t$, comes from the final exponential in the above equation and
so we calibrate $\sigma_s^S$ to implied volatilities. In reality, since interest rates are stochastic,
the first exponential in the above equation is also a source of randomness and so
the equity process (local) volatility must be adjusted to match the implied volatility
surface (for more details of this, see chapter 8). Note that this effect exists even
when the interest rates are uncorrelated with the equity process. If we are always
calibrating to market implied volatilities, the terminal stock price distribution in
the t-forward measure (and thus the prices of European-style equity options) is
unaffected by our assumptions about stochastic rates.⁴ However, options that are
more directly sensitive to the local volatility than the implied volatility will be
affected.
In particular, options with a knock-out feature will be affected. Consider an
option that pays a fixed coupon c semiannually up to the first time the stock rises
above some barrier B on one of the coupon dates, $T_i$. The probability of being
above the barrier at time t (in the t-forward measure) is unaffected by our rate
assumptions.⁵ However, the more positive the correlation between stochastic rates
and the equity process, the less the equity local volatility will be and so the more
correlated the stock prices on adjacent barrier dates will be.
We can write the present value of the coupon payment on date $T_n$ as

\[ PV_n = c\,P(0, T_n)\,\mathrm{Prob}^{T_n}\left(S_1 < B \cap S_2 < B \cap \cdots \cap S_n < B\right) \]

³ $\mathcal{E}_t(Z) = \exp\left(Z_t - \frac{1}{2}\langle Z \rangle_t\right)$

⁴ The discussion in European payoffs in section 1.3.3 applies here as well.

⁵ We can write a call price in terms of the T-forward measure as $C(K, T) = P(0, T)\,\mathbb{E}^T[(S_T - K)^+]$.
Differentiating with respect to K gives $-\frac{1}{P(0, T)}\frac{\partial C}{\partial K} = \mathbb{E}^T[1_{S_T > K}]$, i.e., the probability of
the stock being above the barrier.

This probability can be rewritten as

\[
\mathrm{Prob}(S_1 < B) \times \mathrm{Prob}(S_2 < B \mid S_1 < B) \times \cdots \times
\mathrm{Prob}\left(S_n < B \mid \max(S_1, \ldots, S_{n-1}) < B\right)
\]

If the local volatility is reduced by introducing stochastic interest rates, the conditional terms in
the above expression will be increased as there is less volatility to allow the stock
to move over the barrier at date $T_i$, given that it wasn’t over the barrier at date $T_{i-1}$.
Note that the first exponential in equation (4.1) will be correlated with the short
rate at time t, rt . Even if there is no instantaneous correlation between the equity
process and the interest rate process (i.e., dS and dr are uncorrelated), the terminal
values (St and rt ) will be correlated, so derivatives with a payoff sensitive to this
correlation will be affected by modeling rates as stochastic even if we assume no
instantaneous correlation.

4.2 CONDITIONAL TRIGGER SWAPS

A conditional trigger swap is like a standard swap in that the holder (or issuer)
receives coupons in exchange for paying LIBOR. However, what would normally
be fixed coupons depend on the performance of an equity. Additionally, the whole
trade knocks out if the equity ever goes above some barrier.
In each of the example trades here, the underlying is the Nikkei (N225) index and
the payments are made in Japanese yen. The holder pays JPY LIBOR semiannually
on dates $T_i$ and receives $c_1$ at date $T_1$, then subsequently receives

\[
\begin{cases}
c_1 & \text{if } S(T_i) < B_c \\
c_2 & \text{if } S(T_i) \ge B_c,
\end{cases}
\]

at date Ti where Bc is a barrier set below the current spot level. If the index performs
well, the holder will receive a string of large coupons, whereas if the index plunges
below the barrier, he will receive only the small coupons.
On top of this, the whole structure knocks out if the index goes above Bk on
date Ti , where Bk is a knock-out barrier level set above the current spot. On the date
when the structure knocks out, the LIBOR and coupon payments are still made, but
subsequent ones are not.
The payoff is shown graphically in figure 4.1.
Without the knock-out barrier, this deal would not require stochastic rates at all.
We can value the floating side payments as we normally would, as $1 - P(T_0, T_N)$. We
can decompose the fixed side payments into the guaranteed amounts $c_1 \tau$, where $\tau$ is
the appropriate day-count fraction, and a series of digital options paying $(c_2 - c_1)\tau$
if $S_{T_i} > B_c$, which can be completely hedged with European options.
We can think of the floating side of the option as being equivalent to paying
1 at $T_0$ and receiving 1 at maturity or when the option knocks out. This is shown
in figure 4.2. The floating side payments in the option are therefore sensitive to
stochastic rates because the time when the holder effectively pays back the notional
is stochastic. The value of the stream of LIBOR payments can be written as

\[
\text{float side} = 1 - P(T_0, T_1)
+ P(T_0, T_1)\,\mathbb{E}^{Q^{T_1}}\left[1_{S_1 < B_k}\,(1 - P(T_1, T_2))\right]
+ P(T_0, T_2)\,\mathbb{E}^{Q^{T_2}}\left[1_{S_1, S_2 < B_k}\,(1 - P(T_2, T_3))\right] + \cdots
\]

For now we will ignore the effect of the stochastic rates on the distribution of the
time $T_f$ and just note that each of the terms

\[ 1 - P(T_i, T_{i+1}) \]

is positively correlated with interest rates. The more positively correlated the terms

\[ 1_{S_1, \ldots, S_i < B_k} \]

are with interest rates, the greater each of the expectations will be. The greater the
index/interest rate correlation, the less positively correlated the above terms will be
with interest rates, and the smaller the PV of the LIBOR payments will become. As
the correlation increases, the value of the floating side decreases.

FIGURE 4.1 Cash flows for a conditional trigger swap. $T_f$ is the first observation date $T_i$ at
which $S(T_i) > B_k$ or maturity.

FIGURE 4.2 Alternative cash flows for a conditional trigger swap. $T_f$ is the first observation
date $T_i$ at which $S(T_i) > B_k$ or maturity.

FIGURE 4.3 Effect of correlation on the floating side of a sample of conditional trigger
swaps. SR and DR refer to stochastic rates and deterministic rates, respectively.
The other effect to consider is the change in the local volatility. As correlation
increases, the local volatility decreases and so the probability of the option’s
knocking out by a particular date decreases. Consequently, as correlation increases,
the expected lifetime of the option (the expectation of $T_f$) increases. This is a smaller
effect than the one discussed above, so is not noticeable in the floating side, but
we can see that the PV of the fixed side payments does indeed increase slightly with
correlation.
Figures 4.3 and 4.4 show the effect of correlation on the PV of the float and fixed
side payments for some representative trades. Note that although the maturities of
these deals are 15y or 30y, the expected lifetimes are actually much smaller, so the
stochastic interest rates have only a small impact on the prices.
The details of the trades are as follows:

Trade  Maturity  Small coupon (c1)  Large coupon (c2)  Coupon barrier (Bc)  Knock-out barrier (Bk)
1      30y       0.1%               2.0%               10,064               12,764
2      15y       0.0%               0.0%               10,220               12,716
3      30y       0.8%               2.9%               10,220               13,163

FIGURE 4.4 Effect of correlation on the fixed side of a sample of conditional trigger swaps.
SR and DR refer to stochastic rates and deterministic rates, respectively.

4.3 TARGET REDEMPTION NOTES

4.3.1 Structure
The target redemption note (TARN) is a coupon-bearing, capital-guaranteed structure
that pays an attractive coupon for the first year or two, and that furthermore
pays a guaranteed⁶ total coupon amount, distributed among the remaining
coupon dates of the structure. The structure’s maturity might be eight years or
more.
The defining features of the TARN structures we consider here are:

■ The sum of coupons paid (TARN level) is guaranteed and the capital protected.
■ For an initial period, coupon payments are fixed.
■ The timing of the residual coupon and of the redemption payment are not
guaranteed, but are dependent on the performance of an underlying.

So the market risk is in the timing of the payments, not in their aggregate
size.
A wide variety of instruments can be used as the underlying for a TARN. The
structure originated as an interest rate derivative: Equity-linked TARNs are a more
recent development (since around 2003). Equity-linked underlyings can be indices

⁶ The term guaranteed should invariably be interpreted as carrying an implicit “assuming no
default of the issuer” caveat. There is nothing absolute about these guarantees in the absence
of additional arrangements such as escrow accounts, which take the funds out of the control
of the issuer.

(e.g., Dow Jones Euro Stoxx50), baskets (of as many as 20 stocks) or worst-of
baskets (having just a few constituents in the basket, perhaps just three). A CPPI
strategy (chapter 5) can also serve as the underlying for a TARN.
We will adopt a typical example TARN to study, with terms as follows:

■ 10-year maturity.
■ Underlying is the Dow Jones Euro Stoxx 50 index.
■ TARN level of 13.5%.
■ Annual coupons $C_i$, $1 \le i \le 10$, at anniversary dates $t_i$, $1 \le i \le 10$.
■ The first two coupon amounts are fixed: $C_1 = C_2 = 4.5\%$.

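The remaining coupons are equity linked and given in terms of the performance of
the underlying since the inception of the structure, $P_i = S(t_i)/S(0) - 1$, thus:

\[
C_i = \mathrm{Min}\left( \left( P_i - \sum_{j=1}^{i-1} C_j \right)^{+},\ \text{TARN level} - \sum_{j=1}^{i-1} C_j \right), \qquad 3 \le i \le 9,
\]

\[
C_{10} = \text{TARN level} - \sum_{j=1}^{9} C_j.
\]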

In words: The equity-linked coupons, before the last one, pay the excess of the
stock’s performance, up to the coupon date, over the accumulated coupons prior to
that time; the total aggregated coupon being however capped at the TARN level.
The final coupon ‘‘tops up’’ the total aggregated coupon to equal the TARN level
irrespective of the underlying stock’s performance.
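
The coupon rule can be made concrete with a short sketch. It is illustrative only: the function name `tarn_coupons`, its arguments, and the flooring of each equity-linked coupon at zero are assumptions made here, and the defaults simply restate the example terms (TARN level 13.5%, two fixed coupons of 4.5%, one performance figure per anniversary).

```python
def tarn_coupons(performance, tarn_level=0.135, fixed=(0.045, 0.045)):
    """Coupons paid by the example TARN along one path.

    performance[i] is P = S(t)/S(0) - 1 observed at anniversary i + 1.
    The list stops at the coupon date on which the structure redeems.
    All names and defaults are illustrative assumptions.
    """
    coupons = []
    paid = 0.0
    last = len(performance) - 1
    for i, p in enumerate(performance):
        if i < len(fixed):
            coupon = fixed[i]                      # fixed initial coupons
        elif i == last:
            coupon = tarn_level - paid             # final coupon tops up to the TARN level
        else:
            # excess of performance over coupons already paid, floored at zero
            # and capped so that the total never exceeds the TARN level
            coupon = min(max(p - paid, 0.0), tarn_level - paid)
        coupons.append(coupon)
        paid += coupon
        if paid >= tarn_level - 1e-12:             # TARN level reached: redeem
            break
    return coupons


if __name__ == "__main__":
    rising = [0.02, 0.05, 0.20, 0.25, 0.30, 0.30, 0.30, 0.30, 0.30, 0.30]
    flat = [0.0] * 10
    print(len(tarn_coupons(rising)), "coupon dates on the rising path")
    print(len(tarn_coupons(flat)), "coupon dates on the flat path")
```

On a path that is up 20% at the third anniversary the schedule stops after three coupons, whereas on a path that never rises above its initial level the last 4.5% is paid only with the final coupon.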
When the total aggregated coupon reaches the TARN level, the capital is
returned and the structure terminates. Neither the total income from the structure
nor the repayment of capital is therefore in doubt: just the timing of the income
and redemption payments, and hence the yield (to maturity or to early redemption).
The structure attracts the investor who believes that it will be called early; say, after
three or four years. In that case, he will have received attractive coupons from a
medium-term note. He must believe that it is not unreasonable to suppose that the
index will have risen by 13.5% in three or four years.
To illustrate the risk taken on this market view, we may look at the internal
rate of return (IRR) arising from various possible redemption scenarios. Table 4.1
lists the most favorable scenario (which is that the instrument is called after just
three years), the two extreme scenarios at four-year termination (the ones generating
the minimum and maximum coupon at three years), the most favorable five-year
termination case and the ten-year case. We note that even the most favorable four-
year termination reduces the IRR by more than 1% and the most favorable five-year
termination by nearly 1¾%. The investor’s yield drops abruptly if his favorable
early-termination scenarios are not realized. Worse still, if the market falls and does
not recover, he is trapped in a structure that provides a yield far below risk free.
A further conclusion from the table is that the dominant factor in the realized IRR
is the timing of the redemption payment, not the details of how the coupon is
distributed amongst the anniversary dates.

TABLE 4.1 Internal rates of return for TARN redemption scenarios

                      Year of Termination
Payments   3         4         4         5         10
Initial    100.00%   100.00%   100.00%   100.00%   100.00%
Year 1     4.50%     4.50%     4.50%     4.50%     4.50%
Year 2     4.50%     4.50%     4.50%     4.50%     4.50%
Year 3     104.50%   0.00%     4.49%     4.49%     0.0%
Year 4               104.50%   100.01%   0.00%     0.0%
Year 5                                   100.01%   0.0%
Year 10                                            104.5%
IRR        4.50%     3.39%     3.43%     2.77%     1.37%
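
The IRR row can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions: the initial payment is treated as an outflow of 100, cash flows fall on exact anniversaries, and the root is found by simple bisection; the function name `irr` and the scenario labels are hypothetical.

```python
def irr(cashflows, lo=-0.99, hi=1.0, tol=1e-12):
    """Annual internal rate of return by bisection.

    cashflows[t] is the payment at the end of year t (index 0 = today).
    Assumes one sign change, so the NPV is decreasing in the rate on [lo, hi].
    """
    def npv(rate):
        return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cashflows))

    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if npv(mid) > 0.0:
            lo = mid          # rate too low: present value still positive
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Scenarios of Table 4.1, per 100 of notional (labels are illustrative)
scenarios = {
    "3y": [-100.0, 4.5, 4.5, 104.5],
    "4y, minimum year-3 coupon": [-100.0, 4.5, 4.5, 0.0, 104.5],
    "4y, maximum year-3 coupon": [-100.0, 4.5, 4.5, 4.49, 100.01],
    "5y, most favorable": [-100.0, 4.5, 4.5, 4.49, 0.0, 100.01],
    "10y": [-100.0, 4.5, 4.5] + [0.0] * 7 + [104.5],
}

if __name__ == "__main__":
    for name, flows in scenarios.items():
        print(f"{name}: {irr(flows):.2%}")
```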

Compare also figures 4.10 and 4.11 showing how the TARN’s value derives
from the distribution of early- and late-termination cases.
While this is not an atypical structure, there are variants on the theme. One such
caps each coupon payment, potentially preventing the TARN level being reached
at some coupon date and thereby lengthening the structure when the basic TARN
would have redeemed early. In the above expression for the coupon amounts,

\[
P_i - \sum_{j=1}^{i-1} C_j
\]

is in this case replaced by

\[
\mathrm{Min}\left( P_i - \sum_{j=1}^{i-1} C_j,\ \mathrm{Cap} \right).
\]

4.3.2 Back-Testing
For the purposes of marketing a structured derivative, it is common to perform
back-testing. This procedure evaluates how the structure would have performed had
it been purchased at some time in the past. In particular, it is common to evaluate
the results of having, hypothetically, made the investment in the structure on each
business day during an appropriate time interval, using historical daily time series
for all relevant underlyings.7
Although back-testing is no part of derivative valuation, we apply it to our
example TARN to illustrate some of its features. The Stoxx50 index is available from
January 1987, so derivatives of ten-year maturity can be back-tested meaningfully,
assuming one ‘‘clone’’ of the structure to be initiated per business day during the ten

7
It is also common to speak loosely of the results as giving probabilities of particular outcomes:
this is not correct.

years between January 1, 1987, and December 31, 1996. It transpires that even the
latest starting of the simulated structures would have terminated by December 31,
1999. Accordingly, we can calculate their realized IRRs and plot them: Figure 4.5
shows the realized IRR for each trial, and figure 4.6 shows their distribution into
the maximum possible IRR and percent-wide bands below it.
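
The back-testing loop itself is simple to sketch. The code below is a minimal illustration, not the procedure used to produce the figures: it assumes 252 business days per year, reads the index on the day exactly `year * 252` observations after inception, only starts clones for which a full ten years of data exist, and floors each equity-linked coupon at zero. The function names and the synthetic series in the usage example are hypothetical.

```python
def redemption_year(levels, start, days_per_year=252, maturity=10,
                    tarn_level=0.135, fixed=(0.045, 0.045)):
    """Year in which a TARN started at index `start` would have redeemed."""
    s0 = levels[start]
    paid = 0.0
    for year in range(1, maturity):
        if year <= len(fixed):
            paid += fixed[year - 1]                      # fixed initial coupons
        else:
            perf = levels[start + year * days_per_year] / s0 - 1.0
            paid += min(max(perf - paid, 0.0), tarn_level - paid)
        if paid >= tarn_level - 1e-12:
            return year                                  # TARN level reached early
    return maturity                                      # topped up at maturity


def backtest(levels, days_per_year=252, maturity=10):
    """Redemption year of one clone per start day with a full data window."""
    last_start = len(levels) - 1 - maturity * days_per_year
    return [redemption_year(levels, s, days_per_year, maturity)
            for s in range(last_start + 1)]


if __name__ == "__main__":
    # Synthetic series growing 8% a year, purely to exercise the code
    levels = [100.0 * 1.08 ** (k / 252) for k in range(252 * 12 + 1)]
    years = backtest(levels)
    share_year3 = sum(y == 3 for y in years) / len(years)
    print(f"{share_year3:.0%} of {len(years)} clones redeem after three years")
```

Feeding the redemption years into an IRR calculation then gives the kind of per-trial results plotted in the figures that follow.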

[Plot for Figure 4.5: realized IRR (vertical axis, 0.00% to 5.00%) against back-tested TARN number (horizontal axis, 0 to 3000).]

FIGURE 4.5 Realized IRRs for each hypothetical back-tested TARN. One is assumed to
have been started each business day for ten years from January 1, 1987.

[Plot for Figure 4.6: histogram of the share of trials (vertical axis, 0% to 100%) in the IRR bands 1.5% - 2.5%, 2.5% - 3.5%, 3.5% - 4.5%, and 4.5%.]

FIGURE 4.6 The distribution of realized IRRs for hypothetical back-tested TARNs. The gap
in the histogram corresponds to the gap between the three-year termination case and the
most favorable four-year case in the last row of Table 4.1.

We have the surprising result that 86% of the simulated TARN issues terminate
after three years (the market rose at least 13.5%).
This is in fact a nice illustration of the power of back-testing: If the investor
believes that this behavior is representative of the outcome for an investment he
is considering, he will see the investment as extremely attractive. The risk-neutral
probability of markets rising 13.5% in three years is, of course, nowhere near
this high.
We may gain some intuition for this high percentage by observing the time
series for the index over the relevant interval (see figure 4.7). The vertical lines
in the graph mark the start date of the last simulated TARN, and its redemption
date. (It is clear by inspection that it terminates after three years, and in fact no
simulated TARN survives later than this.) We can see that the back-testing interval is
dominated by rising markets and excludes (because even the latest-starting TARNs
have terminated) the decline from the markets’ peak in 2000, hence the excellent
back-testing performance of this structure.
Now we are in a position to understand how the TARN offers such an attractive
early coupon and what the pitfall is. The early coupons are paid for by the probability
that the capital will be tied up for a long time, earning no great return, in the event
that the early redemption scenarios are not realized. The larger the initial coupon,
the lower the probability of early redemption and the longer it will be necessary
to lock up the capital in order to make the structure value work. In structuring a
TARN, there is a balance to be struck between initial coupons sufficient to attract
investors, the probability that early redemption will not happen, and the length of
time for which the investor’s capital will be locked up in the structure if it is not
redeemed early.

[Plot for Figure 4.7: Stoxx50E closing levels (vertical axis, 0 to 6000) from 01-01-1987 to 01-01-2005.]

FIGURE 4.7 Closing levels of Stoxx50 from January 1, 1987. The vertical lines mark the
inception date of the latest simulated TARN and its redemption date.

In the following sections, we extract the risks embedded in the structure, and
indicate how the various models in which the structure can be valued quantify these
risks.

4.3.3 Valuation Approach


Apart from the initial fixed-coupon payments, the TARN embeds a strip of call
spreads, where the lower strike depends on the past performance, and a payment of
the redemption amount on the first anniversary on which the performance reaches
the barrier (i.e., TARN level), 113.5% of initial spot in our example. Hence, we
may investigate the interest rate risk, the embedded lookback call spreads, and the
barrier risk.

Barrier Risk We first concentrate on the barrier risk at $K = 113.5\%\,S_0$. The barrier
in question pays the redemption amount as soon as the stock reaches K, or at
maturity if the stock never reaches this level. A barrier can generally be seen as a
limit of call spreads. As such, the impact of using a Black-Scholes-type model is
severe: It neglects the presence of the skew around the barrier.
In formulae, this rests on the fact that

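\begin{align*}
\mathrm{Digital}(K) &= -\partial_K \mathrm{Call}(K) \\
&= -\partial_K \mathrm{Call}^{BS}(K, \hat\sigma(K)) \\
&= \mathrm{Digital}^{BS}(K, \hat\sigma(K)) - \mathrm{Vega}^{BS}(K, \hat\sigma(K))\, \partial_K \hat\sigma(K) \\
&\ne \mathrm{Digital}^{BS}(K, \hat\sigma(K)).
\end{align*}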

(Here $\hat\sigma(K)$ denotes the implied Black-Scholes volatility of a call with strike $K$ and
the BS superscript indicates Black-Scholes-type formulae.) To illustrate the impact of
the vega term, we may evaluate the digital in the Black-Scholes Model and compare
it with approximations to the derivative of the call price using call spreads of
$\pm\Delta K$ around the barrier $K$.⁸

\[
\mathrm{Digital}(K) \approx \frac{\mathrm{Call}(K - \Delta K) - \mathrm{Call}(K + \Delta K)}{2\,\Delta K}
\]
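
The size of the vega term can be explored numerically. The sketch below is a toy illustration and does not use the market data behind the tables: it assumes zero rates and dividends, spot normalized to 1, and a hypothetical implied volatility that is linear in strike (`ATM_VOL` and `SKEW` are made-up parameters). It prices the call spread off the skewed smile and compares it with the Black-Scholes digital plus the vega correction.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def d1_d2(spot, strike, vol, t):
    d1 = (math.log(spot / strike) + 0.5 * vol * vol * t) / (vol * math.sqrt(t))
    return d1, d1 - vol * math.sqrt(t)


def bs_call(spot, strike, vol, t):
    """Black-Scholes call with zero rates and dividends (assumption)."""
    d1, d2 = d1_d2(spot, strike, vol, t)
    return spot * norm_cdf(d1) - strike * norm_cdf(d2)


def bs_digital(spot, strike, vol, t):
    """Minus the strike derivative of the call at fixed volatility."""
    return norm_cdf(d1_d2(spot, strike, vol, t)[1])


def bs_vega(spot, strike, vol, t):
    return spot * norm_pdf(d1_d2(spot, strike, vol, t)[0]) * math.sqrt(t)


ATM_VOL, SKEW = 0.20, -0.25      # hypothetical smile: vol falls as strike rises


def implied_vol(strike, spot=1.0):
    return ATM_VOL + SKEW * (strike / spot - 1.0)


def call_spread_digital(spot, strike, t, dk):
    """Central call spread priced with the strike-dependent implied vols."""
    lower = bs_call(spot, strike - dk, implied_vol(strike - dk, spot), t)
    upper = bs_call(spot, strike + dk, implied_vol(strike + dk, spot), t)
    return (lower - upper) / (2.0 * dk)


def skew_adjusted_digital(spot, strike, t):
    """Black-Scholes digital minus vega times the slope of the smile."""
    vol = implied_vol(strike, spot)
    dvol_dk = SKEW / spot
    return bs_digital(spot, strike, vol, t) - bs_vega(spot, strike, vol, t) * dvol_dk


if __name__ == "__main__":
    spot, strike, t = 1.0, 1.135, 1.5
    print("BS digital      ", bs_digital(spot, strike, implied_vol(strike), t))
    print("skew-adjusted   ", skew_adjusted_digital(spot, strike, t))
    for dk in (0.005, 0.0025, 0.0005, 0.0001):
        print("call spread", dk, call_spread_digital(spot, strike, t, dk))
```

With a downward-sloping smile the correction is positive, so the call spread converges to a value above the plain Black-Scholes digital as the spread narrows.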

The results of this test for the digital at 18-month maturity are given in table 4.2.
(Compare also figure 1.5, in which similar comparisons are illustrated.)
The wide discrepancy here forces us to use a model that takes into account the
skew, such as local volatility. In principle, such models reprice all European vanillas
correctly. Consequently, as we can see in table 4.3, they give consistent prices for
the barriers. (Again, compare figure 1.5.)

8
The actual size of the spread is determined by trading considerations, because the call spread
also allows one to constrain the possible delta positions occurring during the life of the trade.

TABLE 4.2 The Black-Scholes digital compared to call spread approximations

Spread size (ΔK)   BS Digital   Call Spread
0.5%               16.12%       20.60%
0.25%                           20.60%
0.05%                           20.59%
0.01%                           20.59%
0.001%                          20.59%

TABLE 4.3 The Black-Scholes digital compared to call spread approximations and
local volatility prices at various maturities. Data for Stoxx50, June 2005

Maturity   BS Digital   Call Spread   Local Volatility
1y6m       16.1%        20.6%         20.6%
2y         18.8%        24.4%         24.4%
2y6m       21.8%        28.5%         28.6%
3y         22.4%        29.7%         29.7%

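Lookback Elements The next elements for consideration in the TARN are the embedded
call spreads. The coupons have the form:

\begin{align*}
C_1 &= 4.5\% \\
C_2 &= 4.5\% \\
C_3 &= \left( \mathrm{Min}\left\{ 113.5\%, \frac{S_3}{S_0} \right\} - 109\% \right)^{+} \\
C_4 &= \left( \mathrm{Min}\left\{ 113.5\%, \frac{S_4}{S_0} \right\} - 109\% - \left( \frac{S_3}{S_0} - 109\% \right)^{+} \right)^{+} \\
    &= \left( \mathrm{Min}\left\{ 113.5\%, \frac{S_4}{S_0} \right\} - \mathrm{Max}\left\{ \frac{S_3}{S_0}, 109\% \right\} \right)^{+} \\
C_5 &= \left( \mathrm{Min}\left\{ 113.5\%, \frac{S_5}{S_0} \right\} - \mathrm{Max}\left\{ \frac{S_4}{S_0}, \frac{S_3}{S_0}, 109\% \right\} \right)^{+},
\end{align*}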

which shows that later coupons are lookback-type call spreads.⁹ In general, such
structures might depend strongly on future skew and interdependency between the
increments $S_i/S_0$ and $S_{i-1}/S_0$. Such effects are not always well captured by local
volatility models.

9 A lookback on the maximum is a payoff of the form: $F(S_n, \mathrm{Max}_{i \le n} S_i)$ for $t_0 < \cdots < t_n$.



To assess the impact of using a ‘‘structural’’ model (as opposed to a ‘‘fitting’’


model like local volatility) that still fits the European prices along the barrier K
well, but that also has self-consistent implied volatility dynamics, we calibrate a
Heston-type extended stochastic volatility model to the market. Carrying out the
comparison between these two models and plain Black-Scholes and the call spread
yields the comparison in figure 4.8 for the 113.5% barrier of our example. The
same comparison for the TARN using a BS model (i.e., term structure of implied
volatility along K), a local volatility model and a stochastic volatility model is shown
in figure 4.9.
The TARN can be thought of as two guaranteed coupons of 4.5%, plus an
option to receive the notional when a barrier of 13.5% is reached, plus an extra
4.5% paid somewhere between year 3 and year 10. Well-calibrated stochastic or
local volatility models should agree on the prices of the fixed coupons and the barrier
option, as well as the value of the first variable coupon. It is only the remainder of
the deal that behaves like the sum of lookback options (and then only if the index
falls within the narrow range of 109% to 113.5%). It is perhaps not too surprising
that the choice of the stochastic versus local volatility model does not have a large
effect on the price.

Impact of Stochastic Rates Although there is no explicit dependence on interest


rates in this product, the choice of deterministic versus stochastic rate models and
of the correlation between the index and the rates can still have a significant effect
on the price.
This behavior can be understood from looking at figures 4.10 and 4.11, where
we show the probabilities of the TARN expiring at a particular date and the
corresponding contributions to the overall price.
As the correlation between the index and the rates is increased, two effects
occur:

[Plot for Figure 4.8: price (vertical axis, 0% to 35%) at maturities 2y, 2y6m, and 3y for the BlackScholes, CallSpread, Local Vol, and Stochastic Vol models.]

FIGURE 4.8 Different models applied to the 113.5% barrier option of various maturities.

■ The local volatility of the index process has to decrease (as the effect of the
stochastic rates on the implied volatilities becomes stronger).
■ The cash bond becomes more positively correlated with the index level.

The first effect increases the correlation between the index prices on consecutive
fixing dates, which increases the probability that if the index is above the barrier
on date $i$, then it was also above the barrier on date $i - 1$. The probability of the
TARN expiring in each of periods 4 to 9 is therefore reduced (figure 4.10). The

[Plot for Figure 4.9: TARN price (vertical axis, 90.0% to 96.0%) under the BlackScholes, Local Vol, and Stochastic Vol models.]

FIGURE 4.9 Different models applied to the TARN structure.

[Plot for Figure 4.10: risk-neutral probability for each redemption date; probability (vertical axis, 0 to 0.4) against redemption year (0 to 10), for correlations 0.8, 0.4, 0, -0.4, and -0.8.]

FIGURE 4.10 Risk-neutral probabilities of redemption on the possible expiry dates.



[Plot for Figure 4.11: contributions to the TARN price from different redemption dates; price contribution (vertical axis, 0 to 0.4) against redemption year (0 to 10), for correlations 0.8, 0.4, 0, -0.4, and -0.8.]

FIGURE 4.11 Contributions to the TARN value from the possible expiry dates.

second effect means that paths where the index performed badly (and so the TARN
expired after ten years) have a smaller cash bond, so the final payment is less
strongly discounted and worth more, seen from today. This can be seen from
figure 4.11, where the correlation has a very large effect on the contribution to
the TARN price of the paths expiring after ten years. Note that the second effect
becomes more pronounced for later maturities thanks to compounding up of the
cash bond. The net effect is that the TARN value increases with increasing index
versus rate correlation, as shown in figure 4.12.

4.3.4 Hedging
Once the deal is priced, a hedging strategy to manage the risk must be employed.
The considerations necessary to hedge the product are similar to the pricing
considerations: We are faced with some fixed coupons (for which we do not need any
hedging), a barrier, and a stream of lookback call spreads.

[Plot for Figure 4.12: price (vertical axis, 93.50% to 97.50%) against index-rate correlation (-1.0 to 1.0); X marks the value for local volatility with deterministic interest rates.]

FIGURE 4.12 The effect on the calculated TARN value of correlation between the index and
the EUR short rate, in a local volatility model with stochastic short rates.

Barrier Hedging The risk in hedging a barrier is that the delta becomes very large
when we approach the barrier, and suddenly collapses when the barrier is reached.
We then face the problem of not being able to unwind our delta position fast
enough and face a gap risk. To alleviate the situation, call spreads can be used to
approximate the barrier. However, the size of the call spread is crucial: set too large,
the product becomes too expensive; set too small, the gap risk becomes too strong.
In general, the size of the spread lies in the experience of the trader and depends on
the liquidity of the stock, the size of the position, any other positions in the book,
and so forth.

Lookback Call Spreads The specific nature of the lookback call spreads embedded
in the TARN is that the payment of one spread determines the lower strike of the
subsequent call spread.
One possible strategy is therefore to set up call spreads for all times $t_i$, $i \ge 3$,
with upper strike $K$ (the barrier) and lower strike initially at $109\%\,S_0$. The later
strikes might also be adjusted by the probabilities of the process reaching the barrier
up to $t_i$. Clearly, this initial portfolio of calls must be adjusted during the life of
the trade to account for the actual movement of the stock, taking into account the
transaction costs for each repositioning.

4.4 CONVERTIBLE BONDS


4.4.1 Introduction
In this section we consider an intensity-based framework for pricing convertible
bonds (CBs; see, for example, Overhaus et al. [31]) in a ‘‘two-and-a-half-factor’’
setting. The two factors are the stock price and the interest rate. The hazard rate (the
half factor) is modeled as a deterministic function of the stock, the interest rate, and
time. We account explicitly for the stock price behavior and holder’s rights in the
event of default as well as the recovery value of the bond. Most comparable existing
models are special cases of this general setting.
There are three main issues on the modeling side:

■ Whether the stock value or the firm value is the main underlying factor
■ Whether there are additional stochastic factors, such as an interest rate or hazard
rate
■ How default is modeled and what happens upon default to the state variables,
the CB holders’ rights, and the convertible value.

In general, credit risk models fall into two main categories: structural and
reduced form (see Chapter 6). In structural models (see [72]–[82]) the state variable
is usually the value of the firm or firm asset value, which moves randomly. All claims
on the firm’s value are modeled as derivative securities with the firm value as the
underlying. Default occurs when the value of the firm hits or crosses a boundary.
It is necessary to specify the process for the firm value, the location of the barrier,
and the form and amount of recovery upon default. These models provide a link
between the equity and debt instruments issued by a firm, which may be necessary,
for example, in the valuation of CBs and callable bonds; they can be used, at least
in theory, to optimize the capital structure, and default risk is endogenized and
measured based on the share price and fundamental data only. However, the firm
value is unobservable and often difficult to model. The volatility of the firm value
is particularly hard to estimate. Also, models become too complex for reasonable
capital structures. Finally, they are not well suited for pricing and hedging of credit
instruments. In reduced form models (see [77], [83]–[88]) default is exogenous,
occurring at the first jump time of a counting process, $N_t$, with jump intensity $\lambda_t$.
The main issues in reduced form models are the specification of processes for the
riskless short rate $r_t$, the hazard rate $\lambda_t$, and the recovery value.
The early models of convertible bonds (Ingersoll [89] and Brennan and Schwartz
[90]) follow Merton [73] in using the value of the firm with geometric Brownian
motion as the sole state variable. Brennan and Schwartz [91] and more recently
Nyborg [92] and Carayannopoulos [93] included in addition a stochastic interest
rate. Brennan and Schwartz and Nyborg assumed the short rate follows a mean-
reverting log-normal process; Carayannopoulos assumed the short rate follows the
Cox, Ingersoll, and Ross [94] model. Default risk is usually incorporated structurally
by capping payouts to the bond by the value of the firm.
Recent literature, on the other hand, mainly uses the stock price as a state
variable and either ignores credit risk (Zhu and Sun [95]; Epstein, Haber, and
Wilmott [96]; Barone-Adesi, Bermúdez, and Hatgioannides [97]; Bermúdez and
Nogueiras [107]), incorporates it via a credit spread (McConnell and Schwartz [99],
Cheung and Nelken [100], Ho and Pfeffer [101]), or models it in a reduced form
setting as an exogenously specified default process (see Duffie and Singleton [87]).
However, some authors have pointed out (see Schönbucher [102]) that given the
hybrid nature of convertibles, asset-based models are the right class to consider
in order to account for credit risk. Arvanitis and Gregory [103] implemented and
compared both types of models for CB valuation. Bermúdez and Webber [104]
proposed an asset-based model that incorporates both endogenous and exogenous
default, as well as endogenized recovery.
In the equity-based approach most authors use a single-factor model, although
some allow interest rates to be stochastic in addition. The Vasicek [105] or else
the extended Vasicek (Hull and White [106]) model is used by Epstein, Haber,
and Wilmott [96]; Barone-Adesi, Bermúdez, and Hatgioannides [97], Bermúdez
and Nogueiras [107]; and Davis and Lischka [108]. Ho and Pfeffer [101] used
the Black, Derman, and Toy [109] model; and Zvan, Forsyth, and Vetzal [110]
and Yigitbasioglu [111] used the Cox, Ingersoll, and Ross [94] model. Cheung and
Nelken [112] adopted the model developed by Kalotay, Williams, and Fabozzi [113].
Very few authors model the hazard rate stochastically (Davis and Lischka
[108], Arvanitis and Gregory [103]). Most recent papers model the hazard rate
as a deterministic function of the state variables (also called a quasi-factor or half
factor) instead. To model the credit spread as a function of the state variables is
very intuitive and appears to provide realistic valuations, sensitivities, and implied
parameters, but it does constrain the credit spread to have an explicit relationship
with the stock price. This suggests developing a model in which both stock prices
and credit spreads follow separate but correlated random processes, as proposed
by Davis and Lischka [108]. As these authors point out, although there are three
sources of uncertainty—stock price, interest rate and credit spread—more than two
factors tend to be avoided for computational tractability. From the implementation
point of view, stochastic hazard rates offer the same complexity as stochastic interest
rates, given that the dynamics for both processes are often very similar and their role
in the valuation PDE is analogous (Duffie and Singleton [87]).
The first authors to have modeled default exogenously, in the spirit of reduced
form models, were Davis and Lischka [108] (DL) and later Takahashi, Kobayashi,
and Nakagawa [114] (TKN). They assumed that default occurs at the first jump
of a Poisson process, and they modeled the intensity of the jump as a deterministic
function of the stock price. They assumed that upon default the stock price jumps
to zero. DL modeled the recovery as a constant fraction R of the par value of the
bond, whereas TKN modeled recovery as a fraction of the market value of the bond
prior to default. However, it can be argued that these approaches penalize the equity
upside of the CB. The value of a convertible bond has components of different
default risk; the value contributed to the bond by its conversion rights can be argued
not to be subject to the same risk treatment as the fixed payments. Therefore, it may
be convenient, or even essential, to split the CB value into a bond part and an equity
part. In general, the value of the debt and equity components will be linked, and the
valuation problem reduces to solving a coupled system of equations. Splitting models
allow one to apply a different credit regime to the debt and equity components.
Moreover, they may be of interest to investors in order to identify different sources
of risk and be able to hedge them. How to split the convertible value, though, is an
open and controversial matter.
The first authors presenting splitting and writing the model as a coupled system
of equations were Tsiveriotis and Fernandes [115] (TF). The value of the equity
component and the value of the bond component were discounted differently to
reflect their supposed different credit risk. Ayache, Forsyth, and Vetzal [116], [117]
(AFV) extended previous literature by proposing a general specification of default
in which the stock price jumps by a given percentage upon default and the holder
has the right either to convert or to recover a given fraction R of the bond part
of the convertible. The way they define the bond part is different from the original
definition of Tsiveriotis and Fernandes.
We consider a unified framework for pricing convertible bonds incorporating
interest rate and credit risk. We assume a jump-diffusion process for the stock
price and a mean-reverting process for the interest rate. We model the intensity
as a deterministic function of the stock and the interest rate, leading to an extra
so-called quasi-factor or half factor. Upon default, the model has an arbitrary loss
rate on the stock price, and an arbitrary default value V for the convertible that
may be a function of the state variables. The model contains many other models as
special cases. We identify most of the previous models, and we show that the main
difference between them is the specification of the recovery value.
DL and TKN implement their model in a lattice. TF use explicit finite differences
and an explicit algorithm to solve the coupled system of equations. AFV use a
modified Crank-Nicolson method combined with a penalty method for the free
boundaries and an implicit algorithm to solve the coupled system of equations. In
chapter 9 we discretize using a Lagrange-Galerkin method, and use an iterative
method to deal with the free boundaries.
In the next section we present the general valuation framework. Section 4.4.3
provides a detailed specification of the model, namely the interest rate model,
the hazard rate, the recovery value, and the conversion rights upon default. Also
in this section, previous models that are special cases of the general framework
are identified. Section 4.4.4 provides the analytical solution for a special bond
convertible just at expiry.

4.4.2 The Governing Equation


We follow a standard procedure given, for instance, by Protter [118]. Suppose that
the value $S_t$ of the underlying asset follows a jump-augmented geometric Brownian
motion under the objective measure, $\mathbb{P}$,

\[
dS_t = (\mu_S - d_t - q_t)\, S_t\, dt + \sigma_S S_t\, dZ^S_t - \eta_t S_t\, dN_t, \tag{4.2}
\]

where $Z^S_t$ is a standard Brownian motion under $\mathbb{P}$, $d_t$ is the continuous dividend
yield, and $q_t$ is the repo rate. $N_t$ is a counting process with intensity $\lambda_t$. $\eta_t$ is a
deterministic loss rate. $N_t$ models exogenous default events. At a jump time for $N_t$
the equity value falls by a proportion $\eta$,

\[
S^{+} = S^{-}(1 - \eta). \tag{4.3}
\]

It is well known that the process $\Lambda_t = \lambda_t t$ is the $\mathbb{P}$-compensator of $N_t$, that is, the
unique finite-variation previsible process such that $N_t - \lambda_t t$ is a martingale under $\mathbb{P}$.
Under the equivalent martingale measure (EMM), $\mathbb{Q}$, associated with the money
market account, $B_t = \exp\left( \int_0^t r_s\, ds \right)$, the relative price $S_t / B_t$ is a martingale, so

\[
dS_t = (r_t - d_t - q_t)\, S_t\, dt + \sigma_S S_t\, dZ^S_t - \eta_t S_t\, (dN_t - \lambda_t\, dt), \tag{4.4}
\]

where $Z^S_t$ is a standard Brownian motion under $\mathbb{Q}$ and $d\Lambda_t = \lambda_t\, dt$ is the
$\mathbb{Q}$-compensator of the jump component.
Notice that the setup defined by equation (4.4) is an incomplete market, meaning
that there is at least one contingent claim that cannot be hedged. Equivalently, under
the assumption of no arbitrage, there is no unique equivalent martingale measure
with which to price a contingent claim. However, given that the loss rate, $\eta_t$, is
deterministic, the market can be completed by adding a defaultable bond issued by
the firm whose equity is modeled by $S_t$.
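
As a check on the role of the compensator in equation (4.4), consider the special case, assumed here purely for illustration, in which $r$, $d$, $q$, $\sigma_S$, the jump intensity $\lambda$, and the loss rate $\eta$ are constants and $N_t$ is a Poisson process independent of $Z^S_t$. The stock price then has the explicit form

\[
S_T = S_0 \exp\left( \left( r - d - q + \eta\lambda - \tfrac{1}{2}\sigma_S^2 \right) T + \sigma_S Z^S_T \right) (1 - \eta)^{N_T},
\]

and, because

\[
E\left[ (1 - \eta)^{N_T} \right] = \sum_{n=0}^{\infty} e^{-\lambda T} \frac{(\lambda T)^n}{n!} (1 - \eta)^n = e^{-\eta\lambda T},
\]

the expected stock price under the EMM is

\[
E[S_T] = S_0\, e^{(r - d - q + \eta\lambda) T}\, e^{-\eta\lambda T} = S_0\, e^{(r - d - q) T}.
\]

The extra drift $\eta\lambda$ earned while the firm survives exactly offsets the expected loss on default, so in this special case the forward is unaffected by the jump component.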
Let us also assume that the short rate follows the stochastic process

\[
dr_t = \mu_r\, dt + \sigma_r\, dZ^r_t, \tag{4.5}
\]

where $\mu_r$ and $\sigma_r$ are the expected rate of return and volatility of the spot interest
rate, which may be functions of the short-rate level as well as time. $Z^S_t$ and $Z^r_t$ are
both Brownian motions that may be correlated,

\[
dZ^r_t\, dZ^S_t = \rho_t\, dt, \quad \text{with } -1 \le \rho_t \le 1. \tag{4.6}
\]

We suppose that the firm has issued a convertible bond with market value $V_t$. The
bond matures at time $T$ with face value $F$. At any time up to and including time $T$
the bond may be converted to equity. Its value upon conversion at time $t$ is $n_t S_t$,
where $n_t$ is the conversion ratio (which may be zero). The bond may be called by the
issuer for a call price $MC_t$ and also it may be redeemed by the holder for a put price
$MP_t$. We assume that call and put prices already include accrued interest, which
must be paid by the issuer upon call and upon put.
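By Itô’s lemma (see Jacod and Shiryaev [119]), the process followed by $V_t$ is

\begin{align*}
dV_t ={}& \left[ \frac{\partial V_t}{\partial t}
  + \frac{1}{2} \sigma_S^2 S_t^2 \frac{\partial^2 V_t}{\partial S_t^2}
  + \rho_t \sigma_S \sigma_r S_t \frac{\partial^2 V_t}{\partial S_t\, \partial r_t}
  + \frac{1}{2} \sigma_r^2 \frac{\partial^2 V_t}{\partial r_t^2} \right. \\
& \left. {} + (r_t - d_t - q_t + \eta_t \lambda_t)\, S_t \frac{\partial V_t}{\partial S_t}
  + \mu_r \frac{\partial V_t}{\partial r_t} \right] dt \\
& {} + \sigma_S S_t \frac{\partial V_t}{\partial S_t}\, dZ^S_t
  + \sigma_r \frac{\partial V_t}{\partial r_t}\, dZ^r_t
  + \Delta V_{S_t}\, dN_t,
\end{align*}

where $\Delta V_{S_t} = V_t(S_t^{+}, t) - V_t(S_t^{-}, t)$ and $V_t(S_t^{+}, t) = V_t(S_t^{-}(1 - \eta_t), t)$ is the
value of the convertible bond if a jump occurs at time $t$.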
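Under the EMM the relative price $V_t / B_t$ is a martingale. Imposing this condition,
we have

\begin{align*}
r_t V_t ={}& \frac{\partial V_t}{\partial t}
  + \frac{1}{2} \sigma_S^2 S_t^2 \frac{\partial^2 V_t}{\partial S_t^2}
  + \rho_t \sigma_S \sigma_r S_t \frac{\partial^2 V_t}{\partial S_t\, \partial r_t}
  + \frac{1}{2} \sigma_r^2 \frac{\partial^2 V_t}{\partial r_t^2} \\
& {} + (r_t - d_t - q_t + \eta_t \lambda_t)\, S_t \frac{\partial V_t}{\partial S_t}
  + \mu_r \frac{\partial V_t}{\partial r_t} \tag{4.7} \\
& {} + \lambda_t\, \mathbb{E}_t\left[ V_t(S_t^{+}, t) - V_t(S_t^{-}, t) \right].
\end{align*}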

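Notice that $\lambda_t\, \mathbb{E}_t\left[ V_t(S_t^{+}, t) - V_t(S_t^{-}, t) \right] dt$ is the compensator of the jump $\Delta V_{S_t}$.

When $V_t(S_t^{+}, t)$ is a deterministic function of $S_t^{+} = S_t^{-}(1 - \eta_t)$, equation (4.7) reduces to

\begin{align*}
(r_t + \lambda_t) V_t ={}& \frac{\partial V_t}{\partial t}
  + \frac{1}{2} \sigma_S^2 S_t^2 \frac{\partial^2 V_t}{\partial S_t^2}
  + \rho_t \sigma_S \sigma_r S_t \frac{\partial^2 V_t}{\partial S_t\, \partial r_t}
  + \frac{1}{2} \sigma_r^2 \frac{\partial^2 V_t}{\partial r_t^2} \\
& {} + (r_t - d_t - q_t + \eta_t \lambda_t)\, S_t \frac{\partial V_t}{\partial S_t}
  + \mu_r \frac{\partial V_t}{\partial r_t}
  + \lambda_t V_t(S_t^{+}, t). \tag{4.8}
\end{align*}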

Inequality constraints that follow from the optimal conversion, redemption,


and call strategies, as defined by Brennan and Schwartz [90], make the convertible
bond valuation problem a free-boundary problem that can be formulated as a
variational inequality. This is modeled below via a Lagrange multiplier p, which
adds or subtracts value to ensure that the constraints are being met.
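We will use the notation:

\begin{align*}
\mathcal{L} \equiv{}& \frac{\partial}{\partial t}
  + \frac{1}{2} \sigma_S^2 S_t^2 \frac{\partial^2}{\partial S_t^2}
  + \rho_t \sigma_S \sigma_r S_t \frac{\partial^2}{\partial S_t\, \partial r_t}
  + \frac{1}{2} \sigma_r^2 \frac{\partial^2}{\partial r_t^2} \tag{4.9} \\
& {} + (r_t - d_t - q_t + \eta_t \lambda_t)\, S_t \frac{\partial}{\partial S_t}
  + \mu_r \frac{\partial}{\partial r_t}
\end{align*}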

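to write the valuation equation in short as

\[
p_t = \mathcal{L} V_t - (r_t + \lambda_t) V_t + \lambda_t V_t(S_t^{+}, t), \tag{4.10}
\]

together with conditions

\begin{align*}
& \max(n_t S_t, MP_t) \le V_t \le \max(MC_t, n_t S_t), \tag{4.11} \\
& \max(n_t S_t, MP_t) < V_t < \max(MC_t, n_t S_t) \;\Rightarrow\; p_t = 0, \tag{4.12} \\
& V_t = \max(n_t S_t, MP_t) \;\Rightarrow\; p_t \le 0, \tag{4.13} \\
& V_t = \max(MC_t, n_t S_t) \;\Rightarrow\; p_t \ge 0. \tag{4.14}
\end{align*}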

If the bond pays coupons discretely, typically every year or half year, let
$K(r_t, t_c)$¹⁰ be the amount of discrete coupon paid on date $t_c$. Then the following
condition must be imposed in order to avoid arbitrage opportunities:

\[
V_t(r_t, S_t, t_c^{-}) = V_t(r_t, S_t, t_c^{+}) + K(r_t, t_c). \tag{4.15}
\]

Such discrete cash flows may be incorporated in the governing valuation equation
by adding a Dirac delta function term $K\,\delta(t - t_c)$ to the RHS of (4.10).
The final condition for the convertible bond is the exercise condition at the
maturity time $T$,

\[
V_T(r_T, S_T, T) = \max\left( n_T S_T, F + K(r_T, T) \right) = \max\left( n_T S_T, \tilde{F} \right), \tag{4.16}
\]

where we have introduced the adjusted face value $\tilde{F} = F + K(r_T, T)$.

Solving (4.10)–(4.14) subject to boundary, final, and jump conditions gives
the theoretical value of the convertible bond.
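
In a numerical scheme, the bounds in (4.11) and the maturity condition (4.16) are typically applied node by node. The sketch below shows one simple way to do this by projecting a continuation value onto the admissible interval; it is an illustration under stated assumptions, not the iterative free-boundary method of chapter 9. The function names are hypothetical, and the put price is assumed not to exceed the call price.

```python
def apply_cb_constraints(continuation_value, conversion_ratio, stock,
                         put_price, call_price):
    """Project a continuation value onto the bounds of (4.11).

    Lower bound: the holder can convert or put, max(n*S, MP).
    Upper bound: the issuer can call, but the holder may then convert,
    max(MC, n*S). Assumes put_price <= call_price.
    """
    conversion_value = conversion_ratio * stock
    lower = max(conversion_value, put_price)
    upper = max(call_price, conversion_value)
    return min(max(continuation_value, lower), upper)


def maturity_payoff(conversion_ratio, stock, face_value, final_coupon=0.0):
    """Payoff at maturity: convert, or take face value plus the final coupon."""
    return max(conversion_ratio * stock, face_value + final_coupon)


if __name__ == "__main__":
    # One conversion share per bond, put at 98, call at 110 (made-up terms)
    for v, s in [(95.0, 90.0), (105.0, 90.0), (120.0, 90.0), (120.0, 130.0)]:
        print(v, s, apply_cb_constraints(v, 1.0, s, 98.0, 110.0))
```

Where the projection changes the continuation value, the constraint is active and the multiplier $p_t$ in (4.10) is nonzero.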

Splitting Procedures Given the hybrid nature of convertibles, it is possible and


often desirable to split the value V of the convertible into a bond part W and an
equity part U. Early models valued CBs by replication as a portfolio of a bond
and a warrant. Unfortunately, this approach is limited to the case when the bond
is convertible only at expiry and there are no other embedded options, such as call
and put features. In general, the two parts are linked and the valuation problem is a
coupled system of equations. Splitting models allows a different credit treatment to
be applied to the debt and equity parts. This may be of interest to investors in order
to identify their risks and be able to hedge them.
Let us assume the value of the bond is split into an equity part U and a bond part
W. U is the part related to payments in equity, and therefore includes the conversion
and call option. W is related to payments in cash, and includes the coupons and the
put option. In general, both are derivatives on the underlying stock price and the
instantaneous interest rate, and will follow partial differential equations similar to
(4.10) with default values given by $W^*$ and $U^*$, respectively. The two parts have
embedded early exercise features and therefore follow inequalities with Lagrange
multipliers $p^W$ and $p^U$,

$$\mathcal{L}W_t - (r_t+\lambda_t)W_t + \lambda_t W^*_t(S^*_t,t) = p^W_t, \qquad (4.17)$$

$$\mathcal{L}U_t - (r_t+\lambda_t)U_t + \lambda_t U^*_t(S^*_t,t) = p^U_t. \qquad (4.18)$$

10 Most CBs pay fixed coupons. Some pay floating coupons, hence the notation $K(r_t, t_c)$.



To be fully specified we need to supply inequality constraints and final conditions
to (4.17) and (4.18). At the final time the payoff to the convertible is given by (4.16):

$$V_T(r_T, S_T, T) = \max(n_T S_T, \bar F) \qquad (4.19)$$

The splitting determines how $V_T$ is allocated between $W_T$ and $U_T$. How to decompose
the convertible value, or equivalently how to define the bond and equity parts,
is an open and controversial matter. Three possible splittings are (see figure 4.13):

■ Splitting 1. $U_T$: asset-or-nothing call, $W_T$: cash-or-nothing put
■ Splitting 2. $U_T$: equity, $W_T$: equity premium (put)
■ Splitting 3. $U_T$: risk premium (warrant), $W_T$: bond floor

The motivation of the splitting is to apply a different credit treatment to equity


and debt. Originally, in the TF model, the main objective was to use a different
discount factor for the debt part and the equity part, such that the bond part
is discounted with the risky rate and the equity part with the risk-free rate. The
same effect may be achieved without the splitting by modeling the hazard rate as
a function of the stock price. However, if we want to use a different recovery in
bond and equity, a splitting is necessary in order to define the recovery value of
the convertible. It will be mandatory to solve a coupled system of equations only
when the default value of the convertible $V^*$ depends explicitly on one or both of
the values of the equity part U and the bond part W.

4.4.3 Detailed Specification of the Model


In this section we discuss in detail the remaining components of the model, namely
the hazard-rate specification, the recovery value, and the conversion rights upon
default.

[Figure: three panels (Splitting 1, Splitting 2, Splitting 3), each showing the payoff decomposition V = W + U plotted against S, with the strike X marked on the horizontal axis.]

FIGURE 4.13 Possible splittings of the CB payoff at maturity.



TABLE 4.4 Models for the hazard rate


Davis and Lischka [108] t

Olsen [120]
Takahasi, Kobayahashi, and Nakagawa [114] c k Sa
Ayache, Forsyth, and Vetzal [117]
Arvanitis and Gregory [103] k exp aSt d
Das and Sundaram [121] k exp brt c(T t) d Sa

The Hazard-Rate Process As an alternative to stochastic hazard rate (chapter 3,


Davis and Lischka [108]), herein we may use a deterministic function of the state
variables and time. Many parameterizations could be applied; table 4.4 shows some
specifications that have been used in the literature. Several authors model the hazard
rate as a function of the stock price only, and impose negative correlation via a
power or an exponential function. In both specifications the spread is a monotonic
decreasing function of the stock price; but only the power function guarantees an
infinite hazard rate for zero stock price, which may be a desirable property (see
Olsen [120]). Recently Das and Sundaram [121] have combined an exponential
dependency on the interest rate with a power dependency on the stock price, and
have added the time to maturity. They calibrated the hazard rate using market
prices of CDSs and historical data, and they tested this approach empirically by
pricing some real convertible bonds. Their results seem very satisfactory. Andersen
and Buffum [122] are concerned with the simultaneous calibration of the hazard
rates and the volatility smiles; they point out the need to make the hazard rate time
dependent to avoid mispricing.
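The qualitative difference between the power and exponential specifications can be seen in a few lines. The sketch below assumes the forms $c + kS^{-a}$ and $k\,e^{-aS} + d$ with $a > 0$; these explicit signs and parameter names are illustrative assumptions, not the exact parameterizations of the cited papers.

```python
import math

def hazard_power(s, k, a, c=0.0):
    """Power form c + k * s**(-a) with a > 0: diverges as s -> 0."""
    return c + k * s ** (-a)

def hazard_exponential(s, k, a, d=0.0):
    """Exponential form k * exp(-a * s) + d with a > 0: bounded by k + d at s = 0."""
    return k * math.exp(-a * s) + d
```

Both functions decrease in the stock price, but only the power form grows without bound as the stock price approaches zero, while the exponential form stays finite.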

The Recovery Value Regarding the recovery of defaultable claims, many models (as
reviewed by Schönbucher [102] and Bielecki and Rutkowski [124]) have been proposed
in the literature: recovery of treasury (RT), recovery of par (RP), multiple defaults
(MD), recovery of market value (RMV), zero recovery (ZR) and stochastic recovery.
The RT is very convenient from the computational point of view. The reason
is that the price of a defaultable issue under RT is a weighted average of the
default-free instrument and the price under zero recovery, which is usually easy to
compute. However, the RT can produce unrealistic shapes of spread curves and
recoveries above 100%. The RP and RMV models are similar for issues close to
par. The RMV is more consistent for the pricing of credit risk derivatives, but it
does less well in pricing downgraded and distressed debt. The RMV is very elegant,
in the sense that pricing of financial instruments can be done by discounting with
the adjusted defaultable rate $r + \lambda(1 - R)$, where $\lambda$ is the hazard rate and $R$ is the
recovery rate. In RP the pricing is more complicated. Both models are suited for
the calibration of the implied credit spreads, although in RMV it is not possible to
separate the calibration of the hazard rate, $\lambda$, and the loss rate, $1 - R$. RMV cannot
be used with firm-value models, whereas the RP can be used in intensity-based and
firm-value models. Finally, the intuition behind both models is different: the RMV
is motivated by the idea of reorganization and renegotiation of debts; the RP is
motivated by the idea of bankruptcy proceedings under an authority ensuring strict
relative priority.
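As a small illustration of RMV discounting, the sketch below prices a unit defaultable zero-coupon bond with the adjusted rate $r + \lambda(1-R)$. It assumes constant $r$, $\lambda$, and $R$ purely for simplicity; the function name is hypothetical.

```python
import math

def rmv_zero_coupon_price(r, hazard, recovery, maturity):
    """Unit zero-coupon bond under RMV with constant inputs (an assumption)."""
    adjusted_rate = r + hazard * (1.0 - recovery)
    return math.exp(-adjusted_rate * maturity)
```

Because only the product $\lambda(1-R)$ enters, the pairs $(\lambda, R) = (2\%, 50\%)$ and $(4\%, 75\%)$ give the same price, which is why the hazard rate and the loss rate cannot be calibrated separately in RMV.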

Suppose default occurs at time $\tau$. We define the recovery value on the CB,
$V^*_\tau$, as the sum of the recovery values on the bond and equity parts, $W^*_\tau$ and $U^*_\tau$,
respectively,

$$V^*_\tau = W^*_\tau + U^*_\tau \qquad (4.20)$$

Conversion Rights upon Default Another issue regarding the default value is whether
or not the model should allow for conversion upon default. Realdon [129] showed
that it can be rational for CB holders to convert when the debtor approaches
distress. In the pricing literature, only AFV allow for conversion upon default. This
is consistent with the assumption that the stock price falls on default by a given
fraction and does not necessarily vanish. We adopt their assumption and redefine
the bond value upon default as the maximum of the conversion value and the
recovery value. In this case the pricing equation can be written as

$$p_t = \mathcal{L}V_t - (r_t+\lambda_t)V_t + \lambda_t\max\bigl(n_t S_t(1-\eta_t),\; V^*_t\bigr) \qquad (4.21)$$

No other models explicitly consider holder rights on default. However, given that
DL and TKN assume the stock price jumps to zero upon default, the conversion
option is worthless.

Previous Models as Special Cases of General Framework Most of the previous models
fit into the general framework presented above. They differ in the particular
specification of the hazard rate, the loss rate, and the recovery value.
The differences are summarized in table 4.5.

■ Davis and Lischka

Their equation is a special case of (4.10) for deterministic interest rate, loss rate
equal to 1, and recovery of par.

■ Takahashi, Kobayashi, and Nakagawa

Their equation is a special case of (4.10) for deterministic interest rate, loss rate
equal to 1, and recovery of market value.

■ Tsiveriotis and Fernandes

TABLE 4.5 Comparison of previous models


Model Loss rate Default value V U W
U W
TF 0 U RW
TKN 1 RU RW
DL 1 0 RF
AFV nS 1 RW 0 RW
AFV total default 1 0 0
AFV partial default 0 nS 0

Although TF do not discuss default, and they model credit risk via a credit
spread, a posteriori we could identify their model in the more general setting of the
previous section. The equation they propose for the total value of the convertible is
the one-factor counterpart of (4.10) for zero loss rate, $\eta$, constant hazard rate, $\lambda$,
equal to the credit spread, $r_c$, and value upon default, $V^*$, equal to the equity part of
the bond, $U$,

$$\eta_t = 0, \qquad (4.22)$$
$$\lambda_t = r_c, \qquad (4.23)$$
$$V^*_t = U_t \qquad (4.24)$$

This means that in the event of default the stock price does not jump. Also the
bond part vanishes, and therefore the holder is not entitled to any cash flows, but
conversion is allowed at any time after default. This was pointed out by AFV.
We would rather give the following interpretation. If we write the credit spread,
$r_c$, as the product of a hazard rate, $\lambda$, and a loss rate $1 - R$, where $R$ is the recovery
rate on the bond part, it can be easily shown that the default value of the convertible
turns out to be $V^*_t(S^*_t, t) = U_t(S_t, t) + R\,W_t(S_t, t)$. This means that on default the
total equity part is recovered, which is consistent with the fact that the stock price
does not jump on default, or equivalently the recovery on equity is one. On the
other hand, the recovery on the bond part is not zero. Therefore, TF can be seen as
a special case of (4.10) with zero loss rate and recovery a fraction of bond and
equity part.

4.4.4 Analytical Solutions for a Special CB


We consider a special case for which an analytical solution is available: a zero
coupon bond, which is convertible only at expiry. It is well known that the value of
such a convertible may be written as the sum of the value of the straight bond plus
$n$ call options on the underlying stock with strike price $X = F/n$. Indeed

$$
\begin{aligned}
V(r, S, T) &= \max(nS, F) \\
&= \max(nS - F, 0) + F \\
&= n\max\!\left(S - \frac{F}{n},\, 0\right) + F \qquad (4.25)
\end{aligned}
$$
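The decomposition in (4.25) is easy to check numerically. The following sketch compares the convertible payoff with the payoff of the straight bond plus $n$ calls struck at $F/n$; the function names are illustrative only.

```python
def cb_payoff(n, s, face):
    """Payoff at expiry of a zero-coupon bond convertible into n shares."""
    return max(n * s, face)

def bond_plus_calls_payoff(n, s, face):
    """Face value plus n call options with strike X = face / n."""
    strike = face / n
    return n * max(s - strike, 0.0) + face
```

For example, with $n = 2$ and $F = 100$ the strike is 50, and both payoffs equal 100 for $S \le 50$ and $2S$ above it.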

For simplicity, we assume the default value on the bond, $V^*$, is independent of the
state variables, although it may depend on time. This includes, for example, the RP
and RT models. Closed-form solutions under other recovery models can easily be
found. The value of the convertible may be written as

V r, S, t FZ r, t, T
T
ds qs s s ds
n St e t N d1 XZ r, t s N d2
T
sV s Z r, t s SV t, s ds, (4.26)
t

where

■ $Z(r, t, s)$ is the Vasicek discount factor from time $t$ to time $s$
■ $SV(t, s)$ is the survival probability of the issuer from time $t$ to time $s$
■ And
T
ds qs ds
1 St e t s s 1
d1 ln Var,
Var XZ t, t T 2

d2 d1 Var,

with
T T T
2 2
Var S s ds r s ds 2 s S s r s ds
0 0 0

4.5 EXCHANGEABLE BONDS

The valuation of convertible bonds (CBs) under stochastic short-rate models and
deterministic hazard rates is well established. In this section we introduce a minor
extension that nevertheless introduces an interesting new feature.
We consider the case of a bond that is convertible into stock at the option of
the holder, exactly as for a standard CB, but which is issued by an issuer other than
the company into whose stock the bond is convertible. Such instruments are known
as exchangeable bonds.11 A typical use of this latter structure is for a company to
issue bonds convertible into shares of another company in which it already has a
stake, thereby reducing its exposure to that stock, and effectively selling its stake in
the event that the bond is converted.
The principal new feature for modeling that arises here is the exposure to two
credits. The company whose stock underlies the bond can default, or the issuer can
default, and the consequences for the bondholder of these two possible defaults are
quite different. We do not consider here any correlation between the defaults.

4.5.1 The Valuation PDE


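We use the following processes for the equity and the short rate:

$$\frac{dS_t}{S_t} = (r_t + \lambda^S_t)\,dt + \sigma^S_t\,dZ^S_t - dN_t,$$

$$dr_t = (\theta_t - \kappa_t r_t)\,dt + \sigma^r_t\,dZ^r_t,$$

$$\mathbb{E}\left[dZ^S_t\,dZ^r_t\right] = \rho(S_t, r_t, t)\,dt,$$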

where

11 Or occasionally as synthetic convertible bonds.



■ $r_t$ is the instantaneous short rate, modeled as a Vasicek process (section 3.3.1).
■ $\sigma^S_t$ is the volatility of the share, which requires calibration as in chapter 8.
■ $N_t$ is a Poisson process with a deterministic intensity $\lambda^S_t$ (hazard-rate function).
■ $\theta_t$ and $\kappa_t$ are the long-term mean and mean reversion speed, respectively.
■ $\sigma^r_t$ is the volatility of the short rate.
■ $Z^S_t$ and $Z^r_t$ are correlated Wiener processes.
■ $N_t$ models exogenous default events of the company that issued the stock. At a
jump time for $N_t$, the share price is assumed to fall to zero.

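Let $V(S_t, r_t, t)$ be the price of an exchangeable bond. Since $V$ is subject to
two sources of default risk, the issuer and the underlying, we introduce $\lambda^B_t$
for the deterministic hazard rate function of the issuer. $V(S_t, r_t, t)$ then satisfies the
following PDE:

$$
\begin{aligned}
&\frac{\partial V}{\partial t} + \frac{1}{2}(\sigma^S_t S_t)^2\frac{\partial^2 V}{\partial S_t^2} + \rho\,\sigma^S_t\sigma^r_t S_t\frac{\partial^2 V}{\partial S_t\,\partial r_t} + \frac{1}{2}(\sigma^r_t)^2\frac{\partial^2 V}{\partial r_t^2} + (r_t + \lambda^S_t)\,S_t\frac{\partial V}{\partial S_t} \\
&\quad + (\theta_t - \kappa_t r_t)\frac{\partial V}{\partial r_t} - (r_t + \lambda^B_t + \lambda^S_t)\,V + \lambda^S_t R^S(t, r_t) + \lambda^B_t R^B(t) = 0 \qquad (4.27)
\end{aligned}
$$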

In (4.27) we introduce the following quantities:

■ $R^B(t)$ is the value recovered by the holder in case of default of the issuer: the
"recovery value" of the exchangeable bond. We may decompose this value a
little, thus

$$R^B(t) = P(t, t + TLP) \times LP$$

in which

— LP is the liquidation proceeds: the amount eventually received by the holder,
delayed perhaps some while after the default event, by an interval known as
the time to liquidation proceeds.
— TLP is the time to liquidation proceeds: the interval that elapses, or is
assumed to elapse for modeling purposes, between the default and the
eventual payment of the liquidation proceeds.
— $P(t, t + TLP)$ is the discount factor from the default time $t$ to $t + TLP$.

The liquidation proceeds may in turn be broken down thus:

$$LP = RR \times PP(t) + \text{accrued}$$

in which

— PP is the principal payable: the amount due to bondholders in the event of
default. It could in principle be a function of time (Original Issue Discount
bonds, zero coupon bonds issued at a large discount to par, might have this
feature) but will typically be the full notional.
— RR is the recovery rate: the fraction of the amount due to a bondholder
which the issuer actually pays (or, again, is assumed to pay for the purposes
of modeling).
— The accrued is the amount accrued, at the time of default, on the
coupon accruing at that time.

■ $R^S(t, r_t)$ is the value of the exchangeable bond if the underlying defaults. After


the underlying defaults, the exchangeable loses all its equity value (the value due
to the holder’s conversion right), but the issuer is still obliged to make the coupon
and redemption payments. Accordingly, the residue is simply an ordinary bond,
although it retains its issuer call feature.
However, the call feature will not have any value once the stock price has
dropped to zero: if it is a soft call, then the trigger level will manifestly not be
satisfied, and if not, then the call price would, in practice, be set sufficiently high
as not to terminate the pure bond without equity content. Hence, $R^S(t, r_t)$ can
be approximated by the price of a standard coupon-bearing bond under Vasicek
interest rates. Note that this is a risky bond, subject to default of the issuer, with
recovery value equal to $R^B$. The closed-form solution for such a bond is given by

$$R^S(t, r_t) = \sum_i K_i\,SV^B(t, t_i)\,P(r, t, t_i) + N\,R\,SV^B(t, T)\,P(r, t, T) + RV(r, t),$$

where

— $i = 1, \ldots, N$ runs over the remaining coupons.
— $K_i$ is the coupon amount payable at $t_i$.
— $SV^B(t, t_i)$ is the survival probability of the issuer from $t$ to $t_i$.
— $P(r, t, t_i)$ is the zero-coupon bond price seen at time $t$ and maturing at $t_i$,
given $r$.
— $N$ is the notional.
— $R$ is the redemption factor (generally equal to 1, but greater than 1 for
premium redemption bonds).
— $RV(r, t)$ is the expected value of recovery in case of default of the issuer as
of time $t$.

This function depends on time $t$ and the state variable $r$, so it needs to be
evaluated at every node in the FD grid. The recovery contribution to the value
of the bond, $RV(r, t)$, is given by

$$RV(r, t) = \int_t^T R^B(s)\,SV^B(t, s)\,P(r, t, s)\,\lambda^B_s\,ds$$
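The structure of this risky-bond formula (survival-weighted coupons, survival-weighted redemption, and a recovery integral) can be sketched numerically. The code below is a simplified illustration, not the Vasicek closed form: it assumes a flat interest rate, a flat issuer hazard rate, and a constant recovery value, evaluates everything as of $t = 0$, and approximates the recovery integral with the trapezoidal rule. All names are hypothetical.

```python
import math

def risky_bond_value(coupons, notional, redemption, maturity,
                     rate, hazard, recovery_value, steps=2000):
    """Coupon bond value with issuer default risk, seen from t = 0.

    coupons: list of (t_i, K_i) pairs. Flat `rate` and `hazard` stand in for
    the Vasicek discount factor and the issuer survival curve (assumptions).
    """
    def discount(s):   # stands in for P(r, t, s)
        return math.exp(-rate * s)

    def survival(s):   # stands in for SV^B(t, s)
        return math.exp(-hazard * s)

    value = sum(k * survival(ti) * discount(ti) for ti, k in coupons)
    value += notional * redemption * survival(maturity) * discount(maturity)

    # Recovery term: integral of R^B * SV^B * P * hazard over (0, T)
    h = maturity / steps
    integrand = [recovery_value * survival(i * h) * discount(i * h) * hazard
                 for i in range(steps + 1)]
    rv = h * (sum(integrand) - 0.5 * (integrand[0] + integrand[-1]))
    return value + rv
```

With a zero hazard rate the function collapses to the riskless bond price, which gives a quick sanity check of an implementation.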

4.5.2 Coordinate Transformations for Numerical Solution


In order to use (4.27) directly, the drift term $\theta_t$ is needed. The expression for
this function involves second derivatives of $P(0, t)$, so a piecewise linear function
for $P(0, t)$ (or the yield $R(0, t)$) is not sufficiently smooth. We can transform the
problem such that $\theta_t$, or first derivatives of $P(0, t)$, are not required. In this we follow
section 3.3.1.
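The variable

$$y_t = r_t - f(0, t)$$

is introduced. This follows the process

$$dy_t = (V(t) - \kappa_t y_t)\,dt + \sigma^r_t\,dW^r_t,$$

where $V(t)$ is the variance of $r_t$,

$$V(t) = \int_0^t (\sigma^r_s)^2 \exp\bigl(2(\beta_s - \beta_t)\bigr)\,ds$$

and

$$\beta_t = \int_0^t \kappa_s\,ds$$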

Since $f(0, t)$ is the expected future short rate in the $t$-forward measure, PDE
grids that remain centered on $y_t = 0$ will always capture the relevant region.
The process for the equity becomes

$$\frac{dS_t}{S_t} = (y_t + f(0, t) + \lambda^S_t)\,dt + \sigma^S_t\,dW^S_t - dN_t,$$

which contains the possibly nonsmooth functions $f(0, t)$ and $\lambda^S_t$. To remove these, we
use the variable

$$x_t = \log\bigl(SV^S(0, t)\,P(0, t)\,S_t / S_0\bigr)$$

where $SV^S(0, t)$ is the survival probability of the underlying stock, given by

$$SV^S(0, t) = \exp\left(-\int_0^t \lambda^S_s\,ds\right)$$

We have the processes

$$dx_t = \left(y_t - \tfrac{1}{2}(\sigma^S_t)^2\right)dt + \sigma^S_t\,dW^S_t,$$

$$dy_t = (V(t) - \kappa_t y_t)\,dt + \sigma^r_t\,dW^r_t$$

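If $V(S_t, r_t, t) = X(x, y, t)$, the pricing PDE in terms of the x-y variables becomes

$$
\begin{aligned}
&\frac{\partial X_t}{\partial t} + \frac{1}{2}(\sigma^S_t)^2\frac{\partial^2 X_t}{\partial x_t^2} + \rho\,\sigma^S_t\sigma^r_t\frac{\partial^2 X_t}{\partial x_t\,\partial y_t} + \frac{1}{2}(\sigma^r_t)^2\frac{\partial^2 X_t}{\partial y_t^2} + \left(y_t - \tfrac{1}{2}(\sigma^S_t)^2\right)\frac{\partial X_t}{\partial x_t} \\
&\quad + (V(t) - \kappa_t y_t)\frac{\partial X_t}{\partial y_t} - \bigl(y_t + f(0, t) + \lambda^B_t + \lambda^S_t\bigr)X_t + \lambda^S_t R^S(t, r_t) + \lambda^B_t R^B(t) = 0
\end{aligned}
$$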
yt
This has removed second derivatives of the yield curve ($\theta_t$), and $f(0, t)$ and $\lambda^S_t$ from
the convection (drift) term. However, the PDE still contains $f(0, t)$, $\lambda^S_t$ and $\lambda^B_t$ in the
reaction (discounting) term. To remove those (and improve convergence in the event
of discontinuous forward or hazard rates), we can solve for a deterministically risky
discounted form of $X$, that is,

$$Y(x, y, t) = X(x, y, t)\,P(0, t)\,SV^B(0, t)\,SV^S(0, t) = X(x, y, t)\,\Pi(0, t)$$

where

$$\Pi(0, t) = P(0, t)\,SV^B(0, t)\,SV^S(0, t)$$

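This follows the PDE

$$
\begin{aligned}
&\frac{\partial Y_t}{\partial t} + \frac{1}{2}(\sigma^S_t)^2\frac{\partial^2 Y_t}{\partial x_t^2} + \rho\,\sigma^S_t\sigma^r_t\frac{\partial^2 Y_t}{\partial x_t\,\partial y_t} + \frac{1}{2}(\sigma^r_t)^2\frac{\partial^2 Y_t}{\partial y_t^2} + \left(y_t - \tfrac{1}{2}(\sigma^S_t)^2\right)\frac{\partial Y_t}{\partial x_t} \\
&\quad + (V(t) - \kappa_t y_t)\frac{\partial Y_t}{\partial y_t} - y_t Y_t + \bigl(\lambda^S_t R^S(t, r_t) + \lambda^B_t R^B(t)\bigr)\,\Pi(0, t) = 0 \qquad (4.28)
\end{aligned}
$$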
yt

In practice, at every time step from time $t_1$ to time $t_2$, with $t_2 < t_1$, we solve for the
function

$$\tilde Y(x, y, t) = \frac{Y(x, y, t)}{\Pi(0, t_1)}$$

Notice that, at a known time step, $t_1$, we have

$$\tilde Y(x, y, t_1) = X(x, y, t_1)$$

In particular, at maturity

$$\tilde Y(x, y, T) = X(x, y, T)$$

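therefore, solving for $\tilde Y$, we do not need to rescale the payoff. The PDE for $\tilde Y(x, y, t)$
becomes

$$
\begin{aligned}
&\frac{\partial \tilde Y}{\partial t} + \frac{1}{2}(\sigma^S_t)^2\frac{\partial^2 \tilde Y}{\partial x_t^2} + \rho\,\sigma^S_t\sigma^r_t\frac{\partial^2 \tilde Y}{\partial x_t\,\partial y_t} + \frac{1}{2}(\sigma^r_t)^2\frac{\partial^2 \tilde Y}{\partial y_t^2} + \left(y_t - \tfrac{1}{2}(\sigma^S_t)^2\right)\frac{\partial \tilde Y}{\partial x_t} \\
&\quad + (V(t) - \kappa_t y_t)\frac{\partial \tilde Y}{\partial y_t} - y_t \tilde Y + \bigl(\lambda^S_t R^S(t, r_t) + \lambda^B_t R^B(t)\bigr)\frac{\Pi(0, t)}{\Pi(0, t_1)} = 0 \qquad (4.29)
\end{aligned}
$$

where

$$\frac{\Pi(0, t)}{\Pi(0, t_1)} = \frac{1}{\Pi(t, t_1)} = \frac{1}{P(t, t_1)\,SV^B(t, t_1)\,SV^S(t, t_1)}$$

After solving this equation between two times $t_1$ and $t_2$, we can multiply by the risky
discount factor between these two times

$$\Pi(t_2, t_1) = \frac{P(0, t_1)\,SV^B(0, t_1)\,SV^S(0, t_1)}{P(0, t_2)\,SV^B(0, t_2)\,SV^S(0, t_2)}$$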
This is equivalent to solving for X(x, y, t).

Local Volatility The implied volatility of a European option is affected by both


equity volatility and interest rate volatility, and the calibration of the equity local
volatility needs to take this into account as described in chapter 8, alongside the
possible jump to zero of the stock. This done, the algorithm will correctly reprice
European options in the presence of deterministic default.

Implementation The recovery of the exchangeable RB (t) in the event of issuer default
is a function only of time if the time to liquidation proceeds is assumed to be zero.
Otherwise, it is an approximation to drop the r-dependence. (The same calculation
appears in the evaluation of a CB in a deterministic interest rate model, and in the
valuation of a straight defaultable bond with recovery in a similar model.)
The value of the exchangeable bond at the time the underlying defaults, $R^S(t, r_t)$,
enters the PDE (4.29) alongside $R^B(t)$ in the source term

$$\bigl(\lambda^S_t R^S(t, r_t) + \lambda^B_t R^B(t)\bigr)\frac{\Pi(0, t)}{\Pi(0, t_1)}$$

This quantity is evaluated at every point on the r-grid, at every time step.

An Example To make the discussion less abstract, we can consider a real example.
In 2004, Banca Monte Dei Paschi Di Siena S.p.a. (Banca MPS) issued a bond
convertible into the stock of Banca Nazionale Del Lavoro S.p.a. (BNL). Both are
Italian banks listed on the Milan exchange12 ; the former is the world’s oldest bank
and the latter one of Italy’s largest banking groups. Both are, unsurprisingly, good
credits: as of late 2005, 5-year Banca MPS credit default swaps traded below 20 bps
(basis points), rising to around 25 at ten years; five-year BNL credit default swaps
traded around 35 bps, rising to 45 at ten years.
The exchangeable bond in question matures in July 2009; carries a 1% coupon,
payable annually; and is convertible at any time after January 15, 2006, and callable
by Banca MPS after July 2007, subject to the stock trading above 3.09 for 20 out
of the preceding 30 business days. We can compare the valuations obtained for this
bond with a hypothetical bond issued by BNL (a marginally worse credit) on its own
stock; i.e., a standard convertible. Thus we keep the equity details unchanged in the
comparison. Moreover, as the hazard rate appearing in the $\partial V/\partial S_t$ term in (4.27) is
that of BNL, the stock drift term is itself unaltered in the comparison.
We also make a recovery assumption: for illustration, we will assume recovery
ratios of zero and 30% on default of the issuer. (In the exchangeable case, the
separate default of the stock does not need a recovery assumption.) The comparison

12 Reuters codes BMPS.MI and BANI.MI, respectively.


[Figure: bar chart comparing the convertible and the exchangeable under Recovery = 0% and Recovery = 30%; vertical axis from 119.00% to 120.60% of notional.]

FIGURE 4.14 Prices of exchangeable and hypothetical plain convertible bonds in percent of
notional, for zero and 30% recovery assumptions.

is shown in figure 4.14. The better credit of Banca MPS naturally tends to raise
the fair price for the exchangeable. We should also bear in mind that the issuer
being Banca MPS rather than BNL itself softens the effect of a potential default of
BNL, as in the exchangeable case such a default would yield the full value of the
coupon-bearing bond to the holder rather than, say, 30% of notional, which would
be the case if the issuer were BNL.

CHAPTER 5
Constant Proportion
Portfolio Insurance

5.1 INTRODUCTION TO PORTFOLIO INSURANCE

Portfolio insurance is a family of investment strategies designed to give the investor


the possibility to limit downside risk while benefiting from rallying markets. These
strategies protect the investor from falling markets and allow him to recover his initial
capital or less commonly a percentage of it. One well-known portfolio insurance
strategy is constant proportion portfolio insurance (CPPI). This was introduced by
Perold in 1986 for fixed income instruments and by Black and Jones in 1987 for
equity instruments.
In this chapter, we will introduce CPPI and discuss options on CPPI portfolios.
CPPI is a trading strategy intended to keep a constant proportional exposure to a
certain risky asset while guaranteeing a minimum value of the portfolio throughout
its life. Let us define a strategy as a pair of evolving weights (a(t), b(t)). The portfolio
consists of a risky fund denoted RF(t) and a floor denoted Floor(t). The value of the
strategy denoted CPPI(t)1 is then given by

$$CPPI(t) = a(t)\,RF(t) + b(t)\,Floor(t)$$

The risky asset could be an equity, commodity, or any other risky underlying.
The floor is almost always a bond, either a coupon-bearing bond or a zero bond.
Let us first introduce some standard key words that are commonly used when
dealing with CPPIs. Note that this section is essential to the understanding of the
rest of the chapter.

CPPI Key Words:

■ Floor: The reference level to which the CPPI value is compared; it could be seen
as the present value of the protected amount at maturity
■ Cushion: CPPI − Floor
■ Cushion%: Cushion/CPPI
■ Multiplier: A fixed number symbolizing how much leverage we put into the
structure; also called the gearing

1 This is also referred to as net asset value (NAV).


■ InvestmentLevel: The percentage invested in the risky asset portfolio; this is also
known as the exposure.

InvestmentLevel = Multiplier × Cushion%, that is, $e = m \times c$

Allocation Mechanism The rebalancing of the money between the risky asset and the
riskless bond is done in the following way:
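The investment level is computed first as follows:

$$IL_t = m \times \frac{CPPI_t - Floor_t}{CPPI_t}$$

And the CPPI index value is then computed as follows:

$$CPPI_t = CPPI_{t-1} \times \left[1 + IL_{t-1}\left(\frac{RF_t}{RF_{t-1}} - 1\right) + (1 - IL_{t-1})\left(\frac{Floor_t}{Floor_{t-1}} - 1\right)\right]$$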

This algorithm corresponds to the most basic CPPI in which we do not have any
special features. This will be discussed in detail in the following sections.
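The allocation mechanism above can be sketched as a short loop over rebalancing dates. This is an illustrative implementation of the basic rule only (no fees, no ratchet, no bounds on the investment level); the function name and the starting index value of 1.0 are assumptions made for the example.

```python
def cppi_path(risky, floor, multiplier, start=1.0):
    """CPPI index values under the basic allocation rule.

    risky, floor: equal-length price paths of the risky fund and the floor,
    observed on the rebalancing dates. `start` is the initial index value.
    """
    cppi = [start]
    for t in range(1, len(risky)):
        prev = cppi[-1]
        # Investment level fixed at the previous rebalancing date
        il = multiplier * (prev - floor[t - 1]) / prev
        risky_ret = risky[t] / risky[t - 1] - 1.0
        floor_ret = floor[t] / floor[t - 1] - 1.0
        cppi.append(prev * (1.0 + il * risky_ret + (1.0 - il) * floor_ret))
    return cppi
```

With a multiplier of 3, an index of 1.0, and a floor of 0.8, the investment level is 60%, so a 10% rise in the risky fund with an unchanged floor lifts the index to about 1.06.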
The CPPI itself is a hybrid underlying and needs a hybrid modeling framework
in order to account for the various risks embedded in it. The risky asset portfolio
could itself be a hybrid, as is the case in most of the strategies on hedge funds
and mutual funds. Indeed, hedge funds and especially fund of funds do execute
investment strategies that involve different types of underlyings.
Moreover, as we will see in section 5.5, flexi-portfolio CPPIs in general and
momentum and rainbow CPPIs in particular can be defined on a basket of hybrid
underlyings involving various asset classes ranging from equity to interest rates to
credit to commodities.
The remainder of this chapter will be organized as follows. First, we discuss
the most basic form of the CPPI, the classical CPPI. We then introduce various
restrictions and discuss their impact on the CPPI strategy. Pricing and hedging
options on the CPPI index is next. Finally, we introduce some nonstandard CPPIs,
namely off-balance-sheet, momentum, and perpetual CPPIs.

5.2 CLASSICAL CPPI

The classical CPPI is a self-financing strategy that rebalances the money between
the risky asset and the riskless one, depending on the performance of the former,
throughout its life. It has the following characteristics:

■ The floor is a zero bond whose redemption is the guaranteed capital.


■ No restriction is imposed on the investment level.
■ No fees are taken out of the CPPI index (see section 5.3.2).
■ No ratcheting (lock-in) is applied to the floor (see section 5.3.2).
[Figure: simulated paths of the CPPI, the equity fund, and the floor (vertical axis 0 to 900) against time in years (0 to 12).]

FIGURE 5.1 Example of a simulated CPPI strategy that does not deleverage.

There are other structures that relax the above restrictions and will be discussed
later.
When pricing an ATM option on a classical CPPI, the only risk we are dealing
with is the gap risk (assuming, of course, that we don’t have any liquidity issues).

The Gap Risk In extreme market conditions, the CPPI index could rapidly fall
below the floor before the insurance manager has the chance to rebalance his
portfolio. The CPPI index will not have a chance to recover, as the investment level
will have reached zero and the manager will be unable to repay the guaranteed
investment.
It is easily seen that the risky asset has to fall by more than $1/m$ ($m$ being
the multiplier) between two rebalancing dates for the CPPI index to drop below
the floor. Moreover, the greater the leverage, the faster the fund value drops as
the risky asset falls (at a rate proportional to the leverage), leaving the fund
manager correspondingly less opportunity to execute the rebalancing.
This means that we have a crash put option with a strike of $1 - 1/m$ embedded in the
strategy and the strategy is no longer a delta one² strategy.
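The $1/m$ threshold follows from the allocation rule: if the floor is held flat over one rebalancing period, the cushion is multiplied by $1 + m \times \text{return}$. The sketch below makes that explicit; the flat floor is a simplifying assumption and the function names are illustrative.

```python
def cushion_after_move(cushion, multiplier, risky_return):
    """Cushion after one rebalancing period, with the floor held flat (assumption)."""
    return cushion * (1.0 + multiplier * risky_return)

def breaches_floor(multiplier, risky_return):
    """True when the move wipes out more than the whole cushion."""
    return 1.0 + multiplier * risky_return < 0.0
```

With $m = 4$, a fall of exactly 25% exhausts the cushion without going through the floor, while a fall of 30% leaves the index below the floor.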
The graphs in figures 5.1 and 5.2 give examples of simulated outcomes of a
classical CPPI strategy: It can be seen from figure 5.2 that any upside gains made
by the strategy at the beginning fade away as soon as the risky asset drops, the
investment level can become zero, and the strategy will not recover. A series of
restrictions may be added to the classical CPPI strategy in order to allow the investor

2 A delta one product is a product whose payoff is linear in the underlying risky asset, that is, a
product whose risk can be hedged entirely with the risky asset.

[Figure: simulated paths of the CPPI, the equity fund, and the floor (vertical axis 0 to 240) against time in years (0 to 12).]

FIGURE 5.2 Example of a simulated CPPI strategy that deleverages completely.

to benefit from any upsides whenever they happen, even in cases where the classical
CPPI would have completely deleveraged.

The Continuous Time Classical CPPI If we consider a continuous time strategy with
no constraints on the floor or investment level, the pricing of options on a CPPI
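resembles that of power options. To illustrate this, let $V_t$ and $F_t$ denote the values
of the CPPI index and the floor at time $t$, respectively. We can therefore write

$$dV_t = (V_t - E_t)\frac{dF_t}{F_t} + E_t\frac{dS_t}{S_t}$$

where we have $V_0 = 1$, $F_{\text{maturity}} = 1$, $V_t = C_t + F_t$, $E_t = mC_t$ and $dF_t = rF_t\,dt$,
and where $C_t$ is the cushion at time $t$. The variation of the latter is given by:

$$dC_t = dV_t - dF_t = (V_t - E_t)\frac{dF_t}{F_t} + E_t\frac{dS_t}{S_t} - dF_t$$

Assuming that the risky asset $S_t$ is log-normal, $dS_t / S_t = \mu\,dt + \sigma\,dW_t$, we have
the following result³

$$C_t = \alpha_t S_t^m, \qquad \alpha_t = \frac{C_0}{S_0^m}\exp(\beta t), \quad \text{where } \beta = r - m\left(r - \frac{\sigma^2}{2}\right) - \frac{m^2\sigma^2}{2}$$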

3 See appendix B for details of the computation.


Finally, the value of the portfolio is given by

$$V_t = \alpha_t S_t^m + F_t$$

This result means that in the limit of continuous trading an option on the CPPI
strategy is nothing other than a power option on the risky asset. However, the
rebalancing in most CPPI strategies is not continuous, and moreover most of them
contain one or more restricting features. In the next section we present the most
common restrictions imposed on the classical CPPI strategy.
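The continuous-time result for the cushion can be written as $C_t = \alpha_t S_t^m$ with $\alpha_t = (C_0/S_0^m)\exp(\beta t)$ and $\beta = r - m(r - \sigma^2/2) - m^2\sigma^2/2$; the labels $\alpha$ and $\beta$ are notation assumed here. The following sketch evaluates that closed form under constant $r$ and $\sigma$; the function name is hypothetical.

```python
import math

def cushion_closed_form(c0, s0, s_t, t, m, r, sigma):
    """Cushion C_t = alpha_t * S_t**m in the continuous-time classical CPPI."""
    beta = r - m * (r - 0.5 * sigma ** 2) - 0.5 * m ** 2 * sigma ** 2
    alpha = c0 / s0 ** m * math.exp(beta * t)
    return alpha * s_t ** m
```

Two limiting cases serve as checks: for $m = 1$ the exponent $\beta$ vanishes and the cushion simply tracks the risky asset, and for $m = 0$ the cushion grows at the risk-free rate.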

5.3 RESTRICTED CPPI

The traditional CPPI strategy can be restricted or modified depending on the appetite
and whim of the investor. In this section, we examine some of the modifications that
may be made. Typically, these are motivated by risk aversion, legal constraints, or
performance.

5.3.1 Constraints on the Investment Level


In this section, we depart from the classical CPPI by introducing restrictions on the
fraction of the fund that may be invested in the risky asset, and we look briefly at
the implications on the CPPI and ATM options on the CPPI.

Minimum Investment Level As mentioned previously, if the risky asset falls substan-
tially and the investment level becomes zero, there is no chance for the strategy to
recover. To allow the strategy to pick up from a downturn, a minimum level of
investment in the risky asset may be imposed (also called minimum delta).
From the graph in figure 5.3 we can see that the CPPI index is not guaranteed
to end above par due to the minimum investment level restriction. This risk of not
recovering the initial investment implies that the value of an ATM option on the
CPPI index will increase. Indeed, the increase in the probability of not guaranteeing
the initial capital will increase the option value.

Maximum Investment Level A maximum investment level is sometimes imposed on


the CPPI strategy in order to reduce the gap risk and avoid an unbounded leverage.
In a classical CPPI, as the risky asset rises, the investment level becomes greater and
greater (the gearing effect) and the CPPI index becomes more like the pure risky
asset. As an example, in the case of the underlying risky asset being a mutual fund
or a basket of mutual funds, the investment level is capped at 100% due to legal
restrictions.
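
The two restrictions can be sketched as a clamp on the unconstrained investment level. This is only an illustration, assuming the form used in the example term sheet of appendix C, where the investment level is the multiplier times the cushion divided by the index, bounded below and above. The function name `investment_level` and its default bounds are hypothetical.

```python
def investment_level(cppi, floor, multiplier, min_il=0.0, max_il=1.0):
    """Fraction of the CPPI index held in the risky asset, after clamping.

    Assumed form: min(max_il, max(min_il, multiplier * (cppi - floor) / cppi)).
    """
    if cppi <= 0.0:
        raise ValueError("cppi must be positive")
    cushion = cppi - floor                        # distance of the index above the floor
    unconstrained = multiplier * cushion / cppi   # classical CPPI exposure
    return min(max_il, max(min_il, unconstrained))
```

For instance, with a multiplier of 4, an index of 100, and a floor of 99, the unconstrained level is 4%, and a 30% minimum investment level lifts it to 30%. With a multiplier of 5 and a floor of 60, the unconstrained level of 200% is capped at 100%.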

5.3.2 Constraints on the Floor


In the previous section, the implications of restrictions on the investment level were
considered. Restrictions or modifications may also be placed on the other component
of the CPPI fund, the floor. We examine those in the current section.
[Figure: line chart of the Restricted CPPI, Equity Fund, Floor, and Classical CPPI against time in years (0 to 12); vertical axis from 0 to 240.]

FIGURE 5.3 Example of a CPPI strategy with a minimum investment level of 30%.

Ratcheting When the market rallies, any gains made by the CPPI strategy could be
lost if a downturn occurs. Therefore, the investor is not guaranteed to benefit from
those gains. Ratcheting is introduced to allow the investor to lock in gains made
from upside movements of the market.
Ratcheting operates as follows: Whenever the CPPI strategy performs well and
reaches a new maximum, a percentage RP of that maximum is guaranteed to the
client (we say that we ratchet at RP%). This raising of the floor of the strategy
reduces the exposure to the risky asset, introduces a lookback effect to the strategy,
and adds, in some cases, more vega to the option by increasing the gap risk.
If we compare figures 5.2 and 5.4, we can see the benefit of ratcheting to the
investor. In these graphs, we have looked at the same simulated paths of the risky
asset, and the guaranteed amount with ratcheting is far bigger than without. Without
ratcheting (figure 5.2), the floor remains at a low level, whereas with ratcheting
(figure 5.4) it is raised, taking advantage of the sharp rise in the risky asset early
in the life of the strategy.
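
The lock-in mechanism described above can be sketched as a running maximum. This is a minimal illustration, assuming the guaranteed amount is updated at every observation as the larger of its previous value and RP times the current index level (the form used in the example term sheet of appendix C). The name `ratchet_lock_in` and the sample path are hypothetical, and the discounting of the guaranteed amount into a floor is left out.

```python
def ratchet_lock_in(index_path, ratchet_pct, initial_lock_in):
    """Guaranteed amount after each observation of the CPPI index.

    ratchet_pct is RP expressed as a fraction (0.85 for ratcheting at 85%).
    """
    lock_in = initial_lock_in
    levels = []
    for level in index_path:
        # The guarantee only ever moves up, when RP * index exceeds it.
        lock_in = max(lock_in, ratchet_pct * level)
        levels.append(lock_in)
    return levels
```

On the hypothetical path 100, 120, 110, 150, 90 with RP = 85% and an initial guarantee of 100, the guaranteed amount rises to 102 and then to 127.5, and it stays there when the index later falls to 90.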

Protected Fees When an investment manager manages a portfolio such as a CPPI
strategy for a client, he will be paid some fees in one way or another depending on
the performance of the portfolio. Distributors to the retail market also get paid fees for
the service.
In the case of a CPPI strategy, fees are usually taken from the CPPI index, but
there are cases in which they are taken from the risky asset portfolio. They, then, are
equivalent to a proportional dividend on the CPPI index paid at each rebalancing
date. These fees are cash amounts paid to the fund manager or to the investor. From
the perspective of the investor they are effectively coupons.
Usually, the fees continue to be paid out from the CPPI index even if the latter
drops close to or below the floor. This penalizes the CPPI portfolio, as it is likely to
end up below par in the case of a classical CPPI, for example. The remedy for this is
to protect the fees.

[Figure: line chart of the CPPI, Equity Fund, and Floor against time in years (0 to 12); vertical axis from 0 to 240.]

FIGURE 5.4 Effect of ratcheting on a CPPI strategy with minimum investment level of 30%.

When the fees of the strategy are protected, they are often added back to the
floor and the latter then becomes a coupon-bearing bond instead of a zero bond.
This lowers the investment level, lowers the leverage, and reduces the value of an
ATM CPPI option (it is like raising the strike level), which means less gap risk.

Straight-Line Floor Investors wanting to benefit from a drop in interest rates might
prefer a straight-line floor, one that varies linearly with time and so does not
correspond to a bond as in the classical case. Indeed, as interest rates fall, a bond
floor rises and the investment level reduces, which limits the benefit from the risky
asset performance. (Equity and bond prices tend to be negatively correlated: when
the equity market is doing badly, interest rates are in general cut to boost the
economy, which induces a rise in bond prices.) Accordingly, the stochasticity of
interest rates plays a role in the pricing of options on the CPPI index.
In cases in which the investor is relying on an early exit from the strategy, a
straight-line floor proves to be effective, as he or she knows with certainty what the
level of the floor will be at any time because the floor is insensitive to interest rates.
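
The contrast between the two floors can be illustrated with a short sketch. It assumes a flat, continuously compounded rate for the bond floor and a straight line running from a chosen initial level to the guaranteed amount at maturity. Both function names, the flat-rate assumption, and the sample levels are hypothetical.

```python
import math

def bond_floor(t, maturity, rate, guarantee=100.0):
    """Zero-bond floor: the guarantee discounted at a flat rate (assumption)."""
    return guarantee * math.exp(-rate * (maturity - t))

def straight_line_floor(t, maturity, start_level, guarantee=100.0):
    """Floor interpolated linearly in time from start_level to the guarantee."""
    return start_level + (guarantee - start_level) * t / maturity
```

Halfway through a 10-year strategy, the bond floor computed at a 2% rate lies above the one computed at 4%, so a fall in rates eats into the cushion. A straight-line floor started at 70 stands at 85 whatever the level of rates.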

5.3.3 An Example Structure


We consider the structure outlined in figure 5.5, whose term sheet is presented
in section 5.8, appendix C. The risky underlying is Eurostoxx50E and the riskless
asset is a coupon bond (protected fees). The ratcheting is applied during the first
five years at 100% (to invite more subscriptions) and at 85% thereafter (see section
5.3.2 for more details on ratcheting).
[Figure: diagram with nodes Bank A and Client, and flows labeled (Zero) Bond + Call on CPPI and Notional.]

FIGURE 5.5 Example of CPPI structure.

5.4 OPTIONS ON CPPI

5.4.1 The Pricing


The pricing is done within the framework presented in chapter 3, concerning interest
rate hybrids. We can add jumps to this framework to account for the gap risk.
A Monte Carlo simulation of the strategy is performed. The complexity of the
structure imposes the use of different models depending on the risks embedded in
the structure.
Indeed, depending on the nature of the underlying risky asset (e.g., a hedge fund
vs. a well-known index), the bond floor (emerging markets currency vs. USD or
EUR, for example), and the various features contained in the contract, the structure
could, for example, be very sensitive to the volatility of interest rates, to jumps in
the risky asset, or to both.
Therefore, the model used to price an option on a CPPI index could vary from
one structure to another depending on the risks embedded in each structure, as
explained above.

5.4.2 Delta, Gamma, and Vega Exposures


An option on a classical CPPI strategy is not a delta-one product, as the gap
risk introduces optionality. As soon as we introduce restrictions (and fees) on the
strategy, we introduce even more optionality. A similar conclusion could be drawn
about the vega exposure. Indeed, the vega of an option on a classical CPPI is not
zero, precisely because of gap risk, and as soon as we introduce restrictions, we
usually increase the vega exposure.
Dividends affect the strategy in a similar way to fees, with the difference being
that the dividends are taken only from the risky asset rather than from the CPPI
index. This lowers the forward of the strategy, because the risky asset drops, and
introduces more vega.
The frequency of rebalancing in a CPPI strategy could be different from the
hedging frequency (e.g., monthly rebalancing versus daily hedging), and this creates
a large gamma exposure when approaching rebalancing dates.

5.4.3 Hedging
Classical CPPI, if managed correctly and continuously, will always end up at or
above the guaranteed level, unless a downward jump occurs. Without jumps, an ATM
European put on a classical CPPI has no value and therefore no greeks. A model
incorporating jumps in the risky asset, such as Merton's jump-diffusion model [4], is
the only way to price this option (i.e., price the gap risk).
Risk management becomes more qualitative than quantitative. In many cases
it is difficult to hedge the greeks given by the model (when the risky asset is hedge
funds, mutual funds, etc.). In the case where the CPPI is managed by a third party,
the risk manager must ensure that the manager of the CPPI stays within the limits
so that the CPPI does not carry additional risk beyond the unavoidable gap risk.
Gap risk can be hedged by selling stability notes. These are basically a series
of knock-out OTM Cliquet puts (see section 2.1.6 for more details about Cliquet
structures) with a resetting frequency matching that of the CPPI (i.e., daily, weekly,
monthly). So the implied gap risk is redefined as a Cliquet put.
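
To make the size of the gap risk concrete, consider a single period without rebalancing. A minimal sketch, assuming the exposure is the multiplier times the cushion and ignoring interest on the floor and fees: the cushion after a risky-asset return R is C(1 + mR), so the index breaches the floor only if the risky asset falls by more than 1/m within the period. The function names are hypothetical.

```python
def gap_loss(cushion, multiplier, risky_return):
    """Shortfall below the floor after one unrebalanced move (0 if none).

    Simplification: no interest on the floor and no fees over the period.
    """
    new_cushion = cushion + multiplier * cushion * risky_return
    return max(0.0, -new_cushion)

def critical_drop(multiplier):
    """Largest one-period fall absorbed without breaching the floor."""
    return 1.0 / multiplier
```

With a cushion of 10 and a multiplier of 5, a 15% fall between two rebalancing dates leaves a positive cushion of 2.5, while a 25% fall leaves the index 2.5 below the floor.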
In the following section, we depart further from the classical CPPI and present
some nonstandard CPPI strategies that use techniques borrowed from the asset
management world.

5.5 NONSTANDARD CPPIS


The CPPI strategy is an innovation that comes from the asset management per-
spective. People often refuse to think of it as a derivative product, and this is why
several features, characteristics, and ways of thinking are inspired from there. The
fee structure we present here is just another example of emulating what is done
within the fund industry.

5.5.1 Complex Fee Structures


High-Water-Mark Fee Structure The initial motivation behind the development of
this strategy is the fact that clients are used to dealing with fund managers who
are paid on the relative performance of the fund over a certain period, a year in
general, which is well known in the hedge fund industry as high-water mark.4 In
this strategy, at every rebalancing date we lock in the highest performance of the
CPPI index, and the incentive fees are paid only if the new index level is higher than
the previous locked in value. Extra fees are taken on a regular basis, accounting for
administrative and running expenses.
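The CPPI algorithm looks like

$$\text{CPPI}_t = \text{CPPI}_{t-1}\left[\text{IL}_{t-1}\frac{RA_t}{RA_{t-1}} + (1 - \text{IL}_{t-1})\frac{B_t}{B_{t-1}} - f^r\,\Delta t\right] - f^i\,(\text{lockin}_t - \text{lockin}_{t-1})$$

$$\text{lockin}_t = \max(\text{lockin}_{t-1}, \text{CPPI}_t), \qquad \text{lockin}_0 = \text{CPPI}_0$$

where $f^r$ and $f^i$ denote the running and incentive fees, respectively, and $RA_t$, $B_t$,
and $\text{IL}_t$ denote the values of the risky asset, bond floor, and the investment level,
respectively, at a given time t.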
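
One period of this algorithm might be sketched as follows. The sketch assumes that the running fee accrues on the previous index level over the period, and that the incentive fee is charged on the increase in the locked-in level. Since the lock-in itself depends on the new index level, we further assume that it is compared with the index before the incentive fee is deducted; that convention, like the function name `hwm_step`, is our own and is not taken from the text.

```python
def hwm_step(cppi_prev, lock_in_prev, il_prev, ra_ratio, bond_ratio,
             running_fee, incentive_fee, dt):
    """One rebalancing period of a CPPI index with high-water-mark fees.

    ra_ratio and bond_ratio are RA_t / RA_{t-1} and B_t / B_{t-1}.
    Assumption: the lock-in is compared with the index level *before*
    the incentive fee is deducted, which avoids a circular definition.
    Returns (new index level, new lock-in).
    """
    gross = cppi_prev * (il_prev * ra_ratio
                         + (1.0 - il_prev) * bond_ratio
                         - running_fee * dt)
    lock_in = max(lock_in_prev, gross)
    # The incentive fee is paid only on a new high.
    fee_paid = incentive_fee * (lock_in - lock_in_prev)
    return gross - fee_paid, lock_in
```

Starting from an index and lock-in of 100 with half the index in the risky asset, a 10% rise over one year with a 1% running fee and a 20% incentive fee gives a gross level of 104, a new lock-in of 104, and an index of 103.2 after the incentive fee. A 10% fall gives 94 and no incentive fee.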

4 See section 5.8, appendix A for the definition of high-water mark.



Similar to the high-water mark fee structure, structures exist in which the fees
are tied to the investment level rather than the CPPI index. The philosophy behind
this choice is basically to make the asset manager’s earnings linked only to the risks
he manages and not to the riskless part of the fund; it is a risk-reward approach.

5.5.2 Dynamic Gearing


The multiplier serves as a leveraging and deleveraging tool and is fixed, in general,
for the life of the strategy. Its value is usually between 2.5 and 6, depending on the
risky asset. Ideally, one would like to leverage more when the risky asset portfolio
is performing well and less for the opposite scenario. This is achieved by exploiting
the anticorrelation between the volatility and the performance of the asset, and
considering the multiplier as a decreasing function of the realized volatility over a
certain period of time (one month in general).
As an example, we can define the multiplier as follows:

$$m_t = \min\left(\text{cap}, \frac{1}{\text{Vol}_t}\right)$$

where $\text{Vol}_t$ is the monthly realized volatility for the period [t - 1 month, t].
This would decrease the multiplier for a falling market and would prevent it
from reaching very high levels in case of a rallying market thanks to the introduction
of a cap.
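
A minimal sketch of this rule, assuming the realized volatility is the annualized standard deviation of daily log returns over the window (252 trading days per year). The estimator, the function names, and the fallback to the cap when the volatility is zero are our assumptions.

```python
import math

def realized_vol(prices, periods_per_year=252):
    """Annualized standard deviation of log returns over the window."""
    returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    if len(returns) < 2:
        raise ValueError("need at least three prices")
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return math.sqrt(var * periods_per_year)

def dynamic_multiplier(vol, cap):
    """m_t = min(cap, 1 / Vol_t); zero volatility falls back to the cap."""
    return cap if vol <= 0.0 else min(cap, 1.0 / vol)
```

With a cap of 6, a realized volatility of 40% gives a multiplier of 2.5 and 20% gives 5, while 10% would give 10 and is therefore capped at 6.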

5.5.3 Perpetual CPPI


A perpetual CPPI is, as its name suggests, a strategy in which there is no maturity.
This means that the asset manager will be paid a minimum fee even if the strategy
does not perform. If the strategy deleverages, no fees are paid until the index bounces
back. In case of a massive drop (gap risk), the manager bears the risk and has to
ensure that the index is back to the floor level. Then, the index grows, say, at the
money market rate and the income generated by the protection asset is put into the
risky asset, and the index starts to leverage again, but slowly.
Any early gains made by the strategy are locked in and could be cashed in at
any time; hence, this strategy is also known as a fixed threshold strategy. This is
advantageous compared to the standard ratcheting strategy in which the client has
to wait until maturity to benefit from the performance of the CPPI index. This does,
of course, come at a cost as the investor puts some capital at risk as the floor stays
at the same level in the case of continuous underperformance of the index.
The CPPI (fund) is an open-end5 structure (fund) that terminates only if the
client unsubscribes from the fund or there is not enough money under management.
In practice, the fixed threshold strategy is likely to terminate the first time it drops
to or below the floor level as the speed of bouncing back is hindered by the fact that
it relies on the money market growth for it to leverage again.
The investor avoids interest rate risk by being sure of the amount that he or she
is going to cash in at a given time (after any lock-in, the floor is a horizontal line,
that is, constant in time), and the only interest rate risk could come from the fact
that the underlying risky asset is itself sensitive to the volatility of interest rates.

5 See appendix A for a definition of open-end fund.



5.5.4 Flexi-Portfolio CPPI


Actively Managed CPPIs An actively managed fund is a fund in which the risky asset
and floor portfolios can change during its life. This is the case for most hedge funds
and mutual funds. This means that the asset manager has some freedom in the choice
of the underlyings in which to invest. He remains, however, subject to regulatory
constraints, especially for mutual funds, as they are typically required to limit their
exposure to high-risk assets in order to avoid spectacular losses, as sometimes occurs
for hedge funds. However, the asset manager does not have any constraints when it
comes to picking his risky portfolio as long as the latter remains within the horizon
defined by the regulator or investment guidelines. (See the example of investment
guidelines in section 5.8, appendix C.)
An example of actively managed CPPI is the flexi-basket CPPI, which is a
strategy where the underlying risky asset is a basket whose underlyings are chosen
from a horizon of names. The idea is similar to the cheapest-to-deliver concept,
where you do not specify the name of the bond you deliver. The selection of the
basket is left to the fund manager, who does the rebalancing of the CPPI portfolio.
This enables the asset manager to select the stocks or funds that he or she believes
will perform better, taking advantage of the information available during the life of
the strategy.
As opposed to actively managed funds or CPPIs where the risky underlying is
not known beforehand or can change throughout the life of the strategy, passively
managed funds are tied to a predefined set of underlyings that constitute the risky
asset portfolio.

Passively Managed CPPIs Passively managed funds follow a predefined investment
strategy. The fund manager is literally executing an algorithm, which is designed
to take advantage of the performances of the assets in a predefined basket. In
what follows, we present the momentum and rainbow CPPI strategies, where the
allocation mechanism within the risky portfolio follows the concept of the
classical equity structures of the same names, momentum and rainbow.

Momentum CPPI An example of a momentum structure is an option written on an
exotic basket of underlyings, whose payoff at maturity is equal to the initial capital
invested plus a geared performance of the underlying basket above some reference
level.
Note that we mean by performance the value of the asset at some time in the
future compared to its value at the last rebalancing date before this time.
The momentum CPPI is based on the momentum structure philosophy and is
intended to imitate the asset manager’s behavior and way of payment. Indeed, at
each rebalancing date, say monthly, the risky asset of the CPPI is calculated as
a weighted sum of the performances, above some reference level, of the various
underlyings in the basket. A large weight is assigned to the best performer, the
following largest weight to the next-best performer, and the smallest weight to the
worst performer. The reference level is in general a one-month interest rate future.
Considering a basket of n underlyings $(S^i)_{1 \le i \le n}$ and a vector of n ordered
weights $(w_i)$ such that $w_n \le w_{n-1} \le \dots \le w_1$, the risky asset value at a given time t
is given by

$$RF_t = RF_{\bar{t}} \sum_{i=1}^{n} w_i \frac{S_t^{(i)}}{S_{\bar{t}}^{(i)}}$$

where $\bar{t}$ is the last rebalancing date before t and $S^{(i)}$ is the underlying asset
with the ith best performance at time t.
Note that if the performance of an underlying in the basket is below the
reference level, it is replaced by the reference level (Libor, Euribor, etc.).
The rebalancing of the momentum CPPI is then done similarly to the classical
case in which we have one underlying.
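
The allocation step can be sketched as follows, assuming the risky-asset value is the previous value times the weighted sum of the ranked performance ratios, each floored at the growth factor of the reference level. Which scores determine the ranking (the period just ended or an earlier one) is left to the caller. The function name and all sample numbers are hypothetical.

```python
def ranked_basket_value(rf_prev, prices_prev, prices_now, weights,
                        ranking_scores, reference_ratio):
    """Risky-asset value after one period of a ranked-weight allocation.

    weights are ordered from largest (best-ranked) to smallest.
    ranking_scores decide the order of the underlyings.
    reference_ratio is the growth factor of the reference level; any
    underlying performing below it is replaced by it.
    """
    n = len(weights)
    if not (len(prices_prev) == len(prices_now) == len(ranking_scores) == n):
        raise ValueError("all inputs must have the same length")
    # Indices of the underlyings, best ranking score first.
    order = sorted(range(n), key=lambda i: ranking_scores[i], reverse=True)
    total = 0.0
    for weight, i in zip(weights, order):
        performance = max(prices_now[i] / prices_prev[i], reference_ratio)
        total += weight * performance
    return rf_prev * total
```

Take three underlyings moving from 50, 20, 10 to 55, 19, 10.5, weights of 50%, 30%, and 20%, and a reference ratio of 1. Ranking on the performances of the same period gives 100 x (0.5 x 1.10 + 0.3 x 1.05 + 0.2 x 1.00) = 106.5, the 0.95 performance being replaced by the reference level.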
Another strategy similar to the momentum CPPI is the rainbow CPPI. The idea
of the rainbow CPPI strategy is likewise based on the classical rainbow structure.
The difference between the two strategies, momentum and rainbow, is the fact that
in the latter the weights are set and used on the same rebalancing date, whereas in
the momentum CPPI, they are set on the previous rebalancing date.
These two structures are very sensitive to the correlation and are appealing for
clients who believe in trends. We can incorporate all the features mentioned before,
namely ratcheting and minimum investment level, to make the investor benefit from
the potential gains made by the strategy at different times throughout its life.
The underlying basket for these structures could range from an all-equity basket
to a very diversified one containing, for example, a fixed income index, an equity
index, a foreign exchange index, a commodity underlying, a mutual fund, and a
hedge fund. Note, however, that when small or medium-sized mutual funds
or hedge funds are constituents of the momentum basket, these strategies can
disrupt the fund management if the amounts traded are large compared to the size of
the funds. For example, if a good performance of the underlying fund in one period
is followed by a sharp drop in its value in the next, then the algorithm requires the
CPPI manager to buy and then sell a substantial fraction of the fund.
The above strategy ideas could be extended to accommodate some other popular
exotic structures containing "worst-of" and/or "best-of" features. The innovation
in this field is very rapid in an attempt to respond to the variety of investors and
their risk appetites.

5.5.5 Off-Balance-Sheet CPPI


An off-balance-sheet CPPI is a CPPI strategy managed entirely by a third party. The
client is guaranteed a payoff that depends on a fund managed by somebody else,
where the latter executes a CPPI algorithm (see Figure 5.6). The liability is there but
the assets are not on the books, hence the term off balance sheet.
Regulatory requirements have in part been behind the introduction of this family
of CPPIs. Indeed, asset managers and pension fund managers are not allowed to call
a fund, portfolio, or product capital guaranteed unless it is really true for all cases.
Those institutions are not allowed, or are unwilling, to take the gap risk, so they
buy the guarantee from outside.
[Figure: diagram with nodes Bank A, CPPI Manager, and Investor, and flows labeled Put on CPPI, Premium, Zero Bond + Call on CPPI, and 100.]

FIGURE 5.6 Example of off-balance-sheet CPPI structure.

On the other hand, after the collapse of Enron and WorldCom, the U.S.
accounting standards for derivatives were changed. Profits cannot be taken up front
unless they are 100% locked in. A party who sold a CPPI can take the management
fees only on an ongoing basis rather than up front, as the structure still bears the
gap risk. A purchase of the protection enables the manager of the CPPI to show the
profits up front, as all potential risks have been hedged out.
Hedging such exposure is quite tricky, as it is not always easy to trade the
underlying fund. Risk management becomes more qualitative than quantitative. In
many cases, the manager is not able to hedge the greeks given by the model (e.g.,
when the risky asset is a hedge fund, mutual fund, etc.). The risk manager must
furthermore ensure that the manager of the CPPI stays within the limits so that the CPPI
does not carry more risk than the gap risk.
A clear reporting line with the CPPI manager needs to be established. The
guarantor provides the CPPI manager with the theoretical exposure coming out of
the CPPI algorithm, and the CPPI manager must report its portfolio allocation to
the guarantor.
In case of early redemption, break clauses need to be negotiated between the
guarantor and the CPPI manager. Should the manager not follow the instructions of
the guarantor, the guarantee is canceled.
Off-balance-sheet CPPIs can be offered in different forms:

■ European or American put on the CPPI


■ Bank guarantee on the fund following a CPPI algorithm
■ An option embedded in the fund

Usually, the value of the CPPI at time 0 is protected, except when the CPPI
strategy contains a ratcheting feature implying a profit lock-in in case the strategy
performs well. Then the guarantor guarantees at maturity the amount
LockIn(maturity) - CPPI(maturity).

5.6 CPPI AS AN UNDERLYING

The popularity of the CPPI strategies in the marketplace, the growth of the hedge
fund community, and the familiarity of investors with the strategy have resulted in
the birth of complex structures treating the CPPI strategy as an underlying itself.
TARNs (target redemption notes, section 4.3) on CPPI are a popular structure.
Momentum- and rainbow-type structures can have CPPI indices as their underlyings,
to mention a few. Basically, any exotic structure written on classical underlyings,
be it equity, interest rates, credit, funds, or commodities could be extended to
incorporate CPPI strategies.

5.7 OTHER ISSUES RELATED TO THE CPPI

5.7.1 Liquidity Issues (Hedge Funds)


When a hedge fund is considered as the risky underlying for the CPPI strategy,
liquidity issues have to be taken into account, as there is sometimes a large
asymmetry in the settlement dates between a buyer and a seller. A buyer settles at
T + 5 (meaning that settlement occurs 5 days after the agreement) whereas a seller
would settle at T + 30 or T + 60. This makes life difficult for the CPPI manager, as
he has to rebalance his portfolio, say, quarterly. This is not an issue when we deal
with standard underlyings, as there is no asymmetry and all parties settle at T + 2.
Note that the settlement periods mentioned here are just examples for illustration.

5.7.2 Assets Suitable for CPPIs


The classical CPPI is by nature a self-financing strategy that performs a rebalancing
of the underlying portfolio throughout its life, which makes the investor exposed
to the realized volatility rather than implied volatility. Therefore, assets with high
implied volatility compared to realized volatility are good underlyings for the CPPI
strategy. In general, upward trending assets with low volatility (e.g., funds) and no
cyclicality are prime candidates as underlyings for a CPPI strategy.
In the case of emerging markets where no mature option market exists, a CPPI
strategy is a good instrument that gives access to the upside with very small vega
exposure (mainly gap risk). It needs, however, to be engineered, taking into account
liquidity constraints and country-specific risk. Diversification is always desirable in
order to lower the volatility and the risk of quick deleveraging.

5.8 APPENDIXES
5.8.1 Appendix A
In this section, we give some background on the various types of hedge funds, which
are classified based on their investment strategies. Thereafter, we give definitions of
some keywords widely employed within the fund community.

Types of Hedge Funds Merger arbitrage: Funds involved in event-driven investments
such as mergers and acquisitions and leveraged buy-outs, taking advantage of the
fact that the stock of a target company rises and the stock of a buying company
depreciates.
High yield: Funds specializing in securities whose underlying companies are in
difficulty, which could range from corporate restructuring to inability to honor
payments to complete bankruptcy. These funds enter into investment strategies that
involve taking opportunistic positions in debt, stock, and credit derivatives on the
company.
Convertible arbitrage: This involves managing a portfolio of convertible bonds and
hedging the interest rate exposure and equity risk by buying or selling the securities
in question (e.g., bonds, stocks).
Fixed income relative value: Funds in which managers try to take advantage of price
inefficiencies and mispricings between a related set of fixed income securities and
neutralize any interest rate exposure.
Equity arbitrage: Similar to fixed income relative value funds, equity arbitrage
fund managers seek to profit from price inefficiencies between a related set of equity
instruments while neutralizing the risk to directional market movements.
Macro arbitrage: Managers of these funds make forecasts about the shift in world
economies and anticipate movements in stock markets, fixed income securities,
interest rates, inflation, and foreign exchange due to political changes and global
shifts in supply and demand of commodities.

Fund Keywords Open-end fund: Holding a share in an open-end fund is like holding
a stock. This is the most common structure, and deals are done on a stock exchange
and in a secondary market.
Closed-end fund: A hedge or mutual fund that has stopped accepting subscriptions
from investors, at least temporarily, and is traded on a secondary market only by
professionals.
Fund of funds: An investment vehicle consisting of shares in various hedge or mutual
funds. The vehicle could have a strategy focus, the underlying funds following a
given investment strategy, or a diversified one in which the fund managers have
different strategies. An investment in a fund of funds offers many structural benefits
compared to one in a classic hedge or mutual fund. Indeed, funds of funds offer more
transparency and provide frequent portfolio updates. The barrier to entry is another
advantage, with levels of minimum investment many times less than single funds
being very common. The fee structure is in general complex, as the investor has to
pay the incentive and running fees for both the fund of funds and the underlying
funds.
Prime brokerage: Large financial institutions often have a prime brokerage group
that is dedicated to providing hedge funds with administrative, back-office, and
financing services. Other services like providing offices, infrastructure, and initial
capital are sometimes offered to help fund managers start their business.
Master feeder fund: A common structure in the United States through which a fund
can run two funds, one onshore for U.S.-based investors and another one offshore
for non-U.S.-based investors. The underlying funds are called feeder funds, and the
parent entity is called the master fund. This is created to allow U.S. and non-U.S.
investors to have participation in a single fund.
Drawdown: The percentage difference between the maximum and minimum asset
values of a fund over a certain period. It is often used as a measure of the risk of a
fund.
High-water mark: This is a provision that ensures that the asset manager receives
incentive fees only for real profits. He is, in fact, paid only on the basis of the
performance attained above the highest net asset value realized previously.
Hurdle rate: The minimum return required from the asset manager. The latter
receives incentive fees only for the extra return above it.
Sharpe ratio: Introduced by William R. Sharpe, this is the extra return, above the
risk-free one, realized by a fund in units of the risk taken. It is calculated as the
difference between the average annualized return and the risk-free rate divided by
the annualized volatility of the fund.
Venture capital: Also called risk capital, this is money given to starting funds
(start-ups in general) that seek high-return investments.
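
Two of the measures defined above lend themselves to a short numerical illustration. The sketch reads the drawdown definition literally, as the gap between the highest and lowest values in the window relative to the highest (the more common maximum drawdown additionally requires the trough to follow the peak). For the Sharpe ratio it assumes simple periodic returns annualized arithmetically. Function names and sample data are hypothetical.

```python
import math

def drawdown(nav):
    """(max - min) / max over the window, following the definition above."""
    peak, trough = max(nav), min(nav)
    return (peak - trough) / peak

def sharpe_ratio(returns, risk_free_rate, periods_per_year=12):
    """Annualized excess return per unit of annualized volatility.

    Assumption: simple periodic returns, annualized arithmetically.
    """
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    annual_return = mean * periods_per_year
    annual_vol = math.sqrt(var * periods_per_year)
    return (annual_return - risk_free_rate) / annual_vol
```

For net asset values of 100, 120, 90, 110 the drawdown is 30/120 = 25%. Monthly returns of 2%, 0%, 2%, 0% give an annualized return of 12% and an annualized volatility of 4%, so with a 2% risk-free rate the Sharpe ratio is 2.5.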

5.8.2 Appendix B
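In this section, we give more details about the computations for the continuous time
strategy. Recall that $V_0 = 1$, $F_{\text{maturity}} = 1$, $V_t = C_t + F_t$, $E_t = mC_t$, and $dF_t = rF_t\,dt$.
The change in $V_t$ and $C_t$ is given by the following equations:

$$dV_t = (V_t - E_t)\frac{dF_t}{F_t} + E_t\frac{dS_t}{S_t}$$

$$dC_t = (V_t - E_t)\frac{dF_t}{F_t} + E_t\frac{dS_t}{S_t} - dF_t$$

The log-normality of the risky asset, $dS_t/S_t = \mu\,dt + \sigma\,dW_t$, implies

$$\begin{aligned}
dC_t &= (C_t + F_t - mC_t)\frac{dF_t}{F_t} + mC_t(\mu\,dt + \sigma\,dW_t) - dF_t \\
&= (C_t - mC_t)\,r\,dt + mC_t(\mu\,dt + \sigma\,dW_t) \\
&= C_t\left\{[(1 - m)r + m\mu]\,dt + m\sigma\,dW_t\right\}
\end{aligned}$$

Therefore,

$$C_t = C_0 \exp\left\{\left[(1 - m)r + m\mu - \frac{m^2\sigma^2}{2}\right]t + m\sigma W_t\right\}$$

On the other hand, we know that

$$S_t^m = S_0^m \exp\left\{m\left(\mu - \frac{\sigma^2}{2}\right)t + m\sigma W_t\right\}$$

From these two results we can conclude that

$$C_t = \alpha_t S_t^m$$

$$\alpha_t = \frac{C_0}{S_0^m}\exp(\beta t), \quad \text{where } \beta = r - m\left(r - \frac{\sigma^2}{2}\right) - \frac{m^2\sigma^2}{2}$$

Finally,

$$V_t = \alpha_t S_t^m + F_t$$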
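
The result can be checked numerically by rebalancing a cushion at every step of a simulated lognormal path and comparing it with the closed form. This is only a sketch under our reading of the derivation: the risky asset follows $dS_t/S_t = \mu\,dt + \sigma\,dW_t$, and the cushion equals $\alpha_t S_t^m$ with $\alpha_t = (C_0/S_0^m)\exp(\beta t)$ and $\beta = r - m(r - \sigma^2/2) - m^2\sigma^2/2$. The function name, the normalization $S_0 = C_0 = 1$, and the parameter values in the usage note are hypothetical.

```python
import math
import random

def simulate_cushion(m, r, mu, sigma, maturity, steps, seed=0):
    """Rebalance a CPPI cushion at every step along one lognormal path.

    Returns the simulated cushion and the closed-form value
    alpha_T * S_T**m evaluated on the same path (S_0 = 1, C_0 = 1).
    """
    rng = random.Random(seed)
    dt = maturity / steps
    log_s = 0.0    # log(S_t / S_0)
    cushion = 1.0  # C_0
    for _ in range(steps):
        x = (mu - 0.5 * sigma ** 2) * dt + sigma * rng.gauss(0.0, math.sqrt(dt))
        risky_return = math.expm1(x)
        riskless_return = math.expm1(r * dt)
        # Exposure m*C earns the risky return; the remainder (1 - m)*C
        # of the cushion accrues at the riskless rate.
        cushion *= 1.0 + m * risky_return + (1.0 - m) * riskless_return
        log_s += x
    beta = r - m * (r - 0.5 * sigma ** 2) - 0.5 * m ** 2 * sigma ** 2
    closed_form = math.exp(beta * maturity + m * log_s)
    return cushion, closed_form
```

Calling `simulate_cushion(3.0, 0.03, 0.08, 0.2, 1.0, 5000)` returns two numbers whose ratio should be close to 1, with the discrepancy shrinking as the number of rebalancing steps grows. With only a handful of steps the two values drift apart, which is the discrete-rebalancing effect mentioned in the main text.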

5.8.3 Appendix C
Example of Matrix of Investment Guidelines

Local Limitations

Name                                         Bloomberg            Maximum      Minimum
                                                                  Allocation   Allocation
EMI Index France                             GPEMIFR FP Equity    50%          0%
Groupama France Stock                        GPINFRA FP Equity    50%          0%
Groupama Croissance                          GRPCRSS FP Equity    50%          0%
EMI Index Euro                               GPEMIEU FP Equity    50%          0%
Groupama Euro Stock                          GPACFRA FP Equity    50%          0%
Euro Gan                                     EURGNSV FP Equity    50%          0%
Groupama Avenir Euro                         FIGRAVE FP Equity    25%          0%
Actions Nouvelle Europe                      GRPAMOR FP Equity    20%          0%
Groupama Actions Internationales             GPACINT FP Equity    80%          0%
Groupama Actions Mid Cap US                  GPACMUS FP Equity    25%          0%
Groupama US Stock                            FIFINUS FP Equity    50%          0%
Groupama ASIE                                GRPASIE FP Equity    20%          0%
Actions Croissance Japan                     FIACROJ FP Equity    50%          0%
Groupama Japan Stock                         NPPNGAC FP Equity    50%          0%
Groupama Euro Crédit MT                      FIOBECR FP Equity    25%          0%
Groupama Euro Crédit LT                      GPOBLIF FP Equity    25%          0%
Groupama Institutions LT                     GRINSLT FP Equity    100%         20%
Groupama Index Inflation LT                  GRINILT FP Equity    100%         0%
Groupama US Etat LT                          GRGETLT FP Equity    100%         20%
Groupama Etat Monde LT                       GRPCAPT FP Index     100%         20%
Groupama Alternatif Equilibre                GRALTEQ FP Equity    15%          0%

Global Limitations

Name                                         Maximum      Minimum
                                             Allocation   Allocation
Emerging Markets Equity                      20%          0%
North America Equity                         50%          0%
Developed Europe Equity                      50%          0%
Japan/Developed Asia Equity                  50%          0%
High Yield (emerging markets, credit, etc.)  25%          0%
Equity in Total                              80%          0%
Equity (mid-cap, small cap)                  25%          0%
Fixed Income (investment grade)              100%         20%

Example Term Sheet An example term sheet is given below.

Principal Protected Note on CPPI on Stoxx50E


Indicative Terms and Conditions CPPI invests in the reference asset (the underlying
fund(s)) and a reserve asset (zero-coupon bond). The allocation between the two is
done on a monthly basis following the allocation mechanism. If the CPPI increases
in value, it will invest more in the reference asset to increase leverage. If the CPPI
decreases in value, it will invest less in the reference asset to protect the capital.

Summary of Terms and Conditions

Issuer: Deutsche Bank AG, London
Currency: EUR
Maturity: 10 years
Notional: 10,000,000 EUR
Business Days: London, modified following
Calculation Agent: Deutsche Bank AG, London
Payoff at maturity:

$$\text{Notional} + \text{Notional} \times \max\left(0, \frac{\max(\text{CPPI}_{\text{final}}, \text{LockIn}_{\text{final}})}{\text{CPPI}_0} - 100\%\right)$$

CPPI on the Stoxx50E


Description of CPPI The CPPI consists of two components: (a) the reference asset
and (b) the reserve asset, in different proportions, as determined and computed by
the calculation agent. The allocation between (a) and (b) will be performed daily and
determined by the allocation mechanism. The allocation will be adjusted to protect
the CPPI on the downside and to provide return on the upside.
Constant Proportion Portfolio Insurance 163

Sponsor Deutsche Bank AG, London


Business Days London, modified following
Composition The index will consist of: The reference asset and the reserve asset
Reference Asset Stoxx50E index
Reserve Asset EUR-denominated zero-coupon bond
Allocation Mechanism Means, in relation to an index business day, the proportion of the
index invested in the reference asset (‘‘investment level’’). The investment level is a
function of the distance between the current index level and the protected amount. The
index sponsor will adjust the investment level according to the following formula:
IL(t) = Min(MaxIL, Max(MinIL, m × (CPPI(t) − Floor(t)) / CPPI(t)))
with CPPI(t) = CPPI valuation on such index business day
Floor(t) = value of the floor at time t
m = multiplier = 5
MaxIL = maximum investment level = 100%
MinIL = minimum investment level = 0%
IL(0) = initial investment level = 71.80%
CPPI Valuation CPPI(t) = CPPI(t−1) × [1 + (Index(t) / Index(t−1) − 1) × IL(t−1)
+ (ZB(t) / ZB(t−1) − 1) × MAX(0, 1 − IL(t−1)) − Fee(t) × (d(t) − d(t−1)) / 365]
where Index(t) = level of the reference asset at time t
IL(t) = investment level in fund at time t
d(t) = calendar day at time t
Fee(t) = 1.90% if CPPI(t) > Floor(t), otherwise Fee(t) = 0.90%
Lock In LockIn(t) = MAX(LockIn(t−1), p × CPPI(t))
where p = 1.0 for the first 5 years and 0.85 thereafter.
LockIn(0) = 100%
Floor Floor(t) = Bond(t) × LockIn(t)
where Bond(t) = clean price of a EUR-denominated bond with a
coupon of 0.90% p.a.
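The allocation and lock-in rules of the term sheet can be illustrated with a short sketch. The function names and the numerical inputs below are hypothetical; in particular, the floor of 85.64 is an assumed value chosen only because, with a multiplier of 5, it reproduces the initial investment level of 71.80% quoted above.

```python
def investment_level(cppi, floor, m=5.0, min_il=0.0, max_il=1.0):
    """IL = Min(MaxIL, Max(MinIL, m * (CPPI - Floor) / CPPI))."""
    cushion = (cppi - floor) / cppi
    return min(max_il, max(min_il, m * cushion))


def lock_in(previous_lock_in, cppi, p):
    """LockIn(t) = MAX(LockIn(t-1), p * CPPI(t))."""
    return max(previous_lock_in, p * cppi)


# Hypothetical inputs: CPPI level of 100 against an assumed floor of 85.64.
il0 = investment_level(cppi=100.0, floor=85.64)
print(f"initial investment level: {il0:.4f}")
```

A CPPI level far above the floor is capped at MaxIL, and a level below the floor is fully allocated to the reserve asset (MinIL).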
PART Three
Equity Credit Hybrids

CHAPTER 6
Credit Modeling

6.1 INTRODUCTION

The growth witnessed in the credit derivatives market in recent years has led to the
introduction of equity hybrid structures that depend on the creditworthiness and
the performance of an equity underlying. Convertible bonds constituted the first
generation of these structures and are still the most liquid of them, while the equity
default swap remains the main innovation in this field.
This chapter presents a methodology to value derivative securities written on
equity underlyings subject to credit risk. Arbitrage-free valuation techniques are
employed, and the methodology is applied to derivative securities written on assets
subject to default risk as well as to pure credit derivative instruments.

6.2 BACKGROUND ON CREDIT MODELING

The risk of default is the financial loss that a counterparty would bear if a reference
entity does not honor its commitments. This is generally called a credit event, which
can range from a downgrade by a rating agency, to a failure to pay debt, to complete
liquidation. In theory, every financial transaction embeds this kind of risk regardless
of the counterparties involved. Estimating the likelihood of occurrence of a credit
event is at the core of any methodology aiming at modeling credit risk or default
risk. However, this is not sufficient for the pricing of contingent claims sensitive to
default risk. Indeed, we also need to model the loss given default (or recovery), risk-free
interest rates, and, in the case of multiname securities, the dependency between the
credit events.
There are two main routes to modeling default risk: the structural approach
and the reduced-form, or intensity-based, approach. In the first approach, we make
explicit assumptions about the capital structure of the company, its debt, and the
dynamics of its assets. In the reduced-form approach, the dynamics of the default
are exogenously given by a default rate (intensity). Intensity-based models focus
directly on describing the conditional probability of default without the definition
of the exact default event. The use of a Poisson process framework to describe
default captures the idea that the timing of a default takes the investor by surprise.
Technically speaking, either the default time is a stopping time in the asset filtration
(structural models) or it is a stopping time in a larger filtration (intensity models).


6.2.1 Structural Approach


The firm value model was introduced by Black and Scholes in 1973 and Merton
in 1974. It consists of defining the default time as the time that the underlying
process, the assets of the firm, hits a certain barrier.

Standard Approach Consider a company with market value $V_t$ at time $t$, which represents
the expected discounted value of its future cash flows. The company has a debt
modeled as a zero-coupon bond with face value $D$ and maturity $T$. If the company
cannot honor its commitments at maturity, the debtors take control of the company.
The firm value $V_t$ (also called asset price) is modeled as geometric Brownian
motion with drift $\mu$ and volatility $\sigma_V$, and its dynamics are given as follows:

$$\frac{dV_t}{V_t} = \mu\,dt + \sigma_V\,dW_t, \qquad V_0 > 0 \tag{6.1}$$

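The default event is defined as the inability of the company to pay its debt at
maturity; that is, $V_T < D$. The default probability is therefore given by

$$\begin{aligned}
P(0,T) &= \mathbb{P}(V_T < D) \\
&= \mathbb{P}\left(\left(\mu - \frac{\sigma_V^2}{2}\right)T + \sigma_V W_T < \log\frac{D}{V_0}\right) \\
&= \Phi\left(\frac{\log\frac{D}{V_0} - \left(\mu - \frac{\sigma_V^2}{2}\right)T}{\sigma_V\sqrt{T}}\right),
\end{aligned}$$

where $\Phi$ is the standard normal distribution function and $L = D/V_0$ is the leverage ratio.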
The bondholder receives at maturity $D(T,T) = \min(V_T, D)$, which could be
written as

$$D(T,T) = D - (D - V_T)^+$$

Therefore, the bondholder is long a default-free bond with a face value D and
short a put option on the assets of the company.
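On the other hand, the equity value $E_T = \max(V_T - D, 0)$ is a call option on
the assets of the company. The values today of the debt and equity of the company
are given by

$$\begin{aligned}
D(0,T) &= B(0,T)\,e^{\mu T} V_0\,\Phi(-d_1) + B(0,T)\,D\,\Phi(d_2) \\
E_0 &= B(0,T)\,e^{\mu T} V_0\,\Phi(d_1) - B(0,T)\,D\,\Phi(d_2)
\end{aligned}$$

$$d_1 = \frac{\log\frac{V_0}{D} + \left(\mu + \frac{\sigma_V^2}{2}\right)T}{\sigma_V\sqrt{T}}, \quad \text{and} \quad d_2 = d_1 - \sigma_V\sqrt{T},$$

where $B(0,T)$ is the default-free zero-coupon bond.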

The credit spread is defined as the excess return, above the risk-free rate,
demanded by investors for bearing the default risk of the underlying entity. Its
expression, using the formulas above, is given by

$$\begin{aligned}
sp(0,T) &= -\frac{1}{T}\log\frac{D(0,T)}{B(0,T)\,D} \\
&= -\frac{1}{T}\log\left(\frac{e^{\mu T} V_0}{D}\,\Phi(-d_1) + \Phi(d_2)\right)
\end{aligned}$$
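The standard-approach formulas can be checked numerically with a minimal sketch. The function names and the inputs (asset value 100, debt 70, drift 3%, asset volatility 25%, five-year maturity) are illustrative assumptions, not values from the text; the drift is written `mu` as in (6.1).

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def merton(v0, debt, mu, sigma_v, maturity, discount):
    """Return (default probability, debt value, equity value, credit spread)."""
    vol = sigma_v * math.sqrt(maturity)
    d1 = (math.log(v0 / debt) + (mu + 0.5 * sigma_v**2) * maturity) / vol
    d2 = d1 - vol
    growth = math.exp(mu * maturity)
    default_prob = norm_cdf(-d2)  # P(V_T < D)
    debt_value = discount * (growth * v0 * norm_cdf(-d1) + debt * norm_cdf(d2))
    equity_value = discount * (growth * v0 * norm_cdf(d1) - debt * norm_cdf(d2))
    spread = -math.log(debt_value / (discount * debt)) / maturity
    return default_prob, debt_value, equity_value, spread


# Hypothetical inputs; the discount factor uses a flat 3% rate.
pd_, debt_value, equity_value, spread = merton(
    100.0, 70.0, 0.03, 0.25, 5.0, math.exp(-0.03 * 5.0)
)
print(pd_, debt_value, equity_value, spread)
```

Debt and equity add up to the discounted expected asset value, which is a convenient sanity check on the two formulas.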

First-Passage Approach In the standard approach, the value of the company can
reach any value between today and the maturity without triggering the default event
(value of the company at any point in time before maturity can be below the face
value of the debt). The test for default or no default is done only at maturity. The
first-passage approach, on the other hand, defines the default event as the first time
the value of the company drops below a predefined barrier H.
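Given the dynamics (6.1), the probability of default is given as follows:

$$\begin{aligned}
P(0,T) &= 1 - \mathbb{P}\left(\min_{t \le T} V_t > H,\; V_T > D\right) \\
&= 1 - \Phi\left(\frac{-\log L + \left(\mu - \frac{\sigma_V^2}{2}\right)T}{\sigma_V\sqrt{T}}\right)
+ \left(\frac{H}{V_0}\right)^{\frac{2\mu}{\sigma_V^2} - 1}
\Phi\left(\frac{\log\frac{H^2}{D V_0} + \left(\mu - \frac{\sigma_V^2}{2}\right)T}{\sigma_V\sqrt{T}}\right)
\end{aligned}$$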
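A minimal sketch of the first-passage default probability follows, assuming the barrier satisfies $H \le D$ and $H < V_0$ and using the same hypothetical inputs as before. It illustrates that monitoring the barrier continuously can only raise the default probability relative to the maturity-only test, and that the two coincide as the barrier goes to zero.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def first_passage_default_prob(v0, debt, barrier, mu, sigma_v, maturity):
    """1 - P(min V_t > H, V_T > D), assuming barrier <= debt and barrier < v0."""
    nu = mu - 0.5 * sigma_v**2
    vol = sigma_v * math.sqrt(maturity)
    survive = norm_cdf((math.log(v0 / debt) + nu * maturity) / vol)
    reflected = (barrier / v0) ** (2.0 * mu / sigma_v**2 - 1.0) * norm_cdf(
        (math.log(barrier**2 / (debt * v0)) + nu * maturity) / vol
    )
    return 1.0 - survive + reflected


def maturity_only_default_prob(v0, debt, mu, sigma_v, maturity):
    """P(V_T < D) from the standard approach."""
    nu = mu - 0.5 * sigma_v**2
    vol = sigma_v * math.sqrt(maturity)
    return norm_cdf((math.log(debt / v0) - nu * maturity) / vol)


# Hypothetical inputs: V0 = 100, D = 70, H = 60.
pd_barrier = first_passage_default_prob(100.0, 70.0, 60.0, 0.03, 0.25, 5.0)
pd_standard = maturity_only_default_prob(100.0, 70.0, 0.03, 0.25, 5.0)
print(pd_barrier, pd_standard)
```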

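Within this approach the bondholder receives at maturity

$$\begin{aligned}
D(T,T) &= V_T - (V_T - D)^+\,1_{\{\min_{t \le T} V_t > H\}} \\
&= V_T - (V_T - D)^+ + (V_T - D)^+\,1_{\{\min_{t \le T} V_t \le H\}} \\
&= D + (V_T - D) - (V_T - D)^+ + (V_T - D)^+\,1_{\{\min_{t \le T} V_t \le H\}} \\
&= \underbrace{D - (D - V_T)^+}_{\text{standard approach}} + (V_T - D)^+\,1_{\{\min_{t \le T} V_t \le H\}}
\end{aligned}$$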

The bondholder is therefore long a default-free zero-coupon bond with face
value $D$, a down-and-in call on the assets of the company, and short a put on the
assets of the company.
On the other hand, the equity value at maturity is given as follows:

$$E_T = (V_T - D)^+\,1_{\{\min_{t \le T} V_t > H\}}$$

that is, the equityholder is long a down-and-out call on the assets of the company.
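The values $D(0,T)$ and $E_0$ of the debt and equity today are given by

$$\begin{aligned}
D(0,T) &= D(0,T)^{\text{standard}}
+ B(0,T)\,e^{\mu T} V_0 \left(\frac{H}{V_0}\right)^{\frac{2\mu}{\sigma_V^2} + 1} \Phi(\delta_1)
- B(0,T)\,D \left(\frac{H}{V_0}\right)^{\frac{2\mu}{\sigma_V^2} - 1} \Phi(\delta_2) \\
E_0 &= E_0^{\text{standard}}
- B(0,T)\,e^{\mu T} V_0 \left(\frac{H}{V_0}\right)^{\frac{2\mu}{\sigma_V^2} + 1} \Phi(\delta_1)
+ B(0,T)\,D \left(\frac{H}{V_0}\right)^{\frac{2\mu}{\sigma_V^2} - 1} \Phi(\delta_2)
\end{aligned}$$

$$\delta_1 = \frac{\log\frac{H^2}{D V_0} + \left(\mu + \frac{\sigma_V^2}{2}\right)T}{\sigma_V\sqrt{T}}, \quad \text{and} \quad \delta_2 = \delta_1 - \sigma_V\sqrt{T},$$

where $E_0^{\text{standard}}$ and $D(0,T)^{\text{standard}}$ are the values of the equity and debt at time 0
given in the standard approach described above.
The derivation of these formulas can be found in [130].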
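The credit spread is therefore expressed as follows:

$$sp(0,T) = -\frac{1}{T}\log\left(\frac{e^{\mu T}}{L}\left[\Phi(-d_1) + \left(\frac{H}{V_0}\right)^{\frac{2\mu}{\sigma_V^2} + 1}\Phi(\delta_1)\right] + \Phi(d_2) - \left(\frac{H}{V_0}\right)^{\frac{2\mu}{\sigma_V^2} - 1}\Phi(\delta_2)\right)$$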

Discussion The calibration of structural models is problematic, as the value of the
firm is not directly observable in the market. The face value and the maturity of the
debt are not easy to estimate from the balance sheet given the complexity of the capital
structure of the company. Indeed, we often have a mixture of short-, medium-, and
long-term debts as well as different seniorities. The barrier level, in the case of a
first-passage approach, is another parameter that is not easy to estimate; its
definition is generally ad hoc, yet it conditions the occurrence of the default event.
Another drawback of the two approaches described above is the fact that the
default cannot happen immediately. This has been addressed by random barrier
models such as CreditGrades. However, this adds even more complexity in terms of
the calibration as we need to calibrate the volatility of the barrier level in addition
to all other parameters.
Nevertheless, this approach could be useful to provide predictive tools related
to upcoming default events. Indeed, a pre-default event could be defined as the
first time the asset value is below a certain level higher than, but close to, the
default barrier, and users of structural models observe the evolution of the so-called
“distance-to-default” or the marginal default probabilities. From a mathematical
point of view, this means that the default time is predictable, a stopping time with
respect to the asset filtration. Unfortunately, market reality is different, as we do
witness spread movements as well as jumps to default that happen in a surprising
way.
The inability of structural models to properly describe the default event is a
drawback well described by Madan in ‘‘Pricing the Risks of Default’’ (2000):

default is often a complicated event and specifying the precise conditions
under which it must occur are easily misspecified. The conditions one writes
down may be too stringent so that it often occurs before these conditions
are met, or the conditions are too weak and default fails to occur when all
the requisite conditions have been met.

6.2.2 Reduced-Form Approach


In reduced-form or intensity-based models, the dynamics of the default are described
exogenously and directly under the pricing measure. We model the instantaneous
likelihood of default through the hazard-rate process.

Definition of the Hazard Rate of a Default Time

Hazard rate: Deterministic case Let us consider a security with default time $\tau$. $\tau$ is a
continuous random variable measuring the length of time from today to the default
time.
Let $F(t)$ denote the distribution function of $\tau$:

$$F(t) = \mathbb{P}(\tau \le t), \quad t \ge 0, \qquad F(0) = 0 \tag{6.2}$$

We also define the survival function $S(t)$ by

$$S(t) = 1 - F(t) = \mathbb{P}(\tau > t), \quad t \ge 0, \qquad S(0) = 1 \tag{6.3}$$

As to the probability density function, it is given by

$$f(t) = F'(t) = -S'(t) = \lim_{\Delta \to 0^+} \frac{\mathbb{P}(t \le \tau < t + \Delta)}{\Delta}$$

The distribution of the random variable default time can be specified with the
hazard-rate function, which gives the instantaneous default probability for a security
that has attained time $x$, given survival to this time:

$$\mathbb{P}(x < \tau \le x + \Delta x \mid \tau > x) = \frac{F(x + \Delta x) - F(x)}{1 - F(x)} \approx \frac{f(x)\,\Delta x}{1 - F(x)} \tag{6.4}$$

The function

$$h(x) = \frac{f(x)}{1 - F(x)},$$

used in statistics under the name of hazard-rate function, is the conditional
probability density function of $\tau$ at time $x$, given survival to that time.
The hazard-rate function can easily be linked to the survival function as follows:

$$h(x) = -\frac{S'(x)}{S(x)}, \qquad S(0) = 1$$

We get

$$S(t) = \exp\left(-\int_0^t h(s)\,ds\right) \tag{6.5}$$

In the same way,

$$\mathbb{P}(\tau > t + x \mid \tau > x) = \frac{S(t + x)}{S(x)} = \exp\left(-\int_x^{t+x} h(s)\,ds\right) \tag{6.6}$$

Lastly, distribution and density functions of the default time can be expressed
as a function of the hazard-rate function as follows:

$$F(t) = 1 - S(t) = 1 - \exp\left(-\int_0^t h(s)\,ds\right) \tag{6.7}$$

$$f(t) = S(t)\,h(t) \tag{6.8}$$

These relationships show that modeling a default process is equivalent to
modeling a hazard-rate function.
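Relations (6.5) to (6.8) can be illustrated with a small sketch that recovers the survival function, the density, and the conditional survival probability from a given hazard-rate function. The linearly increasing hazard rate used here is a hypothetical example, and the integral is approximated by the trapezoidal rule.

```python
import math


def survival(hazard, t, steps=2000):
    """S(t) = exp(-integral_0^t h(s) ds), integral by the trapezoidal rule."""
    if t == 0.0:
        return 1.0
    dt = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    total += sum(hazard(k * dt) for k in range(1, steps))
    return math.exp(-total * dt)


def density(hazard, t):
    """f(t) = S(t) h(t), equation (6.8)."""
    return survival(hazard, t) * hazard(t)


def conditional_survival(hazard, t, x):
    """P(tau > t + x | tau > x) = S(t + x) / S(x), equation (6.6)."""
    return survival(hazard, t + x) / survival(hazard, x)


def example_hazard(s):
    """Hypothetical hazard rate h(s) = 0.02 + 0.01 s."""
    return 0.02 + 0.01 * s


print(survival(example_hazard, 5.0), conditional_survival(example_hazard, 2.0, 3.0))
```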

Hazard rate: General case Let us define the default process by $N_t = 1_{\{\tau \le t\}}$, where
$\tau$ is the default time. It is assumed that the increasing one-jump process $N_t$ admits
an absolutely continuous compensator $\Lambda_t$, where $\Lambda_t$ is a predictable and increasing
process such that $N_t - \Lambda_t$ is a martingale [131]:

$$\Lambda_t = \int_0^t h_s\,ds,$$

where the non-negative predictable process $h$ stands for the intensity process or
hazard rate. With a constant intensity $h$, for example, default is the first jump of a
Poisson process with intensity $h$. More generally, for $t < \tau$, $h_t$ can be viewed as the
conditional rate of arrival of default at time $t$, given all information available up to
that time. Roughly speaking, for a small time interval of length $\Delta t$, the conditional
probability that default occurs between $t$ and $t + \Delta t$, given survival to $t$, is
approximately $h_t \Delta t$.
We consider two increasing and complete information filtrations $(\mathcal{F}_t) \subset (\mathcal{G}_t)$ such
that

$$\{\tau \le t\} \in \mathcal{G}_t, \qquad h_t \in \mathcal{F}_t,$$

the default time being outside the span of $(\mathcal{F}_t)$.


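Now, the process

$$\begin{aligned}
M_t &= N_t - \int_0^{t \wedge \tau} h_s\,ds \\
&= N_t - \int_0^t h_s\,1_{\{s \le \tau\}}\,ds \\
&= N_t - \int_0^t h_s\,(1 - N_s)\,ds
\end{aligned} \tag{6.9}$$

is a $(\mathbb{P}, \mathcal{G}_t)$ martingale.
The following result will allow us to eliminate the jump process $N_t$ from the
evaluation of any derivative payoff.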

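Lemma 6.2.1 We admit the following result:

$$E\left[1 - N_t \mid \mathcal{F}_t\right] = \exp\left(-\int_0^t h_s\,ds\right) \tag{6.10}$$

Let us define $S(t,T)$, the probability of no default (or survival probability). It can
be expressed as

$$S(t,T) = \mathbb{P}\left(\tau > T \mid \tau > t,\ \mathcal{F}_t\right)$$

A direct application of Bayes’s theorem gives

$$\begin{aligned}
S(t,T) &= \frac{E\left[1_{\{\tau > T\}}\,1_{\{\tau > t\}} \mid \mathcal{F}_t\right]}{E\left[1_{\{\tau > t\}} \mid \mathcal{F}_t\right]} \\
&= \frac{E\left[E\left[1 - N_T \mid \mathcal{F}_T\right] \mid \mathcal{F}_t\right]}{E\left[1 - N_t \mid \mathcal{F}_t\right]} \\
&= E\left[\exp\left(-\int_t^T h_s\,ds\right) \Big|\ \mathcal{F}_t\right]
\end{aligned} \tag{6.11}$$

where the last equality directly follows from (6.10).
This important result generalizes the results of the previous section, allowing
now the hazard-rate function to be stochastic.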

Construction of a Default Time In this section, we give a background on how to
construct a default time.
Let $(\Omega, \mathcal{F}, \mathcal{F}_t, \mathbb{P})$ be a filtered probability space, and $Z_t$ a diffusion process on
this space. Let $\Theta$ be an exponential random variable with intensity 1 independent
from $Z_t$.
We define the default time $\tau$ as the first time when the process $\int_0^t h(Z_u)\,du$ is
above the random variable $\Theta$:

$$\tau = \inf\left\{t,\ \int_0^t h(Z_u)\,du \ge \Theta\right\},$$

where $h$ is a positive function. An equivalent way of defining $\tau$ is

$$\tau = \inf\left\{t,\ N_{\int_0^t h(Z_u)\,du} = 1\right\},$$

where $N_t$ is a Poisson process with intensity 1 independent of $Z_t$, $t \ge 0$.
Therefore, the conditional distribution of the default time given $\mathcal{F}_t$
($\mathcal{F}_t = \sigma(Z_s,\ s \le t)$) is given by

$$\mathbb{P}(\tau > t \mid \mathcal{F}_t) = \exp\left(-\int_0^t h(Z_u)\,du\right) \tag{6.12}$$

We introduce the filtration $\mathcal{G}_t = \mathcal{F}_t \vee \sigma(1_{\{\tau \le s\}},\ s \le t)$, the enlarged filtration with
respect to which $\tau$ is a stopping time. Indeed, $\tau$ is not a stopping time with respect to
the filtration$^1$ $\mathcal{F}_t$; otherwise, we would have $\mathbb{P}(\tau > t \mid \mathcal{F}_t) = 1_{\{\tau > t\}}$ and not (6.12).
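The construction above translates directly into a simulation recipe: draw a unit exponential threshold and stop when the integrated hazard crosses it. The sketch below is a minimal illustration under simplifying assumptions that are not in the text: the hazard path is given on a fixed time grid and is taken flat at 3%, so that the simulated survival frequency can be compared with $\exp(-0.03 \times 5)$ from (6.12). The function name and grid are hypothetical.

```python
import math
import random


def simulate_default_time(hazard_path, dt, rng):
    """First grid time at which the integrated hazard reaches a unit exponential.

    hazard_path holds h(Z_u) on a grid of step dt; returns math.inf when the
    threshold is not reached before the end of the grid.
    """
    threshold = rng.expovariate(1.0)  # exponential variable with intensity 1
    cumulative = 0.0
    for k, h in enumerate(hazard_path):
        cumulative += h * dt
        if cumulative >= threshold:
            return (k + 1) * dt
    return math.inf


rng = random.Random(0)
dt, n_steps = 0.05, 100          # five-year horizon
path = [0.03] * n_steps          # hypothetical flat hazard path
n_paths = 20000
survived = sum(
    simulate_default_time(path, dt, rng) == math.inf for _ in range(n_paths)
)
survival_estimate = survived / n_paths
print(survival_estimate, math.exp(-0.03 * 5.0))
```

In a full model the flat path would be replaced by a simulated path of $h(Z_u)$, with a fresh threshold drawn for each path.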
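For the pricing of contingent claims on defaultable securities, the following
theorem is essential:

Theorem 6.2.1 Let $X$ be an integrable random variable; we have the following
result:

$$E\left[1_{\{\tau > t\}}\,X \mid \mathcal{G}_t\right] = 1_{\{\tau > t\}}\,Y_t,$$

where

$$Y_t = \frac{E\left[1_{\{\tau > t\}}\,X \mid \mathcal{F}_t\right]}{E\left[1_{\{\tau > t\}} \mid \mathcal{F}_t\right]}$$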

Modeling a hazard-rate function provides us information on the immediate
default risk of each entity known to be alive at a given time $t$ and facilitates
comparisons with other entities. Also, as we will see, this kind of modeling can
be easily adapted to the stochastic default case. Linked to this point, the strong
similarities between the hazard-rate function and the short rate allow us to borrow
some modeling techniques from the short-rate world.

1 In the structural model approach, $\tau$ is a stopping time with respect to $\mathcal{F}_t$.

6.3 MODELING EQUITY CREDIT HYBRIDS

We are given a filtered probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, \mathbb{Q})$ where all processes are
assumed to be defined and adapted to the filtration $\mathcal{F}_t$. We will be working in
an arbitrage-free setting, and we will be considering the dynamics of the involved
processes directly under the risk-neutral measure $\mathbb{Q}$.

6.3.1 Dynamics of the Hazard Rate


We will exploit the results of the previous sections to choose a stochastic diffusion
for the hazard-rate process. The similarities between the hazard rate and the short
rate justify the use of some modeling techniques developed for the short rate. Two
kinds of diffusion could be employed to model the dynamics of the hazard rate: affine
diffusions and mixed diffusions.

Affine Diffusion Models The tractability of affine diffusion models [132] makes them
prime candidates for the modeling of the hazard-rate process. Indeed, we can easily
obtain closed-form solutions for the survival (default) probabilities in some cases,
and we only need to solve an ordinary differential equation in some other cases.
An affine diffusion model is given by the following stochastic differential
equation:

$$dh_t = \mu(t, h_t)\,dt + \sigma(t, h_t)\,dW_t,$$

where $\mu$ and $\sigma^2$ are affine functions of $h_t$; that is, $\mu(t, h_t) = a_1(t) + b_1(t)\,h_t$ and
$\sigma^2(t, h_t) = a_2(t) + b_2(t)\,h_t$. The survival probability (the equivalent of a
zero-coupon bond) is given by

$$S(0,T) = \exp\left(m(T) - h_0\,n(T)\right),$$

where $m(T)$ and $n(T)$ are deterministic functions of $T$, $a_1$, $a_2$, $b_1$, and $b_2$.
Note that affine diffusion models remain tractable and yield similar expressions
for the survival probabilities in the multidimensional case.

Mixed Diffusion Models This class of models contains the CEV kind diffusion
given by

$$dh_t = a_t\,h_t\,dt + \sigma_t\,h_t^{\gamma}\,dW_t \tag{6.13}$$

These models have the drawback of being computationally unattractive (except
for $\gamma = 0$ or $\gamma = 0.5$, because we end up with an affine diffusion).
The model choices we make in the remainder of this chapter are based on affine
models. Precisely, the dynamics we will study for the hazard rate are, in a first stage,
the Hull-White diffusion. An extension to include jumps is studied later.

6.3.2 Model Choice


Assuming Hull-White-type diffusions for the hazard rate and short rate, we end with
the following system of SDEs:

$$\begin{aligned}
\frac{dS_t}{S_{t^-}} &= (r_t - y_t)\,dt + \sigma^S_t\,dW^S_t - (dN_t - h_t\,dt) \\
dr_t &= (\theta^r_t - \kappa^r_t\,r_t)\,dt + \sigma^r_t\,dW^r_t \\
dh_t &= (\theta^h_t - \kappa^h_t\,h_t)\,dt + \sigma^h_t\,dW^h_t
\end{aligned}$$

with

$$d\langle W^S, W^h \rangle_t = \rho^{Sh}\,dt, \qquad
d\langle W^r, W^h \rangle_t = \rho^{rh}\,dt, \qquad
d\langle W^S, W^r \rangle_t = \rho^{Sr}\,dt,$$

where $N_t = 1_{\{\tau \le t\}}$, with $\tau$ being the inaccessible default time. $N_t$ is taken to be
independent of all the Brownian motions, $W^S$, $W^h$, and $W^r$. The yield $y_t$ is determined
by the dividend yield and the repo rate.
Initially, and in order to allow for a more tractable model, the instantaneous
short rate is taken to be deterministic. The extension to stochastic interest rates is
straightforward and will be discussed later.

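Study of the Hazard Rate Dynamics As specified above, the dynamics of the hazard
rate are given by

$$dh_t = (\theta^h_t - \kappa^h_t\,h_t)\,dt + \sigma^h_t\,dW^h_t \tag{6.14}$$

The solution to (6.14) is given by

$$h_t = h_0 \exp\left(-\int_0^t \kappa^h_u\,du\right)
+ \exp\left(-\int_0^t \kappa^h_u\,du\right)
\left[\int_0^t \exp\left(\int_0^u \kappa^h_s\,ds\right)\theta^h_u\,du
+ \int_0^t \exp\left(\int_0^u \kappa^h_s\,ds\right)\sigma^h_u\,dW^h_u\right]$$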

We have

$$S(t,T) = S(h_t, t, T) = \exp\left(m(t,T) - n(t,T)\,h_t\right) \tag{6.15}$$

with

$$m(T,T) = 0, \qquad n(T,T) = 0$$

This is true for all affine-type diffusions as mentioned earlier, that is, for all
diffusions for which the drift term and the square of volatility are linear functions
of $h_t$.

Furthermore, the survival probability $S(t,T)$ satisfies the PDE below:

$$\frac{\partial S}{\partial t}(h,t) + \frac{1}{2}\left(\sigma^h_t\right)^2 \frac{\partial^2 S}{\partial h^2}(h,t) + \left(\theta^h_t - \kappa^h_t\,h_t\right)\frac{\partial S}{\partial h}(h,t) - h_t\,S(t,T) = 0$$

with the terminal condition

$$S(T,T) = 1$$

Replacing the expression of the survival probability (6.15) in the PDE, we obtain

$$m_t(t,T) - \theta^h_t\,n(t,T) + \frac{1}{2}\left(\sigma^h_t\right)^2 n(t,T)^2 - \left(1 + n_t(t,T) - \kappa^h_t\,n(t,T)\right)h_t = 0,$$

where $m_t(t,T)$ and $n_t(t,T)$ are the first derivatives of $m(t,T)$ and $n(t,T)$ with
respect to $t$.
By separating terms that do not depend on $h_t$ and those that do depend on $h_t$, we
get a polynomial of degree one with respect to the variable $h_t$; both coefficients must
be equal to zero. We therefore have the following system of differential equations:

$$n_t(t,T) = \kappa^h_t\,n(t,T) - 1, \qquad n(T,T) = 0$$

and

$$m_t(t,T) = \theta^h_t\,n(t,T) - \frac{1}{2}\left(\sigma^h_t\right)^2 n(t,T)^2, \qquad m(T,T) = 0$$

Integrating these equations, we get

$$m(t,T) = \frac{1}{2}\int_t^T \left(\sigma^h_u\right)^2 n(u,T)^2\,du - \int_t^T \theta^h_u\,n(u,T)\,du \tag{6.16}$$

$$n(t,T) = \int_t^T \exp\left(-\int_t^u \kappa^h_s\,ds\right)du \tag{6.17}$$
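Formulas (6.15) to (6.17) can be evaluated with a minimal sketch, assuming for illustration that $\theta^h$, $\kappa^h$, and $\sigma^h$ are constant, so that $n(t,T)$ has a closed form and $m(t,T)$ can be obtained by simple quadrature. The function names and parameter values are hypothetical.

```python
import math


def n_coeff(t, T, kappa):
    """n(t, T) = integral_t^T exp(-kappa (u - t)) du for constant kappa > 0."""
    return (1.0 - math.exp(-kappa * (T - t))) / kappa


def m_coeff(t, T, theta, kappa, sigma, steps=2000):
    """m(t, T) from (6.16) by the trapezoidal rule, constant parameters."""
    du = (T - t) / steps
    total = 0.0
    for k in range(steps + 1):
        n = n_coeff(t + k * du, T, kappa)
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * (0.5 * sigma**2 * n**2 - theta * n)
    return total * du


def survival_probability(h_t, t, T, theta, kappa, sigma):
    """S(t, T) = exp(m(t, T) - n(t, T) h_t), equation (6.15)."""
    return math.exp(m_coeff(t, T, theta, kappa, sigma) - n_coeff(t, T, kappa) * h_t)


# Hypothetical parameters: h_0 = 2%, mean reversion 0.5, volatility 1%.
h0, kappa, sigma = 0.02, 0.5, 0.01
theta = kappa * h0  # drift vanishes at h_0, so the hazard rate stays near 2%
print(survival_probability(h0, 0.0, 5.0, theta, kappa, sigma))
```

With zero volatility and $\theta = \kappa h_0$ the hazard rate is constant, and the sketch reduces to $\exp(-h_0 T)$, which gives a quick consistency check.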

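Let us define the $T$-maturity instantaneous forward hazard rate by

$$f_h(0,T) = -\frac{\partial \log S(0,T)}{\partial T} \tag{6.18}$$

with

$$S(0,T) = \exp\left(m(0,T) - n(0,T)\,h_0\right)$$

We have

$$f_h(0,T) = n_T(0,T)\,h_0 - m_T(0,T),$$

where

$$n_T(0,T) = \exp\left(-\int_0^T \kappa^h_u\,du\right)$$

and

$$\begin{aligned}
m_T(0,T) &= \int_0^T \left(\sigma^h_u\right)^2 n(u,T)\,n_T(u,T)\,du - \int_0^T \theta^h_u\,n_T(u,T)\,du \\
&= \int_0^T \left(\sigma^h_u\right)^2 \beta^h(u,T) \int_u^T \beta^h(u,v)\,dv\,du - \int_0^T \theta^h_u\,\beta^h(u,T)\,du,
\end{aligned}$$

where

$$\beta^h(t,T) = \exp\left(-\int_t^T \kappa^h_u\,du\right)$$

Thus,

$$f_h(0,T) = \beta^h(0,T)\,h_0 + \int_0^T \theta^h_u\,\beta^h(u,T)\,du - \int_0^T \left(\sigma^h_u\right)^2 \beta^h(u,T) \int_u^T \beta^h(u,v)\,dv\,du$$

This expression can be rewritten as

$$f_h(0,T) = g(T) - h(T)$$

with

$$g(T) = \beta^h(0,T)\,h_0 + \int_0^T \theta^h_u\,\beta^h(u,T)\,du, \qquad
h(T) = \int_0^T \left(\sigma^h_u\right)^2 \beta^h(u,T) \int_u^T \beta^h(u,v)\,dv\,du$$

such that

$$g'(T) = \theta^h_T - \kappa^h_T\,g(T), \qquad g(0) = h_0$$

Consequently,

$$\theta^h_T = g'(T) + \kappa^h_T\,g(T)$$

and

$$\theta^h_T = \frac{\partial f_h}{\partial T}(0,T) + h'(T) + \kappa^h_T\left(f_h(0,T) + h(T)\right),$$

where

$$h'(T) = -\kappa^h_T\,h(T) + \int_0^T \left(\sigma^h_u\right)^2 \beta^h(u,T)^2\,du$$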

Finally, the parameter $\theta^h$ is uniquely determined and given by

$$\theta^h_T = \frac{\partial f_h}{\partial T}(0,T) + \int_0^T \left(\sigma^h_u\right)^2 \beta^h(u,T)^2\,du + \kappa^h_T\,f_h(0,T)$$

However, in practice, we do not need to calibrate $\theta^h_T$, as it is implied by today’s
credit curve. The above result has only a theoretical interest in showing that $\theta^h_T$ is
uniquely specified. Note also that the expression of the survival probability could
be derived by simply computing the expectation $E\left[\exp\left(-\int_t^T h_s\,ds\right) \mid \mathcal{F}_t\right]$, which is
easy given that $\exp\left(-\int_t^T h_s\,ds\right)$, conditional on $\mathcal{F}_t$, is a log-normal variable.

Dynamics of the Survival Probability The expression of the survival probability
under the risk-neutral measure $\mathbb{Q}$ is given by

$$S(t,T) = E^{\mathbb{Q}}\left[\exp\left(-\int_t^T h_s\,ds\right) \Big|\ \mathcal{F}_t\right]$$

Given this expression, we show that its dynamics under $\mathbb{Q}$ is given by

$$\frac{dS(t,T)}{S(t,T)} = h_t\,dt - \sigma^{SP}(t,T)\,dW^h_t, \tag{6.19}$$

where

$$\sigma^{SP}(t,T) = \sigma^h_t\,\Gamma^h(t,T)$$

with

$$\Gamma^h(t,T) = \int_t^T \beta^h(t,u)\,du$$

Furthermore, we have that

$$\int_t^T h_s\,ds = \Gamma^h(t,T)\,h_t + \int_t^T \Gamma^h(u,T)\,\theta^h_u\,du + \int_t^T \Gamma^h(u,T)\,\sigma^h_u\,dW^h_u$$

This is going to be very useful when pricing various derivatives as we discount
with $\exp\left(-\int_t^T h_s\,ds\right)$. The above results are very similar to the ones obtained in the
case of short-rate models.

6.4 PRICING

6.4.1 Credit Default Swap


A credit default swap (CDS) is a financial agreement between two counterparties in
which the protection buyer makes regular fixed payments during its term, whereas
the protection seller is bound to make a payment upon the default of a reference
entity. The default event definition could range from the failure to make a simple
interest payment to complete bankruptcy. The default payment could be made at
maturity (standard or European type) or at time of default (American type).
The CDS price (spread) is the premium value that makes the agreement a fair
contract, that is, the default leg is equal to the premium leg. As mentioned above, we
assume that the interest rates are deterministic (independence between the default
process and the interest rates yields exactly the same results), and the default payment
is made at the time of default (American case).

Premium Leg The premium leg is the price of a risky coupon bond with notional
equal to that of the CDS and where all the payments are discounted using risky
zero rates (default-free zero bond times the survival probability). Let $t_1, t_2, \ldots, t_n = T$
be the set of payment dates where $T$ is the maturity of the CDS, $N$ is the
notional of the swap, and $R$ is the recovery rate of the reference entity assumed to
be constant.

$$\begin{aligned}
PL &= N c \sum_{i=1}^{n} \Delta t_i\,B(0,t_i)\,E\left[1_{\{\tau > t_i\}}\right] \\
&= N c \sum_{i=1}^{n} \Delta t_i\,B(0,t_i)\,S(0,t_i) \\
&= N c \sum_{i=1}^{n} \Delta t_i\,B^d(0,t_i)
\end{aligned}$$

where $c$ is the premium rate, $\Delta t_i$ is the day count fraction for the period $[t_{i-1}, t_i]$, $t_0 = 0$, and
$B^d(0,t_i)$ is the risky zero-coupon bond defined as follows:

$$B^d(0,t_i) = B(0,t_i)\,S(0,t_i) \tag{6.20}$$

Note that this expression is not true if we have correlated hazard rate and
short-rate processes.

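Default Leg The default leg is the expected value of the default payment (DP) minus
the accrual premium payment (AP). These two quantities are computed as follows:

$$\begin{aligned}
DP &= N\,E\left[(1 - R)\,B(0,\tau)\,1_{\{\tau \le T\}}\right] \\
&= N (1 - R)\,E\left[B(0,\tau)\,1_{\{\tau \le T\}}\right] \\
&= N (1 - R) \int_0^T B(0,u)\,dF(u) \\
&= N (1 - R) \left[1 - B^d(0,T) - \int_0^T f(0,u)\,B^d(0,u)\,du\right],
\end{aligned}$$

where $f(0,T)$ is the instantaneous forward rate given by $f(0,T) = -\frac{\partial \log B(0,T)}{\partial T}$, and

$$\begin{aligned}
AP &= N\,E\left[c \sum_{i=1}^{n} (\tau - t_{i-1})\,B(0,\tau)\,1_{\{t_{i-1} < \tau \le t_i\}}\right] \\
&= N c \sum_{i=1}^{n} \left[-\Delta t_i\,B^d(0,t_i) + \int_{t_{i-1}}^{t_i} B^d(0,u)\,du - \int_{t_{i-1}}^{t_i} (u - t_{i-1})\,f(0,u)\,B^d(0,u)\,du\right]
\end{aligned}$$

The spread of the credit default swap is the value of $c$ that makes the default leg
equal to the premium leg, hence:

$$\text{Spread} = \frac{(1 - R)\left[1 - B^d(0,T) - \int_0^T f(0,u)\,B^d(0,u)\,du\right]}{\int_0^T B^d(0,u)\,du - \sum_{i=1}^{n} \int_{t_{i-1}}^{t_i} (u - t_{i-1})\,f(0,u)\,B^d(0,u)\,du} \tag{6.21}$$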

Note that credit default swaps are quoted in spread in the marketplace.
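Formula (6.21) can be evaluated with a minimal sketch, assuming for illustration a flat hazard rate and a flat interest rate, so that $f(0,u)$ equals the rate and $B^d(0,u) = e^{-(r+h)u}$. The function name, the quarterly schedule, and the midpoint quadrature are assumptions made for this example only.

```python
import math


def cds_par_spread(hazard, rate, recovery, maturity,
                   payments_per_year=4, steps_per_period=200):
    """Spread from (6.21) for flat hazard and interest rates (midpoint rule)."""
    def risky_df(u):
        return math.exp(-(rate + hazard) * u)  # B^d(0, u)

    n_periods = int(round(maturity * payments_per_year))
    period = maturity / n_periods
    du = period / steps_per_period
    int_bd = 0.0       # integral of B^d(0, u) over [0, T]
    int_f_bd = 0.0     # integral of f(0, u) B^d(0, u) over [0, T]
    int_accrual = 0.0  # sum over periods of integral (u - t_{i-1}) f B^d du
    for i in range(n_periods):
        start = i * period
        for k in range(steps_per_period):
            u = start + (k + 0.5) * du
            bd = risky_df(u)
            int_bd += bd * du
            int_f_bd += rate * bd * du
            int_accrual += (u - start) * rate * bd * du
    numerator = (1.0 - recovery) * (1.0 - risky_df(maturity) - int_f_bd)
    return numerator / (int_bd - int_accrual)


# Hypothetical inputs: 2% hazard rate, 40% recovery, 5-year quarterly CDS.
print(cds_par_spread(0.02, 0.03, 0.4, 5.0))
```

With zero interest rates the result collapses to the familiar approximation, spread $= (1 - R)\,h$, which is a useful check on the quadrature.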

6.4.2 Credit Default Swaption


Let’s call $CDS_{S,N}(t)$ the value at time $t$ of a CDS contract starting at time $T_S$ and
maturing at time $T_N$, with notional $N$ and recovery rate $R$.
Similar to a CDS starting today (6.21), the expression of $CDS_{S,N}(t)$ is given as
follows:

$$CDS_{S,N}(t) = N c \sum_{i=S+1}^{N} \left[\Delta t_i\,B^d(t,T_i) - \int_{T_{i-1}}^{T_i} (u - T_{i-1})\,B(t,u)\,dS(t,u)\right] + N (1 - R) \int_{T_S}^{T_N} B(t,u)\,dS(t,u)$$

where
■ $S(t,u)$ is the survival probability from $t$ to $u$ conditional on no default at time $t$.
■ $\Delta t_i$ is the length of time expressed in fraction of years between $T_{i-1}$ and $T_i$.
■ $B(t,u)$ ($B^d(t,u)$) is the default-free (risky or defaultable) discount factor from $t$
to time $u$.
■ $c$ is the premium paid at every payment date by the protection buyer.

The CDS par spread $s_N$ is defined as the rate $c$ that cancels the present value of
the swap, and is given as follows:

$$s_N(t) = \frac{-(1 - R) \int_{T_S}^{T_N} B(t,u)\,dS(t,u)}{\sum_{i=S+1}^{N} \left[\Delta t_i\,B^d(t,T_i) - \int_{T_{i-1}}^{T_i} (u - T_{i-1})\,B(t,u)\,dS(t,u)\right]}$$

CDS Option Price Let’s call $C_{S,N}(t) = C_{S,N}(t, T_S, K)$ the value at time $t$ of a call option
maturing at time $T_S$ and struck at $K$ written on the CDS spread contract $CDS_{S,N}(t)$.
If the default occurs before the option maturity $T_S$, two different treatments are
possible: either the option is knocked out and its value drops to zero or the option
remains valid and pays the default protection at maturity.
We focus here on the pricing of the knock-out CDS option, whose price at time $T_S$ is
given by

$$C_{S,N}(T_S) = N \left(s_N(T_S) - K\right)^+ \sum_{i=S+1}^{N} \left[\Delta t_i\,B^d(T_S,T_i) + \int_{T_{i-1}}^{T_i} (u - T_{i-1})\,f_h(T_S,u)\,B^d(T_S,u)\,du\right],$$

where $B^d(t,T)$ is given by (6.20).
The payoff of the option can be rewritten as

$$C_{S,N}(T_S) = \left(N (1 - R) \sum_{i=S+1}^{N} \int_{T_{i-1}}^{T_i} f_h(T_S,u)\,B^d(T_S,u)\,du - N K \sum_{i=S+1}^{N} \left[\Delta t_i\,B^d(T_S,T_i) + \int_{T_{i-1}}^{T_i} (u - T_{i-1})\,f_h(T_S,u)\,B^d(T_S,u)\,du\right]\right)^+$$

Let us define the level of the swap at time $t$, $LVL_{S,N}(t)$, as

$$LVL_{S,N}(t) = N \sum_{i=S+1}^{N} \left[\Delta t_i\,B^d(t,T_i) + \int_{T_{i-1}}^{T_i} (u - T_{i-1})\,f_h(t,u)\,B^d(t,u)\,du\right] \tag{6.22}$$

Writing the expression of the call price under $\mathbb{Q}^{LVL}$, we get

$$C_{S,N}(t) = LVL_{S,N}(t)\,E^{LVL}\left[\left(s_N(T_S) - K\right)^+ \mid \mathcal{F}_t\right]$$

Under $\mathbb{Q}^{LVL}$, the CDS spread, $s_N$, is a martingale. Its diffusion is assumed to be
log-normal and can be written as

$$\frac{ds_N(t)}{s_N(t)} = \sigma^{s_N}_t\,dW^{S,N}_t$$

The CDS option price is therefore given by the Black formula as follows:

$$C_{S,N}(t) = LVL_{S,N}(t) \left[s_N(t)\,\Phi(d_1) - K\,\Phi(d_2)\right],$$

where $d_1$ and $d_2$ are given by

$$\begin{aligned}
d_1 &= \frac{\log\frac{s_N(t)}{K}}{\bar{\sigma}(t,T_S)\sqrt{T_S - t}} + \frac{\bar{\sigma}(t,T_S)\sqrt{T_S - t}}{2} \\
d_2 &= d_1 - \bar{\sigma}(t,T_S)\sqrt{T_S - t} \\
\bar{\sigma}(t,T_S)^2 &= \frac{1}{T_S - t} \int_t^{T_S} \left(\sigma^{s_N}_u\right)^2 du
\end{aligned}$$
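The Black formula above is straightforward to code. The sketch below takes the swap level, the forward CDS spread, and the average volatility $\bar{\sigma}(t,T_S)$ as given inputs; the function name and the numerical values are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cds_option_black(level, spread, strike, vol, expiry):
    """Knock-out CDS call: LVL * (s * Phi(d1) - K * Phi(d2))."""
    total_vol = vol * math.sqrt(expiry)
    d1 = math.log(spread / strike) / total_vol + 0.5 * total_vol
    d2 = d1 - total_vol
    return level * (spread * norm_cdf(d1) - strike * norm_cdf(d2))


# Hypothetical inputs: level 4.2, forward spread and strike 120 bp,
# 60% spread volatility, one year to option maturity.
price = cds_option_black(4.2, 0.012, 0.012, 0.6, 1.0)
print(price)
```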

Link between CDS Spread Volatility and Hazard Rate Volatility The objective of this
section is to express the CDS spread volatility $\sigma^{s_N}_t$ as a function of the hazard rate
volatility $\sigma^h_t$. For the ease of the computations we set $N = 1$.
At time $t$, the CDS spread, $s_N(t)$, is defined as

$$s_N(t) = -(1 - R)\,\frac{\sum_{i=S+1}^{N} \int_{T_{i-1}}^{T_i} B(t,u)\,dS(t,u)}{LVL_{S,N}(t)},$$

where $LVL_{S,N}(t)$ is given by (6.22).
Recall the diffusions of the survival probability $S(t,T)$ and the instantaneous
forward hazard rate $f_h(t,T)$ under the risk-neutral measure $\mathbb{Q}$:

$$\frac{dS(t,T)}{S(t,T)} = h_t\,dt - \sigma^{SP}(t,T)\,dW_t$$

and

$$df_h(t,T) = \sigma^{SP}(t,T)\,\sigma_{f_h}(t,T)\,dt + \sigma_{f_h}(t,T)\,dW_t,$$

where $\sigma^{SP}(t,T)$ and $\sigma_{f_h}(t,T)$ are given by

$$\sigma^{SP}(t,T) = \sigma^h_t \int_t^T \beta^h(t,u)\,du \tag{6.23}$$

$$\sigma_{f_h}(t,T) = \sigma^h_t\,\beta^h(t,T) \tag{6.24}$$

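Applying Ito to the process $s_N(t)$, we get

$$ds_N(t) = \frac{1 - R}{LVL_{S,N}(t)}\,\underbrace{d\left(\sum_{i=S+1}^{N} \int_{T_{i-1}}^{T_i} f_h(t,u)\,B^d(t,u)\,du\right)}_{A}
- \frac{(1 - R) \sum_{i=S+1}^{N} \int_{T_{i-1}}^{T_i} f_h(t,u)\,B^d(t,u)\,du}{LVL_{S,N}(t)^2}\,\underbrace{d\,LVL_{S,N}(t)}_{B} + (\ldots)\,dt,$$

where the remaining terms are drift related.
The expansion of the terms $A$ and $B$ gives

$$A = \sum_{i=S+1}^{N} \int_{T_{i-1}}^{T_i} B^d(t,u)\left[\sigma_{f_h}(t,u) - f_h(t,u)\,\sigma^{SP}(t,u)\right]du\,dW_t + (\ldots)\,dt$$

$$B = -\sum_{i=S+1}^{N} \Delta t_i\,B^d(t,T_i)\,\sigma^{SP}(t,T_i)\,dW_t
+ \sum_{i=S+1}^{N} \int_{T_{i-1}}^{T_i} (u - T_{i-1})\,B^d(t,u)\left[\sigma_{f_h}(t,u) - f_h(t,u)\,\sigma^{SP}(t,u)\right]du\,dW_t + (\ldots)\,dt$$

We can now relate the CDS spread volatility $\sigma^{s_N}_t$ to the hazard rate volatility
$\sigma^h_t$, through $\sigma_{f_h}(t,u)$ and $\sigma^{SP}(t,u)$, as follows:

$$\begin{aligned}
\sigma^{s_N}_t &= \frac{\sum_{i=S+1}^{N} \int_{T_{i-1}}^{T_i} B^d(t,u)\left[\sigma_{f_h}(t,u) - f_h(t,u)\,\sigma^{SP}(t,u)\right]du}{\sum_{i=S+1}^{N} \int_{T_{i-1}}^{T_i} f_h(t,u)\,B^d(t,u)\,du} \\
&\quad + \frac{\sum_{i=S+1}^{N} \left[\Delta t_i\,B^d(t,T_i)\,\sigma^{SP}(t,T_i) - \int_{T_{i-1}}^{T_i} (u - T_{i-1})\,B^d(t,u)\left[\sigma_{f_h}(t,u) - f_h(t,u)\,\sigma^{SP}(t,u)\right]du\right]}{LVL_{S,N}(t)}
\end{aligned} \tag{6.25}$$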

6.4.3 European Call


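Deterministic Interest Rates Case At a given time $t$, the price of a standard call
option struck at $K$ and maturing at $T$ is given by

$$\begin{aligned}
C(t) &= B(t,T)\,E\left[\left(S_T\,1_{\{\tau > T\}} - K\right)^+ \mid \mathcal{G}_t\right] \\
&= B(t,T)\,E\left[\left(\tilde{S}_T - K\right)^+ 1_{\{\tau > T\}} \mid \mathcal{G}_t\right] \\
&= 1_{\{\tau > t\}}\,B(t,T)\,E\left[\exp\left(-\int_t^T h_s\,ds\right)\left(\tilde{S}_T - K\right)^+ \Big|\ \mathcal{F}_t\right]
\end{aligned}$$

where $\tilde{S}_T$ is the nondefaultable stock and its diffusion is given as follows:

$$\frac{d\tilde{S}_t}{\tilde{S}_t} = (r_t + h_t - y_t)\,dt + \sigma^S_t\,dW^S_t$$

and we have

$$S_t = \tilde{S}_t\,1_{\{\tau > t\}}$$

On the other hand, $B(t,T)$ is the nondefaultable zero-coupon bond of maturity $T$.
After some calculations, we obtain

$$C(t) = 1_{\{\tau > t\}}\,B(t,T)\,S(t,T)\left[\frac{F^T_t}{S(t,T)}\,\Phi(d_1) - K\,\Phi(d_2)\right] \tag{6.26}$$

where $F^T_t$ is the $T$-forward value of the asset.
The expressions of $d_1$, $d_2$ and $\sigma$ are given by

$$\begin{aligned}
d_1 &= \frac{\log\frac{F^T_t}{K\,S(t,T)} + \frac{1}{2}\int_t^T \sigma(u,T)^2\,du}{\sqrt{\int_t^T \sigma(u,T)^2\,du}} \\
d_2 &= d_1 - \sqrt{\int_t^T \sigma(u,T)^2\,du} \\
\sigma(t,T)^2 &= \left(\sigma^S_t\right)^2 + \sigma^{SP}(t,T)^2 + 2\rho^{Sh}\,\sigma^{SP}(t,T)\,\sigma^S_t
\end{aligned}$$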
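A minimal sketch of formula (6.26) before default is given below. It assumes, for illustration only, constant $\sigma^S$, $\sigma^h$, and $\kappa^h > 0$, so that $\sigma^{SP}(u,T) = \sigma^h\,(1 - e^{-\kappa^h (T-u)})/\kappa^h$, and it takes the forward $F^T_t$ (computed without the hazard rate), the discount factor, and the survival probability as inputs. The function names and numbers are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def defaultable_call(forward, strike, discount, survival, sigma_s,
                     sigma_h, kappa_h, rho_sh, maturity, steps=2000):
    """Call price (6.26) before default, constant sigma^S, sigma^h, kappa^h."""
    du = maturity / steps
    variance = 0.0
    for k in range(steps):  # midpoint rule for the integrated variance
        u = (k + 0.5) * du
        sigma_sp = sigma_h * (1.0 - math.exp(-kappa_h * (maturity - u))) / kappa_h
        variance += (sigma_s**2 + sigma_sp**2 + 2.0 * rho_sh * sigma_sp * sigma_s) * du
    total_vol = math.sqrt(variance)
    d1 = (math.log(forward / (strike * survival)) + 0.5 * variance) / total_vol
    d2 = d1 - total_vol
    return discount * survival * (forward / survival * norm_cdf(d1) - strike * norm_cdf(d2))


# Hypothetical inputs: at-the-money-forward call, two-year maturity.
riskless = defaultable_call(100.0, 100.0, 0.95, 1.0, 0.2, 0.0, 0.5, 0.0, 2.0)
risky = defaultable_call(100.0, 100.0, 0.95, 0.9, 0.2, 0.01, 0.5, 0.3, 2.0)
print(riskless, risky)
```

With a survival probability of one and no hazard rate volatility the price reduces to the usual Black formula; for a fixed forward, introducing default risk raises the call price.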

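Stochastic Interest Rates Case Under the risk-neutral measure, the dynamics of the
asset $S_t$, the survival probability $S(t,T)$ and the nondefaultable zero-coupon bond
$B(t,T)$ are respectively given by

$$\begin{aligned}
\frac{dS_t}{S_{t^-}} &= (r_t - y_t)\,dt + \sigma^S_t\,dW^S_t - (dN_t - h_t\,dt) \\
\frac{dS(t,T)}{S(t,T)} &= h_t\,dt - \sigma^{SP}(t,T)\,dW^h_t \\
\frac{dB(t,T)}{B(t,T)} &= r_t\,dt - \sigma^B(t,T)\,dW^r_t
\end{aligned}$$

where $r_t$ is the instantaneous short rate and

$$d\langle W^S, W^h \rangle_t = \rho^{Sh}\,dt, \qquad
d\langle W^r, W^h \rangle_t = \rho^{rh}\,dt, \qquad
d\langle W^S, W^r \rangle_t = \rho^{Sr}\,dt$$

As for the survival probability, the diffusion parameter of the zero-coupon bond is
defined as

$$\sigma^B(t,T) = \sigma^r_t \int_t^T \beta^r(t,u)\,du,$$

where

$$\beta^r(t,T) = \exp\left(-\int_t^T \kappa^r_u\,du\right)$$

$N_t$ being independent of the rest of the random terms (Brownian motions), we will
focus on the nondefaultable stock. The price of a European call option is given as
follows:

$$\begin{aligned}
C(t) &= E\left[\exp\left(-\int_t^T r_s\,ds\right)\left(S_T\,1_{\{\tau > T\}} - K\right)^+ \Big|\ \mathcal{G}_t\right] \\
&= 1_{\{\tau > t\}}\,E\left[\exp\left(-\int_t^T (r_s + h_s)\,ds\right)\left(\tilde{S}_T - K\right)^+ \Big|\ \mathcal{F}_t\right] \\
&= 1_{\{\tau > t\}}\,S_t \exp\left(-\int_t^T y_u\,du\right)\Phi(d_1) \\
&\quad - 1_{\{\tau > t\}}\,B(t,T)\,S(t,T)\,K \exp\left(\rho^{rh}\int_t^T \sigma^{SP}(u,T)\,\sigma^B(u,T)\,du\right)\Phi(d_2)
\end{aligned} \tag{6.27}$$

The expressions of $d_1$, $d_2$ and $\sigma$ are given by

$$\begin{aligned}
d_1 &= \frac{\log\frac{S_t \exp\left(-\int_t^T y_u\,du\right)}{K\,S(t,T)\,B(t,T)} - \rho^{rh}\int_t^T \sigma^B(u,T)\,\sigma^{SP}(u,T)\,du}{\sqrt{\int_t^T \sigma(u,T)^2\,du}} + \frac{\sqrt{\int_t^T \sigma(u,T)^2\,du}}{2} \\
d_2 &= d_1 - \sqrt{\int_t^T \sigma(u,T)^2\,du},
\end{aligned}$$

where

$$\begin{aligned}
\sigma(u,T)^2 &= \left(\sigma^S_u\right)^2 + \sigma^B(u,T)^2 + \sigma^{SP}(u,T)^2 + 2\rho^{Sr}\,\sigma^S_u\,\sigma^B(u,T) \\
&\quad + 2\rho^{Sh}\,\sigma^S_u\,\sigma^{SP}(u,T) + 2\rho^{rh}\,\sigma^{SP}(u,T)\,\sigma^B(u,T)
\end{aligned}$$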

Case Studied To minimize the number of parameters to estimate, we chose to work
within a deterministic interest rate framework. As shown above, the extension to a
stochastic interest rate framework is straightforward, but requires some estimation
of the correlation between the short rate and the hazard rate.
In the next section, we are going to focus on the calibration of the model’s
parameters, exploiting the above results on the pricing of some derivative products
within the modeling framework.

6.5 CALIBRATION
6.5.1 Stripping of Hazard Rate
Calibration of Default Probabilities Credit default swaps are the most liquid credit
derivative instruments on a reference entity; therefore, we can use them to back
out default probabilities. This is done by discretizing the integrals in the CDS price
formula given by (6.21). Indeed, we can write the default and accrual payments as
follows:

$$DP = N (1 - R) \sum_{k=1}^{n_d} B(0,t_k)\left[F(0,t_k) - F(0,t_{k-1})\right]$$

$$AP = N c \sum_{k=1}^{n_d} \left(t_k - t_{\eta(t_k)-1}\right) B(0,t_k)\left[F(0,t_k) - F(0,t_{k-1})\right]$$

where $n_d$ is the number of discretization dates, $t_{\eta(t_k)}$ is the next coupon date after $t_k$
(so that $t_{\eta(t_k)-1}$ is the coupon date preceding it),
and $F(0,t_k) = \mathbb{Q}(\tau \le t_k)$ is the default probability. The premium leg, on the other
hand, as a function of default probabilities, is equal to

$$PL = N c \sum_{i=1}^{n} \Delta t_i\,B(0,t_i)\left[1 - F(0,t_i)\right]$$

By bootstrapping we can get back the default probabilities $F(0,t_k)$, $k = 1, \ldots, n_d$.

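Hazard Rate Curve Estimation The hazard function is assumed to be a piecewise
constant function, between the maturity dates of the credit default swaps on the
reference entity, and is given as follows:

$$h_t = \sum_{i=1}^{m} h_i\,1_{\{t_{i-1} < t \le t_i\}}$$

The default probability is therefore given as follows:

$$\begin{aligned}
F(t) &= 1 - \exp\left(-\int_0^t h_u\,du\right) \\
&= 1 - \exp\left(-\sum_{i=1}^{\bar{\imath}} h_i\,(t_i - t_{i-1}) + h_{\bar{\imath}}\,(\eta(t) - t)\right),
\end{aligned}$$

where as before $\eta(t)$ is the next date (in the sequence $t_1, t_2, \ldots, t_m$) after $t$, and $\bar{\imath}$ is
the corresponding index ($\bar{\imath} = 1, \ldots, m$). The hazard rate $h_i$ is then given by

$$\begin{aligned}
h_i &= -\frac{1}{t_i - t_{i-1}} \log\frac{1 - F(0,t_i)}{1 - F(0,t_{i-1})} \\
&= -\frac{1}{t_i - t_{i-1}} \log\frac{S(0,t_i)}{S(0,t_{i-1})},
\end{aligned}$$

where $S(0,t_i)$ is the survival probability up to date $t_i$.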
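The stripping step can be sketched as follows. The survival probabilities at 1, 3, and 5 years are hypothetical inputs (in practice they would come from the CDS bootstrap), and the function names are illustrative. The second function evaluates the default probability on the piecewise-constant curve; beyond the last date it simply stops accumulating, which is a simplification of this sketch.

```python
import math


def strip_hazard_rates(dates, survival_probs):
    """Piecewise-constant hazards h_i from S(0, t_i), with t_0 = 0, S(0, 0) = 1."""
    hazards = []
    prev_t, prev_s = 0.0, 1.0
    for t, s in zip(dates, survival_probs):
        hazards.append(-math.log(s / prev_s) / (t - prev_t))
        prev_t, prev_s = t, s
    return hazards


def default_probability(t, dates, hazards):
    """F(t) = 1 - exp(-integral_0^t h_u du) for the piecewise-constant curve."""
    integral, prev_t = 0.0, 0.0
    for t_i, h_i in zip(dates, hazards):
        if t <= prev_t:
            break
        integral += h_i * (min(t, t_i) - prev_t)
        prev_t = t_i
    return 1.0 - math.exp(-integral)


# Hypothetical survival probabilities at 1, 3 and 5 years.
dates = [1.0, 3.0, 5.0]
survival_probs = [0.98, 0.93, 0.87]
hazards = strip_hazard_rates(dates, survival_probs)
print(hazards, default_probability(2.0, dates, hazards))
```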

6.5.2 Calibration of the Hazard Rate Process


As discussed above, we can back out the default probabilities and hazard rate curve
seen from today using credit default swaps. Similarly to interest rate models (see
chapter 3), the function $\theta^h_t$ does not need to be calibrated and is fully determined
by today’s survival (default) probabilities. Therefore, we only need to calibrate the
volatility and mean reversion parameters $\sigma^h_t$ and $\kappa^h_t$, which we can do using the credit
default swaption prices and the relationship given in (6.25).
Recall the relationship between the CDS option implied volatility $\bar{\sigma}$ and the CDS
spread local volatility $\sigma^{s_N}_t$ given by the following expression:

$$\bar{\sigma}(t_0,t_S)^2 = \frac{1}{t_S - t_0} \int_{t_0}^{t_S} \left(\sigma^{s_N}_u\right)^2 du$$

The main assumption relies on the log-normal distribution of the CDS spread
under its natural measure. Therefore, we can rewrite (6.25) such that we
have an explicit dependency between $\sigma^{s_N}_t$ and $\sigma^h_t$, by replacing $\sigma_{f_h}$ and $\sigma^{SP}$ by
the expressions given in (6.24) and (6.23). We can therefore use a least square
minimization to compute a piecewise constant $\sigma^h_t$ and $\kappa^h_t$.

6.5.3 Calibration of the Equity Volatility


The calibration of the equity diffusion consists of the calibration of the local volatility
parameter $\sigma^S$ to standard market option prices. To this end, we will use the formula
(6.26) derived before for the price of a European call option in the framework where
the stock jumps to zero as the default event happens.
Given that equity option prices are more liquid than their CDS counterpart, we
will use the former to calibrate $\sigma_t^S$ and $\rho^{Sh}$ simultaneously.

6.5.4 Discussion
For most names, owing to the lack of liquid credit default swaptions in the credit
market, the parameters $\sigma^h$ and $\kappa^h$ could be calibrated to bond options. If
none of these is liquid enough, calibration to historical data is sometimes done, and
the adjustment of these parameters (historical vs. risk-neutral) is left to the discretion
of the trader.
Sending $\kappa^h$ to 0 in (6.14), we get a Ho-Lee-type diffusion for the hazard rate,
in which the mean reversion effect disappears. This property may prove useful
when dealing with a credit quality with no particular mean-reverting behavior, or
with historical data too limited to be used to assess a value for this parameter.
The introduction of defaultable bonds written on the same credit name into the
set of calibration instruments could allow us to calibrate the hazard-rate parameters
to the market. Note that the presence of these instruments in the hedging strategy
legitimizes their use in the calibration. Furthermore, despite their close link to the
CDS product, their difference in liquidity reflects the different nature of the risk they
bear. Convertible bonds, provided they are sufficiently liquid, could also be added as
calibration instruments.
In the following section, we are going to present an extension to this framework
by introducing jumps in the stock diffusion and hazard-rate diffusion. This will
enable us to capture correctly the movements of the credit spreads. Indeed, when a
downgrade is announced by a rating agency, the spreads widen in a jumpy way.

6.6 INTRODUCTION OF DISCONTINUITIES

A natural extension of the model consists of introducing jump processes within
the previous framework to account for the spread-widening effects. The objective
is threefold: First, the presence of jumps in the equity process allows us to capture
the equity smile that is not explained by the introduction of the credit component.
Second, adding some discontinuities to the hazard-rate process will allow us to
capture, as said above, the spread movements observed in the market. Last, by
describing the jumps in both processes, equity and hazard rate, with the same jump
processes, we capture better the joint behavior of the hazard rate and the equity.
Indeed, the same jump happens at the same time in both quantities with different
amplitudes, which allows us to capture the correlation between extreme events.

6.6.1 The New Framework


We model the discontinuities in both processes, equity and hazard rate, with the
same Poisson processes $N_t^1$ and $N_t^2$, with deterministic intensities $\lambda_t^1$ and $\lambda_t^2$. The
jump sizes $J_1^S, J_2^S$ for the equity and $J_1^h, J_2^h$ for the hazard rate are constant. The reason
we introduce two jump processes is to account for downgrades and upgrades, where
the jump sizes account for the average downgrade and upgrade effects. The three
Poisson processes $N$, $N^1$ and $N^2$ are taken to be independent:

$$\frac{dS_t}{S_{t^-}} = (r_t - y_t)\,dt + \sigma_t^S\,dW_t^S + (J - 1)\,(dN_t - h_t\,dt) + J_1^S\,(dN_t^1 - \lambda_t^1\,dt) + J_2^S\,(dN_t^2 - \lambda_t^2\,dt)$$

and the hazard rate process is given by

$$dh_t = \bigl(\theta_t^h - \kappa_t^h h_t\bigr)\,dt + \sigma_t^h\,dW_t^h + J_1^h\,dN_t^1 + J_2^h\,dN_t^2, \qquad (6.28)$$

where $W_t^h$ and $W_t^S$ are correlated as before with the correlation $\rho_t^{Sh}$. We set $J = 0$
such that when the default event occurs, the equity process drops to zero and stays
there.

6.6.2 Dynamics of the Survival Probability


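We integrate the SDE (6.28) to get the expression of the hazard rate:
For $t \ge s$,

$$h_t = h_s\,\phi^h(s,t) + \int_s^t \theta_u^h\,\phi^h(u,t)\,du + \int_s^t \sigma_u^h\,\phi^h(u,t)\,dW_u^h + J_1^h \int_s^t \phi^h(u,t)\,dN_u^1 + J_2^h \int_s^t \phi^h(u,t)\,dN_u^2,$$

where $\phi^h(u,t) = \exp\left(-\int_u^t \kappa_v^h\,dv\right)$.

Integrating between $t$ and $T$, we have:

$$\int_t^T h_u\,du = \beta^h(t,T)\,h_t + \int_t^T \beta^h(u,T)\,\theta_u^h\,du + \int_t^T \beta^h(u,T)\,\sigma_u^h\,dW_u^h + J_1^h \int_t^T \beta^h(u,T)\,dN_u^1 + J_2^h \int_t^T \beta^h(u,T)\,dN_u^2,$$

where $\beta^h(u,T) = \int_u^T \phi^h(u,v)\,dv$.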
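The survival probability is therefore given by the following expression:

$$\begin{aligned}
S(t,T) &= \mathbb{E}\left[\exp\left(-\int_t^T h_u\,du\right) \,\middle|\, \mathcal{F}_t\right] \\
&= \exp\left(-\beta^h(t,T)\,h_t - \int_t^T \beta^h(u,T)\,\theta_u^h\,du + \frac{1}{2}\int_t^T \bigl(\beta^h(u,T)\,\sigma_u^h\bigr)^2 du\right) \\
&\quad \times \mathbb{E}\left[\exp\left(-J_1^h \int_t^T \beta^h(u,T)\,dN_u^1\right) \,\middle|\, \mathcal{F}_t\right]
\mathbb{E}\left[\exp\left(-J_2^h \int_t^T \beta^h(u,T)\,dN_u^2\right) \,\middle|\, \mathcal{F}_t\right] \\
&= \exp\left(-\beta^h(t,T)\,h_t - \int_t^T \beta^h(u,T)\,\theta_u^h\,du + \frac{1}{2}\int_t^T \bigl(\beta^h(u,T)\,\sigma_u^h\bigr)^2 du\right) \\
&\quad \times \exp\left(\int_t^T \lambda_u^1 \left[\exp\bigl(-J_1^h \beta^h(u,T)\bigr) - 1\right] du\right)
\exp\left(\int_t^T \lambda_u^2 \left[\exp\bigl(-J_2^h \beta^h(u,T)\bigr) - 1\right] du\right)
\end{aligned}$$

In the above expression, we have made use of the independence between all
the random processes involved in the diffusion of the hazard rate (the two Poisson
processes and the Brownian motion).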
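
The Poisson factor in the survival probability can be checked numerically. The sketch below compares the closed-form exponential of the compensator integral with a Monte Carlo average for a single jump process. Everything specific here is an assumption for illustration: a constant intensity `lam`, a constant mean reversion `kappa`, the weight function `beta(u, T) = (1 - exp(-kappa (T - u))) / kappa`, and all parameter values.

```python
import numpy as np

def beta(u, T, kappa):
    """Assumed weight of a hazard-rate jump at time u in the integral of h up to T."""
    return (1.0 - np.exp(-kappa * (T - u))) / kappa

def jump_factor_closed_form(lam, J, kappa, t, T, n_grid=20001):
    """exp( integral over [t, T] of lam * (exp(-J * beta(u, T)) - 1) du ), trapezoid rule."""
    u = np.linspace(t, T, n_grid)
    integrand = lam * (np.exp(-J * beta(u, T, kappa)) - 1.0)
    du = u[1] - u[0]
    integral = du * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return float(np.exp(integral))

def jump_factor_monte_carlo(lam, J, kappa, t, T, n_paths=200_000, seed=0):
    """Monte Carlo estimate of E[exp(-J * integral of beta(u, T) dN_u)]."""
    rng = np.random.default_rng(seed)
    counts = rng.poisson(lam * (T - t), size=n_paths)      # jumps per path
    jump_times = rng.uniform(t, T, size=counts.sum())      # uniform given the count
    path_index = np.repeat(np.arange(n_paths), counts)     # owner path of each jump
    summed = np.bincount(path_index,
                         weights=J * beta(jump_times, T, kappa),
                         minlength=n_paths)
    return float(np.exp(-summed).mean())

closed_form = jump_factor_closed_form(0.5, 0.02, 0.3, 0.0, 5.0)
monte_carlo = jump_factor_monte_carlo(0.5, 0.02, 0.3, 0.0, 5.0)
```

With a constant intensity the jump times are uniform on the interval once the number of jumps is known, which is what the simulation relies on.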
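We can also write the expression for $\exp\left(\int_t^T h_u\,du\right)$ (the equivalent of the cash
bond in interest rate):

$$\begin{aligned}
\exp\left(\int_t^T h_u\,du\right) &= \frac{1}{S(t,T)} \exp\left(\int_t^T \beta^h(u,T)\,\sigma_u^h\,dW_u^h + \frac{1}{2}\int_t^T \bigl(\beta^h(u,T)\,\sigma_u^h\bigr)^2 du\right) \\
&\quad \times \exp\left(J_1^h \int_t^T \beta^h(u,T)\,dN_u^1 + \int_t^T \left[\exp\bigl(-J_1^h \beta^h(u,T)\bigr) - 1\right] \lambda_u^1\,du\right) \\
&\quad \times \exp\left(J_2^h \int_t^T \beta^h(u,T)\,dN_u^2 + \int_t^T \left[\exp\bigl(-J_2^h \beta^h(u,T)\bigr) - 1\right] \lambda_u^2\,du\right)
\end{aligned}$$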

6.6.3 Pricing of European Options


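The price of a European call option with a strike $K$ and maturity $T$ is given by

$$\begin{aligned}
C_t &= B(t,T)\,\mathbb{E}\left[\bigl(S_T \mathbf{1}_{\{\tau > T\}} - K\bigr)^+ \,\middle|\, \mathcal{F}_t\right] \\
&= B(t,T)\,\mathbb{E}\left[\mathbf{1}_{\{\tau > T\}} \bigl(S_T - K\bigr)^+ \,\middle|\, \mathcal{F}_t\right] \\
&= \mathbf{1}_{\{\tau > t\}}\,B(t,T)\,\mathbb{E}\Biggl[\underbrace{\mathbb{E}\left[\exp\left(-\int_t^T h_u\,du\right) \bigl(S_T - K\bigr)^+ \,\middle|\, N^1, N^2\right]}_{C_t^n} \,\Bigg|\, \mathcal{F}_t\Biggr],
\end{aligned}$$

where $B(t,T)$ is the nondefaultable zero-coupon bond maturing at time $T$.

$$C_t^n = \mathbb{E}\left[\exp\left(-\int_t^T h_u\,du\right) \bigl(Z_T - K\bigr)^+ \,\middle|\, N^1, N^2\right]$$

$$\begin{aligned}
Z_T &= F_{t,T}^S \exp\left(\int_t^T h_u\,du\right) \exp\left(\int_t^T \sigma_u^S\,dW_u^S - \frac{1}{2}\int_t^T \bigl(\sigma_u^S\bigr)^2 du\right) \\
&\quad \times \exp\left(-\int_t^T \lambda_u^1 J_1^S\,du\right) \prod_{n_1=1}^{N_T^1 - N_t^1} \bigl(1 + J_1^S\bigr)
\;\exp\left(-\int_t^T \lambda_u^2 J_2^S\,du\right) \prod_{n_2=1}^{N_T^2 - N_t^2} \bigl(1 + J_2^S\bigr)
\end{aligned}$$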

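Therefore,

$$C_t = \mathbf{1}_{\{\tau > t\}}\,B(t,T) \sum_{n_1=0}^{\infty} \sum_{n_2=0}^{\infty} \mathbb{Q}\bigl(N_T^1 - N_t^1 = n_1\bigr)\,\mathbb{Q}\bigl(N_T^2 - N_t^2 = n_2\bigr)\;
\mathbb{E}\left[\mathbb{E}\left[\exp\left(-\int_t^T h_u\,du\right) \bigl(Z_T - K\bigr)^+ \,\middle|\, N_T^1 - N_t^1 = n_1,\, N_T^2 - N_t^2 = n_2\right] \,\middle|\, \mathcal{F}_t\right]$$

with

$$\mathbb{Q}\bigl(N_T^i - N_t^i = k\bigr) = \exp\left(-\int_t^T \lambda_u^i\,du\right) \frac{\left(\int_t^T \lambda_u^i\,du\right)^k}{k!}, \qquad i = 1, 2$$

Conditionally on $N_T - N_t = k$, the jump times $T_1, T_2, \ldots, T_k$ have the following
law:

$$\frac{1}{\left(\int_t^T \lambda_u\,du\right)^k}\;\mathbf{1}_{\{t \le t_1, t_2, \ldots, t_k \le T\}}\;\lambda_{t_1} \lambda_{t_2} \cdots \lambda_{t_k}\;dt_1\,dt_2 \cdots dt_k$$

Define

$$H(t,T) = \exp\left(\int_t^T h_u\,du\right), \qquad \Lambda_{t,T}^i = \int_t^T \lambda_u^i\,du$$

and

$$H^{n_1,n_2}(t,T) = H(t,T)\Big|_{N_T^1 - N_t^1 = n_1,\; N_T^2 - N_t^2 = n_2}, \qquad
Z_T^{n_1,n_2} = Z_T\Big|_{N_T^1 - N_t^1 = n_1,\; N_T^2 - N_t^2 = n_2}$$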
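The call price computation requires computing a double summation. Indeed, we
have

$$\begin{aligned}
C_t = \mathbf{1}_{\{\tau > t\}}\,B(t,T) \sum_{n_1=0}^{\infty} \sum_{n_2=0}^{\infty} \frac{1}{n_1!} \frac{1}{n_2!} \int_t^T \!\cdots\! \int_t^T
& \mathbb{E}\left[\frac{\bigl(Z_T^{n_1,n_2} - K\bigr)^+}{H^{n_1,n_2}(t,T)} \,\middle|\, \mathcal{F}_t\right] \\
& \times \lambda_{t_1^1}^1 \cdots \lambda_{t_{n_1}^1}^1\,\lambda_{t_1^2}^2 \cdots \lambda_{t_{n_2}^2}^2\;dt_1^1 \cdots dt_{n_1}^1\,dt_1^2 \cdots dt_{n_2}^2 \\
& \times \exp\bigl(-\Lambda_{t,T}^1\bigr) \exp\bigl(-\Lambda_{t,T}^2\bigr),
\end{aligned}$$

where the expressions of $H^{n_1,n_2}(t,T)$ and $Z_T^{n_1,n_2}$ are respectively given by

$$\begin{aligned}
H^{n_1,n_2}(t,T) &= \frac{1}{S(t,T)} \exp\left(\sum_{i_1=1}^{n_1} \beta^h(t_{i_1}^1, T)\,J_1^h + \sum_{i_2=1}^{n_2} \beta^h(t_{i_2}^2, T)\,J_2^h\right) \\
&\quad \times \exp\left(\int_t^T \left[\exp\bigl(-J_1^h \beta^h(u,T)\bigr) - 1\right] \lambda_u^1\,du\right)
\exp\left(\int_t^T \left[\exp\bigl(-J_2^h \beta^h(u,T)\bigr) - 1\right] \lambda_u^2\,du\right) \\
&\quad \times \exp\left(\int_t^T \beta^h(u,T)\,\sigma_u^h\,dW_u^h + \frac{1}{2}\int_t^T \bigl(\beta^h(u,T)\,\sigma_u^h\bigr)^2 du\right)
\end{aligned}$$

$$\begin{aligned}
Z_T^{n_1,n_2} &= F_{t,T}^S\,H^{n_1,n_2}(t,T)\,\bigl(1 + J_1^S\bigr)^{n_1} \bigl(1 + J_2^S\bigr)^{n_2}
\exp\left(-J_1^S \int_t^T \lambda_u^1\,du\right) \exp\left(-J_2^S \int_t^T \lambda_u^2\,du\right) \\
&\quad \times \exp\left(\int_t^T \sigma_u^S\,dW_u^S - \frac{1}{2}\int_t^T \bigl(\sigma_u^S\bigr)^2 du\right)
\end{aligned}$$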

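To simplify these expressions, we introduce the new quantities:

$$\begin{aligned}
\tilde{H}^{n_1,n_2}(t,T) &= \exp\left(-\sum_{i_1=1}^{n_1} \beta^h(t_{i_1}^1, T)\,J_1^h - \sum_{i_2=1}^{n_2} \beta^h(t_{i_2}^2, T)\,J_2^h\right) \\
&\quad \times \exp\left(-\int_t^T \left[\exp\bigl(-J_1^h \beta^h(u,T)\bigr) - 1\right] \lambda_u^1\,du\right)
\exp\left(-\int_t^T \left[\exp\bigl(-J_2^h \beta^h(u,T)\bigr) - 1\right] \lambda_u^2\,du\right)
\end{aligned}$$

$$\tilde{Z}_T^{n_1,n_2}(t,T) = \bigl(1 + J_1^S\bigr)^{n_1} \bigl(1 + J_2^S\bigr)^{n_2}
\exp\left(-J_1^S \int_t^T \lambda_u^1\,du\right) \exp\left(-J_2^S \int_t^T \lambda_u^2\,du\right)$$

such that

$$H^{n_1,n_2}(t,T) = \frac{1}{\tilde{H}^{n_1,n_2}(t,T)\,S(t,T)}
\exp\left(\int_t^T \beta^h(u,T)\,\sigma_u^h\,dW_u^h + \frac{1}{2}\int_t^T \bigl(\beta^h(u,T)\,\sigma_u^h\bigr)^2 du\right)$$

$$\begin{aligned}
Z_T^{n_1,n_2} &= F_{t,T}^S\,\frac{\tilde{Z}_T^{n_1,n_2}(t,T)}{\tilde{H}^{n_1,n_2}(t,T)\,S(t,T)}
\exp\left(\int_t^T \beta^h(u,T)\,\sigma_u^h\,dW_u^h + \frac{1}{2}\int_t^T \bigl(\beta^h(u,T)\,\sigma_u^h\bigr)^2 du\right) \\
&\quad \times \exp\left(\int_t^T \sigma_u^S\,dW_u^S - \frac{1}{2}\int_t^T \bigl(\sigma_u^S\bigr)^2 du\right)
\end{aligned}$$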

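Therefore,

$$\mathbb{E}\left[\frac{\bigl(Z_T^{n_1,n_2} - K\bigr)^+}{H^{n_1,n_2}(t,T)} \,\middle|\, \mathcal{F}_t\right]
= F_{t,T}^S\,\tilde{Z}_T^{n_1,n_2}(t,T)\;\mathbb{Q}^1\bigl(Z_T^{n_1,n_2} > K \,\big|\, \mathcal{F}_t\bigr)
- K\,\tilde{H}^{n_1,n_2}(t,T)\,S(t,T)\;\mathbb{Q}^2\bigl(Z_T^{n_1,n_2} > K \,\big|\, \mathcal{F}_t\bigr),$$

where $\mathbb{Q}^1$ and $\mathbb{Q}^2$ are given as follows:

$$\frac{d\mathbb{Q}^1}{d\mathbb{Q}} = \exp\left(\int_t^T \sigma_u^S\,dW_u^S - \frac{1}{2}\int_t^T \bigl(\sigma_u^S\bigr)^2 du\right)$$

$$\frac{d\mathbb{Q}^2}{d\mathbb{Q}} = \exp\left(-\int_t^T \beta^h(u,T)\,\sigma_u^h\,dW_u^h - \frac{1}{2}\int_t^T \bigl(\beta^h(u,T)\,\sigma_u^h\bigr)^2 du\right)$$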

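We can therefore write the price $C_t$ as follows:

$$\begin{aligned}
C_t = \mathbf{1}_{\{\tau > t\}}\,B(t,T)\,\exp\bigl(-\Lambda_{t,T}^1\bigr) \exp\bigl(-\Lambda_{t,T}^2\bigr)
\sum_{n_1=0}^{\infty} \sum_{n_2=0}^{\infty} \frac{1}{n_1!} \frac{1}{n_2!} \int_t^T \!\cdots\! \int_t^T
& P^{n_1,n_2}(t,T) \\
& \times \lambda_{t_1^1}^1 \cdots \lambda_{t_{n_1}^1}^1\,\lambda_{t_1^2}^2 \cdots \lambda_{t_{n_2}^2}^2\;dt_1^1 \cdots dt_{n_1}^1\,dt_1^2 \cdots dt_{n_2}^2,
\end{aligned} \qquad (6.29)$$

where

$$P^{n_1,n_2}(t,T) = F_{t,T}^S\,\tilde{Z}_T^{n_1,n_2}(t,T)\,\mathcal{N}\bigl(d_1^{n_1,n_2}\bigr) - K\,\tilde{H}^{n_1,n_2}(t,T)\,S(t,T)\,\mathcal{N}\bigl(d_2^{n_1,n_2}\bigr)$$

$$d_1^{n_1,n_2} = \frac{\log\dfrac{F_{t,T}^S\,\tilde{Z}_T^{n_1,n_2}(t,T)}{K\,\tilde{H}^{n_1,n_2}(t,T)\,S(t,T)} + \dfrac{1}{2}\Sigma^2(t,T)}{\Sigma(t,T)}, \qquad
d_2^{n_1,n_2} = \frac{\log\dfrac{F_{t,T}^S\,\tilde{Z}_T^{n_1,n_2}(t,T)}{K\,\tilde{H}^{n_1,n_2}(t,T)\,S(t,T)} - \dfrac{1}{2}\Sigma^2(t,T)}{\Sigma(t,T)}$$

$$\Sigma^2(t,T) = \int_t^T \left[\bigl(\sigma_u^S\bigr)^2 + \bigl(\beta^h(u,T)\,\sigma_u^h\bigr)^2 + 2\,\rho_u^{Sh}\,\sigma_u^S\,\sigma_u^h\,\beta^h(u,T)\right] du$$

and $\mathcal{N}$ is the standard normal distribution function.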
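
For fixed jump counts and jump times, the term inside the double sum is a Black-type formula in two aggregated inputs. A minimal sketch follows. The function and argument names are hypothetical: `z_tilde` and `h_tilde` stand for the equity and hazard jump adjustments, `survival` for $S(t,T)$, and `total_variance` for the integrated variance of the log of the conditional stock value.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def conditional_call_price(forward, z_tilde, strike, h_tilde, survival, total_variance):
    """Black-type price conditional on the jump counts and jump times.

    Undiscounted: the zero-coupon bond factor multiplies the result outside this function.
    """
    a = forward * z_tilde             # adjusted forward
    b = strike * h_tilde * survival   # adjusted, survival-weighted strike
    vol = math.sqrt(total_variance)
    d1 = (math.log(a / b) + 0.5 * total_variance) / vol
    d2 = d1 - vol
    return a * norm_cdf(d1) - b * norm_cdf(d2)

# Sanity case: no jumps and no default risk reduces to the undiscounted Black formula.
price = conditional_call_price(100.0, 1.0, 100.0, 1.0, 1.0, 0.04)
```

The full price still requires averaging this quantity over the jump counts and jump times, which is the costly part of the direct computation.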

It is clear from (6.29) that a direct computation of the price of a European call
option is time consuming and a better alternative is needed. In the following section,
we apply the Fourier method in order to speed up the pricing.

6.6.4 Fourier Pricing


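Applying the Carr-Madan technique, we define the dampened call price as follows:

$$\tilde{C}_t(k) = \exp(\alpha k)\,C(t,T,k), \quad \text{with } \alpha > 0 \text{ and } k = \log K$$

$C(t,T,k)$ is the price of a European option with strike $K$ and maturity $T$. The
Fourier transform of $\tilde{C}_t(k)$ is given by

$$\begin{aligned}
\psi(\nu) &= \int_{-\infty}^{+\infty} e^{i\nu k}\,\tilde{C}_t(k)\,dk \\
&= B(t,T) \int_{-\infty}^{+\infty} e^{i\nu k} e^{\alpha k}\,\mathbb{E}\left[\exp\left(-\int_t^T h_u\,du\right) \bigl(e^{s_T} - e^k\bigr)^+ \,\middle|\, \mathcal{F}_t\right] dk \\
&= B(t,T)\,\mathbb{E}\left[\exp\left(-\int_t^T h_u\,du\right) \int_{-\infty}^{+\infty} e^{i\nu k} e^{\alpha k} \bigl(e^{s_T} - e^k\bigr)^+ dk \,\middle|\, \mathcal{F}_t\right] \\
&= B(t,T)\,\mathbb{E}\left[\exp\left(-\int_t^T h_u\,du\right) \int_{-\infty}^{s_T} e^{i\nu k} e^{\alpha k} \bigl(e^{s_T} - e^k\bigr)\,dk \,\middle|\, \mathcal{F}_t\right] \\
&= B(t,T)\,S(t,T)\,\frac{\Phi(\nu)}{(\alpha + i\nu)(\alpha + i\nu + 1)},
\end{aligned}$$

where $S_T = e^{s_T}$ is the nondefaultable stock,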
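with, writing $z = \alpha + 1 + i\nu$,

$$\begin{aligned}
\Phi(\nu) &= \frac{1}{S(t,T)}\,\mathbb{E}\left[\exp\left(-\int_t^T h_u\,du\right) e^{(i\nu + \alpha + 1)\,s_T} \,\middle|\, \mathcal{F}_t\right] \\
&= \left(\frac{F_{t,T}^S}{S(t,T)}\right)^{z}
\mathbb{E}\left[\exp\left(z \int_t^T \sigma_u^S \sqrt{1 - \bigl(\rho_u^{Sh}\bigr)^2}\,dZ_u^S\right) \,\middle|\, \mathcal{F}_t\right] \\
&\quad \times \mathbb{E}\left[\exp\left(\int_t^T \left[z\,\sigma_u^S \rho_u^{Sh} + (z - 1)\,\beta^h(u,T)\,\sigma_u^h\right] dW_u^h\right) \,\middle|\, \mathcal{F}_t\right] \\
&\quad \times \mathbb{E}\left[\exp\left(\int_t^T \left[z \log\bigl(1 + J_1^S\bigr) + (z - 1)\,\beta^h(u,T)\,J_1^h\right] dN_u^1\right) \,\middle|\, \mathcal{F}_t\right] \\
&\quad \times \mathbb{E}\left[\exp\left(\int_t^T \left[z \log\bigl(1 + J_2^S\bigr) + (z - 1)\,\beta^h(u,T)\,J_2^h\right] dN_u^2\right) \,\middle|\, \mathcal{F}_t\right] \\
&\quad \times \exp\left(-\frac{z}{2} \int_t^T \bigl(\sigma_u^S\bigr)^2 du\right) \exp\left(\frac{z - 1}{2} \int_t^T \bigl(\beta^h(u,T)\,\sigma_u^h\bigr)^2 du\right) \\
&\quad \times \exp\left(\int_t^T \left[-z J_1^S + (z - 1)\bigl(e^{-J_1^h \beta^h(u,T)} - 1\bigr)\right] \lambda_u^1\,du\right) \\
&\quad \times \exp\left(\int_t^T \left[-z J_2^S + (z - 1)\bigl(e^{-J_2^h \beta^h(u,T)} - 1\bigr)\right] \lambda_u^2\,du\right)
\end{aligned}$$

where $i = \sqrt{-1}$, and $Z^S$ is independent from $W^h$ such that
$dW_t^S = \rho_t^{Sh}\,dW_t^h + \sqrt{1 - \bigl(\rho_t^{Sh}\bigr)^2}\,dZ_t^S$. The above expression is simplified as follows:

$$\begin{aligned}
\Phi(\nu) &= \left(\frac{F_{t,T}^S}{S(t,T)}\right)^{z}
\exp\left(\frac{z(z - 1)}{2} \int_t^T \left[\bigl(\sigma_u^S\bigr)^2 + 2\,\sigma_u^S \rho_u^{Sh} \sigma_u^h\,\beta^h(u,T)\right] du\right) \\
&\quad \times \exp\left(\frac{z(z - 1)}{2} \int_t^T \bigl(\beta^h(u,T)\,\sigma_u^h\bigr)^2 du\right) \\
&\quad \times \exp\left(\int_t^T \left[\bigl(1 + J_1^S\bigr)^{z} e^{(z - 1) J_1^h \beta^h(u,T)} - 1 - z J_1^S - (z - 1)\bigl(1 - e^{-J_1^h \beta^h(u,T)}\bigr)\right] \lambda_u^1\,du\right) \\
&\quad \times \exp\left(\int_t^T \left[\bigl(1 + J_2^S\bigr)^{z} e^{(z - 1) J_2^h \beta^h(u,T)} - 1 - z J_2^S - (z - 1)\bigl(1 - e^{-J_2^h \beta^h(u,T)}\bigr)\right] \lambda_u^2\,du\right)
\end{aligned}$$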
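We can therefore compute the call price by inverting the Fourier transform
computed above, here written $\Phi(\nu)$:

$$C(t,T,k) = e^{-\alpha k}\,\tilde{C}_t(k) = B(t,T)\,S(t,T)\,\frac{e^{-\alpha k}}{\pi} \int_0^{+\infty} \operatorname{Re}\left[e^{-i\nu k}\,\frac{\Phi(\nu)}{(\alpha + i\nu)(\alpha + i\nu + 1)}\right] d\nu$$

This technique is computationally very quick, which is very important for the
calibration.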
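
The inversion step can be illustrated on a case with a known answer. The sketch below applies the same dampened-transform inversion to a plain Black-Scholes stock, with no default and no jumps, so that the result can be compared with the closed-form price. This is a deliberate simplification and an assumption of the example: the hybrid transform is replaced by the Gaussian moment generating function of the log-stock, the product of bond and survival probability by a flat discount factor, and the damping `alpha`, the cutoff `v_max` and the grid size are arbitrary choices.

```python
import numpy as np
from math import erf, exp, log, sqrt

def bs_call(spot, strike, rate, vol, maturity):
    """Closed-form Black-Scholes call price, used as the benchmark."""
    d1 = (log(spot / strike) + (rate + 0.5 * vol**2) * maturity) / (vol * sqrt(maturity))
    d2 = d1 - vol * sqrt(maturity)
    cdf = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    return spot * cdf(d1) - strike * exp(-rate * maturity) * cdf(d2)

def carr_madan_call(spot, strike, rate, vol, maturity, alpha=1.5, v_max=200.0, n=20001):
    """Call price by numerical inversion of the dampened Fourier transform."""
    k = log(strike)
    v = np.linspace(0.0, v_max, n)
    z = alpha + 1.0 + 1j * v                         # exponent applied to s_T = log S_T
    mean = log(spot) + (rate - 0.5 * vol**2) * maturity
    var = vol**2 * maturity
    phi = np.exp(z * mean + 0.5 * z**2 * var)        # E[exp(z s_T)] for Gaussian s_T
    psi = exp(-rate * maturity) * phi / ((alpha + 1j * v) * (alpha + 1j * v + 1.0))
    integrand = np.real(np.exp(-1j * v * k) * psi)
    dv = v[1] - v[0]
    integral = dv * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))  # trapezoid
    return exp(-alpha * k) / np.pi * integral

fourier_price = carr_madan_call(100.0, 100.0, 0.05, 0.2, 1.0)
closed_price = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
```

In a calibration loop the transform would be evaluated once per maturity and reused across strikes, which is where the speed advantage over the double sum comes from.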

6.7 EQUITY DEFAULT SWAPS

The equity default swap, or EDS, is an instrument whose definition intentionally
reflects that of the much-better-known credit default swap (CDS).
The two counterparties to the EDS transaction are the protection buyer and
protection seller. The protection that is traded is protection against a stock’s reaching
a level that is relatively low compared to the reference price (the stock’s traded price
at the inception of the trade). This level, the barrier, might be 70%, perhaps 50%,
or even 30% of the reference price, that is, low compared to barriers typically seen
in standard down-barrier options.
The key difference between the EDS and CDS is that the CDS protection is
triggered only on a default event, doubtless accompanied by a precipitous drop in
the share price of the defaulting company, perhaps to near zero, whereas the EDS
protection pays out if the stock drops below a level that is still far from zero, whether
or not this is accompanied by a default. (It is perfectly possible for the share price
of a company to fall, over a reasonably long period, by a factor of three with no
suggestion that the company is close to default nor with necessarily a corresponding
fall in its dividend payment.)
This breach of the barrier is known as the knock-in event. We may immediately
distinguish two cases: the knock-in event being caused by the stock diffusing across
the barrier in the ordinary way (the mechanism by which barriers are breached when
barrier options are priced, as they frequently are, in a pure diffusion model such as
local volatility); and the event being caused by a default causing the share price to
‘‘gap’’ to below the barrier (perhaps to near zero).
Note that although we distinguish these cases within the context of a diffusion
model with jump-to-default, there is no such distinction written into the definition
of the structure. Nor, for that matter, is it generally clear from market information
what is causing any given move in the stock. An advantage often claimed for EDS
over CDS is that the determination of a credit event is less transparent than an
observation of a share price, as the latter is based on public market information.
As its name suggests, the EDS is structured as a swap having two legs: the fixed
leg and the protection leg, sometimes known as the equity leg. Each leg, of course,
is written on the same notional amount $N_0$. The protection buyer is short the fixed
leg: he pays a predetermined coupon stream to the protection seller. This might be
$x\% \times N_0$ quarterly, for example. The protection seller pays a predetermined amount
to the protection buyer in the event of the barrier's being breached. This amount
will be $y\% \times N_0$ less the accrued on the fixed leg at the time of the knock-in event.
Another variant replaces the fixed coupons with floating, so the payments might
be calculated using a LIBOR rate plus a spread. Again, the floating leg payment in
the event of a knock-in event is the appropriate accrued amount.
It is possible to extend the equity default swap concept to so-called multiname
structures, in which the protection traded, and therefore the definition of the knock-
in event, relates to the first of several underlyings to hit its corresponding barrier
(compare first-to-default structures in the credit market); all barriers being set at the
same level relative to their spot values at the inception of the trade. We will not
explicitly describe these here; it is a straightforward generalization. At the time of
writing, in the authors’ experience, multiname EDSs trade less frequently than single
name.

Structuring an EDS EDSs as described above trade in the interbank market. For
wider distribution, however, it is common to structure the product as a note.
The buyer pays 100 on entering the position, and receives this amount back
from the issuer at maturity unless there has been a knock-in event prior to this.
The knock-in event is defined as the first date on which the closing price of the
underlying share on the relevant exchange is at or below the barrier. (It is precisely
this transparency and simplicity of definition that is the feature argued in favor
of the EDS.) The buyer of the note receives a coupon stream, or else it may be a
zero-coupon instrument, in which case he receives only a coupon at maturity, in
addition to the redemption payment.
In the event of a knock-in, the note is said to accelerate (i.e., terminate early),
and the holder receives an early redemption payment much less than 100: only 50,
say. He also receives the accrued on the coupon accruing at the time of the knock-
in event.
The holder therefore stands to lose a substantial fraction of his investment if
the underlying share breaches the barrier. If default were the only process that
could trigger this, he would be accepting simple credit risk and would have sold the
protection against that risk. Of course, in a model with default and diffusion, either
process can cause the knock-in event to trigger. In exchange for accepting this risk,
he is compensated with an above-risk-free coupon stream.
This trade can be decomposed into an EDS as defined above, whose protection
leg pays 50, plus a bond that knocks out under the same conditions as the knock-in
event of the note.
The protection seller (note holder) may also enter into a cancelable swap to
mitigate his interest rate risk. Thus, he may agree to exchange his fixed-coupon
payment for a sequence of floating payments plus a spread. He would then have
interest rate risk only on the spread (the floating payments plus final redemption
payment value exactly to 100, irrespective of rates). The cancellation clause would,
of course, be precisely the knock-in event of the note.

6.7.1 Modeling Equity Default Swaps


For a general-case multiname EDS priced under stochastic hazard rates, any of
the choices for the hazard-rate diffusion given in section 6.3.1 may be realized by
brute-force Monte Carlo. There are a number of issues with this:

■ Each time step of each path of each underlying requires an interpolation to
be made on a local volatility surface, which can be slow. That said, careful
implementation techniques can considerably alleviate this problem.
■ Structures having many underlyings are computationally intensive to risk manage:
for $N$ underlyings, there are obviously $N$ deltas, gammas, and vegas, and
$\tfrac{1}{2}N(N-1)$ off-diagonal gammas. Clearly, this is not limited to multiname EDSs,
nor is it specific to Monte Carlo as a numerical technique, but it is a significant
issue in modeling and risk managing these positions. Combined with the instability
of simple finite difference approximations for the gammas of barrier-type
products valued by Monte Carlo, there is a real issue in obtaining this risk for
multiname EDS. Exactly the same issue is addressed in section 7.4.1 as applied
to another multiunderlying structure type: the Altiplano.
■ The simulation requires as parameters the correlations between all pairs of
equities and hazard rates. In particular, it requires equity-equity correlations,
which we take to be given. It requires also the correlations between equities
and their corresponding hazard rates; the calibration of the hazard-rate process
provides these. But it also needs correlations between pairs of hazard rates and
between hazard rates and other stocks: We have to make assumptions about
these categories of data.

Furthermore, it is found that the EDS is not especially sensitive to the volatility
of the hazard rate. Accordingly, it is reasonable to model it under deterministic
hazard rates, which is usually done.

6.7.2 Single-Name EDSs in a Deterministic Hazard Rate Model


In the case of single-name EDSs, we can do significantly better than Monte Carlo by
treating the structure as a sort of barrier option. The usual approach to these types
is to use a finite difference or finite element scheme to discretize the PDE, and to
apply a Dirichlet condition at the barrier. Chapter 9 gives an account of numerical
solution of the PDEs of finance using finite element methods, and there are many
accounts of the application of finite difference techniques to these PDEs.
If the protection amount is $y\% \times N_0$ less the accrued on the fixed leg at the
time of the knock-in event, then the Dirichlet condition is a sawtooth function of
time, whose discontinuities are the coupon dates. This should not cause a problem
in practice, as the scheme will in any event place time steps exactly on the coupon
dates in order to adjust the node values by the coupon payments.
One natural corollary to using a PDE approach instead of a Monte Carlo is
that the barrier is assumed to be observed continuously, whereas in a simple Monte
Carlo, the observations are necessarily discrete at the step dates. In the case of our
example above, the observation of exchange closing prices implies that the Monte
Carlo approach is exact. Accordingly, in adopting a PDE approach we are trading
a slight bias in the pricing for an improvement in speed and in the stability of the
greeks.2

[Figure 6.1: AHLN.AS implied volatility. Surface of implied volatility (0 to 90) against strike (10% to 248% of spot) and maturity (3m to 75m).]

FIGURE 6.1 Ahold implied volatility as of December 2005, across a wide range of strikes
and up to 7y maturity.

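We therefore have a PDE in one spatial variable to solve for the single-name
EDS. For a time-dependent protection payment $R(t)$, the PDE we solve in a local
volatility model with deterministic default risk is

$$\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2(S,t)\,S^2\,\frac{\partial^2 V}{\partial S^2} + \bigl(r(t) + \lambda(t)\bigr)\,S\,\frac{\partial V}{\partial S} - \bigl(r(t) + \lambda(t)\bigr)\,V = -\lambda(t)\,R(t) \qquad (6.30)$$

where $\lambda(t)$ denotes the deterministic hazard rate. The significant term in this is, of
course, the inhomogeneous source term introduced by the presence of jumps.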
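
A bare-bones finite difference treatment of the protection leg shows how the Dirichlet condition at the barrier and the source term interact. This is only a sketch under strong simplifying assumptions that are not taken from the text: constant volatility, interest rate, hazard rate and protection payment, a uniform grid in log-spot, fully implicit time stepping, and a zero-slope condition at the upper boundary. The chapter itself points to finite element methods and a local volatility surface.

```python
import numpy as np

def eds_protection_leg(sigma, r, lam, R, barrier, s_max, T, n_x=200, n_t=500):
    """Protection-leg value on a log-spot grid at t = 0 (implicit Euler)."""
    x = np.linspace(np.log(barrier), np.log(s_max), n_x + 1)
    dx = x[1] - x[0]
    dt = T / n_t
    a = 0.5 * sigma**2
    mu = r + lam - a                       # drift of log S before default
    lower = -dt * (a / dx**2 - mu / (2.0 * dx))
    diag = 1.0 + dt * (2.0 * a / dx**2 + r + lam)
    upper = -dt * (a / dx**2 + mu / (2.0 * dx))

    A = np.zeros((n_x + 1, n_x + 1))
    for i in range(1, n_x):
        A[i, i - 1], A[i, i], A[i, i + 1] = lower, diag, upper
    A[0, 0] = 1.0                                   # Dirichlet row: V = R at the barrier
    A[n_x, n_x], A[n_x, n_x - 1] = 1.0, -1.0        # zero slope far from the barrier
    A_inv = np.linalg.inv(A)                        # small dense system, inverted once

    V = np.zeros(n_x + 1)                           # nothing is paid at maturity
    V[0] = R
    for _ in range(n_t):                            # step backward from T to 0
        rhs = V + dt * lam * R                      # source term: payment on default
        rhs[0] = R
        rhs[-1] = 0.0
        V = A_inv @ rhs
    return np.exp(x), V

spots, values = eds_protection_leg(sigma=0.3, r=0.03, lam=0.02, R=1.0,
                                   barrier=0.5, s_max=8.0, T=5.0)
# Pure default protection for constant rates, the expected far-field limit.
asymptote = 0.02 / 0.05 * (1.0 - np.exp(-0.05 * 5.0))
```

Far from the barrier the grid value should approach the pure default protection value, up to a small time-discretization offset.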
A Worked Example: Ahold As an example, we select for study a five-year EDS written
on Ahold (Reuters code AHLN.AS) with a barrier at 50% (of the share’s traded
price at the inception of the trade). (Ahold is a group of food retail and food service
operators listed on Euronext and other exchanges.) As of December 2005, the credit
default swap curve for this company was rising steeply from around 20bps for a
one-year CDS to around 110bps at five years. With this information and market
implied volatilities (shown in figure 6.1), we can calibrate a local volatility for the
stock, given the possibility of jump to zero, using the procedures of chapter 8. The
results of the calibration procedure are shown in figure 6.2.
The figure shows the relative error in European option prices after the calibration,
that is, the difference between prices calculated in a local volatility model and market
prices inferred from implied volatilities, divided by the price of the stock:
$$\text{Error} = \frac{P_{LV} - P_{Mkt}}{S}$$
2 A barrier shift can approximately compensate for this.

[Figure 6.2: AHLN.AS local volatility calibration. Surface of relative error (-0.002 to 0.008) against spot (0.64 to 15.36) and maturity (30-Dec-05 to 02-Mar-09).]

FIGURE 6.2 The local volatility calibration error on European option prices as a fraction of
spot. The peak indicates the onset of arbitrage in the data.

The graph indicates that the majority of the region displayed calibrates to within 10
basis points, regarded as reasonably acceptable. However, there is a pronounced
peak at longer maturities and at spot prices below about 50% of the prevailing
traded price. This is not an error in the calibration: It indicates the onset of arbitrage
between the implied volatilities and the CDS curve used in the calibration. We can
think of this in the following way:
For a constant volatility of the diffusion process, the presence of default risk
makes puts more expensive as it raises the conditional (on no default) forward
while at the same time introducing a likelihood of a maximum payout from the put.
(The calibration options are taken to be riskless, perhaps exchange-traded, options
on a risky share.) It also raises the call price: the increased forward contributing
positively to the price while the probability of default before maturity resulting in a
zero payout acts in the opposite way. Call-put parity is still required to hold, as the
effect of default risk on the distribution at maturity is simply to change its form by
introducing a peak at $S_T = 0$. (Throughout, we are considering that default results in
the share price dropping to zero.) The local volatility calibration tries to compensate
for this price-increasing effect of the credit by lowering the diffusion volatility to
preserve the observed market price. If a near-zero local volatility cannot reproduce
the calibration prices, then the data are arbitrageable.
We may value the EDS in a finite difference lattice scheme and use this to look
at the price as a function of spot at the $t = 0$ time step of the grid.3 In the interests
of simplicity, we will, in the following, drop the time variation of the protection
payment caused by the accrued coupon. No essential features of the protection leg
are lost.

3 The barrier of 50% keeps the lattice clear of the arbitrageable region.
[Figure 6.3: EDS Approach to Default Protection. Series: EDS, Default Protection; horizontal axis: Spot / S0 (50% to 404%); vertical axis: Price (0 to 1.2).]

FIGURE 6.3 A 5y EDS on AHLN.AS vs. share price. The asymptote is the value of a default
protection written on the stock.

Figure 6.3 plots only the protection leg of the EDS. Note that, since the plot is
taken from a single FD grid, the local volatility model is assumed valid, inasmuch
as the local volatility surface is held constant rather than the implied surface. (See
section 1.2.1 and remark 1.2.1 for reasons why keeping the implied surface constant
between plots, and recalibrating local volatility each time, is inappropriate.) The
graph shows an asymptote, which is the value of a pure default protection calculated
according to

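$$\begin{aligned}
\text{Default Protection} &:= \mathbb{E}\left[\mathbf{1}_{\{0 < \tau \le T\}}\,DF(\tau)\right] \\
&= \int_0^{\infty} \mathbf{1}_{\{0 < t \le T\}} \exp\left(-\int_0^t r_s\,ds\right) h_t \exp\left(-\int_0^t h_s\,ds\right) dt \\
&= \int_0^T h_t \exp\left(-\int_0^t h_s\,ds\right) \exp\left(-\int_0^t r_s\,ds\right) dt,
\end{aligned}$$

this being the value of a payment of one at the time of default, if default occurs
before a maturity T.4 Although not visible in the graph, there is a small offset
between the analytic default protection value and the limiting EDS leg value. This
decreases slowly as the number of time steps in the FD grid increases.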
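
The default protection integral is straightforward to evaluate numerically for any deterministic rate and hazard curves. The sketch below uses the trapezoid rule throughout and checks the result against the closed form available when both curves are flat. The function name, the grid size, and the flat test curves are assumptions made for the example.

```python
import numpy as np

def default_protection(hazard, rate, T, n=20001):
    """Value of one unit paid at default before T, for deterministic h(t) and r(t).

    `hazard` and `rate` are callables that accept a NumPy array of times.
    """
    t = np.linspace(0.0, T, n)
    dt = t[1] - t[0]
    h = hazard(t)
    r = rate(t)
    # Running integrals of h and r from 0 to each grid time (trapezoid rule).
    cum_h = np.concatenate(([0.0], np.cumsum(0.5 * (h[1:] + h[:-1]) * dt)))
    cum_r = np.concatenate(([0.0], np.cumsum(0.5 * (r[1:] + r[:-1]) * dt)))
    integrand = h * np.exp(-cum_h - cum_r)
    return float(dt * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1])))

# Flat curves: the integral reduces to h / (r + h) * (1 - exp(-(r + h) T)).
flat_value = default_protection(lambda t: np.full_like(t, 0.02),
                                lambda t: np.full_like(t, 0.03), 5.0)
flat_closed_form = 0.02 / 0.05 * (1.0 - np.exp(-0.05 * 5.0))
```

A piecewise constant hazard curve bootstrapped from CDS quotes can be passed in the same way, as long as the callable returns one value per grid time.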
Were we to model the protection leg of the EDS on a default-free underlying,
we would call it a deep out-of-the-money American Digital put,5 and the asymptotic
value as $S \to \infty$ would be zero, in contrast to figure 6.3. The only way in which
such a structure can yield value to the holder, in the default-free model, is by the
stock diffusing across the barrier. Contrast this with the EDS default protection
leg on a risky underlying where the protection buyer can receive a payout either
if diffusion carries the stock to the barrier or if default carries the stock clean
through the barrier. Both possibilities contribute value to the structure, in amounts
according to the distance of the asset from the barrier relative to the general level of
its volatility and to its hazard rates. Accordingly, we can identify the two regimes in
which the EDS can exist, and call them diffusion dominated and default dominated,
corresponding to the cases where most of the value comes from the possibility of the
stock diffusing to the barrier, and to the converse case where it mostly comes from
the possibility of default.

4 In evaluating the default protection, a quantity completely independent of spot price, the
same interest rates and hazard rates were used as for the EDS.
5 This is just a matter of language. The contract terms are (apart from the accrued coupon)
identical between an American Digital put and the protection leg of an EDS. The only
distinction is whether we are considering the underlying to be risky or not.

[Figure 6.4: Equity and credit sensitivities. Series: Vega, CS01, Default prot. CS01; horizontal axis: Spot / S0 (50% to 220%); vertical axis: -0.01 to 0.03.]

FIGURE 6.4 The vega and CDS curve sensitivity of the protection leg across a reasonably
wide range of spot prices, showing regions of predominant equity sensitivity and credit
sensitivity. The asymptote is the CS01 of the pure default protection.
We can quantify these notions by looking at the equity and credit sensitivities
of the protection leg. We do so for our example Ahold EDS.
Figure 6.4 shows the vega and CDS curve sensitivity of the protection leg, in
isolation, over a wide range of share prices around the prevailing traded price. The vega
looks qualitatively very similar to the American Digital: necessarily zero at the
barrier, positive elsewhere, and tending to zero as $S \to \infty$ and the probability of diffusion
to the barrier consequently vanishes. The sensitivity to CDS rates (known as CS01)
tends to a nonzero asymptote as $S \to \infty$: this is the CS01 of the pure default
protection, as expected. The regions in which one sensitivity is substantial and the other
negligible serve to identify the diffusion dominated and default dominated regions.
The negative CS01 near the barrier is at first sight counterintuitive: We plot it
on an expanded horizontal scale and a greatly expanded vertical scale in figure 6.5.
The expectation is that increased CDS rates imply increased probability of default
before maturity, increased value and positive CS01. This is indeed the case far from
the barrier.
[Figure 6.5: Credit sensitivity near the barrier. Series: Price (left axis, 0.0 to 1.2), CS01 (right axis, -0.0008 to 0.0006); horizontal axis: Spot / S0 (50% to 90%).]

FIGURE 6.5 The CDS curve sensitivity of the EDS protection leg in the diffusion-dominated
region near the barrier.

We can, however, understand how this intuition fails by noting that increased
hazard rates increase the drift of the asset (the convection term in (6.30)) and so tend
to bring it further from the barrier early in the lifetime of the structure. It is precisely
in the diffusion-dominated region near the barrier that this is critical, where the
likelihood is that the asset will diffuse to the barrier before it defaults. Increased drift
lessens this likelihood, or lengthens the expected time before the barrier is breached.
The negative CS01 indicates that this is more significant than the increase in the
probability of breaching the barrier due to a default given that in this region any
default is likely to occur after the barrier is hit.

6.8 CONCLUSION

In this chapter we have presented a modeling framework suitable for equity- and
credit-sensitive structures. The main problem we face when it comes to pricing these
structures is liquidity. Indeed, the scarcity of data, especially on the credit side,
makes it difficult to calibrate any model, no matter how good that model is.
While convertible bonds (see chapter 4, in the context of equity-interest rate
hybrids) remain the most liquid and popular hybrid structure, we have witnessed
lately a surge in new hybrid structures, such as the equity default swap.
PART Four
Advanced Pricing Techniques
Equity Hybrid Derivatives
By Marcus Overhaus, Ana Bermúdez, Hans Buehler, Andrew Ferraris, Christopher Jordinson and Aziz Lamnouar
Copyright © 2007 by Marcus Overhaus, Aziz Lamnouar, Ana Bermúdez, Hans Buehler,
Andrew Ferraris, and Christopher Jordinson

CHAPTER 7
Copulas Applied to Derivatives Pricing

7.1 INTRODUCTION

In this chapter, we highlight the importance that copulas have gained in derivatives
pricing in the last decade. This is due mainly to the growth seen in the credit
derivatives markets. Indeed, basket default swaps, CDO tranches, and all correlation
structures are often priced and risk managed using the copula technology. Copulas
have been used widely in the insurance business, and the transition to the credit
derivatives world was natural, as the latter is often thought of as an insurance
business on the default of companies.
This chapter is organized as follows: First, we tackle copulas from a theoretical
point of view by presenting various properties and families of copulas. We present
as well the copula of a stochastic process highlighting the time dependency, or
autocorrelation, induced by a process. Second, we look at some applications to
derivatives pricing. We start by presenting the factor copula technique, which enables
us to reduce dimensionality of the problem and find semiclosed-form solutions for
various derivatives contracts. Last, we apply the previous approach to the pricing
not only of credit derivatives but some popular multiunderlying equity derivatives,
precisely collateralized debt obligations, basket default swaps, and Altiplanos.

7.2 THEORETICAL BACKGROUND OF COPULAS

7.2.1 Definitions
Definition and Sklar Theorem

Definition 7.2.1 A copula is an n-dimensional function $C: [0,1]^n \to [0,1]$, which
has the following properties:

■ $C$ is increasing in each of its coordinates $u_k$ for $k = 1, 2, \ldots, n$
■ $C(u) = 0$ if at least one of the coordinates of the vector $u$ is equal to 0
■ $C(u) = u_k$ if all the coordinates of $u$ but $u_k$ are equal to 1
■ for every $x, y \in [0,1]^n$ such that $x \le y$,1 the volume $V_C(H)$ of the hypercube
$H = [x_1, y_1] \times [x_2, y_2] \times \cdots \times [x_n, y_n]$ is non-negative, where

$$V_C(H) = \sum_{i_1=1}^{2} \sum_{i_2=1}^{2} \cdots \sum_{i_n=1}^{2} (-1)^{i_1 + i_2 + \cdots + i_n}\,C\bigl(z_{i_1}, \ldots, z_{i_n}\bigr)$$

where $z_{i_j} = x_j$ if $i_j = 1$ and $z_{i_j} = y_j$ if $i_j = 2$, for $j = 1, \ldots, n$.

1 $x \le y$ means that $x_k \le y_k$ for $k = 1, \ldots, n$.



From the above definition, we can conclude that a copula is a multidimensional
distribution with uniform marginals. The following theorem proves to be at the
heart of the copula theory and its applications.

Theorem 7.2.1 (Sklar) Let $F$ be an n-dimensional distribution function with
marginals $F_1, F_2, \ldots, F_n$; then there exists an n-dimensional copula $C$, such that

$$F(x_1, x_2, \ldots, x_n) = C\bigl(F_1(x_1), F_2(x_2), \ldots, F_n(x_n)\bigr) \quad \text{for all } (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$$

And if $F_1, F_2, \ldots, F_n$ are continuous, then $C$ is unique.

Remark 7.2.1 Let $C$ be the copula of the random variables $X$ and $Y$. Any increasing
transformation of $X$, $Y$ has the same copula $C$.

Frechet Bounds

Theorem 7.2.2 Let $C$ be a two-dimensional copula; we have the following result:

$$\max(u + v - 1, 0) \le C(u, v) \le \min(u, v), \quad \forall (u, v) \in [0,1]^2$$

This result is straightforward: Due to monotonicity we have $C(u, v) \le C(u, 1) = u$
and $C(u, v) \le C(1, v) = v$. On the other hand, letting $H = [u, 1] \times [v, 1]$, we have
$V_C(H) \ge 0$, which implies $C(u, v) \ge u + v - 1$; the positivity of $C$ completes the
proof.
$W(u, v) = \max(u + v - 1, 0)$ and $M(u, v) = \min(u, v)$ are called Frechet
bounds. The Frechet bounds also exist in the multidimensional case and are given
as follows:

$$W(u_1, \ldots, u_n) = \max\left(\sum_{i=1}^{n} u_i - n + 1,\ 0\right)$$

$$M(u_1, \ldots, u_n) = \min(u_1, \ldots, u_n)$$

$$\forall (u_1, \ldots, u_n) \in [0,1]^n, \quad n \ge 2$$

Another well-known and useful copula is the independence copula defined as
follows:

$$\Pi(u_1, \ldots, u_n) = \prod_{i=1}^{n} u_i, \quad \forall (u_1, \ldots, u_n) \in [0,1]^n, \quad n \ge 2$$

Remark 7.2.2 $W$ is not a copula for $n \ge 3$. Indeed, consider $H = \left[\tfrac{1}{2}, 1\right]^n$: we have
$V_W(H) = 1 - \tfrac{n}{2} < 0$. This means that condition 4 in the definition above is violated.
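
The volume condition and Remark 7.2.2 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `volume`, `frechet_lower` and `independence` are illustrative assumptions. It evaluates the signed sum over the $2^n$ vertices of a box, with sign $(-1)^{i_1 + \cdots + i_n}$ where $i_j = 1$ picks the lower corner and $i_j = 2$ the upper corner.

```python
from itertools import product


def volume(copula, lower, upper):
    """Signed vertex sum V_C(H) for the box [lower_1, upper_1] x ... x [lower_n, upper_n]."""
    total = 0.0
    for picks in product((1, 2), repeat=len(lower)):
        # i_j = 1 selects the lower corner x_j, i_j = 2 the upper corner y_j.
        vertex = [lo if i == 1 else hi for i, lo, hi in zip(picks, lower, upper)]
        total += (-1) ** sum(picks) * copula(vertex)
    return total


def frechet_lower(u):
    """Lower Frechet bound W(u_1, ..., u_n)."""
    return max(sum(u) - len(u) + 1.0, 0.0)


def independence(u):
    """Independence copula: the product of the coordinates."""
    result = 1.0
    for x in u:
        result *= x
    return result


vol_w2 = volume(frechet_lower, [0.5] * 2, [1.0] * 2)  # 1 - 2/2: still admissible
vol_w3 = volume(frechet_lower, [0.5] * 3, [1.0] * 3)  # 1 - 3/2: negative volume
vol_pi3 = volume(independence, [0.5] * 3, [1.0] * 3)  # product of the side lengths
```

For $n = 3$ the volume of $[\tfrac{1}{2}, 1]^3$ under $W$ is negative, whereas under the independence copula it is simply the product of the side lengths.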

Conditional Distributions and Partial Derivatives The partial derivatives of a copula
function are closely linked to the concept of conditional distributions. This proves to
be very useful when sampling random variables whose dependency is described by
a copula function. The heuristic argument comes from the following computation:
Consider the probability measure $Q$ and two random variables $X_1$ and $X_2$ with
distribution functions $F_1$ and $F_2$, respectively, and a joint distribution $F$.

$$\begin{aligned}
Q(X_1 \le x_1 \mid X_2 = x_2) &= \lim_{\Delta x_2 \to 0} Q(X_1 \le x_1 \mid x_2 \le X_2 \le x_2 + \Delta x_2) \\
&= \lim_{\Delta x_2 \to 0} \frac{F(x_1, x_2 + \Delta x_2) - F(x_1, x_2)}{F_2(x_2 + \Delta x_2) - F_2(x_2)} \\
&= \lim_{\Delta x_2 \to 0} \frac{C(F_1(x_1), F_2(x_2 + \Delta x_2)) - C(F_1(x_1), F_2(x_2))}{F_2(x_2 + \Delta x_2) - F_2(x_2)} \\
&= \frac{\partial C}{\partial v}(u, v)\Big|_{(F_1(x_1),\, F_2(x_2))}
\end{aligned}$$

The multidimensional density $f$ (see footnote 2) is linked to the copula function as follows:

$$\begin{aligned}
f(x_1, x_2, \ldots, x_n) &= \frac{\partial^n C}{\partial u_1 \cdots \partial u_n}(u_1, u_2, \ldots, u_n)\Big|_{(F_1(x_1),\, F_2(x_2),\, \ldots,\, F_n(x_n))} \prod_{i=1}^{n} f_i(x_i) \\
&= c(F_1(x_1), F_2(x_2), \ldots, F_n(x_n)) \prod_{i=1}^{n} f_i(x_i)
\end{aligned}$$

where $c(u_1, u_2, \ldots, u_n)$ is the copula density.
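
The link between partial derivatives and conditional distributions gives a sampling recipe: draw $v$ and an independent uniform $w$, then solve $\frac{\partial C}{\partial v}(u, v) = w$ for $u$. The sketch below illustrates this under stated assumptions: it uses the bivariate Clayton copula $C(u, v) = (u^{-\theta} + v^{-\theta} - 1)^{-1/\theta}$ with $\theta > 0$ purely as a convenient example with a closed-form inverse, and all function names and the value of `THETA` are hypothetical.

```python
import random

THETA = 2.0  # illustrative Clayton parameter (assumption), must be > 0 here


def clayton(u, v, theta=THETA):
    """Bivariate Clayton copula for theta > 0 and u, v in (0, 1]."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def conditional_cdf(u, v, theta=THETA):
    """dC/dv (u, v), i.e. Q(U <= u | V = v)."""
    base = u ** -theta + v ** -theta - 1.0
    return v ** (-theta - 1.0) * base ** (-1.0 / theta - 1.0)


def conditional_quantile(w, v, theta=THETA):
    """Solve dC/dv (u, v) = w for u in closed form."""
    return (v ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)


def sample_pair(rng, theta=THETA):
    """Draw (u, v) whose dependency is described by the Clayton copula."""
    v = 1.0 - rng.random()  # uniform on (0, 1]
    w = 1.0 - rng.random()
    return conditional_quantile(w, v, theta), v


pairs = [sample_pair(random.Random(seed)) for seed in range(5)]
```

The closed-form inverse is specific to this copula; for a copula without one, the same equation would have to be solved numerically, for instance by bisection in $u$.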

7.2.2 Measures of Dependence


Concordance

Definition 7.2.2 The realizations $(x_1, y_1)$ and $(x_2, y_2)$ of two random variables
$(X, Y)$ are said to be concordant if $(x_1 - x_2)(y_1 - y_2) > 0$, and discordant if
$(x_1 - x_2)(y_1 - y_2) < 0$.

Kendall’s Tau

Definition 7.2.3 In a discrete space, let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ be the
possible realizations of two random variables $(X, Y)$. Let $c$ denote the number of
concordant pairs and $d$ the number of discordant pairs. Kendall’s $\tau$ for this
sample is defined as follows:

$$\tau = \frac{c - d}{c + d}$$
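
As a small illustration of the sample version, the sketch below counts concordant and discordant pairs directly. The function name `kendall_tau` and the data are made up for the example; ties contribute to neither count, in line with the strict inequalities of Definition 7.2.2.

```python
from itertools import combinations


def kendall_tau(points):
    """Sample Kendall's tau: (c - d) / (c + d) over all pairs of observations."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(points, 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


sample = [(1, 2), (2, 1), (3, 4), (4, 3)]  # 4 concordant pairs, 2 discordant pairs
tau_sample = kendall_tau(sample)
tau_monotone = kendall_tau([(1, 10), (2, 20), (3, 30)])  # perfectly concordant
```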

2 $f(x_1, x_2, \ldots, x_n) = \frac{\partial^n F}{\partial x_1 \cdots \partial x_n}(x_1, x_2, \ldots, x_n)$, where $F$ is the cumulative distribution function.

In the general case, Kendall’s $\tau$ is defined as the probability of concordance
minus the probability of discordance. Indeed, let $(X_1, Y_1)$ and $(X_2, Y_2)$ be two
independent vectors with the same joint distribution function as $(X, Y)$; we have

$$\tau = Q\big((X_1 - X_2)(Y_1 - Y_2) > 0\big) - Q\big((X_1 - X_2)(Y_1 - Y_2) < 0\big)$$

Theorem 7.2.3 Let $(X_1, Y_1)$ and $(X_2, Y_2)$ be independent vectors of continuous
random variables with joint distributions $F_1$ and $F_2$, respectively, and with common
marginals $F_X$ and $F_Y$. Denote by $C_1$ and $C_2$ the corresponding copulas, such that
$F_1(x, y) = C_1(F_X(x), F_Y(y))$ and $F_2(x, y) = C_2(F_X(x), F_Y(y))$. If we define $\tau(C_1, C_2)$ as the
difference between the probabilities of concordance and discordance of $(X_1, Y_1)$ and
$(X_2, Y_2)$, that is, $Q\big((X_1 - X_2)(Y_1 - Y_2) > 0\big) - Q\big((X_1 - X_2)(Y_1 - Y_2) < 0\big)$, then

$$\tau(C_1, C_2) = 4 \iint_{[0,1]^2} C_2(u, v)\, dC_1(u, v) - 1$$

In the case of two continuous random variables $X, Y$ whose copula is $C$,
Kendall’s $\tau$ is given by

$$\tau_{XY} = \tau(C, C) = 4 \iint_{[0,1]^2} C(u, v)\, dC(u, v) - 1$$

Spearman’s Rho The Spearman’s rho dependence measure is also based on the
concepts of concordance and discordance.

Definition 7.2.4 If we now take three independent random vectors $(X_1, Y_1)$,
$(X_2, Y_2)$, and $(X_3, Y_3)$ with the same joint distribution $F$, Spearman’s rho is
defined as three times the difference between the probability of concordance and the
probability of discordance of the two random vectors $(X_1, Y_1)$ and $(X_2, Y_3)$:

$$\rho = 3\Big[Q\big((X_1 - X_2)(Y_1 - Y_3) > 0\big) - Q\big((X_1 - X_2)(Y_1 - Y_3) < 0\big)\Big]$$

Theorem 7.2.4 For continuous random variables $X$ and $Y$ whose copula is $C$,
Spearman’s rho is given by

$$\begin{aligned}
\rho_{XY} = 3\tau(C, \Pi) &= 12 \iint_{[0,1]^2} uv\, dC(u, v) - 3 \\
&= 12 \iint_{[0,1]^2} C(u, v)\, du\, dv - 3,
\end{aligned}$$

where $\Pi$ is the independence copula introduced previously.
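
The representation $\rho = 12 \iint C(u, v)\, du\, dv - 3$ is easy to evaluate numerically. The sketch below is an illustration only: it uses a plain midpoint rule, and the function name and grid size are assumptions. It is applied to the two Frechet bounds and to the independence copula, for which the integrals are $1/3$, $1/6$ and $1/4$, respectively.

```python
def spearman_rho(copula, n=300):
    """Approximate 12 * (double integral of C over the unit square) - 3 by the midpoint rule."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        for j in range(n):
            total += copula(u, (j + 0.5) * h)
    return 12.0 * total * h * h - 3.0


rho_upper = spearman_rho(min)                                   # M(u, v) = min(u, v)
rho_lower = spearman_rho(lambda u, v: max(u + v - 1.0, 0.0))    # W(u, v)
rho_indep = spearman_rho(lambda u, v: u * v)                    # independence copula
```

The three values are close to $1$, $-1$ and $0$: the Frechet bounds correspond to perfect positive and negative rank dependence.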

Tail Dependence It is well known that the Gaussian copula, the most widely used one in
the financial industry, fails to capture tail dependencies, as its tails flatten very quickly.
On the other hand, when dealing with fat-tailed distributions we want to know how
well we capture the dependency between extreme values. The two concepts
of lower tail dependency and upper tail dependency have been introduced in order to
measure the dependency of extreme values, as the number of financial contingent claims
that depend on these values has risen dramatically in recent years. This has been
triggered by markets being marked by a few extreme events, from the technology
bubble to the corporate scandals and, of course, the terrorist attacks.

Definition 7.2.5 Let $X$ and $Y$ be two random variables with cumulative
distribution functions $F_X$ and $F_Y$. The coefficients of lower and upper tail dependency are
defined as follows:

$$\lambda_L = \lim_{u \to 0^+} Q\big(Y \le F_Y^{-1}(u) \mid X \le F_X^{-1}(u)\big)$$

$$\lambda_U = \lim_{u \to 1^-} Q\big(Y > F_Y^{-1}(u) \mid X > F_X^{-1}(u)\big),$$

provided, of course, that these limits exist.

If $\lambda_L \in\, ]0, 1]$ ($\lambda_U \in\, ]0, 1]$), then $X$ and $Y$ are said to be asymptotically dependent in
the lower tail (upper tail), and if $\lambda_L = 0$ ($\lambda_U = 0$), they are said to be asymptotically
independent in the lower tail (upper tail).
We can show, using Bayes’ rule and the relationship between the survival function
$S(x, y)$ of $(X, Y)$ (see footnote 3) and the cumulative distribution functions of $X$, $Y$, and $(X, Y)$,
that

$$\lambda_L = \lim_{u \to 0^+} \frac{C(u, u)}{u}$$

$$\lambda_U = \lim_{u \to 1^-} \frac{1 + C(u, u) - 2u}{1 - u}$$

7.2.3 Copulas and Stochastic Processes


Some properties of stochastic processes can be characterized by their
finite-dimensional distributions and therefore by copulas. Unfortunately, many
concepts for stochastic processes are stronger than statements about the
finite-dimensional distributions. In this paragraph, we focus on some well-known
Markov processes (see footnote 4). We define the copula of a Markov process $X$ as
the copula $C^{s,t}$ of $X_s$ and $X_t$.

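3 The survival function is defined as follows: $S(x, y) = Q(X > x, Y > y)$.
4 A process $X_t$ is said to be Markov if $Q(X_t \le x \mid \mathcal{F}_s) = Q(X_t \le x \mid X_s)$, where $\mathcal{F}_s = \sigma(X_u, u \le s)$.

The Copula of a Brownian Motion From the definition of a Brownian motion we
know that for $t > s$ and $x, y$ real numbers, we have

$$Q(W_t \le x \mid W_s = y) = \Phi\left(\frac{x - y}{\sqrt{t - s}}\right)$$

where $\Phi$ is the cumulative distribution function of a standard normal
variable.
Exploiting the relationship between conditional distributions and partial
derivatives of a copula, and conditioning on the value $w$ of $W_s$, we can write

$$C_W^{s,t}(F_s(x), F_t(y)) = \int_{-\infty}^{x} \frac{\partial C_W^{s,t}}{\partial u}(u, v)\Big|_{(F_s(w),\, F_t(y))} f_s(w)\, dw$$

where $F_s$ and $F_t$ are the cumulative distribution functions of $W_s$ and $W_t$, respectively,
and $f_s$ is the density of $W_s$. Hence

$$\begin{aligned}
C_W^{s,t}(F_s(x), F_t(y)) &= \int_{-\infty}^{x} \Phi\left(\frac{y - w}{\sqrt{t - s}}\right) f_s(w)\, dw \\
&= \int_{-\infty}^{F_s^{-1}(F_s(x))} \Phi\left(\frac{F_t^{-1}(F_t(y)) - F_s^{-1}(F_s(w))}{\sqrt{t - s}}\right) f_s(w)\, dw \\
&= \int_{0}^{F_s(x)} \Phi\left(\frac{F_t^{-1}(F_t(y)) - F_s^{-1}(w)}{\sqrt{t - s}}\right) dw
\end{aligned}$$

We therefore can express the copula $C_W^{s,t}$ as follows:

$$C_W^{s,t}(u, v) = \int_{0}^{u} \Phi\left(\frac{\sqrt{t}\, \Phi^{-1}(v) - \sqrt{s}\, \Phi^{-1}(w)}{\sqrt{t - s}}\right) dw$$

We have used $F_t^{-1}(x) = \sqrt{t}\, \Phi^{-1}(x)$. The copula density is then given by

$$c_W^{s,t}(u, v) = \sqrt{\frac{t}{t - s}}\; \frac{\varphi\left(\dfrac{\sqrt{t}\, \Phi^{-1}(v) - \sqrt{s}\, \Phi^{-1}(u)}{\sqrt{t - s}}\right)}{\varphi\big(\Phi^{-1}(v)\big)}, \quad \forall (u, v) \in [0,1]^2,$$

where $\varphi$ is the density of a standard normal variable.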
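
The integral representation of $C_W^{s,t}$ can be evaluated with a one-dimensional quadrature. The following is a minimal sketch under stated assumptions: a midpoint rule in $w$, the standard library's `statistics.NormalDist` for $\Phi$ and $\Phi^{-1}$, and an illustrative function name. For $s = 1$, $t = 2$ the pair $(W_s, W_t)$ has correlation $\sqrt{1/2}$, so $C_W^{s,t}(\tfrac12, \tfrac12) = \tfrac14 + \arcsin(\sqrt{1/2})/(2\pi) = 0.375$, which gives a convenient check.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def brownian_copula(u, v, s, t, steps=4000):
    """Copula of (W_s, W_t), 0 < s < t, by the midpoint rule in w over (0, u)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    if v >= 1.0:
        return u  # the integrand is identically 1 when v = 1
    a = sqrt(t) * STD_NORMAL.inv_cdf(v)
    h = u / steps
    total = 0.0
    for k in range(steps):
        w = (k + 0.5) * h  # midpoints stay strictly inside (0, 1)
        total += STD_NORMAL.cdf((a - sqrt(s) * STD_NORMAL.inv_cdf(w)) / sqrt(t - s))
    return total * h


margin_check = brownian_copula(1.0, 0.3, 1.0, 2.0)   # should reproduce v
center_value = brownian_copula(0.5, 0.5, 1.0, 2.0)   # close to 0.375
```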

Remark 7.2.3 A geometric Brownian motion is an increasing transformation of its
underlying arithmetic Brownian motion; therefore, they have the same copula.

Copula of a Continuous Martingale By employing a time change we can link the
copula of a martingale to that of a Brownian motion. This is possible only for
martingales whose bracket goes to infinity when time goes to infinity.

Theorem 7.2.5 Let $X_t$ be a $(Q, \mathcal{F}_t)$ local martingale with deterministic bracket such
that $X_0 = 0$ and $\lim_{t \to \infty} \langle X \rangle_t = \infty$ a.s. Define $T_t$ and $B_t$ as follows:

$$T_t := \inf\{u,\ \langle X \rangle_u > t\}, \quad B_t = X_{T_t}$$

$B_t$ is an $\mathcal{F}_{T_t}$ Brownian motion and $X_t = B_{\langle X \rangle_t}$, and its copula is given by

$$C_X^{s,t}(u, v) = \int_{0}^{u} \Phi\left(\frac{\sqrt{\langle X \rangle_t}\, \Phi^{-1}(v) - \sqrt{\langle X \rangle_s}\, \Phi^{-1}(w)}{\sqrt{\langle X \rangle_t - \langle X \rangle_s}}\right) dw \quad \text{for } t > s$$

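The Copula of an Ornstein-Uhlenbeck Process The OU process is defined on $(Q, \mathcal{F}_t)$
as follows:

$$dr_t = -\lambda r_t\, dt + \sigma\, dW_t$$

where $W_t$ is a standard $\mathcal{F}_t$ Brownian motion and $\lambda, \sigma > 0$. Solving the above SDE we
get

$$r_t = r_s e^{-\lambda (t - s)} + \sigma \int_{s}^{t} e^{-\lambda (t - u)}\, dW_u, \quad t \ge s$$

We can apply the above theorem to $m_t = r_t e^{\lambda t} - r_0$. We have
$\langle m \rangle_t = \sigma^2 \dfrac{e^{2 \lambda t} - 1}{2 \lambda}$,
and the copula of the process $m_t$ is given by

$$C_m^{s,t}(u, v) = \int_{0}^{u} \Phi\left(\frac{\sqrt{\dfrac{e^{2 \lambda t} - 1}{2 \lambda}}\, \Phi^{-1}(v) - \sqrt{\dfrac{e^{2 \lambda s} - 1}{2 \lambda}}\, \Phi^{-1}(w)}{\sqrt{\dfrac{e^{2 \lambda t} - 1}{2 \lambda} - \dfrac{e^{2 \lambda s} - 1}{2 \lambda}}}\right) dw$$

For each $t$, $m_t$ is an increasing transformation of $r_t$, and the two processes therefore
have the same copula. We can note that the OU copula depends on $\lambda$ but not on $\sigma$.
On the other hand, when $\lambda$ goes to zero, the OU copula reduces to the Brownian motion
copula. This is not surprising, as the process itself reduces to a Brownian motion.
When $\lambda$ goes to infinity, in contrast, the OU copula tends to the independence copula $\Pi$.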

7.2.4 Some Popular Copulas


In this section, we present some copulas widely used in practice. They are appealing
for their analytical and numerical tractability. Indeed, the Gaussian copula is
used almost everywhere in the financial literature, even if its use is implicit most of
the time: geometric Brownian motions and multidimensional log-normal distributions
are routinely used without the word copula being mentioned. It belongs to the family
of elliptic copulas. Another family of widely used copulas is the Archimedean
copulas.

Elliptic Copulas

Gaussian Copula

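Definition 7.2.6 The Gaussian copula is defined as follows:

$$C_\Sigma(u_1, u_2, u_3, \ldots, u_n) = \Phi_{n,\Sigma}\big(\Phi^{-1}(u_1), \Phi^{-1}(u_2), \ldots, \Phi^{-1}(u_n)\big)$$

where $\Sigma$ is a correlation matrix, $\Phi_{n,\Sigma}$ is the standard n-dimensional normal
distribution with correlation matrix $\Sigma$, and $\Phi^{-1}$ is the inverse cumulative distribution
function of a standard one-dimensional normal variable.

The corresponding copula density is given by

$$c_\Sigma\big(\Phi(x_1), \Phi(x_2), \ldots, \Phi(x_n)\big) = \frac{\dfrac{1}{(2\pi)^{n/2} \sqrt{\det \Sigma}} \exp\left(-\dfrac{1}{2} X^t \Sigma^{-1} X\right)}{\prod_{i=1}^{n} \dfrac{1}{\sqrt{2\pi}} \exp\left(-\dfrac{x_i^2}{2}\right)}$$

with $X^t = (x_1, x_2, \ldots, x_n)$. This leads to (with $\Phi(x_i) = u_i$)

$$c_\Sigma(u_1, u_2, \ldots, u_n) = \frac{1}{\sqrt{\det \Sigma}} \exp\left(-\frac{1}{2} U^t \big(\Sigma^{-1} - I_n\big) U\right)$$

where $U^t = \big(\Phi^{-1}(u_1), \Phi^{-1}(u_2), \ldots, \Phi^{-1}(u_n)\big)$ and $I_n$ is the identity matrix.
The tail dependency parameters in the two-dimensional case are given as follows:

$$\lambda_L = 0, \qquad \lambda_U = 0$$

We conclude that the bivariate Gaussian copula does not exhibit tail dependency.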
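
In two dimensions the density formula can be written out explicitly. With correlation $\rho$, the quadratic form $U^t(\Sigma^{-1} - I_2)U$ equals $\big(\rho^2 (x^2 + y^2) - 2 \rho x y\big) / (1 - \rho^2)$, where $x = \Phi^{-1}(u)$ and $y = \Phi^{-1}(v)$. The sketch below is an illustration with assumed function names: it codes this expression and checks that the density integrates to one along $u$ for a fixed $v$, as a copula density with uniform marginals must.

```python
from math import exp, sqrt
from statistics import NormalDist

N01 = NormalDist()


def gaussian_copula_density(u, v, rho):
    """Bivariate Gaussian copula density for |rho| < 1 and u, v in (0, 1)."""
    x, y = N01.inv_cdf(u), N01.inv_cdf(v)
    det = 1.0 - rho * rho
    # U^t (Sigma^{-1} - I) U for the 2x2 correlation matrix with off-diagonal rho.
    quad = (rho * rho * (x * x + y * y) - 2.0 * rho * x * y) / det
    return exp(-0.5 * quad) / sqrt(det)


def mass_along_u(v, rho, steps=4000):
    """Midpoint-rule integral of the density over u in (0, 1) for fixed v."""
    h = 1.0 / steps
    return sum(gaussian_copula_density((k + 0.5) * h, v, rho) for k in range(steps)) * h


density_center = gaussian_copula_density(0.5, 0.5, 0.6)  # 1 / sqrt(1 - 0.36)
unit_mass = mass_along_u(0.3, 0.5)
```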

Student Copula (t-Copula)

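Definition 7.2.7 The t-copula is characterized by a correlation matrix $\Sigma$ and a
parameter $\nu$ called the number of degrees of freedom. It is defined as follows:

$$C_{\nu,\Sigma}(u_1, u_2, u_3, \ldots, u_n) = T_{\nu,n,\Sigma}\big(T_\nu^{-1}(u_1), T_\nu^{-1}(u_2), \ldots, T_\nu^{-1}(u_n)\big)$$

where $T_{\nu,n,\Sigma}$ is the n-dimensional Student distribution with correlation matrix $\Sigma$ and
number of degrees of freedom $\nu$, and $T_\nu^{-1}$ is the inverse cumulative distribution of a
Student random variable with $\nu$ degrees of freedom.
The corresponding copula density is given by

$$c_{\nu,\Sigma}\big(T_\nu(x_1), T_\nu(x_2), \ldots, T_\nu(x_n)\big) = \frac{\dfrac{\Gamma\left(\frac{\nu + n}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right) (\nu \pi)^{n/2} \sqrt{\det \Sigma}} \left(1 + \dfrac{X^t \Sigma^{-1} X}{\nu}\right)^{-\frac{\nu + n}{2}}}{\prod_{i=1}^{n} \dfrac{\Gamma\left(\frac{\nu + 1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right) \sqrt{\nu \pi}} \left(1 + \dfrac{x_i^2}{\nu}\right)^{-\frac{\nu + 1}{2}}}$$

Changing variables and denoting $T_\nu(x_i) = u_i$ and
$U^t = \big(T_\nu^{-1}(u_1), T_\nu^{-1}(u_2), \ldots, T_\nu^{-1}(u_n)\big)$ leads to

$$c_{\nu,\Sigma}(u_1, u_2, \ldots, u_n) = \frac{\Gamma\left(\frac{\nu + n}{2}\right) \Gamma\left(\frac{\nu}{2}\right)^{n - 1}}{\Gamma\left(\frac{\nu + 1}{2}\right)^{n} \sqrt{\det \Sigma}}\; \frac{\left(1 + \dfrac{U^t \Sigma^{-1} U}{\nu}\right)^{-\frac{\nu + n}{2}}}{\prod_{i=1}^{n} \left(1 + \dfrac{\big(T_\nu^{-1}(u_i)\big)^2}{\nu}\right)^{-\frac{\nu + 1}{2}}},$$

where the function $\Gamma$ is defined as follows:

$$\Gamma(y) = \int_{0}^{\infty} x^{y - 1} e^{-x}\, dx$$

The tail dependency parameters for the bivariate t-copula with linear correlation $\rho$
can be shown to be:

$$\lambda_L = 2 - 2\, T_{\nu + 1}\left(\frac{\sqrt{\nu + 1}\, \sqrt{1 - \rho}}{\sqrt{1 + \rho}}\right)$$

$$\lambda_U = 2 - 2\, T_{\nu + 1}\left(\frac{\sqrt{\nu + 1}\, \sqrt{1 - \rho}}{\sqrt{1 + \rho}}\right)$$

It can be seen that the bivariate t-copula exhibits tail dependency, in contrast
with the bivariate Gaussian copula.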
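
To get a feel for the size of the tail dependency, the coefficient $\lambda = 2 - 2\, T_{\nu+1}\big(\sqrt{\nu + 1}\sqrt{1 - \rho}/\sqrt{1 + \rho}\big)$ can be evaluated numerically. The sketch below is illustrative only. Since the Python standard library has no Student distribution function, it integrates the Student density with Simpson's rule; the function names and step count are assumptions, and $\rho > -1$ is required.

```python
from math import gamma, pi, sqrt


def student_pdf(x, nu):
    """Density of a Student random variable with nu degrees of freedom."""
    const = gamma((nu + 1) / 2.0) / (sqrt(nu * pi) * gamma(nu / 2.0))
    return const * (1.0 + x * x / nu) ** (-(nu + 1) / 2.0)


def student_cdf(x, nu, steps=2000):
    """T_nu(x) via Simpson's rule on [0, |x|] and symmetry (steps must be even)."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps
    total = student_pdf(0.0, nu) + student_pdf(a, nu)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * student_pdf(k * h, nu)
    area = total * h / 3.0
    return 0.5 + area if x > 0 else 0.5 - area


def t_tail_dependence(rho, nu):
    """Common value of lambda_L and lambda_U for the bivariate t-copula, rho > -1."""
    arg = sqrt(nu + 1.0) * sqrt(1.0 - rho) / sqrt(1.0 + rho)
    return 2.0 - 2.0 * student_cdf(arg, nu + 1.0)


lam_4 = t_tail_dependence(0.5, 4)    # moderately fat tails
lam_10 = t_tail_dependence(0.5, 10)  # closer to the Gaussian case
```

The coefficient decreases as $\nu$ grows, consistent with the Gaussian copula (the limit $\nu \to \infty$) having no tail dependency.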

Archimedean Copulas

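Definition 7.2.8 Let $\phi$ be a strictly decreasing continuous function from $[0, 1]$ to
$[0, \infty]$ such that $\phi(1) = 0$. $\phi^{[-1]}$, the pseudo inverse of $\phi$, is defined as follows:

$$\phi^{[-1]}(x) = \begin{cases} \phi^{-1}(x), & 0 \le x \le \phi(0) \\ 0, & \phi(0) \le x \le \infty \end{cases}$$

If $\phi(0) = \infty$, we have $\phi^{[-1]} = \phi^{-1}$.

Theorem 7.2.6 Define the function $C$ from $[0, 1]^2$ to $[0, 1]$ such that
$C(u, v) = \phi^{[-1]}\big(\phi(u) + \phi(v)\big)$. $C$ is a copula if, and only if, $\phi$ is convex.
For a proof of this theorem, please refer to Nelsen [133].
A generalization of this result defines the Archimedean5 copula in the multidimensional
case.
The function $\phi$ is called the generator of the copula, and if $\phi(0) = \infty$, $\phi$ is called
a strict generator and the corresponding copula a strict Archimedean copula.

Examples:

■ The independence copula is a strict Archimedean copula: indeed,
$\Pi(u, v) = uv = \exp\big(-(-\ln u - \ln v)\big)$, so define $\phi(x) = -\ln x$ for $x \in [0, 1]$. We
therefore have $\phi(0) = \infty$ and $\Pi(u, v) = \phi^{-1}\big(\phi(u) + \phi(v)\big)$.
■ Similarly, we can show that the Frechet boundary $W$, in the two-dimensional
case, is also an Archimedean copula with $\phi(x) = 1 - x$ for $x \in [0, 1]$.
■ Considering $\phi_\theta(x) = \dfrac{x^{-\theta} - 1}{\theta}$ with $\theta \in [-1, \infty[\, \setminus \{0\}$, we have
$C_\theta(u, v) = \phi_\theta^{[-1]}\big(\phi_\theta(u) + \phi_\theta(v)\big) = \big[\max\big(u^{-\theta} + v^{-\theta} - 1,\ 0\big)\big]^{-1/\theta}$,
which is called the Clayton copula, and it is a strict copula if $\theta > 0$.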
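
The construction $C(u, v) = \phi^{[-1]}(\phi(u) + \phi(v))$ translates directly into code. The sketch below is a minimal illustration with assumed function names, restricted to arguments in $(0, 1]$. It builds the three examples from their generators and uses the Clayton case to estimate the lower tail dependency coefficient $\lambda_L = \lim_{u \to 0} C(u, u)/u$, which for $\theta > 0$ equals $2^{-1/\theta}$.

```python
from math import exp, log


def archimedean(generator, pseudo_inverse):
    """Bivariate Archimedean copula built from a generator and its pseudo inverse."""
    return lambda u, v: pseudo_inverse(generator(u) + generator(v))


def clayton(theta):
    """Clayton copula; the pseudo inverse is max(1 + theta * y, 0) ** (-1 / theta)."""
    return archimedean(
        lambda x: (x ** -theta - 1.0) / theta,
        lambda y: max(1.0 + theta * y, 0.0) ** (-1.0 / theta),
    )


independence = archimedean(lambda x: -log(x), lambda y: exp(-y))
frechet_lower = archimedean(lambda x: 1.0 - x, lambda y: max(1.0 - y, 0.0))
clayton_2 = clayton(2.0)

# Finite-u proxy for the lower tail dependency limit C(u, u) / u as u -> 0.
small_u = 1e-6
lower_tail = clayton_2(small_u, small_u) / small_u
```

For a negative parameter such as $\theta = -\tfrac{1}{2}$ the generator is not strict, and the pseudo inverse is what produces the flat region where the copula vanishes.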

5 The word Archimedean is used because the Archimedean axiom is satisfied by these copulas.

MinMax Copula In this section, we derive the copula of the minimum and maximum
of $n$ iid random variables $X_1, X_2, \ldots, X_n$ with a distribution function $F$. We know
that the distribution function of the order statistic of rank $r$ (meaning that we order
the variables from 1 to $n$ and select the one of order $r$, which is similar to what we do
with default times when trying to price an rth-to-default basket) is given by

$$F_r(x) = \sum_{i=r}^{n} \binom{n}{i} F^i(x) \big(1 - F(x)\big)^{n - i}$$

The minimum $m_X$ and maximum $M_X$ therefore have the following distributions:

$$F_{m_X}(x) = \sum_{i=1}^{n} \binom{n}{i} F^i(x) \big(1 - F(x)\big)^{n - i} = 1 - \big(1 - F(x)\big)^n$$

$$F_{M_X}(x) = F^n(x)$$

On the other hand, we have the joint distribution of $m_X$ and $M_X$ given as follows:

$$\begin{aligned}
F_{m,M}(x, y) &= Q(m_X \le x, M_X \le y) \\
&= Q(m_X \le x, M_X \le y)\, 1_{x < y} + Q(M_X \le y)\, 1_{x \ge y} \\
&= \sum_{i=1}^{n} \binom{n}{i} F^i(x) \big(F(y) - F(x)\big)^{n - i}\, 1_{x < y} + F^n(y)\, 1_{x \ge y}
\end{aligned}$$

By solving the equation $C_{mM}\big(F_{m_X}(x), F_{M_X}(y)\big) = F_{m,M}(x, y)$ we get the following
expression for the MinMax copula of $n$ iid random variables:

$$C_{mM}(u, v) = \begin{cases} v - \left(v^{1/n} + (1 - u)^{1/n} - 1\right)^n, & 1 - (1 - u)^{1/n} < v^{1/n} \\ v, & 1 - (1 - u)^{1/n} \ge v^{1/n} \end{cases}$$

This copula is linked to the Clayton copula mentioned above. Indeed, if we consider
the Clayton copula $C_\theta$ with $\theta = -\frac{1}{n}$, we can write

$$v - C_{mM}(1 - u, v) = C_\theta(u, v)$$

We can also note that when $n$ goes to $\infty$, the MinMax copula approaches
the independence copula $\Pi$. Moreover, the Kendall’s $\tau$ and Spearman’s $\rho$ for the
MinMax copula are given by

$$\tau_{mM} = \frac{1}{2n - 1},$$

$$\rho_{mM} = 3 - \frac{12 n}{\binom{2n}{n}} \sum_{k=0}^{n} \frac{(-1)^k}{2n - k} \binom{2n}{n + k} + 12\, \frac{(n!)^3}{(3n)!}\, (-1)^n$$
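
The closed-form Spearman’s $\rho$ of the MinMax copula can be cross-checked against the integral representation $\rho = 12 \iint C(u, v)\, du\, dv - 3$. The sketch below is an illustration under stated assumptions: the function names and grid size are made up, and the closed form is coded as $3 - \frac{12n}{\binom{2n}{n}} \sum_{k=0}^{n} \frac{(-1)^k}{2n - k}\binom{2n}{n + k} + 12 \frac{(n!)^3}{(3n)!}(-1)^n$. For $n = 1$ the minimum and the maximum coincide, so $\rho$ must equal 1; for $n = 2$ a direct computation of the double integral gives $7/15$.

```python
from math import comb, factorial


def minmax_copula(u, v, n):
    """Copula of the minimum and maximum of n iid random variables."""
    a = v ** (1.0 / n)
    b = (1.0 - u) ** (1.0 / n)
    if 1.0 - b < a:
        return v - (a + b - 1.0) ** n
    return v


def spearman_closed_form(n):
    """Closed-form Spearman's rho of the MinMax copula."""
    s = sum((-1) ** k / (2 * n - k) * comb(2 * n, n + k) for k in range(n + 1))
    tail = 12.0 * factorial(n) ** 3 / factorial(3 * n) * (-1) ** n
    return 3.0 - 12.0 * n / comb(2 * n, n) * s + tail


def spearman_numeric(n, grid=300):
    """12 * (double integral of C_mM over the unit square) - 3, by the midpoint rule."""
    h = 1.0 / grid
    total = 0.0
    for i in range(grid):
        for j in range(grid):
            total += minmax_copula((i + 0.5) * h, (j + 0.5) * h, n)
    return 12.0 * total * h * h - 3.0


rho_closed = [spearman_closed_form(n) for n in (1, 2, 3)]
rho_grid = [spearman_numeric(n) for n in (1, 2, 3)]
```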

7.3 FACTOR COPULA FRAMEWORK

A copula is simply a way to separate the dependency structure (the copula) from
the distribution of each random variable (the marginals). When it comes to pricing
multiasset derivatives in practice, the dimension of the joint distribution of the
assets linked by the copula is generally high, and Monte Carlo simulation is the only
applicable numerical method. There is nothing wrong with using Monte Carlo to price
multiasset structures; however, as these structures become commoditized and traded
in large volumes, quicker methods become a necessity. Faster pricers are also of
interest because they enable us to extract much more useful information from these
structures, such as greeks.
In order to tackle the dimensionality problem, factor copulas have been introduced.
The idea is to factor the correlation matrix such that it depends only on
a few factors that explain a large percentage of the whole variance. This is similar to
performing a Principal Component Analysis (PCA) on the correlation matrix.
The approach presented below is the one-factor Gaussian copula framework,
a setting that is particularly well suited for high-dimensional problems. The idea is
that we choose a common factor that explains the dependency between $n$ random
variables such that they are independent conditionally on the common factor.
Let us define a series of hitting times $\tau^k$ with $k = 1, \ldots, n$, $n$ being the
number of assets. The hitting time could be the time to default of a credit name or
the first time a stock hits a certain level:

$$\tau^k = \inf\{t \ge 0 : S^k_t \le L^k\}$$

where $L^k$ is the barrier related to the asset $S^k$ that prevails at time $t$.

The joint distribution function of the hitting times is

$$F(t_1, \ldots, t_n) = Q(\tau^1 \le t_1, \ldots, \tau^n \le t_n), \quad (7.1)$$

which can be modeled using the Gaussian copula thus:

$$F(t_1, \ldots, t_n) = \Phi_{n,\Sigma}\big(\Phi^{-1}(F_1(t_1)), \ldots, \Phi^{-1}(F_n(t_n))\big) \quad (7.2)$$

where $\Phi_{n,\Sigma}$ is the n-dimensional Gaussian distribution with correlation matrix
$\Sigma$, and $F_i(t_i) = Q(\tau^i \le t_i)$ is the distribution function of $\tau^i$ for $i = 1, \ldots, n$.
Note that we have here chosen the copula function for the hitting times to be a
Gaussian copula. It is a modeling choice and not something that could be derived.
By construction, the vector $\big(\Phi^{-1}(F_1(\tau^1)), \ldots, \Phi^{-1}(F_n(\tau^n))\big)$ is a Gaussian vector with
correlation matrix $\Sigma$.
Let $(X_1, \ldots, X_n)$ be a Gaussian vector, where $X_k = \Phi^{-1}(F_k(\tau^k))$, $k = 1, \ldots, n$.
In the following, we consider a special case of a one-factor copula representation,
where

$$X_k = \rho_k Z + \sqrt{1 - \rho_k^2}\, Z_k$$

and $Z, Z_k$, $k = 1, \ldots, n$, are independent standard Gaussian random variables. On
the other hand, $\rho_k \in [-1, 1]$, $k = 1, \ldots, n$, are calibrated to the initial correlation
matrix $\Sigma$.
Denoting $p_t^{k|Z} = Q(\tau^k \le t \mid Z)$, and exploiting the independence assumption,
we readily get

$$p_t^{k|Z} = \Phi\left(\frac{\Phi^{-1}(F_k(t)) - \rho_k Z}{\sqrt{1 - \rho_k^2}}\right)$$

The joint distribution and copula functions are then given by

$$F(t_1, \ldots, t_n) = \int_{-\infty}^{\infty} \prod_{k=1}^{n} \Phi\left(\frac{\Phi^{-1}(F_k(t_k)) - \rho_k z}{\sqrt{1 - \rho_k^2}}\right) \varphi(z)\, dz$$

$$C(u_1, \ldots, u_n) = \int_{-\infty}^{\infty} \prod_{k=1}^{n} \Phi\left(\frac{\Phi^{-1}(u_k) - \rho_k z}{\sqrt{1 - \rho_k^2}}\right) \varphi(z)\, dz$$

where $\varphi(z) = \dfrac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$ is the Gaussian density.
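
The one-factor copula integral above is one-dimensional whatever the number of names, which is the whole point of the approach. The sketch below is a minimal illustration, not a production quadrature: it uses a midpoint rule on a truncated range $[-8, 8]$, and the function names, truncation and step count are assumptions. With two names and loadings $\rho_1 = \rho_2 = \sqrt{1/2}$, the pairwise correlation is $\rho_1 \rho_2 = \tfrac12$, so $C(\tfrac12, \tfrac12) = \tfrac14 + \arcsin(\tfrac12)/(2\pi) = \tfrac13$.

```python
from math import sqrt
from statistics import NormalDist

N01 = NormalDist()


def conditional_prob(u, rho, z):
    """Q(X_k <= Phi^{-1}(u) | Z = z) for loading rho, |rho| < 1."""
    return N01.cdf((N01.inv_cdf(u) - rho * z) / sqrt(1.0 - rho * rho))


def factor_copula(us, rhos, z_max=8.0, steps=800):
    """One-factor Gaussian copula C(u_1, ..., u_n), u_k in (0, 1), by the midpoint rule in z."""
    h = 2.0 * z_max / steps
    total = 0.0
    for j in range(steps):
        z = -z_max + (j + 0.5) * h
        prod = 1.0
        for u, rho in zip(us, rhos):
            prod *= conditional_prob(u, rho, z)  # independence conditional on Z
        total += prod * N01.pdf(z)
    return total * h


single = factor_copula([0.3], [0.5])                         # reproduces the marginal
indep = factor_copula([0.3, 0.6], [0.0, 0.0])                # product of the marginals
pair = factor_copula([0.5, 0.5], [sqrt(0.5), sqrt(0.5)])     # close to 1/3
```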
In the following section we take advantage of some of the foregoing
results in order to price semianalytically some complex structures that otherwise
require lengthy Monte Carlo simulations to price and risk manage. We have chosen
a well-known equity derivative structure named Altiplano, a very popular credit
derivatives structure called collateralized debt obligations (CDOs), and basket
default swaps.

7.4 APPLICATIONS TO DERIVATIVES PRICING


7.4.1 Equity Derivatives: The Altiplano
A family of equity derivatives structures called mountain range options arose during
the 1990s and subsequent years. They are a series of path-dependent options on
a basket of underlying assets. Exotic mountain names have been assigned to these
structures like Altiplano, Himalaya, Atlas, Everest, Annapurna, Etna, and many
more.
These structures are usually written on many underlying assets, and Monte Carlo
simulation is the only suitable numerical method to price them. Unfortunately, this
method shows its limits when it comes to live risk management. Indeed, because of
the large dimension of the portfolio, the number of computed quantities (that is,
the price and the different greeks) grows rapidly, and it becomes intractable to obtain
quick and accurate results.
In the remainder of this section, we focus on the pricing of the Altiplano
structures using the factor copula framework presented before. We therefore start
with a quick reminder of an Altiplano payoff, then prepare the ingredients for
the factor copula approach by computing the cumulative distribution functions for
a currently running period and a period starting in the future. A semianalytical
solution is derived for the price of this structure.
The structure we are interested in is a multiasset, multibarrier option that pays
a series of coupons depending on the number of assets crossing the barriers and on
the barrier period. No underlying is removed throughout the life of the product
(see footnote 6).
We are given

■ $n$ starting asset prices $S^i_0$, $i = 1, 2, \ldots, n$
■ The corresponding correlation matrix $\Sigma$ of these assets
■ A set of $m$ monitoring periods $[T_0, T_1], \ldots, [T_{m-1}, T_m]$
■ A barrier matrix
$K = \begin{pmatrix} K^1_1 & K^1_2 & \cdots & K^1_n \\ \vdots & \vdots & & \vdots \\ K^m_1 & K^m_2 & \cdots & K^m_n \end{pmatrix}$
being the barriers applying to each asset for each monitoring period
■ A maturity date $T$
■ A matrix
$C = \begin{pmatrix} C^1_0 & C^1_1 & \cdots & C^1_n \\ \vdots & \vdots & & \vdots \\ C^m_0 & C^m_1 & \cdots & C^m_n \end{pmatrix}$
of coupon payments; the structure pays at $T_i$ the coupon $C^i_j$, where $j$ is the number
of assets which breached their respective barriers during the period $[T_{i-1}, T_i]$.
Normally, one has $C^i_0 \ge C^i_1 \ge C^i_2 \ge 0$ and $C^i_3 = C^i_4 = \cdots = C^i_n = 0$.

The coupon payments are made at the end of each barrier period.
We define the $n$ stopping times $\tau^l_i$ for each period $l$;
hence $\tau^l_i$ is the first time when asset $S^i$ hits the barrier $K^l_i$ for the period $l$.
Let $N_l$ be the number of assets that hit the given barriers, that is,

$$N_l = \sum_{i=1}^{n} 1_{\{T_{l-1} \le \tau^l_i \le T_l\}}$$

6 Other variants of the Altiplano structure remove poorly performing assets from the basket
during the lifetime of the structure.

Therefore, the value at time 0 of an m-period, n-underlying Altiplano that pays
at maturity is

$$\text{price} = \sum_{l=1}^{m} B(0, T_l) \sum_{i=0}^{n} C^l_i\, Q(N_l = i) \quad (7.3)$$

In order to use the factor copula approach, we will need to compute the marginal
distribution functions of the hitting time, and this is the subject of the next section.
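Cumulative Distribution Functions of the Hitting Times
Current period The current period is the barrier period that contains the valuation
date. We are working in the n-dimensional correlated Black-Scholes model,
that is,

$$\frac{dS^i_t}{S^i_t} = (r^i_t - d^i_t)\, dt + \sigma^i_t\, dW^i_t$$

The hitting time $\tau$ of a certain barrier $L S_0$, for an underlying $S_t$, is defined as
follows:

$$\tau = \inf\{t \ge 0 : S_t \le L S_0\}$$

We have to compute the following quantity $F(T_1) = Q(\tau \le T_1)$:

$$\begin{aligned}
F(T_1) &= Q\big(\inf\{t \ge 0,\ S_t \le L S_0\} \le T_1\big) = Q\big(\inf\{t \ge 0,\ S_t / S_0 \le L\} \le T_1\big) \\
&= Q\left(\inf\left\{t \ge 0,\ \int_0^t \mu_s\, ds + \int_0^t \sigma_s\, dW_s \le \ln(L)\right\} \le T_1\right) \\
&= E^{Q_1}\left[L_{T_1}\, 1_{\left\{\inf\left\{t \ge 0,\ \int_0^t \sigma_s\, dB_s \le \ln(L)\right\} \le T_1\right\}}\right]
\end{aligned}$$

$\mu_s$ and $B_s$ are defined as follows:

$$\mu_s = r_s - d_s - \frac{1}{2} \sigma_s^2, \qquad B_s = W_s + \int_0^s \frac{\mu_u}{\sigma_u}\, du$$

where $r_s$ is the short rate, $d_s$ is the dividend and repo rate, and $\sigma_s$ is the volatility
function of $S_t$.
$L_t$ is defined as follows:

$$L_t = \exp\left(-\int_0^t \frac{\mu_s^2}{2 \sigma_s^2}\, ds + \int_0^t \frac{\mu_s}{\sigma_s}\, dB_s\right)$$

and

$$\frac{dQ_1}{dQ} = \exp\left(-\int_0^t \frac{\mu_s^2}{2 \sigma_s^2}\, ds - \int_0^t \frac{\mu_s}{\sigma_s}\, dW_s\right)$$

The Girsanov theorem tells us that $B_t = W_t + \int_0^t \frac{\mu_s}{\sigma_s}\, ds$ is a Brownian motion under $Q_1$.
Define the following quantities:

$$\Lambda_t = \int_0^t \frac{\mu_s^2}{\sigma_s^2}\, ds \quad \text{and} \quad V_t = \int_0^t \sigma_s^2\, ds$$

Assuming that $\mu_s$ has the same sign $\varepsilon = \mathrm{sign}(\mu_s)$, $0 \le s \le T_1$, either positive or negative
throughout the period (the same assumption is made for forward periods), we have
the following result:

$$\begin{aligned}
F(T_1) &= E^{Q_1}\left[L_{T_1}\, 1_{\left\{\inf\left\{t \ge 0,\ \int_0^t \sigma_s\, dB_s \le \ln(L)\right\} \le T_1\right\}}\right] \\
&= \Phi\left(\frac{\ln(L) - \varepsilon \sqrt{\Lambda_{T_1} V_{T_1}}}{\sqrt{V_{T_1}}}\right) + L^{2 \varepsilon \sqrt{\Lambda_{T_1} / V_{T_1}}}\; \Phi\left(\frac{\ln(L) + \varepsilon \sqrt{\Lambda_{T_1} V_{T_1}}}{\sqrt{V_{T_1}}}\right)
\end{aligned}$$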

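Forward period For a forward period $[T_i, T_{i+1}]$, the computation is similar.

$$\begin{aligned}
F(T_i, T_{i+1}) &= Q\big(\inf\{t \ge T_i,\ S_t \le L S_0\} \le T_{i+1}\big) \\
&= Q\big(\inf\{t \ge T_i,\ S_t \le L S_0\} \le T_{i+1},\ S_{T_i} > L S_0\big) + Q\big(\inf\{t \ge T_i,\ S_t \le L S_0\} \le T_{i+1},\ S_{T_i} \le L S_0\big) \\
&= \underbrace{Q\big(\inf\{t \ge T_i,\ S_t \le L S_0\} \le T_{i+1},\ S_{T_i} > L S_0\big)}_{A} + \underbrace{Q\big(S_{T_i} \le L S_0\big)}_{B}
\end{aligned}$$

The second term $B$ is easy to compute:

$$B = \Phi\left(\frac{\ln(L) - \int_0^{T_i} \mu_s\, ds}{\sqrt{V_{T_i}}}\right)$$

where $\mu_s = r_s - d_s - \frac{1}{2} \sigma_s^2$ is the drift of $\ln S_t$ and $V_{T_i} = \int_0^{T_i} \sigma_s^2\, ds$.

Following the same steps of the computation in the previous paragraph we have: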

A 2 (x1 , y1 , ) fact 2 (x2 , y2 , )

where 2 is the bivariate normal distribution function and x1 , y1 , x2 , y2 , and fact


are given by

Ti ln(L) ln(L) Ti i,i 1 VTi ,Ti 1 Ti ,Ti 1


x1 , y1
VTi VTi 1

Ti ln(L) 2 i,i 1 Ti ,Ti 1 VTi


x2 ,
VTi
ln(L) Ti i,i 1 Ti ,Ti 1 (VTi VTi 1 )
y2
VTi 1

VTi
VTi 1

2 i,i 1 Ti ,Ti 1 2 Ti ,Ti 1 Ti ,Ti 1 VTi Ti i,i 1


fact L e

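where

$$\Lambda_{T_i, T_{i+1}} = \int_{T_i}^{T_{i+1}} \frac{\mu_s^2}{\sigma_s^2}\, ds$$

$$\varepsilon_{i,i+1} = \mathrm{sign}(\mu_s), \quad T_i \le s \le T_{i+1}$$

$$V_{T_i, T_{i+1}} = \int_{T_i}^{T_{i+1}} \sigma_s^2\, ds$$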

Up to this point we have prepared the main ingredients for the copula approach.
In the next paragraph, we apply this in order to derive a semiclosed solution for the
structure presented above.

Pricing of Altiplanos In order to compute the price in (7.3), all we need to do is to
compute the probabilities $Q(N_l(t) = k)$ ($k$ hits for the period $l$).
Recall that $N_l(t) = \sum_{i=1}^{n} 1_{\{T_{l-1} \le \tau^l_i \le t\}}$ is the counting process associated with the
number of hits up to time $t$ for the period $l$.
The probability generating function of $N_l(t)$ is given by

$$\psi_{N_l(t)}(u) = E^Q\left[u^{N_l(t)}\right] \quad (7.4)$$

$$= \sum_{k=0}^{n} Q(N_l(t) = k)\, u^k \quad (7.5)$$

Recall

$$p_t^{k|Z} = \Phi\left(\frac{\Phi^{-1}(F_k(t)) - \rho_k Z}{\sqrt{1 - \rho_k^2}}\right)$$

Using the iterated expectations theorem, $\psi_{N_l(t)}(u)$ can be rewritten as

$$\psi_{N_l(t)}(u) = E^Q\left[\prod_{i=1}^{n} \left(1 - p_t^{i|Z} + p_t^{i|Z} u\right)\right] \quad (7.6)$$

$$= \int_{-\infty}^{\infty} \prod_{i=1}^{n} \left(1 - p_t^{i|Z=z} + p_t^{i|Z=z} u\right) \varphi(z)\, dz \quad (7.7)$$

$$= \sum_{k=0}^{n} u^k \int_{-\infty}^{\infty} \alpha^l_k(z)\, \varphi(z)\, dz \quad (7.8)$$

where the last equality stems from a formal expansion of $\prod_{i=1}^{n} \left(1 - p_t^{i|Z} + p_t^{i|Z} u\right)$.
Note that we have dropped the index $l$ from the probabilities $p_t^{i|Z}$ to ease the notation.
Using Vieta’s formulas, which link the roots of a polynomial to its coefficients,
we can express $\alpha^l_k(z)$ as a function of $p_t^{i|Z=z}$, $i = 1, \ldots, n$:

$$\alpha^l_k(z) = (-1)^{n-k}\, \alpha^l_n(z) \sum_{1 \le i_1 < i_2 < \cdots < i_{n-k} \le n} r^l_{i_1}(z)\, r^l_{i_2}(z) \cdots r^l_{i_{n-k}}(z)$$

where the roots $r^l_i(z)$ and the leading coefficient $\alpha^l_n(z)$ are

$$r^l_i(z) = -\frac{1 - p_t^{i|Z=z}}{p_t^{i|Z=z}}, \qquad \alpha^l_n(z) = \prod_{i=1}^{n} p_t^{i|Z=z}$$

The probability of $k$ hits by time $t$ is thus given by

$$Q(N_l(t) = k) = \int_{-\infty}^{\infty} \alpha^l_k(z)\, \varphi(z)\, dz$$

and the price in (7.3) is given as follows:

$$\text{price} = \sum_{l=1}^{m} B(0, T_l) \sum_{i=0}^{n} C^l_i \int_{-\infty}^{\infty} \alpha^l_i(z)\, \varphi(z)\, dz$$

This integral may be evaluated using a simple quadrature, which reduces the
computation time massively.
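
The pricing recipe can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' implementation. Instead of the root-based Vieta expression, it obtains the coefficients of $\prod_i (1 - p^{i|z} + p^{i|z} u)$ by multiplying the factors one at a time, which yields the same coefficients and avoids dividing by small probabilities. The quadrature is a plain midpoint rule on $[-8, 8]$; all function names, the marginal hit probabilities, the loadings, the discount factor and the coupons are made-up inputs.

```python
from math import sqrt
from statistics import NormalDist

N01 = NormalDist()


def hit_count_distribution(marginals, loadings, z_max=8.0, steps=400):
    """Return [Q(N = 0), ..., Q(N = n)] under the one-factor Gaussian copula."""
    thresholds = [N01.inv_cdf(f) for f in marginals]
    probs = [0.0] * (len(marginals) + 1)
    h = 2.0 * z_max / steps
    for j in range(steps):
        z = -z_max + (j + 0.5) * h
        coeffs = [1.0]  # coefficients of the conditional polynomial, lowest degree first
        for c, rho in zip(thresholds, loadings):
            p = N01.cdf((c - rho * z) / sqrt(1.0 - rho * rho))
            nxt = [0.0] * (len(coeffs) + 1)
            for k, a in enumerate(coeffs):
                nxt[k] += a * (1.0 - p)  # this asset does not hit
                nxt[k + 1] += a * p      # this asset hits
            coeffs = nxt
        weight = N01.pdf(z) * h
        for k, a in enumerate(coeffs):
            probs[k] += a * weight
    return probs


def altiplano_price(discounts, coupons, period_marginals, loadings):
    """Sum over periods of B(0, T_l) * sum_i C^l_i * Q(N_l = i)."""
    price = 0.0
    for b, row, marginals in zip(discounts, coupons, period_marginals):
        probs = hit_count_distribution(marginals, loadings)
        price += b * sum(c * q for c, q in zip(row, probs))
    return price


probs_indep = hit_count_distribution([0.2, 0.5], [0.0, 0.0])
probs_corr = hit_count_distribution([0.1, 0.3, 0.6], [0.5, 0.7, 0.3])
price = altiplano_price([0.97], [[10.0, 4.0, 0.0]], [[0.2, 0.5]], [0.0, 0.0])
```

With zero loadings the hits are independent, so the two-asset example must return the probabilities $0.4$, $0.5$ and $0.1$. With non-zero loadings, the expected number of hits still equals the sum of the marginal probabilities, which provides a simple sanity check.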
In the following section, we focus on the application of the above approach to
multiname credit derivatives, specifically CDOs and basket default swaps.

7.4.2 Credit Derivatives: Basket and Tranche Pricing


The extensive application of copulas in the financial industry is in large part due
to the development of the credit derivatives market. In this section, we present an
approach to pricing the well-known CDOs and basket default swaps in the factor
copula framework presented above in the same way we did for Altiplano.

Pricing of CDO Tranches

Introduction Collateralized Debt Obligations belong to a class of securitized


products called Asset Backed Securities (ABS). These are securities backed by
pools of assets which range from corporate bonds, bank loans, catastrophe bonds,
emerging market securities, credit cards, or various types of mortgage securities.
These securities are the property of a special purpose vehicle (SPV), which issues
a variety of equity and debt notes (tranches). Typically, the SPV issues equity,
junior, mezzanine, and senior tranches, plus sometimes a super-senior tranche. The
fundamental difference between these tranches lies in the risk they bear, as the repayment
of both interest and principal is done in a given order and the first losses are borne
by the equity tranche.
The SPV is called a CDO, a term that also refers to the various notes issued by the SPV;
this leads to the well-known circular phrase ‘‘a CDO issues CDOs.’’ These are bought by
investors looking to gain an exposure to a diversified portfolio of underlying assets
without having to buy each asset individually, and to obtain a higher return than is
available on other securities of equivalent credit rating.
The motivations behind participating in the CDO market are different for both
the originator and the investor. Banks for example, or any other holder of assets, aim
at shrinking the balance sheet, therefore reducing the required regulatory capital, or
economic capital. Investors, on the other hand, are looking for both investment and
arbitrage opportunities. These motivations coupled with the source of the underlying
assets allow for a classification of CDOs into Balance Sheet and Arbitrage CDOs.

Synthetic CDOs Synthetic CDOs are similar to ordinary CDOs, or cash CDOs,
except that their portfolios consist of Credit Default Swaps (‘‘CDS’’) rather
than actual bonds or loans. In a CDS, one counterparty pays a premium to a second
counterparty in exchange for a contingent payment should a defined credit event
occur, such as the reference entity going into default. This way the CDO gains
exposure synthetically to a reference credit entity without purchasing a bond or
a loan. An analogy can be drawn with insurance, where one party pays regular
premiums in exchange for protection, provided by the other party, against potential
losses.
The sophistication of synthetic CDOs has reached another level as the underlying
portfolio is customized to include CDO notes (CDO2 , CDO squared), and this works
in the same way as a standard CDO structure. Leveraged super senior and CPPI on
CDO tranches constitute the latest innovations in the field of CDOs.
Synthetic CDOs offer access to a more diversified portfolio of assets and a larger
number of assets than Cash CDOs. They also create a cheaper capital structure,
leading to higher equity returns and higher portfolio quality.

Default Leg of a CDO Consider a pool of $n$ credit names with recovery rates $R_i$ and
nominals $N_i$, $i = 1, \ldots, n$. We denote by $\tau_i$ the default time of credit name $i$. We
assume the recovery rates are known beforehand. Define $L_t$, the cumulative loss of
the portfolio at time $t$:

$$L_t = \sum_{i=1}^{n} L^i\, 1_{\{\tau_i \le t\}},$$

where $L^i$ is the loss if the name $i$ defaults before time $t$, and $L^i = (1 - R_i) N_i$. We
assume that interest rates are deterministic.
We consider a CDO tranche where the default leg pays the losses borne by the
above portfolio which are in excess of $K_1$ and not more than $K_2$ ($K_1 \le K_2$). The
payoff of the tranche is therefore given as follows:

$$P_t = (L_t - K_1)\, 1_{\{K_1 \le L_t \le K_2\}} + (K_2 - K_1)\, 1_{\{L_t > K_2\}}$$

$K_1 = 0$ corresponds to the equity tranche, $K_2 = \sum_{i=1}^{n} L^i$ corresponds to the
senior tranche, and finally $K_1 > 0$ and $K_2 < \sum_{i=1}^{n} L^i$ corresponds to the mezzanine
tranche.
For a maturity $T$, the price of the default leg is the expectation of the sum of all
default payments before $T$. It is given as follows (see footnote 7):

$$\text{DefaultLeg} = E^Q\left[\sum_{t \le T} B(0, t) \big(P_t - P_{t^-}\big)\right] = E^Q\left[\int_0^T B(0, t)\, dP_t\right] \quad (7.9)$$

Using integration by parts we have:

$$\text{DefaultLeg} = B(0, T)\, E^Q[P_T] + \int_0^T f(0, t)\, B(0, t)\, E^Q[P_t]\, dt$$

where $f(0, t)$ is the instantaneous forward rate defined as follows:
$f(0, t) = -\dfrac{\partial \log B(0, t)}{\partial t}$.
For simplification we consider the case where all the names have the same nominal
and recovery rate; this is called a homogeneous portfolio ($L^i = L$ for all $i$). Denote
by $k_1$ and $k_2$ the numbers of defaults that correspond to the losses $K_1$ and $K_2$,
respectively.

$$E^Q[P_t] = \sum_{k=k_1}^{k_2} (kL - K_1)\, Q(L_t = kL) + (K_2 - K_1) \sum_{k=k_2+1}^{n} Q(L_t = kL) \quad (7.10)$$

Within the one-factor Gaussian copula framework presented above, we have

$$Q(L_t = kL) = Q(N(t) = k) = \int_{-\infty}^{\infty} \alpha_k(z)\, \varphi(z)\, dz \quad (7.11)$$

where $\alpha_k$ is the coefficient of $u^k$ in the probability generating function.

Fixed Leg The fixed leg is a risky coupon-bearing bond with a notional equal to
the notional of the tranche $[K_1, K_2]$. Hence, we have the following expression:

$$\begin{aligned}
FL &= E^Q\left[\sum_{i=1}^{p} s\, \Delta t_i\, B(0, t_i) \Big((K_2 - K_1) - \underbrace{\big[(L_{t_i} - K_1)^+ - (L_{t_i} - K_2)^+\big]}_{P_{t_i}}\Big)\right] \\
&= \sum_{i=1}^{p} s\, \Delta t_i\, B(0, t_i) \Big((K_2 - K_1) - E^Q\big[P_{t_i}\big]\Big),
\end{aligned}$$

where the $t_i$’s, $i = 1, \ldots, p$, are the payment dates, $\Delta t_i = t_i - t_{i-1}$, and $s$ is the fixed
premium paid at each payment date.

7 We are able to write (7.9) because $P_t$ is an increasing pure jump process.

The CDO spread is the value of $s$ that makes the default leg equal to the fixed
leg and is given by

$$\text{spread} = \frac{B(0, T)\, E^Q[P_T] + \int_0^T f(0, t)\, B(0, t)\, E^Q[P_t]\, dt}{\sum_{i=1}^{p} \Delta t_i\, B(0, t_i) \Big((K_2 - K_1) - E^Q\big[P_{t_i}\big]\Big)} \quad (7.12)$$

Note that in the above computation we have omitted for simplicity the accrual
payment that could be added to the fixed leg in the event that the default happens
between two premium dates.
This comes down again to the computation of the probabilities $Q(N_t = m)$,
which are, as explained before, very easy to compute within the factor copula
framework.
We can therefore compute the prices of different tranches in a CDO by numerical
integration. The generalization to a nonhomogeneous portfolio is straightforward.
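
The tranche computation can be sketched as follows for a homogeneous portfolio. This is a minimal illustration under loudly stated assumptions, none of which come from the text: every name has the same flat hazard rate (so $F(t) = 1 - e^{-ht}$) and the same factor loading, interest rates are flat (so $f(0, t)$ is the constant `rate`), premiums are paid quarterly, and integrals are approximated by midpoint rules. With identical names, the number of defaults conditional on $Z = z$ is binomial, which replaces the general polynomial expansion. All function names and parameter values are hypothetical.

```python
from math import comb, exp, sqrt
from statistics import NormalDist

N01 = NormalDist()


def default_count_probs(p_default, rho, n, z_max=8.0, steps=200):
    """Q(N_t = k), k = 0..n, when all names share p_default and loading rho."""
    c = N01.inv_cdf(p_default)
    probs = [0.0] * (n + 1)
    h = 2.0 * z_max / steps
    for j in range(steps):
        z = -z_max + (j + 0.5) * h
        p = N01.cdf((c - rho * z) / sqrt(1.0 - rho * rho))
        w = N01.pdf(z) * h
        for k in range(n + 1):
            probs[k] += comb(n, k) * p ** k * (1.0 - p) ** (n - k) * w
    return probs


def expected_tranche_loss(t, k1, k2, n, unit_loss, hazard, rho):
    """E[P_t], with P_t = min(max(L_t - K1, 0), K2 - K1) and L_t = k * unit_loss."""
    probs = default_count_probs(1.0 - exp(-hazard * t), rho, n)
    return sum(q * min(max(k * unit_loss - k1, 0.0), k2 - k1) for k, q in enumerate(probs))


def tranche_spread(k1, k2, n, unit_loss, hazard, rho, rate, maturity,
                   pay_per_year=4, grid=40):
    """Default leg divided by the risky annuity, as in the spread formula (7.12)."""
    def el(t):
        return expected_tranche_loss(t, k1, k2, n, unit_loss, hazard, rho)

    def discount(t):
        return exp(-rate * t)

    dt = maturity / grid
    default_leg = discount(maturity) * el(maturity) + sum(
        rate * discount((j + 0.5) * dt) * el((j + 0.5) * dt) * dt for j in range(grid))
    delta = 1.0 / pay_per_year
    periods = int(round(maturity * pay_per_year))
    annuity = sum(delta * discount(i * delta) * ((k2 - k1) - el(i * delta))
                  for i in range(1, periods + 1))
    return default_leg / annuity


# Illustrative inputs: 10 names, 40% recovery on unit nominals, 5-year maturity.
full_loss = expected_tranche_loss(5.0, 0.0, 6.0, 10, 0.6, 0.02, 0.5)
equity_spread = tranche_spread(0.0, 0.6, 10, 0.6, 0.02, 0.5, 0.03, 5.0)
senior_spread = tranche_spread(1.8, 6.0, 10, 0.6, 0.02, 0.5, 0.03, 5.0)
```

The tranche covering the whole capital structure must have expected loss $n L F(t)$ regardless of correlation, and the equity tranche should command a higher spread than the senior tranche.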

Pricing of Basket Default Swaps Another structure that provides credit protection
similar to that of CDO tranches is a Basket Default Swap. A basket swap is similar to
a single-name credit default swap except that it offers protection against a number
of credit entities (2 to 25 or 30 names) rather than a single entity. These
can be structured in a way that offers investors access to different risk profiles
depending on their risk appetite, and protects against different ranges of portfolio
losses.
In a First to Default (FTD) swap, the protection buyer is only protected against
the losses incurred on the first default and the contract is terminated after this event.
In a Second to Default swap, the protection buyer is protected against the losses
incurred on the first and second defaults but not the third with the contract being
terminated after the second default. In a Ninth to Default swap, the protection buyer
is protected against the losses incurred on the first nine defaults but not the tenth.
These structures would be similar to the equity tranche, mezzanine tranche, and
senior tranche of a CDO, respectively.
The success of these structures stems from the fact that they offer the investor
higher spreads than the underlying single name credit default swaps. However, this
spread depends on the level of correlation between the reference credits constituting
the basket.

Pricing a kth to Default Basket As in the case of CDOs, we consider a pool of


n credit names with recovery rates Ri and nominals Ni , i 0, , n We denote
by i the default time of credit name i. We assume the recovery rates are known
beforehand. Let Nt be the counting process:

n
Nt 1 i t
i 1

We also assume for simplicity that we have a homogeneous portfolio (Ri R,


and Ni N), and we have therefore Li 1 R N L where Li is the loss incurred
if the name i defaults. We also assume that interest rates are deterministic.
Copulas Applied to Derivatives Pricing 227

As mentioned before, the kth to default basket is a swap that provides protection
against defaults up to the kth. We are therefore interested in the distribution of $\tau^{(k)}$,
which is the kth default time.
We can write:

$$F^{(k)}(t) = Q(\tau^{(k)} \le t) = Q(N_t \ge k) = \sum_{m=k}^{n} Q(N_t = m)$$

The Default Leg  The default leg, denoted DL, is computed as follows:

$$DL = L\,E^{Q}\!\left[B(0,\tau^{(k)})\,\mathbf{1}_{\{\tau^{(k)} \le T\}}\right] = L\int_0^T B(0,t)\,f^{(k)}(t)\,dt, \qquad (7.13)$$

where $f^{(k)}(t)$ is the density function of $\tau^{(k)}$, given by $f^{(k)}(t) = dF^{(k)}(t)/dt$.
Within the one factor Gaussian copula framework, $F^{(k)}(t)$ is given as a sum of
$Q(N_t = m)$, $k \le m \le n$, where, as explained before:

$$Q(N_t = m) = \int_{-\infty}^{\infty} \alpha_m(z)\,\varphi(z)\,dz$$

where $\alpha_m$ is the coefficient next to $u^m$ in the moment-generating function.
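As an illustration, the probabilities $Q(N_t = m)$ can be sketched numerically for the homogeneous portfolio. The sketch below is not the authors' implementation: it assumes that, conditional on the common factor z, each name defaults independently with probability $p(z) = \Phi\big((\Phi^{-1}(p_t) - \sqrt{\rho}\,z)/\sqrt{1-\rho}\big)$, so that the conditional number of defaults is binomial, and it integrates over z with a simple trapezoid rule. The function names, the single-name default probability `p_t`, and the correlation `rho` (assumed strictly less than 1) are hypothetical inputs chosen for this example.

```python
from math import comb, exp, pi, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def default_count_distribution(n, p_t, rho, n_quad=401, z_max=8.0):
    """Return [Q(N_t = m) for m = 0..n] in a one-factor Gaussian copula.

    n     : number of names (homogeneous pool, an assumption of this sketch)
    p_t   : single-name default probability up to time t
    rho   : flat asset correlation, 0 <= rho < 1
    """
    threshold = _STD_NORMAL.inv_cdf(p_t)
    dz = 2.0 * z_max / (n_quad - 1)
    probs = [0.0] * (n + 1)
    for j in range(n_quad):
        z = -z_max + j * dz
        weight = dz * exp(-0.5 * z * z) / sqrt(2.0 * pi)
        if j == 0 or j == n_quad - 1:
            weight *= 0.5  # trapezoid rule end points
        # Default probability of one name, conditional on the factor z.
        p_z = _STD_NORMAL.cdf((threshold - sqrt(rho) * z) / sqrt(1.0 - rho))
        for m in range(n + 1):
            # Conditional on z the default count is binomial.
            probs[m] += weight * comb(n, m) * p_z**m * (1.0 - p_z) ** (n - m)
    return probs


def kth_default_cdf(k, probs):
    """F^(k)(t) = Q(N_t >= k), the sum of Q(N_t = m) for m = k..n."""
    return sum(probs[k:])


probs = default_count_distribution(n=5, p_t=0.05, rho=0.3)
first_to_default = kth_default_cdf(1, probs)
second_to_default = kth_default_cdf(2, probs)
```

Raising `rho` in this sketch lowers the first-to-default probability and raises the higher-order ones, which is the correlation dependence of the spread mentioned earlier.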


Note that (7.13) could be written as follows:

$$DL = L\,E^{Q}\!\left[B(0,\tau^{(k)})\,\mathbf{1}_{\{\tau^{(k)} \le T\}}\right] = -L\int_0^T B(0,t)\,s^{(k)}(t)\,dt$$

Integrating by parts we get:

$$DL = L\left[1 - S^{(k)}(T)\,B(0,T) - \int_0^T f(0,t)\,B(0,t)\,S^{(k)}(t)\,dt\right],$$

where $s^{(k)}(t)$ is the survival probability density given by

$$s^{(k)}(t) = -\frac{dF^{(k)}(t)}{dt} = \frac{dS^{(k)}(t)}{dt}$$

and $S^{(k)}(t)$ by

$$S^{(k)}(t) = Q(\tau^{(k)} > t) = Q(N_t < k) = \sum_{m=0}^{k-1} Q(N_t = m)$$

Therefore, depending on the number of terms in the sum, we can use one or the
other (survival or default) to ease computations.
The Fixed Leg  As with the CDO, the fixed leg of a basket default swap is basically
a risky coupon bond. Hence, we have the following expression:

$$FL = N\,E^{Q}\!\left[\sum_{i=1}^{p} s\,\Delta t_i\,B(0,t_i)\,\mathbf{1}_{\{\tau^{(k)} > t_i\}}\right] = N\sum_{i=1}^{p} s\,\Delta t_i\,B(0,t_i)\,S^{(k)}(t_i)$$

where the $t_i$, $i = 1, \ldots, p$, are the payment dates, $\Delta t_i = t_i - t_{i-1}$, and s is the fixed
premium paid at each payment date.
As with the CDO, the kth to default basket default swap spread is the value of s
that makes the default leg equal to the fixed leg and is given by

$$\text{spread} = \frac{(1-R)\left[1 - S^{(k)}(T)\,B(0,T) - \int_0^T f(0,t)\,B(0,t)\,S^{(k)}(t)\,dt\right]}{\sum_{i=1}^{p} \Delta t_i\,B(0,t_i)\,S^{(k)}(t_i)}$$

Again, we have omitted for simplicity the accrual payment that could be added
to the fixed leg in case the default happens between two premium dates.
This comes down again to the computation of the probabilities Q Nt m ,
which is, as explained before, very easy to compute within the factor copula
framework.
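The spread formula can be illustrated with a short numerical sketch. It assumes, purely for the example, a flat deterministic rate (so that $B(0,t) = e^{-rt}$ and $f(0,t) = r$), equally spaced premium dates, no accrual payment, and a user-supplied survival curve for the kth default time; the function and argument names are hypothetical.

```python
from math import exp


def kth_to_default_spread(survival, maturity, recovery, rate,
                          payments_per_year=4, n_int=2000):
    """Fair spread of a kth to default basket from the survival curve S^(k)(t).

    survival : callable t -> S^(k)(t) = Q(tau^(k) > t)
    rate     : flat short rate (assumption), so f(0,t) = rate
    """
    def discount(t):
        return exp(-rate * t)

    # Integral of f(0,t) B(0,t) S^(k)(t) over [0, T] by the trapezoid rule.
    dt = maturity / n_int
    integral = 0.0
    for j in range(n_int + 1):
        t = j * dt
        weight = 0.5 if j in (0, n_int) else 1.0
        integral += weight * rate * discount(t) * survival(t) * dt

    # Default leg per unit nominal, in its integrated-by-parts form.
    default_leg = (1.0 - recovery) * (
        1.0 - survival(maturity) * discount(maturity) - integral
    )

    # Fixed leg per unit nominal and unit spread (risky annuity).
    n_pay = int(round(maturity * payments_per_year))
    delta = maturity / n_pay
    annuity = sum(
        delta * discount(i * delta) * survival(i * delta)
        for i in range(1, n_pay + 1)
    )
    return default_leg / annuity


# Usage with a made-up exponential survival curve for the kth default time.
spread = kth_to_default_spread(lambda t: exp(-0.02 * t), maturity=5.0,
                               recovery=0.4, rate=0.03)
```

With an exponential survival curve of intensity 2% and 40% recovery, the result is close to $(1 - R) \times 2\% = 1.2\%$, slightly higher because premiums are paid in arrears.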

7.5 CONCLUSION

Copulas have found many applications within the field of financial derivatives. Their
application is not limited to equity or credit derivatives. Copulas are also used for
risk management purposes, as the calculation of risk limits for large portfolios poses
problems similar to the pricing of CDOs and Altiplanos. They are also applied in
interest rate derivatives, for structures like CMS spread options, and in the field of
hybrid derivatives.
Equity Hybrid Derivatives
By Marcus Overhaus, Ana Bermúdez, Hans Buehler, Andrew Ferraris, Christopher Jordinson and Aziz Lamnouar
Copyright © 2007 by Marcus Overhaus, Aziz Lamnouar, Ana Bermúdez, Hans Buehler,
Andrew Ferraris, and Christopher Jordinson

CHAPTER 8
Forward PDEs and Local Volatility
Calibration

8.1 INTRODUCTION

8.1.1 Local and Implied Volatilities


In the Black-Scholes model ([134]), the stock price follows geometric Brownian
motion with a constant volatility $\sigma$:

$$\frac{dS_t}{S_t} = (r_t - \mu_t)\,dt + \sigma\,dW_t, \qquad (8.1)$$

where $r_t$ is the short interest rate and $\mu_t$ contains the repo rate and a dividend yield.
This is discussed in more detail in chapter 1. Under this assumption, the price of
a European call option with strike K and maturity T is given by the Black-Scholes
formula

$$C(K,T) = P(0,T)\,\bigl(F_T\,N(d_+) - K\,N(d_-)\bigr)$$

where P(t, T) is the price at time t of a zero-coupon bond with maturity T, $F_T$ is the
stock forward, and

$$d_\pm = \frac{-\ln(K/F_T) \pm \tfrac{1}{2}\sigma^2 (T - t)}{\sigma\sqrt{T - t}}$$

Now that the European options themselves form a liquid market, prices are
available for many options on many stocks and indices. The implied volatility of an
option is the constant volatility that when used in the above equations recovers the
market price of the option. Figure 8.1 shows the dependence of the implied volatility
of the Stoxx50 index on the maturity and strike of the options.
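The definition of implied volatility can be made concrete with a small sketch: price a call with the forward-based Black-Scholes formula, then invert the formula by bisection. The function names and the bracketing interval are choices made for this example only; the formula is evaluated as seen from today (t = 0).

```python
from math import log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard normal cumulative distribution


def black_call(forward, strike, vol, maturity, discount):
    """Call price P(0,T) * (F_T N(d+) - K N(d-)) for a constant volatility."""
    std = vol * sqrt(maturity)
    d_plus = (log(forward / strike) + 0.5 * std * std) / std
    d_minus = d_plus - std
    return discount * (forward * N(d_plus) - strike * N(d_minus))


def implied_vol(price, forward, strike, maturity, discount,
                lo=1e-6, hi=5.0, tol=1e-10):
    """Constant volatility that reproduces `price`, found by bisection.

    Bisection works because the call price is increasing in the volatility.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black_call(forward, strike, mid, maturity, discount) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round trip: a price generated with 25% volatility implies 25% back.
market_price = black_call(100.0, 110.0, 0.25, 2.0, 0.95)
vol = implied_vol(market_price, 100.0, 110.0, 2.0, 0.95)
```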
Since the implied volatility is a function of the strike price, the volatility that we
use in equation (8.1) cannot be constant. If we want our stock model to be Markovian
in just one factor, we must make the volatility of the stock a deterministic function
of both the stock price and time. This is referred to as the local volatility. In reality,
though, there is not such a simple relationship between volatility and stock price.
Studies of historical market data show that the volatility is stochastic and can be
modeled well by mean-reverting processes such as Heston’s model [135].

230 ADVANCED PRICING TECHNIQUES

FIGURE 8.1 Implied volatility surface for the Stoxx50 index. (Surface plot titled
".STOXX50E Implied Volatility 09/12/2005"; axes: strike/spot from 10% to 210%,
maturity from 09/06/2006 to 09/06/2024, implied volatility from 0 to 50.)

A local volatility model has the benefit over a stochastic volatility model that it
is Markovian in only one factor (and therefore more tractable). It is also possible to
calibrate a local volatility model to a complete implied volatility surface (assuming
there is no arbitrage). It has the drawback that it predicts unrealistic dynamics
for the stock volatility and therefore the implied volatility surface. However, a
local volatility model is sufficient for pricing some products—particularly ones with
European payoffs, which can be hedged perfectly with a static set of positions in
European calls and puts.
In a simple one-factor model with no extra sources of randomness, Dupire [136]
showed that we can express a local volatility in terms of the implied volatility surface
and its derivatives. However, this formula can be difficult to use in practice. If we add
more sources of randomness to our model—for example, stochastic interest rates,
hazard rates, or dividends—Dupire’s formula no longer applies and we must find
another way to create a local volatility surface from an implied volatility surface.
Tied in with the problem of calibrating a local volatility surface is the problem
of pricing options with European payoffs (where the payoff depends on the value
of the stock on a single maturity) where no closed form solutions exist. This can
obviously be done by Monte Carlo simulation or backwards induction using a tree
or numerical PDE solver. However, simulation methods suffer from a slow rate of
convergence, while backward PDE methods and trees have better convergence but
can price only one option at a time.
In this chapter, we demonstrate the powerful technique of using forward PDEs
to price multiple European options very efficiently. We then go on to discuss how
to use this technique to calibrate a local volatility surface to an implied volatility
surface in single- and multifactor models.

8.1.2 Dupire’s Formula and Its Problems


Dupire [136] showed that the local volatility (which we shall denote by $\sigma^l$) can
be expressed in terms of the implied volatility surface, or more simply in terms of
European call prices. We derive his full result in section 8.3, equation (8.18), but for
the purposes of this section, we can ignore the effects of interest rates and dividends.
Letting $r_t = \mu_t = 0$, Dupire's formula becomes

$$\sigma^l(K,T)^2 = \frac{2\,\dfrac{\partial C}{\partial T}(K,T)}{K^2\,\dfrac{\partial^2 C}{\partial K^2}(K,T)} \qquad (8.2)$$
The drawback of this equation in practice is that it requires the knowledge of


call prices for all strikes and maturities, whereas in reality there will be data for only
call prices at a discrete set of strikes and maturities. We must therefore interpolate
the call prices (or implied volatilities) between the market data points if we are to
use equation (8.2). For the local volatility to be continuous in the stock price (which
is necessary for good convergence of any numerical scheme) we need the second
derivatives of the call prices with respect to strike to be continuous. We also need
the call prices to be once differentiable with respect to time. Additionally, the call
prices must obey the following no-arbitrage conditions:

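$$C(K,T) \ge (S_0 - K)^+,$$

$$\frac{\partial C}{\partial T} \ge 0,$$

$$-1 \le \frac{\partial C}{\partial K} \le 0$$

and

$$\frac{\partial^2 C}{\partial K^2} \ge 0,$$

as well as the boundary conditions

$$C(K,T) \to S_0 - K \quad \text{as } K \to 0$$

and

$$C(K,T) \to 0 \quad \text{as } K \to \infty$$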

The equivalent conditions when expressed in terms of implied volatilities are even
more complicated to evaluate.
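On a discrete grid of quotes, the no-arbitrage conditions above can be checked with finite differences. The sketch below is one possible way to do it, under the same simplification of zero rates and dividends; the function name, the layout of `calls` (one row per maturity, one column per strike) and the violation labels are assumptions of this example.

```python
def arbitrage_violations(strikes, maturities, calls, spot, tol=1e-12):
    """List violations of the call-price no-arbitrage conditions on a grid.

    calls[i][j] is the call price for maturities[i] and strikes[j];
    strikes and maturities are assumed sorted in increasing order.
    """
    issues = []
    for i, maturity in enumerate(maturities):
        row = calls[i]
        for j, strike in enumerate(strikes):
            # C >= (S0 - K)^+
            if row[j] < max(spot - strike, 0.0) - tol:
                issues.append(("intrinsic", maturity, strike))
            # dC/dT >= 0: prices must not decrease with maturity
            if i > 0 and row[j] < calls[i - 1][j] - tol:
                issues.append(("calendar", maturity, strike))
        # Discrete dC/dK between neighbouring strikes
        slopes = [
            (row[j + 1] - row[j]) / (strikes[j + 1] - strikes[j])
            for j in range(len(strikes) - 1)
        ]
        for j, slope in enumerate(slopes):
            # -1 <= dC/dK <= 0
            if slope < -1.0 - tol or slope > tol:
                issues.append(("slope", maturity, strikes[j]))
        for j in range(len(slopes) - 1):
            # d2C/dK2 >= 0: slopes must be non-decreasing in strike
            if slopes[j + 1] < slopes[j] - tol:
                issues.append(("convexity", maturity, strikes[j + 1]))
    return issues


# Usage with made-up quotes: the second grid has a calendar violation.
strikes = [80.0, 100.0, 120.0]
maturities = [1.0, 2.0]
clean = arbitrage_violations(strikes, maturities,
                             [[21.0, 6.0, 1.0], [22.0, 8.0, 2.0]], spot=100.0)
dirty = arbitrage_violations(strikes, maturities,
                             [[21.0, 6.0, 1.0], [22.0, 5.0, 2.0]], spot=100.0)
```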
Occasionally, the market data may include regions of arbitrage owing to large
bid-offer spreads on illiquid options or the difficulty of extrapolating into regions
where there is no data. However, it is necessary to remove all arbitrageability from
the implied volatility surface as Dupire’s formula is only valid up to the first time
when the above conditions are violated. If, instead of trying to fit a smooth implied
volatility surface to a discrete set of European options, we assume a local volatility
surface $\sigma^l(S, t)$, then our arbitrage and boundary conditions become that $\sigma^l(S, t)$ is
greater than zero and $S\,\sigma^l(S, t)$ is Lipschitz continuous [137]. Finding a local volatility
surface that satisfies these no-arbitrage conditions is much simpler than finding an
implied volatility surface. The only difficulty is how to fit the local volatility surface
to the market call prices, and this will be addressed in the rest of the chapter.

8.1.3 Dupire-like Formula in Multifactor Models


Another reason for looking for an alternative to Dupire’s formula is that to the
authors’ knowledge there is no equivalent formula in higher-dimensional models.
To demonstrate the difficulty of finding a two-factor version, we can consider
a simple non-dividend-paying stock with interest rates following the Hull-White
model [61]. We have
$$dr_t = (\theta_t - \kappa\,r_t)\,dt + \sigma_t^r\,dW_t^r$$

$$\frac{dS_t}{S_t} = r_t\,dt + \sigma^l(S,t)\,dW_t^S$$

Borrowing equation (8.33) from section 8.6, we have


$$\sigma_{imp}^2(T) = \frac{1}{T}\int_0^T \left[(\sigma_t^l)^2 + 2\rho\,\sigma_t^l\,\sigma_t^r\,\hat{B}(\kappa,t,T) + \hat{B}(\kappa,t,T)^2\,(\sigma_t^r)^2\right]dt$$

(see section 8.6 for details). The local volatility at time T depends not just on the
implied volatility and its derivatives at T, but also on the implied volatility at all
times $t \le T$. We cannot then hope to find a simple expression like Dupire's formula,
even in the simplest case we can study.
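The integral relation borrowed from section 8.6 is easy to evaluate numerically. The sketch below assumes the form $\hat{B}(\kappa,t,T) = (1 - e^{\kappa(t-T)})/\kappa$ given in section 8.6, takes the equity and rate volatilities as functions of time, and uses a midpoint rule; the function names and the parameter values in the usage lines are illustrative only.

```python
from math import exp, sqrt


def b_hat(kappa, t, T):
    """B-hat(kappa, t, T) = (1 - exp(kappa (t - T))) / kappa."""
    return (1.0 - exp(kappa * (t - T))) / kappa


def implied_vol_from_term_structure(T, sigma_l, sigma_r, kappa, rho, n=2000):
    """Implied volatility for maturity T from the time integral above.

    sigma_l, sigma_r : callables t -> equity / short-rate volatility
    kappa            : mean reversion of the short rate
    rho              : equity/rate correlation
    """
    dt = T / n
    total = 0.0
    for j in range(n):
        t = (j + 0.5) * dt  # midpoint of the j-th sub-interval
        sl, sr, b = sigma_l(t), sigma_r(t), b_hat(kappa, t, T)
        total += (sl * sl + 2.0 * rho * sl * sr * b + b * b * sr * sr) * dt
    return sqrt(total / T)


# Usage: constant 20% equity vol, 1% rate vol, 5% mean reversion, rho = 0.3.
vol_10y = implied_vol_from_term_structure(
    10.0, lambda t: 0.2, lambda t: 0.01, kappa=0.05, rho=0.3)
```

Even with a constant equity volatility, the implied volatility in this sketch changes with maturity, because the rate contribution accumulates over the whole interval from 0 to T.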
When we follow the steps that lead to Dupire’s formula, but with stochastic
interest rates (see section 8.4), we arrive at the following expression (see equation
(8.27)):
$$\sigma^l(K,T)^2 = \frac{2\left[\dfrac{\partial c}{\partial T} - \exp(k)\displaystyle\int\!\!\int_k^{\infty} g_T(y)\,\phi(x,y,T)\,dx\,dy\right]}{\dfrac{\partial^2 c}{\partial k^2} - \dfrac{\partial c}{\partial k}}, \qquad (8.3)$$

where c, x, y, and k are transformed call prices, stock prices, interest rates, and
strikes, respectively, and $\phi$ is the joint probability distribution of x and y. (See section
8.4 for more details.) The integral in equation (8.3) involves the expectation of r at
fixed S and cannot be uniquely determined by the implied volatility locally to S and
t (as we have shown above in the simplified case).
An alternative might be to find a liquid derivative that is sensitive to this unknown
integral. The problem with this is that we know there is enough information
to calibrate the local volatility given just the call prices. If we introduce more
instruments, we then have an over-specified problem.¹ We could use the new
instruments to come up with a local volatility surface, but it would price neither the
new instruments nor the European calls correctly (unless by some accident we had
a model that exactly described the behavior of the market, which is very unlikely).

¹ This is if we assume the volatility is just a function of the stock price and time. If we let the
volatility depend on the short rate as well, then we could use the extra liquid instruments, if
they existed! See Gyöngy [138].
Equation (8.3) shows that we can find the local volatility if we know the joint
probability distribution for S and r. The approach we present in section 8.5 is to
bootstrap both the local volatility and the probability distribution together. First, we
derive the PDEs satisfied by the probability distributions in the one-factor (section
8.3) and two-factor (section 8.4) cases.

8.2 FORWARD PDEs


By forward/backward PDE, we mean the direction in time in which the PDE is solved.
In a forward PDE, we specify the solution at the evaluation date and propagate
it forward in time; all of the forward PDEs we discuss here solve for the present
value of some European derivative as a function of its maturity T and strike K. In a
backward PDE, such as the familiar Black-Scholes PDE, we solve for the value of a
particular derivative as seen from a time t and a stock level S. A forward PDE, where
applicable, has the advantage that we can use the solution to price many different
options. To price different options using backward PDEs, we must solve the PDE
once per option as each option has a different final payoff.
In this section, we derive forward PDEs for the probability distribution arising
from some general risk-neutral processes:
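$$dx_i = \mu_i(x_t, t)\,dt + \sigma_i(x_t, t)\,dW_t^i \qquad (8.4)$$

for $1 \le i \le n$, with

$$d\langle W^i, W^j\rangle_t = \rho_{ij}\,dt$$

Any derivative price V(x, t), discounted by the money market account

$$B_t = \exp\left(\int_0^t r_s\,ds\right)$$

must be a martingale in the risk-neutral measure. Hence, applying Ito, we have

$$d\left(\frac{V}{B}\right) = \frac{1}{B}\left(\frac{\partial V}{\partial t} - r_t V + \sum_i \mu_i\frac{\partial V}{\partial x_i} + \frac{1}{2}\sum_{i,j}\rho_{ij}\,\sigma_i\sigma_j\frac{\partial^2 V}{\partial x_i\,\partial x_j}\right)dt + \frac{1}{B}\sum_i \sigma_i\frac{\partial V}{\partial x_i}\,dW_t^i,$$

and so setting the drift to zero gives the PDE for V:

$$\frac{\partial V}{\partial t} - r_t V + \sum_i \mu_i\frac{\partial V}{\partial x_i} + \frac{1}{2}\sum_{i,j}\rho_{ij}\,\sigma_i\sigma_j\frac{\partial^2 V}{\partial x_i\,\partial x_j} = 0 \qquad (8.5)$$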

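Next, we define the Arrow-Debreu price $\psi(x', t)$ as the present value of a
derivative that pays off $\delta(x_t - x')$ at time t. This is related to the t forward measure
probability density of x, $\phi(x, t)$, by

$$\psi(x,t) = P(0,t)\,\phi(x,t), \qquad (8.6)$$

as can be seen from the defining equations for $\psi$ and $\phi$:

$$V(x_0, 0) = \int V(x,t)\,\psi(x,t)\,dx \qquad (8.7)$$

$$\phantom{V(x_0, 0)} = P(0,t)\int V(x,t)\,\phi(x,t)\,dx \qquad (8.8)$$

The latter follows by noting that P(s, t) is the numeraire of the t forward
measure, $\mathbb{Q}^t$, so

$$\frac{V(x_0, 0)}{P(0,t)} = \mathbb{E}^t\!\left[\frac{V(x_t, t)}{P(t,t)}\right] = \mathbb{E}^t\left[V(x_t, t)\right]$$

To derive the forward PDE for $\psi$, we note that the left-hand side of equation
(8.7) is independent of t. Differentiating both sides with respect to t, and using
equation (8.5), gives

$$0 = \int \left[V\frac{\partial \psi}{\partial t} + \psi\left(r_t V - \sum_i \mu_i\frac{\partial V}{\partial x_i} - \frac{1}{2}\sum_{i,j}\rho_{ij}\,\sigma_i\sigma_j\frac{\partial^2 V}{\partial x_i\,\partial x_j}\right)\right]dx$$

Integrating by parts, we get

$$0 = \int V\left[\frac{\partial \psi}{\partial t} + r_t\psi + \sum_i \frac{\partial(\mu_i\psi)}{\partial x_i} - \frac{1}{2}\sum_{i,j}\frac{\partial^2(\rho_{ij}\,\sigma_i\sigma_j\psi)}{\partial x_i\,\partial x_j}\right]dx + \text{boundary terms}. \qquad (8.9)$$

These boundary terms depend on the specific problem. In all the cases we will
discuss, $\mu_i$ and $\sigma_i$ are well behaved everywhere, including infinity, so these boundary
terms can be ignored. The above equation holds for all payoffs V(x, t), and so the
only way in which it can hold generally is by setting

$$\frac{\partial \psi}{\partial t} + r_t\psi + \sum_i \frac{\partial(\mu_i\psi)}{\partial x_i} - \frac{1}{2}\sum_{i,j}\frac{\partial^2(\rho_{ij}\,\sigma_i\sigma_j\psi)}{\partial x_i\,\partial x_j} = 0$$

This is the PDE that we have been seeking for $\psi$.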


We can remove the effect of deterministic interest rates or reduce the effect of
jumps in the forward curve for stochastic interest rates by working with $\phi$ rather
than $\psi$. If we let the forward short rate with maturity T, observed at t, be f(t, T),
then

$$f(t,T) = -\frac{\partial}{\partial T}\ln P(t,T)$$

Using equation (8.6), we can show that $\phi$ obeys

$$\frac{\partial \phi}{\partial t} + \bigl(r_t - f(0,t)\bigr)\phi + \sum_i \frac{\partial(\mu_i\phi)}{\partial x_i} - \frac{1}{2}\sum_{i,j}\frac{\partial^2(\rho_{ij}\,\sigma_i\sigma_j\phi)}{\partial x_i\,\partial x_j} = 0 \qquad (8.10)$$
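The step from equation (8.6) to equation (8.10) can be spelled out as follows. The notation is an assumption of this sketch: $\psi$ stands for the Arrow-Debreu price and $\phi$ for the t forward measure density, so that equation (8.6) reads $\psi = P(0,t)\,\phi$, and $P(0,t)$ is today's (deterministic) discount curve.

```latex
\begin{align*}
\psi(x,t) &= P(0,t)\,\phi(x,t),
\qquad
\frac{\partial P(0,t)}{\partial t} = -f(0,t)\,P(0,t), \\[4pt]
\frac{\partial \psi}{\partial t}
  &= P(0,t)\left(\frac{\partial \phi}{\partial t} - f(0,t)\,\phi\right).
\end{align*}
% Substitute into the forward PDE for \psi. Every other term is linear in
% \psi with coefficients that do not involve P(0,t), so P(0,t) factors out:
\begin{equation*}
P(0,t)\left[
\frac{\partial \phi}{\partial t}
+ \bigl(r_t - f(0,t)\bigr)\,\phi
+ \sum_i \frac{\partial (\mu_i \phi)}{\partial x_i}
- \frac{1}{2}\sum_{i,j}
  \frac{\partial^2 (\rho_{ij}\,\sigma_i \sigma_j\,\phi)}{\partial x_i\,\partial x_j}
\right] = 0 .
\end{equation*}
```

Dividing by $P(0,t) > 0$ leaves the bracket equal to zero, which is the forward PDE for the density.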

The above equation demonstrates why it can be better in practice to work with
$\phi$ rather than $\psi$. When interest rates are deterministic, $r_t - f(0,t) = 0$, so the reaction
term vanishes; when we use a Vasicek/Hull-White model for interest rates [61], [62],
$r_t - f(0,t)$ is continuous, even if f(0, t) has discontinuities (which can happen if the
yield curve is not interpolated smoothly). For more complicated rate models, such
as Black-Karasinski ([63]), $r_t - f(0,t)$ will not be continuous, but its jumps will
generally be much smaller than the ones in $r_t$ alone.
The initial conditions for $\psi$ or $\phi$ can be found by taking the limit of the SDEs
in equation (8.4) as $t \to 0$. If the drift and volatility functions are bounded, then the
equations reduce to

$$x_i(\Delta t) \approx x_i(0) + \mu_i(x_0, 0)\,\Delta t + \sigma_i(x_0, 0)\,W^i(\Delta t),$$

and so we can use an n-factor Gaussian as an initial condition at time $\Delta t$. At
time t = 0 the solution will be a delta function, which is difficult to represent on a finite
difference grid. An alternative approach is to rescale the coordinates near t = 0 so
that the solution, a multifactor Gaussian, is constant in the limit $t \to 0$. We discuss
this further in section 8.4.
Once we have $\phi(x, t)$, it is then easy to price any derivative with a European
payoff at t by using equation (8.8). We can therefore price a whole series of European
options with different strikes and maturities by propagating the solution for $\phi$ out to
the latest maturity once. Note that this is only one example of a forward PDE. We
show in section 8.3 that in a single-factor equity model (with no stochastic interest
rates/hazard rates, etc.) we can derive a forward PDE for the call prices themselves
as a function of their strikes and maturities.
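To illustrate how one density prices many payoffs, the sketch below applies equation (8.8) on a grid: the same array of density values is integrated against several call payoffs. As a stand-in for the output of a forward PDE solver, it uses the Gaussian density of $x = \ln(S_t/F_t)$ under a constant volatility, so that the results can be compared with the Black-Scholes formula. All names and numbers are choices made for this example.

```python
from math import exp, log, pi, sqrt
from statistics import NormalDist

N = NormalDist().cdf


def price_european(x_grid, density, payoff, discount):
    """P(0,t) * integral of payoff(x) * density(x) dx by the trapezoid rule."""
    total = 0.0
    for j in range(len(x_grid) - 1):
        f0 = payoff(x_grid[j]) * density[j]
        f1 = payoff(x_grid[j + 1]) * density[j + 1]
        total += 0.5 * (f0 + f1) * (x_grid[j + 1] - x_grid[j])
    return discount * total


# Stand-in density: x = ln(S_t / F_t) is Gaussian under a flat 20% volatility.
sigma, t, forward, discount = 0.2, 1.0, 100.0, 0.97
var = sigma * sigma * t
xs = [-4.0 + 0.002 * j for j in range(4001)]
dens = [exp(-(x + 0.5 * var) ** 2 / (2.0 * var)) / sqrt(2.0 * pi * var)
        for x in xs]

# One density, several strikes: no further PDE solves are needed.
results = {}
for strike in (90.0, 100.0, 110.0):
    from_density = price_european(
        xs, dens, lambda x, K=strike: max(forward * exp(x) - K, 0.0), discount)
    std = sigma * sqrt(t)
    d_plus = (log(forward / strike) + 0.5 * std * std) / std
    closed_form = discount * (forward * N(d_plus) - strike * N(d_plus - std))
    results[strike] = (from_density, closed_form)
```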

8.3 PURE EQUITY CASE

In this section, we describe the equations governing the pure equity problem (by
which we mean that there are no stochastic interest rates, credit, or volatility).
We assume we have some stock S which pays a mixture of cash and proportional
dividends as defined in section 1.1.1. Recall equation (1.6):

St Ft Xt At

($F_t$ and $A_t$ are defined in equations (1.7) and (1.8) of section 1.1.2.) We will use this
to transform away the dividends and yield curve, leaving us with a martingale X,
which we will assume follows the SDE

$$\frac{dX_t}{X_t} = \sigma(X_t, t)\,dW_t$$

It will simplify the numerics (and in particular the boundary conditions that go into
equation (8.9)) to work in log-space, so we define $x_t = \ln(X_t/X_0)$ and have

$$dx_t = -\frac{1}{2}\sigma^2(x_t, t)\,dt + \sigma(x_t, t)\,dW_t \qquad (8.11)$$

We will use $\sigma(x_t, t)$ as shorthand notation for $\sigma(\exp(x_t), t)$. The meaning will always
be clear from the context.
Using equation (8.10) from section 8.2, then provided $\sigma$ is bounded as $x \to \pm\infty$,
we can ignore the boundary terms and get that

$$\frac{\partial \phi}{\partial t} - \frac{1}{2}\frac{\partial(\sigma^2\phi)}{\partial x} - \frac{1}{2}\frac{\partial^2(\sigma^2\phi)}{\partial x^2} = 0 \qquad (8.12)$$

If we have computed $\phi(x, t)$ for some t, we can use it to price call options using
equation (8.8):

$$C(K,t) = P(0,t)\,X_0 F_t\int_k^{\infty}\bigl(\exp(x) - \exp(k)\bigr)\,\phi(x,t)\,dx,$$

where k is the strike transformed to log-space by

K Ft X0 exp(k) At

We also define a normalized call price

c(k, t) C(Ft X0 exp(k) At , t) (Ft X0 P(0, t)),

and get that

$$c(k,t) = \int_k^{\infty}\bigl(\exp(x) - \exp(k)\bigr)\,\phi(x,t)\,dx \qquad (8.13)$$

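Obviously, we can price any European payoff using $\phi$. However, if we are
interested just in call prices (as is the case when calibrating a local volatility), we can
instead write a PDE for c(k, t) and solve for the call prices directly. To do this, we
start by differentiating equation (8.13) twice with respect to k, giving

$$\frac{\partial c}{\partial k} = -\exp(k)\int_k^{\infty}\phi(x,t)\,dx$$

and

$$\frac{\partial^2 c}{\partial k^2} = \frac{\partial c}{\partial k} + \exp(k)\,\phi(k,t) \qquad (8.14)$$

This last equation allows us to convert from c to $\phi$. Next, we differentiate c with
respect to t, giving

$$\frac{\partial c}{\partial t} = \int_k^{\infty}\bigl(\exp(x) - \exp(k)\bigr)\,\frac{1}{2}\frac{\partial}{\partial x}\left(1 + \frac{\partial}{\partial x}\right)(\sigma^2\phi)\,dx,$$

where we have used equation (8.12) to eliminate $\partial\phi/\partial t$. Integrating by parts gives

$$\frac{\partial c}{\partial t} = -\frac{1}{2}\int_k^{\infty}\exp(x)\left(1 + \frac{\partial}{\partial x}\right)(\sigma^2\phi)\,dx \qquad (8.15)$$

$$\phantom{\frac{\partial c}{\partial t}} = -\frac{1}{2}\int_k^{\infty}\frac{\partial}{\partial x}\bigl(\exp(x)\,\sigma^2\phi\bigr)\,dx \qquad (8.16)$$

$$\phantom{\frac{\partial c}{\partial t}} = \frac{1}{2}\exp(k)\,\sigma^2(k,t)\,\phi(k,t) \qquad (8.17)$$

By combining equations (8.14) and (8.17) we can eliminate $\phi(k, t)$, giving the desired
PDE for c(k, t):

$$\frac{\partial c}{\partial t} = \frac{1}{2}\sigma^2(k,t)\left(\frac{\partial^2 c}{\partial k^2} - \frac{\partial c}{\partial k}\right)$$

Rearranging this gives Dupire's formula in log-coordinates:

$$\sigma^2(k,t) = \frac{2\,\dfrac{\partial c}{\partial t}}{\dfrac{\partial^2 c}{\partial k^2} - \dfrac{\partial c}{\partial k}} \qquad (8.18)$$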
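As a sanity check of Dupire's formula in log-coordinates, the sketch below applies central finite differences to normalized call prices generated with a constant volatility; the formula should then return that same volatility. The closed form $c(k,t) = N(d_+) - e^{k}N(d_-)$ with $d_\pm = (-k \pm \tfrac{1}{2}\sigma^2 t)/(\sigma\sqrt{t})$ is the Black-Scholes price in these coordinates; the function names and bump sizes are choices made for this example.

```python
from math import exp, sqrt
from statistics import NormalDist

N = NormalDist().cdf


def flat_vol_call(k, t, sigma):
    """Normalized call price c(k, t) when the volatility is constant."""
    std = sigma * sqrt(t)
    d_plus = (-k + 0.5 * std * std) / std
    d_minus = d_plus - std
    return N(d_plus) - exp(k) * N(d_minus)


def dupire_local_vol(c, k, t, dk=1e-3, dt=1e-4):
    """Local volatility at (k, t) from a call-price function c(k, t).

    Uses sigma^2 = 2 c_t / (c_kk - c_k) with central finite differences.
    """
    c_t = (c(k, t + dt) - c(k, t - dt)) / (2.0 * dt)
    c_k = (c(k + dk, t) - c(k - dk, t)) / (2.0 * dk)
    c_kk = (c(k + dk, t) - 2.0 * c(k, t) + c(k - dk, t)) / dk**2
    return sqrt(2.0 * c_t / (c_kk - c_k))


# Usage: a flat 20% implied volatility surface gives a flat 20% local volatility.
recovered = dupire_local_vol(lambda k, t: flat_vol_call(k, t, 0.2), 0.1, 1.0)
```

With market data the function `c` would be an interpolation of quoted prices, and the quality of that interpolation then dominates the result, which is the practical difficulty discussed in section 8.1.2.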

Now we have two ways to price European call options efficiently: solving
the PDE for $\phi$ or solving the PDE for c. Note that the initial condition for c,
$c(k, 0) = (1 - \exp(k))^+$, is discontinuous in its first derivative in k, so for very short
maturities we will have noise in the numerical solution. We can reduce this by taking
smaller time steps and using a denser mesh near t = 0. Alternatively, we can solve
for $\phi$ near t = 0, and switch to using the PDE for c at some larger t.
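The forward PDE for c can be sketched with a very simple explicit finite-difference scheme. This is only an illustration and not the scheme used by the authors: it assumes a uniform grid in k, the initial condition $c(k,0) = (1 - e^{k})^+$, boundary values held at their initial levels, and a time step small enough for the explicit scheme to be stable ($\sigma^2\,\Delta t/\Delta k^2 \le 1$). One sweep returns call prices for every strike on the grid.

```python
from math import exp, sqrt
from statistics import NormalDist

N = NormalDist().cdf


def solve_forward_call_pde(local_vol, k_min, k_max, n_k, maturity, n_t):
    """Explicit scheme for dc/dt = 0.5 * sigma(k,t)^2 * (c_kk - c_k).

    local_vol : callable (k, t) -> local volatility
    Returns the strike grid and the normalized call prices at `maturity`.
    """
    dk = (k_max - k_min) / (n_k - 1)
    dt = maturity / n_t
    ks = [k_min + j * dk for j in range(n_k)]
    c = [max(1.0 - exp(k), 0.0) for k in ks]  # initial condition at t = 0
    for step in range(n_t):
        t = step * dt
        new = c[:]  # end points keep their boundary values
        for j in range(1, n_k - 1):
            var = local_vol(ks[j], t) ** 2
            c_k = (c[j + 1] - c[j - 1]) / (2.0 * dk)
            c_kk = (c[j + 1] - 2.0 * c[j] + c[j - 1]) / dk**2
            new[j] = c[j] + dt * 0.5 * var * (c_kk - c_k)
        c = new
    return ks, c


def flat_vol_call(k, t, sigma):
    """Closed-form normalized call price for a constant volatility."""
    std = sigma * sqrt(t)
    d_plus = (-k + 0.5 * std * std) / std
    return N(d_plus) - exp(k) * N(d_plus - std)


# Usage: one solve prices all 301 strikes for the one-year maturity.
ks, c = solve_forward_call_pde(lambda k, t: 0.2, -3.0, 3.0, 301, 1.0, 200)
worst = max(abs(c[j] - flat_vol_call(ks[j], 1.0, 0.2)) for j in (140, 150, 160))
```

Storing the solution at intermediate time steps would give the prices for every maturity on the time grid as well, which is what makes the forward equation attractive for calibration.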
The initial condition for $\phi$ is even more pathological than the one for c, being
a delta function. However, we can transform our x coordinate by $x'_t = x_t/(\bar\sigma_t\sqrt{t})$,
where $\bar\sigma_t$ is some averaged implied volatility at time t, and work with $\phi'(x', t)$, where
$\phi'(x', t)\,dx' = \phi(x, t)\,dx$. The initial condition for $\phi'$ is a Gaussian. Unfortunately,
the coefficients of the transformed PDE become infinite as $t \to 0$. To get around
this, we could start the PDE from a small time $\Delta t$, but in practice a second-order
Crank-Nicolson scheme where the coefficients are evaluated halfway through a
time step works even if started at t = 0.

8.4 LOCAL VOLATILITY WITH STOCHASTIC INTEREST RATES

In this section, we derive the forward PDEs for a two-factor interest rate and equity
model. As discussed in chapter 3, many short rate models can be expressed as
follows:

$$r_t = f(0,t) + g(y_t, \bar{y}_t, t), \qquad (8.19)$$

where $y_t$ follows the Ornstein-Uhlenbeck process:

$$dy_t = -\kappa\,y_t\,dt + \sigma_t^r\,dW_t^r$$

in the risk-neutral measure. The function $\bar{y}_t$ is assumed to have been calibrated to fit
the yield curve. If $g(y, \bar{y}, t) = y + \bar{y}$, we have the extended Vasicek/Hull-White model,
whereas if $g(y, \bar{y}) = \exp(y_t + \bar{y}_t) - f(0,t)$ we have the Black-Karasinski model. Of
course, we could have written $r_t = g(y_t, \bar{y}_t, t)$, and this would have been equivalent
mathematically. However, with the definition of equation (8.19), the function g is
generally smoother (it is continuous in the Vasicek model, whereas the paths of $r_t$
may not be).
We have the stock price process in the risk-neutral measure:

$$\frac{dS_t}{S_t} = (r_t - \mu_t)\,dt + \sigma_t^S\,dW_t^S$$

The term $\mu_t$ encompasses all of the dividend/repo terms. To take these into account
we change variable, defining

$$x_t = \ln\left(S_t\exp\left(\int_0^t \mu_s\,ds\right)P(0,t)\right) = \ln\frac{S_t}{F_t}, \qquad (8.20)$$

where $F_t$ is the stock forward with maturity t. This new variable follows the process

$$dx_t = \left(g_t(y_t) - \frac{1}{2}(\sigma_t^S)^2\right)dt + \sigma_t^S\,dW_t^S \qquad (8.21)$$

Note that the deterministic interest rate case corresponds to $g_t(y) = 0$, and equation
(8.21) reduces to equation (8.11).
We have our SDEs for the two state variables, x and y, so we can apply equation
(8.10) to get the PDE followed by the joint probability density $\phi(x, y, t)$:

$$0 = \frac{\partial\phi}{\partial t} + g_t(y)\left(\phi + \frac{\partial\phi}{\partial x}\right) - \frac{1}{2}\frac{\partial((\sigma^S)^2\phi)}{\partial x} - \kappa\frac{\partial(y\phi)}{\partial y} - \frac{1}{2}(\sigma^r)^2\frac{\partial^2\phi}{\partial y^2} - \frac{1}{2}\frac{\partial^2((\sigma^S)^2\phi)}{\partial x^2} - \rho\,\sigma^r\frac{\partial^2(\sigma^S\phi)}{\partial x\,\partial y} \qquad (8.22)$$

In practice, this is a cumbersome PDE to solve: The initial condition at t = 0 is a
two-dimensional delta function, and the distribution then spreads out as $\sqrt{t}$ while
the peak decreases as 1/t. Taking advantage of this information, we can rescale
the x and y coordinates as

$$x' = x/a_t,$$
$$y' = y/b_t,$$

where $a_t$ and $b_t$ scale as $\sqrt{t}$ as $t \to 0$, and also rescale $\phi$ as

$$\phi(x, y, t) = \frac{\phi'(x', y', t)}{a_t b_t}$$

Note that we have

$$\phi(x, y, t)\,dx\,dy = \phi'(x', y', t)\,dx'\,dy',$$

so $\phi'$ is just the probability density associated with the new coordinates $x'$ and $y'$.
We can use $a_t$ and $b_t$ to make our PDE grid cover just the region where the
probability density is significant. In the interest rate direction, we know that the
marginal probability distribution is Gaussian with variance

$$V_r(t) = \int_0^t \sigma^r(s)^2\exp\bigl(2\kappa(s - t)\bigr)\,ds,$$

and so we can define $b_t$ as the square root of this:

$$b_t = \sqrt{\int_0^t \sigma^r(s)^2\exp\bigl(2\kappa(s - t)\bigr)\,ds}$$

This makes $y'$ the number of standard deviations that the short rate is away from
the mean.
The problem is more complicated in the equity direction, where we do not have
a normal marginal probability distribution because of the volatility skew. However,
we can compute the actual marginal probability distribution from call prices and use
some feature of it to determine $a_t$. One approach is to use the at-the-money implied
volatility $\sigma_{atm}$ and let

$$a_t = \sigma_{atm}(t)\sqrt{t} \qquad (8.23)$$

A better choice of $a_t$ will allow us to distribute mesh points more efficiently in terms
of speed of calculation for a given tolerance.
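We get the following PDE for $\phi'$ (writing $\dot{a} = da_t/dt$):

$$0 = \frac{\partial\phi'}{\partial t} + g_t(b\,y')\left(\phi' + \frac{1}{a}\frac{\partial\phi'}{\partial x'}\right) - \frac{\partial}{\partial x'}\left[\left(\frac{(\sigma^S)^2}{2a} + \frac{\dot{a}}{a}\,x'\right)\phi'\right] - \frac{(\sigma^r)^2}{2b^2}\frac{\partial}{\partial y'}\left(y'\phi' + \frac{\partial\phi'}{\partial y'}\right) - \frac{1}{2a^2}\frac{\partial^2((\sigma^S)^2\phi')}{\partial x'^2} - \frac{\rho\,\sigma^r}{ab}\frac{\partial^2(\sigma^S\phi')}{\partial x'\,\partial y'} \qquad (8.24)$$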

Note that the coefficients diverge as $t \to 0$. One approach is to move the initial
condition to some small time $\Delta t$; however, depending on the implementation of the
PDE solver, this might not be necessary. The authors have obtained good results
propagating from t = 0 with an ADI scheme where the coefficients are evaluated
halfway through a time-step, hence avoiding the infinities at t = 0.
The initial condition for $\phi'$ is a two-factor Gaussian; for small t we have

$$\phi'(x', y', t) \approx \frac{ab}{2\pi\sqrt{1-\rho^2}\,\sigma^S\sigma^r t}\exp\left(-\frac{1}{2(1-\rho^2)}\left[\frac{x'^2a^2}{(\sigma^S)^2t} + \frac{y'^2b^2}{(\sigma^r)^2t} - \frac{2\rho\,x'y'\,ab}{\sigma^S\sigma^r t}\right]\right)$$

$$\phantom{\phi'(x', y', t)} \approx \frac{1}{2\pi\sqrt{1-\rho^2}}\exp\left(-\frac{x'^2 + y'^2 - 2\rho\,x'y'}{2(1-\rho^2)}\right)$$

if we use a and b as in equation (8.23) and the definition of $b_t$ above.


Figure 8.2 shows the results of propagating equation (8.24) for 18 years using the
volatility surface of the Nikkei index and a Hull-White model for the JPY interest
rates with an instantaneous correlation of zero (the terminal distribution shown in
the figure actually has some positive correlation from the effect of rate shifts on the
growth rate of the index; see section 4.1). The long tail to the left of the distribution
comes from the skew of the implied volatility surface; the in-the-money options have
higher implied volatilities than out-of-the-money ones. The marginal distributions of
x and y are shown in figure 8.3; note that in the interest-rate direction, we have just
a Gaussian, as we are using a Hull-White model. In the equity direction, we have the
same distribution we would get from solving the one-factor problem in section 8.3.
FIGURE 8.2 The joint distribution $\phi(x, y)$ for the N225 and JPY after 18 years, with zero
instantaneous correlation. (Surface plot titled "Arrow-Debreu price for N225-JPY"; axes:
N225 (x) and JPY (y).)

FIGURE 8.3 The marginal equity and interest rate distributions for N225 and JPY after 18
years. (Two panels: "Marginal N225 distribution", probability against x, and "Marginal JPY
Distribution", probability against y.)

The coordinate rescalings are useful for the solution of the PDE but cumbersome
when discussing the method, so we now return to working with $\phi$. Inverting equation
(8.20), the stock price is given by

$$S_t = F_t\exp(x_t),$$

and so the price of a European call is given by

$$C(K,T) = P(0,T)\,F_T\int_{-\infty}^{\infty}\!\int_k^{\infty}\bigl(\exp(x) - \exp(k)\bigr)\,\phi(x,y,T)\,dx\,dy,$$

where

$$K = F_T\exp(k)$$

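In an attempt to get a Dupire-like result, we can differentiate this expression with
respect to k and T respectively. As in the previous section, in order to simplify the
equations, we first write

$$c(k,T) = \frac{C(K,T)}{P(0,T)\,F_T} = \int_{-\infty}^{\infty}\!\int_k^{\infty}\bigl(\exp(x) - \exp(k)\bigr)\,\phi(x,y,T)\,dx\,dy \qquad (8.25)$$

Working in terms of c(k, T) rather than C(K, T), we don't have to worry about
dividends or the initial yield curve. Differentiating with respect to T and using
equation (8.22) gives

$$\frac{\partial c}{\partial T} = \int_{-\infty}^{\infty}\!\int_k^{\infty}\bigl(\exp(x) - \exp(k)\bigr)\left[-g_T(y)\left(\phi + \frac{\partial\phi}{\partial x}\right) + \frac{\partial}{\partial x}\left(\frac{(\sigma^S)^2}{2}\phi\right) + \frac{1}{2}\frac{\partial^2((\sigma^S)^2\phi)}{\partial x^2}\right]dx\,dy$$

Integrating by parts a few times gives

$$\frac{\partial c}{\partial T} = \frac{1}{2}\exp(k)\int_{-\infty}^{\infty}\sigma^S(k,T)^2\,\phi(k,y,T)\,dy + \exp(k)\int_{-\infty}^{\infty}\!\int_k^{\infty} g_T(y)\,\phi(x,y,T)\,dx\,dy \qquad (8.26)$$

Differentiating equation (8.25) with respect to k gives

$$\frac{\partial c}{\partial k} = -\exp(k)\int_{-\infty}^{\infty}\!\int_k^{\infty}\phi(x,y,T)\,dx\,dy$$

$$\frac{\partial^2 c}{\partial k^2} = -\exp(k)\int_{-\infty}^{\infty}\!\int_k^{\infty}\phi(x,y,T)\,dx\,dy + \exp(k)\int_{-\infty}^{\infty}\phi(k,y,T)\,dy$$

and so we can combine the last three equations to give

$$\frac{\partial c}{\partial T} = \frac{(\sigma^S)^2}{2}\left(\frac{\partial^2 c}{\partial k^2} - \frac{\partial c}{\partial k}\right) + \exp(k)\int_{-\infty}^{\infty}\!\int_k^{\infty} g_T(y)\,\phi(x,y,T)\,dx\,dy \qquad (8.27)$$

This is almost, but not quite, the Dupire-like result we want. Indeed, if we let
$g_t(y) = 0$ (which reduces the problem to the deterministic interest rate case), the
above expression reduces to Dupire's formula. Unfortunately, there is no way to
back out the second term on the right-hand side from just European option prices.
However, if we have propagated $\phi$ up to time T, we can use the above expression to
determine the local volatility between T and $T + \Delta T$.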

8.5 CALIBRATING THE LOCAL VOLATILITY

In the previous sections, we have shown how to use forward PDEs to price European
options efficiently given a local volatility surface. It is then a standard inverse problem
to find the local volatility surface that is consistent with an implied volatility surface.
We can parameterize the local volatility in some way, then adjust the parameters
until we correctly reprice a set of European options. Since the prices of European
options with maturity T depend only on the local volatility surface at times $t \le T$,
we can bootstrap the calibration; that is, we can calibrate the surface up to some
time $T_i$, then calibrate the surface from $T_i$ to $T_{i+1}$, leaving the local volatility at
$t \le T_i$ unchanged. We can also use $\phi(x, T_i)$, found in the calibration up to $T_i$, as a
starting point for the calibration from $T_i$ to $T_{i+1}$.
Depending on the amount of detail in the implied volatility surface we are trying
to fit, we might want to parameterize the local volatility by tens or hundreds of
parameters in any given time slice. The number of iterations that any root-finding
algorithm is likely to need will grow accordingly. It can therefore become a very slow
process to solve the inverse problem where the forward problem involves solving for
$\phi$ with a one- or two- (or potentially higher) factor PDE solver. However, a better
approach is available.
Assuming that we have evolved $\phi$ up to time t, we can write the call prices
consistent with our local volatility surface as integrals over $\phi$ (see, for example,
equation (8.25) for the two-factor version). We can then find a relationship between
our local volatility $\sigma(x, t)$ at this time and the rates of change of call prices with
respect to maturity at fixed k. In one factor this is just equation (8.17) and in two
factors it is equation (8.26). We can therefore express the call prices at time $t + \Delta t$ in
terms of $\phi(t)$ and the local volatility between t and $t + \Delta t$.
As an example, in the two-factor case we have, with $\Delta T = T_{i+1} - T_i$,

$$c(k, T_{i+1}) \approx \int_k^{\infty}\!\int_{-\infty}^{\infty}\bigl(\exp(x) - \exp(k)\bigr)\,\phi(x,y,T_i)\,dy\,dx + \Delta T\left[\frac{1}{2}\exp(k)\,\sigma^S(k,T_i)^2\int_{-\infty}^{\infty}\phi(k,y,T_i)\,dy + \exp(k)\int_{-\infty}^{\infty}\!\int_k^{\infty} g_{T_i}(y)\,\phi(x,y,T_i)\,dx\,dy\right] \qquad (8.28)$$

We want to find some function $\sigma(x)$ that, when applied between $T_i$ and $T_{i+1}$, gives
the minimum discrepancy between the model call prices $c(k, T_{i+1})$ and the market
call prices $c_m(k, T_{i+1})$. The above equation (and indeed the equivalent equation in
the one-factor case) reduces to

$$c_m(k, T_{i+1}) - c(k, T_{i+1}) = a(k,T) - b(k,T)\,\sigma^S(k,T)^2,$$

where a and b are known at time T. We can iteratively guess the parameters of
$\sigma(x)$ until we minimize the above difference (in some norm). This is much faster
than iterating the full problem, where $c(k, T_{i+1})$ is the result of an expensive PDE
solution. Note that we could insist that the difference between the market and model
call prices be zero, letting

$$\sigma^S(k,T)^2 = \frac{a(k,T)}{b(k,T)}$$

The problems with doing this are that a might be negative (if the data are arbitrageable)
and that at low/high strikes both a and b go to zero and the ratio of them cannot
be computed with much confidence. A better approach is to minimize the difference
between the model and the market call prices in some norm, $\lVert c - c_m\rVert$, plus some
penalty function for nonsmooth $\sigma(x)$ functions. We might want to minimize

$$F(\sigma) = \int\left\{w(k)\bigl[c_m(k, T_{i+1}) - c(k, T_{i+1})\bigr]^2 + z(k)\left[\frac{\partial^2\sigma^S}{\partial k^2}\right]^2\right\}dk$$

for some weight functions w(k) and z(k). Realistically, we would choose some
discretised version of the above such as

$$F(\sigma) = \sum_n\left\{w(k_n)\bigl[c_m(k_n, T_{i+1}) - c(k_n, T_{i+1})\bigr]^2 + z(k_n)\left[\frac{\partial^2\sigma^S}{\partial k_n^2}\right]^2\right\}$$

This function can be easily evaluated given a local volatility curve and does not
involve expensive finite-difference computations, so we can back out the best local
volatility curve using some least-squares fitting routine. It is also much safer to use a
simple functional form in a minimization routine: the numerical noise in propagating
with a PDE solver could easily confuse a minimization algorithm.
We can now use our local volatility with our slow finite-difference solver to
propagate $\phi$ from $T_i$ to $T_{i+1}$. Note that equation (8.28) is equivalent to using
an explicit finite difference step to propagate from $T_i$ to $T_{i+1}$, and the predicted
call prices at time $T_{i+1}$ (and therefore the local volatility) are accurate only to
$O(\Delta T)$, where $\Delta T = T_{i+1} - T_i$. However, while the call prices at $T_{i+1}$ from our local
volatility surface might differ from the market call prices by $O(\Delta T)$, we know them
to the same order of accuracy as our finite-difference solution. Any errors from the
approximation will not accumulate but be corrected for on the next time step. The
accuracy of our fit to the call prices is always $O(\Delta T)$, regardless of how many steps
we are using to propagate up to time T, providing our finite difference solver is
accurate to at least $O(\Delta T^2)$.
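A single bootstrap step can be sketched in the one-factor case, where the explicit step reads $c(k, T_{i+1}) \approx c(k, T_i) + \Delta T\,\tfrac{1}{2}e^{k}\sigma^2(k)\,\phi(k, T_i)$, so that $a = c_m(k, T_{i+1}) - c(k, T_i)$ and $b = \tfrac{1}{2}\Delta T\,e^{k}\phi(k, T_i)$. The sketch uses the pointwise ratio a/b rather than the penalized least-squares fit, and it takes both the "model" quantities at $T_i$ and the "market" prices at $T_{i+1}$ from a flat-volatility world so that the answer is known. All names and numbers are assumptions made for this example.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

N = NormalDist().cdf


def flat_vol_call(k, t, sigma):
    """Normalized call price c(k, t) for a constant volatility."""
    std = sigma * sqrt(t)
    d_plus = (-k + 0.5 * std * std) / std
    return N(d_plus) - exp(k) * N(d_plus - std)


def flat_vol_density(k, t, sigma):
    """Density of x = ln(X_t / X_0) for a constant volatility."""
    var = sigma * sigma * t
    return exp(-(k + 0.5 * var) ** 2 / (2.0 * var)) / sqrt(2.0 * pi * var)


def local_variance_step(c_model_now, c_market_next, density_now, k, dT):
    """Pointwise sigma^2 = a / b for one explicit step of length dT."""
    a = c_market_next - c_model_now
    b = 0.5 * dT * exp(k) * density_now
    return a / b


# Usage: step from T_i = 1.0 to T_{i+1} = 1.01 in a 20% flat-vol world.
sigma_true, T_i, dT = 0.2, 1.0, 0.01
calibrated = {}
for k in (-0.1, 0.0, 0.1):
    variance = local_variance_step(
        flat_vol_call(k, T_i, sigma_true),
        flat_vol_call(k, T_i + dT, sigma_true),
        flat_vol_density(k, T_i, sigma_true),
        k, dT)
    calibrated[k] = sqrt(variance)
```

Near the money the recovered volatility agrees with 20% up to an error of order $\Delta T$. Far from the money both a and b become tiny, which is why the text prefers a smoothed least-squares fit to this pointwise ratio.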

8.6 SPECIAL CASE: VASICEK PLUS A TERM STRUCTURE OF EQUITY VOLATILITIES

In this section, we show a quicker way to find a term structure of equity volatilities
when the interest rates follow a Vasicek/Hull-White model and the stock has no
implied volatility skew.
We have the risk-neutral dynamics

$$dr_t = \kappa(\theta_t - r_t)\,dt + \sigma_t^r\,dW_t^r \tag{8.29}$$

and

$$\frac{dS_t}{S_t} = (r_t - \mu_t)\,dt + \sigma_t^S\,dW_t^S$$

To remove the effects of the dividends, we define a new variable $X_t$ by

$$X_t = S_t \exp\left(\int_0^t \mu_u\,du\right) = \frac{S_t}{P(0, t)\,F_t},$$

which has the familiar dynamics

$$\frac{dX_t}{X_t} = r_t\,dt + \sigma_t^S\,dW_t^S$$

$X_t$ corresponds to the strategy of reinvesting the dividends in the stock and so is
tradable.
We want to find the dynamics of $X_t$ in the $T$-forward measure, $\mathbb{Q}^T$, in which
the zero-coupon bond $P(t, T)$ is the numeraire; first, we must find the dynamics of
$P(t, T)$ under the risk-neutral measure $\mathbb{Q}$. Integrating equation (8.29) gives

$$r_s = \int_t^s \exp(\kappa(u - s))\,\sigma_u^r\,dW_u^r + \text{nonstochastic terms}$$
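The zero-coupon bond $P(t, T)$ has value (keeping only the stochastic terms after the first line)

$$P(t, T) = \mathbb{E}\left[\exp\left(-\int_t^T r_s\,ds\right)\right] \tag{8.30}$$

$$\propto \exp\left(-\int_t^T \int_t^s \exp(\kappa(u - s))\,\sigma_u^r\,dW_u^r\,ds\right) \tag{8.31}$$

$$= \exp\left(-\int_t^T \hat{B}(\kappa, u, T)\,\sigma_u^r\,dW_u^r\right), \tag{8.32}$$

where

$$\hat{B}(\kappa, u, T) = \frac{1 - \exp(\kappa(u - T))}{\kappa}$$

Equation (8.32) gives us the volatility of the zero-coupon bond. We can find the
drift using the fact that $P(t, T)$ is tradable, so $P(t, T)/B_t$ must be a martingale and
so we have

$$\frac{dP(t, T)}{P(t, T)} = r_t\,dt - \hat{B}(\kappa, t, T)\,\sigma_t^r\,dW_t^r$$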

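Since $X_t$ is tradable, $X_t/P(t, T)$ will be a $\mathbb{Q}^T$-martingale, so it follows that

$$\frac{d\left(X_t/P(t, T)\right)}{X_t/P(t, T)} = \sigma_t^S\,dW_t^S + \hat{B}(\kappa, t, T)\,\sigma_t^r\,dW_t^r,$$

where the $W_t$ are Brownian motions in $\mathbb{Q}^T$. It follows that under $\mathbb{Q}^T$, $X_T$ is log-normally
distributed with mean $1/P(0, T)$ (since $X_0 = 1$) and variance

$$V_T = \int_0^T \left[(\sigma_t^S)^2 + 2\rho\,\sigma_t^S\,\hat{B}(\kappa, t, T)\,\sigma_t^r + \hat{B}(\kappa, t, T)^2 (\sigma_t^r)^2\right] dt \tag{8.33}$$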

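We can write

$$X_T = \frac{1}{P(0, T)}\,\mathcal{E}\left(\sqrt{\frac{V_T}{T}}\,W_T\right),$$

where $\mathcal{E}$ is the Doléans-Dade exponential:

$$\mathcal{E}\left(\sqrt{\frac{V_T}{T}}\,W_T\right) = \exp\left(\sqrt{\frac{V_T}{T}}\,W_T - \frac{V_T}{2}\right)$$

and so the stock price is given by

$$S_T = F_T\,\mathcal{E}\left(\sqrt{\frac{V_T}{T}}\,W_T\right)$$

The price of a call with maturity $T$ and strike $K$ is

$$C(K, T) = P(0, T)\,\mathbb{E}^T\left[(S_T - K)^+\right] = P(0, T)\,\mathbb{E}^T\left[\left(F_T\,\mathcal{E}\left(\sqrt{\frac{V_T}{T}}\,W_T\right) - K\right)^+\right]$$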

If we use the same assumptions that go into the definition of implied volatility
(i.e., deterministic interest rates and a constant volatility), then we can identify

$$V_T = \sigma_{\mathrm{imp},T}^2\,T$$

Substituting this into equation (8.33) gives the relationship between the implied
volatility, $\sigma_{\mathrm{imp}}$, and the process volatilities, $\sigma^S$ and $\sigma^r$:

$$\sigma_{\mathrm{imp}}^2(T) = \frac{1}{T}\int_0^T \left[(\sigma_t^S)^2 + 2\rho\,\sigma_t^S\,\hat{B}(\kappa, t, T)\,\sigma_t^r + \hat{B}(\kappa, t, T)^2 (\sigma_t^r)^2\right] dt \tag{8.34}$$

To find $\sigma^S$ from $\sigma_{\mathrm{imp}}$ and $\sigma^r$, we must invert the above expression. This can be
done numerically by bootstrapping. Figure 8.4 shows the result for a constant implied
volatility of $\sigma_{\mathrm{imp}} = 20\%$, with $\kappa = 1\%$ and $\sigma^r = 2\%$, for a range of correlations. The
larger the correlation, the more suppressed the process volatility becomes. Note that
even with zero correlation, the process volatility is less than the implied volatility.
Also note that we cannot fit a flat implied volatility beyond a certain maturity for
each correlation.

[Figure 8.4 plot, "Equity Process Volatility": volatility (0 to 0.3) against time in years (0 to 30), one curve for each correlation from -0.6 to 0.6 in steps of 0.2.]

FIGURE 8.4 Equity process volatility for different correlations, with a fixed implied volatility
of $\sigma_{\mathrm{imp}} = 20\%$, with $\kappa = 1\%$ and $\sigma^r = 1\%$.
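The bootstrapping step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the equity process volatility is piecewise constant between the quoted maturities and that the mean reversion, the rate volatility and the correlation are constants. All function and variable names are hypothetical. Under these assumptions, (8.34) reduces to one quadratic equation per maturity, and the absence of a nonnegative root signals that the quoted implied volatility cannot be fitted.

```python
import math


def int_b(kappa, a, b, T):
    """Integral over [a, b] of B_hat(kappa, t, T) = (1 - exp(kappa*(t - T))) / kappa."""
    e1 = math.exp(kappa * (b - T)) - math.exp(kappa * (a - T))
    return ((b - a) - e1 / kappa) / kappa


def int_b2(kappa, a, b, T):
    """Integral over [a, b] of B_hat(kappa, t, T)**2."""
    e1 = math.exp(kappa * (b - T)) - math.exp(kappa * (a - T))
    e2 = math.exp(2.0 * kappa * (b - T)) - math.exp(2.0 * kappa * (a - T))
    return ((b - a) - 2.0 * e1 / kappa + e2 / (2.0 * kappa)) / kappa ** 2


def total_variance(sig_s, times, kappa, sig_r, rho):
    """V_T of (8.33) for T = times[len(sig_s) - 1], with sig_s[j] constant on (times[j-1], times[j]]."""
    T = times[len(sig_s) - 1]
    v, a = 0.0, 0.0
    for s, b in zip(sig_s, times):
        v += (s * s * (b - a)
              + 2.0 * rho * s * sig_r * int_b(kappa, a, b, T)
              + sig_r ** 2 * int_b2(kappa, a, b, T))
        a = b
    return v


def bootstrap_equity_vols(times, imp_vols, kappa, sig_r, rho):
    """Solve (8.34) maturity by maturity for the piecewise-constant equity volatility."""
    sig_s = []
    for i, (T, imp) in enumerate(zip(times, imp_vols)):
        a = times[i - 1] if i > 0 else 0.0
        # Variance already explained when the newest segment has zero equity volatility.
        known = total_variance(sig_s + [0.0], times, kappa, sig_r, rho)
        qa = T - a
        qb = 2.0 * rho * sig_r * int_b(kappa, a, T, T)
        qc = known - imp * imp * T
        disc = qb * qb - 4.0 * qa * qc
        root = (-qb + math.sqrt(disc)) / (2.0 * qa) if disc >= 0.0 else -1.0
        if root < 0.0:
            raise ValueError("implied volatility at T=%g cannot be fitted" % T)
        sig_s.append(root)
    return sig_s
```

A round trip (computing implied volatilities from known process volatilities and bootstrapping them back) is a convenient consistency check for such a routine.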

CHAPTER 9
Numerical Solution of Multifactor Pricing
Problems Using Lagrange-Galerkin with
Duality Methods

9.1 INTRODUCTION

Many financial derivative products are conveniently modeled in terms of one or more
factors, or stochastic spatial variables, and time. Based on the contingent claims
analysis developed by Black and Scholes [72] and Merton [73], a partial differential
equation (PDE) for the fair price of these derivatives can be obtained. Valuation
PDEs for financial derivatives are usually parabolic and of second order. In the
more general case, partial differential inequalities (PDIs) arise. The inequality comes
when the option has some embedded early-exercise features and the price of the
contingent claim must satisfy some inequality constraints in order to avoid arbitrage
opportunities. In other words, if the price were to violate those constraints, the
option would be exercised, since both the buyer and the seller of an option will try to
maximize the value of their rights under the contract. Early-exercise features appear,
for example, in American options and in the conversion, call, and put provisions
of convertible bonds. These are so-called free boundary problems because there are
(a priori) unknown boundaries separating the regions where inequalities are strict
from those where they are saturated.
It is almost always impossible to find an explicit solution to a free boundary
problem: we need numerical techniques. The extra complication in those problems
comes from the fact that we do not know where the free boundary is; it is an extra
unknown that we need to find as part of the solution procedure. The most common
method of handling the early exercise condition is simply to advance the discrete
solution over a time step, ignoring the restriction, and then to make a projection on
the set of constraints (see, for example, Clewlow and Strickland [139]). This is very
easy to implement but has the disadvantage that the solution is in an inconsistent
state at the beginning of each time step (see Wilmott, Dewynne, and Howison [140]).
Rigorous methods to deal with free boundaries transform the original problem
into a new one having a fixed domain from which the free boundary can be
found a posteriori. The problem may be formulated in two ways: The first is as a
linear complementarity problem (a strong formulation), usually combined with finite
difference methods; the second is as a variational inequality (a weak formulation),
usually related to finite element methods. The latter has some advantages. First,
variational inequalities are an excellent framework to deal with issues such as
existence and uniqueness of the solution. Second, they are appropriate to analyze
the error incurred in the numerical methods (numerical analysis). Finally, writing a
weak formulation is a necessary step in using finite element methods for numerical
solution.
In this chapter, we present a framework for solving multifactor option-pricing
problems with early-exercise features. We first introduce a duality (or Lagrange
multiplier) method to deal with inequality constraints in the solution or its derivatives.
Then the method of characteristics and finite elements are proposed for time
and space discretization, respectively. The combination of these numerical methods
has been applied to finance by Bermúdez and Nogueiras [107] and Vázquez [141].
There are three main issues when using PDE methods in contingent claim
valuation: (1) how to account for early-exercise features; (2) how to discretize the
model; and (3) how to deal with the convection-dominance problem. The main goal,
of course, is to achieve the best trade-off between speed and accuracy according to
the circumstances of our valuation problem.1
In order to deal with free boundary problems, we first reformulate them in
a mixed form by means of a Lagrange multiplier. Then we propose an iter-
ative algorithm in which the solution of the nonlinear problem, that is, the
problem with constraints, is approximated by a sequence of solutions of linear
(unconstrained) problems. This algorithm is a particular application of the one
introduced by Bermúdez and Moreno [142] and has been used extensively in
other fields. The algorithm provides great generality in the sense that it allows
for any type of constraint to be imposed on the value function or its deriva-
tives, which may depend on the spatial variables and time. It can be applied to
either the weak or the strong problem; hence, it can be combined with either
finite difference or finite element discretization. Moreover, since the solution of
the nonlinear problem is approximated by a sequence of solutions of linear prob-
lems, any existing code for discretizing partial differential equations could be easily
extended.
Sometimes in finance the effect of the volatility term is smaller than the effect
of the drift term, giving rise to the so-called convection-dominated problems. Fur-
thermore, the diffusive term may become degenerate in the following senses: We
may distinguish between weak degeneration, as in Black-Scholes where the dif-
fusion vanishes for zero spot, or strong degeneration of ultraparabolic type, as
in Asian options in which there is no diffusion in one direction. In such situ-
ations second-order centered space-discretization schemes may lead to spurious
oscillations. In the tree framework this is equivalent to saying that the local
drift is so large relative to the diffusion that branching into the usual bino-
mial or trinomial tree will lead to negative probabilities. Hull and White [143]
have solved this with their alternative branching technique. In a PDE approach,
one has to resort to first-order, one-sided space differencing or to the more
recent Eulerian and characteristics techniques, such as the ones described in
Ewing and Wang [144]. The method of characteristics for time discretization is
a possible approach to deal with convection-dominated problems. Its main
advantages are:

■ It is unconditionally stable even when applied to the transport equation (no
diffusion). Hence, it copes well with degenerate diffusions.
■ It yields discrete symmetric linear systems, which are faster to solve numerically
than nonsymmetric ones.

1 Research, VAR, risk management, pricing.

The modified method of characteristics or semi-Lagrangian method was introduced
in the 1980s by Pironneau [145] and Douglas and Russell [146]. It has been
applied to convection-diffusion equations combined with space discretization using
both finite differences (see [146]) and finite elements (see [147], [148], [149]). When
combined with finite elements it is referred to as characteristics finite elements, or the
Lagrange-Galerkin method. The classical method of characteristics is only first order
accurate. However, higher order characteristic finite element methods have been
proposed by Boukir et al. [150] and Rui and Tabata [151]. Bermúdez, Nogueiras,
and Vázquez [152, 197] extended the method in Rui and Tabata and applied it to the
valuation of American-Asian options. We describe the classical Lagrange-Galerkin
method as well as the second-order scheme proposed in [152, 197].
While most papers and books on financial derivatives employ finite differences
(FDs) for the numerical solution (see, for instance, Wilmott, Dewynne, and Howison
[140]), the use of finite elements (FEs) has several advantages (see Ciarlet and Lions
[153]):

■ It can be used with domains of arbitrary geometries and arbitrary boundary
conditions.
■ It allows unstructured meshes, which can be convenient for making refinements
in particular regions of the domain, such as around free boundaries or near
barriers.
■ It has a solid mathematical foundation. There is a well-developed theory about
a priori and a posteriori error estimates, which allows a rigorous analysis and is
a necessary tool for adaptive algorithms.
■ It is well suited for modern computer architectures, particularly parallel
processing.

This chapter is organized as follows: In section 9.2, we introduce the general
modeling framework. In section 9.3, we describe the duality method to reduce
the nonlinear problem to a sequence of linear problems. Section 9.4 describes the
classical first-order Lagrange-Galerkin method, which can be used to solve the
linear problem numerically. Section 9.5 introduces higher-order Lagrange-Galerkin
schemes, and section 9.6 applies the classical Lagrange-Galerkin method to the
valuation of convertible bonds.

9.2 THE MODELING FRAMEWORK: A GENERAL D-FACTOR MODEL

The fair price of many financial derivatives can be obtained by solving final-value
problems for parabolic partial differential equations, possibly involving
inequality constraints. These constraints could affect the option value, as in
American- and Bermudan-style options or in convertible bonds, or may be imposed
on the value of the spatial derivative of the solution, if we intend, for instance, to
price a barrier option subject to a cap on the delta. Those are the so-called free
boundary problems and are an example of nonlinear problems. Recall that free
boundary problems may be mathematically formulated as linear complementarity
problems (strong formulation) or as variational inequalities (weak formulation).
The terminology weak–strong refers to the regularity of the solution. Each solution
of the strong formulation is a solution of the weak problem. Conversely, if a solution
of the weak problem is smooth enough, then it is a solution of the original problem
in the classical sense. Existence and uniqueness of a strong solution require the final
and boundary conditions to be sufficiently smooth (payoff functions are generally
not even differentiable). These constraints can be weakened when we use a weak
formulation of the problem; the difficulties do not disappear, but solutions are
sought in more general functional spaces (weighted Sobolev spaces).
In this section, we introduce a general d-factor pricing framework. First, we
set the final-value linear problem in the absence of constraints. Then we formulate
the nonlinear problem, both in strong and weak form. In order to use an itera-
tive algorithm to deal with the free boundaries, we will need to reformulate the
problem in a mixed form by means of a Lagrange multiplier. We will distinguish
between the primal formulation, which involves only the unknown option price,
and the mixed formulation, which involves the Lagrange multiplier as an extra
unknown.

9.2.1 Strong Formulation of the Linear Problem: Partial Differential Equations

Let the value of a contingent claim be a function $\phi$ of time $t$ and $d$ spatial variables
$x = (x_1, \ldots, x_d)$. Very often, the value of the contingent claim can be found as the
solution of a parabolic partial differential equation of the following form:

$$\frac{\partial \phi}{\partial t}(x, t) + \sum_{i,j=1}^{d} A_{ij}(x, t)\,\frac{\partial^2 \phi}{\partial x_i \partial x_j}(x, t) + \sum_{j=1}^{d} B_j(x, t)\,\frac{\partial \phi}{\partial x_j}(x, t) + A_0(x, t)\,\phi(x, t) = f(x, t) \quad \text{in } \Omega \times (0, T), \tag{9.1}$$

where $\Omega$ is the spatial domain and $A_{ij}$, $B_j$, $A_0$ and $f$ are given measurable functions
of $(x, t)$. Typically, $x_j$ represents quantities such as the value of an underlying asset
or a stochastic interest rate. Therefore, they run either in the interval $[0, \infty)$ or in
the whole real line $\mathbb{R}$. We also have to include the final condition, the payoff of the
contract, which depends on the specific derivative product. In general, we will write

$$\phi(x, T) = \Phi(x) \tag{9.2}$$
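In order to obtain a weak formulation, we need to rewrite the equation in divergence
form

$$\frac{\partial \phi}{\partial t}(x, t) + \sum_{i,j=1}^{d} \frac{\partial}{\partial x_i}\left(a_{ij}(x, t)\,\frac{\partial \phi}{\partial x_j}(x, t)\right) + \sum_{j=1}^{d} v_j(x, t)\,\frac{\partial \phi}{\partial x_j}(x, t) + a_0(x, t)\,\phi(x, t) = f(x, t) \quad \text{in } \Omega \times (0, T), \tag{9.3}$$

where the new coefficients $a_{ij}$, $v_i$, $a_0$ are given by

$$a_{ii} = A_{ii}, \qquad a_{ij} = a_{ji} = \frac{1}{2}\left(A_{ij} + A_{ji}\right), \tag{9.4}$$

$$v_i = B_i - \sum_{j=1}^{d} \frac{\partial a_{ij}}{\partial x_j} = B_i - \frac{\partial A_{ii}}{\partial x_i} - \frac{1}{2}\sum_{j \neq i} \frac{\partial \left(A_{ij} + A_{ji}\right)}{\partial x_j}, \tag{9.5}$$

$$a_0 = A_0 \tag{9.6}$$

Notice that we have imposed symmetry on the matrix $A = (a_{ij})$. Equation (9.3) is
simply a $d$-dimensional linear convection-diffusion-reaction equation, with diffusion
matrix $A = (a_{ij})$, velocity vector $v = (v_1, v_2, \ldots, v_d)$ (convection), and reaction
coefficient $a_0$.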
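As an illustration of the change of coefficients (9.4)–(9.6), the following sketch evaluates the symmetrized diffusion matrix and the velocity at a point, taking the derivatives of the diffusion coefficients by central differences. The function name `divergence_form`, its signature and the step size are assumptions made for this example only. For the one-factor Black-Scholes operator with $A = \tfrac{1}{2}\sigma^2 S^2$, $B = (r - q)S$ and $A_0 = -r$, it returns the velocity $(r - q - \sigma^2)S$.

```python
def divergence_form(A, B, A0, x, t, eps=1e-4):
    """Return (a, v, a0) of (9.4)-(9.6) at the point (x, t).

    A(x, t) returns a d-by-d nested list, B(x, t) a list of length d and
    A0(x, t) a float. Derivatives of a_ij are taken by central differences.
    """
    d = len(x)

    def a_sym(y):
        m = A(y, t)
        return [[0.5 * (m[i][j] + m[j][i]) for j in range(d)] for i in range(d)]

    a = a_sym(x)
    v = list(B(x, t))
    for j in range(d):
        up, dn = list(x), list(x)
        up[j] += eps
        dn[j] -= eps
        a_up, a_dn = a_sym(up), a_sym(dn)
        for i in range(d):
            # v_i = B_i - sum_j d(a_ij)/d(x_j)
            v[i] -= (a_up[i][j] - a_dn[i][j]) / (2.0 * eps)
    return a, v, A0(x, t)


# One-factor Black-Scholes example (parameter values are illustrative only).
sigma, r, q = 0.2, 0.05, 0.02
a, v, a0 = divergence_form(
    lambda x, t: [[0.5 * sigma ** 2 * x[0] ** 2]],
    lambda x, t: [(r - q) * x[0]],
    lambda x, t: -r,
    [100.0], 0.0)
```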
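It will be useful, for the following sections, to formulate the model using the
material or total derivative of $\phi$ with respect to time $t$ and the velocity field $v$,
namely,

$$\frac{d\phi}{dt}(x, t) = \frac{\partial \phi}{\partial t}(x, t) + \sum_{j=1}^{d} v_j(x, t)\,\frac{\partial \phi}{\partial x_j}(x, t) \tag{9.7}$$

With this notation, equation (9.3) becomes

$$\frac{d\phi}{dt}(x, t) + \sum_{i,j=1}^{d} \frac{\partial}{\partial x_i}\left(a_{ij}(x, t)\,\frac{\partial \phi}{\partial x_j}(x, t)\right) + a_0(x, t)\,\phi(x, t) = f(x, t) \tag{9.8}$$

Denoting the differential operator by

$$\mathcal{L}[\phi(x, t)] = \frac{d\phi}{dt}(x, t) + \sum_{i,j=1}^{d} \frac{\partial}{\partial x_i}\left(a_{ij}(x, t)\,\frac{\partial \phi}{\partial x_j}(x, t)\right) + a_0(x, t)\,\phi(x, t), \tag{9.9}$$

the pricing problem (9.1)–(9.2) may be written as

Problem 1 Linear Problem (Strong Formulation)
Find $\phi : \Omega \times [0, T] \to \mathbb{R}$ such that

$$\mathcal{L}[\phi(x, t)] = f(x, t) \quad \text{in } \Omega \times (0, T) \tag{9.10a}$$

$$\phi(x, T) = \Phi(x) \quad \text{in } \Omega \tag{9.10b}$$
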
9.2.2 Truncation of the Domain and Boundary Conditions

Very often, the pricing problem is a pure Cauchy problem; that is, only an initial
condition is needed to guarantee existence and uniqueness of solution. However,
numerical discretization, by using either finite difference, finite element, or finite
volume methods, makes it necessary to cut the domain at finite distance and to
introduce "artificial" boundary conditions. Those are generally obtained by financial
arguments, but also by pure mathematical reasoning, and have to be included in
the weak formulation. This process, called localization, often arises in numerical
finance, and introduces a model error that has been studied, for instance, by Kangro
and Nicolaides [154] and by Barles et al. [162].
Let us still call $\Omega$ the bounded domain, and $\Gamma$ its boundary. We denote by $\Gamma_D$
(respectively $\Gamma_R$) the subset of $\Gamma$ where Dirichlet (respectively Robin) boundary
conditions are imposed. Specifically,

$$\frac{\partial \phi}{\partial n_A}(x, t) + \alpha(x, t)\,\phi(x, t) = g(x, t) \quad \text{on } \Gamma_R, \tag{9.11}$$

$$\phi(x, t) = l(x, t) \quad \text{on } \Gamma_D, \tag{9.12}$$

where

$$\frac{\partial \phi}{\partial n_A}(x, t) = \sum_{i,j=1}^{d} a_{ij}(x, t)\,\frac{\partial \phi}{\partial x_j}(x, t)\,n_i(x) = A(x, t)\,\nabla\phi(x, t) \cdot n(x), \tag{9.13}$$

and $n = (n_1, \ldots, n_d)$ denotes a unit outward normal vector to $\Gamma$. In equations (9.11)
and (9.12), functions $\alpha$, $g$, and $l$ are data.
In general, $\Gamma_0 = \Gamma \setminus (\Gamma_D \cup \Gamma_R)$ is a nonempty set in which no boundary conditions
are needed because the natural condition for the weak formulation is identically
satisfied. Specifically, $\Gamma_0$ is the set where $A(x, t)\,n(x) = 0$ for all $t$ and, therefore,
$\partial\phi/\partial n_A(x, t) = 0$ for any function $\phi$.
The discussion on boundary conditions, sometimes ignored in financial literature,
is often a complicated task and depends on the particular financial product.

Remark 9.2.1 In a $d$-factor model the computational or "localized" domain is
frequently a rectangle $\Omega = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_d, b_d]$. In such a case, the boundary
may be decomposed as:

$$\Gamma = \bigcup_{i=1}^{d} \left(\Gamma_i^+ \cup \Gamma_i^-\right), \tag{9.14}$$

where $\Gamma_i^+$ (respectively $\Gamma_i^-$) is the part of the boundary characterized by the unit
outward normal vector $(0, \ldots, 1, \ldots, 0)$ (respectively $(0, \ldots, -1, \ldots, 0)$), with the
nonzero entry in position $i$.

9.2.3 Strong Formulation of the Nonlinear Problem: Partial Differential Inequalities

Early-exercise features, in American options or convertible bonds, for instance, may
be included in the model by means of unilateral constraints applied to $\phi$. Hence,
partial differential inequalities, rather than partial differential equations, have to be
considered.
If $\phi(x, t)$ is the value of an American- or Bermudan-style option, extra constraints
need to be added to Problem 1 in order to avoid arbitrage opportunities. Specifically,
if the holder of the option has the right to exercise at time $t$ and to receive an exercise
value $R_1(x, t)$, then we need

$$\phi(x, t) \geq R_1(x, t) \tag{9.15}$$

In this case, $\phi(x, t)$ satisfies equation (9.10a) only if

$$\phi(x, t) > R_1(x, t)$$

If

$$\phi(x, t) = R_1(x, t),$$

then

$$\mathcal{L}[\phi(x, t)] - f(x, t) \leq 0 \tag{9.16}$$

Similarly, if the option may be exercised by the issuer2 at time $t$ for an exercise value
$R_2(x, t)$, we need

$$\phi(x, t) \leq R_2(x, t) \tag{9.17}$$

In that case $\phi(x, t)$ solves equation (9.10a) only if

$$\phi(x, t) < R_2(x, t)$$

If

$$\phi(x, t) = R_2(x, t),$$

then

$$\mathcal{L}[\phi(x, t)] - f(x, t) \geq 0 \tag{9.18}$$

We refer to (9.16) and (9.18) as partial differential inequalities (PDIs), and we
call (9.15) and (9.17) unilateral conditions, restrictions or constraints.
The general nonlinear problem will be:

2 For example the convertible bond call provision or any other issuer callable contract.

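Problem 2 Nonlinear Problem (Strong Primal Formulation)
Find $\phi : \Omega \times [0, T] \to \mathbb{R}$, satisfying boundary conditions (9.11) and (9.12), such
that

$$\begin{aligned}
\mathcal{L}[\phi(x, t)] - f(x, t) &= 0 && \text{if } R_1(x, t) < \phi(x, t) < R_2(x, t) && \text{in } \Omega \times (0, T) \\
\mathcal{L}[\phi(x, t)] - f(x, t) &\geq 0 && \text{if } \phi(x, t) = R_2(x, t) && \text{in } \Omega \times (0, T) \\
\mathcal{L}[\phi(x, t)] - f(x, t) &\leq 0 && \text{if } \phi(x, t) = R_1(x, t) && \text{in } \Omega \times (0, T) \\
\phi(x, T) &= \Phi(x) && && \text{in } \Omega
\end{aligned} \tag{9.19}$$

The above is a so-called primal formulation, because it involves only the
unknown $\phi$. It is also called a linear complementarity problem. Problem (9.19) can
be rewritten by introducing a Lagrange multiplier, leading to a so-called mixed
formulation. Specifically, (9.19) is equivalent to:

Problem 3 Nonlinear Problem (Strong Mixed Formulation)
Find functions $\phi, p : \Omega \times [0, T] \to \mathbb{R}$ satisfying boundary conditions (9.11) and
(9.12) such that

$$\mathcal{L}[\phi(x, t)] = f(x, t) + p(x, t) \quad \text{in } \Omega \times (0, T) \tag{9.20a}$$

$$\phi(x, T) = \Phi(x) \quad \text{in } \Omega \tag{9.20b}$$

and furthermore

$$R_1(x, t) \leq \phi(x, t) \leq R_2(x, t), \tag{9.21}$$

with

$$R_1(x, t) < \phi(x, t) < R_2(x, t) \;\Rightarrow\; p(x, t) = 0 \quad \text{in } \Omega \times (0, T) \tag{9.22}$$

$$\phi(x, t) = R_1(x, t) \;\Rightarrow\; p(x, t) \leq 0 \quad \text{in } \Omega \times (0, T) \tag{9.23}$$

$$\phi(x, t) = R_2(x, t) \;\Rightarrow\; p(x, t) \geq 0 \quad \text{in } \Omega \times (0, T) \tag{9.24}$$

Function $p$ is a Lagrange multiplier, which adds or subtracts value in order to
ensure that constraints in the solution are being met. Certainly, in the region where
$p = 0$, the equality in (9.10a) holds. The surfaces separating the regions where
$p < 0$, $p = 0$ and $p > 0$ are the so-called free boundaries.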
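
Let us introduce the following family (indexed by $(x, t)$) of set- (or multi-) valued
graphs defined by

$$G(x, t)(Y) = \begin{cases}
\emptyset & \text{if } Y < R_1(x, t) \\
(-\infty, 0] & \text{if } Y = R_1(x, t) \\
0 & \text{if } R_1(x, t) < Y < R_2(x, t) \\
[0, \infty) & \text{if } Y = R_2(x, t) \\
\emptyset & \text{if } Y > R_2(x, t)
\end{cases} \tag{9.25}$$

It is straightforward to show that inequalities (9.21)–(9.24) are equivalent to
the relation

$$p(x, t) \in G(x, t)(\phi(x, t)) \tag{9.26}$$

Hence, Problem 3 may be rewritten as

Problem 4 Nonlinear Problem (Strong Mixed Formulation)
Find $\phi(t) \in \mathcal{V}$ satisfying boundary conditions (9.11) and (9.12) and $p(t) \in \mathcal{Q}$
such that

$$\begin{aligned}
\mathcal{L}[\phi(x, t)] &= f(x, t) + p(x, t) && \text{in } \Omega \times (0, T) \\
\phi(x, T) &= \Phi(x) && \text{in } \Omega \\
p(x, t) &\in G(x, t)(\phi(x, t)) && \text{in } \Omega \times (0, T)
\end{aligned}$$

where $\mathcal{V}$ and $\mathcal{Q}$ are suitable $x$-dependent function spaces for $\phi$ and $p$, respectively.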

9.2.4 Weak Formulation of the Nonlinear Problem: Variational Inequalities

In order to discretize in space using the finite element method we have to rewrite the
problem in a variational (or weak) form. We recall that variational inequalities are
not only the starting point for the finite element discretization, but also constitute
a powerful tool to deal with theoretical issues, such as existence and uniqueness of
the solution as well as numerical analysis.
In order to write a weak formulation of the valuation problem we multiply
equation (9.20a) by a test function $\psi$ defined in $\Omega$. Then we integrate in $\Omega$ to get

$$\int_\Omega \frac{d\phi}{dt}(x, t)\,\psi(x)\,dx + \sum_{i,j=1}^{d} \int_\Omega \frac{\partial}{\partial x_i}\left(a_{ij}(x, t)\,\frac{\partial \phi}{\partial x_j}(x, t)\right)\psi(x)\,dx + \int_\Omega a_0(x, t)\,\phi(x, t)\,\psi(x)\,dx = \int_\Omega f(x, t)\,\psi(x)\,dx + \int_\Omega p(x, t)\,\psi(x)\,dx \tag{9.28}$$
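
We use Green's formula to transform the second term on the left-hand side:

$$\sum_{i,j=1}^{d} \int_\Omega \frac{\partial}{\partial x_i}\left(a_{ij}(x, t)\,\frac{\partial \phi}{\partial x_j}(x, t)\right)\psi(x)\,dx = -\sum_{i,j=1}^{d} \int_\Omega a_{ij}(x, t)\,\frac{\partial \phi}{\partial x_j}(x, t)\,\frac{\partial \psi}{\partial x_i}(x)\,dx + \sum_{i,j=1}^{d} \int_\Gamma a_{ij}(x, t)\,\frac{\partial \phi}{\partial x_j}(x, t)\,\psi(x)\,n_i(x)\,d\Gamma \tag{9.29}$$

The boundary term in this equation is decomposed as (from (9.13)):

$$\sum_{i,j=1}^{d} \int_\Gamma a_{ij}(x, t)\,\frac{\partial \phi}{\partial x_j}(x, t)\,\psi(x)\,n_i(x)\,d\Gamma = \int_\Gamma \frac{\partial \phi}{\partial n_A}(x, t)\,\psi(x)\,d\Gamma = \int_{\Gamma_D} \frac{\partial \phi}{\partial n_A}(x, t)\,\psi(x)\,d\Gamma + \int_{\Gamma_R} \frac{\partial \phi}{\partial n_A}(x, t)\,\psi(x)\,d\Gamma$$

If we restrict the test functions to those vanishing on $\Gamma_D$ and replace $\partial\phi/\partial n_A$ on $\Gamma_R$ using
boundary condition (9.11), we get

$$\int_\Gamma \frac{\partial \phi}{\partial n_A}(x, t)\,\psi(x)\,d\Gamma = \int_{\Gamma_R} \bigl(g(x, t) - \alpha(x, t)\,\phi(x, t)\bigr)\,\psi(x)\,d\Gamma \tag{9.30}$$

Substitution into (9.29), and then of (9.29) into (9.28), yields

$$\int_\Omega \frac{d\phi}{dt}(x, t)\,\psi(x)\,dx - \sum_{i,j=1}^{d} \int_\Omega a_{ij}(x, t)\,\frac{\partial \phi}{\partial x_j}(x, t)\,\frac{\partial \psi}{\partial x_i}(x)\,dx + \int_\Omega a_0(x, t)\,\phi(x, t)\,\psi(x)\,dx - \int_{\Gamma_R} \alpha(x, t)\,\phi(x, t)\,\psi(x)\,d\Gamma = \int_\Omega p(x, t)\,\psi(x)\,dx + \int_\Omega f(x, t)\,\psi(x)\,dx - \int_{\Gamma_R} g(x, t)\,\psi(x)\,d\Gamma \tag{9.31}$$

This is a so-called weak mixed formulation since both the primitive unknown $\phi$ and
the Lagrange multiplier $p$ are involved. In what follows, we will write another weak
formulation that includes only the unknown $\phi$.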
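
Let us introduce the family of convex sets of functions defined, for each $t$ in
$[0, T]$, by

$$\mathcal{K}(t) = \left\{\psi \in \mathcal{V} : R_1(x, t) \leq \psi(x) \leq R_2(x, t), \text{ a.e. in } \Omega\right\} \tag{9.32}$$

where $\mathcal{V}$ is a suitable space for $\phi$.
Using (9.22), (9.23) and (9.24) it is straightforward to show that, for any $\psi \in \mathcal{K}(t)$,
we have3

$$\int_\Omega p(x, t)\,\bigl(\psi(x) - \phi(x, t)\bigr)\,dx \leq 0 \tag{9.33}$$

For each time $t$ we replace the arbitrary test function $\psi$ in (9.31) by $\psi - \phi$, where
$\psi$ is in $\mathcal{K}(t)$ such that $\psi(x) = l(x, t)$ on $\Gamma_D$. We get

$$\int_\Omega \frac{d\phi}{dt}\,(\psi - \phi)\,dx - \sum_{i,j=1}^{d} \int_\Omega a_{ij}(x, t)\,\frac{\partial \phi}{\partial x_j}\,\frac{\partial (\psi - \phi)}{\partial x_i}\,dx + \int_\Omega a_0\,\phi\,(\psi - \phi)\,dx - \int_{\Gamma_R} \alpha\,\phi\,(\psi - \phi)\,d\Gamma = \int_\Omega p(x, t)\,(\psi - \phi)\,dx + \int_\Omega f\,(\psi - \phi)\,dx - \int_{\Gamma_R} g\,(\psi - \phi)\,d\Gamma \tag{9.34}$$

Finally, we use (9.33) in this equality and obtain the following variational inequality
of the first kind:

$$\int_\Omega \frac{d\phi}{dt}\,(\psi - \phi)\,dx - \sum_{i,j=1}^{d} \int_\Omega a_{ij}(x, t)\,\frac{\partial \phi}{\partial x_j}\,\frac{\partial (\psi - \phi)}{\partial x_i}\,dx + \int_\Omega a_0\,\phi\,(\psi - \phi)\,dx - \int_{\Gamma_R} \alpha\,\phi\,(\psi - \phi)\,d\Gamma \leq \int_\Omega f\,(\psi - \phi)\,dx - \int_{\Gamma_R} g\,(\psi - \phi)\,d\Gamma \tag{9.35}$$

This is a weak primal formulation in the sense that now $\phi$ is the only unknown.

3 In fact, the pointwise inequality holds

$$p(x, t)\,\bigl(\psi(x) - \phi(x, t)\bigr) \leq 0$$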

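In order to write the problem in a more compact way, we introduce the following
notations: Let $a_t(\cdot, \cdot)$ be the family of bilinear symmetric forms

$$a_t(\phi, \psi) = \sum_{i,j=1}^{d} \int_\Omega a_{ij}(x, t)\,\frac{\partial \phi}{\partial x_j}(x, t)\,\frac{\partial \psi}{\partial x_i}(x)\,dx - \int_\Omega a_0(x, t)\,\phi(x, t)\,\psi(x)\,dx + \int_{\Gamma_R} \alpha(x, t)\,\phi(x, t)\,\psi(x)\,d\Gamma, \tag{9.36}$$

and $L_t(\cdot)$ be the family of linear forms:

$$L_t(\psi) = \int_\Omega f(x, t)\,\psi(x)\,dx - \int_{\Gamma_R} g(x, t)\,\psi(x)\,d\Gamma \tag{9.37}$$

Then the problem can be written in the two equivalent forms:

Problem 5 Nonlinear Problem (Weak Primal Formulation)
Find $\phi(t) \in \mathcal{K}(t)$ satisfying Dirichlet condition (9.12) and final condition (9.2)
such that

$$\int_\Omega \frac{d\phi}{dt}(t)\,\bigl(\psi - \phi(t)\bigr)\,dx - a_t\bigl(\phi(t), \psi - \phi(t)\bigr) \leq L_t\bigl(\psi - \phi(t)\bigr) \quad \forall \psi \in \mathcal{K}(t),\ \psi(x) = l(x, t) \text{ on } \Gamma_D \tag{9.38}$$

Problem 6 Nonlinear Problem (Weak Mixed Formulation)
Find $\phi(t) \in \mathcal{V}$ and $p(t) \in \mathcal{Q}$ satisfying unilateral conditions (9.21)–(9.24),
Dirichlet boundary condition (9.12) and final condition (9.2) such that

$$\int_\Omega \frac{d\phi}{dt}(t)\,\psi\,dx - a_t\bigl(\phi(t), \psi\bigr) = L_t(\psi) + \int_\Omega p(t)\,\psi\,dx \quad \forall \psi \in \mathcal{V}_0, \tag{9.39}$$

where

$$\mathcal{V}_0 = \left\{\psi \in \mathcal{V} : \psi|_{\Gamma_D} = 0\right\}, \tag{9.40}$$

and $\mathcal{V}$ and $\mathcal{Q}$ are suitable spaces for $\phi$ and $p$, respectively.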

9.3 NUMERICAL SOLUTION OF PARTIAL DIFFERENTIAL INEQUALITIES (VARIATIONAL INEQUALITIES)

As mentioned before, the most common method of handling the early exercise
condition is simply to advance the discrete solution over a time step ignoring the
restriction and then to make a projection on the set of constraints. This is easy
to implement, but a discrete form of the linear complementarity problem or the
variational inequality is not satisfied.
In the case of a single-factor American put, the algebraic linear complementarity
problems are commonly solved using a projected iteration method, projected
successive over-relaxation (PSOR) (see Wilmott [163], Vázquez [141]).
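For orientation, a projected SOR sweep can be sketched as below. This is a generic textbook form, not code from the references: it assumes a tridiagonal, symmetric positive definite matrix $M$ (of the kind one implicit time step of a one-factor model produces) and a lower obstacle $g$ only, and solves $M\phi - b \geq 0$, $\phi \geq g$, $(M\phi - b)(\phi - g) = 0$ componentwise. The function name, the relaxation parameter and the toy data are illustrative assumptions.

```python
def psor(lower, diag, upper, b, g, omega=1.2, sweeps=500):
    """Projected SOR for the tridiagonal LCP  M*phi - b >= 0,  phi >= g,
    (M*phi - b) * (phi - g) = 0, where M has the given three diagonals."""
    n = len(b)
    phi = list(g)  # start on the obstacle
    for _ in range(sweeps):
        for i in range(n):
            s = b[i]
            if i > 0:
                s -= lower[i] * phi[i - 1]
            if i < n - 1:
                s -= upper[i] * phi[i + 1]
            gauss_seidel = s / diag[i]
            # Over-relax, then project back onto the constraint phi >= g.
            phi[i] = max(g[i], phi[i] + omega * (gauss_seidel - phi[i]))
    return phi


# Toy data (illustrative): M = tridiag(-1, 2, -1)/h^2 + c*I on 9 interior nodes.
n, h, c = 9, 0.1, 10.0
off = -1.0 / h ** 2
phi = psor([off] * n, [2.0 / h ** 2 + c] * n, [off] * n, [-2.0] * n, [-0.05] * n)
```

With these data the obstacle is active in the middle of the grid, so the returned vector sits on $g$ there and satisfies the linear equation elsewhere.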
Clarke and Parrott [164] suggested a multigrid method to accelerate convergence
of the basic relaxation method. They showed that the algorithm, when applied to
the valuation of American options with stochastic volatility, gives optimal numerical
complexity and the performance is much better than for the PSOR.
On the other hand, Forsyth and Vetzal [165] proposed an implicit penalty
method for valuing American options and showed that when a variable time step
is used, quadratic convergence is achieved. They derived sufficient conditions to
guarantee monotonic convergence of the nonlinear penalty iteration and also to
ensure that the solution of the penalty problem is an approximate solution to the
discrete linear complementarity problem. They compared the efficiency and the
accuracy of the method with the commonly used technique of handling the American
constraint explicitly in the tree methodologies. Convergence rates as the time step
and the mesh size tend to zero for the standard CRR tree are compared with
convergence rates for an implicit finite volume method with Crank-Nicolson time
stepping and the penalty method for handling the American constraint. They
found that the PDE method is asymptotically superior to the binomial lattice
method.
Barone-Adesi et al. [97], Bermúdez and Nogueiras [107], and Bermúdez et al.
[152, 197] used a Lagrange multiplier (or duality) method to solve variational
inequalities arising in the valuation of convertible bonds and Amerasian options.
This method was introduced in [142] for solving elliptic variational inequalities
of the second kind (see also Parés et al. [166] for further analysis). It has not
been applied much in finance but has been used extensively in other fields such as
computational mechanics. As mentioned in Section 9.1, the algorithm can be used
for any type of constraint to be imposed on the value function or its derivatives,
which may depend on the spatial variables and on time. It is based on the mixed
formulation and could be applied to either the weak or the strong problem; hence it
could be combined with either finite difference or finite element discretizations.
In this section, we describe this general methodology to solve the nonlinear
problems introduced in the previous section. The solution of the nonlinear problem
is approximated by a sequence of solutions of linear problems. In the next section,
we describe the numerical solution of the linear problem using a discretization in
time with characteristics and a discretization in space with finite elements. Finally,
the Lagrange multiplier method is applied to the fully discretized problem.

9.3.1 A Duality (or Lagrange Multiplier) Method

Recall that inequalities (9.21)–(9.24) establish a relationship between $p$ and $\phi$,
which can be written in a more compact way as

$$p(x, t) \in G(x, t)(\phi(x, t)), \tag{9.41}$$

where $G(x, t)$ is the family (indexed by $(x, t)$) of set (or multi)-valued graphs
introduced in (9.25).
Since $G(x, t)$ is a multivalued function, equation (9.41) is not easy to implement.
However, we have the following result (see Bermúdez and Moreno [142]):

Lemma 9.3.1 The following two statements are equivalent:

$$U \in G(x, t)(Y), \tag{9.42}$$

$$U = G_\lambda(x, t)(Y + \lambda U) \quad \forall \lambda > 0, \tag{9.43}$$

where $G_\lambda(x, t)$ is the Yosida approximation of $G(x, t)$ (see Figure 9.1) defined by

$$G_\lambda(x, t)(Y) = \begin{cases}
\dfrac{1}{\lambda}\bigl(Y - R_1(x, t)\bigr) & \text{if } Y < R_1(x, t) \\
0 & \text{if } R_1(x, t) \leq Y \leq R_2(x, t) \\
\dfrac{1}{\lambda}\bigl(Y - R_2(x, t)\bigr) & \text{if } Y > R_2(x, t)
\end{cases}$$

[Figure 9.1 plot: $G_\lambda(x, t)$ against $Y$, with $R_1(x, t)$ and $R_2(x, t)$ marked on the $Y$ axis.]

FIGURE 9.1 Yosida approximation.

We notice that, unlike $G$, $G_\lambda$ is a Lipschitz-continuous (univalued) function.
In view of this lemma and the previous discussion, relations (9.21)–(9.24) are
equivalent to the following equality:

$$p(x, t) = G_\lambda(x, t)\bigl(\phi(x, t) + \lambda\,p(x, t)\bigr), \tag{9.44}$$

where $\lambda$ is a positive real number.
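The lemma is easy to check numerically at a single point $(x, t)$. The sketch below treats $R_1 < R_2$ as plain numbers; the function names are hypothetical and chosen for this example only. `in_graph` tests membership in the multivalued graph (9.25), while `is_fixed_point` tests the single-valued relation (9.43).

```python
def yosida(y, r1, r2, lam):
    """Yosida approximation G_lambda of the graph G for the interval [r1, r2]."""
    if y < r1:
        return (y - r1) / lam
    if y > r2:
        return (y - r2) / lam
    return 0.0


def in_graph(u, y, r1, r2, tol=1e-12):
    """True if u belongs to G(y) as defined in (9.25), assuming r1 < r2."""
    if y < r1 or y > r2:
        return False          # G(y) is empty outside [r1, r2]
    if y == r1:
        return u <= tol       # (-infinity, 0]
    if y == r2:
        return u >= -tol      # [0, infinity)
    return abs(u) <= tol      # {0} strictly between the obstacles


def is_fixed_point(u, y, r1, r2, lam, tol=1e-12):
    """True if u = G_lambda(y + lam * u), i.e. relation (9.43)."""
    return abs(u - yosida(y + lam * u, r1, r2, lam)) <= tol
```

For example, with $R_1 = 1$ and $R_2 = 2$, the pair $(U, Y) = (-3, 1)$ satisfies both relations, whereas $(U, Y) = (1, 1)$ satisfies neither.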


We are now in a position to introduce the following iterative algorithm:

(a) At the beginning, the function $p_0$ is given arbitrarily.
(b) At iteration $m$, an approximation of the Lagrange multiplier $p_m$ is known
and we proceed as follows:
First, we work out a new approximation of $\phi(t)$, $\phi_{m+1}$, by solving the linear
problem in either weak or strong form

Problem 7 Linearized Continuous Problem. Given $p_m$, find $\phi_{m+1}$
such that

■ Weak form

$$\int_\Omega \frac{d\phi_{m+1}}{dt}\,\psi\,dx - a_t(\phi_{m+1}, \psi) = L_t(\psi) + \int_\Omega p_m\,\psi\,dx \quad \forall \psi \in \mathcal{V}_0, \tag{9.45}$$

or
■ Strong form

$$\mathcal{L}[\phi_{m+1}] = f + p_m \tag{9.46}$$

together with boundary conditions (9.11), (9.12) and final condition

$$\phi_{m+1}(x, T) = \Phi(x) \tag{9.47}$$

Then, we update the Lagrange multiplier $p$ by using equation (9.44).
Precisely, $p_{m+1}$ is defined as

$$p_{m+1}(x, t) = G_\lambda(x, t)\bigl(\phi_{m+1}(x, t) + \lambda\,p_m(x, t)\bigr), \tag{9.48}$$

where, in order to achieve convergence, $\lambda$ has to be greater than some positive
value which depends on coefficients $a_{ij}$, $a_0$, and $v_i$ (see Bermúdez and Moreno
[142] for details).
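The two steps of the algorithm can be sketched on a toy discrete problem. Everything specific here is an assumption made for illustration: the operator is the stationary finite-difference analogue $\phi'' - c\,\phi$ on a uniform grid with zero Dirichlet values, there is only a lower obstacle $R_1$ (so $R_2 = +\infty$), and the helper names are hypothetical. Each iteration solves the linear system corresponding to (9.46) with the multiplier frozen, then applies the update (9.48). For this symmetric toy problem the multiplier iteration is a contraction when $\lambda$ exceeds $1/(2\mu_{\min})$, with $\mu_{\min}$ the smallest eigenvalue of the discrete operator, in line with the lower bound on $\lambda$ mentioned above.

```python
import math


def yosida(y, r1, r2, lam):
    """Yosida approximation G_lambda for the interval [r1, r2]."""
    if y < r1:
        return (y - r1) / lam
    if y > r2:
        return (y - r2) / lam
    return 0.0


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are ignored."""
    n = len(rhs)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def duality_iteration(n=9, h=0.1, c=10.0, f=2.0, r1=-0.05, r2=math.inf,
                      lam=0.05, iterations=500):
    """Iterate: solve L_h[phi] = f + p_m, then set p_{m+1} = G_lambda(phi + lam * p_m).

    L_h[phi]_i = (phi_{i-1} - 2 phi_i + phi_{i+1}) / h^2 - c * phi_i, so the
    linear step solves M phi = -(f + p_m) with M = -L_h (tridiagonal, SPD).
    """
    off = -1.0 / h ** 2
    lower, upper = [off] * n, [off] * n
    diag = [2.0 / h ** 2 + c] * n
    p = [0.0] * n                      # step (a): arbitrary initial multiplier
    phi = [0.0] * n
    for _ in range(iterations):        # step (b)
        rhs = [-(f + p_i) for p_i in p]
        phi = solve_tridiagonal(lower, diag, upper, rhs)
        p = [yosida(phi_i + lam * p_i, r1, r2, lam) for phi_i, p_i in zip(phi, p)]
    return phi, p
```

At convergence the multiplier is nonpositive where the solution touches the lower obstacle and zero elsewhere, which is the discrete counterpart of (9.22)–(9.23).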

9.4 NUMERICAL SOLUTION OF PARTIAL DIFFERENTIAL EQUATIONS (VARIATIONAL EQUALITIES): CLASSICAL LAGRANGE-GALERKIN METHOD

In the previous section we introduced an iterative algorithm to approximate the
solution of the nonlinear problem by a sequence of solutions of linear problems.
Note that equations (9.45) and (9.46) are linear because, although they include the
Lagrange multiplier, it is known from the previous iteration. What remains is to solve
the linear problem by discretizing in time and space. We approximate equation (9.45)
with a semi-discretization in time using the method of characteristics and a spatial
discretization using finite elements. Recall that spatial discretizations using finite
differences start with the strong formulation (9.46), whereas spatial discretizations
using finite elements are applied to the weak formulation (9.45). Time discretization
using characteristics could be combined with both finite differences and finite
elements.

9.4.1 Semi-Lagrangian Time Discretization: Method of Characteristics


The pricing equation is simply the convection-diffusion equation together with a
reaction term producing an exponentially decay due to the discounting. Accurate
modeling of the interaction between convective and diffusive processes is one of
the most challenging tasks in the numerical approximation of partial differential
equations; the choice of the numerical method depends on whether the problem is
diffusion dominated or convection dominated. Sometimes, in finance, the diffusion
Numerical Solution of Multifactor Pricing Problems 263

is quite small relative to the convection, leading to a so-called convection-dominated


problem. The numerical solution of convection-dominated problems is more complex
than the solution of fully elliptic or parabolic equations. Problems arise due to the lack
of natural dissipation embedded in parabolic partial differential equations, which
helps make numerical schemes stable. Moreover, the solution of linear hyperbolic
problems will only be as smooth as the initial solution; hence, the regularizing effect
of the fully parabolic equation could be lost. In all such circumstances, standard
finite element and finite difference approximations may present difficulties. A large
literature has been built up on a variety of techniques for analyzing and overcoming
those difficulties; books like Morton [167] are entirely devoted to the subject. A
summary of numerical methods for time-dependent convection-dominated PDEs can
be found in Ewing and Wang [144]. They provide a historical review of classical
numerical methods and a survey of recent developments in Eulerian and
characteristic (Lagrangian) methods. Eulerian methods use a standard temporal
discretization, while the main distinguishing feature of characteristic methods is the
use of characteristics to carry out the discretization in time.
The method of characteristics (or Lagrangian method) for time discretization is
a possible approach for dealing with convection-dominated problems. It is part of
the more general family of upwinding methods, which take into account the local
flow direction. This approach is based on the discretization of the total (or material)
derivative, introduced in (9.7), which is the time derivative along the characteristic
lines. In other words, it is the derivative in time for a particle moving with velocity
v (see (9.5)). In a Lagrangian coordinate system, one would only see the effect of
diffusion, reaction, and the right-hand-side terms but not the effect of convection.
Often, the solutions of the convection-diffusion PDEs change less rapidly along the
characteristics than they do in the time direction. This explains why characteristic
methods usually allow large time steps while still maintaining stability and accuracy.
When combined with finite elements, the method of characteristics is referred
to as the characteristics finite element (or the Lagrange-Galerkin) method. Its main
advantages are that it is unconditionally stable and that it yields discrete symmetrical
linear systems.
The classical semi-Lagrangian or characteristics method is first-order accurate
in time. Applications in finance have been developed by Vázquez [141], to solve
the one-factor model arising in the valuation of American options; by Pironneau and
Hecht [168], to solve the two-factor model arising in the valuation of an American
put on the maximum of two assets; and by Barone-Adesi et al. [97], Bermúdez and
Nogueiras [107], and Bermúdez et al. [152, 197], in the valuation of convertible
bonds and American-Asian options.

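Characteristic Curves For given $(x, t) \in \bar{\Omega} \times [0, T]$, where $\Omega \subset \mathbb{R}^d$, the
characteristic line through $(x, t)$ associated with vector field $v$ is the vector function
$X_e(x, t; \cdot)$ solving the initial value problem

$$\frac{\partial X_e}{\partial \tau}(x, t; \tau) = v\big(X_e(x, t; \tau), \tau\big), \qquad X_e(x, t; t) = x. \tag{9.49}$$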

It represents the trajectory described by the material point that occupies position x
at time t and is driven by the velocity field v (see Figure 9.2). Under some regularity
assumptions on $v$, the characteristic line solving problem (9.49) is well defined in
$[0, T]$ and is unique for each initial condition $(x, t)$.

[Figure: the characteristic line $X_e(x, t; \tau)$, joining the point $x$ at time $t$ to its position at time $\tau$.]
FIGURE 9.2 Characteristic line through $(x, t)$ associated with vector field $v$.

From the definition of the characteristic curves and by using the chain rule, it
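follows that the material or total derivative, as defined in (9.7), satisfies

$$\frac{D\phi}{Dt}\big(X_e(x,t;\tau),\tau\big) = \frac{\partial\phi}{\partial t}\big(X_e(x,t;\tau),\tau\big) + v\big(X_e(x,t;\tau),\tau\big)\cdot\nabla\phi\big(X_e(x,t;\tau),\tau\big) = \frac{d}{d\tau}\Big[\phi\big(X_e(x,t;\tau),\tau\big)\Big], \tag{9.50}$$

where $\partial\phi/\partial t$ denotes the partial derivative with respect to time and $\nabla$ is the gradient.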

Approximation of the Material Derivative: Time Discretization In order to carry out a
semidiscretization in time, we consider a partition of the time interval $[0, T]$ into $N$
time steps of size $\Delta t = T/N$, with nodes denoted by $t_n = T - n\Delta t$ for $n = 0, 1, \ldots, N$.
Then, equation (9.50) suggests the following first-order approximation of the material
derivative at time $t_{n+1}$:

$$\frac{D\phi}{Dt}(x, t_{n+1}) = \frac{d}{d\tau}\Big[\phi\big(X_e(x,t_{n+1};\tau),\tau\big)\Big]_{\tau=t_{n+1}} \approx \frac{\phi\big(X_e(x,t_{n+1};t_n),t_n\big) - \phi(x,t_{n+1})}{t_n - t_{n+1}}, \tag{9.51}$$

where we have used that $X_e(x, t_{n+1}; t_{n+1}) = x$.
The approximation (9.51) leads to the following implicit semidiscretized scheme
for equation (9.45):

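Problem 8 Semi-Discretized Scheme

Given $p^{n+1}_m$, find $\phi^{n+1}_{m+1}$ for $n = 0, \ldots, N-1$, satisfying the Dirichlet boundary
condition $\phi^{n+1}_{m+1}(x) = l(x, t_{n+1})$, such that

$$-\int_\Omega \frac{\phi^n\big(X_e(x,t_{n+1};t_n)\big) - \phi^{n+1}_{m+1}(x)}{t_n - t_{n+1}}\,\psi(x)\,dx + a^{n+1}\big(\phi^{n+1}_{m+1},\psi\big) - L^{n+1}(\psi) + \int_\Omega p^{n+1}_m(x)\,\psi(x)\,dx = 0, \tag{9.52}$$

and

$$\phi^0_{m+1}(x) = \Lambda(x),$$

where $\phi^{n+1}(x) = \phi(x,t_{n+1})$, $p^{n+1}(x) = p(x,t_{n+1})$, $a^{n+1}(\cdot,\cdot) = a(t_{n+1};\cdot,\cdot)$ and
$L^{n+1}(\cdot) = L(t_{n+1};\cdot)$.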

In most cases, the Cauchy problem (9.49) is not easy to solve analytically. However,
the $O(\Delta t)$ error of scheme (9.52) does not change if we replace $X_e(x, t_{n+1}; t_n)$
by a first-order approximation given, for example, by an explicit Euler scheme

$$X_E(x, t_{n+1}; t_n) = x + \Delta t\, v(x, t_{n+1}), \qquad \Delta t = t_n - t_{n+1}. \tag{9.53}$$
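
As an illustration, the following Python sketch combines an explicit Euler approximation of the characteristic with the first-order difference quotient along it. The function names and the test data (velocity $v(x,t) = x$ and $\phi(x,t) = x e^{-t}$, whose material derivative is zero) are hypothetical choices made for this example only; they are not taken from the text. The step is written in terms of the signed difference between the two time levels, so the sketch does not depend on the direction in which time is traversed.

```python
import math

def euler_foot(x, t_from, t_to, v):
    """One explicit Euler step of dX/dtau = v(X, tau), from (x, t_from) to time t_to."""
    return x + (t_to - t_from) * v(x, t_from)

def material_derivative_fd(phi, x, t_np1, t_n, v):
    """Difference quotient along the Euler-approximated characteristic."""
    foot = euler_foot(x, t_np1, t_n, v)
    return (phi(foot, t_n) - phi(x, t_np1)) / (t_n - t_np1)

# Hypothetical test data: d(phi)/dt + v * d(phi)/dx = -phi + phi = 0.
v = lambda x, t: x
phi = lambda x, t: x * math.exp(-t)

x, t_np1 = 1.5, 0.4
exact = 0.0
errors = []
for dt in (0.1, 0.05, 0.025):
    t_n = t_np1 + dt  # signed step t_n - t_{n+1} = dt
    approx = material_derivative_fd(phi, x, t_np1, t_n, v)
    errors.append(abs(approx - exact))

# Halving the step roughly halves the error, as expected for a first-order formula.
ratios = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
```

In this example the error of the difference quotient behaves like a constant times the step size, which is the first-order behaviour claimed for the classical scheme.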

9.4.2 Space Discretization: Galerkin Finite Element Method


Galerkin methods are obtained by restricting both the solution and the test functions
involved in the variational formulation to be in a finite-dimensional space. In finite
element methods, this space is made up of globally continuous functions that are
polynomials in each element of a polygonal mesh of the domain $\Omega$. In Galerkin
methods, the solution and the test functions are looked for in the same
finite-dimensional space. The solution of the PDE is built as a sum of all these local
approximating functions. Usually, only the spatial variables are treated in this way,
while time is discretized with FD or other methods. With two spatial variables, the
domain is partitioned into triangles and/or quadrangles. Three-dimensional spatial
domains allow partitions into tetrahedrons, hexahedrons, or prisms.
The distinction between FE and FD is relevant at the theoretical level, when
dealing with the numerical analysis. Once the discrete scheme is written and one
is left with algebraic transformations of values at the grid points, the distinction
vanishes. On structured meshes, finite differences and finite elements plus specific
numerical integration (using, for example, vertices) can be shown to be equivalent.
The contrast should be seen more as variational methods versus finite differences
rather than finite elements versus finite differences.
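
The equivalence on structured meshes can be checked on the simplest possible case. The sketch below is an illustration under stated assumptions, not the book's implementation: it assembles, on a uniform one-dimensional mesh, the P1 stiffness matrix of $-u''$ and the mass matrix computed with vertex (trapezoidal) quadrature, and compares an interior row of $M^{-1}K$ with the classical three-point finite difference stencil. The function name and mesh data are hypothetical.

```python
def assemble_p1_1d(n_elems, h):
    """Assemble the P1 stiffness matrix and the vertex-quadrature (lumped) mass
    on a uniform 1D mesh with n_elems elements of size h."""
    n_nodes = n_elems + 1
    K = [[0.0] * n_nodes for _ in range(n_nodes)]
    M = [0.0] * n_nodes  # the lumped mass matrix is diagonal
    k_local = [[1.0 / h, -1.0 / h], [-1.0 / h, 1.0 / h]]  # integrals of phi_a' * phi_b'
    for e in range(n_elems):
        nodes = (e, e + 1)
        for a in range(2):
            M[nodes[a]] += h / 2.0  # trapezoidal rule using the two vertices
            for b in range(2):
                K[nodes[a]][nodes[b]] += k_local[a][b]
    return K, M

h = 0.25
K, M = assemble_p1_1d(4, h)
i = 2  # an interior node
fe_row = [K[i][j] / M[i] for j in range(5)]                  # row of M^{-1} K
fd_row = [0.0, -1.0 / h**2, 2.0 / h**2, -1.0 / h**2, 0.0]    # FD stencil of -u''
```

Both rows contain the same entries, so after the discrete scheme is written down the two approaches act identically on the nodal values in this setting.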
However, as mentioned in section 9.1, FE are more flexible than FD in incorpo-
rating boundary conditions and in that they allow unstructured meshes. As shown
by Zvan et al. [110], unstructured meshing can be applied to a wide variety of
financial models. The idea is that an accurate solution of the pricing PDE requires on
many occasions a fine-mesh spacing in certain regions of the domain, usually where
the gradient is steep, whereas in regions where the gradient is flat, a coarser mesh can
be used. Some studies have indicated, for example, the need for small-mesh spacing
near barriers (Figlewski and Gao [169], Zvan et al. [170]). Pooley [171] proves that
the finite element method with standard unstructured meshing techniques can lead
to significant efficiency gains over structured meshes with a comparable number of
vertices for pricing barrier options. Pironneau and Hecht [168] present and test an
adaptive algorithm for a problem with a free boundary that arises in finance for the
pricing of American options, leading to satisfactory results.4
FE has some other computational practicalities compared to FD (see Winkler
et al. [172]):

■ FE is very suitable for modular programming.
■ A solution for the entire domain is computed instead of isolated nodes as with
the FD method.
■ FE provides accurate “greeks” as a byproduct.
■ FE can easily deal with irregular domains, whereas this is difficult in FD.

Finite elements, which are a widely used technique in areas such as computational
mechanics, have become quite popular in financial engineering. A recent textbook
is Topper [196].

4 They use a characteristics/FE method for the space discretisation and the Brennan-Schwartz
algorithm to deal with the American early exercise.

Fully Discretized Lagrange-Galerkin Scheme In order to solve Problem 8 numerically,
a discretization must be carried out; in other words, the problem must be replaced
by a new one with a finite number of degrees of freedom or unknowns.
As mentioned previously, Galerkin methods replace the space of functions $V$
by a finite-dimensional space $V_h$ and define a discrete counterpart of Problem 8
where the function $\phi$ is approximated by $\phi_h \in V_h$ and the Lagrange multiplier $p$ is
approximated by $p_h \in V_h$. In finite elements, the space $V_h$ is made up of globally
continuous functions that are polynomials in each element of a polygonal mesh of
the domain $\Omega$. Let us denote by $\mathcal{T}_h$ a family of polygonal meshes of the domain $\Omega$,
where the parameter $h$ tends to zero and represents the size of the mesh. We assume
that the mesh contains $N_h$ nodes and that any function in $V_h$ is uniquely defined by
its values at the nodes. If $\phi_h \in V_h$, we call the values $\phi_h(q_i)$, $i = 1, \ldots, N_h$, the set of
degrees of freedom. As in the continuous problem, we define

$$V_{0,h} = \{\psi_h \in V_h : \psi_h(q) = 0 \text{ for all nodes } q \text{ on } \Gamma_D\}. \tag{9.54}$$

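The discrete problem can be written as:

Problem 9 Fully Discretized Scheme

Given $p^{n+1}_{h,m} \in V_h$, find $\phi^{n+1}_{h,m+1} \in V_h$, for $n = 0, 1, \ldots, N-1$, such that

$$\phi^{n+1}_{h,m+1}(q) = l(q) \quad \forall q \text{ node on } \Gamma_D, \tag{9.55}$$

$$-\int_\Omega \frac{\phi^n_h\big(X_e(x,t_{n+1};t_n)\big) - \phi^{n+1}_{h,m+1}(x)}{t_n - t_{n+1}}\,\psi_h\,dx + a^{n+1}\big(\phi^{n+1}_{h,m+1},\psi_h\big) = L^{n+1}(\psi_h) - \int_\Omega p^{n+1}_{h,m}\,\psi_h\,dx, \quad \text{for every } \psi_h \in V_{0,h}, \tag{9.56}$$

and

$$\phi^0_{h,m+1}(q) = \Lambda_h(q), \tag{9.57}$$

for all nodes $q$ of the mesh $\mathcal{T}_h$, where $\Lambda_h$ is the interpolated function of $\Lambda$ in the
space $V_h$.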

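Let us define the bilinear form

$$\tilde{a}^{n+1}\big(\phi^{n+1}_h,\psi_h\big) = a^{n+1}\big(\phi^{n+1}_h,\psi_h\big) + \frac{1}{\Delta t}\int_\Omega \phi^{n+1}_h\,\psi_h\,dx, \tag{9.58}$$

and the linear form

$$\tilde{L}^{n+1}_m(\psi_h) = L^{n+1}(\psi_h) + \frac{1}{\Delta t}\int_\Omega \phi^n_h\big(X_e(x,t_{n+1};t_n)\big)\,\psi_h\,dx - \int_\Omega p^{n+1}_{h,m}\,\psi_h\,dx. \tag{9.59}$$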

Then Problem 9 above can be rewritten as:

Problem 10 Given $p^{n+1}_{h,m} \in V_h$, find $\phi^{n+1}_{h,m+1} \in V_h$ satisfying Dirichlet boundary
condition (9.55) such that

$$\tilde{a}^{n+1}\big(\phi^{n+1}_{h,m+1},\psi_h\big) = \tilde{L}^{n+1}_m(\psi_h) \quad \forall \psi_h \in V_{0,h}. \tag{9.60}$$

Let us ignore for the moment Dirichlet boundary condition (9.55); in other
words, let us assume that $V_h = V_{0,h}$.
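Let $B = \{\varphi_1, \varphi_2, \ldots, \varphi_{N_h}\}$ be a basis of $V_h$. Then the solution of (9.60) can be
written (we omit indices for the sake of simplicity) in the form

$$\phi^{n+1}_{h,m+1} = \sum_{j=1}^{N_h} \xi_j\,\varphi_j, \tag{9.61}$$

so that the discrete problem (9.60) is equivalent to finding $N_h$ numbers $\xi_1, \xi_2, \ldots, \xi_{N_h}$
satisfying

$$\sum_{j=1}^{N_h} \tilde{a}^{n+1}(\varphi_j,\varphi_i)\,\xi_j = \tilde{L}^{n+1}_m(\varphi_i), \quad i = 1, 2, \ldots, N_h. \tag{9.62}$$

Equivalently:

Problem 11 Find $\xi = (\xi_1, \xi_2, \ldots, \xi_{N_h}) \in \mathbb{R}^{N_h}$ such that

$$\mathcal{A}_h\,\xi = b_h, \tag{9.63}$$
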

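where

$$(\mathcal{A}_h)_{kl} = \tilde{a}^{n+1}(\varphi_l,\varphi_k) = \frac{1}{\Delta t}\int_\Omega \varphi_l\,\varphi_k\,dx + \sum_{i,j=1}^{d}\int_\Omega a_{ij}\,\frac{\partial\varphi_l}{\partial x_j}\,\frac{\partial\varphi_k}{\partial x_i}\,dx + \int_\Omega a_0\,\varphi_l\,\varphi_k\,dx + \int_{\Gamma_R} \alpha\,\varphi_l\,\varphi_k\,d\Gamma, \tag{9.64}$$

$$(b_h)_k = \tilde{L}^{n+1}_m(\varphi_k) = \int_\Omega f\,\varphi_k\,dx + \int_{\Gamma_R} g\,\varphi_k\,d\Gamma + \frac{1}{\Delta t}\int_\Omega \phi^n_h\big(X_e(x,t_{n+1};t_n)\big)\,\varphi_k\,dx - \int_\Omega p^{n+1}_{h,m}\,\varphi_k\,dx. \tag{9.65}$$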

Since the matrix $A$ is symmetric (see (9.4)), $\mathcal{A}_h$ is symmetric as well. If additionally
$\mathcal{A}_h$ is positive definite, Cholesky's method can be used to solve the system (9.63).
In the special case where coefficients $a_{ij}$, $a_0$, and $\alpha$ do not depend on time, the linear
system has a matrix independent of both time step and iteration; therefore, it needs
to be computed and factorized only once. Also, in expression (9.65) the first two
terms on the right-hand side are independent of time (if $f$ and $g$ are) and of the iteration,
whereas the third term must be computed at every time step, and the fourth at every
time step and iteration. Consequently, in order to solve these systems it is convenient
to use Cholesky or, more generally, direct Gauss-like methods: since the
factorization step needs to be done only once, at each iteration just two triangular
systems have to be solved.
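
The "factorize once, solve many times" idea can be sketched in a few lines of Python. This is a minimal dense illustration with a hypothetical 3 × 3 symmetric positive definite matrix standing in for the system matrix; a production code would use a sparse or banded factorization from a numerical library.

```python
import math

def cholesky(A):
    """Return the lower-triangular factor L with A = L L^T (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_with_factor(L, b):
    """Solve L L^T x = b with one forward and one backward triangular solve."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):  # forward substitution: L y = b
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # backward substitution: L^T x = y
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

# Hypothetical matrix and right-hand sides (one per time step or iteration).
A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
L = cholesky(A)  # done once
rhs_list = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]
solutions = [solve_with_factor(L, b) for b in rhs_list]  # two triangular solves each
```

The factorization cost is paid a single time, while each new right-hand side only requires the two triangular solves.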
In Problem 11 we have not taken into account Dirichlet boundary conditions.
More precisely, we have included test functions $\varphi_i$ in the basis of $V_h$ that do not
satisfy the boundary condition $\varphi_i|_{\Gamma_D} = 0$. Eliminating these functions from the basis
is equivalent to eliminating the corresponding unknowns and equations (degrees of
freedom). This process turns out to be unpleasant from the programming point
of view. A simpler procedure is to replace the $i$-th equation (assuming the node $i$
belongs to $\Gamma_D$) by the equation

$$\xi_i = l(q_i).$$

Actually the $i$-th equation is replaced by the ‘‘programming equivalent’’ obtained
by substituting the diagonal term $(\mathcal{A}_h)_{ii}$ by a large number, say $H$, and the right-hand
side by $H\,l(q_i)$. This process is called blocking of the degrees of freedom.
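
The blocking procedure can be illustrated with a small Python sketch. The matrix, the boundary values and the value chosen for the large number ($10^{30}$) are hypothetical; the dense Gaussian elimination is included only to keep the example self-contained.

```python
def solve_dense(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def block_dirichlet(A, b, node, value, big=1.0e30):
    """Block one degree of freedom: overwrite the diagonal entry and the right-hand side."""
    A[node][node] = big
    b[node] = big * value

# Hypothetical 3-node system with Dirichlet values 1 and 3 at the two end nodes.
A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
b = [0.0, 0.0, 0.0]
block_dirichlet(A, b, 0, 1.0)
block_dirichlet(A, b, 2, 3.0)
xi = solve_dense(A, b)
```

Only two entries per blocked node are modified, so the matrix keeps its size, its sparsity pattern and its symmetry, and the blocked unknowns take the prescribed values up to an error of order $1/H$.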
The problem arising now is how to choose the basis $B = \{\varphi_1, \varphi_2, \ldots, \varphi_{N_h}\}$. The
elements of a suitable basis are functions that vanish in large regions of $\Omega$,
so that many entries of the matrix $\mathcal{A}_h$ are zero; that is, $\mathcal{A}_h$ is a sparse matrix. We
also need an efficient algorithm to work out the matrix of coefficients, $\mathcal{A}_h$, and the
right-hand-side vector $b_h$, since the direct calculation of $\mathcal{A}_h$ and $b_h$ using formulas (9.64)
and (9.65) is inefficient.
In the appendix, we consider Lagrange triangular finite elements in a
$d$-dimensional domain $\Omega$. We will show how to build the matrix of coefficients
and the independent term in the particular case of Lagrange triangular finite
elements of degree one in two space dimensions. This finite element space consists
of continuous piecewise linear functions on a triangular mesh of the domain $\Omega$.
Specifically,

$$V_h = \{\phi_h \in C(\bar\Omega) : \phi_h|_K \in P_1 \ \ \forall K \in \mathcal{T}_h\}, \tag{9.66}$$

where $C(\bar\Omega)$ denotes the space of continuous functions defined in $\bar\Omega$ and $P_1$ represents
the space of polynomials of degree less than or equal to one in two variables. In this
case the basis function $\varphi_i$ takes value 1 at the vertex $i$ of the mesh of the domain and
is zero at all other vertices. We refer to Ciarlet [180] and Zienkiewicz et al. [181] as
reference textbooks on the finite element method.
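
On a single triangle, the degree-one basis functions are the barycentric coordinates of the point. The following Python sketch, with a hypothetical function name and a hypothetical reference triangle, evaluates them and checks the "one at its own vertex, zero at the others" property as well as the exact reproduction of linear functions.

```python
def p1_basis(tri, x, y):
    """Values at (x, y) of the three degree-one basis functions of a triangle
    given by its vertices tri = [(x1, y1), (x2, y2), (x3, y3)]."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)  # twice the signed area
    l2 = ((x - x1) * (y3 - y1) - (x3 - x1) * (y - y1)) / det
    l3 = ((x2 - x1) * (y - y1) - (x - x1) * (y2 - y1)) / det
    return (1.0 - l2 - l3, l2, l3)

tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # hypothetical triangle
at_vertices = [p1_basis(tri, vx, vy) for vx, vy in tri]

# Interpolating a linear function from its vertex values reproduces it exactly.
f = lambda x, y: 2.0 * x + 3.0 * y + 1.0
lam = p1_basis(tri, 0.2, 0.3)
interpolated = sum(l * f(vx, vy) for l, (vx, vy) in zip(lam, tri))
```

Because each basis function is supported only on the triangles sharing its vertex, most pairs of basis functions have disjoint supports, which is the source of the sparsity of the system matrix.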

The Iterative Algorithm The algorithm we have introduced in section 9.3 to solve
the continuous variational inequalities can now be written in summarized form for
the fully discretized problem.

At time step $n+1$ we start with $p^{n+1}_{h,0} = p^n_h$ and calculate sequences
$p^{n+1}_{h,m}$ and $\phi^{n+1}_{h,m}$, indexed by $m$, and defined as follows:

(a) At iteration $m$, we know $p^{n+1}_{h,m}$.

(b) We first compute $\phi^{n+1}_{h,m+1}$ as the solution of the linear problem

$$\tilde{a}^{n+1}\big(\phi^{n+1}_{h,m+1},\psi_h\big) = \tilde{L}^{n+1}_m(\psi_h) \quad \forall\psi_h \in V_{0,h}. \tag{9.127}$$

(c) Then we update the Lagrange multiplier using formula (9.48):

$$p^{n+1}_{h,m+1}(q) = G_\lambda(q, t_{n+1})\Big(\phi^{n+1}_{h,m+1}(q) + \lambda\,p^{n+1}_{h,m}(q)\Big),$$

for all nodes $q$ of the mesh $\mathcal{T}_h$, where

$$G_\lambda(q,t_{n+1})(Y) = \begin{cases} \dfrac{1}{\lambda}\big(Y - R_1(q,t_{n+1})\big) & \text{if } Y < R_1(q,t_{n+1}),\\[4pt] 0 & \text{if } R_1(q,t_{n+1}) \le Y \le R_2(q,t_{n+1}),\\[4pt] \dfrac{1}{\lambda}\big(Y - R_2(q,t_{n+1})\big) & \text{if } Y > R_2(q,t_{n+1}). \end{cases} \tag{9.128}$$

By applying the convergence results in Bermúdez and Moreno [142] we know
that, for $\lambda$ sufficiently large, the sequence $\phi^{n+1}_{h,m}$ converges to the solution $\phi^{n+1}_h$ as
$m$ goes to infinity.
Figure 9.3 shows a flow chart of the complete algorithm: Lagrange-Galerkin
combined with the iterative procedure.
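
The inner iteration for one time step can be sketched in Python as follows. This is a toy illustration under several assumptions that are not in the text: the linear system is a hypothetical 3 × 3 matrix, the multiplier enters nodewise (as with a unit lumped mass matrix), there is only a lower obstacle, and the parameter is set to 2. The names are illustrative.

```python
def solve_dense(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def G(y, r1, r2, lam):
    """Piecewise-linear update function: zero between the obstacles r1 <= r2."""
    if y < r1:
        return (y - r1) / lam
    if y > r2:
        return (y - r2) / lam
    return 0.0

def duality_iteration(A, b, r1, r2, lam, tol=1e-12, max_iter=500):
    """Alternate a linear solve (multiplier frozen) with a nodewise multiplier update."""
    n = len(b)
    p = [0.0] * n
    for m in range(max_iter):
        phi = solve_dense(A, [b[i] - p[i] for i in range(n)])
        p_new = [G(phi[i] + lam * p[i], r1[i], r2[i], lam) for i in range(n)]
        if max(abs(p_new[i] - p[i]) for i in range(n)) < tol:
            return phi, p_new, m + 1
        p = p_new
    return phi, p, max_iter

# Hypothetical data: the unconstrained solution (-1.5, -3, -1.5) violates phi >= -1.
A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
b = [0.0, -3.0, 0.0]
r1 = [-1.0] * 3
r2 = [float("inf")] * 3
phi, p, iterations = duality_iteration(A, b, r1, r2, lam=2.0)
```

In this example the iteration settles on the solution that touches the obstacle at the middle node only, with a nonzero multiplier there and a zero multiplier elsewhere. Every pass reuses the same matrix, which is what makes a single factorization worthwhile.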

[Flow chart: initialization of $\phi^0_h$; time-step loop for n = 0, ..., N − 1, in which the system of characteristics is solved and $p^{n+1}_{h,0} = p^n_h$ is set; inner iteration loop for m = 0, 1, ..., in which the linear problem is solved and the Lagrange multiplier is updated until a convergence test is passed; the algorithm stops after the last time step.]
FIGURE 9.3 Iterative method combined with Lagrange-Galerkin discretization.

9.4.3 Order of Classical Lagrange-Galerkin Method

There is a broad literature analyzing the classical first-order characteristic method
combined with finite elements applied to convection-diffusion equations. Süli [182]
showed error estimates of the form $O(h^k + \Delta t)$ in the $l^\infty(L^2(\Omega))$ norm, where $\Delta t$
denotes the time step, $h$ the spatial step, and $k$ the degree of the finite element space.5
Pironneau [145] stated error estimates of the form $O(h^k + \Delta t + h^{k+1}/\Delta t)$ in the
$l^\infty(L^2(\Omega))$ norm under the assumption that the normal component of the velocity
field vanishes on the boundary of the spatial domain and for an approximate
discrete velocity field. In both cases, the constants depend on the norm of the
solution. More recently, Bause and Knabner [183] proved convergence of order
$O(h^2 + \min\{h, h^2/\Delta t\} + \Delta t)$ for linear finite elements and zero velocity on the
boundary, where the constants in the error estimates depend only on the data.

5 $L^2(\Omega)$ is the space of square-integrable $\mathbb{R}$-valued functions defined in $\Omega$ with the norm

$$\|f\|_{L^2(\Omega)} := \Big(\int_\Omega f(x)^2\,dx\Big)^{1/2}.$$

$l^\infty(L^2(\Omega))$ is the space of $\mathbb{R}$-valued functions, $\phi$, defined in $\Omega \times \{t_n\}_{n=0}^N$ such that
$\phi(\cdot, t_n) \in L^2(\Omega)$ for all $n = 0, \ldots, N$. In this space we consider the norm

$$\|\phi\|_{l^\infty(L^2(\Omega))} := \max_{n=0,\ldots,N} \|\phi(\cdot,t_n)\|_{L^2(\Omega)} = \max_{n=0,\ldots,N} \Big(\int_\Omega \phi(x,t_n)^2\,dx\Big)^{1/2}.$$

9.5 HIGHER-ORDER LAGRANGE-GALERKIN METHODS

In order to obtain better accuracy in space, it is necessary to use finite element
spaces of higher order. In order to achieve better accuracy in time, it is necessary
to use higher-order characteristic methods. The latter consist of higher-order
schemes for the discretization of the total derivative (9.7). There are two main
approaches: multilevel schemes, which use a multistep formula to approximate
the total derivative, and Crank-Nicolson schemes, which use a centered formula.
Specifically, for fixed $(x, t) = (x, t_{n+1})$ the following two second-order formulae
could be used to approximate the material derivative:

d
Xe (x, tn 1 ), ,
d
■ Centered formula
t t
2 ),
Xe (x,tn 1 Xe (x,tn 1 t t
2 2 ), 2
t (9.67)

■ Three-level backward formula

3 Xe (x,tn 1 ), 4 Xe (x,tn 1 t), t Xe (x,tn 1 2 t), 2 t


2 t (9.68)

The exact characteristic lines can be replaced by second-order approximations
while still keeping the $O(\Delta t^2)$ error. Possible schemes are:

■ Runge-Kutta scheme

$$X_{RK}(x, t_{n+1}; t_n) := x + \Delta t\, v^{n+\frac12}\Big(x + \frac{\Delta t}{2}\, v^{n+1}(x)\Big) \quad \text{for } n = 0, 1, \ldots, N-1 \tag{9.69}$$

■ Explicit two-step scheme

XTS x, tn 1 tn : x t 2vn x vn 1
x for n 2, , N 1
(9.70)
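
The gain from a second-order approximation of the characteristics can be seen on a simple test. The Python sketch below compares, over a single step, an explicit Euler step with a midpoint Runge-Kutta step for a rigid rotation velocity field, for which the exact characteristic is known. The velocity field, the starting point and the function names are hypothetical choices for this illustration, and the step is expressed through a signed step length `s` so that the sketch is independent of the direction in which time is traversed.

```python
import math

def euler_step(x, t, s, v):
    """First-order step of dX/dtau = v(X, tau) over a signed step s."""
    vx = v(x, t)
    return (x[0] + s * vx[0], x[1] + s * vx[1])

def rk2_step(x, t, s, v):
    """Second-order (midpoint) Runge-Kutta step over a signed step s."""
    vx = v(x, t)
    mid = (x[0] + 0.5 * s * vx[0], x[1] + 0.5 * s * vx[1])
    vm = v(mid, t + 0.5 * s)
    return (x[0] + s * vm[0], x[1] + s * vm[1])

def rotation(x, t):
    """Hypothetical velocity field: rigid rotation about the origin."""
    return (-x[1], x[0])

def exact_flow(x, s):
    """Exact characteristic of the rotation field: rotate by the angle s."""
    c, sn = math.cos(s), math.sin(s)
    return (c * x[0] - sn * x[1], sn * x[0] + c * x[1])

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

x0 = (0.25, 0.0)
err_euler = [dist(euler_step(x0, 0.0, s, rotation), exact_flow(x0, s)) for s in (0.1, 0.05)]
err_rk2 = [dist(rk2_step(x0, 0.0, s, rotation), exact_flow(x0, s)) for s in (0.1, 0.05)]

# Halving the step divides the one-step error by about 4 (Euler) and about 8 (Runge-Kutta).
ratio_euler = err_euler[0] / err_euler[1]
ratio_rk2 = err_rk2[0] / err_rk2[1]
```

A one-step error of third order for the Runge-Kutta foot, against second order for the Euler foot, is what allows the overall scheme to retain second-order accuracy in time.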

Ewing and Russell [184] introduced multistep Lagrange-Galerkin methods for
the convection-diffusion equation with constant coefficients. Boukir et al. ([150],
[185]) also used multistep characteristics combined with either mixed finite elements
or spectral methods to solve the incompressible Navier-Stokes equations. They proved the
stability of the method and obtained error estimates.
Rui and Tabata [151] proposed a second-order Crank-Nicolson characteristic
method for the convection-diffusion equation with constant coefficients and Dirichlet
boundary conditions, with the exact characteristics approximated using a Runge-Kutta
scheme. Bermúdez et al. [152, 197] extended their work to variable coefficients,
possibly degenerate diffusion, nondivergence-free velocity field, non-zero reaction,
and more general Dirichlet-Robin boundary conditions.

9.5.1 Crank-Nicolson Characteristics/Finite Elements


In section 9.3 we showed how the solution of the nonlinear problem could be written
as the limit of solutions of linear problems. In section 9.4 those linear problems
were discretized using first-order Lagrange-Galerkin schemes. In this section, we
describe the second-order Lagrange-Galerkin method proposed by Bermúdez et al.
[152, 197] for solving the general linear problem (9.10) together with boundary
conditions (9.11) and (9.12). First, we write the weak formulation and then the semi-
Lagrangian time discretization using both exact and approximate characteristics.
Finally, we write the fully discretized problem and address some of the results in
Bermúdez et al. [152, 197].

Weak Formulation We proceed to write a weak formulation of equation (9.3). Given
that we will carry out a semi-discretization in time using characteristics, we first
write equation (9.3) at point $X_e(x, t; \tau)$ and time $\tau$, giving (see (9.50)),

d
d
Xe (x, t ), aij Xe (x, t ), Xe (x, t ),
d xi xj
i,j 1

a0 Xe (x, t ), Xe (x, t ), f Xe (x, t ), (9.71)

Equivalently, in vector notation

d
Xe (x, t ), div A Xe (x, t ), Xe (x, t ),
d
a0 Xe (x, t ), Xe (x, t ),
f Xe (x, t ), (9.72)

We will use the following notation:

$$(F_e)_{kl}(x, t; \tau) := \frac{\partial (X_e)_k}{\partial x_l}(x, t; \tau) = (\nabla X_e)_{kl}(x, t; \tau).$$

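In order to write a weak formulation, we need the following lemma.

Lemma 9.5.1 Let $X : \bar\Omega \to X(\bar\Omega)$, $X \in C^1(\bar\Omega)$, be an invertible vector-valued
function. Let $F = \nabla X$ and assume that $\det F(x) \neq 0$ $\forall x \in \bar\Omega$. Then for a smooth vector field
$w$ and scalar field $\psi$, we have:

(a)

$$\int_\Omega \operatorname{div} w\big(X(x)\big)\,\psi(x)\,dx = \int_\Gamma F^{-t}(x)\,n(x)\cdot w\big(X(x)\big)\,\psi(x)\,d\Gamma - \int_\Omega F^{-1}(x)\,w\big(X(x)\big)\cdot\nabla\psi(x)\,dx - \int_\Omega \operatorname{div}\big(F^{-t}(x)\big)\cdot w\big(X(x)\big)\,\psi(x)\,dx \tag{9.73}$$

(b) If additionally $X(x) = x$ $\forall x \in \Gamma$,

$$\int_\Gamma F^{-t}(x)\,n(x)\cdot w\big(X(x)\big)\,\psi(x)\,d\Gamma = \int_\Gamma n(x)\cdot w\big(X(x)\big)\,\psi(x)\,\det F^{-1}(x)\,d\Gamma, \tag{9.74}$$

where $n$ is the outward unit normal vector to $\Gamma$.

Proof: See Bermúdez et al. [152, 197].
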


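Note that equation (9.73) can be considered a Green’s formula. Also, substitution
of (9.74) into (9.73) yields

$$\int_\Omega \operatorname{div} w\big(X(x)\big)\,\psi(x)\,dx = \int_\Gamma n(x)\cdot w\big(X(x)\big)\,\psi(x)\,\det F^{-1}(x)\,d\Gamma - \int_\Omega F^{-1}(x)\,w\big(X(x)\big)\cdot\nabla\psi(x)\,dx - \int_\Omega \operatorname{div}\big(F^{-t}(x)\big)\cdot w\big(X(x)\big)\,\psi(x)\,dx \tag{9.75}$$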
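
For the reader's convenience, the pointwise identity behind this Green's formula can be sketched as follows. The derivation is an outline only, written with $u := w \circ X$ and under the assumption (made here for definiteness, and not stated in the text) that the divergence of a matrix field is taken as $(\operatorname{div} M)_j = \sum_i \partial M_{ji}/\partial x_i$.

```latex
% Chain rule for u = w o X, with F = grad X:
\nabla u = (\nabla w \circ X)\,F
\quad\Longrightarrow\quad
(\operatorname{div} w)\circ X
  = \operatorname{tr}\big(\nabla u\,F^{-1}\big)
  = \sum_{i,j} (F^{-1})_{ij}\,\frac{\partial u_j}{\partial x_i} .

% Product rule applied to div(F^{-1} u):
\operatorname{div}\big(F^{-1}u\big)
  = \sum_{i,j} \frac{\partial (F^{-1})_{ij}}{\partial x_i}\,u_j
  + \sum_{i,j} (F^{-1})_{ij}\,\frac{\partial u_j}{\partial x_i}
  = \operatorname{div}\big(F^{-t}\big)\cdot u + (\operatorname{div} w)\circ X .

% Multiply by psi, integrate, and apply the divergence theorem to the first term:
\int_\Omega (\operatorname{div} w)\circ X\;\psi\,dx
  = \int_\Gamma F^{-t}n\cdot u\;\psi\,d\Gamma
  - \int_\Omega F^{-1}u\cdot\nabla\psi\,dx
  - \int_\Omega \operatorname{div}\big(F^{-t}\big)\cdot u\;\psi\,dx ,
\qquad\text{using } (F^{-1}u)\cdot n = F^{-t}n\cdot u .
```

The boundary term is the only place where the hypothesis $X(x) = x$ on $\Gamma$ is needed, in order to replace $F^{-t}n$ by $n \det F^{-1}$.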

To write a weak formulation we multiply equation (9.72) by a test function $\psi$
satisfying $\psi = 0$ on $\Gamma_D$, integrate in $\Omega$, and use the Green’s formula (9.75) with
$X(x) = X_e(x, t; \tau)$ and $w = A\nabla\phi$, obtaining

d
Xe (x, t ), x dx
d

Fe 1 (x, t )A Xe (x, t ), Xe (x, t ), x dx

div Fe t (x, t ) A Xe (x, t ), Xe (x, t ), x dx

a0 Xe (x, t ), Xe (x, t ), x dx

n x A Xe (x, t ), Xe (x, t ), x det Fe 1 (x, t )d


R

f Xe (x, t ), x dx (9.76)

Now, using Robin condition (9.11), the boundary term in the above formulation
can be rewritten as

n x A Xe (x, t ), Xe (x, t ), x det Fe 1 (x, t )d


R

g Xe (x, t ), Xe (x, t ), x det Fe 1 (x, t )d (9.77)


R

Substitution of (9.77) into (9.76) yields

d
Xe (x, t ), x dx
d

Fe 1 (x, t )A Xe (x, t ), Xe (x, t ), x dx

div Fe t (x, t ) A Xe (x, t ), Xe (x, t ), x dx

a0 Xe (x, t ), Xe (x, t ), x dx

Xe (x, t ), x det Fe 1 (x, t )d


R

f Xe (x, t ), x dx

g Xe (x, t ), x det Fe 1 (x, t )dx (9.78)


R

This is a weak formulation of equation (9.3). Note that if $\tau = t$ it reduces to the
weak formulation in (9.31) with $p = 0$ (since we are solving the linear problem).

Second-Order Semidiscretized Scheme with Exact Characteristic Lines In order to
carry out a semidiscretization in time, we consider a partition of the time interval
$[0, T]$ into $N$ time steps of size $\Delta t = T/N$, with nodes denoted by $t_n = T - n\Delta t$ for
$n = 0, \frac12, 1, \frac32, \ldots, N$.
We introduce the following notation:

$$X^n_e(x) := X_e(x, t_{n+1}; t_n), \qquad F^n_e(x) := F_e(x, t_{n+1}; t_n),$$
$$X^{n+\frac12}_e(x) := X_e\big(x, t_{n+1}; t_{n+\frac12}\big), \qquad F^{n+\frac12}_e(x) := F_e\big(x, t_{n+1}; t_{n+\frac12}\big).$$

The method proposed in [152, 197] consists of fixing $t = t_{n+1}$, $n = 0, 1, \ldots, N-1$, in
the weak formulation (9.78) and applying a Crank-Nicolson scheme with respect
to $\tau$. We have
n Xen x n 1 x
x dx
t
1
2 An 1
x n 1
x x dx
1 1
2 Fne (x)An Xen x n Xen x x dx
1 t
2 div Fne (x) An Xen x n Xen x x dx
1
2 an0 1
x n 1 x x dx
1
2 an0 Xen x n Xen (x) x dx
1 n 1
2 R
x x d
1 n 1
2 R
Xen x x det Fne (x) d
1
2 fn 1
x f n Xen (x) x dx
1 n 1
2 R
g x x dx
1 1
2 R
gn Xen x x det Fne (x) dx, (9.79)

where we have used that $X_e(x, t_{n+1}; t_{n+1}) = x$ and $F_e(x, t_{n+1}; t_{n+1}) = I$, $I$ being the
identity matrix.
Bermúdez et al. [152, 197] proved that the scheme (9.79) is of order $O(\Delta t^2)$ at
point $\big(X^{n+\frac12}_e(x), t_{n+\frac12}\big)$.

Second-Order Semidiscretized Scheme with Approximate Characteristic Lines As
mentioned before, in most cases the system of characteristics (9.49) cannot be solved
exactly. Following Rui and Tabata [151], the exact characteristic lines, $X^n_e(x)$, could
be replaced in (9.79) by a numerical approximation using an explicit method, like
the first-order Euler scheme (9.53) or the second-order Runge-Kutta scheme (9.69). As
before, we will denote these approximations by $X^n_E(x)$ and $X^n_{RK}(x)$, respectively.
Thus, we are left with the following second-order approximation to (9.79) in
the case where the characteristic lines are not known explicitly:
n n n 1
XRK x x
x dx
t
1
2 An 1 x n 1 x x dx
1 1
2 FnE (x)An XEn x n
XEn x x dx
1 t
2 div FnE (x) An XEn x n XEn x x dx
1
2 an0 1
x n 1 x x dx
1
2 an0 XEn x n XEn (x x dx
1 n 1
2 R
x x d
1 1
2 R
n XEn x x det FnE (x)d
1
2 fn 1 x f n XEn (x) x dx
1
2 R
gn 1 x x dx
1 1
2 R
gn XEn x x det FnE (x)dx (9.80)

Note that, in order to preserve the $O(\Delta t^2)$ error bounds, it is necessary to use a
second-order approximation of the characteristic lines in one term only. Besides,
similarly to the exact characteristic case, $\det (F^n_E)^{-1}$, $\operatorname{div}(F^n_E)^{-t}(x)$, and $(F^n_E)^{-1}$ could
be replaced by their $O(\Delta t^2)$ approximations below, avoiding the inversion of matrix
$F^n_E$. Indeed, given

$$F^n_E(x) := \nabla X^n_E(x) = I + \Delta t\,L^{n+1}(x),$$

it can be shown (see [152, 197]) that

$$(F^n_E)^{-1}(x) \approx I - \Delta t\,L^{n+1}(x) + \Delta t^2\big(L^{n+1}(x)\big)^2,$$
$$\det (F^n_E)^{-1}(x) \approx 1 - \Delta t\,\operatorname{div} v^{n+1}(x), \tag{9.81}$$
$$\operatorname{div}(F^n_E)^{-t}(x) \approx -\Delta t\,\nabla \operatorname{div} v^n\big(X^n_E(x)\big).$$

Bermúdez et al. [152, 197] proved stability in the $l^\infty(L^2)$ norm of scheme (9.80)
under some hypotheses on the data and for a sufficiently small time step. Also, under
further regularity assumptions on the data, they proved $l^\infty(L^2)$ error estimates of
order $O(\Delta t^2)$ for the scheme semidiscretized in time.

Fully Discretized Lagrange-Galerkin Scheme In Bermúdez et al. [152, 197], a fully
discretized Lagrange-Galerkin scheme is proposed for a wide class of finite element
spaces. Let $V^k_h$ be a family of finite element spaces, where $h$ denotes the space
parameter and $k$ is the ‘‘approximation order’’ in the following sense:

‘‘There exists an interpolation operator $\pi_h : C^0(\bar\Omega) \to V^k_h$ satisfying

$$\|\pi_h\phi - \phi\|_s \le K\,h^{k+1-s}\,\|\phi\|_{k+1} \quad \forall\phi \in C^0(\bar\Omega)\cap H^{k+1}(\Omega), \quad s = 0, 1,$$

for a positive constant $K$ independent of $h$.’’6

6 $H^{k+1}(\Omega)$ is the Sobolev space of order $k+1$. This is the set of $\mathbb{R}$-valued functions defined
in $\Omega$ which are square-integrable and have square-integrable derivatives up to order $k+1$. In
this space we consider the norm

$$\|\phi\|_{k+1} := \Big(\|\phi\|^2_{L^2} + \sum_{1\le|\alpha|\le k+1}\|D^\alpha\phi\|^2_{L^2}\Big)^{1/2},$$

where $D^\alpha$ denotes the derivative of order $\alpha$.

Notice that the finite element space (9.66) falls into this family for $k = 1$.
The fully discretized scheme reads as follows:

Problem 12 Fully Discretized Second Order Scheme


N
0 k n N k
Given h h, find h : h n 1 h such that

n 12 n 12 k
t h, h t , h h h for n 0, , N 1, (9.82)

where
n n n 1
n 12 h
XRK h
t h, h : dx
t
n 1 n
An 1
h
An h
XEn
h dx
2
t
Ln 1
An n
h XEn h dx
2
t
div vn An n
h XEn h dx
2
an0 1 n 1
h
an0 n
h
XEn
h dx
2
n 1 n
h
1 t div vn 1
h
XEn
d ,
R
2

and
n 12 fn 1 f n XEn
t , h : h dx
2
gn 1 1 tdiv vn 1 gn XEn
d
R
2

In Bermúdez et al. [152, 197] stability results for the fully discretized scheme
(9.82) and error estimates of order $O(\Delta t^2) + O(h^k)$ in the $l^\infty(L^2)$ norm are proved.
These results hold under the hypothesis that all inner products in the Galerkin
formulation are calculated exactly. However, in practice numerical integration has
to be used to approximate these integrals. Quadrature formulae have to be carefully
chosen in order to preserve stability and the above order in the error estimates.
Specifically, in [152, 197] the following finite element spaces are considered:

■ For a family of rectangular meshes of parameter $h$, $\mathcal{T}_h$,

$$Q^k_h = \{f \in C^0(\bar\Omega) : f|_K \in Q_k, \ \forall K \in \mathcal{T}_h\},$$

where $Q_k$ is the space of polynomials of degree less than or equal to $k$ in each
variable separately.

■ For a family of triangular meshes of parameter $h$, $\mathcal{T}_h$,

$$P^k_h = \{f \in C^0(\bar\Omega) : f|_K \in P_k, \ \forall K \in \mathcal{T}_h\},$$

where $P_k$ is the space of polynomials of degree less than or equal to $k$.

They carried out some numerical tests in two space dimensions to illustrate
the theoretical results regarding second-order Lagrange-Galerkin schemes combined
with quadrature. It is well known that for the classical first-order-in-time
Lagrange-Galerkin method, numerical integration may lead to conditional stability (see [184],
[186], [187]). They did not find any sign of instability when using scheme (9.82)
combined with either $Q^2_h$ and the tensor product of the Simpson rule in each
coordinate or $P^2_h$ with a seven-point quadrature formula. In both cases, an extra
term of the form $O(1/\Delta t)$ appears in the estimates of the error for fixed $h$. This
agrees with evidence found for the first-order Lagrange-Galerkin scheme ([145],
[187]).
They also carried out a comparison between the second-order Lagrange-Galerkin
and the classical first-order scheme. Some of their results are shown in Figures 9.4
through 9.7. Example 1 in Figures 9.4 through 9.6 shows specific numerical
solutions obtained with first- and second-order discretization in space. Example 2 in
Figure 9.7 shows the first- and second-order convergence in the time discretization.

[Figure: surface plot of the solution over the (X, Y) plane.]

FIGURE 9.4 Exact solution of the rotating Gaussian hill problem with T 2 (Source:
Nogueiras [198]).

[Figure: surface plot of the solution over the (X, Y) plane.]

FIGURE 9.5 Second-order characteristics with second-order Q2h FE. Numerical solution for
the rotating Gaussian hill problem with T 2. Mesh parameters are h 0 015625 and
t 0 01. (Source: Nogueiras [98]).

[Figure: surface plot of the solution over the (X, Y) plane.]

FIGURE 9.6 Second-order characteristics with first-order Q1h FE. Numerical solution for the
rotating Gaussian hill problem with T 2. Mesh parameters are h 0 015625 and
t 0 01. (Source: Nogueiras [98]).

[Figure: $l^\infty((0,T); L^2(\Omega))$ error curve. Relative error (%) against N, the number of time steps, in log-log scale, for the first- and second-order Lagrange-Galerkin schemes, together with the reference curves y = C/N and y = C/N².]
FIGURE 9.7 Second-order $Q^2_h$ FE and different characteristics methods. $l^\infty(L^2)$ norm of
the numerical error in log-log scale for a convection-(strongly degenerate)-diffusion-reaction
problem with variable coefficients. (Source: Nogueiras [98]).

Overall, the second-order scheme outperforms the first-order scheme in terms of the
trade-off between speed and accuracy.

9.6 APPLICATION TO PRICING OF CONVERTIBLE BONDS

In this section, we apply the numerical methods described in this chapter to the
valuation of convertible bonds. We consider the intensity-based framework for
pricing convertible bonds described in section 4.4.
We study the convergence of the numerical method using the special case of a
bond convertible only at expiry. Then we show prices for a real bond. Section 9.6.1
describes the numerical solution, and section 9.6.2 gives the numerical results.

9.6.1 Numerical Solution

The valuation of convertible bonds can be considered as a special case of the more
general two-factor model presented in section 9.2. Moreover, the model in section
4.4.2 is a special case of Problem 3 for the choices:

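$$x_1 = r_t, \tag{9.83}$$
$$x_2 = S_t, \tag{9.84}$$
$$A_{11} = \tfrac12\,\sigma_r^2, \qquad A_{12} = A_{21} = \tfrac12\,\rho\,S_t\,\sigma_S\,\sigma_r, \qquad A_{22} = \tfrac12\,\sigma_S^2\,S_t^2, \tag{9.85}$$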
B1 r, B2 rt dt qt t t , (9.86)
A0 rt t, F t Vt St , t , (9.87)

and

$$R_1(r_t, S_t, t) = \max(nS_t, MP_t), \tag{9.88}$$
$$R_2(r_t, S_t, t) = \max(nS_t, MC_t). \tag{9.89}$$

The Interest Rate Model We assume the interest rate follows the extended Vasicek
model introduced in 3.1. This model combines tractability with the flexibility to
calibrate to a prespecified initial term structure. We recall that the short-rate process
under the EMM is

drt t t rt dt r dWt , (9.90)

where (t) can be chosen so that model spot rates coincide with market spot rates.

9.6.2 Numerical Results


In this section, we show numerical results obtained when using classical Lagrange-
Galerkin methods to price an actual CB, the Adidas-Salomon issue maturing on
October 8, 2018. The evaluation date is December 16, 2005; hence, the time to
maturity of the convertible expressed in years is

$$T = 12.8192. \tag{9.91}$$

The bond has face value

$$F = 50000 \text{ EUR}, \tag{9.92}$$

and can be converted until September 20, 2018, at the rate (see table 9.1)

$$n = 490.1961. \tag{9.93}$$

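
The quoted time to maturity can be reproduced with a short calculation. The Python sketch below assumes an actual/365 day count; this convention is not stated in the text and is used here only because it recovers the quoted figure.

```python
from datetime import date

valuation = date(2005, 12, 16)  # evaluation date given in the text
maturity = date(2018, 10, 8)    # maturity date given in the text

days = (maturity - valuation).days
T = days / 365.0                # assumption: actual/365 day count
T_rounded = round(T, 4)
```

Under this assumption the year fraction rounds to 12.8192.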

TABLE 9.1 Conversion schedule for Adidas-Salomon convertible bond

From Date    To Date      Conversion Ratio
18-Nov-03    20-Sep-18    490.1961

TABLE 9.2 Call schedule for Adidas-Salomon convertible bond

From Date    To Date      Call Price (% Par)    Trigger Level
8-Oct-09     7-Oct-12     100                   132.6
8-Oct-12     7-Oct-15     100                   117.3
8-Oct-15     8-Oct-18     100

TABLE 9.3 Put schedule for Adidas-Salomon convertible bond

Date         Put Price (% Par)
8-Oct-09     100
8-Oct-12     100
8-Oct-15     100

The Adidas-Salomon issue is continuously soft-callable; that is, the stock price
has to be above the trigger level before the call can be exercised. The call schedule is
in table 9.2.
It is also puttable at par at three-year intervals, as shown in table 9.3.
The bond pays a 2.5% coupon annually on July 12.7
We assume a constant volatility for the underlying stock of $\sigma_S = 23\%$, a
continuous dividend yield $d = 1.5868\%$ and a repo rate $q = 0.4\%$. We obtain
0 0188 for the correlation between the short rate and the equity, using the
one-month EUROLIBOR as a proxy for the instantaneous rate (see figures 9.8
and 9.9).8
7 This issue has a call announcement period of 45 days and a conversion announcement period
of 14 days. It also has the so-called French dividend conversion, meaning that the shares
received upon conversion do not pay those dividends paid by ordinary shares between the
date of conversion and the end of the fiscal year in which conversion occurs. All those features
have been ignored for the sake of simplicity.
8 Both time series were obtained from Bloomberg.

FIGURE 9.8 Adidas-Salomon daily stock price from January 1, 1998, to January 12, 2006.
(Vertical axis: price in EUR; horizontal axis: date.)

FIGURE 9.9 One-month LIBOR daily rates from January 1, 1998, to January 12, 2006.
(Vertical axis: rate; horizontal axis: date.)

The extended Vasicek model (9.90) has been calibrated to market data as of
December 16, 2005 (see section 3.3.1). The following values were obtained for the
interest rate volatility parameters:

0.0203, (9.95)

$\sigma_r = 0.6868\%$ (9.96)

We use for the default specification 0.0055, 1 and $R = 40\%$.

The instantaneous interest rate is $r = 2.3804\%$ and the stock price $S = 157.54$.

Convergence test In order to test the numerical method, we consider the special
case of a bond convertible only at expiry, for which we have an analytical solution
(see section 4.4.4) and can therefore compute the errors.
We set

$R_1(r_t, S_t, t) = 0$, (9.97)
$R_2(r_t, S_t, t) = \infty$, (9.98)

given that there are no embedded early-exercise options.

Domain bounds are set to $r \in [0, 1.5]$ and $S \in [9, 2077]$. The range for $S$
corresponds to roughly a 99.9% confidence interval on $S_T$. We give $L^2$ errors over both the
entire domain and a narrower region of interest, where $r \in [0, 0.15]$ and
$S \in [16, 1152]$; this range for $S$ is roughly a 99% confidence interval on $S_T$.
The region of interest reflects a range of values of r and S likely to be observed in
practice, and so the error on it is likely to be more representative.
We present results obtained for successive grid refinements for the relative error
in $L^2$. Mesh 1 is the coarsest with just 15 space steps in the interest rate dimension,
40 in the stock dimension, and 120 time steps up to time $T = 3.5$. Each successive
mesh doubles both the number of space steps in each dimension and the number of
time steps so that the finest mesh, mesh 4, has 120 interest rate steps, 320 equity
steps, and 280 time steps. We use as a benchmarking measure the total relative
error defined as

$$\frac{\left(\int_0^T \|\mathrm{error}_t\|_{L^2}^2\,dt\right)^{1/2}}{\left(\int_0^T \|\text{exact solution}_t\|_{L^2}^2\,dt\right)^{1/2}},$$

where

$$\|f\|_{L^2} := \left(\int_\Omega f^2\,d\Omega\right)^{1/2},$$

and

$$\mathrm{error}_t = \text{exact solution}_t - \text{numerical solution}_t$$

TABLE 9.4 Error and convergence

                    Convergence               Convergence
Mesh   Error TD     Factor       Error RI     Factor
1      8.36E-03                  5.25E-03
2      4.59E-03     1.82         2.99E-03     1.76
3      2.45E-03     1.88         1.92E-03     1.56
4      1.28E-03     1.91         1.19E-03     1.61
5      6.66E-04     1.92         7.91E-04     1.51
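The "Factor" columns of table 9.4 can be recomputed from the error columns. The sketch below does this and also converts each factor into an empirical convergence order, under the assumption (taken from the mesh description) that every refinement halves the space and time steps. The function names are illustrative.

```python
import math

# Relative L2 errors from table 9.4 (entire domain and region of interest).
error_td = [8.36e-3, 4.59e-3, 2.45e-3, 1.28e-3, 6.66e-4]
error_ri = [5.25e-3, 2.99e-3, 1.92e-3, 1.19e-3, 7.91e-4]


def reduction_factors(errors):
    """Ratio of each error to the error on the next finer mesh."""
    return [coarse / fine for coarse, fine in zip(errors, errors[1:])]


def observed_orders(errors):
    """Empirical order, assuming each refinement halves the step sizes."""
    return [math.log2(f) for f in reduction_factors(errors)]


for name, errs in (("TD", error_td), ("RI", error_ri)):
    print(name,
          [round(f, 2) for f in reduction_factors(errs)],
          [round(p, 2) for p in observed_orders(errs)])
```

Because the tabulated errors are rounded to three significant figures, the recomputed factors can differ from the printed ones in the last digit.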
FIGURE 9.10 Numerical solution at evaluation date.

FIGURE 9.11 Lagrange multiplier at evaluation date. (Surface plot; vertical axis:
Lagrange multiplier; one horizontal axis: r.)

The analytical formula for the "exact solution" was given in (4.27); the value
of the "exact solution" for the current level of the interest rate and the stock price
is 79674.4525.
The numerical results are presented in table 9.4. On the boundaries we use the
analytical solution. In each case two of the boundaries are Dirichlet and two are
Neumann. "Error TD" is the error on the entire domain; "Error RI" is the error
on the region of interest. "Factor" is the progressive error reduction factor in
moving to a finer mesh level from the preceding mesh level.
As mentioned in section 9.4.3, the classical Lagrange-Galerkin method is
unconditionally stable and has convergence order of
$O(h) + O(h^2/\Delta t) + O(\Delta t)$ under
suitable conditions for the coefficients of the equation. Although our models do
not satisfy the required assumptions, the same error estimate has been obtained
empirically. In table 9.4 it can be seen that the ratio between two consecutive
errors tends to 2, which is consistent with the order of convergence given
above.

Pricing of a real CB Finally, we show the numerical solution for the Adidas-Salomon
issue maturing on October 8, 2018, as of December 16, 2005. Figure 9.10 shows
the CB prices, and figure 9.11 the Lagrange multiplier. Results were computed with
mesh 4.

9.7 APPENDIX: LAGRANGE TRIANGULAR FINITE ELEMENTS

9.7.1 Lagrange Triangular Finite Elements in $\mathbb{R}^d$

The domain $\Omega \subset \mathbb{R}^d$ is decomposed into simplices of dimension d
(triangles if d = 2, tetrahedra if d = 3, etc.) and the space $V_h$ is the space of
continuous functions in $\Omega$
that are polynomials of degree smaller than or equal to k over any single simplex.
A set of d + 1 points $c_1, \ldots, c_{d+1}$ not lying on the same hyperplane is considered,
that is, such that the matrix

$$\begin{pmatrix} c_{11} & \cdots & c_{1d} & 1 \\ \vdots & & \vdots & \vdots \\ c_{d1} & \cdots & c_{dd} & 1 \\ c_{d+1,1} & \cdots & c_{d+1,d} & 1 \end{pmatrix}$$ (9.99)

has a nonzero determinant.

The convex hull of these d + 1 points

$$K = \Big\{ x = \sum_{i=1}^{d+1} \lambda_i c_i :\ 0 \le \lambda_i \le 1,\ 1 \le i \le d+1,\ \sum_{i=1}^{d+1} \lambda_i = 1 \Big\},$$ (9.100)

is called a d-dimensional simplex in $\mathbb{R}^d$.

If $x \in K$ the corresponding $\lambda_i(x) = \lambda_i$ are known as barycentric coordinates of
x. Notice that

$$\lambda_i(c_j) = \delta_{ij},$$

and that $\lambda_i$ is an affine function (polynomial of degree one) in the variables $x_j$.

The subsets of K obtained when the following conditions are imposed

$$\lambda_{i_1} = \lambda_{i_2} = \cdots = \lambda_{i_r} = 0,$$

are called (d - r)-dimensional faces of K (see footnote 9).

The barycenter of K is the point that has all the barycentric coordinates equal,

$$\lambda_i = \frac{1}{d+1}, \quad 1 \le i \le d+1.$$

Let K be a d-simplex and k a positive integer; the subset of points of K

$$\Sigma_K^k = \Big\{ x \in K :\ \lambda_j(x) \in \Big\{0, \frac{1}{k}, \ldots, \frac{k-1}{k}, 1\Big\},\ 1 \le j \le d+1 \Big\},$$ (9.101)

is called a lattice of order k in K. Note that the lattice of order k in K has
$\binom{d+k}{k}$ elements.
Let $P_k$ be the space of polynomials of degree equal to or less than k. Since a
homogeneous polynomial of d variables and degree j has $\binom{d+j-1}{j}$ terms, the dimension
of $P_k$ is

$$1 + \binom{d}{1} + \binom{d+1}{2} + \cdots + \binom{d+k-1}{k} = \binom{d+k}{k}$$ (9.102)

It can be shown that any polynomial of degree k is uniquely determined by its values
at the $\binom{d+k}{k}$ points of the lattice of order k in K.
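The counting identities above are easy to check numerically. The following sketch enumerates the lattice of order k through integer numerators of the barycentric coordinates and compares its size with the dimension of the polynomial space; the function names are illustrative, not part of the original text.

```python
from itertools import product
from math import comb


def lattice_points(d, k):
    """Order-k lattice of a d-simplex, as integer tuples (k * lambda_1, ..., k * lambda_{d+1})."""
    return [p for p in product(range(k + 1), repeat=d + 1) if sum(p) == k]


def dim_polynomials(d, k):
    """Dimension of P_k in d variables, summing homogeneous pieces of degree 0..k."""
    return sum(comb(d + j - 1, j) for j in range(k + 1))


for d in (1, 2, 3):
    for k in (1, 2, 3):
        print(d, k, len(lattice_points(d, k)), dim_polynomials(d, k), comb(d + k, k))
```

For triangles (d = 2) this gives 3 nodes for k = 1 (the vertices) and 6 nodes for k = 2 (vertices plus edge midpoints), which is why nodes need not be vertices.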
With the triple $(K, \Sigma_K^k, P_k)$ we will build spaces of approximation $V_h^k$.
Let $\tau_h$ be a partition of $\bar\Omega$ into simplices such that every face of a simplex $K_i$ of
$\tau_h$ is either:

■ A subset of $\Gamma_D$,
■ A subset of $\Gamma_R$,

or

■ A face of another simplex $K_j$ of $\tau_h$; in such a case $K_i$ and $K_j$ are said to be
adjacent.

The diameter of K is denoted by $h_K$ and $h = \max\{h_K,\ K \in \tau_h\}$.

For $k \ge 1$ the following functional spaces are built associated to $\tau_h$

$$V_h^k = \{\varphi_h \in C^0(\bar\Omega) :\ \varphi_h|_K \in P_k(K)\ \ \forall K \in \tau_h\}$$ (9.103)

Clearly,

$$V_h^k \subset \prod_{K \in \tau_h} P_k(K)$$ (9.104)

9 For d = 3, those are edges, faces, and vertices.


This inclusion simply states that any function of $V_h^k$ is a polynomial of degree equal
to or less than k over each individual element. Conversely, what is the necessary
and sufficient condition for an element of $\prod_{K \in \tau_h} P_k(K)$ to be in $V_h^k$, that is, for the
polynomial pieces to stick with continuity? The above will hold if and only if for
every pair of adjacent elements $K_i$ and $K_j$, the pieces defined on them agree on the
points

$$\Sigma_{K_i}^k \cap \Sigma_{K_j}^k$$ (9.105)

Therefore, any function in $V_h^k$ is uniquely determined by its values at the points of
the set

$$\Sigma_h^k = \bigcup_{K \in \tau_h} \Sigma_K^k$$ (9.106)

From now on, $\tau_h$ will be called a triangulation of $\bar\Omega$ (even if the dimension d is
different from 2) and the elements of $\Sigma_h^{(k)}$ nodes of the triangulation. Notice that
there can be nodes that are not vertices.
The dimension of space $V_h^k$ is the same as the number of nodes. Also, it is
possible to define a basis of $V_h^k$ such that its elements are functions with support
reduced to a few elements of $\tau_h$. Let $N_h$ be the number of nodes that we will assume
to be numbered.

$$\Sigma_h^k = \{q_i :\ i = 1, \ldots, N_h\}.$$ (9.107)

Node $q_i$ contributes to the basis with function $\varphi_i \in V_h^k$ uniquely determined by

$$\varphi_i(q_j) = \delta_{ij}, \quad 1 \le j \le N_h.$$ (9.108)

9.7.2 Coefficients Matrix and Independent Term in Two Dimensions


We consider the problem in two dimensions (d = 2). Let us see how to organize the
calculations to build $A_h$ and $b_h$ in (9.64) and (9.65), respectively, if we choose the
space of Lagrange triangular finite elements of degree one,

$$V_h^1 = \{\varphi_h \in C^0(\bar\Omega) :\ \varphi_h|_K \in P_1(K)\ \ \forall K \in \tau_h\}$$ (9.109)

First, we consider the calculations ignoring the boundary condition on $\Gamma_D$.

Let $\Sigma_h^{(1)}$ be the set of nodes of the triangulation $\tau_h$ that we will assume to be
numbered

$$\Sigma_h^{(1)} = \{q_i :\ i = 1, \ldots, N_h\}$$ (9.110)

Notice that the dimension of $V_h^1$ equals $N_h$. Any node $q_j$ defines an element of the
basis $\{\varphi_i\}$ of $V_h^1$ such that

$$\varphi_i(q_j) = \delta_{ij}, \quad 1 \le i, j \le N_h$$ (9.111)
Then the solution $u_h$ can be written as

$$u_h = \sum_{j=1}^{N_h} \xi_j \varphi_j \quad \text{and} \quad \xi_j = u_h(q_j)$$

Therefore, the column vector $(u_h(q_1), \ldots, u_h(q_{N_h}))^t$ is the solution of the linear
system (9.63). The calculation of $A_h$ and $b_h$ using formulas (9.64) and (9.65),
respectively, is inefficient because the same integrals are calculated several times over
the same triangles. The method described below, which is the one used in practice,
is based on the concepts of elementary matrix and assembling. The idea is that we
will compute the contribution to the matrix and right-hand-side vector over each
individual element of the triangulation and then we will assemble them together in
a systematic way to build the global approximation of the solution.
Let us recall that the discretized problem in two dimensions (ignoring Dirichlet
boundary conditions) can be written as follows:

n 1 (1)
Problem 13 Find uh h,m 1 h
such that

2
uh vh
aij dx
xj xi
i,j 1

a0 uh vh dx

uh vh d
R

(1)
f vh dx gvh d vh h
, (9.112)
R

where

1
a0 a0
t
1 n
f f Xe pnh,m1
t h

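Let us consider the first term of the left-hand side of this equality. We have that

$$\sum_{i,j=1}^{2} \int_\Omega a_{ij} \frac{\partial u_h}{\partial x_j} \frac{\partial v_h}{\partial x_i}\,dx = \sum_{K \in \tau_h} \int_K \begin{pmatrix} \frac{\partial v_h}{\partial x_1} & \frac{\partial v_h}{\partial x_2} \end{pmatrix} \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} \frac{\partial u_h}{\partial x_1} \\ \frac{\partial u_h}{\partial x_2} \end{pmatrix} dx$$ (9.113)

Let $c_1^K, c_2^K, c_3^K$ be the vertices of the triangle K and $m_1^K, m_2^K, m_3^K$ the corresponding
indices in the numbering of $\Sigma_h^{(1)}$, that is, assume

$$c_1^K = q_{m_1^K}, \quad c_2^K = q_{m_2^K}, \quad c_3^K = q_{m_3^K}$$ (9.114)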
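Let $v_h \in V_h^{(1)}$; then $v_h|_K = \sum_{i=1}^{3} v_h(c_i^K)\,\varphi_i^K$, where $\varphi_i^K$ is the only polynomial of degree
equal to or less than one, such that

$$\varphi_i^K(c_j^K) = \delta_{ij}$$ (9.115)

Equivalently,

$$v_h|_K = \begin{pmatrix} \varphi_1^K & \varphi_2^K & \varphi_3^K \end{pmatrix} \begin{pmatrix} v_h(c_1^K) \\ v_h(c_2^K) \\ v_h(c_3^K) \end{pmatrix} = [\varphi^K]\,(v_h|_K)$$ (9.116)

Therefore,

$$\begin{pmatrix} \frac{\partial v_h}{\partial x_1} \\ \frac{\partial v_h}{\partial x_2} \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{3} v_h(c_i^K) \frac{\partial \varphi_i^K}{\partial x_1} \\ \sum_{i=1}^{3} v_h(c_i^K) \frac{\partial \varphi_i^K}{\partial x_2} \end{pmatrix} = \begin{pmatrix} \frac{\partial \varphi_1^K}{\partial x_1} & \frac{\partial \varphi_2^K}{\partial x_1} & \frac{\partial \varphi_3^K}{\partial x_1} \\ \frac{\partial \varphi_1^K}{\partial x_2} & \frac{\partial \varphi_2^K}{\partial x_2} & \frac{\partial \varphi_3^K}{\partial x_2} \end{pmatrix} \begin{pmatrix} v_h(c_1^K) \\ v_h(c_2^K) \\ v_h(c_3^K) \end{pmatrix} = [D\varphi^K]\,(v_h|_K)$$ (9.117)

Substitution of (9.117) into (9.113) yields

$$\sum_{i,j=1}^{2} \int_\Omega a_{ij} \frac{\partial u_h}{\partial x_j} \frac{\partial v_h}{\partial x_i}\,dx = \sum_{K \in \tau_h} \int_K (v_h|_K)^t\,[D\varphi^K]^t\,[A]\,[D\varphi^K]\,(u_h|_K)\,dx$$ (9.118)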

A similar process for the other terms leads to the following formulation of Prob-
lem 13:

t
(vh K )t D K
A D K
dx
K
K h

t t
K K K K
a0 dx d (uh K )
K K R

t t
(vh K )t K
f dx K
gd (9.119)
K K R
K h

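We introduce the Boolean matrix

$$[M_K] = \begin{pmatrix} 0 & \cdots & 1 & \cdots & 0 & \cdots & 0 & \cdots & 0 \\ 0 & \cdots & 0 & \cdots & 1 & \cdots & 0 & \cdots & 0 \\ 0 & \cdots & 0 & \cdots & 0 & \cdots & 1 & \cdots & 0 \end{pmatrix},$$

whose ones sit in columns $m_1^K$, $m_2^K$, and $m_3^K$ of rows 1, 2, and 3, respectively,
such that, for any vector v of $N_h$ components

$$[M_K]\,v = \begin{pmatrix} v_{m_1^K} \\ v_{m_2^K} \\ v_{m_3^K} \end{pmatrix} = (v|_K),$$ (9.120)

that is, matrix $[M_K]$ selects among the set of all degrees of freedom $v \in \mathbb{R}^{N_h}$ the
three that correspond to the element K.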
In that way, Problem 13 can be written as

t t
(vh )t MK D K
A D K
dx
K
K h

t t
K K K K
a0 dx d MK (uh )
K K R

t t t
(vh )t MK K
fdx K
gd (9.121)
K K R
K h

Notice that this equality must be satisfied for all $v_h \in \mathbb{R}^{N_h}$; therefore,

$$A_h = \sum_{K \in \tau_h} [M_K]^t\,[A_h^K]\,[M_K] \quad \text{and} \quad b_h = \sum_{K \in \tau_h} [M_K]^t\,b_h^K,$$ (9.122)

where

t
K K K
h D A D dx
K
t
K K
a0 dx
K
t
K K
d , (9.123)
K R

and

t t
bK
h
K
fdx K
gd (9.124)
K K R

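The 3 x 3 matrix $[A_h^K]$ is often called the elementary matrix, and the three-component
vector $b_h^K$ is the elementary right-hand side, corresponding to the
element K.
The operations in (9.122) are known as assembling of the
elementary matrix and the right-hand-side terms. Let us see how it works in
practice. By definition,

$$([M_K])_{ij} = \delta_{m_i^K j}$$ (9.125)

Hence,

$$\big([M_K]^t [A_h^K] [M_K]\big)_{ij} = \sum_{s=1}^{3} \big([M_K]^t [A_h^K]\big)_{is} ([M_K])_{sj} = \sum_{s=1}^{3} \sum_{r=1}^{3} ([M_K])_{ri}\,([A_h^K])_{rs}\,([M_K])_{sj} = \sum_{s=1}^{3} \sum_{r=1}^{3} \delta_{m_r^K i}\,([A_h^K])_{rs}\,\delta_{m_s^K j},$$

and therefore,

$$\big([M_K]^t [A_h^K] [M_K]\big)_{ij} = \begin{cases} 0 & \text{if } i \ne m_r^K \text{ or } j \ne m_s^K \ \ \forall r, s = 1, 2, 3 \\ ([A_h^K])_{rs} & \text{if } i = m_r^K \text{ and } j = m_s^K, \ \ r, s = 1, 2, 3 \end{cases}$$ (9.126)

In that way, for the calculation of $A_h$ and $b_h$ the following algorithm can be used:

■ Initialize $A_h$ and $b_h$ to zero.
■ Do a loop over the elements of $\tau_h$. For every $K \in \tau_h$, compute $[A_h^K]$ and $b_h^K$
and then define, for r, s = 1, 2, 3,

$$(A_h)_{m_r^K m_s^K} = (A_h)_{m_r^K m_s^K} + ([A_h^K])_{rs}$$

$$(b_h)_{m_r^K} = (b_h)_{m_r^K} + (b_h^K)_r$$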
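The assembling loop above can be sketched in a few lines. This is a minimal illustration with dense Python lists and 0-based node indices; the function name `assemble` and the callback `elementary` are hypothetical, and a production code would normally use a sparse matrix format instead.

```python
def assemble(num_nodes, elements, elementary):
    """Assemble the global matrix and right-hand side from element contributions.

    elements:   list of triples (m1, m2, m3) of global node indices (0-based here).
    elementary: function taking the element index and returning the pair
                (3x3 elementary matrix, length-3 elementary right-hand side).
    """
    A = [[0.0] * num_nodes for _ in range(num_nodes)]
    b = [0.0] * num_nodes
    for K, nodes in enumerate(elements):
        A_K, b_K = elementary(K)
        for r, m_r in enumerate(nodes):
            b[m_r] += b_K[r]                  # (b_h)_{m_r} += (b_h^K)_r
            for s, m_s in enumerate(nodes):
                A[m_r][m_s] += A_K[r][s]      # (A_h)_{m_r m_s} += (A_h^K)_{rs}
    return A, b


# Two triangles sharing the edge between nodes 1 and 2, with dummy contributions.
elements = [(0, 1, 2), (1, 3, 2)]
ones = ([[1.0] * 3 for _ in range(3)], [1.0] * 3)
A, b = assemble(4, elements, lambda K: ones)
print(A)
print(b)
```

Entries belonging to the shared nodes receive contributions from both triangles, while the entry coupling nodes 0 and 3, which share no element, stays at zero. The Boolean matrices of (9.120) are never formed explicitly.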

Change to the Reference Element The integrals that appear in the calculation of $[A_h^K]$
and $b_h^K$ will be done through a change of variable to the reference element. Integrals
on the boundary and on the interior have to be dealt with differently. Therefore, we
will denote
1 t
K K K
h D A D dx
K
t
K K
a0 dx (9.127)
K
2 t
K K K
h d , (9.128)
K R

and
1 t
bK
h
K
fdx (9.129)
K
2 t
bK
h
K
gd (9.130)
K R

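Let $\hat K$ be the triangle of vertices

$$\hat c_1 = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad \hat c_2 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \hat c_3 = \begin{pmatrix} 0 \\ 1 \end{pmatrix},$$ (9.131)

that we will call the reference triangle. Let K be any element of $\tau_h$. There exists a
unique affine invertible mapping $F_K : \hat K \to K$ such that $F_K(\hat c_i) = c_i^K$ for i = 1, 2, 3. It
is the mapping

$$F_K(\hat x) = [C_K]\,\hat x + c_1^K,$$ (9.132)

where $[C_K]$ is the matrix

$$[C_K] = (c_2^K - c_1^K,\ c_3^K - c_1^K)$$ (9.133)

It is easy to check that

$$\varphi_i^K \circ F_K = \hat\varphi_i, \quad i = 1, 2, 3,$$ (9.134)

where

$$\hat\varphi_1(\hat x_1, \hat x_2) = 1 - \hat x_1 - \hat x_2,$$ (9.135)
$$\hat\varphi_2(\hat x_1, \hat x_2) = \hat x_1,$$ (9.136)
$$\hat\varphi_3(\hat x_1, \hat x_2) = \hat x_2$$ (9.137)

Indeed, $\varphi_i^K \circ F_K \in P_1(\hat K)$ and also $(\varphi_i^K \circ F_K)(\hat c_j) = \varphi_i^K(c_j^K) = \delta_{ij}$.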

The formula of the change of variable is

dx FK det CK dx (9.138)
K K

On the other hand, by the chain rule, we have that

FK 1 FK 2 K
x1 x x1 x x1 x x1 FK x
(9.139)
FK 1 FK 2 K
x2 x x2 x x2 x FK x
x2

Therefore,

K
x1 t x1
K
CK 1 , (9.140)
x2 x2

or, in summarized form,

t
D K
CK 1 D (9.141)

K
Substituting this expression for [D ] in (9.127) we obtain

1 t t
K
h D CK 1 A CK 1 D det CK dx
K
t
a0 det CK dx
K

If we introduce the following notation:

$$[G_K] = [C_K^{-1}]\,[A]\,[C_K^{-1}]^t$$

$$\delta_K = |\det C_K|,$$

we may write

1 t
K
h K D [GK ] D dx
K
t
K a0 dx (9.142)
K

Therefore,

2
1 t
K
h K D [GK ] D dx K a0 dx
, 1 K K

2
K [GK ] dx K a0 dx (9.143)
x̂ x̂
, 1 K K

If the coefficients aij , a0 are constant in K, then [GK ] is constant in K and

2
1
K
h K [GK ] dx K a0 dx (9.144)
K x̂ x̂ K
, 1

The numbers

H dx and J dx, (9.145)


K x̂ x̂ K

do not depend on the element considered and are calculated just once. Also, notice
that

r s r!s!
H H , J J and dx (9.146)
K (r s 2)!

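In this way, just the matrix $[G_K]$ and $a_0$ depend on the element. The matrix $[G_K]$ is
worked out using the values of $a_{ij}$ and the coordinates of the vertices $c_i^K$. Completing the
calculations described, we obtain

$$[A_h^K]^{(1)} = \frac{\delta_K}{2} \begin{pmatrix} g_{11} + 2g_{12} + g_{22} & -(g_{11} + g_{21}) & -(g_{12} + g_{22}) \\ -(g_{11} + g_{12}) & g_{11} & g_{12} \\ -(g_{21} + g_{22}) & g_{21} & g_{22} \end{pmatrix} + \frac{a_0\,\delta_K}{12} \begin{pmatrix} 1 & 1/2 & 1/2 \\ 1/2 & 1 & 1/2 \\ 1/2 & 1/2 & 1 \end{pmatrix},$$ (9.147)

where

$$g_{11} = \delta_K^{-2} \Big[ a_{11} (c_{32}^K - c_{12}^K)^2 + (a_{12} + a_{21})(c_{11}^K - c_{31}^K)(c_{32}^K - c_{12}^K) + a_{22} (c_{11}^K - c_{31}^K)^2 \Big],$$

$$g_{12} = g_{21} = \delta_K^{-2} \Big[ a_{11} (c_{12}^K - c_{22}^K)(c_{32}^K - c_{12}^K) + a_{12} (c_{21}^K - c_{11}^K)(c_{32}^K - c_{12}^K) + a_{21} (c_{12}^K - c_{22}^K)(c_{11}^K - c_{31}^K) + a_{22} (c_{21}^K - c_{11}^K)(c_{11}^K - c_{31}^K) \Big],$$

$$g_{22} = \delta_K^{-2} \Big[ a_{11} (c_{12}^K - c_{22}^K)^2 + (a_{12} + a_{21})(c_{21}^K - c_{11}^K)(c_{12}^K - c_{22}^K) + a_{22} (c_{21}^K - c_{11}^K)^2 \Big]$$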
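As an illustration of formula (9.147), the sketch below builds the interior part of the elementary matrix for one triangle with constant coefficients. It is not the authors' code: the function name and argument layout are assumptions, and $g_{12}$ and $g_{21}$ are computed separately from $[G_K] = [C_K^{-1}][A][C_K^{-1}]^t$ rather than assumed equal, so the sum $g_{11} + g_{12} + g_{21} + g_{22}$ replaces $g_{11} + 2g_{12} + g_{22}$.

```python
def elementary_matrix(c, a, a0):
    """Interior part of the 3x3 elementary matrix for a P1 triangle.

    c:  three vertices [(x, y), (x, y), (x, y)]
    a:  constant 2x2 coefficients [[a11, a12], [a21, a22]]
    a0: constant zeroth-order coefficient
    """
    (x1, y1), (x2, y2), (x3, y3) = c
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)  # det of C_K
    delta = abs(det)
    # Rows of the inverse of C_K = (c2 - c1, c3 - c1).
    p = ((y3 - y1) / det, (x1 - x3) / det)
    q = ((y1 - y2) / det, (x2 - x1) / det)

    def form(u, v):
        """Bilinear form u^t a v."""
        return sum(u[i] * a[i][j] * v[j] for i in range(2) for j in range(2))

    g11, g12, g21, g22 = form(p, p), form(p, q), form(q, p), form(q, q)
    stiffness = [
        [g11 + g12 + g21 + g22, -(g11 + g21), -(g12 + g22)],
        [-(g11 + g12), g11, g12],
        [-(g21 + g22), g21, g22],
    ]
    mass = [[1.0 if i == j else 0.5 for j in range(3)] for i in range(3)]
    return [[0.5 * delta * stiffness[i][j] + a0 * delta / 12.0 * mass[i][j]
             for j in range(3)] for i in range(3)]


# Reference triangle with the identity as diffusion matrix and no mass term.
for row in elementary_matrix([(0, 0), (1, 0), (0, 1)], [[1, 0], [0, 1]], 0.0):
    print(row)
```

Two quick sanity checks follow from the construction: with $a_0 = 0$ every row sums to zero (constants have zero gradient), and with $a_{ij} = 0$, $a_0 = 1$ the entries sum to the area of the triangle.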

Similarly, for the right-hand-side term we have that


1 t
bK
h K
K
f FK x dx (9.148)
K

We proceed to compute the boundary integrals of the elementary matrix and the
right-hand side by a change of variable to the reference element. In order to do so,
we will define parameterizations of the edges of K
1
Edge 1: K( ˆ ) FK ˆ , 0 cK K
2 c1 ˆ cK
1

2
Edge 2: K( ˆ ) FK 1 ˆ, ˆ cK K
3 c2 ˆ cK
2

3
Edge 3: K( ˆ ) FK 0, 1 ˆ cK K
1 c3 ˆ cK
3

Therefore,
1
l
d K ˆ cK
l 1 cK
l dˆ cK
4 cK
1 , (9.149)
l edge 0

and finally we have


3 1
2
K l
h K cK K
l 1 cl
l
K ˆ l
ˆ l
ˆ dˆ,
0
l 1

3 1
2
bK
h
l
K cK K
l 1 cl
l
ˆ g l
K ˆ dˆ,
0
l 1

where
1 2 3
(ˆ ) ˆ ,0 , (ˆ) 1 ˆ, ˆ , (ˆ) 0, 1 ˆ , (9.150)

and

l 1 if l R
K (9.151)
0 otherwise

The integrals that appear in $b_h^K$, as well as the ones that appear
in the elementary matrix when the coefficients $a_{ij}$ and $a_0$ are not constant in the
element K, are computed via numerical integration. It can be shown ([180]) that the error
does not increase if an appropriate formula, which depends on the finite element
space, is used.

CHAPTER 10
American Monte Carlo

10.1 INTRODUCTION

Traditionally, the numerical techniques for pricing derivatives fall into two distinct
categories: Monte Carlo simulation and backwards induction methods such as trees
or PDE methods. The following table sums up the strengths of each type:

                     Monte Carlo    Trees/PDEs
Early exercise       N              Y
Path dependent       Y              N
Many underlyings     Y              N

In the most general terms, a derivative consists of a series of payments that
depend on decisions made by the two parties in the contract and the values of some
quantities observable in the market: the underlyings, which we will model with some
stochastic processes. The payments can depend on the values of the underlyings
observed on the date of the payment or on some functional of the paths followed by
the underlyings up to the payment date.
To price derivatives using PDE methods, we must be able to represent the
price of the derivative at time t as a function of a small number of state variables.
For path-dependent options, the number of state variables may be larger than the
number of underlyings. For instance, were we pricing an Asian option, our state-
space at time t would have to include the current stock price and the running average
stock price.
If we assume that our stock price follows the Black-Scholes SDE, and that the
average is defined as

$$A_t = \frac{\sum_{t_i \le t} S_{t_i}}{\sum_{t_i \le t} 1},$$

for some set of averaging dates $t_i$, then we have the following PDE for the price of
the derivative in between averaging dates:

$$\frac{\partial V}{\partial t} = r_t V - r_t S \frac{\partial V}{\partial S} - \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}$$ (10.1)

Across averaging dates we have the condition

$$V(S_{t_i}, A_{t_i^-}, t_i^-) = V(S_{t_i}, A_{t_i^+}, t_i^+),$$ (10.2)

where

$$A_{t_i^+} = \frac{1}{i+1} \left( S_{t_i} + i A_{t_i^-} \right)$$ (10.3)

At the maturity of the option, we calculate the final payment as a function of both S
and A, suitably discretized, then evolve it back to the previous averaging date using
equation (10.1). At that date, we can calculate the value of the option as a function
of S and A before the averaging using equations (10.2) and (10.3).
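The update of the running average across an averaging date, equation (10.3), is a one-liner. The sketch below applies it along a path of fixings and shows that it reproduces the plain arithmetic mean; it takes i to be the number of fixings already included before the current date (so the first averaging date has i = 0), and the function name and sample fixings are illustrative.

```python
def update_running_average(average_before, spot, fixings_so_far):
    """Equation (10.3): average just after a fixing, given i earlier fixings."""
    i = fixings_so_far
    return (spot + i * average_before) / (i + 1)


fixings = [100.0, 104.0, 98.0, 110.0]
average = 0.0  # value is irrelevant before the first fixing (weighted by i = 0)
for i, spot in enumerate(fixings):
    average = update_running_average(average, spot, i)

print(average, sum(fixings) / len(fixings))
```

On a PDE grid the same formula tells us, for each node $(S, A^-)$ just before the averaging date, at which value $A^+$ to read (by interpolation) the option value just after it.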
The computational time taken to price derivatives using PDE methods scales
exponentially with the number of state variables. For this reason, PDEs are not
generally used to price options that depend on more than a few underlyings, or
path-dependent options more complicated than simple knock-in/out barriers. For
these options, Monte Carlo methods have traditionally been used.
In Monte Carlo pricing, we exploit the fact that the value of a series of payments,
$C_i$ at times $t_i$, can be written as

$$V(0) = \mathbb{E}^{\mathbb{Q}} \left[ \sum_i \frac{C_i}{B_{t_i}} \right],$$

where $B_t$ is the money market account and $\mathbb{Q}$ is the risk-neutral measure. We
generate sample paths for the underlying processes and calculate the value the
derivative would have if each path were realized. We then average over the paths
to get the value of the option. We can easily handle path-dependent options,
as we have the entire path available to us. Additionally, the time taken to price
multiunderlying deals scales linearly with the number of underlyings involved, rather
than exponentially as in the PDE case.
However, this approach cannot always be used directly when pricing options
where the holder/issuer must make some decision that does not depend in a trivial
way on the market data. In the case of a simple European call option, the exercise
decision directly depends on the values of market observables (the stock price) on
the exercise date and so the choice is trivial (we exercise if the stock is worth more
than the strike) and can be incorporated into a Monte Carlo pricing. However, when
the option can be exercised before maturity (i.e., the holder has an early exercise
decision) the decision depends only indirectly on the market observables, through
the price of another option. For example, at the maturity date of a European call
with strike K, we know that we will exercise if S > K and receive an amount S - K.
For a monthly Bermudan call, the decision at the maturity date is identical (assuming
we have not already exercised). However, one month from maturity we must decide
whether we would rather receive S - K immediately or keep what is effectively a
one-month European call. Two months from maturity, we can either receive S - K
or a two-month Bermudan call and so on. Traditional Monte Carlo methods fall
down here, as we have no way of calculating the values of these suboptions (see footnote 1).
In this chapter, we present some methods for pricing these options using Monte
Carlo simulation. Throughout the chapter, we will use expected continuation value
to mean the expected value of the remaining payments in the option (i.e., the value
of continuing to hold the option). We will use exercise value to mean the value we
would get for exercising the option immediately and realized continuation value to
mean the value of the remaining payments in the option along a particular path.
Note that the exercise decision can never be based on the realized continuation value
as this would imply that the exerciser could foretell the future.

10.2 BROADIE AND GLASSERMAN


The expected continuation value at any exercise date is just the price of an option,
seen at the exercise date. Pricing options with many exercise dates can be thought of
as pricing options on options on options, and so on. We can always use Monte Carlo
to price these suboptions, by starting a new Monte Carlo simulation for each path as
it hits each exercise date. This was first suggested by Broadie and Glasserman [88].
We will assume we have a derivative that can be exercised at a set of dates
$T_1, \ldots, T_N$, and that the evaluation date is $T_0$. From $T_0$ to $T_1$, we simulate P paths.
At $T_1$, we split each path into P more paths from $T_1$ to $T_2$, each starting where the
original path left off. At $T_2$, we split each of these $P^2$ paths into P more paths, and
so on until the maturity of the option. This is shown in figure 10.1.
We can treat each set of paths that are common up to the last early exercise
date $T_{N-1}$ as a separate Monte Carlo and average over them to get the expected
continuation value at $T_{N-1}$ given the path up to that point. We can use this to decide
whether or not to exercise for those paths at $T_{N-1}$ (assuming we have not exercised
beforehand). We then take all the paths that are common up to $T_{N-2}$ and average
over those to give the expected continuation value at $T_{N-2}$, and so on. We repeat
this procedure until we get back to the evaluation date.
The number of paths needed for the final section of the path is $P^N$, which makes
the algorithm prohibitively slow if there are several exercise dates. In fact, we do not
need the whole path up to $T_{N-1}$ in order to do a Monte Carlo simulation for the last
period and therefore find the expected continuation value at $T_{N-1}$. Since problems
will often be Markovian in just a few factors, any paths with identical Markov
factors at the branching date will have identical expected continuation values. We
can use this information to come up with a more efficient algorithm.

10.3 REGULARLY SPACED RESTARTS


When pricing using PDEs, we discretize our state space into a finite set of states
at each date and store the expected continuation value for each of these possible
states. We can do the same thing with Monte Carlo simulation. Starting at the last
early-exercise date, we can discretize our state space in some way and start a Monte
Carlo simulation at each point. For each point, we generate P paths running from
time $T_{N-1}$ to $T_N$ and average over each set of paths to give the expected continuation
value at each point in our discretized state-space at $T_{N-1}$.
We can now go back to the penultimate early-exercise date, $T_{N-2}$; again, we
discretize the state-space and start a Monte Carlo simulation at each point, simulating
P paths from $T_{N-2}$ to $T_{N-1}$. For each path, we can decide whether the option
would be exercised at $T_{N-1}$ based on the exercise value at $T_{N-1}$ and the expected
continuation value there, which we estimate by interpolating between the points at
which we started our first set of Monte Carlo simulations. For each new set of paths,
we average over the paths to calculate the expected continuation value at time $T_{N-2}$
as a function of the discretized state-space there. This is shown in figure 10.2.

1 Pricing options with early exercise decisions is not a problem with backwards pricing
methods since at any exercise date t, the contents of our PDE grid will be the value (at t) of
the remaining payments in the option (assuming we have not already exercised) as a function
of the state space at t. In the Bermudan option case, we simply replace the contents of our
grid at each node with S - K if this is larger.

FIGURE 10.1 Broadie and Glasserman method ("bushy trees"; spot against time). We
simulate P paths up to the first exercise date, then divide each path into P new paths.
At the second exercise date, we divide each of the $P^2$ paths into P new paths, and so
on up until the maturity of the option.
We can iterate this until we get back to the evaluation date. This approach
scales much more nicely with the number of early exercise dates and paths than the
Broadie and Glasserman approach. However, it is still prohibitively slow, especially
when we have to discretize in several dimensions at each fixing date. The time
taken using P paths per starting point and M restart points per fixing in each of d
directions is proportional to $M^d P$. Assuming the expected continuation value is a
smooth function and we interpolate linearly, we will have a discretization error that
scales as $1/M^2$. To get $1/\sqrt{P}$ convergence in our price, as we would expect from
Monte Carlo, we therefore need $M \propto P^{1/4}$, making the CPU time scale as $P^{1+d/4}$.
In high-dimensional problems, this scaling can make the method completely
impractical. No method based on calculating the expected continuation value at a
discrete set of points across state space will be able to cope with very-high-dimensional
problems as there is too much information to store or calculate. Take the example of
an option on a basket of 20 stocks. If we try to estimate the expected continuation
values on an $M^{20}$ point mesh, even with M = 3, that is $3.5 \times 10^9$ values to store.
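The scaling argument can be made concrete with a few lines of arithmetic. The sketch below ties the number of restart points per direction to the number of paths through $M = P^{1/4}$ (ignoring constants of proportionality, which is an assumption) and evaluates the resulting cost $M^d P$; the function name is illustrative.

```python
def restart_cost(P, d):
    """Restart points per direction and relative CPU cost when M = P**(1/4)."""
    M = P ** 0.25
    return M, (M ** d) * P  # cost ~ M^d * P = P^(1 + d/4)


for d in (1, 2, 4, 8):
    M, cost = restart_cost(10_000, d)
    print(d, round(M, 2), f"{cost:.3g}")

# Storage for a 20-stock basket with only 3 points per direction.
print(3 ** 20)
```

With P = 10,000 paths the cost grows by a factor of 10 for every extra state dimension, which is the exponential blow-up described above.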
FIGURE 10.2 Regularly spaced restarts ("thick grids" approach; spot against time). At each
early exercise date, we discretize the state space and for each point we simulate P paths
up to the following exercise date and average to get the expected continuation value.

10.4 THE LONGSTAFF AND SCHWARTZ ALGORITHM

The Longstaff and Schwartz algorithm [189] is an algorithm for combining back-
wards induction and Monte Carlo simulation that overcomes the scaling of the
previous two methods at the expense of introducing some bias into the answer. The
algorithm is sometimes called least-squares American Monte Carlo.
As with the previous methods, we try to use Monte Carlo simulation to find
the expected continuation value ($CV^e$) for each path at each early exercise date. As
discussed in the previous section, we cannot hope to find $CV^e$ as a function of all
the Markov factors in a high-dimensional problem. In least-squares Monte Carlo,
we instead try to reduce it to a function of a few relevant quantities: the regression
variables. If we choose the regression variables well, this will drastically reduce the
dimensionality of the problem of calculating $CV^e$. However, if we do not choose
good regression variables, we will throw away useful information and find a bad
exercise strategy and hence a biased price.

10.4.1 The Algorithm


The strategy is as follows. We simulate P complete paths up to the maturity of the
option, then work backwards through the exercise dates. Assuming we have not
exercised the option early, we calculate all of the payments made after the last early
exercise date ($T_{N-1}$) and discount them to that date (for each path). We refer to this
value as the realized continuation value, $CV_p^r(T_{N-1}^+)$, for each path p just after the
early exercise date. For each path, we also calculate the values of some regression
variables (observable at $T_{N-1}$), $r_p^{N-1}$, on which we think the estimated continuation
value will depend strongly. We then let the estimated continuation value, $CV^e(T_{N-1})$,
be some parameterized functional form of the regression variables and try to find
the parameters that give the best fit (in a least-squares-error sense) to the realized
continuation values. If our $CV^e$ function has parameters $c^{N-1}$, we can write

$$CV_p^e(T_{N-1}) = f^{N-1}(r_p^{N-1}; c^{N-1})$$

We try to find the parameters $c^{N-1}$ that minimize

$$\chi^2(c) = \sum_p \left( CV_p^r(T_{N-1}^+) - f^{N-1}(r_p^{N-1}; c^{N-1}) \right)^2$$ (10.4)

Now that we have the function $CV^e$, we have an estimate of the expected
continuation value for each path. We can use this to decide whether the option
would be exercised for each path by comparing $CV_p^e$ with the exercise value for the
path at that exercise date: $EV_p$. If the holder of the option has the right to exercise, it
will be exercised if $EV_p > CV_p^e$, whereas if the issuer has the right it will be exercised
if $EV_p < CV_p^e$. According to this strategy, the realized continuation value for path
p just before the exercise decision is made, $CV_p^r(T_{N-1}^-)$, is $EV_p$ if we exercise and
$CV_p^r(T_{N-1}^+)$ if we do not.
Note that although we use $CV_p^e$ to determine whether or not we exercise, the
value we get if we choose not to is $CV_p^r(T_{N-1}^+)$. This differs from the approach in
the previous section where we set the realized value to $CV_p^e$ (because $CV_p^r$ was not
available). This approach matches what would happen in real life and gives rise to a
less biased result, as will be explained later.
Having calculated the realized continuation value just before the exercise date
$T_{N-1}$, we discount these values back to date $T_{N-2}$ where they become $CV_p^r(T_{N-2}^+)$.
We repeat the above procedure to calculate $CV_p^r(T_{N-2}^-)$ and so $CV_p^r(T_{N-3}^+)$, and
so on. We iterate this until we reach the evaluation date and have parameterized
expected continuation values, $CV^e$, for each of the early exercise dates.
We could average over the realized continuation values for each path at the
evaluation date (CVpr (T0 )) to give the price of the option. However, this gives rise
to a slight bias (the foresight bias) as the exercise decision for each path will weakly
depend on the realized continuation value for that path, through the regression
coefficients. We discuss this bias in more detail in section 10.5. It is common practice
to remove this bias by using a separate set of Monte Carlo paths to price the option.2
We will refer to the two sets of paths as the regression paths and the pricing paths.

2 There are practical reasons for using separate regression and pricing paths. In general, the

computational time for one regression path will be more than that for one pricing path. Also,
random errors in the functions CV e only have a second-order effect on the overall price (see
section 10.5), so we can afford to use fewer paths to estimate these functions than we need
to use to find the final price. There is also an issue with the amount of memory used in the
regression phase of the algorithm. Since we have to store all paths in the regression phase (or
recalculate them, expensively), for some problems we can run out of memory trying to store
too many paths. Instead, we can use a smaller number of paths in the regression phase, few
enough to fit in the computer’s memory, and then use a larger number of paths in a separate
pricing phase, where we can calculate one path at a time and discard them after they have
been processed.

With each of the pricing paths, we can repeat the above backwards-induction
algorithm (but omitting the regression step and using the previously calculated
regression coefficients) to find $CV^r_p(T_0)$. Alternatively, we can loop forward over
the exercise dates until we find the date at which the option will be exercised;
we then discount the exercise value at this date to the evaluation date to give
$CV^r_p(T_0)$.
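
The forward-looping variant of the pricing phase can be sketched as follows. This is a minimal illustration for a Bermudan call, assuming polynomial regression functions of the stock price; the helper names `poly` and `price_path_forward`, the argument layout, the ordering of coefficients from the constant term upward, and the rule of exercising on a tie are assumptions made for the example, not details taken from the text. The in-the-money check anticipates the refinement of section 10.5.1.

```python
def poly(coeffs, s):
    """Evaluate a polynomial whose coefficients run from the constant term up."""
    value = 0.0
    for c in reversed(coeffs):
        value = value * s + c
    return value


def price_path_forward(spots, coeffs_by_date, dfs_to_t0, strike):
    """Value at T0 of one pricing path of a Bermudan call (hypothetical helper).

    spots[i]          stock price on this path at the i-th exercise date
    coeffs_by_date[i] regression coefficients of CV^e at that date
                      (needed for the early exercise dates only)
    dfs_to_t0[i]      discount factor from that date back to T0
    """
    last = len(spots) - 1
    for i, s in enumerate(spots):
        ev = max(s - strike, 0.0)            # exercise value EV
        if i == last:
            return dfs_to_t0[i] * ev         # final date: nothing left to wait for
        cv_e = poly(coeffs_by_date[i], s)    # estimated continuation value
        if ev > 0.0 and ev >= cv_e:          # exercise only when in-the-money
            return dfs_to_t0[i] * ev
    return 0.0
```

The price estimate is then the average of `price_path_forward` over all pricing paths; each path can be generated, valued, and discarded in turn, so memory use does not grow with the number of pricing paths.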

10.4.2 Example: A Call Option with Monthly Bermudan Exercise


To demonstrate the regression phase of the algorithm we will consider the simple
example of a Bermudan option with monthly exercise dates and a maturity of two
years. The strike of the option is set to 100, which is the current spot price.
If we price this with deterministic volatility, hazard, and interest rates, then the
estimated continuation value at time $T_{N-1}$ can only depend on the stock price at
that time, so we choose this as our regression variable. The first two columns of the
table below show the values of the stock at times $T_{N-1}$ and $T_N$. If we choose not
to exercise the option at $T_{N-1}$, we will eventually receive $\max(S_N - 100, 0)$ at $T_N$.
Discounting this back to $T_{N-1}$ (using the discount factor of 0.997) gives the realized
continuation values shown in the third column.

S_{N-1}    S_N       CV^r(T_{N-1})
112.07     114.67    14.62
 86.19      95.53     0
106.74      90.20     0
109.77     113.82    13.78

Our decision on whether or not to exercise will be based on the estimated
continuation value, which we will model as a cubic function of the regression
variable ($S_{N-1}$):

$$CV^e_{N-1}(S_{N-1}) = a_{N-1} + b_{N-1} S_{N-1} + c_{N-1} S_{N-1}^2 + d_{N-1} S_{N-1}^3$$

Taking all the paths into account, we try to find the regression coefficients,
a, b, c and d, that minimize equation (10.4). (Details of how to do this are given in
section 10.5.2.) We find that the least-squares error is minimized by the polynomial

$$CV^e_{N-1}(S_{N-1}) = 37.18 - 1.19\,S_{N-1} + 0.0098\,S_{N-1}^2 - 0.000011\,S_{N-1}^3 \tag{10.5}$$

Figure 10.3 shows the results of fitting a cubic function to 2000 paths. The points
are the realized continuation values, and the line is the above cubic function.
The table below shows the same four paths but we have added two extra
columns. The fourth column shows the estimated continuation value found using
equation (10.5), and the fifth column shows the value we get if we choose to exercise
immediately.

[Figure 10.3, titled "Regression at date N-1": realized CV (points) and estimated CV (line); horizontal axis: stock price at date N-1, vertical axis: value.]

FIGURE 10.3 Result of fitting a cubic to the realized continuation values at the penultimate
exercise date of a Bermudan call option.

S_{N-1}    S_N       CV^r(T_{N-1})   CV^e(T_{N-1})   EV_{N-1}   CV^r(T_{N-1})
                     (if held)                                  (just before exercise)
112.07     114.67    14.62           11.62           12.07      12.07
 86.19      95.53     0               0.60            0          0
106.74      90.20     0               8.65            6.74       0
109.77     113.82    13.78           10.30            9.77      13.78

For the first path, $CV^e < EV$, so we choose to exercise the option immediately.
For the remaining three paths, $CV^e > EV$, so we choose not to exercise the option.
The final column of the table shows the realized continuation value for each path just
before the exercise decision. Note that for the third path, we decide not to exercise
but the option ultimately expires worthless; not exercising was the best decision we
could make with the information available at the exercise date.
For the next step in the regression phase, we take the realized values at date $T_{N-1}$
and discount them back to date $T_{N-2}$ where they become the realized continuation
values, $CV^r$. The discount factor for $T_{N-2}$ to $T_{N-1}$ is 0.997, giving the following
results.

S_{N-2}    S_{N-1}   CV^r(T_{N-2})
110.95     112.07    12.03
 92.11      86.19     0
109.18     106.74     0
119.21     109.77    13.74

We have included $S_{N-1}$ only for comparison with earlier tables; it is not used in
this step of the regression phase. Now we have a new set of regression variables ($S_{N-2}$)
and corresponding realized continuation values, $CV^r$, so we repeat the above step to
estimate $CV^e$ at date $T_{N-2}$. We repeat this until we get back to the evaluation date.
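
The single backward step described above can be replayed on the four sample paths. This is a small sketch: the estimated continuation values are copied from the table rather than recomputed, since equation (10.5) prints rounded coefficients, and the function name `backward_step` is an illustrative choice.

```python
DF = 0.997        # discount factor between adjacent exercise dates (from the text)
STRIKE = 100.0

# (S_{N-1}, S_N, CV^e(T_{N-1})) for the four sample paths in the tables.
paths = [
    (112.07, 114.67, 11.62),
    (86.19, 95.53, 0.60),
    (106.74, 90.20, 8.65),
    (109.77, 113.82, 10.30),
]


def backward_step(s_prev, s_next, cv_e):
    """One step of the regression-phase induction for a single path."""
    cv_held = DF * max(s_next - STRIKE, 0.0)    # realized value if we hold
    ev = max(s_prev - STRIKE, 0.0)              # value if we exercise now
    exercise = cv_e < ev                        # decision uses the estimate only
    realized = ev if exercise else cv_held      # but the realized value is kept
    return exercise, realized, DF * realized    # last item: CV^r at T_{N-2}


results = [backward_step(*p) for p in paths]
for (s_prev, _, _), (exercise, realized, cv_prev) in zip(paths, results):
    print(f"{s_prev:7.2f}  exercise={exercise!s:5}  "
          f"realized={realized:6.2f}  discounted={cv_prev:6.2f}")
```

Only the first path is exercised. The third path is held even though it ends up worthless, because the decision may use only the estimate available at the exercise date.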

10.5 ACCURACY AND BIAS

In the above algorithm, we have four main sources of error—three in the regression
phase and one in the pricing phase.
Like any Monte Carlo price, both the regression and pricing phases suffer from
random errors. If we use too few paths in the pricing phase, our result will not
be very accurate. Similarly, if we use too few paths in the regression phase, the
regression coefficients will not be very accurate, which in turn means our exercise
strategy will not be optimal.
For options with a single regression variable, such as the one considered above,
the error coming from the pricing phase is likely to be much bigger than the error
coming from the regression phase, for the same number of paths. The only points
where the accuracy of the regression coefficients matters are the points where $CV^e$
and EV cross over and we move from a region where it is optimal to exercise to a
region where it is optimal to hold on to the option (in other words, the exercise
boundary).
If our function $CV^e$ is slightly wrong, the position of the exercise boundaries
(and therefore our exercise strategy) will also be slightly wrong. However, in the
vicinity of the true exercise boundary, it makes little difference whether we exercise
or hold on to the option as both choices result in very similar prices. Small errors in
$CV^e$ therefore have a much smaller effect on the overall pricing; in fact the error in
the pricing scales as the square of the error in the exercise boundary.
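
The quadratic scaling can be motivated by a short Taylor-expansion argument. This is a sketch under the assumption that the price $V(b)$ obtained with an exercise boundary at $b$ is a smooth function of $b$ near the optimal boundary $b^*$, and that the holder has the right to exercise; the symbols $V$, $b$, and $b^*$ are introduced here for illustration only.

```latex
V(b) = V(b^*) + V'(b^*)\,(b - b^*) + \tfrac{1}{2}\,V''(b^*)\,(b - b^*)^2
       + O\big((b - b^*)^3\big),
\qquad
V'(b^*) = 0
\;\Longrightarrow\;
V(b) - V(b^*) \approx \tfrac{1}{2}\,V''(b^*)\,(b - b^*)^2 \le 0 .
```

Because $b^*$ maximizes the price, the first-order term vanishes, so a small error in the boundary changes the price only at second order, and always downward.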
A more important source of error comes from the fact we are modeling $CV^e$
by some smooth parameterized function. This might not have enough freedom
to approximate the true expected continuation values accurately. In figure 10.4, we
show $CV^e$ for the penultimate exercise date of the Bermudan call approximated with
cubic and quintic functions, as well as the exact answer.
Clearly the cubic and quintic do not—indeed cannot—fit the real expected CV
for all spots, although the quintic does a much better job than the cubic. These errors
do not go away as we increase the number of paths used in the regression phase,
which means that for this option, American Monte Carlo gives only a lower bound
to the price. This is a general result: The Longstaff and Schwartz algorithm always
gives a suboptimal exercise strategy and so if the holder has the right to exercise, the
algorithm will always give a lower bound, whereas if only the issuer has the right to
exercise, the algorithm will give an upper bound to the price.
Figure 10.4 also demonstrates why we use $CV^r$ and not $CV^e$ as the realized
value of the option if we choose not to exercise. The inaccurate values of $CV^e$ have
only a small effect on the pricing when we use them only to determine the exercise
strategy; were we to use them as the realized values for nonexercised paths, the
errors in $CV^e$ would become errors in our final price, potentially leading to very
biased results. We would also have no guarantee that we had found a lower (or
upper) bound to the price.
The polynomials in figure 10.4 do not agree closely with the more accurate
piecewise-linear function. We show how to improve on this in section 10.5.1.

[Figure 10.4, titled "Cubic and quintic fits": cubic, quintic, and exact curves; horizontal axis: spot, vertical axis: value.]

FIGURE 10.4 Fits to the expected continuation value at the penultimate exercise date of a
Bermudan call option.

The final source of error is the foresight bias mentioned in section 10.4.1. In
the regression phase, the exercise decision for each path depends on the regression
coefficients, which in turn depend on the realized continuation value for that path. We
therefore have a small foresight bias; for a low number of paths, our exercise strategy
will be better (for those paths) than the real-world exercise strategy. In figure 10.5, we
show the results of pricing an at-the-money call option, exercisable after 2y and 4y.
Generally, this foresight bias will be much smaller than the random error
in the price, and can be reduced by using independent pricing and regression
paths. However, for derivatives with many early exercise dates, this error may be
compounded up until it is significant. See Fries [190] for more details.

10.5.1 Extension: Regressing on In-the-Money Paths


In the above example with the cubic function, $CV^e$ becomes slightly negative for
some out-of-the-money paths of the stock price when in reality we know it must
be slightly positive. This is just an artifact of fitting a polynomial to the data. In
the region where $CV^e$ becomes negative, our strategy would tell us to exercise the
option even though we receive nothing and give up the chance to receive a positive
amount at the next exercise date. In reality, we would never consider exercising the
option unless it were in-the-money. We can use this to improve our pricing.
Since we will consider exercising the option only when it is in-the-money, we
also only need to know $CV^e$ for the paths where the option is in-the-money. Instead
of fitting a cubic to all the points in figure 10.3, we can just fit it to the in-the-money
points (i.e., where the stock is above the strike at $T_{N-1}$). The result of doing this is
shown in figure 10.6.

[Figure 10.5, titled "Bias in American Monte Carlo algorithm": horizontal axis: 1/sqrt(Paths), vertical axis: price.]

FIGURE 10.5 Foresight bias when pricing an in-the-money call, exercisable after 2y and 4y.
The line shows the average price of the option from 10,000 independent pricings, each using
P paths, against $P^{-0.5}$. The error bars are one standard deviation of the random errors in
each individual pricing. Here the bias scales as $P^{-0.5}$, and is equivalent to approximately one
third of a standard deviation.

[Figure 10.6, titled "Polynomial fits to in-the-money paths": cubic, quintic, and exact curves; horizontal axis: spot, vertical axis: continuation value.]

FIGURE 10.6 Cubic and quintic fits to the expected continuation value using only the
in-the-money paths.

The fit to the in-the-money region is now much better than when we tried to fit
all points at once. The fit to the region that’s out-of-the-money is truly abysmal, but
since we know we’re not going to exercise there anyway, it doesn’t matter.
For any option where we know we will not exercise if some condition does not
hold, we can improve the performance of the algorithm by only regressing on paths
that could potentially be exercised, and never exercising paths that can never be
exercised (regardless of the estimated expected continuation value).
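
A minimal sketch of this refinement for a call option is given below. It assumes a polynomial in the scaled variable S/strike (the scaling is only there to keep the normal equations well conditioned) and a default quadratic degree; both choices, and the helper names `solve`, `fit_itm`, and `should_exercise`, are illustrative assumptions rather than the authors' implementation.

```python
def solve(m, r):
    """Solve m c = r by Gaussian elimination with partial pivoting."""
    n = len(r)
    a = [row[:] + [r[i]] for i, row in enumerate(m)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(a[i][col]))
        a[col], a[piv] = a[piv], a[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for j in range(col, n + 1):
                a[row][j] -= factor * a[col][j]
    c = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * c[j] for j in range(i + 1, n))
        c[i] = (a[i][n] - tail) / a[i][i]
    return c


def fit_itm(spots, cv_r, strike, degree=2):
    """Least-squares polynomial fit of CV^r using only in-the-money paths."""
    itm = [(s / strike, v) for s, v in zip(spots, cv_r) if s > strike]
    k = degree + 1
    m = [[sum(x ** (i + j) for x, _ in itm) for j in range(k)] for i in range(k)]
    r = [sum(x ** i * v for x, v in itm) for i in range(k)]
    return solve(m, r)


def should_exercise(s, coeffs, strike):
    """Never exercise out-of-the-money; otherwise compare EV with CV^e."""
    if s <= strike:
        return False
    x = s / strike
    cv_e = sum(c * x ** i for i, c in enumerate(coeffs))
    return s - strike >= cv_e
```

Out-of-the-money paths never enter the fit, so the poor behaviour of the polynomial in that region cannot influence the exercise decision.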

10.5.2 Linear Regression


We will now go into more detail on how to fit the $CV^e$ functions.
Assuming we have a set of regression variables $r_p$ and realized continuation
values $CV^r_p$, how should we fit a function $CV^e(r_p)$? Dropping the $N-1$ indices from
equation (10.4), we want to minimize

$$\Delta(c) = \sum_p \left( CV^r_p - f(r_p; c) \right)^2$$

For a general nonlinear function of the coefficients, we would have to use some
general minimization algorithm like the Newton-Raphson algorithm described in
section 3.6.1. However, since we may have thousands or even millions of paths, this
will be incredibly slow. For this reason, we restrict the allowed functions to linear
functions of the coefficients. We write

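$$CV^e_p = f(r_p; c) = \sum_{k=1}^{K} f_k(r_p)\, c_k$$

Letting

$$F_{pk} = f_k(r_p),$$

we have

$$\Delta(c) = \sum_p \Big( CV^r_p - \sum_k F_{pk}\, c_k \Big)^2 = A - 2R^T c + c^T M c \tag{10.6}$$

where

$$A = \sum_p (CV^r_p)^2,$$

R is the vector

$$R_i = \sum_p F_{pi}\, CV^r_p,$$

and M is the symmetric matrix

$$M_{ij} = \sum_p F_{pi}\, F_{pj}$$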

Note that once we have computed A, R, and M, the computational time taken to
calculate the mean-square error for a given trial set of parameters, c, is independent
of the number of paths. This makes linear regression much faster than fitting some
nonlinear function.
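
The point about computational cost can be illustrated with a short sketch: A, R, and M are accumulated in one pass over the paths, after which the error for any trial coefficient vector is a small quadratic form. The basis functions 1, r, r² and all function names are assumptions made for the example.

```python
def basis(r):
    """Illustrative regression functions f_k(r): 1, r, r^2."""
    return [1.0, r, r * r]


def accumulate(rs, cv_r):
    """One pass over the paths to build A, R and M of equation (10.6)."""
    k = len(basis(rs[0]))
    a = 0.0
    rvec = [0.0] * k
    m = [[0.0] * k for _ in range(k)]
    for r_p, y in zip(rs, cv_r):
        f = basis(r_p)
        a += y * y
        for i in range(k):
            rvec[i] += f[i] * y
            for j in range(k):
                m[i][j] += f[i] * f[j]
    return a, rvec, m


def error_quadratic(a, rvec, m, c):
    """A - 2 R^T c + c^T M c: the cost depends on K only, not on the path count."""
    k = len(c)
    rc = sum(rvec[i] * c[i] for i in range(k))
    cmc = sum(c[i] * m[i][j] * c[j] for i in range(k) for j in range(k))
    return a - 2.0 * rc + cmc


def error_direct(rs, cv_r, c):
    """Sum of squared residuals over all paths, for comparison."""
    return sum(
        (y - sum(ck * fk for ck, fk in zip(c, basis(r_p)))) ** 2
        for r_p, y in zip(rs, cv_r)
    )
```

Both routes give the same number up to rounding, but only `error_direct` has to revisit every path for each new trial `c`.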
Differentiating equation (10.6) with respect to $c_k$ gives

$$\frac{\partial \Delta}{\partial c_k} = 2 \sum_{k'} M_{kk'}\, c_{k'} - 2 R_k$$

To find the coefficients that minimize $\Delta$, we need to solve the simultaneous equations

$$M c = R \tag{10.7}$$

Since M is a symmetric matrix, we can solve equation (10.7) by finding the
Cholesky decomposition of M. However, it is possible that one or more of the
regression coefficients has no effect on the solution or that some of the functions
are linearly dependent, either through the choice of regression functions or through
using an insufficient number of paths. There is also the possibility that if the real
matrix M is close to singular, then numerical rounding/truncation errors may cause
it to have negative eigenvalues and so according to equation (10.6) there will be
no minimum of $\Delta$. Cholesky decomposition will fail if M has zero or negative
eigenvalues, so instead we use singular value decomposition as a more robust way
of solving equation (10.7).
In singular value decomposition (SVD), we decompose the matrix M as

$$M = U D V^T, \tag{10.8}$$

where U and V are orthogonal matrices and D is a diagonal matrix containing the
singular values of M (i.e., the square roots of the eigenvalues of $M^2$). Since M is a
real, symmetric $K \times K$ matrix, U, V and D are also $K \times K$ matrices and we have

$$c = V D^{-1} U^T R \tag{10.9}$$

Now if there exist any linear combinations of the regression functions that are
zero for all paths (i.e., the regression functions are not linearly independent), there
will be elements of D that are zero. The corresponding columns of V span the
nullspace of M. If the vector e is in the nullspace, we have $Me = 0$, and so any
component of c in this direction has no effect on $\Delta$ and so cannot be determined.
We can therefore set the components of c in these directions to something arbitrary
without affecting the result. To set the components to zero, we set the corresponding
elements of $D^{-1}$ to zero.
In fact, very small singular values may be the result of numerical rounding
errors and so it is best to ignore all singular values below some threshold. Numerical
Recipes [70] suggests ignoring all singular values that are less than $\epsilon$ times the largest
singular value, where $\epsilon$ is machine precision (about $10^{-15}$ for double-precision
numbers), by setting the corresponding elements of $D^{-1}$ to zero.
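
The truncation step can be sketched without a linear-algebra library. For a symmetric matrix the singular values are the absolute values of the eigenvalues, so this illustration swaps a library SVD routine for a cyclic Jacobi eigen-decomposition; that substitution, the function names, and the default relative threshold of 1e-12 are all assumptions of the sketch and are suitable only for small K.

```python
import math


def jacobi_eigh(m, sweeps=30):
    """Eigen-decomposition of a small symmetric matrix by cyclic Jacobi rotations.

    Returns (eigenvalues, v) with the eigenvectors in the columns of v.
    """
    n = len(m)
    a = [row[:] for row in m]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                sign = 1.0 if diff >= 0.0 else -1.0
                # Rotation angle (|theta| <= pi/4) that zeroes a[p][q].
                theta = 0.5 * math.atan2(sign * 2.0 * a[p][q], abs(diff))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):   # a <- a J and v <- v J (columns p, q)
                    a[k][p], a[k][q] = (c * a[k][p] - s * a[k][q],
                                        s * a[k][p] + c * a[k][q])
                    v[k][p], v[k][q] = (c * v[k][p] - s * v[k][q],
                                        s * v[k][p] + c * v[k][q])
                for k in range(n):   # a <- J^T a (rows p, q)
                    a[p][k], a[q][k] = (c * a[p][k] - s * a[q][k],
                                        s * a[p][k] + c * a[q][k])
    return [a[i][i] for i in range(n)], v


def solve_truncated(m, r, rel_tol=1e-12):
    """c = V D^{-1} V^T R, with 1/d set to zero for negligible |d|."""
    d, v = jacobi_eigh(m)
    n = len(r)
    cutoff = rel_tol * max(abs(x) for x in d)
    c = [0.0] * n
    for j in range(n):
        if abs(d[j]) <= cutoff:
            continue                  # drop this nullspace direction
        proj = sum(v[i][j] * r[i] for i in range(n)) / d[j]
        for i in range(n):
            c[i] += v[i][j] * proj
    return c
```

With the deliberately redundant regression functions 1, r, and 2r, M is singular and a Cholesky factorization would fail, while `solve_truncated` still returns a set of coefficients that reproduces the fitted values.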

10.5.3 Other Regression Schemes


If the expected continuation value we are trying to approximate is not a particularly
smooth function of the regression variables, fitting it with a polynomial will smooth
out features that we want to capture. A common alternative is to break the phase
space up into smaller regions and fit either a constant or some other functional form
in each region. If we split the domain of possible regression variables into groups
$A_i$, we can write the expected continuation value as

$$CV^e(r) = \sum_{i,k} f_{ki}(r)\, \mathbf{1}_{r \in A_i}$$

One way of doing this is to perform a separate regression for each set of paths,
the $i$th set of paths being all the paths for which $r_p \in A_i$. The limit of fitting just a
constant for each group is Tilley's method [191].
An alternative method is just to treat

$$f_{ki}(r)\, \mathbf{1}_{r \in A_i}$$

as separate regression functions and do a single regression but with many regression
functions. The SVD method described in section 10.5.2 is robust enough to handle
this. The "exact" lines in figures 10.4 and 10.6 were produced by fitting
piecewise-linear functions in this way.
The drawback of this approach is that the more partitions are used, the
fewer paths fall into each partition, so we must use a greater number of overall
paths to guarantee a good accuracy of the estimated continuation value. We must
also be careful how we choose the partitions, making sure that we are likely
to get a reasonable number of paths in each. This approach can work well for
low-dimensional problems where the phase space can easily be partitioned, but
for higher-dimensional problems, the number of partitions (and consequently the
number of paths we need to use) may make the algorithm too slow.
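
The simplest case, a single constant per region, can be sketched as follows. The least-squares constant for a group is the mean of the realized continuation values in it; the one-dimensional partition by sorted `edges` and the function names are assumptions of this illustration.

```python
from bisect import bisect_right


def fit_bucket_means(rs, cv_r, edges):
    """Fit one constant per region by averaging CV^r over the paths in it.

    Region 0 is r < edges[0], region i is edges[i-1] <= r < edges[i],
    and the last region is r >= edges[-1].
    """
    sums = [0.0] * (len(edges) + 1)
    counts = [0] * (len(edges) + 1)
    for r_p, y in zip(rs, cv_r):
        i = bisect_right(edges, r_p)
        sums[i] += y
        counts[i] += 1
    # Regions with no paths are marked None so the caller notices the gap.
    return [s / n if n else None for s, n in zip(sums, counts)]


def estimate_cv(r, edges, means):
    """Estimated continuation value: the constant of the region containing r."""
    return means[bisect_right(edges, r)]
```

The `None` entries make the drawback discussed above visible: a partition that is too fine leaves some regions with too few paths, or none, to estimate anything.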

10.5.4 Upper Bounds


The Longstaff and Schwartz algorithm gives a lower bound for options where the
holder has the right to exercise. Upper bounds to the price can be found using
strategies proposed by Rogers [192] and Andersen and Broadie [193]. The value of
an option can be expressed as

$$Q = \sup_{\tau} \mathbb{E}\left[ \frac{h_\tau}{B_\tau} \right]$$

where $\tau$ is a stopping time and $h_t$ is the value of exercising at time $t$. In other words,
the option is worth the expectation of the discounted cashflows from the optimal
exercise strategy. The Longstaff and Schwartz algorithm effectively tries to find
the optimal stopping strategy, but in reality always finds a less-than-optimal
strategy and hence gives a lower bound to the price.

Rogers and others express the price in terms of a dual formulation

$$Q = \inf_{M} \left( M_0 + \mathbb{E}\left[ \max_{t \in \{T_i\}} \left( \frac{h_t}{B_t} - M_t \right) \right] \right),$$

where M is some martingale. The strategy then becomes to find the M that minimizes
the above expression. The better the choice of M, the tighter the upper bound.
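
As a minimal illustration of how the choice of M matters, the trivial martingale M = 0 turns the dual expression into the average of the pathwise maximum of the discounted exercise values, which is the (loose) perfect-foresight bound. The sketch below compares it with the value of a simple non-anticipating stopping rule; the threshold rule, the function names, and the sample numbers are hypothetical and serve only to show that any stopping rule lies below the dual bound on the same paths.

```python
def upper_bound_trivial_martingale(discounted_payoffs):
    """Dual bound with M = 0: mean over paths of max_t h_t / B_t."""
    return sum(max(path) for path in discounted_payoffs) / len(discounted_payoffs)


def lower_bound_threshold_rule(discounted_payoffs, threshold):
    """Value of a simple stopping rule on the same paths.

    Stop at the first early date whose discounted payoff reaches `threshold`;
    otherwise take the payoff at the final date.
    """
    total = 0.0
    for path in discounted_payoffs:
        total += next((h for h in path[:-1] if h >= threshold), path[-1])
    return total / len(discounted_payoffs)


# Each row is one path of discounted exercise values h_t / B_t at dates T_1..T_3.
sample = [[2.0, 5.0, 1.0], [0.0, 0.0, 4.0], [3.0, 1.0, 0.0]]
upper = upper_bound_trivial_martingale(sample)
lower = lower_bound_threshold_rule(sample, threshold=2.0)
```

A better martingale, for example one built from an approximate value function, would pull `upper` down towards the true price.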

10.6 PARAMETERIZING THE EXERCISE BOUNDARY

In the previous section, we discussed the Longstaff and Schwartz algorithm, where
the expected continuation value is parameterized as a function of some regression
variables. The estimate of the expected continuation value was only used to decide
the exercise strategy; all that mattered was whether it was greater or less than EV.
This gives rise to an alternative strategy suggested by Andersen [194]. Instead
of parameterizing the expected continuation value, we can parameterize the exercise
boundary itself. This strategy can only work if we already have some knowledge of
the topology of the exercise regions (e.g., how many exercise boundaries we need to
find). If we allow the strategy to find an arbitrary number of exercise regions, we
will end up with an exercise region around each path where $CV^r < EV$ (i.e., perfect
foresight).
As an example, consider the Bermudan call option again. We
know that we only exercise such an option before maturity in order to receive a
dividend. We give up some optionality in exercising the option, but the value of this
optionality decreases the more in the money the option becomes, so there is some
stock price above which we will exercise the option and below which we will not.
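We can find the best exercise boundary for a set of paths by finding the path $p^*$ that
minimizes

$$\mathbb{E}\left[ (EV_p - CV^r_p)\, \mathbf{1}_{S_p \le S_{p^*}} \right]$$

If we sort the paths by their stock prices in ascending order, we just need to find
$S_{p^*}$ that minimizes

$$\frac{1}{P} \sum_{p=1}^{p^*} \left[ EV_p - CV^r_p \right]$$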

As we increase the total number of paths used, $S_{p^*}$ will converge to the correct exercise
boundary. Note that this strategy is biased: the exercise strategy will always be
better for those paths than we could hope for in real life, so this procedure will give
an upper bound to the price. If we then price the option with a new set of paths but
using the same exercise boundary, we can generate a lower bound to the price.
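
The boundary search for a single exercise date can be sketched as a sort followed by a running sum. The rule "exercise if and only if the stock is strictly above the boundary", the function name, and the sample numbers are assumptions made for the illustration.

```python
def best_boundary(spots, ev, cv_r):
    """Return (boundary, in-sample value) for the rule: exercise iff S > boundary.

    Paths are scanned in ascending stock price while accumulating EV_p - CV^r_p;
    the split with the smallest running sum leaves the least value unclaimed
    on the non-exercised (lower) side.
    """
    order = sorted(range(len(spots)), key=lambda p: spots[p])
    best_sum, boundary = 0.0, float("-inf")    # default: exercise every path
    running = 0.0
    for p in order:
        running += ev[p] - cv_r[p]
        if running < best_sum:
            best_sum, boundary = running, spots[p]
    # Value = exercised paths take EV, the rest keep CV^r.
    value = (sum(ev) - best_sum) / len(spots)
    return boundary, value


boundary, value = best_boundary(
    spots=[90, 100, 110, 120],
    ev=[0, 0, 10, 20],
    cv_r=[1, 4, 12, 18],
)
```

The returned value is the biased in-sample estimate; re-pricing with fresh paths and the same `boundary` gives the lower bound described above.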

Bibliography

[1] Blanchet-Scalliet, C., and Jeanblanc, M. Hazard rate for credit risk and hedging
defaultable contingent claims, Working Paper, 2002.
[2] Delbaen, F., and Schachermayer, W. A general version of the fundamental theorem of
asset pricing. Mathematische Annalen 300:463–520 (1994).
[3] Samuelson, P. Rational theory of warrant pricing. Industrial Management Review 6:
13–31 (1965).
[4] Merton, R.C. The theory of rational option pricing. Bell Journal of Economics and
Management Science 4:141–183 (1973).
[5] Black, F., and Scholes, M. The pricing of options and corporate liabilities. Journal of
Political Economy, 81: 637–59 (1973).
[6] Balland, P. Deterministic implied volatility models. Quantitative Finance 2: 31–44
(2002).
[7] Kellerer, H. Markov-komposition und eine anwendung auf martingale (in German).
Mathematische Annalen 198: 217–229 (1972).
[8] Cox, J. The constant elasticity of variance option pricing model. Journal of Portfolio
Management (December 1998).
[9] Föllmer, H., and Schied, A. Stochastic Finance. 2nd edition. Berlin: de Gruyter, 2004.
[10] Madan, D., and Yor, M. Making Markov martingales meet marginals: with explicit
constructions. Working Paper 2002.
[11] Dupire, B. Pricing with a smile. Risk 7(1):18–20 (1994).
[12] Gyöngy, I. Mimicking the one-dimensional marginal distributions of processes
having an Ito differential. Probability Theory and Related Fields 71:501–516
(1986).
[13] Revuz, D., and Yor, M. Continuous Martingales and Brownian Motion, 3rd ed.
Heidelberg: Springer, 1999.
[14] Merton, R.C. Option pricing with discontinuous returns. Bell Journal of Financial
Economics 3: 145–166 (1976).
[15] Protter, P. Stochastic Integration and Differential Equations, 2nd edition. Heidelberg:
Springer, 2004.
[16] Buehler, H. Expensive martingales. Quantitative Finance (April 2006).
[17] Glasserman, P. Monte Carlo Methods in Financial Engineering. Heidelberg: Springer,
2004.
[18] Buehler, H. Volatility markets: Consistent modelling, hedging and practical implemen-
tation. PhD thesis, 2006.
[19] Heston, S. A closed-form solution for options with stochastic volatility with applica-
tions to bond and currency options. Review of Financial Studies (1993).
[20] Aït-Sahalia, Y., and Kimmel, R. Maximum likelihood estimation of stochastic volatility
models. NBER Working Paper No. 10579, June 2004.
[21] Andersen, L., and Piterbarg, V. Moment explosions in stochastic volatility models. Working
Paper, April 15, 2004. http://ssrn.com/abstract=559481.
[22] Carr, P., and Madan, D. Towards a theory of volatility trading. In Robert Jarrow (ed.).
Volatility. London: Risk, (2002), 417–427.

[23] Lewis, A. Option Valuation under Stochastic Volatility. Newport Beach, CA: Finance
Press, 2000.
[24] Hagan, P., Kumar, D., Lesniewski, A., and Woodward, D. Managing smile risk. Wilmott
(September 2002):84–108.
[25] Jourdain, B. Loss of martingality in asset price models with lognormal stochastic
volatility. Working Paper, 2004.
http://cermics.enpc.fr/reports/CERMICS-2004/CERMICS-2004-267.pdf.
[26] Hagan, P., Lesniewski, A., and Woodward, D. Probability distribution in the SABR
model of stochastic volatility. Working Paper, March 22, 2005.
[27] Henry-Labordère, P. A general asymptotic implied volatility for stochastic volatility
models. April 2005. http://ssrn.com/abstract=698601.
[28] Bourgade, P., and Croissant, O. Heat kernel expansion for a family of stochastic volatility
models: -geometry. Working Paper, 2005. http://arxiv.org/abs/cs.CE/0511024.
[29] Scott, L. Option pricing when the variance changes randomly: theory, estimation and
an application. Journal of Financial and Quantitative Analysis 22:419–438 (1987).
[30] Fouque, J-P., Papanicolaou, G., and Sircar, K. Derivatives in Financial Markets with
Stochastic Volatility. New York: Cambridge University Press: 2000.
[31] Overhaus, M., Ferraris, A., Knudsen, T., Milward, R., Nguyen-Ngoc, L., and Schindl-
mayr, G. Equity Derivatives—Theory and Applications. Hoboken, NJ: Wiley, 2002.
[32] Schoutens, W. Levy Processes in Finance. Hoboken, NJ: Wiley, 2003.
[33] Cont, R., and Tankov, P. Financial Modelling with Jump Processes. Boca Raton, FL:
CRC Press, 2003.
[34] Schoebel, R., and Zhu, J. Stochastic volatility with an Ornstein-Uhlenbeck process: An
extension. European Finance Review 3:23–46 (1999).
[35] Bates, D. Jumps and stochastic volatility: Exchange rate process implicit in
deutschemark options. Review of Financial Studies 9:69–107 (1996).
[36] Brace, A., Goldys, B., Klebaner, F., and Womersley, R. Market model of stochastic
implied volatility with application to the BGM model. Working Paper, 2001.
http://www.maths.unsw.edu.au/~rsw/Finance/svol.pdf.
[37] Schönbucher, P.J. A market model for stochastic implied volatility. Philosophical
Transactions of the Royal Society A 357:2071–2092 (1999).
[38] Cont, R., da Fonseca, J., and Durrleman, V. Stochastic models of implied volatility
surfaces. Economic Notes 31(2): 361–377 (2002).
[39] Haffner, R. Stochastic Implied Volatility. Heidelberg: Springer, 2004.
[40] Derman, E., and Kani, I. Stochastic implied trees: Arbitrage pricing with stochastic
term and strike structure of volatility. International Journal of Theoretical and Applied
Finance 1(1):61–110 (1998).
[41] Barndorff-Nielsen, O., Graversen, S., Jacod, J., Podolskij, M., and Shephard, N.
A central limit theorem for realised power and bipower variations of continuous
semimartingales. Working Paper, 2004.
http://www.nuff.ox.ac.uk/economics/papers/2004/W29/BN-G-J-P-S fest.pdf.
[42] Demeterfi, K., Derman, E., Kamal, M., and Zou, J. More than you ever wanted to
know about volatility swaps. Journal of Derivatives 6(4):9–32 (1999).
[43] Carr, P., and Madan, D. Towards a theory of volatility trading. In: Robert Jarrow, ed.,
Volatility. Risk Publications, pp. 417–427 (2002).
[44] Carr, P., and Lewis, K. Corridor variance swaps. Risk (February 2004).
[45] El Karoui, N., Jeanblanc-Picqué, M., and Shreve, S.E. Robustness of the Black and
Scholes formula. Mathematical Finance 8:93 (April 1998).
[46] Carr, P., and Lee, R. Robust replication of volatility derivatives. Working Paper, April
2003. http://math.uchicago.edu/~rl/voltrading.pdf.
[47] Heath, D., Jarrow, R., and Morton, A. Bond pricing and the term structure of interest
rates: A new methodology for contingent claims valuation. Econometrica 60(1):77–105 (1992).
[48] Bergomi, L. Smile dynamics II. Risk (September 2005).
[49] Buehler, H. Consistent variance curves. Finance and Stochastics (2006).
[50] Musiela, M. Stochastic PDEs and term structure models. Journées Internationales de
France, IGR-AFFI, La Baule (1993).
[51] Björk, T., and Svensson, L. On the existence of finite dimensional realizations for
nonlinear forward rate models. Mathematical Finance, 11(2): 205–243(2001).
[52] Filipovic, D. Consistency Problems for Heath-Jarrow-Morton Interest Rate Models
(Lecture Notes in Mathematics 1760). Heidelberg: Springer, 2001.
[53] Filipovic, D., and Teichmann, J. On the geometry of the term structure of interest rates.
Proceedings of the Royal Society London A 460: 129–167 (2004).
[54] Buehler, H. Volatility markets: Consistent modeling, hedging and practical implemen-
tation. PhD thesis TU Berlin, to be submitted 2006.
[55] Björk, T., and Christensen, B.J. Interest rate dynamics and consistent forward curves.
Mathematical Finance 9(4): 323–348 (1999).
[56] Duffie, D., Pan, J., and Singleton, K. Transform analysis and asset pricing for affine
jump-diffusions. Econometrica 68: 1343–1376 (2000).
[57] Dupire, B. Arbitrage pricing with stochastic volatility. In Carr, P., Derivatives Pricing:
The Classic Collection pp. 197–215, London: Risk, 2004.
[58] Sin, C. Complications with stochastic volatility models. Advances in Applied Probabil-
ity 30: 256–268 (1998).
[59] Brace, A., Gatarek, D., and Musiela, M. The market model of interest rate dynamics.
Mathematical Finance 7: 127–154 (1997).
[60] Heath, D., Jarrow, R., and Morton, A. Bond pricing and the term structure of
interest rates: A new methodology for contingent claims valuation. Econometrica
60(1): 77–105 (1992).
[61] Hull J., and White, A. One-factor interest-rate models and the valuation of interest
rate derivatives. Journal of Financial and Quantitative Analysis 28: 235–254 (1993).
[62] Vasicek, O. An equilibrium characterisation of the term structure. Journal of Financial
Economics 5: 177–188 (1977).
[63] Black, F., and Karasinski P. Bond and option pricing when short rates are lognormal.
Financial Analysts Journal (July–August 1991): 52–59.
[64] Cox, J.C., Ingersoll, J.E., and Ross, S.A., A theory of the term structure of interest
rates. Econometrica 53: 385–407 (1985).
[65] Black, F., Derman, E., and Toy, W. A one-factor model of interest rates and its
application to Treasury bond options. Financial Analysts Journal (July–August 1990):
52–59.
[66] Jamshidian, F. Forward induction and the construction of yield curve diffusion models.
Journal of Fixed Income 1:62–74 (1991).
[67] Hull, J., and White, A. Numerical procedures for implementing term structure models
I: Single-factor models. Journal of Derivatives 2: 7–16 (1994).
[68] Jamshidian, F. Bond and option evaluation in the Gaussian interest rate model.
Research in Finance 9:131–70 (1991).
[69] Kloeden, P., and Platen, E. Numerical Solution of Stochastic Differential Equations,
3rd ed. Heidelberg: Springer, 1999.
[70] Press, W.H., Teukolsky, S.A., Vetterling, W.T., and Flannery, B.P. Numerical Recipes
in C, 2nd ed., Cambridge: Cambridge University Press, 1993.
[71] Gill, P.E., Murray, W., and Wright, M.H., Practical Optimization. San Diego: Aca-
demic Press, 1981.
[72] Black, F., and Scholes, M. The pricing of options and corporate liabilities. Journal of
Political Economy 81:637–654 (1973).
[73] Merton, R.C. Theory of rational option pricing. Bell Journal of Economics and
Management Science 4:141–183 (Spring 1973).
[74] Black, F., and Cox, J. Valuing corporate securities: Some effects of bond indenture
provisions. Journal of Finance 351–367 (1976).
[75] Merton, R.C. On the pricing of corporate debt: The risk structure of interest rates.
Journal of Finance 29:449–470, 1974.
[76] Geske, R. The valuation of corporate liabilities as compound options. Journal of
Financial and Quantitative Analysis 12:541–552 (1977).
[77] Hull, J., and White, A. The impact of default risk on the prices of options and other
derivatives securities. Journal of Banking and Finance 19(2):299–322 (1995).
[78] Nielsen, L.T., Saa-Requejo, J., and Santa-Clara, P. Default risk and interest rate risk:
The term structure of default spreads. Working Paper, INSEAD, 1993.
[79] Schönbucher, P.J. Valuation of securities subject to credit risk. Working Paper,
University of Bonn, February 1996.
[80] Zhou, C. A jump-diffusion approach to modeling credit risk and valuing default-
able securities. Finance and Economics Discussion Series, Federal Reserve Board 15
(1997).
[81] Longstaff, F.A., and Schwartz, E.S. A simple approach to valuing risky fixed and
floating rate debt. Journal of Finance 50:789–819 (1995).
[82] Briys, E. and Varenne, F. Valuing risky fixed rate debt: An extension. Journal of
Financial and Quantitative Analysis 32(2):239–248 (1997).
[83] Ramaswamy, K., and Sundaresan, S.M. The valuation of floating rate instruments,
theory and evidence. Journal of Financial Economics 17:251–272 (1986).
[84] Jarrow, R.A., and Turnbull, S.M. Pricing derivatives on financial securities subject to
credit risk. Journal of Finance 50:53–85, 1995.
[85] Duffie, D., and Singleton, K. Econometric modelling of term structures of defaultable
bonds. Working Paper, Stanford University, 1994.
[86] Duffie, D., and Singleton, K. An econometric model of the term structure of interest
rate swap yields. Journal of Finance 52(4):1287–1321 (1997).
[87] Duffie, D., and Singleton, K.J. Modeling term structure of defaultable bonds. Review
of Financial Studies 12:687–720 (1999).
[88] Lando, D. On Cox processes and credit risky bonds. Review of Derivatives Research
2(2/3):99–120 (1998).
[89] Ingersoll, J.E. A contingent claim valuation of convertible securities. Journal of Finan-
cial Economics 4:289–322 (1977).
[90] Brennan, M.J., and Schwartz, E.S. Convertible bonds: Valuation and optimal strategies
for call and conversion. Journal of Finance 32:1699–1715 (1977).
[91] Brennan, M.J., and Schwartz, E.S. Analysing convertible bonds. Journal of Financial
and Quantitative Analysis 15(4):907–929 (1980).
[92] Nyborg, K.G. The use and pricing of convertible bonds. Applied Mathematical Finance
3:167–190 (1996).
[93] Carayannopoulos, P. Valuing convertible bonds under the assumption of stochastic
interest rates: An empirical investigation. Quarterly Journal of Business and Economics
35(3):17–31 (summer 1996).
[94] Cox, J., Ingersoll, J., and Ross, S. A theory of the term structure of interest rates.
Econometrica 53:385–407 (1985).
[95] Zhu, Y.-I., and Sun, Y. The singularity separating method for two factor convertible
bonds. Journal of Computational Finance 3(1):91–110 (1999).
[96] Epstein, D., Haber, R., and Wilmott, P. Pricing and hedging convertible bonds under
non-probabilistic interest rates. Journal of Derivatives, Summer 2000, 31–40 (2000).
[97] Barone-Adesi, G., Bermúdez, A., and Hatgioannides, J. Two-factor convertible bonds
valuation using the method of characteristics/finite elements. Journal of Economic
Dynamics and Control 27(10):1801–1831 (2003).
[98] Nogueiras, M.R. Numerical analysis of second order Lagrange-Galerkin schemes.
Application to option pricing problems. Ph.D. thesis, Department of Applied
Mathematics, Universidad de Santiago de Compostela, Spain, 2005.
[99] McConnell, J.J., and Schwartz, E.S. LYON taming. The Journal of Finance
41(3):561–577 (July 1986).
[100] Cheung, W., and Nelken, I. Costing the converts. Risk 7(7):47–49 (1994).
[101] Ho, T.S.Y., and Pfeffer, D.M. Convertible bonds: Model, value, attribution and
analytics. Financial Analysts Journal (September–October 1996):35–44.
[102] Schönbucher, P.J. Credit Derivatives Pricing Models: Models, Pricing and
Implementation. Hoboken, NJ: Wiley, 2003.
[103] Arvanitis, A., and Gregory, J. Credit: The Complete Guide to Pricing, Hedging and
Risk Management. London: Risk Books, 2001.
[104] Bermúdez, A., and Webber, N. An asset based model of defaultable convertible
bonds with endogenised recovery. Working Paper, Cass Business School, London,
2004.
[105] Vasicek, O.A. An equilibrium characterisation of the term structure. Journal of
Financial Economics 5:177–188 (1977).
[106] Hull, J.C., and White, A. Pricing interest rate derivative securities. Review of Financial
Studies 3:573–592 (1990).
[107] Bermúdez, A., and Nogueiras, M.R. Numerical solution of two-factor models for
valuation of financial derivatives. Mathematical Models and Methods in Applied
Sciences 14(2):295–327 (February 2004).
[108] Davis, M., and Lischka, F. Convertible bonds with market risk and credit risk. Studies in
Advanced Mathematics. Somerville, MA: American Mathematical Society/International
Press, 2002:45–58.
[109] Black, F., Derman, E., and Toy, W. A one factor model of interest rates and
its application to Treasury bond options. Financial Analysts Journal 46:33–39
(1990).
[110] Zvan, R., Forsyth, P.A., and Vetzal, K.R. A general finite element approach for PDE
option pricing model. Proceedings of Quantitative Finance 98 (1998).
[111] Yigitbasioglu, A.B. Pricing convertible bonds with interest rate, equity and FX risk.
ISMA Centre Discussion Papers in Finance, University of Reading (June 2002).
[112] Cheung, W., and Nelken, I. Costing the converts. In Over the Rainbow, vol. 46,
London: Risk Publications, 1995: 313–317.
[113] Kalotay, A.J., Williams, G.O., and Fabozzi, F.J. A model for valuing bonds and
embedded options. Financial Analysts Journal (May–June 1993):35–46.
[114] Takahashi, A., Kobayashi, T., and Nakagawa, N. Pricing convertible bonds with
default risk: A Duffie-Singleton approach. Journal of Fixed Income 11(3):20–29
(2001).
[115] Tsiveriotis, K., and Fernandes, C. Valuing convertible bonds with credit risk. Journal
of Fixed Income 8(2):95–102 (September 1998).
[116] Ayache, E., Forsyth, P.A., and Vetzal, K.R. Next generation models for convertible
bonds with credit risk. Wilmott Magazine 68–77 (December 2002).
[117] Ayache, E., Forsyth, P.A., and Vetzal, K.R. Valuation of convertible bonds with credit
risk. Journal of Derivatives 11(1):9–29 (April 2003).
[118] Protter, P. Stochastic Integration and Differential Equations, vol. 21 of Applications
of Mathematics, 3rd ed. Heidelberg: Springer-Verlag, 1995.
[119] Jacod, J., and Shiryaev, A.N. Limit Theorems for Stochastic Processes. Berlin: Springer,
1988.
[120] Olsen, L. Convertible bonds: A technical introduction. Research tutorial, Barclays
Capital, 2002.
[121] Das, S.R., and Sundaram, R.K. A simple model for pricing securities with equity,
interest-rate and default risk. Defaultrisk.com, 2004.
[122] Andersen, L., and Buffum, D. Calibration and implementation of convertible bond
models. Journal of Computational Finance 7(2):1–34 (2003).
[123] Kiesel, R., Perraudin, W., and Taylor, A. Credit and interest rate risk. In M.H.A.
Dempster, ed., Risk Management: Value at Risk and Beyond. New York: Cambridge
University Press, 2002.
[124] Bielecki, T., and Rutkowski, M. Credit Risk: Modeling, Valuation and Hedging.
Heidelberg: Springer Finance, 2002.
[125] Bakshi, G., Madan, D., and Zhang, F. Understanding the role of recovery in default risk
models: Empirical comparisons and implied recovery rates. Working Paper, University
of Maryland, November 2001.
[126] Unal, H., Madan, D., and Guntay, L. A simple approach to estimate recovery rates with
APR violation from debt spreads. Working Paper, University of Maryland, February
2001.
[127] Hamilton, D.T., Gupton, G., and Berthault, A. Default and recovery rates of corporate
bond issuers: 2000. Special comment, Moody’s Investors Service, Global Credit
Research, February 2000.
[128] Altman, E.I., Resti, A., and Sironi, A. Analysing and explaining default recovery
rates. Report, ISDA, Stern School of Business, New York University, December
2001.
[129] Realdon, M. Convertible subordinated debt valuation and “conversion in distress.”
Working Paper, Department of Economics and Related Studies, University of York,
2003.
[130] Overhaus, M., et al. Modelling and Hedging Equity Derivatives. London: Risk Books, 1999.
[131] Jeanblanc, M. Modelling of default risk: Mathematical tools, 2000.
[132] Duffie, D., and Kan, R. A yield-factor model of interest rates. Mathematical Finance
6(4):379–406 (1996).
[133] Nelsen, R.B. An introduction to copulas. Mathematical Tools, 2000, p. 91.
[134] Black, F., and Scholes, M. The pricing of options and corporate liabilities. Journal of
Political Economy 81:637–659 (1973).
[135] Heston, S.L. A closed-form solution for options with stochastic volatility with
applications to bond and currency options. Review of Financial Studies 6:327–343
(1993).
[136] Dupire, B. Pricing with a smile. Risk 7:18–20 (January 1994).
[137] Revuz, D., and Yor, M. Continuous Martingales and Brownian Motion, 3rd ed.
Heidelberg: Springer, 1998:375 (theorem (2.1)).
[138] Gyöngy, I. Mimicking the one-dimensional marginal distributions of processes having
an Ito differential. Probability Theory and Related Fields 71:501–516 (1986).
[139] Clewlow, L., and Strickland, C. Implementing Derivative Models. Hoboken, NJ: Wiley,
1998.
[140] Wilmott, P., Dewynne, J., and Howison, S. Option Pricing: Mathematical Models and
Computation. New York: Oxford Financial Press, 1993.
[141] Vázquez, C. An upwind numerical approach for an American and European option
pricing model. Applied Mathematics and Computation 273–286 (1998).
[142] Bermúdez, A., and Moreno, C. Duality methods for solving variational inequalities.
Computers & Mathematics with Applications 7:43–58 (1981).
[143] Hull, J.C., and White, A. Efficient procedures for valuing European and American path
dependent options. Journal of Derivatives 1:21–31 (Fall 1993).
[144] Ewing, R.E., and Wang, H. A summary of numerical methods for time-dependent
advection-dominated partial differential equations. Journal of Computational and
Applied Mathematics 128:423–445 (2001).
[145] Pironneau, O. On the transport-diffusion algorithm and its application to the
Navier-Stokes equations. Numerische Mathematik 38(3):309–332 (1982).
[146] Douglas, J., and Russell, T. Numerical methods for convection dominated diffusion
problems based on combining methods of characteristics with finite element methods
or finite differences. SIAM Journal on Numerical Analysis 19(5):871 (1982).
[147] Baker, M.D., Süli, E., and Ware, A.F. Stability and convergence of the spectral
Lagrange-Galerkin method for mixed periodic/non-periodic convection dominated
diffusion problems. IMA Journal of Numerical Analysis 19:637–663 (1999).
[148] Baranger, J., Esslaoui, D., and Machmoum, A. Error estimate for convection problem
with characteristics method. Numerical Algorithms 21:49–56 (1999).
[149] Baranger, J., and Machmoum, A. A “natural” norm for the method of characteristics
using discontinuous finite elements: 2d and 3d case. Mathematical Modeling and
Numerical Analysis 33:1223–1240 (1999).
[150] Boukir, K., Maday, Y., Metivet, B., and Razafindrakoto, E. A high order
characteristics/finite element method for the incompressible Navier-Stokes equations.
International Journal for Numerical Methods in Fluids 25:1421–1454 (1997).
[151] Rui, H., and Tabata, M. A second order characteristic finite element scheme
for convection-diffusion problems. Numerische Mathematik 92:161–177
(2002).
[152] Bermúdez, A., Nogueiras, M.R., and Vázquez, C. Numerical analysis of
convection-diffusion-reaction problems with higher order characteristics/finite
elements. Part I: Time discretization. To appear in SIAM Journal on Numerical Analysis, 2006.
[153] Ciarlet, P.G., and Lions, J.L. eds. Handbook of Numerical Analysis, vol. 1 of North-
Holland. Amsterdam: Elsevier Science, 1989.
[154] Kangro, R., and Nicolaides, R. Far field boundary conditions for Black-Scholes
equations. SIAM Journal on Numerical Analysis 38(4):1357–1368 (2000).
[155] Barles, G., and Souganidis, P.E. Convergence of approximation schemes for fully
nonlinear second order equations. Asymptotic Analysis 4(4):271–283 (1991).
[156] Duvaut, G., and Lions, J.L. Les inéquations en mécanique et en physique. In Travaux
et Recherches Mathématiques, vol. 21. Paris: Dunod, 1972.
[157] Glowinski, R., Lions, J.L., and Trémolières, R. Analyse Numérique Des Inéquations
Variationnelles. Paris: Dunod, 1973.
[158] Bensoussan, A., and Lions, J.L. Applications Des Inéquations Variationnelles En
Contrôle Stochastique. Paris: Dunod, 1978.
[159] Jaillet, P., Lamberton, D., and Lapeyre, B. Variational inequalities and the pricing of
American options. Acta Applicandae Mathematicae 21:263–289 (1990).
[160] Crandall, M.G., Ishii, H., and Lions, P.L. User’s guide to viscosity solutions of second
order partial differential equations. Bulletin of the American Mathematical Society
27(1):1–67 (1992).
[161] Lions, P.L. Optimal control of diffusion processes and Hamilton-Jacobi-Bellman
equations, part 2: Viscosity solutions and uniqueness. Communications in Partial
Differential Equations 8(11):1229–1276 (1983).
[162] Barles, G., Daher, C.H., and Souganidis, P. Convergence of numerical schemes for
parabolic equations arising in finance theory. Mathematical Models and Methods in
Applied Sciences 5:125–143 (1995).
[163] Wilmott, P. Derivatives: The Theory and Practice of Financial Engineering. Hoboken,
NJ: Wiley, 1998.
[164] Clarke, N., and Parrott, K. Multigrid American option pricing with stochastic volatility.
Applied Mathematical Finance 6:177–195 (1999).
[165] Forsyth, P.A., and Vetzal, K. Quadratic convergence for valuing American options
using a penalty method. SIAM Journal on Scientific Computing 23:2096–2123
(2002).
[166] Parés, C., Castro, M., and Macías, J. On the convergence of the Bermúdez-Moreno
algorithm with constant parameters. Numerische Mathematik 92:113–128 (2002).
[167] Morton, K.W. Numerical Solution of Convection-Diffusion Problems. Boca Raton,
FL: Chapman & Hall, 1996.
[168] Pironneau, O., and Hecht, F. Mesh adaptation for the Black and Scholes equations.
Journal of Numerical Mathematics 8(1):25–35 (2000).
[169] Figlewski, S., and Gao, B. The adaptive mesh model: A new approach to efficient
option pricing. Working Paper, Stern School of Business, New York University,
1997.
[170] Zvan, R., Forsyth, P.A., and Vetzal, K.R. PDE methods for pricing barrier options.
Journal of Economic Dynamics and Control 24 (2000).
[171] Pooley, D.M., Forsyth, P.A., Vetzal, K.R., and Simpson, R.B. Unstructured meshing
for two asset barrier options. Applied Mathematical Finance 7:33–60 (2000).
[172] Winkler, G., Apel, T., and Wystup, U. Valuation of options in Heston’s stochastic
volatility model using finite element methods. In Foreign Exchange Risk. London: Risk
Publications, 2001.
[173] Topper, J. Finite element modeling of exotic options. Discussion paper 216, Department
of Economics, University of Hannover, December 1998.
[174] D’Halluin, Y., Forsyth, P., Vetzal, K., and Labahn, G. A numerical PDE approach for
pricing callable bonds. Applied Mathematical Finance 8:49–77 (2001).
[175] Zvan, R., Forsyth, P.A., and Vetzal, K.R. A finite volume approach for contingent
claims valuation. IMA Journal of Numerical Analysis 21:703–721 (2001).
[176] Zvan, R., Forsyth, P.A., and Vetzal, K.R. Convergence of lattice and PDE methods
valuing path dependent options with interpolation. Review of Derivatives Research
5:273–314 (2002).
[177] Zvan, R., Forsyth, P.A., and Vetzal, K.R. Robust numerical methods for PDE models
of Asian options. Journal of Computational Finance 1:39–78 (1998).
[178] Zvan, R., Forsyth, P.A., and Vetzal, K.R. A finite element approach to the pricing of
discrete lookbacks with stochastic volatility. Applied Mathematical Finance 6:87–106
(1999).
[179] Selmin, V., and Formaggia, L. Unified construction of finite element and finite volume
discretisation for compressible flows. International Journal for Numerical Methods in
Engineering 39:1–32 (1996).
[180] Ciarlet, P.G. The Finite Element Method for Elliptic Problems, vol. 4 of Studies in
Mathematics and its Applications. Amsterdam: North-Holland, 1978.
[181] Zienkiewicz, O.C., Taylor, R.L., and Zhu, J.Z. The Finite Element Method: Its Basis
and Fundamentals, 6th ed. Amsterdam: Elsevier Butterworth-Heinemann, 2005.
[182] Süli, E. Stability and convergence of the Lagrange-Galerkin method with nonexact
integration. In J.R. Whiteman, ed. The Proceedings of the Conference on the
Mathematics of Finite Elements and Applications, MAFELAP VI. London: Academic Press,
1988:435–442.
[183] Bause, M., and Knabner, P. Uniform error analysis for Lagrange-Galerkin
approximations of convection-dominated problems. SIAM Journal on Numerical Analysis
39:1954–1984 (2002).
[184] Ewing, R.E., and Russell, T.F. Multistep Galerkin methods along characteristics for
convection-diffusion problems. In R. Vichnevetsky and R.S. Stepleman, eds. Advances in
Computer Methods for Partial Differential Equations IV. IMACS Publications, 1981:
28–36.
[185] Boukir, K., Maday, Y., Metivet, B., and Razafindrakoto, E. A high-order
characteristics/finite element method for incompressible Navier-Stokes equations.
International Journal for Numerical Methods in Fluids 25:1421–1454 (1997).
[186] Priestley, A. Exact projections and the Lagrange-Galerkin method: A realistic
alternative to quadrature. Journal of Computational Physics 112:316–333 (1994).
[187] Morton, K.W., Priestley, A., and Süli, E. Stability of the Lagrange-Galerkin method with
nonexact integration. Mathematical Modeling and Numerical Analysis 22:625–653
(1988).
[188] Broadie, M., and Glasserman, P. Pricing American-style securities using
simulation. Journal of Economic Dynamics and Control 21(8–9):1323–1352 (1997).
[189] Longstaff, F., and Schwartz, E. Valuing American options by simulation: A simple
least-squares approach. Review of Financial Studies 14:113–148 (2001).
[190] Fries, Christian P. Foresight bias and suboptimality correction in Monte-Carlo
pricing of options with early exercise: Classification, calculation and removal.
http://www.christian-fries.de/finmath/foresightbias.
[191] Tilley, J.A. Valuing American options in a path simulation model. Transactions of the
Society of Actuaries 45:83–104 (1993).
[192] Rogers, C. Monte Carlo valuation of American options. Mathematical Finance
12:271–286 (2002).
[193] Andersen, L., and Broadie, M. A primal-dual simulation algorithm for pricing
multidimensional American options. Management Science 50(9):1222–1234 (2004).
[194] Andersen, L.B.G. A simple approach to the pricing of Bermudan swaptions in the
multifactor LIBOR market model. (March 5, 1999). http://ssrn.com/abstract=155208.
[195] Windcliff, H., Forsyth, P., and Vetzal, K. Asymptotic boundary conditions for the
Black-Scholes equation. Working Paper, University of Waterloo, October 2001.
[196] Topper, J. Financial Engineering with Finite Elements. Hoboken, NJ: Wiley
Finance, 2005.
[197] Bermúdez, A., Nogueiras, M.R., and Vázquez, C. Numerical analysis of
convection-diffusion-reaction problems with higher order characteristic/finite elements.
Part II: Fully discretized scheme and quadrature formulas. To appear in SIAM Journal
on Numerical Analysis, 2006.
Equity Hybrid Derivatives
By Marcus Overhaus, Ana Bermúdez, Hans Buehler, Andrew Ferraris, Christopher Jordinson and Aziz Lamnouar
Copyright © 2007 by Marcus Overhaus, Aziz Lamnouar, Ana Bermúdez, Hans Buehler,
Andrew Ferraris, and Christopher Jordinson

Index

Affine diffusion models, for hazard rate processes, 175–176
Altiplano, 207, 218–223
American Monte Carlo, 297–311
  Broadie & Glasserman method, 299
  Longstaff & Schwartz algorithm, 301–310
    accuracy/bias, 305–306
  parameterizing the exercise boundary, 311
  upper bounds, 310–311
American-style option, 254
Arbitrage
  arbitrage-free option price surfaces, 16–17
  no-arbitrage conditions, 20, 231–232
Arrow-Debreu price, 234
Asset backed securities, 223
Ayache, Forsyth and Vetzal model, 130, 136–137

Barrier risk, 123–124
Basket default swaps, 207, 226–228
Bermudan-style options, 113, 254, 298
Black-Karasinski model, See Short rate models
Black-Scholes
  barrier pricing, 123–124
  formula, 12
  implied volatility, See Implied volatility
  model, 11
  PDE, 233
  vega hedging, 64
Boundary conditions, 253–254
  Dirichlet, 253–254
  Robin, 253–254
Broadie & Glasserman method, See American Monte Carlo
Broyden’s method, 109–111

Capital risk, 160
Caps, 100–102
Carr-Madan technique, 93, 194
Cash delta, 61
Cash gamma, 61, 64, 70
Characteristic functions, 38–41
Characteristics
  curves/lines, 263–264
  finite elements, See Lagrange-Galerkin method
  method, 263
    classical/first-order, 262–265
    Crank-Nicolson, 274–276
    multilevel schemes, 271
Cliquets, 49–56
  multiplicative, 50
  reverse, 50
Closed-end fund, 159
Constant Maturity Swap (CMS) rates, 91–92n
Collateralized debt obligations (CDOs), 223–226
  balance sheet CDO, 224
  cash CDO, 224
  default leg, 224–225
  synthetic CDO, 224
Concordance, 209
Conditional trigger swaps, 115–118
Constant elasticity of variance (CEV) model, 17
Constant proportion portfolio insurance (CPPI), 145–163
  actively managed, 155
  allocation mechanism, 146
  classical, 146–149
  continuous time, 148–149
  deleveraging, 149–150
  dynamic gearing, 154
  flexi-basket, 155
  flexi-portfolio, 146, 155–156
  gap risk, 147–148
  investment level, 146, 149, 153
  key words, 145–146
  liquidity issues, 158
  maximum investment level, 149
  minimum investment level, 149, 150f, 151f
  momentum, 146, 155
  nonstandard, 153–158
  off-balance-sheet CPPI, 156–158
  options, 152–153
  passively managed, 155–156
  perpetual, 154
  principal protected note, 162
  protected fees, 150–151
  rainbow, 156
  ratcheting, 150
  rebalancing, 146–147, 153, 156
  restricted, 149–152
  straight-line floor, 151
  suitable assets, 158
Convection-diffusion-reaction equation, 252
  diffusion matrix, 252
  reaction coefficient, 252
  velocity vector, 252
Convection-dominated problems, 249, 263
Convertible arbitrage, 159
Convertible bonds, 128–138
  See also Exchangeable bonds
  analytical solutions, 137–138
  call price, 132
  conversion ratio, 131
  conversion rights upon default, 136
  firm-value model, 135
  governing equation, 131–134
  model review, 128–131, 136–137
  model specifications, 134–136
  put price, 132
  recovery, 135, 140
  splitting procedures, 130, 133–134
    numerical solution, 279–285
  unilateral conditions, 133
Copulas, 207–228
  applications, 218–228
  Archimedean, 215
  Clayton, 215–216
  definitions, 207–209
  elliptic, 213–215
  factor copula framework, 217–218
  Gaussian, 213–214
  independence copula, 215
  minmax, 216
  stochastic processes, 211–213
  student, 214–215
  t-copula, 214–215
  tail dependence, 210–211
Cox-Ingersoll-Ross model, See Short rate models
CPPI. See Constant proportion portfolio insurance
Crank-Nicolson characteristics finite elements, 271–279
Credit default swap (CDS), 180–186
  equity default swap sensitivity, 202
  default leg, 180–181
  premium leg, 180
  spread, 181
    link to hazard rate volatility, 183–184
Credit default swaption, 181–184
Credit modeling, 167–203
  background, 167–174
  calibration, 186–188
  convertible bond pricing, 128–131
  first-passage approach, 169–170
  model choice, 176–179
  pricing, 180–186
  reduced-form approach, 128, 171–174
  standard approach, 168–169
  structural approach, 128, 168–171
  survival probability dynamics, 179, 189–190

Default risk, 167–171
  effect on the forward, 6
  effect on replication arguments, 33–34
  effect on variance swaps, 59
  modeling, See Credit modeling
Default protection, 201
Dependence measures, 209–211
Doleans-Dade exponential, 114, 246
Domain truncation, See Boundary conditions
Drawdown, 160
Duality method, 260–262
  continuous problem, 260–262
  convergence, 262, 269
  discrete problem, 269
  Yosida approximation, 261
Dupire’s implied local volatility, 17, 19, 231–232
Dupire’s formula, 230–233, 237. See also Local volatility
  multifactor models, 232–233

Entropy swaps, 68–69
Equity default swaps (EDS), 196–203
  CDS curve sensitivity, 203f
  equity leg/protection leg, 196
  modeling, 198
  multiname, 198
  structuring, 197
Equity models, 35–56
Eulerian methods, 263
Exchangeable bonds, 138–144

Finite differences, 250
Finite elements, 265–269, 276–279
  See also Lagrange triangular finite elements
  blocking degrees of freedom, 268
  triangulation, nodes, 287
First-to-default swap, 226
Forward PDEs, 233–237
  pure equity model, 235–237
    call prices, 236–237
  stochastic equity + interest rate model, 238–242
Forward skew, 52–56
Forward started options, 15, 49–56
Frechet bounds, 209
Free boundaries, 255
Free-boundary problem, 132, 248, 254–259
Fund keywords, 159–160
Fund of funds, 159–160
Future skew, 56

Galerkin methods, 265
Gamma swaps, 69–71
Generic Ornstein-Uhlenbeck models, 98–100, 104–105, 108–109
Green’s formula, 273

Hazard rate, 171–173, See also Credit modeling
  curve calibration, 187
  modeling, 135, 175
  process, 135, 171
  stripping, 186–187
  volatility calibration, 187
Heath-Jarrow-Morton (HJM) models, 92
Hedge funds, 158–159
Heston model, 35–43, 46–56
High-water-mark fee structure, 153–154, 160
HJM, See Heath-Jarrow-Morton (HJM) models
Hull-White model, See Short rate models
  for hazard rates, 175–176
Hurdle rate, 160

Implied volatility, 11–16, 229–230
  skew, See Forward skew, Future skew
  sticky strike / sticky delta, 13
Independent increment equity model, 53
Interest rate models, 91–95, See also Short rate models

Jump diffusion models, 20
  for convertible bonds, 130
  for CPPIs, 153
  with Heston, 55

Kendall’s tau, 209–210, 216

Lagrange-Galerkin method
  See also Lagrange triangular finite elements
  classical, 262–271
    convergence order, 270, 285
  higher-order, 271–279
    convergence order, 275–279
Lagrange multiplier, 132, 249
  method, See Duality method
Lagrange triangular finite elements, 268–269, 285–296
  assembling, 288, 291
  elementary matrix, 287–295
  independent term, 286–295
  simplex, 285–286
    reference element, 292–296
Least-squares American Monte Carlo, See Longstaff & Schwartz algorithm
Least-squares minimization, 109–111
Linear complementarity problem, See Partial differential inequalities
Local volatility, 17–21, 229–233, 242–244
  calibration, 229–233, 242–244
  Dupire’s implied local volatility, 17–21
  introduction, 229–230
  with stochastic interest rates, 238–242
Localized domain, See Boundary conditions
London Interbank Offered Rate (LIBOR), 91n
Longstaff & Schwartz algorithm, 301–310

Macro arbitrage, 159
Master feeder fund, 160
Material derivatives, 252
  approximation, 264–265, 271
Milstein scheme, 42, 109
Multifactor pricing problems, numerical solution
  duality methods, 260–262
  introduction, 248–250
  Lagrange-Galerkin, 262–279
  model formulation
    linear problem, 251–253
    nonlinear problem, 254–259
Musiela parametrization of forward variance, 73

Napoleons, 49
Newton-Raphson method, 110

Open-end structure fund, 154
Ornstein-Uhlenbeck (OU) process, 94–95
  See also Generic Ornstein-Uhlenbeck models

Partial differential equations (PDEs)
  See also Forward PDEs
  numerical solution, See Lagrange-Galerkin methods
  linear problem, strong formulation, 251–253
Partial differential inequalities (PDIs), 248, 254–259
  numerical solution, See Duality methods
  nonlinear problem
    strong primal formulation, 255
    strong mixed formulation, 255–256
  unilateral conditions, 255
Prime Brokerage, 160

Realized variance, 58–60, 71
Replication theory, 27–34

SABR model, 43–44
Scott’s exponential OU model, 81
Semi-Lagrangian time discretization, See Characteristics method
Sharpe ratio, 160
Short rate models, 91–111
  Black-Karasinski, 93
  Cox-Ingersoll-Ross, 93
  Generic Ornstein-Uhlenbeck models, 94–95
    calibration, 98–100, 104–105
  Hull-White/Vasicek model, 93, 280
    calibration, 95–97, 101–104
    with a Black-Scholes equity model, 224–247
Short-variance, 9, 45
Simplex, See Lagrange triangular finite elements
Singular Value Decomposition, 310
Sklar theorem, 207–208
Spearman’s rho, 210, 216
Special purpose vehicle (SPV), 223–224
Stochastic implied volatility, 50–52
Stochastic volatility models, See Heston model, SABR model, Scott’s exponential OU model
Subreplication strategy, 22
Survival probability, 138, 177, 187
  See also Credit modeling
Swaptions, 100–105

Takahashi, Kobayashi and Nakagawa (TKN) model, 136
Target redemption notes (TARNs), 113, 118–128
  back-testing, 120–123
  embedded call spreads, 124, 128
  internal rate of return, 119–121
  redemption scenarios, internal rates of return, 120t
  stochastic rate impact, 125–127
  structure, 118–120
  valuation approach, 123–128
Tilley’s method, 310
Tsiveriotis and Fernandes (TF) model, 136–137
Total derivatives, See Material derivatives

Variance swaps, 31, 56–87
  capped, 66–67
  HJM theory, 73–75
  market fitting, 83–85
  market models, 71–87
  pricing/hedging, 61–63
Variational equalities
  See also Partial differential equations
  linear problem, weak formulation, 272–274
Variational inequalities, 256–259, 272–274
  See also Partial differential inequalities
  nonlinear problem, weak mixed formulation, 256, 259
  nonlinear problem, weak primal formulation, 258–259
  test function, 256, 258
Vasicek model, See Short rate models
Venture capital, 160
Vieta’s formula, 223
VIX, 66

Yield curves, 91–94
Yosida approximation, See Lagrange multiplier method

Zero-coupon bonds, 91–93
  in Hull-White/Vasicek model, 96–97
