Variance Swaps and Volatility Derivatives
John Crosby
Glasgow University
My website is: http://www.john-crosby.co.uk
If you spot any typos or errors, please email me.
My email address is on my website
Lecture given 20th February 2009
for the M.Sc. and Ph.D. courses
in Quantitative Finance
in the Department of Economics
at Glasgow University
File date 21st June 2009 20.08
Presentation on Variance Swaps and Volatility Derivatives
Friday February 20, 2009 Glasgow, UK
Motivation
The market prices of stocks (or other assets such as foreign
exchange rates or commodity prices) fluctuate randomly.
Once we have observed a time-series of market prices, we
can compute the realised variance. If we take the square
root, we can compute the realised volatility. Suppose a
trader wishes to take a view (via a trading position) today
on the realised variance that will be observed over some
given future time period. How can she do this?
What sort of derivatives can be used for this and how are
they priced and hedged?
How is the realised variance over this given time period
(which is unknown today but will be known at the end of
the time period) related to the implied volatilities,
observable today, of vanilla options which mature at the
end of the time period?
These are questions which we will try to answer today.
Why this is not easy
It might be tempting to think that if a trader thinks that,
for example, realised volatility over a given time period
will be higher than the implied volatility of an option
maturing at the end of the time period, then she should
buy the vanilla option. However, what strike should the
option have? Vanilla options which are struck
at-the-forward have the largest vega (sensitivity to
volatility). But options which are at-the-forward
at the time they are written may be deep in or out of the
money later (because the stock price moves), at which time
they will have a much lower vega.
It is clear that vanilla options are an imperfect vehicle for
a trader to take a view on volatility or variance. This is
because the price of the vanilla and the sensitivity of the
price of the vanilla to variance (i.e. the partial derivative of
the vanilla price with respect to variance) depend on the
stock price.
What sort of instrument or derivative might be a better
vehicle to take a view on variance?
A primer
Let us introduce some notation. Suppose today, time $t_0$,
we write a European option which matures at time $T$ on a
stock whose price, at time $t$, is denoted by $S(t)$. We
denote the price of the option, at time $t$, by $C(t)$. We
assume that the stock price follows geometric Brownian
motion with volatility $\sigma$. We'll assume at this stage, for
simplicity, that interest-rates are zero and the stock pays
no dividends. We delta-hedge our short position in the
European option and rebalance our portfolio every $\delta t$.
Note that $\delta t$ is finite - not infinitesimal.
The P+L (profit and loss) over the time interval from $t$ to
$t + \delta t$ is:
$$\frac{\partial C}{\partial t}\,\delta t + \frac{1}{2}\frac{\partial^2 C}{\partial S^2}(\delta S)^2,$$
where $\delta S \equiv S(t + \delta t) - S(t)$.
In the last line, we have used a Taylor series expansion and
cancelled out the delta terms.
A primer 2
However, the Black and Scholes (1973) PDE says:
$$\frac{\partial C}{\partial t} + \frac{1}{2}\sigma^2 (S(t))^2 \frac{\partial^2 C}{\partial S^2} = 0.$$
Hence, substituting, we get that the P+L over the time
interval from $t$ to $t + \delta t$ is:
$$\frac{1}{2} S^2 \frac{\partial^2 C}{\partial S^2}\left(\frac{(\delta S)^2}{S^2} - \sigma^2\,\delta t\right).$$
Note that if we were to let $\delta t$ tend to zero, the P+L would
tend to zero (this is simply the Merton (1973) hedging
argument). However, $\delta t$ is not infinitesimal.
We can sum up the P+L over each time interval $\delta t$. Then
the P+L over the time interval from $t_0$ to $T$ is:
$$\sum \frac{1}{2} S^2 \frac{\partial^2 C}{\partial S^2}\left(\frac{(\delta S)^2}{S^2} - \sigma^2\,\delta t\right).$$
Notice how there is a path-dependency in this P+L. If, for
example, $\delta S$ were to tend to be large when
$\partial^2 C / \partial S^2$ was large
and positive, then the P+L would tend to be large and
positive. If, for example, $\delta S$ were to tend to be small
relative to $S$, and if $\partial^2 C / \partial S^2$ is positive (which it certainly is
for a vanilla option), then the P+L would tend to be
negative.
A primer 3
So, in general, the P+L of the delta-hedging strategy is
path-dependent. However, while we assumed the option
was European, we never assumed it had a vanilla payoff.
The option could have any payoff at time $T$.
Suppose that the option is such that its gamma
$\partial^2 C / \partial S^2$ is identically equal to $1/S^2$.
Then the P+L over the time interval from $t_0$ to $T$ is:
$$\frac{1}{2}\sum\left(\frac{(\delta S)^2}{S^2} - \sigma^2\,\delta t\right).$$
Note that $\sigma^2$ is constant (we assumed this at the
beginning). Furthermore, $\sum (\delta S)^2 / S^2$ is a possible definition
for realised variance. In the market, variance swaps (which
we will define and explain shortly - but which are
essentially forward contracts on realised variance) have a
payoff whose floating part is
$$\sum \left(\log(S(t + \delta t)/S(t))\right)^2.$$
However, if $\delta t$ and $\delta S$ are small then a Taylor series
expansion implies $(\delta S)^2 / S^2$ and $(\log(S(t + \delta t)/S(t)))^2$ are
approximately equal. So the P+L (up to a scaling factor) is
approximately the same as that of a variance swap.
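The closeness of the two definitions of realised variance can be checked numerically. The sketch below is illustrative only: the function names, the seed and the parameter values (zero rates, 20% volatility, daily steps over one year) are assumptions chosen for the example, not part of the lecture.

```python
import math
import random

def simulate_gbm_path(s0, sigma, dt, n_steps, seed=0):
    """Simulate a geometric Brownian motion path with zero rates and dividends."""
    rng = random.Random(seed)
    path = [s0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        # Exact GBM step: the log-return is N(-sigma^2 dt / 2, sigma^2 dt)
        path.append(path[-1] * math.exp(-0.5 * sigma**2 * dt + sigma * math.sqrt(dt) * z))
    return path

def sum_squared_simple_returns(path):
    """Sum of (delta S / S)^2 over the path."""
    return sum(((b - a) / a) ** 2 for a, b in zip(path, path[1:]))

def sum_squared_log_returns(path):
    """Sum of (log(S(t + dt) / S(t)))^2 over the path."""
    return sum(math.log(b / a) ** 2 for a, b in zip(path, path[1:]))

path = simulate_gbm_path(s0=100.0, sigma=0.2, dt=1.0 / 252, n_steps=252, seed=42)
simple = sum_squared_simple_returns(path)
logged = sum_squared_log_returns(path)
print(simple, logged)
```

With daily steps the two sums should differ by only a small fraction of their size, and both should be of the same order as $\sigma^2 (T - t_0) = 0.04$.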
A primer 4
What sort of derivative has a gamma equal to $1/S^2$?
Integrating twice, we get $C(t) = a - \log(S(t)) + bS(t)$,
where $a$ and $b$ are constants of integration.
Notice how we can interpret $a$ as cash (or equivalently a
bond) and the term $bS(t)$ as a forward contract.
The term $\log(S(t))$ represents a derivative whose payoff is
log of the stock price at maturity $T$. It is called a log
contract (actually we often normalise by the initial stock
price $S(t_0)$ so the payoff of the log contract is
$\log(S(T)/S(t_0))$) and we will see that it plays a pivotal
role in the pricing of variance swaps. (Note that a log
contract can have a negative payoff.)
We will now consider the pricing and hedging of variance
swaps.
Variance swaps (definition)
A variance swap is a financial derivative whose payoff is
defined as follows: It is written at time $t_0$ and matures at
time $T$. The time interval $[t_0, T]$ is partitioned into $N$
time periods $t_i$, $i = 1, 2, \ldots, N$, where $t_N = T$. The time
periods do not have to be equal although they are often
approximately equal. The payoff of a (discretely
monitored) variance swap at time $T$ is:
$$\frac{1}{(T - t_0)} \sum_{i=1}^{N} \left( \left(\log(S(t_i)/S(t_{i-1}))\right)^2 - K^2 \right),$$
where $K$ is a constant (called the fixed leg).
$K$ is often chosen (as for interest-rate swaps) to make the initial (i.e.
time $t_0$) price of the variance swap equal to zero.
Note that, in practice in the markets, the floating leg does
not subtract the square of the mean (so it is not really a
variance).
However, the mean squared is typically tiny so it doesn't
make much difference. Furthermore, the definition means
variances are additive in the sense that we can define a
forward starting variance swap which starts in three
months' time and which is based on the computed realised
variance for a further six months, say. Then if we own such
a forward starting variance swap and a three month
variance swap (starting today), then it is the same as
owning a nine month variance swap (starting today).
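The floating leg and its additivity can be illustrated with a few lines of code. This is a minimal sketch: the price series and the monthly spacing are hypothetical, and `realised_variance` is a name introduced here for illustration. Because each leg is annualised by its own length, additivity holds once each realised variance is weighted by its year fraction.

```python
import math

def realised_variance(prices, year_fraction):
    """Annualised sum of squared log returns (the mean is not subtracted)."""
    total = sum(math.log(b / a) ** 2 for a, b in zip(prices, prices[1:]))
    return total / year_fraction

# Hypothetical monthly fixings: 9 intervals, i.e. nine months of observations
prices = [100.0, 102.0, 99.5, 101.0, 103.5, 102.0, 104.0, 101.5, 105.0, 106.0]

rv_3m = realised_variance(prices[:4], 0.25)   # three month swap starting today
rv_fwd = realised_variance(prices[3:], 0.50)  # six month swap starting in three months
rv_9m = realised_variance(prices, 0.75)       # nine month swap starting today

# Time-weighted additivity of the floating legs
print(0.25 * rv_3m + 0.50 * rv_fwd, 0.75 * rv_9m)
```

The two printed numbers agree because no mean is subtracted; subtracting a sample mean over each sub-period would break this additivity.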
Variance swap pricing methodologies
In practice, all vanilla variance swaps have payoffs which
are discretely monitored. However, from a theoretical
standpoint, it is also relevant to consider continuously
monitored variance swaps. We will consider pricing
variance swaps from two different viewpoints.
The first viewpoint is the classic log-contract replication
approach. It has the benefit that it also shows how to
hedge variance swaps. This approach requires some (fairly
weak) assumptions and actually gives prices for
continuously monitored variance swaps.
The second approach prices discretely monitored variance
swaps. It has the advantage that it is very generic because
it works for almost all stochastic processes that might be
used in mathematical finance. It has the disadvantage that
it does not show how to hedge variance swaps.
Variance swap practicalities
Before considering the pricing of variance swaps, we will
mention a few practical issues.
Variance swaps are now very, very actively traded on stock
indices (and sometimes on individual stocks). They are
also traded, but less commonly, in other asset classes such
as FX.
There are futures and options contracts on the CBOE VIX
index which are now also very actively traded. The VIX
index is the market price of a portfolio of vanilla options
which (as we will show) replicates future realised variance.
Specifically, the VIX index squared, at time t, is
(essentially) the risk-neutral conditional time t expectation
of the annualised realised variance between time t and
time t plus 30 calendar days.
The prices of vanilla options, variance swaps, VIX futures
and VIX options are all closely linked - both practically
and theoretically.
Swaps on volatility are occasionally traded.
Log-contract replication approach
We make the standard assumptions of a market with
no-arbitrage as well as continuous and frictionless trading
(no transactions costs).
We assume that the stock price has continuous sample
paths i.e. there are no jumps.
We make no assumptions about the volatility of the stock -
it could be constant, deterministic, stochastic with its own
source of randomness (stochastic volatility) or, in principle,
a function of the stock price (local volatility).
We assume that the stock price is strictly positive at all
times (this, in fact, rules out a Bachelier type arithmetic
process with normal volatility so not all local volatility
functions are possible - in addition, it, typically, rules out
models with default).
Hence, we write the dynamics of the stock price $S(t) \equiv S$
at time $t$ under the risk-neutral equivalent martingale
measure $Q$ in the form:
$$\frac{dS}{S} = (r - q)\,dt + \sigma(t, S, \ldots)\,dz,$$
where $dz$ denotes standard Brownian increments and $r$
and $q$ denote the interest-rate and the dividend yield (both
assumed constant) respectively.
Log-contract replication approach 2
We want to value a variance swap written at time $t_0$,
which matures at time $T$ and which has a continuously
monitored floating-leg payoff equal to:
$$\frac{1}{(T - t_0)}\int_{s=t_0}^{T} \sigma^2(s, S, \ldots)\,ds.$$
We know that the price $V(t_0)$, at time $t_0$, of the floating
leg of the variance swap is the expected discounted payoff
i.e. it is:
$$V(t_0) = E^Q_{t_0}\left[\exp(-r(T - t_0))\frac{1}{(T - t_0)}\int_{s=t_0}^{T} \sigma^2(s, S, \ldots)\,ds\right] = \exp(-r(T - t_0))\frac{1}{(T - t_0)}E^Q_{t_0}\left[\int_{s=t_0}^{T} \sigma^2(s, S, \ldots)\,ds\right].$$
If we apply Itô's lemma, we know:
$$d(\log S) = \left(r - q - \frac{1}{2}\sigma^2(t, S, \ldots)\right)dt + \sigma(t, S, \ldots)\,dz.$$
Eliminating the term $\sigma(t, S, \ldots)\,dz$ implies:
$$\frac{dS}{S} - d(\log S) = \frac{1}{2}\sigma^2(t, S, \ldots)\,dt.$$
Hence, integrating from $t_0$ to $T$ implies:
$$\frac{1}{2}\int_{s=t_0}^{T} \sigma^2(s, S, \ldots)\,ds = \int_{t_0}^{T}\left(\frac{dS(s)}{S(s)} - d(\log S(s))\right).$$
Log-contract replication approach 3
Note that no expectations have been taken (yet). The last
equation says that future realised variance can be captured
no matter which path the stock price takes (assuming our
assumptions hold - the assumption of no jumps in the
stock price is crucial here). Simplifying, we can write:
$$\frac{1}{2}\int_{s=t_0}^{T} \sigma^2(s, S, \ldots)\,ds = \int_{t_0}^{T}\frac{dS(s)}{S(s)} - \log(S(T)/S(t_0)).$$
In the last equation, the term $\int_{t_0}^{T}\frac{dS(s)}{S(s)}$ is a stochastic
integral. Or to put it another way, it is the gain (or loss)
from a self-financing trading strategy. What strategy?
Log-contract replication approach 4
It is the trading strategy of holding at all times between $t_0$
and $T$ a position in $1/S$ units of stock. In other words, at
any time $t$, $t \in [t_0, T]$, hold $1/S(t)$ units of stock. Since
one unit of stock is worth $S(t)$, $1/S(t)$ units of stock are
worth:
$$(1/S(t))\,S(t) = 1.$$
To put it even more simply, the trading strategy is to
dynamically trade the stock in such a way that at all
times, the value of the position in the stock is worth one
unit of account (one dollar, for example).
Note that it is a dynamic trading strategy - as the stock
price changes so does the position. In that respect, it is
like delta-hedging where the delta equals 1/S(t). The
value of the position is always one dollar.
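The pathwise identity behind this strategy can be checked on a simulated path. The sketch below is an illustration under stated assumptions: it uses an Euler discretisation with many small steps, a hypothetical local volatility function $\sigma(S) = 0.2\sqrt{S(t_0)/S}$, and made-up values of $r$ and $q$. It compares half the integrated variance with the gain from holding $1/S$ units of stock minus the log contract payoff.

```python
import math
import random

def simulate(n_steps=100_000, T=1.0, s0=100.0, r=0.03, q=0.01, seed=1):
    """Return (1/2 * integrated variance, trading gain minus log(S(T)/S(t0)))."""
    rng = random.Random(seed)
    dt = T / n_steps
    s = s0
    half_integrated_var = 0.0  # (1/2) * integral of sigma^2 ds
    trading_gain = 0.0         # sum of dS / S: gain from holding 1/S units of stock
    for _ in range(n_steps):
        sigma = 0.2 * math.sqrt(s0 / s)  # hypothetical local volatility function
        ds = s * ((r - q) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0))
        half_integrated_var += 0.5 * sigma**2 * dt
        trading_gain += ds / s
        s += ds
    return half_integrated_var, trading_gain - math.log(s / s0)

lhs, rhs = simulate()
print(lhs, rhs)
```

The two numbers agree only up to discretisation error, which shrinks as the number of rebalancing steps grows; no expectation is taken anywhere in the comparison.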
Log-contract replication approach 5
Note:
$$E^Q_{t_0}\left[\int_{s=t_0}^{T}\frac{dS(s)}{S(s)}\right] = E^Q_{t_0}\left[\int_{t_0}^{T}(r - q)\,ds + \int_{t_0}^{T}\sigma(s, S, \ldots)\,dz(s)\right].$$
The expectation of the second term in square brackets is
zero. Hence, the expectation evaluates to $(r - q)(T - t_0)$.
What we would like to know is the initial (i.e. time $t_0$)
value of the trading strategy. The expected terminal value (i.e. at
time $T$) is $(r - q)(T - t_0)$. Hence, the initial (i.e. time $t_0$)
value of the trading strategy is
$$\exp(-r(T - t_0))(r - q)(T - t_0).$$
If we look at the second term in the equation
$$\frac{1}{2}\int_{s=t_0}^{T} \sigma^2(s, S, \ldots)\,ds = \int_{t_0}^{T}\frac{dS(s)}{S(s)} - \log(S(T)/S(t_0)),$$
we see it is a static position in a contract which pays the
log of the stock price at time $T$ (normalised by its time $t_0$
price). In other words, it is a static position in an exotic
derivative which we call a log contract. What is the value
of the log contract?
Log-contract replication approach 6
The price of the log contract, at time $t_0$, is:
$$\exp(-r(T - t_0))\,E^Q_{t_0}[\log(S(T)/S(t_0))].$$
In principle, we can calculate this expectation.
For example, if the stock actually follows geometric
Brownian motion with constant volatility $\sigma$, then:
$$E^Q_{t_0}[\log(S(T)/S(t_0))] = E^Q_{t_0}\left[\left(r - q - \frac{1}{2}\sigma^2\right)(T - t_0) + \sigma\int_{t_0}^{T} dz(s)\right].$$
The expectation of $\int_{t_0}^{T} dz(s)$ is clearly zero. Hence, the
price of the log contract, at time $t_0$, is:
$$\exp(-r(T - t_0))\left(r - q - \frac{1}{2}\sigma^2\right)(T - t_0).$$
On the other hand, this is not very useful. We essentially
needed to compute $E^Q_{t_0}[\sigma^2]$ which is essentially what we
needed to compute to value the variance swap in the first
place. Furthermore, the value of the variance swap is
trivial to compute under geometric Brownian motion - the
(undiscounted) value of the floating leg is simply $\sigma^2$.
We can also value the log contract under the Heston (1993)
stochastic volatility model in which the instantaneous
stochastic variance $\nu(t)$ follows the SDE:
Log-contract replication approach 7
$$d\nu = \alpha(\theta - \nu)\,dt + \xi\sqrt{\nu}\,dz_{\nu}, \quad \text{with } \nu(t_0) \equiv \nu_0,$$
where $\alpha$ is the mean reversion rate, $\theta$ is the long-run
level of the variance and $\xi$ is the volatility of volatility.
As an exercise (during the lunch break or at the
computing lab) I would like you to prove that:
$$E^Q_{t_0}\left[\int_{s=t_0}^{T}\nu(s)\,ds\right] = \frac{(\nu_0 - \theta)}{\alpha}\left[1 - \exp(-\alpha(T - t_0))\right] + \theta(T - t_0).$$
This immediately gives the value of the variance swap.
Why is this an intuitive result? What happens when
$T - t_0 \to \infty$?
The last result is dependent on the model (Heston (1993)).
What would be more interesting to know is, what is the
price of the log contract (and hence the variance swap)
under our stated assumptions (which, apart from assuming
no jumps, allow for quite a rich specification of dynamics
e.g. local volatility, stochastic volatility, a combination of
the two)? Motivation for finding results which are only
weakly dependent on the model comes from the fact that, while
the result above is dependent on the model (Heston
(1993)), it is not strongly so, to the extent that the result
above does NOT depend on the volatility of volatility nor
on the correlation between the instantaneous variance and
the stock price.
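A numerical sanity check of the expected integrated variance (not a proof) can be sketched as follows. The symbol names `alpha` (mean reversion rate), `theta` (long-run variance) and `nu0` (initial variance), and the parameter values, are assumptions made for this illustration. The check relies only on the fact that the mean of a mean-reverting variance with drift $\alpha(\theta - \nu)$ satisfies the ordinary differential equation $dm/ds = \alpha(\theta - m)$, which is why neither the volatility of volatility nor the correlation appears.

```python
import math

def expected_integrated_variance(nu0, alpha, theta, tau):
    """Closed form for E[integral of nu(s) ds] over a horizon of length tau."""
    return (nu0 - theta) / alpha * (1.0 - math.exp(-alpha * tau)) + theta * tau

def expected_integrated_variance_numeric(nu0, alpha, theta, tau, n=100_000):
    """Euler-integrate the ODE for the mean m(s) = E[nu(s)] and accumulate its integral."""
    ds = tau / n
    m, total = nu0, 0.0
    for _ in range(n):
        m_next = m + alpha * (theta - m) * ds
        total += 0.5 * (m + m_next) * ds  # trapezoid rule on the mean path
        m = m_next
    return total

closed = expected_integrated_variance(0.09, 2.0, 0.04, 1.0)
numeric = expected_integrated_variance_numeric(0.09, 2.0, 0.04, 1.0)
long_run_average = expected_integrated_variance(0.09, 2.0, 0.04, 1000.0) / 1000.0
print(closed, numeric, long_run_average)
```

For a very long horizon the annualised value approaches the long-run variance `theta`, since the initial displacement `nu0 - theta` contributes only a bounded amount.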
Log-contract replication approach 8
A key first step is the following argument. If a trader has a
short position in $2/(\Delta K)^2$ vanilla call options with strike $K$
and a long position in $1/(\Delta K)^2$ vanilla call options with
strike $K - \Delta K$ and a long position in $1/(\Delta K)^2$ vanilla call
options with strike $K + \Delta K$ (all the options have the same
maturity) where $\Delta K > 0$, then, if we let $\Delta K$ tend to zero,
the payout at maturity of the trader's portfolio is the same
as that of the Dirac delta function. In words, the payout is
zero if the stock price is not equal to $K$ and the payout is
$+\infty$ if the stock price equals $K$ at maturity.
In maths, the Dirac delta function is a building block
function - we can make other functions by integrating
(summing) Dirac delta functions.
In mathematical finance, we can replicate any European
style (path-independent) payoff by recognising that, since
it can be represented as a sum (in practice, infinite sum) of
Dirac delta functions, it can be represented as a sum (with
possibly negative weights) of vanilla options (not
necessarily calls) with different strikes.
Strictly speaking, the step from the first to the second
requires the absence of arbitrage (which we assume
throughout) and the existence of a market for vanilla
options of all strikes (which, in practice, is only an
approximation to reality - we discuss this later).
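The limiting behaviour of the scaled butterfly can be seen numerically. In this sketch (function names and the strike of 100 are chosen for illustration), the payoff integrates to one over the stock price for any strike spacing, while its peak value grows like one over the spacing, which is the sense in which it approaches a Dirac delta function.

```python
def call_payoff(s, strike):
    return max(s - strike, 0.0)

def scaled_butterfly_payoff(s, strike, dk):
    """Long 1/dk^2 calls at strike - dk and strike + dk, short 2/dk^2 calls at strike."""
    return (call_payoff(s, strike - dk)
            - 2.0 * call_payoff(s, strike)
            + call_payoff(s, strike + dk)) / dk**2

def integrate(func, lo, hi, n=200_000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return sum(func(lo + (i + 0.5) * h) for i in range(n)) * h

results = {}
for dk in (1.0, 0.1):
    area = integrate(lambda s: scaled_butterfly_payoff(s, 100.0, dk), 50.0, 150.0)
    peak = scaled_butterfly_payoff(100.0, 100.0, dk)
    results[dk] = (area, peak)
    print(dk, area, peak)
```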
Log-contract replication approach 9
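The following result is key. For any generalized function
$f(S)$ and any scalar $\kappa \geq 0$:
$$f(S) = f(\kappa) + f'(\kappa)(S - \kappa) + \underbrace{\int_{\kappa}^{\infty} f''(K)(S - K)^+ \, dK}_{\text{tangent correction}} + \underbrace{\int_{0}^{\kappa} f''(K)(K - S)^+ \, dK}_{\text{tangent correction}}.$$
This decomposition may be interpreted as a Taylor series
expansion with remainder of the final payoff $f(\cdot)$ about the
expansion point $\kappa$.
The first two terms give the tangent to the payoff at $\kappa$; the
last two terms continuously bend this tangent so it
conforms to the nonlinear payoff.
The payoff of an arbitrary claim has been decomposed into
the payoff from $f(\kappa)$ bonds, $f'(\kappa)$ forward contracts struck
at $\kappa$, $f''(K)\,dK$ calls at all strikes $K$ above $\kappa$ and
$f''(K)\,dK$ puts at all strikes $K$ below $\kappa$.
Log-contract replication approach 10
Proof:
$$f(S) = f(\kappa) + 1_{S > \kappa}\int_{\kappa}^{S} f'(u)\,du + 1_{S < \kappa}\int_{\kappa}^{S} f'(u)\,du$$
$$= f(\kappa) + 1_{S > \kappa}\int_{\kappa}^{S} f'(u)\,du - 1_{S < \kappa}\int_{S}^{\kappa} f'(u)\,du$$
$$= f(\kappa) + 1_{S > \kappa}\int_{\kappa}^{S}\left[f'(\kappa) + \int_{\kappa}^{u} f''(v)\,dv\right]du - 1_{S < \kappa}\int_{S}^{\kappa}\left[f'(\kappa) - \int_{u}^{\kappa} f''(v)\,dv\right]du.$$
Noting that $f'(\kappa)$ does not depend on $u$ and interchanging
the order of integration:
$$f(S) = f(\kappa) + f'(\kappa)(S - \kappa) + 1_{S > \kappa}\int_{\kappa}^{S}\int_{v}^{S} f''(v)\,du\,dv + 1_{S < \kappa}\int_{S}^{\kappa}\int_{S}^{v} f''(v)\,du\,dv.$$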
Log-contract replication approach 11
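Integrating over $u$ yields:
$$f(S) = f(\kappa) + f'(\kappa)(S - \kappa) + 1_{S > \kappa}\int_{\kappa}^{S} f''(v)(S - v)\,dv + 1_{S < \kappa}\int_{S}^{\kappa} f''(v)(v - S)\,dv$$
$$= f(\kappa) + f'(\kappa)(S - \kappa) + \int_{\kappa}^{\infty} f''(v)(S - v)^+\,dv + \int_{0}^{\kappa} f''(v)(v - S)^+\,dv.$$
Q.E.D.
Note the result is completely model independent.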
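The decomposition can be verified numerically for a particular payoff. The sketch below takes $f(S) = \log S$ with $\kappa = 100$ (an arbitrary choice for the example) and evaluates the two integrals by the midpoint rule. The function name `decomposition` and the test points are assumptions introduced here.

```python
import math

def decomposition(f, f1, f2, s, kappa, n=20_000):
    """Right-hand side of the payoff decomposition; f1 and f2 are f' and f''."""
    def midpoint(func, lo, hi):
        if hi <= lo:
            return 0.0
        h = (hi - lo) / n
        return sum(func(lo + (i + 0.5) * h) for i in range(n)) * h
    # (K - S)^+ is non-zero only for S < K <= kappa; (S - K)^+ only for kappa <= K < S
    puts = midpoint(lambda k: f2(k) * (k - s), max(s, 0.0), kappa)
    calls = midpoint(lambda k: f2(k) * (s - k), kappa, s)
    return f(kappa) + f1(kappa) * (s - kappa) + puts + calls

kappa = 100.0
for s in (60.0, 100.0, 150.0):
    rhs = decomposition(math.log, lambda x: 1.0 / x, lambda x: -1.0 / x**2, s, kappa)
    print(s, math.log(s), rhs)
```

Because only the payoff at maturity is involved, no model for the stock price enters the check.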
Log-contract replication approach 12
Recall the decomposition of the payoff function $f(S)$:
$$f(S) = f(\kappa) + f'(\kappa)(S - \kappa) + \int_{0}^{\kappa} f''(K)(K - S)^+\,dK + \int_{\kappa}^{\infty} f''(K)(S - K)^+\,dK.$$
No arbitrage implies that the initial (i.e. time $t_0$) price
$V_{t_0}[f(S)]$ of $f(S(T))$, payable at time $T$, can be expressed
in terms of the initial (i.e. time $t_0$) price $\exp(-r(T - t_0))$
of a bond maturing at time $T$ and the initial prices
$C(t_0, K)$ and $P(t_0, K)$ of vanilla calls and puts
respectively maturing at time $T$:
$$V_{t_0}[f(S)] = f(\kappa)\exp(-r(T - t_0)) + f'(\kappa)\left[C(t_0, \kappa) - P(t_0, \kappa)\right] + \int_{0}^{\kappa} f''(K)P(t_0, K)\,dK + \int_{\kappa}^{\infty} f''(K)C(t_0, K)\,dK.$$
When $\kappa = S(t_0)\exp((r - q)(T - t_0)) \equiv F_0$, the forward
stock price, the second term vanishes by put-call parity
(because $C(t_0, F_0) - P(t_0, F_0) = 0$ in this special case),
and the initial price decomposes as:
$$V_{t_0}[f(S)] = \underbrace{f(F_0)\exp(-r(T - t_0))}_{\text{intrinsic value}} + \underbrace{\int_{0}^{F_0} f''(K)P(t_0, K)\,dK + \int_{F_0}^{\infty} f''(K)C(t_0, K)\,dK}_{\text{time value}}.$$
Log-contract replication approach 13
Let's apply our general formula for the special case when
$f(S) = \log S$, so that $f'(S) = 1/S$ and $f''(S) = -1/S^2$. Then:
$$V_{t_0}[\log S] = \exp(-r(T - t_0))\log\kappa + \frac{1}{\kappa}\left[C(t_0, \kappa) - P(t_0, \kappa)\right] - \int_{0}^{\kappa}\frac{P(t_0, K)}{K^2}\,dK - \int_{\kappa}^{\infty}\frac{C(t_0, K)}{K^2}\,dK.$$
The price of the log contract, at time $t_0$, is the last
expression minus $\exp(-r(T - t_0))\log S(t_0)$.
Note that the term $\frac{1}{\kappa}\left[C(t_0, \kappa) - P(t_0, \kappa)\right]$ is simply $1/\kappa$ forward
contracts struck at $\kappa$. It is a static position.
The term $\exp(-r(T - t_0))\log\kappa$ (likewise
$\exp(-r(T - t_0))\log S(t_0)$) is simply $\log\kappa$ (likewise
$\log S(t_0)$) in cash (or, equivalently, in bonds).
We have replicated the payoff of a log contract and hence,
by no-arbitrage, priced a log contract.
Note that the log contract is replicated by static positions
in bonds and vanilla options (and possibly forward
contracts).
In practice, we have to replace the integrals by discrete
summations since vanilla options will not be traded with
literally all strikes.
Log-contract replication approach 14
The position in 1/S units of stock is a dynamic position
and is continuously rebalanced. We gave the value of this
position a few slides ago.
Taking into account both the log contract and the
dynamic position in 1/S units of stock, we have priced a
variance swap.
The price, at time $t_0$, of the floating leg of the variance
swap is:
$$\frac{2}{(T - t_0)}\Big(\exp(-r(T - t_0))(r - q)(T - t_0) + \exp(-r(T - t_0))\log S(t_0) - \exp(-r(T - t_0))\log\kappa - \frac{1}{\kappa}\left[C(t_0, \kappa) - P(t_0, \kappa)\right] + \int_{0}^{\kappa}\frac{P(t_0, K)}{K^2}\,dK + \int_{\kappa}^{\infty}\frac{C(t_0, K)}{K^2}\,dK\Big).$$
We have focussed on replication but hedging is the same as
replication with a minus sign.
In practice, $\kappa$ is often set equal to the forward stock price
as this generally delineates between whether puts or calls
have the greatest liquidity in the market.
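As a sanity check, the pricing formula can be evaluated with option prices generated from a known model. The sketch below assumes Black and Scholes prices with a flat volatility of 20% (so the discounted value of the floating leg should be $\exp(-r(T - t_0))\sigma^2$), sets $\kappa$ to the forward, and approximates the integrals by the midpoint rule on a finite strike range. All function names and parameter values are illustrative assumptions.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_price(s0, k, r, q, sigma, tau, is_call):
    """Black-Scholes price of a vanilla call or put with strike k."""
    fwd = s0 * math.exp((r - q) * tau)
    d1 = (math.log(fwd / k) + 0.5 * sigma**2 * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    df = math.exp(-r * tau)
    if is_call:
        return df * (fwd * norm_cdf(d1) - k * norm_cdf(d2))
    return df * (k * norm_cdf(-d2) - fwd * norm_cdf(-d1))

def replicated_floating_leg(s0, r, q, sigma, tau, k_min, k_max, n=20_000):
    """Floating-leg price from the replication formula, strikes cut to [k_min, k_max]."""
    kappa = s0 * math.exp((r - q) * tau)  # expansion point set to the forward
    df = math.exp(-r * tau)

    def strip(lo, hi, is_call):
        h = (hi - lo) / n
        total = 0.0
        for i in range(n):
            k = lo + (i + 0.5) * h
            total += bs_price(s0, k, r, q, sigma, tau, is_call) / k**2
        return total * h

    forwards = (bs_price(s0, kappa, r, q, sigma, tau, True)
                - bs_price(s0, kappa, r, q, sigma, tau, False)) / kappa
    inner = (df * (r - q) * tau + df * math.log(s0) - df * math.log(kappa) - forwards
             + strip(k_min, kappa, False) + strip(kappa, k_max, True))
    return 2.0 / tau * inner

wide = replicated_floating_leg(100.0, 0.03, 0.01, 0.2, 1.0, 1.0, 1000.0)
narrow = replicated_floating_leg(100.0, 0.03, 0.01, 0.2, 1.0, 70.0, 140.0)
target = math.exp(-0.03) * 0.2**2
print(wide, narrow, target)
```

With a very wide strike range the formula recovers the target closely. Cutting the range to strikes between 70 and 140 drops positive option values from the strip, so the computed price is lower.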
Log-contract replication approach 15
In practice, we only have options traded in the market for a
discrete set of strikes (rather than a continuum of strikes).
If we were to ignore all options with strikes outside a
particular range (equivalently set the weights to zero),
then it is clear from the pricing formula above that we will
always price the variance swap at below fair value.
In practice, there will be a benefit to a trader trading
variance swaps in the context of a very large vanilla
options book. Options with strikes so high or so low that
they have no liquidity today may have been traded when
the spot price was much higher or lower in the past and as
such may be on the trader's book. These can be
aggregated with the variance swap trades which produces
an economy of scale.
Log-contract replication approach 16
With only a discrete set of strikes available, the hedge for
the log contract will not be perfect.
However, there is an easy and intuitive way to account for
this.
The log contract is always a concave function of the stock
price. Hence, we can construct chords or tangents which
always lie below or above the log contract payoff. We can
then solve analytically for the weights for vanilla options
which exactly match the chords or tangents. This will
perfectly sub-replicate or super-replicate the log contract
and at the same time give something akin to a bid-offer
spread in the price of the variance swaps. The paper by
Demeterfi, Derman, Kamal and Zou (1999, "More than
you ever wanted to know about volatility swaps")
illustrates this very well.
Log-contract replication approach 17
There is a second equally intuitive way of accounting for a
discrete set of strikes:
Evaluate the log contract at some pre-specified stock
prices. Take as given the positions in bonds and forward
contracts from the portfolio constructed on the slides
above. Then solve for the weights of the call and put
options with the available strikes which minimise the sum
of the squares of differences between the log contract and
the almost-replicating portfolio at the pre-specified stock
prices.
Because the almost-replicating portfolio is linear in these
weights, this problem is easily solvable by Tikhonov
regularisation (which means that one only has to invert a
matrix - it does NOT involve non-linear least squares fits
(calibration)).
Furthermore, one can use the weights ($dK/K^2$) from the
slides above as initial guesses in the Tikhonov
regularisation.
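A minimal sketch of this second approach is given below, under several assumptions of my own: equally spaced strikes from 60 to 140, $\kappa = 100$, out-of-the-money puts below $\kappa$ and calls at or above it, a grid of stock prices from 60 to 140, and a small regularisation parameter `lam`. The target is the log contract payoff net of the bond and forward positions, $\log(S/\kappa) - (S - \kappa)/\kappa$, so the continuum weights that serve as the initial guess are $-\Delta K / K^2$. The normal equations $(A^T A + \lambda I)w = A^T g + \lambda w_0$ are solved with a small hand-written Gaussian elimination.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for j in range(col, n + 1):
                m[row][j] -= factor * m[col][j]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        x[row] = (m[row][n] - sum(m[row][j] * x[j] for j in range(row + 1, n))) / m[row][row]
    return x

def option_payoff(s, strike, kappa):
    # Puts below kappa, calls at or above kappa (out-of-the-money options)
    return max(strike - s, 0.0) if strike < kappa else max(s - strike, 0.0)

def fit_weights(strikes, grid, kappa, lam):
    """Tikhonov-regularised least squares fit of option weights to the target payoff."""
    dk = strikes[1] - strikes[0]               # assumes equally spaced strikes
    prior = [-dk / k**2 for k in strikes]      # continuum weights as the initial guess
    target = [math.log(s / kappa) - (s - kappa) / kappa for s in grid]
    a = [[option_payoff(s, k, kappa) for k in strikes] for s in grid]
    n, rows = len(strikes), range(len(grid))
    ata = [[sum(a[m][i] * a[m][j] for m in rows) + (lam if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    atb = [sum(a[m][i] * target[m] for m in rows) + lam * prior[i] for i in range(n)]
    weights = solve_linear(ata, atb)
    fitted = [sum(w * a[m][i] for i, w in enumerate(weights)) for m in rows]
    rms = math.sqrt(sum((f - t) ** 2 for f, t in zip(fitted, target)) / len(grid))
    return weights, prior, rms

strikes = [60.0 + 10.0 * j for j in range(9)]
grid = [60.0 + 2.0 * m for m in range(41)]
weights, prior, rms = fit_weights(strikes, grid, 100.0, 1e-3)
print(weights, rms)
```

In this set-up the options struck at 60 and 140 never pay off on the chosen grid, so the unregularised problem would be singular; the regularisation simply leaves their weights at the initial guess.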
Log-contract replication approach 18
This second way has the disadvantage of not
sub-replicating or super-replicating the log contract. On
the other hand, it may give the trader an
almost-replicating portfolio at lower cost than perfect
sub-replication or super-replication. Furthermore, while
the almost-replicating portfolio will have residual risks,
the trader may be content to have these risks in the
context of having a view on which parts of the implied
volatility surface are cheap or expensive - a view which can
also be easily incorporated into the Tikhonov
regularisation.
Interview questions
Let me ask you two questions which you might be asked at
job interviews.
Do the prices of vanilla options depend only on the (risk
neutral) distribution of the (log of the) stock price at
maturity (as opposed to the (risk neutral) distribution of
the (log of the) stock price at any other times)?
Do vanilla option prices contain information about the
prices of any path-dependent derivatives?
Interview questions 2
The answer to the first question is yes. The prices of
vanilla options depend only on the (risk neutral)
distribution of the (log of the) stock price at maturity.
This is a well-known result.
Perhaps, initially surprisingly, the answer to the second
question is also yes. In fact, we have seen this today: We
have priced variance swaps whose payoff is clearly
path-dependent. We can price variance swaps in terms of
vanilla options. Hence, vanilla option prices do contain
information about the prices of path-dependent
derivatives, namely, variance swaps.
To score an additional bonus point, you should mention
that this conclusion only holds to the extent that the
assumptions that we made hold. The assumptions include
that there are no jumps in the stock price which is
somewhat restrictive. However, apart from that, the
assumptions we have made are actually quite weak.
Discretely monitored variance swaps
Recall that a (discretely monitored) variance swap has a
payoff defined as follows: It is written at time $t_0$ and
matures at time $T$. The time interval $[t_0, T]$ is partitioned
into $N$ time periods $t_i$, $i = 1, 2, \ldots, N$, where $t_N = T$. The
time periods do not have to be equal although they are
often approximately equal. The payoff of a (discretely
monitored) variance swap at time $T$ is:
$$\frac{1}{(T - t_0)} \sum_{i=1}^{N} \left( \left(\log(S(t_i)/S(t_{i-1}))\right)^2 - K^2 \right),$$
where $K$ is a constant (the fixed leg) (usually chosen so
that the initial (i.e. time $t_0$) price of the variance swap is
zero).
We will focus on the floating leg.
Consider a process for the stock price as follows:
Discretely monitored variance swaps 2
Under the risk-neutral equivalent martingale measure $Q$
(which may, in fact, not be unique)
$$S(t) = S(t_0)\exp((r - q)(t - t_0) + X_t),$$
where $X_{t_0} \equiv 0$ and $X_t$ is such that $E^Q_{t_0}[\exp(X_t)] = 1$ for all
$t \geq t_0$. Clearly, $\exp(X_t)$ is a martingale.
Here r and q are the risk-free rate and the dividend yield
which we will assume are constant for notational
convenience. However, one nice feature of the methodology
we will now discuss is that it is easy to relax this
assumption and have either deterministic term-structures
or have stochastic interest-rates and dividend yields.
Actually, the only assumption we need to make is that we
have a market with no-arbitrage.
Discretely monitored variance swaps 3
Introduce $z$ (which may be real or complex). We define:
$$E^Q_{t_0}[\exp(iz\log(S(t)/S(t_0)))] = E^Q_{t_0}[\exp(iz((r - q)(t - t_0) + X_t))],$$
to be the characteristic function of $\log(S(t)/S(t_0))$.
Mathematically, the characteristic function is the Fourier
Transform of the probability density function of
$\log(S(t)/S(t_0))$.
The characteristic function is known in essentially closed
form for many stochastic processes including when the stock
price follows:
The Black and Scholes (1973) geometric Brownian motion
model, the Heston (1993) stochastic volatility model, the
Merton (1976) jump-diffusion model, models of the affine
jump-diffusion type (which covers many models with
stochastic interest-rates, stochastic interest-rates AND
jumps), all Lévy process models (see the book by
Schoutens (2003), "Lévy Processes in Finance: Pricing
Financial Derivatives" for reading), Lévy process models
with stochastic time-changing (stochastic time-changing
generalises the idea of stochastic volatility) or processes
more suitable for other asset classes such as the CEE2
process of Carr and Crosby (2008) or the commodities
model of Crosby (2008).
Discretely monitored variance swaps 4
In fact, the characteristic function is known in essentially
closed form for nearly every model used in finance except
for local volatility models.
This is true even though the probability density function is
typically not known in closed form.
Fourier inversion methods can then be used to price vanilla
options.
Discretely monitored variance swaps 5
Recall the sequence of dates at which the variance swap
payoff is determined:
$$t_0 < t_1 < \ldots < t_{i-1} < t_i < \ldots < t_N = T.$$
Define the extended characteristic function $\Phi(z; t_i, t_{i-1})$ as
follows:
$$\Phi(z; t_i, t_{i-1}) \equiv E^Q_{t_0}[\exp(iz[(r - q)(t_i - t_{i-1}) + X_{t_i} - X_{t_{i-1}}])] = E^Q_{t_0}[\exp(iz\log(S(t_i)/S(t_{i-1})))].$$
Essentially any model which has an analytic characteristic
function also has an analytic extended characteristic
function.
Then note:
$$-\frac{\partial^2 \Phi(z; t_i, t_{i-1})}{\partial z^2} = E^Q_{t_0}\left[(\log(S(t_i)/S(t_{i-1})))^2 \exp(iz[(r - q)(t_i - t_{i-1}) + X_{t_i} - X_{t_{i-1}}])\right].$$
Hence, evaluating the last equation at $z = 0$:
$$-\frac{\partial^2 \Phi(0; t_i, t_{i-1})}{\partial z^2} = E^Q_{t_0}\left[(\log(S(t_i)/S(t_{i-1})))^2\right].$$
Discretely monitored variance swaps 6
The price of any derivative is the expected discounted
payoff.
Hence, the price, at time $t_0$, of the floating leg of the
variance swap is:
$$E^Q_{t_0}\left[\exp(-r(T - t_0))\frac{1}{(T - t_0)}\sum_{i=1}^{N}\left[\log(S(t_i)/S(t_{i-1}))\right]^2\right] = -\frac{1}{(T - t_0)}\exp(-r(T - t_0))\sum_{i=1}^{N}\frac{\partial^2 \Phi(0; t_i, t_{i-1})}{\partial z^2}.$$
But we know $\Phi(z; t_i, t_{i-1})$ and hence
$\frac{\partial^2 \Phi(0; t_i, t_{i-1})}{\partial z^2}$ in
essentially closed form for many models. Hence, we can
price the variance swap.
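The formula can be tried out on the simplest case. The sketch below assumes geometric Brownian motion, for which the extended characteristic function is $\exp(iz(r - q - \sigma^2/2)\Delta - \sigma^2 z^2 \Delta/2)$ with $\Delta = t_i - t_{i-1}$, and approximates the second derivative at $z = 0$ by a central finite difference rather than differentiating analytically. The function names, the step size `h` and the parameter values are assumptions made for the illustration.

```python
import cmath
import math

def bs_extended_cf(z, dt, r, q, sigma):
    """E[exp(i z log(S(t_i)/S(t_{i-1})))] under geometric Brownian motion."""
    return cmath.exp(1j * z * (r - q - 0.5 * sigma**2) * dt - 0.5 * sigma**2 * z**2 * dt)

def cf_floating_leg(cf, times, r, h=1e-2):
    """Discounted expectation of the annualised sum of squared log returns."""
    t0, T = times[0], times[-1]
    total = 0.0
    for t_prev, t_next in zip(times, times[1:]):
        dt = t_next - t_prev
        # Central finite difference for the second z-derivative at z = 0
        second = (cf(h, dt) - 2.0 * cf(0.0, dt) + cf(-h, dt)) / h**2
        total += -second.real
    return math.exp(-r * (T - t0)) * total / (T - t0)

r, q, sigma, n = 0.03, 0.0, 0.2, 252
times = [i / n for i in range(n + 1)]
price = cf_floating_leg(lambda z, dt: bs_extended_cf(z, dt, r, q, sigma), times, r)
# For this model the expectation is known exactly: sigma^2 plus a small drift term
exact = math.exp(-r) * (sigma**2 + (r - q - 0.5 * sigma**2) ** 2 / n)
print(price, exact)
```

Swapping in the extended characteristic function of another model only requires replacing `bs_extended_cf`; the pricing loop itself does not change.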
Discretely monitored variance swaps 7
This methodology is very generic and can be used for
almost all stochastic processes that have been used in
mathematical finance (with the exception of local volatility
models because neither the characteristic function nor the
extended characteristic function is known in closed form).
The disadvantage of this methodology is that it says
nothing about hedging.
There's no doubting this is a big practical disadvantage.
However, one would typically be interested to use this
methodology when there are jumps in the stock price
process. In this case, the market is incomplete and hence
perfect hedging or replication is not possible anyway.
In practice, one would choose a stochastic process. Then
one calibrates the parameters of the stochastic process by
finding those parameter values which minimise the sum of
squares of differences between the market prices and model
prices of vanilla options. Using these parameters, one can
then price variance swaps using the formula on the last
slide.
Discretely monitored variance swaps 8
The methodology allows us to highlight some features of
variance swaps.
Question: Is a continuously monitored variance swap
worth more or less than a discretely sampled variance
swap? To answer this question, we will answer a slightly
more generic question rst.
Consider two variance swaps based on the realised variance
observed between $t_0$ and $T$. The times at which the stock
price is observed to compute the payoff are equally spaced
(i.e. $t_i - t_{i-1}$ is the same for all $i$). The difference is that for
the first variance swap, the number of monitoring times is
$N_1$, and for the second variance swap, the number of
monitoring times is $N_2$, with $N_2 = 2N_1$. Which variance
swap is worth more?
We assume that the extra monitoring times of the
second variance swap lie exactly in the middle of the
intervals between the monitoring times of the first variance
swap and that the other monitoring times of the second
variance swap coincide with those of the first variance
swap.
Discretely monitored variance swaps 9
The payoffs of the (floating legs of the) variance swaps are:
$$\frac{1}{(T - t_0)}\sum_{i=1}^{N_1}\left(\log(S(t_i)/S(t_{i-1}))\right)^2,$$
$$\frac{1}{(T - t_0)}\sum_{j=1}^{N_2}\left(\log(S(t_j)/S(t_{j-1}))\right)^2,$$
respectively.
The answer to our question is clearly going to be
somewhat dependent on the stochastic process $X_t$.
Suppose $X_t$ is a Lévy process (a process with stationary
and independent increments e.g. Brownian motion).
It is not difficult to see that the extended characteristic
function for a Lévy process is of the form:
$$E^Q_{t_0}[\exp(iz[(r - q)(t_i - t_{i-1}) + X_{t_i} - X_{t_{i-1}}])] = \exp((t_i - t_{i-1})[iz(r - q) + \psi(z) - iz\psi(-i)]),$$
for some function $\psi(z)$, independent of $t_{i-1}$ and $t_i$.
For example, if the Lévy process is Brownian motion with
volatility $\sigma$, then $\psi(z) = -\sigma^2 z^2/2$.
Discretely monitored variance swaps 10
Applying our formula, we have that the price, at time $t_0$,
of the floating leg of the first variance swap is:
$$\exp(-r(T - t_0))\left[-\psi''(0)\right] + \frac{\exp(-r(T - t_0))}{(T - t_0)}\sum_{i=1}^{N_1}\left[-\left(\psi'(0) - i\psi(-i) + i(r - q)\right)^2 (t_i - t_{i-1})^2\right],$$
with a similar expression for the second variance swap.
Note that in the last expression we used the result:
$$\frac{1}{(T - t_0)}\sum_{i=1}^{N_1}(t_i - t_{i-1}) = 1.$$
The first term is independent of the monitoring frequency
$N_1$ but the second term is not.
Note that the second term,
$$\frac{\exp(-r(T - t_0))}{(T - t_0)}\sum_{i=1}^{N_1}\left[-\left(\psi'(0) - i\psi(-i) + i(r - q)\right)^2 (t_i - t_{i-1})^2\right],$$
will typically be tiny compared to the first term
$\exp(-r(T - t_0))\left[-\psi''(0)\right]$.
For example, if the Lévy process is Brownian motion with
volatility $\sigma = 0.2$ and $r = 0.03$, $q = 0$, $T - t_0 = 1$,
$N_1 = 252$ (which corresponds to daily monitoring of a one
year swap), then the second term is less than one ten
thousandth of the first term. (Note: As an exercise (during
the lunch break or at the computing lab) I would like you
to prove this mathematically). This means that the second
term is completely negligible (especially relative to the
likely bid-offer spread - typically around 0.5 to 1.0
percentage points).
This suggests that, although it is true that for $N_2 = 2N_1$,
the price of the (floating leg of the) first variance swap is
always greater than or equal to the price of the (floating
leg of the) second variance swap, in practice (for daily
monitoring, say), any difference between the two will
typically be very small - at least for processes with
stationary and independent increments (i.e. Lévy
processes).
Discretely monitored variance swaps 13
Further intuition can be gleaned by considering, firstly, a
two year variance swap with only one monitoring date and,
secondly, a two year variance swap with two monitoring
dates at year one and at year two. The price of the first
involves the expectation of $[\log S(2) - \log S(0)]^2$ and the
price of the second involves the expectation of
$[\log S(1) - \log S(0)]^2 + [\log S(2) - \log S(1)]^2$.
Straightforward algebra shows the first quantity is greater
than the second if, and only if, the expectation of
$2[\log S(2) - \log S(1)][\log S(1) - \log S(0)]$ is positive.
If log S(t) has zero drift, this expectation is identically
equal to zero for a process with independent increments
(by definition). Hence, we see again that the two variance
swaps have the same price in this special case.
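The two-date comparison can be made concrete with a small sketch. It assumes, purely for illustration, that the two yearly log returns X1 and X2 are independent normal draws with the same mean `drift` and standard deviation `sigma` (Brownian motion with drift); the function name is hypothetical.

```python
def two_year_expectations(sigma, drift):
    """Expected squared log returns when the yearly log returns X1, X2 are
    independent N(drift, sigma^2) draws (an assumption of this sketch)."""
    one_date = 2.0 * sigma ** 2 + (2.0 * drift) ** 2   # E[(X1 + X2)^2]
    two_dates = 2.0 * (sigma ** 2 + drift ** 2)        # E[X1^2] + E[X2^2]
    cross = 2.0 * drift * drift                        # E[2 X1 X2] = 2 E[X1] E[X2]
    return one_date, two_dates, cross

for drift in (0.0, 0.01):
    one_date, two_dates, cross = two_year_expectations(0.2, drift)
    print(drift, one_date - two_dates, cross)
```

With zero drift the cross term vanishes and the two expectations coincide; with non-zero drift the single-date swap is worth more by exactly the cross term.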
For processes that have neither stationary nor independent
increments such as in the model of Heston (1993), this
expectation will (typically) be positive (this is true even if,
somehow, log S(t) has zero drift). Hence, for such
processes, the prices of variance swaps may be much more
sensitive to the monitoring frequency.
In the Heston (1993) model, the prices of variance swaps
will be most sensitive to the monitoring frequency when
the mean reversion rate is large and when the correlation is
far from zero.
Discretely monitored variance swaps 14
The answer to the question "Is a continuously monitored
variance swap worth more or less than a discretely sampled
variance swap?" is obtained by letting the number of
monitoring times tend to infinity in our arguments above:
The price of a discretely sampled variance swap is greater
than or equal to the price of a continuously monitored
variance swap.
Strict equality will only hold under special circumstances.
However, generally speaking, in practice, any differences
will be small.
As a final comment, we note that, in the limit that
$t_i - t_{i-1} \to 0$, for all i, i.e. for a continuously monitored
variance swap, under a Lévy process, we have that the
price, at time $t_0$, of the floating leg of the variance swap
tends to:
$$\exp(-r(T - t_0))[-\psi''(0)].$$
This is because the second term (see two slides ago) tends
to zero.
As a sanity check on the last formula, for the case of
Brownian motion with volatility $\sigma$, $\psi(z) = -\sigma^2 z^2/2$.
Hence, $-\psi''(0) = \sigma^2$, which agrees with our intuition.
From variance swaps to volatility swaps
Volatility swaps also trade - although less frequently. The
payoff of a (discretely monitored) volatility swap is:
$$\sqrt{\frac{1}{(T - t_0)} \sum_{i=1}^{N} \left(\log(S(t_i)/S(t_{i-1}))\right)^2} - K_v,$$
where $K_v$ is a constant (the fixed leg) (again usually chosen
so that the initial price of the volatility swap is zero).
Is there a simple, if approximate, way to relate volatility
swap rates to variance swap rates?
Suppose that future realised variance V(T) has (under the
risk neutral measure Q) mean $\mu_V$ and variance $\sigma_V^2$. In
other words, $\mu_V$ is the fixed rate on a variance swap with
zero initial price.
$$\mu_V = E^Q_{t_0}[V(T)] \quad \text{and} \quad \sigma_V^2 = E^Q_{t_0}[(V(T) - \mu_V)^2].$$
Doing a Taylor series expansion of $\sqrt{V(T)}$ around its
mean implies (correct to second order):
$$\sqrt{V(T)} = \sqrt{\mu_V} + \frac{(V(T) - \mu_V)}{2\sqrt{\mu_V}} - \frac{(V(T) - \mu_V)^2}{8\mu_V^{3/2}}.$$
From variance swaps to volatility swaps 2
Taking expectations under Q and, observing that
$E^Q_{t_0}[(V(T) - \mu_V)] = 0$ and that $\sqrt{\mu_V} = \sqrt{E^Q_{t_0}[V(T)]}$,
implies that (correct to second order):
$$E^Q_{t_0}[\sqrt{V(T)}] = \sqrt{E^Q_{t_0}[V(T)]} - \frac{\sigma_V^2}{8\mu_V^{3/2}}.$$
The term on the left is the fixed rate on a volatility swap
such that it has zero initial price. The first term on the
right is the square root of the fixed rate on a variance swap
such that it has zero initial price.
Note that the former ($E^Q_{t_0}[\sqrt{V(T)}]$) is certainly less than
or equal to the latter ($\sqrt{\mu_V} = \sqrt{E^Q_{t_0}[V(T)]}$) (with equality
only in the degenerate case that $\sigma_V^2 = 0$). This is to be
expected from Jensen's inequality.
We stress the last result is only an approximation.
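To get a feel for how good the approximation is, the sketch below compares it with an exact value in one tractable case. It assumes, purely for illustration, that realised variance V(T) is lognormally distributed under Q with a given mean and standard deviation; neither the distribution nor the function name comes from the slides.

```python
import math

def vol_swap_rate_lognormal(mean_var, std_var):
    """Exact E[sqrt(V)] and its second-order approximation when V is
    lognormal with the given mean and standard deviation (an assumption)."""
    # Variance of log V implied by the mean and standard deviation of V.
    b2 = math.log(1.0 + (std_var / mean_var) ** 2)
    # For lognormal V, E[sqrt(V)] = sqrt(E[V]) * exp(-Var(log V) / 8).
    exact = math.sqrt(mean_var) * math.exp(-b2 / 8.0)
    # Square root of the variance swap rate minus the convexity correction.
    approx = math.sqrt(mean_var) - std_var ** 2 / (8.0 * mean_var ** 1.5)
    return exact, approx

exact, approx = vol_swap_rate_lognormal(mean_var=0.04, std_var=0.02)
print(exact, approx, math.sqrt(0.04))
```

Both numbers lie below the square root of the variance swap rate, as Jensen's inequality requires, and they move closer together as `std_var` shrinks.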
Are implied volatilities predictions of future realised volatilities
One occasionally hears it said that "implied volatilities are
the market's best guesses of future realised volatilities".
Is this true? What does it mean (if anything)?
Consider a stock price process of the form:
$$\frac{dS}{S} = (r - q)dt + \sigma(t, S, \ldots)dz,$$
where $\sigma(t, S, \ldots)$ might be stochastic but, if it is
stochastic, it is independent of dz.
We consider the price, at time $t_0$, of a vanilla (standard
European) option, maturing at time T, which is struck at
the forward price $F(t_0) \equiv S(t_0) \exp((r - q)(T - t_0))$. We
denote realised volatility, over the time period $[t_0, T]$, by
$RV(t_0, T)$.
$$RV(t_0, T) = \sqrt{\frac{1}{(T - t_0)} \int_{t_0}^{T} \sigma(s, S, \ldots)^2 ds}.$$
Are implied volatilities predictions of future realised volatilities 2
Since the volatility is independent of the stock price (this
is a key part of the argument), we can compute the price
of the vanilla option by, firstly, conditioning on the realised
volatility and using the Black and Scholes (1973) formula
and then, secondly, taking expectations over the realised
volatility (in other words, by using the tower law i.e. the
law of iterated expectations).
Hence, the price, at time $t_0$, of the vanilla option is:
$$E^Q_{t_0}[\exp(-r(T - t_0))[F(t_0)N(RV(t_0, T)\sqrt{T - t_0}/2) - F(t_0)N(-RV(t_0, T)\sqrt{T - t_0}/2)]].$$
Suppose the option maturity $T - t_0$ is very small. We can
do a Taylor series expansion of the term in the inner square
brackets to deduce that the price, at time $t_0$, of the vanilla
option is approximately (correct to terms in $(T - t_0)$):
$$E^Q_{t_0}[\exp(-r(T - t_0))[F(t_0)RV(t_0, T)\sqrt{T - t_0}/\sqrt{2\pi}]]$$
$$= \exp(-r(T - t_0))F(t_0)\sqrt{\frac{T - t_0}{2\pi}} E^Q_{t_0}[RV(t_0, T)].$$
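The quality of this short-maturity expansion can be checked for a single, known volatility. The sketch below compares the exact Black and Scholes price of a call struck at the forward with the leading-order approximation; the function names and the numerical inputs are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def atm_forward_call(forward, vol, tau, r):
    """Black-Scholes price of a call struck at the forward."""
    half = 0.5 * vol * math.sqrt(tau)
    return math.exp(-r * tau) * forward * (norm_cdf(half) - norm_cdf(-half))

def atm_forward_call_approx(forward, vol, tau, r):
    """Leading-order approximation of the same price for small tau."""
    return math.exp(-r * tau) * forward * vol * math.sqrt(tau / (2.0 * math.pi))

for tau in (1.0, 1.0 / 12.0, 1.0 / 252.0):
    exact = atm_forward_call(100.0, 0.2, tau, 0.03)
    approx = atm_forward_call_approx(100.0, 0.2, tau, 0.03)
    print(tau, exact, approx, approx / exact - 1.0)
```

The relative error shrinks as the maturity `tau` gets shorter, which is why the argument is restricted to very short time periods.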
Are implied volatilities predictions of future realised volatilities 3
On the other hand, we can compute the price of the vanilla
option using the implied volatility appropriate for an
at-the-money-forward strike and a maturity of $T - t_0$.
This is simply the Black and Scholes price with implied
volatility $IV(t_0, T)$. We can do the same Taylor series
expansion for small $T - t_0$ to conclude the price of the
option is approximately (correct to terms in $(T - t_0)$):
$$\exp(-r(T - t_0))F(t_0)\sqrt{\frac{T - t_0}{2\pi}} IV(t_0, T).$$
If we equate these two vanilla option prices and cancel
terms, we obtain:
$$E^Q_{t_0}[RV(t_0, T)] = IV(t_0, T).$$
Hence, we see that the risk-neutral expectation of future
realised volatility, over a short time period, is
approximately (correct to terms in $(T - t_0)$) equal to the
at-the-money-forward implied volatility of options
maturing at the end of the short time period.
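A toy check of this result: suppose, purely as an assumption for illustration, that realised volatility takes one of two values with equal risk-neutral probability, independently of dz. The option price is then the average of two Black and Scholes prices, and its at-the-money-forward implied volatility can be recovered by bisection and compared with the expected realised volatility. The discount factor and the forward cancel, so prices are quoted per unit of discounted forward. All names and numbers are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def atm_price(vol, tau):
    """At-the-money-forward Black-Scholes call price per unit of
    discounted forward."""
    half = 0.5 * vol * math.sqrt(tau)
    return norm_cdf(half) - norm_cdf(-half)

def implied_vol(price, tau, lo=1e-6, hi=5.0):
    """Invert atm_price by bisection (the price is increasing in vol)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if atm_price(mid, tau) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical two-state distribution: (risk-neutral probability, RV).
states = [(0.5, 0.10), (0.5, 0.30)]
expected_rv = sum(p * v for p, v in states)

def mixture_implied_vol(tau):
    """Implied volatility of the price averaged over the volatility states."""
    price = sum(p * atm_price(v, tau) for p, v in states)
    return implied_vol(price, tau)

for tau in (1.0, 1.0 / 12.0, 1.0 / 252.0):
    print(tau, mixture_implied_vol(tau), expected_rv)
```

In this toy setting the implied volatility sits slightly below the expected realised volatility and converges to it as the maturity shrinks.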
Are implied volatilities predictions of future realised volatilities 4
So the claim that "implied volatilities are the market's best
guesses of future realised volatilities" is true - at least for
very short time periods.
Or is it?
Carr and Wu (2006) show that the sample average
difference between the 30-day realized variance on the S&P
500 and the VIX squared is negative (more than 150 bp in
magnitude) and highly significant. The variance risk premium
in excess returns form is -40 per cent, for being long a
30-day variance swap and holding it to maturity. In other
words, shorting variance swaps and hence receiving the fixed
leg generates positive excess returns on average.
Does this contradict our result on the last slide?
No. As Carr and Wu (2006) point out, the highly negative
variance risk premium indicates that investors are averse
to variance risk and the compensation for bearing variance
risk can come in the form of a lower mean variance level
under the real world empirical measure than under the risk
neutral measure Q.
Summary and General Conclusions
Variance swaps can be priced and hedged or replicated by
synthetically creating log contracts.
They are very actively traded in the markets as are futures
and options on the CBOE VIX index. The VIX index is
the market price of a portfolio of vanilla options which has
weights derived from those required to replicate log
contracts.
The extended characteristic function approach prices
discretely monitored variance swaps. It is very simple and
generic but it has the disadvantage that it says nothing
about hedging or replicating variance swaps.
References
Trading variance and log contracts was introduced by
Anthony Neuberger (Neuberger A. (1990) "Volatility
trading" Working paper, London Business School;
Neuberger, A. (1994) "The Log Contract: A new
instrument to hedge volatility", Journal of Portfolio
Management, Winter, p74-80; Neuberger, A. (1996) "The
Log Contract and Other Power Contracts", in The
Handbook of Exotic Options, edited by I. Nelken,
p200-212).
The paper by Kresimir Demeterfi, Emanuel Derman,
Michael Kamal and Joseph Zou (Demeterfi K., Derman E.,
Kamal M. and Zou J. (1999) "More than you ever wanted
to know about volatility swaps" Journal of Derivatives
6(4), p9-32; also a Goldman Sachs Quantitative Strategies
Note available on Emanuel Derman's website
http://www.ederman.com) is an excellent and very
readable article.
A paper by Peter Carr and Liuren Wu (Carr P. and L. Wu
(2006) "A Tale of Two Indices", Journal of Derivatives,
13(3), p13-29) examines VIX futures and options in depth.
The extended characteristic function approach can be
found in a seminar presentation given by George Hong of
UBS at Cambridge University in 2004. (Hong G. (2004)
"Forward Smile and Derivative Pricing" Summer 2004,
available on the website of the Centre for Financial
Research, Judge Business School, Cambridge University).