
Maximum Likelihood Estimation of Stochastic Volatility Models

This document summarizes a study that develops and implements maximum likelihood estimation for stochastic volatility models using option prices. It compares a full likelihood method that estimates volatility from option prices to an approximate method using implied volatility from short-dated options. Simulation results show the approximation method has small accuracy loss compared to sampling error. The method is applied to market index option prices to estimate parameters for several stochastic volatility models.

Journal of Financial Economics 83 (2007) 413–452

Maximum likelihood estimation of stochastic volatility models

Yacine Aït-Sahalia, Robert Kimmel
Department of Economics and Bendheim Center for Finance, Princeton University, Princeton, NJ, 08540, USA
Received 8 June 2004; received in revised form 23 September 2005; accepted 10 October 2005
Available online 11 September 2006
Abstract
We develop and implement a method for maximum likelihood estimation in closed-form of
stochastic volatility models. Using Monte Carlo simulations, we compare a full likelihood procedure,
where an option price is inverted into the unobservable volatility state, to an approximate likelihood
procedure where the volatility state is replaced by proxies based on the implied volatility of a short-
dated at-the-money option. The approximation results in a small loss of accuracy relative to the
standard errors due to sampling noise. We apply this method to market prices of index options for
several stochastic volatility models, and compare the characteristics of the estimated models. The
evidence for a general CEV model, which nests both the affine Heston model and a GARCH model,
suggests that the elasticity of variance of volatility lies between that assumed by the two nested
models.
© 2006 Elsevier B.V. All rights reserved.
JEL classification: G12; C22
Keywords: Closed-form likelihood expansions; Volatility proxies; Heston model; GARCH model; CEV model
ARTICLE IN PRESS
www.elsevier.com/locate/jfec
0304-405X/$ - see front matter © 2006 Elsevier B.V. All rights reserved.
doi:10.1016/j.jfineco.2005.10.006
We are especially grateful to Bill Schwert (the Editor) and an anonymous referee for comments and
suggestions that greatly improved the paper. Financial support from the NSF under grant SBR-0350772 is also
gratefully acknowledged.
Corresponding author. E-mail address: [email protected] (Y. Aït-Sahalia).
1. Introduction
In this paper, we develop and implement a technique for the estimation of stochastic
volatility models of asset prices. In the early option pricing literature, such as Black and
Scholes (1973) and Merton (1973), equity prices followed a univariate Markov process,
usually a geometric Brownian motion. The instantaneous relative volatility of the equity
price is then constant. Evidence from the time-series of equity returns against this type of
model was noted at least as early as Black (1976), who commented on the fat tails of the
returns distribution. Evidence from option prices also calls this type of model into
question; if equity prices follow a geometric Brownian motion, the implied volatility of
options should be constant through time, across strike prices, and across maturities. These
predictions can easily be shown to be false; see, e.g., Stein (1989), Aït-Sahalia and Lo
(1998), or Bakshi et al. (2000).[1] An alternative is offered by true stochastic volatility
models, such as Stein and Stein (1991) or Heston (1993), in which innovations to volatility
need not be perfectly correlated with innovations to the price of the underlying asset. Such
models can explain some of the empirical features of the joint time-series behavior of stock
and option prices, which cannot be captured by the more limited models.
However, estimating stochastic volatility models poses substantial challenges. One
challenge is that the transition density of the state vector is hardly ever known in closed
form for such models; some moments may or may not be known in closed form, depending
on the model. Furthermore, the additional state variables that determine the level of
volatility are not all directly observed. The estimation of stochastic volatility models when
only the time-series of stock prices is observed is essentially a filtering problem, which
requires the elimination of the unobservable variables.[2]
Alternately, the value of the additional state variables can be extracted from the
observed prices of options. This extraction can be through a proxy used in place of the
unobservable volatility, for instance the Black-Scholes implied volatility of an at-the-money
short-maturity option, as in e.g., Ledoit et al. (2002), which can be further refined in
the case of the specific CEV model: see Lewis (2000). Other simple approximation
techniques (including one we propose below) are possible and broadly applicable.
A potentially more accurate procedure is to calculate option prices for a variety of levels
of the volatility state variables, and use the observed option prices to infer the current
levels of those state variables; see, e.g., Pan (2002). The first method has the virtue of
simplicity, but is an approximation that does not permit identification of the market price
of risk parameters for the volatility state variable; the second method is more complex,
[1] One class of models that attempts to model equity prices more realistically takes the approach of having
instantaneous volatility be time-varying and a function of the stock price. Models of this type include Derman and
Kani (1994), Dupire (1994), and Rubinstein (1995). Such time-inhomogeneous models are often able to match an
observed cross-section of option prices (across different strike prices and possibly also across maturities) perfectly.
However, empirical studies such as Dumas et al. (1998) have found that they perform poorly in explaining the
joint time series behavior of the stock and option prices.
[2] This can be achieved by computing an approximate discrete time density for the observable quantities by
integrating out the latent variables as in e.g., Ruiz (1994), or by deriving additional quantities such as
conditional moments of the integrated volatility, to be approximated by their discrete high frequency versions as in
e.g., Bollerslev and Zhou (2002). For some specific models, typically those in the affine class, other relevant
theoretical quantities, such as the characteristic function, as in Chacko and Viceira (2003), Jiang and Knight
(2002), Singleton (2001), or the density derived numerically from the inverse characteristic function as in Bates
(2002), can be calculated and matched to their empirical counterparts.
Y. Aït-Sahalia, R. Kimmel / Journal of Financial Economics 83 (2007) 413–452
but allows full identification of all model parameters. Whichever method is used to extract
the implied time-series observations of the state vector, subsequent estimation in practice
has typically been simulation-based, relying either on Bayesian methods as in e.g., Jacquier
et al. (1994), Kim et al. (1999) and Eraker (2001), or on the efficient method of moments of
Gallant and Tauchen (1996), as in e.g., Andersen and Lund (1997).
In this paper, we develop a method that employs maximum likelihood, using closed
form approximations to the true (but unknown) likelihood function of the joint
observations on the underlying asset and either option prices (when the exact technique
described above is used) or the volatility state variables themselves (when the
approximation technique described above is used). The statistical efficiency of maximum
likelihood is well-known, but in financial applications likelihood functions are often not
known in closed form for the model of interest, since the state variables of the underlying
continuous-time theoretical model are observed only at discrete time intervals. Our
solution to this problem relies on the approach of Aït-Sahalia (2001), who develops series
approximations to the likelihood function for arbitrary multivariate continuous-time
diffusions at discrete intervals of observations; see Aït-Sahalia (2002) for the univariate
theory. This method has been shown to be very accurate, even when the series are
truncated after only a few terms, for a variety of diffusion models at least in the univariate
case: see Aït-Sahalia (1999), Jensen and Poulsen (2002), Stramer and Yan (2005) and Hurn
et al. (2005).
In all cases, we rely on observations on the joint time-series of the underlying asset price
and either an option price or a proxy for the unobserved volatility extracted from the
implied volatility of a short-dated at-the-money option. By comparing the results we
obtain from the exact procedure (where the option pricing model is inverted to produce
an estimate of the unobservable volatility state variable from the observed option price) to
those of the approximate procedure (where the implied volatility from a short-dated at-the-
money option is used to construct a proxy for the volatility state variable), we can assess
the effect of that approximation. We find that the error introduced by the approximation is
often smaller than the sampling noise inherent in the estimation of the parameters, so that
using an implied volatility proxy does not have adverse consequences, other than not
allowing the identification of the market prices of volatility risk, and results in a large
computational efficiency gain. As to the specific proxy used, we find that the use of an
unadjusted Black-Scholes implied volatility proxy for volatility introduces significant bias
to some parameter estimates. For those cases, we propose a modified proxy that accounts
for expected changes in volatility over the life of the option, and can be computed as a
closed form adjustment to the Black-Scholes implied volatility. Even in those cases where
the use of an unadjusted Black-Scholes implied volatility proxy does cause significant bias,
the modified proxy largely eliminates this error.
The closed form feature of our approach offers considerable benets: for example,
estimation is quick enough that large numbers of Monte Carlo simulations can be run to
test its accuracy, and we do so in this paper. For most other methods, large numbers of
simulations are already required for a single estimation; simulating on top of simulations
to run large numbers of Monte Carlos with these techniques is so time-consuming as to be
practically infeasible, and we are not aware of evidence on their small-sample behavior. By
contrast, we demonstrate that our technique is quite feasible for typical stochastic volatility
models, even if option prices rather than implied volatilities are used. Evidence from the
included Monte Carlo simulations shows that the sampling distribution of the estimates is
fairly well predicted by standard statistical asymptotic theory, as it applies to the maximum
likelihood estimator.
We illustrate our method on actual data using several typical models, including the
affine model of Heston (1993), a GARCH stochastic volatility model as in Nelson (1990)
and Meddahi (2001), and a CEV model as in, e.g., Jones (2003). An early summary of some
of the models we use as examples, as well as several others, can be found in Taylor (1994).
However, it is also important to note that our method is applicable to arbitrary diffusion-
based stochastic volatility models; the only requirement is that either the specification of
the model be sufficiently tractable for option prices to be mapped into the state variables at
a reasonable computational cost, or that a tractable proxy based on implied volatility be
available.
The rest of this paper is organized as follows. In Section 2, we discuss a general class of
stochastic volatility models for asset prices. Section 3 presents our estimation method in
detail, showing how to apply it to the class of models of the previous section. In Section 4,
we show how to apply this method to the three models cited above, developing the explicit
closed form likelihood expressions, and extracting the state vector from option prices. In
Section 5, we discuss different implied volatility proxies for the purpose of extracting the
volatility state variable. Section 6 tests the accuracy of the method by performing Monte
Carlo simulations for the model of Heston (1993) and the CEV model, assessing the
accuracy of the estimates, the degree to which their sampling distributions conform to
asymptotic theory, and the effect of using the implied volatility proxies. In Section 7, we
apply this estimation method to real S&P 500 option data for the three stochastic volatility
models, and analyze and compare the results. Section 8 concludes and sketches out an
extension of the method to jump-diffusions. The appendix contains the closed form
likelihood expansion for the three models under consideration.
2. Stochastic volatility models
We consider stochastic volatility models for asset prices and in this section briefly review
them and establish our notation. Although we refer to the asset as a "stock" throughout,
the models described can just as easily be applied to other classes of financial assets, such as
foreign currencies or futures contracts. A stochastic volatility model for a stock price is one
in which the price is a function of a vector of state variables $X_t$ that follows a multivariate
diffusion process:
$$dX_t = \mu^P(X_t)\,dt + \sigma(X_t)\,dW_t^P, \tag{1}$$
where $X_t$ is an $m$-vector of state variables, $W_t^P$ is an $m$-dimensional canonical Brownian
motion under the objective probability measure $P$, $\mu^P(\cdot)$ is an $m$-dimensional function
of $X_t$, and $\sigma(\cdot)$ is an $m \times m$ matrix-valued function of $X_t$. The stock price is given
by $S_t = f(X_t)$ for some function $f(\cdot)$, but usually either the stock price or its natural
logarithm is taken to be one of the state variables. We take the stock price itself to be the
first element of $X_t$, and write $X_t = [S_t; Y_t]'$, with $Y_t$ an $N$-vector of other state variables,
$N = m - 1$.
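As a concrete illustration of Eq. (1) with $m = 2$ and $N = 1$, the sketch below simulates one path of $X_t = [S_t; Y_t]'$ by an Euler discretization. The Heston-type drift and diffusion used here, the parameter names (`mu`, `kappa`, `gamma`, `sigma`, `rho`) and the truncation of negative variance at zero are assumptions made for the example only; they are not taken from the text above.

```python
import math
import random


def simulate_heston_euler(s0, y0, mu, kappa, gamma, sigma, rho, dt, n_steps, seed=0):
    """Euler path of X_t = [S_t, Y_t]' for an assumed Heston-type model under P.

    Assumed dynamics (illustrative only):
        dS = mu * S dt + sqrt(Y) * S dW1
        dY = kappa * (gamma - Y) dt + sigma * sqrt(Y) * (rho dW1 + sqrt(1 - rho^2) dW2)
    """
    rng = random.Random(seed)
    path = [(s0, y0)]
    for _ in range(n_steps):
        s, y = path[-1]
        y_pos = max(y, 0.0)  # truncate so the square root stays well defined
        root_y = math.sqrt(y_pos)
        # Independent Brownian increments over a step of length dt
        dw1 = rng.gauss(0.0, math.sqrt(dt))
        dw2 = rng.gauss(0.0, math.sqrt(dt))
        # Rows of the 2 x 2 diffusion matrix sigma(X_t)
        row_s = (root_y * s, 0.0)
        row_y = (sigma * root_y * rho, sigma * root_y * math.sqrt(1.0 - rho ** 2))
        s_next = s + mu * s * dt + row_s[0] * dw1 + row_s[1] * dw2
        y_next = y + kappa * (gamma - y_pos) * dt + row_y[0] * dw1 + row_y[1] * dw2
        path.append((s_next, y_next))
    return path


# One year of daily steps with hypothetical parameter values
path = simulate_heston_euler(100.0, 0.04, 0.05, 3.0, 0.04, 0.3, -0.7, 1.0 / 252, 252, seed=1)
```

An Euler scheme is only a simulation device; it plays no role in the likelihood expansions discussed later, which work directly with the continuous-time specification.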
From the well-known results of Harrison and Kreps (1979) and Harrison and Pliska
(1981), and many extensions since then, the existence of an equivalent martingale measure
Q guarantees the absence of arbitrage among a broad class of admissible trading
strategies.[3] Under the measure $Q$, the state vector follows the process
$$dX_t = \mu^Q(X_t)\,dt + \sigma(X_t)\,dW_t^Q, \tag{2}$$
where $W_t^Q$ is an $m$-dimensional canonical Brownian motion under $Q$, and $\mu^Q(\cdot)$ is an
$m$-dimensional function of $X_t$. The stock itself, since it is a traded asset, must satisfy
$$dS_t = (r - \delta) S_t\,dt + \sigma_1(X_t)\,dW_t^Q, \tag{3}$$
where $r$ is the risk-free rate, $\delta$ is the dividend yield on the stock (both taken to be constant
for simplicity only), and $\sigma_1(X_t)$ denotes the first row of the matrix $\sigma(X_t)$. In other words,
under the measure $Q$, an investment in the stock must have an instantaneous expected
return equal to the risk-free interest rate. The instantaneous mean (under $Q$) of the stock
price is therefore dependent only on the stock price itself, but its volatility can depend on
any of the state variables including, but not limited to, $S_t$ itself.
The price $\phi(t, X_t)$ of a derivative security that does not pay a dividend must satisfy the
Feynman–Kac differential equation
$$\frac{\partial \phi(t, X_t)}{\partial t} + \sum_{i=1}^{m} \frac{\partial \phi(t, X_t)}{\partial X_t(i)}\,\mu_i^Q(X_t) + \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \frac{\partial^2 \phi(t, X_t)}{\partial X_t(i)\,\partial X_t(j)}\,\sigma_{ij}^2(X_t) - r\,\phi(t, X_t) = 0, \tag{4}$$
where $\mu_i^Q(X_t)$ denotes element $i$ of the drift vector $\mu^Q(X_t)$, and $\sigma_{ij}^2(X_t)$ denotes the element
in row $i$ and column $j$ of the diffusion matrix $\sigma(X_t)\sigma'(X_t)$. The price of an ordinary
derivative security with a European-style exercise convention must satisfy the boundary
condition
$$\phi(T, X_T) = g(X_T), \tag{5}$$
where $T$ is the maturity date of the derivative and $g(X_T)$ is its final payoff. Usually, the
derivative payoff is a function only of the stock price
$$g(X_T) = h(S_T) \tag{6}$$
for some function $h$; for standard options, such as puts and calls, this condition is always
satisfied.
The nature of a solution to Eq. (4) depends critically on the volatility specification in
Eq. (3). If $\sigma_1$ satisfies
$$\sigma_1(X_t)\,\sigma_1'(X_t) = \sigma_S^2(S_t) \tag{7}$$
for some function $\sigma_S(S_t)$, then the stock price is a univariate process under the measure $Q$
(although not necessarily under $P$ because of the potential dependence of $\mu^P(X_t)$ on state
variables other than $S_t$). In this case, the price of any European-style derivative with a final
payoff of the type specified in Eq. (6) can be expressed as $\phi(t, X_t) = \xi(t, S_t)$, and Eq. (4)
simplifies to
$$\frac{\partial \xi(t, S_t)}{\partial t} + \frac{\partial \xi(t, S_t)}{\partial S_t}\,(r - \delta) S_t + \frac{1}{2}\,\frac{\partial^2 \xi(t, S_t)}{\partial S_t^2}\,\sigma_S^2(S_t) - r\,\xi(t, S_t) = 0 \tag{8}$$
with the consequence that the instantaneous changes in prices of all derivative securities
are perfectly correlated with the instantaneous price change of the stock itself. In this case,
[3] The definition of admissibility in the literature varies. It is usually either an integrability restriction on the
trading strategy, which requires that the Radon–Nikodym derivative of $Q$ with respect to $P$ have finite variance,
or a boundedness restriction on the deflated wealth process, which imposes no such restriction on $dQ/dP$.
knowledge of $S_t$ and the parameters of the model is sufficient to price any derivative with
final payoff of the type in Eq. (6); any additional state variables are either wholly
irrelevant, or affect the stock price dynamics only under the measure $P$, and are therefore
irrelevant for derivative pricing purposes. (Of course, if the application at hand is
something other than derivative pricing, the dynamics under the $P$ measure could be
relevant.) Models of this type usually allow explicit time dependency by replacing $\sigma_S(S_t)$
with $\sigma_S(t, S_t)$; see, e.g., Derman and Kani (1994), Dupire (1994), and Rubinstein (1995),
who develop univariate models (or, more precisely, discrete-time approximations to
continuous-time univariate models) that have the ability to match an observed cross-
section of option prices perfectly. Some of these techniques are also able to match observed
prices of a term structure (with respect to maturity) of option prices as well. Such models
are usually calibrated from the cross-section and possibly the term structure of option
prices observed at a single point in time, rather than estimated from time-series
observations of the stock price itself. Calibration methods specify dynamics under the
measure $Q$ only, leaving the dynamics under $P$ unspecified. Such methods are therefore
able to accurately reflect a number of empirical regularities, such as volatility smiles and
smirks, but cannot tell us anything about risk premia of the state variables in the model (if
our method is implemented using proxies derived from implied volatilities instead of
option prices, the risk premia will not be fully identified either).
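The univariate case can be checked numerically. The sketch below assumes the constant-elasticity special case $\sigma_S^2(S) = \sigma^2 S^2$ (geometric Brownian motion, chosen here only because the Black-Scholes call price is then available in closed form) and verifies by central finite differences that this price makes the left-hand side of Eq. (8) vanish. The function names and step sizes are hypothetical choices for the example.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s, k, tau, r, delta, vol):
    """Black-Scholes call price with continuous dividend yield delta."""
    d1 = (math.log(s / k) + (r - delta + 0.5 * vol ** 2) * tau) / (vol * math.sqrt(tau))
    d2 = d1 - vol * math.sqrt(tau)
    return s * math.exp(-delta * tau) * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)


def pde_residual(t, s, k, maturity, r, delta, vol, h_t=1e-5, h_s=1e-2):
    """Left-hand side of Eq. (8) for xi(t, S) = Black-Scholes call price."""
    def xi(tt, ss):
        return bs_call(ss, k, maturity - tt, r, delta, vol)

    d_t = (xi(t + h_t, s) - xi(t - h_t, s)) / (2.0 * h_t)
    d_s = (xi(t, s + h_s) - xi(t, s - h_s)) / (2.0 * h_s)
    d_ss = (xi(t, s + h_s) - 2.0 * xi(t, s) + xi(t, s - h_s)) / h_s ** 2
    sigma_s_sq = (vol * s) ** 2  # assumed form of sigma_S^2(S)
    return d_t + d_s * (r - delta) * s + 0.5 * d_ss * sigma_s_sq - r * xi(t, s)


# Should be close to zero up to finite-difference error
residual = pde_residual(0.0, 100.0, 95.0, 0.5, 0.04, 0.015, 0.2)
```

Only $S_t$ and the model parameters enter the computation, which is the sense in which no additional state variable is needed for pricing in this class of models.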
Despite this ability to match a cross-section, and often a term structure, of observed
option prices perfectly, Dumas et al. (1998) find that univariate calibrated models imply a
joint time-series behavior for the stock price and option prices that is not consistent with
the observed price processes. Consequently, such models require periodic recalibration, in
which the volatility function $\sigma_S(t, S_t)$ is changed to match the new observed cross-section
and term structure of option prices. The need for such recalibration shows that the price
process implied by such models cannot be the true price process, and the implications of
such models with respect to derivatives pricing, hedging, etc., are therefore suspect.
Stochastic volatility models, in which Eq. (7) is not satisfied, offer an alternative. Having
the volatility of the stock depend on a set of state variables that can have variation
independent of the stock price itself permits more flexible time-series modeling than is
possible with the univariate calibrated type of model. Furthermore, stochastic volatility
models are able to generate volatility smiles and smirks, although they are not able to
match a cross-section of options perfectly, as are the calibrated models. Nonetheless, a
stochastic volatility model with one or more elements in $Y_t$ provides considerable
flexibility in modeling. In all the specific models we consider in Sections 4 and 6, volatility
depends on a single state variable (i.e., $Y_t$ has a single element).
3. The estimation method
In stochastic volatility models, part of the state vector $X_t$ is not directly observed. There
are two fundamentally different approaches to dealing with this issue in estimation. One
approach is to assume that we observe only a time-series of observations of the stock price
$S_t$, and apply a filtering technique. The elements of $X_t$, other than $S_t$, are considered
unobserved, and, since $S_t$ is not a Markov process, the likelihood of an observation of $S_t$
depends not only on the last observation $S_{t-1}$, but on the entire history of the stock price.
Such an approach is taken by Bates (2002). This approach does not fully identify all of the
parameters of the $Q$-measure dynamics. The model offers as many as $m$ independent
sources of risk, but the stock price instantaneously depends on only one of these sources.
Consequently, only the first element of $\mu^Q(\cdot)$ can be identified. If the dynamics under the
measure $P$ are the object of interest, then this approach has some advantages; for example,
an incorrect specification of the $Q$-measure dynamics does not taint the $P$-measure
estimation. However, if the $Q$-measure dynamics are the objective, then clearly another
approach must be taken.
A second approach, which we adopt, is to assume that a time-series of observations of
both the stock price $S_t$ and a vector of option prices (which, for simplicity, we take to be
call options) $C_t$ is observed. The time-series of $Y_t$ can then be inferred from the observed
$C_t$. If $Y_t$ is multidimensional, sufficiently many options are required with varying strike
prices and maturities to allow extraction of the current value of $Y_t$ from the observed stock
and call prices. Otherwise, only a single option is needed. This approach has the advantage
of using all available information in the estimation procedure, but the disadvantage that
option prices must be calculated for each parameter vector considered, in order to extract
the value of volatility from the call prices.
There are two distinct methods for extracting the value of $Y_t$ from the observed option
prices. One method is to calculate option prices explicitly as a function of the stock price
and of $Y_t$, for each parameter vector considered during the estimation procedure. This
approach has the advantage of permitting identification of all parameters under both the $P$
and $Q$ measures. An alternative is to use the Black-Scholes implied volatility of an at-the-
money short-maturity option as a proxy for the instantaneous volatility of the stock. This
approach has the virtue of simplicity, but can only be applied when there is a single
stochastic volatility state variable. The $Q$-measure parameters are not fully identified when
this method is employed. We use both of these approaches, and an additional method we
develop, in Section 7 and compare them.
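The proxy route can be sketched in a few lines: given the observed price of a short-dated at-the-money call, invert the Black-Scholes formula for the volatility. The bisection below is a minimal illustration under assumed inputs (a 30-day option, hypothetical rates); it relies only on the fact that the Black-Scholes call price is increasing in volatility, and the bracket `[lo, hi]` is an arbitrary choice.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s, k, tau, r, delta, vol):
    """Black-Scholes call price with continuous dividend yield delta."""
    d1 = (math.log(s / k) + (r - delta + 0.5 * vol ** 2) * tau) / (vol * math.sqrt(tau))
    d2 = d1 - vol * math.sqrt(tau)
    return s * math.exp(-delta * tau) * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)


def implied_vol(price, s, k, tau, r, delta, lo=1e-6, hi=5.0, tol=1e-10):
    """Bisection on volatility; valid because the call price is increasing in vol."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(s, k, tau, r, delta, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round trip on a hypothetical 30-day at-the-money call
observed = bs_call(100.0, 100.0, 30.0 / 365.0, 0.04, 0.015, 0.2)
proxy_vol = implied_vol(observed, 100.0, 100.0, 30.0 / 365.0, 0.04, 0.015)
```

How the implied volatility maps into $Y_t$ depends on the model; if $Y_t$ is an instantaneous variance, for instance, the natural proxy would be the square of `proxy_vol`.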
For reasons of statistical efficiency, we seek to determine the joint likelihood function of
the observed data, as opposed to, for example, conditional or unconditional moments.
We proceed as follows to determine this likelihood function. Since, in general, the
transition likelihood function for a stochastic volatility model is not known in closed form,
we apply the closed form approximation method of Aït-Sahalia (2001) which yields in
closed form the joint likelihood function of $[S_t; Y_t]'$. From there, the joint likelihood
function of the observations on $G_t = [S_t; C_t]'$ is obtained simply by multiplying the
likelihood of $X_t = [S_t; Y_t]'$ by a Jacobian term. If a proxy for $Y_t$ is used, this last step is not
necessary.
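The Jacobian step amounts to a standard change of variables. Because $S_t$ maps to itself, the Jacobian determinant of the map from $[S_t; Y_t]'$ to $[S_t; C_t]'$ reduces (when $N = 1$) to $\partial C_t / \partial Y_t$, so the log-likelihood of the observables is the log-likelihood of the state minus $\ln |\partial C_t / \partial Y_t|$. The sketch below illustrates this with a deliberately artificial density and pricing function; `toy_log_lik_state`, `toy_price` and `toy_invert` are hypothetical stand-ins, not an option pricing model.

```python
import math


def log_lik_observed(log_lik_state, price_fn, invert_fn, s, c, eps=1e-6):
    """Log-density of (S, C) from the log-density of (S, Y), with C = price_fn(S, Y)."""
    y = invert_fn(s, c)  # recover the state from the observed price
    # Central-difference estimate of dC/dY, the only nontrivial Jacobian entry
    dc_dy = (price_fn(s, y + eps) - price_fn(s, y - eps)) / (2.0 * eps)
    return log_lik_state(s, y) - math.log(abs(dc_dy))


def toy_log_lik_state(s, y):
    """Toy density: S and Y independent standard normals."""
    return -math.log(2.0 * math.pi) - 0.5 * (s * s + y * y)


def toy_price(s, y):
    """Toy monotone pricing map, increasing in y."""
    return s + math.exp(y)


def toy_invert(s, c):
    """Exact inverse of toy_price in y."""
    return math.log(c - s)


value = log_lik_observed(toy_log_lik_state, toy_price, toy_invert, 0.3, 1.7)
```

In an actual application the inverse map would itself be computed numerically from the option pricing model, once per observation and per parameter vector.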
We now examine each of these steps in turn: first, the determination of an explicit
expression for the likelihood function of $X_t$; second, the identification of the state vector
$X_t$ from the observed market data on $G_t$; and third, a change of variable to go back from
the likelihood function of $X_t$ to that of $G_t$. We present in this section the method in full
generality, before specializing and applying the results to the three specific stochastic
volatility models we consider.
3.1. Closed-form likelihood expansions
The first step in our estimation method requires that we derive an explicit expression
for the likelihood function of the state vector $X_t = [S_t; Y_t]'$ under $P$. Specifically, consider
the stochastic differential equation describing the dynamics of the state vector $X_t$ under the
measure $P$, as specified by (1). Let $p_X(\Delta, x \mid x_0; \theta)$ denote its transition function, that is, the
conditional density of $X_{t+\Delta} = x$ given $X_t = x_0$, where $\theta$ denotes the vector of parameters
for the model.
Rather than the likelihood function, we approximate the log-likelihood function,
$l_X \equiv \ln p_X$. We now turn to the question of constructing closed form expansions for the
function $l_X$ of an arbitrary multivariate diffusion. The expansion of the log likelihood in
Aït-Sahalia (2001) takes the form of a power series (with some additional leading terms) in
$\Delta$, the time interval separating observations:
$$l_X^{(J)}(\Delta, x \mid x_0; \theta) = -\frac{m}{2} \ln(2\pi\Delta) - D_v(x; \theta) + \frac{C_X^{(-1)}(x \mid x_0; \theta)}{\Delta} + \sum_{k=0}^{J} C_X^{(k)}(x \mid x_0; \theta)\,\frac{\Delta^k}{k!}, \tag{9}$$
where
$$D_v(x; \theta) \equiv \frac{1}{2} \ln(\det[v(x; \theta)]) \tag{10}$$
and $v(x) \equiv \sigma(x)\sigma'(x)$. The series can be calculated up to arbitrary order $J$. The unknowns
so far are the coefficients $C_X^{(k)}$ corresponding to each $\Delta^k$, $k = -1, 0, \ldots, J$. We then
calculate a Taylor series in $(x - x_0)$ of each coefficient $C_X^{(k)}$, at order $j_k$ in $(x - x_0)$, which
will turn out to be fully explicit. Such an expansion will be denoted by $C_X^{(j_k, k)}$, and is taken
at order $j_k = 2(J - k)$.
The resulting expansion is then
$$\tilde{l}_X^{(J)}(\Delta, x \mid x_0; \theta) = -\frac{m}{2} \ln(2\pi\Delta) - D_v(x; \theta) + \frac{C_X^{(j_{-1}, -1)}(x \mid x_0; \theta)}{\Delta} + \sum_{k=0}^{J} C_X^{(j_k, k)}(x \mid x_0; \theta)\,\frac{\Delta^k}{k!} \tag{11}$$
and Aït-Sahalia (2001) shows that the coefficients $C_X^{(j_k, k)}$ can be obtained in closed form for
arbitrary specifications of the dynamics of the state vector $X_t$ by solving a system of linear
equations.
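To make the structure of the expansion concrete, the sketch below evaluates a sum of the form in Eqs. (9) and (11) for a model where everything is known exactly. The example is an assumption chosen for transparency, not one of the models of the paper: a univariate arithmetic Brownian motion $dX_t = \mu\,dt + \sigma\,dW_t$, whose Gaussian log-density can be rearranged by hand into this form with $D_v = \frac{1}{2}\ln\sigma^2$, $C^{(-1)} = -(x - x_0)^2 / (2\sigma^2)$, $C^{(0)} = \mu (x - x_0) / \sigma^2$ and $C^{(1)} = -\mu^2 / (2\sigma^2)$, so that the series with $J = 1$ is exact.

```python
import math


def log_lik_expansion(delta, m, d_v, c_minus1, c_terms):
    """-(m/2) ln(2 pi delta) - D_v + C^(-1)/delta + sum_k C^(k) delta^k / k!."""
    value = -0.5 * m * math.log(2.0 * math.pi * delta) - d_v + c_minus1 / delta
    for k, c_k in enumerate(c_terms):  # c_terms[k] holds C^(k), k = 0, ..., J
        value += c_k * delta ** k / math.factorial(k)
    return value


def abm_coefficients(x, x0, mu, sigma):
    """Hand-derived coefficients for dX = mu dt + sigma dW (illustrative model)."""
    dx = x - x0
    d_v = 0.5 * math.log(sigma ** 2)
    c_minus1 = -dx ** 2 / (2.0 * sigma ** 2)
    c0 = mu * dx / sigma ** 2
    c1 = -mu ** 2 / (2.0 * sigma ** 2)
    return d_v, c_minus1, [c0, c1]


def abm_exact_log_density(delta, x, x0, mu, sigma):
    """Exact Gaussian transition log-density of the same model."""
    var = sigma ** 2 * delta
    return -0.5 * math.log(2.0 * math.pi * var) - (x - x0 - mu * delta) ** 2 / (2.0 * var)


d_v, c_m1, c_terms = abm_coefficients(0.12, 0.10, 0.05, 0.2)
approx = log_lik_expansion(1.0 / 252, 1, d_v, c_m1, c_terms)
exact = abm_exact_log_density(1.0 / 252, 0.12, 0.10, 0.05, 0.2)
```

For the stochastic volatility models considered here the coefficients are not available by inspection, which is why they are obtained from the linear system mentioned above.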
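The system of linear equations determining the coefficients is obtained by forcing the
expansion (9) to satisfy, to order $\Delta^J$, the forward and backward Fokker-Planck-Kolmogorov
equations, either in their familiar form for the transition density $p_X$ or in
their equivalent form for $\ln p_X$. For instance, the forward equation for $\ln p_X$ is of the
following form:
$$\frac{\partial l_X}{\partial \Delta} = -\sum_{i=1}^{m} \frac{\partial \mu_i^P(x)}{\partial x_i} + \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \frac{\partial^2 v_{ij}(x)}{\partial x_i\,\partial x_j} - \sum_{i=1}^{m} \mu_i^P(x)\,\frac{\partial l_X}{\partial x_i} + \sum_{i=1}^{m} \sum_{j=1}^{m} \frac{\partial v_{ij}(x)}{\partial x_i}\,\frac{\partial l_X}{\partial x_j} + \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} v_{ij}(x)\,\frac{\partial^2 l_X}{\partial x_i\,\partial x_j} + \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \frac{\partial l_X}{\partial x_i}\,v_{ij}(x)\,\frac{\partial l_X}{\partial x_j}, \tag{12}$$
where $v_{ij}(x)$ denotes the element in row $i$ and column $j$ of $v(x)$.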
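Eq. (12) can be sanity-checked on a diffusion with a known transition density. The sketch below assumes a univariate geometric Brownian motion, $dX_t = \mu X_t\,dt + \sigma X_t\,dW_t$, so that $\mu^P(x) = \mu x$ and $v(x) = \sigma^2 x^2$, and evaluates both sides of the forward equation for the exact lognormal log-density using central finite differences. The model, function names and step sizes are assumptions made for this check only.

```python
import math


def gbm_log_density(delta, x, x0, mu, sigma):
    """Exact lognormal transition log-density of dX = mu X dt + sigma X dW."""
    z = math.log(x / x0) - (mu - 0.5 * sigma ** 2) * delta
    var = sigma ** 2 * delta
    return -math.log(x) - 0.5 * math.log(2.0 * math.pi * var) - z ** 2 / (2.0 * var)


def forward_residual(delta, x, x0, mu, sigma, h_d=1e-6, h_x=1e-4):
    """Left side minus right side of Eq. (12) with m = 1."""
    def l(d, xx):
        return gbm_log_density(d, xx, x0, mu, sigma)

    l_d = (l(delta + h_d, x) - l(delta - h_d, x)) / (2.0 * h_d)
    l_x = (l(delta, x + h_x) - l(delta, x - h_x)) / (2.0 * h_x)
    l_xx = (l(delta, x + h_x) - 2.0 * l(delta, x) + l(delta, x - h_x)) / h_x ** 2
    drift, d_drift = mu * x, mu                    # mu(x) and its derivative
    v, d_v, dd_v = sigma ** 2 * x ** 2, 2.0 * sigma ** 2 * x, 2.0 * sigma ** 2
    rhs = (-d_drift + 0.5 * dd_v - drift * l_x + d_v * l_x
           + 0.5 * v * l_xx + 0.5 * v * l_x ** 2)
    return l_d - rhs


# Should be close to zero up to finite-difference error
residual = forward_residual(0.1, 1.1, 1.0, 0.05, 0.2)
```

All six terms on the right-hand side are active in this example because both the drift and the diffusion depend on $x$.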
In the appendix, we give the resulting coefficients $C_X^{(j_k, k)}$ in closed form for the stochastic
volatility model of Heston (1993), and two other related stochastic volatility models. While
the expressions at first look daunting, they are in fact quite simple to implement in practice.
First, the calculations yielding the coefficients in formula (11) are performed using a
symbolic algebra package such as Mathematica. Second, and most important, for a given
model, the expressions need to be calculated only once. So, to estimate, for instance,
the model of Heston (1993) (or one of the other models considered), the expressions in the
appendix are all that is needed for that model. The reader can then safely ignore the
general method that gives rise to these expressions and simply plug the coefficients $C_X^{(j_k, k)}$
we give in the appendix into formula (11).
3.2. Identification of the state vector
When $Y_t$ contains a single element, that is $N = 1$, one possible identification approach is
to use the Black-Scholes implied volatility of an at-the-money short-maturity option as a
proxy for the instantaneous relative standard deviation of the stock. From Eq. (3), the
instantaneous relative volatility of the stock is given by $\sqrt{\sigma_1(X_t)\,\sigma_1'(X_t)} / S_t$. Since the stock
price is observed and there is only one degree of freedom remaining in determining the
instantaneous relative standard deviation, the stock and the implied volatility of a single
option are sufficient to identify all elements of $X_t$. Such an approach is based on the
theoretical observation that the implied volatility of an at-the-money option converges to
the instantaneous volatility of the stock as the maturity of the option goes to zero. This
approach has several advantages, but some disadvantages as well. First, it does not fully
identify the $Q$-measure parameters. Second, this approach cannot be taken if $Y_t$ has more
than one element; in this case, multiple options are needed to identify the elements of $Y_t$,
and simple approximation rules similar to that used for the univariate case are not
available. Third, as we will see below, there are situations (such as the CEV model) where it
is not sufficiently accurate.
If this approach is not possible or desirable, the elements of $Y_t$ can be inferred from
observed option prices $C_t$ by calculating true (i.e., not dependent on the above
approximation) option prices. Monte Carlo simulations in Section 6 below assess the
effect of making this approximation on the overall quality of the estimates. Since the
potential for simplification by using the approximation technique is substantial (in effect,
rendering the option pricing model unnecessary), it is indeed worth investigating the
tradeoff between the accuracy of the estimates and the effort involved in dealing with the
option pricing model.
Clearly, to identify the $N$ elements of $Y_t$ requires observation of at least $N$ option
prices. If the mapping from the $N$ elements of $Y_t$ to prices of $N$ options $C_t$ with given
strike prices and maturities has a unique inverse, then these options suffice to identify the
state vector. If the inverse mapping is not unique, additional options are required, leading
to a stochastic singularity problem. In this case, some or all of the options must be
assumed to be observed with error. Whether the mapping from $Y_t$ to the option prices is
invertible must be verified for each specific model considered. In the specific models we use
in our empirical application, $N = 1$ and this is not an issue.
For each time period in a data sample, we therefore need not only observations of the stock price $S_t$, but also at least $N$ option prices of varying strikes and/or maturities. We denote the time of maturity and strike price of element $i$ of $C_t$ as $T_i$ and $K_i$, respectively. The value of each element of $C_t$ thus depends on the time to maturity $T_i - t$, the stock price $S_t$, the values of the other state variables $Y_t$, and the option strike price $K_i$; these inputs form an $(N + 3)$-dimensional space. As always, it is useful to reduce the dimensionality of the space of inputs as much as possible. We propose a number of approaches for achieving a low dimensionality, as follows. Holding $T_i - t$ constant for each of the $N$ options throughout the data sample reduces the dimensionality by one; we must then consider each of the $N$ option inputs as occupying an $(N + 2)$-dimensional space. We might be inclined to hold the strike price $K_i$ constant throughout the data sample as well, although such a choice is usually not practical; if the stock price exhibits considerable variation over the data sample, it is unlikely that option prices with any fixed strike price $K_i$ are observed in the market for the entire data sample. If, however, in addition to holding the time to maturity constant for each of the $N$ options, we also hold moneyness (i.e., the ratio of $S_t$ to $K_i$) constant, then the dimensionality of the input space is reduced to $N + 1$; each of the $N$ options must be calculated for a variety of values of $S_t$ and $Y_t$, but the time to maturity $T_i - t$ is held fixed for each option, and the strike price $K_i$ is a simple function of the stock price for each option. In fact, option markets usually provide a reasonable range of moneyness traded at each point in time (introducing new options if necessary), thereby ensuring that such data are always available. It should be noted, given these choices, that each $C_t(i)$ is not simply a time-series of observations of the same call throughout the data sample: the time to maturity remains constant, and the moneyness also remains constant even as the stock price changes through the sample.
A further reduction in dimensionality of the input space is possible if the stochastic volatility model satisfies a homogeneity property. Note that the payoff of a European call option is first-order homogeneous in the stock price and strike price. Denoting the call price $C$ as a function of time of maturity, stock price, strike price, and $Y_t$, we have

$$C(T, aS_T, aK, Y_T) = (aS_T - aK)^+ = a(S_T - K)^+ = aC(T, S_T, K, Y_T). \qquad (13)$$
In general, the price of an option is not first-order homogeneous prior to $T$, unless additional restrictions are placed on the model. The following conditions are sufficient:

$$
\begin{aligned}
\sigma_1(X_t)\sigma_1'(X_t) &= \varphi_{11}(Y_t)\,S_t^2, \\
\sigma_1(X_t)\sigma_i'(X_t) &= \varphi_{1i}(Y_t)\,S_t = \varphi_{i1}(Y_t)\,S_t, \quad i > 1, \\
\sigma_i(X_t)\sigma_j'(X_t) &= \varphi_{ij}(Y_t), \quad i > 1,\ j > 1, \\
\mu_i^Q(X_t) &= \psi_i(Y_t), \quad i > 1
\end{aligned}
\qquad (14)
$$

for some set of functions $\varphi_{ij}(Y_t)$, $1 \le i, j \le m$, and $\psi_i(Y_t)$, $2 \le i \le m$. In this case, we can express the call price as

$$C(t, S_t, K, Y_t) = S_t\,H(t, m_t, Y_t), \qquad (15)$$

where $m_t$ is the logarithmic moneyness of the option:

$$m_t = \ln S_t - \ln K. \qquad (16)$$
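Substituting this expression into Eq. (4), we find that the pricing partial differential equation simplifies to

$$
\begin{aligned}
0 ={}& \frac{\partial H(t, m_t, Y_t)}{\partial t} - H(t, m_t, Y_t)\,\delta + \frac{\partial H(t, m_t, Y_t)}{\partial m_t}(r - \delta) + \sum_{i=2}^{m} \frac{\partial H(t, m_t, Y_t)}{\partial Y_t(i)}\,\psi_i(Y_t) \\
&+ \frac{1}{2}\left[\frac{\partial H(t, m_t, Y_t)}{\partial m_t} + \frac{\partial^2 H(t, m_t, Y_t)}{\partial m_t^2}\right]\varphi_{11}(Y_t) \\
&+ \sum_{i=2}^{m}\left[\frac{\partial H(t, m_t, Y_t)}{\partial Y_t(i)} + \frac{\partial^2 H(t, m_t, Y_t)}{\partial Y_t(i)\,\partial m_t}\right]\varphi_{i1}(Y_t) \\
&+ \frac{1}{2}\sum_{i=2}^{m}\sum_{j=2}^{m} \frac{\partial^2 H(t, m_t, Y_t)}{\partial Y_t(i)\,\partial Y_t(j)}\,\varphi_{ij}(Y_t).
\end{aligned}
\qquad (17)
$$

Note that the solution $H(t, m_t, Y_t)$ cannot depend on $S_t$, but this does not present a problem, since $S_t$ has been eliminated from the coefficients of the partial differential equation. Furthermore, the strike price does not appear in the PDE or in the scaled option payoff:

$$H(T, m_T, Y_T) = \left(1 - e^{-m_T}\right)^+. \qquad (18)$$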
The option price therefore inherits the homogeneity of its payoff. Thus, by calculating scaled option prices (i.e., option prices divided by the stock price), the dimensionality of the input space can be reduced to $m - 1$.
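The homogeneity property can be checked numerically in the simplest model that satisfies it. The sketch below is only an illustration: it uses the Black-Scholes formula with constant variance as a stand-in for a stochastic volatility pricing function, and the function names (`bs_call`, `scaled_call`) and parameter values are assumptions of this example, not part of the paper.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S, K, tau, r, delta, var):
    """Black-Scholes call with constant variance `var` and dividend yield `delta`."""
    vol = math.sqrt(var * tau)
    d1 = (math.log(S / K) + (r - delta) * tau + 0.5 * var * tau) / vol
    d2 = d1 - vol
    return (S * math.exp(-delta * tau) * norm_cdf(d1)
            - K * math.exp(-r * tau) * norm_cdf(d2))


def scaled_call(m, tau, r, delta, var):
    """Scaled price H = C / S as a function of log-moneyness m = ln S - ln K."""
    # Price with S = 1 and K = exp(-m); by homogeneity this equals C(S, K) / S.
    return bs_call(1.0, math.exp(-m), tau, r, delta, var)


# Hypothetical inputs chosen only for illustration.
S, K, tau, r, delta, var = 120.0, 110.0, 30.0 / 365.0, 0.04, 0.015, 0.05
a = 3.7
price = bs_call(S, K, tau, r, delta, var)
print(bs_call(a * S, a * K, tau, r, delta, var) / a, price)
print(price / S, scaled_call(math.log(S / K), tau, r, delta, var))
```

Once time to maturity and moneyness are held fixed, the scaled price depends only on the volatility state, which is the point of the dimension reduction.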
Provided the stochastic volatility model under consideration satisfies the homogeneity conditions of Eq. (14), scaled option prices with $m - 1$ distinct combinations of time to maturity $T_i - t$ and moneyness $S_t/K_i$ must be calculated for varying values of $Y_t$. The time-series of values of $Y_t$ can then be inferred by comparing the calculated option prices to the observed option prices. Once these values have been calculated for a given value of the parameter vector, the joint likelihood of the time-series of observations of $S_t$ and $Y_t$ must be calculated.
A variety of techniques exist for calculating option prices, and the most appropriate method in general depends on the specific stochastic volatility model under question. For instance, if the characteristic function of the transition likelihood is known in closed form (as is sometimes the case even when the likelihood itself is not known), options can be priced through a variety of Fourier transform methods.
A natural alternative is to apply the density expansion technique described in Section 3.1 to the risk-neutral dynamics (2), yielding at order $J$ the transition function $q_X^{(J)}(\Delta, x \mid x_0; \theta)$, or, after expanding the state vector $X = [S; Y]'$, $q_X^{(J)}(\Delta, s, y \mid s_0, y_0; \theta)$. Denote the corresponding marginal with respect to $s$ as $q_X^{(J)}(\Delta, s \mid s_0, y_0; \theta)$. Then an expansion of the call option price at order $J$ is given by the Feynman-Kac formula

$$C^{(J)}(t, S_t, K, Y_t) = \exp(-r(T - t)) \int_0^{+\infty} (S_T - K)^+\, q_X^{(J)}(T - t, S_T \mid S_t, Y_t; \theta)\, dS_T. \qquad (19)$$
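Eq. (19) reduces pricing to a one-dimensional quadrature once a marginal density for $S_T$ is available. The sketch below illustrates this with a stand-in density: the lognormal risk-neutral density of a constant-variance model replaces $q_X^{(J)}$ (an assumption of this example, since the expansion coefficients are not reproduced here), and composite Simpson integration is one possible quadrature choice. The result can then be compared with the Black-Scholes closed form.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S, K, tau, r, delta, var):
    """Closed-form benchmark for the constant-variance case."""
    vol = math.sqrt(var * tau)
    d1 = (math.log(S / K) + (r - delta) * tau + 0.5 * var * tau) / vol
    return (S * math.exp(-delta * tau) * norm_cdf(d1)
            - K * math.exp(-r * tau) * norm_cdf(d1 - vol))


def lognormal_density(S_T, S_t, tau, r, delta, var):
    """Stand-in for q^(J): risk-neutral density of S_T under constant variance."""
    mean = math.log(S_t) + (r - delta - 0.5 * var) * tau
    sd = math.sqrt(var * tau)
    z = (math.log(S_T) - mean) / sd
    return math.exp(-0.5 * z * z) / (S_T * sd * math.sqrt(2.0 * math.pi))


def call_by_integration(K, tau, r, density, upper, n=4000):
    """exp(-r tau) * integral over [K, upper] of (S_T - K) q(S_T) dS_T (Simpson, n even)."""
    h = (upper - K) / n
    total = 0.0
    for i in range(n + 1):
        s = K + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * (s - K) * density(s)
    return math.exp(-r * tau) * total * h / 3.0


# Hypothetical inputs for illustration only.
S_t, K, tau, r, delta, var = 100.0, 100.0, 30.0 / 365.0, 0.04, 0.015, 0.04
q = lambda s: lognormal_density(s, S_t, tau, r, delta, var)
upper = S_t * math.exp(10.0 * math.sqrt(var * tau))  # truncate the far right tail
print(call_by_integration(K, tau, r, q, upper), bs_call(S_t, K, tau, r, delta, var))
```

The payoff is zero below the strike, so the integration starts at $K$, where the integrand is smooth.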
Since the specification of the market prices of risk is often such that the dynamics of the process under P and Q involve only adjustments to the parameter values, but no change of functional form for the drift and diffusion, the expression for $q_X^{(J)}$ follows directly from those already derived for $p_X^{(J)}$. If there is a change of functional form of the drift vector and/or the diffusion matrix due to the market prices of risk, then the same method described in Section 3.1 can be applied to the new specification to obtain $q_X^{(J)}$. And note that the expression (19) is truly a one-dimensional integral because there is no need to obtain $q_X^{(J)}(\Delta, s \mid s_0, y_0; \theta)$ as the integral over the forward volatility variable of $q_X^{(J)}(\Delta, s, y \mid s_0, y_0; \theta)$; rather, integration over $y$ of the backward PDE satisfied by $q_X^{(J)}(\Delta, s, y \mid s_0, y_0; \theta)$ shows that $q_X^{(J)}(\Delta, s \mid s_0, y_0; \theta)$ solves the same equation. A closed form expansion for $q_X^{(J)}(\Delta, s \mid s_0, y_0; \theta)$ is therefore obtained directly by solving the backward equation using the same functional form, with the coefficients obtained explicitly by constraining the solution to also satisfy the forward equation.
3.3. Change of variables: from state to observed variables
We have now obtained an expansion of the joint likelihood of observations on $X_t = [S_t; Y_t]'$ in the form (11). If a proxy based on an option's implied volatility has been used to identify $Y_t$, then this likelihood can be maximized directly; provided the instantaneous interest rate and dividend yield are observed rather than estimated, then the identification of $Y_t$ does not depend in any way on the model parameters. The value of $X_t$ therefore remains constant as the model parameters are varied during a likelihood search. When the true option prices are calculated, this is no longer the case; as the model parameters are varied during a likelihood search, the implied values of $X_t$ do not remain constant. Estimation by maximization of the likelihood of $X_t$ is therefore not possible; rather, estimation requires maximization of the likelihood of the observed market prices, $G_t = [S_t; C_t]'$.
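The third and last step of our method is therefore moving from $X_t$ to the time-series observations on $G_t$, and this step requires only that the likelihood of $X_t$ be multiplied by a Jacobian term. This term is a function of the partial derivatives of the $X_t$ with respect to $S_t$ and $C_t$; these derivatives are arranged in a matrix, and the Jacobian term is the determinant of this matrix. Because $S_t$ is itself an element of $X_t$, the determinant takes on a particularly simple form:

$$
J_t = \det\begin{bmatrix}
\dfrac{\partial S_t}{\partial S_t} & \dfrac{\partial S_t}{\partial Y_t(1)} & \cdots & \dfrac{\partial S_t}{\partial Y_t(N)} \\
\dfrac{\partial C_t(1)}{\partial S_t} & \dfrac{\partial C_t(1)}{\partial Y_t(1)} & \cdots & \dfrac{\partial C_t(1)}{\partial Y_t(N)} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial C_t(N)}{\partial S_t} & \dfrac{\partial C_t(N)}{\partial Y_t(1)} & \cdots & \dfrac{\partial C_t(N)}{\partial Y_t(N)}
\end{bmatrix}
= \det\begin{bmatrix}
1 & 0 & \cdots & 0 \\
\dfrac{\partial C_t(1)}{\partial S_t} & \dfrac{\partial C_t(1)}{\partial Y_t(1)} & \cdots & \dfrac{\partial C_t(1)}{\partial Y_t(N)} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial C_t(N)}{\partial S_t} & \dfrac{\partial C_t(N)}{\partial Y_t(1)} & \cdots & \dfrac{\partial C_t(N)}{\partial Y_t(N)}
\end{bmatrix}
= \det\begin{bmatrix}
\dfrac{\partial C_t(1)}{\partial Y_t(1)} & \cdots & \dfrac{\partial C_t(1)}{\partial Y_t(N)} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial C_t(N)}{\partial Y_t(1)} & \cdots & \dfrac{\partial C_t(N)}{\partial Y_t(N)}
\end{bmatrix}.
\qquad (20)
$$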
It is therefore only necessary to calculate partial derivatives of the option prices $C_t$ with respect to the state variables $Y_t$; these derivatives are the stochastic multivariate analog of the familiar vega of Black-Scholes option prices. The delta coefficients of the option prices do not appear in the Jacobian term. When we calculate the option prices to identify the state vector $X_t$ (as per Section 3.2), the derivatives are also calculated as a by-product.
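The reduction in Eq. (20) is a cofactor expansion along the first row, so the deltas drop out of the determinant. A small numerical check for $N = 2$ is sketched below; the delta and vega values are made-up numbers used only to exercise the algebra.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Hypothetical sensitivities for two options and two volatility factors.
deltas = [0.55, 0.48]                 # dC(i)/dS: appear in the matrix, not in the result
vegas = [[0.18, 0.02], [0.05, 0.21]]  # dC(i)/dY(j)

full_matrix = [[1.0, 0.0, 0.0],
               [deltas[0], vegas[0][0], vegas[0][1]],
               [deltas[1], vegas[1][0], vegas[1][1]]]

print(det3(full_matrix), det2(vegas))
```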
Once the state vector is identified and the Jacobian term from the change of variables formula computed, the transition function of the observed asset prices (the stock and options), $G_t = [S_t; C_t]'$, can be derived from the transition function of the state vector $X_t = [S_t; Y_t]'$. Specifically, consider the stochastic differential equation describing the dynamics of the state vector $X_t$ under the measure P, as specified by (1). Let $p_X(\Delta, x \mid x_0; \theta)$ denote its transition function, that is the conditional density of $X_{t+\Delta} = x$ given $X_t = x_0$, where $\theta$ denotes the vector of parameters for the model. Let $p_G(\Delta, g \mid g_0; \theta)$ similarly denote the transition function of the vector of the asset prices $G$ observed $\Delta$ units apart. We now express the stock and option prices as functions of the state vector, $G_{t+\Delta} = f(X_{t+\Delta}; \theta)$. Defining the inverse of this function to express the state as a function of the observed asset prices, $X_{t+\Delta} = f^{-1}(G_{t+\Delta}; \theta)$, we have the following for the conditional density of $G_{t+\Delta} = g$ given $G_t = g_0$:

$$
\begin{aligned}
p_G(\Delta, g \mid g_0; \theta) &= \det\left[\frac{\partial f\big(f^{-1}(g; \theta); \theta\big)}{\partial x}\right]^{-1} p_X\big(\Delta, f^{-1}(g; \theta) \mid f^{-1}(g_0; \theta); \theta\big) \\
&= J_t(\Delta, g \mid g_0; \theta)^{-1}\, p_X\big(\Delta, f^{-1}(g; \theta) \mid f^{-1}(g_0; \theta); \theta\big),
\end{aligned}
\qquad (21)
$$

where $J_t(\Delta, g \mid g_0; \theta)$ is the determinant defined in (20).
Then, recognizing that the vector of asset prices is Markovian and applying Bayes' Rule, we see that the log-likelihood function for discrete data on the asset prices vector $g_t$ sampled at dates $t_0, t_1, \ldots, t_n$ has the simple form

$$\ell_n(\theta) \equiv n^{-1} \sum_{i=1}^{n} l_G\big(t_i - t_{i-1}, g_{t_i} \mid g_{t_{i-1}}; \theta\big), \qquad (22)$$

where

$$l_G(\Delta, g \mid g_0; \theta) \equiv \ln p_G(\Delta, g \mid g_0; \theta) = -\ln J_t(\Delta, g \mid g_0; \theta) + l_X\big(\Delta, f^{-1}(g; \theta) \mid f^{-1}(g_0; \theta); \theta\big)$$

with $l_X$ obtained in Section 3.1, and we are done.
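The change of variables in Eqs. (21) and (22) can be illustrated in a scalar setting where the answer is known. In the sketch below, the "state" is standard normal and the "observed price" is its exponential, so the observed density must be lognormal; the helper name `log_density_observed` and the toy mapping are assumptions made for this example only, not the option pricing map of the paper.

```python
import math


def log_density_observed(g, f_inverse, jacobian, log_density_state):
    """ln p_G(g) = -ln|J| + ln p_X(f^{-1}(g)), with J = df/dx at x = f^{-1}(g)."""
    x = f_inverse(g)
    return -math.log(abs(jacobian(x))) + log_density_state(x)


def std_normal_logpdf(x):
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * x * x


def lognormal_logpdf(g):
    """Known closed form for the density of exp(X), X standard normal."""
    return -math.log(g) - 0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(g) ** 2


# Toy mapping g = f(x) = exp(x); its derivative df/dx = exp(x) plays the role of J_t.
g_obs = 2.5
via_jacobian = log_density_observed(g_obs, math.log, math.exp, std_normal_logpdf)
print(via_jacobian, lognormal_logpdf(g_obs))
```

In the option setting the same two ingredients are needed at each observation: the inverted state $f^{-1}(g; \theta)$ and the derivative of the price with respect to the state.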
We assume in this paper that the sampling process is deterministic. Indeed, in typical practical situations, and in our Monte Carlo experiments below, these types of models are estimated on the basis of daily or weekly data, so that $t_i - t_{i-1} = \Delta = 1/252$ or $t_i - t_{i-1} = \Delta = 7/365$, respectively, is a fixed number. Aït-Sahalia and Mykland (2003) provide a treatment of maximum likelihood estimation in the case of randomly spaced sampling times. Maximum likelihood estimation of the parameter vector $\theta$ then involves maximizing expression (22), evaluated at the observations $g_{t_0}, g_{t_1}, \ldots, g_{t_n}$, over the parameter values.
4. Example: the Heston model
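In what follows, we apply the method described above to the prototypical stochastic volatility model, that of Heston (1993). Under the Q measure, $S_t$ and $Y_t$ follow the dynamics

$$
dX_t = d\begin{bmatrix} S_t \\ Y_t \end{bmatrix}
= \begin{bmatrix} (r - \delta) S_t \\ \kappa'(\gamma' - Y_t) \end{bmatrix} dt
+ \begin{bmatrix} \sqrt{(1 - \rho^2) Y_t}\, S_t & \rho \sqrt{Y_t}\, S_t \\ 0 & \sigma \sqrt{Y_t} \end{bmatrix}
d\begin{bmatrix} W_1^Q(t) \\ W_2^Q(t) \end{bmatrix}.
\qquad (23)
$$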
Note that $Y_t$ is a local variance rather than a local standard deviation; while keeping this in mind, we will continue to refer to $Y_t$ as the stochastic volatility variable. $Y_t$ follows the square root process of Feller (1951), and is bounded below by zero. The boundary value zero cannot be achieved if Feller's condition, $2\kappa'\gamma' \ge \sigma^2$, is satisfied. If we instead restate the dynamics in terms of the logarithmic stock price $s_t = \ln S_t$, we have

$$
d\begin{bmatrix} s_t \\ Y_t \end{bmatrix}
= \begin{bmatrix} r - \delta - \tfrac{1}{2} Y_t \\ \kappa'(\gamma' - Y_t) \end{bmatrix} dt
+ \begin{bmatrix} \sqrt{(1 - \rho^2) Y_t} & \rho \sqrt{Y_t} \\ 0 & \sigma \sqrt{Y_t} \end{bmatrix}
d\begin{bmatrix} W_1^Q(t) \\ W_2^Q(t) \end{bmatrix}.
\qquad (24)
$$
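A discretized version of the log-price dynamics (24) can be simulated with a simple Euler scheme. The sketch below is an illustration under stated assumptions: the full-truncation treatment of negative variance values is a common discretization device and is not taken from the paper, and the function names and parameter values are hypothetical.

```python
import math
import random


def feller_condition_holds(kappa_q, gamma_q, sigma):
    """Feller's condition 2 * kappa' * gamma' >= sigma^2."""
    return 2.0 * kappa_q * gamma_q >= sigma ** 2


def simulate_heston_log_price(s0, y0, r, delta, kappa_q, gamma_q, sigma, rho,
                              dt, n_steps, seed=0):
    """Euler scheme for (s_t, Y_t) in Eq. (24), using max(Y, 0) inside the coefficients."""
    rng = random.Random(seed)
    s, y = s0, y0
    path = [(s, y)]
    for _ in range(n_steps):
        y_pos = max(y, 0.0)  # full truncation: an assumption of this sketch
        dw1 = rng.gauss(0.0, 1.0) * math.sqrt(dt)
        dw2 = rng.gauss(0.0, 1.0) * math.sqrt(dt)
        s += ((r - delta - 0.5 * y_pos) * dt
              + math.sqrt((1.0 - rho ** 2) * y_pos) * dw1
              + rho * math.sqrt(y_pos) * dw2)
        y += kappa_q * (gamma_q - y_pos) * dt + sigma * math.sqrt(y_pos) * dw2
        path.append((s, y))
    return path


# Hypothetical parameter values for illustration only.
path = simulate_heston_log_price(s0=math.log(100.0), y0=0.1, r=0.04, delta=0.015,
                                 kappa_q=3.0, gamma_q=0.1, sigma=0.25, rho=-0.8,
                                 dt=1.0 / 252.0, n_steps=252)
print(len(path), feller_condition_holds(3.0, 0.1, 0.25))
```

Setting `sigma=0` and `y0=gamma_q` keeps the variance constant along the path, which corresponds to the constant-volatility special case.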
The log stock price $s_t$ has volatility that is an affine function of $Y_t$, and the covariance between $s_t$ and $Y_t$ is also affine in $Y_t$ itself. The model of Black and Scholes (1973) is obviously a special case of the model of Heston (1993), in which $\sigma = 0$ and $Y_0 = \gamma'$ so that $Y_t$ is constant. The likelihood function for the model of Heston (1993) is not known in closed form,$^4$ unless we impose parameter restrictions that in effect make the model equivalent to that of Black and Scholes (1973); hence the need for methods such as ours to estimate models of this type by maximum likelihood.
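The market price of risk specification in the model is $\Lambda = \left[\lambda_1 \sqrt{(1 - \rho^2) Y_t},\ \lambda_2 \sqrt{Y_t}\right]'$. The joint dynamics of $s_t$ and $Y_t$ under the objective measure P are then

$$
d\begin{bmatrix} s_t \\ Y_t \end{bmatrix}
= \begin{bmatrix} a + b Y_t \\ \kappa(\gamma - Y_t) \end{bmatrix} dt
+ \begin{bmatrix} \sqrt{(1 - \rho^2) Y_t} & \rho \sqrt{Y_t} \\ 0 & \sigma \sqrt{Y_t} \end{bmatrix}
d\begin{bmatrix} W_1^P(t) \\ W_2^P(t) \end{bmatrix},
\qquad (25)
$$

where

$$a = r - \delta, \quad b = \lambda_1(1 - \rho^2) + \lambda_2 \rho - \frac{1}{2}, \quad \kappa = \kappa' - \lambda_2 \sigma, \quad \gamma = \left(\frac{\kappa + \lambda_2 \sigma}{\kappa}\right)\gamma'. \qquad (26)$$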
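The mapping in Eq. (26) between the Q-measure parameters and the P-measure drift coefficients is simple arithmetic, and can be sketched as follows. The function name and the numerical values are hypothetical; the formulas are those of Eq. (26), obtained by adding the diffusion matrix times $\Lambda$ to the Q-measure drift.

```python
def q_to_p_parameters(r, delta, kappa_q, gamma_q, sigma, rho, lam1, lam2):
    """Return (a, b, kappa, gamma) of Eq. (25) from the Q-measure parameters."""
    a = r - delta
    b = lam1 * (1.0 - rho ** 2) + lam2 * rho - 0.5
    kappa = kappa_q - lam2 * sigma
    gamma = (kappa + lam2 * sigma) / kappa * gamma_q
    return a, b, kappa, gamma


# Hypothetical values for illustration only.
a, b, kappa, gamma = q_to_p_parameters(r=0.04, delta=0.015, kappa_q=3.0, gamma_q=0.1,
                                       sigma=0.25, rho=-0.8, lam1=4.0, lam2=1.0)
print(a, b, kappa, gamma)
print(kappa * gamma)  # equals kappa_q * gamma_q: the constant term of the drift of Y
```

The product $\kappa\gamma$ is unchanged by the measure change because the volatility risk premium $\lambda_2 \sigma Y_t$ is proportional to $Y_t$ and so alters only the slope of the drift.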
When the volatility state variable $Y_t$ is not observable, its value must be extracted from option prices as discussed above in order to carry out the maximum likelihood estimation of the model's parameters, $\theta = [\kappa; \gamma; \sigma; \rho; \lambda_1; \lambda_2]'$. Since the price of a call option is a monotonically increasing function of the level of volatility, the value of $Y_t$ can be determined from the price of a single option. We therefore take as given a joint time-series of observations of the log stock price $s_t$ and the price of an at-the-money, constant-maturity option $C_t$. In principle, any option can be used, but this choice has three advantages. First, at-the-money and short-dated options are likely to be the most actively traded and liquid options, so their prices are least affected by microstructure and other such issues. Second, at-the-money options are highly sensitive to changes in volatility, so small observation errors in the price will have minimal effect on the implied level of volatility. Finally, as described in Section 3.2, the use of options with constant moneyness and time to maturity considerably simplifies the extraction of volatility from the vector of observed option prices. Note that this model satisfies the homogeneity requirements of (14), so that only the value of $Y_t$ need be varied when computing option prices.
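Because the call price is monotonically increasing in the volatility state, a bracketing root finder is enough to recover $Y_t$ from a single observed price. The sketch below uses bisection, with the Black-Scholes formula standing in for the model's pricing function (an assumption of this example; in the estimation the pricing function would be the one for the stochastic volatility model at the current parameter vector).

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S, K, tau, r, delta, var):
    """Stand-in pricing function: Black-Scholes call with variance `var`."""
    vol = math.sqrt(var * tau)
    d1 = (math.log(S / K) + (r - delta) * tau + 0.5 * var * tau) / vol
    return (S * math.exp(-delta * tau) * norm_cdf(d1)
            - K * math.exp(-r * tau) * norm_cdf(d1 - vol))


def implied_state(price, pricing_fn, lo=1e-8, hi=4.0, tol=1e-12, max_iter=200):
    """Invert a monotonically increasing pricing function by bisection."""
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if pricing_fn(mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical at-the-money, 30-day option.
S, tau, r, delta = 100.0, 30.0 / 365.0, 0.04, 0.015
price_of = lambda y: bs_call(S, S, tau, r, delta, y)
observed = price_of(0.04)
print(implied_state(observed, price_of))
```

With constant moneyness and maturity, the same inversion could instead be carried out by interpolating on a one-dimensional grid of precomputed prices.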
To calculate option prices in this model, we can use either the Feynman-Kac formula based on an expansion at order $J$ of the density function, as given in Eq. (19), or characteristic functions as in Heston (1993), modified by Carr and Madan (1998), exploiting the fact that this particular model is affine under the Q measure (it is also affine under P but this is irrelevant). The former method can be applied to all stochastic volatility models (provided only that they imply option prices that are martingales and not just local martingales), since the expansion $q_X^{(J)}$ can be calculated without restrictions, whereas the latter is specific to models that admit a closed form characteristic function; see other examples beyond Heston's model in Lewis (2000). However, the expansion-based pricing method will be most accurate for short-maturity options, since the density expansion is a Taylor expansion in the time variable.

$^4$ See Lamoureux and Paseka (2004) for an expression of the density of the Heston model using Fourier inversion of the characteristic function, which reduces the dimensionality of the required integration to a one-dimensional integral. The remaining integral is over a modified Bessel function of noninteger order.
In either case, we start with the option price expressed as

$$C(s_t, Y_t, K, \Delta) = e^{-r\Delta}\, E^Q\!\left[\left[\exp(s_{t+\Delta}) - K\right]^+ \,\middle|\, s_t, Y_t\right], \qquad (27)$$
where $K$ is the strike price of the option, and $\Delta$ is the time remaining until maturity. Heston (1993) provides a Fourier transform method for calculating the option price; however, with this method, the characteristic function of the option is singular at the origin, making numeric integration difficult. Carr and Madan (1998) present an alternate Fourier transform procedure that avoids this difficulty. Rather than computing the option price directly, we calculate the option price scaled by the current price of the stock:

$$c(s_t, Y_t, K, \Delta) = \exp[-s_t]\, C(s_t, Y_t, K, \Delta). \qquad (28)$$
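It is then convenient to express the scaled option price in terms of the logarithmic moneyness of the option rather than the raw value of the strike price, $m_t = s_t - \ln K$. This scaled option price is given by

$$c(s_t, Y_t, K, \Delta) = \frac{1}{\pi}\int_0^{+\infty} \mathrm{Re}\left[\frac{\exp[w_0 + w_1 m_t + w_2 Y_t]}{\alpha(\alpha + 1) - u^2 + (2\alpha + 1)iu}\right] du, \qquad (29)$$

where $\alpha$ is an arbitrary scaling parameter and

$$
\begin{aligned}
w_0 &= \Delta\big((r - \delta)(\alpha + 1) - r + (r - \delta)iu\big) + \frac{\kappa'\gamma'}{\sigma^2}\big(\Delta g_1 - 2\ln(g_2)\big), \\
w_1 &= iu + \alpha, \qquad w_2 = -\big(u^2 - (2\alpha + 1)iu - \alpha(\alpha + 1)\big)\left(1 - \frac{1}{g_2}\right)\frac{1}{g_1}, \\
g_0 &= \sqrt{c_0 + c_1 u + c_2 u^2}, \\
g_1 &= \kappa' - (iu + \alpha + 1)\rho\sigma - g_0, \qquad g_2 = 1 - \frac{g_1}{g_0}\left(\frac{\exp(-\Delta g_0) - 1}{2}\right), \\
c_0 &= (\kappa')^2 - \sigma(\alpha + 1)(2\kappa'\rho - \sigma) - \sigma^2(\alpha + 1)^2(1 - \rho^2), \\
c_1 &= -i\sigma\big(2\sigma(\alpha + 1)(1 - \rho^2) + 2\kappa'\rho - \sigma\big), \qquad c_2 = \sigma^2(1 - \rho^2).
\end{aligned}
$$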
This expression can be evaluated quickly, since it is a one-dimensional integral. Heston (1993) even refers to similar one-dimensional integrals as "closed form." Since we use options with constant moneyness and time to maturity, the integral above need only be calculated for each parameter vector evaluated during a likelihood search and over a one-dimensional grid of values of $Y_t$. By the above procedure, we can find the values of $s_t$ and $Y_t$ as functions of $S_t$ and $C_t$. As discussed in Section 3.1, we then derive the likelihood $f_{sY}$ of $s_t$ and $Y_t$ explicitly. The log-likelihood formulas, made specific for this particular model, are given in the appendix.
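A Fourier-inversion price of this kind can be sketched in a few lines. The code below is an illustration, not the authors' implementation: it follows the damped-transform approach of Carr and Madan (1998) with the standard Heston characteristic function written in a numerically stable form, takes the principal branches of the complex square root and logarithm, truncates the integral at `u_max`, and uses Simpson's rule. All of these choices, and the parameter values, are assumptions of this example. As a sanity check, with a very small volatility of volatility and $Y_t = \gamma'$ the result should be close to the Black-Scholes scaled price.

```python
import cmath
import math


def heston_scaled_call(m, Y, tau, r, delta, kappa_q, gamma_q, sigma, rho,
                       alpha=1.5, u_max=150.0, n=3000):
    """Scaled call price C/S by damped Fourier inversion (requires sigma > 0, n even)."""
    A = alpha + 1.0
    c0 = kappa_q ** 2 - sigma * A * (2 * kappa_q * rho - sigma) - sigma ** 2 * A ** 2 * (1 - rho ** 2)
    c1 = -1j * sigma * (2 * sigma * A * (1 - rho ** 2) + 2 * kappa_q * rho - sigma)
    c2 = sigma ** 2 * (1 - rho ** 2)

    def integrand(u):
        iu = 1j * u
        g0 = cmath.sqrt(c0 + c1 * u + c2 * u * u)
        g1 = kappa_q - (iu + A) * rho * sigma - g0
        g2 = 1.0 - (g1 / g0) * (cmath.exp(-tau * g0) - 1.0) / 2.0
        P = u * u - (2 * alpha + 1) * iu - alpha * (alpha + 1)
        w0 = (tau * ((r - delta) * A - r + (r - delta) * iu)
              + kappa_q * gamma_q / sigma ** 2 * (tau * g1 - 2 * cmath.log(g2)))
        w1 = iu + alpha
        w2 = -P * (1.0 - 1.0 / g2) / g1
        return (cmath.exp(w0 + w1 * m + w2 * Y) / (-P)).real

    h = u_max / n
    total = 0.0
    for k in range(n + 1):
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        total += weight * integrand(k * h)
    return total * h / (3.0 * math.pi)


def bs_scaled_call(m, var, tau, r, delta):
    """Black-Scholes C/S for log-moneyness m, used only as a benchmark."""
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    vol = math.sqrt(var * tau)
    d1 = (m + (r - delta) * tau + 0.5 * var * tau) / vol
    return math.exp(-delta * tau) * cdf(d1) - math.exp(-m - r * tau) * cdf(d1 - vol)


# Hypothetical inputs; sigma is tiny so the model is nearly constant-variance.
args = dict(tau=30.0 / 365.0, r=0.04, delta=0.015)
near_bs = heston_scaled_call(0.0, 0.1, kappa_q=3.0, gamma_q=0.1, sigma=1e-3, rho=-0.8, **args)
print(near_bs, bs_scaled_call(0.0, 0.1, **args))
```

For fixed moneyness and maturity, evaluating this function over a grid of `Y` values gives the one-dimensional table from which $Y_t$ is read off.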
In the case where volatility is unobservable, the dependence of the joint likelihood function of $X_t = [S_t; Y_t]'$ under P on the full set of market price of risk parameters is introduced by the Jacobian term, itself resulting from the transformation from $[S_t; C_t]'$ to $[S_t; Y_t]'$ as described in Section 3.3. In the unobservable volatility case, the separate identification of the two market price of risk parameters is tenuous (see the Monte Carlo experiments below).
5. Volatility proxies
If, on the other hand, we have available a proxy for the state volatility variable, then maximum likelihood estimation of the vector $\theta$ can proceed directly without the need for option prices. Note, however, that the dynamics under P of the process $[S_t; Y_t]'$, or $[s_t; Y_t]'$ as given in Eq. (25), will only permit identification of the parameters $[\kappa; \gamma; \sigma; \rho; b]'$ or equivalently $[\kappa; \gamma; \sigma; \rho; \lambda_1]'$, since both components of the observed vector are viewed under P. In that situation, we will (arbitrarily) treat the $\lambda_2$ parameter as fixed at 0, and given the other identified parameters, translate the estimated value of $b$ into an estimate for $\lambda_1$.
The first proxy we use is an unadjusted Black-Scholes proxy in which the implied volatility of a short-maturity at-the-money option is used in place of the true instantaneous volatility state variable. The use of this proxy is justified in theory by the fact that the implied volatility of such an option converges to the instantaneous volatility of the logarithmic stock price as the maturity of the option goes to zero. The Monte Carlo simulations of Section 6 show that, for some models such as the Heston model, the use of this proxy in place of an exact option pricing method still produces accurate estimates. Option pricing formulae are known in closed form for only a few models, and tractable numeric procedures such as those based on characteristic functions can be used only for a limited class of models. The ability to use a numerically tractable proxy is thus a significant advantage. However, as we will see below for the more general CEV model, we find that this simple proxy method introduces significant bias in the estimation of the elasticity of volatility parameter (simulations not shown). To remedy this problem, we develop an alternate proxy method.
5.1. The integrated volatility proxy
Our alternate proxy, which we name the integrated volatility proxy, corrects for the effect of mean reversion in volatility during the life of an option. If $Y_t$ is the instantaneous variance of the logarithmic stock price, we can express the average integrated variance from time $t$ to $T$ as $V(t, T)$:

$$V(t, T) = \frac{1}{T - t}\int_t^T Y_u\, du. \qquad (30)$$

If the volatility process is instantaneously uncorrelated with the logarithmic stock price process, we can then calculate option prices by taking the expected value of the Black-Scholes option price with $V(t, T)$ as the implied variance over the probability distribution of $V(t, T)$: see Hull and White (1987). If the two processes are correlated, then the price of the option is a weighted average of Black-Scholes prices evaluated at different stock prices and volatilities: see Romano and Touzi (1997).
Our proxy is determined by calculating the expected value of $V(t, T)$ first, and substituting this value into the Black-Scholes formula as implied variance. This proxy is model-free, in that it can be calculated whether or not an exact volatility can be computed, and results in a straightforward estimation procedure. On the other hand, this procedure is in general approximate, first because the volatility process is unlikely to be instantaneously uncorrelated with the logarithmic stock price process, and second because the expectation is taken before substituting $V(t, T)$ into the Black-Scholes formula rather than after. We examine below in Monte Carlo simulations the respective impact of these two approximations, with the objective of determining whether the tradeoff involved between simplicity and exactitude is worthwhile.
The idea is to adjust the Black-Scholes implied volatility for the effect of mean reversion in volatility, essentially undoing the averaging that takes place in Eq. (30). Specifically, if the Q-measure drift of $Y_t$ is of the form $a + bY_t$ (as it is in all the models we examine), then the expected value of $V(t, T)$ is given by

$$E_t[V(t, T)] = \left(\frac{e^{b(T - t)} - 1}{b(T - t)}\right)\left(Y_t + \frac{a}{b}\right) - \frac{a}{b}. \qquad (31)$$

(A similar expression can be derived in the special case where $b = 0$.) By taking the expected value on the left-hand side to be the observed implied variance $V_{imp}(t, T)$ of a short-maturity $T$ at-the-money option, our adjusted proxy is then given by

$$Y_t \approx \frac{\big(b V_{imp}(t, T) + a\big)(T - t)}{e^{b(T - t)} - 1} - \frac{a}{b}. \qquad (32)$$
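Eqs. (31) and (32) are inverse maps of each other and are cheap to evaluate. The sketch below implements both; the function names are hypothetical, and the $b = 0$ branch uses $E_t[V(t, T)] = Y_t + a(T - t)/2$, which is derived here from $E_t[Y_u] = Y_t + a(u - t)$ rather than quoted from the paper. For the Heston model under Q, $a = \kappa'\gamma'$ and $b = -\kappa'$.

```python
import math


def expected_avg_variance(Y, a, b, tau):
    """Eq. (31): E_t[V(t, T)] when the Q-drift of Y is a + b * Y and tau = T - t."""
    if b == 0.0:
        return Y + 0.5 * a * tau  # limiting case, derived for this sketch
    return math.expm1(b * tau) / (b * tau) * (Y + a / b) - a / b


def integrated_vol_proxy(V_imp, a, b, tau):
    """Eq. (32): back out Y_t from the implied variance of a short-maturity option."""
    if b == 0.0:
        return V_imp - 0.5 * a * tau
    return (b * V_imp + a) * tau / math.expm1(b * tau) - a / b


# Hypothetical Heston-type values: kappa' = 3, gamma' = 0.1, 30-day option.
a, b, tau = 3.0 * 0.1, -3.0, 30.0 / 365.0
Y_low = 0.02
V = expected_avg_variance(Y_low, a, b, tau)
print(V, integrated_vol_proxy(V, a, b, tau))
```

When $Y_t$ is below its long-run level, the expected average variance exceeds $Y_t$, so using the implied variance directly would overstate the state variable; the inversion removes that effect.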
To examine the accuracy of this proxy, we perform the following experiment for a range of different parameter values $\theta$: given $Y_t$, generate at-the-money short-maturity (30 days as is the case for the VIX data; see Section 7 below) option prices using the exact CEV model, compute the option's Black-Scholes implied volatility $V_{imp}(t, T)$, then back out $Y_t$ using Eq. (32) and plot the resulting value of $Y_t$ (proxy-implied) against the original value of $Y_t$. If the overall procedure is accurate, then the plot should be close to a 45° line. The result is reported in Fig. 1, which is to be compared to Fig. 2 for the same experiment but using the unadjusted Black-Scholes proxy, where one simply sets $Y_t \approx V_{imp}(t, T)$ instead of our proxy where one sets $E_t[V(t, T)] \approx V_{imp}(t, T)$. Each point corresponds to a different value of $Y_t$, and one of ten sets of parameter values chosen to be realistic for the model. As is clear from comparing the two figures, adjusting for the mean-reverting behavior of the volatility state variable results in a marked improvement in our ability to produce an accurate proxy-implied $Y_t$ from an option-implied volatility. By observing the departures from the 45° line in Fig. 2, we see that the Black-Scholes proxy tends to overstate (understate) $Y_t$ when volatility is low (high). This is as expected, given the mean reversion in volatility. Our adjusted proxy corrects for this, and recovers the unobservable $Y_t$ with much greater accuracy.

Fig. 1. From implied volatility proxy back to latent volatility state variable: integrated volatility proxy. [Scatter plot; horizontal axis: true volatility state variable; vertical axis: proxy-implied volatility state variable.]
The Black-Scholes proxy $Y_t \approx V_{imp}(t, T)$ does not depend on any of the model's parameters (except the instantaneous interest rate and dividend yield, which we treat as observed); by contrast, our integrated volatility proxy depends on the parameters of the drift of volatility. But, given the explicit inversion formula (32), we can simply take $[S_t; V_{imp}(t, T)]'$ as the state vector, write its likelihood from that of $[S_t; Y_t]'$ using a Jacobian term for the change of variable (32), and estimate all the parameters in one stage.$^5$

$^5$ If a different proxy is used, where the inversion of $V_{imp}(t, T)$ back into $Y_t$ is not available in closed form, one can employ a two-stage estimation procedure to speed computations. In the first stage, estimate only the parameters of the univariate volatility process, using the Black-Scholes proxy. Then use the first-stage estimated parameters to calculate the volatility proxy, which is then used in a second-stage estimate of all model parameters.

Fig. 2. From implied volatility proxy back to latent volatility state variable: Black-Scholes proxy. [Scatter plot; horizontal axis: true volatility state variable; vertical axis: proxy-implied volatility state variable.]

5.2. The Lewis proxy for the CEV model

We also consider a more general model, which nests the Heston model as well as the GARCH model we will investigate empirically below. Under the Q measure, the state variables $S_t$ and $Y_t$ follow a process of the following form:

$$
dX_t = d\begin{bmatrix} S_t \\ Y_t \end{bmatrix}
= \begin{bmatrix} (r - \delta) S_t \\ \kappa'(\gamma' - Y_t) \end{bmatrix} dt
+ \begin{bmatrix} \sqrt{(1 - \rho^2) Y_t}\, S_t & \rho \sqrt{Y_t}\, S_t \\ 0 & \sigma Y_t^{\beta} \end{bmatrix}
d\begin{bmatrix} W_1^Q(t) \\ W_2^Q(t) \end{bmatrix},
\qquad (33)
$$

where we constrain the parameter $\beta \ge 1/2$, as well as $\beta \le 1$ (to retain the uniqueness of option prices). This model is considered by Jones (2003) among others and nests both the models of Heston (1993) ($\beta = 1/2$) and the GARCH model ($\beta = 1$). Note, however, that the special properties of these models could still warrant separate investigation, despite their being nested by the model of (33); for example, the Fourier inversion method for option pricing is feasible for the model of Heston (1993), where $\beta = 1/2$, but not for the general CEV model. The state variable $Y_t$ has a boundary at zero, but this boundary can only be achieved when $\beta = 1/2$, and even then only for certain values of the model parameters. Proceeding as before, we can express these dynamics in terms of the logarithmic stock price rather than the stock price itself:
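$$
d\begin{bmatrix} s_t \\ Y_t \end{bmatrix}
= \begin{bmatrix} r - \delta - \tfrac{1}{2} Y_t \\ \kappa'(\gamma' - Y_t) \end{bmatrix} dt
+ \begin{bmatrix} \sqrt{(1 - \rho^2) Y_t} & \rho \sqrt{Y_t} \\ 0 & \sigma Y_t^{\beta} \end{bmatrix}
d\begin{bmatrix} W_1^Q(t) \\ W_2^Q(t) \end{bmatrix}.
\qquad (34)
$$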
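We again set at zero the component of the market price of risk vector corresponding to $Y_t$, and specify $\Lambda = \left[\lambda_1 \sqrt{(1 - \rho^2) Y_t};\ 0\right]'$. The P-measure dynamics of the state variables are then

$$
d\begin{bmatrix} s_t \\ Y_t \end{bmatrix}
= \begin{bmatrix} r - \delta - \tfrac{1}{2} Y_t + \lambda_1 (1 - \rho^2) Y_t \\ \kappa(\gamma - Y_t) \end{bmatrix} dt
+ \begin{bmatrix} \sqrt{(1 - \rho^2) Y_t} & \rho \sqrt{Y_t} \\ 0 & \sigma Y_t^{\beta} \end{bmatrix}
d\begin{bmatrix} W_1^P(t) \\ W_2^P(t) \end{bmatrix},
\qquad (35)
$$

where $\kappa' = \kappa$ and $\gamma' = \gamma$. The Q-measure dynamics are not affine in $Y_t$. The corresponding log-likelihood expansion can be found in the appendix.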
It is possible to refine the implied volatility proxy by expressing it in the form of a Taylor series in the volatility of volatility parameter $\sigma$ in the case of the CEV model, where the Q-measure drift of $Y_t$ is of the form $a + bY_t$, and the Q-measure diffusion of $Y_t$ is of the form $\sigma Y_t^{\beta}$. Lewis (2000) showed that in that case

$$
V_{imp}(t, T) = E_t[V(t, T)] + \sigma (T - t)^{-1}\left[\frac{1}{2} - \frac{\ln(S_t/K) + (r - \delta)(T - t)}{Y_t (T - t)}\right]
\frac{\rho}{b}\int_t^T \left(e^{b(T - s)} - 1\right)\left[e^{b(s - t)}\left(Y_t + \frac{a}{b}\right) - \frac{a}{b}\right]^{\beta + 1/2} ds + O(\sigma^2),
\qquad (36)
$$
where $E_t[V(t, T)]$ is given in (31) as a function of $Y_t$. While Eq. (36) should be expected to be more accurate in the case of the CEV model, provided that the parameter $\sigma$ is indeed small, it is unfortunately not that useful in our context because the relation between the observed $V_{imp}(t, T)$ and the latent $Y_t$ is not invertible without numerical computation of the parameter-dependent integral in Eq. (36), followed by the numerical inversion of the nonlinear relation from $V_{imp}(t, T)$ back to $Y_t$. This somewhat defeats the purpose of using a volatility proxy for our estimation purposes, since option prices are themselves integrals of the state variables: inferring $Y_t$ from option prices $C_t$ directly, through Eq. (19), also requires inversion of a parameter-dependent integral.
Said differently, if we are willing to solve numerically a parameter-dependent integral as part of the maximum likelihood search, then we might as well use the exact option prices $C_t$ to infer $Y_t$ directly, as in the general inference method we discuss above, and dispense with the additional approximation involved in the small-volatility-of-volatility Taylor series, or the use of implied volatility altogether. This is not to say that expression (36) cannot be useful in different contexts, such as figuring out the implications for the implied volatility smile of given parameter values, for instance, which is an application that uses the formula in the direction from $Y_t$ to $V_{imp}(t, T)$, but not for our purpose of inverting it in the direction from $V_{imp}(t, T)$ to $Y_t$. But the inversion required to use the proxy (36) is of the same degree of complexity as that of using option prices directly, thereby nullifying the only advantage of using a proxy in our context, which is to trade off some accuracy in exchange for a greatly simplified search for the likelihood maximum.
Even if we estimate the parameters based on a change of state variables from |S
t
; Y
t
]
/
to
|S
t
; V
imp
(t; T)]
/
, applying Ito s Lemma to (36) in order to nd out the Q-dynamics of
V
imp
(t; T) implied by the Q-dynamics of Y
t
does not eliminate the need to compute that
integral for each pair of observations and parameter values along the maximization path of
the likelihood. For these reasons, we suggest the immediately computable proxy (32).
Furthermore, that proxy has the advantage of being applicable to all stochastic volatility
models with afne drift, not just the CEV model.
6
6. Monte Carlo results
One major advantage of the proposed estimation method for stochastic volatility models
is that it is numerically tractable, so that large numbers of Monte Carlo simulations can be
conducted to determine the small-sample distribution of the estimators, examine the effect
of replacing the unobservable volatility variable Y
t
by a proxy, and compare the small-
sample behavior of the estimators to their predicted asymptotic behavior.
6.1. Estimators for the Heston model
We start with simulations for the model of Heston (1993) for a variety of assumptions
about sample length, time between observations, and observability of the volatility state
variable. This model is a natural choice, since option prices can be calculated easily
through Fourier inversion of the characteristic function; it is possible therefore to compare
results obtained with the exact option pricing formula to those obtained using the
unadjusted Black-Scholes proxy discussed above. We use sample lengths of $n = 500$, 5,000,
and 10,000 transitions at the daily ($\Delta = 1/252$) frequency, and $n = 500$ transitions at
the weekly ($\Delta = 7/365$) frequency. The parameter values used in the simulations are similar
to the values obtained from the empirical application in Section 7, and are reported in
Table 1. Note in particular that a value of $-0.8$ is chosen for $\rho$, to reflect the empirical
regularity that innovations to volatility and stock price are generally strongly negatively
correlated. The values of the instantaneous interest rate and dividend yield, $r$ and $d$, are
held fixed.
6
The integrated volatility proxy requires us to calculate the expectation of $V(t,T)$. This is an
exact calculation, with result given in (31), for all models whose volatility has an affine drift.
For models in which the drift of the volatility state variable is not affine, this expectation
cannot in general be found in closed form. One possible way to apply the integrated volatility
proxy to such models is to use a local linear approximation, in which the drift of volatility is
replaced by the first two terms of a power series expansion of the drift about the current value
of volatility. The accuracy of such an approximation for short maturities in a term structure
context is confirmed by Takamizawa and Shoji (2003). However, since in the models we consider,
the drift of volatility is affine, we do not need to employ this. And, in any event, in the
absence of an easily computable formula for inverting $V^{\mathrm{imp}}(t,T)$ into $Y_t$, we
might as well use option prices $C_t$ directly, as discussed above, and bypass the use of a proxy
for $Y_t$ altogether.
For each batch of simulations, we generate 1,000 sample paths using an Euler
discretization of the process, using 30 sub-intervals per sampling interval; 29 out of every
30 observations are then discarded, leaving only observations at either a daily or weekly
frequency. Each simulated data series is initialized with the volatility state variable at its
unconditional mean, and the stock at 100. An initial 500 observations are generated and
then discarded; the last of these observations is then taken as the starting point for the
simulated data series. We then generate 500, 5,000, or 10,000 additional observations.
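The simulation design just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the reported results: the P-measure drift of the log price is taken to be $r - d + bY_t$ with $b = \lambda_1(1-\rho^2) + \lambda_2\rho - 1/2$, the value $\lambda_2 = 0$ is an arbitrary choice, negative values of the discretized $Y_t$ are truncated at zero inside the drift and diffusion terms (the text does not say how they are handled), and the function name `simulate_heston_euler` is hypothetical.

```python
import numpy as np

def simulate_heston_euler(n_obs, delta=1/252, n_sub=30, burn_in=500,
                          kappa=3.0, gamma=0.10, sigma=0.25, rho=-0.8,
                          lam1=4.0, lam2=0.0, r=0.04, d=0.015,
                          s0=100.0, seed=0):
    """Euler scheme for (ln S_t, Y_t) with n_sub sub-intervals per observation."""
    rng = np.random.default_rng(seed)
    h = delta / n_sub                                # length of one Euler sub-step
    b = lam1 * (1 - rho**2) + lam2 * rho - 0.5       # assumed P-drift coefficient on Y_t
    log_s, y = np.log(s0), gamma                     # start at 100 and at the mean of Y
    total = burn_in + n_obs
    path = np.empty((total + 1, 2))
    path[0] = (log_s, y)
    for i in range(1, total + 1):
        for _ in range(n_sub):                       # only the last sub-step is stored
            y_pos = max(y, 0.0)                      # truncation at zero (assumption)
            dw1, dw2 = rng.standard_normal(2) * np.sqrt(h)
            log_s += (r - d + b * y_pos) * h + np.sqrt(y_pos) * (
                np.sqrt(1 - rho**2) * dw1 + rho * dw2)
            y += kappa * (gamma - y_pos) * h + sigma * np.sqrt(y_pos) * dw2
        path[i] = (log_s, y)
    kept = path[burn_in:]                            # last burn-in point starts the sample
    return np.exp(kept[:, 0]), kept[:, 1]

# One simulated series with 500 daily transitions (501 retained points).
S, Y = simulate_heston_euler(n_obs=500)
print(S.shape, Y.mean())
```

Storing only every 30th sub-step reproduces the "29 out of every 30 observations are discarded" rule, and slicing from the last burn-in point gives the starting value of the retained series.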
We next estimate the model parameters using the method described above. When simulating
the joint dynamics of the state vector $X_t = [S_t; Y_t]'$, we have the luxury of deciding
whether $Y_t$ is observable or not; we can determine the effect of ignoring the difference
between the (unobservable) stochastic volatility variable $Y_t$, and an (observable) proxy,
namely the implied volatility of a short-dated at-the-money option, or an affine function of the
implied volatility of such an option. Our method can be applied to either situation: treating
$Y_t$ as unobservable or replacing it by an observable proxy, which, as discussed in Section 3,
eliminates the need for the third step of our method, and greatly simplifies the second.
Table 1 reports results, all with observed volatility. Table 2 reports results, but with
volatility not directly observed, that is, employing the full estimation procedure where we
use the model to generate simulated option prices, i.e., observations on $C_t$, then use
$G_t = [S_t; C_t]'$ as the observed vector. In each case, the mean difference between the
estimates and the true values of the parameter (used in the data-generation procedure) over
the simulated paths is reported as the bias of the estimation procedure. The standard
deviation of each parameter is computed accordingly and reported.
Table 1
Monte Carlo simulations with observed volatility: Heston model

Parameter    True value   500 daily obs.       5,000 daily obs.     10,000 daily obs.    500 weekly obs.
                          Bias     Std. dev.   Bias     Std. dev.   Bias     Std. dev.   Bias     Std. dev.
$\kappa$     3.00         0.86     1.57        0.068    0.38        0.033    0.25        0.159    0.56
$\gamma$     0.10         0.0005   0.022       0.0001   0.0057      0.0001   0.004       0.0001   0.0084
$\sigma$     0.25         0.0002   0.006       0.0000   0.0020      0.0000   0.0014      0.0004   0.006
$\rho$       -0.8         0.0002   0.013       0.0001   0.0042      0.0001   0.003       0.0015   0.013
$\lambda_1$  4.0          0.92     6.5         0.07     1.9         0.0977   1.4         0.2      2.9

This table shows the results of 1,000 Monte Carlo simulations for the Heston model, with observed
volatility and 500, 5,000, and 10,000 daily observations (i.e., $\Delta = 1/252$) and 500 weekly
observations (i.e., $\Delta = 7/365$). The second column shows the true value $\theta_0$ of the
parameters used to generate the simulated sample paths. The Bias column shows the mean bias of
the estimated parameter vector, i.e., the difference between the estimated parameters and the
true values. The Std. dev. column shows the standard deviation of the parameter estimates.
The bias and standard deviation quantify the distance
$|\hat{\theta}^{(J,\mathrm{instant})}_{n,\Delta} - \theta_0|$ for the various numbers of
observations and sampling frequencies. The market price of risk of the stochastic volatility
variable, $\lambda_2$, is not identified when volatility is treated as observed. The
instantaneous interest rate and the instantaneous dividend yield of the stock are held fixed at
the values of 4% and 1.5%, respectively.

Throughout, the best estimates are for the $\sigma$ and $\rho$ parameters. Regardless of sampling
frequency and whether or not volatility is observed, both the biases and standard errors of
the estimates are small relative to the parameter values. The $\gamma$ parameter fares only
slightly worse when volatility is observed; when volatility is unobserved, the standard deviation
of $\gamma$ is much larger, for reasons discussed below. The $\kappa$ and $\lambda_1$ parameters
are estimated with less accuracy. The bias in the $\kappa$ estimates decreases substantially with
the length of the sample; this is possibly indicative that the source of the bias identified in
Li et al. (2004) could be relevant for sample sizes of 500 daily observations. As expected, we
typically overestimate in small samples the speed of mean reversion when it is low (so the bias
on the $\kappa$ parameter is positive).
The use of otherwise similar batches of simulations with differing numbers of daily
observations in each simulated series provides some insight into how fast the small-sample
distribution of the estimated parameters approaches the asymptotic distribution. As the
number of observations in each simulated data series increases, we would expect the
standard errors of the parameter estimates to decrease at a rate inversely proportional to
the square root of the number of observations. The decreases in standard errors are
approximately what one would expect from asymptotic theory; for example, in Table 1, the
small-sample standard errors for all parameters except $\kappa$ are very close to the asymptotic
standard errors. The small-sample standard error for $\kappa$ is larger than the asymptotic
standard error for 500 daily observations, but is much closer for 5,000 and 10,000 daily
observations. The standard errors for all parameters decrease with sample size at roughly
the rate one would predict from asymptotic theory, i.e., by a factor of the square root of
ten when increasing from 500 to 5,000 observations, and by a factor of the square root of
two when increasing from 5,000 to 10,000 observations. These results suggest that the
distribution of the estimates is approaching its asymptotic limit.
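The square-root rule can be checked by simple arithmetic on the standard deviations reported in Table 1. The short sketch below does only that: the numbers are the Std. dev. columns for 500, 5,000, and 10,000 daily observations, and the dictionary and function names are illustrative choices rather than anything from the estimation procedure.

```python
import math

# Std. dev. columns of Table 1 (500, 5,000, and 10,000 daily observations).
std_dev = {
    "kappa":   (1.57, 0.38, 0.25),
    "gamma":   (0.022, 0.0057, 0.004),
    "sigma":   (0.006, 0.0020, 0.0014),
    "rho":     (0.013, 0.0042, 0.003),
    "lambda1": (6.5, 1.9, 1.4),
}

def scaling_ratios(sd):
    """Ratios of standard deviations between successive sample sizes."""
    sd_500, sd_5000, sd_10000 = sd
    return sd_500 / sd_5000, sd_5000 / sd_10000

ratios = {name: scaling_ratios(sd) for name, sd in std_dev.items()}

# Asymptotic theory predicts sqrt(10) for the first ratio and sqrt(2) for the second.
for name, (first, second) in ratios.items():
    print(f"{name:8s} 500->5,000: {first:.2f} (theory {math.sqrt(10):.2f})   "
          f"5,000->10,000: {second:.2f} (theory {math.sqrt(2):.2f})")
```

A first ratio well above $\sqrt{10}$, as for $\kappa$, is what one would see if the small-sample standard error at 500 observations exceeds its asymptotic value.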
When the value of the volatility state variable is determined through the use of an option
price $C_t$, rather than observed directly, the identification of $\lambda_2$ relies exclusively
on the introduction of the Jacobian term in the likelihood function of the observables. As
expected given this tenuous dependence of the likelihood on the second market price of
risk parameter, that parameter is generally identified quite poorly. The strong correlation
between the two Brownian motions driving state variable evolution compounds this
problem; as shown in Table 2, the standard errors for both market price of risk parameters
are large. This is a feature of the sampling noise inherent in the estimation, not of the
approximation involved in the likelihood function.

Table 2
Monte Carlo simulations with unobserved volatility: Heston model

Parameter    True value   500 weekly obs., using option prices
                          Bias     Std. dev.
(1)          (2)          (3)      (4)
$\kappa$     3.00         0.59     1.59
$\gamma$     0.10         3.04     14.5
$\sigma$     0.25         0.013    0.03
$\rho$       -0.8         0.0006   0.026
$\lambda_1$  7.0          0.54     4.9
$\lambda_2$  6.0          0.67     3.8

This table shows the results of 1,000 Monte Carlo simulations for the Heston model with 500 weekly
observations (i.e., $\Delta = 7/365$) and unobserved volatility, but observed option prices. The
data are pairs of values of $[S_t; C_t]'$, and we use Fourier inversion to back out the
unobservable value $Y_t$ of the state variable. Column 2 shows the parameters used to generate
the simulated sample paths. This table quantifies the distance
$|\hat{\theta}^{(J,\mathrm{price})}_{n,\Delta} - \theta_0|$. The instantaneous interest rate and
the instantaneous dividend yield of the stock are held fixed at the values of 4% and 1.5%,
respectively.
Also of note is the large standard error for the $\gamma$ parameter when volatility is
unobserved. This result might seem surprising, given the relatively accurate estimation of
this parameter when volatility is observed. However, the results are not comparable. Note
that $\gamma$ is always multiplied by $\kappa$ to determine the constant term in the drift of
$Y_t$. When volatility is treated as observed, the market price of risk parameter $\lambda_2$ is
held fixed, so that the value of $\kappa'$ (i.e., the P-measure speed of mean reversion) is
constrained by the value of $\kappa$ (i.e., the Q-measure speed of mean reversion). However, when
volatility is treated as unobserved, $\kappa'$ and $\kappa$ can vary independently. Consequently,
$\kappa$ is estimated more poorly when volatility is unobserved, and this has an effect on the
estimation of $\gamma$. If we consider the product $\kappa\gamma$, we find it is estimated only
slightly worse when volatility is unobserved. The increase in the standard error of $\gamma$ when
volatility is unobserved is therefore largely a by-product of the increase in volatility of
$\kappa$, rather than a result of any severe deterioration of our ability to estimate the
constant term in the drift of the volatility state variable.
6.2. The effects of using a volatility proxy in the Heston model
Of course, volatility is not observed. Consequently, of particular interest are the results for
the Monte Carlo simulations with the same sampling frequency and number of observations,
but different methods of determining the level of the volatility state variable, $Y_t$, using
indirect market data such as an option price, a simple Black-Scholes implied volatility from a
short-term at-the-money option, or our adjusted integrated volatility proxy.
When using a proxy, one way to assess its effect is to examine whether the proxy for $Y_t$
introduces enough additional noise to be noticeable at the scale of the standard error of the
estimators due to the sampling noise. Indeed, for a given sample size, the best possible
estimator is the true (but effectively incomputable) maximum likelihood estimator; call it
$\hat{\theta}^{(\mathrm{true})}_{n,\Delta}$ when $n$ data points sampled at interval $\Delta$ are
used. Around the true parameter vector $\theta_0$, its sampling distribution is
$\hat{\theta}^{(\mathrm{true})}_{n,\Delta} - \theta_0$. Because of the Cramer-Rao lower bound, the
distance $|\hat{\theta}^{(\mathrm{true})}_{n,\Delta} - \theta_0|$ is as close as one can get to
$\theta_0$ using a consistent estimator. Because $\hat{\theta}^{(\mathrm{true})}_{n,\Delta}$ is
incomputable, we consider alternative estimators $\hat{\theta}^{(J)}_{n,\Delta}$ from an
expansion of the log-likelihood function of order $J$; $J = 1$ in these simulations. There are
three versions of the latter, in simulations: $\hat{\theta}^{(J,\mathrm{instant})}_{n,\Delta}$ if
the (unobservable in reality, but not in simulations) instantaneous volatility is used,
$\hat{\theta}^{(J,\mathrm{proxy})}_{n,\Delta}$ when using a volatility proxy instead of the
unobservable instantaneous volatility and $\hat{\theta}^{(J,\mathrm{price})}_{n,\Delta}$ if we
use an option price $C_t$ to back out by Fourier inversion the implied value of the volatility
state variable. One estimator we propose for empirical work is
$\hat{\theta}^{(J,\mathrm{proxy})}_{n,\Delta}$. To evaluate the performance of a proposed
estimator such as $\hat{\theta}^{(J,\mathrm{proxy})}_{n,\Delta}$, we would like to compare
$|\hat{\theta}^{(J,\mathrm{proxy})}_{n,\Delta} - \hat{\theta}^{(\mathrm{true})}_{n,\Delta}|$ to
$|\hat{\theta}^{(\mathrm{true})}_{n,\Delta} - \theta_0|$, or measure the increase in
$|\hat{\theta}^{(J,\mathrm{proxy})}_{n,\Delta} - \theta_0|$ relative to
$|\hat{\theta}^{(\mathrm{true})}_{n,\Delta} - \theta_0|$. Even though we cannot compute
$\hat{\theta}^{(\mathrm{true})}_{n,\Delta}$, we can estimate
$|\hat{\theta}^{(\mathrm{true})}_{n,\Delta} - \theta_0|$ using the inverse of Fisher's information
matrix, a useful by-product of using a likelihood-based estimation method. In other
words, we can assess how much farther from $\theta_0$ we end up because of the dual
approximations (using a proxy instead of the instantaneous volatility and an expansion of
the log-likelihood instead of the true log-likelihood), recalling that there is nothing we can
do about the true MLE sampling error $|\hat{\theta}^{(\mathrm{true})}_{n,\Delta} - \theta_0|$.
Then, since
$E[|\hat{\theta}^{(J,\mathrm{proxy})}_{n,\Delta} - \hat{\theta}^{(\mathrm{true})}_{n,\Delta}|^2]$
is the mean squared error of the estimator, we can compare the bias and variance of
$\hat{\theta}^{(J,\mathrm{proxy})}_{n,\Delta} - \hat{\theta}^{(\mathrm{true})}_{n,\Delta}$ to
those of $\hat{\theta}^{(\mathrm{true})}_{n,\Delta} - \theta_0$. Note that this measures directly
the effect of the estimator using the volatility proxy. We can also compare
$|\hat{\theta}^{(J,\mathrm{price})}_{n,\Delta} - \theta_0|$ to
$|\hat{\theta}^{(J,\mathrm{proxy})}_{n,\Delta} - \theta_0|$, and
$|\hat{\theta}^{(J,\mathrm{instant})}_{n,\Delta} - \theta_0|$ to
$|\hat{\theta}^{(J,\mathrm{proxy})}_{n,\Delta} - \theta_0|$ in order to isolate in simulations the
effect of the approximation introduced by the volatility proxy, keeping in mind that, just like
$\hat{\theta}^{(\mathrm{true})}_{n,\Delta}$, $\hat{\theta}^{(J,\mathrm{instant})}_{n,\Delta}$ too
is an infeasible estimator since instantaneous volatility is not in general observable.
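The bias and standard deviation reported in the tables, and the way they combine into a mean squared error, can be illustrated as follows. This is a minimal sketch with placeholder data: the array `fake_estimates` is drawn from a normal distribution purely for illustration and does not reproduce any estimates from the simulations, and `summarize_estimates` is a hypothetical helper name.

```python
import numpy as np

def summarize_estimates(estimates, true_value):
    """Bias, standard deviation, and root mean squared error across simulated paths."""
    estimates = np.asarray(estimates, dtype=float)
    errors = estimates - true_value
    bias = errors.mean()                     # mean of (estimate - true value)
    std_dev = estimates.std(ddof=0)          # dispersion of the estimates
    rmse = np.sqrt(np.mean(errors**2))       # square root of the mean squared error
    return bias, std_dev, rmse

# Placeholder estimates for one parameter across 1,000 hypothetical paths.
rng = np.random.default_rng(0)
true_kappa = 3.0
fake_estimates = true_kappa + 0.2 + 0.5 * rng.standard_normal(1000)

bias, std_dev, rmse = summarize_estimates(fake_estimates, true_kappa)
print(f"bias={bias:.3f}  std. dev.={std_dev:.3f}  RMSE={rmse:.3f}")
```

With the population form of the standard deviation (`ddof=0`), the mean squared error equals the squared bias plus the variance, which is why reporting bias and standard deviation separately is enough to recover it.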
The results in Table 1 are based on the assumption that volatility is observed, whereas the
results in Table 2 are based on volatility extracted from option prices. At the daily frequency,
the standard errors for $\lambda_1$ are roughly similar; the standard error for $\kappa$ is
substantially smaller when volatility is observed through a proxy. There is no question that
when volatility is observed, the estimation error is going to be smaller, and the table confirms
that.
But in the real world volatility is not observed. So comparing the estimation accuracy
with and without volatility being observed is useful for benchmarking (which is why we
include it) but not for judging an estimation method in relative terms. The relevant
comparison is among different methods that do not assume that volatility is observed. For
that purpose, Table 3 compares the use of the observed volatility state variable (infeasible
outside of simulations) to that of two alternative proxies to determine the level of
stochastic volatility: the Black-Scholes, unadjusted, implied volatility proxy, and our
adjusted integrated volatility proxy. The $\lambda_2$ parameter is held fixed at its data
generating value, since it is unidentified when a proxy is used; holding this parameter fixed in
both sets of estimates makes the results comparable.

Table 3
Effect of volatility proxy: Heston model

Parameter    True value   Observed vol.        Black-Scholes implied vol.   Integrated vol. proxy
                          Bias     Std. dev.   Bias     Std. dev.           Bias     Std. dev.
(1)          (2)          (3)      (4)         (5)      (6)                 (7)      (8)
$\kappa$     3.00         0.17     0.56        0.16     0.55                0.16     0.53
$\gamma$     0.10         0.0002   0.008       0.0007   0.008               0.001    0.007
$\sigma$     0.25         0.0000   0.003       0.03     0.005               0.0009   0.006
$\rho$       -0.8         0.0000   0.006       0.0009   0.006               0.0006   0.006
$\lambda_1$  4.0          0.19     2.8         0.21     2.8                 0.25     2.7

This table shows the results of 1,000 Monte Carlo simulations for the Heston model with 2,500 daily
observations (i.e., $\Delta = 1/252$) comparing the estimator when volatility is treated as
observed (infeasible outside of simulations) to those obtained using two different proxies for
the instantaneous volatility (the implied volatility of a short-dated at-the-money option and our
adjusted integrated volatility proxy). Column 2 shows the parameters used to generate the
simulated sample paths; the simulations are the same as those used in Table 2. Column 3 shows the
mean bias of the estimated parameter vector, i.e., the difference between the estimated
parameters and the values shown in the second column, when the volatility is observed. Column 4
shows the standard deviation of the parameter estimates, also using Fourier inversion. These two
columns quantify $|\hat{\theta}^{(J,\mathrm{instant})}_{n,\Delta} - \theta_0|$. Columns 5 and 6
show the same information, but using the implied volatility of an at-the-money short-maturity
option to determine the level of the stochastic volatility variable. These two columns quantify
$|\hat{\theta}^{(J,\mathrm{proxy})}_{n,\Delta} - \theta_0|$, for the unadjusted implied volatility
proxy. Columns 7 and 8 report the comparable numbers when our adjusted or integrated volatility
proxy is used. The instantaneous interest rate and the instantaneous dividend yield of the stock
are held fixed at the values of 4% and 1.5%, respectively.
The table shows that the adjusted proxy and the implied volatility give comparable
results for the Heston model, with a slight advantage for the adjusted proxy. Note in
particular that the implied volatility proxy results in a slight bias in the estimated $\sigma$
parameter, which is typically estimated very precisely. In this particular model, we are only
attempting to estimate a single volatility of volatility parameter. But this bias already
announces the markedly different results (and issues with the use of an unadjusted implied
volatility proxy) that we will see below for the CEV model, where volatility of volatility is
controlled by two separate parameters.
A different but related point is that the design where the value of $\lambda_2$ is known biases
the results in favor of the use of the proxy. Indeed, while we can estimate the parameter
$b = \lambda_1(1-\rho^2) + \lambda_2\rho - 1/2$ in (26) quite accurately, the resulting estimate
for $\lambda_1$ is necessarily dependent upon the assumption made about $\lambda_2$. When using
real data, we do not have the luxury of knowing the value of $\lambda_2$. One solution is simply
to focus on the parameter $b$ alone, but this is of little use if the objective is to price
derivatives (in which case we need the parameters under Q). Another solution is to first estimate
the process using the full procedure, and use the resulting value of $\lambda_2$ as an input
above. This said, it is important to note that this feature could well be shared by other methods
designed to estimate stochastic volatility models, but their numerical intensity makes simulating
them impractical so it is difficult to know precisely how they behave.
6.3. Simulations for the CEV stochastic volatility model
Similar simulations to those of the Heston model for the CEV model show that, when an
unadjusted Black-Scholes implied volatility proxy is used, significant bias results in the
estimation of the elasticity parameter $\beta$ in particular. The estimates of $\beta$ will often
bump against the exogenously imposed boundary of 1 (there to ensure that the deflated stock
price is a martingale, not just a local martingale). We therefore focus on the effect of using
the integrated volatility proxy, and compare that to the Lewis proxy (36). Table 4 shows
the parameter estimates and standard errors obtained for the CEV model using the same
procedure as that described in Section 4. Obviously, the use of a proxy induces estimation
error relative to the (infeasible) use of the instantaneous volatility, but the estimation
biases and variances remain small.
As noted above, the computational times for the Lewis proxy are substantially higher
than with the simple integrated volatility proxy (32) due to the need to numerically invert
the mapping from the observed implied volatility to the volatility state variable, and the
further nonlinearity of the relation between $V^{\mathrm{imp}}(t,T)$ and $Y_t$ which cannot be
inverted explicitly. Furthermore, despite the fact that it should in theory be more accurate, the
table shows that the additional complexity results in no marked improvement over our simple
integrated volatility proxy, despite its much higher computational burden. In fact, for two
of the main parameters of the model, $\sigma$ and $\beta$, the Lewis proxy results in a decrease
in accuracy relative to the simpler integrated volatility proxy. Results for the other
parameters are comparable for the two proxies.
6.4. Comparing small-sample to asymptotic distributions
We can also use Monte Carlo simulations to assess whether the predicted asymptotic
behavior of maximum likelihood estimators is matched in small samples by our
maximum likelihood estimator. Table 5 compares the asymptotic standard deviations of
the estimates obtained from the approximate likelihood function with the empirical
standard deviations obtained from the Monte Carlo simulations. As shown, the two
versions of the standard deviations converge as the sample size gets larger, suggesting not
only that the likelihood approximations are quite accurate but also that standard statistical
theory, namely
$$
n^{1/2}(\hat{\theta} - \theta) \rightarrow N(0, F^{-1}), \tag{37}
$$
where $F = -E[\partial^2 l_G / \partial\theta\,\partial\theta']$ is Fisher's information matrix,
works well in this context. As is well known, the Cramer-Rao lower bound states that $F^{-1}$ is
the lowest possible asymptotic variance achievable by a consistent estimator of $\theta$.
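The comparison between asymptotic and small-sample standard errors can be illustrated on a toy problem where Fisher's information matrix is known in closed form. The sketch below is not the stochastic volatility likelihood: it assumes i.i.d. normal observations with unknown mean and variance (an illustrative stand-in), for which the per-observation information matrix is diagonal with entries $1/v$ and $1/(2v^2)$. The parameter values and variable names are arbitrary choices.

```python
import numpy as np

# Toy model (assumption): i.i.d. N(mu, v) data, theta = (mu, v).
mu, v = 0.1, 0.25
n, n_paths = 500, 1000

# Per-observation Fisher information for (mu, v) in the normal model.
fisher = np.array([[1.0 / v, 0.0],
                   [0.0, 1.0 / (2.0 * v**2)]])

# Asymptotic standard errors implied by n^{1/2}(theta_hat - theta) -> N(0, F^{-1}).
ase = np.sqrt(np.diag(np.linalg.inv(fisher)) / n)

# Small-sample standard errors from Monte Carlo: one MLE per simulated sample.
rng = np.random.default_rng(0)
samples = mu + np.sqrt(v) * rng.standard_normal((n_paths, n))
mu_hat = samples.mean(axis=1)
v_hat = samples.var(axis=1)                 # maximum likelihood estimate (ddof=0)
ssse = np.array([mu_hat.std(), v_hat.std()])

print("ASE :", ase)
print("SSSE:", ssse)
```

The two rows play the roles of the ASE and SSSE columns in Table 5; in the paper's setting the expectation defining $F$ has to be computed from the approximate log-likelihood rather than written down directly.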
Table 5 shows that we are close to the efficient asymptotic standard errors for all
parameters despite the finite sample sizes, as the finite sample distribution appears close to
the asymptotic distribution with as few as 500 daily observations (of course, anything can
happen in finite samples and it is theoretically possible for a different estimator to beat
maximum likelihood in finite samples).
Table 4
Effect of volatility proxy: CEV model

Parameter    True value   Observed volatility   Integrated volatility proxy   Lewis volatility proxy
                          Bias     Std. dev.    Bias     Std. dev.            Bias     Std. dev.
(1)          (2)          (3)      (4)          (5)      (6)                  (7)      (8)
$\kappa$     4.00         1.07     2.10         1.11     2.23                 1.09     2.11
$\gamma$     0.05         0.0014   0.018        0.0008   0.015                0.0006   0.014
$\sigma$     0.75         0.0004   0.15         0.048    0.14                 0.13     0.10
$\rho$       -0.75        0.0004   0.017        0.0016   0.02                 0.006    0.02
$\beta$      0.80         0.0065   0.065        0.055    0.088                0.19     0.029
$\lambda_1$  4.0          1.28     7.56         1.15     7.45                 1.35     7.66

This table shows the results of 1,000 Monte Carlo simulations of the CEV model with 500 daily
observations (i.e., $\Delta = 1/252$), comparing the situation where volatility is observed to
that where our integrated volatility proxy is used. Column 2 shows the parameters used to
generate the simulated sample paths; the simulations are the same as those used in Table 2.
Column 3 shows the mean bias of the estimated parameter vector, i.e., the difference between the
estimated parameters and the values shown in Column 2, when the volatility is observed directly.
Column 4 shows the standard deviation of the parameter estimates, also with observed volatility.
These two columns quantify $|\hat{\theta}^{(J,\mathrm{instant})}_{n,\Delta} - \theta_0|$. Columns
5 and 6 show the corresponding information using the implied volatility of an at-the-money
short-maturity option to determine the level of the stochastic volatility variable. These two
columns quantify $|\hat{\theta}^{(J,\mathrm{proxy})}_{n,\Delta} - \theta_0|$. Columns 7 and 8
repeat the experiment for the Lewis proxy. The $\lambda_2$ parameter is not identified and held
fixed at its data-generating value $\lambda_2 = 0$. The instantaneous interest rate and the
instantaneous dividend yield of the stock are held fixed at the values of 4% and 1.5%,
respectively.
6.5. Conclusions from the Monte Carlo simulations
We therefore leave our Monte Carlo analysis with the following conclusions:
1. The use of an implied volatility of a short-dated at-the-money option as a proxy for the
unobservable volatility variable $Y_t$ means that one market price of risk parameter is not
identifiable, but on the other hand the separate identification of the two market price of
risk parameters is poor when option prices are used and computational times are greatly
increased due to the need to numerically compute the option prices. It seems therefore
that using a proxy is a reasonable tradeoff to make if we can live without the full
identification or make an arbitrary assumption about one of the market price of risk
parameters.
2. Black-Scholes implied volatility works reasonably well as a proxy in simulations for the
Heston model. For the more general CEV model, however, the use of this unadjusted
implied volatility leads to significant bias in the elasticity of variance parameter $\beta$
(which of course is held fixed and not estimated in the Heston model). The integrated volatility
proxy takes the Black-Scholes implied volatility and adjusts it for the effect of mean
reversion in volatility during the life of an option. This is only a partial adjustment to
what makes the Black-Scholes implied volatility different from the exact unobservable
instantaneous volatility, but our simulations show that this simple adjustment goes a
long way towards eliminating the bias. For the CEV model, the Lewis proxy does not
provide an improvement over the integrated volatility proxy, while necessitating major
computational effort due to the need to recover the latent volatility variable numerically
rather than explicitly.
3. The small-sample distributions of the maximum likelihood estimators are well
approximated by their asymptotic counterparts, even in samples as small as 500 daily
observations.

Table 5
Asymptotic and small-sample variances of estimates with observed volatility

Parameter    True value   500 daily obs.     5,000 daily obs.   10,000 daily obs.
                          ASE      SSSE      ASE      SSSE      ASE      SSSE
(1)          (2)          (3)      (4)       (5)      (6)       (7)      (8)
$\kappa$     3.00         1.14     1.57      0.36     0.38      0.25     0.25
$\gamma$     0.10         0.019    0.022     0.0059   0.0057    0.0042   0.0041
$\sigma$     0.25         0.0061   0.0062    0.0019   0.0020    0.0014   0.0014
$\rho$       -0.8         0.0133   0.0134    0.0042   0.0042    0.003    0.003
$\lambda_1$  4.0          6.2      6.5       1.97     1.91      1.40     1.42

This table shows the standard deviations of the parameter estimates
$\hat{\theta}^{(J,\mathrm{instant})}_{n,\Delta}$ for the Heston model, calculated both
analytically and from the Monte Carlo simulations. All values are based on daily observations.
Column 2 shows the true values of the parameter vector used to generate the sample paths for the
Monte Carlo simulations and to calculate the standard deviations from the likelihood expressions.
Column 3, marked ASE for Asymptotic Standard Error, shows the asymptotic standard deviations of
each parameter when the data series contains 500 daily observations. These values were obtained
by computing the expected value of the second derivatives of the log likelihood in the form of an
integral. Column 4, marked SSSE for Small Sample Standard Error, shows the standard deviations of
the parameter estimates from the Monte Carlo simulations, i.e., the same information as in the
corresponding column of Table 1. Columns 5 and 6 show the same information as the third and
fourth columns, but with 5,000 daily observations instead of 500. Finally, Columns 7 and 8 show
the same information but with 10,000 daily observations. The market price of risk for the
stochastic volatility variable is not identified when volatility is observed; the instantaneous
interest rate and dividend yield of the stock are held fixed at 4% and 1.5%/year, respectively.
Note that, with this number of observations, the standard deviations calculated analytically from
the approximate likelihood function are quite close to those observed in the Monte Carlo
simulations.
7. Three models and the data
Given the guidance provided by the Monte Carlo simulations above, we are now ready
to tackle real data. When applying our method to real data, we use direct observations on
asset prices $G_t = [S_t; C_t]'$ with $S_t$ representing the S&P 500 Index and $C_t$ the price
of a short-maturity at-the-money option. The option price is computed from its implied
volatility, itself measured as the Chicago Board Options Exchange (CBOE) Volatility
Index (VIX). We use the VIX data computed using the methodology introduced by the
CBOE on September 22, 2003, which is an implied volatility index based on the European
S&P 500 options as opposed to the American S&P 100 options (whose implied volatility
index symbol is now VXO).
The VIX is an estimate of the implied volatility of a basket of S&P 500 Index Options
(SPX) constructed from different traded options in such a way that at any given time it
represents the implied volatility of a hypothetical at-the-money option with T = 30
calendar days to expiration (or 22 trading days). This constant maturity and constant
moneyness feature of the data matches nicely with the assumptions we have made to
reduce the dimensionality of the option pricing problem (see Section 3.2). In what follows,
we will use the VIX as our measure of the implied volatility,
$\mathrm{VIX}_t^2 = V^{\mathrm{imp}}(t,T)$, from which we can compute our proxy for $Y_t$
through (32).
The anticipated daily cash dividends of the S&P 500 are forecast by the CBOE. These
forecasts are generally very accurate in light of the short time span as well as the averaging
effect of a large stock index. The SPX options underlying the VIX are European, simplifying the analysis. For
further details on the VIX, see Whaley (2000).
We use daily data from January 2, 1990 until September 30, 2003. Each trading day is
considered to be $\Delta = 1/252$ after the previous day, regardless of the calendar time passed
(i.e., weekends and holidays receive no special consideration). The results for each of the
three models are discussed briefly below. Both point estimates and standard errors for each
of the three models can be estimated quickly and easily, without the need for simulations;
other models can be estimated as easily using the technique outlined above. Figs. 3 and 4
show respectively the SPX and VIX for the period under consideration.
7.1. The Heston model
In Table 6, we report the estimation results for the Heston (1993) model described in
Eq. (24), treating volatility $Y_t$ as observed in the form of a proxy (the VIX index), and also
using the two-stage estimation procedure, where the univariate CEV model is used in the
first stage to construct an integrated volatility proxy. The two methods produce similar
results. The Monte Carlo results suggest that the effect of replacing $Y_t$ by the simple
implied volatility proxy is quite small, within the asymptotic standard errors based on
Fisher's Information matrix. As expected, we find that the correlation parameter $\rho$
between the innovations to stock price and stochastic volatility is strongly negative,
hovering around $-0.75$. The long-term value of the volatility $\gamma^{1/2}$ is estimated to be
approximately 21% per year with a speed of mean reversion coefficient of approximately 5.
The large uncertainty for the risk premia estimates is perhaps not surprising, given that
the sample period is 13 years long, and that risk premia are typically poorly estimated even
in much longer samples. These parameters pertain to the drift, and the quality of the
estimates of drift parameters typically depends only on the length of the sample, and not
the sampling frequency. (To take an extreme case, consider an arithmetic or geometric
Brownian motion. The volatility can be estimated to an arbitrary degree of precision by
sampling frequently enough, but the drift estimate is independent of sampling frequency.
The rst and last observations provide as good an estimate of the drift as weekly, daily or
hourly observations.) Given the length of the available data, there is little that can be done
to improve the quality of the l
1
estimate, apart from waiting for more data to accrue.
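The parenthetical remark about drift estimation can be checked numerically. The following is a minimal sketch, not part of the paper's procedure: it simulates an arithmetic Brownian motion with hypothetical parameters (`mu`, `sigma` and the seed are chosen only for illustration) and computes the maximum likelihood estimates of the drift and of the diffusion variance at two sampling frequencies. The drift estimate telescopes to $(X_T - X_0)/T$ in both cases, so sampling more often leaves it unchanged.

```python
import math
import random


def simulate_abm(mu, sigma, T, n, seed=0):
    """Simulate X_t = mu*t + sigma*W_t on n equal steps over [0, T]."""
    rng = random.Random(seed)
    dt = T / n
    path = [0.0]
    for _ in range(n):
        path.append(path[-1] + mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0))
    return path


def drift_and_variance(path, T):
    """MLE of the drift and diffusion variance from equally spaced observations."""
    n = len(path) - 1
    dt = T / n
    increments = [b - a for a, b in zip(path, path[1:])]
    mu_hat = sum(increments) / (n * dt)  # telescopes to (X_T - X_0) / T
    var_hat = sum((dx - mu_hat * dt) ** 2 for dx in increments) / (n * dt)
    return mu_hat, var_hat


T = 13.0  # years, matching the sample length mentioned in the text
fine = simulate_abm(mu=0.08, sigma=0.2, T=T, n=13 * 252 * 4)  # hypothetical parameters
daily = fine[::4]  # the same path observed four times less often
endpoint_only = (fine[-1] - fine[0]) / T

mu_fine, var_fine = drift_and_variance(fine, T)
mu_daily, var_daily = drift_and_variance(daily, T)
```

Here `mu_fine`, `mu_daily` and `endpoint_only` agree up to floating-point rounding, while `var_fine` is based on four times as many increments as `var_daily` and is correspondingly more precise.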
[Figure 3: time series plot, 1990 to 2004. Horizontal axis: date. Vertical axis: Underlying: SPX Index (about 400 to 1400).]
Fig. 3. The SPX (S&P 500) index represents the value of the underlying asset.
[Figure 4: time series plot, 1990 to 2004. Horizontal axis: date. Vertical axis: Implied volatility (%): VIX Index (about 10 to 45).]
Fig. 4. The VIX index represents the value of the implied volatility of a basket of short-maturity at-the-money
options on the S&P 500 index.
7.2. The GARCH stochastic volatility model
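We now turn to the GARCH stochastic volatility model; see Nelson (1990) and
Meddahi (2001). $S_t$ and $Y_t$ follow a process of the following form under the $Q$ measure:
$$
dX_t = d\begin{bmatrix} S_t \\ Y_t \end{bmatrix}
= \begin{bmatrix} (r - d)S_t \\ \kappa'(\gamma' - Y_t) \end{bmatrix} dt
+ \begin{bmatrix} \sqrt{(1-\rho^2)Y_t}\,S_t & \rho\sqrt{Y_t}\,S_t \\ 0 & \sigma Y_t \end{bmatrix}
d\begin{bmatrix} W_1^Q(t) \\ W_2^Q(t) \end{bmatrix}. \tag{38}
$$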
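Note that $Y_t$, which we take to be positive, has a boundary at zero, and this boundary is
never achieved as long as $\kappa'\gamma' \geq 0$. The dynamics of $s_t = \ln S_t$ under $Q$ are independent of
the stock level:
$$
d\begin{bmatrix} s_t \\ Y_t \end{bmatrix}
= \begin{bmatrix} r - d - \frac{1}{2}Y_t \\ \kappa'(\gamma' - Y_t) \end{bmatrix} dt
+ \begin{bmatrix} \sqrt{(1-\rho^2)Y_t} & \rho\sqrt{Y_t} \\ 0 & \sigma Y_t \end{bmatrix}
d\begin{bmatrix} W_1^Q(t) \\ W_2^Q(t) \end{bmatrix}. \tag{39}
$$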
With the VIX series as our volatility proxy, we assume that the market price of risk
of the volatility state variable is zero, and that for the stock price is proportional to the
stock price itself, and to the square root of the volatility state variable, or
$\Lambda = [\lambda_1\sqrt{(1-\rho^2)Y_t},\ 0]'$. With these assumptions, the dynamics of the state variables
under the measure $P$ are then
$$
d\begin{bmatrix} s_t \\ Y_t \end{bmatrix}
= \begin{bmatrix} r - d + \left(\lambda_1(1-\rho^2) - \frac{1}{2}\right)Y_t \\ \kappa(\gamma - Y_t) \end{bmatrix} dt
+ \begin{bmatrix} \sqrt{(1-\rho^2)Y_t} & \rho\sqrt{Y_t} \\ 0 & \sigma Y_t \end{bmatrix}
d\begin{bmatrix} W_1^P(t) \\ W_2^P(t) \end{bmatrix}, \tag{40}
$$
where $\kappa' = \kappa$ and $\gamma' = \gamma$. The $Q$-measure dynamics are not affine in $Y_t$ but, as noted
earlier, this is not a problem for our technique, which is applicable to all diffusion
specifications. The log-likelihood formula corresponding to this model is given in the
appendix.

Table 6
Parameter estimates for the Heston model

               Heston model for (SPX,VIX)
               Black-Scholes implied volatility proxy    Integrated volatility proxy
Parameter      Estimate      Standard error              Estimate      Standard error
(1)            (2)           (3)                         (4)           (5)
$\kappa$       5.07          0.68                        5.13          0.71
$\gamma$       0.0457        0.0065                      0.0436        0.0065
$\sigma$       0.48          0.0036                      0.52          0.0033
$\rho$         -0.767        0.0056                      -0.754        0.0054
$\lambda_1$    3.9           4.3                         3.9           4.1

This table shows the estimated parameter values for the Heston stochastic volatility model using the (SPX,VIX)
dataset. Column 2 shows the parameter estimates for daily observations with observed volatility. Standard errors
are shown in Column 3. Columns 2 and 3 use the unadjusted VIX data directly as a proxy for instantaneous
volatility. Columns 4 and 5 report the corresponding results, also from daily observations, when using our
adjusted integrated volatility proxy (constructed from the VIX data) as a proxy for instantaneous volatility.
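To make the $P$-measure dynamics in Eq. (40) concrete, the sketch below simulates the pair $(s_t, Y_t)$ with a plain Euler scheme. This is only an illustration under stated assumptions and is not the estimation method of the paper, which relies on closed-form likelihood expansions rather than simulation. The function name `simulate_garch_sv`, the numerical floor on $Y_t$, and the values of `r`, `d`, `s0` and `y0` are hypothetical; the remaining parameter values are the point estimates of Table 7, with $\rho$ taken to be negative as stated in the text.

```python
import math
import random


def simulate_garch_sv(kappa, gamma, sigma, rho, lam1, r, d,
                      s0, y0, dt, n_steps, seed=0):
    """Euler scheme for (s_t, Y_t) = (log price, variance) under P, as in Eq. (40)."""
    rng = random.Random(seed)
    s, y = s0, y0
    path = [(s, y)]
    for _ in range(n_steps):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Drift of the log price: r - d + (lam1 * (1 - rho^2) - 1/2) * Y
        drift_s = r - d + (lam1 * (1.0 - rho ** 2) - 0.5) * y
        # Diffusion row for s: sqrt((1 - rho^2) Y) dW1 + rho sqrt(Y) dW2
        shock_s = math.sqrt((1.0 - rho ** 2) * y) * z1 + rho * math.sqrt(y) * z2
        # Variance: linear mean reversion, diffusion sigma * Y dW2
        y_next = y + kappa * (gamma - y) * dt + sigma * y * math.sqrt(dt) * z2
        s = s + drift_s * dt + shock_s * math.sqrt(dt)
        y = max(y_next, 1e-8)  # numerical floor: an assumption of this sketch only
        path.append((s, y))
    return path


# Thirteen years of daily steps; r, d, s0 and y0 are illustrative assumptions.
path = simulate_garch_sv(kappa=1.62, gamma=0.074, sigma=2.204, rho=-0.754, lam1=2.4,
                         r=0.04, d=0.015, s0=math.log(1000.0), y0=0.074,
                         dt=1.0 / 252.0, n_steps=252 * 13)
```

The second Brownian shock `z2` enters both rows, which is how the negative correlation between price and volatility innovations arises in this parameterization.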
In Table 7, we report the estimation results for this model. Compared to the Heston
model, the unconditional mean of volatility is estimated at a much higher value in this
model, or about 27% (note, however, that the standard error for this parameter is large).
The point estimates for the correlation parameter are roughly similar to those in the
Heston model. The volatility of volatility and risk premia parameters are not directly
comparable, owing to the differing model specifications.
7.3. The CEV stochastic volatility model
In Table 8, we report the estimation results for the general CEV model described above,
using the two-stage estimation procedure, where the first-stage estimates are used to construct
an integrated volatility proxy. From the first-stage estimation, the proxy is given by
$$
Y_t = -0.0061 + 1.1308\,\mathrm{VIX}_t, \tag{41}
$$
where $\mathrm{VIX}_t$ is the implied variance of the short-maturity at-the-money option. At an
implied variance of 0.0463, the proxy is equal to the implied variance; at higher values, the
proxy exceeds implied variance by about 13% of the difference between implied variance
and 0.0463; similarly, at lower levels, the proxy is lower than implied variance by a
comparable amount. From the second-stage estimates, shown in Table 8, the
instantaneous standard deviation of the stock price at the unconditional mean of the
volatility state variable is approximately 0.21, which is about the same as in the Heston
model, but much lower than that obtained for the GARCH model. Of particular interest
for the CEV model is the exponent $\beta$, which is estimated at 0.65, above the Heston value of
0.5 but below the GARCH value of 1. Either value can be rejected at the conventional 95%
confidence level (see below). This finding stands in contrast to that of Jones (2003) who,
using a Bayesian method, estimates this exponent above the GARCH value of 1. Note
however that the univariate estimate of $\beta$ from fitting our integrated volatility proxy as a
univariate process, namely the second equation in (35), produces a higher estimate of $\beta$,
0.94. The first-stage estimate of $\beta$ from the unadjusted VIX data is right at the boundary
value of 1. Under the assumptions of the model, this univariate estimate is legitimate, as
the second equation in (35) is the proper diffusion marginal for the variable $Y_t$. The
discrepancy between the point estimates obtained from the full (SPX,VIX) dataset and the
marginal VIX dataset is evidence of some degree of misspecification of the CEV model.

Table 7
Parameter estimates for the GARCH model

               GARCH model for (SPX,VIX)
Parameter      Estimate      Standard error
$\kappa$       1.62          1.1
$\gamma$       0.074         0.04
$\sigma$       2.204         0.016
$\rho$         -0.754        0.0056
$\lambda_1$    2.4           3.8

This table shows the estimated parameter values for the GARCH stochastic volatility model using the (SPX,VIX)
dataset. The results are for daily observation frequency. Standard errors are in the right column.

Table 8
Parameter estimates for the CEV model

               CEV model for (SPX,VIX)          CEV univariate model for VIX
Parameter      Estimate      Standard error     Estimate      Standard error
(1)            (2)           (3)                (4)           (5)
$\kappa$       4.1031        0.89               2.2           0.92
$\gamma$       0.0451        0.009              0.0528        0.016
$\sigma$       0.8583        0.012              1.79          0.063
$\beta$        0.6545        0.0026             0.94          0.0097
$\rho$         -0.760        0.005
$\lambda_1$    3.9           4.1

Columns 2 and 3 show the estimated parameter values and standard errors for the general CEV stochastic
volatility model using the (SPX,VIX) dataset at the daily frequency. Columns 4 and 5 show the results for the
univariate CEV stochastic volatility model fitted to the integrated volatility proxy.
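The affine proxy of Eq. (41) is easy to reproduce. The sketch below assumes a negative intercept, which is inferred from the statement that the proxy equals the implied variance near 0.0463 and exceeds it above that level; the function names and the percent-to-variance conversion of a VIX quote are illustrative assumptions, not part of the paper.

```python
def integrated_volatility_proxy(implied_variance, intercept=-0.0061, slope=1.1308):
    """Affine map of Eq. (41) from implied variance to the adjusted proxy."""
    return intercept + slope * implied_variance


def proxy_fixed_point(intercept=-0.0061, slope=1.1308):
    """Implied variance at which the proxy equals the implied variance."""
    return intercept / (1.0 - slope)


def vix_level_to_variance(vix_percent):
    """Convert a VIX quote in percent to an annualized variance (assumed convention)."""
    return (vix_percent / 100.0) ** 2


crossing = proxy_fixed_point()
low = vix_level_to_variance(15.0)
high = vix_level_to_variance(30.0)
# Adjustment applied by the proxy: negative below the crossing point, positive above it.
adjustments = {v: integrated_volatility_proxy(v) - v for v in (low, high)}
```

With the rounded coefficients printed in Eq. (41), `crossing` comes out close to, but not exactly at, the value 0.0463 quoted in the text.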
The point estimate for the correlation coefficient is roughly the same as in the other models.
This parameter is consistently estimated at around $-0.75$ for the three models under
consideration. One consequence is that theoretical developments (whether theoretical option
pricing models or econometric methods for volatility estimation) that assume that the two
Brownian motions $W_1^P$ and $W_2^P$ are uncorrelated are likely to be badly misspecified.
In Fig. 5, we report the implied volatility smile predicted by the CEV model at the
parameter values just estimated using the joint (SPX,VIX) data. The volatility state
variable is held at its estimated unconditional mean $\gamma$. The results look broadly consistent in
shape and magnitude with the typical patterns documented for the implied volatility smile
as in, e.g., Aït-Sahalia and Lo (1998).
Finally, we compute tests for parameter stability across subsamples, and the tests reject
the null hypothesis of equal parameter values. It is not clear, however, that finding
statistically significant differences in parameter estimates across different time periods
should be interpreted (in economic terms as opposed to statistical terms) as evidence
against the model instead of evidence in favor of regime switches in volatility, similar to
what is observed empirically with interest rates. Fig. 4 shows clearly identifiable volatility
peaks and low-volatility periods.
Similar simulations (not shown) for the CEV model show a significant bias in the
estimation of the elasticity parameter. We therefore examine the effect of using the
integrated volatility proxy with simulations based on the CEV model (which nests two of
the other models used). Table 4 shows the parameter estimates and standard errors
obtained using the two-stage procedure described in Section 4. As shown, the use of the
integrated volatility proxy does not result in a substantial degradation in the quality of
the estimates; although some measures have become worse, others have improved, and the
differences are usually quite small.
7.4. Likelihood ratio tests
The CEV model nests the Heston ($\beta = 1/2$) and GARCH ($\beta = 1$) models. The use of
likelihood estimation makes it straightforward to calculate likelihood ratio statistics for the
nested models. The statistics for both models tested against the CEV model at the daily
frequency are 782 and 122, respectively, using as above the (SPX,VIX) dataset and our
proxy construction. To make the likelihoods directly comparable, all estimates are based
on the two-stage procedure with the CEV model used in the first stage to construct the
integrated volatility proxy. By this method, all three models are then using the same data
as input to the second-stage estimation. Both nested models are easily rejected at the
conventional 95% confidence level. In both cases, the likelihood ratio statistic is many
times (and in one case, hundreds of times) the 95% cutoff value of 3.84, implying strong
rejection at any reasonable confidence level.
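The comparison with the 3.84 cutoff can be reproduced with a few lines of code. The sketch below assumes the usual asymptotic result that each likelihood ratio statistic is chi-square distributed with one degree of freedom under the null (one restriction on $\beta$), and uses the identity $P(\chi^2_1 > x) = \mathrm{erfc}(\sqrt{x/2})$. The helper name `chi2_sf_1df` is hypothetical; the statistics 782 and 122 are the values reported in the text, assigned to the Heston and GARCH restrictions in the order stated.

```python
import math


def chi2_sf_1df(x):
    """Upper tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


CUTOFF_95 = 3.84  # 95% critical value quoted in the text

lr_stats = {"Heston (beta = 1/2)": 782.0, "GARCH (beta = 1)": 122.0}

results = {
    name: {
        "multiple_of_cutoff": stat / CUTOFF_95,
        "p_value": chi2_sf_1df(stat),
        "rejected_at_95": stat > CUTOFF_95,
    }
    for name, stat in lr_stats.items()
}

size_at_cutoff = chi2_sf_1df(CUTOFF_95)  # should be close to 0.05
```

Both tail probabilities are vanishingly small, which is the sense in which the nested models are rejected at any reasonable confidence level.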
The point estimate of the $\beta$ coefficient lies between the Heston value of 1/2 and the
GARCH value of 1. Both of these values are boundary cases in an appropriate sense; for
values of $\beta$ below 1/2, the boundary of zero is achievable, so that the stock price can be
instantaneously deterministic. For values of $\beta$ above 1, the deflated stock price is a local
martingale, but not a martingale, and there exists a replicating portfolio for the stock that
is cheaper than the stock itself; see, e.g., Heston et al. (2004). Although violation of either
bound does not result in arbitrage opportunities, both situations could be considered
undesirable modeling properties. The two boundary values of 1/2 and 1 are commonly
used in stochastic volatility models, owing to their tractability, but the point estimates,
standard errors, and likelihood ratios suggest that neither boundary value is appropriate,
with the elasticity of variance lying between the two. As we show, either boundary value is
very strongly rejected.
8. Extensions and conclusions
In conclusion, the main advantage of our approach is twofold: we provide a maximum
likelihood estimator for the parameters of the underlying model, with all its associated
desirable statistical properties, and we do it in closed form, fully if an implied volatility
proxy (either a simple proxy or our modified proxy) is used, and up to the option pricing
model linking the state vector to observed option prices if those are used.
[Figure 5: CEV implied volatility (about 0.18 to 0.32) plotted against moneyness = strike/spot (0.95 to 1.1).]
Fig. 5. This plot reports the implied volatility smile predicted by fitting the CEV model to the (SPX,VIX) data,
fitted using our adjusted integrated volatility proxy as described in the text. The state variables are held fixed at
their unconditional estimated means for the purpose of computing this implied volatility smile.
One advantage of our methodology is that it extends readily to the situation where the
underlying asset price and/or the volatility state variable(s) can jump. Suppose that,
instead of (1), $X_t$ follows under $P$ the dynamics
$$
dX_t = \mu^P(X_t)\,dt + \sigma(X_t)\,dW_t^P + J_t^P\,dN_t^P, \tag{42}
$$
where the pure jump process $N^P$ has stochastic intensity $\lambda(X_t;\theta)$ and jump size 1. The jump
size $J_t^P$ is independent of the filtration generated by the $X$ process at time $t$, and has
probability density $\nu(\cdot;\theta)$.
This setup incorporates the stochastic volatility with jump models that have been
proposed in the literature, such as Bates (2000), Bakshi et al. (1997), and Pan (2002). It is
possible to extend the basic likelihood expansion described in Section 3.1 to cover such
cases. The expression, due to Yu (2003), is
$$
p_X^{(J)}(\Delta, x|x_0;\theta) = \exp\left[-\frac{m}{2}\ln(2\pi\Delta) - D_v(x;\theta) + \frac{c_X^{(-1)}(x|x_0;\theta)}{\Delta}\right]
\sum_{k=0}^{J} c_X^{(k)}(x|x_0;\theta)\frac{\Delta^k}{k!}
+ \sum_{k=1}^{J} d_X^{(k)}(x|x_0;\theta)\frac{\Delta^k}{k!}. \tag{43}
$$
Again, the series can be calculated up to arbitrary order $J$ and the unknowns are the
coefficients $c_X^{(k)}$ and $d_X^{(k)}$. The difference between the coefficients $c_X^{(k)}$ in (43) and $C_X^{(k)}$ in (9) is
due to the fact that the former is written for $\ln p_X$ while the latter is for $p_X$ itself; the two
coefficient families match once the terms of the Taylor series of $\ln(p_X^{(J)})$ in $\Delta$ are matched
to the coefficients $C_X^{(k)}$ of the direct Taylor series $\ln p_X^{(J)}$. The coefficients $d_X^{(k)}$ are the new
terms needed to capture the presence of the jumps in the transition function. The latter
terms are needed to capture the different behavior of the tails of the transition density
when jumps are present. These tails are not exponential in $x$, hence the absence of the
factor $\exp(c_X^{(-1)}\Delta^{-1})$ in front of the summation of $d_X^{(k)}$ coefficients. The coefficients can be
computed analogously to the pure diffusive case.
We perform Monte Carlo simulations for the Heston and CEV models to assess the
accuracy of the technique, and find that it not only produces accurate estimates, but can
also be implemented efficiently. Computational time for estimation is of the order of a few
minutes on a standard PC using Matlab when volatility is treated as unobserved, and
considerably less when a proxy is used. This is a major advantage of our method, in
addition to the statistical efficiency of maximum likelihood. When the observed vector
consists of $G_t = [S_t; C_t]'$, we can fully identify all the parameters of the model, including
the market prices of risk, provided an option pricing technique is included in the
estimation procedure. Use of either a simple Black-Scholes implied volatility or our
integrated volatility proxy simplifies estimation, but does not permit identification of the
market price of risk for the volatility state variable. The asymptotic variances calculated
from the approximate likelihood expressions are close to those found empirically from the
Monte Carlo simulations. We find that the use of the implied volatility of at-the-money
short-maturity options as a proxy for the true stochastic volatility results in reasonable
estimates for the Heston model, and the integrated volatility proxy performs similarly for
the CEV model. In this case, using such a proxy reduces the exercise to one of simply
applying our likelihood expansion to the state vector $X_t = [S_t; Y_t]'$.
We apply our method to the Heston, GARCH, and CEV stochastic volatility models. One of
the findings in our empirical analysis across models is the fact that the estimated correlation
coefficient $\rho$ between the shocks to the stock level $S_t$ and the volatility variable $Y_t$ is consistently
in the neighborhood of $-0.75$ for all models. This negative correlation has long been noted (in
the form of the leverage effect). The robustness of this finding across models suggests that
stochastic volatility models, pricing and/or estimation methods that rely on the assumption of
uncorrelated shocks, such as Hull and White (1987), will be quite unrealistic in this context. Note
that while our integrated volatility proxy does not adjust for this correlation, the likelihood itself
does and the Monte Carlo results suggest that this is the first-order consideration.
However, little in our estimation procedure depends on the specific properties of these
models. It is in fact applicable to a wide variety of diffusion-based stochastic volatility
models (where $Y_t$ is one-dimensional, and if a simple proxy for volatility is to be used, has
linear drift), or for that matter models with other types of latent variables. In Section 3, we
describe our method without reference to any specific model. Provided that traded asset
prices (such as the call options we use) or other observable quantities can be found to be
mapped into the unobservable latent state vector, or simply using our adjusted volatility
proxy, the method can then be employed.
Appendix A. The log-likelihood expansion for stochastic volatility models
In this appendix, we give the coefficients of the likelihood expansion at order $J = 1$
corresponding to each of the models considered. These expressions, as well as higher-order
expansions, are available upon request from the authors in computer form.
A.1. The Heston model
At order $J = 1$, with $x_1 = s$ and $x_2 = Y$ and the indirect parameters
$$
a_1 = r - d, \qquad a_2 = \kappa'\gamma',
$$
$$
b_1 = \rho\lambda_2 + (1-\rho^2)\lambda_1 - \frac{1}{2}, \qquad b_2 = \lambda_2\sigma - \kappa',
$$
the expressions for the coefficients appearing in formula (11) are given by
D
v
(x; y) =
1
2
ln((1 r
2
)sx
2
),
C
(1)
X
(x[x
0
; y) =
(x
2
x
20
)
2
2rs(x
2
x
20
)(x
1
x
10
) s
2
(x
1
x
10
)
2
2(1 r
2
)s
2
x
20

(x
2
x
20
)
3
4(1 r
2
)s
2
x
2
20

r(x
2
x
20
)
2
(x
1
x
10
)
2(1 r
2
)sx
2
20

(x
2
x
20
)(x
1
x
10
)
2
4(1 r
2
)x
2
20

(7r 8r
3
)(x
2
x
20
)
3
(x
1
x
10
)
24(1 r
2
)
2
sx
3
20

(7 10r
2
)(x
2
x
20
)
2
(x
1
x
10
)
2
48(1 r
2
)
2
x
3
20

rs(x
2
x
20
)(x
1
x
10
)
3
24(1 r
2
)
2
x
3
20

s
2
(x
1
x
10
)
4
96(1 r
2
)
2
x
3
20

(15 16r
2
)(x
2
x
20
)
4
96(1 r
2
)
2
s
2
x
3
20
,
C
(0)
X
(x[x
0
; y) =
(x
2
x
20
)(x
20
b
2
rsx
20
b
1
rsa
1
a
2
)
(1 r
2
)s
2
x
20

(x
1
x
10
)(rx
20
b
2
sx
20
b
1
sa
1
ra
2
)
(1 r
2
)sx
20

s
2
(x
1
x
10
)
2
24(1 r
2
)x
2
20

(x
2
x
20
)
2
(s(s 12ra
1
) 12a
2
)
24(1 r
2
)s
2
x
2
20

(x
2
x
20
)(x
1
x
10
)(rs
2
6sa
1
6ra
2
)
12sx
2
20
(1 r
2
)
,
C
(1)
X
(x[x
0
; y) =
rsa
2
b
1
rsa
1
b
2
s
2
a
1
b
1
a
2
b
2
(1 r
2
)s
2

(2rsb
1
b
2
s
2
b
2
1
b
2
2
)x
20
2(1 r
2
)s
2

s
4
r
2
s
4
6s
2
a
2
1
6s
2
a
2
6r
2
s
2
a
2
12rsa
1
a
2
6a
2
2
12(1 r
2
)s
2
x
20
.
Note that the zero boundary of the state variable $x_2$ depends here upon the values of the
parameters. This is due to the fact that $x_2$ in this model follows a square root process which
is the limiting (and only) case for which such a phenomenon occurs. This can be seen
through the presence in the coefficients of terms $x_{20}^{-n}$ where $n$ is an integer. Thus the
behavior of the likelihood expansion near such a boundary is specified exogenously to
match that of the assumed model (the unattainability of the zero boundary in this case)
in the limit where $x_{20}$ tends to zero; this is achieved by setting the log-likelihood expansion
to an arbitrarily high negative value.
A.2. The GARCH stochastic volatility model
At order $J = 1$, with $x_1 = s$ and $x_2 = Y$, the expressions for the coefficients appearing in
formula (11) are given by
$$
D_v(x;\theta) = \frac{1}{2}\ln\left(x_2^{3}(1-\rho^2)\sigma^2\right),
$$
C
(1)
X
(x[x
0
; y)
=
(45r
2
44)(x
2
x
20
)
4
96(1 r
2
)
2
s
2
x
4
20

(14r 15r
3
)(x
2
x
20
)
3
(x
1
x
10
)
24(1 r
2
)
2
sx
7=2
20

3r(x
2
x
20
)
2
(x
1
x
10
)
4s(1 r
2
)x
5=2
20

(x
2
x
20
)
3
2(1 r
2
)s
2
x
3
20

(8 11r
2
)(x
2
x
20
)
2
(x
1
x
10
)
2
48(1 r
2
)
2
x
3
20

(x
2
x
20
)(x
1
x
10
)
2
4(1 r
2
)x
2
20

rs(x
2
x
20
)(x
1
x
10
)
3
24(1 r
2
)
2
x
5=2
20

s
2
(x
1
x
10
)
4
96(1 r
2
)
2
x
2
20

x
2
2
2x
2

x
20

(

x
20

rs(x
1
x
10
)) x
20
(x
20
2rs

x
20

(x
1
x
10
) s
2
(x
1
x
10
)
2
)
2(1 r
2
)s
2
x
2
20
,
C
(0)
X
(x[x
0
; y)
=
(x
2
x
20
)(4yk 4(r d)rs

x
20

(4k s
2
)x
20
2rsx
3=2
20
(2

1 r
2

l
1
1))
4(1 r
2
)s
2
x
2
20

(x
2
x
20
)
2
(48yk 36(r d)rs

x
20

24kx
20
5s
2
x
20
6rsx
3=2
20
(2

1 r
2

l
1
1))
48(1 r
2
)s
2
x
3
20

(x
1
x
10
)(x
2
x
20
)(36ykr 24(r d)s

x
20

r(12k s
2
)x
20
)
48(1 r
2
)sx
5=2
20

s
2
(x
1
x
10
)
2
48x
20
(1 r
2
)

(x
1
x
10
)(4ykr 4(r d)s

x
20

r(4k s
2
)x
20
2sx
3=2
20
(2

1 r
2

l
1
1))
4(1 r
2
)sx
3=2
20
,
C
(1)
X
(x[x
0
; y) =
x
20
(1 4l
2
1
(1 r
2
))
8(1 r
2
)

g
2
k
2
2(1 r
2
)s
2
x
2
20

(r d)gkr
(1 r
2
)sx
3=2
20

r(4k s
2
)

x
20

8(1 r
2
)s

r(2gk d(4k s
2
) r(4k s
2
))
4(1 r
2
)s

x
20

(4(r d)s

x
20

r(4k s
2
)x
20
2sx
3=2
20
4gkr)l
1
4(1 r
2
)s

x
20

gk(4k (4 3r
2
)s
2
) 2(r d)
2
s
2
4(1 r
2
)s
2
x
20

48k
2
24(2(d r k) kr
2
)s
2
(13 10r
2
)s
4
96(1 r
2
)s
2
.
Note that the zero boundary of the state variable $x_2$ is unattainable,
provided $\kappa\gamma$ is non-negative. The log-likelihood is set to an arbitrarily high negative value
when this condition is violated.
A.3. The CEV model
At order $J = 1$, with $x_1 = s$ and $x_2 = Y$ and the indirect parameters
$$
a = r - d, \qquad b = (1-\rho^2)\lambda_1 - \frac{1}{2},
$$
the expressions for the coefficients appearing in formula (11) are given by
$$
D_v(x;\theta) = \frac{1}{2}\ln\left(x_2^{1+2\beta}(1-\rho^2)\sigma^2\right),
$$
C
(1)
X
(x[x
0
; y)
=
x
42b
20
s
2
(x
1
x
10
)
4
96(1 r
2
)
2

(x
1
x
10
)
2
(x
2
x
20
)
4x
2
20
(1 r
2
)

x
(7=2)b
20
rs(x
1
x
10
)
3
(x
2
x
20
)
24(1 r
2
)
2

(1 2b)x
(3=2)b
20
r(x
1
x
10
)(x
2
x
20
)
2
4(1 r
2
)s

(9r
2
6 2b(1 r
2
))(x
1
x
10
)
2
(x
2
x
20
)
2
48x
3
20
(1 r
2
)
2

bx
12b
20
(x
2
x
20
)
3
2(1 r
2
)s
2

x
(5=2)b
20
r(3r
2
2 8b(1 r
2
) 4b
2
(1 r
2
))(x
1
x
10
)(x
2
x
20
)
3
24(1 r
2
)
2
s

(r
2
16b(1 r
2
) 28b
2
(1 r
2
))(x
2
x
20
)
4
96x
2(1b)
20
(1 r
2
)
2
s
2

x
12b
20
(x
3
20
2x
3=2b
20
rs(x
1
x
10
) x
2b
20
s
2
(x
1
x
10
)
2
2(x
2
20
x
1=2b
20
rs(x
1
x
10
))x
2
x
20
x
2
2
)
2(1 r
2
)s
2
,
C
(0)
X
(x[x
0
; y)
=
x
3=2b
20
(4gx
20
kr 4x
2
20
kr 4ax
1=2b
20
s 4bx
3=2b
20
s (1 2b)x
2b
20
rs
2
)(x
1
x
10
)
4(1 r
2
)s

x
12b
20
(4gx
20
k 4x
2
20
k 4ax
1=2b
20
rs 4bx
3=2b
20
rs (1 2b)x
2b
20
s
2
)(x
2
x
20
)
4(1 r
2
)s
2

(2b 3)x
32b
20
s
2
(x
1
x
10
)
2
48(1 r
2
)

x
5=2b
20
(12(1 2b)gx
20
kr 12(1 2b)x
2
20
kr 24ax
1=2b
20
s (15 28b 12b
2
)x
2b
20
rs
2
)(x
1
x
10
)(x
2
x
20
)
48(1 r
2
)s

(48bgx
20
k 24(1 2b)x
2
20
k 12a(1 2b)x
1=2b
20
rs 12b(1 2b)x
3=2b
20
rs (9 14b)x
2b
20
s
2
)(x
2
x
20
)
2
48x
2(1b)
20
(1 r
2
)s
2
,
C
(1)
X
(x[x
0
; y)
=
(3 28b 12b
2
6r
2
40br
2
24b
2
r
2
)s
2
x
22b
20
96(1 r
2
)

k
2
(x
20
g)
2
2(1 r
2
)s
2
x
2b
20

(1 2b)rsx
3=2b
20
(a bx
20
)
4(1 r
2
)

krx
1=2b
20
(x
20
g)(a bx
20
)
(1 r
2
)s

2a
2
4bgk gkr
2
2bgkr
2
4abx
20
2kx
20
4bkx
20
kr
2
x
20
2bkr
2
x
20
2b
2
x
2
20
4(1 r
2
)x
20
.
Note that the boundary at zero of the state variable $x_2$ cannot be achieved if $\beta > 1/2$ and
$2\kappa\gamma \geq \sigma^2$; whether the boundary is attainable when $\beta = 1/2$ depends on the other parameter
values (see the discussion above for the Heston model).
References
Aït-Sahalia, Y., 1999. Transition densities for interest rate and other nonlinear diffusions. Journal of Finance 54, 1361-1395.
Aït-Sahalia, Y., 2001. Closed-form likelihood expansions for multivariate diffusions. Working Paper, Princeton University.
Aït-Sahalia, Y., 2002. Maximum-likelihood estimation of discretely-sampled diffusions: a closed-form approximation approach. Econometrica 70, 223-262.
Aït-Sahalia, Y., Lo, A., 1998. Nonparametric estimation of state-price-densities implicit in financial asset prices. Journal of Finance 53, 499-547.
Aït-Sahalia, Y., Mykland, P.A., 2003. The effects of random and discrete sampling when estimating continuous-time diffusions. Econometrica 71, 483-549.
Andersen, T.G., Lund, J., 1997. Estimating continuous time stochastic volatility models of the short term interest rate. Journal of Econometrics 77, 343-378.
Bakshi, G., Cao, C., Chen, Z., 1997. Empirical performance of alternative option pricing models. Journal of Finance 52, 2003-2049.
Bakshi, G., Cao, C., Chen, Z., 2000. Do call prices and the underlying stock always move in the same direction? Review of Financial Studies 13, 549-584.
Bates, D.S., 2000. Post-87 crash fears in the S&P 500 futures option market. Journal of Econometrics 94, 181-238.
Bates, D.S., 2002. Maximum likelihood estimation of latent affine processes. Working Paper, University of Iowa.
Black, F., 1976. Studies of stock price volatility changes. In: Proceedings of the 1976 Meetings of the American Statistical Association, pp. 171-181.
Black, F., Scholes, M., 1973. The pricing of options and corporate liabilities. Journal of Political Economy 81, 637-654.
Bollerslev, T., Zhou, H., 2002. Estimating stochastic volatility diffusions using conditional moments of integrated volatility. Journal of Econometrics 109, 33-65.
Carr, P., Madan, D.B., 1998. Option valuation using the fast Fourier transform. Journal of Computational Finance 2, 61-73.
Chacko, G., Viceira, L.M., 2003. Spectral GMM estimation of continuous-time processes. Journal of Econometrics 116, 259-292.
Derman, E., Kani, I., 1994. Riding on the smile. RISK 7, 32-39.
Dumas, B., Fleming, J., Whaley, R.E., 1998. Implied volatility functions: empirical tests. Journal of Finance 53, 2059-2106.
Dupire, B., 1994. Pricing with a smile. RISK 7, 18-20.
Eraker, B., 2001. MCMC analysis of diffusion models with application to finance. Journal of Business and Economic Statistics 19, 177-191.
Feller, W., 1951. Two singular diffusion problems. Annals of Mathematics 54, 173-182.
Gallant, A.R., Tauchen, G., 1996. Which moments to match? Econometric Theory 12, 657-681.
Harrison, M., Kreps, D., 1979. Martingales and arbitrage in multiperiod securities markets. Journal of Economic Theory 20, 381-408.
Harrison, M., Pliska, S., 1981. Martingales and stochastic integrals in the theory of continuous trading. Stochastic Processes and Their Applications 11, 215-260.
Heston, S., 1993. A closed-form solution for options with stochastic volatility with applications to bonds and currency options. Review of Financial Studies 6, 327-343.
Heston, S., Loewenstein, M., Willard, G.A., 2004. Options and bubbles. Technical Report, University of Maryland.
Hull, J., White, A., 1987. The pricing of options on assets with stochastic volatilities. Journal of Finance 42, 281-300.
Hurn, A.S., Jeisman, J., Lindsay, K., 2005. Seeing the wood for the trees: a critical evaluation of methods to estimate the parameters of stochastic differential equations. Working Paper, School of Economics and Finance, Queensland University of Technology.
Jacquier, E., Polson, N.G., Rossi, P.E., 1994. Bayesian analysis of stochastic volatility models. Journal of Business and Economic Statistics 14, 429-434.
Jensen, B., Poulsen, R., 2002. Transition densities of diffusion processes: numerical comparison of approximation techniques. Journal of Derivatives 9, 1-15.
Jiang, G.J., Knight, J., 2002. Estimation of continuous-time processes via empirical characteristic function. Journal of Business and Economic Statistics 20, 198-212.
Jones, C.S., 2003. The dynamics of stochastic volatility: evidence from underlying and options markets. Journal of Econometrics 116, 181-224.
Kim, S., Shephard, N., Chib, S., 1999. Stochastic volatility: likelihood inference and comparison with ARCH models. Review of Economic Studies 65, 361-393.
Lamoureux, C.G., Paseka, A., 2004. Information in option prices and the underlying asset dynamics. Working Paper, University of Arizona.
Ledoit, O., Santa-Clara, P., Yan, S., 2002. Relative pricing of options with stochastic volatility. Working Paper, University of California at Los Angeles.
Lewis, A.L., 2000. Option Valuation under Stochastic Volatility. Finance Press, Newport Beach, CA.
Li, M., Pearson, N.D., Poteshman, A.M., 2004. Conditional estimation of diffusion processes. Journal of Financial Economics 74, 31-66.
Meddahi, N., 2001. An eigenfunction approach for volatility modeling. Technical Report, Université de Montréal.
Merton, R.C., 1973. The theory of rational option pricing. Bell Journal of Economics and Management Science 4, 141-183.
Nelson, D.B., 1990. ARCH models as diffusion approximations. Journal of Econometrics 45, 7-38.
Pan, J., 2002. The jump-risk premia implicit in options: evidence from an integrated time-series study. Journal of Financial Economics 63, 3-50.
Romano, M., Touzi, N., 1997. Contingent claims and market completeness in a stochastic volatility model. Mathematical Finance 7, 399-412.
Rubinstein, M., 1995. As simple as one, two, three. RISK 8, 44-47.
Ruiz, E., 1994. Quasi-maximum likelihood estimation of stochastic volatility models. Journal of Econometrics 63, 289-306.
Singleton, K., 2001. Estimation of affine asset pricing models using the empirical characteristic function. Journal of Econometrics 102, 111-141.
Stein, E.M., Stein, J.C., 1991. Stock price distributions with stochastic volatility: an analytic approach. Review of Financial Studies 4, 727-752.
Stein, J.C., 1989. Overreactions in the options market. Journal of Finance 44, 1011-1023.
Stramer, O., Yan, J., 2005. On simulated likelihood of discretely observed diffusion processes and comparison to closed-form approximation. Working Paper, University of Iowa.
Takamizawa, H., Shoji, I., 2003. Modeling the term structure of interest rates with general short-rate models. Finance and Stochastics 7, 323-335.
Taylor, S.J., 1994. Modeling stochastic volatility: a review and comparative study. Mathematical Finance 4, 183-204.
Whaley, R.E., 2000. The investor fear gauge. The Journal of Portfolio Management 26, 12-17.
Yu, J., 2003. Closed-form likelihood estimation of jump-diffusions with an application to the realignment risk premium of the Chinese yuan. Ph.D. Thesis, Princeton University.