FE570 Market Microstructure and Trading Strategies
Lecture 4. Empirical Properties of the Microstructure Data and Liquidity
Dan Pirjol
Stevens Institute of Technology
Week 4
Outline
Empirical properties of microdata
Bid-ask bounce
The Formation of Security Prices
Models of security prices
Random Walk
Random Walk at Microscale
Market Liquidity
Stylized facts of financial markets
In Lecture 2 we learned about the stylized facts (SF) of the financial markets:
▶ SF#1: Absence of autocorrelation. Autocorrelations of returns are often
insignificant, except for very small intraday time scales
▶ SF#2: Heavy tails. The distribution of the returns displays a power-law
or Pareto-like tail.
▶ SF#4: Aggregational Gaussianity. As the time scale over which returns
are calculated increases, their distribution looks more and more normal.
▶ SF#6: Volatility clustering. Different measures of volatility display a
positive autocorrelation over several days, which quantifies the fact that
high-volatility events tend to cluster in time.
SF#1: Absence of autocorrelations
Absence of autocorrelation. Autocorrelations of returns are often
insignificant, except for very small intraday time scales
Log-returns with lag τ are defined as $R_\tau(t) = \log \frac{S(t+\tau)}{S(t)}$
The plot shows corr(Rτ (t), Rτ (t + k)) for τ = 1 day for the SP500 index
Autocorrelations of returns in microdata
TAQ data for JPM prices on 13-Jan-2021 (11:00-11:30am): autocorrelation of
successive trade price changes. There is a significant autocorrelation of price
changes at lag one.
ρ1 = corr(∆Pt , ∆Pt+1 ) ≃ −0.385
Bid-ask bounce
This effect is due to the bid-ask bounce: successive trade prices bounce back
and forth between the bid and the ask price. Over small intraday time scales,
the stock price movement takes place only inside the bid-ask spread.
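The mechanism can be illustrated with a minimal simulation sketch. It assumes a fixed mid-price and trades that hit the bid or the ask with equal probability, independently of each other; the values of `mid` and `half_spread` are hypothetical. Under these assumptions the lag-one autocorrelation of price changes is −0.5 in theory; a moving mid-price would pull the value toward zero.

```python
import random

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    return cov / var

def simulate_bounce(n_trades, mid=100.0, half_spread=0.005, seed=0):
    """Trade prices that land at random on the bid or the ask around a FIXED mid.

    Assumption: trade directions are i.i.d. with 50% buys and 50% sells.
    """
    rng = random.Random(seed)
    return [mid + half_spread * rng.choice((1, -1)) for _ in range(n_trades)]

prices = simulate_bounce(50_000)
changes = [b - a for a, b in zip(prices, prices[1:])]
rho1 = lag1_autocorr(changes)
print(f"lag-1 autocorrelation of price changes: {rho1:.3f}")
```

Even though the mid-price never moves, the trade price changes are strongly negatively autocorrelated: an upward move (bid to ask) can only be followed by no move or a downward move.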
Bid-ask bounce in data
30 mins of microdata: JPM trade price 10:00-10:30am 13-Jan-2021.
Bid-ask bounce in data: JPM TAQ dataset
Zoom in on one minute of data: 10:31am - 10:32am.
Noise in micro-data
The Bid-Ask bounce is one of the manifestations of noise in micro-data (high
frequency data).
Noise results from several factors:
▶ Bid-Ask Bounce
▶ Discrete nature of price changes at micro-scale
▶ Order arrival latency
Impact on the realized volatility
The bid-ask bounce produces high volatility estimates at small time scales,
even if the price stays within the bid-ask spread.
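A small sketch of this effect, under the assumption that realized variance is measured as the sum of squared price changes and that the mid-price is constant (so the true volatility is zero). The parameter values are hypothetical.

```python
import random

def realized_variance(prices, step):
    """Sum of squared price changes, sampling every `step`-th trade."""
    sampled = prices[::step]
    return sum((b - a) ** 2 for a, b in zip(sampled, sampled[1:]))

rng = random.Random(1)
mid, half_spread = 100.0, 0.005  # hypothetical values; the mid never moves
prices = [mid + half_spread * rng.choice((1, -1)) for _ in range(20_000)]

# Realized variance over the same window at coarser and coarser sampling.
rv = {step: realized_variance(prices, step) for step in (1, 5, 20, 100)}
for step, value in rv.items():
    print(f"sampling every {step:>3} trades: realized variance = {value:.4f}")
```

The estimate is largest at the finest sampling and shrinks as the sampling interval grows, although the price never leaves the bid-ask spread.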
Microstructure data: JPM TAQ dataset
Let us examine more closely the structure of the price at microscale, using
the JPM TAQ dataset as an example. It contains 127,720 trades between
9:30 and 16:00 EST.
The summary statistics of price changes
Using the entire dataset, compute the price changes ∆pt = pt − pt−1 and study
their statistics at different sampling frequencies.
The mean of the price changes E[∆pt ] is very small!
Statistics of returns of financial time series and their estimation
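▶ Moments of a random variable X with density f(x): the ℓ-th moment is
$$m'_\ell = E(X^\ell) = \int_{-\infty}^{\infty} x^\ell f(x)\,dx \qquad (1)$$
▶ The ℓ-th central moment is
$$m_\ell = E[(X - \mu_x)^\ell] = \int_{-\infty}^{\infty} (x - \mu_x)^\ell f(x)\,dx \qquad (2)$$
▶ First moment: mean or expectation of X, $\mu_x = m'_1$
▶ Second central moment: variance of X, $\sigma_x^2 = m_2$
▶ Skewness (asymmetry) and kurtosis (fat tails)
$$S(x) = E\left[\frac{(X - \mu_x)^3}{\sigma_x^3}\right] \qquad (3)$$
$$K(x) = E\left[\frac{(X - \mu_x)^4}{\sigma_x^4}\right] \qquad (4)$$
The excess kurtosis is K(x) − 3, which vanishes for a normal distribution.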
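In practice the expectations in (1)-(4) are replaced by sample averages. A minimal sketch of the estimators follows; the function name `sample_moments` and the example price changes are hypothetical, and the simple 1/n normalization is used throughout.

```python
import math

def sample_moments(x):
    """Sample mean, variance, skewness and kurtosis using 1/n averages."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in x) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in x) / n  # fourth central moment
    sd = math.sqrt(m2)
    return {
        "mean": mean,
        "variance": m2,
        "skewness": m3 / sd ** 3,
        "kurtosis": m4 / sd ** 4,
        "excess_kurtosis": m4 / sd ** 4 - 3.0,
    }

# Hypothetical price changes in dollars.
changes = [0.01, -0.01, 0.0, 0.01, -0.02, 0.0, 0.01, -0.01, 0.0, 0.01]
for name, value in sample_moments(changes).items():
    print(f"{name:>16}: {value: .6f}")
```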
Autocorrelation of price returns
The autocorrelation plot ACF of the price changes
ρk = corr(∆pt , ∆pt−k )
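A minimal sketch of how the sample autocorrelation function could be computed for several lags at once. The function name `acf` and the short price series are hypothetical, and the common estimator normalized by the lag-zero sum of squares is assumed.

```python
def acf(x, max_lag):
    """Sample autocorrelations rho_0 .. rho_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]

# Hypothetical trade prices bouncing between two levels with a small upward shift.
prices = [10.00, 10.01, 10.00, 10.01, 10.00, 10.02, 10.01, 10.02, 10.01, 10.02]
changes = [b - a for a, b in zip(prices, prices[1:])]
for lag, rho in enumerate(acf(changes, 3)):
    print(f"lag {lag}: rho = {rho: .3f}")
```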
SF#2: Heavy tailed distributions
The distribution of returns has tails that are heavier than those of a Gaussian
and are well approximated by power laws.
Heavy tails: the unconditional distribution of returns appears to display a
power-law or Pareto-like tail, with a tail index which is finite, higher than two
and less than five for most data sets studied.
The excess kurtosis of the distribution of returns is
$$\kappa = \frac{E\left[(r(t,\tau) - E[r(t,\tau)])^4\right]}{\sigma^4(\tau)} - 3$$
with $\sigma^2(\tau) = \mathrm{var}(r(t,\tau))$. This vanishes for a normal distribution. κ > 0
indicates a “heavy-tailed” distribution.
The excess kurtosis decreases as the sampling interval τ increases, that is, as
the sampling frequency decreases (SF#4: Aggregational Gaussianity).
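The effect can be sketched on synthetic data. The example assumes i.i.d. Laplace-distributed log-returns (excess kurtosis 3), generated as the difference of two exponentials; this is only a convenient heavy-tailed stand-in, not a model of the JPM data. Log-returns over longer intervals are obtained by summing consecutive short-interval returns.

```python
import random

def excess_kurtosis(x):
    """Sample excess kurtosis: fourth central moment / variance^2 - 3."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / m2 ** 2 - 3.0

def aggregate(returns, k):
    """Sum non-overlapping blocks of k consecutive log-returns."""
    return [sum(returns[i:i + k]) for i in range(0, len(returns) - k + 1, k)]

rng = random.Random(42)
# Assumption: i.i.d. Laplace returns as a heavy-tailed example.
base = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(200_000)]

kurt = {k: excess_kurtosis(aggregate(base, k)) for k in (1, 5, 20)}
for k, value in kurt.items():
    print(f"aggregation over {k:>2} intervals: excess kurtosis = {value:.3f}")
```

For i.i.d. returns with finite fourth moment the excess kurtosis of a sum of k returns scales like 1/k, so the aggregated distribution approaches the normal one.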
Stylized facts of the financial markets at micro scale
Denoting the trade price pt and the price change between successive
transactions ∆pt = pt − pt−1 we observe the following empirical properties of
the price changes:
▶ The returns have means close to zero
▶ The returns have extreme dispersion
▶ There is non-zero dependence between successive observations
The simplest model for trade prices at microscale embedding these properties is
the Roll model, to be discussed in the next lecture.
The formation of security prices
What can we say generally about security prices?
Which price? There are several prices at microscale:
▶ Bid/ask price. In a dealer market there are posted bid and ask prices $b_t$, $a_t$
at any time. In a limit order book (LOB) there is a best-bid and best-offer price (NBBO).
▶ Mid-quote: $m_t = \frac{1}{2}(a_t + b_t)$
▶ Trade price $p_t$
▶ Expected price of next trade $\tilde{p}_t = m_t + \frac{1}{2} s_t E[\Delta p_{t+1}]$,
with $s_t \in \{+1, -1\}$ for an expected (buy, sell).
Economically, there is an efficient price given by the sum of the future expected
cash flows of the security, e.g. dividends. This should be smoothly varying, as
information about future earnings/cash flows becomes available. This price is
roughly the closing price each day on exchanges.
Historical evolution of thinking about security prices
▶ 1900 - Louis Bachelier: The French mathematician Louis Bachelier first
documented the idea and provided insights about stock market prices in
his Ph.D. dissertation titled "The Theory of Speculation".
▶ 1953 - Maurice Kendall: Kendall, a British statistician, proposed the
Random-Walk hypothesis in his paper titled "The Analysis of Economic
Time-Series, Part I: Prices".
▶ 1965 - Eugene Fama: Eugene Fama at the University of Chicago further
developed the idea in a paper titled "Random Walks in Stock Market
Prices", and eventually incorporated it into the Efficient-Market
hypothesis.
▶ 1968 - Paul Samuelson: Established the log-normal model for stock
prices, which has been the basis of Black-Scholes theory (1973) and is
still used today as a first-order approximation for stock price dynamics
(enhanced with stochastic volatility, jumps, etc.)
Louis Bachelier
▶ Born in Le Havre, France (1870), worked on the Paris Stock Exchange and
enrolled at the Sorbonne, where he defended his thesis "Theory of
Speculation" in 1900.
▶ First mathematical treatment of securities prices, based on a Brownian
motion model. This assumes that price changes are independent and
normally distributed. He used this to price options.
▶ The work was not considered important enough and he was not able to
find a university position. Today the most important association in
financial engineering, the Bachelier Finance Society, is named after him.
The Random-Walk Model - General Model
Let pt denote the transaction price at time t. Two possible time definitions:
▶ Calendar time: the actual time-stamp of the trade. Irregular, not
uniformly spaced.
▶ Trading time: t increments by 1 for each trade.
Definition
The Random-Walk model (with drift) assumes that prices follow the law
pt = pt−1 + µ + ut
where ut , t = 0, 1, ... are independent and identically distributed random
variables with zero mean. Intuitively, they arise from new information that
bears on the security value.
▶ µ is the expected price change (the drift).
▶ The units of pt are either dollars or ticks (discrete units of price, usually 1c
or 5c).
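A minimal sketch of this model in trading time, assuming the innovations are ±1 tick with equal probability (any i.i.d. zero-mean distribution would do). The function name and all parameter values are hypothetical.

```python
import random

def simulate_random_walk(n_steps, p0=100.0, drift=0.0, tick=0.01, seed=0):
    """Simulate p_t = p_{t-1} + drift + u_t, one step per trade."""
    rng = random.Random(seed)
    prices = [p0]
    for _ in range(n_steps):
        u = tick * rng.choice((-1, 1))  # i.i.d. zero-mean innovation of one tick
        prices.append(prices[-1] + drift + u)
    return prices

prices = simulate_random_walk(100_000, drift=0.0001)
changes = [b - a for a, b in zip(prices, prices[1:])]
mean_change = sum(changes) / len(changes)
print(f"estimated drift per trade: {mean_change:.6f} (true value 0.000100)")
```

With a tick of one cent and a drift of 0.01 cents per trade, the drift is two orders of magnitude smaller than a typical price change, which is why it is hard to estimate from a single day of data.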
The Random-Walk Model - Martingale
We have seen that the average drift in microstructure data is very small and
can be neglected to a good approximation. We can thus assume that the
security prices follow a zero drift random walk.
Under this assumption the expected future price pt+k is equal to the most
recent price. Probability theory has a special name for this type of process:
a martingale.
Definition
The discrete stochastic process pt is called a martingale if its expected value,
conditional on the past realizations, is equal to the most recently known value
E [pt+1 |pt , pt−1 , ...] = pt
Note that the expectation in this formulation is conditioned on lagged values of
pt (or of another process xt), that is, on the history of the process.
The Random-Walk Model - Construction
▶ To define a Random-Walk formally, take independent random variables
$Z_1, Z_2, \ldots$, where each variable is either 1 or −1, with a 50% probability for
either value, and set $S_0 = 0$ and $S_n = \sum_{j=1}^{n} Z_j$. The series $S_n$ is called the
simple random walk on $\mathbb{Z}$.
▶ This series (the running sum) gives the net distance walked, if each part of the
walk is of length one. The expectation of $S_n$ is zero:
$E(S_n) = \sum_{j=1}^{n} E(Z_j) = 0$.
▶ A Random-Walk is a process constructed as the sum of independently and
identically distributed (i.i.d.) zero-mean random variables - a special case of
a martingale.
▶ In microstructure analysis, transaction prices are usually not martingales.
However it is possible to identify a martingale component of the price.
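The construction of the simple random walk can be checked numerically. The sketch below simulates many independent walks and compares the sample mean and variance of $S_n$ with the values 0 and n; the variance value n is the standard result for a sum of n independent ±1 steps and is not stated on the slide. The number of steps and paths are arbitrary choices.

```python
import random

def simple_random_walk(n, rng):
    """Return the path S_0, S_1, ..., S_n of a simple random walk on the integers."""
    s, path = 0, [0]
    for _ in range(n):
        s += rng.choice((1, -1))  # Z_j = +1 or -1 with probability 1/2 each
        path.append(s)
    return path

rng = random.Random(7)
n, n_paths = 100, 5_000
finals = [simple_random_walk(n, rng)[-1] for _ in range(n_paths)]

mean_sn = sum(finals) / n_paths
var_sn = sum((s - mean_sn) ** 2 for s in finals) / n_paths
print(f"sample mean of S_n: {mean_sn:.3f} (theory: 0)")
print(f"sample variance of S_n: {var_sn:.1f} (theory: {n})")
```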
The geometric Brownian motion as a multiplicative random walk
The most popular stock price model at the macro scale is the geometric
Brownian motion (also known in option theory as the Black-Scholes model).
Under this model the logarithm of the stock price $S_t$ follows a random walk in discrete time
$$\log S_{t+1} = \log S_t + u_t$$
where $u_t \sim N\left((r - \frac{1}{2}\sigma^2)\Delta t,\ \sigma^2 \Delta t\right)$ are independent and identically
distributed normal random variables. Randomness in the stock price is
multiplicative, instead of additive. The stock price can never become negative!
In continuous time, the stock price can be expressed in terms of a Brownian
motion $W_t$
$$S_t = S_0\, e^{\sigma W_t + (r - \frac{1}{2}\sigma^2) t}$$
Model parameters:
▶ r is the risk-free interest rate
▶ σ is the stock price volatility
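A short simulation sketch of the discrete-time recursion for log S. The parameter values (S0 = 100, r = 3%, σ = 20%, one-year horizon, 50 time steps) are hypothetical. The check that the average terminal price is close to $S_0 e^{rT}$ uses the standard log-normal mean, which follows from the drift $(r - \frac{1}{2}\sigma^2)$ in the increments.

```python
import math
import random

def simulate_gbm_terminal(s0, r, sigma, horizon, n_steps, rng):
    """Terminal price S_T from the recursion log S_{t+1} = log S_t + u_t."""
    dt = horizon / n_steps
    log_s = math.log(s0)
    for _ in range(n_steps):
        # u_t ~ N((r - sigma^2 / 2) dt, sigma^2 dt)
        log_s += rng.gauss((r - 0.5 * sigma ** 2) * dt, sigma * math.sqrt(dt))
    return math.exp(log_s)

rng = random.Random(11)
s0, r, sigma, horizon = 100.0, 0.03, 0.2, 1.0  # hypothetical parameters
terminal = [simulate_gbm_terminal(s0, r, sigma, horizon, 50, rng) for _ in range(5_000)]

mean_terminal = sum(terminal) / len(terminal)
print(f"minimum simulated price: {min(terminal):.2f}")
print(f"mean terminal price: {mean_terminal:.2f}, S0*exp(r*T) = {s0 * math.exp(r * horizon):.2f}")
```

Because the randomness enters through the exponent, every simulated price is strictly positive.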
Beyond the normal distribution: exponential Lévy models
One can relax the normality assumption of the increments of log St in the
Black-Scholes model while preserving the independence and identical
distribution of the increments, by replacing the Brownian motion Wt with a
general Lévy process.
This gives a wide class of models called exponential Lévy models.
Definition (Exponential Lévy models)
Assume $X_t$ is a Lévy process, with independent and identically distributed
increments $\Delta X_t = X_{t+\tau} - X_t$. Under the exponential Lévy model, the asset
price follows the process
$$S_t = S_0\, e^{X_t - \mu t}$$
where µ is chosen such that $E[S_t] = S_0$ (the stock process is a martingale).
Simplest example: the Merton jump model, where the jumps of $X_t$ are given by a
compound Poisson process (added to a Brownian motion).
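A sketch of the martingale normalization for a Merton-type example. It assumes $X_t = \sigma W_t$ plus a compound Poisson sum of normally distributed jumps with intensity λ, jump mean m and jump standard deviation δ; under this assumption $E[e^{X_t}] = e^{\mu t}$ with $\mu = \frac{1}{2}\sigma^2 + \lambda(e^{m + \delta^2/2} - 1)$. All parameter values and function names are hypothetical.

```python
import math
import random

def poisson_draw(mean_count, rng):
    """Poisson random number, counted from exponential inter-arrival times."""
    count, elapsed = 0, rng.expovariate(mean_count)
    while elapsed < 1.0:
        count += 1
        elapsed += rng.expovariate(mean_count)
    return count

def merton_terminal_price(s0, sigma, lam, jump_mean, jump_sd, t, rng):
    """S_t = S_0 exp(X_t - mu t), with X_t = sigma W_t + sum of normal jumps."""
    # mu is chosen so that E[S_t] = S_0 under the stated assumptions.
    mu = 0.5 * sigma ** 2 + lam * (math.exp(jump_mean + 0.5 * jump_sd ** 2) - 1.0)
    x = sigma * math.sqrt(t) * rng.gauss(0.0, 1.0)  # Brownian part
    for _ in range(poisson_draw(lam * t, rng)):     # compound Poisson part
        x += rng.gauss(jump_mean, jump_sd)
    return s0 * math.exp(x - mu * t)

rng = random.Random(5)
s0 = 100.0
terminal = [merton_terminal_price(s0, 0.2, 1.0, -0.05, 0.1, 1.0, rng) for _ in range(20_000)]
mean_terminal = sum(terminal) / len(terminal)
print(f"mean terminal price: {mean_terminal:.2f} (target S0 = {s0:.2f})")
```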
Trade prices components
The pattern of trade prices observed at micro-scale has two components:
▶ Slow moving component mt called efficient price. This is the fundamental
security value, and embeds information about the future earnings of the
stock. Long lasting.
▶ Rapidly changing up-down component ∼ qt , responsible for the bid-ask
bounce. qt = {+1, −1} is the trade direction indicator which shows if a
trade is a buy or sell. This is “noise” due to trading activity. Transitory.
pt = mt + cqt
The Roll model (1985) is the simplest model which attempts to include both
components into the trade price.
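The decomposition pt = mt + cqt can be explored with a small simulation sketch. It assumes, beyond what the slide states, that the efficient price mt is a zero-drift random walk with normal innovations of standard deviation `sigma_u`, and that trade directions qt are i.i.d. and independent of mt. Under these assumptions the price changes have variance $\sigma_u^2 + 2c^2$ and lag-one autocovariance $-c^2$. All parameter values are hypothetical.

```python
import random

def simulate_trade_prices(n, c=0.005, sigma_u=0.002, m0=100.0, seed=3):
    """Trade prices p_t = m_t + c * q_t with a random-walk efficient price m_t."""
    rng = random.Random(seed)
    m, prices = m0, []
    for _ in range(n):
        m += rng.gauss(0.0, sigma_u)  # slow-moving efficient price
        q = rng.choice((1, -1))       # trade direction: buy (+1) or sell (-1)
        prices.append(m + c * q)      # transitory bid-ask bounce component
    return prices

def autocovariance(x, lag):
    """Sample autocovariance of x at the given lag (1/n normalization)."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n)) / n

c, sigma_u = 0.005, 0.002
prices = simulate_trade_prices(100_000, c=c, sigma_u=sigma_u)
changes = [b - a for a, b in zip(prices, prices[1:])]

gamma0 = autocovariance(changes, 0)
gamma1 = autocovariance(changes, 1)
print(f"variance of price changes: {gamma0:.3e} (theory {sigma_u**2 + 2 * c**2:.3e})")
print(f"lag-1 autocovariance:      {gamma1:.3e} (theory {-c**2:.3e})")
```

The negative lag-one autocovariance comes entirely from the transitory component, while the efficient price contributes only to the variance.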
Liquidity
Market Liquidity
What is liquidity?
▶ Liquidity is the property of the markets which allows rapid and cheap
trade execution. It is the most important characteristic of well-functioning
markets.
▶ Liquidity has several dimensions: time (trade execution), size (can trade
large amounts of stock) and cost (transaction costs).
▶ Given its importance, one would expect that the term liquidity would be
well defined and universally understood. In reality there are several
measures of liquidity, each of them measuring a different dimension.
Dimensions of liquidity
1. Immediacy - time dimension. Execution speed is very important at the
microscale. The delay in execution is called latency, and depends on technology
and on access to markets (direct or through a broker). Impatient traders use
market orders for a quick transaction execution.
2. Depth - trade size dimension. Trading a large block of stock introduces
market impact. What is the size of a trade that can be arranged at a given
cost?
3. Cost. There are two types of costs associated with trading:
▶ Direct costs. Broker fees, exchange fees, etc. They can also be negative
(rebates), which can impact traders' behavior.
▶ Implicit cost. Most important in microstructure is the bid-ask spread. For
small trades, this is observable in the market, but it can become an
unknown for large orders, and has to be modeled. In traders' language
this is called market breadth.
Orders and Liquidity
Market and Limit Orders play different roles in relation to liquidity:
▶ Market orders consume liquidity. (Take liquidity)
▶ Limit Orders provide liquidity. (Make liquidity)
[Figure: order book schematic with offers (sell orders) above bids (buy orders), separated by the bid-ask spread. A market buy order executes at the ask, crossing the spread.]
Liquidity vs Cost
Five types of traders offering liquidity
Market Spread
▶ Spread Components: The bid/ask spread is the price of
immediacy of trading.
▶ The spread incorporates the dealers’ operational costs, such as trading
system development and maintenance, clearing and settlement, etc. If
dealers are not compensated for their expenses, there is no rationale for
them to stay in the business.
▶ Dealers’ inventory costs contribute to the bid/ask spread, because they
must recover their potential losses by widening the spread. Since dealers must
satisfy order flows on both sides of the market, they maintain inventories of
risky (and sometimes undesirable) instruments.
▶ Spread covers the dealers’ risk of trading with counterparts who have
superior information about true security value. Informed traders trade at
one side of the market and may profit from trading with dealers. This
component of the bid/ask spread is called the adverse-selection component
since dealers confront one-sided selection of their order flow.
Adverse selection
Adverse selection refers to a situation where the type of product or
some essential feature of the product is hidden from one party in a
transaction.
The term originates in insurance, where it was used to denote the
non-disclosure of adverse conditions which could impact the payout
of the insurer (e.g. smoking or a risky lifestyle in life insurance).
Simplest case: the second-hand car market, where used cars sell for less than
new cars because buyers cannot observe their true quality.
Adverse selection plays an important role in market microstructure.
Glosten and Milgrom built a model for the bid-ask spread arising purely
from informational asymmetry between "informed traders" and
"dealers".
More about this in Lecture 8: Information-based measures
The Bid-Ask Spread
Bid-Ask Spread: The size of the bid/ask spread is an important object of study in
microstructure theory, as a proxy for liquidity.
▶ Quoted spread is defined in terms of quotes, the best-ask $a_t$ and best-bid $b_t$ prices:
$$s^Q = \frac{1}{T}\sum_{t=1}^{T} (a_t - b_t)$$
▶ Average spread is defined in terms of the asset's fundamental price $p_t^*$:
$$s = \frac{1}{T}\sum_{t=1}^{T} 2 q_t (p_t - p_t^*)$$
where $p_t$ is the trade price and $q_t$ is the trade indicator ($q_t = \pm 1$ for buy/sell).
▶ Effective spread: the efficient price $p_t^*$ is not observable and is usually proxied
with the mid-price $m_t = \frac{1}{2}(a_t + b_t)$:
$$ES = \frac{1}{T}\sum_{t=1}^{T} 2 q_t (p_t - m_t)$$
▶ Realized spread is defined with delayed mid-prices:
$$RS = \frac{1}{T}\sum_{t=1}^{T} 2 q_t (p_t - m_{t+\delta})$$
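The quoted, effective and realized spreads can be computed directly from matched quote and trade data. The sketch below is illustrative only: the function names and the five-trade data set are hypothetical, trades are assumed to be already aligned with the prevailing quotes and signed, and the realized spread is averaged over the T − δ trades for which a delayed mid-price exists.

```python
def quoted_spread(asks, bids):
    """Average of a_t - b_t."""
    return sum(a - b for a, b in zip(asks, bids)) / len(asks)

def effective_spread(prices, signs, mids):
    """Average of 2 q_t (p_t - m_t)."""
    return sum(2 * q * (p - m) for p, q, m in zip(prices, signs, mids)) / len(prices)

def realized_spread(prices, signs, mids, delta):
    """Average of 2 q_t (p_t - m_{t+delta}) over trades with a delayed mid."""
    n = len(prices) - delta
    return sum(2 * signs[t] * (prices[t] - mids[t + delta]) for t in range(n)) / n

# Hypothetical quotes prevailing at each trade, and signed trades (+1 buy, -1 sell).
bids = [99.99, 99.99, 100.00, 100.00, 100.01]
asks = [100.01, 100.01, 100.02, 100.02, 100.03]
signs = [1, 1, -1, 1, -1]
prices = [100.01, 100.01, 100.00, 100.02, 100.01]  # buys at the ask, sells at the bid
mids = [(a + b) / 2 for a, b in zip(asks, bids)]

qs = quoted_spread(asks, bids)
es = effective_spread(prices, signs, mids)
rs = realized_spread(prices, signs, mids, delta=1)
print(f"quoted: {qs:.4f}  effective: {es:.4f}  realized (delta=1): {rs:.4f}")
```

In this toy data the realized spread is smaller than the effective spread because the mid-price tends to move in the direction of the preceding trade, which is the pattern one would expect when some trades carry information.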
Measures of liquidity
The highfrequency R package computes 23 measures of liquidity.
The effective and realized spread are two of the most important
liquidity measures.