Vector Autoregression
(VAR)
Regression Analysis - Limitations
• Forecasting

  FREIGHT_VOL = 169853.82 + 1708.65 IIP - 2229.46 FR_RATE + 2304.22 DIESEL_PRICE

                Intercept   IIP     FR_RATE   DIESEL_PRICE
  t-stat        12.93       8.31    -6.20     2.98
  Probability   0.00        0.00    0.00      0.01

  Adj. R² = 99.29%   F (Prob.) = 0.00   DW = 1.96
• Y = f(X)…. X = f(Y)…??
• Lead-lag relationship
  SENSEX = f(gold, oil, ex_rate)  vs.  SENSEX = f(AR(1))
• Error structure
VAR
• Let us consider a VAR(1) model with K = 2

  Y1,t = m1 + a11 Y1,t-1 + a12 Y2,t-1 + ε1,t
  Y2,t = m2 + a21 Y1,t-1 + a22 Y2,t-1 + ε2,t

• In the VAR framework, each variable is expressed as a linear combination of lagged values of itself and lagged values of all other variables in the group
• a12 denotes the linear dependence of Y1,t on Y2,t-1 in the presence of Y1,t-1
• Therefore, a12 is the conditional effect of Y2,t-1 on Y1,t given Y1,t-1
• If a12 = 0, Y1,t depends only on its own past
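A two-equation VAR(1) like the one above can be simulated and then recovered by per-equation OLS. A minimal sketch in Python; all coefficient values are illustrative, not from the slides:

```python
import numpy as np

# Sketch: simulate the two-equation VAR(1) above and recover the
# coefficient matrix by per-equation OLS. Numbers are illustrative.
rng = np.random.default_rng(0)
m = np.array([0.5, 1.0])                 # intercepts m1, m2
A = np.array([[0.6, 0.2],                # a11, a12
              [0.1, 0.5]])               # a21, a22
T = 5000
Y = np.zeros((T, 2))
for t in range(1, T):
    Y[t] = m + A @ Y[t - 1] + rng.normal(size=2)

# Each equation: regress Y_it on a constant and both lagged variables
X = np.column_stack([np.ones(T - 1), Y[:-1]])
coefs, *_ = np.linalg.lstsq(X, Y[1:], rcond=None)
A_hat = coefs[1:].T                      # estimated a_ij matrix
print(np.round(A_hat, 2))
```

With a long simulated sample, the OLS estimates land close to the true a_ij values.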
[Figure: quarterly time series of scap and mcap, 2019 Q2 - 2024 Q1; vertical axis 8,000 to 48,000]
VAR
• In practice, the VAR equation may be expanded to include deterministic time trends and other exogenous variables
• Similarly, the VAR(2), K = 2 model can be written as
VAR
• In general, the VAR(p) model can be written as

  Yt = a0 + Σ(i=1 to p) Φi Yt-i + ut

• where Yt, a0 and ut are (k×1) vectors
• Φi is a (k×k) matrix
• ut is a sequence of serially uncorrelated random vectors with mean zero and covariance matrix Σ
VAR
• In practice it is not possible to avoid imposing some prior restrictions on a VAR system
• There is always some limit on the number of variables which can be included in a VAR model, as well as on the maximum number of lags
• If, for example, we have six variables in the VAR system and we want to impose five lags on each variable, the total number of regressors in each equation would be 30
• This could make the entire modelling process impossible if we have a small data set
VAR

  [xt]   [a1 b1] [xt-1]   [a2 b2] [xt-2]   [u1t]
  [yt] = [c1 d1] [yt-1] + [c2 d2] [yt-2] + [u2t]
VAR

  xt = a1 xt-1 + b1 yt-1 + a2 xt-2 + b2 yt-2 + u1t
  yt = δxt + c1* xt-1 + d1* yt-1 + c2* xt-2 + d2* yt-2 + u2t*
VAR
• As, in general, the σij are unknown, sample values obtained from the residuals of the VAR equation estimates will be needed for further analysis
• The idea behind making the error terms orthogonal to each other is to enable the equations to be used separately for policy analysis
• In this context, policy analysis refers to the impact of a known shock, or "orthogonal innovation", on the system
VAR(4) Estimates (standard errors in ( ), t-statistics in [ ])

                         M1 equation                     R equation
M1(-1)        1.076738 (0.20174) [ 5.33734]    0.001282 (0.00067) [ 1.90082]
M1(-2)        0.173432 (0.31444) [ 0.55156]   -0.002140 (0.00105) [-2.03583]
M1(-3)       -0.366464 (0.34687) [-1.05647]    0.002176 (0.00116) [ 1.87698]
M1(-4)        0.077602 (0.20789) [ 0.37329]   -0.001479 (0.00069) [-2.12854]
R(-1)      -275.0290  (57.2174)  [-4.80674]    1.139310 (0.19127) [ 5.95670]
R(-2)       227.1744  (95.3948)  [ 2.38141]   -0.309056 (0.31888) [-0.96918]
R(-3)         8.511942 (96.9177) [ 0.08783]    0.052365 (0.32397) [ 0.16163]
R(-4)       -50.19906 (64.7554)  [-0.77521]    0.001073 (0.21646) [ 0.00496]
C          2413.824  (1622.65)   [ 1.48758]    4.919031 (5.42416) [ 0.90687]

R-squared           0.988154     0.852889
Adj. R-squared      0.984034     0.801721
Sum sq. resids      4820241.    53.86238
S.E. equation       457.7944     1.530308
F-statistic         239.8315    16.66813
Log likelihood     -236.1676   -53.73717
Akaike AIC          15.32298     3.921073
Table 17.3: VAR(2) Estimates (standard errors in ( ), t-statistics in [ ])

                         M1 equation                     R equation
M1(-1)        1.037538 (0.16048) [ 6.46509]    0.001091 (0.00059) [ 1.85825]
M1(-2)       -0.044662 (0.15591) [-0.28646]   -0.001255 (0.00057) [-2.19871]
R(-1)      -234.8848  (45.5223)  [-5.15977]    1.069082 (0.16660) [ 6.41709]
R(-2)       160.1559  (48.5283)  [ 3.30026]   -0.223365 (0.17760) [-1.25768]
C          1451.976  (1185.59)   [ 1.22468]    5.796446 (4.33894) [ 1.33591]

R-squared           0.988198     0.806661
Adj. R-squared      0.986571     0.779993
Sum sq. resids      5373508.    71.97045
S.E. equation       430.4572     1.575354
F-statistic         607.0723    30.24882
Log likelihood     -251.7446   -60.99213
Akaike AIC          15.10263     3.881890
Forecast
• M1_1988-I = 1451.97 + 1.03 × M1_1987-IV - 0.04 × M1_1987-III - 234.88 × R1_987-IV + 160.15 × R_1987-III
• * statistically significant
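The one-step forecast above is simply an inner product of the Table 17.3 coefficients with the lagged values. A quick sketch; the lagged M1 and R values below are hypothetical placeholders, not data from the slides:

```python
# Sketch: one-step-ahead forecast from the estimated M1 equation.
# The lagged M1 and R values are hypothetical placeholders.
c = 1451.976
b_m1 = [1.037538, -0.044662]    # coefficients on M1(-1), M1(-2)
b_r = [-234.8848, 160.1559]     # coefficients on R(-1), R(-2)

m1_lags = [550.0, 540.0]        # M1(1987-IV), M1(1987-III)  (hypothetical)
r_lags = [6.0, 5.8]             # R(1987-IV),  R(1987-III)   (hypothetical)

m1_forecast = (c
               + sum(b * x for b, x in zip(b_m1, m1_lags))
               + sum(b * x for b, x in zip(b_r, r_lags)))
print(round(m1_forecast, 2))
```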
VAR Forecasting
• A straightforward application of the VAR model is forecasting
• A VAR forecaster does not need to worry about the underlying economic theory
• Nor does the forecaster need to make any assumptions about the values of exogenous variables in the forecasting period
• This is in contrast with standard econometric forecasting, where forecasts have to be conditioned upon knowledge of the exogenous variables
Forecasting in eViews
• Currently, forecasts from a VAR are not available from the VAR object.
• Forecasts can be obtained by solving a model created from the estimated VAR.
• Click on Proc/Make Model from the VAR window toolbar to create a model object from the estimated VAR.
• You may then make any changes to the model specification, including modifying the ASSIGN statement, before solving the model to obtain the forecasts.
• See Chapter 26, "Models", on page 761, for further discussion of how to forecast from model objects in EViews.
• You can forecast from a VAR by making a model object from the VAR, then solving the model over the period you wish to forecast. To make a model from your VAR, after you have estimated the VAR, click on Proc -> Make Model. To then solve the model, click on the Solve button, then in the "Solution sample" box enter the period over which you wish to forecast, and hit OK. You will then find some new series in your workfile with a suffix of _0; these are the forecasted values. Note that by default the model will insert actual values for out-of-sample periods.
VAR
• Forecasts obtained by this method are in many cases better than those obtained from more complex models
• All the endogenous variables must be stationary or transformed to stationarity
• Individual coefficients of the VAR model are often difficult to interpret
• Practitioners of this technique therefore often estimate Granger causality, the Impulse Response Function and the Error Variance Decomposition
Cholesky decomposition
• If X is a positive definite matrix with row and column dimension n, then X can be factored into an upper triangular matrix R (also of dimension n) such that:
• X = R'R
  where R' denotes the transpose of R.
• Examples of positive definite matrices in statistical applications include the variance-covariance matrix, the correlation matrix, and the X'X matrix in regression.
• The factor R acts as a matrix square root of X; for this reason, it is sometimes referred to as the Cholesky square root.
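A quick numerical check with NumPy. Note that numpy.linalg.cholesky returns the lower-triangular factor L with X = L L', so the upper-triangular R of the slide is simply L':

```python
import numpy as np

# Sketch: Cholesky factorization of a positive definite matrix.
# numpy returns the lower-triangular factor L with X = L @ L.T;
# the slide's upper-triangular R is L.T, so that X = R'R.
X = np.array([[4.0, 2.0],
              [2.0, 3.0]])              # e.g. a variance-covariance matrix
L = np.linalg.cholesky(X)
R = L.T                                 # upper-triangular "square root"
print(np.allclose(R.T @ R, X))
```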
Cholesky decomposition
• To identify orthogonalised innovations in each of the variables and the
dynamic responses to such innovations, the variance-covariance matrix
of the VAR was factorized using the Cholesky decomposition method
suggested by Doan (1992)
• This method imposes an ordering of the variables in the VAR and
attributes all of the effects of any common components to the first
variable in the VAR system.
• Unlike the standard orthogonalized approach, the generalized approach
of Pesaran and Shin (1998) is not sensitive to the ordering of variables
in the VAR system
• Unlike the orthogonalized approach, the values for the generalized variance decomposition at each horizon do not necessarily sum to one.
Granger Causality
• Several studies in energy and other fields of economics have examined the causal relationship between energy consumption and economic growth.
• The central issue has been whether economic growth stimulates the consumption of energy, or whether energy consumption itself is a stimulus for economic growth via the indirect channels of effective aggregate demand, improved overall efficiency and technological progress
• The answer to these queries is expected to play a crucial role from a policy formulation point of view.
Granger Causality
• For example, the existence of unidirectional causality running from income to energy consumption implies that energy conservation policies might be initiated without adverse economic side effects.
• On the other hand, if unidirectional causality runs from energy consumption to income, reducing energy consumption could lead to a fall in income
Granger Causality
• A time series (X) is said to Granger-
cause another time series (Y)
– if the prediction error of current Y
declines by using past values of X in
addition to past values of Y
Causality Tests
• Standard Granger-type tests require conversion to make the series I(0)

  Xt = α0 + Σ(i=1 to m) αi Xt-i + Σ(j=1 to n) βj Yt-j + ut   .....(5)

  Yt = a + Σ(i=1 to q) bi Yt-i + Σ(j=1 to r) cj Xt-j + vt    .....(6)
Causality Tests
Y GC X if
• H0: β1 = β2 = … = βn = 0 is rejected
• against HA: at least one βj ≠ 0, j = 1, …, n
X GC Y if
• H0: c1 = c2 = … = cr = 0 is rejected
• against HA: at least one cj ≠ 0, j = 1, …, r
IRF & EVD
• Detecting Granger causality is essentially an in-sample exercise, useful for discriminating between Granger exogeneity and endogeneity of the dependent variable within the sample period,
• but it is unable to gauge the degree of exogeneity of the variables outside the sample period
• To address this, one can employ generalized forecast error
variance decomposition (EVD) and Impulse Response
Function (IRF) analysis.
IRF & EVD
• The IRF analysis can trace out the dynamic responses of
one variable to innovations in another variable
• The decomposition of variance measures the percentage
of a variable’s forecast error variance that occurs as the
result of a shock from a variable in the system.
• Therefore, by employing this technique, one can find the relative importance of a set of variables in explaining the forecast error variance of another variable.
Impulse Response Function (IRF)
• IRF traces out the response of the dependent variables in the VAR system to shocks in the error terms (u1 and u2)
• If u1 in the M1 equation increases by one standard error (SE), such a shock will change M1 in the current as well as future periods
• Since M1 appears in the R regression, the change in u1 will also have an impact on R
• Similarly, a one-SE change in u2 in the R equation will have an impact on M1
• IRF traces out the impact of such shocks for several periods into the future
IRF
• One of the important applications of the orthogonalization process is dynamic simulation and projection
• Suppose one wishes to use the following VAR(1) system for dynamic policy simulation

  xt = a1 xt-1 + b1 yt-1 + u1t
  yt = c1 xt-1 + d1 yt-1 + u2t

• More specifically, the question asked is: "What is the likely response of y at time t, t+1, t+2, etc. to a unitary exogenous shock on x at time t?" That is, what is likely to happen to yt over time if xt changes by one unit at time t?
IRF
• This result would be systematically wrong, since it would ignore the effect (d1 times any change in yt) which results from any change in u2t
• Recall that u1t and u2t are correlated: if u1t changes, then it is likely that u2t will alter as well, so that, from the second equation in the VAR, yt will alter by the same amount as u2t
• Clearly, a better tool for such an analysis would be an orthogonalized VAR, where the error terms are not correlated

  xt = a1 xt-1 + b1 yt-1 + u1t
  yt = δxt + c1* xt-1 + d1* yt-1 + u2t*
IRF
• Let us consider the VAR(1) model

  Zt = A1 Zt-1 + Ut
     = A1 (A1 Zt-2 + Ut-1) + Ut
     = A1² Zt-2 + Ut + A1 Ut-1
     = ………………
     = Σ(i=0 to n) A1^i Ut-i + A1^(n+1) Zt-n-1

• If the stability condition lim(n→∞) A1^n = 0 holds, then

  Zt = Σ(i=0 to ∞) A1^i Ut-i
IRF
• Such a form, representing a process which generates Zt as an infinite sum of lagged random errors weighted by generally diminishing coefficients, is called the vector moving average (VMA) representation
• For the two-variable VAR model, the VMA representation is

  [xt]    ∞   [a1 b1]^i  [u1,t-i]
  [yt] =  Σ   [c1 d1]    [u2,t-i]
         i=0

• But the errors are correlated, so we need to undertake orthogonalization
• We know u2t* = u2t - δu1t, where δ = σ12 / σ11
IRF
• The VMA model can be written as

  [xt]    ∞   [a1 b1]^i  [1  0]  [u1,t-i ]
  [yt] =  Σ   [c1 d1]    [δ  1]  [u2,t-i*]
         i=0

  [xt]    ∞   [φ11(i)  φ12(i)]  [u1,t-i ]
  [yt] =  Σ   [φ21(i)  φ22(i)]  [u2,t-i*]
         i=0

  Zt = Σ(i=0 to ∞) Φi et-i

• Since the residuals are orthogonal, the above equation can be used directly for tracking the dynamic responses of particular variables to a shock in u1t
• The matrices Φi are called the impulse response functions (IRF) and the vector et is called the vector of innovations
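Impulse responses of this kind can be computed directly. A minimal Cholesky-based sketch: Φi = A1^i P, where P is the lower Cholesky factor of the error covariance Σ, so shocks are one-standard-deviation orthogonalized innovations (all numbers illustrative):

```python
import numpy as np

# Sketch: orthogonalized impulse responses for a stable VAR(1).
# Phi_i = A1^i @ P, where P is the lower Cholesky factor of the
# error covariance Sigma. Numbers are illustrative.
A1 = np.array([[0.6, 0.2],
               [0.1, 0.5]])
Sigma = np.array([[1.0, 0.4],
                  [0.4, 1.0]])
P = np.linalg.cholesky(Sigma)

horizons = 6
Phi = [np.linalg.matrix_power(A1, i) @ P for i in range(horizons)]
# Phi[i][1, 0]: response of y at horizon i to a one-SD shock in x
print(np.round([Phi[i][1, 0] for i in range(horizons)], 3))
```

Because A1 has eigenvalues inside the unit circle, the responses die out as the horizon grows.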
Impulse Response
• The response of steel consumption to a one standard error (SE) shock in the equation for GDP growth is presented in Fig. 2.
• Steel consumption responds positively in the initial year after the shock but then responds negatively, eventually returning to its pre-shock level after a period of 4 years.
• As shown in Fig. 3, GDP growth does not seem to be responsive to a one-SE shock in the equation for the growth in steel consumption
• dlog(x): the growth rate is stationary
We need to check the order of the VAR:
go to View -> Lag Structure -> Lag Length Criteria
It could be either 0 or 3
Set the order of the VAR to 3 and re-estimate
VAR Equations
• DLOG(SENSEX) = 0.0009 + 0.02 DLOG(SENSEXt-1) - 0.027 DLOG(SENSEXt-2) - 0.02 DLOG(SENSEXt-3) - 0.07 DLOG(EXt-1) - 0.52 DLOG(EXt-2) - 0.65 DLOG(EXt-3) + 0.04 DLOG(OILt-1) - 0.007 DLOG(OILt-2) - 0.03 DLOG(OILt-3)
• DLOG(EX) = -7.87687119255e-05 - 0.001 DLOG(SENSEXt-1) - 0.03 DLOG(SENSEXt-2) + 0.02 DLOG(SENSEXt-3) - 0.01 DLOG(EXt-1) - 0.05 DLOG(EXt-2) + 0.08 DLOG(EXt-3) + 6.6e-06 DLOG(OILt-1) - 0.013 DLOG(OILt-2) + 0.01 DLOG(OILt-3)
• DLOG(OIL) = ......
After estimating the VAR(3), we need to check Granger causality:
go to View -> Lag Structure -> Granger Causality
Null hypothesis: non-causality
Non-causality from dlog(ex) to dlog(sensex): rejected
Non-causality from dlog(sensex) to dlog(ex): rejected
So, there is bi-directional causality between dlog(ex) and dlog(sensex)
As expected, dlog(ex) and dlog(sensex) do not Granger-cause dlog(oil)
VAR in levels or first-
differenced series?
• One debatable point concerns the use of the VAR model in levels or in first differences.
• If all of the variables used follow an I(0) process, the specification in levels is appropriate.
• However, as most time-series variables suffer from non-stationarity, the question of differencing arises.
• According to Hamilton (1994), one option is to ignore the non-stationarity altogether and simply estimate the VAR in levels, relying on the standard t- and F-distributions for testing any hypotheses.
VAR in levels or first-differenced
series?
• The other option is to difference any apparently non-stationary variables before estimating the VAR.
• If the true process is a VAR in differences, then differencing
should improve the small sample performance.
• The drawback to this approach is that the true process may
not be a VAR in differences.
• Some of the series may in fact have been stationary, or
perhaps some linear combinations of the series are
stationary, as in a cointegrated VAR.
• According to Hamilton (1994), in such circumstances, a
VAR in differenced form is mis-specified.
VAR in levels or first-differenced
series?
• The case of losing useful information by differencing, while
there are cointegration vectors in the system, is also argued
by Sims (1980) and Doan (1992).
• The other area of debate is whether an unrestricted VAR
should be used where the variables in the VAR are
cointegrated.
• There is a body of literature that supports the use of a vector
error correction model (VECM), or cointegrating VAR, in this
situation.
• It has been argued, however, that in the short term,
unrestricted VAR performs better than a cointegrated VAR
or VECM.
Advanced VAR & Causality
• Extended VAR (Toda & Yamamoto, 1995)
• Threshold VAR
• Markov-switching VAR
• Asymmetric VAR