Violation of CLRM
Prof. Rishman Jot Kaur Chahal
HSC - 205
Department of Humanities and Social Sciences
Indian Institute of Technology Roorkee
Autocorrelation
Detecting Autocorrelation
Graphically: A visual examination of the estimated residuals provides clues about the presence of autocorrelation. We can simply plot them against time, producing the time sequence plot.
Durbin–Watson d test: The most celebrated test for detecting serial correlation was developed by the statisticians Durbin and Watson. The d-statistic is

$$ d = \frac{\sum_{t=2}^{n} (\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=1}^{n} \hat{u}_t^2} \qquad (1) $$
which is simply the ratio of the sum of squared differences in
successive residuals to the Residual Sum of Squares (RSS).
Note that in the numerator of the d statistic the number of
observations is n − 1, because one observation is lost in taking
successive differences.
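The d statistic can be computed directly from the residuals. A minimal sketch in Python (the residual values below are made up purely for illustration):

```python
# Durbin-Watson d: sum of squared successive residual differences over the RSS.
def durbin_watson(resid):
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    rss = sum(e ** 2 for e in resid)
    return num / rss

# Hypothetical residuals from some OLS fit (illustrative values only).
resid = [0.5, 0.4, 0.6, -0.2, -0.5, -0.1, 0.3]
d = durbin_watson(resid)
print(round(d, 3))  # note only n - 1 terms enter the numerator
```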
The d statistic is based on the estimated residuals, which is a great advantage.
However, the statistic rests on the following assumptions:
1. The regression model includes the intercept term. If it is not present,
as in the case of the regression through the origin, it is essential to
rerun the regression including the intercept term to obtain the RSS.
2. The explanatory variables, the X’s, are nonstochastic, or fixed in
repeated sampling.
3. The disturbances ut are generated by the first-order autoregressive
scheme: ut = ρut−1 + ϵt . Therefore, it cannot be used to detect
higher-order autoregressive schemes.
4. The error term ut is assumed to be normally distributed.
5. The regression model does not include the lagged value(s) of the
dependent variable as one of the explanatory variables. Thus, the test
is inapplicable in models of the following type:
Yt = β1 + β2 X2t + β3 X3t + ... + βk Xkt + γYt−1 + ut
where Yt−1 is the one-period lagged value of Y.
6. There are no missing observations in the data.
Deriving the exact sampling distribution or probability distribution of
the d statistic is difficult, as it depends in a complicated way on the X
values.
So, unlike the t, F, and χ² tests, there is no unique critical value that leads us to
accept or reject the null hypothesis of no first-order serial correlation in the
disturbances (ut).
So, what did Durbin and Watson do?
Durbin and Watson derived a lower bound (dL) and an upper bound
(dU) such that if the computed d lies outside these critical values, a
decision can be made regarding the presence of positive or negative
serial correlation.
Simplify the d statistic as follows:

$$ d = \frac{\sum \hat{u}_t^2 + \sum \hat{u}_{t-1}^2 - 2 \sum \hat{u}_t \hat{u}_{t-1}}{\sum \hat{u}_t^2} $$

Note that

$$ \sum \hat{u}_{t-1}^2 \approx \sum \hat{u}_t^2 $$
Thus,

$$ d \approx 2\left(1 - \frac{\sum \hat{u}_t \hat{u}_{t-1}}{\sum \hat{u}_t^2}\right) $$

Now define the sample first-order autocorrelation coefficient

$$ \hat{\rho} = \frac{\sum \hat{u}_t \hat{u}_{t-1}}{\sum \hat{u}_t^2} $$

so that d ≈ 2(1 − ρ̂). Since −1 ≤ ρ̂ ≤ 1, it follows that 0 ≤ d ≤ 4.
These are the bounds of d; any estimated d value must lie within
these limits.
Rule of Thumb: If d is found to be about 2 in an application, one may
assume that there is no first-order autocorrelation, either positive or
negative.
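The relation between d and the sample autocorrelation coefficient can be checked numerically. A sketch, simulating AR(1) residuals with a made-up ρ = 0.6 (all values illustrative):

```python
import random

random.seed(0)
# Simulate AR(1) residuals: u_t = 0.6 * u_{t-1} + eps_t (illustrative only).
n, rho = 500, 0.6
resid = [0.0]
for _ in range(n - 1):
    resid.append(rho * resid[-1] + random.gauss(0.0, 1.0))

rss = sum(e ** 2 for e in resid)
d = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, n)) / rss
rho_hat = sum(resid[t] * resid[t - 1] for t in range(1, n)) / rss

# d and 2(1 - rho_hat) differ only by the endpoint terms dropped
# in the approximation, so they should be close for large n.
print(round(d, 3), round(2 * (1 - rho_hat), 3))
```

With positive autocorrelation, d falls below 2, as the rule of thumb suggests.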
Decision Rule:
Figure: Gujarati et al., 2009 fifth edition
But how do we get dL and dU? From the Durbin–Watson tables, given
the sample size and the number of explanatory variables.
The mechanics of the Durbin–Watson test:
Run the OLS regression and obtain the residuals.
Compute d from Eq. (1).
Now find the critical values of dL and dU for the given sample size and
given number of explanatory variables.
Follow the decision rules given in the figure above.
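The mechanics above can be sketched end-to-end. The data are simulated (the slope, intercept, and AR coefficient are made-up values), and dL, dU are the tabulated 5% bounds for n = 46 with one explanatory variable:

```python
import random

random.seed(1)
n = 46
x = [float(t) for t in range(n)]
# Disturbances follow an AR(1) scheme so the test has something to detect.
u = [0.0]
for _ in range(n - 1):
    u.append(0.7 * u[-1] + random.gauss(0.0, 1.0))
y = [2.0 + 0.5 * x[t] + u[t] for t in range(n)]

# Step 1: OLS for the two-variable model (slope = cov(x, y) / var(x)).
mx, my = sum(x) / n, sum(y) / n
b2 = sum((x[t] - mx) * (y[t] - my) for t in range(n)) / sum((v - mx) ** 2 for v in x)
b1 = my - b2 * mx
resid = [y[t] - b1 - b2 * x[t] for t in range(n)]

# Step 2: compute d from Eq. (1).
d = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, n)) / sum(e ** 2 for e in resid)

# Steps 3-4: compare with the tabulated bounds and decide.
dL, dU = 1.475, 1.566
if d < dL:
    verdict = "evidence of positive serial correlation"
elif d > 4 - dL:
    verdict = "evidence of negative serial correlation"
elif dU <= d <= 4 - dU:
    verdict = "no first-order serial correlation detected"
else:
    verdict = "inconclusive"
print(round(d, 3), "->", verdict)
```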
Detecting Autocorrelation from an Example
Let us consider the example of the wages–productivity regression, where
wages are affected by the productivity of individuals.
From the Durbin–Watson tables, we can see that for n = 46 and one
explanatory variable, dL = 1.475 and dU = 1.566 at the 5 percent level of
significance.
Suppose the estimated d value of the regression is 0.2175.
Since the computed d of 0.2175 lies below dL, we reject the null
hypothesis of no positive serial correlation: there is evidence of positive
autocorrelation in the residuals.
Detecting Autocorrelation through a General Test: Breusch-Godfrey (BG) Test
Also known as the LM test, as it is based on the Lagrange multiplier principle.
Consider a two-variable regression model:
Yt = β1 + β2 Xt + ut (2)
Assume that the error term ut follows the pth-order autoregressive,
AR(p), scheme as follows:
ut = ρ1 ut−1 + ρ2 ut−2 + ... + ρp ut−p + ϵt
The null hypothesis is of no serial correlation i.e. H0 :
ρ1 = ρ2 = ρ3 = ... = ρp = 0.
Steps for BG Test:
Estimate the model by OLS and obtain the residuals ût.
Regress ût on the original Xt and the additional lagged residuals ût−1,
ût−2, ..., ût−p. So, if p = 5, we will include five lagged values of the
residuals as additional regressors:

$$ \hat{u}_t = \alpha_1 + \alpha_2 X_t + \hat{\rho}_1 \hat{u}_{t-1} + \dots + \hat{\rho}_p \hat{u}_{t-p} + \epsilon_t $$
Obtain its R².
If the sample size is large (technically, infinite), Breusch and Godfrey
have shown that

$$ (n - p)\, R^2 \sim \chi^2_p $$
If (n − p)R 2 exceeds the critical chi-square value at the chosen level of
significance, we reject the null hypothesis.
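The steps can be sketched with NumPy for p = 2 (the data, coefficients, and AR structure are all made up; the 5% χ² critical value for 2 degrees of freedom is 5.991):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 2
x = rng.normal(size=n)
# Disturbances follow an AR(2) scheme (illustrative coefficients).
u = np.zeros(n)
eps = rng.normal(size=n)
for t in range(2, n):
    u[t] = 0.5 * u[t - 1] + 0.2 * u[t - 2] + eps[t]
y = 1.0 + 2.0 * x + u

# Step 1: estimate the model by OLS and obtain the residuals.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
uhat = y - X @ beta

# Step 2: regress uhat on x and the lagged residuals uhat_{t-1}, uhat_{t-2}
# (the first p observations are dropped so the lags are defined).
Z = np.column_stack([np.ones(n - p), x[p:], uhat[1:-1], uhat[:-2]])
gamma, *_ = np.linalg.lstsq(Z, uhat[p:], rcond=None)
resid_aux = uhat[p:] - Z @ gamma
r2 = 1.0 - resid_aux @ resid_aux / ((uhat[p:] - uhat[p:].mean()) ** 2).sum()

# Step 3: (n - p) R^2 is asymptotically chi-square with p degrees of freedom.
lm = (n - p) * r2
print(round(lm, 2), "reject H0" if lm > 5.991 else "fail to reject H0")
```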
Remedies
One must first analyse whether the autocorrelation is pure or the result
of model misspecification: sometimes important variables are excluded
from the model, and their omission shows up as autocorrelation in the
residuals.
If pure autocorrelation remains, one can transform the model.
As in the case of heteroscedasticity, we use the generalized
least-squares (GLS) method.
In large samples, we can use the Newey–West method to obtain
standard errors of OLS estimators.
GLS to correct pure Autocorrelation
If the coefficient of first-order autocorrelation ρ is known, the problem
of autocorrelation can be easily solved. Consider a two-variable
regression model as:
Yt = β1 + β2 Xt + ut (3)
Say ut = ρut−1 + ϵt, where −1 < ρ < 1. Lag Eq. (3) by one period:
Yt−1 = β1 + β2 Xt−1 + ut−1 (4)
Multiply both sides of Eq. (4) by ρ and subtract the result from Eq. (3):

$$ Y_t - \rho Y_{t-1} = \beta_1 (1 - \rho) + \beta_2 (X_t - \rho X_{t-1}) + \epsilon_t $$

where ϵt = ut − ρut−1.
Thus,
Yt∗ = β1∗ + β2∗ Xt∗ + ϵt (5)
Now this error term satisfies the usual OLS assumptions and one can
apply OLS to this transformed model to estimate the betas.
Recall that GLS is nothing but OLS applied to the transformed model
that satisfies the classical assumptions.
Eq. (5) is also known as the generalized difference equation.
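A sketch of the transformation with a known ρ (the value of ρ and the data points are made-up illustrative values):

```python
# Quasi-differencing with a known rho (illustrative rho and data).
rho = 0.6
y = [2.1, 2.9, 3.4, 4.2, 4.8, 5.9, 6.3, 7.1]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

# Transformed variables: one observation is lost in differencing.
y_star = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
x_star = [x[t] - rho * x[t - 1] for t in range(1, len(x))]

# OLS on the transformed model; the intercept estimates beta1 * (1 - rho).
n = len(y_star)
mx, my = sum(x_star) / n, sum(y_star) / n
b2 = sum((x_star[t] - mx) * (y_star[t] - my) for t in range(n)) / sum(
    (v - mx) ** 2 for v in x_star
)
b1_star = my - b2 * mx
b1 = b1_star / (1 - rho)  # recover the original intercept
print(round(b2, 3), round(b1, 3))
```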
When ρ is not known
First-Differenced Model: if ρ = 1, the generalized difference equation reduces to

$$ Y_t - Y_{t-1} = \beta_2 (X_t - X_{t-1}) + (u_t - u_{t-1}) $$

$$ \Delta Y_t = \beta_2 \Delta X_t + \epsilon_t \qquad (6) $$

where ∆ is the first-difference operator.
The first-difference transformation may be appropriate if the
coefficient of autocorrelation is very high.
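A minimal sketch of the first-difference transformation (data are illustrative; note the regression is through the origin, since the intercept drops out):

```python
# First-difference transform (appropriate when rho is close to 1).
y = [2.1, 2.9, 3.4, 4.2, 4.8, 5.9, 6.3, 7.1]  # made-up data
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
dy = [y[t] - y[t - 1] for t in range(1, len(y))]
dx = [x[t] - x[t - 1] for t in range(1, len(x))]

# Regression through the origin: beta2 = sum(dx * dy) / sum(dx^2).
b2 = sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)
print(round(b2, 3))
```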
Example: consider our wages–productivity regression, rerun in
first-difference form. You will get the following results:
$$ \widehat{\Delta Y_t} = 0.653\, \Delta X_t $$

where t = 11.40, r² = 0.426, and d = 1.7442.
The d value has increased dramatically, perhaps indicating that there
is little autocorrelation in the first difference regression.