
Linear regression model

FE5209 Financial Econometrics

Chao ZHOU

[email protected]

3rd Lecture


Textbook

This lecture follows closely Chapters 2 and 3 of Chris Brooks's book:
Introductory Econometrics for Finance, 2nd Edition, Chris Brooks,
Cambridge University Press, 2008. Chapters 2-8, 11. Software: EViews.
(E-book available on the website of NUS Libraries)


Outline

1 Linear regression model
    Simple linear regression model
    Multiple linear regression
    Examples


Linear regression model


Regression analysis is almost certainly the most important tool in
econometrics.
Univariate regression is an attempt to explain movements in a
variable (the dependent or response variable y) by reference to
movements in one or more other variables (the independent,
predictor or explanatory variables x_1, ..., x_p).
One important goal of regression is the prediction of future y values
when the corresponding values of x_1, ..., x_p are already available.
Data: y_t and x_{t,1}, ..., x_{t,p}, the values of these variables for
the t-th observation.

Simple linear regression model

Suppose that it is believed that y depends on only one x variable.
The model is

y_t = α + β x_t + ϵ_t    (1)

where the subscript t (= 1, 2, 3, ..., T) denotes the observation
number and ϵ_t is a random disturbance (or noise, or error) term.
α and β are called the intercept and the slope of the line.
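
A minimal sketch of model (1) in R, with simulated data standing in for
real observations (the true values α = 1, β = 0.5 below are made up for
illustration):

> set.seed(42)
> x = rnorm(200)                    # explanatory variable
> eps = rnorm(200, sd = 0.3)        # random disturbance term
> y = 1 + 0.5*x + eps               # simulated: alpha = 1, beta = 0.5
> m0 = lm(y ~ x); coef(m0)          # OLS estimates alpha-hat, beta-hat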


U.S. weekly interest rates

[Figure: U.S. weekly interest rates, 1-year (solid line) and 3-year,
plotted against the observation index.]


U.S. weekly interest rates: y1 versus y3

[Figure: scatter plot of the 3-year rate (y3) against the 1-year rate (y1).]


Notations

A hat (ˆ) over a variable or parameter is used to denote a value
estimated by a model.
α̂ and β̂ are the estimated values of α and β.
Let y_t denote the actual data point for observation t and let ŷ_t
denote the fitted value from the regression line.
Let ϵ̂_t denote the residual, which is the difference between the actual
value of y and the value fitted by the model for this data point, i.e.
(y_t − ŷ_t). ϵ̂_t can be understood as an estimate of ϵ_t.


Least squares estimation


The residual sum of squares (RSS), or the sum of squared residuals,
is defined as Σ_t ϵ̂_t².
So minimising the RSS is equivalent to minimising Σ_t (y_t − ŷ_t)².
The equation for the fitted line is given by ŷ_t = α̂ + β̂ x_t.
Let L denote the RSS, which is also known as a loss function:

L = Σ_{t=1}^{T} (y_t − ŷ_t)² = Σ_{t=1}^{T} (y_t − α̂ − β̂ x_t)²    (2)

where T is the number of observations.



L is minimised with respect to α̂ and β̂ to find the values of α and β
that give the line that is closest to the data.
The first order conditions say that

∂L/∂α̂ = −2 Σ_{t=1}^{T} (y_t − α̂ − β̂ x_t) = 0    (3)

∂L/∂β̂ = −2 Σ_{t=1}^{T} x_t (y_t − α̂ − β̂ x_t) = 0    (4)

(3) is equivalent to

Σ_{t=1}^{T} y_t − T α̂ − β̂ Σ_{t=1}^{T} x_t = 0
T ȳ − T α̂ − β̂ T x̄ = 0
α̂ = ȳ − β̂ x̄

Substituting α̂ into (4):

Σ_{t=1}^{T} x_t (y_t − ȳ + β̂ x̄ − β̂ x_t) = 0
Σ_{t=1}^{T} x_t y_t − ȳ Σ_{t=1}^{T} x_t + β̂ x̄ Σ_{t=1}^{T} x_t − β̂ Σ_{t=1}^{T} x_t² = 0
Σ_{t=1}^{T} x_t y_t − T ȳ x̄ + β̂ T x̄² − β̂ Σ_{t=1}^{T} x_t² = 0

Rearranging for β̂,

β̂ (T x̄² − Σ_{t=1}^{T} x_t²) = T ȳ x̄ − Σ_{t=1}^{T} x_t y_t


Therefore the coefficient estimators for the slope β and the intercept
α are the minimisers of L, given by

β̂ = (Σ x_t y_t − T x̄ ȳ)/(Σ x_t² − T x̄²) = Σ(x_t − x̄)(y_t − ȳ)/Σ(x_t − x̄)²

and α̂ = ȳ − β̂ x̄.

We can rewrite β̂ as [(1/(T−1)) Σ(x_t − x̄)(y_t − ȳ)] / [(1/(T−1)) Σ(x_t − x̄)²],
which is the sample covariance between x and y divided by the sample
variance of x.
This method of finding the optimum is known as ordinary least
squares (OLS) estimation.
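
As a sanity check, the closed-form estimators can be computed directly
in R and compared with lm() (simulated data, for illustration only):

> set.seed(42)
> x = rnorm(100); y = 1 + 0.5*x + rnorm(100, sd = 0.3)
> beta.hat = cov(x, y)/var(x)             # sample covariance / sample variance
> alpha.hat = mean(y) - beta.hat*mean(x)  # alpha-hat = ybar - beta-hat * xbar
> c(alpha.hat, beta.hat)
> coef(lm(y ~ x))                         # should match the manual estimates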


Least squares line

The least squares line is given by

ŷ_t = α̂ + β̂ x_t = ȳ + β̂ (x_t − x̄)
    = ȳ + [ (1/(T−1)) Σ(x_t − x̄)(y_t − ȳ) / ((1/(T−1)) Σ(x_t − x̄)²) ] (x_t − x̄)
    = ȳ + (s_xy / s_x²) (x_t − x̄)

where s_xy is the sample covariance between x and y and s_x² is the
sample variance of x.


Interpretation

The coefficient estimate for β, β̂, is interpreted as saying that, if x
increases by 1 unit, y will be expected, everything else being equal, to
increase by β̂ units.
α̂, the intercept coefficient estimate, is interpreted as the value that
would be taken by the dependent variable y if the independent
variable x took a value of zero.


Non-linear models with OLS


In order to use OLS, a model that is linear is required. More
specifically, the model must be linear in the parameters (α and β).
Models that are not linear in the variables can often be made to take
a linear form by applying a suitable transformation or manipulation.
(i) Exponential regression model: Y_t = A X_t^β e^{ϵ_t}.
Taking logarithms, we have ln Y_t = ln(A) + β ln X_t + ϵ_t. Let
α = ln(A), y_t = ln Y_t and x_t = ln X_t, so that
y_t = α + β x_t + ϵ_t.
(ii) y_t = α + β/z_t + ϵ_t.
The regression can be estimated using OLS by setting x_t = 1/z_t.
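
Both transformations are one-liners in R; the data below are simulated
purely to illustrate the idea (A = 2, β = 0.8 and the model for y2 are
made-up values):

> set.seed(1)
> X = exp(rnorm(200)); z = runif(200, 1, 5)
> Y = 2*X^0.8*exp(rnorm(200, sd = 0.1))  # exponential model with A = 2
> m.log = lm(log(Y) ~ log(X))            # case (i): intercept estimates ln(A)
> y2 = 1 + 3/z + rnorm(200, sd = 0.1)
> m.inv = lm(y2 ~ I(1/z))                # case (ii): regressor x_t = 1/z_t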

Non-linear models

On the other hand, some models are intrinsically non-linear, e.g.

y_t = α + β x_t^γ + ϵ_t

Such models might be estimated using a non-linear method.


Classical linear regression model


The model y_t = α + β x_t + ϵ_t with the assumptions
(1) The errors have zero mean, i.e. E(ϵ_t) = 0.
(2) The variance of the errors is constant and finite, i.e.
var(ϵ_t) = σ² < ∞.
(3) The errors are linearly independent of one another, i.e.
cov(ϵ_i, ϵ_j) = 0 for i ≠ j.
(4) There is no relationship between the error and the corresponding x
variate, i.e. cov(ϵ_t, x_t) = 0.
(5) ϵ_t is normally distributed, i.e. ϵ_t ∼ N(0, σ²).

Assumption (5) is required to make valid inferences about the
population parameters (the actual α and β) from the sample
parameters (α̂ and β̂) estimated using a finite amount of data.

Estimators

With these assumptions, another way to find an estimator for β is to
take the covariance with x_t on both sides of the model:

y_t = α + β x_t + ϵ_t
cov(x_t, y_t) = cov(x_t, α + β x_t + ϵ_t)
cov(x_t, y_t) = cov(x_t, β x_t)
β = cov(x_t, y_t) / cov(x_t, x_t)

Replacing the covariances by their sample counterparts gives, again,
β̂ = Σ(x_t − x̄)(y_t − ȳ) / Σ(x_t − x̄)².


Properties of the OLS estimator

Under assumptions (1)-(4) listed above, the OLS estimator can be
shown to have the desirable properties that it is consistent, unbiased
and efficient.
Consistency
Unbiasedness
Efficiency


Estimating the variance of the error term (σ²)

The variance of a random variable ϵ_t is given by

var(ϵ_t) = σ² = E[(ϵ_t − E(ϵ_t))²]

Under Assumption (1) above, this reduces to var(ϵ_t) = σ² = E[ϵ_t²].
To estimate E[ϵ_t²], the sample counterpart to ϵ_t, namely ϵ̂_t, is used:

σ̂² = (1/T) Σ ϵ̂_t²

An unbiased estimator is σ̂² = (1/(T−2)) Σ ϵ̂_t², where T is the number
of observations and 2 is the number of model parameters (i.e. α and β).
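
In R the unbiased estimate is one line, and can be checked against the
residual standard error that summary() reports (simulated fit, for
illustration):

> set.seed(7); x = rnorm(50); y = 1 + 0.5*x + rnorm(50)
> m = lm(y ~ x)
> sigma2.hat = sum(resid(m)^2)/(length(y) - 2)  # RSS/(T - 2)
> c(sigma2.hat, summary(m)$sigma^2)             # the two should agree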


Standard error of the regression

σ̂ = sqrt( (1/(T−2)) Σ ϵ̂_t² ) is known as the standard error of the
regression.
It is sometimes used as a broad measure of the fit of the regression
equation.
Everything else being equal, the smaller this quantity is, the closer is
the fit of the line to the actual data.


Standard errors of the estimators

Standard errors are used as a measure of the reliability or precision of
the estimators (α̂ and β̂); they give an idea of how 'good' these
estimates of α and β are.
Given Assumptions (1)-(4) above, valid estimators of the standard
errors can be shown to be given by

SE(α̂) = σ̂ sqrt( Σ x_t² / (T Σ(x_t − x̄)²) ) = σ̂ sqrt( Σ x_t² / (T(Σ x_t² − T x̄²)) )    (5)

SE(β̂) = σ̂ sqrt( 1 / Σ(x_t − x̄)² ) = σ̂ sqrt( 1 / ((T − 1) s_x²) )    (6)
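
Formula (6), for instance, can be checked against the coefficient table
that R prints (simulated fit, for illustration):

> set.seed(7); x = rnorm(50); y = 1 + 0.5*x + rnorm(50); m = lm(y ~ x)
> se.beta = summary(m)$sigma/sqrt(sum((x - mean(x))^2))   # formula (6)
> c(se.beta, summary(m)$coefficients["x", "Std. Error"])  # should agree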


Testing a single hypothesis: the t-test

Hypothesis testing: what can we say about the population (true)
values based on the estimated regression parameters from the sample
data?
There are always two hypotheses that go together, known as the null
hypothesis (denoted by H0) and the alternative hypothesis (denoted
by Ha).


Example
Given the regression results, it is of interest to test the hypothesis
that the true value of β is in fact 0.5:
H0: β = 0.5, Ha: β ≠ 0.5
This would be known as a two-sided test, since both β < 0.5 and
β > 0.5 are possible outcomes under Ha.
Sometimes, prior information may suggest that β > 0.5 would be
expected rather than β < 0.5; then a one-sided test would be:
H0: β = 0.5, Ha: β > 0.5
When β < 0.5 is expected, the other one-sided test would be:
H0: β = 0.5, Ha: β < 0.5

Test statistics
In very general terms, if the estimated value is a long way away from
the hypothesised value, the null hypothesis is likely to be rejected; if
the value under the null hypothesis and the estimated value are close
to one another, the null hypothesis is less likely to be rejected.
Under Assumption (5), it can be shown that the coefficient estimates
are also normally distributed:

α̂ ∼ N(α, var(α̂)) and β̂ ∼ N(β, var(β̂))

When Assumption (5) does not hold, the coefficient estimates still
approximately follow a normal distribution if all the other assumptions
hold and the sample size is sufficiently large.

Thus we have

(α̂ − α)/√var(α̂) ∼ N(0, 1) and (β̂ − β)/√var(β̂) ∼ N(0, 1)

Since the true standard errors √var(α̂) and √var(β̂) are unknown, one
replaces them by the sample estimates SE(α̂) and SE(β̂).
Then the statistics follow a t-distribution with T − 2 degrees of
freedom rather than a normal distribution:

(α̂ − α)/SE(α̂) ∼ t_{T−2} and (β̂ − β)/SE(β̂) ∼ t_{T−2}

These are called the t-test statistics.
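
A two-sided test of, say, H0: β = 0.5 can then be carried out by hand in
R (simulated data again; 0.5 is just the hypothesised value from the
earlier example):

> set.seed(7); x = rnorm(50); y = 1 + 0.5*x + rnorm(50); m = lm(y ~ x)
> est = coef(summary(m))["x", ]
> t.stat = (est["Estimate"] - 0.5)/est["Std. Error"]  # (beta-hat - beta*)/SE
> 2*pt(-abs(t.stat), df = df.residual(m))             # two-sided p-value, T - 2 df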


As in Lecture 2, we can use the confidence interval to conduct the
tests.
For instance, the null hypothesis that β = β* will not be rejected if
the test statistic lies within the non-rejection region

−t_{1−α/2} ≤ (β̂ − β*)/SE(β̂) ≤ t_{1−α/2}

which is equivalent to

−t_{1−α/2} · SE(β̂) ≤ β̂ − β* ≤ t_{1−α/2} · SE(β̂)

i.e. one would not reject if

β* − t_{1−α/2} · SE(β̂) ≤ β̂ ≤ β* + t_{1−α/2} · SE(β̂)
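
Equivalently, H0: β = β* is not rejected at significance level α exactly
when β* lies inside the corresponding confidence interval for β, which R
returns directly from a fitted model (here m is any simple regression
fit, such as the simulated one in the previous sketch):

> confint(m, "x", level = 0.95)  # 95% CI; do not reject H0 if beta* is inside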


Significance

If the null hypothesis is rejected at the 5% level, it would be said that
the result of the test is 'statistically significant'.
If the null hypothesis is not rejected, it would be said that the result
of the test is 'not significant', or that it is 'insignificant'.
Finally, if the null hypothesis is rejected at the 1% level, the result is
termed 'highly statistically significant'.


The exact significance level: the p-value

The exact significance level is also commonly known as the p-value. If
the test statistic is 'large' in absolute value, the p-value will be small,
and vice versa.
In fact, the null hypothesis is rejected if the p-value is smaller than
the significance level (α).
p-values are almost always provided automatically by software
packages.


Review: p-value

There are two possible errors that could be made:
1 Rejecting H0 when it was really true; this is called a type I error.
2 Not rejecting H0 when it was false; this is called a type II error.
Informally, the p-value is often referred to as the probability of
making a type I error.
Thus, for example, if a p-value of 0.05 or less leads the researcher to
reject the null (equivalent to a 5% significance level), this is
equivalent to saying that if the probability of incorrectly rejecting the
null hypothesis is more than 5%, do not reject it.


t-ratio test

If the test is

H0: β = 0, Ha: β ≠ 0

this is known as the t-ratio test. Here β* = 0, so the

test statistic = β̂ / SE(β̂)

This ratio is known as the t-ratio.


Multiple linear regression

It is very easy to generalise the simple linear regression model to the
multiple case:

y_t = β_0 + β_1 x_{1t} + ... + β_k x_{kt} + ε_t = x_t β + ε_t,  t = 1, ..., T,

where β is a (k + 1) × 1 vector and x_t = (1, x_{1t}, ..., x_{kt}) is the
t-th row of regressors. The variables {x_1, ..., x_k} are a set of k
explanatory variables which influence y.
In the simple linear regression model, there are 2 regressors (1 and x),
while in the above setting, there are k + 1 regressors (1 and
x_1, ..., x_k).


Centering the explanatory variables


β_0 is the expected value of y when all of the explanatory variables are
equal to 0.
However, frequently, 0 is outside the range of some explanatory
variables, making the interpretation of β_0 of little real interest.
In practice, we can center the explanatory variables.
If x_{i,1}, ..., x_{i,T} are the values of the i-th explanatory variable
and x̄_i is their mean, then (x_{i,1} − x̄_i), ..., (x_{i,T} − x̄_i) are
values of the centered explanatory variable.
If all explanatory variables are centered, then β_0 is the expected value
of y when all of the explanatory variables are equal to their mean.
This gives β_0 an interpretable meaning.
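
In R, centering is a single call to scale(); the sketch below (simulated
data) shows that only the intercept changes:

> set.seed(3); x1 = rnorm(80, mean = 10); y = 2 + 0.5*x1 + rnorm(80)
> xc = scale(x1, center = TRUE, scale = FALSE)  # subtract the mean only
> coef(lm(y ~ x1))  # intercept is the expected y at x1 = 0
> coef(lm(y ~ xc))  # same slope; intercept is now the expected y at x1 = xbar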

Matrix notation

In matrix notation we have y = xβ + ε, where y and ε are T × 1
vectors and x is a T × (k + 1) matrix.
For example, a simple linear regression can be written as

[ y_1 ]   [ 1  x_{11} ]           [ ϵ_1 ]
[ y_2 ] = [ 1  x_{12} ] [ β_0 ] + [ ϵ_2 ]
[ ... ]   [ ...   ... ] [ β_1 ]   [ ... ]
[ y_T ]   [ 1  x_{1T} ]           [ ϵ_T ]


OLS estimation

Minimise the RSS, L = Σ_{t=1}^{T} ε̂_t² = Σ_{t=1}^{T} (y_t − x_t β̂)².
Write L in matrix notation:

L = ϵ̂′ϵ̂ = (y − xβ̂)′(y − xβ̂) = y′y − β̂′x′y − y′xβ̂ + β̂′x′xβ̂    (7)

We can check easily that β̂′x′y = y′xβ̂, thus (7) can be written

L = y′y − 2β̂′x′y + β̂′x′xβ̂

The first order condition for the minimisation of L is

∂L/∂β̂ = −2x′y + 2x′xβ̂ = 0, which gives x′y = x′xβ̂.

Pre-multiplying both sides of the above equation by the inverse of
x′x, the vector of OLS coefficient estimates is given by

β̂ = (x′x)^{−1} x′y
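
The closed form translates directly into R; solving the normal equations
x′x β = x′y with solve() avoids forming the inverse explicitly
(simulated data, for illustration):

> set.seed(9); n = 100
> X = cbind(1, rnorm(n), rnorm(n))                 # columns: 1, x1, x2
> y = as.vector(X %*% c(1, 2, -1)) + rnorm(n)      # true beta = (1, 2, -1)
> beta.hat = solve(crossprod(X), crossprod(X, y))  # solves (x'x) beta = x'y
> cbind(beta.hat, coef(lm(y ~ X[, 2] + X[, 3])))   # matches lm()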

Testing multiple hypotheses: the F-test

The t-test was used to test single hypotheses, i.e. hypotheses
involving only one coefficient.
To test more than one coefficient simultaneously, the F-test is employed.
Under the F-test framework, two regressions are required, known as
the unrestricted and the restricted regressions.
The unrestricted regression is the one in which the coefficients are
freely determined by the data, as has been constructed previously.
The restricted regression is the one in which the coefficients are
restricted, i.e. the restrictions are imposed on some β's.


F-test statistic
The F-test statistic for testing multiple hypotheses is given by

F-test statistic = ((RRSS − URSS)/URSS) × ((T − k − 1)/m)    (8)

where
URSS = residual sum of squares from the unrestricted regression
RRSS = residual sum of squares from the restricted regression
m = number of restrictions
T = number of observations
k + 1 = number of regressors in the unrestricted regression
It can be shown that URSS ≤ RRSS.
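
Formula (8) can be computed from two fits in R, or obtained directly
from anova(); the sketch below tests β_1 = β_2 = 0 (m = 2 restrictions)
on simulated data:

> set.seed(11); n = 120
> x1 = rnorm(n); x2 = rnorm(n); x3 = rnorm(n)
> y = 1 + 0.8*x1 - 0.5*x2 + rnorm(n)        # x3 is irrelevant by construction
> m.u = lm(y ~ x1 + x2 + x3)                # unrestricted regression
> m.r = lm(y ~ x3)                          # restricted: beta1 = beta2 = 0
> URSS = deviance(m.u); RRSS = deviance(m.r)
> ((RRSS - URSS)/URSS)*(df.residual(m.u)/2) # formula (8) with m = 2
> anova(m.r, m.u)                           # same F statistic and p-value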

Examples

Informally, the number of restrictions can be seen as 'the number of
equality signs under the null hypothesis'. For example,

H0                                Number of restrictions m
β_1 + β_2 = 1                     1
β_1 = 1 and β_2 = 0               2
β_1 = 0, β_2 = 0 and β_3 = 0      3


"The" regression F-statistic


With softwares (e.g. R), in general, the F-statistic given in the output
of the regression function tests the null hypothesis that all of the
coecients except the intercept coecient are zero.
For example, if the model is,
y = β0 + β1 x1 + β2 x2 + β3 x3 + ϵ
then the null hypothesis tested by F-statistic is
β1 = 0 and β2 = 0 and β3 = 0,
and the alternative hypothesis is
β1 ̸= 0 or β2 ̸= 0 or β3 ̸= 0,
If this null hypothesis cannot be rejected, it would imply that none of
the explanatory variables in the model was able to explain variations
in y .

Ideas of the F-test

(i) If, after imposing the restrictions on the model, the resulting
residual sum of squares is not much higher than the unrestricted
model's residual sum of squares, it would be concluded that the
restrictions are supported by the data.
(ii) If the residual sum of squares increases considerably after the
restrictions are imposed, it would be concluded that the restrictions
are not supported by the data and therefore that the null hypothesis
should be rejected.


The test statistic follows the F-distribution under the null hypothesis,
more precisely F_{m, T−k−1}.
Single hypotheses involving one coefficient can be tested using a t-test
or an F-test (they always give the same conclusion; see the definitions
in Lecture 1), but multiple hypotheses can be tested only using an
F-test.
It is not possible to test hypotheses that are not linear or that are
multiplicative using this framework: for example, H0: β_1 β_2 = 3 or
H0: β_1² = 4 cannot be tested.


Goodness of fit statistics

Goodness of fit statistics are used to assess how well the estimated
regression model fits the data.
OLS selects the coefficient estimates that minimise the RSS, so the
lower the minimised value of the RSS, the better the model fits the
data.


Goodness of fit statistics: R²

A scaled version of the RSS is usually employed. The most common
goodness of fit statistic is known as R².

Total SS (TSS) = Explained SS (ESS) + RSS
Σ_t (y_t − ȳ)² = Σ_t (ŷ_t − ȳ)² + Σ_t (y_t − ŷ_t)²

R² = ESS/TSS = 1 − RSS/TSS

R² can be understood as the proportion of the total fluctuation of the
dependent variable, y, explained by the regression relation.
R² must always lie between zero and one. A higher R² implies,
everything else being equal, that the model fits the data better.

Problems with R²

However, there are a number of problems with R² as a goodness of
fit measure:
(i) R² never falls if more regressors are added to the regression, so it
is impossible to use R² as a determinant of whether a given variable
should be present in the model or not.
(ii) R² can take values of 0.9 or higher for time series regressions,
and hence it is not good at discriminating between models.


In order to get around problem (i), one can use the adjusted R²,
denoted by R̄²:

R̄² = 1 − ((T − 1)/(T − k − 1)) (1 − R²)    (9)

If an extra regressor (explanatory variable) is added to the model, k
increases and, unless R² increases by a more than offsetting amount,
R̄² will fall.
The rule is: include the variable if R̄² rises and do not include it if
R̄² falls.
R̄² is always smaller than R². Also, R̄² may take negative values,
even with an intercept in the regression, if the model fits the data
very poorly.
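
Both R² and R̄² are easy to reproduce from a fitted model; the snippet
below (simulated data) checks formula (9) against summary():

> set.seed(5); n = 60; x1 = rnorm(n); x2 = rnorm(n)
> y = 1 + 0.7*x1 + rnorm(n)                  # x2 is an irrelevant regressor
> s = summary(lm(y ~ x1 + x2)); k = 2
> rbar2 = 1 - (n - 1)/(n - k - 1)*(1 - s$r.squared)  # formula (9)
> c(rbar2, s$adj.r.squared)                  # should agree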

Model selection criteria

When there are many potential explanatory variables, we often wish
to find a subset of them that provides a good regression model.
For linear regression, AIC (Akaike's information criterion) is

AIC = T log(σ̂²) + 2(1 + k)

where 1 + k is the number of regressors.
BIC (the Bayesian information criterion) replaces 2(1 + k) in AIC by
log(T)(1 + k).
Mallows's Cp: suppose there are M explanatory variables. Let σ̂²_M be
the estimate of σ² using all of them, and let RSS(p) be the residual
sum of squares for a model with some subset of only p ≤ M of the
predictors. Then

Cp = RSS(p)/σ̂²_M − T + 2(p + 1)

Adjusted R² (R̄²), as defined earlier.
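
AIC and BIC are available in R via AIC() and BIC(). Note that R computes
them from the full Gaussian log-likelihood, so the values differ from the
slide's formula by an additive constant that does not depend on the model;
the rankings are unchanged. A sketch with simulated data:

> set.seed(5); n = 60; x1 = rnorm(n); x2 = rnorm(n)
> y = 1 + 0.7*x1 + rnorm(n)           # x2 is irrelevant by construction
> m1 = lm(y ~ x1); m2 = lm(y ~ x1 + x2)
> c(AIC(m1), AIC(m2))                 # smaller is better
> c(BIC(m1), BIC(m2))                 # BIC penalises extra regressors more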

Model selection criteria

With Cp, AIC, and BIC, smaller values are better, but for adjusted R²
(R̄²), larger values are better.


Linear Market Model


Linear market model:

r_t = α + β r_{m,t} + ϵ_t,

where {r_t} are the log returns of a single stock (e.g. GM) and
{r_{m,t}} the log returns of an index (e.g. the S&P composite index).

> gm=log(da[,2]+1); sp=log(da[,3]+1)  # compute log returns
> m1=lm(gm ~ sp); summary(m1)         # fit the market model
Call: lm(formula = gm ~ sp)
Coefficients:
             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept) -0.004861    0.003434   -1.415     0.158
sp           1.072508    0.077177   13.897   < 2e-16 ***
Residual standard error: 0.07652 on 500 degrees of freedom
Multiple R-squared: 0.2786, Adjusted R-squared: 0.2772

Nonlinear Market Model


Simple nonlinear market model:

r_t = α_1 + β_1 r_{m,t} + ϵ_t,  if r_{m,t} ≤ 0,
r_t = α_2 + β_2 r_{m,t} + ϵ_t,  if r_{m,t} > 0.

> idx=c(1:502)[sp <= 0]  # locate the non-positive market returns
> nsp=rep(0,502); nsp[idx]=sp[idx]; c1=rep(0,502); c1[idx]=1
> m2=lm(gm ~ c1+sp); m3=lm(gm ~ nsp+sp)
> m4=lm(gm ~ sp+c1+nsp); summary(m4)
Coefficients:
             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept) -0.007778    0.007369   -1.055    0.2917
sp           1.041129    0.176838    5.887  7.21e-09 ***
c1           0.020713    0.010550    1.963    0.0502 .
nsp          0.387630    0.236399    1.640    0.1017

Log prices (daily) of BHP (01/07/2002 - 31/03/2006)

[Figure: BHP log close prices (adjusted), plotted against time, 2003-2006.]


Log prices of VALE (01/07/2002 - 31/03/2006)

[Figure: VALE log close prices (adjusted), plotted against time, 2003-2006.]


Log prices (daily) (01/07/2002 - 31/03/2006)

[Figure: two panels, BHP and VALE log close prices plotted against time,
2003-2006.]


Log prices (daily) (01/07/2002 - 31/03/2006)

[Figure: scatter plot of BHP log prices (bhp) against VALE log prices (vale).]


Pairs trading of BHP and VALE: linear regression

[Figure: time series of the spread w_t = BHP − 0.717 VALE, plotted against
the observation index.]


U.S. weekly interest rates: y1 versus y3

[Figure: scatter plot of the 3-year rate (y3) against the 1-year rate (y1).]

U.S. weekly interest rates: regression y1 y3

[Figure: ACF of m1$residuals, lags 0 to 30.]


U.S. weekly interest rates: r1 versus r3

[Figure: scatter plot of r3 against r1.]


U.S. weekly interest rates: regression r1 r3

[Figure: ACF of m2$residuals, lags 0 to 30.]


Thank you for your attention.
