Simple Linear Regression
Carlos Carvalho
The University of Texas McCombs School of Business
mccombs.utexas.edu/faculty/carlos.carvalho/teaching
Today’s Plan
Linear Prediction
Ŷi = b0 + b1 Xi
The values of b0 and b1 that minimize the least squares criterion are:

b1 = corr(X, Y) × sY/sX   and   b0 = Ȳ − b1 X̄

where

sY = √( Σᵢ₌₁ⁿ (Yi − Ȳ)² )   and   sX = √( Σᵢ₌₁ⁿ (Xi − X̄)² )
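As a quick illustration, here is a minimal NumPy sketch of these formulas (the data and variable names are my own, not the course's housing dataset):

```python
import numpy as np

# Toy data: house size (thousands of sq. ft.) vs. price ($1000s).
x = np.array([0.8, 1.1, 1.5, 1.9, 2.4])
y = np.array([70.0, 95.0, 110.0, 128.0, 155.0])

# sY and sX as defined above (a 1/(n-1) factor would cancel in the ratio)
s_y = np.sqrt(np.sum((y - y.mean()) ** 2))
s_x = np.sqrt(np.sum((x - x.mean()) ** 2))

r = np.corrcoef(x, y)[0, 1]      # corr(X, Y)
b1 = r * s_y / s_x               # slope
b0 = y.mean() - b1 * x.mean()    # intercept

print(b0, b1)                    # matches np.polyfit(x, y, 1)[::-1]
```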
Review: Covariance and Correlation
Covariance and correlation measure the direction and strength of the linear relationship between the variables Y and X. The direction is given by the sign of the covariance:

Cov(Y, X) = Σᵢ₌₁ⁿ (Yi − Ȳ)(Xi − X̄) / (n − 1)
Correlation and Covariance
corr(X, Y) = cov(X, Y) / √( var(X) var(Y) ) = cov(X, Y) / ( sd(X) sd(Y) )
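A one-line check of this identity with NumPy (toy arrays assumed):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 1.0, 4.0, 3.0])

cov_xy = np.cov(x, y)[0, 1]   # sample covariance, n-1 in the denominator
corr_xy = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))

assert np.isclose(corr_xy, np.corrcoef(x, y)[0, 1])
```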
Correlation
[Figure: four scatterplots of standardized data illustrating corr = 1, corr = .5, corr = .8, and corr = −.8]
Correlation
Correlation only measures linear relationships: corr(X, Y) = 0 does not mean the variables are not related!
[Figure: two scatterplots of strongly nonlinear relationships for which corr(X, Y) ≈ 0]
Back to Least Squares
1. Intercept:

   b0 = Ȳ − b1 X̄  ⇒  Ȳ = b0 + b1 X̄

2. Slope:

   b1 = corr(X, Y) × sY/sX
From now on, the terms "fitted values" (Ŷi) and "residuals" (ei) refer to those obtained from the least squares line.

The fitted values and residuals have some special properties. Let's look at the housing data analysis to figure out what these properties are...
The Fitted Values and X
[Figure: fitted values plotted against X for the housing data; corr(y.hat, x) = 1]
The Residuals and X
[Figure: residuals plotted against X for the housing data; corr(e, x) = 0 and mean(e) = 0]
Why?
What is the intuition for the relationship between Ŷ, e, and X? Let's consider a "crazy" alternative line:
[Figure: the housing data with the crazy line 10 + 50 X overlaid on Y vs. X]
Fitted Values and Residuals
This is a bad fit! We are underestimating the value of small houses
and overestimating the value of big houses.
[Figure: residuals from the crazy line plotted against X; corr(e, x) = −0.7 and mean(e) = 1.8]
Fitted Values and Residuals
In summary: Y = Ŷ + e, where:
- Ŷ is "made from X"; corr(X, Ŷ) = 1.
- e is unrelated to X; corr(X, e) = 0.
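These properties are easy to verify numerically. A minimal sketch with simulated data (names and numbers are my own):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.5, 3.0, size=200)            # house sizes (toy)
y = 40 + 45 * x + rng.normal(0, 10, size=200)  # prices (toy)

b1, b0 = np.polyfit(x, y, 1)   # least squares slope and intercept
y_hat = b0 + b1 * x
e = y - y_hat

print(np.corrcoef(x, y_hat)[0, 1])   # 1.0 (y_hat is a linear function of x)
print(np.corrcoef(x, e)[0, 1])       # ~0, up to floating point error
print(e.mean())                      # ~0
```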
Another way to derive things
The intercept: the residuals average to zero,

(1/n) Σᵢ₌₁ⁿ ei = 0  ⇒  (1/n) Σᵢ₌₁ⁿ (Yi − b0 − b1 Xi) = 0
                    ⇒  Ȳ − b0 − b1 X̄ = 0
                    ⇒  b0 = Ȳ − b1 X̄
Another way to derive things
The slope: the residuals are uncorrelated with X,

corr(e, X) = 0  ⇒  Σᵢ₌₁ⁿ ei (Xi − X̄) = 0

Σᵢ₌₁ⁿ (Yi − b0 − b1 Xi)(Xi − X̄) = 0

Σᵢ₌₁ⁿ (Yi − Ȳ − b1(Xi − X̄))(Xi − X̄) = 0   [substituting b0 = Ȳ − b1 X̄]

⇒  b1 = Σᵢ₌₁ⁿ (Xi − X̄)(Yi − Ȳ) / Σᵢ₌₁ⁿ (Xi − X̄)²  =  rxy × sy/sx
Decomposing the Variance
This leads to

Σᵢ₌₁ⁿ (Yi − Ȳ)² = Σᵢ₌₁ⁿ (Ŷi − Ȳ)² + Σᵢ₌₁ⁿ ei²

(SST = SSR + SSE).
Decomposing the Variance – ANOVA Tables

[Figure: ANOVA table from the regression output, showing the SSR/SSE/SST decomposition]
A Goodness of Fit Measure: R²

R² = SSR/SST = 1 − SSE/SST

- 0 ≤ R² ≤ 1.
- The closer R² is to 1, the better the fit.
A Goodness of Fit Measure: R²

R² = Σᵢ₌₁ⁿ (Ŷi − Ȳ)² / Σᵢ₌₁ⁿ (Yi − Ȳ)²
   = Σᵢ₌₁ⁿ (b0 + b1 Xi − b0 − b1 X̄)² / Σᵢ₌₁ⁿ (Yi − Ȳ)²
   = b1² Σᵢ₌₁ⁿ (Xi − X̄)² / Σᵢ₌₁ⁿ (Yi − Ȳ)²  =  b1² sx²/sy²  =  rxy²
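A quick numerical check that R² = rxy² (toy simulation again; names assumed):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0.5, 3.0, size=200)
y = 40 + 45 * x + rng.normal(0, 10, size=200)

b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

ssr = np.sum((y_hat - y.mean()) ** 2)   # regression sum of squares
sst = np.sum((y - y.mean()) ** 2)       # total sum of squares

r2 = ssr / sst
assert np.isclose(r2, np.corrcoef(x, y)[0, 1] ** 2)
```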
Back to the House Data

[Figure: least squares output for the housing regression]
Prediction and the Modelling Goal
Ŷ = f(X) = b0 + b1 X
Ŷ is not going to be a perfect prediction.
We need to devise a notion of forecast accuracy.
The Simple Linear Regression Model
Y = β0 + β1 X + ε,   ε ∼ N(0, σ²)
Independent Normal Additive Error
- E[ε] = 0 ⇔ E[Y | X] = β0 + β1 X
  (E[Y | X] is the "conditional expectation of Y given X").
- Many things are close to Normal (central limit theorem).
- MLE estimates for the β's are the same as the LS b's.
- It works! This is a very robust model for the world.
The Regression Model and our House Data

[Figure: the regression model overlaid on the housing data]
Conditional Distributions
The conditional distribution for Y given X is Normal:

Y | X ∼ N(β0 + β1 X, σ²).

σ controls dispersion:

[Figure: conditional distributions of Y at several values of X, each with spread σ around the line]
Conditional vs Marginal Distributions
- Mean is E[Y | X] = E[β0 + β1 X + ε] = β0 + β1 X.
- Variance is var(Y | X) = var(β0 + β1 X + ε) = var(ε) = σ².
Prediction Intervals with the True Model
You are told (without looking at the data) that

β0 = 40;  β1 = 45;  σ = 10

and you are asked to predict the price of a 1500 square foot house:

Y = 40 + 45(1.5) + ε = 107.5 + ε,   so   Y ∼ N(107.5, 10²)
Prediction Intervals with the True Model

The model says that the mean value of a 1500 sq. ft. house is $107,500 and that, with roughly 95% probability, deviations from the mean stay within ≈ $20,000 (two standard deviations, 2σ = 20).
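As a sketch, the same interval computed with scipy.stats (the numbers come from the slide; the 95% level is my assumption):

```python
from scipy import stats

beta0, beta1, sigma = 40.0, 45.0, 10.0
x_f = 1.5                      # 1500 sq. ft., in thousands

mean = beta0 + beta1 * x_f     # 107.5, i.e. $107,500
lo, hi = stats.norm.interval(0.95, loc=mean, scale=sigma)
print(lo, hi)                  # roughly 87.9 to 127.1 ($1000s)
```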
Summary of Simple Linear Regression
Assume that all observations are drawn from our regression model
and that errors on those observations are independent.
The model is
Yi = β0 + β1 Xi + εi
Key Characteristics of Linear Regression Model
- Mean of Y is linear in X.
- Error terms (deviations from the line) are normally distributed (very few deviations are more than 2 sd away from the regression mean).
- Error terms have constant variance.
Break
Back in 15 minutes...
Recall: Estimation for the SLR Model
SLR assumes every observation in the dataset was generated by the model

Yi = β0 + β1 Xi + εi

and we estimate the unknown coefficients by least squares:

β̂0 = b0 = Ȳ − b1 X̄   and   β̂1 = b1 = rxy × sy/sx
Estimation of Error Variance
Recall that εi ∼ N(0, σ²) iid, and that σ drives the width of the prediction intervals. We estimate σ² with the sample variance of the residuals,

s² = (1/(n − 2)) Σᵢ₌₁ⁿ ei²
Degrees of Freedom
For example, consider SST = Σᵢ₌₁ⁿ (Yi − Ȳ)²:
- If n = 1, Ȳ = Y1 and SST = 0: since Y1 is "used up" estimating the mean, we haven't observed any variability!
- For n > 1, we've only had n − 1 chances for deviation from the mean, and we estimate sy² = SST/(n − 1).
Estimation of Error Variance

Where is s in the Excel output?

[Figure: Excel regression output; s appears as the "Standard Error" entry in the regression statistics]
Sampling Distribution of Least Squares Estimates

[Figures: how the estimates b0 and b1 vary across repeated samples from the same model]
Review: Sampling Distribution of Sample Mean
Step back for a moment and consider the mean for an iid sample of n observations of a random variable {X1, …, Xn}:

- E(X̄) = (1/n) Σ E(Xi) = µ
- var(X̄) = var( (1/n) Σ Xi ) = (1/n²) Σ var(Xi) = σ²/n

If X is normal, then X̄ ∼ N(µ, σ²/n).
Oracle vs SAP

[Figure: Oracle vs SAP example]
Central Limit Theorem
The simple CLT states that for iid random variables X with mean µ and variance σ², the distribution of the sample mean becomes normal as the number of observations, n, gets large. That is, X̄ → N(µ, σ²/n), so sample averages tend to be normally distributed in large samples.
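A minimal simulation sketch of this. The exponential distribution is my choice here; it is far from normal but has the E[X] = 1, var(X) = 1 of the figure below:

```python
import numpy as np

rng = np.random.default_rng(2)

# Exponential(1) has mean 1 and variance 1, but is heavily skewed.
n, reps = 100, 10_000
draws = rng.exponential(scale=1.0, size=(reps, n))
means = draws.mean(axis=1)

print(means.mean())   # ~1   (= mu)
print(means.std())    # ~0.1 (= sigma / sqrt(n) = 1/10)
# A histogram of `means` looks very close to N(1, 1/100).
```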
Central Limit Theorem

[Figure: a skewed density with E[X] = 1 and var(X) = 1]
[Figures: histograms of sample means from this distribution for increasing n, looking more and more normal]
Sampling Distribution of b1
var(b1) = σ² / Σᵢ₌₁ⁿ (Xi − X̄)² = σ² / ((n − 1) sx²)

Three factors drive this variance: sample size (n), error variance (σ² = σε²), and the spread of X (sx).
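A short simulation check of this formula (toy design and names assumed):

```python
import numpy as np

rng = np.random.default_rng(3)
n, sigma = 50, 10.0
x = rng.uniform(0.5, 3.0, size=n)          # fixed design, reused every replication

slopes = []
for _ in range(20_000):
    y = 40 + 45 * x + rng.normal(0, sigma, size=n)
    slopes.append(np.polyfit(x, y, 1)[0])  # slope estimate for this sample

theory = sigma**2 / np.sum((x - x.mean()) ** 2)
print(np.var(slopes), theory)              # the two should be close
```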
Sampling Distribution of b0
σb0² = var(b0) = σ² × ( 1/n + X̄² / ((n − 1) sx²) )
The Importance of Understanding Variation

[Figures: how fitted lines vary from sample to sample]
Estimated Variance

In practice σ is unknown, so we plug in the estimate s; this yields the estimated standard errors sb0 and sb1 used below.
Normal and Student’s t
For example:
- Ȳ ∼ tn−1(µ, sy²/n).
- b0 ∼ tn−2(β0, sb0²) and b1 ∼ tn−2(β1, sb1²).
Standardized Normal and Student’s t
(bj − βj)/σbj ∼ N(0, 1)   ⟹   (bj − βj)/sbj ∼ tn−2(0, 1)
Testing and Confidence Intervals (in 3 slides)

Suppose Zn−p is distributed tn−p(0, 1). A centered interval is

P(−tn−p,α/2 < Zn−p < tn−p,α/2) = 1 − α
Confidence Intervals
1 − α = P( −tn−p,α/2 < (bj − βj)/sbj < tn−p,α/2 )
      = P( bj − tn−p,α/2 sbj < βj < bj + tn−p,α/2 sbj )

so bj ± tn−p,α/2 × sbj is a (1 − α)·100% confidence interval for βj.
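A sketch of this computation for the slope, using scipy's t quantile (toy data, names assumed):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 30
x = rng.uniform(0.5, 3.0, size=n)
y = 40 + 45 * x + rng.normal(0, 10, size=n)

b1, b0 = np.polyfit(x, y, 1)
e = y - (b0 + b1 * x)
s = np.sqrt(np.sum(e**2) / (n - 2))             # regression standard error
s_b1 = s / np.sqrt(np.sum((x - x.mean())**2))   # standard error of the slope

t_crit = stats.t.ppf(1 - 0.05 / 2, df=n - 2)    # alpha = .05
print(b1 - t_crit * s_b1, b1 + t_crit * s_b1)   # 95% CI for beta_1
```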
Testing
Similarly, suppose that assuming bj ∼ tn−p(βj, sbj) for our sample bj leads to (recall Zn−p ∼ tn−p(0, 1))

P( Zn−p < −|bj − βj|/sbj ) + P( Zn−p > |bj − βj|/sbj ) = ϕ.
Hypothesis Testing
Our hypothesis test will either reject or fail to reject the null hypothesis (the default claim that holds if ours is not true).
Hypothesis Testing
The test statistic is

zbj = (bj − βj0)/sbj  =  bj/sbj   for βj0 = 0.
We assess the size of zbj with the p-value: the probability, under the null, of a draw beyond ±z.

[Figure: standard normal density p(Z) with the two tails beyond −z and z shaded]
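A sketch of the two-sided p-value computation with scipy (the statistic and degrees of freedom are hypothetical):

```python
from scipy import stats

z = 2.3          # hypothetical t-statistic b_j / s_bj
df = 28          # n - p, e.g. n = 30 observations, p = 2 coefficients

# Two-sided p-value: probability of a draw beyond |z| in either tail.
p_value = 2 * stats.t.sf(abs(z), df=df)
print(p_value)   # ~0.029: reject at alpha = .05
```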
Hypothesis Testing – Windsor Fund Example

Recall the Windsor fund regression.

[Figure: regression output showing b, sb, and b/sb for the intercept and slope]

It turns out that we reject the null at α = .05 (ϕ = .0105). Thus Windsor does have an "alpha" over the market.
Example: Hypothesis Testing

Looking at the slope, this is a very rare case where the null hypothesis is not zero:

[Figure: test of the slope against its non-zero null value]

Forecasting

Producing the point forecast Ŷf = b0 + b1 Xf is the easy bit. The hard (and very important!) part of forecasting is assessing uncertainty about our predictions. The forecast error is

ef = Yf − Ŷf = Yf − b0 − b1 Xf
This can get quite complicated! A simple strategy is to build the following (1 − α)·100% prediction interval:

b0 + b1 Xf ± tn−2,α/2 × s
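A sketch of this simple interval on a toy fit (note it ignores the extra width a fully exact interval would add for estimation error in b0 and b1):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n = 30
x = rng.uniform(0.5, 3.0, size=n)
y = 40 + 45 * x + rng.normal(0, 10, size=n)

b1, b0 = np.polyfit(x, y, 1)
e = y - (b0 + b1 * x)
s = np.sqrt(np.sum(e**2) / (n - 2))            # regression standard error

x_f = 1.5                                      # forecast point
t_crit = stats.t.ppf(1 - 0.05 / 2, df=n - 2)   # 95% interval
center = b0 + b1 * x_f
print(center - t_crit * s, center + t_crit * s)
```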
Glossary and Equations

s² = (1/(n − 2)) Σᵢ ei²
zbj = (bj − βj0)/sbj   (= bj/sbj most often)