This document provides an overview of simple linear regression. It defines linear regression as describing the linear relationship between a predictor variable (X) and a response variable (Y). Key aspects covered include:
- the linear regression equation $Y = \beta_0 + \beta_1 X + \varepsilon$;
- how least squares regression fits the linear model by minimizing the sum of squared residuals;
- descriptive statistics used in linear regression, such as variance, covariance, and correlation;
- inferential regression statistics, such as $R^2$, standard error, ANOVA, and hypothesis tests on coefficients;
- assumptions of the linear regression model, such as independent and normally distributed errors.

The document also briefly introduces multiple linear regression, extending the single-predictor model to multiple predictors, along with the matrix algebra of ordinary least squares.


Simple Linear Regression (SLR)
Types of Correlation

(Scatter plots: positive correlation, negative correlation, no correlation.)


Simple linear regression describes the linear relationship between a predictor variable, plotted on the x-axis, and a response variable, plotted on the y-axis.

(Scatter plot: Independent Variable (X) against Dependent Variable (Y).)

$Y = \beta_0 + \beta_1 X$

(Plots: the regression line, with intercept $\beta_0$ where the line meets the y-axis and slope $\beta_1$ drawn as the rise over a run of 1.0; a further plot marks the residual $\varepsilon$ as the vertical distance between an observed point and the line.)
Fitting data to a linear model

$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$

intercept ($\beta_0$), slope ($\beta_1$), residuals ($\varepsilon_i$)


How do we fit data to a linear model?

The Ordinary Least Squares (OLS) method.


Least Squares Regression

Model line: $\hat{Y} = b_0 + b_1 X$

Residual: $\varepsilon = Y - \hat{Y}$

Sum of squared residuals: $\sum (Y - \hat{Y})^2$

• We must find the values of $b_0$ and $b_1$ that minimise $\sum (Y - \hat{Y})^2$.

Regression Coefficients

$b_1 = \dfrac{S_{xy}}{S_{xx}} = \dfrac{\sum xy}{\sum x^2}$  (x and y taken as deviations from their means)

$b_0 = \bar{Y} - b_1 \bar{X}$
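A minimal sketch of these formulas in Python (not part of the original slides), using NumPy and a small made-up dataset:

```python
import numpy as np

# Made-up example data (any paired X, Y sample works here)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

Xbar, Ybar = X.mean(), Y.mean()

# Sums of squares / cross-products of deviations from the means
Sxx = np.sum((X - Xbar) ** 2)
Sxy = np.sum((X - Xbar) * (Y - Ybar))

b1 = Sxy / Sxx          # slope
b0 = Ybar - b1 * Xbar   # intercept
print(b0, b1)
```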
Required Statistics

$n$ = number of observations

$\bar{X} = \dfrac{\sum X}{n}$

$\bar{Y} = \dfrac{\sum Y}{n}$
Descriptive Statistics

$\mathrm{Var}(X) = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1} = \dfrac{S_{xx}}{n-1}$

$\mathrm{Var}(Y) = \dfrac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n-1} = \dfrac{S_{yy}}{n-1}$  ($S_{yy}$ = SST)

$\mathrm{Covar}(X, Y) = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n-1} = \dfrac{S_{xy}}{n-1}$
Regression Statistics

$SST = \sum (Y - \bar{Y})^2$

$SSR = \sum (\hat{Y} - \bar{Y})^2$

$SSE = \sum (Y - \hat{Y})^2$

(Diagram: the variance to be explained in Y (SST) splits into variance explained by X1 (SSR) and variance NOT explained by X1 (SSE).)
Regression Statistics

$SST = SSR + SSE$


Regression Statistics

$R^2 = \dfrac{SSR}{SST}$

Coefficient of Determination: used to judge the adequacy of the regression model.
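A short sketch of these sums of squares, again with NumPy and made-up data (np.polyfit supplies the least-squares fit):

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
b1, b0 = np.polyfit(X, Y, 1)    # least-squares slope and intercept

Yhat = b0 + b1 * X              # fitted values
SST = np.sum((Y - Y.mean()) ** 2)
SSR = np.sum((Yhat - Y.mean()) ** 2)
SSE = np.sum((Y - Yhat) ** 2)

R2 = SSR / SST                  # coefficient of determination
print(R2, np.isclose(SST, SSR + SSE))   # SST = SSR + SSE holds for OLS
```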
Regression Statistics

R R 2

S xy  xy
R 
S xx S yy  x y

Correlation
measures the strength of the linear association between
two variables.
Regression Statistics
Standard Error for the regression model

$S_e^2 = \hat{\sigma}^2 = \dfrac{SSE}{n-2}$, where $SSE = \sum (Y - \hat{Y})^2$

$S_e = \sqrt{MSE}$
ANOVA
$H_0: \beta_1 = 0$
$H_A: \beta_1 \neq 0$

Source       df     SS     MS        F          P-value
Regression   1      SSR    SSR/df    MSR/MSE    P(F)
Residual     n-2    SSE    SSE/df
Total        n-1    SST

If P(F) < α, then we know that we get significantly better prediction of Y from the regression model than by just predicting the mean of Y.

ANOVA to test significance of regression
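As a hedged sketch of this table (not from the slides), using NumPy and SciPy with made-up data; scipy.stats.f.sf gives the upper-tail P(F):

```python
import numpy as np
from scipy import stats

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])
b1, b0 = np.polyfit(X, Y, 1)
Yhat = b0 + b1 * X
n = len(Y)

SSR = np.sum((Yhat - Y.mean()) ** 2)   # regression SS, df = 1
SSE = np.sum((Y - Yhat) ** 2)          # residual SS,  df = n - 2

MSR, MSE = SSR / 1, SSE / (n - 2)
F = MSR / MSE
p_value = stats.f.sf(F, 1, n - 2)      # P(F): upper-tail F probability
print(F, p_value)                      # small p => regression is significant
```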


Hypothesis Tests for Regression Coefficients

$H_0: \beta_i = 0$
$H_1: \beta_i \neq 0$

$t_{(n-k-1)} = \dfrac{b_i - \beta_i}{S_{b_i}}$
Hypothesis Tests for Regression Coefficients

$H_0: \beta_1 = 0$
$H_A: \beta_1 \neq 0$

$t_{(n-k-1)} = \dfrac{b_1 - \beta_1}{S_e(b_1)} = \dfrac{b_1 - \beta_1}{\sqrt{S_e^2 / S_{xx}}}$
Confidence Interval on Regression Coefficients

$b_1 - t_{\alpha/2,(n-k-1)} \sqrt{\dfrac{S_e^2}{S_{xx}}} \;\le\; \beta_1 \;\le\; b_1 + t_{\alpha/2,(n-k-1)} \sqrt{\dfrac{S_e^2}{S_{xx}}}$

Confidence Interval for $\beta_1$
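A sketch of the slope test and its confidence interval, assuming made-up data and a 95% level (α = 0.05):

```python
import numpy as np
from scipy import stats

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])
b1, b0 = np.polyfit(X, Y, 1)
n = len(Y)

Yhat = b0 + b1 * X
Se2 = np.sum((Y - Yhat) ** 2) / (n - 2)   # S_e^2 (k = 1, so df = n - 2)
Sxx = np.sum((X - X.mean()) ** 2)
se_b1 = np.sqrt(Se2 / Sxx)                # standard error of b1

t_stat = b1 / se_b1                       # test of H0: beta1 = 0
t_crit = stats.t.ppf(1 - 0.05 / 2, n - 2)
print(t_stat, (b1 - t_crit * se_b1, b1 + t_crit * se_b1))  # 95% CI
```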


Hypothesis Tests on Regression Coefficients

$H_0: \beta_0 = 0$
$H_A: \beta_0 \neq 0$

$t_{(n-k-1)} = \dfrac{b_0 - \beta_0}{S_e(b_0)} = \dfrac{b_0 - \beta_0}{\sqrt{S_e^2 \left( \dfrac{1}{n} + \dfrac{\bar{X}^2}{S_{xx}} \right)}}$
Confidence Interval on Regression Coefficients

$b_0 - t_{\alpha/2,(n-k-1)} \sqrt{S_e^2 \left( \dfrac{1}{n} + \dfrac{\bar{X}^2}{S_{xx}} \right)} \;\le\; \beta_0 \;\le\; b_0 + t_{\alpha/2,(n-k-1)} \sqrt{S_e^2 \left( \dfrac{1}{n} + \dfrac{\bar{X}^2}{S_{xx}} \right)}$

Confidence Interval for the intercept


Hypothesis Test for the Correlation Coefficient

$H_0: \rho = 0$
$H_A: \rho \neq 0$

$T_0 = \dfrac{R \sqrt{n-2}}{\sqrt{1 - R^2}}$

We would reject the null hypothesis if $|t_0| > t_{\alpha/2, n-2}$.
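The same test as a short sketch, with made-up data:

```python
import numpy as np
from scipy import stats

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])
n = len(X)

R = np.corrcoef(X, Y)[0, 1]                   # sample correlation
T0 = R * np.sqrt(n - 2) / np.sqrt(1 - R**2)   # test statistic
t_crit = stats.t.ppf(1 - 0.05 / 2, n - 2)
print(T0, abs(T0) > t_crit)                   # True => reject H0: rho = 0
```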


Diagnostic Tests For Regressions

Residual plots ($\varepsilon_i$ against fitted values $\hat{Y}_i$):
- Expected distribution of residuals for a linear model with a normal distribution of residuals (errors): random scatter around zero.
- Residuals for a non-linear fit.
- Residuals for a quadratic function or polynomial.
- Residuals that are not homogeneous (increasing in variance).
Regression – important points

1. Ensure that the range of values sampled for the predictor variable is large enough to capture the full range of responses by the response variable.

2. Ensure that the distribution of predictor values is approximately uniform within the sampled range.
Assumptions of Regression

1. The linear model correctly describes the functional relationship between X and Y.
2. The X variable is measured without error.
3. For any given value of X, the sampled Y values are independent.
4. Residuals (errors) are normally distributed.
5. Variances are constant along the regression line.
Multiple Linear Regression (MLR)

The linear model with a single predictor variable X can easily be extended to two or more predictor variables.

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + \varepsilon$
(Venn diagrams: the variance in Y is partitioned into unique variance explained by X1, unique variance explained by X2, common variance explained by X1 and X2, and variance NOT explained by X1 and X2. A "good" model is one where X1 and X2 overlap little with each other while each overlaps substantially with Y.)
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + \varepsilon$

intercept ($\beta_0$), partial regression coefficients ($\beta_1, \dots, \beta_p$), residuals ($\varepsilon$)

Partial Regression Coefficients (slopes): the regression coefficient of a predictor X after controlling for (holding all other predictors constant) the influence of the other variables on both X and Y.
The matrix algebra of Ordinary Least Squares

Intercept and slopes:
$\hat{\beta} = (X'X)^{-1} X'Y$

Predicted values:
$\hat{Y} = X\hat{\beta}$

Residuals:
$Y - \hat{Y}$
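A minimal NumPy sketch of these three matrix formulas, with made-up two-predictor data (np.linalg.solve is used in place of an explicit inverse):

```python
import numpy as np

# Made-up data with two predictors
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
Y  = np.array([3.2, 3.9, 7.1, 7.8, 11.2, 11.9])

# Design matrix: a leading column of ones carries the intercept
X = np.column_stack([np.ones_like(X1), X1, X2])

# beta_hat = (X'X)^(-1) X'Y; solving the normal equations avoids
# forming the inverse explicitly
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

Yhat = X @ beta_hat     # predicted values
resid = Y - Yhat        # residuals
print(beta_hat)
```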
Regression Statistics
How good is our model?

$SST = \sum (Y - \bar{Y})^2$

$SSR = \sum (\hat{Y} - \bar{Y})^2$

$SSE = \sum (Y - \hat{Y})^2$
Regression Statistics

$R^2 = \dfrac{SSR}{SST}$

Coefficient of Determination: used to judge the adequacy of the regression model.
Regression Statistics

n 1
R 2
 1 (1  R )
2

n  k 1
adj

n = sample size
k = number of independent variables

Adjusted R2 are not biased!
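Continuing the same made-up two-predictor example, a sketch of adjusted R²:

```python
import numpy as np

X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
Y  = np.array([3.2, 3.9, 7.1, 7.8, 11.2, 11.9])
X  = np.column_stack([np.ones_like(X1), X1, X2])
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

n, k = len(Y), 2                       # k = number of predictors
Yhat = X @ beta_hat
R2 = 1 - np.sum((Y - Yhat)**2) / np.sum((Y - Y.mean())**2)
R2_adj = 1 - (1 - R2) * (n - 1) / (n - k - 1)
print(R2, R2_adj)                      # R2_adj penalises extra predictors
```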


Regression Statistics
Standard Error for the regression model

$S_e^2 = \hat{\sigma}^2 = \dfrac{SSE}{n-k-1}$, where $SSE = \sum (Y - \hat{Y})^2$

$S_e = \sqrt{MSE}$
ANOVA
$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$
$H_A: \beta_i \neq 0$ for at least one $i$

Source       df       SS     MS        F          P-value
Regression   k        SSR    SSR/df    MSR/MSE    P(F)
Residual     n-k-1    SSE    SSE/df
Total        n-1      SST

If P(F) < α, then we know that we get significantly better prediction of Y from the regression model than by just predicting the mean of Y.

ANOVA to test significance of regression


Hypothesis Tests for Regression Coefficients

$H_0: \beta_i = 0$
$H_1: \beta_i \neq 0$

$t_{(n-k-1)} = \dfrac{b_i - \beta_i}{S_{b_i}}$
Hypothesis Tests for Regression Coefficients

$H_0: \beta_i = 0$
$H_A: \beta_i \neq 0$

$t_{(n-k-1)} = \dfrac{b_i - \beta_i}{S_e(b_i)} = \dfrac{b_i - \beta_i}{\sqrt{S_e^2 C_{ii}}}$
Confidence Interval on Regression Coefficients

$b_i - t_{\alpha/2,(n-k-1)} \sqrt{S_e^2 C_{ii}} \;\le\; \beta_i \;\le\; b_i + t_{\alpha/2,(n-k-1)} \sqrt{S_e^2 C_{ii}}$

Confidence Interval for $\beta_i$


Here $\hat{\beta} = (X'X)^{-1} X'Y$, and $C_{ii}$ is the $i$-th diagonal element of $(X'X)^{-1}$, so the standard error of $b_i$ is $\sqrt{S_e^2 C_{ii}}$.
Diagnostic Tests For Regressions

Expected distribution of residuals for a linear model with a normal distribution of residuals (errors).

(X Residual Plot: residuals plotted against $X_i$; the expected pattern is random scatter around zero.)
Standardized Residuals

$d_i = \dfrac{e_i}{\sqrt{S_e^2}}$

(Plot: standardized residuals against observation number; most values fall between −2 and +2.)
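A sketch of standardized residuals, reusing the made-up single-predictor data from the earlier sketches:

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])
b1, b0 = np.polyfit(X, Y, 1)

e = Y - (b0 + b1 * X)               # raw residuals e_i
Se2 = np.sum(e**2) / (len(Y) - 2)   # S_e^2 (MSE) for one predictor
d = e / np.sqrt(Se2)                # standardized residuals d_i
print(d)                            # values beyond about +/-2 stand out
```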
Model Selection

Avoid predictors (Xs) that do not contribute significantly to model prediction.

- Forward selection: the "best" predictor variables are entered, one by one.

- Backward elimination: the "worst" predictor variables are eliminated, one by one.
Model Selection: The General Case
$H_0: \beta_{q+1} = \beta_{q+2} = \dots = \beta_k = 0$
$H_1:$ at least one is not zero

$F = \dfrac{\left[ SSE(x_1, x_2, \dots, x_q) - SSE(x_1, x_2, \dots, x_q, x_{q+1}, \dots, x_k) \right] / (k-q)}{SSE(x_1, x_2, \dots, x_q, x_{q+1}, \dots, x_k) / (n-k-1)}$

Reject $H_0$ if $F > F_{\alpha, k-q, n-k-1}$.
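A sketch of this partial F-test, with a hypothetical helper sse() and made-up data for a reduced (q = 1) versus full (k = 2) model:

```python
import numpy as np
from scipy import stats

def sse(X, Y):
    """Residual sum of squares for an OLS fit of Y on X (with intercept)."""
    X = np.column_stack([np.ones(len(Y)), X])
    beta = np.linalg.lstsq(X, Y, rcond=None)[0]
    return np.sum((Y - X @ beta) ** 2)

# Made-up data: the reduced model uses x1; the full model adds x2
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0])
y  = np.array([3.2, 3.9, 7.1, 7.8, 11.2, 11.9, 15.4])

n, k, q = len(y), 2, 1
sse_reduced = sse(x1[:, None], y)
sse_full = sse(np.column_stack([x1, x2]), y)

F = ((sse_reduced - sse_full) / (k - q)) / (sse_full / (n - k - 1))
F_crit = stats.f.ppf(1 - 0.05, k - q, n - k - 1)
print(F, F > F_crit)   # True => reject H0 (the added predictors matter)
```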
Multicollinearity

- The degree of correlation between the Xs.

- A high degree of multicollinearity produces unacceptable uncertainty (large variance) in regression coefficient estimates (i.e., large sampling variation).

- Estimates of the slopes are imprecise, and even the signs of the coefficients may be misleading.

- t-tests may fail to reveal significant factors.
Multicollinearity

- If the F-test for significance of regression is significant, but tests on the individual regression coefficients are not, multicollinearity may be present.

- Variance Inflation Factors (VIFs) are very useful measures of multicollinearity. If any VIF exceeds 5, multicollinearity is a problem.

$VIF(\beta_i) = \dfrac{1}{1 - R_i^2} = C_{ii}$

where $R_i^2$ is the $R^2$ from regressing $X_i$ on the remaining predictors.
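A sketch of VIFs computed from this definition; vif() here is a hypothetical helper, and the near-duplicate predictors are made up to show large values:

```python
import numpy as np

def vif(Xmat):
    """VIF of each column: 1 / (1 - R_i^2), where R_i^2 comes from
    regressing column i on the remaining columns (with an intercept)."""
    n, p = Xmat.shape
    vifs = []
    for i in range(p):
        others = np.column_stack([np.ones(n), np.delete(Xmat, i, axis=1)])
        beta = np.linalg.lstsq(others, Xmat[:, i], rcond=None)[0]
        resid = Xmat[:, i] - others @ beta
        r2 = 1 - resid @ resid / np.sum((Xmat[:, i] - Xmat[:, i].mean())**2)
        vifs.append(1 / (1 - r2))
    return np.array(vifs)

# Made-up predictors: x2 nearly duplicates x1, so both VIFs come out large
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = x1 + np.array([0.1, -0.1, 0.05, -0.05, 0.1, -0.1])
print(vif(np.column_stack([x1, x2])))   # values above 5 flag a problem
```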
Model Evaluation

$PRESS = \sum_{i=1}^{n} (y_i - \hat{y}_{(i)})^2$

Prediction Error Sum of Squares (leave-one-out): $\hat{y}_{(i)}$ is the prediction of observation $i$ from a model fitted without observation $i$.
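A sketch of PRESS by direct leave-one-out refitting; press() is a hypothetical helper and the data are made up:

```python
import numpy as np

def press(X, Y):
    """Leave-one-out Prediction Error Sum of Squares for OLS with intercept."""
    n = len(Y)
    X = np.column_stack([np.ones(n), X])
    total = 0.0
    for i in range(n):
        keep = np.arange(n) != i                   # drop observation i
        beta = np.linalg.lstsq(X[keep], Y[keep], rcond=None)[0]
        total += (Y[i] - X[i] @ beta) ** 2         # squared LOO error
    return total

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])
print(press(x[:, None], y))
```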
Thank You!
