Regression Basics
Predicting a DV with a Single IV
Questions
What are predictors and criteria?
Write an equation for the linear regression. Describe each term.
How do changes in the slope and intercept affect (move) the regression line?
What does it mean to test the significance of the regression sum of squares? R-square?
What is R-square?
What does it mean to choose a regression line to satisfy the loss function of least squares?
How do we find the slope and intercept for the regression line with a single independent variable? (Either formula for the slope is acceptable.)
Why does testing for the regression sum of squares turn out to have the same result as testing for R-square?
Basic Ideas
Jargon
IV = X = Predictor (pl. predictors)
DV = Y = Criterion (pl. criteria)
Regression of Y on X, e.g., GPA on SAT
Linear Model = relation between IV and DV represented by a straight line.
A score on Y has 2 parts: (1) a linear function of X and (2) error.
Y_i = α + βX_i + ε_i (population values)
Basic Ideas (2)
Sample values:
Y_i = a + bX_i + e_i
Intercept: the place where the line crosses the Y axis, i.e., the value of Y when X = 0.
Slope: the change in Y if X changes 1 unit; rise over run.
If error is removed, we have a predicted value for each person at X (the line):
Y' = a + bX
Suppose on average houses are worth about $75.00 a
square foot. Then the equation relating price to size
would be Y=0+75X. The predicted price for a 2000
square foot house would be $150,000.
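The arithmetic of the house-price example fits in a couple of lines. A minimal sketch (variable names are mine; the figures are from the example above):

```python
# Predicted price from the line Y' = 0 + 75X, where X is square feet.
a, b = 0, 75            # intercept and slope ($75 per square foot)
sqft = 2000
price = a + b * sqft    # Y' = a + bX
print(price)            # 150000
```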
Linear Transformation
1 to 1 mapping of variables via line
Permissible operations are addition and
multiplication (interval data)
[Figure: Changing the Y Intercept — the lines Y = 5 + 2X, Y = 10 + 2X, and Y = 15 + 2X on the same axes. Adding a constant to the intercept shifts the line up without changing its tilt.]
[Figure: Changing the Slope — the lines Y = 5 + .5X, Y = 5 + X, and Y = 5 + 2X on the same axes. Multiplying the slope by a constant changes the tilt (steepness) of the line.]
Linear Transformation (2)
Centigrade to Fahrenheit
Note 1 to 1 map
Intercept?
Slope?
[Figure: Degrees C (0 to 120) vs. Degrees F (0 to 240) — a straight line passing through (0 C, 32 F) and (100 C, 212 F).]
Intercept is 32: when X (Cent) is 0, Y (Fahr) is 32.
Slope is 1.8: when Cent goes from 0 to 100 (run), Fahr goes from 32 to 212 (rise), and 212 - 32 = 180. Then 180/100 = 1.8, rise over run, is the slope. Y = 32 + 1.8X, i.e., F = 32 + 1.8C.
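The intercept and slope can be recovered from the two calibration points directly. A sketch (the 37 C check at the end is my added illustration, not from the original):

```python
# Recover the line F = intercept + slope * C from two known points:
# (0 C, 32 F) and (100 C, 212 F).
(c1, f1), (c2, f2) = (0.0, 32.0), (100.0, 212.0)

slope = (f2 - f1) / (c2 - c1)   # rise over run: (212 - 32) / (100 - 0) = 1.8
intercept = f1 - slope * c1     # value of F when C is 0: 32

print(intercept, slope)         # 32.0 1.8
print(intercept + slope * 37)   # body temperature in F, about 98.6
```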
Review
What are predictors and criteria?
Write an equation for the linear regression with 1 IV. Describe each term.
How do changes in the slope and intercept affect (move) the regression line?
Regression of Weight on
Height
Ht Wt
61 105
62 120
63 120
65 160
65 120
68 145
69 175
70 160
72 185
75 210
N=10 N=10
M=67 M=150
SD=4.57 SD=33.99
[Figure: Regression of Weight on Height — scatterplot of the 10 cases (Height in Inches vs. Weight in Lbs) with the fitted line Y = -316.86 + 6.97X; rise and run are marked on the line.]
Correlation (r) = .94.
Regression equation: Y=-316.86+6.97X
Illustration of the Linear
Model. This concept is vital!
[Figure: Regression of Weight on Height — the point (65, 120) shown with the mean of X, the mean of Y, the deviation from X, and the deviation from Y; the deviation of Y from its mean (y) is split into a linear part (Y') and an error part (e).]
Y_i = α + βX_i + ε_i
Y_i = a + bX_i + e_i
Consider Y as
a deviation
from the
mean.
Part of that deviation can be associated with X (the linear
part) and part cannot (the error).
e_i = Y_i - Y'_i
Predicted Values & Residuals
N    Ht     Wt       Y'        Resid
1    61     105      108.19    -3.19
2    62     120      115.16     4.84
3    63     120      122.13    -2.13
4    65     160      136.06    23.94
5    65     120      136.06   -16.06
6    68     145      156.97   -11.97
7    69     175      163.94    11.06
8    70     160      170.91   -10.91
9    72     185      184.84     0.16
10   75     210      205.75     4.25
M    67     150      150.00     0.00
SD   4.57   33.99    31.85     11.89
V    20.89  1155.56  1014.37   141.32
[Figure: Regression of Weight on Height — the same decomposition plot, with the point (65, 120) split into a linear part (Y') and an error part (e) around the mean of Y.]
Numbers for the linear part and the error.
Note the mean of Y' equals the mean of Y, and the residuals average 0.
Note the variance of Y is (within rounding) V(Y') + V(res): 1155.56 ≈ 1014.37 + 141.32.
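The predicted values and residuals above can be reproduced from the raw data. A sketch using the definitional least-squares formulas (not code from the original):

```python
# Fit Weight on Height for the 10 cases, then compute Y' and e = Y - Y'.
ht = [61, 62, 63, 65, 65, 68, 69, 70, 72, 75]
wt = [105, 120, 120, 160, 120, 145, 175, 160, 185, 210]
n = len(ht)

mx = sum(ht) / n                      # mean of Height = 67
my = sum(wt) / n                      # mean of Weight = 150
sxy = sum((x - mx) * (y - my) for x, y in zip(ht, wt))
sxx = sum((x - mx) ** 2 for x in ht)

b = sxy / sxx                         # slope
a = my - b * mx                       # intercept

pred = [a + b * x for x in ht]        # Y' for each person
resid = [y - p for y, p in zip(wt, pred)]  # e = Y - Y'

print(round(b, 2), round(a, 2))       # 6.97 -316.86, matching the text
print(abs(sum(resid)) < 1e-9)         # True: least squares forces the residuals to sum to 0
```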
' = + Y a bX
Finding the Regression Line
Need to know the correlation, SDs and means of X and Y.
The correlation is the slope when both X and Y are
expressed as z scores. To translate to raw scores, just bring
back original SDs for both.
r_XY = Σ(z_X z_Y) / N
b = r_XY (SD_Y / SD_X)  (rise over run)
To find the intercept, use: a = M_Y - b M_X
Suppose r = .50, SD_X = .5, M_X = 10, SD_Y = 2, M_Y = 5.
Slope: b = .50(2/.5) = 2
Intercept: a = 5 - 2(10) = -15
Equation: Y' = -15 + 2X
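The worked example can be checked directly from the summary statistics. A minimal sketch (the helper function is mine):

```python
# Slope and intercept from summary statistics, using
# b = r(SD_Y / SD_X) and a = M_Y - b*M_X.
def regression_line(r, sd_x, m_x, sd_y, m_y):
    b = r * (sd_y / sd_x)
    a = m_y - b * m_x
    return a, b

a, b = regression_line(r=0.50, sd_x=0.5, m_x=10, sd_y=2, m_y=5)
print(a, b)   # -15.0 2.0, so Y' = -15 + 2X
```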
Line of Least Squares
[Figure: Regression of Weight on Height — scatterplot with the fitted line; the point (65, 120) shown with its linear part (Y') and error part (e) around the means of X and Y.]
We have some points. Assume a linear relation is reasonable, so the 2 variables can be represented by a line. Where should the line go?
Place the line so the errors (residuals) are small. The line we calculate has a sum of errors = 0 and a sum of squared errors that is as small as possible; the line provides the smallest sum of squared errors, or least squares.
Least Squares (2)
Review
What does it mean to choose a regression line
to satisfy the loss function of least squares?
What are predicted values and residuals?
Suppose r = .25, SD_X = 1, M_X = 10, SD_Y = 2, M_Y = 5. What is the regression equation (line)?
Partitioning the Sum of
Squares
Y = a + bX + e
Y' = a + bX
Y = Y' + e
e = Y - Y'
Definitions
Y - M_Y = (Y' - M_Y) + (Y - Y'), where Y - M_Y = y, the deviation from the mean
Σ(Y - M_Y)² = Σ[(Y' - M_Y) + (Y - Y')]²  (sum of squares)
Σy² = Σ(Y' - M_Y)² + Σ(Y - Y')²  (the cross products drop out)
Sum of squared deviations from the mean = Sum of squares due to regression (reg) + Sum of squared residuals (error)
Analog: SS_tot = SS_B + SS_W
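The partition (and the vanishing cross products) can be verified numerically with the height/weight data. A sketch; the exact SS values differ slightly from the later table because that table uses rounded Y':

```python
# Verify Σ(Y - M)² = Σ(Y' - M)² + Σ(Y - Y')² for a least-squares line.
from statistics import mean

ht = [61, 62, 63, 65, 65, 68, 69, 70, 72, 75]
wt = [105, 120, 120, 160, 120, 145, 175, 160, 185, 210]
mx, my = mean(ht), mean(wt)

b = sum((x - mx) * (y - my) for x, y in zip(ht, wt)) / sum((x - mx) ** 2 for x in ht)
a = my - b * mx
pred = [a + b * x for x in ht]                        # Y'

ss_tot = sum((y - my) ** 2 for y in wt)               # Σ(Y - M)²
ss_reg = sum((p - my) ** 2 for p in pred)             # Σ(Y' - M)²
ss_res = sum((y - p) ** 2 for y, p in zip(wt, pred))  # Σ(Y - Y')²
cross = sum((p - my) * (y - p) for y, p in zip(wt, pred))

print(abs(ss_tot - (ss_reg + ss_res)) < 1e-6)  # True: SS_Y = SS_Reg + SS_Res
print(abs(cross) < 1e-6)                       # True: the cross products drop out
```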
Partitioning SS (2)
SS_Y = SS_Reg + SS_Res
Total SS is regression SS plus residual SS. Can also get proportions of each. Can get variance by dividing SS by N if you want. Proportion of total SS due to regression = proportion of total variance due to regression = R² (R-square).
SS_Y/SS_Y = SS_Reg/SS_Y + SS_Res/SS_Y
1 = R² + (1 - R²)
Partitioning SS (3)
(M of Y = 150)
Wt (Y)  (Y-M)²  Y'       Y'-M    (Y'-M)²   Resid (Y-Y')  Resid²
105     2025    108.19   -41.81  1748.076   -3.19         10.1761
120      900    115.16   -34.84  1213.826    4.84         23.4256
120      900    122.13   -27.87   776.7369  -2.13          4.5369
160      100    136.06   -13.94   194.3236  23.94        573.1236
120      900    136.06   -13.94   194.3236 -16.06        257.9236
145       25    156.97     6.97    48.5809 -11.97        143.2809
175      625    163.94    13.94   194.3236  11.06        122.3236
160      100    170.91    20.91   437.2281 -10.91        119.0281
185     1225    184.84    34.84  1213.826    0.16          0.0256
210     3600    205.75    55.75  3108.063    4.25         18.0625
Sum =   1500    10400    1500.01  0.01      9129.307     -0.01     1271.907
Variance        1155.56                     1014.37                 141.32
Partitioning SS (4)
Total Regress Residual
SS 10400 9129.31 1271.91
Variance 1155.56 1014.37 141.32
Proportion of SS: 10400/10400 = 9129.31/10400 + 1271.91/10400, i.e., 1 = .88 + .12
Proportion of Variance: 1155.56/1155.56 = 1014.37/1155.56 + 141.32/1155.56, i.e., 1 = .88 + .12
R² = .88
Note Y' is a linear function of X, so r_YY' = r_XY = .94 and r_Y'X = 1.
Also r²_YY' = R² = .88; r_YE = .35, so r²_YE = .12; and r_Y'E = 0.
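These correlation facts can be checked from the data. A sketch (the Y' values are the rounded predictions taken from the table above; `corr` is my helper, the usual Pearson formula):

```python
# Check that r_XY = r_YY' = .94 and that its square is R-square = .88.
from statistics import mean

def corr(u, v):
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den

ht = [61, 62, 63, 65, 65, 68, 69, 70, 72, 75]
wt = [105, 120, 120, 160, 120, 145, 175, 160, 185, 210]
pred = [108.19, 115.16, 122.13, 136.06, 136.06,
        156.97, 163.94, 170.91, 184.84, 205.75]   # Y' from the table

print(round(corr(ht, wt), 2))       # 0.94 = r_XY
print(round(corr(wt, pred), 2))     # 0.94 = r_YY', the same, since Y' is linear in X
print(round(corr(ht, wt) ** 2, 2))  # 0.88 = R-square
```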
Significance Testing
Testing for the SS due to regression = testing for the variance due to regression = testing the significance of R². All are the same.
H0: R²(population) = 0
F = (SS_reg / df_reg) / (SS_res / df_res) = (SS_reg / k) / (SS_res / (N - k - 1))
k = number of IVs (here it's 1) and N is the sample size (# people). F is distributed with k and (N - k - 1) df.
F = (SS_reg / df_reg) / (SS_res / df_res) = (9129.31 / 1) / (1271.91 / (10 - 1 - 1)) = 57.42
Equivalent test using R-square instead of SS:
F = (R² / k) / ((1 - R²) / (N - k - 1)) = (.88 / 1) / ((1 - .88) / (10 - 1 - 1)) = 58.67
Results will be the same within rounding error.
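Both F formulas can be verified numerically; a sketch with the SS values from the worked example (N = 10, k = 1). Computing R² from the unrounded SS makes the two F values agree exactly, which shows the text's 57.42 vs. 58.67 gap is purely rounding of R² to .88:

```python
# F for the regression computed two ways: from sums of squares and from R-square.
ss_reg, ss_res = 9129.31, 1271.91
N, k = 10, 1

F_ss = (ss_reg / k) / (ss_res / (N - k - 1))
r2 = ss_reg / (ss_reg + ss_res)                 # R-square from the SS partition
F_r2 = (r2 / k) / ((1 - r2) / (N - k - 1))

print(round(F_ss, 2))   # 57.42
print(round(F_r2, 2))   # 57.42 as well: the two formulas are algebraically identical
```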
Review
What does it mean to test the significance of the regression sum of squares? R-square?
What is R-square?
Why does testing for the regression sum of squares turn out to have the same result as testing for R-square?