Simple and Multiple Regression Analysis
Linear regression is a statistical procedure that
attempts to predict or estimate the value
of the response variable from known
values of one or more independent
variables.
Y = a + bX
Suppose we want to test whether there is any relation
between the birth weight (BW) of a baby and its blood pressure
(BP).
Here the dependent variable is BP and the independent variable
is BW. The equation is
BP = a + b(BW)
i.e. Y = a + bX
where a is the intercept and b is the slope (regression coefficient) of the line.
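$$b = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - \frac{(\sum x)^2}{n}}$$
or
$$b = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2}$$
and
$$a = \bar{y} - b\bar{x}$$
Where $\sum xy$ = sum of products of all individual
values of x and y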
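For the regression of x on y, the corresponding coefficients (written b1 and a1 here) are
$$b_1 = \frac{\sum xy - n\bar{x}\bar{y}}{\sum y^2 - \frac{(\sum y)^2}{n}}$$
or
$$b_1 = \frac{\sum xy - n\bar{x}\bar{y}}{\sum y^2 - n\bar{y}^2}$$
and
$$a_1 = \bar{x} - b_1\bar{y}$$
Where $\sum xy$ = sum of products of all individual
values of x and y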
Solution:
No      BW (x), kg   BP (y)    xy        x²
1       1.5          47.0      70.5      2.25
2       1.9          50.0      95.0      3.61
3       2.2          71.0      156.2     4.84
4       2.5          76.0      190.0     6.25
5       2.7          76.0      205.2     7.29
6       2.8          81.0      226.8     7.84
7       3.0          86.0      258.0     9.00
8       3.2          85.0      272.0     10.24
9       3.4          91.0      309.4     11.56
10      3.7          106.0     392.2     13.69
Total   26.90        769.00    2175.30   76.57
Mean    2.69         76.90
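$$b = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} = \frac{2175.30 - 10(2.69)(76.90)}{76.57 - 10(2.69)^2} = \frac{106.69}{4.209} = 25.3481$$
and
$$a = \bar{y} - b\bar{x} = 76.90 - (25.3481)(2.69) = 8.7137$$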
So the regression equation is
y = 8.7137 + 25.3481x
Calculate BP if BW is 3.5 kg:
y (BP) = 8.7137 + 25.3481 × 3.50
= 97.4321
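The worked example can be checked with a few lines of Python. This is a minimal sketch that applies the sum-of-products formulas for b and a to the ten BW/BP pairs from the table; the function name fit_line and the variable names are illustrative only, not part of the original slides.

```python
# Birth weight (kg) and blood pressure values from the worked example's table.
bw = [1.5, 1.9, 2.2, 2.5, 2.7, 2.8, 3.0, 3.2, 3.4, 3.7]
bp = [47.0, 50.0, 71.0, 76.0, 76.0, 81.0, 86.0, 85.0, 91.0, 106.0]


def fit_line(x, y):
    """Return (a, b) for y = a + b*x using the sum-of-products formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
    a = y_bar - b * x_bar
    return a, b


a, b = fit_line(bw, bp)
predicted_bp = a + b * 3.5  # predicted BP for a birth weight of 3.5 kg
print(f"a = {a:.4f}, b = {b:.4f}, predicted BP = {predicted_bp:.4f}")
```

Because the code carries full floating-point precision, the last decimal places can differ slightly from a hand calculation that rounds b and a to four decimals first.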
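Relationship between the regression coefficients,
i.e. b and b1, and the correlation coefficient
The relationship is
$$b\,b_1 = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} \times \frac{\sum xy - n\bar{x}\bar{y}}{\sum y^2 - n\bar{y}^2}$$
$$= \frac{\left(\sum xy - n\bar{x}\bar{y}\right)^2}{\left(\sum x^2 - n\bar{x}^2\right)\left(\sum y^2 - n\bar{y}^2\right)}$$
$$= \left(\frac{\sum xy - n\bar{x}\bar{y}}{\sqrt{\left(\sum x^2 - n\bar{x}^2\right)\left(\sum y^2 - n\bar{y}^2\right)}}\right)^2$$
$$= (r)^2 = r^2$$
$r^2 = b\,b_1$
i.e. $r = \sqrt{b\,b_1}$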
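The identity between the two regression coefficients and r can be verified numerically. The sketch below reuses the BW/BP data from the earlier worked example, computes b (y on x), b1 (x on y) and r from the same corrected sums, and compares b·b1 with r². The variable names are illustrative assumptions. Note that the square root only gives the magnitude of r; its sign is the common sign of b and b1.

```python
import math

# BW (x) and BP (y) values from the earlier worked example.
bw = [1.5, 1.9, 2.2, 2.5, 2.7, 2.8, 3.0, 3.2, 3.4, 3.7]
bp = [47.0, 50.0, 71.0, 76.0, 76.0, 81.0, 86.0, 85.0, 91.0, 106.0]

n = len(bw)
x_bar = sum(bw) / n
y_bar = sum(bp) / n

# Corrected sums used by all three quantities.
s_xy = sum(x * y for x, y in zip(bw, bp)) - n * x_bar * y_bar
s_xx = sum(x * x for x in bw) - n * x_bar ** 2
s_yy = sum(y * y for y in bp) - n * y_bar ** 2

b = s_xy / s_xx                      # regression coefficient of y on x
b1 = s_xy / s_yy                     # regression coefficient of x on y
r = s_xy / math.sqrt(s_xx * s_yy)    # correlation coefficient

print(f"b*b1 = {b * b1:.6f}, r^2 = {r ** 2:.6f}, r = {r:.4f}")
```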
Types of regression
• Simple linear regression
• Logistic regression
– Prediction of a qualitative dependent variable from one
or more independent variables (quantitative or
qualitative)
• Ordinal regression
Example: A study was conducted by a researcher to establish a statistical
relationship between X and Y. The data are given
below. Suppose Y is the response variable and X is the independent
variable.
- Find the correlation coefficient
- Estimate the regression equation
- Use the regression equation to predict Y for a patient whose X is 145
Patient   X     Y
1         140   12
2         141   12
3         143   11
4         141   14
5         141   17
6         133   8
7         135   13
8         143   13
9         130   14
10        150   14
11        139   15
12        130   10
13        140   10
14        161   18
15        135   11
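One way to work through the three tasks of this example is sketched below. It uses the deviation form of the sums (algebraically equivalent to the sum-of-products formulas given earlier) on the 15 patients' data; all variable names are illustrative and not taken from the slides.

```python
import math

# X and Y for the 15 patients in the example.
x = [140, 141, 143, 141, 141, 133, 135, 143, 130, 150, 139, 130, 140, 161, 135]
y = [12, 12, 11, 14, 17, 8, 13, 13, 14, 14, 15, 10, 10, 18, 11]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

r = s_xy / math.sqrt(s_xx * s_yy)   # task 1: correlation coefficient
b = s_xy / s_xx                     # task 2: slope of Y on X
a = y_bar - b * x_bar               #         intercept
y_at_145 = a + b * 145              # task 3: predicted Y when X = 145

print(f"r = {r:.4f}")
print(f"Y = {a:.4f} + {b:.4f} X")
print(f"predicted Y at X = 145: {y_at_145:.2f}")
```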
Example: A researcher selected 9 sets of identical
twins to determine whether there is a relationship between
the IQ scores of the first-born and second-born twins.
Is there a strong relationship in IQ
score between identical twins? The following table
presents their IQ scores.
Twin pair     1     2     3     4     5     6     7     8     9
First born    112   127   105   132   117   135   122   101   128
Second born   118   120   100   128   102   133   125   104   114
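The strength of the relationship in the twin data can be summarized by the correlation coefficient. The sketch below wraps the computation in a small helper, pearson_r, which is a hypothetical name introduced here for illustration. Whether the resulting value counts as "strong" is a judgement the code does not make.

```python
import math

# IQ scores for the 9 twin pairs in the example.
first_born = [112, 127, 105, 132, 117, 135, 122, 101, 128]
second_born = [118, 120, 100, 128, 102, 133, 125, 104, 114]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


r_twins = pearson_r(first_born, second_born)
print(f"r = {r_twins:.3f}")
```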
Multiple Regression Analysis
Topics
• Review Simple Regression Analysis
• Multiple Regression Analysis
– Design requirements
– Multiple regression model
– R²
– Testing R² and b's
– Comparing models
– Comparing standardized regression coefficients
Simple regression analysis
Simple regression considers the relation
between a single explanatory/independent
variable and a response/dependent variable,
i.e. Y = a + bX
Simple Regression Model
Regression coefficients are estimated by
minimizing Σ residuals² (i.e., the sum of the squared
residuals, where residual = observed − predicted) to derive
the fitted model.
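The least-squares idea can be illustrated with the BW/BP data from the earlier worked example. The sketch below computes the sum of squared residuals at the least-squares estimates and at two arbitrarily perturbed lines, which should both fit worse. The helper sum_squared_residuals and the perturbation size of 1.0 are assumptions made for illustration.

```python
# BW (x) and BP (y) values from the earlier worked example.
bw = [1.5, 1.9, 2.2, 2.5, 2.7, 2.8, 3.0, 3.2, 3.4, 3.7]
bp = [47.0, 50.0, 71.0, 76.0, 76.0, 81.0, 86.0, 85.0, 91.0, 106.0]


def sum_squared_residuals(a, b, x, y):
    """Sum of (observed - predicted)^2 for the line y = a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


# Least-squares estimates from the sum-of-products formulas.
n = len(bw)
x_bar = sum(bw) / n
y_bar = sum(bp) / n
b_hat = (sum(x * y for x, y in zip(bw, bp)) - n * x_bar * y_bar) / (
    sum(x * x for x in bw) - n * x_bar ** 2
)
a_hat = y_bar - b_hat * x_bar

ssr_fit = sum_squared_residuals(a_hat, b_hat, bw, bp)
# Any other line gives a larger sum; two arbitrary perturbations as a check.
ssr_other_slope = sum_squared_residuals(a_hat, b_hat + 1.0, bw, bp)
ssr_other_intercept = sum_squared_residuals(a_hat + 1.0, b_hat, bw, bp)

print(ssr_fit, ssr_other_slope, ssr_other_intercept)
```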
• A simple regression model (one independent variable) fits a
regression line in 2-dimensional space
• A multiple regression model with two explanatory variables
fits a regression plane in 3-dimensional space
Example: Self Concept and
Academic Achievement (N=103)
The General Idea
Multiple regression simultaneously considers
the influence of multiple explanatory variables
on a response variable Y
Total variation in Y =
predictable variation (by the combination of independent variables)
+ unpredictable variation
Proportion of Predictable and
Unpredictable Variation
R² = predictable (explained) variation in Y
(1 − R²) = unpredictable (unexplained) variation in Y
Where:
Y = AA
X1 = ASC
X2 = GSC
Various Significance Tests
• Testing R²
– Test R² through an F test
– Test of competing models (difference
between R²s) through an F test of the
difference of R²s
• Testing b
– Test of each partial regression
coefficient (b) by t-tests, i.e. t = b / SE(b)
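The slides do not spell out the F statistic for comparing competing models. A common textbook form, given here as an assumption rather than as the author's own notation, compares a full model with $k_f$ predictors against a reduced model with $k_r < k_f$ predictors fitted to the same N cases:

```latex
F = \frac{\left(R^2_{f} - R^2_{r}\right) / \left(k_f - k_r\right)}
         {\left(1 - R^2_{f}\right) / \left(N - k_f - 1\right)},
\qquad
df = \left(k_f - k_r,\; N - k_f - 1\right)
```

When the reduced model contains no predictors ($R^2_r = 0$, $k_r = 0$), this reduces to the overall F test of R².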
Example: Testing R²
• What proportion of variation in AA can be
predicted from GSC and ASC?
– Compute R²: R² = .16 (R = .41): 16% of the
variance in AA can be accounted for by the
composite of GSC and ASC
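The F statistic for this R² can be sketched as follows, assuming the standard overall F test, F = (R²/k) / ((1 − R²)/(N − k − 1)), with N = 103 cases and k = 2 predictors (GSC and ASC). The function name f_for_r_squared is hypothetical, and the resulting F still has to be compared with a critical value from an F table with (k, N − k − 1) degrees of freedom.

```python
def f_for_r_squared(r_squared, n_cases, n_predictors):
    """Overall F statistic for testing H0: R^2 = 0 (standard textbook form)."""
    df_model = n_predictors
    df_error = n_cases - n_predictors - 1
    f_value = (r_squared / df_model) / ((1 - r_squared) / df_error)
    return f_value, df_model, df_error


# Values from the example: R^2 = .16, N = 103, two predictors.
f_value, df_model, df_error = f_for_r_squared(0.16, 103, 2)
print(f"F({df_model}, {df_error}) = {f_value:.2f}")
```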
• H0: β = 0
• t_observed = (b − β) / (standard error of b)
• with N − k − 1 df
Example: t-test of b
• t_observed = (−0.44 − 0) / 14.24
• t_observed = −0.03
• t_critical(.05, 2, 100) ≈ 1.98 (1.96 is the large-sample value); |t_observed| is far below it, so H0 is not rejected
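The arithmetic of this t-test can be sketched in Python. The coefficient (−0.44), its standard error (14.24), N = 103 and k = 2 come from the example; the critical value 1.984 is an assumption taken from a standard t table (two-tailed, α = .05, 100 df), and the helper name t_statistic is illustrative.

```python
def t_statistic(b, se_b, beta_null=0.0):
    """t = (b - beta_null) / SE(b) for a partial regression coefficient."""
    return (b - beta_null) / se_b


n_cases, n_predictors = 103, 2
df = n_cases - n_predictors - 1          # N - k - 1 degrees of freedom

t_obs = t_statistic(-0.44, 14.24)        # values from the example
t_crit = 1.984                           # assumed table value: two-tailed, alpha = .05, df = 100
reject_h0 = abs(t_obs) > t_crit

print(f"t = {t_obs:.2f} on {df} df, reject H0: {reject_h0}")
```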
– In order of significance