Linear Correlation and Regression
Regression
Correlation
• Rectangular coordinates
• Two quantitative variables
• One variable is called independent (X) and
the second is called dependent (Y)
• Points are not joined
[Figure: scatter diagram with X on the horizontal axis and Y on the vertical axis]
Example. The data below give weight (Wt., kg) and systolic blood pressure (SBP, mmHg) for 10 paired observations.
Wt. (kg)      67   69   85   83   74   81   97   92  114   85
SBP (mmHg)   120  125  140  160  130  180  150  140  200  130
[Figure: scatter diagram of SBP (mmHg) against Wt (kg)]
[Figure: scatter diagram of Height in CM against Age in Weeks]
Negative relationship
[Figure: scatter diagram of Reliability against Age of Car]
No relation
Correlation Coefficient
- A statistic showing the degree of relation between two
variables
- If r = ±1, the correlation is perfect (direct for +1, inverse for −1).
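How to compute the simple correlation
coefficient (r)
$$ r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}} $$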
Example:
A sample of 6 children was selected, and their age in years
and weight in kilograms were recorded as shown in the
following table. It is required to find the correlation
between age and weight.
Serial No.   Age (yrs) (x)   Weight (kg) (y)    xy     x²     y²
1                 7                12            84     49    144
2                 6                 8            48     36     64
3                 8                12            96     64    144
4                 5                10            50     25    100
5                 6                11            66     36    121
6                 9                13           117     81    169
Total          ∑x = 41          ∑y = 66     ∑xy = 461  ∑x² = 291  ∑y² = 742
$$ r = \frac{461 - \frac{41 \times 66}{6}}{\sqrt{\left(291 - \frac{(41)^2}{6}\right)\left(742 - \frac{(66)^2}{6}\right)}} $$
r = 0.759
strong direct correlation
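The calculation above can be checked with a minimal Python sketch of the raw-sums formula. The function name `pearson_r` and the variable names are chosen here for illustration; the data are the six age and weight pairs from the table.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient via the raw-sums (computational) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
    return numerator / denominator

# Age (years) and weight (kg) of the 6 children in the table
age = [7, 6, 8, 5, 6, 9]
weight = [12, 8, 12, 10, 11, 13]

r = pearson_r(age, weight)
print(round(r, 2))  # 0.76
```

The intermediate sums inside the function (41, 66, 461, 291, 742) match the Total row of the table, which is a quick way to confirm the hand calculation.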
EXAMPLE: Relationship between Anxiety and
Test Scores
Anxiety (X)   Test score (Y)    X²     Y²     XY
10                 2            100      4     20
8                  3             64      9     24
2                  9              4     81     18
1                  7              1     49      7
5                  6             25     36     30
6                  5             36     25     30
∑X = 32         ∑Y = 32      ∑X² = 230  ∑Y² = 204  ∑XY = 129
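Calculating Correlation Coefficient
$$ r = \frac{129 - \frac{32 \times 32}{6}}{\sqrt{\left(230 - \frac{(32)^2}{6}\right)\left(204 - \frac{(32)^2}{6}\right)}} $$
r = - 0.94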
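As a cross-check, the same coefficient can be obtained from deviations about the means, which is algebraically equivalent to the raw-sums formula. The sketch below is illustrative only; the function name `pearson_r_deviations` is an assumption, and the data are the six anxiety and test-score pairs listed in this example.

```python
from math import sqrt

def pearson_r_deviations(xs, ys):
    """Pearson r computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    s_xy = sum(a * b for a, b in zip(dx, dy))  # sum of cross-products
    s_xx = sum(a * a for a in dx)              # sum of squares for x
    s_yy = sum(b * b for b in dy)              # sum of squares for y
    return s_xy / sqrt(s_xx * s_yy)

anxiety = [10, 8, 2, 1, 5, 6]
test_score = [2, 3, 9, 7, 6, 5]

r_anxiety = pearson_r_deviations(anxiety, test_score)
print(round(r_anxiety, 2))  # -0.94
```

The negative sign reflects that higher anxiety goes with lower test scores in this sample.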
[Figure: scatter diagram with Wt (kg) on the horizontal axis]
By using the least squares method (a procedure
that minimizes the vertical deviations of plotted
points surrounding a straight line) we are
able to construct a best fitting straight line to the
scatter diagram points and then formulate a
regression equation in the form of:
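Regression Equation
$$ \hat{y} = a + bX $$
$$ \hat{y} = \bar{y} + b(x - \bar{x}) $$
Slope
$$ b = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}} $$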
[Figure: SBP (mmHg) against Wt (kg)]
Linear Equation
$$ \hat{y} = a + bX $$
b = Slope = (Change in Y) / (Change in X)
a = Y-intercept
[Figure: straight line on X and Y axes, showing the intercept a and the slope b]
Hours studying and grades
Regressing grades on hours
[Figure: linear regression of final grade in course on study hours]
Final grade in course = 59.95 + 3.17 * study
R-Square = 0.88
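To make the fitted line and R-Square concrete, here is a small Python sketch. It uses only the two reported figures (59.95 + 3.17 * study, R-Square = 0.88); the value of 5 study hours is a hypothetical input chosen for illustration. It relies on the fact that in simple linear regression R-Square equals r squared, with r taking the sign of the slope.

```python
from math import sqrt, copysign

# Values reported for the grades-on-hours regression
intercept, slope, r_squared = 59.95, 3.17, 0.88

def predict_grade(hours):
    """Predicted final grade for a given number of study hours."""
    return intercept + slope * hours

# In simple regression, r is the square root of R-Square, signed like the slope
r = copysign(sqrt(r_squared), slope)

print(round(r, 2))                 # 0.94
print(round(predict_grade(5), 2))  # 75.8 (5 hours is a hypothetical input)
```

Read this way, each additional hour of study is associated with about 3.17 more grade points.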
Serial no.   Age (x)   Weight (y)
1               7          12
2               6           8
3               8          12
4               5          10
5               6          11
6               9          13
Solution:
Serial no.   Age (x)   Weight (y)    xy     x²     y²
1               7          12         84     49    144
2               6           8         48     36     64
3               8          12         96     64    144
4               5          10         50     25    100
5               6          11         66     36    121
6               9          13        117     81    169
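A least-squares line for this age and weight table can be sketched in Python as follows. The source does not show the fitted line for these six children, so the coefficients below are computed here rather than quoted; the function name `least_squares_line` is an assumption made for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for y-hat = a + b*x using the raw-sums slope formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)  # slope
    a = sum_y / n - b * sum_x / n                                 # intercept = y-bar - b * x-bar
    return a, b

age = [7, 6, 8, 5, 6, 9]
weight = [12, 8, 12, 10, 11, 13]

a, b = least_squares_line(age, weight)
print(f"y-hat = {a:.2f} + {b:.2f} x")  # y-hat = 4.69 + 0.92 x
```

The intercept is obtained from the means, which is the same relationship as writing the line in the form y-hat = y-bar + b(x - x-bar).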
Regression equation (for a separate data set relating age x to blood pressure y, with n = 20 and ∑x² = 41678)
ŷ = 112.13 + 0.4547 x
For age 25:
B.P. = 112.13 + 0.4547 × 25 = 123.50 ≈ 123.5 mm Hg
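The prediction step is a direct substitution into the fitted line, as in this short Python sketch. The coefficients are the ones stated above; the function name `predict_bp` is hypothetical.

```python
def predict_bp(age, intercept=112.13, slope=0.4547):
    """Predicted blood pressure (mm Hg) from the stated regression line."""
    return intercept + slope * age

bp_25 = predict_bp(25)
print(round(bp_25, 1))  # 123.5
```

Predictions of this kind are most trustworthy for ages inside the range of the data used to fit the line.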
Multiple Regression
Multiple regression analysis is a straightforward
extension of simple regression analysis which allows
more than one independent variable.
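As a sketch of what that extension looks like, the simple equation gains one slope per independent variable. The symbols X_1, ..., X_k and b_1, ..., b_k are notation assumed here for illustration and do not come from the source.

```latex
\hat{y} = a + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k
```

With k = 1 this reduces to the simple regression equation ŷ = a + bX.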