SBST3203
ELEMENTARY DATA ANALYSIS
MAY 2020
FINAL EXAM
NAME : ARIF SOEBAH
ID NO : 830811125679001
PHONE NUMBER : 013-8880791
EMAIL : [email protected]
PART A (choose Question 1 and 3)
Question 1 (a)
\[ \hat{y} = -4.907 + 0.4105x \]
$b_1 = 0.4105$ is the slope, which indicates that y is expected to increase by 0.4105 units when x increases by 1 unit.
$b_0 = -4.907$ is the y-intercept of the regression line, which indicates that when $x = 0$, $\hat{y} = -4.907$.
For $x = 120$:
\[ \hat{y} = -4.907 + 0.4105x \]
\[ \hat{y} = -4.907 + 0.4105(120) \]
\[ \hat{y} = 44.353 \]
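The substitution above can be checked with a short sketch. It assumes the intercept is negative (-4.907), since that is the sign consistent with the stated result of 44.353; the function name `predict` is only illustrative.

```python
# Fitted line from Question 1 (a), assuming a negative intercept,
# which is the sign that reproduces the stated prediction 44.353.
B0 = -4.907   # y-intercept
B1 = 0.4105   # slope


def predict(x):
    """Return the predicted y for a given x on the fitted line."""
    return B0 + B1 * x


y_hat_120 = predict(120)
print(round(y_hat_120, 3))
```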
Question 1 (b)(i)
Based on Table A:
x        y        xy        x^2
110      44       4840      12100
135      50       6750      18225
85       28       2380      7225
95       33       3135      9025
115      41       4715      13225
70       25       1750      4900
Total    610      221       23570     64700   (totals of x, y, xy, x^2)
For example, in the first row $xy = 110 \times 44 = 4840$ and $x^2 = 110^2 = 12100$.
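\[ s_{yy} = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n} \]
\[ s_{yy} = \left(44^2 + 50^2 + 28^2 + 33^2 + 41^2 + 25^2\right) - \frac{(44 + 50 + 28 + 33 + 41 + 25)^2}{6} \]
\[ s_{yy} = 8615 - \frac{221^2}{6} \]
\[ s_{yy} = 474.8333 \]
\[ s_{xy} = \sum xy - \frac{\left(\sum x\right)\left(\sum y\right)}{n} \]
\[ s_{xy} = 23570 - \frac{(610)(221)}{6} \]
\[ s_{xy} = 1101.6667 \]
\[ s^2 = \frac{s_{yy} - b_1 s_{xy}}{n - 2} \]
\[ s^2 = \frac{474.8333 - 0.4105(1101.6667)}{6 - 2} \]
\[ s^2 = 5.6498 \]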
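As a cross-check on the arithmetic above, the following minimal sketch recomputes the sums of squares and the variance estimate from the Table A values. It uses the slope rounded to 0.4105, as in the working; the variable names are illustrative.

```python
# Table A data as listed in the answer.
x = [110, 135, 85, 95, 115, 70]
y = [44, 50, 28, 33, 41, 25]
n = len(x)

sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_y2 = sum(yi ** 2 for yi in y)

# Corrected sums of squares and cross-products.
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Variance estimate about the regression line, using the rounded slope.
b1 = 0.4105
s2 = (s_yy - b1 * s_xy) / (n - 2)

print(round(s_yy, 4), round(s_xy, 4), round(s2, 4))
```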
Question 1 (b)(ii)
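95% confidence interval for $\beta_1$:
\[ b_1 \pm t_{\alpha/2,\,n-2}\; s_{b_1} \]
\[ t_{0.025,\,4} = 2.776 \]
\[ s_{b_1} = \sqrt{\frac{s^2}{s_{xx}}} \]
\[ s_{xx} = \sum x^2 - \frac{\left(\sum x\right)^2}{n} = 64700 - \frac{610^2}{6} = 2683.3333 \]
\[ s_{b_1} = \sqrt{\frac{5.6498}{2683.3333}} = 0.0459 \]
95% confidence interval for $\beta_1$:
\[ 0.4105 \pm 2.776(0.0459) \]
\[ (0.2831,\ 0.5379) \]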
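The interval can be reproduced with the sketch below. It takes the critical value 2.776 and the summary figures directly from the working above instead of looking them up from a library, so it is a check on the arithmetic only.

```python
import math

# Summary values taken from the working above.
n = 6
sum_x = 610
sum_x2 = 64700
s2 = 5.6498      # variance estimate about the regression line
b1 = 0.4105      # estimated slope
t_crit = 2.776   # t(0.025, 4), as quoted from the t table

s_xx = sum_x2 - sum_x ** 2 / n
s_b1 = math.sqrt(s2 / s_xx)          # standard error of the slope

margin = t_crit * s_b1
lower, upper = b1 - margin, b1 + margin

print(round(s_xx, 4), round(s_b1, 4))
print(round(lower, 4), round(upper, 4))
```

Small differences in the last decimal place can appear because the hand calculation rounds the standard error to 0.0459 before multiplying.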
Question 3
3(a)(i)
Let $H_0$ be the null hypothesis and $H_1$ the alternative hypothesis.
Hypotheses:
$H_0$: Attending or not attending the course has no effect on award status.
$H_1$: Attending or not attending the course has an effect on award status.
3(a)(ii)
The formula for the expected value of a cell is
\[ E = \frac{(\text{row total containing that cell})(\text{column total containing that cell})}{\text{grand total}} \]
The row and column totals are calculated as below:
                   Attended    Did Not Attend    Total
Received           20          12                32
Did Not Receive    32          36                68
Total              52          48                100
The table below shows the calculations used to obtain the expected values:
Expected Values    Attended                 Did Not Attend           Total
Received           (52)(32)/100 = 16.64     (48)(32)/100 = 15.36     32
Did Not Receive    (52)(68)/100 = 35.36     (48)(68)/100 = 32.64     68
Total              52                       48                       100
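The expected counts can be generated directly from the observed table, as in this minimal sketch. The nested-list layout (rows: received, did not receive; columns: attended, did not attend) is an assumption made for the illustration.

```python
# Observed counts: rows = received / did not receive,
# columns = attended / did not attend.
observed = [
    [20, 12],
    [32, 36],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell = row total * column total / grand total.
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

for row in expected:
    print([round(value, 2) for value in row])
```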
3(b)(i)
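Degrees of freedom: $df = (m - 1)(n - 1)$,
where m is the number of rows and n is the number of columns.
\[ df = (m - 1)(n - 1) = (2 - 1)(2 - 1) = 1 \]
\[ \alpha = 0.01 \]
Critical value for this test:
\[ \chi^2_{\alpha,\,df} = \chi^2_{0.01,\,1} = 6.635 \]
Decision rule: We reject the null hypothesis if $\chi^2 > 6.635$.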
3(b)(ii)
\[ \chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} \]
\[ \chi^2 = \frac{(20 - 16.64)^2}{16.64} + \frac{(12 - 15.36)^2}{15.36} + \frac{(32 - 35.36)^2}{35.36} + \frac{(36 - 32.64)^2}{32.64} \]
\[ \chi^2 = 0.678 + 0.735 + 0.319 + 0.346 \]
\[ \chi^2 \approx 2.079 \]
Decision:
By using the rejection region approach:
Since the test statistic $\chi^2 = 2.079 < 6.635$, we fail to reject $H_0$.
Conclusion:
We do not have enough evidence at $\alpha = 0.01$ to show that attending or not attending the course has an effect on whether a salesman wins or does not win an industry award.
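The test statistic and decision can be verified with the sketch below. The critical value 6.635 is taken from the working above, not computed, and the flat-list layout of the cells is only for illustration.

```python
# Observed and expected counts, cell by cell, in the same order.
observed = [20, 12, 32, 36]
expected = [16.64, 15.36, 35.36, 32.64]

# Contribution of each cell to the chi-square statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

critical_value = 6.635  # chi-square(0.01, df = 1), as quoted above
reject_h0 = chi_square > critical_value

print([round(c, 3) for c in contributions])
print(round(chi_square, 3), reject_h0)
```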
PART B
QUESTION 1 (a)
\[ R^2 = 0.998 \]
99.8% of the total variation in y is explained by $x_1$ and $x_2$.
Since the $R^2$ value is high, we conclude that the model fits the data well.
QUESTION 1 (b)
Hypotheses:
\[ H_0: \beta_1 = \beta_2 = 0 \]
\[ H_1: \beta_i \neq 0 \text{ for at least one value of } i,\ i = 1, 2 \]
Consider the ANOVA table:
\[ F_{\text{test statistic}} = \frac{MSR}{MSE} = \frac{442.4}{0.30} = 1474.6667 \]
From the table,
\[ F_{0.99,\,2,\,7} = 9.547 \]
Conclusion:
Since $F > F_{0.99,\,2,\,7}$, we reject $H_0$ at the 1% significance level.
Thus, the model contributes significantly to the prediction of y at the $\alpha = 0.01$ level of significance.
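A short sketch of the same overall F test follows. The mean squares and the critical value are the figures quoted from the ANOVA table and the F table above; nothing is recomputed from raw data.

```python
# Mean squares quoted from the ANOVA table.
msr = 442.4   # mean square for regression
mse = 0.30    # mean square error

f_statistic = msr / mse

f_critical = 9.547  # F(0.99; 2, 7), as quoted from the F table
reject_h0 = f_statistic > f_critical

print(round(f_statistic, 4), reject_h0)
```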
The prediction equation relating $\hat{y}$ and $x_1$ when $x_2 = 4$ is
\[ \hat{y} = 28.3906 + 1.4631x_1 - 3.8445x_2 \]
\[ \hat{y} = 28.3906 + 1.4631x_1 - 3.8445(4) \]
\[ \hat{y} = 13.0126 + 1.4631x_1 \]
$\hat{y} = 13.0126 + 1.4631x_1$ is the required prediction equation.
QUESTION 2 (a)
Let,
xi = The number of hours studied for a statistics test
yi = The test score for a statistics test
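We know that the fitted simple linear regression model is
\[ \hat{y} = \hat{a} + \hat{b}x \]
where $\hat{a}$ is the estimate of the y-intercept of the regression line and $\hat{b}$ is the estimate of the slope of the regression line.
Also we have
\[ \hat{b} = \frac{\operatorname{cov}(x, y)}{\operatorname{var}(x)} \quad \text{or} \quad \hat{b} = \frac{ss_{xy}}{ss_x}, \qquad \text{and} \qquad \hat{a} = \bar{y} - \hat{b}\bar{x} \]
where
\[ ss_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) \]
\[ ss_x = \sum (x_i - \bar{x})^2 \]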
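Here,
\[ n = 10, \quad \sum x_i = 100, \quad \sum y_i = 564 \]
\[ \bar{x} = \frac{\sum x_i}{n} = \frac{100}{10} = 10, \qquad \bar{y} = \frac{\sum y_i}{n} = \frac{564}{10} = 56.4 \]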
$(x_i - \bar{x})^2$        $(x_i - \bar{x})(y_i - \bar{y})$
(4 - 10)^2 = 36            (4 - 10)(31 - 56.4) = 152.4
1                          -1.6
0                          0
16                         66.4
36                         116.4
9                          37.2
4                          7.2
144                        415.2
81                         318.6
49                         193.2
Total = 376                Total = 1305
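From that, we get
\[ \hat{b} = \frac{ss_{xy}}{ss_x} = \frac{1305}{376} = 3.470745 \]
To find $\hat{a}$,
\[ \hat{a} = \bar{y} - \hat{b}\bar{x} \]
\[ \hat{a} = 56.4 - 3.470745(10) \]
\[ \hat{a} = 21.692553 \]
The simple linear regression model that fits the given data is
\[ \hat{y} = 21.692553 + 3.470745x \]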
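The estimates can be reproduced from the summary quantities alone, as sketched below. Only the totals and means stated in the working are used, because the full list of raw observations is not reproduced in this answer.

```python
# Summary quantities stated in the working above.
n = 10
sum_x = 100
sum_y = 564
ss_xy = 1305   # sum of (x - x_bar) * (y - y_bar)
ss_x = 376     # sum of (x - x_bar) squared

x_bar = sum_x / n
y_bar = sum_y / n

# Least-squares estimates of the slope and intercept.
b_hat = ss_xy / ss_x
a_hat = y_bar - b_hat * x_bar

print(round(b_hat, 6), round(a_hat, 6))
```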
QUESTION 2(b)
\[ \hat{y} = 21.692553 + 3.470745x \]
\[ \hat{y} = 21.692553 + 3.470745(16) \]
\[ \hat{y} = 77.224468 \]
\[ \hat{y} \approx 77 \]
The average test score of a student who spends 16 hours studying for the test is 77.
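The prediction for the substituted value x = 16 can be checked with the sketch below. It assumes the coefficients are carried at full precision (1305/376 for the slope), which is what reproduces 77.224468; multiplying the rounded coefficients instead gives a value that differs only in the sixth decimal place.

```python
# Coefficients at full precision, from the summary quantities.
b_hat = 1305 / 376
a_hat = 56.4 - b_hat * 10

hours = 16  # value substituted in the working above

y_hat = a_hat + b_hat * hours
y_hat_rounded_coeffs = 21.692553 + 3.470745 * hours

print(round(y_hat, 6), round(y_hat_rounded_coeffs, 6), round(y_hat))
```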