Chapter 4
4.1 a \(\bar{x} = \frac{\sum x_i}{n} = \frac{52 + 25 + 15 + 0 + 104 + 44 + 60 + 30 + 33 + 81 + 40 + 5}{12} = \frac{489}{12} = 40.75\)
Ordered data: 0, 5, 15, 25, 30, 33, 40, 44, 52, 60, 81, 104; Median = (33 + 40)/2 = 36.5
Mode = all
4.2 \(\bar{x} = \frac{\sum x_i}{n} = \frac{5 + 7 + 0 + 3 + 15 + 6 + 5 + 9 + 3 + 8 + 10 + 5 + 2 + 0 + 12}{15} = \frac{90}{15} = 6.0\)
Ordered data: 0, 0, 2, 3, 3, 5, 5, 5, 6, 7, 8, 9, 10, 12, 15; Median = 5
Mode = 5
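The three measures of central location in 4.2 can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the original solution.

```python
from statistics import mean, median, multimode

# Observations from Exercise 4.2
data = [5, 7, 0, 3, 15, 6, 5, 9, 3, 8, 10, 5, 2, 0, 12]

sample_mean = mean(data)      # sum of the observations divided by n
sample_median = median(data)  # middle value of the ordered data (n is odd here)
modes = multimode(data)       # every value tied for the highest frequency

print(sample_mean, sample_median, modes)
```

When every observation occurs exactly once, as in 4.1, `multimode` returns all of the values, which corresponds to the answer "Mode = all".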
4.3 a \(\bar{x} = \frac{\sum x_i}{n} = \frac{5.5 + 7.2 + 1.6 + 22.0 + 8.7 + 2.8 + 5.3 + 3.4 + 12.5 + 18.6 + 8.3 + 6.6}{12} = \frac{102.5}{12} = 8.54\)
Ordered data: 1.6, 2.8, 3.4, 5.3, 5.5, 6.6, 7.2, 8.3, 8.7, 12.5, 18.6, 22.0; Median = 6.9
Mode = all
b The mean number of miles jogged is 8.54. Half the sample jogged more than 6.9 miles and half
jogged less.
4.4 a \(\bar{x} = \frac{\sum x_i}{n} = \frac{33 + 29 + 45 + 60 + 42 + 19 + 52 + 38 + 36}{9} = \frac{354}{9} = 39.3\)
Ordered data: 19, 29, 33, 36, 38, 42, 45, 52, 60; Median = 38
Mode: all
b The mean amount of time is 39.3 minutes. Half the group took less than 38 minutes.
4.5 a \(\bar{x} = \frac{\sum x_i}{n} = \frac{14 + 8 + 3 + 2 + 6 + 4 + 9 + 13 + 10 + 12 + 7 + 4 + 9 + 13 + 15 + 8 + 11 + 12 + 4 + 0}{20} = \frac{164}{20} = 8.2\)
Ordered data: 0, 2, 3, 4, 4, 4, 6, 7, 8, 8, 9, 9, 10, 11, 12, 12, 13, 13, 14, 15; Median = 8.5
Mode = 4
b The mean number of days to submit grades is 8.2, the median is 8.5, and the mode is 4.
4.6 \(R_g = \sqrt[3]{(1 + R_1)(1 + R_2)(1 + R_3)} - 1 = \sqrt[3]{(1 + .25)(1 - .10)(1 + .50)} - 1 = .19\)
4.7 \(R_g = \sqrt[4]{(1 + R_1)(1 + R_2)(1 + R_3)(1 + R_4)} - 1 = \sqrt[4]{(1 + .50)(1 + .30)(1 - .50)(1 - .25)} - 1 = -.075\)
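4.8 a \(\bar{x} = \frac{\sum x_i}{n} = \frac{.10 + .22 + .06 - .05 + .20}{5} = \frac{.53}{5} = .106\)
Ordered data: -.05, .06, .10, .20, .22; Median = .10
b \(R_g = \sqrt[5]{(1 + R_1)(1 + R_2)(1 + R_3)(1 + R_4)(1 + R_5)} - 1 = \sqrt[5]{(1 + .10)(1 + .22)(1 + .06)(1 - .05)(1 + .20)} - 1 = .102\)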
c The geometric mean is best.
4.9 a \(\bar{x} = \frac{\sum x_i}{n} = \frac{-.15 - .20 + .15 - .08 + .50}{5} = \frac{.22}{5} = .044\)
Ordered data: -.20, -.15, -.08, .15, .50; Median = -.08
b \(R_g = \sqrt[5]{(1 + R_1)(1 + R_2)(1 + R_3)(1 + R_4)(1 + R_5)} - 1 = \sqrt[5]{(1 - .15)(1 - .20)(1 + .15)(1 - .08)(1 + .50)} - 1 = .015\)
c The geometric mean is best.
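The difference between the arithmetic and geometric means in 4.8 and 4.9 can be reproduced numerically. The following is a minimal sketch; the function name `geometric_mean_return` is an illustrative choice, not something defined in the text.

```python
import math

def geometric_mean_return(returns):
    """Geometric mean rate of return: n-th root of the product of (1 + R_i), minus 1."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1

returns_4_8 = [.10, .22, .06, -.05, .20]
returns_4_9 = [-.15, -.20, .15, -.08, .50]

for label, returns in (("4.8", returns_4_8), ("4.9", returns_4_9)):
    arithmetic = sum(returns) / len(returns)
    geometric = geometric_mean_return(returns)
    print(f"{label}: arithmetic mean = {arithmetic:.3f}, geometric mean = {geometric:.3f}")
```

In both exercises the geometric mean is smaller than the arithmetic mean, because it accounts for compounding from one period to the next.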
4.10 a Year 1 rate of return = \(\frac{1200 - 1000}{1000} = .20\)
Year 2 rate of return = \(\frac{1200 - 1200}{1200} = 0\)
Year 3 rate of return = \(\frac{1500 - 1200}{1200} = .25\)
Year 4 rate of return = \(\frac{2000 - 1500}{1500} = .33\)
b \(\bar{x} = \frac{\sum x_i}{n} = \frac{.20 + 0 + .25 + .33}{4} = \frac{.78}{4} = .195\)
Ordered data: 0, .20, .25, .33; Median = .225
c \(R_g = \sqrt[4]{(1 + R_1)(1 + R_2)(1 + R_3)(1 + R_4)} - 1 = \sqrt[4]{(1 + .20)(1 + 0)(1 + .25)(1 + .33)} - 1 = .188\)
d The geometric mean is best because \(1000(1.188)^4 \approx 2000\).
4.11 a Year 1 rate of return = \(\frac{10 - 12}{12} = -.167\)
Year 2 rate of return = \(\frac{14 - 10}{10} = .40\)
Year 3 rate of return = \(\frac{15 - 14}{14} = .071\)
Year 4 rate of return = \(\frac{22 - 15}{15} = .467\)
Year 5 rate of return = \(\frac{30 - 22}{22} = .364\)
Year 6 rate of return = \(\frac{25 - 30}{30} = -.167\)
b \(\bar{x} = \frac{\sum x_i}{n} = \frac{-.167 + .40 + .071 + .467 + .364 - .167}{6} = \frac{.968}{6} = .161\)
Ordered data: -.167, -.167, .071, .364, .40, .467; Median = .218
c \(R_g = \sqrt[6]{(1 + R_1)(1 + R_2)(1 + R_3)(1 + R_4)(1 + R_5)(1 + R_6)} - 1 = \sqrt[6]{(1 - .167)(1 + .40)(1 + .071)(1 + .467)(1 + .364)(1 - .167)} - 1 = .130\)
d The geometric mean is best because \(12(1.130)^6 \approx 25\).
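The reasoning in part d, that compounding the geometric mean over all six years recovers the ending value, can be verified directly. This is a sketch that assumes the yearly values implied by the numerators and denominators in part a (12, 10, 14, 15, 22, 30, 25); the variable names are illustrative.

```python
# Investment value at the start and at the end of each of the six years (from 4.11 a)
values = [12, 10, 14, 15, 22, 30, 25]

# Rate of return for each year: (ending value - starting value) / starting value
returns = [(end - start) / start for start, end in zip(values, values[1:])]

# Geometric mean: n-th root of the total growth factor, minus 1
growth = 1.0
for r in returns:
    growth *= 1 + r
geometric_mean = growth ** (1 / len(returns)) - 1

# Compounding the geometric mean for six years should give back the final value
ending_value = values[0] * (1 + geometric_mean) ** len(returns)

print([round(r, 3) for r in returns])
print(round(geometric_mean, 3), round(ending_value, 2))
```

The arithmetic mean of .161 would overstate the ending value if compounded the same way, which is why the geometric mean is preferred here.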
4.12 a \(\bar{x}\) = 75,750; median = 76,410
b The mean starting salary is $75,750. Half the sample earned less than $76,410.
4.13 a \(\bar{x}\) = 11.19; median = 11
b The mean number of days is 11.19. Half the sample took less than 11 days and half took more than 11 days to pay.
4.14 a \(\bar{x}\) = 117.08; median = 124.00
b The mean expenditure is $117.08 and half the sample spent less than $124.00.
4.15 a \(\bar{x}\) = 26.80; median = 27.00
b \(\bar{x}\) = 30.94; median = 31.00
c The mean and median commuting times in New York are larger than those in Los Angeles.
4.16 a \(\bar{x}\) = .81; median = .83
b The mean percentage is .81. Half the sample paid less than .83.
4.17 a \(\bar{x}\) = 32.91; median = 32; mode = 32
b The mean speed is 32.91 mph. Half the sample traveled slower than 32 mph and half traveled
faster. The mode is 32.
4.18 a \(\bar{x}\) = 592.04; median = 591.00
b The mean expenditure is $592.04. Half the sample spent less than $591.00.
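4.19 \(\bar{x} = \frac{\sum x_i}{n} = \frac{9 + 3 + 7 + 4 + 1 + 7 + 5 + 4}{8} = \frac{40}{8} = 5\)
\(s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{(9 - 5)^2 + (3 - 5)^2 + \cdots + (4 - 5)^2}{8 - 1} = \frac{46}{7} = 6.57\)
4.20 \(\bar{x} = \frac{\sum x_i}{n} = \frac{4 + 5 + 3 + 6 + 5 + 6 + 5 + 6}{8} = \frac{40}{8} = 5\)
\(s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{(4 - 5)^2 + (5 - 5)^2 + \cdots + (6 - 5)^2}{8 - 1} = \frac{8}{7} = 1.14\)
4.21 \(\bar{x} = \frac{\sum x_i}{n} = \frac{12 + 6 + 22 + 31 + 23 + 13 + 15 + 17 + 21}{9} = \frac{160}{9} = 17.78\)
\(s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{(12 - 17.78)^2 + (6 - 17.78)^2 + \cdots + (21 - 17.78)^2}{9 - 1} = \frac{433.56}{8} = 54.19\)
\(s = \sqrt{s^2} = \sqrt{54.19} = 7.36\)
4.22 \(\bar{x} = \frac{\sum x_i}{n} = \frac{0 + (-5) + (-3) + 6 + 4 + (-4) + 1 + (-5) + 0 + 3}{10} = \frac{-3}{10} = -.30\)
\(s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{(0 - (-.3))^2 + ((-5) - (-.3))^2 + \cdots + (3 - (-.3))^2}{10 - 1} = \frac{136.1}{9} = 15.12\)
\(s = \sqrt{s^2} = \sqrt{15.12} = 3.89\)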
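The sample variance and standard deviation in 4.21 and 4.22 can be checked against the definitional formula. This is a minimal sketch; `sample_variance` is a hypothetical helper written out only to mirror the formula, and the standard library's `statistics.variance` is used as a cross-check.

```python
from statistics import mean, stdev, variance

def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = mean(data)
    return sum((x - x_bar) ** 2 for x in data) / (len(data) - 1)

data_4_21 = [12, 6, 22, 31, 23, 13, 15, 17, 21]
data_4_22 = [0, -5, -3, 6, 4, -4, 1, -5, 0, 3]

for label, data in (("4.21", data_4_21), ("4.22", data_4_22)):
    s2 = sample_variance(data)
    print(f"{label}: s^2 = {s2:.2f}, s = {s2 ** 0.5:.2f}")
```

Dividing by n - 1 rather than n is what distinguishes the sample variance from the population variance.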
4.23 The data in (b) appear to be most similar to one another.
4.24 a: s 2 = 51.5
b: s 2 = 6.5
c: s 2 = 174.5
4.25 Variance cannot be negative because it is the sum of squared differences.
4.26 6, 6, 6, 6, 6
4.27 a about 68%
b about 95%
c About 99.7%
4.28 a From the empirical rule we know that approximately 68% of the observations fall between
46 and 54. Thus 16% are less than 46 (the other 16% are above 54).
b Approximately 95% of the observations are between 42 and 58. Thus, only 2.5% are above 58
and all the rest, 97.5% are below 58.
c See (a) above; 16% are above 54.
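The tail percentages in 4.28 follow from splitting the share outside each interval equally between the two tails. The sketch below assumes a mean of 50 and a standard deviation of 4, values inferred from the intervals 46 to 54 and 42 to 58 rather than stated in the solution; the names are illustrative.

```python
# Approximate Empirical Rule coverage within k standard deviations of the mean
EMPIRICAL_COVERAGE = {1: 0.68, 2: 0.95, 3: 0.997}

def tail_share(k):
    """Approximate share of observations beyond k standard deviations on one side."""
    return (1 - EMPIRICAL_COVERAGE[k]) / 2

mean, s = 50, 4  # assumed values, inferred from the intervals quoted in 4.28

below_46 = tail_share(1)   # below mean - 1s
above_58 = tail_share(2)   # above mean + 2s
below_58 = 1 - above_58    # everything else lies below mean + 2s

print(f"Below {mean - s}: {below_46:.1%}")
print(f"Above {mean + 2 * s}: {above_58:.1%}; below {mean + 2 * s}: {below_58:.1%}")
```

The equal split between the tails relies on the bell shape being symmetric, which is the same condition that justifies the Empirical Rule itself.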
4.29 a at least 75%
b at least 88.9%
4.30 a Nothing
b At least 75% lie between 60 and 180.
c At least 88.9% lie between 30 and 210.
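The bounds in 4.29 and 4.30 come from Chebysheff's Theorem, which guarantees that at least \(1 - 1/k^2\) of the observations lie within k standard deviations of the mean. The sketch below assumes a mean of 120 and a standard deviation of 30, values inferred from the intervals in 4.30 rather than stated in the solution.

```python
def chebysheff_lower_bound(k):
    """Minimum share of observations within k standard deviations of the mean (k >= 1)."""
    return 1 - 1 / k**2

mean, s = 120, 30  # assumed values, inferred from the intervals quoted in 4.30

for k in (1, 2, 3):
    low, high = mean - k * s, mean + k * s
    bound = chebysheff_lower_bound(k)
    print(f"k = {k}: at least {bound:.1%} lie between {low} and {high}")
```

For k = 1 the bound is zero, which is why the answer to 4.30 a is "Nothing".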
4.31 Range = 25.85, \(s^2\) = 29.46, and s = 5.43; there is considerable variation between prices; at
least 75% of the prices lie within 10.86 of the mean; at least 88.9% of the prices lie within 16.29
of the mean.
4.32 \(s^2\) = 40.73 mph\(^2\) and s = 6.38 mph; at least 75% of the speeds lie within 12.76 mph of the
mean; at least 88.9% of the speeds lie within 19.14 mph of the mean.
4.33 a
Punter   Variance   Standard deviation
1         40.22      6.34
2         14.81      3.85
3          3.63      1.91
b Punter 3 is the most consistent.
4.34 \(s^2\) = .0858 cm\(^2\) and s = .2929 cm; at least 75% of the lengths lie within .5858 cm of the mean;
at least 88.9% of the rods will lie within .8787 cm of the mean.
4.35 \(\bar{x}\) = 175.73 and s = 62.1; at least 75% of the withdrawals lie within $124.20 of the mean; at
least 88.9% of the withdrawals lie within $186.30 of the mean.
4.36 a s = 15.01
b In approximately 68% of the hours the number of arrivals falls within 15.01 of the mean; in
approximately 95% of the hours the number of arrivals falls within 30.02 of the mean; in
approximately 99.7% of the hours the number of arrivals falls within 45.03 of the mean.
4.37 a \(\bar{x}\) = 47.71, \(s^2\) = 302.18, and s = 17.38
b.
c The histogram is approximately bell shaped, allowing us to use the Empirical Rule.
Approximately 95% of adults are between 12.9 and 82.5 years old (within two standard deviations of the mean).
4.38 a \(\bar{x}\) = 77.86 and s = 85.35
b.
c. The histogram is positively skewed; we must use Chebysheff's Theorem. At least 75% of
American adults watch between 0 and 249 minutes of television news.
4.39 a \(\bar{x}\) = 23.4 and s = 19.6
b.
The histogram is very positively skewed. As a result we can only use Chebysheff's Theorem. At
least 75% of Americans born outside the United States were between 0 and 62.6 years old.
4.40 First quartile: \(L_{25} = (15 + 1)\frac{25}{100} = (16)(.25) = 4\); the fourth number is 3.
Second quartile: \(L_{50} = (15 + 1)\frac{50}{100} = (16)(.5) = 8\); the eighth number is 5.
Third quartile: \(L_{75} = (15 + 1)\frac{75}{100} = (16)(.75) = 12\); the twelfth number is 7.
4.41 30th percentile: \(L_{30} = (10 + 1)\frac{30}{100} = (11)(.30) = 3.3\); the 30th percentile is 22.3.
80th percentile: \(L_{80} = (10 + 1)\frac{80}{100} = (11)(.80) = 8.8\); the 80th percentile is 30.8.
4.42 20th percentile: \(L_{20} = (10 + 1)\frac{20}{100} = (11)(.20) = 2.2\); the 20th percentile is 43 + .2(51 - 43) = 44.6.
40th percentile: \(L_{40} = (10 + 1)\frac{40}{100} = (11)(.40) = 4.4\); the 40th percentile is 52 + .4(60 - 52) = 55.2.
4.43 First quartile: \(L_{25} = (13 + 1)\frac{25}{100} = (14)(.25) = 3.5\); the first quartile is 13.05.
Second quartile: \(L_{50} = (13 + 1)\frac{50}{100} = (14)(.5) = 7\); the second quartile is 14.7.
Third quartile: \(L_{75} = (13 + 1)\frac{75}{100} = (14)(.75) = 10.5\); the third quartile is 15.6.
4.44 Third decile: \(L_{30} = (15 + 1)\frac{30}{100} = (16)(.30) = 4.8\); the third decile is 5 + .8(7 - 5) = 6.6.
Sixth decile: \(L_{60} = (15 + 1)\frac{60}{100} = (16)(.60) = 9.6\); the sixth decile is 17 + .6(18 - 17) = 17.6.
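The location formula \(L_P = (n + 1)\frac{P}{100}\) with linear interpolation, as used in 4.40 to 4.44, can be sketched as follows. Because the data sets for those exercises are not reproduced in this section, the example reuses the twelve observations from 4.1; the function name `percentile` is an illustrative choice.

```python
def percentile(data, p):
    """P-th percentile using the location L_P = (n + 1) * P / 100 with interpolation."""
    ordered = sorted(data)
    n = len(ordered)
    location = (n + 1) * p / 100
    if location <= 1:          # location falls at or before the first observation
        return ordered[0]
    if location >= n:          # location falls at or beyond the last observation
        return ordered[-1]
    lower = int(location)      # position of the observation just below the location
    fraction = location - lower
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

# Observations from Exercise 4.1 (used only as sample input)
data_4_1 = [52, 25, 15, 0, 104, 44, 60, 30, 33, 81, 40, 5]

q1, q2, q3 = (percentile(data_4_1, p) for p in (25, 50, 75))
print(q1, q2, q3, "IQR =", q3 - q1)
```

The 50th percentile computed this way equals the median of 36.5 found in 4.1. Other software may use a different interpolation convention and return slightly different quartiles.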
4.45 Interquartile range = 15.6 - 13.05 = 2.55
4.46 Interquartile range = 7 - 3 = 4
4.47 First quartile = 5.75, third quartile = 15; interquartile range = 15 - 5.75 = 9.25
4.48
4.49 L85 = 75; The speed limit should be set at 75 mph.
4.50
a First quartile = 2, second quartile = 4, and third quartile = 8.
b Most executives spend little time reading resumes. Keep it short.
4.51 Dogs: First quartile = 1097.5, second quartile = 1204, and third quartile = 1337.
Cats: First quartile = 743, second quartile = 856, and third quartile = 988.
Dogs cost more money than cats. Both sets of expenses are positively skewed.
4.52 First quartile = 50, second quartile = 125, and third quartile = 260. The amounts are positively skewed.
4.53 BA First quartile = 25,730, second quartile = 27,765, and third quartile = 29,836
BSc First quartile = 29,927, second quartile = 33,397, and third quartile = 36,745
BBA First quartile = 31,316, second quartile = 34,284, and third quartile = 39,551
Other First quartile = 28,254, second quartile = 29,951, and third quartile = 32,905
The starting salaries of BA and other are the lowest and least variable. Starting salaries for BBA and BSc
are higher.
4.54 a
b The quartiles are 145.11, 164.17, and 175.18
c There are no outliers.
d The data are positively skewed. One-quarter of the times are below 145.11 and one-quarter are
above 175.18.
4.55a Private course: The quartiles are 145.11, 164.17, and 175.18
Public course: The quartiles are 279, 296, and 307
b The times taken to complete rounds on the public course are larger and more variable
than those on private courses.
4.56 a The quartiles are 26, 28.5, and 32
b The times are positively skewed.
4.57 The quartiles are 8081.81, 9890.48, and 11,692.92. One-quarter of mortgage payments are
less than $607.19 and one quarter exceed $909.38.
4.58 TIME1
TIME2
Americans spend more time watching news on television than reading news on the Internet.
4.59
4.60 EDUC
SPEDUC
The two sets of numbers are quite similar.
4.61 The quartiles are 34, 47, and 60.
Ages are symmetric.
4.62 The quartiles are 1, 2, 4
The number of hours of television watching is highly positively skewed.
4.63 There is a negative linear relationship. The strength is unknown.
4.64 a. \(r = \frac{s_{xy}}{s_x s_y} = \frac{-150}{(16)(12)} = -.7813\)
There is a moderately strong negative linear relationship.
b. \(R^2 = r^2 = (-.7813)^2 = .6104\)
61.04% of the variation in y is explained by the variation in x.
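4.65 a.
         x_i    y_i    x_i^2    y_i^2   x_i y_i
          20     14      400      196      280
          40     16    1,600      256      640
          60     18    3,600      324    1,080
          50     17    2,500      289      850
          50     18    2,500      324      900
          55     18    3,025      324      990
          60     18    3,600      324    1,080
          70     20    4,900      400    1,400
Total    405    139   22,125    2,437    7,220

\(\sum x_i = 405\), \(\sum y_i = 139\), \(\sum x_i^2 = 22,125\), \(\sum y_i^2 = 2,437\), \(\sum x_i y_i = 7,220\) (all sums run over i = 1, ..., n with n = 8)

\(s_{xy} = \frac{1}{n - 1}\left[\sum x_i y_i - \frac{\sum x_i \sum y_i}{n}\right] = \frac{1}{8 - 1}\left[7,220 - \frac{(405)(139)}{8}\right] = 26.16\)
\(s_x^2 = \frac{1}{n - 1}\left[\sum x_i^2 - \frac{(\sum x_i)^2}{n}\right] = \frac{1}{8 - 1}\left[22,125 - \frac{(405)^2}{8}\right] = 231.7\)
\(s_y^2 = \frac{1}{n - 1}\left[\sum y_i^2 - \frac{(\sum y_i)^2}{n}\right] = \frac{1}{8 - 1}\left[2,437 - \frac{(139)^2}{8}\right] = 3.13\)
\(s_x = \sqrt{s_x^2} = \sqrt{231.7} = 15.22\)
\(s_y = \sqrt{s_y^2} = \sqrt{3.13} = 1.77\)
\(r = \frac{s_{xy}}{s_x s_y} = \frac{26.16}{(15.22)(1.77)} = .9711\)
\(R^2 = r^2 = (.9711)^2 = .9430\)
The covariance is 26.16, the coefficient of correlation is .9711 and the coefficient of determination
is .9430.
94.30% of the variation in expenses is explained by the variation in total sales.
b. \(b_1 = \frac{s_{xy}}{s_x^2} = \frac{26.16}{231.7} = .113\)
\(\bar{x} = \frac{\sum x_i}{n} = \frac{405}{8} = 50.63\)
\(\bar{y} = \frac{\sum y_i}{n} = \frac{139}{8} = 17.38\)
\(b_0 = \bar{y} - b_1 \bar{x} = 17.38 - (.113)(50.63) = 11.66\)
The least squares line is \(\hat{y} = 11.66 + .113x\)
The estimated variable cost is .113 and the estimated fixed cost is 11.66.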
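The shortcut formulas used in 4.65 can be checked with a short script. This is a sketch that assumes the eight (x, y) pairs read off the table, with x as total sales and y as expenses; the variable names are illustrative.

```python
import math

# (x, y) pairs from the table in 4.65
x = [20, 40, 60, 50, 50, 55, 60, 70]
y = [14, 16, 18, 17, 18, 18, 18, 20]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)
sum_xy = sum(a * b for a, b in zip(x, y))

# Shortcut formulas for the sample covariance and sample variances
s_xy = (sum_xy - sum_x * sum_y / n) / (n - 1)
s2_x = (sum_x2 - sum_x ** 2 / n) / (n - 1)
s2_y = (sum_y2 - sum_y ** 2 / n) / (n - 1)

r = s_xy / math.sqrt(s2_x * s2_y)      # coefficient of correlation
b1 = s_xy / s2_x                       # slope of the least squares line
b0 = sum_y / n - b1 * sum_x / n        # intercept of the least squares line

print(f"s_xy = {s_xy:.2f}, s_x^2 = {s2_x:.1f}, s_y^2 = {s2_y:.2f}")
print(f"r = {r:.4f}, R^2 = {r * r:.4f}, y-hat = {b0:.2f} + {b1:.3f}x")
```

Carrying full precision gives r of about .972 rather than .9711, because the printed solution rounds \(s_x\) and \(s_y\) before dividing; the slope and intercept agree with the solution to the digits shown.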
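4.66
         x_i    y_i    x_i^2    y_i^2   x_i y_i
          40     77    1,600    5,929    3,080
          42     63    1,764    3,969    2,646
          37     79    1,369    6,241    2,923
          47     86    2,209    7,396    4,042
          25     51      625    2,601    1,275
          44     78    1,936    6,084    3,432
          41     83    1,681    6,889    3,403
          48     90    2,304    8,100    4,320
          35     65    1,225    4,225    2,275
          28     47      784    2,209    1,316
Total    387    719   15,497   53,643   28,712

\(\sum x_i = 387\), \(\sum y_i = 719\), \(\sum x_i^2 = 15,497\), \(\sum y_i^2 = 53,643\), \(\sum x_i y_i = 28,712\) (all sums run over i = 1, ..., n with n = 10)

\(s_{xy} = \frac{1}{n - 1}\left[\sum x_i y_i - \frac{\sum x_i \sum y_i}{n}\right] = \frac{1}{10 - 1}\left[28,712 - \frac{(387)(719)}{10}\right] = 98.52\)
\(s_x^2 = \frac{1}{n - 1}\left[\sum x_i^2 - \frac{(\sum x_i)^2}{n}\right] = \frac{1}{10 - 1}\left[15,497 - \frac{(387)^2}{10}\right] = 57.79\)
\(s_y^2 = \frac{1}{n - 1}\left[\sum y_i^2 - \frac{(\sum y_i)^2}{n}\right] = \frac{1}{10 - 1}\left[53,643 - \frac{(719)^2}{10}\right] = 216.32\)
\(r = \frac{s_{xy}}{s_x s_y} = \frac{98.52}{\sqrt{(57.79)(216.32)}} = .8811\)
\(R^2 = r^2 = (.8811)^2 = .7763\)
\(b_1 = \frac{s_{xy}}{s_x^2} = \frac{98.52}{57.79} = 1.705\)
\(\bar{x} = \frac{\sum x_i}{n} = \frac{387}{10} = 38.7\)
\(\bar{y} = \frac{\sum y_i}{n} = \frac{719}{10} = 71.9\)
\(b_0 = \bar{y} - b_1 \bar{x} = 71.9 - (1.705)(38.7) = 5.917\)
The least squares line is \(\hat{y} = 5.917 + 1.705x\)
e. There is a strong positive linear relationship between marks and study time. For each additional
hour of study time marks increased on average by 1.705.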
4.67
          x_i     y_i       x_i^2    y_i^2    x_i y_i
          599     9.6     358,801    92.16    5,750.4
          689     8.8     474,721    77.44    6,063.2
          584     7.4     341,056    54.76    4,321.6
          631    10.0     398,161   100.00    6,310.0
          594     7.8     352,836    60.84    4,633.2
          643     9.2     413,449    84.64    5,915.6
          656     9.6     430,336    92.16    6,297.6
          594     8.4     352,836    70.56    4,989.6
          710    11.2     504,100   125.44    7,952.0
          611     7.6     373,321    57.76    4,643.6
          593     8.8     351,649    77.44    5,218.4
          683     8.0     466,489    64.00    5,464.0
Total   7,587   106.4   4,817,755   957.2    67,559.2

\(\sum x_i = 7,587\), \(\sum y_i = 106.4\), \(\sum x_i^2 = 4,817,755\), \(\sum y_i^2 = 957.2\), \(\sum x_i y_i = 67,559.2\) (all sums run over i = 1, ..., n with n = 12)

\(s_{xy} = \frac{1}{n - 1}\left[\sum x_i y_i - \frac{\sum x_i \sum y_i}{n}\right] = \frac{1}{12 - 1}\left[67,559.2 - \frac{(7,587)(106.4)}{12}\right] = 26.16\)
\(s_x^2 = \frac{1}{n - 1}\left[\sum x_i^2 - \frac{(\sum x_i)^2}{n}\right] = \frac{1}{12 - 1}\left[4,817,755 - \frac{(7,587)^2}{12}\right] = 1,897.7\)
\(s_x = \sqrt{s_x^2} = \sqrt{1,897.7} = 43.56\)
\(s_y^2 = \frac{1}{n - 1}\left[\sum y_i^2 - \frac{(\sum y_i)^2}{n}\right] = \frac{1}{12 - 1}\left[957.2 - \frac{(106.4)^2}{12}\right] = 1.25\)
\(s_y = \sqrt{s_y^2} = \sqrt{1.25} = 1.12\)
\(r = \frac{s_{xy}}{s_x s_y} = \frac{26.16}{(43.56)(1.12)} = .5362\)
\(R^2 = r^2 = (.5362)^2 = .2875\)
The covariance is 26.16, the coefficient of correlation is .5362, and the coefficient of
determination is .2875. The coefficient of determination tells us that 28.75% of the variation in
MBA GPAs is explained by the variation in GMAT scores.
4.68
\(R^2 = r^2 = (.6332)^2 = .4009\); 40.09% of the variation in the employment rate is explained by the
variation in the unemployment rate.
4.69 a
\(R^2 = r^2 = (.2543)^2 = .0647\).
b There is a weak linear relationship between age and medical expenses. Only 6.47% of the
variation in average medical bills is explained by the variation in age.
c
5.966 .2257 x
The least squares line is y
d For each additional year of age mean medical expenses increase on average by $.2257 or 23
cents.
e Charge 25 cents per day per year of age.
4.70
\(R^2 = (.2435)^2 = .0593\)
Only 5.93% of the variation in the number of houses sold is explained by the variation in interest
rates.
4.71
Only 0.55% of the variation in the number of wells drilled is explained by the variation in the
price of oil. The relationship is too weak to interpret the value of the slope coefficient.
4.72
\(R^2 = (.0830)^2 = .0069\).
There is a very weak positive relationship between the two variables.
4.73
\(\hat{y} = 315.5 + 3.3x\); Fixed costs = $315.50, variable costs = $3.30
4.74
\(\hat{y} = 263.4 + 71.65x\); Estimated fixed costs = $263.40, estimated variable costs = $71.65
4.75a
b The slope coefficient is 510.37; home attendance increases on average by 510.37 for each win.
46.41% of the variation in home attendance is explained by the variation in the number of wins.
4.76a
\(R^2\) = .0915; there is a very weak relationship between the two variables.
b The slope coefficient is 58.59; away attendance increases on average by 58.59 for each win.
However, the relationship is very weak.
4.77
a. The slope coefficient is .26; for each million dollars in payroll the number of wins increases on
average by .26. Thus, the cost of winning one additional game is 1/.26 million = $3.846 million.
b. The coefficient of determination tells us that only 4.11.9% of the variation in the number of
wins is explained by the variation in payroll.
4.78
a. The slope coefficient is .0428; for each million dollars in payroll the number of wins increases
on average by .0428. Thus, the cost of winning one additional game is 1/.0428 million = $23.364
million.
b. The coefficient of determination = .0866, which reveals that the linear relationship is very weak.
4.79
a. The slope coefficient is .1526; for each million dollars in payroll the number of wins increases
on average by .1526. Thus, the cost of winning one additional game is 1/.1526 million = $6.553
million.
b. The coefficient of determination = .0876, which reveals that the linear relationship is very weak.
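The "cost of one additional win" in 4.77 to 4.79 is the reciprocal of the slope: if one million dollars of payroll buys \(b_1\) wins on average, one win costs \(1/b_1\) million dollars. A minimal sketch, with an illustrative function name:

```python
def cost_of_extra_win(slope):
    """Payroll in $ millions associated with one more win, given wins per $ million."""
    return 1 / slope

# Slope coefficients reported in Exercises 4.77, 4.78, and 4.79
slopes = {"4.77": 0.26, "4.78": 0.0428, "4.79": 0.1526}

costs = {exercise: round(cost_of_extra_win(b1), 3) for exercise, b1 in slopes.items()}
for exercise, cost in costs.items():
    print(f"{exercise}: ${cost} million per additional win")
```

Because the coefficients of determination are small, these figures describe an average tendency only and say little about any individual team.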
4.80a
For each additional win home attendance increases on average by 84.391. The coefficient of
determination is .2468; there is a weak relationship between the number of wins and home
attendance.
b
For each additional win away attendance increases on average by 31.151. The coefficient of
determination is .4407; there is a moderately strong relationship between the number of wins and
away attendance.
4.81
\(R^2\) = .4023. The relationship between wins and home attendance as a percentage of capacity is
weaker than the relationship between wins and home attendance.
4.82
For each additional win home attendance increases on average by 947.38. The coefficient of
determination is .1108; there is a very weak linear relationship between the number of wins and
home attendance.
For each additional win away attendance increases on average by 216.74. The coefficient of
determination is .0322; there is a very weak linear relationship between the number of wins and
away attendance.
4.83
\(R^2\) = .3304. The relationship between wins and home attendance as a percentage of capacity is
stronger than the relationship between wins and home attendance.
4.84 a
There is a weak negative linear relationship between education and television watching.
b \(R^2\) = .0572; 5.72% of the variation in the amount of television is explained by the variation in
education.
4.85 Correlation matrix
There is a weak positive linear relationship between the two variables.
4.86 Correlation matrix
There is a weak positive linear relationship between the two variables.
4.87
Company       b_1      R^2
AT&T          0.687    .318
Aetna         1.256    .296
Cigna         1.829    .463
Coca-Cola     0.601    .324
Disney        1.104    .592
Ford          2.654    .296
McDonalds     0.637    .314
4.88
Company                          b_1      R^2
Barrick Gold                     0.594    .071
Bell Canada Enterprises (BCE)    0.399    .089
Bank of Montreal (BMO)           0.610    .164
Enbridge                         0.314    .109
Fortis                           0.211    .032
Methanex                         1.301    .270
Research in Motion (RIM)         1.465    .201
Telus                            0.446    .097
Trans Canada Pipeline            0.393    .197
4.89
Company               b_1      R^2
Amazon                1.324    .267
Amgen                 0.492    .096
Apple                 1.358    .401
Cisco Systems         1.100    .604
Google                1.075    .327
Intel                 1.074    .556
Microsoft             0.865    .436
Oracle                0.866    .526
Research in Motion    1.920    .387
4.90 a
b We can see that among those who repaid the mean score is larger than that of those who did not
and the standard deviation is smaller. This information is similar but more precise than that
obtained in Exercise 3.23.
4.91 Repaid loan:
Defaulted on loan:
The box plots make it a little easier to see the overlap between the two sets of data (indicating that
the scorecard is not very good).
4.92
\(R^2 = (.6784)^2 = .4603\); 46.03% of the variation in statistics marks is explained by the variation in
calculus marks. The coefficient of determination provides a more precise indication of the
strength of the linear relationship.
4.93
The least squares line is \(\hat{y} = 369.93 + 116.53x\). On average, for each additional mph the cost of
repair increases by $116.53.
4.94
a \(\hat{y} = 17.933 + .6041x\)
b The coefficient of determination is .0505, which indicates that only 5.05% of the variation in
incomes is explained by the variation in heights.
4.95
The coefficient of determination is .0779, which indicates that only 7.79% of the variation in sales
is explained by the time between movies.
4.96a
b. The slope coefficient is .07; for each additional square foot the price increases on average by
$.07 thousand. More simply, for each additional square foot the price increases on average by $70.
c. From the least squares line we can more precisely measure the relationship between the two
variables.
4.97 B.A.
B.Sc.
B.B.A.
Other
Using the same class limits the histograms provide more detail than do the box plots.
4.98 Private course
Public course
The information obtained here is more detailed than the information provided by the box plots.
4.99
a \(\bar{x}\) = 35.01, median = 36
b s = 7.68
c Half of the bone density losses lie below 36. At least 75% of the numbers lie between 19.64 and
50.38, at least 88.9% of the numbers lie between 11.96 and 58.06.
4.100
a \(\bar{x}\) = 29,913, median = 30,660
b \(s^2\) = 148,213,791; s = 12,174
c
d The number of coffees sold varies considerably.
4.101
\(R^2 = r^2 = (.5742)^2 = .3297\); 32.97% of the variation in bone loss is explained by the variation in age.
4.102 a & b
\(R^2\) = .5489 and the least squares line is \(\hat{y} = 49,337 - 553.7x\)
c 54.89% of the variation in the number of coffees sold is explained by the variation in temperature.
For each additional degree of temperature the number of coffees sold decreases on average by 553.7
cups. Alternatively, for each 1-degree drop in temperature the number of coffees sold increases on
average by 553.7 cups.
d We can measure the strength of the linear relationship accurately and the slope coefficient gives
information about how temperature and the number of coffees sold are related.
4.103a mean, median, and standard deviation
\(\bar{x}\) = 93.90, s = 7.72
c We hope Chris is better at statistics than he is at golf.
4.104
a \(\bar{x}\) = 26.32 and median = 26
b \(s^2\) = 88.57, s = 9.41
c.
d The times are positively skewed. Half the times are above 26 hours.
4.105
80.21% of the variation in scores is explained by the variation in the number of putts.
4.106 a & b
\(R^2\) = .412 and the least squares line is \(\hat{y} = 8.2897 + 3.146x\)
c 41.2% of the variation in Internet use is explained by the variation in education. For each
additional year of education Internet use increases on average by 3.146 hours.
d We can measure the strength of the linear relationship accurately and the slope coefficient gives
information about how education and Internet use are related.
4.107
\(\bar{x}\) = 150.77, median = 150.50, and s = 19.76. The average crop yield is 150.77 and there is a great
deal of variation from one plot to another.
4.108a & b
\(R^2\) = .369 and the least squares line is \(\hat{y}\) = 89.543 + .128 Rainfall
c 36.92% of the variation in yield is explained by the variation in rainfall. For each additional
inch of rainfall yield increases on average by .128 bushels.
d We can measure the strength of the linear relationship accurately and the slope coefficient gives
information about how rainfall and crop yield are related.
4.109
\(R^2\) = .1549 and the least squares line is \(\hat{y}\) = 120.37 + .1802 Fertilizer
c 15.49% of the variation in yield is explained by the variation in the amount of fertilizer. For
each additional unit of fertilizer yield increases on average by .180 bushels.
d We can measure the strength of the linear relationship accurately and the slope coefficient gives
information about how the amount of fertilizer and crop yield are related.
4.110a
b The mean debt is $12,067. Half the sample incurred debts below $12,047 and half incurred debts
above. The mode is $11,621.
Case 4.1 a Scatter diagrams with time as the independent variable and temperature anomalies as
the dependent variable
The average monthly increase is .0006. Over the 1600-month period the increase was 1600(.0006) = .96°
Celsius.
Scatter diagrams with carbon dioxide levels as the independent variable and temperature
anomalies as the dependent variable
The coefficient of determination is .5075, which means that 50.75% of the variation in
temperature anomalies is explained by the variation in CO2 levels. There is a moderately strong
linear relationship.
Case 4.2 1880 to 1940
From 1880 to 1940 the earth warmed at an average monthly rate of .0007° Celsius.
1941 to 1975
From 1941 to 1975 the earth cooled at an average monthly rate of .0004° Celsius.
1976 to 1997
From 1976 to 1997 the earth warmed at an average monthly rate of .0021° Celsius.
1998 to 2009
From 1998 to 2009 the earth warmed at an average monthly rate of .0012° Celsius.
Over different periods of time the earth has warmed and cooled.
Case 4.3 2003-04 Season
The cost of winning one additional game is $1 million/.1526 = $6.553 million. However, the
coefficient of determination is only .0876, which tells us that there are many other variables that
determine how well a team will do.
2005-06 Season
The cost of winning one additional game is $1 million/.7795 = $1.283 million. The coefficient of
determination is .3072.
The small coefficient of determination in the year before the strike seems to indicate that team
owners were spending large amounts of money and getting little in return. The results are
markedly different in the year after the strike. There is a much stronger linear relationship between
payroll and the number of wins and the cost of winning one additional game is considerably
smaller.
Case 4.4
The coefficient of determination is \((-.1787)^2 = .0319\). There is a weak negative linear relationship
between percentage of rejected ballots and Percentage of yes votes.
The coefficient of determination is \((.3600)^2 = .1296\). There is a moderate positive linear
relationship between percentage of rejected ballots and Percentage of Allophones.
The coefficient of determination is \((.0678)^2 = .0046\). There is a very weak positive linear
relationship between percentage of rejected ballots and Percentage of Allophones.
The statistics provide some evidence that electoral fraud has taken place.