Sample Testing
The answers to the problems will be posted on the Sunday after the due
date. To view the answers, click on the Homework Solutions link.
In this homework, one important issue is to identify whether to use the two-sample t-procedure
or the paired t-procedure. See the sample problems to make the distinction. In addition,
if the two-sample t is appropriate, you should check whether to use the pooled t or the
separate-variances t.
1. Problem 6.13 in 6th edition (which is problem 6.16 in 5th edition). [note that the output
is given in the problem and thus you do not need to use minitab to find the output again]
a) For the pooled-variance t statistic, the degrees of freedom are:
n1 + n2 - 2 = 24 + 36 - 2 = 58. The corresponding t-value in the table is -4.04.
b) For the separate-variance t statistic, the degrees of freedom are:
$$df = \frac{(n_1 - 1)(n_2 - 1)}{(1 - c)^2 (n_1 - 1) + c^2 (n_2 - 1)}, \quad \text{where } c = \frac{s_1^2 / n_1}{s_1^2 / n_1 + s_2^2 / n_2}$$
Thus, t = -3.90.
c)
H0: μF = μM
Vs
Ha: μF ≠ μM
i) For the pooled-variance test, p-value = 2*0.0001 = 0.0002 < α = 0.05. We reject the null
hypothesis. The same conclusion is obtained for α = 0.01.
ii) For the separate-variance test, p-value = 2*0.0002 = 0.0004 < α = 0.05. We reject the null
hypothesis. The same conclusion is obtained for α = 0.01.
We reach the same conclusion at each α with either statistic; therefore the
conclusion does not depend on which statistic is used.
2. Problem 6.29 in 6th edition (which is problem 6.32 in 5th edition). [again, for this
problem, the minitab output is given in the book]
a)
H0: μd = 0
Vs
Ha: μd ≠ 0
3. Problem 6.43 in 6th edition (which is problem 6.39 in 5th edition) only work on a, b
(skip c)
a)
H0: μNB = μWB
Vs
Ha: μNB ≠ μWB
The standard deviations of the two samples (7.87 vs 4.71) differ enough to suggest using the separate-variance t-test.
Two-Sample T-Test and CI: N-B Jet, W-B Jet
Two-sample T for N-B Jet vs W-B Jet
           N    Mean  StDev  SE Mean
N-B Jet   12  118.37   7.87      2.3
W-B Jet   15  110.20   4.71      1.2
P-Value = 0.006   DF = 17
At the 5% significance level (α = 0.05) we have p-value = 0.006 < α = 0.05. Therefore we
reject the null hypothesis and conclude that the data provide enough evidence that there
is a difference in the average noise level of the two jets.
b) Minitab gives a 95% confidence interval for the difference in mean noise level
between the two types of jets of (2.73, 13.60).
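The Minitab figures for this problem can be cross-checked from the summary statistics alone. The following is a minimal Python sketch, not the Minitab computation itself. The function name `welch_from_summary` is hypothetical, and the critical value 2.110 (t table, df = 17, upper-tail area 0.025) is entered by hand as an assumption rather than computed.

```python
import math


def welch_from_summary(n1, mean1, s1, n2, mean2, s2):
    """Separate-variance t statistic, approximate df, and standard error
    computed from summary statistics (hypothetical helper)."""
    v1 = s1 ** 2 / n1          # s1^2 / n1
    v2 = s2 ** 2 / n2          # s2^2 / n2
    se = math.sqrt(v1 + v2)    # standard error of the difference in means
    t = (mean1 - mean2) / se
    c = v1 / (v1 + v2)
    df = (n1 - 1) * (n2 - 1) / ((1 - c) ** 2 * (n1 - 1) + c ** 2 * (n2 - 1))
    return t, df, se


# Summary statistics from the Minitab output (N-B Jet first, W-B Jet second)
t_stat, df, se = welch_from_summary(12, 118.37, 7.87, 15, 110.20, 4.71)

# Assumption: tabled critical value for df = 17 and upper-tail area 0.025
T_CRIT = 2.110
diff = 118.37 - 110.20
ci = (diff - T_CRIT * se, diff + T_CRIT * se)

print(f"t = {t_stat:.2f}, df = {df:.2f} (rounded down to {math.floor(df)})")
print(f"95% CI for the difference: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The approximate df rounds down to the DF = 17 shown in the output, and the interval agrees with the reported one up to rounding of the summary statistics.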
4. Problem 6.55 in 6th edition (which is problem 6.63 in 5th edition) [hand compute, do
not use minitab]
a)
i) H0: Female candidates' expenditures in campaigns for public office are at least equal to
male candidates' expenditures.
Ha: Female candidates' expenditures in campaigns for public office are less than male
candidates' expenditures.
ii) H0: μF ≥ μM
Vs
Ha: μF < μM
b)
[Figure: Probability Plot of Female, Normal - 95% CI (x-axis: Female, y-axis: Percent). Mean = 245.3, StDev = 51.95, N = 20, AD = 0.383, P-Value = 0.364]
[Figure: Probability Plot of Male, Normal - 95% CI (x-axis: Male, y-axis: Percent). Mean = 351, StDev = 61.92, N = 20, AD = 0.187, P-Value = 0.892]
Both sets of data fall within the 95% lines of the normal probability plot; therefore
normality of each set of data can be assumed. The standard deviation of the
female sample, s1 = 51.95, is close to that of the male sample, s2 = 61.92. Therefore the
conditions of normality, equal variance, and independent random samples are assumed, and
we can proceed to estimate the confidence interval for μF - μM assuming independent samples
and equal variance.
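$$s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{19 \cdot 51.95^2 + 19 \cdot 61.92^2}{20 + 20 - 2}} = 57.15$$
c)
$$t = \frac{\bar{y}_1 - \bar{y}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{245.3 - 351}{57.15 \sqrt{1/10}} = -5.85$$
t = -5.85 < -t0.025 = -2.024; therefore we reject the null hypothesis and conclude that the
difference is statistically significant at the 0.05 level.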
d) Yes, the difference is of practical significance. The 95% confidence interval puts the monetary value of this difference
between $69,130 and $142,270, which is an important amount of money.
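The hand computation in parts b) to d) can be retraced with a few lines of Python. This is only a sketch under stated assumptions: expenditures are taken to be recorded in thousands of dollars (as the dollar figures in part d suggest), and the critical value 2.024 for df = 38 is copied from the solution rather than computed.

```python
import math

# Summary statistics from the probability plots (female = sample 1, male = sample 2)
n1, mean_f, s1 = 20, 245.3, 51.95
n2, mean_m, s2 = 20, 351.0, 61.92

# Pooled standard deviation
sp = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))

# Standard error of the difference and pooled t statistic
se = sp * math.sqrt(1 / n1 + 1 / n2)
t_stat = (mean_f - mean_m) / se

# Assumption: critical value t_0.025 with df = 38, taken from the solution
T_CRIT = 2.024
diff = mean_f - mean_m
ci = (diff - T_CRIT * se, diff + T_CRIT * se)

print(f"sp = {sp:.2f}, t = {t_stat:.2f}")
print(f"95% CI for muF - muM: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The interval endpoints are negative because the difference is female minus male; their magnitudes match the dollar range quoted in part d) up to rounding.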
5. Problem 6.56 in 6th edition (which is problem 6.64 in 5th edition)
The conditions to be satisfied before using the t procedure to analyze the data are checked as follows:
- Normality: the boxplots given show that the two data sets are just slightly skewed to the left, and the
means are close to the medians. So we can assume that the data are normally distributed.
- Equal variance: the standard deviation of the female sample, s1 = 51.95, is close to that of the male
sample, s2 = 61.92, so we can assume equal variance.
- Independence: it is stated in the problem that the groups of males and females were randomly selected,
so we can treat the two groups as independent.
Therefore all conditions for the t-test procedure used in the previous problem have been met.
6. Current Population Reports presents data on the ages of married people. Ten married
couples are randomly selected and have the ages shown here:
Husband  54  21  32  78  70  33  68  35  54  52
Wife     53  22  33  74  64  35  67  30  45  48
Do the data suggest that the mean age of married men is greater than the mean age of
married women? Determine whether you will use two sample t-test or paired t-test.
(Hint: is the data paired? ) Check conditions and use minitab to perform the test. Test at
3% level of significance.
Since the random selection is made on the couples, a paired t-test is appropriate: the
two samples (husbands and wives) are not independent. Moreover, there is wide
variability among the ten couples (78 and 74 for the oldest couple against 21 and 22 for the
youngest couple), so a paired t-test is appropriate for removing the
couple-to-couple variability.
[Figure: Probability Plot of Difference, Normal - 97% CI (x-axis: Difference, y-axis: Percent). Mean = 2.6, StDev = 3.565, N = 10, AD = 0.288, P-Value = 0.542]
The normal probability plot of the age differences between the two groups is
close to a straight line, and therefore we can assume that the paired differences have a normal
distribution. Also, since the couples were randomly selected, we can assume that the
paired differences are independent of one another.
H0: μM - μF ≤ 0
Vs
Ha: μM - μF > 0
             N   Mean  StDev  SE Mean
Husband     10  49.70  18.92     5.98
Wife        10  47.10  17.36     5.49
Difference  10   2.60   3.57     1.13
P-Value = 0.023
At df = 9, t0.03 = 2.15
t = 2.31 > t0.03 = 2.15 => We reject H0 and conclude that the data provide enough evidence
that the mean age of married men is greater than the mean age of married women.
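The paired t statistic can also be recomputed directly from the ten couples' ages. The sketch below uses only the Python standard library; the critical value 2.15 (df = 9, upper-tail area 0.03) is copied from the solution as an assumption rather than computed.

```python
import math
import statistics

# Ages of the ten randomly selected couples, in the order given in the problem
husband = [54, 21, 32, 78, 70, 33, 68, 35, 54, 52]
wife = [53, 22, 33, 74, 64, 35, 67, 30, 45, 48]

# A paired test works on the within-couple differences (husband minus wife)
diffs = [h - w for h, w in zip(husband, wife)]

mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)          # sample standard deviation (n - 1 divisor)
se_d = sd_d / math.sqrt(len(diffs))     # standard error of the mean difference
t_stat = mean_d / se_d                  # test statistic for H0: mean difference <= 0

# Assumption: critical value t_0.03 with df = 9, taken from the solution
T_CRIT = 2.15
reject = t_stat > T_CRIT

print(f"mean = {mean_d:.2f}, sd = {sd_d:.3f}, t = {t_stat:.2f}, reject H0: {reject}")
```

Working on the differences is what removes the couple-to-couple variability: the standard deviation of the differences (about 3.57) is far smaller than that of either age column (about 18).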
7. The costs of major surgery vary substantially from one state to another due to
differences in hospital fees, malpractice insurance cost, doctors' fees, and rent. A study of
hysterectomy costs was done in California and Montana. Based on a random sample of
200 patient records from each state, the sample statistics shown here were obtained.
State       Sample Size  Sample Mean  Sample Standard Deviation
Montana     200          $6,458       $250
California  200          $12,690      $890
a) Is there significant evidence that California has a higher mean hysterectomy cost
than Montana?
The sample standard deviation for California is more than 3 times the sample
standard deviation for Montana, suggesting that the equal variance assumption is not
appropriate. The problem states that the 200 patient records from each state
were randomly selected, so the samples are independent. With samples this large
(200 per state), the sample means can be treated as approximately normally distributed. We will
use the separate-variance t-test to analyze the data.
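H0: μC - μM ≤ 0
Vs
Ha: μC - μM > 0
$$df = \frac{(n_1 - 1)(n_2 - 1)}{(1 - c)^2 (n_1 - 1) + c^2 (n_2 - 1)}, \quad \text{where } c = \frac{s_1^2 / n_1}{s_1^2 / n_1 + s_2^2 / n_2} = 0.073$$
(c = 0.073 takes Montana as sample 1; because n1 = n2, the df is the same if the labels are swapped.)
$$t' = \frac{\bar{y}_C - \bar{y}_M}{\sqrt{\frac{s_C^2}{n_C} + \frac{s_M^2}{n_M}}} = \frac{12690 - 6458}{\sqrt{\frac{890^2}{200} + \frac{250^2}{200}}} = 95.34$$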
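For readers who want to retrace the arithmetic, the intermediate values can be written out as below. This is a worked sketch using the sample statistics from the table; the subscripts C and M (California, Montana) are labels introduced here, and the degrees-of-freedom value on the last line is computed from the formula and does not appear in the solution.

```latex
\begin{align*}
\frac{s_C^2}{n_C} &= \frac{890^2}{200} = 3960.5,
\qquad
\frac{s_M^2}{n_M} = \frac{250^2}{200} = 312.5 \\[4pt]
t' &= \frac{12690 - 6458}{\sqrt{3960.5 + 312.5}}
    = \frac{6232}{\sqrt{4273}} \approx 95.34 \\[4pt]
c &= \frac{312.5}{3960.5 + 312.5} = \frac{312.5}{4273} \approx 0.073 \\[4pt]
df &= \frac{199 \cdot 199}{(1 - 0.073)^2 \cdot 199 + 0.073^2 \cdot 199} \approx 230
\end{align*}
```

With roughly 230 degrees of freedom, the t distribution is very close to the standard normal, so a statistic near 95 lies far beyond any conventional critical value.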