Statistics
Chapter 8: Tests of Hypotheses
Where We’ve Been
Calculated point estimators of
population parameters
Used the sampling distribution of a
statistic to assess the reliability of an
estimate through a confidence interval
Where We’re Going
Test a specific value of a population
parameter
Measure the reliability of the test
8.1: The Elements of a Test of
Hypotheses
Confidence Interval
µ? Where on the number line do the data point us?
(No prior idea about the value of the parameter.)
Hypothesis Test
Do the data point us to this particular value? µ0?
(We have a value in mind from the outset.)
Null Hypothesis: H0
•This will be supported
unless the data
provide evidence that it
is false
• The status quo
Alternative Hypothesis: Ha
•This will be supported if
the data provide sufficient
evidence that it is true
• The research hypothesis
If the observed value of the test statistic
is reasonably likely when H0 is true, then H0 is
not rejected.
If the observed value of the test statistic
is (very) unlikely when H0 is true, then H0 is
rejected.
Note: Null hypotheses
are either rejected, or
else there is insufficient
evidence to reject them.
(I.e., we don’t accept
null hypotheses.)
• Null hypothesis (H0): A theory about the values of one or more parameters
• Ex.: H0: µ = µ0 (a specified value for µ)
• Alternative hypothesis (Ha): Contradicts the null hypothesis
• Ex.: Ha: µ ≠ µ0
• Test Statistic: The sample statistic to be used to test the hypothesis
• Rejection region: The values for the test statistic which lead to rejection of
the null hypothesis
• Assumptions: Clear statements about any assumptions concerning the
target population
• Experiment and calculation of test statistic: The appropriate calculation for
the test based on the sample data
• Conclusion: Reject the null hypothesis (with possible Type I error) or do
not reject it (with possible Type II error)
Suppose a new interpretation of the rules by
soccer referees is expected to increase the
number of yellow cards per game. The
average number of yellow cards per game
had been 4. A sample of 121 matches
produced an average of 4.7 yellow cards
per game, with a standard deviation of .5
cards. At the 5% significance level, has
there been a change in infractions called?
$z^* = \dfrac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} \approx \dfrac{4.7 - 4}{.5/\sqrt{121}} = \dfrac{.7}{.0455} \approx 15.4$
8.2: Large-Sample Test of a
Hypothesis about a Population Mean
The null hypothesis is
usually stated as an
equality …
… even though the alternative hypothesis
can be one-sided (µ < µ0 or µ > µ0) or two-sided (µ ≠ µ0).
One-Tailed Test: $z = \dfrac{\bar{x} - \mu_0}{\sigma_{\bar{x}}}$
Two-Tailed Test: $z = \dfrac{\bar{x} - \mu_0}{\sigma_{\bar{x}}}$
Conditions: 1) A random sample is selected from the target population.
2) The sample size n is large.
H0 : µ = 60,000
Ha : µ ≠ 60,000
Test Statistic:
$z = \dfrac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \dfrac{61{,}340 - 60{,}000}{2{,}185} = .613$
Rejection Region: | z | > z.025 = 1.96
Do not reject H0
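The two-tailed test above can be sketched in Python as a quick check. This is a minimal sketch, not part of the original slides: the function name `z_test_two_tailed` is hypothetical, and the value 2,185 is used as the standard error exactly as in the calculation above.

```python
from statistics import NormalDist


def z_test_two_tailed(x_bar, mu_0, std_error, alpha):
    """Large-sample test of H0: mu = mu_0 versus Ha: mu != mu_0.

    Returns the test statistic, the critical value z_{alpha/2},
    and whether H0 is rejected.
    """
    z = (x_bar - mu_0) / std_error
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of N(0, 1)
    return z, z_crit, abs(z) > z_crit


# Numbers from the example above; alpha = .05 gives z_.025 = 1.96
z, z_crit, reject = z_test_two_tailed(61_340, 60_000, 2_185, alpha=0.05)
print(f"z = {z:.3f}, critical value = {z_crit:.2f}, reject H0: {reject}")
# z = 0.613, critical value = 1.96, reject H0: False
```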
8.3: Observed Significance Levels: p-Values
Suppose z = 2.12.
P(z > 2.12) = .0170.
But it’s pretty close, isn’t it?
The observed significance level, or p-value, for
a test is the probability of observing a test statistic
at least as extreme as the one actually
observed (z*), assuming the null hypothesis is true.
p-value = P(z ≥ z* | H0)
The lower this probability, the stronger the evidence against H0.
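As an illustration, the upper-tail p-value for the value z = 2.12 used earlier in this section can be computed from the standard normal distribution. A minimal sketch, assuming an upper-tailed test; the helper name `upper_tail_p_value` is hypothetical.

```python
from statistics import NormalDist


def upper_tail_p_value(z_obs):
    """P(z >= z_obs | H0) when the test statistic is standard normal under H0."""
    return 1 - NormalDist().cdf(z_obs)


p = upper_tail_p_value(2.12)
print(f"p-value = {p:.4f}")  # p-value = 0.0170
```

For a two-tailed test, the same tail area would be doubled.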
Let’s go back to the Economics of Education Review report (x̄ = $61,340, s = $2,185). This time we’ll test H0: µ = $65,000.
H0 : µ = 65,000
Ha : µ ≠ 65,000
Test Statistic:
$z = \dfrac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \dfrac{61{,}340 - 65{,}000}{2{,}185} = -1.675$
p-value: 2P(x̄ < 61,340 | H0) = P(|z| > 1.675) ≈ 2(.047) = .094
8.4: Small-Sample Test of a
Hypothesis about a Population Mean
$t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
One-Tailed Test: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Two-Tailed Test: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Suppose copiers average 100,000 copies
between paper jams. A salesman
claims his are better, and offers to
leave 5 units for testing. The average
number of copies between jams is
100,987, with a standard deviation of
157. Does his claim seem believable?
H0 : µ = 100,000
Ha : µ > 100,000
Test Statistic:
$t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{100{,}987 - 100{,}000}{157/\sqrt{5}} = 14.06$
p-value: P(x̄ > 100,987 | H0) = P(t > 14.06 with df = 4) < .001
Reject the null hypothesis based on the very low probability of seeing the observed results if the null were true.
So, the claim does seem plausible.
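The copier calculation can be checked with a short Python sketch. It uses only the standard library, so the tail probability of the t distribution is approximated by numerically integrating its density with Simpson's rule. The function names are hypothetical and the integration approach is an assumption of this sketch, not something from the slides.

```python
import math


def t_statistic(x_bar, mu_0, s, n):
    """Small-sample test statistic t = (x_bar - mu_0) / (s / sqrt(n))."""
    return (x_bar - mu_0) / (s / math.sqrt(n))


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t_obs, df, steps=2000):
    """Approximate P(T > t_obs) for t_obs >= 0 using Simpson's rule on [0, t_obs].

    steps must be even. The density is symmetric about 0, so the area
    to the right of 0 is 0.5.
    """
    h = t_obs / steps
    total = t_pdf(0.0, df) + t_pdf(t_obs, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Numbers from the copier example: n = 5, so df = 4
t = t_statistic(100_987, 100_000, 157, 5)
p = t_upper_tail(t, df=4)
print(f"t = {t:.2f}, p-value < .001: {p < 0.001}")
# t = 14.06, p-value < .001: True
```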
8.5: Large-Sample Test of a Hypothesis about
a Population Proportion
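One-Tailed Test: $z = \dfrac{\hat{p} - p_0}{\sigma_{\hat{p}}}$
Two-Tailed Test: $z = \dfrac{\hat{p} - p_0}{\sigma_{\hat{p}}}$
p0 = hypothesized value of p, $\sigma_{\hat{p}} = \sqrt{p_0 q_0 / n}$, and q0 = 1 - p0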
Conditions: 1) A random sample is selected from a binomial population.
2) The sample size n is large (i.e., np0 and nq0 are both > 15).
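A minimal Python sketch of this test statistic, including the sample-size condition, might look as follows. The function name `proportion_z_test` and the data (112 successes in 200 trials, p0 = .5) are hypothetical and chosen only for illustration; they do not come from the slides.

```python
import math
from statistics import NormalDist


def proportion_z_test(p_hat, p_0, n):
    """Large-sample z statistic for H0: p = p_0.

    The standard error uses the hypothesized p_0, not the sample p_hat.
    """
    q_0 = 1 - p_0
    if n * p_0 <= 15 or n * q_0 <= 15:
        raise ValueError("np0 and nq0 must both exceed 15 for the large-sample test")
    sigma_p_hat = math.sqrt(p_0 * q_0 / n)
    return (p_hat - p_0) / sigma_p_hat


# Hypothetical data: 112 successes in n = 200 trials, testing Ha: p > .5
z = proportion_z_test(p_hat=112 / 200, p_0=0.5, n=200)
p_value = 1 - NormalDist().cdf(z)  # one-tailed (upper) p-value
print(f"z = {z:.3f}, one-tailed p-value = {p_value:.4f}")
```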
Rope designed for use in
the theatre must
withstand unusual
stresses. Assume a
brand of 3” three-strand
rope is expected to have
a breaking strength of
1400 lbs. A vendor
receives a shipment of
rope and needs to
(destructively) test it.
H0: p = .01
Ha: p > .01
Rejection region: z > 2.236
Test statistic:
$z = \dfrac{\hat{p} - p_0}{\sigma_{\hat{p}}} = \dfrac{.013 - .01}{\sqrt{(.01)(.99)/1500}} \approx 1.17$
There is insufficient evidence to reject the null hypothesis based on the sample results.
8.6: A Nonparametric Test
About a Population Median
The sign test provides inferences
about population medians, or central
tendencies, when skewed data or an
outlier would invalidate tests based on
normal distributions.
One-tailed test: H0: η = η0 versus Ha: η > η0 (or Ha: η < η0)
Two-tailed test: H0: η = η0 versus Ha: η ≠ η0
One-tailed test for a Population Median
Observed significance level: p-value = P(x ≥ S)
Two-tailed test for a Population Median
Observed significance level: p-value = 2P(x ≥ S)
Median time to failure for a brand of compact disc players
is 5,250 hours. Twenty players from a competitor are
tested, with failure times from 5 hours to 6,575 hours.
Fourteen of the players exceed 5,250 hours.
Do the competitor’s machines perform differently?
H0 : η = 5,250
Ha : η ≠ 5,250
α = .10
$z_{\alpha/2} = z_{.05} = 1.645$
$z = \dfrac{(S - .5) - .5n}{.5\sqrt{n}} = \dfrac{13.5 - 10}{.5\sqrt{20}} = 1.565 < 1.645$
Do not reject H0
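The sign test for the compact disc player example can be sketched in Python. Here S = 14 is the number of the 20 failure times above the hypothesized median. The exact binomial p-value is included as a cross-check on the normal approximation; it is an addition of this sketch, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def sign_test_z(s, n):
    """Large-sample sign-test statistic with continuity correction."""
    return ((s - 0.5) - 0.5 * n) / (0.5 * math.sqrt(n))


def sign_test_exact_p(s, n, two_tailed=True):
    """Exact p-value P(x >= s), where x is binomial with n trials and p = .5."""
    tail = sum(math.comb(n, k) for k in range(s, n + 1)) / 2 ** n
    return 2 * tail if two_tailed else tail


z = sign_test_z(14, 20)
z_crit = NormalDist().inv_cdf(1 - 0.10 / 2)  # z_.05 for a two-tailed test at alpha = .10
p_exact = sign_test_exact_p(14, 20)
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {z > z_crit}")
print(f"exact two-tailed p-value = {p_exact:.4f}")
# z = 1.565, critical value = 1.645, reject H0: False
# exact two-tailed p-value = 0.1153
```

Both approaches agree here: the exact p-value exceeds α = .10, so H0 is not rejected.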
8.7: Inference concerning a
Population Variance
Sometimes the primary parameter of
interest is not the population mean but
rather the population variance σ².
We choose a random sample of size n from
a normal distribution.
The sample variance s² can be used in its
standardized form:
$\chi^2 = \dfrac{(n-1)s^2}{\sigma^2}$
which has a chi-square distribution with n - 1 degrees of freedom.
To test H0 : σ² = σ0² versus Ha : one or two tailed
we use the test statistic
$\chi^2 = \dfrac{(n-1)s^2}{\sigma_0^2}$
with a rejection region based on a chi-square distribution with df = n - 1.
Confidence interval:
$\dfrac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \dfrac{(n-1)s^2}{\chi^2_{(1-\alpha/2)}}$
•A cement manufacturer claims that his
cement has a compressive strength with a
standard deviation of 10 kg/cm2 or less. A
sample of n = 10 measurements produced a
mean and standard deviation of 312 and
13.96, respectively.
A test of hypothesis:
H0: σ = 10 (claim is correct)
Ha: σ > 10 (claim is wrong)
uses the test statistic:
$\chi^2 = \dfrac{(n-1)s^2}{\sigma_0^2} = \dfrac{9(13.96^2)}{100} = 17.5$
•Do these data produce sufficient evidence
to reject the manufacturer’s claim? Use
α = .05.
Rejection region: Reject H0 if χ² > 16.919 (the upper .05 point of the chi-square distribution with df = 9).
Conclusion: Since χ² = 17.5, H0 is
rejected. The standard deviation
of the cement strengths is more
than 10.
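The cement calculation can be reproduced with a few lines of Python. This is a sketch only: the critical value 16.919 (upper .05 point of the chi-square distribution with 9 degrees of freedom) is hard-coded from a table rather than computed, and the names `chi_square_statistic` and `CHI2_CRIT_05_DF9` are hypothetical.

```python
def chi_square_statistic(s, n, sigma_0):
    """Test statistic (n - 1) s^2 / sigma_0^2 for H0: sigma = sigma_0."""
    return (n - 1) * s ** 2 / sigma_0 ** 2


# Table value (assumed, not computed): chi-square upper .05 point, df = 9
CHI2_CRIT_05_DF9 = 16.919

chi2 = chi_square_statistic(s=13.96, n=10, sigma_0=10)
reject = chi2 > CHI2_CRIT_05_DF9
print(f"chi-square = {chi2:.2f}, reject H0: {reject}")
# chi-square = 17.54, reject H0: True
```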
Equivalence of Confidence
Interval & Testing
Exercises
The average number of minutes of a television
commercial is 4.8. Write down the null and
alternative hypotheses for testing the above
statement. Assuming the commercial time is
normally distributed, give the appropriate rejection
region for each of the following sample sizes and
significance levels.
a. n = 6, α= 0.01
b. n = 12, α = 0.05
c. n = 20, α= 0.1