Introduction to
Hypothesis Testing
Chapter Goals
After completing this chapter, you
should be able to:
Formulate null and alternative hypotheses for
applications involving a single population mean or
proportion
Formulate a decision rule for testing a hypothesis
Know how to use the test statistic, critical value,
and p-value approaches to test the null hypothesis
Know what Type I and Type II errors are
Compute the probability of a Type II error
What is a Hypothesis?
A hypothesis is a claim
(assumption) about a
population parameter:
◦ population mean
Example: The mean monthly cell phone bill
of this city is μ = $42
◦ population proportion
Example: The proportion of adults in this
city with cell phones is p = .68
The Null Hypothesis, H0
Is always about a population
parameter, not about a sample statistic:
H0: μ ≥ 3 (correct)    H0: x̄ ≥ 3 (incorrect)
The Null Hypothesis, H0
(continued)
Claim: the population mean age is 50.
(Null Hypothesis: H0: μ = 50)
Now select a random sample from the population.
Suppose the sample mean age is 20: x̄ = 20
Is x̄ = 20 likely if μ = 50?
If not likely, REJECT the Null Hypothesis.
Reason for Rejecting H0
Sampling Distribution of x̄ (if H0 is true, centered at μ = 50; the observed x̄ = 20 lies far out in the tail)
If it is unlikely that we would get a sample mean of this value (x̄ = 20)
... if in fact this were the population mean (μ = 50)
... then we reject the null hypothesis that μ = 50.
Level of Significance, α
Defines unlikely values of sample
statistic if null hypothesis is true
Defines rejection region of the sampling
distribution
Is designated by α (level of significance)
Typical values are .01, .05, or .10
Is selected by the researcher at the beginning
Provides the critical value(s) of the test
Level of Significance
and the Rejection Region
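Level of significance = α
[Figure: sampling distributions with the rejection region shaded; the boundary of the shaded region represents the critical value]
Lower tail test: H0: μ ≥ 3, HA: μ < 3 (rejection region of area α in the lower tail)
Upper tail test: H0: μ ≤ 3, HA: μ > 3 (rejection region of area α in the upper tail)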
Type II Error
◦ Fail to reject a false null hypothesis
Key: Outcome (Probability)

                      State of Nature
Decision              H0 True               H0 False
Do Not Reject H0      No error (1 - α)      Type II Error (β)
Reject H0             Type I Error (α)      No Error (1 - β)
Type I & II Error Relationship
◦ β increases when α decreases
◦ β increases when σ increases
◦ β increases when n decreases
Critical Value
Approach to Testing
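Convert sample statistic (e.g.: x̄) to test
statistic (Z or t statistic)
The cutoff in x̄ units for a lower tail test is (use a plus sign for an upper tail test):
$\bar{x}_\alpha = \mu - z_\alpha \frac{\sigma}{\sqrt{n}}$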
Two Tailed Tests
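H0: μ = 3
HA: μ ≠ 3
There are two cutoff values (critical values):
± z_{α/2} in z units, or
x̄_{α/2} (lower) and x̄_{α/2} (upper) in x̄ units
[Figure: rejection regions of area α/2 in each tail; reject H0 below -z_{α/2} or above z_{α/2}, do not reject H0 in between]
$\bar{x}_{\alpha/2} = \mu_0 \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$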
Calculating the Test Statistic
Hypothesis Tests for μ:
◦ σ Known
◦ σ Unknown (Large Samples or Small Samples)
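α = .05, so the critical value is -zα = -1.645
[Figure: standard normal curve; reject H0 for z < -1.645, do not reject H0 otherwise; the test statistic z = -2.0 falls in the rejection region]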
Since z = -2.0 < -1.645, we reject the null
hypothesis that the mean number of TVs in US
homes is at least 3
Hypothesis Testing Example
(continued)
An alternate way of constructing the rejection region, now expressed in x̄ units, not z units (α = .05):
$\bar{x}_\alpha = \mu - z_\alpha \frac{\sigma}{\sqrt{n}} = 3 - 1.645 \cdot \frac{0.8}{\sqrt{100}} = 2.8684$
Reject H0 if x̄ < 2.8684; do not reject H0 otherwise.
Since x̄ = 2.84 < 2.8684, we reject the null hypothesis.
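The two equivalent decision rules in this example (compare z with -zα, or compare x̄ with the cutoff x̄α) can be checked numerically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative, and the inputs (μ = 3, σ = 0.8, n = 100, x̄ = 2.84, α = .05) are the ones used in the example.

```python
from math import sqrt
from statistics import NormalDist

# Inputs taken from the TV example (lower tail test, H0: mu >= 3)
mu0, sigma, n, x_bar, alpha = 3.0, 0.8, 100, 2.84, 0.05

std_error = sigma / sqrt(n)             # standard error of the sample mean
z_stat = (x_bar - mu0) / std_error      # test statistic in z units
z_crit = NormalDist().inv_cdf(alpha)    # lower tail critical value, about -1.645
x_crit = mu0 + z_crit * std_error       # the same cutoff expressed in x-bar units

# Both rules must lead to the same decision
reject_by_z = z_stat < z_crit
reject_by_x = x_bar < x_crit

print(f"z = {z_stat:.2f}, critical z = {z_crit:.3f}, reject: {reject_by_z}")
print(f"x-bar cutoff = {x_crit:.4f}, reject: {reject_by_x}")
```

Both comparisons agree because the x̄ cutoff is just the z critical value rescaled by the standard error.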
p-Value Approach to Testing
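What is z given α = 0.10?
The upper tail area is α = .10, so the area between 0 and z is .50 - .10 = .40.
Standard Normal Distribution Table (Portion)
Z     .07     .08     .09
1.1   .3790   .3810   .3830
1.2   .3980   .3997   .4015
1.3   .4147   .4162   .4177
The table entry closest to .40 is .3997 (row 1.2, column .08), so the critical value is z = 1.28.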
Example: Test Statistic (continued)
Hypothesis Tests for p
◦ np ≥ 5 and n(1-p) ≥ 5: the sampling distribution of p̂ is normal, so the test statistic is a z value:
$z = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}}$
◦ np < 5 or n(1-p) < 5: not discussed in this chapter
Example: z Test for Proportion
A marketing company claims that it receives 8% responses from
its mailing. To test this claim, a random sample of 500 were
surveyed with 25 responses. Test at the α = .05 significance level.
Check:
n p = (500)(.08) = 40
n(1-p) = (500)(.92) = 460
Both values are at least 5, so the normal approximation applies.
Z Test for Proportion: Solution
H0: p = .08
HA: p ≠ .08
α = .05
n = 500, p̂ = .05
Critical Values: ± 1.96 (reject H0 if z < -1.96 or z > 1.96; area .025 in each tail)
Test Statistic:
$z = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}} = \frac{.05 - .08}{\sqrt{\frac{.08(1 - .08)}{500}}} = -2.47$
Decision:
Reject H0 at α = .05 (z = -2.47 < -1.96)
Conclusion:
There is sufficient evidence to reject the company’s claim of 8% response rate.
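The proportion test above can be reproduced in a few lines. This is a sketch under the example's stated inputs (claimed p = .08, n = 500, 25 responses, α = .05); the variable names are illustrative and only the Python standard library is used.

```python
from math import sqrt
from statistics import NormalDist

# Inputs from the mailing example (two-tailed test, H0: p = .08)
p0, n, responses, alpha = 0.08, 500, 25, 0.05

# Normal approximation check: both counts should be at least 5
assert n * p0 >= 5 and n * (1 - p0) >= 5

p_hat = responses / n                            # sample proportion, .05
z_stat = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # standard error uses the hypothesized p
z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # about 1.96

reject = abs(z_stat) > z_crit
print(f"z = {z_stat:.2f}, critical values = ±{z_crit:.2f}, reject H0: {reject}")
```

Note that the standard error is computed from the hypothesized proportion p, not from p̂, because the test statistic is evaluated under the assumption that H0 is true.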
p -Value Solution (continued)
Calculate the p-value and compare to α
(For a two sided test the p-value is always two sided)
p-value = P(z < -2.47) + P(z > 2.47)
= 2(.5 - .4932)
= 2(.0068) = 0.0136
[Figure: standard normal curve with critical values -1.96 and 1.96 (α/2 = .025 in each tail); the areas beyond z = -2.47 and z = 2.47 are .0068 each]
Reject H0 since p-value = .0136 < α = .05
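The two-sided p-value can also be computed directly from the standard normal distribution instead of a table. A minimal sketch, assuming the rounded test statistic z = -2.47 from the example; the small difference from .0136 comes from the four-decimal rounding of the table.

```python
from statistics import NormalDist

z_stat, alpha = -2.47, 0.05   # rounded test statistic and significance level from the example

# Two-sided p-value: area beyond |z| in both tails
upper_tail = 1 - NormalDist().cdf(abs(z_stat))
p_value = 2 * upper_tail

reject = p_value < alpha
print(f"one-tail area = {upper_tail:.4f}, p-value = {p_value:.4f}, reject H0: {reject}")
```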
[Figure: x̄ axis marked at 50 and 52; reject H0: μ ≥ 52 in the lower region, do not reject H0: μ ≥ 52 in the upper region]
Type II Error (continued)
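Suppose we do not reject H0: μ ≥ 52 when
in fact the true mean is μ = 50
[Figure: sampling distribution centered at the true mean 50, with the hypothesized mean 52 marked; reject H0: μ ≥ 52 below the cutoff, do not reject above it]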
Type II Error (continued)
Suppose we do not reject H0: μ ≥ 52
when in fact the true mean is μ = 50
Here, β = P( x̄ ≥ cutoff ) if μ = 50
[Figure: β is the area under the distribution centered at 50 that lies in the do-not-reject region, above the cutoff]
Calculating β
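Suppose n = 64 , σ = 6 , and α = .05
$\text{cutoff} = \bar{x}_\alpha = \mu - z_\alpha \frac{\sigma}{\sqrt{n}} = 52 - 1.645 \cdot \frac{6}{\sqrt{64}} = 50.766$ (for H0 : μ ≥ 52)
So β = P( x̄ ≥ 50.766 ) if μ = 50
[Figure: x̄ axis marked at 50, 50.766, and 52; reject H0: μ ≥ 52 below 50.766, do not reject above it]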
Calculating β
(continued)
Suppose n = 64 , σ = 6 , and α = .05
$P(\bar{x} \ge 50.766 \mid \mu = 50) = P\left(z \ge \frac{50.766 - 50}{6/\sqrt{64}}\right) = P(z \ge 1.02) = .5 - .3461 = .1539$
Probability of type II error: β = .1539
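The β calculation has two steps: find the cutoff under H0, then find the probability of landing in the do-not-reject region under the true mean. The sketch below follows those steps with the example's inputs (H0: μ ≥ 52, true μ = 50, n = 64, σ = 6, α = .05). Variable names are illustrative, and the result differs slightly from .1539 because the slides round z to 1.02.

```python
from math import sqrt
from statistics import NormalDist

# Inputs from the example: H0: mu >= 52 (lower tail test), true mean is 50
mu0, mu_true, sigma, n, alpha = 52.0, 50.0, 6.0, 64, 0.05

std_error = sigma / sqrt(n)                             # 6 / 8 = 0.75
cutoff = mu0 + NormalDist().inv_cdf(alpha) * std_error  # reject H0 if x-bar < cutoff

# beta = P(x-bar >= cutoff) when the sampling distribution is centered at the true mean
beta = 1 - NormalDist(mu=mu_true, sigma=std_error).cdf(cutoff)
power = 1 - beta

print(f"cutoff = {cutoff:.3f}, beta = {beta:.4f}, power = {power:.4f}")
```

Rerunning the sketch with a larger n or a larger α gives a smaller β, which matches the relationships between α, σ, n, and β listed earlier.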
Chapter Summary
Estimating two population values:
◦ Population means, independent samples. Example: Group 1 vs. independent Group 2
◦ Paired samples. Example: Same group before vs. after treatment
◦ Population proportions. Example: Proportion 1 vs. Proportion 2
Difference Between Two Means
Population means, independent samples, three cases:
◦ σ1 and σ2 known
◦ σ1 and σ2 unknown, n1 and n2 ≥ 30
◦ σ1 and σ2 unknown, n1 or n2 < 30
σ1 and σ2 known
Population standard deviations are known.
The confidence interval for μ1 – μ2 is:
$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
σ1 and σ2 unknown, large
samples
Forming interval estimates (population means, independent samples, n1 and n2 ≥ 30):
◦ the test statistic is a z value
σ1 and σ2 unknown, large samples (continued)
Case: n1 and n2 ≥ 30
The confidence interval for μ1 – μ2 is:
$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
σ1 and σ2 unknown, small
samples
Case: n1 or n2 < 30
The pooled standard deviation is:
$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$
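σ1 and σ2 unknown, small samples (continued)
The confidence interval for μ1 – μ2 is:
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
Where $t_{\alpha/2}$ has (n1 + n2 – 2) d.f., and
$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$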
Paired Samples
Tests Means of 2 Related Populations
◦ Paired or matched samples
◦ Repeated measures (before/after)
◦ Use difference between paired values:
d = x1 - x2
Eliminates Variation Among Subjects
Assumptions:
◦ Both Populations Are Normally Distributed
◦ Or, if Not Normal, use large samples
Paired Differences
The ith paired difference is d_i, where
$d_i = x_{1i} - x_{2i}$
The point estimate for the population mean paired difference is $\bar{d}$:
$\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n}$
The sample standard deviation is
$s_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}}$
n is the number of pairs in the paired sample
Paired Differences (continued)
Population means, independent samples
σ1 and σ2 known
The test statistic for μ1 – μ2 is:
$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
σ1 and σ2 unknown, large
samples
Population means, independent samples
Case: n1 and n2 ≥ 30
The test statistic for μ1 – μ2 is:
$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
σ1 and σ2 unknown, small
samples
Population means, independent samples
Case: n1 or n2 < 30
The test statistic for μ1 – μ2 is:
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
Where t has (n1 + n2 – 2) d.f., and
$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$
Hypothesis tests for μ1 – μ2
Two Population Means, Independent Samples
Lower tail test: Upper tail test: Two-tailed test:
H0: μ1 – μ2 ≥ 0 H0: μ1 – μ2 ≤ 0 H0: μ1 – μ2 = 0
HA: μ1 – μ2 < 0 HA: μ1 – μ2 > 0 HA: μ1 – μ2 ≠ 0
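[Figure: rejection regions: area α in the lower tail, area α in the upper tail, and area α/2 in each tail for the two-tailed test]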
$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{(21 - 1)1.30^2 + (25 - 1)1.16^2}{21 + 25 - 2}} = 1.2256$
Solution
H0: μ1 - μ2 = 0 i.e. (μ1 = μ2)
HA: μ1 - μ2 ≠ 0 i.e. (μ1 ≠ μ2)
α = 0.05
df = 21 + 25 - 2 = 44
Critical Values: t = ± 2.0154 (reject H0 if t < -2.0154 or t > 2.0154; area .025 in each tail)
Test Statistic:
$t = \frac{3.27 - 2.53}{1.2256 \sqrt{\frac{1}{21} + \frac{1}{25}}} = 2.040$
Decision:
Reject H0 at α = 0.05 (t = 2.040 > 2.0154)
Conclusion:
There is evidence of a difference in means.
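The pooled-variance t test in this solution can be traced step by step. The sketch below uses the summary statistics from the example (n1 = 21, x̄1 = 3.27, s1 = 1.30; n2 = 25, x̄2 = 2.53, s2 = 1.16). The critical value 2.0154 is copied from the slide rather than computed, because the Python standard library has no t distribution; the variable names are illustrative.

```python
from math import sqrt

# Summary statistics from the example (two independent samples)
n1, mean1, s1 = 21, 3.27, 1.30
n2, mean2, s2 = 25, 2.53, 1.16

df = n1 + n2 - 2                                            # degrees of freedom, 44
sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)       # pooled standard deviation
t_stat = (mean1 - mean2 - 0) / (sp * sqrt(1 / n1 + 1 / n2)) # H0: mu1 - mu2 = 0

t_crit = 2.0154   # two-tailed critical value for alpha = .05 and 44 d.f., taken from the slide
reject = abs(t_stat) > t_crit

print(f"sp = {sp:.4f}, t = {t_stat:.3f}, df = {df}, reject H0: {reject}")
```

The test statistic exceeds the critical value only narrowly, so the decision is sensitive to the choice of α.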
Hypothesis Testing
for Paired Samples
Number of Complaints:
Salesperson   Before (1)   After (2)   Difference, di = (2) - (1)
C.B.           6            4           -2
T.F.          20            6          -14
M.H.           3            2           -1
R.K.           0            0            0
M.O.           4            0           -4
Total                                  -21
$\bar{d} = \frac{\sum d_i}{n} = \frac{-21}{5} = -4.2$
$s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}} = 5.67$
Paired Samples: Solution
Has the training made a difference in the number of
complaints (at the 0.01 level)?
H0: μd = 0
HA: μd ≠ 0
α = .01, d̄ = -4.2
d.f. = n - 1 = 4
Critical Value = ± 4.604 (reject H0 if t < -4.604 or t > 4.604; area α/2 in each tail)
Test Statistic:
$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{-4.2 - 0}{5.67 / \sqrt{5}} = -1.66$
Decision: Do not reject H0
(t stat is not in the reject region)
Conclusion: There is not a significant change in the number of complaints.
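The paired-sample calculation can be reproduced from the raw complaint counts. A minimal sketch using the five salespeople from the example; the critical value 4.604 is taken from the slide (α = .01, 4 d.f.) rather than computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Complaints before and after training, in the order C.B., T.F., M.H., R.K., M.O.
before = [6, 20, 3, 0, 4]
after = [4, 6, 2, 0, 0]

diffs = [a - b for a, b in zip(after, before)]   # di = (2) - (1)
n = len(diffs)
d_bar = mean(diffs)                              # point estimate of the mean difference
s_d = stdev(diffs)                               # sample standard deviation (n - 1 divisor)

t_stat = (d_bar - 0) / (s_d / sqrt(n))           # H0: mu_d = 0
t_crit = 4.604                                   # two-tailed critical value from the slide
reject = abs(t_stat) > t_crit

print(f"d-bar = {d_bar:.1f}, s_d = {s_d:.2f}, t = {t_stat:.2f}, reject H0: {reject}")
```

Working with the differences turns the problem into a one-sample t test, which is why the degrees of freedom are n - 1 = 4 rather than n1 + n2 - 2.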
Two Population Proportions
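Confidence interval for p1 – p2:
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$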
Hypothesis Tests for
Two Population Proportions
Population proportions
$\bar{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2} = \frac{x_1 + x_2}{n_1 + n_2}$
where x1 and x2 are the numbers from
samples 1 and 2 with the characteristic of interest
Two Population Proportions
(continued)
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
Hypothesis Tests for
Two Population Proportions
Population proportions
Lower tail test: Upper tail test: Two-tailed test:
H0: p1 – p2 ≥ 0 H0: p1 – p2 ≤ 0 H0: p1 – p2 = 0
HA: p1 – p2 < 0 HA: p1 – p2 > 0 HA: p1 – p2 ≠ 0
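Test Statistic:
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(.50 - .62) - 0}{\sqrt{.549(1 - .549)\left(\frac{1}{72} + \frac{1}{50}\right)}} = -1.31$
Critical Values = ±1.96 for α = .05
Decision: Do not reject H0 (z = -1.31 lies between -1.96 and 1.96)
Conclusion: There is not significant evidence of a
difference in proportions who will vote yes between
men and women.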
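The two-proportion test can be checked with the sample proportions and sample sizes that appear in the calculation (.50 with n = 72 and .62 with n = 50). This is a sketch under those inputs; the slides do not say here which group is which, so the names `group1` and `group2` are placeholders.

```python
from math import sqrt
from statistics import NormalDist

# Sample proportions and sizes as they appear in the worked calculation
p_hat_group1, n_group1 = 0.50, 72
p_hat_group2, n_group2 = 0.62, 50
alpha = 0.05

# Pooled estimate of the common proportion under H0: p1 - p2 = 0
p_pooled = (n_group1 * p_hat_group1 + n_group2 * p_hat_group2) / (n_group1 + n_group2)

std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n_group1 + 1 / n_group2))
z_stat = (p_hat_group1 - p_hat_group2 - 0) / std_error

z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # about 1.96
reject = abs(z_stat) > z_crit

print(f"pooled p = {p_pooled:.3f}, z = {z_stat:.2f}, reject H0: {reject}")
```

The pooled proportion is used in the standard error because H0 assumes the two populations share a single proportion.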
Chapter Summary
Compared two independent samples
◦ Formed confidence intervals for the differences between
two means
◦ Performed Z test for the differences in two means
◦ Performed t test for the differences in two means
Compared two related samples (paired samples)
◦ Formed confidence intervals for the paired difference
◦ Performed paired sample t tests for the mean difference
Compared two population proportions
◦ Formed confidence intervals for the difference between
two population proportions
◦ Performed Z-test for two population proportions
Exercises 9.1
n1 = 100, n2 = 150,
x̄1 = 50, x̄2 = 65
s1 = 6, s2 = 8
Determine the 90% confidence interval estimate for the
difference between population means. Interpret the
estimate. (90% => Zα/2 = 1.645)
Determine the 98% confidence interval estimate for the
difference between population means. Interpret the estimate. (98%
=> Zα/2 = 2.33)
What are the advantages and disadvantages of using a
higher confidence level to estimate the difference between
the two population means?
Exercises 9.2
Hypothesis Tests
for Variances
Chi-Square test statistic:
$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$
where
χ² = standardized chi-square variable
n = sample size
s² = sample variance
σ² = hypothesized variance
The Chi-square Distribution
The chi-square distribution is a family of
distributions, depending on degrees of
freedom:
d.f. = n - 1
[Figure: chi-square density curves for three different degrees of freedom, each plotted on a χ² axis from 0 to 28]
The critical value, $\chi^2_\alpha$, is found from the
chi-square table
Upper tail test:
H0: σ² ≤ σ0²
HA: σ² > σ0²
[Figure: chi-square curve; do not reject H0 below χ²_α, reject H0 above χ²_α (upper tail area α)]
Example
A commercial freezer must hold the selected
temperature with little variation.
Specifications call for a standard deviation of
no more than 4 degrees (or variance of 16
degrees²). A sample of 16 freezers is tested
and yields a sample variance
of s² = 24. Test to see
whether the standard
deviation specification
is exceeded. Use
α = .05
Finding the Critical Value
Use the chi-square table to find the critical value:
$\chi^2_\alpha$ = 24.9958 (α = .05 and 16 – 1 = 15 d.f.)
The test statistic is:
$\chi^2 = \frac{(n - 1)s^2}{\sigma^2} = \frac{(16 - 1)24}{16} = 22.5$
Since 22.5 < 24.9958,
do not reject H0 at α = .05
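The freezer test involves only one line of arithmetic, so it is easy to verify. A minimal sketch with the example's inputs (n = 16, s² = 24, hypothesized σ² = 16); the critical value 24.9958 is copied from the chi-square table quoted above, and the variable names are illustrative.

```python
# Inputs from the freezer example (upper tail test, H0: sigma^2 <= 16)
n = 16
sample_variance = 24.0
hypothesized_variance = 16.0

df = n - 1                                                # 15 degrees of freedom
chi2_stat = df * sample_variance / hypothesized_variance  # (n - 1) s^2 / sigma^2

chi2_crit = 24.9958   # table value for alpha = .05 and 15 d.f., as quoted in the slides
reject = chi2_stat > chi2_crit

print(f"chi-square = {chi2_stat}, critical value = {chi2_crit}, reject H0: {reject}")
```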
[Figure: lower tail test (reject H0 below χ²_{1-α}) and two-tailed test (reject H0 below χ²_{1-α/2} or above χ²_{α/2}, with area α/2 in each tail)]
F Test for Difference in Two
Population Variances
Hypothesis Tests for Variances
$s_2^2$ = Variance of Sample 2
n2 - 1 = denominator degrees of freedom
The F Distribution
[Figure: two F distribution curves, each with a do-not-reject region followed by a rejection region in the upper tail]
The rejection region for a one-tail test is
$F = \frac{s_1^2}{s_2^2} > F_\alpha$
The rejection region for a two-tailed test is
$F = \frac{s_1^2}{s_2^2} > F_{\alpha/2}$
(when the larger sample variance is in the numerator)
F Test: An Example
You are a financial analyst for a brokerage firm.
You want to compare dividend yields between
stocks listed on the NYSE & NASDAQ. You collect
the following data:
NYSE NASDAQ
Number 21 25
Mean 3.27 2.53
Std dev 1.30 1.16
Numerator:
df1 = n1 – 1 = 21 – 1 = 20
Denominator:
df2 = n2 – 1 = 25 – 1 = 24
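The F statistic for this data follows directly from the two sample standard deviations. The sketch below computes it with the larger sample variance in the numerator, as the rejection rule requires. It stops short of a decision: the critical F value comes from an F table, which is not reproduced in this section. The variable names are illustrative.

```python
# Sample sizes and standard deviations from the NYSE / NASDAQ table
n_nyse, s_nyse = 21, 1.30
n_nasdaq, s_nasdaq = 25, 1.16

var_nyse, var_nasdaq = s_nyse**2, s_nasdaq**2

# The larger sample variance goes in the numerator (here NYSE)
assert var_nyse >= var_nasdaq
f_stat = var_nyse / var_nasdaq

df_numerator = n_nyse - 1       # 20
df_denominator = n_nasdaq - 1   # 24

print(f"F = {f_stat:.3f} with ({df_numerator}, {df_denominator}) degrees of freedom")
```

The statistic is then compared with the table value for 20 and 24 degrees of freedom at the chosen significance level.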
n 35 35
x 15.3 14.2
s 3.2 3.5
We have two large samples each n > 30. We will do the p-value method on
testing the difference between two means, with population variances
unknown.
2. Let α = 0.05
3. The test statistic is
$Z = \frac{(\bar{x} - \bar{y}) - (\mu_1 - \mu_2)}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$
4. Based on the information provided, the test statistic is Z = 1.3722.
5. For the right-tailed test, the p-value = 0.08499 > α.
Hence the null
hypothesis is not rejected.
6. There is insufficient evidence that the two population means differ.
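The numbers in steps 4 and 5 can be verified from the summary table (n = 35 for both samples, means 15.3 and 14.2, standard deviations 3.2 and 3.5). A minimal sketch using only the Python standard library; the variable names are illustrative. Since the p-value exceeds α = 0.05, the test fails to reject the null hypothesis.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from the table (two large independent samples)
n1, mean1, s1 = 35, 15.3, 3.2
n2, mean2, s2 = 35, 14.2, 3.5
alpha = 0.05

# Large-sample Z statistic with the sample variances standing in for the unknown ones
std_error = sqrt(s1**2 / n1 + s2**2 / n2)
z_stat = (mean1 - mean2 - 0) / std_error

# Right-tailed p-value: area above the observed Z
p_value = 1 - NormalDist().cdf(z_stat)
reject = p_value < alpha

print(f"Z = {z_stat:.4f}, p-value = {p_value:.5f}, reject H0: {reject}")
```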
Problem 1: Reading Scores
Suppose we want to compare the reading scores
of men and women on a standardized reading test.
We take a random sample of 31 people and obtain
the results below. Note that the women
outperform the men by 4 points. Of course, this
might simply be sampling error. We would like to
test whether or not this difference is significant at
the α = .05 level.
        Men         Women
x̄      80 = x̄1     84 = x̄2
S       16 = S1     20 = S2
n       16 = n1     15 = n2
Problem 1: Reading Scores
(cont’d)
H0: μ1= μ2
H1: μ1≠ μ2
Note that
◦ σ1,σ2 are not given
◦ n1+ n2 = 31
The data below show how much money 17 men and 17 women spent on wine
over the year.
$100.00   $107.00
$250.00   $240.00
$890.00   $880.00
$765.00   $770.00
$456.00   $409.00
$356.00   $500.00
$876.00   $800.00
$740.00   $900.00
$231.00   $1,000.00
$222.00   $489.00
$555.00   $800.00
$666.00   $890.00
$876.00   $770.00
$10.00    $509.00
$290.00   $100.00
$98.00    $102.00