
Unit 5.2 Testing Two Population Means

Hypothesis testing can be used to compare statistics from two independent samples and determine if there are significant differences between population means. Some examples include comparing calculus grades between private and public students or emotional intelligence between men and women. The steps of hypothesis testing include formulating the null and alternative hypotheses, specifying the significance level, selecting the appropriate test statistic such as a z-test or t-test, determining the rejection regions, computing the test statistic, making a statistical decision to reject or not reject the null hypothesis, and drawing a conclusion. Excel can be used to conduct hypothesis tests such as a two sample t-test assuming unequal variances to investigate if one group has a higher mean, such as comparing the number of Facebook friends

Uploaded by

shane naigal

UNIT 5

HYPOTHESIS TESTING

Mathematics Department
XAVIER UNIVERSITY-ATENEO DE CAGAYAN
Hypothesis testing can be extended to procedures that
compare statistics from two samples of data selected from two
independent populations.

This test is used in situations in which there are two experimental conditions and different participants have been used in each condition.

The following are some practical problems that require hypothesis testing for the difference between two population means:
■ Is there a significant difference in Calculus grades between
private and public SHS students?
■ Do women have higher EQ than men?
The following is the summary of the
steps when performing hypothesis
testing.

1. Formulate the null and alternative hypotheses.


2. Specify the level of significance to be used.
3. Select the appropriate test statistic.
4. Establish the rejection region/regions.
5. Compute the actual value of the test statistic from
the sample.
6. Make a statistical decision.
7. Draw the appropriate conclusion.
Testing the Difference Between the Means of Two
Independent Populations: Z test and t test
Case 1 (Z test) is performed if the population standard deviations are given, regardless of the sample size.

Case 2 (approximate Z test) is performed if the population standard deviations are not given but the sample sizes n1 and n2 are both greater than or equal to 30 (Central Limit Theorem applies).

Case 3 (pooled-variance t test) is performed if the problem states that the population standard deviations are equal, but they are unknown and the sample sizes used are small (less than 30).

Case 4 (t test assuming unequal variances) is performed if the problem states that the population standard deviations are unequal, but they are unknown and the sample sizes used are small (less than 30).
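The four cases can be read as a simple decision procedure. The sketch below is only an illustration: the function name `choose_test` and its parameters are hypothetical, and it assumes that any situation not covered by Cases 1 and 2 is treated as a small-sample case.

```python
def choose_test(sigmas_known: bool, n1: int, n2: int,
                equal_variances: bool = False) -> str:
    """Suggest a test for two independent means, following Cases 1 to 4."""
    if sigmas_known:
        # Case 1: population standard deviations are given.
        return "Z test (Case 1)"
    if n1 >= 30 and n2 >= 30:
        # Case 2: both samples are large, so the Central Limit Theorem applies.
        return "approximate Z test (Case 2)"
    # Assumption: everything else is handled as a small-sample case.
    if equal_variances:
        # Case 3: unknown but equal population standard deviations.
        return "pooled-variance t test (Case 3)"
    # Case 4: unknown and unequal population standard deviations.
    return "t test assuming unequal variances (Case 4)"


print(choose_test(sigmas_known=False, n1=100, n2=100))
print(choose_test(sigmas_known=False, n1=15, n2=15, equal_variances=True))
```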
Example 1 (Two-tailed Z test)
A study was conducted to compare the length of time it took men and women
to perform a certain assembly-line task. Independent random samples of 100
men and 100 women were employed in an experiment in which each person
was timed (in seconds) on identical tasks. The results are summarized below:

                      Men              Women
Sample size           $n_1 = 100$      $n_2 = 100$
Sample mean           $\bar{x}_1 = 45$   $\bar{x}_2 = 38$
Sample std. dev.      $s_1 = 5.24$     $s_2 = 4.75$

Do the data present sufficient evidence to suggest a significant difference between the mean completion times of this task for men and women? Use a 0.01 level of significance.
Solution:
Let $\mu_1$ = mean completion time for men
$\mu_2$ = mean completion time for women
$d_0 = 0$ (hypothesized difference)
Steps in hypothesis testing
1. Null and alternative hypotheses
H0: There is no significant difference in the mean completion times to
perform a certain assembly-line task between men and women.
H1: There is a significant difference in the mean completion times to
perform a certain assembly-line task between men and women.
(In symbols) $H_0: \mu_1 - \mu_2 = 0$ versus $H_1: \mu_1 - \mu_2 \neq 0$

2. Significance level: $\alpha = 0.01$


3. Test Statistic: Since both sample sizes are large, we can use an approximate Z-test, given by
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
4. Rejection Regions: Since it is a two-tailed test based on $H_1: \mu_1 - \mu_2 \neq 0$, the rejection regions are in both tails of the Z distribution, given by
$$Z < -Z_{\alpha/2} \quad \text{or} \quad Z > Z_{\alpha/2}$$
$$Z < -Z_{0.005} \quad \text{or} \quad Z > Z_{0.005}$$

Based on the Z distribution table, since the area of 0.4950 lies between Z = 2.57 and Z = 2.58, the critical value of the test is $\pm 2.575$.

Thus, H0 is rejected if $Z < -2.575$ or $Z > 2.575$; otherwise, H0 is not rejected.
5. Computation of the test statistic
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(45 - 38) - 0}{\sqrt{\dfrac{5.24^2}{100} + \dfrac{4.75^2}{100}}} = 9.898$$

6. Statistical decision
Since Z = 9.898 is in the rejection region, H0 is rejected.

7. Conclusion
There is a significant difference in the mean completion times to perform a
certain assembly-line task between men and women.
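The arithmetic in Example 1 can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative. Note that `NormalDist().inv_cdf` returns a critical value of about 2.576, slightly more precise than the 2.575 read from the table, which does not change the decision.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from Example 1 (men = sample 1, women = sample 2).
n1, xbar1, s1 = 100, 45.0, 5.24
n2, xbar2, s2 = 100, 38.0, 4.75
d0 = 0.0       # hypothesized difference
alpha = 0.01   # significance level

# Approximate Z statistic for two large independent samples.
standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
z = ((xbar1 - xbar2) - d0) / standard_error

# Two-tailed critical value Z_{alpha/2}.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject_h0 = abs(z) > z_crit
print(f"Z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```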
Example 2 (One-tailed t-test)
A certain electronics company compared the life spans of two brands of laptop with the same specifications. A random sample of 15 Brand X laptops showed a mean life span of 7.75 years with a standard deviation of 0.60 year, while another random sample of 15 Brand Y laptops showed a mean life span of 6.25 years with a standard deviation of 0.85 year. Assume that the population variances are equal. At the 0.05 level of significance, is there sufficient evidence to claim that Brand X laptops have a significantly longer mean life span than Brand Y laptops?
Solution:
Let $\mu_1$ = mean life span of Brand X laptop
$\mu_2$ = mean life span of Brand Y laptop
$d_0 = 0$ (hypothesized difference)
Steps in hypothesis testing

1. Null and alternative hypotheses


H0: There is no significant difference in the mean life span between Brand X and
Brand Y laptops.
H1: Brand X laptops have significantly longer mean life span than Brand Y laptops.

(In symbols) $H_0: \mu_1 - \mu_2 = 0$ versus $H_1: \mu_1 - \mu_2 > 0$

2. Significance level: $\alpha = 0.05$


3. Test Statistic: Since the population variances are assumed to be equal but unknown ($\sigma_1 = \sigma_2$) and $n_1 < 30$, $n_2 < 30$, the appropriate test statistic is the pooled-variance t test, where
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
4. Rejection Regions: Since the test is one-tailed, the rejection region is given by $t > t_{\alpha}$, that is, $t > t_{0.05}$.
Based on the t distribution table with degrees of freedom df = 28, the critical value of the test is 1.701.
$$df = n_1 + n_2 - 2 = 15 + 15 - 2 = 28$$

Thus, H0 is rejected if t > 1.701; otherwise, H0 is not rejected.


5. Computation
First, solve for the pooled standard deviation $s_p$ and then solve for t.
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{(15 - 1)(0.60)^2 + (15 - 1)(0.85)^2}{15 + 15 - 2}} = 0.7357$$
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{(7.75 - 6.25) - 0}{0.7357\sqrt{\dfrac{1}{15} + \dfrac{1}{15}}} = 5.584$$

6. Statistical decision
Since t = 5.584 is in the rejection region, H0 is rejected.

7. Conclusion
There is sufficient evidence to claim that Brand X laptops have significantly
longer mean life span than Brand Y.
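The pooled-variance computation in Example 2 can be reproduced as follows. This is a minimal standard-library sketch; the critical value 1.701 is taken from the t table as quoted in the example rather than computed, and the variable names are illustrative.

```python
from math import sqrt

# Summary statistics from Example 2 (Brand X = sample 1, Brand Y = sample 2).
n1, xbar1, s1 = 15, 7.75, 0.60
n2, xbar2, s2 = 15, 6.25, 0.85
d0 = 0.0        # hypothesized difference
t_crit = 1.701  # one-tailed critical value from the t table, df = 28

# Degrees of freedom and pooled standard deviation.
df = n1 + n2 - 2
sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)

# Pooled-variance t statistic.
t = ((xbar1 - xbar2) - d0) / (sp * sqrt(1 / n1 + 1 / n2))

reject_h0 = t > t_crit
print(f"df = {df}, sp = {sp:.4f}, t = {t:.3f}, reject H0: {reject_h0}")
```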
EXCEL
t-test: two independent samples assuming unequal variances

Most people nowadays have Facebook accounts. Suppose you want to investigate the number of Facebook friends a Facebook user has when grouped according to gender. Based on the following data, test the hypothesis that women have more Facebook friends than men. Use a 0.05 level of significance.

Women   Men
110     100
100     90
150     95
200     112
160     115
170     89
180     95
160     96
200     100
250     102
130     110
125     105
100     98
105     100
130     93
210     85
225     108
230     100
130     120
101     80

Note: Check for normality and homogeneity of variances.
To check for normality using Excel, you may compute the skewness and kurtosis to see if the distribution of the data is symmetric and mesokurtic, which are two of the characteristics of a normal distribution. In the table below, a value that falls within 2 standard errors of zero is interpreted as symmetric (skewness) or mesokurtic (kurtosis).

Group   Measure    Value          Std. error   2*std. error (+/-)   Interpretation
Women   skewness   0.41361966     0.548        1.096                symmetric
Women   kurtosis   -1.057053177   1.095        2.19                 mesokurtic
Men     skewness   0.13759564     0.548        1.096                symmetric
Men     kurtosis   -0.108716609   1.095        2.19                 mesokurtic
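The skewness and kurtosis check can also be sketched outside Excel. The code below assumes the bias-corrected sample formulas used by Excel's SKEW and KURT functions, and assumes the standard errors were approximated as sqrt(6/n) and sqrt(24/n), which reproduce the 0.548 and 1.095 reported for n = 20. The function names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

women = [110, 100, 150, 200, 160, 170, 180, 160, 200, 250,
         130, 125, 100, 105, 130, 210, 225, 230, 130, 101]
men = [100, 90, 95, 112, 115, 89, 95, 96, 100, 102,
       110, 105, 98, 100, 93, 85, 108, 100, 120, 80]


def sample_skewness(data):
    """Bias-corrected sample skewness (same formula as Excel's SKEW)."""
    n, m, s = len(data), mean(data), stdev(data)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in data)


def sample_excess_kurtosis(data):
    """Bias-corrected excess kurtosis (same formula as Excel's KURT)."""
    n, m, s = len(data), mean(data), stdev(data)
    z4 = sum(((x - m) / s) ** 4 for x in data)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))


def within_two_se(value, standard_error):
    """Rule used in the table: the value lies within 2 standard errors of zero."""
    return abs(value) <= 2 * standard_error


for label, data in (("Women", women), ("Men", men)):
    n = len(data)
    se_skew, se_kurt = sqrt(6 / n), sqrt(24 / n)  # assumed approximations
    skew, kurt = sample_skewness(data), sample_excess_kurtosis(data)
    print(label,
          f"skewness = {skew:.4f} (symmetric: {within_two_se(skew, se_skew)}),",
          f"kurtosis = {kurt:.4f} (mesokurtic: {within_two_se(kurt, se_kurt)})")
```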
Rule of Thumb for Checking the Condition of
Unequal Variances (in Excel)

In most cases, if the ratio of the larger sample variance to the smaller sample variance is less than or equal to 3, then it is safe to assume that the population variances are equal.
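As an illustration, the rule of thumb can be applied to the Facebook friends data from the Excel example. This is a minimal sketch using the sample variance from Python's standard library; the threshold of 3 is the one stated in the rule.

```python
from statistics import variance

women = [110, 100, 150, 200, 160, 170, 180, 160, 200, 250,
         130, 125, 100, 105, 130, 210, 225, 230, 130, 101]
men = [100, 90, 95, 112, 115, 89, 95, 96, 100, 102,
       110, 105, 98, 100, 93, 85, 108, 100, 120, 80]

# Sample variances (n - 1 in the denominator).
var_women = variance(women)
var_men = variance(men)

# Ratio of the larger sample variance to the smaller one.
ratio = max(var_women, var_men) / min(var_women, var_men)
assume_equal_variances = ratio <= 3

print(f"variance ratio = {ratio:.2f}, assume equal variances: {assume_equal_variances}")
```

Here the ratio is well above 3, which is why the unequal-variances version of the t test is used for this data.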

Levene's test – a statistical test for Homogeneity of Variances (available in SPSS)
Steps
1. Open Microsoft Excel. Encode the two samples of observations separately into two columns in the spreadsheet.

2. Select Data – Data Analysis – t-Test: Two-Sample Assuming Unequal Variances, then click OK.

3. In the dialog box, enter the cell range:
Variable 1: Women data
Variable 2: Men data
Enter the Hypothesized Mean Difference (d0 = 0).
Click Labels and set Alpha at 0.05.
Output Range: select any cell where you want to display the output.

4. Click OK.
Interpretation: t-test: two independent samples assuming unequal variances

Decision Rule (one-tailed test)
1. Critical value: Reject Ho in favor of H1 if t Stat > t Critical one-tail.
2. p-value: Reject Ho in favor of H1 if P(T<=t) one-tail < 0.05.

Note: The ratio of the two variances is more than 3.

In the Excel output, the t Stat is greater than the t Critical one-tail value of 1.72, and the p-value is less than 0.05.

Decision/Conclusion: Reject Ho. The data provide sufficient evidence to claim that women have more Facebook friends than men.
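For readers without Excel, the unequal-variances t statistic can be sketched in Python. This assumes the usual Welch formulas for the statistic and its approximate degrees of freedom; the function name `welch_t` is illustrative, and the critical value 1.72 is the one reported in the Excel output above rather than computed here.

```python
from math import sqrt
from statistics import mean, variance

women = [110, 100, 150, 200, 160, 170, 180, 160, 200, 250,
         130, 125, 100, 105, 130, 210, 225, 230, 130, 101]
men = [100, 90, 95, 112, 115, 89, 95, 96, 100, 102,
       110, 105, 98, 100, 93, 85, 108, 100, 120, 80]


def welch_t(sample1, sample2, d0=0.0):
    """Return (t statistic, approximate df) without assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1  # squared standard error of mean 1
    v2 = variance(sample2) / n2  # squared standard error of mean 2
    t = ((mean(sample1) - mean(sample2)) - d0) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


t_stat, df = welch_t(women, men)   # Variable 1 = women, Variable 2 = men
t_crit_one_tail = 1.72             # value reported in the Excel output
reject_h0 = t_stat > t_crit_one_tail

print(f"t Stat = {t_stat:.3f}, df = {df:.2f}, reject H0: {reject_h0}")
```

The approximate degrees of freedom come out between 20 and 21, which is consistent with the reported one-tail critical value of about 1.72.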
Additional Notes (Optional)
The steps for interpreting the SPSS output on the TEST for NORMALITY of data.
(Ho: The data are normally distributed.)

1. Two tests for normality are run. For datasets smaller than 2000 elements, we use the Shapiro-Wilk test; otherwise, the Kolmogorov-Smirnov test is used.

2. In the Test for Normality table, look under the Sig. column.

3. If the p-value is MORE THAN .05, then researchers have met the assumption of normality of data. (Ho is not rejected.)

4. If the p-value is LESS THAN .05, then researchers have violated the assumption of normality of data and will use a non-parametric test. (Ho is rejected.)

The steps for interpreting the SPSS output for homogeneity of variance
(Ho: The variances are equal.)

1. Use Levene's test for Equality of Variances.

2. In the Levene's test for Equality of Variances, look under the Sig. column.

3. If the p-value is MORE THAN .05, then researchers have met the assumption of homogeneity of variance. (Ho is not rejected.)

4. If the p-value is LESS THAN .05, then researchers have violated the assumption of homogeneity of variances. (Ho is rejected.)
Optional

SPSS: Analyze -> Descriptive Statistics -> Explore -> Normality Plots with Tests

Since the p-value > .05, Ho is not rejected, and we conclude that the data come from a normal distribution.

Thus,
The number of Facebook friends of women is normally distributed.
The number of Facebook friends of men is normally distributed.
Optional
SPSS: Analyze -> Compare Means -> Independent Samples t-test

Results
1. Levene's test shows that the variances of the two groups are not equal. Thus, use the t-test result for equal variances not assumed.
2. The independent samples t-test result shows that the difference is significant (p-value < 0.05). Since the t statistic is positive, women have more Facebook friends than men.
