Chapter 13
Copyright© Dorling Kindersley India Pvt. Ltd
Hypothesis Testing for
Categorical Data
(Chi-Square Test)
Learning Objectives
Upon completion of this chapter, you will be able to:
Understand the concept of chi-square statistic and chi-square distribution
Understand the concept of chi-square goodness-of-fit test
Understand the concept of chi-square test of independence: two-way contingency analysis
Understand the concept of chi-square test for population variance and chi-square test of homogeneity
Defining Chi-Square Test Statistic
Some researchers place the chi-square technique in the
category of non-parametric tests for hypothesis testing.
The chi-square distribution is a family of curves, with each
distribution defined by the degrees of freedom associated with it.
Chi-square is a continuous probability distribution with
range 0 to ∞.
Figure 13.1: Chi-Square distribution with 1, 5, and 10 degrees of freedom
Figure 13.2: Acceptance or rejection region in a Chi-Square test
Conditions for Applying the Chi-Square Test
In a contingency table, an expected frequency of less than 5 in a
cell is too small for the chi-square test to be applied reliably.
In such cases, we need to “pool” the frequencies that are
less than 5 with the preceding or succeeding frequency, so that
the pooled frequency is 5 or more.
The sample should consist of at least 50 observations and
should be drawn randomly from the population. In addition, all
the individual observations in a sample should be independent
of each other.
Data should not be presented in percentage or ratio form,
rather they should be expressed in original units.
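The pooling rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pool_small_cells` and the frequencies are hypothetical, and the strategy (accumulate neighbouring categories until the pooled expected frequency reaches 5, then fold any leftover tail into the preceding group) is one reasonable reading of "pool with the preceding or succeeding frequency".

```python
def pool_small_cells(observed, expected, minimum=5.0):
    """Merge adjacent categories until each pooled expected frequency >= minimum."""
    pooled_obs, pooled_exp = [], []
    run_obs, run_exp = 0.0, 0.0
    for o, e in zip(observed, expected):
        run_obs += o
        run_exp += e
        if run_exp >= minimum:
            # The running group is large enough: close it and start a new one.
            pooled_obs.append(run_obs)
            pooled_exp.append(run_exp)
            run_obs, run_exp = 0.0, 0.0
    if run_exp > 0:
        if pooled_exp:
            # A small leftover tail is pooled with the preceding group.
            pooled_obs[-1] += run_obs
            pooled_exp[-1] += run_exp
        else:
            # Everything together is still below the minimum: keep one group.
            pooled_obs.append(run_obs)
            pooled_exp.append(run_exp)
    return pooled_obs, pooled_exp


# Hypothetical frequencies, chosen only to illustrate the rule.
observed = [2, 6, 14, 20, 9, 3, 1]
expected = [1.5, 7.0, 15.0, 18.5, 8.0, 3.5, 1.5]
obs, exp = pool_small_cells(observed, expected)
print(obs)  # pooled observed frequencies
print(exp)  # pooled expected frequencies
```

Pooling reduces the number of categories, so the degrees of freedom should be computed from the pooled table rather than from the original one.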
Chi-Square Goodness-of-Fit Test
The chi-square goodness-of-fit test can be used to ascertain
whether a theoretical probability distribution fits an empirical
sample distribution.
Example 13.1: A company is concerned about the increasing violent
altercations between its employees. The number of violent incidents
recorded by the management during six randomly selected months
is given in Table 13.2.
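A goodness-of-fit computation of this kind might look like the sketch below. Table 13.2 is not reproduced in this text, so the monthly counts are hypothetical placeholders, and the null hypothesis of equal expected incidents in every month is an assumption made for illustration. The function name `goodness_of_fit` is likewise hypothetical.

```python
def goodness_of_fit(observed, expected=None):
    """Return the chi-square statistic and degrees of freedom."""
    n = sum(observed)
    k = len(observed)
    if expected is None:
        # Assumption: the theoretical distribution is uniform across categories.
        expected = [n / k] * k
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # k - 1 degrees of freedom when no parameter is estimated from the sample.
    df = k - 1
    return chi_sq, df


# Hypothetical incident counts for six months (NOT the data of Table 13.2).
counts = [10, 14, 9, 12, 8, 7]
chi_sq, df = goodness_of_fit(counts)
print(round(chi_sq, 3), df)
```

The computed statistic is then compared with the chi-square table value for the chosen significance level and `df` degrees of freedom; the null hypothesis is rejected when the statistic exceeds that value.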
Computation of Expected Frequencies and
Chi-square Statistic for Example 13.1
Solved Examples\Excel\Ex [Link]
Chi-square Test of Independence:
Two-way Contingency Analysis
When observations are classified on the basis of two variables
and arranged in a table, the resulting table is referred to as a
contingency table. The chi-square test of independence uses this
contingency table to determine whether the two variables are
independent, which is why the test is sometimes referred to as
contingency analysis.
When we add the row or column totals, the grand total (N) is
obtained. This grand total is the sum of all the frequencies and
represents the sample size.
The expected frequency of cell jk, and of any cell in general, is
$$ f_e = \frac{RT \times CT}{N} $$
where RT is the row total, CT the column total, and N the grand total of
all frequencies.
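The expected-frequency rule (row total times column total, divided by the grand total) can be sketched as follows. The function name `expected_frequencies` and the small table are hypothetical and serve only to show the arithmetic.

```python
def expected_frequencies(table):
    """Expected frequency of each cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)  # grand total N, i.e. the sample size
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


# Hypothetical 2 x 3 contingency table of observed frequencies.
table = [[10, 20, 30],
         [20, 20, 20]]
result = expected_frequencies(table)
for row in result:
    print(row)
```

Each row of expected frequencies sums to the same row total as the observed table, which is a quick check on the arithmetic.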
Chi-Square Test Statistic
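The formula on this slide did not survive extraction. For reference, the standard chi-square statistic for a contingency table is stated here; the symbols f_o (observed frequency), f_e (expected frequency), r (number of rows), and c (number of columns) are notation assumed for this note rather than taken from the slide.

$$ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}, \qquad df = (r - 1)(c - 1) $$

The sum runs over all cells of the table, and the null hypothesis of independence is rejected when the computed value exceeds the table value of chi-square for the chosen significance level and df degrees of freedom.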
Example 13.2
The Vice President (Sales) of a garment company wants to determine
whether sales of the company’s brand of jeans are
independent of age group. He has appointed a marketing researcher
for this purpose. This marketing researcher has taken a random
sample of 703 consumers who have purchased jeans. The researcher
conducted a survey for three brands of jeans, namely Brand 1,
Brand 2, and Brand 3. The researcher has also divided the age groups
into four categories: 15 to 25, 26 to 35, 36 to 45, and 46 to 55. The
observations of the researcher are provided in Table 13.6:
Table 13.6: Contingency table for Example 13.2
Determine whether brand preference is independent of age group. Use
alpha=0.05.
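As a worked step for this example, the degrees of freedom follow from the dimensions stated in the problem (four age groups and three brands). The critical value quoted is the standard chi-square table value for 6 degrees of freedom at the 0.05 level, not a figure taken from the slides.

$$ df = (4 - 1)(3 - 1) = 6, \qquad \chi^2_{0.05,\,6} = 12.592 $$

Under this setup, the hypothesis that brand preference is independent of age group would be rejected if the computed chi-square statistic exceeds 12.592.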
Table 13.7: Contingency table of the observed and expected
frequencies for Example 13.2
Table 13.8: Computation of expected frequencies and chi-square
statistic for Example 13.2
Solved Examples\Minitab\Ex [Link]
Chi-square Test for
Population Variance
Example 13.3
A researcher draws a random sample of size 51 from the population.
The sample standard deviation is calculated as 15. Use alpha = 0.05
and test the hypothesis that the population standard deviation is 20.
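The arithmetic for this example can be sketched as below, using the standard statistic for a test on a population variance, chi-square = (n - 1)s² / σ₀², with n - 1 degrees of freedom. Two points are assumptions rather than statements from the slides: the test is treated as two-tailed (null hypothesis σ = 20), and the critical values are the standard chi-square table values for 50 degrees of freedom (32.357 and 71.420 for tail areas of 0.025 each). The function name is hypothetical.

```python
def variance_test_statistic(n, sample_sd, hypothesized_sd):
    """Chi-square statistic for a test on a population variance."""
    chi_sq = (n - 1) * sample_sd ** 2 / hypothesized_sd ** 2
    df = n - 1
    return chi_sq, df


# Values stated in Example 13.3.
chi_sq, df = variance_test_statistic(n=51, sample_sd=15.0, hypothesized_sd=20.0)

# Assumption: two-tailed test at alpha = 0.05, so 0.025 in each tail.
# Standard chi-square table values for df = 50.
lower_crit, upper_crit = 32.357, 71.420
reject_null = chi_sq < lower_crit or chi_sq > upper_crit

print(chi_sq, df, reject_null)
```

Here the statistic is 50 × 225 / 400 = 28.125, which falls below the lower critical value, so under these assumptions the hypothesis that the population standard deviation is 20 would be rejected.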
Chi-square Test of Homogeneity
The chi-square test of homogeneity is used to determine whether
two or more independent samples are drawn from the same
population or from different populations.
In other words, the chi-square test of homogeneity
is used to determine whether two or more populations are
homogeneous with respect to some characteristic of interest.
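Computationally, the test of homogeneity proceeds like the test of independence: expected frequencies are derived from the row and column totals, and the chi-square statistic is summed over all cells. The sketch below illustrates this with a hypothetical 2 x 2 table (rows as samples, columns as response categories); the function name `chi_square_contingency` and the counts are assumptions for illustration only.

```python
def chi_square_contingency(table):
    """Chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_sq = 0.0
    for j, row in enumerate(table):
        for k, observed in enumerate(row):
            # Expected frequency under the hypothesis of homogeneity.
            expected = row_totals[j] * col_totals[k] / n
            chi_sq += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_sq, df


# Hypothetical counts: two samples, two response categories.
table = [[30, 20],
         [20, 30]]
chi_sq, df = chi_square_contingency(table)
print(chi_sq, df)
```

With one degree of freedom, the standard table value at the 0.05 level is 3.841, so for these hypothetical counts the hypothesis of homogeneity would be rejected.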
Example 13.4
A television company has launched a new product with some
advanced features. The company wants to know the opinion of
consumers about this product with respect to four characteristics:
preferred brand with new features, did not prefer brand with new
features, preferred only a few new features, and indifferent. The
company has divided consumers into three groups: executives/officers,
businessmen, and private consultants.
It has taken a random sample of size 459 and obtained the results
presented in Table 13.9.
Table 13.9: Consumer responses for a new product with some advanced
features
Table 13.10: Computation of expected frequencies for Example 13.4
Table 13.11: Computation of chi-square statistic for Example 13.4
Solved Examples\Excel\Ex [Link]
Solved Examples\Minitab\EX [Link]