Statistics For Management II - Parametric Hypothesis Tests
• If the population standard deviation (or variance) is known, we can use the Z test.
Most of the time, however, the population standard deviation (or variance) is not known.
That is why we mostly use the t test for one-sample hypothesis testing.
Note: Some sources indicate that the Z test can be applied when the sample is large (n > 30 or n > 50).
In practice, however, we use the t test even with a large sample, because the population standard deviation is still unknown.
Step 2: Define the critical value (level of significance - region of acceptance of the null hypothesis).
Example: A company, which produces chocolate bars, claims that their average weight of a chocolate bar is 60 grams.
𝐻0 : µ = 60
𝐻1 : µ ≠ 60
Example: A quality control manager claims that the average number of defective products per day is less than or
equal to 25.
𝐻0 : µ ≤ 25
𝐻1 : µ > 25
Example: A store manager claims that on average they sell at least 200 products in a day.
𝐻0 : µ ≥ 200
𝐻1 : µ < 200
The Dairy Fresh Ice Cream plant in Greensboro, Alabama, uses a filling machine for its 64-ounce cartons.
There is some variation in the actual amount of ice cream that goes into the carton. The machine can go out
of adjustment and put a mean amount either less or more than 64 ounces in the cartons. To monitor the
filling process, the production manager selects a simple random sample of 16 filled ice cream cartons each
day. The sample mean is 64.2 ounces and the sample standard deviation is 0.72 ounces.
Does the machine fill 64 ounces in the cartons on average? Conduct a test at the 5% level of significance.
Step 2: Define the critical value (level of significance - region of acceptance of the null hypothesis).
Critical value: look it up in the t table using the significance level (α) and the degrees of freedom (df). Here α = 0.05 (two-sided) and df = n − 1 = 16 − 1 = 15, so the critical values are ± 2.131.
Accept 𝐻0 if the t-stat falls between − 2.131 and + 2.131. Reject 𝐻0 if the t-stat falls below − 2.131 or above + 2.131.
$$ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} $$
x̄: sample mean = 64.2
µ: hypothesized population mean = 64
s: sample standard deviation = 0.72
n: sample size = 16
$$ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{64.2 - 64}{0.72 / \sqrt{16}} = 1.11 $$
The sample test statistic t-stat = 1.11 lies inside the region of acceptance of − 2.131 ≤ t ≤ + 2.131
Thus, 𝐻0 is accepted.
𝐻0 : µ = 64
𝐻1 : µ ≠ 64
Based on these sample data, the company does not have sufficient evidence to conclude that the filling
machine is out of adjustment.
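The calculation above can be reproduced with a short sketch. It uses only the figures given in the example (sample mean 64.2, hypothesized mean 64, s = 0.72, n = 16) and the critical value 2.131 from the text; the function name `one_sample_t` is a hypothetical helper, not part of any library.

```python
import math

def one_sample_t(sample_mean, mu0, s, n):
    """One-sample t statistic: (x_bar - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (s / math.sqrt(n))

# Figures from the ice cream carton example
t_stat = one_sample_t(sample_mean=64.2, mu0=64, s=0.72, n=16)

# Two-sided critical value quoted in the text (alpha = 0.05, df = 15)
t_crit = 2.131

# Two-sided rule: reject H0 only if |t| exceeds the critical value
reject_h0 = abs(t_stat) > t_crit

print(round(t_stat, 2), reject_h0)  # prints: 1.11 False
```

Because the test is two-sided, the comparison uses the absolute value of the t statistic.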
At the 0.01 significance level, can we conclude that the mean weight is less than 9.0 grams?
$$ \text{Sample mean} = \bar{X} = \frac{\text{sum of the values}}{\text{number of the values}} = \frac{9.2 + 8.7 + 8.9 + 8.6 + 8.5 + 8.7 + 9}{8} = \frac{70.4}{8} = 8.8 $$
Step 2: Define the critical value (level of significance - region of acceptance of the null hypothesis).
Critical value from the t table: the hypothesis is one-sided (left-tailed) and the degrees of freedom are df = n − 1 = 8 − 1 = 7, so at α = 0.01 the critical value is − 2.998.
$$ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} $$
Then
$$ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{8.8 - 9}{0.2268 / \sqrt{8}} = -2.494 $$
The t-stat does not fall below the critical value, so it lies inside the region of acceptance. Thus, 𝐻0 is accepted.
𝐻0 : µ ≥ 9
𝐻1 : µ < 9
Based on the sample, we cannot conclude at the 0.01 significance level (99% confidence level) that the mean weight is
less than 9.0 grams.
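A left-tailed test differs from the two-sided case only in the decision rule. The sketch below uses the summary figures from the example (mean 8.8, s = 0.2268, n = 8). The critical value −2.998 is an assumption in the sense that it is not quoted in the original; it is the standard t-table entry for a one-sided test with α = 0.01 and df = 7.

```python
import math

# Summary statistics from the example
x_bar, mu0, s, n = 8.8, 9.0, 0.2268, 8

# t statistic: (x_bar - mu0) / (s / sqrt(n))
t_stat = (x_bar - mu0) / (s / math.sqrt(n))

# Assumed t-table value: one-sided, alpha = 0.01, df = n - 1 = 7
t_crit = -2.998

# Left-tailed rule: reject H0 only if t falls below the negative critical value
reject_h0 = t_stat < t_crit

print(round(t_stat, 2), reject_h0)
```

The t statistic is about −2.49, which is not below −2.998, so the sketch reaches the same decision as the text.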
Parametric Two Sample Tests

Paired-sample t test
It is used to test the difference between two variables for the same set of units.
• Repeated measurements for the same units
• Different measurements for the same units

Independent-sample t test
It is used to compare the means of randomly selected samples from two different populations.
𝐻0 : µ1 = µ2, or equivalently 𝐻0 : µ1 − µ2 = 0
𝐻1 : µ1 ≠ µ2, or equivalently 𝐻1 : µ1 − µ2 ≠ 0
$$ S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)} $$
X̄1: mean of the first sample
X̄2: mean of the second sample
n1: number of units in the first sample
n2: number of units in the second sample
The test statistic t_stat follows Student's t distribution with n1 + n2 − 2 degrees of freedom.
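The formula for the test statistic itself did not survive in the text above. For reference, the standard pooled-variance form, written with the same symbols, is sketched below. Treat it as the textbook formula rather than a quotation from the original slide; under 𝐻0 the term µ1 − µ2 equals 0.

```latex
t_{stat} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}
                {\sqrt{S_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}
```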
Mehmet Çağlar - Yildiz Technical University - [email protected] 17
Independent-sample t test
Number of products sold by two employees:
To compare the performance of two employees, the number of products each of them sold per day was recorded for 10 days.
The daily numbers are given in the table below. Is there a significant difference between the
performance of these employees in terms of the number of products sold, at the 0.05 significance level?
Employee A 22 34 52 62 30 40 64 84 56 59
Employee B 52 71 76 54 67 83 66 90 77 84
𝐻0 : µa = µb, or equivalently 𝐻0 : µa − µb = 0
𝐻1 : µa ≠ µb, or equivalently 𝐻1 : µa − µb ≠ 0
Degrees of freedom = n1 + n2 − 2 = 10 + 10 − 2 = 18
H0 is rejected. Thus, we conclude that there is a significant difference between the numbers of products sold by the
employees.
The mean number of products sold by Employee B is 72, while the mean number of products sold by Employee A is 50.3.
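The decision above can be checked from the raw data in the table. The sketch below computes the pooled variance and the t statistic with Python's standard library. The helper name `pooled_t` is hypothetical, and the critical value 2.101 is an assumption in the sense that it is not quoted in the original; it is the standard t-table entry for a two-sided test with α = 0.05 and df = 18.

```python
import math
from statistics import mean, variance

employee_a = [22, 34, 52, 62, 30, 40, 64, 84, 56, 59]
employee_b = [52, 71, 76, 54, 67, 83, 66, 90, 77, 84]

def pooled_t(x, y):
    """Independent-sample t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(x), len(y)
    # variance() is the sample variance (divides by n - 1)
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(x) - mean(y)) / se, n1 + n2 - 2

t_stat, df = pooled_t(employee_a, employee_b)

t_crit = 2.101  # assumed t-table value: two-sided, alpha = 0.05, df = 18
reject_h0 = abs(t_stat) > t_crit

print(mean(employee_a), mean(employee_b), df, round(t_stat, 2), reject_h0)
```

The sample means come out as 50.3 and 72, matching the text, and the t statistic is about −3.04, which falls in the rejection region.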
Test statistic for the difference between the means of two paired samples:
$$ t_{stat} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}} $$
D̄: mean of differences
µ_D: hypothesized mean difference
S_D: standard deviation of differences
n: sample size
The test statistic t_stat follows Student's t distribution with n − 1 degrees of freedom.
Paired-sample t test
𝐻0 : µ_D = 0 (There is no significant difference in the textbook prices between the university bookstore and the online store)
𝐻1 : µ_D ≠ 0 (There is a significant difference in the textbook prices between the university bookstore and the online store)
Degrees of freedom = n − 1 = 16 − 1 = 15
$$ t_{stat} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}} $$
D̄ = 42.6013 and S_D = 43.797
$$ t_{stat} = \frac{42.6013 - 0}{43.797 / \sqrt{16}} = 3.8908 $$
Decision: Reject H0
H0 is rejected. Thus, we conclude that at the 5% significance level there is a significant difference in the textbook prices
between the university bookstore and the online store.
The mean price in the bookstore is 153.6 and the mean price in the online store is 111. Therefore, textbook prices in the
bookstore are higher than prices in the online store.
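The paired-sample statistic can be reproduced from the summary figures quoted in the example (mean difference 42.6013, standard deviation of differences 43.797, n = 16). In this sketch the critical value 2.131 is the two-sided t-table entry for α = 0.05 and df = 15; it is not quoted in this example, so treat it as an assumption.

```python
import math

# Summary statistics of the paired differences (bookstore price - online price)
d_bar, mu_d, s_d, n = 42.6013, 0.0, 43.797, 16

# Paired-sample t statistic: (D_bar - mu_D) / (S_D / sqrt(n))
t_stat = (d_bar - mu_d) / (s_d / math.sqrt(n))

df = n - 1
t_crit = 2.131  # assumed t-table value: two-sided, alpha = 0.05, df = 15
reject_h0 = abs(t_stat) > t_crit

print(df, round(t_stat, 4), reject_h0)  # prints: 15 3.8908 True
```

Working on the differences turns the paired test into a one-sample t test on D, which is why the degrees of freedom are n − 1 rather than 2n − 2.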
Degrees of freedom = n − 1 = 10 − 1 = 9
Decision: Reject H0
H0 is rejected. Thus, we conclude that at the 5% significance level the mean sales after the promotion are higher than the mean
sales before the promotion.
Mean sales increased after the promotion; therefore, the promotion was successful.
We want to compare the performance of three cars (A, B and C) based on fuel consumption. The fuel consumption of each car, in miles per gallon, is given in the
table below. Apply a test at the 5% level of significance to compare their fuel consumption.
𝐻1 : µi ≠ µj for at least one pair (µi, µj): at least two cars have different mean fuel consumption
Sample sizes (n): Car A = 7, Car B = 7, Car C = 6
$$ SSG = 7 (20.9 - 22.3)^2 + 7 (23.2 - 22.3)^2 + 6 (22.9 - 22.3)^2 = 21.55 $$
ANOVA table, Total row: sum of squares = 33.73, degrees of freedom = 19
Decision: Reject H0
𝐻1 : µi ≠ µj for at least one pair (µi, µj): at least two cars have different mean fuel consumption
H0 is rejected. Thus, we conclude that at the 95% confidence level at least two cars have different mean fuel
consumption.
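The one-way ANOVA quantities can be rebuilt from the summary figures that appear in this example: group sizes 7, 7 and 6, group means 20.9, 23.2 and 22.9, grand mean 22.3, and a total sum of squares of 33.73. The sketch below derives the within-group sum of squares by subtraction. The critical value 3.59 is an assumption in the sense that it is not quoted in the original; it is the standard F-table entry for α = 0.05 with (2, 17) degrees of freedom.

```python
# Summary figures taken from the example
group_sizes = [7, 7, 6]
group_means = [20.9, 23.2, 22.9]
grand_mean = 22.3
sst = 33.73  # total sum of squares

k = len(group_sizes)   # number of groups
n = sum(group_sizes)   # total number of observations

# Between-group sum of squares: sum of n_i * (group mean - grand mean)^2
ssg = sum(n_i * (m - grand_mean) ** 2 for n_i, m in zip(group_sizes, group_means))
ssw = sst - ssg        # within-group sum of squares

msg = ssg / (k - 1)    # mean square between groups
msw = ssw / (n - k)    # mean square within groups
f_stat = msg / msw

f_crit = 3.59  # assumed F-table value: alpha = 0.05, df = (2, 17)
reject_h0 = f_stat > f_crit

print(round(ssg, 2), round(msw, 4), round(f_stat, 2), reject_h0)
```

The within-group mean square comes out at about 0.7165, the MSW value used later in the pairwise comparisons.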
If there are differences in the population means, carry out pairwise comparisons (post hoc tests).
In ANOVA, multiple (pairwise) comparison methods are called post hoc tests.
• Tukey-Kramer Method
• Rodger's Method
• Scheffe's method
$$ \text{Critical range} = q_{1-\alpha} \sqrt{\frac{MSW}{2} \left( \frac{1}{n_i} + \frac{1}{n_j} \right)} $$
q_{1−α}: value from the studentized range table with D1 = k and D2 = n − k degrees of freedom for the desired level of confidence (1 − α)
If the calculated pairwise comparison value is greater than the critical range, we conclude that the difference is significant.
$$ \text{Critical range} = q_{1-\alpha} \sqrt{\frac{MSW}{2} \left( \frac{1}{n_i} + \frac{1}{n_j} \right)} $$
𝑞1−α = 3.63
MSW = 0.7165
𝑛𝐴 = 7, 𝑛𝐵 = 7, 𝑛𝐶 = 6
Critical Ranges
Pairwise comparisons:
If the calculated pairwise comparison value is greater than the critical range, we conclude that the difference is significant.
Interpretation:
There is no significant difference between fuel consumption of Car B and Car C at 95% confidence level. The difference between Car
A and Car B and the difference between Car A and Car C in fuel consumption are significant at 95% confidence level.
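The pairwise decisions can be reproduced with the critical range formula and the values given above (q = 3.63, MSW = 0.7165, group sizes 7, 7 and 6). The group means 20.9, 23.2 and 22.9 are taken from the SSG computation; assigning them to Cars A, B and C in that order is an assumption, chosen because it matches the stated interpretation.

```python
import math
from itertools import combinations

q = 3.63        # studentized range value given in the text
msw = 0.7165    # mean square within, given in the text
n = {"A": 7, "B": 7, "C": 6}
# Assumed order of the group means (A, B, C), taken from the SSG computation
means = {"A": 20.9, "B": 23.2, "C": 22.9}

def critical_range(q, msw, n_i, n_j):
    """Tukey-Kramer critical range for one pair of groups."""
    return q * math.sqrt((msw / 2) * (1 / n_i + 1 / n_j))

results = {}
for i, j in combinations(n, 2):
    diff = abs(means[i] - means[j])          # absolute difference of sample means
    cr = critical_range(q, msw, n[i], n[j])
    results[(i, j)] = (diff, cr, diff > cr)  # True means the difference is significant
    print(i, j, round(diff, 2), round(cr, 3), diff > cr)
```

Because the group sizes are unequal, the critical range is slightly larger for pairs that include Car C than for the A and B pair.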
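When we examine the mean fuel consumption by car, Car A has the lowest mean value.
If the values measure fuel used (lower is better), Car A has the highest performance compared to Car B and Car C. If they are in miles per gallon, as the problem states, the lowest mean instead marks Car A as the least fuel-efficient car.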
Homogeneity (equality) of variances: When comparing the means of two or more independent populations
(independent-sample t test, ANOVA), the population variances must be equal. If the variances are
not equal, we can apply robust parametric tests (Welch's test).
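As an illustration of the robust alternative, the sketch below computes Welch's t statistic, which does not pool the variances, together with the Welch-Satterthwaite approximation of the degrees of freedom. Neither formula appears in the original; both are the standard textbook forms. The function name `welch_t` is hypothetical, and the data are the two employees' sales figures, reused here only as sample input.

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch's t statistic and approximate degrees of freedom (unequal variances)."""
    n1, n2 = len(x), len(y)
    v1 = variance(x) / n1   # squared standard error of the first mean
    v2 = variance(y) / n2   # squared standard error of the second mean
    t = (mean(x) - mean(y)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

employee_a = [22, 34, 52, 62, 30, 40, 64, 84, 56, 59]
employee_b = [52, 71, 76, 54, 67, 83, 66, 90, 77, 84]

t_stat, df = welch_t(employee_a, employee_b)
print(round(t_stat, 2), round(df, 1))
```

With equal sample sizes the Welch statistic equals the pooled statistic. Only the degrees of freedom change, dropping from 18 to about 15.7 here.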
The arithmetic mean is a parameter that is highly affected by outliers. Outliers also affect the
distribution of the data and the variance.
Before applying any parametric test, outliers must be detected and, if possible, removed from
the dataset.
Outliers can be detected easily with a box plot (box-and-whisker plot).
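A box plot conventionally flags points that lie more than 1.5 times the interquartile range (IQR) beyond the quartiles. The sketch below applies that rule numerically. The 1.5 multiplier is the usual convention rather than something stated in the original, the helper name `iqr_outliers` is hypothetical, and the sample data are made up for illustration.

```python
from statistics import quantiles

def iqr_outliers(data, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR] (box plot rule)."""
    q1, _, q3 = quantiles(data, n=4)   # quartiles, default 'exclusive' method
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]

# Hypothetical sample with one obviously extreme value
sample = [10, 12, 12, 13, 12, 11, 14, 13, 15, 10, 10, 100]

outliers = iqr_outliers(sample)
print(outliers)  # prints: [100]
```

Statistical packages compute quartiles in slightly different ways, so borderline points may be flagged by one tool and not by another.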
Normal Distribution
The normal probability distribution has the following properties (Wegner, 2013, p. 133):
Skewness
Kurtosis
«If the variation from the normal distribution is sufficiently large, all resulting statistical tests are
invalid, because normality is required to use the F and t statistics» (Hair, et al., 2014: 69).
Jarque-Bera Test
P-P and Q-Q plots compare the cumulative distribution of the actual data values with the cumulative
distribution of a normal distribution (Hair et al., 2014:70; Field, 2013:218).
The normal distribution forms a straight diagonal line, and the plotted data values are compared with the
diagonal. If a distribution is normal, the line representing the actual data distribution closely follows the
diagonal (Hair et al., 2014:70). Any deviation of the dots from the diagonal line represents a deviation from
normality. Kurtosis shows up as the dots sagging above or below the line, whereas skewness shows
up as the dots snaking around the line in an 'S' shape (Field, 2013:225).
Calculate standardized z values for skewness and kurtosis and compare the values to a critical value (e.g.
1.96, 2.58)
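$$ z_{skewness} = \frac{\text{skewness}}{\sqrt{6 / n}} = \frac{\text{skewness}}{\text{standard error of skewness}} $$
$$ z_{kurtosis} = \frac{\text{kurtosis}}{\sqrt{24 / n}} = \frac{\text{kurtosis}}{\text{standard error of kurtosis}} $$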
If z_skewness and z_kurtosis are both between ±1.96 (0.05 significance level), the data can be treated as normally distributed.
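The z values can be computed directly from the data. The sketch below uses the simple moment-based estimates of skewness and excess kurtosis, together with the standard errors sqrt(6/n) and sqrt(24/n) from the formulas above. The choice of moment-based estimators is an assumption: statistical packages often apply small-sample corrections, so their output can differ slightly. The helper name `skew_kurt_z` is hypothetical, and the input is Employee A's sales data, reused only as sample input.

```python
import math

def skew_kurt_z(data):
    """Return (z_skewness, z_kurtosis) using moment-based estimates."""
    n = len(data)
    m = sum(data) / n
    m2 = sum((x - m) ** 2 for x in data) / n   # second central moment
    m3 = sum((x - m) ** 3 for x in data) / n   # third central moment
    m4 = sum((x - m) ** 4 for x in data) / n   # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2 - 3                # excess kurtosis (0 for a normal distribution)
    return skewness / math.sqrt(6 / n), kurtosis / math.sqrt(24 / n)

sample = [22, 34, 52, 62, 30, 40, 64, 84, 56, 59]
z_skew, z_kurt = skew_kurt_z(sample)

# Both z values must lie within +/-1.96 at the 0.05 significance level
looks_normal = abs(z_skew) < 1.96 and abs(z_kurt) < 1.96
print(round(z_skew, 3), round(z_kurt, 3), looks_normal)
```

For a perfectly symmetric sample such as 1, 2, 3, 4, 5, the skewness z value is exactly 0.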
The Shapiro–Wilk test is the more appropriate method for small sample sizes (< 50), although it can
also handle larger samples, while the Kolmogorov–Smirnov test is used for n ≥ 50 (Mishra et al.,
2019).
Always use both Graphical approaches and Statistical Tests to examine the distribution.
*If variances are not equal but the other assumptions are met, we can apply robust parametric tests
(Welch’s Test).
Groebner, D. F., Shannon, P. W., & Fry, P. C. (2018). Business Statistics: A Decision-Making Approach (10th Ed.). Pearson Education.
Hair, J. F., Black, W. C., Babin, B. J., & Anderson, R. E. (2014). Multivariate Data Analysis (7th Ed.). Pearson Education.
Levine, D. M., Szabat, K. A., & Stephan, D. F. (2016). Business Statistics: A First Course (7th Ed.). Pearson Education.
Lind, D. A., Marchal, W. G., & Wathen, S. A. (2021). Statistical Techniques in Business & Economics (18th Ed.). McGraw-Hill Education.
Mishra, P., Pandey, C. M., Singh, U., Gupta, A., Sahu, C., & Keshri, A. (2019). Descriptive statistics and normality tests for statistical data. Annals of Cardiac Anaesthesia, 22(1), 67.
Newbold, P., Carlson, W. L., & Thorne, B. M. (2013). Statistics for Business and Economics (Global Edition). Pearson Education.
Wegner, T. (2013). Applied Business Statistics: Methods and Excel-Based Applications (3rd Ed.). Juta and Company Ltd.