Statistics for Management - II
Parametric Hypothesis Tests

Assist. Prof. Mehmet Çağlar

Classification of Methods for Hypothesis Tests
Methods for Hypothesis Tests

Parametric Methods

• These methods require an assumption about the probability distribution of the population (Normal Distribution).
• Used for quantitative (numerical) data.

Nonparametric Methods

• These methods do not require an assumption about the probability distribution of the population (distribution-free methods).
• Used for qualitative (categorical) data.
• Used for quantitative data that is not normally distributed and has a small sample size (n<30).

Mehmet Çağlar - Yildiz Technical University - [email protected] 2


Classification of Methods for Hypothesis Tests
Commonly used Parametric and Nonparametric tests

Type of test                          Parametric Test             Nonparametric Test
One-Sample                            One-sample t test           Wilcoxon Signed-Rank Test
Two-Sample, Independent Samples       Independent-sample t test   Mann-Whitney U Test
Two-Sample, Paired Samples            Paired-sample t test        Wilcoxon Signed-Rank Test
More than Two Samples, One Factor     One-way ANOVA               Kruskal-Wallis Test
More than Two Samples, Two Factor     Two-way ANOVA               Friedman Test


Classification of Methods for Hypothesis Tests
Z test and t test

• If the population standard deviation (or variance) is known, we can use the Z test.

• If the population standard deviation (or variance) is unknown, we use the t test.

Most of the time the population standard deviation (or variance) is not known. That is why we mostly use the t test for one-sample hypothesis testing.

Note: Some sources indicate that the Z test can be applied when the sample is large (n>30 or n>50). In practice, however, we use the t test even when the sample is large.


One Population / One Sample Test
The hypothesis testing process is formalized as a five-step procedure:

Step 1: Formulate the hypotheses (the null and alternative hypotheses).

Step 2: Define the critical value (level of significance - region of acceptance of the null hypothesis).

Step 3: Calculate the test statistic.

Step 4: Compare the test statistic to the region of acceptance.

Step 5: Interpret the results.
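
Step 4 of the procedure can be sketched as a small helper. This is only an illustration: the function name `decide` and the `tail` labels are assumptions introduced here, and the numbers passed in are the test statistics and table values from the worked examples in these slides.

```python
def decide(t_stat: float, critical: float, tail: str = "two-sided") -> str:
    """Compare a test statistic with the region of acceptance.

    `critical` is the positive table value; `tail` names the form of H1.
    """
    if tail == "two-sided":      # H1: mu != mu0
        reject = abs(t_stat) > critical
    elif tail == "left":         # H1: mu < mu0
        reject = t_stat < -critical
    elif tail == "right":        # H1: mu > mu0
        reject = t_stat > critical
    else:
        raise ValueError(f"unknown tail: {tail!r}")
    return "reject H0" if reject else "do not reject H0"


print(decide(1.11, 2.131))             # two-sided example (ice cream cartons)
print(decide(-2.494, 2.998, "left"))   # left-tailed example (medicine bottles)
print(decide(4.392, 1.833, "right"))   # right-tailed example (promotion)
```

The critical value still has to be read from the t table (or computed with statistical software) in Step 2; the helper only encodes the comparison.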



One Sample t Test
The one-sample t test is a parametric test used to compare a sample mean with a benchmark (claimed value).

Example: A company that produces chocolate bars claims that the average weight of a chocolate bar is 60 grams.
H0: µ = 60
H1: µ ≠ 60

Example: A quality control manager claims that, on average, the number of defective products per day is less than or equal to 25.
H0: µ ≤ 25
H1: µ > 25

Example: A store manager claims that, on average, they sell at least 200 products a day.
H0: µ ≥ 200
H1: µ < 200


One Sample t Test
Dairy Fresh Ice Cream (Groebner et al., 2018: 356-357)

The Dairy Fresh Ice Cream plant in Greensboro, Alabama, uses a filling machine for its 64-ounce cartons. There is some variation in the actual amount of ice cream that goes into each carton. The machine can go out of adjustment and put a mean amount either less or more than 64 ounces in the cartons. To monitor the filling process, the production manager selects a simple random sample of 16 filled ice cream cartons each day. The sample mean is 64.2 ounces and the sample standard deviation is 0.72 ounces.

Does the machine fill 64 ounces in the cartons on average? Conduct a test at the 5% level of significance.


One Sample t Test
Step 1: Formulate the hypotheses (the null and alternative hypotheses).
𝐻0 : µ = 64
𝐻1 : µ ≠ 64

Step 2: Define the critical value (level of significance - region of acceptance of the null hypothesis).

Significance level = α = 5% = 0.05 Confidence level = 1-α = 95% = 0.95

Critical value: read from the t table using the significance level (α) and the degrees of freedom (df).

Degrees of freedom (df) = n − 1 = 16 − 1 = 15

Critical value for two-sided α = 0.05 and df = 15 is 2.131

Thus, the region of acceptance for H0 is −2.131 ≤ t ≤ +2.131

The decision rule for accepting or rejecting H0 is then stated as follows:

Accept H0 if t-stat falls between −2.131 and +2.131. Reject H0 if t-stat falls below −2.131 or above +2.131.


One Sample t Test

Step 3: Calculate the test statistic.

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

$\bar{x}$: sample mean = 64.2

$\mu$: population mean = 64

$s$: sample standard deviation = 0.72

$n$: sample size = 16

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{64.2 - 64}{0.72 / \sqrt{16}} = 1.11$$


One Sample t Test
Step 4: Compare the test statistic to the region of acceptance.

The sample test statistic t-stat = 1.11 lies inside the region of acceptance of − 2.131 ≤ t ≤ + 2.131

Thus, 𝐻0 is accepted.
𝐻0 : µ = 64
𝐻1 : µ ≠ 64

Step 5: Interpret the results.

Based on these sample data, the company does not have sufficient evidence to conclude that the filling
machine is out of adjustment.
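
The Dairy Fresh calculation can be reproduced from the summary statistics alone. The sketch below is illustrative; the function name `one_sample_t` is an assumption, and the critical value 2.131 is the table value quoted in Step 2.

```python
import math


def one_sample_t(mean: float, mu0: float, s: float, n: int) -> float:
    """t = (x_bar - mu0) / (s / sqrt(n)), computed from summary statistics."""
    return (mean - mu0) / (s / math.sqrt(n))


t_stat = one_sample_t(mean=64.2, mu0=64.0, s=0.72, n=16)
t_crit = 2.131  # two-sided, alpha = 0.05, df = 15 (from the t table)

print(round(t_stat, 2))
print("reject H0" if abs(t_stat) > t_crit else "do not reject H0")
```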



One Sample t Test
Example (Lind et al., 2021: 302): A machine is set to fill a small bottle with 9.0 grams of medicine. A sample of eight bottles revealed the following amounts (grams) in each bottle.

9.2 8.7 8.9 8.6 8.8 8.5 8.7 9.0

At the 0.01 significance level, can we conclude that the mean weight is less than 9.0 grams?

First, we need to calculate the sample mean and the sample standard deviation.

$$\bar{x} = \frac{\text{sum of the values}}{\text{number of the values}} = \frac{9.2 + 8.7 + 8.9 + 8.6 + 8.8 + 8.5 + 8.7 + 9.0}{8} = \frac{70.4}{8} = 8.8$$

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{(9.2 - 8.8)^2 + (8.7 - 8.8)^2 + \dots + (9.0 - 8.8)^2}{8 - 1}} = \sqrt{\frac{0.36}{7}} = 0.2268$$
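
As a quick check of the two summary statistics, here is a minimal sketch using Python's standard `statistics` module. The variable names are illustrative; `statistics.stdev` divides by n − 1, which matches the sample standard deviation formula used here.

```python
import statistics

weights = [9.2, 8.7, 8.9, 8.6, 8.8, 8.5, 8.7, 9.0]  # grams, from the example

x_bar = statistics.mean(weights)   # sample mean
s = statistics.stdev(weights)      # sample standard deviation (divides by n - 1)

print(round(x_bar, 1), round(s, 4))
```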



One Sample t Test
Step 1: Formulate the hypotheses (the null and alternative hypotheses).
𝐻0 : µ ≥ 9
𝐻1 : µ < 9

Step 2: Define the critical value (level of significance - region of acceptance of the null hypothesis).

Significance level = 0.01

Critical value from the t table: the hypothesis is one-sided (left-tailed) and the degrees of freedom are n − 1 = 8 − 1 = 7.

Thus, the critical value is −2.998.

If the test statistic is lower than −2.998, H0 is rejected.


One Sample t Test
Step 3: Calculate the test statistic.

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{8.8 - 9}{0.2268 / \sqrt{8}} = -2.494$$

The test statistic calculated from the sample is −2.494.


One Sample t Test
Step 4: Compare the test statistic to the region of acceptance.

The sample test statistic is t-stat = −2.494 and the critical value is −2.998.

If the test statistic is lower than −2.998, H0 is rejected. Otherwise, H0 is accepted.

Since −2.494 is not lower than −2.998, H0 is accepted.

𝐻0 : µ ≥ 9

𝐻1 : µ < 9

Step 5: Interpret the results.

Based on the sample, we cannot say that the mean weight is less than 9.0 grams with the confidence level of
99%.
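
The whole medicine-bottle test can be run from the raw data. This is a sketch under the same assumptions as the slides: the critical value −2.998 is taken from the t table rather than computed, and the variable names are illustrative. The statistic comes out at about −2.494.

```python
import math
import statistics

weights = [9.2, 8.7, 8.9, 8.6, 8.8, 8.5, 8.7, 9.0]  # grams
mu0 = 9.0
t_crit = -2.998  # left tail, alpha = 0.01, df = 7 (from the t table)

n = len(weights)
t_stat = (statistics.mean(weights) - mu0) / (statistics.stdev(weights) / math.sqrt(n))

print(round(t_stat, 3))
print("reject H0" if t_stat < t_crit else "do not reject H0")
```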



Two Sample Test

Parametric Two Sample Tests

Independent Samples: Independent-sample t test
It is used to compare the means of randomly selected samples from two different populations.

Paired Samples: Paired-sample t test
It is used to test the difference between two variables for the same set of units.
• Repeated measurements for the same units
• Different measurements for the same units


Independent-sample t test
To test the difference between the first population mean (µ1) and the second population mean (µ2), the null hypothesis of no significant difference between the means of two independent populations and the alternative hypothesis can be stated as:

H0: µ1 = µ2, or equivalently H0: µ1 − µ2 = 0
H1: µ1 ≠ µ2, or equivalently H1: µ1 − µ2 ≠ 0

In words:

H0: The means of the two populations are equal (there is no significant difference between the two population means).
H1: The means of the two populations are not equal (there is a significant difference between the two population means).


Independent-sample t test
Test statistic for the difference between the means of two independent samples:

$$t_{stat} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$$

$$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)}$$

$\bar{X}_1$: mean of the first sample
$\bar{X}_2$: mean of the second sample
$\mu_1 - \mu_2$: hypothesized mean difference
$n_1$: number of units in the first sample
$n_2$: number of units in the second sample
$S_1^2$: variance of the first sample
$S_2^2$: variance of the second sample
$S_p^2$: pooled variance

The test statistic $t_{stat}$ follows the Student t distribution with $n_1 + n_2 - 2$ degrees of freedom.

Independent-sample t test
Number of products sold by two employees:

To compare the performance of two employees, the daily number of products sold was counted for each of them for 10 days. The numbers are given in the table below. Is there a significant difference between the performance of these employees in terms of the number of products sold, at the 0.05 significance level?

Employee A   22  34  52  62  30  40  64  84  56  59
Employee B   52  71  76  54  67  83  66  90  77  84

Step 1: Formulating the hypotheses

H0: µa = µb, or equivalently H0: µa − µb = 0
H1: µa ≠ µb, or equivalently H1: µa − µb ≠ 0

H0: There is no significant difference between the number of products sold by the employees.
H1: There is a significant difference between the number of products sold by the employees.

Independent-sample t test
Step 2: Defining the critical value.

Significance level is 0.05.

For the critical value:

This is a two-sided hypothesis test.

Degrees of freedom = n1 + n2 − 2 = 10 + 10 − 2 = 18

Critical value = 2.101


Independent-sample t test
Step 3: Calculating the test statistic.

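              Mean    Standard Deviation
Employee A    50.3    18.73
Employee B    72.0    12.54

$$S_p^2 = \frac{(10 - 1)(18.73)^2 + (10 - 1)(12.54)^2}{(10 - 1) + (10 - 1)} = 254.03$$

$$t_{stat} = \frac{(50.3 - 72.0) - 0}{\sqrt{254.03 \left( \frac{1}{10} + \frac{1}{10} \right)}} = -3.04$$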
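
The pooled-variance t statistic for the two employees can be computed directly from the daily sales. This is a minimal sketch, not the slide author's own calculation: variable names are illustrative, and the critical value 2.101 is the table value from Step 2. From the raw data the statistic is about −3.04.

```python
import math
import statistics

a = [22, 34, 52, 62, 30, 40, 64, 84, 56, 59]  # Employee A
b = [52, 71, 76, 54, 67, 83, 66, 90, 77, 84]  # Employee B

n1, n2 = len(a), len(b)
var1, var2 = statistics.variance(a), statistics.variance(b)  # sample variances

# Pooled variance, then the t statistic under H0: mu_a - mu_b = 0
sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
t_stat = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

t_crit = 2.101  # two-sided, alpha = 0.05, df = 18 (from the t table)
print(round(t_stat, 2))
print("reject H0" if abs(t_stat) > t_crit else "do not reject H0")
```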



Independent-sample t test
Step 4: Comparing the test statistic to the region of acceptance.

Do not reject H0 if −2.101 < t-stat < 2.101, otherwise reject H0.

Step 5: Interpreting the results.

H0: There is no significant difference between the number of products sold by the employees.
H1: There is a significant difference between the number of products sold by the employees.

H0 is rejected. Thus, we conclude that there is a significant difference between the number of products sold by the employees.

The mean number of products sold by Employee B is 72, while the mean number of products sold by Employee A is 50.3.

Employee B has a higher performance.


Paired-sample t test
Hypotheses for the paired-sample t test:
H0: µD = 0
H1: µD ≠ 0

Test statistic for the difference between the means of two paired samples:

$$t_{stat} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$$

$\bar{D}$: mean of differences
$\mu_D$: hypothesized mean difference
$S_D$: standard deviation of differences
$n$: sample size

The test statistic $t_{stat}$ follows the Student t distribution with $n - 1$ degrees of freedom.

Paired-sample t test

Book prices (Levine et al., 2016: 357):

A student researching the prices of some of the textbooks taught in the business department finds that the prices differ between the university bookstore and an online store. Prices of some books are listed in the table. Are textbook prices at the university bookstore different from the prices offered at the online store, at the 5% significance level?


Paired-sample t test
Step 1: Formulate the hypotheses

𝐻0 : µ𝐷 = 0 (𝑇ℎ𝑒𝑟𝑒 𝑖𝑠 𝒏𝒐 𝒔𝒊𝒈𝒏𝒊𝒇𝒊𝒄𝒂𝒏𝒕 𝒅𝒊𝒇𝒇𝒆𝒓𝒆𝒏𝒄𝒆 𝑖𝑛 𝑡ℎ𝑒 𝑡𝑒𝑥𝑡𝑏𝑜𝑜𝑘 𝑝𝑟𝑖𝑐𝑒𝑠 𝑏𝑒𝑡𝑤𝑒𝑒𝑛 𝑡ℎ𝑒 𝑢𝑛𝑖𝑣𝑒𝑟𝑠𝑖𝑡𝑦 𝑏𝑜𝑜𝑘𝑠𝑡𝑜𝑟𝑒 𝑎𝑛𝑑 𝑡ℎ𝑒 𝑜𝑛𝑙𝑖𝑛𝑒 𝑠𝑡𝑜𝑟𝑒)
𝐻1 : µ𝐷 ≠ 0 (𝑇ℎ𝑒𝑟𝑒 𝑖𝑠 𝒂 𝒔𝒊𝒈𝒏𝒊𝒇𝒊𝒄𝒂𝒏𝒕 𝒅𝒊𝒇𝒇𝒆𝒓𝒆𝒏𝒄𝒆 𝑖𝑛 𝑡ℎ𝑒 𝑡𝑒𝑥𝑡𝑏𝑜𝑜𝑘 𝑝𝑟𝑖𝑐𝑒𝑠 𝑏𝑒𝑡𝑤𝑒𝑒𝑛 𝑡ℎ𝑒 𝑢𝑛𝑖𝑣𝑒𝑟𝑠𝑖𝑡𝑦 𝑏𝑜𝑜𝑘𝑠𝑡𝑜𝑟𝑒 𝑎𝑛𝑑 𝑡ℎ𝑒 𝑜𝑛𝑙𝑖𝑛𝑒 𝑠𝑡𝑜𝑟𝑒)

Step 2: Define the critical value.

Significance level is 0.05.

For the critical value:

This is a two-sided hypothesis test.

Degrees of freedom = n − 1 = 16 − 1 = 15

Critical value = 2.131


Paired-sample t test
Step 3: Calculate the test statistic.

$$t_{stat} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$$

$\bar{D} = 42.6013$ and $S_D = 43.797$

$$t_{stat} = \frac{42.6013 - 0}{43.797 / \sqrt{16}} = 3.8908$$


Paired-sample t test
Step 4: Compare the test statistic to the region of acceptance.

Critical value = 2.1314

Test statistic = 3.8908

Decision: Reject H0

Step 5: Interpret the results.

H0: µD = 0 (There is no significant difference in the textbook prices between the university bookstore and the online store)
H1: µD ≠ 0 (There is a significant difference in the textbook prices between the university bookstore and the online store)

H0 is rejected. Thus, we conclude that, at the 5% significance level, there is a significant difference in the textbook prices between the university bookstore and the online store.

The mean price in the bookstore is 153.6 and the mean price in the online store is 111. Therefore, textbook prices in the bookstore are higher than prices in the online store.


Paired-sample t test
A company has 10 different stores. The company ran a special promotion to increase sales in each store. The weekly sales (in thousand $) of each store before and after the promotion are given below. Is the promotion successful? Test at the 95% confidence level.

Stores     Weekly sales before the promotion   Weekly sales after the promotion
Store 1    50                                  55
Store 2    75                                  75
Store 3    60                                  66
Store 4    45                                  50
Store 5    80                                  85
Store 6    85                                  87
Store 7    40                                  42
Store 8    45                                  45
Store 9    50                                  52
Store 10   55                                  58


Paired-sample t test
Step 1: Formulate the hypotheses (the null and alternative hypotheses).

H0: µA ≤ µB, or H0: µA − µB ≤ 0, or H0: µD ≤ 0
H1: µA > µB, or H1: µA − µB > 0, or H1: µD > 0

µA: Mean sales AFTER the promotion
µB: Mean sales BEFORE the promotion
µD: Mean difference in sales (After − Before)

Step 2: Define the critical value.

Confidence level is 95%; significance level is 0.05.

For the critical value:

This is a one-sided hypothesis test.

Degrees of freedom = n − 1 = 10 − 1 = 9

Critical value = 1.833


Paired-sample t test

Step 3: Calculate the test statistic.

Stores     Before   After   Difference (After − Before)
Store 1    50       55      5
Store 2    75       75      0
Store 3    60       66      6
Store 4    45       50      5
Store 5    80       85      5
Store 6    85       87      2
Store 7    40       42      2
Store 8    45       45      0
Store 9    50       52      2
Store 10   55       58      3
Mean       58.5     61.5    3

Standard deviation of the difference: 2.16 (2.2 when rounded to one decimal)

$$t_{stat} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}} = \frac{3 - 0}{2.16 / \sqrt{10}} = 4.392$$
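
The paired-sample calculation can be checked from the store data. A minimal sketch with illustrative variable names follows; the critical value 1.833 is the table value from Step 2. The statistic only matches 4.392 when the unrounded standard deviation (about 2.16) is used rather than 2.2.

```python
import math
import statistics

before = [50, 75, 60, 45, 80, 85, 40, 45, 50, 55]
after = [55, 75, 66, 50, 85, 87, 42, 45, 52, 58]

diffs = [a - b for a, b in zip(after, before)]   # After - Before, per store
d_bar = statistics.mean(diffs)                   # mean of differences
s_d = statistics.stdev(diffs)                    # standard deviation of differences
t_stat = (d_bar - 0) / (s_d / math.sqrt(len(diffs)))

t_crit = 1.833  # right tail, alpha = 0.05, df = 9 (from the t table)
print(d_bar, round(s_d, 2), round(t_stat, 3))
print("reject H0" if t_stat > t_crit else "do not reject H0")
```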



Paired-sample t test
Step 4: Compare the test statistic to the region of acceptance.

Critical value = 1.833

Test statistic = 4.392

Decision: Reject H0

Step 5: Interpret the results.

H0: µA ≤ µB, or H0: µD ≤ 0 (Mean sales after the promotion are equal to or less than mean sales before the promotion)
H1: µA > µB, or H1: µD > 0 (Mean sales after the promotion are higher than mean sales before the promotion)

H0 is rejected. Thus, we conclude that, at the 5% significance level, mean sales after the promotion are higher than mean sales before the promotion.

Mean sales increased after the promotion; therefore, the promotion is successful.


Comparing Means of more than 2 Populations

Number of samples        Design       Parametric test
More than Two Samples    One Factor   One-way ANOVA
More than Two Samples    Two Factor   Two-way ANOVA

ANOVA is an acronym for Analysis of Variance.

ANOVA is used to compare three or more population means.


One-way ANOVA
Suppose that we have k different samples (groups) with n observations in total, and the population means are denoted µ1, µ2, µ3, …, µk.

Then the hypotheses are:

H0: µ1 = µ2 = µ3 = ⋯ = µk (All population means are equal)

H1: µi ≠ µj for at least one pair µi, µj (At least two population means differ)

The critical value is defined using the F distribution:

Critical value: $F_{k-1,\, n-k,\, \alpha}$


One-way ANOVA
The test statistic is determined by calculating the values in the table below.

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares          F Ratio
Between groups        SSG              k − 1                MSG = SSG / (k − 1)   F = MSG / MSW
Within groups         SSW              n − k                MSW = SSW / (n − k)
Total                 SST              n − 1


One-way ANOVA
Example: Fuel-Consumption by Car (Newbold et al., 2013: 646-652)

We want to compare the performance of three cars (A, B and C) based on fuel consumption. Fuel consumption in miles per gallon by car is given in the table below. Apply a test at the 5% level of significance to compare their fuel consumption.

        Car-A   Car-B   Car-C
        22.2    24.6    22.7
        19.9    23.1    21.9
        20.3    22.0    23.2
        21.4    23.5    24.1
        21.2    23.6    22.1
        21.0    22.1    23.4
        20.3    23.5    ---
n       7       7       6
Sum     146.3   162.4   137.4
Mean    20.9    23.2    22.9

Step 1: Formulate the hypotheses (the null and alternative hypotheses).

H0: µA = µB = µC (All cars have equal mean fuel consumption)

H1: µi ≠ µj for at least one pair µi, µj (At least two cars have different mean fuel consumption)


One-way ANOVA
Step 2: Define the critical value.

Significance level is 0.05

Critical value: $F_{k-1,\, n-k,\, \alpha}$

n: total number of observations = 20

k: number of samples (groups) = 3

α: significance level = 0.05

Thus, the critical value is $F_{2,\, 17,\, 0.05}$

Critical value = 3.592



One-way ANOVA
Step 3: Calculate the test statistic.

        Car-A   Car-B   Car-C
n       7       7       6
Mean    20.9    23.2    22.9

Mean of all observations (grand mean): 22.3

$$SSG = 7 (20.9 - 22.3)^2 + 7 (23.2 - 22.3)^2 + 6 (22.9 - 22.3)^2$$

$$SSG = 21.55$$


One-way ANOVA
Step 3: Calculate the test statistic.
$$SSW = SS_1 + SS_2 + SS_3$$

$$SS_1 = (22.2 - 20.9)^2 + (19.9 - 20.9)^2 + \dots + (20.3 - 20.9)^2 = 3.76$$

$$SS_2 = (24.6 - 23.2)^2 + (23.1 - 23.2)^2 + \dots + (23.5 - 23.2)^2 = 4.96$$

$$SS_3 = (22.7 - 22.9)^2 + (21.9 - 22.9)^2 + \dots + (23.4 - 22.9)^2 = 3.46$$

$$SSW = SS_1 + SS_2 + SS_3 = 3.76 + 4.96 + 3.46 = 12.18$$


One-way ANOVA
Step 3: Calculate the test statistic.
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares   F Ratio
Between groups        21.55            2                    10.78          15.05
Within groups         12.18            17                   0.7165
Total                 33.73            19
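
The ANOVA table can be rebuilt from the raw fuel data. The sketch below is illustrative: the dictionary layout and variable names are assumptions, and the critical value 3.592 is the F-table value from Step 2. Working with unrounded intermediate values gives F of about 15.04, which agrees with the table's 15.05 up to rounding.

```python
import statistics

groups = {
    "A": [22.2, 19.9, 20.3, 21.4, 21.2, 21.0, 20.3],
    "B": [24.6, 23.1, 22.0, 23.5, 23.6, 22.1, 23.5],
    "C": [22.7, 21.9, 23.2, 24.1, 22.1, 23.4],
}

all_values = [x for g in groups.values() for x in g]
n, k = len(all_values), len(groups)
grand_mean = statistics.mean(all_values)

# Between-groups and within-groups sums of squares
ssg = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups.values())
ssw = sum((x - statistics.mean(g)) ** 2 for g in groups.values() for x in g)

msg = ssg / (k - 1)   # mean square between groups
msw = ssw / (n - k)   # mean square within groups
f_stat = msg / msw

f_crit = 3.592  # F(2, 17) at alpha = 0.05 (from the F table)
print(round(ssg, 2), round(ssw, 2), round(f_stat, 2))
print("reject H0" if f_stat > f_crit else "do not reject H0")
```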



One-way ANOVA
Step 4: Compare the test statistic to the region of acceptance.

Critical value = 3.592

Test statistic = 15.05

Decision: Reject H0

Step 5: Interpret the results.

H0: µA = µB = µC (All cars have equal mean fuel consumption)

H1: µi ≠ µj for at least one pair µi, µj (At least two cars have different mean fuel consumption)

H0 is rejected. Thus, we conclude that, at the 95% confidence level, at least two cars have different mean fuel consumption.


One-way ANOVA
If there are differences in population means do pairwise comparison (Post Hoc Tests).

In ANOVA Multiple (Pairwise) Comparison methods are called Post Hoc Tests.

Most commonly used post hoc tests are

• Tukey Test (Tukey’s HSD (Honest Significant Difference))

• Tukey-Kramer Method

• Bonferroni Procedure (Bonferroni Correction)

• Rodger's Method

• Scheffe's method





One-way ANOVA
The Tukey-Kramer Method for Pairwise Comparisons

Tukey-Kramer Critical Range

$$\text{Critical range} = q_{1-\alpha} \sqrt{\frac{MSW}{2} \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$$

$q_{1-\alpha}$: value from the studentized range table with D1 = k and D2 = n − k degrees of freedom for the desired level of confidence (1 − α)

MSW: mean square within

$n_i$ and $n_j$: sizes of the samples being compared

If the absolute difference between two sample means is greater than the critical range, we conclude that the difference is significant.


One-way ANOVA
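The Tukey-Kramer Method for Pairwise Comparisons

$$\text{Critical range} = q_{1-\alpha} \sqrt{\frac{MSW}{2} \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$$

For $q_{1-\alpha}$: D1 = k = 3, D2 = n − k = 20 − 3 = 17, 1 − α = 0.95

$q_{1-\alpha} = 3.63$

MSW = 0.7165

$n_A = 7$, $n_B = 7$, $n_C = 6$

Critical Ranges

$$\text{A and B} = 3.63 \sqrt{\frac{0.7165}{2} \left( \frac{1}{7} + \frac{1}{7} \right)} = 1.16$$

$$\text{A and C} = 3.63 \sqrt{\frac{0.7165}{2} \left( \frac{1}{7} + \frac{1}{6} \right)} = 1.21$$

$$\text{B and C} = 3.63 \sqrt{\frac{0.7165}{2} \left( \frac{1}{7} + \frac{1}{6} \right)} = 1.21$$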


One-way ANOVA
The Tukey-Kramer Method for Pairwise Comparisons

Pairwise comparisons:

Comparisons       $|\bar{x}_i - \bar{x}_j|$   Critical Range   Significant
Car A vs Car B    2.3                         1.16             Yes
Car A vs Car C    2.0                         1.21             Yes
Car B vs Car C    0.3                         1.21             No

If the absolute difference between two sample means is greater than the critical range, we conclude that the difference is significant.
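
The pairwise comparisons can be sketched in a few lines. The helper name `critical_range` and the dictionaries are illustrative assumptions; the value q = 3.63 is taken from the studentized range table as in the slides, not computed.

```python
import math
from itertools import combinations

means = {"A": 20.9, "B": 23.2, "C": 22.9}
sizes = {"A": 7, "B": 7, "C": 6}
msw = 0.7165
q = 3.63  # studentized range value for k = 3, n - k = 17, 1 - alpha = 0.95 (table)


def critical_range(n_i: int, n_j: int) -> float:
    """Tukey-Kramer critical range: q * sqrt((MSW / 2) * (1/n_i + 1/n_j))."""
    return q * math.sqrt((msw / 2) * (1 / n_i + 1 / n_j))


results = {}
for i, j in combinations(means, 2):
    diff = abs(means[i] - means[j])          # absolute difference of sample means
    cr = critical_range(sizes[i], sizes[j])
    results[(i, j)] = (diff, cr, diff > cr)
    print(f"Car {i} vs Car {j}: |diff| = {diff:.1f}, "
          f"critical range = {cr:.2f}, significant = {diff > cr}")
```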

Interpretation:

There is no significant difference between fuel consumption of Car B and Car C at 95% confidence level. The difference between Car
A and Car B and the difference between Car A and Car C in fuel consumption are significant at 95% confidence level.



One-way ANOVA
Interpretation:

        Car-A   Car-B   Car-C
n       7       7       6
Mean    20.9    23.2    22.9

When we examine the mean miles per gallon by car, Car A has the lowest value (20.9). Because the data are measured in miles per gallon, a lower value means that more fuel is used per mile. Therefore, Car A has the highest fuel consumption and the lowest performance compared with Car B and Car C, which do not differ significantly from each other.


Assumptions of Parametric Hypothesis Tests
No outliers: There should not be any outliers in the data.

Normality: The data should be normally distributed.

Homogeneity (Equality) of variances: When comparing the means of two or more independent populations (independent-sample t test, ANOVA), the variances of the populations must be equal. If the variances are not equal, we can apply robust parametric tests (Welch's Test).


Assumptions of Parametric Hypothesis Tests
No outliers: There should not be any outliers in the data.

An outlier is an extreme value (too high or too low) in a dataset.

The arithmetic mean is a parameter that is highly affected by outliers. Outliers also affect the distribution of the data and the variance.

Before applying any parametric test, outliers must be detected and (if possible) removed from the dataset.

Outliers can be easily detected using a box (box-and-whisker) plot.


Assumptions of Parametric Hypothesis Tests

A box plot is a graphical summary of data that is based on quartiles (Q1, Q2 and Q3) and the interquartile range (IQR = Q3 − Q1).

Source: Anderson et al., 2010: 111
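
A box plot flags outliers using fences built from the quartiles. The sketch below assumes the common convention of fences at 1.5 × IQR beyond Q1 and Q3, which is not stated on the slide; the function name and the sample data are made up for illustration. Quartile conventions differ between software packages, so the fences may vary slightly from those of a given tool.

```python
import statistics


def iqr_outliers(data, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]; k = 1.5 is a common convention."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartiles, default 'exclusive' method
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]


sample = [12, 14, 15, 15, 16, 17, 18, 19, 20, 45]  # made-up data for illustration
print(iqr_outliers(sample))
```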



Assumptions of Parametric Hypothesis Tests

The 20th value is an outlier.

The 19th value is also an extreme value.

Remove the 20th observation from the data and create a new box plot.

Always start by removing the most extreme value. Remove the outliers one by one.


Assumptions of Parametric Hypothesis Tests

Normal Distribution
The normal probability distribution has the following properties (Wegner, 2013, pp. 133):

• The curve is bell-shaped.

• It is symmetrical about a central mean value, µ.

• The tails of the curve never touch the x-axis, meaning that there is always a non-zero probability associated with every value in the problem domain (i.e. asymptotic).

• The distribution is always described by two parameters: a mean (µ) and a standard deviation (σ).


Components of Normal Distribution

Skewness: a measure of the symmetry of the distribution

Kurtosis: a measure of the peakedness of the distribution

In normally distributed data, skewness and (excess) kurtosis are equal to 0.

Normality Tests

«If the variation from the normal distribution is sufficiently large, all resulting statistical tests are
invalid, because normality is required to use the F and t statistics» (Hair, et al., 2014: 69).

Graphical Analyses
• Histogram
• P-P plot (probability-probability plot)
• Q-Q plot (quantile-quantile plot)

Statistical Tests
• z value approach
• Shapiro-Wilk Test
• Kolmogorov-Smirnov Test
• Jarque-Bera Test


Graphical Analyses for Normality
We create Histogram of the data to see its distribution.

P-P plots and Q-Q plots compare the cumulative distribution of the actual data values with the cumulative distribution of a normal distribution (Hair et al., 2014:70; Field, 2013:218).

The normal distribution forms a straight diagonal line, and the plotted data values are compared with the diagonal. If a distribution is normal, the line representing the actual data distribution closely follows the diagonal (Hair et al., 2014:70). Any deviation of the dots from the diagonal line represents a deviation from normality. Kurtosis is shown by the dots sagging above or below the line, whereas skewness is shown by the dots snaking around the line in an 'S' shape (Field, 2013:225).

[Figure: Histogram vs Normal Q-Q Plot]


Statistical Tests for Normality
z value approach

Calculate standardized z values for skewness and kurtosis and compare them to a critical value (e.g. 1.96, 2.58).

$$z_{skewness} = \frac{\text{skewness}}{\sqrt{6 / n}} = \frac{\text{skewness}}{\text{standard error of skewness}}$$

$$z_{kurtosis} = \frac{\text{kurtosis}}{\sqrt{24 / n}} = \frac{\text{kurtosis}}{\text{standard error of kurtosis}}$$

If $z_{skewness}$ and $z_{kurtosis}$ are both between ±1.96 (0.05 significance level), the data is normally distributed.
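
The z value approach can be sketched as follows. The slide does not say how skewness and kurtosis themselves are computed, so the moment-based definitions below (with excess kurtosis, which is 0 for a normal distribution) are an assumption; statistical packages often apply small-sample corrections and may report slightly different values. The function name is illustrative, and the data are Employee A's sales from the earlier example.

```python
import math


def normality_z_values(data):
    """z values for skewness and excess kurtosis, using SE = sqrt(6/n) and sqrt(24/n)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n   # third central moment
    m4 = sum((x - mean) ** 4 for x in data) / n   # fourth central moment
    skewness = m3 / m2 ** 1.5          # moment-based skewness (assumed definition)
    kurtosis = m4 / m2 ** 2 - 3        # excess kurtosis (assumed definition)
    return skewness / math.sqrt(6 / n), kurtosis / math.sqrt(24 / n)


a = [22, 34, 52, 62, 30, 40, 64, 84, 56, 59]  # Employee A sales
z_skew, z_kurt = normality_z_values(a)
print(round(z_skew, 2), round(z_kurt, 2))
print(abs(z_skew) < 1.96 and abs(z_kurt) < 1.96)  # True means both z values are within ±1.96
```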



Statistical Tests for Normality

Shapiro-Wilk Test and Kolmogorov-Smirnov Test

H0: The data is normally distributed

H1: The data is not normally distributed

Compare the p value to the significance level.

If p value > significance level, H0 is not rejected and the data is treated as normally distributed.


Graphical and Statistical Tests for Normality
Statistical tests are not powerful for small samples (n<30) and are quite sensitive in large samples (Hair et al., 2014).

The Shapiro-Wilk test is the more appropriate method for small sample sizes (<50 samples), although it can also handle larger sample sizes, while the Kolmogorov-Smirnov test is used for n ≥ 50 (Mishra et al., 2019).

Always use both graphical approaches and statistical tests to examine the distribution.


Homogeneity of variances
Homogeneity (Equality) of variances: When comparing the means of two or more independent populations (independent-sample t test, ANOVA), the variances of the populations must be equal.

Use Levene's test.

H0: Variances are equal

H1: Variances are not equal

If p value > α ➔ H0 is not rejected and the variances are treated as equal

*If the variances are not equal but the other assumptions are met, we can apply robust parametric tests (Welch's Test).
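
To show what Levene's test measures, here is a minimal sketch of the mean-centered Levene W statistic, applied to the two employees' sales from the independent-sample example. The function name is illustrative. Instead of a p value, the sketch compares W with an F-table critical value for k − 1 = 1 and N − k = 18 degrees of freedom; statistical packages report a p value and may center on the median rather than the mean, so their results can differ.

```python
import statistics


def levene_w(*groups):
    """Levene's W statistic using absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Z_ij = |Y_ij - mean of group i|
    z = [[abs(y - statistics.mean(g)) for y in g] for g in groups]
    z_means = [statistics.mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


a = [22, 34, 52, 62, 30, 40, 64, 84, 56, 59]  # Employee A
b = [52, 71, 76, 54, 67, 83, 66, 90, 77, 84]  # Employee B
w = levene_w(a, b)
f_crit = 4.41  # F(1, 18) at alpha = 0.05 (from the F table)
print(round(w, 3))
print("variances differ" if w > f_crit else "no evidence that variances differ")
```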



References
Anderson, D. R., Sweeney, D. J., & Williams, T. A. (2010). Essentials of Statistics for Business and Economics (6th Ed.). Cengage Learning.

Field, A. (2013). Discovering Statistics Using IBM SPSS Statistics. Sage.

Groebner, D. F., Shannon, P. W., & Fry, P. C. (2018). Business Statistics: A Decision Making Approach (10th Ed.). Pearson Education.

Hair, J. F., Black, W. C., Babin, B. J., & Anderson, R. E. (2014). Multivariate Data Analysis (7th Ed.). Pearson Education.

Levine, D. M., Szabat, K. A., & Stephan, D. F. (2016). Business Statistics: A First Course (7th Ed.). Pearson Education.

Lind, D. A., Marchal, W. G., & Wathen, S. A. (2021). Statistical Techniques in Business & Economics (18th Ed.). McGraw-Hill Education.

Mishra, P., Pandey, C. M., Singh, U., Gupta, A., Sahu, C., & Keshri, A. (2019). Descriptive statistics and normality tests for statistical data. Annals of Cardiac Anaesthesia, 22(1), 67.

Newbold, P., Carlson, W. L., & Thorne, B. M. (2013). Statistics for Business and Economics (Global Edition). Pearson Education.

Wegner, T. (2013). Applied Business Statistics: Methods and Excel-Based Applications (3rd Ed.). Juta and Company Ltd.