Drawing and testing assumptions
Parametric and non-parametric hypothesis tests
Please do not copy without permission. © ExploreAI 2023.
Statistical tests overview
| In hypothesis testing, statistical tests are used to decide whether we reject or fail to reject the null hypothesis.
We use either parametric or non-parametric tests, depending on some factors of the underlying data.
Parametric tests
● Assume that the data follow a specific distribution, such as a normal distribution.
● Provide more precise estimates of the population parameters when the assumptions are met.
● May produce inaccurate or unreliable results when the data violate the assumptions.
Non-parametric tests
● Make no assumptions about the distribution of the data.
● More robust to violations of assumptions, but less powerful than parametric tests when the parametric assumptions are met.
● Often preferred when the sample size is small.
Parametric tests overview
| Parametric tests are a group of statistical tests that are used to test hypotheses on population parameters,
such as the mean or variance, by making certain assumptions about the underlying distribution of the data.
T-test
Used to test hypotheses on the mean of a single population, or the difference between the means of two populations with small sample sizes.

Z-test
Similar to the t-test, but the z-test assumes that the population standard deviation is known and the sample is large.

F-test
Used to test hypotheses on the difference between the variances of two or more populations with large sample sizes.

Analysis of Variance (ANOVA)
Used to test hypotheses on the means of three or more populations.
The t-test
| The t-test is a parametric test based on the t-distribution and is used to test hypotheses on the mean of a
single population, or the difference between the means of two samples, when the sample size is small.
The t-distribution, or Student's t-distribution, is used to calculate the probability of obtaining a sample mean that is different from the population mean. It has heavier tails than the normal distribution to account for the increased variability in smaller sample sizes.

The t-test uses the t-distribution to calculate the critical value that is used to determine if the difference between means is statistically significant.
|t-score| ≥ critical value -> Reject the null hypothesis
|t-score| < critical value -> Fail to reject the null hypothesis
One-sample t-test
| To compare a sample mean with the population mean, we use a one-sample t-test.
The test statistic t (t-score) is:

t = (x̄ − μ) / (s / √n)

where:
x̄ is the sample mean
s is the sample standard deviation
n is the sample size
μ is the population mean

We can use the AVERAGE(), STDEV(), and COUNT() Google Sheets functions to calculate the t-score.

Note: We also need to determine the critical value using the degrees of freedom and level of significance, either from a statistical table, an online calculator, or Google Sheets.

Assumptions for the one-sample t-test:
01. Random sampling: The data are collected using a random sampling method to ensure that the sample is representative of the population.
02. Normality: The distribution of the sample means is approximately normal.
03. Independence: The observations in the sample are independent of each other. In other words, the value of one observation is not related to the value of another observation.
04. Homogeneity of variance (homoscedasticity): The variance of the sample is approximately equal to the variance of the population.
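The same t-score calculation can be sketched in Python using only the standard library, mirroring the AVERAGE(), STDEV(), and COUNT() approach. The sample values and hypothesised population mean below are made up for illustration.

```python
# One-sample t-score: t = (x_bar - mu) / (s / sqrt(n)).
# The sample and hypothesised population mean are illustrative only.
from math import sqrt
from statistics import mean, stdev

sample = [12.1, 11.8, 12.4, 12.0, 11.6, 12.3, 11.9, 12.2]
mu = 12.5  # hypothesised population mean

x_bar = mean(sample)   # AVERAGE()
s = stdev(sample)      # STDEV() -- sample standard deviation (n - 1)
n = len(sample)        # COUNT()

t_score = (x_bar - mu) / (s / sqrt(n))
df = n - 1             # degrees of freedom for the critical-value lookup

print(f"t = {t_score:.3f}, df = {df}")  # t = -4.901, df = 7
```

The t-score is then compared against the critical value looked up for df = 7 at the chosen level of significance.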
Independent two-sample t-test
| To compare the means of two independent samples, when there is no link between the two groups, we use an
independent two-sample t-test.
The t-score is:

t = (x̄1 − x̄2) / (sp · √(1/n1 + 1/n2))

where:
x̄1 is sample one's mean
x̄2 is sample two's mean
sp is the pooled standard deviation
n1 is sample one's size
n2 is sample two's size

The pooled standard deviation is:

sp = √(((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2))

Assumptions for the independent samples t-test:
01. Normality: The data in each group are normally distributed.
02. Homoscedasticity: The variance of the data in each group is equal.
03. Independence: The observations within each group are independent of each other, and the two groups are independent of each other.

Note: To find the p-value from the statistical table, we need to use (n1 + n2 - 2) degrees of freedom for the independent two-sample test.
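A short Python sketch of the pooled standard deviation and independent two-sample t-score; both groups are illustrative data.

```python
# Independent two-sample t-score with pooled standard deviation.
# Both groups are illustrative data.
from math import sqrt
from statistics import mean, stdev

group_a = [5.1, 4.8, 5.4, 5.0, 4.9, 5.2]
group_b = [4.5, 4.7, 4.4, 4.8, 4.6, 4.3]

x1, x2 = mean(group_a), mean(group_b)
s1, s2 = stdev(group_a), stdev(group_b)
n1, n2 = len(group_a), len(group_b)

# Pooled standard deviation: sqrt(((n1-1)s1^2 + (n2-1)s2^2) / (n1+n2-2))
sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

t_score = (x1 - x2) / (sp * sqrt(1 / n1 + 1 / n2))
df = n1 + n2 - 2  # degrees of freedom for the table lookup
```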
Paired two-sample t-test
| To compare the means of two related samples, i.e. two sets from the same group, we use a paired two-sample
t-test.
The t-score is:

t = d̄ / (sd / √n)

where:
d̄ is the mean of the differences between the samples
sd is the standard deviation of the differences
n is the sample size

Assumptions for the paired samples t-test:
01. Normality: The differences between the paired observations are normally distributed.
02. Independence: The paired observations are independent of each other.

Note: To find the p-value from the statistical table, we need to use (n - 1) degrees of freedom for the paired two-sample test.
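The paired calculation works on the per-pair differences; the before/after measurements here are illustrative.

```python
# Paired two-sample t-score: t = d_bar / (s_d / sqrt(n)).
# Before/after measurements are illustrative.
from math import sqrt
from statistics import mean, stdev

before = [72, 75, 70, 78, 74]
after = [70, 72, 69, 75, 71]

diffs = [b - a for b, a in zip(before, after)]  # per-pair differences
d_bar = mean(diffs)   # mean of the differences
s_d = stdev(diffs)    # standard deviation of the differences
n = len(diffs)

t_score = d_bar / (s_d / sqrt(n))
df = n - 1  # degrees of freedom for the paired test
```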
T-tests in Google Sheets
We can either use AVERAGE(), STDEV(), and COUNT() to calculate the t-score using Google Sheets and a p-value
table to determine the p-value, or we can use the built-in function T.TEST() to calculate the p-value.
=T.TEST(range1, range2, tails, type)
● range1 – The first sample of data or group of cells to consider for the t-test.
● range2 – The second sample of data or group of cells to consider for the t-test.
● tails – Specifies the number of distribution tails.
  ○ If 1: uses a one-tailed distribution.
  ○ If 2: uses a two-tailed distribution.
● type – Specifies the type of t-test.
  ○ If 1: a paired test is performed.
  ○ If 2: a two-sample equal variance (homoscedastic) test is performed.
  ○ If 3: a two-sample unequal variance (heteroscedastic) test is performed.

If the populations being compared have equal variance, then we can use the two-sample equal variance test. If the variances differ, we use the two-sample unequal variance test.

Note: The T.TEST() function output is the probability associated with the t-test, i.e. the p-value.
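For comparison, the three T.TEST() types map onto SciPy's t-test functions (assuming SciPy is available); the data below are illustrative.

```python
# Rough SciPy equivalents of T.TEST(range1, range2, 2, type): two-tailed p-values.
# Data are illustrative.
from scipy import stats

a = [23.1, 20.3, 22.8, 21.5, 24.0, 22.2]
b = [21.0, 19.8, 21.5, 20.2, 22.6, 21.1]

p_paired = stats.ttest_rel(a, b).pvalue                     # type = 1 (paired)
p_equal = stats.ttest_ind(a, b).pvalue                      # type = 2 (equal variance)
p_unequal = stats.ttest_ind(a, b, equal_var=False).pvalue   # type = 3 (unequal variance)

# A one-tailed p-value (tails = 1) is half the two-tailed value when the
# observed effect is in the hypothesised direction.
```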
The z-test
| The z-test is a parametric test based on the normal distribution (a.k.a. the z-distribution) and is similar to the
t-test. However, the z-test is used when the sample is large and the population standard deviation is known.
In the z-test, the difference between the means of two populations
is expressed in terms of the number of standard deviations (𝝈).
The z-score (test statistic z) therefore represents the number of
standard deviations between the sample mean and population mean,
assuming the null hypothesis is true.
|z-score| ≥ critical value -> Reject the null hypothesis
|z-score| < critical value -> Fail to reject the null hypothesis
One-sample z-test
| To compare a sample mean with the population mean when the sample size is large and the population
standard deviation is known, we use a one-sample z-test.
The test statistic z is:

z = (x̄ − μ) / (𝝈 / √n)

where:
x̄ is the sample mean
𝝈 is the population standard deviation
n is the sample size
μ is the population mean

Note: The only difference between the test statistics t and z for a one-sample test is using the sample standard deviation for the t statistic and the population standard deviation for the z statistic.

Assumptions for the one-sample z-test:
01. Random sampling: The sample is selected randomly from the population.
02. Normal distribution: The population from which the sample is drawn is normally distributed.
03. Large sample size: The sample size is sufficiently large, typically at least 30, so that the central limit theorem can be applied.
04. Independence: The observations in the sample are independent of each other.
05. Known population standard deviation: The standard deviation of the population is known.
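The z-score and its two-tailed p-value can be computed with the standard library alone; the sample (n = 36), hypothesised mean, and known 𝝈 below are illustrative.

```python
# One-sample z-score and two-tailed p-value using only the standard library.
# Sample, hypothesised mean, and known sigma are illustrative.
from math import sqrt
from statistics import NormalDist, mean

sample = [100 + (i % 9) for i in range(36)]  # deterministic sample with mean 104
mu = 100      # hypothesised population mean
sigma = 15    # known population standard deviation

n = len(sample)
z_score = (mean(sample) - mu) / (sigma / sqrt(n))   # (104 - 100) / (15 / 6) = 1.6
p_value = 2 * (1 - NormalDist().cdf(abs(z_score)))  # two-tailed p-value

# |z| = 1.6 < 1.96, so at alpha = 0.05 we fail to reject the null hypothesis.
```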
Two-sample z-test
| To compare the means of two different samples when the sample sizes are large and the population
standard deviations are known, we use a two-sample z-test.
The test statistic z for independent groups is:

z = (x̄1 − x̄2) / √(𝝈1²/n1 + 𝝈2²/n2)

where:
x̄1 is sample one's mean
x̄2 is sample two's mean
𝝈1 is population one's standard deviation
𝝈2 is population two's standard deviation
n1 is sample one's size
n2 is sample two's size

The test statistic z for related groups is:

z = (d̄ − μd) / (sd / √n)

where:
d̄ is the mean of the differences between the samples
μd is the hypothesised mean of the differences (usually equal to zero)
sd is the standard deviation of the differences
n is the sample size

Assumptions for the two-sample z-test:
01. Normality: The data in each group are normally distributed.
02. Homoscedasticity: The variance of the data in each group is equal.
03. Independence: The two samples are independent of each other.
04. Known population standard deviations: The standard deviations of the populations are known.
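The independent-groups z-score only needs the summary statistics; the numbers below are illustrative.

```python
# Two-sample z-score for independent groups:
# z = (x1 - x2) / sqrt(sigma1^2/n1 + sigma2^2/n2). Numbers are illustrative.
from math import sqrt

x1, x2 = 52.0, 50.0         # sample means
sigma1, sigma2 = 4.0, 3.0   # known population standard deviations
n1, n2 = 40, 50             # sample sizes (both >= 30)

z_score = (x1 - x2) / sqrt(sigma1**2 / n1 + sigma2**2 / n2)
# z ~= 2.63 > 1.96, so at alpha = 0.05 we reject the null hypothesis.
```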
Z-tests in Google Sheets
We can either use AVERAGE(), STDEV(), and COUNT() to calculate the z-score using Google Sheets and a
statistical table to determine the p-value, or we can use the built-in function Z.TEST() to calculate the p-value.
=Z.TEST(data, value, [standard_deviation])

● data – The sample of data to consider for the z-test.
● value – The value to test the sample against, most often the population mean.
● [standard_deviation] – The standard deviation to assume for the z-test, often the standard deviation of the population.

Note: The Z.TEST() function output is the probability associated with the z-test, i.e. the p-value.
It is important to note how the Google Sheet hypothesis testing functions differ based on the test we need to
use, whether it is a one or two-sample test, and whether our samples are independent or paired.
How to choose a parametric test
If our data follow a normal distribution, then:

● Sample size ≥ 30 and population standard deviation known → z-test:
  ○ One sample → one-sample z-test
  ○ Two unrelated samples → two-sample independent z-test
  ○ Two related samples → two-sample paired z-test
● Sample size < 30, or population standard deviation unknown → t-test:
  ○ One sample → one-sample t-test
  ○ Two unrelated samples → two-sample independent t-test
  ○ Two related samples → two-sample paired t-test
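The decision flow can be sketched as a small function (a sketch only, assuming the data are already known to be normally distributed; names are made up):

```python
# Sketch of the parametric-test decision flow. Assumes normally distributed data.
def choose_parametric_test(n_samples, sample_size, pop_stdev_known, samples_related=False):
    # z-test branch needs a large sample AND a known population stdev.
    use_z = sample_size >= 30 and pop_stdev_known
    prefix = "z" if use_z else "t"
    if n_samples == 1:
        return f"one-sample {prefix}-test"
    if samples_related:
        return f"two-sample paired {prefix}-test"
    return f"two-sample independent {prefix}-test"

print(choose_parametric_test(1, 100, True))                        # one-sample z-test
print(choose_parametric_test(2, 20, False, samples_related=True))  # two-sample paired t-test
```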
Non-parametric tests overview
| Non-parametric tests are a group of statistical tests that are used to test hypotheses when the
underlying distribution of the data is unknown.
Several non-parametric tests are available to use in hypothesis testing. Which test to apply usually depends on:

● The type of data: continuous, ordinal, nominal, or categorical.
● The purpose of the test: testing for a difference or for a relationship (correlation).
● The number of groups, and whether they are independent or related.
| Rather than relying on estimates of population parameters as parametric tests do, non-parametric tests
are based on ranks or the number of times certain events occur in the data.
Some of the most commonly used non-parametric tests include:
● Kolmogorov-Smirnov test: Compares the distribution of a sample with a theoretical distribution.
● Mann-Whitney U-test: Determines if two independent groups come from populations with the same
distribution.
● Chi-square: Tests for independence between categorical variables in a contingency table.
● Spearman’s rank correlation coefficient: Measures the strength and direction of the relationship between two
variables using ranks.
● Wilcoxon signed rank test: Determines if there is a significant difference between two paired samples using
ranks.
● Friedman test: Tests for significant differences between three or more paired samples using ranks.
● Kruskal-Wallis H test: Tests for significant differences between three or more independent groups using
ranks.
Kolmogorov-Smirnov and ECDF
| Kolmogorov-Smirnov (KS) is a non-parametric test based on the empirical cumulative distribution function
(ECDF), which is a way to visually represent how data are distributed.
The ECDF maps each observation in a dataset to the proportion of
observations that are less than or equal to it.
It is a step function that increases by 1/n at each point, where n is the
sample size.
For example, considering the ECDF for the average annual salary
for women in Africa, we see that more than 50% of women earn
$2500 or less per year. We also see that 75% of women earn less
than $3750 per year.
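An ECDF is straightforward to build by hand: sort the data and assign each point the cumulative proportion i/n. The salary figures below are illustrative, not the data behind the example above.

```python
# Building an ECDF: each sorted value maps to the proportion of observations
# less than or equal to it (a step of 1/n per point). Salaries are illustrative.
def ecdf(data):
    xs = sorted(data)
    n = len(xs)
    return xs, [(i + 1) / n for i in range(n)]

salaries = [1000, 1500, 2000, 2500, 3000, 3500, 3750, 5000]
xs, props = ecdf(salaries)

prop_at_2500 = props[xs.index(2500)]  # proportion earning <= 2500
```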
Kolmogorov-Smirnov test overview
| Kolmogorov-Smirnov (KS) is used to test hypotheses on whether an underlying distribution observed in a
sample is similar to the hypothesised distribution, or whether two distributions are similar.
Considering that KS helps us examine underlying distributions, it is a useful tool to test for normality, which is a prerequisite of parametric tests.

As with parametric tests, we need to state the null and alternative hypotheses:
● H0 is that the sample is drawn from a population with a specific distribution, e.g. a normal distribution.
● HA is that the sample is not drawn from a population with the specified distribution.

We will also need to specify the level of significance (𝛼), calculate a test statistic (denoted D for Kolmogorov-Smirnov), and determine the critical value and p-value.

|D| ≥ critical value -> Reject the null hypothesis
|D| < critical value -> Fail to reject the null hypothesis

p-value ≤ 𝛼 -> Reject the null hypothesis
p-value > 𝛼 -> Fail to reject the null hypothesis
Kolmogorov-Smirnov test statistic
The Kolmogorov-Smirnov test statistic is:

D = max over i = 1, …, n of max(|F(Yi) − (i − 1)/n|, |i/n − F(Yi)|)

where:
i is the index of the ordered sample Y1, Y2, …, Yn, i.e. the rank
n is the sample size
Yi is the ith ordered value in the sample
F(Yi) is the hypothesised cumulative distribution function (CDF) evaluated at the ith ordered value of the sample data Yi

Both (i − 1)/n and i/n represent the empirical cumulative distribution function (ECDF)*. We need to calculate both difference terms to ensure that the ECDF starts at 0 ((i − 1)/n) and ends at 1 (i/n), i.e. the entire range.

The test statistic D is a single value which is the maximum across both difference terms, |F(Yi) − (i − 1)/n| and |i/n − F(Yi)|, for all sample values Yi.

*(i − 1)/n represents the cumulative proportion of observations that are expected to be strictly less than the ith ordered value, while i/n represents the proportion that is less than or equal to it.
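A small worked example of the D statistic, using the uniform CDF (F(y) = y on [0, 1]) as the hypothesised distribution so the arithmetic is easy to check by hand; the sample values are illustrative.

```python
# Kolmogorov-Smirnov statistic: D = max_i max(|F(Yi) - (i-1)/n|, |i/n - F(Yi)|),
# computed here against the uniform CDF F(y) = y. The sample is illustrative.
ys = sorted([0.1, 0.4, 0.5, 0.9])
n = len(ys)

def F(y):
    return y  # hypothesised CDF: uniform on [0, 1]

D = max(
    max(abs(F(y) - i / n), abs((i + 1) / n - F(y)))
    for i, y in enumerate(ys)  # i runs 0..n-1, so i/n here is (rank - 1)/n
)
# The largest gap occurs at Y3 = 0.5, where |3/4 - 0.5| = 0.25, so D = 0.25.
```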
Steps to the Kolmogorov-Smirnov test
The steps to performing KS:
01. State the null and alternative hypotheses:
  a. H0 is that the sample is drawn from a population with a specific distribution, e.g. a normal distribution.
  b. HA is that the sample is not drawn from a population with the specified distribution.
02. Specify the level of significance (𝛼).
03. Calculate the test statistic, D, using the Kolmogorov-Smirnov test statistic formula.
04. Determine the critical value using the KS table, level of significance, and sample size.
05. Compare the test statistic (D) to the critical value.

Although the p-value for a KS test can be calculated using statistical software or a programming language like Python with built-in functions, it is much more involved and resource-intensive in Google Sheets. In theory, the p-value can be calculated using either the exact method, when n ≤ 35, or the approximate (also known as asymptotic) method for larger sample sizes (n).
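Steps 04 and 05 can be sketched with the common large-sample approximation of the KS critical value, roughly 1.36 / √n at 𝛼 = 0.05; the D value used here is illustrative, not a computed statistic.

```python
# Steps 04-05 sketch: compare D to an approximate KS critical value.
# For alpha = 0.05 and larger n, the critical value is roughly 1.36 / sqrt(n).
from math import sqrt

alpha_coefficient = 1.36  # asymptotic coefficient for alpha = 0.05
n = 50                    # sample size
D = 0.18                  # test statistic from step 03 (illustrative)

critical_value = alpha_coefficient / sqrt(n)
reject_h0 = abs(D) >= critical_value  # False here: fail to reject H0
```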