QUARTER 4 – STATISTICS AND PROBABILITY
CHAPTER 3
SAMPLING AND SAMPLING DISTRIBUTION

Sampling is a technique of selecting a subset of a population in order to make statistical inferences about that population.

Population is the entire group of individuals or objects that you want to study.

Sample consists of one or more observations drawn from the population.

Types of Random Sampling
1. Simple Random Sampling - the most basic random sampling, wherein each element in the population has an equal probability of being selected.

Example: Random selection of 20 students from a class of 50 students. Each student has an equal chance of being selected.

2. Systematic Random Sampling - this is done by listing all the elements in the population and selecting every kth element in the list, starting from a random point. This is as precise as simple random sampling. It is often used on long population lists.

Example: A teacher takes an alphabetized list of student names and picks a random starting point. Every 20th student is selected to take a survey.

3. Stratified Random Sampling - a random sampling wherein the population is divided into different strata or divisions. Samples are picked proportionately from each stratum, which is why all strata are represented in the sample.

Example: A student council surveys 100 students by getting samples of 25 freshmen, 25 sophomores, 25 juniors, and 25 seniors.

4. Cluster Random Sampling - a random sampling wherein the population is divided into clusters or groups and then clusters are randomly selected. All elements of the randomly selected clusters are considered the samples of the study.

Example: An airline company wants to survey its customers one day, so they randomly select 55 flights that day and survey every passenger on those flights.

Parameter is a value or measurement obtained from a population. It is usually referred to as the true or actual value.

Example: The mean grade of all students in a class is 88.5.

Statistic is a value or measurement obtained from a sample. It is an estimate of a parameter.

Example: The average height of a sample of 50 students is 5 feet and 7 inches.

A sampling distribution of sample means is a frequency distribution using the means computed from all possible random samples of a specific size taken from a population.

Steps in Constructing the Sampling Distribution of the Mean
1. Determine the number of possible samples that can be drawn from the population using the formula:
$C(N, n) = \frac{N!}{(N-n)!\,n!}$

Where: N = size of the population
n = size of the sample

2. List all possible samples and compute the mean of each sample.

3. Construct a frequency distribution of the sample means obtained in Step 2.

Example 1:

A population consists of five values (Php 2, Php 3, Php 4, Php 5, Php 6). A sample of size 2 is to be taken from this population.

Step 1: Determine the number of possible samples that can be drawn from the population using the formula:
$C(N, n) = \frac{N!}{(N-n)!\,n!}$

Given: N = 5; n = 2
$C(5, 2) = \frac{5!}{(5-2)! \times 2!} = 10$
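The count in Step 1 can be checked with a few lines of Python. This is only an illustrative sketch: the function name `count_samples` is not from the source, and it assumes sampling without replacement where order does not matter, as the formula above implies.

```python
import math

def count_samples(N: int, n: int) -> int:
    """Number of possible samples of size n from a population of size N."""
    # C(N, n) = N! / ((N - n)! * n!)
    return math.factorial(N) // (math.factorial(N - n) * math.factorial(n))

print(count_samples(5, 2))                     # 10
print(count_samples(5, 2) == math.comb(5, 2))  # True (math.comb needs Python 3.8+)
```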
Step 2: List all possible samples and compute the mean of each sample.

Step 3: Construct a frequency distribution of the sample means obtained in Step 2.

Histogram

STEPS IN CONSTRUCTING THE SAMPLING DISTRIBUTION OF THE SAMPLE MEANS
1. Compute the population mean.
2. Compute the population variance.
3. Compute the population standard deviation.
4. Determine the number of possible samples.
5. List all possible samples and their corresponding means.
6. Construct the sampling distribution of the sample means.
7. Compute the mean of the sampling distribution of the sample means.
8. Compute the variance of the sampling distribution of sample means.
9. Compute the standard deviation of the sampling distribution of sample means.
10. Construct the histogram.

Example:
A population consists of five values (Php 2, Php 3, Php 4, Php 5, Php 6). A sample of size 2 is to be taken from this population.

Step 1. Compute the population mean.
$\mu = \frac{\sum X}{N} = \frac{2 + 3 + 4 + 5 + 6}{5} = \frac{20}{5} = 4$

Step 2. Compute the population variance.

Step 3. Compute the population standard deviation.

Step 4. Determine the number of possible samples.
N = 5; n = 2
$C(N, n) = \frac{N!}{(N-n)!\,n!} = \frac{5!}{(5-2)!\,2!} = \frac{5!}{3!\,2!} = 10$

There are 10 possible samples.
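Steps 1 to 4 of this example can be reproduced with a short script. The sketch below is not part of the original handout; it assumes the usual population formulas (dividing by N rather than N - 1), and the helper names are made up for illustration.

```python
import math

population = [2, 3, 4, 5, 6]  # values in Php, from the example
N = len(population)
n = 2                         # sample size

def population_mean(values):
    return sum(values) / len(values)

def population_variance(values):
    # Population variance divides by N (not N - 1).
    mu = population_mean(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

mu = population_mean(population)              # Step 1
variance = population_variance(population)    # Step 2
std_dev = math.sqrt(variance)                 # Step 3
num_samples = math.comb(N, n)                 # Step 4

print(mu)                  # 4.0
print(variance)            # 2.0
print(round(std_dev, 2))   # 1.41
print(num_samples)         # 10
```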
Step 5. List all possible samples and their corresponding means.

Step 6. Construct the sampling distribution of the sample means.

Step 7. Compute the mean of the sampling distribution of the sample means.
$\mu_{\bar{X}} = \sum[\bar{X} \cdot P(\bar{X})] = \frac{40}{10} = 4$

Step 8. Compute the variance of the sampling distribution of sample means.
$\sigma^2_{\bar{X}} = \sum[\bar{X}^2 \cdot P(\bar{X})] - \mu_{\bar{X}}^2 = \frac{67}{4} - 4^2 = 0.75$

Step 9. Compute the standard deviation of the sampling distribution of sample means.
$\sigma_{\bar{X}} = \sqrt{\sum[\bar{X}^2 \cdot P(\bar{X})] - \mu_{\bar{X}}^2} = \sqrt{\frac{67}{4} - 4^2} = \sqrt{0.75} \approx 0.87$

Histogram
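Steps 5 to 9 can be worked through by enumerating every sample. The following is a minimal sketch, assuming samples are drawn without replacement and order does not matter; the variable names are illustrative and exact fractions are used so the results match the hand computation.

```python
import math
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [2, 3, 4, 5, 6]
n = 2

# Step 5: all possible samples and their means.
samples = list(combinations(population, n))
means = [Fraction(sum(s), n) for s in samples]

# Step 6: sampling distribution, i.e. each distinct mean with its probability.
freq = Counter(means)
dist = {m: Fraction(f, len(means)) for m, f in sorted(freq.items())}

# Steps 7 to 9: mean, variance, and standard deviation of the sample means.
mean_of_means = sum(m * p for m, p in dist.items())
variance = sum(m ** 2 * p for m, p in dist.items()) - mean_of_means ** 2
std_dev = math.sqrt(variance)

for m, p in dist.items():
    print(float(m), freq[m], p)
print(mean_of_means)      # 4
print(variance)           # 3/4
print(round(std_dev, 2))  # 0.87
```

The frequencies printed by the loop are what a histogram of the sample means would show on its vertical axis.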
Application of the Central Limit Theorem

Example 1:
A population of size $N = 250$ has mean $\mu = 76$ and standard deviation $\sigma = 14$.
1. What is the probability that a random sample of size n = 35 will have a mean of 79 or more?
2. What is the probability that a random sample of size n = 140 will have a mean between 74 and 77?
3. What is the probability that a random sample of size n = 219 will have a mean of less than 76.5?

Solution for #1
Given: $N = 250$, $\mu = 76$, $\sigma = 14$, $n = 35$, $\bar{X} = 79$

Find $\sigma_{\bar{X}}$
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \cdot \sqrt{\frac{N-n}{N-1}}$
$\sigma_{\bar{X}} = \frac{14}{\sqrt{35}} \cdot \sqrt{\frac{250-35}{250-1}}$
$\sigma_{\bar{X}} = 2.20$

Convert $\bar{X} = 79$ to a z-score
$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{79 - 76}{2.20} = 1.36$

Find $P(\bar{X} \geq 79)$
$z = 1.36 \leftrightarrow 0.4131$
$P(\bar{X} \geq 79) = P(z \geq 1.36)$
$= 0.5000 - 0.4131$
$= 0.0869$

Solution for #2
Given: $N = 250$, $\mu = 76$, $\sigma = 14$, $n = 140$, $\bar{X} = 74$ and $77$

Find $\sigma_{\bar{X}}$
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \cdot \sqrt{\frac{N-n}{N-1}}$
$\sigma_{\bar{X}} = \frac{14}{\sqrt{140}} \cdot \sqrt{\frac{250-140}{250-1}}$
$\sigma_{\bar{X}} = 0.79$

Convert $\bar{X} = 74$ to a z-score
$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{74 - 76}{0.79} = -2.53$

Convert $\bar{X} = 77$ to a z-score
$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{77 - 76}{0.79} = 1.27$

Find $P(74 \leq \bar{X} \leq 77)$
$z = -2.53 \leftrightarrow 0.4943$
$z = 1.27 \leftrightarrow 0.3980$
$P(74 \leq \bar{X} \leq 77) = P(-2.53 \leq z \leq 1.27)$
$= 0.4943 + 0.3980$
$= 0.8923$

Solution for #3
Given: $N = 250$, $\mu = 76$, $\sigma = 14$, $n = 219$, $\bar{X} = 76.5$

Find $\sigma_{\bar{X}}$
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \cdot \sqrt{\frac{N-n}{N-1}}$
$\sigma_{\bar{X}} = \frac{14}{\sqrt{219}} \cdot \sqrt{\frac{250-219}{250-1}}$
$\sigma_{\bar{X}} = 0.33$

Convert $\bar{X} = 76.5$ to a z-score
$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{76.5 - 76}{0.33} = 1.52$

Find $P(\bar{X} < 76.5)$
$z = 1.52 \leftrightarrow 0.4357$
$P(\bar{X} < 76.5) = P(z < 1.52)$
$= 0.5000 + 0.4357$
$= 0.9357$

CHAPTER 4
TEST OF HYPOTHESIS

Hypothesis testing is a decision-making process for evaluating claims about a population based on the characteristics of a sample purportedly coming from that population. The decision is whether the characteristic is acceptable or not.

Statistical hypothesis is a statement about the numerical value of a population parameter. It is a statement or tentative assertion which aims to explain facts about a certain phenomenon.

The null hypothesis, denoted by $H_0$, is a statement that there is no difference between a parameter and a specific value, or that there is no difference between two parameters.

The alternative hypothesis, denoted by $H_a$, is a statement that there is a difference between a parameter and a specific value, or that there is a difference between two parameters.

Types of Tests

Directional Test
A test of any statistical hypothesis where the alternative hypothesis is expressed using less than (<) or greater than (>) is called a directional test or one-tailed test, since the critical or rejection region lies entirely in one tail of the sampling distribution.

Nondirectional Test
A test of any statistical hypothesis where the alternative hypothesis is written with a not equal sign (≠) is called a nondirectional test or two-tailed test, since there is no assertion made on the direction of the difference. The rejection region is split into two equal parts, one in each tail of the sampling distribution.
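The boundary of the rejection region (the critical value) for a directional or nondirectional z-test can be computed from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the function name `critical_z` and its `tail` argument are illustrative assumptions, not notation from this handout.

```python
from statistics import NormalDist

def critical_z(alpha: float, tail: str) -> float:
    """Critical z for a test at significance level alpha.

    tail: "right" or "left" for a directional test, "two" for a
    nondirectional test (returns the positive value; the rejection
    region is then z <= -value or z >= value).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    if tail == "two":
        # alpha is split equally between the two tails.
        return std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'right', 'left', or 'two'")

print(round(critical_z(0.05, "right"), 3))  # 1.645
print(round(critical_z(0.05, "left"), 3))   # -1.645
print(round(critical_z(0.01, "two"), 3))    # 2.576
```

Printed tables usually round these to 1.65 and 2.58.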
Types of Error

Type I error occurs when we reject the null hypothesis when it is true. It is also called alpha error ($\alpha$ error).

Type II error occurs when we accept (fail to reject) the null hypothesis when it is false. It is also called beta error ($\beta$ error).

Level of Significance
• It is the probability of committing a Type I error.
• It is denoted by the Greek letter $\alpha$ (alpha).
• The commonly used levels of significance are 0.05 and 0.01.
• The level of significance should be set before testing the hypothesis.

Example:
A 0.01 level of significance means that the researcher is willing to accept a 1% chance of error in making a decision. It also implies that he is 99% confident that he will make the right decision. Likewise, a 0.05 level of significance means that the researcher is willing to accept a 5% chance of error in making a decision. It also implies that he is 95% confident that he will make the right decision.

Steps in Testing the Hypothesis

The z-test of One-Sample Mean
The z-test is used when either of the following conditions is satisfied.
1. The population standard deviation is known or given.
2. The population standard deviation is unknown but the sample size is sufficiently large (i.e., greater than or equal to thirty, $n \geq 30$). In this case, we use the sample standard deviation (s) to replace the population standard deviation ($\sigma$).

Comparing the Sample Mean and the Population Mean in a Large Sample Size

The formula can be written as
$z = \frac{(\bar{X} - \mu)\sqrt{n}}{\sigma}$
where
$\bar{X}$ = mean of the sample
$\mu$ = mean of the population
$n$ = size of the sample
$\sigma$ = standard deviation of the population

Critical Value of Z

Example: A sociologist believes that it costs more than Php 90 000 to raise a child from birth to age one. A random sample of 49 families, each with a one-year-old child, was selected to see if this figure is correct. The average expenses for these families revealed a mean of Php 92 000 with a standard deviation of Php 4 500. Based on these sample data, can it be concluded that the sociologist is correct in his claim? Use the 0.05 level of significance.

Ho: The average cost to raise a child from birth to age one is equal to Php 90 000.
Ha: The average cost to raise a child from birth to age one is more than Php 90 000.

One-tailed or directional test
$\alpha = 0.05$
Critical value of z = 1.65
$z = \frac{(92000 - 90000)\sqrt{49}}{4500} = 3.11$

Since the computed or test value falls within the rejection region, reject the null hypothesis.

There is a significant difference between the sample mean and population mean. Thus, the sociologist is correct in claiming that the cost to raise a child from birth to age one is more than Php 90 000.
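The sociologist example can be reproduced in a few lines. This is a sketch under the handout's large-sample rule, with the sample standard deviation standing in for $\sigma$ because n = 49 is at least 30; the function name `one_sample_z` is made up for illustration.

```python
import math
from statistics import NormalDist

def one_sample_z(sample_mean, hypothesized_mean, std_dev, n):
    """z = (sample mean - hypothesized mean) * sqrt(n) / standard deviation."""
    return (sample_mean - hypothesized_mean) * math.sqrt(n) / std_dev

z = one_sample_z(92000, 90000, 4500, 49)

# Right-tailed test at alpha = 0.05: reject Ho when z exceeds the critical value.
critical = NormalDist().inv_cdf(1 - 0.05)
reject_null = z > critical

print(round(z, 2))         # 3.11
print(round(critical, 3))  # 1.645
print(reject_null)         # True
```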
Comparing the Sample Mean and the Population Mean in a Small Sample Size

The formula can be written as
$t = \frac{(\bar{X} - \mu)\sqrt{n}}{s}$
where
$\bar{X}$ = mean of the sample
$\mu$ = mean of the population
$n$ = size of the sample
$s$ = standard deviation of the sample
$df = n - 1$

Example: The director of a secretarial school believes that its graduates can type more than 75 words per minute. A random sample of 12 graduates has been found to have an average of 77.2 words per minute with a standard deviation of 7.9 words per minute in a typing test. Using the 0.05 level of significance, test the claim of the director.

Ho: The mean age of the employees is 22.8 years.
Ha: The mean age of the employees is not 22.8 years.

Two-tailed or non-directional test
$\alpha = 0.01$
Critical value of z = ±2.58
$z = \frac{(26.2 - 22.8)\sqrt{70}}{4.6} = 6.18$

Since the computed or test value falls within the rejection region, reject the null hypothesis.

There is a significant difference between the sample mean and population mean. Thus, the treasurer of the firm can conclude that the manager is incorrect in his estimate that the mean age of the employees is 22.8 years.

Comparing the Sample Proportion and Population Proportion

The formula can be written as
$z = \frac{p - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}}$
where
$p$ = sample proportion
$p_0$ = population proportion
$n$ = size of the sample
$p = \frac{x}{n}$
$x$ = number of successes

Example: The computer sales department claims that less than 60% of all purchasers of a certain kind of computer program call the manufacturer's hotline within one month of purchase. If 55 out of 100 software purchasers selected at random called the hotline within a month of purchase, test the claim at the 0.05 level of significance.

Ho: The proportion of purchasers who will call the manufacturer's hotline within one month of purchase is 60% or 0.60 ($p = 0.60$).
Ha: The proportion of purchasers who will call the manufacturer's hotline within one month of purchase is less than 60% or 0.60 ($p < 0.60$).

One-tailed or directional test
$\alpha = 0.05$
Critical value of z = −1.65

$p_0 = 0.60$; $p = \frac{55}{100}$; $n = 100$
$z = \frac{\frac{55}{100} - 0.60}{\sqrt{\frac{0.60(1 - 0.60)}{100}}} = -1.02$

Since |−1.02| < |−1.65|, the computed or test value falls outside the rejection region, so do not reject the null hypothesis.

There is no sufficient evidence to conclude that the proportion of purchasers that will call the manufacturer's hotline within one month of purchase is less than 60%. Thus, the claim is not supported.
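The proportion test above, and the t statistic for the typing-speed example in the small-sample section, can be checked with the sketch below. The function names are illustrative. For the typing example, the hypotheses Ho: $\mu = 75$ and Ha: $\mu > 75$ are an assumed reading of the problem, and the critical value 1.796 (one-tailed, $\alpha = 0.05$, df = 11) is taken from a standard t-table rather than from this handout.

```python
import math

def proportion_z(successes, n, p0):
    """z = (p - p0) / sqrt(p0 * (1 - p0) / n), with p = successes / n."""
    p = successes / n
    return (p - p0) / math.sqrt(p0 * (1 - p0) / n)

def one_sample_t(sample_mean, hypothesized_mean, sample_std, n):
    """Return (t, df) with t = (mean - mu) * sqrt(n) / s and df = n - 1."""
    t = (sample_mean - hypothesized_mean) * math.sqrt(n) / sample_std
    return t, n - 1

# Hotline example: left-tailed test, critical z = -1.65.
z = proportion_z(55, 100, 0.60)
print(round(z, 2))  # -1.02
print(z < -1.65)    # False, so do not reject Ho

# Typing-speed example (assumed right-tailed test, table value 1.796).
t, df = one_sample_t(77.2, 75, 7.9, 12)
print(round(t, 2), df)  # 0.96 11
print(t > 1.796)        # False, so do not reject Ho
```

Under these assumptions the typing-speed sample does not provide enough evidence that graduates type more than 75 words per minute.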