Unit-III Sample & Sampling Distribution
Basic Terminology
Element: An element is the entity on which
the data are collected.
Population: A population is the collection of
all the elements of interest.
Sample: A sample is a subset of the
population.
Population frame: A list, map, directory, or other
source used to represent the population.
Census: A census is an investigation of all the
individual elements making up the population—a
total listing rather than a sample.
Why Sample the Population?
1. To contact the whole population would be
time consuming.
2. The cost of studying all the items in a
population may be prohibitive.
3. The physical impossibility of checking all
items in the population.
4. The destructive nature of some tests.
5. The sample results are adequate.
The reason we select a sample is to collect data
to make an inference and answer a research
question about a population.
Steps involved in sampling
Sample design
A sample design is a definite plan for obtaining a
sample from a given population (Kothari, 1998).
It helps to decide the number of items to be
selected in the sample, i.e., the size of the
sample. The purpose of sampling is to estimate an
unknown characteristic of a population. It is all
about selecting a random sample that is truly
representative of the population under study.
Sampling design refers to the technique or procedure the researcher adopts for
selecting items as samples from the population or universe.
Types of sampling
Types of Probability Sampling
Simple Random Sampling
Systematic Sampling
Stratified Random Sampling
Cluster (or Area) Sampling
Multistage sampling
Types of Nonprobability
Sampling
Convenience sampling
Judgment sampling
Snowball sampling
Quota sampling
Probability Sampling
Most Commonly Used Probability
Sampling Methods
Simple Random Sample
Simple Random Sample: A sample selected so that each
item or person in the population has the same chance of
being included.
With Replacement (SRSWR)
Without Replacement (SRSWOR)
EXAMPLE:
A population consists of 845 employees of Ford. A sample of 52
employees is to be selected from that population. The name of
each employee is written on a small slip of paper, and all of the
slips are deposited in a box. After they have been thoroughly mixed,
the first selection is made by drawing a slip out of the box without
looking at it. This process is repeated until the sample of 52
employees is chosen.
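The slip-drawing procedure can be imitated in a few lines of Python. This is only a sketch: the employee IDs 1 to 845 are hypothetical stand-ins for the names on the slips, and the fixed seed is there purely to make the run repeatable.

```python
import random

# Hypothetical employee IDs 1..845 standing in for the slips in the box.
population = list(range(1, 846))

rng = random.Random(42)  # fixed seed only so the sketch is reproducible

# SRSWOR: a drawn slip is not returned, so no employee appears twice.
srswor = rng.sample(population, k=52)

# SRSWR: the slip is returned after each draw, so repeats are possible.
srswr = rng.choices(population, k=52)

print(sorted(srswor)[:5])
```

Sampling without replacement is the usual reading of the box example, since a drawn slip stays out of the box.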
Simple Random Sample: Using
Table of Random Numbers
A population consists of 845 employees of Ford. A sample of 52
employees is to be selected from that population.
A more convenient method of selecting a random sample is to use
the identification number of each employee and a table of
random numbers.
Systematic Random Sampling
Systematic Random Sampling: The items or individuals of the
population are arranged in some order. A random starting point is
selected and then every kth member of the population is selected for
the sample.
EXAMPLE
A population consists of 845 employees of Ford. A sample of 52
employees is to be selected from that population.
First, k is calculated as the population size divided by the sample
size. For the company, 845/52 = 16.25, so we would select every 16th
employee on the list. If k is not a whole number, then round down.
Random sampling is used to select the first name from among the first k names. Then,
select every 16th name on the list thereafter.
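A minimal sketch of the systematic procedure follows. The function name `systematic_sample` and the ID list are hypothetical. Because k is rounded down, some starting points yield one name too many, so the sketch trims the result to n (an assumption, not something the example specifies).

```python
import random

def systematic_sample(items, n, rng):
    """Pick a random start among the first k items, then every k-th item."""
    k = len(items) // n            # sampling interval, rounded down
    start = rng.randrange(k)       # 0-based random start within the first k items
    return items[start::k][:n]     # every k-th item, trimmed to n items

# Hypothetical employee IDs 1..845 in list order.
employees = list(range(1, 846))
chosen = systematic_sample(employees, 52, random.Random(7))
print(chosen[:4])
```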
Stratified Random Sampling
Stratified Random Sampling: A population is first divided into
subgroups, called strata, and a sample is selected from each stratum.
Useful when a population can be clearly divided into groups based on
some characteristic.
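One way to sketch stratified sampling is shown below. The strata names and sizes are made up for illustration, and drawing the same fraction from every stratum (proportional allocation) is an assumption; the definition above does not fix how many items come from each stratum.

```python
import random

def stratified_sample(strata, fraction, rng):
    """Draw a simple random sample of the same fraction from each stratum."""
    sample = {}
    for name, members in strata.items():
        size = max(1, round(len(members) * fraction))  # at least one per stratum
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical strata (not from the source).
strata = {
    "production": [f"P{i}" for i in range(600)],
    "office": [f"O{i}" for i in range(200)],
    "management": [f"M{i}" for i in range(40)],
}
picked = stratified_sample(strata, 0.1, random.Random(1))
sizes = {name: len(members) for name, members in picked.items()}
print(sizes)
```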
Cluster Sampling
Cluster Sampling: A population is divided into clusters using naturally
occurring geographic or other boundaries. Then, clusters are
randomly selected and a sample is collected by randomly selecting
from each cluster.
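The two steps in this definition (randomly pick clusters, then randomly pick members within each chosen cluster) can be sketched as follows. The region names, household labels, and the function `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster, rng):
    """Randomly choose clusters, then randomly sample within each chosen one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical geographic clusters of households.
clusters = {
    region: [f"{region}-{i}" for i in range(20)]
    for region in ("north", "south", "east", "west")
}
picked_clusters = cluster_sample(clusters, n_clusters=2, per_cluster=3,
                                 rng=random.Random(3))
print(picked_clusters)
```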
Quota Sampling
In many large-scale applications of
sampling procedures, it is not always
possible or desirable to list all members of
the population and randomly select
elements from that list. The reasons for
using any alternative procedures include
cost, timeliness, and convenience. One
alternative procedure is quota sampling.
Convenience Sampling
Convenience sampling is used because it is quick,
inexpensive, and convenient. Convenience samples are
useful for certain purposes, and they require very little
planning. Researchers simply use participants who are
available at the moment. The procedure is casual and easy,
relative to random sampling. Contrast using any available
participants with random sampling, where you must
(1) have a well-defined population,
(2) construct a list of members of the population if one is not
available,
(3) sample randomly from the list, and
(4) contact and use as many individuals from the list as possible.
Convenience sampling requires far less effort. However, such
convenience comes with potential problems, which we will
describe. Convenience samples are nonprobability samples.
Therefore, it is not possible to specify the probability of any
population element’s being selected for the sample. Indeed, it is
not possible to specify the population from which the sample was
drawn.
Example: A number of examples of convenience sampling can
be given. In shopping malls or airports, individuals are selected
as they pass a certain location and interviewed concerning
issues, candidates, or other matters. Phone surveys may be
based on anyone answering the phone between the hours of 9
A.M. and 5 P.M. Politicians use convenience sampling to
determine the attitudes of those they represent when they report
on the number of letters voluntarily sent to them by their
constituents.
Sampling Error
The sampling error is the difference
between a sample statistic and its
corresponding population parameter.
Examples:
$\bar{x} - \mu$
$s - \sigma$
$s^2 - \sigma^2$
$\bar{p} - p$
Case Analysis
On April 20, 1999, Eric Harris and Dylan Klebold entered Columbine
High School and began shooting teachers and students. Thirteen
individuals died, and the psychological community was again asked
to explain such violent behavior. A psychologist might decide to
interview Columbine students to obtain their perspectives on the
factors that motivated the two young men to commit such a
horrendous act. A group of ten students has already decided to
meet and discuss the events of that day. The psychologist asks if it
would be all right to attend the meeting and ask them some
questions. The students agree, and the psychologist records their
thoughts. Based on this information, the psychologist concludes that
a primary reason for the violent behavior was the peer dynamics in
the school that created groups of outcasts.
Critical Thinking Questions
Who appears to constitute the population of
interest?
Which type of sampling procedure best
describes that used by the psychologist?
What are the limitations of this sampling method,
and in what specific ways could the sampling
method have affected the findings?
What specific steps would you have taken to
obtain a representative sample?
Sampling Distribution of the
Sample Mean ($\bar{x}$)
The sampling distribution of
the sample mean ($\bar{x}$) is a
probability distribution consisting
of all possible sample means of a
given sample size selected from
a population.
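To make the definition concrete, the sketch below lists every possible sample of size 2 from a small population and tabulates the sample means. The population values are hypothetical and are not the data of any example in these notes.

```python
from collections import Counter
from itertools import combinations
from statistics import mean

# Hypothetical population of five values.
population = [2, 4, 6, 8, 10]
n = 2

# Every possible sample of size n (without replacement) and its mean.
sample_means = [mean(s) for s in combinations(population, n)]

# The sampling distribution: each possible mean with its probability.
counts = Counter(sample_means)
for value in sorted(counts):
    print(value, counts[value] / len(sample_means))

# The mean of all the sample means equals the population mean.
print(mean(sample_means), mean(population))
```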
Sampling Distribution of the Sample
Means - Example
Tartus Industries has seven production employees (considered the
population). The hourly earnings of each employee are given in the
table below.
Properties of sampling distribution
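Expected value of $\bar{x}$ equals the mean of the population
from which the sample is selected.
$E(\bar{x}) = \mu$
where $\mu$ is the population mean.
Standard deviation of $\bar{x}$
For a finite population: $\sigma_{\bar{x}} = \sqrt{\frac{N-n}{N-1}} \cdot \frac{\sigma}{\sqrt{n}}$
For an infinite population: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
where $N$ is the population size, $n$ is the sample size, $\sigma$ is the population standard deviation, and
$\sqrt{\frac{N-n}{N-1}}$
is known as the finite population correction (FPC).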
Use the following expression to compute the
standard deviation of $\bar{x}$:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
whenever
1. The population is infinite; or
2. The population is finite and the sample size is less than or equal to 5% of the
population size; that is, $n/N \le 0.05$.
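The 5% rule can be written as a small helper. This is a sketch: the function name `standard_error_mean` and the numbers in the usage lines are hypothetical, chosen only to show one case on each side of the threshold.

```python
import math

def standard_error_mean(sigma, n, N=None):
    """Standard deviation of the sample mean.

    N=None is treated as an infinite population. The finite population
    correction is applied only when n/N exceeds 0.05.
    """
    se = sigma / math.sqrt(n)
    if N is not None and n / N > 0.05:
        se *= math.sqrt((N - n) / (N - 1))
    return se

# Hypothetical values: sigma = 10, n = 100.
se_small_fraction = standard_error_mean(10, 100, N=10_000)  # n/N = 0.01
se_large_fraction = standard_error_mean(10, 100, N=1_000)   # n/N = 0.10
print(se_small_fraction, se_large_fraction)
```

When n/N is small the correction factor is close to 1, which is why it can be dropped in that case.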
Using the Sampling
Distribution of the Sample Mean (Sigma Known)
$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
Using the Sampling
Distribution of the Sample Mean
(Sigma Unknown)
If the population does not follow the normal
distribution, but the sample is of at least 30
observations, the sample means will approximately follow the
normal distribution (central limit theorem).
To determine the probability a sample mean falls
within a particular region, use:
$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$
where $s$ is the sample standard deviation.
Using the Sampling Distribution of the Sample Mean
(Sigma Known) - Example
The Quality Assurance Department for Cola, Inc., maintains records
regarding the amount of cola in its Jumbo bottle. The actual amount
of cola in each bottle is critical, but varies a small amount from one
bottle to the next. Cola, Inc., does not wish to underfill the bottles.
On the other hand, it cannot overfill each bottle. Its records indicate
that the amount of cola follows the normal probability distribution.
The mean amount per bottle is 31.2 ounces and the population
standard deviation is 0.4 ounces.
At 8 A.M. today the quality technician randomly selected 16 bottles
from the filling line. The mean amount of cola contained in the
bottles is 31.38 ounces.
Is this an unlikely result? Is it likely the process is putting too much
soda in the bottles? To put it another way, is the sampling error of
0.18 ounces unusual?
$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{31.38 - 31.20}{0.4 / \sqrt{16}} = 1.80$
What do we conclude?
It is unlikely, less than a 4 percent chance, we could
select a sample of 16 observations from a normal
population with a mean of 31.2 ounces and a
population standard deviation of 0.4 ounces and
find the sample mean equal to or greater than
31.38 ounces.
We conclude the process is putting too much
cola in the bottles.
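The "less than a 4 percent chance" figure can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist` in place of a z table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 31.2, 0.4      # population mean and standard deviation (ounces)
n, x_bar = 16, 31.38       # sample size and observed sample mean

# z value of the observed sample mean.
z = (x_bar - mu) / (sigma / sqrt(n))

# Probability of a sample mean at least this large if the process is on target.
p_upper = 1 - NormalDist().cdf(z)

print(round(z, 2), round(p_upper, 4))
```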
Numerical Examples:
The mean life of a certain manufactured tool is 41.5 hrs with an S.D. of 2.5
hrs. What is the probability that a simple random sample of size 50
drawn from the population will have a mean life between 40.5 and
42 hrs?
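Solution: Given $\mu = 41.5$ hrs,
$\sigma = 2.5$ hrs, $n = 50$
$P(40.5 \le \bar{x} \le 42) = ?$
Based on this information, the standard error of the sample mean is
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2.5}{\sqrt{50}} = 0.354$
Using the normal distribution, the two limits are converted to z values:
$z_1 = \frac{\bar{x}_1 - \mu}{\sigma / \sqrt{n}} = \frac{40.5 - 41.5}{0.354} \approx -2.83$
$z_2 = \frac{\bar{x}_2 - \mu}{\sigma / \sqrt{n}} = \frac{42 - 41.5}{0.354} \approx 1.41$
So,
$P(-2.83 \le z \le 1.41) = P(z \le 1.41) - P(z \le -2.83)$
$= 0.9207 - 0.0023$
$= 0.9184 \; (\approx 92\%)$
Thus, we can conclude that the probability that the sample mean
falls between the required values is about 0.9184.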
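As a cross-check on the table lookup, the same probability can be computed directly. This sketch treats the sampling distribution of the mean as normal with standard deviation σ/√n; `NormalDist` comes from Python's standard library.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 41.5, 2.5, 50

# Sampling distribution of the sample mean: normal with sd sigma / sqrt(n).
sampling_dist = NormalDist(mu, sigma / sqrt(n))

# P(40.5 <= sample mean <= 42)
prob = sampling_dist.cdf(42) - sampling_dist.cdf(40.5)
print(round(prob, 3))
```

The result is about 0.919. It differs slightly from the table-based 0.9184 because the table lookup rounds the z values to two decimals.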
Sampling Distribution of the
Sample Proportion ($\bar{p}$)
The sampling distribution of
the sample proportion ($\bar{p}$) is a
probability distribution consisting
of all possible values of the
sample proportion.
$\bar{p} = \frac{x}{n}$, where $x$ is the number of elements of interest in the sample
and $n$ is the sample size.
Properties of sampling distribution
Expected value of $\bar{p}$ equals the mean of all the possible
values of $\bar{p}$ and is equal to the population proportion.
$E(\bar{p}) = p$, where $p$ is the population proportion
Standard deviation of $\bar{p}$
Use the following expression to compute the
standard deviation of $\bar{p}$:
$\sigma_{\bar{p}} = \sqrt{\frac{p(1-p)}{n}}$
whenever
1. The population is infinite; or
2. The population is finite and the sample size is less than or equal to 5% of the
population size; that is, $n/N \le 0.05$.
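A short sketch of this formula follows. The function name `standard_error_proportion` and the values p = 0.4, n = 100 are hypothetical. It covers only the case stated above, where the population is infinite or n/N is at most 0.05.

```python
import math

def standard_error_proportion(p, n):
    """Standard deviation of the sample proportion when n/N <= 0.05
    or the population is infinite."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical values: population proportion 0.4, sample size 100.
se_p = standard_error_proportion(0.4, 100)
print(round(se_p, 4))
```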