Sampling Techniques and Size Determination

Uploaded by

sayihmehari74

Chapter three

Sampling techniques and sample size determination

Teresa K.(Bsc., MPH)

Teresa K. 1
Sampling

Sample: a group of people, objects, or items taken from a larger population for measurement.

The sample should be representative of the population, so that the findings from the research sample can be generalized to the population as a whole.

Sampling cont’d...

Sample design: a definite plan for obtaining a sample from a given population. It refers to the technique or procedure the researcher adopts in selecting items for the sample.

Sampling: strategies that enable us to pick a subgroup from a larger group and then use this subgroup as a basis for making inferences about the larger group.

The researcher's goal is always to generalize about the population based on observations of the sample.

Sampling cont’d...

The manner in which a sample is drawn is an important factor in determining how useful the sample will be for making inferences about the population from which it is drawn.
It is quite possible to have a very large sample upon which no sound decision can be based.
This occurs when the respondents in the sample are not really similar to the population about which we want to generalize.

Basic Terms

• Reference population (also called source population or target population): a group of individuals, objects, or items from which samples are taken for measurement. For example, a population of presidents or professors, books or students.

• It refers to the entire group of individuals or objects to which researchers are interested in generalizing the conclusions.

Basic term cont’d….

Target and study population and sample

[Figure: nested diagram of the target population, the study population, and the sample]

Basic Terms cont’d…

Census: obtained by collecting information about each member of a population.

Basic terms cont’d…

Sampling frame: the list of people from which the sample is taken.
It should be comprehensive, complete and up-to-date.
• Examples of sampling frames: electoral register, postcode address file, telephone book, and so on.
Probability sampling: with probability sampling methods, each population element has a known (non-zero) chance of being chosen for the sample.

Basic term cont’d...

Non-probability sampling: with non-probability sampling methods, we do not know the probability that each population element will be chosen, and/or we cannot be sure that each population element has a non-zero chance of being chosen.
Sampling unit: the unit of selection in the sampling process.
Study unit: the unit on which information is collected.

Basic term cont’d…

The sampling unit is not necessarily the same as the study unit.
– If the objective is to determine the availability of latrines, then the study unit would be the household;
– If the objective is to determine the prevalence of trachoma, then the study unit would be the individual.
Sampling fraction: the ratio of the number of units in the sample to the number of units in the reference population (n/N). The sampling interval is its inverse (N/n).

Steps in Sampling Design
There are steps that we need to follow to reach the respondents.
What is the target population? Define the target population and the study population.
What are the parameters of interest? Define the parameters of interest of the study.
What is the sampling frame? Select the sampling frame.
What is the appropriate sampling method? Determine which sampling method to use, depending on the setting of the population and the purpose of the study.

Steps in Sampling Design cont’d..

Plan procedures to select the sampling units
Determine the size of the sample to be selected from the population
Select the actual sampling units
Conduct field work

Sampling errors

A sample is expected to mirror the population from which it comes. However, there is no guarantee that any sample will be precisely representative of the population.
The uncertainty associated with an estimate that is based on data gathered from a sample of the population, rather than the full population, is known as sampling error.
Sampling errors are the random variations in the sample estimates around the true population parameters.

Sampling error cont’d…

No sample is the exact mirror image of the population.
Sampling error (chance) cannot be avoided or totally eliminated.
Sampling error decreases as the sample size increases, and it is smaller for a homogeneous population.
When n = N ⇒ sampling error = 0

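The two claims above (sampling error shrinks as n grows and is zero when n = N) can be illustrated with the standard error of a sample mean under simple random sampling without replacement. This is a minimal sketch that is not part of the original slides: it uses the textbook standard error with the finite population correction, and the function name `standard_error` and the numbers are illustrative assumptions.

```python
import math

def standard_error(sigma: float, n: int, N: int) -> float:
    """Standard error of the sample mean for simple random sampling
    without replacement from a finite population of size N."""
    if not 1 <= n <= N:
        raise ValueError("need 1 <= n <= N")
    if N == 1:
        return 0.0
    fpc = math.sqrt((N - n) / (N - 1))  # finite population correction
    return sigma / math.sqrt(n) * fpc

# Hypothetical population: N = 1000 units with standard deviation 10.
for n in (10, 100, 500, 1000):
    print(n, round(standard_error(10.0, n, 1000), 3))
```

A smaller `sigma` (a more homogeneous population) gives a smaller standard error for the same n, and n = N gives exactly zero.
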
• The question is: why do sample estimates have uncertainty associated with them?

Sampling error cont’d…

There are two reasons for this uncertainty:

Estimates of characteristics from the sample data can differ from those that would be obtained if the entire population were surveyed.

Estimates from one subset or sample of the population can differ from those based on a different sample from the same population (sample-to-sample variation).

The cause of sampling error

Chance: the main cause of sampling error. It is the error that occurs just because of bad luck.

Sampling bias: a tendency to favor the selection of participants that have particular characteristics.

Non Sampling Error (Measurement Error)

It is a type of systematic error in the design or conduct of a sampling procedure that distorts the sample, so that it is no longer representative of the reference population.

We can eliminate or reduce non-sampling error (bias) by careful design of the sampling procedure, not by increasing the sample size.

It can occur whether the total study population or a sample is being used.

Non Sampling Error (Measurement Error) cont’d…

It may either be produced by participants in the study or simply be a by-product of the sampling plans and procedures.
These biased observations can be devastating to the findings of the study.
Example: If you take only male students from a student dormitory in Ethiopia in order to determine the proportion of smokers, you would obtain an overestimate, since females are less likely to smoke.
Increasing the number of male students would not remove the bias.

The Cause of Non-Sampling Error

The interviewer's effect
The respondent's effect
Non-response: the failure to obtain information on some of the subjects included in the sample to be studied. Non-response results in significant bias when the following two conditions are both fulfilled:
– When non-respondents constitute a significant proportion of the sample (about 15% or more)
– When non-respondents differ significantly from respondents.

Sampling Error cont’d …

Thus, the total survey error is the sum of both sampling error and non-sampling error.

Advantage of sampling

We obtain a sample rather than a complete enumeration (a census) of the population for many reasons:
Feasibility: it may be the only feasible method of collecting data
Reduced cost: sampling reduces demands on resources such as finance, personnel, and material
Greater accuracy: sampling may lead to better accuracy of data collection
Greater speed: data can be collected and summarized more quickly

Disadvantage of Sampling

If the sample is biased, not representative, or too small, the conclusions may not be valid and reliable.

If the population is very large and there are many sections and subsections, the sampling procedure becomes very complicated.

If the researcher does not possess the necessary skill and technical know-how in sampling procedures, the outcome will be badly affected.

Characteristics of a good sample design
From what has been stated above, we can list the characteristics of a good sample design:
The sample design must result in a truly representative sample.
The sample design must result in a small sampling error.
The sample design must be viable in the context of the funds available for the research study.
The sample design must allow systematic bias to be controlled.
The sample should be such that the results of the sample study can be applied, in general, to the universe with a reasonable level of confidence.

Types of Sampling

There are many methods of sampling when doing research.
One of the most important decisions that any researcher makes is how to obtain the type of participants needed for the study.
The sample that we draw for our study determines the generalizability of our findings.

Types of Sampling cont’d…

On the basis of representation, the sampling method may be:
1. Probability Sampling Method
2. Non-Probability Sampling Method
Probability sampling is based on the concept of random selection, whereas non-probability sampling is ‘non-random’ sampling.

Types of Sampling Methods

Sampling methods
– Probability samples: simple random, systematic, stratified, cluster, multistage random sampling
– Non-probability samples: purposive, judgmental, convenience

Probability Sampling Method …

Probability sampling strategies typically use a random or chance process in sampling.
The "equal chance" and "independent" components of random sampling are what make us confident that the sample has a reasonable chance of representing the population.

Probability Sampling Method …

What does it mean to be independent? The researchers select each person for the study separately.

Let us say you were asked to participate in an experiment, enjoyed it, and told your friends to contact the researcher to volunteer for the study.

• This would be an example of non-independent sampling.

Probability Sampling Method cont’d …

In probability sampling:
A sampling frame exists or can be compiled.
Generalization is possible (from sample to population).

1. Simple Random Sampling

Simple random sampling (SRS) is the most straightforward of the random sampling strategies.
We use this strategy when we believe that the population is relatively homogeneous for the characteristic of interest.
To use SRS, there should be a frame for the population.

Simple Random Sampling cont’d …

Procedures to select the sample
Depending on the complexity of the population, we can use different tools to select n units from the frame. These are:
– the lottery method,
– a table of random numbers (available in the appendix of many research methods and statistics textbooks), or
– computer-generated random numbers.

Simple Random Sampling cont’d …

The lottery method is appropriate if the total population is not too large. If the population is too large, the lottery method becomes very difficult to use.
In that case, a table of random numbers or computer-generated random numbers is the feasible method.

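The computer-generated option can be sketched as follows. This is a minimal illustration, assuming the sampling frame is available as a list of record numbers; the function name `simple_random_sample`, the frame of 500 records, and the seed are hypothetical choices, not part of the original slides.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame so that every
    subset of size n is equally likely to be selected."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical frame: patient record numbers 1..500.
frame = list(range(1, 501))
sample = simple_random_sample(frame, 20, seed=42)
print(sorted(sample))
```

`random.Random.sample` selects without replacement, so no unit can appear in the sample twice.
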
Example
Assume that the total number of patients who visited Gondar University Hospital in the last six months is N. We want to see the prevalence of TB among those patients who visited the hospital.
If we think that the patients who visited the hospital within the specified time period are homogeneous with respect to the variable of interest, and a list of the patients is available, then we can use simple random sampling to select the sample.

2. Systematic Random Sampling

Individuals are chosen at regular intervals (for example, every kth) from the sampling frame.
Systematic sampling is regarded as random, as long as the periodic interval is determined beforehand and the starting point is random.
It is a method of selecting sample members from a larger population according to a random starting point and a fixed, periodic interval.

Systematic Random Sampling cont’d…

It is frequently chosen by researchers for its simplicity and its periodic quality.
To use systematic sampling as a strategy to select study subjects, the population needs to be homogeneous; however, the method does not require a frame.
Hence, in the absence of a frame, this method will be the best choice.

Steps in systematic sampling:
Define the population
Determine the desired sample size (n)
List the population from 1 to N
Determine k, where k = N/n
Select a random number between 1 and k; let us denote this number by a
Starting at a, take every kth number on the list until the desired sample is obtained
Then the selected list will be a, a+k, a+2k, a+3k, …

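The steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the population is available as a numbered list, and when N/n is not a whole number the interval k is rounded down (a choice the slides do not specify). The function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select n units: a random start a in 1..k, then every kth unit."""
    N = len(frame)
    if not 1 <= n <= N:
        raise ValueError("need 1 <= n <= N")
    k = N // n                      # sampling interval (rounded down: assumption)
    rng = random.Random(seed)
    a = rng.randint(1, k)           # random start between 1 and k, inclusive
    # 1-based positions a, a+k, a+2k, ... converted to 0-based indices
    return [frame[a - 1 + i * k] for i in range(n)]

# Hypothetical population listed from 1 to N = 100, desired sample n = 10.
population = list(range(1, 101))
selected = systematic_sample(population, 10, seed=7)
print(selected)
```
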
Demerits of systematic random sampling
If there is any sort of cyclic pattern in the ordering of the subjects that coincides with the sampling interval, the sample will not be representative of the population.
Examples
• A list of married couples arranged with men's names alternating with women's names: taking every 2nd, 4th, etc. name will result in a sample of all men or all women.
• If we select days at a fixed interval (sampling fraction) on which to count clinic attendance, the selected days may always fall on the same day of the week, which might, for example, be a market day.

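The married-couples example can be reproduced directly. The sketch below builds a hypothetical list of 10 couples in husband, wife order and takes every 2nd name; the labels and the helper `every_kth` are illustrative assumptions.

```python
# Hypothetical frame: 10 couples, each listed husband first, then wife.
frame = [f"{role}_{i}" for i in range(1, 11) for role in ("husband", "wife")]

def every_kth(frame, k, a):
    """Take the units at 1-based positions a, a+k, a+2k, ... from the frame."""
    return frame[a - 1::k]

from_start_1 = every_kth(frame, 2, 1)  # interval 2, random start happened to be 1
from_start_2 = every_kth(frame, 2, 2)  # interval 2, random start happened to be 2
print(from_start_1)
print(from_start_2)
```

Because the interval (2) matches the cycle in the list, the first sample contains only husbands and the second only wives, whichever start is drawn.
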
3. Stratified Random Sampling

Stratified random sampling is used when we have subgroups in our population that are likely to differ substantially in their responses or behavior (i.e. if the population is heterogeneous).

In stratified random sampling, the population is first divided into a number of parts or 'strata' according to some characteristic, chosen to be related to the major variables being studied.

Stratified Random Sampling cont’d…

After stratification, we often use simple random sampling to select a sample from each stratum.

Steps involved in the stratified sampling method:
Define the population
Determine the desired sample size
Identify the variable and subgroups (strata) for which you want to guarantee appropriate representation (either proportional or equal)
Classify all members of the population as a member of one of the identified subgroups
Randomly select (using simple random sampling or another method) an appropriate number of individuals from each subgroup
Then the total sample size will be the sum of the samples from all subgroups.

There are two methods to get the study subjects from each subgroup: proportional allocation or equal allocation.
We use the proportional allocation technique when the subgroups vary dramatically in size in our population.

Stratified Random Sampling cont’d…

The population of size N is divided into Stratum 1 (N1), Stratum 2 (N2), Stratum 3 (N3), …, and samples of size n1, n2, n3, … are drawn from them.
If N1 ≠ N2 ≠ N3 ≠ …, then n1 ≠ n2 ≠ n3 ≠ …: proportional allocation is required.
If N1 = N2 = N3 = …, then n1 = n2 = n3 = …: equal allocation.
Total sample = n1 + n2 + n3 + …

Under proportional allocation, the larger the population in a subgroup, the larger its sample size will be.
However, equal allocation will be used if the total population of each subgroup is approximately equal.

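The two allocation rules can be sketched as below. This is a minimal illustration, not the slides' own procedure: the stratum names and sizes are hypothetical, and simple rounding is assumed, so with other numbers the proportional shares may need a manual adjustment to add up exactly to n.

```python
def proportional_allocation(stratum_sizes, n):
    """Allocate n in proportion to each stratum's population size N_h."""
    N = sum(stratum_sizes.values())
    return {name: round(n * size / N) for name, size in stratum_sizes.items()}

def equal_allocation(stratum_sizes, n):
    """Allocate the same sample size to every stratum."""
    share = n // len(stratum_sizes)
    return {name: share for name in stratum_sizes}

# Hypothetical strata with very different sizes, total sample n = 300.
strata = {"urban": 6000, "rural": 3000, "pastoral": 1000}
print(proportional_allocation(strata, 300))
print(equal_allocation(strata, 300))
```
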
Advantage of stratified sampling over simple random sampling
The representativeness of the sample is improved.
That is, adequate representation of minority subgroups of interest can be ensured by stratification and by varying the sampling fraction between strata as required.

Demerit
A sampling frame has to be prepared separately for each stratum, so that together the frames cover the entire population.

4. Cluster Random Sampling

In this sampling scheme, the required sample is selected as groups of study units (clusters) instead of selecting each study unit individually.
The sampling unit is a cluster, and the sampling frame is a list of these clusters.
If the study covers a wide geographical area, using the other methods will be too costly.
The idea is to divide the total population into different clusters, so that the unit of selection is the cluster.
The total population in the selected clusters is then taken as the sample.

Steps in cluster sampling are:
• Define the population
• Determine the desired sample size
• Identify and define a logical cluster (can be kebele, Got, residence, and so on)
• Make a list of all clusters in the population
• Estimate the average population size per cluster
• Determine the number of clusters needed by dividing the sample size by the estimated size of a cluster
• Randomly select the required number of clusters (using a table of random numbers, as the total number of clusters is manageable)
• Include in the sample the entire population of the selected clusters.

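The steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the clusters are held in a dictionary that maps a cluster name to its members, the number of clusters is rounded up, and the kebele names and sizes are hypothetical. The function name `cluster_sample` is not from the slides.

```python
import math
import random

def cluster_sample(clusters, desired_n, seed=None):
    """Randomly pick whole clusters until the desired sample size is covered."""
    avg_size = sum(len(members) for members in clusters.values()) / len(clusters)
    # Number of clusters = sample size / average cluster size, rounded up.
    n_clusters = min(len(clusters), math.ceil(desired_n / avg_size))
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Everyone in a selected cluster enters the sample.
    sample = [unit for name in chosen for unit in clusters[name]]
    return chosen, sample

# Hypothetical frame: 20 kebeles with 50 households each.
kebeles = {f"kebele_{c}": [f"k{c}_hh{h}" for h in range(50)] for c in range(20)}
chosen, sample = cluster_sample(kebeles, desired_n=200, seed=3)
print(chosen, len(sample))
```
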
5. Multistage Random Sampling

This is the most complex sampling strategy.
The researcher combines simpler sampling methods to address sampling needs in the most effective way possible.
Example
Suppose we want to investigate the working efficiency of nationalized health institutions in India and we want to take a sample of a few health institutions for this purpose.

Example

The first stage is to select large primary sampling units, such as states in a country.
Then we may select certain districts and interview all health institutions in the chosen districts.
This would represent a two-stage sampling design, with the ultimate sampling units being clusters of districts.

Example cont’d …

If, instead of taking a census of all health institutions within the selected districts, we select certain towns and interview all health institutions in the chosen towns, this would represent a three-stage sampling design.
If, instead of taking a census of health institutions within the selected towns, we randomly sample health institutions from each selected town, then it is a case of using a four-stage sampling plan.
If we select randomly at all stages, we will have what is known as a ‘multi-stage random sampling design’.

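The four-stage plan described above can be sketched as nested random draws. This is only an illustration: the nested dictionary of states, districts, towns, and institutions is a made-up frame, and the numbers selected at each stage are arbitrary assumptions rather than values from the slides.

```python
import random

def multistage_sample(frame, n_states, n_districts, n_towns, n_institutions, seed=None):
    """Random selection at every stage: states, districts, towns, institutions."""
    rng = random.Random(seed)
    selected = []
    for state in rng.sample(sorted(frame), n_states):                      # stage 1
        districts = frame[state]
        for district in rng.sample(sorted(districts), n_districts):        # stage 2
            towns = districts[district]
            for town in rng.sample(sorted(towns), n_towns):                # stage 3
                institutions = towns[town]
                selected.extend(rng.sample(institutions, n_institutions))  # stage 4
    return selected

# Hypothetical frame: 4 states x 3 districts x 4 towns x 5 health institutions.
frame = {
    f"state{s}": {
        f"s{s}-district{d}": {
            f"s{s}-d{d}-town{t}": [f"s{s}-d{d}-t{t}-inst{i}" for i in range(5)]
            for t in range(4)
        }
        for d in range(3)
    }
    for s in range(4)
}
sample = multistage_sample(frame, 2, 2, 2, 3, seed=1)
print(len(sample))
```
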
Non-Probability Sampling Method

When constraints prevent the use of probability sampling strategies, the alternative is a non-probability sampling method.
Non-probability sampling strategies are used when it is practically impossible to use probability sampling strategies.
Non-probability sampling is a sampling procedure that does not afford any basis for estimating the probability that each item in the population has of being included in the sample.

1. Purposive Sampling
In purposive sampling, we sample with a purpose in mind.
When the desired population for the study is rare or very difficult to locate and recruit, purposive sampling may be the only option.
For example, suppose you are interested in studying the cognitive processing speed of young adults who have suffered closed head brain injuries in automobile accidents.
This would be a difficult population to find.

2. Convenience Sampling

Convenience sampling selects a particular group of people, but it does not come close to sampling all of a population.
The sample would generalize only to similar programs in similar cities.
Convenience sampling looks just like cluster sampling.
The major difference is that the clusters of research participants are selected by convenience rather than by a random process.
3. Judgment Sampling
The researcher selects the sample based on his/her judgment.

4. Quota sampling

It is a method that ensures that a certain number of sample units from different categories with specific characteristics are represented.
In this method, the investigator interviews as many people in each category of study unit as he can find until he has filled his quota.
It is the non-probability equivalent of stratified sampling. It differs from stratified sampling, where the strata are filled by random sampling.

5. Snowball sampling

It is a special non-probability method used when the desired sample characteristic is rare.
Snowball sampling relies on referrals from initial subjects to generate additional subjects.
In snowball sampling, we first identify someone who meets the criteria and then let him/her bring others he/she knows.
While this technique can dramatically lower search costs, it comes at the expense of introducing bias, because the technique itself reduces the likelihood that the sample will represent a good cross-section of the population.

Sample Size

Determining the sample size for a study is a crucial component of study design.
The goal is to include sufficient numbers of subjects so that statistically significant results can be detected.
Among the questions that a researcher should ask when planning a survey or study is: "How large a sample do I need?"
The answer will depend on the aims, nature, and scope of the study and on the expected result, all of which should be carefully considered at the planning stage.

Sample Size Determination

In general, sample size determination depends on:
The type of data analysis to be performed
The desired precision of the estimates one wishes to achieve
The kind and number of comparisons that will be made
The number of variables that have to be examined simultaneously
The type of study design used.

Sample size determination depending on outcome variables
There are two possible categories of variables.
– The first is where the variable of interest is categorical, with:
• only two alternative responses: yes/no, dead/alive, vaccinated/not vaccinated, and so on, or
• multiple, mutually exclusive alternative responses, such as marital status, religion, blood group, and so on.
• For categorical variables, the data are generally expressed as percentages or rates.
• So we can use percentages to compute the sample size.

Sample Size Determination cont’d…

– The second category covers continuous response variables such as birth weight, age at first marriage, blood pressure, and serum uric acid level, for which numerical measurements are usually made.

– In this case the data are summarized in the form of means and standard deviations.

Sample Size Determination cont’d…

There are several approaches to determining the sample size.
Depending on the type of response variable, whether it is categorical or continuous, we will have two sets of formulas.
The sample size formulas come from the formulas for the maximum error of the estimates and are derived by solving for n.

Sample size for continuous variables
(for single population)

This is the condition in which the research question is about a mean.
• Standard deviation of the population: it is rare that a researcher knows the exact standard deviation of the population.
• Typically, the standard deviation of the population is estimated:
– From the results of a previous survey,
– From a pilot study,
– From secondary data,
– From the judgment of the researcher.

Maximum acceptable difference: the maximum amount of error that you are willing to accept.
Desired confidence level: your level of certainty that the sample mean does not differ from the true population mean by more than the maximum acceptable difference. Commonly we use a 95% confidence level.
Then the sample size determination formula for a single population mean is:

$$n = \frac{z_{\alpha/2}^{2}\,\sigma^{2}}{w^{2}}$$

Sample size for single population mean cont’d…

• Where:
– α = the level of significance, which can be obtained as 1 − confidence level
– σ = standard deviation of the population
– w = maximum acceptable difference
– zα/2 = the value from the standard normal table for the given confidence level

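The formula n = z²σ²/w² can be evaluated with a few lines of code. This is a minimal sketch: the function name `sample_size_mean` and the inputs (σ = 15, w = 2) are illustrative assumptions, the z value is taken from the exact normal quantile rather than the rounded table value 1.96, and the result is rounded up to the next whole subject.

```python
import math
from statistics import NormalDist

def sample_size_mean(sigma, w, confidence=0.95):
    """n = (z_{alpha/2} * sigma / w)^2 for a single population mean."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95% confidence
    return math.ceil((z * sigma / w) ** 2)

# Hypothetical inputs: standard deviation 15, maximum acceptable difference 2.
print(sample_size_mean(sigma=15, w=2))
```

Halving the maximum acceptable difference w roughly quadruples the required sample size, since w enters the formula squared.
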
Sample size for categorical variables
(for single population)

Sample Size for Single Population Proportion
This is the situation in which the variable of interest is categorical.
The best estimate of the population proportion of the variable of interest should be known.

Sample size for categorical variables
(for single population) cont’d…

The possible sources of the proportion are:
– The results of a previous study,
– A pilot study,
– The judgment of the researcher,
– Simply taking 50%.

Then the formula for the sample size for a single population proportion is:

$$n = \frac{z_{\alpha/2}^{2}\; p\,(1 - p)}{w^{2}}$$

Where:
• α = the level of significance, which can be obtained as 1 − confidence level
• p = best estimate of the population proportion
• w = maximum acceptable difference
• zα/2 = the value from the standard normal table for the given confidence level

Example 1

An MPH student wants to conduct research on the prevalence of ANC utilization among mothers in DABAT district. Given that the prevalence found in a previous study was 45.7%, what sample size should he take to address his objective?
Solution:
• Margin of error w = 5%
• A confidence level of 95% gives zα/2 = 1.96.
• Then, using the formula:

$$n = \frac{z_{\alpha/2}^{2}\; p\,(1 - p)}{w^{2}} = \frac{1.96^{2} \times 0.457 \times (1 - 0.457)}{0.05^{2}} = \frac{1.96^{2} \times 0.457 \times 0.543}{0.05^{2}} \approx 382$$

Some Considerations after calculation

• The final sample size would be corrected for:
– Non-response, loss to follow-up, lack of compliance, and so on
– The total size of the population (N): if N < 10,000, then we need a correction formula, defined as:

$$n_f = \frac{n_0}{1 + \dfrac{n_0}{N}}$$

where n_f = final sample size, n_0 = sample size from the above formula, and N = total population
– The design effect, if needed

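The correction formula, followed by an allowance for non-response, can be sketched as below. The population size N = 5,000 and the 10% non-response rate are hypothetical, and dividing by (1 − rate) is only one common convention for the non-response allowance; the slides do not state which adjustment is intended.

```python
import math

def finite_population_correction(n0, N):
    """n_f = n0 / (1 + n0 / N), rounded up to a whole subject."""
    return math.ceil(n0 / (1 + n0 / N))

def adjust_for_nonresponse(n, rate):
    """Inflate n so that about n responses remain after non-response.
    Assumption: dividing by (1 - rate); other conventions exist."""
    return math.ceil(n / (1 - rate))

n0 = 382                                        # from the single-proportion formula
nf = finite_population_correction(n0, N=5000)   # hypothetical population of 5,000
print(nf, adjust_for_nonresponse(nf, rate=0.10))
```
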
Exercise 1

A midwifery graduate student wants to do her thesis work on the title “Assessment of the outcome of pregnancy among women who visited Gondar University Hospital gynecology and obstetrics ward for the year 2013”.
What sample size should she take for this study?

Sample size for two populations
Here the objective of the study is to check whether there is a significant difference between two proportions coming from two different populations.
If the sample size to be taken from each group is assumed to be equal, let us denote:
– p1 = current estimate of population proportion P1 (for the non-exposed or control group)
– q1 = 1 − p1
– p2 = current estimate of population proportion P2 (for the exposed or treated group)
– q2 = 1 − p2
– p3 = estimated average of p1 and p2

Sample size for two populations cont’d…

– q3 = estimated average of q1 and q2
– Zα = the Z value corresponding to the alpha error. This is the two-tailed value.
– Zβ = the Z value corresponding to the beta error. The Z value for beta is always based on a one-tailed test. So if beta is 0.05, 0.1, 0.2, or 0.3, the corresponding Z values are 1.65, 1.28, 0.84, and 0.52, respectively.

Sample size for two populations cont’d…

Then the formula for the sample size from each group is:

$$n = \frac{\left[ Z_{\alpha}\sqrt{2\,p_3 q_3} + Z_{\beta}\sqrt{p_1 q_1 + p_2 q_2} \right]^{2}}{(p_1 - p_2)^{2}}$$

Example 1

An investigator wants to determine if the mortality rate in calves raised by farmers' wives differs from the mortality rate in calves raised by hired managers.
He/she hypothesizes a calf mortality rate of 0.25 for calves raised by farmers' wives and 0.40 for calves raised by hired managers.

The level of significance, alpha, is stated to be 0.01, and the desired power of the test is 0.95. How many calves should be included in the study?

Solution: From the given information, the required sample size can be computed using the following formula:

$$n = \frac{\left[ Z_{\alpha}\sqrt{2\,p_3 q_3} + Z_{\beta}\sqrt{p_1 q_1 + p_2 q_2} \right]^{2}}{(p_1 - p_2)^{2}}$$

n = 344 (from each group)

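The calf example can be checked numerically. This is a minimal sketch under stated assumptions: the function name `sample_size_two_proportions` is hypothetical, Zα is taken as the two-tailed quantile and Zβ as the one-tailed quantile from the exact normal distribution (not rounded table values), and the result is rounded up.

```python
import math
from statistics import NormalDist

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Equal sample size per group for comparing two proportions."""
    q1, q2 = 1 - p1, 1 - p2
    p3 = (p1 + p2) / 2                             # average of p1 and p2
    q3 = 1 - p3                                    # equals the average of q1 and q2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed
    z_beta = NormalDist().inv_cdf(power)           # one-tailed, beta = 1 - power
    numerator = (z_alpha * math.sqrt(2 * p3 * q3)
                 + z_beta * math.sqrt(p1 * q1 + p2 * q2)) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

# Calf mortality example: p1 = 0.25, p2 = 0.40, alpha = 0.01, power = 0.95.
print(sample_size_two_proportions(0.25, 0.40, alpha=0.01, power=0.95))  # 344 per group
```

Using rounded table values such as 2.58 and 1.65 instead of the exact quantiles gives a result one or two subjects larger.
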
Unequal sample sizes for the difference of proportions

When unequal sample sizes are required from the two populations, let us denote:
– p1 = current estimate of population proportion P1 (for the non-exposed or control group)
– q1 = 1 − p1
– p2 = current estimate of population proportion P2 (for the exposed or treated group)
– q2 = 1 − p2
– p3 = estimated average of p1 and p2

Unequal Sample sizes cont’d…

– q3 = estimated average of q1 and q2
– Zα = the Z value corresponding to the alpha error. This is the two-tailed value.
– Zβ = the Z value corresponding to the beta error. The Z value for beta is always based on a one-tailed test. So if beta is 0.05, 0.1, 0.2, or 0.3, the corresponding Z values are 1.65, 1.28, 0.84, and 0.52, respectively.
– n1 = required sample size from N1
– r = the value by which n1 is to be multiplied to give n2, that is, n2 = r n1

Unequal Sample sizes cont’d…

Then the formula for the sample size of the first group, n1 (with n2 = r n1), is:

$$n_1 = \frac{\left[ Z_{\alpha}\sqrt{(r + 1)\,p_3 q_3} + Z_{\beta}\sqrt{r\,p_1 q_1 + p_2 q_2} \right]^{2}}{r\,(p_1 - p_2)^{2}}$$

Design effects

The design effect is the loss of effectiveness caused by using cluster/multistage sampling instead of simple random sampling.
It is the ratio of the actual variance under the sampling method actually used (cluster/multistage) to the variance computed under the assumption of simple random sampling.
In other words, it is the factor by which the sampling variance under the actual sampling plan exceeds that of a simple random sample of the same size: how much worse your sample is than a simple random sample.

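The ratio definition above, and one common way of using it, can be sketched as follows. The variances and the starting sample size are hypothetical numbers, and multiplying the simple-random-sampling sample size by the design effect is a widely used convention that the slides mention only as "take the design effect into account".

```python
import math

def design_effect(var_actual, var_srs):
    """Ratio of the variance under the actual design to the variance under SRS."""
    return var_actual / var_srs

def inflate_for_design(n_srs, deff):
    """Sample size needed under the complex design (assumed rule: n_srs * deff)."""
    return math.ceil(n_srs * deff)

# Hypothetical variances of an estimated proportion under each design.
deff = design_effect(var_actual=0.0012, var_srs=0.0006)
print(deff, inflate_for_design(382, deff))
```
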
