
Chapter 3 Sample Size Calculation and Sampling

This document discusses sample size determination and sampling methods. It begins by explaining that sample size depends on factors like how varied the population is, the desired level of confidence, and margin of error. It then discusses how to calculate sample sizes using statistical formulas that consider the desired level of precision, variability in the population, and required level of confidence. Finally, it covers different sampling methods, distinguishing between probability sampling methods that allow calculating the chance of selection, and non-probability sampling methods. The key methods discussed are simple random sampling, systematic sampling, stratified sampling, cluster sampling, and multi-stage sampling.

Uploaded by

estela abera
Copyright
© All Rights Reserved

Sample size determination and

Sampling Methods
By
Gizachew A (MPHE)

6/8/2021
Sample size determination
• How many completed questionnaires do we need to
have a representative sample?
• Generally the larger the better, but that takes more time
and money.
• Answer depends on:
– How different or dispersed the population is.
– Desired level of confidence.
– Desired degree of accuracy.
– Desired margin of error
Cont’d
In general sample size determination depends on
– Objectives of the study
– Design of the study
– Plan for statistical analysis
– Degree of precision
– Degree of confidence
– Accuracy of the measurements

Sample size …
• Which variables should be included in sample size
calculation?
It should relate to the study’s primary outcome
variable.

If the study has secondary outcome variables that are
considered important, the sample size should also
be sufficient for the analysis of these variables.

• In planning any investigation we must decide how many
people need to be studied in order to answer the study
objectives.
• If the study is too small we may fail to detect important
effects, or may estimate effects too imprecisely.
• If the study is too large then we will waste resources.
• The eventual sample size is usually a compromise between
what is desirable and what is feasible.
• Therefore, feasible sample size is determined by the
availability of resources.
Sample Size Cont…
The confidence interval approach
• To determine sample sizes using statistical formulae,
researchers use the confidence interval approach based
on the following factors:
– Desired level of data precision or accuracy
– Amount of variability in the population (homogeneity)
– Level of confidence required in the estimates of
population values

Sample Size Cont…
Facts for calculating required sample size:
• a) The reasonable estimate of the key proportion to be
studied. If unable to guess, take it as 50%.
• b) The degree of accuracy required. The allowed
deviation from the true proportion in the population as a
whole. It can be within 1% or 5%, etc.
• c) The confidence level required, usually 95%.
• d) The size of the population that the sample is to
represent.
• If it is more than 10,000 the precise magnitude is not
likely to be very important; but if the population is less
than 10,000 then a smaller sample size may be required.
Sample size calculation for a single population proportion

• Estimate how big the proportion might be (P)


• Choose the margin of error you will allow in the
estimate of the proportion (say ± w)
• Choose the level of confidence that the
proportion in the whole population is indeed
between (p-w) and (p+w).
• We can never be 100% sure.
• Do you want to be 95% sure?

Cont…
• The minimum sample size required for a very large
population (N ≥ 10,000) is:
$n = \dfrac{Z^2 \, p(1-p)}{d^2}$
• How the above formula is obtained:
• A 95% C.I. for P is p ± 1.96 se. If we want the
confidence interval to have a maximum half-width of ± d
(the margin of error, also written ± w), then
• $1.96 \, se = d$
• $1.96 \sqrt{p(1-p)/n} = d$
• $1.96^2 \, p(1-p)/n = d^2$, hence $n = 1.96^2 \, p(1-p)/d^2$

Example 1
a) p = 0.26, d = 0.03, Z = 1.96 (i.e., for a 95% C.I.)
• $n = (1.96)^2 (0.26 \times 0.74) / (0.03)^2 = 821.25 \approx 822$
• Thus, the study should include at least 822 subjects.
b) If the above sample is to be taken from a relatively small
population (say N = 3000), the required minimum
sample is obtained from the above estimate by
making an adjustment.
• $n_f = n / (1 + n/N)$
• $n_f = 821.25 / (1 + 821.25/3000) = 644.7 \approx 645$ subjects
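The arithmetic in Example 1 can be checked with a short Python sketch. The function names below are illustrative and not part of the slides; the inputs are the values from the example.

```python
import math

def sample_size_proportion(p, d, z=1.96):
    """Minimum n for a single proportion in a very large population."""
    return z**2 * p * (1 - p) / d**2

def finite_population_correction(n, N):
    """Adjust n for a small population of size N: nf = n / (1 + n/N)."""
    return n / (1 + n / N)

# Values from Example 1: p = 0.26, d = 0.03, 95% confidence, N = 3000
n = sample_size_proportion(p=0.26, d=0.03)
nf = finite_population_correction(n, N=3000)

# Round up, because a fraction of a subject cannot be sampled
print(math.ceil(n))   # 822
print(math.ceil(nf))  # 645
```

Rounding up rather than to the nearest integer keeps the achieved margin of error at or below the chosen d.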

Point to be considered

Sampling
Sampling defined;
• The process of selecting a portion of the population to
represent the entire population.
• A main concern in sampling:
– Ensure that the sample represents the population, and
– The findings can be generalized.
• The items so selected constitute what is technically called
a sample, their selection process or technique is called the
sample design, and the survey conducted on the basis of the
sample is described as a sample survey.
• Inferences/ conclusions about the population are
based on the information from the sample drawn from
that population.
• Therefore, the accuracy of the conclusions depends on
how well the sample has been selected and on how
representative it is.
• Sample should be truly representative of population
characteristics without any bias so that it may result in
valid and reliable conclusions.
• Due to the variability in the characteristics of the
population, scientific sample designs should be applied
to select a representative sample.
• A representative sample: has all the important
characteristics of the population from which it is drawn
• If not, there is a high risk of distorting the view of the
population.
• For example, we may have a single sample composed of
50 cases, representing a population of 1000 individuals.

• It enables us to estimate the characteristic of a
population by directly observing a portion of the
population.
• Researchers are not interested in the sample itself,
but in what can be learned from the sample—and
how this information can be applied to the entire
population.

[Figure: diagram relating Population, Sample, and Information]

Advantages of sampling:

• Feasibility: Sampling may be the only feasible method of
collecting information.
• Reduced cost: Sampling reduces demands on resources such as
finance, personnel, and material.
• Greater accuracy: Sampling may lead to better accuracy
(quality) of data collection.
• Greater speed: Data can be collected and summarized more
quickly (time saving).
Disadvantages of sampling:
• There is always a sampling error.
• Sampling may create a feeling of discrimination within the
population.

• While selecting a SAMPLE, there are basic
questions:
– What is the group of people (STUDY
POPULATION) from which we want to draw a
sample?
– How many people do we need in our sample?
– How will these people be selected?

• Reference population (or target population): the
population of interest to whom the researchers would like
to make generalizations.
• Source population: the subset of the target population
from which a sample will be drawn.
• Study population: the actual group in which the study is
conducted = Sample
• Study unit: the units on which information will be
collected: persons, housing units, etc.

Target population:
The conclusion may or may not be generalizable
due to refusals, selection biases, etc.
(Generalizability)
Source population:
If sampling is representative,
then the conclusion applies to
the sampled population

Sample:
The conclusion is drawn
from the sample

Researchers are interested in factors associated with ART
use among HIV/AIDS patients attending certain hospitals in a given
Region.

Target population = All ART patients in the Region

Source population = All ART patients in, e.g., 3 hospitals in the Region

Sample

Sampling Methods
Two broad divisions:

A. Probability sampling methods

B. Non-probability sampling methods

A. Probability sampling
• Involves random selection of a sample
• A sample is obtained in a way that ensures every
member of the population has a known, non-zero
probability of being included in the sample.
• Involves the selection of a sample from a
population, based on chance.
• Probability sampling is:
– more complex,
– more time-consuming and
– usually more costly than non-probability
sampling.
• However, because study samples are randomly selected
and their probability of inclusion can be
calculated:
– reliable estimates can be produced and
– inferences can be made about the population.

• There are several different ways in which a
probability sample can be selected.
• The method chosen depends on a number of
factors, such as
– the available sampling frame,
– how spread out the population is,
– how costly it is to survey members of the population

Most common probability
sampling methods
1. Simple random sampling
2. Systematic random sampling
3. Stratified random sampling
4. Cluster sampling
5. Multi-stage sampling

1. Simple random sampling
• Involves random selection
• Each member of a population has an equal chance of
being included in the sample.
• To use a SRS method:
– Make a numbered list of all the units in the population
– Each unit should be numbered from 1 to N (where N is the
size of the population)
– Select the required number.
• The randomness of the sample is ensured by the use of
“lottery” methods, a table of random numbers, or
computer programs.
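As a minimal sketch of the "computer programs" option, the SRS steps above can be written with Python's standard `random` module. The function name and the sizes used (N = 1000, n = 50) are illustrative assumptions, not taken from a specific study.

```python
import random

def simple_random_sample(N, n, seed=None):
    """Draw n distinct unit numbers from a frame numbered 1..N."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    # rng.sample selects without replacement, so each unit has the
    # same chance of inclusion and no unit is picked twice
    return sorted(rng.sample(range(1, N + 1), n))

# Hypothetical frame of 1000 numbered units, sample of 50
srs = simple_random_sample(N=1000, n=50, seed=1)
print(len(srs))
```

The returned numbers are positions in the numbered list (the sampling frame), so the list must exist before the draw.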
• It is considered the gold-standard sampling method
• SRS has certain limitations:
– Requires a sampling frame.
– Difficult if the reference population is dispersed.
– Minority subgroups of interest may not be selected.

Simple Random Cont…

2. Systematic random sampling
• Sometimes called interval sampling, systematic
sampling means that there is a gap, or interval,
between each selected unit in the sample
• The selection is systematic rather than random
• Important if the reference population is arranged
in some order:
– Order of registration of patients
– Numerical order of house numbers
– Students’ registration books
Steps in systematic random sampling
1. Number the units on your frame from 1 to N (where N is
the total population size).
2. Determine the sampling interval (K) by dividing the
number of units in the population by the desired sample
size.
3. Select a number between one and K at random. This
number is called the random start and would be the first
number included in your sample.
4. Select every Kth unit after that first number
Note: Systematic sampling should not be used when a
cyclic repetition is inherent in the sampling frame.
Example
• To select a sample of 100 from a population of 400,
you would need a sampling interval of 400 ÷ 100 = 4.
• Therefore, K = 4.
• You will need to select one unit out of every four
units to end up with a total of 100 units in your
sample.
• Select a number between 1 and 4 from a table of
random numbers.

• Using the above example, you can see that with a systematic
sample approach there are only four possible samples that
can be selected, corresponding to the four possible random
starts:
A. 1, 5, 9, 13, ..., 393, 397
B. 2, 6, 10, 14, ..., 394, 398
C. 3, 7, 11, 15, ..., 395, 399
D. 4, 8, 12, 16, ..., 396, 400
– Each member of the population belongs to only one of the
four samples and each sample has the same chance of being
selected.
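The systematic sampling steps and the example (N = 400, n = 100, K = 4) can be sketched in Python as follows. The function name is illustrative, and the sketch assumes N is an exact multiple of n, as in the example.

```python
import random

def systematic_sample(N, n, seed=None):
    """Select every Kth unit from a frame numbered 1..N."""
    k = N // n                   # sampling interval K
    rng = random.Random(seed)
    start = rng.randint(1, k)    # random start between 1 and K inclusive
    return [start + i * k for i in range(n)]

sys_sample = systematic_sample(N=400, n=100, seed=3)
print(sys_sample[:4], "...", sys_sample[-2:])
```

Whatever the random start, the result is one of the four possible samples A to D listed above.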
3. Stratified random sampling

• It is done when the population is known to be
heterogeneous with regard to some factors, and those factors
are used for stratification.
• Using stratified sampling, the population is divided into
homogeneous, mutually exclusive groups called strata.
• A population can be stratified by any variable that is
available for all units prior to sampling (e.g., age, sex,
province of residence, income, etc.).
• A separate sample is taken independently from each stratum.
• Any of the sampling methods mentioned in this section (and
others that exist) can be used to sample within each stratum.
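A minimal Python sketch of taking a separate sample from each stratum is shown below, using simple random sampling within strata. The strata, unit labels, and per-stratum sample sizes are hypothetical and chosen only for illustration.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Draw an independent simple random sample from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(units, sizes[name])
            for name, units in strata.items()}

# Hypothetical frame stratified by sex (60 females, 40 males)
strata = {
    "female": [f"F{i}" for i in range(1, 61)],
    "male": [f"M{i}" for i in range(1, 41)],
}
strat_sample = stratified_sample(strata, sizes={"female": 6, "male": 4}, seed=5)
print(strat_sample)
```

Because every stratum contributes units, a small subgroup cannot be missed by chance, which is the main practical difference from SRS on the whole population.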
Why do we need to create strata?
• Can make the sampling strategy more efficient.
• A larger sample is required to get a more accurate
estimation if a characteristic varies greatly from one
unit to the other.
• For example, if every person in a population had the
same salary, then a sample of one individual would be
enough to get a precise estimate of the average salary.

• This is the idea behind the efficiency gain
obtained with stratification.
– If you create strata within which units share
similar characteristics (e.g., income) and are
considerably different from units in other strata,
then you would only need a small sample from
each stratum to get a precise estimate of total
income for that stratum.
– Then you could combine these estimates to get
a precise estimate of total income for the whole
population.
• If you use a SRS approach in the whole population
without stratification, the sample would need to be
larger than the total of all stratum samples to get an
estimate of total income with the same level of
precision.
• Stratified sampling ensures an adequate sample size
for sub-groups in the population of interest.
• When a population is stratified, each stratum
becomes an independent population and you will
need to decide the sample size for each stratum.
Stratified Random sampling
• Types of stratified sampling:
– Proportional and
– Non-proportional.
• Proportional stratified sampling:
– Proportional representation is given to each
subgroup or stratum.
– A stratum with a large number of items gets a
larger sample, and vice versa.
– Each stratum is represented according to its size.
– Percentage (sampling fraction) allocated to each stratum = n/N

• Equal allocation:
– Allocate equal sample size to each stratum
• Proportionate allocation:
$n_j = \dfrac{n}{N} N_j, \quad j = 1, 2, \ldots, k$
where k is the number of strata and
– $n_j$ is the sample size of the jth stratum
– $N_j$ is the population size of the jth stratum
– $n = n_1 + n_2 + \cdots + n_k$ is the total sample size
– $N = N_1 + N_2 + \cdots + N_k$ is the total population size

Example: Proportionate Allocation
Village    A     B     C     D     Total
HHs        100   150   120   130   500
S. size    ?     ?     ?     ?     100
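One way to work the example is to apply the proportionate allocation rule, sample size per stratum = (n/N) × stratum size, to each village. The sketch below does this in Python; the function name is illustrative.

```python
def proportional_allocation(stratum_sizes, n):
    """Allocate total sample size n to strata in proportion to their size."""
    N = sum(stratum_sizes.values())
    return {name: n * Nj / N for name, Nj in stratum_sizes.items()}

# Households per village, from the example table
households = {"A": 100, "B": 150, "C": 120, "D": 130}
allocation = proportional_allocation(households, n=100)
print(allocation)  # {'A': 20.0, 'B': 30.0, 'C': 24.0, 'D': 26.0}
```

Here the allocations come out as whole numbers that sum to 100. In general they need rounding, and the rounded values should be checked against the intended total.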

4. Cluster sampling
• Sometimes it is too expensive to spread a
sample across the population as a whole.
• Travel costs can be high.
• To reduce costs, researchers may choose a
cluster sampling technique
• The clusters should be similar to one another
(homogeneous between clusters), unlike stratified
sampling, where the strata differ from one another
(heterogeneous between strata)

Steps in cluster sampling
• Divide the population into groups or clusters.
• A number of clusters are selected randomly to
represent the total population, and then all units
within selected clusters are included in the sample.
• No units from non-selected clusters are included in
the sample.
• Differs from stratified sampling, where some units
are selected from each group.
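The cluster sampling steps above can be sketched in Python as follows. The cluster names and unit labels are hypothetical, and the function name is illustrative.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and include every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    # All units of the chosen clusters enter the sample;
    # units from non-selected clusters are never included
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical population grouped into four clusters
sections = {
    "Section A": ["A1", "A2", "A3"],
    "Section B": ["B1", "B2"],
    "Section C": ["C1", "C2", "C3", "C4"],
    "Section D": ["D1", "D2", "D3"],
}
chosen, units = cluster_sample(sections, n_clusters=2, seed=7)
print(chosen)
print(units)
```

Only a list of clusters is needed for the random draw, which is why the method works without a frame of individual units.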

Cluster Cont…
Advantages
• Simple, as a complete list of sampling units within the
population is not required
• Less travel/resources required
Disadvantages
• A potential problem is that members of the same cluster
are more likely to be alike than members of different
clusters (within-cluster homogeneity).
• This “dependence” needs to be taken into account
in the sample size and in the analysis (“design effect”)
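A common way to account for the design effect in the sample size is to multiply the simple-random-sampling size by it. The sketch below uses the widely cited approximation DEFF = 1 + (m − 1) × ICC, where m is the average cluster size and ICC is the intra-cluster correlation. The slides do not give this formula, and the values m = 20 and ICC = 0.05 are assumptions chosen only for illustration.

```python
import math

def design_effect(cluster_size, icc):
    """Approximate design effect: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc

def cluster_adjusted_sample_size(n_srs, cluster_size, icc):
    """Inflate an SRS sample size by the design effect, rounding up."""
    return math.ceil(n_srs * design_effect(cluster_size, icc))

# 822 is the SRS sample size from Example 1; m and ICC are assumed values
deff = design_effect(cluster_size=20, icc=0.05)
n_cluster = cluster_adjusted_sample_size(822, cluster_size=20, icc=0.05)
print(round(deff, 2), n_cluster)
```

In practice the design effect is usually taken from earlier surveys of the same outcome, or a conventional value is assumed when no estimate exists.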

Example
• In a school-based study, we assume students of the
same school are homogeneous.
• We can randomly select sections and include all
students of the selected sections only
• Cost reduction is a reason for using cluster sampling.
• It creates 'pockets' of sampled units instead of
spreading the sample over the whole territory.
• Another reason is that sometimes a list of all units in
the population is not available, while a list of all
clusters is either available or easy to create.
Cluster Cont…
Example: Cluster sampling

Cluster Cont…
Difference Between Cluster & Stratified Sampling

5. Multi-stage sampling
• Similar to cluster sampling, except that it
involves picking a sample from within each chosen
cluster, rather than including all units in the cluster.
• This type of sampling requires at least two stages.
• In the first stage, large groups or clusters are
identified and selected. These clusters contain more
population units than are needed for the final sample.

• In the second stage, population units are picked from
within the selected clusters (using any of the possible
probability sampling methods) for a final sample.
• The primary sampling unit (PSU) is the sampling
unit in the first sampling stage.
• The secondary sampling unit (SSU) is the sampling
unit in the second sampling stage, etc.
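A two-stage version of this procedure can be sketched in Python as below. The frame, the cluster names, and the sample sizes are hypothetical, and simple random sampling is assumed at both stages.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick clusters (PSUs). Stage 2: pick units (SSUs) within each."""
    rng = random.Random(seed)
    psus = rng.sample(sorted(clusters), n_clusters)
    # Unlike cluster sampling, only a sample of units is taken per cluster
    return {psu: rng.sample(clusters[psu], n_per_cluster) for psu in psus}

# Hypothetical frame: 5 clusters, each listing 40 household numbers
frame = {f"Cluster {c}": list(range(1, 41)) for c in range(1, 6)}
stage_sample = two_stage_sample(frame, n_clusters=2, n_per_cluster=10, seed=11)
print(stage_sample)
```

A list of units is needed only for the clusters selected in the first stage, not for the whole population.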

[Figure: sampling hierarchy] Zone (PSU) → Woreda (SSU) → Kebele (TSU) → Sub-Kebele → HH

Advantages
• No need to have a list of all of the units in the
population.
• All you need is a list of clusters and list of the units in
the selected clusters.
• Multi-stage sampling saves a great amount of time and
effort by not having to create a list of all the units in a
population.

Multi-stage Sampling Cont…
Selecting a sampling method
• Population to be studied
– Size/geographical distribution
– Heterogeneity with respect to variable
• Availability of list of sampling units
• Level of precision required
• Resources available

B. Non-probability sampling
• Every item has an unknown chance of being selected.
• There is an assumption that there is an even
distribution of a characteristic of interest within the
population.
• Elements are chosen arbitrarily, & there is no way to
estimate the probability of any one element being
included in the sample.

• Also, no assurance is given that each item has a
chance of being included, making it impossible either
to estimate sampling variability or to identify possible
bias
• Difficult to ensure reliability
• No way to measure the precision of the findings
• Researchers are reluctant to use these methods

Non- Probability sampling
• Focuses on
– volunteers,
– easily available units, or
– those who just happen to be present when the research is
done.
• Non- Probability sampling is useful for
– quick and cheap studies,
– case studies,
– qualitative research,
– pilot studies, and
– developing hypotheses for further research.
The most common types of non-probability sampling

1. Convenience sampling
2. Quota sampling
3. Purposive/judgmental sampling
4. Snowball sampling
5. Volunteer sampling

Types of non-probability methods
• Convenience sampling:
– Also called accidental, haphazard, or man-in-the-street
sampling.
– The researcher selects units that are convenient, close at
hand, and easy to reach.
– It can deliver accurate results when the population is
homogeneous.
• Purposive sampling:
– The researchers select the units with some purpose in
mind.
– The researcher specifies the characteristics of the
population of interest and then locates individuals who
match those characteristics.
Types of non-probability methods
• Quota sampling:
– The researcher sets quotas for different types of units.
– Uses convenience sampling to fill those quotas.
• Snowball sampling:
– Used in studies involving respondents who are hard to
find.
– To start with, the researcher compiles a short list of
sample units from various sources.
– Each of these respondents is contacted to provide
names of other probable respondents.
Snowball Sampling

Non-probability Cont…
Factors to Consider in Sample Design

Quiz
1. A researcher is interested in assessing Knowledge, Attitude,
and Practices (KAP) about the mode of transmission and
prevention of COVID-19 among Wolaita Sodo University
health science GC students. Assume that he wants to
generalize his result to the whole GC student population.
a. Which study design is appropriate to answer this
research question? Why?
b. Who are the source population, study population, and study
unit in this scenario?
c. What is the appropriate sampling technique? Why?