CHAPTER 6
SAMPLE DESIGN
Some fundamental definitions
Population: Is the theoretically specified aggregation of
survey elements from which the survey sample is actually
selected.
Sampling Frame: Is the list of elements from which the
sample is drawn
Sample: A subset or some part of a larger population
Sample design: Is a definite plan for obtaining a sample
from the sampling frame.
Sampling: Is the process of using a small number or part of
a larger population to draw conclusions about the whole
population.
Element: Is the unit from which information is collected and
which provides the basis of analysis.
Statistic: Is a characteristic of a sample, e.g. the mean of
the sample.
Parameter: Is a characteristic of a population, e.g. the mean
of the population.
Sampling
Design
Census
all people or elements in a group of interest
(a population) included in a study
Limitation of census
very time consuming
expensive
minor research would not justify a census
Advantages of census
Reliability
Detailed information
Sampling theory
Is the study of the relationship existing
between a population and sample drawn from
the population.
It is concerned with estimating the property of
the population from those of the samples and
also with gauging the precision of the
estimate.
Sampling theory Objectives
Sampling theory is designed to attain one or more of the
following objectives:
Statistical estimation: Sampling theory helps in
estimating unknown population parameters
from knowledge of statistical measurement on
sample studies.
Testing of hypothesis: It enables us to decide
whether to accept or to reject the stated
hypothesis.
Statistical inference: Sampling theory helps in
making generalization about the population from
the studies based on samples drawn from it.
Advantages of Sampling
Complete enumeration of all sample units in the
entire universe is often unnecessary to obtain
reasonably accurate results.
An examination of the entire population is often too
costly, too time-consuming, and impractical (if not
impossible).
In the case of destructive testing, the sample
elements or units must be destroyed or must be
consumed to obtain necessary measurements.
Sampling technique is used under the
following conditions
Vast data: When the number of units is very
large, sampling technique must be used.
Infinite population
When census is impossible
When homogeneity is high
CHARACTERISTICS OF A GOOD SAMPLE
DESIGN
(a) must result in a truly representative sample.
(b) must result in a small sampling error.
(c) must be viable in the context of the funds available for
the research study.
(d) must allow systematic bias to be controlled.
(e) should be such that the results of the sample study can
be applied to the universe with a reasonable level of
confidence.
Steps involved in sample planning
Defining population
Census Vs Sample
Sampling Design
Sample Size
Execute Sampling Process
Defining the population
This implies specifying the subject of the study that
involves identifying which elements (items) are included,
as well as where and when.
If the research problem is not properly defined then
defining population will be difficult.
E.g. What is your population of interest?
To whom do you want to generalize your results?
All doctors
School children
Taxi drivers
Women aged 25-45 years
Other
A sample:
is a subset of people selected from a
population - an example is an opinion poll
we may want our sample to be representative of the
population, i.e. any value we calculate from the sample
(statistic) is correct as an estimate of the population
value (parameter)
Steps in sampling design:
specify population
select method of sampling
   Nonprobability sampling: convenience sample; quota
   sample; judgement sample
   Probability sampling: simple random sample; stratified
   sample; cluster sample; multi-stage sample;
   proportionate/disproportionate sample
determine sample size
method of collecting data from sample: observation;
interviewing; questionnaire
timing of data collection
data analysis
3 factors that influence sample representativeness
Sampling procedure
Sample size
Participation (response)
When might you sample the entire population?
When your population is very small
When you have extensive resources
When you don’t expect a very high response
SAMPLING BREAKDOWN
[Figure: sampling breakdown showing the TARGET POPULATION,
the STUDY POPULATION and the SAMPLE]
Types of Samples
Probability (Random) Samples
Simple random sample
Systematic random sample
Stratified random sample
Multistage sample
Multiphase sample
Cluster sample
Non-Probability Samples
Convenience sample
Purposive sample
Quota
Probability and Non probability Sampling
Probability Sampling
The distinguishing characteristic of probability sampling
is that one can specify, for each element of the
population, its probability of being included in the
sample - so we can estimate population parameters from
the sample statistics
Non-probability sampling
There is no way of specifying the probability of each
element’s inclusion in the sample - there is no assurance
that every element has some chance of being included
often used when a population cannot be precisely
defined and when a population list is unavailable
Probability Sampling
systematic procedures, not judgment, to select sample
ensures every element from the population has a
known probability of being selected for the sample
can evaluate the amount of error in data caused by
using a sample rather than census, i.e. a statistical
relationship exists between sample estimates and the
population
not affected by researcher likes and dislikes
more expensive and more difficult to obtain data
Types of Probability Sampling
1. Simple Random Sample:
Items selected from population such that each has an
equal and known probability of selection
can allocate a number to each item and sample can
be drawn using random number tables or lottery
method.
Applicable when population is small, homogeneous
& readily available.
e.g. a bank randomly selects from a list of 800 customers a
sample of 200; each list member has a 200/800 (25%)
chance of being included in the sample
- known probability is n/N, where
- n is the sample size and
- N is the population size
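The bank example above can be sketched in Python. This is a minimal illustration, assuming a hypothetical frame of 800 numbered customers; the function name `simple_random_sample` and the customer labels are made up for the example and do not come from the slides.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements so that each has inclusion probability n/N."""
    rng = random.Random(seed)      # seeded only to make the sketch repeatable
    return rng.sample(frame, n)    # sampling without replacement


# Hypothetical sampling frame: a list of N = 800 customers.
customers = [f"customer_{i}" for i in range(1, 801)]

sample = simple_random_sample(customers, 200, seed=42)
inclusion_probability = len(sample) / len(customers)   # n / N

print(len(sample), inclusion_probability)
```

Allocating a number to each item and drawing with a random number generator plays the same role as the random number table or lottery method.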
2. Systematic Random Sample
• relies on arranging the target population according
to some ordering scheme and then selecting
elements at regular intervals through that ordered
list.
• a random starting point is selected
• every Kth unit in the frame is selected,
• where: K = Population Size / Sample Size
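The procedure above can be sketched as follows. This is an illustrative sketch only: the frame of 800 numbered units and the function name `systematic_sample` are assumptions, and K is rounded down when the population size is not an exact multiple of the sample size.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select every Kth unit after a random start, with K = N // n."""
    k = len(frame) // n                 # sampling interval K
    rng = random.Random(seed)
    start = rng.randrange(k)            # random starting point in the first interval
    return [frame[start + i * k] for i in range(n)]


# Hypothetical ordered frame of N = 800 units, numbered 1..800.
frame = list(range(1, 801))
sample = systematic_sample(frame, 200, seed=3)

interval = len(frame) // 200
print(interval, sample[:5])
```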
3. Stratified Sample
• a stratum = a population segment having one or more
common characteristics (e.g. companies with between
10 and 50 employees)
• When the population is heterogeneous overall, but within
it there are homogeneous populations (strata), the
population is stratified.
• population divided into strata
• the basis for this division is judgment
• may help to ensure a representative sample, thus
reducing sampling error
One of the main reasons for using a stratified sample
is that stratifying has the effect of reducing sampling
error for a given sample size to a level lower than that
of a simple random sample of the same size.
This is so because of a very simple principle: the more
homogeneous a population is on the variables being
studied, the smaller the sample size needed to
represent it accurately.
Stratifying makes each sub-sample more
homogeneous by eliminating the variation on the
variable that is used for stratifying.
Probability Sampling
Proportionate / Disproportionate Sampling
proportionate - an equal % of respondents is
sampled from each stratum
disproportionate - a small important stratum is
over-sampled but restored to due weight in
overall results
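A minimal sketch of proportionate stratified sampling, assuming three hypothetical strata of companies and a 10% sampling fraction. The weight N_h / n_h computed at the end is one common way to restore an over-sampled stratum to its due weight; the slides do not prescribe a particular weighting method, so treat it as an assumption.

```python
import random


def proportionate_stratified_sample(strata, fraction, seed=None):
    """Draw a simple random sample of the same fraction from every stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, elements in strata.items():
        n_h = round(len(elements) * fraction)   # stratum sample size
        sample[name] = rng.sample(elements, n_h)
    return sample


def design_weights(strata, sample):
    """Weight N_h / n_h: how many population units each sampled unit represents."""
    return {name: len(strata[name]) / len(sample[name]) for name in strata}


# Hypothetical strata: companies grouped by size.
strata = {
    "small": [f"small_{i}" for i in range(600)],
    "medium": [f"medium_{i}" for i in range(300)],
    "large": [f"large_{i}" for i in range(100)],
}

sample = proportionate_stratified_sample(strata, 0.10, seed=1)
sizes = {name: len(units) for name, units in sample.items()}
weights = design_weights(strata, sample)

print(sizes)
print(weights)
```

With proportionate allocation every stratum gets the same weight. If the "large" stratum were over-sampled (disproportionate allocation), its weight would be smaller than the others, which is what brings it back to its due share in the overall results.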
4. Cluster Sample
Another modified random sample design; it
requires that the sample units be grouped in
clusters in the universe.
The clusters are not formed as homogeneous
strata of the population.
population divided into clusters before
sample taken
randomly choose clusters and sample all elements within
each cluster
low cost for a given sample size
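The one-stage cluster design described above can be sketched as follows. The villages and households are hypothetical, and `cluster_sample` is an illustrative name, not an established API.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every element inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # sample cluster names
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical universe: 10 villages (clusters) of 20 households each.
clusters = {
    f"village_{v}": [f"household_{v}_{h}" for h in range(20)]
    for v in range(10)
}

sample = cluster_sample(clusters, 3, seed=7)
total_elements = sum(len(units) for units in sample.values())

print(sorted(sample), total_elements)
```

Only the chosen clusters need to be visited, which is where the low field cost for a given sample size comes from.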
Difference Between Strata and Clusters
Although strata and clusters are both non-overlapping
subsets of the population, they differ in several ways.
All strata are represented in the sample; but
only a subset of clusters are in the sample.
With stratified sampling, the best survey results occur
when elements within strata are internally
homogeneous.
However, with cluster sampling, the best results occur
when elements within clusters are internally
heterogeneous.
5. Multi-stage Sample
A probability sample taken in multiple stages if the survey
population is large and widely dispersed.
The selection procedure takes place in a hierarchy of
stages.
First stage: primary sample unit
Second stage: secondary sample unit
Third stage: tertiary sample unit
. . . . .
Last stage: final (or ultimate) sample unit
Comparisons of probability sampling techniques
Sampling technique | Cost | Degree of use | Advantage | Disadvantage
Simple random | High cost | Moderate | Easy to analyze data and sampling error | Requires sampling frame; Large errors for same sample size compared to stratified; High cost if respondents are dispersed
Systematic | Moderate to high cost | Moderate | Simple to draw sample | Requires sampling frame
Stratified | High cost | Moderate | Ensures group representation; Comparison among strata possible; Reduces variability | Requires information for stratification variable; Sampling frame needed
Cluster | Low cost | Frequent | Lowers field cost; Possible to estimate characteristics of clusters | Larger errors for same sample size compared to other techniques
Non-probability sampling design
1. Convenience sampling
The sampling procedure of obtaining those people or units
that are most conveniently available
The most economical and fastest way of getting
questionnaires filled in
2. Purposive sampling: We select a particular group of units
from the population based on reason/purpose. We need to
justify the reason why we choose a particular unit
Judgmental sampling: we use our judgment whether it is
appropriate to choose a particular unit from the
population
Quota sampling: we assign quota for group of units in a
population in drawing samples
3. Snowball sampling
In some studies identifying the population is very difficult
We try to find one study unit from the population and we
Comparisons of non-probability sampling techniques
SN | Sampling technique | Cost | Degree of use | Advantage | Disadvantage
1 | Convenience | Very low | Extensively used | No need for list of population | Samples are unrepresentative; Making inference beyond sample risky
2 | Judgmental | Moderate cost | Average | Sample guaranteed to meet a specific objective; Useful for certain types of forecasting | Bias due to expert’s beliefs makes samples unrepresentative; Making inference beyond sample risky
3 | Quota | Moderate cost | Very extensively used | No need for list of population | Introduces bias in researcher’s classification of subjects; Making inference beyond sample risky
4 | Snowball | Low cost | Used in special situations | Useful in locating rare population | High bias because sample units are not independent; Making inference beyond sample risky
Sample Size
sample size (number of elements in sample)
and precision of the study are directly related
The larger the sample size, the higher
the accuracy.
Sample size determination is purely a
statistical activity, which needs
statistical knowledge.
the adequate size of a sample is properly
estimated by deciding what level of accuracy
is expected (i.e. how large a standard error or
sampling error is acceptable)
The confidence level: A percentage value that tells
how confident a researcher can be about being
correct. Commonly used values are 90%, 95% and 99%.
It is expressed as a percentage and represents how
often the true percentage of the population who
would pick an answer lies within the confidence
interval. For example, a 95% confidence level means
that if you conducted the same survey 100 times,
about 95 of the 100 confidence intervals would be
expected to contain the true population value.
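The usual reading of a 95% confidence level is that, over many repetitions of the same survey, about 95% of the computed confidence intervals contain the true population value. The simulation below is a rough sketch of that idea under stated assumptions: a hypothetical normally distributed population with mean 50 and standard deviation 10, samples of 100, and the multiplier Z = 1.96.

```python
import random
import statistics


def coverage_rate(pop_mean, pop_sd, n, repetitions, z=1.96, seed=0):
    """Share of repeated surveys whose interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        # Half-width of the interval: z times the estimated standard error.
        half_width = z * statistics.stdev(sample) / n ** 0.5
        if mean - half_width <= pop_mean <= mean + half_width:
            hits += 1
    return hits / repetitions


rate = coverage_rate(pop_mean=50.0, pop_sd=10.0, n=100, repetitions=1000)
print(f"share of intervals containing the true mean: {rate:.3f}")
```

The printed share should be close to 0.95, though it varies a little from run to run if the seed is changed.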
Sample Size
There are a number of sample size determination methods
Personal judgments: The personal judgment and
subjective decision of the researcher in some cases can
be used as a base to determine the size of the sample.
Budgetary approach is another way to determine the
sample size. Under this approach the sample size is
determined by the available fund for the proposed study.
E.g., if the cost of surveying one individual or unit is 30 birr
and the total available fund for the survey is, say, 1800 birr,
the sample size will be determined as
Sample size (n) = total budget of survey / cost of surveying
one unit
Accordingly, the sample size will be 60 units (Br. 1800 / Br. 30 per unit
= 60 units)
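The budgetary calculation above is a single division. A small sketch, with the illustrative function name `budget_sample_size` and the result rounded down so that the budget is not exceeded:

```python
def budget_sample_size(total_budget, cost_per_unit):
    """Largest number of units that can be surveyed within the budget."""
    return int(total_budget // cost_per_unit)


n = budget_sample_size(total_budget=1800, cost_per_unit=30)   # amounts in birr
print(n)   # 60
```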
Traditional inferences
This is based on confidence rate and
precision level.
To estimate sample size using this
approach we need to have information
about:
the estimated variance of the population,
the magnitude of acceptable error and
the confidence interval
Steps in estimating sample size:
1. Estimate the standard deviation of the population
   (from a previous study or a pilot study)
2. Decide on the magnitude of error
3. Determine the confidence interval
4. Calculate sample size as follows:
Variance or heterogeneity of the population: is measured
by the standard deviation of the population.
It can be taken from a previous study or a pilot study.
If information about variance is not available,
the researcher is expected to estimate it.
According to a rule of thumb, the standard deviation is
about one-sixth of the range.
Magnitude of acceptable error: The magnitude of
error (range of possible error) indicates how precise the
study must be.
It is the acceptable error for that study. The researcher
makes a subjective judgment about the desired
magnitude of error.
E.g., to estimate the average income of households
one may allow an error of, say, 50
Confidence interval:
In most cases (research) a 95% confidence level is used.
That is, it is assumed that 95 times out of 100 the
interval estimate from the sample will include the
population parameter.
Once the above concepts are understood and
determined, the size of the sample is quite simple to
determine. It is determined based on the following
relationship.
$n = \left(\dfrac{ZS}{E}\right)^{2}$ (for unlimited population)
$n = \dfrac{N}{1 + N e^{2}}$ (for known population)
Where
Z is the standardization value indicating a confidence level
E represents the acceptable magnitude of error (an error factor)
S represents the sample SD or an estimate of the population SD
n represents the sample size
N is the population size and e is the acceptable margin of
error expressed as a proportion
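The two relationships can be evaluated with a short script. This is a sketch under assumptions: the inputs (Z = 1.96 for a 95% confidence level, S = 500, E = 50, N = 1000, e = 0.05) are hypothetical values chosen for illustration, the function names are made up, and results are rounded up to the next whole unit.

```python
import math


def sample_size_unlimited(z, s, e):
    """n = (Z * S / E)^2, for an unlimited population."""
    return math.ceil((z * s / e) ** 2)


def sample_size_known(N, e):
    """n = N / (1 + N * e^2), for a known population of size N."""
    return math.ceil(N / (1 + N * e ** 2))


# Hypothetical inputs: 95% confidence (Z = 1.96), SD of 500, error of 50.
n_unlimited = sample_size_unlimited(z=1.96, s=500, e=50)

# Hypothetical inputs: population of 1000, 5% margin of error.
n_known = sample_size_known(N=1000, e=0.05)

print(n_unlimited, n_known)   # 385 286
```

Halving the acceptable error E roughly quadruples the required sample size in the first formula, because E enters as a square.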
Tables
It is also possible to use a table that suggests
the optimal sample size – given a population
size, a specific margin of error, and a desired
confidence interval.
This can help researchers avoid the formulas
altogether. The table below presents the
results of one set of these calculations. It
may be used to determine the appropriate
sample size for almost any study.
Source: The NEA Research Bulletin (1960), Vol 38:99
Sampling Error and Non Sampling Error
Sampling Error
The sampling error refers to the extent to which
the sample values on some variable of importance
to the research differ from those of the population
from which it was drawn.
Sampling error is the difference between the result
of a sample and the result of census.
It is the difference between the sample estimation
and the actual value of the population.
These are errors that arise by
chance alone.
Errors
Sampling (internal) Error
Because a sample was taken, the
sample statistic is expected to deviate
from the population parameter.
The mean of the sample might be different
from the population mean by chance
alone.
The standard deviation of the sample
might also be different from the population
standard deviation.
Therefore, we can expect some difference
between the sample statistics and the
population parameter.
This difference is known as sampling error.
Example
Suppose an individual student has scored the
following grades in 10 subjects (Consider these
subjects as population); 55, 60, 65, 90, 55, 75, 88,
45, 85, 82.
Say, a sample of four grades, 55, 65, 82, and 90, is
selected at random from this population to estimate
the average grade of this student. The mean of this
sample is 73. But the population mean is 70. The
sampling error is therefore 73 - 70 = 3.
However, the variation due to random fluctuation
(sampling error) decreases as the sample size
increases though it is not possible to completely
avoid sampling error.
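The grade example can be checked with a few lines of Python. The last part goes slightly beyond the slide: as an added illustration, it averages the means of all 210 possible samples of four grades, which shows that individual samples miss the population mean by chance while the sample mean is right on average.

```python
from itertools import combinations
from statistics import mean

population = [55, 60, 65, 90, 55, 75, 88, 45, 85, 82]   # the 10 grades
sample = [55, 65, 82, 90]                                # the sample from the example

population_mean = mean(population)
sample_mean = mean(sample)
sampling_error = sample_mean - population_mean

print(population_mean, sample_mean, sampling_error)      # 70 73 3

# Added illustration: means of every possible sample of size 4.
all_sample_means = [mean(s) for s in combinations(population, 4)]
average_of_sample_means = mean(all_sample_means)
print(len(all_sample_means), average_of_sample_means)
```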
Systematic Error (non-sampling error)
Systematic error is also called sampling bias.
Such error can be created from errors in the sampling
procedure, and it cannot be reduced or eliminated by
increasing the sample size.
Such error occurs because of human mistakes and
not chance variation.
Non-Sampling (external) Error
Arises from the practical considerations in taking a sample, e.g.
recording errors
processing errors
Generally, non-sampling errors occur in a sample
survey as well as in a census survey, whereas
sampling error occurs only in a sample survey.
Preparing the survey questionnaire and handling
the data properly can minimize non-sampling error.
As sample size increases, non-sampling errors (in,
for example, data collection, non-responses,
measurement and analysis) may increase
Maximizing accuracy requires that total study error
be minimized.
Total error = sampling error + non-sampling error
Errors
Bias
Most insidious to detect ....
poorly defined universe
inadequate sampling design
improperly worded questions
distorted answers
Non response rate
End of Chapter Six