Sampling Techniques
Third Edition
WILLIAM G. COCHRAN
WILEY EASTERN LIMITED
New Delhi Bangalore Bombay
Third U.S. Edition, 1977
Authorized reprint of the edition published by John Wiley & Sons, Inc.,
New York, Chichester, Brisbane and Toronto
Copyright © 1977, John Wiley & Sons, Inc.
All rights reserved. No part of this book may be reproduced in any form
without the written permission of Wiley-Interscience, Inc.
10 9 8 7 6 5 4
As did the previous editions, this textbook presents a comprehensive account of
sampling theory as it has been developed for use in sample surveys. It contains
William G. Cochran
South Orleans, Massachusetts
February, 1977
Contents
CHAPTER
1 INTRODUCTION
1.1 Advantages of the Sampling Method
1.2 Some Uses of Sample Surveys
1.3 The Principal Steps in a Sample Survey
1.4 The Role of Sampling Theory
1.5 Probability Sampling
1.6 Alternatives to Probability Sampling
1.7 Use of the Normal Distribution
1.8 Bias and Its Effects
1.9 The Mean Square Error
Exercises
CHAPTER
2 SIMPLE RANDOM SAMPLING
CHAPTER
3 SAMPLING PROPORTIONS AND PERCENTAGES
CHAPTER
4 THE ESTIMATION OF SAMPLE SIZE
CHAPTER
5 STRATIFIED RANDOM SAMPLING
CHAPTER
5A FURTHER ASPECTS OF STRATIFIED SAMPLING
5A.1 Effects of Deviations from the Optimum Allocation
5A.2 Effects of Errors in the Stratum Sizes
5A.3 The Problem of Allocation with More than One Item
5A.4 Other Methods of Allocation with More than One Item
5A.5 Two-Way Stratification with Small Samples
5A.6 Controlled Selection
5A.7 The Construction of Strata
5A.8 Number of Strata
5A.9 Stratification After Selection of the Sample (Poststratification)
5A.10 Quota Sampling
5A.11 Estimation from a Sample of the Gain Due to Stratification
5A.12 Estimation of Variance with One Unit per Stratum
5A.13 Strata as Domains of Study
5A.14 Estimating Totals and Means Over Subpopulations
5A.15 Sampling from Two Frames
Exercises
CHAPTER
6 RATIO ESTIMATORS
CHAPTER
7 REGRESSION ESTIMATORS
CHAPTER
8 SYSTEMATIC SAMPLING
CHAPTER
9 SINGLE-STAGE CLUSTER SAMPLING: CLUSTERS OF EQUAL SIZES
CHAPTER
9A SINGLE-STAGE CLUSTER SAMPLING: CLUSTERS OF UNEQUAL SIZES
CHAPTER
10 SUBSAMPLING WITH UNITS OF EQUAL SIZE
Exercises
CHAPTER
11 SUBSAMPLING WITH UNITS OF UNEQUAL SIZES
11.1 Introduction
11.2 Sampling Methods When n = 1
11.3 Sampling with Probability Proportional to Estimated Size
11.4 Summary of Methods for n = 1
11.5 Sampling Methods When n > 1
11.6 Two Useful Results
11.7 Units Selected with Equal Probabilities: Unbiased Estimator
11.8 Units Selected with Equal Probabilities: Ratio to Size Estimate
11.9 Units Selected with Unequal Probabilities with Replacement: Unbiased Estimator
11.10 Units Selected Without Replacement
11.11 Comparison of the Methods
11.12 Ratios to Another Variable
11.13 Choice of Sampling and Subsampling Fractions. Equal Probabilities
11.14 Optimum Selection Probabilities and Sampling and Subsampling Rates
11.15 Stratified Sampling. Unbiased Estimators
11.16 Stratified Sampling. Ratio Estimates
11.17 Nonlinear Estimators in Complex Surveys
11.18 Taylor Series Expansion
11.19 Balanced Repeated Replications
11.20 The Jackknife Method
11.21 Comparison of the Three Approaches
Exercises
CHAPTER
12 DOUBLE SAMPLING
CHAPTER
13 SOURCES OF ERROR IN SURVEYS
13.1 Introduction
13.2 Effects of Nonresponse
13.3 Types of Nonresponse
13.4 Call-backs
13.5 A Mathematical Model of the Effects of Call-backs
13.6 Optimum Sampling Fraction Among the Nonrespondents
13.7 Adjustments for Bias Without Call-backs
13.8 A Mathematical Model for Errors of Measurement
13.9 Effects of Constant Bias
13.10 Effects of Errors that Are Uncorrelated Within the Sample
13.11 Effects of Intrasample Correlation Between Errors of Measurement
13.12 Summary of the Effects of Errors of Measurement
13.13 The Study of Errors of Measurement
13.14 Repeated Measurement of Subsamples
13.15 Interpenetrating Subsamples
13.16 Combination of Interpenetration and Repeated Measurement
13.17 Sensitive Questions: Randomized Responses
13.18 The Unrelated Second Question
13.19 Summary
Exercises
References
Answers to Exercises
Author Index
Introduction
persons, or about one person in 1240. Surveys used to provide facts bearing on
sales and advertising policy in market research may employ samples of only a few
thousand.
Greater Speed
For the same reason, the data can be collected and summarized more quickly
with a sample than with a complete count. This is a vital consideration when the
information is urgently needed.
Greater Scope
In certain types of inquiry highly trained personnel or specialized equipment,
limited in availability, must be used to obtain the data. A complete census is
impracticable: the choice lies between obtaining the information by sampling or
not at all. Thus surveys that rely on sampling have more scope and flexibility
regarding the types of information that can be obtained. On the other hand, if
accurate information is wanted for many subdivisions of the population, the size of
sample needed to do the job is sometimes so large that a complete enumeration
offers the best solution.
Greater Accuracy
Because personnel of higher quality can be employed and given intensive
training and because more careful supervision of the field work and processing of
results becomes feasible when the volume of work is reduced, a sample may
produce more accurate results than the kind of complete enumeration that can be
taken.
extra questions about occupation, parentage, fertility, and the like, of those
persons whose names fell on two of the 40 lines on each page of the schedule. The
use of sampling was greatly extended in 1950. From a 20% sample (every fifth
line) information was obtained on items such as income, years in school, migration,
and service in armed forces. By taking every sixth person in the 20% sample,
a further sample of 3⅓% was created to give information on marriage and fertility.
A series of questions dealing with the condition and age of housing was split into
five sets, each set being filled in at every fifth house. Sampling was also employed
to speed up publication of the results. Preliminary tabulations for many important
items, made on a sample basis, appeared more than a year and a half before the final
reports.
This process continued in the 1960 and 1970 Censuses. Except for certain basic
information required from every person for constitutional or legal reasons, the
whole census was shifted to a sample basis. This change, accompanied by greatly
increased mechanization, resulted in much earlier publication and substantial
savings.
In addition to their use in censuses, continuing samples are employed by
government bureaus to obtain current information. In the United States, exam-
ples are the Current Population Survey, which provides monthly data on the size
and composition of the labor force and on the number of unemployed, the
National Health Survey, and the series of samples needed for the calculation of
the monthly Consumer Price Index.
On a smaller scale, local governments (city, state, and county) are making
increased use of sample surveys to obtain information needed for future planning
and for meeting pressing problems. In the United States most large cities have
commercial agencies that make a business of planning and conducting sample
surveys for clients.
Market research is heavily dependent on the sampling approach. Estimates of
the sizes of television and radio audiences for different programs and of newspaper
and magazine readership (including the advertisements) are kept continually
under scrutiny. Manufacturers and retailers want to know the reactions of
people to new products or new methods of packaging, their complaints about old
products, and their reasons for preferring one product to another.
Business and industry have many uses for sampling in attempting to increase the
efficiency of their internal operations. The important areas of quality control and
acceptance sampling are outside the scope of this book. But, obviously, decisions
taken with respect to level or change of quality or to acceptance or rejection of
batches are well grounded only if results obtained from the sample data are valid
(within a reasonable tolerance) for the whole batch. The sampling of records of
business transactions (accounts, payrolls, stock, personnel), usually much easier
than the sampling of people, can provide serviceable information quickly and
economically. Savings can also be made through sampling in the estimation of
inventories, in studies of the condition and length of the life of equipment, in the
inspection of the accuracy and rate of output of clerical work, in investigating how
key personnel distribute their working time among different tasks, and, more
generally, in the field known as operations research. The books by Deming (1960)
and Slonim (1960) contain many interesting examples showing the range of
applications of the sampling method in business.
Opinion, attitude, and election polls, which did much to bring the technique of
sampling before the public eye, continue to be a popular feature of newspapers. In
the field of accounting and auditing, which has employed sampling for many years,
a new interest has arisen in adapting modern developments to the particular
problems of this field. Thus, Neter (1972) describes how airlines and railways save
money by using samples of records to apportion income from freight and
passenger service. The status of sample surveys as evidence in lawsuits has also
been subject to lively discussion. Gallup (1972) has noted the major contribution
that sample surveys can make to the process of informed government by deter-
mining quickly people's opinions on proposed or new government programs and
has stressed their role as sources of information in social science.
Sample surveys can be classified broadly into two types- descriptive and
analytical. In a descriptive survey the objective is simply to obtain certain
information about large groups: for example, the numbers of men, women, and
children who view a television program. In an analytical survey, comparisons are
made between different subgroups of the population, in order to discover whether
differences exist among them and to form or to verify hypotheses about the
reasons for these differences. The Indianapolis fertility survey, for instance, was
an attempt to determine the extent to which married couples plan the number and
spacing of children, the husband's and wife's attitudes toward this planning, the
reasons for these attitudes, and the degree of success attained (Kiser and
Whelpton, 1953).
The distinction between descriptive and analytical surveys is not, of course,
clear-cut. Many surveys provide data that serve both purposes. Along with the rise
in the number of descriptive surveys, there has, however, been a noticeable
increase in surveys taken primarily for analytical purposes, particularly in the
study of human behavior and health. Surveys of the teeth of school children before
and after fluoridation of water, of the death rates and causes of death of people
who smoke different amounts, and the huge study of the effectiveness of the Salk
polio vaccine may be cited. The study by Coleman (1966) on equality of
educational opportunity, conducted on a national sample of schools, contained
many regression analyses that estimated the relative contributions of school
characteristics, home background, and the child's outlook to variations in exam
results.
Population to be Sampled
The word population is used to denote the aggregate from which the sample is
chosen. The definition of the population may present no problem, as when
sampling a batch of electric light bulbs in order to estimate the average length of
life of a bulb. In sampling a population of farms, on the other hand, rules must be
set up to define a farm, and borderline cases arise. These rules must be usable in
practice: the enumerator must be able to decide in the field, without much
hesitation, whether or not a doubtful case belongs to the population.
The population to be sampled (the sampled population) should coincide with
the population about which information is wanted (the target population).
Sometimes, for reasons of practicability or convenience, the sampled population is
more restricted than the target population. If so, it should be remembered that
conclusions drawn from the sample apply to the sampled population. Judgment
about the extent to which these conclusions will also apply to the target population
must depend on other sources of information. Any supplementary information
that can be gathered about the nature of the differences between sampled and
target population may be helpful.
Data to be Collected
It is well to verify that all the data are relevant to the purposes of the survey and
that no essential data are omitted. There is frequently a tendency, particularly
with human populations, to ask too many questions, some of which are never
subsequently analyzed. An overlong questionnaire lowers the quality of the
answers to important as well as unimportant questions.
Degree of Precision Desired
The results of sample surveys are always subject to some uncertainty because
only part of the population has been measured and because of errors of measurement.
This uncertainty can be reduced by taking larger samples and by using
superior instruments of measurement. But this usually costs time and money.
Consequently, the specification of the degree of precision wanted in the results is
an important step. This step is the responsibility of the person who is going to use
the data. It may present difficulties, since many administrators are unaccustomed
to thinking in terms of the amount of error that can be tolerated in estimates,
consistent with making good decisions. The statistician can often help at this stage.
Methods of Measurement
There may be a choice of measuring instrument and of method of approach to
the population. Data about a person's state of health may be obtained from
statements that he or she makes or from a medical examination. The survey may
employ a self-administered questionnaire, an interviewer who reads a standard
set of questions with no discretion, or an interviewing process that allows much
latitude in the form and ordering of the questions. The approach may be by mail,
by telephone, by personal visit, or by a combination of the three. Much study has
been made of interviewing methods and problems (see, e.g., Hyman, 1954 and
Payne, 1951).
A major part of the preliminary work is the construction of record forms on
which the questions and answers are to be entered. With simple questionnaires,
the answers can sometimes be precoded-that is, entered in a manner in which
they can be routinely transferred to mechanical equipment. In fact, for the
construction of good record forms, it is necessary to visualize the structure of the
final summary tables that will be used for drawing conclusions.
The Frame
Before selecting the sample, the population must be divided into parts that are
called sampling units, or units. These units must cover the whole of the population
and they must not overlap, in the sense that every element in the population
belongs to one and only one unit. Sometimes the appropriate unit is obvious, as in
a population of light bulbs, in which the unit is the single bulb. Sometimes there is
a choice of unit. In sampling the people in a town, the unit might be an individual
person, the members of a family, or all persons living in the same city block. In
sampling an agricultural crop, the unit might be a field, a farm, or an area of land
whose shape and dimensions are at our disposal.
The construction of this list of sampling units, called a frame, is often one of the
major practical problems. From bitter experience, samplers have acquired a
critical attitude toward lists that have been routinely collected for some purpose.
Despite assurances to the contrary, such lists are often found to be incomplete, or
partly illegible, or to contain an unknown amount of duplication. A good frame
may be hard to come by when the population is specialized, as in populations of
bookmakers or of people who keep turkeys. Jessen (1955) presents an interesting
method of constructing a frame from the branches of a fruit tree.
Selection of the Sample
There is now a variety of plans by which the sample may be selected. For each
plan that is considered, rough estimates of the size of sample can be made from a
knowledge of the degree of precision desired. The relative costs and time involved
for each plan are also compared before making a decision.
The Pretest
It has been found useful to try out the questionnaire and the field methods on a
small scale. This nearly always results in improvements in the questionnaire and
may reveal other troubles that will be serious on a large scale, for example, that the
cost will be much greater than expected.
sampler learns to recognize mistakes in execution and to see that they do not occur
in future surveys.
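[Figure: normal frequency distribution of the estimate, with mean m]
$$\frac{1}{\sigma\sqrt{2\pi}} \int_{\mu+1.96\sigma}^{\infty} e^{-(\hat{\mu}-m)^2/2\sigma^2}\, d\hat{\mu}$$
Putting $\hat{\mu} - m = \sigma t$, with $B = m - \mu$, this becomes
$$\frac{1}{\sqrt{2\pi}} \int_{1.96-(B/\sigma)}^{\infty} e^{-t^2/2}\, dt$$
Similarly, the lower tail, that is, the shaded area below P, has an area
$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-1.96-(B/\sigma)} e^{-t^2/2}\, dt$$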
From the form of the integrals it is clear that the amount of disturbance depends
solely on the ratio of the bias to the standard deviation. The results are shown in
Table 1.1.
TABLE 1.1
EFFECT OF A BIAS B ON THE PROBABILITY OF AN ERROR
GREATER THAN $1.96\sigma$
                 Probability of Error
$B/\sigma$   $< -1.96\sigma$   $> 1.96\sigma$   Total
0.02         0.0238            0.0262           0.0500
0.04         0.0228            0.0274           0.0502
0.06         0.0217            0.0287           0.0504
0.08         0.0207            0.0301           0.0508
0.10         0.0197            0.0314           0.0511
0.20         0.0154            0.0392           0.0546
0.40         0.0091            0.0594           0.0685
0.60         0.0052            0.0869           0.0921
0.80         0.0029            0.1230           0.1259
1.00         0.0015            0.1685           0.1700
1.50         0.0003            0.3228           0.3231
For the total probability of an error of more than $1.96\sigma$, the bias has little effect
provided that it is less than one tenth of the standard deviation. At this point the
total probability is 0.0511 instead of the 0.05 that we think it is. As the bias
increases further, the disturbance becomes more serious. At $B = \sigma$, the total
probability of error is 0.17, more than three times the presumed value.
The two tails are affected differently. With a positive bias, as in this example, the
probability of an underestimate by more than $1.96\sigma$ shrinks rapidly from the
presumed 0.025 to become negligible when $B = \sigma$. The probability of the
corresponding overestimate mounts steadily. In most applications the total error is the
primary interest, but occasionally we are particularly interested in errors in one
direction.
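The entries of Table 1.1 follow directly from the two tail integrals, so they can be recomputed as a check. The sketch below is only an illustration: the function names `normal_cdf` and `tail_probabilities` are hypothetical, and it assumes, as the text does, that the estimate is normally distributed with a positive bias B.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def tail_probabilities(bias_ratio, z=1.96):
    """Return (lower, upper, total) error probabilities for a given B/sigma."""
    lower = normal_cdf(-z - bias_ratio)       # underestimate by more than z*sigma
    upper = 1.0 - normal_cdf(z - bias_ratio)  # overestimate by more than z*sigma
    return lower, upper, lower + upper

# Print a few rows in the layout of Table 1.1.
for ratio in (0.02, 0.10, 0.20, 1.00, 1.50):
    lower, upper, total = tail_probabilities(ratio)
    print(f"{ratio:4.2f}  {lower:.4f}  {upper:.4f}  {total:.4f}")
```

Setting `bias_ratio` to zero returns about 0.025 in each tail, which is the presumed value the text compares against.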
As a working rule, the effect of bias on the accuracy of an estimate is negligible if
the bias is less than one tenth of the standard deviation of the estimate. If we have
a biased method of estimation for which $B/\sigma < 0.1$, where B is the absolute value
of the bias, it can be claimed that the bias is not an appreciable disadvantage of the
method. Even with $B/\sigma = 0.2$, the disturbance in the probability of the total error
is modest.
In using these results, a distinction must be made between the two sources of
bias mentioned at the beginning of this section. With biases of the type that arise in
estimating ratios, an upper limit to the ratio $B/\sigma$ can be found mathematically. If
the sample is large enough, we can be confident that $B/\sigma$ will not exceed 0.1. With
biases caused by errors of measurement or nonresponse, on the other hand, it is
usually impossible to find a guaranteed upper limit to $B/\sigma$ that is small. This
troublesome problem is discussed in Chapter 13.
TABLE 1.2
PROBABILITY OF AN ABSOLUTE ERROR $\geq 1\sqrt{\mathrm{MSE}}$,
$1.96\sqrt{\mathrm{MSE}}$, AND $2.576\sqrt{\mathrm{MSE}}$
Probability
Even at $B/\sigma = 0.6$, the changes in the probabilities as compared with those for
$B/\sigma = 0$ are slight.
Because of the difficulty of ensuring that no unsuspected bias enters into
estimates, we will usually speak of the precision of an estimate instead of its
accuracy. Accuracy refers to the size of deviations from the true mean $\mu$, whereas
precision refers to the size of deviations from the mean $m$ obtained by repeated
application of the sampling procedure.
EXERCISES
1.1 Suppose that you were using sampling to estimate the total number of words in a
book that contains illustrations.
(a) Is there any problem of definition of the population? (b) What are the pros and cons
of (1) the page, (2) the line, as a sampling unit?
1.2 A sample is to be taken from a list of names that are on cards (one name to a card)
numbered consecutively in a file. Each name is to have an equal chance of being drawn in
the sample. What problems arise in the following common situations? (a) Some of the
names do not belong to the target population, although this fact cannot be verified for any
name until it has been drawn. (b) Some names appear on more than one card. All cards with
the same name bear consecutive numbers and therefore appear together in the file. (c)
Some names appear on more than one card, but cards bearing the same name may be
scattered anywhere about the file.
1.3 The problem of finding a frame that is complete and enables the sample to be drawn
is often an obstacle. What kinds of frames might be tried for the following surveys? Have
the frames any serious weaknesses? (a) A survey of stores that sell luggage in a large city.
(b) A survey of the kinds of articles left behind in subways or buses. (c) A survey of persons
bitten by snakes during the last year. (d) A survey to estimate the number of hours per week
spent by family members in watching television.
1.4 A city directory, 4 years old, lists the addresses in order along each street, and gives
the names of the persons living at each address. For a current interview survey of the people
in the city, what are the deficiencies of this frame? Can they be remedied by the
interviewers during the course of the field work? In using the directory, would you draw a
list of addresses (dwelling places) or a list of persons?
1.5 In estimating by sampling the actual value of the small items in the inventory of a
large firm, the actual and the book value were recorded for each item in the sample. For the
total sample, the ratio of actual to book value was 1.021, this estimate being approximately
normally distributed with a standard error of 0.0082. If the book value of the inventory is
$80,000, compute 95% confidence limits for the actual value.
1.6 Frequently data must be treated as a sample, although at first sight they appear to
be a complete enumeration. A proprietor of a parking lot finds that business is poor on
Sunday mornings. After 26 Sundays in operation, his average receipts per Sunday morning
are exactly $10. The standard error of this figure, computed from week-to-week variations,
is $1.2. The attendant costs $7 each Sunday. The proprietor is willing to keep the lot open at
this time if his expected future profit is $5 per Sunday morning. What is the confidence
probability that the long-term profit rate will be at least $5? What assumption must be
made in order to answer this question?
1.7 In Table 1.2, what happens to the probability of exceeding $1\sqrt{\mathrm{MSE}}$, $1.96\sqrt{\mathrm{MSE}}$,
and $2.576\sqrt{\mathrm{MSE}}$ when $B/\sigma$ tends to infinity, that is, when the MSE is due entirely to bias?
Do your results agree with the directions of the changes noted in Table 1.2 as $B/\sigma$ moves
from 0 to 0.6?
1.8 When it is necessary to compare two estimates that have different frequency
distributions of errors $(\hat{\mu} - \mu)$, it is occasionally possible, in specialized problems, to
compute the cost or loss that will result from an error $(\hat{\mu} - \mu)$ of any given size. The estimate
that gives the smaller expected loss is preferred, other things being equal. Show that if the
loss is a quadratic function $A(\hat{\mu} - \mu)^2$ of the error, we should choose the estimate with the
smaller mean square error.
CHAPTER 2
$$\frac{n}{N}\cdot\frac{(n-1)}{(N-1)}\cdot\frac{(n-2)}{(N-2)}\cdots\frac{1}{(N-n+1)} = \frac{n!\,(N-n)!}{N!} = \frac{1}{{}_N C_n} \tag{2.1}$$
Since a number that has been drawn is removed from the population for all
subsequent draws, this method is also called random sampling without replacement.
Random sampling with replacement is entirely feasible: at any draw, all
N members of the population are given an equal chance of being drawn, no matter
how often they have already been drawn. The formulas for the variances and
estimated variances of estimates made from the sample are often simpler when
sampling is with replacement than when it is without replacement. For this reason
sampling with replacement is sometimes used in the more complex sampling
plans, although at first sight there seems little point in having the same unit two or
more times in the sample.
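The two modes of drawing, and the probability in equation (2.1), can be illustrated with Python's standard `random` module. This is a minimal sketch under stated assumptions: the population of 20 unit labels, the seed, and the helper name `prob_specified_sample` are all hypothetical choices made for the illustration.

```python
import math
import random

def prob_specified_sample(N, n):
    """Product n/N * (n-1)/(N-1) * ... * 1/(N-n+1), as in equation (2.1)."""
    p = 1.0
    for k in range(n):
        p *= (n - k) / (N - k)
    return p

rng = random.Random(12345)        # fixed seed, chosen arbitrarily for repeatability
population = list(range(1, 21))   # hypothetical unit labels 1..N with N = 20

without_replacement = rng.sample(population, 5)   # no unit can appear twice
with_replacement = rng.choices(population, k=5)   # a unit may appear more than once

print(without_replacement, with_replacement)
# The product form and 1 / (number of possible samples) should agree.
print(prob_specified_sample(20, 5), 1 / math.comb(20, 5))
```

`random.sample` corresponds to sampling without replacement and `random.choices` to sampling with replacement; the last line compares the product in (2.1) with the reciprocal of the number of distinct samples.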
TABLE 2.1
ONE THOUSAND RANDOM DIGITS
    00-04  05-09  10-14  15-19  20-24  25-29  30-34  35-39  40-44  45-49
00  54463  22662  65905  70639  79365  67382  29085  69831  47058  08186
01  15389  85205  18850  39226  42249  90669  9?325  23248  60933  26927
02  85941  40756  82414  02015  13858  78030  16269  65978  01385  15345
03  61149  69440  11286  8?218  58925  03638  52862  62733  33451  77455
04  05219  81619  10651  67079  92511  5988?  ?1502  72095  83463  75577
05  41417  98326  87719  92294  46614  50948  64886  20002  97365  30976
06  2?057  94070  20652  35774  16249  75019  21145  05217  47286  76305
07  17783  00015  10806  83091  91530  36466  62481  39981  49177  75779
08  40950  84820  29881  85966  62800  70326  84740  62660  77379  90279
09  82995  64157  66164  41180  10089  41757  78258  96488  88629  37231
10  96754  17676  55659  44105  47361  34833  86679  23930  53249  27083
11  34357  88040  53364  71726  45690  66334  60332  22554  90600  71113
12  06318  37403  49927  57715  50423  67372  63116  48888  21505  80182
13  62111  52820  07243  79931  89292  84767  85693  73947  22278  11551
14  47534  09243  67879  00544  23410  12740  02540  54440  32949  13491
15  98614  75993  84460  62846  59844  14922  48730  73443  48167  34770
16  24856  0364?  44898  09351  98795  18644  39765  71058  90368  44104
17  ?6887  12479  80621  66223  860?5  78285  02432  53342  42846  94771
18  90801  21472  42815  77408  37390  76766  52615  32141  3026?  18106
19  55165  77312  83666  36028  28420  70219  81369  41943  47366  41067
In using these tables to select a simple random sample, the first step is to number
the units in the population from 1 to N. If the first digit of N is a number between 5
and 9, the following method of selection is adequate. Suppose N = 528, and we
want n = 10. Select three columns from Table 2.1, say columns 25 to 27. Go down
the three columns, selecting the first 10 distinct numbers between 001 and 528.
These are 36, 509, 364, 417, 348, 127, 149, 186, 290, and 162. For the last two
numbers we jumped to columns 30 to 32. In repeated selections it is advisable to
vary the starting point in the table.
The disadvantage of this method is that the three-digit numbers 000 and 529 to
999 are not used, although skipping numbers does not waste much time. When the
first digit of N is less than 5, some may still prefer this method if n is small and a
large table of random digits is available.
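The first selection method amounts to simple rejection: read three-digit numbers and keep the first n distinct ones that lie between 001 and N. A minimal sketch follows, assuming a pseudorandom generator stands in for the printed table; the names `select_by_rejection` and `three_digit_numbers` and the seed are hypothetical.

```python
import random

def select_by_rejection(candidates, N, n):
    """Keep the first n distinct candidates that fall in 1..N; skip the rest."""
    selected = []
    for x in candidates:
        if 1 <= x <= N and x not in selected:
            selected.append(x)
            if len(selected) == n:
                break
    return selected

def three_digit_numbers(rng):
    """Endless stream of uniform numbers 000..999 (a stand-in for the table)."""
    while True:
        yield rng.randrange(1000)

rng = random.Random(2024)   # arbitrary seed
sample = select_by_rejection(three_digit_numbers(rng), N=528, n=10)
print(sample)
```

Passing the numbers read down a pair of table columns as `candidates`, instead of the generated stream, applies the same rule to the printed digits.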
With N = 128, for example, a second method that involves less rejection and is
easily applied is as follows. In a series of three-digit numbers, subtract 200 from all
numbers between 201 and 400, 400 from all numbers between 401 and 600, 600
from all numbers between 601 and 800, 800 from all numbers between 801 and
999 and, of course, 000 from all numbers between 000 and 200. All remainders
greater than 128 and the numbers 000, 200, and so forth, are rejected. Using
columns 05 to 07 in Table 2.1, we get 26, 52, 7, 94, 16, 48, 41, 80, 128, and 92, the
draw requiring 15 three-digit numbers for n = 10. In this sample the rejection rate
5/15 = 33% is close to the probability of rejection 72/200 = 36% for this method.
In using this method with a number N like 384, note that one subtracts 400 from a
number between 401 and 800, but automatically rejects all numbers greater than
800. Subtraction of 800 from numbers between 801 and 999 would give a higher
probability of acceptance to remainders between 001 and 199 than to remainders
between 200 and 384.
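The remainder method can be sketched as follows. The function name `remainder_method` and its parameters `block` and `modulus` are hypothetical; `block` is the amount subtracted repeatedly (200 for N = 128, 400 for N = 384), and numbers at or beyond the last complete block are rejected so that every remainder keeps the same chance of acceptance. The candidate list in the usage lines is the set of three-digit numbers read from columns 05 to 07 of Table 2.1, rows 00 to 14.

```python
def remainder_method(candidates, N, n, block, modulus=1000):
    """Select n distinct units in 1..N by reducing candidates modulo `block`."""
    usable = (modulus // block) * block   # e.g. 1000 for block 200, 800 for block 400
    selected = []
    for x in candidates:
        if x >= usable:
            continue                      # incomplete final block: always reject
        r = x % block                     # same as subtracting 200, 400, ... as needed
        if 1 <= r <= N and r not in selected:
            selected.append(r)            # remainder 0 and remainders above N are rejected
            if len(selected) == n:
                break
    return selected

# Three-digit numbers from columns 05-07 of Table 2.1, rows 00-14.
columns_05_07 = [226, 852, 407, 694, 816, 983, 940, 0, 848, 641,
                 176, 880, 374, 528, 92]
print(remainder_method(columns_05_07, N=128, n=10, block=200))
# → [26, 52, 7, 94, 16, 48, 41, 80, 128, 92]
```

Five of the fifteen candidates are rejected here (183, 140, 000, 176, and 174 after reduction), which matches the rejection count quoted in the text.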
Other methods of sampling are often preferable to simple random sampling on
the grounds of convenience or of increased precision. Simple random sampling
serves best to introduce sampling theory.
Total:
$$Y = \sum_{i=1}^{N} y_i = y_1 + y_2 + \cdots + y_N$$
$$\sum_{i=1}^{n} y_i = y_1 + y_2 + \cdots + y_n$$
Hansen, Hurwitz, and Madow (1953) and Murthy (1967) give an alternative
definition of consistency, similar to that in classical statistics. An estimator is
consistent if the probability that it is in error by more than any given amount tends
to zero as the sample becomes large. Exact statement of this definition requires
care with complex survey plans.
As we have seen, a method of estimation is unbiased if the average value of the
estimate, taken over all possible samples of given size n, is exactly equal to the true
population value. If the method is to be unbiased without qualification, this result
must hold for any population of finite values $y_i$ and for any n. To investigate
whether $\bar{y}$ is unbiased with simple random sampling, we calculate the value of $\bar{y}$
for all ${}_N C_n$ samples and find the average of the estimates. The symbol E denotes
this average over all possible samples.
Theorem 2.1. The sample mean $\bar{y}$ is an unbiased estimate of $\bar{Y}$.
Proof. By its definition
$$E\bar{y} = \frac{\sum \bar{y}}{{}_N C_n} = \frac{\sum (y_1 + y_2 + \cdots + y_n)}{n[N!/n!(N-n)!]} \tag{2.2}$$
where the sum extends over all ${}_N C_n$ samples. To evaluate this sum, we find out in
how many samples any specific value $y_i$ appears. Since there are (N - 1) other
units available for the rest of the sample and (n - 1) other places to fill in the
sample, the number of samples containing $y_i$ is
$${}_{N-1} C_{n-1} = \frac{(N-1)!}{(n-1)!\,(N-n)!} \tag{2.3}$$
Hence
$$\sum (y_1 + y_2 + \cdots + y_n) = \frac{(N-1)!}{(n-1)!\,(N-n)!}\,(y_1 + y_2 + \cdots + y_N)$$
(2.6)
(2.11)
In (2.11) the sums of products extend over all pairs of units in the sample and
population, respectively. The sum on the left contains n(n - 1)/2 terms and that
on the right contains N(N - 1)/2 terms.
Now square (2.9) and average over all simple random samples. Using (2.10) and
(2.11), we obtain
$$n^2 E(\bar{y} - \bar{Y})^2 = \frac{n}{N}\left\{(y_1 - \bar{Y})^2 + \cdots + (y_N - \bar{Y})^2 + \frac{2(n-1)}{N-1}\left[(y_1 - \bar{Y})(y_2 - \bar{Y}) + \cdots + (y_{N-1} - \bar{Y})(y_N - \bar{Y})\right]\right\}$$
Completing the square on the cross-product term, we have
$$n^2 E(\bar{y} - \bar{Y})^2 = \frac{n}{N}\left\{\left(1 - \frac{n-1}{N-1}\right)\left[(y_1 - \bar{Y})^2 + \cdots + (y_N - \bar{Y})^2\right] + \frac{n-1}{N-1}\left[(y_1 - \bar{Y}) + \cdots + (y_N - \bar{Y})\right]^2\right\}$$
The second term inside the curly bracket vanishes, since the sum of the $y_i$ equals
$N\bar{Y}$. Division by $n^2$ gives
$$V(\bar{y}) = E(\bar{y} - \bar{Y})^2 = \frac{N-n}{nN(N-1)}\sum_{i=1}^{N}(y_i - \bar{Y})^2 = \frac{S^2}{n}\,\frac{(N-n)}{N}$$
This completes the proof.
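Because E denotes an average over all possible samples, both the unbiasedness of the sample mean and the variance formula just proved can be checked by brute-force enumeration on a small population. The sketch below is an illustration only: the five population values and the function name `enumerate_sample_means` are hypothetical.

```python
import itertools

def enumerate_sample_means(values, n):
    """Mean of every possible simple random sample of size n (without replacement)."""
    return [sum(s) / n for s in itertools.combinations(values, n)]

values = [3.0, 7.0, 8.0, 12.0, 20.0]   # hypothetical population, N = 5
N, n = len(values), 2
Y_bar = sum(values) / N
S2 = sum((y - Y_bar) ** 2 for y in values) / (N - 1)

means = enumerate_sample_means(values, n)
expected_mean = sum(means) / len(means)                         # E(y-bar)
variance = sum((m - Y_bar) ** 2 for m in means) / len(means)    # E(y-bar - Y-bar)^2

print(expected_mean, Y_bar)               # the two should agree
print(variance, S2 / n * (N - n) / N)     # the two should agree
```

Enumeration is practical only for very small N, but it makes the meaning of the averaging operator concrete.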
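Corollary 1. The standard error of $\bar{y}$ is
$$\sigma_{\bar{y}} = \frac{S}{\sqrt{n}}\sqrt{\frac{N-n}{N}} = \frac{S}{\sqrt{n}}\sqrt{1-f} \tag{2.12}$$
where $f = n/N$ is the sampling fraction.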
This theorem reduces to theorem 2.2 if the variates $y_i$, $x_i$ are equal on every unit.
Proof. Apply theorem 2.2 to the variate $u_i = y_i + x_i$. The population mean of $u_i$
is $\bar{U} = \bar{Y} + \bar{X}$, and theorem 2.2 gives
that is
$$E[(\bar{y} - \bar{Y}) + (\bar{x} - \bar{X})]^2 = \frac{N-n}{nN}\,\frac{1}{N-1}\sum_{i=1}^{N}[(y_i - \bar{Y}) + (x_i - \bar{X})]^2 \tag{2.16}$$
Expand the quadratic terms on both sides. By theorem 2.2,
$$E(\bar{y} - \bar{Y})^2 = \frac{N-n}{nN}\,\frac{1}{N-1}\sum_{i=1}^{N}(y_i - \bar{Y})^2$$
with a similar relation for $E(\bar{x} - \bar{X})^2$. Hence these two terms cancel on the left and
right sides of (2.16). The result of the theorem (equation 2.15) follows from the
cross-product terms.
estimate the size of the sample needed in a survey that is being planned, and (3) to
estimate the precision actually attained in a survey that has been completed. The
formulas involve $S^2$, the population variance. In practice this will not be known,
but it can be estimated from the sample data. The relevant result is stated in
theorem 2.4.
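Theorem 2.4. For a simple random sample
$$s^2 = \frac{\sum_{i=1}^{n}(y_i-\bar{y})^2}{n-1}$$
is an unbiased estimate of
$$S^2 = \frac{\sum_{i=1}^{N}(y_i-\bar{Y})^2}{N-1}$$
Proof. We may write
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}[(y_i-\bar{Y}) - (\bar{y}-\bar{Y})]^2 \qquad (2.17)$$
$$= \frac{1}{n-1}\left[\sum_{i=1}^{n}(y_i-\bar{Y})^2 - n(\bar{y}-\bar{Y})^2\right] \qquad (2.18)$$
Now average over all simple random samples of size n. By the argument of
symmetry used in theorem 2.2,
$$E\left[\sum_{i=1}^{n}(y_i-\bar{Y})^2\right] = \frac{n}{N}\sum_{i=1}^{N}(y_i-\bar{Y})^2 = \frac{n(N-1)}{N}S^2$$
and, by theorem 2.2,
$$E[n(\bar{y}-\bar{Y})^2] = \frac{N-n}{N}S^2$$
Hence
$$E(s^2) = \frac{S^2}{(n-1)N}\left[n(N-1) - (N-n)\right] = S^2 \qquad (2.19)$$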
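The unbiasedness of $s^2$ can also be confirmed by brute force on a small population. This sketch is an illustration added here rather than part of the original argument; the population is the one given in Exercise 2.1 and the helper names are assumptions.

```python
from fractions import Fraction
from itertools import combinations


def variance_with_divisor_n_minus_1(values):
    """Sum of squared deviations from the mean, divided by (number of values - 1)."""
    k = len(values)
    mean = Fraction(sum(values), k)
    return sum((y - mean) ** 2 for y in values) / (k - 1)


def expected_s2(values, n):
    """Average of s^2 over all C(N, n) simple random samples, computed exactly."""
    samples = list(combinations(values, n))
    return sum(variance_with_divisor_n_minus_1(s) for s in samples) / len(samples)


population = [8, 3, 1, 11, 4, 7]  # values taken from Exercise 2.1
S2 = variance_with_divisor_n_minus_1(population)  # population S^2
for n in (2, 3, 4):
    print(n, expected_s2(population, n), S2)
```

For every sample size the average of $s^2$ equals $S^2$ exactly, as (2.19) states.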
These estimates are slightly biased: for most applications the bias is unimportant.
The reader should note the symbols employed for true and estimated variances
of the estimates. Thus, for $\bar{y}$, we write
True variance: $V(\bar{y}) = \sigma_{\bar{y}}^2$
Estimated variance: $v(\bar{y}) = s_{\bar{y}}^2$
2.8 CONFIDENCE LIMITS
It is usually assumed that the estimates $\bar{y}$ and $\hat{Y}$ are normally distributed about
the corresponding population values. The reasons for this assumption and its
limitations are considered in section 2.15. If the assumption holds, lower and
upper confidence limits for the population mean and total are as follows:
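Mean:
$$\hat{\bar{Y}}_L = \bar{y} - \frac{ts}{\sqrt{n}}\sqrt{1-f}, \qquad \hat{\bar{Y}}_U = \bar{y} + \frac{ts}{\sqrt{n}}\sqrt{1-f} \qquad (2.23)$$
Total:
$$\hat{Y}_L = N\bar{y} - \frac{tNs}{\sqrt{n}}\sqrt{1-f}, \qquad \hat{Y}_U = N\bar{y} + \frac{tNs}{\sqrt{n}}\sqrt{1-f} \qquad (2.24)$$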
The symbol $t$ is the value of the normal deviate corresponding to the desired
confidence probability. The most common values are
Confidence probability (%)   50    80    90    95    99
Normal deviate, t            0.67  1.28  1.64  1.96  2.58
If the sample size is less than 50, the percentage points may be taken from
Student's t table with (n - 1) degrees of freedom, these being the degrees of
freedom in the estimated variance $s^2$. The t distribution holds exactly only if the
observations $y_i$ are themselves normally distributed and N is infinite. Moderate
departures from normality do not affect it greatly. For small samples with very
skew distributions, special methods are needed.
Example. Signatures to a petition were collected on 676 sheets. Each sheet had enough
space for 42 signatures, but on many sheets a smaller number of signatures had been
collected. The numbers of signatures per sheet were counted on a random sample of 50
sheets (about a 7% sample), with the results shown in Table 2.2.
Estimate the total number of signatures to the petition and the 80% confidence limits.
The sampling unit is a sheet, and the observations $y_i$ are the numbers of signatures per
sheet. Since about half the sheets had the maximum number of signatures, 42, the data are
presented as a frequency distribution. Note that the original distribution appears to be far
from normal, the greatest frequency being at the upper end. Nevertheless, there is reason to
believe from experience that the means of samples of 50 are approximately normally
distributed.
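We find
$$n = \sum f_i = 50, \qquad y = \sum f_i y_i = 1471, \qquad \sum f_i y_i^2 = 54{,}497$$
Hence the estimated total number of signatures is
$$\hat{Y} = N\bar{y} = \frac{(676)(1471)}{50} = 19{,}888$$
For the sample variance $s^2$ we have
$$s^2 = \frac{1}{n-1}\left[\sum f_i(y_i-\bar{y})^2\right] = \frac{1}{n-1}\left[\sum f_i y_i^2 - \frac{(\sum f_i y_i)^2}{\sum f_i}\right]$$
$$= \frac{1}{49}\left[54{,}497 - \frac{(1471)^2}{50}\right] = 229.0$$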
From (2.22) the 80% confidence limits are
$y_i$   14  11  10   9   7   6   5   4   3   Total
$f_i$    1   1   1   1   1   3   2   1   1   50
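The arithmetic of the petition example can be reproduced from the three sample sums quoted in the text. The sketch below is an added illustration; it assumes that the confidence limits take the form $\hat{Y} \pm tNs\sqrt{1-f}/\sqrt{n}$, which is what corollary 1 of theorem 2.2 gives once $s$ replaces $S$, with $t = 1.28$ for 80% confidence. Variable names are illustrative.

```python
import math

N, n = 676, 50                   # sheets in the petition, sheets sampled
sum_fy, sum_fy2 = 1471, 54497    # sample sums quoted in the example
t = 1.28                         # normal deviate for 80% confidence

y_bar = sum_fy / n
total_hat = N * y_bar                                   # estimated total signatures
s2 = (sum_fy2 - sum_fy ** 2 / n) / (n - 1)              # sample variance
se_total = N * math.sqrt(s2 / n) * math.sqrt(1 - n / N)  # includes the fpc
lower, upper = total_hat - t * se_total, total_hat + t * se_total

print(f"estimated total = {total_hat:.0f}")
print(f"s^2 = {s2:.1f}")
print(f"80% limits: {lower:.0f} to {upper:.0f}")
```

Dropping the factor `math.sqrt(1 - n / N)` would widen the interval slightly, since the sampling fraction here is about 7%.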
(2.25)
where the sum extends over all N units in the population. In this expression the $a_i$
are random variables and the $y_i$ are a set of fixed numbers.
Clearly
$$\Pr(a_i = 0) = 1 - \frac{n}{N}$$
Thus $a_i$ is distributed as a binomial variate in a single trial, with $P = n/N$. Hence
$$E(a_i) = P = \frac{n}{N} \qquad (2.26)$$
To find $V(\bar{y})$ we need also the covariance of $a_i$ and $a_j$. The product $a_i a_j$ is 1 if the
ith and jth unit are both in the sample and is zero otherwise. The probability that
two specific units are both in the sample is easily found to be $n(n-1)/N(N-1)$.
Hence
$$\mathrm{Cov}(a_i a_j) = E(a_i a_j) - E(a_i)E(a_j) = \frac{n(n-1)}{N(N-1)} - \left(\frac{n}{N}\right)^2 = -\frac{n}{N(N-1)}\left(1-\frac{n}{N}\right) \qquad (2.27)$$
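The moments of the indicator variates can be verified by counting samples directly. The following sketch is an added illustration with hypothetical function names; it labels the units 0, ..., N - 1 and, by symmetry, examines only the first two units.

```python
from fractions import Fraction
from itertools import combinations


def indicator_moments(N, n):
    """Exact E(a_i), E(a_i a_j) and Cov(a_i, a_j) for units 0 and 1 under SRS."""
    samples = list(combinations(range(N), n))
    total = len(samples)
    e_a = Fraction(sum(0 in s for s in samples), total)
    e_aa = Fraction(sum(0 in s and 1 in s for s in samples), total)
    cov = e_aa - e_a ** 2  # E(a_1) equals E(a_0) by symmetry
    return e_a, e_aa, cov


def covariance_formula(N, n):
    """Right-hand side of (2.27)."""
    return -Fraction(n, N * (N - 1)) * (1 - Fraction(n, N))


e_a, e_aa, cov = indicator_moments(6, 3)
print(e_a, e_aa, cov, covariance_formula(6, 3))
```

The covariance is negative because drawing one unit leaves fewer places in the sample for any other unit.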
(2.28)
(2.29)
using (2.26) and (2.27). Completing the square on the cross-product term gives
Since the probability that the ith unit is drawn is 1/N at each draw, the variate
$t_i$ is distributed as a binomial number of successes out of n trials with p = 1/N.
Hence
(2.33)
(2.34)
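Using (2.32), (2.33), and (2.34), we have, for sampling with replacement,
$$V(\bar{y}) = \frac{1}{n^2}\left[\sum_{i=1}^{N} y_i^2\,\frac{n(N-1)}{N^2} - 2\sum_{i<j} y_i y_j\,\frac{n}{N^2}\right] \qquad (2.35)$$
$$= \frac{1}{nN}\sum_{i=1}^{N}(y_i-\bar{Y})^2 = \frac{\sigma^2}{n} = \frac{N-1}{N}\,\frac{S^2}{n} \qquad (2.36)$$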
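Result (2.36) can be checked by listing all $N^n$ equally likely ordered samples drawn with replacement. This is an added sketch under the assumption that every ordered sequence of draws has the same probability; the population values are those of Exercise 2.1 and the function name is illustrative.

```python
from fractions import Fraction
from itertools import product


def variance_with_replacement(values, n):
    """Exact V(ybar) over all N**n equally likely ordered samples."""
    N = len(values)
    pop_mean = Fraction(sum(values), N)
    draws = list(product(values, repeat=n))
    return sum((Fraction(sum(d), n) - pop_mean) ** 2 for d in draws) / len(draws)


population = [8, 3, 1, 11, 4, 7]  # values taken from Exercise 2.1
N, n = len(population), 2
pop_mean = Fraction(sum(population), N)
sigma2 = sum((y - pop_mean) ** 2 for y in population) / N   # divisor N
S2 = sum((y - pop_mean) ** 2 for y in population) / (N - 1)  # divisor N - 1

v = variance_with_replacement(population, n)
print(v, sigma2 / n, Fraction(N - 1, N) * S2 / n)
```

Comparing with theorem 2.2, sampling without replacement multiplies this variance by $(N-n)/(N-1)$.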
$$\hat{R} = \frac{\sum_{1}^{n} y_i}{\sum_{1}^{n} x_i} = \frac{\bar{y}}{\bar{x}} \qquad (2.38)$$
Examples of this kind occur frequently when the sampling unit (the household)
comprises a group or cluster of elements (adult males) and our interest is in the
population mean per element. Ratios also appear in many other applications, for
example, the ratio of loans for building purposes to total loans in a bank or the
ratio of acres of wheat to total acres on a farm.
The sampling distribution of $\hat{R}$ is more complicated than that of $\bar{y}$ because both
the numerator $\bar{y}$ and the denominator $\bar{x}$ vary from sample to sample. In small
samples the distribution of $\hat{R}$ is skew and $\hat{R}$ is usually a slightly biased estimate of
$R$. In large samples the distribution of $\hat{R}$ tends to normality and the bias becomes
negligible. The following approximate result will serve for most purposes; the
distribution of $\hat{R}$ is studied in more detail in Chapter 6.
Theorem 2.5. If variates $y_i$, $x_i$ are measured on each unit of a simple random
sample of size n, assumed large, the MSE and variance of $\hat{R} = \bar{y}/\bar{x}$ are each
approximately
$$\mathrm{MSE}(\hat{R}) \doteq V(\hat{R}) \doteq \frac{1-f}{n\bar{X}^2}\,\frac{\sum_{i=1}^{N}(y_i - Rx_i)^2}{N-1} \qquad (2.39)$$
where $R = \bar{Y}/\bar{X}$ is the ratio of the population means and $f = n/N$.
Proof.
$$\hat{R} - R = \frac{\bar{y}}{\bar{x}} - R = \frac{\bar{y} - R\bar{x}}{\bar{x}} \qquad (2.40)$$
If n is large, $\bar{x}$ should not differ greatly from $\bar{X}$. In order to avoid having to work
out the distribution of the ratio of two random variables $(\bar{y} - R\bar{x})$ and $\bar{x}$, we replace
$\bar{x}$ by $\bar{X}$ in the denominator of (2.40) as an approximation. This gives
$$\hat{R} - R \doteq \frac{\bar{y} - R\bar{x}}{\bar{X}} \qquad (2.41)$$
(2.43)
N-l
it is customary to take
n-l
This estimate can be shown to have a bias of order 1/n.
For the estimated standard error of $\hat{R}$, this gives
(2.46)
If $\bar{X}$ is not known, the sample estimate $\bar{x}$ is substituted in the denominator.
One way to compute $s(\hat{R})$ is to express it as
(2.47)
Example. Table 2.3 shows the number of persons ($x_1$), the weekly family income ($x_2$),
and the weekly expenditure on food ($y$) in a simple random sample of 33 low-income
families. Since the sample is small, the data are intended only to illustrate the calculations.
Estimate from the sample (a) the mean weekly expenditure on food per family, (b) the
mean weekly expenditure on food per person, and (c) the percentage of the income that is
spent on food. Compute the standard errors of these estimates.
Weekly Expenditure on Food per Family. This is the ordinary sample mean
$$\bar{y} = \frac{907.2}{33} = \$27.49$$
By theorem 2.2 (ignoring the fpc), its standard error is
$$s_{\bar{y}} = \frac{s}{\sqrt{n}} = \frac{1}{\sqrt{n}}\sqrt{\frac{\sum (y_i-\bar{y})^2}{n-1}} = \frac{1}{\sqrt{n(n-1)}}\sqrt{\sum y_i^2 - \frac{(\sum y_i)^2}{n}}$$
$$= \frac{1}{\sqrt{(33)(32)}}\sqrt{28{,}224 - \frac{(907.2)^2}{33}} = \$1.76$$
(The uncorrected sum of squares 28,224 is given underneath Table 2.3.)
Weekly Expenditure on Food per Person. Since the size of family varies, the estimate is a
ratio of two variables,
$$\hat{R}_1 = \frac{\sum y}{\sum x_1} = \frac{907.2}{123} = \$7.38 \text{ per person}$$
The sums of squares and products needed to compute $s(\hat{R}_1)$ by (2.47) are found under
Table 2.3. We need in addition
$$2\hat{R}_1 = 14.7512, \qquad \hat{R}_1^2 = 54.3996, \qquad \bar{x}_1 = 3.7273$$
Extra decimals are carried in $\hat{R}_1$, $2\hat{R}_1$, $\hat{R}_1^2$ to preserve accuracy.
Hence, from (2.47),
By (2.47) the reader may verify that the standard error is 2.38%.
TABLE 2.3
SIZE, WEEKLY INCOME, AND FOOD COST OF 33 FAMILIES

Family   Size   Income   Food Cost     Family   Size   Income   Food Cost
Number   x1     x2       y             Number   x1     x2       y
   1      2      62      14.3            18      4      83      36.0
   2      3      62      20.8            19      2      85      20.6
   3      3      87      22.7            20      4      73      27.7
   4      5      65      30.5            21      2      66      25.9
   5      4      58      41.2            22      5      58      23.3
   6      7      92      28.2            23      3      77      39.8
   7      2      88      24.2            24      4      69      16.8
   8      4      79      30.0            25      7      65      37.8
   9      2      83      24.2            26      3      77      34.8
  10      ~      62      44.4            27      3      69      28.7
  11      3      63      13.4            28      6      95      63.0
  12      6      62      19.8            29      2      77      19.5
  13      4      60      29.4            30      ~      69      21.6
  14      4      75      27.1            31      6      69      18.2
  15      2      90      22.2            32      4      67      20.1
  16      5      75      37.7            33      2      63      20.7
  17      3      69      22.6
Total           123     2394    907.2
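Parts (a) and (c) of the example can be recomputed from the income and food-cost columns of Table 2.3. The sketch below is an added illustration; it assumes the large-sample standard error of a ratio from theorem 2.5 with the fpc ignored, $N-1$ replaced by $n-1$, and the sample mean $\bar{x}$ substituted for $\bar{X}$. Variable names are illustrative.

```python
import math

# Columns of Table 2.3 for families 1 to 33.
food = [14.3, 20.8, 22.7, 30.5, 41.2, 28.2, 24.2, 30.0, 24.2, 44.4, 13.4,
        19.8, 29.4, 27.1, 22.2, 37.7, 22.6, 36.0, 20.6, 27.7, 25.9, 23.3,
        39.8, 16.8, 37.8, 34.8, 28.7, 63.0, 19.5, 21.6, 18.2, 20.1, 20.7]
income = [62, 62, 87, 65, 58, 92, 88, 79, 83, 62, 63, 62, 60, 75, 90, 75, 69,
          83, 85, 73, 66, 58, 77, 69, 65, 77, 69, 95, 77, 69, 69, 67, 63]

n = len(food)
sum_y, sum_x = sum(food), sum(income)

# (a) Mean food expenditure per family and its standard error (fpc ignored).
y_bar = sum_y / n
s2_y = sum((y - y_bar) ** 2 for y in food) / (n - 1)
se_mean = math.sqrt(s2_y / n)

# (c) Share of income spent on food: a ratio of two sample totals.
ratio = sum_y / sum_x
resid_ss = sum((y - ratio * x) ** 2 for x, y in zip(income, food))
se_ratio = math.sqrt(resid_ss / (n - 1)) / (math.sqrt(n) * (sum_x / n))

print(f"mean = {y_bar:.2f}, s.e. = {se_mean:.2f}")
print(f"ratio = {100 * ratio:.1f}%, s.e. = {100 * se_ratio:.2f}%")
```

The same residual sum of squares could be expanded as $\sum y^2 - 2\hat{R}\sum yx + \hat{R}^2\sum x^2$, which is the hand-calculation route the text describes.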
(2.48)
At first sight $\bar{y}_j$ seems to be a ratio estimate as in section 2.11. Although n is
fixed, $n_j$ will vary from one sample of size n to another. The complication of a ratio
estimate can be avoided by considering the distribution of $\bar{y}_j$ over samples in
which both $n$ and $n_j$ are fixed. We assume $n_j > 0$.
In the totality of samples with given $n$ and $n_j$, the probability that any specific set
of $n_j$ units from the $N_j$ units in domain j is drawn is
$$\frac{{}_{N-N_j}C_{n-n_j}}{{}_{N-N_j}C_{n-n_j}\cdot{}_{N_j}C_{n_j}} = \frac{1}{{}_{N_j}C_{n_j}}$$
Since each specific set of $n_j$ units from domain j can appear with all selections of
$(n - n_j)$ units from the $(N - N_j)$ that are not in domain j, the numerator above is the
number of samples containing a specified set of $n_j$, and the denominator is the total
number of samples. It follows that theorems 2.1, 2.2, and 2.4 apply to the $y_{jk}$ if we
put $n_j$ for $n$ and $N_j$ for $N$.
From theorem 2.1: $\bar{y}_j$ is an unbiased estimate of $\bar{Y}_j$ (2.49)
From theorem 2.2: the standard error of $\bar{y}_j$ is
$$\frac{S_j}{\sqrt{n_j}}\sqrt{1-\frac{n_j}{N_j}} \qquad (2.50)$$
where
$$S_j^2 = \sum_{k=1}^{N_j}\frac{(y_{jk}-\bar{Y}_j)^2}{N_j-1} \qquad (2.51)$$
From theorem 2.4: An estimate of the standard error of $\bar{y}_j$ is
$$\frac{s_j}{\sqrt{n_j}}\sqrt{1-\frac{n_j}{N_j}} \qquad (2.52)$$
where
$$s_j^2 = \sum_{k=1}^{n_j}\frac{(y_{jk}-\bar{y}_j)^2}{n_j-1} \qquad (2.53)$$
If the value of $N_j$ is not known, the quantity $n/N$ may be used in place of $n_j/N_j$
when computing the fpc. (With simple random sampling, $n_j/N_j$ is an unbiased
estimate of $n/N$.)
Alternatively, if the total amount receivable in the list is known, a ratio estimate
can be used. The sample gives an estimate of the ratio (total amount of unpaid
bills)/(total amount of all bills). This is multiplied by the known total amount
receivable in the list.
If neither $N_j$ nor the total receivables is known, these estimates cannot be made.
Instead, we multiply the sample total of the y's over units falling in the jth domain
by the raising factor $N/n$. This gives the estimate
(2.54)
We will show that $\hat{Y}_j$ is unbiased and obtain its standard error over repeated
samples of size n. The device of keeping $n_j$ fixed as well as $n$ does not help in this
problem.
In presenting the proof we revert to the original notation, in which $y_i$ is the
measurement on the ith unit in the population. Define for every unit in the
population a new variate $y_i'$, where $y_i' = y_i$ if the unit is in the jth domain and
$y_i' = 0$ otherwise.
In a simple random sample of size n, $y_i' = y_i$ for each of the $n_j$ units that lie in the
jth domain; $y_i' = 0$ for each of the remaining $n - n_j$ units. If $\bar{y}'$ is the ordinary
sample mean of the $y_i'$, the quantity
(2.56)
This result shows that the estimate $\hat{Y}_j$ as defined in equation (2.54) is N times
the sample mean of the $y_i'$.
In repeated samples of size n we can clearly apply theorems 2.1, 2.2, and 2.4 to
the variates $y_i'$. These show that $\hat{Y}_j$ is an unbiased estimate of $Y_j$ with standard
error
$$\sigma(\hat{Y}_j) = \frac{NS'}{\sqrt{n}}\sqrt{1-(n/N)} \qquad (2.57)$$
where $S'$ is the population standard deviation of the $y_i'$. In order to compute $S'$, we
regard the population as consisting of the $N_j$ values $y_i$ that are in the jth domain
and of $N - N_j$ zero values. Thus
$$S'^2 = \frac{1}{N-1}\left(\sum_{j\text{th dom}} y_i^2 - \frac{Y_j^2}{N}\right) \qquad (2.58)$$
From theorem 2.4 a sample estimate of the standard error of $\hat{Y}_j$ is
$$s(\hat{Y}_j) = \frac{Ns'}{\sqrt{n}}\sqrt{1-(n/N)} \qquad (2.59)$$
In computing $s'$, any unit not in the jth domain is given a zero value. Some students
seem to have a psychological objection to doing this, but the method is sound.
The methods of this and the preceding section also apply to surveys in which the
frame used contains units that do not belong to the population as it has been
defined. An example illustrates this application.
Example. From a list of 2422 minor household expenditures a simple random sample
of 180 items was drawn in order to estimate the total spent for operation of the household.
Certain types of expenditure (on clothing and car upkeep) were not considered relevant. Of
the 180 sample items, 152 were relevant. The sum and uncorrected sum of squares of the
relevant amounts (in dollars) were as follows.
$$\sum y_i' = 343.5, \qquad \sum y_i'^2 = 1491.38$$
Estimate the total expenditure for household operation and give the standard error of the
estimate.
$$\hat{Y}_j = \frac{N}{n}\sum_{i=1}^{n} y_i' = \frac{(2422)(343.5)}{180} = \$4622$$
From (2.59)
$$s(\hat{Y}_j) = \frac{Ns'}{\sqrt{n}}\sqrt{1-(n/N)}$$
In computing $s'$ we regard our sample of 180 items as having 28 zeros. Hence
$$s'^2 = \frac{1}{179}\left[\sum y_i'^2 - \frac{(\sum y_i')^2}{n}\right] = \frac{1}{179}\left[1491.38 - \frac{(343.5)^2}{180}\right] = 4.670$$
Finally,
$$s(\hat{Y}_j) = (2422)\sqrt{\frac{4.670}{180}}\sqrt{1-\frac{180}{2422}} = \$375$$
The estimate is not precise, its coefficient of variation 375/4622 being about 8%.
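The household-expenditure example uses only four numbers from the sample, so it is easy to recompute. The sketch below is an added illustration with assumed variable names; the 28 irrelevant items enter as zeros, which is why the divisor is the full sample size n = 180 and not 152.

```python
import math

N, n = 2422, 180                 # items on the list, items sampled
sum_y, sum_y2 = 343.5, 1491.38   # sums over the 152 relevant items; the rest are zeros

total_hat = N * sum_y / n                               # estimated domain total
s2_prime = (sum_y2 - sum_y ** 2 / n) / (n - 1)          # variance with zeros included
se_total = N * math.sqrt(s2_prime / n) * math.sqrt(1 - n / N)
cv = se_total / total_hat

print(f"estimated total = ${total_hat:.0f}")
print(f"s'^2 = {s2_prime:.3f}")
print(f"standard error = ${se_total:.0f}, cv = {100 * cv:.0f}%")
```

Because zeros add nothing to either sum, no record of the individual irrelevant items is needed beyond their count.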
In this example expenditures on car upkeep and clothing were excluded as not
relevant and therefore were scored as zeros in the sample. In some applications it
is known in advance that certain units in the population contribute nothing to the
total that is being estimated. For instance, in a survey of stores to estimate total
sales of luggage, some stores do not handle luggage; certain area sampling units
for farm studies contain no farms. Sometimes it is possible, by expenditure of
effort, to identify and count the units that contribute nothing, so that in our
notation $(N - N_j)$, hence $N_j$, is known.
Consequently it is worth examining by how much $V(\hat{Y}_j)$ is reduced when $N_j$ is
known. If $N_j$ is not known, (2.57) gives
$$V(\hat{Y}_j) = \frac{N^2 S'^2}{n}\left(1-\frac{n}{N}\right)$$
If $\bar{Y}_j$ and $S_j$ are the mean and standard deviation in the domain of interest (i.e.,
among the nonzero units) the reader may verify that
$$V(\hat{Y}_j) \doteq \frac{N^2}{n}\left(P_j S_j^2 + P_j Q_j \bar{Y}_j^2\right)\left(1-\frac{n}{N}\right) \qquad (2.62)$$
If nonzero units are identified, we draw a sample of size $n_j$ from them. The
estimate of the domain total is $N_j\bar{y}_j$ with variance
$$V(N_j\bar{y}_j) = \frac{N_j^2}{n_j}S_j^2\left(1-\frac{n_j}{N_j}\right) = \frac{N^2P_j^2}{n_j}S_j^2\left(1-\frac{n_j}{N_j}\right) \qquad (2.63)$$
The comparable variances are (2.62) and (2.63). In (2.62) the average number of
nonzero units in the sample of size n is $nP_j$. If we take $n_j = nP_j$ in (2.63), so that the
number of nonzeros to be measured is about the same with both methods, (2.63)
becomes
$$V(N_j\bar{y}_j) = \frac{N^2}{n}P_jS_j^2\left(1-\frac{n}{N}\right)$$
where $\bar{Y}_\nu$, $S_\nu$, $f_\nu$ are the population mean, s.d., and fpc, and $\tau$ is a number > 0. Then
the Lindeberg-type condition
$$\lim_{\nu\to\infty}\frac{\sum_{S_{\nu\tau}}(y_{\nu i}-\bar{Y}_\nu)^2}{(N_\nu-1)S_\nu^2} = 0$$
is necessary and sufficient to ensure that $\bar{y}_\nu$ tends to normality with the mean and
variance given in theorems 2.1 and 2.2.
This imposing body of knowledge leaves something to be desired. It is not easy
to answer the direct question: "For this population, how large must n be so that
the normal approximation is accurate enough?" Non-normal distributions vary
greatly both in the nature and in the degree of their departure from normality. The
distributions of many types of economic enterprise (stores, chicken farms, towns)
exhibit a marked positive skewness, with a few large units and many small units.
The same kind of skewness is displayed by some biological populations (e.g., the
number of rats or flies per city block).
[Figure 2.1: histogram. Horizontal axis: city size (thousands), 0 to 1100. Vertical axis: frequency, 0 to 140.]
Fig. 2.1. Frequency distribution of sizes of 196 United States cities in 1920.
As an illustration of a positively skewed distribution, Fig. 2.1 shows the
frequency distribution of the numbers of inhabitants in 196 large United States
cities in 1920. (The four largest cities, New York, Chicago, Philadelphia, and
Detroit, were omitted. Their inclusion would extend the horizontal scale to more
than five times the length shown and would, of course, greatly accentuate the
skewness.) Figure 2.2 shows the frequency distribution of the total number of
inhabitants in each of 200 simple random samples, with n = 49, drawn from this
population. The distribution of the sample totals, and likewise of the means, is
much more similar to a normal curve but still displays some positive skewness.
[Figure 2.2: histogram. Horizontal axis: millions, 4 to 9. Vertical axis: frequency.]
Fig. 2.2. Frequency distribution of totals of 200 simple random samples with n = 49.
From statistical theory and from the results of sampling experiments on skewed
populations, some statements can be made about what usually happens to
confidence probabilities when we sample from positively skew populations, as
follows :
1. The frequency with which the assertion
$$\bar{y} - 1.96 s_{\bar{y}} < \bar{Y} < \bar{y} + 1.96 s_{\bar{y}}$$
is wrong is usually higher than 5%.
2. The frequency with which
$$\bar{Y} > \bar{y} + 1.96 s_{\bar{y}}$$
is greater than 2.5%.
3. The frequency with which
$y$ takes only two values: the value $h$ with probability $P$ and the value 0 with
probability $Q$. The population mean is $\bar{Y} = Ph$. A simple random sample of size n
shows $a$ units that have the value $h$ and $n - a$ units that have the value 0. For the
sample,
$$\bar{y} = \frac{ah}{n}$$
$$(n-1)s^2 = \sum y_i^2 - n\bar{y}^2 = ah^2 - \frac{a^2h^2}{n}$$
(2.68)
Let n = 400, P = 0.1. Then $\bar{Y} = 0.1h$. By trial we find that if a = 29 in expression
(2.68) the upper confidence limit is 39.18h/400 = 0.098h, whereas a = 30 gives
40.34h/400 = 0.101h. Hence any value of $a \le 29$ gives an upper confidence limit
that is too low. Similarly we find that if $a \ge 54$ the lower limit is too high.
The variate $a$ follows the binomial distribution with n = 400, P = 0.1. The
tables (Harvard Computation Laboratory, 1955) show that
The total probability of being wrong is not far from 0.05. In more than 60% of
the wrong statements, the true mean is higher than the stated upper limit.
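The trial-and-error search and the binomial tail probabilities in this example can be reproduced without tables. The sketch below is an added illustration; it assumes the 95% limits take the form $\bar{y} \pm 1.96\,s/\sqrt{n}$ with no fpc, sets $h = 1$ without loss of generality, and uses hypothetical function names.

```python
import math


def normal_limits(a, n, z=1.96):
    """Lower and upper 95% limits for the mean when a of n units equal h = 1."""
    y_bar = a / n
    s2 = (a - a * a / n) / (n - 1)   # from (n - 1) s^2 = a h^2 - a^2 h^2 / n
    half_width = z * math.sqrt(s2 / n)
    return y_bar - half_width, y_bar + half_width


def binomial_pmf(k, n, p):
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


n, P = 400, 0.1
too_low = [a for a in range(n + 1) if normal_limits(a, n)[1] < P]   # upper limit below P
too_high = [a for a in range(n + 1) if normal_limits(a, n)[0] > P]  # lower limit above P

p_low = sum(binomial_pmf(a, n, P) for a in too_low)
p_high = sum(binomial_pmf(a, n, P) for a in too_high)

print("upper limit too low for a <=", max(too_low))
print("lower limit too high for a >=", min(too_high))
print(f"Pr(wrong) = {p_low + p_high:.4f}, share with true mean above the upper limit = "
      f"{p_low / (p_low + p_high):.2f}")
```

The two tail probabilities are unequal because the binomial distribution with P = 0.1 is positively skewed, which is the point of the example.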
There is no safe general rule as to how large n must be for use of the normal
approximation in computing confidence limits. For populations in which the
principal deviation from normality consists of marked positive skewness, a crude
rule that I have occasionally found useful is
$$n > 25G_1^2 \qquad (2.69)$$
where $G_1$ is Fisher's measure of skewness (Fisher, 1932).
$$G_1 = \frac{E(y_i-\bar{Y})^3}{\sigma^3} = \frac{1}{N\sigma^3}\sum_{i=1}^{N}(y_i-\bar{Y})^3 \qquad (2.70)$$
This rule is designed so that a 95% confidence probability statement will be
wrong not more than 6% of the time. It is derived mathematically by assuming
that any disturbance due to moments of the distribution of $\bar{y}$ higher than the third
is negligible. The rule attempts to control only the total frequency of wrong
statements, ignoring the direction of the error of estimate.
By calculating $G_1$, or an estimate, for a specific population, we can obtain a
rough idea of the sample size needed for application of the normal approximation
to compute confidence limits. The result should be checked by sampling experiments
whenever possible.
TABLE 2.4
FREQUENCY DISTRIBUTION OF ACRES IN CROPS ON 556 FARMS

Class Intervals   Coded Scale   Frequency
(acres)           y_i           f_i         f_i y_i   f_i y_i^2   f_i y_i^3
0-29              -0.9           47         -42.3       38.1        -34.3
30-63              0            143           0          0            0
64-97              1            154         154        154          154
98-131             2             82         164        328          656
132-165            3             62         186        558        1,674
166-199            4             33         132        528        2,112
200-233            5             13          65        325        1,625
234-267            6              6          36        216        1,296
268-301            7              4          28        196        1,372
302-335            8              6          48        384        3,072
336-369            9              2          18        162        1,458
370-403           10              0           0          0            0
404-437           11              2          22        242        2,662
438-471           12              0           0          0            0
472-505           13              2          26        338        4,394

$$E(y_i^3) = \frac{20{,}440.7}{556} = 36.76385$$
$$\sigma^2 = E(y_i^2) - \bar{Y}^2 = 3.97479$$
$$\kappa_3 = E(y_i-\bar{Y})^3 = E(y_i^3) - 3E(y_i^2)\bar{Y} + 2\bar{Y}^3 = 15.411$$
$$G_1 = \frac{\kappa_3}{\sigma^3} = \frac{15.411}{7.925} = 1.9$$
Example. The data in Table 2.4 show the numbers of acres devoted to crops on 556
farms in Seneca County, New York. The data come from a series of studies by West (1951),
who drew repeated samples of size 100 from this population and examined the frequency
distributions of $\bar{y}$, $s$, and Student's t for several items of interest in farm management
surveys.
The computation of $G_1$ is shown under the table. The computations are made on a coded
scale, and, since $G_1$ is a pure number, there is no need to return to the original scale. Note
that the first class-interval was slightly different from the others.
Since $G_1 = 1.9$, we take as a suggested minimum n
$$n = (25)(1.9)^2 = 90$$
For samples of size 100, West found with this item (acres in crops) that neither the
distribution of $\bar{y}$ nor that of Student's t differed significantly from the corresponding
theoretical normal distributions.
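The computation of $G_1$ from the coded frequency distribution of Table 2.4 can be sketched as follows. This is an added illustration with assumed variable names; it uses the moment formulas shown under the table and, like the text, rounds $G_1$ to one decimal before applying rule (2.69).

```python
coded = [-0.9] + list(range(0, 14))   # coded scale y_i for the 15 classes
freq = [47, 143, 154, 82, 62, 33, 13, 6, 4, 6, 2, 0, 2, 0, 2]

N = sum(freq)
m1 = sum(f * y for f, y in zip(freq, coded)) / N        # E(y)
m2 = sum(f * y ** 2 for f, y in zip(freq, coded)) / N   # E(y^2)
m3 = sum(f * y ** 3 for f, y in zip(freq, coded)) / N   # E(y^3)

variance = m2 - m1 ** 2
kappa3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3                 # third central moment
g1 = kappa3 / variance ** 1.5

n_min = 25 * round(g1, 1) ** 2                          # rule (2.69) with G1 = 1.9
print(f"N = {N}, G1 = {g1:.2f}, suggested minimum n = {n_min:.0f}")
```

Using the unrounded value of $G_1$ would raise the suggested minimum by a few units, which matters little for a rule described as crude.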
Good sampling practice tends to make the normal approximation more valid.
Failure of the normal approximation occurs mostly when the population contains
some extreme individuals who dominate the sample average when they are
present. However, these extremes also have a much more serious effect of
increasing the variance of the sample and decreasing the precision. Consequently,
it is wise to segregate them and make separate plans for coping with them, perhaps
by taking a complete enumeration of them if they are not numerous. This removal
of the extremes from the main body of the population reduces the skewness and
improves the normal approximation. This technique is an example of stratified
sampling, which is discussed in Chapter 5.
EXERCISES
2.1 In a population with N = 6 the values of $y_i$ are 8, 3, 1, 11, 4, and 7. Calculate the
sample mean $\bar{y}$ for all possible simple random samples of size 2. Verify that $\bar{y}$ is an unbiased
estimate of $\bar{Y}$ and that its variance is as given in theorem 2.2.
2.2 For the same population, calculate $s^2$ for all simple random samples of size 3 and
verify that $E(s^2) = S^2$.
2.3 If random samples of size 2 are drawn with replacement from this population, show
by finding all possible samples that $V(\bar{y})$ satisfies the equation
$$V(\bar{y}) = \frac{\sigma^2}{n} = \frac{S^2}{n}\,\frac{(N-1)}{N}$$
2.4 A simple random sample of 30 households was drawn from a city area containing
14,848 households. The numbers of persons per household in the sample were as follows.
5, 6, 3, 3, 2, 3, 3, 3, 4, 4, 3, 2, 7, 4, 3, 5, 4, 4, 3, 3, 4, 3, 3, 1, 2, 4, 3, 4, 2, 4
Estimate the total number of people in the area and compute the probability that this
estimate is within ±10% of the true value.
* Henceforth in this book, the surname Rao will refer to J. N. K. Rao unless otherwise noted.
2.5 In a study of the possible use of sampling to cut down the work in taking inventory
in a stock room, a count is made of the value of the articles on each of 36 shelves in the
room. The values to the nearest dollar are as follows.
29,38,42,44,45,47,51,53,53,54,56,56,56, 58,58,59,60,60,
60,60,61,61,61,62,64,65,65,67,67,68,69,71, 74, 77,82,85 .
The estimate of total value made from a sample is to be correct within $200, apart from a
1 in 20 chance. An advisor suggests that a simple random sample of 12 shelves will meet the
requirements. Do you agree?
$$\sum y = 2138, \qquad \sum y^2 = 131{,}682$$
2.6 After the sample in Table 2.2 (p. 28) was taken, the number of completely filled
sheets (with 42 signatures each) was counted and found to be 326. Use this information to
make an improved estimate of the total number of signatures and find the standard error of
your estimate.
2.7 From a list of 468 small 2-year colleges a simple random sample of 100 colleges was
drawn. The sample contained 54 public and 46 private colleges. Data for number of
students (y) and number of teachers (x) are shown below.

            n     Σ(y)      Σ(x)
Public     54    31,281    2,024
Private    46    13,707    1,075

            Σ(y²)         Σ(yx)        Σ(x²)
Public     29,881,219    1,729,349    111,090
Private     6,366,785      431,041     33,119

(a) For each type of college in the population, estimate the ratio (number of
students)/(number of teachers). (b) Compute the standard errors of your estimates. (c) For
the public colleges, find 90% confidence limits for the student/teacher ratio in the whole
population.
2.8 In the preceding example test at the 5% level whether the student/teacher ratio is
significantly different in the two types of colleges.
2.9 For the public colleges, estimate the total number of teachers (a) given that the
total number of public colleges in the population is 251, (b) without knowing this figure. In
each case compute the standard error of your estimate.
2.10 The table below shows the numbers of inhabitants in each of the 197 United
States cities that had populations over 50,000 in 1940. Calculate the standard error of the
estimated total number of inhabitants in all 197 cities for the following methods of
sampling: (a) a simple random sample of size 50, (b) a sample that includes the five largest
cities and is a simple random sample of size 45 from the remaining 192 cities, (c) a sample
that includes the nine largest cities and is a simple random sample of size 41 from the
remaining cities.
FREQUENCY DISTRIBUTION OF CITY SIZES
Size Class Size Class Size Class
(1000's) f (1000's) f (1000's) f
50-100 105 550-600 2
100-150 36 600-650 1 1500-1550
150-200 13 650-700 2
200-250 6 700-750 0 1600-1650
250-300 7 750-800
300-350 8 800-850 1900-1950
350-400 4 850-900 2
400-450 1 900-950 0 3350-3400
450-500 3 950-1000 0
500-550 0 1000-1050 0 7450-7500
2.11 Calculate the coefficient of skewness $G_1$ for the original population and the
population remaining after removing (a) the five largest cities, (b) the nine largest cities.
2.12 A small survey is to be taken to compare home-owners with renters. In the
population about 75% are owners, 25% are renters. For one item the variance is thought to
be about 15 for both owners and renters. The standard error of the difference between the
two domain means is not to exceed 1. How large a sample is needed (a) if owners and
renters can be identified in advance of drawing the sample, (b) if not? (An approximate
answer will do in (b); an exact discussion requires binomial tables.)
2.13 A simple random sample of size 3 is drawn from a population of size N with
replacement. Show that the probabilities that the sample contains 1, 2, and 3 different units
(for example, aaa, aab, abc, respectively) are
$$P_1 = \frac{1}{N^2}, \qquad P_2 = \frac{3(N-1)}{N^2}, \qquad P_3 = \frac{(N-1)(N-2)}{N^2}$$
As an estimate of $\bar{Y}$ we take $\bar{y}'$, the unweighted mean over the different units in the
sample. Show that the average variance of $\bar{y}'$ is
Hence show that $V(\bar{y}') < V(\bar{y})$, where $\bar{y}$ is the ordinary mean of the n observations
in the sample. The result that $V(\bar{y}') < V(\bar{y})$ for any n > 2 was proved by Des Raj and
Khamis (1958).
2.14 Two dentists A and B make a survey of the state of the teeth of 200 children in a
village. Dr. A selects a simple random sample of 20 children and counts the number of
decayed teeth for each child, with the following results.
Number of decayed teeth/child 0 1 2 3 4 5 6 7 8 9 10
Number of children 8 4 2 2 1 0 0 0 1 1
Dr. B , using the same dental techniques, examines all 200 children, recording merely
those who have no decayed teeth. He finds 60 children with no decayed teeth.
Estimate the total number of decayed teeth in the village children, (a) using A 's results
only, (b) using both A's and B's results. (c) Are the estimates unbiased? (d) Which
estimate do you expect to be more precise?
2.15 A company intends to interview a simple random sample of employees who have
been with it more than 5 years. The company has $1000 to spend, and each interview costs
$10. There is no separate list of employees with more than 5 years service, but a list can be
compiled from the files at a cost of $200. The company can either (a) compile the list and
interview a simple random sample drawn from the eligible employees or (b) draw a simple
random sample of all employees, interviewing only those eligible. The cost of rejecting
those not eligible in the sample is assumed negligible.
Show that for estimating a total over the population of eligible employees, plan (a) gives
a smaller variance than plan (b) only if $C_j < 2\sqrt{Q_j}$, where $C_j$ is the coefficient of variation of
the item among eligible employees and $Q_j$ is the proportion of noneligibles in the company.
Ignore the fpc.
2.16 A simple random sample of size $n = n_1 + n_2$ with mean $\bar{y}$ is drawn from a finite
population, and a simple random subsample of size $n_1$ is drawn from it with mean $\bar{y}_1$. Show
that (a) $V(\bar{y}_1 - \bar{y}_2) = S^2[(1/n_1) + (1/n_2)]$, where $\bar{y}_2$ is the mean of the remaining $n_2$ units in
the sample, (b) $V(\bar{y}_1 - \bar{y}) = S^2[(1/n_1) - (1/n)]$, (c) $\mathrm{Cov}\{\bar{y}, \bar{y}_1 - \bar{y}\} = 0$. Repeated sampling
implies repetition of the drawing of both the sample and the subsample.
2.17 The number of distinct simple random samples of size n is of course $N!/n!(N-n)!$.
There has been some interest in finding smaller sets of samples of size n that have the
same properties as the set of simple random samples. One set is that of balanced incomplete
block (bib) designs. These are samples of n distinct units out of N such that (i) every unit
appears in the same number (r) of samples, (ii) every pair of units appears together in $\lambda$
samples.
Verify that $\lambda = r(n-1)/(N-1)$ and that the number of distinct samples in the set is $rN/n$.
Over the set of bib samples, prove in the usual notation that if $\bar{y}$ is the mean of a sample, (a)
$V(\bar{y}) = (1-f)S^2/n$ and (b) $v(\bar{y}) = (1-f)\sum(y_i-\bar{y})^2/n(n-1)$ is an unbiased estimate of
$V(\bar{y})$.
Note. There is no general method for finding the smallest r for which a bib can be
constructed. Sometimes the smallest known r provides $N!/n!(N-n)!$ samples, bringing us
back to simple random samples. But for N = 91, n = 10, the smallest bib set has 91 samples
as against over 6 United States trillion SRS. Avadhani and Sukhatme (1973) have shown
how bib designs may be used in attempting to reduce travel costs between sampling units.
2.18 The following is an illustration by Royall (1968) of the fact that in simple random
sampling the sample mean $\bar{y}$ does not have uniformly minimum variance in the class of
estimators of the form $\sum w_{is}y_i$ considered by Godambe (1955), where the weight $w_{is}$ may
depend on the other units that fall in the sample. For N = 3, n = 2, consider the estimator
$$\hat{\bar{Y}}_{12} = \tfrac{1}{2}y_1 + \tfrac{1}{2}y_2; \qquad \hat{\bar{Y}}_{13} = \tfrac{1}{2}y_1 + \tfrac{2}{3}y_3; \qquad \hat{\bar{Y}}_{23} = \tfrac{1}{2}y_2 + \tfrac{1}{3}y_3$$
where $\hat{\bar{Y}}_{ij}$ is the estimator for the sample that has units (i, j). Prove Royall's results that $\hat{\bar{Y}}_{ij}$
is unbiased and that $V(\hat{\bar{Y}}_{ij}) < V(\bar{y})$ if $y_3(3y_2 - 3y_1 - y_3) > 0$. The illustration is taken from an
earlier example by Roy and Chakravarti (1960).
2.19 This exercise is another example of estimators geared to particular features of
populations. After the decision to take a simple random sample had been made, it was
realized that $y_1$ would be unusually low and $y_N$ would be unusually high. For this situation,
Särndal (1972) examined the following unbiased estimator of $\bar{Y}$.
$\hat{\bar{Y}}_S = \bar{y} + c$ if the sample contains $y_1$ but not $y_N$
$= \bar{y} - c$ if the sample contains $y_N$ but not $y_1$
$= \bar{y}$ for all other samples
where c is a constant. Prove Särndal's result that $\hat{\bar{Y}}_S$ is unbiased with
$$V(\hat{\bar{Y}}_S) = (1-f)\left[\frac{S^2}{n} - \frac{2c}{(N-1)}(y_N - y_1 - nc)\right]$$
Sampling Proportions
and Percentages
$$\bar{Y} = \frac{\sum_{1}^{N} y_i}{N} = \frac{A}{N} = P \qquad (3.2)$$
Also, for the sample,
$$\bar{y} = \frac{\sum_{1}^{n} y_i}{n} = \frac{a}{n} = p \qquad (3.3)$$
Consequently the problem of estimating $A$ and $P$ can be regarded as that of
estimating the total and mean of a population in which every $y_i$ is either 1 or 0. In
order to use the theorems in Chapter 2, we first express $S^2$ and $s^2$ in terms of $P$ and
$p$. Note that
$$\sum_{1}^{N} y_i^2 = A = NP, \qquad \sum_{1}^{n} y_i^2 = a = np$$
Hence
$$S^2 = \frac{\sum_{1}^{N}(y_i-\bar{Y})^2}{N-1} = \frac{\sum_{1}^{N} y_i^2 - N\bar{Y}^2}{N-1}$$
$$= \frac{1}{N-1}(NP - NP^2) = \frac{N}{N-1}PQ \qquad (3.4)$$
where $Q = 1 - P$. Similarly
$$s^2 = \frac{\sum_{1}^{n}(y_i-\bar{y})^2}{n-1} = \frac{n}{n-1}pq \qquad (3.5)$$
Application of theorems 2.1, 2.2, and 2.4 to this population gives the following
results for simple random sampling of the units that are being classified.
Proof. In the corollary of theorem 2.4 it was shown that for a variate $y_i$ an
unbiased estimate of the variance of the sample mean $\bar{y}$ is
$$v(\bar{y}) = \frac{s^2}{n}\,\frac{(N-n)}{N} \qquad (3.9)$$
For proportions, $p$ takes the place of $\bar{y}$, and in (3.5) we showed that
$$s^2 = \frac{n}{n-1}pq \qquad (3.10)$$
Hence
$$v(p) = s_p^2 = \frac{N-n}{(n-1)N}pq \qquad (3.11)$$
$$\frac{pq}{n-1}$$
The result may appear puzzling to some readers, since the expression $pq/n$ is
almost invariably used in practice for the estimated variance. The fact is that $pq/n$
is not unbiased even with an infinite population.
Example. From a list of 3042 names and addresses, a simple random sample of 200
names showed on investigation 38 wrong addresses. Estimate the total number of
addresses needing correction in the list and find the standard error of this estimate. We have
N = 3042, n = 200, a = 38, p = 0.19
The estimated total number of wrong addresses is
$$\hat{A} = Np = (3042)(0.19) = 578$$
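The arithmetic of this example can be checked with a short script. This is a minimal sketch, assuming the standard error of the estimated total is N times the square root of the unbiased variance estimate (3.11); the function name `estimated_total_and_se` is hypothetical and not taken from the text.

```python
import math

def estimated_total_and_se(N, n, a):
    """Return (estimated total, standard error of the total) for a 0/1 attribute.

    Assumes simple random sampling and the variance estimate of (3.11):
    v(p) = (N - n) / ((n - 1) * N) * p * q.
    """
    p = a / n
    q = 1.0 - p
    v_p = (N - n) / ((n - 1) * N) * p * q
    total = N * p                   # estimated number of units in class C
    se_total = N * math.sqrt(v_p)   # standard error of that estimated total
    return total, se_total

total, se_total = estimated_total_and_se(N=3042, n=200, a=38)
print(round(total), round(se_total, 1))
```

With the figures of the example the script gives a total of about 578 and a standard error of roughly 82 addresses.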
The preceding formulas for the variance and the estimated variance of p hold
only if the units are classified into C or C' so that p is the ratio of the number of
units in C in the sample to the total number of units in the sample. In many surveys
each unit is composed of a group of elements, and it is the elements that are
classified. A few examples are as follows:
$$V(p) = \frac{PQ}{n}$$
The function PQ and its square root are shown in Table 3.1. These functions
may be regarded as the variance and standard deviation, respectively, for a sample
of size 1.
The functions have their greatest values when the population is equally divided
between the two classes, and are symmetrical about this point. The standard error
of p changes relatively little when P lies anywhere between 30 and 70%. At the
maximum value of $\sqrt{PQ}$, 50, a sample size of 100 is needed to reduce the standard
error of the estimate to 5%. To attain a 1% standard error requires a sample size
of 2500.
TABLE 3.1
VALUES OF PQ AND $\sqrt{PQ}$
P = Population percentage in class C
P 0 10 20 30 40 50 60 70 80 90 100
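The quantities PQ and $\sqrt{PQ}$, and the two sample sizes quoted above, can be recomputed directly. The sketch below works in percentages and ignores the fpc, so that the standard error is $\sqrt{PQ/n}$; the helper names `pq_table` and `n_for_standard_error` are hypothetical.

```python
import math

def pq_table(percentages):
    """Return rows (P, PQ, sqrt(PQ)) with P and Q = 100 - P in percent."""
    rows = []
    for P in percentages:
        Q = 100 - P
        rows.append((P, P * Q, math.sqrt(P * Q)))
    return rows

def n_for_standard_error(P, se):
    """Sample size giving standard error `se` (in percent), fpc ignored: n = PQ / se^2."""
    return P * (100 - P) / se ** 2

for P, PQ, root in pq_table(range(0, 101, 10)):
    print(f"{P:3d} {PQ:5d} {root:5.1f}")

# At P = 50%, where sqrt(PQ) is largest:
print(n_for_standard_error(50, 5), n_for_standard_error(50, 1))
```

The last line reproduces the sample sizes of 100 and 2500 mentioned in the text.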
This approach is not appropriate when interest lies in the total number of units
in the population that are in class C. In this event it is more natural to ask: Is the
estimate likely to be correct to within, say, 7% of the true total? Thus we tend to
think of the standard error expressed as a fraction or percentage of the true value,
NP. The fraction is
$$\frac{\sigma_{Np}}{NP} = \sqrt{\frac{N-n}{N-1}}\,\sqrt{\frac{Q}{nP}} \tag{3.13}$$
This quantity is called the coefficient of variation of the estimate. If the fpc is
ignored, the coefficient is $\sqrt{Q/nP}$. The ratio $\sqrt{Q/P}$, which might be considered the
coefficient of variation for a sample of size 1, is shown in Table 3.2.
TABLE 3.2
VALUES OF $\sqrt{Q/P}$ FOR DIFFERENT VALUES OF P
P = Population percentage in class C
P               0      0.1    0.5    1      5      10     20
$\sqrt{Q/P}$    ∞      31.6   14.1   9.9    4.4    3.0    2.0
P               30     40     50     60     70     80     90
$\sqrt{Q/P}$    1.5    1.2    1.0    0.8    0.7    0.5    0.3
For a fixed sample size, the coefficient of variation of the estimated total in class
C decreases steadily as the true percentage in C increases. The coefficient is high
when P is less than 5%. Very large samples are needed for precise estimates of the
total number possessing any attribute that is rare in the population. For P = 1%,
we must have $\sqrt{n} = 99$ in order to reduce the coefficient of variation of the estimate
to 0.1 or 10%. This gives a sample size of 9801. Simple random sampling, or any
method of sampling that is adapted for general purposes, is an expensive method
of estimating the total number of units of a scarce type.
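The dependence of the coefficient of variation on P can be explored numerically. The following sketch ignores the fpc and uses the coefficient $\sqrt{Q/nP}$; the function names are hypothetical. Note that the exact requirement for P = 1% and a cv of 0.1 is n = Q/(P cv²) = 9900; the figure of 9801 quoted in the text follows from the rounded tabulated value 9.9.

```python
import math

def unit_cv(P):
    """sqrt(Q/P) for a proportion P (0 < P <= 1): the cv for a sample of size 1."""
    return math.sqrt((1.0 - P) / P)

def n_for_cv(P, cv):
    """Sample size needed for coefficient of variation `cv`, fpc ignored."""
    return (1.0 - P) / (P * cv ** 2)

print(round(unit_cv(0.01), 1))        # unit cv for a 1% attribute
print(round(n_for_cv(0.01, 0.1)))     # exact sample size for cv = 10%
print(round(n_for_cv(0.20, 0.1)))     # a common attribute needs far fewer units
```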
$$\Pr(a) = \frac{n!}{a!\,(n-a)!}\,P^a Q^{n-a} \tag{3.14}$$
Example. A family of eight contains three males and five females. Find the frequency
distribution of the number of males in a simple random sample of size 4. In this case
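A = 3, A' = 5, N = 8, n = 4
From (3.16) the distribution of the number of males, a, is as follows:
a    Probability
0    $\dfrac{4!}{0!\,4!}\cdot\dfrac{5\cdot 4\cdot 3\cdot 2}{8\cdot 7\cdot 6\cdot 5} = \dfrac{1}{14}$
1    $\dfrac{4!}{1!\,3!}\cdot\dfrac{3\cdot 5\cdot 4\cdot 3}{8\cdot 7\cdot 6\cdot 5} = \dfrac{6}{14}$
2    $\dfrac{4!}{2!\,2!}\cdot\dfrac{3\cdot 2\cdot 5\cdot 4}{8\cdot 7\cdot 6\cdot 5} = \dfrac{6}{14}$
3    $\dfrac{4!}{3!\,1!}\cdot\dfrac{3\cdot 2\cdot 1\cdot 5}{8\cdot 7\cdot 6\cdot 5} = \dfrac{1}{14}$
4    Impossible = 0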
The reader may verify that the mean number of males is 3/2 and the variance is 15/28. These
results agree with the formulas previously established in section 3.2, which give
$$E(np) = nP = \frac{nA}{N} = \frac{(4)(3)}{8} = \frac{3}{2}$$
$$V(np) = nPQ\,\frac{N-n}{N-1} = 4\cdot\frac{3}{8}\cdot\frac{5}{8}\cdot\frac{4}{7} = \frac{15}{28}$$
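The verification suggested to the reader can be carried out exactly with rational arithmetic. The sketch below assumes the usual hypergeometric form for the probability of a units in class C; the function name `hypergeometric_pmf` is an illustrative choice.

```python
from fractions import Fraction
from math import comb

def hypergeometric_pmf(a, n, A, N):
    """Probability of a units in class C in a simple random sample of n from N,
    when A of the N population units are in class C."""
    if a < 0 or a > n or a > A or n - a > N - A:
        return Fraction(0)
    return Fraction(comb(A, a) * comb(N - A, n - a), comb(N, n))

A, N, n = 3, 8, 4   # three males in a family of eight, sample of four
dist = {a: hypergeometric_pmf(a, n, A, N) for a in range(n + 1)}
mean = sum(a * pr for a, pr in dist.items())
variance = sum((a - mean) ** 2 * pr for a, pr in dist.items())

for a, pr in dist.items():
    print(a, pr)
print("mean", mean, "variance", variance)
```

Fractions are printed in lowest terms, so 6/14 appears as 3/7.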
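$$\sum_{j=0}^{a} \Pr(j,\, n-j \mid \hat{A}_U,\, N-\hat{A}_U) = \alpha_U \tag{3.17}$$
$$\sum_{j=a}^{n} \Pr(j,\, n-j \mid \hat{A}_L,\, N-\hat{A}_L) = \alpha_L \tag{3.18}$$
(3.19)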
TABLE 3.3
SMALLEST VALUES OF np FOR USE OF THE NORMAL APPROXIMATION
         np = Number Observed     n =
p        in the Smaller Class     Sample Size
0.5      15                       30
0.4      20                       50
0.3      24                       80
0.2      40                       200
0.1      60                       600
0.05     70                       1400
~0       80                       ∞
The error in the normal approximation depends on all the quantities n, p, N, $\alpha_U$,
and $\alpha_L$. The quantity to which the error is most sensitive is np or more specifically
the number observed in the smaller class. Table 3.3 gives working rules for
deciding when the normal approximation (3.19) may be used.
The rules in Table 3.3 are constructed so that with 95% confidence limits the
true frequency with which the limits fail to enclose P is not greater than 5.5%.
Furthermore, the probability that the upper limit is below P is between 2.5 and
3.5%, and the probability that the lower limit exceeds P is between 2.5 and 1.5%.
Example 1. In a simple random sample of size 100, from a population of size 500, there
are 37 units in class C. Find the 95% confidence limits for the proportion and for the total
number in class C in the population. In this example
n = 100, N = 500, p = 0.37
The example lies in the range in which the normal approximation is recommended. The
estimated standard error of p is
$$\sqrt{(1-f)\,pq/(n-1)} = \sqrt{(0.8)(0.37)(0.63)/99} = 0.0434$$
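For Example 1 the standard error and approximate limits can be computed as follows. This is a sketch under stated assumptions: the standard error comes from (3.11), while the multiplier t = 1.96 and the continuity correction 1/(2n) are assumptions made here, because the normal-approximation formula itself is not reproduced in this passage. The function name `normal_limits` is hypothetical.

```python
import math

def normal_limits(n, N, p, t=1.96, continuity=True):
    """Standard error of p and approximate confidence limits for P.

    se uses the unbiased variance estimate (1 - f) * p * q / (n - 1).
    The half-width t * se + 1/(2n) is an assumed form of the approximation.
    """
    q = 1.0 - p
    f = n / N
    se = math.sqrt((1.0 - f) * p * q / (n - 1))
    half_width = t * se + (1.0 / (2 * n) if continuity else 0.0)
    return se, p - half_width, p + half_width

se, lower, upper = normal_limits(n=100, N=500, p=0.37)
print(round(se, 4), round(lower, 3), round(upper, 3))
# Limits for the total number in class C follow on multiplying by N.
print(round(500 * lower), round(500 * upper))
```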
Binomial Approximations
When the normal approximation does not apply, limits for P may be found from
the binomial tables (section 3.4) and adjusted, if necessary, to take account of the
fpc. Table VIm in Fisher and Yates' Statistical Tables (1957) gives binomial
confidence limits for P for any value of n, and is a useful alternative to the ordinary
binomial tables. Example 2 shows how the binomial approximation is computed.
Example 2. For another item in the sample in example 1, nine of the 100 units fall in
class C. From Romig's table for n = 100 the 95% limits for P are found to be 0.041 and
0.165. (The Fisher-Yates tables give 0.042 and 0.164.) If f, the sampling fraction, is less
than 5%, limits found in this way are close enough for most purposes. In this example,
f = 0.2 and adjustment is needed.
To apply the adjustment, we shorten the interval between p and each limit by the factor
$\sqrt{1-f} = \sqrt{0.8} = 0.894$. The adjusted limits are as follows:
P_L = 0.090 - (0.894)(0.090 - 0.041) = 0.046
P_U = 0.090 + (0.894)(0.165 - 0.090) = 0.157
The limits read from the charts by Chung and DeLury are 0.045 and 0.157, respectively.
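The adjustment for the fpc in Example 2 amounts to shrinking each half of the binomial interval toward p by the factor $\sqrt{1-f}$. A minimal sketch, with a hypothetical function name and the tabulated binomial limits supplied as inputs:

```python
import math

def adjust_limits_for_fpc(p, lower, upper, f):
    """Shorten the distance between p and each binomial limit by sqrt(1 - f)."""
    shrink = math.sqrt(1.0 - f)
    return p - shrink * (p - lower), p + shrink * (upper - p)

# Binomial limits 0.041 and 0.165 for 9 out of 100, sampling fraction f = 0.2
p_low, p_high = adjust_limits_for_fpc(p=0.090, lower=0.041, upper=0.165, f=0.2)
print(round(p_low, 3), round(p_high, 3))
```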
Burstein (1975) has produced a variant of this calculation that is slightly more accurate.
Suppose that a units out of n are in class C (in this example, a = 9, n = 100). In P_L, replace
a/n = 0.090 by (a - 0.5)/n = 0.085. In P_U, replace a/n by (a + a/n)/n = 0.0909. Also,
(1 - f) is taken as (N - n)/(N - 1). Thus, by Burstein's method,
Example 3. In auditing records in which a very low error rate is demanded, the upper
confidence limit for A is primarily of interest. Suppose that 200 of 1000 records are verified
and that the batch of 1000 is accepted if no errors are found. Special tables have been
constructed to give the upper confidence limit for the number of errors in the batch. A good
approximation results from the following relation. The probability that no errors are found
in n when A errors are present in N is, from the hypergeometric distribution,
Example. Consider a population that consists of the five units, b, c, d, e, f, that fall in
three classes.
Class   A_i   Units Denoted By
1       1     b
2       2     c, d
3       2     e, f
With random samples of size 3, we wish to estimate $P = A_1/(A_1 + A_2)$ or, in this case, 1/3.
Thus N = 5 and N' = 3.
There are 10 possible samples of size 3, all with equal initial probabilities. These are
grouped according to the value of n'.
n' = 1
                                Conditional
Sample        a_1   a_2   p     Probability   (p - P)
bef           1     0     1     1/3           2/3
cef or def    0     1     0     2/3           -1/3
If samples are specified by the values of a_1, a_2, only two types are obtainable: a_1 = 1,
a_2 = 0; a_1 = 0, a_2 = 1. Their conditional probabilities, 1/3 and 2/3, respectively, agree with the
general expression (3.24). Furthermore,
E(p) = 1/3
$$\sigma_p^2 = \left(\frac{N'-n'}{N'-1}\right)\frac{PQ}{n'} = \left(\frac{3-1}{3-1}\right)\left(\frac{1}{3}\right)\left(\frac{2}{3}\right) = \frac{2}{9}$$
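The conditional distributions in this example are small enough to enumerate by brute force. The sketch below lists all samples of size 3 from the five units, groups them by n' (the number of sampled units falling in classes 1 and 2), and checks the conditional mean and variance of p against $(N'-n')PQ/[(N'-1)n']$. The dictionary `classes` and the function name are illustrative choices.

```python
from fractions import Fraction
from itertools import combinations

classes = {"b": 1, "c": 2, "d": 2, "e": 3, "f": 3}
P = Fraction(1, 3)   # A_1 / (A_1 + A_2)
Q = 1 - P
N_PRIME = 3          # number of population units in classes 1 and 2

def conditional_distribution(n_prime, sample_size=3):
    """Distribution of p = a_1 / n' over samples with exactly n' units in classes 1 and 2."""
    counts = {}
    for sample in combinations(sorted(classes), sample_size):
        a1 = sum(classes[u] == 1 for u in sample)
        a2 = sum(classes[u] == 2 for u in sample)
        if a1 + a2 == n_prime:
            p = Fraction(a1, n_prime)
            counts[p] = counts.get(p, 0) + 1
    total = sum(counts.values())
    return {p: Fraction(c, total) for p, c in counts.items()}

for n_prime in (1, 2):
    dist = conditional_distribution(n_prime)
    mean = sum(p * pr for p, pr in dist.items())
    var = sum((p - P) ** 2 * pr for p, pr in dist.items())
    formula = Fraction(N_PRIME - n_prime, N_PRIME - 1) * P * Q / n_prime
    print(n_prime, dist, mean, var, formula)
```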
For n' = 2 there are six possible samples, which give only two sets of values of a_1, a_2.
n' = 2
                                Conditional
Sample        a_1   a_2   p     Probability   (p - P)
Number of units n
Of the n units, $(a_1 + a_1')$ are found to fall in domain 1 and of these $a_1$ fall in class
C. The proportion falling in class C in domain 1 is estimated by $p_1 = a_1/(a_1 + a_1')$.
The frequency distribution and confidence limits for $p_1$ were discussed under Case
2 in sections 3.8 and 3.9.
For estimating the total number $A_1$ of units in class C in domain 1, there are two
possibilities. If $N_1$, the total number of units in domain 1 in the population, is
known, we may use the conditional estimate
$$\hat{A}_1 = N_1 p_1 = \frac{N_1 a_1}{a_1 + a_1'} \tag{3.25}$$
Its standard error is computed as
(3.26)
where $n_1 = a_1 + a_1'$.
If $N_1$ is not known, the estimate is
$$\hat{A}_1' = \frac{N a_1}{n} \tag{3.27}$$
with estimated standard error
where $p = a_1/n$.
C a, a2
C a,' a2'
(3 .30)
Example 1. A group of 61 leprosy patients were treated with a drug for 48 weeks. To
measure the effect of the drug on the leprosy bacilli, the presence of bacilli at six sites on the
body of each patient was tested bacteriologically. Among the 366 sites, 153, or 41.8%,
were negative. What is the standard error of this percentage?
This example comes from a controlled experiment rather than a survey, but it illustrates
how erroneous the binomial formula may be. By the binomial formula, we have n = 366,
and
$$\text{s.e.}(p) = \sqrt{pq/(n-1)} = \sqrt{(41.8)(58.2)/365} = 2.58\%$$
Each patient is a cluster unit with m = 6 elements (sites). To find the standard error by the
correct formula, we need the frequency distribution of the 61 values of $p_i$. It is more
convenient to tabulate the distribution of $y_i$, the number of negative sites per patient. With
$p_i$ expressed in percents, $p_i = 100y_i/6$. From the distribution in Table 3.4 we find $\sum f y_i^2 = 669$
and
TABLE 3.4
NUMBER OF NEGATIVE SITES PER PATIENT
y_i = 6p_i/100   0      1      2      3      4      5      6      Total
f                17     11     4      4      7      14     4      61
f y_i            0      11     8      12     28     70     24     153
f_exp            2.3    10.1   18.3   17.6   9.6    2.8    0.3    61.0
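The contrast between the binomial standard error and one that treats each patient as a cluster can be computed from the frequencies in Table 3.4. The sketch below is an illustration under an assumption: since the text's cluster formula is not reproduced in this passage, the code uses the ordinary variance of a sample mean applied to the 61 per-patient percentages $p_i = 100y_i/6$, with the fpc ignored.

```python
import math

# Frequency distribution of y_i (negative sites per patient) from Table 3.4
freq = {0: 17, 1: 11, 2: 4, 3: 4, 4: 7, 5: 14, 6: 4}
m = 6  # sites per patient

n = sum(freq.values())                              # number of patients
total_neg = sum(y * f for y, f in freq.items())     # total negative sites
sum_y2 = sum(y * y * f for y, f in freq.items())    # sum of f * y^2
p = 100.0 * total_neg / (n * m)                     # overall percentage negative

# Binomial formula, treating the 366 sites as independent
se_binomial = math.sqrt(p * (100.0 - p) / (n * m - 1))

# Cluster treatment: sample variance of the per-patient percentages, divided by n
s2 = (100.0 / m) ** 2 * (sum_y2 - total_neg ** 2 / n) / (n - 1)
se_cluster = math.sqrt(s2 / n)

print(n, total_neg, sum_y2)
print(round(p, 1), round(se_binomial, 2), round(se_cluster, 2))
```

Under these assumptions the cluster standard error comes out close to twice the binomial value, which is the point the example is making.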
If the size of cluster is not constant, let $m_i$ be the number of elements in the ith
cluster unit and let $p_i = a_i/m_i$. The proportion of units falling in class C in the
sample is
$$p = \frac{\sum a_i}{\sum m_i} \tag{3.31}$$
This form shows that the approximate variance involves a weighted sum of
squares of deviations of the $p_i$ from the population value P.
For the estimated variance we have
$$v(p) = \frac{1-f}{n\bar{m}^2}\,\frac{\sum a_i^2 - 2p\sum a_i m_i + p^2\sum m_i^2}{n-1} \tag{3.34}$$
Hence
$$v_{bin}(p) = \frac{pq}{n} = \frac{(0.2885)(0.7115)}{104} = 0.00197$$
TABLE 3.5
DATA FOR A SIMPLE RANDOM SAMPLE OF 30 HOUSEHOLDS
                                              Doctor Seen in
            Number of    Number of            Last Year
Household   Persons      Males     Females    Yes     No
Number      m_i          a_i                  a_i
1 5 1 4 5 0
2 6 3 3 0 6
3 3 2 2
4 3 2 3 0
5 2 1 0 2
6 3 2 0 3
7 3 2 0 3
8 3 1 2 0 3
9 4 2 2 0 4
10 4 3 0 4
11 3 2 0 3
12 2 1 1 0 2
13 7 3 4 0 7
14 4 3 4 0
15 3 2 1 1 2
16 5 3 2 2 3
17 4 3 0 4
18 4 3 1 0 4
19 3 2 1 1 2
20 3 1 2 3 0
21 4 1 3 2 2
22 3 2 0 3
23 3 2 0 3
24 1 0 0 1
25 2 1 1 2 0
26 4 3 1 2 2
27 3 1 2 0 3
28 4 2 2 2 2
29 2 1 1 0 2
30 4 2 2 1 3
Totals 104 53 51 30 74
For the ratio formula, we note that there are 30 clusters and take
n = 30
$m_i$ = total number in ith household
$a_i$ = number in ith household who had seen a doctor
p = 0.2885, as before
$$\bar{m} = \frac{104}{30} = 3.4667$$
$$\sum a_i^2 = 86; \qquad \sum m_i^2 = 404; \qquad \sum a_i m_i = 113$$
The fpc may be ignored. Hence, from (3.34),
$$v(p) = \frac{(86) - 2(0.2885)(113) + (0.2885)^2(404)}{(30)(29)(3.4667)^2} = 0.00520$$
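The ratio-method calculation can be reproduced from the household data. In the sketch below the list `m` holds the household sizes and `a` the numbers who had seen a doctor, as read from Table 3.5; the entry for household 3 is taken as 2, an inference made here because it is the value consistent with the stated totals (30, 86 and 113). The function names are hypothetical and the fpc defaults to being ignored.

```python
# Household sizes m_i and numbers who saw a doctor a_i (households 1..30).
# a for household 3 is inferred from the totals, not read directly.
m = [5, 6, 3, 3, 2, 3, 3, 3, 4, 4,
     3, 2, 7, 4, 3, 5, 4, 4, 3, 3,
     4, 3, 3, 1, 2, 4, 3, 4, 2, 4]
a = [5, 0, 2, 3, 0, 0, 0, 0, 0, 0,
     0, 0, 0, 4, 1, 2, 0, 0, 1, 3,
     2, 0, 0, 0, 2, 2, 0, 2, 0, 1]

def ratio_variance(a, m, f=0.0):
    """Estimated variance of p = sum(a)/sum(m) by the ratio formula (3.34)."""
    n = len(m)
    p = sum(a) / sum(m)
    m_bar = sum(m) / n
    numerator = (sum(x * x for x in a)
                 - 2 * p * sum(x * y for x, y in zip(a, m))
                 + p * p * sum(y * y for y in m))
    return (1 - f) * numerator / (n * (n - 1) * m_bar ** 2)

def binomial_variance(a, m):
    """pq / (number of elements): valid only if elements were sampled independently."""
    p = sum(a) / sum(m)
    return p * (1 - p) / sum(m)

print(round(ratio_variance(a, m), 5), round(binomial_variance(a, m), 5))
```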
The variance given by the ratio method, 0.00520, is much larger than that given by the
binomial formula, 0.00197. For various reasons, families differ in the frequency with which
their members consult a doctor. For the sample as a whole, the proportion who consult a
doctor is only a little more than one in four, but there are several families in which every
member has seen a doctor. Similar results would be obtained for any characteristic in which
the members of the same family tend to act in the same way.
In estimating the proportion of males in the population, the results are different. By the
same type of calculation, we find
binomial formula: v(p) = 0.00240
ratio formula: v(p) = 0.00114
Here the binomial formula overestimates the variance. The reason is interesting. Most
households are set up as a result of a marriage, hence contain at least one male and one
female. Consequently the proportion of males per family varies less from one half than
would be expected from the binomial formula. None of the 30 families, except one with
only one member, is composed entirely of males, or entirely of females. If the binomial
distribution were applicable, with a true P of approximately one half, households with all
members of the same sex would constitute one quarter of the households of size 3 and one
eighth of the households of size 4. This property of the sex ratio has been discussed by
Hansen and Hurwitz (1942). Other illustrations of the error committed by improper use of
the binomial formula in sociological investigations have been given by Kish (1957).
EXERCISES
3.1 For a population with N = 6, A = 4, A' = 2, work out the value of a for all possible
simple random samples of size 3. Verify the theorems given for the mean and variance of
p = a/n. Verify that
$$\frac{N-n}{(n-1)N}\,pq$$
is an unbiased estimate of the variance of p.
3.2 In a simple random sample of 200 from a population of 2000 colleges, 120 colleges
were in favor of a proposal, 57 were opposed, and 23 had no opinion. Estimate 95%
confidence limits for the number of colleges in the population that favored the
proposal.
3.3 Do the results of the previous sample furnish conclusive evidence that the majority
of the colleges in the population favored this proposal?
3.4 A population with N = 7 consists of the elements $B_1$, $C_1$, $C_2$, $C_3$, $D_1$, $D_2$, and $D_3$. A
simple random sample of size 4 is taken in order to estimate the proportion of C's to
C's + D's. Work out the conditional distributions of this proportion, p, and verify the
formula for its conditional variance.
3.5 In the preceding exercise, what is the probability that a sample of size 4 contains
$B_1$? Find the average variance of p in exercise 3.4 over all simple random samples of size 4.
This is 0.0393 as against 0.025 with N = 6, n = 4, and $B_1$ absent.
3.6 A simple random sample of 290 households was chosen from a city area containing
14,828 households. Each family was asked whether it owned or rented the house and also
whether it had the exclusive use of an indoor toilet. Results were as follows .
Owned Rented Total
3.11 Which of the two previous estimates seems more precise in the following
circumstances? N =2004, Y = 3011 . The sample with n = 100 showed that 73 students
made at least one visit. Their total number of visits was 152 and the estimated variance S2
was 1.55.
3.12 A simple random sample of n cluster units, each with m elements, is taken from a
population in which the proportion of elements in class C is P. As the intracluster
correlation varies, what are the highest and lowest possible values of the true variance of p
(the sample estimate of P) and how do they compare with the binomial variance? Ignore the
fpc.
3.13 For the sample of 30 households in Table 3.5, the data shown below refer to visits
to the dentist in the last year. Estimate the variance of the proportion of persons who saw a
dentist, and compare this with the binomial estimate of the variance.
5 1 4 5 4
6 0 6 4 4 0
3 1 2 4 3
3 2 1 3 1 2
2 0 2 3 0 3
3 0 3 4 1 3
3 2 3 0 3
3 2 3 1 2
4 1 3 1 0 1
4 0 4 2 0 2
3 1 2 4 0 4
2 0 2 3 1 2
7 2 5 4 1 3
4 1 3 2 0 2
3 0 3 4 0 4
3.14 In sampling for a rare attribute, one method is to continue drawing a simple
random sample until m units that possess the rare attribute have been found (Haldane,
1945) where m is chosen in advance. If the fpc is ignored, prove that the probability that the
total sample required is of size n is
$$\frac{(n-1)!}{(m-1)!\,(n-m)!}\,P^m Q^{n-m} \qquad (n \ge m)$$
where P is the frequency of the rare attribute. Find the average size of the total sample and
show that if m > 1, p = (m - 1)/(n - 1) is an unbiased estimate of P. (For further discussion,
see Finney, 1949, and Sandelius, 1951, who considers a plan in which sampling continues
until either m have been found or the total sample size has reached a preassigned limit $n_0$.)
See also section 4.5.
4. It often happens that data are to be published for certain major subdivisions
of the population and that desired limits of error are set up for each subdivision. A
separate calculation is made for the n in each subdivision, and the total n is found
by addition .
5. More than one item or characteristic is usually measured in a sample survey:
sometimes the number of items is large. If a desired degree of precision is
prescribed for each item, the calculations lead to a series of conflicting values of n,
one for each item. Some method must be found for reconciling these values.
6. Finally, the chosen value of n must be appraised to see whether it is
consistent with the resources available to take the sample. This demands an
estimation of the cost, labor, time, and materials required to obtain the proposed
size of sample. It sometimes becomes apparent that n will have to be drastically
reduced. A hard decision must then be faced: whether to proceed with a much
smaller sample size, thus reducing precision, or to abandon efforts until more
resources can be found.
In succeeding sections some of these questions are examined in more detail.
$$\sigma_p = \sqrt{\frac{N-n}{N-1}}\,\sqrt{\frac{PQ}{n}}$$
Hence the formula that connects n with the desired degree of precision is
$$d = t\,\sqrt{\frac{N-n}{N-1}}\,\sqrt{\frac{PQ}{n}}$$
where t is the abscissa of the normal curve that cuts off an area of $\alpha$ at the tails.
Solving for n, we find
$$n = \frac{\dfrac{t^2 PQ}{d^2}}{1 + \dfrac{1}{N}\left(\dfrac{t^2 PQ}{d^2} - 1\right)} \tag{4.1}$$
For practical use, an advance estimate p of P is substituted in this formula. If N is
large, a first approximation is
$$n_0 = \frac{t^2 pq}{d^2} = \frac{pq}{V} \tag{4.2}$$
where
$$V = \frac{pq}{n_0} = \text{desired variance of the sample proportion}$$
In practice we first calculate $n_0$. If $n_0/N$ is negligible, $n_0$ is a satisfactory
approximation to the n of (4.1). If not, it is apparent on comparison of (4.1) and
(4.2) that n is obtained as
$$n = \frac{n_0}{1 + (n_0 - 1)/N} \doteq \frac{n_0}{1 + (n_0/N)} \tag{4.3}$$
Example. In the hypothetical blood groups example we had
$$d = 0.05, \quad p = 0.5, \quad \alpha = 0.05, \quad t = 2$$
Thus
$$n_0 = \frac{(4)(0.5)(0.5)}{(0.0025)} = 400$$
Let us assume that there are only 3200 people on the island. The fpc is needed, and we
find
$$n = \frac{n_0}{1 + (n_0 - 1)/N} = \frac{400}{1 + \frac{399}{3200}} = 356$$
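The two-step calculation, first $n_0$ and then the fpc adjustment, is easy to script. The sketch below uses hypothetical function names and follows (4.2) and (4.3) as given above.

```python
def n0_proportion(p, d, t):
    """First approximation n_0 = t^2 p q / d^2, ignoring the fpc (4.2)."""
    return t ** 2 * p * (1 - p) / d ** 2

def apply_fpc(n0, N):
    """Adjust n_0 for a finite population of size N (4.3)."""
    return n0 / (1 + (n0 - 1) / N)

n0 = n0_proportion(p=0.5, d=0.05, t=2)
n = apply_fpc(n0, N=3200)
print(round(n0), round(n))
```

Choosing p = 0.5 gives the largest value of pq, so the resulting n is conservative when the advance estimate of P is uncertain.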
The formula for $n_0$ holds also if d, p, and q are all expressed as percentages instead of
proportions. Since the product pq increases as p moves toward 1/2, or 50%, a conservative
estimate of n is obtained by choosing for p the value nearest to 1/2 in the range in which p is
thought likely to lie. If p seems likely to lie between 5 and 9%, for instance, we assume 9%
for the estimation of n.
Sometimes, particularly when estimating the total number NP of units in class
C, we wish to control the relative error r instead of the absolute error in Np; for
example, we may wish to estimate NP with an error not exceeding 10%. That is,
we want
$$\Pr\left(\frac{|Np - NP|}{NP} \ge r\right) = \Pr(|p - P| \ge rP) = \alpha$$
For this specification, we substitute rP or rp for d in formulas (4.1) and (4.2). From
(4.2) we get
(4.2),
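$$\sigma_{\bar{y}} = \sqrt{\frac{N-n}{N}}\,\frac{S}{\sqrt{n}}$$
Hence
$$r\bar{Y} = t\sigma_{\bar{y}} = t\,\sqrt{\frac{N-n}{N}}\,\frac{S}{\sqrt{n}} \tag{4.4}$$
Solving for n gives
(4.5)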
Example. In nurseries that produce young trees for sale it is advisable to estimate, in
late winter or early spring, how many healthy young trees are likely to be on hand, since this
determines policy toward the solicitation and acceptance of orders. A study of sampling
methods for the estimation of the total numbers of seedlings was undertaken by Johnson
(1943). The data that follow were obtained from a bed of silver maple seedlings 1 ft wide
and 430 ft long. The sampling unit was 1 ft of the length of the bed, so that N = 430. By
complete enumeration of the bed it was found that $\bar{Y} = 19$, $S^2 = 85.6$, these being the true
population values.
With simple random sampling, how many units must be taken to estimate $\bar{Y}$ within 10%,
apart from a chance of 1 in 20? From (4.5) we obtain
$$n_0 = \frac{t^2 S^2}{r^2 \bar{Y}^2} = \frac{(4)(85.6)}{(1.9)^2} = 95$$
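The seedling calculation can be sketched in code. Squaring (4.4) and solving for n gives $n = n_0/(1 + n_0/N)$ with $n_0 = t^2S^2/(r\bar{Y})^2$; the code below applies that algebra, and the function names are hypothetical.

```python
import math

def n0_relative_error(S2, mean, r, t):
    """n_0 = t^2 S^2 / (r * mean)^2: sample size ignoring the fpc."""
    return t ** 2 * S2 / (r * mean) ** 2

def apply_fpc_mean(n0, N):
    """Solve (4.4) for n: n = n_0 / (1 + n_0 / N)."""
    return n0 / (1 + n0 / N)

n0 = n0_relative_error(S2=85.6, mean=19, r=0.10, t=2)
n = apply_fpc_mean(n0, N=430)
print(round(n0), math.ceil(n))
```

Because 95 is a sizeable fraction of the 430 units in the bed, the fpc adjustment reduces the requirement noticeably.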
$$n = \frac{s_1^2}{C\bar{y}_1^2}\left(1 + 8C + \frac{s_1^2}{n_1\bar{y}_1^2} + \frac{2}{n_1}\right) \tag{4.6}$$
The mean $\bar{y}$ of the final sample is slightly biased. Take $\hat{\bar{Y}} = \bar{y}(1 - 2C)$.
Estimation of $\bar{Y}$ with Variance V
Take additional units to make the total sample size
$$n = \frac{s_1^2}{V}\left(1 + \frac{2}{n_1}\right) \tag{4.7}$$
If S were known exactly, the required sample size would be $S^2/V$. The effect of
not knowing S is to increase the average size by the factor $(1 + 2/n_1)$.
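Rule (4.7) is straightforward to apply once a first sample is in hand. The sketch below uses made-up first-sample values purely for illustration; the function name and the data are hypothetical.

```python
from statistics import variance

def total_size_for_variance(first_sample, V):
    """Total sample size from (4.7): n = (s_1^2 / V) * (1 + 2 / n_1)."""
    n1 = len(first_sample)
    s1_sq = variance(first_sample)   # sample variance with divisor n_1 - 1
    return (s1_sq / V) * (1 + 2 / n1)

# Hypothetical first sample of n_1 = 10 observations
first = [12.0, 15.0, 9.0, 14.0, 10.0, 11.0, 13.0, 12.0, 16.0, 8.0]
n_total = total_size_for_variance(first, V=0.25)
print(round(n_total, 1), "total;", round(n_total - len(first), 1), "additional")
```

The additional units to be drawn are the total from (4.7) minus the size of the first sample.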
Estimation of P with Variance V
Let $p_1$ be the estimate of P from the first sample. The combined size of the first
two samples should be
$$n = \frac{p_1 q_1}{V} + \frac{3 - 8p_1 q_1}{p_1 q_1} + \frac{1 - 3p_1 q_1}{V n_1} \tag{4.8}$$
The first term on the right is the size required if P is known to be equal to $p_1$. With
this method, the ordinary binomial estimate p made from the complete sample of
size n is slightly biased. To correct for bias, take
$$\hat{P} = p + \frac{V(1 - 2p)}{pq}$$
good initial estimate of the required n. Since the cv of p is $\sqrt{Q/nP}$, it is easily verified that
n = 400 is adequate for P = 20%, but n = 1900 will be needed if P is only 5%.
Accordingly, he takes an initial sample with $n_1 = 396$ and finds $p_1 = 0.101$. Since
$\sqrt{C} = 0.1$, C = 0.01. Equation 4.9 gives
$$n = \frac{(0.899)}{(0.01)(0.101)} + \frac{3}{(0.0908)} + \frac{1}{(0.01)(40)} = 926$$
The combined sample gives np = 88; p = 88/926 = 0.0950. The correction for bias, $Cp/q$,
amounts to 0.0011, giving a final estimate of 0.094 or 9.4%.
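The numbers in this example can be reproduced as follows. The formula used in the code, $n = q_1/(Cp_1) + 3/(p_1q_1) + 1/(Cn_1p_1)$, is read off from the three terms of the numerical expression above, since equation 4.9 itself does not appear in this passage; treat it as an inferred form. The bias correction is subtracted, as implied by the final estimate being smaller than p.

```python
def total_size_for_cv(p1, n1, C):
    """Combined sample size for a target squared cv C (inferred form of 4.9)."""
    q1 = 1 - p1
    return q1 / (C * p1) + 3 / (p1 * q1) + 1 / (C * n1 * p1)

n = total_size_for_cv(p1=0.101, n1=396, C=0.01)
n_rounded = round(n)

a = 88                                # units in class C in the combined sample
p = a / n_rounded
correction = 0.01 * p / (1 - p)       # C p / q
p_corrected = p - correction

print(n_rounded, round(p, 4), round(correction, 4), round(p_corrected, 3))
```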
The second method, a small pilot survey, serves many purposes, especially if the
feasibility of the main survey is in doubt. If the pilot survey is itself a simple
random sample, the preceding methods apply. But often the pilot work is
restricted to a part of the population that is convenient to handle or that will reveal
the magnitude of certain problems. Allowance must be made for the selective
nature of the pilot when using its results to estimate $S^2$ or P. For instance, a
common practice is to confine the pilot work to a few clusters of units. Thus the
computed $s^2$ measures mostly the variation within a cluster and may be an
underestimate of the relevant $S^2$. The relation between intra- and intercluster
variation is discussed in Chapter 9. The same problem arises in cluster sampling
for proportions, in which the formula pq/n may underestimate the effect of
variation among clusters. Cornfield (1951) gives a good illustration of the
estimation of sample size in cluster sampling for proportions.
Method 3, the use of results from previous surveys, points to the value of
making available, or at least keeping accessible, any data on standard deviations
obtained in previous surveys. Unfortunately, the cost of computing standard
deviations in complex surveys is high, even with electronic machines, and
frequently only those s.d.'s needed to give a rough idea of the precision of the
principal estimates are computed and recorded. If suitable past data are found, the
value of $S^2$ may require adjustment for time changes. With skew data in which $\bar{Y}$ is
changing with time, $S^2$ is often found to change at a rate lying somewhere between
$k\bar{Y}$ and $k\bar{Y}^2$, where k is a constant. Thus, if $\bar{Y}$ is thought to have increased by 10%
in the time interval since the previous survey, we might increase our initial
estimate of $S^2$ by 10 to 20%.
Finally, a serviceable estimate of $S^2$ can sometimes be made from relatively
little information about the nature of the population. In early studies of the
numbers of wireworms in soils, a tool was used to take a sample (9 x 9 x 5 in.) of
the topsoil. For estimating n, the sampler needed to know the standard deviation
of the number of wireworms found in a boring with the tool. If wireworms were
distributed at random over the topsoil, the number found in a small volume would
follow the Poisson distribution, for which $S^2 = \bar{Y}$. Since there might be some
tendency for wireworms to congregate, it was decided to assume $S^2 = 1.2\bar{Y}$, the
factor 1.2 being an arbitrary safety factor. Although $\bar{Y}$ was not known, the values
of $\bar{Y}$ that are of economic importance with respect to crop damage could be
delineated. These two pieces of information made it possible to determine sample
sizes that proved satisfactory.
Deming (1960) shows how some simple mathematical distributions may be
used to estimate $S^2$ from a knowledge of the range and a general idea of the shape
of the distribution. If the distribution is like a binomial, with a proportion p of the
observations at one end of the range and a proportion q at the other end,
$S^2 = pqh^2$, where h is the range. When p = q = 1/2 the value of $S^2 = 0.25h^2$ is the
maximum possible for a given range h. Other useful relations are that
$S^2 = 0.083h^2$ for a rectangular distribution, $S^2 = 0.056h^2$ for a distribution shaped like
a right triangle, and $S^2 = 0.042h^2$ for an isosceles triangle.
These relations do not help much if h is large or poorly known. However, if h is
large, good sampling practice is to stratify the population (Chapter 5) so that
within any stratum the range is much reduced. Usually the shape also becomes
simpler (closer to rectangular) within a stratum. Consequently, these relations are
effective in predicting $S^2$, hence n, within individual strata.
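These range-based guesses can be collected in a small helper. The sketch assumes the exact factors behind the rounded decimals, namely 1/12, 1/18 and 1/24 for the rectangular, right-triangle and isosceles-triangle shapes; the dictionary and function names are hypothetical.

```python
SHAPE_FACTORS = {
    "two_point_equal": 1 / 4,        # p = q = 1/2 at the ends of the range (0.25)
    "rectangular": 1 / 12,           # about 0.083
    "right_triangle": 1 / 18,        # about 0.056
    "isosceles_triangle": 1 / 24,    # about 0.042
}

def guess_S2(h, shape):
    """Rough S^2 from the range h and an assumed shape of distribution."""
    return SHAPE_FACTORS[shape] * h ** 2

def guess_S2_two_point(h, p):
    """S^2 = p q h^2 when a proportion p sits at one end of the range and q at the other."""
    return p * (1 - p) * h ** 2

# Hypothetical stratum with values ranging over h = 300 units
for shape in SHAPE_FACTORS:
    print(shape, round(guess_S2(300, shape)))
```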
TABLE 4.1
AN EXAMPLE OF DIFFERENT TYPES OF ITEM IN
REGIONAL SURVEYS
$$n = \max_j \left(\frac{S_j^2}{\pi_j V}\right) \tag{4.11}$$
If the subdivisions are into classes like age, income, $S_j^2/\pi_j$ may be less than $S^2$ for
central classes, but may be large for an extreme class with small $\pi_j$. In this event,
we may either have to increase the value of V in this subdivision or find some way
of identifying units in this subdivision in advance so that they can be
sampled at a higher rate. The method of double sampling (Chapter 12) is
sometimes useful for this purpose.
The demands on sample size are still greater in analytical studies in which the
specifications are
$$V(\bar{y}_j - \bar{y}_i) \le V \tag{4.12}$$
for every pair of subdivisions (domains). In this case
$$n = \max_{j,i}\,\frac{1}{V}\left(\frac{S_j^2}{\pi_j} + \frac{S_i^2}{\pi_i}\right) \tag{4.13}$$
If the $S_j^2$ are not very different from $S^2$, n will be $2kS^2/V$ when the k domains are
of equal size, and still greater otherwise. The effect of fpc terms, neglected in this
discussion, is to reduce the required n's to some extent.
The purpose in taking the sample is to diminish this loss. If C(n) is the cost of a
sample of size n, a reasonable procedure is to choose n to minimize
C(n )+ L(n) (4.15)
since this is the total cost involved in taking the sample and in making decisions
from its results. The choice of n determines both the optimum size of sample and
the most advantageous degree of precision .
Alternatively, the same approach can be presented in terms of the monetary
gain that accrues from having the sample information, instead of in terms of the
loss that arises from errors in the sample information. If monetary gain is used, we
construct an expected gain G(n) from a sample of size n, where G(n) is zero if no
sample is taken. We maximize
G(n) - C(n)
In this form the principle is equivalent to the rule in classical economics that profit
is to be maximized.
The simplest application occurs when the loss function, l(z), is $Az^2$, where A is a
constant. It follows that
$$L(n) = A\,E(z^2) \tag{4.16}$$
For instance, if $\bar{y}$ is the sample estimate of $\bar{Y}$, and $z = \bar{y} - \bar{Y}$,
$$L(n) = A\,V(\bar{y}) = \frac{AS^2}{n} - \frac{AS^2}{N} \tag{4.17}$$
if simple random sampling is used.
The simplest type of cost function for the sample is
$$C(n) = c_0 + c_1 n \tag{4.18}$$
where $c_0$ is the overhead cost. By differentiation, the value of n that minimizes
cost plus loss is
$$n = \sqrt{AS^2/c_1} \tag{4.19}$$
A more general form of this result is given by Yates (1960). The same analysis
applies to any method of sampling and estimation in which the variance of the
estimate is inversely proportional to n and the cost is a linear function of n.
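The optimum in (4.19) can be checked numerically against the total of cost plus loss. The sketch below uses hypothetical values for the loss constant A, the variance $S^2$ and the cost coefficients; the function names are illustrative.

```python
import math

def optimum_n(A, S2, c1):
    """Sample size minimizing cost plus loss for quadratic loss and linear cost (4.19)."""
    return math.sqrt(A * S2 / c1)

def cost_plus_loss(n, A, S2, c0, c1, N=None):
    """C(n) + L(n) with C(n) = c0 + c1 n and L(n) = A S^2 / n - A S^2 / N."""
    loss = A * S2 / n - (A * S2 / N if N else 0.0)
    return c0 + c1 * n + loss

# Hypothetical inputs: loss constant, population variance, overhead and unit cost
A, S2, c0, c1 = 50.0, 400.0, 100.0, 2.0
n_opt = optimum_n(A, S2, c1)
print(n_opt, cost_plus_loss(n_opt, A, S2, c0, c1))
for n in (80, 100, 120):
    print(n, round(cost_plus_loss(n, A, S2, c0, c1), 2))
```

The term in 1/N and the overhead $c_0$ do not depend on n, which is why neither appears in the optimum.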
Blythe (1945) describes the application of this principle to the estimation of the
volume of timber in a lot for selling purposes (see exercise 4.11). Nordin (1944)
discusses the optimum size of sample for estimating potential sales in a market
that a manufacturer intends to enter. If the sales can be forecast accurately, the
amount of fixed equipment and the production per unit period can be allocated to
maximize the manufacturer's expected profit. Grundy et al. (1954, 1956) consider
the optimum size of a second sample when the results of a first sample are already
known.
This approach has received substantial further development from workers on
statistical decision theory. Generalizations include the substitution of utility for
money value as a scale on which to measure costs and losses, the explicit use of
subjective prior information about unknown parameters by expressing this
information as "prior" probability distributions of the unknown parameters, and the
investigation of different types of cost and loss functions and of qualitative as well
as quantitative data. For a comprehensive account of the method, see Raiffa and
Schlaifer (1961). Although it is still not evident how frequently decision
problems will be amenable to complete solution by this approach, the method has
value in stimulating clear thinking about the important factors in a good decision.
One area that appears suitable for applications is the sampling of lots of articles in
a mass-production process in order to decide whether to accept or reject the lot on
the basis of its estimated quality. Sittig (1951) considers the economics of
sample-size determination, taking account of costs of inspection and the costs
incurred through defective articles in accepted lots and good articles in rejected
lots.
EXERCISES
4.1 In a district containing 4000 houses the percentage of owned houses is to be
estimated with a s.e. of not more than 2% and the percentage of two-car households with a
s.e. of not more than 1%. (The figures 2 and 1% are the absolute values, not the cv's.) The
true percentage of owners is thought to lie between 45 and 65% and the percentage of
two-car households between 5 and 10%. How large a sample is necessary to satisfy both
aims?
4.2 In the population of 676 petition sheets (Table 2.2, page 28) how large must the
sample be if the total number of signatures is to be estimated with a margin of error of 1000,
apart from a 1 in 20 chance? Assume that the value of $s^2$ given on page 28 is the population
$S^2$.
4.3 A survey is to be made of the prevalence of the common diseases in a large
population. For any disease that affects at least 1% of the individuals in the population, it is
desired to estimate the total number of cases, with a coefficient of variation of not more than
20%. (a) What size of simple random sample is needed, assuming that the presence of the
disease can be recognized without mistakes? (b) What size is needed if total cases are
wanted separately for males and females, with the same precision?
4.4 In a wireworm survey the number of wireworms per acre is to be estimated with
a limit of error of 30%, at the 95% probability level, in any field in which wireworm
density exceeds 200,000 per acre in the top 5 in. of soil. The sampling tool measures
9 x 9 x 5 in. deep. Assuming that the number of wireworms in a single sample follows a
distribution slightly more variable than the Poisson, we take $S^2 = 1.2\bar{Y}$. What size of simple
random sample is needed? (1 acre = 43,560 sq ft.)
4.5 The following coefficients of variation per unit were obtained in a farm survey in
Iowa, the unit being an area 1 mile square (data of R. J . Jessen):
Estimated cv
Item (% )
Acres in farms 38
Acres in corn 39
Acres in oats 44
Number of family workers 100
Number of hired workers 110
Number of unemployed 317
A survey is planned to estimate acreage items with a cv of 2½% and numbers of workers
(excluding unemployed) with a cv of 5%. With simple random sampling, how many units
are needed? How well would this sample be expected to estimate the number of
unemployed?
4.6 By experimental sampling, the mean value of a random variate is to be estimated
with variance V = 0.0005. The values of the random variate for the first 20 samples drawn
are shown on p. 81. How many more samples are needed? (Use equation 4.7.)
Sample    Value of          Sample    Value of
Number    Random Variate    Number    Random Variate
 1        0.0725            11        0.0712
 2        0.0755            12        0.0748
 3        0.0759            13        0.0878
 4        0.0739            14        0.0710
 5        0.0732            15        0.0754
 6        0.0843            16        0.0712
 7        0.0727            17        0.0757
 8        0.0769            18        0.0737
 9        0.0730            19        0.0704
10        0.0727            20        0.0723
4.7 A household survey is designed to estimate the proportion of families possessing
certain attributes. For the principal items of interest, the value of P is expected to lie
between 30 and 70%. With simple random sampling, how large are the values of n
necessary to estimate the following means with a standard error not exceeding 3%? (a) The
over-all mean P? (b) The individual means \(P_i\) for the income classes: under $5000; $5000
to $10,000; over $10,000 (i = 1, 2, 3)? (c) The differences between the means \((P_i - P_j)\) for
every pair of the classes in (b)? Give a separate answer for (a), (b), and (c). Income statistics
indicate that the proportions of families with incomes in the three classes above are 50, 38,
and 12%.
4.8 The 4-year colleges in the United States were divided into classes of four different
sizes according to their 1952-1953 enrollments. The standard deviations within each class
are shown below.
Class 1 2 3 4
If you know the class boundaries but not the values of \(S_h\), how well can you guess the \(S_h\)
values by using simple mathematical figures (section 4.7)? No college has less than 200
students and the largest has about 50,000 students.
4.9 With a quadratic loss function and a linear cost function, as in section 4.10, \(S^2\) is
reduced to \(S'^2\) by a superior sampling plan, \(c_0\), \(c_1\), and \(\lambda\) remaining unchanged. If n', V'
denote the new optimum sample size and the accompanying \(V(\bar{y})\), show that n' < n and
that V' < V.
4.10 If the loss function due to an error in \(\bar{y}\) is \(\lambda|\bar{y} - \bar{Y}|\) and if the cost \(C = c_0 + c_1 n\), show
that with simple random sampling, ignoring the fpc, the most economical value of n is
\[ \left(\frac{\lambda S}{c_1\sqrt{2\pi}}\right)^{2/3} \]
4.11 (Adapted from Blythe, 1945). The selling price of a lot of standing timber is UW,
where U is the price per unit volume and W is the volume of timber on the lot. The number
N of logs on the lot is counted, and the average volume per log is estimated from a simple
random sample of n logs. The estimate is made and paid for by the seller and is
provisionally accepted by the buyer. Later, the buyer finds out the exact volume purchased,
and the seller reimburses him if he has paid for more than was delivered. If he has paid for
less than was delivered, the buyer does not mention the fact.
Construct the seller's loss function. Assuming that the cost of measuring n logs is cn, find
the optimum value of n. The standard deviation of the volume per log may be denoted by S
and the fpc ignored.
4.12 (a) The presence or absence of each of two characteristics is to be measured on
each unit in a simple random sample from a large population. If \(P_1\), \(P_2\) are the percentages
of units in the population that possess characteristics 1 and 2, a client wishes to estimate
\((P_1 - P_2)\) with a standard error not exceeding two percentage points. What sample size do
you suggest if the client thinks that \(P_1\) and \(P_2\) both lie between 40 and 60% and that the
characteristics are independently distributed on the units?
(b) Suppose that in (a) the client thinks that the characteristics are positively correlated,
but does not know the correlation. You suggest an initial sample of 200, with the following
results.
Characteristics
1      2      Number of units
Yes    Yes     72
Yes    No      44
No     Yes     14
No     No      70
Total         200
What sample size do you now recommend to estimate \((P_1 - P_2)\) with a standard error ≤ 2%?
4.13 (a) Suppose you are estimating the sex ratio, which is close to equality, and could
sample households of four persons (father, mother, two children). Ignoring the small
proportion of families with identical twins, find the deff factor for a simple random sample
of n households versus one of 4n persons.
(b) Would identical twin families lower or raise the deff factor?
CHAPTER 5
Stratified Random Sampling
5.1 DESCRIPTION
In stratified sampling the population of N units is first divided into subpopulations
of \(N_1, N_2, \ldots, N_L\) units, respectively. These subpopulations are nonoverlapping,
and together they comprise the whole of the population, so that
\[ N_1 + N_2 + \cdots + N_L = N \]
The subpopulations are called strata. To obtain the full benefit from stratification,
the values of the \(N_h\) must be known. When the strata have been determined, a
sample is drawn from each, the drawings being made independently in different
strata. The sample sizes within the strata are denoted by \(n_1, n_2, \ldots, n_L\), respectively.
If a simple random sample is taken in each stratum, the whole procedure is
described as stratified random sampling.
Stratification is a common technique. There are many reasons for this; the
principal ones are the following.
1. If data of known precision are wanted for certain subdivisions of the
population, it is advisable to treat each subdivision as a "population" in its own
right.
2. Administrative convenience may dictate the use of stratification; for example,
the agency conducting the survey may have field offices, each of which can
supervise the survey for a part of the population.
3. Sampling problems may differ markedly in different parts of the population.
With human populations, people living in institutions (e.g., hotels, hospitals,
prisons) are often placed in a different stratum from people living in ordinary
homes because a different approach to the sampling is appropriate for the two
situations. In sampling businesses we may possess a list of the large firms, which
are placed in a separate stratum. Some type of area sampling may have to be used
for the smaller firms.
4. Stratification may produce a gain in precision in the estimates of characteristics
of the whole population. It may be possible to divide a heterogeneous population
The theory of stratified sampling deals with the properties of the estimates from
a stratified sample and with the best choice of the sample sizes nh to obtain
maximum precision. In this development it is taken for granted that the strata
have already been constructed. The problems of how to construct strata and of
how many strata there should be are postponed to a later stage (section 5A.7).
5.2 NOTATION
The suffix h denotes the stratum and i the unit within the stratum. The notation
is a natural extension of that previously used. The following symbols all refer to
stratum h.
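\(W_h = N_h/N\)    stratum weight
\(\bar{Y}_h = \sum_{i=1}^{N_h} y_{hi} / N_h\)    true mean
\(\bar{y}_h = \sum_{i=1}^{n_h} y_{hi} / n_h\)    sample mean
\(S_h^2 = \sum_{i=1}^{N_h} (y_{hi} - \bar{Y}_h)^2 / (N_h - 1)\)    true variance
The estimate of the population mean used in stratified sampling, \(\bar{y}_{st}\), and the over-all sample mean \(\bar{y}\) are
\[ \bar{y}_{st} = \frac{\sum_{h=1}^{L} N_h \bar{y}_h}{N} = \sum_{h=1}^{L} W_h \bar{y}_h \qquad (5.1) \]
\[ \bar{y} = \frac{\sum_{h=1}^{L} n_h \bar{y}_h}{n} \qquad (5.2) \]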
The difference is that in \(\bar{y}_{st}\) the estimates from the individual strata receive their
correct weights \(N_h/N\). It is evident that \(\bar{y}\) coincides with \(\bar{y}_{st}\) provided that in every
stratum
\[ \frac{n_h}{n} = \frac{N_h}{N} \quad\text{or}\quad \frac{n_h}{N_h} = \frac{n}{N} \quad\text{or}\quad f_h = f \]
This means that the sampling fraction is the same in all strata. This stratification is
described as stratification with proportional allocation of the \(n_h\). It gives a
self-weighting sample. If numerous estimates have to be made, a self-weighting
sample is time-saving.
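The self-weighting property can be checked numerically. The sketch below is only an illustration: the stratum sizes, stratum sample means, and the helper names `stratified_mean` and `pooled_sample_mean` are hypothetical and do not come from the text. It compares the weighted estimate, which uses weights N_h/N, with the plain sample mean, which uses weights n_h/n, under a proportional and a non-proportional allocation.

```python
# Hypothetical stratum sizes and stratum sample means (not from the text).
N_h = [600, 300, 100]          # units in each stratum
ybar_h = [10.0, 20.0, 50.0]    # sample mean observed in each stratum


def stratified_mean(N_h, ybar_h):
    """Weight each stratum mean by W_h = N_h / N."""
    N = sum(N_h)
    return sum(Nh * yb for Nh, yb in zip(N_h, ybar_h)) / N


def pooled_sample_mean(n_h, ybar_h):
    """Weight each stratum mean by n_h / n (the ordinary sample mean)."""
    n = sum(n_h)
    return sum(nh * yb for nh, yb in zip(n_h, ybar_h)) / n


y_st = stratified_mean(N_h, ybar_h)
y_prop = pooled_sample_mean([60, 30, 10], ybar_h)    # n_h / N_h = 0.1 in every stratum
y_equal = pooled_sample_mean([34, 33, 33], ybar_h)   # unequal sampling fractions

print(y_st, y_prop, y_equal)
```

With the proportional allocation the two estimates agree, so no weighting is needed at the tabulation stage; with the other allocation the unweighted mean is pulled toward the over-sampled strata.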
The principal properties of the estimate \(\bar{y}_{st}\) are outlined in the following
theorems. The first two theorems apply to stratified sampling in general and are
not restricted to stratified random sampling; that is, the sample from any stratum
need not be a simple random sample.
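Theorem 5.1. If in every stratum the sample estimate \(\bar{y}_h\) is unbiased, then \(\bar{y}_{st}\) is
an unbiased estimate of the population mean \(\bar{Y}\).
Proof.
\[ E(\bar{y}_{st}) = E\sum_{h=1}^{L} W_h \bar{y}_h = \sum_{h=1}^{L} W_h \bar{Y}_h \]
since the estimates are unbiased in the individual strata. But the population mean
\(\bar{Y}\) may be written
\[ \bar{Y} = \frac{\sum_{h=1}^{L}\sum_{i=1}^{N_h} y_{hi}}{N} = \frac{\sum_{h=1}^{L} N_h \bar{Y}_h}{N} = \sum_{h=1}^{L} W_h \bar{Y}_h \]
This completes the proof.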
\(\bar{y}_{st}\) is a linear function of the \(\bar{y}_h\) with fixed weights \(W_h\). Hence we may quote the
result in statistics for the variance of a linear function.
\[ V(\bar{y}_{st}) = \sum_{h=1}^{L} W_h^2 V(\bar{y}_h) + 2\sum_{h=1}^{L}\sum_{j>h} W_h W_j \,\mathrm{Cov}(\bar{y}_h, \bar{y}_j) \qquad (5.5) \]
But since samples are drawn independently in different strata, all covariance
terms vanish. This gives the result (5.3).
To summarize theorems 5.1 and 5.2: if \(\bar{y}_h\) is an unbiased estimate of \(\bar{Y}_h\) in every
stratum, and sample selection is independent in different strata, then \(\bar{y}_{st}\) is an
unbiased estimate of \(\bar{Y}\) with variance \(\sum W_h^2 V(\bar{y}_h)\).
The important point about this result is that the variance of \(\bar{y}_{st}\) depends only on
the variances of the estimates of the individual stratum means \(\bar{Y}_h\). If it were
possible to divide a highly variable population into strata such that all items had
the same value within a stratum, we could estimate \(\bar{Y}\) without any error. Equation
(5.4) shows that it is the use of the correct stratum weights \(N_h/N\) in making the
estimate \(\bar{y}_{st}\) that leads to this result.
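Theorem 5.3. For stratified random sampling, the variance of the estimate \(\bar{y}_{st}\)
is
\[ V(\bar{y}_{st}) = \frac{1}{N^2}\sum_{h=1}^{L} N_h^2 V(\bar{y}_h) = \frac{1}{N^2}\sum_{h=1}^{L} N_h(N_h - n_h)\frac{S_h^2}{n_h} = \sum_{h=1}^{L} W_h^2 \frac{S_h^2}{n_h}(1 - f_h) \qquad (5.6) \]
where \(f_h = n_h/N_h\) is the sampling fraction in stratum h.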
Some particular cases of this formula are given in the following corollaries.
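Corollary 1. If the sampling fractions \(n_h/N_h\) are negligible in all strata,
\[ V(\bar{y}_{st}) = \frac{1}{N^2}\sum \frac{N_h^2 S_h^2}{n_h} = \sum \frac{W_h^2 S_h^2}{n_h} \qquad (5.7) \]
Corollary 2. With proportional allocation, \(n_h = nN_h/N\), and
\[ V(\bar{y}_{st}) = \sum \frac{N_h}{N}\frac{S_h^2}{n}\left(\frac{N - n}{N}\right) = \frac{1 - f}{n}\sum W_h S_h^2 \qquad (5.8) \]
Corollary 3. If sampling is proportional and the variances in all strata have the same value \(S_w^2\),
\[ V(\bar{y}_{st}) = \frac{S_w^2}{n}\left(\frac{N - n}{N}\right) \qquad (5.9) \]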
Theorem 5.4. If \(\hat{Y}_{st} = N\bar{y}_{st}\) is the estimate of the population total Y, then
\[ V(\hat{Y}_{st}) = \sum N_h(N_h - n_h)\frac{S_h^2}{n_h} \qquad (5.10) \]
This follows at once from theorem 5.3.
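Theorems 5.1 and 5.3 can be verified by brute force on a very small population. The sketch below is an illustration under stated assumptions: the two strata and the sample sizes are hypothetical, and the variance formula coded is the usual stratified random sampling expression, the sum of W_h^2 (S_h^2 / n_h)(1 - n_h/N_h), with S_h^2 computed with divisor N_h - 1. Every possible stratified sample is enumerated, and the mean and variance of the resulting estimates are compared with the population mean and with the formula.

```python
from itertools import combinations, product
from statistics import mean, pvariance, variance

# Hypothetical population: two strata, sampled with n_h = 2 units each.
strata = [[1, 2, 3], [10, 20, 30, 40]]
n_h = [2, 2]

N_h = [len(s) for s in strata]
N = sum(N_h)
W_h = [Nh / N for Nh in N_h]
pop_mean = mean(y for s in strata for y in s)

# One estimate per possible stratified sample; all samples are equally likely.
estimates = []
for sample in product(*(combinations(s, k) for s, k in zip(strata, n_h))):
    estimates.append(sum(W * mean(part) for W, part in zip(W_h, sample)))

enumerated_mean = mean(estimates)
enumerated_var = pvariance(estimates)   # variance over the sampling distribution

# Formula value: sum of W_h^2 * S_h^2 / n_h * (1 - f_h).
formula_var = sum(
    W ** 2 * variance(s) / k * (1 - k / Nh)
    for W, s, k, Nh in zip(W_h, strata, n_h, N_h)
)

print(pop_mean, enumerated_mean)
print(enumerated_var, formula_var)
```

Multiplying both variances by N^2 gives the corresponding check for the estimated total.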
TABLE 5.1
SIZES OF 64 CITIES (IN 1000's) IN 1920 AND 1930
Columns: 1920 size (\(x_{hi}\)) and 1930 size (\(y_{hi}\)), each listed for stratum h = 1 and h = 2.
Note that the stratum with the largest cities has a variance nearly 10 times that of the other
stratum.
In proportional allocation, we have \(n_1 = 6\), \(n_2 = 18\). From (5.7), multiplying by \(N^2\), we
have
\[ V(\hat{Y}_{prop}) = \frac{N - n}{n}\sum N_h S_h^2 = \frac{40}{24}[(16)(53{,}843) + (48)(5581)] = 1{,}882{,}293 \]
\[ \sigma(\hat{Y}_{prop}) = 1372 \]
3. For \(n_1 = n_2 = 12\) we use the general formula (5.9):
\[ V(\hat{Y}_{equal}) = \sum N_h(N_h - n_h)\frac{S_h^2}{n_h} = \frac{(16)(4)(53{,}843)}{12} + \frac{(48)(36)(5581)}{12} = 1{,}090{,}827 \]
\[ \sigma(\hat{Y}_{equal}) = 1044 \]
In this example equal sample sizes in the two strata are more precise than proportional
allocation. Both are greatly superior to simple random sampling.
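The arithmetic of this example can be reproduced in a few lines. The sketch below takes the stratum sizes (16 and 48) and stratum variances (53,843 and 5581) from the worked figures above; the function name `var_total` is a hypothetical helper, and the formula coded is the variance of the estimated total, the sum of N_h (N_h - n_h) S_h^2 / n_h.

```python
import math

N_h = [16, 48]          # stratum sizes used in the example
S2_h = [53843, 5581]    # stratum variances used in the example


def var_total(N_h, S2_h, n_h):
    """Variance of the estimated total: sum of N_h (N_h - n_h) S_h^2 / n_h."""
    return sum(Nh * (Nh - nh) * S2 / nh for Nh, S2, nh in zip(N_h, S2_h, n_h))


v_prop = var_total(N_h, S2_h, [6, 18])     # proportional allocation, n = 24
v_equal = var_total(N_h, S2_h, [12, 12])   # equal sample sizes, n = 24

print(round(v_prop), round(math.sqrt(v_prop)))
print(round(v_equal), round(math.sqrt(v_equal)))
```

Under proportional allocation the general formula and the shortcut with the factor (N - n)/n give the same value, which is a useful check on hand calculations.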
(5.12)
These formulas assume that \(\bar{y}_{st}\) is normally distributed and that \(s(\bar{y}_{st})\) is well
determined, so that the multiplier t can be read from tables of the normal
distribution.
If only a few degrees of freedom are provided by each stratum, the usual
procedure for taking account of the sampling error attached to a quantity like
\(s(\bar{y}_{st})\) is to read the t-value from the tables of Student's t instead of from the
normal table. The distribution of \(s(\bar{y}_{st})\) is in general too complex to allow a strict
application of this method. An approximate method of assigning an effective
number of degrees of freedom to \(s(\bar{y}_{st})\) is as follows (Satterthwaite, 1946).
We may write
\[ s^2(\bar{y}_{st}) = \frac{1}{N^2}\sum_{h=1}^{L} g_h s_h^2, \qquad g_h = \frac{N_h(N_h - n_h)}{n_h} \]
The effective number of degrees of freedom \(n_e\) is
\[ n_e = \frac{\left(\sum g_h s_h^2\right)^2}{\sum \dfrac{g_h^2 s_h^4}{n_h - 1}} \qquad (5.16) \]
The value of \(n_e\) always lies between the smallest of the values \((n_h - 1)\) and their
sum. The approximation takes account of the fact that \(S_h^2\) may vary from stratum
to stratum. It requires the assumption that the \(y_{hi}\) are normal, since it depends on
the result that the variance of \(s_h^2\) is \(2\sigma_h^4/(n_h - 1)\). If the distribution of \(y_{hi}\) has
positive kurtosis, the variance of \(s_h^2\) will be larger than this and formula 5.16
overestimates the effective degrees of freedom.
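A short sketch of the Satterthwaite calculation follows. It assumes the weights g_h = N_h (N_h - n_h) / n_h, which is the form implied by the usual estimate of the variance of the stratified mean; the function name and the numerical inputs are hypothetical.

```python
def effective_df(N_h, n_h, s2_h):
    """Approximate effective degrees of freedom for s(ybar_st).

    Assumes g_h = N_h (N_h - n_h) / n_h and
    n_e = (sum g_h s_h^2)^2 / sum(g_h^2 s_h^4 / (n_h - 1)).
    """
    g = [Nh * (Nh - nh) / nh for Nh, nh in zip(N_h, n_h)]
    numerator = sum(gh * s2 for gh, s2 in zip(g, s2_h)) ** 2
    denominator = sum(
        gh ** 2 * s2 ** 2 / (nh - 1) for gh, s2, nh in zip(g, s2_h, n_h)
    )
    return numerator / denominator


# Hypothetical survey with unequal strata and unequal variances.
n_e = effective_df(N_h=[100, 200], n_h=[5, 9], s2_h=[4.0, 25.0])

# Identical strata: every term contributes equally, so n_e equals the sum of (n_h - 1).
n_e_equal = effective_df(N_h=[50, 50], n_h=[6, 6], s2_h=[3.0, 3.0])

print(n_e, n_e_equal)
```

The first value falls between the smallest n_h - 1 and the sum of the n_h - 1, as the text states it must; the second shows the upper limit being attained when all strata contribute equally.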
Within any stratum the cost is proportional to the size of sample, but the cost per
unit \(c_h\) may vary from stratum to stratum. The term \(c_0\) represents an overhead
cost. This cost function is appropriate when the major item of cost is that of taking
the measurements on each unit. If travel costs between units are substantial,
empirical and mathematical studies suggest that travel costs are better represented
by the expression \(\sum t_h\sqrt{n_h}\), where \(t_h\) is the travel cost per unit (Beardwood
et al., 1959). Only the linear cost function (5.17) is considered here.
Theorem 5.6. In stratified random sampling with a linear cost function of the
form (5.17), the variance of the estimated mean \(\bar{y}_{st}\) is a minimum for a specified
cost C, and the cost is a minimum for a specified variance \(V(\bar{y}_{st})\), when \(n_h\) is
proportional to \(W_h S_h/\sqrt{c_h}\).
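Proof. We have
\[ C = c_0 + \sum_{h=1}^{L} c_h n_h \qquad (5.17) \]
\[ V = V(\bar{y}_{st}) = \sum_{h=1}^{L}\frac{W_h^2 S_h^2}{n_h} - \sum_{h=1}^{L}\frac{W_h S_h^2}{N} \qquad (5.18) \]
Our problems are either (1) to choose the \(n_h\) so as to minimize V for specified C, or
(2) to choose the \(n_h\) so as to minimize C for specified V. It happens that apart from
their final steps, the problems have the same solution. Choosing the \(n_h\) to
minimize V for fixed C or C for fixed V are both equivalent to minimizing the
product
\[ V'C' = \left(V + \frac{1}{N}\sum W_h S_h^2\right)(C - c_0) = \left(\sum \frac{W_h^2 S_h^2}{n_h}\right)\left(\sum c_h n_h\right) \qquad (5.19) \]
Stuart (1954) has noted that (5.19) may be minimized neatly by use of the
Cauchy-Schwarz inequality. If \(a_h\), \(b_h\) are two sets of L positive numbers, this
inequality comes from the identity
\[ \left(\sum a_h^2\right)\left(\sum b_h^2\right) - \left(\sum a_h b_h\right)^2 = \sum_{i}\sum_{j>i}(a_i b_j - a_j b_i)^2 \qquad (5.20) \]
It follows that \((\sum a_h^2)(\sum b_h^2) \ge (\sum a_h b_h)^2\), with equality if and only if \(b_h/a_h\) is
constant for all h. Taking \(a_h = W_h S_h/\sqrt{n_h}\) and \(b_h = \sqrt{c_h n_h}\) gives
\[ V'C' = \left(\sum \frac{W_h^2 S_h^2}{n_h}\right)\left(\sum c_h n_h\right) \ge \left(\sum W_h S_h\sqrt{c_h}\right)^2 \qquad (5.21) \]
Thus, no choice of the \(n_h\) can make V'C' smaller than \((\sum W_h S_h\sqrt{c_h})^2\). The
minimum value occurs when
\[ \frac{b_h}{a_h} = \frac{n_h\sqrt{c_h}}{W_h S_h} = \text{constant} \qquad (5.22) \]
In terms of the total sample size n, this gives
\[ \frac{n_h}{n} = \frac{W_h S_h/\sqrt{c_h}}{\sum (W_h S_h/\sqrt{c_h})} = \frac{N_h S_h/\sqrt{c_h}}{\sum (N_h S_h/\sqrt{c_h})} \qquad (5.23) \]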
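This theorem leads to the following rules of conduct. In a given stratum, take a
larger sample if
1. The stratum is larger.
2. The stratum is more variable internally.
3. Sampling is cheaper in the stratum.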
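One further step is needed to complete the allocation. Equation (5.23) gives the
\(n_h\) in terms of n, but we do not yet know what value n has. The solution depends on
whether the sample is chosen to meet a specified total cost C or to give a specified
variance V for \(\bar{y}_{st}\). If cost is fixed, substitute the optimum values of \(n_h\) in the cost
function (5.17) and solve for n. This gives
\[ n = \frac{(C - c_0)\sum (N_h S_h/\sqrt{c_h})}{\sum (N_h S_h\sqrt{c_h})} \qquad (5.24) \]
If V is fixed, substitute the optimum \(n_h\) in the formula for \(V(\bar{y}_{st})\). This gives
\[ n = \frac{\left(\sum W_h S_h\sqrt{c_h}\right)\sum (W_h S_h/\sqrt{c_h})}{V + (1/N)\sum W_h S_h^2} \qquad (5.25) \]
where \(W_h = N_h/N\).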
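The two ways of fixing n can be sketched as follows. This is an illustration only: the stratum weights, standard deviations, and unit costs are hypothetical, the function names are not from the text, and the formulas coded are n_h/n proportional to W_h S_h / sqrt(c_h), n = (C - c_0) sum(W_h S_h / sqrt(c_h)) / sum(W_h S_h sqrt(c_h)) for fixed cost, and n = sum(W_h S_h sqrt(c_h)) sum(W_h S_h / sqrt(c_h)) / (V + sum(W_h S_h^2)/N) for fixed variance.

```python
import math


def optimum_fractions(W_h, S_h, c_h):
    """n_h / n proportional to W_h S_h / sqrt(c_h)."""
    k = [W * S / math.sqrt(c) for W, S, c in zip(W_h, S_h, c_h)]
    total = sum(k)
    return [x / total for x in k]


def n_for_fixed_cost(W_h, S_h, c_h, budget):
    """Total n when the field budget C - c_0 is fixed."""
    num = budget * sum(W * S / math.sqrt(c) for W, S, c in zip(W_h, S_h, c_h))
    den = sum(W * S * math.sqrt(c) for W, S, c in zip(W_h, S_h, c_h))
    return num / den


def n_for_fixed_variance(W_h, S_h, c_h, V, N=None):
    """Total n when V(ybar_st) is fixed; N=None ignores the fpc."""
    fpc_term = 0.0 if N is None else sum(W * S ** 2 for W, S in zip(W_h, S_h)) / N
    a = sum(W * S * math.sqrt(c) for W, S, c in zip(W_h, S_h, c_h))
    b = sum(W * S / math.sqrt(c) for W, S, c in zip(W_h, S_h, c_h))
    return a * b / (V + fpc_term)


# Hypothetical two-stratum survey.
W_h, S_h, c_h = [0.5, 0.5], [6.0, 12.0], [1.0, 4.0]

fractions = optimum_fractions(W_h, S_h, c_h)
n_cost = n_for_fixed_cost(W_h, S_h, c_h, budget=1500.0)
n_var = n_for_fixed_variance(W_h, S_h, c_h, V=0.25)

print(fractions, n_cost, n_var)
```

In this example the second stratum is twice as variable but four times as costly per unit, so the two effects cancel and the optimum allocation is equal.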
An important special case arises if \(c_h = c\), that is, if the cost per unit is the same in
all strata. The cost becomes \(C = c_0 + cn\), and optimum allocation for fixed cost
reduces to optimum allocation for fixed sample size. The result in this special case
is as follows.
Theorem 5.7. In stratified random sampling \(V(\bar{y}_{st})\) is minimized for a fixed
total size of sample n if
\[ n_h = n\frac{W_h S_h}{\sum W_h S_h} = n\frac{N_h S_h}{\sum N_h S_h} \qquad (5.26) \]
This allocation is sometimes called Neyman allocation, after Neyman (1934),
whose proof gave the result prominence. An earlier proof by Tschuprow (1923)
was later discovered .
A formula for the minimum variance with fixed n is obtained by substituting the
value of \(n_h\) in (5.26) into the general formula for \(V(\bar{y}_{st})\). The result is
\[ V_{min}(\bar{y}_{st}) = \frac{\left(\sum W_h S_h\right)^2}{n} - \frac{\sum W_h S_h^2}{N} \qquad (5.27) \]
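As an illustration, Neyman allocation can be applied to the two-stratum city figures quoted earlier in the chapter (stratum sizes 16 and 48, variances 53,843 and 5581, n = 24). The sketch below assumes n_h = n N_h S_h / sum(N_h S_h) and, for the estimated total, the minimum variance (sum N_h S_h)^2 / n - sum N_h S_h^2, which is N^2 times the minimum variance of the mean; the variable names are hypothetical.

```python
import math

N_h = [16, 48]
S2_h = [53843, 5581]
n = 24

S_h = [math.sqrt(v) for v in S2_h]
NS = [Nh * S for Nh, S in zip(N_h, S_h)]

# Neyman allocation: n_h proportional to N_h * S_h (left unrounded here).
neyman = [n * x / sum(NS) for x in NS]

# Minimum variance of the estimated total under this allocation.
v_min_total = sum(NS) ** 2 / n - sum(Nh * v for Nh, v in zip(N_h, S2_h))

print([round(x, 1) for x in neyman], round(v_min_total))
```

The optimum sample sizes come out close to 12 in each stratum, which explains why the equal allocation in the worked example performs nearly as well as the optimum.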
(5.30)
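From the standard algebraic identity for the analysis of variance of a stratified
population, we have
\[ (N - 1)S^2 = \sum_h\sum_i (y_{hi} - \bar{Y})^2 = \sum_h\sum_i (y_{hi} - \bar{Y}_h)^2 + \sum_h N_h(\bar{Y}_h - \bar{Y})^2 \]
If terms in \(1/N_h\) are negligible, and hence also 1/N, this gives
\[ V_{ran} = \frac{(1 - f)}{n}S^2 = \frac{(1 - f)}{n}\sum W_h S_h^2 + \frac{(1 - f)}{n}\sum W_h(\bar{Y}_h - \bar{Y})^2 \qquad (5.34) \]
\[ = V_{prop} + \frac{(1 - f)}{n}\sum W_h(\bar{Y}_h - \bar{Y})^2 \qquad (5.35) \]
By the definition of \(V_{opt}\), we must have \(V_{prop} \ge V_{opt}\). By (5.30) and (5.31) their
difference is
\[ V_{prop} - V_{opt} = \frac{1}{n}\left[\sum W_h S_h^2 - \left(\sum W_h S_h\right)^2\right] = \frac{1}{n}\sum W_h(S_h - \bar{S})^2 \qquad (5.36) \]
where \(\bar{S} = \sum W_h S_h\).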
instead of to (5.35).
It follows that proportional stratification gives a higher variance than simple
random sampling if
\[ \sum N_h(\bar{Y}_h - \bar{Y})^2 < \frac{1}{N}\sum (N - N_h)S_h^2 \qquad (5.39) \]
Mathematically, this can happen. Suppose that the \(S_h^2\) are all equal to \(S_w^2\), so that
proportional allocation is optimum in the sense of Neyman. Then (5.39) becomes
\[ \sum N_h(\bar{Y}_h - \bar{Y})^2 < (L - 1)S_w^2 \]
or
\[ \frac{\sum N_h(\bar{Y}_h - \bar{Y})^2}{L - 1} < S_w^2 \qquad (5.40) \]
Those familiar with the analysis of variance will recognize this relation as implying
that the mean square among strata is smaller than the mean square within strata,
that is, that the F -ratio is less than 1.
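Both situations can be seen in small artificial populations. The sketch below is an illustration with hypothetical data: it computes the exact variances of the mean for simple random sampling, (1 - f) S^2 / n, for proportional allocation, (1 - f) sum(W_h S_h^2) / n, and for Neyman allocation, (sum W_h S_h)^2 / n - sum(W_h S_h^2) / N, treating the n_h as continuous.

```python
from statistics import variance


def three_variances(strata, n):
    """Return (V_ran, V_prop, V_opt) for the estimated mean."""
    N_h = [len(s) for s in strata]
    N = sum(N_h)
    W_h = [Nh / N for Nh in N_h]
    S2_h = [variance(s) for s in strata]            # divisor N_h - 1
    S2 = variance([y for s in strata for y in s])   # divisor N - 1
    f = n / N
    sum_WS2 = sum(W * v for W, v in zip(W_h, S2_h))
    sum_WS = sum(W * v ** 0.5 for W, v in zip(W_h, S2_h))
    v_ran = (1 - f) * S2 / n
    v_prop = (1 - f) * sum_WS2 / n
    v_opt = sum_WS ** 2 / n - sum_WS2 / N
    return v_ran, v_prop, v_opt


# Strata with very different means: stratification pays.
separated = three_variances([[1, 2, 3, 4], [10, 20, 30, 40]], n=4)

# Strata with equal means: the mean square among strata is zero, so F < 1.
overlapping = three_variances([[0, 10], [1, 9]], n=2)

print(separated)
print(overlapping)
```

In the second population the strata carry no information about the item, and proportional stratification is slightly worse than simple random sampling, as the inequality above predicts.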
Examples are businesses of a specific kind, for example, groceries (in surveys
dealing with the volume of business or number of employees), schools (in surveys
related to numbers of pupils), hospitals (in studies of patient load), and income tax
returns (for items highly correlated with taxable income). In the United States
farms also vary greatly in size as measured by total acreage or gross income, but
TABLE 5.2
CALCULATION OF THE OPTIMUM ALLOCATION
Note that the optimum sampling fraction is 75% in stratum 1 but only 25% in
stratum 2. It is often found that because of the high variability of the stratum
consisting of the largest institutions, the formula calls for 100% sampling in this
stratum. Indeed, the allocation may call for more than 100% sampling (see section
5.8). Note also that the \(S_h\) are smaller in 1920 than in 1930. The 1920 data give an
overoptimistic impression of the precision to be obtained in a 1930 survey. As
mentioned in section 4.7, the possibility of a change in the levels of the \(S_h\) should
always be considered when using past data, even though an allowance for change
may have to be something of a guess.
Geographic stratification, in which the strata are compact areas such as counties
or neighbourhoods in a city, is common-often for administrative convenience or
because separate data are wanted for each stratum. It is usually accompanied by
some increase in precision because many factors operate to make people living or
crops growing in the same area show similarities in their principal characteristics.
The gains from geographic stratification, however, are generally modest. For
example, Table 5.3 shows data published by Jessen (1942) and Jessen and
Houseman (1944) on the effectiveness of geographic stratification for a number of
typical farm economic items.
Four sizes of stratum are represented: the township, the county, the "type of
farming" area, and the state. To give some idea of the relative sizes of the strata,
there are about 1600 townships, 100 counties, and 5 areas in Iowa.
In the table the precision of a method of stratification is taken as inversely
proportional to the value of \(V(\bar{y}_{st})\) given by the method. Thus the relative
precision of method 1 to method 2 is the ratio \(V_2(\bar{y}_{st})/V_1(\bar{y}_{st})\), expressed as a
percentage. The data shown are averages over the numbers of items given in the
second column. The county is taken as a standard in each case. As indicated, the
gains in precision are moderate. In Iowa the use of 1600 strata (townships)
compared with no stratification (state) increases the precision by about 30%; that
is, it reduces the variance by about 25%.
TABLE 5.3
RELATIVE PRECISION OF DIFFERENT KINDS OF GEOGRAPHIC STRATIFICATION (IN PER CENT)
Columns: State; No. of Items; and, by kind of stratum, Township, County, Type of Farming Area, State.
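\[ \tilde{n}_h = (n - N_1)\frac{W_h S_h}{\sum_{2}^{L} W_h S_h} \quad (h \ge 2) \qquad (5.41) \]
provided that \(\tilde{n}_h \le N_h\) for \(h \ge 2\). If it should happen that \(\tilde{n}_2 > N_2\), we change the
allocation to
\[ \tilde{n}_h = (n - N_1 - N_2)\frac{W_h S_h}{\sum_{3}^{L} W_h S_h} \quad (h \ge 3) \qquad (5.41)' \]
provided that \(\tilde{n}_h \le N_h\) for \(h \ge 3\). We continue this process until every \(\tilde{n}_h \le N_h\). The
resulting allocation may be shown to be optimum for given n, as would be
expected.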
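Care must be taken to use the correct formula for \(V(\bar{y}_{st})\). The general formula
(5.6) in section 5.3 is correct if the \(\tilde{n}_h\) given by the revised optimum allocation
are substituted. Formula (5.27) for \(V_{min}(\bar{y}_{st})\),
\[ V_{min}(\bar{y}_{st}) = \frac{\left(\sum W_h S_h\right)^2}{n} - \frac{\sum W_h S_h^2}{N} \qquad (5.27) \]
no longer holds. If \(\sum'\) denotes summation over the strata in which \(\tilde{n}_h < N_h\), an
alternative correct formula is
\[ V_{min}(\bar{y}_{st}) = \frac{\left(\sum' W_h S_h\right)^2}{n'} - \frac{\sum' W_h S_h^2}{N} \qquad (5.42) \]
where n' is the revised total sample size in these strata.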
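The revision procedure lends itself to a short loop. The sketch below is one possible implementation, not the author's: the function names and the three-stratum data are hypothetical, all S_h are assumed positive, and strata whose trial allocation exceeds N_h are set to 100% sampling in one pass before the remaining sample is reallocated. The variance is then computed two ways: by the general formula, sum of N_h (N_h - n_h) S_h^2 / n_h divided by N^2, and by the alternative formula restricted to the strata that are not completely sampled, with n' the sample size in those strata.

```python
def revised_neyman(N_h, S_h, n):
    """Neyman allocation with strata capped at 100% sampling (S_h > 0 assumed)."""
    if n > sum(N_h):
        raise ValueError("n exceeds the population size")
    alloc = [0.0] * len(N_h)
    free = set(range(len(N_h)))
    remaining = n
    while free:
        total = sum(N_h[h] * S_h[h] for h in free)
        trial = {h: remaining * N_h[h] * S_h[h] / total for h in free}
        over = [h for h in free if trial[h] > N_h[h]]
        if not over:
            for h in free:
                alloc[h] = trial[h]
            break
        for h in over:               # sample these strata completely
            alloc[h] = float(N_h[h])
            remaining -= N_h[h]
            free.remove(h)
    return alloc


def var_general(N_h, S_h, n_h):
    """General variance of the stratified mean for any allocation."""
    N = sum(N_h)
    return sum(
        Nh * (Nh - nh) * S ** 2 / nh for Nh, S, nh in zip(N_h, S_h, n_h)
    ) / N ** 2


def var_revised(N_h, S_h, n_h):
    """Alternative formula using only strata with n_h < N_h."""
    N = sum(N_h)
    part = [h for h in range(len(N_h)) if n_h[h] < N_h[h]]
    n_prime = sum(n_h[h] for h in part)
    sum_WS = sum(N_h[h] * S_h[h] for h in part) / N
    sum_WS2 = sum(N_h[h] * S_h[h] ** 2 for h in part) / N
    return sum_WS ** 2 / n_prime - sum_WS2 / N


# Hypothetical population in which the third stratum is highly variable.
N_h, S_h, n = [50, 30, 20], [1.0, 2.0, 10.0], 40
alloc = revised_neyman(N_h, S_h, n)
v_a = var_general(N_h, S_h, alloc)
v_b = var_revised(N_h, S_h, alloc)

print(alloc, v_a, v_b)
```

The unrevised allocation would ask for about 26 units from a stratum of only 20; after revision that stratum is taken completely and contributes nothing to the variance.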
n =----''---- n (l
(5.46)
In particular cases the formulas take various forms that may be more convenient
for computation. A few are given.
Presumed optimum allocation (for fixed n): \(w_h \propto W_h s_h\).
\[ n = \frac{\left(\sum W_h s_h\right)^2}{V + (1/N)\sum W_h s_h^2} \qquad (5.47) \]
(5.49)
\[ n = \frac{n_0}{1 + \dfrac{n_0}{N}} \qquad (5.51) \]
Example. This example comes from a paper by Cornell (1947), which describes a
sample of United States colleges and universities drawn in 1946 by the U.S. Office of
Education in order to estimate enrollments for the 1946-1947 academic year. The
illustration is for the population of 196 teachers' colleges and normal schools. These were
arranged in seven strata, of which one small stratum will be ignored. The first five strata
were constructed by size of institution; the sixth contained colleges for women only.
Estimates \(s_h\) of the \(S_h\) were computed from results for the 1943-1944 academic year. An
"optimum" stratification based on these \(s_h\) was employed.
The objective was a coefficient of variation of 5% in the estimated total enrollment. In
1943 the total enrollment for this group of colleges was 56,472. Thus the desired standard
error is
(0.05)(56,472) = 2824
It may be objected that enrollments will be greater in 1946 than in 1943 and that
allowance should be made for this increase. Actually, the calculation assumes only that the
cv per college remains the same in 1943 and 1946, an assumption that may not be
unreasonable.
Table 5.4 shows the values of \(N_h\), \(s_h\), and \(N_h s_h\), which were known before determining n.
The appropriate formula for n is (5.50), which applies to an "optimum" allocation for
estimating a total. With only 196 units in this population, it is improbable that the fpc will be
negligible. However, for purposes of illustration, a first approximation ignoring the fpc will
be sought. This is
TABLE 5.4
DATA FOR ESTIMATING SAMPLE SIZE
Stratum    N_h    s_h    N_h s_h    n_h
1          13     325    4,225       9
2          18     190    3,420       7
3          26     189    4,914      10
4          42      82    3,444       7
5          73      86    6,278      13
6          24     190    4,560      10
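The sample-size calculation for this example can be sketched from the tabulated values. The code below is an illustration, not a reproduction of the original arithmetic: it assumes that the formula for a total under presumed optimum allocation has the form n = (sum N_h s_h)^2 / (V + sum N_h s_h^2), with first approximation n_0 = (sum N_h s_h)^2 / V when the fpc is ignored, and it takes V as the square of the desired standard error, 2824.

```python
N_h = [13, 18, 26, 42, 73, 24]
s_h = [325, 190, 189, 82, 86, 190]
V = 2824 ** 2                       # desired variance of the estimated total

NS = [Nh * s for Nh, s in zip(N_h, s_h)]
sum_NS = sum(NS)
sum_NS2 = sum(Nh * s ** 2 for Nh, s in zip(N_h, s_h))

n0 = sum_NS ** 2 / V                # first approximation, fpc ignored
n = sum_NS ** 2 / (V + sum_NS2)     # with the fpc (assumed form of the formula)

# Allocate n in proportion to N_h * s_h and round to whole colleges.
alloc = [round(n * x / sum_NS) for x in NS]

print(sum_NS, round(n0, 1), round(n, 1), alloc)
```

Ignoring the fpc would call for about 90 colleges out of 196; allowing for it brings the figure down to about 57, and the rounded stratum sample sizes agree with the last column of Table 5.4.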
be the proportions of units in C in the hth stratum and in the sample from that
stratum, respectively. For the proportion in the whole population, the estimate
appropriate to stratified random sampling is
\[ p_{st} = \sum \frac{N_h p_h}{N} \qquad (5.52) \]
Theorem 5.9. With stratified random sampling, the variance of \(p_{st}\) is
\[ V(p_{st}) = \frac{1}{N^2}\sum \frac{N_h^2(N_h - n_h)}{N_h - 1}\frac{P_h Q_h}{n_h} \qquad (5.53) \]
* The arithmetical results differ slightly from those given by Cornell (1947).
Proof. This is a particular case of the general theorem for the variance of the
estimated mean. From theorem 5.3
\[ V(\bar{y}_{st}) = \frac{1}{N^2}\sum N_h(N_h - n_h)\frac{S_h^2}{n_h} \qquad (5.54) \]
Let \(y_{hi}\) be a variate which has the value 1 when the unit is in C, and zero otherwise.
In section 3.2, equation 3.4, it was shown that for this variate
\[ S_h^2 = \frac{N_h}{N_h - 1}P_h Q_h \qquad (5.55) \]
For the sample estimate of the variance, substitute \(p_h q_h/(n_h - 1)\) for the
unknown \(P_h Q_h/n_h\) in any of the formulas above.
The best choice of the \(n_h\) in order to minimize \(V(p_{st})\) follows from the general
theory in section 5.5.
Minimum Variance for Fixed Total Sample Size.
Thus
\[ n_h \doteq n\frac{N_h\sqrt{P_h Q_h}}{\sum N_h\sqrt{P_h Q_h}} \qquad (5.60) \]
Minimum Variance for Fixed Cost, where Cost = \(c_0 + \sum c_h n_h\).
\[ n_h \doteq n\frac{N_h\sqrt{P_h Q_h/c_h}}{\sum N_h\sqrt{P_h Q_h/c_h}} \qquad (5.61) \]
The value of n is found as in section 5.5.
TABLE 5.5
RELATIVE PRECISION OF STRATIFIED AND SIMPLE RANDOM SAMPLING
Columns: variance under simple and under stratified random sampling.
To illustrate the first result, Table 5.5 compares stratified random sampling
(proportional allocation) with simple random sampling for three strata of equal
sizes (\(W_h = 1/3\)). Four cases are included, the first having \(P_h\) = 0.4, 0.5, and 0.6 in the
three strata and the last (and most extreme) having \(P_h\) = 0.1, 0.5, and 0.9. The next
two columns show the variances of the estimated proportion, multiplied by
n/(1 - f), and the last gives the relative precisions of stratified to simple random
sampling. The gain in precision is large only in the last two cases.
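The comparison described here is easy to recompute. The sketch below is an illustration under the stated simplification that both variances are multiplied by n/(1 - f), so that simple random sampling gives PQ and proportional stratification gives the sum of W_h P_h Q_h; the function name is hypothetical, and only the first and last cases named in the text are evaluated.

```python
def relative_precision(P_h, W_h):
    """Relative precision (per cent) of proportional stratification to SRS."""
    P = sum(W * p for W, p in zip(W_h, P_h))
    v_simple = P * (1 - P)                                    # PQ
    v_strat = sum(W * p * (1 - p) for W, p in zip(W_h, P_h))  # sum W_h P_h Q_h
    return 100 * v_simple / v_strat


W_h = [1 / 3] * 3
first_case = relative_precision([0.4, 0.5, 0.6], W_h)
last_case = relative_precision([0.1, 0.5, 0.9], W_h)

print(round(first_case, 1), round(last_case, 1))
```

The gain is under 3% when the stratum proportions range only from 0.4 to 0.6, but about 74% in the extreme case, which is the pattern the text describes.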
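To compare proportional with optimum allocation for fixed n, it will be found
that apart from the multiplier (1 - f),
\[ V_{prop} = \frac{1}{n}\sum W_h P_h Q_h, \qquad V_{opt} = \frac{1}{n}\left(\sum W_h\sqrt{P_h Q_h}\right)^2 \qquad (5.62) \]
\[ \frac{V_{opt}}{V_{prop}} = \frac{\left(\sum W_h\sqrt{P_h Q_h}\right)^2}{\sum W_h P_h Q_h} \qquad (5.63) \]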
If all \(P_h\) lie between the two values \(P_0\) and \((1 - P_0)\), we are interested in the
smallest value the relative precision will take. For simplicity, we consider two
strata of equal size (\(W_1 = W_2\)). The minimum relative precision is attained when
\(P_1 = \tfrac{1}{2}\) and \(P_2 = P_0\). The relative precision then becomes
\[ \frac{V_{opt}}{V_{prop}} = \frac{\left(0.5 + \sqrt{P_0 Q_0}\right)^2}{2(0.25 + P_0 Q_0)} \qquad (5.64) \]
Some values of this function are given in Table 5.6. Even with \(P_0\) equal to 0.1, or as
high as 0.9, the relative precision is 94%. In most cases the simplicity and the
self-weighting feature of proportional stratification more than compensate for this
slight loss in precision.
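The 94% figure can be checked directly. The sketch below assumes the two-stratum form of the relative precision, (0.5 + sqrt(P0 Q0))^2 / (2 (0.25 + P0 Q0)), and compares it with the general ratio (sum W_h sqrt(P_h Q_h))^2 / sum(W_h P_h Q_h) evaluated at W_1 = W_2 = 0.5, P_1 = 0.5, P_2 = P0. Both function names are hypothetical.

```python
import math


def min_relative_precision(P0):
    """Two equal strata with P_1 = 0.5 and P_2 = P0."""
    pq = P0 * (1 - P0)
    return (0.5 + math.sqrt(pq)) ** 2 / (2 * (0.25 + pq))


def general_ratio(W_h, P_h):
    """V_opt / V_prop for any set of strata (fpc ignored)."""
    num = sum(W * math.sqrt(p * (1 - p)) for W, p in zip(W_h, P_h)) ** 2
    den = sum(W * p * (1 - p) for W, p in zip(W_h, P_h))
    return num / den


rp_01 = min_relative_precision(0.1)
rp_09 = min_relative_precision(0.9)
rp_general = general_ratio([0.5, 0.5], [0.5, 0.1])

print(round(100 * rp_01), round(100 * rp_09), round(100 * rp_general))
```

Because the function depends on P0 only through P0 Q0, the values for 0.1 and 0.9 coincide, which is why the table can pair complementary proportions in a single column.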
The limitations of the example should be noted. It does not take account of
differential costs of sampling in different strata. In some surveys the \(P_h\) are very
small, but they range from, say, 0.001 to 0.05 in different strata. Here there would
be a more substantial gain from optimum stratification.
TABLE 5.6
RELATIVE PRECISION OF PROPORTIONAL TO OPTIMUM ALLOCATION
\(P_0\): 0.4 or 0.6; 0.3 or 0.7; 0.2 or 0.8; 0.1 or 0.9; 0.05 or 0.95
EXERCISES
5.1 In a population with N = 6 and L = 2 the values of \(y_{hi}\) are 0, 1, 2 in stratum 1 and 4,
6, 11 in stratum 2. A sample with n = 4 is to be taken. (a) Show that the optimum \(n_h\) under
Neyman allocation, when rounded to integers, are \(n_h = 1\) in stratum 1 and \(n_h = 3\) in stratum
2. (b) Compute the estimate \(\bar{y}_{st}\) for every possible sample that can be drawn under
optimum allocation and under proportional allocation. Verify that the estimates are
unbiased. Hence find \(V_{opt}(\bar{y}_{st})\) and \(V_{prop}(\bar{y}_{st})\) directly. (c) Verify that \(V_{opt}(\bar{y}_{st})\) agrees with the
formula given in equation (5.6) and that \(V_{prop}(\bar{y}_{st})\) agrees with the formula given in equation
(5.8), page 93. (d) Use of formula (5.27), page 99, to compute \(V_{opt}(\bar{y}_{st})\) is slightly incorrect
because it does not allow for the fact that the \(n_h\) were rounded to integers. How well does it
agree with the corrected value?
5.2 The households in a town are to be sampled in order to estimate the average
amount of assets per household that are readily convertible into cash. The households are
stratified into a high-rent and a low-rent stratum. A house in the high-rent stratum is
thought to have about nine times as much assets as one in the low-rent stratum, and \(S_h\) is
expected to be proportional to the square root of the stratum mean.
There are 4000 households in the high-rent stratum and 20,000 in the low-rent stratum.
(a) How would you distribute a sample of 1000 households between the two strata? (b) If
the object is to estimate the difference between assets per household in the two strata, how
should the sample be distributed?
5.3 The following data show the stratification of all the farms in a county by farm size
and the average acres of corn (maize) per farm in each stratum. For a sample of 100 farms,
compute the sample sizes in each stratum under (a) proportional allocation, (b) optimum
allocation. Compare the precisions of these methods with that of simple random sampling.
\[ V_{ran} = V_{prop} + \frac{1 - f}{n(N - 1)}\left[\sum N_h(\bar{Y}_h - \bar{Y})^2 - \frac{1}{N}\sum (N - N_h)S_h^2\right] \]
5.5 A sampler has two strata with relative sizes \(W_1\), \(W_2\). He believes that \(S_1\), \(S_2\) can be
taken as equal but thinks that \(c_2\) may be between \(2c_1\) and \(4c_1\). He would prefer to use
proportional allocation but does not wish to incur a substantial increase in variance
compared with optimum allocation. For a given cost \(C = c_1 n_1 + c_2 n_2\), ignoring the fpc, show
that
\[ \frac{V_{prop}(\bar{y}_{st})}{V_{opt}(\bar{y}_{st})} = \frac{W_1 c_1 + W_2 c_2}{\left(W_1\sqrt{c_1} + W_2\sqrt{c_2}\right)^2} \]
If \(W_1 = W_2\), compute the relative increases in variance from using proportional allocation
when \(c_2/c_1 = 2, 4\).
5.6 A sampler proposes to take a stratified random sample. He expects that his field
costs will be of the form \(\sum c_h n_h\). His advance estimates of relevant quantities for the two
strata are as follows.
Stratum    W_h    S_h    c_h
1          0.4    10     $4
2          0.6    20     $9
(a) Find the values of \(n_1/n\) and \(n_2/n\) that minimize the total field cost for a given value of
\(V(\bar{y}_{st})\). (b) Find the sample size required, under this optimum allocation, to make
\(V(\bar{y}_{st}) = 1\). Ignore the fpc. (c) How much will the total field cost be?
5.7 After the sample in exercise 5.6 is taken, the sampler finds that his field costs were
actually $2 per unit in stratum 1 and $12 in stratum 2. (a) How much greater is the field cost
than anticipated? (b) If he had known the correct field costs in advance, could he have
attained \(V(\bar{y}_{st}) = 1\) for the original estimated field cost in exercise 5.6? (Hint. The
Cauchy-Schwarz inequality, page 97, with V' = 1, gives the answer to this question without
finding the new allocation.)
5.8 In a stratification with two strata, the values of the \(W_h\) and \(S_h\) are as follows.
Stratum    W_h    S_h
1          0.8    2
2          0.2    4
Compute the sample sizes \(n_1\), \(n_2\) in the two strata needed to satisfy the following conditions.
Each case requires a separate computation. (Ignore the fpc.) (a) The standard error of the
estimated population mean \(\bar{y}_{st}\) is to be 0.1 and the total sample size \(n = n_1 + n_2\) is to be
minimized. (b) The standard error of the estimated mean of each stratum is to be 0.1. (c)
The standard error of the difference between the two estimated stratum means is to be 0.1,
again minimizing the total size of sample.
5.9 With two strata, a sampler would like to have \(n_1 = n_2\) for administrative convenience,
instead of using the values given by the Neyman allocation. If \(V(\bar{y}_{st})\), \(V_{opt}(\bar{y}_{st})\) denote
the variances given by the \(n_1 = n_2\) and the Neyman allocations, respectively, show that the
fractional increase in variance
\[ \frac{V(\bar{y}_{st}) - V_{opt}(\bar{y}_{st})}{V_{opt}(\bar{y}_{st})} = \left(\frac{r - 1}{r + 1}\right)^2 \]
where \(r = n_1/n_2\) as given by Neyman allocation. For the strata in exercise 5.8, case (a), what
would the fractional increase in variance be by using \(n_1 = n_2\) instead of the optimum?
5.10 If the cost function is of the form \(C = c_0 + \sum t_h\sqrt{n_h}\), where \(c_0\) and the \(t_h\) are known
numbers, show that in order to minimize \(V(\bar{y}_{st})\) for fixed total cost \(n_h\) must be proportional
to
\[ \left(\frac{W_h^2 S_h^2}{t_h}\right)^{2/3} \]
Find the \(n_h\) for a sample of size 1000 under the following conditions.
Stratum
1 0.4 1
2 0.3 2
3 0.2 4
5.11 If \(V_{prop}(\bar{y}_{st})\) is the variance of the estimated mean from a stratified random sample
of size n with proportional allocation and \(V(\bar{y})\) is the variance of the mean of a simple
random sample of size n, show that the ratio
\[ \frac{V_{prop}(\bar{y}_{st})}{V(\bar{y})} \]
does not depend on the size of sample but that the ratio
\[ \frac{V_{min}(\bar{y}_{st})}{V_{prop}(\bar{y}_{st})} \]
decreases as n increases. (This implies that optimum allocation for fixed n becomes more
effective in relation to proportional allocation as n increases.) [Use formulas (5.8) and
(5.27).]
5.12 Compare the values obtained for \(V(p_{st})\) under proportional allocation and
optimum allocation for fixed sample size in the following two populations. Each stratum is
of equal size. The fpc may be ignored.
Population 1 Population 2
Stratum Ph Stratum Ph
1 0.1 1 0.01
2 0.5 2 0.05
3 0.9 3 0.10
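5.13 Show that in the estimation of proportions the results corresponding to theorem
5.8 are as follows.
\[ V_{ran} = V_{prop} + \frac{(1 - f)}{n}\sum W_h(P_h - P)^2 \]
\[ V_{prop} = V_{opt} + \frac{1}{n}\sum W_h\left(\sqrt{P_h Q_h} - \overline{\sqrt{P_h Q_h}}\right)^2 \]
where
\[ \overline{\sqrt{P_h Q_h}} = \sum W_h\sqrt{P_h Q_h} \]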
5.14 In a firm, 62% of the employees are skilled or unskilled males, 31% are clerical
females, and 7% are supervisory. From a sample of 400 employees the firm wishes to
estimate the proportion that uses certain recreational facilities. Rough guesses are that the
facilities are used by 40 to 50% of the males, 20 to 30% of the females, and 5 to 10% of the
supervisors. (a) How would you allocate the sample among the three groups? (b) If the true
proportions of users were 48, 21, and 4%, respectively, what would the s.e. of the estimated
proportion \(p_{st}\) be? (c) What would the s.e. of p be from a simple random sample with
n = 400?
5.15 Formula (5.27) for the minimum variance of \(\bar{y}_{st}\) under Neyman allocation reads as
follows.
\[ V_{min}(\bar{y}_{st}) = \frac{\left(\sum W_h S_h\right)^2}{n} - \frac{\sum W_h S_h^2}{N} \]
A student comments: "Since \(\sum W_h S_h^2 > (\sum W_h S_h)^2\) unless all the \(S_h\) are equal, the formula
must be wrong because as n approaches N it will give a negative value for \(V(\bar{y}_{st})\)." Is the
formula or the student wrong?
5.16 By formula (5.26) for Neyman allocation, the sampling fraction in stratum h is
\(f_h = n_h/N_h = nS_h/\sum N_h S_h\). The situations in which this formula calls for more than 100%
sampling in a stratum (\(f_h > 1\)) are therefore likely to be those in which the overall sampling
fraction n/N is fairly substantial and one stratum has unusually high variability. The
following is an example for a small population, with N = 100, n = 40.
Stratum    N_h    S_h    Optimum n_h
1           60      2    15
2           30      4    15
3           10     15    10
Total      100           40
(a) Verify that the optimum \(n_h\) are as shown in the right column. (b) Calculate \(V(\bar{y}_{st})\) by
formula (5.6) and by formula (5.42) and show that both give \(V(\bar{y}_{st}) = 0.12\).
CHAPTER 5A
Further Aspects
of Stratified Sampling
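In practice, since the \(S_h\) are not known, we can only approximate this allocation.
If \(\tilde{n}_h\) is the sample size used in stratum h, the variance actually attained, from
equation (5.6), page 92, is
\[ V(\bar{y}_{st}) = \sum \frac{W_h^2 S_h^2}{\tilde{n}_h} - \frac{1}{N}\sum W_h S_h^2 \qquad (5A.3) \]
so that
\[ V(\bar{y}_{st}) - V_{min}(\bar{y}_{st}) = \sum \frac{W_h^2 S_h^2}{\tilde{n}_h} - \frac{1}{n}\left(\sum W_h S_h\right)^2 \qquad (5A.4) \]
In the first term on the right substitute for \(W_h S_h\) in terms of \(n_h'\) from (5A.1). This
gives
\[ V(\bar{y}_{st}) - V_{min}(\bar{y}_{st}) = \frac{\left(\sum W_h S_h\right)^2}{n^2}\sum \frac{(\tilde{n}_h - n_h')^2}{\tilde{n}_h} \qquad (5A.5) \]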
Reverting to equation (5A.2), if the fpc (last term on the right) is negligible, we see
that
$$\frac{V_{min}(\bar{y}_{st})}{n} = \frac{(\sum W_h S_h)^2}{n^2} \qquad (5A.6)$$
Hence the proportional increase in variance resulting from deviations from the
optimum allocation is
$$\frac{V(\bar{y}_{st}) - V_{min}(\bar{y}_{st})}{V_{min}(\bar{y}_{st})} = \frac{1}{n}\sum_{h=1}^{L}\frac{(n_h - n_h')^2}{n_h} \qquad (5A.7)$$
where $n_h$ is the actual and $n_h'$ the optimum sample size in stratum h. If the fpc is
not negligible, the = sign in (5A.7) becomes $\ge$.
Let $g_h = |n_h - n_h'|/n_h$ be the absolute difference in the sample sizes in stratum h,
expressed as a fraction of the actual sample size $n_h$. Then (5A.7) becomes
$$\frac{V(\bar{y}_{st}) - V_{min}(\bar{y}_{st})}{V_{min}(\bar{y}_{st})} = \frac{1}{n}\sum_{h=1}^{L} n_h g_h^2 \qquad (5A.8)$$
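The two forms of the proportional increase can be compared on a small case. The following is a minimal sketch with hypothetical allocations for three strata; the function names are illustrative and the fpc is ignored, as in the text.

```python
def increase_direct(n_actual, n_opt):
    """(1/n) * sum of (n_h - n_h')^2 / n_h, the form of (5A.7)."""
    n = sum(n_actual)
    return sum((a - o) ** 2 / a for a, o in zip(n_actual, n_opt)) / n


def increase_via_g(n_actual, n_opt):
    """(1/n) * sum of n_h * g_h^2 with g_h = |n_h - n_h'| / n_h, the form of (5A.8)."""
    n = sum(n_actual)
    return sum(a * (abs(a - o) / a) ** 2 for a, o in zip(n_actual, n_opt)) / n


# Hypothetical allocations (illustrative numbers only, both adding to n = 1000).
n_opt = [200, 300, 500]
n_act = [300, 300, 400]

inc = increase_direct(n_act, n_opt)
inc_g = increase_via_g(n_act, n_opt)
g_max = max(abs(a - o) / a for a, o in zip(n_act, n_opt))
print(round(inc, 4), round(inc_g, 4), round(g_max ** 2, 4))
```

Because (5A.8) is a weighted average of the $g_h^2$ with weights $n_h/n$, the proportional increase cannot exceed the largest $g_h^2$. In this example a departure of up to one third in a stratum costs less than 6% in variance.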
TABLE 5A.1
EFFECTS OF DEVIATIONS FROM OPTIMUM ALLOCATION
Stratum    n_h' (opt)    n_h (act)    |n_h - n_h'|/n_h    (n_h - n_h')^2/n_h
1. The sample estimate is biased. Because of the bias, we measure the accuracy
of the estimate by its mean square error about $\bar{Y}$ rather than by its variance about
its own mean (see section 1.9).
2. The bias remains constant as the sample size increases. Consequently, a size
of sample is always reached for which the estimate is less accurate than simple
random sampling, and all the gain in precision from stratification is lost.
3. The usual estimate $s(\bar{y}_{st})$ underestimates the true error of $\bar{y}_{st}$, since it does not
contain the contribution of the bias to the error.
To justify these statements, note that in repeated sampling the mean value of
the estimate is $\sum w_h \bar{Y}_h$. The bias therefore amounts to
$$\sum (w_h - W_h)\bar{Y}_h$$
It is independent of the size of the sample. In finding the mean square error (MSE)
of the estimate, it is easy to verify that the variance term is given by the usual
Example. This illustrates the loss of precision from incorrect weights when stratification
is (a) slightly effective, (b) highly effective. Consider a large population with $S^2 = 1$,
divisible into two strata with $W_1 = 0.9$, $W_2 = 0.1$. We will assume $S_1 = S_2 = S_h$. Then,
neglecting terms in $1/N_h$,
$$S^2 = \sum W_h S_h^2 + \sum W_h(\bar{Y}_h - \bar{Y})^2 = S_h^2 + W_1 W_2(\bar{Y}_1 - \bar{Y}_2)^2 \qquad (5A.10)$$
that is,
$$1 = S_h^2 + 0.09(\bar{Y}_1 - \bar{Y}_2)^2$$
In (a) take $\bar{Y}_1 - \bar{Y}_2 = 1$. Then $S_h^2 = 0.91$, and proportional stratification with correct
weights reduces the variance by 9%, compared with simple random sampling.
In (b) take $\bar{Y}_1 - \bar{Y}_2 = 3$, giving $S_h^2 = 0.19$, a reduction in variance of more than 80%.
With two strata and incorrect weights, the bias may be written
$$(w_1 - W_1)(\bar{Y}_1 - \bar{Y}_2)$$
since $(w_1 - W_1) = -(w_2 - W_2)$. Suppose that the estimated weights are $w_1 = 0.92$ and
$w_2 = 0.08$. The bias amounts to (0.02)(1) = 0.02 in (a) and to 0.06 in (b). Hence we have the
following comparable MSE's for a sample of size n.
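The comparison can be sketched numerically. The code below assumes proportional allocation based on the estimated weights, so that the variance term is $S_h^2/n$ and the MSE of the stratified estimate is $S_h^2/n$ plus the squared bias; that form of the variance term is an assumption made for this illustration, and the function names are hypothetical.

```python
def example_case(mean_diff, W1=0.9, w1_est=0.92, S2=1.0):
    """Within-stratum variance from (5A.10) and the bias from incorrect weights."""
    W2 = 1.0 - W1
    Sh2 = S2 - W1 * W2 * mean_diff ** 2
    bias = (w1_est - W1) * mean_diff
    return Sh2, bias


def mse_stratified(n, Sh2, bias):
    # Assumed form: variance S_h^2 / n plus squared bias (bias does not shrink with n).
    return Sh2 / n + bias ** 2


def mse_srs(n, S2=1.0):
    return S2 / n


def break_even_n(Sh2, bias, S2=1.0):
    """Sample size at which the biased stratified estimate stops beating SRS."""
    return (S2 - Sh2) / bias ** 2


Sh2_a, bias_a = example_case(1.0)   # case (a): slightly effective stratification
Sh2_b, bias_b = example_case(3.0)   # case (b): highly effective stratification

for n in (50, 100, 300, 1000):
    print(n, round(mse_srs(n), 5),
          round(mse_stratified(n, Sh2_a, bias_a), 5),
          round(mse_stratified(n, Sh2_b, bias_b), 5))
print(break_even_n(Sh2_a, bias_a), break_even_n(Sh2_b, bias_b))
```

Under these assumptions the break-even sample size is the same in both cases, because the gain from stratification and the squared bias both scale with $(\bar{Y}_1 - \bar{Y}_2)^2$.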
TABLE 5A.2
COMPARABLE VALUES OF MSE($\bar{y}$)
n    Simple Random    Stratified Random (a)    Stratified Random (b)
In some surveys a large preliminary sample of size n' can be taken in order to
estimate the $W_h$. This technique, known as double sampling or two-phase sampling,
has numerous applications and is discussed in Chapter 12. It will be shown
that with double sampling the mean square error of $\bar{y}_{st}$ is approximately
$$\frac{\sum W_h S_h^2}{n} + \frac{\sum W_h(\bar{Y}_h - \bar{Y})^2}{n'} \qquad (5A.11)$$
By comparing this MSE with $S^2/n$, as given by equation (5A.10), we see that most
of the gain from stratification is retained provided that n' is much greater than n.
To put it more generally, a set of estimated weights preserves most of the potential
gain from stratification if the weights are much more accurately estimated than
they would be from a simple random sample of size n.
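A short numerical sketch of (5A.11) shows how the retained gain depends on n'. The within and between components are borrowed from case (b) of the preceding example (0.19 and 0.81, adding to $S^2 = 1$); n = 100 and the values of n' are hypothetical.

```python
def mse_double_sampling(n, n_prime, within, between):
    """Approximate MSE of the stratified mean when weights come from a first-phase sample of size n_prime."""
    return within / n + between / n_prime


within, between = 0.19, 0.81   # sum W_h S_h^2 and sum W_h (Ybar_h - Ybar)^2, case (b)
n = 100
mse_srs = (within + between) / n

for n_prime in (100, 500, 2000, 10000):
    mse = mse_double_sampling(n, n_prime, within, between)
    print(n_prime, round(mse, 6), round(mse / mse_srs, 3))
```

When n' = n the expression equals $S^2/n$, so nothing is gained; as n' grows the MSE approaches the value $\sum W_h S_h^2/n$ attained with known weights.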
Example. Data given by Jessen (1942) illustrate a farm survey of this kind. The state of
Iowa was divided into five geographic regions, each denoted by its major agricultural
enterprise. Suppose that these regions are to be used as strata in a survey on dairy farming.
The three items of most interest are the number of cows milked per day, the number of
gallons of milk per day, and the total annual cash receipts from dairy products. From a
survey made in 1938, the estimated standard deviations $s_h$ within strata are shown in Table
5A.3. In Table 5A.4 the optimum Neyman allocations based on these $s_h$ are given for the
individual items in a sample of 1000 farms.
TABLE 5A.3
STANDARD DEVIATIONS WITHIN STRATA
Stratum    s_h, Cows Milked    s_h, Gallons of Milk    s_h, Receipts for Dairy Products ($)
TABLE 5A.4
SAMPLE SIZES WITHIN STRATA (n = 1000)
                                       Optimum Allocation for
Stratum              Proportional    Cows    Gallons    Receipts    Average m_h
Northeast dairy          197          254      258        236          250
Cash grain               191          182      209        246          212
Western livestock        219          203      171        194          189
Southern pasture         184          145      134        115          131
Eastern livestock        208          216      228        209          218
TABLE 5A.5
EXPECTED VARIANCES OF THE ESTIMATED MEAN
Type of allocation    Cows    Gallons    Receipts
The individual optimum allocations differ only moderately from each other. With one
exception, all three deviate in the same direction from a proportional allocation. Thus, in
the first stratum, proportional allocation suggests 197 farms, and the individual allocations
lead to numbers between 236 and 258. The average of the optimum sample sizes for the
three items, shown in the right-hand column, provides a satisfactory compromise allocation.
Table 5A.5 shows the expected sampling variances of $\bar{y}_{st}$, as given by the individual
optima, the compromise, and the proportional allocations. The formulas are as follows.
$$V_{opt} = \frac{(\sum W_h s_h)^2}{n}, \qquad V_{comp} = \sum \frac{W_h^2 s_h^2}{m_h}, \qquad V_{prop} = \frac{\sum W_h s_h^2}{n}$$
The compromise allocation gives results almost as precise as if it were possible to use
separate optimum allocations for each item. What is more noteworthy is that proportional
allocation is only slightly less precise than the compromise or the individual optima.
Furthermore, Table 5A.5 overestimates the precision of the optima and of the compromise,
since these allocations were made from estimated variances. This result is another
illustration of the flatness of the optimum mentioned in section 5A.1.
(5A.12)
where $n_{jh}$ is the optimum sample size in stratum h for variable j. For the data in
Table 5A.4, where the individual optima differ only slightly, Chatterjee's $n_h$ vary
from the average $m_h$ in Table 5A.4 by, at most, one unit in any stratum.
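The compromise allocations for Table 5A.4 can be recomputed from the three optimum columns. The sketch below computes the simple average and, as a second compromise, an allocation proportional to the root sum of squares of the individual optima. The display (5A.12) is not reproduced above, so the root-sum-of-squares form is an assumption about Chatterjee's rule rather than a quotation of it, and the function names are hypothetical.

```python
import math

# Optimum allocations from Table 5A.4 (strata in table order).
optima = {
    "cows":     [254, 182, 203, 145, 216],
    "gallons":  [258, 209, 171, 134, 228],
    "receipts": [236, 246, 194, 115, 209],
}
m_h_table = [250, 212, 189, 131, 218]   # "Average m_h" column of Table 5A.4
n = 1000


def average_allocation(optima):
    """Mean of the individual optimum sample sizes in each stratum."""
    columns = list(optima.values())
    return [sum(values) / len(columns) for values in zip(*columns)]


def root_sum_square_allocation(optima, n):
    """Assumed compromise: n_h proportional to sqrt(sum over j of n_jh^2)."""
    columns = list(optima.values())
    roots = [math.sqrt(sum(x * x for x in values)) for values in zip(*columns)]
    total = sum(roots)
    return [n * r / total for r in roots]


avg = average_allocation(optima)
rss = root_sum_square_allocation(optima, n)
print([round(x) for x in avg])
print([round(x) for x in rss])
```

The rounded averages add to 999, so one unit has to be added somewhere to reach n = 1000; the table places it in the first stratum. Under the assumed rule, the rounded second allocation stays within one unit of the tabled $m_h$ in every stratum.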
In some surveys the optimum allocations for individual variates differ so much
that there is no obvious compromise. Some principle is needed to determine the
allocation to be used. Two useful ones suggested by Yates (1960) are presented.
The first applies to surveys with a specialized objective, in which the loss due to
an error of given size in an estimate can be measured in terms of money or utility,
as discussed in section 4.10. With k variates and quadratic loss functions, it may be
reasonable to express the total expected loss as a linear function of the variances of
the estimated population means or totals. For the means,
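$$L = \sum_{j=1}^{k} a_j V(\bar{y}_{jst}) = \sum_{j=1}^{k} a_j \sum_h W_h^2 S_{jh}^2\left(\frac{1}{n_h} - \frac{1}{N_h}\right) \qquad (5A.13)$$
where $S_{jh}^2$ is the variance of the jth variate in stratum h. Interchange of the order of
summation gives
$$L = \sum_h \frac{W_h^2}{n_h}\left(\sum_j a_j S_{jh}^2\right) - \sum_h \frac{W_h^2}{N_h}\left(\sum_j a_j S_{jh}^2\right) \qquad (5A.14)$$
Minimizing the product of $(C - c_0)$ and the first term in L (the term depending on
the $n_h$) gives, by the Cauchy-Schwarz inequality,
$$n_h \propto \frac{W_h}{\sqrt{c_h}}\sqrt{\sum_j a_j S_{jh}^2} \qquad (5A.16)$$
The constant of proportionality is found by satisfying the constraint given for L or
C. For instance, suppose that the value of L is specified and that the fpc term may
be ignored. We have
$$n_h = \frac{n(W_h A_h/\sqrt{c_h})}{\sum (W_h A_h/\sqrt{c_h})} \qquad (5A.17)$$
where $A_h = \sqrt{\sum_j a_j S_{jh}^2}$. The required total sample size is, from (5A.14),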
In the second approach we specify the desired variance $V_j$ for each variate. For
population means this implies that
$$\sum_h W_h^2 S_{jh}^2\left(\frac{1}{n_h} - \frac{1}{N_h}\right) \le V_j \qquad (j = 1, 2, \ldots, k) \qquad (5A.19)$$
Inequality signs are used because the most economical allocation may supply
variances smaller than the desired $V_j$ for some items.
In this approach the cost C [equation (5A.15)] is minimized subject to the
tolerances $V_j$ and the conditions $0 \le n_h \le N_h$. The problem is one in nonlinear
programming. Algorithms for its solution have been given by Hartley and
Hocking (1963), Chatterjee (1966), Zukhovitsky and Avdeyeva (1966), and
Huddleston et al. (1970). Earlier, Dalenius (1957) gave an ingenious graphical
solution, while Yates (1960) and Kokan (1963) developed methods of successive
approximation, illustrated in the second edition of this book.
A useful first step is, of course, to work out the optimum allocation for each
variate separately and find the cost of satisfying its tolerance. Take the variate, say
$y_1$, for which the cost $C_1$ is highest and examine whether the optimum $n_h$ values
for $y_1$ satisfy all the other (k - 1) tolerances. If so, we use this allocation and the
problem is solved, because no other allocation will satisfy the tolerance $V_1$ for $y_1$
at a cost as low as $C_1$.
By working a series of examples in a related problem, Booth and Sedransk
(1969) have pointed out that in default of a computer program a good approxima-
tion to the solution of Yates' second problem can often be obtained by solving the
easier first problem. Specify that L in (5A.13) shall have the value $V^* = \sum a_j V_j$,
where the $V_j$ are the desired individual tolerances and the $a_j$ are made inversely
proportional to the $V_j$. Thus with two variates, $a_1 = V_2/(V_1 + V_2)$, $a_2 = V_1/(V_1 + V_2)$, and
$$V^* = a_1 V_1 + a_2 V_2 = \frac{2V_1 V_2}{V_1 + V_2} \qquad (5A.20)$$
Example. (Four strata, two variates.) The data and the application of the approximate
method are shown in columns (1) to (6) of Table 5A.6. The problem is to find the smallest n
for which
$$V(\bar{y}_{1st}) \le 0.04, \qquad V(\bar{y}_{2st}) \le 0.01$$
TABLE 5A.6
ARTIFICIAL DATA FOR FOUR STRATA, TWO VARIATES
By working out the optimum allocation for each variate separately it is easily verified that
n = 625 is needed to satisfy the first constraint and n = 676 is needed to satisfy the second.
However, n = 676 with its allocation does not satisfy the first constraint, giving 0.0589
instead of 0.04 for $V(\bar{y}_{1st})$. An iterative solution to satisfy both constraints (presented in the
second edition) gave $\tilde{n} = 732$, with the $\tilde{n}_h$ shown in column (7) of Table 5A.6.
To use the Booth and Sedransk approach with $V_1 = 0.04$, $V_2 = 0.01$, we specify
$$L = 0.2V(\bar{y}_{1st}) + 0.8V(\bar{y}_{2st}) = \frac{2(0.04)(0.01)}{(0.05)} = 0.016$$
$$n = \frac{(\sum W_h A_h)^2}{L} = \frac{(3.416)^2}{0.016} = 729$$
using column (5) of Table 5A.6. From (5A.17), column (5) also leads to the $n_h$ values
shown in column (6) of Table 5A.6. As columns (6) and (7) show, the two solutions $n_h$ and
$\tilde{n}_h$ agree well.
As Booth and Sedransk note, $n \le \tilde{n}$ in all problems of this type, since n satisfies the single
constraint $L = V^*$, but it need not satisfy the constraint on every variate, whereas the $\tilde{n}$
allocation satisfies L as well as the individual constraints.
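The arithmetic of the approximate method can be sketched as follows. The weights are taken inversely proportional to the tolerances and scaled to add to 1, which matches the two-variate case in the text; the total sample size uses $n = (\sum W_h A_h)^2/L$, which assumes equal costs per unit in all strata and a negligible fpc. The value 3.416 is the sum quoted in the example; the function names are hypothetical.

```python
def booth_sedransk_weights(tolerances):
    """Weights a_j inversely proportional to the tolerances V_j, scaled to sum to 1."""
    inverse = [1.0 / v for v in tolerances]
    total = sum(inverse)
    return [x / total for x in inverse]


def combined_target(tolerances):
    """V* = sum of a_j * V_j, the value assigned to L."""
    weights = booth_sedransk_weights(tolerances)
    return sum(a * v for a, v in zip(weights, tolerances))


def total_sample_size(sum_WA, L_target):
    # Assumes equal cost per unit in every stratum and ignores the fpc.
    return sum_WA ** 2 / L_target


V = [0.04, 0.01]
a = booth_sedransk_weights(V)
V_star = combined_target(V)
n = total_sample_size(3.416, V_star)   # 3.416 = sum of W_h * A_h quoted in the example
print([round(x, 3) for x in a], round(V_star, 4), round(n))
```

The same functions accept more than two tolerances, in which case $V^*$ equals k divided by the sum of the reciprocals of the $V_j$.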
TABLE 5A.7
NUMBER AND PROPORTION OF SCHOOLS IN EACH CELL
Size of                  Expenditure per Pupil
City              A        B        C        D       Totals          n_i.
I      m_1j      15       21       17        9      m_1.   62
       p_1j    0.091    0.127    0.103    0.055     p_1.  0.376       4
II     m_2j      10        8       13        7      m_2.   38
       p_2j    0.061    0.049    0.079    0.042     p_2.  0.231       2
III    m_3j       6        9        5        8      m_3.   28
       p_3j    0.036    0.055    0.030    0.049     p_3.  0.170       2
IV     m_4j       4        3        6        6      m_4.   19
       p_4j    0.024    0.018    0.036    0.036     p_4.  0.114       1
V      m_5j       3        2        5        8      m_5.   18
       p_5j    0.018    0.012    0.030    0.049     p_5.  0.109       1
The objective is to give each school an approximately equal chance of selection
while giving each marginal class its proportional representation. In this illustration
n = 10. Compute the numbers $n_{i.} = np_{i.}$ and $n_{.j} = np_{.j}$, where these products are
rounded to the nearest integers (with a further minor adjustment, if needed, so
that the $n_{i.}$ and the $n_{.j}$ both add to n). These numbers are shown in Table 5A.7.
The next step is to draw n = 10 cells with probability $n_{i.}n_{.j}/n^2$ for the ijth cell.
This is done by constructing an n x n square (Table 5A.8). In row 1 one column is
drawn at random. In row 2 one of the remaining columns is drawn at random, and
so on. At the end, each row and column contains one unit. (This draw is most
quickly made by a random permutation of the numbers 1 to 10.) The results of one
draw are indicated by x's in Table 5A.8.
TABLE 5A.8
10 x 10 SQUARE FOR DRAWING THE SAMPLE
                     Column
                1  2  3  4  5  6  7  8  9  10
Row  Stratum    A     B        C        D
 1     I        x
 2     I        x
 3     I        x
 4     I        x
 5     II       x
 6     II       x
 7     III      x
 8     III      x
 9     IV       x
10     V        x
Note that columns 1 and 2 are assigned to marginal stratum A, since $n_{.1} = 2$.
Similarly, rows 1 through 4 are assigned to marginal stratum I, since $n_{1.} = 4$, and so
on. This completes the allocation of the sample to the 20 cells. The allocation
appears in more compact form in Table 5A.9. Two schools are drawn at random
from the 15 schools in cell IA, and so on. The probability that a school in row i,
column j is drawn is proportional to $n_{i.}n_{.j}/p_{ij}$. Thus the probabilities are not equal,
although they will be approximately so if $p_{ij} \doteq n_{i.}n_{.j}/n^2$.
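The drawing procedure just described can be sketched with a random permutation. The marginal sizes are those of the illustration ($n_{i.}$ = 4, 2, 2, 1, 1 and $n_{.j}$ = 2, 3, 3, 2); the function name and the seed are arbitrary choices for this illustration.

```python
import random


def two_way_draw(row_sizes, col_sizes, rng):
    """Allocate n units to cells by marking one entry per row and column of an n x n square.

    Rows of the square are assigned to row strata in blocks of size n_i.,
    columns to column strata in blocks of size n_.j.
    """
    n = sum(row_sizes)
    if n != sum(col_sizes):
        raise ValueError("row and column sizes must add to the same n")
    row_of = [i for i, size in enumerate(row_sizes) for _ in range(size)]
    col_of = [j for j, size in enumerate(col_sizes) for _ in range(size)]
    perm = list(range(n))
    rng.shuffle(perm)                      # random permutation of the n columns
    counts = [[0] * len(col_sizes) for _ in row_sizes]
    for row, col in enumerate(perm):       # row `row` of the square gets column `col`
        counts[row_of[row]][col_of[col]] += 1
    return counts


rng = random.Random(1)
counts = two_way_draw([4, 2, 2, 1, 1], [2, 3, 3, 2], rng)
for label, row in zip(["I", "II", "III", "IV", "V"], counts):
    print(label, row)
```

Whatever permutation is drawn, the row totals and column totals of the resulting 5 x 4 allocation equal the prescribed $n_{i.}$ and $n_{.j}$; only the split among cells varies from draw to draw.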
An unbiased estimate of the mean per school is
$$\bar{y}_U = \frac{1}{n}\sum \frac{n^2 p_{ij}}{n_{i.}n_{.j}}\,y_{ij}$$
where $y_{ij}$ is the sample total in the ijth cell. If, however, $p_{ij} \doteq n_{i.}n_{.j}/n^2$, the sample
mean $\bar{y}$ is probably preferable, since its bias should be negligible. A sample
estimate of variance is available for both the unbiased and biased estimates,
TABLE 5A.9
ALLOCATION OF THE SAMPLE TO THE 20 CELLS
          A    B    C    D    Total
I         2    1    1    0      4
II        0    0    2    0      2
III       0    1    0    1      2
IV        0    1    0    0      1
V         0    0    0    1      1
Total     2    3    3    2     10
provided that n is at least twice the greater of R and C and that at least two units
are drawn in every row and column.
If $p_{ij}$ differs markedly from $n_{i.}n_{.j}/n^2$ in some cells, an extra step keeps the
probabilities of selection of schools more nearly constant. After computing the $n_{i.}$
and $n_{.j}$, examine the quantities $D_{ij} = np_{ij} - n_{i.}n_{.j}/n$, after rounding them to
integers. If, in any cell, $D_{ij}$ is a positive integer, automatically assign $D_{ij}$ units to this
cell. Reduce n, the $n_{i.}$, and the $n_{.j}$ by the amounts required by this fixed allocation
and carry out the remaining allocation as before.
TABLE 5A.10
ORDERING OF UNITS WITHIN STRATA FOR CONTROLLED SELECTION
     Original Order                Revised Order
Stratum I    Stratum II      Stratum I    Stratum II
    1            1               1            3'
    2            2               2            4'
    3'           3'              3'           5'
    4'           4'              4'           1
                 5'                           2
numbering the units within strata, a prime (') indicates one ownership type, and
absence of a prime indicates the other type. Table 5A.l 0 (left side) shows the units
in the two strata.
If unit 1 or 2 is drawn from stratum I, we would like to draw unit 3',4', or 5'
from stratum II, so that both types of stratification are present with n = 2.
Similarly, 3' or 4' (stratum I) is desired with 1 or 2 (stratum II) . Controlled
selection makes the probability of these desired combinations as high as is
mathematically possible, while retaining equal probability selection within strata
and therefore unbiased estimates by the usual formulas for stratified sampling.
The purpose is either increased accuracy for given n or a saving in field costs.
With stratified random sampling the probability of a desired combination is
(.5)(.6) + (.5)(.4) = .5. This probability can be increased to .9 by two simple
changes in sample selection. Rearrange the units in stratum II so that the desired
combinations (3', 4', 5') with 1 and 2 in stratum I come first, as on the right in Table
5A.10. Then draw a random number r between 1 and 100 and use it to select the
units from both strata. In stratum I, $1 \le r \le 25$ selects unit 1, $26 \le r \le 50$ selects
unit 2, and so on, so as to give each unit the desired one-fourth probability of being
chosen. Similarly, in stratum II, $1 \le r \le 20$ selects 3', $21 \le r \le 40$ selects 4', and so
on. Hence, if $1 \le r \le 20$, we select (1, 3'), if $21 \le r \le 25$, we select (1, 4'), and so on.
The joint selections and their probabilities are as follows.
Pair:          (1,3')  (1,4')  (2,4')  (2,5')  (3',5')  (3',1)  (4',1)  (4',2)
Probability:    .20     .05     .15     .10     .10      .15     .05     .20
The only nondesired combination is (3',5'). Thus the total probability of the
desired combinations is .90.
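The joint probabilities in this example can be reproduced by enumerating the 100 equally likely values of r. This is a small illustrative sketch of the single-random-number device for this example only, not the general controlled-selection algorithm; the helper names are hypothetical.

```python
from collections import Counter


def pick(units, r):
    """Unit chosen by r (1..100) when the listed units occupy consecutive equal-width bands."""
    width = 100 // len(units)
    return units[(r - 1) // width]


def is_desired(a, b):
    """A pair is desired when exactly one of its units carries a prime."""
    return a.endswith("'") != b.endswith("'")


stratum_I = ["1", "2", "3'", "4'"]                  # original order, bands of 25
stratum_II_revised = ["3'", "4'", "5'", "1", "2"]    # revised order, bands of 20

joint = Counter((pick(stratum_I, r), pick(stratum_II_revised, r))
                for r in range(1, 101))
desired_controlled = sum(c for (a, b), c in joint.items() if is_desired(a, b))

# Independent selection in the two strata, for comparison.
pairs = [(a, b) for a in stratum_I for b in stratum_II_revised]
desired_independent = sum(is_desired(a, b) for a, b in pairs) / len(pairs)

for pair, count in joint.items():
    print(pair, count / 100)
print(desired_controlled / 100, desired_independent)
```

Each unit keeps its marginal probability (one fourth in stratum I, one fifth in stratum II) because the bands are unchanged in width; only the pairing across strata is altered by the reordering.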
Since sampling is not independent in the two strata, the formulas for $V(\bar{y}_{st})$ and
$v(\bar{y}_{st})$ do not apply. Hess, Riedel, and Fitzpatrick (1976) give approximate
formulas. This monograph also gives an algorithm for the application of controlled
selection in problems with more strata, larger n, and more complex controls.
For another approach using balanced incomplete block designs, see Avadhani
and Sukhatme (1973).
(5A.21)
$$W_h = \int_{y_{h-1}}^{y_h} f(t)\,dt \qquad (5A.22)$$
Further,
(5A.23)
$$Z(y) = \int_{y_0}^{y} \sqrt{f(t)}\,dt \qquad (5A.28)$$
If the strata are numerous and narrow, f(y) should be approximately constant
(rectangular) within a given stratum. Hence,
$$W_h = \int_{y_{h-1}}^{y_h} f(t)\,dt \doteq f_h(y_h - y_{h-1}) \qquad (5A.29)$$
$$S_h \doteq \frac{1}{\sqrt{12}}(y_h - y_{h-1}) \qquad (5A.30)$$
$$Z_h - Z_{h-1} = \int_{y_{h-1}}^{y_h} \sqrt{f(t)}\,dt \doteq \sqrt{f_h}\,(y_h - y_{h-1}) \qquad (5A.31)$$
Since $(Z_L - Z_0)$ is fixed, it is easy to verify that the sum on the right is minimized by
making $(Z_h - Z_{h-1})$ constant.
Given f(y), the rule is to form the cumulative of $\sqrt{f(y)}$ and choose the $y_h$ so that
they create equal intervals on the cum $\sqrt{f(y)}$ scale. Table 5A.11 illustrates the use
of the rule.
TABLE 5A.11
CALCULATION OF STRATUM BOUNDARIES BY THE CUM $\sqrt{f}$ RULE
Industrial Loans                           Industrial Loans
as % of Total Loans   f(y)   Cum √f(y)     as % of Total Loans   f(y)   Cum √f(y)
      0-5             3464     58.9              50-55           126     340.3
      5-10            2516    109.1              55-60           107     350.6
     10-15            2157    155.5              60-65            82     359.7
     15-20            1581    195.3              65-70            50     366.8
     20-25            1142    229.1              70-75            39     373.0
     25-30             746    256.4              75-80            25     378.0
     30-35             512    279.0              80-85            16     382.0
     35-40             376    298.4              85-90            19     386.4
     40-45             265    314.7              90-95             2     387.8
     45-50             207    329.1              95-100            3     389.5
Example. The data show the frequency distribution of the percentage of bank loans
devoted to industrial loans in a population of 13,435 banks of the United States
(McEvoy, 1956). The distribution is skewed, with its mode at the lower end. In the cum $\sqrt{f}$
column, $58.9 = \sqrt{3464}$, $109.1 = \sqrt{3464} + \sqrt{2516}$, and so on.
Suppose that we want five strata. Since the total of cum $\sqrt{f}$ is 389.5, the division points
should be at 77.9, 155.8, 233.7, and 311.6 on this scale. The nearest available points are as
follows:
Stratum                  1       2        3        4         5
Boundaries              0-5%   5-15%   15-25%   25-45%   45-100%
Interval on cum √f      58.9    96.6     73.6     85.6      74.8
The first two intervals, 58.9 and 96.6, are rather unequal, but cannot be improved on
without a finer subdivision of the original classes.
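The rule can be applied mechanically to the frequencies of Table 5A.11. The sketch below assumes classes of equal length, as in this table, and simply picks the class boundary whose cumulative $\sqrt{f}$ is nearest each target; the function name is hypothetical.

```python
import math

# Frequencies f(y) from Table 5A.11 for the classes 0-5, 5-10, ..., 95-100 percent.
freq = [3464, 2516, 2157, 1581, 1142, 746, 512, 376, 265, 207,
        126, 107, 82, 50, 39, 25, 16, 19, 2, 3]
upper = list(range(5, 105, 5))   # upper class limits: 5, 10, ..., 100


def cum_sqrt_f_boundaries(freq, upper, L):
    """Return cumulative sqrt(f) and the L - 1 class limits nearest equal steps on that scale.

    Assumes all classes have the same length.
    """
    cum, running = [], 0.0
    for f in freq:
        running += math.sqrt(f)
        cum.append(running)
    step = cum[-1] / L
    cuts = []
    for k in range(1, L):
        target = k * step
        # The last class limit is the end of the range, so it is not a candidate.
        idx = min(range(len(cum) - 1), key=lambda i: abs(cum[i] - target))
        cuts.append(upper[idx])
    return cum, cuts


cum, cuts = cum_sqrt_f_boundaries(freq, upper, L=5)
print(round(cum[-1], 1), cuts)
```

The cumulative total computed this way differs from the tabled 389.5 only through rounding of the individual square roots, and the selected boundaries are the same as in the example.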
If the class intervals in the original distribution of y are of unequal length, a
slight change is needed. When the interval changes from one of length d to one of
length ud, the value of $\sqrt{f}$ for the second interval is multiplied by $\sqrt{u}$ when forming
cum $\sqrt{f}$.
Another method, proposed by Sethi (1963), is to work out the boundaries given
by the calculus equations (5A.27) for a standard continuous distribution resembling
the study population. For the normal and various $\chi^2$ distributions, Sethi has
tabulated the optimum boundaries for Neyman, equal, and proportional allocation
for $L \le 6$. If one of these distributions seems to approximate that in the study
population, the boundaries can be read from Sethi's tables.
Two further approximate methods require some trial and error. From relations
(5A.32), the Dalenius-Hodges rule is roughly equivalent to making $W_h S_h$ constant,
as conjectured earlier by Dalenius and Gurney (1951). A similar rule is that
of Ekman (1959), who makes $W_h(y_h - y_{h-1})$ constant.
In comparisons on some theoretical and eight study populations, Cochran
(1961) found that the cum $\sqrt{f}$ rule and the Ekman rule worked consistently well
(the Sethi method was not tried). In a study of United States hospital bed capacity,
whose distribution resembles $\chi^2$ with 1 degree of freedom, Hess, Sethi, and
Balakrishnan (1966) found the Ekman method slightly superior to cum $\sqrt{f}$ and
Sethi's for L > 2, while Murthy (1967) also reports good performance by Ekman's
method.
The relations (5A.32) have an interesting consequence. If $W_h S_h$ is constant,
Neyman allocation gives a constant sample size $n_h = n/L$ in all strata. For the
approximate methods, the comparisons that have been made suggest that the
simple rule $n_h = n/L$ is satisfactory.
Thus far we have made the unrealistic assumption that stratification can be
based on the values of y itself. In practice, some other variable x is used (perhaps
the value of y at a recent census). Dalenius (1957) develops equations for the
boundaries of x that minimize $\sum W_h S_{yh}$, given a knowledge of the regression of y
on x. If this regression is nonlinear, these boundaries may differ considerably from
those that are optimum when x itself is the variable to be measured. The equations
indicate, however, that if the regression of y on x is linear and the correlation
between y and x is high within all strata the two sets of boundaries should be
nearly the same. Let
$$y = \alpha + \beta x + e$$
where E(e) = 0 for all x and e, x are uncorrelated. The variance of e within
stratum h is $S_{eh}^2$. Then the x-boundaries that make $V(\bar{y}_{st})$ a minimum satisfy the
equations (Dalenius, 1957).
If $S_{eh}^2/\beta^2 S_{xh}^2$ is small for all h, these equations reduce to the form (5A.27) that
gives optimum boundaries for x. But $S_{eh}^2/\beta^2 S_{xh}^2 = (1 - \rho_h^2)/\rho_h^2$, where $\rho_h$ is the
correlation between y and x within stratum h.
Although more investigation is needed, this result suggests that the cum $\sqrt{f}$ rule
applied to x should give an efficient stratification for another variable y that has a
linear regression on x with high correlation. Some numerical results by Cochran
(1961) support this conjecture. Moreover, if the $\rho_h$ are only moderate, as will
happen when the number of strata is increased, failure to use the optimum
x-boundaries should have a less deleterious effect on y.
The preceding discussion is, of course, mainly relevant to the sampling of
institutions stratified by some measure of size. The situation is different when one
set of variables is closely related to $y_1$ and another set, with a markedly different
frequency distribution, is closely related to $y_2$. One possibility is to seek compromise
stratum boundaries that meet the desired tolerances on $V(\bar{y}_{1st})$ and $V(\bar{y}_{2st})$,
following a general approach given in section 5A.4, but computational methods
have not been worked out.
In geographical stratification the problem is less amenable to a mathematical
approach, since there are so many different ways in which stratum boundaries may
be formed. The usual procedure is to select a few variables that have high
correlations with the principal items in the survey and to use a combination of
judgment and trial and error to construct boundaries that are good for these
selected variables. Since the gains in precision from stratification are likely to be
modest, it is not worthwhile to expend a great deal of effort in improving
boundaries. Bases of stratification for economic items have been discussed by
Stephan (1941) and Hagood and Bernert (1945) and for farm items by King and
McCarty (1941).
$$V(\bar{y}_{st}) = \frac{L}{n}\sum_{h=1}^{L} W_h^2 S_{yh}^2 = \frac{L\beta^2}{n}\sum_{h=1}^{L} W_h^2 S_{xh}^2 + \frac{L S_e^2}{n}\sum_{h=1}^{L} W_h^2 \qquad (5A.38)$$
(5A.39)
TABLE 5A.12
$V(\bar{y}_{st})/V(\bar{y})$ AS A FUNCTION OF L FOR THE LINEAR REGRESSION MODEL AND FOR
SOME ACTUAL DATA
         Linear Regression Model, ρ =            Data, Set
L      0.99     0.95     0.90     0.85        1      2      3
Set    Type of Data    x    y    Source
college enrollment data (set 1). In two comparisons on survey data, Hess, Sethi,
and Balakrishnan (1966) found that $V(\bar{y}_{st})$ decreased faster with L than (5A.39)
predicts, which suggests that model (5A.37) is oversimplified.
To complete this analysis, we require a cost function that shows how the cost
depends on L. Dalenius (1957) suggests the relation $C = Lc_s + nc_n$. The cost ratio
$c_s/c_n$ will vary with the type of survey. An increase in the number of strata
involves extra work in planning and drawing the sample and increases the number
of weights used in computing the estimates, unless they are self-weighting. In
some surveys almost no change is required in the organization of the field work; in
others a separate field unit is set up in each stratum. Whatever the form of the cost
function, the results in Table 5A.12 suggest that if an increase in L beyond 6
necessitates any substantial decrease in n in order to keep the cost constant the
increase will seldom be profitable.
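The diminishing return from additional strata can be sketched under one simple set of assumptions. Suppose x is rectangular, the L strata have equal width, and $n_h = n/L$, so that $W_h = 1/L$ and $S_{xh}^2 = S_x^2/L^2$. Substituting in (5A.38) then gives a ratio $V(\bar{y}_{st})/V(\bar{y}) = \rho^2/L^2 + (1 - \rho^2)$. These assumptions are made for this illustration and are not taken from the surviving text; the function name is hypothetical.

```python
def variance_ratio(rho, L):
    """V(ybar_st) / V(ybar) under the assumed rectangular-x linear regression model."""
    return rho ** 2 / L ** 2 + (1.0 - rho ** 2)


for rho in (0.99, 0.95, 0.90, 0.85):
    ratios = [round(variance_ratio(rho, L), 3) for L in range(2, 7)]
    floor = round(1.0 - rho ** 2, 3)        # limit as L grows without bound
    print(rho, ratios, floor)
```

Under these assumptions the ratio can never fall below $1 - \rho^2$, and by L = 6 the remaining gap is $\rho^2/36$, which is why a cut in n to pay for further strata is rarely repaid unless the correlation is very high.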
The discussion in this section is confined to surveys in which only over-all
estimates are to be made. If estimates are wanted also for geographic subdivisions
of the population, the argument for a larger number of strata is stronger.
(5A.40)
(5A.41)
Hence
(5A.42)
The first term is the value of $V(\bar{y}_{st})$ for proportional stratification. The second
represents the increase in variance that arises because the $m_h$ do not distribute
themselves proportionally. But
$$\frac{1}{n^2}\sum (1 - W_h)S_h^2 = \frac{L}{n^2}\bar{S}_h^2 - \frac{1}{n^2}\sum W_h S_h^2 = \frac{1}{n\bar{n}_h}\bar{S}_h^2 - \frac{1}{n^2}\sum W_h S_h^2 \qquad (5A.43)$$
where $\bar{S}_h^2$ is the average of the $S_h^2$ and $\bar{n}_h = n/L$ is the average number of units per
stratum. Thus, if the $S_h^2$ do not differ greatly, the increase is about $(L - 1)/L\bar{n}_h$
times the variance for proportional stratification, ignoring the fpc. The increase
will be small if $\bar{n}_h$ is reasonably large.
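The size of the increase can be checked on a small hypothetical case. The sketch below takes the proportional-stratification variance as $\sum W_h S_h^2/n$ (fpc ignored) and the extra term as $\sum (1 - W_h)S_h^2/n^2$, the quantity on the left of (5A.43); the weights, variances, and n are illustrative only.

```python
def poststratification_terms(W, S2, n):
    """Return (proportional-stratification variance, extra term), ignoring the fpc."""
    proportional = sum(w * s for w, s in zip(W, S2)) / n
    extra = sum((1.0 - w) * s for w, s in zip(W, S2)) / n ** 2
    return proportional, extra


# Hypothetical population: three strata with equal within-stratum variances.
W = [0.5, 0.3, 0.2]
S2 = [4.0, 4.0, 4.0]
n = 120
L = len(W)

proportional, extra = poststratification_terms(W, S2, n)
relative_increase = extra / proportional
print(round(relative_increase, 5), round((L - 1) / n, 5))
```

With equal $S_h^2$ the relative increase is exactly $(L - 1)/n$, which is the same quantity as $(L - 1)/L\bar{n}_h$ since $\bar{n}_h = n/L$.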
This method can also be applied to a sample that is already stratified by another
factor, for example, into five geographic regions, provided that the $W_h$ are known
separately within each region. This twofold stratification is widely employed in
U.S. national surveys; see Bean (1970) for a description of the estimation
formulas in the Health Interview Survey of the National Center for Health
Statistics.
with the agency but, in general, quota sampling may be described as stratified
sampling with a more or less nonrandom selection of units within strata. For this
reason, sampling-error formulas cannot be applied with confidence to the results
of quota samples. A number of comparisons between the results of quota and
probability samples are summarized by Stephan and McCarthy (1958), who give
an excellent critique of the performance of both types of survey. The quota
method seems likely to produce samples that are biased on characteristics such as
income, education, and occupation, although it often agrees well with the proba-
bility samples on questions of opinion and attitude.
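(5A.45)
Now
(5A.46)
Also, since $v(\bar{y}_{st})$ and $\bar{y}_{st}$ are unbiased estimators of $V(\bar{y}_{st})$ and $\bar{Y}$, respectively,
$$E\,v(\bar{y}_{st}) = V(\bar{y}_{st}) = E(\bar{y}_{st}^2) - \bar{Y}^2 \qquad (5A.47)$$
$$v_{ran} = \frac{(N - n)}{n(N - 1)}\left[\frac{(n - 1)}{n}s^2 + v(\bar{y}_{st})\right] \qquad (5A.49)$$
$$v_{ran} \doteq \frac{(N - n)}{nN}s^2 \qquad (5A.50)$$
(5A.51)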
Example. The calculations are illustrated from the first three strata in the sample of
teachers' colleges (section 5.9). The data in Table 5A.13 are for the later 1946 sample. The
means represent enrollment per college in thousands. The $s_h^2$ values are slightly higher than
in the second edition, owing to a correction.
TABLE 5A.13
BASIC DATA FROM A STRATIFIED SAMPLE OF TEACHERS' COLLEGES
57 26 160.945
With n small we use formula (5A.44). We find $\bar{y}_{st} = 1.4715$. The values of the $\sum y_{hj}^2$ for the
sample were not reported, but the figures in the right-hand column can be obtained from
preceding columns of Table 5A.13. The formulas work out as follows.
$$v(\bar{y}_{st}) = \frac{1}{N^2}\sum \frac{N_h(N_h - n_h)}{n_h}s_h^2 = 0.00497$$
$$v_{ran} = \frac{31}{(26)(56)}\left[\frac{160.945}{57} - (1.4715)^2 + 0.00497\right] = 0.01412$$
Stratification appears to have reduced the variance to about one third of the value for a
simple random sample, the estimated deff factor (section 4.11) being 0.00497/0.01412 =
0.35.
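The arithmetic of this example is easy to retrace. The sketch below uses only the figures quoted in the text (N = 57, n = 26, the column total 160.945, $\bar{y}_{st}$ = 1.4715, and $v(\bar{y}_{st})$ = 0.00497); the variable names are illustrative.

```python
N, n = 57, 26
ybar_st = 1.4715
v_st = 0.00497
column_total = 160.945    # total of the right-hand column of Table 5A.13, as quoted

# Estimated variance of the mean of a simple random sample of the same size,
# computed from the stratified sample as in the worked formula above.
v_ran = (N - n) / (n * (N - 1)) * (column_total / N - ybar_st ** 2 + v_st)
deff = v_st / v_ran
print(round(v_ran, 5), round(deff, 2))
```

The factor (N - n)/(n(N - 1)) is the 31/((26)(56)) of the worked formula, so the result can be compared term by term with the text.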
(5A.54)
The first term on the right is the correct variance (by theorem 5.4 with $n_h = 1$).
The second term represents a positive bias, whose size depends on the success
attained in selecting pairs of strata whose true totals differ little. The form of the
estimate (5A.54) warns that construction of pairs by making the sample estimated
totals differ as little as possible can give a serious underestimate. The technique is
called the method of "collapsed strata."
With L odd, at least one group must, of course, be of size different from 2. The
extension of the estimate (5A.54) to G groups of any chosen sizes $L_j \ge 2$ is
$$v_1(\hat{Y}_{st}) = \sum_{j=1}^{G}\frac{L_j}{L_j - 1}\sum_{k=1}^{L_j}\left(\hat{Y}_{jk} - \hat{Y}_j/L_j\right)^2 \qquad (5A.56)$$
where $\hat{Y}_j$ is the estimated total for group j. For $L_j = 2$, when $\hat{Y}_j = \hat{Y}_{j1} + \hat{Y}_{j2}$, this
form agrees with (5A.54). As with (5A.54), the expectation of this $v_1(\hat{Y}_{st})$ gives
the correct variance $V(\hat{Y}_{st})$, plus a positive bias found by substituting $Y_{jk}$ and $Y_j$
for $\hat{Y}_{jk}$ and $\hat{Y}_j$ in (5A.56).
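The grouped estimator (5A.56) is straightforward to compute once the strata have been assigned to groups. The following is a minimal sketch with hypothetical estimated stratum totals; the function name is illustrative, and the grouping itself is taken as given.

```python
def collapsed_strata_variance(groups):
    """v_1 of (5A.56): groups is a list of lists of estimated stratum totals, each of size >= 2."""
    total = 0.0
    for estimates in groups:
        L_j = len(estimates)
        if L_j < 2:
            raise ValueError("each group needs at least two strata")
        group_mean = sum(estimates) / L_j            # Yhat_j / L_j
        total += L_j / (L_j - 1) * sum((y - group_mean) ** 2 for y in estimates)
    return total


# Hypothetical estimated totals: two pairs, then a pair plus a group of three.
pairs = [[10.0, 14.0], [7.0, 4.0]]
mixed = [[10.0, 14.0], [1.0, 2.0, 3.0]]

v_pairs = collapsed_strata_variance(pairs)
v_mixed = collapsed_strata_variance(mixed)
print(v_pairs, v_mixed)
```

For a group of two, the contribution reduces algebraically to the squared difference of the two estimated totals, so the pairs above contribute 16 and 9. As the text warns, the groups should be formed from prior knowledge of the strata, not by matching the sample estimates.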
When an auxiliary variate $A_h$ is known for each stratum that predicts the
stratum total $Y_h$, Hansen, Hurwitz, and Madow (1953) suggested the alternative
variance estimator
$$v_2(\hat{Y}_{st}) = \sum_{j=1}^{G}\frac{L_j}{L_j - 1}\sum_{k=1}^{L_j}\left(\hat{Y}_{jk} - A_{jk}\hat{Y}_j/A_j\right)^2 \qquad (5A.57)$$
If $A_h$ is a good predictor, the positive bias term in $v_2$, coming from the
deviations $(Y_{jk} - A_{jk}Y_j/A_j)^2$, is likely to be smaller than the corresponding term in
$v_1$, although unlike $v_1$, $v_2$ also gives a biased estimate of the term in the $S_h^2$ in
$V(\hat{Y}_{st})$. Hartley, Rao, and G. Kiefer (1969) found $v_2$ less biased than $v_1$ in two of
three populations, with little difference in the third.
These authors developed a method that does not involve the collapsing of
strata. This method uses one or more auxiliary variates $x_{1h}$, $x_{2h}$, and so forth, on
which the true stratum means $\bar{Y}_h$ are thought to have a linear regression. If $y_h$ is
the sample value in stratum h, the method uses the deviations
(5A.59)
The method appears promising and extends to ratio estimates, but the authors
warn that additional comparisons with "collapsed strata" methods are needed.
Using a different approach, Fuller (1970) developed a method of stratum
construction that provides an unbiased sample estimate of $V(\bar{y}_{st})$ with one unit
per stratum. For simplicity suppose that N/n = N/L = k (an integer). Select a
random number r between 1 and k. The first stratum consists of the units
numbered from (r + 1) up to (r + k), the second those numbered from (r + k + 1) up
to (r + 2k), and so on, the last (Lth and nth stratum) those numbered from
r + (n - 1)k + 1 to N = nk and those from 1 to r. At first sight, this last stratum may
look a poor choice. As Fuller notes, however, this method can work well in
geographic stratification with areal units. Here, stratification usually leans on the
notion that units near one another tend to be similar. By numbering units in
serpentine fashion, one can have $y_N$ near $y_1$, so that the stratum that includes both
$y_N$ and $y_1$ can also be internally homogeneous. The estimate $v(\bar{y}_{st})$ is a weighted
sum of the differences $(y_h - y_{h+1})^2$.
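The circular construction of strata can be sketched directly from the description. The function below only forms the strata for a given random start r; it does not implement Fuller's variance estimator, and its name and the small example are illustrative.

```python
import random


def circular_strata(N, k, r):
    """Strata of k consecutive unit numbers starting at r + 1, wrapping from N back to 1."""
    if N % k != 0:
        raise ValueError("N must be a multiple of k")
    if not 1 <= r <= k:
        raise ValueError("r must lie between 1 and k")
    order = [(r + i) % N + 1 for i in range(N)]     # r+1, r+2, ..., N, 1, ..., r
    return [order[i:i + k] for i in range(0, N, k)]


strata = circular_strata(N=12, k=3, r=2)
print(strata)

# In practice r would be drawn at random between 1 and k.
r = random.Random(0).randint(1, 3)
print(r, circular_strata(12, 3, r)[-1])
```

With N = 12, k = 3, and r = 2, the last stratum contains unit 12 together with units 1 and 2, which is the wrap-around stratum discussed in the text.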
The circular method would be less effective for a population showing a rising
trend from $y_1$ to $y_N$, in which the stratum including both $y_1$ and $y_N$ would have
large internal variability. For this situation Fuller gives a second plan, slightly
more complex, which should give good precision with a rising trend and also
furnishes an unbiased estimate of $V(\bar{y}_{st})$.
(5A.62)
With L strata, L > 2, the optimum allocation depends on the amounts of precision
desired for different comparisons. For instance, the cost might be minimized
subject to the set of L(L - 1)/2 conditions that $V(\bar{y}_h - \bar{y}_i) \le V_{hi}$, where the values
of $V_{hi}$ are chosen according to the precision considered necessary for a satisfactory
comparison of strata h and i.
Frequently a simpler method of allocation is adequate, especially if the $S_h$ and
$c_h$ do not differ greatly. One approach is to minimize the average variance of the
difference between all L(L - 1)/2 pairs of strata, that is, to minimize
$$\bar{V} = \frac{2}{L}\left(\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2} + \cdots + \frac{S_L^2}{n_L}\right) \qquad (5A.63)$$
$\bar{V}$ is minimized, for fixed C, by the rule in (5A.62),
$$n_h \propto \frac{S_h}{\sqrt{c_h}} \qquad (5A.64)$$
This rule may result in certain pairs of strata being more precisely compared and
others less precisely than is felt appropriate. An alternative is to select the $n_h$ so
that the s.e. of the difference is the same, say $\sqrt{V}$, for every pair of strata. This
amounts to making $S_h^2/n_h = V/2$ for every stratum. For a fixed cost this method
gives less over-all precision than the first method. The reader may verify that the
two optimum allocations give
$$\bar{V} = \frac{2(\sum S_h\sqrt{c_h})^2}{L(C - c_0)}, \qquad V = \frac{2(\sum S_h^2 c_h)}{(C - c_0)} \qquad (5A.65)$$
It follows from the Cauchy-Schwarz inequality that V is always greater than $\bar{V}$
unless $S_h\sqrt{c_h}$ = constant. If V is substantially greater than $\bar{V}$, a compromise
allocation can sometimes be found, after a little trial and error, that will give an
average variance close to $\bar{V}$ and also keep $V(\bar{y}_h - \bar{y}_i)$ reasonably constant.
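The two allocations can be compared on a small hypothetical case. The sketch below assumes a linear cost $C - c_0 = \sum c_h n_h$ and ignores the fpc; the values of $S_h$, $c_h$, and the budget are illustrative, and the function names are hypothetical.

```python
import math


def allocation_min_average(S, c, budget):
    """n_h proportional to S_h / sqrt(c_h), scaled so that sum of c_h * n_h equals the budget."""
    K = budget / sum(s * math.sqrt(ch) for s, ch in zip(S, c))
    return [K * s / math.sqrt(ch) for s, ch in zip(S, c)]


def allocation_equal_se(S, c, budget):
    """n_h proportional to S_h^2, so that S_h^2 / n_h is the same in every stratum."""
    K = budget / sum(ch * s ** 2 for s, ch in zip(S, c))
    return [K * s ** 2 for s in S]


def average_pair_variance(S, n_h):
    """(2 / L) * sum of S_h^2 / n_h, the average variance of a difference of two stratum means."""
    L = len(S)
    return 2.0 / L * sum(s ** 2 / nh for s, nh in zip(S, n_h))


S = [2.0, 4.0, 6.0]       # hypothetical stratum standard deviations
c = [1.0, 1.0, 4.0]       # hypothetical costs per unit
budget = 1000.0           # C - c_0

n_avg = allocation_min_average(S, c, budget)
n_eq = allocation_equal_se(S, c, budget)
V_bar = average_pair_variance(S, n_avg)
V_eq = average_pair_variance(S, n_eq)
print([round(x, 1) for x in n_avg], round(V_bar, 4))
print([round(x, 1) for x in n_eq], round(V_eq, 4))
```

In this case the equal-precision rule gives every pair the same variance but raises the average from 0.216 to 0.328, which agrees with the closed forms in (5A.65).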
Sometimes the objective is to obtain estimates for each stratum as well as
over-all estimates for the whole population. In planning the survey, we might
specify the following conditions.
$$V(\bar y_h) = \frac{S_h^2}{n_h}(1 - f_h) \le V_h, \qquad V(\bar y_{st}) = \sum_h \frac{W_h^2 S_h^2}{n_h}(1 - f_h) \le V$$
The fpc terms are now included, since the purpose is to specify the precision with
which the means in the finite population are to be estimated. The conditions on the
$V(\bar y_h)$ determine lower limits to the values of the $n_h$. If these lower limits are found
to satisfy the condition on $V(\bar y_{st})$, the allocation problem is solved. When the
condition on $V(\bar y_{st})$ is not satisfied, Dalenius (1957) has indicated a graphical
approach.
More complex problems arise when the $L = 2^k$ strata represent all combinations
of $k$ factors each at two levels, and the objective is to estimate the average
effects of the factors. If the stratum or cell to which any member of the population
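Sample mean:
$$\bar y_{hj} = \sum_{i=1}^{n_{hj}} \frac{y_{hij}}{n_{hj}}$$
Domain mean:
$$\bar Y_{hj} = \sum_{i=1}^{N_{hj}} \frac{y_{hij}}{N_{hj}}$$
The population total and mean for domain $j$ over all strata are, respectively,
$$Y_j = \sum_h N_{hj}\bar Y_{hj}, \qquad \bar Y_j = \frac{Y_j}{N_j}$$
where $N_j = \sum_h N_{hj}$.
The complication arises because the $n_{hj}$ are random variables. If the $N_{hj}$ were
known, the problem would be simple. As estimates of $Y_j$ and $\bar Y_j$ we could use
$$\hat Y_j = \sum_h N_{hj}\bar y_{hj}, \qquad \hat{\bar Y}_j = \frac{\hat Y_j}{N_j}$$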
By the method in section 2.12, the ordinary formula for $V(\bar y_{hj})$ is still valid,
provided all $n_{hj} > 0$. Thus
(5 A.67)
The true and estimated variance of $\hat Y_j$ are found by the device used in section
2.13. A variate $y'_{hij}$ is introduced that equals $y_{hij}$ for all units in domain $j$ and equals
zero for all other units in the population. As shown in section 2.13 this gives for the
estimated variance
(5A.68)
(5A.69)
Hence we take
(5A.70)
(5A.?})
This is the formula for the combined ratio estimate for the two variables $y'_{hij}$ and
$x'_{hij}$. From section 6.11, the estimated variance may be expressed approximately
as
(5A.73)
$$\sum_{i}^{n_{hj}} (y_{hij} - \bar y_j)^2 - \frac{n_{hj}^2}{n_h}(\bar y_{hj} - \bar y_j)^2 \tag{5A.74}$$
using (5A.71). Furthermore, the first term in (5A.74) can be expressed alternatively
as
Inserting these results in (5A.73) gives, finally, for the estimated variance,
$$v(\bar y_j) = \frac{1}{\hat N_j^2}\sum_h \frac{N_h^2(1 - f_h)}{n_h(n_h - 1)}\left[\sum_i (y_{hij} - \bar y_{hj})^2 + n_{hj}\left(1 - \frac{n_{hj}}{n_h}\right)(\bar y_{hj} - \bar y_j)^2\right] \tag{5A.75}$$
The last term on the right represents a between-stratum contribution to the
variance. Differences among strata means are not entirely eliminated from the
variance of the estimated mean of any subpopulation. The between-stratum
contribution is small if the terms $1 - n_{hj}/n_h$ are small, that is, if the subpopulation
is almost as large as the complete population.
As Durbin (1958) has pointed out, (5A.75) applies also to means estimated for
the whole population, if the sample is incomplete for any reason such as nonresponse,
provided, of course, that $\bar y_j$ is the estimate used. In this event $\bar y_j$ is
interpreted as the estimated mean for the part of the population that would give a
response under the methods of data collection employed. There is, however, an
additional complication, in that the "nonresponse" part of the population often has
a different mean from the "response" part. Thus $\bar y_j$ is a biased estimate of the
mean of the whole population, and this bias contribution is not included in
(5A.75).
where $\bar y_a$, $\bar y_{ab}$, $\bar y_B$ denote the respective sample means. The weighting factors $p$ and
$q$ for the two samples that belong to frame B, with $p + q = 1$, are chosen to
minimize $V(\hat Y)$ under a cost function of the form
(5A.77)
In (5A.76) the stratum sizes $N_{ab} = N_B$ and $N_a = (N_A - N_B)$ will, of course, be
known.
With $S_B^2 > S_a^2$ and $c_B < c_A$, Hartley showed that this method can give large
reductions in $V(\hat Y)$ as compared with sampling from frame A only, even if the
frame A sample is poststratified into the two strata $a$ and $B = ab$.
The problem becomes more difficult if frame A is also incomplete, two frames
A and B, with some duplication, being required to obtain complete coverage of
the population. For poststratification, there are three distinct strata: $a$ (units in A
alone); $ab$ (units in both A and B); and $b$ (units in B alone). The three strata
cannot be sampled directly, samples of sizes $n_A$, $n_B$ having to be drawn from
frames A and B. Furthermore, the strata sizes $N_a$, $N_{ab}$, $N_b$ will not usually be
known. For simple random sampling from frames A and B, Hartley (1962)
where, as before, $p + q = 1$ and the $y'$ are sample totals in the strata, the suffixes
$ab$ and $ba$ denoting the samples in the duplicate stratum found in $n_A$ and $n_B$.
Hartley determined $p$ and $q$ to minimize $V(\hat Y)$ for fixed cost. Improvements in
Hartley's estimate (5A.78) have been given by Lund (1968) and Fuller and
Burmeister (1972), essentially by using better estimates of $N_a$, $N_{ab}$, $N_b$ than are
implied in Hartley's estimate. Fuller and Burmeister (1972) also dealt with the
case in which frame A is areal, with subsampling of the areal units. Hartley (1974)
gives a general approach to two-frame sampling, applicable to any sample design
in the two frames.
EXERCISES
5A.1 In planning a survey of sales in a certain type of store, with $n = 550$, good
estimates of $S_h$ are available from a previous survey in two of the three strata. The third
stratum consists of new stores and stores that had no sales in the previous survey, so that a
value for $S_3$ has to be guessed. If $S_3$ is actually 10, compute $V(\bar y_{st})$ as given by an estimated
Neyman allocation when $S_3$ is guessed as (a) 5, (b) 20. Show that in both cases the
proportional increase in variance over the true optimum is slightly over 2%.

                               Estimated $S_h$
Stratum   $W_h$   True $S_h$     (a)     (b)
   1       0.3       30          30      30
   2       0.6       20          20      20
   3       0.1       10           5      20
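The arithmetic behind exercise 5A.1 can be sketched as follows. This is a hedged illustration rather than a worked solution from the text: the helper name `variance_with_guess` is hypothetical, the fpc is ignored, and the $n_h$ are not rounded to integers, so the figures will differ slightly from a hand calculation with rounded sample sizes.

```python
def variance_with_guess(W, S_true, S_guess, n):
    """V(ybar_st), fpc ignored, when n_h is taken proportional to W_h * S_guess_h."""
    weights = [w * s for w, s in zip(W, S_guess)]
    total = sum(weights)
    n_h = [n * w / total for w in weights]  # unrounded Neyman-type allocation
    return sum((w * s) ** 2 / nh for w, s, nh in zip(W, S_true, n_h))

W = [0.3, 0.6, 0.1]
S_true = [30.0, 20.0, 10.0]
n = 550

# Variance under the true Neyman allocation (guesses equal to the true S_h).
v_opt = variance_with_guess(W, S_true, S_true, n)
for label, s3 in (("a", 5.0), ("b", 20.0)):
    v = variance_with_guess(W, S_true, [30.0, 20.0, s3], n)
    print(label, "percent increase over optimum:", round(100 * (v / v_opt - 1), 2))
```

Both guesses give the same proportional increase, which is the symmetry that exercise 5A.2 asks the reader to explain.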
5A.2 Show that if all $S_h$, except $S_L$, are correctly estimated and $S_L$ is estimated as
$\hat S_L = S_L(1 + \lambda)$, the proportional increase in $V_{min}(\bar y_{st})$, using $\hat S_L$ instead of the true $S_L$ for
Neyman allocation, is
$$\frac{\lambda^2 n_L'(n - n_L')}{(1 + \lambda)n^2}$$
where $n_L'$ is the sample size in stratum $L$ under true Neyman allocation. Verify that this
formula agrees with the results in exercise 5A.1. (The agreement is not exact because of the
rounding of the $n_h$ to integers.) Hence show that a 50% underestimation of $S_L$ has the same
effect as 100% overestimation.
5A.3 If there are two strata and if $\phi$ is the ratio of the actual $n_1/n_2$ to the Neyman
optimum $n_1/n_2$, show that whatever the values of $N_1$, $N_2$, $S_1$, and $S_2$, the ratio
$V_{min}(\bar y_{st})/V(\bar y_{st})$ is never less than $4\phi/(1 + \phi)^2$ when the fpc's are negligible.
5A.4 The results of a simple random sample with $n = 1000$ can be classified into three
"strata," with $\bar y_h = 10.2$, 12.6, and 17.1, $s_h^2 = 10.82$ (the same in each stratum), and
$s^2 = 17.66$. The estimated stratum weights are $w_h = 0.5$, 0.3, 0.2, respectively. The
weights are known to be inexact, but it is thought that all are correct within 5%, so that the
worst cases are either $W_h = 0.525$, 0.285, and 0.190 or $W_h = 0.475$, 0.315, and 0.210. By
the methods of section 5A.2, would you recommend stratification? (Where needed, assume
that $\bar y_h = \bar Y_h$ and $s_h^2 = S_h^2$.)
5A.5 In a stratified random sample with two variates the objective is to satisfy the
specifications
$$V(\bar y_{1st}) \le V_1, \qquad V(\bar y_{2st}) \le V_2$$
for minimum cost $C = \sum c_h n_h$. The fpc's can be ignored. (a) Prove the result by Chatterjee
(1972) that a compromise allocation is necessary if
$$\frac{\sum W_h S_{2h}\sqrt{c_h}}{\sum W_h (S_{1h}^2/S_{2h})\sqrt{c_h}} < \frac{V_2}{V_1} < \frac{\sum W_h (S_{2h}^2/S_{1h})\sqrt{c_h}}{\sum W_h S_{1h}\sqrt{c_h}}$$
(b) If $V_2/V_1$ equals or exceeds the upper limit, the optimum allocation for $y_1$ satisfies
both tolerances, with a corresponding result about the lower limit.
5A.6 A survey with three strata is planned to estimate the percentage of families who
have accounts in savings banks and the average amount invested per family. Advance
estimates of the percentages $P_h$ and the within-stratum $S_h$ for the amount invested are as
follows.

Stratum   $W_h$   $P_h$ (%)   $S_h$ (dollars)
   1       0.6       20            90
   2       0.3       40           180
   3       0.1       70           520

Compute the smallest sample sizes $n$ and the $n_h$ that satisfy the following requirements: (a)
The percentage of families is to be estimated with s.e. = 2 and the average amount invested
with s.e. = \$5. (b) The percentage of families is to be estimated with s.e. = 1.5 and the
average amount invested with s.e. = \$5.
Part (b) requires a compromise allocation, either by a computer program or the method
in the second edition, p. 123. The allocation $n_h = 371$, 344, 315, with $n = 1030$ satisfies both
tolerances. Show that the Booth-Sedransk method (section 5A.4) gives $n_h/n = 0.431$,
0.326, 0.243. This allocation would require $n = 1073$ to meet both tolerances.
5A.7 The table at top of p. 148 shows the frequency distribution of a population of 911
city sizes (for cities from 10,000 to 60,000), arranged in classes of 2000. To shorten the
calculations, a coded $y'$ and values of $\sqrt f$, cum $\sqrt f$, cum $fy'$, and $\sum fy'^2$ are given. Apply the
Dalenius-Hodges rule to create two strata for optimum allocation in the sense of Neyman.
Find the values of $W_h$ and $S_h$ for each of your strata. Verify (a) that the optimum sample
sizes are almost the same in the two strata and (b) by finding $S^2$ for the whole population,
that
$$\frac{V(\bar y)}{V_{opt}(\bar y_{st})} \doteq 4.8$$
5A.8 The right triangular distribution $f(y) = 2(1 - y)$, $0 < y < 1$, is divided into two
strata at the point $a$. (a) Show that
$$W_1 = a(2 - a), \qquad W_2 = (1 - a)^2$$
$$S_1^2 = \frac{a^2(6 - 6a + a^2)}{18(2 - a)^2}, \qquad S_2^2 = \frac{(1 - a)^2}{18}$$
(b) Show that under the cum $\sqrt f$ rule the best choice of $a$ is $1 - 1/\sqrt[3]{4} = 0.37$ and that with
this boundary the optimum $n_1/n_2$ is about 1.08 and $V(\bar y_{st})$ is about 27% of the value given by
simple random sampling.
5A.9 In both exercises 5A.7 and 5A.8, show that the Ekman rule $W_h(y_h - y_{h-1})$ =
constant agrees very closely with the cum $\sqrt f$ rule in determining the stratum boundaries.
5A.10 A sum of \$5000 is available for a stratified sample. In the notation of section
5A.8 the cost function is thought to be, roughly, $C = 200L + 10n$ and
$$V(\bar y_{st}) = \frac{S^2}{n}\left[\frac{\rho^2}{L^2} + (1 - \rho^2)\right]$$
where $\rho$ is the correlation between the variate used to construct the strata and the variate to
be measured in the survey. Compute the optimum $L$ for $\rho = 0.95$, 0.9, and 0.8. What is a
good compromise number of strata to use for all three values of $\rho$?
5A.11 The following data are derived from a stratified sample of tire dealers taken in
March 1945 (Deming and Simmons, 1946). The dealers were assigned to strata according
to the number of new tires held at a previous census. The sample means $\bar y_h$ are the mean
numbers of new tires per dealer. (a) Estimate the gain in precision due to the stratification.
(b) Compare this result with the gain that would have been attained from proportional
allocation.

Stratum
boundaries    $N_h$     $W_h$    $\bar y_h$   $s_h^2$   $n_h$
   1-9       19,850    0.8032      4.1       34.8     3000
  10-19       3,250    0.1315     13.0       92.2      600
  20-29       1,007    0.0407     25.0      174.2      340
  30-39         606    0.0245     38.2      320.4      230
5A.12 A population has two strata of relative sizes $W_1 = 0.8$, $W_2 = 0.2$ and within-stratum
variances $S_1^2 = 100$, $S_2^2 = 400$. A stratified random sample is to be taken to satisfy
the following requirements: (i) the means of each stratum are to be estimated with variance
$\le 1$; (ii) $V(\bar y_{st}) \le 0.5$. Ignoring the fpc, find the values of $n_1$, $n_2$ that satisfy all three
requirements for minimum $n = n_1 + n_2$.
Hint. Note that $-\partial V(\bar y_{st})/\partial n_1 > -\partial V(\bar y_{st})/\partial n_2$ if $n_1 < 2n_2$. Fuller (1966) has discussed
various methods of handling this problem.
5A.13 In an example due to Nordbotten (1956) and worked by Kokan (1963), a survey
is planned to estimate total employment $Y_1$ and the value of production $Y_2$ in establishments
manufacturing furniture. When establishments are stratified by size, the $N_h$ and
rough estimates of the $S_{jh}^2$ are as follows.
Stratum
1,600
The requirement that estimates of $Y_1$ and $Y_2$ not be in error by more than 6% ($P = 0.95$)
amounts to tolerances
$$V_1 = V(\hat Y_{1st}) \le 0.0351; \qquad V_2 = V(\hat Y_{2st}) \le 56.25$$
Show that the optimum allocation for $Y_1$ with $n_1 = 450$, $n_2 = 167$, $n = 617$ satisfies both
tolerances. Note that in this problem the fpc cannot be ignored.
5A.14 In stratified random sampling with one unit per stratum, assume that the strata
can be grouped into pairs with $N_{j1} = N_{j2} = N_j$ ($j = 1, 2, \ldots, L/2$). An alternative sampling
method draws two units at random from each pair of strata. Show that for this method
$$V(\hat Y_{st2}) = \sum_{j=1}^{L/2}\left(\frac{N_j - 1}{2N_j - 1}\right)\left[2N_j(N_j - 1)\sum_{k=1}^{2} S_{jk}^2 + (Y_{j1} - Y_{j2})^2\right]$$
Hence show that the expected value of the "collapsed strata" estimate $v_1(\hat Y_{st})$ in formula
(5A.54), section 5A.12, overestimates $V(\hat Y_{st2})$, the variance that would apply if strata twice
as large were used.
CHAPTER 6

Ratio Estimators
$$\hat Y_R = \frac{y}{x}X = \frac{\bar y}{\bar x}X \tag{6.1}$$
Frequently we wish to estimate a ratio rather than a total or mean, for example,
the ratio of corn acres to wheat acres, the ratio of expenditures on labor to total
expenditures, or the ratio of liquid assets to total assets. The sample estimate is
$\hat R = y/x$. In this case $X$ need not be known. The use of ratio estimates for this
purpose has already been discussed in section 2.11 and (with cluster sampling for
proportions) 3.12.
Example. Table 6.1 shows the number of inhabitants (in 1000's) in each of a simple
random sample of 49 cities drawn from the population of 196 large cities discussed in
section 2.15. The problem is to estimate the total number of inhabitants in the 196 cities in
1930. The true 1920 total, $X$, is assumed to be known. Its value is 22,919.
The example is a suitable one for the ratio estimate. The majority of the cities in the
sample show an increase in size from 1920 to 1930 of the order of 20%. From the sample
data we have
$$y = \sum y_i = 6262, \qquad x = \sum x_i = 5054$$
Consequently the ratio estimate of the 1930 total for all 196 cities is
$$\hat Y_R = \frac{y}{x}X = \frac{6262}{5054}(22{,}919) = 28{,}397$$
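The arithmetic of this example is easy to reproduce. In the short sketch below the function name `ratio_estimate_total` is a hypothetical helper; the sample totals, $N$, $n$, and $X$ are the figures quoted in the example, and the expansion estimate $N\bar y$ is included only as a point of comparison.

```python
def ratio_estimate_total(sample_y_total, sample_x_total, X):
    """Return (R_hat, Y_hat_R): the sample ratio and the ratio estimate of the total."""
    r_hat = sample_y_total / sample_x_total
    return r_hat, r_hat * X

# Figures quoted in the example: 49 cities sampled from 196.
N, n = 196, 49
y_total, x_total, X = 6262, 5054, 22919

r_hat, y_hat_R = ratio_estimate_total(y_total, x_total, X)
y_hat_mean = N * y_total / n  # estimate based on the sample mean per city

print("R_hat =", round(r_hat, 4))
print("ratio estimate of 1930 total =", round(y_hat_R))
print("estimate based on sample mean =", round(y_hat_mean))
```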
TABLE 6.1
SIZES OF 49 LARGE UNITED STATES CITIES (in 1000's) IN 1920 ($x_i$) AND 1930 ($y_i$)
$x_i$   $y_i$   $x_i$   $y_i$   $x_i$   $y_i$
76 80 2
138 50 243 291
143 507 634
67 67 87 105
179 260 30
29 SO 121 111
113 71 79
381 464 SO 6-4 256 288
23 4S 44
37 58 43 61
bJ 77
120 89 25 57
lIS 64
61 63 94 85
69 64 77
387 459 43 50
56 142 298
93 104 40
317
172 60 36 46
183 40 64
78 106 161 232
38 52 74
66 86 136 93
60
139 45 S3
57 116 130
46 65 36 54
46 53 50 58
48 75
Fig. 6.1 Experimental comparison of the ratio estimate with the estimate based on the sample mean. (Frequency distributions of the 200 estimates by each method; × denotes the population total.)
Figure 6.1 shows the ratio estimate and the estimate based on the sample mean per city
for each of 200 simple random samples of size 49 drawn from this population. A substantial
improvement in precision from the ratio method is apparent.
(6.2)
(6.3)
$$V(\hat R) \doteq \frac{1 - f}{n\bar X^2}\,\frac{\sum_{i=1}^{N}(y_i - Rx_i)^2}{N - 1} \tag{6.4}$$
where f =n/N is the sampling fraction. The method used in theorem 2.5 shows
that (6.2), (6.3), and (6.4) are also approximations to the mean square errors of the
estimators in these formulas.
The argument leading to the approximate result (6.4) was given in theorem 2.5.
Since $\hat{\bar Y}_R = \bar X\hat R$, $\hat Y_R = N\bar X\hat R$, the other two results follow immediately.
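Corollary 1. There are various alternative forms of the result. Since $\bar Y = R\bar X$,
we may write
$$V(\hat Y_R) = \frac{N^2(1 - f)}{n(N - 1)}\sum_{i=1}^{N}\left[(y_i - \bar Y) - R(x_i - \bar X)\right]^2$$
$$= \frac{N^2(1 - f)}{n(N - 1)}\left[\sum(y_i - \bar Y)^2 + R^2\sum(x_i - \bar X)^2 - 2R\sum(y_i - \bar Y)(x_i - \bar X)\right]$$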
The correlation coefficient $\rho$ between $y_i$ and $x_i$ in the finite population is defined
by the equation
$$\rho = \frac{E(y_i - \bar Y)(x_i - \bar X)}{\sqrt{E(y_i - \bar Y)^2 E(x_i - \bar X)^2}} = \frac{\sum_{i=1}^{N}(y_i - \bar Y)(x_i - \bar X)}{(N - 1)S_yS_x}$$
2
y2(S/ s.. 2Sy",)
V(YR ) = (l-,n-::- ~+~- .........
A
(6.6)
n Y X YX
where S YJC = pS,sx is the covariance between Y. and X,. This relation may also be
written as
(6.7)
where Cyy,C.... are the squares of the coefficients of variation (cv) of Yi and XI
respectively, and CyX is the relative covariance.
Y
Corollary 2. Since $\hat Y_R$, $\hat{\bar Y}_R$, and $\hat R$ differ only by known multipliers, the
coefficient of variation (i.e., the standard error divided by the quantity being
estimated) is the same for all three estimates. From (6.7) the square of this cv is
$$(\text{cv})^2 = \frac{V(\hat Y_R)}{Y^2} = \frac{1 - f}{n}(C_{yy} + C_{xx} - 2C_{yx}) \tag{6.8}$$
The quantity $(\text{cv})^2$ has been called the relative variance by Hansen et al. (1953). Its
use avoids repetition of variance formulas for related quantities like the estimated
population total and mean.
$$\frac{\sum_{i=1}^{n}(y_i - \hat Rx_i)^2}{(n - 1)}$$
as a sample estimate of the population variance. This estimate has a bias of
order $1/n$.
For the estimated variance, $v(\hat Y_R)$, this gives
$$v(\hat Y_R) = \frac{N^2(1 - f)}{n(n - 1)}\sum_{i=1}^{n}(y_i - \hat Rx_i)^2 \tag{6.9}$$
$$= \frac{N^2(1 - f)}{n}(s_y^2 + \hat R^2s_x^2 - 2\hat Rs_{yx}) \tag{6.11}$$
where $s_{yx} = \sum(y_i - \bar y)(x_i - \bar x)/(n - 1)$ is the sample covariance between $y_i$ and $x_i$.
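The residual form (6.9) and the moment form (6.11) are algebraically identical, because $\bar y - \hat R\bar x = 0$. A minimal sketch that computes both and lets the reader confirm the agreement is given below. The function name `ratio_total_and_variance` and all the data values ($y$, $x$, $N$, $X$) are hypothetical and serve only as an illustration.

```python
def ratio_total_and_variance(y, x, N, X):
    """Return (Y_hat_R, v from (6.9), v from (6.11)) for a simple random sample."""
    n = len(y)
    f = n / N
    r_hat = sum(y) / sum(x)
    # (6.9): sum of squared deviations of y_i from R_hat * x_i.
    resid_ss = sum((yi - r_hat * xi) ** 2 for yi, xi in zip(y, x))
    v_resid = N ** 2 * (1 - f) / (n * (n - 1)) * resid_ss
    # (6.11): the same quantity from sample variances and covariance.
    ybar, xbar = sum(y) / n, sum(x) / n
    s_yy = sum((yi - ybar) ** 2 for yi in y) / (n - 1)
    s_xx = sum((xi - xbar) ** 2 for xi in x) / (n - 1)
    s_yx = sum((yi - ybar) * (xi - xbar) for yi, xi in zip(y, x)) / (n - 1)
    v_moment = N ** 2 * (1 - f) / n * (s_yy + r_hat ** 2 * s_xx - 2 * r_hat * s_yx)
    return r_hat * X, v_resid, v_moment

# Hypothetical sample of n = 6 units from a population of N = 60 with known X.
y = [80.0, 143.0, 67.0, 50.0, 464.0, 48.0]
x = [76.0, 138.0, 67.0, 29.0, 381.0, 23.0]
total, v_resid, v_moment = ratio_total_and_variance(y, x, N=60, X=7000.0)
print(round(total, 1), round(v_resid, 1), round(v_moment, 1))
```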
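There are two alternative formulas for the sample estimate of the variance.
Since $\hat Y_R = N\bar X\hat R$, one form for $\hat R$ is
$$v_1(\hat R) = \frac{(1 - f)}{n\bar X^2}(s_y^2 + \hat R^2s_x^2 - 2\hat Rs_{yx}) \tag{6.12}$$
Since, however, $\hat R = \bar y/\bar x$, the quantity $\bar X$ need not be known and is sometimes
not known when estimating $R$. This suggests the alternative form
$$v_2(\hat R) = \frac{(1 - f)}{n\bar x^2}(s_y^2 + \hat R^2s_x^2 - 2\hat Rs_{yx}) \tag{6.13}$$
This form could also be used for $v(\hat Y_R)$, taking $v_2(\hat Y_R) = X^2v_2(\hat R)$.
This raises the question: If $\bar X$ is known, is $v_1$ preferable to $v_2$? The answer is not
at present clear. P. S. R. S. Rao and J. N. K. Rao (1971) studied the biases in $v_1$ and $v_2$,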
where $0 \le t \le 2$ and $x_i$ has a gamma distribution $ax^{h-1}e^{-x}$. The range $0 \le t \le 2$ was
studied because in applications the residual variance of $y_i$ is thought to increase
with $x_i$ at different rates in different populations. They found $v_2$ less biased for
$0 \le t \le 3/2$, but also less stable for $t = 0$ or $t = 1$.
$$R:\ \hat R \pm z\sqrt{v(\hat R)} \tag{6.15}$$
where $z$ is the normal deviate corresponding to the chosen confidence probability.
In section 6.3 it was suggested that the normal approximation holds reasonably
well if the sample size is at least 30 and is large enough so that the cv's of $\bar y$ and $\bar x$
are both less than 0.1. When these conditions do not apply, the formula for $v(\hat R)$
tends to give values that are too low and the positive skewness in the distribution
of $\hat R$ may become noticeable.
An alternative method of computing confidence limits, which takes some
account of the skewness of the distribution of $\hat R$, has been used in biological assay
(Fieller, 1932; Paulson, 1942). The method requires that $\bar y$ and $\bar x$ follow a
bivariate normal distribution, so that $(\bar y - R\bar x)$ is normally distributed. It follows
that in simple random samples the quantity
$$\frac{\bar y - R\bar x}{\sqrt{(N - n)/Nn}\,\sqrt{s_y^2 + R^2s_x^2 - 2Rs_{yx}}} \tag{6.16}$$
is approximately normally distributed with mean zero and unit standard deviation.
The value of $R$ is unknown, but any contemplated value of $R$ which makes this
normal deviate large enough may be regarded as rejected by the sample data.
Consequently, confidence limits for $R$ are found by setting (6.16) equal to $\pm z$ and
solving the resulting quadratic equation for $R$. The confidence limits are approximate
since the two roots of the quadratic are imaginary with some samples. Such
cases become rare if the cv's of $\bar y$ and $\bar x$ are less than 0.3.
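Setting (6.16) equal to $\pm z$ and squaring gives a quadratic in $R$, which can be solved directly. The sketch below is one possible way to do this, not the book's own computing form: the function name `fieller_limits`, the decision to return `None` when no finite interval exists, and the numerical inputs are all assumptions made for illustration.

```python
import math

def fieller_limits(ybar, xbar, s_yy, s_xx, s_yx, n, N, z=1.96):
    """Solve (ybar - R*xbar)^2 = z^2 * k * (s_yy + R^2*s_xx - 2*R*s_yx) for R.

    k = (N - n) / (N * n). Returns (lower, upper), or None when the roots are
    imaginary or the quadratic does not define a finite interval.
    """
    k = (N - n) / (N * n)
    g = z * z * k
    a = xbar ** 2 - g * s_xx
    b = -2.0 * (xbar * ybar - g * s_yx)
    c = ybar ** 2 - g * s_yy
    disc = b * b - 4.0 * a * c
    if a <= 0 or disc < 0:
        return None
    root = math.sqrt(disc)
    return (-b - root) / (2.0 * a), (-b + root) / (2.0 * a)

# Hypothetical sample summary values.
limits = fieller_limits(ybar=120.0, xbar=100.0, s_yy=2500.0, s_xx=1600.0,
                        s_yx=1800.0, n=50, N=1000)
print(limits)
```

The limits are generally not symmetric about $\hat R = \bar y/\bar x$, which is how the method reflects the skewness of the distribution of $\hat R$.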
After some manipulation, the two roots may be expressed as
where
$$c_{\bar y\bar y} = \frac{N - n}{Nn}\,\frac{s_y^2}{\bar y^2}$$
is the square of the estimated cv of $\bar y$, with analogous definitions of $c_{\bar y\bar x}$ and $c_{\bar x\bar x}$. If
$z^2c_{\bar y\bar y}$, $z^2c_{\bar x\bar x}$, and $z^2c_{\bar y\bar x}$ are all small relative to 1, the limits reduce to
$$R = \hat R \pm z\hat R\sqrt{c_{\bar y\bar y} + c_{\bar x\bar x} - 2c_{\bar y\bar x}}$$
This expression is the same as the normal approximation (6.15).
Even with bivariate normality, the Fieller limits have been criticized as not
conservative enough. James, Wilkinson, and Venables (1975) explain the nature
of the difficulty and present an alternative method.
$$l_1y_1 + l_2y_2 + \cdots + l_ny_n$$
where the $l$'s do not depend on the $y_i$, although they may be functions of the $x_i$. The
choice of $l$'s is restricted to those that give unbiased estimation of $Y$. The
estimator with the smallest variance is called the best linear unbiased estimator
(BLUE).
Formally, Brewer and Royall assume that the $N$ population values $(y_i, x_i)$ are a
random sample from a superpopulation in which
$$y_i = \beta x_i + \varepsilon_i \tag{6.19}$$
where the $\varepsilon_i$ are independent of the $x_i$ and $x_i > 0$. In arrays in which $x_i$ is fixed, $\varepsilon_i$
has mean 0 and variance $\lambda x_i$. The $x_i$ ($i = 1, 2, \ldots, N$) are known.
In the randomization theory used thus far in this book, the finite population
total $Y$ has been regarded as a fixed quantity. Under model (6.19), on the other
hand, $Y = \beta X + \sum^{N}\varepsilon_i$ is a random variable. In defining an unbiased estimator under
this model, Brewer and Royall use a concept of unbiasedness which differs from
that in randomization theory. They regard an estimator $\hat Y$ as unbiased if $E(\hat Y) = E(Y)$
in repeated selections of the finite population and sample under the model.
Such an estimator might be called model-unbiased.
Theorem 6.3. Under model (6.19) the ratio estimator $\hat Y_R = X\bar y/\bar x$ is a best
linear unbiased estimator for any sample, random or not, selected solely according
to the values of the $x_i$.
Proof. Since $E(\varepsilon_i \mid x_i) = 0$ in repeated sampling, it follows from (6.19) that
$$Y = \beta X + \sum^{N}\varepsilon_i: \qquad E(Y) = \beta X \tag{6.20}$$
Furthermore, with the model (6.19) any linear estimator $\hat Y$ is of the form
(6.21)
If we keep the $n$ sample values $x_i$ fixed in repeated sampling under the model
(6.19),
(6.22)
From (6.20) and (6.22), $\hat Y$ is clearly model-unbiased if $\sum^{n} l_ix_i = X$. Minimizing
$V(\hat Y)$ under this condition by a Lagrange multiplier gives
$$2l_ix_i = cx_i: \qquad l_i = \text{constant} = X/n\bar x \tag{6.23}$$
The constant must have the value $X/n\bar x$ in order to satisfy the model-unbiased
condition $\sum^{n} l_ix_i = X$. Hence the BLUE estimator $\hat Y$ is $n\bar yX/n\bar x = X\bar y/\bar x = \hat Y_R$, the
usual ratio estimator. This completes the proof.
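The minimization step can be illustrated numerically. Under model (6.19) with independent $\varepsilon_i$ of variance $\lambda x_i$, the part of the model variance of $\sum l_iy_i$ that depends on the weights is $\lambda\sum l_i^2x_i$; the sketch assumes this form. It compares the constant weights $l_i = X/n\bar x$ with another set of weights that also satisfies $\sum l_ix_i = X$. The function names and the sample values are hypothetical.

```python
def blue_weights(x_sample, X):
    """Constant weights l_i = X / (n * xbar) from (6.23)."""
    total = sum(x_sample)  # equals n * xbar
    return [X / total for _ in x_sample]

def model_variance(weights, x_sample, lam=1.0):
    """lam * sum(l_i^2 * x_i): variance of sum(l_i * eps_i) when V(eps_i) = lam * x_i."""
    return lam * sum(l * l * xi for l, xi in zip(weights, x_sample))

# Hypothetical sample x-values and known population total X.
x = [2.0, 5.0, 9.0, 4.0]
X = 100.0

l_blue = blue_weights(x, X)
# A competing model-unbiased choice: l_i = X / (n * x_i), so that sum(l_i * x_i) = X.
l_alt = [X / (len(x) * xi) for xi in x]

for name, l in (("constant weights", l_blue), ("alternative weights", l_alt)):
    constraint = sum(li * xi for li, xi in zip(l, x))
    print(name, "constraint:", round(constraint, 6), "variance:", round(model_variance(l, x), 2))
```

Both sets of weights satisfy the model-unbiasedness condition, and the constant weights give the smaller variance, in line with theorem 6.3.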
Furthermore, from (6.20) and (6.21), with $l_i = X/n\bar x$,
$$\hat Y_R - Y = \sum^{n} l_i\varepsilon_i - \sum^{N}\varepsilon_i = (X/n\bar x)\left(\sum^{n}\varepsilon_i\right) - \sum^{N}\varepsilon_i \tag{6.24}$$
(6.25)
where $\sum^{N-n}$ denotes the sum over the $(N - n)$ population values that are not in the
sample. Hence
The practical relevance of these results is that they suggest the conditions under
which the ratio estimator is superior not only to $\bar y$ but is the best of a whole class of
estimators. When we are trying to decide what kind of estimate to use, a graph in
which the sample values of $y_i$ are plotted against those of $x_i$ is helpful. If this graph
shows a straight line relation passing through the origin and if the variance of the
points $y_i$ about the line seems roughly proportional to $x_i$, the ratio estimator will be
hard to beat.
Sometimes the variance of the $y_i$ in arrays in which $x_i$ is fixed is not proportional
to $x_i$. If this residual variance is of the form $\lambda v(x_i)$, where $v(x_i)$ is known, Brewer
and Royall showed that the BLUE estimator becomes
$$\hat Y = X\,\frac{\sum^{n} w_iy_ix_i}{\sum^{n} w_ix_i^2} \tag{6.28}$$
where $w_i = 1/v(x_i)$. In a population sample of Greece, Jessen et al. (1947) judged
that the residual variance increased roughly as $x_i^2$. This suggests a weighted
regression with $w_i = 1/x_i^2$, which gives
$$\hat Y = X\,\frac{\sum^{n}(y_i/x_i)}{n} \tag{6.29}$$
For a given population and given $n$, $V(\hat Y_R)$ in (6.26) is clearly minimized, given
every $x_i > 0$, when the sample consists of the $n$ largest $x_i$ in the population. In 16
small natural populations of the type to which ratio estimates have been applied,
Royall (1970) found for samples having $n = 2$ to 12 that selection of the $n$ largest
$x_i$ usually increased the accuracy of $\hat Y_R$.
In summary, the Brewer-Royall results show that the assumption of a certain
type of model leads to an unbiased ratio estimator and formulas for $V(\hat Y_R)$ and
$v(\hat Y_R)$ that are simple and exact for any $n > 1$. The results might be used in
practice in cases where examination of the $y_i$, $x_i$ pairs from the available data
suggests that the model is reasonably correct. The variance formulas (6.26) and
(6.27) appear to be sensitive to inaccuracy in the model, although this issue needs
further study.
Further work by Royall and Herson (1973) discusses the type of sample
distribution needed with respect to the $x_i$ in order that $\hat Y_R$ remains unbiased when
there is a polynomial regression of $y_i$ on $x_i$.
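Write
$$\hat R - R = \frac{\bar y}{\bar x} - R = \frac{\bar y - R\bar x}{\bar x}$$
$$\frac{1}{\bar x} = \frac{1}{\bar X + (\bar x - \bar X)} = \frac{1}{\bar X}\left(1 + \frac{\bar x - \bar X}{\bar X}\right)^{-1} \doteq \frac{1}{\bar X}\left(1 - \frac{\bar x - \bar X}{\bar X}\right) \tag{6.30}$$
Hence,
$$\hat R - R \doteq \frac{\bar y - R\bar x}{\bar X}\left(1 - \frac{\bar x - \bar X}{\bar X}\right) \tag{6.31}$$
Now
$$E(\bar y - R\bar x) = \bar Y - R\bar X = 0$$
so that the leading term in the bias comes from the second term inside the
brackets. Furthermore,
$$E\,\bar y(\bar x - \bar X) = E(\bar y - \bar Y)(\bar x - \bar X) = \frac{1 - f}{n}\rho S_yS_x \tag{6.32}$$
$$E\,\bar x(\bar x - \bar X) = E(\bar x - \bar X)^2 = \frac{1 - f}{n}S_x^2$$
Hence the leading term in the bias is
$$E(\hat R - R) \doteq \frac{1 - f}{n\bar X^2}(RS_x^2 - \rho S_yS_x) \tag{6.33}$$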
From (6.33) and (6.35) the leading term in the quantity (bias/s.e.), which is the
same for $\hat R$, $\hat Y_R$, and $\hat{\bar Y}_R$, may be expressed as
$$\frac{(\text{bias})}{\text{s.e.}} = \text{cv}(\bar x)\,\frac{(RS_x - \rho S_y)}{(R^2S_x^2 - 2R\rho S_yS_x + S_y^2)^{1/2}} \tag{6.36}$$
where cv$(\bar x) = \sqrt{1 - f}\,S_x/\sqrt n\,\bar X$. By substituting sample estimates of the terms in
(6.36), Kish, Namboodiri, and Pillai (1962) computed the (bias/s.e.) values for
numerous items in various national and more localized studies. In the national
studies nearly all the (bias/s.e.) values were < 0.03 and almost the only values
> 0.10 in their studies occurred for a single stratum with $n_h = 6$ small hospitals.
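The second result, due to Hartley and Ross (1954), gives an exact result for the
bias and an upper bound to the ratio of the bias to the standard error. Consider the
covariance, in simple random samples of size $n$, of the quantities $\hat R$ and $\bar x$. We have
$$\text{cov}(\hat R, \bar x) = E\left(\frac{\bar y}{\bar x}\cdot\bar x\right) - E(\hat R)E(\bar x) = \bar Y - \bar XE(\hat R) \tag{6.38}$$
Hence
$$E(\hat R) = \frac{\bar Y}{\bar X} - \frac{1}{\bar X}\,\text{cov}(\hat R, \bar x) = R - \frac{1}{\bar X}\,\text{cov}(\hat R, \bar x) \tag{6.39}$$
Thus the bias in $\hat R$ is $-\text{cov}(\hat R, \bar x)/\bar X$. Unlike the Taylor approximation (6.33) to
the bias, this expression is exact.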
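Furthermore,
$$|\text{bias in } \hat R| = \frac{|\rho_{\hat R,\bar x}\,\sigma_{\hat R}\sigma_{\bar x}|}{\bar X} \le \frac{\sigma_{\hat R}\sigma_{\bar x}}{\bar X}$$
since $\hat R$ and $\bar x$ cannot have a correlation > 1. Hence
$$\frac{|\text{bias in } \hat R|}{\sigma_{\hat R}} \le \frac{\sigma_{\bar x}}{\bar X} = \text{cv of } \bar x \tag{6.40}$$
The same bound applies, of course, to the bias in $\hat Y_R$ and $\hat{\bar Y}_R$.
Thus, if the cv of $\bar x$ is
less than 0.1, the bias may safely be regarded as negligible in relation to the s.e.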
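Because the Hartley-Ross result is exact, it can be checked by enumerating every possible simple random sample from a very small population. The sketch below does this; the function name `hartley_ross_check` and the six-unit population are hypothetical, and all moments are taken over the equally likely samples.

```python
from itertools import combinations
import math

def hartley_ross_check(y, x, n):
    """Enumerate all samples of size n and return
    (bias of R_hat, -cov(R_hat, xbar)/Xbar, |bias|/sd(R_hat), cv of xbar)."""
    N = len(y)
    Xbar = sum(x) / N
    R = sum(y) / sum(x)
    r_vals, xbar_vals = [], []
    for idx in combinations(range(N), n):
        ys = sum(y[i] for i in idx)
        xs = sum(x[i] for i in idx)
        r_vals.append(ys / xs)
        xbar_vals.append(xs / n)
    m = len(r_vals)
    e_r = sum(r_vals) / m
    e_xbar = sum(xbar_vals) / m
    cov = sum((r - e_r) * (xb - e_xbar) for r, xb in zip(r_vals, xbar_vals)) / m
    sd_r = math.sqrt(sum((r - e_r) ** 2 for r in r_vals) / m)
    sd_xbar = math.sqrt(sum((xb - e_xbar) ** 2 for xb in xbar_vals) / m)
    bias = e_r - R
    return bias, -cov / Xbar, abs(bias) / sd_r, sd_xbar / Xbar

# Hypothetical population of N = 6 units; all 20 samples of size 3 are enumerated.
y = [12.0, 30.0, 18.0, 45.0, 25.0, 60.0]
x = [10.0, 22.0, 17.0, 40.0, 19.0, 50.0]
bias, from_cov, bias_over_sd, cv_xbar = hartley_ross_check(y, x, 3)
print(bias, from_cov)          # the two agree up to rounding error
print(bias_over_sd, cv_xbar)   # the first does not exceed the second
```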
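$$E(\hat R - R)^2 \doteq V_1\left(1 + \frac{3C_{xx}}{n} + \frac{6C_{xx}}{n}\,\frac{\rho^2C_{yy} + C_{xx} - 2C_{yx}}{C_{yy} + C_{xx} - 2C_{yx}}\right) \tag{6.41}$$
Since the right-hand term inside the parentheses is less than $6C_{xx}/n$, this gives
$$E(\hat R - R)^2 < V_1\left(1 + \frac{9C_{xx}}{n}\right) \tag{6.42}$$
$$E(\hat R - R)^2 \doteq V_1\left[1 + \frac{C_{xx}}{n}(6 - 3\rho)\right] \tag{6.43}$$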
TABLE 6.2
Average percent underestimation of MSE($\hat R$)

                                   $n$
Estimator                   4     6     8    12
$V(\hat R)$ in (6.4)        14    14    14    12
$v_1(\hat R)$ in (6.12)     31    23    21    18
where $R_h = Y_h/X_h$ is the true ratio in stratum $h$, and $\rho_h$ is defined as before in each
stratum.
Proof. Apply formula (6.2), section 6.3, for a simple random sample to give in
stratum $h$,
(6.46)
$\sqrt L$ (cv of $\bar x_h$)
For example, with 50 strata and the cv of $\bar x_h$ about 0.1 in each stratum, the bias
in $\hat Y_{Rs}$ might be as large as 0.7 times its standard error. The contribution of the
bias to the mean square error of $\hat Y_{Rs}$ would then be about one third.
Although in practice the bias is usually much smaller than its upper bound, the
danger of bias with the separate ratio estimate should be kept in mind if
$\sqrt L$ (cv of $\bar x_h$) exceeds, say, 0.3.
(6.47)
These are the standard estimates of the population totals $Y$ and $X$, respectively,
made from a stratified sample. The combined ratio estimate, $\hat Y_{Rc}$ ($c$ for combined),
is
$$\hat Y_{Rc} = \frac{\hat Y_{st}}{\hat X_{st}}X = \frac{\bar y_{st}}{\bar x_{st}}X \tag{6.48}$$
where $\bar y_{st} = \hat Y_{st}/N$, $\bar x_{st} = \hat X_{st}/N$ are the estimated population means from a
stratified sample.
The estimate $\hat Y_{Rc}$ does not require a knowledge of the $X_h$, but only of $X$.
The combined estimate is much less subject to the risk of bias than the separate
estimate. Using the approach of Hartley and Ross in section 6.8, we have, writing
$\hat R_c = \bar y_{st}/\bar x_{st}$,
$$\text{cov}(\hat R_c, \bar x_{st}) = E\left(\frac{\bar y_{st}}{\bar x_{st}}\cdot\bar x_{st}\right) - E(\hat R_c)E(\bar x_{st}) = \bar Y - \bar XE(\hat R_c) \tag{6.49}$$
Hence
and
(6.50)
Thus the biases in $\hat R_c$, $\hat Y_{Rc}$ are negligible relative to their standard errors, provided
only that the cv of $\bar x_{st}$ is less than 0.1.
Theorem 6.5. If the total sample size $n$ is large,
$$V(\hat Y_{Rc}) = \sum_h \frac{N_h^2(1 - f_h)}{n_h}(S_{yh}^2 + R^2S_{xh}^2 - 2R\rho_hS_{yh}S_{xh}) \tag{6.51}$$
Proof. This follows the same argument as theorem 2.5. In the present case the
key equation is
(6.52)
Now consider the variate $u_{hi} = y_{hi} - Rx_{hi}$. The right side of (6.52) is $N\bar u_{st}$, where $\bar u_{st}$
is the weighted mean of the variate $u_{hi}$ in a stratified sample. Furthermore, the
population mean $\bar U$ of $u_{hi}$ is zero, since $R = Y/X$.
Hence we may apply to $\bar u_{st}$ theorem 5.3 for the variance of the estimated mean
from a stratified random sample. This gives
(6.53)
where
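From equations (6.45) and (6.51) it is interesting to note that the approximate
variances of $\hat Y_{Rs}$ and $\hat Y_{Rc}$ assume the same general form, the difference being that
the population ratios $R_h$ in the individual strata in (6.45) are all replaced by $R$ in
(6.51).
$$V(\hat Y_{Rc}) - V(\hat Y_{Rs}) = \sum_h \frac{N_h^2(1 - f_h)}{n_h}\left[(R^2 - R_h^2)S_{xh}^2 - 2(R - R_h)\rho_hS_{yh}S_{xh}\right]$$
$$= \sum_h \frac{N_h^2(1 - f_h)}{n_h}\left[(R - R_h)^2S_{xh}^2 + 2(R - R_h)(R_hS_{xh}^2 - \rho_hS_{yh}S_{xh})\right]$$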
In situations in which the ratio estimate is appropriate the last term on the right
is usually small. (It vanishes if within each stratum the relation between $y_{hi}$ and $x_{hi}$
is a straight line through the origin.) Thus, unless $R_h$ is constant from stratum to
stratum, the use of a separate ratio estimate in each stratum is likely to be more
precise if the sample in each stratum is large enough so that the approximate
formula for $V(\hat Y_{Rs})$ is valid, and the cumulative bias that can affect $\hat Y_{Rs}$ (section
6.10) is negligible. With only a small sample in each stratum, the combined
estimate is to be recommended unless there is good empirical evidence to the
contrary.
For sample estimates of these variances we substitute sample estimates of $R_h$
and $R$ in the appropriate places. The sample mean squares $s_{yh}^2$ and $s_{xh}^2$ are
substituted for the corresponding variances and the sample covariance for the
term $\rho_hS_{yh}S_{xh}$. The sample mean square and covariance must be calculated
separately for each stratum.
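The substitution of sample quantities can be sketched as follows for both stratified ratio estimates. This is an illustration under stated assumptions, not the book's computing routine: the function and key names are hypothetical, the separate estimate is taken in its usual form $\sum_h (\bar y_h/\bar x_h)X_h$, the variance estimates use $N_h^2(1 - f_h)/n_h$ times $(s_{yh}^2 + \hat R^2s_{xh}^2 - 2\hat Rs_{yxh})$ with $\hat R_h$ or $\hat R_c$ inserted, and the data are invented.

```python
def _mean(v):
    return sum(v) / len(v)

def _cov(u, v):
    """Sample covariance with divisor n - 1 (the variance when u is v)."""
    ub, vb = _mean(u), _mean(v)
    return sum((a - ub) * (b - vb) for a, b in zip(u, v)) / (len(u) - 1)

def stratified_ratio_estimates(strata, X_total):
    """strata: list of dicts with keys N (stratum size), X (stratum x-total),
    y and x (sample values). Returns the separate and combined ratio estimates
    of the population total with their estimated variances."""
    y_st = sum(s["N"] * _mean(s["y"]) for s in strata)
    x_st = sum(s["N"] * _mean(s["x"]) for s in strata)
    r_c = y_st / x_st
    est_sep = v_sep = v_comb = 0.0
    for s in strata:
        n_h = len(s["y"])
        f_h = n_h / s["N"]
        r_h = _mean(s["y"]) / _mean(s["x"])
        est_sep += r_h * s["X"]
        s_yy, s_xx, s_yx = _cov(s["y"], s["y"]), _cov(s["x"], s["x"]), _cov(s["y"], s["x"])
        scale = s["N"] ** 2 * (1 - f_h) / n_h
        v_sep += scale * (s_yy + r_h ** 2 * s_xx - 2 * r_h * s_yx)
        v_comb += scale * (s_yy + r_c ** 2 * s_xx - 2 * r_c * s_yx)
    return {"separate": est_sep, "v_separate": v_sep,
            "combined": r_c * X_total, "v_combined": v_comb}

# Hypothetical two-stratum sample.
strata = [
    {"N": 50, "X": 2500.0, "y": [20.0, 30.0, 26.0, 41.0], "x": [40.0, 52.0, 48.0, 70.0]},
    {"N": 20, "X": 4000.0, "y": [90.0, 150.0, 110.0], "x": [150.0, 260.0, 210.0]},
]
print(stratified_ratio_estimates(strata, X_total=6500.0))
```

Note that the separate estimate needs each stratum total $X_h$, whereas the combined estimate needs only the population total $X$.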
Example. The data come from a census of all farms in Jefferson County, Iowa. In this
example $y_{hi}$ represents acres in corn and $x_{hi}$ acres in the farm. The population is divided
into two strata, the first stratum containing farms of as many as 160 acres. We assume a
sample of 100 farms. When stratified sampling is used, we will suppose that 70 farms are
taken from stratum 1 and 30 from stratum 2, this being roughly the optimum allocation.
The data are given in Table 6.3. The last three quantities, $Q_h$, $V_h'$, and $V_h''$, are auxiliary
quantities to be used in the computations, the last two being defined later.
We consider five methods of estimating the population mean corn acres per farm. The fpc
are ignored.
1. Simple random sample: mean per farm estimate.
$$V_1 = \frac{S_y^2}{n} = \frac{620}{100} = 6.20$$
TABLE 6.3
DATA FROM JEFFERSON COUNTY, IOWA
Strata   Size (farm acres)   $N_h$   $S_{yh}^2$   $S_{yxh}$   $S_{xh}^2$
2. Simple random sample: ratio estimate.
$$V_2 = \frac{1}{n}(S_y^2 + R^2S_x^2 - 2RS_{yx}) = 3.51$$
3. Stratified random sample: mean per farm estimate.
$$V_3 = \sum\frac{W_h^2S_{yh}^2}{n_h} = \sum Q_hS_{yh}^2 = 4.16$$
4. Stratified random sample: ratio estimate using a separate ratio in each stratum.
$$V_4 = \sum Q_h(S_{yh}^2 + R_h^2S_{xh}^2 - 2R_hS_{yxh}) = \sum Q_hV_h' = 3.06$$
5. Stratified random sampling: ratio estimate using a combined ratio.
$$V_5 = \sum Q_h(S_{yh}^2 + R^2S_{xh}^2 - 2RS_{yxh}) = \sum Q_hV_h'' = 3.10$$
The relative precisions of the various methods can be summarized as follows,
Method of Relative
Sampling method Estimation Precision
The Keyfitz method uses the identity that for $n_h = 2$,
$$2s_{yh}^2 = 2\sum_{i=1}^{2}(y_{hi} - \bar y_h)^2 = (y_{h1} - y_{h2})^2 = (dy_h)^2 \tag{6.55}$$
$$v(\hat Y_h) = \left(\frac{N_h}{2}\right)^2 2(1 - f_h)s_{yh}^2 = (1 - f_h)(y_{h1}' - y_{h2}')^2 = (1 - f_h)(dy_h')^2 \tag{6.56}$$
where $y_{hi}' = N_hy_{hi}/2$. Similarly, for the sample estimate of the covariance,
(6.57)
Now
(6.59)
(6.60)
Keyfitz (1957) has extended this method to cover poststratified estimators and
multistage sampling, and to give variances of differences of estimates from
successive surveys in periodic samples. Woodruff (1971) gives a general approach
that handles nonlinear estimators, unequal probabilities of selection, and samples
of size $n_h$ in the strata. As an illustration of Woodruff's approach, consider a
function $f(\hat Y)$ where $\hat Y$ represents the vector or set of $m$ variables $\hat Y_i = \sum_h \hat Y_{ih}$. With
simple random sampling in stratum $h$, the $\hat Y_{ih}$ are of the form $(N_h/n_h)\sum_k y_{ihk}$, the
sum extending over the $n_h$ sample units in stratum $h$. By Taylor's approximation,
$$f(\hat Y) - f(Y) \doteq \sum_i \frac{\partial f}{\partial Y_i}(\hat Y_i - Y_i) = \sum_i\sum_h \frac{\partial f}{\partial Y_i}(\hat Y_{ih} - Y_{ih}) \tag{6.61}$$
(6.62)
where
"
(6.63)
(6.67)
== r [r x,,(.t)
h" X"
(~
Y
_ ~)]
Xo a
2 (6.70)
$$V(\hat Y_{Rs}) = \sum_h \frac{N_h(N_h - n_h)S_{dh}^2}{n_h}, \quad \text{with} \quad S_{dh}^2 = \frac{1}{N_h - 1}\sum_{i=1}^{N_h} d_{hi}^2 \tag{6.71}$$
where $d_{hi} = y_{hi} - R_hx_{hi}$ is the deviation of $y_{hi}$ from $R_hx_{hi}$. By the methods given in
Chapter 5 for finding optimum allocation, it follows that (6.71) is minimized
subject to a total cost of the form $\sum c_hn_h$, when
$$n_h \propto \frac{N_hS_{dh}}{\sqrt{c_h}}$$
With a mean per unit it will be recalled that for minimum variance $n_h$ is chosen
proportional to $N_hS_{yh}/\sqrt{c_h}$.
In the planning of a sample, the allocation with a ratio estimate may appear
a little perplexing, because it seems difficult to speculate about likely values of
$S_{dh}$. Two rules are helpful. With a population in which the ratio estimate is a best
linear unbiased estimate, $S_{dh}$ will be roughly proportional to $\sqrt{\bar{X}_h}$ (by theorem
6.3). In this case the $n_h$ should be proportional to $N_h\sqrt{\bar{X}_h}/\sqrt{c_h}$. Sometimes the
variance of $d_{hi}$ may be more nearly proportional to $\bar{X}_h^2$. This leads to the
allocation of $n_h$ proportional to $N_h\bar{X}_h/\sqrt{c_h}$, that is, to the stratum total of $x_{hi}$,
divided by the square root of the cost per unit. An example of this type is discussed
by Hansen, Hurwitz, and Gurney (1946) for a sample designed to estimate sales of
retail stores.
If the estimate $\hat{Y}_{Rc}$ is to be used, the same general argument applies.
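The two planning rules can be sketched in a few lines. This is an illustration only: the stratum sizes, means, and costs below are hypothetical, the helper name `allocate` is an assumption, and the resulting sample sizes are left unrounded.

```python
from math import sqrt


def allocate(n, measures):
    """Split a total sample size n in proportion to the given stratum measures."""
    total = sum(measures)
    return [n * m / total for m in measures]


# Hypothetical strata: sizes N_h, means of x per unit, and costs per unit c_h.
N = [50, 120, 80]
X_bar = [30.0, 45.0, 60.0]
c = [1.0, 1.0, 4.0]

# Rule 1: S_dh roughly proportional to sqrt(X_bar_h).
rule1 = allocate(100, [Nh * sqrt(Xh) / sqrt(ch) for Nh, Xh, ch in zip(N, X_bar, c)])

# Rule 2: variance of d_hi roughly proportional to X_bar_h squared.
rule2 = allocate(100, [Nh * Xh / sqrt(ch) for Nh, Xh, ch in zip(N, X_bar, c)])
```

Under rule 2 the measure $N_h\bar{X}_h$ is simply the stratum total of $x$, so no guess about $S_{dh}$ is needed.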
Example. The different methods of allocation can be compared from data collected in a
complete enumeration of 257 commercial peach orchards in North Carolina in June 1946
(Finkner, 1950). The purpose was to determine the most efficient sampling procedure for
estimating commercial peach production in this area. Information was obtained on the
number of peach trees and the estimated total peach production in each orchard. The high
correlation between these two variables suggested the use of a ratio estimate. One very
large orchard was omitted.
For this illustration, the area is divided geographically into three strata. The number of
peach trees in an orchard is denoted by $x_{hi}$ and the estimated production in bushels of
peaches by $y_{hi}$. Only the first ratio estimate $\hat{Y}_{Rs}$ (based on a separate ratio in each stratum)
will be considered, since the principle is the same for both types of stratified ratio estimate.
Four methods of allocation are compared: (a) $n_h$ proportional to $N_h$, (b) $n_h$ proportional
to $N_h S_{dh}$, (c) $n_h$ proportional to $N_h\sqrt{\bar{X}_h}$, and (d) $n_h$ proportional to $N_h\bar{X}_h = X_h$.
The sample size is 100. The data for these comparisons are summarized in Table 6.4.
TABLE 6.4
DATA FROM THE NORTH CAROLINA PEACH SURVEY
Pop. 3898 4434 6409 62.43 80.06 44.45 56.47 1.27053 1433
Pop. 256 100 20181 100 20.45 1688.9 100 11379 100
The upper part of the table shows the basic data. The method employed to calculate the
four variances was first to find the $n_h$ for each type of allocation. These values are shown in
the columns headed (a) through (d) in the lower part of the table. Thus, with allocation (a),
$n_h = nN_h/N$, so that in the first stratum
$$n_1 = \frac{(100)(47)}{256} = 18$$
When the $n_h$ have been obtained, the corresponding $V(\hat{Y}_{Rs})$ is found by substituting in
the formula
$$V(\hat{Y}_{Rs}) = \sum_h \frac{N_h(N_h - n_h)}{n_h}S_{dh}^2$$
where
$$S_{dh}^2 = S_{yh}^2 + R_h^2 S_{xh}^2 - 2R_h S_{yxh}$$
The quantities $S_{dh}^2$ are given on the extreme right of the top half of Table 6.4.
TABLE 6.5
COMPARISON OF FOUR METHODS OF ALLOCATION
Variance
Method of
Allocation: n. Strata
Proportional Relative
to 2 3 Total Precision
Unbiased Methods
One estimate, due to Hartley and Ross (1954), can be derived by starting with
the mean $\bar{r}$ of the ratios $y_i/x_i$ and correcting it for bias.
$$\bar{r} = \frac{1}{n}\sum^n r_i = \frac{1}{n}\sum^n \frac{y_i}{x_i}$$
Now
$$\frac{1}{N}\sum_{i=1}^N r_i(x_i - \bar{X}) = \frac{1}{N}\sum_{i=1}^N \frac{y_i}{x_i}x_i - \left(\frac{1}{N}\sum_{i=1}^N r_i\right)\bar{X}$$
is
$$\frac{1}{n-1}\sum_{i=1}^n r_i(x_i - \bar{x}) = \frac{n}{n-1}(\bar{y} - \bar{r}\bar{x})$$
On substituting into (6.73), the estimate $\bar{r}$, corrected for bias, becomes
$$\hat{R}_{HR} = \bar{r} + \frac{n(N-1)}{(n-1)N\bar{X}}(\bar{y} - \bar{r}\bar{x})$$ (6.74)
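The unbiasedness of (6.74) can be checked by brute force on a small population. The sketch below assumes simple random sampling without replacement and enumerates every sample of size 2; the population values and the function name `hartley_ross` are hypothetical.

```python
from itertools import combinations


def hartley_ross(y, x, N, X_bar):
    """Hartley-Ross estimate (6.74) from one sample (y, x) of size n."""
    n = len(y)
    r_bar = sum(yi / xi for yi, xi in zip(y, x)) / n
    y_bar = sum(y) / n
    x_bar = sum(x) / n
    return r_bar + n * (N - 1) / ((n - 1) * N * X_bar) * (y_bar - r_bar * x_bar)


# Hypothetical population of N = 5 units.
Y = [3.0, 5.0, 9.0, 12.0, 20.0]
X = [2.0, 4.0, 6.0, 9.0, 14.0]
N = len(Y)
X_bar = sum(X) / N
R = sum(Y) / sum(X)

# Average the estimate over all equally likely samples of size 2.
estimates = [
    hartley_ross([Y[i] for i in s], [X[i] for i in s], N, X_bar)
    for s in combinations(range(N), 2)
]
mean_estimate = sum(estimates) / len(estimates)
```

If the estimate is unbiased, `mean_estimate` matches `R` up to floating-point rounding.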
(6.76)
As a third method, Lahiri (1951) showed that the ordinary ratio estimate $\hat{R}$ is
unbiased if the sample is drawn with probability proportional to $\sum x_i$. Perhaps the
simplest method of doing this (Midzuno, 1951) is to draw the first member of the
sample with probability proportional to $x_i$. The remaining $(n-1)$ members of the
sample are drawn with equal probability. It is easy to prove (exercise 6.10) that
with this method the probability that a specific sample is drawn is proportional to
$\sum x_i$, and that $\hat{R} = \sum y_i/\sum x_i$ is unbiased for this method of sample selection.
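Midzuno's selection scheme and the resulting unbiasedness can be sketched as follows. The population values are hypothetical, the function names are assumptions, and the check enumerates all samples rather than relying on simulation.

```python
import random
from itertools import combinations
from math import comb


def midzuno_draw(x, n, rng=random):
    """First unit with probability proportional to x, the other n - 1 with equal probability."""
    units = list(range(len(x)))
    first = rng.choices(units, weights=x, k=1)[0]
    rest = rng.sample([u for u in units if u != first], n - 1)
    return sorted([first] + rest)


def sample_probability(sample, x):
    """Probability of a specific sample: proportional to the sum of its x values."""
    N, n = len(x), len(sample)
    return sum(x[i] for i in sample) / sum(x) / comb(N - 1, n - 1)


# Hypothetical population.
Y = [3.0, 5.0, 9.0, 12.0, 20.0]
X = [2.0, 4.0, 6.0, 9.0, 14.0]
n = 3
samples = list(combinations(range(len(X)), n))

total_probability = sum(sample_probability(s, X) for s in samples)
expected_R = sum(
    sample_probability(s, X) * sum(Y[i] for i in s) / sum(X[i] for i in s)
    for s in samples
)
```

Here `total_probability` should equal 1 and `expected_R` should equal the population ratio `sum(Y) / sum(X)`, both up to rounding.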
Methods with bias of order $1/n^2$
These methods consist of an adjustment to $\hat{R}$. The first, due to Quenouille
(1956), is applicable to a broad class of statistical problems in which the proposed
estimate has a bias of order $1/n$. It has been given the name of the jackknife
method, to denote a tool with many uses. The utility of this method for ratio
estimates was pointed out by Durbin (1959).
Ignoring the fpc for the moment, the bias of estimates like $\hat{R}$ may be expanded
in a series of the form
$$E(\hat{R}) = R + \frac{b_1}{n} + \frac{b_2}{n^2} + \cdots$$ (6.77)
If $n = mg$, let the sample be divided at random into $g$ groups of size $m$. From (6.77)
$$E(g\hat{R}) = gR + \frac{b_1}{m} + \frac{b_2}{gm^2} + \cdots$$ (6.78)
Now let $\hat{R}_j$ be the ordinary ratio $\sum y/\sum x$, computed from the sample after
omitting the $j$th group. Since $\hat{R}_j$ is obtained from a simple random sample of size
$m(g-1)$,
$$E(\hat{R}_j) = R + \frac{b_1}{m(g-1)} + \frac{b_2}{m^2(g-1)^2} + \cdots$$ (6.79)
Hence
$$E[(g-1)\hat{R}_j] = (g-1)R + \frac{b_1}{m} + \frac{b_2}{(g-1)m^2} + \cdots$$ (6.80)
$$E[g\hat{R} - (g-1)\hat{R}_j] = R - \frac{b_2}{g(g-1)m^2} = R - \frac{b_2}{n^2}\frac{g}{(g-1)}$$
The bias is now of order $1/n^2$. We can construct $g$ estimates of this type, one for
each group. Quenouille's estimator (the jackknife) is the average of these $g$
estimates, that is,
$$\hat{R}_Q = g\hat{R} - (g-1)\hat{R}_-$$ (6.81)
where $\hat{R}_-$ is the average of the $g$ quantities $\hat{R}_j$. As Quenouille showed, the
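The construction in (6.81) can be sketched directly. The function below is a minimal illustration, assuming the sample is already in random order so that consecutive blocks of $m$ units serve as the $g$ random groups; the names and data are hypothetical.

```python
def ratio(y, x):
    """Ordinary ratio estimate: sum of y over sum of x."""
    return sum(y) / sum(x)


def jackknife_ratio(y, x, g):
    """Quenouille's estimate g*R - (g-1)*R_minus, with g groups of equal size."""
    n = len(y)
    if n % g != 0:
        raise ValueError("n must be a multiple of g")
    m = n // g
    R = ratio(y, x)
    omitted = []
    for j in range(g):
        # Drop the jth group (units j*m .. (j+1)*m - 1) and recompute the ratio.
        keep = [i for i in range(n) if not (j * m <= i < (j + 1) * m)]
        omitted.append(ratio([y[i] for i in keep], [x[i] for i in keep]))
    R_minus = sum(omitted) / g
    return g * R - (g - 1) * R_minus


# Hypothetical sample of n = 6 units, split into g = 3 groups of m = 2.
y = [4.0, 7.0, 9.0, 15.0, 21.0, 30.0]
x = [2.0, 3.0, 5.0, 8.0, 10.0, 16.0]
R_Q = jackknife_ratio(y, x, 3)
```

Taking $g = n$ (groups of one unit) is a common choice, at the cost of $n$ recomputations of the ratio.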
where $s_{yx} = \sum^n (y_i - \bar{y})(x_i - \bar{x})/(n-1)$, $s_x^2 = \sum^n (x_i - \bar{x})^2/(n-1)$, so that $c_{yx}$, $c_{xx}$ are
the sample relative covariance and relative variance of $x$.
The structure of $\hat{R}_T$ may be seen by noting that from (6.34) the leading term in
the expected value of $\hat{R}$ may be written
TABLE 6.6
A SMALL ARTIFICIAL POPULATION
                        Stratum
             I            II           III
           y    x       y    x       y    x
           2    2       2    1       3    1
           3    4       5    4       7    3
           4    6       9    8       9    4
          11   20      24   23      25   12
Totals    20   32      40   36      44   20
R_h         0.625        1.111        2.200
TABLE 6.7
RESULTS FOR DIFFERENT ESTIMATES OF Y
Method     Variance     (Bias)^2     MSE
with the consequence that this unit has a high probability of being drawn and that samples
containing this unit give good estimates of $R_h$.
The Quenouille, Beale, and Tin methods all produced substantial decreases in bias as
compared with the separate ratio estimate, and all had smaller MSE's, so that in this
example they achieved their principal purposes.
The study by J. N. K. Rao (1969) of natural populations cited in section 6.9
compared the Quenouille, Beale, and Tin methods for n = 2, 4, 6, 8, on 15 such
populations. For n = 2, the most severe test, the medians and the upper quartiles
of the quantities $|\text{bias}|/\sqrt{\text{MSE}}$ were as follows: $\hat{R}_Q$, 3%, 7%; $\hat{R}_B$, 8%, 12%; $\hat{R}_T$,
8%, 19%; as against 15%, 20% for $\hat{R}$. The more complex methods appear to help
materially as regards bias in these tiny samples.
The same study compared the MSE's of five of the estimates in this section with
that of $\hat{R}$. (Lahiri's method was omitted, since the study was confined to simple
random samples.) For each method the ratio $100\,\text{MSE}(\hat{R}_{HR})/\text{MSE}(\hat{R})$, and so
forth, was calculated for each population.
For n = 4, Quenouille's and Mickey's estimates were slightly inferior to $\hat{R}$ in
these populations but, for $n \ge 6$, all methods had average MSE's very close to
those of $\hat{R}$. For a very small sample from a single population, this study suggests
that these more complex methods have no material advantage in accuracy over $\hat{R}$.
But the fact that they reduce bias with little or no increase in MSE in a single
stratum should give them an advantage in a separate ratio estimate with numerous
strata having small samples.
Under a linear regression model, comparisons of the MSE's of these methods
for small n by P.S.R.S. Rao and J. N. K. Rao (1971), Hutchinson (1971), and J. N.
K. Rao and Kuzik (1974) gave results in general agreeing with those from the
natural populations.
TABLE 6.8
AVERAGE PERCENT BIAS IN ESTIMATORS OF VARIANCE
n=
Average of 4 6 8 12
The stability of $v(\hat{R}_Q)$ relative to that of $v_1(\hat{R})$, as judged by the squares of the
coefficients of variation of these variance estimates, was poor in these samples. In
studies by Rao and Beegle (1968) of $v(\hat{R}_Q)$ and $v_1(\hat{R})$ under a linear regression
model of y on x in an infinite population with x normal, however, $v(\hat{R}_Q)$ and
$v_1(\hat{R})$ appeared about equally stable for n = 4 to n = 12.
With a separate ratio estimate and numerous strata these results suggest that
$\sum X_h^2 v(\hat{R}_{Qh})$ is superior to $\sum X_h^2 v_1(\hat{R}_h)$ as an estimator of $V(\hat{Y}_{Rs})$. The former is
likely to be freer from bias and both should have adequate stability. But with only
a few strata the issue is questionable until further comparisons appear.
(6.87)
(6.88)
The Two Ratios Have Different Denominators But May Be Correlated
An example is the comparison of the proportion of men who smoke with the
proportion of women who smoke, in a survey in which the unit is a cluster of
houses. Mathematically, this is the most general case.
$$v(\hat{R} - \hat{R}') = v(\hat{R}) + v(\hat{R}') - 2\,\mathrm{cov}\,(\hat{R}\hat{R}')$$ (6.89)
The only unfamiliar term is cov $(\hat{R}\hat{R}')$. Writing, in the usual way,
$$\hat{R} - R \doteq \frac{\bar{y} - R\bar{x}}{\bar{X}}, \qquad \hat{R}' - R' \doteq \frac{\bar{y}' - R'\bar{x}'}{\bar{X}'}$$
we have
$$\mathrm{cov}\,(\hat{R}\hat{R}') \doteq \frac{(1-f)}{n\bar{X}\bar{X}'}\,\mathrm{cov}\,(y_i - Rx_i)(y_i' - R'x_i')$$
Example. The 1954 field trial of the Salk polio vaccine was conducted among children
in the first three grades in all schools in a number of counties. The counties were not
randomly selected, since those with a history of previous polio attacks were favored, but for
this illustration, it will be assumed that they are a random sample from some population.
Children whose parents did not give permission to participate in the trial were called the
"not inoculated" group and, of course, received no shots. Half of the children who received
permission were given three shots of an inert liquid and were called the "placebo" group.
From the data in Table 6.9, compare the frequencies $\hat{R}$, $\hat{R}'$ of paralytic polio in the "not
inoculated" and "placebo" groups. To reduce the amount of data, the comparison is
restricted to 34 counties, each having more than 4000 children in the two groups combined.
In these data any variation in the polio attack rate from county to county would produce a
positive correlation between $\hat{R}$ and $\hat{R}'$.
The following quantities are derived from the totals.
Not inoculated: $\hat{R}' = \frac{99}{284.6} = 0.347857$, $\bar{x}' = \frac{284.6}{34} = 8.3706$
For $v(\hat{R})$, $v(\hat{R}')$ and cov $(\hat{R}\hat{R}')$, all uncorrected sums of squares and products among the
four variates are required.
= 0.00584
TABLE 6.9
NUMBER OF CHILDREN (x, x') AND OF PARALYTIC CASES (y, y') PER COUNTY
x*   x'   y†   y'     x   x'   y   y'
* x, x' = numbers of "placebo" and "not inoculated" children (in 1000's)
† y, y' = numbers of paralytic polio cases in the placebo and not inoculated
groups
With divisor $n(n-1)\bar{x}\bar{x}'$, the numerical value is
$$\mathrm{cov}\,(\hat{R}\hat{R}') = \frac{(497) - (0.52569)(844.6) - (0.34786)(1397.4) + (0.52569)(0.34786)(2690.8)}{(34)(33)(4.9235)(8.3706)} = 0.00127$$
(6.93)
where
$$C_{RR'} = C_{yy'} + C_{xx'} - C_{yx'} - C_{y'x}$$ (6.96)
For the corresponding sample estimate $v(\hat{R}/\hat{R}')$ we substitute sample estimates of
the terms in (6.95).
Example. For the ratio of placebo and not inoculated case rates, $\hat{R}/\hat{R}' =
0.52569/0.34786 = 1.511$. Estimate the s.e. of this ratio. The computations in the preceding
example give
$$c_{\hat{R}\hat{R}} = \frac{0.00584}{(0.5257)^2} = 0.0211; \qquad c_{\hat{R}'\hat{R}'} = \frac{0.00240}{(0.3479)^2} = 0.0198;$$
$$c_{\hat{R}\hat{R}'} = \frac{0.00127}{(0.5257)(0.3479)} = 0.00694$$
$$v(\hat{R}/\hat{R}') = (1.511)^2(0.0211 + 0.0198 - 0.0139) = 0.0617$$
s.e. $(\hat{R}/\hat{R}') = 0.248$
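The arithmetic of this example can be reproduced in a few lines. The sketch below uses only the figures quoted above; the variable names are arbitrary, and small differences in the last digit can arise because the text rounds intermediate values.

```python
from math import sqrt

# Figures quoted in the two polio examples above.
R, R_prime = 0.52569, 0.34786             # placebo and not inoculated rates
v_R, v_R_prime, cov_RR = 0.00584, 0.00240, 0.00127

# Relative variances and relative covariance of the two ratios.
c_RR = v_R / R ** 2
c_RpRp = v_R_prime / R_prime ** 2
c_RRp = cov_RR / (R * R_prime)

ratio_of_ratios = R / R_prime
v_ratio = ratio_of_ratios ** 2 * (c_RR + c_RpRp - 2 * c_RRp)
se_ratio = sqrt(v_ratio)
```

The subtraction of twice the relative covariance is what makes the positive county-to-county correlation between the two rates reduce the standard error.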
The double ratio estimate has occasionally been used in place of $\hat{Y}_R = \hat{R}X$ to estimate a
population total $Y$, as suggested by Keyfitz (Yates, 1960). Suppose that $\hat{R}' = (\bar{y}'/\bar{x}')$ is
known for the same sample from a previous period and that $R' = Y'/X'$ is also known. If
$\hat{R}'/R'$ has been found, say, to be slightly > 1, we might argue intuitively that $\hat{R}$ is also likely
to give an overestimate of $R$ that should be adjusted downward by dividing it by the ratio
$\hat{R}'/R'$. This leads to the double ratio estimate $\hat{Y}_{DR}$.
$$\hat{Y}_{DR} = \frac{R'}{\hat{R}'}(\hat{R}X) = \frac{\hat{R}}{\hat{R}'}(R'X)$$ (6.97)
(6.100)
With $p$ variates, it is necessary to compute the inverse $V^{ij}$ of the matrix $V_{ij}$. Then
the optimum $W_i = L_i/L$, where $L_i$ is the sum of the elements in the $i$th column of
$V^{ij}$ and $L$ is the sum of all the $p^2$ elements of $V^{ij}$. The minimum variance is $1/L$.
In practice, the weights are determined from estimated variances and
covariances $v_{ij}$. From (6.7) in section 6.3,
$$v_{11} = \frac{(1-f)}{n}\bar{Y}^2(c_{yy} + c_{11} - 2c_{y1})$$
A convenient method of computation is first to obtain the matrix
If $v'_{ij} = nv_{ij}/(1-f)\bar{Y}^2$, the matrix $v'_{ij}$ is easily obtained by taking diagonal contrasts
in C, that is,
$$v'_{ij} = c_{yy} + c_{ij} - c_{yi} - c_{yj}$$
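The weight computation can be sketched without any matrix library. The code below is an illustration under stated assumptions: the C matrix is hypothetical, it is ordered $y, x_1, \ldots, x_p$, and the column sums of the inverse are obtained by solving $V'z = 1$ (valid because $V'$ is symmetric) rather than by inverting explicitly.

```python
def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def olkin_weights(C):
    """Return (weights W_i = L_i / L, minimum of the scaled variance 1 / L)."""
    p = len(C) - 1
    # Diagonal contrasts in C: v'_ij = c_yy + c_ij - c_yi - c_yj.
    V = [[C[0][0] + C[i][j] - C[0][i] - C[0][j] for j in range(1, p + 1)]
         for i in range(1, p + 1)]
    L_cols = solve(V, [1.0] * p)      # column sums of the inverse of V
    L = sum(L_cols)
    return [Li / L for Li in L_cols], 1.0 / L


# Hypothetical matrix of relative variances and covariances for y, x1, x2.
C = [[1.00, 0.90, 0.80],
     [0.90, 1.10, 0.85],
     [0.80, 0.85, 1.20]]
weights, min_scaled_variance = olkin_weights(C)
```

The second return value is in the scaled units of $v'_{ij}$; multiplying by $(1-f)\bar{Y}^2/n$ converts it back to a variance. Weights can fall outside the interval from 0 to 1 when the auxiliary variates are very highly correlated.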
(6 .102)
By the usual Taylor series expansion, the analogue of (6.8) for the product
estimator in a large simple random sample is
(6.103)
where $(cv)^2$ is the square of the coefficient of variation of either $\hat{\bar{Y}}_p$ or $\hat{Y}_p$. P.S.R.S.
Rao and Mudholkar (1967) have extended Olkin's multivariate ratio estimator to
a weighted combination of ratio estimators (for $x_i$ positively correlated with $y$)
and product estimators (for $x_i$ negatively correlated with $y$).
EXERCISES
6.1 A pilot survey of 21 households gave the following data for numbers of members
($x$), children ($y_1$), cars ($y_2$), and TV sets ($y_3$).
x  y1  y2  y3      x  y1  y2  y3      x  y1  y2  y3
5   3   1   3      2   0   0   1      6   3   2   0
2   0   1   1      3   1   1   1      4   2   1   1
4   1   2   0      2   0   2   0      4   2   1   1
4   2   1   1      6   4   2   1      3   1   0   1
6   4   1   1      3   1   0   0      2   0   2   1
3   1   1   2      4   2   1   1      4   2   1   1
5   3   1   1      5   3   1   1      3   1   1   1
Assuming that the total population X is known, would you recommend that ratio
estimates be used instead of simple expansions for estimating total numbers of children,
cars, and TV sets?
6.2 In a field of barley the grain, $y_i$, and the grain plus straw, $x_i$, were weighed for each
of a large number of sampling units located at random over the field. The total produce
(grain plus straw) of the whole field was also weighed. The following data were obtained:
$c_{yy} = 1.13$, $c_{yx} = 0.78$, $c_{xx} = 1.11$. Compute the gain in precision obtained by estimating the
grain yield of the field from the ratio of grain to total produce instead of from the mean yield
of grain per unit.
It requires 20 min to cut, thresh, and weigh the grain on each unit, 2 min to weigh the
straw on each unit, and 2 hr to collect and weigh the total produce of the field. How many
units must be taken per field in order that the ratio estimate may be more economical than
the mean per unit?
6.3 For the data in Table 6.1, $\hat{Y}_R = 28,367$ and $c_{\bar{y}\bar{y}} = 0.0142068$, $c_{\bar{y}\bar{x}} = 0.0146541$,
$c_{\bar{x}\bar{x}} = 0.0156830$. Compute the 95% quadratic confidence limits for Y and compare them
with the limits found by the normal approximation.
6.4 The values of y and x are measured for each unit in a simple random sample from a
population. If $\bar{X}$, the population mean of x, is known, which of the following procedures do
you recommend for estimating $\bar{Y}/\bar{X}$? (a) Always use $\bar{y}/\bar{X}$. (b) Sometimes use $\bar{y}/\bar{X}$ and
sometimes $\bar{y}/\bar{x}$. (c) Always use $\bar{y}/\bar{x}$. Give reasons for your answer.
6.5 The following data are for a small artificial population with N = 8 and two strata of
equal size.
   Stratum 1          Stratum 2
x_1i    y_1i       x_2i    y_2i
  2       0         10       7
  5       3         18      15
  9       7         21      10
 15      10         25      16
For a stratified random sample in which $n_1 = n_2 = 2$, compare the MSE's of $\hat{Y}_{Rs}$ and $\hat{Y}_{Rc}$ by
working out the results for all possible samples. To what extent is the difference in MSE's
due to biases in the estimates?
6.6 In exercise 6.5 compute the variance given by using Lahiri's method of sample
selection within each stratum and a separate ratio estimate.
6.7 Forty-five states of the United States (excluding the five largest) were arranged in
nine strata with five states each, states in the same stratum having roughly the same ratio of
1950 to 1940 population. A stratified random sample with $n_h = 2$ gave the following results
for 1960 population (y) and 1950 population (x), in millions.
                              Stratum
         1     2     3     4     5     6     7     8     9
y_h1   0.23  0.63  0.97  2.54  4.67  4.32  4.56  1.79  2.18
x_h1   0.13  0.50  0.91  2.01  3.93  3.96  4.06  1.91  1.90
y_h2   4.95  2.85  0.61  6.07  3.96  1.41  3.57  1.86  1.75
x_h2   2.78  2.38  0.53  4.84  3.44  l.3a  3.29  2.01  1.32
Given that the 1950 population total X is 97.94, estimate the 1960 population by the
combined ratio estimate. Find the standard error of your estimate by Keyfitz' short-cut
method (section 6.13). The correct 1960 total was 114.99. Does your estimate agree with
this figure within sampling errors?
6.8 In the example of a bivariate ratio estimate given by Olkin, a sample of 50 cities was
drawn from a population of 200 large cities. The variates $y$, $x_1$, $x_2$ are the numbers of
inhabitants per city in 1950, 1940, and 1930, respectively. For the population, $\bar{Y} = 1699$,
$\bar{X}_1 = 1482$, $\bar{X}_2 = 1420$ (in 100's) and, for the sample, $\bar{y} = 1896$, $\bar{x}_1 = 1693$, $\bar{x}_2 = 1643$. The C
matrix as defined in section 6.20 is
  y       x_1     x_2
1.213   1.241   1.256
1.241   1.302   1.335
1.256   1.335   1.381
Estimate $\bar{Y}$ by (a) the sample mean, (b) the ratio of 1950 to 1940 numbers of inhabitants,
and (c) the bivariate ratio estimate. Compute the estimated standard error of each
estimate.
6.9 Prove that with Midzuno's method of sample selection (section 6.15) the probability
that any specific sample will be drawn is
$$\frac{(n-1)!(N-n)!}{(N-1)!}\sum^n \left(\frac{x_i}{X}\right)$$
6.10 In small populations the leading term in the bias of R in simple random samples of
size n is of the form
Regression Estimators
adjusts the sample mean of the actual measurements by the regression of the
actual measurements on the rapid estimates. The rapid estimates need not be free
from bias. If $x_i - y_i = D$, so that the rapid estimate is perfect except for a constant
bias $D$, then with $b = 1$ the regression estimate becomes
$$\bar{y} + (\bar{X} - \bar{x}) = \bar{X} + (\bar{y} - \bar{x})$$
= (pop. mean of rapid estimate) + (adjustment for bias)
$$\bar{y}_{lr} = \bar{y} + b_0(\bar{X} - \bar{x})$$
is unbiased, with variance
$$V(\bar{y}_{lr}) = \frac{1-f}{n}\,\frac{\sum_{i=1}^N [(y_i - \bar{Y}) - b_0(x_i - \bar{X})]^2}{N-1}$$ (7.3)
$$= \frac{1-f}{n}(S_y^2 - 2b_0S_{yx} + b_0^2S_x^2)$$ (7.7)
Corollary. An unbiased sample estimate of $V(\bar{y}_{lr})$ is
$$v(\bar{y}_{lr}) = \frac{1-f}{n}\,\frac{\sum^n [(y_i - \bar{y}) - b_0(x_i - \bar{x})]^2}{n-1}$$ (7.8)
$$= \frac{1-f}{n}(s_y^2 - 2b_0s_{yx} + b_0^2s_x^2)$$ (7.9)
This follows at once by applying theorem 2.4 to the variate $y_i - b_0(x_i - \bar{X})$.
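The estimate with a preassigned $b_0$ and its variance estimate (7.8) can be sketched as below. The function name and the data are hypothetical; the data are built so that $x_i - y_i = D = 2$, the constant-bias case described earlier, for which $b_0 = 1$ removes the bias exactly.

```python
def regression_estimate(y, x, X_bar, b0, N):
    """Return (ybar_lr, v(ybar_lr)) for a preassigned slope b0, following (7.8)."""
    n = len(y)
    y_bar = sum(y) / n
    x_bar = sum(x) / n
    estimate = y_bar + b0 * (X_bar - x_bar)
    f = n / N
    ss = sum(((yi - y_bar) - b0 * (xi - x_bar)) ** 2 for yi, xi in zip(y, x))
    variance = (1 - f) / n * ss / (n - 1)
    return estimate, variance


# Hypothetical sample: rapid estimates x overstate the true values y by D = 2.
y = [10.0, 12.0, 15.0]
x = [12.0, 14.0, 17.0]
estimate, variance = regression_estimate(y, x, X_bar=15.0, b0=1.0, N=30)
```

With these inputs the estimate equals $\bar{X} - D = 13$ and the estimated variance is zero, since every adjusted deviation vanishes.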
A natural question at this point is: What is the best value of $b_0$? The answer is
given in theorem 7.2.
Theorem 7.2. The value of $b_0$ that minimizes $V(\bar{y}_{lr})$ is
$$b_0 = B = \frac{S_{yx}}{S_x^2} = \frac{\sum_{i=1}^N (y_i - \bar{Y})(x_i - \bar{X})}{\sum_{i=1}^N (x_i - \bar{X})^2}$$ (7.10)
which may be called the linear regression coefficient of y on x in the finite
population. Note that B does not depend on the properties of any sample that is
(7.11 )
(7.12)
This gives
(7. 15 )
$$= V_{min}(\bar{y}_{lr})\left[1 + \frac{(b - B)^2S_x^2}{S_y^2(1 - \rho^2)}\right]$$ (7.16)
(7.18)
For example, if $\rho = 0.7$, the increase in variance is less than 10% ($\alpha = 0.1$),
provided that $|b/B - 1| < \sqrt{\alpha(1 - \rho^2)/\rho^2} = \sqrt{(0.1)(0.51)/(0.49)} = 0.32$.
The estimator $\bar{y}_{lr}$, like $\bar{y}_R$, will be shown in section 7.7 to have a bias of order $1/n$.
In finding the sampling error of $\bar{y}_{lr}$, replace the sample $b$ in (7.20) by the
population regression coefficient $B$ in (7.10). In Theorem 7.3 the error committed
in this approximation will be shown to be of order $1/\sqrt{n}$ relative to the terms
retained. We first examine the relation between $b$ and $B$.
Introduce the variate $e_i$ defined by the relation
$$e_i = y_i - \bar{Y} - B(x_i - \bar{X})$$ (7.21)
Two properties of the $e_i$ are that $\sum^N e_i = 0$ and
$$\sum^N e_i(x_i - \bar{X}) = \sum^N (y_i - \bar{Y})(x_i - \bar{X}) - B\sum^N (x_i - \bar{X})^2 = 0$$ (7.22)
by definition of $B$. Now
$$V(\bar{y}_{lr}) \doteq \frac{(1-f)}{n}S_y^2(1 - \rho^2)$$ (7.28)
7.4 SAMPLE ESTIMATE OF VARIANCE
As a sample estimate of $V(\bar{y}_{lr})$, valid in large samples, we may use
(7.29)
(7.30)
the latter being the usual short-cut computing formula. The derivation is as
follows.
In theorem 7.3, equation (7.28), we had, since $S_y^2(1 - \rho^2) = S_e^2$,
$$s_e^2 = \frac{1}{n-1}\sum_{i=1}^n (e_i - \bar{e})^2$$
Now, from equation (7.21), it follows that
(7.32)
$$V(\bar{y}_{lr}) = \frac{N-n}{Nn}S_y^2(1 - \rho^2)$$ (regression)
$$V(\bar{y}_R) = \frac{N-n}{Nn}(S_y^2 + R^2S_x^2 - 2R\rho S_yS_x)$$ (ratio)
$$V(\bar{y}) = \frac{N-n}{Nn}S_y^2$$ (mean per unit)
It is apparent that the variance of the regression estimate is smaller than that of
the mean per unit unless $\rho = 0$, in which case the two variances are equal.
The variance of the regression estimate is less than that of the ratio estimate if
(7.33)
Thus the regression estimate is more precise than the ratio estimate unless
$B = R$. This occurs when the relation between $y_i$ and $x_i$ is a straight line through
the origin.
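The condition for the regression estimate to beat the ratio estimate can be worked out directly from the two large-sample variances. The following is a short derivation sketch, using $S_{yx} = \rho S_yS_x$ and $B = \rho S_y/S_x$ from (7.10); it is offered as a check on the statement above and is not necessarily the form in which the text writes (7.33).

```latex
\begin{aligned}
V(\bar{y}_{lr}) < V(\bar{y}_R)
  &\iff S_y^2(1-\rho^2) < S_y^2 + R^2 S_x^2 - 2R\rho S_y S_x \\
  &\iff 0 < \rho^2 S_y^2 - 2R\rho S_y S_x + R^2 S_x^2 \\
  &\iff 0 < (\rho S_y - R S_x)^2 = S_x^2\,(B - R)^2 .
\end{aligned}
```

The right-hand side is a square, so the inequality fails only when $B = R$, which agrees with the conclusion that the two estimates are equally precise only for a line through the origin.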
Example. The precision of the regression, ratio, and mean per unit estimates from a
simple random sample can be compared by using data collected in the complete enumeration
of peach orchards described on p. 172. In this example, $y_i$ is the estimated peach
production in an orchard and $x_i$ the number of peach trees in the orchard. We will compare
the estimates of the total production of the 256 orchards, made from a sample of 100
orchards. It is doubtful whether the sample is large enough to make the variance formula
fully valid, since the cv's of $\bar{y}$ and $\bar{x}$ are both somewhat higher than 10%, but the example
will serve to illustrate the computations. The basic data are as follows.
$S_y^2 = 6409$, $S_{yx} = 4434$, $S_x^2 = 3898$
$R = 1.270$, $\rho = 0.887$, $n = 100$, $N = 256$
$$V(\hat{Y}_{lr}) = \frac{N(N-n)}{n}S_y^2(1 - \rho^2)$$
$$V(\hat{Y}_R) = \frac{N(N-n)}{n}(S_y^2 + R^2S_x^2 - 2RS_{yx})$$
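The three variances in this example can be evaluated from the basic data quoted above. The sketch below is illustrative only; it computes $\rho^2$ from $S_{yx}$, $S_y^2$, and $S_x^2$ rather than from the rounded $\rho = 0.887$, so the last digits may differ slightly from a hand calculation.

```python
# Basic data quoted in the example (population of 256 orchards, sample of 100).
N, n = 256, 100
S_y2, S_yx, S_x2 = 6409.0, 4434.0, 3898.0
R = 1.270

rho_squared = S_yx ** 2 / (S_y2 * S_x2)
multiplier = N * (N - n) / n          # N(N - n)/n, for estimates of the total

V_regression = multiplier * S_y2 * (1 - rho_squared)
V_ratio = multiplier * (S_y2 + R ** 2 * S_x2 - 2 * R * S_yx)
V_mean_per_unit = multiplier * S_y2

# Precision of each method relative to the mean per unit, in percent.
relative_precision = {
    "regression": 100 * V_mean_per_unit / V_regression,
    "ratio": 100 * V_mean_per_unit / V_ratio,
    "mean per unit": 100.0,
}
```

The regression and ratio variances come out close to each other and far below the mean per unit variance, which reflects how nearly the line of production on number of trees passes through the origin in these data.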
No general analytical results are available on the accuracy of the approximate
formulas (7.25) for $V(\bar{y}_{lr})$ and (7.29) for $v(\bar{y}_{lr})$ in moderate or small samples. The
approximate estimators in (7.25) and (7.29) are
$$V(\bar{y}_{lr}) \doteq \frac{(1-f)}{n}S_y^2(1 - \rho^2)$$ (7.25)
(7.29)
where for fixed $x_i$, the $e_i$ are independently distributed with mean 0, variance
$\sigma_e^2 = \sigma_y^2(1 - \rho^2)$. With this model, Cochran (1942) gave the result that to terms of
order $1/n^2$,
(7.36)
TABLE 7.1
AVERAGE PERCENT UNDERESTIMATION OF THE VARIANCE OF $\bar{y}_{lr}$
                                      n
Estimator                       6     8    12
$V(\bar{y}_{lr})$ in (7.25)            38    34    28
$v(\bar{y}_{lr})$ in (7.29)            48    42    33
(7.40)
Hence the leading term in the bias $-Eb(\bar{x} - \bar{X})$ of $\bar{y}_{lr}$ is the average of
(7.41)
Let $u_i = e_i(x_i - \bar{X})$. By (7.22) its population mean $\bar{U} = 0$. The average value of the
first term in (7.41) may therefore be written
$$-\frac{E(\bar{u} - \bar{U})(\bar{x} - \bar{X})}{S_x^2} = -\frac{(1-f)}{n}\,\frac{E(u_i - \bar{U})(x_i - \bar{X})}{S_x^2}$$ (7.42)
by theorem (2.3) (p. 25) for the average value of a sample covariance in simple
random sampling. This in turn equals (7.38), namely
$$-\frac{(1-f)}{n}\,\frac{Ee_i(x_i - \bar{X})^2}{S_x^2}$$
In the second term in (7.41), $\bar{e}$ is $O(1/\sqrt{n})$ and $(\bar{x} - \bar{X})^2$ is $O(1/n)$, so that this term
is of smaller order than (7.38). Thus (7.38) is the leading term in the bias of $\bar{y}_{lr}$.
This result holds for any $n > 1$ and any sample selected solely by the values of x.
This approach and its generalization to the case of unequal residual variances
were given by Royall (1970). Under this model a purposive sampling plan that
succeeded in making $\bar{x} = \bar{X}$ would minimize $V(\bar{y}_{lr})$ for given n.
Also, for any sample selected solely according to the values of the $x_i$, the usual
least squares estimator
$$s_e'^2 = \sum [(y_i - \bar{y}) - b(x_i - \bar{x})]^2/(n - 2)$$ (7.47)
Then
(7.50)
The two estimates will be considered first in the case in which the $b_h$ and $b$ are
chosen in advance, since their properties are unusually simple in this situation.
From section 7.2, $\bar{y}_{lrh}$ is an unbiased estimate of $\bar{Y}_h$, so that $\bar{y}_{lrs}$ is an unbiased
estimate of $\bar{Y}$. Since sampling is independent in different strata, it follows from
theorem 7.1 that
$$V(\bar{y}_{lrs}) = \sum_h \frac{W_h^2(1 - f_h)}{n_h}(S_{yh}^2 - 2b_hS_{yxh} + b_h^2S_{xh}^2)$$ (7.51)
Theorem 7.2 shows that $V(\bar{y}_{lrs})$ is minimized when $b_h = B_h$, the true regression
coefficient in stratum h. The minimum value of the variance may be written
Turning to the combined estimate with preassigned $b$, (7.50) shows that $\bar{y}_{lrc}$ is
also an unbiased estimate of $\bar{Y}$ in this case. Since $\bar{y}_{lrc}$ is the usual estimate from a
stratified sample for the variate $y_{hi} + b(\bar{X} - x_{hi})$, we may apply theorem 5.3 to this
variate, giving the result
$$V(\bar{y}_{lrc}) = \sum_h \frac{W_h^2(1 - f_h)}{n_h}(S_{yh}^2 - 2bS_{yxh} + b^2S_{xh}^2)$$ (7.53)
The preceding analysis is helpful in indicating the type of sample estimates $b_h$
and $b$ that may be efficient when used in regression estimates. With the separate
estimate, the analysis suggests that we take
$$b_h = \frac{\sum_i (y_{hi} - \bar{y}_h)(x_{hi} - \bar{x}_h)}{\sum_i (x_{hi} - \bar{x}_h)^2}$$ (7.56)
provided that the sample size $n_h$ is large in all strata. To obtain a sample estimate
of variance, substitute
$$s_{y \cdot xh}^2 = \frac{1}{n_h - 2}\left[\sum_i (y_{hi} - \bar{y}_h)^2 - b_h^2\sum_i (x_{hi} - \bar{x}_h)^2\right]$$
(7.58)
$$v(\bar{y}_{lrc}) = \sum_h \frac{W_h^2(1 - f_h)}{n_h(n_h - 1)}\sum_i [(y_{hi} - \bar{y}_h) - b_c(x_{hi} - \bar{x}_h)]^2$$ (7.61)
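The separate and combined regression estimates can be sketched as follows. This is an illustration under stated assumptions: each stratum is a dictionary with hypothetical keys (`W`, `N`, `X_bar`, `y`, `x`), the separate slopes follow (7.56), and the combined slope `b_c` is supplied by the caller because its defining formula does not survive in this part of the text.

```python
def mean(values):
    return sum(values) / len(values)


def slope(y, x):
    """Within-stratum least squares slope b_h, as in (7.56)."""
    yb, xb = mean(y), mean(x)
    sxy = sum((yi - yb) * (xi - xb) for yi, xi in zip(y, x))
    sxx = sum((xi - xb) ** 2 for xi in x)
    return sxy / sxx


def separate_estimate(strata):
    """Sum over strata of W_h [ybar_h + b_h (Xbar_h - xbar_h)]."""
    return sum(
        s["W"] * (mean(s["y"]) + slope(s["y"], s["x"]) * (s["X_bar"] - mean(s["x"])))
        for s in strata
    )


def combined_estimate(strata, b_c):
    """ybar_st + b_c (Xbar - xbar_st) with a single slope b_c."""
    y_st = sum(s["W"] * mean(s["y"]) for s in strata)
    x_st = sum(s["W"] * mean(s["x"]) for s in strata)
    X_bar = sum(s["W"] * s["X_bar"] for s in strata)
    return y_st + b_c * (X_bar - x_st)


def combined_variance(strata, b_c):
    """Sample estimate of variance of the combined estimate, following (7.61)."""
    v = 0.0
    for s in strata:
        n_h = len(s["y"])
        f_h = n_h / s["N"]
        yb, xb = mean(s["y"]), mean(s["x"])
        ss = sum(((yi - yb) - b_c * (xi - xb)) ** 2 for yi, xi in zip(s["y"], s["x"]))
        v += s["W"] ** 2 * (1 - f_h) / (n_h * (n_h - 1)) * ss
    return v


# Hypothetical strata in which y is exactly linear in x with slope 2.
strata = [
    {"W": 0.4, "N": 40, "X_bar": 2.5, "y": [3.0, 5.0, 7.0], "x": [1.0, 2.0, 3.0]},
    {"W": 0.6, "N": 60, "X_bar": 5.0, "y": [9.0, 13.0, 17.0], "x": [2.0, 4.0, 6.0]},
]
y_lrs = separate_estimate(strata)
y_lrc = combined_estimate(strata, b_c=2.0)
v_lrc = combined_variance(strata, b_c=2.0)
```

Because the made-up data lie exactly on lines with a common slope, both estimates coincide and the estimated variance is zero; with real data the two estimates differ whenever the within-stratum slopes differ.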
EXERCISES
7.1 An experienced farmer makes an eye estimate of the weight of peaches $x_i$ on each
tree in an orchard of N = 200 trees. He finds a total weight of X = 11,600 lb. The peaches
are picked and weighed on a simple random sample of 10 trees, with the following results.
Tree Number
1 2 3 4 5 6 7 8 9 10 Total
   Stratum 1          Stratum 2
x_1i    y_1i       x_2i    y_2i
  4       0          5       7
  6       3          6      12
  7       5          8      13
Use the ordinary least squares estimates of the B's, $b_h$ and $b_c$, on pp. 201-2.
7.8 In the population of exercise 7.7, show that if the optimum preassigned B could be
used in each case, $V(\hat{Y}_{lrs}) = 4.39$, $V(\hat{Y}_{lrc}) = 4.43$, both estimates being, of course, unbiased.
7.9 By the same method, compare the MSE's of the separate and combined ratio
estimates in the population in exercise 7.7. Since the ratio Y/X is 8/17 = 0.47 in stratum 1
and 32/19 = 1.68 in stratum 2, large sample theory would suggest that $\hat{Y}_{Rs}$ would be
superior to $\hat{Y}_{Rc}$. You will find, however, that in these tiny samples, $\hat{Y}_{Rc}$ has the smaller
MSE. Its superiority is not due to smaller bias, neither $\hat{Y}_{Rs}$ nor $\hat{Y}_{Rc}$ being materially biased.
As another disagreement with large-sample theory, you will find that $\hat{Y}_{Rs}$ and $\hat{Y}_{Rc}$ have
smaller MSE's than the corresponding regression estimates.
CHAPTER 8
Systematic Sampling
8.1 DESCRIPTION
This method of sampling is at first sight quite different from simple random
sampling. Suppose that the N units in the population are numbered 1 to N in some
order. To select a sample of n units, we take a unit at random from the first k units
and every kth unit thereafter. For instance, if k is 15 and if the first unit drawn is
number 13, the subsequent units are numbers 28, 43, 58, and so on. The selection
of the first unit determines the whole sample. This type is called an every kth
systematic sample.
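The selection rule can be sketched in a few lines. The function names are hypothetical, units are numbered 1 to N as in the text, and the standard library `random` module stands in for whatever random number source is actually used.

```python
import random


def every_kth(N, k, start):
    """Units start, start + k, start + 2k, ... not exceeding N (units numbered 1 to N)."""
    return list(range(start, N + 1, k))


def systematic_sample(N, k, rng=random):
    """Choose the first unit at random from the first k units, then take every kth."""
    return every_kth(N, k, rng.randint(1, k))


# The example in the text: k = 15, first unit number 13.
example = every_kth(60, 15, 13)   # [13, 28, 43, 58]
```

When N is not a multiple of k the sample size depends on the random start: with N = 23 and k = 5, a start of 1 gives five units and a start of 4 gives four.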
The apparent advantages of this method over simple random sampling are as
follows.
1. It is easier to draw a sample and often easier to execute without mistakes.
This is a particular advantage when the drawing is done in the field. Even when
drawing is done in an office there may be a substantial saving in time. For instance,
if the units are described on cards that are all of the same size and lie in a file,
a card can be drawn out every inch along the file as measured by a ruler.
This operation is speedy, whereas simple random sampling would be slow. Of
course, this method departs slightly from the strict "every kth" rule.
2. Intuitively, systematic sampling seems likely to be more precise than simple
random sampling. In effect, it stratifies the population into n strata, which consist
of the first k units, the second k units, and so on. We might therefore expect the
systematic sample to be about as precise as the corresponding stratified random
sample with one unit per stratum. The difference is that with the systematic sample
the units occur at the same relative position in the stratum, whereas with the
stratified random sample the position in the stratum is determined separately by
randomization within each stratum (see Fig. 8.1). The systematic sample is spread
more evenly over the population, and this fact has sometimes made systematic
sampling considerably more precise than stratified random sampling.
One variant of the systematic sample is to choose each unit at or near the center
of the stratum; that is, instead of starting the sequence by a random number
chosen between 1 and k, we take the starting number as (k + 1)/2 if k is odd and
either k/2 or (k + 2)/2 if k is even (Madow, 1953). This procedure carries the idea
of systematic sampling to its logical conclusion. If $y_i$ can be considered a
continuous function of a continuous variable $i$, there are grounds for expecting
that this centrally located sample will be more precise than one randomly located.
Limited investigation on some natural populations supports this opinion,
although centrally located samples tend to behave erratically. Attention here will
be confined to samples with some random element.
TABLE 8.1
THE POSSIBLE SYSTEMATIC SAMPLES FOR N = 23, k = 5
Systematic sample number
 I    II   III   IV    V
 1     2     3    4    5
 6     7     8    9   10
11    12    13   14   15
16    17    18   19   20
21    22    23
TABLE 8.2
COMPOSITION OF THE k SYSTEMATIC SAMPLES
Sample number
1          2          ...   i          ...   k
y_1        y_2        ...   y_i        ...   y_k
y_{k+1}    y_{k+2}    ...   y_{k+i}    ...   y_{2k}
where
is the variance among units that lie within the same systematic sample. The
denominator of this variance, $k(n-1)$, is constructed by the usual rules in the
analysis of variance: each of the k samples contributes $(n-1)$ degrees of freedom
to the sum of squares in the numerator.
Proof. By the usual identity of the analysis of variance
, J
"
N - n 52
V(y) = - - -
N n
From (8 . 1) , V(Y.fY) < V(y) if and only if
1\f - 1 2 k(n-I) , N - n 52
--5 - S- < - - l....- (8.3)
NNw" Nn
that is, if
(8.5)
where $\rho_w$ is the correlation coefficient between pairs of units that are in the same
systematic sample. It is defined as
$$\rho_w = \frac{E(y_{ij} - \bar{Y})(y_{iu} - \bar{Y})}{E(y_{ij} - \bar{Y})^2}$$ (8.6)
where the numerator is averaged over all $kn(n-1)/2$ distinct pairs, and the
denominator over all N values of $y_{ij}$. Since the denominator is $(N-1)S^2/N$, this
gives
$$\rho_w = \frac{2}{(n-1)(N-1)S^2}\sum_{i=1}^k\sum_{j<u}(y_{ij} - \bar{Y})(y_{iu} - \bar{Y})$$
Proof.
$$n^2kV(\bar{y}_{sy}) = n^2\sum_{i=1}^k (\bar{y}_i - \bar{Y})^2 = \sum_{i=1}^k [(y_{i1} - \bar{Y}) + (y_{i2} - \bar{Y}) + \cdots + (y_{in} - \bar{Y})]^2$$
The squared terms amount to the total sum of squares of deviations from $\bar{Y}$, that
is, to $(N-1)S^2$. This gives
$$n^2kV(\bar{y}_{sy}) = (N-1)S^2 + 2\sum_i\sum_{j<u}(y_{ij} - \bar{Y})(y_{iu} - \bar{Y})$$ (8.8)
(8.9)
Hence
$$V(\bar{y}_{sy}) = \frac{S^2}{n}\left(\frac{N-1}{N}\right)[1 + (n-1)\rho_w]$$ (8.10)
This result shows that positive correlation between units in the same sample
inflates the variance of the sample mean. Even a small positive correlation may
have a large effect, because of the multiplier $(n-1)$.
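The identity (8.10) holds for any population with $N = nk$, which can be checked numerically. The sketch below builds a hypothetical population at random (fixed seed), arranges it as k systematic samples of n units, and compares the direct variance of the sample means with the right-hand side of (8.10); the function name is an assumption.

```python
import random


def systematic_variance_check(samples):
    """samples[i] holds the n units of the ith systematic sample; returns (direct, formula, rho_w)."""
    k, n = len(samples), len(samples[0])
    N = k * n
    values = [v for s in samples for v in s]
    Y_bar = sum(values) / N
    S2 = sum((v - Y_bar) ** 2 for v in values) / (N - 1)

    # Direct definition: variance of the k equally likely sample means.
    V_direct = sum((sum(s) / n - Y_bar) ** 2 for s in samples) / k

    # Intrasample correlation from the cross products within each sample.
    cross = sum(
        (s[j] - Y_bar) * (s[u] - Y_bar)
        for s in samples for j in range(n) for u in range(j + 1, n)
    )
    rho_w = 2 * cross / ((n - 1) * (N - 1) * S2)

    V_formula = S2 / n * (N - 1) / N * (1 + (n - 1) * rho_w)
    return V_direct, V_formula, rho_w


rng = random.Random(0)
samples = [[rng.uniform(0, 10) for _ in range(4)] for _ in range(6)]   # k = 6, n = 4
V_direct, V_formula, rho_w = systematic_variance_check(samples)
```

Since the variance cannot be negative, $\rho_w$ can never fall below $-1/(n-1)$; values near that bound correspond to systematic samples whose means are nearly equal.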
The two preceding theorems express $V(\bar{y}_{sy})$ in terms of $S^2$, hence relate it to the
variance for a simple random sample. There is an analogue of theorem 8.2 that
expresses $V(\bar{y}_{sy})$ in terms of the variance for a stratified random sample in which
the strata are composed of the first k units, the second k units, and so on. In our
notation the subscript $j$ in $y_{ij}$ denotes the stratum. The stratum mean is written $\bar{y}_{.j}$.
Theorem 8.3.
$$V(\bar{y}_{sy}) = \frac{S_{wst}^2}{n}\left(\frac{N-n}{N}\right)[1 + (n-1)\rho_{wst}]$$ (8.11)
where
(8.12)
This is the variance among units that lie in the same stratum. The divisor $n(k-1)$
is used because each of the n strata contributes $(k-1)$ degrees of freedom.
Furthermore,
$$\rho_{wst} = \frac{E(y_{ij} - \bar{y}_{.j})(y_{iu} - \bar{y}_{.u})}{E(y_{ij} - \bar{y}_{.j})^2}$$ (8.13)
This quantity is the correlation between the deviations from the stratum means of
pairs of items that are in the same systematic sample.
$$\rho_{wst} = \frac{2}{n(n-1)(k-1)}\sum_{i=1}^k\sum_{j<u}\frac{(y_{ij} - \bar{y}_{.j})(y_{iu} - \bar{y}_{.u})}{S_{wst}^2}$$ (8.14)
$$V(\bar{y}_{st}) = \left(\frac{N-n}{N}\right)\frac{S_{wst}^2}{n}$$ (8.15)
I        0   1   1   2   5   4   7   7   8   6    4.1
II       6   8   9  10  13  12  15  16  16  17   12.2
III     18  19  20  20  24  23  25  28  29  27   23.3
IV      26  30  31  31  33  32  35  37  38  38   33.1
Totals  50  58  61  63  75  71  82  88  91  88   72.7
TABLE 8.4
ANALYSIS OF VARIANCE
df ss ms
the variances of the estimated means from simple random and stratified random samples
are as follows.
$$V_{ran} = \left(\frac{N-n}{N}\right)\frac{S^2}{n} = \frac{9}{10}\cdot\frac{136.25}{4} = 30.66$$
Both stratified random sampling and systematic sampling are much more
effective than simple random sampling but, as anticipated, systematic sampling is
less precise than stratified random sampling.
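The three variances can be recomputed from the 40 values of Table 8.3. In the sketch below the rows are the four strata and the columns are the ten systematic samples; a few digits that are illegible in this copy of the table were restored so that the printed row means and column totals are reproduced, which is an assumption of this illustration.

```python
# Table 8.3: rows are strata I to IV, columns are systematic samples 1 to 10.
strata = [
    [0, 1, 1, 2, 5, 4, 7, 7, 8, 6],
    [6, 8, 9, 10, 13, 12, 15, 16, 16, 17],
    [18, 19, 20, 20, 24, 23, 25, 28, 29, 27],
    [26, 30, 31, 31, 33, 32, 35, 37, 38, 38],
]
n, k = len(strata), len(strata[0])       # n = 4 units per sample, k = 10 samples
N = n * k

values = [v for row in strata for v in row]
Y_bar = sum(values) / N
S2 = sum((v - Y_bar) ** 2 for v in values) / (N - 1)

# Simple random sampling.
V_ran = (N - n) / N * S2 / n

# Stratified random sampling, one unit per stratum.
S2_wst = sum(
    sum((v - sum(row) / k) ** 2 for v in row) for row in strata
) / (n * (k - 1))
V_st = (N - n) / N * S2_wst / n

# Systematic sampling: variance of the k equally likely sample means.
sample_means = [sum(col) / n for col in zip(*strata)]
V_sy = sum((m - Y_bar) ** 2 for m in sample_means) / k
```

The simple random sampling figure agrees with the 30.66 quoted above, and the ordering of the three variances matches the statement that systematic sampling falls between the other two methods here.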
Table 8.5 shows the same data, with the order of the observations reversed in
the second and fourth strata. This has the effect of making $\rho_{wst}$ negative, because it
makes the majority of the cross products between deviations from the strata
means negative for pairs of observations that lie in the same systematic sample. In
the first systematic sample, for instance, the deviations from the strata means are
now -4.1, +4.8, -5.3, +4.9. Of the six products of pairs of deviations, four are
negative. Roughly the same situation applies in every systematic sample.
This change does not affect $V_{ran}$ and $V_{st}$. With systematic sampling, it brings
about a dramatic increase in precision, as is seen when the systematic sample totals
in Table 8.5 are compared with those in Table 8.3. We now have
TABLE 8.5
DATA IN TABLE 8.3, WITH THE ORDER REVERSED IN STRATA II AND IV
                  Systematic sample number               Stratum
Stratum    1   2   3   4   5   6   7   8   9  10          mean
I          0   1   1   2   5   4   7   7   8   6           4.1
II        17  16  16  15  12  13  10   9   8   6          12.2
III       18  19  20  20  24  23  25  28  29  27          23.3
IV        38  38  37  35  32  33  31  31  30  26          33.1
Totals    73  74  74  72  73  73  73  75  75  65          72.7
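The three variances compared in this example can be recomputed directly from the tabulated data. The sketch below is illustrative only: the function name `variances` and the list layout are assumptions of this example, and the stratified variance is taken as ((N - n)/N) S_wst^2 / n with one unit per stratum, where S_wst^2 is the pooled variance within strata.

```python
# Data of Table 8.3: four strata (rows) of k = 10 units each.
# Column u holds the members of the u-th systematic sample.
TABLE_8_3 = [
    [0, 1, 1, 2, 5, 4, 7, 7, 8, 6],
    [6, 8, 9, 10, 13, 12, 15, 16, 16, 17],
    [18, 19, 20, 20, 24, 23, 25, 28, 29, 27],
    [26, 30, 31, 31, 33, 32, 35, 37, 38, 38],
]
# Table 8.5: the same data with the order reversed in strata II and IV.
TABLE_8_5 = [row[::-1] if h in (1, 3) else row for h, row in enumerate(TABLE_8_3)]


def variances(strata):
    """Return (V_ran, V_st, V_sy) for the mean of a sample of size n = number of strata."""
    n, k = len(strata), len(strata[0])
    N = n * k
    units = [y for row in strata for y in row]
    pop_mean = sum(units) / N
    fpc = (N - n) / N

    # Simple random sampling: population variance with divisor N - 1.
    s2 = sum((y - pop_mean) ** 2 for y in units) / (N - 1)
    v_ran = fpc * s2 / n

    # Stratified random sampling, one unit per stratum: pooled within-stratum variance.
    within_ss = sum(sum((y - sum(row) / k) ** 2 for y in row) for row in strata)
    s2_wst = within_ss / (n * (k - 1))
    v_st = fpc * s2_wst / n

    # Systematic sampling: variance of the k possible sample means.
    sample_means = [sum(row[u] for row in strata) / n for u in range(k)]
    v_sy = sum((m - pop_mean) ** 2 for m in sample_means) / k
    return v_ran, v_st, v_sy


v_ran_83, v_st_83, v_sy_83 = variances(TABLE_8_3)
v_ran_85, v_st_85, v_sy_85 = variances(TABLE_8_5)
```

Reversing strata II and IV leaves the first two variances unchanged and alters only the systematic-sample variance, which is the point the example makes.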
Theorem 8.4. Consider all N! finite populations that are formed by the N!
permutations of any set of numbers $y_1, y_2, \ldots, y_N$. Then, on the average over
these finite populations,
$$E(V_{sy}) = V_{ran} \tag{8.16}$$
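For a very small population the statement of theorem 8.4 can be checked by brute force. The following sketch enumerates all N! orderings of six arbitrary illustrative numbers (the values, n = 2 and k = 3 are assumptions of this example) and averages the systematic-sample variance over them, using exact fractions so that the comparison with the simple random sampling variance is exact.

```python
from fractions import Fraction
from itertools import permutations


def v_sy(pop, n, k):
    """Variance of the systematic-sample mean over the k possible starts."""
    pop_mean = Fraction(sum(pop), len(pop))
    means = [Fraction(sum(pop[u + j * k] for j in range(n)), n) for u in range(k)]
    return sum((m - pop_mean) ** 2 for m in means) / k


def v_ran(pop, n):
    """Variance of the mean of a simple random sample of size n."""
    N = len(pop)
    pop_mean = Fraction(sum(pop), N)
    s2 = sum((y - pop_mean) ** 2 for y in pop) / (N - 1)
    return Fraction(N - n, N) * s2 / n


values = [1, 2, 4, 7, 11, 16]  # arbitrary illustrative numbers, N = 6
n, k = 2, 3
perms = list(permutations(values))
avg_v_sy = sum(v_sy(p, n, k) for p in perms) / len(perms)
```

Any single ordering may give a systematic variance above or below the random-sampling value; only the average over all orderings agrees with it.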
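Theorem 8.5. If the variates $y_i$ ($i = 1, 2, \ldots, N$) are drawn at random from a
superpopulation in which
$$\mathcal{E}y_i = \mu, \qquad \mathcal{E}(y_i-\mu)(y_j-\mu) = 0 \quad (i \ne j), \qquad \mathcal{E}(y_i-\mu)^2 = \sigma_i^2$$
Then
$$\mathcal{E}V_{sy} = \mathcal{E}V_{ran} \tag{8.17}$$
The crucial conditions are that all $y_i$ have the same mean $\mu$, that is, there is no
trend, and that no linear correlation exists between the values $y_i$ and $y_j$ at two
different points. The variance $\sigma_i^2$ may change from point to point in the series.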
Proof. For any specific finite population.
Now
Hence
(8.18)
This gives
(8.19)
Turning to $V_{sy}$, let $\bar{y}_u$ denote the mean of the uth systematic sample. For any
specific finite population,
$$V_{sy} = \frac{1}{k}\sum_{u=1}^{k}(\bar{y}_u-\bar{Y})^2 \tag{8.20}$$
$$= \frac{1}{k}\left[\sum_{u=1}^{k}(\bar{y}_u-\mu)^2 - k(\bar{Y}-\mu)^2\right] \tag{8.21}$$
By the theorem for the variance of the mean of an uncorrelated sample from an
infinite population ,
(8.22)
(8.23)
Fig. 8.2 Systematic sampling in a population with linear trend. (x = systematic sample; o = stratified random sample.)
all strata, whereas stratified random sampling gives an opportunity for within-
stratum errors to cancel.
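To examine the effects mathematically, we may assume that $y_i = i$. We have
$$\sum_{i=1}^{N} i = \frac{N(N+1)}{2}, \qquad \sum_{i=1}^{N} i^2 = \frac{N(N+1)(2N+1)}{6}$$
The population variance $S^2$ is given by
$$S^2 = \frac{1}{N-1}\left(\sum y_i^2 - N\bar{Y}^2\right)$$
This gives
$$V_{sy} = \frac{1}{k}\sum_{u=1}^{k}(\bar{y}_u-\bar{Y})^2 = \frac{k^2-1}{12} \tag{8.27}$$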
Equality occurs only when n = 1. Thus, for removing the effect of a linear trend,
suspected or unsuspected, the systematic sample is much more effective than the
simple random sample but less effective than the stratified random sample.
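The ordering of the three methods under a pure linear trend can be confirmed numerically. The sketch below is an illustration under stated assumptions: the population is $y_i = i$ with N = nk, the strata are blocks of k consecutive units with one unit drawn per stratum, and the function name is hypothetical. Exact fractions are used so that the systematic variance can be compared with $(k^2 - 1)/12$ without rounding.

```python
from fractions import Fraction


def linear_trend_variances(n, k):
    """Return (V_st, V_sy, V_ran) for the population y_i = i, i = 1..nk (k >= 2)."""
    N = n * k
    pop = list(range(1, N + 1))
    pop_mean = Fraction(N + 1, 2)
    fpc = Fraction(N - n, N)

    # Simple random sample of size n.
    s2 = sum((y - pop_mean) ** 2 for y in pop) / (N - 1)
    v_ran = fpc * s2 / n

    # Stratified random sample: n strata of k consecutive units, one unit each.
    within_ss = Fraction(0)
    for h in range(n):
        stratum = pop[h * k:(h + 1) * k]
        stratum_mean = Fraction(sum(stratum), k)
        within_ss += sum((y - stratum_mean) ** 2 for y in stratum)
    v_st = fpc * (within_ss / (n * (k - 1))) / n

    # Systematic sample: every k-th unit from each of the k possible starts.
    means = [Fraction(sum(pop[u::k]), n) for u in range(k)]
    v_sy = sum((m - pop_mean) ** 2 for m in means) / k
    return v_st, v_sy, v_ran


v_st, v_sy, v_ran = linear_trend_variances(n=4, k=10)
```

With n = 4 and k = 10 the systematic variance does not depend on n at all, which is why it falls behind the stratified sample as n grows.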
the + sign being used for the first member, the - sign for the last. For any i, the two
weights obviously add to 2. The reader may verify that if the population consists of
a linear trend and N = nk the weighted sample mean gives the correct population
mean. The performance of these end corrections has been examined by Yates
(1948), to whom they are due.
Bellhouse and Rao (1975) have extended the Yates correction to the case
$N \ne nk$ when the systematic sample is drawn by Lahiri's circular method (section
8.1), which guarantees constant n. As before, the weights different from 1 are
applied to the first and last sample numbers in the original serial order of the
population. For example, if the starting random number in drawing the sample is
19 with N = 23, n = 5, units 19, 1, 6, 11, 16 constituting the sample, the first and
last members are $y_1$ and $y_{19}$. Two cases arise.
Case 1. Small i for which $i + (n-1)k \le N$, so that the n units are obtained
without passing over $y_N$. The weights for the first (+) and last (-) members are
$$1 \pm \frac{n[2i+(n-1)k-(N+1)]}{2(n-1)k} \tag{8.30}$$
Case 2. $i + (n-1)k > N$. Let $n_2$ be the number of sample units obtained after
passing over $y_N$. Thus, with i = 19, $n_2 = 4$. The weights for the first (+) and last (-)
members are
In both cases the internal sample members receive weight 1 in the sample total.
With N = 23, n = k = 5, i = 19, $n_2 = 4$, the first and last weights are 1 ± (-7/18).
Hence $y_1$ receives a weight 11/18, while $y_{19}$ receives 25/18.
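A quick way to see what the end corrections achieve is to apply them to a pure linear trend, $y_i = i$, for which the population mean is (N + 1)/2. The sketch below is illustrative: the helper names are assumptions of this example, `case1_weights` implements only the Case 1 expression (8.30), and the Case 2 check simply reuses the weights 11/18 and 25/18 quoted in the text for the sample 19, 1, 6, 11, 16.

```python
from fractions import Fraction


def case1_weights(i, n, k, N):
    """Weights for the first and last sample members under Case 1, expression (8.30)."""
    if i + (n - 1) * k > N:
        raise ValueError("Case 1 requires i + (n - 1)k <= N")
    w = Fraction(n * (2 * i + (n - 1) * k - (N + 1)), 2 * (n - 1) * k)
    return 1 + w, 1 - w  # (+) for the first member, (-) for the last


def weighted_mean(values, first_weight, last_weight):
    """Weighted sample mean; values are in serial order and internal weights are 1."""
    total = first_weight * values[0] + sum(values[1:-1]) + last_weight * values[-1]
    return Fraction(total) / len(values)


N, n, k = 23, 5, 5

# Case 1: starts i = 1, 2, 3 do not pass over unit N.
case1_means = []
for i in (1, 2, 3):
    units = [i + j * k for j in range(n)]  # with y_i = i the values equal the unit numbers
    first_w, last_w = case1_weights(i, n, k, N)
    case1_means.append(weighted_mean(units, first_w, last_w))

# Case 2: the worked example with start 19; serial order of the sample is 1, 6, 11, 16, 19.
case2_mean = weighted_mean([1, 6, 11, 16, 19], Fraction(11, 18), Fraction(25, 18))
```

In every case the weighted mean comes out at 12 = (N + 1)/2, the population mean of the linear trend.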
Two alternative methods attempt to change the method of sample selection so
that the sample mean is unaffected by a linear trend. With N = nk and n even, a
method suggested by Sethi (1965) divides the population into n/2 strata of size 2k,
choosing two units equidistant from the end of each stratum. With starting
random number i, the n/2 pairs of units are those numbered
$$[i + 2jk,\ 2(j+1)k - i + 1], \qquad j = 0, 1, 2, \ldots, \tfrac{1}{2}n - 1 \tag{8.32}$$
This selection removes the effect of a linear trend in any stratum of 2k units,
even if the linear slope varies from stratum to stratum. Murthy (1967) has called
the method balanced systematic sampling.
The modified method of Singh et al. (1968) chooses pairs of units equidistant
from the ends of the population. With n even, the n/2 equidistant pairs that start
with unit i (i = 1, 2, ..., k) are
$$[i + jk,\ (N - jk) - i + 1], \qquad j = 0, 1, 2, \ldots, \tfrac{1}{2}n - 1 \tag{8.33}$$
With n odd in these methods, j goes up to $\tfrac{1}{2}(n-1) - 1$ in (8.32) and (8.33). The
balanced method (8.32) adds the remaining sample member near the end at
$[i + (n-1)k]$; the modified method near the middle at $[i + \tfrac{1}{2}(n-1)k]$. The effect of
a linear trend is not completely eliminated in $\bar{y}$ for n odd.
Comparisons of the performances of these two methods with Yates' corrections
and with ordinary systematic sampling have been made on superpopulation
models representing linear and parabolic trends, periodic and autocorrelated
variation (Bellhouse and Rao, 1975), and on a few small natural populations by
these authors and by Singh (personal communication). In general the three
methods (Yates, balanced, modified) performed similarly, being superior to
ordinary systematic sampling in the presence of a linear or parabolic trend.
The population in Table 8.3, p. 211, for example, is one on which these methods
should perform very well. Ordinary systematic sampling gave $V_{sy} = 11.63$.
Comparable variances for the other methods (n = 4, k = 10) are: Yates, 1.29; Sethi
(balanced), 0.46; Singh (modified), 0.34. The balanced method happens to be that
obtained in Table 8.5 by reversal of strata II and IV in Table 8.3.
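The balanced and modified selections can be tried on the 40 units of Table 8.3 taken in serial order (stratum I first). The sketch below is a minimal illustration for n even: the function names are assumptions of this example, and the unit numbers follow expressions (8.32) and (8.33). The Yates figure is not reproduced here.

```python
# Table 8.3 in serial order: units 1-10 are stratum I, 11-20 stratum II, and so on.
POPULATION = [
    0, 1, 1, 2, 5, 4, 7, 7, 8, 6,
    6, 8, 9, 10, 13, 12, 15, 16, 16, 17,
    18, 19, 20, 20, 24, 23, 25, 28, 29, 27,
    26, 30, 31, 31, 33, 32, 35, 37, 38, 38,
]


def balanced_units(i, n, k):
    """Unit numbers (1-based) of the balanced systematic sample, expression (8.32)."""
    units = []
    for j in range(n // 2):
        units += [i + 2 * j * k, 2 * (j + 1) * k - i + 1]
    return units


def modified_units(i, n, k):
    """Unit numbers (1-based) of the modified systematic sample, expression (8.33)."""
    N = n * k
    units = []
    for j in range(n // 2):
        units += [i + j * k, (N - j * k) - i + 1]
    return units


def variance_of_mean(pop, selector, n, k):
    """Average squared error of the sample mean over the k possible starts."""
    pop_mean = sum(pop) / len(pop)
    means = [sum(pop[u - 1] for u in selector(i, n, k)) / n for i in range(1, k + 1)]
    return sum((m - pop_mean) ** 2 for m in means) / k


v_balanced = variance_of_mean(POPULATION, balanced_units, n=4, k=10)
v_modified = variance_of_mean(POPULATION, modified_units, n=4, k=10)
```

For start i = 1 the balanced method picks units 1, 20, 21 and 40, which is the first systematic sample of Table 8.5.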
seen pictorially in Fig. 8.3. In this representation the height of the curve is the
observation $y_i$. The sample points A represent the case least favorable to the
systematic sample. This case holds whenever k is equal to the period of the sine
curve or is an integral multiple of the period. Every observation within the
systematic sample is exactly the same, so that the sample is no more precise than a
single observation taken at random from the population.
The most favorable case (sample B) occurs when k is an odd multiple of the
half-period. Every systematic sample has a mean exactly equal to the true
population mean, since successive deviations above and below the middle line
cancel. The sampling variance of the mean is therefore zero. Between these two
cases the sample has various degrees of effectiveness, depending on the relation
between k and the wavelength.
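The two extreme cases are easy to reproduce numerically. The sketch below is illustrative only: it assumes a population $y_i = \sin(2\pi i/P)$ covering a whole number of periods, an even sample size n, and a hypothetical function name.

```python
import math


def v_sy_sine(period, k, n):
    """Variance of the systematic-sample mean for y_i = sin(2*pi*i/period), N = n*k."""
    N = n * k
    pop = [math.sin(2 * math.pi * i / period) for i in range(1, N + 1)]
    pop_mean = sum(pop) / N
    means = [sum(pop[u::k]) / n for u in range(k)]
    return sum((m - pop_mean) ** 2 for m in means) / k


# Case A: k equal to the period; every member of a sample has the same value.
worst = v_sy_sine(period=20, k=20, n=4)

# Case B: k an odd multiple of the half-period; deviations cancel in pairs.
best_1 = v_sy_sine(period=20, k=10, n=4)
best_3 = v_sy_sine(period=20, k=30, n=4)
```

In case A the variance equals the variance of a single observation (0.5 for a unit sine wave); in case B it is zero apart from floating-point rounding.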
Populations that exhibit an exact sine curve are not likely to be encountered in
practice. Populations with a more or less definite periodic trend are, however, not
uncommon. Examples are the flow of road traffic past a point on a road over 24
hours of the day and store sales over seven days of the week. For estimating an
average over a time period, a systematic sample daily at 4 p.m. or every Tuesday
would obviously be unwise. Instead, the strategy is to stagger the sample over the
periodic curve, for example, by seeing that every weekday is equally represented
in the case of store sales.
Some populations have a kind of periodic effect that is less obvious. A series of
weekly payrolls in a small sector of a factory may always list the workers in the
same order and may contain between 19 and 23 names every week. A systematic
sample of 1 in 20 names over a period of weeks might consist mainly of the records
of one worker or of the records of two or three workers. Similarly, a systematic
sample of names from a city directory might contain too many heads of
households, or too many children. If there is time to study the periodic structure, a
systematic sample can usually be designed to capitalize on it. Failing this, a simple
or stratified random sample is preferable when a periodic effect is suspected but
not well known.
In some natural populations quasiperiodic variation may be present that would
be difficult to anticipate. L. H. Madow (1946) found evidence pointing this way in
a bed of hardwood seedling stock in a rather small population (N = 420). Finney
(1950) discussed a similar phenomenon in timber volume per strip in the Dehra
Dun forest, although in a reexamination of the data Milne (1959) suggested that
the apparent periodicity might have been produced by the process of measure-
ment. The effect of quasiperiodicity is that systematic sampling performs poorly at
some values of n and particularly well for others. Whether this effect occurs
frequently is not known. Matern (1960) cites examples in which natural forces
(e.g., tides) might produce a spatial periodic variation, but he is of the opinion that
no clear case has been found in forest surveys.
For this class of populations it is easy to show that stratified random sampling is
superior to simple random sampling, but no general result can be established
about systematic sampling. Within the class there are superpopulations in which
systematic sampling is superior to stratified random sampling, but there are also
superpopulations in which systematic sampling is inferior to simple random
sampling for certain values of k.
A general theorem can be obtained if it is further assumed that the correlogram
is concave upwards.
Theorem 8.6. If, in addition to conditions (8.34), we have
$$\delta_u^2 = \rho_{u+1} + \rho_{u-1} - 2\rho_u \ge 0 \qquad [u = 2, 3, \ldots, (kn-2)] \tag{8.35}$$
then
$$\mathcal{E}V_{sy} \le \mathcal{E}V_{st} \le \mathcal{E}V_{ran} \tag{8.36}$$
for any size of sample. Furthermore, unless $\delta_u^2 = 0$, u = 2, 3, ..., (kn - 2),
$$\mathcal{E}V_{sy} < \mathcal{E}V_{st} \tag{8.37}$$
A proof has been given by Cochran (1946).
A sketch of the argument for n = 2 illustrates the role played by the "concave
upwards" condition. In the systematic sample the members of the pair are always
k units apart. Hence
$$\mathcal{E}V(\bar{y}_{sy}) = \tfrac{1}{4}(\sigma^2 + \sigma^2 + 2\rho_k\sigma^2) = \tfrac{1}{2}\sigma^2(1+\rho_k) \tag{8.38}$$
With the stratified sample, there are k possible positions for the unit drawn from
each stratum, making $k^2$ combinations of positions. The numbers of combinations
1, 2, ..., (2k - 1) units apart are as follows.
Distance                 1   2   ...   (k - 1)   k   (k + 1)   ...   (2k - 1)   Total
Number of combinations   1   2   ...   (k - 1)   k   (k - 1)   ...   1          k^2
hence
(8.41)
But if
$$\rho_{u+1} + \rho_{u-1} \ge 2\rho_u \qquad (u = 2, 3, \ldots)$$
it is easy to show that every term inside the brackets is positive. This completes
the proof. In short, the average distance apart is k for both the systematic
and the stratified sample, but because of the concavity the stratified sample
loses more in precision when the distance is less than k than it gains when the
distance exceeds k .
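The n = 2 argument can be illustrated numerically. The sketch below is an assumption-laden illustration, not the book's derivation: it ignores the fpc, takes $\tfrac{1}{2}\sigma^2(1 + \rho_d)$ as the variance of the mean of two units d apart, and for the stratified sample averages this over the $k^2$ equally likely pairs of positions, a distance d occurring k - |d - k| times. The function names and the example correlograms are hypothetical choices.

```python
import math


def ev_sy(rho, k, sigma2=1.0):
    """Average variance of the systematic mean for n = 2: the pair is always k apart."""
    return 0.5 * sigma2 * (1 + rho(k))


def ev_st(rho, k, sigma2=1.0):
    """Average variance of the stratified mean for n = 2 (one unit per stratum of size k)."""
    avg_rho = sum((k - abs(d - k)) * rho(d) for d in range(1, 2 * k)) / k ** 2
    return 0.5 * sigma2 * (1 + avg_rho)


def exponential(u):
    """Concave-upward correlogram rho_u = exp(-0.3 u); the rate 0.3 is arbitrary."""
    return math.exp(-0.3 * u)


def linear(u):
    """Linear correlogram rho_u = (l - u)/l with l = 100, for which delta_u^2 = 0."""
    return (100 - u) / 100


k = 5
gap_exponential = ev_st(exponential, k) - ev_sy(exponential, k)
gap_linear = ev_st(linear, k) - ev_sy(linear, k)
```

With the strictly concave-upward correlogram the stratified sample has the larger average variance; with the linear correlogram the two coincide, matching the equality case of the theorem.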
Quenouille (1949) has shown that the inequalities in theorem 8.6 remain valid
when two of the conditions are relaxed so that
In this event each of the three average variances is increased by the same amount.
As far as practical applications are concerned, correlograms that are concave
upward have been proposed by several writers as models for specific natural
populations. The function $\rho_u = \tanh(u^{-3/5})$ was suggested by Fisher and Mackenzie
(1922) for the correlation between the weekly rainfall at two weather stations
that are a distance u apart; the function $\rho_u = e^{-\lambda u}$ by Osborne (1942) and Matern
(1947) for forestry and land use surveys; and the function $\rho_u = (l - u)/l$ by Wold
(1938) for certain types of economic time series.
TABLE 8.6
NATURAL POPULATIONS USED IN STUDIES OF SYSTEMATIC SAMPLING
Reference     N     Type of Data
In the papers by Yates and Finney comparisons are given for a range of values of
n and k within each finite population. In these cases the data in Table 8.7 are the
geometric means of the variance ratios for the individual values of k. The other
writers make computations for only one value of k per population but may give
data for different items or for several populations of the same natural type. Here,
again, geometric means of the variance ratios were taken.
Although the data are limited in extent, the results are impressive. In the studies
that permit comparison with $V_{st1}$, systematic sampling shows a consistent gain in
precision which, although modest, is worth having. The median of the ratios
$V_{st1}/V_{sy}$ is 1.4. The gains in comparison with $V_{st2}$ are substantial, the median ratio
being 1.9.
TABLE 8.7
RELATIVE PRECISION OF SYSTEMATIC AND STRATIFIED RANDOM SAMPLING
                          Relative Precision of
                          Systematic to Stratified
Data     Range of k     $V_{st1}/V_{sy}$     $V_{st2}/V_{sy}$
The internal trend of the results agrees with expectations, although not too
much should be made of this in view of the small number of studies. The gains are
largest for the types of data in which we would guess that variation would be
nearest to continuous. The decline in $V_{st1}/V_{sy}$ from soil to air temperatures would
also be anticipated from this viewpoint. In the last three items (forest nursery
data) , the only one showing no gain is coniferous transplant stock, which is older
and more uniform than seedling stock.
If i = 1 is chosen as the first member, all members of the systematic sample have
the value (m + a). For the other three possible choices of i, all members have the
values m, (m - a), or m, respectively. Thus from a single sample we have no means
of estimating the value of a. But the true sampling variance of the mean of the
systematic sample is $a^2/2$. The illustration shows that it is impossible to construct
an estimated variance that is unbiased if periodic variation is present.
These results do not mean that nothing can be done. Excluding the case of
periodic variation, we might know enough about the structure of the population to
be able to develop a mathematical model that adequately represents the type of
variation present. We might then be able to manufacture a formula for the
estimated variance that is approximately unbiased for this model, although it may
be badly biased for other models. The decision to use one of these models must
rest on the judgment of the sampler.
Some simple models with their corresponding estimated variances are presented
below. No proofs are given.
The simplest models apply to populations in which $y_i$ is composed of a trend
plus a "random" component. Thus
$$y_i = \mu_i + e_i$$
where $\mu_i$ is some function of i. For the random component, we assume that there is
a superpopulation in which
$$\mathcal{E}(e_i) = 0, \qquad \mathcal{E}(e_i^2) = \sigma_i^2, \qquad \mathcal{E}(e_i e_j) = 0 \quad (i \ne j)$$
A proposed formula $s_{sy}^2$ for the estimated variance is called unbiased if
$$\mathcal{E}E(s_{sy}^2) = \mathcal{E}V_{sy}$$
that is, if it is unbiased over all finite populations that can be drawn from the
superpopulation.
This case applies when we are confident that the order is essentially random with
respect to the items being measured. The variance formula is the same as that for a
simple random sample and is unbiased if the model is correct.
Stratification Effects Only
$\mu_i$ constant $(rk + 1 \le i \le rk + k)$
(8.44)
In this case the mean is constant within each stratum of k units. The estimate $s_{sy2}^2$,
which is based on the mean square successive difference, is not unbiased. It
contains an unwanted contribution from the difference between $\mu$'s in neighboring
strata, and the first and last strata carry too little weight in estimating the
random component of the variance. With a reasonably large sample, this estimate
would in general be too high, assuming that the model is correct.
Linear Trend
$$\mu_i = \mu + \beta i$$
$$s_{sy3}^2 = \frac{N-n}{N}\,\frac{n'}{n^2}\,\frac{\sum (y_i - 2y_{i+k} + y_{i+2k})^2}{6(n-2)} \qquad (1 \le i \le n-2) \tag{8.45}$$
The estimate is based on successive quadratic terms in the sequence $y_i$. The sum of
squares contains (n - 2) terms. With a linear trend we have seen (section 8.7) that
the trend can be eliminated by the use of end corrections. The term $n'/n^2$ is the
sum of squares of the weights in $\bar{y}_w$. Unless n is small, $n'/n^2$ can be replaced by
the usual factor 1/n. Because the strata at the ends receive too little weight, the
estimate is biased unless $\sigma_i^2$ is constant, but it should be satisfactory if n is large
and the model is correct.
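As a rough illustration of the linear-trend estimate, the sketch below computes it from the n values of a single systematic sample taken in serial order. It is a minimal sketch under stated assumptions: the factor $n'/n^2$ is replaced by 1/n, as the text allows unless n is small, and the function name and example samples are hypothetical.

```python
def s2_sy3(sample, N):
    """Variance estimate based on successive second differences of the sample.

    `sample` holds the n sample values in serial order; N is the population size.
    The end-correction factor n'/n^2 is approximated by 1/n.
    """
    n = len(sample)
    if n < 3:
        raise ValueError("at least three sample values are needed")
    # Each term has coefficients 1, -2, 1, whose squares sum to 6.
    second_diffs = [sample[i] - 2 * sample[i + 1] + sample[i + 2] for i in range(n - 2)]
    sum_sq = sum(d * d for d in second_diffs)
    return (N - n) / N * (1 / n) * sum_sq / (6 * (n - 2))


exact_trend = s2_sy3([3, 5, 7, 9, 11], N=50)   # values lying exactly on a line
curved = s2_sy3([1, 2, 4, 7, 11], N=50)        # second differences all equal to 1
```

Values lying exactly on a straight line give a zero estimate, since every second difference vanishes; any curvature or random scatter contributes to the sum of squares.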
If continuous variation of a more complex type is present, the preceding
formulas may give poor results. In Table 8.8 the second and third formulas are
applied to six forest nursery beds (Johnson, 1943). The quadratic formula is
slightly better than that based on successive differences, but both give consistently
serious overestimates.
TABLE 8.8
VARIANCES OF SAMPLE MEAN NUMBERS OF SEEDLINGS (JOHNSON'S DATA)
Bed     Actual $V_{sy}$     $s_{sy2}^2$     $s_{sy3}^2$
The factor 7.5 is the sum of squares of the coefficients in any $d_u$, and g is the
number of differences that the sample provides ($g \doteq n/9$). In natural populations
that Yates examined, a formula of this type was superior to $s_{sy2}^2$ based on
successive differences but still overestimated $V(\bar{y}_{sy})$.
In summary, there is no dearth of formulas for the estimated variance, but all
appear to have limited applicability.
With N = nk, suppose that n is divisible by an integer m (say 10). The following
method uses systematic sampling in part and provides an unbiased sample
estimate of $V(\bar{y}_{sy})$ based on (m - 1) degrees of freedom. Draw a simple random
sample of size m from the units numbered 1 to mk. For every unit in this sample,
take also every (mk)th thereafter. In effect, this method divides the population
into mk clusters each of size N/mk = n/m, and chooses m clusters at random, so
that we have a simple random sample of m clusters. For instance, suppose that we
want a 20% sample from a population with N = 2400, so that n = 480, k = 5. Take
a simple random sample of size m = 10 from units numbered 1 to mk = 50, and
take every 50th unit thereafter. We then have 10 cluster samples each of size
2400/50 = 48.
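The selection procedure just described can be sketched in a few lines. The code below is illustrative only: the function names are hypothetical, N is assumed to be a multiple of mk, and the variance estimate uses the usual simple-random-sampling formula applied to the m cluster means (m clusters drawn from mk, so the sampling fraction is 1/k), which is an assumption of this sketch rather than a formula quoted from the text.

```python
import random
from statistics import mean, variance


def multi_start_sample(N, k, m, rng):
    """Draw m random starts from 1..mk and take every (mk)-th unit after each start."""
    interval = m * k
    starts = sorted(rng.sample(range(1, interval + 1), m))
    return [list(range(start, N + 1, interval)) for start in starts]


def estimate_mean_and_variance(y, clusters, k):
    """Estimate the population mean and its variance from the m cluster means.

    `y` maps a unit number to its value. The variance estimate has m - 1
    degrees of freedom.
    """
    cluster_means = [mean(y[u] for u in cluster) for cluster in clusters]
    m = len(cluster_means)
    f = 1 / k  # m clusters chosen out of mk
    return mean(cluster_means), (1 - f) * variance(cluster_means) / m


# The example in the text: N = 2400, k = 5, m = 10, so every 50th unit is taken.
clusters = multi_start_sample(N=2400, k=5, m=10, rng=random.Random(1))
artificial_y = {u: float(u % 7) for u in range(1, 2401)}  # made-up values for illustration
estimate, est_variance = estimate_mean_and_variance(artificial_y, clusters, k=5)
```

Each of the 10 clusters contains 48 units, giving the required sample of 480.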
Gautschi (1957) has examined the accuracy of this method under the population
structures considered in this chapter. As might be anticipated, the accuracy
lies between those of simple random sampling and of systematic sampling with
m = 1.
covered by forest or by water on a map, Matern found the square grid superior to
the random method in two examples.
Figure 8.4b shows an alternative systematic sample, called an unaligned
sample. The coordinates of the upper left unit are selected first by a pair of random
numbers. Two additional random numbers determine the horizontal coordinates
of the remaining two units in the first column of strata. Another two are needed to
fix the vertical coordinates of the remaining units in the first row of strata. The
constant interval k (equal to the sides of the squares) then fixes the locations of all
points. Investigations by Quenouille (1949) and Das (1950) for simple
two-dimensional correlograms indicate that the unaligned pattern will often be
superior both to the square grid and to stratified random sampling.
Fig. 8.4 (a) Aligned or "square grid" sample. (b) Unaligned sample.
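The construction of the unaligned sample can be sketched as follows. This is an illustrative reading of the description, with hypothetical names: one random horizontal offset is drawn for each row of strata and one random vertical offset for each column of strata, so a grid of 3 x 3 strata uses six random numbers, as in the text.

```python
import random


def unaligned_sample(n_rows, n_cols, k, rng):
    """Return one (x, y) point per square stratum of side k.

    x_offsets[r] is the horizontal position used throughout row r of strata;
    y_offsets[c] is the vertical position used throughout column c of strata.
    Points are listed row by row.
    """
    x_offsets = [rng.randrange(k) for _ in range(n_rows)]
    y_offsets = [rng.randrange(k) for _ in range(n_cols)]
    return [
        (c * k + x_offsets[r], r * k + y_offsets[c])
        for r in range(n_rows)
        for c in range(n_cols)
    ]


points = unaligned_sample(n_rows=3, n_cols=3, k=10, rng=random.Random(0))
```

Within a row of strata the points are a constant interval k apart horizontally, and within a column they are k apart vertically, but points in different rows (or columns) are in general not aligned with one another.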
Further evidence of the superiority of an unaligned sample is obtained from
experience in experimental design, in which the latin square has been found a
precise method for arranging treatments in a rectangular field. The 5 x 5 latin
square in Fig. 8.5a may be regarded as a division of the field into five systematic
samples, one for each letter. There is some evidence that this particular square,
which is called the "knight's move" latin square, is slightly more precise than a
randomly chosen 5 x 5 square, probably because alignment is absent in the
diagonals as well as in rows and columns.
The principle of the latin square has been used by Homeyer and Black (1946) in
sampling rectangular fields of oats. Each field contained 21 plots. The three
possible systematic samples are denoted by the letters A, B, and C, respectively, in
Fig. 8.5b. This arrangement, with one of the letters chosen at random in each field,
gave an increase in precision of around 25% over stratified random sampling with
rows as strata. The arrangement does not quite satisfy the latin square property
because each letter appears three times in one column and twice in the other
columns, but it approaches this property as nearly as possible.
A B C D E        A B C
D E A B C        B C A
B C D E A        C A B
E A B C D        A B C
C D E A B        B C A
                 C A B
                 A B C
(a) "Knight's move" latin square     (b) Systematic design for a 3 x 7 rectangular field
Fig. 8.5 Two systematic designs based on the latin square.
Yates (1960), who terms arrangements of this type lattice sampling, discusses
their use in two- and three-dimensional sampling. In three dimensions each row,
column, and vertical level can be represented in the sample by choosing p units out
of the $p^3$ in the population. With $p^2$ units in the sample, each of the $p^2$
combinations of levels of rows and columns, of rows and vertical heights, and of
columns and vertical heights can be represented. Patterson (1954) has investigated
the arrangements that provide an unbiased estimate of error.
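The two designs of Fig. 8.5 follow simple cyclic rules, which makes their properties easy to check. The sketch below is illustrative: the generating rules (a shift of 3 letters per row for the 5 x 5 square, a shift of 1 per row for the 3 x 7 field) are inferred from the printed arrangements, and the function names are hypothetical.

```python
def knights_move_square(p=5, step=3):
    """5 x 5 "knight's move" latin square: each row is the previous one shifted by `step`."""
    return [[chr(ord("A") + (c + step * r) % p) for c in range(p)] for r in range(p)]


def rectangular_design(rows=7, cols=3):
    """Systematic design for a field of 7 rows and 3 columns, letters A, B, C."""
    return [["ABC"[(r + c) % 3] for c in range(cols)] for r in range(rows)]


square = knights_move_square()
main_diagonal = [square[r][r] for r in range(5)]
anti_diagonal = [square[r][4 - r] for r in range(5)]

design = rectangular_design()
a_counts_by_column = [sum(row[c] == "A" for row in design) for c in range(3)]
```

Every letter of the 5 x 5 square appears once in each row, each column and each of the two main diagonals; in the 3 x 7 design a letter appears three times in one column and twice in each of the others.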
[Table for exercise 8.1 (p. 230): numbers of seedlings for each foot of bed in a bed 200 ft long. The tabulated values are illegible in this copy.]
this type is being made, an occasional check on the sampling errors may be
sufficient. Yates (1948) has shown how this may be done by taking supplementary
observations.
EXERCISES
8.1 The data in the table (p. 230) are the numbers of seedlings for each foot of bed in a
bed 200 ft long.
Find the variance of the mean of a systematic sample consisting of every twentieth foot.
Compare this with the variances for (a) a simple random sample, (b) a stratified random
sample with two units per stratum, (c) a stratified random sample with one unit per stratum.
All samples have n = 10. [$\sum (y_i - \bar{Y})^2 = 23{,}601$.]
8.2 A population of 360 households (numbered 1 to 360) in Baltimore is arranged
alphabetically in a file by the surname of the head of the household. Households in which
the head is nonwhite occur at the following numbers: 28, 31-33, 36-41, 44, 45, 47, 55, 56,
58, 68, 69, 82, 83, 85, 86, 89-94, 98, 99, 101, 107-110, 114, 154, 156, 178, 223, 224, 296,
298-300, 302-304, 306-323, 325-331, 333, 335-339, 341, 342. (The nonwhite
households show some "clumping" because of an association between surname and color.)
Compare the precision of a 1-in-8 systematic sample with a simple random sample of the
same size for estimating the proportion of households in which the head is nonwhite.
8.3 A neighborhood contains three compact communities, consisting, respectively, of
people of Anglo-Saxon, Polish, and Italian descent. There is an up-to-date directory. In it
the persons in a house are listed in the following order: husband, wife, children (by age),
others. Houses are listed in order along streets. The average number of persons per house is
five.
The choice is between a systematic sample of every fifth person in the directory and a
20% simple random sample. For which of the following variables do you expect the
systematic sample to be more precise? (a) Proportion of people of Polish descent, (b)
proportion of males, (c) proportion of children. Give reasons.
8.4 In a directory of 13 houses on a street the persons are listed as follows: M = male
adult, F = female adult, m = male child, f = female child.
Household
1 2 3 4 5 6 7 8 9 10 11 12 13
M M M M M M M M M M M M M
F F F F F F F F F F F F F
f
m
.rm m m
m
f
m
f m m m f
m
f
f f f f
f f f m
Compare the variances given by a systematic sample of one in five persons and a 20%
simple random sample for estimating (a) the proportion of males, (b) the proportion of
children, (c) the proportion of persons living in professional households (households 1, 2, 3,
12, and 13 are described as professional). Do the results support your answers to exercise
8.3? For the systematic sample, number down each column, then go to the top of the next
column.
8.5 In exercise 8.1 we might estimate $V(\bar{y}_{sy})$ by (a) regarding each systematic sample as
a simple random sample, (b) pretending that each 1-in-20 systematic sample is composed
of two 1-in-40 systematic samples with a separate random start. For each method, compare
the average of the estimated variances with the actual variance of $\bar{y}_{sy}$.
8.6 In a population consisting of a linear trend (section 8.6) show that a systematic
sample is less precise than a stratified random sample with strata of size 2k and two units
per stratum if n > (4k + 2)/(k + 1).
8.7 A two-dimensional population with a linear trend may be represented by the
relation
$$y_{ij} = i + j \qquad (i, j = 1, 2, \ldots, nk)$$
where $y_{ij}$ is the value in the ith row and jth column. The population contains $N^2 = n^2k^2$
units.
A systematic square grid sample is selected by drawing at random two independent
starting coordinates $i_0$, $j_0$, each between 1 and k. The sample, of size $n^2$, contains all units
whose coordinates are of the form
$$i_0 + \gamma k, \qquad j_0 + \delta k$$
where $\gamma$, $\delta$ are any two integers between 0 and (n - 1), inclusive.
Show that the mean of this sample has the same precision as the mean of a simple random
sample of size $n^2$.
8.8 If the comparison in exercise 8.7 were made for a three-dimensional population
with linear trend, what result would you expect?
8.9 In a population with $y_i = i^2$ (i = 1, 2, ..., 16), compare the values of $E(\bar{y} - \bar{Y})^2$
given by every kth systematic sampling and by the Yates, Sethi, and Singh et al. methods for
n = 4, k = 4, N = 16.
CHAPTER 9
units 3 x 3 ft, possibly because samplers tend to place boundary plants inside the
unit when there is doubt. Sukhatme (1947) cites similar results for wheat and rice.
This chapter deals with the case in which every cluster unit contains the same
number of elements or subunits.
$$\text{relative net precision} \propto \frac{1}{C_u' S_u'^2} \tag{9.4}$$
If differences in the costs of taking the sample are ignored (i.e., if $C_u'$ is
constant), the relative net precision with the uth unit $\propto 1/S_u'^2$. Kish's deff factors
for the different units (section 4.11) are therefore proportional to the $S_u'^2 = S_u^2/M_u$.
Example. Johnson's data (1941) for a bed of white pine seedlings provide a simple
example. The bed contained six rows, each 434 ft long. There are many ways in which the
bed can be divided into sampling units. Data for four types of unit are shown in Table 9.1.
Since the bed was completely counted, the data are correct population values.
TABLE 9.1
DATA FOR FOUR TYPES OF SAMPLING UNIT
Type of Unit
By theorem 9.1, corollary 1, the relative net precisions are worked out in Table 9.2.
The last line of Table 9.2 gives the relative precisions when that of the smallest unit is
taken as 100. The 1-ft bed appears to be the best unit.
TABLE 9.2
RELATIVE NET PRECISIONS OF THE FOUR UNITS
The variances among units, expressed on a common basis, are also worth looking at. The
values of $S_u'^2 = S_u^2/M_u$, applicable to a single foot of row, are, respectively, 2.537, 3.373,
3.849, 5.713. Note that these variances increase steadily with increasing size of unit. This
result is commonly found (although exceptions may occur). Since the relative net precision
$\propto 1/C_u'S_u'^2$, the cost of taking a given bulk of sample must decrease with the larger units if
they are to prove economical.
Theorem 9.1 and its corollaries remain valid for stratified sampling with
proportional allocation if all strata are of the same size and if $S_u^2$, $S_u'^2$ represent
average variances within strata. This is so, under the conditions stated, because
the variance of the estimated population total, ignoring the fpc, is $N^2S_u^2/n$, and
therefore assumes the same form as in simple random sampling. Theorem 9.1
does not hold for more complex types of sampling.
The preceding results are intended merely as an illustration of the general
procedure. Comparisons among units should always be made for the kind of
sampling that is to be used in practice or, if this has not been decided, for the kinds
that are under consideration. Changes in the method of sampling or of estimation
will alter the relative net precisions of the different units. Even with a fixed method
of sampling and estimation, relative net precisions vary with size of sample if the
cost is not a linear function of size or if the size is large enough so that the fpc must
be taken into account.
There is usually more than one item to consider. One approach is to fix the total
cost and work out the relative net precisions for each type of unit and each item.
Unless one type is uniformly superior, some compromise decision is made, giving
principal weight to the most important items.
TABLE 9.3
ESTIMATED STANDARD ERRORS (%) FOR FOUR SIZES OF UNIT, WITH SIMPLE
RANDOM SAMPLING
Items     S/4     S/2     S     2S     Best Unit
In view of the numerous factors that influence the results, a study of optimum
size of unit in an extensive survey is a large task. A good example for farm
sampling is described by Jessen (1942). An excerpt from his results is given in
Table 9.3. This compares four sizes of unit: a quarter-section, a half-section, a
section, and a block consisting of two contiguous sections. The section is an area 1
mile square, containing on the average slightly under four farms. In this
comparison the total field cost ($1000), the length of questionnaire (60 min to complete),
and the travel cost (5 cents per mile) are all specified, because relative net
precisions change if any of these variables is altered. Costs are at a 1939 level.
The data in the table are the relative standard errors (in per cent) of the
estimated means per farm for 18 items. No unit is best for all items. The
half-section and the quarter-section are, however, superior to the larger units for
all except two items, with little to choose between the half- and quarter-sections.
The half-section would probably be preferred, because the problem of identifying
the boundaries accurately is easier.
TABLE 9.4
ANALYSIS OF VARIANCE OF THE SAMPLE DATA (ON A SMALL-UNIT BASIS)
                                             df          ms
Between large units                          (n - 1)     $s_b^2$
Between small units within large units       n(M - 1)    $s_w^2$
The estimated variance of a large unit (on a small-unit basis) is $s_b^2$. It might be
thought that an appropriate estimate of the variance of a small unit would be the
mean square between all small units in the sample; that is,
$$s^2 = \frac{(n-1)s_b^2 + n(M-1)s_w^2}{nM-1} \tag{9.5}$$
This estimate, although often satisfactory, is slightly biased because the sample is
not a simple random sample of small units, since these are sampled in contiguous
groups of M units.
An unbiased estimate is obtained from the sample by constructing an analysis of
variance, as in Table 9.5, for the whole population, which contains N large units
and NM small units.
By its definition, the population variance among small units is given by the last
line of the table, that is,
$$S^2 = \frac{(N-1)S_b^2 + N(M-1)S_w^2}{NM-1} \tag{9.6}$$
TABLE 9.5
ANALYSIS OF VARIANCE FOR THE WHOLE POPULATION (ON A SMALL-UNIT BASIS)
df
Example. The data come from a farm sample taken in North Carolina in 1942 in order
to estimate farm employment (Finkner, Morgan, and Monroe, 1943). The method of
drawing the sample was to locate points at random on the map and to choose as sampling
units the three farms that were nearest to each point. This method is not recommended
because a large farm has a greater chance of inclusion in the sample than a small farm, and
an isolated farm has a greater chance than another in a densely farmed area. Any effects of
this bias will be ignored.
TABLE 9.6
SAMPLE ANALYSIS OF VARIANCE (NUMBER OF PAID WORKERS)
(SINGLE-FARM BASIS)
df ms
From the sample data for individual farms, the group of three farms can be compared
with the individual farm as a sampling unit. The item chosen is the number of paid workers.
The sample was stratified, the stratum being a group of townships similar in density of farm
population and in ratio of cropland to farmland. Since the sampling fraction was 1.9%, the
fpc can be ignored.
The correct procedure is to compute $N_h^2 S_h^2 / n_h$ separately within each stratum for the two
types of unit, using an analysis of variance and expression (9.8). We will use a simpler
procedure as an approximation.
The strata contained in general between 300 and 450 farms, and either two or three
3-farm units were taken in each stratum to make the sampling approximately proportional.
Assuming proportionality, that is, $n_h/N_h = n/N$, we may write
$$V(\hat{Y}_{st}) = \frac{N}{n}\sum N_h S_h^2 = \frac{N^2}{n}\,\bar{S}_h^2$$
if we assume further that the $S_h^2$ do not vary greatly among strata, so that they may be
replaced by their average, $\bar{S}_h^2$.
Estimates of $\bar{S}_h^2$ are obtained from the analysis of variance in Table 9.6, which is on a
single-farm basis.
For the group of three farms, the mean square $s_b^2 = 6.218$ serves as the estimate of $\bar{S}_h^2$
on a single-farm basis. For the individual farm, using (9.8), we have
$$\frac{6.218 + 2(2.918)}{3} = 4.018$$
By theorem 9.1, corollary 2, the two figures, 6.218 for the group of three farms and 4.018
for the individual farm, indicate the relative variances obtained for a fixed total size of
sample. The group of farms gives about two thirds the precision of the single farm.
Consideration of costs would presumably make the result more favorable to the three-farm
unit.
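The arithmetic of this example can be checked with a few lines of Python. This is only a sketch: the function name `element_variance_estimate` is hypothetical, and it applies the large-$N$ form $[s_b^2 + (M-1)s_w^2]/M$ that the calculation above uses, with the two mean squares quoted in the text.

```python
def element_variance_estimate(s_b2, s_w2, M):
    """Approximate variance among single elements from a cluster-sample
    analysis of variance: [s_b^2 + (M - 1) * s_w^2] / M (N assumed large)."""
    return (s_b2 + (M - 1) * s_w2) / M

s_b2 = 6.218   # mean square between 3-farm units, single-farm basis
s_w2 = 2.918   # mean square between farms within units
M = 3          # farms per unit

var_single = element_variance_estimate(s_b2, s_w2, M)
# Precision of the 3-farm unit relative to the single farm,
# for a fixed total number of farms in the sample.
relative_precision = var_single / s_b2
print(round(var_single, 3), round(relative_precision, 2))
```

The ratio comes out a little under two thirds, which is the comparison drawn in the text.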
$$\rho = \frac{E(y_{ij} - \bar{\bar{Y}})(y_{ik} - \bar{\bar{Y}})}{E(y_{ij} - \bar{\bar{Y}})^2} = \frac{2\sum_{i}\sum_{j<k}(y_{ij} - \bar{\bar{Y}})(y_{ik} - \bar{\bar{Y}})}{(M-1)(NM-1)S^2} \qquad (9.9)$$
The number of terms (cross products) in the numerator $E$ is $NM(M-1)/2$, and
the denominator $E$ is $(NM-1)S^2/NM$.
Theorem 9.2. A simple random sample of $n$ clusters, each containing $M$
elements, is drawn from the $N$ clusters in the population. Then the sample mean
per element $\bar{\bar{y}}$ is an unbiased estimate of $\bar{\bar{Y}}$ with variance
$$V(\bar{\bar{y}}) = \frac{1-f}{n}\cdot\frac{NM-1}{M^2(N-1)}\,S^2\,[1+(M-1)\rho] \qquad (9.10)$$
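Proof. Let $y_i$ denote the total for the $i$th cluster and $\bar{y} = \sum y_i/n$. By theorems
2.1 and 2.2, $\bar{y}$ is an unbiased estimate of $\bar{Y}$ with variance
$$V(\bar{y}) = \frac{1-f}{n}\cdot\frac{\sum (y_i - \bar{Y})^2}{N-1}$$
But $\bar{y} = M\bar{\bar{y}}$ and $\bar{Y} = M\bar{\bar{Y}}$. Hence $\bar{\bar{y}}$ is an unbiased estimate of $\bar{\bar{Y}}$ with variance
$$V(\bar{\bar{y}}) = \frac{1-f}{nM^2}\cdot\frac{\sum (y_i - \bar{Y})^2}{N-1} \qquad (9.11)$$
But
$$(y_i - \bar{Y}) = (y_{i1} - \bar{\bar{Y}}) + (y_{i2} - \bar{\bar{Y}}) + \cdots + (y_{iM} - \bar{\bar{Y}})$$
Square and sum over all $N$ clusters,
$$\sum (y_i - \bar{Y})^2 = \sum_i\sum_j (y_{ij} - \bar{\bar{Y}})^2 + 2\sum_i\sum_{j<k}(y_{ij} - \bar{\bar{Y}})(y_{ik} - \bar{\bar{Y}}) = (NM-1)S^2[1+(M-1)\rho] \qquad (9.12)$$
using the definition of $\rho$ in (9.9). Substitute in (9.11) for $V(\bar{\bar{y}})$. This gives the result.
The factor
$$1+(M-1)\rho$$
shows by how much the variance is changed by the use of a cluster instead of an
element as sampling unit. This factor is therefore Kish's design effect (deff) for clusters of size $M$
(section 4.11). If $\rho > 0$, the cluster is less precise for a given bulk of sample. If
$\rho < 0$, as sometimes happens, the cluster is more precise. Theorem 9.2 is a simple
extension of theorem 8.2, p. 209.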
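Theorem 9.2 can be checked numerically on a small population. The sketch below is illustrative only: the population values and the function names are made up, and it simply compares the variance written in terms of $S^2$ and $\rho$ (with $\rho$ computed from (9.9)) against the direct form (9.11) based on cluster totals.

```python
from itertools import combinations

def cluster_quantities(clusters):
    """Return N, M, S^2 and the intracluster correlation rho of (9.9)."""
    N, M = len(clusters), len(clusters[0])
    elements = [y for c in clusters for y in c]
    grand_mean = sum(elements) / (N * M)
    S2 = sum((y - grand_mean) ** 2 for y in elements) / (N * M - 1)
    # Cross products of deviations for every pair of elements in the same cluster
    cross = sum((c[j] - grand_mean) * (c[k] - grand_mean)
                for c in clusters for j, k in combinations(range(M), 2))
    rho = 2 * cross / ((M - 1) * (N * M - 1) * S2)
    return N, M, S2, rho

def variance_by_theorem(clusters, n):
    """Variance of the mean per element expressed through S^2 and rho."""
    N, M, S2, rho = cluster_quantities(clusters)
    f = n / N
    return (1 - f) / n * (N * M - 1) / (M ** 2 * (N - 1)) * S2 * (1 + (M - 1) * rho)

def variance_direct(clusters, n):
    """Variance of the mean per element from the cluster totals, as in (9.11)."""
    N, M = len(clusters), len(clusters[0])
    totals = [sum(c) for c in clusters]
    mean_total = sum(totals) / N
    f = n / N
    return (1 - f) / (n * M ** 2) * sum((t - mean_total) ** 2 for t in totals) / (N - 1)

# Hypothetical population: N = 4 clusters of M = 3 elements
population = [[1, 2, 4], [3, 3, 5], [0, 1, 1], [6, 4, 7]]
print(variance_by_theorem(population, n=2), variance_direct(population, n=2))
```

The two printed values agree up to rounding, since the identity behind them is exact for any finite population with equal-sized clusters.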
An alternative expression can be given for $\rho$. Let $S_b^2$ denote the variance among
cluster totals, on a single-unit basis. Then
by (9.12). Hence,
(9.15)
A good discussion of the numerical values of $\rho$ for different items and different
sizes of cluster is given by Hansen, Hurwitz, and Madow (1953), who regard $\rho$ as a
"measure of homogeneity" of the cluster.
(9.21)
Assuming simple random sampling and ignoring the fpc, the variance of the mean
per element $\bar{\bar{y}}$ is $S_b^2/nM$. From (9.17), this equals
$$V(\bar{\bar{y}}) = \frac{S^2 - (M-1)AM^{g-1}}{n} \qquad (9.22)$$
To determine the optimum size of unit, we find $M$, and incidentally $n$, to
minimize $V$ for fixed $C$. The general solution is complicated, although its
application in a numerical problem presents no great difficulty.
By some manipulation we can obtain the equation that gives the optimum $M$.
First solve the cost equation (9.21) as a quadratic in $\sqrt{n}$. This gives
(9.23)
$$C + \lambda V = c_1 Mn + c_2\sqrt{n} + \lambda V$$
Differentiating, and noting that $\partial V/\partial n = -V/n$, we obtain the equations
$$n:\quad c_1 M + \tfrac{1}{2}c_2 n^{-1/2} = -\lambda\frac{\partial V}{\partial n} = \frac{\lambda V}{n} \qquad (9.24)$$
$$M:\quad c_1 n = -\lambda\frac{\partial V}{\partial M} \qquad (9.25)$$
Divide (9.25) by (9.24) to eliminate $\lambda$. This leads to
$$-\frac{n}{V}\frac{\partial V}{\partial M} = \frac{c_1 n}{c_1 M + \tfrac{1}{2}c_2 n^{-1/2}}$$
or
(9.26)
factors, being dependent only on the shape of the variance function. Both sides
can be seen to be increasing functions of $M$, for $g > 0$, $M \ge 1$, within the region of
interest. Suppose that the solution has been found for specified values of $C$, $c_1$, and
$c_2$, and we wish to examine the effect of an increase in $c_1$ on this solution. The left
side does not depend on $c_1$, but the right side increases as $c_1$ increases. However,
the optimum $M$ is found to decrease because of the term $c_1 M$ on the right. A
decrease in $c_2$ produces a similar effect.
Now $c_1$ increases if the length of interview increases, whereas $c_2$ decreases if
travel becomes cheaper or if the farms in a given area become denser. These facts
lead to the conclusion that the optimum size of unit becomes smaller when
length of interview increases
travel becomes cheaper
the elements (farms) become more dense
total amount of money used ($C$) increases
This conclusion is a consequence of the type of cost function and would require
reexamination with a different function. It illustrates the fact that the optimum
unit is not a fixed characteristic of the population, but depends also on the type of
survey and on the levels of prices and wages.
Hansen, Hurwitz, and Madow (1953) give an excellent discussion of the
construction of cost functions for surveys involving cluster sampling.
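The step of solving the cost equation as a quadratic in $\sqrt{n}$ can be sketched in code. The sketch assumes the cost function has the form $C = c_1 Mn + c_2\sqrt{n}$ that appears in the expression $C + \lambda V$ above; the function name `units_affordable` and the numerical values are illustrative only.

```python
import math

def units_affordable(C, c1, c2, M):
    """Solve C = c1*M*n + c2*sqrt(n) for n.

    Treat the equation as a quadratic a*x^2 + c2*x - C = 0 in x = sqrt(n),
    with a = c1*M, and keep the positive root.
    """
    a = c1 * M
    root_n = (-c2 + math.sqrt(c2 ** 2 + 4 * a * C)) / (2 * a)
    return root_n ** 2

# Illustrative values only: budget 2000, c1 = 2 per element, c2 = 60
for M in (1, 5, 10):
    print(M, units_affordable(2000, 2, 60, M))
```

With $n$ expressed as a function of $M$ in this way, the variance can be evaluated over a grid of $M$ values to locate the optimum numerically, which is usually simpler than working with the general solution.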
(9.32)
EXERCISES
9.1 For the data in Table 9.1 compare the relative net precisions of the four types of
unit when the object is to estimate the total number of seedlings in the bed with a standard
error of 200 seedlings. (Note that the fpc is involved.)
9.2 For the data in Table 3.5 (pAi7) estimate the relative precision of the household to
the individual for estimating the sex ratio and the proportion of people who had seen a
doctor in the past 12 months. assuming simple random samphng.
9.3 A population consisting of 2500 elements is divided into 10 strata, each containing
50 large units composed of five elements. The analysis of variance of the population for an
item is as follows, on an element basis.
df ms
Ignoring the fpc, is the relative precision of the large to the small unit greater with simple
random sampling than with stratified random sampling (proportional allocation)?
9.4 A population containing $LNM$ elements is divided into $L$ strata, each having $N$
large units, each of which contains $M$ small units. The following quantities come from the
analysis of variance of the population, on an element basis.
$S_1^2$ = mean square between strata
$S_2^2$ = mean square between large units within strata
$S_3^2$ = mean square between elements within strata
If $N$ is large and the fpc is ignored, show that the relative precision of the large to the small
unit (element) is improved by stratification if
(M - I) M 1
~<5/ - 5/
9.5 In a rural survey in which the sampling unit is a cluster of M farms, the cost of taking
a sample of n units is
C = 4tMn + 60√n
where t is the time in hours spent getting the answers from a single farmer. If $2000 is spent
on the survey, the values of n for M = 1, 5, 10; t = 1/2, 2, work out as follows.
                    M
               1      5     10
t = 1/2 hr    400    131    74
t = 2 hr      156     40    21
Verify two of these values to ensure that you understand the use of the formula.
The variance of the sample mean (ignoring the fpc) is
S^2 [1 + (M - 1)ρ] / (Mn)
If ρ = 0.1 for all M between 1 and 10, which size of unit is most precise for (a) t = 1/2 hr, (b)
t = 2 hr? How do you explain the difference in results?
9.6 If $5000 were available for the survey, would you expect the optimum size of unit
to decrease or increase (relative to that for $2000)? Give reasons. You may, if you wish,
find the optimum size in order to check your argument.
CHAPTER 9A
denote the item total for the $i$th cluster unit. Given a simple random sample of $n$ of
the $N$ population units, an unbiased estimate of $Y$ is (by theorem 2.1, corollary)
$$\hat{Y}_u = \frac{N}{n}\sum_{i=1}^{n} y_i \qquad (9A.1)$$
By theorem 2.2, its variance is
$$V(\hat{Y}_u) = \frac{N^2(1-f)}{n}\cdot\frac{\sum_{i=1}^{N}(y_i - \bar{Y})^2}{N-1} \qquad (9A.2)$$
where $\bar{Y} = Y/N$ is the population mean per cluster unit.
The estimate $\hat{Y}_u$ is often found to be of poor precision. This occurs when the $\bar{y}_i$
(means per element) vary little from unit to unit and the $M_i$ vary greatly. In this
event the $y_i = M_i\bar{y}_i$ also vary greatly from unit to unit and the variance (9A.2) is
large.
If the $M_i$, and hence $M_0$, are all known, an alternative is a ratio estimate in which
$M_i$ is taken as the auxiliary variate $x_i$,
$$\hat{Y}_R = M_0\,\frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} M_i} = M_0\,(\text{sample mean per element})$$
In the notation of the ratio estimate the population ratio $R = Y/X = Y/M_0 = \bar{\bar{Y}}$,
the population mean per element. By theorem 6.1, assuming that the number of
clusters in the sample is large,
$$V(\hat{Y}_R) \approx \frac{N^2(1-f)}{n}\cdot\frac{\sum_{i=1}^{N}(y_i - \bar{\bar{Y}}M_i)^2}{N-1} \qquad (9A.3)$$
$$= \frac{N^2(1-f)}{n}\cdot\frac{\sum_{i=1}^{N} M_i^2(\bar{y}_i - \bar{\bar{Y}})^2}{N-1} \qquad (9A.4)$$
As (9A.4) shows, the variance of $\hat{Y}_R$ depends on the variability among the means
per element and is often found to be much smaller than $V(\hat{Y}_u)$.
Note that $\hat{Y}_R$ requires a knowledge of the total $M_0$ of all the $M_i$, while $\hat{Y}_u$ does
not. The reverse is true when we are estimating the population mean per element.
In this case the corresponding estimates are
Thus, $\hat{\bar{\bar{Y}}}_R$ requires knowledge of only the $M_i$ that fall in the selected sample.
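The contrast between the two estimates of the total can be made concrete with a small sketch. The function names and the sample values are hypothetical; the sample is chosen so that every cluster has the same mean per element, which is the situation described above as unfavorable to the simple unbiased estimate.

```python
def unbiased_total(sample_totals, N):
    """N times the mean of the sampled cluster totals, as in (9A.1)."""
    return N * sum(sample_totals) / len(sample_totals)

def ratio_total(sample_totals, sample_sizes, M0):
    """M0 times the sample mean per element (ratio-to-size estimate)."""
    return M0 * sum(sample_totals) / sum(sample_sizes)

# Hypothetical sample of n = 3 clusters from N = 7 clusters holding M0 = 30 elements.
# Every sampled cluster has a mean of 4 per element, but the sizes differ widely.
sizes = [3, 1, 10]
totals = [12, 4, 40]
print(unbiased_total(totals, N=7))        # depends on which cluster sizes were drawn
print(ratio_total(totals, sizes, M0=30))  # M0 times the common mean per element
```

When the means per element are constant, the ratio estimate returns the same value for every possible sample, whereas the unbiased estimate moves with the sizes of the clusters that happen to be drawn.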
9A.2 SAMPLING WITH PROBABILITY PROPORTIONAL TO SIZE
If all the $M_i$ are known, another technique, developed by Hansen and Hurwitz
(1943), is to select the units with probabilities proportional to their sizes $M_i$. One
method of selecting a single unit is illustrated in the following small population of
$N = 7$ units.
         Size                  Assigned
Unit     M_i      ΣM_i         Range
 1         3        3           1-3
 2         1        4           4
 3        11       15           5-15
 4         6       21          16-21
 5         4       25          22-25
 6         2       27          26-27
 7         3       30          28-30
The cumulative sums of the $M_i$ are formed. To select a unit, draw a random
number between 1 and $M_0 = 30$. Suppose that this is 19. In the sum, number 19
falls in unit 4, which covers the range from numbers 16 to 21, inclusive. With this
method of drawing, the probability that any unit is selected is proportional
to its size.
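The cumulative method just described can be sketched as follows. The function name `select_pps_cumulative` is hypothetical; the sizes are those of the seven-unit example, and the random number is passed in explicitly so that the worked case (random number 19) can be reproduced.

```python
import bisect
from itertools import accumulate

def select_pps_cumulative(sizes, r):
    """Return the 1-based unit whose assigned range contains r (1 <= r <= M0)."""
    cumulative = list(accumulate(sizes))
    if not 1 <= r <= cumulative[-1]:
        raise ValueError("r must lie between 1 and the total size M0")
    # First cumulative sum that is >= r marks the unit whose range holds r
    return bisect.bisect_left(cumulative, r) + 1

sizes = [3, 1, 11, 6, 4, 2, 3]            # M_i for units 1..7, M0 = 30
print(select_pps_cumulative(sizes, 19))   # 4
```

In practice `r` would be drawn uniformly from 1 to $M_0$, for example with `random.randint(1, sum(sizes))`. Counting how often each unit is returned as `r` runs over 1 to 30 reproduces the sizes exactly, which is the sense in which selection is proportional to size.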
This method of selecting a unit is convenient when $N$ is only moderate, or in
stratified sampling when the $N_h$ are moderate or small, but the cumulation of the
$M_i$ can be time-consuming with $N$ large (e.g., $N = 20{,}000$). For this case, Lahiri
(1951) has given an alternative method that avoids the cumulation. Let $M_{max}$
be the largest of the $M_i$. Draw a random number between 1 and $N$;
suppose this is $i$. Now draw another random number $m$ between 1 and $M_{max}$. If $m$
is less than or equal to $M_i$, the $i$th unit is selected. If not, try another pair of random
numbers. Naturally, this method involves the fewest rejections when the $M_i$ do
not differ too much in size.
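Lahiri's method translates directly into a rejection loop. The sketch below is illustrative: the function name `select_pps_lahiri` is hypothetical, and a seeded `random.Random` instance is passed in so that runs are repeatable.

```python
import random

def select_pps_lahiri(sizes, rng):
    """Select one unit (1-based) with probability proportional to size,
    without cumulating the sizes."""
    N, m_max = len(sizes), max(sizes)
    while True:
        i = rng.randint(1, N)        # candidate unit, equal chances
        m = rng.randint(1, m_max)    # second random number
        if m <= sizes[i - 1]:        # accept with probability M_i / M_max
            return i
        # otherwise reject the pair and draw again

sizes = [3, 1, 11, 6, 4, 2, 3]
rng = random.Random(12345)
draws = [select_pps_lahiri(sizes, rng) for _ in range(10)]
print(draws)
```

Each trial accepts unit $i$ with probability $(1/N)(M_i/M_{max})$, so the accepted unit has probability proportional to $M_i$. The expected number of trials per selection is $N M_{max}/M_0$, which is why the method works best when the sizes are not too unequal.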
Now consider $n > 1$. Assume at present that sampling is with replacement. To
select a second unit by the cumulative method, draw a new random number
between 1 and 30. However, unlike sampling without replacement, we do not
forbid the selection of unit 4 a second time. With this rule, the probabilities of
selection remain proportional to the sizes at each draw. An advantage of selection
with replacement is that the formulas for the true and estimated variances of the
estimates are simple.
In sampling without replacement, on the other hand (section 9A.6), keeping the
selection probabilities proportional to the chosen sizes is more difficult and sooner
or later becomes impossible as $n$ increases. This may be seen in the extreme
(although impractical) case $n = 7$ in the preceding example. If selection were made
without replacement, every unit would be certain to be chosen, irrespective of the
original sizes $M_i$. However, for stratified sampling in which the $N_h$ are small, much
research has been done (section 9A.6 ff.) to develop practical methods of sampling
with unequal probabilities without replacement.
For comparison with subsequent methods, this estimate will be denoted by $\hat{Y}_{pps}$.
Furthermore,
$$V(\hat{Y}_{pps}) = \frac{M_0}{n}\sum_{i=1}^{N} M_i(\bar{y}_i - \bar{\bar{Y}})^2 \qquad (9A.6)$$
so that the variance of $\hat{Y}_{pps}$, like that of $\hat{Y}_R$, depends on the variability of the unit
means per element.
In some applications the sizes $M_i$ are known only approximately. In others the
"size" is not the number of elements in the unit but a measure of its bigness that is
thought to be highly correlated with the unit total $y_i$. For instance, the "size" of a
hospital might be measured by the total number of beds or by the average number
of occupied beds over some time period. Similarly, various measures of the "size"
of a restaurant, a bank, or an agricultural district can be devised. Consequently,
we will consider a measure of size $M_i'$ and a corresponding probability of selection
$z_i = M_i'/M_0'$, where $M_0' = \sum^{N} M_i'$. As far as the theoretical results are concerned,
the $z_i$ can be any set of positive numbers that add to 1 over the population. It will
be shown that
$$\hat{Y}_{ppz} = \frac{1}{n}\sum_{i=1}^{n}\frac{y_i}{z_i} \qquad (9A.7)$$
is an unbiased estimate of $Y$, with variance
$$V(\hat{Y}_{ppz}) = \frac{1}{n}\sum_{i=1}^{N} z_i\left(\frac{y_i}{z_i} - Y\right)^2 \qquad (9A.8)$$
The proofs utilize a method introduced in section 2.10. Let $t_i$ be the number of
times that the $i$th unit appears in a specific sample of size $n$, where $t_i$ may have any
of the values $0, 1, 2, \ldots, n$. Consider the joint frequency distribution of the $t_i$ for
all $N$ units in the population.
The method of drawing the sample is equivalent to the standard probability
problem in which $n$ balls are thrown into $N$ boxes, the probability that a ball goes
into the $i$th box being $z_i$ at every throw. Consequently the joint distribution of the
$t_i$ is the multinomial expression
$$\frac{n!}{t_1!\,t_2!\cdots t_N!}\,z_1^{t_1}z_2^{t_2}\cdots z_N^{t_N}$$
For the multinomial, the following properties of the distribution of the $t_i$ are well
known.
$$E(t_i) = nz_i, \qquad V(t_i) = nz_i(1-z_i), \qquad \mathrm{Cov}(t_it_j) = -nz_iz_j \qquad (9A.9)$$
(9A.10)
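$$\hat{Y}_{ppz} = \frac{1}{n}\left(\frac{t_1y_1}{z_1} + \frac{t_2y_2}{z_2} + \cdots + \frac{t_Ny_N}{z_N}\right) = \frac{1}{n}\sum_{i=1}^{N} t_i\frac{y_i}{z_i}$$
where the sum extends over all units in the population. In repeated sampling the
$t$'s are the random variables, whereas the $y_i$ and the $z_i$ are a set of fixed numbers.
Hence, since $E(t_i) = nz_i$ by (9A.9),
$$E(\hat{Y}_{ppz}) = \frac{1}{n}\sum_{i=1}^{N}(nz_i)\frac{y_i}{z_i} = \sum_{i=1}^{N} y_i = Y$$
$$V(\hat{Y}_{ppz}) = \frac{1}{n}\left[\sum_{i=1}^{N}\left(\frac{y_i}{z_i}\right)^2 z_i(1-z_i) - 2\sum_{i=1}^{N}\sum_{j>i}^{N}\frac{y_i}{z_i}\frac{y_j}{z_j}z_iz_j\right] \qquad (9A.13)$$
$$= \frac{1}{n}\left(\sum_{i=1}^{N}\frac{y_i^2}{z_i} - Y^2\right) = \frac{1}{n}\sum_{i=1}^{N} z_i\left(\frac{y_i}{z_i} - Y\right)^2 \qquad (9A.11)$$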
Since $(1 - z_i)$ equals the sum of all other $z$'s in the population, the coefficient of
$y_i^2/z_i$ in (9A.14) contains a term $z_j$ for any $j \ne i$. Similarly, the coefficient of $y_j^2/z_j$
contains a term $z_i$. Hence,
$$V(\hat{Y}_{ppz}) = \frac{1}{n}\sum_{i=1}^{N}\sum_{j>i}^{N}\left(\frac{y_i^2z_j}{z_i} + \frac{y_j^2z_i}{z_j} - 2y_iy_j\right) = \frac{1}{n}\sum_{i=1}^{N}\sum_{j>i}^{N} z_iz_j\left(\frac{y_i}{z_i} - \frac{y_j}{z_j}\right)^2 \qquad (9A.15)$$
$$v(\hat{Y}_{ppz}) = \sum_{i=1}^{n}\left(\frac{y_i}{z_i} - \hat{Y}_{ppz}\right)^2 \Big/ n(n-1) \qquad (9A.16)$$
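The estimate, its sample variance estimate (9A.16), and the two forms of the true variance, (9A.8) and (9A.15), can be written out in a few lines. The sketch is illustrative only: function names and the four-unit population are made up.

```python
def ppz_estimate(y, z):
    """Estimate of the total from a with-replacement sample: mean of y_i / z_i."""
    return sum(yi / zi for yi, zi in zip(y, z)) / len(y)

def ppz_variance_estimate(y, z):
    """Sample estimate of the variance, as in (9A.16)."""
    n = len(y)
    est = ppz_estimate(y, z)
    return sum((yi / zi - est) ** 2 for yi, zi in zip(y, z)) / (n * (n - 1))

def ppz_true_variance(Y_units, Z_units, n):
    """True variance in the form (1/n) * sum z_i (y_i/z_i - Y)^2."""
    Y = sum(Y_units)
    return sum(zi * (yi / zi - Y) ** 2 for yi, zi in zip(Y_units, Z_units)) / n

def ppz_true_variance_pairs(Y_units, Z_units, n):
    """True variance in the pairwise form (9A.15)."""
    N = len(Y_units)
    total = 0.0
    for i in range(N):
        for j in range(i + 1, N):
            diff = Y_units[i] / Z_units[i] - Y_units[j] / Z_units[j]
            total += Z_units[i] * Z_units[j] * diff ** 2
    return total / n

# Hypothetical population of N = 4 units with selection probabilities z_i
Y_units = [2.0, 5.0, 3.0, 8.0]
Z_units = [0.1, 0.4, 0.2, 0.3]
print(ppz_true_variance(Y_units, Z_units, n=2))
print(ppz_true_variance_pairs(Y_units, Z_units, n=2))
```

The two population formulas print the same value. Averaging `ppz_variance_estimate` over all ordered pairs of draws, weighted by $z_iz_j$, gives that value as well for $n = 2$, which is one way to see the unbiasedness of (9A.16) on a small case.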
(9A.17)
(9A.2l)
These results follow from theorem 9A.1, since $y_i = \bar{y}_iM_i$ and $\bar{\bar{Y}} = Y/M_0$.
Theorem 9A.4. Under the conditions of theorem 9A.3, an unbiased sample
estimate of $V(\hat{Y}_{pps})$ is
unit, a question of interest is: what measure of size minimizes the variance of $\hat{Y}_{ppz}$?
Now,
$$V(\hat{Y}_{ppz}) = \frac{1}{n}\sum_{i=1}^{N} z_i\left(\frac{y_i}{z_i} - Y\right)^2 = \frac{1}{n}\left(\sum_{i=1}^{N}\frac{y_i^2}{z_i} - Y^2\right)$$
This expression becomes zero if $z_i \propto y_i$: that is, $z_i = y_i/Y$. If the $y_i$ are all positive,
this set of $z_i$ is an acceptable set of probabilities. Consequently, the best measures
of size are numbers proportional to the item totals $y_i$ for the units.
This result is not of direct practical application; if the $y_i$ were known in advance
for the whole population the sample would be unnecessary. The result suggests
that if the $y_i$ are relatively stable through time, the most recently available
previous values of the $y_i$ may be the best measures of size for this item. In practice,
of course, a single measure of size must be used for all items in selecting the
sample. If there is a choice between different measures of size, the measure most
nearly proportional to the unit totals of the principal items is likely to be best.
There is no simple rule for deciding which is most accurate. The issue depends
on the relation between $\bar{y}_i$ and $M_i$ and on the variance of $\bar{y}_i$ as a function of $M_i$. The
situation favorable to the ratio and pps estimates is that in which $\bar{y}_i$ is unrelated to
$M_i$. The situation favorable to $\hat{Y}_u$ is that in which the unit total $y_i$ is unrelated to $M_i$.
Some guidance can be obtained by expressing the variance of the three
estimates in a comparable form. We assume that $(N - 1) \approx N$ and write
$E(y_i - \bar{Y})^2 = \sum^{N}(y_i - \bar{Y})^2/N$. We also assume that the bias of $\hat{Y}_R$ is negligible.
For $\hat{Y}_u$ we have, from (9A.2),
$$nV(\hat{Y}_u) = N^2(1-f)E(y_i - \bar{Y})^2 = (1-f)E(Ny_i - Y)^2 \qquad (9A.23)$$
For $\hat{Y}_R$, from (9A.4),
$$nV(\hat{Y}_R) \approx N^2(1-f)E\,M_i^2(\bar{y}_i - \bar{\bar{Y}})^2 = (1-f)E\left(\frac{M_i}{\bar{M}}\right)^2(M_0\bar{y}_i - Y)^2 \qquad (9A.24)$$
where $\bar{M} = \sum M_i/N = M_0/N$.
From (9A.21), for $\hat{Y}_{pps}$,
$$nV(\hat{Y}_{pps}) = E\left(\frac{M_i}{\bar{M}}\right)(M_0\bar{y}_i - Y)^2 \qquad (9A.25)$$
From (9A.23), (9A.24), (9A.25), we see that $V(\hat{Y}_u)$ depends on the accuracy of
the quantities $Ny_i = NM_i\bar{y}_i$ as estimates of $Y$, while $V(\hat{Y}_R)$ and $V(\hat{Y}_{pps})$ depend on
the accuracy of the quantities $M_0\bar{y}_i = M_0y_i/M_i$ as estimates of $Y$. If $\bar{y}_i$ is unrelated
to $M_i$, we expect the $M_0\bar{y}_i$ to be more accurate than the $NM_i\bar{y}_i$, and the reverse if $y_i$
is unrelated to $M_i$.
As regards $\hat{Y}_R$ and $\hat{Y}_{pps}$, note from (9A.24) and (9A.25) that $V(\hat{Y}_R)$ gives
relatively greater weight to large units than $V(\hat{Y}_{pps})$. Note also that $\hat{Y}_u$ and $\hat{Y}_R$
benefit from the fpc term, which can become substantial in small strata (e.g., with
$n_h = 2$, $N_h = 10$). This point stimulated the development of unequal probability
selection without replacement. Formula (9A.24) for $\hat{Y}_R$, of course, holds only in
large samples.
Further comparisons among the methods have been made from an infinite
population model by Cochran, 2nd ed., Des Raj (1954, 1958), Yates (1960),
Zarcovic (1960), and Foreman and Brewer (1971). Most writers assume that the
finite population is a random sample from an infinite superpopulation in which
(9A.26)
which is hoped to approximate the relation that holds in many surveys. Some
assumption must also be made about the variance of $e_i$ in clusters of given size.
From (9.12) in section 9.4, we get on dividing by $(N - 1) \approx N$,
$$V(e_i) = V(y_i \mid M_i) \approx M_iS^2[M_i\rho + (1-\rho)]$$
As an approximation this suggests $V(e_i) = cM_i^g$, where $1 < g < 2$ in most applications.
From the model (9A.26) we get
$$\bar{\bar{Y}} = \frac{\bar{Y}}{\bar{M}} = \frac{\alpha}{\bar{M}} + \beta + \frac{\bar{e}_N}{\bar{M}} \qquad (9A.27)$$
We assume here that $\bar{e}_N$ is negligible; this amounts to ignoring the fpc.
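It follows that,
$$\frac{nV(\hat{Y}_u)}{N^2} = E(y_i - \bar{Y})^2 = E[\beta(M_i - \bar{M}) + e_i]^2$$
$$\frac{nV(\hat{Y}_R)}{N^2} = E[\alpha(1 - M_i/\bar{M}) + e_i]^2 = \frac{\alpha^2V(M_i)}{\bar{M}^2} + cE(M_i^g) \qquad (9A.29)$$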
nVR -. . :. . a 2 ----=-or-
V(Mi ) + cE(M.B) (9A.32)
N2 M< I
$$\pi_i = z_i + \sum_{j\ne i}^{N}\frac{z_jz_i}{1-z_j} = z_i\left(1 + \sum_{j\ne i}^{N}\frac{z_j}{1-z_j}\right) \qquad (9A.34)$$
$$= z_i\left(1 + A - \frac{z_i}{1-z_i}\right) \qquad (9A.35)$$
$\pi_{ij}$ = probability that the $i$th and $j$th units are both in the sample
The following relations hold:
$$\sum_{i}^{N}\pi_i = n, \qquad \sum_{j\ne i}^{N}\pi_{ij} = (n-1)\pi_i, \qquad \sum_{i}^{N}\sum_{j>i}^{N}\pi_{ij} = \tfrac{1}{2}n(n-1) \qquad (9A.36)$$
To establish the second relation, let $P(s)$ denote the probability of a sample
consisting of $n$ specified units. Then $\pi_{ij} = \sum P(s)$ over all samples containing the $i$th
and $j$th units, and $\pi_i = \sum P(s)$ over all samples containing the $i$th unit. When we
take $\sum \pi_{ij}$ for $j \ne i$, every $P(s)$ for a sample containing the $i$th unit is counted $(n-1)$
times in the sum, since there are $(n-1)$ other values of $j$ in the sample. This proves
the second relation. The third relation follows from the second.
The Horvitz-Thompson (1952) estimator of the population total is
$$\hat{Y}_{HT} = \sum_{i}^{n}\frac{y_i}{\pi_i} \qquad (9A.37)$$
$$\hat{Y}_{HT} = \sum_{i}^{n}\frac{y_i}{\pi_i}$$
is an unbiased estimator of $Y$, with variance
$$V(\hat{Y}_{HT}) = \sum_{i=1}^{N}\frac{(1-\pi_i)}{\pi_i}y_i^2 + 2\sum_{i=1}^{N}\sum_{j>i}^{N}\frac{(\pi_{ij} - \pi_i\pi_j)}{\pi_i\pi_j}y_iy_j \qquad (9A.38)$$
where $\pi_{ij}$ is the probability that units $i$ and $j$ both are in the sample.
Proof. Let $t_i$ $(i = 1, 2, \ldots, N)$ be a random variable that takes the value 1 if the
$i$th unit is drawn and zero otherwise. Then $t_i$ follows the binomial distribution for a
sample of size 1, with probability $\pi_i$. Thus
$$E(t_i) = \pi_i, \qquad V(t_i) = \pi_i(1-\pi_i) \qquad (9A.39)$$
The value of $\mathrm{Cov}(t_it_j)$ is also required. Since $t_it_j$ is 1 only if both units appear in the
sample,
$$\mathrm{Cov}(t_it_j) = E(t_it_j) - E(t_i)E(t_j) = \pi_{ij} - \pi_i\pi_j \qquad (9A.40)$$
Hence, regarding the $y_i$ as fixed and the $t_i$ as random variables,
$$E(\hat{Y}_{HT}) = E\left(\sum_{i=1}^{N}\frac{t_iy_i}{\pi_i}\right) = \sum_{i=1}^{N} y_i = Y$$
(9A.41)
Hence,
(9A.42)
Corollary. From (9A.41), using the $t_i$ method, an unbiased sample estimator
of $V(\hat{Y}_{HT})$ is seen to be
$$v_1(\hat{Y}_{HT}) = \sum_{i}^{n}\frac{(1-\pi_i)}{\pi_i^2}y_i^2 + 2\sum_{i}^{n}\sum_{j>i}^{n}\frac{(\pi_{ij} - \pi_i\pi_j)}{\pi_i\pi_j\pi_{ij}}y_iy_j \qquad (9A.43)$$
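The variance formula (9A.38) can be checked on a case where the answer is known independently. The sketch below uses simple random sampling without replacement, for which $\pi_i = n/N$ and $\pi_{ij} = n(n-1)/N(N-1)$, so that (9A.38) should reduce to the familiar $N^2(1-f)S^2/n$. The function names and population values are illustrative only.

```python
from itertools import combinations

def ht_estimate(y_sample, pi_sample):
    """Horvitz-Thompson estimate of the total: sum of y_i / pi_i over the sample."""
    return sum(y / p for y, p in zip(y_sample, pi_sample))

def ht_variance(y, pi, pi_joint):
    """Variance of the HT estimator as in (9A.38).
    pi_joint[i][j] holds pi_ij for i != j."""
    N = len(y)
    v = sum((1 - pi[i]) / pi[i] * y[i] ** 2 for i in range(N))
    for i, j in combinations(range(N), 2):
        v += 2 * (pi_joint[i][j] - pi[i] * pi[j]) / (pi[i] * pi[j]) * y[i] * y[j]
    return v

# Hypothetical population, simple random sampling of n = 2 from N = 5
y = [1, 3, 4, 7, 10]
N, n = len(y), 2
pi = [n / N] * N
pij = n * (n - 1) / (N * (N - 1))
pi_joint = [[pij] * N for _ in range(N)]

mean = sum(y) / N
S2 = sum((v - mean) ** 2 for v in y) / (N - 1)
print(ht_variance(y, pi, pi_joint), N ** 2 * (1 - n / N) * S2 / n)
```

Both printed values are the same, and enumerating the ten possible samples and computing the variance of `ht_estimate` directly gives that value again.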
For $n = 2$ this method of sample selection keeps $\pi_i = 2z_i$ and uses the Horvitz-
Thompson estimator
$$\hat{Y}_{HT} = \frac{y_i}{\pi_i} + \frac{y_j}{\pi_j} = \frac{1}{2}\left(\frac{y_i}{z_i} + \frac{y_j}{z_j}\right)$$
With $n = 2$ the probability that the $i$th unit is drawn is the sum of the
probabilities that it was drawn first and drawn second. Thus
(9A.46)
Since this method uses the HT estimate, theorem 9A.5 and its corollary provide
formulas for the variance and estimated variance of $\hat{Y}_{HT}$.
The method has two desirable properties. Brewer (1963) has shown that its
variance is always less than that of the estimate $\hat{Y}_{ppz}$ in sampling with replacement.
Second, some algebra shows (Rao, 1965) that $(\pi_i\pi_j - \pi_{ij}) > 0$ for all $i \ne j$, so that
the Yates-Grundy estimate $v_2$ of the variance is always positive.
Durbin's (1967) approach draws the first unit ($i$) with probability $z_i$. If unit $i$ was
drawn first, the probability that unit $j$ is drawn second is made proportional to
$$z_j\left[\frac{1}{(1-2z_i)} + \frac{1}{(1-2z_j)}\right] \qquad (9A.48)$$
In this case the divisor of the proportions is
$$\sum_{j\ne i}^{N} z_j\left[\frac{1}{(1-2z_i)} + \frac{1}{(1-2z_j)}\right] = \frac{(1-z_i)}{(1-2z_i)} + \sum_{j\ne i}^{N}\frac{z_j}{(1-2z_j)} = 1 + \sum_{j=1}^{N}\frac{z_j}{(1-2z_j)} \qquad (9A.49)$$
Thus the divisor is equal to $2D$ in Brewer's (9A.45). The probability that the $i$th
unit was drawn first and the $j$th unit second is, therefore,
$$P(i)P(j \mid i) = \frac{z_iz_j}{2D}\left[\frac{1}{(1-2z_i)} + \frac{1}{(1-2z_j)}\right]$$
By symmetry, this equals $P(j)P(i \mid j)$, so that Durbin's $\pi_{ij}$ is the same as Brewer's in
(9A.47).
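Durbin's selection rule for $n = 2$ is easy to tabulate exactly. The sketch below builds the table of probabilities that unit $i$ is drawn first and unit $j$ second, directly from the rule stated above; the function name and the set of $z_i$ are illustrative, and every $z_i$ is assumed to be below 1/2.

```python
def durbin_joint_probabilities(z):
    """P[i][j] = probability that unit i is drawn first and unit j second
    under Durbin's rule for n = 2 (requires every z_i < 1/2)."""
    N = len(z)
    # Divisor of the second-draw proportions; it does not depend on i
    two_D = 1 + sum(zj / (1 - 2 * zj) for zj in z)
    P = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            if i != j:
                weight = z[j] * (1 / (1 - 2 * z[i]) + 1 / (1 - 2 * z[j]))
                P[i][j] = z[i] * weight / two_D
    return P

z = [0.1, 0.1, 0.2, 0.3, 0.3]   # illustrative relative sizes
P = durbin_joint_probabilities(z)
first_draw = [sum(row) for row in P]
second_draw = [sum(P[i][j] for i in range(len(z))) for j in range(len(z))]
inclusion = [a + b for a, b in zip(first_draw, second_draw)]
print(first_draw)
print(second_draw)
print(inclusion)
```

The first-draw and second-draw probabilities both reproduce the $z_i$, and their sum gives $\pi_i = 2z_i$. The table is symmetric, which is the symmetry used in the argument above.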
Sampford (1967) has extended this method to samples of size $n$, provided
$nz_i < 1$ for all units in the population. With his method of sample selection, the
probability that the sample consists, for example, of units $1, 2, \ldots, n$ is a natural
extension of (9A.47), being proportional to
$$\left(1 - \sum_{i=1}^{n} z_i\right)\prod_{i=1}^{n}\frac{z_i}{(1 - nz_i)} \qquad (9A.51)$$
For this method it can be shown that $\pi_i = nz_i$. A formula for $\pi_{ij}$ is given, with
advice on its calculation by computer. The HT estimator of $Y$ is used, so that
formulas are available for its variance and estimated variance. The Yates-Grundy
estimator $v_2$ in (9A.44) is always positive. Several methods of actually drawing the
sample so as to satisfy (9A.51) are suggested by Sampford. One is to draw the first
unit with probabilities $z_i$ and all subsequent units with probabilities proportional
to $z_i/(1 - nz_i)$ with replacement. If a sample with $n$ distinct units is obtained, this is
accepted. An attempt at a sample is rejected as soon as a unit appears twice. This
method can be seen to lead to (9A.51). As a guide to its speed, a formula is given
for the expected number of attempts required to obtain a sample.
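The rejective procedure described above can be sketched as follows. This is a minimal illustration, not Sampford's own program: the function name is hypothetical, and for brevity each attempt draws all $n$ units before checking for repeats, which gives the same accepted samples as stopping at the first repeat.

```python
import random

def sampford_sample(z, n, rng):
    """Draw n distinct units (0-based indices) by the rejective scheme:
    first unit with probabilities z_i, later units with probabilities
    proportional to z_i / (1 - n z_i), with replacement, and the whole
    attempt rejected if any unit repeats."""
    if any(n * zi >= 1 for zi in z):
        raise ValueError("requires n * z_i < 1 for every unit")
    units = list(range(len(z)))
    later_weights = [zi / (1 - n * zi) for zi in z]
    while True:
        first = rng.choices(units, weights=z, k=1)
        rest = rng.choices(units, weights=later_weights, k=n - 1)
        attempt = first + rest
        if len(set(attempt)) == n:     # accept only samples of distinct units
            return sorted(attempt)

z = [0.1, 0.1, 0.2, 0.3, 0.3]          # illustrative relative sizes
rng = random.Random(2024)
print(sampford_sample(z, 2, rng))
```

Repeating the draw many times and tabulating how often each unit appears gives inclusion frequencies close to $nz_i$, in line with the property $\pi_i = nz_i$ quoted above.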
For $n = 2$ Durbin's (1967) method of drawing the sample, unlike the Brewer
method, has the property that the unconditional probability of drawing unit $i$ is $z_i$
at both the first and the second draws. In multistage sampling in surveys repeated
at regular intervals, Fellegi (1963) pointed out earlier that it is necessary or
advisable to drop units and replace them from time to time on some regular
pattern called a rotation scheme, because of the undesirability of long-continued
questioning of the same persons. He produced a method of selection of successive
units that also has the Durbin property. His method, based on iterative calculations,
is similar to the Brewer-Durbin method, but has slightly different $\pi_{ij}$.
This method uses the first selection technique suggested (section 9A.6), the
successive units being drawn with probabilities $z_i$, $z_j/(1 - z_i)$, $z_k/(1 - z_i - z_j)$, and
so on. Murthy's estimator (1957) follows earlier work by Des Raj (1956a),
who produced ingenious unbiased estimates based on the specific order in which
the $n$ units in the sample were drawn. Murthy showed that corresponding to any
ordered estimate of this class we can construct an unordered estimate that is also
unbiased and has smaller variance.
His proposed estimator is
$$\hat{Y}_M = \frac{\sum_{i}^{n} P(s \mid i)\,y_i}{P(s)} \qquad (9A.52)$$
where
$P(s \mid i)$ = conditional probability of getting the set of units that was drawn, given
that the $i$th unit was drawn first
$P(s)$ = unconditional probability of getting the set of units that was drawn
We now prove that the estimate $\hat{Y}_M$ is unbiased. For any unit $i$ in the
population, $\sum P(s \mid i) = 1$, taken over all samples having unit $i$ drawn first. To show
this for $n = 2$, with $j$ the second unit,
$$P(s \mid i) = \frac{z_j}{1-z_i}; \qquad \sum P(s \mid i) = \frac{\sum_{j\ne i} z_j}{1-z_i} = 1$$
For $n = 3$, with $j$ and $k$ the second and third units,
$$\sum P(s \mid i) = \sum_{j\ne i}^{N}\sum_{k\ne i,j}^{N}\frac{z_jz_k}{(1-z_i)(1-z_i-z_j)} = \sum_{j\ne i}^{N}\frac{z_j}{(1-z_i)} = 1$$
and so on for $n > 3$. Hence, when we sum $\sum P(s)\hat{Y}_M$ over all samples of size $n$, the
coefficient of $y_i$ in the sum is 1, so that
(9A.53)
General expressions for $V(\hat{Y}_M)$ and $v(\hat{Y}_M)$ for any $n$ have been given by
Murthy (1957).
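For small $n$ the probabilities $P(s)$ and $P(s \mid i)$ in (9A.52) can be obtained by brute-force enumeration of the possible orders of draw, which also makes it easy to confirm the unbiasedness argument numerically. The sketch below is illustrative only: the function names and the five-unit population are made up, and enumeration over all orderings is practical only for very small samples.

```python
from itertools import combinations, permutations

def ordered_probability(order, z):
    """Probability of drawing the units in this exact order, with successive
    probabilities z_i, z_j/(1 - z_i), z_k/(1 - z_i - z_j), ..."""
    p, remaining = 1.0, 1.0
    for u in order:
        p *= z[u] / remaining
        remaining -= z[u]
    return p

def sample_probability(sample, z):
    """P(s): probability of obtaining this set of units in any order."""
    return sum(ordered_probability(o, z) for o in permutations(sample))

def murthy_estimate(sample, y, z):
    """Murthy's estimator (9A.52) for an unordered sample of unit indices."""
    total = 0.0
    for i in sample:
        others = [u for u in sample if u != i]
        # P(s|i): probability of the rest of the set, given unit i came first
        p_given_i = sum(ordered_probability((i,) + o, z)
                        for o in permutations(others)) / z[i]
        total += p_given_i * y[i]
    return total / sample_probability(sample, z)

# Hypothetical population of N = 5 units
y = [4.0, 1.0, 7.0, 9.0, 12.0]
z = [0.1, 0.1, 0.2, 0.3, 0.3]
expected = sum(sample_probability(s, z) * murthy_estimate(s, y, z)
               for s in combinations(range(5), 3))
print(expected, sum(y))
```

The weighted average of the estimator over all possible samples equals the population total, as the proof above requires. For $n = 2$ the same code reproduces the closed form $[(1-z_j)y_i/z_i + (1-z_i)y_j/z_j]/(2 - z_i - z_j)$, which follows from substituting $P(s \mid i) = z_j/(1-z_i)$ into (9A.52).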
When $n = 2$, the sample consisting of units $i$ and $j$,
(9A.56)
(9A.57)
After substituting for YM"from (9A56) and for y l and rearranging, we find
(9A.58)
(9A.59)
where $P(s \mid ij)$ is the conditional probability of getting the sample given that units $i$
and $j$ were selected (in either order) in the first two draws. A computer program
for calculating the $P(s \mid i)$ and $P(s \mid ij)$ has been given by Bayless (1968).
1      3      9      1-9
2      1     12     10-12
3     11     45     13-45     17 (unit 3)
4      6     63     46-63     47 (unit 4)
5      4     75     64-75
6      2     81     76-81     77 (unit 6)
7      3     90     82-90
$M_0' = 30$
If $nz_i \le 1$ (i.e., $nM_i' \le M_0'$) for all $i$, any unit has a probability $nz_i$ of being
selected, and no unit is selected more than once. For instance, unit 5 is selected if
$4 \le r \le 15$, giving probability $12/30 = 3z_5$. If $nz_i > 1$ for one or more units, such
units may be selected more than once in the sample, but the average frequency of
selection is $\pi_i = nz_i$. This happens in the example for unit 3, since $3M_3' = 33 >
30 = M_0'$. Unit 3 is chosen once if $1 \le r \le 12$ or $16 \le r \le 30$ (in all 27 choices), and
twice if $13 \le r \le 15$ (3 choices). The average frequency of selection is $(1 \times 27 + 2 \times
3)/30 = 33/30 = 3z_3$.
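The selection counts quoted for units 3 and 5 can be reproduced with a short sketch of systematic selection on the cumulated sizes. It assumes, as the entries 17, 47, 77 in the table suggest, that the selection points are $r$, $r + M_0'$, $r + 2M_0'$ applied to the cumulated values of $nM_i'$; the function name is hypothetical.

```python
from itertools import accumulate

def systematic_pps(sizes, n, r):
    """Units (1-based) hit by the points r, r + M0, ..., r + (n-1)*M0
    in the cumulated totals of n * M_i, where 1 <= r <= M0."""
    M0 = sum(sizes)
    cumulative = list(accumulate(n * m for m in sizes))
    selected = []
    for k in range(n):
        point = r + k * M0
        # First unit whose cumulated range reaches the selection point
        unit = next(i for i, c in enumerate(cumulative, start=1) if point <= c)
        selected.append(unit)
    return selected

sizes = [3, 1, 11, 6, 4, 2, 3]          # M_i' for units 1..7, M0' = 30
print(systematic_pps(sizes, 3, 17))     # [3, 4, 6]
```

Running `r` over 1 to 30 and counting appearances gives 12 selections of unit 5 and 33 of unit 3 (27 starts with one appearance and 3 with two), matching the frequencies worked out above.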
It follows that
$$\hat{Y}_{sys} = \sum_{i}^{n}\frac{y_i}{\pi_i} = \frac{1}{n}\sum_{i}^{n}\frac{y_i}{z_i} \qquad (9A.61)$$
is an unbiased estimate of $Y$.
Hartley and Rao (1962) examined this method with the units first arranged in
random order. With the restriction $nz_i < 1$ (all $i$), they obtained approximate
expressions for $V(\hat{Y}_{sys})$ and $v(\hat{Y}_{sys})$.
(9A.62)
where $y_g$, $z_g$ refer to the unit drawn from group $g$.
Since the $Z_g$ will not be equal, this method does not keep the probabilities of
selection proportional to the sizes, and there is some evidence that its estimator
suffers a slight loss in precision. Its advantages are its simplicity and generality.
In developing $V(\hat{Y}_{RHC})$ we average over two stages. Stage 1 is the randomization
into groups, stage 2 the selection of a unit within each group. For any specific
split into groups, $\hat{Y}_g$ in (9A.62) is an unbiased estimator of the group total $Y_g$, and
hence $E_2(\hat{Y}_{RHC}) = Y$. A well-known formula for finding a variance over two
stages of sampling, proved in Chapter 10, is
$$V(\hat{Y}_{RHC}) = E_1[V_2(\hat{Y}_{RHC})] + V_1[E_2(\hat{Y}_{RHC})] \qquad (10.2)$$
Since $E_2(\hat{Y}_{RHC}) = Y$ and is a constant, it has zero variance and the second term in
(10.2) disappears in this application.
For $V_2(\hat{Y}_{RHC})$ we can use, within a group, the variance formula for sampling
with replacement, since only one unit is selected from each group. Within a group
the probabilities of selection are $z_i/Z_g$. By (9A.15) in section 9A.3 we get, for any
specific split (with $n_g = 1$),
(9A.63)
taken over the pair of units in group $g$. Over the set of random subdivisions into
groups, the probability that any pair of units falls in group $g$ is $N_g(N_g - 1)/
N(N - 1)$. Hence the average value of $V_2(\hat{Y}_g)$ over the randomization is
Thus $V(\hat{Y}_{RHC})$ is simply a multiple of the variance of the estimate $\hat{Y}_{ppz}$ in sampling
with replacement. If $N/n$ is integral, the choice $N_g = N/n$ minimizes the multiplier.
In this case
$$V(\hat{Y}_{RHC}) = \left[1 - \frac{(n-1)}{(N-1)}\right]V(\hat{Y}_{ppz}) \qquad (9A.66)$$
If $N/n$ is integral, (9A.66) gives $V(\hat{Y}_{RHC})/V(\hat{Y}_{ppz}) = (N - n)/(N - 1)$, the same
ratio as obtained in simple random sampling.
An unbiased variance estimator can be shown to be
$$v(\hat{Y}_{RHC}) = \frac{\left(\sum_{g}^{n} N_g^2 - N\right)}{\left(N^2 - \sum_{g}^{n} N_g^2\right)}\sum_{g}^{n} Z_g\left(\frac{y_g}{z_g} - \hat{Y}_{RHC}\right)^2$$
With $N = nR + k$, and $k$ groups of size $(R + 1)$, formula (9A.68) becomes
$$v(\hat{Y}_{RHC}) = \frac{N^2 + k(n-k) - Nn}{N^2(n-1) - k(n-k)}\sum_{g}^{n} Z_g\left(\frac{y_g}{z_g} - \hat{Y}_{RHC}\right)^2$$
In the literature, comparisons of the performances of some of the methods have
been made, particularly by Rao and Bayless (1969, 1970), in three situations: (a)
on small artificial populations [e.g., the populations with $N = 4$, $n = 2$ constructed
by Yates and Grundy (1953)], (b) under the linear regression model used in
section 9A.5, and (c) on 20 natural populations.
Seven methods are compared here on three artificial populations with $N = 5$,
$n = 2$. The relative sizes $z_i$ of the units are the same in all three populations
(A, B, C) (Table 9A.1). In A the mean per element, which is proportional to $y_i/z_i$,
is uncorrelated with $z_i$. In B the mean per element rises as the sizes increase. In C
the unit totals have little relation to the sizes.
TABLE 9A.l
THREE SMALL ARTIFICIAL POPULATIONS
Relative sizes ($z_i$)    0.1    0.1    0.2    0.3    0.3
TABLE 9A.2
VARIANCES OF THE ESTIMATED POPULATION
TOTALS
Population
Estimate A B C
section 9A.13, this superiority is probably due to the estimator $N\bar{y}$, not the SRS
feature.
Rao and Bayless (1969) compared 10 unequal-probability methods in 20
natural populations found in books and papers on sampling, with $N$ ranging from
9 to 35. They confined themselves to methods (a) known to have smaller variances
than $\hat{Y}_{ppz}$ and (b) providing a positive unbiased variance estimator. Among the
methods presented here, they compared the efficiencies of $\hat{Y}_M$, $\hat{Y}_{RHC}$, and $\hat{Y}_{ppz}$
with that of $\hat{Y}_B$. For $n = 2$, there was little to choose among the three "without
replacement" methods, with $\hat{Y}_M$ slightly ahead, beating $\hat{Y}_{RHC}$ whenever the two
methods differed in precision. Also, we have noted (section 9A.7) that the
variance estimator may be unstable for methods using the Horvitz-Thompson
estimate. The Rao-Bayless results compare the coefficients of variation of $v(\hat{Y}_M)$,
$v(\hat{Y}_{RHC})$, and $v(\hat{Y}_B)$ in the Yates-Grundy form with that of $v(\hat{Y}_{ppz})$, as measures
of the stabilities of the variance estimators. Relative to $v(\hat{Y}_B)$ as 100%, the
median efficiencies of the other variance estimators were: $v(\hat{Y}_{RHC}) = 109\%$,
$v(\hat{Y}_M) = 104\%$, $v(\hat{Y}_{ppz}) = 97\%$, the three methods all showing a few large
individual gains.
Bayless and Rao (1970) give similar comparisons for $n = 3$ (14 populations) and
$n = 4$ (10 populations). Sampford's extension of the Brewer method was used. For
$n$ much beyond 2, both Sampford's and Murthy's methods require computer aid
in calculating the needed probabilities. The variances of $\hat{Y}_M$ and $\hat{Y}_S$ agreed closely
in nearly all populations, with $\hat{Y}_{RHC}$ slightly behind, its median efficiency relative
to $\hat{Y}_S$ dropping to 92% for $n = 4$. In stability of the variance estimators, on
the other hand, the superiority of $\hat{Y}_{RHC}$ and $\hat{Y}_M$ increased, the median relative
efficiencies being 118% ($n = 3$) and 129% ($n = 4$) for $v(\hat{Y}_{RHC})$ and 110% ($n = 3$)
and 120% ($n = 4$) for $v(\hat{Y}_M)$ relative to 100% for $v(\hat{Y}_S)$.
Rao and Bayless also compared the efficiencies of $\hat{Y}$ and $v(\hat{Y})$ for some of the
estimators under the linear regression model of section 9A.5 with $\alpha = 0$. While
comparative results depended on the power $g$, the general trend was similar to that
in the natural populations.
(9A.70)
Ratio estimates enter either when the variable of interest is a ratio (e.g.,
unemployed females/females eligible for work), or when they are used to increase
precision. In unequal-probability sampling a single choice of the $z_i$ or $z_{hi}$ must be
made for all variables for which estimates are required. For instance, the $z_i$ or $z_{hi}$
may be proportional to the total recent sales of a type of business, where the
survey has to estimate current sales of individual classes of items as well as current
total sales. For some classes, sales may not be closely proportional to the $z_i$. In
such cases, use of the familiar "ratio to the same variable last time" estimate may
bring substantial increases in precision.
With unequal-probability sampling the change in formulas to those for ratio
estimators is easily made. In an unstratified population, replace $\hat{Y}$ for any method
by $X\hat{Y}/\hat{X}$. For instance, with the HT estimator, we use
$X\left(\sum^{n} y_i/\pi_i\right)\big/\left(\sum^{n} x_i/\pi_i\right)$ for
the ratio form instead of $\left(\sum^{n} y_i/\pi_i\right)$. For the standard approximations to the MSE
and estimated MSE of a ratio estimate, replace $y_i$ by $d_i = (y_i - Rx_i)$ in $V(\hat{Y})$ and by
$d_i' = (y_i - \hat{R}x_i)$ in $v(\hat{Y})$.
For instance, with the ratio form of the Horvitz-Thompson estimator, the
approximation to $v_2$, the estimated variance in the Yates-Grundy form, is from
(9A.44), p. 261,
\[ v_2[\hat{Y}_{HT(R)}] \doteq \sum_i^n \sum_{j>i}^n \frac{(\pi_i\pi_j - \pi_{ij})}{\pi_{ij}} \left(\frac{d_i'}{\pi_i} - \frac{d_j'}{\pi_j}\right)^2 \qquad (9A.71) \]
In a stratified population it is likely with small $n_h$ that the combined ratio
estimate (section 6.11) will be used [i.e., $\hat{Y}_{Rc} = X(\sum \hat{Y}_h)/(\sum \hat{X}_h)$]. For approximate
variance formulas, replace $y_{hi}$ by $d_{hi} = y_{hi} - Rx_{hi}$ in $V$ and by $d_{hi}' = y_{hi} - \hat{R}x_{hi}$ in $v$.
When the $y_i$ for an item are not related to the $z_i$ and no suitable ratio estimate is
feasible for such an item, Rao (1966) has investigated alternative estimators.
These are produced from any of the unequal-probability estimators (HT, M,
RHC, etc.) by replacing $y_i/z_i$ by $Ny_i$ wherever $y_i$ appears in the estimator. Thus,
for the Horvitz-Thompson estimator, the alternative form is
\[ \hat{Y}^*_{HT} = \frac{N}{n}\sum_i^n y_i \qquad (9A.72) \]
The estimators are biased but intuition suggests that if the $y_i$ have no relation to
the $z_i$, the biases should be relatively small. By the same method as used in finding
$V(\hat{Y}_{HT})$ in (9A.42), we get
\[ V[\hat{Y}^*_{HT}] = \frac{N^2}{n^2}\sum_i^N \sum_{j>i}^N (\pi_i\pi_j - \pi_{ij})(y_i - y_j)^2 \qquad (9A.74) \]
Thus $V[\hat{Y}^*_{HT}]$ depends on the amount of variation in the unit totals, as with
$V(\hat{Y}_{SRS})$. In population C, Table 9A.2, this method gave 0.266 and 0.280 for the
MSE's of the alternative forms of $\hat{Y}_{HT}$ and $\hat{Y}_M$, these forms doing almost as well as
$\hat{Y}_{SRS}$. For both methods the (Bias)$^2$ term was about 4% of the MSE.
EXERCISES
9A.1 Horvitz and Thompson (1952) give the following data for eye estimates $M_i$ of the
numbers of households and for the actual numbers $y_i$ in 20 city blocks in Ames, Iowa. To
assist in the calculations, values of $y_i^2$ and $y_i^2/M_i$ are also given. A sample of $n = 1$ block is
chosen. Compute the variances of the total number of households $\hat{Y}$, as obtained by (a) the
unbiased estimate in sampling with equal probabilities, (b) the ratio estimate in sampling
with equal probabilities, (c) sampling with probability proportional to $M_i$. (For the ratio
estimate, compute the true mean square error, not the approximate formula.)
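For $V(\hat\theta)$ this method gives the following easily remembered result.
\[ V(\hat\theta) = V_1[E_2(\hat\theta)] + E_1[V_2(\hat\theta)] \qquad (10.2) \]
where $V_2(\hat\theta)$ is the variance over all possible subsample selections for a given set of
units. To show this, let $\theta = E(\hat\theta)$ (where $\theta$ is not necessarily the quantity that $\hat\theta$ is
designed to estimate, since $\hat\theta$ may be biased). By definition,
\[ V(\hat\theta) = E(\hat\theta - \theta)^2 = E_1E_2(\hat\theta - \theta)^2 \qquad (10.3) \]
But
\[ E_2(\hat\theta - \theta)^2 = E_2(\hat\theta^2) - 2\theta E_2(\hat\theta) + \theta^2 \qquad (10.4) \]
\[ = [E_2(\hat\theta)]^2 + V_2(\hat\theta) - 2\theta E_2(\hat\theta) + \theta^2 \qquad (10.5) \]
Average now over first-stage selections. Since $E_1E_2(\hat\theta) = \theta$,
\[ V(\hat\theta) = E_1[E_2(\hat\theta)]^2 - \theta^2 + E_1[V_2(\hat\theta)] \qquad (10.6) \]
\[ = V_1[E_2(\hat\theta)] + E_1[V_2(\hat\theta)] \qquad (10.7) \]
Formula (10.7) extends naturally to three or more stages. For three-stage
sampling,
\[ V(\hat\theta) = V_1\{E_2[E_3(\hat\theta)]\} + E_1\{V_2[E_3(\hat\theta)]\} + E_1\{E_2[V_3(\hat\theta)]\} \qquad (10.7') \]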
\[ \bar{y}_i = \sum_{j=1}^m \frac{y_{ij}}{m} = \text{sample mean per subunit in the $i$th primary unit} \]
\[ \bar{\bar{y}} = \sum_{i=1}^n \frac{\bar{y}_i}{n} = \text{over-all sample mean per subunit} \]
\[ S_1^2 = \frac{\sum_{i=1}^N (\bar{Y}_i - \bar{\bar{Y}})^2}{N-1} = \text{variance among primary unit means} \]
\[ S_2^2 = \frac{\sum_{i=1}^N \sum_{j=1}^M (y_{ij} - \bar{Y}_i)^2}{N(M-1)} = \text{variance among subunits within primary units} \]
Note that $Y_i$ denotes the total over all subunits in the $i$th unit (denoted by $y_i$ in
Chapters 9 and 9A).
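Theorem 10.1. If the $n$ units and the $m$ subunits from each chosen unit are
selected by simple random sampling, $\bar{\bar{y}}$ is an unbiased estimate of $\bar{\bar{Y}}$ with
variance
\[ V(\bar{\bar{y}}) = \left(\frac{N-n}{N}\right)\frac{S_1^2}{n} + \left(\frac{M-m}{M}\right)\frac{S_2^2}{mn} \qquad (10.8) \]
By (10.7),
\[ V(\bar{\bar{y}}) = V_1[E_2(\bar{\bar{y}})] + E_1[V_2(\bar{\bar{y}})] \qquad (10.10) \]
Since $E_2(\bar{\bar{y}}) = \sum^n \bar{Y}_i/n$, the first term on the right is the variance of the mean per
subunit for a one-stage simple random sample of $n$ units. Hence, by
Theorem 2.2,
\[ V_1[E_2(\bar{\bar{y}})] = \left(\frac{N-n}{N}\right)\frac{S_1^2}{n} \qquad (10.11) \]
Furthermore, with $\bar{\bar{y}} = \sum^n \bar{y}_i/n$ and simple random sampling used at the second
stage,
\[ V_2(\bar{\bar{y}}) = \frac{(M-m)}{M}\,\frac{1}{n^2}\sum_i^n \frac{S_{2i}^2}{m} \qquad (10.12) \]
where $S_{2i}^2 = \sum_j^M (y_{ij} - \bar{Y}_i)^2/(M-1)$ is the variance among subunits for the $i$th
primary unit. When we average over the first-stage samples, $\sum_i^n S_{2i}^2/n$ averages to
$\sum_i^N S_{2i}^2/N = S_2^2$.
Hence
\[ E_1[V_2(\bar{\bar{y}})] = \left(\frac{M-m}{M}\right)\frac{S_2^2}{mn} \qquad (10.13) \]
The theorem follows from formula (10.10), on adding (10.11) and (10.13).
If $f_1 = n/N$ and $f_2 = m/M$ are the sampling fractions in the first and second
stages, an alternative form of the result is
\[ V(\bar{\bar{y}}) = \frac{1-f_1}{n}S_1^2 + \frac{1-f_2}{mn}S_2^2 \qquad (10.14) \]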
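Theorem 10.1 can be checked numerically on a very small population. The sketch below is an illustration only: the population values are hypothetical and the function names are not from the text. It enumerates every equally likely two-stage sample, averages the squared error of the sample mean per subunit, and compares the result with formula (10.8).

```python
from itertools import combinations, product

def formula_variance(pop, n, m):
    """Formula (10.8): ((N-n)/N) S1^2/n + ((M-m)/M) S2^2/(mn)."""
    N, M = len(pop), len(pop[0])
    unit_means = [sum(unit) / M for unit in pop]
    grand_mean = sum(unit_means) / N
    S1_sq = sum((ub - grand_mean) ** 2 for ub in unit_means) / (N - 1)
    S2_sq = sum((y - ub) ** 2
                for unit, ub in zip(pop, unit_means) for y in unit) / (N * (M - 1))
    return (N - n) / N * S1_sq / n + (M - m) / M * S2_sq / (m * n)

def enumerated_variance(pop, n, m):
    """Mean squared error of the sample mean over all equally likely two-stage samples."""
    N, M = len(pop), len(pop[0])
    grand_mean = sum(sum(unit) for unit in pop) / (N * M)
    squared_errors = []
    for units in combinations(range(N), n):
        # every possible subsample mean within each selected unit
        sub_means = [[sum(s) / m for s in combinations(pop[i], m)] for i in units]
        for means in product(*sub_means):
            squared_errors.append((sum(means) / n - grand_mean) ** 2)
    return sum(squared_errors) / len(squared_errors)

# Hypothetical population: N = 3 primary units, M = 3 subunits each.
population = [[1, 3, 5], [2, 2, 8], [4, 6, 11]]
v_formula = formula_variance(population, n=2, m=2)
v_enumerated = enumerated_variance(population, n=2, m=2)
print(v_formula, v_enumerated)
```

Because the estimate is unbiased, the mean squared error found by enumeration is the variance, so the two printed values should agree up to rounding error.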
(10.15)
Proof.
(10.17)
Hence
where $\bar{Y}_n = \sum^n \bar{Y}_i/n$. The last term on the right holds because subsampling is
independent in different units and $\bar{\bar{y}} = \sum^n \bar{y}_i/n$. Thus,
(10.19)
Multiplying by $(1 - f_1)/n(n - 1)$ and averaging over the first stage of simple
random sampling,
(10.20)
By comparison with (10.14) for $V(\bar{\bar{y}})$, note that the term in $S_2^2$ is too small by the
amount $f_1(1 - f_2)S_2^2/mn$. Since $E_1E_2(s_2^2) = S_2^2$, an unbiased estimate of $V(\bar{\bar{y}})$ is
therefore
\[ v(\bar{\bar{y}}) = \frac{1-f_1}{n}s_1^2 + \frac{f_1(1-f_2)}{mn}s_2^2 \qquad (10.21) \]
\[ v(\bar{\bar{y}}) = \frac{s_1^2}{n} = \frac{\sum^n (\bar{y}_i - \bar{\bar{y}})^2}{n(n-1)} \qquad (10.23) \]
Thus the estimated variance can be computed from a knowledge of the unit means
only. This result is helpful when subsampling is systematic, because in this event
we cannot compute an unbiased estimate of $S_2^2$. But (10.23) still applies, provided
that $n/N$ is small. If $n/N$ is not small, (10.23) overestimates by the amount
$f_1S_1^2/n$, as seen from (10.20) and (10.14).
\[ s_1^2 = \frac{\sum_i^n (p_i - \bar{p})^2}{n-1} \]
\[ s_2^2 = \frac{m}{n(m-1)}\sum_i^n p_iq_i \]
Example. In a study of plant disease the plants were grown in 160 small plots
containing nine plants each. A random sample of 40 plots was chosen and three random
plants in each sampled plot were examined for the presence of disease. It was found that 22
plots had no diseased plants (out of three), 11 had one, 4 had two, and 3 had three. Estimate
the proportion of diseased plants and its s.e. The symbol $\phi$ denotes the frequencies 22, 11,
4, 3.
We have $N = 160$, $M = 9$, $n = 40$, $m = 3$. In finding $s_1^2$ and $s_2^2$, it is convenient to work at
first with the numbers of diseased plants $(3p_i)$ and the numbers of healthy plants $(3q_i)$. The
calculations are set out as follows.

         Frequency
 3p_i        φ        9p_iq_i    9φp_iq_i    3φp_i    9φp_i^2
  0         22           0           0          0         0
  1         11           2          22         11        11
  2          4           2           8          8        16
  3          3           0           0          9        27
 Total      40                      30         28        54

\[ \bar{p} = \frac{3\sum \phi p_i}{3\sum \phi} = \frac{28}{120} = 0.233 \]
\[ \sum \phi (p_i - \bar{p})^2 = \frac{1}{9}\left(54 - \frac{(28)^2}{40}\right) = 3.822 \]
\[ \sum \phi p_iq_i = \frac{30}{9} = 3.333 \]
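The arithmetic of this example can be reproduced with a short script. This is a sketch under stated assumptions: the variance estimate uses the form (10.21), and the within-plot term uses $s_2^2 = m\sum p_iq_i/[n(m-1)]$, the usual unbiased variance for 0/1 data. The final standard error is computed here and is not quoted from the surviving text; the variable names are illustrative.

```python
from math import sqrt

N, M, n, m = 160, 9, 40, 3
# number of diseased plants found (out of m = 3) -> number of plots
freq = {0: 22, 1: 11, 2: 4, 3: 3}

p_bar = sum(d * f for d, f in freq.items()) / (n * m)            # 28/120
sum_sq_dev = sum(f * (d / m - p_bar) ** 2 for d, f in freq.items())
sum_pq = sum(f * (d / m) * (1 - d / m) for d, f in freq.items())

s1_sq = sum_sq_dev / (n - 1)            # variance between plot proportions
s2_sq = m * sum_pq / (n * (m - 1))      # assumed within-plot variance for 0/1 data
f1, f2 = n / N, m / M

# form (10.21): (1 - f1) s1^2 / n + f1 (1 - f2) s2^2 / (mn)
v_p = (1 - f1) / n * s1_sq + f1 * (1 - f2) / (m * n) * s2_sq
se_p = sqrt(v_p)
print(round(p_bar, 3), round(sum_sq_dev, 3), round(sum_pq, 3), round(se_p, 4))
```

The first three printed values correspond to the quantities 0.233, 3.822, and 3.333 worked out above.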
The last term on the right does not depend on the choice of $n$ and $m$. Minimizing $V$
for fixed $C$, or $C$ for fixed $V$, is equivalent to minimizing the product
$(V + S_1^2/N)C$, which gives
\[ m_{opt} = \frac{S_2}{\sqrt{S_1^2 - S_2^2/M}}\sqrt{c_1/c_2} \qquad (10.26) \]
provided $S_1^2 > S_2^2/M$. Round $m_{opt}$ to the nearest integer. If $m$ is an integer such
that $m < m_{opt} < m + 1$, a slightly better rule for $m_{opt}$ small is (Cameron, 1951):
round up if $m_{opt}^2 > m(m + 1)$, otherwise round down. If $m_{opt} > M$ or if $S_1^2 < S_2^2/M$,
take $m = M$, using one-stage sampling. (The product $(V + S_1^2/N)C$ is a strictly
decreasing function of $m$ when $S_1^2 < S_2^2/M$.)
The value of $n$ is found by solving either the cost equation or the variance
equation, depending on which has been preassigned.
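The rounding rules just described can be collected in a small helper. The following is a minimal sketch, assuming the population quantities $S_1^2$, $S_2^2$, $M$ and the unit costs $c_1$, $c_2$ are known or have been estimated; the function name `optimum_m` is hypothetical.

```python
from math import floor, sqrt

def optimum_m(S1_sq, S2_sq, M, c1, c2):
    """Integer subsample size from (10.26) with Cameron's rounding rule."""
    Su_sq = S1_sq - S2_sq / M
    if Su_sq <= 0:
        return M                      # S1^2 <= S2^2/M: take m = M (one-stage sampling)
    m_opt = sqrt(S2_sq / Su_sq) * sqrt(c1 / c2)
    if m_opt > M:
        return M
    m = floor(m_opt)
    # round up if m_opt^2 > m(m + 1), otherwise round down
    return m + 1 if m_opt ** 2 > m * (m + 1) else m

# m_opt about 4.11 rounds down to 4; m_opt about 4.48 rounds up to 5 under
# Cameron's rule even though 4 is the nearest integer.
print(optimum_m(1 + 1.69 / 1000, 1.69, 1000, 10, 1))
print(optimum_m(1 + 20.1 / 1000, 20.1, 1000, 1, 1))
```

When $m_{opt}$ is below 1 the rule returns 1, since $m_{opt}^2 > 0 = 0 \cdot 1$.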
In most practical situations the optimum is relatively flat. An error of a few units
in the choice of $m$ produces only a small loss of precision, as the following example
illustrates. Write
\[ S_u^2 = S_1^2 - \frac{S_2^2}{M} \qquad (10.27) \]
Example. Let $S_2/S_u = 1.3$ and $c_1/c_2 = 10$;
then
\[ m_{opt} = 1.3\sqrt{10} = 4.1 \]
We will regard total cost as fixed and see how the variance of $\bar{\bar{y}}$ changes with $m$. $N$ is
assumed large. From (10.14),
\[ V(\bar{\bar{y}}) = \frac{S_u^2}{n} + \frac{S_2^2}{nm} \]
\[ V(\bar{\bar{y}}) = \frac{S_u^2c_2}{C}\left(1 + \frac{S_2^2}{mS_u^2}\right)\left(\frac{c_1}{c_2} + m\right) = \frac{S_u^2c_2}{C}\left(1 + \frac{1.69}{m}\right)(10 + m) \]
Omitting the constant factor, the relative variance can be calculated for different values of
$m$. Table 10.1 shows these variances and the relative precisions (with the maximum
precision for $m = 4$ taken as the standard).
For any value of $m$ between 2 and 9, the loss of precision relative to the optimum is less
than 12%.
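The flatness of the optimum in this example is easy to tabulate. The sketch below evaluates the relative variance $(1 + 1.69/m)(10 + m)$, with the constant factor omitted as in the text, and expresses each value as a precision relative to the best integer $m$; the names are illustrative.

```python
def relative_variance(m, var_ratio=1.69, cost_ratio=10):
    """(1 + (S2^2/Su^2)/m)(c1/c2 + m), the variance up to a constant factor."""
    return (1 + var_ratio / m) * (cost_ratio + m)

values = {m: relative_variance(m) for m in range(1, 11)}
best_m = min(values, key=values.get)
relative_precision = {m: values[best_m] / v for m, v in values.items()}

for m in sorted(values):
    print(m, round(values[m], 2), round(relative_precision[m], 3))
```

The printout shows the relative precision staying above 0.88 for every $m$ from 2 to 9, consistent with the statement that the loss is under 12% in that range.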
deals with the choice of $n'$ and $m'$. If $s_1^2$ is the variance between unit means and $s_2^2$
is the variance between subunits within units, as defined in section 10.4, (10.22)
gives
\[ E(s_1^2) = \left(S_1^2 - \frac{S_2^2}{M}\right) + \frac{S_2^2}{m'} = S_u^2 + \frac{S_2^2}{m'} \qquad (10.31) \]
From (10.26),
\[ m_{opt} = \frac{S_2}{\sqrt{S_1^2 - S_2^2/M}}\sqrt{c_1/c_2} \]
As an estimate of $m_{opt}$ from the pilot survey, (10.31) suggests that we take
\[ \hat{m}_{opt} = \frac{s_2}{\sqrt{s_1^2 - s_2^2/m'}}\sqrt{c_1/c_2} = \frac{\sqrt{m'c_1/c_2}}{\sqrt{(m's_1^2/s_2^2) - 1}} \qquad (10.32) \]
The estimate $\hat{m}_{opt}$ is subject to a sampling error that depends on the sampling error
of the ratio $s_1^2/s_2^2$. From the analysis of variance it is known that $m's_1^2/s_2^2$ is
distributed as
\[ F\left(1 + \frac{m'S_u^2}{S_2^2}\right) \]
where $F$ has $(n' - 1)$ and $n'(m' - 1)$ degrees of freedom, provided that the $y_{ij}$ are
normally distributed. This result leads to the sampling distribution of $\hat{m}_{opt}$ for
given values of $n'$ and $m'$, that is,
\[ \hat{m}_{opt} = \frac{\sqrt{m'c_1/c_2}}{\sqrt{F\left(1 + \dfrac{m'S_u^2}{S_2^2}\right) - 1}} \qquad (10.33) \]
Example. With $c_1/c_2 = 10$ and $S_2 = 1.3S_u$,
consider how well $m_{opt}$ is estimated from a pilot sample with $n' = 10$ and $m' = 4$. From
(10.33),
\[ \hat{m}_{opt} = \frac{6.324}{\sqrt{F[1 + (4/1.69)] - 1}} = \frac{6.324}{\sqrt{3.367F - 1}} \]
where $F$ has 9 and 30 df. To find the limits within which $\hat{m}_{opt}$ will lie 80% of the time, we
have, from the 10% one-tailed significance levels of $F$,
\[ F_{.10}(9, 30) = 1.8490, \qquad F_{.90}(9, 30) = 1/F_{.10}(30, 9) = 1/2.2547 = 0.4435 \]
Substitution of these values of $F$ gives
lower limit, $\hat{m}_{opt} = 2.8$; upper limit, $\hat{m}_{opt} = 9.0$
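The two limits follow from substituting the tabulated $F$ values into (10.33). The sketch below takes the $F$ percentage points as inputs (the values quoted above) rather than computing them, so it needs no statistical library; the function name `m_opt_hat` is hypothetical.

```python
from math import inf, sqrt

def m_opt_hat(F, m_pilot, cost_ratio, var_ratio):
    """Value of the estimated optimum from (10.33) for a given F.

    cost_ratio is c1/c2 and var_ratio is S2^2/Su^2.
    """
    denom = F * (1 + m_pilot / var_ratio) - 1
    # a non-positive denominator means no finite limit exists
    return sqrt(m_pilot * cost_ratio) / sqrt(denom) if denom > 0 else inf

lower = m_opt_hat(1.8490, m_pilot=4, cost_ratio=10, var_ratio=1.69)
upper = m_opt_hat(1 / 2.2547, m_pilot=4, cost_ratio=10, var_ratio=1.69)
print(round(lower, 1), round(upper, 1))
```

When $F(1 + m'S_u^2/S_2^2)$ falls to 1 or below, the function returns infinity, which is how an unbounded upper limit arises for small pilot samples.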
As shown in Table 10.1, any $m$ in this range gives a degree of precision close to the
optimum. Thus, with $n' = 10$, $m' = 4$, the chances are 8 in 10 that the loss of precision is
small with normal data.
The 80 and 95% limits for $n' = 5, 10, 20$ and $m' = 4$ appear in Table 10.3. With $n' = 20$,
we are almost certain to estimate $m_{opt}$ with precision close to the optimum. This is not so
with $n' = 5$.
TABLE 10.3
LOWER AND UPPER LIMITS FOR $\hat{m}_{opt}$
 n'        80%           95%
  5      2.5, ∞        1.8, ∞
 10      2.8, 9.0      2.3, ∞
 20      3.1, 6.4      2.7, 9.1
If the ratio $c_1/c_2$ is the same in the pilot survey as in the main survey, the cost of
the pilot survey will be proportional to $c_1n' + c_2n'm'$. Brooks (1955) gives a table
of the values of $(n', m')$ in the most economical pilot survey that provides an
expected relative precision of 90% in the estimation of $m_{opt}$. Table 10.4 shows part
of this table.
TABLE 10.4
PILOT SAMPLE DESIGNS HAVING AN EXPECTED RELATIVE PRECISION OF 90%
c_1/c_2 =     1       2       4       8       16      32      64
S_2^2/S_u^2   n' m'   n' m'   n' m'   n' m'   n' m'   n' m'   n' m'
1 7 3 6 4 6 5 5 6 5 7 4 10 4 12
2 Ii 5 '7 7 6 9 6 9 5 13 5 14 4 20
4 9 9 8 II 8 12 7 14 7 15 5 25 5 27
8 10 14 10 15 9 17 9 IB B 22 6 32 5 44
16 10 :5 10 27 10 27 10 28 II 37 7 46 6 60
32 10 46 10 47 10 48 10 49 9 58 8 69 6 102
64 10 92 10 93 10 96 10 100 10 104 8 137 7 169
The computations assume that $N$ and $M$ are large; the designs are conservative
if fpc terms are taken into account. Note that no more than 10 primary units are
required and that the designs are relatively insensitive to the ratio $c_1/c_2$.
estimate crop production in India (Sukhatme, 1947), the village is a convenient
sampling unit. Within a village, only some of the fields growing the crop in
question are selected, so that the field is a subunit. When a field is selected, only
certain parts of it are cut for the determination of yield per acre; thus, the subunit
itself is sampled. If physical or chemical analyses of the crop are involved, an
additional subsampling may be used, since these determinations are often made
on a part of the sample cut from a field.
The results are a straightforward extension of those for two-stage sampling and
are given briefly. The population contains $N$ first-stage units, each with $M$
second-stage units, each of which has $K$ third-stage units. The corresponding
numbers for the sample are $n$, $m$, and $k$, respectively. Let $y_{iju}$ be the value obtained
for the $u$th third-stage unit in the $j$th second-stage unit drawn from the $i$th primary
unit. The relevant population means per third-stage unit are as follows.
\[ \bar{Y}_{ij} = \frac{\sum_u^K y_{iju}}{K}, \qquad \bar{Y}_i = \frac{\sum_j^M \sum_u^K y_{iju}}{MK}, \qquad \bar{Y} = \frac{\sum_i^N \sum_j^M \sum_u^K y_{iju}}{NMK} \]
Theorem 10.3. If simple random sampling is used at all three stages, the
sample mean $\bar{y}$ per third-stage unit is an unbiased estimate of $\bar{Y}$ with variance
\[ V(\bar{y}) = \frac{1-f_1}{n}S_1^2 + \frac{1-f_2}{nm}S_2^2 + \frac{1-f_3}{nmk}S_3^2 \qquad (10.34) \]
where $f_1 = n/N$, $f_2 = m/M$, $f_3 = k/K$ are the sampling fractions at the three stages.
Proof. Only the principal steps are indicated. Write
\[ \bar{y} - \bar{Y} = (\bar{y} - \bar{Y}_{nm}) + (\bar{Y}_{nm} - \bar{Y}_n) + (\bar{Y}_n - \bar{Y}) \qquad (10.35) \]
where $\bar{Y}_{nm}$ is the population mean of the $nm$ second-stage units that were selected
and $\bar{Y}_n$ is the population mean of the $n$ primary units that were selected. When we
square and take the average, the cross-product terms vanish. The contributions of
the squared terms turn out to be as follows.
(10.37)
and $E(s_3^2) = S_3^2$. To obtain the first result, let $\bar{y}_{iK}$ denote the mean over the $m$
second-stage units in the $i$th primary unit, given that all $K$ elements were
enumerated at the third stage. Let $\bar{y}_K$ be the mean of the $n$ values $\bar{y}_{iK}$. Then, from
(10.22) for two-stage sampling, it follows that
(10.38)
Now, if $\bar{y}_i$ is the sample mean for the $i$th primary unit, write
(10.39)
By first averaging over samples in which the first-stage and second-stage units
are fixed, it may be shown that
\[ E\left[\frac{\sum_i^n \{(\bar{y}_i - \bar{y}_{iK}) - (\bar{y} - \bar{y}_K)\}^2}{n-1}\right] = \frac{(1-f_3)S_3^2}{mk} \qquad (10.40) \]
and that the cross-product terms from (10.39) contribute nothing. This establishes
the result for $E(s_1^2)$. That for $E(s_2^2)$ is found similarly. Hence
(10.42)
(10.45)
where $W_h = N_hM_h/\sum N_hM_h$ is the relative size of the stratum in terms of second-
stage units and $\bar{\bar{y}}_h$ is the sample mean in the stratum. By applying theorem 10.1
within each stratum, we have
(10.46)
\[ v(\bar{\bar{y}}_{st}) = \sum_h W_h^2\left[\frac{1-f_{1h}}{n_h}s_{1h}^2 + \frac{f_{1h}(1-f_{2h})}{n_hm_h}s_{2h}^2\right] \qquad (10.47) \]
Corresponding variances for the estimated population total are obtained by
multiplying formulas (10.46) and (10.47) by $(\sum N_hM_h)^2$.
The quantity
(10.49)
(10.50)
These give
(10.5])
The formula for optimum $m_h$ is exactly the same as in unstratified sampling
[(10.26) in section 10.6].
From (10.49), since $W_h \propto N_hM_h$,
\[ n_h \propto \frac{N_hM_hS_{uh}}{\sqrt{c_{1h}}} \qquad (10.52) \]
Since self-weighting estimates are convenient, we consider under what
circumstances the optimum allocation leads to a self-weighting estimate. From
(10.45), it follows that $\bar{\bar{y}}_{st}$ is self-weighting if $n_hm_h/N_hM_h = f_0 = $ constant since, in
this event,
\[ \bar{\bar{y}}_{st} = \frac{\sum_h \sum_i \sum_j y_{hij}}{\sum_h n_hm_h} \]
The condition is, as might be expected, that the probability $f_0$ of selecting a subunit
be the same in all strata.
From (10.50), the optimum allocation gives
(10.53)
Frequently $c_{2h}$, the cost per second-stage unit, will be approximately the same
in large and small primary units; but $S_{2h}$ may be greater in large units than in
small. However, since the optimum is flat, a self-weighting sample will often be
almost as precise as the optimum. Note that this result holds even if the optimum
sampling of primary units is far from proportional.
EXERCISES
10.1 A set of 20,000 records are stored in 400 file drawers, each containing 50 records.
In a two-stage sample, five records are drawn at random from each of 80 randomly selected
drawers. For one item, the estimates of variance were $s_1^2 = 362$, $s_2^2 = 805$, as defined in
section 10.4. (a) Compute the standard error of the mean per record from this sample. (b)
Compare this with the standard error given by the approximate formula (10.23) in section
10.4.
10.2 From the results of a pilot two-stage sample, in which $m'$ subunits were chosen
from each of $n'$ units, it is useful to be able to estimate the value of $V(\bar{\bar{y}})$ that would be given
by a subsequent sample having $m$ subunits from each of $n$ units. Show that an unbiased
estimate of $V(\bar{\bar{y}})$ is
\[ v'(\bar{\bar{y}}) = \left(\frac{N-n}{N}\right)\frac{s_1^2}{n} + \frac{s_2^2}{mn}\left(1 - \frac{m}{m'} + \frac{mn}{m'N} - \frac{mn}{MN}\right) \]
where $s_1^2$ and $s_2^2$ are computed from the preliminary sample. Hint. Use theorem 10.1 and
the result (10.22):
\[ E(s_1^2) = S_1^2 - \frac{S_2^2}{M} + \frac{S_2^2}{m'} \]
10.3 In sampling wheat fields in Kansas, with the field as a primary unit, King and
McCarty (1941) report the following mean squares for yield in bushels per acre: $s_1^2 = 165$,
$s_2^2 = 66$. Two subsamples were taken per field. For a sample of $n$ fields, compare the
variances of the sample mean as given by (a) the sample as actually taken, (b) four
subsamples per field from $n$ fields, (c) completely harvesting $n$ fields.
$N$ and $M$ may be assumed large and constant. In (c) assume that complete harvesting is
equivalent to single-stage sampling (i.e., to having $m = M$).
10.4 In the same survey, with two subsamples per field, the mean squares for the
percentage of protein were $s_1^2 = 7.73$, $s_2^2 = 1.43$. How many fields are required to estimate
the mean yield to within ±1 bushel and the mean protein percentage to within ±1, apart
from a 1-in-10 chance in each case? Perform the calculations (a) assuming that two
subsamples per field are taken in the main survey, (b) assuming complete harvesting of a
field in the main survey.
10.5 For the wheat-yield data in exercise 10.3, what is the value of $c_1/c_2$ in a linear cost
function if the estimated optimum $m$ is 1?
10.6 If $m/M$ and $n/N$ are both small and the cost function is linear, show that $m = 2$
gives a smaller value of $V(\bar{\bar{y}})$ than $m = 1$ if
\[ \frac{c_1}{c_2} > \frac{2S_u^2}{S_2^2} \]
10.7 A large department store handles about 20,000 accounts receivable per month. A
2% sample ($m = 400$) was verified each month over a 2-year period ($n = 24$). The numbers
of accounts found to be in error per month (out of 400) were (in order of magnitude) 0, 0, 1,
1, 2, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8, 9, 9, 10, 10, 13, 14, 17, the time pattern being erratic.
From the results in section 10.5, compute $s_1^2$ and $s_2^2$. Hence compute the standard error of
$\bar{p}$, as an estimate of the percentage of accounts that are in error over a period of a year, that
would be obtained from verifying (a) 1200 accounts from a single month, chosen at
random, (b) 300 accounts from each of four random months, (c) 100 accounts each month.
Hint. Either use the formula in exercise 10.2 with $m' = 400$ or obtain unbiased estimates of
$S_1^2$ and $S_2^2$ and use theorem 10.1.
10.8 In planning a two-stage survey it was expected that $c_1/c_2$ would be about 4 and
that $S_2^2/S_u^2$ would lie between 5 and 50. (a) What value of $m$ would you choose from Table
10.2? (b) Suppose that after the survey was completed it was found that $c_1/c_2$ was close to 8
and $S_2^2/S_u^2$ was about 25. Compute the relative precision given by your $m$ to that given by
the optimum $m$. (c) Make the same computation for $c_1/c_2 = 4$, $S_2^2/S_u^2 = 100$.
10.9 If $\rho$ is the correlation coefficient between second-stage units in the same primary
unit, prove that
11.1 INTRODUCTION
In sampling extensive populations, primary units that vary in size are encountered
frequently. Moreover, considerations of cost often dictate the use of
multistage sampling, so that the problems discussed in this chapter are of common
occurrence. If the sizes do not vary greatly, one method is to stratify by size of
primary unit, so that the units within a stratum become equal, or nearly so. The
formulas in section 10.9 may then be an adequate approximation. Often, however,
substantial differences in size remain within some strata, and sometimes it is
advisable to base the stratification on other variables. In a review of the British
Social Surveys, which are nationwide samples with districts as primary units, Gray
and Corlett (1950) point out that size was at first included as one of the variables
for stratification but that another factor was found more desirable when the
characteristics of the population became better known.
Some concentrated effort is required in order to obtain a good working
knowledge of multistage sampling when the units vary in size, because the
technique is flexible. The units may be chosen either with equal probabilities or
with probabilities proportional to size or to some estimate of size. Various rules
can be devised to determine the sampling and subsampling fractions, and various
methods of estimation are available. The advantages of the different methods
depend on the nature of the population, on the field costs, and on the supplemen-
tary data that are at our disposal.
The first part of this chapter is devoted to a description of the principal methods
that are in use. We will begin with a population that consists of a single stratum.
The extension to stratified sampling can be made, as in preceding chapters, by
summing the appropriate variance formulas over the strata. For simplicity, we
assume at first that only a single primary unit is chosen, that is, that n = 1. This case
is not so impractical as it might appear at first sight, because when there is a large
number of strata we may achieve satisfactory precision in estimation even though
$n_h = 1$. In the monthly surveys taken by the U.S. Census Bureau to estimate
numbers of employed people, the primary unit is a county or a group of
neighboring counties. This is a large unit, but it has administrative advantages that
decrease costs. Since counties are far from uniform in their characteristics,
stratification is extended to the point at which only one is selected from each
stratum. Consequently, the theory to be discussed is applicable to a single stratum
in this sampling plan.
As in preceding chapters, the quantities to be estimated may be the population
total $Y$, the population mean (usually the mean per subunit $\bar{\bar{Y}}$), or a ratio of two
variates.
Notation. The observation for the $j$th subunit within the $i$th unit is denoted by
$y_{ij}$. The following symbols refer to the $i$th unit.

                        Population                 Sample
Number of subunits      $M_i$                      $m_i$
Mean per subunit        $\bar{Y}_i$                $\bar{y}_i$
Total                   $Y_i = M_i\bar{Y}_i$       $y_i = m_i\bar{y}_i$

Number of subunits      $M_0 = \sum^N M_i$         $\sum^n m_i$
Total                   $Y = \sum^N Y_i$           $\sum^n y_i$
Hence the bias equals $(\bar{Y}_u - \bar{\bar{Y}})$. Since the method is biased, we will compute the
mean square error (MSE) about $\bar{\bar{Y}}$. Write
\[ \bar{y}_i - \bar{\bar{Y}} = (\bar{y}_i - \bar{Y}_i) + (\bar{Y}_i - \bar{Y}_u) + (\bar{Y}_u - \bar{\bar{Y}}) \]
Square and take the expectation over all possible samples. All contributions from
cross-product terms vanish. The expectations of the squared terms follow easily
by the methods given in Chapter 10. We find
\[ MSE(\hat{\bar{Y}}_I) = \frac{1}{N}\sum_{i=1}^N \frac{(M_i - m_i)}{M_i}\frac{S_{2i}^2}{m_i} + \frac{1}{N}\sum_{i=1}^N (\bar{Y}_i - \bar{Y}_u)^2 + (\bar{Y}_u - \bar{\bar{Y}})^2 \qquad (11.1) \]
For the unbiased estimate of method II, with $\bar{Y} = Y/N$ the mean of the unit totals,
\[ V(\hat{\bar{Y}}_{II}) = \frac{N}{M_0^2}\sum_{i=1}^N M_i(M_i - m_i)\frac{S_{2i}^2}{m_i} + \frac{N}{M_0^2}\sum_{i=1}^N (Y_i - \bar{Y})^2 \qquad (11.2) \]
The between-units component of this variance (second term on the right)
represents the variation among the unit totals $Y_i$. This component is affected both
by variations in the $M_i$ from unit to unit and by variations in the means $\bar{Y}_i$ per element.
If the units vary considerably in size, this component is large, even though the
means per element $\bar{Y}_i$ are almost constant from unit to unit. Frequently this
component is so large that $\hat{\bar{Y}}_{II}$ has a much higher MSE than the biased estimate $\hat{\bar{Y}}_I$.
Thus neither method I nor method II is fully satisfactory.
Now average over all possible selections of the unit. Since the $i$th unit is selected
with relative frequency $M_i/M_0$,
\[ V(\hat{\bar{Y}}_{III}) = \frac{1}{M_0}\left[\sum_{i=1}^N (M_i - m_i)\frac{S_{2i}^2}{m_i} + \sum_{i=1}^N M_i(\bar{Y}_i - \bar{\bar{Y}})^2\right] \qquad (11.3) \]
Note that, as in method I, the between-unit component arises from differences
among the means per subunit $\bar{Y}_i$ in the successive units. If these means per subunit
are nearly equal, this component is small.
TABLE 11.1
ARTIFICIAL POPULATION WITH UNITS OF UNEQUAL SIZES
Unit    $y_{ij}$    $M_i$    $Y_i$    $S_{2i}^2$    $\bar{Y}_i$    $\bar{Y}_i - \bar{\bar{Y}}$
Totals              12       33
Example. Let us apply these results to a small population, artificially constructed. The
data are presented in Table 11.1. There are three units, with 2, 4, and 6 elements,
respectively. The reader may verify the figures given for $Y_i$, $S_{2i}^2$, and $\bar{Y}_i$. The population
mean $\bar{\bar{Y}}$ is 33/12, or 2.75. The unweighted mean of the $\bar{Y}_i$ is $2.167 = \bar{Y}_u$, so that the bias in
method I is $-0.583$. Its square, the contribution to the MSE, is 0.340.
two of which arc variant, l1f method I.
Method la .
Selection ; unit with eq ual probability. m, = 2.
Estimate ; 9, (hiased) .
Method lb.
Selecrion ; unit wllh equal probability. m, =~ M, .
E~timate ; 9, (hiased) .
Method II.
Selection ; unit with equal probability. m, = 2 .
'. Estimate : NM,y'! Mil (unbiased) .
Method Ill.
Selection: unit with probabi lit y MJ Mo. m, = 2 .
Estimate : 9, (unbiased) .
Method Ib (proportional subsampling) doe . not guarantee a sample size gf 2 (it may be I ,
2 . or 3), but the average ample size is 2 .
13y application of the sampling error formulas (11 . 1), (11.2), and (11 .3), we obtain the
rc~ults in Tahle 11 .2.
TABLE 11.2
MSE 's OF SAMPLE ESTIMATES OF Y
Contribution to MSE from Total
Method Within Units Between Units Bias MSE
Although the example is artificial, the results are typical of those found in
comparisons made on many populations. Method III gives the smallest MSE
because it has the smallest contribution from variation between units. Method II,
although unbiased, is very inferior. Method Ia (equal size of subsample) is slightly
better than method Ib (proportional subsampling).
Some comparisons of these methods have also been made on actual populations.
For six items (total workers, total agricultural workers, total nonagricultural
workers, estimated separately for males and females), Hansen and Hurwitz
(1943) found that method III produced large reductions in the contribution from
variation between units as compared with the unbiased method II, and reductions
that averaged 30% as compared with method I. (They assumed the contribution
from variation within units to be negligible.) In estimating typical farm items for
the state of North Carolina, Jebe (1952) reported reductions in the total variance
of the order of 15% as compared with methods of type I. In both studies the
primary unit was a county.
(11.4)
This follows because, in repeated sampling, the $i$th unit appears with relative
frequency $z_i$, so that
\[ \hat{\bar{Y}}_{IV} - \bar{\bar{Y}} = \frac{1}{M_0}\left[\frac{M_i}{z_i}(\bar{y}_i - \bar{Y}_i) + \left(\frac{M_i\bar{Y}_i}{z_i} - M_0\bar{\bar{Y}}\right)\right] \]
If $z_i = M_i/M_0$, (11.5) reduces to (11.3) for $V(\hat{\bar{Y}}_{III})$. If $z_i = 1/N$ (initial
probabilities equal), (11.5) reduces to (11.2) for the variance of the unbiased estimate
when probabilities are equal.
Unless $z_i = M_i/M_0$, the between-units component in (11.5) is affected to some
extent by variations in the sizes $M_i$ as well as by variations in the means per
element $\bar{Y}_i$.
TABLE 11.3
COMPUTATION OF $V(\hat{\bar{Y}}_{IV})$
Unit   $M_i$   $M_i/M_0$   $z_i$   $m_i$   $M_i(M_i - m_i)/z_im_i$   $S_{2i}^2$   $Y_i$   $Y_i/z_i$   $Y_i/z_i - Y$
 1       2       0.17       0.2      2              0                 0.500        1        5          -28
 2       4       0.33       0.4      2             10                 0.667        8       20          -13
 3       6       0.50       0.4      2             30                 0.800       24       60          +27
Example. Table 11.3 shows the computations for finding $V(\hat{\bar{Y}}_{IV})$ in the artificial
population in Table 11.1. The $z_i$ have been taken as 0.2, 0.4, and 0.4, and the $m_i = 2$.
From (11.5), the variance comes out as follows:
\[ \text{within-units contribution} = \sum \frac{M_i(M_i - m_i)S_{2i}^2}{z_im_i}\Big/ M_0^2 = 0.213 \]
\[ \text{between-units contribution} = \sum z_i\left(\frac{Y_i}{z_i} - Y\right)^2\Big/ M_0^2 = 3.583 \]
Comparison with Table 11.2 shows that method IV has a lower variance than
the unbiased method II in which the primary unit is chosen with equal
probabilities, but method IV is decidedly inferior to method I or method III. In this
example method IV pays too high a price in order to obtain an unbiased estimate.
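The two contributions can be recomputed from the columns of Table 11.3. The sketch below encodes them in the form used in the worked example (formula (11.5) itself is not legible in this copy, so that form is taken from the example's arithmetic), and then reuses the same function with $z_i = M_i/M_0$ and $z_i = 1/N$, the two special cases said to reduce to (11.3) and (11.2). Function and variable names are illustrative.

```python
def variance_components(M, S2_sq, Y, z, m):
    """Within- and between-unit contributions when one unit is drawn with probabilities z."""
    M0, Y_total = sum(M), sum(Y)
    within = sum(Mi * (Mi - mi) * s / (zi * mi)
                 for Mi, mi, s, zi in zip(M, m, S2_sq, z)) / M0 ** 2
    between = sum(zi * (Yi / zi - Y_total) ** 2 for Yi, zi in zip(Y, z)) / M0 ** 2
    return within, between

# Columns of Table 11.3
M = [2, 4, 6]
S2_sq = [0.500, 0.667, 0.800]
Y = [1, 8, 24]
m = [2, 2, 2]

method_iv = variance_components(M, S2_sq, Y, [0.2, 0.4, 0.4], m)
method_iii = variance_components(M, S2_sq, Y, [Mi / sum(M) for Mi in M], m)  # z_i = M_i/M_0
method_ii = variance_components(M, S2_sq, Y, [1 / len(M)] * len(M), m)       # z_i = 1/N

for name, parts in [("II", method_ii), ("III", method_iii), ("IV", method_iv)]:
    print(name, round(parts[0], 3), round(parts[1], 3), round(sum(parts), 3))
```

The printed totals show the ordering described above: method IV falls between the unbiased method II and method III.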
Consequently, it is natural to consider whether the sample mean (as in method
I) would be better than the estimate adopted in method IV.
\[ E(\hat{\bar{Y}}_V) = \sum z_i\bar{Y}_i = \bar{Y}_z \]
If the $z_i$ are good estimates, $\bar{Y}_z$ is close to the correct mean $\bar{\bar{Y}} = \sum M_i\bar{Y}_i/M_0$ and
the bias is small.
If we write
\[ \hat{\bar{Y}}_V - \bar{\bar{Y}} = (\bar{y}_i - \bar{Y}_i) + (\bar{Y}_i - \bar{Y}_z) + (\bar{Y}_z - \bar{\bar{Y}}) \]
the three components of the MSE work out as follows.
CONTRIBUTIONS TO THE MSE IN METHOD v
Within Between Bias Total
Units Units MSE
This i superior to all method s except method In (pps ) and is almost as good as
method Ill .
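The entries of Table 11.4 did not survive in this copy, but the three components can be recomputed from the Table 11.3 data. The sketch below is an assumption-laden illustration: it takes the three squared terms of the decomposition $(\bar{y}_i - \bar{Y}_i) + (\bar{Y}_i - \bar{Y}_z) + (\bar{Y}_z - \bar{\bar{Y}})$, averages them with selection probabilities $z_i$, and uses the simple-random-sampling variance $(1 - m_i/M_i)S_{2i}^2/m_i$ within a unit. These expressions are derived here, not copied from the lost formula.

```python
# Data from Table 11.3; z_i = 0.2, 0.4, 0.4 and m_i = 2 as in the method IV example.
M = [2, 4, 6]
S2_sq = [0.500, 0.667, 0.800]
Y = [1, 8, 24]
z = [0.2, 0.4, 0.4]
m = 2

M0 = sum(M)
unit_means = [Yi / Mi for Yi, Mi in zip(Y, M)]
pop_mean = sum(Y) / M0                                   # 2.75
mean_z = sum(zi * yb for zi, yb in zip(z, unit_means))   # expected value of the estimate

# assumed components of the MSE of the sample mean under selection probabilities z_i
within = sum(zi * (Mi - m) / Mi * s / m for zi, Mi, s in zip(z, M, S2_sq))
between = sum(zi * (yb - mean_z) ** 2 for zi, yb in zip(z, unit_means))
bias_sq = (mean_z - pop_mean) ** 2
mse_v = within + between + bias_sq
print(round(within, 3), round(between, 3), round(bias_sq, 4), round(mse_v, 3))
```

Under these assumptions the total is only slightly above the method III variance obtained from (11.3) with the same data, which agrees with the remark that method V is almost as good as method III.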
TABLE 11.5
TWO-STAGE SAMPLING METHODS (n = 1)
Method    Probabilities in Selecting Units    Estimate of $\bar{\bar{Y}}$    Bias Status    MSE in Example
where the weight $w_{is}$ is known for every sample $s$, and may depend on other
primary units that are in the sample as well as on unit $i$.
We will adopt the device used in section 2.9 of letting $w_{is}'$ be a random variable
that equals $w_{is}$ if unit $i$ appears in the sample and equals zero otherwise. This gives
(11.8)
Now
(11.9)
\[ V(\hat{Y}) = V\left(\sum_{i=1}^N w_{is}'\hat{Y}_i\right) = V\left(\sum_{i=1}^N w_{is}'Y_i\right) + \sum_{i=1}^N E_1(w_{is}'^2)\sigma_{2i}^2 \qquad (11.10) \]
(11.14)
(11.15)
Thus the rule for constructing the sample estimator of $V(\sum w_{is}\hat{Y}_i)$ is: In an
unbiased estimator of $V(\sum w_{is}Y_i)$ from one-stage sampling, insert $\hat{Y}_i$ wherever $Y_i$
appears. To this add the term $\sum (w_{is}\hat\sigma_{2i}^2)$, where $\sum w_{is}\hat{Y}_i = \hat{Y}$, and $\hat\sigma_{2i}^2$ is an
unbiased estimator of $V_2(\hat{Y}_i)$.
Proof.
(11.17)
Introduce the random variables $a_{is}'$, where $a_{is}' = a_{is}$ if unit $i$ is in the sample
and $a_{is}' = 0$ otherwise. Similarly, let $b_{ijs}' = b_{ijs}$ if units $i$ and $j$ are in the sample and
$b_{ijs}' = 0$ otherwise. From (11.15) for one-stage sampling,
\[ v\left(\sum^N w_{is}'Y_i\right) = \sum_i^N a_{is}'Y_i^2 + 2\sum_i^N \sum_{j>i}^N b_{ijs}'Y_iY_j \qquad (11.18) \]
EIEz(I
,
a,: f'/+2I .Ib;jj Yf~) +~I 2(I W,..'U2,,1 )
I I '"
( 11.1 9)
In ( 11.19) we have used the results that £(a;..') = V( w,,') and that for ;] ' i,
EI(w,,')= I = E I 1 (w,,'). Continuing.
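\[ E\left[v\left(\sum^n w_{is}\hat{Y}_i\right)\right] = V\left(\sum^N w_{is}'Y_i\right) + \sum^N E_1(w_{is}'^2)\sigma_{2i}^2 = V\left(\sum^n w_{is}\hat{Y}_i\right) \qquad (11.20) \]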
(11.21)
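\[ w_{is} = \frac{N}{n}, \qquad E(w_{is}') = \frac{N}{n}\cdot\frac{n}{N} = 1 \]
\[ V(\hat{Y}_u) = \frac{N^2}{n}(1 - f_1)\frac{\sum^N (Y_i - \bar{Y})^2}{N-1} + \frac{N}{n}\sum^N \frac{M_i^2(1 - f_{2i})S_{2i}^2}{m_i} \qquad (11.22) \]
where $f_{2i} = m_i/M_i$. The estimator becomes self-weighting if $f_{2i}$ is constant ($= f_2$,
say). We then have
\[ \hat{Y}_u = \frac{N}{nf_2}\sum_i^n \sum_j^{m_i} y_{ij} \qquad (11.23) \]
The quantity $nf_2/N$ is, of course, the probability that any second-stage unit is
drawn.
For an unbiased sample estimate of variance, theorem 11.2 gives, from (11.16),
(11.24)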
(11.25)
This is a typical ratio estimator, since both numerator and denominator vary
from sample to sample. It is used mainly for estimating the mean per subunit, for
which knowledge of $M_0$ is not required, and for which it is an extension of method
I with $n = 1$.
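(11.26)
where $\bar{\bar{Y}} = Y/M_0$.
Since $\hat{Y}_u = (N/n)\sum^n M_i\bar{y}_i$, we can obtain the approximate MSE($\hat{Y}_R$) from
formula (11.22) for $V(\hat{Y}_u)$ by substituting $M_i(\bar{y}_i - \bar{\bar{Y}})$ for $M_i\bar{y}_i$ or, more generally,
$(y_{ij} - \bar{\bar{Y}})$ for $y_{ij}$. Now (11.22) is
\[ V(\hat{Y}_u) = \frac{N^2}{n}(1 - f_1)\frac{\sum^N (Y_i - \bar{Y})^2}{N-1} + \frac{N}{n}\sum^N \frac{M_i^2(1 - f_{2i})S_{2i}^2}{m_i} \]
In the substitution, $Y_i = M_i\bar{Y}_i$ becomes $M_i(\bar{Y}_i - \bar{\bar{Y}})$ while $\bar{Y} = \sum^N Y_i/N$ is
replaced by 0. Also $S_{2i}^2 = \sum_j (y_{ij} - \bar{Y}_i)^2/(M_i - 1)$ remains unchanged when $(y_{ij} - \bar{\bar{Y}})$
replaces $y_{ij}$. This gives the result
\[ MSE(\hat{Y}_R) \doteq \frac{N^2}{n}(1 - f_1)\frac{\sum^N M_i^2(\bar{Y}_i - \bar{\bar{Y}})^2}{N-1} + \frac{N}{n}\sum^N \frac{M_i^2(1 - f_{2i})S_{2i}^2}{m_i} \qquad (11.27) \]
(11.28)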
The resemblance to the corresponding formula when primary units are of equal
sizes may be noted. From (10.14) section 10.3, multiplying by $M_0^2$ since $\hat{Y} = M_0\bar{\bar{y}}$,
we have
(11.29)
The difference is that in (11.28) the contributions to the MSE from the primary
units are weighted, larger units receiving greater weight.
By theorem 11.2 an approximate sample estimate of MSE($\hat{Y}_R$) in (11.27) is
given by
(11.30)
where $\hat{\bar{Y}}_R = \sum^n M_i\bar{y}_i/\sum^n M_i = \hat{Y}_R/M_0$.
When this estimator is used to estimate the population mean per subunit, we
have $\hat{\bar{Y}}_R = \sum \hat{Y}_i/\sum M_i$ and $V(\hat{\bar{Y}}_R) = V(\hat{Y}_R)/M_0^2$. When $M_0$ is not known, we
substitute $\hat{M}_0 = N(\sum^n M_i/n)$ for $M_0$ in calculating $v(\hat{\bar{Y}}_R)$.
When primary units are selected with equal probabilities, an alternative
estimate of the population mean per subunit (another extension of method I) is
Example. From the volume American Men of Science, 20 pages were selected at
random. On each page the ages of two scientists, from two biographies also selected at
random, were recorded. The total number of biographies per page varies in general from
about 14 to 21. Estimate the average age and its standard error from the data in Table 11.6,
using the ratio estimate.
From the extreme right column,
$\hat{\bar{\bar{Y}}}_R = \frac{\sum M_i \bar{y}_i}{\sum M_i} = \frac{17,121.5}{359} = 47.7$ years
Since $n/N$ is negligible, we have from (11.30), dividing by $\hat{M}_0^2$,
$v(\hat{\bar{\bar{Y}}}_R) = \frac{(20)(571,300)}{(19)(359)^2} = 4.67$
$s(\hat{\bar{\bar{Y}}}_R) = 2.16$ years
TABLE 11.6
AGES OF 40 SCIENTISTS IN American Men of Science (n = 20, m = 2)

Unit            Ages         Total
No.    M_i    y_i1   y_i2    y_i     M_i ybar_i
 1     15     47     30      77        577.5
 2     19     38     51      89        845.5
 3     19     43     45      88        836.0
 4     16     55     41      96        768.0
 5     16     59     45     104        832.0
 6     19     39     38      77        731.5
 7     18     43     43      86        774.0
 8     18     49     51     100        900.0
 9     18     45     35      80        720.0
10     18     46     59     105        945.0
11     20     71     64     135      1,350.0
12     18     35     46      81        729.0
13     19     61     54     115      1,092.5
14     19     45     87     132      1,254.0
15     18     31     38      69        621.0
16     16     64     39     103        824.0
17     16     63     47     110        880.0
18     19     36     33      69        655.5
19     19     61     39     100        950.0
20     19     54     34      88        836.0
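The arithmetic of the example can be checked with a short script. The sketch below is an illustration, not part of the original text: it takes the $M_i$ and the two recorded ages per page from Table 11.6, forms the ratio-to-size estimate $\sum M_i \bar{y}_i/\sum M_i$, and evaluates only the between-units term of the variance estimate (divided by $\hat{M}_0^2$), as the example does when $n/N$ is negligible. All variable names are hypothetical.

```python
import math

# (M_i, y_i1, y_i2) for the 20 sampled pages in Table 11.6
pages = [
    (15, 47, 30), (19, 38, 51), (19, 43, 45), (16, 55, 41), (16, 59, 45),
    (19, 39, 38), (18, 43, 43), (18, 49, 51), (18, 45, 35), (18, 46, 59),
    (20, 71, 64), (18, 35, 46), (19, 61, 54), (19, 45, 87), (18, 31, 38),
    (16, 64, 39), (16, 63, 47), (19, 36, 33), (19, 61, 39), (19, 54, 34),
]

n = len(pages)
sizes = [M for M, _, _ in pages]
unit_means = [(a + b) / 2 for _, a, b in pages]   # ybar_i with m = 2

sum_M = sum(sizes)
sum_My = sum(M * ybar for M, ybar in zip(sizes, unit_means))
ratio_mean = sum_My / sum_M                        # ratio-to-size estimate of mean age

# Between-units sum of squares: sum of M_i^2 (ybar_i - ratio_mean)^2
between_ss = sum((M * (ybar - ratio_mean)) ** 2
                 for M, ybar in zip(sizes, unit_means))

# First term of the variance estimate divided by M0_hat^2, with n/N ignored
var_ratio_mean = n * between_ss / ((n - 1) * sum_M ** 2)
se_ratio_mean = math.sqrt(var_ratio_mean)

print(round(ratio_mean, 1), round(var_ratio_mean, 2), round(se_ratio_mean, 2))
```

The computed sum of squares comes out slightly below the rounded 571,300 quoted in the example; the variance and standard error agree to the two decimals shown there.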
(11.31)
For $n = 1$, it was shown in section 11.3 that this estimator, $M_0 \hat{\bar{\bar{Y}}}_{IV} = \hat{Y}_{IV}$, is
unbiased. Its variance is obtained from formula (11.5) on multiplying by $M_0^2$ as
With this method of sampling, the estimator $\hat{Y}_{ppz}$ is the mean of n independent
estimates of the form $\hat{Y}_{IV}$. Consequently, from classical sampling theory $\hat{Y}_{ppz}$ is
unbiased and
$v(\hat{Y}_{IV}) = \frac{\sum_{i=1}^{n} (\hat{Y}_i/z_i - \hat{Y}_{ppz})^2}{n - 1}$    (11.34)
$v(\hat{Y}_{ppz}) = \frac{\sum_{i=1}^{n} (\hat{Y}_i/z_i - \hat{Y}_{ppz})^2}{n(n - 1)}$    (11.35)
These results hold also for multistage sampling, provided that $\hat{Y}_i$ is an unbiased
estimator of $Y_i$ and that subsampling is independent whenever a primary unit is
drawn.
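The with-replacement variance estimate is simple enough to compute in a few lines. The sketch below is illustrative only: it assumes the estimator is the mean of the $n$ quantities $\hat{Y}_i/z_i$, as described above, and applies the $n(n-1)$ divisor of (11.35). The function name and the three-unit data set are hypothetical.

```python
def ppz_estimate(unit_estimates, z):
    """Return (Y_ppz, v(Y_ppz)) for units drawn with replacement.

    unit_estimates: unbiased estimates Y_i-hat of the selected unit totals
    z:              selection probabilities z_i of the selected units
    """
    n = len(unit_estimates)
    if n < 2 or n != len(z):
        raise ValueError("need at least two units and matching probabilities")
    scaled = [y / zi for y, zi in zip(unit_estimates, z)]   # Y_i-hat / z_i
    total = sum(scaled) / n                                 # mean of n estimates
    var = sum((s - total) ** 2 for s in scaled) / (n * (n - 1))
    return total, var


# Hypothetical sample of three primary units
y_hat = [120.0, 300.0, 210.0]
z = [0.10, 0.25, 0.20]
total, var = ppz_estimate(y_hat, z)
print(total, var)
```

No separate within-units term is needed: the spread of the $\hat{Y}_i/z_i$ already reflects both stages of sampling.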
To discover when $\hat{Y}_{ppz}$ becomes self-weighting in two-stage sampling, write
(11.36)
With a self-weighting sample, the variance estimator (11.35) takes the simpler
form
(11.38)
where $y_i = \sum_j y_{ij}$ is the sample total in the ith unit. The simplicity of the estimated
variances (11.35) and (11.38) is an attractive feature of with-replacement
sampling.
With this method there are other ways in which the subsamples may be drawn.
If the ith unit is selected $t_i$ times, one variant is to draw a single subsample of size
$t_i m_i$ without replacement, provided that $m_i t_i \le M_i$. Sukhatme (1954) has shown
that with this method, $V(\hat{Y}_{ppz})$ is reduced by $(n - 1) \sum^N M_i S_{2i}^2 / n$. Another method
is to draw a single subsample of size $m_i$ no matter how many times the ith unit is
selected. The estimate $M_i \bar{y}_i / z_i$ from this unit receives a weight $t_i$ (the number of
times that the unit has been drawn) with either method. The effect is to increase
$V(\hat{Y}_{ppz})$ by
$\frac{(n - 1)}{n} \sum^N \frac{M_i^2 (1 - f_{2i}) S_{2i}^2}{m_i}$
For the same cost, the differences in precision among these methods are seldom
likely to be substantial.
If $z_i = M_i/M_0$ the unbiased estimate (11.31) reduces to
$\hat{Y}_{pps} = \frac{M_0}{n} \sum_i^n \bar{y}_i$    (11.39)
(11.41)
with variance
$V(\hat{Y}_B) = \sum_i^N \sum_{j>i}^N (\pi_i \pi_j - \pi_{ij}) \left( \frac{Y_i}{\pi_i} - \frac{Y_j}{\pi_j} \right)^2 + \sum_i^N \frac{M_i^2 (1 - f_{2i}) S_{2i}^2}{m_i \pi_i}$    (11.42)
For the corresponding estimator $\hat{Y}_{ppz}$ in sampling with replacement, we have,
from (11.33), with $\pi_i = n z_i$,
Since the "within-units" contributions to the variance are the same in (11.33)
and (11.43), any relative gain in precision from selecting units without replacement
is watered down in two-stage samples by the within-units contribution to the
variance. For instance, in a stratified multistage sample of n = 44 provinces out of
N = 147 provinces, Des Raj (1964) found that the ratio $V(\hat{Y}_{WOR})/V(\hat{Y}_{WR})$
averaged about 0.79 over seven items for the between-units component, but the
average ratio over both components was 0.92.
With n = 2 an unbiased sample estimate of $V(\hat{Y}_B)$ is
$v(\hat{Y}_B) = (\pi_1 \pi_2 \pi_{12}^{-1} - 1) \left( \frac{M_1 \bar{y}_1}{\pi_1} - \frac{M_2 \bar{y}_2}{\pi_2} \right)^2 + \sum_{i=1}^{2} \frac{M_i^2 (1 - f_{2i}) s_{2i}^2}{m_i \pi_i}$    (11.44)
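where $Z_g = \sum z_i$ over the gth group and $M_g$, $\bar{y}_g$, and $z_g$ refer to the unit drawn from
the group. For a sample of n units an unbiased estimator of $V(\hat{Y}_{RHC})$ is
$v(\hat{Y}_{RHC}) = \frac{(\sum N_g^2 - N)}{(N^2 - \sum N_g^2)} \sum_g Z_g \left( \frac{M_g \bar{y}_g}{z_g} - \hat{Y}_{RHC} \right)^2 + \sum_g \frac{Z_g}{z_g} \frac{M_g^2 (1 - f_{2g}) s_{2g}^2}{m_g}$    (11.46)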
Formulas (11.44) and (11.46) require separate calculation of the between-units
and the within-units contributions to the estimated variance. In extensive surveys
with many strata and numerous items, the complexity of such formulas makes the
estimation of variances a task requiring more computer time than can be devoted
to it in practice. Recall the much simpler form of the variance formula (11.35) when
units are drawn with replacement; that is,
$v(\hat{Y}_{ppz}) = \frac{1}{n(n - 1)} \sum_i^n (\hat{Y}_i' - \hat{Y}_{ppz})^2$
where $\hat{Y}_i' = M_i \bar{y}_i / z_i$. This result makes with-replacement sampling appealing if it
does not involve too much loss of precision.
Among without-replacement methods, Platek and Singh (1972), in planning
the redesign of the Canadian Labor Force Survey for strata in which $n_h = 6$ or a
where
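The conditions under which $\hat{Y}_R$ reduces to a multiple of the ratio of the sample
totals $\sum\sum y_{ij} / \sum\sum x_{ij}$ are always those under which the corresponding $\hat{Y}$ becomes
self-weighting; in this case $f_{2i} = m_i/M_i$ = constant.
For the estimated variance, substitution of $d_{ij}' = y_{ij} - \hat{R} x_{ij}$ for $y_{ij}$ in (11.24) for
$v(\hat{Y}_u)$ gives
$v(\hat{Y}_R) = \frac{N^2 (1 - f_1)}{n} \frac{\sum^n (\hat{Y}_i - \hat{R}\hat{X}_i)^2}{n - 1} + \frac{N}{n} \sum^n \frac{M_i^2 (1 - f_{2i}) s_{d'2i}^2}{m_i}$    (11.52)
$V(\hat{Y}_R) \doteq \frac{1}{n} \sum^N \frac{1}{z_i} (Y_i - R X_i)^2 + \frac{1}{n} \sum^N \frac{M_i^2 (1 - f_{2i}) S_{d2i}^2}{z_i m_i}$    (11.54)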
From (11.35) a sample estimate $v(\hat{Y}_R)$ that is slightly biased is
$v(\hat{Y}_R) = \frac{1}{n(n - 1)} \sum^n \frac{(\hat{Y}_i - \hat{R}\hat{X}_i)^2}{z_i^2}$    (11.55)
(11.56)
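$v(\hat{Y}_{RB}) = (\pi_1 \pi_2 \pi_{12}^{-1} - 1) \left( \frac{\hat{Y}_1 - \hat{R}\hat{X}_1}{\pi_1} - \frac{\hat{Y}_2 - \hat{R}\hat{X}_2}{\pi_2} \right)^2 + \sum_{i=1}^{2} \frac{M_i^2 (1 - f_{2i}) s_{d'2i}^2}{m_i \pi_i}$    (11.57)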
The third term is included because the sampler must usually list the elements in
any selected unit and verify their number in order to draw a subsample. Hence
cost $= c_u n + c_2 \sum^n m_i + c_l \sum^n M_i$
This formula is not usable as it stands, since the cost depends on the particular set
of units that is chosen. Instead, consider the average cost over n units, which
equals
Write
$S_b^2 = \frac{\sum^N M_i^2 (\bar{Y}_i - \bar{\bar{Y}})^2}{\bar{M}^2 (N - 1)}$
This is a weighted variance among unit means per element. It is analogous to the
variance $S_1^2$ in section 10.3 and reduces to $S_1^2$ if all $M_i$ are equal. We may also
write
$S_2^2 = \sum^N \frac{M_i}{M_0} S_{2i}^2$
This is a weighted mean of the within-unit variances. It reduces to the $S_2^2$ of
section 10.3 if all $M_i$ are equal.
In this notation, since $f_2 = \bar{m}/\bar{M}$,
$MSE(\hat{\bar{\bar{Y}}}_R) \doteq \frac{1}{n} \left( S_b^2 - \frac{S_2^2}{\bar{M}} \right) + \frac{1}{n\bar{m}} S_2^2 - \frac{1}{N} S_b^2$    (11.59)
Applying the Cauchy-Schwarz inequality as usual to (11.58) and (11.59) we get
$\bar{m}_{opt} = \frac{S_2}{\sqrt{S_b^2 - S_2^2/\bar{M}}} \sqrt{\frac{c_1}{c_2}}$    (11.60)
The methods given in section 10.8 for utilizing knowledge about the ratios $S_2/S_b$
and $c_1/c_2$ to guide the selection of $\bar{m}_{opt}$ are applicable here. The unbiased estimate
when units are drawn with equal probabilities can be handled similarly.
The next section presents a more general analysis of this problem.
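As a numerical illustration of the optimum subsample size, the sketch below evaluates $\bar{m}_{opt} = S_2 \sqrt{c_1/c_2} / \sqrt{S_b^2 - S_2^2/\bar{M}}$. It assumes, following the remark made later in this chapter, that the per-unit cost is $c_1 = c_u + c_l \bar{M}$ (fixed cost per primary unit plus the average listing cost). The function name and the input values are hypothetical.

```python
import math


def optimum_subsample_size(S_b_sq, S_2_sq, M_bar, c_u, c_l, c_2):
    """Average optimum subsample size per primary unit.

    S_b_sq: weighted variance among unit means per element (S_b^2)
    S_2_sq: weighted mean of within-unit variances (S_2^2)
    M_bar:  average number of subunits per primary unit
    c_u, c_l, c_2: cost per unit, listing cost per subunit, cost per sampled subunit
    """
    c_1 = c_u + c_l * M_bar                 # assumed average cost per primary unit
    between = S_b_sq - S_2_sq / M_bar
    if between <= 0:
        # The formula has no real solution here; the choice of m needs separate thought.
        raise ValueError("S_b^2 must exceed S_2^2 / M_bar")
    return math.sqrt(S_2_sq / between) * math.sqrt(c_1 / c_2)


# Hypothetical inputs: c_1 = 3.5 + 0.25 * 18 = 8
m_opt = optimum_subsample_size(S_b_sq=10.0, S_2_sq=36.0, M_bar=18.0,
                               c_u=3.5, c_l=0.25, c_2=2.0)
print(round(m_opt, 2))
```

In practice the result would be rounded to an integer no larger than the unit sizes; the sketch leaves that decision to the user.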
$V(\hat{Y}_R) = \frac{1}{n} \left[ \sum^N \frac{1}{z_i} (Y_i - R X_i)^2 + \sum^N \frac{M_i (M_i - m_i) S_{d2i}^2}{z_i m_i} \right]$    (11.62)
Since $d_{ij} = y_{ij} - R x_{ij}$, we may write $(Y_i - R X_i) = M_i \bar{D}_i$. Noting that $\pi_i = n z_i$,
$M_i / n z_i m_i = 1/f_0$, and combining the first and third terms in (11.62), we get
(11.63)
The problem is to choose $n$, $f_0$, and the $\pi_i = n z_i$ to minimize V subject to fixed
average cost and to the restriction
$\sum^N z_i = 1$
Take $\lambda$ and $\mu$ as Lagrangian multipliers and minimize
Differentiation gives
(11.65)
(11.66)
Since the individual values of $(\bar{D}_i^2 - S_{d2i}^2/M_i)$ will not be known, we consider
how the average value may depend on the size of unit $M_i$, using the following
rough argument. Suppose a population were divided into units of size M. Since
$E(\bar{D}_i) = 0$, formula (9.10) on p. 241 gives
$E(\bar{D}_i^2) \doteq V(\bar{D}_i) = \frac{S_d^2}{M} [1 + (M - 1)\rho_M]$
$E\left( \bar{D}_i^2 - \frac{S_{d2i}^2}{M_i} \right) \doteq \frac{S_d^2}{M} [1 + (M - 1)\rho_M - (1 - \rho_M)] = \rho_M S_d^2$
$z_i \propto \frac{M_i \sqrt{\rho_{M_i}}}{\sqrt{c_u + c_l M_i}}$    (11.68)
With $\rho$ positive, $\rho_{M_i}$ may be expected to decrease as $M_i$ increases, since subunits
far apart are less subject to common influences, but this decrease may be only
slight for $\sqrt{\rho_{M_i}}$. Deductions from (11.68) are as follows.
1. If the cost of listing $c_l M_i$ is unimportant, $z_i \propto M_i$ (i.e., pps selection) is best if
$\sqrt{\rho_{M_i}}$ changes little over the range of sizes in the population. If $\sqrt{\rho_{M_i}}$ decreases
noticeably, optimum probabilities lie between $z_i \propto M_i$ and $z_i \propto \sqrt{M_i}$.
2. If listing cost predominates, $z_i$ should lie between $\sqrt{M_i}$ and constant (equal
probabilities).
3. If listing costs and fixed costs are of the same order of magnitude, $z_i \propto \sqrt{M_i}$
may be a good compromise.
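The two limiting cases in these deductions can be seen numerically. The sketch below is illustrative: it normalizes the weights $M_i \sqrt{\rho_{M_i}} / \sqrt{c_u + c_l M_i}$ of (11.68) so that they sum to 1. The function name, the unit sizes, and the default of a constant $\rho_{M_i}$ are assumptions made for the example.

```python
import math


def optimum_probabilities(sizes, c_u, c_l, rho=None):
    """Selection probabilities z_i proportional to M_i*sqrt(rho_i)/sqrt(c_u + c_l*M_i)."""
    if rho is None:
        rho = [1.0] * len(sizes)            # assumption: rho does not vary with size
    raw = [M * math.sqrt(r) / math.sqrt(c_u + c_l * M)
           for M, r in zip(sizes, rho)]
    total = sum(raw)
    return [w / total for w in raw]         # normalize so the z_i sum to 1


sizes = [10, 20, 30]                                          # hypothetical unit sizes
z_no_listing = optimum_probabilities(sizes, c_u=5.0, c_l=0.0)   # listing is free
z_listing_only = optimum_probabilities(sizes, c_u=0.0, c_l=1.0)  # listing dominates
print(z_no_listing)
print(z_listing_only)
```

With no listing cost the probabilities are proportional to $M_i$ (pps selection); when listing cost dominates they are proportional to $\sqrt{M_i}$, in line with deductions 1 and 2.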
Differentiation of (11.64) with respect to the overall sampling fraction $f_0$ gives
(11.69)
The value of $\lambda$ is found in terms of the known $\pi$'s by adding (11.66) over all units.
This step leads to the result
(11.70)
Comparison with (11.60) will show that $f_0$ has the same structure as in sampling
with equal probabilities, remembering that in (11.60), $f_0 = n\bar{m}_{opt}/M_0$ and $c_1 = c_u + c_l \bar{M}$.
The optimum n is found from the average cost equation (11.61).
$V(\hat{Y}_{st}) = \sum_h V(\hat{Y}_h)$    (11.71)
(11.72)
where $y_{hi}$ is the total over the $m_{hi}$ subunits from the ith unit in stratum h. These
estimates are seen to be self-weighting within strata if the probability $f_{0h} = n_h z_{hi} m_{hi}/M_{hi}$
of selecting any subunit in the stratum is constant within the stratum.
In this event, the estimate becomes
$\hat{Y}_{st} = \sum_h^L \frac{1}{f_{0h}} \sum_i^{n_h} y_{hi}$    (11.73)
(11.74)
where $M_{0h} = \sum M_{hi}$. Thus, choice of $f_{0h} = f_0$, which makes these estimators completely
self-weighting, will be near-optimal as regards precision unless either the
$S_{2h}$ or the $c_{2h}$ vary widely from stratum to stratum, especially since this choice
affects only the second-stage contribution to the variance.
Hence, we substitute $d_{hij} = y_{hij} - R x_{hij}$ for $y_{hij}$ to obtain the approximate formulas
for $V(\hat{Y}_{Rc})$ from those for $V(\hat{Y}_{st})$ by the sampling plan used. For $v(\hat{Y}_{Rc})$ we
substitute $d_{hij}' = y_{hij} - \hat{R}_c x_{hij}$ in $v(\hat{Y}_{st})$.
For example, with unequal probabilities with replacement (section 11.9),
formula (11.33) leads to
(11.75)
where
$D_{hi} = Y_{hi} - R X_{hi}$
$S_{d2hi}^2 = \frac{1}{(M_{hi} - 1)} \sum_j^{M_{hi}} [(y_{hij} - R x_{hij}) - (\bar{Y}_{hi} - R \bar{X}_{hi})]^2$
For the estimated variance, formula (11.35) gives, after the substitution,
(11.76)
where
$\hat{D}_{hi}' = M_{hi} \bar{d}_{hi}'$,    $\hat{D}_h' = \frac{1}{n_h} \sum_i^{n_h} \frac{\hat{D}_{hi}'}{z_{hi}}$
where $\hat{Y}_{hi}' = \hat{Y}_{hi}/2z_{hi}$ and $\hat{Y}_{hi}$ is an unbiased sample estimate of $Y_{hi}$ from the
subsample in this unit.
From (11.35) an unbiased estimate of $V(\hat{Y})$ works out as
$v(\hat{Y}) = \sum_h \sum_i 2(\hat{Y}_{hi}' - \bar{\hat{Y}}_h')^2 = \sum_h (\hat{Y}_{h1}' - \hat{Y}_{h2}')^2$    (11.78)
v[/CY)) =
iiI) (Yilo 1- Yilo~
. t [Ie~ (ay; I ,] 2.
(11.79)
Select a half-sample H by choosing one unit from each stratum. The estimate of
$f(Y) = Y$ from this half-sample is $f(H) = 2 \sum_h \hat{Y}_{hi}'$, where $hi$ is the unit chosen in
stratum h. Hence, if $f(S)$ denotes the estimate of $f(Y)$ made from the whole
sample,
$f(H) - f(S) = 2 \sum_h \hat{Y}_{hi}' - \sum_h (\hat{Y}_{h1}' + \hat{Y}_{h2}') = \sum_h \pm (\hat{Y}_{h1}' - \hat{Y}_{h2}')$    (11.81)
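The identity (11.81) can be illustrated numerically. The sketch below is an illustration under stated assumptions: it uses three strata with two hypothetical values $\hat{Y}_{h1}'$, $\hat{Y}_{h2}'$ each, and a set of four half-samples whose sign patterns are mutually orthogonal (three columns of a 4 x 4 Hadamard matrix, an assumed choice of balanced set). For the linear estimator $f(Y) = Y$, averaging $[f(H) - f(S)]^2$ over such a set reproduces $\sum_h (\hat{Y}_{h1}' - \hat{Y}_{h2}')^2$.

```python
# Hypothetical values Y'_h1 and Y'_h2 for three strata
y1 = [10.0, 7.0, 12.0]
y2 = [8.0, 9.5, 9.0]

# Assumed balanced set: +1 picks unit 1, -1 picks unit 2 in each stratum.
# The three columns are mutually orthogonal.
half_samples = [
    (+1, +1, +1),
    (-1, +1, -1),
    (+1, -1, -1),
    (-1, -1, +1),
]

f_S = sum(a + b for a, b in zip(y1, y2))          # whole-sample estimate


def f_H(signs):
    """Half-sample estimate: twice the sum of the chosen Y'_hi."""
    return 2 * sum(a if s > 0 else b for s, a, b in zip(signs, y1, y2))


brr_variance = sum((f_H(s) - f_S) ** 2 for s in half_samples) / len(half_samples)
direct_variance = sum((a - b) ** 2 for a, b in zip(y1, y2))

print(brr_variance, direct_variance)
```

With an unbalanced set of half-samples the cross-product terms between strata do not cancel, and the two numbers generally differ.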
1 + + + +
2 + + + +
3 + + + +
4 + + + +
5 + + + +
and the average of the first two, all differ to some extent. Also, $f(S) = f(\hat{Y})$ is
biased, and the variance estimates do not include the correct bias contribution to
MSE[$f(\hat{Y})$]. However, note that if we drew m independent samples $S_i$, each with a
complementary $H_i$, $C_i$, the quantity
(11.84)
the expression on the extreme right indicating how the method might be applied to
other nonlinear estimation.
For a linear estimator like $\bar{y}$ it is easily verified that formula (11.85) reduces to
the usual $v(\bar{y}) = \sum (y_i - \bar{y})^2 / n(n - 1)$.
In extending this method to stratified samples with $n_h = 2$, Frankel (1971)
suggests omitting one unit at random from stratum h for h = 1, 2, ..., L in turn,
calculating $f(S_h)$ from the remaining sample of size $(2L - 1)$. One form of the
jackknife estimate of $V[f(S)]$ is then
$v[f(S)] = \sum_h [f(S_h) - f(S)]^2$    (11.86)
For a linear estimator $f(\hat{Y})$, this variance estimator reduces to the usual unbiased
estimator in ppz sampling with replacement.
As with BRR, there are four analogous versions of the jackknife estimator of
$v[f(S)]$.
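The reduction of (11.86) for a linear estimator can be checked with a small script. The sketch below is illustrative and rests on one assumption not spelled out above: when a unit is dropped from stratum h, the remaining unit in that stratum is given double weight, so that $f(S_h)$ still estimates the total. The data are hypothetical values $\hat{Y}_{h1}'$, $\hat{Y}_{h2}'$ for three strata.

```python
import random

# Hypothetical values Y'_h1 and Y'_h2 for three strata (n_h = 2)
y1 = [10.0, 7.0, 12.0]
y2 = [8.0, 9.5, 9.0]

f_S = sum(a + b for a, b in zip(y1, y2))      # linear estimate from the whole sample

rng = random.Random(0)                        # fixed seed for reproducibility
jackknife_variance = 0.0
for h in range(len(y1)):
    kept = rng.choice((y1[h], y2[h]))         # omit one unit at random in stratum h
    # Assumed reweighting: the kept unit stands in for the whole stratum
    f_Sh = f_S - (y1[h] + y2[h]) + 2 * kept
    jackknife_variance += (f_Sh - f_S) ** 2

usual_variance = sum((a - b) ** 2 for a, b in zip(y1, y2))
print(jackknife_variance, usual_variance)
```

Whichever unit is dropped, $f(S_h) - f(S) = \pm(\hat{Y}_{h1}' - \hat{Y}_{h2}')$, so the result does not depend on the random choices.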
TABLE 11.8
AVERAGE TAIL PROBABILITIES OF $[f(\hat{Y}) - E f(\hat{Y})]/\text{s.e.}[f(\hat{Y})]$
COMPARED WITH THOSE OF STUDENT'S t

6 strata     P(t) = .042, t = 2.576      P(t) = .098, t = 1.960
             BRR    J    Taylor          BRR    J    Taylor
30 strata    P(t) = .059, t = 1.960      P(t) = .110, t = 1.645
             BRR    J    Taylor          BRR    J    Taylor
EXERCISES
11.1 By working out the estimates for all possible samples that can be drawn from the
artificial population in Table 11.1, by methods Ia, Ib, II, and III, verify the total MSE's
given in Table 11.2.
11.2 For methods II (equal probabilities, unbiased estimate) and III (pps selection),
recompute the variance of $\hat{Y}$ for the example in Table 11.1 when $m_i = 1$. Show that the
precision of method III in relation to method II is lower for $m_i = 1$ than for $m_i = 2$. What
general result does this illustrate?
11.3 For the population in Table 11.1, if the estimated sizes $z_i$ are 0.1, 0.3 and 0.6, with
$m_i = 2$, show that the unbiased estimate (method IV) gives a smaller variance than pps
sampling. What is the explanation of this result?
11.4 The elements in a population with three primary units are classified into two
classes. The unit sizes $M_i$ and the proportions $P_i$ of elements that belong to the first class are
as follows.
$M_1 = 100$, $M_2 = 200$, $M_3 = 300$, $P_1 = 0.40$, $P_2 = 0.45$, $P_3 = 0.35$
For a sample consisting of 50 elements from one primary unit, compare the MSE's of
methods Ia, II, and III for estimating the proportion of elements in the first class in the
population. (In the variance formulas in section 11.2, $S_{2i}^2$ is approximately $P_i Q_i$.)
11.5 A sample of n primary units is selected with equal probabilities. From each chosen
unit, a constant fraction $f_2$ of the subunits is taken. If $a_i$ out of the $m_i$ subunits in the ith unit
fall in class C, show that the ratio-to-size estimate (section 11.8) of the population
proportion in class C is $p = \sum a_i / \sum m_i$. From formula (11.36), show that an estimate of
MSE(p) is
Factory   m_i   a_i   p_i = a_i/m_i      Factory   m_i   a_i   p_i = a_i/m_i
   1       65     8      0.123              7       85    18      0.212
   2       82    21      0.256              8       73    11      0.151
   3       52     4      0.077              9       50     7      0.140
   4       91    12      0.132             10       76     9      0.118
   5       62     1      0.016             11       64    20      0.312
   6       69     3      0.043             12       50     2      0.040

Estimate the percentage and the total number of defective pieces in use and give
estimates of their standard errors.
Note. Since $M_i/\bar{M} = m_i/\bar{m}$, the between-units component of $v(p)$ may be computed as
$\frac{1 - f_1}{n \bar{m}^2 (n - 1)} \left( \sum a_i^2 - 2p \sum a_i m_i + p^2 \sum m_i^2 \right)$
and, since the $m_i$ are fairly large, the within-units component as
$\frac{f_1 (1 - f_2)}{(n\bar{m})^2} \sum a_i q_i$
11.7 If primary units are selected with equal probabilities and $f_2$ is constant, show that
in the notation of exercise 11.5 the unbiased estimate of a population proportion is
$p = N \sum a_i / n M_0 f_2$ and that, if terms in $1/m_i$ are negligible, its variance may be computed as
$v(p) = \frac{(1 - f_1)}{n(n - 1)\bar{m}^2} \sum (a_i - \bar{a})^2 + \frac{f_1 (1 - f_2)}{(n\bar{m})^2} \sum a_i q_i$
Calculate p and its standard error for the data in exercise 11.6.
11.8 A sample of n primary units is chosen with probabilities proportional to estimated
sizes $z_i$ (with replacement) and with a constant expected over-all sampling fraction $f_0$. Show
that the unbiased and the ratio-to-size estimates of the population total are, respectively,
$T/f_0$ and $M_0 T / \sum m_i$, where T is the sample total. (It follows that if $M_0$ is not known the
unbiased estimate can be used, but not the ratio to size. For estimating the population mean
per subunit, the situation is reversed.)
11.9 In a study of overcrowding in a large city one stratum contained 100 blocks of
which 10 were chosen with probabilities proportional to estimated size (with replacement).
An expected over-all sampling fraction $f_0$ = 2% was used. Estimate the total number of
persons and the average persons per room and their s.e.'s from the data below.
Block 1 2 3 4 5 6 7 8 9 10
Rooms 60 52 58 56 62 51 72 48 71 58
Persons 115 80 82 93 105 109 130 93 109 95
11.10 For Durbin's method (section 11.10) of simplifying variance estimation in ppz
sampling without replacement, a simple method of sample selection due essentially to Kish
(1965), is as follows. The subscript h to denote the stratum will be omitted and the number
of primary units is assumed to be even.
Arrange the units in order of increasing $z_i$ and mark them off in pairs. The method is
exact only if $z_i = z_j$ for members of the same pair; this will be assumed here. Select two units
ppz with replacement. If two different units are drawn, accept both. If the same unit is drawn
twice, let the sample consist of the two members of the pair to which this unit belongs. Show
that for this method: (a) $\pi_i = 2z_i$, (b) for units not in the same pair, $\pi_{ij} = 2z_i z_j = \pi_i \pi_j / 2$, so
that $\pi_i \pi_j \pi_{ij}^{-1} - 1 = 1$, and (c) for units in the same pair, $\pi_{ij} = 4z_i z_j = \pi_i \pi_j$, so that
$\pi_i \pi_j \pi_{ij}^{-1} - 1 = 0$.
11.11 In section 11.9, formula (11.33) for $V(\hat{Y}_{ppz})$ in sampling with replacement was
proved under the plan that whenever the ith unit was selected, an independent simple
random subsample of size $m_i$ was drawn from the whole of the unit. Prove the following
results for two alternative plans.
(a) When the ith unit is selected $t_i$ times, a simple random subsample of size $m_i t_i$ is
drawn from it (assume $m_i t_i \le M_i$). Under this plan, $V(\hat{Y}_{ppz})$ in (11.41) is reduced by
$(n - 1) \sum^N M_i S_{2i}^2 / n$ (Sukhatme, 1954).
(b) When the ith unit is selected $t_i$ times, a simple random subsample of size $m_i$ is
drawn. Then $V(\hat{Y}_{ppz})$ in (11.41) is increased by
$\frac{(n - 1)}{n} \sum^N M_i^2 (1 - f_{2i}) S_{2i}^2 / m_i$
In both (a) and (b), $\hat{Y}_{ppz} = \sum t_i M_i \bar{y}_i / n z_i$, the ith unit receiving weight $t_i$.
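The selection method described in exercise 11.10 lends itself to a direct numerical check by enumerating every ordered pair of draws. The sketch below is illustrative and is not a substitute for the requested proof: the four-unit population, its probabilities, and the variable names are hypothetical, with $z_i$ equal within each pair as the exercise assumes.

```python
from itertools import product

# Hypothetical population: pairs (0, 1) and (2, 3) with equal z within each pair
z = [0.1, 0.1, 0.4, 0.4]
partner = {0: 1, 1: 0, 2: 3, 3: 2}

pi = [0.0] * len(z)      # inclusion probabilities pi_i
pi_joint = {}            # joint inclusion probabilities pi_ij, keyed by the sample

# Enumerate both draws (with replacement); a repeated unit is replaced by its pair
for a, b in product(range(len(z)), repeat=2):
    prob = z[a] * z[b]
    sample = frozenset((a, b)) if a != b else frozenset((a, partner[a]))
    for unit in sample:
        pi[unit] += prob
    pi_joint[sample] = pi_joint.get(sample, 0.0) + prob

print(pi)
print(pi_joint[frozenset((0, 1))], pi_joint[frozenset((0, 2))])
```

For these values the enumeration gives $\pi_i = 2z_i$, $\pi_{ij} = 4z_i z_j$ within a pair, and $\pi_{ij} = 2z_i z_j$ across pairs, matching results (a) to (c).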
CHAPTER 12
Double Sampling
Let
$W_h = N_h/N$ = proportion of population falling in stratum h
$w_h = n_h'/n'$ = proportion of first sample falling in stratum h
Then $w_h$ is an unbiased estimate of $W_h$.
The second sample is a stratified random sample of size n in which the $y_{hi}$ are
measured: $n_h$ units are drawn from stratum h. Usually the second sample in
stratum h is a random subsample from the $n_h'$ in the stratum. The objective of the
first sample is to estimate the strata weights; that of the second sample is to
estimate the strata means $\bar{Y}_h$.
The population mean $\bar{Y} = \sum W_h \bar{Y}_h$. As an estimate we use
$\bar{y}_{st} = \sum_{h=1}^{L} w_h \bar{y}_h$    (12.1)
The problem is to choose $n'$ and the $n_h$ to minimize $V(\bar{y}_{st})$ for given cost.
We must then verify whether the minimum variance is smaller than can be
attained by a single simple random sample in which $y_i$ alone is measured. In
presenting the theory, we assume that the $n_h$ are a random subsample of the $n_h'$.
Thus, $n_h = \nu_h n_h'$, where $0 < \nu_h \le 1$ and the $\nu_h$ are chosen in advance. Repeated
sampling implies a fresh drawing of both the first and the second samples, so that
the $w_h$, $n_h$, and $\bar{y}_h$ are all random variables. The problem is therefore one of
stratification in which the strata sizes are not known exactly (section 5A.2).
Two approximations will be made for simplicity. The first sample size $n'$ is
assumed large enough so that every $w_h > 0$. Second, when we come to discuss
optimum strategy, every optimum $\nu_h$ as found by the formula is assumed $\le 1$.
Theorem 12.1. The estimate $\bar{y}_{st}$ is unbiased.
Proof. Average first over samples in which the $w_h$ are fixed. Since $\bar{y}_h$ is the
mean of a simple random sample from the stratum, $E(\bar{y}_h) = \bar{Y}_h$. Furthermore,
when we average over different selections of the first sample, $E(w_h) = W_h$, since
the first sample is itself a simple random sample. Hence
$E(\bar{y}_{st}) = E\left[ \sum w_h \bar{Y}_h \right] = \sum W_h \bar{Y}_h = \bar{Y}$    (12.2)
Theorem 12.2. If the first sample is random and of size $n'$, the second sample
is a random subsample of the first, of size $n_h = \nu_h n_h'$, where $0 < \nu_h \le 1$, and the $\nu_h$
are fixed,
$V(\bar{y}_{st}) = S^2 \left( \frac{1}{n'} - \frac{1}{N} \right) + \sum_h \frac{W_h S_h^2}{n'} \left( \frac{1}{\nu_h} - 1 \right)$    (12.3)
is the mean of a simple random sample of size $n'$ from the population. Hence,
averaging over repeated selections of the sample of size $n'$,
(12.4)
But
$\bar{y}_{st} = \sum_h w_h \bar{y}_h = \sum_h w_h \bar{y}_h' + \sum_h w_h (\bar{y}_h - \bar{y}_h')$    (12.5)
Let the subscript 2 refer to an average over all random subsamples of $n_h$ units that
can be drawn from a given $n_h'$ units. Clearly, $E_2(\bar{y}_h) = \bar{y}_h'$. Results that follow
immediately are (see exercise 2.16):
cov $[\bar{y}_h', (\bar{y}_h - \bar{y}_h')] = 0$;  cov $(\bar{y}_h', \bar{y}_h) = V(\bar{y}_h')$;  $V(\bar{y}_h - \bar{y}_h') = V(\bar{y}_h) - V(\bar{y}_h')$    (12.6)
Hence, for fixed $w_h$,
$V_2\left[ \sum w_h (\bar{y}_h - \bar{y}_h') \right] = \sum w_h^2 S_h^2 \left( \frac{1}{n_h} - \frac{1}{n_h'} \right) = \sum \frac{w_h S_h^2}{n'} \left( \frac{1}{\nu_h} - 1 \right)$    (12.7)
$V(\bar{y}_{st}) = S^2 \left( \frac{1}{n'} - \frac{1}{N} \right) + \sum_h \frac{W_h S_h^2}{n'} \left( \frac{1}{\nu_h} - 1 \right)$    (12.8)
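Corollary 1. The result can be expressed in a number of different forms. By
the analysis of variance,
$(N - 1)S^2 = \sum (N_h - 1) S_h^2 + \sum N_h (\bar{Y}_h - \bar{Y})^2$    (12.9)
Hence, if $g' = (N - n')/(N - 1)$, multiplying by $g'/n'N$ gives
$S^2 \left( \frac{1}{n'} - \frac{1}{N} \right) = \frac{g'}{n'} \sum \left( W_h - \frac{1}{N} \right) S_h^2 + \frac{g'}{n'} \sum W_h (\bar{Y}_h - \bar{Y})^2$    (12.10)
From (12.3) this gives
$V(\bar{y}_{st}) = \sum \frac{W_h S_h^2}{n'} \left( \frac{1}{\nu_h} - 1 \right) + \frac{g'}{n'} \sum \left( W_h - \frac{1}{N} \right) S_h^2 + \frac{g'}{n'} \sum W_h (\bar{Y}_h - \bar{Y})^2$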
Example. This example did not arise from a double sampling problem, but illustrates
some features of the solution. We use the Jefferson data from p. 168. In estimating corn
acres, assume that we could either take a simple random sample of farms or devote some
resources to classifying farms into two strata by farm size. Relevant population data are for
corn acres.

Strata    W_h      S_h^2    S_h     Ybar_h
  1       0.786    312      17.7    19.404
  2       0.214    922      30.4    51.626

Suppose that $C^* = 100$, $c_h = 1 = c$ and that $S^2/N$ is negligible. This implies that if double
sampling is not used, we can afford to take a sample of n = 100 farms, giving $V(\bar{y}) = 6.20$.
Let $c'$ be the cost per farm of classifying farms into stratum 1 ($\le$ 160 acres) and stratum 2
(> 160 acres). Consider the questions:
1. For what values of $c'/c$ does double sampling bring an increase in precision?
2. What is the optimum double sampling plan if $c' = c/100$, and what is the resulting
$V(\bar{y}_{st})$?
3. In problem 2, how do the plan and the value of $V(\bar{y}_{st})$ change if the $\nu_h$ are guessed as
twice the optimum fractions?
1. From the population data, $\sum W_h S_h = 20.4$, $(S^2 - \sum W_h S_h^2) = 177$. Hence from (12.22)
$V_{min}(\bar{y}_{st}) = 0.01(20.4 + 13.3\sqrt{c'})^2$
If this is to be less than 6.20 for simple random sampling, $c' < 0.11$; that is, $c'/c < 1/9$.
2. If $c'/c = 1/100$, then with the optimum plan, (12.22) gives
$V_{min}(\bar{y}_{st}) = 0.01(20.4 + 1.3)^2 = 4.71$
Note that if classification by farm size cost nothing, we would have
$V_{min}(\bar{y}_{st}) = 0.01(20.4)^2 = 4.16$
As regards the details of the plan with $c'/c = 1/100$, from (12.21), we get
$\nu_1 = 0.133$; $\nu_2 = 0.229$
Since $\sum W_h \nu_h = 0.1535$, we find, from (12.18), that $n' = 612$. In return this gives
expected values of 64, 30 for $n_1$, $n_2$. Thus nearly all the money is spent on
measurement: only 6% on classification.
3. If we guess $\nu_1 = 0.266$, $\nu_2 = 0.458$, then $\sum W_h \nu_h = 0.307$ and from (12.18), $n' = 315$,
leading to an expected $n_1 = 66$, $n_2 = 31$. From (12.3), $V(\bar{y}_{st})$ for this plan will be found
to be 4.85, only a 3% increase over the optimum 4.71 in problem 2.
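The figures in problems 2 and 3 can be reproduced with a few lines of code. The sketch below is illustrative and makes its assumptions explicit: $S^2 = 620$ is inferred from $V(\bar{y}) = 6.20$ at n = 100; the cost equation is taken as $C^* = c'n' + c n' \sum W_h \nu_h$; the optimum fractions are taken as $\nu_h = S_h \sqrt{c'/[c(S^2 - \sum W_h S_h^2)]}$; and the variance is (12.3) with $1/N$ neglected. These forms are consistent with the numbers quoted in the example, but the function and variable names are hypothetical.

```python
import math

W = [0.786, 0.214]          # stratum weights
S_sq = [312.0, 922.0]       # within-stratum variances S_h^2
S = [17.7, 30.4]            # within-stratum standard deviations S_h
S_sq_total = 620.0          # assumed: from V(ybar) = 6.20 with n = 100
C_star, c, c_prime = 100.0, 1.0, 0.01


def plan(nu):
    """First-sample size n' and V(ybar_st) for sampling fractions nu (1/N ignored)."""
    expected_fraction = sum(w * v for w, v in zip(W, nu))
    n_first = C_star / (c_prime + c * expected_fraction)
    variance = (S_sq_total
                + sum(w * s2 * (1 / v - 1) for w, s2, v in zip(W, S_sq, nu))) / n_first
    return n_first, variance


between = S_sq_total - sum(w * s2 for w, s2 in zip(W, S_sq))
nu_opt = [s * math.sqrt(c_prime / (c * between)) for s in S]

n_opt, v_opt = plan(nu_opt)                 # problem 2
n_guess, v_guess = plan([0.266, 0.458])     # problem 3: fractions doubled

print([round(v, 3) for v in nu_opt], round(n_opt), round(v_opt, 2))
print(round(n_guess), round(v_guess, 2))
```

Small differences in the last digit relative to the text arise because the example rounds $\sum W_h S_h$ and the square root of 177 before squaring.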
12.4 ESTIMATED VARIANCE IN DOUBLE SAMPLING FOR
STRATIFICATION
If $1/n'$ and $1/N$ are both negligible with respect to 1 (e.g., < 0.02), an almost
unbiased sample estimate of $V(\bar{y}_{st})$ in (12.14) is simply the sample copy of this
formula.
(12.24)
(12.24')
where $g' = (N - n')/(N - 1)$. This formula will suffice for almost all applications.
With $1/N$ and $1/n'$ not negligible a more complex algebraic expression is needed.
By averaging first for fixed $n_h'$ and $w_h$ and then over variations in the $w_h$, the
average of $w_h s_h^2$ in (12.25) is $W_h S_h^2$, while that of $s_h^2$ is $S_h^2$. These results will be
used after equation (12.31).
In the last term in (12.25)
$\sum w_h (\bar{y}_h - \bar{y}_{st})^2 = \sum w_h \bar{y}_h^2 - \bar{y}_{st}^2$    (12.27)
Averaging first for fixed $w_h$,
$E\left( \sum w_h \bar{y}_h^2 \right) = \sum w_h \bar{Y}_h^2 + \sum w_h S_h^2 \left( \frac{1}{\nu_h w_h n'} - \frac{1}{N W_h} \right)$    (12.28)
Furthermore,
(12.29)
Also,
(12.30)
Subtracting (12.30) from (12.29) and multiplying by $g'/n'$ gives,
This proves the result. Note that the two middle terms in (12.25) are of order
$1/n'N$ and $1/n'^2\nu_h$ and are negligible relative to terms retained if $1/N$ and $1/n'$
are negligible. This supports the simpler form (12.24).
Rao (1973) has given the result (12.25) in terms of the $n_h$ and $n_h'$ as follows.
$v(\bar{y}_{st}) = \frac{N - 1}{N} \sum_h \left( \frac{n_h' - 1}{n' - 1} - \frac{n_h - 1}{N - 1} \right) \frac{w_h s_h^2}{n_h} + \frac{(N - n')}{N(n' - 1)} \sum_h w_h (\bar{y}_h - \bar{y}_{st})^2$    (12.32)
Corollary. To use (12.24) in the estimation of a proportion, put $p_h$ for $\bar{y}_h$ and
$n_h p_h q_h/(n_h - 1)$ for $s_h^2$.
Example. In a simple random sample of 374 households from a large district, 292 were
occupied by white families and 82 by nonwhite families. A sample of about one in four
households gave the following data on ownership.
Estimate the proportion of rented households in the area from which the sample was drawn
and its standard error.
If the first stratum consists of the white-occupied households,
$w_1 = \frac{292}{374} = 0.78$,
$p_1 = \frac{43}{74} = 0.58$,
(12.36)
where we assume that the S/, SI/ and hence the al2 are known in advance.
When the objective is to minimize V for given C, Sedransk (1965) and Booth
and Sedransk (1969) try different values of $n'$ in turn. For given $n'$, the value of n is
then known from (12.35). For given n, the $n_i$ that minimize V are $n_i = n a_i/(\sum a_j)$.
Difficulty may arise in using these $n_i$, however, because the $n_i'$ provided by the
initial sample are random variables. In some subgroups we may find $n_i' < n a_i/(\sum a_j)$
so that the minimizing $n_i$ exceeds the available $n_i'$. Allocation rules to
handle this situation will be illustrated for L = 3 groups, from Sedransk (1965).
Number the subgroups in increasing order of $n_i'(\sum a_j)/n a_i$, and let $(\sum' a_j)$ denote
the sum over all classes except the first. Take
if
, nal
if nl < CIal )
, (n - "l )a2
if n2 2:: CI al) (12.37)
if
where L a, =
I
r al - a l'
These rules are not complete, but will cover most $n'$ likely to be near optimal.
The principle is to keep close to the $n_i \propto a_i$ allocation. See Booth and Sedransk
(1969) for more detail.
Example. An example in which double sampling should perform well is as follows:
C = 2,000, $c'$ = 1, c = 10. The three subgroups are of relative size $W_1 = 0.05$, $W_2 = 0.25$,
$W_3 = 0.70$, $a_i^2 = S_i^2 = 10$ (i = 1, 2, 3). Consider first single sampling. Since it costs 11
monetary units to select, classify, and measure a sampling unit, we can afford n = 182 with
single sampling.
Optimum allocation would require equal $n_i$, but on the average the values of $n_i$ from
single sampling with n = 182 are 9.1, 45.5, 127.4. Assuming $E(1/n_i) = 1/E(n_i)$, the average
V from single sampling is approximately
E( V):!: I o(..!_+...2_)
31
....,11.70
53 .~
Rao (1973) handles these problems by the method used with stratified sampling:
this method specifies the fraction $\nu_i$ of the $n_i'$ in the ith subgroup that are to
be measured. An advantage is that the optimum $n'$ can be determined analytically.
With $C = c'n' + cn$ as before, the expected cost is
By the Cauchy-Schwarz inequality, the optimal $\nu_i$ for fixed $n'$ is given by
$n' W_i \nu_i = \frac{a_i}{(\sum a_i)} \frac{(C^* - n'c')}{c}$    (12.40)
provided all $\nu_i \le 1$. Substitution of $n' W_i \nu_i$ from (12.40) into (12.39) gives, for the
minimal E(V),
(12.42)
$E(V) = \frac{a_1^2}{n' W_1} + \frac{c (\sum_{i \ge 2} a_i)^2}{C^* - n'(c' + c W_1)}$    (12.45)
From this E(V) the derivative $dE(V)/dn'$ vanishes at
$n' = C^* \Big/ \left\{ (c' + c W_1) + \frac{(\sum_{i \ge 2} a_i)}{a_1} [c W_1 (c' + c W_1)]^{1/2} \right\}$    (12.46)
The value $m_2$ at which $\nu_1$ and $\nu_2$ are both 1, so that (12.45) ceases to hold, is
$m_2 = C^* \Big/ \left[ (c' + c W_1) + \frac{c W_2 (\sum_{i \ge 2} a_i)}{a_2} \right]$    (12.47)
Thus expressions (12.45) and (12.46) apply only over the range $m_1 \ge n' \ge m_2$. If
$dE(V)/dn'$ does not vanish for $n' \ge m_2$, we need to set $\nu_1$, $\nu_2$, and so forth, = 1 in
turn until the turning point of E(V) is found. In many situations for which double
sampling is economical, however, the turning value of E(V) occurs for $m_1 \ge n' \ge m_2$.
Example. For the worked example in this section, $C^* = 2000$, $c' = 1$, $c = 10$, $a_i^2 = S_i^2 = 10$,
$W_i = 0.05, 0.25, 0.70$. We have, from (12.43) and (12.47),
$m_1 = \frac{2000}{[1 + (10)(0.05)(3)]} = 800$
$m_2 = \frac{2000}{[1.5 + (10)(.25)(2)]} = 308$
From (12.46), $n' = 2000/[1.5 + 2(0.75)^{1/2}] = 619$.
Since this value lies between 800 and 308, it gives the required minimum. Formula (12.44)
gives $\nu_2 = .346$, $\nu_3 = .124$. Numerically this solution is essentially the same as that found by
Sedransk's method, which had $n' = 620$ and similar values of $n_i$ in the three subgroups.
Both methods extend easily to the case of differential costs of measurement in
different subgroups. Suggestions are also given (Rao, 1973) for the case where
$E(1/n_i)$ is substantially larger than $1/E(n_i)$.
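The numbers in this worked example follow directly from (12.46) and (12.47). The sketch below is illustrative: it assumes $\nu_1 = 1$ (every unit in the smallest subgroup is measured), takes $m_1 = C^*/[c' + cW_1(\sum a_i)/a_1]$ as read off from the arithmetic of the example, and splits the remaining measurements among subgroups 2 and 3 in proportion to $a_i$. Variable names are hypothetical.

```python
import math

C_star, c_prime, c = 2000.0, 1.0, 10.0
W = [0.05, 0.25, 0.70]
a = [math.sqrt(10.0)] * 3                  # a_i^2 = S_i^2 = 10

rest = sum(a[1:])                          # sum of a_i over subgroups 2, ..., L
unit_cost = c_prime + c * W[0]             # c' + c*W_1: cost per first-sample unit

# Bounds of the range in which nu_1 = 1 and the other fractions are below 1
m1 = C_star / (c_prime + c * W[0] * sum(a) / a[0])
m2 = C_star / (unit_cost + c * W[1] * rest / a[1])

# Turning point of E(V), formula (12.46)
n_first = C_star / (unit_cost + (rest / a[0]) * math.sqrt(c * W[0] * unit_cost))

# Measurements left for subgroups 2..L, shared in proportion to a_i (assumed form)
n_rest = (C_star - n_first * unit_cost) / c
nu = [1.0] + [(ai / rest) * n_rest / (n_first * Wi) for ai, Wi in zip(a[1:], W[1:])]

expected_V = a[0] ** 2 / (n_first * W[0]) + c * rest ** 2 / (C_star - n_first * unit_cost)

print(round(m1), round(m2), round(n_first))
print([round(v, 3) for v in nu], round(expected_V, 2))
```

The resulting E(V) of about 0.70 is roughly half the value that single sampling gives at the same total cost when $V = \sum a_i^2/n_i$ is evaluated at the average subgroup sizes 9.1, 45.5, 127.4.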
where $\bar{x}'$, $\bar{x}$ are the means of the $x_i$ in the first and second samples and b is the least
squares regression coefficient of $y_i$ on $x_i$, computed from the second sample.
If no assumption is made about the presence of a linear regression in the
population, $\bar{y}_{lr}$ will be biased, just as in one-stage sampling (Chapter 7). An
approximation to $V(\bar{y}_{lr})$ can be given, assuming random sampling and $1/n$ and
$1/n'$ negligible with respect to 1.
where $s_u'^2$ is the variance of u within the large sample. It follows that
$V(\bar{y}_{lr}) \doteq \left( \frac{1}{n'} - \frac{1}{N} \right) S_y^2 + \left( \frac{1}{n} - \frac{1}{n'} \right) S_y^2 (1 - \rho^2)$
(12.51)
(12.49)
small-sample results for its variance can be obtained. Let the regression model in
the superpopulation be
$y = \alpha + \beta x + \varepsilon$    (12.52)
where, for given x's, the $\varepsilon$'s are independent with means 0 and variance $\sigma_y^2(1 - \rho^2)$,
where $\sigma_y$ and $\rho$ are now parameters of the superpopulation.
On substituting for $\bar{y}$, $\bar{Y}$, and b from (12.52), straightforward algebra gives
n
L (Xi-X?
By averaging over the distribution of the $\varepsilon$'s, it follows from (12.53) that $\bar{y}_{lr}$ is
model-unbiased for fixed x's in the finite population and the two samples.
Furthermore, from (12.53),
$E[(\bar{y}_{lr} - \bar{Y})^2 \mid x] = \sigma_y^2 (1 - \rho^2) \left( \frac{1}{n} - \frac{1}{N} \right) + \beta^2 (\bar{x}' - \bar{X})^2 + \sigma_y^2 (1 - \rho^2) \frac{(\bar{x}' - \bar{x})^2}{\sum (x_i - \bar{x})^2}$    (12.54)
The last term on the right in (12.54) arises from the sampling error of b and is of
order $1/n$ relative to the first two terms on the right. Averaging the first two terms
on the right over the distribution of the x's created by repeated random selections
of the finite population and the two samples, we get
$EV(\bar{y}_{lr}) \doteq \sigma_y^2 (1 - \rho^2) \left( \frac{1}{n} - \frac{1}{N} \right) + \rho^2 \sigma_y^2 \left( \frac{1}{n'} - \frac{1}{N} \right)$    (12.55)
$= \frac{\sigma_y^2 (1 - \rho^2)}{n} + \frac{\rho^2 \sigma_y^2}{n'} - \frac{\sigma_y^2}{N}$    (12.56)
This expression has the same form as (12.49) except that in (12.56) $\sigma_y^2$ and $\rho$
refer to the superpopulation.
Double sampling with regression has been extended by Khan and Tripathi
(1967) to the case where p auxiliary x variables are measured in the second
sample, $\bar{Y}$ being estimated by the multiple linear regression of y on these
variables. With the second sample a random subsample of the first and with
multivariate normality assumed for y and the x's, the extension of (12.54) for
p > 1 gives for the average variance
(12.61)
(12.64)
or
$\rho^2 > \frac{4(c/c')}{(1 + c/c')^2}$    (12.65)
Equations (12.64) and (12.65) give the critical ranges of $c/c'$ for given $\rho$ and of $\rho$
for given $c/c'$ that make double sampling profitable.
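The break-even condition can be explored numerically. The sketch below is a derivation-based illustration, not a formula quoted from the text: minimizing $S_y^2(1 - \rho^2)/n + \rho^2 S_y^2/n'$ subject to $cn + c'n' = C$ (with $1/N$ ignored) gives an optimum variance proportional to $[\sqrt{(1 - \rho^2)c} + \rho\sqrt{c'}]^2$, while single sampling at the same cost has variance proportional to c. The function name is hypothetical.

```python
import math


def relative_variance(rho, cost_ratio):
    """Optimum double-sampling variance divided by single-sampling variance.

    rho:        correlation between y and x
    cost_ratio: c / c', cost of measuring y relative to measuring x
    Derived for equal total cost with the 1/N terms ignored (assumption).
    """
    return (math.sqrt(1 - rho ** 2) + rho / math.sqrt(cost_ratio)) ** 2


rho = 0.8
for cost_ratio in (4, 7.5, 13):
    print(cost_ratio, round(relative_variance(rho, cost_ratio), 3))

# Boundary of (12.65): rho^2 equals 4r / (1 + r)^2 at r = c/c' = 4 when rho = 0.8
boundary = 4 * 4 / (1 + 4) ** 2
print(boundary)
```

With $\rho = 0.8$ the ratio is 1 at $c/c' = 4$, close to 0.8 at 7.5, and close to 2/3 at 13, which correspond to equal precision and to roughly 25% and 50% gains in precision. As $c/c'$ grows without limit the ratio tends to $1 - \rho^2$.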
Figure 12.1 plots the values of the ratio $c/c'$ (on a log scale) against $\rho$. Curve I is
the relationship when double and single sampling are equally precise; curve II
holds when $V_{opt} = 0.8 V(\bar{y})$, that is, when double sampling gives a 25% increase
in precision; and curve III refers to a 50% increase in precision. For example,
when $\rho = 0.8$, double sampling equals single sampling in precision if $c/c'$ is 4,
gives a 25% increase in precision if $c/c'$ is about 7½, and a 50% increase if $c/c'$ is
about 13.
[Figure 12.1: curves I, II, and III, with the ratio $c/c'$ on the vertical axis (log scale, 1 to 100) plotted against $\rho$, the correlation between $y_i$ and $x_i$, on the horizontal axis (0.4 to 1.0).]
Fig. 12.1 Relation between $c/c'$ and $\rho$ for three fixed values of the relative precision of double and
single sampling.
Curve I: double and single sampling equally precise.
Curve II: double sampling gives 25 per cent increase in precision.
Curve III: double sampling gives 50 per cent increase in precision.
For practical use, the curves overestimate the gains to be achieved from double
sampling, because the best values of n and $n'$ must either be estimated from
previous data or be guessed. Some allowance for errors in these estimations
should be made before deciding to adopt double sampling.
For any $\rho$, there is an upper limit to the gain in precision from double sampling.
This occurs when information on $\bar{x}'$ is obtained free ($c' = 0$). The upper limit to the
relative precision is $1/(1 - \rho^2)$.
12.8 ESTIMATED VARIANCE IN DOUBLE SAMPLING
FOR REGRESSION
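If terms in $1/n$ are negligible, $V(\bar{y}_{lr})$ is given by (12.49):
$$V(\bar{y}_{lr}) \doteq \frac{S_y^2(1-\rho^2)}{n} + \frac{\rho^2 S_y^2}{n'} - \frac{S_y^2}{N}$$
With a linear regression model, the quantity
$$s_{y.x}^2 = \frac{1}{n-2}\left[\sum_{i=1}^{n}(y_i-\bar{y})^2 - b^2\sum_{i=1}^{n}(x_i-\bar{x})^2\right] \qquad (12.66)$$
is an unbiased estimate of $S_y^2(1-\rho^2)$. Since
$$s_y^2 = \frac{\sum (y_i-\bar{y})^2}{n-1}$$
is an unbiased estimate of $S_y^2$, it follows that
$$s_y^2 - s_{y.x}^2$$
is an unbiased estimate of $\rho^2 S_y^2$.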
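$$\bar{y}_R - \bar{Y} = \frac{\bar{y}}{\bar{x}}\bar{x}' - \bar{Y} = \left(\frac{\bar{y}}{\bar{x}}\bar{X} - \bar{Y}\right) + \frac{\bar{y}}{\bar{x}}(\bar{x}' - \bar{X}) = \frac{\bar{X}}{\bar{x}}(\bar{y} - R\bar{x}) + \frac{\bar{y}}{\bar{x}}(\bar{x}' - \bar{X})$$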
The first component is the error of the ordinary ratio estimate (section 2.11). In
obtaining the appropriate error variance in section 2.11, we replaced the factor
$\bar{X}/\bar{x}$ by unity in this term. To the same order of approximation, we replace the
factor $\bar{y}/\bar{x}$ in the second component by the population ratio $R = \bar{Y}/\bar{X}$. Thus
$$\bar{y}_R - \bar{Y} \doteq (\bar{y} - R\bar{x}) + R(\bar{x}' - \bar{X}) \qquad (12.70)$$
If the second sample is a random subsample of the first,
$$V_2(\bar{y}_R - \bar{Y}) \doteq \left(\frac{1}{n} - \frac{1}{n'}\right)s_d^{\prime 2} \qquad (12.71)$$
where $s_d^{\prime 2}$ is the variance within the second sample of the variate $d_i = y_i - Rx_i$.
Averaging now over repeated random selection of the first sample,
$$V(\bar{y}_R) = V_1E_2 + E_1V_2 \doteq \left(\frac{1}{n'} - \frac{1}{N}\right)S_y^2 + \left(\frac{1}{n} - \frac{1}{n'}\right)(S_y^2 - 2RS_{yx} + R^2S_x^2) \qquad (12.72)$$
TABLE 12.1
ESTIMATES FROM THE UNMATCHED AND MATCHED PORTIONS
              Estimate      Variance
Unmatched:
Matched:
$$V(\bar{y}_2') = \frac{1}{W_{2u} + W_{2m}}$$
From Table 12.1, this works out after simplification as
$$V(\bar{y}_2') = \frac{S_2^2(n - u\rho^2)}{n^2 - u^2\rho^2} \qquad (12.74)$$
Table 12.2 shows for a series of values of $\rho$ the optimum percent that should be
matched and the relative gain in precision compared with no matching. The best
TABLE 12.2
OPTIMUM % MATCHED
                                          % gain with
          Optimum      % gain in
$\rho$    % matched    precision     m/n = 1/3    m/n = 1/4
0.5          46            7             7            6
0.6          44           11            11            9
0.7          42           17            17           15
0.8          38           25            25           23
0.9          30           39            39           39
0.95         24           52            50           52
1.0           0          100            67           75
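The entries of Table 12.2 can be recomputed from (12.74). The sketch below assumes $m + u = n$ and measures the gain against an unmatched sample of size $n$ with variance $S_2^2/n$. The closed form used for the optimum matched fraction, $\sqrt{1-\rho^2}/(1+\sqrt{1-\rho^2})$, is obtained here by minimizing (12.74) over $u$; it is an assumption of this sketch, not a formula quoted from the text, and the function names are illustrative.

```python
import math

def variance_factor(rho, matched_fraction):
    """n * V(ybar_2') / S_2^2 from (12.74), with u/n = 1 - matched_fraction."""
    mu = 1.0 - matched_fraction
    return (1.0 - mu * rho ** 2) / (1.0 - (mu * rho) ** 2)

def optimum_matched_fraction(rho):
    """Matched fraction m/n minimizing (12.74) (derived for this sketch)."""
    root = math.sqrt(1.0 - rho ** 2)
    return root / (1.0 + root)

def percent_gain(rho, matched_fraction):
    """Percent gain in precision over no matching (variance S_2^2 / n)."""
    return 100.0 * (1.0 / variance_factor(rho, matched_fraction) - 1.0)

for rho in (0.5, 0.6, 0.7, 0.8, 0.9, 0.95):
    m_opt = optimum_matched_fraction(rho)
    print(rho,
          round(100 * m_opt, 1),
          round(percent_gain(rho, m_opt), 1),
          round(percent_gain(rho, 1 / 3), 1),
          round(percent_gain(rho, 1 / 4), 1))
```

The case $\rho = 1$ is left out of the loop because the no-matching baseline makes (12.74) a 0/0 form there.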
where $\delta = c_m/c_u$. If sample sizes are the same on the two occasions so that
$m + u = n$, the optimum unmatched proportion on the second occasion is found by
minimizing
$$\frac{VC}{c_uS_2^2} = \frac{[n\delta + u(1-\delta)](n - u\rho^2)}{n^2 - u^2\rho^2} = \frac{[\delta + \mu(1-\delta)](1 - \mu\rho^2)}{1 - \mu^2\rho^2} \qquad (12.78)$$
where $\mu = u/n$ and $V$ comes from (12.74). If $\delta < 1$, matching being cheaper, the
optimum proportion matched is, of course, greater than the values in Table 12.2.
He also deals with the case where the costs are to be the same on the two
occasions.
In some applications the data for occasion 1 provide several auxiliary variables
correlated with $y_2$, one of which will, of course, usually be $y_1$. For example, in
estimating the kill $y_2$ of waterfowl per hunter in Ontario from 1968 to 1969, Sen
(1973a) found that the kill per hunter and the number of days hunted in 1967 and
1968 were both correlated with $y_2$. In this paper he extended the preceding
analysis to the case where $\bar{y}_{2m}$ is adjusted by its multiple linear regression on the
auxiliary variables and where the samples on the two occasions are of unequal
sizes. With large samples of equal size the only change in (12.76) for $V_{opt}(\bar{y}_2')$ is to
replace $\rho^2$ by the square $R^2$ of the multiple correlation coefficient between $y_2$ and
the auxiliary variables (assuming multivariate normality). The corresponding
theory for the case where $\bar{y}_{2m}$ is adjusted by the multivariate ratio estimate is
given in Sen (1972) for equal sample sizes and in Sen (1973b) for unequal sample
sizes.
TABLE 12.3
ESTIMATES OF $\bar{Y}_h$ ON THE hth OCCASION
              Estimate      Variance
Unmatched:
Matched:
on page 339. Note that (a) our $m$ corresponds to the $n$ in (12.49) and (b) the term
$\rho^2S_y^2/n'$ in (12.49), which equals $B^2V(\bar{x}')$, is replaced by $\rho^2V(\bar{y}_{h-1}')$, since $B = \rho$
when $S$ is constant on successive occasions and $\bar{y}_{h-1}'$ corresponds to $\bar{x}'$ in the
earlier analysis.
We now examine the precision obtained if the optimum $m_h$ and $u_h$ and the
optimum weights are used on every occasion. It will be found that the optimum
$m_h/n$ increase steadily on successive occasions, rapidly approaching a limiting
value of ½.
(12.81)
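When this value is substituted in (12.80), the relation becomes, after some
algebraic manipulation,
$$\frac{1}{g_h} = 1 + \frac{1 - \sqrt{1-\rho^2}}{g_{h-1}(1 + \sqrt{1-\rho^2})} \qquad (12.82)$$
This relation may be written
$$r_h = 1 + br_{h-1}$$
where $r_h = 1/g_h$, $b = (1 - \sqrt{1-\rho^2})/(1 + \sqrt{1-\rho^2})$, and $r_1 = 1/g_1 = 1$. Repeated use of this recurrence relation gives
$$\frac{1}{g_h} = r_h = 1 + b + b^2 + \cdots + b^{h-1} = \frac{1 - b^h}{1 - b}$$
Since $0 < b < 1$, the limiting variance factor $g_\infty$ is
$$g_\infty = 1 - b = \frac{2\sqrt{1-\rho^2}}{1 + \sqrt{1-\rho^2}} \qquad (12.83)$$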
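Hence the variance of $\bar{y}_h'$ tends to
$$V(\bar{y}_\infty') = \frac{S^2}{n}\left(\frac{2\sqrt{1-\rho^2}}{1 + \sqrt{1-\rho^2}}\right) \qquad (12.84)$$
Finally, the limiting value of $m_h$ is obtained from (12.81) as
$$\frac{m_\infty}{n} = \frac{\sqrt{1-\rho^2}}{g_\infty(1 + \sqrt{1-\rho^2})} = \frac{1}{2}$$
irrespective of the value of $\rho$.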
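How quickly the variance factor settles can be seen by iterating the recurrence directly. The sketch below assumes the recurrence $r_h = 1 + br_{h-1}$ with $r_h = 1/g_h$, $r_1 = 1$, and $b = (1-\sqrt{1-\rho^2})/(1+\sqrt{1-\rho^2})$, and compares the result with the limit $g_\infty = 1 - b$. The function names are illustrative.

```python
import math

def variance_factors(rho, occasions):
    """Return [g_1, ..., g_occasions], where V(ybar_h') = g_h * S^2 / n."""
    root = math.sqrt(1.0 - rho ** 2)
    b = (1.0 - root) / (1.0 + root)
    r = 1.0            # r_1 = 1 / g_1 = 1: no earlier occasion to borrow from
    factors = [1.0]
    for _ in range(occasions - 1):
        r = 1.0 + b * r          # r_h = 1 + b * r_{h-1}
        factors.append(1.0 / r)
    return factors

def limiting_factor(rho):
    """g_infinity = 1 - b = 2 sqrt(1 - rho^2) / (1 + sqrt(1 - rho^2))."""
    root = math.sqrt(1.0 - rho ** 2)
    return 2.0 * root / (1.0 + root)

for rho in (0.7, 0.8, 0.9, 0.95):
    g = variance_factors(rho, 6)
    print(rho, [round(x, 3) for x in g], round(limiting_factor(rho), 3))
```

Because $b^h$ shrinks geometrically, most of the eventual reduction in variance is already present after a few occasions.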
Table 12.4 shows the optimum percentage to match, $100m_h/n$, as found from
(12.81), and the resulting variances for $\rho = 0.7$, 0.8, 0.9 and 0.95 and for a series
of values of $h$.
TABLE 12.4
OPTIMUM % MATCHED AND VARIANCES
       % matched, $\rho$ =           Variance, $\rho$ =
h      0.7   0.8   0.9   0.95      0.7   0.8   0.9   0.95
By the fourth occasion, the optimum percent matched is close to 50 for all the
values of $\rho$ shown, although a smaller amount of matching is indicated for the
second and third occasions. The reductions in variance, $(1 - g_h)$, are modest if $\rho$ is
less than 0.8.
variances of $\bar{y}_h'$ and of the estimated change $(\bar{y}_h' - \bar{y}_{h-1}')$ when $m$, $u$, and $\phi$ are held
constant. We continue to write $V(\bar{y}_h') = g_hS^2/n$, although the actual value of $g_h$
will be different from that in the preceding section.
The estimate is now
$$\bar{y}_h' = \phi\bar{y}_{hu}' + (1-\phi)\bar{y}_{hm}'$$
Substituting the expressions for the two variances (from Table 12.3), we have
$$V(\bar{y}_h') = \frac{g_hS^2}{n} = \phi^2V(\bar{y}_{hu}') + (1-\phi)^2V(\bar{y}_{hm}')$$
$$= S^2\left[\frac{\phi^2}{u} + \frac{(1-\phi)^2(1-\rho^2)}{m}\right] + \frac{S^2\rho^2(1-\phi)^2g_{h-1}}{n}$$
Hence
$$g_h = \left[\frac{\phi^2}{\mu} + \frac{(1-\phi)^2(1-\rho^2)}{\lambda}\right] + \rho^2(1-\phi)^2g_{h-1} \qquad (12.85)$$
The value of the weight $\phi$ that minimizes the limiting variance may be found by
differentiating (12.86). This leads to a quadratic equation whose appropriate root
is
$$\phi_{opt} = \frac{\sqrt{1-\rho^2}\left[\sqrt{1-\rho^2+4\lambda\mu\rho^2} - \sqrt{1-\rho^2}\right]}{2\lambda\rho^2}$$
In practice, the value of $\rho$ will not be known exactly and will differ from item to
item. A compromise value can usually be chosen. Clearly, $\phi_{opt}$ will be less than
$\mu = u/n$, since the matched part of the sample gives higher precision per unit than
the unmatched part. For example, with $\mu = 0.25$, that is, ¼ of the sample
unmatched, $\phi_{opt}$ turns out to be 0.216, 0.198, and 0.164 for $\rho = 0.7$, 0.8, 0.9. The
choice of $\phi = 0.2$ would be adequate for this range of $\rho$.
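The quoted optimum weights can be checked numerically. The sketch below assumes the limiting variance factor is the fixed point of (12.85), that is, $g_\infty = [\phi^2/\mu + (1-\phi)^2(1-\rho^2)/\lambda]/[1 - \rho^2(1-\phi)^2]$ with $\lambda = m/n = 1 - \mu$, and it uses the closed-form root $\phi_{opt} = \sqrt{1-\rho^2}[\sqrt{1-\rho^2+4\lambda\mu\rho^2} - \sqrt{1-\rho^2}]/(2\lambda\rho^2)$. A coarse grid search is included only as a cross-check; the function names are illustrative.

```python
import math

def limiting_g(phi, rho, mu):
    """Fixed point of (12.85): set g_h = g_{h-1} and solve (sketch assumption)."""
    lam = 1.0 - mu
    numerator = phi ** 2 / mu + (1.0 - phi) ** 2 * (1.0 - rho ** 2) / lam
    return numerator / (1.0 - rho ** 2 * (1.0 - phi) ** 2)

def phi_opt(rho, mu):
    """Closed-form weight minimizing the limiting variance."""
    lam = 1.0 - mu
    root = math.sqrt(1.0 - rho ** 2)
    inner = math.sqrt(1.0 - rho ** 2 + 4.0 * lam * mu * rho ** 2)
    return root * (inner - root) / (2.0 * lam * rho ** 2)

def phi_by_grid(rho, mu, steps=10000):
    """Brute-force minimizer over phi in (0, 1), used as a cross-check."""
    grid = [i / steps for i in range(1, steps)]
    return min(grid, key=lambda phi: limiting_g(phi, rho, mu))

for rho in (0.7, 0.8, 0.9):
    print(rho, round(phi_opt(rho, 0.25), 3), round(phi_by_grid(rho, 0.25), 3))
```

With $\mu = 0.25$ the closed form and the grid search agree to about three decimals, and both fall below $\mu$, as the text argues.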
For the estimate of change, we have
$$V(\bar{y}_h' - \bar{y}_{h-1}') = V(\bar{y}_h') + V(\bar{y}_{h-1}') - 2\,\mathrm{cov}(\bar{y}_h'\bar{y}_{h-1}') \qquad (12.87)$$
To find the covariance term, note that if $y_{hi}$, $y_{h-1,i}$ are the values for the $i$th unit
in the matched set on occasions $h$ and $(h-1)$, our model is
$$y_{hi} = \bar{Y}_h + \rho(y_{h-1,i} - \bar{Y}_{h-1}) + e_{hi}$$
where the $e_{hi}$ are independent of the $y$'s. From this model it is found by
substitution that
$$\bar{y}_{hm}' = \bar{y}_{hm} + \rho(\bar{y}_{h-1}' - \bar{y}_{h-1,m}) = \bar{Y}_h + \rho(\bar{y}_{h-1}' - \bar{Y}_{h-1}) + \bar{e}_{hm}$$
Hence the covariance of $\bar{y}_{hm}'$ and $\bar{y}_{h-1}'$ is $\rho V(\bar{y}_{h-1}')$. But
$$\mathrm{cov}(\bar{y}_h'\bar{y}_{h-1}') = \mathrm{cov}\{[\phi\bar{y}_{hu} + (1-\phi)\bar{y}_{hm}']\bar{y}_{h-1}'\} = \rho(1-\phi)V(\bar{y}_{h-1}')$$
since $\bar{y}_{hu}$ is independent of $\bar{y}_{h-1}'$. From (12.87), this gives
2        14   10   22   14   33   19   41   22
3        16   14   30   20   52   32   67   39
4        17   15   32   24   59   40   79   52
$\infty$ 17   15   33   26   62   50   89   74
RG*      15   10   27   18   56   40
*Described on p. 355.
with the optimum gains from Table 12.4 suggests that after the second occasion
little precision is lost by using a constant weight and a fixed proportion matched,
unless $\rho \geq 0.95$.
If $\rho$ exceeds 0.8, the regression coefficient $b = \rho$ may be replaced by 1 with only
a small additional loss of precision. This gives an estimate $\bar{y}_h''$ of the form
(12.89)
In the important Current Population Survey taken monthly by the U.S. Bureau
of the Census, one quarter of the second-stage units are replaced each month, so
that an individual household remains in the sample during four consecutive
months. The household is omitted for the eight succeeding months but is then
brought back for another four months, thus increasing slightly the precision of
year-to-year comparisons.
The composite estimate used in this survey is of a form related to (12.89) but
slightly different.
$$\bar{y}_h'' = (1-K)\bar{y}_h + K(\bar{y}_{h-1}'' + \bar{y}_{hm} - \bar{y}_{h-1,m}) \qquad (12.90)$$
where $K$ is a constant weighting factor. The difference is that $\bar{y}_h$, the current
estimate for the whole sample, takes the place of the $\bar{y}_{hu}$ in (12.89). The quantities
$\bar{y}_{hm}$, $\bar{y}_{h-1,m}$, $\bar{y}_h$ in (12.90) are ratio estimates of a fairly complex type.
The variance of $\bar{y}_h''$ (due to Bershad) is given in Hansen, Hurwitz, and Madow
(1953); see also the Appendix in Hansen et al. (1955). The estimator of the
month-to-month change is
Since the primary units remain unchanged, only the within-units components of
$V(\bar{y}_h'')$ and $V(d_h)$ are affected by the sample rotation policy.
Rao and Graham (1964) have examined the performance of composite
estimators in rotation policies of this type, in which a respondent remains in the
sample for $r$ months and then drops out for $m$ months. They used as models an
exponential correlogram and a linear correlogram in time, descending to zero. A
more complex correlogram, needed if there is a high correlation between months
$h$ and $(h - 12)$, has been studied by Graham (1973).
Their gains in efficiency for $r = 2$, 4 and $m = \infty$ correspond to the results for
$\lambda = \frac{1}{2}, \frac{3}{4}$ in Table 12.5. For an exponential correlogram in which the correlation
between results on the same unit on occasions $h$, $h - i$ is $\rho^i$, the lines labeled RG in
Table 12.5 show for $\rho = 0.7$, 0.8, 0.9, their percent gains in efficiency. For each $\rho$
they use in (12.90) the optimum $K$ for the current estimates as $h \to \infty$. As Table
12.5 shows, they also find that the gains in efficiency from the composite
estimators are much greater in estimating change than current level.
In a more general framework, Scott and Smith (1974) have discussed the role of
time series methods in making estimates in repeated surveys of various types.
In another rotation policy a new sample is drawn on each occasion, with no
matching. With weekly or monthly sampling, this plan is appropriate when annual
estimates, and to a lesser extent semiannual or quarterly estimates, are of primary
importance, for example, in an illness survey with emphasis on chronic diseases. If
the questionnaire obtains for any unit the results for the preceding month as well
as for the current month, we can consider composite estimates of the form
(12.91)
where $\bar{y}_h$ = estimate made from current data in the current sample
$\bar{y}_{h-1,h}$ = estimate made from previous month's data in the current sample
$\bar{y}_{h-1}'$ = composite estimate for the previous month
The theory is discussed by Hansen, Hurwitz, and Madow (1953) and Woodruff
(1959), who apply it to a survey of retail sales, and by Eckler (1955). In the Retail
Trade Survey the composite estimate involves a ratio estimate, being of the form
12.2 For the $W_h$, $P_h$ in exercise 12.1, find the cost ratios $c_n/c_{n'}$ for which double
sampling is more economical than single sampling.
12.3 A population contains $L$ strata of equal size. If $V_{ran}$ denotes the variance of the
mean of a simple random sample and $V_{st}$, $V_{ds}$ are the corresponding variances for stratified
random sampling with proportional allocation and for double sampling with stratification,
show that, approximately,
$$nV_{ran} = \bar{S}_h^2 + \frac{\sum_h(\bar{Y}_h - \bar{Y})^2}{L}$$
$$nV_{ds} = \bar{S}_h^2 + \frac{n}{n'}\frac{\sum_h(\bar{Y}_h - \bar{Y})^2}{L}$$
where $\bar{S}_h^2$ is the average variance within strata. ($N$ and $n'$ may both be assumed large
relative to $L$, and the $n_h$ in double sampling may be assumed equal to $n/L$.)
Hence, if $(RP)_{st}$ denotes the relative precision of the stratified sample to the simple
random sample, with a corresponding definition for $(RP)_{ds}$, show that
$$(RP)_{ds} = \frac{(RP)_{st}}{1 + (n/n')[(RP)_{st} - 1]}$$
For $(RP)_{st} = 2$, plot $(RP)_{ds}$ against $n/n'$. How small must this ratio be in order that
$(RP)_{ds} = 1.9$?
12.4 If $\rho = 0.8$ in double sampling for regression, how large must $n'$ be relative to $n$, if
the loss in precision due to sampling errors in the mean of the large sample is to be less than
10%?
12.5 In an application of double sampling for regression, the small sample was of size
87 and the large sample of size 300. The following computations apply to the small sample.
$$\sum(y_i - \bar{y})^2 = 17{,}283, \quad \sum(y_i - \bar{y})(x_i - \bar{x}) = 5114, \quad \sum(x_i - \bar{x})^2 = 3248$$

where $\lambda = m/n$, $\mu = u/n$. (b) For given $\rho$, $\lambda$, $\mu$, find the value of $\phi$ that minimizes $V(\bar{y}_2'')$.
Show that if $\rho$ exceeds ½ the best weight $\phi$ lies between $\mu$ and $\mu/(1+\mu)$.
12.8 For $\mu = \frac{1}{4}$, $\mu = \frac{1}{2}$, $\rho = 0.8$, and $\rho = 0.9$, compare $V(\bar{y}_2'')$ in the preceding exercise
with the variance of the optimum composite regression estimate $\bar{y}_2'$, as given by equation
12.74. (In $\bar{y}_2''$ take $\phi = 0.2$ when $\mu = \frac{1}{4}$ and $\phi = 0.4$ when $\mu = \frac{1}{2}$.) Verify that for these values
of $\rho$ the estimate $\bar{y}_2''$ is almost as precise as $\bar{y}_2'$ for both $\mu = \frac{1}{4}$ and $\mu = \frac{1}{2}$.
12.9 An independent sample of size $n$ is drawn each month. From the sample taken in
any month, data are obtained for the current and the preceding month. A composite
estimate $\bar{y}_h'$ is made as in (12.91), section 12.13.
$$\bar{y}_h' = \bar{y}_h + \phi_h(\bar{y}_{h-1}' - \bar{y}_{h-1,h})$$
The model is
$$y_{hi} = \bar{Y}_h + \rho(y_{h-1,i} - \bar{Y}_{h-1}) + e_{hi}$$
where $e_{hi}$ is independent of the $y$'s and has variance $S^2(1-\rho^2)$. Show that (a)
$$\bar{y}_h' - \bar{Y}_h = \bar{e}_h + \phi_h(\bar{y}_{h-1}' - \bar{Y}_{h-1}) + (\rho - \phi_h)(\bar{y}_{h-1,h} - \bar{Y}_{h-1})$$
(b) If $V(\bar{y}_h') = g_hS^2/n$, where $S^2$ is constant on all occasions,
$$g_h = (1-\rho^2) + \phi_h^2g_{h-1} + (\rho - \phi_h)^2$$
(c) The optimum $\phi_h = \rho/(1 + g_{h-1})$ and the resulting optimum $g_h$ is
$$g_h = 1 - \frac{\rho^2}{1 + g_{h-1}}$$
(d) The limiting $g_h$ is $g_\infty = \sqrt{1-\rho^2}$. These results were given by Eckler (1955).
12.10 If $E_{lr} = V(\bar{y})/V(\bar{y}_{lr})$ and $E_R = V(\bar{y})/V(\bar{y}_R)$ are the relative efficiencies of the
linear regression and ratio estimates relative to the sample mean of a simple random sample, show
that for both $\bar{y}_{lr}$ and $\bar{y}_R$ the corresponding relative efficiency to $\bar{y}$ in double sampling with
optimum choice of $n/n'$ is (ignore $1/N$),
Hence note that with either of these estimators, double sampling will not be highly effective
unless $c'/c$ is small (e.g., $< 1/10$). For example, with $c'/c = 1/10$, $E = 6$ gives $E_{ds} = 2.1$.
12.11 In sampling on two occasions, suppose that $S_1 = S_2 = S$ and that the samples are
large, so that the regression coefficients of $y_{2i}$ on $y_{1i}$ and of $y_{1i}$ on $y_{2i}$ in the matched part of
the samples on the two occasions are both effectively equal to $\rho$. The estimate $\bar{y}_2'$ in section
12.10 is constructed and an analogous estimate $\bar{y}_1'$ using the regression of $y_{1i}$ on $y_{2i}$. Show
that
(i)
(ii)
(One way of doing this is to express $(\bar{y}_2' \pm \bar{y}_1')$ as linear functions of $(\bar{y}_{2m} \pm \bar{y}_{1m})$ and
$(\bar{y}_{2u} \pm \bar{y}_{1u})$, which are uncorrelated.)
Note that, as intuition suggests, (i) is minimized when $u = 0$, while (ii) is minimized when
$u = n$.
12.12 The most favorable case for the application of the method in exercise 12.1 occurs
when the true proportion is 0 in stratum 1. In estimating the total number of units $Y_1$ in the
population that possess an attribute costly to measure, this happens when there is a second
attribute, cheap to measure, such that only units having the second attribute can possess the
first attribute. In a simple random sample of size $n'$, count the number $m'$ who have
attribute 2. Draw a subsample of size $m = m'/k$ from these, and count the number $r$ who
have attribute 1.
(a) From theorems 12.1 and 12.2, show that $\hat{Y}_1 = Nkr/n'$ is an unbiased estimate of $Y_1$
with variance
$$V(\hat{Y}_1) = \frac{N^2P_1}{n'}\left[Q_1\left(1 - \frac{n'}{N}\right) + \frac{P_2(k-1)}{P_1+P_2}\right]$$
where $P_1$, $(P_1 + P_2)$ are the population proportions having attributes 1, 2. Assume $1/N(P_1 + P_2)$
negligible.
(b) With $c = 10$, $c' = 1$, the investigator guesses that $P_1 = 0.25$, $P_2 = 0.15$. Is double
sampling profitable in this case?
CHAPTER 13
13.1 INTRODUCTION
The theory presented in preceding chapters assumes throughout that some kind
of probability sampling is used and that the observation $y_i$ on the $i$th unit is the
correct value for that unit. The error of estimate arises solely from the random
sampling variation that is present when $n$ of the units are measured instead of the
complete population of $N$ units.
These assumptions hold reasonably well in the simpler types of surveys in which
the measuring devices are accurate and the quality of work is high. In complex
surveys, particularly when difficult problems of measurement are involved, the
assumptions may be far from true. Three additional sources of error that may be
present are as follows.
1. Failure to measure some of the units in the chosen sample. This may occur by
oversight or, with human populations, because of failure to locate some individuals
or their refusal to answer the questions when located.
2. Errors of measurement on a unit. The measuring device may be biased or
imprecise. With human populations the respondents may not possess accurate
information or they may give biased answers.
3. Errors introduced in editing, coding and tabulating the results.
which measurements would be obtained if the units happened to fall in the sample,
the second of the units for which no measurements would be obtained. The
composition of the two strata depends intimately on the methods used to find the
units and obtain the data. A survey in which at least three calls are made, if
necessary, on every house and in which a supervisor with exceptional powers of
persuasion calls on all persons who refuse to give data will have a much smaller
"nonresponse" stratum than one in which only a single attempt is made for every
house.
TABLE 13.1
RESPONSES TO THREE REQUESTS IN A MAILED INQUIRY
                                      Average Number
          Number of      % of         of Fruit Trees
          Growers        Population   per Grower
This division into two distinct strata is, of course, an oversimplification. Chance
plays a part in determining whether a unit is found and measured in a given
number of attempts. In a more complete specification of the problem we would
attach to each unit a probability representing the chance that it would be
measured by a given field method if it fell in the sample.
The sample provides no information about the nonresponse stratum 2. This
would not matter if it could be assumed that the characteristics of stratum 2 are the
same as those of stratum 1. Where checks have been made, however, it has often
been found that units in the "nonresponse" stratum differ from units that are
measurable. An illustration appears in Table 13.1. The data come from an
experimental sampling of fruit orchards in North Carolina in 1946. Three
successive mailings of the same questionnaire were sent to growers. For one of the
questions, number of fruit trees, complete data were available for the population
(Finkner, 1950).
The steady decline in the number of fruit trees per grower in the successive
responses is evident, these numbers being 456 for respondents to the first mailing,
382 in the second mailing, 340 in the third, and 290 for the refusals to all three
letters. The total response was poor, more than half the population failing to give
data even after three attempts.
We now consider the effects of nonresponse on the sample estimate. Let $N_1$, $N_2$
be the numbers of units in the two strata and let $W_1 = N_1/N$, $W_2 = N_2/N$, so that
$W_2$ is the proportion of nonresponse in the population. Assume that a simple
random sample is drawn from the population. When the field work is completed,
we have data for a simple random sample from stratum 1 but no data from stratum
2. Hence the amount of bias in the sample mean is
$$E(\bar{y}_1) - \bar{Y} = \bar{Y}_1 - \bar{Y} = \bar{Y}_1 - (W_1\bar{Y}_1 + W_2\bar{Y}_2) = W_2(\bar{Y}_1 - \bar{Y}_2) \qquad (13.1)$$
The amount of bias is the product of the proportion of nonresponse and the
difference between the means in the two strata. Since the sample provides no
information about $\bar{Y}_2$, the size of the bias is unknown unless bounds can be placed
on $\bar{Y}_2$ from some source other than the sample data. With a continuous variate,
the only bounds that can be assigned with certainty are often so wide as to be
useless.
Consequently, with continuous data, any sizable proportion of nonresponse
usually makes it impossible to assign useful confidence limits to $\bar{Y}$ from the sample
results. We are left in the position of relying on some guess about the size of the
bias, without data to substantiate the guess.
In sampling for proportions the situation is a little easier, since the unknown
proportion $P_2$ in stratum 2 must lie between 0 and 1. If $W_2$ is known, these bounds
for $P_2$ enable us to construct confidence limits for the population proportion $P$.
Suppose that a simple random sample of $n$ units is drawn and that measurements
are obtained for $n_1$ of the units in the sample. Assuming $n_1$ large enough, 95%
confidence limits for $P_1$ are given by
$$p_1 \pm 2\sqrt{p_1q_1/n_1}$$
where $p_1$ is the sample proportion and the fpc is ignored.
When we try to derive a confidence statement about $P$, we are on safe ground if
we assume $P_2 = 0$ when finding $\hat{P}_L$ and $P_2 = 1$ when finding $\hat{P}_U$. Thus we might
take, for 95% limits,
The limits are distressingly wide unless $W_2$ is very small. Table 13.2 shows the
average limits for a sample size $n = 1000$ and a series of values of $W_2$ and $p_1$. Since
the limits in (13.2) and (13.3) depend on the value of $n_1$ (number of respondents in
the sample), we have taken $n_1 = nW_1$, its average value, in computing Table 13.2.
TABLE 13.2
95% CONFIDENCE LIMITS FOR P (%) WHEN n = 1000
% Nonresponse,       Sample Percentage, $100p_1$
$100W_2$             5      10      20      50
The rapid increase in the width of the confidence interval with increasing $W_2$ is
evident. It is of interest to examine what values of $n$ would be needed to give the
same widths of confidence interval if $W_2$ were zero. This is easily done when $p_1$ is
50%. For $W_2 = 5\%$, Table 13.2 shows that the half-width of the confidence
interval is 5.6. The equivalent sample size $n_e$, assuming no nonresponse, is found
from the equation
$$5.6 = 2\sqrt{(50)(50)/n_e}$$
$$n_e = 320$$
For $W_2 = 10$, 15, and 20%, the values of $n_e$ are 155, 90, and 60, respectively. It
is evidently worthwhile to devote a substantial proportion of the resources to the
reduction of nonresponse.
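The equivalent sample sizes can be recomputed with a short script. The sketch below assumes the conservative limits take the form $W_1(p_1 \mp 2\sqrt{p_1q_1/n_1})$ for the respondent stratum, plus $W_2 \times 0$ for the lower limit and $W_2 \times 100\%$ for the upper limit, with $n_1 = nW_1$. This is a reading of the rule $P_2 = 0$, $P_2 = 1$ stated earlier, not a formula quoted from the text, and the function names are illustrative.

```python
import math

def conservative_limits(p1_pct, w2, n):
    """Average 95% limits (in percent) for P when a fraction w2 never responds.

    Assumes P2 = 0 for the lower limit, P2 = 100% for the upper limit,
    and n1 equal to its average value n * W1.
    """
    w1 = 1.0 - w2
    n1 = n * w1
    half = 2.0 * math.sqrt(p1_pct * (100.0 - p1_pct) / n1)
    lower = w1 * (p1_pct - half)
    upper = w1 * (p1_pct + half) + 100.0 * w2
    return lower, upper

def equivalent_sample_size(half_width_pct, p_pct=50.0):
    """n_e solving half_width = 2 * sqrt(p (100 - p) / n_e) with no nonresponse."""
    return 4.0 * p_pct * (100.0 - p_pct) / half_width_pct ** 2

for w2 in (0.05, 0.10, 0.15, 0.20):
    lower, upper = conservative_limits(50.0, w2, 1000)
    half_width = (upper - lower) / 2.0
    print(w2, round(half_width, 1), round(equivalent_sample_size(half_width)))
```

Under these assumptions the half-width for $W_2 = 5\%$ comes out close to 5.6, and the equivalent sample sizes land near the rounded values quoted above.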
If the population nonresponse rate $W_2$ is not known, as will usually be the case,
conservative confidence limits can be calculated from the sample data by a method
suggested by a student. In calculating the lower limit, assume that all sample
nonrespondents would have given a negative response. In calculating the upper
limit, assume that sample nonrespondents would have given a positive response.
For example, suppose $n = 1000$, $n_1 = 800$, and $p_1 = 10\%$, so that 80 sample
members give a positive response and the sample nonresponse rate is 20%. Then,
in percents,
$$\hat{P}_L = 8 - 2\sqrt{(8)(92)/1000} = 6.3\%$$
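The same rule is easy to apply to any sample. The sketch below treats every nonrespondent as a negative response for the lower limit and as a positive response for the upper limit, and uses the factor 2 for 95% limits with the fpc ignored, as in the worked figure. The function name and argument order are illustrative.

```python
import math

def nonresponse_limits(n, n1, positives):
    """Conservative 95% limits (percent) when W2 is unknown.

    n         : total sample size
    n1        : number of respondents
    positives : respondents giving a positive response
    """
    # Lower limit: all n - n1 nonrespondents counted as negative.
    low_pct = 100.0 * positives / n
    lower = low_pct - 2.0 * math.sqrt(low_pct * (100.0 - low_pct) / n)
    # Upper limit: all n - n1 nonrespondents counted as positive.
    high_pct = 100.0 * (positives + (n - n1)) / n
    upper = high_pct + 2.0 * math.sqrt(high_pct * (100.0 - high_pct) / n)
    return lower, upper

lower, upper = nonresponse_limits(1000, 800, 80)
print(round(lower, 1), round(upper, 1))
```

For the example with $n = 1000$, $n_1 = 800$ and 80 positive responses, the lower limit reproduces the 6.3% computed above; the upper limit starts from 28% positive (80 respondents plus 200 nonrespondents).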
TABLE 13.3
% Nonresponse,                d (%)
$100W_2$            20      15      10       5
    0               24      43      96     384
    2               27      50     122     653
    4               31      60     166    2000
    6               36      75     255
    8               43      99     521
   10               53     142
   15              112
Relative                  If Number of Adults in Household is
Frequency    Table        1     2     3     4     5     6
of Use       Number           Select Adult Numbered
1/6          A            1     1     1     1     1     1
1/12         B1           1     1     1     1     2     2
1/12         B2           1     1     1     2     2     2
1/6          C            1     1     2     2     3     3
1/6          D            1     2     2     3     4     4
1/12         E1           1     2     3     3     3     5
1/12         E2           1     2     3     4     5     5
1/6          F            1     2     3     4     5     6
13.4 CALL-BACKS
A standard technique is to specify the number of call-backs, or a minimum
number, that must be made on any unit before abandoning it as "unable to
contact." Stephan and McCarthy (1958) give data from a number of surveys on
the percentage of the total sample obtained at each call. Average results are
shown in Table 13.5.
TABLE 13.5
NUMBER OF CALLS REQUIRED FOR COMPLETED INTERVIEWS
                     % of Sample Contacted on
               First    Second    Third or Later    Per Cent
Respondent     Call     Call      Call              Nonresponse    Total
*Two surveys in which the respondent was a housewife and a farm operator, respectively,
have been included in the "any adult" group.
In surveys in which any adult in the house could answer the questions, the first
call obtained about 70% of the sample and the first two calls, 87%. The increased
cost of sampling when a randomly chosen adult is to be interviewed is evident, the
first call producing only 37% of the required interviews. The marked success of
the second call reflects the work of the interviewer in finding out in advance when
the desired respondent would be at home and available.
Little has been published on the relative costs of later calls to the first call. Later
calls would be expected to be more expensive per completed interview, since the
houses are more sparsely located in the area assigned to the interviewer and since
the occupants are presumably people who spend more than an average amount of
time away from home. From British experience, Durbin (1954) suggests that later
calls may be less expensive than would be anticipated. The following figures
(Table 13.6) show estimated relative costs per completed interview (i.e., money
spent on $i$th calls divided by number of new interviews obtained) for each call up
to the fifth in a special study reported by Durbin and Stuart (1954).
TABLE 13.6
RELATIVE COSTS PER NEW COMPLETED INTERVIEW AT THE ith CALL
Call              1      2      3      4      5
Relative cost    100    112    127    151    250
The estimation of these costs requires care. If the desired respondent is not at
home at the first call, the interviewer may spend time inquiring when this person
will be at home and making a tentative appointment. In the costing such time
should be assigned to the second call instead of to an unsuccessful first call.
A more useful measure is the average cost per completed interview over all
interviews obtained up to the $i$th call. These figures give the relative costs of
obtaining $n$ completed interviews when we insist on $i$ calls before the interviewer
gives up. In order to compute these figures, we must know how many interviews
are obtained at each call. In Table 13.7 these calculations are made under two sets
of assumptions. The first simulates surveys in which any adult can answer the
questions, the second those demanding a random adult. The data on numbers of
interviews obtained were taken from Table 13.5.
The details of the calculation are shown only for the first assumption, the
method being exactly the same for the second. The symbol $n_0$ denotes the original
sample size.
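The running average described here is a cost-weighted mean over calls. The sketch below uses only figures stated in the text: the "any adult" response of about 70% at the first call and 87% after two calls (so 17% at the second call), together with the relative costs 100 and 112 for calls 1 and 2 from Table 13.6. Later calls are omitted because their response percentages are not given in this excerpt; the function name is illustrative.

```python
def average_cost_up_to_call(interviews_per_call, cost_per_call):
    """Cumulative average relative cost per completed interview after each call.

    interviews_per_call[i] : new interviews obtained at call i + 1
                             (any consistent unit, e.g. percent of n0)
    cost_per_call[i]       : relative cost per new interview at call i + 1
    """
    averages = []
    total_cost = 0.0
    total_interviews = 0.0
    for obtained, cost in zip(interviews_per_call, cost_per_call):
        total_cost += obtained * cost
        total_interviews += obtained
        averages.append(total_cost / total_interviews)
    return averages

# "Any adult" surveys: 70% at the first call, a further 17% at the second.
print([round(c, 1) for c in average_cost_up_to_call([70, 17], [100, 112])])
```

With these inputs, insisting on a second call raises the average cost per completed interview by a little over 2%.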
Insistence on up to three calls costs only 4% more per completed interview than
single calls if any adult is a satisfactory respondent, and only 10% more if a
random adult must be interviewed. How typical these results are is not known, but
the method provides realistic estimates of the cost of insisting on call-backs if the
necessary cost and sample size data have been collected. There is also the time
factor: call-backs delay the final results.
TABLE 13.7
RELATIVE COSTS PER COMPLETED INTERVIEW UP TO THE ith CALL
(13.7)
Consider the composition of the sample after $i$ calls. The persons in the sample can
be classified into $(r + 1)$ classes as follows: in the first class and interviewed; in the
second class and interviewed; and so on. The $(r + 1)$th class consists of all those not
yet interviewed after $i$ calls. If the fpc is ignored, the numbers falling in these
Example. A population with three classes is shown in Table 13.8. The $p_j$ and $w_{1j}$ are
intended to represent surveys in which a random adult is interviewed. At the first call the
probabilities $w_{1j}$ of obtaining an interview are taken as 0.6, 0.3, and 0.1 in the three classes.
At the second and subsequent calls, the conditional probabilities of interviewing a person
missed previously are 0.9, 0.5, and 0.2. These figures were made higher than the
corresponding probabilities at the first call in order to represent the effect of intelligent
inquiry by the interviewer.
TABLE 13.8
CHARACTERISTICS OF THE THREE CLASSES
                Class
          1       2       3
TABLE 13.9
NUMBER OF INTERVIEWS, COSTS PER INTERVIEW AND BIASES
                    Number of      Average
Number of Calls     Interviews     Cost per         I        II
Required            Obtained       Interview        Bias     Bias
The item being estimated is a binomial percentage close to 50%. Two sets of $\mu_j$ are
considered (I, II). For simplicity, the within-class variances $\sigma_j^2 = \mu_j(100 - \mu_j)$ were all taken
as 2500. The relative costs per completed interview at successive calls were those given in
Table 13.6.
Table 13.9 shows (a) expected total number of interviews obtained for a total of $i$ calls,
(b) the average cost of these calls per interview, and (c) the bias $(\bar{\mu}_i - \bar{\mu})$ in the estimate $\bar{y}$
under assumptions I and II about the $\mu_j$.
In II, for example, the true population mean $\bar{\mu}$ is 54%. The mean $\bar{\mu}_1$ obtained from first
calls is 56.235%, giving the bias of +2.235% shown in the table. A policy that requires
three calls reduces this bias to +0.842%.
The values of MSE($\bar{y}$) obtained from a given expenditure of money were compared for
the different call-back policies. In the first comparisons the amount of money is sufficient to
take $n_0 = 500$ if only one call is made. From Table 13.9 the expected number of interviews
obtained in the first call is $E(n_1) = (500)(0.425) = 212.5$. If two calls are made, this expected
number must be reduced to $E(n_2) = 212.5/1.05 = 202.4$, to maintain the same cost, and
similarly for 3, 4, and 5 call-backs. These values of $E(n_i)$ were substituted in equation
(13.10) to give $V(\bar{y})$ and hence MSE($\bar{y}$).
TABLE 13.10
VALUES OF MSE($\bar{y}$) FOR DIFFERENT CALL-BACK POLICIES
COSTING THE SAME AMOUNT
Number of       $n_0 = 500$
Calls           (for first calls only)       $n_0 = 1000$    $n_0 = 2000$
Required        No Bias     I*      II*      II              II
*These represent populations with smaller (I) and greater (II) amounts of
bias, as defined in Table 13.8.
Table 13.10 presents the resulting MSE's for three amounts of expenditure,
corresponding to $n_0 = 500$, 1000, 2000 for a single call. When $n_0 = 500$, the values of
MSE($\bar{y}$) are also given for the "no bias" situation in which every $\mu_j = 50$. This column
shows the effect of call-backs when they are unnecessary, since no bias results from
confining the survey to a single call.
The policies giving the lowest MSE's are shown in boldface type. Consider first the
smallest sample size, $n_0 = 500$. If call-backs are unnecessary, a policy demanding as many
as four call-backs results in only a modest increase in the MSE. In I, involving the smaller
amount of bias, the different policies produce about the same accuracy, although three is
the optimum. In II, three to five call-backs are satisfactory, a single call giving a MSE about
25% above the minimum.
For the larger sample sizes the optimum number of call-backs increases to four or five,
and the use of a single call results in more substantial losses of accuracy.
This is, of course, only an illustration. The importance of the method is that as
information accumulates about costs and relative biases an economical policy can
be worked out for any specific type of survey.
where the $c$'s are the costs per unit: $c_0$ is the cost of making the first attempt, $c_1$ is
the cost of processing the results from the first attempt, and $c_2$ is the cost of getting
and processing the data in the second stratum. If $W_1$, $W_2$ are the population
proportions in the two strata, the expected cost is
As an estimate of $\bar{Y}$, we take
(13.14)
where $\bar{y}_1$, $\bar{y}_2$ are the means of the samples of sizes $n_1 = n_1'$ and $n_2 = n_2'/k$. By
theorem 12.1, the estimate $\bar{y}'$ will be unbiased if responses are obtained from all
the selected random subsample of size $n_2 = n_2'/k$.
By formula (12.3), the variance of $\bar{y}'$ is
$$V(\bar{y}') = \left(\frac{1}{n'} - \frac{1}{N}\right)S^2 + \frac{W_2S_2^2}{n'}\left(\frac{n_2'}{n_2} - 1\right) = \left(\frac{1}{n'} - \frac{1}{N}\right)S^2 + \frac{(k-1)}{n'}W_2S_2^2 \qquad (13.15)$$
The quantities $n'$ and $k$ are then chosen to minimize the product $C(V + S^2/N)$.
From (13.15) and (13.13) we have
(13.17)
(13.18)
The initial sample size $n'$ may be chosen either to minimize $C$ for specified $V$ or
$V$ for specified $C$ by solving for $n'$ from (13.16) or (13.17). If $V$ is specified,
$$n_{opt}' = \frac{N[S^2 + (k-1)W_2S_2^2]}{NV + S^2} \qquad (13.19)$$
where $V$ is the value specified for the variance of the estimated population mean.
The solutions require a knowledge of $W_2$: this can often be estimated from
previous experience. In addition to $S^2$, whose value must be estimated in advance
in any "sample size" problem, the solutions also involve $S_2^2$, the variance in the
nonresponse stratum. The value of $S_2^2$ may be harder to predict; it will probably
not be the same as $S^2$. For instance, in surveys made by mail of most kinds of
economic enterprise, the respondents tend to be larger operators, with larger
between-unit variances than the nonrespondents.
If $W_2$ is not well known, a satisfactory approximation is to work out the value of
$n_{opt}'$ from (13.18) and (13.19) for a range of assumed values of $W_2$
between 0 and a safe upper limit. The maximum $n_{opt}'$ in this series is adopted as the
initial sample size $n'$. When the replies to the mail survey have been received, the
value of $n_2'$ is known. In seeking the value of $k$ to be used with this method, we use
the variance $V_c(\bar{y}')$ conditional on the known values of $n_2'$ and $n'$. This can be
Equation (13.20) is solved to find the $k$ that gives the desired conditional
variance. The cost for this method is usually only slightly higher than the optimum
cost for known $W_2$.
Example. This example is condensed from the paper by Hansen and Hurwitz (1946).
The first sample is taken by mail and the response rate $W_1$ is expected to be 50%. The
precision desired is that which would be given by a simple random sample of size 1000 if
there were no nonresponse. The cost of mailing a questionnaire is 10 cents, and the cost of
processing the completed questionnaire is 40 cents. To carry out a personal interview costs
$4.10.
How many questionnaires should be sent out and what percentage of the nonrespondents
should be interviewed?
In terms of the cost function (13.12) the unit costs in dollars are as follows.
$c_0$ = cost of first attempt = 0.1
$c_1$ = cost of processing data for a respondent = 0.4
$c_2$ = cost of obtaining and processing data for a nonrespondent = 4.5
The optimum $n'$ and $k$ can be found from (13.19) and (13.18). If the variances $S^2$ and $S_2^2$
are assumed equal and N is assumed to be large, then $k_{opt} = 2.739$ and
$$n'_{opt} = \frac{S^2}{V}\,[1 + (k-1)W_2] = 1000\,[1 + (1.739)(0.5)] = 1870$$
Note that we have put $S^2/V = 1000$, or $V = S^2/1000$, since this is the variance that the
sample mean would have if a sample of 1000 were taken and complete response were
obtained.
Consequently, 1870 questionnaires should be mailed. Of the 935 that are not returned,
we interview a random subsample of 935/2.739, or 341. The cost is $2095.
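The arithmetic of this example can be retraced with a short sketch. It is only an illustration: the function name `hansen_hurwitz_plan` and its parameters are hypothetical, and the expression used for the optimum $k$ is an assumed form of (13.18), chosen because it reproduces the value 2.739 quoted above; the sample size follows (13.19).

```python
import math


def hansen_hurwitz_plan(c0, c1, c2, w2, s2, s2_nonresp, v, n_pop=None):
    """Return (k_opt, n_opt) for subsampling of nonrespondents.

    k_opt uses an ASSUMED form of (13.18); n_opt follows (13.19).
    n_pop=None means the population is treated as very large.
    """
    w1 = 1.0 - w2
    # Assumed (13.18): optimum subsampling ratio for the nonrespondents.
    k = math.sqrt(c2 * (s2 - w2 * s2_nonresp) / (s2_nonresp * (c0 + c1 * w1)))
    numerator = s2 + (k - 1.0) * w2 * s2_nonresp
    if n_pop is None:
        n_opt = numerator / v                      # large-N limit of (13.19)
    else:
        n_opt = n_pop * numerator / (n_pop * v + s2)   # (13.19)
    return k, n_opt


# Figures from the example: S^2 = S_2^2 (set to 1), V = S^2 / 1000, W_2 = 0.5.
k, n_opt = hansen_hurwitz_plan(0.1, 0.4, 4.5, 0.5, 1.0, 1.0, 1.0 / 1000)
n_mailed = math.ceil(n_opt)                 # questionnaires to mail
n_nonresp = round(n_mailed * 0.5)           # expected nonrespondents
n_interviewed = round(n_nonresp / k)        # subsample to interview
cost = 0.1 * n_mailed + 0.4 * (n_mailed - n_nonresp) + 4.5 * n_interviewed
print(round(k, 3), n_mailed, n_interviewed, round(cost, 2))
```

Rounding the sample size up and the subsample to the nearest unit is a choice made here for the sketch; the text quotes the resulting cost to the nearest dollar.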
For instance, the cost ratio is 1.029 for $W_1 = 0.5$ if r = 4, 1.146 if r =
10, and 1.228 if r = 16. If, however, $S^2$ is substantially greater than $S_2^2$, there is
more to be gained from subsampling.
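The quoted cost ratios can be checked numerically. The sketch below uses the ratio formula stated in exercise 13.5 and assumes that r denotes $c_2/(c_0 + c_1 W_1)$ and that $S^2 = S_2^2$ (F = 1); both are assumptions made for this illustration, adopted because they reproduce the three figures above. The function name `cost_ratio` is hypothetical.

```python
import math


def cost_ratio(f, w2, c0, c1, c2):
    """Expected cost with no subsampling divided by the minimum expected
    cost, for the same specified variance (formula of exercise 13.5).

    f is S^2 / S_2^2 and w2 is the nonresponse proportion W_2.
    """
    w1 = 1.0 - w2
    first_stage = c0 + c1 * w1
    no_subsampling = f * (first_stage + c2 * w2)
    optimum = (math.sqrt((f - w2) * first_stage) + w2 * math.sqrt(c2)) ** 2
    return no_subsampling / optimum


# Assumption: r = c2 / (c0 + c1 * W1); here c0 = 1 and c1 = 0 so that r = c2.
ratios = {r: cost_ratio(1.0, 0.5, 1.0, 0.0, float(r)) for r in (4, 10, 16)}
for r, value in ratios.items():
    print(r, round(value, 3))
```

With these assumptions the gain from subsampling is under 3% when r = 4 and rises to about 23% when r = 16.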
With stratified sampling, the optimum values of the $n_h'$ and the $k_h$ in the
individual strata are rather complex. A good approximation is to estimate first, by
the methods in sections 5.5 and 5.9, the sample sizes $n_{0h}$ that would be required in
the strata if there were no nonresponse. Now, from (13.19), if $W_2 = 0$, we have
$$n_0 = \frac{N S^2}{NV + S^2} \qquad (13.21)$$
(13.25)
(13.26)
Hence
(13.27)
using assumption (i). Furthermore, since $E(\bar{y}_{jt}) = \mu_j$ for any j and t, this gives the
result
$$E(\bar{y}_{PS}) \doteq \bar{\mu}_{PS} = \frac{\sum_j p_j [1 - (1 - \pi_j)^6]\, \mu_j}{\sum_j p_j [1 - (1 - \pi_j)^6]} \qquad (13.28)$$
Since the true mean $\bar{\mu} = \sum p_j \mu_j$, some bias remains in $\bar{y}_{PS}$. In a certain sense, this
estimate has the same bias as $\bar{y}_6$, the sample mean given by the call-back method
with a requirement that as many as six calls be made if necessary. In section 13.5,
equation (13.9), it was shown that the call-back method, with a total of i calls,
gives an unbiased estimate of $\bar{\mu}_i = \sum w_{ij} p_j \mu_j / \sum w_{ij} p_j$, where $w_{ij}$ is the probability
that a person in class j who falls in the sample will be interviewed. Now $w_{1j} = \pi_j$. If
at subsequent calls the probability of finding at home a person not previously
reached remains at $\pi_j$, then
$$w_{ij} = [1 - (1 - \pi_j)^i]$$
so that $\bar{\mu}_{PS} = \bar{\mu}_6$. However, with the call-back method the probability of an
interview at a later call may be greater than $\pi_j$ as a result of information obtained
by the interviewer at the first or earlier calls. In this event the call-back method has
less bias after six calls.
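The link between the Politz-Simmons weights and six call-backs rests on a binomial identity, which the following sketch checks numerically. It assumes the usual setup of the method: a person in class j is at home on any night with probability $\pi_j$, reports being at home on t of the five previous nights with t binomial(5, $\pi_j$), and is given weight 6/(t + 1). The function names are hypothetical.

```python
from math import comb


def expected_ps_weight(pi, nights=5):
    """Expected weight contributed per sampled person under the assumed setup.

    The person must be found at home on the call (probability pi) and then
    receives weight (nights + 1) / (t + 1), where t ~ Binomial(nights, pi).
    """
    total = 0.0
    for t in range(nights + 1):
        prob = comb(nights, t) * pi ** (t + 1) * (1 - pi) ** (nights - t)
        total += prob * (nights + 1) / (t + 1)
    return total


def callback_coverage(pi, calls):
    """Probability of an interview within `calls` calls, each succeeding
    independently with probability pi."""
    return 1 - (1 - pi) ** calls


for pi in (0.1, 0.4, 0.9):
    print(pi, expected_ps_weight(pi), callback_coverage(pi, 6))
```

The two columns agree for every $\pi$, which is why the weighted single-call estimate carries the same relative class weights $p_j[1 - (1 - \pi_j)^6]$ as the mean after six calls.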
The variance of $\bar{y}_{PS}$ is rather complicated. With the usual approximation for a
ratio estimate, it may be expressed, following Deming (1953), as
This expression tends to be 25 to 35% higher than the variance of the unweighted
mean of the first calls. Also, $V(\bar{y}_{PS})$ contains a term that does not decrease as $n_0$
increases and becomes important in very large samples.
To summarize, comparisons made on simulated populations by Deming (1953),
Durbin (1954), and myself suggest that this method shows to best advantage, in
relation to call-backs, when the biases from early calls are substantial and the
sample is large. The reductions in MSE for the same outlay are small, however,
unless call-backs cost substantially more than postulated here. The Politz-Simmons
technique has the advantage of saving time. Errors and incompleteness
in the values of t, not considered in the analysis, are a disadvantage. The method
may also be applied, as suggested by Simmons (1954), in conjunction with several
call-backs.
Several other methods for mitigating the "not-at-home" bias have been
proposed. Bartholomew's (1961) applies to a survey with two calls. He supposes
that, for those not at home on the first call, the interviewer, by careful inquiry, can
make the probability of finding them on the second call approximately equal. If
this is so, the $n_2$ persons interviewed at the second call are a random subsample of
the $(n_0 - n_1)$ persons missed at the first call. Hence $[n_1 \bar{y}_1 + (n_0 - n_1)\bar{y}_2]/n_0$ is an
unbiased estimate of the mean of the initial target sample. The method worked
well on some British surveys to which Bartholomew applied it. In repeated
surveys Kish and Hess (1959a) suggest that nonresponses from recent surveys
may serve as a replacement for nonresponses in a current survey. Wherever the
bias from early calls shows a systematic pattern, as in Table 13.1, Hendricks
(1949) has outlined extrapolation methods to estimate the average results that
would be given by nonrespondents.
correlated with the correct value $\mu_i$; for instance, the measuring device may
consistently underestimate high values of $\mu_i$ and overestimate low values.
There may be a correlation between the values of $e_{i\alpha}$ on different units in the
same sample. The simplest example is the "interviewer bias." Dramatic differences
are sometimes found in the mean values of $y_{i\alpha}$ obtained by different
interviewers who are sampling comparable parts of the same population (see
Lienau, 1941, Mahalanobis, 1946, and Barr, 1957).
A similar effect has appeared when samples of a growing crop are cut by
different teams and when chemical or biological analyses are done in different
laboratories. The human factor is not the only cause for correlations among units
that are measured at about the same time. Many measuring processes are affected
by the weather; some use raw materials whose quality varies from batch to batch.
In estimating the current sale price of homes built some years ago, Hansen,
Hurwitz, and Bershad (1961) point out that if some houses in the sample have
been sold recently their prices establish a level that guides the interviewer and the
householder in assigning values to houses that have not been sold for many years.
In fact, the average price recorded for the sample may depend on the order in
which the recently sold houses appear in the sample.
In order to handle these intrasample correlations in their most general terms, a
more complex model than that presented here is required. In particular, the
notation for $e_{i\alpha}$ and $\beta_i$ would have to indicate that their values may depend on the
other units present in the sample. However, the types of correlation that are
believed to be most common in practice can be represented by the present model
or by simple extensions of it.
The components of the error of measurement are summarized in Table 13.11.
We have noted further that values of $\beta_i$ and $d_{i\alpha}$ on different units in the same
sample may be correlated with one another, where $d_{i\alpha} = e_{i\alpha} - \beta_i$.
TABLE 13.11
COMPONENTS OF THE ERROR OF MEASUREMENT ON THE ith UNIT
subject to bias $\beta$. In the estimated error variance, which we attach to the sample
mean, the bias cancels out, since this estimate is derived from a sum of squares of
terms $(y_i - \bar{y})^2$. Consequently, the usual computation of confidence limits for $\bar{Y}$
from the sample data takes no account of the bias. The same results hold in
stratified random sampling.
The situation is essentially the same with regression and ratio estimates.
Consider the regression estimate
$$\bar{y}_{lr} = \bar{y} + b(\bar{X} - \bar{x})$$
where both the $y_i$ and the $x_i$ may be subject to constant biases $\beta_y$ and $\beta_x$,
respectively. Since the least squares estimate b remains unchanged and since the
bias $\beta_x$ cancels out of the term $(\bar{X} - \bar{x})$, it follows that $\bar{y}_{lr}$ is subject to a bias $\beta_y$. It is
easy to verify that the sample estimate of $V(\bar{y}_{lr})$ contains no contribution due to
the biases.
With the ratio estimate
$$\bar{y}_R = \frac{\bar{y}}{\bar{x}}\,\bar{X}$$
the bias is also $\beta_y$, to a first approximation, since in large samples $E(\bar{X}/\bar{x})$ is
approximately 1 even if the $x_i$ are subject to a constant bias. In large samples the
sample estimate of variance
$$v(\bar{y}_R) = \frac{(N - n)}{Nn}\,\frac{\sum (y_i - \hat{R} x_i)^2}{n - 1} \qquad (13.35)$$
serves as an estimate of $E[\bar{y}_R - (\bar{Y} + \beta_y)]^2$,
that is, as an estimate of the variance about the biased mean $\bar{Y} + \beta_y$.
To summarize, a constant bias passes undetected by the sample data. As we
have seen (section 1.8), the 95% confidence probabilities are almost unaffected if
the ratio of $\beta_y$ to the standard error of the estimated mean is less than 0.1 but, as
the ratio increases beyond this value, the computation of confidence limits
becomes misleading. Estimates of change from one time period to another, or
from one stratum to another, remain unbiased, provided that the bias is constant
throughout.
$$E(\bar{d}_\alpha^2 \mid S) = \frac{1}{n^2} \sum_i^n \sigma_i^2 \qquad (13.37)$$
(13.39)
where $\sigma_d^2$ denotes the population average of the variances of the errors of
measurement. In the Hansen et al. terminology, $\sigma_d^2/n$ is the response variance of
the sample estimate $\bar{y}_\alpha$.
With uncorrelated errors the same model can be applied in estimating a
population proportion P (Hansen, Hurwitz, and Bershad, 1961). For any unit, let
the correct value $\mu_i$ be 1 if the unit is in class C and zero otherwise. If errors of
measurement occur, this implies that units are sometimes incorrectly classified.
For the ith unit, the recorded value $y_{i\alpha}$ is sometimes 1, sometimes 0. Let $P_i$ denote
the proportion of measurements on the ith unit for which $y_{i\alpha} = 1$. Then, for given i,
$y_{i\alpha}$ is a binomial variate in repeated measurement, with mean $\mu_i' = P_i$, while the
variance of $d_{i\alpha}$ is $P_i Q_i$. Hence, if $p_\alpha = \bar{y}_\alpha$ is the sample estimate, (13.38) becomes
$$V(p_\alpha) = V(\bar{y}_\alpha) = \frac{1}{nN} \sum^N P_i Q_i + \frac{(1 - f)}{n}\,\frac{\sum^N (P_i - \bar{P})^2}{N - 1} \qquad (13.40)$$
$$< \frac{1}{n(N-1)} \left( \sum^N P_i - \sum^N P_i^2 + \sum^N P_i^2 - N\bar{P}^2 \right) = \frac{N \bar{P}\bar{Q}}{n(N-1)} \qquad (13.41)$$
where
$$\bar{P} = \frac{\sum^N P_i}{N}$$
As (13.41) shows, the sum of the response variance and the sampling variance
has an upper limit $N\bar{P}\bar{Q}/n(N - 1)$. This upper limit is also the variance of the
sample mean of the correct measurements on the units if the fpc is ignored and
there is no overall bias. For the correct measurement $\mu_i$ on any unit is then a
binomial variate with mean $\bar{P}$, so that
$$\frac{S_\mu^2}{n} = \frac{N\bar{P}\bar{Q}}{n(N-1)} \ge V(p_\alpha) \qquad (13.42)$$
This rather puzzling result holds because (i) the sampling variance entering into
(13.39) and (13.40) is that of $\mu_i' = (\mu_i + \beta_i)$ and (ii) in the estimation of a
proportion, $\mu_i$ and $\beta_i$ are always negatively correlated. When $\mu_i = 1$, $\beta_i \le 0$, since
$P_i = (\mu_i + \beta_i) \le 1$, and similarly when $\mu_i = 0$, $\beta_i \ge 0$. Thus the term
$S_{\mu'}^2/n$ in (13.39) is always less than $S_\mu^2/n$
by just about the response variance of $\bar{y}_\alpha$.
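A small enumeration makes the result concrete. The sketch below takes the population of exercise 13.8 (N = 6, three "yes" units and three "no" units, each recorded correctly with probability 0.9, samples of size 2) and compares the exact variance of the estimated proportion with formula (13.40) and the upper limit in (13.41). The function names are hypothetical; independent response errors and simple random sampling without replacement are assumed.

```python
from itertools import combinations, product


def exact_distribution(p_yes, n):
    """Distribution of the number of 'yes' responses over all samples of
    size n (without replacement) and all response outcomes."""
    samples = list(combinations(range(len(p_yes)), n))
    dist = {}
    for sample in samples:
        for answers in product((0, 1), repeat=n):
            prob = 1.0 / len(samples)
            for unit, answer in zip(sample, answers):
                prob *= p_yes[unit] if answer else 1.0 - p_yes[unit]
            dist[sum(answers)] = dist.get(sum(answers), 0.0) + prob
    return dist


def exact_variance(dist, n):
    """Variance of the estimated proportion implied by the distribution."""
    mean = sum(count / n * prob for count, prob in dist.items())
    return sum((count / n - mean) ** 2 * prob for count, prob in dist.items())


def formula_13_40(p_yes, n):
    """Response variance plus sampling variance, as in (13.40)."""
    big_n = len(p_yes)
    p_bar = sum(p_yes) / big_n
    response = sum(p * (1.0 - p) for p in p_yes) / (n * big_n)
    sampling = (1.0 - n / big_n) / n * sum((p - p_bar) ** 2 for p in p_yes) / (big_n - 1)
    return response + sampling


def upper_limit_13_41(p_yes, n):
    """Upper limit N * Pbar * Qbar / (n (N - 1)) from (13.41)."""
    big_n = len(p_yes)
    p_bar = sum(p_yes) / big_n
    return big_n * p_bar * (1.0 - p_bar) / (n * (big_n - 1))


p_yes = [0.9, 0.9, 0.9, 0.1, 0.1, 0.1]   # P_i for the six units
dist = exact_distribution(p_yes, 2)
var_exact = exact_variance(dist, 2)
print(dist, var_exact, formula_13_40(p_yes, 2), upper_limit_13_41(p_yes, 2))
```

The enumeration and (13.40) agree, and both fall below the limit, which here equals the variance the correct measurements would have with the fpc ignored.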
With uncorrelated errors, a useful result is that the usual formula for $v(\bar{y}_\alpha)$ in
simple random sampling remains unbiased if the fpc is negligible. From theorem
2.4 corollary, this formula, developed by assuming no errors of measurement, is
$$v(\bar{y}_\alpha) = \frac{(1 - f)}{n}\,\frac{\sum^n (y_{i\alpha} - \bar{y}_\alpha)^2}{n - 1} \qquad (13.43)$$
Its expected value is
$$E\,v(\bar{y}_\alpha) = \frac{(1 - f)}{n}\,\sigma_d^2 + \frac{(1 - f)}{n}\,S_{\mu'}^2 \qquad (13.45)$$
while, from (13.39)
$$V(\bar{y}_\alpha) = \frac{1}{n}\,\sigma_d^2 + \frac{(1 - f)}{n}\,S_{\mu'}^2 \qquad (13.46)$$
Thus $E\,v(\bar{y}_\alpha) = V(\bar{y}_\alpha)$ if f is negligible.
In the same way the formulas in preceding chapters for the sample estimates of
sampling error variances can be shown to remain valid in stratified and multistage
sampling, as do the large-sample formulas applicable to ratio and regression
estimates, provided that errors of measurement in $y_{i\alpha}$ and $x_{i\alpha}$ are uncorrelated
within the sample and that the fpc's are negligible.
(13.47)
(13.48)
where the products are averaged over all pairs of units in the same sample. By
analogy with cluster sampling, the average intrasample correlation coefficient $\rho_w$
may be defined by the equation
$$E(d_{i\alpha} d_{j\alpha}) = \rho_w \sigma_d^2 \qquad (13.49)$$
This gives, from (13.48),
$$V(\bar{d}_\alpha) = \frac{\sigma_d^2}{n}\,[1 + (n - 1)\rho_w] \qquad (13.50)$$
Hansen, Hurwitz, and Bershad (1961) have called $V(\bar{d}_\alpha)$ the total response
variance as it affects the sample mean. Its component $\sigma_d^2/n$ is called the simple
response variance, while the term $(n - 1)\rho_w \sigma_d^2/n$ is the correlated component of
the total response variance.
From (13.34) we get, assuming cov $(\bar{d}_\alpha, \bar{\mu}') = 0$,
$$V(\bar{y}_\alpha) = \frac{1}{n}\,\sigma_d^2\,[1 + (n - 1)\rho_w] + \frac{(1 - f)}{n}\,S_{\mu'}^2 \qquad (13.51)$$
The average value of the usual $v(\bar{y}_\alpha)$ in (13.43) is found in the same way to be
(13.52)
Since $\rho_w$ is likely to be positive for many types of measurement error, the standard
formula for $v(\bar{y}_\alpha)$ is usually an underestimate in this case and makes the sample
estimate appear more precise than it is. This is true even when the fpc is negligible.
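The size of the understatement can be illustrated with a few lines of code. The true variance follows (13.51). The expected value of the usual estimate is taken here as $(1 - f)[\sigma_d^2(1 - \rho_w) + S_{\mu'}^2]/n$, which is an assumed form of (13.52), consistent with the expectation shown in (13.58); the numerical inputs are hypothetical.

```python
def true_variance(sigma_d2, s_mu2, rho_w, n, f=0.0):
    """V(ybar) with correlated response errors, as in (13.51)."""
    return sigma_d2 * (1.0 + (n - 1) * rho_w) / n + (1.0 - f) * s_mu2 / n


def expected_usual_estimate(sigma_d2, s_mu2, rho_w, n, f=0.0):
    """Assumed form of (13.52): average value of the usual v(ybar)."""
    return (1.0 - f) / n * (sigma_d2 * (1.0 - rho_w) + s_mu2)


# Hypothetical values: a small intrasample correlation in a sample of 100.
sigma_d2, s_mu2, rho_w, n = 1.0, 4.0, 0.02, 100
v_true = true_variance(sigma_d2, s_mu2, rho_w, n)
v_usual = expected_usual_estimate(sigma_d2, s_mu2, rho_w, n)
print(v_true, v_usual, v_usual / v_true)
```

With a negligible fpc the shortfall equals $\rho_w \sigma_d^2$ whatever the sample size, so its relative importance grows as n increases and both variances shrink.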
(13.56)
This expression estimates $\sigma_{d1}^2$, the simple response variance for agent 1 in the
survey, if two conditions (A) hold: no correlation between response errors $d_{i1}$, $d_{i2}$
on the same unit; and $\sigma_{d1}^2 = \sigma_{d2}^2$. Equation (13.56) applies to a single pair of
agents and is averaged over the subsample of k pairs of agents.
Conditions A may hold when the measurement is coding, the agents being
coders of similar skill trained by different supervisors, neither coder seeing the
other's work. With interviewing, a positive cov $(d_{i1} d_{i2})$ is to be expected because
some respondents repeat a first incorrect response from memory. In this event
$\sum (y_{i1} - y_{i2})^2/2m$ underestimates $\sigma_{d1}^2$. It also underestimates $\sigma_{d1}^2$ if the second
agent is more skilled than the first, as shown for a (0, 1) measurement by Hansen,
Hurwitz, and Pritzker (1965). Moreover, $\sum (y_{i1} - y_{i2})^2/2m$ has been found to
decline if the second interviewer is given the responses obtained by the first
interviewer, even if told not to look at them until the repeat interview is completed
(Koons, 1973). These complexities illustrate why the realistic study of errors of
measurement is difficult.
For the total response variance the relevant estimate from (13.55) for a single
pair of agents is $(\bar{y}_{\cdot 1} - \bar{y}_{\cdot 2})^2/2$, where $\bar{y}_{\cdot 1}$, $\bar{y}_{\cdot 2}$ are the means of the m first and second
measurements. Under conditions (A),
$$E\,\frac{(\bar{y}_{\cdot 1} - \bar{y}_{\cdot 2})^2}{2} = \frac{\sigma_d^2}{m}\,[1 + (m - 1)\rho_w] \qquad (13.57)$$
where $\rho_w$ is the correlation between response errors on different units by the same
agent. Equation (13.57) provides only a single degree of freedom, but is averaged
over the k pairs of agents. Having an estimate of $\sigma_d^2$ from (13.56), we can estimate
the relative size of the simple response variance and the correlated component.
In interview surveys the correlated component is usually found much larger than
the simple response variance except for basic items such as age, sex, and marital
status (Fellegi, 1964). Partly for this reason the 1970 U.S. Census used
self-enumeration by mail extensively (Hansen and Waksberg, 1970).
If (13.55) holds, we can also study the ratio of the simple response variance to
the sampling variance that applies to the sample in an agent's assignment. For
under conditions A,
$$E \sum_{\alpha}^{2} \sum_{i}^{m} \frac{(y_{i\alpha} - \bar{y}_\alpha)^2}{2(m - 1)} = \sigma_d^2 (1 - \rho_w) + S_{\mu'}^2 \qquad (13.58)$$
" i
Nurnberofre pon e
Second agent
1 0 Total
First 1 a b a +b
agent 0 c d c+d
a +c b+d m
Thus, b is the number of units on which the first agent record 1, the second
388 SAMPUNG TEOINIQUES
TABLE 13.12
EXPECTATIONS OF THE MEAN SQUARES (ON A SINGLE-UNIT BASIS)

                                     df          ms                                                               E(ms)
Between interviewers (subsamples)    k - 1       $s_b^2$                                                          $S_{\mu'}^2 + \sigma_d^2[1 + (m - 1)\rho_w]$
Within interviewers                  k(m - 1)    $s_w^2 = \sum\sum (y_{i\alpha} - \bar{y}_\alpha)^2 / k(m - 1)$   $S_{\mu'}^2 + \sigma_d^2(1 - \rho_w)$

(13.64)
Consequently, comparison of $(m - 1)(s_b^2 - s_w^2)/m$ with $s_b^2$ estimates the relative
amount which the correlated component of the response variance contributes
to the total variance of $\bar{y}_\alpha$. With measurements in which the correlated component
is much larger than the simple response variance, the ratio $(m - 1)(s_b^2 - s_w^2)/m s_b^2$
has been used alternatively as a measure of the relative contribution of the total
response variance to the total variance of $\bar{y}_\alpha$. Tepping and Boland (1972) present
estimates of this ratio for items in the Current Population Survey.
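The mean squares of Table 13.12 and the ratio just described are simple to compute from data arranged by interviewer. The sketch below assumes k interviewers with m units each and defines the between-interviewer mean square on a single-unit basis as $m \sum (\bar{y}_\alpha - \bar{y})^2/(k - 1)$; that definition, the function name `interpenetration_anova`, and the toy data are assumptions for illustration.

```python
def interpenetration_anova(y):
    """Analysis of variance for interpenetrating subsamples.

    y[a][i] is the recorded value for unit i in interviewer a's assignment.
    Returns (s_b2, s_w2, correlated, share), where `correlated` estimates
    rho_w * sigma_d^2 and `share` is (m - 1)(s_b2 - s_w2) / (m * s_b2).
    """
    k = len(y)
    m = len(y[0])
    if k < 2 or m < 2 or any(len(row) != m for row in y):
        raise ValueError("need k >= 2 assignments of equal size m >= 2")
    means = [sum(row) / m for row in y]
    grand = sum(means) / k
    # Between-interviewer mean square, single-unit basis, k - 1 df.
    s_b2 = m * sum((mean - grand) ** 2 for mean in means) / (k - 1)
    # Within-interviewer mean square, k(m - 1) df.
    s_w2 = sum(
        (value - mean) ** 2 for row, mean in zip(y, means) for value in row
    ) / (k * (m - 1))
    correlated = (s_b2 - s_w2) / m
    share = (m - 1) * (s_b2 - s_w2) / (m * s_b2)
    return s_b2, s_w2, correlated, share


# Toy data: two interviewers, two units each (purely illustrative).
s_b2, s_w2, correlated, share = interpenetration_anova([[1.0, 3.0], [5.0, 7.0]])
print(s_b2, s_w2, correlated, share)
```

Because the estimate is a difference of two mean squares, it can be negative in small studies; averaging over many pairs of assignments is what makes it usable.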
When the interpenetration method is applied in a multistage sample covering a
wide geographic area, the most common practice is to have pairs of interviewers
measure interpenetrating subsamples drawn from the smallest clusters among the
successive stages. In this way the number of ultimate units in an interviewer's
assignment is kept at its customary level in the survey, although the interviewer
has to travel over twice the usual area. For a single cluster, Table 13.12 provides
1 df between interviewers and 2(m - 1) df within interviewers; the corresponding
mean squares are averaged over the c clusters chosen for the study. The sampling
variance that is measured is, of course, only that within the last stage of clustering.
As a reminder of this fact, the sampling variance term in E(ms) is sometimes
written $S_{\mu'}^2(1 - \rho_\mu)$ instead of $S_{\mu'}^2$, where $\rho_\mu$ is the intracluster correlation among
the sampling errors for different subunits.
The interpenetration method was used in this form in Response Variance Study
I by the U.S. Census Bureau (1968), designed to estimate the correlated components
of the total response variances of items in the 1960 Census. The areas in this
study were compact clusters of households, the clusters being scattered all over
the U.S.A. In any sample cluster, two interpenetrating subsamples were formed,
each subsample being assigned to a different interviewer.
In half the clusters the two interviewers had different crew leaders. In this half it
was assumed, as seems reasonable, that the response errors of the two interviewers
were uncorrelated. Thus $(s_b^2 - s_w^2)/m$ estimates $\rho_w \sigma_d^2$, where $s_b^2$ is now the
average mean square between interviewers in the same cluster. In the other half of
the clusters the two interviewers had the same crew leader. The objective here
was to measure the extent to which "crew leader effect" induced a covariance
between $\bar{d}_{\cdot 1}$ and $\bar{d}_{\cdot 2}$ for the two interviewers in a cluster. If so, $s_b^2$ in Table 13.12
now estimates
$$S_{\mu'}^2 (1 - \rho_\mu) + \sigma_d^2 [1 + (m - 1)\rho_w] - m E \operatorname{cov}(\bar{d}_{\cdot 1} \bar{d}_{\cdot 2})$$
and $(s_b^2 - s_w^2)/m$ estimates
(13.65)
Comparison of the two sets of values of $(s_b^2 - s_w^2)/m$ reveals the presence of a
"crew leader effect." Since differences between estimates of variance like $s_b^2$ and
$s_w^2$ are unstable, large numbers of clusters in the two halves of this type of study
are necessary to measure "crew leader effect" with any precision.
The interpenetration technique extends to stratified and multistage sampling. If
the primary interest is in an unbiased estimate of $V(\bar{y})$ that takes proper account
of the effects of errors of measurement, all that is necessary is that the sample
consist of a number of subsamples of the same structure in which we are sure that
errors of measurement are independent in different subsamples. Strictly, this
requires that different interviewing teams, supervisors, and data processors be
used in different subsamples. If $\bar{y}_\alpha$ is the mean of the $\alpha$th subsample, the quantity
$\sum (\bar{y}_\alpha - \bar{y})^2/k(k - 1)$ is an unbiased estimate of $V(\bar{y})$, with (k - 1) df. This result
holds because the subsample can be regarded as a single complex sampling unit,
the sample being in effect a simple random sample of these complex units, with
uncorrelated errors of measurement between different complex units. Consequently,
the results in section 13.10 apply.
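As a minimal sketch of this estimate, the function below takes the k subsample means and returns the overall mean together with $\sum (\bar{y}_\alpha - \bar{y})^2/k(k - 1)$. The function name and the three illustrative means are hypothetical, and equal-sized independent subsamples of the same structure are assumed.

```python
def replicated_variance(subsample_means):
    """Overall mean and its variance estimate from k independent
    subsamples of the same structure (k - 1 degrees of freedom)."""
    k = len(subsample_means)
    if k < 2:
        raise ValueError("at least two subsamples are required")
    overall = sum(subsample_means) / k
    variance = sum((mean - overall) ** 2 for mean in subsample_means) / (k * (k - 1))
    return overall, variance


# Illustrative means from three interpenetrating subsamples.
overall, variance = replicated_variance([2.0, 4.0, 6.0])
print(overall, variance)
```

No model of the measurement errors is needed: whatever correlation exists within a subsample is absorbed into the variation between subsample means.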
Numerous applications of this method, sometimes called replicated sampling,
are described by Deming (1960), who has used the method extensively. For other
discussions of its advantages, see Jones (1955) and Koop (1960). Travel costs of
interviewers are increased by interpenetration, but this can be mitigated if the
sample is stratified into compact areas. For instance, each stratum might contain
two random samples, each assigned to a different interviewer. Each interviewer is
required to travel over the whole stratum instead of over only half the stratum.
Every stratum provides 1 df for the estimate of $V(\bar{y})$.
Bailar and Dalenius (1969). Hansen and Waksberg (1970) review the research
work of the U.S. Census Bureau on measurement errors as they affect the Census
and some of the most important sample surveys taken by the Census Bureau.
Resulting changes in the 1970 Census included more widespread use of
self-enumeration (omitting interviewer biases) in a Census by mail, further use of
sampling as distinct from a complete Census, and advance computer selection of
the sample to avoid some biases that had been detected in selection of the
final-stage sample by the interviewer. Disturbing errors found in data on occupation,
industry, and housing quality as well as recall problems in expenditure data
are under continuing study.
(13.68)
(13.70)
The first term in $V(\hat{\pi}_{AW})$ is the variance that $\hat{\pi}_A$ would have if all n
respondents answered truthfully a direct question about class A membership.
Except for $\pi_A$ near 1/2 and P > 0.85, the second term is greater than the first, often
much greater. The method is thus quite imprecise in general. This might be
expected, since the interviewer does not know whether a "yes" answer implies
membership in class A or the opposite. As Warner showed, however, his method
may give a smaller MSE than a direct sensitive question would, if the latter
produced numerous refusals or false answers.
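The comparison of the two terms can be explored numerically. The equations themselves are not reproduced above, so the sketch below relies on the standard form of Warner's result as an assumption: with probability P the respondent is handed the statement "I am in class A" and otherwise its opposite, the estimate is $[\hat{\phi} - (1 - P)]/(2P - 1)$, and its variance is $\pi_A(1 - \pi_A)/n + P(1 - P)/n(2P - 1)^2$. Function names are hypothetical.

```python
def warner_estimate(yes_fraction, p):
    """Estimate of pi_A from the observed fraction of 'yes' answers
    (assumed Warner form; requires p != 0.5)."""
    return (yes_fraction - (1.0 - p)) / (2.0 * p - 1.0)


def warner_variance_terms(pi_a, p, n):
    """Return (direct-question term, extra term due to randomization)."""
    direct = pi_a * (1.0 - pi_a) / n
    extra = p * (1.0 - p) / (n * (2.0 * p - 1.0) ** 2)
    return direct, extra


# With pi_A = 0.3 and P = 0.7, the expected 'yes' fraction is
# 0.7 * 0.3 + 0.3 * 0.7 = 0.42, and the estimate recovers 0.3.
print(warner_estimate(0.42, 0.7))
for p in (0.7, 0.85, 0.9):
    print(p, warner_variance_terms(0.5, p, 100))
```

Under this form the extra term falls below the direct term only when P is large and $\pi_A$ is near 1/2, in line with the statement in the text.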
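The first statement remains unchanged. If all respond truthfully, the population
proportion of "yes" answers is now
$$\phi = P \pi_A + (1 - P)\pi_U \qquad (13.71)$$
where $\pi_U$ is the proportion in the sampled population who were born in May. If
$\pi_U$ is known, the obvious (and maximum likelihood) estimate of $\pi_A$ is
$$\hat{\pi}_{AU} = \frac{\hat{\phi} - (1 - P)\pi_U}{P} \qquad (13.72)$$
with variance
$$V(\hat{\pi}_{AU}) = \frac{\phi (1 - \phi)}{n P^2} \qquad (13.73)$$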
Morton (Greenberg et al., 1969, p. 532) has suggested how the case, $\pi_U$ known,
can always be achieved. A box contains red, white, and blue balls in known
proportions $P_1$, $P_2$, $P_3$. Drawing a red ball produces the sensitive statement.
Drawing a white or a blue ball produces the statement: "The color of this ball is
white." Thus, $\pi_U = P_2/(P_2 + P_3)$.
Dowling and Shachtman (1975) have shown that $V(\hat{\pi}_{AU}) < V(\hat{\pi}_{AW})$ for all
$\pi_A$, $\pi_U$, provided that P exceeds about 1/3. (The variance of $\hat{\pi}_{AW}$ is symmetrical
about P = 1/2 but that of $\hat{\pi}_{AU}$ is not, a small P providing few responses on the
sensitive question with this method.)
If it is necessary to estimate both $\pi_A$ and $\pi_U$ we can have two random samples
of sizes $n_1$, $n_2$, with different proportions $P_1$, $P_2$ for the sensitive question. With
$\phi_1$, $\phi_2$ denoting the proportions of "Yes" answers in the populations defined by
the choices $P_1$ and $P_2$,
$$\phi_1 = P_1 \pi_A + (1 - P_1)\pi_U \qquad (13.74)$$
$$\phi_2 = P_2 \pi_A + (1 - P_2)\pi_U \qquad (13.75)$$
These relations suggest the estimate
$$\hat{\pi}_{AU} = \frac{\hat{\phi}_1 (1 - P_2) - \hat{\phi}_2 (1 - P_1)}{P_1 - P_2} \qquad (13.76)$$
with
$$V(\hat{\pi}_{AU}) = \frac{1}{(P_1 - P_2)^2} \left[ \frac{\phi_1 (1 - \phi_1)(1 - P_2)^2}{n_1} + \frac{\phi_2 (1 - \phi_2)(1 - P_1)^2}{n_2} \right] \qquad (13.77)$$
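The two relations (13.74) and (13.75) are linear in $\pi_A$ and $\pi_U$, so the estimate (13.76) simply solves them for $\pi_A$. A minimal sketch follows; the function names and the numerical values of $\pi_A$, $\pi_U$, $P_1$, $P_2$ are hypothetical and chosen only to show that the estimate recovers $\pi_A$ when the true $\phi$'s are supplied.

```python
def unrelated_question_estimate(phi1_hat, phi2_hat, p1, p2):
    """Two-sample unrelated-question estimate of pi_A, as in (13.76)."""
    return (phi1_hat * (1.0 - p2) - phi2_hat * (1.0 - p1)) / (p1 - p2)


def unrelated_question_variance(pi_a, pi_u, p1, p2, n1, n2):
    """Variance of the estimate for sample sizes n1, n2, as in (13.77)."""
    phi1 = p1 * pi_a + (1.0 - p1) * pi_u      # (13.74)
    phi2 = p2 * pi_a + (1.0 - p2) * pi_u      # (13.75)
    term1 = phi1 * (1.0 - phi1) * (1.0 - p2) ** 2 / n1
    term2 = phi2 * (1.0 - phi2) * (1.0 - p1) ** 2 / n2
    return (term1 + term2) / (p1 - p2) ** 2


# Hypothetical case: pi_A = 0.2, pi_U = 0.1, P1 = 0.8, P2 = 0.2.
phi1 = 0.8 * 0.2 + 0.2 * 0.1   # 0.18
phi2 = 0.2 * 0.2 + 0.8 * 0.1   # 0.12
estimate = unrelated_question_estimate(phi1, phi2, 0.8, 0.2)
variance = unrelated_question_variance(0.2, 0.1, 0.8, 0.2, 500, 500)
print(estimate, variance)
```

The divisor $(P_1 - P_2)^2$ shows why the two selection probabilities should be kept well apart.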
If $P_1 > 1/2$, Greenberg et al. (1969) showed that this variance is minimized when
$P_2 = 0$, that is, when all in the second ($n_2$) sample are asked the unrelated $\pi_U$
question. Moors (1971) has recommended this procedure, but Greenberg et al.
(1969) suggest $P_1 + P_2 = 1$ as a working rule, in case the choice $P_2 = 0$ might
weaken cooperation by respondents. When $P_1 = 0.8$, for example, 80% of sample
1 and 20% of sample 2 would be asked the sensitive question on the average.
With the optimum $n_1$, $n_2$ for given $n = n_1 + n_2$, the Cauchy-Schwarz inequality
shows that the resulting minimum variance of $\hat{\pi}_{AU}$ is
$$V_{min}(\hat{\pi}_{AU}) = \frac{1}{n (P_1 - P_2)^2} \left[ (1 - P_2)\sqrt{\phi_1 (1 - \phi_1)} + (1 - P_1)\sqrt{\phi_2 (1 - \phi_2)} \right]^2 \qquad (13.78)$$
(13.79)
This choice requires advance estimates of $\pi_A$ and $\pi_U$, but the optimum is fairly
flat. Greenberg et al. (1969) give recommendations about the choices of $P_1$, $P_2$, $n_1$,
and $n_2$. For preservation of confidentiality it helps to have $\pi_U$ approximately
equal to $\pi_A$.
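The minimum in (13.78) can be confirmed by a direct search over the split of n. In the sketch below the optimum allocation is taken as $n_1$ proportional to $(1 - P_2)\sqrt{\phi_1(1 - \phi_1)}$ and $n_2$ proportional to $(1 - P_1)\sqrt{\phi_2(1 - \phi_2)}$, which is what the Cauchy-Schwarz argument yields; this allocation rule, the function names, and the numerical inputs are assumptions made for illustration.

```python
import math


def split_variance(pi_a, pi_u, p1, p2, n1, n2):
    """Variance (13.77) of the two-sample estimate for a given split."""
    phi1 = p1 * pi_a + (1.0 - p1) * pi_u
    phi2 = p2 * pi_a + (1.0 - p2) * pi_u
    term1 = phi1 * (1.0 - phi1) * (1.0 - p2) ** 2 / n1
    term2 = phi2 * (1.0 - phi2) * (1.0 - p1) ** 2 / n2
    return (term1 + term2) / (p1 - p2) ** 2


def optimum_split(pi_a, pi_u, p1, p2, n):
    """Return (n1_opt, v_min): derived allocation and the value of (13.78)."""
    phi1 = p1 * pi_a + (1.0 - p1) * pi_u
    phi2 = p2 * pi_a + (1.0 - p2) * pi_u
    a = (1.0 - p2) * math.sqrt(phi1 * (1.0 - phi1))
    b = (1.0 - p1) * math.sqrt(phi2 * (1.0 - phi2))
    v_min = (a + b) ** 2 / (n * (p1 - p2) ** 2)
    return n * a / (a + b), v_min


# Hypothetical inputs: pi_A = 0.2, pi_U = 0.1, P1 = 0.8, P2 = 0.2, n = 1000.
n = 1000
n1_opt, v_min = optimum_split(0.2, 0.1, 0.8, 0.2, n)
grid_best = min(
    split_variance(0.2, 0.1, 0.8, 0.2, n1, n - n1) for n1 in range(1, n)
)
print(n1_opt, v_min, grid_best)
```

The best integer split found by the search is only marginally above the bound, and nearby splits change the variance very little.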
Numerous variants of these methods have been studied (e.g., use of two
unrelated questions), as well as the biases produced in $\hat{\pi}_A$ and $\hat{\pi}_{AU}$ if a fraction
of the respondents answer the question falsely. Greenberg et al. (1971) have
applied the two-sample technique to estimate the mean $\mu_A$ for a sensitive discrete
or continuous variable by methods analogous to those leading to equations
(13.74) and (13.75). The unrelated question estimates the mean $\mu_U$ for a
nonsensitive variable. Random subgroups of $n_i$ subjects (i = 1, 2) receive the
sensitive question with probability $P_i$, the nonsensitive with probability $(1 - P_i)$.
The recorded variable $z_i$ for a subject therefore follows a mixture of two
distributions in proportions $P_i$, $(1 - P_i)$, one distribution with mean $\mu_A$, variance
$\sigma_A^2$, the other with mean $\mu_U$, variance $\sigma_U^2$. Hence
$$E(z_i) = P_i \mu_A + (1 - P_i)\mu_U \qquad (13.80)$$
$$V(z_i) = P_i \sigma_A^2 + (1 - P_i)\sigma_U^2 + P_i (1 - P_i)(\mu_A - \mu_U)^2 \qquad (13.81)$$
Analogous to (13.76), the estimate of $\mu_A$ is
$$\hat{\mu}_{AU} = \frac{(1 - P_2)\bar{z}_1 - (1 - P_1)\bar{z}_2}{P_1 - P_2} \qquad (13.82)$$
For maximum efficiency the conditions $\mu_U = \mu_A$, $\sigma_U^2 = 0$ are required, while for
preservation of anonymity, $\mu_U = \mu_A$, $\sigma_U^2 = \sigma_A^2$ are best.
Warner (1971) has given a theoretical framework for a broad class of randomized
response models. As (13.74) and (13.75) suggest, the trick is to estimate
certain linear functions of the sensitive and unrelated $\pi$'s or $\mu$'s with as many
equations as there are parameters to be estimated.
The method has been applied to obtain estimates of the proportion of
illegitimate births, of induced abortions, of users of heroin, of persons having
contact with organized crime, and of mean income and number of abortions as
continuous-discrete applications. Since the method has attracted widespread
attention, further applications are likely to appear. An excellent review is given by
Horvitz, Greenberg, and Abernathy (1975).
There have recently been discussions of the degree of privacy that the respondents
have in different versions of randomized interviewing. In some versions the
interviewer may be able to guess the status of some respondents with regard to the
sensitive issue with a fairly high probability of being correct, an undesirable
feature for this method.
13.19 SUMMARY
In regard to their effects on the formulas given in preceding chapters, nonsampling
errors may be classified as follows.
EXERCISES
13.1 Suppose that, by field methods of different intensities, it is possible to make the
"response" stratum consist of 60, 80, 90, or 95% of the whole population. For a percentage
that is to be estimated, the true "response" stratum means are: 60% stratum, 40.7; 80%
stratum, 43.5; 90% stratum, 44.8; 95% stratum, 45.4; last 5%, 59.0. (a) For a method that
samples only the 60% stratum, show that the root mean square error of the estimated
percentage for the whole population is
$$\sqrt{(2414/n) + 28.94}$$
where n is the number of completed questionnaires obtained. (b) Show that a root mean
square error of 5% cannot be achieved by a method with 60% response but can be obtained
with slightly over 100 completed questionnaires for the methods that have a response of
80% or better. (c) If a root mean square error of 2% is prescribed, what methods can
achieve it and what sample sizes are needed?
13.2 In 13.1 (c) suppose that it costs $5 per completed questionnaire for the field
method that has a 90% response. To obtain a completed questionnaire from the next 5% of
the population costs $20. For a root mean square error of 2%, is it cheaper to use the
method with 90% response rate or that with 95% response rate?
13.3 A population consists of two strata of equal sizes. The probability of finding the
respondent at home and willing to be interviewed at any call is 0.9 for persons in stratum 1
and 0.4 for persons in stratum 2. (a) In the notation of section 13.5 show that
$$w_{i1} = 1 - (0.1)^i, \qquad w_{i2} = 1 - (0.6)^i$$
(b) If the original sample size is $n_0$, compute the total expected number of interviews
obtained for 1, 2, 3, 4, and 5 calls. (c) If the relative costs per completed interview at the ith
call are 100, 120, 150, 200, and 300 for i = 1, 2, 3, 4, 5, respectively, compute the average
cost per interview for all interviews obtained up to the ith call. (d) The money available for
the survey is enough to pay for 300 completed first calls. If the policy is to insist on i calls,
what are the expected total numbers of completed interviews that can be obtained for the
same amount of money when i = 1, 2, 3, 4, 5?
13.4 In exercise 13.3 persons in stratum 1 have a mean of 40% for some binomial
percentage that is being estimated and persons in stratum 2 have a mean of 60%. (a)
Compute the bias in the sample mean for i = 1, 2, 3, 4, 5 calls. (b) Compute the variances of
the sample means for the cost situation in part (d) of exercise 13.3. (To save computing, the
variance may be taken as $2600/n_i$, where $n_i$ is the expected total number of interviews
obtained.) (c) Which policy gives the lowest MSE?
13.5 In section 13.6 (subsampling of the nonrespondents) verify the formula (p. 373)
for the ratio of the expected cost of obtaining a specified V with no subsampling to the
minimum expected cost.
$$\text{Ratio} = \frac{F(c_0 + c_1 W_1 + c_2 W_2)}{\left[\sqrt{(F - W_2)(c_0 + c_1 W_1)} + W_2 \sqrt{c_2}\,\right]^2}$$
where $F = S^2/S_2^2$. Let $c_0 = c_1 = 1$, $c_2 = 16$. (a) If F = 1, show that the cost ratio has a
maximum 1.25 when $W_2$ = 0.2 or 0.25. (b) If F = 1.5, show that the maximum is 1.41 for
$W_2$ = 0.3 or 0.35.
13.6 In a survey on poultry and pigs kept in gardens and certain small holdings (Gray,
1957) a postal inquiry with several reminders was followed by interviews of a subsample of
nonrespondents. By advance judgment, k = 2 was chosen (i.e., a 50% subsample). The
following data were available after the survey for one important item, in the notation of
exercise 13.5.
By finding VC for k = 2 and for the optimum k, determine whether k = 2 was a good
choice.
13.7 In a survey by the Politz-Simmons method 390 respondents in an initial sample of
660 were found at home on the first call. The number who stated that they were at home on
0, 1, ..., 5 of the five previous nights and the number answering yes to a question in the
survey were as follows.

                 0/5    1/5    2/5    3/5    4/5    5/5
Number            14     35     55     74     94    118
Yes answers        4     13     20     30     42    156

Compute the Politz-Simmons estimate of the proportion of "yes" answers in the
population and compare it with the simple binomial estimate.
13.8 A population with N = 6 contains three units for which the correct answer to a
question is yes and three for which it is no. Owing to errors of measurement, the probability
of obtaining a "yes" response on a yes unit is 0.9, and the probability of obtaining a "no"
response on a no unit is also 0.9. (a) By working out the distribution of all possible
responses for samples of size 2, show that the probabilities are 0.218, 0.564, and 0.218 that
the sample gives 0, 1, 2 "yes" responses. (b) Show that the variance of the estimated
proportion of "yes" responses is 0.1090. Verify results (13.40) and (13.41) in section
13.10. (c) What would be the variance of the estimated proportion of "yes" responses if
there were no errors of measurement?
13.9 In part of the 1942 Bengal Labour Enquiry (Mahalanobis, 1946) a random
sample of about 175 families was taken in each of three strata. The sample in each stratum
was divided into five random subsamples, each assigned to a different interviewer. The five
interviewers worked in all three strata. For expenditure on food, the relevant part of the
analysis of variance (on a single-family basis) is as follows.
df ms E(ms)
REFERENCES

Armitage, P. (1947). A comparison of stratified with unrestricted random sampling from a finite population. Biometrika, 34, 273-280.
Arvesen, J. N. (1969). Jackknifing U-statistics. Ann. Math. Stat., 40, 2076-2100.
Avadhani, M. S., and Sukhatme, B. V. (1973). Controlled sampling with equal probabilities and without replacement. Int. Stat. Rev., 41, 175-183.
Bailar, B. A. (1975). The effects of rotation group bias on estimates from panel surveys. Jour. Amer. Stat. Assoc., 70, 23-30.
Bailar, B. A., and Dalenius, T. (1969). Estimating response variance components of the U.S. Bureau of the Census Survey Model. Sankhya, B31, 341-360.
Barr, A. (1957). Differences between experienced interviewers. App. Stat., 6, 180-188.
Bartholomew, D. J. (1961). A method of allowing for "not-at-home" bias in sample surveys. App. Stat., 10, 52-59.
Bartlett, M. S. (1949). Fitting a straight line when both variables are subject to error. Biometrics, 5, 207-212.
Basu, D. (1958). On sampling with and without replacement. Sankhya, 20, 287-294.
Bayless, D. L. (1968). Variance estimation in sampling from finite populations. Ph.D. Thesis, Texas A & M University.
Bayless, D. L., and Rao, J. N. K. (1970). An empirical study of stabilities of estimators and variance estimators in unequal probability sampling (n = 3 or 4). Jour. Amer. Stat. Assoc., 65, 1645-1667.
Beale, E. M. L. (1962). Some uses of computers in operational research. Industrielle Organisation, 31, 51-2.
Bean, J. A. (1970). Estimation and sampling variance in the health interview survey. National Center for Health Statistics, Washington, D.C., Series 2, 38.
Bean, J. A. (1975). Distribution and properties of variance estimators for complex multistage probability samples. National Center for Health Statistics, Washington, D.C., Series 2, 65.
Beardwood, J., Halton, J. H., and Hammersley, J. M. (1959). The shortest path through many points. Proc. Cambridge Phil. Soc., 55, 299-327.
Bellhouse, D. R., and Rao, J. N. K. (1975). Systematic sampling in the presence of a trend. Biometrika, 62, 694-697.
Belloc, N. B. (1954). Validation of morbidity survey data by comparison with hospital records. Jour. Amer. Stat. Assoc., 49, 832-846.
Birnbaum, Z. W., and Sirken, M. G. (1950a). Bias due to nonavailability in sampling surveys. Jour. Amer. Stat. Assoc., 45, 98-111.
Birnbaum, Z. W., and Sirken, M. G. (1950b). On the total error due to noninterview and to random sampling. Int. Jour. Opinion and Attitude Res., 4, 179-191.
Blythe, R. H. (1945). The economics of sample size applied to the scaling of saw-logs. Biom. Bull., 1, 67-70.
Booth, G., and Sedransk, J. (1969). Planning some two-factor comparative surveys. Jour. Amer. Stat. Assoc., 64, 560-573.
Bose, Chameli (1943). Note on the sampling error in the method of double sampling. Sankhya, 6, 330.
Brewer, K. W. R. (1963a). A model of systematic sampling with unequal probabilities. Australian Jour. Stat., 5, 5-13.
Brewer, K. W. R. (1963b). Ratio estimation in finite populations: Some results deducible from the assumption of an underlying stochastic process. Australian Jour. Stat., 5, 93-105.
Brewer, K. W. R., and Hanif, M. (1969). Sampling without replacement and probability of inclusion proportional to size. I Methods using the Horvitz and Thompson estimator. II Methods using special estimators. Unpublished manuscript.
Brewer, K. W. R., and Hanif, M. (1970). Durbin's new multistage variance estimator. Jour. Roy. Stat. Soc., B32, 302-311.
Brooks, S. (1955). The estimation of an optimum subsampling number. Jour. Amer. Stat. Assoc., 50, 398-415.
Bryant, E. C., Hartley, H. O., and Jessen, R. J. (1960). Design and estimation in two-way stratification. Jour. Amer. Stat. Assoc., 55, 105-124.
Buckland, W. R. (1951). A review of the literature of systematic sampling. Jour. Roy. Stat. Soc., B13, 208-215.
Burstein, H. (1975). Finite population correction for binomial confidence limits. Jour. Amer. Stat. Assoc., 70, 67-69.
Cameron, J. M. (1951). Use of variance components in preparing schedules for the sampling of baled wool. Biometrics, 7, 83-96.
Chatterjee, S. (1966). A programming algorithm and its statistical applications. O.N.R. Tech. Rept. 1, Department of Statistics, Harvard University, Cambridge.
Chatterjee, S. (1967). A note on optimum stratification. Skand. Akt., 50, 40-44.
Chatterjee, S. (1968). Multivariate stratified surveys. Jour. Amer. Stat. Assoc., 63, 530-534.
Chatterjee, S. (1972). A study of optimum allocation in multivariate stratified surveys. Skand. Akt., 55, 73-80.
Chung, J. H., and DeLury, D. B. (1950). Confidence Limits for the Hypergeometric Distribution. University of Toronto Press, Toronto, Canada.
Cochran, W. G. (1942). Sampling theory when the sampling units are of unequal sizes. Jour. Amer. Stat. Assoc., 37, 199-212.
Cochran, W. G. (1946). Relative accuracy of systematic and stratified random samples for a certain class of populations. Ann. Math. Stat., 17, 164-177.
Cochran, W. G. (1961). Comparison of methods for determining stratum boundaries. Bull. Int. Stat. Inst., 38, 2, 345-358.
Cochran, W. G., Mosteller, F., and Tukey, J. W. (1954). Statistical Problems of the Kinsey Report. American Statistical Association, Washington, D.C., p. 280.
Coleman, J. S. (1966). Equality of Educational Opportunity. U.S. Government Printing Office, Washington, D.C.
Cornell, F. G. (1947). A stratified random sample of a small finite population. Jour. Amer. Stat. Assoc., 42, 523-532.
Cornfield, J. (1944). On samples from finite populations. Jour. Amer. Stat. Assoc., 39, 236-239.
Cornfield, J. (1951). The determination of sample size. Amer. Jour. Pub. Health, 41, 654-661.
Cox, D. R. (1952). Estimation by double sampling. Biometrika, 39, 217-227.
Dalenius, T. (1957). Sampling in Sweden. Contributions to the methods and theories of sample survey practice. Almqvist and Wicksell, Stockholm.
Dalenius, T., and Gurney, M. (1951). The problem of optimum stratification. II. Skand. Akt., 34, 133-148.
Dalenius, T., and Hodges, J. L., Jr. (1959). Minimum variance stratification. Jour. Amer. Stat. Assoc., 54, 88-101.
Das, A. C. (1950). Two-dimensional systematic sampling and the associated stratified and random sampling. Sankhya, 10, 95-108.
David, F. N., and Neyman, J. (1938). Extension of the Markoff theorem of least squares. Stat. Res. Mem., 2, 105.
David, I. P., and Sukhatme, B. V. (1974). On the bias and mean square error of the ratio estimator. Jour. Amer. Stat. Assoc., 69, 464-466.
Deming, W. E. (1953). On a probability mechanism to attain an economic balance between the resultant error of non-response and the bias of non-response. Jour. Amer. Stat. Assoc., 48, 743-772.
Deming, W. E. (1956). On simplifications of sampling design through replication with equal probabilities and without stages. Jour. Amer. Stat. Assoc., 51, 24-53.
Deming, W. E. (1960). Sample Design in Business Research. John Wiley and Sons, New York.
Deming, W. E., and Simmons, W. R. (1946). On the design of a sample for dealer inventories. Jour. Amer. Stat. Assoc., 41, 16-33.
Des Raj (1954). On sampling with probabilities proportional to size. Ganita, 5, 175-182.
Des Raj (1956a). Some estimators in sampling with varying probabilities without replacement. Jour. Amer. Stat. Assoc., 51, 269-284.
Des Raj (1956b). A note on the determination of optimum probabilities in sampling without replacement. Sankhya, 17, 197-200.
Des Raj (1958). On the relative accuracy of some sampling techniques. Jour. Amer. Stat. Assoc., 53, 98-101.
Des Raj (1964). The use of systematic sampling with probability proportional to size in a large-scale survey. Jour. Amer. Stat. Assoc., 59, 251-255.
Des Raj (1966). Some remarks on a simple procedure of sampling without replacement. Jour. Amer. Stat. Assoc., 61, 391-396.
Des Raj, and Khamis, S. H. (1958). Some remarks on sampling with replacement. Ann. Math. Stat., 29, 550-557.
Dowling, T. A., and Shachtman, R. H. (1975). On the relative efficiency of randomized response models. Jour. Amer. Stat. Assoc., 70, 84-87.
Durbin, J. (1953). Some results in sampling theory when the units are selected with unequal probabilities. Jour. Roy. Stat. Soc., B15, 262-269.
Durbin, J. (1954). Non-response and call-backs in surveys. Bull. Int. Stat. Inst., 34, 72-86.
Durbin, J. (1958). Sampling theory for estimates based on fewer individuals than the number selected. Bull. Int. Stat. Inst., 36, 3, 113-119.
Durbin, J. (1959). A note on the application of Quenouille's method of bias reduction to the estimation of ratios. Biometrika, 46, 477-480.
Durbin, J. (1967). Design of multi-stage surveys for the estimation of sampling errors. App. Stat., 16, 152-164.
Durbin, J., and Stuart, A. (1954). Callbacks and clustering in sample surveys: an experimental study. Jour. Roy. Stat. Soc., A117, 387-428.
Gallup, G. (1972). Opinion polling in a democracy. Statistics, A Guide to the Unknown, J. M. Tanur et al. (eds.), Holden-Day, Inc., San Francisco, 146-152.
Gautschi, W. (1957). Some remarks on systematic sampling. Ann. Math. Stat., 28, 385-394.
Godambe, V. P. (1955). A unified theory of sampling from finite populations. Jour. Roy. Stat. Soc., B17, 269-278.
Goodman, L. A., and Hartley, H. O. (1958). The precision of unbiased ratio-type estimators. Jour. Amer. Stat. Assoc., 53, 491-508.
Goodman, R., and Kish, L. (1950). Controlled selection - a technique in probability sampling. Jour. Amer. Stat. Assoc., 45, 350-372.
Graham, J. E. (1973). Composite estimation in two cycle rotation sampling designs. Comm. in Stat., 1, 419-431.
Gray, P. G. (1955). The memory factor in social surveys. Jour. Amer. Stat. Assoc., 50, 344-363.
Gray, P. G. (1957). A sample survey with both a postal and an interview stage. App. Stat., 6, 139-153.
Gray, P. G., and Corlett, T. (1950). Sampling for the social survey. Jour. Roy. Stat. Soc., A113, 150-206.
Greenberg, B. G., et al. (1969). The unrelated question randomized response model: Theoretical framework. Jour. Amer. Stat. Assoc., 64, 520-539.
Greenberg, B. G., et al. (1971). Application of the randomized response technique in obtaining quantitative data. Jour. Amer. Stat. Assoc., 66, 243-250.
Grundy, P. M., Healy, M. J. R., and Rees, D. H. (1954). Decision between two alternatives - how many experiments? Biometrics, 10, 317-323.
Grundy, P. M., Healy, M. J. R., and Rees, D. H. (1956). Economic choice of the amount of experimentation. Jour. Roy. Stat. Soc., B18, 32-55.
Hagood, M. J., and Bernert, E. H. (1945). Component indexes as a basis for stratification. Jour. Amer. Stat. Assoc., 40, 330-341.
Hajek, J. (1958). Some contributions to the theory of probability sampling. Bull. Int. Stat. Inst., 36, 3, 127-134.
Hajek, J. (1960). Limiting distributions in simple random sampling from a finite population. Pub. Math. Inst. Hungarian Acad. Sci., 5, 361-374.
Haldane, J. B. S. (1945). On a method of estimating frequencies. Biometrika, 33, 222-225.
Hansen, M. H., et al. (1951). Response errors in surveys. Jour. Amer. Stat. Assoc., 46, 147-190.
Hansen, M. H., and Hurwitz, W. N. (1942). Relative efficiencies of various sampling units in population inquiries. Jour. Amer. Stat. Assoc., 37, 89-94.
Hansen, M. H., and Hurwitz, W. N. (1943). On the theory of sampling from finite populations. Ann. Math. Stat., 14, 333-362.
Hansen, M. H., and Hurwitz, W. N. (1946). The problem of nonresponse in sample surveys. Jour. Amer. Stat. Assoc., 41, 517-529.
Hansen, M. H., and Hurwitz, W. N. (1949). On the determination of the optimum probabilities in sampling. Ann. Math. Stat., 20, 426-432.
Hansen, M. H., Hurwitz, W. N., and Bershad, M. (1961). Measurement errors in censuses and surveys. Bull. Int. Stat. Inst., 38, 2, 359-374.
Hansen, M. H., Hurwitz, W. N., and Gurney, M. (1946). Problems and methods of the sample survey of business. Jour. Amer. Stat. Assoc., 41, 173-189.
Hansen, M. H., Hurwitz, W. N., and Jabine, T. B. (1963). The use of imperfect lists for probability sampling at the U.S. Bureau of the Census. Bull. Int. Stat. Inst., 40, 1, 497-517.
Hansen, M. H., Hurwitz, W. N., and Madow, W. G. (1953). Sample Survey Methods and Theory. John Wiley and Sons, New York, Vols. I and II.
Hansen, M. H., Hurwitz, W. N., Nisselson, H., and Steinberg, J. (1955). The redesign of the census current population survey. Jour. Amer. Stat. Assoc., 50, 701-719.
Hansen, M. H., Hurwitz, W. N., and Pritzker, L. (1965). The estimation and interpretation of gross differences and the simple response variance. Contributions to Statistics presented to Professor P. C. Mahalanobis. Pergamon Press, Oxford, and Statistical Publishing Society, Calcutta, 111-136.
Hansen, M. H., and Waksberg, J. (1970). Research on non-sampling errors in censuses and surveys. Rev. Int. Stat. Inst., 38, 318-332.
Hanson, R. H., and Marks, E. S. (1958). Influence of the interviewer on the accuracy of survey results. Jour. Amer. Stat. Assoc., 53, 635-655.
Hartley, H. O. (1946). Discussion of paper by F. Yates. Jour. Roy. Stat. Soc., 109, 37.
Hartley, H. O. (1959). Analytic Studies of Survey Data. Istituto di Statistica, Rome, volume in honor of Corrado Gini.
Hartley, H. O. (1962). Multiple frame surveys. Proc. Soc. Stat. Sect. Amer. Stat. Assoc., 203-206.
Hartley, H. O. (1974). Multiple frame methodology and selected applications. Sankhya, C36, 99-118.
Hartley, H. O., and Hocking, R. (1963). Convex programming by tangential approximation. Management Science, 9, 600-612.
Hartley, H. O., and Rao, J. N. K. (1962). Sampling with unequal probabilities and without replacement. Ann. Math. Stat., 33, 350-374.
Hartley, H. O., and Rao, J. N. K. (1968). A new estimation theory for sample surveys. Biometrika, 55, 547-557.
Hartley, H. O., and Rao, J. N. K. (1969). A new estimation theory for sample surveys, II. In New Developments in Survey Sampling, N. L. Johnson and H. Smith (eds.), Wiley-Interscience, New York, 147-169.
Hartley, H. O., Rao, J. N. K., and Kiefer, G. (1969). Variance estimation with one unit per stratum. Jour. Amer. Stat. Assoc., 64, 841-851.
Hartley, H. O., and Ross, A. (1954). Unbiased ratio estimates. Nature, 174, 270-271.
Harvard Computation Laboratory (1955). Tables of the Cumulative Binomial Probability Distribution. Harvard University Press, Cambridge, Mass.
Haynes, J. D. (1948). An empirical investigation of sampling methods for an area. M.S. thesis, University of North Carolina.
Hendricks, W. A. (1944). The relative efficiencies of groups of farms as sampling units. Jour. Amer. Stat. Assoc., 39, 367-376.
Hendricks, W. A. (1949). Adjustment for bias by non-response in mailed surveys. Agr. Econ. Res., 1, 52-56.
Hendricks, W. A. (1956). The Mathematical Theory of Sampling. Scarecrow Press, New Brunswick, N.J.
Hess, I., Riedel, D. C., and Fitzpatrick, T. B. (1976). Probability Sampling of Hospitals and Patients. University of Michigan, Ann Arbor, Mich., second edition.
Hess, I., Sethi, V. K., and Balakrishnan, T. R. (1966). Stratification: A practical investigation. Jour. Amer. Stat. Assoc., 61, 74-90.
Hoeffding, W. (1948). A class of statistics with asymptotically normal distribution. Ann. Math. Stat., 19, 293-325.
Homeyer, P. G., and Black, C. A. (1946). Sampling replicated field experiments on oats for yield determinations. Proc. Soil Sci. Soc. America, 11, 341-344.
Horvitz, D. G. (1952). Sampling and field procedures of the Pittsburgh morbidity survey. Pub. Health Rep., 67, 1003-1012.
Horvitz, D. G., Greenberg, B. G., and Abernathy, J. R. (1975). Recent developments in randomized response designs. A Survey of Statistical Design and Linear Models, J. N. Srivastava (ed.), American Elsevier Publishing Co., New York, 271-285.
Horvitz, D. G., Shah, B. V., and Simmons, W. R. (1967). The unrelated randomized response model. Proc. Soc. Stat. Sect. Amer. Stat. Assoc., 65-72.
Horvitz, D. G., and Thompson, D. J. (1952). A generalization of sampling without replacement from a finite universe. Jour. Amer. Stat. Assoc., 47, 663-685.
Huddleston, H. F., Claypool, P. L., and Hocking, R. R. (1970). Optimum sample allocation to strata using convex programming. App. Stat., 19, 273-278.
Hutchinson, M. C. (1971). A Monte Carlo comparison of some ratio estimators. Biometrika, 58, 313-321.
Hyman, H. H. (1954). Interviewing in Social Research. University of Chicago Press, Chicago, Ill.
James, A. T., Wilkinson, G. N., and Venables, W. N. (1975). Interval estimates for a ratio of means. Sankhya (in press).
Jebe, E. H. (1952). Estimation for sub-sampling designs employing the county as a primary sampling unit. Jour. Amer. Stat. Assoc., 47, 49-70.
Jensen, A. (1926). Report on the representative method in statistics. Bull. Int. Stat. Inst., 22, 359-377.
Jessen, R. J. (1942). Statistical investigation of a sample survey for obtaining farm facts. Iowa Agr. Exp. Sta. Res. Bull., 304.
Jessen, R. J. (1955). Determining the fruit count on a tree by randomized branch sampling. Biometrics, 11, 99-109.
Jessen, R. J., et al. (1947). On a population sample for Greece. Jour. Amer. Stat. Assoc., 42, 357-384.
Jessen, R. J., and Houseman, E. E. (1944). Statistical investigations of farm sample surveys taken in Iowa, Florida and California. Iowa Agr. Exp. Sta. Res. Bull., 329.
Johnson, F. A. (1941). A statistical study of sampling methods for tree nursery inventories. M.S. thesis, Iowa State College.
Johnson, F. A. (1943). A statistical study of sampling methods for tree nursery inventories. Jour. Forestry, 41, 674-689.
Jones, H. W. (1955). Investigating the properties of a sample mean by employing random subsample means. Jour. Amer. Stat. Assoc., 51, 54-83.
Kempthorne, O. (1969). Some remarks on inference in finite sampling. New Developments in Survey Sampling, N. L. Johnson and H. Smith, Jr. (eds.), John Wiley & Sons, New York, 671-695.
Kendall, M. G., and Smith, B. B. (1938). Randomness and random sampling numbers. Jour. Roy. Stat. Soc., 101, 147-166.
Keyfitz, N. (1957). Estimates of sampling variance where two units are selected from each stratum. Jour. Amer. Stat. Assoc., 52, 503-510.
Khan, S., and Tripathi, T. P. (1967). The use of multivariate auxiliary information in double-sampling. J. Ind. Stat. Assoc., 5, 42-48.
King, A. J., and McCarty, D. E. (1941). Application of sampling to agricultural statistics with emphasis on stratified samples. Jour. Marketing, April, 462-474.
Kiser, C. V., and Whelpton, P. K. (1953). Resume of the Indianapolis study of social and psychological factors affecting fertility. Population Studies, 7, 95-110.
Kish, L. (1949). A procedure for objective respondent selection within the household. Jour. Amer. Stat. Assoc., 44, 380-387.
Kish, L. (1957). Confidence limits for clustered samples. Amer. Soc. Rev., 22, 154-165.
Kish, L. (1965). Survey Sampling. John Wiley & Sons, New York.
Kish, L., and Frankel, M. R. (1974). Inference from complex samples. Jour. Roy. Stat. Soc., B36, 1-37.
Kish, L., and Hess, I. (1958). On noncoverage of sample dwellings. Jour. Amer. Stat. Assoc., 53, 509-524.
Kish, L., and Hess, I. (1959a). A "replacement" procedure for reducing the bias of nonresponse. Amer. Statistician, 13, 4, 17-19.
Kish, L., and Hess, I. (1959b). On variances of ratios and their differences in multistage samples. Jour. Amer. Stat. Assoc., 54, 416-446.
Kish, L., and Lansing, J. B. (1954). Response errors in estimating the value of homes. Jour. Amer. Stat. Assoc., 49, 520-538.
Kish, L., Namboodiri, N. K., and Pillai, R. K. (1962). The ratio bias in surveys. Jour. Amer. Stat. Assoc., 57, 863-876.
Koch, G. G. (1973). An alternative approach to multivariate response error models for sample survey data with applications to estimators involving subclass means. Jour. Amer. Stat. Assoc., 68, 906-913.
Kokan, A. R. (1963). Optimum allocation in multivariate surveys. Jour. Roy. Stat. Soc., A126, 557-565.
Koons, D. A. (1973). Quality control and measurement of nonsampling error in the Health Interview Survey. Nat. Center for Health Stat., Series 2, 54.
Koop, J. C. (1960). On theoretical questions underlying the technique of replicated or interpenetrating samples. Proc. Soc. Stat. Sect. Amer. Stat. Assoc., 196-205.
Koop, J. C. (1968). An exercise in ratio estimation. Amer. Statistician, 22, 1, 29-30.
Kulldorff, G. (1963). Some problems of optimum allocation for sampling on two occasions. Rev. Int. Stat. Inst., 31, 24-57.
Lahiri, D. B. (1951). A method for sample selection providing unbiased ratio estimates. Bull. Int. Stat. Inst., 33, 2, 133-140.
Lieberman, G. J., and Owen, D. B. (1961). Tables of the Hypergeometric Probability Distribution. Stanford University Press, Stanford, Calif.
Lienau, C. C. (1941). Selection, training and performance of the National Health Survey field staff. Amer. Jour. Hygiene, 34, 110-132.
Lund, R. E. (1968). Estimators in multiple frame surveys. Proc. Soc. Stat. Sect. Amer. Stat. Assoc., 282-288.
McCarthy, P. J. (1966). Replication: An approach to the analysis of data from complex surveys. National Center for Health Statistics, Washington, D.C., Series 2, 14.
McCarthy, P. J. (1969). Pseudo-replication: Half-samples. Rev. Int. Stat. Inst., 37, 239-264.
McVay, F. E. (1947). Sampling methods applied to estimating numbers of commercial orchards in a commercial peach area. Jour. Amer. Stat. Assoc., 42, 533-540.
Madow, L. H. (1946). Systematic sampling and its relation to other sampling designs. Jour. Amer. Stat. Assoc., 41, 207-214.
Madow, L. H. (1950). On the use of the county as a primary sampling unit for state estimates. Jour. Amer. Stat. Assoc., 45, 30-47.
Madow, W. G. (1948). On the limiting distributions of estimates based on samples from finite universes. Ann. Math. Stat., 19, 535-545.
Madow, W. G. (1949). On the theory of systematic sampling, II. Ann. Math. Stat., 20, 333-354.
Madow, W. G. (1953). On the theory of systematic sampling, III. Ann. Math. Stat., 24, 101-106.
Madow, W. G., and Madow, L. H. (1944). On the theory of systematic sampling. Ann. Math. Stat., 15, 1-24.
Madow, W. G. (1965). On some aspects of response error measurement. Proc. Soc. Stat. Sect. Amer. Stat. Assoc., 182-192.
Mahalanobis, P. C. (1944). On large-scale sample surveys. Phil. Trans. Roy. Soc. London, B231, 329-451.
Mahalanobis, P. C. (1946). Recent experiments in statistical sampling in the Indian Statistical Institute. Jour. Roy. Stat. Soc., 109, 325-370.
Matern, B. (1947). Methods of estimating the accuracy of line and sample plot surveys. Medd. fr. Statens Skogsforsknings Institut., 36, 1-138.
Matern, B. (1960). Spatial variation. Medd. fr. Statens Skogsforsknings Institut., 49, 5, 1-144.
Mickey, M. R. (1959). Some finite population unbiased ratio and regression estimators. Jour. Amer. Stat. Assoc., 54, 594-612.
Midzuno, H. (1951). On the sampling system with probability proportionate to sum of sizes. Ann. Inst. Stat. Math., 2, 99-108.
Milne, A. (1959). The centric systematic area sample treated as a random sample. Biometrics, 15, 270-297.
Moors, J. J. A. (1971). Optimization of the unrelated question randomized response model. Jour. Amer. Stat. Assoc., 66, 627-629.
Murthy, M. N. (1957). Ordered and unordered estimators in sampling without replacement. Sankhya, 18, 379-390.
Murthy, M. N. (1967). Sampling Theory and Methods. Statistical Publishing Society, Calcutta, India.
Narain, R. D. (1951). On sampling without replacement with varying probabilities. Jour. Ind. Soc. Agric. Stat., 3, 169-174.
National Bureau of Standards (1950). Tables of the Binomial Probability Distribution. U.S. Government Printing Office, Washington, D.C.
Neter, J. (1972). How accountants save money by sampling. Statistics, A Guide to the Unknown, J. M. Tanur et al. (eds.), Holden-Day, Inc., San Francisco, 203-211.
Neyman, J. (1934). On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. Jour. Roy. Stat. Soc., 97, 558-606.
Neyman, J. (1938). Contribution to the theory of sampling human populations. Jour. Amer. Stat. Assoc., 33, 101-116.
Nordbotten, S. (1956). Allocation in stratified sampling by means of linear programming. Skand. Akt. Tidskr., 39, 1-6.
Nordin, J. A. (1944). Determining sample size. Jour. Amer. Stat. Assoc., 39, 497-506.
Olkin, I. (1958). Multivariate ratio estimation for finite populations. Biometrika, 45, 154-165.
Osborne, J. G. (1942). Sampling errors of systematic and random surveys of cover-type areas. Jour. Amer. Stat. Assoc., 37, 256-264.
Patterson, H. D. (1950). Sampling on successive occasions with partial replacement of units. Jour. Roy. Stat. Soc., B12, 241-255.
Patterson, H. D. (1954). The errors of lattice sampling. Jour. Roy. Stat. Soc., B16, 140-149.
Paulson, E. (1942). A note on the estimation of some mean values for a bivariate distribution. Ann. Math. Stat., 13, 440-444.
Payne, S. L. (1951). The Art of Asking Questions. Princeton University Press, Princeton, N.J.
Plackett, R. L., and Burman, J. P. (1946). The design of optimum multifactorial experiments. Biometrika, 33, 305-325.
Platek, R., and Singh, M. P. (1972). Some aspects of redesign of the Canadian Labor Force Survey. Proc. Soc. Stat. Sect. Amer. Stat. Assoc., 397-402.
Politz, A. N., and Simmons, W. R. (1949, 1950). An attempt to get the "not at homes" into the sample without callbacks. Jour. Amer. Stat. Assoc., 44, 9-31, and 45, 136-137.
Pritzker, L., and Hanson, R. (1962). Measurement errors in the 1960 Census of Population. Proc. Soc. Stat. Sect. Amer. Stat. Assoc., 80-89.
Quenouille, M. H. (1949). Problems in plane sampling. Ann. Math. Stat., 20, 355-375.
Quenouille, M. H. (1956). Notes on bias in estimation. Biometrika, 43, 353-360.
Raiffa, H., and Schlaifer, R. (1961). Applied Statistical Decision Theory. Harvard Business School, Cambridge, Mass.
Rand Corporation (1955). A Million Random Digits. Free Press, Glencoe, Ill.
Rao, C. R. (1971). Some aspects of statistical inference in problems of sampling from finite populations. Foundations of Statistical Inference, V. P. Godambe and D. A. Sprott (eds.), Holt, Rinehart, and Winston, Toronto, Canada, 177-202.
Rao, J. N. K. (1962). On the estimation of the relative efficiency of sampling procedures. Ann. Inst. Stat. Math., 14, 143-150.
Rao, J. N. K. (1965). On two simple schemes of unequal probability sampling without replacement. Jour. Ind. Stat. Assoc., 3, 173-180.
Rao, J. N. K. (1966). Alternative estimators in pps sampling for multiple characteristics. Sankhya, A28, 47-60.
Rao, J. N. K. (1968). Some small sample results in ratio and regression estimation. Jour. Ind. Stat. Assoc., 6, 160-168.
Rao, J. N. K. (1969). Ratio and regression estimators. New Developments in Survey Sampling, N. L. Johnson and H. Smith, Jr. (eds.), John Wiley & Sons, New York, 213-234.
Rao, J. N. K. (1973). On double sampling for stratification and analytical surveys. Biometrika, 60, 125-133.
Rao, J. N. K. (1975a). On the foundations of survey sampling. In A Survey of Statistical Design and Linear Models, J. N. Srivastava (ed.), American Elsevier Publishing Co., New York, 489-505.
Rao, J. N. K. (1975b). Unbiased variance estimation for multistage designs. Sankhya (in press).
Rao, J. N. K., and Bayless, D. L. (1969). An empirical study of the stabilities of estimators and variance estimators in unequal probability sampling of two units per stratum. Jour. Amer. Stat. Assoc., 64, 540-559.
Rao, J. N. K., and Beegle, L. D. (1967). A Monte Carlo study of some ratio estimators. Sankhya, B29, 47-56.
Rao, J. N. K., and Graham, J. E. (1964). Rotation designs for sampling on repeated occasions. Jour. Amer. Stat. Assoc., 59, 492-509.
Rao, J. N. K., and Kuzik, R. A. (1974). Sampling errors in ratio estimation. Indian Jour. Stat., 36, C, 43-58.
Rao, J. N. K., Hartley, H. O., and Cochran, W. G. (1962). A simple procedure of unequal probability sampling without replacement. Jour. Roy. Stat. Soc., B24, 482-491.
Rao, J. N. K., and Pereira, N. P. (1968). On double ratio estimators. Sankhya, A30, 83-90.
Rao, J. N. K., and Singh, M. P. (1973). On the choice of estimator in survey sampling. Australian Jour. Stat., 15, 95-104.
Rao, P. S. R. S., and Mudholkar, G. S. (1967). Generalized multivariate estimations for the mean of finite populations. Jour. Amer. Stat. Assoc., 62, 1008-1012.
Rao, P. S. R. S., and Rao, J. N. K. (1971). Small sample results for ratio estimators. Biometrika, 58, 625-630.
Robson, D. S. (1952). Multiple sampling of attributes. Jour. Amer. Stat. Assoc., 47, 203-215.
Robson, D. S. (1957). Applications of multivariate polykays to the theory of unbiased ratio type estimation. Jour. Amer. Stat. Assoc., 52, 511-522.
Robson, D. S., and King, A. J. (1953). Double sampling and the Curtis impact survey. Cornell Univ. Agr. Exp. Sta. Mem., 231.
Romig, H. G. (1952). 50-100 Binomial Tables. John Wiley & Sons, New York.
Roy, J., and Chakravarti, I. M. (1960). Estimating the mean of a finite population. Ann. Math. Stat., 31, 392-398.
Royall, R. M. (1968). An old approach to finite population sampling theory. Jour. Amer. Stat. Assoc., 63, 1269-1279.
Royall, R. M. (1970a). On finite population sampling theory under certain linear regression models. Biometrika, 57, 377-387.
Royall, R. M. (1970b). Finite population sampling - on labels in estimation. Ann. Math. Stat., 41, 1774-1779.
Royall, R. M. (1971). Linear regression model in finite population sampling theory. Foundations of Statistical Inference, V. P. Godambe and D. A. Sprott (eds.), Holt, Rinehart, & Winston, Toronto, Canada, 259-279.
Royall, R. M., and Herson, J. (1973). Robust estimation in finite populations, I. Jour. Amer. Stat. Assoc., 68, 880-889.
Sagen, O. K., Dunham, R. E., and Simmons, W. R. (1959). Health statistics from record sources and household interviews compared. Proc. Soc. Stat. Sect. Amer. Stat. Assoc., 6-15.
Sampford, M. R. (1967). On sampling without replacement with unequal probabilities of selection. Biometrika, 54, 499-513.
Sandelius, M. (1951). Truncated inverse binomial sampling. Skandinavisk Aktuarietidskrift, 34, 41-44.
Särndal, C. E. (1972). Sample survey theory vs. general statistical theory: Estimation of the population mean. Rev. Int. Stat. Inst., 40, 1-12.
Satterthwaite, F. E. (1946). An approximate distribution of estimates of variance components. Biometrics, 2, 110-114.
Scott, A. J., and Smith, T. M. F. (1974). Analysis of repeated surveys using time series methods. Jour. Amer. Stat. Assoc., 69, 674-678.
Sedransk, J. (1965). A double sampling scheme for analytical surveys. Jour. Amer. Stat. Assoc., 60, 985-1004.
Sedransk, J. (1967). Designing some multi-factor analytical studies. Jour. Amer. Stat. Assoc., 62, 1121-1139.
Sen, A. R. (1953). On the estimate of variance in sampling with varying probabilities. Jour. Ind. Soc. Agric. Stat., 5, 119-127.
Sen, A. R. (1972). Successive sampling with p (p ≥ 1) auxiliary variables. Ann. Math. Stat., 43, 2031-2034.
Sen, A. R. (1973a). Theory and application of sampling on repeated occasions with several auxiliary variables. Biometrics, 29, 383-385.
Sen, A. R. (1973b). Some theory of sampling on successive occasions. Australian Jour. Stat., 15, 105-110.
Seth, G. R., and Rao, J. N. K. (1964). On the comparison between simple random sampling with and without replacement. Sankhya, A26, 85-86.
Sethi, V. K. (1963). A note on optimum stratification for estimating the population means. Australian Jour. Stat., 5, 20-33.
Sethi, V. K. (1965). On optimum pairing of units. Sankhya, B27, 315-320.
Simmons, W. R. (1954). A plan to account for "not-at-homes" by combining weighting and callbacks.
Jour. of Marketing, 11, 42-53.
Singh, D., Jindal, K. K., and Garg, J. N. (1968). On modified systematic sampling. Biometrika, 55,
541-546.
Sittig, J. (1951). The economic choice of sampling system in acceptance sampling. Bull. Int. Stat. Inst.,
33, V, 51-84.
Slonim, M. J. (1960). Sampling in a Nutshell. Simon & Schuster, New York.
Smith, H. F. (1938). An empirical law describing heterogeneity in the yields of agricultural crops. Jour.
Agric. Sci., 28, 1-23.
Smith, T. M. F. (1976). The foundations of survey sampling: A review. Jour. Roy. Stat. Soc., A139,
183-204.
Snedecor, G. W., and Cochran, W. G. (1967). Statistical Methods. Iowa State University Press, Ames,
Iowa, sixth edition.
Srinath, K. P. (1971). Multiphase sampling in nonresponse problems. Jour. Amer. Stat. Assoc., 66,
583-586.
Stein, C. (1945). A two-sample test for a linear hypothesis whose power is independent of the variance.
Ann. Math. Stat., 16, 243-258.
Stephan, F. F. (1941). Stratification in representative sampling. Jour. Marketing, 6, 38-46.
Stephan, F. F. (1945). The expected value and variance of the reciprocal and other negative powers of
a positive Bernoulli variate. Ann. Math. Stat., 16, 50-61.
Stephan, F., and McCarthy, P. J. (1958). Sampling Opinions. John Wiley and Sons, New York, p. 243.
Stuart, A. (1954). A simple presentation of optimum sampling results. Jour. Roy. Stat. Soc., B16,
239-241.
Sukhatme, P. V. (1935). Contribution to the theory of the representative method. Supp. Jour. Roy.
Stat. Soc., 2, 253-268.
Sukhatme, P. V. (1947). The problem of plot size in large-scale yield surveys. Jour. Amer. Stat. Assoc.,
42, 297-310.
Sukhatme, P. V. (1954). Sampling Theory of Surveys, With Applications. Iowa State College Press,
Ames, Iowa.
Sukhatme, P. V., and Seth, G. R. (1952). Non-sampling errors in surveys. Jour. Ind. Soc. Agr. Stat., 4,
5-41.
Sukhatme, P. V., and Sukhatme, B. V. (1970). Sampling Theory of Surveys With Applications. Food
and Agriculture Organization, Rome, second edition.
Tepping, B. J., and Boland, K. L. (1972). Response variance in the Current Population Survey. U.S.
Bureau of the Census Working Paper No. 36, U.S. Government Printing Office, Washington,
D.C.
Tin, M. (1965). Comparison of some ratio estimators. Jour. Amer. Stat. Assoc., 60, 294-307.
Trueblood, R. M., and Cyert, R. M. (1957). Sampling Techniques in Accounting. Prentice-Hall,
Englewood Cliffs, N.J.
Trussell, R. E., and Elinson, J. (1959). Chronic Illness in a Large City. Harvard University Press,
Cambridge, Mass., pp. 339-370.
Tschuprow, A. A. (1923). On the mathematical expectation of the moments of frequency distributions
in the case of correlated observations. Metron, 2, 461-493, 646-683.
Tukey, J. W. (1950). Some sampling simplified. Jour. Amer. Stat. Assoc., 45, 501-519.
Tukey, J. W. (1958). Bias and confidence in not-quite large samples. Ann. Math. Stat., 29, 614.
U. N. Statistical Office (1950). The preparation of sample survey reports. Stat. Papers Series C, No. 1.
U. N. Statistical Office (1960). Sample Surveys of Current Interest. Eighth Report.
U. S. Bureau of the Census (1968). Evaluation and Research Program of the U.S. Census of Population
and Housing, 1960: Effects of Interviewers and Crew Leaders. Series ER 60, No. 7. Washington,
D.C.
Warner, S. L. (1965). Randomized response: A survey technique for eliminating evasive answer bias.
Jour. Amer. Stat. Assoc., 60, 63-69.
Warner, S. L. (1971). The linear randomized response model. Jour. Amer. Stat. Assoc., 66, 884-888.
Watson, D. J. (1937). The estimation of leaf areas. Jour. Agr. Sci., 27, 474.
West, Q. M. (1951). The Results of Applying a Simple Random Sampling Process to Farm Management
Data. Agricultural Experiment Station, Cornell University.
Williams, W. H. (1963). The precision of some unbiased regression estimators. Biometrika, 17,
267-274.
Wishart, J. (1952). Moment-coefficients of the k-statistics in samples from a finite population.
Biometrika, 39, 1-13.
Wold, H. O. A. (1954). A Study in the Analysis of Stationary Time Series. Almqvist and Wiksell,
Stockholm, second edition.
Woodruff, R. S. (1959). The use of rotating samples in the Census Bureau's Monthly Surveys. Proc.
Soc. Stat. Sect. Amer. Stat. Assoc., 130-138.
Woodruff, R. S. (1971). A simple method for approximating the variance of a complicated estimate.
Jour. Amer. Stat. Assoc., 66, 411-414.
Woolsey, T. D. (1956). Sampling methods for a small household survey. Pub. Health Monographs, No.
40.
Yates, F. (1948). Systematic sampling. Phil. Trans. Roy. Soc. London, A241, 345-377.
Yates, F. (1960). Sampling Methods for Censuses and Surveys. Charles Griffin and Co., London, third
edition.
Yates, F., and Grundy, P. M. (1953). Selection without replacement from within strata with probability
proportional to size. Jour. Roy. Stat. Soc., B15, 253-261.
Zarkovic, S. S. (1960). On the efficiency of sampling with various probabilities and the selection of
units with replacement. Metrika, 3, 53-60.
Zukhovitsky, S. I., and Avdeyeva, L. I. (1966). Linear and Convex Programming. W. B. Saunders,
Philadelphia.
Answers to Exercises
1.1 (a) Examples of problems of definition are decisions whether to count words in a preface or
index and how mathematical symbols are treated as "words." In a book such as this one with many
mathematical symbols, however, it seems unlikely that a count of words, either including or omitting
symbols, would be wanted. (b) (1) The pages constitute a convenient frame. A disadvantage of the
page as a sampling unit in which we count all words on any sample page is that with numerous
illustrations the number of words per page may be quite variable because of the incomplete pages. It
may be worthwhile first to list all the incomplete pages, forming two subpopulations or strata, one of
incomplete pages and one of complete pages, that are sampled separately by the method of stratified
sampling described in Chapter 5. (2) A problem with the line is that obtaining a listing of lines so that
lines can be sampled directly is time-consuming. Also, there are incomplete lines at the end of
paragraphs. Since words per line should be fairly stable, however, the solution may be to use two-stage
sampling (Chapter 11), first drawing a sample of pages and then counting the number of lines on each
selected page and drawing a subsample of lines on these pages.
1.2 This question supposes that we first draw a sample of cards with equal probability. (a) If
sample names not in the target population are discarded, the only problem is that the size of the sample
of names from the target population will generally be less than the number of cards and will be a
random variable, depending on the cards that happen to be chosen. (b) The problem is that names
appearing on several cards have higher probabilities of selection. One way of handling this is to count
the number of cards on which a selected name appears and use this number in making the estimate by
methods appropriate to selection with unequal probabilities (Chapter 9A). Another way that gives
each name an equal chance but may involve many rejections is to retain a card only if it is the first of the
set on which this name appears. (c) As in (b), names are being selected with unequal probabilities. I
know of no easy method of giving each name an equal chance. If the number of cards on which each
name appears has been recorded somewhere, an unequal-probability method can be used, as in (b).
1.3 Suggestions are: (a) a recent directory of department and luggage stores, (b) the repositories
for lost articles maintained by the subway and bus companies, and (c) hospitals and private physicians
in the geographical area in which snake bites occur, plus any public health organization to which
reporting of bites is compulsory. Weaknesses in all three frames are likely to be incompleteness, plus
high costs in (c) if snake bites are rare and not centrally reported. (d) A list of households is often used
as a frame for selecting a sample of families. Although there will be some incompleteness (families who
cannot be reached), the major problem may be errors in measurement.
1.4 A problem is incompleteness because of new construction. In a sample of addresses, new
dwellings can usually be handled by the interviewer, who checks, for any sample address, whether
there are new dwellings between this address and the next address in the directory and, if so, includes
these new dwellings in the sample. Whole areas of new construction may not be mentioned in the
directory and require development of a separate frame. Drawing a list of addresses is preferable to
drawing a list of persons, since addresses are more permanent. For reasons of travel expense, however,
the sampling unit may be a city block from which a subsample of dwellings is selected.
1.5 $80,390 and $82,970.
1.6 The confidence probability is about 0.054 (found from $t = -1.67$ with 25 degrees of
freedom). This assumes that future receipts follow the same frequency distribution as the sample of 26
receipts, and that this distribution is normal.
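The tail probability quoted in 1.6 can be checked numerically. The following is a minimal sketch, assuming the quoted figure is the one-sided lower tail of Student's t with 25 degrees of freedom; the function names are hypothetical and the integral is approximated by Simpson's rule rather than taken from tables.

```python
import math

def t_density(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log(1 + t * t / df))

def t_lower_tail(t, df, steps=2000):
    """P(T <= t) for t <= 0: Simpson's rule on [0, |t|], then symmetry."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 0.5 - total * h / 3

# Assumed reading of the answer: one-sided probability below t = -1.67, 25 d.f.
p = t_lower_tail(-1.67, 25)
print(f"P(T <= -1.67) with 25 d.f. is about {p:.4f}")
```

The result should agree with the printed 0.054 to about three decimal places.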
1.7 When the MSE is due entirely to bias, the estimate is always wrong by $\sqrt{\text{MSE}}$. The
probability of an error $\ge \sqrt{\text{MSE}}$ is therefore unity and the probability of an error $\ge 1.96\sqrt{\text{MSE}}$ or
$\ge 2.576\sqrt{\text{MSE}}$ is zero.
3.2 1066, 1334, as given by the normal approximation, equation 3.19.
3.3 Nearly conclusive.
3.6 (a) 16.2 ± 3.6%. (b) J 738 ± ZIW families.
3.7 171j(l:l 268 families.
3.8 As an exact result,
$$\frac{V(\hat{A}_1)}{V(\hat{A}_1')} = \frac{N_1^2\, n\, Q_1}{N^2\, n_1 (1 - \pi)(Q_1 + P_1 \pi)}$$
Now $N_1 = N(1 - \pi)$, and in large samples $n_1 \approx n(1 - \pi)$. These substitutions give the stated result. In
order that $V(\hat{A}_1)/V(\hat{A}_1')$ be small, we must have $\pi(1 - Q_1)/Q_1$ large. This means that $Q_1$ must be
small; in other words, the proportion of domain 1 that lies in class C must be large. For given $Q_1$, $\pi$
should be large.
3.9 All give $A_U = 13$. By the hypergeometric, the probability of no units in C in the sample is
0.0601 for $A = 12$ and 0.0434 for $A = 13$. By the binomial, $P_U = 0.4507$ and $\sqrt{1 - f}\,P_U = 0.4114$,
giving $A_U = 12.3$. Page 59, Ex. 3 gives 0.061 for $A_U = 12$ and 0.044 for $A_U = 13$.
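The figures in 3.9 can be reproduced with a short calculation. This is a sketch under an assumption: the population size N = 30 and sample size n = 5 are not stated in this section and are used here only because they reproduce the printed probabilities; the 95% one-sided confidence level is likewise assumed.

```python
from math import comb, sqrt

# Assumed values (not stated in this section): they reproduce the printed figures.
N, n = 30, 5

def prob_no_units_in_c(A):
    """Hypergeometric probability that a sample of n contains no unit of class C,
    when A of the N population units are in class C."""
    return comb(N - A, n) / comb(N, n)

p12 = prob_no_units_in_c(12)
p13 = prob_no_units_in_c(13)

# Binomial upper limit at an assumed one-sided 95% level: (1 - P_U)^n = 0.05.
P_U = 1 - 0.05 ** (1 / n)
# Same limit shrunk by the square root of the finite population correction.
P_U_fpc = sqrt(1 - n / N) * P_U

print(round(p12, 4), round(p13, 4))
print(round(P_U, 4), round(P_U_fpc, 4), round(N * P_U_fpc, 1))
```

Since the probability first drops below 0.05 at A = 13, the hypergeometric limit is 13, and the binomial figure of about 12.3 rounds up to the same value.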
3.U Estimate (b) seems more precise.
3.U The highest value is $PQ/n$ as compared with $PQ/nm$ by the binomial formula. This occurs
when every cluster consists entirely of 1's or entirely of 0's. The lowest value can be zero if every cluster
gives the same proportion P. (This is possible only for certain values of P and m.)
3.13 Variance = 0.00184 by the ratio method and 0.00160 by the binomial formula.
3.14 Average size of sample = $m/P$.
4.1 735 houses. This sample size is needed for two-car households if $P = 10\%$.
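A sample size of this kind follows from the usual calculation for a proportion with a finite population correction. The sketch below is illustrative only: the population of 4000 houses and the required standard error of 1% are assumptions (they are not given in this section) chosen because they lead to the printed 735.

```python
import math

# Assumed inputs (hypothetical; not stated in this section).
N = 4000        # houses in the district
P = 10.0        # percentage of two-car households
se_target = 1.0 # required standard error, in percentage points

Q = 100.0 - P
n0 = P * Q / se_target ** 2   # first approximation, ignoring the fpc
n = n0 / (1 + n0 / N)         # adjusted for the finite population
n_required = math.ceil(n)

print(n0, round(n, 1), n_required)
```

Under these assumptions the first approximation is 900 and the adjusted size rounds up to 735.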
4.2 About 260 sheets.
4.3 (a ) 2475; (b) 4950 .
4.4 $n = 21$ (taking $t = 2$).
4.5 $n = 484$. For number of unemployed, the cv would be about 1~%.
4.6 62 more.
4.7 (a) $n = 278$; (b) $n = 2315$; (c) $n = 3046$.
4.8 If a rectangular distribution is assumed within each class, we take $S^2 = 0.083h^2$ or $S = 0.29h$.
This gives estimates of 230, 580, 2030, and 11,600 in the four classes. If the right-triangular
distribution is used in the fourth class, we take $S = 0.24h$, giving 9600 for this class.
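The multipliers in 4.8 come from the standard deviations of a rectangular (uniform) and a right-triangular distribution on an interval of width h, which are $h/\sqrt{12}$ and $h/\sqrt{18}$. A small sketch of the arithmetic follows; the width of the fourth class is not stated here and is backed out from the printed estimate of 11,600, so it should be treated as an inferred value.

```python
import math

# Standard deviation per unit class width h.
sd_rectangular = math.sqrt(1 / 12)      # uniform on an interval of width h
sd_right_triangular = math.sqrt(1 / 18) # right triangle on an interval of width h

# Inferred (not stated in this section): width of the fourth class implied by
# the printed estimate S = 11,600 when S = 0.29h.
h_fourth = 11600 / 0.29
s_fourth_triangular = 0.24 * h_fourth

print(round(sd_rectangular ** 2, 3), round(sd_rectangular, 2))
print(round(sd_right_triangular, 2), round(s_fourth_triangular))
```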
UNS~2/3
4.11 nup, ~ ( --=
2 c J2
4.12 (a) $n = 1250$; (b) $n = 679$. In this part, we can give a dummy variate $y_i$ the value +100 for a
(Yes, No) answer, -100 for a (No, Yes) answer, and 0 otherwise. Then $\bar{Y} = P_1 - P_2$ in percentages.
With an advance sample of $n_1 = 200$, formula (4.7) in Section 4.7 can be used, giving $n = 679$.
4.13 (a) The probability that a family of four persons has 1, 2, and 3 females is approximately 1/4,
1/2, and 1/4, respectively. For estimating the proportion P of females, a simple random sample of n
families gives $V(p) = 0.03125/n$, as against $0.0625/n$ for a sample of 4n persons. The deff factor is
about 1/2. In the corresponding example in Table 3.5 with 30 households of unequal sizes, the
estimated deff factor was 0.475. (b) The deff factor would be slightly raised by families with identical
twins, since the proportions of families with 1 and 3 females would be slightly increased.
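The variances in 4.13(a) can be verified exactly. The sketch below treats the family as the sampling unit, with the proportion of females in a family of four taking the values 1/4, 1/2, 3/4 with probabilities 1/4, 1/2, 1/4 as stated above, and compares the result with the binomial variance for 4n persons. Variable names are hypothetical.

```python
from fractions import Fraction

# Proportion of females in a family of four -> probability of that family type.
family_types = {
    Fraction(1, 4): Fraction(1, 4),
    Fraction(2, 4): Fraction(1, 2),
    Fraction(3, 4): Fraction(1, 4),
}

mean_p = sum(p * w for p, w in family_types.items())
# Variance between families; V(p) for n families is this quantity divided by n.
var_family = sum(w * (p - mean_p) ** 2 for p, w in family_types.items())

# Binomial variance for 4n persons is PQ/(4n); the numerator of 1/n is PQ/4.
var_persons = mean_p * (1 - mean_p) / 4

deff = var_family / var_persons
print(float(var_family), float(var_persons), float(deff))
```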
5.1 (a) Neyman allocation gives $n_1 = 0.87$, $n_2 = 3.13$. (b) There are three possible estimates
under optimum allocation and nine under proportional allocation. $V_{opt}(\bar{y}_{st}) = 1/6 = 0.167$; $V_{prop}(\bar{y}_{st}) =
7/12 = 0.583$. (c) Formula 5.27 gives $V_{opt}(\bar{y}_{st}) = 0.159$.
5.2 (a) $n_1 = 375$, $n_2 = 625$; (b) $n_1 = 750$, $n_2 = 250$.
5.3 RP = 181% for proportional allocation and 214% for optimum allocation.
5.5 When $W_1 = W_2$, the relative increases equal 0.029 for $c_2/c_1 = 2$ and 0.111 for $c_2/c_1 = 4$.
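The two relative increases in 5.5 can be reproduced under one reading of the exercise, which is an assumption here: two strata with equal weights and equal within-stratum variances, comparing equal allocation with the cost-optimal allocation $n_h \propto 1/\sqrt{c_h}$ at the same total cost. Under that reading the product of variance and cost is proportional to $2(c_1 + c_2)$ for equal allocation and to $(\sqrt{c_1} + \sqrt{c_2})^2$ for the optimum.

```python
import math

def relative_increase(cost_ratio):
    """Relative increase in variance (fixed cost) from equal allocation rather
    than cost-optimal allocation, assuming two strata with W1 = W2 and S1 = S2.
    cost_ratio is c2/c1."""
    c1, c2 = 1.0, float(cost_ratio)
    equal = 2 * (c1 + c2)                           # V*C under equal allocation
    optimum = (math.sqrt(c1) + math.sqrt(c2)) ** 2  # V*C under optimum allocation
    return equal / optimum - 1

inc_2 = relative_increase(2)
inc_4 = relative_increase(4)
print(round(inc_2, 3), round(inc_4, 3))
```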
5.6 (a ) II;!" .. t /I . / n - (b) $n = 264$, $n_1 = 88$, $n_2 = 176$; (c) $1936.
5.7 (a) S2lf,!! against $1936. (b) No. The minimum field cost to reduce V to 1 is $2230.
5.8 (a) $n_1 = 384$, $n_2 = 192$; (b) $n_1 = 400$, $n_2 = 1600$; (c) $n_1 = 1200$, $n_2 = 2400$.
" .5.9 Frll florlal ,"crea~e ~ ~.
5.10 $n_1 = 541$, $n_2 = 313$, $n_3 = 146$.
5.12 In population 1, $V_{prop} = 0.143/n$; $V_{opt} = 0.134/n$. In population 2, $V_{prop} = 0.0491/n$, $V_{opt} =
0.0423/n$. The reduction in variance from optimum allocation is about 6% in population 1 as against
14% in population 2.
5.14 (a) If we guess $P_1 = 45\%$, $P_2 = 25\%$, $P_3 = 7.5\%$ as a compromise, this gives $n_1 = 268$,
$n_2 = 116$, n, ~ :\ 6; (b) s.e. = 0.0225; (c) s.e. = 0.0241.
5.15 As n approaches N, a stage is reached in which the standard formula $n_h \propto N_h S_h$ for Neyman
optimum allocation is no longer applicable, since it would require $n_h > N_h$ in at least one stratum. As
noted in Section 5.8, formula (5.27) then ceases to hold. The student is in error if he claims that (5.27) is
always wrong; the formula has a limited range which, however, covers nearly all applications.
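The situation described in 5.15 can be illustrated with a small sketch. The stratum sizes and standard deviations below are hypothetical, and the capping procedure (sample every unit in any stratum where the formula asks for more than $N_h$, then reallocate the remainder by the same rule) is one common way of handling the problem, offered as an assumption rather than as the book's own algorithm.

```python
def neyman_raw(N, S, n):
    """Plain Neyman allocation: n_h proportional to N_h * S_h."""
    weights = [Nh * Sh for Nh, Sh in zip(N, S)]
    total = sum(weights)
    return [n * w / total for w in weights]

def neyman_capped(N, S, n):
    """Neyman allocation with n_h capped at N_h; the rest is reallocated."""
    alloc = [0.0] * len(N)
    free = set(range(len(N)))   # strata not yet fixed at 100% sampling
    remaining = float(n)
    while free:
        total = sum(N[h] * S[h] for h in free)
        trial = {h: remaining * N[h] * S[h] / total for h in free}
        over = [h for h in free if trial[h] > N[h]]
        if not over:
            for h in free:
                alloc[h] = trial[h]
            break
        for h in over:          # take the whole stratum, then iterate
            alloc[h] = float(N[h])
            remaining -= N[h]
            free.remove(h)
    return alloc

# Hypothetical population: a small, highly variable stratum and a large, stable one.
N_h = [10, 90]
S_h = [20.0, 1.0]
raw = neyman_raw(N_h, S_h, 20)
capped = neyman_capped(N_h, S_h, 20)
print([round(x, 2) for x in raw], capped)
```

With these made-up figures the plain formula asks for about 13.8 units from a stratum that contains only 10, which is the case in which the usual variance formula for optimum allocation stops applying.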
5A.4 No. In each of the worst cases $[\sum (w_h - W_h)\bar{Y}_h]^2$ is $(0.105)^2 = 0.0110$. Thus, with stratification,
$\text{MSE}(\bar{y}_{st})$, as given by formula (5A.6), is $0.0108 + 0.0110 = 0.0218$. With simple random
sampling, $V(\bar{y}) = 0.0177$.
5A.6 (a) $n = 1024$. The optimum allocation for the second variate (average amount invested)
satisfies both requirements.
5A.7 $W_1 = 0.728$, $W_2 = 0.272$, $S_1 = 1.806$, $S_2 = 4.698$ (in the coded scale). (a) The optimum
sample sizes are $n_1 = 0.507n$, $n_2 = 0.493n$. (b) $V(\bar{y}) = 31.95/n$, $V_{opt}(\bar{y}_{st}) = 6.72/n$.
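The allocation and optimum variance in 5A.7 follow directly from the stratum weights and standard deviations quoted there. The sketch below assumes the finite population correction is ignored, so that $n_h/n = W_h S_h / \sum W_h S_h$ and $V_{opt}(\bar{y}_{st}) = (\sum W_h S_h)^2 / n$.

```python
# Stratum weights and standard deviations as printed in 5A.7 (coded scale).
W = [0.728, 0.272]
S = [1.806, 4.698]

ws = [w * s for w, s in zip(W, S)]
sum_ws = sum(ws)

shares = [x / sum_ws for x in ws]   # n_h / n under Neyman allocation
v_opt_times_n = sum_ws ** 2         # n * V_opt(y_st), fpc ignored (assumption)

print([round(x, 3) for x in shares], round(v_opt_times_n, 2))
```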
5A.8 (b) $\int_0^a \sqrt{f(y)}\,dy = \int_0^a \sqrt{2(1 - y)}\,dy = 2\sqrt{2}\,[1 - (1 - a)^{3/2}]/3$. Hence we want $[1 - (1 - a)^{3/2}] = 1/2$.
5A.9 In Exercise 5A.7, $W_1 = 0.728$, as in the Dalenius-Hodges rule, comes closest to satisfying
the Ekman rule. In Exercise 5A.8, the Ekman rule gives $a = (3 - \sqrt{5})/2 = 0.38$.
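Both boundary rules can be checked for the density $f(y) = 2(1 - y)$ on $0 < y < 1$ with two strata. This is a sketch under stated assumptions: the Dalenius-Hodges boundary is taken as the point that halves the cumulative $\sqrt{f}$, and the Ekman rule is taken to mean equal values of $W_h \times$ (stratum width) in the two strata.

```python
import math

def cum_sqrt_f(a):
    """Integral of sqrt(2(1 - y)) from 0 to a."""
    return 2 * math.sqrt(2) * (1 - (1 - a) ** 1.5) / 3

# Dalenius-Hodges: 1 - (1 - a)^(3/2) = 1/2.
a_dh = 1 - 0.5 ** (2 / 3)

# Ekman (as assumed above): W1 * a = W2 * (1 - a), with W1 = 1 - (1 - a)^2
# and W2 = (1 - a)^2; this reduces to a^2 - 3a + 1 = 0.
a_ekman = (3 - math.sqrt(5)) / 2

w1 = 1 - (1 - a_ekman) ** 2
w2 = (1 - a_ekman) ** 2
print(round(a_dh, 3), round(a_ekman, 3))
```

The two rules give nearby boundaries, roughly 0.37 and 0.38.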
5A.10 The optima are $L = 7$ for $\rho = 0.95$, $L = 5$ for $\rho = 0.9$, and $L = 4$ for $\rho = 0.8$. Either $L = 5$ or
$L = 6$ is a good compromise.
5A.11 (a) Gain in precision is about 110%. (b) Gain from proportional stratification over simple
random sampling is about 90%.
5A.12 Increase $n_1$ as the hint suggests, leave $n_2 = 400$. We require $n_1 = 140$, giving $n = 540$.
6.1 For the ratio estimate $V(\hat{Y}_R) = N^2(1 - f)S_d^2/n$ and for simple expansion $V(\hat{Y}) =
N^2(1 - f)S_y^2/n$, where $d = (y - Rx)$. For the sample of 21 households the estimates of $S_d^2$ and $S_y^2$ are as
follows. Number of children, $s_d^2 = 0.49$, $s_y^2 = 1.61$; number of cars, $s_d^2 = 0.41$, $s_y^2 = 0.39$; number of
TV sets, $s_d^2 = 0.51$, $s_y^2 = 0.45$. The ratio estimate appears superior for children.
6.2 Gain = 66%. At least 11 units by the ratio method.
6.3 Quadratic limits (27,100, 29,870); normal limits (27,030, 29,700).
6.4 Apply theorem 6.3 to the estimation of $R = \bar{Y}/\bar{X}$. With large samples, use $\bar{y}/\bar{X}$ if $r \le$ (cv of
x)/2(cv of y), and use $\bar{y}/\bar{x}$ otherwise, where r and the cv's are sample estimates.
6.5 The MSE's are 46.5 for the separate ratio estimate and 40.6 for the combined ratio estimate.
In both cases the contribution of bias to the MSE is negligible.
6.6 For Lahiri's method, vel>Ib) ~ 40.1.
6.7 Estimated population total = 116.21 millions. The relative variance is 0.00111, so that the
s.e. is (0.0333)(116.21) = 3.87 millions. The estimate is within 1 s.e. of the true total.
6.8 The estimates are (a) 1896, (b) 1660, (c) 1689. In (c) we find $w_1 = 2.38$, $w_2 = -1.38$.
Estimated s.e.'s are (a) 256, (b) 36.9, (c) 18.6. For the s.e. in (b) I used the formula s.e. =
$\hat{Y}_R\sqrt{(1 - f)(c_{yy} + c_{xx} - 2c_{yx})/n}$, where $\hat{Y}_R$ is the ratio estimate of Y, that is, 1660. For the s.e. in (c) I
used $\hat{Y}_{MR} = 1689$.
8.1 Variances are 8.19 (systematic), 11.27 (simple random), 8.25 (stratified, 2), 7.46 (stratified, 1).
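The step from relative variance to standard error in 6.7 is a one-line calculation: the coefficient of variation is the square root of the relative variance, and the s.e. is that coefficient times the estimate. A small sketch using the printed figures:

```python
import math

estimated_total = 116.21     # millions, as printed in 6.7
relative_variance = 0.00111  # as printed in 6.7

cv = math.sqrt(relative_variance)    # coefficient of variation of the estimate
standard_error = cv * estimated_total

print(round(cv, 4), round(standard_error, 2))
```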
8.2 V, •• = 0.00141, V ••• = 0.00340.
8.3 The systematic sample should be superior for the proportion of people of Polish descent,
since this variable exhibits geographical stratification. It is likely to be inferior for proportion of
children because the sampling interval, 1 in 5, coincides with the average size of a household. The same
is true, though to a smaller extent, for proportion of males.
8.4 The variances are as follows. Males, $V_{ran} = 0.0204$, $V_{sy} = 0.0216$; children, $V_{ran} = 0.0204$,
$V_{sy} = 0.0776$; professional, $V_{ran} = 0.0192$, $V_{sy} = 0.0016$.
8.5 Actual variance, 8.19. Method (a) gives 11.29. For method (b) the estimated variance from
a single sample is $(1 - f)(\bar{y}_1 - \bar{y}_2)^2/4$, where $\bar{y}_1$, $\bar{y}_2$ are the means of the two halves. The average is
3.24. The serious underestimation is unexpected.
8.7 Both variances are $(k^2 - 1)/6$.
9.1 Relative costs of using the four types of unit are 100, 90.1, 79.7, and 77.8 (taking the first
unit as a standard).
9.2 Relative precision of the household is 211 % for the sex ratio and 38% for the proportion who
had seen a doctor.
9.1 Relative precision of the large unit is 0.566 with simple random sampling and 0 .625 with
stratified random sampling.
9.5 (a) $M = 5$; (b) $M = 1$.
9.6 The optimum M should decrease because travel cost, which varies as $\sqrt{n}$, becomes relatively
less important when n increases.
10.1 (a ) 2.00 ; (b ) 2. D .
10.3 (a) $165/n$; (b) $148.5/n$; (c) $132/n$.
10.4 (a) $n = 660$ fields; (b) $n = 530$ fields. Protein requires fewer fields than yield.
10.5 $c_1/c_2 = 8$.
10.7 (a) 0.93%; (b) 0.51%; (c) 0.36%.
10.8 (a) Either $m_0 = 7$ or $m_0 = 8$ is suitable; (b) 89% for $m_0 = 7$ and 93% for $m_0 = 8$; (c) 86% for
$m_0 = 7$ and 89% for $m_0 = 8$.
11.2 The relative precision of III to II drops from 3.02 to 2.75. If two sampling plans differ
primarily in their between-units contribution to the variance, the relative precision of the superior plan
will in general decrease as the ratio of the within-units variance to the total variance increases.
11.3 The explanation is. roughly speaking. that with these data the Y.! z, are more stable than the
Yi / M ,. If we look t, - fl. 13.
and H.
the between-units contribution in method IV would vanish
11.4 Total variance: 0.00504 (I), 0.02358 (II), 0.00554 (III).
11.6 Estimated percentage 14.2 ± 2.16. Estimated number 3540 ± 540.
11.7 Estimated percentage 13.9 ± 2.49.
11.9 Total rooms, 29,400; total persons, 50,550; persons per room, 1.72; s.e.'s: total persons,
2,440; persons per room, 0.066.
12.1 $n = 267$, $n' = 1320$ or $n = 268$, $n' = 1280$. $V(p_{st})$ with optimum allocation is 6.67 when $p_{st}$ is
in %'s. With single sampling, $V(p) = 8.33$.
12.2 $c_n/c_{n'} > 9$.
12.3 $n/n' = 1/19$.
12.4 $n' > 16n$.
12.5 By formula (12.67), s.e. = 1.25, ignoring $1/N$.
12.6 Per cent gains from the second to the sixth occasion are 50, 75, 91, 100, and 105, respectively.
11.' The values 0111 V(Y2")/5 ' and n V(~,')/ 5' are a~ lollows: I' -:t () -
0.8 : 0.885. 0.875; I' -~ ;
p - 0.9: 0.843. 0.840; I' - i.
p. 0.8: 0.824, 0.810; I' '' !. p m 0.9: 0.752. 0.746.
n.n (a) LeI y, - I for any unit that has the tirst attnbutc and y, - 0 otherwise and let stratum 1 be
the stratum in which every unit has the second attribute. In the notation 01 theorems 12.1. 12.2. with
I/N neglisible. 5' - PIQI and 5 1' - PIP,/(P) + P1)" Results in (a) follow from the theorems ; (b) if
c> I, the expected cost. double samplina with "I - 1/2. k - 2 (the optimum) sives V( Y,.) ..
N 1 (0 .844)/C>. while a simple random sample lives ven ..N ' (l .875)/C· . over twice as large.
13.1 (c) 90% response with 1047 completed questionnaires or 95% response with 701 completed
questionnaires.
13.2 The method with 90% response costs $5235. That with 95% response costs $5.7895 per
completed questionnaire, or $4058 total cost.
13.3 (b) $0.65n_0$, $0.815n_0$, $0.8915n_0$, $0.9351n_0$, $0.9611n_0$; (c) 100, 104, 108, 112, 117; (d) 300,
288, 277, 267, 256.
13.4 (a) Bias (in %), -3.85, -2.15, -1.21, -0.69, -0.40; (b) variances are 8.67, 9.03, 9.39, 9.74,
10.16; (c) four calls.
13.6 Yes. VC for $k = 2$ is only about 2% over the minimum VC.
13.7 Politz-Simmons estimate, 39.7%; binomial, 42.3%.
13.8 (c) Variance = 0.1.
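The totals in 13.2 are consistent with the completed-questionnaire counts given in 13.1. In the sketch below the $5.7895 unit cost and both counts are as printed; the unit cost of $5 for the 90% method is inferred from 5235/1047 and is an assumption, not a figure stated in this section.

```python
# Completed questionnaires needed under each method (from answer 13.1).
completed_90, completed_95 = 1047, 701

# Cost per completed questionnaire.
unit_cost_90 = 5235 / completed_90  # inferred from the printed total (assumption)
unit_cost_95 = 5.7895               # as printed in 13.2

total_90 = completed_90 * unit_cost_90
total_95 = completed_95 * unit_cost_95

print(unit_cost_90, round(total_90), round(total_95))
```

On these figures the 95% method is cheaper overall despite its higher cost per completed questionnaire, because it needs fewer completed questionnaires.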
13.9 II each enumerator's error of measurement were independent from family to family , the
variance of the sample mean would be (tT• .' + tT/ + 1711 , + (711)/525 instcad of (17,/ + o} ... 35171/ +
\()Su/}/SlS . En\lmerator biasc\ contrib\lte abo\lt 55% of the tota\ "aritonce .
13.10 10·V( ..... )- V(100"... ) - (a) 1.80 (Direct); (b ) 10.69 (Warner ); (c ) 3.30 ( 7T" krown); (d )
5.12 (Moo"); (t ) 6.30 (P, " 1- PI) '
13.11 (a ) 10·MSE(..... ) - 3.81 ; .fr"u is superior if 7T u is known . (b ) 10·MSE( ..... ) '" 5.47 ; the
two-question ..... u 15 also superior if P, - O. (c ) 10·MSE(,.... ) . 7,64. All methods arc superior except
Warner's orisinal method.
Author Index