Behrman et al. (REStat, 2004) Evaluating Preschool Programs When Length of Exposure Varies
Abstract. Nonexperimental data are used to evaluate impacts of a Bolivian preschool program on cognitive, psychosocial, and anthropometric outcomes. Impacts are shown to be highly dependent on age and exposure duration. To minimize the effect of distributional assumptions, program impacts are estimated as nonparametric functions of age and duration. A generalized matching estimator is developed and used to control for nonrandom selectivity into the program and into exposure durations. Comparisons with three groups (children in the feeder area not in the program, children in the program for ≤1 month, and children living in similar areas without the program) indicate that estimates are robust for significant positive effects of the program on cognitive and psychosocial outcomes with ≥7 months’ exposure, although the age patterns of effects differ slightly by comparison group.

I. Introduction

There is growing recognition that human capital investments made in early childhood are important determinants of school performance and lifetime productivity.1 Previous studies suggest strong associations between (1) cognitive and psychosocial skills measured at young ages and (2) educational attainment, earnings, and employment outcomes.2

In developing countries, low levels of investment in human capital are seen as a major barrier to growth as well as a source of poverty. Lower levels than in developed countries reflect the facts that children enroll later in elementary school, repeat grades more frequently, and drop out of school at earlier ages. Recent research demonstrates that nutrition is an important factor in explaining delayed school enrollments and lower educational attainment levels.3 To combat such problems, governments in several developing countries, often supported by international agencies, have introduced subsidized preschool programs with the twofold goals of improving child nutrition and providing environments that are conducive to learning (Myers, 1995). In this paper, we evaluate the effectiveness of one such program, an early childhood development program in Bolivia called PIDI (Proyecto Integral de Desarrollo Infantil).

There has been little research on preschool interventions in developing-country settings. However, a large literature evaluates the effects of preschool programs in the United States that are targeted at children from impoverished families.4 The Perry Preschool Program is perhaps the best known of the U.S. programs in the evaluation literature. An experimental evaluation of this program found that children who participated in it scored higher on cognitive tests, although the gains tended to disappear within a few years. Long-lasting effects were found on other outcome measures, such as educational attainment, earnings, welfare participation rates, out-of-wedlock birth rates, and crime rates.5 Evaluations of two other early intervention programs, the Milwaukee and Abecedarian projects, document long-lasting effects on test scores (Ramey, Campbell, and Blair, 1998). The positive impacts consistently found for interventions aimed at very young children are in sharp contrast to the relatively weak impacts often found in evaluations of U.S. job training programs targeted at adolescent youth or adults (for example, Bloom et al., 1993).

Although the promising results from U.S. preschool program evaluations might lead to high expectations about similar programs in other settings, the results from U.S. experience may not be generalizable to developing countries. Both the preschool programs and the families and children they aim to help differ in some possibly important respects. For example, program expenditure per child in developing countries is usually lower, although as a fraction of the family’s income it may be higher. Lower levels of expenditure do not necessarily imply low impacts, however, because diminishing marginal returns to investment could lead to higher impact per unit of investment. Another difference in developing countries is that preschool providers are often less well trained. Lastly, in terms of the target population, children frequently suffer from protein and energy malnutrition and micronutrient deficiencies, which is why preschool programs in developing countries tend to put greater emphasis on nutrition. Such differences in program

Received for publication November 19, 2002. Revision accepted for publication September 4, 2003.
* University of Pennsylvania; Florida State University; and University of Pennsylvania and NBER, respectively.
This research is sponsored by the World Bank Research Foundation Project on “Evaluation of the Impact of Investments in Early Childhood Development on Nutrition and Cognitive Development” (P. I. Harold Alderman). This paper was presented at the 2000 World Congress meetings of the Econometric Society. We thank Harold Alderman, Alfonso Flores-Lagunes, Judith McGuire, John Newman, Steven Stern, Edward Vytlacil, and participants at seminars at the University of Minnesota, Lehigh University, University of Delaware, University of Virginia, University of Pennsylvania, Hebrew University, Ohio State University, and the NBER for helpful comments. We are also grateful to Elizabeth Peñaranda of the PAN staff in La Paz, Bolivia, for help in understanding the details of the program being evaluated and of the data. We thank an anonymous referee and the editor Robert Moffit for many useful suggestions. Todd thanks the NSF for support under SBR-9730688.
1 This view is expressed, for example, in the United States Congress’s 1994 stated goal to send every child to school “ready to learn.” Goals 2000: Education America Act.
2 See for example Currie and Thomas (1999), Neal and Johnson (1996).
3 See Glewwe and Jacoby (1995), Alderman et al. (2001), Glewwe, Jacoby, and King (2001), and Martorell (1999) for evidence for Ghana, Pakistan, the Philippines, and Guatemala.
4 See Barnett (1992) for a survey of the findings from evaluations of many different U.S. programs.
5 The Perry Preschool Program spent significantly more per pupil than is typically spent on preschool interventions ($7252/year, over a third more than the Head Start program, for example). Most of this expenditure went to teacher pay; the teachers tended to be highly trained professionals (Sweinhart & Weikart, 1998).
characteristics and the contexts in which they operate could affect the extent and type of benefit from the intervention.

The PIDI program analyzed in this paper provides day-care, nutritional, and educational services to children between the ages of 6 months and 72 months who live in poor, predominantly urban areas. The goals are to improve health and early cognitive/social development by providing children with better nutrition, adequate supervision, and stimulating environments. It is hoped that the program will also ease the transition to elementary school, improve progression through elementary grades, and raise school performance, all of which are expected to increase postschool productivity.

Through PIDI, children attend full-time child care centers located in the homes of women living in low-income areas targeted by the program. These women are given training in child care and loans and grants (up to $500) to upgrade facilities in their homes. Each PIDI center has up to 15 children and approximately one staff member per five children, with additional staff provided when there is a larger proportion of infants. The program provides food to supply 70% of the children’s nutritional needs as well as health and nutrition monitoring and educational activity programs. The program cost has been estimated by Ruiz (1996) to be approximately $43 per beneficiary per month, which is substantial in a country where per capita annual GDP is $800 in exchange-rate-converted pesos, or $2540 in purchasing power parity terms. Approximately 40% of the expenditure goes to the nutritional component of the program (World Bank, 1997).

This paper uses a large nonexperimental data set to assess the impact of the PIDI program on multiple child outcome measures related to health, cognitive development, and psychosocial skill development. As measures of health, we consider standard anthropometric measures: height for age and weight for age. To measure cognitive and psychosocial development, we use children’s scores on a battery of tests of bulk motor skills, fine motor skills, language and auditory skills, and psychosocial skills.

For our study of the PIDI program, the sample size is approximately 10 times larger than the sizes typically observed in experimental evaluations, and the data set is representative of the entire population of program recipients. However, there is self-selection among eligible children into the program, which poses a threat to the validity of the results. Although the comparison group data sets that we use were chosen by a sampling scheme designed to increase comparability with the families in the program, we still find some important differences between the treatment and comparison group families. For example, families with children in the program tend to have lower parental education levels and incomes, a difference that would likely bias the estimated program impacts downward if not taken into account. This source of bias is partly offset by the fact that program participants tend to be older than nonparticipants, which increases their average test scores and anthropometric outcomes. Our analysis shows the importance of carefully taking into account age and family background differences in analyzing the effects of the program.

We use matching methods to control for potential bias due to nonrandom selectivity into the program. One methodological contribution this paper makes to the previous literature on matching is to allow for a continuous dose of treatment (corresponding to the number of months spent in the program), whereas most of the existing literature assumes that treatment is binary or belongs to a discrete set of treatment types. Two of the matching estimators that we use are justified under the assumption that selection into the program is on observables, that is, that it can be taken into account by conditioning on observed family and child characteristics. We also develop an alternative marginal matching estimator that allows selection into the program to be based on unobservables, but assumes that conditional upon having selected into the program, selection into alternative program durations is on observables. An advantage of the marginal estimator is that it only requires data on the treatment group and is thus implementable when no comparison group data are available.

The results show that the program significantly increases cognitive achievement and psychosocial test scores, especially for children who participated in the program for at least 7 months. The impact estimates are fairly robust to the use of alternative comparison groups and estimators. Estimates obtained by the marginal matching estimator tend to be larger, particularly at longer durations and for children aged 6–36 months, than those obtained using traditional econometric estimators that impose stronger functional-form assumptions. Cost-benefit analysis based on our estimates and on evidence from wage studies for developing countries indicates that the PIDI program may have fairly high rates of return.

In section II of the paper, we develop a model of enrollment in preschool that gives an economic interpretation for the average treatment effects that we estimate. Section III describes how we generalize existing matching estimators to accommodate a varying treatment dose as well as impact heterogeneity with respect to children’s ages, and introduces the marginal effect estimators. Section IV provides additional information about the PIDI program and data sets, analyzes the determinants of program participation, and presents the impact estimates obtained by matching and, for comparison, by more standard regression methods. Section V performs a cost-benefit analysis based on our preferred marginal effect estimates and other explicit assumptions regarding subsequent schooling and wage effects. Section VI concludes.

II. A Model of the Preschool Participation Decision and Treatment Effects

We develop a model of the mother’s decision to enroll her child in preschool, which provides a way of interpreting the
110 THE REVIEW OF ECONOMICS AND STATISTICS
treatment effects that will be estimated later in the paper and gives some insight into which conditioning variables should be used in the matching procedure. Our framework assumes that the mother maximizes a time-separable utility function that depends on her own consumption ($C_t^m$) and leisure ($s_t^l$) and on the quality of her child ($q_t$). There is a child quality production function that depends on the mother’s time allocated to child production ($s_t^c$), on the child’s consumption of household monetary resources ($C_t^c$), on whether the child is in preschool, and on stochastic elements. Time not spent in leisure or child quality production is assumed to be spent working at wage $w_t^m$.

To focus only on the most relevant aspects, we abstract from certain considerations. We assume that the father’s only role is to contribute to the asset income $A_t$ of the family, which is consumed in full every time period, and that there is only one child of age $a_t$ for whom the mother is making decisions. $D^*_{P,t}$ is an indicator that takes a value 1 if the mother would choose to enroll the child in the preschool program were the child eligible, $e_t$ is an indicator that equals 1 if the child is eligible. $D_{P,t}$ is an indicator for whether the child is actually enrolled. We assume a fixed cost $K$ to the mother of enrolling in the preschool program and transporting the child to the program site.

The mother’s problem can be expressed as a dynamic programming problem, where the choices at any point in time are whether to enroll the child in preschool, how much time to invest in the child, how much time to spend in leisure, and how much consumption to allocate to the child. The set of period $t$ state variables, denoted by $\Omega_t$, consists of the child’s age, previous-period child quality, mother’s wage, father’s income, and program participation cost.6 The random shocks in the model ($\varepsilon_t^t$, $\varepsilon_t^l$, $\varepsilon_t^q$) are shocks to the value of mother’s consumption and leisure and to the child quality production function. The mother solves

$$V_t(\Omega_t) = \max_{\{C_t^m,\, s_t^c,\, D^*_{P,t},\, s_t^l\}} U(C_t^m, q_t, s_t^l; \varepsilon_t^t, \varepsilon_t^l, \varepsilon_t^t) + E\big(V_{t+1}(\Omega_{t+1} \mid C_t^m, s_t^c, D^*_{P,t}, s_t^l, \Omega_t)\big)$$

$$q_t = q(s_t^c, C_t^c, D_{P,t}, a_t, q_{t-1}) + \varepsilon_t^q, \qquad (1)$$

$$C_t^m + C_t^c + D_{P,t} K \le (1 - s_t^c - s_t^l)\, w_t^m + A_t, \qquad (2)$$

$$D_{P,t} = e_t D^*_{P,t}. \qquad (3)$$

Equation (1) describes the production technology for child quality, where we are assuming that previous-period quality is a sufficient statistic for prior inputs,7 that $\varepsilon_t^q$ represents a shock realized after input decisions, and that the productivity of inputs may depend on the child’s age. For example, preschool could be highly productive for a toddler but not for a six-month-old infant. Equation (2) is the budget constraint, and (3) describes the constraint that preschool is only an option for eligible families. We assume that mothers do not try to influence their children’s eligibility (for example, by changing their labor force participation or by making their children appear undernourished).

The total effect on child quality from participating in preschool from time period $t$ to $t'$ (the treatment effect of switching from state $D_{P,t} = 0$ to $D_{P,t} = 1$ for $t \in \{t, t'\}$) can be expressed in terms of the model. It includes the direct effect that participation has on quality in the current period as well as indirect effects that could occur, for example, if the mother reduces the child’s consumption at home knowing that he/she receives meals at school. Let $s^c_{1,t}$, $s^l_{1,t}$, $C^c_{1,t}$ denote the values that solve the dynamic programming problem when $D_{P,t} = 1$, and let $s^c_{0,t}$, $s^l_{0,t}$, $C^c_{0,t}$ denote the corresponding values when $D_{P,t} = 0$.

The total effect of preschool participation on current-period quality for a particular child of age $a_t$ who starts off at quality level $q_{t-1} = q$ is given by

$$\frac{\Delta q_t}{\Delta D_{P,t}} = q(s^c_{1,t}, C^c_{1,t}, 1, a_t, q) - q(s^c_{0,t}, C^c_{0,t}, 0, a_t, q).$$

The total program effect is inclusive of any compensating changes in the mother’s allocation of time and consumption to the child. In addition, a change in current-period quality levels potentially affects future quality levels. For example, if a child starts off period $t + 1$ at a high quality level, he or she may be better able to take advantage of consumption and time investments. The effect of increasing quality due to program enrollment at time $t$ on future quality levels (at time $t' > t$) is given by

$$\frac{\Delta q_{t'}}{\Delta q_{t'-1}} \frac{\Delta q_{t'-1}}{\Delta q_{t'-2}} \cdots \frac{\Delta q_{t+1}}{\Delta q_t} \frac{\Delta q_t}{\Delta D_{P,t}}.$$

$$\Delta_{t,T} = \sum_{v=t}^{t'} \left\{ \frac{\Delta q_v}{\Delta D_{P,v}} + \sum_{w=v+1}^{T} \frac{\Delta q_w}{\Delta q_{w-1}} \cdots \frac{\Delta q_{v+1}}{\Delta q_v} \frac{\Delta q_v}{\Delta D_{P,v}} \right\}, \qquad (4)$$

where the first term captures the current-period impact and the second term the impact on future quality levels up until some end period $T$.8

In the data analyzed in sections IV and V, we do not observe children over the entire time period of their

6 For simplicity, there is no uncertainty about the wage, father’s income, or participation cost.
7 We make this assumption for simplicity. The assumption that previous-period quality is a sufficient statistic could be relaxed to allow quality to depend on the history of inputs over the child’s lifetime.
8 In writing the treatment effect solely as a function of effects on current and future quality, we are also assuming that there are no effects of anticipation of the program on quality levels prior to the program entry date.
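The accumulation in equation (4) can be illustrated with a small numerical sketch. The function name, the dictionary inputs, and the numbers below are hypothetical and are not taken from the paper; the sketch simply assumes that the per-period direct effects $\Delta q_v / \Delta D_{P,v}$ and the period-to-period carry-over ratios $\Delta q_u / \Delta q_{u-1}$ are known.

```python
from math import prod

def cumulative_effect(direct, persistence, t, t_prime, T):
    """Evaluate the double sum in equation (4).

    direct[v]      stands for dq_v / dD_{P,v}  (effect of enrollment in period v)
    persistence[u] stands for dq_u / dq_{u-1}  (carry-over of quality into period u)
    Both inputs are illustrative assumptions, not estimates from the paper.
    """
    total = 0.0
    for v in range(t, t_prime + 1):
        d_v = direct[v]
        # Effect of period-v enrollment on quality in each later period w <= T.
        future = sum(
            prod(persistence[u] for u in range(v + 1, w + 1)) * d_v
            for w in range(v + 1, T + 1)
        )
        total += d_v + future
    return total

# Hypothetical numbers: two periods of participation, each raising quality by
# one unit, with half of any quality gain persisting into the next period.
direct = {0: 1.0, 1: 1.0}
persistence = {1: 0.5, 2: 0.5, 3: 0.5}
effect = cumulative_effect(direct, persistence, t=0, t_prime=1, T=3)
```

With these made-up inputs the first term of each summand is the direct effect, and the inner sum adds the geometrically decaying carry-over into later periods.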
participation in the program (up to $T$), so we can only estimate the cumulative effect of the preschool treatment up until the time of observation, $t_o$. We assume that the empirical test scores and anthropometric measures available in the data set capture aspects of child quality.9 The mean treatment effect we estimate (conditional on age and duration of time in the program) using the matching estimators described in section III is equal to $E(\Delta_{t,t_o} \mid \text{age} = a,\ t_o - t = l,\ D_P = 1)$, where $D_P = 1$ denotes participating in the program. The treatment effect depends on the production function for quality, on the utility function determining other input levels, and on the distribution of asset and wage income among families.10

Next, we consider the question of how to choose the set of conditioning variables used in matching. Let $q_{0,t}$ be the quality level when the child does not participate in the program at time $t$, and $q_{1,t}$ the quality level when the child does participate. The decision to enroll the child at any time period (which is only relevant for eligible families) implies that at that date, the current utility plus the expected future utility from participating is higher than from not participating:

$$\begin{aligned}
& U(C^m_{0,t}, q_{0,t}, s^l_{0,t}; \varepsilon_t^l, \varepsilon_t^t) + E\big(V_{t+1}(\Omega_{t+1} \mid C^m_{0,t}, s^c_{0,t}, s^l_{0,t}, D^*_{P,t}, \Omega_t)\big) \\
&\quad < U(C^m_{1,t}, q_{1,t}, s^l_{1,t}; \varepsilon_t^l, \varepsilon_t^t) + E\big(V_{t+1}(\Omega_{t+1} \mid C^m_{1,t}, s^c_{1,t}, s^l_{1,t}, D^*_{P,t}, \Omega_t)\big).
\end{aligned}$$

Define the outcomes in the no-program state and in the program participation state as

$$Y_t(a, 0) = q_t \big|_{D_{P,t'} = 0 \text{ for all } t' \le t} \quad \text{and} \quad Y_t(a, l) = Y_t(a, 0) + \Delta_{t,t+l},$$

and suppose that there is available a set of conditioning variables $\tilde{Z}_t$. The cumulative matching estimator described in the next section assumes that

$$E(Y_t(a, 0) \mid \tilde{Z}_t, D^*_{P,t} = 1, t \in t_0 \cdots t) = E(Y_t(a, 0) \mid \tilde{Z}_t, e_t = 1, D^*_{P,t} = 0, t \in t_0 \cdots t).$$

The estimator also requires that $\Pr(D^*_{P,t} = 0 \mid \tilde{Z}_t, e_t = 1) > 0$, so that there is a positive probability of observing both program participants and nonparticipants with the same characteristics $\tilde{Z}_t$. This requirement implies that, conditional on $\tilde{Z}_t$, there must be other variables affecting program participation and that these variables not be correlated with child outcomes in the no-treatment state. For example, suppose that distance to the program site is a determinant of participation and that the placement of program sites can be considered random with respect to child outcomes in the no-treatment state conditional on $\tilde{Z}_t$. If $\tilde{Z}_t$ contains all the elements of the state space except distance, then the above exogeneity condition can be satisfied.

For reasons described later in the paper, it is important not to match on variables that themselves are affected by the program, such as the mother’s labor supply in the model. This is because the matching estimator (described below in section III) integrates over $f(\tilde{Z}_t \mid D^*_{P,t} = 1)$. To estimate correctly the mean no-treatment outcomes, we require that the density of the matching variables do not change with treatment. For this reason, we match on the following: (a) variables observed prior to the enrollment decision (under the assumption that the density of these variables does not change due to anticipation of the program), (b) variables that we expect to be stable over the time period of observation (such as the mother’s and father’s education, the family structure, and the characteristics of the household), and (c) variables that are deterministic with respect to time (such as the child’s age). We do not include variables that directly relate to children’s physical, mental, and social development.

III. Cumulative and Marginal Matching Estimators

As discussed in the previous section, we are interested in estimating the treatment effect $\Delta_{t,t_0}$ [defined in equation (4)], which gives the total effect of the preschool program on child quality for a child that participates in the program for a duration $l = t - t_0$. We next describe the estimators we use and the assumptions required to justify their application. We go beyond the previous literature on matching by allowing for a continuous dose of treatment (given by the duration of time spent in the program), by permitting impacts to depend in a flexible way on children’s ages, and by developing a marginal matching estimator that can be implemented if data on program participants are the only data available.

Let $Y(a, l)$ denote the outcome measure intended to capture an aspect of child quality (test score or anthropometric measure) for a child of age $a$ who participated in the program for length of time $l$.11 For nonparticipants, $l = 0$. Also define $D_P = 1$ if $l > 0$, $D_P = 0$ otherwise. For a child of age $a$, the cumulative impact of participating in the program $l$ time periods relative to not participating is given by $\Delta(a, l, 0) = Y(a, l) - Y(a, 0)$. Also of interest is the marginal effect of participating in the program $l_1$ time

9 Preschool investments could increase the amount learned in school and lead to higher quality in elementary school years, but these benefits will not be captured by our estimation approach, due to the data limitation posed by not observing children at these later ages. In the cost-benefit analysis of section V, we will briefly consider these other sources of benefits.
10 Knowing the treatment effect of the program does not allow recovery of the parameters of the production technology. Only under the strong assumption that parents do not alter their time or resource allocations when their child participates in the program would the treatment effect correspond to a feature of the production technology (see related discus-
11 We assume participation takes place in consecutive time periods, as it
periods relative to $l_0$ time periods: $\Delta(a, l_1, l_0) = Y(a, l_1) - Y(a, l_0)$. Neither of these program impacts is directly observable, because every child in the program is observed for a single duration at each age and no child is observed simultaneously in and out of the program at the same age. Because of this missing data problem, we do not attempt to estimate the full distribution of treatment impacts. We focus instead, firstly, on the problem of estimating average treatment impacts, and secondly, on the problem of estimating marginal treatment impacts, in both cases conditional on age and duration of exposure to the program. The average program impact for children of age $a \in A$ who participated $l_1$ time periods as opposed to $l_0$ (where $l_0$ could equal 0) is given by

$$\Delta(A, l_1, l_0) = \frac{\int_{a \in A} [Y(a, l_1) - Y(a, l_0)]\, f_a(a \mid l = l_1)\, da}{\int_{a \in A} f_a(a \mid l = l_1)\, da},$$

where $f_a(a \mid l = l_1)$ is the conditional density of ages and $A$ can be a singleton set or a range of ages.

Integrating over the joint density of observed ages and program durations gives the overall impact of the program relative to the counterfactual of participating for length of time equal to $l_0$:

$$\Delta(A, L, l_0) = \frac{\int_{l \in L} \int_{a \in A} [Y(a, l) - Y(a, l_0)]\, f_{a,l}(a, l)\, da}{\int_{l \in L} \int_{a \in A} f_{a,l}(a, l)\, da},$$

[...] distance, according to some metric, between a set of their characteristics that constitute the matching variables. By matching on the characteristics of the treatment group, the method effectively aligns the distribution of observables of the comparison group with that of the treated group.13

The identifying assumption that justifies the matching estimator that we use to estimate $\Delta(A, l, 0)$ is that there exist a set of conditioning variables $x$ such that

$$E(Y(a, 0) \mid a, l_i = l, x) = E(Y(a, 0) \mid a, l_i = 0, x) \qquad (5)$$

and

$$0 < f(a, x \mid D_P = 0). \qquad (6)$$

As discussed in section II, the first condition implies that after conditioning on a set of observed characteristics $\{a, x\}$, no-treatment outcomes for children who have participated for duration $l$ will be on average the same as those observed for children who have not participated ($D_P = 0$). The second condition ensures that for each child in the participant group there is positive probability of finding a match from the nonparticipant group.14 Let $S_P = \{(a, x) : f(a, x \mid D_P = 1) > 0 \text{ and } f(a, x \mid D_P = 0) > 0\}$ denote the region of the support of $(a, x)$ that satisfies equation (6), called the region of overlapping support.15

Under the above conditions, $\Delta(A, L, 0)$ can be estimated by [...]

12 Matching methods have been developed and applied to the evaluation of training programs by Heckman, Ichimura, and Todd (1997), Heckman, Ichimura, Smith, and Todd (1998), Dehejia and Wahba (1999), Smith and
[...] experiment in which the characteristics of the treatment and comparison groups are aligned by virtue of randomization.
14 See Rosenbaum and Rubin (1983). Under both conditions, treatment is termed strictly ignorable. If there are some (a, X) values for which the second support condition fails, then treatment impacts cannot be estimated by the method of matching for individuals with those characteristics.
15 See Heckman, Ichimura, and Todd (1997) for discussion of support
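A discrete analogue of the age-weighted average impact $\Delta(A, l_1, l_0)$ and of the overlapping-support region $S_P$ can be sketched as follows. This is only an illustration under stated assumptions: the function names, the dictionary of pointwise impacts, and the cell labels are hypothetical, and ages and characteristics are treated as discrete cells rather than through the densities used in the text.

```python
def average_impact(ages_l1, pointwise_impact, age_set):
    """Average of pointwise impacts Y(a, l1) - Y(a, l0) over ages in age_set,
    weighted by the empirical age distribution of children with duration l1."""
    kept = [a for a in ages_l1 if a in age_set]
    if not kept:
        raise ValueError("no children with duration l1 fall in the age set")
    return sum(pointwise_impact[a] for a in kept) / len(kept)

def overlap_support(participant_cells, comparison_cells):
    """Cells (a, x) observed in both groups: a discrete stand-in for S_P."""
    return set(participant_cells) & set(comparison_cells)

# Hypothetical inputs: ages in months of children observed with duration l1,
# and made-up pointwise impacts at each age.
ages_l1 = [12, 12, 24, 36]
impact_by_age = {12: 1.0, 24: 3.0, 36: 5.0}
avg = average_impact(ages_l1, impact_by_age, age_set={12, 24})

support = overlap_support(
    [(12, "low"), (24, "high")],   # (age, characteristic) cells of participants
    [(12, "low"), (36, "low")],    # cells of nonparticipants
)
```

Because two of the three retained children are aged 12 months, that age receives twice the weight of age 24, which mirrors the role of $f_a(a \mid l = l_1)$ in the formula. Participants in cells outside `support` have no comparison match, which is the situation described in footnote 14.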
[...] observations that are close according to the distance metric receive greater weight. The nonparametric estimators we use are local linear regression estimators that have been developed and studied in Cleveland (1979) and Fan (1992). The details of local linear regression estimators are described in appendix B.16

The analogous nonparametric estimator for $\hat{E}(Y(a_i, l_i) \mid x_i, D_{Pi} = 1)$ in equation (7) is

$$\hat{E}(Y(a, l)) = \sum_{k \in \{D_P = 1\}} Y_k(a_k, l_k)\, W_k(\|l_k - l_i\|, \|a_k - a_i\|, \|x_k - x_i\|), \qquad (9)$$

where the weights now additionally depend on the distance between $l_k$ and $l_i$ (allowing the impact of the program to depend on the duration of time in the program).17 Note that in equation (7) averaging is performed in two stages, once in obtaining the nonparametric estimates and again in averaging over the set $\{D_P = 1\} \cap \{a_i \in A\} \cap \{l_i \in L\} \cap \{(a_i, l_i) \in S_P\}$. Because of the second averaging, the average impact estimators over ranges of age and duration values converge at a faster rate than the pointwise (in $a$ and $l$) estimators. The asymptotic theory of Heckman, Ichimura, and Todd (1998) is general enough to accommodate the estimators. In the empirical work, however, we evaluate the variation of the estimators using bootstrap methods rather than variance estimators based on asymptotic formulas.

Estimating Marginal Program Impacts: Instead of (or in addition to) the impact of the program against the benchmark of no program, we may be interested in the marginal treatment effect of increasing duration in the program from $l_0$ to $l_1$: $\Delta(a, l_1, l_0) = E(Y(a, l_1) \mid D_P = 1, x) - E(Y(a, l_0) \mid D_P = 1, x)$, where $l_1, l_0 > 0$. There are two different ways of estimating marginal effects. One is to first estimate cumulative effects at different duration levels and then take the difference. The other way is to use only data on program participants, drawing comparisons between program participants who have taken part in the program for different lengths of time.

1. Marginal impact estimator based on difference in cumulative effects. An estimator of the marginal effect from participating in the program for $l_1$ time periods as opposed to $l_0$ time periods ($l_1 > l_0$) can be obtained as the difference between the two cumulative program effects: $\Delta(a, l_1, l_0) = \hat{\Delta}(a, l_1, 0) - \hat{\Delta}(a, l_0, 0)$. Estimating marginal effects in this way assumes that the group of children observed participating in the program $l_0$ periods provide an appropriate comparison group for the children observed participating $l_1$ periods, an assumption that may not be justified if children are systematically entering or dropping out from the program at different ages. Partly for this reason, we prefer the approach described next, which is the one we take in our empirical work.

2. Marginal impact estimator that only uses data on program participants. An alternative estimation strategy only uses data on program participants and compares outcomes for children of similar ages with different durations. An advantage of this approach over the previous one is that it does not require assumptions on the process governing selection into the program and allows for the possibility that selection into the program is based on unobserved characteristics. However, here we are faced with a different potential source of nonrandom selection: the process governing selection into alternative program durations. For example, four-year-olds who have taken part in the program for three years may be systematically different from four-year-olds who just recently entered the program. Again, matching methods can be used to solve the selection problem relating to the choice of program duration, under the assumption that children who have taken part in the program for different lengths of time can be made comparable by conditioning on observed child and parental characteristics.

In our empirical application to the analysis of the PIDI program, a major determinant of duration in the program is the time at which the program first became available to children, which differs across children depending on the child’s place of residence and child’s age at the time the local PIDI site began its operation. Two-thirds of the children in the PIDI evaluation sample began participating in the program as soon as it became available; on average, the delay between the time the local PIDI site opens and the time children begin participating is 3.2 months. The variation in duration of time spent in the program that arises from variation in when the program became available to children is therefore arguably exogenous with respect to program outcomes, conditional on observed child and parent characteristics.

Formally, the identifying assumption the marginal estimator invokes is that there exists a set of conditioning variables $x$ such that

$$E(Y(a, l_0) \mid l = l_1, x, a) = E(Y(a, l_0) \mid l = l_0, x, a)$$

16 The numbers of observations used in constructing the averages are determined by the choice of bandwidth or smoothing parameter. We use least squares cross-validation to choose these parameters as described in section IV.
17 An alternative approach would be to construct the weighted averages
$$\hat{\Delta}(a, l_1, l_0) = \frac{1}{n} \sum_{i \in \{l_i = l_1\}} \left[ \hat{E}(Y(a, l_1) \mid x_i) - \hat{E}(Y(a, l_0) \mid x_i) \right].$$

In our empirical work, we estimate the conditional probabilities $P(x)$ by logistic regression.18
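As a rough illustration of this step, the sketch below fits a logistic model for $P(x)$ by gradient ascent and converts a fitted probability into the log odds ratio that the paper later uses for matching. The function names, the toy data, and the optimizer settings are illustrative assumptions, not the authors' code or their specification.

```python
import math

def fit_logit(X, d, steps=5000, lr=0.1):
    """Fit Pr(D = 1 | x) = 1 / (1 + exp(-(b0 + b'x))) by gradient ascent
    on the mean log-likelihood. beta[0] is the intercept."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for xi, di in zip(X, d):
            z = beta[0] + sum(b * v for b, v in zip(beta[1:], xi))
            err = di - 1.0 / (1.0 + math.exp(-z))
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity(beta, x):
    """Fitted participation probability P(x)."""
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(p):
    """Log odds ratio log(p / (1 - p)) of a fitted probability."""
    return math.log(p / (1.0 - p))

# Hypothetical toy sample: one binary covariate, participation indicator d.
X = [[0], [0], [0], [0], [1], [1], [1], [1]]
d = [0, 0, 0, 1, 0, 1, 1, 1]
beta = fit_logit(X, d)
p_low, p_high = propensity(beta, [0]), propensity(beta, [1])
```

With a single binary covariate the fitted probabilities should reproduce the cell shares (one in four and three in four in this toy sample), which gives a simple check on the optimizer.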
metric rate.

In our context, by imposing strong conditional independence assumptions, we could apply the above reasoning to $Y(a, 0)$. However, for the purpose of estimating average effects of treatment, the assumption of conditional independence of outcomes and participation status is stronger than necessary (see Heckman, Ichimura, & Todd, 1997). Instead, we assume directly that equation (5) holds when we replace $x$ by $P(x) = \Pr(D_P = 1 \mid a, x)$. The conditional expectations can then be estimated by three- and two-dimensional nonparametric regressions:

$$
\begin{aligned}
\hat{E}(Y(a, l) \mid P(x), D_P = 1) &= \sum_{k \in \{D_P = 1\}} Y_k(a_k, l_k)\, W_k(\|l_k - l\|, \|a_k - a\|, \|P(x_k) - P(x)\|), \\
\hat{E}(Y(a, 0) \mid P(x), D_P = 0) &= \sum_{k \in \{D_P = 0\}} Y_k(a_k, 0)\, W_k(\|a_k - a\|, \|P(x_k) - P(x)\|).
\end{aligned}
\qquad (10)
$$

$$
\begin{aligned}
& \int_{x \in X} E_Y(Y(a, 0) \mid D_P = 1, x)\, f(x \mid D_P = 1)\, dx \\
&= \int_{x \in X} E_Y(Y(a, 0) \mid D_P = 0, x)\, f(x \mid D_P = 1)\, dx \\
&= \int_{x \in X} E_Y(Y(a, 0) \mid D_P = 0, x)\, f(x \mid D_P = 0) \left\{ \frac{f(x \mid D_P = 1)}{f(x \mid D_P = 0)} \right\} dx,
\end{aligned}
$$

where the second equality follows under the assumptions that would justify the application of matching. The last line

18 Heckman, Ichimura, and Todd (1998) and Hahn (1998) consider whether it is better in terms of efficiency to match on $P(X)$ or on $X$ directly. For the treatment on the treated parameter, Heckman, Ichimura, and Todd (1998) show that neither is necessarily more efficient than the other. If the treatment effect is constant, then it is more efficient to condition on the propensity score; but in the general case the answer depends on the mean of the conditional variance relative to the variance of the conditional mean.
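The two kernel-weighted regressions in equation (10) can be sketched as follows. This is only a minimal illustration: it assumes local-constant (Nadaraya-Watson) weights built from a Gaussian product kernel, whereas the text here does not pin down the form of $W_k$, and all names and toy values are hypothetical.

```python
import math

def kernel_mean(points, outcomes, target, bandwidths):
    """Kernel-weighted average of `outcomes` evaluated at `target`.

    points     : list of coordinate tuples, e.g. (duration, age, pscore)
    bandwidths : one smoothing parameter per coordinate (Gaussian kernel assumed)
    """
    num = den = 0.0
    for pt, y in zip(points, outcomes):
        w = 1.0
        for v, t, h in zip(pt, target, bandwidths):
            w *= math.exp(-0.5 * ((v - t) / h) ** 2)
        num += w * y
        den += w
    if den == 0.0:
        raise ValueError("no observations with positive weight near the target")
    return num / den

def cumulative_impact(treated, untreated, a, l, p, h1, h0):
    """Difference between the three- and two-dimensional regressions
    at age a, duration l, and participation probability p."""
    y1 = kernel_mean([(t["l"], t["a"], t["p"]) for t in treated],
                     [t["y"] for t in treated], (l, a, p), h1)
    y0 = kernel_mean([(u["a"], u["p"]) for u in untreated],
                     [u["y"] for u in untreated], (a, p), h0)
    return y1 - y0

# Hypothetical toy data: participants score 2 points higher everywhere.
treated = [{"a": 40, "l": 12, "p": 0.60, "y": 22.0},
           {"a": 44, "l": 14, "p": 0.50, "y": 22.0}]
untreated = [{"a": 41, "p": 0.55, "y": 20.0},
             {"a": 45, "p": 0.45, "y": 20.0}]
impact = cumulative_impact(treated, untreated, a=42, l=13, p=0.55,
                           h1=(4.0, 4.0, 0.2), h0=(4.0, 0.2))
```

The participant regression smooths over duration, age, and $P(x)$, while the comparison regression smooths over age and $P(x)$ only, which mirrors the three- and two-dimensional structure described in the text.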
shows that matching can be seen as a reweighting method, where comparison group observations are reweighted by $f(x \mid D = 1) / f(x \mid D = 0)$. The reweighting accomplished through matching (or through a weighted regression) balances observed characteristics of the treatment and comparison groups. Such a balancing would also occur in a randomized experiment. Traditional regression-based estimators do not attempt to emulate the balancing feature of randomized experiments, but instead control for observable differences between groups by assuming the conditional mean function is correctly specified by the regression equation.19

Selection on Unobservables: The estimators for cumulative program effects described above assume that outcomes are mean-independent of program participation conditional on a set of observables. If the program participation equation can be described by the index model $D_P = 1(\varphi(Z) - V > 0)$, then the matching estimator assumes that $E(Y(a, 0) \mid x, V < \varphi(Z)) = E(Y(a, 0) \mid x)$. This assumption is not likely to be satisfied if unobservables that are related to program outcomes are important determinants of program selection. One option in this case is to use a difference-in-difference (DID) matching strategy that allows for time-invariant unobservable differences in the outcomes between participants and nonparticipants (see Heckman, Ichimura, and Todd, 1997). However, our data do not allow application of this estimator, because program participants are only observed after they already entered the program. As we show below, lack of preprogram (baseline) data is a limitation in the data for our study and makes it difficult to estimate reliably the cumulative effects of the program. However, we can estimate the marginal impact of short versus long durations using the estimators, described earlier in this subsection under "Estimating Marginal Program Impacts," that allow selection into the program to be based on unobservables.

IV. Empirical Results

A. The Data

The PIDI evaluation data sets consist of repeated cross-section data collected in two rounds on three different subsamples: (i) a participating subsample (P) of children

a PIDI site but without any children attending PIDI, and (iii) a comparison group subsample (A) selected from a stratified random sample of households with children in the age range served by PIDI living in poor urban communities comparable to those in which PIDI had been established, but in which PIDI programs had not yet been established at the time of the survey.20 As noted above in section IIIA under "Reducing the Dimension of the Conditioning Problem," the data are choice-based sampled with unknown population weights. For this reason, we do not know the participation rate among all eligibles. Fortunately, this information is not needed for our estimation strategy, but it would be required to implement some other common evaluation approaches.21

We estimate program impacts using both the comparison group samples A and B. Sample B has an advantage over A in being drawn from the same area as the participant sample P, which controls for unobserved local community effects that may affect children's outcomes. However, sample B families elected not to participate in the program, so the outcomes observed for B children may not be directly comparable with those for P children. Sample A combines data on families that would have participated in the program had the program been available as well as data on families that would not have participated. Finally, to estimate marginal program impacts, we compare children in the participating sample P who had been in PIDI for two or more months with children in P who had been in PIDI for one month or less.

All the children in sample P meet the eligibility criteria that are summarized below in section IVB, but children in the comparison samples A and B do not necessarily meet the criteria. In our application of the matching estimators, we only use subsamples of children from the samples A and B who satisfy the eligibility criteria.22 As described below, there was a change in the eligibility criteria over time. We use the later criteria rather than the earlier ones, because the earlier ones included subjective aspects, the application of which we cannot duplicate with much confidence. The first and most important (at least in the lexicographical ordering sense) of the original criteria is a child characteristic, being malnourished, that the program is attempting to affect directly. Because we do not have baseline data on children

and $f(x \mid D_P = 0) > 0$, and it assigns zero weight in estimation to comparison group observations for which $f(x \mid D_P = 1) = 0$ but $f(x \mid D_P = 0) > 0$. In contrast, regression estimators typically use all the observations in estimation and use functional-form assumptions to extrapolate over any regions of $x$ where the supports do not overlap.

20 Stratification is based on information given in the 1992 Bolivian

sample and include everybody in the program participation model, with an indicator variable for whether persons are eligible for the program. Ineligible persons would have a predicted probability of participating in the program equal to 0 and would therefore be excluded in the matching analysis by the support restriction.
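The reweighting interpretation of matching, and the zero weight given to comparison observations outside the participants' support, can be checked on a small discrete example. The sketch below is illustrative only: the covariate cells, outcomes, and participant shares are made-up numbers, and the empirical cell shares stand in for $f(x \mid D_P = 0)$.

```python
from collections import Counter

def reweighted_mean(comparison, f_treated):
    """Mean comparison-group outcome after reweighting by f(x|D=1) / f(x|D=0).

    comparison : list of (x, y) pairs for the comparison group, x discrete
    f_treated  : dict mapping x to its share among participants
    """
    counts = Counter(x for x, _ in comparison)
    n0 = len(comparison)
    num = den = 0.0
    for x, y in comparison:
        f0 = counts[x] / n0                # empirical f(x | D = 0)
        w = f_treated.get(x, 0.0) / f0     # zero weight off the participants' support
        num += w * y
        den += w
    return num / den

# Hypothetical comparison group with two covariate cells.
comparison = [("poor", 8.0), ("poor", 10.0),
              ("rich", 14.0), ("rich", 16.0), ("rich", 15.0), ("rich", 15.0)]
f_treated = {"poor": 0.75, "rich": 0.25}

result = reweighted_mean(comparison, f_treated)
# Direct calculation: cell means (9 and 15) averaged with participant shares.
direct = 0.75 * 9.0 + 0.25 * 15.0
```

Both routes give the same counterfactual mean, which is the content of the last equality in the integral expression: reweighting the comparison sample is equivalent to integrating the comparison-group conditional mean over the participants' covariate distribution.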
in P, we cannot infer their preprogram nutritional status and, in particular, whether they were malnourished at the time of entry into the program. Thus, we are aware of at least one important omitted variable that likely affects both program entry and program outcomes, particularly the anthropometric outcomes.

Table 1 shows the sample sizes in groups P, A, and B for the first and second rounds with and without imposing eligibility on the samples. The first round of data consists of 1198 participant (P) children, 1227 A children, and 628 B children interviewed between November 1995 and May 1996.23 The second round consists of a follow-up sample from the first round and, in addition, a larger sample of new households that were not visited in the first round. The second round includes 2420 participant children, 2205 group A children, and 1732 group B children who were interviewed between November 1997 and May 1998. Imposing the eligibility criteria on the comparison group samples leads to a substantial reduction in the sample sizes, roughly cutting them in half. The numbers of children observed in both rounds are 364 participants in group P, and 745 group A and 392 group B children.24

B. Eligibility Criteria

To participate in PIDI, families are required to meet eligibility criteria.25 The initial eligibility requirements were that candidates would be taken who were 6–72 months of age living in the poor urban communities selected by the program according to whether they met the following criteria (in order): (1) malnourished children, (2) children with working parents at risk of lack of supervision, (3) children who had been maltreated, (4) children who lived with only one parent or another relative, (5) children with four or more siblings, and (6) younger children. These criteria were supposed to be applied lexicographically and were in part subjective (particularly the first and third), which introduces a random element in who participates in the program, even after conditioning on observed characteristics. The initial criteria subsequently were replaced by a more objective eligibility index that awards one point if the family has (a) no running water in the household, (b) no sewer system, (c) no more than two rooms in addition to the bathroom and kitchen in the house, (d) no bathroom or latrine in the household, (e) no separate kitchen, (f) more than four children, (g) a mother with five grades or less of schooling, and (h) an unemployed father. Two points are awarded if (a) the family has only a mother or a father or (b) the mother of the family works outside the household. A total of six points are required to be eligible for the program. The second index has fixed weights rather than the lexicographical one used initially. It also focuses more on household characteristics and does not include the more subjective aspects of the previous one, such as children being malnourished or maltreated. Nevertheless, in some general sense both the original and the current criteria attempt to identify children from poor socioeconomic families with limited provision of home child care.26

C. Variables

The PIDI data sets provide detailed information on parental, household, and child characteristics. There is information, for example, on income sources, educational attainment, parental occupations, fertility and reproductive histories, family structure, and possession of durable goods. For all children in the sample households between 6 and 72 months of age, there are data on cognitive, psychosocial, and anthropometric test score measures. The outcome measures that we examine in this paper are the following: (i) bulk motor skills, (ii) fine motor skills, (iii) language-auditory skills, (iv) psychosocial skills, (v) height-for-age percentile, and (vi) weight-for-age percentile.27

23 The sampling frame was a stratified random sample. First PIDI sites were randomly sampled, and then children within the sites were selected randomly.

24 In the first round, there are 1198 children in PIDI. Because of
participating in PIDI at the time of the second-round data collection, 268 were too old for PIDI (had graduated from the program), and 104 were no longer participating in the program. Thus, we estimate the program dropout rate among the children who were followed in the second round to be approximately 23%.

25 Once they were determined to be eligible, they could not become ineligible for the program even if some of their characteristics changed over time.

likely to become eligible for the program if their mothers work, even if that implies more family income, ceteris paribus.

27 We also explore whether there are effects on the lower tails of the anthropometric distributions, explicitly on a height Z score below a threshold of −3 and a weight Z score below a threshold of −2, where the different thresholds reflect the relative severity of the nutritional problems in this population. (Z scores give the number of standard deviations from the mean. They are widely used in the nutrition literature to characterize
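The revised eligibility index described in section IV B is a simple point count, so it can be written down directly. The sketch below is an illustration under stated assumptions: the field names are hypothetical, and "a total of six points are required" is read as a score of six or more.

```python
from dataclasses import dataclass

@dataclass
class Household:
    # Hypothetical field names for the items listed in the text.
    no_running_water: bool
    no_sewer: bool
    rooms_excl_bath_kitchen: int
    no_bathroom_or_latrine: bool
    no_separate_kitchen: bool
    num_children: int
    mother_grades: int
    father_unemployed: bool
    single_parent: bool
    mother_works_outside: bool

def eligibility_points(h):
    """One point for each of items (a)-(h), two points for each of the last two items."""
    one_point = [
        h.no_running_water,
        h.no_sewer,
        h.rooms_excl_bath_kitchen <= 2,
        h.no_bathroom_or_latrine,
        h.no_separate_kitchen,
        h.num_children > 4,
        h.mother_grades <= 5,
        h.father_unemployed,
    ]
    two_points = [h.single_parent, h.mother_works_outside]
    return sum(one_point) + 2 * sum(two_points)

def is_eligible(h, threshold=6):
    """Assumes eligibility means reaching at least `threshold` points."""
    return eligibility_points(h) >= threshold

family = Household(True, True, 2, False, False, 3, 8, False, True, True)
other = Household(True, True, 2, False, False, 3, 8, False, False, False)
```

In this toy case the first household scores 7 points (three one-point items plus both two-point items) and is eligible, while the second scores 3 and is not. Applying such a rule to samples A and B is what restricts the comparison groups to eligible children.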
The first three are measures of cognitive skills, the fourth is a psychosocial outcome, and the last two are anthropometric measures.28 The test score outcomes (i), (ii), (iii), and (iv) are highly significantly correlated with each other, with a statistically significant Kendall tau coefficient of 0.8–0.9 for each of the pairwise correlations. Height and weight percentile measures are less strongly positively correlated (Kendall tau 0.43). Height-for-age percentile is only slightly positively correlated with the test score outcome measures (with a Kendall tau coefficient equal to 0.06 for each of the test score measures). The pairwise correlations between the weight-for-age percentile and the test score outcomes are all insignificantly different from 0.

D. Comparison of Group Mean Characteristics

In table 2, we compare the characteristics of the parents and of the households for children participating in PIDI (group P) with those of nonparticipating children (in groups A and B), with and without imposing eligibility on the A and B samples and with group P subdivided by duration of program participation between 1 month or less and 2 months or more.

Panel A of the table compares characteristics of the mothers. Approximately 8% of mothers in the PIDI group have no education and cannot read or write, which is similar to the rate for mothers in the B sample (eligible and total) but slightly higher than for the A sample (eligible and total).29 PIDI mothers are also more likely to participate in the labor force, but much of this difference is eliminated by imposing program eligibility criteria on samples A and B. A comparison of the incomes shows that PIDI mothers have lower incomes even though they work on average more hours per day. Among PIDI mothers in the two duration subsamples (shown in the last two columns) there are no significant differences.

Panel B compares characteristics for the fathers. Fathers' educational levels are also lower in the PIDI group than in group A (24% with basic or no education, compared to 16%) but about the same as in group B (26%). Fathers in the PIDI group have less stable employment and are more likely to be employed in occasional work than are fathers in the other groups; but if the eligibility criteria are applied, there is a reversal in these comparisons. Imposing eligibility also makes the average income levels similar across groups. Within the P group there are no significant differences for the two subsamples defined by program duration (last two columns).

Panel C compares other characteristics of the household and reveals differences in family structure across groups: PIDI households are less likely to have both parents residing in the household, and they have lower total household and per capita income. Group differences are reduced substantially when the eligibility criteria are imposed.

In summary, in terms of the observed mothers', fathers', and other household characteristics, the total A and B samples tend to be economically better off than the P sample.30 Applying the eligibility criteria makes the comparison samples based on A and B much more similar to group P, though groups A and B still probably on the whole have more resources. Subdividing the P sample into subsamples for 1 month and less versus 2 months or greater duration leads to no significant differences in the subsample means, with the single exception of greater participation in outside organizations by households with greater duration.

Child Characteristics: A comparison of the age distribution for PIDI participating children with children in groups A and B reveals major differences, with PIDI participants tending to be much more concentrated in middle age ranges (30–55 months). Figure 1 compares children in the eligible B, eligible A, and P groups with respect to weight, height, and four test-score outcome measures, conditioning only on age. From the figure, it is apparent that PIDI children older than 12 months are short for their age. For weight, there are no discernable group differences. The test-score comparisons do not show any distinct advantage or disadvantage for children in the PIDI group. Of course, these findings could be consistent with a positive effect of the program because PIDI families tend to have lower incomes, lower parental education levels, and less stable family structure, which are all characteristics that we might associate with inferior child nutrition and test score outcomes.

If we divide the P sample by length of time spent in the program, the results are suggestive of a positive impact of the program for children who have been in the program for some time. Figure 2 plots the outcome measures for group A and group B eligible children and for P children who have participated at least 13 months. The PIDI group appears to do better on average in cognitive test scores than children from the other groups, but this difference is not necessarily attributable to the program; it may be due to preexisting

the degree of malnutrition, with a Z score < −2 indicating moderate malnutrition and < −3 indicating severe malnutrition.) Z scores are increasingly used in the economic literature on the determinants of and impact of malnutrition (see the survey in Strauss & Thomas, 1998). We report these estimates in the text, but for the sake of brevity do not present them in tables.

28 The cognitive outcomes and psychosocial outcomes are measured by
the data sets.

29 44% of PIDI mothers have only basic or no education, compared with 34% of group A and 46% of group B mothers (47% and 61% if A and B are limited to those eligible).

ment literature about how well social programs are successfully targeted to the poor (for example, van de Walle & Nead, 1995). These comparisons suggest some success in targeting PIDI towards the poorer households in poor communities.
* Includes both rounds of data, but excludes observations from second round who were also included in first round.
† Someone in household participates in neighborhood organizations.
differences between program participants and nonparticipants.

E. Determinants of Program Participation

The probability of program participation plays an important role in estimating program effects by the matching method as described in section IIIA. The mean comparisons in table 2A–C indicate that groups A, B, and P differ along several dimensions that could be relevant to the program participation decision. We estimate a logistic model for the probability of participating in the program using group P and the eligibles in group B, the two groups that selected into and out of the program. Our selection of variables is guided by the theoretical model presented in section II that indicated that father's income, child's age and child's preprogram quality status are potential determinants of participation.31 Information is available on all these variables, except preprogram quality (see discussion below). We select the particular set of included regressors for the logistic model from those shown in table 2 to maximize the percentage correctly classified by the hit-or-miss criterion. Under the resulting model, 79% of the observations are correctly classified.32 The included regressors are listed in appendix C. The most useful predictors of participation are (i) presence of a mother in the household, (ii) education level of the mother, (iii) number of children, (iv) education level of the father, and (v) monthly income of the father.

For group A, it is impossible to know which families would have elected to participate in the program had the program been available to them. However, under the assumption that the same participation process governs decisions for group A as for group B, we can impute probabilities of participation for group A families using the coefficients from the participation model that was estimated on groups P and B.33

The first column of figure 3 plots the log odds ratio for participating children and for eligible children in the B and A groups. For groups P and B the supports of the log odds ratios overlap, but if group A is used as a comparison group,

31 Mother's labor force participation was not included in the participa-

33 This requires assuming that there are no significant unobserved locality characteristics affecting outcomes, so that a similar model for participation for groups P and B can also be applied to group A.
some high values of the log odds ratio are observed for program participants for which no matching values can be found for children in the A group. This limits the range of values over which treatment impacts can be estimated.

Our estimates of the marginal effects of longer durations in the program are based on the survival probability corresponding to the probability that duration in the program is 2 months or more.34 Appendix C lists the set of regressors we used for this model (chosen using the hit-or-miss criterion with a correct classification rate of 76%). The log odds ratio of the survival probabilities is plotted in the second column of figure 3.35

F. Impacts Estimated by Traditional Regression Methods

Before presenting impact estimates based on the matching estimators, we first report for comparison estimates that are obtained by simple regression estimators. First, we estimate a simple cross-sectional regression model for the three cognitive development tests, the psychosocial ability test, and the two anthropometric indicators, based on the combined P and eligible B samples. Our specification includes as independent variables a dichotomous variable for participation in PIDI, a cubic in duration in PIDI, a cubic in the child's age, the child's sex, and a set of conditioning variables that is the same as used in estimating the probability of program participation, as described in the previous section.36 Figure 4 plots the estimated program impact as a function of duration in the program.

The figure shows that estimated program impacts on test scores are mostly positive and on the order of one additional answer correct (out of a possible 32). For the anthropometric outcomes, we find the counterintuitive result of a negative impact of the program on weight and on height. We do not find these estimates to be credible, because large negative impacts of the program on anthropometrics immediately upon program entry (as indicated by the estimated negative impact of PIDI participation on the intercept) are extremely unlikely, which suggests that the regression models may be misspecified. One potential source of misspecification is that program impacts may depend in a nonadditive way on age and program duration. The matching estimators described below nonparametrically estimate the nature of the dependence.

We also consider estimates based on a DID estimator for the much smaller subset of children who are observed in both sample rounds in P and in eligible group B (see table 2 for sample sizes, and appendix E, table E2, for the coefficient estimates). The estimates are imprecise due to the substantially reduced sample sizes. In this case, the estimates suggest that the effects of the program are negative for all outcomes except the height-for-age percentile (see figure 5).

G. Cumulative Impacts Estimated by the Method of Matching

We next describe estimated cumulative program impacts based on the matching estimators developed in section III, first conditional on age only and then conditional on both age and duration in the program. We also present results on the marginal impacts. In implementing the matching estimators, we choose bandwidth values by the least squares cross-validation (LSCV) method, which searches over a grid of possible bandwidth values and chooses the values that minimize the integrated squared error of the nonparametric estimators.37

Table 3 compares the conditional-on-age difference in raw means of the outcome measures with the mean program impacts estimated by the cross-sectional matching estimator given in equation (7). Each age interval represents a quintile of the participant age distribution. The "Mean Diff." column shows the difference in raw mean outcomes, the "Mag" column shows the estimated program impact obtained by the matching method, and the "%" column shows the estimated program impact as a percentage of the average outcome measure for the comparison group children in the relevant age range. In parentheses, we report bootstrapped standard errors of the estimates.

The test score impacts are almost all positive for children aged 37–58 months. For this age group, the program is estimated to increase test scores by roughly one additional correct item, which is 3%–4% of the average score within age classes of the untreated group. Although this impact may seem modest in magnitude, it is worth noting that the recently evaluated Tennessee class size experiment, widely acclaimed in the United States as a successful program, found an increase in test score outcomes of only 6% (Krueger, 1998). With regard to the anthropometric measures, the program impacts are imprecisely estimated.

Table 4 reports estimated impacts conditional on specific age and duration ranges. The estimates are obtained by first estimating mean impacts at each age and duration value

34 This is a version of the estimator described in section IIIA under "Estimating Marginal Program Impacts" that integrates over the observed program durations greater than or equal to 2 months.

35 When we use the survival probability calculated only using the data on program participants, there is no need to use the odds ratio in matching, as there is no choice-based sampling problem. However, for convenience we also match on the log odds ratio for this group.

36 For brevity, the regression estimates are shown in Appendix E (which is available upon request from the authors) in table E1. The model explains considerable shares of the variance in the four test scores (84% or more) but much less of the sample variance in the anthropometric indicators (approximately 4%). Family background characteristics are found to be significant determinants of all the child outcomes (the family background variables are highly significant, and F-tests reject the null that they are insignificant at conventional significance levels).

37 The grid is three-dimensional for estimating the expectation conditional on $D_P = 1$, and two-dimensional for estimating the expectation conditional on $D_P = 0$. The values over which we searched are 1.0 through 16.0 for the log odds ratio with a step size of 1, 1.0 through 28.0 for age with a step size of 1 month, and 1.0 through 28.0 for duration with a step size of 1 month.
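A grid search of the kind described for LSCV can be sketched in one dimension as follows. This is a simplified illustration, not the authors' procedure: it assumes a Gaussian local-constant smoother and uses the leave-one-out squared prediction error as the cross-validation objective, and the data and grid values are made up.

```python
import math

def loo_prediction(xs, ys, i, h):
    """Kernel-weighted prediction at xs[i] that leaves observation i out."""
    num = den = 0.0
    for j, (x, y) in enumerate(zip(xs, ys)):
        if j == i:
            continue
        w = math.exp(-0.5 * ((x - xs[i]) / h) ** 2)
        num += w * y
        den += w
    return num / den if den > 0.0 else None

def lscv_score(xs, ys, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    total = 0.0
    for i, y in enumerate(ys):
        pred = loo_prediction(xs, ys, i, h)
        if pred is None:
            return float("inf")
        total += (y - pred) ** 2
    return total / len(ys)

def choose_bandwidth(xs, ys, grid):
    """Pick the grid value with the smallest cross-validation score."""
    return min(grid, key=lambda h: lscv_score(xs, ys, h))

# Hypothetical data: outcome rises linearly with age in months.
ages = [float(a) for a in range(20)]
scores = [2.0 * a for a in ages]
best_h = choose_bandwidth(ages, scores, [1.0, 4.0, 16.0])
```

In the paper's setting the search is two- or three-dimensional (log odds ratio, age, and duration), so the grid would be a product of the one-dimensional grids listed in footnote 37 rather than a single list.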
FIGURE 4.—ESTIMATED PROGRAM IMPACTS FROM CROSS-SECTIONAL, CUBIC-IN-DURATION MODEL, SAMPLES P AND B
observed in the data and then taking averages over the individual impacts within each age-duration class.38 The results indicate that average impacts increase as length of exposure to treatment increases. Impacts are almost always positive for children who have participated in the program for at least 13 months (with only two exceptions, both for children under 36 months old who have participated 25+ months) and roughly twice the magnitude of the overall average impacts reported in table 3. They tend to be larger than those found under the cubic specification of section IV C. The bottom two panels of table 4 show results for the anthropometric measures, which are insignificantly different from 0.

We carried out a similar analysis using as a source of comparison group data the sample of children living in a geographic area not served by the program (group A described in section II). Tables 5 and 6 report the estimated program impacts in a format identical to the previous two tables. The impact estimates on test scores are generally positive over all age ranges for durations of exposure of at least 7 months. The estimates are more widely statistically significant for the A comparison group than for the B group;

38 Averages are therefore self-weighting by the joint duration and age density.
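The averaging step behind the age-duration tables, and the self-weighting property noted in footnote 38, can be illustrated with a short sketch. The tuple layout, class boundaries, and impact values below are hypothetical.

```python
def class_average(point_impacts, age_range, dur_range):
    """Average the pointwise impact estimates over observations in one class.

    point_impacts : list of (age, duration, impact), one entry per observed
                    participant, so frequent (age, duration) combinations
                    automatically receive more weight
    age_range     : (min_age, max_age) in months, inclusive
    dur_range     : (min_duration, max_duration) in months, inclusive
    """
    values = [impact for age, dur, impact in point_impacts
              if age_range[0] <= age <= age_range[1]
              and dur_range[0] <= dur <= dur_range[1]]
    return sum(values) / len(values) if values else None

# Hypothetical pointwise estimates: two children share the (33, 5) cell region.
point_impacts = [(30, 3, 0.5), (32, 5, 1.5), (33, 5, 1.5), (50, 14, 2.0)]
young_short = class_average(point_impacts, age_range=(25, 36), dur_range=(1, 6))
```

Because each observed participant contributes one term, no explicit density weights are needed: the class average is weighted by the empirical joint distribution of age and duration.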
they are significant over half the time for children aged 36 months or less, and almost always for children aged 59+ months. With regard to the anthropometric measures, we find statistically significant negative impacts on weight for short durations of exposure. We think it unlikely that the program could decrease children's weight over a short time interval by as much as estimated. Rather, the estimated negative impacts on weight are probably evidence of bias arising from selection into the program on unobservables that is not taken into account by matching. An important unobservable is preprogram nutritional status, on which, as noted in section IV B, the initial program eligibility criteria placed primary emphasis.

H. Marginal Program Impacts Estimated by the Method of Matching

Because preprogram nutritional status represents an important unobservable, our preferred estimates are those for the marginal impact of the program for different durations of participation obtained using only data for program participants (P). These estimates use participants with shorter durations as the comparison group for participants with longer durations and use matching to control for differences in child characteristics that affect the program duration rather than the participation decision. As described in section IIIA, the marginal estimator allows selectivity into
TABLE 3. COMPARISON OF DIFFERENCE IN RAW MEANS AND CUMULATIVE MEAN PROGRAM IMPACTS AFTER ADJUSTING FOR SELECTIVITY INTO THE PROGRAM USING MATCHING METHOD: SAMPLES P AND B
TABLE 5. COMPARISON OF DIFFERENCE IN RAW MEANS AND CUMULATIVE MEAN PROGRAM IMPACTS AFTER ADJUSTING FOR SELECTIVITY INTO THE PROGRAM USING MATCHING METHOD: SAMPLES P AND A
TABLE 7. COMPARISON OF DIFFERENCE IN RAW MEANS AND CUMULATIVE MEAN PROGRAM IMPACTS AFTER ADJUSTING FOR SELECTIVITY INTO THE PROGRAM USING MATCHING METHOD: GROUP P, DUR. ≥ 2 AND DUR. ≤ 1
the program to be based on unobservables, but assumes that children with longer durations can be made comparable to children with shorter durations by conditioning on observables.
Table 7 presents marginal impact estimates in a parallel format to table 3. The test score impacts are mostly positive for children of all age ranges. The marginal estimates indicate generally somewhat larger effects than do the average estimates and also are suggestive of positive benefits at younger ages. For the anthropometric indicators, the marginal program effects on mean weight-for-age percentile and mean height-for-age percentile are positive for over half the age ranges, although none of the estimates are statistically significant.
Table 8 shows estimated impacts conditional on age and duration in the program. The test score results suggest increasing marginal impacts with greater program exposure. The estimates are mostly positive and tend to be larger than the overall average marginal impacts in table 6 for children who have participated in the program for at least 6 months. For children aged 6–36 months, the estimated impacts on height and weight percentiles are also generally positive for different durations, except for children older than 36 months, for whom the height estimates are negative. For younger children, the height estimates for short durations are surprisingly large and positive.39 For weight percentiles the marginal effect estimates are more credible than the cumulative estimates (table 4). These comparisons suggest that the first criterion for selecting children into the program (malnourishment) focused on low weight and not low stature.
V. Cost-Benefit Analysis
So far we have considered only the problem of estimating the benefits of the program. Next we consider whether the benefits outweigh the costs, which have been estimated to be approximately $43/month per child enrolled (=$516/year) by Ruiz (1996). We focus here exclusively on benefits in terms of earnings. There are four channels that we consider by which the preschool program can affect lifetime earnings: (1) by increasing cognitive skills as an adult (conditional on grades completed), which directly affects earnings, (2) by increasing physical stature as an adult, which directly affects earnings, (3) by increasing the number of grades completed, which directly affects earnings and the age a of school completion, and (4) by decreasing the age of school completion without changing the number of grades completed. For the program to have an impact through channels (3) and (4), we are assuming that improved cognitive skills and nutrition as a child facilitate earlier entry into school, lessen repetition rates, and lead to more grades completed. Appendix D summarizes empirical evidence on the importance of these four channels from the experience of developing countries.
As our data do not provide information on how higher cognitive skills and better nutrition affect adult earnings and we are unaware of any such estimates for Bolivia, we draw on estimates from previous studies on other developing countries. One is a study by Strauss and Thomas (1997) that analyzes the relationship between adult earnings and height, body mass index (BMI), caloric consumption, protein consumption, and education for male workers in a neighboring Latin American country, Brazil. It finds that a 1% increase in height leads to a 2.4% increase in adult male earnings, in a regression of log hourly wages on height and years of education.40 To our knowledge, there has been no research on the cognitive-skills–earnings relationship specifically for Latin American workers, so we base our cost-benefit analysis on a study by Alderman et al. (1996) of the cognitive-skills–earnings relationship for male workers in Pakistan,
39 A possible explanation for this result, which unfortunately the lack of preprogram data makes it difficult for us to explore, is that parents tended to enroll their young children only when they considered them to be sufficiently mature and that their assessment of their child’s maturity was
40 Their study uses a normal bias correction to control for selectivity into
TABLE 8.—ESTIMATED MARGINAL IMPACTS BY DURATION AND AGE CLASSES—SAMPLE P, DUR. ≥ 2 AND DUR. ≤ 1
Age in    Duration (months)
Months    2–6             7–12            13–18           19–24           25+
Bulk Motor Skills
6–24      −0.20 (−1%)     −0.12 (−1%)     0.80 (6%)*      ·               ·
25–36     −0.14 (−1%)     0.32 (2%)       0.65 (3%)*      0.56 (3%)       −0.86 (−4%)
37–41     −0.04 (−0%)     0.15 (1%)       0.20 (1%)       0.34 (1%)       0.37 (2%)
42–58     −0.24 (−1%)     0.44 (2%)       0.93 (3%)*      0.77 (3%)*      1.09 (4%)*
59+       0.23 (1%)       0.59 (2%)       0.69 (2%)*      0.46 (2%)       0.94 (3%)*
Fine Motor Skills
6–24      0.05 (0%)       0.32 (2%)       0.93 (7%)*      ·               ·
25–36     −0.09 (−0%)     0.08 (0%)       0.28 (1%)       0.10 (1%)       −0.83 (−4%)*
37–41     −0.06 (−0%)     −0.08 (−0%)     0.19 (1%)       −0.01 (−0%)     −0.26 (−1%)
42–58     0.34 (1%)*      0.98 (4%)       1.33 (6%)*      1.08 (5%)*      1.24 (5%)*
59+       0.11 (0%)       0.76 (3%)       0.90 (3%)       1.04 (4%)*      1.20 (4%)*
Language and Auditory Skills
6–24      0.10 (1%)       0.51 (4%)       0.95 (8%)*      ·               ·
25–36     0.07 (0%)       0.42 (2%)*      0.54 (3%)*      0.46 (3%)       −0.62 (−4%)
37–41     −0.18 (−1%)     0.28 (1%)       0.51 (2%)       0.27 (1%)       −0.13 (−1%)
42–58     0.31 (1%)       1.59 (6%)*      2.16 (9%)*      1.95 (8%)*      2.18 (9%)*
59+       −0.06 (−0%)     0.94 (3%)*      1.01 (4%)*      1.37 (5%)*      1.27 (5%)*
Psychosocial Skills
6–24      0.15 (1%)       0.69 (6%)       1.21 (10%)*     ·               ·
25–36     0.13 (1%)       0.75 (4%)*      1.07 (6%)*      1.62 (9%)*      0.88 (5%)
37–41     0.64 (3%)*      0.70 (3%)*      1.01 (4%)*      1.19 (5%)*      1.25 (5%)*
42–58     0.35 (1%)       0.95 (4%)*      1.44 (6%)*      1.31 (5%)*      1.54 (6%)*
59+       −0.00 (−0%)     0.82 (3%)*      1.05 (4%)*      1.26 (4%)*      1.26 (4%)*
Weight Percentile
6–24      −0.71 (−3%)     0.22 (1%)       3.36 (12%)      ·               ·
25–36     0.74 (3%)       0.75 (3%)       1.89 (7%)       3.30 (13%)      5.22 (20%)
37–41     −1.60 (−5%)     −3.03 (−9%)     −3.54 (−11%)    −0.92 (−3%)     1.06 (3%)
42–58     −1.08 (−3%)     1.05 (3%)       1.14 (4%)       −0.21 (−1%)     1.51 (5%)
59+       −2.35 (−7%)     2.93 (9%)       4.27 (13%)      1.27 (4%)       0.34 (1%)
Height Percentile
6–24      3.10 (19%)      2.78 (17%)      1.63 (10%)      ·               ·
25–36     2.28 (19%)      1.94 (16%)      1.44 (12%)      0.98 (8%)       −0.50 (−4%)
37–41     −0.55 (−3%)     −0.54 (−3%)     −1.52 (−9%)     −1.92 (−11%)    −1.94 (−11%)
42–58     −1.22 (−8%)     −1.16 (−7%)     −1.48 (−9%)     −2.05 (−13%)    −1.86 (−12%)
59+       −1.04 (−9%)     1.25 (10%)      1.57 (13%)      0.68 (6%)       −0.18 (−1%)
* Significant at the 10% level.
which finds that a 1% increase in cognitive skills increases earnings by 0.23%.41 Their study has an advantage over some other studies in the literature in that it controls for the potential endogeneity of cognitive ability in the wage equation. As we only observe the children in our study at a very young age, we assume for the cost-benefit analysis that increases in height and cognitive ability as a child have a persistent effect and translate into equiproportional increases as an adult.42,43
The present discounted value of earnings associated with a 1% increase in height is calculated as follows. Let y(s, c, h) be the annual earnings of individuals with s grades completed, cognitive ability c, and height h, and let a be the age of completing school. We draw a distinction between grades completed and rate of progression through grades, because a number of students in Bolivia both start school late and repeat grades. Estimates from the 1990 third round of the Encuesta Integrada de Hogares (Integrated Household Survey), which covers the ten most populous urban
grades actually completed, or between 0.9 and 1.4 grades, in comparison with a mean of 8.6 grades actually completed.
41 Their study finds that a 7.3% increase in cognitive skills, evaluated at
ages. For example, the Berkeley Growth Study found a correlation of 0.71 between test scores measured at ages 4 and 17 (Currie & Thomas, 1999).
43 We use height for our illustration rather than BMI because this assumption is more dubious for BMI than for height. But, as noted below, we consider a small percentage increase in height in comparison with those obtained for some of the estimators in section IV, because we expect that parents may have selected taller children for consideration for the program.
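As a rough check on the magnitudes involved, the two elasticities quoted in the text (a 2.4% earnings gain per 1% of height, and roughly 0.23% per 1% of cognitive skills, reported as 0.233% in the notes to table 9) can be applied to the hypothetical impacts used in the cost-benefit simulation (2% on height, 5% on cognitive skills). The sketch below is illustrative only: the function and constant names are not from the paper, and adding the two effects together is an assumption made here for simplicity.

```python
# Earnings elasticities quoted in the text (percent earnings change per
# 1 percent change in the input).
HEIGHT_ELASTICITY = 2.4       # Strauss and Thomas (1997), Brazil
COGNITIVE_ELASTICITY = 0.233  # Alderman et al. (1996), Pakistan


def earnings_gain_pct(height_pct, cognitive_pct):
    """Percent earnings gain implied by percent gains in height and skills.

    Assumption (not from the paper): the two effects are simply added.
    """
    return HEIGHT_ELASTICITY * height_pct + COGNITIVE_ELASTICITY * cognitive_pct


if __name__ == "__main__":
    # Hypothetical impacts used in the simulation: 2% height, 5% cognitive.
    print("height only:   ", earnings_gain_pct(2, 0))
    print("cognitive only:", earnings_gain_pct(0, 5))
    print("combined:      ", earnings_gain_pct(2, 5))
```

Under these assumptions the height channel contributes about 4.8% of annual earnings, which is the 2 × 0.024 factor that appears in the present-value expression for the height effect.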
TABLE 9.—COSTS AND ESTIMATED BENEFITS OF THE PIDI PROGRAM IN U.S. DOLLARS UNDER DIFFERENT HYPOTHETICAL IMPACTS*
                                          Discount Rate 3%                       Discount Rate 5%
Educational      Mean Annual
Level            Earnings‡ ($)     Cost†   Benefit   Benefit/Cost Ratio    Cost†   Benefit   Benefit/Cost Ratio
Intermed. (8)    1224 → 1352       1394    5107      3.66                  1301    3230      2.48
Secondary (11)   1422 → 1550       1394    3969      2.85                  1301    2232      1.72
Intermed. (8)    1224 → 1352       1743    5107      2.93                  1626    3230      1.99
Secondary (11)   1422 → 1550       1743    3969      2.28                  1626    2232      1.37
* Assuming that children take part in program 3 years, from age 2 to age 5. Impact: Shortens length of time to complete education by 1 year, increases average educational attainment level by 1 year, increases cognitive skills by 5%, and increases height by 2%. Our simulation is based on a point estimate reported in Strauss and Thomas (1997) of a 2.4% increase in earnings for each 1% increase in height, and on a point estimate reported in Alderman et al. (1996), which finds a 0.233% increase in earnings for each 1% increase in cognitive skills.
† The first two lines of estimates are based on a cost of $516/year as estimated by Ruiz (1996). The second set of estimates includes a 25% upward adjustment to costs to allow for possible distortionary costs to the government of raising the revenues to pay for the program.
‡ Conversion factor: 7.8 Bolivianos/1 U.S. dollar.
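The cost columns of table 9 can be reproduced from the continuous-discounting expressions in this section. The sketch below assumes the cost is $516 per year discounted continuously from age 2 to age 5, and that earnings are flat from the school completion age a until age 60, as the text states. The function names, and the completion age of 16 used in the usage lines, are illustrative assumptions and not values from the paper; the benefit figures in the ratio check are taken from table 9 rather than recomputed.

```python
import math


def discount_factor(r, start, end):
    """Integral of exp(-r*t) dt from start to end (continuous discounting)."""
    return (math.exp(-r * start) - math.exp(-r * end)) / r


def program_cost(annual_cost, r, start_age=2, end_age=5, markup=0.0):
    """Present value of program costs; markup=0.25 adds the 25% adjustment."""
    return annual_cost * (1.0 + markup) * discount_factor(r, start_age, end_age)


def pdv_earnings(mean_earnings, r, completion_age, horizon=60):
    """V = y / r * (exp(-r*a) - exp(-r*60)) for a flat earnings path."""
    return mean_earnings * discount_factor(r, completion_age, horizon)


def height_gain(mean_earnings, r, completion_age, height_pct=2.0):
    """Earnings gain from a height increase: y * pct * 0.024 * discount factor."""
    return height_pct * 0.024 * pdv_earnings(mean_earnings, r, completion_age)


if __name__ == "__main__":
    for r in (0.03, 0.05):
        base = program_cost(516, r)
        adjusted = program_cost(516, r, markup=0.25)
        print(f"r={r:.0%}: cost={base:.0f}, cost with 25% markup={adjusted:.0f}")
    # Benefit/cost ratio using the table 9 benefit for 8 years of education.
    print("ratio at 3%:", round(5107 / program_cost(516, 0.03), 2))
    # Hypothetical completion age of 16 (an assumption for illustration).
    print("height gain:", round(height_gain(1224, 0.03, completion_age=16), 1))
```

Rounded to whole dollars, the computed costs match the four cost entries in table 9, which supports reading the cost integral as running from age 2 to age 5.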
Let r be an externally determined real rate of interest, and T the length of working life, assumed not to depend on s, c, a, or h. In Bolivia, recent life expectancies at birth are approximately 60 years. The present discounted value of earnings for a given (s, a, h, c) vector is $V(s, a, h, c) = \int_a^{60} y(s, c, h)\, e^{-rt}\, dt$.44 This yields a present discounted value of earnings equal to $V(s, a, h, c) = r^{-1} y(s, c, h)\,(e^{-ra} - e^{-60r})$.
The expected impact of a 2% increase in height is $\bar{y}(s) \times 2 \times 0.024 \times r^{-1}(e^{-ra} - e^{-60r})$, where $\bar{y}(s)$ is the average earnings for men with s grades completed and we use the results, as noted, from Strauss and Thomas’s (1997) study and from Alderman et al. (1996).
The earnings gain that would result from a decrease in the school completion age from $a_1$ to $a_2$ without changing the level of school attainment is given by $\bar{y}(s)\, r^{-1} [e^{-ra_2} - e^{-ra_1}]$. An increase in the level of attainment from $s_1$ to $s_2$ has two possibly partially offsetting effects (as in Mincer, 1958). It increases earnings capacity, but also potentially decreases the amount of time available for work, operating through a. To denote the dependence of a on s, write a(s). The benefit of increasing schooling from $s_1$ to $s_2$ is given by
$$\frac{\bar{y}(s_2)}{r}\left(e^{-ra(s_2)} - e^{-60r}\right) - \frac{\bar{y}(s_1)}{r}\left(e^{-ra(s_1)} - e^{-60r}\right).$$
On the cost side, the cost of participating in the program for 4 years between ages 2 and 5 is given by $\$516 \int_2^5 e^{-rt}\, dt$.
Table 9 reports the cost-benefit estimates under hypothetical program impacts that are in the range of some of the impacts observed in the impact analysis of section IV E (table 8) and for average male earnings levels associated with three different education levels: 8, 11, and 14 years of education.45 Specifically, we obtain an earnings gain resulting from an impact of 2% on height, a 5% increase in cognitive skills, and a 1-year increase in grades completed and a corresponding 1-year increase in the age of school completion. The table shows estimates for two values of the discount rate, r = 3% or r = 5%.
The single impact that has the largest effect among the ones considered is increasing the number of grades completed (under the assumption that there is a corresponding 1-year increase in the age of completion), which alone would generate a benefit-cost ratio greater than 1 for both discount rates and both education levels. When multiple types of program impacts are considered together (as shown in the table), the benefit/cost ratios range from 1.7 to 3.7. We also estimated the cost-benefit ratios adjusting the costs by an additional 25% to allow for distortionary costs to the government of raising the revenues to pay for the program.46
VI. Conclusions
This paper analyzes the impact of a preschool program in a developing country using a relatively large, nonexperimental data set. To do so, we generalize matching methods to allow the program impact to vary with a continuous treatment dose (the duration of time spent in the program) and to depend in a flexible way on the age of the child. We also develop a marginal effect estimator that assumes that program participants with differing durations of participation can be made comparable by conditioning on observed child and family characteristics. Advantages of the marginal effect estimator are that it does not require assumptions on the process governing selection into the program, can accommodate the case where selection into the program is on unobservable characteristics, and can be implemented using data only on program participants.
We applied several different estimators to evaluate the effectiveness of the PIDI program in Bolivia, which is aimed at improving early cognitive skills and nutrition. We developed a dynamic model for the decision to enroll children that provided an interpretation for the treatment impact estimates and guided our selection of matching variables. Impact estimates based on cross-sectional regression estimators indicate a positive, statistically significant effect on test scores. Our matching estimators show that test score gains depend strongly on duration of exposure to the program, with positive effects observed for children who participated at least 7 months (for the marginal estimator) and increasing effects observed with longer durations. However, the impacts on the anthropometric outcomes are not precisely estimated.
Our cost-benefit analysis considered a few different channels by which the program might be expected to have an effect on lifetime earnings, including a direct effect of the program on earnings operating through greater physical stature and cognitive skills and indirect effects operating through less time spent in school to achieve a given level of education and/or higher educational attainment levels. When all the channels are combined, under the assumptions of our simulations, the expected benefit of the program outweighs the costs by a fair amount.
44 We assume for simplicity that the earnings path is flat over the life cycle [that is, y(s, c, h) does not depend on t − a after controlling for s].
45 Mean earnings are calculated from the sample of adult males in the group A comparison group data. (This group was chosen because it is not self-selected on program participation.) The modal number of years of education for these males is 8.
46 The 25% figure is based on studies such as Devarajan, Squire, and Suthiwart-Narueput (1997), Feldstein (1995), and Harberger (1997). As seen in the table, even with this adjustment the cost-benefit ratios are substantially greater than 1.
REFERENCES
Alderman, Harold, Jere R. Behrman, Victor Levy, and Rekha Menon, “Child Health and School Enrollment: A Longitudinal Analysis,” Journal of Human Resources 36:1 (2001), 185–205.
Alderman, Harold, Jere R. Behrman, David Ross, and Richard Sabot, “The Returns to Endogenous Human Capital in Pakistan’s Rural Wage Labour Market,” Oxford Bulletin of Economics and Statistics 58:1 (1996), 29–56.
Barnett, Stephen, “The Benefits of Compulsory Preschool Education,” Journal of Human Resources 27:2 (1992), 279–312.
Behrman, Jere R., “The Economic Rationale for Investing in Nutrition in Developing Countries,” World Development 21:11 (1993), 1749–1771.
Behrman, Jere R., and Anil B. Deolalikar, “Wages and Labor Supply in Rural India: The Role of Health, Nutrition and Seasonality,” in David E. Sahn (Ed.), Causes and Implications of Seasonal Variability in Household Food Security (Baltimore, MD: The Johns Hopkins University Press, 1989).
Bloom, H., L. Orr, G. Dave, S. H. Bell, and F. Doolittle, The National JTPA Study: Title IIA Impacts on Earnings and Employment at 18 Months (Bethesda, MD: Abt Associates, 1993).
Boissiere, Maurice, John B. Knight, and Richard H. Sabot, “Earnings, Schooling, Ability and Cognitive Skills,” American Economic Review 75 (1985), 1016–1030.
Cleveland, William, “Robust Locally Weighted Regression and Smoothing Scatterplots,” Journal of the American Statistical Association 74 (1979), 829–836.
Currie, Janet, and Duncan Thomas, “Early Test Scores, Socioeconomic Status and Future Outcomes,” National Bureau of Economic Research working paper no. 6943 (1999).
Dehejia, Rajeev, and Sadek Wahba, “Causal Effects in Nonexperimental Studies: Reevaluating the Evaluation of Training Programs,” Journal of the American Statistical Association 94:448 (1999), 1053–1062.
Deolalikar, Anil B., “Nutrition and Labor Market Productivity in Agriculture: Estimates for Rural South India,” Review of Economics and Statistics 70:3 (1988), 406–413.
Devarajan, S., L. Squire, and S. Suthiwart-Narueput, “Beyond Rate of Return: Reorienting Project Appraisal,” The World Bank Research Observer 12:1 (1997), 35–36.
Fan, J., “Design Adaptive Nonparametric Regression,” Journal of the American Statistical Association 87 (1992), 998–1004.
Feldstein, Martin, “Tax Avoidance and the Deadweight Loss of the Income Tax,” National Bureau of Economic Research working paper no. 5055 (1995).
Glewwe, Paul, “The Relevance of Standard Estimates of Rates of Return to Schooling for Education Policy: A Critical Assessment,” Journal of Development Economics 51:2 (1996), 267–290.
Glewwe, Paul, and Hanan G. Jacoby, “An Economic Analysis of Delayed Primary School Enrollment in a Low-Income Country: The Role of Early Childhood Nutrition,” this REVIEW 77:1 (1995), 156–159.
Glewwe, Paul, Hanan G. Jacoby, and Elizabeth King, “Early Childhood Nutrition and Academic Achievement: Analysis Using Longitudinal Data,” Journal of Public Economics 81:3 (2001), 345–368.
Grantham-McGregor, S., C. Walker, S. Chang, and C. Powell, “Effects of Early Childhood Supplementation With and Without Stimulation on Later Development in Stunted Jamaican Children,” American Journal of Clinical Nutrition 66 (1997), 247–453.
Haas, J., S. Murdoch, J. Rivera, and R. Martorell, “Early Nutrition and Later Physical Work Capacity,” Nutrition Reviews 54 (1996), S41–S48.
Haddad, Lawrence, and Howarth Bouis, “The Impact of Nutritional Status on Agricultural Productivity: Wage Evidence from the Philippines,” Oxford Bulletin of Economics and Statistics 53:1 (1991), 45–68.
Hahn, Jinyong, “On the Role of the Propensity Score in Efficient Estimation of Average Treatment Effects,” Econometrica 66:2 (1998), 315–331.
Harberger, Arnold, “New Frontiers in Project Evaluation? A Comment on Devarajan, Squire and Suthiwart-Narueput,” The World Bank Research Observer 12:1 (1997), 73–39.
Heckman, J. J., H. Ichimura, J. Smith, and P. Todd, “Characterizing Selection Bias Using Experimental Data,” Econometrica 66 (1998), 1017–1098.
Heckman, J. J., H. Ichimura, and P. Todd, “Matching As an Econometric Evaluation Estimator: Evidence from Evaluating a Job Training Program,” Review of Economic Studies 64:4 (1997), 605–654.
Heckman, J. J., H. Ichimura, and P. Todd, “Matching As an Econometric Evaluation Estimator,” Review of Economic Studies 65 (1998), 261–294.
Heckman, J. J., and P. Todd, “Adapting Propensity Score Matching and Selection Models to Choice-based Samples,” University of Chicago, unpublished manuscript (1995).
Jamison, Dean T., “Child Nutrition and School Performance in China,” Journal of Development Economics 20:2 (1986), 299–310.
Krueger, Alan, “Reassessing the View that American Schools Are Broken,” Federal Reserve Bank of New York Economic Policy Review 4:1 (1998), 29–43.
Lavy, Victor, Jennifer Spratt, and Nathalie Leboucher, “Patterns of Incidence and Change in Moroccan Literacy,” Comparative Education Review 41:2 (1997), 120–141.
Martorell, R., “Results and Implications of the INCAP Follow-up Study,” Journal of Nutrition 125 (1995), 1127S–1138S.
Martorell, R., “The Nature of Child Malnutrition and its Long-term Implications,” Food and Nutrition Bulletin 20 (1999), 288–292.
Martorell, R., K. L. Kahn, and D. G. Schroeder, “Reversibility of Stunting: Epidemiological Findings in Children from Developing Countries,” European Journal of Clinical Nutrition 48:Suppl. (1994), S45–S57.
Martorell, Reynaldo, Juan Rivera, and Haley Kaplowitz, “Consequences of Stunting in Early Childhood for Adult Body Size in Rural Guatemala” (Stanford, CA: Food Policy Research), Stanford University mimeograph (1989).
Mincer, Jacob, “Investment in Human Capital and Personal Income Distribution,” Journal of Political Economy 66:4 (1958), 281–302.
Moock, Peter R., and Joanne Leslie, “Childhood Malnutrition and Schooling in the Terai Region of Nepal,” Journal of Development Economics 20:1 (1986), 33–52.
Myers, Robert, The Twelve Who Survive: Strengthening Programmes of Early Childhood Development in the Third World (Ypsilanti, MI: High Scope Press, 1995).
Neal, Derek, and William Johnson, “The Role of Pre-market Factors in Black-White Wage Differences,” Journal of Political Economy 104:5 (1996), 869–895.
Pitt, Mark M., Mark R. Rosenzweig, and Donna M. Gibbons, “The Determinants and Consequences of the Placement of Government Programs in Indonesia,” The World Bank Economic Review 7:3 (1993), 319–348.
Pollitt, Ernesto, Malnutrition and Infection in the Classroom (Paris: UNESCO, 1990).
Rosenzweig, Mark R., “Why Are There Returns to Schooling?” American Economic Review 85:2 (1995), 153–158.
Rosenzweig, Mark R., and Kenneth J. Wolpin, “Evaluating the Effects of Optimally Distributed Public Programs,” American Economic Review 76:3 (1986), 470–487.
Ruiz, Fernando, “Estudio de Costos del Proyecto Integral de Desarrollo Infantil (PIDI),” La Paz, Bolivia: Ruiz Guissani Consultores mimeograph (1996).
Smith, Jeffrey, and Petra Todd, “Reconciling Conflicting Evidence on the Performance of Propensity Score Matching Methods,” American Economic Review 91:2 (2001), 112–118.
Smith, Jeffrey, and Petra Todd, “Does Matching Address LaLonde’s Criticism of Nonexperimental Estimators,” Journal of Econometrics, forthcoming (2004).
Strauss, John, “Does Better Nutrition Raise Farm Productivity?” Journal of Political Economy 94 (April 1986), 297–320.
Strauss, John, and Duncan Thomas, “Human Resources: Empirical Modeling of Household and Family Decisions,” in J. Behrman and T. N. Srinivasan (Eds.), Handbook of Development Economics, vol. 3A (Amsterdam: North Holland Publishing Company, 1995).
Strauss, John, and Duncan Thomas, “Health, Nutrition and Economic Development,” Journal of Economic Literature 36:2 (1998), 766–817.
Schweinhart, Lawrence J., and David P. Weikart, “High/Scope Perry Preschool Program Effects at Age Twenty-Seven,” in Jonathon Crane (Ed.), Social Programs that Work (New York: Russell Sage Foundation, 1998).
Thomas, Duncan, and Strauss, John, “Health and Wages: Evidence on Men and Women in Urban Brazil,” Journal of Econometrics 77:1 (1997), 159–185.
Todd, Petra E., and Kenneth I. Wolpin, “On the Specification and Estimation of the Production Function for Cognitive Achievement,” The Economic Journal 113:485 (2003), F3–F33.
van de Walle, Dominique, and Kimberly Nead, Public Spending and the Poor (Baltimore: Johns Hopkins University Press, 1995).
World Bank, World Development Report: The State in a Changing World (New York and Oxford: Oxford University Press for the World Bank, 1997).
APPENDIX A
Data Appendix
The PIDI survey consists of five modules: two about the household, one about women in the household, one about the children, and, for PIDI families, one about the PIDI center supervisor. The first module gathers socioeconomic data for all household members, including information about parents’ educational attainment levels, income sources, father’s and mother’s occupations, and family structure. The second module gathers information on fertility and reproductive histories for all females in the household between the ages of 13 and 49. The third module gathers a variety of information on the children in the household, including anthropometric measures, test scores on cognitive and psychosocial tests, information on vaccination records and recent illnesses, and some qualitative data on parent-child interactions. The fourth module gathers information on household living conditions, information on whether the family possesses certain types of durable goods, data on the households’ interaction with local community groups, and qualitative data on the parents’ opinions of the PIDI program. The fifth module provides information on the characteristics of the PIDI center coordinators.
$$\min_{a,\, b_1} \sum_{i=1}^{n} \left[ y_i - a - b_1 (z_i - z_0) \right]^2 K\!\left( \frac{z_i - z_0}{h_n} \right),$$
where K is a kernel function and $h_n > 0$ is a bandwidth that converges to 0 as $n \to \infty$. The estimator of the conditional mean is $\hat{a}$. If $b_1$ were constrained to equal 0, then $\hat{a}$ would give the standard kernel regression estimator. Thus, kernel regression can be viewed as a special case of LLR.
Fan (1992) shows that the local linear estimator has the same variance as the kernel estimator but has a lower-order bias at boundary points.48 The smaller bias associated with the LLR estimator implies that it is more rate-efficient than the kernel estimator. Another advantage emphasized by Fan is that the bias of the LLR estimator does not depend on the design density of the data. Because of these advantages, local linear methods are usually a better choice than standard kernel methods for nonparametric regression. The local linear estimator is asymptotically normal with a rate of convergence equal to $\sqrt{n h_n^k}$, where k is the dimension of z. In our application, the estimators have k = 2 or k = 3.
The kernel function we use in the empirical work is the biweight kernel (sometimes also called a quartic kernel). Bandwidth values are selected by least squares cross-validation as described in the text.
47 Local polynomial estimators were developed in the early statistics literature by Cleveland (1979). They were further developed in Fan (1992) and have more recently been considered in the econometrics literature by Heckman, Ichimura, Smith, and Todd (1998).
48 The advantage stems from the fact that local linear regression imposes an orthogonality condition between the regressors and the residuals that is not imposed under kernel regression. See Fan (1992).
APPENDIX C
List of Variables Included in Program Participation Model
In this appendix, we list the variables that were included in the discrete-choice models for program participation and for the probability of experiencing a duration that exceeds 1 month (used in comparing groups with durations >1 and durations ≤1 month). The following list gives the variables included in the models. The subset of variables and interactions was selected from a larger set of variables available in the data set to maximize the percentage of observations correctly classified under the model.
Variables included in the model for program participation: age in months of child, sex of child, indicators for whether mother and father reside in the household, education level of mother, job type of father, monthly income of father, number of siblings, number of rooms in house, indicator for whether family owns house, indicator for whether house has running water, indicator for whether house has a bathroom, indicator for whether house has a television set, interaction between number of rooms in house and age of child, interaction between employment status of father and age of child, interaction between number of siblings and age of child, interaction between monthly income of father and number of siblings, interaction between education level of mother and age of child.
Variables in the model for the probability of experiencing a duration that exceeds 1 month: age of child, sex of child, indicator for whether family participates in outside organizations, indicators for whether mother and father reside in household, education level of mother, job type of mother, age of father, education of father, monthly income of father, number of siblings, number of rooms in house, indicator for whether family owns house, indicator for whether house has running water, indicator for whether there is a bathroom in the house, indicator for whether household has a television set, interaction between number of rooms and age of child, interaction between employment status of father and age of child, interaction between employment status of mother and the age of child, interaction between number of siblings and age of child, interaction between age of father and number of siblings, interaction between monthly income of father and number of siblings, interaction between education level of mother and age of child.
APPENDIX D
Empirical Evidence on Impact of Preschool Child Nutrition and Cognitive Development on Postschooling Earnings
To simulate benefits of improved preschool child nutrition and cognitive development on adult earnings, a number of channels must be considered as noted in section V. There is piecemeal empirical evidence of significant effects through all four of the channels for developing countries.
Evidence on (1): Alderman et al. (1996) for rural Pakistan; Boissiere, Knight, and Sabot (1985) for urban Kenya and Tanzania; Glewwe (1996) for Ghana; and Lavy, Spratt, and Leboucher (1997) for Morocco.
Evidence on (2): Behrman and Deolalikar (1989) and Deolalikar (1988) for rural India; Haddad and Bouis (1991) for rural Philippines; Strauss (1986) for Côte d’Ivoire; Thomas and Strauss (1997) for Brazil; and Behrman (1993) for the more general experience in developing countries.
Evidence on (1) and (2): Grantham-McGregor et al. (1997) for Jamaica; Martorell (1995) and Martorell, Rivera, and Kaplowitz (1989) for rural Guatemala; and Haas et al. (1996), Martorell (1999), and Martorell, Kahn, and Schroeder (1994) for the more general experience in developing countries.
Evidence on (3): hundreds of studies, many of which are surveyed in Psacharopoulos (1994) and Rosenzweig (1995).
Evidence on (3) and (4): Jamison (1986) for China; Moock and Leslie (1986) for Nepal; and Behrman (1993) and Pollitt (1990) for the more general experience in developing countries.
Evidence on (4): Alderman et al. (2001) for rural Pakistan; Glewwe and Jacoby (1995) for Ghana; and Glewwe, Jacoby, and King (2001) for the Philippines.
For our illustrative simulations, we use estimates from Alderman et al. (1996) for (1) and Thomas and Strauss (1997) for (2), under the assumption in both cases that there is a strong persistence of changes in preschool child anthropometric and cognitive development, so that the percentage changes for adults equal those we estimate for children. We also use the estimate in the latter study for the impact of grades completed in schooling on earnings in (3). The studies on the impact of child nutrition on progression rates through school and total schooling in (3) and (4) indicate significant effects, but do not yield parameters that are useful for our simulations, because they do not correct for censoring for completed schooling; so we consider illustrative magnitudes for these possible effects.
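The local linear regression (LLR) estimator described in the appendix on nonparametric regression can be illustrated with a short sketch. The code below is a minimal one-dimensional version with a biweight kernel and a fixed, hand-picked bandwidth; the paper's application uses two or three conditioning dimensions and cross-validated bandwidths, neither of which is reproduced here. All function names are illustrative assumptions, not the authors' implementation.

```python
def biweight_kernel(u):
    """Biweight (quartic) kernel: (15/16) * (1 - u^2)^2 on [-1, 1], else 0."""
    return (15.0 / 16.0) * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0


def local_linear_fit(z, y, z0, h):
    """Minimise sum_i [y_i - a - b1*(z_i - z0)]^2 * K((z_i - z0)/h) over (a, b1).

    Returns (a_hat, b1_hat); a_hat estimates the conditional mean at z0.
    Uses the closed-form solution of the 2x2 weighted least squares problem.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for zi, yi in zip(z, y):
        d = zi - z0
        w = biweight_kernel(d / h)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    det = s0 * s2 - s1 * s1
    return (s2 * t0 - s1 * t1) / det, (s0 * t1 - s1 * t0) / det


def kernel_fit(z, y, z0, h):
    """Standard kernel regression: the same problem with b1 constrained to 0."""
    weights = [biweight_kernel((zi - z0) / h) for zi in z]
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)


if __name__ == "__main__":
    # Exactly linear toy data on [0, 1]; z0 = 0 is a boundary point.
    z = [i / 100.0 for i in range(101)]
    y = [1.0 + 2.0 * zi for zi in z]
    print("local linear at boundary:", local_linear_fit(z, y, z0=0.0, h=0.3)[0])
    print("kernel at boundary:      ", kernel_fit(z, y, z0=0.0, h=0.3))
```

On this toy example the local linear fit recovers the true value at the boundary, while the kernel estimate is pulled upward because all the neighbouring points lie on one side of z0. This is the boundary-bias advantage attributed to Fan (1992) in the appendix.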