Understanding Instrumental Variables in Econometrics


Old School IV

Master Joshway

ASSA Continuing Education: January 2020

Instruments Ready?
Organizing IV
I tell the IV story in two iterations, first with constant effects,
then with heterogeneous potential outcomes.

• The constant effects framework focuses on selection bias and essential
IV mechanics
• Many reasons to instrument ... here’s one:
• The regression we want is long, say:

$$ y_i = \alpha + \rho s_i + A_i'\gamma + \nu_i = \alpha + \rho s_i + \eta_i \qquad (1) $$

Since $s_i$ and $\eta_i = A_i'\gamma + \nu_i$ are correlated,

$$ \frac{\mathrm{Cov}(y_i, s_i)}{V(s_i)} \neq \rho $$
• The short regression suffers from "ability bias"
• IV recovers long-regression $\rho$ without observing $A_i$
• Why go long? Must be a causal story!

IV Goes Long (in pursuit of causal effects)

• Potential earnings modeled as a function of ability:

$$ y_{0i} = \alpha + \eta_i = \alpha + A_i'\gamma + \nu_i, $$
where we’re happy to assume $E[\nu_i s_i] = 0$
• Moving from $s - 1$ to $s$ years of schooling yields constant returns:

$$ y_{s,i} - y_{s-1,i} = \rho, $$
making eq. (1) into a causal model
• A valid instrument, $z_i$, is:
1 correlated with $s_i$
2 uncorrelated with $\eta_i = A_i'\gamma + \nu_i$ and hence with $y_{0i}$

• $z_i$ is excluded from the causal model of interest

• Given these assumptions, we have:
$$ \rho = \frac{\mathrm{Cov}(y_i, z_i)}{\mathrm{Cov}(s_i, z_i)} = \frac{\mathrm{Cov}(y_i, z_i)/V(z_i)}{\mathrm{Cov}(s_i, z_i)/V(z_i)} = \frac{\text{"RF"}}{\text{"1st"}} \qquad (2) $$
Abraham (Wald) Meets Jacob (Bernoulli)
• Repeat our long regression, equation (1):

$$ y_i = \alpha + \rho s_i + A_i'\gamma + \nu_i = \alpha + \rho s_i + \eta_i $$

• By linear CEF, the RF for a Bernoulli (dummy) instrument, $z_i$, is

$$ \frac{\mathrm{Cov}(y_i, z_i)}{V(z_i)} = E[y_i | z_i = 1] - E[y_i | z_i = 0], $$
with an analogous formula for $\mathrm{Cov}(s_i, z_i)/V(z_i)$. This shows:

$$ \rho = \frac{\mathrm{Cov}(y_i, z_i)}{\mathrm{Cov}(s_i, z_i)} = \frac{E[y_i | z_i = 1] - E[y_i | z_i = 0]}{E[s_i | z_i = 1] - E[s_i | z_i = 0]} \qquad (3) $$
• A direct route uses $E[\eta_i | z_i] = 0$:

$$ E[y_i | z_i] = \alpha + \rho E[s_i | z_i] \qquad (4) $$

Differencing (4) between $z_i = 1$ and $z_i = 0$ and solving for the schooling coefficient yields (3)
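
To see equations (2) and (3) at work, here is a minimal simulation sketch. The data-generating process (coefficients, variances, a coin-flip instrument) is entirely hypothetical and chosen only so that unobserved ability drives both schooling and earnings; the true return is set to 0.1.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance (divides by n; the divisor cancels in the ratios below)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def simulate(n=20000, rho=0.1, seed=0):
    """Hypothetical DGP: ability raises both schooling and earnings."""
    rng = random.Random(seed)
    z, s, y = [], [], []
    for _ in range(n):
        zi = 1 if rng.random() < 0.5 else 0      # dummy instrument
        ability = rng.gauss(0, 1)                # unobserved by the analyst
        si = 12 + 1.0 * zi + 1.0 * ability + rng.gauss(0, 1)
        yi = 1 + rho * si + 0.5 * ability + rng.gauss(0, 0.5)
        z.append(zi); s.append(si); y.append(yi)
    return z, s, y

z, s, y = simulate()
ols = cov(y, s) / cov(s, s)                      # short regression slope
iv = cov(y, z) / cov(s, z)                       # equation (2)

def group_mean(v, flag):
    return mean([vi for vi, zi in zip(v, z) if zi == flag])

# Equation (3): difference in mean outcomes over difference in mean schooling
wald = (group_mean(y, 1) - group_mean(y, 0)) / (group_mean(s, 1) - group_mean(s, 0))

print(f"OLS {ols:.3f}  IV {iv:.3f}  Wald {wald:.3f}")
```

In this setup the covariance ratio and the Wald ratio coincide up to floating-point error, while the short-regression slope is pushed well above 0.1 by the omitted ability term.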

Angrist and Krueger (1991): Compulsory IV

• Children born in late quarters start school younger, so are kept in
school longer by birthday-based compulsory schooling laws
• There’s a powerful first stage supporting this
• Late-quarter births have more years of schooling
• This is driven by high school and not college, consistent with the
AK-91 CSL story

• Mean schooling and wages by YOB/QOB appear in Figure 4.1.1

• The IV estimator is the sample analog of (2); with a dummy
instrument this becomes the sample analog of (3)

• AK-91 Wald IV for the economic returns to schooling compares
average schooling and earnings for early- and later-quarter births
• The instrument here is $Z_i = 1[QOB_i = 1]$
996 QUARTERLY JOURNAL OF ECONOMICS

TABLE III
PANEL A: WALD ESTIMATES FOR 1970 CENSUS: MEN BORN 1920-1929 (a)

                                    (1)            (2)               (3)
                                    Born in        Born in 2nd,      Difference
                                    1st quarter    3rd, or 4th       (std. error)
                                    of year        quarter of year   (1) - (2)

ln (wkly. wage)                     5.1484         5.1574            -0.00898
                                                                     (0.00301)
Education                           11.3996        11.5252           -0.1256
                                                                     (0.0155)
Wald est. of return to education                                     0.0715
                                                                     (0.0219)
OLS return to education (b)                                          0.0801
                                                                     (0.0004)

PANEL B: WALD ESTIMATES FOR 1980 CENSUS: MEN BORN 1930-1939

                                    (1)            (2)               (3)
                                    Born in        Born in 2nd,      Difference
                                    1st quarter    3rd, or 4th       (std. error)
                                    of year        quarter of year   (1) - (2)

ln (wkly. wage)                     5.8916         5.9027            -0.01110
                                                                     (0.00274)
Education                           12.6881        12.7969           -0.1088
                                                                     (0.0132)
Wald est. of return to education                                     0.1020
                                                                     (0.0239)
OLS return to education                                              0.0709
                                                                     (0.0003)

a. The sample size is 247,199 in Panel A, and 327,509 in Panel B. Each sample consists of males born in the
United States who had positive earnings in the year preceding the survey. The 1980 Census sample is drawn
from the 5 percent sample, and the 1970 Census sample is from the State, County, and Neighborhoods 1 percent
samples.
b. The OLS return to education was estimated from a bivariate regression of log weekly earnings on years of
education.
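
As a quick arithmetic check, the Wald estimates in Table III are just the reduced-form difference in log wages divided by the first-stage difference in schooling. The sketch below uses only the differences printed in column (3); the dictionary layout and names are illustrative.

```python
# Column (3) differences from Table III: (1st quarter) minus (2nd-4th quarters)
panels = {
    "A (1970 Census, born 1920-1929)": {"d_lnwage": -0.00898, "d_educ": -0.1256},
    "B (1980 Census, born 1930-1939)": {"d_lnwage": -0.01110, "d_educ": -0.1088},
}

# Wald estimate = reduced form / first stage
wald = {name: p["d_lnwage"] / p["d_educ"] for name, p in panels.items()}

for name, est in wald.items():
    print(f"Panel {name}: Wald = {est:.4f}")
# Panel A (1970 Census, born 1920-1929): Wald = 0.0715
# Panel B (1980 Census, born 1930-1939): Wald = 0.1020
```

The standard errors in the table are not reproduced here, since they depend on the joint sampling distribution of the two differences.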

[...] estimate in this case because unobserved earnings determinants
(e.g., ability) are likely to be uniformly distributed across people
born on different dates of the year.15
The last row of each panel in Table III provides the OLS [...]

15. We note that our procedure will slightly understate the return to education
because first-quarter births, whose birthdays occur midterm, are more likely to
attend some schooling beyond their last year completed. Consequently, the
difference in years of school attended between first and later quarters of birth is less than
the difference in years of school completed. Since the difference in completed
education rather than the difference in years of school attended appears in the
denominator of the Wald estimator, our estimate is biased downward. In practice,
however, this is a small bias because the difference in completion rates is small.

Two-Stage Least Squares (2SLS)

• We do IV by doing 2SLS
• This accommodates covariates (controls, $X_i$) and multiple
instruments (dummy or otherwise):

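$$ y_i = \alpha' X_i + \rho s_i + \eta_i, \qquad (5) $$

The first stage and reduced form are

$$ s_i = X_i'\pi_{10} + \pi_{11}' z_i + \xi_{1i} = \hat{s}_i + \xi_{1i} \qquad (6) $$
$$ y_i = X_i'\pi_{20} + \pi_{21}' z_i + \xi_{2i} \qquad (7) $$
• The 2SLS "second stage" is obtained by substituting (6) into (5):

$$ \begin{aligned}
y_i &= \alpha' X_i + \rho[X_i'\pi_{10} + \pi_{11}' z_i] + \rho\xi_{1i} + \eta_i \qquad (8) \\
&= \alpha' X_i + \rho\hat{s}_i + \rho\xi_{1i} + \eta_i \\
&= \alpha' X_i + \rho\hat{s}_i + \xi_{2i}
\end{aligned} $$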
2SLS Notes
• 2SLS subs $\hat{s}_i$ for $s_i$ in (5):

$$ y_i = \alpha' X_i + \rho\hat{s}_i + [\eta_i + \rho\xi_{1i}], \qquad (9) $$

• Because $\hat{s}_i$ and $\xi_{2i} = \eta_i + \rho\xi_{1i}$ are uncorrelated, OLS estimation of (9) identifies $\rho$

• In practice, let Stata ivregress do it
• Likewise, we get the RF this way

$$ \begin{aligned}
y_i &= \alpha' X_i + \rho[X_i'\pi_{10} + \pi_{11}' z_i] + \rho\xi_{1i} + \eta_i \qquad (10) \\
&= X_i'[\alpha + \rho\pi_{10}] + \rho\pi_{11}' z_i + [\rho\xi_{1i} + \eta_i] \\
&= X_i'\pi_{20} + \pi_{21}' z_i + \xi_{2i}
\end{aligned} $$
• 2SLS implicitly computes the ratio of RF to 1st for each IV:
$$ \frac{\pi_{21}}{\pi_{11}} = \rho $$
In old-school SEMs, the sample analog of this ratio is an Indirect
Least Squares estimator of $\rho$

2SLS in AK-91

• We now see that it’s the QOB × YOB first stage and reduced form
that are plotted in Figure 4.1.1
• The corresponding 2SLS estimates appear in Table 4.1.1

• 2SLS matches the QOB earnings pattern (RF) to the QOB pattern in
schooling (first stage):
$$ \pi_{21} = \rho\pi_{11} $$
The key p-p-p-pattern here is ...

• Covariates include year-of-birth and state-of-birth dummies, as well as
linear and quadratic functions of age in quarters
2SLS is a many-splendored thing
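• 2SLS is IV where the instrument is $\hat{s}_i^*$, the residual from a regression
of $\hat{s}_i$ on $X_i$:
$$ \frac{\mathrm{Cov}(y_i, \hat{s}_i^*)}{V(\hat{s}_i^*)} = \frac{\mathrm{Cov}(y_i, \hat{s}_i^*)}{\mathrm{Cov}(s_i, \hat{s}_i^*)} $$
• One-instrument 2SLS is IV where the instrument is $\tilde{z}_i$, the residual
from a regression of $z_i$ on the covs, $X_i$:
$$ \frac{\mathrm{Cov}(y_i, \hat{s}_i^*)}{V(\hat{s}_i^*)} = \frac{\mathrm{Cov}(y_i, \tilde{z}_i)}{\mathrm{Cov}(s_i, \tilde{z}_i)} $$
• One-instrument 2SLS is ILS:
$$ \frac{\mathrm{Cov}(y_i, \hat{s}_i^*)}{V(\hat{s}_i^*)} = \frac{\mathrm{Cov}(y_i, \hat{s}_i^*)}{\mathrm{Cov}(s_i, \hat{s}_i^*)} = \frac{\mathrm{Cov}(y_i, \tilde{z}_i)}{\mathrm{Cov}(s_i, \tilde{z}_i)} = \frac{\pi_{21}}{\pi_{11}} $$
• Over-identified 2SLS is a weighted average of these just-identified
IV=ILS estimates (MHE 4.5.1)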
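
The equivalences on this slide can be checked numerically. The sketch below assumes NumPy and a made-up data-generating process with one covariate and one instrument (all coefficients are hypothetical); `ols` and `resid` are small helper functions written for this illustration, not part of any library.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)                                  # one covariate
z = (rng.random(n) < 0.5).astype(float) + 0.3 * x       # instrument, correlated with x
ability = rng.normal(size=n)                            # unobserved
s = 12 + 0.5 * x + 1.0 * z + ability + rng.normal(size=n)
y = 1 + 0.2 * x + 0.1 * s + 0.5 * ability + rng.normal(scale=0.5, size=n)

def ols(dep, *regs):
    """OLS coefficients of dep on a constant plus regs (constant first)."""
    X = np.column_stack([np.ones(len(dep)), *regs])
    return np.linalg.lstsq(X, dep, rcond=None)[0]

def resid(dep, *regs):
    """Residual from an OLS regression of dep on a constant plus regs."""
    X = np.column_stack([np.ones(len(dep)), *regs])
    return dep - X @ np.linalg.lstsq(X, dep, rcond=None)[0]

# (a) Textbook 2SLS: first-stage fitted values, then OLS of y on x and s_hat
b1 = ols(s, x, z)
s_hat = b1[0] + b1[1] * x + b1[2] * z
tsls = ols(y, x, s_hat)[2]

# (b) IV using s_hat*, the residual from regressing s_hat on the covariate
s_hat_star = resid(s_hat, x)
iv_shat = (y @ s_hat_star) / (s @ s_hat_star)

# (c) IV using z_tilde, the residual from regressing z on the covariate
z_tilde = resid(z, x)
iv_ztilde = (y @ z_tilde) / (s @ z_tilde)

# (d) ILS: reduced-form coefficient on z over first-stage coefficient on z
ils = ols(y, x, z)[2] / ols(s, x, z)[2]

print(tsls, iv_shat, iv_ztilde, ils)
```

All four numbers agree up to floating-point error. Only the point estimate is reproduced this way: the OLS standard errors from the manual second stage in (a) are not valid 2SLS standard errors.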

2SLS Mistakes

• 2SLS . . . so simple a fool can do it . . .


• and many do!

• What can go wrong?

• As explained in MHE 4.6.1, three mistakes stubbornly persist:
• Manual 2SLS
• Covariate ambivalence
• Forbidden regressions (from the left and the right)

• These are the bitter fruit of attempts to "improve" upon orthodox
2SLS protocols

• 2SLS is already awesome: let Stata do it!


Group Work

Wald Serves in Vietnam


• Key variables

$z_i$ = randomly assigned draft-eligibility in 1970-72 draft lotteries
$d_i$ = a dummy indicating Vietnam-era veterans
$y_i$ = earnings after service
• The causal effect of Vietnam-era military service is the draft-eligibility
RF divided by the draft-eligibility first stage
• $d_i$ is also a dummy, so the first stage is a difference in probabilities:

$$ \frac{\mathrm{Cov}(d_i, z_i)}{V(z_i)} = E[d_i | z_i = 1] - E[d_i | z_i = 0] = P[d_i = 1 | z_i = 1] - P[d_i = 1 | z_i = 0] $$

• Angrist (1990), Figures 1-2 and MHE Table 4.1.3

• Updated: Angrist, Chen, and Song (2011)
Multiple groups and 2SLS

• More to the draft lottery than draft-eligibility: Angrist and Chen
(2011), Figure 1
• Let $r_i = j \in \{1, ..., J\}$ denote lottery numbers. Draft-eligibility Wald
uses $1[r_i < 195]$ as an instrument in a just-identified setup
• Using fine-grained info on $r_i$, we have

$$ E[y_i | r_i] = \alpha + \rho P[d_i = 1 | r_i], \qquad (11) $$

since $P[d_i = 1 | r_i] = E[d_i | r_i]$. So we can estimate $\rho$ by fitting:

$$ \bar{y}_j = \alpha + \rho\hat{p}_j + \bar{\eta}_j; \quad j = 1, ..., J \qquad (12) $$

• Efficient GLS for this grouped constant-effects linear model is
weighted least squares, with weights inversely proportional to $V(\bar{\eta}_j)$
• $V(\bar{\eta}_j) = \sigma_\eta^2 / n_j$ under homoskedasticity, so the weights are the group sizes $n_j$


Visual IV, Grouping, and GLS

• Equation (12) in action: Angrist (1990), Figure 3.

• This illustrates visual instrumental variables (VIV)

• GLS applied to equation (12) is 2SLS
• The instruments here are lottery-number indicators. Define
$Z_i \equiv \{r_{ji} = 1[r_i = j]; \; j = 1, ..., J - 1\}$
• The first stage for $d_i$ on $Z_i$ plus a constant is saturated, so fitted
values are conditional means, $\hat{p}_j$, repeated $n_j$ times for each $j$
• The second stage slope estimate is therefore weighted least squares on
the grouped equation, (12), weighted by $n_j$
• Because GLS is efficient, 2SLS is also the efficient linear combination
of the underlying just-identified IV (Wald) estimates
• That’s why we call Figure 3 "VIV"
• Sargan/Hansen overid tests the fit of this line

• Fig. 3 also illustrates two-sample IV: $\bar{y}_j$ from one sample, $\hat{p}_j$ from
another (details in AK 1992, 1995; Inoue and Solon, 2010)
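
The claim that 2SLS with a saturated set of group indicators equals weighted least squares on group means can be verified in a few lines. This is a sketch on invented data: ten hypothetical lottery groups of unequal size, a service probability that falls with the group number, and an arbitrary constant effect of -0.2.

```python
import random
from collections import defaultdict

rng = random.Random(1)
rho = -0.2                                    # hypothetical constant effect
people = []                                   # (group j, d_i, y_i)
for j in range(1, 11):                        # J = 10 groups of unequal size
    for _ in range(50 + 20 * j):
        d = 1 if rng.random() < 0.45 - 0.03 * j else 0
        people.append((j, d, 5 + rho * d + rng.gauss(0, 1)))

# Group sizes n_j, saturated first-stage fits p_hat_j, mean outcomes y_bar_j
cells = defaultdict(list)
for j, d, y in people:
    cells[j].append((d, y))
n = {j: len(v) for j, v in cells.items()}
p_hat = {j: sum(d for d, _ in v) / n[j] for j, v in cells.items()}
y_bar = {j: sum(y for _, y in v) / n[j] for j, v in cells.items()}

def wls_slope(x, y, w):
    """Slope from weighted least squares of y on a constant and x."""
    tot = sum(w.values())
    mx = sum(w[j] * x[j] for j in w) / tot
    my = sum(w[j] * y[j] for j in w) / tot
    num = sum(w[j] * (x[j] - mx) * (y[j] - my) for j in w)
    den = sum(w[j] * (x[j] - mx) ** 2 for j in w)
    return num / den

grouped = wls_slope(p_hat, y_bar, n)          # equation (12), weighted by n_j

# 2SLS second stage: regress individual y_i on the fitted value p_hat_j
N = len(people)
fit = [p_hat[j] for j, _, _ in people]
ys = [y for _, _, y in people]
mf, my = sum(fit) / N, sum(ys) / N
tsls = (sum((f - mf) * (y - my) for f, y in zip(fit, ys))
        / sum((f - mf) ** 2 for f in fit))

print(grouped, tsls)
```

The two slopes match up to floating-point error, which is the sense in which the fitted line through the grouped points is the 2SLS estimate.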
There’s Weakness in Numbers
(of instruments)

2SLS is Biased, Yo
• OLS estimates are unbiased and consistent for the corresponding pop
reg (maybe not the reg you want, but nicely estimated)
• 2SLS estimates are consistent for causal FX but biased
• Endogenous var. is vector $x$; dep. var. is vector $y$; no covs:

$$ y = \beta x + \eta \qquad (13) $$

The $N \times q$ matrix of instruments is $Z$, with first-stage

$$ x = Z\pi + \xi \qquad (14) $$

Outcome error $\eta_i$ is correlated with $\xi_i$. Instruments are uncorrelated
with $\xi_i$ by construction and with $\eta_i$ by assumption
• The 2SLS estimator is
$$ \hat{\beta}_{2SLS} = (x' P_Z x)^{-1} x' P_Z y = \beta + (x' P_Z x)^{-1} x' P_Z \eta $$

where $P_Z = Z(Z'Z)^{-1}Z'$ produces fitted values

Bias and First-stage F

• A Bekker (1994) approximation generates:

$$ E[\hat{\beta}_{2SLS} - \beta] \approx \frac{\sigma_{\eta\xi}}{\sigma_\xi^2} \, \frac{1}{F + 1} \qquad (15) $$

where
$$ F \equiv (1/\sigma_\xi^2) \, E[\pi' Z' Z \pi] / q $$
is the "population first stage F"
• As F gets small, the bias of 2SLS approaches $\sigma_{\eta\xi}/\sigma_\xi^2$
• The bias of the OLS estimator is $\sigma_{\eta\xi}/\sigma_x^2$, which also equals $\sigma_{\eta\xi}/\sigma_\xi^2$ if $\pi = 0$

• 2SLS estimates are therefore said to be "biased towards OLS
estimates" when the first stage is weak
• The bias of 2SLS vanishes as F increases, as it should when $\pi \neq 0$
and sample size grows

First-stage F (cont.)

• Bias grows as the number of instruments grows (if the instruments
are weak)
• When added instruments have no effect on the first-stage R-squared, the
model sum of squares, $E(\pi' Z' Z \pi)$, and the residual variance, $\sigma_\xi^2$, are
fixed while $q$ increases
• From this we learn that the addition of weak instruments decreases F
and therefore increases bias

• Holding the first-stage sum of squares fixed, bias is least in the
just-ID case when the number of instruments is as low as it can get

• 2SLS bias is a consequence of first-stage estimation error. We’d like
to use $\hat{x}_{pop} = Z\pi$ as an instrument since these fits are uncorrelated
with second stage resids
• In practice, we use $\hat{x} = P_Z x = Z\pi + P_Z \xi$
• 2SLS bias arises from the corr between $P_Z \xi$ and $\eta$
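
A small numeric illustration of equation (15) may help fix ideas. The inputs below are hypothetical: an error covariance of 0.8, a unit first-stage residual variance, and a first-stage model sum of squares of 30 that is held fixed while worthless instruments are added.

```python
def population_F(model_ss, q, sigma2_xi=1.0):
    """Population first-stage F: (1/sigma2_xi) * E[pi'Z'Z pi] / q."""
    return model_ss / (sigma2_xi * q)

def bekker_bias(F, sigma_eta_xi=0.8, sigma2_xi=1.0):
    """Approximate 2SLS bias from equation (15)."""
    return (sigma_eta_xi / sigma2_xi) / (F + 1.0)

model_ss = 30.0          # hypothetical E[pi'Z'Z pi], fixed as q grows
results = {}
for q in (1, 3, 30):
    F = population_F(model_ss, q)
    results[q] = (F, round(bekker_bias(F), 4))
    print(q, results[q])
# 1 (30.0, 0.0258)
# 3 (10.0, 0.0727)
# 30 (1.0, 0.4)
```

With the sum of squares fixed, going from 1 to 30 instruments cuts F from 30 to 1 and moves the approximate bias from about 3 percent to half of the limiting value of 0.8.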
IV Without Bias or Tears
• The reduced form is unbiased: if the relationship you’re after is
invisible in the reduced form, then it ain’t there!
• In just-identified models, the p-value for the reduced-form effect of the
instrument is approximately the p-value from the second stage
• Chernozhukov and Hansen (2008) develop reduced-form-based
inference for over-identified models

• LIML is approximately median-unbiased for constant-effects (but
beware heteroskedasticity)
• Just-identified 2SLS is approximately unbiased (proof:
just-ID=LIML)
• The just-ID and LIML sampling distributions have no official moments,
yet their medians are where they should be
• Split-sample IV (SSIV) and jackknife IV (JIVE) are Bekker-unbiased
(Angrist and Krueger 1995; Angrist, Imbens, and Krueger 1999)
• Updates include Hausman, Newey, Woutersen, Chao, and Swanson
(2012), many others

Monte Carlo for Many-Weak

$$ y_i = \beta x_i + \eta_i $$
$$ x_i = \sum_{j=1}^{q} \pi_j z_{ij} + \xi_i $$

with $\beta = 1$, $\pi_1 = 0.1$, $\pi_j = 0 \;\; \forall j > 1$, joint normal errors with
$\mathrm{corr}(\eta_i, \xi_i) = .8$, where the instruments, $z_{ij}$, are independent, standard
normals. The sample size is 1000.
• Figure 4.6.1: OLS, just identified IV (q=1, labeled IV; F=11.1),
2SLS (q=2, labeled 2SLS; F=6.0), LIML (q=2)
• Figure 4.6.2: OLS, 2SLS, and LIML with q=20 (1 good instrument,
19 worthless; F=1.51)
• Figure 4.6.3: OLS, 2SLS, and LIML with q=20 but $\pi_j = 0$;
$j = 1, ..., 20$ (all 20 worthless; F=1.0)
• Quarter of birth estimates of the returns to schooling (reprise):
Table 4.6.2
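
The design above is easy to re-create in rough form. The sketch below assumes NumPy and unit-variance errors (the slide gives only the correlation), uses 500 replications, and compares medians of OLS, just-identified IV, and 2SLS with 20 instruments. LIML is left out to keep the example short, so this is not a replication of Figures 4.6.1-4.6.3.

```python
import numpy as np

def draw(n, q, rng, beta=1.0, pi1=0.1, corr=0.8):
    """One sample: only the first of q instruments enters the first stage."""
    Z = rng.standard_normal((n, q))
    xi = rng.standard_normal(n)
    # eta has unit variance (an assumption) and correlation `corr` with xi
    eta = corr * xi + np.sqrt(1 - corr ** 2) * rng.standard_normal(n)
    x = pi1 * Z[:, 0] + xi
    y = beta * x + eta
    return y, x, Z

def tsls(y, x, Z):
    """2SLS slope without covariates: (x'P_Z x)^{-1} x'P_Z y."""
    x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]   # P_Z x
    return (x_hat @ y) / (x_hat @ x)

def simulate(reps=500, n=1000, seed=0):
    rng = np.random.default_rng(seed)
    out = {"OLS": [], "IV (q=1)": [], "2SLS (q=20)": []}
    for _ in range(reps):
        y, x, Z = draw(n, 20, rng)
        out["OLS"].append((x @ y) / (x @ x))
        out["IV (q=1)"].append(tsls(y, x, Z[:, :1]))   # good instrument only
        out["2SLS (q=20)"].append(tsls(y, x, Z))       # 1 good + 19 worthless
    return {k: float(np.median(v)) for k, v in out.items()}

medians = simulate()
for name, med in medians.items():
    print(f"{name:12s} median = {med:.3f}")
```

Under these assumptions the just-identified IV median should sit near the true value of 1, while the 20-instrument 2SLS median lands between 1 and the OLS median, in line with the "biased towards OLS" description.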
Welcome to the Machine

New Models and Methods


• Belloni, Chen, Chernozhukov, and Hansen (2012) use machine
learning to pick a few instruments when you’re blessed w/an
abundance thereof
• The leading ML method here is lasso, a type of "regularized
regression", minimizing
$$ \min_{\{\beta_j\}} \; \underbrace{E\Big[Y_i - \sum_j \beta_j x_j\Big]^2}_{\text{squared error}} + \underbrace{\lambda \sum_j |\beta_j|}_{\text{penalty term}} \qquad (16) $$

where $\lambda$ is a user-chosen penalty

• Lasso favors lower-dimensional "sparse" models and small coefficients
• The absolute value inside the penalty term causes lasso to drop some
regressors, while shrinking others
• Post-lasso runs conventional OLS on the regressors lasso retains
• BCH (2012) discuss the theory behind a post-lasso 2SLS first stage
• Sounds promising!
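
To make the drop-and-shrink behavior concrete, here is a minimal sketch of lasso by coordinate descent followed by post-lasso OLS, applied to a toy first stage. It assumes NumPy; the data, the two "strong" coefficients, and the penalty value 0.5 are all invented for illustration, and the penalty is not chosen by cross-validation or by the plug-in rule discussed in BCH (2012).

```python
import numpy as np

def lasso(X, y, lam, n_iter=200):
    """Coordinate descent for (1/n)*sum((y - Xb)^2) + lam*sum(|b_j|), no intercept."""
    n, k = X.shape
    b = np.zeros(k)
    col_ss = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(k):
            partial = y - X @ b + X[:, j] * b[j]      # residual excluding column j
            r = X[:, j] @ partial / n
            # Soft-thresholding: the absolute-value penalty sets small
            # coefficients exactly to zero and shrinks the rest
            b[j] = np.sign(r) * max(abs(r) - lam / 2, 0.0) / col_ss[j]
    return b

rng = np.random.default_rng(0)
n, q = 500, 10
Z = rng.standard_normal((n, q))                       # candidate instruments
pi = np.zeros(q)
pi[:2] = [2.0, -1.5]                                  # two relevant, eight worthless
x = Z @ pi + rng.standard_normal(n)                   # endogenous regressor

pi_lasso = lasso(Z, x, lam=0.5)
keep = np.flatnonzero(pi_lasso)                       # instruments lasso retains
pi_post = np.linalg.lstsq(Z[:, keep], x, rcond=None)[0]   # post-lasso OLS

print("retained:", keep)
print("lasso:", pi_lasso[keep].round(3), "post-lasso:", pi_post.round(3))
```

In this toy case lasso keeps only the two relevant instruments and shrinks their coefficients toward zero; the post-lasso OLS step undoes the shrinkage on the retained set.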
What have we found? The same old fears . . .

• Sims based on AK91 with 180 instruments (QOB*YOB; QOB*POB)
and even 1530 (QOB*YOB*POB) show that LIML and SSIV beat
ML for bias and MAE
• Lasso for instrument selection faces two challenges
• 2SLS is (still) biased, yo
• 2SLS w/a lassoed first stage is pretesting

• Details
• The good behavior of lasso is predicated on the assumptions of
"approximate sparsity," which implies the sample grows relative to the
number of first-stage parameters
• The Bekker sequence reveals the finite sample behavior of 2SLS, SSIV,
LIML etc. by fixing the number of obs/parameter; Bekker isn’t sparse
• Hall, et al. (1996) show the dangers of test-based solutions to the
weak instruments problem (Andrews, Stock, and Sun 2019 update this)
• Better to use a Bekker-unbiased estimator from the get-go (Angrist
and Frandsen 2019)

Tables and Figures


[Figure 4.1.1, Panel A: Average Education by Quarter of Birth (first stage). Vertical axis: Years of Education; horizontal axis: Year of Birth (30-39); points labeled by quarter of birth (1-4).]

[Figure 4.1.1, Panel B: Average Weekly Wage by Quarter of Birth (reduced form). Vertical axis: Log Weekly Earnings; horizontal axis: Year of Birth (30-39); points labeled by quarter of birth (1-4).]

Figure 4.1.1: Graphical depiction of first stage and reduced form for IV estimates of the economic return to
schooling using quarter of birth (from Angrist and Krueger 1991).

Table 4.1.1
2SLS estimates of the economic returns to schooling

                                OLS             2SLS
                                (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)
Years of education              .071    .067    .102    .13     .104    .108    .087    .057
                                (.0004) (.0004) (.024)  (.020)  (.026)  (.020)  (.016)  (.029)
Exogenous Covariates
Age (in quarters)                                                                       ✓
Age (in quarters) squared                                                               ✓
9 year-of-birth dummies                 ✓                       ✓       ✓       ✓       ✓
50 state-of-birth dummies               ✓                       ✓       ✓       ✓       ✓
Instruments
dummy for QOB = 1                               ✓       ✓       ✓       ✓       ✓       ✓
dummy for QOB = 2                                       ✓               ✓       ✓       ✓
dummy for QOB = 3                                       ✓               ✓       ✓       ✓
QOB dummies interacted with                                                     ✓       ✓
year-of-birth dummies
(30 instruments total)

Notes: The table reports OLS and 2SLS estimates of the returns to schooling using the Angrist and Krueger (1991)
1980 census sample. This sample includes native-born men, born 1930-39, with positive earnings and nonallocated
values for key variables. The sample size is 329,509. Robust standard errors are reported in parentheses. QOB denotes
quarter of birth.

Table 4.1.3

IV Estimates of the Effects of Military Service on the Earnings of White Men born in 1950

              Earnings                  Veteran Status             Wald Estimate of
Earnings      Mean       Eligibility    Inelig.    Eligibility     Veteran Effect
year                     Effect         Mean       Effect
              (1)        (2)            (3)        (4)             (5)

1981          16,461     -435.8         .182       .159            -2,741
                         (210.5)                   (.040)          (1,324)
1971          3,338      -325.9                                    -2,050
                         (46.6)                                    (293)
1969          2,299      -2.0
                         (34.5)

Note: Adapted from Table 5 in Angrist and Krueger (1999) and author tabulations. Standard errors are shown in
parentheses. Earnings data are from Social Security administrative records. Figures are in nominal dollars. Veteran
status data are from the Survey of Program Participation. There are about 13,500 individuals in the sample.

[Figure: yearly point estimates with bands at estimate + 1.96*se and estimate - 1.96*se. Vertical axis: -0.35 to 0.15; horizontal axis: 1970 to 2006.]

Figure 1. Draft-lottery Estimates of Vietnam-era Service Effects on ln(Earnings) for White Men Born 1950-52

102 AMERICAN ECONOMIC JOURNAL: APPLIED ECONOMICS APRIL 2011

[Figure: Panel A. Whites; Panel B. Nonwhites. Vertical axis: P(Veteran|RSN), 0.05 to 0.45; horizontal axis: RSN, 1 to 365; separate series for years of birth 1950, 1951, 1952, and 1953.]

Figure 1. The Conditional Probability of Military Service by Random Sequence Number

Notes: This figure plots the probability of Vietnam-era military service against draft lottery numbers. Data are from
the 2000 census.

[Figure: distributions of the OLS, IV, 2SLS, and LIML estimators. Vertical axis: 0 to 1; horizontal axis: 0 to 2.5.]

Figure 4.6.1: Distribution of the OLS, IV, 2SLS, and LIML estimators. IV uses one instrument, while 2SLS
and LIML use two instruments.

[Figure: distributions of the OLS, 2SLS, and LIML estimators. Vertical axis: 0 to 1; horizontal axis: 0 to 2.5.]

Figure 4.6.2: Distribution of the OLS, 2SLS, and LIML estimators with 20 instruments

[Figure: distributions of the OLS, 2SLS, and LIML estimators. Vertical axis: 0 to 1; horizontal axis: 0 to 2.5.]

Figure 4.6.3: Distribution of the OLS, 2SLS, and LIML estimators with 20 worthless instruments
Table 4.6.2
Alternative IV estimates of the economic returns to schooling

                                    (1)     (2)     (3)     (4)     (5)     (6)
2SLS                                .105    .435    .089    .076    .093    .091
                                    (.020)  (.450)  (.016)  (.029)  (.009)  (.011)
LIML                                .106    .539    .093    .081    .106    .110
                                    (.020)  (.627)  (.018)  (.041)  (.012)  (.015)
F-statistic (excluded instruments)  32.27   .42     4.91    1.61    2.58    1.97
Controls
Year of birth                       ✓       ✓       ✓       ✓       ✓       ✓
State of birth                                                      ✓       ✓
Age, age squared                            ✓               ✓               ✓
Excluded instruments
Quarter-of-birth dummies            ✓       ✓
Quarter of birth*year of birth                      ✓       ✓       ✓       ✓
Quarter of birth*state of birth                                     ✓       ✓
Number of excluded instruments      3       2       30      28      180     178

Notes: The table compares 2SLS and LIML estimates using alternative sets of instruments
and controls. The age and age squared variables measure age in quarters. The OLS
estimate corresponding to the models reported in columns 1-4 is .071; the OLS estimate
corresponding to the models reported in columns 5 and 6 is .067. Data are from the Angrist
and Krueger (1991) 1980 census sample. The sample size is 329,509. Standard errors are
reported in parentheses.

The first column in the table reports 2SLS and LIML estimates of a model using three quarter-of-birth dummies as
instruments, with year-of-birth dummies as covariates. The OLS estimate for this specification is 0.071, while
the 2SLS estimate is a bit higher at 0.105. The first-stage F-statistic is over 32, well out of the danger zone.
Not surprisingly, the LIML estimate is almost identical to 2SLS in this case.

Angrist and Krueger (1991) experimented with models that include age and age squared measured in quarters as
additional controls. These controls are meant to pick up omitted age effects that might confound the
quarter-of-birth instruments. The addition of age and age squared reduces the number of instruments to two, since
age in quarters, year of birth, and quarter of birth are linearly dependent. As shown in column 2, the first-stage
F-statistic drops to 0.4 when age and age squared are included as controls, a sure sign of trouble. But the 2SLS [...]

Columns (1)-(5): 180 Instruments (QOB*YOB; QOB*POB; Average F=2.5)
Columns (6)-(10): 1530 Instruments (QOB*YOB*POB; Average F=1.7)
Within each block: Avg. IVs retained; Bias; Standard deviation; Median abs. dev.; Median abs. error

Estimator                                           (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)      (10)
2SLS                                                180      0.0403   0.0108   0.0075   0.0397   1530     0.0611   0.0046   0.0032   0.0611
Post-lasso IV (CV penalty)                          74.0     0.0390   0.0120   0.0082   0.0384   99.0     0.0559   0.0084   0.0059   0.0560
Post-lasso IV (plug-in penalty, IVs selected)*      2.1      0.0143   0.0346   0.0218   0.0279   1.6      0.0149   0.0367   0.0224   0.0271
Split-Sample IV                                     180      -0.0009  0.0237   0.0158   0.0158   1530     -0.0001  0.0164   0.0112   0.0115
Post-lasso SSIV (CV penalty)                        63.1     -0.0015  0.0258   0.0172   0.0173   63.0     -0.0013  0.0280   0.0183   0.0183
Post-lasso SSIV (plug-in penalty, IVs selected)**   2.1      -0.0724  1.3168   0.0274   0.0287   3.4      0.0197   0.0504   0.0228   0.0292
Post-lasso (IV choice split only, CV penalty)       63.1     0.0429   0.0144   0.0097   0.0431   63.0     0.0460   0.0141   0.0093   0.0459
LIML                                                180      -0.0016  0.0185   0.0123   0.0124   1530     -0.0034  0.0117   0.0079   0.0083
Post-lasso LIML (CV penalty)                        74.0     0.0222   0.0152   0.0102   0.0220   99.0     0.0484   0.0094   0.0066   0.0483
Post-lasso LIML (plug-in penalty, IVs selected)*    2.1      0.0126   0.0347   0.0221   0.0273   1.6      0.0138   0.0366   0.0221   0.0257
Pretested LIML (t >= 3.12 for 180, t >= 2.3 for 1530) 18     0.0222   0.0236   0.0148   0.0238   153      0.0385   0.0163   0.0111   0.0393

Rows reported with one set of statistics:

Estimator                                                                        Bias     Std. dev.  Med. abs. dev.  Med. abs. error
OLS                                                                              0.107    0.0004     0.0003          0.1070
Random forest first stage, 2SLS using RF fits as instruments (min leaf size=1)   0.0611   0.0047     0.0030          0.0612
Random forest 2SLS, min leaf size = 800                                          0.0567   0.0065     0.0045          0.0567
Random forest first stage, SSIV using RF fits as instruments (min leaf size=1)   -0.0003  0.0158     0.0109          0.0108
Random forest SSIV, min leaf size = 800                                          -0.0005  0.0158     0.0104          0.0103

Notes: The table describes simulation results for 999 Monte Carlo estimates of the economic returns to schooling using simulated samples constructed from the
Angrist and Krueger (1991) census sample of men born 1930-39 (N=329,509). The causal effect of schooling is calibrated to 0.1; the OLS estimand is 0.207. The
instruments used to compute the estimates described by columns 1-5 consist of 30 quarter-of-birth-by-year-of-birth and 150 quarter-of-birth-by-state-of-birth
interactions (average F-stat = 2.5, average concentration parameter = 270). The instruments used to compute the estimates described by columns 6-10 are
quarter-of-birth-by-year-of-birth-by-state-of-birth interactions (average F-stat = 1.7, average concentration parameter = 1050). All models include saturated year of birth by
state of birth controls. Columns 1 and 6 report the average number of instruments retained by lasso. Post-lasso estimates are computed as described in the
appendix. Split-Sample IV uses first stage coefficients estimated in one half-sample to construct a cross-sample fitted value used for IV in the other.
Sample-splitting procedures average results from complementary splits. Post-lasso with an IV-choice split only uses post-lasso in half the sample to pick instruments, doing
2SLS with these and own-sample fitted values in the other half. "Post-lasso LIML" is LIML using the instrument set selected by a post-lasso first stage. "Pretested
LIML" estimates are computed using conventional LIML, retaining only instruments with a first-stage t-statistic in the upper decile of t-statistics for the full set of
instruments. Simulation sets choose lasso penalties once, using the original AK91 data. Random forest routines are described in the appendix.
*The plug-in penalty generates a lasso first stage that includes no instruments in 11 simulation runs with 180 instruments and in 57 simulation runs with 1530
instruments. Statistics reported in these rows are for runs completed.
**Post-lasso SSIV with a plug-in penalty picks zero instruments in 670 of 180-instrument runs, and in 893 of 1530-instrument runs. Statistics reported in these rows
are for runs completed.