ECON7310: Elements of Econometrics: Research Project 2
error, and (β0, β1, β2, β3) are unknown parameters of interest. As usual,
T refers to time periods. Use the data file Q1data.dta to answer the
(a)
identifier (id) and time identifier (t). Which regressor(s) are not time-
You can use the egen command along with the by option to compute
the standard deviation of each regressor for each i. Which regressor(s)
Answer:
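As an illustration of the hint (computing the standard deviation of each regressor for each i), the check can be sketched in Python. This is a minimal sketch, not the Stata commands themselves: the rows, the regressor names `x1`-`x3`, and the helper functions are hypothetical, and the values are made up rather than taken from Q1data.dta.

```python
from collections import defaultdict
from statistics import stdev

# Hypothetical long-format panel rows: (id, t, x1, x2, x3). Values are made up.
rows = [
    (1, 1, 0.5, 2.0, 1.0),
    (1, 2, 0.9, 2.0, 1.4),
    (1, 3, 1.3, 2.0, 0.7),
    (2, 1, 0.2, 5.0, 0.3),
    (2, 2, 0.4, 5.0, 0.8),
    (2, 3, 0.1, 5.0, 0.6),
]
names = ["x1", "x2", "x3"]


def within_sd(rows, names):
    """Return {regressor: {id: sample SD over time}} for each unit."""
    by_id = defaultdict(lambda: defaultdict(list))
    for unit, _t, *values in rows:
        for name, value in zip(names, values):
            by_id[name][unit].append(value)
    return {
        name: {unit: stdev(vals) for unit, vals in units.items()}
        for name, units in by_id.items()
    }


def time_invariant(rows, names, tol=1e-12):
    """A regressor is time-invariant if its within-unit SD is zero for every unit."""
    sds = within_sd(rows, names)
    return [name for name in names if all(sd <= tol for sd in sds[name].values())]


print(time_invariant(rows, names))
```

A regressor whose within-unit standard deviation is zero for every i never changes over time, which matters later because a fixed-effects (within) estimator cannot identify its coefficient.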
Answer:
(10 points) It is well known that the standard errors (SE) of panel data
error uit over time for given i (clustering on i), i.e., C(uit, uis | xi) ≠ 0 for t
estimator is BLUE?
Answer:
$\hat{y}_{it} = 2.398\,x_{it,1} - 2.052\,x_{it,2} + 0.327\,x_{it,3} + 1.886$
The coefficient estimates and R2 are exactly the same as those in Part (b).
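The point that clustering changes only the standard errors, not the coefficient estimates, can be illustrated with a small sketch. It assumes a single regressor plus an intercept, uses made-up data, and omits the finite-sample correction that statistical packages typically apply, so the numbers are illustrative only. The function name `ols_slope_with_se` is hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical panel observations: (id, x, y). Values are made up.
data = [
    (1, 1.0, 2.1), (1, 2.0, 4.3), (1, 3.0, 6.2),
    (2, 1.5, 2.0), (2, 2.5, 4.1), (2, 3.5, 6.3),
    (3, 0.5, 1.6), (3, 1.0, 2.9), (3, 2.0, 4.8),
]


def ols_slope_with_se(data):
    """Return (slope, conventional SE, cluster-robust SE) for y = a + b*x."""
    n = len(data)
    xbar = sum(x for _, x, _ in data) / n
    ybar = sum(y for _, _, y in data) / n
    sxx = sum((x - xbar) ** 2 for _, x, _ in data)
    slope = sum((x - xbar) * (y - ybar) for _, x, y in data) / sxx
    intercept = ybar - slope * xbar

    # Keep the cluster id, the demeaned regressor, and the OLS residual.
    resid = [(g, x - xbar, y - intercept - slope * x) for g, x, y in data]

    # Conventional SE: assumes homoskedastic, uncorrelated errors.
    s2 = sum(u * u for _, _, u in resid) / (n - 2)
    se_conventional = sqrt(s2 / sxx)

    # Cluster-robust SE: sum the scores within each i before squaring,
    # which allows arbitrary correlation of the errors over time for given i.
    score = defaultdict(float)
    for g, xd, u in resid:
        score[g] += xd * u
    se_cluster = sqrt(sum(s * s for s in score.values())) / sxx

    return slope, se_conventional, se_cluster


slope, se_conventional, se_cluster = ols_slope_with_se(data)
print(slope, se_conventional, se_cluster)
```

The slope is computed once and shared by both standard errors; only the estimate of its sampling variance differs.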
this is true, the Gauss-Markov theorem does not hold. In this case, the
(d)
(10 points) One of your friends argues that the OLS estimator may be
wrong with using OLS? Your friend suggests that you should use
variables (IV), zit,1 and zit,2, for xit,1. What conditions must hold for zit,1
Answer:
For zit,1 and zit,2 to be valid IVs, they must satisfy the following
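In the usual textbook notation, the two requirements for instrument validity are commonly written as below. This is a sketch of the standard statement, with u_it denoting the error in (1); it may be worded differently from the original answer.

```latex
\begin{align*}
\text{Relevance:}  &\quad \operatorname{Cov}(z_{it,j},\, x_{it,1}) \neq 0
                    \quad \text{for at least one } j \in \{1, 2\},
                    \text{ after controlling for } x_{it,2},\, x_{it,3}, \\
\text{Exogeneity:} &\quad \operatorname{Cov}(z_{it,j},\, u_{it}) = 0
                    \quad \text{for } j = 1, 2 .
\end{align*}
```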
(e)
(15 points) Estimate (1) using TSLS with zit,1 and zit,2 as IV. As in (c),
you should compute and report cluster-robust SE. Compare the TSLS
your findings. Assuming both zit,1 and zit,2 are valid IV, do you think
pretty sure that zit,1 is exogenous. Name a test that can be used to check
Answer:
$\hat{y}_{it} = 1.724\,x_{it,1} - 2.113\,x_{it,2} + 1.191\,x_{it,3} + 1.931$
Compared with the estimated model from Part (c), it is clear that the
obtained from Part (c), which suggests that there is at least one
endogenous regressor.
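Why OLS and TSLS estimates diverge when a regressor is endogenous can be seen in a small simulation. This is a minimal sketch under stated assumptions: the data are simulated (not Q1data.dta), there is one endogenous regressor and two instruments, the true slope is set to 2, and the helpers `solve`, `ols`, and `tsls` are hypothetical names written for this example. Standard errors are not computed here; the second-stage OLS standard errors from a manual two-step procedure would be incorrect.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """OLS coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def tsls(y, x_endog, exog, instruments):
    """TSLS with one endogenous regressor; its coefficient is returned first."""
    # First stage: regress the endogenous regressor on exogenous regressors and IVs.
    Z = [e + z for e, z in zip(exog, instruments)]
    pi = ols(Z, x_endog)
    x_hat = [sum(p * v for p, v in zip(pi, row)) for row in Z]
    # Second stage: replace the endogenous regressor with its fitted value.
    return ols([[xh] + e for xh, e in zip(x_hat, exog)], y)


random.seed(0)
y, x1, exog, inst = [], [], [], []
for _ in range(2000):
    z1, z2, e, w = (random.gauss(0, 1) for _ in range(4))
    x = 0.8 * z1 + 0.5 * z2 + e      # e makes x correlated with the error
    u = e + 0.5 * w                  # structural error
    x1.append(x)
    y.append(1.0 + 2.0 * x + u)      # true slope is 2 by construction
    exog.append([1.0])               # constant only
    inst.append([z1, z2])

b_ols = ols([[x, 1.0] for x in x1], y)[0]
b_tsls = tsls(y, x1, exog, inst)[0]
print(b_ols, b_tsls)
```

In this design OLS is biased upward because the regressor and the error share the component `e`, while TSLS uses only the variation in the regressor that comes from the instruments.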
This intuition is verified by the (robust) Hausman test. Its p-value is
below:
$$\hat{x}_{it,1} = \underset{(0.091)}{-0.108}\,x_{it,2} + \underset{(0.224)}{1.309}\,x_{it,3} + \underset{(0.144)}{0.305}\,z_{it,1} + \underset{(0.071)}{0.209}\,z_{it,2} - \underset{(0.120)}{0.046}, \qquad R^2 = 0.1034$$
(f)
model $y_{it} = \beta_0 + \beta_1 x_{it,1} + \beta_2 x_{it,2} + \beta_3 x_{it,3} + \sum_{s=2}^{T} \gamma_s d_{s,t} + v_{it}$ (2), where $d_{s,t}$
for t = 2 to t = T. Why? Estimate (2) using TSLS with zit,1 and zit,2 as IV
and test if time effects are significant, i.e., at least one γt is not zero.
regressor? [Hint: Use OLS and TSLS to estimate (2) and compare their
estimates.]
Answer:
no time effects. It turns out that controlling for time effects does not help
eliminate the endogeneity problem, as the p-value for the Hausman test
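(10 points) Suppose that $v_{it} = \alpha_i + e_{it}$ with $e_{it} \sim \text{i.i.d.}(0, \sigma_e^2)$. Re-write
(2) as $y_{it} = \beta_0 + \beta_1 x_{it,1} + \beta_2 x_{it,2} + \beta_3 x_{it,3} + \sum_{s=2}^{T} \gamma_s d_{s,t} + \alpha_i + e_{it}$ (3). Treat $\alpha_i$ as
fixed effects (FE). Use an FE estimator to estimate (3). Justify the fact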
your findings.
Answer:
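The idea behind an FE (within) estimator can be sketched as follows: subtracting each unit's time average removes αi, so any correlation between αi and the regressors no longer biases the slope. This is a minimal sketch with made-up data, one regressor, and no time dummies; the names `panel`, `slope`, and `demean_by_unit` are hypothetical, and the true slope is 2 by construction.

```python
from collections import defaultdict

# Hypothetical panel: (id, x, y) with y = alpha_i + 2*x, alpha_1 = 0, alpha_2 = 5.
# The unit with the larger alpha_i also has larger x, so alpha_i and x are correlated.
panel = [
    (1, 1.0, 2.0), (1, 2.0, 4.0), (1, 3.0, 6.0),
    (2, 4.0, 13.0), (2, 5.0, 15.0), (2, 6.0, 17.0),
]


def slope(xs, ys):
    """OLS slope of ys on xs with an intercept."""
    xbar = sum(xs) / len(xs)
    ybar = sum(ys) / len(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx


def demean_by_unit(panel):
    """Within transformation: subtract each unit's time average from x and y."""
    groups = defaultdict(list)
    for unit, x, y in panel:
        groups[unit].append((x, y))
    xs, ys = [], []
    for obs in groups.values():
        xm = sum(x for x, _ in obs) / len(obs)
        ym = sum(y for _, y in obs) / len(obs)
        xs += [x - xm for x, _ in obs]
        ys += [y - ym for _, y in obs]
    return xs, ys


pooled = slope([x for _, x, _ in panel], [y for _, _, y in panel])
fe = slope(*demean_by_unit(panel))
print(pooled, fe)
```

Pooled OLS overstates the slope here because it attributes the difference in αi to x, whereas the within estimator recovers the true value. The same transformation maps any time-invariant regressor to zero, which is why its coefficient cannot be estimated by FE.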
appropriate to assume E[eit | vit] = 0 but allow E[αi | vit] ≠ 0. Note that the
This is not surprising as we know from Part (e) that zit,1 and zit,2 are
In April 2008, the unemployment rate in the United States stood at 5%.
lose their jobs than others during the Great Recession? For example,
were young workers more likely to lose their jobs than middle-aged
without a degree or women versus men? The data file employment 08-
These workers were surveyed one year later, in April 2009, and asked
labor force). The data set also includes various demographic measures
for each individual. Use these data to answer the following questions.
(a)
Answer:
Because the p-values of age and age^2 are both almost 0 and less than
employment.
From the p-values, we know that age and age^2 are both statistically
significant in this linear probability model. This indicates that there is
(b)
Answer:
Because the p-values of age and age^2 are both almost 0 and less than
employment.
From the p-values, we know that age and age^2 are both statistically
significant in this linear probability model. This indicates that there is
(c)
Answer:
Because the p-values of age and age^2 are both almost 0 and less than
employment.
From the p-values, we know that age and age^2 are both statistically
significant in this linear probability model. This indicates that there is
(d)
Answer:
For logit:
For Linear:
For Probit:
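Predictions from the three models differ only in how the index β0 + β1·age + β2·age^2 is mapped to a probability: the linear probability model uses the index itself, logit uses the logistic function, and probit uses the standard normal CDF. The sketch below illustrates this with hypothetical coefficients chosen for illustration; they are not the estimates from parts (a)-(c), and the function names are assumptions.

```python
from math import exp
from statistics import NormalDist

# Hypothetical (constant, age, age^2) coefficients, NOT estimates from the data.
COEF = {
    "linear": (0.30, 0.025, -0.0003),
    "logit": (-1.0, 0.12, -0.0014),
    "probit": (-0.6, 0.07, -0.0008),
}


def index(age, b0, b1, b2):
    """Quadratic-in-age index shared by all three models."""
    return b0 + b1 * age + b2 * age ** 2


def predict(model, age):
    """Predicted probability of employment under the named model."""
    z = index(age, *COEF[model])
    if model == "linear":
        return z                       # the index is the probability itself
    if model == "logit":
        return 1.0 / (1.0 + exp(-z))   # logistic CDF
    return NormalDist().cdf(z)         # standard normal CDF (probit)


def turning_point(b1, b2):
    """Age at which the quadratic index peaks (requires b2 < 0)."""
    return -b1 / (2.0 * b2)


for model in COEF:
    probs = [round(predict(model, age), 3) for age in (20, 40, 60)]
    print(model, probs, round(turning_point(*COEF[model][1:]), 1))
```

With a positive coefficient on age and a negative coefficient on age^2, the predicted probability rises with age up to the turning point and falls afterwards in all three models.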
(e)
Explain.
Answer:
The results of Part (d) show that there are significant differences in
Additionally, the probability first rises and then falls as age
increases from 20 to 60 years, which also indicates that
(f)
(10 points) The data set includes variables measuring the workers’
country, and weekly earnings in April 2008. Repeat (a)-(c) using these
factors as additional regressors and construct a table like Table 11.2
[Hint: You will need to generate dummies for race groups and use
Answer:
The coefficients of age and age^2 differ noticeably from the
estimates in parts (a) to (c), which suggests that there
is omitted variable bias in the models from parts (a) to (c).
In conclusion, the second table below suggests that a young
married man with a high school degree or less and
lower weekly earnings suffered the most in the Great Recession.