Chapter 9
Tirtha Chatterjee
Topics we cover
• Functional form misspecification and the RESET test
• Tests against non-nested alternatives
• Proxy variables for unobserved explanatory variables
• Measurement error in the dependent and explanatory variables
• Missing data, non-random samples, and outliers
• Least absolute deviations (LAD) estimation
Functional form misspecification
• Occurs when we do not properly account for the relationship between the dependent variable and the observed explanatory variables
• Can be because we omit functions of the explanatory variables (such as a quadratic term)
• Can also be because we use the wrong functional form of the dependent variable
• Say we use wage when log(wage) is appropriate
• Leads to biased estimators
Misspecification because of Omission
Misspecification because of omission: nested models
• An F test for joint exclusion restrictions can be used to detect functional form misspecification
• Sometimes it is difficult to pinpoint the precise reason for the misspecification
• Regression Specification Error Test (RESET), Ramsey (1969)
RESET
RESET: implementation
RESET: example
• We estimate two models for housing prices (sample size = 88)
• The first one has all variables in level form:
price = β₀ + β₁lotsize + β₂sqrft + β₃bdrms + u
• The second one uses the logarithms of all variables except bdrms:
lprice = β₀ + β₁llotsize + β₂lsqrft + β₃bdrms + u
• First model
• RESET F statistic for the first model: F₂,₈₂ = 4.67, p = 0.012
• We reject the null hypothesis at the 5% level of significance: evidence of misspecification
• Second model
• RESET F statistic for the second model: F₂,₈₂ = 2.56, p = 0.084
• We do not reject the null hypothesis at the 5% level of significance: no evidence of misspecification
• On the basis of RESET, the log-log (second) model is the preferred specification
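A rough sketch of how such a RESET test can be computed, assuming a pandas DataFrame df with the column names used above (the helper function below is illustrative, not from the text): re-estimate the model with the squares and cubes of the fitted values added, and jointly test the two added terms with an F test.

```python
import statsmodels.api as sm

def reset_test(y, X):
    """Ramsey RESET: add squares and cubes of the fitted values to the
    original model and F-test the two added terms jointly."""
    X = sm.add_constant(X)
    base = sm.OLS(y, X).fit()
    X_aug = X.copy()
    X_aug["yhat2"] = base.fittedvalues ** 2
    X_aug["yhat3"] = base.fittedvalues ** 3
    aug = sm.OLS(y, X_aug).fit()
    f_stat, p_value, _ = aug.compare_f_test(base)  # H0: both added terms are zero
    return f_stat, p_value

# Levels model: price on lotsize, sqrft, bdrms
# print(reset_test(df["price"], df[["lotsize", "sqrft", "bdrms"]]))
# Log model: lprice on llotsize, lsqrft, bdrms
# print(reset_test(df["lprice"], df[["llotsize", "lsqrft", "bdrms"]]))
```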
RESET: drawbacks
• It does not provide any direction on how to proceed if a model is rejected
• In our previous example, when the first model was rejected, we could think of the log-log model because it is easy to interpret
• In this example, it so happens that the log-log model passes the functional form test as well
• We might not be so lucky all the time: what if the second model had been rejected too?
• RESET has no power for detecting omitted variables whose expectations are linear in the included explanatory variables
• The bottom line is that RESET is a functional form test, and nothing more.
Non-nested models
• Consider two non-nested models for the same dependent variable: y = β₀ + β₁x₁ + β₂x₂ + u versus y = β₀ + β₁log(x₁) + β₂log(x₂) + u
• Neither model is a special case of the other, so we cannot simply use a standard F test.
• Two approaches have been suggested.
• Mizon and Richard (1986)
• Davidson-MacKinnon test
First approach: Mizon and Richard (1986)
• Steps
• First construct a comprehensive model that contains each model as a special case and
• then test the restrictions that led to each of the models.
• The comprehensive model is
y = γ₀ + γ₁x₁ + γ₂x₂ + γ₃log(x₁) + γ₄log(x₂) + u
• We can first test H₀: γ₃ = 0, γ₄ = 0 as a test of the first (levels) model
• We can also test H₀: γ₁ = 0, γ₂ = 0 as a test of the second (log) model
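A minimal sketch of this comprehensive-model approach, assuming a DataFrame df with columns y, x1, x2 (names are illustrative):

```python
import numpy as np
import statsmodels.formula.api as smf

df = df.assign(lx1=np.log(df["x1"]), lx2=np.log(df["x2"]))

# Comprehensive model that nests both candidate models
comp = smf.ols("y ~ x1 + x2 + lx1 + lx2", data=df).fit()
m1 = smf.ols("y ~ x1 + x2", data=df).fit()    # first model (levels)
m2 = smf.ols("y ~ lx1 + lx2", data=df).fit()  # second model (logs)

# F test of H0: gamma3 = gamma4 = 0 (restrictions that give the first model)
print(comp.compare_f_test(m1))
# F test of H0: gamma1 = gamma2 = 0 (restrictions that give the second model)
print(comp.compare_f_test(m2))
```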
Second approach: Davidson-MacKinnon test
• If the first model holds with E(u | x₁, x₂) = 0, then the fitted values from the second model should be insignificant when added to the first model
• Therefore, to test whether the first model is correct, we first estimate the second model by OLS to obtain its fitted values; call these ŷ
• The ŷ are just nonlinear functions of x₁ and x₂
• Then estimate the auxiliary regression
y = β₀ + β₁x₁ + β₂x₂ + θ₁ŷ + error
• The Davidson-MacKinnon test is based on the t statistic on ŷ in this auxiliary regression
• A significant t statistic means rejection of the first model
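A minimal sketch of the Davidson-MacKinnon test of the first (levels) model against the log alternative, using the same illustrative DataFrame df:

```python
import numpy as np
import statsmodels.formula.api as smf

df = df.assign(lx1=np.log(df["x1"]), lx2=np.log(df["x2"]))

# Step 1: estimate the competing (log) model and keep its fitted values
fit2 = smf.ols("y ~ lx1 + lx2", data=df).fit()
df = df.assign(yhat2=fit2.fittedvalues)

# Step 2: add those fitted values as a regressor in the first (levels) model
aux = smf.ols("y ~ x1 + x2 + yhat2", data=df).fit()

# The DM statistic is the t statistic on yhat2; a significant value rejects the first model
print(aux.tvalues["yhat2"], aux.pvalues["yhat2"])
```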
Second approach: Davidson-MacKinnon test
• The same procedure can be used to test whether the second model is correctly specified
• We estimate the first model and then use its fitted values as a regressor in the second model
• A significant t statistic is a rejection of the second model
• This approach can be used for testing any two non-nested models with the same dependent variable
Problems with testing non-nested alternatives
Functional form misspecification: omission of key variables
Proxy variables
• A proxy variable is an observed variable that is related to the unobserved variable we would like to control for in our analysis
• Example: in a wage equation, ability is unobserved; one proxy for ability could be IQ
• IQ does not have to be the same thing as ability;
• what we need is for IQ to be correlated with ability
• Let us estimate the following model: y = β₀ + β₁x₁ + β₂x₂ + β₃x₃* + u
• We assume that data are available on y, x₁, and x₂
• The explanatory variable x₃* is unobserved, but we have a proxy variable for it
• Let us call the proxy variable x₃
• In our example, x₃* is ability and x₃ is IQ
Using proxy variables for unobserved explanatory variables
• How can we use x₃ to get unbiased (or at least consistent) estimators of β₁ and β₂?
• We can plug in x₃ for x₃* and run the regression of y on x₁, x₂, x₃
• We call this the plug-in solution to the omitted variables problem,
• because x₃ is just plugged in for x₃* before we run OLS
• If x₃ is truly related to x₃*, this seems like a sensible thing to do
• However, since x₃ and x₃* are not the same, we need some assumptions
Assumptions needed for the plug-in solution to work
Proxy variable
y = β₀ + β₁x₁ + β₂x₂ + β₃x₃* + u
x₃* = δ₀ + δ₃x₃ + v₃
• If we substitute δ₀ + δ₃x₃ + v₃ for x₃* and do simple algebra, we get
y = (β₀ + β₃δ₀) + β₁x₁ + β₂x₂ + β₃δ₃x₃ + u + β₃v₃
• The composite error in this equation is e = u + β₃v₃; it depends on u and v₃
• Write this equation as y = α₀ + β₁x₁ + β₂x₂ + α₃x₃ + e
• where α₀ = β₀ + β₃δ₀ is the new intercept and α₃ = β₃δ₃ is the slope parameter on the proxy variable x₃
• e has zero mean and is uncorrelated with x₁, x₂, and x₃,
• since u and v₃ both have zero mean and each is uncorrelated with x₁, x₂, and x₃
• The estimated model (regressing y on x₁, x₂, and x₃) therefore gives unbiased (or at least consistent) estimators of α₀, β₁, β₂, and α₃
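A small simulation sketch of the plug-in solution; all parameter values and the ability/IQ data-generating process below are made up for illustration. Ability is generated so that, given IQ, it is unrelated to the other regressors, and the slope on educ is then recovered once IQ is plugged in.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 50_000

iq = rng.normal(100, 15, size=n)                  # observed proxy x3
abil = -2.0 + 0.03 * iq + rng.normal(size=n)      # x3* = delta0 + delta3*x3 + v3
educ = 6.0 + 0.06 * iq + rng.normal(size=n)       # educ correlated with ability via IQ
exper = rng.normal(10, 3, size=n)
lwage = 1.0 + 0.08 * educ + 0.02 * exper + 0.10 * abil + rng.normal(scale=0.3, size=n)

def slopes(y, cols):
    X = sm.add_constant(np.column_stack(cols))
    return sm.OLS(y, X).fit().params

print(slopes(lwage, [educ, exper]))       # ability omitted: educ coefficient biased upward
print(slopes(lwage, [educ, exper, iq]))   # plug-in proxy: educ coefficient close to 0.08
```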
Including a proxy could exacerbate the problem of multicollinearity
• For example, including a variable like IQ as a proxy for ability in a regression that already includes educ (the two are correlated)
• But there are advantages
• First, the inclusion of IQ reduces the error variance because the part of ability explained by IQ has been removed from the error; this is reflected in a smaller standard error of the regression
• Second, the added multicollinearity is a necessary evil if we want to get an unbiased (or at least consistent) estimator of the coefficient on educ
• Multicollinearity would increase even if we had data on ability itself and included it in the model
• Ultimately, it is a trade-off
Using a lagged dependent variable as a proxy variable
• Sometimes we suspect that one or more of the x variables are correlated with the error, but we are not aware of a good proxy
• In those cases, we can use the dependent variable observed at an earlier point in time as a proxy:
• a lagged dependent variable
• Using past values increases the data requirements, but it is a way to account for historical factors that could have an impact on the present
• Consider a simple equation to explain city crime rates:
crime = β₀ + β₁unem + β₂expend + β₃crime₋₁ + u
• where crime is a measure of per capita crime, unem is the city unemployment rate, expend is per capita spending on law enforcement, and crime₋₁ indicates the crime rate measured in some earlier year (this could be the past year or several years ago)
Using a lagged dependent variable as a proxy variable
• Cities with historically high crime rates tend to spend more on law enforcement today; thus, factors unobserved to us (the econometricians) that affect crime are likely to be correlated with expend (and unem)
• A single cross section is then unlikely to give an unbiased estimator of the causal effect of law enforcement expenditure on crime; including crime₋₁ helps control for these historical factors
Measurement error
Measurement error in dependent variable
• Let y* denote the variable that we would like to explain, say, annual family savings
• The regression model has the usual form
y* = β₀ + β₁x₁ + ⋯ + βₖxₖ + u
• and we assume it satisfies the Gauss-Markov assumptions
• Say there is measurement error in y*, i.e., we do not measure it correctly
• Let y represent the observable measure of y*; in the savings case, y is reported annual savings
• Unfortunately, families are not perfect in their reporting of annual family savings
Measurement error in dependent variable
• Measurement error is the difference between the observed value and the actual value: e₀ = y − y*
• The important thing is how the measurement error in the population is related to other factors
• To obtain an estimable model, we plug y* = y − e₀ into the equation and rearrange:
y = β₀ + β₁x₁ + ⋯ + βₖxₖ + u + e₀
• The error term is u + e₀. We estimate this model by OLS
Measurement error in dependent variable
• We can estimate the model by OLS and ignore the measurement error if e₀ also has zero mean, like u, and is uncorrelated with each xⱼ
• If the measurement error does not have zero mean, the estimator of the intercept is biased (the slope estimators are not)
• If we assume corr(xⱼ, e₀) = 0 for all j, then there is no endogeneity and the OLS estimators are unbiased and consistent
• Measurement error in the dependent variable does, however, result in larger variances of the OLS estimators:
• if e₀ and u are uncorrelated, then Var(u + e₀) = σᵤ² + σ₀² > σᵤ², where σ₀² is the variance of e₀
• Basically, if the measurement error is uncorrelated with the independent variables, OLS estimation has good properties
• Otherwise, there is endogeneity and the estimates are biased
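A small simulation sketch of this point (parameter values are illustrative): adding mean-zero reporting error to the dependent variable leaves the OLS slope roughly unbiased but inflates its standard error.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 1_000

x = rng.normal(size=n)
y_star = 2.0 + 1.5 * x + rng.normal(size=n)   # true savings y*
y = y_star + rng.normal(scale=2.0, size=n)    # reported savings: y = y* + e0

X = sm.add_constant(x)
fit_true = sm.OLS(y_star, X).fit()
fit_meas = sm.OLS(y, X).fit()

# Both slope estimates are near 1.5, but the reported-y regression has a
# larger standard error because Var(u + e0) > Var(u)
print(fit_true.params[1], fit_true.bse[1])
print(fit_meas.params[1], fit_meas.bse[1])
```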
Measurement error in explanatory variable
y = β₀ + β₁x₁* + u
• The problem is that x₁* is not observed. Instead, we have a measure of x₁*; call it x₁
• For example, x₁* could be actual income and x₁ could be reported income
• The measurement error in the population is e₁ = x₁ − x₁*
• This can be positive, negative, or zero
• We assume that
• the average measurement error in the population is zero: E(e₁) = 0
• E(y | x₁*, x₁) = E(y | x₁*), which just says that x₁ does not affect y after x₁* has been controlled for
Classical errors-in-variables (CEV) assumption
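The classical errors-in-variables assumption is that the measurement error e₁ is uncorrelated with the unobserved regressor x₁* (so it is necessarily correlated with the observed x₁ = x₁* + e₁). Under CEV, the OLS slope on the mismeasured regressor is attenuated toward zero: in the simple regression case, plim β̂₁ = β₁ · Var(x₁*) / [Var(x₁*) + Var(e₁)]. A minimal simulation sketch with illustrative values:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 100_000
beta1 = 1.0

x_star = rng.normal(scale=2.0, size=n)  # true regressor, Var = 4
e1 = rng.normal(scale=1.0, size=n)      # measurement error, Var = 1, uncorrelated with x_star
x = x_star + e1                         # observed, mismeasured regressor
y = 0.5 + beta1 * x_star + rng.normal(size=n)

slope = sm.OLS(y, sm.add_constant(x)).fit().params[1]
print(slope)                            # roughly beta1 * 4 / (4 + 1) = 0.8 (attenuation)
```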
Missing data
• Missing data can arise for various reasons, e.g., we collected the information and later found some of it missing
• Missing data reduce the sample size
• Are there any statistical consequences of using the OLS estimator and ignoring the missing data?
• There is no statistical problem if the data are missing completely at random (MCAR)
• Under MCAR, units with complete data are not systematically different from units with missing data
• The reason the data are missing is independent, in a statistical sense, of the variables in the model
• Do not simply replace missing values with zeros; on its own this can cause substantial bias in the OLS estimators
• Under MCAR, one can instead use the missing-data-indicator approach (sketched below):
• create a dummy variable that equals one when the observation on the variable is missing and zero otherwise, and recode the variable itself to equal the actual data when observed and zero when missing
• both the recoded variable and the new missing-data indicator are included in the regression
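A minimal sketch of this missing-data-indicator approach under MCAR (the DataFrame df and its column names are illustrative):

```python
import statsmodels.formula.api as smf

# Suppose x2 has missing values: build the indicator and a zero-filled version
df = df.assign(
    x2_miss=df["x2"].isna().astype(int),  # 1 if x2 is missing, 0 otherwise
    x2_fill=df["x2"].fillna(0.0),         # actual value when observed, zero when missing
)

# Include both the recoded variable and the missingness indicator,
# so the full sample is retained instead of dropping incomplete rows
fit = smf.ols("y ~ x1 + x2_fill + x2_miss", data=df).fit()
print(fit.summary())
```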
Missing data: non-random samples
Exogenous sample selection
Endogenous sample selection
Outliers
Least Absolute Deviation estimators
• The LAD estimators of the βⱼ in a linear model minimize the sum of the absolute values of the residuals
• LAD does not give increasing weight to larger residuals,
• so it is much less sensitive to changes in the extreme values of the data than OLS
• LAD is designed to estimate the parameters of the conditional median of y given the explanatory variables, rather than the conditional mean
• Because the median is not affected by large changes in the extreme observations,
• it follows that the LAD parameter estimates are more resilient to outlying observations
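In practice, LAD can be computed as median (quantile 0.5) regression. A minimal sketch using statsmodels (the DataFrame and column names are illustrative):

```python
import statsmodels.formula.api as smf

# OLS fits the conditional mean; LAD (median regression) fits the conditional median
ols_fit = smf.ols("y ~ x1 + x2", data=df).fit()
lad_fit = smf.quantreg("y ~ x1 + x2", data=df).fit(q=0.5)

# With outliers in y, the LAD slopes typically move far less than the OLS slopes
print(ols_fit.params)
print(lad_fit.params)
```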
What to do in case of outliers?