Chapter 9
Tirtha Chatterjee
Topics we cover
• Functional form misspecification and the RESET test
• Tests against non-nested alternatives
• Proxy variables for unobserved explanatory variables
• Measurement error in the dependent and explanatory variables
• Missing data, non-random samples, and outliers
• Least absolute deviations (LAD) estimation
Functional form misspecification
• Occurs when we do not properly account for the relationship between the dependent variable and the observed explanatory variables
• Can be because we omit functions of the explanatory variables (such as a quadratic term)
• Can also be because we use the wrong functional form of the dependent variable
• Say we use wage when log(wage) is appropriate
• Leads to biased estimators
Misspecification because of Omission
Misspecification because of omission: nested models
• An F test for joint exclusion restrictions can be used to detect functional form misspecification
• Sometimes it is difficult to pinpoint the precise reason for the misspecification
• Regression Specification Error Test (RESET), Ramsey (1969)
RESET
RESET: implementation
RESET: example
• We estimate two models for housing prices (sample size = 88)
• The first one has all variables in level form:
price = β₀ + β₁lotsize + β₂sqrft + β₃bdrms + u
• The second one uses the logarithms of all variables except bdrms:
lprice = β₀ + β₁llotsize + β₂lsqrft + β₃bdrms + u
• First model
• RESET F statistic for the first model: F₂,₈₂ = 4.67, p = 0.012
• We reject the null hypothesis at the 5% level of significance: evidence of misspecification
• Second model
• RESET F statistic for the second model: F₂,₈₂ = 2.56, p = 0.084
• We do not reject the null hypothesis at the 5% level of significance: no evidence of misspecification
• On the basis of RESET, the log-log (second) model is the preferred specification
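A rough sketch of how such a RESET test can be computed, assuming a pandas DataFrame df with the column names used above (the helper function below is illustrative, not from the text): re-estimate the model with the squares and cubes of the fitted values added, and jointly test the two added terms with an F test.

```python
import statsmodels.api as sm

def reset_test(y, X):
    """Ramsey RESET: add squares and cubes of the fitted values to the
    original model and F-test the two added terms jointly."""
    X = sm.add_constant(X)
    base = sm.OLS(y, X).fit()
    X_aug = X.copy()
    X_aug["yhat2"] = base.fittedvalues ** 2
    X_aug["yhat3"] = base.fittedvalues ** 3
    aug = sm.OLS(y, X_aug).fit()
    f_stat, p_value, _ = aug.compare_f_test(base)  # H0: both added terms are zero
    return f_stat, p_value

# Levels model: price on lotsize, sqrft, bdrms
# print(reset_test(df["price"], df[["lotsize", "sqrft", "bdrms"]]))
# Log model: lprice on llotsize, lsqrft, bdrms
# print(reset_test(df["lprice"], df[["llotsize", "lsqrft", "bdrms"]]))
```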
RESET: drawbacks
• It does not provide any direction on how to proceed if a model is rejected
• In our previous example, when the first model was rejected, we could think of the log-log model because it is easy to interpret
• In this example, it so happens that the log-log model passes the functional form test as well
• We might not be so lucky all the time: what if the second model had been rejected too?
• RESET has no power for detecting omitted variables whose expectations are linear in the included explanatory variables
• The bottom line is that RESET is a functional form test, and nothing more.
Non-nested models
• Consider two non-nested models for the same dependent variable: y = β₀ + β₁x₁ + β₂x₂ + u versus y = β₀ + β₁log(x₁) + β₂log(x₂) + u
• Neither model is a special case of the other, so we cannot simply use a standard F test.
• Two approaches have been suggested.
• Mizon and Richard (1986)
• Davidson-MacKinnon test
First approach: Mizon and Richard (1986)
• Steps
• First construct a comprehensive model that contains each model as a special case and
• then test the restrictions that led to each of the models.
• The comprehensive model is
y = γ₀ + γ₁x₁ + γ₂x₂ + γ₃log(x₁) + γ₄log(x₂) + u
• We can first test H₀: γ₃ = 0, γ₄ = 0 as a test of the first (levels) model
• We can also test H₀: γ₁ = 0, γ₂ = 0 as a test of the second (log) model
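A minimal sketch of this comprehensive-model approach, assuming a DataFrame df with columns y, x1, x2 (names are illustrative):

```python
import numpy as np
import statsmodels.formula.api as smf

df = df.assign(lx1=np.log(df["x1"]), lx2=np.log(df["x2"]))

# Comprehensive model that nests both candidate models
comp = smf.ols("y ~ x1 + x2 + lx1 + lx2", data=df).fit()
m1 = smf.ols("y ~ x1 + x2", data=df).fit()    # first model (levels)
m2 = smf.ols("y ~ lx1 + lx2", data=df).fit()  # second model (logs)

# F test of H0: gamma3 = gamma4 = 0 (restrictions that give the first model)
print(comp.compare_f_test(m1))
# F test of H0: gamma1 = gamma2 = 0 (restrictions that give the second model)
print(comp.compare_f_test(m2))
```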
Second approach: Davidson-MacKinnon test
• If the first model holds with E(u | x₁, x₂) = 0, then the fitted values from the second model should be insignificant when added to the first model
• Therefore, to test whether the first model is correct, we first estimate the second model by OLS to obtain its fitted values; call these ŷ
• The ŷ are just nonlinear functions of x₁ and x₂
• Then estimate the auxiliary regression
y = β₀ + β₁x₁ + β₂x₂ + θ₁ŷ + error
• The Davidson-MacKinnon test is based on the t statistic on ŷ in this auxiliary regression
• A significant t statistic means rejection of the first model
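A minimal sketch of the Davidson-MacKinnon test of the first (levels) model against the log alternative, using the same illustrative DataFrame df:

```python
import numpy as np
import statsmodels.formula.api as smf

df = df.assign(lx1=np.log(df["x1"]), lx2=np.log(df["x2"]))

# Step 1: estimate the competing (log) model and keep its fitted values
fit2 = smf.ols("y ~ lx1 + lx2", data=df).fit()
df = df.assign(yhat2=fit2.fittedvalues)

# Step 2: add those fitted values as a regressor in the first (levels) model
aux = smf.ols("y ~ x1 + x2 + yhat2", data=df).fit()

# The DM statistic is the t statistic on yhat2; a significant value rejects the first model
print(aux.tvalues["yhat2"], aux.pvalues["yhat2"])
```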
Second approach: Davidson-MacKinnon test
• The same procedure can be used to test whether the second model is correctly specified
• We estimate the first model and then use its fitted values as a regressor in the second model
• A significant t statistic is a rejection of the second model
• This approach can be used for testing any two non-nested models with the same dependent variable
Problems with testing non-nested alternatives
Functional form misspecification: omission of key variables
Proxy variables
• A proxy variable is an observed variable that is related to the unobserved variable we would like to control for in our analysis
• Example: in a wage equation, ability is unobserved; one proxy for ability could be IQ
• IQ does not have to be the same thing as ability;
• what we need is for IQ to be correlated with ability
• Let us estimate the following model: y = β₀ + β₁x₁ + β₂x₂ + β₃x₃* + u
• We assume that data are available on y, x₁, and x₂
• The explanatory variable x₃* is unobserved, but we have a proxy variable for it
• Let us call the proxy variable x₃
• In our example, x₃* is ability and x₃ is IQ
Using proxy variables for unobserved explanatory variables
• How can we use x₃ to get unbiased (or at least consistent) estimators of β₁ and β₂?
• We can plug in x₃ for x₃* and run the regression of y on x₁, x₂, x₃
• We call this the plug-in solution to the omitted variables problem,
• because x₃ is just plugged in for x₃* before we run OLS
• If x₃ is truly related to x₃*, this seems like a sensible thing to do
• However, since x₃ and x₃* are not the same, we need some assumptions
Assumptions needed for the plug-in solution to work
Proxy variable
y = β₀ + β₁x₁ + β₂x₂ + β₃x₃* + u
x₃* = δ₀ + δ₃x₃ + v₃
• If we substitute δ₀ + δ₃x₃ + v₃ for x₃* and do simple algebra, we get
y = (β₀ + β₃δ₀) + β₁x₁ + β₂x₂ + β₃δ₃x₃ + u + β₃v₃
• The composite error in this equation is e = u + β₃v₃; it depends on u and v₃
• Write this equation as y = α₀ + β₁x₁ + β₂x₂ + α₃x₃ + e
• where α₀ = β₀ + β₃δ₀ is the new intercept and α₃ = β₃δ₃ is the slope parameter on the proxy variable x₃
• e has zero mean and is uncorrelated with x₁, x₂, and x₃,
• since u and v₃ both have zero mean and each is uncorrelated with x₁, x₂, and x₃
• The estimated model (regressing y on x₁, x₂, and x₃) therefore gives unbiased (or at least consistent) estimators of α₀, β₁, β₂, and α₃
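A small simulation sketch of the plug-in solution; all parameter values and the ability/IQ data-generating process below are made up for illustration. Ability is generated so that, given IQ, it is unrelated to the other regressors, and the slope on educ is then recovered once IQ is plugged in.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 50_000

iq = rng.normal(100, 15, size=n)                  # observed proxy x3
abil = -2.0 + 0.03 * iq + rng.normal(size=n)      # x3* = delta0 + delta3*x3 + v3
educ = 6.0 + 0.06 * iq + rng.normal(size=n)       # educ correlated with ability via IQ
exper = rng.normal(10, 3, size=n)
lwage = 1.0 + 0.08 * educ + 0.02 * exper + 0.10 * abil + rng.normal(scale=0.3, size=n)

def slopes(y, cols):
    X = sm.add_constant(np.column_stack(cols))
    return sm.OLS(y, X).fit().params

print(slopes(lwage, [educ, exper]))       # ability omitted: educ coefficient biased upward
print(slopes(lwage, [educ, exper, iq]))   # plug-in proxy: educ coefficient close to 0.08
```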
Including a proxy could exacerbate the problem of multicollinearity
• For example, including a variable like IQ as a proxy for ability in a regression that already includes educ (the two are correlated)
• But there are advantages
• First, the inclusion of IQ reduces the error variance because the part of ability explained by IQ has been removed from the error; this is reflected in a smaller standard error of the regression
• Second, the added multicollinearity is a necessary evil if we want to get an unbiased (or at least consistent) estimator of the coefficient on educ
• Multicollinearity would increase even if we had data on ability itself and included it in the model
• Ultimately, it is a trade-off
Using a lagged dependent variable as a proxy variable
• Sometimes we suspect that one or more of the x variables are correlated with the error, but we are not aware of a good proxy
• In those cases, we can use the dependent variable observed at an earlier point in time as a proxy:
• a lagged dependent variable
• Using past values increases the data requirements, but it is a way to account for historical factors that could have an impact on the present
• Consider a simple equation to explain city crime rates:
crime = β₀ + β₁unem + β₂expend + β₃crime₋₁ + u
• where crime is a measure of per capita crime, unem is the city unemployment rate, expend is per capita spending on law enforcement, and crime₋₁ indicates the crime rate measured in some earlier year (this could be the past year or several years ago)
Using a lagged dependent variable as a proxy variable
• Cities with historically high crime rates tend to spend more on law enforcement today; thus, factors unobserved to us (the econometricians) that affect crime are likely to be correlated with expend (and unem)
• A single cross section is then unlikely to give an unbiased estimator of the causal effect of law enforcement expenditure on crime; including crime₋₁ helps control for these historical factors
Measurement error
Measurement error in dependent variable
• Let y* denote the variable that we would like to explain, say, annual family savings
• The regression model has the usual form
y* = β₀ + β₁x₁ + ⋯ + βₖxₖ + u
• and we assume it satisfies the Gauss-Markov assumptions
• Say there is measurement error in y*, i.e., we do not measure it correctly
• Let y represent the observable measure of y*; in the savings case, y is reported annual savings
• Unfortunately, families are not perfect in their reporting of annual family savings
Measurement error in dependent variable
• Measurement error is the difference between the observed value and the actual value: e₀ = y − y*
• The important thing is how the measurement error in the population is related to other factors
• To obtain an estimable model, we plug y* = y − e₀ into the equation and rearrange:
y = β₀ + β₁x₁ + ⋯ + βₖxₖ + u + e₀
• The error term is u + e₀. We estimate this model by OLS
Measurement error in dependent variable
• We can estimate the model by OLS and ignore the measurement error if e₀ also has zero mean, like u, and is uncorrelated with each xⱼ
• If the measurement error does not have zero mean, the estimator of the intercept is biased (the slope estimators are not)
• If we assume corr(xⱼ, e₀) = 0 for all j, then there is no endogeneity and the OLS estimators are unbiased and consistent
• Measurement error in the dependent variable does, however, result in larger variances of the OLS estimators:
• if e₀ and u are uncorrelated, then Var(u + e₀) = σᵤ² + σ₀² > σᵤ², where σ₀² is the variance of e₀
• Basically, if the measurement error is uncorrelated with the independent variables, OLS estimation has good properties
• Otherwise, there is endogeneity and the estimates are biased
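A small simulation sketch of this point (parameter values are illustrative): adding mean-zero reporting error to the dependent variable leaves the OLS slope roughly unbiased but inflates its standard error.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 1_000

x = rng.normal(size=n)
y_star = 2.0 + 1.5 * x + rng.normal(size=n)   # true savings y*
y = y_star + rng.normal(scale=2.0, size=n)    # reported savings: y = y* + e0

X = sm.add_constant(x)
fit_true = sm.OLS(y_star, X).fit()
fit_meas = sm.OLS(y, X).fit()

# Both slope estimates are near 1.5, but the reported-y regression has a
# larger standard error because Var(u + e0) > Var(u)
print(fit_true.params[1], fit_true.bse[1])
print(fit_meas.params[1], fit_meas.bse[1])
```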
Measurement error in explanatory variable
y = β₀ + β₁x₁* + u
• The problem is that x₁* is not observed. Instead, we have a measure of x₁*; call it x₁
• For example, x₁* could be actual income and x₁ could be reported income
• The measurement error in the population is e₁ = x₁ − x₁*
• This can be positive, negative, or zero
• We assume that
• the average measurement error in the population is zero: E(e₁) = 0
• E(y | x₁*, x₁) = E(y | x₁*), which just says that x₁ does not affect y after x₁* has been controlled for
Classical errors-in-variables (CEV) assumption
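The classical errors-in-variables assumption is that the measurement error e₁ is uncorrelated with the unobserved regressor x₁* (so it is necessarily correlated with the observed x₁ = x₁* + e₁). Under CEV, the OLS slope on the mismeasured regressor is attenuated toward zero: in the simple regression case, plim β̂₁ = β₁ · Var(x₁*) / [Var(x₁*) + Var(e₁)]. A minimal simulation sketch with illustrative values:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 100_000
beta1 = 1.0

x_star = rng.normal(scale=2.0, size=n)  # true regressor, Var = 4
e1 = rng.normal(scale=1.0, size=n)      # measurement error, Var = 1, uncorrelated with x_star
x = x_star + e1                         # observed, mismeasured regressor
y = 0.5 + beta1 * x_star + rng.normal(size=n)

slope = sm.OLS(y, sm.add_constant(x)).fit().params[1]
print(slope)                            # roughly beta1 * 4 / (4 + 1) = 0.8 (attenuation)
```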
Missing data
• Missing data can arise for various reasons, e.g., we collected the information and later found some of it missing
• Missing data reduce the sample size
• Are there any statistical consequences of using the OLS estimator and ignoring the missing data?
• There is no statistical problem if the data are missing completely at random (MCAR)
• Under MCAR, units with complete data are not systematically different from units with missing data
• The reason the data are missing is independent, in a statistical sense, of the variables in the model
• Do not simply replace missing values with zeros; on its own this can cause substantial bias in the OLS estimators
• Under MCAR, one can instead use the missing-data-indicator approach (sketched below):
• create a dummy variable that equals one when the observation on the variable is missing and zero otherwise, and recode the variable itself to equal the actual data when observed and zero when missing
• both the recoded variable and the new missing-data indicator are included in the regression
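A minimal sketch of this missing-data-indicator approach under MCAR (the DataFrame df and its column names are illustrative):

```python
import statsmodels.formula.api as smf

# Suppose x2 has missing values: build the indicator and a zero-filled version
df = df.assign(
    x2_miss=df["x2"].isna().astype(int),  # 1 if x2 is missing, 0 otherwise
    x2_fill=df["x2"].fillna(0.0),         # actual value when observed, zero when missing
)

# Include both the recoded variable and the missingness indicator,
# so the full sample is retained instead of dropping incomplete rows
fit = smf.ols("y ~ x1 + x2_fill + x2_miss", data=df).fit()
print(fit.summary())
```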
Missing data: non-random samples
Exogenous sample selection
Endogenous sample selection
Outliers
Least Absolute Deviation estimators
• The LAD estimators of the βⱼ in a linear model minimize the sum of the absolute values of the residuals
• LAD does not give increasing weight to larger residuals,
• so it is much less sensitive to changes in the extreme values of the data than OLS
• LAD is designed to estimate the parameters of the conditional median of y given the explanatory variables, rather than the conditional mean
• Because the median is not affected by large changes in the extreme observations,
• it follows that the LAD parameter estimates are more resilient to outlying observations
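In practice, LAD can be computed as median (quantile 0.5) regression. A minimal sketch using statsmodels (the DataFrame and column names are illustrative):

```python
import statsmodels.formula.api as smf

# OLS fits the conditional mean; LAD (median regression) fits the conditional median
ols_fit = smf.ols("y ~ x1 + x2", data=df).fit()
lad_fit = smf.quantreg("y ~ x1 + x2", data=df).fit(q=0.5)

# With outliers in y, the LAD slopes typically move far less than the OLS slopes
print(ols_fit.params)
print(lad_fit.params)
```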
What to do in case of outliers?