Econometrics II
Revision Class: Introduction
to Econometrics
Dr. Rabia Ikram
Assistant Professor, Lahore School of Economics
Winter 2024
What is Econometrics?
Econometrics: the use of economic theory and
statistical methods to analyze economic data
Typical goals of econometric analysis:
– Estimating relationships between economic
variables
– Testing economic theories and hypotheses
– Forecasting economic variables, such as a firm's
sales, the overall growth of the economy, or
stock prices
– Evaluating and implementing government and
business policy
What is Econometrics?
Empirical analysis uses data to test a theory
or to estimate a relationship
Steps in empirical analysis:
1. Economic model (this step is often skipped)
2. Econometric model
How can we estimate a population
parameter from a sample?
What makes one estimator
better than another?
1) Unbiased
The estimated coefficients may be smaller or larger,
depending on the sample, which is the result of a random
draw.
However, on average they will be equal to the values
that characterize the true relationship between Y and X
in the population: $E(\hat\beta_j) = \beta_j$.
What makes one estimator
better than another?
2) Efficient
Depending on the sample, the estimates will be nearer or
farther away from the true population values
How far can we expect our estimates to be from the true
population values on average (= sampling variability)?
In the diagram, A and B are
both unbiased estimators but
B is superior because it is
more efficient.
What makes one estimator
better than another?
3) Consistent
As the sample size increases to infinity, in the limit,
the variance of the distribution tends to zero. The
distribution collapses to a spike at the true value.
The plim of the sample mean is therefore the
population mean.
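A quick way to see unbiasedness and consistency together is to simulate. The sketch below is illustrative only (the population mean, scale, and sample sizes are made up): it draws repeated samples and shows that the sample mean is centered on the true value at every n, while its sampling spread shrinks as n grows.

```python
import numpy as np

# Illustrative simulation (not from the slides): the sample mean is an
# unbiased and consistent estimator of the population mean.
rng = np.random.default_rng(0)
mu = 5.0  # assumed "true" population mean for the simulation

for n in [10, 100, 10_000]:
    # Draw 2,000 samples of size n and compute the sample mean of each.
    means = rng.normal(loc=mu, scale=2.0, size=(2000, n)).mean(axis=1)
    # Unbiasedness: the average of the estimates is close to mu for every n.
    # Consistency: the spread of the estimates shrinks as n grows.
    print(f"n={n:6d}  mean of estimates={means.mean():.3f}  "
          f"std of estimates={means.std():.4f}")
```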
The Nature of Econometrics
and Economic Data
Econometric analysis requires data
Different kinds of economic data sets
– Cross-sectional data
– Time series data
– Pooled cross sections
– Panel/Longitudinal data
Econometric methods depend on the nature of the
data used
– Use of inappropriate methods may lead to
misleading results
The Simple Linear Regression Model
Definition of the simple linear regression model,
which "explains variable y in terms of variable x":
$y = \beta_0 + \beta_1 x + u$
where $\beta_0$ is the intercept, $\beta_1$ is the slope parameter,
and u is the error term (disturbance, unobservables).
It represents factors other than x that affect y.
Names for y: 1. Dependent variable, 2. Explained variable,
3. Response variable, 4. Predicted variable, 5. Regressand
Names for x: 1. Independent variable, 2. Explanatory variable,
3. Control variable, 4. Predictor variable, 5. Regressor
The Simple Linear Regression Model
Interpretation of the simple linear regression model,
which "studies how y varies with changes in x":
$\beta_1 = \dfrac{\Delta y}{\Delta x}$ as long as $\dfrac{\Delta u}{\Delta x} = 0$
By how much does the dependent variable change if the
independent variable is increased by one unit? This
interpretation is only correct if all other things remain
equal when the independent variable is increased by one unit.
When is there a causal
interpretation?
Conditional mean independence assumption
We need to make a crucial assumption about how u and
x are related.
We want it to be the case that knowing something
about x does not give us any information about u, so
that they are completely unrelated. That is,
$E(u\,|\,x) = E(u) = 0$
The explanatory variable must not contain information
about the mean of the unobserved factors: the average
value of the error term does not depend on the value of x.
For any given value of x, the distribution of y is centered
about E(y|x).
Population regression function:
$E(colGPA\,|\,hsGPA) = 2 + 0.5\, hsGPA$
For individuals with a given hsGPA, the average value of
colGPA is 2 + 0.5·hsGPA; for example, with hsGPA = 3.0 the
average colGPA is 3.5.
Ordinary Least Squares
(OLS)
Intuitively, OLS is fitting a line through the
sample points such that the sum of squared
residuals is as small as possible, hence the
term least squares.
– The residual, û, is an estimate of the error term,
u, and is the difference between the fitted line
(sample regression function) and the sample point
– Why do we square the residuals?
Sample regression line, sample data points, and the
associated estimated error terms
Fit a regression line through the data points as well as
possible. The fitted regression line is called the sample
regression function (SRF) because it is the estimated
version of the population regression function (PRF).
The SRF is obtained for a given sample of data; a new
sample will generate a different slope and intercept.
The Simple Linear Regression Model
• What does “as good as possible” mean?
• Regression residuals: $\hat u_i = y_i - \hat y_i = y_i - \hat\beta_0 - \hat\beta_1 x_i$
• Minimize the sum of squared regression residuals:
$\min_{\hat\beta_0, \hat\beta_1} \sum_{i=1}^{n} \hat u_i^2$
• The Ordinary Least Squares (OLS) estimators that solve this
problem are used to produce OLS estimates for a given sample.
Algebraic Properties of
OLS
• Properties of OLS on any sample of data
• Fitted or predicted values: $\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$;
residuals (deviations from the regression line): $\hat u_i = y_i - \hat y_i$
• Algebraic properties of OLS regression:
– Deviations from the regression line sum up to zero: $\sum \hat u_i = 0$
– Covariance between deviations and regressors is zero: $\sum x_i \hat u_i = 0$
– Sample averages of y and x lie on the regression line: $\bar y = \hat\beta_0 + \hat\beta_1 \bar x$
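These properties can be verified numerically. The sketch below (simulated data with made-up coefficients) computes the OLS fit by hand and checks all three:

```python
import numpy as np

# Minimal check of the OLS algebraic properties on simulated data
# (illustrative; the data-generating process is made up).
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=200)
y = 1.0 + 0.5 * x + rng.normal(size=200)

b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)  # OLS slope
b0 = y.mean() - b1 * x.mean()                        # OLS intercept
u_hat = y - (b0 + b1 * x)                            # residuals

print(u_hat.sum())                   # ~0: residuals sum to zero
print((x * u_hat).sum())             # ~0: residuals uncorrelated with x
print(y.mean(), b0 + b1 * x.mean())  # (x-bar, y-bar) lies on the line
```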
Deriving OLS Estimates
for SLR
Population Regression Function (PRF): $Y_i = \beta_0 + \beta_1 X_i + u_i$,
where $u_i$ is the error term.
Estimated residuals: $\hat u_i = Y_i - \hat\beta_0 - \hat\beta_1 X_i$
Objective function: $RSS = \sum \hat u_i^2 = \sum (Y_i - \hat\beta_0 - \hat\beta_1 X_i)^2$
Deriving OLS Estimates for SLR
Minimizing the objective function
$RSS = \sum \hat u_i^2 = \sum (Y_i - \hat\beta_0 - \hat\beta_1 X_i)^2$
yields the First Order Conditions (FOCs):
i. $\partial RSS / \partial \hat\beta_0 = -2 \sum (Y_i - \hat\beta_0 - \hat\beta_1 X_i) = 0$
ii. $\partial RSS / \partial \hat\beta_1 = -2 \sum X_i (Y_i - \hat\beta_0 - \hat\beta_1 X_i) = 0$
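The FOCs can also be derived mechanically with a computer algebra system. A sketch using sympy on a tiny made-up sample (illustrative, not from the slides):

```python
import sympy as sp

# Symbolic check (illustrative): differentiate the RSS for a tiny sample
# and solve the two first order conditions for the OLS estimators.
b0, b1 = sp.symbols('b0 b1')
# A made-up three-observation sample, just to keep the algebra small.
data = [(1, 2), (2, 3), (3, 5)]  # (X_i, Y_i) pairs
rss = sum((y - b0 - b1 * x) ** 2 for x, y in data)

foc = [sp.diff(rss, b0), sp.diff(rss, b1)]  # the two FOCs
sol = sp.solve(foc, [b0, b1])
print(sol)  # {b0: 1/3, b1: 3/2}
```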
Summary of OLS slope
estimate
$\hat\beta_1 = \dfrac{\sum (x_i - \bar x)(y_i - \bar y)}{\sum (x_i - \bar x)^2}$, provided that $\sum (x_i - \bar x)^2 > 0$
• The slope estimate is the sample covariance
between x and y divided by the sample variance of x
• If x and y are positively correlated, the slope will
be positive
• If x and y are negatively correlated, the slope will
be negative
• Only need x to vary in our sample
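A numerical sketch (simulated data, made-up coefficients) confirming that the OLS slope equals the sample covariance of x and y divided by the sample variance of x:

```python
import numpy as np

# Numerical check on simulated data: the OLS slope equals
# cov(x, y) / var(x), using matching degrees-of-freedom corrections.
rng = np.random.default_rng(2)
x = rng.normal(size=500)
y = 2.0 + 1.5 * x + rng.normal(size=500)

slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
# The same number from a least-squares line fit:
slope_lstsq = np.polyfit(x, y, deg=1)[0]
print(slope, slope_lstsq)  # both close to the true slope, 1.5
```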
Example: Simple Linear Regression (SLR)
$wage = \beta_0 + \beta_1\, edu + u$
Where:
– wage = hourly wage in dollars
– edu = years of education
Interpret the coefficient on edu.
Multiple Linear Regression Analysis
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$
– $\beta_0$ is still the intercept
– $\beta_1$ to $\beta_k$ are all called slope parameters
– u is still the error term (or disturbance)
– Still need to make a zero conditional mean
assumption, so now assume that
$E(u\,|\,x_1, x_2, \dots, x_k) = 0$
– Still minimizing the sum of squared residuals,
so have k+1 first order conditions
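A minimal sketch of estimating such a model with statsmodels on simulated data (the model and coefficient values are made up for illustration):

```python
import numpy as np
import statsmodels.api as sm

# Illustrative multiple regression with simulated data (made-up model):
# y = 1 + 0.5*x1 - 2*x2 + u, estimated by OLS.
rng = np.random.default_rng(3)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 - 2.0 * x2 + rng.normal(size=n)

X = sm.add_constant(np.column_stack([x1, x2]))  # adds the intercept column
res = sm.OLS(y, X).fit()
print(res.params)  # estimates close to (1.0, 0.5, -2.0)
```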
Interpreting Multiple
Regression
$\hat y = \hat\beta_0 + \hat\beta_1 x_1 + \hat\beta_2 x_2 + \dots + \hat\beta_k x_k$, so
$\Delta \hat y = \hat\beta_1 \Delta x_1 + \hat\beta_2 \Delta x_2 + \dots + \hat\beta_k \Delta x_k$,
so holding $x_2, \dots, x_k$ fixed implies that
$\Delta \hat y = \hat\beta_1 \Delta x_1$, that is, each $\hat\beta_j$ has
a ceteris paribus interpretation
“Partialling Out”
Interpretation
The regression coefficient of each X variable
provides an estimate of its influence on Y,
controlling for the effects of all the other X
variables.
One can show that the estimated coefficient of an
explanatory variable in a multiple regression can be
obtained in two steps:
1) Regress the explanatory variable on all other
explanatory variables
2) Regress y on the residuals from this regression
“Partialling Out”
Interpretation
Thus, $\hat\beta_1$ measures the sample relationship
between y and x1 after x2 has been
partialled out.
“Partialling Out”
Interpretation
Why does this procedure work?
– The residuals from the first regression are the part of the
explanatory variable that is uncorrelated with the other
explanatory variables
– The slope coefficient of the second regression therefore
represents the isolated effect of the explanatory variable on the
dependent variable
– This implies that regressing y on x1 and x2 gives the same effect of
x1 as regressing y on the residuals from a regression of x1 on x2
– This means only the part of xi1 that is uncorrelated with xi2 is
being related to yi, so we are estimating the effect of x1 on y after x2
has been “partialled out” (see the sketch below)
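A sketch of the two-step procedure on simulated data (all numbers made up), confirming that it reproduces the multiple-regression coefficient on x1:

```python
import numpy as np
import statsmodels.api as sm

# Illustrative check of the "partialling out" (Frisch-Waugh-Lovell)
# result on simulated, correlated regressors.
rng = np.random.default_rng(4)
n = 500
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)          # x1 correlated with x2
y = 1.0 + 0.5 * x1 + 2.0 * x2 + rng.normal(size=n)

# Full multiple regression of y on x1 and x2:
full = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()

# Step 1: regress x1 on x2 and keep the residuals r1.
r1 = sm.OLS(x1, sm.add_constant(x2)).fit().resid
# Step 2: regress y on those residuals.
partial = sm.OLS(y, sm.add_constant(r1)).fit()

print(full.params[1], partial.params[1])  # identical slope on x1
```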
Example: Multiple Linear Regression (MLR)
$\widehat{colGPA} = 1.29 + 0.453\, hsGPA + 0.0094\, ACT$
Where:
– colGPA = college grade point average
– hsGPA = high school grade point average
– ACT = achievement test score
Interpret the coefficient on hsGPA.
Example: Multiple Linear Regression (MLR)
$\widehat{colGPA} = 1.29 + 0.453\, hsGPA + 0.0094\, ACT$
Where:
– colGPA = college grade point average
– hsGPA = high school grade point average
– ACT = achievement test score
Interpretation
• Holding ACT fixed, another point on high school grade point
average is associated with an increase of .453 points in the
college grade point average
• Or: if we compare two students with the same ACT, but the hsGPA
of student A is one point higher, we predict student A to have a
colGPA that is .453 higher than that of student B
• Holding high school grade point average fixed, another 10 points
on the ACT are associated with less than one tenth of a point on college GPA
Assumptions for
Unbiasedness
i. Population model is linear in parameters:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$
ii. We can use a random sample of size n,
$\{(x_{i1}, x_{i2}, \dots, x_{ik}, y_i): i = 1, 2, \dots, n\}$, from the population
model, so that the sample model is:
$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + u_i$
iii. $E(u\,|\,x_1, x_2, \dots, x_k) = 0$, implying that all of the
explanatory variables are exogenous
iv. None of the x's is constant, and there are no exact linear
relationships among the independent variables (no perfect collinearity)
The Gauss-Markov
Theorem
Under the four assumptions above plus homoskedasticity
($Var(u\,|\,x_1, \dots, x_k) = \sigma^2$), the OLS estimators are the
Best Linear Unbiased Estimators (BLUE) of the $\beta_j$.
Too Many or Too Few
Variables
What happens if we include variables
in our specification that don’t belong?
Too Many or Too Few
Variables
What happens if we include variables
in our specification that don’t belong?
– There is no effect on the unbiasedness of our parameter
estimates: OLS remains unbiased (though including irrelevant
variables can inflate the variances of the estimators)
Too Many or Too Few
Variables
What if we exclude a variable from our
specification that does belong?
Too Many or Too Few
Variables
What if we exclude a variable from our
specification that does belong?
– OLS will usually be biased
Omitted Variable Bias
Omitting relevant variables: the simple
case
True model (contains x1 and x2):
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$
Estimated model (x2 is omitted):
$\tilde y = \tilde\beta_0 + \tilde\beta_1 x_1$
Omitted Variable Bias
If x1 and x2 are correlated, assume a linear
regression relationship between them:
$x_2 = \delta_0 + \delta_1 x_1 + v$, where v is an error term.
If y is only regressed on x1, the estimated intercept will be
$\tilde\beta_0 = \hat\beta_0 + \hat\beta_2 \hat\delta_0$ and the estimated slope on x1 will be
$\tilde\beta_1 = \hat\beta_1 + \hat\beta_2 \hat\delta_1$.
As $E(\tilde\beta_1) = \beta_1 + \beta_2 \delta_1$, the omitted variable bias is $\beta_2 \delta_1$.
Conclusion: all estimated coefficients will be biased.
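A simulation sketch (made-up parameter values) showing the short-regression slope converging to $\beta_1 + \beta_2\delta_1$ rather than $\beta_1$:

```python
import numpy as np

# Illustrative simulation of omitted variable bias: the short-regression
# slope converges to beta1 + beta2*delta1, not to beta1.
rng = np.random.default_rng(5)
n = 100_000
beta1, beta2, delta1 = 0.5, 2.0, 0.8

x1 = rng.normal(size=n)
x2 = delta1 * x1 + rng.normal(size=n)   # x2 = delta0 + delta1*x1 + v
y = 1.0 + beta1 * x1 + beta2 * x2 + rng.normal(size=n)

# Short regression of y on x1 alone:
slope_short = np.cov(x1, y, ddof=1)[0, 1] / np.var(x1, ddof=1)
print(slope_short)             # ~2.1
print(beta1 + beta2 * delta1)  # theoretical value: 0.5 + 2.0*0.8 = 2.1
```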
Summary of Direction of Bias
– Corr(x1, x2) > 0 and β2 > 0: positive bias
– Corr(x1, x2) > 0 and β2 < 0: negative bias
– Corr(x1, x2) < 0 and β2 > 0: negative bias
– Corr(x1, x2) < 0 and β2 < 0: positive bias
Omitted Variable Bias
Summary
There is no bias if β2 = 0 (x2 is irrelevant) or if δ1 = 0
(x1 and x2 are uncorrelated); otherwise the sign of the
bias is the sign of β2δ1.
Analyzing the Variance of Y
Goodness-of-Fit
“How well does the explanatory variable explain the dependent variable?”
Measures of Variation
– Total sum of squares: $SST = \sum (y_i - \bar y)^2$, represents the total variation in y
– Explained sum of squares: $SSE = \sum (\hat y_i - \bar y)^2$, represents variation explained by the regression
– Residual sum of squares: $SSR = \sum \hat u_i^2$, represents variation not explained by the regression
Goodness-of-Fit
Decomposition of total variation in y:
$SST = SSE + SSR$ (total variation in y = explained part + unexplained part)
Goodness-of-fit measure (R-squared): $R^2 = SSE/SST = 1 - SSR/SST$
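A numerical sketch (simulated data) verifying the decomposition and both forms of the R-squared formula:

```python
import numpy as np

# Illustrative check of SST = SSE + SSR and the R-squared formula.
rng = np.random.default_rng(6)
x = rng.normal(size=400)
y = 1.0 + 2.0 * x + rng.normal(size=400)

b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

sst = ((y - y.mean()) ** 2).sum()      # total variation
sse = ((y_hat - y.mean()) ** 2).sum()  # explained variation
ssr = ((y - y_hat) ** 2).sum()         # unexplained variation

print(sst, sse + ssr)            # equal (up to rounding)
print(sse / sst, 1 - ssr / sst)  # both give R-squared
```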
Hypothesis Testing
• Statistical inference “... draws
conclusions from (or makes inferences
about) a population from a random
sample taken from that population.”
• A population is the ‘universe’ or the total
number of observations.
• A sample is a subset of a given
population.
Testing a Hypothesis Relating to
a Regression Coefficient
To test $H_0: \beta_j = 0$ against $H_1: \beta_j \neq 0$, use the
t statistic $t = \hat\beta_j / se(\hat\beta_j)$.
If |t| exceeds the critical value of the t distribution with
n − k − 1 degrees of freedom (equivalently, if the p-value is
below the chosen significance level), reject $H_0$: $x_j$ is
‘statistically significant’.
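A sketch of the t test on simulated data (made-up model), showing that the t statistic is just the estimate divided by its standard error:

```python
import numpy as np
import statsmodels.api as sm

# Illustrative t test for a single coefficient with simulated data.
rng = np.random.default_rng(7)
n = 200
x = rng.normal(size=n)
y = 1.0 + 0.3 * x + rng.normal(size=n)

res = sm.OLS(y, sm.add_constant(x)).fit()
# t = beta_hat / se(beta_hat), with the p-value from the t distribution:
print(res.params[1] / res.bse[1])       # the t statistic on x, by hand
print(res.tvalues[1], res.pvalues[1])   # same t stat, plus its p-value
```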
Significance of a Single Coefficient vs
Overall Significance of the Regression
A t test assesses one coefficient at a time; an F test assesses
whether a group of coefficients is jointly significant.
Testing the Overall Significance for
Multivariate Linear Regression: F test
To test $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$ (the regressors have
no joint explanatory power), use
$F = \dfrac{R^2 / k}{(1 - R^2)/(n - k - 1)}$
which follows an F distribution with (k, n − k − 1) degrees of
freedom under $H_0$.
The F Statistic for Overall
Significance of a Regression
Decision Rule:
If the F statistic exceeds the F critical value, reject the null.
This suggests that the restricted independent variables jointly
add explanatory value to our model and help predict the
variation in the dependent variable.
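A sketch computing the F statistic by hand on simulated data (made-up model) and comparing it with a 5% critical value:

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

# Illustrative F test of overall significance with simulated data.
rng = np.random.default_rng(8)
n, k = 150, 2
X = rng.normal(size=(n, k))
y = 1.0 + 0.4 * X[:, 0] - 0.7 * X[:, 1] + rng.normal(size=n)

res = sm.OLS(y, sm.add_constant(X)).fit()
r2 = res.rsquared
f_manual = (r2 / k) / ((1 - r2) / (n - k - 1))    # R-squared form of F
f_crit = stats.f.ppf(0.95, dfn=k, dfd=n - k - 1)  # 5% critical value

print(f_manual, res.fvalue)  # manual F stat matches statsmodels
print(f_manual > f_crit)     # True: reject H0 of no joint significance
```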
Multiple Regression Analysis
with Qualitative Information-
Dummy Variables
• Qualitative Information
– Examples: gender, race, industry, region,
rating grade, …
– A way to incorporate qualitative information is
to use dummy variables
– They may appear as the dependent or as
independent variables
• A single dummy independent variable:
$wage = \beta_0 + \delta_0\, female + \beta_1\, educ + u$
Dummy variable: female = 1 if the person is a woman,
= 0 if the person is a man
$\delta_0$ = the wage gain/loss if the person is a woman rather than a
man (holding other things fixed), i.e. the difference in intercepts
between females and males.
Multiple Regression Analysis
with Qualitative Information-
Dummy Variables
Graphical illustration: the dummy shifts the intercept of the
wage-education relationship (intercept shift).
Alternative interpretation of the coefficient:
$\delta_0 = E(wage\,|\,female = 1, educ) - E(wage\,|\,female = 0, educ)$
i.e. the difference in mean wage between women and men
with the same level of education.
Multiple Regression Analysis with
Qualitative Information- Dummy Variables
• Comparing means of subpopulations described by dummies:
regress wage on the dummy alone. The estimated intercept is the
average wage of men in the sample; the intercept plus the
coefficient on female is the average wage of women; the
coefficient on female is the difference in the average wage
between women and men.
Not holding other factors constant, women earn $2.51 per hour
less than men, i.e. the difference between the mean wage of men
and that of women is $2.51.
• Discussion
– It can easily be tested whether the difference in means is significant
– The wage difference between men and women is larger when no
other factors are controlled for; i.e. part of the difference is due to
differences in education, experience, and tenure between men
and women
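A sketch on simulated data (made-up wage numbers, not the $2.51 example) confirming that this regression reproduces the subgroup means:

```python
import numpy as np
import statsmodels.api as sm

# Illustrative check: regressing wage on a dummy alone recovers the
# subgroup means (simulated data; the numbers are made up).
rng = np.random.default_rng(9)
n = 400
female = rng.integers(0, 2, size=n)
wage = 7.0 - 2.5 * female + rng.normal(size=n)

res = sm.OLS(wage, sm.add_constant(female.astype(float))).fit()
b0, d0 = res.params
print(b0, wage[female == 0].mean())       # intercept = men's average wage
print(b0 + d0, wage[female == 1].mean())  # intercept + coef = women's average
print(d0)                                 # the difference in means
```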
Summary of Functional Forms
Involving Logarithms
– Level-level, y = β0 + β1x + u: Δy = β1Δx; a one-unit change in x changes y by β1 units
– Level-log, y = β0 + β1log(x) + u: Δy ≈ (β1/100)·(%Δx); a 1% change in x changes y by about β1/100 units
– Log-level, log(y) = β0 + β1x + u: %Δy ≈ (100·β1)·Δx; a one-unit change in x changes y by about 100β1 percent
– Log-log, log(y) = β0 + β1log(x) + u: %Δy ≈ β1·(%Δx); β1 is an elasticity
Interpretation of Functional
Forms Involving Logarithms
Example: GDP per capita and Labor
Productivity in Agriculture
Level-Level Model: labor productivity in agriculture ($)
regressed on GDP per capita ($)
Interpretation: a one-dollar increase in GDP per capita is
associated with a change of β1 dollars in labor productivity.
Interpretation of Functional
Forms Involving Logarithms
Example: GDP per capita and Labor
Productivity in Agriculture
Log-Level Model: the log of labor productivity in agriculture ($)
regressed on GDP per capita ($)
Interpretation: a one-dollar increase in GDP per capita is
associated with a change of about 100·β1 percent in labor productivity.
Interpretation of Functional
Forms Involving Logarithms
Example: GDP per capita and Labor
Productivity in Agriculture
Level-Log Model: labor productivity in agriculture ($)
regressed on the log of GDP per capita ($)
Interpretation: a 1% increase in GDP per capita is associated
with a change of about β1/100 dollars in labor productivity.
Interpretation of Functional
Forms Involving Logarithms
Example: GDP per capita and Labor
Productivity in Agriculture
Log-Log Model: the log of labor productivity in agriculture ($)
regressed on the log of GDP per capita ($)
Interpretation: β1 is an elasticity; a 1% increase in GDP per capita
is associated with a change of about β1 percent in labor productivity.
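A sketch fitting a log-log model on simulated data (the variables and the elasticity are made up, not the actual GDP/productivity data set):

```python
import numpy as np

# Illustrative log-log regression on simulated data: the slope is an
# elasticity (made-up variables and parameter values).
rng = np.random.default_rng(10)
gdp_pc = np.exp(rng.normal(loc=8.0, scale=1.0, size=300))
# True elasticity of 0.6 plus multiplicative noise:
lab_prod = np.exp(1.0 + 0.6 * np.log(gdp_pc)
                  + rng.normal(scale=0.2, size=300))

slope, intercept = np.polyfit(np.log(gdp_pc), np.log(lab_prod), deg=1)
print(slope)  # ~0.6: a 1% rise in GDP per capita raises productivity ~0.6%
```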