Advanced Econometrics

This document provides an introduction to a course on Advanced Econometrics II taught by Christophe Hurlin at the University of Orléans in February 2018. It begins with definitions of key terms like panel data, micro-panel, macro-panel, balanced and unbalanced panels. It then discusses the main advantages of using panel data sets and panel data models, including having more observations, being able to analyze new economic questions, accounting for unobservable components, and enabling easier estimation and inference. The objectives of the course are also outlined.

Advanced Econometrics II

School of Economics and Management - University of Geneva

Christophe Hurlin, Université d’Orléans

University of Orléans

February 2018

C. Hurlin (University of Orléans) Advanced Econometrics II February 2018 1 / 61


Introduction

"Econometrics is the quantitative analysis of actual economic
phenomena based on the concurrent development of theory and
observation, related by appropriate methods of inference"
(P. A. Samuelson, T. C. Koopmans, and J. R. N. Stone, 1954).


Introduction

Econometrics is fundamentally based on four elements:

1 A sample of data
2 An econometric model
3 An estimation method
4 Some inference methods



Introduction

In econometrics, data come from one of two sources: experiments and
non-experimental observations.

1 Experimental data are based on (randomized controlled)
experiments designed to evaluate a treatment or policy or to
investigate a causal effect.
2 Data obtained outside an experimental setting are called
observational data (drawn from surveys, administrative records, etc.).

All of this lecture is devoted to methods for handling real-world
observational data.


Introduction

Whether the data are experimental or observational, data sets fall into
three main types:

1 Cross-sectional data
2 Time series data
3 Panel data


Introduction

Cross-sectional data:

Data for different entities: workers, households, firms, cities,
countries, and so forth.

No time dimension (even if the date of data collection varies somewhat
across units, it is ignored).

Order of data does not matter!


Introduction

Time series data:

Data for a single entity (person, firm, country) collected at multiple
time periods. Repeated observations of the same variables (GDP,
prices).

Order of data is important!

Observations are typically not independent over time.

In this case the notion of population corresponds to the Data
Generating Process (DGP).


Introduction

Panel data or longitudinal data:

Data for multiple entities (individuals, firms, countries) in which
outcomes and characteristics of each entity are observed at multiple
points in time.

Combine cross-sectional and time series issues.

Present several advantages with respect to cross-sectional and time
series data (depending on the question of interest!).


Introduction

Objectives of the course

The objectives of the course are the following:

1 to understand specification, estimation, and inference in the
context of models that include individual (firm, person, etc.) and/or
time effects.
2 to review the standard linear regression model, then to apply it to
panel data settings involving 'fixed', 'random', and 'mixed' effects.
3 to extend these linear panel data models to dynamic models with
recently developed GMM and instrumental variables methods.
4 to extend these linear panel data models to non-linear panel data
models.


Section 2

Baseline Definitions


Definitions

Definition (Panel data set)

A longitudinal, or panel, data set is one that follows a given sample of
individuals over time, and thus provides multiple observations on each
individual in the sample (Hsiao 2003, page 2).


Definitions

Terminology and notations:

Individual or cross-section unit: country, region, state, firm,
consumer, individual, pair of individuals or countries (gravity
models), etc.

Double index: $i$ (for cross-section unit) and $t$ (for time)

$y_{it}$ for $i = 1, \dots, N$ and $t = 1, \dots, T$


Definitions

Definition (micro-panel)
A micro-panel data set is a panel for which the time dimension $T$ is
much smaller than the individual dimension $N$:

$T \ll N$

Example (micro-panel)
The University of Michigan's Panel Study of Income Dynamics (PSID), with
15,000 individuals observed since 1968, is a micro-panel.


Definitions

Definition (macro-panel)
A macro-panel data set is a panel for which the time dimension $T$ is
similar to the individual dimension $N$:

$T \simeq N$

Example (macro-panel)
A panel of 100 countries with quarterly data since World War II is
considered a macro-panel.


Definitions

Remark: some econometric issues are specific to micro or macro panels.

Example (heterogeneity issue)

The heterogeneity issue cannot be tackled if the time dimension is
too small.

Example (non-stationarity issue)

The non-stationarity issue (non-stationarity and cointegration tests,
VECM, etc.) is only relevant for macro-panels or for panels with a
sufficiently large time dimension.


Definitions

Definition (balanced vs. unbalanced panels)

A panel is said to be balanced if we have the same time periods,
$t = 1, \dots, T$, for each cross-section observation. For an unbalanced
panel, the time dimension, denoted $T_i$, is specific to each individual.
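The definition above can be checked mechanically on a data set stored in "long" format, with one observation per (i, t) pair. The sketch below is only an illustration: the unit labels, the dictionary layout and the helper names `panel_periods` and `is_balanced` are assumptions made for this example, not part of the course material.

```python
from collections import defaultdict

def panel_periods(observations):
    """Map each unit i to the sorted list of periods t at which it is observed."""
    periods = defaultdict(set)
    for (i, t) in observations:
        periods[i].add(t)
    return {i: sorted(ts) for i, ts in periods.items()}

def is_balanced(observations):
    """True if every unit is observed over exactly the same set of periods."""
    per_unit = panel_periods(observations)
    period_sets = {tuple(ts) for ts in per_unit.values()}
    return len(period_sets) == 1

# Hypothetical toy data: y_it keyed by the double index (i, t).
balanced = {(i, t): 0.0 for i in ("A", "B", "C") for t in range(1, 12)}  # N = 3, T = 11

# Drop the last periods of unit C to mimic attrition (an unbalanced panel).
unbalanced = {k: v for k, v in balanced.items() if not (k[0] == "C" and k[1] > 7)}

print(is_balanced(balanced))    # True
print(is_balanced(unbalanced))  # False

# Individual-specific time dimension T_i for the unbalanced panel.
print({i: len(ts) for i, ts in panel_periods(unbalanced).items()})
```

Note that this check only looks at which periods are present. It says nothing about why observations are missing, which is what matters for sample selection and attrition.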


Introduction

[Figure: balanced panel with T = 11 and N = 3]


Introduction

[Figure: balanced panel with missing values]


Introduction

[Figure: unbalanced panel with T = 11 and N = 3]


Definitions

Remark: While the mechanics of the unbalanced case are similar to the
balanced case, a careful treatment of the unbalanced case requires a
formal description of why the panel may be unbalanced, and the sample
selection issues can be somewhat subtle.

=> issues of sample selection and attrition


Definitions

Definition (Panel data model)

A panel data regression model (or panel data model) is an econometric
model specifically designed for panel data.


Section 3

Advantages of Panel Data Sets and Panel Data Models


Advantages of Panel Data

Panel data sets for economic research possess several major advantages
over conventional cross-sectional or time-series data sets.

Hsiao C. (2003), Analysis of Panel Data, second edition, Cambridge
University Press.

Wooldridge J.M. (2001), Econometric Analysis of Cross Section and Panel
Data, The MIT Press.


Advantages of Panel Data

What are the main advantages of panel data sets and panel data models?

Advantage 1: the phantasm of a larger number of observations
Advantage 2: new economic questions (identification)
Advantage 3: unobservable components
Advantage 4: easier estimation and inference


Advantages of Panel Data

Advantage 1: the phantasm of a larger number of observations

Panel data usually give the researcher a large number of data
points ($NT$), increasing the degrees of freedom and reducing the
collinearity among explanatory variables, hence improving the
efficiency of econometric estimates.

But this is often a phantasm: more data points do not necessarily
imply more information => heterogeneity issue!


Advantages of Panel Data

Advantage 2: new economic questions (identification)

Longitudinal data allow a researcher to analyze a number of important
economic questions that cannot be addressed using cross-sectional or
time-series data sets.


Advantages of Panel Data

Definition (identification)
The oft-touted power of panel data derives from their theoretical ability to
identify the effects of specific actions, treatments, or more general
policies.


Advantages of Panel Data

Example (Ben-Porath (1973), cited in Hsiao (2003))

Suppose that a cross-sectional sample of married women is found to have
an average yearly labor-force participation rate of 50%.
1) It might be interpreted as implying that each woman in a
homogeneous population has a 50 percent chance of being in the labor
force in any given year.
2) It might imply that 50 percent of the women in a heterogeneous
population always work and 50 percent never work.

To discriminate between these two stories, we need to utilize individual
labor-force histories (the time dimension) to estimate the probability
of participation in different subintervals of the life cycle.
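The identification argument in this example can be illustrated with a small simulation. The sketch below generates the two stories under assumed values (2,000 women, 10 years, independent draws across years in the homogeneous case). The function names are hypothetical and the numbers are illustrative only.

```python
import random

def simulate(n_women, n_years, heterogeneous, seed=0):
    """Return one 0/1 labor-force history per woman (1 = in the labor force)."""
    rng = random.Random(seed)
    histories = []
    for w in range(n_women):
        if heterogeneous:
            works = w % 2 == 0  # half always work, half never work
            histories.append([int(works)] * n_years)
        else:
            # Homogeneous population: 50% chance each year, independently.
            histories.append([int(rng.random() < 0.5) for _ in range(n_years)])
    return histories

def cross_section_rate(histories, year):
    """Participation rate observed in a single cross-section."""
    return sum(h[year] for h in histories) / len(histories)

def share_never_switching(histories):
    """Share of women whose status is the same in every year."""
    return sum(1 for h in histories if len(set(h)) == 1) / len(histories)

homogeneous = simulate(2000, 10, heterogeneous=False)
heterogeneous = simulate(2000, 10, heterogeneous=True)

for name, data in (("homogeneous", homogeneous), ("heterogeneous", heterogeneous)):
    print(name,
          f"cross-section rate = {cross_section_rate(data, 0):.2f},",
          f"share with constant status = {share_never_switching(data):.2f}")
```

A single cross-section gives a participation rate close to 50% in both cases. Only the individual histories, that is the time dimension, separate the two populations.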


Advantages of Panel Data

Advantage 3: unobservable components

Panel data allow the researcher to control for omitted (unobserved or
mismeasured) variables.

Panel data provide a means of reducing the magnitude of an
econometric problem that often arises in empirical studies, namely
the often heard assertion that the real reason one finds (or does not
find) certain effects is the presence of omitted (mismeasured or
unobserved) variables that are correlated with the explanatory variables.


Advantages of Panel Data

Example: Let us consider a simple regression model.

$$y_{it} = \alpha + \beta' x_{it} + \rho' z_{it} + \varepsilon_{it}, \quad i = 1, \dots, N, \quad t = 1, \dots, T$$

where
$x_{it}$ and $z_{it}$ are $k_1 \times 1$ and $k_2 \times 1$ vectors of exogenous variables,
$\alpha$ is a constant, $\beta$ and $\rho$ are $k_1 \times 1$ and $k_2 \times 1$ vectors of parameters,
$\varepsilon_{it}$ is i.i.d. over $i$ and $t$, with $V(\varepsilon_{it}) = \sigma_\varepsilon^2$.
Let us assume that the $z_{it}$ variables are unobservable and correlated
with $x_{it}$:
$$\mathrm{cov}(x_{it}, z_{it}) \neq 0$$


Advantages of Panel Data

Example (cont'd): The model can be rewritten as

$$y_{it} = \alpha + \beta' x_{it} + \mu_{it}$$

$$\mu_{it} = \rho' z_{it} + \varepsilon_{it}$$

$$\mathrm{cov}(x_{it}, \mu_{it}) \neq 0$$

It is well known that the least-squares regression coefficients of $y_{it}$
on $x_{it}$ are biased

=> endogeneity bias
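A minimal simulation sketch of this endogeneity bias follows. It assumes scalar x and z, Gaussian draws, and illustrative parameter values (β = 1, ρ = 2, x = 0.8 z + noise). None of these values come from the course, and the helper names are hypothetical.

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares regression of y on x (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def simulate_omitted_variable(n_obs=5000, alpha=0.5, beta=1.0, rho=2.0, seed=0):
    """Draw (x, y) from y = alpha + beta*x + rho*z + eps with cov(x, z) != 0."""
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n_obs):
        z = rng.gauss(0.0, 1.0)                  # unobserved variable
        x_obs = 0.8 * z + rng.gauss(0.0, 1.0)    # observed regressor, correlated with z
        eps = rng.gauss(0.0, 1.0)
        x.append(x_obs)
        y.append(alpha + beta * x_obs + rho * z + eps)
    return x, y

x, y = simulate_omitted_variable()
beta_hat = ols_slope(x, y)

# Under the assumed values, the OLS slope converges to
# beta + rho * cov(x, z) / var(x) = 1 + 2 * 0.8 / 1.64, not to beta = 1.
print(f"true beta = 1.0, OLS estimate omitting z = {beta_hat:.3f}")
```

With these values the estimate settles near 1.98 rather than 1: the omitted z is absorbed into the error term and contaminates the slope on x.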


Advantages of Panel Data

Example (cont'd): Let us assume that $z_{it} = z_i$, i.e. the $z$ values stay
constant through time for a given individual but vary across individuals
(individual effects).
$$y_{it} = \alpha + \beta' x_{it} + \mu_{it}$$
$$\mu_{it} = \rho' z_i + \varepsilon_{it} \quad \text{with} \quad \mathrm{cov}(x_{it}, \mu_{it}) \neq 0$$
Then, if we take the first difference of individual observations over time:

$$y_{it} - y_{i,t-1} = \beta' (x_{it} - x_{i,t-1}) + \varepsilon_{it} - \varepsilon_{i,t-1}$$

Least squares regression now provides unbiased and consistent
estimates of $\beta$.
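The effect of first-differencing can be sketched with a short simulation. It assumes a scalar regressor, a time-invariant unobserved z_i with x_it = z_i + noise, and illustrative values β = 1 and ρ = 2. These choices and the function names are assumptions made for the example.

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares regression of y on x (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def simulate_individual_effects(n_units=500, n_periods=5, beta=1.0, rho=2.0, seed=0):
    """Return one list of (x_it, y_it) pairs per unit, with an unobserved z_i."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        z_i = rng.gauss(0.0, 1.0)                 # constant over time for unit i
        series = []
        for _ in range(n_periods):
            x_it = z_i + rng.gauss(0.0, 1.0)      # regressor correlated with z_i
            y_it = 0.5 + beta * x_it + rho * z_i + rng.gauss(0.0, 1.0)
            series.append((x_it, y_it))
        panel.append(series)
    return panel

panel = simulate_individual_effects()

# Pooled least squares on levels: z_i is omitted, so the slope is biased.
x_levels = [x for series in panel for x, _ in series]
y_levels = [y for series in panel for _, y in series]
pooled = ols_slope(x_levels, y_levels)

# Least squares on first differences: z_i drops out of the equation.
dx, dy = [], []
for series in panel:
    for (x_prev, y_prev), (x_cur, y_cur) in zip(series, series[1:]):
        dx.append(x_cur - x_prev)
        dy.append(y_cur - y_prev)
first_diff = ols_slope(dx, dy)

print(f"pooled OLS slope = {pooled:.3f}, first-difference slope = {first_diff:.3f}")
```

The helper fits an intercept in the differenced regression as well. This is harmless here because the differenced equation has a zero constant.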


Advantages of Panel Data

Example (cont'd): Let us assume that $z_{it} = z_t$, i.e. the $z$ values are
common to all individuals but vary across time (common factors).

$$y_{it} = \alpha + \beta' x_{it} + \rho' z_t + \varepsilon_{it}, \quad i = 1, \dots, N, \quad t = 1, \dots, T$$

Then, if we consider deviations from the mean across individuals at a given
time:
$$y_{it} - \bar{y}_t = \beta' (x_{it} - \bar{x}_t) + \varepsilon_{it} - \bar{\varepsilon}_t$$
where

$$\bar{y}_t = \frac{1}{N} \sum_{i=1}^{N} y_{it}, \quad \bar{x}_t = \frac{1}{N} \sum_{i=1}^{N} x_{it}, \quad \bar{\varepsilon}_t = \frac{1}{N} \sum_{i=1}^{N} \varepsilon_{it}$$

Least squares regression now provides unbiased and consistent
estimates of $\beta$.
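The same idea can be sketched for a common factor. The simulation below assumes a scalar regressor with x_it = z_t + noise, N = 30 units, T = 100 periods and illustrative values β = 1 and ρ = 2. All names and values are assumptions made for the example.

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares regression of y on x (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def simulate_common_factor(n_units=30, n_periods=100, beta=1.0, rho=2.0, seed=0):
    """Return rows (t, x_it, y_it) with an unobserved factor z_t shared by all units."""
    rng = random.Random(seed)
    rows = []
    for t in range(n_periods):
        z_t = rng.gauss(0.0, 1.0)                 # common to all units at time t
        for _ in range(n_units):
            x_it = z_t + rng.gauss(0.0, 1.0)      # regressor correlated with z_t
            y_it = 0.5 + beta * x_it + rho * z_t + rng.gauss(0.0, 1.0)
            rows.append((t, x_it, y_it))
    return rows

rows = simulate_common_factor()

# Pooled least squares ignoring z_t.
pooled = ols_slope([x for _, x, _ in rows], [y for _, _, y in rows])

# Cross-sectional means at each date, then deviations from those means.
sums = {}
for t, x, y in rows:
    sx, sy, n = sums.get(t, (0.0, 0.0, 0))
    sums[t] = (sx + x, sy + y, n + 1)
means = {t: (sx / n, sy / n) for t, (sx, sy, n) in sums.items()}

x_dev = [x - means[t][0] for t, x, _ in rows]
y_dev = [y - means[t][1] for t, _, y in rows]
demeaned = ols_slope(x_dev, y_dev)

print(f"pooled OLS slope = {pooled:.3f}, slope after cross-sectional demeaning = {demeaned:.3f}")
```

Subtracting the period mean removes z_t exactly because it takes the same value for every unit at a given date.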


Advantages of Panel Data

Advantage 4: easier estimation and inference

Panel data involve two dimensions: a cross-sectional dimension N,
and a time-series dimension T.

We would expect that the computation of panel data estimators
would be more complicated than the analysis of cross-section data
alone (where T = 1) or time series data alone (where N = 1).

However, in certain cases the availability of panel data can actually
simplify the computation and inference.


Advantages of Panel Data

Example (time-series analysis of nonstationary data)

Let us consider a simple AR(1) model.

$$x_t = \rho x_{t-1} + \varepsilon_t$$

where the innovation $\varepsilon_t$ is i.i.d. $(0, \sigma_\varepsilon^2)$. Under the non-stationarity
assumption $\rho = 1$, it is well known that the asymptotic distribution of the
OLS estimator $\hat{\rho}$ is given by:

$$T (\hat{\rho} - 1) \xrightarrow[T \to \infty]{d} \frac{1}{2} \frac{W(1)^2 - 1}{\int_0^1 W(r)^2 \, dr}$$

where $W(.)$ denotes a standard Brownian motion.
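The non-normal shape of this limiting distribution can be seen in a small Monte Carlo sketch. It assumes Gaussian innovations, x_0 = 0, T = 100 and 2,000 replications. These settings and the function name `df_statistic` are choices made for the illustration.

```python
import random

def df_statistic(n_periods, rng):
    """T * (rho_hat - 1) for one simulated random walk x_t = x_{t-1} + eps_t, x_0 = 0."""
    x_prev, num, den = 0.0, 0.0, 0.0
    for _ in range(n_periods):
        x_cur = x_prev + rng.gauss(0.0, 1.0)
        num += x_prev * x_cur          # sum of x_{t-1} * x_t
        den += x_prev * x_prev         # sum of x_{t-1}^2
        x_prev = x_cur
    rho_hat = num / den                # OLS estimator without intercept
    return n_periods * (rho_hat - 1.0)

rng = random.Random(0)
stats = [df_statistic(100, rng) for _ in range(2000)]

share_negative = sum(s < 0 for s in stats) / len(stats)
mean_stat = sum(stats) / len(stats)

print(f"share of negative draws = {share_negative:.2f}")
print(f"mean of T(rho_hat - 1)  = {mean_stat:.2f}")
```

A centred normal distribution would put half of its mass below zero. Here the sign of the statistic is that of W(1)^2 - 1, so roughly two thirds of the draws are negative and the mean is clearly below zero. This is why critical values are tabulated by simulation.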


Advantages of Panel Data

Hence, the behavior of the usual test statistics in time series often
has to be inferred through computer simulations.

But if panel data are available, and observations among
cross-sectional units are independent, then one can invoke the
central limit theorem across cross-sectional units to show that
- the limiting distributions of many estimators remain asymptotically
normal
- the Wald type test statistics are asymptotically chi-square
distributed.

See for instance Levin and Lin (1993), Im, Pesaran and Shin (1999),
Phillips and Moon (1999, 2000), Quah (1994), etc.


Advantages of Panel Data

Example (time-series analysis of nonstationary data)

Let us consider the panel data model

$$x_{i,t} = \rho x_{i,t-1} + \varepsilon_{i,t}$$

where the innovation $\varepsilon_{i,t}$ is i.i.d. $(0, \sigma_\varepsilon^2)$ over $i$ and $t$. Then, under $\rho = 1$:

$$T \sqrt{N} (\hat{\rho} - 1) \xrightarrow[N, T \to \infty]{d} N(0, 2)$$
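A rough Monte Carlo check of this normal limit is sketched below. It assumes independent Gaussian random walks starting at zero, a pooled least-squares estimator without intercept, and small illustrative dimensions (N = 25, T = 50, 400 replications). With finite N and T the match with N(0, 2) is only approximate, and the function name is hypothetical.

```python
import math
import random

def pooled_unit_root_stat(n_units, n_periods, rng):
    """T * sqrt(N) * (rho_hat - 1), with rho_hat the pooled OLS estimator."""
    num, den = 0.0, 0.0
    for _ in range(n_units):
        x_prev = 0.0
        for _ in range(n_periods):
            eps = rng.gauss(0.0, 1.0)
            num += x_prev * eps        # x_{i,t-1} * (x_{i,t} - x_{i,t-1})
            den += x_prev * x_prev     # x_{i,t-1}^2
            x_prev += eps
    rho_hat_minus_one = num / den
    return n_periods * math.sqrt(n_units) * rho_hat_minus_one

rng = random.Random(0)
stats = [pooled_unit_root_stat(25, 50, rng) for _ in range(400)]

mean_stat = sum(stats) / len(stats)
var_stat = sum((s - mean_stat) ** 2 for s in stats) / (len(stats) - 1)

print(f"mean = {mean_stat:.2f}, variance = {var_stat:.2f} (limit: mean 0, variance 2)")
```

Averaging over independent cross-sectional units is what restores approximate normality. The small negative mean that remains in the simulation is a finite-N effect.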


Section 4

Issues Involved in using Panel Data



Issues with Panel Data

There are three main issues related to panel data:

1 Heterogeneity bias => Chapter 1
2 Dynamic panel data models (Nickell's bias) => Chapter 2
3 Selectivity bias (not specific to panel data models)


Issues with Panel Data

The heterogeneity issue

When important factors peculiar to a given individual are left out, the
typical assumption that the economic variable $y$ is generated by a parametric
probability distribution function $P(Y \mid \theta)$, where $\theta$ is an m-dimensional
real vector, identical for all individuals at all times, may not be a
realistic one.


Issues with Panel Data

Definition (Parameter heterogeneity issue)

The parameter heterogeneity issue (in the model specification)
consists in specifying and estimating the individual and/or time-specific
effects that exist among cross-sectional or time-series units but are not
captured by the included explanatory variables.


Issues with Panel Data

Example: Let us consider a production function (Cobb-Douglas) with two
factors (labor and capital). We have N countries and T periods. Let us
denote:
$$y_{i,t} = \alpha_i + \beta_i k_{i,t} + \gamma_i n_{i,t} + \varepsilon_{i,t}$$
with
$y_{i,t}$ the log of GDP for country $i$ at time $t$,
$n_{i,t}$ the log of employment for country $i$ at time $t$,
$k_{i,t}$ the log of the capital stock for country $i$ at time $t$,
$\varepsilon_{i,t}$ i.i.d. $(0, \sigma_\varepsilon^2)$, $\forall i$, $\forall t$.


Issues with Panel Data

Example (cont'd): In this specification, the intercept $\alpha_i$ and the
elasticities $\beta_i$ and $\gamma_i$ are specific to each country
$$y_{i,t} = \alpha_i + \beta_i k_{i,t} + \gamma_i n_{i,t} + \varepsilon_{i,t}$$

Several alternative specifications can be considered.

First, we can assume that the production function is the same for all
countries: in this case we have a homogeneous specification:

$$y_{i,t} = \alpha + \beta k_{i,t} + \gamma n_{i,t} + \varepsilon_{i,t}$$

$$\alpha_i = \alpha, \quad \beta_i = \beta, \quad \gamma_i = \gamma$$


Issues with Panel Data

Example (cont'd): However, a homogeneous specification of the
production function for macro aggregated data is meaningless.

Alternatively, we can consider a heterogeneous Total Factor
Productivity (TFP), with $E(\alpha_i + \varepsilon_{i,t}) = \alpha_i$, due to institutional
factors, organizational factors, etc.

Thus, we can have a specification with individual effects $\alpha_i$ and
common slope parameters (elasticities $\beta$ and $\gamma$).

$$y_{i,t} = \alpha_i + \beta k_{i,t} + \gamma n_{i,t} + \varepsilon_{i,t}$$

$$\beta_i = \beta, \quad \gamma_i = \gamma$$


Issues with Panel Data

Example (cont'd):

Finally, we can assume that the labor and/or capital elasticities are
different across countries.

In this case, we will have a heterogeneous specification of the panel
data model (heterogeneous panel).

$$y_{i,t} = \alpha_i + \beta_i k_{i,t} + \gamma_i n_{i,t} + \varepsilon_{i,t}$$


Issues with Panel Data

Example (cont'd):

$$y_{i,t} = \alpha_i + \beta_i k_{i,t} + \gamma_i n_{i,t} + \varepsilon_{i,t}$$

In this case, there are two solutions to estimate the parameters:

1 The first solution consists in using N time series models to produce
some group-mean estimates of the elasticities.
2 Consider a random coefficient model. In this case, we assume that the
parameters $\beta_i$ and $\gamma_i$ are randomly distributed, with for instance:

$$\beta_i \sim \text{i.i.d.}(\beta, \sigma_\beta^2), \quad \gamma_i \sim \text{i.i.d.}(\gamma, \sigma_\gamma^2)$$
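The first solution (group-mean estimates) can be sketched as follows. To keep the example short it assumes a single regressor instead of the two factors k and n, Gaussian random slopes with mean 1 and standard deviation 0.5, and N = 100 units observed over T = 30 periods. All names and values are illustrative assumptions.

```python
import random

def ols_intercept_slope(x, y):
    """Least-squares intercept and slope of y on x for one time series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def simulate_heterogeneous_panel(n_units=100, n_periods=30, beta=1.0,
                                 sigma_beta=0.5, seed=0):
    """Return per-unit (x, y) series and the true slopes beta_i ~ (beta, sigma_beta^2)."""
    rng = random.Random(seed)
    panel, true_slopes = [], []
    for _ in range(n_units):
        alpha_i = rng.gauss(0.0, 1.0)
        beta_i = rng.gauss(beta, sigma_beta)
        xs = [rng.gauss(0.0, 1.0) for _ in range(n_periods)]
        ys = [alpha_i + beta_i * x + rng.gauss(0.0, 1.0) for x in xs]
        panel.append((xs, ys))
        true_slopes.append(beta_i)
    return panel, true_slopes

panel, true_slopes = simulate_heterogeneous_panel()

# One time-series regression per unit, then the cross-sectional average.
unit_slopes = [ols_intercept_slope(xs, ys)[1] for xs, ys in panel]
mean_group = sum(unit_slopes) / len(unit_slopes)
average_true_slope = sum(true_slopes) / len(true_slopes)

print(f"group-mean estimate = {mean_group:.3f}, "
      f"average of true slopes = {average_true_slope:.3f}")
```

Each individual regression uses only T observations, so this approach needs a time dimension large enough for the unit-level estimates to be informative.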


Issues with Panel Data

Fact (Heterogeneity bias)


Ignoring such heterogeneity (in slope and/or constant) could lead to
inconsistent or meaningless estimates of interesting parameters.



Issues with Panel Data

The heterogeneity bias

Let us consider as a DGP a simple linear model with individual effects
and only one explanatory variable $x_{it}$ (common slope).

$$y_{it} = \alpha_i + \beta x_{it} + \varepsilon_{it}$$

Let us assume that all $NT$ observations $\{x_{it}, y_{it}\}$ are used to estimate
the following homogeneous model.

$$y_{it} = \alpha + \beta x_{it} + \varepsilon_{it}$$


Issues with Panel Data

The heterogeneity bias

Source: Hsiao (2003)

Broken ellipses = point scatter for an individual over time
Broken straight lines = individual regressions
Solid lines = least-squares regression using all NT observations


Issues with Panel Data

The heterogeneity bias

All of these figures depict situations in which biases (on $\hat{\beta}$) arise in
pooled least-squares estimates because of heterogeneous intercepts.

Obviously, in these cases, pooled regression ignoring heterogeneous
intercepts should never be used.

Moreover, the direction of the bias of the pooled slope estimates
cannot be identified a priori; it can go either way.
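That the bias can go either way is easy to reproduce in a simulation sketch. It assumes that the intercepts are proportional to each unit's average level of x (alpha_i = c * m_i), with c = +2 or c = -2, and a true common slope β = 1. This design and the function names are assumptions made for the illustration.

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares regression of y on x (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def simulate(n_units=200, n_periods=5, beta=1.0, c=2.0, seed=0):
    """Panel with intercepts alpha_i = c * m_i, where m_i is unit i's mean level of x."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        m_i = rng.gauss(0.0, 1.0)
        alpha_i = c * m_i
        series = []
        for _ in range(n_periods):
            x = m_i + rng.gauss(0.0, 1.0)
            y = alpha_i + beta * x + rng.gauss(0.0, 1.0)
            series.append((x, y))
        panel.append(series)
    return panel

def pooled_slope(panel):
    """Least squares on all NT observations with a single common intercept."""
    xs = [x for series in panel for x, _ in series]
    ys = [y for series in panel for _, y in series]
    return ols_slope(xs, ys)

def within_slope(panel):
    """Least squares after subtracting each unit's own time averages."""
    xd, yd = [], []
    for series in panel:
        mean_x = sum(x for x, _ in series) / len(series)
        mean_y = sum(y for _, y in series) / len(series)
        for x, y in series:
            xd.append(x - mean_x)
            yd.append(y - mean_y)
    return ols_slope(xd, yd)

for c in (2.0, -2.0):
    panel = simulate(c=c)
    print(f"c = {c:+.0f}: pooled slope = {pooled_slope(panel):.2f}, "
          f"within slope = {within_slope(panel):.2f} (true beta = 1)")
```

With c = +2 the pooled slope overshoots the true value, with c = -2 it falls towards zero. The within transformation, which removes the heterogeneous intercepts, recovers a slope close to 1 in both cases.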


Issues with Panel Data

The heterogeneity bias

Let us consider another example. The true DGP is heterogeneous

$$y_{it} = \alpha_i + \beta_i x_{it} + \varepsilon_{it}$$

and we use all $NT$ observations $\{x_{it}, y_{it}\}$ to estimate the homogeneous
model.
$$y_{it} = \alpha + \beta x_{it} + \varepsilon_{it}$$


Issues with Panel Data

Pooling the NT observations,
assuming identical parameters for all
cross-sectional units, leads to
nonsensical results.

It amounts to estimating an average of
coefficients that differ across
individuals (the phantasm of the
NT observations).


Issues with Panel Data

In this case, pooling gives rise to the false
inference that the pooled relation is
curvilinear.


Issues with Panel Data

Fact (Heterogeneity issue)

In both cases, the classic paradigm of the "representative agent" simply
does not hold, and pooling the data under a homogeneity assumption makes
no sense.


Section 5

Course Information



Course Information

Course outline
Chapter 1: Linear Panel Models and Heterogeneity
Chapter 2: Dynamic Panel Data Models
Chapter 3: Non Stationarity and Panel Data Models
Chapter 4: Non Linear Panel Data Models



Course Information

Books: advanced econometrics (not specific to panel data)

Amemiya T. (1985), Advanced Econometrics, Harvard University Press.
Cameron A.C. and P.K. Trivedi (2005), Microeconometrics: Methods and
Applications, Cambridge University Press, Cambridge, U.S.A.
Davidson J. (2000), Econometric Theory, Blackwell Publishers, Oxford.
Davidson R. and J. MacKinnon (2004), Econometric Theory and Methods,
Oxford University Press, Oxford.
Greene W. (2007), Econometric Analysis, sixth edition, Pearson.
Johnston J. and J. DiNardo (1997), Econometric Methods, 4th ed., The
McGraw-Hill Companies Inc., New York.


Course Information

Books: panel data econometrics (I/II)

Arellano M. (2003), Panel Data Econometrics, Oxford University Press, U.K.
Baltagi B. (2005), Econometric Analysis of Panel Data, third edition, John
Wiley & Sons, New York.
Baltagi B. (2006), Panel Data Econometrics: Theoretical Contributions and
Empirical Applications, Elsevier, Amsterdam.
Hsiao C. (2003), Analysis of Panel Data, Cambridge University Press
(recommended).
Krishnakumar J. and E. Ronchetti (2000), Panel Data Econometrics: Future
Directions, Elsevier, Amsterdam.
Maddala G.S. (1983), Limited-Dependent and Qualitative Variables in
Econometrics, Cambridge University Press.


Course Information

Books: panel data econometrics (II/II)

Matyas L. and P. Sevestre (2008), The Econometrics of Panel Data,
Springer-Verlag, Berlin.
Wooldridge J.M. (2010), Econometric Analysis of Cross Section and Panel
Data, MIT Press (recommended).

Books: panel data econometrics (in French)

Pirotte A. (2011), Econométrie des données de panel, Economica.
Sevestre P. (2002), Econométrie des données de panel, Dunod, Paris.


Course Information

Additional references (articles and surveys) among many others...

Baltagi B.H. and C. Kao (2000), "Nonstationary panels, cointegration in
panels and dynamic panels: a survey", in Advances in Econometrics, 15,
edited by B. Baltagi and C. Kao, 7-51, Elsevier Science.
Dumitrescu E. and C. Hurlin (2012), "Testing for Granger Non-causality in
Heterogeneous Panels", Economic Modelling, 29, 1450-1460.
Hurlin C. and V. Mignon (2005), "Une synthèse des tests de racine unitaire
sur données de panel", Economie et Prévision, 169-171, 253-294.
Hurlin C. and V. Mignon (2007), "Une synthèse des tests de cointégration
sur données de panel", Economie et Prévision, 180-181, 241-265.


End of the general introduction

Christophe Hurlin (University of Orléans)
