Basic Econometrics: TWO-VARIABLE REGRESSION MODEL

This document summarizes the method of ordinary least squares (OLS) for estimating parameters in a two-variable linear regression model. It describes how OLS provides unique estimates of the parameters that minimize the sum of squared residuals. It also outlines the 10 classical assumptions underlying the linear regression model, including that the relationship is linear, the disturbance term has a mean of zero and constant variance, and the disturbances are uncorrelated with the regressors.


Chapter Three

TWO-VARIABLE REGRESSION MODEL:
THE PROBLEM OF ESTIMATION

3.1 THE METHOD OF ORDINARY LEAST SQUARES


Under certain assumptions, the method of least squares has some very
attractive statistical properties that have made it one of the most powerful and
popular methods of regression analysis. To understand this method, we first explain
the least squares principle.
Recall the two-variable PRF:

Yi = β1 + β2Xi + ui   (2.4.2)
However, as we noted in Chapter 2, the PRF is not directly observable. We estimate it from the SRF:

Yi = β̂1 + β̂2Xi + ûi = Ŷi + ûi   (2.6.2)

that is, ûi = Yi − Ŷi, which shows that the ûi (the residuals) are simply the differences between the actual and estimated Y values.
Now given n pairs of observations on Y and X, we would like to determine the SRF in such a manner that it is as close as possible to the actual Y. To this end, we may adopt the following criterion: choose the SRF in such a way that the sum of the residuals Σûi = Σ(Yi − Ŷi) is as small as possible.

The criterion of minimizing Σûi, however, is not very satisfactory, because positive and negative residuals can cancel each other out even when the individual residuals are large. The method of least squares instead adopts the criterion of minimizing the sum of squared residuals: for a given sample, it provides us with unique estimates of β1 and β2 that give the smallest possible value of Σûi².

How is this accomplished? The process of differentiation yields the following equations for estimating β1 and β2:

ΣYi = nβ̂1 + β̂2ΣXi   (3.1.4)
ΣYiXi = β̂1ΣXi + β̂2ΣXi²   (3.1.5)

where n is the sample size. These simultaneous equations are known as the normal equations.
Solving the normal equations simultaneously, we obtain

β̂2 = [nΣXiYi − ΣXiΣYi] / [nΣXi² − (ΣXi)²]   (3.1.6)

β̂1 = Ȳ − β̂2X̄   (3.1.7)

where X̄ and Ȳ are the sample means of X and Y. Equation (3.1.7) can be obtained directly from (3.1.4) by simple algebraic manipulations.
Incidentally, note that, by making use of simple algebraic identities, formula (3.1.6) for estimating β2 can be alternatively expressed as

β̂2 = Σxiyi / Σxi²

where xi = Xi − X̄ and yi = Yi − Ȳ are deviations from the sample means.
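The formulas above can be sketched in code. The following is a minimal illustration with made-up data, computing β̂2 = Σxiyi / Σxi² and β̂1 = Ȳ − β̂2X̄ in deviation form:

```python
def ols_two_variable(X, Y):
    """OLS estimates for Y = b1 + b2*X via the deviation-form formulas."""
    n = len(X)
    xbar = sum(X) / n
    ybar = sum(Y) / n
    x = [xi - xbar for xi in X]          # deviations of X from its mean
    y = [yi - ybar for yi in Y]          # deviations of Y from its mean
    b2 = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    b1 = ybar - b2 * xbar                # line passes through (Xbar, Ybar)
    return b1, b2

# Hypothetical sample, chosen only for illustration.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
b1, b2 = ols_two_variable(X, Y)
print(round(b1, 6), round(b2, 6))  # -> 2.2 0.6
```

Here Σxiyi = 6 and Σxi² = 10, so β̂2 = 0.6 and β̂1 = 4 − 0.6 · 3 = 2.2, confirming the hand calculation.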
The regression line thus obtained has the following properties:
1. It passes through the sample means of Y and X. This fact is obvious from
(3.1.7), for the latter can be written as Ȳ = β̂1 + β̂2X̄,
which is shown diagrammatically in Figure 3.2.

2. The mean value of the estimated Y, that is, Ŷi, is equal to the mean value of the actual Y

3. The mean value of the residuals ûi is zero

As a result of the preceding property, the sample regression

Yi = β̂1 + β̂2Xi + ûi   (2.6.2)

can be expressed in an alternative form where both Y and X are expressed as deviations from their mean values. To see this, sum (2.6.2) on both sides to give

ΣYi = nβ̂1 + β̂2ΣXi + Σûi = nβ̂1 + β̂2ΣXi   (3.1.11)

since Σûi = 0. Dividing Eq. (3.1.11) through by n, we obtain

Ȳ = β̂1 + β̂2X̄   (3.1.12)

which is the same as (3.1.7). Subtracting Eq. (3.1.12) from (2.6.2), we obtain

yi = β̂2xi + ûi   (3.1.13)
where yi and xi , following our convention, are deviations from their respective
(sample) mean values.
Equation (3.1.13) is known as the deviation form. Notice that the intercept term β̂1
is no longer present in it. But the intercept term can always be estimated by
(3.1.7), that is, from the fact that the sample regression line passes through the
sample means of Y and X.
An advantage of the deviation form is that it often simplifies computing formulas.
In passing, note that in the deviation form, the SRF can be written as

ŷi = β̂2xi
4. The residuals ûi are uncorrelated with the predicted Ŷi.

5. The residuals ûi are uncorrelated with Xi; that is, ΣûiXi = 0.
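Properties 2 through 5 can be verified numerically. A small sketch with hypothetical data (the OLS estimates are computed with the deviation-form formulas of this section) confirms that Σûi = 0, ΣûiXi = 0, ΣûiŶi = 0, and that the mean of Ŷi equals Ȳ, up to floating-point rounding:

```python
# Hypothetical sample, for illustration only.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n
b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(X, Y)) \
     / sum((xi - xbar) ** 2 for xi in X)
b1 = ybar - b2 * xbar

Yhat = [b1 + b2 * xi for xi in X]          # fitted values
u = [yi - yh for yi, yh in zip(Y, Yhat)]   # residuals

assert abs(sum(u)) < 1e-9                                   # property 3
assert abs(sum(Yhat) / n - ybar) < 1e-9                     # property 2
assert abs(sum(ui * yh for ui, yh in zip(u, Yhat))) < 1e-9  # property 4
assert abs(sum(ui * xi for ui, xi in zip(u, X))) < 1e-9     # property 5
print("all residual properties hold")
```

These identities follow directly from the normal equations, so they hold for any sample, not just this one.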
3.2 THE CLASSICAL LINEAR REGRESSION MODEL:
THE ASSUMPTIONS UNDERLYING THE METHOD OF LEAST SQUARES

The Gaussian, standard, or classical linear regression model (CLRM), which is


the cornerstone of most econometric theory, makes 10 assumptions.
We first discuss these assumptions in the context of the two-variable regression
model, and in Chapter 7 we extend them to multiple regression models, that is,
models in which there is more than one regressor.
Assumption 1: Linear regression model. The regression model is
linear in the parameters, as shown in (2.4.2)

Assumption 2: X values are fixed in repeated sampling. Values


taken by the regressor X are considered fixed in repeated samples.
More technically, X is assumed to be nonstochastic

Assumption 3: Zero mean value of disturbance ui. Given the
value of X, the mean, or expected, value of the random
disturbance term ui is zero. Technically, the conditional mean
value of ui is zero. Symbolically, we have

E(ui | Xi) = 0   (3.2.1)
Assumption 4: Homoscedasticity or equal variance of ui.


Given the value of X, the variance of ui is the same for all
observations. That is, the conditional variances of ui are identical.
Symbolically, we have

var(ui | Xi) = E[ui − E(ui | Xi)]² = E(ui² | Xi) = σ²   (3.2.2)
where var stands for variance.

Technically, (3.2.2) represents the assumption of homoscedasticity, or equal


(homo) spread (scedasticity) or equal variance.
Put simply, the variation around the regression line (which is the line of average
relationship between Y and X) is the same across the X values; it neither increases
nor decreases as X varies. Diagrammatically, the situation is as depicted in Figure
3.4.

In contrast, consider Figure 3.5, where the conditional variance of the Y population
varies with X. This situation is known appropriately as heteroscedasticity, or
unequal spread, or variance. Symbolically, in this situation (3.2.2) can be written as

var(ui | Xi) = σi²   (3.2.3)
Notice the subscript on σ² in Eq. (3.2.3), which indicates that the variance of the Y
population is no longer constant.
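The contrast between Figures 3.4 and 3.5 can be illustrated with a small simulation (the data and parameters are made up, not part of the text): homoscedastic disturbances have roughly the same spread at every X, while disturbances generated with a standard deviation that grows with X show a much larger spread at large X.

```python
import random

random.seed(42)  # fixed seed so the illustration is reproducible

def sample_variance(v):
    m = sum(v) / len(v)
    return sum((vi - m) ** 2 for vi in v) / (len(v) - 1)

X = list(range(1, 501))
# Homoscedastic: sd of u is 1.0 regardless of X (Assumption 4 holds).
u_homo = [random.gauss(0, 1.0) for _ in X]
# Heteroscedastic: sd of u grows with X, violating Assumption 4.
u_het = [random.gauss(0, 0.01 * xi) for xi in X]

lo_h = sample_variance(u_homo[:250]); hi_h = sample_variance(u_homo[250:])
lo_t = sample_variance(u_het[:250]);  hi_t = sample_variance(u_het[250:])

print(round(hi_h / lo_h, 2))  # near 1: equal spread at low and high X
print(round(hi_t / lo_t, 2))  # well above 1: spread grows with X
```

Splitting the sample at the median of X and comparing variances is only an informal check; formal tests for heteroscedasticity are discussed in later chapters.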

Assumption 5: No autocorrelation between the disturbances.


Given any two X values, Xi and Xj (i ≠ j), the correlation between any
two ui and uj (i ≠ j) is zero. Symbolically,

cov(ui, uj | Xi, Xj) = 0   (3.2.5)

where i and j are two different observations and where cov means covariance.

In words, (3.2.5) postulates that the disturbances ui and uj are uncorrelated.

Technically, this is the assumption of no serial correlation, or no autocorrelation.
This means that, given Xi , the deviations of any two Y values from their mean value
do not exhibit patterns such as those shown in Figure 3.6a and b. In Figure 3.6a, we
see that the u’s are positively correlated, a positive u followed by a positive u or a
negative u followed by a negative u. In Figure 3.6b, the u’s are negatively
correlated, a positive u followed by a negative u and vice versa.
If the disturbances (deviations) follow systematic patterns, such as those shown in
Figure 3.6a and b, there is auto- or serial correlation, and what Assumption 5
requires is that such correlations be absent. Figure 3.6c shows that there is no
systematic pattern to the u’s, thus indicating zero correlation.
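Assumption 5 can likewise be illustrated by simulation (again with made-up parameters): serially independent disturbances have a first-order sample autocorrelation near zero, whereas disturbances generated as u_t = 0.9·u_{t−1} + e_t, the kind of pattern sketched in Figure 3.6a, show a strongly positive one.

```python
import random

random.seed(7)  # fixed seed for reproducibility

def autocorr1(u):
    """First-order sample autocorrelation of a series."""
    m = sum(u) / len(u)
    d = [ui - m for ui in u]
    num = sum(d[t] * d[t - 1] for t in range(1, len(d)))
    den = sum(di * di for di in d)
    return num / den

n = 1000
# No autocorrelation: each u_t drawn independently (Assumption 5 holds).
u_iid = [random.gauss(0, 1) for _ in range(n)]
# Positive autocorrelation: u_t = 0.9 * u_{t-1} + e_t (Figure 3.6a pattern).
u_ar = [0.0]
for _ in range(n - 1):
    u_ar.append(0.9 * u_ar[-1] + random.gauss(0, 1))

print(round(autocorr1(u_iid), 2))  # near 0
print(round(autocorr1(u_ar), 2))   # near 0.9
```

Formal tests for serial correlation, such as the Durbin–Watson test, are taken up in later chapters.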

Assumption 6: Zero covariance between ui and Xi. The disturbance u and the explanatory variable X are uncorrelated; symbolically, cov(ui, Xi) = 0.

Assumption 7: The number of observations n must be greater than the


number of parameters to be estimated. Alternatively, the number of
observations n must be greater than the number of explanatory variables.

Assumption 8: Variability in X values. The X values in a given sample must


not all be the same. Technically, var (X) must be a finite positive number.

Assumption 9: The regression model is correctly specified. Alternatively,
there is no specification bias or error in the model used in empirical analysis.

Some important questions that arise in the specification of the model include
the following:
(1) What variables should be included in the model?
(2) What is the functional form of the model? Is it linear in the parameters, the
variables, or both?
(3) What are the probabilistic assumptions made about the Yi , the Xi, and the ui
entering the model?
These are extremely important questions, for by omitting important variables
from the model, or by choosing the wrong functional form, or by making wrong
stochastic assumptions about the variables of the model, the validity of interpreting
the estimated regression will be highly questionable.

Our discussion of the assumptions underlying the classical linear regression model is now
completed. It is important to note that all these assumptions pertain to the PRF only and
not the SRF. But it is interesting to observe that the method of least squares discussed
previously has some properties that are similar to the assumptions we have made about
the PRF. For example, the finding that Σûi = 0, and, therefore, the mean of the
residuals is zero, is akin to the assumption that E(ui | Xi) = 0. Likewise, the
finding that ΣûiXi = 0 is similar to the assumption that cov(ui, Xi) = 0.
It is comforting to note that the method of least squares thus tries to “duplicate” some
of the assumptions we have imposed on the PRF.
When we go beyond the two-variable model and consider multiple
regression models, that is, models containing several regressors, we
add the following assumption.

Assumption 10: There is no perfect multicollinearity. That is, there are no


perfect linear relationships among the explanatory variables.
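Perfect multicollinearity can be seen directly in the normal equations. In the sketch below (hypothetical data), a second regressor that is an exact multiple of the first makes the matrix of deviation cross-products singular, so the least-squares normal equations have no unique solution.

```python
# With X2 = 2 * X1 exactly, the 2x2 matrix of deviation cross-products
# [[Sx1x1, Sx1x2], [Sx1x2, Sx2x2]] has determinant zero, so the
# multiple-regression normal equations cannot be solved uniquely.
X1 = [1, 2, 3, 4, 5]
X2 = [2 * x for x in X1]          # perfect linear relationship

def devs(v):
    m = sum(v) / len(v)
    return [vi - m for vi in v]

x1, x2 = devs(X1), devs(X2)
s11 = sum(a * a for a in x1)
s22 = sum(a * a for a in x2)
s12 = sum(a * b for a, b in zip(x1, x2))

det = s11 * s22 - s12 * s12
print(det)  # -> 0.0
```

With s11 = 10, s22 = 40, and s12 = 20, the determinant is 10·40 − 20² = 0, which is exactly what Assumption 10 rules out.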

3.3 PRECISION OR STANDARD ERRORS OF LEAST-SQUARES ESTIMATES
Given the Gaussian assumptions, the standard errors of the OLS estimates can be
obtained as follows:

var(β̂2) = σ² / Σxi²   (3.3.1)        se(β̂2) = σ / √(Σxi²)   (3.3.2)
var(β̂1) = [ΣXi² / (nΣxi²)] σ²   (3.3.3)        se(β̂1) = √[ΣXi² / (nΣxi²)] σ   (3.3.4)

where var = variance and se = standard error and where σ² is the constant or
homoscedastic variance of ui of Assumption 4.
All the quantities entering into the preceding equations except σ² can be estimated
from the data. σ² itself is estimated by the following formula:

σ̂² = Σûi² / (n − 2)   (3.3.5)

where σ̂² is the OLS estimator of the true but unknown σ² and where the expression
n − 2 is known as the number of degrees of freedom (df), Σûi² being the sum of the
residuals squared or the residual sum of squares (RSS).
Once Σûi² is known, σ̂² can be easily computed. Σûi² itself can be computed either
from (3.1.2) or from the following expression:

Σûi² = Σyi² − β̂2²Σxi²   (3.3.6)

Compared with Eq. (3.1.2), Eq. (3.3.6) is easy to use, for it does not require computing
ûi for each observation, although such a computation will be useful in its own right.

Since β̂2 = Σxiyi / Σxi², an alternative expression for computing Σûi² is

Σûi² = Σyi² − (Σxiyi)² / Σxi²

In passing, note that the positive square root of σ̂²,

σ̂ = √[Σûi² / (n − 2)]

is known as the standard error of estimate or the standard error of the
regression (se). It is simply the standard deviation of the Y values about the
estimated regression line and is often used as a summary measure of the
“goodness of fit” of the estimated regression line.
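The formulas of this section can be sketched numerically (hypothetical data; the RSS is computed via the shortcut of Eq. (3.3.6), so no individual residuals are needed):

```python
import math

# Hypothetical sample, for illustration only.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n
x = [xi - xbar for xi in X]
y = [yi - ybar for yi in Y]

Sxx = sum(a * a for a in x)          # sum of x_i^2
Syy = sum(a * a for a in y)          # sum of y_i^2
Sxy = sum(a * b for a, b in zip(x, y))

b2 = Sxy / Sxx
rss = Syy - b2 ** 2 * Sxx            # Eq. (3.3.6): RSS without residuals
sigma2_hat = rss / (n - 2)           # estimator of sigma^2, df = n - 2
se_b2 = math.sqrt(sigma2_hat / Sxx)  # se of the slope estimate
se_b1 = math.sqrt(sigma2_hat * sum(xi ** 2 for xi in X) / (n * Sxx))

print(round(rss, 4), round(sigma2_hat, 4))   # -> 2.4 0.8
print(round(se_b2, 4), round(se_b1, 4))      # -> 0.2828 0.9381
```

Here Σyi² = 6, β̂2 = 0.6, and Σxi² = 10, so RSS = 6 − 0.36 · 10 = 2.4 and σ̂² = 2.4/3 = 0.8, matching the printed values.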

3.4 PROPERTIES OF LEAST-SQUARES ESTIMATORS: THE GAUSS–MARKOV THEOREM

Given the assumptions of the classical linear regression model, the least-squares
estimators, in the class of linear unbiased estimators, have minimum variance;
that is, they are BLUE (best linear unbiased estimators).
3.5 THE COEFFICIENT OF DETERMINATION r²: A MEASURE OF “GOODNESS OF FIT”

To compute this r², we note that the total variation in Y can be decomposed as

Σyi² = Σŷi² + Σûi²

that is,

TSS = ESS + RSS   (3.5.3)

where TSS is the total sum of squares, ESS the explained sum of squares, and RSS
the residual sum of squares. Now dividing (3.5.3) by TSS on both sides, we obtain

1 = ESS/TSS + RSS/TSS

We now define r² as

r² = ESS / TSS   (3.5.5)

or, alternatively, as

r² = 1 − RSS / TSS   (3.5.5a)
The quantity r² thus defined is known as the (sample) coefficient of determination and
is the most commonly used measure of the goodness of fit of a regression line. Verbally,
r² measures the proportion or percentage of the total variation in Y explained by the
regression model.

Although r² can be computed directly from its definition given in (3.5.5), it can be
obtained more quickly from the following formula:

r² = β̂2² (Σxi² / Σyi²)   (3.5.6)
If we divide the numerator and the denominator of (3.5.6) by the sample size n (or
n − 1 if the sample size is small), we obtain

r² = β̂2² (Sx² / Sy²)

where Sx² and Sy² are the sample variances of X and Y, respectively, an expression
that may be computationally easy to obtain.


Given the definition of r², we can express ESS and RSS discussed earlier as follows:

ESS = r² · TSS = r² Σyi²
RSS = TSS − ESS = (1 − r²) Σyi²
Therefore, we can write

TSS = ESS + RSS = r² Σyi² + (1 − r²) Σyi²

an expression that we will find very useful later.


A quantity closely related to but conceptually very much different from r² is the
coefficient of correlation, which is a measure of the degree of association between two
variables. It can be computed either from

r = ±√r²

or from its definition

r = Σxiyi / √(Σxi² Σyi²)
which is known as the sample correlation coefficient.
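A short numerical sketch (hypothetical data, same style as before) confirms that the definition r² = ESS/TSS, the shortcut formula (3.5.6), and the correlation coefficient r = Σxiyi / √(Σxi² Σyi²) all agree:

```python
import math

# Hypothetical sample, for illustration only.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n
x = [xi - xbar for xi in X]
y = [yi - ybar for yi in Y]
Sxx = sum(a * a for a in x)
Syy = sum(a * a for a in y)                    # TSS
Sxy = sum(a * b for a, b in zip(x, y))

b2 = Sxy / Sxx
ess = b2 ** 2 * Sxx                            # explained sum of squares
r2_def = ess / Syy                             # r^2 from its definition
r2_short = b2 ** 2 * (Sxx / Syy)               # shortcut formula (3.5.6)
r = Sxy / math.sqrt(Sxx * Syy)                 # sample correlation coefficient

print(round(r2_def, 4), round(r2_short, 4), round(r, 4))  # -> 0.6 0.6 0.7746
```

With ESS = 0.36 · 10 = 3.6 and TSS = 6, r² = 0.6, and the positive square root √0.6 ≈ 0.7746 equals Σxiyi / √(Σxi² Σyi²) = 6/√60.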

