Demand Estimation
DEMAND FORECASTING
OVERVIEW
Demand Curve Estimation
Identification Problem
Interview and Experimental Methods
Regression Analysis
Measuring Regression Model Significance
Measures of Individual Variable Significance
Demand/Sales/Revenue/Profit Forecasting Methods:
Single Equation Regression Models,
Simultaneous Equation Regression Models,
Autoregressive Integrated Moving Average (ARIMA)
Models, and
Vector Autoregressive (VAR) Models
KEY CONCEPTS
simultaneous relation
identification problem
consumer interview
market experiments
regression analysis
deterministic relation
statistical relation
time series
cross section
scatter diagram
linear model
multiplicative model
simple regression model
multiple regression model
standard error of the estimate (SEE)
correlation coefficient
coefficient of determination
degrees of freedom
corrected coefficient of determination
F statistic
t statistic
two-tail t tests
one-tail t tests
Demand Curve Estimation
Simple Linear Demand Curves
The best estimation method balances
marginal costs and marginal benefits.
Simple linear relations are useful for demand
estimation.
Using Simple Linear Demand Curves
Straight-line relations give useful
approximations.
Identification Problem
Changing Nature of Demand Relations
Demand relations are dynamic.
Interplay of Supply and Demand
Economic conditions affect demand and
supply.
Shifts in Demand and Supply
Curve shifts can be estimated.
Simultaneous Relations
Interview and Experimental
Methods
Consumer Interviews
Interviews can solicit useful information when
market data is scarce.
Interview opinions often differ from actual
market transaction data.
Market Experiments
Controlled experiments can generate useful
insight.
Experiments can become expensive.
Regression Analysis
What Is a Statistical Relation?
A statistical relation exists when averages are related.
A deterministic relation is exact – known with certainty.
Specifying the Regression Model
Dependent Variable/Explained
Variable/Predictand/Regressand/Response/Endogenous
Variable
Explanatory Variable/Independent
Variable/Predictor/Regressor/Stimulus or Control
Variable/Exogenous Variable
Dependent variable Y is caused by X.
X variables are determined independently of Y.
Least Squares Method
Minimize sum of squared residuals.
Measuring Regression Model
Significance
Standard Error of the Estimate (SEE) increases with
scatter about the regression line.
Goodness of Fit, r and R²
r = 1 means perfect correlation; r = 0
means no correlation.
R² = 1 means perfect fit; R² = 0 means no
relation.
Corrected Coefficient of Determination, R̄²
Adjusts R² downward for small samples.
F statistic
Tells if R2 is statistically significant.
Measures of Individual Variable
Significance
t statistics
t statistics compare a sample characteristic to the
standard deviation of that characteristic.
A calculated t statistic greater than two in absolute
value suggests a strong effect of X on Y (95% confidence).
A calculated t statistic greater than three in absolute
value suggests a very strong effect of X on Y (99% confidence).
Two-tail t Tests
Tests of effect.
One-Tail t Tests
Tests of magnitude or direction.
Demand Estimation
What will happen to quantity demanded,
total revenue and profit if we increase
prices?
What will happen to demand if consumer
incomes increase or decrease due to an
economic expansion or contraction?
What effect will a tuition increase have on
Marquette’s revenue?
Practical Example: Port Authority
Transit Case
How will the fare
price increase affect
demand and overall
revenues?
What other factors,
besides fares, affect
demand?
Demand Estimation Using Market
Research Techniques
How do we estimate the Demand
Function?
Econometric Techniques (Your Project)
Non-econometric Techniques
Look first at Non-econometric Approaches
What are these?
Consumer Surveys: Just Ask Them
Question customers to estimate demand
“How many bags of chips would you buy if the
price was Rs. 2.29/bag?”
“How many cases of beer would you buy if the
price of beer was Rs. 11.99/case?”
Compare different individuals’ responses
Advantages:
Flexible
Relatively inexpensive to conduct
Disadvantages:
Many potential biases: strategic, information,
hypothetical, interviewer
Market Experiments
Firms vary prices and/or advertising and
compare consumer behavior
Over time (e.g., before and after rebate offer)
Over space (e.g., compare Delhi and Haryana
consumption when prices are varied between two
regions)
Potential Problems
Control of other factors not guaranteed.
“Playing” with market prices may be risky.
Expensive
Consumer Clinics and Focus
Groups
Simulated market setting in which consumers
are given income to spend on a variety of goods
The experimenters control income, prices,
advertising, packaging, etc.
Advantages:
Flexibility
Disadvantages:
Selectivity bias
Very expensive
Econometrics
“Economic Measurement”
Collection of statistical techniques
available for testing economic theories by
empirically measuring relationships among
economic variables.
Quantify economic reality – bridge the gap
between abstract theory and real world
human activity.
Practical Example
How does the state of
Delhi set a budget?
What is the process?
The Econometric Modeling
Process
1. Specification of the theoretical model
2. Identification of the variables
3. Collection of the data
4. Estimation of the parameters of the
model and their interpretation
5. Development of forecasts (estimates)
based on the model
Numbers Instead of Symbols!
Normal model of consumer demand
Q = f(P, Ps, Yd)
Q = quantity demanded of good, P =
good price, Ps = price of substitute good,
Yd = disposable income
Econometrics allows us to estimate the
relationship between Q and P, Ps and Id
based on past data for these variables
Q = 31.5 – 0.73P + 0.11Ps + 0.23Yd
Instead of just expecting Q to “increase” if
there is an increase in Yd – we estimate that
Q will increase by 0.23 units per 1 dollar of
increased disposable income
0.23 is called an estimated regression
coefficient
The ability to estimate these coefficients is
what makes econometrics useful
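As a sketch of how such an estimated equation is used, the fitted demand function above can be evaluated directly in code. The price and income values below are hypothetical, chosen only for illustration:

```python
# Estimated demand equation from the slides (coefficients as given):
# Q = 31.5 - 0.73*P + 0.11*Ps + 0.23*Yd
def predict_q(p, ps, yd):
    """Fitted quantity demanded for own price p, substitute price ps, income yd."""
    return 31.5 - 0.73 * p + 0.11 * ps + 0.23 * yd

# Hypothetical values: own price 10, substitute price 8, disposable income 100
base = predict_q(10, 8, 100)
# Raising disposable income by one dollar raises Q by exactly the 0.23 coefficient
income_effect = predict_q(10, 8, 101) - base
```

The per-dollar income effect is just the estimated regression coefficient on Yd, which is what makes the estimate directly usable for "what if" questions.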
Regression Analysis
One econometric approach
Most popular among economists, business
analysts and social scientists
Allows quantitative estimates of economic
relationships that previously had been
completely theoretical
Answer “what if” questions
Regression Analysis Continued
Regression analysis is a statistical technique that
attempts to “explain” movements in one variable,
the dependent variable, as a function of
movements in a set of other variables, called the
independent (or explanatory) variables, through
the quantification of a single equation.
Q = f(P, Ps, Yd)
Q = dependent variable
P, Ps , Yd = independent variables
Deals with the frequent questions of cause and
effect in business
What is Regression Really
Doing?
Regression is the fitting of curves to data.
More later!
Gathering Data
Once the model is specified, we must collect
data.
Time-series data
e.g., sales for my company over time.
What most of you will be using in your projects.
Cross-sectional data
e.g., sales of 10 companies in the food processing
industry at one point in time.
Panel Data/Longitudinal Data/Pooled Data
e.g., sales of 10 companies in the food processing
industry at various points in time.
Garbage In, Garbage Out
Your empirical estimates will be only as
reliable as your data.
Look at the two quotes from Stamp and
Valavanis that follow.
You will want to take particular care in
developing your databases.
Sir Josiah Stamp
“Some Economic Factors in Modern Life”
The government are very keen on amassing
statistics. They collect them, add them, raise
them to the n’th power, take the cube root and
prepare wonderful diagrams. But you must never
forget that every one of those figures comes in
the first instance from the village watchman, who
just puts down what he damn well pleases.
Moral: Know where your data comes from!
Valavanis
“Econometric theory is like an
exquisitely balanced French recipe,
spelling out precisely with how many
turns to mix the sauce, how many
carats of spice to add, and for how
many milliseconds to bake the
mixture at exactly 474 degrees of
temperature.”
Valavanis - continued
“But when the statistical cook turns to raw
materials, he finds that hearts of cactus
fruit are unavailable, so he substitutes
chunks of cantaloupe; where the recipe
calls for vermicelli he uses shredded wheat;
and he substitutes green garment dye for
curry, ping-pong balls for turtle’s eggs, and,
for Chaligougnac vintage 1883, a can of
turpentine.”
Moral: Be careful in your choice of proxy
variables
Economic Data
You are in the process of gathering
economic data.
Some will come from your firm.
Some may come from trade publications.
Some will come from the government.
Must be of the same time scale (monthly,
quarterly, yearly, etc.)
Always be Skeptical
Always approach your data with a critical
eye.
Remember the quotes
Just because something appears in a table
somewhere, does not mean it is necessarily
correct.
Government data revisions.
Does your data pass the “smell test”?
How to Begin the Data
Exercise
First question you should ask yourself is:
“If money were no object, what would be the
perfect data for my demand model?”
From that basis, you can then start finding
what actual data you can get your hands on.
There will be compromises that you have to
make. These are called proxy variables!
Remember the Valavanis quote.
How to Choose a Good Proxy
Proxy variables should be variables whose
movements closely mirror the desired variable
for which you do not have a measure.
For example: Tastes of consumers are
difficult to measure.
May use a time trend variable if you suspect these
are changing over time.
May include demographic characteristics of the
population.
Dummy Variables
Binary Variable
Take on a “1” or a “0”
Example: Trying to model salaries
1 if you have a college degree, 0 if you
don’t
Example: Model effect of Harley-Davidson
reunion years on demand
1 for reunion years, 0 otherwise
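Constructing a dummy variable is a one-line operation; a minimal sketch (the reunion years used here are hypothetical placeholders, not actual Harley-Davidson dates):

```python
# Build a 0/1 dummy marking reunion years (years chosen for illustration only)
years = [2001, 2002, 2003, 2004, 2005]
reunion_years = {2003, 2005}  # hypothetical reunion years

dummy = [1 if y in reunion_years else 0 for y in years]
# dummy can now enter the regression like any other independent variable
```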
Back to Regression Analysis
Theoretical Model: Y = β0 + β1X + ε
Y is dependent variable
X is independent variable
Linear Equation (no powers greater than 1)
β’s are coefficients – determine coordinates of the
straight line at any point
β0 is the constant term – value of Y when X is 0
(more on this later – no economic meaning but
required)
β1 is the slope term – amount Y will change when X
increases by one unit (can be β2 … βn); holds all other
β’s constant (except those not in model!)
More about ε, the error term, later
Graphical Representation of
Regression Coefficients
[Figure: regression line Y = β0 + β1X, with intercept β0 where the line crosses the Y axis and slope β1 = ΔY/ΔX]
The Error Term
Y = β0 + β1X + ε
ε is purely theoretical
Stochastic Error Term Needed Because:
Minor influences on Y are omitted from equation
(data not available)
Impossible not to have some measurement error
in one of the equation’s variables
Different functional form (not linear)
Pure randomness of variation (remember human
behavior!)
Example of Error
Trying to estimate demand for SUV’s
Demand may fall because of uncertainty
about the economy (what data do we use for
uncertainty?)
Other independent variables may be omitted
Demand function may be non-linear
Demand for SUV’s is determined by human
behavior – some purely random variation
All end up in error term
The Estimated Regression
Equation
Theoretical Regression Equation:
Y = β0 + β1X + ε
Estimated Regression Equation:
Ŷ = 103.40 + 6.38X + e
Observed, real word X and Y values are used to
calculate coefficient estimates 103.40 and 6.38
Estimates are used to determine Y-hat, the fitted
value of Y
“Plug-in” X and get estimate of Y
Differences Between Theoretical and
Estimated Regression Equations
β0, β1 replaced with estimates β̂0, β̂1 (103.40 and 6.38)
Can’t observe true coefficients, we make estimates
Best guesses given data for X and Y
Ŷ is estimated value of Y – calculated from the regression
equation (line through Y data)
Residual e = Y – Ŷ
Residual is difference between Y (data) and Ŷ (estimated
Y with regression)
Theoretical model has error, estimated model has residual
A Simple Regression Example
in Eviews
Demand for Ford Taurus
Ordinary Least Squares
Regression
OLS Regression
Most Common
Easy to use
Estimates have useful
characteristics
How Does Ordinary Least
Squares Regression Work?
We attempt to find the curve that best fits
the data among all possibilities
While there are a number of ways of
doing this, OLS minimizes the sum of the
squared residuals
Finding Best Fitting Line using
Ordinary Least Squares
Actual data points are the dependent variable (Y’s)
Y = β0 + β1X + ε
Ŷ = β̂0 + β̂1X + e
“hat” is sample estimate of true value
e = (Y – Ŷ)
OLS minimizes: Σe² = Σ(Y – Ŷ)²
Best possible linear line through data
[Figure: scatter of data points with the fitted regression line]
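For the two-variable case this minimization has a closed-form solution; a minimal sketch in plain Python (no econometrics package), using the standard formulas β̂1 = Σ(X−X̄)(Y−Ȳ)/Σ(X−X̄)² and β̂0 = Ȳ − β̂1X̄:

```python
def ols_simple(x, y):
    """Fit Y = b0 + b1*X by OLS (minimizes the sum of squared residuals)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) \
        / sum((xi - x_bar) ** 2 for xi in x)
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Toy data generated from Y = 2 + 3X, so OLS should recover those values
x = [1, 2, 3, 4, 5]
y = [5, 8, 11, 14, 17]
b0, b1 = ols_simple(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
# With a constant in the model, OLS residuals sum to (numerically) zero
```

Statistical software such as Eviews does exactly this arithmetic, plus the standard errors and diagnostics, behind the scenes.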
True vs. Estimated Regression Line
No one knows the parameters of the true
regression line:
Yt = β0 + β1Xt + εt (theoretical)
We must come up with estimates.
Ŷt = β̂0 + β̂1Xt + et (estimated)
So how does OLS work?
OLS selects the
estimates of β0 and β1
that minimize the sum
of squared residuals
Minimize difference
between Y and Y^
Statistical Software
Complex math behind
the scenes
OLS Regression Coefficient
Interpretation
Regression coefficients (β’s) indicate the
change in the dependent variable
associated with a one-unit increase in the
independent variable in question holding
constant the other independent variables
in the equations (but not those not in the
equation)
A controlled economic experiment?
Another Example
The demand for beef
B = β0 + β1P + β2Yd
B = per capita consumption of beef per
year
Yd = per capita disposable income per year
P = price of beef (cents/pound)
Estimate this using Eviews
Overall Fit of the Model
Need a way to evaluate model
Compare one model with another
Compare one functional form with another
Compare combinations of independent
variables
Use coefficient of determination r2
r2 – The Coefficient of
Determination
Reported by Eviews every time you run a regression
Between 0 and 1
The larger the better
Close to one shows an excellent fit
Near zero shows failure of estimated regression to
explain variance in Y
Relative term
r2 = .85 says that 85% of the variation in the
dependent variable is explained by the independent
variables
Graphical r²
[Figure: scatter plots illustrating r² = 0, r² = .95, and r² = 1]
The Adjusted r 2
Problem with r2: Adding another independent
variable never decreases r2
Even a nonsensical variable
Need to account for a decrease in “degrees of
freedom”
Degrees of freedom = data observations –
coefficients estimated
Example: 100 years of data, 3 variables
estimated (including constant)
DF = 97
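Both fit statistics are easy to compute from the residuals; a sketch using the standard formulas (r² = 1 − SSE/SST, adjusted r² = 1 − (1−r²)(n−1)/(n−k)):

```python
def r_squared(y, y_hat):
    """Share of the variation in y explained by the fitted values y_hat."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # residual sum of squares
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
    return 1 - sse / sst

def adjusted_r2(r2, n, k):
    """Penalize r2 for lost degrees of freedom; k counts coefficients incl. constant."""
    return 1 - (1 - r2) * (n - 1) / (n - k)

# With the slide's example of 100 observations and 3 estimated coefficients,
# an r2 of 0.85 adjusts slightly downward
adj = adjusted_r2(0.85, 100, 3)
```

Note that adding a variable can only raise r² but can lower the adjusted version, which is why the latter is the better guide.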
Adjusted r 2
Ranges from slightly negative to 1
Accounts for degrees of freedom
Better estimate of fit
Don’t rely on any one statistic
Common sense and theory more important
Same interpretation as r2
Use adjusted r2 from now on!
The Classical Linear
Regression (CLR) Model
These are some basic assumptions which
when met, make the Ordinary Least
Squares procedure the “Best Linear
Unbiased Estimator” (aka BLUE).
When one or more of these assumptions is
violated, it is sometimes necessary to
make adjustments to our model.
Assumptions
(Yt = β0 + β1X1t + β2X2t + ... + εt)
Linearity in coefficients and error term
ε has zero population mean
All independent variables are independent of ε
Error term observations are uncorrelated with
each other (no serial correlation)
ε has constant variance (no heteroskedasticity)
No independent variables are perfectly
correlated (no perfect multicollinearity)
Will come back to some of these when we test our models
1st Assumption: Linearity
We assume that the model is linear (additive)
in the coefficients and in the error term, and
specification is correct.
e.g., Yt = β0 + β1X1 + β2X2 + εt is linear in both,
whereas Yt = β0 + X1^β1 + X2^β2 + εt is not.
Some nonlinear models can be transformed
into linear models.
e.g., Yt = β0 X1^β1 X2^β2 εt
We showed this can be transformed using logs to:
lnYt = lnβ0 + β1lnX1 + β2lnX2 + lnεt
Hypothesis Testing
In statistics we
cannot “prove” a
theory is correct
Can “reject” a
hypothesis with a
certain degree of
confidence
Common Hypothesis Test
H0: β = 0 – Null Hypothesis
HA: β ≠ 0 – Alternative Hypothesis
Test whether or not the coefficient is
statistically significantly different from zero
Does the coefficient affect demand?
Two-tailed test
Does Rejecting the Null Hypothesis
Guarantee that the Theory is Correct?
NO! It is possible that we are committing
what is known as a Type I error.
A Type I error is rejecting that Null hypothesis
when it is in fact correct.
Likewise, we may also commit a Type II
error
A Type II error is failing to reject the Null
hypothesis when the alternative hypothesis is
correct.
Type I and Type II Error
Example
Presumption of innocence until proven
guilty
H0: The defendant is innocent
HA: The defendant is guilty
Type I error: sending an innocent
defendant to jail
Type II error: freeing a guilty defendant
The t-Test, and the t-Statistic
We can use the t-Test to do
hypothesis testing on individual
coefficients.
Given the linear regression model:
Yt = β0 + β1X1 + β2X2 + ... + εt
We can calculate the t-statistic for
each estimated value of βk (i.e., β̂k),
and test hypotheses on that estimate.
Setting up the Null and
Alternative Hypotheses
H0: β1 = 0
(i.e., X1 is not important)
HA: β1 ≠ 0
(i.e., X1 is important, either
positively or negatively)
Testing the Hypothesis
Set up null and alternative hypothesis
Run regression and generate t-score.
Look up the critical value of the t-Statistic (tc), given
the degrees of freedom (n-k) in a two-tailed test
using X% level of significance (1%, 5%, 10%)
n = sample size, k = estimated coefficients
(including intercept)
Reject null (β = 0) if |tk| > tc
t Statistic Table on Page 754 of Hirschey
Interpretation of level of significance: 5% means
only a 5% chance of rejecting when the coefficient
is actually zero (this is 95% confidence)
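The decision rule is simple arithmetic once a coefficient estimate and its standard error are known. A sketch: the coefficient values below echo the chicken-demand numbers used later in the deck, but the standard errors are back-derived for illustration and are not given in the slides.

```python
def t_stat(beta_hat, std_err):
    """t statistic for H0: beta = 0."""
    return beta_hat / std_err

def reject_null(t, t_critical):
    """Two-tailed test: reject H0 when |t| exceeds the critical value."""
    return abs(t) > t_critical

# Price coefficient -0.73 with an illustrative standard error of 0.08:
t_price = t_stat(-0.73, 0.08)       # well beyond the rule-of-thumb cutoff of 2
# A coefficient of 0.17 with an illustrative standard error of 0.207:
t_weak = t_stat(0.17, 0.207)        # below the cutoff, not significant
```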
Example
Taurus example with t-stats
Limitations of the t-Test
1. Does not indicate theoretical
validity
2. Does not test Importance of the
variable.
The size of the coefficient does this.
F-test and the F-statistic
You can also test
whether a group of
coefficients is
statistically significant.
Look at the F-test for all
of the independent
variable coefficients.
First set up the null and
alternative hypotheses.
H0 and HA for the F-test
H0: β1 = β2 = ... = βk = 0
i.e., all of the slope coefficients are
simultaneously zero.
HA: not H0
i.e., at least one, if not more slope coefficients,
are nonzero.
Note: It does not indicate which one or ones
of the coefficients are nonzero.
The Critical F
As with the t-statistic, you must compare the
actual value of F with its critical value (Fc):
Actual value from EVIEWS or
Fk-1, n-k = [r2/(k-1)]/[(1-r2)/(n-k)]
FC must be looked up in a table, using the
appropriate degrees of freedom for the
numerator (k-1) and the denominator (n-k)
Table on Page 751 (10%), 752 (5%) and 753
(1%) of Hirschey
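Since the F statistic in the formula above depends only on r², k, and n, it is one line of code; a direct sketch:

```python
def f_stat(r2, k, n):
    """F statistic with k-1 and n-k degrees of freedom, computed from r2.
    k = number of estimated coefficients including the intercept."""
    return (r2 / (k - 1)) / ((1 - r2) / (n - k))

# e.g., r2 = 0.85 with n = 100 observations and k = 3 coefficients
f = f_stat(0.85, 3, 100)  # far above any tabled critical value
```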
The F-Test
If F ≥ FC then you reject H0 (at least one
coefficient is nonzero)
If F<FC then you fail to reject H0
Look at an Eviews example
Specification Errors
Suppose that you make a mistake in
your choice of independent variables.
There are 2 possibilities:
You omit an important variable
You include an extraneous variable
There are consequences in both cases.
Omitting an Important Variable
Suppose your true regression model is:
Qt = β0 + β1Pt + β2It + εt
Suppose you specify the model as:
Qt = β0 + β1Pt + εt*
Thus, the error term of the misspecified model
captures the influence of income, It:
εt* = β2It + εt
Consequences
Prevents you from getting a
coefficient for income
Causes bias in the price estimate
Violates classical assumption of error
term not being correlated with an
explanatory (independent) variable
Inclusion of an Irrelevant
Variable
A variable that is included, that does not
belong in your model also has
consequences.
Does NOT bias the other coefficients.
Lowers t-scores of other coefficients (so you
might reject)
Will raise r2 but will likely decrease the
adjusted r2 (help you identify)
Example
Annual Consumption of Chicken
Y = consumption of chicken, PC = price of
chicken, PB = price of beef, I = disposable
income
Y^ = 31.5 – 0.73PC + 0.11PB + 0.23I
PC t-stat = -9.12, PB t-stat = 2.50, I t-stat =
14.22
Adjusted r2 = 0.986
Interpretation?
Example
Add interest rate to the equation, R
Y^ = 30 – 0.73PC + 0.12PB + 0.22I + 0.17R
PC t-stat = -9.10, PB t-stat = 2.08, I t-stat =
11.05, R t-stat = 0.82
Adjusted r2 = .985
Lowers t-stats and adjusted r2
The t-stat on R suggests excluding it, and so
does the lower adjusted r2
How do you decide whether a
variable should be included?
Trial and Error – Many EVIEWS runs!
Start with THEORY!
Use your judgement here!
If theory does not provide a clear answer, then:
Look at t-test
Look at adjusted r2
Look at whether other coefficients appear to be
biased when you exclude the variable from the
model.
Inclusion of Lagged Variables
Some independent variables influence demand
with a lag.
For example, advertising may primarily influence
demand in the following month, rather than the
current month.
Thus, Qt = β0 + β1Pt + β2It + β3At-1 + εt
When there is a good reason to suspect a lag
(i.e., when theory suggests a lagged
relationship), you can investigate this option.
Eviews Lagged Variable
Unemployment in previous time periods
important to current demand for Taurus?
Functional Form
Don’t forget the constant term – no meaning
but required for classical assumptions
Linear Form
Double Log Form
There are many others that we won’t discuss
Linear Form
Y = β0 + β1X + ε
What we have looked at thus far
Constant slope is assumed
ΔY/ΔX = β1
Double-Log Form
Second most common
Natural log of Y is the dependent variable and
natural logs of the X’s are the independent variables
lnY = β0 + β1lnX1 + β2lnX2 + ε
Elasticities of the model are constant
ElasticityY,X1 = %ΔY/%ΔX1 = β1 = constant
Interpretation of coefficients: if X1 increases by
1% while the other X2 is held constant, Y will
change by β1%
Can’t be any negative or 0 observations in your
data set (natural log not defined)
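The constant-elasticity property can be checked numerically: data generated from a multiplicative model becomes exactly linear in logs, and a simple OLS fit on the logged data recovers the elasticity as the slope. The constants 3 and 0.5 and the helper below are illustrative, not from the slides:

```python
import math

def ols_simple(x, y):
    """Two-variable OLS: returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) \
        / sum((xi - x_bar) ** 2 for xi in x)
    return y_bar - b1 * x_bar, b1

# Multiplicative model Y = 3 * X**0.5 (illustrative constants)
xs = [1.0, 2.0, 4.0, 8.0, 16.0]
ys = [3.0 * x ** 0.5 for x in xs]

# Regress lnY on lnX: the slope is the (constant) elasticity
b0, b1 = ols_simple([math.log(x) for x in xs], [math.log(y) for y in ys])
# b1 recovers the exponent 0.5; exp(b0) recovers the scale factor 3
```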
Violations of the Classical
Model
Multicollinearity
Serial Correlation
Others
Problem of Multicollinearity
Recall the CLR assumption that the
independent variables are not perfectly
correlated with each other
This is called “perfect multicollinearity”
Easy to detect
OLS cannot estimate parameters in this
situation (put in the same independent twice
and Eviews can’t do it)
Look at problem of imperfect
multicollinearity
Imperfect Multicollinearity
This occurs when two or more
independent variables are highly, but
not perfectly correlated with each
other!
If this is severe enough, it can influence the
estimation of the β’s in the model.
How to Detect the Problem
There are some formal tests
Beyond scope of this course
Look for the tell-tale signs of the problem:
High adjusted r2, high F-statistics, and low t-
scores on suspected collinear variables.
Eviews example with Taurus
Remedies
Possibly do nothing!
If t-scores are at or near significance
levels, you may want to “live with it”.
Drop one or more collinear variables.
Let the remaining variable pick up the joint
impact.
This is ok if you have redundancies.
Remedies - continued
Form a new variable:
e.g., if income and population are correlated,
you could form per capita income: I/Pop.
Other solutions I can help with on your projects
The Problem of
Serial Correlation
The fourth assumption of the CLR model
is:
“Observations of the error term are
uncorrelated with each other”
When this is not satisfied, we have a
problem known as serial correlation.
Examples of Serial Correlation
[Figure: residual plots illustrating positive serial correlation and negative serial correlation]
Consequences of Serial
Correlation
Pure serial correlation does not bias the
estimates
Serial correlation tends to distort t-scores
Serial correlation results in a pattern of
observations in which OLS gives a better
fit to the data than would be obtained in
the absence of the problem (t scores
higher).
Uses error to explain dependent variable
QUESTION:
Why is this a problem?
This suggests that t-statistics are
overestimated!
Type I error: You may falsely reject the
null hypothesis, when it is in fact true.
Neither F-statistics nor t-statistics can be
trusted in the presence of serial
correlation.
Detection:
The Durbin-Watson d-test
This is a test for first order serial
correlation
This is the most common type in economic
models.
Note that there are other tests (Q-test,
Breusch-Godfrey LM test), but we will not
cover them here.
The d-statistic is derived from the
regression residuals (e).
Theoretical range of d-statistic
If there is perfect positive serial correlation then
d=0.
If there is perfect negative serial correlation then
d=4.
If there is no serial correlation, then d=2
Check this statistic in Eviews on your project
If near 2, no problem; if far from 2, then …
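The d statistic itself is just a ratio computed from the residuals, d = Σ(et − et−1)² / Σet². A sketch (the residual series below are artificial, chosen to show each pattern):

```python
def durbin_watson(e):
    """Durbin-Watson d: squared successive differences over squared residuals."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(et ** 2 for et in e)
    return num / den

# Residuals that keep their sign (positive serial correlation): d well below 2
d_pos = durbin_watson([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
# Residuals that alternate sign (negative serial correlation): d well above 2
d_neg = durbin_watson([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
```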
Correction for Serial Correlation
using GLS
Adding an autoregressive term solves the serial
correlation problem
Details are outside the scope of class
Soviet Defense spending model
If your original regression model was:
LS SDH C USD SY SP
DW=0.62 a problem
Simply add an AR(1) term to your command line:
LS SDH C USD SY SP AR(1)
DW=1.97 problem solved
Summary Steps for Project
Think about theoretical model: what independent
variables make sense based on theory? (already
doing this)
Collect data and examine it (already doing this)
Choose a functional form (likely linear)
Run regression models in Eviews
Examine adjusted r2, t-stats, F-stat and exclude or
include variables based on these and theory
Do you need lagged variables?
Look for evidence of (and correct for)
multicollinearity or serial correlation
Summary Steps for Project
Interpret your results
Use model to forecast demand (next topic)
I’ll do a “sample project” next time using
the Taurus data
Homework Continued
Interpret Adjusted r2
Which is a better measure of overall fit? Why?
Is F-Stat Significant? What does it mean?
Any evidence of serial correlation?
How could this be corrected for?
Estimate the equation as a log-log model
Interpret the results
Is beef a normal good?
Is demand elastic or inelastic (for price and
income)?
ORDINARY LEAST SQUARES ESTIMATORS (OLS)
The OLS estimation method estimates the values of the
parameters by minimising the sum of squared errors.
Let the theoretical econometric model be
Yt = β0 + β1Xt + εt
with the usual assumptions about ε.
ε is an unknown random (stochastic) disturbance term.
However, statistical estimation requires regression of Y on
X and estimating β0, β1 and residuals – particular values of
ε. To distinguish them from the unknown disturbance ε, the
sample residuals are termed errors and denoted by e.
Thus, the empirical model is
Yt = β̂0 + β̂1Xt + et
et = Yt – (β̂0 + β̂1Xt)
= Actual Value – Predicted Value
= Yt – Ŷt
ORDINARY LEAST SQUARES ESTIMATORS (OLS)
The method of least squares minimises the sum
of squared residuals (or errors), SSE or Error Sum
of Squares (ESS).
This means that we are minimising the sum of
the squares of the vertical distances from the line
of regression. Alternatively, we could have
minimised the absolute sum of vertical distances
or sum of squares of the horizontal distances or
perpendicular distances (orthogonal estimators).
OLS confines to the minimisation of the sum of
squares of the vertical distances.
i.e., it minimises
Σ(t=1 to n) et² = Σ(t=1 to n) (Yt – β̂0 – β̂1Xt)²
ORDINARY LEAST SQUARES ESTIMATORS (OLS)
SSE is to be minimised with respect to the parameters
β0 and β1. We have to choose those values of β0 and β1
which will give as close a fit to the data as is possible with
this specification. Let these values be β̂0 and β̂1.
We see that the standard model, or the ordinary least
squares (OLS) model as it is more popularly called, and
the classical linear regression model [OLS with the
additional assumption εt ~ N(0, σ²)] have 4
parameters: β0 and β1 – the parameters of linear
dependence – and E(εt) and σ² – the parameters of the
probability distribution of ε. However, the assumption
of E(εt) = 0, i.e., randomness of the disturbance, is not
very restrictive. Let the expected value of εt be
non-zero, say k; then
ORDINARY LEAST SQUARES ESTIMATORS (OLS)
E(Yt | Xt) = β0 + β1Xt + E(εt)
= β0 + β1Xt + k
= (β0 + k) + β1Xt
The minimisation conditions are
∂ESS/∂β̂0 = 0 and ∂ESS/∂β̂1 = 0
where ESS = Σt [Yt – β̂0 – β̂1Xt]²
and β̂0, β̂1 are the estimated values of the parameters.
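Carrying out the minimisation of ESS with respect to β̂0 and β̂1 gives the familiar closed-form OLS estimators; as a sketch of the standard algebra:

```latex
\frac{\partial \mathrm{ESS}}{\partial \hat\beta_0}
  = -2\sum_{t}\bigl(Y_t - \hat\beta_0 - \hat\beta_1 X_t\bigr) = 0,
\qquad
\frac{\partial \mathrm{ESS}}{\partial \hat\beta_1}
  = -2\sum_{t} X_t \bigl(Y_t - \hat\beta_0 - \hat\beta_1 X_t\bigr) = 0
```

Solving these two normal equations simultaneously yields β̂1 = Σt(Xt − X̄)(Yt − Ȳ) / Σt(Xt − X̄)² and β̂0 = Ȳ − β̂1X̄.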