A random variable is a variable whose value is unknown until it is observed. If g(X) is a function of X, then g(X) is also random.

Probability Density Function – f(x)
- Gives the probability of each possible value occurring.

Cumulative Distribution Function – F(x)
- The probability that X takes a value no greater than x: F(x) = P(X ≤ x).

Joint Probability Density Function – f(x,y)
- The joint probability of X and Y occurring together.
- Marginal probabilities are the PDFs of either the X or Y variable alone, found by summing the joint PDF over the other variable.

Conditional Probability – f(x|y)
- The probability of x occurring if y is assumed to happen: f(x|y) = f(x,y)/f(y).

Statistical Independence
- Occurs if x doesn't impact y.
- Holds if f(x|y) = f(x), or, equivalently, if f(x,y) = f(x)f(y).
- X and Y are ONLY statistically independent if either condition above is true for every pair of x and y (see the sketch after the assumptions list below).

Rules of mean: E(a) = a, E(aX) = aE(X), E(aX + bY) = aE(X) + bE(Y) for constants a and b.

The error term – Assumptions of Simple Linear Regression
1. SR1: for each value x, the mean of the error is zero: E(e) = 0.
2. SR2: equivalently, E(y) = β1 + β2x.
3. SR3: the variance of the error is constant: var(e) = σ² (homoskedasticity).
4. SR4: the covariance between any pair of random errors ei and ej is zero: cov(ei, ej) = 0.
   o Stronger version: the e's are statistically independent, therefore the values of y are statistically independent.
5. SR5: the variable x is not random, and takes at least two different values.
6. SR6 (optional): the values of e are normally distributed about their mean if the values of y are, and vice versa.
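A numeric sketch of the marginal, conditional and independence definitions above (the joint PMF values are made up for illustration):

    import numpy as np

    # Hypothetical joint PMF of X (rows) and Y (columns)
    f_xy = np.array([[0.10, 0.20],
                     [0.30, 0.40]])

    f_x = f_xy.sum(axis=1)               # marginal PDF of X: sum over y
    f_y = f_xy.sum(axis=0)               # marginal PDF of Y: sum over x
    f_x_given_y0 = f_xy[:, 0] / f_y[0]   # conditional: f(x|y) = f(x,y)/f(y)

    # Independent only if f(x,y) = f(x)f(y) for EVERY pair of x and y
    independent = np.allclose(f_xy, np.outer(f_x, f_y))
    print(f_x, f_y, f_x_given_y0, independent)   # here: not independent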
Rules of Summation
- Σa = Na; Σ(aXi) = aΣXi; Σ(Xi + Yi) = ΣXi + ΣYi.

Properties of Probability distributions
- Mean = key measure of centre: µ = E(X).
- Can calculate a conditional mean, E(Y|X = x).
- Variance = measure of dispersion: var(X) = E[(X − µ)²].

Covariance
- cov(X,Y) = E[(X − µX)(Y − µY)].

Correlation
- ρ = cov(X,Y)/(σX·σY).
- If ρ = −1 or 1: perfect negative/positive correlation.
- If ρ = 0: no correlation, and also cov(X,Y) = 0.

Normal Distribution – X ~ N(µ, σ²)
- Standard normal distribution: Z = (X − µ)/σ ~ N(0, 1).
- A weighted sum of normal variables is normally distributed.

Simple Regressions
- General form: y = β1 + β2x + e.
- Slope of simple regression: β2 = dE(y)/dx.
- Elasticities (for a 1% change in x, an {elasticity}% change in y): ε = β2·x/E(y), e.g. the elasticity of mean expenditure with respect to income.
  o Often used at a representative point on the regression line (e.g. the sample means).
- Log-linear function: ln(y) = β1 + β2x + e.
  o Slope: dy/dx = b2·y.
  o Elasticity: b2·x.
  o Semi-elasticity: 100·b2 is the % change in y for a 1-unit change in x.

Least squares regression
- General form of the fitted line: ŷ = b1 + b2x.
- Least squares residual: ê = y − ŷ = y − b1 − b2x.
- Generating the least squares estimates – minimise the function S(b1, b2) = Σ(yi − b1 − b2xi)².

The estimator b2
- b2 = Σ(xi − x̄)(yi − ȳ)/Σ(xi − x̄)².
- Plus, E(b2) = β2 (if the model assumptions hold), hence the estimator is unbiased.

Variances/covariances of the OLS estimators
- var(b2) = σ²/Σ(xi − x̄)².
- The larger the error variance σ², the greater the uncertainty there is in the statistical model, and the larger the variances and covariance of the least squares estimators.
- The larger the sum of squares Σ(xi − x̄)², the smaller the variances of the LSE and the more precisely we estimate the unknown parameters.
- The larger the sample size N, the smaller the variances and covariances of the LSE.
- The larger the term Σxi², the larger the variance of the LSE b1.
- The absolute magnitude of the covariance increases with the magnitude of the sample mean x̄, and the covariance has a sign opposite to x̄.
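A minimal sketch of the least squares formulas above on made-up data:

    import numpy as np

    # Made-up sample
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

    # b2 = Σ(x - x̄)(y - ȳ)/Σ(x - x̄)²,  b1 = ȳ - b2·x̄
    b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
    b1 = y.mean() - b2 * x.mean()

    e_hat = y - (b1 + b2 * x)              # least squares residuals
    sse = np.sum(e_hat**2)                 # the minimised sum of squares
    elasticity = b2 * x.mean() / y.mean()  # elasticity at the sample means
    print(b1, b2, sse, elasticity)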
Gauss-Markov Theorem
- Under assumptions SR1–SR5, the estimators b1 and b2 have the smallest variance of all linear and unbiased estimators of β1 and β2. They are BLUE – Best Linear Unbiased Estimators.

Facts about GMT
1. Estimators b1 and b2 are best compared to similar estimators – those which are linear and unbiased. It does not state that they are the best of all possible estimators.
2. Estimators are best in their class because they have minimum variance. We always want to use the one with the smallest variance, because that estimation rule gives us a higher probability of obtaining an estimate close to the true parameter value.
3. If any of assumptions SR1–SR5 do not hold, then the OLS estimators are not the best linear unbiased estimators.
4. The GMT doesn't depend on the assumption of normality (SR6).
5. The GMT applies to the LS estimators – not to the LS estimates from a single sample.

Normality assumption
- If we add SR6, the LSE are normally distributed.

CLT
- If SR1–SR5 hold, and the sample size N is sufficiently large, then the least squares estimators are approximately normal.
- Normalise by converting to Z: Z = (b2 − β2)/√var(b2) ~ N(0, 1).
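A small simulation consistent with the CLT claim (skewed, non-normal errors, yet b2 is approximately normal across samples; all numbers are made up):

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(0, 10, 50)
    b2s = []
    for _ in range(5000):
        e = rng.exponential(1.0, size=x.size) - 1.0  # skewed, mean-zero errors
        y = 1.0 + 0.5 * x + e
        b2s.append(np.sum((x - x.mean()) * (y - y.mean()))
                   / np.sum((x - x.mean())**2))
    b2s = np.array(b2s)
    # Sampling distribution of b2: centred on the true 0.5, roughly bell-shaped
    print(b2s.mean(), b2s.std())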
Estimating σ²
- σ̂² = Σêi²/(N − 2).

Estimating Variance and Covariance
- Replace σ² with σ̂² in the variance formulas, e.g. var̂(b2) = σ̂²/Σ(xi − x̄)².
- Variance-Covariance Matrix: collects var̂(b1), var̂(b2) and cov̂(b1, b2).

Standard error of b2
- se(b2) = √var̂(b2), i.e. the s.e. of b2 from sample to sample, used to construct the various b2s.

Goodness of fit and modelling issues
- R² – explains the proportion of the variance in y about its mean that is explained by the regression model.
- Mention omitted variables if R² is low.

Interval estimation and hypothesis testing

Point v Interval Estimates
- Point estimate – b2 is the point estimate of the unknown population parameter in the regression model.
- Interval estimate – a range of values in which the true parameter is likely to fall.

How to make an interval estimate of β2 (but we don't know the population s.d.)
- The standard normal pivot Z = (b2 − β2)/√var(b2) – BUT DO NOT USE! Use a critical t and degrees of freedom instead, because the population s.d. is unknown: t = (b2 − β2)/se(b2) ~ t(N−2).

Obtaining Interval Estimates
- Find the critical tc percentile value t(1−α/2, m) for degrees of freedom m.
- Then solve as above for the upper and lower limits: b2 ± tc·se(b2).
- 'When the procedure we used is applied to many random samples of data from the same population, then 95% of all the interval estimates constructed using this procedure will contain the true parameter.'
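A sketch of the interval computation, assuming illustrative values for b2, se(b2) and N:

    from scipy import stats

    b2, se_b2, N = 0.5, 0.04, 40               # hypothetical estimates
    tc = stats.t.ppf(1 - 0.05 / 2, df=N - 2)   # critical t(1 - α/2, N - 2)
    lower, upper = b2 - tc * se_b2, b2 + tc * se_b2
    print(lower, upper)                        # 95% interval estimate for β2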
Hypothesis Testing
Steps:
1. State the hypotheses, H0 and H1.
2. Calculate the test statistic: t = (b2 − c)/se(b2) ~ t(N−2) if H0: β2 = c is true.
3. Decide on α.
4. Calculate tc (the 1−α or 1−α/2 percentile, df = N−2) using the t-tables, and sketch the rejection region.
5. Rule: reject H0 if |t| > tc (two-tail).
6. Conclude in terms of the problem at hand.
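The same steps with made-up numbers (testing H0: β2 = 0 against a two-tail alternative):

    from scipy import stats

    b2, se_b2, N = 0.5, 0.04, 40                 # hypothetical estimates
    t_stat = (b2 - 0) / se_b2                    # step 2: test statistic
    alpha = 0.05                                 # step 3
    tc = stats.t.ppf(1 - alpha / 2, df=N - 2)    # step 4: critical value
    print(t_stat, tc, abs(t_stat) > tc)          # step 5: reject H0 if |t| > tc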
Multiple Regression
- General form: y = β1 + β2x2 + β3x3 + … + βKxK + e.
- Assumptions: as in the simple model – zero-mean errors with constant variance [i.e. homoskedastic] and zero covariance between errors.
- Assumptions about explanatory variables:
  o Expl. vars are not random (i.e. known prior to finding the value of the dependent var).
  o No expl. var is an exact linear function of another – otherwise exact collinearity and LS fails.
- Finding the OLS estimators – minimise S(β1, …, βK) = Σ(yi − β1 − β2xi2 − … − βKxiK)².
- LS estimators are random vars.
- Estimating σ²: σ̂² = Σêi²/(N − K), K = number of β parameters being estimated.
- Var-Covar matrix: as before but with σ̂² in place of σ² (use hats).
- Hypothesis testing of βk: t = (bk − c)/se(bk). *NOTE df = N−K!*
- Interval estimation: bk ± tc·se(bk), with tc from t(1−α/2, N−K).

Estimating nonlinear relationships
- Quadratic model: y = β1 + β2x² + e; the slope dE(y)/dx = 2β2x changes with x.
- Log-linear models: ln(y) = β1 + β2x + e (see also the indicator-variable examples below).

Model specification
- Omitted Variable Bias: omitting a relevant variable leads to a biased estimator. Can be viewed as setting βOmittedVar = 0.
- Irrelevant Variables: can increase the variance of the included vars, i.e. reducing the precision of those vars.
- Tips:
  1. Choose vars and functional form on the basis of theoretical and general understanding of the relationship.
  2. If the estimated equation has coefficients with unexpected signs or unrealistic magnitudes, it may be caused by misspecifications like the omission of an important var.
  3. Can perform sig. tests to decide whether a var or group of vars should be included.
  4. Can test the adequacy of a model using RESET (rejection suggests the model is not good).

Collinearity
- If x2 and x3 have corr = 1, var(b2) → infinity; likewise if x2 has no variation (i.e. collinear with the constant term), as in the sketch below.
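The variance blow-up can be seen in the two-regressor variance formula var(b2) = σ²/[Σ(x2 − x̄2)²(1 − r23²)], where r23 = corr(x2, x3); a sketch with made-up values:

    # var(b2) = σ² / [Σ(x2 - x̄2)² (1 - r23²)]
    sigma2, ss_x2 = 1.0, 100.0          # hypothetical σ² and Σ(x2 - x̄2)²
    for r23 in [0.0, 0.9, 0.99, 0.999]:
        var_b2 = sigma2 / (ss_x2 * (1 - r23**2))
        print(r23, var_b2)              # variance explodes as r23 -> 1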
Impacts of collinearity:
- With exact collinearity, cannot find the LS estimators, cannot obtain estimates of the βk.
- Estimator SEs are large, so t-tests will likely lead to the conclusion that parameter estimates are not sig. diff. from 0.
- Estimators may be very sensitive to the addition/deletion of a few obs, or to the deletion of an apparently insignificant var.
- Accurate forecasts may still be possible if the nature of the collinear relationship remains the same within the out-of-sample obs.

Testing joint hypotheses
- E.g. H0: β4 = β5 = β6 = 0; H1: any of β4, β5, β6 ≠ 0.
- Estimate the unrestricted model with all the xi.
- Estimate the restricted model with x4, x5, x6 excluded from y.
- Calc. SSER and SSEU; the F stat determines whether the reduction in SSE is large or small:
  F = [(SSER − SSEU)/J] / [SSEU/(N − K)]
- Compare with Fcrit(J, N−K) from the F-tables (J is read horizontally, across the top).
- J = # of restrictions (i.e. terms removed), N = # of obs, K = # of coef. in the unrestricted model inc. the constant.

Steps in F test
1. State H0 and H1.
2. Specify the test stat and its distribution.
3. Set sig. level, determine rejection region.
4. Calculate the sample value of the test stat.
5. State the conclusion.

Testing sig. of model (test of the overall significance of the regression model)
1. State H0 (all slope βk = 0) and H1 (at least one ≠ 0).
2. Continue as above, using the same F equation.

Relationship bw t- and F-tests
- When the F-test is for a single β, F = t².
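A sketch of the F computation with hypothetical SSE values:

    from scipy import stats

    SSE_R, SSE_U = 300.0, 280.0       # hypothetical restricted/unrestricted SSEs
    J, N, K = 3, 100, 6               # restrictions, observations, coefficients
    F = ((SSE_R - SSE_U) / J) / (SSE_U / (N - K))
    Fc = stats.f.ppf(0.95, J, N - K)  # 5% critical value F(J, N-K)
    print(F, Fc, F > Fc)              # reject H0 if F > Fc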
Indicator variables
- Used to construct models in which some or all of the regression parameters, inc. the intercept, change for some obs in the sample.
- D = 0 defines the reference (base) group.
- Intercept indicator (dummy) variable: y = β1 + δD + β2x + e shifts the intercept by δ when D = 1.
  o Used to compare the difference bw two groups – the coefficient on D is the difference bw the population means.
- Dummy var trap: cannot include both L and (not L) alongside the constant – it will make exact collinearity. E.g. with regions, inc. N, S and E – and the base will be W.
- Can apply an F test to test the joint sig. of the dummies.
- Interaction variable (slope indicator/slope dummy): y = β1 + β2x + γ(D·x) + e lets the slope differ bw groups.
- Log-linear models with a dummy: ln(y) = β1 + β2x + δD + e.
  o Can approximate the % gap bw M/F by 100δ%.
  o For a better calculation, use 100(e^δ − 1)% (the exact % change bw Dummy = 1 and D = 0).

Linear Probability Model
- Probability function: y takes the values 1 and 0 with probabilities p and 1 − p; this is a Bernoulli distribution: E(y) = p, var(y) = p(1 − p).
- Model p as a linear function: p = E(y) = β1 + β2x.
- BUT the var of the error term is not homoskedastic, and p̂(x) can be < 0 or > 1 (i.e. problems with the model).

Heteroskedasticity
- When the var of e is not constant across obs – i.e. it increases/decreases or some combination. NOT about whether the residuals look randomly distributed!
- E.g. var(e) increases as x increases → y and e are heteroskedastic.
- Therefore the LS assumptions are violated – a violation of SR3, as the variance is a function of x.
- Two implications of heteroskedasticity:
  1. The LSE are still linear and unbiased – but not best; there is another better estimator.
  2. The standard errors usually computed for the LS estimators are incorrect. CIs and hyp. tests may be misleading; need to use a robust estimator of var(b2), not the usual one.
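A minimal statsmodels sketch of the robust-variance fix (simulated heteroskedastic data; HC1 is one common robust option):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    x = rng.uniform(1, 10, 200)
    e = rng.normal(0, 0.5 * x)                  # var(e) grows with x
    y = 1.0 + 0.5 * x + e

    X = sm.add_constant(x)
    usual = sm.OLS(y, X).fit()                  # conventional SEs (incorrect here)
    robust = sm.OLS(y, X).fit(cov_type="HC1")   # heteroskedasticity-robust SEs
    print(usual.bse, robust.bse)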
Detecting heteroskedasticity
- Visually (informal) – there should be no pattern in the residuals.
- Lagrange multiplier (Breusch-Pagan) Test:
  o Specify a variance function in terms of variables z2, …, zS suspected of driving the variance.
  o Sub ê²: regress ê² on the Zs – then the R² from the eqn measures the proportion of the variation in ê² explained by the Zs.
  o Use a chi-square test – test stat: χ² = N×R²; chi-crit: χ²(1−α, S−1). (See the sketch at the end of this section.)
  o BUT! Large-sample test only.
- White test:
  o Can test for hetero w/o precise knowledge of the relevant vars – sets the Zs equal to the xs, the x²s, and possibly the cross-products.
  o Use an F test, or χ² = N×R² as above.
  o NOTE: the White and Breusch-Pagan tests may give different results.

Heteroskedasticity-consistent standard errors (robust standard errors)
- Valid in large samples for both hetero- and homoskedastic errors.
- Help ensure CIs and test stats are correct when there is heteroskedasticity.
- BUT do not address the other impact of hetero – the LS estimator is no longer best.
- Failing to address this may not be too serious – with large N, the var of the LS estimators may be small enough to get adequately precise estimates.
- To find an alternative estimator with lower var, it is necessary to specify a suitable variance function (generalised least squares). Using LS with robust SEs avoids the need to specify a suitable variance function.
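A sketch of the Breusch-Pagan χ² = N×R² computation described above (simulated data; the single regressor doubles as the Z variable, so S − 1 = 1):

    import numpy as np
    import statsmodels.api as sm
    from scipy import stats

    rng = np.random.default_rng(2)
    x = rng.uniform(1, 10, 200)
    y = 1.0 + 0.5 * x + rng.normal(0, 0.5 * x)  # heteroskedastic errors

    res = sm.OLS(y, sm.add_constant(x)).fit()
    aux = sm.OLS(res.resid**2, sm.add_constant(x)).fit()  # regress ê² on Z
    chi2 = len(y) * aux.rsquared                          # N × R²
    crit = stats.chi2.ppf(0.95, df=1)
    print(chi2, crit, chi2 > crit)                        # reject -> hetero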
Tough MCQs
- When collinear variables are included in an econometric model, coefficient estimates are: d) unbiased but have larger standard errors.
- If you reject the null hypothesis when performing a RESET test, what should you conclude? d) an incorrect functional form was used.
- How does including an irrelevant variable in a regression model affect the estimated coefficients of the other variables in the model? d) they are unbiased but have larger standard errors.
- If X has a negative effect on Y and Z has a positive effect upon Y, and X and Z are negatively correlated, what is the expected consequence of omitting Z from a regression of Y on X? a) the estimated coefficient on X will be biased downwards (too negative).
- What are the consequences of using least squares when heteroskedasticity is present? NONE of: a) no consequences, coefficient estimates are still unbiased; b) confidence intervals and hypothesis testing are inaccurate due to inflated standard errors; c) all coefficient estimates are biased for variables correlated with the error term; d) it requires very large sample sizes to get efficient estimates.
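A quick simulation of the omitted-variable MCQ above (all parameters made up: Z helps Y, X hurts Y, and X and Z are negatively correlated):

    import numpy as np

    rng = np.random.default_rng(3)
    n = 100000
    x = rng.normal(size=n)
    z = -0.8 * x + rng.normal(0, 0.6, size=n)    # corr(X, Z) < 0
    y = -1.0 * x + 1.0 * z + rng.normal(size=n)  # true effect of X is -1

    # Short regression of Y on X alone (Z omitted)
    b_x = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
    print(b_x)   # ≈ -1.8: biased downwards, i.e. too negative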
Exam Qs

Suppose [equation] includes hetero – what does this mean for [CI/hyp tests]?
- For full marks, an explanation of heteroskedasticity, the consequences, and why the tests are unreliable is expected.
- E.g. heteroskedasticity is a violation of the GM assumption of constant error variance (homoskedasticity). The variance of the error term under hetero is no longer constant. (2 points) In the presence of hetero, standard errors will be biased, and test statistics therefore unreliable, since they depend on the estimates of the standard errors. (3 points)

Write down a model that allows the variance of e to differ between men and women. The variance should not depend on other factors.
- σ² = α1 + α2·male
- For full marks the variance function should be in terms of sigma squared – it doesn't matter which Greek letters are used for the coefficients.

Is the estimated variance of e higher for men or for women? [5 points]
- The estimated variance of e is lower for men than for women. The estimated coefficient suggests that the variance is lower for men by 28,849.63. Must state that it's lower AND say by how much it is lower for full marks.

Is the variance of e statistically different for men and for women? [5 points]
- Hypothesis test of the male coefficient. Required: hypotheses (1 point); test statistic / t critical / alpha OR p-value (3 points); conclusion (1 point).

Conduct an appropriate test for the presence of heteroskedasticity. What do you conclude? Show all working.
- State the equation to use for testing hetero: ê² = α1 + α2·male + v.
- Hypotheses (1 point): H0: α2 = 0 (homoskedasticity); H1: α2 ≠ 0 (heteroskedasticity).
- Test statistic (1 point): χ² = N×R² = 706×0.0016 = 1.1296.
- Level of significance, df, chi-square critical value – any level of significance can be used; for 0.05 and df = 1, the critical value is 3.841 (1 point).
- Conclusion (1 point): since the test statistic is not greater than the critical value, we cannot reject the null hypothesis of homoskedasticity. There is no heteroskedasticity in the model.

Depending on your result from part (16), what changes should be made to your model?
- Since the test in part (16) concludes that there is no hetero present, we don't need to do anything and can estimate the model as specified.

ln(WAGE) = β1 + β2EDUC + β3EDUC² + β4EXPER + β5EXPER² + β6HRSWK + e
d) Suppose you wish to test the hypothesis that a year of education has the same effect on ln(WAGE) as a year of experience. What null and alternative hypotheses would you set up? (5 marks)
- Education and experience have the same effect on ln(WAGE) if β2 = β4 and β3 = β5. The null and alternative hypotheses are: H0: β2 = β4 and β3 = β5; H1: β2 ≠ β4 or β3 ≠ β5 or both.
e) What is the restricted model, assuming that the null hypothesis is true? (5 marks)
- ln(WAGE) = β1 + β4(EDUC + EXPER) + β5(EDUC² + EXPER²) + β6HRSWK + e.
f) Given that the sum of squared errors from the restricted model is SSER = 254.1726, test the hypothesis in (d). (For SSEU use the relevant value from the table of output above. The sample size is N = 1000.)
- F = [(SSER − SSEU)/J] / [SSEU/(N − K)] = [(254.1726 − 222.6674)/2] / [222.6674/994] = 70.32. The 5% critical value is F = 3.005. Since the F statistic is greater than the F critical value, we reject the null hypothesis and conclude that education and experience have different effects on ln(WAGE).
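A quick check of the arithmetic in part (f):

    from scipy import stats

    SSE_R, SSE_U, J, N, K = 254.1726, 222.6674, 2, 1000, 6
    F = ((SSE_R - SSE_U) / J) / (SSE_U / (N - K))
    Fc = stats.f.ppf(0.95, J, N - K)
    print(F, Fc)   # ≈ 70.32 and ≈ 3.005 -> reject H0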