F Test of Goodness of Fit

The document discusses using an F-test to test the goodness of fit of a regression model. It defines the total, explained, and residual sum of squares, and how they relate. It also defines the F statistic and how it can be used to test the null hypothesis that a model variable has no explanatory power against the alternative that it does.


F TEST OF GOODNESS OF FIT

Model Y = β 1 + β 2X + u

$\sum_i (Y_i - \bar{Y})^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i \hat{u}_i^2$

TSS = ESS + RSS

In an earlier sequence it was demonstrated that the sum of the squared deviations of Y about its sample mean (TSS: total sum of squares) can be decomposed into the sum of the squared deviations of the fitted values about the sample mean (ESS: explained sum of squares) and the sum of the squared residuals (RSS: residual sum of squares).
1
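
As a quick numerical illustration of this decomposition (an addition to the original slides, not part of them), the Python sketch below fits a simple regression by OLS to made-up data and checks that TSS = ESS + RSS; the data-generating process and variable names are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
X = rng.uniform(0, 10, n)
Y = 2.0 + 0.8 * X + rng.normal(0, 1.5, n)   # hypothetical data-generating process

# OLS estimates of the intercept and slope
beta2 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
beta1 = Y.mean() - beta2 * X.mean()
Y_hat = beta1 + beta2 * X
u_hat = Y - Y_hat

TSS = np.sum((Y - Y.mean()) ** 2)
ESS = np.sum((Y_hat - Y.mean()) ** 2)
RSS = np.sum(u_hat ** 2)

print(TSS, ESS + RSS)        # the two numbers agree, up to floating-point rounding
print("R2 =", ESS / TSS)
```
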
F TEST OF GOODNESS OF FIT

Model Y = β 1 + β 2X + u

$\sum_i (Y_i - \bar{Y})^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i \hat{u}_i^2$

TSS = ESS + RSS

$R^2 = \dfrac{ESS}{TSS} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i (Y_i - \bar{Y})^2}$

R2, the usual measure of goodness of fit, was then defined to be the ratio of the explained
sum of squares to the total sum of squares.

2
F TEST OF GOODNESS OF FIT

Model Y = β 1 + β 2X + u

$\sum_i (Y_i - \bar{Y})^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i \hat{u}_i^2$

TSS = ESS + RSS

$R^2 = \dfrac{ESS}{TSS} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i (Y_i - \bar{Y})^2}$

The null hypothesis that we are going to test is that the model has no explanatory power.

3
F TEST OF GOODNESS OF FIT

Model Y = β 1 + β 2X + u
Null hypothesis: H0: β 2 = 0
Alternative hypothesis: H1: β 2 ≠ 0

$\sum_i (Y_i - \bar{Y})^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i \hat{u}_i^2$

TSS = ESS + RSS

$R^2 = \dfrac{ESS}{TSS} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i (Y_i - \bar{Y})^2}$

Since X is the only explanatory variable at the moment, the null hypothesis is that Y is not
determined by X. Mathematically, we have H0: β2 = 0

4
F TEST OF GOODNESS OF FIT

Model Y = β 1 + β 2X + u
Null hypothesis: H0: β 2 = 0
Alternative hypothesis: H1: β 2 ≠ 0

$\sum_i (Y_i - \bar{Y})^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i \hat{u}_i^2$

TSS = ESS + RSS

$R^2 = \dfrac{ESS}{TSS} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i (Y_i - \bar{Y})^2}$

$F(k-1,\ n-k) = \dfrac{ESS/(k-1)}{RSS/(n-k)} = \dfrac{(ESS/TSS)/(k-1)}{(RSS/TSS)/(n-k)} = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$
Hypotheses concerning goodness of fit are tested via the F statistic, defined as shown. k is
the number of parameters in the regression equation, which at present is just 2.

5
F TEST OF GOODNESS OF FIT

Model Y = β 1 + β 2X + u
Null hypothesis: H0: β 2 = 0
Alternative hypothesis: H1: β 2 ≠ 0

$\sum_i (Y_i - \bar{Y})^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i \hat{u}_i^2$

TSS = ESS + RSS

$R^2 = \dfrac{ESS}{TSS} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i (Y_i - \bar{Y})^2}$

$F(k-1,\ n-k) = \dfrac{ESS/(k-1)}{RSS/(n-k)} = \dfrac{(ESS/TSS)/(k-1)}{(RSS/TSS)/(n-k)} = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$
n – k is, as with the t statistic, the number of degrees of freedom (number of observations
less the number of parameters estimated). For simple regression analysis, it is n – 2.

6
F TEST OF GOODNESS OF FIT

Model Y = β 1 + β 2X + u
Null hypothesis: H0: β 2 = 0
Alternative hypothesis: H1: β 2 ≠ 0

$\sum_i (Y_i - \bar{Y})^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i \hat{u}_i^2$

TSS = ESS + RSS

$R^2 = \dfrac{ESS}{TSS} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i (Y_i - \bar{Y})^2}$

$F(k-1,\ n-k) = \dfrac{ESS/(k-1)}{RSS/(n-k)} = \dfrac{(ESS/TSS)/(k-1)}{(RSS/TSS)/(n-k)} = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$
The F statistic may alternatively be written in terms of R2. First divide the numerator and
denominator by TSS.

7
F TEST OF GOODNESS OF FIT

Model Y = β 1 + β 2X + u
Null hypothesis: H0: β 2 = 0
Alternative hypothesis: H1: β 2 ≠ 0

$\sum_i (Y_i - \bar{Y})^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i \hat{u}_i^2$

TSS = ESS + RSS

$R^2 = \dfrac{ESS}{TSS} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i (Y_i - \bar{Y})^2}$

$F(k-1,\ n-k) = \dfrac{ESS/(k-1)}{RSS/(n-k)} = \dfrac{(ESS/TSS)/(k-1)}{(RSS/TSS)/(n-k)} = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$
We can now rewrite the F statistic as shown. The R2 in the numerator comes straight from
the definition of R2.

8
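
A short sketch (an addition, not from the slides) confirming that the two expressions for F give the same value; the ESS, RSS, n and k used here are arbitrary illustrative numbers.

```python
ESS, RSS, n, k = 60.0, 240.0, 20, 2      # illustrative values only
TSS = ESS + RSS
R2 = ESS / TSS

F_from_ss = (ESS / (k - 1)) / (RSS / (n - k))          # sums-of-squares form
F_from_R2 = (R2 / (k - 1)) / ((1 - R2) / (n - k))      # R-squared form

print(F_from_ss, F_from_R2)   # identical: 4.5 and 4.5
```
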
F TEST OF GOODNESS OF FIT

$\dfrac{RSS}{TSS} = \dfrac{TSS - ESS}{TSS} = 1 - \dfrac{ESS}{TSS} = 1 - R^2$

$\sum_i (Y_i - \bar{Y})^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i \hat{u}_i^2$

TSS = ESS + RSS

$R^2 = \dfrac{ESS}{TSS} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i (Y_i - \bar{Y})^2}$

$F(k-1,\ n-k) = \dfrac{ESS/(k-1)}{RSS/(n-k)} = \dfrac{(ESS/TSS)/(k-1)}{(RSS/TSS)/(n-k)} = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$
It is easily demonstrated that RSS/TSS is equal to 1 – R2.

9
F TEST OF GOODNESS OF FIT

Model Y = β 1 + β 2X + u
Null hypothesis: H0: β 2 = 0
Alternative hypothesis: H1: β 2 ≠ 0

$\sum_i (Y_i - \bar{Y})^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i \hat{u}_i^2$

TSS = ESS + RSS

$R^2 = \dfrac{ESS}{TSS} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i (Y_i - \bar{Y})^2}$

$F(k-1,\ n-k) = \dfrac{ESS/(k-1)}{RSS/(n-k)} = \dfrac{(ESS/TSS)/(k-1)}{(RSS/TSS)/(n-k)} = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$
F is a monotonically increasing function of R2. As R2 increases, the numerator increases
and the denominator decreases, so for both of these reasons F increases.

10
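
To see the monotonic relationship numerically, here is a small sketch (an addition to the slides) that evaluates F over a grid of R² values for k = 2 and n = 20.

```python
import numpy as np

k, n = 2, 20
R2 = np.linspace(0.0, 0.9, 10)
F = (R2 / (k - 1)) / ((1 - R2) / (n - k))

for r2, f_val in zip(R2, F):
    print(f"R2 = {r2:.1f}  ->  F = {f_val:7.2f}")   # F rises steadily as R2 rises
```
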
F TEST OF GOODNESS OF FIT

$F(k-1,\ n-k) = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$

$F(1,\ 18) = \dfrac{R^2}{(1-R^2)/18}$

[Figure: F plotted as a function of R² for F(1, 18); the vertical axis runs from 0 to 140 and the horizontal axis shows R² from 0 to 1.]

Here is F plotted as a function of R2 for the case where there is 1 explanatory variable and
20 observations. Since k = 2, n – k = 18.

11
F TEST OF GOODNESS OF FIT

$F(k-1,\ n-k) = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$

$F(1,\ 18) = \dfrac{R^2}{(1-R^2)/18}$

[Figure: F plotted as a function of R² for F(1, 18); the vertical axis runs from 0 to 140 and the horizontal axis shows R² from 0 to 1.]

If the null hypothesis is true, the F statistic is a random variable that follows the F distribution with k – 1 and n – k degrees of freedom, here 1 and 18.

12
F TEST OF GOODNESS OF FIT

$F(k-1,\ n-k) = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$

$F(1,\ 18) = \dfrac{R^2}{(1-R^2)/18}$

[Figure: F plotted as a function of R² for F(1, 18); the vertical axis runs from 0 to 140 and the horizontal axis shows R² from 0 to 1.]

There will be some critical value which it will exceed, as a matter of chance, only 5 percent
of the time. If we are performing a 5 percent significance test, we will reject H0 if the F
statistic is greater than this critical value.
13
F TEST OF GOODNESS OF FIT

$F(k-1,\ n-k) = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$

$F(1,\ 18) = \dfrac{R^2}{(1-R^2)/18}$

[Figure: F plotted as a function of R² for F(1, 18); the vertical axis runs from 0 to 140, with the value 4.41 marked, and the horizontal axis shows R² from 0 to 1.]

In the case of an F test, the critical value depends on the number of explanatory variables
as well as the number of degrees of freedom. When there is one explanatory variable and
18 degrees of freedom, the critical value of F at the 5 percent significance level is 4.41.
14
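
The critical value quoted here can be reproduced with scipy (an added sketch, not part of the slides); scipy.stats.f.ppf returns the quantile of the F distribution.

```python
from scipy.stats import f

df1, df2 = 1, 18
F_crit_5 = f.ppf(0.95, df1, df2)       # upper 5% critical value of F(1, 18)
print(round(F_crit_5, 2))              # 4.41

# the R2 that corresponds to this critical F, obtained from F = R2 / ((1 - R2)/18)
R2_threshold = F_crit_5 / (F_crit_5 + df2)
print(round(R2_threshold, 3))          # about 0.197, i.e. roughly 0.20
```
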
F TEST OF GOODNESS OF FIT

$F(k-1,\ n-k) = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$

$F(1,\ 18) = \dfrac{R^2}{(1-R^2)/18}$

$F_{\mathrm{crit},\,5\%}(1,\ 18) = 4.41$

[Figure: F plotted as a function of R² for F(1, 18); the vertical axis runs from 0 to 140, with 4.41 marked, and the horizontal axis shows R² from 0 to 1.]

For one explanatory variable and 18 degrees of freedom, F = 4.41 when R2 = 0.20.

15
F TEST OF GOODNESS OF FIT

$F(k-1,\ n-k) = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$

$F(1,\ 18) = \dfrac{R^2}{(1-R^2)/18}$

$F_{\mathrm{crit},\,5\%}(1,\ 18) = 4.41$

[Figure: F plotted as a function of R² for F(1, 18); the vertical axis runs from 0 to 140, with 4.41 marked, and the horizontal axis shows R² from 0 to 1.]

If R2 is higher than 0.20, F will be higher than 4.41, and we will reject the null hypothesis at
the 5 percent level.

16
F TEST OF GOODNESS OF FIT

$F(k-1,\ n-k) = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$

$F(1,\ 18) = \dfrac{R^2}{(1-R^2)/18}$

$F_{\mathrm{crit},\,1\%}(1,\ 18) = 8.29$

[Figure: F plotted as a function of R² for F(1, 18); the vertical axis runs from 0 to 140, with 8.29 marked, and the horizontal axis shows R² from 0 to 1, with 0.32 marked.]

If we were performing a 1 percent test, with one explanatory variable and 18 degrees of
freedom, the critical value of F would be 8.29. F = 8.29 when R2 = 0.32.

17
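
The same check for the 1 percent level (an added sketch using scipy):

```python
from scipy.stats import f

F_crit_1 = f.ppf(0.99, 1, 18)                # upper 1% critical value of F(1, 18)
print(round(F_crit_1, 2))                    # 8.29
print(round(F_crit_1 / (F_crit_1 + 18), 3))  # about 0.315, i.e. roughly 0.32
```
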
F TEST OF GOODNESS OF FIT

$F(k-1,\ n-k) = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$

$F(1,\ 18) = \dfrac{R^2}{(1-R^2)/18}$

$F_{\mathrm{crit},\,1\%}(1,\ 18) = 8.29$

[Figure: F plotted as a function of R² for F(1, 18); the vertical axis runs from 0 to 140, with 8.29 marked, and the horizontal axis shows R² from 0 to 1, with 0.32 marked.]

If R2 is higher than 0.32, F will be higher than 8.29, and we will reject the null hypothesis at
the 1 percent level.

18
F TEST OF GOODNESS OF FIT

$F(k-1,\ n-k) = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$

$F(1,\ 18) = \dfrac{R^2}{(1-R^2)/18}$

$F_{\mathrm{crit},\,1\%}(1,\ 18) = 8.29$

[Figure: F plotted as a function of R² for F(1, 18); the vertical axis runs from 0 to 140, with 8.29 marked, and the horizontal axis shows R² from 0 to 1, with 0.32 marked.]

Why do we perform the test indirectly, through F, instead of directly through R2? After all, it
would be easy to compute the critical values of R2 from those for F.

19
F TEST OF GOODNESS OF FIT

$F(k-1,\ n-k) = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$

$F(1,\ 18) = \dfrac{R^2}{(1-R^2)/18}$

$F_{\mathrm{crit},\,1\%}(1,\ 18) = 8.29$

[Figure: F plotted as a function of R² for F(1, 18); the vertical axis runs from 0 to 140, with 8.29 marked, and the horizontal axis shows R² from 0 to 1, with 0.32 marked.]

The reason is that the F test is used for several different tests involving analysis of variance. Rather than have a specialized table of critical values for each test, it is more convenient to have just one, for F.

20
F TEST OF GOODNESS OF FIT

Model Y = β 1 + β 2X + u
Null hypothesis: H0: β 2 = 0
Alternative hypothesis: H1: β 2 ≠ 0

$\sum_i (Y_i - \bar{Y})^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i \hat{u}_i^2$

TSS = ESS + RSS

$R^2 = \dfrac{ESS}{TSS} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i (Y_i - \bar{Y})^2}$

$F(k-1,\ n-k) = \dfrac{ESS/(k-1)}{RSS/(n-k)} = \dfrac{(ESS/TSS)/(k-1)}{(RSS/TSS)/(n-k)} = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$
Note that, for simple regression analysis, the null and alternative hypotheses are
mathematically exactly the same as for a two-tailed t test. Could the F test come to a
different conclusion from the t test?
21
F TEST OF GOODNESS OF FIT

Model Y = β 1 + β 2X + u
Null hypothesis: H0: β 2 = 0
Alternative hypothesis: H1: β 2 ≠ 0

$\sum_i (Y_i - \bar{Y})^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i \hat{u}_i^2$

TSS = ESS + RSS

$R^2 = \dfrac{ESS}{TSS} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i (Y_i - \bar{Y})^2}$

$F(k-1,\ n-k) = \dfrac{ESS/(k-1)}{RSS/(n-k)} = \dfrac{(ESS/TSS)/(k-1)}{(RSS/TSS)/(n-k)} = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$
The answer, of course, is no. We will demonstrate that, for simple regression analysis, the F
statistic is the square of the t statistic.

22
F TEST OF GOODNESS OF FIT

Demonstration that F = t2

$F = \dfrac{ESS}{RSS/(n-2)} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i \hat{u}_i^2/(n-2)}$

$\phantom{F} = \dfrac{\sum_i \big( (\hat{\beta}_1 + \hat{\beta}_2 X_i) - (\hat{\beta}_1 + \hat{\beta}_2 \bar{X}) \big)^2}{\hat{\sigma}_u^2} = \dfrac{\hat{\beta}_2^2 \sum_i (X_i - \bar{X})^2}{\hat{\sigma}_u^2}$

$\phantom{F} = \dfrac{\hat{\beta}_2^2}{\hat{\sigma}_u^2 \big/ \sum_i (X_i - \bar{X})^2} = \dfrac{\hat{\beta}_2^2}{\big(\mathrm{s.e.}(\hat{\beta}_2)\big)^2} = t^2$

We start by replacing ESS and RSS by their mathematical expressions.

23
F TEST OF GOODNESS OF FIT

Demonstration that F = t2

$F = \dfrac{ESS}{RSS/(n-2)} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i \hat{u}_i^2/(n-2)}$

$\phantom{F} = \dfrac{\sum_i \big( (\hat{\beta}_1 + \hat{\beta}_2 X_i) - (\hat{\beta}_1 + \hat{\beta}_2 \bar{X}) \big)^2}{\hat{\sigma}_u^2} = \dfrac{\hat{\beta}_2^2 \sum_i (X_i - \bar{X})^2}{\hat{\sigma}_u^2}$

$\phantom{F} = \dfrac{\hat{\beta}_2^2}{\hat{\sigma}_u^2 \big/ \sum_i (X_i - \bar{X})^2} = \dfrac{\hat{\beta}_2^2}{\big(\mathrm{s.e.}(\hat{\beta}_2)\big)^2} = t^2$

The denominator is the expression for σ̂ᵤ², the estimator of σᵤ², for the simple regression model. We expand the numerator using the expression for the fitted relationship.

24
F TEST OF GOODNESS OF FIT

Demonstration that F = t2

$F = \dfrac{ESS}{RSS/(n-2)} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i \hat{u}_i^2/(n-2)}$

$\phantom{F} = \dfrac{\sum_i \big( (\hat{\beta}_1 + \hat{\beta}_2 X_i) - (\hat{\beta}_1 + \hat{\beta}_2 \bar{X}) \big)^2}{\hat{\sigma}_u^2} = \dfrac{\hat{\beta}_2^2 \sum_i (X_i - \bar{X})^2}{\hat{\sigma}_u^2}$

$\phantom{F} = \dfrac{\hat{\beta}_2^2}{\hat{\sigma}_u^2 \big/ \sum_i (X_i - \bar{X})^2} = \dfrac{\hat{\beta}_2^2}{\big(\mathrm{s.e.}(\hat{\beta}_2)\big)^2} = t^2$

The β̂₁ terms in the numerator cancel. The rest of the numerator can be grouped as shown.

25
F TEST OF GOODNESS OF FIT

Demonstration that F = t2

$F = \dfrac{ESS}{RSS/(n-2)} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i \hat{u}_i^2/(n-2)}$

$\phantom{F} = \dfrac{\sum_i \big( (\hat{\beta}_1 + \hat{\beta}_2 X_i) - (\hat{\beta}_1 + \hat{\beta}_2 \bar{X}) \big)^2}{\hat{\sigma}_u^2} = \dfrac{\hat{\beta}_2^2 \sum_i (X_i - \bar{X})^2}{\hat{\sigma}_u^2}$

$\phantom{F} = \dfrac{\hat{\beta}_2^2}{\hat{\sigma}_u^2 \big/ \sum_i (X_i - \bar{X})^2} = \dfrac{\hat{\beta}_2^2}{\big(\mathrm{s.e.}(\hat{\beta}_2)\big)^2} = t^2$

We take the β̂₂² term out of the summation as a factor.

26
F TEST OF GOODNESS OF FIT

Demonstration that F = t2

$F = \dfrac{ESS}{RSS/(n-2)} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i \hat{u}_i^2/(n-2)}$

$\phantom{F} = \dfrac{\sum_i \big( (\hat{\beta}_1 + \hat{\beta}_2 X_i) - (\hat{\beta}_1 + \hat{\beta}_2 \bar{X}) \big)^2}{\hat{\sigma}_u^2} = \dfrac{\hat{\beta}_2^2 \sum_i (X_i - \bar{X})^2}{\hat{\sigma}_u^2}$

$\phantom{F} = \dfrac{\hat{\beta}_2^2}{\hat{\sigma}_u^2 \big/ \sum_i (X_i - \bar{X})^2} = \dfrac{\hat{\beta}_2^2}{\big(\mathrm{s.e.}(\hat{\beta}_2)\big)^2} = t^2$

We move the term involving X to the denominator.

27
F TEST OF GOODNESS OF FIT

Demonstration that F = t2

$F = \dfrac{ESS}{RSS/(n-2)} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i \hat{u}_i^2/(n-2)}$

$\phantom{F} = \dfrac{\sum_i \big( (\hat{\beta}_1 + \hat{\beta}_2 X_i) - (\hat{\beta}_1 + \hat{\beta}_2 \bar{X}) \big)^2}{\hat{\sigma}_u^2} = \dfrac{\hat{\beta}_2^2 \sum_i (X_i - \bar{X})^2}{\hat{\sigma}_u^2}$

$\phantom{F} = \dfrac{\hat{\beta}_2^2}{\hat{\sigma}_u^2 \big/ \sum_i (X_i - \bar{X})^2} = \dfrac{\hat{\beta}_2^2}{\big(\mathrm{s.e.}(\hat{\beta}_2)\big)^2} = t^2$

The denominator is the square of the standard error of β̂₂.

28
F TEST OF GOODNESS OF FIT

Demonstration that F = t2

$F = \dfrac{ESS}{RSS/(n-2)} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i \hat{u}_i^2/(n-2)}$

$\phantom{F} = \dfrac{\sum_i \big( (\hat{\beta}_1 + \hat{\beta}_2 X_i) - (\hat{\beta}_1 + \hat{\beta}_2 \bar{X}) \big)^2}{\hat{\sigma}_u^2} = \dfrac{\hat{\beta}_2^2 \sum_i (X_i - \bar{X})^2}{\hat{\sigma}_u^2}$

$\phantom{F} = \dfrac{\hat{\beta}_2^2}{\hat{\sigma}_u^2 \big/ \sum_i (X_i - \bar{X})^2} = \dfrac{\hat{\beta}_2^2}{\big(\mathrm{s.e.}(\hat{\beta}_2)\big)^2} = t^2$

Hence we obtain β̂₂² divided by the square of the standard error of β̂₂. This is the t statistic, squared.

29
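
A numerical illustration of this equivalence (an added sketch with simulated data; the data-generating process and variable names are hypothetical): fit a simple regression, compute the t statistic for the slope and the F statistic, and compare.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20
X = rng.uniform(0, 10, n)
Y = 1.0 + 0.5 * X + rng.normal(0, 2.0, n)

Sxx = np.sum((X - X.mean()) ** 2)
beta2 = np.sum((X - X.mean()) * (Y - Y.mean())) / Sxx
beta1 = Y.mean() - beta2 * X.mean()
u_hat = Y - (beta1 + beta2 * X)

sigma2_hat = np.sum(u_hat ** 2) / (n - 2)       # estimator of the error variance
se_beta2 = np.sqrt(sigma2_hat / Sxx)            # standard error of the slope
t_stat = beta2 / se_beta2

ESS = beta2 ** 2 * Sxx
RSS = np.sum(u_hat ** 2)
F = ESS / (RSS / (n - 2))

print(F, t_stat ** 2)    # the two values coincide
```
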
F TEST OF GOODNESS OF FIT

Demonstration that F = t2

$F = \dfrac{ESS}{RSS/(n-2)} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i \hat{u}_i^2/(n-2)}$

$\phantom{F} = \dfrac{\sum_i \big( (\hat{\beta}_1 + \hat{\beta}_2 X_i) - (\hat{\beta}_1 + \hat{\beta}_2 \bar{X}) \big)^2}{\hat{\sigma}_u^2} = \dfrac{\hat{\beta}_2^2 \sum_i (X_i - \bar{X})^2}{\hat{\sigma}_u^2}$

$\phantom{F} = \dfrac{\hat{\beta}_2^2}{\hat{\sigma}_u^2 \big/ \sum_i (X_i - \bar{X})^2} = \dfrac{\hat{\beta}_2^2}{\big(\mathrm{s.e.}(\hat{\beta}_2)\big)^2} = t^2$

It can also be shown that the critical value of F, at any significance level, is equal to the
square of the critical value of t. We will not attempt to prove this.

30
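
Although the proof is not given here, the claim is easy to check numerically (an added sketch using scipy):

```python
from scipy.stats import f, t

F_crit = f.ppf(0.95, 1, 18)        # 5% critical value of F(1, 18)
t_crit = t.ppf(0.975, 18)          # two-tailed 5% critical value of t(18)
print(F_crit, t_crit ** 2)         # both are approximately 4.41
```
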
F TEST OF GOODNESS OF FIT

Demonstration that F = t2

$F = \dfrac{ESS}{RSS/(n-2)} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i \hat{u}_i^2/(n-2)}$

$\phantom{F} = \dfrac{\sum_i \big( (\hat{\beta}_1 + \hat{\beta}_2 X_i) - (\hat{\beta}_1 + \hat{\beta}_2 \bar{X}) \big)^2}{\hat{\sigma}_u^2} = \dfrac{\hat{\beta}_2^2 \sum_i (X_i - \bar{X})^2}{\hat{\sigma}_u^2}$

$\phantom{F} = \dfrac{\hat{\beta}_2^2}{\hat{\sigma}_u^2 \big/ \sum_i (X_i - \bar{X})^2} = \dfrac{\hat{\beta}_2^2}{\big(\mathrm{s.e.}(\hat{\beta}_2)\big)^2} = t^2$

Since the F test is equivalent to a two-sided t test in the simple regression model, there is
no point in performing both tests. In fact, if justified, a one-sided t test would be better than
either because it is more powerful (lower risk of Type II error if H0 is false).
31
F TEST OF GOODNESS OF FIT

Demonstration that F = t2

$F = \dfrac{ESS}{RSS/(n-2)} = \dfrac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i \hat{u}_i^2/(n-2)}$

$\phantom{F} = \dfrac{\sum_i \big( (\hat{\beta}_1 + \hat{\beta}_2 X_i) - (\hat{\beta}_1 + \hat{\beta}_2 \bar{X}) \big)^2}{\hat{\sigma}_u^2} = \dfrac{\hat{\beta}_2^2 \sum_i (X_i - \bar{X})^2}{\hat{\sigma}_u^2}$

$\phantom{F} = \dfrac{\hat{\beta}_2^2}{\hat{\sigma}_u^2 \big/ \sum_i (X_i - \bar{X})^2} = \dfrac{\hat{\beta}_2^2}{\big(\mathrm{s.e.}(\hat{\beta}_2)\big)^2} = t^2$

The F test will have its own role to play when we come to multiple regression analysis.

32
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

Here is the output for the regression of hourly earnings on years of schooling for the sample of 500 respondents from the National Longitudinal Survey of Youth 1997.

33
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

$F(1,\ n-2) = \dfrac{ESS}{RSS/(n-2)} = \dfrac{6014}{64315/(500-2)} = \dfrac{6014}{129.15} = 46.57$

We shall check that the F statistic has been calculated correctly. The explained sum of
squares (described in Stata as the model sum of squares) is 6014.

34
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

$F(1,\ n-2) = \dfrac{ESS}{RSS/(n-2)} = \dfrac{6014}{64315/(500-2)} = \dfrac{6014}{129.15} = 46.57$

The residual sum of squares is 64315.

35
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

$F(1,\ n-2) = \dfrac{ESS}{RSS/(n-2)} = \dfrac{6014}{64315/(500-2)} = \dfrac{6014}{129.15} = 46.57$

The number of degrees of freedom is 500 – 2 = 498.

36
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

$F(1,\ n-2) = \dfrac{ESS}{RSS/(n-2)} = \dfrac{6014}{64315/(500-2)} = \dfrac{6014}{129.15} = 46.57$

The denominator of the expression for F is therefore 129.15. Note that this is an estimate of σᵤ². Its square root, denoted in Stata by Root MSE, is an estimate of the standard deviation of u.
37
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

$F(1,\ n-2) = \dfrac{ESS}{RSS/(n-2)} = \dfrac{6014}{64315/(500-2)} = \dfrac{6014}{129.15} = 46.57$

Our calculation of F agrees with that in the Stata output.

38
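
The arithmetic on this and the preceding slides can be reproduced directly from the Stata output (an added sketch; scipy is assumed available, and its upper-tail F probability corresponds to the Prob > F entry).

```python
from scipy.stats import f

ESS, RSS, n = 6014.04474, 64314.9215, 500
F = ESS / (RSS / (n - 2))
print(round(F, 2))                     # 46.57, matching F(1, 498) in the output

p_value = f.sf(F, 1, n - 2)            # upper-tail probability of F(1, 498)
print(p_value)                         # effectively zero, as reported by Prob > F

R2 = ESS / (ESS + RSS)
print(round(R2 / ((1 - R2) / (n - 2)), 2))   # 46.57 again from the R2 form
# (the slide's 46.56 reflects rounding R2 to 0.0855 before the calculation)
```
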
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

$F(1,\ n-2) = \dfrac{R^2}{(1-R^2)/(n-2)} = \dfrac{0.0855}{(1-0.0855)/(500-2)} = 46.56$

We will also check the F statistic using the expression for it in terms of R2. We see again
that it agrees, apart from rounding error.

39
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

We will also check the relationship between the F statistic and the t statistic for the slope
coefficient.

40
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

$6.82^2 = 46.51$

Obviously, this is correct as well, apart from rounding error.

41
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

$F_{\mathrm{crit},\,0.1\%}(1,\ 500) = 10.96 \qquad t_{\mathrm{crit},\,0.1\%}(500) = 3.31 \qquad 3.31^2 = 10.96$

And the critical value of F is the square of the critical value of t. (We are using the values
for 500 degrees of freedom because those for 498 do not appear in the table.)

42
F TEST OF GOODNESS OF FIT

. reg EARNINGS S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 46.57
Model | 6014.04474 1 6014.04474 Prob > F = 0.0000
Residual | 64314.9215 498 129.146429 R-squared = 0.0855
-----------+------------------------------ Adj R-squared = 0.0837
Total | 70328.9662 499 140.939812 Root MSE = 11.364
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.265712 .1854782 6.82 0.000 .9012959 1.630128
_cons | .7646844 2.803765 0.27 0.785 -4.743982 6.273351
----------------------------------------------------------------------------

$F_{\mathrm{crit},\,0.1\%}(1,\ 500) = 10.96 \qquad t_{\mathrm{crit},\,0.1\%}(500) = 3.31 \qquad 3.31^2 = 10.96$

The relationship is shown for the 0.1% significance level, but obviously it is also true for any
other significance level.

43
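
These table values can also be reproduced with scipy (an added sketch, not part of the slides):

```python
from scipy.stats import f, t

F_crit = f.ppf(0.999, 1, 500)      # 0.1% critical value of F(1, 500)
t_crit = t.ppf(0.9995, 500)        # two-tailed 0.1% critical value of t(500)
print(round(F_crit, 2), round(t_crit, 2), round(t_crit ** 2, 2))
# approximately 10.96, 3.31 and 10.96
```
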
Copyright Christopher Dougherty 2016.

These slideshows may be downloaded by anyone, anywhere for personal use.


Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.

The content of this slideshow comes from Section 2.7 of C. Dougherty,


Introduction to Econometrics, fifth edition 2016, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.

Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

2016.04.20
