
Statistics 136 Chapter 10

Diagnostic Checking: Nonnormality and Heteroskedasticity

UP School of Statistics

2nd Semester AY 2023-2024



Table of Contents

1 Reasons for Nonnormality

2 Implications of Nonnormality

3 Graphical Tools For Detecting Nonnormality

4 Testing for Nonnormality

5 Reasons for Heteroskedasticity

6 Implications of Heteroskedasticity

7 Tests for Heteroskedasticity

8 Remedial Measures for Nonnormality and Heteroskedasticity



Reasons for Nonnormality of Error Terms

Existence of outliers in the model (nonsystematic problem).
There is still too much variability in Y that is due to error (systematic problem).
The relationship of the regressors with the dependent variable is nonlinear (systematic problem).
Y is inherently far from being normally distributed (systematic problem).





Implications of Nonnormality

Note that several of the tests and procedures we have discussed assume the normality of the error terms.
The tests and confidence intervals for the model parameters assume normality of the error terms.
Additionally, the prediction of new responses assumes normality of the error terms.
The rationale behind the MSE being the UMVUE for the error term variance also assumes normality.



Implications of Nonnormality

However, take note that β̂ being the BLUE of β does not require the assumption of normality of the error terms.
Note also that, because of the central limit theorem, many of the statistics used for hypothesis testing and confidence interval estimation approximately follow the normal distribution.
Because of these realities, many statisticians regard the assumption of normality as the least bothersome of the assumptions of regression analysis.
However, it is still imperative to check for gross nonnormality!





Graphical Tools For Detecting Nonnormality

Useful graphical tools for assessing the normality of a given variable are histograms (with a superimposed normal curve) and normal probability plots.



Histograms
Example of histograms:



Normal Probability Plots

The following summarize the steps in constructing a normal probability plot:
Let e(1) < e(2) < e(3) < . . . < e(n) be the residuals ranked in increasing order.
If we plot e(i) against the cumulative probability Pi = (i − 0.5)/n, i = 1, 2, . . . , n, the resulting points should lie approximately on a straight line.
The straight line is usually determined visually, with emphasis on the central values rather than the extremes.
Substantial departures from a straight line indicate that the distribution is not normal.
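As an illustration, the sketch below (a minimal sketch; the array `resid` is a placeholder for the residuals of a fitted model) computes the cumulative probabilities Pi = (i − 0.5)/n and plots the ordered residuals against the corresponding standard normal quantiles, which is the usual way to obtain a straight-line reference on ordinary axes.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

def normal_probability_plot(resid):
    """Plot the ordered residuals against normal quantiles of P_i = (i - 0.5)/n."""
    e_sorted = np.sort(resid)                     # e_(1) <= ... <= e_(n)
    n = len(e_sorted)
    p = (np.arange(1, n + 1) - 0.5) / n           # cumulative probabilities P_i
    q = stats.norm.ppf(p)                         # corresponding standard normal quantiles
    plt.scatter(q, e_sorted, s=15)
    slope, intercept = np.polyfit(q, e_sorted, 1) # least squares reference line
    plt.plot(q, slope * q + intercept, color="red")
    plt.xlabel("Standard normal quantile of P_i")
    plt.ylabel("Ordered residual e_(i)")
    plt.title("Normal probability plot")
    plt.show()
```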



Prototypes of the Normal Probability Plot

A normal probability plot is a graphical representation of the scaled residuals.



Notes About the Normal Probability Plot

Small sample sizes (n ≤ 16) often produce normal probability plots that deviate substantially from linearity.
For large sample sizes (n ≥ 32), the plots are much better behaved.
Usually about 20 points are required to produce normal probability plots that are stable enough to be easily interpreted.
A common defect that shows up on the normal probability plot is the occurrence of one or two large residuals. Sometimes, this is an indication that the corresponding observations are outliers.





Testing for Nonnormality

The following are some of the tests that may be used for validating the assumption of normality of the error terms.
χ2 Goodness-of-fit test
Kolmogorov-Smirnov One-Sample Test
Shapiro-Wilk Test
Anderson-Darling Test
Cramer-von-Mises criterion
Jarque-Bera test
All of the tests above test the following hypotheses:
Ho: The error terms are normally distributed.
Ha: The error terms are not normally distributed.



χ2 Goodness-of-fit test

χ2 Goodness-of-fit test
Description: The interest is in the number of subjects or objects that fall in various categories. In the χ2 goodness-of-fit test, we wish to test whether a significant difference exists between the observed number of objects in a category and the expected number of observations under the null hypothesis. (This is discussed in full in Stat 132.)
Strengths: A very intuitive test.
Weaknesses: The test is very sensitive to the binning process. There is also a requirement on the expected frequency per bin (each should be at least 5).
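A minimal sketch of how the test could be applied to the residuals is given below, assuming `resid` holds the residuals; the number of bins is an illustrative choice. Because the mean and variance are estimated from the data, two additional degrees of freedom are removed via `ddof`.

```python
import numpy as np
from scipy import stats

def chi2_normality_test(resid, n_bins=8):
    """Chi-square goodness-of-fit test of the residuals against a fitted normal."""
    n = len(resid)
    mu, sigma = resid.mean(), resid.std(ddof=1)
    # bin edges chosen so each bin has equal probability under the fitted normal
    edges = stats.norm.ppf(np.linspace(0, 1, n_bins + 1), loc=mu, scale=sigma)
    edges[0], edges[-1] = resid.min() - 1.0, resid.max() + 1.0   # make outer edges finite
    observed, _ = np.histogram(resid, bins=edges)
    expected = np.full(n_bins, n / n_bins)        # equal expected counts, each should be >= 5
    # ddof=2 accounts for the two estimated parameters (mean and variance)
    stat, p_value = stats.chisquare(observed, expected, ddof=2)
    return stat, p_value
```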



Kolmogorov-Smirnov One-Sample Test (or Lilliefors Test)

Kolmogorov-Smirnov One-Sample Test (or Lilliefors Test)


Description: This is another goodness-of-fit test that is concerned with the degree of agreement between the distribution of a set of observed scores and some specified distribution. (This is discussed in full in Stat 132.)
Strengths: It is not sensitive to departures from other assumptions. It is also not sensitive to several identical values.
Weaknesses: It is less powerful than most tests for normality. Being a nonparametric test, it requires a larger sample size.
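For reference, a minimal sketch using the Lilliefors variant implemented in statsmodels (the array `resid` is a placeholder for the residuals of a fitted model):

```python
from statsmodels.stats.diagnostic import lilliefors

# Lilliefors variant: Kolmogorov-Smirnov test of normality with mean and variance estimated
ks_stat, p_value = lilliefors(resid, dist='norm')
print(f"KS statistic = {ks_stat:.4f}, p-value = {p_value:.4f}")
# Reject Ho (normal error terms) when the p-value falls below the chosen significance level
```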



Shapiro-Wilk Test

Shapiro-Wilk Test
Description: A test that essentially looks at the correlation between the ordered data and the theoretical values from the normal distribution.
Strengths: This is one of the most powerful and most popular tests for normality. It can be used for small sample sizes.
Weaknesses: Its being a very powerful test is a problem in itself, especially when the sample size is very large: a sufficiently large sample may lead the test to detect even trivial departures from the null hypothesis (for example, a few outlying residuals). The test is also very sensitive to many identical residual values.



Procedure for Shapiro-Wilk Test
The test statistic is

W = \frac{\left( \sum_{i=1}^{n} a_i x_{(i)} \right)^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad 0 < W < 1,

where
x_{(i)} (with parentheses enclosing the subscript index i) is the i-th order statistic, i.e., the i-th smallest number in the sample;
x̄ is the sample mean;
the constants a_i are given by

(a_1, \ldots, a_n) = \frac{m^T V^{-1}}{\sqrt{m^T V^{-1} V^{-1} m}},

where m = (m_1, . . . , m_n)^T and m_1, . . . , m_n are the expected values of the order statistics of an iid sample from the standard normal distribution, and V is the covariance matrix of those order statistics.
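In practice the constants a_i are tabulated or computed by software; a minimal sketch using SciPy's implementation (the array `resid` is a placeholder for the residuals):

```python
from scipy import stats

# Shapiro-Wilk test on the residuals of a fitted regression model
W, p_value = stats.shapiro(resid)
print(f"W = {W:.4f}, p-value = {p_value:.4f}")
# Small p-values indicate departure from normality of the error terms
```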
Anderson-Darling Test

Anderson-Darling Test
Description: This is a test that compares the quantile values of
the empirical distribution of the data with the theoretical
distribution of interest. It is based upon the concept that when
given a hypothesized underlying distribution, the data can be
transformed to a uniform distribution.
Strengths: The Anderson-Darling test is one of the most powerful
statistics for detecting most departures from normality. It may be
used with small sample sizes, n ≤ 25.
Weaknesses: With very large sample sizes, the test may reject the assumption of normality because of only slight imperfections, although industrial data sets with 200 or more observations have passed the Anderson-Darling test. The test is also very sensitive to many identical residual values.



Procedure for Anderson-Darling Test
The data on the variable X to be tested are sorted from low to high.
The mean and standard deviation are calculated from the sample of X.
All X's are standardized: y_i = (X_i − X̄)/s.
Assuming the y_i are standard normal, we compute P_i = P(Y ≤ y_i) for each i.
A² is calculated as

A^2 = -n - \frac{1}{n} \sum_{k=1}^{n} (2k-1) \left[ \ln(P_k) + \ln(1 - P_{n+1-k}) \right]

A^{2*}, an approximate adjustment for sample size, is calculated using

A^{2*} = A^2 \left( 1 + \frac{0.75}{n} + \frac{2.25}{n^2} \right)
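A minimal sketch using SciPy, which returns the statistic together with critical values rather than a p-value (`resid` again stands for the residual array):

```python
from scipy import stats

# Anderson-Darling test of normality on the residuals
result = stats.anderson(resid, dist='norm')
print(f"A^2 statistic = {result.statistic:.4f}")
for crit, sig in zip(result.critical_values, result.significance_level):
    decision = "reject" if result.statistic > crit else "do not reject"
    print(f"  at {sig:4.1f}% level: critical value {crit:.3f} -> {decision} Ho")
```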
Cramer-von-Mises Criterion

Cramer-von-Mises Criterion
Description: In statistics, the Cramer-von-Mises criterion for judging the goodness of fit of an empirical distribution F*(x) compared to a given distribution F(x) is

W^2 = \int_{-\infty}^{\infty} \left[ F(x) - F^*(x) \right]^2 \, dF(x)

In applications, F(x) is the theoretical distribution and F*(x) is the empirically observed distribution.
Strengths: This is a nonparametric test, which implies that it is a robust test. It is also more powerful than the Kolmogorov-Smirnov test.
Weaknesses: It is not as powerful as the Shapiro-Wilk and Anderson-Darling tests. It is also very sensitive to ties among the observations.
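SciPy exposes the criterion as a one-sample test against a fully specified distribution; a hedged sketch plugging in the estimated mean and standard deviation of the residuals (the reported p-value is only approximate when the parameters are estimated from the same data):

```python
import numpy as np
from scipy import stats

# Cramer-von-Mises test of the residuals against a normal with estimated parameters
mu, sigma = np.mean(resid), np.std(resid, ddof=1)
result = stats.cramervonmises(resid, 'norm', args=(mu, sigma))
print(f"W^2 = {result.statistic:.4f}, approximate p-value = {result.pvalue:.4f}")
```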



Jarque-Bera Test

Jarque-Bera Test
Description: In statistics, the Jarque-Bera test is a goodness-of-fit measure of departure from normality, based on the sample skewness and kurtosis.
Strengths: This is one of the most popular tests, especially in econometrics.
Weaknesses: The test has low power for distributions with short tails, especially for bimodal distributions. It is also an asymptotic test and is unreliable when the sample size is small.



Jarque-Bera Test

The test statistic JB is defined as

JB = \frac{n}{6} \left( S^2 + \frac{(K-3)^2}{4} \right)

where n is the number of observations (or degrees of freedom in general), S is the sample skewness, and K is the sample kurtosis.
The statistic JB has an asymptotic chi-square distribution with two degrees of freedom and can be used to test the null hypothesis that the data are from a normal distribution. The null hypothesis is a joint hypothesis of the skewness and the excess kurtosis both being 0, since samples from a normal distribution have an expected skewness of 0 and an expected excess kurtosis of 0. As the definition of JB shows, any deviation from this increases the JB statistic.
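A minimal sketch computing JB directly from the sample skewness and kurtosis, followed by SciPy's built-in version (`resid` is a placeholder for the residual array):

```python
import numpy as np
from scipy import stats

n = len(resid)
S = stats.skew(resid)                     # sample skewness
K = stats.kurtosis(resid, fisher=False)   # sample (non-excess) kurtosis
JB = n / 6 * (S**2 + (K - 3)**2 / 4)
p_value = stats.chi2.sf(JB, df=2)         # asymptotic chi-square(2) p-value
print(f"JB = {JB:.4f}, p-value = {p_value:.4f}")

# Equivalent built-in test
jb_stat, jb_pval = stats.jarque_bera(resid)
```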





Reasons for Heteroskedasticity

One of the assumptions of the regression model is that the variation of the error terms is everywhere the same. If the errors have constant variance, the errors are called homoskedastic.
Heteroskedasticity is the problem of having nonconstant variances of the error terms. The term means "differing variance" and comes from the Greek "hetero" ('different') and "skedasis" ('dispersion'). If this problem is present, the residual plots are usually funnel-shaped or diamond-shaped.



Reasons for Heteroskedasticity

The variability is a function of the variables themselves. For example:
following error-learning models, as people learn, their errors of behavior become smaller over time; the variance is expected to decrease;
as income grows, people have more discretionary income and hence more scope of choice about the disposition of their income; the variance is expected to increase with income;
as data collection techniques improve, the variance is expected to decrease.
Heteroskedasticity can also arise because of the presence of outliers.
Heteroskedasticity arises when the regression model is incorrectly specified.





Implications of Heteroskedasticity

Ordinary least squares (OLS) estimators are still linear and unbiased.
Since the Gauss-Markov theorem and the derivation of the UMVUE require the assumption that Var(ϵ) = σ²I, the OLS estimators are not efficient when Var(ϵ) ≠ σ²I. The OLS estimators are no longer BLUE and UMVUE. Moreover, they are not asymptotically efficient; that is, they do not become efficient even as the sample size increases.
We also required the assumption of constant variance to derive the ANOVA F-test. This test is misleading if we have a problem of heteroskedasticity.
The variance of β̂ is no longer equal to σ²(X′X)⁻¹. With this, the t-tests for the significance of the coefficients are no longer valid. Confidence intervals and prediction intervals are also misleading.


Tests for Heteroskedasticity

The following are some of the tests that may be used for validating the assumption of constant variance of the error terms.
Two-sample test
Goldfeld-Quandt test
White's heteroskedasticity test
Breusch-Pagan test
All of the tests above test the following hypotheses:
Ho: The error variance is constant.
Ha: The error variance is not constant.



Two-Sample Test

Two-Sample Test
Description: If the residual plot gives the impression that the variance increases or decreases in a systematic manner related to an independent variable X or to Y, a simple test is to fit separate regressions to each half of the observations arranged by the level of X, then compare their mean squared errors (MSEs).
Ho: σ1² = σ2² vs. Ha: σ1² ≠ σ2²
Strengths: A very intuitive test.
Weaknesses: Less powerful than most tests. It has to assume that the population is normally distributed.



Two-Sample Test

Test statistic:

F_c = \frac{(n_1 - 1) S_1^2 / \left[ (n_1 - 1) \sigma_1^2 \right]}{(n_2 - 1) S_2^2 / \left[ (n_2 - 1) \sigma_2^2 \right]} = \frac{S_1^2}{S_2^2} \sim F_{(n_1 - 1,\, n_2 - 1)} \quad \text{under } H_o

Critical region: Reject Ho if F_c > F_{(\alpha,\, n_1 - 1,\, n_2 - 1)}.

Note: S_1^2 should be larger than S_2^2.
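A hedged sketch of this test, assuming the data have already been sorted by the chosen regressor and split into halves `y1, X1` and `y2, X2` (all names are placeholders); each half is fit by OLS and the residual variances are compared with an F-test:

```python
import statsmodels.api as sm
from scipy import stats

def two_sample_variance_test(y1, X1, y2, X2):
    """Fit separate OLS regressions to the two halves and compare residual variances."""
    fit1 = sm.OLS(y1, sm.add_constant(X1)).fit()
    fit2 = sm.OLS(y2, sm.add_constant(X2)).fit()
    s1, s2 = fit1.mse_resid, fit2.mse_resid          # residual mean squares (MSE)
    df1, df2 = fit1.df_resid, fit2.df_resid
    # place the larger variance in the numerator, as the one-sided F-test requires
    if s1 < s2:
        s1, s2 = s2, s1
        df1, df2 = df2, df1
    F = s1 / s2
    p_value = stats.f.sf(F, df1, df2)
    return F, p_value
```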



Goldfeld-Quandt Test

Goldfeld-Quandt Test
Description: The test involves the calculation of two least squares
regression lines, one using the data thought to be associated with
low variance errors and the other using data thought to be
associated with high variance errors.
If the residual variances associated with each regression line are
approximately the same, the homoskedasticity assumption cannot
be rejected. Otherwise, there is a problem of heteroskedasticity.
Strengths: A nonparametric version of the test is available that does not assume normality of the error terms.
Weaknesses: The error variance must be a monotonic function of the specified explanatory variable. For example, when faced with a quadratic function mapping the explanatory variable to the error variance, the Goldfeld-Quandt test may incorrectly fail to reject the null hypothesis of homoskedastic errors.



Goldfeld-Quandt Test

The following are the steps for doing a Goldfeld-Quandt test:
Plot the residuals against each Xj and choose the Xj with the most noticeable relation to the error variation.
Order the entire data set by the magnitude of the Xj chosen in the previous step.
Omit the middle d observations (the choice of the value of d is somewhat arbitrary, but it is often taken as 1/5 of the total observations unless n is very small).
Fit two separate regressions, the first (indicated by the subscript 1) for the portion of the data associated with low residual variance, and the second (indicated by the subscript 2) associated with high residual variance.



Goldfeld-Quandt Test

Calculate SSE1 and SSE2.
Calculate SSE2/SSE1; assuming that the error terms are normally distributed, this ratio follows an F-distribution with [(n − d − 2p)/2, (n − d − 2p)/2] degrees of freedom.
Reject the null hypothesis of constant variance at the chosen level of significance if the calculated statistic is greater than the critical value.
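For reference, a minimal sketch using the statsmodels implementation, assuming `y` and `X` hold the response and regressor matrix and that the data are already ordered by the chosen regressor; the fraction of middle observations dropped is an illustrative choice:

```python
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_goldfeldquandt

# Goldfeld-Quandt test: data assumed already sorted by the suspect regressor
X_const = sm.add_constant(X)
F_stat, p_value, _ = het_goldfeldquandt(y, X_const, drop=0.2, alternative='increasing')
print(f"F = {F_stat:.4f}, p-value = {p_value:.4f}")
```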



White’s Heteroskedasticity Test

White’s Heteroskedasticity Test


Description: The White’s Test for Heteroskedasticity is a more
general test that does not specify the nature of the
heteroskedasticity.
Strengths: This is one of the most popular tests for
heteroskedasticity (especially in economics). If the
heteroskedasticity is found in the data, a remedial measure is
already readily available that makes use of the result of this test.
You also do not need to specify the nature of heteroskedastciity.
Weaknesses: In cases where the White test statistic is statistically
significant, heteroskedasticity may not necessarily be the cause;
instead the problem could be a specification error. The method
also assumes that the error terms are normally distributed.



White’s Heteroskedasticity Test

The procedure is summarized as follows:
Run the regression equation and obtain the residuals, e_i.
Regress e_i² on the following regressors: a constant, the original regressors, their squares, and their cross products (optional).
Obtain R² and k, the number of regressors excluding the constant.
Compute nR² (the test statistic), where n is the number of observations; nR² ∼ χ²(k).
Choose the level of significance and, based on the p-value, decide whether or not to reject the null hypothesis of constant variance.
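A minimal sketch using the statsmodels helper, which builds the auxiliary regression internally (`y` and `X` are placeholder names for the response and the regressor matrix):

```python
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

X_const = sm.add_constant(X)
ols_fit = sm.OLS(y, X_const).fit()

# White's test: squared residuals regressed on the regressors,
# their squares, and their cross products
lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(ols_fit.resid, X_const)
print(f"nR^2 = {lm_stat:.4f}, p-value = {lm_pvalue:.4f}")
```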



Breusch-Pagan Test

Breusch-Pagan Test
Description: The Breusch-Pagan test is a test that assumes that
the error terms are normally distributed, with E (ϵi ) = 0 and
Var (ϵi ) = σi2 (i.e., nonconstant variance).
Strengths: A test that is much more focused on pure heteroskedasticity than the White test.
Weaknesses: Normality of the error terms is required in order to use the method.



Breusch-Pagan Test

The error variances are assumed to depend on the values of the regressor Xi in the following way:
log(σi²) = γ0 + γ1 Xi
We are interested in testing the null hypothesis of constant variance versus the alternative hypothesis of nonconstant variance. Specifically, the hypothesis test is formulated as:
Ho: γ1 = 0
Ha: γ1 ≠ 0
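For reference, a minimal sketch using the statsmodels implementation of the Breusch-Pagan test (again with placeholder names `y` and `X`):

```python
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

X_const = sm.add_constant(X)
ols_fit = sm.OLS(y, X_const).fit()

# Breusch-Pagan test: scaled squared residuals regressed on the regressors
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(ols_fit.resid, X_const)
print(f"LM statistic = {lm_stat:.4f}, p-value = {lm_pvalue:.4f}")
```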





Remedial Measures

One possible reason for gross nonnormality and/or heteroskedasticity is a violation of the other assumptions on the error terms.
Normality is sometimes violated because of inadequacy of the model fit, nonlinearity, and heteroskedasticity.
Constancy of variance is sometimes violated because of inadequacy of the model fit, nonlinearity, and nonnormality.
This also implies that the presence of one problem is usually associated with the presence of the other. A solution for one usually solves the other.
Advice of many statisticians: solve first the problem of nonlinearity and the possible reasons for inadequacy of the model.



Remedial Measures for Nonnormality

Solving the nonlinearity problem using transformation of the data. Most of the time, transforming Y is the solution. The Box-Cox transformation is a very reliable solution to nonnormality. (This transformation is discussed later in this chapter.)
Trying to include as many regressors as you can. One common behavior of the residuals is that they tend to become normally distributed when there are many good predictors in the model.
Dealing with outliers. Sometimes the source of nonnormality is simply the outliers, so try to deal with outliers too. We will discuss how to look for them and how to deal with them in Chapter 13.



Remedial Measures for Heteroskedasticity

Solving for nonlinearity through variable transformation. Oftentimes, the Box-Cox transformation works like a charm.
Dealing with outliers.
Generalized Least Squares (GLS)
To determine the BLUE of β, we use GLS instead of OLS.
The basic idea behind GLS is to transform the observation matrix so that the variance of the transformed model is I or σ²I.
Since V is positive definite, V⁻¹ is also positive definite, and there exists a nonsingular matrix W such that V⁻¹ = W′W.
Transforming the model Y = Xβ + ϵ by W yields WY = WXβ + Wϵ.



Remedial Measures for Heteroskedasticity
Weighted Least Squares (WLS) - a special version of the GLS
When V is given as

V = \begin{pmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^2 \end{pmatrix},

then

W = \begin{pmatrix} 1/\sigma_1 & 0 & \cdots & 0 \\ 0 & 1/\sigma_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1/\sigma_n \end{pmatrix}.

The larger σi is, the smaller the weight of that observation in the estimation of β.
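A minimal sketch of weighted least squares in statsmodels, assuming the standard deviations σi (or estimates of them, named `sigma_i` here) are available; WLS expects weights proportional to 1/σi²:

```python
import statsmodels.api as sm

# Weighted least squares: observations with large sigma_i receive small weight
X_const = sm.add_constant(X)
weights = 1.0 / sigma_i**2
wls_fit = sm.WLS(y, X_const, weights=weights).fit()
print(wls_fit.summary())
```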
Remedial Measures for Heteroskedasticity

Some possible variance and standard deviation function estimates include:
If a residual plot against a predictor exhibits a megaphone shape, then regress the absolute values of the residuals against that predictor. The resulting fitted values of this regression are estimates of σi.
If a residual plot against the fitted values exhibits a megaphone shape, then regress the absolute values of the residuals against the fitted values. The resulting fitted values of this regression are estimates of σi.



Remedial Measures for Heteroskedasticity

Some possible variance and standard deviation function estimates include:
If a plot of the squared residuals against a predictor exhibits an upward trend, then regress the squared residuals against that predictor. The resulting fitted values of this regression are estimates of σi².
If a plot of the squared residuals against the fitted values exhibits an upward trend, then regress the squared residuals against the fitted values. The resulting fitted values of this regression are estimates of σi².
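As an illustration of the second approach, a hedged sketch that estimates σi² by regressing the squared OLS residuals on the fitted values and then refits the model with WLS weights 1/σ̂i² (all variable names are placeholders):

```python
import numpy as np
import statsmodels.api as sm

X_const = sm.add_constant(X)
ols_fit = sm.OLS(y, X_const).fit()

# estimate the variance function: squared residuals regressed on the fitted values
aux = sm.OLS(ols_fit.resid**2, sm.add_constant(ols_fit.fittedvalues)).fit()
sigma2_hat = np.clip(aux.fittedvalues, a_min=1e-8, a_max=None)   # guard against nonpositive estimates

# refit with weights inversely proportional to the estimated variances
wls_fit = sm.WLS(y, X_const, weights=1.0 / sigma2_hat).fit()
print(wls_fit.summary())
```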



Remedial Measures for Heteroskedasticity

White’s Heteroskedasticity Consistent Covariance


Halbert White (1980) derived a heteroskedasticity consistent
covariance matrix estimator that provides “correct” estimates
of the coefficient covariances in the presence of
heteroskedasticity of unknown form.
The White covariance matrix is given by:
n
!
n ′ −1
X
VarW (β̂) = (X X ) ei xi xi (X ′ X )−1 .
2 ′
n−k
i=1

where n is the number of observations, k is the number of


regressors, and ei is the least square residuals.
The White’s consistent covariance estimators produce adjusted
standard errors that can be used to test the regression
coefficients (recall the t-test). These standard errors are
consistent, i.e., the estimated values are approaching the
“true” parameter values as the sample size becomes large.
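In statsmodels this estimator is available through the robust covariance options; a minimal sketch (the HC1 variant includes the n/(n − k) degrees-of-freedom correction shown above; `y` and `X` are placeholders):

```python
import statsmodels.api as sm

X_const = sm.add_constant(X)

# OLS point estimates with White's heteroskedasticity-consistent standard errors
robust_fit = sm.OLS(y, X_const).fit(cov_type='HC1')
print(robust_fit.summary())          # t-tests now use the adjusted standard errors
```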



The Box-Cox Transformation

Box-Cox transformations are a family of power transformations on Y such that Y′ = Y^λ, where λ is a parameter to be determined using the data. It is a data transformation technique used to stabilize the variance and make the data more nearly normally distributed.
The normal error regression model with a Box-Cox transformation is

Y_i^{\lambda} = \beta_0 + \beta_1 X_i + \epsilon_i



The Box-Cox Transformation

The estimation method of maximum likelihood can be used to estimate λ, or a simple search over a range of candidate values may be performed (e.g., λ = −4, −3.5, . . . , 3.5, 4.0).
The transformation of Y has the form

y^{(\lambda)} = \begin{cases} \dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\ \ln(y), & \lambda = 0 \end{cases}
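A minimal sketch using SciPy's maximum likelihood estimate of λ (the response vector `y` must be positive; the names are placeholders):

```python
from scipy import stats

# Box-Cox transformation with lambda chosen by maximum likelihood
y_transformed, lam = stats.boxcox(y)
print(f"Estimated lambda = {lam:.3f}")
# Refit the regression using y_transformed as the response, then recheck the residuals
```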

