Week 2 Lecture 2

The document discusses desirable properties of point estimators and the use of parametric and nonparametric techniques for testing the central location of a population. It highlights the advantages and disadvantages of the mean and median, and outlines the sign test and Wilcoxon signed ranks test for hypothesis testing regarding the population median. Additionally, it emphasizes the importance of normality assumptions and provides guidance on performing these tests using R software.


ECON20003: QM2

WEEK 2: DESIRABLE PROPERTIES OF POINT ESTIMATORS


PARAMETRIC AND NONPARAMETRIC TECHNIQUES
THE ASSUMPTION OF NORMALITY
References:
S: § 10.1
W: 3.7

Notes prepared by:


Dr László Kónya and
Dr Mehmet Özmen

Faculty of Business and Economics


Department of Economics
NONPARAMETRIC TESTS FOR A POPULATION
CENTRAL LOCATION

• For quantitative data the two most useful and popular measures of central location are the
arithmetic mean and the median (in this order).

The mean has two advantages over the median:


▪ The mean is a comprehensive measure because it is computed from all available data points,
while the median is based on at most two data points.
▪ The mean is used far more extensively in inferential statistics than the median.
However, occasionally the median also has some advantages:
▪ Since the median depends only on the middle value(s), it is not affected by outliers
(uncharacteristically small or large values), while the mean can be unduly influenced by
them.
▪ The median exists even if the measurement scale is just ordinal, but the mean does not.
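The outlier point above can be illustrated with a minimal Python sketch. The numbers are hypothetical, chosen only for illustration, and are not taken from the lecture.

```python
from statistics import mean, median

# Hypothetical sample (illustrative values only).
sample = [8, 9, 10, 11, 12]
# The same sample with the largest value replaced by an outlier.
with_outlier = [8, 9, 10, 11, 120]

# The mean uses every data point, so the outlier pulls it upwards.
mean_clean, mean_outlier = mean(sample), mean(with_outlier)
# The median depends only on the middle value, so it does not move.
median_clean, median_outlier = median(sample), median(with_outlier)

print(mean_clean, median_clean)
print(mean_outlier, median_outlier)
```

Here the mean jumps from 10 to 31.6, while the median stays at 10 in both samples.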

UoM, ECON 20003, Week 2 2


• A hypothesis about the central location of a quantitative population is usually best tested
with a t test for the population mean ($\mu$)…
…unless the t test is inappropriate, e.g. because the normality assumption is clearly violated.

In that case, one can instead use a nonparametric alternative for testing the central location of a
single population:

a) the sign test for the median;

b) the Wilcoxon signed rank test for the median.

a) (One sample) Sign test for the median ($\eta$) assumes:


i. The data is a random sample of independent observations.
ii. The variable of interest is qualitative or quantitative.
iii. The measurement scale is at least ordinal.

But, the sign test does not require any assumption about the distribution of the sampled
population.



The hypotheses are $H_0: \eta = \eta_0$ vs. $H_A: \eta > \eta_0$, $\eta < \eta_0$ or $\eta \neq \eta_0$.

This test is based on the signs of the observed non-zero deviations from
$\eta_0$, i.e. on the signs of $x_i - \eta_0 \neq 0$, $i = 1, 2, \ldots, n$.
Since the true median is right in the middle of an ordered data set, the numbers of negative and
positive deviations ($S^-$ and $S^+$) are expected to be about the same if the null hypothesis is correct,
i.e. if the population median is indeed $\eta_0$.
Let S denote the test statistic.
In essence, it could be either $S^-$ or $S^+$, but we arbitrarily choose $S = S^+$.
If H0 is true and the sample items are selected randomly, S follows a binomial distribution (see
Review 3) with parameters n and p = 0.5, i.e. $S \sim B(n, 0.5)$.



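For sufficiently large n ($np = nq = 0.5n \geq 5$, so $n \geq 10$), this binomial distribution (B) can be
approximated with a normal distribution (N) that has the same mean and standard deviation,
$\mu_S = 0.5n$ and $\sigma_S = 0.5\sqrt{n}$.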

Reject H0 if
(i) right-tail test: $p_R = P(S \geq S^+)$ is small,
(ii) left-tail test: $p_L = P(S \leq S^+)$ is small,
(iii) two-tail test: $2\min(p_R, p_L)$ is small.

b) (One sample) Wilcoxon signed ranks (sum) test for the median ($\eta$)

The sign test is based entirely on the signs of the deviations from $\eta_0$.

The Wilcoxon signed ranks test is a more sensitive and potentially
more powerful alternative because it also takes the magnitudes of these
deviations into consideration.



The Wilcoxon signed ranks test assumes that
i. The data is a random sample of independent observations.
ii. The variable of interest is quantitative and continuous.
iii. The measurement scale is interval or ratio.
iv. The distribution of the sampled population is symmetric ($\mu = \eta$).

The Wilcoxon signed ranks test has the same null and alternative hypotheses as the sign test,
but it is based on the signs and on the absolute values of the deviations, i.e. $|d_i| = |x_i - \eta_0|$,
$i = 1, 2, \ldots, n$.

Rank all non-zero |di| from smallest to largest and calculate the sum of the ranks assigned
to negative deviations (T−) and the sum of the ranks assigned to positive deviations (T+).

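The test statistic is $T = T^+$. Since the ranks 1, 2, …, n sum to $n(n+1)/2$, T can take any
value between 0 and $n(n+1)/2$.

When H0 is true, T is expected to be close to the middle of this interval, $n(n+1)/4$.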
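The ranking step can be sketched in Python as follows. The function name `signed_rank_sums` and the data are hypothetical, and tied absolute deviations are given the average of the ranks they occupy, which is a common convention assumed here rather than something stated in these notes.

```python
def signed_rank_sums(sample, eta0):
    """Return (T+, T-, n) for the one-sample Wilcoxon signed ranks test."""
    d = [x - eta0 for x in sample if x != eta0]   # non-zero deviations
    n = len(d)
    # Indices of the deviations, ordered by absolute value.
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the block of tied |d| values.
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1   # average of ranks i+1, ..., j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    t_plus = sum(r for r, di in zip(ranks, d) if di > 0)
    t_minus = sum(r for r, di in zip(ranks, d) if di < 0)
    return t_plus, t_minus, n

# Hypothetical data (not from the lecture); H0: median = 10.
sample = [13, 9, 15, 10, 16, 12, 7, 18]
t_plus, t_minus, n = signed_rank_sums(sample, 10)
print(t_plus, t_minus, n)
```

For these made-up values T+ = 23.5 and T− = 4.5 with n = 7, so the two sums add up to n(n+1)/2 = 28, and T+ lies well above the midpoint n(n+1)/4 = 14.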

The sampling distribution of T is non-standard, but lower and upper critical values (TL and TU)
for 6 ≤ n ≤ 30 are provided in Table 9, Appendix B of the Selvanathan book (p. 1110).
Using these critical values, reject H0 if
(i) right-tail test: $T \geq T_{U,\alpha}$,
(ii) left-tail test: $T \leq T_{L,\alpha}$,
(iii) two-tail test: $T \geq T_{U,\alpha/2}$ or $T \leq T_{L,\alpha/2}$.

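When H0 is true and there are more than 30 non-zero deviations
(i.e. n > 30), the sampling distribution of T can be approximated with a
normal distribution.
Namely,

$Z = \frac{T - \mu_T}{\sigma_T} \sim N(0, 1)$

with

$\mu_T = \frac{n(n+1)}{4}$ and $\sigma_T = \sqrt{\frac{n(n+1)(2n+1)}{24}}$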
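As a rough numerical illustration of the large-sample approximation, the sketch below standardises T using the usual mean n(n+1)/4 and standard deviation sqrt(n(n+1)(2n+1)/24) of the signed rank sum under H0. The function name `wilcoxon_z` and the values of n and T are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def wilcoxon_z(t_plus, n):
    """Standardise T = T+ with its mean and standard deviation under H0."""
    mu_t = n * (n + 1) / 4
    sigma_t = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (t_plus - mu_t) / sigma_t

# Hypothetical values (not from the lecture): n = 40 non-zero deviations.
n, t_plus = 40, 550
z = wilcoxon_z(t_plus, n)
# Right-tail p-value from the standard normal distribution.
p_right = 1 - NormalDist().cdf(z)
print(z, p_right)
```

With these made-up numbers the mean of T is 410, z is about 1.88, and the right-tail p-value is roughly 0.03.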

(Ex 1)
c) Perform the sign test and the Wilcoxon signed ranks test at the 5% level of
significance with R.
The original null and alternative hypotheses are $H_0: \mu = 10$ and $H_A: \mu > 10$,
but since these nonparametric tests focus on the median rather than on
the mean, we rewrite them as $H_0: \eta = 10$ and $H_A: \eta > 10$.
The SignTest function of the DescTools package generates the following printout:

R reports the test statistic (S), the number of non-zero differences, the p-value,
the alternative hypothesis, and the sample median.
Check whether R performed the appropriate test (i.e. a right-tail sign test this
time) and whether the p-value < $\alpha$ = 0.05. Since p-value = 0.2692 > 0.05, we
maintain H0 at the 5% significance level.
Hence, at the 5% significance level the sign test does not provide sufficient
evidence in favour of the alternative hypothesis that the median Australian is
more than 10 kg overweight.



The [Link] function of the exactRankTests package generates the following printout:

R reports the test statistic (V), the p-value, the alternative hypothesis, and
a variant of the sample median.
Since p-value $\approx$ 0.019 < 0.05, unlike the sign test, the Wilcoxon signed ranks
test rejects H0 at the 5% significance level.

Given these contradicting outcomes, recall that the Wilcoxon test assumes that
the population is symmetrical. This assumption, however, is not supported by
the sample (see the normality checks on slide 18), so we had better rely on the
sign test this time.



Note: Similarly to the Wilcoxon signed ranks test, many other nonparametric
tests assume that the underlying variable of interest is continuous.
This assumption is primarily required to exclude the possibility of ties
and it is necessary for exact hypothesis tests.

Still, these procedures are often used in practice even if the variable of
interest is discrete, or is reported on an ordinal scale, but

i. there are a large number of different values,


or
ii. the sample size is large enough to approximate the
discrete sampling distribution of the test statistic with some
continuous probability distribution.



FLOW CHART FOR TESTING THE CENTRAL
LOCATION OF A SINGLE POPULATION

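Type of data

▪ Quantitative, measured on a ratio or interval scale: is $\sigma$ known or estimated?
   σ known: is X-bar ~ N?
      Yes: Z-test for $\mu$
      Not: Sign test, Wilcoxon signed ranks test for $\eta$
   σ estimated: is X ~ N?
      Yes: t-test for $\mu$
      Not: Sign test, Wilcoxon signed ranks test for $\eta$
▪ Qualitative / categorical: what is the measurement scale?
   Ordinal: Sign test, Wilcoxon signed ranks test for $\eta$
   Nominal: Neither $\mu$ nor $\eta$ exist

Note: the data is supposed to be a random sample of independent observations.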
WHAT SHOULD YOU KNOW?

• Desirable statistical properties of point estimators:


linearity, unbiasedness, efficiency and consistency.
• To verify whether a sample might have been drawn from a normally
distributed population using graphs, numerical descriptive measures
and the Shapiro-Wilk test.
• Difference between parametric and nonparametric tests.
• To perform the (one sample) Sign test and Wilcoxon signed ranks test
for the population median manually and with R/RStudio.
