
Hypothesis Testing Notes


A LEVEL STATISTICS NOTES COMPILED BY MANYUVIRE D CELL 0783235483

STATISTICAL INFERENCE
SIGNIFICANCE TESTING; HYPOTHESIS TESTING

Null hypothesis, alternative hypothesis, test statistics, significance level, hypothesis tests (1-tail and 2-tail), Type 1 and Type 2 errors, z-tests, t-tests
SYLLABUS OBJECTIVES (6046)

➢ formulate hypotheses
➢ distinguish between a type 1 and a type 2 error
➢ compute probabilities of making type 1 and type 2 errors
➢ apply a hypothesis test in the context of a single observation from a population which has a binomial distribution, using either the binomial distribution or the normal approximation to the binomial distribution
➢ formulate hypotheses and apply a hypothesis test concerning the population mean using a sample drawn from a normal distribution of known variance, using the normal distribution
➢ describe the characteristics of a t distribution
➢ apply a hypothesis test concerning the population mean using a small sample drawn from a normal distribution of unknown variance, using a t-test

➢ DEFINITION OF HYPOTHESIS

- A hypothesis is a claim or a statement made about the value of a population parameter, which may or may not be true.
- The decision-making process for evaluating claims about a population is called hypothesis testing.

➢ TYPES OF HYPOTHESES

➢ Null hypothesis (H0)
- It is a statement or claim about a population parameter that is assumed to be true.
- The null hypothesis is a starting point.
- It is a statistical hypothesis that states that there is no difference between a parameter and a specific value, or that there is no difference between two parameters.
- It is the hypothesis that we assume to be correct unless proved otherwise.
➢ Alternative hypothesis (H1)
- We state what we think is wrong about the null hypothesis in an alternative hypothesis.
- It is a statistical hypothesis that states the existence of a difference between a parameter and a specific value, or states that there is a difference between two parameters.
- It is a negation of the null hypothesis.
- It uses the inequality signs >, < and ≠.

➢ FORMULATING HYPOTHESES

- When given any situation, it is very important to formulate the hypotheses correctly.
- Consider the following when deciding on your null hypothesis:
1. The null hypothesis always has an element of equality, i.e. the signs =, ≤ or ≥ are used.
2. The null hypothesis is usually an expression of a claim made by someone, provided the claim includes an element of equality; if it does not, the claim becomes the alternative hypothesis.
3. The null hypothesis is the hypothesis we form with the hope of rejecting it.

➢ ERRORS
➢ Type 1 error
- is the error of rejecting H0 when H0 is actually true.
➢ Type 2 error
- consists of not rejecting H0 when H0 is false.
The errors can be summarised in the following table:

                 DECISION
REALITY        | ACCEPT H0                          | REJECT H0
H0 IS TRUE     | Correct decision                   | H0 wrongly rejected: Type I error
H0 IS FALSE    | H0 wrongly accepted: Type II error | Correct decision

➢ Probability of making type 1 error
- P(Type I) = P(reject H0 when H0 is true)
➢ Probability of making type 2 error
- In order to calculate the probability of a Type II error you need to know the actual value of the parameter p.
- P(Type II) = P(accept H0 when H0 is false) = P(accept H0 when H1 is true)
EXAMPLES
1. One rainy day during the summer holidays, a family of four were
playing a simple game of cards. The game was one of chance so the
probability of any particular person winning should have been 0,25.
After playing a number of games, Robert complained that his younger
sister Sarah must have been cheating as she kept winning. Their
parents quickly intervened and decided to carry out a proper
investigation and carefully watched the next 20 games.
a) Given that the critical region is chosen to be X ≥ 9, find the probability of a Type I error.
b) If in fact Sarah was cheating and p = 0.35, find the probability of a Type II error.

SOLUTION
a) Let X be the number of games, out of 20, that Sarah wins.
H0 : p = 0,25 H1 : p > 0,25
Critical region: X ≥ 9
P(Type I error) = P(rejecting H0 when H0 is true)
= P(X ≥ 9 | X ∼ B(20; 0,25))
= 1 − P(X ≤ 8)
= 1 − [P(X=0) + P(X=1) + … + P(X=8)]
= 1 − [20C0 (0,25)^0 (0,75)^20 + 20C1 (0,25)^1 (0,75)^19 + … + 20C8 (0,25)^8 (0,75)^12]
= 0.0409

b) H0 is rejected if X ≥ 9, therefore to accept H0 we need X ≤ 8.
P(Type II error) = P(accepting H0 when H0 is false)
= P(X ≤ 8 | H0 is false)
Given that p = 0.35, P(Type II error) = P(X ≤ 8 | X ∼ B(20; 0,35))
= P(X=0) + P(X=1) + … + P(X=8)
= 0.7624
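
The two probabilities in Example 1 can be checked numerically. The following is a minimal sketch using only Python's standard library; the helper names `binom_pmf` and `binom_cdf` are illustrative and are not part of the notes.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n = 20
# Critical region X >= 9, so H0 is accepted when X <= 8.
type_1 = 1 - binom_cdf(8, n, 0.25)   # reject H0 although p = 0.25 is true
type_2 = binom_cdf(8, n, 0.35)       # accept H0 although p = 0.35 is true

print(round(type_1, 4), round(type_2, 4))
```

The same two helpers can be reused for any one-tailed binomial test by changing n, the critical value and the two values of p.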

2. It is known that 60% of the moths of a certain species are red; the rest
are yellow. A biologist finds a new colony of these moths and observes
that more of them seem to be red than she would expect. She designs
an experiment in which she will catch 10 moths at random, observe
their colour and then release them.
a) Given that the critical region is chosen to be X ≥ 9, find the probability of making a type 1 error.
b) If the proportion of red moths is 80%, find the probability of a type II error.

SOLUTION
a) H0 : p = 0,6 H1 : p > 0,6
Critical region: X ≥ 9
P(Type I error) = P(rejecting H0 when H0 is true)
= P(X ≥ 9 | X ∼ B(10; 0,6))
= P(X=9) + P(X=10)
= 10C9 (0,6)^9 (0,4) + 10C10 (0,6)^10
= 0,046357401
= 0,0464 to 3 s.f.

b) H0 is rejected if X ≥ 9, therefore to accept H0 we need X ≤ 8.
P(Type II error) = P(accepting H0 when H0 is false)
= P(X ≤ 8 | H0 is false)
Given that p = 0,8, P(Type II error) = P(X ≤ 8 | X ∼ B(10; 0,8))
P(X ≤ 8) = 1 − [P(X=9) + P(X=10)]
= 1 − 0,375809638
= 0,624190362
= 0,624 to 3 s.f.
3. The random variable X is binomially distributed. A sample of 10 is taken, and it is desired to test H0 : p = 0.45 against H1 : p ≠ 0.45. The critical regions for this test are X ≤ 1 and X ≥ 9.
a) Calculate the probability of a Type I error for this test.
b) Given that the true probability was later found to be 0.40, calculate the probability of a Type II error.

SOLUTION
a) H0 : p = 0,45 H1 : p ≠ 0,45
Critical region: X ≤ 1 or X ≥ 9
P(Type I error) = P(rejecting H0 when H0 is true)
= P(X ≥ 9 | X ∼ B(10; 0,45)) + P(X ≤ 1 | X ∼ B(10; 0,45))
= P(X=9) + P(X=10) + P(X=0) + P(X=1)
= 10C9 (0,45)^9 (0,55) + 10C10 (0,45)^10 + 10C0 (0,55)^10 + 10C1 (0,45)^1 (0,55)^9
= 0,004161743 + 0,000340506 + 0,002532951 + 0,020724149
= 0,027759351
= 0,0278 to 3 s.f.

b) H0 is rejected if X ≤ 1 or X ≥ 9, therefore to accept H0 we need X ≥ 2 and X ≤ 8, i.e. 2 ≤ X ≤ 8.
P(Type II error) = P(accepting H0 when H0 is false)
P(Type II error) = P(2 ≤ X ≤ 8 | X ∼ B(10; 0,4))
= P(X=2) + P(X=3) + P(X=4) + P(X=5) + P(X=6) + P(X=7) + P(X=8)
= 10C2 (0,4)^2 (0,6)^8 + 10C3 (0,4)^3 (0,6)^7 + 10C4 (0,4)^4 (0,6)^6 + 10C5 (0,4)^5 (0,6)^5 + 10C6 (0,4)^6 (0,6)^4 + 10C7 (0,4)^7 (0,6)^3 + 10C8 (0,4)^8 (0,6)^2
= P(X ≤ 8) − P(X ≤ 1)
= 0,998322278 − 0,046357402
= 0,951964877
= 0,952 to 3 s.f.
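
For a two-tailed test the critical region has two parts, so both error probabilities are sums over a set of values. The sketch below assumes the critical regions X ≤ 1 and X ≥ 9 used in the solution to Example 3; the variable names are illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 10
critical_region = [0, 1, 9, 10]   # X <= 1 or X >= 9 (assumed from the solution)
acceptance_region = [x for x in range(n + 1) if x not in critical_region]

# Type I: X lands in the critical region although p = 0.45 (H0 true)
type_1 = sum(binom_pmf(x, n, 0.45) for x in critical_region)
# Type II: X lands in the acceptance region although p = 0.40 (H0 false)
type_2 = sum(binom_pmf(x, n, 0.40) for x in acceptance_region)

print(round(type_1, 4), round(type_2, 4))
```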

FOLLOW UP EXERCISE
1. The random variable X is binomially distributed. A sample of 10 is
taken, and it is desired to test H0 : p = 0.25 against H1 : p >
0.25.The critical region for this test is X .
a) Calculate the probability of a Type I error for this test
b) Given that the true value of p was later found to be 0.30,
calculate the probability of a Type II error.
2. The random variable X is binomially distributed. A sample of 20 is
taken, and it is desired to test H0 : p = 0.30 against H1 : p < 0.30,
and the critical region for this test is
a)Calculate the probability of a Type I error for this test
b)Given that the true probability was later found to be 0.25,
calculate the probability of a Type II error.

➢ FINDING TYPE I AND TYPE II ERRORS USING THE NORMAL DISTRIBUTION

- If you are carrying out a hypothesis test for the mean of a normal distribution with variance σ², the sample mean X̄ for a sample of size n has variance σ²/n.
- When a continuous distribution such as the normal distribution is used, P(Type I error) is equal to the significance level of the test.
- For the Type II error you need the critical value in order to get the critical region.

EXAMPLES

1. The weight of jam in a jar, measured in grams, is distributed normally with a mean of 150 g and a standard deviation of 6 g. The production process occasionally leads to a change in the mean weight of jam per jar but the standard deviation remains unaltered. The manager monitors the production process and for every new batch takes a random sample of 25 jars and weighs their contents to see if there has been any reduction in the mean weight of jam per jar.
a) Using a 5% level of significance, calculate the probability of a Type II error given that the true value of μ for the new batch is in fact 147.

SOLUTION
H0 : μ = 150
H1 : μ < 150 (i.e. a one-tailed test)
Under H0, X̄ ~ N(150; 6²/25)

The 5% critical region for Z is Z < −1.6449, so reject H0 if (x̄ − 150)/(6/√25) < −1,6449,
i.e. if x̄ < 150 − 1,6449 × 1,2 = 148.02612
H0 is rejected if x̄ < 148.02612, therefore to accept H0 we need x̄ ≥ 148.02612.

P(Type II error) = P(X̄ ≥ 148.02612 when μ = 147)
= P(Z > (148.02612 − 147)/(6/√25))
= P(Z > 0,8551)
= 1 − Φ(0,8551)
= 1 − 0,8037
= 0.196 to 3 s.f.
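
The jam calculation can be reproduced numerically. This is a minimal sketch using `statistics.NormalDist` from Python's standard library; the variable names are illustrative and not part of the notes.

```python
from statistics import NormalDist

mu0, sigma, n = 150, 6, 25
se = sigma / n**0.5                      # standard deviation of the sample mean, 1.2

z_crit = NormalDist().inv_cdf(0.05)      # lower 5% point of N(0, 1), about -1.6449
x_crit = mu0 + z_crit * se               # reject H0 when the sample mean is below this

true_mu = 147
# Type II error: the sample mean falls in the acceptance region although mu = 147
type_2 = 1 - NormalDist(true_mu, se).cdf(x_crit)

print(round(x_crit, 3), round(type_2, 3))
```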

2. Bags of sugar having a nominal weight of 1 kg are filled by a machine. From past experience it is known that the weight, X kg, of sugar in the bags is normally distributed with a standard deviation of 0.04 kg. At the beginning of each week a random sample of 10 bags is taken in order to see if the machine needs to be reset. A test is then done at the 5% significance level with H0 : μ = 1.00 kg and H1 : μ ≠ 1.00 kg.
a) Assuming that the mean weight has in fact changed to 1.02 kg, find P(Type II error) for this test.

SOLUTION
H0 : μ = 1,00 kg and H1 : μ ≠ 1,00 kg (i.e. a two-tailed test)
Under H0, X̄ ~ N(1,00; 0,04²/10)
The 5% critical region for Z is Z < −1,96 or Z > 1,96 (2,5% in each tail),
so reject H0 if (x̄ − 1,00)/(0,04/√10) < −1,96 or (x̄ − 1,00)/(0,04/√10) > 1,96,
i.e. if x̄ < 1,00 − 1,96 × 0,04/√10 = 0,9752
or x̄ > 1,00 + 1,96 × 0,04/√10 = 1,0248

H0 is rejected if x̄ < 0,9752 or if x̄ > 1,0248, therefore to accept H0 we need x̄ ≥ 0,9752 and x̄ ≤ 1,0248 when μ = 1,02,
i.e. 0,9752 ≤ x̄ ≤ 1,0248
P(Type II error) = P(accepting H0 when H0 is false)
P(Type II error) = P(0,9752 ≤ X̄ ≤ 1,0248 | X̄ ∼ N(1,02; 0,04²/10))
= P((0,9752 − 1,02)/(0,04/√10) < Z < (1,0248 − 1,02)/(0,04/√10))
= P(−3,54 < Z < 0,379)
= Φ(0,379) − [1 − Φ(3,54)]
= 0,6477 − 1 + 0,9998
= 0,6475 to 4 d.p.
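
For a two-tailed test the acceptance region is an interval, so the Type II error is the probability that the sample mean lands between the two critical values under the true mean. A minimal sketch with the sugar data, assuming Python's `statistics.NormalDist`; the names are illustrative. The result should be about 0.65.

```python
from statistics import NormalDist

mu0, sigma, n = 1.00, 0.04, 10
se = sigma / n**0.5                      # standard deviation of the sample mean

z = NormalDist().inv_cdf(0.975)          # upper 2.5% point of N(0, 1), about 1.96
lower, upper = mu0 - z * se, mu0 + z * se   # H0 is accepted between these values

# Distribution of the sample mean when the true mean is 1.02 kg
sampling = NormalDist(1.02, se)
type_2 = sampling.cdf(upper) - sampling.cdf(lower)

print(round(lower, 4), round(upper, 4), round(type_2, 3))
```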

FOLLOW UP TASK

1. The random variable X ∼ N(μ, 3²). A random sample of 20 observations of X is taken, and the sample mean is taken to be the test statistic. It is desired to test H0 : μ = 50 against H1 : μ > 50, using a 1% level of significance. Given that the true mean was later found to be 53, find the probability of a Type II error.

2. The random variable X ∼ N(μ, 2²). A random sample of 16 observations of X is taken, and the sample mean is taken to be the test statistic. It is desired to test H0 : μ = 30 against H1 : μ < 30, using a 5% level of significance. Given that the true mean was later found to be 28.5, find the probability of a Type II error.

3. A manufacturer claims that the average outside diameter of a particular washer produced by his factory is 15 mm. The diameter is assumed to be normally distributed with a standard deviation of 1 mm. The manufacturer decides to take a random sample of 25 washers from each day's production in order to monitor any changes in the mean diameter. Using a significance level of 5%, and given that the average diameter had in fact increased to 15.6 mm, find the probability of a Type II error.

➢ SIGNIFICANCE LEVEL

- The level of significance, or significance level, refers to a criterion of judgment upon which a decision is made regarding the value stated in a null hypothesis.
- It is the maximum probability of making a type I error and is denoted by α.
- Thus a test with α = 0.01 is said to have a significance level of 0.01.
- When α = 0.10, there is a 10% chance of rejecting a true null hypothesis.
- Statisticians generally agree on using three arbitrary significance levels: the 0.10, 0.05 and 0.01 levels.

➢ TEST STATISTIC

- It is a mathematical formula that allows researchers to determine the likelihood of obtaining sample outcomes if the null hypothesis were true.
- The value of the test statistic is used to make a decision regarding the null hypothesis.

➢ P-VALUE

- It is the probability of obtaining a sample outcome, given that the value stated in the null hypothesis is true.
- It is the probability, assuming that the null hypothesis is true, of getting a value of the test statistic at least as extreme as the computed value for the test.
- The p-value is determined by the data and is related to the actual probability of making a Type I error (rejecting a true null hypothesis).
- The p-value for obtaining a sample outcome is compared to the level of significance.
- Researchers make decisions regarding the null hypothesis.
- The decision to reject or retain the null hypothesis is called significance.
- The decision can be to retain the null (p > α) or reject the null (p < α).

➢ CRITICAL REGION

- It is the range of values of the test value that indicates that there is a significant difference and that the null hypothesis should be rejected.
- It is the region of the probability distribution function of the test statistic that would allow the null hypothesis to be rejected.

➢ CRITICAL VALUE

- The critical value separates the critical region from the noncritical region.
- The symbol for critical value is C.V.

➢ ONE TAILED AND TWO TAILED TESTS

➢ One tailed tests
- A one-tailed test is either a right-tailed test or a left-tailed test, depending on the direction of the inequality of the alternative hypothesis.
- These are tests where the alternative hypothesis involves >, indicating that you are looking for an increase in p, or <, indicating that you are looking for a decrease in p.

➢ Two tailed tests
- A two-tailed test involves critical regions on both sides. It is not specified whether there is an increase or a decrease. The alternative hypothesis involves the inequality ≠.
- In a two-tailed test, the null hypothesis should be rejected when the test value is in either of the two critical regions.

➢ PROCEDURES FOR HYPOTHESIS TESTING

➢ Step 1: State the hypotheses.
➢ Step 2: Set the criteria for a decision.
- To set the criteria for a decision, we state the level of significance for the test.
➢ Step 3: Compute the test statistic.
➢ Step 4: Make a decision.
1. If the test statistic lies in the rejection region, reject H0 (critical value method).
2. If the p-value < α, reject H0 (p-value method).
- The p-value method of comparison is preferred to the critical value method because the rule is the same for all statistical models: reject H0 if the p-value < α.

➢ HYPOTHESIS TESTING INVOLVING BINOMIAL DISTRIBUTIONS

- If the proportion of successful outcomes in the population is p, then the test statistic X ~ B(n, p), where n is the number in the sample.
- If the observed value x falls in the critical region then, for a one-tailed test, P(X ≤ x) ≤ α or P(X ≥ x) ≤ α, depending on the alternative hypothesis, and for a two-tailed test, P(X ≤ x) ≤ α/2 or P(X ≥ x) ≤ α/2.
- The normal approximation to the binomial can be used where n is large, with np > 5 and npq > 5 (q = 1 − p): if X ~ B(n, p) then approximately X ~ N(np; npq).
- When the normal approximation is used, remember to apply a continuity correction when calculating the p-value.

➢ STAGES FOR BINOMIAL

1. Define the variable, i.e. X ~ Bin(n; p)
2. State the hypotheses H0 and H1
3. State the distribution according to H0, i.e. if H0 is true, then X ~ Bin(n; p)
4. State the significance level and identify the type of test
5. State the rejection criteria.
6. Calculate the required probability (p-value)
7. Make a conclusion.

➢ HYPOTHESIS TESTING FOR BINOMIAL : WORKED EXAMPLES

1. The standard treatment for a particular disease has a 0,4 probability of success. A certain doctor has undertaken research in this area and has produced a new drug which has been successful with 11 out of 20 patients. The doctor claims that the new drug represents an improvement on the standard treatment. Test the claim at the 5% significance level.

SOLUTIONS
Step 1: Let X be the number of successful treatments, therefore X ~ Bin(20; p)
Step 2: H0: p = 0,4 H1: p > 0,4
Step 3: If H0 is true, then X ~ Bin(20; 0,4)
Step 4: This is a right-tailed test at α = 0,05
Step 5: Reject H0 if P(X ≥ 11) ≤ 0,05
Step 6: Calculate the p-value.
P(X ≥ 11) = P(X=11) + P(X=12) + … + P(X=20)
= 20C11 (0,4)^11 (0,6)^9 + 20C12 (0,4)^12 (0,6)^8 + … + 20C20 (0,4)^20
= 0,12752016
= 0,1275
Step 7: Since P(X ≥ 11) > 0,05 we fail to reject H0 and conclude that there is insufficient evidence that the new drug represents an improvement on the standard treatment.
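
Steps 5 to 7 of the drug example amount to computing an upper-tail binomial probability and comparing it with α. A minimal sketch, using only the standard library; `upper_tail` is an illustrative helper name.

```python
from math import comb

def upper_tail(x, n, p):
    """P(X >= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

alpha = 0.05
p_value = upper_tail(11, 20, 0.4)   # probability of 11 or more successes under H0
reject_h0 = p_value <= alpha        # right-tailed decision rule from Step 5

print(round(p_value, 4), reject_h0)
```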

2. Hester suggested that a die was biased in favour of 4. She threw the die 15 times and obtained a four on 6 occasions. Carry out a test at the 5% significance level.

SOLUTIONS
Step 1: Let X be the number of fours obtained in 15 throws, therefore X ~ Bin(15; p)
Step 2: H0: p = 1/6 H1: p > 1/6
Step 3: If H0 is true, then X ~ Bin(15; 1/6)
Step 4: This is a right-tailed test at α = 0,05
Step 5: Reject H0 if P(X ≥ 6) ≤ 0,05
Step 6: Calculate the p-value.
P(X ≥ 6) = 1 − P(X ≤ 5) = 1 − [P(X=0) + … + P(X=5)]
= 1 − [15C0 (5/6)^15 + 15C1 (1/6)(5/6)^14 + 15C2 (1/6)^2 (5/6)^13 + 15C3 (1/6)^3 (5/6)^12 + 15C4 (1/6)^4 (5/6)^11 + 15C5 (1/6)^5 (5/6)^10]
= 1 − 0,9726
= 0,0274
Step 7: Since P(X ≥ 6) < 0,05 we reject H0 and conclude that there is evidence that the die is biased in favour of 4.
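
Because the cumulative probability P(X ≤ 5) and the tail probability P(X ≥ 6) are easy to confuse, it can help to compute both. The sketch below does this for the die example under H0 (p = 1/6); the variable names are illustrative.

```python
from math import comb

n, p0, observed = 15, 1 / 6, 6

# Cumulative probability of 5 or fewer fours under H0
cdf_5 = sum(comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(observed))

# The p-value for H1: p > 1/6 is the upper tail, P(X >= 6) = 1 - P(X <= 5)
p_value = 1 - cdf_5
reject_h0 = p_value <= 0.05

print(round(cdf_5, 4), round(p_value, 4), reject_h0)
```

The quantity compared with 0.05 is the tail probability, not the cumulative probability.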

3. A coin is tossed 20 times and a head is obtained on 6 occasions. Is there evidence that the coin is biased? Test at the 5% significance level.

SOLUTIONS
Step 1: Let X be the number of heads obtained in 20 tosses, therefore X ~ Bin(20; p)
Step 2: H0: p = 1/2 H1: p ≠ 1/2
Step 3: If H0 is true, then X ~ Bin(20; 1/2)
Step 4: This is a two-tailed test at α = 0,05, i.e. 0,025 in each tail
Step 5: Reject H0 if P(X ≤ 6) ≤ 0,025 or P(X ≥ 6) ≤ 0,025
Step 6: Calculate the p-value. Since 6 is below the expected number of heads (10), the lower tail is used.
P(X ≤ 6) = P(X=0) + … + P(X=6)
= (20C0 + 20C1 + 20C2 + 20C3 + 20C4 + 20C5 + 20C6)(1/2)^20
= 0,05766
Step 7: Since P(X ≤ 6) > 0,025 we fail to reject H0 and conclude that there is no evidence that the coin is biased.

4. A manager thinks that his sales staff make a sale to 45% of customers entering their shop. He randomly selects 100 customers. Of the 100 customers, 35 were sold something. Using a suitable approximation, test at the 5% level of significance whether or not what the manager thinks is justified.

SOLUTION
Step 1: Let X be the number of customers, out of 100, to whom a sale is made, therefore X ~ Bin(100; p)
Step 2: H0: p = 0,45 H1: p ≠ 0,45
Step 3: If H0 is true, then X ~ Bin(100; 0,45)
Since n is large and np > 5 and npq > 5, approximately X ~ N(45; 24,75)
Step 4: This is a two-tailed test at α = 0,05, i.e. 0,025 in each tail
Step 5: Reject H0 if P(X ≤ 35) ≤ 0,025 or P(X ≥ 35) ≤ 0,025
Step 6: Calculate the p-value.
P(X ≤ 35) ≈ P(X < 35,5), applying the continuity correction
= P(Z < (35,5 − 45)/√24,75)
= P(Z < −1,91)
= 1 − 0,9719
= 0,0281
Step 7: Since P(X ≤ 35) > 0,025 we fail to reject H0 and conclude that there is no evidence that the manager is wrong.
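
The normal approximation with continuity correction in Example 4 can be sketched as follows, again assuming only Python's standard library; the variable names are illustrative.

```python
from statistics import NormalDist

n, p0, x = 100, 0.45, 35
mean = n * p0                        # np = 45
variance = n * p0 * (1 - p0)         # npq = 24.75
approx = NormalDist(mean, variance**0.5)

# Continuity correction: P(X <= 35) for the count becomes P(Y < 35.5) for the normal Y
p_lower = approx.cdf(x + 0.5)
reject_h0 = p_lower <= 0.025         # two-tailed test at 5%: 2.5% in each tail

print(round(p_lower, 4), reject_h0)
```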

➢ HYPOTHESIS TESTING INVOLVING NORMAL DISTRIBUTION WITH KNOWN VARIANCE (Z-TEST)

- The distribution of the means of samples of size n from a normal population N(μ, σ²) is also normal; its mean is μ and its standard deviation is σ/√n.
- If X ~ N(μ, σ²) then X̄ ~ N(μ, σ²/n).
- Hypothesis testing for a population mean can be done using Z tests.
- The test statistic is Z and is computed from the sample data.
- Z = (x̄ − μ)/(σ/√n), where x̄ is the sample mean, μ is the hypothesised population mean, σ is the population standard deviation and n is the sample size.
- If the variance is known, the Z test is used when n ≥ 30, or when n < 30 provided the distribution is normal.
- The Z value is compared to the critical value for z given in the table.

STAGES IN HYPOTHESIS TESTING


1. State hypotheses
2.Identify critical value
3.Compute test value
4.Make decision to reject or not to reject
5.Conclude the results.

EXAMPLES
1. The weights of students enrolling at a college have a standard
deviation of 7,5kg and mean of 70 kg. A random sample of 90 students
from the new entry was weighed and their mean weight was 71,6kg.
a) Assuming that the standard deviation has not changed, test at the 5% significance level whether there is evidence that the mean is more than 70 kg.
b) What is the importance of the central limit theorem in your test?

SOLUTION
a) Step 1: H0: μ = 70 H1: μ > 70
Step 2: Since α = 0.05 and the test is a right-tailed test, the critical value is zcrit = 1.645. Reject H0 if zcal > 1,645
Step 3: Compute the test value Z
zcal = (x̄ − μ)/(σ/√n) = (71,6 − 70)/(7,5/√90) = 2,0239

Step 4: Since zcal > 1,645 we reject H0 and conclude that there is evidence that the mean is more than 70 kg.
b) The Central Limit Theorem is used to assume that X̄ (which is the mean weight of 90 students) is normally distributed.

2. Observations over a long period of time have shown that the mass of
adult males of a type of bat is normally distributed with mean 110 g
and standard deviation 10 g. A scientist has a theory that in one area
these bats are becoming smaller, possibly as an adaptation to changes
in their environment. He plans to trap 20 adult male bats, weigh them
and then release them. The mean mass of the trapped bats is 108 g and the standard deviation is assumed to have remained unaltered. Use the data to carry out a suitable hypothesis test at the 5% significance level.

SOLUTION
Step 1: H0: μ = 110 H1: μ < 110
Step 2: Since α = 0.05 and the test is a left-tailed test, the critical value is zcrit = −1.645. Reject H0 if zcal < −1,645
Step 3: Compute the test value Z
zcal = (x̄ − μ)/(σ/√n) = (108 − 110)/(10/√20) = −0,8944

Step 4: Since zcal > −1,645 we fail to reject H0 and conclude that there is no evidence that the mean mass has reduced.
3. A machine produces bolts of diameter D and D is normally distributed
with mean 0,58cm and standard deviation 0,015cm. The machine is
serviced and after the service a random sample of 50 bolts from the
production run is taken to see if the mean diameter of the bolts has
changed from 0,58cm. The distribution of the diameters after the
service is still normal with a standard deviation of 0,015cm. The mean
diameter of the 50 bolts is 0,577cm. Test at 1% whether the mean
diameter of the bolts has changed.

SOLUTION
Step 1: H0: μ = 0,58 H1: μ ≠ 0,58
Step 2: Since the test is a two-tailed test at α = 0,01, α/2 = 0.005 and the critical values are zcrit = −2,576 and 2,576. Reject H0 if zcal < −2,576 or zcal > 2,576
Step 3: Compute the test value Z
zcal = (x̄ − μ)/(σ/√n) = (0,577 − 0,58)/(0,015/√50) = −1,414

Step 4: Since −2,576 < zcal < 2,576 we fail to reject H0 and conclude that there is no evidence that the mean diameter has changed.
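
The z-test examples all use the same statistic, so they can be checked with one small function. This is a sketch only: `z_statistic` is an illustrative name, and the critical values are taken from `statistics.NormalDist` rather than from printed tables.

```python
from statistics import NormalDist

def z_statistic(xbar, mu0, sigma, n):
    """Z = (sample mean - hypothesised mean) / (sigma / sqrt(n))."""
    return (xbar - mu0) / (sigma / n**0.5)

std_normal = NormalDist()

# Example 1 (student weights): right-tailed test at 5%
z1 = z_statistic(71.6, 70, 7.5, 90)
reject_1 = z1 > std_normal.inv_cdf(0.95)        # critical value about 1.645

# Example 3 (bolt diameters): two-tailed test at 1%
z3 = z_statistic(0.577, 0.58, 0.015, 50)
reject_3 = abs(z3) > std_normal.inv_cdf(0.995)  # critical value about 2.576

print(round(z1, 4), reject_1, round(z3, 3), reject_3)
```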

➢ Z TEST WHEN VARIANCE IS UNKNOWN AND SAMPLE LARGE

- When the variance is unknown and the sample is large, the test statistic is given by Z = (x̄ − μ)/(s/√n), where s is an estimate of σ.
- Unbiased estimates of the population parameters are used.
- x̄ = Σx/n and s² = (1/(n − 1))(Σx² − (Σx)²/n)

EXAMPLES
1. The packaging on a type of electric light bulb states that the average
lifetime of the bulbs is 1000 hours. A consumer association thinks that
this is an overestimate and tests a random sample of 64 bulbs,
recording the lifetime, x hours, of each bulb. You may assume that the
distribution of the bulbs’ lifetimes is normal. The results are
summarised as follows. n = 64, Σx = 63 910.4, Σx² = 63 824 061
(i) Calculate unbiased estimates for the population mean and variance.
(ii) Carry out a test, at the 5% significance level, of whether the statement on the packaging is overestimating the lifetime of this type of bulb.

SOLUTION
i) x̄ = Σx/n = 63 910,4/64 = 998,6 and
s² = (1/(n − 1))(Σx² − (Σx)²/n) = (1/63)(63 824 061 − 63 910,4²/64) = 49,77079365
ii) Step 1: H0: μ = 1000 H1: μ < 1000
Step 2: Since α = 0.05 and the test is a left-tailed test, the critical value is zcrit = −1.645. Reject H0 if zcal < −1,645
Step 3: Compute the test value Z
zcal = (x̄ − μ)/(s/√n) = (998,6 − 1000)/(√49,77079365/√64) = −1,58757

Step 4: Since zcal > −1,645 we fail to reject H0 and conclude that there is insufficient evidence that the statement on the packaging is overestimating the lifetime of this type of bulb.
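
When only the summary totals n, Σx and Σx² are given, the unbiased estimates and the test value follow directly from the formulas above. A minimal sketch with the light bulb data; the variable names are illustrative.

```python
n, sum_x, sum_x2 = 64, 63910.4, 63824061
mu0 = 1000

xbar = sum_x / n                             # unbiased estimate of the mean
s2 = (sum_x2 - sum_x**2 / n) / (n - 1)       # unbiased estimate of the variance
z = (xbar - mu0) / (s2 / n) ** 0.5           # test value, using s in place of sigma

reject_h0 = z < -1.645                       # left-tailed test at 5%

print(round(xbar, 1), round(s2, 4), round(z, 4), reject_h0)
```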

2. A sample of 40 observations from a normal distribution X gave Σx = 24 and Σx² = 596. Performing a two-tail test at the 5% level, test whether the mean of the distribution is zero.

SOLUTION
x̄ = Σx/n = 24/40 = 0,6
s = √[(1/(n − 1))(Σx² − (Σx)²/n)] = √[(1/39)(596 − 24²/40)] = 3,86
Step 1: H0: μ = 0 H1: μ ≠ 0
Step 2: Since the test is a two-tailed test at α = 0,05, α/2 = 0.025 and the critical values are zcrit = −1,96 and 1,96. Reject H0 if zcal < −1,96 or zcal > 1,96
Step 3: Compute the test value Z
zcal = (x̄ − μ)/(s/√n) = (0,6 − 0)/(3,86/√40) = 0,983

Step 4: Since −1,96 < zcal < 1,96 we fail to reject H0 and conclude that there is no evidence that the mean of the distribution differs from zero.

➢ T DISTRIBUTION

- If a sample of n observations is taken from a normal population with mean μ and variance σ², then the sample mean follows a normal distribution, X̄ ~ N(μ, σ²/n), and Z = (X̄ − μ)/(σ/√n). If σ is unknown then s is used, where s² is the unbiased estimator of σ².
- When n is large, (X̄ − μ)/(s/√n) is approximately N(0; 1²). However, if n is small, s is unlikely to be close to σ, and (X̄ − μ)/(s/√n) can no longer be modelled by the normal distribution N(0; 1²); a t distribution is a suitable model.
- If X1; X2; … Xn is selected from a normal population with mean μ and unknown variance σ², then t = (X̄ − μ)/(s/√n) has a t(n−1) distribution, where s² is the unbiased estimate of σ².
- There is a family of t distributions distinguished by the degrees of freedom ν, where ν = n − 1.

CHARACTERISTICS OF A T DISTRIBUTION

- A t distribution is bell shaped.
- The mean, median and mode are equal to 0.
- The curve never touches the x axis.
- The distribution is symmetric about 0, and as ν → ∞ the t distribution approaches N(0; 1²).

USING TABLES FOR T DISTRIBUTION

- The critical value for t is obtained from the tables.
- The table gives values of t such that P(T ≤ t) = p.

Extract of a t distribution table


p      .75    .90    .95    .975   .99    .995   .9975  .999   .9995
ν=1    1,000  3,078  6,314  12,71  31,82  63,66  127,3  318,3  636,6
ν=2    0,816  1,886  2,920  4,303  6,965  9,925  14,09  22,33  31,60
ν=3    0,765  1,638  2,353  3,182  4,541  5,841  7,453  10,21  12,92
ν=4    0,741  1,533  2,132  2,776  3,747  4,604  5,598  7,173  8,610

- The distribution is symmetric about 0, so for a one-tailed test at α = 0,05 use the values under p = 1 − 0,05 = 0,95.

EXAMPLES
Find the critical values for a) ν = 4, α = 0,01 b) ν = 2, α = 0,05
SOLUTION
a) ν = 4 and p = 1 − 0,01 = 0,99, then 4 and 0,99 intersect at 3,747
b) ν = 2 and p = 1 − 0,05 = 0,95, then 2 and 0,95 intersect at 2,920
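
The table lookup can be written as a small function. The sketch below stores only the extract of the t table shown above, so it works for 1 to 4 degrees of freedom; the names `P_COLUMNS`, `T_TABLE` and `t_critical` are illustrative, and the `tails` argument is an added convenience for two-tailed tests (it looks up the column for 1 − α/2).

```python
# Extract of the t table: one row per number of degrees of freedom,
# one column per cumulative probability p, where P(T <= t) = p.
P_COLUMNS = [0.75, 0.90, 0.95, 0.975, 0.99, 0.995, 0.9975, 0.999, 0.9995]
T_TABLE = {
    1: [1.000, 3.078, 6.314, 12.71, 31.82, 63.66, 127.3, 318.3, 636.6],
    2: [0.816, 1.886, 2.920, 4.303, 6.965, 9.925, 14.09, 22.33, 31.60],
    3: [0.765, 1.638, 2.353, 3.182, 4.541, 5.841, 7.453, 10.21, 12.92],
    4: [0.741, 1.533, 2.132, 2.776, 3.747, 4.604, 5.598, 7.173, 8.610],
}

def t_critical(df, alpha, tails=1):
    """Upper critical value for a test at level alpha with df degrees of freedom."""
    p = 1 - alpha / tails
    for column, p_col in enumerate(P_COLUMNS):
        if abs(p_col - p) < 1e-9:
            return T_TABLE[df][column]
    raise ValueError("p is not a column of the table extract")

crit_a = t_critical(4, 0.01)                    # part a)
crit_b = t_critical(2, 0.05)                    # part b)
crit_two_tailed = t_critical(4, 0.05, tails=2)  # uses the 0.975 column

print(crit_a, crit_b, crit_two_tailed)
```

By symmetry, the critical value for a left-tailed test is the negative of the value returned.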

➢ HYPOTHESIS TEST INVOLVING NORMAL DISTRIBUTION WITH UNKNOWN VARIANCE AND SMALL SAMPLE (T-TEST)

- When the population standard deviation is unknown and the sample size n is small (i.e. n < 30), the z test is not normally used for testing hypotheses involving means. A different test, called the t test, is used.
- The test statistic is T = (x̄ − μ)/(s/√n) with ν = n − 1 degrees of freedom.
- The stages for hypothesis testing are the same as for Z tests.

EXAMPLES

1. A medical investigation claims that the average number of infections per week at a hospital in a certain town is 14.3. A random sample of 10 weeks had a mean number of 15.7 infections. The sample standard deviation is 1.6. Is there enough evidence to reject the investigator's claim at the 5% significance level?

SOLUTIONS
Step 1: H0: μ = 14,3 H1: μ ≠ 14,3
Step 2: Since the test is a two-tailed test at α = 0,05, α/2 = 0.025, ν = 9 and the critical values are tcrit = −2,262 and 2,262. Reject H0 if tcal < −2,262 or tcal > 2,262
Step 3: Compute the test value T
tcal = (x̄ − μ)/(s/√n) = (15,7 − 14,3)/(1,6/√10) = 2,77

Step 4: Since tcal > 2,262 we reject H0 and conclude that there is evidence that the average number of infections is not 14,3.

2. An athlete finds that her times for running a race are normally distributed with mean 10,6 seconds. She trains intensively for a week and then records her time in the next 5 races. Her times, in seconds, are 10.70; 10.65; 10.75; 10.80; 10.60. Is there evidence at the 5% level that training intensively has improved her times?

SOLUTIONS
x̄ = Σx/n = 53,5/5 = 10,7
s = √[Σ(x − x̄)²/(n − 1)] = √(0,025/4) = 0,079
Step 1: H0: μ = 10,6 H1: μ < 10,6
Step 2: Since the test is a one-tailed (left-tailed) test, α = 0.05, ν = 4 and the critical value is tcrit = −2,132. Reject H0 if tcal < −2,132
Step 3: Compute the test value T
tcal = (x̄ − μ)/(s/√n) = (10,7 − 10,6)/(0,079/√5) = 2,83

Step 4: Since tcal > −2,132 we fail to reject H0 and conclude that there is no evidence that the race times have improved.
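
With raw data, the sample mean and the unbiased standard deviation can be computed directly before forming the t statistic. A minimal sketch for the athlete's times, using Python's `statistics` module; the critical value −2,132 is the table value quoted in the solution, entered by hand.

```python
from statistics import mean, stdev

times = [10.70, 10.65, 10.75, 10.80, 10.60]
mu0 = 10.6
n = len(times)

xbar = mean(times)
s = stdev(times)                  # sample standard deviation, divisor n - 1
t = (xbar - mu0) / (s / n**0.5)

t_crit = -2.132                   # table value for 4 degrees of freedom, left tail at 5%
reject_h0 = t < t_crit

print(round(xbar, 2), round(s, 3), round(t, 2), reject_h0)
```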

3. A random sample of 8 observations of a normal variable gave Σx = 36,5 and Σ(x − x̄)² = 0,74. Test at the 5% level the hypothesis that the mean of the distribution is 4,3 against the alternative hypothesis that the mean is greater than 4,3.

SOLUTIONS
x̄ = Σx/n = 36,5/8 = 4,5625 ≈ 4,56
s = √[Σ(x − x̄)²/(n − 1)] = √(0,74/7) = 0,3251 ≈ 0,33
Step 1: H0: μ = 4,3 H1: μ > 4,3
Step 2: Since the test is a right-tailed test, α = 0.05, ν = 7 and the critical value is tcrit = 1,895. Reject H0 if tcal > 1,895
Step 3: Compute the test value T
tcal = (x̄ − μ)/(s/√n) = (4,5625 − 4,3)/(0,3251/√8) = 2,28

Step 4: Since tcal > 1,895 we reject H0 and conclude that there is evidence that the mean is greater than 4,3.
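
The test value in Example 3 can be checked from the summary totals. The sketch below assumes that the two given quantities are Σx = 36.5 and Σ(x − x̄)² = 0.74 (the symbols are not legible in the source, but these readings reproduce the stated mean 4,56 and standard deviation 0,33); the critical value 1,895 is the table value quoted in the solution.

```python
n = 8
sum_x = 36.5          # assumed to be the sum of the observations
sum_sq_dev = 0.74     # assumed to be the sum of squared deviations from the mean
mu0 = 4.3

xbar = sum_x / n
s = (sum_sq_dev / (n - 1)) ** 0.5     # unbiased estimate of the standard deviation
t = (xbar - mu0) / (s / n**0.5)

t_crit = 1.895                        # table value for 7 degrees of freedom, right tail at 5%
reject_h0 = t > t_crit

print(round(xbar, 4), round(s, 4), round(t, 2), reject_h0)
```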

➢ EXAM TYPE QUESTIONS

1. A manufacturer of an item used for the production of metal rods claims that a new machine that he has acquired has resulted in an improved product. The old machine is known to have given 20% defectives per output. Test at the 5% significance level the validity of the claim if, out of a sample of 20 items, 2 were found to be defective. Use the binomial test. [7]

2. a) In an election held in 2007, 60% of the voters voted for Party A. In a


poll of opinion conducted last week, 250 potential voters were asked
how they would vote if there was an election now. 135 of the voters
said they would vote for Party A. Investigate at 5% level of
significance whether the proportion of the voters in favour of A has
decreased significantly. [6]
b) Ambulance Services claims that it takes an average of 8,9 minutes
to respond to emergency calls. To verify this claim, the Agency which
licences ambulance services timed 50 responses to emergency calls.
The observed data gave a mean of 9,3 minutes and standard deviation
of 1,8 minutes. Test at 5% significance level whether there is evidence
to justify Ambulance service’s claim. [8] (NOV 2010 no 10)

3. a) Distinguish between a 1-tailed and a 2-tailed test. [2]


b) A political party claims that it commands 60 % of the voters. To test
this, a random sample of 300 potential voters was asked which party
they would vote for. 160 confirmed that they would vote for that party.
Establish whether this sample supports the claim by the party. Test at
10 % level of significance. [7] [JUNE 2013 no 7]

4. A coin is tossed 5 times. Use a binomial test, at the 5% level of significance, to test whether the coin is biased towards heads if at least 4 heads are obtained. [5] [JUNE 2014 no 3]

5. a) Distinguish between 1 tailed and 2 tailed test. [2]


b) It is claimed that rural secondary school pupils travel a distance of
more than 12km to school. To test this claim, a random sample of 100
pupils was asked to keep a record of the distances they travel to school.
The random sample showed an average distance of 14,5 km with a
standard deviation of 4.8km. Test at 0.05 level of significance whether
the claim is true. [6][JUNE 2015 no 7]

6. A manufacturer of orange juice claims that the volumes of packets


which the firm produces are normally distributed with mean 1 000ml
and variance 16. A consumer right inspector tests a sample of 20
packets and finds that the average volume is 997.5 ml. Test at 1%
significance level to establish whether or not the manufacturer is
overstating the volume of the contents. [5] [NOV 2017 no 6]

7. The mass, x kg, of each pocket in a random sample of 80 pockets with


manure was measured and the results summarised by ,
Test at the 5% level of significance, the claim that
the pockets contains less than 1,10kg of manure. [8] [2018 P1 no 9]

8. The credit manager for a department store believes that the average
monthly credit account balances have changed from the historical
average of $5 870. the internal auditor took a random sample of 35
credit account balances and calculated the unbiased estimates of the
mean and variance to be $5 790 and $62 500 respectively.
(a) Explain whether a one-tailed test is appropriate. [2]
(b) Stating the null and alternative hypotheses clearly, test at 5%
significance level whether the sample evidence supports the credit
manager’s belief. [7] [NOV 2004 no 9]

9. The Zimbabwe consumer report (1999) states that the mean retail cost
of Nokia 5110 cellular phone was $600.00. A random sample of 10
stores in Harare, gave the following prices for this model,
593 621 545 561 609 555 588 575 619 599
a) Calculate the mean and standard deviation of the above data. [3]
b) Assuming that the retail costs of these cellular phones are normally
distributed, test at 10% level of significance whether this information
indicates that the population mean of the cost of the cellular phones is
less than $600,00. [6] NOV 2007 no 6


10. A dairy farmer claims that his milk bottles contain exactly one litre of
milk. A consumer took a random sample of 20 bottles and found
the average contents to be 0,980 litres with a standard deviation of
0,070 litres. Test the farmer's claim at the 5% significance level. [6]
JUNE 2008 no 4

11. A random sample of eight observations of a normal variable gave


4.65 and
Test at the 5 % level of significance, whether the
mean is 4.32. [7]
[JUNE 2014 no 6]

12. a) Distinguish between a


i) one tailed test and a two tailed test [2]
ii) statistic and a parameter, [2]
iii) sample and a population. [2]
b) A machine is supposed to produce toothpicks of length 5cm. A
sample of 10 toothpicks was taken and their lengths measured. The
following results were obtained.
4.99 4.96 5.00 4.98 5.01 4.95 4.96 4.97 4.99 4.97
Assuming that the lengths are normally distributed, test at the 1% level
of significance whether the machine is in good working order. [10]
[2018 P2 no 10]

SOLUTIONS TO EXAM TYPE QUESTIONS

1. NOV 2009 no 8
H0: p = 0,2   H1: p ≠ 0,2
If H0 is true, X ~ Bin(20; 0,2). At 5% (two-tailed) reject H0 if
P(X ≤ x) < 0,025 or P(X ≥ x) < 0,025.
When x = 2, find P(X ≤ 2)
= P(X=0) + P(X=1) + P(X=2)
= 20C0 (0,8)^20 + 20C1 (0,8)^19 (0,2) + 20C2 (0,8)^18 (0,2)^2
= 0,20608
Since 0,20608 > 0,025 we fail to reject H0 and conclude that there is no
significant evidence against the claim.
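The lower-tail probability in this solution can be checked numerically. The following is a minimal Python sketch, assuming X ~ Bin(20; 0,2) and an observed value x = 2 as in the working above; the function name binom_cdf is illustrative and not part of the notes.

```python
from math import comb


def binom_cdf(x: int, n: int, p: float) -> float:
    """Return P(X <= x) for X ~ Bin(n, p) by summing the individual terms."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


# P(X <= 2) when H0: p = 0.2 is true and n = 20
p_lower = binom_cdf(2, 20, 0.2)

# Two-tailed test at 5%: each tail is compared with 0.025
reject_h0 = p_lower < 0.025

print(round(p_lower, 5), reject_h0)
```

The sum has exactly three terms (k = 0, 1, 2), which is why the tail probability is far above 0,025 and H0 is not rejected.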

2. (NOV 2010 no 10)


H0: p = 0,6   H1: p < 0,6
p̂ = 135/250 = 0,54
Z = (p̂ − p)/√(pq/n) = (0,54 − 0,6)/√(0,6 × 0,4/250)
= −1,9365
Since −1,9365 < −1,645 reject H0 and conclude that the proportion has decreased.
OR
Step 1: If H0 is true, then X ~ Bin(250; 0,6).
np = 150, nq = 100 and npq = 60. Since np > 5 and nq > 5,
use the Normal approximation X ~ N(150; 60).
Step 2: This is a left-tailed test at α = 0,05.
Step 3: Reject H0 if P(X ≤ 135) < 0,05.
Step 4: Calculate the p-value.
P(X ≤ 135) = P(X < 135,5), applying the continuity correction
= P(Z < (135,5 − 150)/√60)
= P(Z < −1,87)
= 1 − 0,9693
= 0,0307
Step 5: Since P(X ≤ 135) = 0,0307 < 0,05 we reject H0 and conclude that the
proportion has decreased.
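Both methods in this solution can be reproduced with Python's standard library. This is a sketch only, assuming n = 250, p = 0,6 under H0 and 135 observed successes as used in the working; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p0, x = 250, 0.6, 135  # values taken from the worked solution

# Method 1: z statistic for the sample proportion
p_hat = x / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Method 2: X ~ N(np, npq) with a continuity correction,
# so P(X <= 135) is approximated by P(X < 135.5)
approx = NormalDist(mu=n * p0, sigma=sqrt(n * p0 * (1 - p0)))
p_value = approx.cdf(x + 0.5)

print(round(z, 4), round(p_value, 4))
```

The two methods give slightly different z values because only the second applies the continuity correction; both lead to rejecting H0 at the 5% level.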

3. [7] [JUNE 2013 no 7]


a) A 1-tailed test looks for a definite increase or decrease and uses
> or < in H1, whilst a 2-tailed test looks for any change and uses ≠.
b) H0: p = 0,6   H1: p < 0,6
p̂ = 160/300 = 0,5333
Use a one-tailed test (lower tail) at 10%. Zcrit = −1,282. Reject H0 if
Zcal < −1,282.
Z = (p̂ − p)/√(pq/n) = (0,5333 − 0,6)/√(0,6 × 0,4/300)
= −2,357
Since −2,357 < −1,282 reject H0 and conclude that the party commands less
than 60% of the voters.
OR
Step 1: Let X be the number of voters in the sample who would vote for
the party, so X ~ Bin(300; p).
Step 2: H0: p = 0,6   H1: p < 0,6
Step 3: If H0 is true, then X ~ Bin(300; 0,6).
np = 180, nq = 120 and npq = 72.
Since np > 5 and nq > 5 use the Normal approximation X ~ N(180; 72).
Step 4: This is a left-tailed test at α = 0,10.
Step 5: Reject H0 if P(X ≤ 160) < 0,10.
Step 6: Calculate the p-value.
P(X ≤ 160) = P(X < 160,5), applying the continuity correction
= P(Z < (160,5 − 180)/√72)
= P(Z < −2,30)
= 1 − 0,9893
= 0,0107
Step 7: Since P(X ≤ 160) = 0,0107 < 0,10 we reject H0 and conclude that the
party commands less than 60% of the voters.
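The critical value read from the normal table can also be obtained from the inverse of the standard normal CDF. A short sketch, assuming a lower-tail test at the 10% level with 160 supporters out of 300 as in part b); names such as z_crit are illustrative.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.10                # one-tailed (lower tail) significance level
n, p0, x = 300, 0.6, 160    # sample size, H0 proportion, observed count

# Lower-tail critical value: the z with P(Z < z) = alpha
z_crit = NormalDist().inv_cdf(alpha)

# Test statistic for the sample proportion
z_cal = (x / n - p0) / sqrt(p0 * (1 - p0) / n)

reject_h0 = z_cal < z_crit
print(round(z_crit, 3), round(z_cal, 3), reject_h0)
```

Because the alternative hypothesis is p < 0,6, only the lower tail is used, so the whole 10% sits in one tail.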

4. [JUNE 2014 no 3]
H0: p = 0,5   H1: p > 0,5
If H0 is true, X ~ Bin(5; 0,5). This is a one-tailed (upper tail) test, so
at 5% reject H0 if P(X ≥ x) < 0,05.
When x = 4, find P(X ≥ 4)
= P(X=4) + P(X=5)
= 5C4 (0,5)^4 (0,5) + 5C5 (0,5)^5
= 0,1875
Since 0,1875 > 0,05 we fail to reject H0 and conclude that there is no
significant evidence that the coin is biased towards heads when at least
4 heads are obtained.
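For a small binomial test like this one, the upper-tail probability and the whole critical region can be listed directly. A minimal sketch, assuming X ~ Bin(5; 0,5) under H0 and a one-tailed test at the 5% level; the helper binom_upper_tail is an illustrative name.

```python
from math import comb


def binom_upper_tail(x: int, n: int, p: float) -> float:
    """Return P(X >= x) for X ~ Bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))


n, p0, alpha = 5, 0.5, 0.05

# Probability of at least 4 heads if the coin is fair
p_upper = binom_upper_tail(4, n, p0)
reject_h0 = p_upper < alpha

# Values of x for which "at least x heads" would be significant at 5%
critical_region = [x for x in range(n + 1) if binom_upper_tail(x, n, p0) < alpha]

print(p_upper, reject_h0, critical_region)
```

With only 5 tosses, the single outcome that is significant at the 5% level is 5 heads, since P(X = 5) = 1/32.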

5. [JUNE 2015 no 7]
a) A 1-tailed test looks for a definite increase or decrease and uses
> or < in H1, whilst a 2-tailed test looks for any change and uses ≠.
b) H0: μ = 12   H1: μ > 12
Use a one-tailed test (upper tail) at 5%. Zcrit = 1,645. Reject H0 if
Zcal > 1,645.
Zcal = (14,5 − 12)/(4,8/√100)  OR  (14,5 − 12)/(4,8/√99)
= 5,208 or 5,18
Since Zcal > 1,645 we reject H0 and conclude that rural secondary school
pupils travel a distance of more than 12 km to school.
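The two alternative answers arise from treating the quoted standard deviation either as it stands or as a biased sample value that is first converted to an unbiased estimate. A sketch under that assumption, using the figures from the question; z_statistic is an illustrative helper name.

```python
from math import sqrt


def z_statistic(xbar: float, mu0: float, sd: float, n: int) -> float:
    """Return (xbar - mu0) / (sd / sqrt(n))."""
    return (xbar - mu0) / (sd / sqrt(n))


n, xbar, mu0, s = 100, 14.5, 12, 4.8

# Standard deviation used exactly as quoted
z_as_given = z_statistic(xbar, mu0, s, n)

# Quoted value treated as biased, rescaled by sqrt(n / (n - 1))
s_unbiased = s * sqrt(n / (n - 1))
z_unbiased = z_statistic(xbar, mu0, s_unbiased, n)

print(round(z_as_given, 3), round(z_unbiased, 3))
```

With n = 100 the correction is small, and either value is far beyond the critical value 1,645.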

6. [NOV 2017 no 6]
H0: μ = 1000   H1: μ < 1000
Use a one-tailed test (lower tail) at 1%. Zcrit = −2,326. Reject H0 if
Zcal < −2,326.
Zcal = (997,5 − 1000)/(4/√20)
= −2,795
Since Zcal < −2,326 we reject H0 and conclude that the manufacturer is
overstating the volume of the contents.
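Instead of comparing with a critical value, the same decision can be reached from the p-value. A sketch assuming a known population variance of 16, a sample of 20 packets and a sample mean of 997,5 ml, as in the question; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 1000          # mean claimed under H0 (ml)
sigma = sqrt(16)    # known population standard deviation
n, xbar = 20, 997.5

# z statistic for a mean with known variance
z_cal = (xbar - mu0) / (sigma / sqrt(n))

# Lower-tail p-value, since H1 is mu < 1000
p_value = NormalDist().cdf(z_cal)
reject_h0 = p_value < 0.01

print(round(z_cal, 3), round(p_value, 4), reject_h0)
```

Rejecting when the p-value is below 0,01 is equivalent to rejecting when Zcal is below the 1% critical value.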

7. [2018 P1 no 9]
x̄ = Σx/n = 0,994
s² = 0,270877
s = √0,270877 = 0,52
H0: μ = 1,10   H1: μ < 1,10
Use a one-tailed test (lower tail) at 5%. Zcrit = −1,645. Reject H0 if
Zcal < −1,645.
Zcal = (0,994 − 1,10)/(0,52/√80)
= −1,82
Since Zcal < −1,645 we reject H0 and conclude that the pockets contain
less than 1,10 kg of manure.

8. [NOV 2004 no 9]
a) Not appropriate because the belief is not specific on whether the
change is an increase or decrease.
b) H0: μ = 5 870   H1: μ ≠ 5 870
Use a two-tailed test at 5%. Zcrit = ±1,960. Reject H0 if Zcal < −1,960 or
Zcal > 1,960.
Zcal = (5 790 − 5 870)/(√62 500/√35) = −80/(250/√35)
= −1,893
Since −1,960 < −1,893 < 1,960 we fail to reject H0 and conclude that there is
insufficient evidence that the average monthly credit account balances
have changed.
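For a two-tailed test the p-value counts both tails, which is why a statistic of about −1,89 is not significant at 5% even though it would be in a one-tailed test. A sketch using the figures from the question; the names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, xbar = 5870, 5790
variance, n = 62500, 35     # unbiased variance estimate and sample size

z_cal = (xbar - mu0) / sqrt(variance / n)

# Two-tailed p-value: twice the area beyond |z|
p_two_tailed = 2 * NormalDist().cdf(-abs(z_cal))
reject_h0 = p_two_tailed < 0.05

print(round(z_cal, 3), round(p_two_tailed, 4), reject_h0)
```

The one-tailed area alone is under 0,05, so the choice between a one-tailed and a two-tailed test in part (a) changes the conclusion here.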

9. NOV 2007 no 6
a) x̄ = Σx/n = 5865/10 = 586,5
σ = √(Σ(x − x̄)²/n) = √(6450,5/10)
= 25,3978
b) H0: μ = 600   H1: μ < 600
Use a one-tailed t-test (lower tail) at 10% with ν = 9 degrees of freedom.
CV = −1,383. Reject H0 if t < −1,383.
s = √(Σ(x − x̄)²/(n − 1)) = √(6450,5/9)
= 26,77
T = (586,5 − 600)/(26,77/√10)
= −1,5947 (using the biased standard deviation 25,3978 instead gives
−1,68)
Since Tcal < −1,383 we reject H0 and conclude that the population mean
of the cost of the cellular phones is less than $600,00.
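The summary statistics and the t statistic can be recomputed from the ten prices in the question. A minimal sketch using Python's statistics module, where pstdev divides by n (biased) and stdev divides by n − 1 (unbiased); the critical value −1,383 is taken from the t table in the solution, not computed.

```python
from math import sqrt
from statistics import mean, pstdev, stdev

prices = [593, 621, 545, 561, 609, 555, 588, 575, 619, 599]
n = len(prices)

xbar = mean(prices)
sd_biased = pstdev(prices)     # divides by n
sd_unbiased = stdev(prices)    # divides by n - 1

# t statistic for H0: mu = 600 using the unbiased standard deviation
t_cal = (xbar - 600) / (sd_unbiased / sqrt(n))

t_crit = -1.383                # table value for 9 degrees of freedom, 10% lower tail
reject_h0 = t_cal < t_crit

print(xbar, round(sd_biased, 4), round(sd_unbiased, 2), round(t_cal, 4), reject_h0)
```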

10. JUNE 2008 no 4


ŝ² = (n/(n − 1)) s² = (20/19)(0,070)² = 0,005157894
ŝ = 0,072
H0: μ = 1 litre   H1: μ ≠ 1 litre
Use a two-tailed t-test at 5%. ν = 19. tcrit = ±2,093. Reject H0 if
tcal < −2,093 or tcal > 2,093.
tcal = (0,980 − 1)/(0,072/√20)
= −1,242
Since −2,093 < tcal < 2,093 we fail to reject H0 and conclude that there is
insufficient evidence against the farmer's claim at the 5% significance level.
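When only a sample standard deviation is quoted, it is first converted to an unbiased estimate before the t statistic is formed. A sketch assuming the quoted 0,070 litres is the biased (divide-by-n) value, as the working above does; the variable names are illustrative and the critical value 2,093 is the table value from the solution.

```python
from math import sqrt

n, xbar, mu0 = 20, 0.980, 1.0
s_biased = 0.070

# Unbiased variance estimate: multiply the sample variance by n / (n - 1)
var_unbiased = n / (n - 1) * s_biased**2

# Two equivalent forms of the t statistic
t_cal = (xbar - mu0) / sqrt(var_unbiased / n)
t_alt = (xbar - mu0) / (s_biased / sqrt(n - 1))

t_crit = 2.093                 # table value for 19 degrees of freedom, 5% two-tailed
reject_h0 = abs(t_cal) > t_crit

print(round(var_unbiased, 9), round(t_cal, 3), reject_h0)
```

Carrying full precision gives a statistic close to −1,245; rounding the standard deviation to 0,072 first shifts it slightly, and the decision is the same either way.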

11. [JUNE 2014 no 6]


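SLN: σ̂² = (n/(n − 1)) s²
= (8/7)(0,0925)
= 0,1057
where the sample variance is s² = 0,0925.
H0: μ = 4,32   H1: μ ≠ 4,32
Use a two-tailed t-test at 5%. Degrees of freedom ν = 7 and CV = ±2,365.
Reject H0 if t < −2,365 or t > 2,365.
t = (4,65 − 4,32)/√(0,1057/8)  OR  (4,65 − 4,32)/√(0,0925/7)
= 2,87
Since t > 2,365 we reject H0 and conclude that the mean is not 4,32.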

12. [2018 P2 no 10]


a) i) A 1-tailed test looks for a definite increase or decrease and uses
> or < in H1, whilst a 2-tailed test looks for any change and uses ≠.
ii) A statistic is a measure calculated from a sample, whilst a
parameter is a measure of a population.
iii) A population is the entire group under study, whilst a sample is a part
of the population.
b) x̄ = Σx/n
= 49,78/10
= 4,978
s² = Σ(x − x̄)²/(n − 1)
= 0,00336/9
= 0,00037333
s = 0,01932
H0: μ = 5   H1: μ ≠ 5
Use a two-tailed t-test at 1%. Degrees of freedom ν = 9 and CV = ±3,250.
Reject H0 if t < −3,250 or t > 3,250.
t = (4,978 − 5)/(0,01932/√10)
= −3,6
Since t < −3,250 we reject H0 and conclude at the 1% level of significance
that the machine is not in good working order.
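The toothpick calculation can be verified from the ten measured lengths. A minimal sketch using the statistics module; the critical value 3,250 is the table value quoted in the solution (9 degrees of freedom), not something computed here.

```python
from math import sqrt
from statistics import fmean, stdev

lengths = [4.99, 4.96, 5.00, 4.98, 5.01, 4.95, 4.96, 4.97, 4.99, 4.97]
n = len(lengths)

xbar = fmean(lengths)      # sample mean
s = stdev(lengths)         # unbiased standard deviation (divides by n - 1)

# t statistic for H0: mu = 5
t_cal = (xbar - 5) / (s / sqrt(n))

t_crit = 3.250             # table value, two-tailed test
reject_h0 = abs(t_cal) > t_crit

print(round(xbar, 3), round(s, 5), round(t_cal, 2), reject_h0)
```

The differences from 5 cm are tiny in absolute terms, but the spread of the sample is even smaller, which is what makes the statistic significant.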

CONSTRUCTIVE COMMENTS ON ERRORS AND MISTAKES ARE WELCOME

CALL, TEXT OR WHATSAPP 0783235483
