6. TESTS OF SIGNIFICANCE
(Small Samples)
6.0 Introduction:
In the previous chapter we discussed problems relating to large samples. Large-sample theory rests on two important assumptions:
(a) The random sampling distribution of a statistic is approximately normal, and
(b) The values given by the sample data are sufficiently close to the population values and can be used in their place for the calculation of the standard error of the estimate.
These assumptions do not hold good in the theory of small samples, so a new technique is needed to deal with them. A sample is regarded as small when it consists of fewer than 30 items (n < 30).
Since in many problems it becomes necessary to take a small sample, considerable attention has been paid to developing suitable tests for small samples. The greatest contributions to the theory of small samples are those of W. S. Gosset and Prof. R. A. Fisher. Gosset published his discovery in 1908 under the pen name 'Student', and the work was later developed and extended by Prof. R. A. Fisher. Gosset gave the test popularly known as the 't-test'.
6.1 t - statistic definition:
If $x_1, x_2, \ldots, x_n$ is a random sample of size $n$ from a normal population with mean $\mu$ and variance $\sigma^2$, then Student's t-statistic is defined as
\[ t = \frac{\bar{x} - \mu}{S/\sqrt{n}} \]
where $\bar{x} = \frac{\sum x}{n}$ is the sample mean and
\[ S^2 = \frac{1}{n-1} \sum (x - \bar{x})^2 \]
is an unbiased estimate of the population variance $\sigma^2$. It follows Student's t-distribution with $\nu = n - 1$ d.f.
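As a minimal sketch of the definition above, the statistic can be computed directly from a sample. The function name `t_statistic` and the three-value sample are illustrative assumptions, not taken from the text.

```python
import math

def t_statistic(sample, mu):
    """Student's t for a sample against a hypothesised population mean mu.

    Returns the statistic and its degrees of freedom (n - 1).
    """
    n = len(sample)
    mean = sum(sample) / n
    # Unbiased estimate of the population variance: divisor n - 1
    s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)
    t = (mean - mu) / math.sqrt(s2 / n)
    return t, n - 1

# Illustrative data only: mean 4, S^2 = 4, n = 3
t, df = t_statistic([2.0, 4.0, 6.0], mu=3.0)
print(round(t, 4), df)
```

Here the divisor n - 1 in the variance matches the definition of $S^2$; using divisor n instead would require the standard error $s/\sqrt{n-1}$.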
6.1.1 Assumptions for Student's t-test:
1. The parent population from which the sample is drawn is normal.
2. The sample observations are random and independent.
3. The population standard deviation $\sigma$ is not known.
6.1.2 Properties of t-distribution:
1. The t-distribution ranges from $-\infty$ to $\infty$, just as a normal distribution does.
2. Like the normal distribution, the t-distribution is symmetrical and has mean zero.
3. The t-distribution has a greater dispersion than the standard normal distribution.
4. As the sample size increases (in practice, beyond about 30), the t-distribution approaches the normal distribution.
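Properties 3 and 4 can be illustrated numerically. The sketch below uses the standard density of Student's t-distribution, $f(t) = \frac{\Gamma((\nu+1)/2)}{\sqrt{\nu\pi}\,\Gamma(\nu/2)} \left(1 + t^2/\nu\right)^{-(\nu+1)/2}$, which is not stated in this chapter and is brought in here as an assumption. The function names are illustrative.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

# Compare the height of each curve far out in the tail (t = 3)
for df in (2, 5, 30, 100):
    print(df, round(t_pdf(3.0, df), 5), round(normal_pdf(3.0), 5))
```

The t-curve sits above the normal curve in the tails and below it at the centre, and the gap narrows as the degrees of freedom grow.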
[Figure: comparison between the normal curve and the corresponding t-curve]
6.1.3 Degrees of freedom (d.f):
Suppose one is asked to write any four numbers; then all four can be chosen freely. If the restriction is imposed that the sum of these numbers should be 50, only three numbers can be chosen freely, say 10, 15 and 20, and the fourth is then fixed: 50 - (10 + 15 + 20) = 5. Thus our freedom of choice is reduced by one by the condition that the total be 50. The number of restrictions placed on the freedom is one and the degrees of freedom are three. As the restrictions increase, the freedom is reduced.
The number of independent variates which make up the statistic is known as the degrees of freedom and is usually denoted by $\nu$ (nu).
The number of degrees of freedom for $n$ observations is $n - k$, where $k$ is the number of independent linear constraints imposed upon them.
For Student's t-distribution, the number of degrees of freedom is the sample size minus one. It is denoted by $\nu = n - 1$.
The degrees of freedom play a very important role in the $\chi^2$ test of a hypothesis.
When we fit a distribution, the number of degrees of freedom is $(n - k - 1)$, where $n$ is the number of observations and $k$ is the number of parameters estimated from the data.
For example, when we fit a Poisson distribution the degrees of freedom are $\nu = n - 1 - 1$.
In a contingency table the degrees of freedom are $(r - 1)(c - 1)$, where $r$ is the number of rows and $c$ is the number of columns. Thus in a 3 x 4 table the d.f. are (3 - 1)(4 - 1) = 6, and in a 2 x 2 contingency table the d.f. are (2 - 1)(2 - 1) = 1.
In the case of data given as a series of values in a row or column, the d.f. will be the number of observations in the series less one, i.e. $\nu = n - 1$.
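The counting rules above can be collected into a few one-line helpers. This is only a sketch; the function names are hypothetical and chosen for illustration.

```python
def df_constraints(n, k):
    """n observations under k independent linear constraints."""
    return n - k

def df_fitted(n, k):
    """Fitting a distribution: n observations, k parameters estimated from the data."""
    return n - k - 1

def df_contingency(r, c):
    """Contingency table with r rows and c columns."""
    return (r - 1) * (c - 1)

# Four numbers constrained to sum to 50: three are free
print(df_constraints(4, 1))
# Student's t with a sample of 10: one constraint (the sample mean)
print(df_constraints(10, 1))
# 3 x 4 and 2 x 2 contingency tables
print(df_contingency(3, 4), df_contingency(2, 2))
```

For a Poisson fit one parameter is estimated, so `df_fitted(n, 1)` gives the $n - 1 - 1$ rule quoted in the text.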
Critical value of t:
The column figures in the main body of the table come under the headings $t_{0.100}$, $t_{0.050}$, $t_{0.025}$, $t_{0.010}$ and $t_{0.005}$. The subscripts give the proportion of the distribution in the 'tail' area. Thus for a two-tailed test at the 5% level of significance there will be two rejection areas, each containing 2.5% of the total area, and the required column is headed $t_{0.025}$.
For example,
the column headed $t_{0.025}$ serves a two-tailed test at the 5% level, and
the column headed $t_{0.005}$ serves a two-tailed test at the 1% level.
For a one-tailed test at the 5% level, the rejection area lies in one tail of the distribution and the required column is headed $t_{0.05}$.
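The rule for choosing a column can be sketched as a small lookup. The helper names are hypothetical, and the dictionary holds only the four critical values quoted in the worked examples of this chapter, not a full t-table.

```python
def table_column(alpha, two_tailed):
    """Heading of the t-table column (tail area) for a given significance level."""
    tail = alpha / 2 if two_tailed else alpha
    return f"t_{tail:.3f}"

# (degrees of freedom, column heading) -> critical value, as quoted in this chapter
CRITICAL_VALUES = {
    (9, "t_0.025"): 2.262,
    (25, "t_0.050"): 1.708,
    (10, "t_0.050"): 1.812,
    (8, "t_0.050"): 1.860,
}

def critical_value(df, alpha, two_tailed):
    """Look up the tabulated value; raises KeyError if it is not in the small table."""
    return CRITICAL_VALUES[(df, table_column(alpha, two_tailed))]

print(table_column(0.05, two_tailed=True))
print(table_column(0.05, two_tailed=False))
print(critical_value(9, 0.05, two_tailed=True))
```

Halving the significance level for a two-tailed test is the only step involved; everything else is a table lookup.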
[Figure: critical value of the t-distribution]
The t-distribution has a number of applications in statistics,
of which we shall discuss the following in the coming sections:
(i) t-test for significance of single mean, population variance being
unknown.
(ii) t-test for significance of the difference between two sample
means, the population variances being equal but unknown.
(a) Independent samples
(b) Related samples: paired t-test
6.2 Test of significance for Mean:
We set up the corresponding null and alternative hypotheses
as follows:
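$H_0: \mu = \mu_0$; there is no significant difference between the sample mean and the population mean.
$H_1: \mu \neq \mu_0$ ($\mu < \mu_0$ or $\mu > \mu_0$)
Level of significance:
5% or 1%
Calculation of statistic:
Under $H_0$ the test statistic is
\[ t_0 = \left| \frac{\bar{x} - \mu}{S/\sqrt{n}} \right| \]
where $\bar{x} = \frac{\sum x}{n}$ is the sample mean
and $S^2 = \frac{1}{n-1} \sum (x - \bar{x})^2$ (or) $s^2 = \frac{1}{n} \sum (x - \bar{x})^2$. When $s$ is used, the denominator $S/\sqrt{n}$ becomes $s/\sqrt{n-1}$.
Expected value:
\[ t_e = \left| \frac{\bar{x} - \mu}{S/\sqrt{n}} \right| \]
follows Student's t-distribution with $(n - 1)$ d.f.
Inference:
If $t_0 \le t_e$, it falls in the acceptance region and the null hypothesis is accepted; if $t_0 > t_e$, the null hypothesis $H_0$ may be rejected at the given level of significance.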
Example 1:
A certain pesticide is packed into bags by a machine. A random sample of 10 bags is drawn and their contents are found to weigh (in kg) as follows:
50 49 52 44 45 48 46 45 49 45
Test if the average packing can be taken to be 50 kg.
Solution:
Null hypothesis:
$H_0: \mu = 50$ kg, i.e. the average packing is 50 kg.
Alternative Hypothesis:
$H_1: \mu \neq 50$ kg (Two-tailed)
Level of Significance:
Let $\alpha = 0.05$
Calculation of sample mean and S.D
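x        d = x - 48    d^2
50           2           4
49           1           1
52           4          16
44          -4          16
45          -3           9
48           0           0
46          -2           4
45          -3           9
49           1           1
45          -3           9
Total       -7          69
\[ \bar{x} = A + \frac{\sum d}{n} = 48 + \frac{-7}{10} = 48 - 0.7 = 47.3 \]
\[ S^2 = \frac{1}{n-1} \left[ \sum d^2 - \frac{(\sum d)^2}{n} \right] = \frac{1}{9} \left[ 69 - \frac{(-7)^2}{10} \right] = \frac{64.1}{9} = 7.12 \]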
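Calculation of Statistic:
Under $H_0$ the test statistic is:
\[ t_0 = \left| \frac{\bar{x} - \mu}{\sqrt{S^2/n}} \right| = \frac{|47.3 - 50|}{\sqrt{7.12/10}} = \frac{2.7}{\sqrt{0.712}} \approx 3.20 \]
Expected value:
\[ t_e = \left| \frac{\bar{x} - \mu}{\sqrt{S^2/n}} \right| \]
follows the t-distribution with $(10 - 1) = 9$ d.f.; the tabulated value is $t_e = 2.262$.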
Inference:
Since $t_0 > t_e$, $H_0$ is rejected at the 5% level of significance and we conclude that the average packing cannot be taken to be 50 kg.
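The arithmetic of Example 1 can be checked with a short script. This is a sketch that reads the ten weights as 50, 49, 52, 44, 45, 48, 46, 45, 49, 45 and uses the same assumed mean A = 48 as the working table; the variable names are illustrative.

```python
import math

weights = [50, 49, 52, 44, 45, 48, 46, 45, 49, 45]
A = 48      # assumed mean used in the working table
mu0 = 50    # hypothesised average packing (kg)

n = len(weights)
d = [x - A for x in weights]
sum_d = sum(d)
sum_d2 = sum(v * v for v in d)

mean = A + sum_d / n
# Unbiased variance via the shortcut formula on the deviations d
s2 = (sum_d2 - sum_d ** 2 / n) / (n - 1)
t0 = abs(mean - mu0) / math.sqrt(s2 / n)

t_e = 2.262  # tabulated value for 9 d.f., two-tailed 5%, as quoted in the text
print(sum_d, sum_d2, round(mean, 1), round(s2, 2), round(t0, 2), t0 > t_e)
```

Shifting by the assumed mean A changes neither the variance nor the statistic; it only keeps the hand arithmetic small.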
Example 2:
A soap manufacturing company was distributing a
particular brand of soap through a large number of retail shops.
Before a heavy advertisement campaign, the mean sales per week
per shop was 140 dozens. After the campaign, a sample of 26 shops
was taken and the mean sales was found to be 147 dozens with
standard deviation 16. Can you consider the advertisement
effective?
Solution:
We are given
$n = 26$; $\bar{x} = 147$ dozens; $s = 16$
Null hypothesis:
$H_0: \mu = 140$ dozens, i.e. the advertisement is not effective.
Alternative Hypothesis:
$H_1: \mu > 140$ dozens (Right-tailed)
Calculation of statistic:
Under the null hypothesis $H_0$, the test statistic is
\[ t_0 = \left| \frac{\bar{x} - \mu}{s/\sqrt{n-1}} \right| = \frac{147 - 140}{16/\sqrt{25}} = \frac{7 \times 5}{16} = 2.19 \]
Expected value:
\[ t_e = \left| \frac{\bar{x} - \mu}{s/\sqrt{n-1}} \right| \]
follows the t-distribution with $(26 - 1) = 25$ d.f.; the tabulated value is $t_e = 1.708$.
Inference:
Since $t_0 > t_e$, $H_0$ is rejected at the 5% level of significance. Hence we conclude that the advertisement is effective in increasing the sales.
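Example 2 works from summary figures rather than raw data. The sketch below assumes, as the $s/\sqrt{n-1}$ denominator implies, that the quoted standard deviation 16 was computed with divisor n; the variable names are illustrative.

```python
import math

n, mean, s, mu0 = 26, 147, 16, 140

# s uses divisor n, so the standard error is s / sqrt(n - 1)
t0 = abs(mean - mu0) / (s / math.sqrt(n - 1))

# Equivalent route: convert to the unbiased variance S^2 = n s^2 / (n - 1)
S2 = n * s ** 2 / (n - 1)
t0_alt = abs(mean - mu0) / math.sqrt(S2 / n)

t_e = 1.708  # tabulated value for 25 d.f., one-tailed 5%, as quoted in the text
print(round(t0, 2), round(t0_alt, 2), t0 > t_e)
```

Both routes give the same statistic, which shows that the two variance conventions differ only in bookkeeping.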
6.3 Test of significance for difference between two means:
6.3.1 Independent samples:
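Suppose we want to test whether two independent samples have been drawn from two normal populations having the same mean, the population variances being equal. Let $x_1, x_2, \ldots, x_{n_1}$ and $y_1, y_2, \ldots, y_{n_2}$ be two independent random samples from the given normal populations.
Null hypothesis:
$H_0: \mu_1 = \mu_2$, i.e. the samples have been drawn from normal populations with the same mean.
Alternative Hypothesis:
$H_1: \mu_1 \neq \mu_2$ ($\mu_1 < \mu_2$ or $\mu_1 > \mu_2$)
Test statistic:
Under $H_0$, the test statistic is
\[ t_0 = \left| \frac{\bar{x} - \bar{y}}{S \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \right| \]
where
\[ S^2 = \frac{1}{n_1 + n_2 - 2} \left[ \sum (x - \bar{x})^2 + \sum (y - \bar{y})^2 \right] = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} \]
Expected value:
$t_e$ follows the t-distribution with $n_1 + n_2 - 2$ d.f.
Inference:
If $t_0 < t_e$, we accept the null hypothesis; if $t_0 > t_e$, we reject the null hypothesis.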
Example 3:
A group of 5 patients treated with medicine 'A' weigh 42, 39, 48, 60 and 41 kg. A second group of 7 patients from the same hospital treated with medicine 'B' weigh 38, 42, 56, 64, 68, 69 and 62 kg. Do you agree with the claim that medicine 'B' increases the weight significantly?
Solution:
Let the weights (in kgs) of the patients treated with
medicines A and B be denoted by variables X and Y respectively.
Null hypothesis:
$H_0: \mu_1 = \mu_2$
i.e. there is no significant difference between the medicines A and B as regards their effect on increase in weight.
Alternative Hypothesis:
$H_1: \mu_1 < \mu_2$ (left-tailed), i.e. medicine B increases the weight significantly.
Level of significance: Let $\alpha = 0.05$
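Computation of sample means and S.Ds
Medicine A
x        x - x̄ (x̄ = 46)    (x - x̄)^2
42           -4                16
39           -7                49
48            2                 4
60           14               196
41           -5                25
230           0               290
\[ \bar{x} = \frac{\sum x}{n_1} = \frac{230}{5} = 46 \]
Medicine B
y        y - ȳ (ȳ = 57)    (y - ȳ)^2
38          -19               361
42          -15               225
56           -1                 1
64            7                49
68           11               121
69           12               144
62            5                25
399           0               926
\[ \bar{y} = \frac{\sum y}{n_2} = \frac{399}{7} = 57 \]
\[ S^2 = \frac{1}{n_1 + n_2 - 2} \left[ \sum (x - \bar{x})^2 + \sum (y - \bar{y})^2 \right] = \frac{1}{10} [290 + 926] = 121.6 \]
Calculation of statistic:
Under $H_0$ the test statistic is
\[ t_0 = \left| \frac{\bar{x} - \bar{y}}{S \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \right| = \frac{|46 - 57|}{\sqrt{121.6 \left( \frac{1}{5} + \frac{1}{7} \right)}} = \frac{11}{6.46} \approx 1.70 \]
Expected value:
$t_e$ follows the t-distribution with $5 + 7 - 2 = 10$ d.f.; the tabulated value is $t_e = 1.812$.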
Inference:
Since $t_0 < t_e$, it is not significant. Hence $H_0$ is accepted and we conclude that the medicines A and B do not differ significantly as regards their effect on increase in weight.
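The pooled-variance calculation of Example 3 can be reproduced from the raw weights. This is a minimal sketch; the function name `pooled_t` is an illustrative assumption.

```python
import math

def pooled_t(x, y):
    """Two-sample t statistic assuming equal population variances.

    Returns (t0, pooled variance S^2, degrees of freedom).
    """
    n1, n2 = len(x), len(y)
    mean_x = sum(x) / n1
    mean_y = sum(y) / n2
    ss_x = sum((v - mean_x) ** 2 for v in x)   # sum of squared deviations, sample 1
    ss_y = sum((v - mean_y) ** 2 for v in y)   # sum of squared deviations, sample 2
    s2 = (ss_x + ss_y) / (n1 + n2 - 2)
    t0 = abs(mean_x - mean_y) / math.sqrt(s2 * (1 / n1 + 1 / n2))
    return t0, s2, n1 + n2 - 2

medicine_a = [42, 39, 48, 60, 41]
medicine_b = [38, 42, 56, 64, 68, 69, 62]
t0, s2, df = pooled_t(medicine_a, medicine_b)

t_e = 1.812  # tabulated value for 10 d.f., one-tailed 5%, as quoted in the text
print(round(s2, 1), round(t0, 2), df, t0 < t_e)
```

The statistic comes out at about 1.70, which can be compared with the tabulated 1.812.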
Example 4:
Two types of batteries are tested for their length of life and
the following data are obtained:
            Sample size    Mean life (in hrs)    Variance
Type A          9               600                121
Type B          8               640                144
Is there a significant difference in the two means?
Solution:
We are given
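$n_1 = 9$; $\bar{x}_1 = 600$ hrs; $s_1^2 = 121$; $n_2 = 8$; $\bar{x}_2 = 640$ hrs; $s_2^2 = 144$
Null hypothesis:
$H_0: \mu_1 = \mu_2$, i.e. the two types of batteries A and B are identical; there is no significant difference between the two types of batteries.
Alternative Hypothesis:
$H_1: \mu_1 \neq \mu_2$ (Two-tailed)
Level of Significance:
Let $\alpha = 5\%$
Calculation of statistics:
Under $H_0$, the test statistic is
\[ t_0 = \left| \frac{\bar{x}_1 - \bar{x}_2}{S \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \right| \]
where
\[ S^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} = \frac{9 \times 121 + 8 \times 144}{9 + 8 - 2} = \frac{2241}{15} = 149.4 \]
so that
\[ t_0 = \frac{|600 - 640|}{\sqrt{149.4 \left( \frac{1}{9} + \frac{1}{8} \right)}} = \frac{40}{5.94} \approx 6.73 \]
Expected value:
$t_e$ follows the t-distribution with $9 + 8 - 2 = 15$ d.f.; the tabulated value for a two-tailed test at the 5% level is $t_e = 2.131$.
Inference:
Since $t_0 > t_e$, it is highly significant. Hence $H_0$ is rejected and we conclude that the two types of batteries differ significantly as regards their length of life.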
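When only sample sizes, means and variances are available, as in Example 4, the pooled variance is built from $n_1 s_1^2 + n_2 s_2^2$. The sketch below assumes the quoted variances use divisor n, which is what that formula implies; the function name is illustrative.

```python
import math

def pooled_t_from_summary(n1, mean1, var1, n2, mean2, var2):
    """Two-sample t statistic from summary figures.

    var1 and var2 are sample variances computed with divisor n (assumption).
    Returns (t0, pooled variance S^2, degrees of freedom).
    """
    s2 = (n1 * var1 + n2 * var2) / (n1 + n2 - 2)
    t0 = abs(mean1 - mean2) / math.sqrt(s2 * (1 / n1 + 1 / n2))
    return t0, s2, n1 + n2 - 2

t0, s2, df = pooled_t_from_summary(9, 600, 121, 8, 640, 144)
print(round(s2, 1), round(t0, 2), df)
```

The statistic is roughly 6.7 on 15 d.f., far beyond any of the usual tabulated 5% or 1% values.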
6.3.2 Related samples —Paired t-test:
In the t-test for difference of means, the two samples were independent of each other. Let us now take a particular situation where
(i) The sample sizes are equal, i.e. $n_1 = n_2 = n$ (say), and
(ii) The sample observations $(x_1, x_2, \ldots, x_n)$ and $(y_1, y_2, \ldots, y_n)$ are not completely independent but are dependent in pairs.
That is, we make two observations on the same individual, one before the treatment and another after it. For example, a business concern wants to find out whether a particular medium of promoting sales of a product, say door-to-door canvassing, advertisement in papers or advertisement through T.V., is really effective. Similarly, a pharmaceutical company wants to test the efficacy of a particular drug, say for inducing sleep, after the drug is given. Testing such claims gives rise to situations (i) and (ii) above, and we apply the paired t-test.
Paired — t test:
Let $d_i = x_i - y_i$ $(i = 1, 2, \ldots, n)$ denote the difference in the observations for the $i$-th unit.
Null hypothesis:
$H_0: \mu_1 = \mu_2$, i.e. the increments are just by chance.
Alternative Hypothesis:
$H_1: \mu_1 \neq \mu_2$ ($\mu_1 > \mu_2$ or $\mu_1 < \mu_2$)
Calculation of test statistic:
\[ t_0 = \left| \frac{\bar{d}}{S/\sqrt{n}} \right| \]
where $\bar{d} = \frac{\sum d}{n}$ and
\[ S^2 = \frac{1}{n-1} \sum (d - \bar{d})^2 = \frac{1}{n-1} \left[ \sum d^2 - \frac{(\sum d)^2}{n} \right] \]
Expected value:
\[ t_e = \left| \frac{\bar{d}}{S/\sqrt{n}} \right| \]
follows the t-distribution with $n - 1$ d.f.
Inference:
By comparing $t_0$ and $t_e$ at the desired level of significance, usually 5% or 1%, we reject or accept the null hypothesis.
Example 5:
To test the desirability of a certain modification in typists' desks, 9 typists were given two tests of as nearly as possible the same nature, one on the desk in use and the other on the new type. The following differences in the number of words typed per minute were recorded:
Typists                        A   B   C   D   E   F   G   H   I
Increase in number of words    2   4   0   3  -1   4  -3   2   5
Do the data indicate that the modification in desk promotes speed in typing?
Solution:
Null hypothesis:
$H_0: \mu_1 = \mu_2$, i.e. the modification in desk does not promote speed in typing.
Alternative Hypothesis:
$H_1: \mu_1 < \mu_2$ (Left-tailed test)
Level of significance: Let $\alpha = 0.05$
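Typist     d     d^2
A          2      4
B          4     16
C          0      0
D          3      9
E         -1      1
F          4     16
G         -3      9
H          2      4
I          5     25
Total     16     84
\[ \bar{d} = \frac{\sum d}{n} = \frac{16}{9} = 1.778 \]
\[ S = \sqrt{ \frac{1}{n-1} \left[ \sum d^2 - \frac{(\sum d)^2}{n} \right] } = \sqrt{ \frac{1}{8} \left[ 84 - \frac{(16)^2}{9} \right] } = \sqrt{6.94} = 2.635 \]
Calculation of statistic:
Under $H_0$ the test statistic is
\[ t_0 = \left| \frac{\bar{d}}{S/\sqrt{n}} \right| = \frac{1.778 \times 3}{2.635} = 2.024 \]
Expected value:
\[ t_e = \left| \frac{\bar{d}}{S/\sqrt{n}} \right| \]
follows the t-distribution with $9 - 1 = 8$ d.f.; the tabulated value is $t_e = 1.860$.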
Inference:
Since $t_0 = 2.024 > t_e = 1.860$, the null hypothesis is rejected at the 5% level of significance for this one-tailed test. The data indicate that the modification in desk promotes speed in typing.
Example 6:
An IQ test was administered to 5 persons before and after
they were trained. The results are given below:
Candidates              I     II    III    IV     V
IQ before training     110   120   123    132    125
IQ after training      120   118   125    136    121
Test whether there is any change in IQ after the training
programme (test at 1% level of significance)
Solution:
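Null hypothesis:
$H_0: \mu_1 = \mu_2$, i.e. there is no significant change in IQ after the training programme.
Alternative Hypothesis:
$H_1: \mu_1 \neq \mu_2$ (two-tailed test)
Level of significance:
$\alpha = 0.01$
              I      II     III    IV     V      Total
x            110    120    123    132    125
y            120    118    125    136    121
d = x - y    -10      2     -2     -4      4      -10
d^2          100      4      4     16     16      140
\[ \bar{d} = \frac{\sum d}{n} = \frac{-10}{5} = -2 \]
\[ S^2 = \frac{1}{n-1} \left[ \sum d^2 - \frac{(\sum d)^2}{n} \right] = \frac{1}{4} \left[ 140 - \frac{100}{5} \right] = 30 \]
Calculation of Statistic:
Under $H_0$ the test statistic is
\[ t_0 = \left| \frac{\bar{d}}{\sqrt{S^2/n}} \right| = \frac{|-2|}{\sqrt{30/5}} = \frac{2}{2.45} = 0.816 \]
Expected value:
$t_e$ follows the t-distribution with $5 - 1 = 4$ d.f.; the tabulated value for a two-tailed test at the 1% level is $t_e = 4.604$.
Inference:
Since $t_0 < t_e$ at the 1% level of significance, we accept the null hypothesis. We therefore conclude that there is no significant change in IQ after the training programme.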
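The paired procedure of section 6.3.2 can be checked against the data of Examples 5 and 6. This is a sketch only; the function name `paired_t` and the variable names are illustrative assumptions.

```python
import math

def paired_t(d):
    """Paired t statistic from the list of within-pair differences d.

    Returns (t0, mean difference, S^2, degrees of freedom).
    """
    n = len(d)
    d_bar = sum(d) / n
    # Shortcut formula: S^2 = [sum(d^2) - (sum d)^2 / n] / (n - 1)
    s2 = (sum(v * v for v in d) - sum(d) ** 2 / n) / (n - 1)
    t0 = abs(d_bar) / math.sqrt(s2 / n)
    return t0, d_bar, s2, n - 1

# Example 5: increase in words per minute for the 9 typists
typing_increase = [2, 4, 0, 3, -1, 4, -3, 2, 5]
t5, d_bar5, s2_5, df5 = paired_t(typing_increase)
print(round(d_bar5, 3), round(math.sqrt(s2_5), 3), round(t5, 3), df5)

# Example 6: IQ before (x) and after (y) training, d = x - y
before = [110, 120, 123, 132, 125]
after = [120, 118, 125, 136, 121]
iq_diff = [x - y for x, y in zip(before, after)]
t6, d_bar6, s2_6, df6 = paired_t(iq_diff)
print(d_bar6, round(s2_6, 1), round(t6, 3), df6)
```

Only the differences enter the calculation, which is why the paired test reduces to a one-sample t-test on d with hypothesised mean zero.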