11

Analysis of Variance and Co-variance
WHAT IS ANOVA?
Professor R.A. Fisher was the first man to use the term ‘Variance’* and, in fact, it was he who developed a very elaborate theory concerning ANOVA, explaining its usefulness in the practical field.
* Variance is an important statistical measure and is described as the mean of the squares of deviations taken from the mean of the given series of data. It is a frequently used measure of variation. Its square root is known as standard deviation, i.e., Standard deviation = √Variance.
Later on Professor Snedecor and many others contributed to the development of this technique.
ANOVA is essentially a procedure for testing the difference among different groups of data for
homogeneity. “The essence of ANOVA is that the total amount of variation in a set of data is broken
down into two types, that amount which can be attributed to chance and that amount which can be
attributed to specified causes.”1 There may be variation between samples and also within sample
items. ANOVA consists in splitting the variance for analytical purposes. Hence, it is a method of
analysing the variance to which a response is subject into its various components corresponding to
various sources of variation. Through this technique one can explain whether various varieties of seeds or fertilizers or soils differ significantly so that a policy decision could be taken accordingly, concerning a particular variety in the context of agricultural research. Similarly, the differences in
various types of feed prepared for a particular class of animal or various types of drugs manufactured
for curing a specific disease may be studied and judged to be significant or not through the application
of ANOVA technique. Likewise, a manager of a big concern can analyse the performance of
various salesmen of his concern in order to know whether their performances differ significantly.
Thus, through ANOVA technique one can, in general, investigate any number of factors which
are hypothesized or said to influence the dependent variable. One may as well investigate the
differences amongst various categories within each of these factors which may have a large number
of possible values. If we take only one factor and investigate the differences amongst its various
categories having numerous possible values, we are said to use one-way ANOVA and in case we
investigate two factors at the same time, then we use two-way ANOVA. In a two-way (or higher) ANOVA, the interaction (i.e., the inter-relation between the independent variables/factors), if any, in their effect on the dependent variable can as well be studied for better decisions.
1. Donald L. Harnett and James L. Murphy, Introductory Statistical Analysis, p. 376.
This value of F is to be compared to the F-limit for given degrees of freedom. If the F value we work out equals or exceeds* the F-limit value (to be seen from the F-tables No. 4(a) and 4(b) given in the appendix), we may say that there are significant differences between the sample means.

* It should be remembered that the ANOVA test is always a one-tailed test, since a low calculated value of F from the sample data would mean that the fit of the sample means to the null hypothesis (viz., X̄1 = X̄2 = … = X̄k) is a very good fit.
ANOVA TECHNIQUE
One-way (or single factor) ANOVA: Under one-way ANOVA, we consider only one factor; the reason the factor is of interest is that several possible types of samples can occur within it, and we then determine whether there are differences within that factor.
The technique involves the following steps:
(i) Obtain the mean of each sample, i.e., obtain X̄1, X̄2, X̄3, …, X̄k when there are k samples.
(ii) Work out the mean of the sample means as follows:
     X̿ = (X̄1 + X̄2 + X̄3 + … + X̄k) / (No. of samples, k)
(iii) Take the deviations of the sample means from the mean of the sample means and calculate the square of such deviations which may be multiplied by the number of items in the corresponding sample, and then obtain their total. This is known as the sum of squares for variance between the samples (or SS between). Symbolically, this can be written:
     SS between = n1(X̄1 − X̿)² + n2(X̄2 − X̿)² + … + nk(X̄k − X̿)²
(iv) Divide the result of the (iii) step by the degrees of freedom between the samples to obtain
variance or mean square (MS) between samples. Symbolically, this can be written:
     MS between = SS between / (k – 1)
where (k – 1) represents degrees of freedom (d.f.) between samples.
(v) Obtain the deviations of the values of the sample items for all the samples from corresponding
means of the samples and calculate the squares of such deviations and then obtain their
total. This total is known as the sum of squares for variance within samples (or SS within).
Symbolically this can be written:
     SS within = Σ(X1i − X̄1)² + Σ(X2i − X̄2)² + … + Σ(Xki − X̄k)²,   i = 1, 2, 3, …
(vi) Divide the result of (v) step by the degrees of freedom within samples to obtain the variance
or mean square (MS) within samples. Symbolically, this can be written:
     MS within = SS within / (n – k)
where (n – k) represents degrees of freedom within samples,
n = total number of items in all the samples i.e., n1 + n2 + … + nk
k = number of samples.
(vii) For a check, the sum of squares of deviations for total variance can also be worked out by
adding the squares of deviations when the deviations for the individual items in all the
samples have been taken from the mean of the sample means. Symbolically, this can be
written:
     SS for total variance = Σ(Xij − X̿)²,   i = 1, 2, 3, …; j = 1, 2, 3, …
This total should be equal to the total of the result of the (iii) and (v) steps explained above
i.e.,
SS for total variance = SS between + SS within.
The degrees of freedom for total variance will be equal to the number of items in all
samples minus one i.e., (n – 1). The degrees of freedom for between and within must add
up to the degrees of freedom for total variance i.e.,
(n – 1) = (k – 1) + (n – k)
This fact explains the additive property of the ANOVA technique.
(viii) Finally, F-ratio may be worked out as under:
     F-ratio = MS between / MS within
This ratio is used to judge whether the difference among several sample means is significant
or is just a matter of sampling fluctuations. For this purpose we look into the table*, giving
the values of F for given degrees of freedom at different levels of significance. If the
worked out value of F, as stated above, is less than the table value of F, the difference is
taken as insignificant i.e., due to chance and the null-hypothesis of no difference between
sample means stands. In case the calculated value of F happens to be either equal or more
than its table value, the difference is considered as significant (which means the samples
could not have come from the same universe) and accordingly the conclusion may be
drawn. The higher the calculated value of F is above the table value, the more definite and
sure one can be about his conclusions.
* An extract of the table giving F-values has been given in the Appendix at the end of the book in Tables 4(a) and 4(b).
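For readers who wish to check these computations numerically, the following minimal Python sketch (NumPy is assumed to be available; the function name one_way_anova and the sample figures are merely illustrative) works through steps (i) to (viii):

import numpy as np

def one_way_anova(samples):
    """Work out MS between, MS within and the F-ratio for k samples,
    following steps (i) to (viii) above."""
    k = len(samples)                                   # number of samples
    n = sum(len(s) for s in samples)                   # total number of items
    sample_means = [np.mean(s) for s in samples]       # step (i)
    grand_mean = np.mean(sample_means)                 # step (ii): mean of the sample means
    ss_between = sum(len(s) * (m - grand_mean) ** 2    # step (iii)
                     for s, m in zip(samples, sample_means))
    ms_between = ss_between / (k - 1)                  # step (iv)
    ss_within = sum(((np.asarray(s) - m) ** 2).sum()   # step (v)
                    for s, m in zip(samples, sample_means))
    ms_within = ss_within / (n - k)                    # step (vi)
    return ms_between, ms_within, ms_between / ms_within   # step (viii): F-ratio

# Three hypothetical samples of four items each
ms_b, ms_w, f_ratio = one_way_anova([[6, 7, 3, 8], [5, 5, 3, 7], [5, 4, 3, 4]])
print(ms_b, ms_w, f_ratio)   # compare f_ratio with the table value of F for (k - 1, n - k) d.f.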
For the sake of convenience, the information obtained through the various steps stated above can be put in the form of an ANOVA table as under:

Table 11.1: Analysis of Variance Table for One-way ANOVA

Source of variation            Sum of squares (SS)                                  d.f.      Mean square (MS)        F-ratio
Between samples or categories  n1(X̄1 − X̿)² + n2(X̄2 − X̿)² + … + nk(X̄k − X̿)²          (k – 1)   SS between/(k – 1)      MS between/MS within
Within samples or categories   Σ(X1i − X̄1)² + … + Σ(Xki − X̄k)²,  i = 1, 2, 3, …       (n – k)   SS within/(n – k)
Total                          Σ(Xij − X̿)²,  i = 1, 2, …; j = 1, 2, …                  (n – 1)
In practice, the above computations are carried out more conveniently by a short-cut method. Under it we first take the total of the values of the individual items in all the samples, i.e., work out ΣXij and call it T, and then work out the correction factor as under:
     Correction factor = (T)²/n
(iii) Find out the square of all the item values one by one and then take its total. Subtract the
correction factor from this total and the result is the sum of squares for total variance.
Symbolically, we can write:
     Total SS = ΣXij² − (T)²/n,   i = 1, 2, 3, …; j = 1, 2, 3, …
(iv) Obtain the square of each sample total (Tj) and divide such square value of each sample
by the number of items in the concerning sample and take the total of the result thus
obtained. Subtract the correction factor from this total and the result is the sum of squares
for variance between the samples. Symbolically, we can write:
     SS between = Σ[(Tj)²/nj] − (T)²/n,   j = 1, 2, 3, …
where subscript j represents different samples or categories.
(v) The sum of squares within the samples can be found out by subtracting the result of (iv)
step from the result of (iii) step stated above and can be written as under:
     SS within = {ΣXij² − (T)²/n} − {Σ[(Tj)²/nj] − (T)²/n}
               = ΣXij² − Σ[(Tj)²/nj]
After doing all this, the table of ANOVA can be set up in the same way as explained
earlier.
CODING METHOD
The coding method is a furtherance of the short-cut method. It is based on an important property of the F-ratio: its value does not change if all the n item values are either multiplied or divided by a common figure, or if a common figure is either added to or subtracted from each of the given n item values. Through this method big figures are reduced in magnitude by division or subtraction and the computation work is simplified without any disturbance to the F-ratio. This method should be used especially when the given figures are big or otherwise inconvenient. Once the given figures are converted with the help of some common figure, all the steps of the short-cut method stated above can be adopted for obtaining and interpreting the F-ratio.
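As a quick numerical check of this property, the short sketch below (again purely illustrative, reusing the one_way_anova function sketched earlier and using hypothetical "big" figures) codes the data by subtracting a common figure and confirms that the F-ratio is unchanged:

# A quick check of the coding property: subtracting a common figure from every
# item (here 60) leaves the F-ratio unchanged.
raw = [[66, 67, 63, 68], [65, 65, 63, 67], [65, 64, 63, 64]]   # hypothetical "big" figures
coded = [[x - 60 for x in s] for s in raw]                     # coded figures

_, _, f_raw = one_way_anova(raw)       # one_way_anova as sketched earlier
_, _, f_coded = one_way_anova(coded)
assert abs(f_raw - f_coded) < 1e-9     # identical F-ratio either way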
Illustration 1
Set up an analysis of variance table for the following per acre production data for three varieties of wheat, each grown on 4 plots, and state if the variety differences are significant.

Per Acre Production Data

Plot of land      Variety of wheat
                  A        B        C
1                 6        5        5
2                 7        5        4
3                 3        3        3
4                 8        7        4
Solution: We can solve the problem by the direct method or by short-cut method, but in each case
we shall get the same result. We try below both the methods.
Solution through direct method: First we calculate the mean of each of these samples:
X̄1 = (6 + 7 + 3 + 8)/4 = 6
X̄2 = (5 + 5 + 3 + 7)/4 = 5
X̄3 = (5 + 4 + 3 + 4)/4 = 4
Mean of the sample means or X̿ = (X̄1 + X̄2 + X̄3)/k = (6 + 5 + 4)/3 = 5
Now we work out SS between and SS within samples:
SS between = n1(X̄1 − X̿)² + n2(X̄2 − X̿)² + n3(X̄3 − X̿)²
           = 4(6 − 5)² + 4(5 − 5)² + 4(4 − 5)²
           = 4 + 0 + 4
           = 8
SS within = Σ(X1i − X̄1)² + Σ(X2i − X̄2)² + Σ(X3i − X̄3)²,   i = 1, 2, 3, 4
= {(6 – 6)2 + (7 – 6)2 + (3 – 6)2 + (8 – 6)2}
+ {(5 – 5)2 + (5 – 5)2 + (3 – 5)2 + (7 – 5)2}
+ {(5 – 4)2 + (4 – 4)2 + (3 – 4)2 + (4 – 4)2}
= {0 + 1 + 9 + 4} + {0 + 0 + 4 + 4} + {1 + 0 + 1 + 0}
= 14 + 8 + 2
= 24
SS for total variance = Σ(Xij − X̿)²,   i = 1, 2, 3, …; j = 1, 2, 3, …
= (6 − 5)² + (7 − 5)² + (3 − 5)² + (8 − 5)²
+ (5 − 5)² + (5 − 5)² + (3 − 5)² + (7 − 5)²
+ (5 − 5)² + (4 − 5)² + (3 − 5)² + (4 − 5)²
= 1 + 4 + 4 + 9 + 0 + 0 + 4 + 4 + 0 + 1 + 4 + 1 = 32
which is equal to SS between + SS within (i.e., 8 + 24). The ANOVA table can now be set up as shown below.
Table 11.2: The ANOVA Table for Illustration 1

Source of variation    SS    d.f.            MS             F-ratio            5% F-limit (from the F-table)
Between samples         8    (3 – 1) = 2     8/2 = 4.00     4.00/2.67 = 1.5    F(2, 9) = 4.26
Within samples         24    (12 – 3) = 9    24/9 = 2.67
Total                  32    (12 – 1) = 11
The above table shows that the calculated value of F is 1.5 which is less than the table value of 4.26 at the 5% level with d.f. being v1 = 2 and v2 = 9, and hence could have arisen due to chance. This analysis supports the null-hypothesis of no difference in sample means. We may, therefore, conclude that the difference in wheat output due to varieties is insignificant and is just a matter of chance.
Solution through short-cut method: In this case we first take the total of all the individual values of n items and call it T.
T in the given case = 60 and n = 12
Hence, the correction factor = (T)²/n = 60 × 60/12 = 300. Now total SS, SS between and SS within can be worked out as under:
Total SS = ΣXij² − (T)²/n,   i = 1, 2, 3, …; j = 1, 2, 3, …
= [(6)² + (7)² + (3)² + (8)² + (5)² + (5)² + (3)² + (7)² + (5)² + (4)² + (3)² + (4)²] − (60 × 60/12)
= 332 − 300 = 32
SS between = Σ[(Tj)²/nj] − (T)²/n
= (24 × 24)/4 + (20 × 20)/4 + (16 × 16)/4 − (60 × 60/12)
= 144 + 100 + 64 − 300
= 8
SS within = ΣXij² − Σ[(Tj)²/nj]
= 332 − 308
= 24
It may be noted that we get exactly the same result as we had obtained in the case of the direct method. From now onwards we can set up the ANOVA table and interpret the F-ratio in the same manner as we have already done under the direct method.
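The whole of the above short-cut arithmetic can be cross-checked with a few lines of Python (an illustrative sketch only; NumPy assumed):

import numpy as np

data = np.array([[6, 7, 3, 8],     # variety A
                 [5, 5, 3, 7],     # variety B
                 [5, 4, 3, 4]])    # variety C (each row is one sample)

T = data.sum()                     # T = 60
n = data.size                      # n = 12
correction = T ** 2 / n            # (T)^2 / n = 300

total_ss = (data ** 2).sum() - correction                                 # 332 - 300 = 32
ss_between = (data.sum(axis=1) ** 2 / data.shape[1]).sum() - correction   # 308 - 300 = 8
ss_within = total_ss - ss_between                                         # 24

f_ratio = (ss_between / 2) / (ss_within / 9)    # (8/2) / (24/9) = 1.5
print(total_ss, ss_between, ss_within, f_ratio)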
TWO-WAY ANOVA
Two-way ANOVA technique is used when the data are classified on the basis of two factors. For
example, the agricultural output may be classified on the basis of different varieties of seeds and also
on the basis of different varieties of fertilizers used. A business firm may have its sales data classified
on the basis of different salesmen and also on the basis of sales in different regions. In a factory, the
various units of a product produced during a certain period may be classified on the basis of different
varieties of machines used and also on the basis of different grades of labour. Such a two-way design
may have repeated measurements of each factor or may not have repeated values. The ANOVA
technique is a little different in case of repeated measurements where we also compute the interaction
variation. We shall now explain the two-way ANOVA technique in the context of both the said
designs with the help of examples.
(a) ANOVA technique in context of two-way design when repeated values are not there: As we
do not have repeated values, we cannot directly compute the sum of squares within samples as we
had done in the case of one-way ANOVA. Therefore, we have to calculate this residual or error
variation by subtraction, once we have calculated (just on the same lines as we did in the case of one-
way ANOVA) the sum of squares for total variance and for variance between varieties of one
treatment as also for variance between varieties of the other treatment.
The initial steps are the same as in the short-cut method for one-way ANOVA: take the total of all the item values (or of their coded values, as the case may be), call it T, and work out the correction factor as under:
     Correction factor = (T)²/n
(iv) Find out the square of all the item values (or their coded values as the case may be) one by
one and then take its total. Subtract the correction factor from this total to obtain the sum of
squares of deviations for total variance. Symbolically, we can write it as:
Sum of squares of deviations for total variance or total SS
     = ΣXij² − (T)²/n
(v) Take the total of different columns and then obtain the square of each column total and
divide such squared values of each column by the number of items in the concerning
column and take the total of the result thus obtained. Finally, subtract the correction factor
from this total to obtain the sum of squares of deviations for variance between columns or
(SS between columns).
(vi) Take the total of different rows and then obtain the square of each row total and divide
such squared values of each row by the number of items in the corresponding row and take
the total of the result thus obtained. Finally, subtract the correction factor from this total to
obtain the sum of squares of deviations for variance between rows (or SS between rows).
(vii) Sum of squares of deviations for residual or error variance can be worked out by subtracting
the result of the sum of (v)th and (vi)th steps from the result of (iv)th step stated above. In
other words,
Total SS – (SS between columns + SS between rows)
= SS for residual or error variance.
(viii) Degrees of freedom (d.f.) can be worked out as under:
d.f. for total variance = (c . r – 1)
d.f. for variance between columns = (c – 1)
d.f. for variance between rows = (r – 1)
d.f. for residual variance = (c – 1) (r – 1)
where c = number of columns
r = number of rows
(ix) ANOVA table can be set up in the usual fashion as shown below:
Table 11.3: Analysis of Variance Table for Two-way Anova (in a design without repeated values)

Source of variation        SS                              d.f.              MS                              F-ratio
Between columns treatment  Σ[(Tj)²/nj] − (T)²/n            (c – 1)           SS between columns/(c – 1)      MS between columns/MS residual
Between rows treatment     Σ[(Ti)²/ni] − (T)²/n            (r – 1)           SS between rows/(r – 1)         MS between rows/MS residual
Residual or error          Total SS − (SS between          (c – 1)(r – 1)    SS residual/(c – 1)(r – 1)
                           columns + SS between rows)
Total                      ΣXij² − (T)²/n                  (c·r – 1)
Illustration 2
Set up an ANOVA table for the following per acre production data for three varieties of wheat, each grown on four plots to which four different varieties of fertilizers were applied, and state whether the differences are significant:

Per Acre Production Data

Variety of fertilizer      Variety of wheat
                           A        B        C
W                          6        5        5
X                          7        5        4
Y                          3        3        3
Z                          8        7        4

Solution: As the given problem is a two-way design of experiment without repeated values, we shall adopt all the above stated steps while setting up the ANOVA table as illustrated below.
The ANOVA table for the given problem is shown in Table 11.5.

Table 11.5: The ANOVA Table for Illustration 2

Source of variation           SS    d.f.   MS    F-ratio     5% F-limit
Between columns (seeds)        8     2      4    4/1 = 4     F(2, 6) = 5.14
Between rows (fertilizers)    18     3      6    6/1 = 6     F(3, 6) = 4.76
Residual or error              6     6      1
Total                         32    11

From the said ANOVA table, we find that the differences concerning varieties of seeds are insignificant at the 5% level as the calculated F-ratio of 4 is less than the table value of 5.14, but the variety differences concerning fertilizers are significant as the calculated F-ratio of 6 is more than its table value of 4.76.
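The computations for this two-way design can be cross-checked with the following short NumPy sketch (an illustrative check only, using the production data reproduced above; the variable names are our own):

import numpy as np

# Rows = varieties of fertilizers (W, X, Y, Z); columns = varieties of seeds (A, B, C)
yields = np.array([[6, 5, 5],
                   [7, 5, 4],
                   [3, 3, 3],
                   [8, 7, 4]])

r, c = yields.shape
T = yields.sum()
cf = T ** 2 / (r * c)                                   # correction factor = 300

total_ss = (yields ** 2).sum() - cf                     # 32
ss_columns = (yields.sum(axis=0) ** 2 / r).sum() - cf   # between seeds: 8
ss_rows = (yields.sum(axis=1) ** 2 / c).sum() - cf      # between fertilizers: 18
ss_residual = total_ss - ss_columns - ss_rows           # 6

ms_residual = ss_residual / ((c - 1) * (r - 1))         # 6/6 = 1
f_seeds = (ss_columns / (c - 1)) / ms_residual          # 4, to be compared with F(2, 6) = 5.14
f_fertilizers = (ss_rows / (r - 1)) / ms_residual       # 6, to be compared with F(3, 6) = 4.76
print(f_seeds, f_fertilizers)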
(b) ANOVA technique in context of two-way design when repeated values are there: In case of
a two-way design with repeated measurements for all of the categories, we can obtain a separate
independent measure of inherent or smallest variations. For this measure we can calculate the sum
of squares and degrees of freedom in the same way as we had worked out the sum of squares for
variance within samples in the case of one-way ANOVA. Total SS, SS between columns and SS
between rows can also be worked out as stated above. We then find left-over sums of squares and
left-over degrees of freedom which are used for what is known as ‘interaction variation’ (interaction is the measure of inter-relationship between the two different classifications). After making all these
computations, ANOVA table can be set up for drawing inferences. We illustrate the same with an
example.
Table 11.4: Computations for Two-way Anova (in a design without repeated values)

Step (i)  T = 60, n = 12, thus the correction factor = (60 × 60)/12 = 300
Step (ii) Total SS = (36 + 25 + 25 + 49 + 25 + 16 + 9 + 9 + 9 + 64 + 49 + 16) − (60 × 60/12)
          = 332 − 300
          = 32
Illustration 3
Set up an ANOVA table for the following information relating to three drugs tested to judge their effectiveness in reducing blood pressure for three different groups of people:

Amount of Blood Pressure Reduction in Millimeters of Mercury

                           Drug
Group of people        X          Y          Z
A                    14, 15     10, 9      11, 11
B                    12, 11      7, 8      10, 11
C                    10, 11     11, 11      8, 7
Table 11.6: Computations for Two-way Anova (in design with repeated values)
Step (i)  T = 187, n = 18, thus, the correction factor = (187 × 187)/18 = 1942.72
Step (ii) Total SS = [(14)² + (15)² + (12)² + (11)² + (10)² + (11)² + (10)² + (9)² + (7)² + (8)² + (11)² + (11)²
          + (11)² + (11)² + (10)² + (11)² + (8)² + (7)²] − [(187)²/18]
          = 2019 − 1942.72
          = 76.28
Step (iii) SS between columns (i.e., between drugs) = [(73 × 73)/6 + (56 × 56)/6 + (58 × 58)/6] − [(187)²/18]
           = 888.16 + 522.66 + 560.67 − 1942.72
           = 28.77
Step (iv) SS between rows (i.e., between people) = [(70 × 70)/6 + (59 × 59)/6 + (58 × 58)/6] − [(187)²/18]
          = 816.67 + 580.16 + 560.67 − 1942.72
          = 14.78
Step (v) SS within samples = (14 – 14.5)2 + (15 – 14.5)2 + (10 – 9.5)2 + (9 – 9.5)2 + (11 – 11)2 + (11 – 11)2
+ (12 – 11.5)2 + (11 – 11.5)2 + (7 – 7.5)2 + (8 – 7.5)2
+ (10 – 10.5)2 + (11 – 10.5)2 + (10 – 10.5)2 + (11 – 10.5)2
+ (11 – 11)2 + (11 – 11)2 + (8 – 7.5)2 + (7 – 7.5)2
= 3.50
Step (vi) SS for interaction variation = 76.28 – [28.77 + 14.78 + 3.50]
= 29.23
Fig. 11.1: Graph of the averages for the three groups of people A, B and C drawn against the drugs X, Y and Z (Y-axis: blood pressure reduction in millimeters of mercury; X-axis: drugs).
* Alternatively, the graph can be drawn by taking the different groups of people on the X-axis and drawing lines for the various drugs through the averages.
The graph indicates that there is a significant interaction because the different connecting lines
for groups of people do cross over each other. We find that A and B are affected very similarly, but
C is affected differently. The highest reduction in blood pressure in case of C is with drug Y and the
lowest reduction is with drug Z, whereas the highest reduction in blood pressure in case of A and B
is with drug X and the lowest reduction is with drug Y. Thus, there is definite inter-relation between
the drugs and the groups of people and one cannot make any strong statements about drugs unless he
also qualifies his conclusions by stating which group of people he is dealing with. In such a situation,
performing F-tests is meaningless. But if the lines do not cross over each other (and remain more or
less identical), then there is no interaction or the interaction is not considered a significantly large
value, in which case the researcher should proceed to test the main effects, drugs and people in the
given case, as stated earlier.
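The interaction just discussed can also be checked numerically. The sketch below (illustrative only; NumPy assumed) reproduces the computations of Table 11.6 from the data of Illustration 3 and works out the F-ratio for the interaction variation:

import numpy as np

# cells[i, j] holds the two repeated measurements for group i (A, B, C) and drug j (X, Y, Z)
cells = np.array([[[14, 15], [10,  9], [11, 11]],   # group A
                  [[12, 11], [ 7,  8], [10, 11]],   # group B
                  [[10, 11], [11, 11], [ 8,  7]]])  # group C

T, n = cells.sum(), cells.size                      # 187, 18
cf = T ** 2 / n                                     # 1942.72

total_ss = (cells ** 2).sum() - cf                                     # 76.28
ss_drugs = (cells.sum(axis=(0, 2)) ** 2 / 6).sum() - cf                # between columns: 28.78
ss_people = (cells.sum(axis=(1, 2)) ** 2 / 6).sum() - cf               # between rows: 14.78
ss_within = ((cells - cells.mean(axis=2, keepdims=True)) ** 2).sum()   # 3.50
ss_interaction = total_ss - ss_drugs - ss_people - ss_within           # 29.22

ms_within = ss_within / 9                           # d.f. within = 18 - 9 cells = 9
f_interaction = (ss_interaction / 4) / ms_within    # d.f. interaction = (3 - 1)(3 - 1) = 4
print(f_interaction)                                # a large value: the interaction is significant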
ANOVA IN LATIN-SQUARE DESIGN
In a Latin-square design each variety (treatment) occurs once and only once in each row and in each column of the square, and the total variance is split into variance between columns, variance between rows, variance between varieties and the residual or error variance, as summarised in Table 11.8.

Table 11.8: Analysis of Variance Table in Latin-Square Design

Source of variation    SS                             d.f.              MS                              F-ratio
Between columns        Σ[(Tj)²/nj] − (T)²/n           (c* – 1)          SS between columns/(c – 1)      MS between columns/MS residual
Between rows           Σ[(Ti)²/ni] − (T)²/n           (r – 1)           SS between rows/(r – 1)         MS between rows/MS residual
Between varieties      Σ[(Tv)²/nv] − (T)²/n           (v – 1)           SS between varieties/(v – 1)    MS between varieties/MS residual
Residual or error      Total SS − (SS between         (c – 1)(c – 2)    SS residual/(c – 1)(c – 2)
                       columns + SS between rows
                       + SS between varieties)
Total                  Σ(Xij)² − (T)²/n               (c·r – 1)

* In place of c we can as well write r or v since in Latin-square design c = r = v.
where total SS = Σ(Xij)² − (T)²/n
c = number of columns
r = number of rows
v = number of varieties
Illustration 4
Analyse and interpret the following statistics concerning output of wheat per field obtained as a result of an experiment conducted to test four varieties of wheat viz., A, B, C and D under a Latin-square design.
C B A D
25 23 20 20
A D C B
19 19 21 18
B A D C
19 14 17 20
D C B A
17 20 21 15
Solution: Using the coding method, we subtract 20 from the figures given in each of the small
squares and obtain the coded figures as under:
                    Columns
                1        2        3        4       Row totals
Rows     1     C  5     B  3     A  0     D  0          8
         2     A −1     D −1     C  1     B −2         −3
         3     B −1     A −6     D −3     C  0        −10
         4     D −3     C  0     B  1     A −5         −7
Column
totals          0       −4       −1       −7       T = −12
Squares of coded figures

                    Columns
                1        2        3        4       Sum of squares
Rows     1     C 25     B  9     A  0     D  0          34
         2     A  1     D  1     C  1     B  4           7
         3     B  1     A 36     D  9     C  0          46
         4     D  9     C  0     B  1     A 25          35
Sum of
squares        36       46       11       29       Total = 122
Correction factor = (T)²/n = (−12)(−12)/16 = 9

Total SS = sum of squares of coded figures − correction factor = 122 − 9 = 113
SS between columns = [(0)² + (−4)² + (−1)² + (−7)²]/4 − 9 = 66/4 − 9 = 7.5
SS between rows = [(8)² + (−3)² + (−10)² + (−7)²]/4 − 9 = 222/4 − 9 = 46.5
For finding SS for variance between varieties, we would first rearrange the coded data in the
following form:
Table 11.9

Varieties of wheat    Yield in different parts of field        Total (Tv)
                        I        II       III       IV
A                      −1       −6         0        −5            −12
B                      −1        3         1        −2              1
C                       5        0         1         0              6
D                      −3       −1        −3         0             −7
SS between varieties = Σ[(Tv)²/nv] − (T)²/n
= [(−12)² + (1)² + (6)² + (−7)²]/4 − 9 = 230/4 − 9 = 48.5
Sum of squares for residual or error variance = 113 − (7.5 + 46.5 + 48.5) = 10.5

Table 11.10: The ANOVA Table in Latin-Square Design for Illustration 4

Source of variation     SS        d.f.    MS                   F-ratio                5% F-limit
Between columns          7.50      3      7.50/3 = 2.50        2.50/1.75 = 1.43       F(3, 6) = 4.76
Between rows            46.50      3      46.50/3 = 15.50      15.50/1.75 = 8.85      F(3, 6) = 4.76
Between varieties       48.50      3      48.50/3 = 16.17      16.17/1.75 = 9.24      F(3, 6) = 4.76
Residual or error       10.50      6      10.50/6 = 1.75
Total                  113.00     15
The above table shows that the variance between rows and the variance between varieties are significant and not due to chance at the 5% level of significance, as the F-ratios worked out for the said two sources are 8.85 and 9.24 respectively, which are greater than the table value of 4.76. But the variance between columns is insignificant and is due to chance because the calculated value of 1.43 is less than the table value of 4.76.
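These Latin-square computations can likewise be verified with a short NumPy sketch (purely illustrative; it works on the coded figures tabulated above):

import numpy as np

# Coded yields (original figures minus 20) and the variety occupying each plot
coded = np.array([[ 5,  3,  0,  0],
                  [-1, -1,  1, -2],
                  [-1, -6, -3,  0],
                  [-3,  0,  1, -5]])
variety = np.array([['C', 'B', 'A', 'D'],
                    ['A', 'D', 'C', 'B'],
                    ['B', 'A', 'D', 'C'],
                    ['D', 'C', 'B', 'A']])

T, n = coded.sum(), coded.size                        # -12, 16
cf = T ** 2 / n                                       # 9

total_ss = (coded ** 2).sum() - cf                    # 122 - 9 = 113
ss_columns = (coded.sum(axis=0) ** 2 / 4).sum() - cf  # 7.5
ss_rows = (coded.sum(axis=1) ** 2 / 4).sum() - cf     # 46.5
variety_totals = np.array([coded[variety == v].sum() for v in 'ABCD'])
ss_varieties = (variety_totals ** 2 / 4).sum() - cf   # 48.5
ss_residual = total_ss - ss_columns - ss_rows - ss_varieties   # 10.5

ms_residual = ss_residual / 6                         # d.f. residual = (4 - 1)(4 - 2) = 6
for name, ss in [('columns', ss_columns), ('rows', ss_rows), ('varieties', ss_varieties)]:
    print(name, (ss / 3) / ms_residual)               # 1.43, 8.85, 9.24; compare with F(3, 6) = 4.76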
ANOCOVA TECHNIQUE
While applying the ANOCOVA technique, the influence of the uncontrolled variable is usually removed by the simple linear regression method and the residual sums of squares are used to provide variance estimates which in turn are used to make tests of significance. In other words, covariance analysis consists in subtracting from each individual score (Yi) that portion of it (Yi′) that is predictable from the uncontrolled variable (Zi) and then computing the usual analysis of variance on the resulting (Y – Y′)’s, of course making due adjustment to the degrees of freedom because of the fact that estimation using the regression method involves a loss of degrees of freedom.*
2. George A. Ferguson, Statistical Analysis in Psychology and Education, 4th ed., p. 347.
* Degrees of freedom associated with adjusted sums of squares will be as under:
    Between    k – 1
    Within     N – k – 1
    Total      N – 2
ASSUMPTIONS IN ANOCOVA
The ANOCOVA technique requires one to assume that there is some sort of relationship between
the dependent variable and the uncontrolled variable. We also assume that this form of relationship is
the same in the various treatment groups. Other assumptions are:
(i) Various treatment groups are selected at random from the population.
(ii) The groups are homogeneous in variability.
(iii) The regression is linear and is the same from group to group.
The short-cut method for ANOCOVA can be explained by means of an example as shown
below:
Illustration 5
The following are paired observations for three experimental groups:
Group I Group II Group III
X Y X Y X Y
7 2 15 8 30 15
6 5 24 12 35 16
9 7 25 15 32 20
15 9 19 18 38 24
12 10 31 19 40 30
Y is the covariate (or concomitant) variable. Calculate the adjusted total, within groups and
between groups, sums of squares on X and test the significance of differences between the adjusted
means on X by using the appropriate F-ratio. Also calculate the adjusted means on X.
Solution: We apply the technique of analysis of covariance and work out the related measures as
under:
Table 11.11
Group I Group II Group III
X Y X Y X Y
7 2 15 8 30 15
6 5 24 12 35 16
9 7 25 15 32 20
15 9 19 18 38 24
12 10 31 19 40 30
Total 49 33 114 72 175 105
Mean 9.80 6.60 22.80 14.40 35.00 21.00
N = 15;   ΣX = 49 + 114 + 175 = 338;   ΣY = 33 + 72 + 105 = 210
ΣX² = 9476          ΣY² = 3734          ΣXY = 5838
Correction factor for X = (ΣX)²/N = (338)²/15 = 7616.27
Correction factor for Y = (ΣY)²/N = (210)²/15 = 2940.00
Correction factor for XY = (ΣX · ΣY)/N = (338 × 210)/15 = 4732
Hence, total SS for X = ∑X 2 – correction factor for X
= 9476 – 7616.27 = 1859.73
SS between for X = [(49)²/5 + (114)²/5 + (175)²/5] − correction factor for X
= (480.2 + 2599.2 + 6125) − (7616.27)
= 1588.13
SS within for X = (total SS for X) – (SS between for X)
= (1859.73) – (1588.13) = 271.60
Similarly we work out the following values in respect of Y and of the product XY:
Total SS for Y = ΣY² − correction factor for Y = 3734 − 2940 = 794
SS between for Y = [(33)² + (72)² + (105)²]/5 − 2940 = 3459.6 − 2940 = 519.60
SS within for Y = 794 − 519.60 = 274.40
Total sum of products of X and Y = ΣXY − correction factor for XY = 5838 − 4732 = 1106
Sum of products between groups = [(49)(33) + (114)(72) + (175)(105)]/5 − 4732 = 5640 − 4732 = 908
Sum of products within groups = 1106 − 908 = 198
The adjusted sums of squares on X are then worked out as under:
Adjusted total SS = (total SS for X) − (total sum of products)²/(total SS for Y)
= 1859.73 − (1106)²/794
= (1859.73) − (1540.60)
= 319.13
Adjusted SS within groups = (SS within for X) − (sum of products within groups)²/(SS within for Y)
= 271.60 − (198)²/274.40
= (271.60) − (142.87)
= 128.73
Adjusted SS between groups = (adjusted total SS) – (Adjusted SS within group)
= (319.13 – 128.73)
= 190.40
Anova Table for Adjusted X

Source            d.f.     SS        MS      F-ratio
Between groups     2       190.40    95.2    8.14
Within groups     11       128.73    11.7
Total             13       319.13
At the 5% level, the table value of F for v1 = 2 and v2 = 11 is 3.98 and at the 1% level the table value of F is 7.21. Both these table values are less than the calculated value of 8.14, and accordingly we infer that the F-ratio is significant at both levels, which means the difference in group means is significant.
Adjusted means on X will be worked out as follows:
Regression coefficient for X on Y, i.e., b = (sum of products within groups)/(sum of squares within groups for Y)
= 198/274.40 = 0.7216
Adjusted means of groups in X = (Final mean) – b (deviation of initial mean from general mean
in case of Y)
Hence,
Adjusted mean for Group I = (9.80) – 0.7216 (–7.4) = 15.14
Adjusted mean for Group II = (22.80) – 0.7216 (0.40) = 22.51
Adjusted mean for Group III = (35.00) – 0.7216 (7.00) = 29.95
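The adjusted sums of squares and the adjusted means obtained above can be cross-checked with the following NumPy sketch (illustrative only; the helper function total_and_within is our own shorthand, not a standard routine):

import numpy as np

# X and Y values for the three groups of Illustration 5 (Y is the covariate)
X = [np.array([7, 6, 9, 15, 12]),
     np.array([15, 24, 25, 19, 31]),
     np.array([30, 35, 32, 38, 40])]
Y = [np.array([2, 5, 7, 9, 10]),
     np.array([8, 12, 15, 18, 19]),
     np.array([15, 16, 20, 24, 30])]

def total_and_within(A, B):
    """Total and within-group sums of products of the deviations of A and B."""
    a, b = np.concatenate(A), np.concatenate(B)
    total = (a * b).sum() - a.sum() * b.sum() / a.size
    within = sum((u * v).sum() - u.sum() * v.sum() / u.size for u, v in zip(A, B))
    return total, within

Txx, Exx = total_and_within(X, X)          # 1859.73, 271.60
Tyy, Eyy = total_and_within(Y, Y)          #  794.00, 274.40
Txy, Exy = total_and_within(X, Y)          # 1106.00, 198.00

adj_total = Txx - Txy ** 2 / Tyy           # 319.13
adj_within = Exx - Exy ** 2 / Eyy          # 128.73
adj_between = adj_total - adj_within       # 190.40
f_ratio = (adj_between / 2) / (adj_within / 11)    # 8.14

b_reg = Exy / Eyy                          # 0.7216
grand_Y = np.concatenate(Y).mean()         # 14
adjusted_means = [x.mean() - b_reg * (y.mean() - grand_Y) for x, y in zip(X, Y)]
print(f_ratio, adjusted_means)             # 8.14, [15.14, 22.51, 29.95]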
Questions
1. (a) Explain the meaning of analysis of variance. Describe briefly the technique of analysis of variance for
one-way and two-way classifications.
(b) State the basic assumptions of the analysis of variance.
2. What do you mean by the additive property of the technique of the analysis of variance? Explain how
this technique is superior in comparison to sampling.
3. Write short notes on the following:
(i) Latin-square design.
(ii) Coding in context of analysis of variance.
(iii) F-ratio and its interpretation.
(iv) Significance of the analysis of variance.
4. Below are given the yields per acre of wheat for six plots entering a crop competition, three of the plots being sown with wheat of variety A and three with B.
Set up a table of analysis of variance and calculate F. State whether the difference between the yields of
two varieties is significant taking 7.71 as the table value of F at 5% level for v1 = 1 and v2 = 4.
(M.Com. II Semester EAFM Exam., Rajasthan University, 1976)
5. A certain manure was used on four plots of land A, B, C and D. Four beds were prepared in each plot and
the manure used. The output of the crop in the beds of plots A, B, C and D is given below:
Output on Plots
A B C D
8 9 3 3
12 4 8 7
1 7 2 8
3 1 5 2
Find out whether the difference in the means of the production of crops of the plots is significant or not.
6. Present your conclusions after doing analysis of variance to the following results of the Latin-square
design experiment conducted in respect of five fertilizers which were used on plots of different fertility.
A B C D E
16 10 11 09 09
E C A B D
10 09 14 12 11
B D E C A
15 08 08 10 18
D E B A C
12 06 13 13 12
C A D E B
13 11 10 07 14
7. Test the hypothesis at the 0.05 level of significance that µ 1 = µ 2 = µ 3 for the following data:
Samples
No. one No. two No. three
(1) (2) (3)
6 2 6
7 4 8
6 5 9
– 3 5
– 4 –
Total 19 18 28
8. Three varieties of wheat W1, W2 and W3 are treated with four different fertilizers viz., f1, f2, f3 and f4. The
yields of wheat per acre were as under:
Set up a table for the analysis of variance and work out the F-ratios in respect of the above. Are the
F-ratios significant?
9. The following table gives the monthly sales (in thousand rupees) of a certain firm in three states by its
four salesmen:
States Salesmen Total
A B C D
X 5 4 4 7 20
Y 7 8 5 4 24
Z 9 6 6 7 28
Total 21 18 15 18 72
Set up an analysis of variance table for the above information. Calculate F-coefficients and state whether
the difference between sales affected by the four salesmen and difference between sales affected in three
States are significant.
10. The following table illustrates the sample psychological health ratings of corporate executives in the fields of Banking, Manufacturing and Fashion retailing:
Banking 41 53 54 55 43
Manufacturing 45 51 48 43 39
Fashion retailing 34 44 46 45 51
Can we consider the psychological health of corporate executives in the given three fields to be equal at
5% level of significance?
11. The following table shows the lives in hours of randomly selected electric lamps from four batches:
Batch Lives in hours
1 1600 1610 1650 1680 1700 1720 1800
2 1580 1640 1640 1700 1750
3 1450 1550 1600 1620 1640 1660 1740 1820
4 1510 1520 1530 1570 1600 1680
Perform an analysis of variance of these data and show that a significance test does not reject their
homogeneity. (M.Phil. (EAFM) Exam., Raj. University, 1979)
12. Is the interaction variation significant in case of the following information concerning mileage based on
different brands of gasoline and cars?
Brands of gasoline
W X Y Z
A 13 12 12 11
11 10 11 13
Cars B 12 10 11 9
13 11 12 10
C 14 11 13 10
13 10 14 8
13. The following are paired observations for three experimental groups concerning an experiment involving three methods of teaching performed on a single class.
Method A to Group I Method B to Group II Method C to Group III
X Y X Y X Y
33 20 35 31 15 15
40 32 50 45 10 20
40 22 10 5 5 10
32 24 50 33 35 15
X represents the initial measurement of achievement in a subject and Y the final measurement after the subject has been taught. 12 pupils were assigned at random to 3 groups of 4 pupils each, one group for each method as shown in the table.
Apply the technique of analysis of covariance for analyzing the experimental results and then state
whether the teaching methods differ significantly at 5% level. Also calculate the adjusted means on Y.
[Ans: F-ratio is not significant and hence there is no difference due to teaching methods.
Adjusted means on Y will be as under:
For Group I 20.70
For Group II 24.70
For Group III 22.60]