Lecture 09 Anova

Uploaded by

jabeenmaham04

PD-407

BIOSTATISTICS
ANOVA
LECTURE # 09
BY,

DR. SADIA IQBAL


PHARM-D, M.PHIL SCHOLAR (KU)
LECTURER (PHARMACEUTICAL CHEMISTRY)

Dated: 10th, 11th, 15th June 2024 (E) DCOP-DUHS


11th, 13th, 15th June 2024 (M)
ANOVA
ANALYSIS OF VARIANCE
LEARNING OBJECTIVES
• AT THE END OF LECTURE STUDENTS WILL BE ABLE TO UNDERSTAND
❑ ANALYSIS OF VARIANCE
❑ ONE WAY CLASSIFICATION
❑ TWO WAY CLASSIFICATION
❑ EXERCISE
ANOVA
• Extends the independent-samples t test
• Compares the means of groups of independent observations
– Don't be fooled by the name: ANOVA compares means, not variances. It does so by analysing the variability in the data.
• Can compare more than two groups
THE BASIC ANOVA SITUATION
• Two variables: 1 CATEGORICAL, 1 QUANTITATIVE
• Main question: does the mean of the quantitative variable depend on which group (given by the categorical variable) the individual is in?
• If the categorical variable has only 2 values: 2-SAMPLE T-TEST
• ANOVA allows for 3 or more groups

ANALYSIS OF VARIANCE (ANOVA)
• The analysis of variance is a procedure by which the total variation in the
data of the sample is split up into meaningful components that measure
different sources of variation. Each of the components yields an estimate of
the population variance, and these estimates are tested for homogeneity
using the F distribution.
• ONE WAY CLASSIFICATION or SINGLE FACTOR EXPERIMENTS:
The classification of observations on the basis of a single criterion is called a one way
classification or single factor experiment. In single factor experiments, independent
samples, each consisting of “n” observations, are selected from each of “k” populations.
We use the word “treatment” for the sample, and each treatment has “n” repetitions or
replications. The collected data, consisting of “kn” observations (k samples of n
observations each), may be represented as:
One way classification
AN EXAMPLE ANOVA SITUATION
• Subjects: 25 patients with blisters
• Treatments: treatment A, treatment B, placebo
• Measurement: # of days until blisters heal

• Data [and means]:


• A: 5,6,6,7,7,8,9,10 [7.25]
• B: 7,7,8,9,9,10,10,11 [8.875]
• P: 7,9,9,10,10,10,11,12,13 [10.11]

• Are these differences significant?
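The question on this slide can be checked numerically. The following is a minimal sketch in plain Python, not part of the original lecture: the function name `one_way_anova` and the variable names are chosen here for illustration. It computes the between-group and within-group sums of squares directly from deviations about the means and forms the F-ratio.

```python
# Blister data from the slide: days until healing for each treatment.
groups = {
    "A": [5, 6, 6, 7, 7, 8, 9, 10],
    "B": [7, 7, 8, 9, 9, 10, 10, 11],
    "P": [7, 9, 9, 10, 10, 10, 11, 12, 13],
}

def one_way_anova(samples):
    """Return (F, df_between, df_within) for a list of samples (hypothetical helper)."""
    all_values = [x for s in samples for x in s]
    grand_mean = sum(all_values) / len(all_values)
    # Between-group SS: squared distance of each group mean from the grand mean,
    # weighted by the group size.
    ss_between = sum(len(s) * (sum(s) / len(s) - grand_mean) ** 2 for s in samples)
    # Within-group SS: squared distance of each observation from its own group mean.
    ss_within = sum((x - sum(s) / len(s)) ** 2 for s in samples for x in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

f_ratio, df1, df2 = one_way_anova(list(groups.values()))
print(f"F = {f_ratio:.2f} on ({df1}, {df2}) degrees of freedom")
```

The computed F is about 6.45 on (2, 22) degrees of freedom. A standard F table gives a 0.05 critical value of about 3.44 for these degrees of freedom, so under the usual ANOVA assumptions the differences would be judged significant.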


THE LOGIC AND THE PROCESS OF ANALYSIS OF
VARIANCE (CONT.)

• A large value for the F-ratio indicates that the obtained sample mean differences are greater than would be expected if the treatments had no effect.
• Each of the sample variances (MS values) in the F-ratio is computed using the basic formula for sample variance:
Sample variance = MS = SS / df
THE LOGIC AND THE PROCESS OF ANALYSIS OF
VARIANCE (CONT.)

• To obtain the SS and df values, you must go through an analysis that separates the total variability for the entire set of data into two basic components: between-treatment variability (which becomes the numerator of the F-ratio) and within-treatment variability (which becomes the denominator).
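The separation described above can be written out explicitly. The block below is a sketch for the equal-sample-size case, using X_ij for the j-th observation in treatment i; the symbols \bar{X}_{i.} (treatment mean) and \bar{X}_{..} (grand mean) are introduced here for illustration and do not appear on the slides.

```latex
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n}\left(X_{ij}-\bar{X}_{..}\right)^{2}}_{\text{total}}
 = \underbrace{n\sum_{i=1}^{k}\left(\bar{X}_{i.}-\bar{X}_{..}\right)^{2}}_{\text{between treatments}}
 + \underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n}\left(X_{ij}-\bar{X}_{i.}\right)^{2}}_{\text{within treatments}}
\qquad
kn-1 = (k-1) + k(n-1)

F = \frac{SS_{\text{between}}/(k-1)}{SS_{\text{within}}/\bigl(k(n-1)\bigr)}
```

The degrees of freedom split in the same way as the sums of squares, which is why each component divided by its own df gives a mean square.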
IDEA BEHIND ANOVA
[Figure: graphical demonstration employing two types of variability]
ANOVA – NULL AND ALTERNATIVE HYPOTHESES
Say the sample contains k independent groups.
1. ANOVA tests the null hypothesis
H0: μ1 = μ2 = … = μk
• That is, “the group means are all equal”
2. The alternative hypothesis is
H1: μi ≠ μj for some i ≠ j
• Or, “the group means are not all equal”
• k = number of treatments, n = number of observations per treatment
3. ANOVA Table

• MSS = mean sum of squares
• TSS = total sum of squares
• SSB = sum of squares between groups
• SSW = sum of squares within groups (error)
• MSE = mean square error
• The treatment and error sums of squares (written SST and SSE in some textbooks, SSB and SSW in this lecture) are used to form two mean squares, one for treatments and the second for error
• These mean squares are denoted by MST and MSE (written MSSB and MSSW in the ANOVA table)
3. ANOVA Table

Source of variation      Sum of squares     df        MSS (mean sum of squares)    F
Between groups           SSB                k-1       MSSB = SSB/(k-1)             MSSB/MSSW
Within groups (error)    SSW = TSS - SSB    k(n-1)    MSSW = SSW/[k(n-1)]
Total                    TSS                kn-1

where
$$SSB = \frac{1}{n}\sum_{i=1}^{k} T_{i.}^{2} - \frac{(T_{..})^{2}}{nk}$$
$$TSS = \sum_{j=1}^{n}\sum_{i=1}^{k} X_{ij}^{2} - \frac{(T_{..})^{2}}{nk}$$
• 4. Level of significance, α = ?
• 5. Critical region:
ν1 = k-1
ν2 = k(n-1)
Reject H0 if F > F(α; ν1, ν2), that is, F > F(α; k-1, k(n-1))
• Conclusion
PROBLEM # 01:
• Question # 02 on page 400 of the book “Introduction to Statistics” by Ronald E. Walpole, 3rd edition.
The following data represent the number of packages of 5
popular brands of cigarettes sold by a supermarket on 8
randomly selected days.
Perform an analysis of variance at the 0.05 level of significance
and determine whether or not the 5 brands sell, on average,
the same number of cigarettes at this supermarket.
ANOVA Table (Q. 02, pg no 400)
         A      B      C      D      E      Total
         21     35     45     32     45
         35     12     60     53     29
         32     27     33     29     31
         28     41     36     42     22
         14     19     31     40     36
         47     23     40     23     29
         25     31     43     35     42
         38     20     48     42     30
Total    240    208    336    296    264    1344
Mean     30     26     42     37     33     33.6 (grand mean)
H0: µA = µB = µC = µD = µE (there is no significant difference between the mean sales of the 5 popular
brands of cigarettes)
HA: At least 2 of the brand means are not the same (there is a significant difference
between the mean sales of the 5 popular brands of cigarettes)
α = 0.05
$$TSS = \sum_{j=1}^{n}\sum_{i=1}^{k} X_{ij}^{2} - \frac{(T_{..})^{2}}{nk}$$
= [ (21)^2 + (35)^2 + (45)^2 + (32)^2 + (45)^2 + (35)^2 + (12)^2 + (60)^2 + (53)^2 + (29)^2 + (32)^2 + (27)^2 +
(33)^2 + (29)^2 + (31)^2 + (28)^2 + (41)^2 + (36)^2 + (42)^2 + (22)^2 + (14)^2 + (19)^2 + (31)^2 + (40)^2 + (36)^2 +
(47)^2 + (23)^2 + (40)^2 + (23)^2 + (29)^2 + (25)^2 + (31)^2 + (43)^2 + (35)^2 + (42)^2 + (38)^2 + (20)^2 + (48)^2 +
(42)^2 + (30)^2 ] - (1344)^2 / 40
= 49370 - 1806336 / 40
= 49370 - 45158.4
TSS = 4211.6
$$SSB = \frac{1}{n}\sum_{i=1}^{k} T_{i.}^{2} - \frac{(T_{..})^{2}}{nk}$$
= 1/8 [ (240)^2 + (208)^2 + (336)^2 + (296)^2 + (264)^2 ] - (1344)^2 / 40
= 1/8 (371072) - 1806336 / 40
= 46384 - 45158.4
SSB = 1225.6
SSW = TSS - SSB
= 4211.6 - 1225.6
= 2986
Source of variation      Sum of squares      Degrees of freedom (df)    MSS (mean sum of squares)      F
Between groups (SSB)     1225.6              k-1 = 5-1 = 4              MSSB = 1225.6/4 = 306.4        MSSB/MSSW = 306.4/85.31 = 3.591
Within groups (SSW)      TSS - SSB = 2986    k(n-1) = 5(8-1) = 35       MSSW = 2986/35 = 85.31
Total (TSS)              4211.6              kn-1 = 40-1 = 39
• Critical region:
ν1 = k-1 = 5-1 = 4
ν2 = k(n-1) = 5(8-1) = 35
F > F(α; ν1, ν2), or F > F(α; k-1, k(n-1))
F(0.05; 4, 35) = 2.65
Fcal = 3.591 > 2.65

CONCLUSION:
Fcal lies in the critical region, so we reject the null hypothesis and conclude that at
least two means are not the same, i.e., there is a significant difference between the 5
popular brands of cigarettes.
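The hand calculation for Problem # 01 can be reproduced with a short script. This is a sketch in plain Python that follows the lecture's totals-based formulas for equal sample sizes. The variable names are chosen here for illustration, and the critical value 2.65 is the one quoted in the lecture, not computed.

```python
# Packages sold per day for each brand (Problem # 01), 8 days per brand.
sales = {
    "A": [21, 35, 32, 28, 14, 47, 25, 38],
    "B": [35, 12, 27, 41, 19, 23, 31, 20],
    "C": [45, 60, 33, 36, 31, 40, 43, 48],
    "D": [32, 53, 29, 42, 40, 23, 35, 42],
    "E": [45, 29, 31, 22, 36, 29, 42, 30],
}
k = len(sales)   # number of treatments (brands)
n = 8            # observations per treatment (days)

grand_total = sum(sum(col) for col in sales.values())        # T..
correction = grand_total ** 2 / (n * k)                      # (T..)^2 / nk

tss = sum(x * x for col in sales.values() for x in col) - correction
ssb = sum(sum(col) ** 2 for col in sales.values()) / n - correction
ssw = tss - ssb

mssb = ssb / (k - 1)          # mean square between groups
mssw = ssw / (k * (n - 1))    # mean square within groups
f_cal = mssb / mssw

F_CRITICAL = 2.65  # F(0.05; 4, 35) as quoted in the lecture
print(f"TSS = {tss:.1f}, SSB = {ssb:.1f}, SSW = {ssw:.1f}")
print(f"MSSB = {mssb:.1f}, MSSW = {mssw:.2f}, F = {f_cal:.3f}")
print("reject H0" if f_cal > F_CRITICAL else "do not reject H0")
```

The printed values agree with the ANOVA table for this problem (TSS = 4211.6, SSB = 1225.6, SSW = 2986.0, F ≈ 3.591).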
[Table: Critical Values of the F Distribution, f0.05(ν1, ν2)]
[Table: Critical Values of the F Distribution, f0.01(ν1, ν2)]
PROBLEM # 02:
• Question # 03 on page 400 of the book “Introduction to Statistics” by Ronald E.
Walpole, 3rd edition.
Six different machines are being considered for use in manufacturing
rubber seals. The machines are being compared with respect to the tensile
strength of the product. A random sample of 4 seals from each machine is
used to determine whether or not the mean tensile strength varies from
machine to machine. The following are the tensile strength measurements
in kilograms per square centimeter × 10^-1.
Perform the analysis of variance at the 0.05 level of significance and
indicate whether or not the treatment means differ significantly.
MACHINES
         1       2       3       4       5       6       Total
         17.5    16.4    20.3    14.6    17.5    18.3
         16.9    19.2    15.7    16.7    19.2    16.2
         15.8    17.7    17.8    20.8    16.5    17.5
         18.6    15.4    18.9    18.9    20.5    20.1
Total
Mean

CONCLUSION:
Fcal lies in the acceptance region, so we do not reject the null hypothesis and conclude that the
means are the same, i.e., there is no significant difference in tensile strength from
machine to machine.
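The slides give only the conclusion for this problem, so the intermediate quantities can be sketched as follows. This is an illustrative plain-Python calculation using the equal-sample-size formulas from the ANOVA table. The helper `anova_equal_n` is a name invented here, and the critical value 2.77 for F(0.05; 5, 18) is taken from a standard F table, not from the lecture.

```python
# Tensile strengths from Problem # 02, one list per machine (4 seals each).
strength = {
    1: [17.5, 16.9, 15.8, 18.6],
    2: [16.4, 19.2, 17.7, 15.4],
    3: [20.3, 15.7, 17.8, 18.9],
    4: [14.6, 16.7, 20.8, 18.9],
    5: [17.5, 19.2, 16.5, 20.5],
    6: [18.3, 16.2, 17.5, 20.1],
}

def anova_equal_n(samples):
    """One-way ANOVA entries for k samples of equal size n (hypothetical helper)."""
    k = len(samples)
    n = len(samples[0])
    grand_total = sum(sum(s) for s in samples)
    correction = grand_total ** 2 / (n * k)            # (T..)^2 / nk
    tss = sum(x * x for s in samples for x in s) - correction
    ssb = sum(sum(s) ** 2 for s in samples) / n - correction
    ssw = tss - ssb
    df_between, df_within = k - 1, k * (n - 1)
    f_cal = (ssb / df_between) / (ssw / df_within)
    return {"TSS": tss, "SSB": ssb, "SSW": ssw,
            "df": (df_between, df_within), "F": f_cal}

table = anova_equal_n(list(strength.values()))
F_CRITICAL = 2.77  # assumed table value for F(0.05; 5, 18); not quoted in the lecture

for name, value in table.items():
    print(name, value)
print("reject H0" if table["F"] > F_CRITICAL else "do not reject H0")
```

With these data the F-ratio comes out at roughly 0.31, well below the critical value, which is consistent with the conclusion stated on the slide.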
ANOVA Table For Unequal Number of Observations
Source of variation      Sum of squares     df         MSS (mean sum of squares)    F
Between groups           SSB                k-1        MSSB = SSB/(k-1)             MSSB/MSSW
Within groups (error)    SSW = TSS - SSB    Σni - k    MSSW = SSW/(Σni - k)
Total                    TSS                Σni - 1

where N = Σni is the total number of observations and
$$SSB = \sum_{i=1}^{k} \frac{T_{i.}^{2}}{n_i} - \frac{(T_{..})^{2}}{N}$$
$$TSS = \sum_{i=1}^{k}\sum_{j=1}^{n_i} X_{ij}^{2} - \frac{(T_{..})^{2}}{\sum n_i}$$
• Critical region:
ν1 = k-1
ν2 = Σni - k
PROBLEM # 03:
• Example # 02 on page 394 of the book “Introduction to Statistics” by
Ronald E. Walpole, 3rd edition.
It is suspected that higher-priced automobiles are assembled
with greater care than lower-priced automobiles. To investigate
whether there is any basis for this feeling, a large luxury model
A, a medium-size sedan B and a subcompact hatchback C were
compared for defects when they arrived at the dealer’s showroom.
All cars were manufactured by the same company. The number
of defects for several of the three models are recorded in the table.
Test the hypothesis at the 0.05 level of significance that the
average number of defects is the same for the three models.
NUMBERS OF AUTOMOBILE DEFECTS
MODELS
         A     B     C     TOTAL
         4     5     8
         7     1     6
         6     3     8
         6     5     9
               3     5
               4
TOTAL
MEAN
CONCLUSION:
Fcal lies in the critical region, so we reject the null hypothesis and conclude that at least two means are not the same, i.e.,
there is a significant difference between the average number of defects for the three models of automobile.
PROBLEM # 04:
• Question # 04 on page 400 of book “Introduction to statistics by Ronald
E. Walpole, 3rd edition”.
Three sections of the same elementary mathematics course are taught by
3 teachers. The final grades were recorded as follows:
Is there a significant difference in the average grades given by the 3
teachers? Use a 0.05 level of significance.
TEACHERS
         A     B     C     TOTAL
         73    88    68
         89    78    79
         82    48    56
         43    91    91
         80    51    71
         73    85    71
         66    74    87
         60    77    41
         45    31    59
         93    78    68
         36    62    53
         77    76    79
               96    15
               80
               56
TOTAL
MEAN
CONCLUSION:
Fcal lies in the acceptance region, so we do not reject the null hypothesis and conclude
that the means are the same, i.e., there is no significant difference
between the average grades given by the three teachers.
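As a check on this conclusion, the sketch below computes the F-ratio with Python's standard `statistics` module. It takes a slightly different route from the totals formulas: the within-group sum of squares is built from the sample variance of each group, which shows that MSSW is a pooled sample variance. The column assignment of the short rows (12 grades for A, 15 for B, 13 for C) is an assumption made when reading the flattened table, and the critical value of about 3.25 for F(0.05; 2, 37) is taken from a standard F table.

```python
from statistics import fmean, variance

# Final grades per teacher (Problem # 04). The assignment of the last three
# rows to columns B and C is an assumption.
grades = {
    "A": [73, 89, 82, 43, 80, 73, 66, 60, 45, 93, 36, 77],
    "B": [88, 78, 48, 91, 51, 85, 74, 77, 31, 78, 62, 76, 96, 80, 56],
    "C": [68, 79, 56, 91, 71, 71, 87, 41, 59, 68, 53, 79, 15],
}
k = len(grades)
N = sum(len(g) for g in grades.values())
grand_mean = fmean([x for g in grades.values() for x in g])

# Between groups: each group mean against the grand mean, weighted by n_i.
ssb = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in grades.values())
# Within groups: (n_i - 1) times the sample variance of each group.
ssw = sum((len(g) - 1) * variance(g) for g in grades.values())

mssb = ssb / (k - 1)
mssw = ssw / (N - k)      # pooled sample variance
f_cal = mssb / mssw

F_CRITICAL = 3.25  # assumed approximate table value for F(0.05; 2, 37)
print(f"MSSB = {mssb:.2f}, MSSW = {mssw:.2f}, F = {f_cal:.2f}")
print("reject H0" if f_cal > F_CRITICAL else "do not reject H0")
```

Under these assumptions the F-ratio is roughly 0.46, far below the critical value, in line with the conclusion above.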
