ANOVA
Schedule
• Revision
• Single-factor Analysis of Variance (ANOVA)
• SPSS
Last week…
• Single-sample t-test
• Independent-samples t-test
• Repeated-measures t-test
• Effect Size (Cohen’s d)
ANOVA
• Suppose we want to compare THREE sample means to see if
a difference exists somewhere among them.
• What we are asking is:
• Do all three of these means come from a common
population?
• Is one mean so far away from the other two that it is
likely not from the same population?
• Or are all three so far apart that they all likely come
from unique populations?
Statistical Hypotheses for ANOVA (1 of 2)
• Null hypothesis: the treatment conditions have no effect on the
participant’s scores
• In the population, this is equivalent to saying that the group means do
not differ from each other
• Asking if each mean likely came from the larger overall population
H0 : µ1 = µ2 = µ3
Statistical Hypotheses for ANOVA (2 of 2)
• H1: There is at least one mean difference
among the populations
• The treatment conditions do affect the scores
• The acceptable shorthand is “Not H0”
• The issue: how many ways can H0 be rejected?
• All means are different from every other mean
• Some means differ from each other, while other means do not
If we use multiple t-tests
H0: µ1 = µ2, alpha = .05
H0: µ2 = µ3, alpha = .05
H0: µ1 = µ3, alpha = .05
Pairwise comparison of three means requires three t-tests,
ALL with alpha = .05 (Type I error), i.e. 95% confidence per test.
BUT the error COMPOUNDS with each t-test:
(.95)³ = .857
Experimentwise alpha = 1 − .857 = .143!!!
That is why running multiple t-tests is less preferable.
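To make the compounding concrete, here is a minimal Python sketch (not from the slides) that computes the experimentwise alpha for any number of tests; it assumes the tests are independent, which is a simplification.

```python
# Experimentwise (familywise) Type I error when running several tests,
# each at the same per-test (testwise) alpha.
def experimentwise_alpha(n_tests, testwise_alpha=0.05):
    # P(at least one Type I error) = 1 - P(no Type I error on any test)
    return 1 - (1 - testwise_alpha) ** n_tests

print(experimentwise_alpha(3))  # three pairwise t-tests: 1 - .95**3 = .143
print(experimentwise_alpha(6))  # six comparisons among four means: ~.265
```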
Type I Errors and Multiple-Hypothesis Tests
• Why use ANOVA (if t can compare two means)?
• Experiments often require multiple hypothesis tests—each with Type
I error (alpha level)
• Type I error for a set of tests accumulates: the testwise alpha on each test
builds up into a larger experimentwise alpha
• ANOVA evaluates all mean differences simultaneously with one test—
regardless of the number of means—and thereby avoids the problem of an
inflated experimentwise alpha
An Overview of Analysis of Variance
• Analysis of variance
• Used to evaluate mean differences between two or more treatments
(or populations)
• Uses sample data as the basis for drawing general conclusions about
populations
• Clear advantage over a t test: can be used to compare more than two
treatments at the same time
Total variance = Variance Between + Variance Within
The Logic of Analysis of Variance
• Between-treatments variance
• Variability results from general differences between the treatment conditions
• Variance between treatments measures differences between sample means
• Within-treatments variance
• Variability within each sample
• Individual scores are not the same within each sample
• Analysis of Variance is a variability ratio
• Total variance = Variance Between + Variance Within
• Partitioning – separating total variance into its component parts
[Figure: three sample distributions (means x̄1, x̄2, x̄3) illustrating Variance Between and Variance Within]
The F-Ratio: The Test Statistic for ANOVA
• The F-ratio can be calculated after we have analyzed
the total variability into two basic components
• Denominator of the F-ratio is called the error term
(measures only random and unsystematic variability)
The F-Ratio: The Test Statistic for ANOVA
• The value obtained for the F-ratio determines whether any treatment effects
exist; two possibilities:
• Fail to reject H0
• No systematic treatment effects; the differences are entirely caused by random, unsystematic factors
• The means are very close to the overall mean, or the distributions melt together: F = SMALL/LARGE (well below 1)
• The means are fairly close to the overall mean and the distributions overlap a bit, making them difficult to distinguish: F = similar/similar (around 1)
• Reject H0
• The treatment does have an effect and causes systematic differences between samples
• At least one mean is an outlier and each distribution is narrow, so the samples are distinct from each other: F = LARGE/SMALL (well above 1)
The Test Statistic for ANOVA (1 of 2)
• Not possible to compute a sample mean difference between more than
two samples
• F-ratio is based on variance instead of sample mean differences
• Variance is used to define and measure the size of differences among
sample means (numerator)
The Test Statistic for ANOVA (2 of 2)
• Variance in the denominator measures the mean differences that
would be expected if there is no treatment effect
F = variance (differences) between sample means / variance (differences) expected with no treatment effect
Total Variability Partitioned into Two Components
Sum of Squares
SS = Σ(x − µ)²
Sample Variance (the SS averaged over df)
s² = SS / df
Variance revisited and sum of squares
• ANOVA is, by definition, the “ANalysis Of VAriance”
• Variance is the average squared deviation (difference) of a data point from
the distribution mean.
• take the distance of each point from the mean, square each distance,
add them together, and then find the average.
• Take out the “find the average” part and we are left with just the SUM of
SQUARES (SS).
• SS – variance without finding the average of the sum of the squared
deviations.
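As a quick illustration (not from the slides), a minimal Python sketch of SS and sample variance, using a small made-up sample:

```python
# Sum of squares (SS) and sample variance for a small, made-up sample.
scores = [82, 93, 70, 61, 53]                 # hypothetical data
mean = sum(scores) / len(scores)              # 71.8

# SS: squared deviations from the mean, summed (no averaging yet)
ss = sum((x - mean) ** 2 for x in scores)     # 1026.8

# Sample variance: "average" the SS using df = n - 1
variance = ss / (len(scores) - 1)             # 256.7
print(ss, variance)
```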
SStotal (total/overall sum of squares)
= SSbetween (between-treatments sum of squares) + SSwithin (within-treatments/error sum of squares)
SST (Total Sum of Squares)
SStotal = ΣX² − G²/N
• k : Number of treatment conditions
• N: Total number of scores
• G = ΣT : Grand total of all scores in study
SST (Total Sum of Squares)
SStotal = ΣX² − G²/N
• G = ΣT : Grand total of all scores in study
• G = 1565
• N = 21
• ΣX² = 119531
• G²/N = 116629.76
• SStotal = 119531 − 116629.76 = 2901.24
SSBetween-Treatments
SSbetween-treatments = Σ(T²/n) − G²/N
• n1, n2… : Number of scores in each treatment
• N: Total number of scores
• T : Sum of scores (ΣX) for each treatment
• G = ΣT : Grand total of all scores in study
SSWithin
SSwithin-treatments = Σ SSinside each treatment
SSyear1 = (82-71.71)2 + (93 – 71.71)2 + …..+ (53 – 71.71)2 = 1039.43
SSyear2 = 751.43
SSyear3 = 1021.71
SSwithin = SSyear1 + SSyear2 + SSyear3 = 1039.43 + 751.43 + 1021.71 = 2812.57
SSBetween-Treatments
SSbetween-treatments = Σ(T²/n) − G²/N
• Year 1: 252004/7 = 36000.57
• Year 2: 277729/7 = 39675.57
• Year 3: 287296/7 = 41042.29
• G²/N = 116629.76
• SSbetween = 36000.57 + 39675.57 + 41042.29 − 116629.76
  = 116718.43 − 116629.76 = 88.67
• or SSbetween = SStotal − SSwithin
  SSbetween = 2901.24 − 2812.57 = 88.67
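A minimal Python sketch of these SS computations from the summary values on the slides; the treatment totals T = 502, 527, 536 are inferred here from the squared totals shown above (502² = 252004, etc.), since the raw scores are not listed.

```python
# SS computations for the three-year example, from summary values.
T = [502, 527, 536]        # treatment totals (inferred from T² on the slide)
n = [7, 7, 7]              # scores per treatment
N, G = sum(n), sum(T)      # 21, 1565
sum_x_squared = 119531     # ΣX², given on the slide

ss_total = sum_x_squared - G**2 / N                             # 2901.24
ss_between = sum(t**2 / ni for t, ni in zip(T, n)) - G**2 / N   # 88.67
ss_within = ss_total - ss_between                               # 2812.57
print(round(ss_total, 2), round(ss_between, 2), round(ss_within, 2))
```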
Analysis of Sum of Squares (SS)
SStotal = ΣX² − G²/N
SSwithin-treatments = Σ SSinside each treatment
SSbetween-treatments = Σ(T²/n) − G²/N
k : Number of treatment conditions
n1, n2… :Number of scores in each treatment
N: Total number of scores
T : Sum of scores (ΣX) for each treatment
G = ΣT : Grand total of all scores in study
No universally accepted notation for ANOVA; other sources may use other symbols
The Analysis of Degrees of Freedom (df)
• Total degrees of freedom
dftotal = N – 1
• Within-treatments degrees of freedom
dfwithin = N – k
• Between-treatments degrees of freedom
dfbetween = k – 1
Number of treatment conditions: k
Number of scores in each treatment: n1, n2…
Total number of scores: N
The Analysis of Degrees of Freedom (df) Example
• Total degrees of freedom
dftotal = N – 1 = 21 – 1 = 20
• Within-treatments degrees of freedom
dfwithin = N – k = 21 – 3 = 18
• Between-treatments degrees of freedom
dfbetween = k – 1 = 3 – 1 = 2
Number of treatment conditions: k
Total number of scores: N
Partitioning the Degrees of Freedom (df) for the Independent-Measures ANOVA
Calculation of Variances (MS) and the F-Ratio
MSbetween = s²between = SSbetween / dfbetween
MSwithin = s²within = SSwithin / dfwithin
F = s²between / s²within = MSbetween / MSwithin
Calculation of Variances (MS) and the F-Ratio
MSbetween = s²between = SSbetween / dfbetween = 88.67/2 = 44.33
MSwithin = s²within = SSwithin / dfwithin = 2812.57/18 = 156.25
F = s²between / s²within = MSbetween / MSwithin = 44.33/156.25 = .284
The Structure and Sequence of Calculations for the ANOVA
Step 1.
SStotal = ΣX² − G²/N
SSbetween-treatments = Σ(T²/n) − G²/N
SSwithin-treatments = Σ SSinside each treatment
dftotal = N – 1; dfbetween = k – 1; dfwithin = N – k
Step 2.
MSbetween = s²between = SSbetween / dfbetween
MSwithin = s²within = SSwithin / dfwithin
Step 3.
F = s²between / s²within = MSbetween / MSwithin
ANOVA Summary Table
• Concise method for presenting ANOVA results
• Helps organize and direct the analysis process
• Convenient for checking computations
• “Standard” statistical analysis program output
Source            Sum of Squares (SS)   df   Mean Square   F
Between Groups    88.67                  2   44.33         F(2,18) = .284
Within Groups     2812.57               18   156.25
Total             2901.24               20
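The same arithmetic can be checked with a short Python sketch, using the SS and df values from the table above:

```python
# Reproduce the MS and F values in the ANOVA summary table.
ss_between, ss_within = 88.67, 2812.57
df_between, df_within = 2, 18

ms_between = ss_between / df_between   # 88.67 / 2 = 44.335 (44.33 in the table)
ms_within = ss_within / df_within      # 2812.57 / 18 ≈ 156.25
f_ratio = ms_between / ms_within       # ≈ .284

print(ms_between, ms_within, f_ratio)
```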
Examples of Hypothesis Testing and Effect Size with ANOVA
• If the null hypothesis is true, the value of F will be around 1.00
• Because F-ratios are computed from two variances, they are always
positive numbers
• Table of F values is organized by two df
• df numerator (between) shown in table columns
• df denominator (within) shown in table rows
The Distribution of F-Ratios with df = 2, 15
An Example of Hypothesis Testing
• Hypothesis tests use the same four steps that have been used in
earlier hypothesis tests
• Computation of the test statistic F is done
in stages
• Compute SStotal, SSbetween, SSwithin
• Compute MSbetween, MSwithin
• Compute F
Example
        Biology   English   Psychology
n       4         10        6            N = 20
M       9         13        14           G = 250
T       36        130       84           ΣX² = 3377
SS      37        90        60
Hypothesis Testing (ANOVA)
• Step 1: State the hypothesis, and select the alpha level.
• H0 : µ1 = µ2 = µ3
• H1 : At least one population is different.
• alpha = .05
• Step 2: Locate the critical region. (find the df values for
F-ratio)
• df total = N – 1 = 20 – 1 = 19
• df between = k – 1 = 3 – 1 = 2
• df within = N – k = 20 – 3 = 17
• The F-ratio for these data has df = 2, 17.
• With alpha = .05, the critical value for the F-ratio is 3.59
Hypothesis Testing (ANOVA)
Step 3. Compute the F-ratio
• SStotal = ΣX² − G²/N = 3377 − 250²/20 = 3377 − 3125 = 252
• SSwithin = Σ SSinside each treatment = 37 + 90 + 60 = 187
• SSbetween = Σ(T²/n) − G²/N = 36²/4 + 130²/10 + 84²/6 − 250²/20
  = 324 + 1690 + 1176 − 3125 = 65
• or SSbetween = SStotal − SSwithin = 252 − 187 = 65
Hypothesis Testing (ANOVA)
Step 4. Compute the MS values and the F-ratio
• MSbetween = SSbetween / dfbetween = 65/2 = 32.5
• MSwithin = SSwithin / dfwithin = 187/17 = 11
• F = MSbetween / MSwithin = 32.5/11 = 2.95
Step 5. Make a decision
• Compare F(2, 17) = 2.95 with the critical F value (3.59 at alpha = .05)
• Because 2.95 < 3.59, we fail to reject the null hypothesis.
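As a check on Steps 3–5, here is a short Python sketch of the same test from the summary values, using scipy.stats.f only to look up the critical value:

```python
from scipy import stats

# Summary values from the example (Biology, English, Psychology)
T = [36, 130, 84]      # treatment totals
n = [4, 10, 6]         # scores per treatment
SS = [37, 90, 60]      # SS inside each treatment
N, G, k = sum(n), sum(T), len(T)   # 20, 250, 3
sum_x_squared = 3377

ss_total = sum_x_squared - G**2 / N                              # 252
ss_within = sum(SS)                                              # 187
ss_between = sum(t**2 / ni for t, ni in zip(T, n)) - G**2 / N    # 65

df_between, df_within = k - 1, N - k                             # 2, 17
f_ratio = (ss_between / df_between) / (ss_within / df_within)    # ≈ 2.95
f_critical = stats.f.ppf(0.95, df_between, df_within)            # ≈ 3.59

print(f_ratio > f_critical)   # False -> fail to reject H0
```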
Measuring Effect Size for ANOVA
• Compute percentage of variance accounted for by the treatment
conditions
• In published reports of ANOVA results, the effect size is usually called η²
(“eta squared”)
• the proportion of variance explained
η² = SSbetween-treatments / SStotal
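For the Biology/English/Psychology example above (not worked on the slides, but it follows from the values already computed): η² = 65/252 ≈ .26, so about 26% of the variance in scores is accounted for by the treatment conditions.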
Reporting the Results of Analysis of Variance
• Treatment means and standard deviations are presented in text, table, or
graph
• Results of ANOVA are summarized, including
• F and df: F(dfbetween, dfwithin) = x.xx
• p value
• η²
• E.g., F(2, 18) = .284, p > .05, η² = 0.03
Assumptions for the Independent-Measures
ANOVA
• The observations within each sample must be independent
• The population from which the samples are selected must be normal
• The populations from which the samples are selected must have equal
variances
(homogeneity of variance)
• Violating the assumption of homogeneity of variance risks invalid test
results
Post Hoc Tests
• ANOVA compares all individual mean differences simultaneously in one
test
• A significant F-ratio indicates that at least one difference in means is
statistically significant
• Does not indicate which means differ significantly from each other!
• Post hoc tests are follow-up tests done to determine exactly which mean
differences are significant and which are not
Posttests and Type I Errors
• Post hoc tests compare two individual means at a time (pairwise
comparisons)
• Each comparison includes risk of a Type I error
• Risk of Type I error accumulates and is called the experimentwise alpha
level
• Increasing the number of hypothesis tests increases the total probability of
a Type I error
• Post hoc tests (“posttests”) use special methods to try to control the
experimentwise Type I error rate
Tukey’s Honestly Significant Difference (HSD) Test
• A single value that determines the minimum difference between
treatment means that is necessary to claim statistical
significance—a difference large enough that p < αexperimentwise
• Honestly significant difference (HSD)
HSD = q √(MSwithin / n)
e.g., if HSD = 3.67,
the mean difference between any two sample means must be at least 3.67 to be significant
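A minimal sketch of the HSD computation for the three-year example, assuming SciPy ≥ 1.7 is available for the studentized range value q (the same q can be read from a studentized range table):

```python
import math
from scipy.stats import studentized_range

k, df_within, n = 3, 18, 7     # treatments, within-treatments df, scores per treatment
ms_within = 156.25             # from the summary table above

q = studentized_range.ppf(0.95, k, df_within)   # q ≈ 3.61 at alpha = .05
hsd = q * math.sqrt(ms_within / n)              # ≈ 17: minimum significant mean difference
print(q, hsd)
```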
The Relationship Between ANOVA and t Tests
• For two independent samples, either t or F can be used
• Always result in the same statistical decision
• The t statistic compares distances
• The F-ratio compares variances
• The relationship is: F = t²
• For any value of α (the critical region), (tcritical)² = Fcritical
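A short Python check of this relationship, using scipy.stats for the two critical values (df = 18 matches the figure that follows):

```python
from scipy import stats

alpha, df = 0.05, 18

t_critical = stats.t.ppf(1 - alpha / 2, df)   # two-tailed t, df = 18: ≈ 2.101
f_critical = stats.f.ppf(1 - alpha, 1, df)    # F with df = 1, 18: ≈ 4.414

print(t_critical**2, f_critical)              # both ≈ 4.41
```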
The Distribution of t Statistics with df = 18 and the Corresponding Distribution of F-Ratios with df = 1, 18
Formulas for ANOVA
The End