RESEARCH

Mixed-Model Analysis of Crossover Genotype–Environment Interactions

Rong-Cai Yang*

ABSTRACT

Genotype–environment interactions (GEI) are important in crop improvement if genotype ranks change across environments. Current tests for crossover (rank changing) interactions (COI) assume that effects are all fixed or all random. The objective of this study was to develop a new test for COI under the model with a mixture of fixed and random genotypic, environmental, and GEI effects. The key part of this new test is that the difference between a pair of genotypes at a random environment or the difference between a pair of environments for a random genotype involves the linear combinations (predictable functions) of both best linear unbiased estimates (BLUEs) of fixed effects and best linear unbiased predictors (BLUPs) of random effects. The predictable functions are used in the same way as the usual estimable functions for the fixed effects in hypothesis testing except that the BLUPs of random effects are adjusted by accounting for the uncertainty arising from the distributions of these effects. Strategies are proposed to implement the procedure using the SAS system. The procedure was used to analyze barley (Hordeum vulgare L.) and field pea (Pisum sativum L.) cultivar trials. The analyses show that treating random effects as fixed, as may happen with previous analysis procedures, results in detection of more COI than mixed- or random-effect models. Therefore, significant COI may be overemphasized when random GEI effects are treated as fixed.

Alberta Agriculture, Food and Rural Development, no. 300, 7000-113 St., Edmonton, AB, Canada T6H 5T6, and Dep. of Agricultural, Food and Nutritional Science, Univ. of Alberta, Edmonton, AB, Canada T6G 2P5. Received 25 Sept. 2006. *Corresponding author (rong-cai.[email protected]).

Abbreviations: BLUE, best linear unbiased estimator; BLUP, best linear unbiased predictor; COI, crossover interactions; EBLUE, empirical best linear unbiased estimator; EBLUP, empirical best linear unbiased predictor; GEI, genotype–environment interactions; LR, likelihood ratio; MET, multiple-environment trial; ML, maximum likelihood; RCBD, randomized complete block design; REML, restricted maximum likelihood.

Published in Crop Sci. 47:1051–1062 (2007).
doi: 10.2135/cropsci2006.09.0611
© Crop Science Society of America
677 S. Segoe Rd., Madison, WI 53711 USA
All rights reserved. No part of this periodical may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Permission for printing and for reprinting the material contained herein has been obtained by the publisher.

CROP SCIENCE, VOL. 47, MAY – JUNE 2007

The presence of genotype–environment interactions (GEI) remains one of the major impediments for crop improvement and production. It has been long recognized (e.g., Haldane, 1947; Gregorius and Namkoong, 1986; Baker, 1988, 1996) that GEI are of consequence in selection programs when genotype ranks change and there is no clear superiority of a single genotype across environments in a multiple-environment trial (MET). Thus, interactions involving rank changes (crossover GEI) are much more important than those that only reflect differences in scale. There are currently two approaches to detecting crossover interactions (COI), depending on whether a fixed-effect or a random-effect model is used. Treating both genotypes and environments as fixed effects, Baker (1988) and Cornelius et al. (1992) have used the test of Azzalini and Cox (1984) to evaluate all possible 2 × 2 subtables (quadruples) for the presence of COI from a two-way genotype × environment table. A different strategy, also based on the fixed-effect model, is to minimize the presence of
COI by clustering genotypes or environments into homogeneous subsets using the criteria derived from the multiplicative model or performance-based analyses (Crossa et al., 2004; Yang et al., 2005; Navabi et al., 2006). The discussion from recent works suggests, however, that either genotypic or environmental effects (and thus GEI effects) should be random (Baker, 1996; Piepho, 1998; Balzarini, 2002; Smith et al., 2005). Whether these effects are fixed or random determines (i) if the focus of the crop improvement work should be on which environments or on how many environments to test in the future (Baker, 1996) or (ii) if the purpose of the breeding and cultivar testing programs is to identify the best cultivars or to detect the difference between a pair of cultivars (Smith et al., 2005).

The second approach to characterizing GEI and detecting COI assumes that both genotypic and environmental effects (and thus GEI effects) are random. In this approach, the nature of GEI can be assessed by partitioning the total GEI variability into two components: (i) change in scale of a trait measured in different environments, that is, heterogeneity of variances, and (ii) imperfect genetic correlation of the same trait across environments (or genotypes) (e.g., Yang and Baker, 1991). When most of the GEI variability is explained by the heterogeneous variances, the GEI is considered to be unimportant since it suggests non-COI; however, when the variability due to imperfect genetic correlations between environments describes a large proportion of the total GEI it may indicate the presence of COI. Since these two components of the GEI sum of squares are not chi-square distributed, the usual ANOVA provides no direct statistical tests of significance. Yang (2002) and Crossa et al. (2004) have subsequently developed a likelihood ratio (LR) test based on the mixed-model theory for the significance of the two components of the GEI variability, thereby assessing the presence of COI. These LR tests, however, require estimating variances and covariances (second-order statistics). Hypothesis testing involving second-order statistics is more sensitive to departures from model assumptions, thereby rendering the LR tests less robust than the statistical test involving means or estimated model effects (first-order statistics). In addition, a significant imperfect genetic correlation between environments does not imply the presence of COI (Crossa et al., 2004).

I have developed a third approach to detecting COI based on the model with a mixture of fixed and random genotypic, environmental, and GEI effects. Currently the Azzalini–Cox test has been applied or developed strictly under the fixed-effect model even though, in many METs including regional cultivar or breeding trials, either genotypes or environments should be considered random (Baker, 1996; Piepho, 1998; Balzarini, 2002; Smith et al., 2005). The key part of this new approach is the recognition that the differences between genotypic effects at a random environment or the differences between environmental effects for a random genotype are the linear combinations of both best linear unbiased estimators (BLUEs) of fixed effects and best linear unbiased predictors (BLUPs) of random effects. These linear combinations are called predictable functions (Henderson, 1984; Stroup, 1989). The use of these predictable functions enables the Azzalini–Cox test to be applied directly to the mixed- and random-effect models. The new mixed-model test for COI is illustrated through the analysis of a barley (Hordeum vulgare L.) MET and a field pea (Pisum sativum L.) MET.

MATERIALS AND METHODS
Mixed Models for Multiple-Environment Trials

I consider a MET data set where $g$ genotypes are tested in each of $e$ environments. These environments can be a set of sites tested in a given year or a combination of multiple sites tested in multiple years. In each environment, the genotypes can be arranged in a variety of complete and incomplete block designs. For clarity, I consider a randomized complete block design (RCBD) in each environment, but each environment may have a different number of replications or blocks (i.e., $r_j$ replications in the $j$th environment) to accommodate the commonly encountered situation in many breeding and cultivar testing programs (in the special case of the equal number of replications across all environments, $r_j = r$ for all $j$). For this MET, the conventional ANOVA model is given by

$$y_{ijk} = \mu + \tau_i + \delta_j + (\tau\delta)_{ij} + \gamma_{jk} + \varepsilon_{ijk} \qquad [1]$$

where $y_{ijk}$ is the measured response (e.g., yield) of the $k$th replication of the $i$th genotype in the $j$th environment ($i = 1, 2, \ldots, g$; $j = 1, 2, \ldots, e$; $k = 1, 2, \ldots, r_j$), $\mu$ is the overall mean, $\tau_i$ is the effect of the $i$th genotype, $\delta_j$ is the effect of the $j$th environment, $(\tau\delta)_{ij}$ is the interaction effect of the $i$th genotype with the $j$th environment, $\gamma_{jk}$ is the effect of the $k$th replication in the $j$th environment, and $\varepsilon_{ijk}$ is the random error. Averaging across replications within an environment, Eq. [1] reduces to the model that is commonly used to describe the genotype × environment cell means (e.g., Cornelius and Crossa, 1999): $\bar{y}_{ij\cdot} = \mu + \tau_i + \delta_j + (\tau\delta)_{ij} + e_{ij\cdot}$, where $e_{ij\cdot} = \sum_{k=1}^{r_j} (\gamma_{jk} + \varepsilon_{ijk})/r_j$.

Three versions of the model in Eq. [1] are considered: (i) fixed-effect model with all effects except $\gamma_{jk}$ and $\varepsilon_{ijk}$ being fixed; (ii) random-effect model with all effects except $\mu$ being random; (iii) mixed-effect model arising from the situation where either of genotypic and environmental effects is fixed whereas the other is random. If the genotypic effect is fixed and the environmental effect is random, then, from Eq. [1], $\mu$ and $\tau_i$ are fixed effects and $\delta_j$, $(\tau\delta)_{ij}$, $\gamma_{jk}$, and $\varepsilon_{ijk}$ are independent and normally distributed random effects that have expectations of zero and variances $\sigma_\delta^2$, $\sigma_{\tau\delta}^2$, $\sigma_\gamma^2$, and $\sigma_\varepsilon^2$, respectively; on the other hand, if the genotypic effect is random and the environmental effect is fixed, then $\mu$ and $\delta_j$ are fixed effects and $\tau_i$, $(\tau\delta)_{ij}$, $\gamma_{jk}$, and $\varepsilon_{ijk}$ are independent and normally distributed random effects that have expectations of zero and variances $\sigma_\tau^2$, $\sigma_{\tau\delta}^2$, $\sigma_\gamma^2$, and $\sigma_\varepsilon^2$, respectively. Obviously, the two variants of the mixed-effect model are reciprocal with the nature (fixed or random) of effects $\tau_i$ or $\delta_j$ being swapped.

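
The reduction of Eq. [1] to genotype × environment cell means can be illustrated with a minimal Python sketch. It is not part of the original analysis, which was carried out in SAS; the function name `cell_means` and the yield values are made up for illustration, and the environments are given unequal numbers of replications ($r_1 = 3$, $r_2 = 2$) to mirror the unbalanced case described above.

```python
from collections import defaultdict


def cell_means(records):
    """Average y_ijk over replications k within each (genotype, environment) cell.

    `records` is an iterable of (genotype, environment, replication, y) tuples;
    the number of replications may differ between environments.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for genotype, environment, _replication, y in records:
        sums[(genotype, environment)] += y
        counts[(genotype, environment)] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


# Hypothetical yields: environment E1 has 3 replications, E2 has 2.
records = [
    ("G1", "E1", 1, 4.0), ("G1", "E1", 2, 5.0), ("G1", "E1", 3, 6.0),
    ("G2", "E1", 1, 3.0), ("G2", "E1", 2, 3.0), ("G2", "E1", 3, 6.0),
    ("G1", "E2", 1, 2.0), ("G1", "E2", 2, 4.0),
    ("G2", "E2", 1, 5.0), ("G2", "E2", 2, 7.0),
]
means = cell_means(records)
for cell in sorted(means):
    print(cell, means[cell])
```

In this toy table G1 outyields G2 in E1 and the ranking reverses in E2, which is the kind of raw crossover pattern that the tests developed below are meant to assess.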


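While the mixed-effect model is the focus of the subsequent analysis, the fixed- and random-effect models are included for comparison.

Since each of the three models has one or more fixed and random effects, Eq. [1] can be conveniently written in the standard linear mixed model (Henderson, 1984; Littell et al., 2006),

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{u} + \boldsymbol{\varepsilon} \qquad [2]$$

where, under the mixed-effect model with fixed genotypic and random environmental effects, $\mathbf{y}$ is an $m\,[= \sum_{j=1}^{e} (g r_j)] \times 1$ vector of observations, $\mathbf{y} = [y_{111}, y_{112}, \ldots, y_{g e r_e}]'$; $\boldsymbol{\beta}$ is a $(g + 1) \times 1$ vector of unknown fixed effects, $\boldsymbol{\beta} = [\mu, \tau_1, \tau_2, \ldots, \tau_g]'$; $\mathbf{u}$ is an $n\,(= e + ge + \sum_{j=1}^{e} r_j) \times 1$ vector of random effects, $\mathbf{u} = [\delta_1, \delta_2, \ldots, \delta_e, (\tau\delta)_{11}, (\tau\delta)_{12}, \ldots, (\tau\delta)_{ge}, \gamma_{11}, \gamma_{12}, \ldots, \gamma_{e r_e}]'$; $\mathbf{X}$ is an $m \times (g + 1)$ design matrix of 1s and 0s relating $\mathbf{y}$ to $\boldsymbol{\beta}$; $\mathbf{Z}$ is an $m \times n$ design matrix of 1s and 0s relating $\mathbf{y}$ to $\mathbf{u}$; $\boldsymbol{\varepsilon}$ is an $m \times 1$ vector of random errors, $\boldsymbol{\varepsilon} = [\varepsilon_{111}, \varepsilon_{112}, \ldots, \varepsilon_{g e r_e}]'$; and the prime ($'$) represents vector or matrix transposition. Random vectors $\mathbf{u}$ and $\boldsymbol{\varepsilon}$ are assumed to be normally and independently distributed with zero mean vectors and variance–covariance matrices $\mathbf{G}$ and $\mathbf{R}$, respectively, such that

$$\begin{bmatrix} \mathbf{u} \\ \boldsymbol{\varepsilon} \end{bmatrix} \sim N\left( \begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \begin{bmatrix} \mathbf{G} & \mathbf{0} \\ \mathbf{0} & \mathbf{R} \end{bmatrix} \right)$$

where $\sim N$ means normally distributed. Thus, the expectation and variance of $\mathbf{y}$ are $E(\mathbf{y}) = \mathbf{X}\boldsymbol{\beta}$ and $\mathrm{var}(\mathbf{y}) = \mathbf{V} = \mathbf{Z}\mathbf{G}\mathbf{Z}' + \mathbf{R}$. Different aspects of GEI have been characterized by allowing $\mathbf{G}$ and $\mathbf{R}$ to take different forms of covariance structure (e.g., Piepho, 1998; Yang, 2002; Crossa et al., 2004), but here I take a simple form of $\mathbf{G}$ and $\mathbf{R}$ for a comparison with the conventional ANOVA model:

$$\mathbf{G} = \begin{pmatrix} \sigma_\delta^2 \mathbf{I}_e & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \sigma_{\tau\delta}^2 \mathbf{I}_{ge} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \sigma_\gamma^2 \mathbf{I}_b \end{pmatrix}$$

and $\mathbf{R} = \sigma_\varepsilon^2 \mathbf{I}_m$, where $\mathbf{I}_e$, $\mathbf{I}_{ge}$, $\mathbf{I}_b$, and $\mathbf{I}_m$ are the identity matrices of orders $e$, $g \times e$, $b\,(= \sum_{j=1}^{e} r_j)$, and $m$, respectively. For the mixed-effect model with random genotypic and fixed environmental effects, Eq. [2] is modified with effects $\tau_i$ or $\delta_j$ being swapped, where $\boldsymbol{\beta} = [\mu, \delta_1, \delta_2, \ldots, \delta_e]'$, $\mathbf{u}$ is an $n\,(= g + g \times e + \sum_{j=1}^{e} r_j) \times 1$ vector of random effects, and $\mathbf{u} = [\tau_1, \tau_2, \ldots, \tau_g, (\tau\delta)_{11}, (\tau\delta)_{12}, \ldots, (\tau\delta)_{ge}, \gamma_{11}, \gamma_{12}, \ldots, \gamma_{e r_e}]'$. The fixed- and random-effect models can be similarly considered with different numbers of elements in $\boldsymbol{\beta}$ and $\mathbf{u}$. In the fixed-effect model, $\boldsymbol{\beta} = [\mu, \tau_1, \tau_2, \ldots, \tau_g, \delta_1, \delta_2, \ldots, \delta_e, (\tau\delta)_{11}, (\tau\delta)_{12}, \ldots, (\tau\delta)_{ge}]'$ and $\mathbf{u} = [\gamma_{11}, \gamma_{12}, \ldots, \gamma_{e r_e}]'$, whereas in the random-effect model, $\boldsymbol{\beta} = [\mu]$ and $\mathbf{u} = [\tau_1, \tau_2, \ldots, \tau_g, \delta_1, \delta_2, \ldots, \delta_e, (\tau\delta)_{11}, (\tau\delta)_{12}, \ldots, (\tau\delta)_{ge}, \gamma_{11}, \gamma_{12}, \ldots, \gamma_{e r_e}]'$. Of course, the dimensions of design matrices $\mathbf{X}$ and $\mathbf{Z}$ are changed accordingly under these different models.

Predictable Functions and Test for Crossover Interactions

Statistical inference under the mixed-effect model involves both $\boldsymbol{\beta}$ and $\mathbf{u}$. The general problem is to predict and test a linear combination of fixed and random effects, $\mathbf{K}'\boldsymbol{\beta} + \mathbf{M}'\mathbf{u}$, known as a predictable function, given that $\mathbf{K}'\boldsymbol{\beta}$ is an estimable function (Henderson, 1984; Stroup, 1989; Littell et al., 2006), where $\mathbf{K}$ and $\mathbf{M}$ are vectors of known coefficients that determine desired comparisons among a mixture of fixed and random effects. It has been shown (Henderson, 1984, p. 41–42; Schaeffer, 2006) that a predictor (a linear combination of observations), $\mathbf{L}'\mathbf{y}$, is a BLUP of $\mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{u}$ if $\mathbf{L}'\mathbf{y} = \mathbf{K}'\hat{\boldsymbol{\beta}} + \mathbf{M}'\hat{\mathbf{u}}$, where

$$\mathbf{L}' = \mathbf{M}'\mathbf{G}\mathbf{Z}'\mathbf{V}^{-1} + \mathbf{K}'(\mathbf{X}'\mathbf{V}^{-1}\mathbf{X})^{-}\mathbf{X}'\mathbf{V}^{-1} - \mathbf{M}'\mathbf{G}\mathbf{Z}'\mathbf{V}^{-1}\mathbf{X}(\mathbf{X}'\mathbf{V}^{-1}\mathbf{X})^{-}\mathbf{X}'\mathbf{V}^{-1}$$

and

$$\mathbf{L}'\mathbf{y} = \mathbf{K}'\hat{\boldsymbol{\beta}} + \mathbf{M}'\mathbf{G}\mathbf{Z}'\mathbf{V}^{-1}(\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}) = \mathbf{K}'\hat{\boldsymbol{\beta}} + \mathbf{M}'\hat{\mathbf{u}}$$

and $\hat{\boldsymbol{\beta}}$ and $\hat{\mathbf{u}}$ are simply solutions to the well-known mixed model equations (Henderson, 1984):

$$\begin{bmatrix} \hat{\boldsymbol{\beta}} \\ \hat{\mathbf{u}} \end{bmatrix} = \begin{bmatrix} \mathbf{X}'\mathbf{R}^{-1}\mathbf{X} & \mathbf{X}'\mathbf{R}^{-1}\mathbf{Z} \\ \mathbf{Z}'\mathbf{R}^{-1}\mathbf{X} & \mathbf{Z}'\mathbf{R}^{-1}\mathbf{Z} + \mathbf{G}^{-1} \end{bmatrix}^{-} \begin{bmatrix} \mathbf{X}'\mathbf{R}^{-1}\mathbf{y} \\ \mathbf{Z}'\mathbf{R}^{-1}\mathbf{y} \end{bmatrix} \qquad [3]$$

where superscript "$-1$" and superscript "$-$" represent matrix and generalized inverses, respectively. With known $\mathbf{G}$ and $\mathbf{R}$, $\hat{\boldsymbol{\beta}}$ is the BLUE of $\boldsymbol{\beta}$ as often obtained using the generalized least squares estimation procedure, and $\hat{\mathbf{u}}$ is the BLUP of $\mathbf{u}$, which shrinks the fixed-effect estimates of $\mathbf{u}$ toward the expected value of zero (e.g., McLean et al., 1991; Robinson, 1991). The values of $\mathbf{G}$ and $\mathbf{R}$ are usually unknown, however, and their estimates, $\hat{\mathbf{G}}$ and $\hat{\mathbf{R}}$, are substituted into Eq. [3] to obtain the empirical BLUE (EBLUE) of $\boldsymbol{\beta}$ and the empirical BLUP (EBLUP) of $\mathbf{u}$.

The sampling variability of the predictable function, $\mathbf{K}'\hat{\boldsymbol{\beta}} + \mathbf{M}'\hat{\mathbf{u}}$, is measured by $\mathrm{Var}[\mathbf{K}'(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}) + \mathbf{M}'(\hat{\mathbf{u}} - \mathbf{u})]$. To evaluate this variance, one needs to recognize that (i) both estimated fixed effects ($\hat{\boldsymbol{\beta}}$) and predicted random effects ($\hat{\mathbf{u}}$) carry the sampling variability; (ii) while parametric values of fixed effects ($\boldsymbol{\beta}$) have zero variances and covariances with other effects, those of unobservable random effects ($\mathbf{u}$) do not because these effects themselves have probability distributions. The sampling variance of $\mathbf{K}'\hat{\boldsymbol{\beta}} + \mathbf{M}'\hat{\mathbf{u}}$ (Stroup, 1989) is

$$\begin{aligned} \mathrm{Var}[\mathbf{K}'(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}) + \mathbf{M}'(\hat{\mathbf{u}} - \mathbf{u})] ={}& \mathbf{K}'\mathrm{Var}(\hat{\boldsymbol{\beta}})\mathbf{K} + \mathbf{K}'\mathrm{Cov}[\hat{\boldsymbol{\beta}}, (\hat{\mathbf{u}} - \mathbf{u})']\mathbf{M} \\ &+ \mathbf{M}'\mathrm{Cov}[(\hat{\mathbf{u}} - \mathbf{u}), \hat{\boldsymbol{\beta}}']\mathbf{K} + \mathbf{M}'\mathrm{Var}(\hat{\mathbf{u}} - \mathbf{u})\mathbf{M} \end{aligned} \qquad [4]$$

where $\mathrm{Var}(\hat{\boldsymbol{\beta}})$, $\mathrm{Cov}[\hat{\boldsymbol{\beta}}, (\hat{\mathbf{u}} - \mathbf{u})']$, $\mathrm{Cov}[(\hat{\mathbf{u}} - \mathbf{u}), \hat{\boldsymbol{\beta}}']$, and $\mathrm{Var}(\hat{\mathbf{u}} - \mathbf{u})$ are the terms of the so-called $\mathbf{C}$ matrix, which is equal to the generalized inverse of the coefficient matrix of the mixed model equations in Eq. [3] (Henderson, 1984; McLean et al., 1991):

$$\mathbf{C} = \begin{bmatrix} \mathrm{Var}(\hat{\boldsymbol{\beta}}) & \mathrm{Cov}[\hat{\boldsymbol{\beta}}, (\hat{\mathbf{u}} - \mathbf{u})'] \\ \mathrm{Cov}[(\hat{\mathbf{u}} - \mathbf{u}), \hat{\boldsymbol{\beta}}'] & \mathrm{Var}(\hat{\mathbf{u}} - \mathbf{u}) \end{bmatrix} = \begin{bmatrix} \mathbf{C}_{\beta\beta} & \mathbf{C}_{\beta u} \\ \mathbf{C}_{u\beta} & \mathbf{C}_{uu} \end{bmatrix} = \begin{bmatrix} \mathbf{X}'\mathbf{R}^{-1}\mathbf{X} & \mathbf{X}'\mathbf{R}^{-1}\mathbf{Z} \\ \mathbf{Z}'\mathbf{R}^{-1}\mathbf{X} & \mathbf{Z}'\mathbf{R}^{-1}\mathbf{Z} + \mathbf{G}^{-1} \end{bmatrix}^{-} \qquad [5]$$

with $\mathbf{C}_{\beta u} = \mathbf{C}_{u\beta}'$ and

$$\begin{aligned} \mathbf{C}_{\beta\beta} &= (\mathbf{X}'\mathbf{V}^{-1}\mathbf{X})^{-} \\ \mathbf{C}_{u\beta} &= -\mathbf{G}\mathbf{Z}'\mathbf{V}^{-1}\mathbf{X}\mathbf{C}_{\beta\beta} \\ \mathbf{C}_{uu} &= (\mathbf{Z}'\mathbf{R}^{-1}\mathbf{Z} + \mathbf{G}^{-1})^{-1} - \mathbf{C}_{u\beta}\mathbf{X}'\mathbf{V}^{-1}\mathbf{Z}\mathbf{G} \end{aligned}$$
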
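
As a rough illustration of Eq. [3] to [5], the following Python sketch solves the mixed model equations for a much simpler model than Eq. [1]: a balanced one-way random-effects model $y_{ij} = \mu + u_i + e_{ij}$ with known variances. The model, the data, and the helper names (`gauss_jordan_inverse`, `predictable_function`) are all assumptions made for this example and are not taken from the original SAS program. The coefficient matrix is nonsingular here, so an ordinary inverse stands in for the generalized inverse.

```python
def gauss_jordan_inverse(a):
    """Invert a small nonsingular square matrix (list of lists) by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))  # partial pivoting
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def predictable_function(coefs, solution, c_matrix):
    """Return the estimate l'[beta; u] and its sampling variance l'Cl (cf. Eq. [4])."""
    n = len(coefs)
    estimate = sum(l * s for l, s in zip(coefs, solution))
    variance = sum(coefs[r] * c_matrix[r][c] * coefs[c]
                   for r in range(n) for c in range(n))
    return estimate, variance


# Hypothetical data: 3 random groups, 2 observations each; variances assumed known.
groups = [[1.0, 3.0], [4.0, 6.0], [7.0, 9.0]]
sigma2_u, sigma2_e = 1.0, 1.0
q = len(groups)
n_i = [len(grp) for grp in groups]

# Coefficient matrix of Eq. [3] with R = sigma2_e * I and G = sigma2_u * I.
coef = [[0.0] * (q + 1) for _ in range(q + 1)]
coef[0][0] = sum(n_i) / sigma2_e                      # X'R^{-1}X
for i in range(q):
    coef[0][i + 1] = coef[i + 1][0] = n_i[i] / sigma2_e   # X'R^{-1}Z and its transpose
    coef[i + 1][i + 1] = n_i[i] / sigma2_e + 1.0 / sigma2_u   # Z'R^{-1}Z + G^{-1}
rhs = [sum(map(sum, groups)) / sigma2_e] + [sum(grp) / sigma2_e for grp in groups]

C = gauss_jordan_inverse(coef)                        # the C matrix of Eq. [5]
solution = [sum(C[r][c] * rhs[c] for c in range(q + 1)) for r in range(q + 1)]

# Predictable function u_1 - u_3: K = [0], M = [1, 0, -1].
estimate, variance = predictable_function([0, 1, 0, -1], solution, C)
print(solution, estimate, variance)
```

For these balanced data the solution reproduces the familiar shrinkage result: the group means 2, 5, and 8 deviate from the overall mean 5 by −3, 0, and 3, and the predicted effects are two-thirds of those deviations.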


The standard error of the predictable function (SEP) is given by

$$\mathrm{SEP} = \sqrt{\mathrm{Var}\left[\mathbf{K}'(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}) + \mathbf{M}'(\hat{\mathbf{u}} - \mathbf{u})\right]} = \sqrt{\mathbf{K}'\mathbf{C}_{\beta\beta}\mathbf{K} + \mathbf{K}'\mathbf{C}_{\beta u}\mathbf{M} + \mathbf{M}'\mathbf{C}_{u\beta}\mathbf{K} + \mathbf{M}'\mathbf{C}_{uu}\mathbf{M}} \qquad [6]$$

It should be pointed out that $\mathbf{C}$ would underestimate the true sampling variability of $\hat{\boldsymbol{\beta}}$ and $\hat{\mathbf{u}}$ because the uncertainty arising from estimating $\mathbf{G}$ and $\mathbf{R}$ is not considered. Different inflation factors (e.g., Prasad and Rao, 1990; Harville and Jeske, 1992; Kenward and Roger, 1997) have been proposed to account for the underestimation but they are often very small unless a data set is poorly balanced.

The key step of testing for COI under the mixed-effect model is to identify a predictable function that allows a comparison between a pair of genotypes evaluated at a random environment or for a comparison between a pair of environments used to evaluate a random genotype. Like the Azzalini–Cox test in the fixed-effect model (Baker, 1988; Cornelius et al., 1992), a two-way ($g \times e$) genotype–environment table is created to facilitate such comparisons. Unlike in the fixed-effect model, however, it is not the difference between the two cell means from the two-way table but rather the difference between their BLUPs that should be used. For example, to compare genotypes $i$ and $i'$ at environment $j$, one needs the difference between BLUPs of the $ij$th cell mean [$\mu_{ij} = E(\bar{y}_{ij\cdot}) = \mu + \tau_i + \delta_j + (\tau\delta)_{ij}$] and the $i'j$th cell mean [$\mu_{i'j} = E(\bar{y}_{i'j\cdot}) = \mu + \tau_{i'} + \delta_j + (\tau\delta)_{i'j}$], $\mathrm{BLUP}(\mu_{ij}) - \mathrm{BLUP}(\mu_{i'j})$. For clarity, I will show the BLUPs of the cell means from a balanced data set (i.e., $r_j = r$ for all $j$) using the development of Cornelius and Crossa (1999). Thus, the BLUP of the $ij$th cell mean is

$$\mathrm{BLUP}(\mu_{ij}) = \mathrm{BLUE}(\mu_i) + \mathrm{BLUP}(\delta_j) + \mathrm{BLUP}[(\tau\delta)_{ij}] \qquad [7]$$

where $\mathrm{BLUE}(\mu_i) = \mathrm{BLUE}(\mu + \tau_i)$ = the simple mean of the $i$th genotype ($\bar{y}_{i\cdot\cdot}$),

$$\mathrm{BLUP}(\delta_j) = \frac{rg\sigma_\delta^2}{E(\mathrm{MS}_\delta)} (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})$$

and

$$\mathrm{BLUP}[(\tau\delta)_{ij}] = \frac{r\sigma_{\tau\delta}^2}{E(\mathrm{MS}_\delta)} (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}) + \frac{r\sigma_{\tau\delta}^2}{E(\mathrm{MS}_{\tau\delta})} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})$$

with $E(\mathrm{MS}_\delta)$ and $E(\mathrm{MS}_{\tau\delta})$ being the expected mean squares for environmental and GEI factors from the ANOVA table, respectively, $E(\mathrm{MS}_\delta) = \sigma_\varepsilon^2 + g\sigma_\gamma^2 + r\sigma_{\tau\delta}^2 + rg\sigma_\delta^2$ and $E(\mathrm{MS}_{\tau\delta}) = \sigma_\varepsilon^2 + r\sigma_{\tau\delta}^2$. Thus,

$$\mathrm{BLUP}(\mu_{ij}) = \bar{y}_{i\cdot\cdot} + S_\delta (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}) + S_{\tau\delta} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot}) \qquad [8]$$

where $S_\delta = [rg\sigma_\delta^2 + r\sigma_{\tau\delta}^2]/E(\mathrm{MS}_\delta)$ and $S_{\tau\delta} = r\sigma_{\tau\delta}^2/E(\mathrm{MS}_{\tau\delta})$ are the shrinkage factors for environmental and GEI effects, respectively. If the variance components and thus expected mean squares are unknown and need to be estimated, then the shrinkage factors can be substituted by their estimates, $\hat{S}_\delta = (rg\hat{\sigma}_\delta^2 + r\hat{\sigma}_{\tau\delta}^2)/\mathrm{MS}_\delta$ and $\hat{S}_{\tau\delta} = r\hat{\sigma}_{\tau\delta}^2/\mathrm{MS}_{\tau\delta}$, and the BLUP of the $ij$th cell mean in Eq. [8] is replaced by its EBLUP, $\mathrm{EBLUP}(\mu_{ij}) = \bar{y}_{i\cdot\cdot} + \hat{S}_\delta (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}) + \hat{S}_{\tau\delta} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})$. Similarly, $\mathrm{BLUP}(\mu_{i'j})$ is obtainable by simply substituting the subscript $i$ with $i'$. Furthermore, the BLUP values of these cell means under the random-effect model can also be obtained as done by Cornelius and Crossa (1999), whose derivation was based directly on the cell means rather than the individual observations used here. With the BLUPs of the cell means of fixed genotypes and random environments derived above, the desired comparison between genotypes $i$ and $i'$ at a random environment ($j$) is

$$\begin{aligned} \mathrm{BLUP}(\mu_{ij}) - \mathrm{BLUP}(\mu_{i'j}) &= \mathrm{BLUE}(\tau_i) - \mathrm{BLUE}(\tau_{i'}) + \mathrm{BLUP}[(\tau\delta)_{ij}] - \mathrm{BLUP}[(\tau\delta)_{i'j}] \\ &= (\bar{y}_{i\cdot\cdot} - \bar{y}_{i'\cdot\cdot}) + S_{\tau\delta}\left[(\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot}) - (\bar{y}_{i'j\cdot} - \bar{y}_{i'\cdot\cdot})\right] \\ &= (1 - S_{\tau\delta})(\bar{y}_{i\cdot\cdot} - \bar{y}_{i'\cdot\cdot}) + S_{\tau\delta}(\bar{y}_{ij\cdot} - \bar{y}_{i'j\cdot}) \end{aligned} \qquad [9]$$

The first part of Eq. [9] leads directly to the construction of the desired predictable function for the comparison with the elements of vectors $\mathbf{K} = [\{K_l\}]'$ and $\mathbf{M} = [\mathbf{0}_e\ \{M_{lj}\}\ \mathbf{0}_b]'$ being

$$K_l = \begin{cases} 1, & \text{if } l = i \\ -1, & \text{if } l = i' \\ 0, & \text{elsewhere} \end{cases} \quad \text{and} \quad M_{lj} = \begin{cases} 1, & \text{if } l = i \\ -1, & \text{if } l = i' \\ 0, & \text{elsewhere} \end{cases} \qquad [10]$$

and $\mathbf{0}_e$ and $\mathbf{0}_b$ denote $e \times 1$ and $b \times 1$ vectors of 0s. It is also evident from the second part of Eq. [9] that the differences between the cell and marginal means are shrunken to the extent determined by the GEI and error variances only. Similarly, the BLUPs of the cell means and the differences between pairs of environments for a random genotype can be derived under the mixed-effect model involving random genotypic and fixed environmental effects, with the genotypic and environmental effects being swapped in Eq. [7–10]. Further, the same procedure can be used to obtain the BLUEs of the cell means and pairwise differences between BLUEs of the cell means under the fixed-effect model and to obtain the BLUPs of the cell means and pairwise differences between BLUPs of the cell means under the random-effect model. These BLUEs and BLUPs enable the desired predictable functions to be constructed, resulting in appropriate sets of vectors $\mathbf{K}$ and $\mathbf{M}$.

The null hypothesis of no difference between the two genotypes ($i$ and $i'$) at random environment $j$ or no difference between the two environments ($j$ and $j'$) for random genotype $i$ (i.e., $\mathbf{K}'\boldsymbol{\beta} + \mathbf{M}'\mathbf{u} = 0$) can be tested using a generalized $t$ statistic as follows:

$$t = \frac{\mathbf{K}'\hat{\boldsymbol{\beta}} + \mathbf{M}'\hat{\mathbf{u}}}{\mathrm{SEP}} \qquad [11]$$

where SEP is given in Eq. [6]. The statistic in Eq. [11] is approximately $t$ distributed, with the approximate degrees of freedom being estimated according to McLean and Sanders (1988) and Kenward and Roger (1997). As pointed out above, the true sampling variability of $\hat{\boldsymbol{\beta}}$ and $\hat{\mathbf{u}}$ and thus SEP may be underestimated because the uncertainty arising from estimating $\mathbf{G}$ and $\mathbf{R}$ is not accounted for. Thus the $t$ statistic in Eq. [11] may be slightly biased upward if there is no correction for the possible underestimation of SEP (e.g., Kenward and Roger, 1997). Likewise, the $t$ statistic for the comparison of the same pair of genotypes in another random environment ($j'$) or for the comparison of the same pair of environments for another random genotype ($i'$) can be similarly calculated.

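
The shrinkage in Eq. [8] and [9] can be made concrete with a small Python sketch for a balanced two-way table of cell means with fixed genotypes and random environments. This is an illustration under stated assumptions, not the author's implementation: the function names, the 2 × 2 table, and the variance components are invented, and the variance components are treated as known rather than estimated.

```python
def shrinkage_factors(r, g, s2_delta, s2_taudelta, s2_gamma, s2_eps):
    """Shrinkage factors S_delta and S_taudelta defined below Eq. [8] (balanced data)."""
    ems_delta = s2_eps + g * s2_gamma + r * s2_taudelta + r * g * s2_delta
    ems_taudelta = s2_eps + r * s2_taudelta
    s_delta = (r * g * s2_delta + r * s2_taudelta) / ems_delta
    s_taudelta = r * s2_taudelta / ems_taudelta
    return s_delta, s_taudelta


def blup_cell_means(ybar, s_delta, s_taudelta):
    """BLUP of every cell mean by Eq. [8]; ybar[i][j] is the mean of genotype i in environment j."""
    g, e = len(ybar), len(ybar[0])
    geno = [sum(row) / e for row in ybar]                          # genotype means
    env = [sum(ybar[i][j] for i in range(g)) / g for j in range(e)]  # environment means
    grand = sum(geno) / g
    return [[geno[i] + s_delta * (env[j] - grand)
             + s_taudelta * (ybar[i][j] - geno[i] - env[j] + grand)
             for j in range(e)] for i in range(g)]


def blup_difference(ybar, i, i2, j, s_taudelta):
    """Difference between genotypes i and i2 at environment j by the last line of Eq. [9]."""
    e = len(ybar[0])
    mean_i = sum(ybar[i]) / e
    mean_i2 = sum(ybar[i2]) / e
    return (1 - s_taudelta) * (mean_i - mean_i2) + s_taudelta * (ybar[i][j] - ybar[i2][j])


# Hypothetical balanced table: 2 genotypes (rows) x 2 environments (columns), r = 2.
ybar = [[6.0, 2.0], [4.0, 6.0]]
s_delta, s_taudelta = shrinkage_factors(
    r=2, g=2, s2_delta=1.0, s2_taudelta=1.0, s2_gamma=0.5, s2_eps=2.0)
blups = blup_cell_means(ybar, s_delta, s_taudelta)
diffs = [blup_difference(ybar, 0, 1, j, s_taudelta) for j in range(2)]
print(s_delta, s_taudelta, diffs)
```

With these made-up values $S_{\tau\delta} = 0.5$, so the raw differences of 2 and −4 between the two genotypes are pulled halfway toward the marginal difference of −1, giving 0.5 and −2.5. Whether such a pair of differences constitutes a significant crossover still depends on their standard errors through Eq. [6] and [11], which this sketch does not compute.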


The presence of COI will then be evaluated by determining if the genotypic or environmental difference is significantly greater than zero in one environment or for one genotype and significantly less than zero in the other.

Following Cornelius et al. (1992), the critical values for assessing significant COI are calculated using three different $t$-tests, depending on if an experiment-wise (the original Azzalini–Cox test), comparison-wise, or interaction-wise error rate ($\theta$) is used. For a given significance level ($\alpha$), the respective $\theta$ values are given by

$$\theta = \begin{cases} \sqrt{-2\log(1 - \alpha)/[g(g - 1)e(e - 1)]}, & \text{if experiment-wise} \\ \alpha, & \text{if comparison-wise} \\ \sqrt{\alpha/2}, & \text{if interaction-wise} \end{cases}$$

to control the overall error rate for all the comparisons, the individual or comparison-wise error rate for each comparison, and the error rate per quadruple for COI, respectively. The powers of the three $t$-tests for COI are different, with the test based on the experiment-wise error rate being the most conservative and its power decreasing with the increasing number of genotypes and environments. The test using the interaction-wise error rate is the most sensitive and the test based on the comparison-wise error rate has intermediate power and Type I error rate. Regardless of which error rate is protected, a test for significant COI is essentially the test for the null hypothesis that no COI exists in any one of all possible quadruples vs. the alternative hypothesis that some COI exist. Cornelius et al. (1992) argued against the use of the original Azzalini–Cox test because it gives experiment-wise error rate protection against rejecting a true null hypothesis (lower Type I error rate) at a cost of high Type II error rate (i.e., low power to detect the true COI). In other words, a Type I error may not be serious because follow-up cultivar trials will reveal spurious COI, but a Type II error is serious because a potentially important COI may go undetected.

As evident in Eq. [8], different variance components for random effects are needed to compute shrinkage factors for deriving BLUPs of the cell means. I used two ANOVA-based methods, SAS Type 1 (based on computation of sequential sum of squares for each random effect) and SAS Type 3 (based on computation of partial sum of squares for each random effect), and two likelihood-based methods, maximum likelihood (ML) and restricted maximum likelihood (REML), to assess the effects of different estimation methods on the BLUPs and thus on the test for COI. These estimation methods are all available in the SAS MIXED procedure (SAS Institute, 2004). The REML and ML methods are preferred methods for estimating variance components because they are able to accommodate data sets with unbalanced or complicated data structures and they possess the properties of consistency and asymptotic normality of the estimators desirable for hypothesis testing (Searle et al., 1992). The ANOVA estimators are included, however, because the ANOVA or least squares analysis of METs continues to dominate the current GEI literature (e.g., Cornelius and Crossa, 1999; Gauch, 2006; Yan and Tinker, 2006). In addition, REML and ML estimators may be biased when the constraint of non-negative estimates of variance components needs to be imposed. In contrast, ANOVA estimators are always unbiased. While the debate remains whether negative estimates of variance components should be reported as such or should be constrained to zero, the use of ANOVA estimators would avoid biased F tests and standard errors for different effects (Searle et al., 1992; Littell et al., 2006). The comparison between the two ANOVA estimators provides information on the effect of the lack of balance in the data, whereas the comparison between REML and ML estimators assesses the extent to which the ML estimator is biased due to the presence of fixed effects. With a balanced data set and non-negative estimates of variance components, REML estimators are identical to ANOVA estimators; otherwise, all four estimators may be different (Searle et al., 1992).

Computing Predictable Functions and Testing Crossover Interactions

The test for COI described above can be implemented in the SAS MIXED procedure (SAS Institute, 2004). The usual overall tests for fixed and random effects are achieved by specifying fixed effects in the MODEL statement and random effects in the RANDOM statement. For example, under the mixed-effect model involving fixed genotypic and random environmental effects, the genotypic effects are given in the MODEL statement and environmental, GEI, and replications-within-environmental effects in the RANDOM statement.

While the solution options in the MODEL statement and in the RANDOM statement provide EBLUE of $\boldsymbol{\beta}$ and EBLUP of $\mathbf{u}$, respectively, it is the use of the ESTIMATE statement that allows estimating and testing for any linear combinations of fixed and random effects, thereby testing for COI. The usual and familiar use of the ESTIMATE statement is for constructing estimable functions (linear combinations of fixed effects). When random effects are included as well, the ESTIMATE statement uses a vertical bar (|) to separate fixed effects (before the bar) from random effects (after the bar). Thus the coefficients given in the ESTIMATE statement constitute the elements of vector $\mathbf{K}$ for the fixed effects and those of vector $\mathbf{M}$ for random effects. Given that multiple ESTIMATE statements are allowed under one PROC MIXED session (so long as they all appear after the MODEL and RANDOM statements), the SAS MACRO facility is used to create different ESTIMATE statements for comparing pairs of genotypes at each and every random environment or comparing pairs of environments for each and every random genotype. This works well for a moderate number of comparisons, but is problematic for a very large number of comparisons. For example, for $g = 30$ genotypes (fixed) and $e = 50$ environments (random), one needs to create a total of $e \times [g(g - 1)/2] = 21\,750$ ESTIMATE statements for all possible comparisons, which is too numerous to be handled by a computer with a modest amount of memory!

An alternative strategy is to use the estimated $\mathbf{C}$ matrix (cf. Eq. [5]) and the EBLUE of $\boldsymbol{\beta}$ and EBLUP of $\mathbf{u}$ produced by the MMEQSOL option of PROC MIXED to calculate all possible comparisons, their associated standard errors, and associated $t$ statistics, following Eq. [6] and [11]. As a partial check on the calculations, the results from comparing the first two genotypes (or environments) at each and every random environment (or genotype) are confirmed with outputs from a subset of ESTIMATE statements for corresponding comparisons. These calculations are implemented in a SAS program called “mixed_COI.sas.”

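
The bookkeeping behind this procedure can be sketched in a few lines of Python. The sketch is independent of the SAS program named above and rests on two assumptions that should be checked against Cornelius et al. (1992): that the experiment-wise and interaction-wise values of $\theta$ take the square-root form, which is the form consistent with the power ordering described earlier, and that a crossover is declared for a quadruple when the two $t$ statistics exceed a common critical value in opposite directions. The function names are hypothetical, and the critical $t$ value must be supplied by the user because the Python standard library has no $t$ quantile function.

```python
import math


def theta(alpha, g, e, error_rate):
    """Tail probability for each pairwise t-test under the chosen error-rate protection."""
    if error_rate == "experiment":
        return math.sqrt(-2.0 * math.log(1.0 - alpha) / (g * (g - 1) * e * (e - 1)))
    if error_rate == "comparison":
        return alpha
    if error_rate == "interaction":
        return math.sqrt(alpha / 2.0)
    raise ValueError("unknown error rate: " + error_rate)


def n_estimate_statements(g, e):
    """Number of pairwise genotype comparisons across all environments: e * g(g - 1)/2."""
    return e * g * (g - 1) // 2


def is_crossover(t_first, t_second, t_crit):
    """True when the two differences are significant and of opposite sign."""
    return ((t_first > t_crit and -t_second > t_crit)
            or (-t_first > t_crit and t_second > t_crit))


# Barley-sized example: 6 cultivars, 18 sites, alpha = 0.05.
for kind in ("experiment", "comparison", "interaction"):
    print(kind, theta(0.05, 6, 18, kind))
print(n_estimate_statements(30, 50))
```

For $g = 30$ and $e = 50$ the count reproduces the 21 750 comparisons quoted above, and for the barley dimensions the three thresholds are ordered experiment-wise, comparison-wise, interaction-wise from smallest to largest.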


The core part of the program is listed and explained in the Appendix for the case of the mixed-effect model involving fixed genotypes and random environments. Modifications of SAS codes can be readily made to accommodate the case of the mixed-effect model involving random genotypes and fixed environments and the case of the fixed- or random-effect model. The complete program is available from me on request.

Data Sets

Two data sets were used for the COI assessment under the mixed-effect model. The first data set used for the mixed-effect model, involving fixed genotypic effects and random environmental effects, was taken from the Canadian Prairies Barley Trials as described in Yang et al. (2006). While these trials included a large number of cultivars and advanced breeding lines, a focus has been on the following six two-row barley cultivars (Helm et al., 2004): three eligible for feed grades (‘CDC Dolly’, ‘Seebe’, and ‘Xena’) and three others eligible for malting grades (‘AC Metcalfe’, ‘Harrington’, and ‘Merit’). Details of cultivar development, yield performance, and agronomic and quality characteristics are given in Field Crop Development Centre (2006). I analyzed the yield data of these six cultivars evaluated in 2003 at 18 sites across the Province of Alberta including the two neighboring sites in the Province of British Columbia (Table 1). Two sites in Lethbridge, representing rainfed (dryland) and irrigated conditions, are within 10 km of each other and are indistinguishable at the scale of the map. The cultivar trials across all sites were conducted using a RCBD with three or four replications. Cultural practices such as fertility, tillage, and pest control varied from site to site but were considered to be the most appropriate for the individual sites.

The second data set for the mixed-effect model, involving random genotypic effects and fixed environmental effects, was taken from a set of field pea cultivar trials conducted at 21 sites across Alberta in 2001 (Yang et al., 2005). Specifically, I analyzed the yield data of 33 registered cultivars or advanced breeding lines at four selected sites (Brooks, Vegreville, Namao, and High Prairie) representing four regions of definite geography and soil characteristics: southern Alberta, east-central Alberta, west-central Alberta, and the Peace River region (Yang et al., 2005). The Brooks site was under irrigation, whereas the other three sites were under a dryland condition. These four sites also belong to four distinctive isoyield groups based on clustering analysis (Yang et al., 2005). The experimental design for the trials at all four sites was a RCBD with four replications.

RESULTS
Crossover Interactions in the Barley Data Set

Table 1 is a two-way table giving the average yields of 108 combinations between six barley cultivars and 18 sites, based on four replications at seven sites (Big Lake, CDC North, Dawson Creek, Ft. Kent, Ft. St. John, Northern Sunrise, and St. Paul) and three replications at the remaining 11 sites. These cell means based on the fixed-effect model (i.e., both cultivar and site effects are fixed) serve as reference points in subsequent analyses under mixed- or random-effect models. The least square means and simple means for individual sites are the same because the data set is balanced within each site. The number of replications is not constant across sites, however, so the best estimates of genotype means are least square means, which differ from the simple means.

Given in Table 2 are the estimates of variance components of random effects for sites, replications (sites), cultivar × site interactions, and errors using the four estimation methods: SAS Type 1, SAS Type 3, REML, and ML. The REML

Table 1. Yields (Mg ha⁻¹) of six barley (Hordeum vulgare L.) cultivars at 18 sites across Alberta, Canada, in 2003. Longitude and latitude are in degrees.
Site                      Longitude  Latitude  AC Metcalfe  CDC Dolly  Harrington  Merit  Seebe   Xena   Mean
Beaverlodge                  119.43     55.21         4.13       3.87        4.04   4.75   3.87   4.44   4.18
Big Lakes                    113.70     53.61         6.73       6.63        6.58   7.99   7.02   8.39   7.22
Calmar                       113.85     53.26        10.18       8.73        9.66  11.05   9.71  11.20  10.09
CDC North                    113.33     53.63         5.92       6.89        5.72   6.07   6.62   7.89   6.52
Dawson Creek                 120.23     55.76         4.31       4.50        4.41   4.88   4.71   4.46   4.54
Ft. Kent                     110.61     54.31         4.68       3.67        2.80   4.74   3.49   4.81   4.03
Ft. St. John                 120.85     56.25         7.64       7.57        6.91   8.16   7.91   8.10   7.72
Irricana                     113.60     51.32         3.74       4.43        3.97   3.39   3.80   4.23   3.93
Killam                       111.85     52.78         3.55       2.71        3.12   2.66   2.67   3.26   2.99
Lacombe                      113.73     52.46         3.86       4.26        3.99   3.53   3.99   4.79   4.07
Lethbridge (dry)             112.81     49.70         2.38       1.72        2.41   2.67   2.24   2.57   2.33
Lethbridge (irrigated)       112.81     49.70         6.00       5.90        5.95   7.12   6.20   6.88   6.34
Lomond                       112.65     50.35         4.35       4.58        4.43   4.54   3.79   5.15   4.47
Neapolis                     113.86     51.65         4.98       4.74        4.33   4.48   4.40   4.97   4.65
Northern Sunrise                  –         –         6.15       6.45        6.07   6.96   6.30   6.78   6.45
Olds                         114.09     51.78         7.01       7.66        7.23   7.19   7.01   8.12   7.37
St. Paul                     111.28     53.98         3.43       2.80        2.93   2.57   3.07   3.30   3.02
Stettler                     112.71     52.31         4.59       4.84        4.50   4.45   4.53   5.09   4.67
Mean                                                  5.20       5.11        4.95   5.40   5.07   5.80   5.26
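
The dimensions of the barley data set in Table 1 (six cultivars, 18 sites, four replications at seven sites and three at the other 11) determine the counts used throughout the analysis. The following is a minimal Python sketch of that arithmetic, not part of the original SAS program; the function names are illustrative, and it assumes each replication contains every one of the six cultivars exactly once. The quadruple count uses the expression g(g − 1)e(e − 1)/4 given in the paper.

```python
from math import comb


def design_counts(n_cultivars, reps_per_site):
    """Return (cell means, plot observations) for a cultivar-by-site trial.

    Assumes every replication at a site contains each cultivar once.
    """
    n_sites = len(reps_per_site)
    cells = n_cultivars * n_sites
    observations = n_cultivars * sum(reps_per_site)
    return cells, observations


def n_quadruples(n_genotypes, n_environments):
    """Number of (genotype pair, environment pair) quadruples.

    Equals g(g - 1)e(e - 1)/4, i.e. C(g, 2) * C(e, 2).
    """
    return comb(n_genotypes, 2) * comb(n_environments, 2)


# Barley trial: 7 sites with 4 replications and 11 sites with 3.
reps = [4] * 7 + [3] * 11
cells, observations = design_counts(6, reps)
quadruples = n_quadruples(6, 18)
print(cells, observations, quadruples)
```

Under these assumptions the sketch gives 108 cell means, 366 observations, and 2295 quadruples, which are the figures reported in the text.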

1056 WWW.CROPS.ORG CROP SCIENCE, VOL. 47, MAY – JUNE 2007


estimates differ from those by Type 1 and Type 3, as expected for an unbalanced data set like the barley MET data; they would be identical if the data were balanced. The ML estimates are only slightly smaller than the estimators of the other three methods because the deduction of the number of fixed (cultivar) effects (six) from the total observations of 366 has a negligible effect. The same estimates by Type 1 and Type 3 are not surprising because of the hierarchical structure of different effects.

Table 2. Estimates of variance components (±SE) by four estimation methods (Type 1, Type 3, restricted maximum likelihood [REML], and maximum likelihood [ML]) in the mixed-model analysis of barley (Hordeum vulgare L.) multiple-environment trial data.
Variance component†   Type 1             Type 3             REML               ML
σ²_δ                  3.8642 ± 1.3063    3.8642 ± 1.3063    3.9733 ± 1.3795    3.7514 ± 1.3063
σ²_γ                  0.0621 ± 0.0204    0.0621 ± 0.0204    0.0622 ± 0.0204    0.0622 ± 0.0204
σ²_τδ                 0.1181 ± 0.0273    0.1181 ± 0.0273    0.1171 ± 0.0269    0.1076 ± 0.0273
σ²_ε                  0.1892 ± 0.0182    0.1892 ± 0.0182    0.1887 ± 0.0182    0.1887 ± 0.0182
† σ²_δ, σ²_γ, σ²_τδ, and σ²_ε represent variances for site effects, replications-within-sites effects, cultivar × site interaction effects, and random errors, respectively. Note that cultivar effects are fixed.

While the significance of cultivar × site interaction can be assessed directly and quickly by determining if the cultivar × site variance component is greater than zero by more than twice its standard error (Wald’s Z test), this test is not particularly reliable for small sample sizes and for the variance components with a skewed or bounded sampling distribution (Littell et al., 2006). Thus, I carried out a LR test (Yang, 2002) by comparing −2(Res)log likelihoods for two models, one with the interaction term (full model) as in Eq. [1] and the other without it (reduced model). The LR tests based on all four estimation methods show significant GEI variability (Table 3). For example, the −2(Res)log likelihoods by the REML method for the full and reduced models are 691.8 and 737.4, respectively, and LR = 737.4 – 691.8 = 45.6. Under the null hypothesis that the GEI variance component is zero, LR follows a chi-square distribution with one degree of freedom (χ₁²), leading to a probability of 1.45 × 10⁻¹¹. However, this probability needs to be halved (7.25 × 10⁻¹²) because the asymptotic distribution of the LR statistic with the boundary value of the covariance parameter (zero GEI variance) in the reduced model is actually a 50:50 mixture of χ₀² and χ₁² distributions (Self and Liang, 1987). Given that the χ₀² distribution takes the value 0 with probability 1, the mixture distribution takes the value 0 with probability ½ and takes a value drawn from a χ₁² distribution with probability ½. The small probability obviously indicates that the null hypothesis of zero GEI variance is unlikely to be true and that interactions between cultivars and sites are significant.

Table 3. Likelihood ratio (LR) tests for cultivar × site interactions under three linear models in the mixed-model analysis of barley (Hordeum vulgare L.) multiple-environment trial data. The methods of estimating covariance parameters are: Type 1, Type 3, restricted maximum likelihood (REML), and maximum likelihood (ML).
Model                  Type 1, Type 3, and REML   ML
Fixed-effect model
  LR                   182.5                      293.6
  Probability          6.90 × 10⁻⁴²               4.09 × 10⁻⁶⁶
Mixed-effect model
  LR                   45.6                       42.5
  Probability          7.25 × 10⁻¹²               3.54 × 10⁻¹¹
Random-effect model
  LR                   45.8                       45.9
  Probability          6.55 × 10⁻¹²               6.20 × 10⁻¹²

Of 2295 [6(6 – 1)18(18 – 1)/4] possible quadruples evaluated for COI, the number of significant COI varies with different models (fixed, mixed, and random), estimation methods (Type 1, Type 3, REML, and ML), and test criteria (experiment-wise, comparison-wise, and interaction-wise) as shown in Table 4. While different numbers of significant COI are observed for combinations of estimation methods, models, and test criteria, the least and most sensitive of these combinations share the same quadruples that display significant COI. Three points are quite obvious from the table.

Table 4. Number of significant crossover interactions with three different linear models (fixed, mixed, and random effects), four estimation methods (Type 1, Type 3, restricted maximum likelihood, and maximum likelihood), and three test criteria (experiment-wise, comparison-wise, and interaction-wise) in the mixed-model analysis of barley (Hordeum vulgare L.) multiple-environment trial data. The total of quadruples is 2295.
Method                          Experiment-wise   Comparison-wise   Interaction-wise
Type 1
  Fixed                         4                 65                219
  Mixed                         0                 20                97
  Random                        0                 23                105
Type 3
  Fixed                         4                 65                219
  Mixed                         0                 20                97
  Random                        0                 23                105
Restricted maximum likelihood
  Fixed                         4                 65                219
  Mixed                         0                 20                97
  Random                        0                 23                105
Maximum likelihood
  Fixed                         15                107               279
  Mixed                         0                 20                97
  Random                        0                 23                105

First, treating random site and GEI effects as fixed under the fixed-effect model would overestimate the power of detecting COI, as the numbers of significant COI are higher
under the fixed-effect model than under mixed- or random-effect models. The comparison under the fixed-effect model is based on unshrunken differences between the marginal means of a pair of cultivars across all sites and the cell means at a particular site. In contrast, the comparison under the mixed- or random-effect models is based on shrunken differences between the means of a pair of cultivars across all sites or at a particular site. Consequently, the difference between cultivars based on the mixed- or random-effect models is smaller than those based on the fixed effect model, as the random effects are scaled toward their true expectation of zero. The miniscule difference in detecting COI between the mixed- and random-effect models is somewhat expected because the cultivar difference at a random environment under both models involves the shrunken GEI effects with similar but not identical shrinkage factor (the variance components differ slightly under the two models). The shrinkage effect is illustrated in Fig. 1, in which line plots are made of differences for two cultivar pairs, CDC Dolly–Harrington and AC Metcalfe–Harrington, at all 18 sites under the three linear models. In Fig. 1, the dashed lines represent upper and lower bounds (based on the interaction-wise criterion) beyond which significant cultivar differences are indicated. It is hardly surprising that the number of sites showing significant cultivar differences is not equal on positive and negative sides, as yields of CDC Dolly and AC Metcalfe are higher than those of Harrington at most sites. Thus, the total number of significant COI for each cultivar pair is the number of sites with significant cultivar differences on the positive side multiplied by that on the negative side. For example, in the top graph of Fig. 1, the number of sites showing significant positive cultivar differences (CDC Dolly–Harrington, above the upper dashed line) is 7 under the fixed-effect model and 6 under the mixed- and random-effect models, whereas the number of sites showing significant negative cultivar differences (CDC Dolly–Harrington, below the lower dashed line) is 3 under the fixed-effect model and 2 under the mixed- and random-effect models. Thus, the number of significant COI for this cultivar pair is 21 (7 × 3) under the fixed-effect model and 12 (6 × 2) under the mixed- and random-effect models. It is evident from the same argument that there are zero COI for AC Metcalfe–Harrington under all three models.

Figure 1. Differences in yields (Mg ha⁻¹) for two barley (Hordeum vulgare L.) cultivar pairs, CDC Dolly–Harrington and AC Metcalfe–Harrington, evaluated at 18 sites across Alberta under three linear models (fixed-, mixed-, and random-effect models). The dashed lines represent upper and lower bounds (based on interaction-wise criterion) beyond which significant cultivar differences are indicated.

Second, the number of significant COI is larger if the ML method is used to estimate variance components than if any of the other three methods (Type 1, Type 3, and REML) is used under the fixed-effect model (Table 4). The latter three methods give identical numbers of significant COI even though the REML estimates of variance components are slightly different from those by the Type 1 and Type 3 methods (Table 2). The ML estimates of variance components are slightly biased downward, thereby leading to either smaller standard errors of estimable functions or smaller shrinkages of differences in means. Thus, higher frequencies of significant COI result under the ML estimation method for the fixed-effect model.
Third, the percentages of significant COI evaluated by the interaction-wise test criterion ranged from 4% (97) to 12% (279), whereas those evaluated by the experiment-wise test criterion ranged from 0% (0) to 0.7% (15). Obviously, the ranking for sensitivities of the three test criteria is: interaction-wise > comparison-wise > experiment-wise. Under the null hypothesis that there are no significant COI with an error rate of α = 0.05, the test would find 115 significant COI by chance alone. Judging from the observed numbers of significant COI, no COI is present in all cases except for the interaction-wise test criterion under the fixed-effect model.

Crossover Interactions in the Field Pea Data Set
The analysis of field pea data with random genotypes (cultivars or breeding lines) and fixed sites shows that the numbers of significant COI are all >159 (5% of 3168, the total number of quadruples for 33 genotypes and four environments) as
expected by chance alone, regardless of different linear models (fixed, mixed, and random), estimation methods (Type 1, Type 3, REML, and ML), and test criteria (experiment-wise, comparison-wise, and interaction-wise) (Table 5). The extent of significant COI is obviously much higher in this field pea data than in the barley data; however, the patterns of significant COI are similar for both data sets. First, treating random genotypic and GEI effects as fixed under the fixed-effect model would overestimate the power of detecting COI, as the numbers of significant COI are higher under the fixed-effect model than under mixed- or random-effect models. Second, there are higher frequencies of significant COI under the ML estimation method than the other three estimation methods for the fixed-effect model. Third, the ranking for sensitivities of the three test criteria is: interaction-wise > comparison-wise > experiment-wise. As in the barley data set, the same quadruples that display significant COI occur throughout different estimation methods, models, and test criteria.

Table 5. Number of significant crossover interactions with three different linear models (fixed, mixed, and random), four estimation methods (Type 1, Type 3, restricted maximum likelihood, and maximum likelihood) and three test criteria (experiment-wise, comparison-wise, and interaction-wise) in the mixed-model analysis of field pea (Pisum sativum L.) multiple-environment data. The total number of quadruples is 3168.
Model                           Experiment-wise   Comparison-wise   Interaction-wise
Type 1
  Fixed                         524               613               657
  Mixed                         205               292               365
  Random                        205               292               352
Type 3
  Fixed                         524               613               657
  Mixed                         205               292               365
  Random                        205               292               352
Restricted maximum likelihood
  Fixed                         524               613               657
  Mixed                         205               292               365
  Random                        205               292               352
Maximum likelihood
  Fixed                         545               615               758
  Mixed                         213               292               365
  Random                        205               292               352

High levels of significant COI in the field pea data are somewhat expected given marked and inconsistent differences in yield among 33 genotypes across the four selected sites. The range of yields across 33 genotypes is 2.38 to 6.41 Mg ha⁻¹ for Brooks, 1.52 to 5.64 Mg ha⁻¹ for High Prairie, 2.69 to 6.63 Mg ha⁻¹ for Namao, and 1.21 to 4.56 Mg ha⁻¹ for Vegreville. The presence of significant COI across the four sites is also indicated by imperfect correlations between pairs of sites. The estimated correlations between genotypic performances at all six pairs of sites vary from 0.066 for Brooks–High Prairie to 0.763 for High Prairie–Namao, but all these correlations are significantly less than unity according to Fisher’s z transformation (SAS Institute, 2004). In this field pea example, the assumption of fixed site and random genotypic effects appears to be particularly reasonable. For nontraditional crops such as field pea in western Canada, one of the major objectives of cultivar testing programs is to test a large number of introduced cultivars or advanced breeding lines and to identify which sites are suitable for optimal crop production. In addition, the four selected sites are more regularly used for cultivar and line testing across years than the other sites that are chosen yearly on an ad hoc basis. The four sites differ markedly in their geography and performance (Yang et al., 2005). While the across-site combined analysis was performed to illustrate the flexibility of the new test for COI, it needs to be pointed out that in reality, if environments are considered fixed, then results for cultivar performances would typically be reported separately for each environment.

DISCUSSION
The proposed test for COI based on the mixed-model theory is quite general as it is applicable to all the models regardless of whether genotypic, environmental, and GEI effects are fixed or random. If both genotypic and environmental effects are fixed, the new test reduces to the conventional Azzalini–Cox test for COI (Baker, 1988; Cornelius et al., 1992). In the past, the Azzalini–Cox test has often been used for assessing COI but its use is based on the limiting assumption that all effects are fixed. In reality, either genotypes or environments or both should be random, as argued in Baker (1996) and Smith et al. (2005). The mixture of fixed and random effects is now accommodated in the new mixed-model test, which uses linear combinations of BLUPs of random effects and BLUEs of fixed effects for detecting COI. It is also the responsibility of the users of the new test, however, to ensure that the choice of model should be made a priori and be based on the inference space and sampling of cultivars or environments for a MET. In both examples analyzed in this study, I used a selected set of cultivars (barley data set) and environments (field pea data set) for clearer illustration of the mixed-model test for COI. Nevertheless, the new test also works with the whole set of cultivars or environments. In fact, similar patterns and extent of COI are observed when analyzing all 41 cultivars and breeding lines (barley data) and all sites (field pea data). For example, the mixed-model analysis of the complete barley data set involves the evaluation of 125 460 [41(41 – 1)18(18 – 1)/4] possible quadruples of fixed cultivars and random sites; the numbers of significant COI are 7 (experiment-wise), 770 (comparison-wise), and 3738 (interaction-wise) if the estimation methods of REML, Type 1, or Type 3 are used. All these numbers are obviously <5% of the total number of quadruples (125 460). Thus, the new test is applicable to any
number of genotypes and environments if the nature of each effect (fixed or random) is clearly identified.
It is clear from Table 4 that the choice of which model is used to test for COI would lead to a different conclusion about the nature of cultivar × site interactions. Looking at the test results based on the interaction-wise error rate, the fixed-effect model analysis would identify significant COI in this barley data set, whereas the mixed- and random-effect model analyses would not. An important consequence is that false claims may be made if either genotypes or environments (and thus GEI) should be random but are treated as fixed. In the barley example, if both cultivars and sites are considered to be fixed effects, then the claim under the fixed-effect model is appropriate. While the 18 sites for the barley trials are known to belong to different soil zones or eco-regions in Alberta with “fixed” agro-climatic characteristics, however, the effects of soil zones or eco-regions alone account for only a small portion of site-to-site variability (Yang et al., 2006). If the site effects are random, as argued above, there is no evidence of significant COI in this barley data set (the observed 97 significant COI based on the most sensitive interaction-wise test is still <115 [5%] expected by chance alone). A similar conclusion was implied in an earlier investigation on the same six barley cultivars (Helm et al., 2004). In addition, in crop improvement, the adaptability of cultivars needs to be effectively assessed by applying inference beyond the observed sites to the entire population of environments. Thus, site and cultivar × site interaction effects are considered random for most breeding applications (e.g., Baker, 1996; Balzarini, 2002).
It is crucial to correctly determine whether or not the observed significant GEI involves rank changes (COI) because COI are the only type of interactions that impact selection programs. Even in this case, COI may still be irrelevant to a breeder if their occurrence coincides with the situation where one cultivar consistently outperforms other cultivars in the majority of the environments in a MET. In general, the presence of COI would suggest that much of the improvement made in one or one set of environments will not be carried over when the selected genotypes are grown in other environments. In this case, one must select one genotype for one set of environments and a different genotype for other environments. On the other hand, if significant GEI merely reflects differences in scale (i.e., absence of COI), then there is no need to consider any aspect of GEI (Baker, 1996). In this case, the breeding efforts or production system can be greatly simplified because a genotype that is the best in one environment will be the best in all environments. Thus, in the absence of COI, a single genotype would optimize production in all environments.
Despite a growing use of the mixed-model analysis for studying GEI (e.g., Piepho, 1998; Yang, 2002; Crossa et al., 2004, 2006), much of the GEI literature remains focused on the ANOVA-based analysis. One of the major drawbacks of the ANOVA-based analysis is the dichotomy in its applications: statistical inference is based either on estimable functions of fixed effects under the assumed fixed-effect model or on the magnitudes and structures of variances of the different types of GEI under the random-effect model. I propose to study GEI by estimating and testing for appropriate predictable functions (linear combinations of both fixed [genotype] and random [GEI] effects). While estimation of fixed effects and prediction of random effects have been a major focus of the mixed-model theory (Henderson, 1984), the hypothesis testing concerning a mixture of these two types of effects is largely neglected (Kennedy, 1991). Given that the most realistic models for studying GEI are those based on a mixture of fixed and random effects, further research should be directed to estimation, prediction and testing for predictable functions that can characterize the nature and patterns of GEI.
This study can be extended to other areas related to characterization of GEI when the mixed- or random-effect model is considered. For example, in the past, the criteria developed for clustering genotypes or environments into subsets with negligible COI are based on the fixed GEI effect (e.g., Crossa et al., 2004; Navabi et al., 2006). It would be of interest to investigate the effectiveness of the clustering procedures based on the random GEI effect. In addition, this study used simple covariance structure models for error and GEI effects to permit direct comparison with the ANOVA-based analysis. In the future, another area of investigation may be to exploit more complicated error and GEI covariance structures from the mixed-model perspective (e.g., Piepho, 1998; Yang, 2002; Qiao et al., 2004; Casanoves et al., 2005; Crossa et al., 2004, 2006) or from the Bayesian perspective (Cotes et al., 2006; Edwards and Jannink, 2006) to assess their sensitivity to detection of significant COI. Finally, the commonly used additive main effects and multiplicative interaction (AMMI) model (Gauch, 2006) and the genotype main effects and genotype × environment interaction effects (GGE) model (Yan and Tinker, 2006) are fixed-effect models. When some or all effects are random, the present mixed-model analysis may conceivably be extended to provide BLUPs of genotype and environment eigenvectors for biplot characterization of GEI.

APPENDIX
Listed below is the syntax for the PROC MIXED analysis of the barley MET, where cultivar effects (&G_EFF with an ampersand sign [&] signaling a SAS macro variable) are fixed, whereas site effects (&E_EFF), blocks-within-sites effects [&B_EFF(&E_EFF)] and cultivar × site interaction effects (&E_EFF*&G_EFF) are random. This is the core part of the SAS program mixed_COI.sas described above.

PROC MIXED METHOD = &METH MMEQSOL COVTEST;
  CLASS &E_EFF &B_EFF &G_EFF;
  /* fixed cultivar effects; Kenward-Roger denominator degrees of freedom */
  MODEL &DVAR = &G_EFF / DDFM = KR SOLUTION;
  /* random sites, blocks within sites, and cultivar x site interaction */
  RANDOM &E_EFF &B_EFF(&E_EFF) &E_EFF*&G_EFF / SOLUTION;
  /* predictable function: fixed coefficients before the bar, random after */
  ESTIMATE 'CULT1-CULT2 AT SITE 1' &G_EFF 1 -1 | &E_EFF*&G_EFF 1 -1;
  /* (17 more ESTIMATE statements) ... */
  ODS OUTPUT MMEQSOL = MMESOL(DROP = ROW) ESTIMATES = EVAL;
RUN;

In the PROC MIXED statement, the option of METHOD = &METH specifies which of the four estimation methods (Type 1, Type 3, REML, and ML) is used, where &METH can be “type1,” “type3,” “reml,” or “ml” (two other methods, Type 2 and MIVQUE0 [minimum variance quadratic unbiased estimation] are not included). The option of MMEQSOL requests SAS to output a solution to the mixed-model equations along with the inverted coefficients matrix (cf. Eq. [3]). The option of COVTEST asks SAS to generate asymptotic standard errors and Wald Z tests for the variance parameter estimates.
The MODEL statement includes one dependent variable (&DVAR) and one fixed effect (&G_EFF). The option of DDFM = KR identifies the method of Kenward and Roger (1997) for computing the denominator degrees of freedom for the tests of fixed effects specified in the MODEL statement and for tests of predictable functions specified in the ESTIMATE statements (below); this option involves calculating an inflation factor for the estimated variance–covariance matrix of the fixed and random effects (Prasad and Rao, 1990; Harville and Jeske, 1992) and then computing Satterthwaite-type degrees of freedom on the inflated variance–covariance matrix. The option of SOLUTION is added to request that SAS output estimates and tests of fixed effects for checking purposes.
The RANDOM statement lists all three random effects [&E_EFF &B_EFF(&E_EFF) &E_EFF*&G_EFF]. The option of SOLUTION is added to request that SAS output estimates and tests of random effects for checking purposes.
A selected set of ESTIMATE statements is specified to provide desired outputs for evaluating significant COI and to check the calculations of predictable functions and their standard errors using the outputs from the MMEQSOL option. For example, a series of 18 ESTIMATE statements can be given to estimate and test for the differences between AC Metcalfe and CDC Dolly for each of the 18 test sites. In general, a SAS MACRO is available in mixed_COI.sas to allow for automatically creating these ESTIMATE statements.
Finally, the ODS statement is used to request SAS to provide the outputs from the MMEQSOL option and ESTIMATE statements. A SAS/IML subroutine is available in mixed_COI.sas to perform tests for COI based on these outputs.

Acknowledgments
I thank Drs. James Holland, Jose Crossa, and two anonymous reviewers for valuable comments. This research was supported in part by Alberta Agriculture, Food, and Rural Development’s Industry Development Sector New Initiative Fund and a Natural Sciences and Engineering Research Council of Canada grant.

References
Azzalini, A., and D.R. Cox. 1984. Two new tests associated with analysis of variance. J. R. Stat. Soc. B 46:335–343.
Baker, R.J. 1988. Tests for crossover genotype–environmental interactions. Can. J. Plant Sci. 68:405–410.
Baker, R.J. 1996. Recent research on genotype–environmental interaction. p. 235–239. In Proc. Int. Oat Conf., 5th, and Int. Barley Genet. Symp., 7th, Saskatoon. Vol. 1. Univ. Ext. Press, Univ. of Saskatchewan, Saskatoon.
Balzarini, M. 2002. Applications of mixed models in plant breeding. p. 353–363. In M.S. Kang (ed.) Quantitative genetics, genomics, and plant breeding. CAB Int., Wallingford, UK.
Casanoves, F., R. Macchiavelli, and M. Balzarini. 2005. Error variation in multienvironment peanut trials: Within-trial spatial correlation and between-trial heterogeneity. Crop Sci. 45:1927–1933.
Cornelius, P.L., and J. Crossa. 1999. Prediction assessment of shrinkage estimators of multiplicative models for multi-environment cultivar trials. Crop Sci. 39:998–1009.
Cornelius, P.L., M. Seyedsadr, and J. Crossa. 1992. Using the shifted multiplicative model to search for “separability” in crop cultivar trials. Theor. Appl. Genet. 84:161–172.
Cotes, J.M., J. Crossa, A. Sanches, and P.L. Cornelius. 2006. A Bayesian approach for assessing the stability of genotypes. Crop Sci. 46:2654–2665.
Crossa, J., J. Burgueno, P.L. Cornelius, G. McLaren, R. Trethowan, and A. Krishnamachari. 2006. Modeling genotype × environment interaction using additive genetic covariances of relatives for predicting breeding values of wheat genotypes. Crop Sci. 46:1722–1733.
Crossa, J., R.-C. Yang, and P.L. Cornelius. 2004. Studying crossover genotype × environment interaction using linear–bilinear models and mixed models. J. Agric. Biol. Environ. Stat. 9:362–380.
Edwards, J.W., and J.-L. Jannink. 2006. Bayesian modeling of heterogeneous error and genotype × environment interaction variances. Crop Sci. 46:820–833.
Field Crop Development Centre. 2006. Cereal Research Report. Available at www1.agric.gov.ab.ca/$department/deptdocs.nsf/all/fcd5464/$FILE/006_ResearchReport_Web.pdf (verified 11 Mar. 2007). Field Crop Development Centre, Lacombe, AB.
Gauch, H.G. 2006. Statistical analysis of yield trials by AMMI and GGE. Crop Sci. 46:1488–1500.
Gregorius, H.-R., and G. Namkoong. 1986. Joint analysis of genotypic and environmental effects. Theor. Appl. Genet. 72:413–422.
Haldane, J.B.S. 1947. The interaction of nature and nurture. Ann. Eugen. 13:197–205.
Harville, D.A., and D.R. Jeske. 1992. Mean squared error of estimation or prediction under a general linear model. J. Am. Stat. Assoc. 87:724–731.
Helm, J., P. Juskiw, and T. Duggan. 2004. Presentation: A new look at location yield data. Available at www1.agric.gov.ab.ca/$department/deptdocs.nsf/all/fcd5590 (verified 9 Mar. 2007). Alberta Agric. and Food, Edmonton.
Henderson, C.R. 1984. Applications of linear models in animal breeding. Univ. of Guelph, Guelph, ON.
Kennedy, B.W. 1991. C.R. Henderson: The unfinished legacy. J. Dairy Sci. 74:4067–4081.
Kenward, M.G., and J.H. Roger. 1997. Small sample inference for fixed effects from restricted maximum likelihood. Biometrics 53:983–997.
Littell, R.C., G.A. Milliken, W.W. Stroup, R.D. Wolfinger, and O. Schabenberger. 2006. SAS for mixed models. 2nd ed. SAS Inst., Cary, NC.
McLean, R.A., and W.L. Sanders. 1988. Approximating degrees of freedom for standard errors in mixed linear models. p. 50–59. In Proc. Stat. Comput. Sect., New Orleans, LA. Am. Stat. Assoc. Alexandria, VA.
McLean, R.A., W.L. Sanders, and W.W. Stroup. 1991. A unified approach to mixed linear models. Am. Stat. 45:54–64.
Navabi, A., R.-C. Yang, J. Helm, and D.M. Spaner. 2006. Can spring wheat-growing megaenvironments in the northern Great Plains be dissected for representative locations or niche-adapted genotypes? Crop Sci. 46:1107–1116.
Piepho, H.P. 1998. Methods for comparing the yield stability of cropping systems: A review. J. Agron. Crop Sci. 180:193–213.
Prasad, N.G.N., and J.N.K. Rao. 1990. The estimation of mean squared error of small-area estimators. J. Am. Stat. Assoc. 85:163–171.
Qiao, C.G., K.E. Basford, I.H. DeLacy, and M. Cooper. 2004. Advantage of single-trial models for response to selection in wheat breeding multi-environment trials. Theor. Appl. Genet. 108:1256–1264.
Robinson, G.K. 1991. That BLUP is a good thing: The estimation of random effects. Stat. Sci. 6:15–51.
SAS Institute. 2004. SAS OnlineDoc 9.1.3. Available at support.sas.com/onlinedoc/913/docMainpage.jsp (verified 9 Mar. 2007). SAS Inst., Cary, NC.
Schaeffer, L.R. 2006. ANSC 637. Set 7: Prediction theory. Available at www.aps.uoguelph.ca/~lrs/ANSC637/LRS07/LRS07.pdf (verified 9 Mar. 2007). Univ. of Guelph, Guelph, ON.
Searle, S.R., G. Casella, and C.E. McCulloch. 1992. Variance components. John Wiley & Sons, New York.
Self, S.G., and K.Y. Liang. 1987. Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. J. Am. Stat. Assoc. 82:605–610.
Smith, A.B., B.R. Cullis, and R. Thompson. 2005. The analysis of crop cultivar breeding and evaluation trials: An overview of current mixed model approaches. J. Agric. Sci. 143:449–462.
Stroup, W.W. 1989. Predictable functions and prediction space in the mixed model procedure. p. 39–48. In Applications of mixed models in agriculture and related disciplines. South. Coop. Ser. Bull. 343. Louisiana Agric. Exp. Stn., Baton Rouge.
Yan, W., and N.A. Tinker. 2006. Biplot analysis of multi-environment trial data: Principles and applications. Can. J. Plant Sci. 86:623–645.
Yang, R.-C. 2002. Likelihood-based analysis of genotype–environment interactions. Crop Sci. 42:1434–1440.
Yang, R.-C., and R.J. Baker. 1991. Genotype–environment interactions in two wheat crosses. Crop Sci. 31:83–87.
Yang, R.-C., S.F. Blade, J. Crossa, D. Stanton, and M.S. Bandara. 2005. Identifying isoyield environments for field pea production. Crop Sci. 45:106–113.
Yang, R.-C., D. Stanton, S.F. Blade, J. Helm, D. Spaner, S. Wright, and D. Domitruk. 2006. Isoyield analysis of barley cultivar trials in the Canadian prairies. J. Agron. Crop Sci. 192:284–294.
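
Supplementary numerical check. The boundary-corrected likelihood ratio test reported in the Results for the barley data (REML, mixed-effect model) can be checked with a few lines of arithmetic. The sketch below is in Python rather than the SAS used for the analysis, and its function names are illustrative, not taken from mixed_COI.sas. It relies on the identity that the upper-tail probability of a chi-square variate with one degree of freedom at x equals erfc(sqrt(x/2)), and it halves that probability to reflect the 50:50 mixture of χ₀² and χ₁² distributions described in the text.

```python
from math import erfc, sqrt


def chi2_sf_1df(x):
    """Upper-tail probability P(X > x) for a chi-square variate with 1 df."""
    return erfc(sqrt(x / 2.0))


def boundary_lr_test(neg2ll_reduced, neg2ll_full):
    """LR test of a single variance component against zero (boundary value).

    Returns the LR statistic, the naive chi-square(1) probability, and the
    probability under a 50:50 mixture of chi-square(0) and chi-square(1).
    """
    lr = neg2ll_reduced - neg2ll_full
    if lr <= 0:
        # The mixture puts all remaining mass at or above zero.
        return lr, 1.0, 1.0
    p_naive = chi2_sf_1df(lr)
    p_mixture = 0.5 * p_naive
    return lr, p_naive, p_mixture


# -2(Res)log likelihoods quoted in the text for the reduced and full models.
lr, p_naive, p_mixture = boundary_lr_test(737.4, 691.8)
print(f"LR = {lr:.1f}, naive p = {p_naive:.3g}, mixture p = {p_mixture:.3g}")
```

With the two −2(Res)log likelihoods quoted in the text, this gives LR = 45.6, a naive probability of about 1.45 × 10⁻¹¹, and a halved probability of about 7.25 × 10⁻¹², in agreement with Table 3.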
