Uses and Abuses of Coefficient Alpha - Schmitt (1996)
Neal Schmitt
Michigan State University
The article addresses some concerns about how coefficient alpha is reported and used. It also shows
that alpha is not a measure of homogeneity or unidimensionality. This fact and the finding that test
length is related to reliability may cause significant misinterpretations of measures when alpha is
used as evidence that a measure is unidimensional. For multidimensional measures, use of alpha as
the basis for corrections for attenuation causes overestimates of true correlation. Satisfactory levels
of alpha depend on test use and interpretation. Even relatively low (e.g., .50) levels of criterion
This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.
reliability do not seriously attenuate validity coefficients. When reporting intercorrelations among
This document is copyrighted by the American Psychological Association or one of its allied publishers.
An earlier version of this article was presented at the 103rd Annual Convention of the American Psychological Association, New York, August 1995.
Correspondence concerning this article should be addressed to Neal Schmitt, Department of Psychology, Michigan State University, East Lansing, Michigan 48824.

Presentation of coefficient alpha (hereinafter alpha; Cronbach, 1951) as an index of the internal consistency or reliability of psychological measures has become routine practice in virtually all psychological and social science research in which multiple-item measures of a construct are used. In this article I describe four ways in which researchers' use of alpha to convey information about the operationalization of a construct or constructs can represent a lack of understanding or can convey less information than is actually required to evaluate the degree to which measurement problems are or are not a concern in the interpretation of the research results. In each instance, I will also indicate which additional or supplementary information is necessary to evaluate the measurements used in the research.

Alpha Is Not a Measure of Unidimensionality

One important confusion in the literature involves the use of homogeneity and internal consistency as though they were synonymous. Internal consistency refers to the interrelatedness of a set of items, whereas homogeneity refers to the unidimensionality of the set of items. Internal consistency is certainly necessary for homogeneity, but it is not sufficient. The most recent explication and discussion of this distinction is that of Cortina (1993). Hattie (1985) made a similar distinction in a comprehensive review of alternative ways in which researchers have indexed unidimensionality. Cronbach (1951) viewed reliability, including internal consistency measures, as the proportion of test variance that was attributable to group and general factors. Specific item variance, or uniqueness, was considered error. Clearly, Cronbach, Cortina, and Hattie would not treat alpha as a measure of unidimensionality. In fact, Cronbach stated that alpha is an underestimate of reliability (as he defined it) unless the interitem correlation matrix is of unit rank (i.e., unidimensional). Cronbach's early statements (1947, 1951) about reliability suggest that the reliability of a multidimensional measure can only be estimated by correlating scores on parallel forms of a test that each represent the same factor structure.

It is also the case that alpha increases as a function of test length. The widely used Spearman-Brown correction formula expresses the relationship between test length and reliability. Lord and Novick (1968), among others, have provided a discussion of the Spearman-Brown along with tabular illustrations of the relationship between test length and reliability. In fact, the Kuder-Richardson derivations of various reliability formulas that are specific forms of alpha involve the use of the Spearman-Brown correction of a single item's reliability. The single item reliability expressed as the average intercorrelation among items in a measure is extended to express the full-length test reliability in the Kuder-Richardson derivation. The Spearman-Brown formula in this instance would be equal to [N times the single item reliability]/[1 + (N - 1) times the single item reliability].

Given that alpha is a function of the interrelatedness of the items in a test and the test length rather than the homogeneity of the interitem correlations or their unidimensionality (as is often assumed), what are the measurement implications? Consider the two interitem matrices depicted in Table 1. In the case of both of these six-item matrices, coefficient alpha (actually standardized alpha) is .86, but it is clear that the interrelationships among the first set of items indicate that the responses to the items are a function of two factors. Removal of a single general factor from the second set of items would yield zero off-diagonal correlations, indicating no item-specific and no group factors were responsible for item responses; hence this second six-item measure is unidimensional. This would not be true of the first set of items; the intercorrelations of these items indicate the presence of two factors.

Consider the second example in Table 2. In this case, a 6-item measure and a 10-item measure have the same alpha, but the shorter measure clearly is a function of two factors. The 10-item measure in this example is a function of a single factor. Both of these comparisons clearly indicate that alpha is not a good
indicator of the unidimensionality of a set of items. In summary, if alpha is used as "proof" that a set of items have an unambiguous or unidimensional interpretation, the conclusions drawn may or may not be correct.

There are several alternatives to this use of alpha (Hattie, 1985, actually discussed 30). Cortina (1993) suggested that in addition to reporting alpha, researchers also report what Cortina called the precision of alpha or what he called the standard error of alpha. This statistic reflects the spread of interitem correlations. This index will yield a value of 0 when all interitem correlations are zero and relatively high values when the spread of interitem correlations is great. A large spread in interitem correlations indicates either some form of multidimensionality or a great deal of sampling error in the estimation of the interitem correlations. Cortina's index is not the standard error of alpha; the absence of sample size in his formula means sampling error does not necessarily influence this index. Given certain distributional assumptions, Feldt (1980) and Feldt, Woodruff, and Salih (1987) presented a formula for the computation of the standard error of alpha.

If concerned with sampling error, researchers should use the Feldt (1980) index when they want to assess the accuracy of their estimate of alpha. By contrast, when assessing the degree to which a measure is actually unidimensional, an increasingly popular approach in determining the extent of unidimensionality is to test whether the interitem correlation matrix fits a single-factor model (Joreskog & Sorbom, 1979). For example, the second examples in both Tables 2 and 3 are perfectly fit by a single-factor model. In Table 2, a single factor model fit the first matrix of interitem correlations poorly as indexed by a significant chi-square, but more important, by uniformly poor fit statistics as computed in LISREL8 (Joreskog & Sorbom, 1993). In this instance, a two-factor model fit the data perfectly. The same was true for the first example in Table 3, but in this instance, the fit of a single-factor model was not as bad: normed nonfit index (NNFI) = .54; adjusted goodness-of-fit index (AGFI) = .45; and root mean square residual (RMSR) = .13. For readers interested in the assessment of unidimensionality, the relationship between classical test theory perspectives and structural equation modeling of measurement models has been very effectively and clearly illustrated and explained by Miller (1995).

In examining the types of matrices computed from actual assessee responses, there are rarely instances in which the fac-

A second problem in the use of alpha arises from researchers' common presumption that a particular level of alpha (usually .70) is desired or adequate. Having obtained that level, they then proceed to use the measure without further consideration of its dimensionality or construct validity. This use of the statistic clearly represents a lack of appreciation of the meaning of alpha as discussed earlier and of the relationship between alpha and test length. There are two reasons why the use of any cutoff value (including .70) is shortsighted.

First, alpha is often used to make corrections for unreliability between two measures in an attempt to ascertain the relationship between the latent or true variables underlying the measures. This correction involves dividing the observed correlation between the two variables by the product of the square root of their reliabilities (Lord & Novick, 1968). Classic reliability theory also holds that the upper limit of validity (the relationship between a predictor and criterion) is the square root of the reliability of the criterion or outcome variables rather than 1.00, which is the upper limit of a Pearson correlation. The concern then is that the true correlations involving a predictor and an unreliable outcome variable will be seriously attenuated (i.e., underestimated) because of inadequate criterion reliability rather than any lack of real or true relationship. In considering the implications of these findings for expected validity, it can be seen that with reliability equal to .70, validity has an upper limit of .84 (i.e., the square root of .70) as opposed to 1.00. Even with reliability as low as .49, the upper limit of validity is .70. When

Table 2
Tests of Different Length and Dimensionality With Equal Alpha

Variable    1    2    3    4    5    6
1.          —
2.         .6    —
3.         .6   .6    —
4.         .3   .3   .3    —
5.         .3   .3   .3   .6    —
6.         .3   .3   .3   .6   .6    —
(α = .81)

Variable    1    2    3    4    5    6    7    8    9   10
1.          —
2.         .3    —
3.         .3   .3    —
4.         .3   .3   .3    —
5.         .3   .3   .3   .3    —
6.         .3   .3   .3   .3   .3    —
7.         .3   .3   .3   .3   .3   .3    —
8.         .3   .3   .3   .3   .3   .3   .3    —
9.         .3   .3   .3   .3   .3   .3   .3   .3    —
10.        .3   .3   .3   .3   .3   .3   .3   .3   .3    —
(α = .81)
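The equal alphas reported for the two matrices in Table 2 can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming NumPy is available and assuming that standardized alpha is computed as the Spearman-Brown step-up of the mean interitem correlation, as described in the article. The function names are hypothetical, and `interitem_spread` is only a plain standard deviation of the off-diagonal correlations, not Cortina's (1993) precision index.

```python
import numpy as np


def standardized_alpha(corr):
    """Standardized alpha: Spearman-Brown step-up of the mean interitem correlation."""
    corr = np.asarray(corr, dtype=float)
    k = corr.shape[0]                      # number of items
    r = corr[np.triu_indices(k, k=1)]      # each off-diagonal correlation once
    r_bar = r.mean()
    return k * r_bar / (1.0 + (k - 1) * r_bar)


def interitem_spread(corr):
    """Standard deviation of the off-diagonal correlations (illustrative only)."""
    corr = np.asarray(corr, dtype=float)
    k = corr.shape[0]
    return corr[np.triu_indices(k, k=1)].std()


# Six-item measure from Table 2: two clusters of three items
# (.6 within a cluster, .3 between clusters).
six_item = np.full((6, 6), 0.3)
six_item[:3, :3] = 0.6
six_item[3:, 3:] = 0.6
np.fill_diagonal(six_item, 1.0)

# Ten-item measure from Table 2: every interitem correlation is .3.
ten_item = np.full((10, 10), 0.3)
np.fill_diagonal(ten_item, 1.0)

alpha_six = standardized_alpha(six_item)
alpha_ten = standardized_alpha(ten_item)
spread_six = interitem_spread(six_item)
spread_ten = interitem_spread(ten_item)

print(f"six items: alpha = {alpha_six:.2f}, spread = {spread_six:.2f}")
print(f"ten items: alpha = {alpha_ten:.2f}, spread = {spread_ten:.2f}")
```

Both alphas round to .81, yet only the six-item matrix shows any spread in its interitem correlations, which is the kind of supplementary information the article recommends examining alongside alpha.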
8. .8 .8 .8 .3 .3 3 .8 —
derives measures of several constructs from a single paper-and-pencil measure or interview instrument and reports that the alphas of all measures were relatively high (e.g., above .85). The researcher then proceeds to make interpretations based on the profile of respondents' scores on these dimensions without presenting the intercorrelations among the scales. Or, these measures may be used in some multivariate analysis and the researcher then reports surprise at finding that multicollinearity renders any interpretation regarding the relative efficacy of the variables ambiguous. The minimum information that should be provided in these instances includes the alpha coefficients, the observed correlations, and the correlations corrected for attenuation due to unreliability. This can all be done efficiently with no additional use of space (for those who have been pressured by editors to use space sparingly). An example is presented in Table 4.

The example in Table 4 can also be used to demonstrate why both corrected and uncorrected coefficients (or the information allowing their calculation) should be presented. First, without the intercorrelations of these variables, the reader does not have the information to evaluate whether the levels of reported alpha are good or bad. Second, the correlation between any two variables might suggest that they are so highly correlated that any differentiation between these two measures is not practically or theoretically useful. In Table 4, the observed correlation between Variables 1 and 2 indicates they are less discriminable than are measures of the other constructs. However, when both the intercorrelations and the reliabilities of the measures are taken into account (or the corrected correlations are examined), it is clear that these conclusions about Variables 1 and 2 are incorrect. They are no more or less discriminable than Variables 2, 3, or 4. One might also conclude by examining observed correlations that Variable 5 shares little in common with the other four variables, but the corrected coefficients clearly contradict this view.

Other examples could be constructed to show other combinations of reliability and intercorrelations that would be very differently interpreted when relying on observed correlations rather than corrected correlations. Clearly, both intercorrelations and alpha must be reported if the reader is to be adequately informed about the obtained results. Of course, it is also incumbent on the researcher to consider both sources of information when drawing conclusions about the adequacy of measures.

Summary and Conclusions

Four caveats are implied by this article regarding the proper use of the alpha coefficient.
1. Alpha is not an appropriate index of unidimensionality to assess homogeneity.
2. In correcting for attenuation due to unreliability, use of alpha as an estimate of reliability is based on the notion that the measures involved are unidimensional. When this is not the case, the corrected coefficients will be overcorrected.
3. There is no sacred level of acceptable or unacceptable level of alpha. In some cases, measures with (by conventional standards) low levels of alpha may still be quite useful.
4. Presenting only alpha when discussing the relationships of multiple measures is not sufficient. Intercorrelations and corrected intercorrelations must be presented as well.

References

Cortina, J. M. (1993). What is coefficient alpha? An examination of theory and applications. Journal of Applied Psychology, 78, 98-104.
Cronbach, L. J. (1947). Test "reliability": Its meaning and determination. Psychometrika, 12, 1-16.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297-334.
Feldt, L. S. (1980). A test of the hypothesis that Cronbach's alpha coefficient is the same for two tests administered to the same sample. Psychometrika, 45, 99-105.
Feldt, L. S., Woodruff, D. J., & Salih, F. A. (1987). Statistical inference for coefficient alpha. Applied Psychological Measurement, 11, 93-103.
Hattie, J. (1985). Methodology review: Assessing unidimensionality of tests and items. Applied Psychological Measurement, 9, 139-164.
Joreskog, K. G., & Sorbom, D. (1979). Advances in factor analysis and structural equation models. Cambridge, MA: Abt Books.
Joreskog, K. G., & Sorbom, D. (1993). New features in LISREL8. Chicago, Illinois: Scientific Software International.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
Miller, M. B. (1995). Coefficient alpha: A basic introduction from the perspectives of classical test theory and structural equation modelling. Structural Equation Modelling, 2, 255-273.
Schmidt, F. L., & Hunter, J. E. (1996). Measurement error in psychological research: Lessons from 26 research scenarios. Psychological Methods, 1, 199-223.

Received May 31, 1996
Revision received July 24, 1996
Accepted July 24, 1996

Table 4
Observed Correlations, Alpha Coefficients, and Corrected Correlations Among Measures of Several Constructs

Variable      1       2       3       4       5
1.          (.81)    .80     .71     .83     .59
2.           .70    (.95)    .62     .81     .60
3.           .51     .47    (.64)    .86     .65
4.           .52     .55     .45    (.49)    .67
5.           .32     .35     .31     .28    (.36)

Note. Alpha coefficients are presented on the diagonal, observed correlations below the diagonal, and correlations corrected for attenuation above the diagonal.
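The correction for attenuation behind the upper triangle of Table 4 divides each observed correlation by the square root of the product of the two alphas. The sketch below is an illustration under that assumption only; the function and variable names are hypothetical, and the data are the diagonal and lower triangle of Table 4 as printed. It reproduces the first row of corrected values (.80, .71, .83, .59). Not every printed cell is reproduced exactly by this arithmetic, so it is best read as an illustration of the formula rather than an audit of the table. The same square-root relationship gives the upper limits of validity quoted in the article (.84 for a criterion reliability of .70, and .70 for .49).

```python
import math

# Alpha coefficients (diagonal of Table 4), keyed by variable number.
alpha = {1: 0.81, 2: 0.95, 3: 0.64, 4: 0.49, 5: 0.36}

# Observed correlations (below the diagonal of Table 4), keyed by (row, column).
observed = {
    (2, 1): 0.70,
    (3, 1): 0.51, (3, 2): 0.47,
    (4, 1): 0.52, (4, 2): 0.55, (4, 3): 0.45,
    (5, 1): 0.32, (5, 2): 0.35, (5, 3): 0.31, (5, 4): 0.28,
}


def correct_for_attenuation(r_xy, rel_x, rel_y):
    """Observed correlation divided by the square root of the product of reliabilities."""
    return r_xy / math.sqrt(rel_x * rel_y)


def validity_ceiling(criterion_reliability):
    """Upper limit of validity implied by the reliability of the criterion."""
    return math.sqrt(criterion_reliability)


# Corrected correlation for every pair of variables in the table.
corrected = {
    (i, j): correct_for_attenuation(r, alpha[i], alpha[j])
    for (i, j), r in observed.items()
}

# Corrected correlations of Variable 1 with Variables 2 through 5.
first_row = [round(corrected[(j, 1)], 2) for j in (2, 3, 4, 5)]
print(first_row)
print(round(validity_ceiling(0.70), 2), round(validity_ceiling(0.49), 2))
```

Keeping the reliabilities and the observed correlations in separate structures mirrors the article's point that a reader needs both pieces of information to recompute or question the corrected values.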