Uses and Abuses of Coefficient Alpha - Schmitt (1996)

Psychological Assessment. Copyright 1996 by the American Psychological Association

1996, Vol. 8, No. 4, 350-353

Uses and Abuses of Coefficient Alpha

Neal Schmitt
Michigan State University

The article addresses some concerns about how coefficient alpha is reported and used. It also shows that alpha is not a measure of homogeneity or unidimensionality. This fact and the finding that test length is related to reliability may cause significant misinterpretations of measures when alpha is used as evidence that a measure is unidimensional. For multidimensional measures, use of alpha as the basis for corrections for attenuation causes overestimates of true correlation. Satisfactory levels of alpha depend on test use and interpretation. Even relatively low (e.g., .50) levels of criterion reliability do not seriously attenuate validity coefficients. When reporting intercorrelations among measures that should be discriminable, it is important to present observed correlations, appropriate measures of reliability, and correlations corrected for unreliability.

This document is copyrighted by the American Psychological Association or one of its allied publishers.
This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

An earlier version of this article was presented at the 103rd Annual Convention of the American Psychological Association, New York, August 1995.
Correspondence concerning this article should be addressed to Neal Schmitt, Department of Psychology, Michigan State University, East Lansing, Michigan 48824.

Presentation of coefficient alpha (hereinafter alpha; Cronbach, 1951) as an index of the internal consistency or reliability of psychological measures has become routine practice in virtually all psychological and social science research in which multiple-item measures of a construct are used. In this article I describe four ways in which researchers' use of alpha to convey information about the operationalization of a construct or constructs can represent a lack of understanding or can convey less information than is actually required to evaluate the degree to which measurement problems are or are not a concern in the interpretation of the research results. In each instance, I will also indicate which additional or supplementary information is necessary to evaluate the measurements used in the research.

Alpha Is Not a Measure of Unidimensionality

One important confusion in the literature involves the use of homogeneity and internal consistency as though they were synonymous. Internal consistency refers to the interrelatedness of a set of items, whereas homogeneity refers to the unidimensionality of the set of items. Internal consistency is certainly necessary for homogeneity, but it is not sufficient. The most recent explication and discussion of this distinction is that of Cortina (1993). Hattie (1985) made a similar distinction in a comprehensive review of alternative ways in which researchers have indexed unidimensionality. Cronbach (1951) viewed reliability, including internal consistency measures, as the proportion of test variance that was attributable to group and general factors. Specific item variance, or uniqueness, was considered error. Clearly, Cronbach, Cortina, and Hattie would not treat alpha as a measure of unidimensionality. In fact, Cronbach stated that alpha is an underestimate of reliability (as he defined it) unless the interitem correlation matrix is of unit rank (i.e., unidimensional). Cronbach's early statements (1947, 1951) about reliability suggest that the reliability of a multidimensional measure can only be estimated by correlating scores on parallel forms of a test that each represent the same factor structure.

It is also the case that alpha increases as a function of test length. The widely used Spearman-Brown correction formula expresses the relationship between test length and reliability. Lord and Novick (1968), among others, have provided a discussion of the Spearman-Brown formula along with tabular illustrations of the relationship between test length and reliability. In fact, the Kuder-Richardson derivations of various reliability formulas that are specific forms of alpha involve the use of the Spearman-Brown correction of a single item's reliability. The single item reliability expressed as the average intercorrelation among items in a measure is extended to express the full-length test reliability in the Kuder-Richardson derivation. The Spearman-Brown formula in this instance would be equal to [N times the single item reliability]/[1 + (N - 1) times the single item reliability].

Given that alpha is a function of the interrelatedness of the items in a test and the test length rather than the homogeneity of the interitem correlations or their unidimensionality (as is often assumed), what are the measurement implications? Consider the two interitem matrices depicted in Table 1. In the case of both of these six-item matrices, coefficient alpha (actually standardized alpha) is .86, but it is clear that the interrelationships among the first set of items indicate that the responses to the items are a function of two factors. Removal of a single general factor from the second set of items would yield zero off-diagonal correlations, indicating no item-specific and no group factors were responsible for item responses; hence this second six-item measure is unidimensional. This would not be true of the first set of items; the intercorrelations of these items indicate the presence of two factors.

Consider the second example in Table 2. In this case, a 6-item measure and a 10-item measure have the same alpha, but the shorter measure clearly is a function of two factors. The 10-item measure in this example is a function of a single factor. Both of these comparisons clearly indicate that alpha is not a good


indicator of the unidimensionality of a set of items. In summary, if alpha is used as "proof" that a set of items have an unambiguous or unidimensional interpretation, the conclusions drawn may or may not be correct.

Table 1
Sample Interitem Matrices With Equal Cronbach Alpha

Variable   1    2    3    4    5    6        Variable   1    2    3    4    5    6
1.         —                                 1.         —
2.         .8   —                            2.         .5   —
3.         .8   .8   —                       3.         .5   .5   —
4.         .3   .3   .3   —                  4.         .5   .5   .5   —
5.         .3   .3   .3   .8   —             5.         .5   .5   .5   .5   —
6.         .3   .3   .3   .8   .8   —        6.         .5   .5   .5   .5   .5   —
           (α = .86)                                    (α = .86)

Note. All examples are written in correlational form as opposed to covariance form for convenience and ease of interpretation only.

Table 2
Tests of Different Length and Dimensionality With Equal Alpha

Variable   1    2    3    4    5    6        Variable   1    2    3    4    5    6    7    8    9    10
1.         —                                 1.         —
2.         .6   —                            2.         .3   —
3.         .6   .6   —                       3.         .3   .3   —
4.         .3   .3   .3   —                  4.         .3   .3   .3   —
5.         .3   .3   .3   .6   —             5.         .3   .3   .3   .3   —
6.         .3   .3   .3   .6   .6   —        6.         .3   .3   .3   .3   .3   —
           (α = .81)                         7.         .3   .3   .3   .3   .3   .3   —
                                             8.         .3   .3   .3   .3   .3   .3   .3   —
                                             9.         .3   .3   .3   .3   .3   .3   .3   .3   —
                                             10.        .3   .3   .3   .3   .3   .3   .3   .3   .3   —
                                                        (α = .81)

There are several alternatives to this use of alpha (Hattie, 1985, actually discussed 30). Cortina (1993) suggested that in addition to reporting alpha, researchers also report what Cortina called the precision of alpha or what he called the standard error of alpha. This statistic reflects the spread of interitem correlations. This index will yield a value of 0 when all interitem correlations are zero and relatively high values when the spread of interitem correlations is great. A large spread in interitem correlations indicates either some form of multidimensionality or a great deal of sampling error in the estimation of the interitem correlations. Cortina's index is not the standard error of alpha; the absence of sample size in his formula means sampling error does not necessarily influence this index. Given certain distributional assumptions, Feldt (1980) and Feldt, Woodruff, and Salih (1987) presented a formula for the computation of the standard error of alpha.

If concerned with sampling error, researchers should use the Feldt (1980) index when they want to assess the accuracy of their estimate of alpha. By contrast, when assessing the degree to which a measure is actually unidimensional, an increasingly popular approach in determining the extent of unidimensionality is to test whether the interitem correlation matrix fits a single-factor model (Joreskog & Sorbom, 1979). For example, the second examples in both Tables 2 and 3 are perfectly fit by a single-factor model. In Table 2, a single-factor model fit the first matrix of interitem correlations poorly as indexed by a significant chi-square, but more important, by uniformly poor fit statistics as computed in LISREL8 (Joreskog & Sorbom, 1993). In this instance, a two-factor model fit the data perfectly. The same was true for the first example in Table 3, but in this instance, the fit of a single-factor model was not as bad: nonnormed fit index (NNFI) = .54; adjusted goodness-of-fit index (AGFI) = .45; and root mean square residual (RMSR) = .13. For readers interested in the assessment of unidimensionality, the relationship between classical test theory perspectives and structural equation modeling of measurement models has been very effectively and clearly illustrated and explained by Miller (1995).

In examining the types of matrices computed from actual assessee responses, there are rarely instances in which the factorial nature of the interitem correlations is as clear as in these two examples. This also implies that unidimensionality is not unambiguously present or absent. The question can be reframed as Hattie (1985) suggested: "Are there decision criteria that determine how close a set of items is to being a unidimensional set?" (p. 159). It is clear that alpha is not an adequate index of unidimensionality, and to interpret or use it for this purpose is wrong. It is also an underestimate of reliability (as defined by Cronbach, i.e., as a measure of the communalities of the items) in the presence of multidimensionality. The latter statement has implications for the use of alpha in corrections for attenuation, which are elaborated on next.

What Is an Adequate Level of Alpha?

A second problem in the use of alpha arises from researchers' common presumption that a particular level of alpha (usually .70) is desired or adequate. Having obtained that level, they then proceed to use the measure without further consideration of its dimensionality or construct validity. This use of the statistic clearly represents a lack of appreciation of the meaning of alpha as discussed earlier and of the relationship between alpha and test length. There are two reasons why the use of any cutoff value (including .70) is shortsighted.

First, alpha is often used to make corrections for unreliability between two measures in an attempt to ascertain the relationship between the latent or true variables underlying the measures. This correction involves dividing the observed correlation between the two variables by the product of the square roots of their reliabilities (Lord & Novick, 1968). Classic reliability theory also holds that the upper limit of validity (the relationship between a predictor and criterion) is the square root of the reliability of the criterion or outcome variables rather than 1.00, which is the upper limit of a Pearson correlation. The concern then is that the true correlations involving a predictor and an unreliable outcome variable will be seriously attenuated (i.e., underestimated) because of inadequate criterion reliability rather than any lack of real or true relationship. In considering the implications of these findings for expected validity, it can be seen that with reliability equal to .70, validity has an upper limit of .84 (i.e., the square root of .70) as opposed to 1.00. Even with reliability as low as .49, the upper limit of validity is .70. When

a measure has other desirable properties, such as meaningful content coverage of some domain and reasonable unidimensionality, this low reliability may not be a major impediment to its use. Of course, the usual correction for attenuation would allow the size of the relationship between the underlying constructs to be determined, and it would also allow for clearer correlations between this variable and other potential target variables of interest.

Researchers who do appreciate the relationship between test length and reliability sometimes attempt to excuse the low reliability of their measures by referencing the short length of the measure. The gist of this argument is typically that because the test is short, a low level of alpha would be expected and therefore the researchers should be allowed to use and interpret the findings of research using this measure of low reliability. In these instances, the researchers may or may not be correct in concluding that the low reliability of the measure is a function of test length. However, it remains true that the measures have low reliability, and estimates of the relationships between the variables and other variables will be correspondingly attenuated. Further, interpretations of these relationships should include caveats about low reliability and the potential for underestimating any relationships between the measured variable and other variables of interest. In this instance, if lack of reliability is deemed to be a significant problem in estimating effect sizes or evaluating hypotheses, the researcher should develop a longer measure with adequate reliability. Short length does not alleviate the problems of reliability.

Corrections for Unreliability and Multidimensionality

As previously demonstrated, a relatively high level of alpha can be obtained when the item responses are in fact the function of more than one construct; in these instances alpha is likely to be an underestimate of the measure's reliability as defined by Cronbach. What are the implications of these findings for the appropriateness of the correction for attenuation for unreliability when the correlation being corrected is an estimate of the relationship between two multidimensional measures? The correction for attenuation due to unreliability is computed to provide accurate estimates of the "true" relationship between constructs. Observed correlations are always distorted by any random measurement error in either of two measures correlated. The correction for attenuation serves to provide estimates of the relationships between the underlying constructs measured. The importance of this correction and the implications for research in many different areas of psychology have recently been discussed by Schmidt and Hunter (1996).

The short answer to this question seems to be that when the factor structure of two multidimensional measures is the same, the correction for attenuation will be an overcorrection. Applying the classic correction for attenuation using alpha as an estimate of reliability in such cases will result in an overestimate of the true correlation between these two variables.

An illustration of this phenomenon is shown in Table 3. This is a case in which two tests (A and B) are composed of identical factors and the observed intercorrelation between the two measures is .94, as calculated from the matrix of correlations presented in Table 3. (Correlation equals the obtained cross-scale correlations divided by the product of the standard deviations of the two measures.) This correlation corrected for attenuation (i.e., .94 divided by the product of the square roots of the two reliabilities) is 1.09. Obviously this is an overestimate of the true correlation of 1.00. This demonstration implies that one should not correct for attenuation using an alpha coefficient as the reliability estimate unless there is also evidence that the measures involved are unidimensional.

Table 3
Intercorrelations, Alpha Coefficients, and Corrected Correlations for Two Multidimensional Tests That Are Perfectly Parallel

                       Test A                          Test B
Variable   1    2    3    4    5    6    7    8    9    10   11   12
1.         —
2.         .8   —
3.         .8   .8   —
4.         .3   .3   .3   —
5.         .3   .3   .3   .8   —
6.         .3   .3   .3   .8   .8   —
7.         .8   .8   .8   .3   .3   .3   —
8.         .8   .8   .8   .3   .3   .3   .8   —
9.         .8   .8   .8   .3   .3   .3   .8   .8   —
10.        .3   .3   .3   .8   .8   .8   .3   .3   .3   —
11.        .3   .3   .3   .8   .8   .8   .3   .3   .3   .8   —
12.        .3   .3   .3   .8   .8   .8   .3   .3   .3   .8   .8   —

Note. Observed correlation(AB) = 19.8/(4.58 × 4.58) = .94. Corrected correlation(AB) = .94/√(.86 × .86) = 1.09.

The practical implications of this demonstration (i.e., whether the correction as it is often applied in research is affected) can only be speculated about. In some applied situations such as academic and job performance prediction situations, the practice may make a practical difference in results and interpretations. In most such instances an effort is made to construct measures of outcome variables that reflect the dimensionality of the job or academic pursuit. These instruments should be appropriately multidimensional. If researchers were to compute a combined score across items or dimensions of this outcome measure, and then compute the validity of the predictor and use a composite alpha to correct this validity for attenuation, the resultant correction would be an overestimate of the "true" validity. In this instance, the researcher would be better advised to develop unidimensional measures of each predictor and criterion construct and then correct observed correlations using estimates of the reliability of these unidimensional measures.

Presenting Alpha Information Is Not Enough

Researchers fairly routinely report the level of alpha associated with the various measures they use in operationalizing key constructs. However, the intercorrelations among the measures are often not presented. This is particularly troublesome if it is important to the researchers' objectives that the measures possess some degree of discriminant validity. Perhaps the worst form of this problematic reporting occurs when a researcher

derives measures of several constructs from a single paper-and-pencil measure or interview instrument and reports that the alphas of all measures were relatively high (e.g., above .85). The researcher then proceeds to make interpretations based on the profile of respondents' scores on these dimensions without presenting the intercorrelations among the scales. Or, these measures may be used in some multivariate analysis and the researcher then reports surprise at finding that multicollinearity renders any interpretation regarding the relative efficacy of the variables ambiguous. The minimum information that should be provided in these instances includes the alpha coefficients, the observed correlations, and the correlations corrected for attenuation due to unreliability. This can all be done efficiently with no additional use of space (for those who have been pressured by editors to use space sparingly). An example is presented in Table 4.

The example in Table 4 can also be used to demonstrate why both corrected and uncorrected coefficients (or the information allowing their calculation) should be presented. First, without the intercorrelations of these variables, the reader does not have the information to evaluate whether the levels of reported alpha are good or bad. Second, the correlation between any two variables might suggest that they are so highly correlated that any differentiation between these two measures is not practically or theoretically useful. In Table 4, the observed correlation between Variables 1 and 2 indicates they are less discriminable than are measures of the other constructs. However, when both the intercorrelations and the reliabilities of the measures are taken into account (or the corrected correlations are examined), it is clear that these conclusions about Variables 1 and 2 are incorrect. They are no more or less discriminable than Variables 2, 3, or 4. One might also conclude by examining observed correlations that Variable 5 shares little in common with the other four variables, but the corrected coefficients clearly contradict this view.

Table 4
Observed Correlations, Alpha Coefficients, and Corrected Correlations Among Measures of Several Constructs

Variable   1       2       3       4       5
1.         (.81)   .80     .71     .83     .59
2.         .70     (.95)   .62     .81     .60
3.         .51     .47     (.64)   .86     .65
4.         .52     .55     .45     (.49)   .67
5.         .32     .35     .31     .28     (.36)

Note. Alpha coefficients are presented on the diagonal, observed correlations below the diagonal, and correlations corrected for attenuation above the diagonal.

Other examples could be constructed to show other combinations of reliability and intercorrelations that would be very differently interpreted when relying on observed correlations rather than corrected correlations. Clearly, both intercorrelations and alpha must be reported if the reader is to be adequately informed about the obtained results. Of course, it is also incumbent on the researcher to consider both sources of information when drawing conclusions about the adequacy of measures.

Summary and Conclusions

Four caveats are implied by this article regarding the proper use of the alpha coefficient.
1. Alpha is not an appropriate index of unidimensionality to assess homogeneity.
2. In correcting for attenuation due to unreliability, use of alpha as an estimate of reliability is based on the notion that the measures involved are unidimensional. When this is not the case, the corrected coefficients will be overcorrected.
3. There is no sacred level of acceptable or unacceptable alpha. In some cases, measures with (by conventional standards) low levels of alpha may still be quite useful.
4. Presenting only alpha when discussing the relationships of multiple measures is not sufficient. Intercorrelations and corrected intercorrelations must be presented as well.

References

Cortina, J. M. (1993). What is coefficient alpha? An examination of theory and applications. Journal of Applied Psychology, 78, 98-104.
Cronbach, L. J. (1947). Test "reliability": Its meaning and determination. Psychometrika, 12, 1-16.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297-334.
Feldt, L. S. (1980). A test of the hypothesis that Cronbach's alpha coefficient is the same for two tests administered to the same sample. Psychometrika, 45, 99-105.
Feldt, L. S., Woodruff, D. J., & Salih, F. A. (1987). Statistical inference for coefficient alpha. Applied Psychological Measurement, 11, 93-103.
Hattie, J. (1985). Methodology review: Assessing unidimensionality of tests and items. Applied Psychological Measurement, 9, 139-164.
Joreskog, K. G., & Sorbom, D. (1979). Advances in factor analysis and structural equation models. Cambridge, MA: Abt Books.
Joreskog, K. G., & Sorbom, D. (1993). New features in LISREL8. Chicago, Illinois: Scientific Software International.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
Miller, M. B. (1995). Coefficient alpha: A basic introduction from the perspectives of classical test theory and structural equation modelling. Structural Equation Modelling, 2, 255-273.
Schmidt, F. L., & Hunter, J. E. (1996). Measurement error in psychological research: Lessons from 26 research scenarios. Psychological Methods, 1, 199-223.

Received May 31, 1996
Revision received July 24, 1996
Accepted July 24, 1996
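
Worked example: standardized alpha for the matrices in Tables 1 and 2. The sketch below is an illustration added for readers who wish to check the reported values; it is not part of the original analyses. It applies the Spearman-Brown form quoted in the text, N times the average interitem correlation divided by 1 + (N - 1) times the average interitem correlation. The function names (block_matrix, standardized_alpha, eigenvalues) and the use of NumPy are assumptions made for this illustration. The eigenvalue listing is only a rough descriptive stand-in for dimensionality; it is not the single-factor model fit in LISREL8 that the article describes.

```python
import numpy as np


def block_matrix(n_items, block_size, within, between):
    """Build a correlation matrix made of equal-sized item blocks.

    Items in the same block correlate `within`; items in different
    blocks correlate `between`. (Hypothetical helper for this sketch.)
    """
    r = np.full((n_items, n_items), between, dtype=float)
    for start in range(0, n_items, block_size):
        r[start:start + block_size, start:start + block_size] = within
    np.fill_diagonal(r, 1.0)
    return r


def standardized_alpha(r):
    """Spearman-Brown step-up of the average interitem correlation."""
    n = r.shape[0]
    mean_r = r[np.triu_indices(n, k=1)].mean()
    return n * mean_r / (1 + (n - 1) * mean_r)


def eigenvalues(r):
    """Eigenvalues of a symmetric correlation matrix, largest first."""
    return np.sort(np.linalg.eigvalsh(r))[::-1]


matrices = {
    "Table 1, first matrix (6 items, two blocks)": block_matrix(6, 3, 0.8, 0.3),
    "Table 1, second matrix (6 items, one block)": block_matrix(6, 6, 0.5, 0.5),
    "Table 2, first matrix (6 items, two blocks)": block_matrix(6, 3, 0.6, 0.3),
    "Table 2, second matrix (10 items, one block)": block_matrix(10, 10, 0.3, 0.3),
}

for label, r in matrices.items():
    alpha = standardized_alpha(r)
    top_two = eigenvalues(r)[:2]
    print(f"{label}: alpha = {alpha:.2f}, two largest eigenvalues = {np.round(top_two, 2)}")
```

Within each table the two matrices yield the same standardized alpha even though their second eigenvalues differ markedly, which is the point made in the text: equal alphas do not imply equal dimensionality.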

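Worked example: the correction for attenuation in Tables 3 and 4. This second sketch, again an added illustration rather than the author's own computation, rebuilds the 12-item matrix of Table 3, forms the correlation between the two unit-weighted composites (the sum of cross-scale correlations divided by the product of the composite standard deviations), and then divides by the square root of the product of the two alphas. The names correct_for_attenuation and standardized_alpha, and the coding of the two factors as 0 and 1, are assumptions of this illustration.

```python
import numpy as np


def standardized_alpha(r):
    """Spearman-Brown step-up of the average interitem correlation."""
    n = r.shape[0]
    mean_r = r[np.triu_indices(n, k=1)].mean()
    return n * mean_r / (1 + (n - 1) * mean_r)


def correct_for_attenuation(r_xy, rel_x, rel_y):
    """Observed correlation divided by the square root of the product of the reliabilities."""
    return r_xy / np.sqrt(rel_x * rel_y)


# Table 3: items 1-3 and 7-9 share one factor, items 4-6 and 10-12 the other.
factor = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1])
r = np.where(factor[:, None] == factor[None, :], 0.8, 0.3)
np.fill_diagonal(r, 1.0)

test_a, test_b = slice(0, 6), slice(6, 12)
cross_sum = r[test_a, test_b].sum()        # sum of the 36 cross-scale correlations
sd_a = np.sqrt(r[test_a, test_a].sum())    # SD of the unit-weighted composite for Test A
sd_b = np.sqrt(r[test_b, test_b].sum())    # SD of the unit-weighted composite for Test B
r_ab = cross_sum / (sd_a * sd_b)

alpha_a = standardized_alpha(r[test_a, test_a])
alpha_b = standardized_alpha(r[test_b, test_b])
corrected_ab = correct_for_attenuation(r_ab, alpha_a, alpha_b)

print(f"cross sum = {cross_sum:.1f}, SD(A) = {sd_a:.2f}, SD(B) = {sd_b:.2f}")
print(f"observed r(AB) = {r_ab:.2f}, alpha = {alpha_a:.2f}, corrected r(AB) = {corrected_ab:.2f}")

# Upper limit of validity for a given criterion reliability (square root of reliability).
for reliability in (0.70, 0.49):
    print(f"reliability {reliability:.2f} -> validity ceiling {np.sqrt(reliability):.2f}")

# Table 4, Variables 1 and 2: observed r = .70, alphas .81 and .95.
print(f"Table 4, Variables 1 and 2: corrected r = {correct_for_attenuation(0.70, 0.81, 0.95):.2f}")
```

Carried out without intermediate rounding, the corrected correlation for Table 3 comes to about 1.10; the 1.09 in the table note follows from dividing the rounded values .94 and .86. Either way the corrected value exceeds 1.00, which is the overcorrection the text describes.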