Nominal-Ordinal Association Analysis
Taylor & Francis, Ltd. and American Statistical Association are collaborating with JSTOR to digitize, preserve
and extend access to Journal of the American Statistical Association.
Expected Frequencies for Loglinear Model and Log Odds of Adjacent Cell Frequencies
North    15,842.1    (.03)     21,131.7    (-.27)    20,691.1
South    26,475.9    (-.15)    24,704.3    (.64)     16,920.8

Expected Frequencies for Logit Model and Observed Logits
North    15,815.7    (-.81)    21,184.6    (.44)     20,664.7
South    26,502.3    (-.57)    24,651.4    (1.26)    16,947.3

NOTE: For simplicity of illustration, we list data for only two of the four regions given in the original table.
Source: U.S. Bureau of the Census (1977), Current Population Reports, P-25, no. 709, Table A.

of the ordinal variable for members selected at random from group 1 and group 2, respectively.

2.1 Delta

A simple measure of association that uses only the ordering of the levels of the ordinal variable is

\[
\begin{aligned}
\delta &= P(Y_1 > Y_2) - P(Y_2 > Y_1) \\
       &= \sum_{i>j} p_{1i} p_{2j} - \sum_{i<j} p_{1i} p_{2j}.
\end{aligned}
\tag{2.1}
\]

Clearly, $-1 \le \delta \le 1$, with $|\delta| = 1$ if and only if one of the $\{p_{ij}, 1 \le j \le c\}$ distributions is entirely below or

\[
\hat{\delta} = \frac{\sum_{i>j} n_{1i} n_{2j} - \sum_{i<j} n_{1i} n_{2j}}{n_1 n_2} = \frac{U - U'}{n_1 n_2},
\tag{2.3}
\]

where $U = \sum_{i>j} n_{1i} n_{2j}$ and $U' = \sum_{i<j} n_{1i} n_{2j}$ are discrete analogs of the statistics on which the Mann-Whitney test is based for continuous data. Equivalently, $\hat{\delta} = [S_1 - E(S_1)]/(n_1 n_2/2)$, where $S_1$ denotes the rank sum for the first group (average ranks being assigned to the levels of the ordinal variable) and $E(S_1) = n_1(n_1 + n_2 + 1)/2$ denotes the expected value of $S_1$ when $\{p_{1i} = p_{2i}, i = 1, \ldots, c\}$. Thus, $\hat{\delta}$ may be interpreted as the difference between $S_1$ and its expected value when the distributions are identical, divided by the maximum possible value of that difference. It follows that $\hat{\delta} > 0$ if and only if the mean rank for group 1 exceeds the mean rank for group 2.

2.2 Alpha

In some applications, it is informative to describe the relative sizes of $P(Y_1 > Y_2)$ and $P(Y_2 > Y_1)$ in ratio form,

\[
\alpha = \frac{P(Y_1 > Y_2)}{P(Y_2 > Y_1)} = \frac{\sum_{i>j} p_{1i} p_{2j}}{\sum_{i<j} p_{1i} p_{2j}}.
\tag{2.4}
\]
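As a rough illustration of the definitions of delta and alpha, the sketch below computes sample versions from two rows of counts over the same ordered levels. It makes three assumptions that are not in the text: the function names are hypothetical, the counts are invented, and the sample alpha is taken to be the plug-in ratio U/U' obtained by replacing probabilities with sample proportions. The last function checks the rank-sum form of the sample delta using average ranks.

```python
def pair_counts(n1, n2):
    """Return (U, U') for two rows of counts over the same ordered levels.

    U  counts pairs in which the group-1 member is in a higher level;
    U' counts pairs in which the group-2 member is in a higher level.
    """
    if len(n1) != len(n2):
        raise ValueError("both groups need counts for the same ordered levels")
    c = len(n1)
    u = sum(n1[i] * n2[j] for i in range(c) for j in range(c) if i > j)
    u_prime = sum(n1[i] * n2[j] for i in range(c) for j in range(c) if i < j)
    return u, u_prime


def delta_hat(n1, n2):
    """Sample delta: (U - U') / (n1 * n2)."""
    u, u_prime = pair_counts(n1, n2)
    return (u - u_prime) / (sum(n1) * sum(n2))


def alpha_hat(n1, n2):
    """Plug-in sample alpha: U / U' (assumed estimator; undefined if U' = 0)."""
    u, u_prime = pair_counts(n1, n2)
    if u_prime == 0:
        raise ZeroDivisionError("alpha is undefined when U' = 0")
    return u / u_prime


def delta_hat_from_ranks(n1, n2):
    """Sample delta via the rank sum S1 of group 1, using average ranks."""
    size1, size2 = sum(n1), sum(n2)
    rank_sum, below = 0.0, 0
    for a, b in zip(n1, n2):
        level_total = a + b
        # Average rank shared by every observation tied at this level.
        midrank = below + (level_total + 1) / 2
        rank_sum += a * midrank
        below += level_total
    expected = size1 * (size1 + size2 + 1) / 2
    return (rank_sum - expected) / (size1 * size2 / 2)


# Invented counts for two groups over three ordered levels.
group1 = [10, 20, 30]
group2 = [30, 20, 10]
print(delta_hat(group1, group2))
print(alpha_hat(group1, group2))
print(delta_hat_from_ranks(group1, group2))
```

The pair-count and rank-sum routes should agree up to floating-point error, which gives a quick check on either implementation.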
Freeman (1965, p. 112) defined a related sample measure, called theta (see also Freeman 1976 and Hubert 1974). Additional approaches to measuring nominal-ordinal association may be found in Agresti (1978), Crittenden and Montgomery (1980), Goodman and Kruskal (1959, Secs. 4.4 and 4.5), Jacobson (1972), Rehak (1976), and Sarndal (1974).

4. MODELS FOR NOMINAL-ORDINAL DATA

In the past decade, several authors have proposed models for describing patterns of associations in nominal-ordinal cross-classification tables. These include Andrich (1979), Bock (1975, pp. 541-550), Duncan (1979), Fienberg (1977, pp. 52-58), Goodman (1979), Haberman (1974), McCullagh (1979, 1980), and Williams and Grizzle (1972). In this section we briefly describe the two most commonly considered model types, logit and loglinear. We then discuss how the measures studied in this paper, which are not model based, are appropriate for use in diverse situations described by these and many other models.

4.1 A Logit Model

Let $F_{ij} = \sum_{b=1}^{j} p_{ib}$ be the $j$th cumulative probability for category $i$ of the nominal variable, $1 \le i \le r$, $1 \le j \le c$. One way we can use the ordinal nature of the column classification without resorting to scoring methods is by constructing a model, using the accumulated logits $\ln[F_{ij}/(1 - F_{ij})]$, $j = 1, 2, \ldots, c - 1$, within each row. The most popular model of this type is the unsaturated one in which the difference between the distributions for any pair of rows is constant across the columns on this logit scale. That is,

\[
\ln[F_{ij}/(1 - F_{ij})] = \ln[F_{kj}/(1 - F_{kj})] + \Delta_{ik}, \quad j = 1, 2, \ldots, c - 1,
\tag{4.1}
\]

for all pairs $1 \le i < k \le r$. This model has been suggested by several authors, including Clayton (1974), Simon (1974), McCullagh (1979, 1980), and Williams and Grizzle (1972). It may equivalently be described by noting that the odds ratios

\[
\frac{(p_{i1} + \cdots + p_{ij})/(p_{i,j+1} + \cdots + p_{ic})}{(p_{k1} + \cdots + p_{kj})/(p_{k,j+1} + \cdots + p_{kc})} = \exp(\Delta_{ik}), \quad 1 \le j \le c - 1,
\tag{4.2}
\]

are identical for all collapsings of each $2 \times c$ subtable into a $2 \times 2$ table. Another formulation of the model, given by Simon (1974), is

\[
\ln[F_{ij}/(1 - F_{ij})] = \alpha_i + \beta_j.
\tag{4.3}
\]

The logistic differences are $\{\Delta_{ik} = \alpha_i - \alpha_k\}$ in this parameterization.

Although model (4.1) is intuitively appealing, it is not always suitable, even when there are "nice" underlying distributions differing only in location. Fleiss (1970) showed, for example, that for two normal distributions with equal variances, the value of the odds ratio computed for a collapsing into a $2 \times 2$ table depends greatly on how the cutting point is chosen for forming the dichotomy. McCullagh (1980) suggested alternative models that also do not require scores. These models assume constant differences $\{\Delta_{ik}\}$ between distribution functions or their complements on a log-log scale.

4.2 A Loglinear Model

This model assumes that a meaningful set of ordered scores $\{v_j\}$ can be assigned to the columns. It contains an interaction term representing a deviation from independence that changes linearly within each row; specifically,

\[
\ln p_{ij} = \gamma_i + \lambda_j + \alpha_i v_j, \quad 1 \le i \le r, \; 1 \le j \le c.
\tag{4.4}
\]

Various formulations of this model have been suggested by Simon (1974), Haberman (1974), Goodman (1979), Duncan (1979), and Fienberg (1977, pp. 52-55).

An interesting implication of model (4.4) is that

\[
\ln(p_{ij} p_{kl} / p_{il} p_{kj}) = (\alpha_i - \alpha_k)(v_j - v_l).
\tag{4.5}
\]

In other words, for any pair of rows, the odds ratio is constant for all pairs of columns that are equidistant in score. For the equal-interval scores $\{v_j = j\}$, the log odds ratio equals $\Delta_{ik} = \alpha_k - \alpha_i$ for all pairs of adjacent columns. Thus, any pair of rows has a constant difference between log odds of adjacent cell proportions,

\[
\Delta_{ik} = \ln(p_{ij}/p_{i,j+1}) - \ln(p_{kj}/p_{k,j+1}), \quad 1 \le j \le c - 1.
\tag{4.6}
\]

4.3 Measuring Association

When model (4.1) or (4.4) provides an adequate fit to a table, the strength of the association can be quantified by the magnitude of the $\{\Delta_{ik}\}$ "difference" parameters. (Notice that $r - 1$ of these parameters (and hence the $\{\alpha_i\}$) determine the entire set of $\{\Delta_{ik}\}$.) Since the meaning of $\Delta_{ik}$ depends on the particular structural form for the model, however, the magnitudes of these parameters cannot be compared across differing structural models. This makes it difficult to compare associations in two or more tables for which different structural models are applicable. In addition, many cross-classifications occur in practice for which (a) none of the commonly used structural models provides an adequate fit, or (b) if a good-fitting model is obtained by trying several structural types, the result of "fishing for structure" may be that the same model type is inadequate when applied to other cross-classifications of the same variables.

The delta and alpha measures of nominal-ordinal association, not being model based, can often be used for comparing strengths of association across tables even if no single, simple structural model form is generally applicable to those tables. These measures have a certain robustness in the sense that they are applicable in a broad
range of settings that would encompass a variety of models. We showed in Section 3.3 that they are naturally suited to systems of stochastically ordered distributions. Now it can be seen that any cross-classification table for which the logistic model (4.1), loglinear model (4.4), or one of McCullagh's (1980) log-log models fits perfectly is such that the distributions within the rows are stochastically ordered. Thus, the delta and alpha measures are suitable for use whenever one of these important model types is deemed appropriate. Their robustness is illustrated by the fact that if any of these models fits a particular table perfectly, then the $\{\Delta_{ik}\}$ for that model will be matched in sign by the $\{\delta_{ik}\}$ and $\{\ln \alpha_{ik}\}$. When different model types fit different tables, these measures give us a common basis for comparing associations and summarizing the results of the models.

Table 1, introduced earlier, illustrates the above points. The loglinear model (4.4) provides a poor fit to these data. Goodness-of-fit tests are of little interest for these estimated population frequencies. Nevertheless, the likelihood ratio chi-squared statistic $G^2 = 2.08 \times 10^6$, based on df = 1, is large even for the size of this data set. Closer inspection reveals that the two differences between log odds of adjacent cell frequencies are quite different and even have different signs. The logit model provides a similar fit and also is inadequate, with $G^2 = 2.14 \times 10^6$ based on df = 1. In fact, note that the magnitudes of the estimated expected frequencies in each row

all $\{|\hat{\delta}_{ik}|\}$ and $\{|\ln \hat{\alpha}_{ik}|\}$ from 1960 exceed the corresponding values from 1975. This indicates that differences between regions in the distribution of size of residential area tended to diminish over these 15 years. This slight decrease in variability among the regions is also reflected by the smaller values in 1975 of the summary measures $\delta$ and $\alpha$ (.224 and 2.008, respectively, compared with .259 and 2.244 in 1960). Within each region one could also compute $\delta$ or $\alpha$ for pairs of years to summarize change toward metropolitan populations.

The above remarks are not intended as a criticism of the model-building approach. It is important to attempt to describe table structures, and we believe that these measures help to complement that process. They describe strength of association on a common basis for the class of tables of stochastically ordered distributions, different elements of which may be well fit by different models or by none of the simple models in current use.

APPENDIX: DERIVATIONS OF ASYMPTOTIC SAMPLING DISTRIBUTIONS

The population values of $\delta$ and $\alpha$ can be expressed as $\zeta = \nu/\Delta$, where $\nu$ and $\Delta$ are functions of the $\{p_{ij}\}$. Let $\hat{\zeta}$ denote the sample value of $\zeta$, $\phi_{ij} = \nu(\partial\Delta/\partial p_{ij}) - \Delta(\partial\nu/\partial p_{ij})$, and $\bar{\phi} = \sum_{i,j} p_{ij}\phi_{ij}$. Using the "delta method," Goodman and Kruskal (1972) showed that for full multinomial sampling, $\sqrt{n}(\hat{\zeta} - \zeta)/\sigma \xrightarrow{d} N(0, 1)$ as the sam-
$- R_{ij}^{(D)})$ and $\bar{\phi} = 0$. If $\delta = 1$, then $R_{ij}^{(C)} = 1 - p_{i\cdot}$ and $R_{ij}^{(D)} = 0$ for all $i, j$, so that $\sigma^2 = 0$; then $\hat{\delta} = 1$ and $\hat{\sigma}^2 = 0$ with probability one.

In applications in which we are comparing several groups on an ordinal response, the sampling scheme is often independent multinomial within the levels of $X$. In that case we assume the $\{p_{i\cdot}\}$ are known, and we obtain $\phi_{ij+} = -2p_{i\cdot}\Delta(R_{ij}^{(C)} - R_{ij}^{(D)})$ and $\bar{\phi}_{i+} = -2\Delta\sum_j p_{ij}(R_{ij}^{(C)} - R_{ij}^{(D)})$.

Next we consider the sample version $\hat{\alpha}$ of $\alpha$, under the assumption that all $\alpha_{ij} \ne 1$. Letting $\alpha = \nu/\Delta$ with $\Delta = P^{(D)}$, we obtain $\phi_{ij} = 2\nu R_{ij}^{(D)} - 2\Delta R_{ij}^{(C)}$ and $\bar{\phi} = 0$ for the case of full multinomial sampling. For the case of independent multinomial sampling with known $\{p_{i\cdot}\}$, we obtain $\phi_{ij+} = 2\nu p_{i\cdot}R_{ij}^{(D)} - 2\Delta p_{i\cdot}R_{ij}^{(C)}$ and $\bar{\phi}_{i+} = 2\nu\sum_j p_{ij}R_{ij}^{(D)} - 2\Delta\sum_j p_{ij}R_{ij}^{(C)}$.

Being a difference rather than a ratio, $\ln(\hat{\alpha})$ tends to converge faster to its limiting normal distribution. Its variance can be estimated for large samples by $\hat{\sigma}^2/n\hat{\alpha}^2$. Thus, one can form the $100(1 - p)$ percent confidence interval $\ln(\hat{\alpha}) \pm z_{p/2}\,\hat{\sigma}/(\sqrt{n}\,\hat{\alpha})$ for $\ln(\alpha)$ and then exponentiate endpoints to obtain a corresponding confidence interval for $\alpha$.

[Received September 1978. Revised March 1981.]

REFERENCES

AGRESTI, ALAN (1978), "Descriptive Measures for Rank Comparisons of Groups," 1978 Proceedings of the American Statistical Association, Social Statistics Section, 585-590.
ANDRICH, DAVID (1979), "A Model for Contingency Tables Having an Ordered Response Classification," Biometrics, 35, 403-415.
BOCK, R. DARRELL (1975), Multivariate Statistical Methods in Behavioral Research, New York: McGraw-Hill.
BROSS, IRWIN D.J. (1958), "How to Use Ridit Analysis," Biometrics, 14, 18-38.
CLAYTON, D.G. (1974), "Some Odds Ratio Statistics for the Analysis of Ordered Categorical Data," Biometrika, 61, 525-531.
CRITTENDEN, KATHLEEN S., and MONTGOMERY, ANDREW C. (1979), "A System of Paired Asymmetric Measures of Association for Use With Ordinal Dependent Variables," Social Forces, 58, 1178-1194.
DUNCAN, OTIS DUDLEY (1979), "How Destination Depends on Origin in the Occupational Mobility Table," American Journal of Sociology, 84, 793-803.
FIENBERG, STEPHEN E. (1977), The Analysis of Cross-Classified Categorical Data, Cambridge, Mass.: MIT Press.
FLEISS, JOSEPH L. (1970), "On the Asserted Invariance of the Odds Ratio," British Journal of Preventive and Social Medicine, 24, 45-46.
FREEMAN, LINTON C. (1965), Elementary Applied Statistics, New York: John Wiley.
FREEMAN, LINTON C. (1976), "A Further Note on Freeman's Measure of Association," Psychometrika, 41, 273-275.
GOODMAN, LEO A. (1979), "Multiplicative Models for the Analysis of Occupational Mobility Tables and Other Kinds of Cross-Classification Tables," American Journal of Sociology, 84, 804-819.
GOODMAN, LEO A., and KRUSKAL, WILLIAM H. (1954), "Measures of Association for Cross Classifications," Journal of the American Statistical Association, 49, 723-764.
GOODMAN, LEO A., and KRUSKAL, WILLIAM H. (1959), "Measures of Association for Cross Classifications, II: Further Discussion and References," Journal of the American Statistical Association, 54, 123-163.
GOODMAN, LEO A., and KRUSKAL, WILLIAM H. (1963), "Measures of Association for Cross Classifications, III: Approximate Sampling Theory," Journal of the American Statistical Association, 58, 310-364.
GOODMAN, LEO A., and KRUSKAL, WILLIAM H. (1972), "Measures of Association for Cross Classifications, IV: Simplification of Asymptotic Variances," Journal of the American Statistical Association, 67, 415-421.
HABERMAN, SHELBY J. (1974), "Log-Linear Models for Frequency Tables With Ordered Classifications," Biometrics, 30, 589-600.
HUBERT, LAWRENCE (1974), "A Note on Freeman's Measure of Association for Relating an Ordered to an Unordered Factor," Psychometrika, 39, 517-520.
JACOBSON, PERRY E., JR. (1972), "Applying Measures of Association to Nominal-Ordinal Data," Pacific Sociological Review, 15, 41-60.
KLOTZ, JEROME H. (1966), "The Wilcoxon, Ties, and the Computer," Journal of the American Statistical Association, 61, 772-787.
McCULLAGH, PETER (1979), "The Use of the Logistic Function in the Analysis of Ordinal Data," Proceedings of the International Statistical Institute, Manila.
McCULLAGH, PETER (1980), "Regression Models for Ordinal Data," Journal of the Royal Statistical Society, Ser. B, 42, 109-142.
REHAK, JAN (1976), "Základní Deskriptivní Míry pro Rozložení Ordinálních Dat," Zvláštní otisk ze Sociologického časopisu, 416-431.
SARNDAL, C.E. (1974), "A Comparative Study of Association Measures," Psychometrika, 39, 165-187.
SIMON, GARY (1974), "Alternative Analyses for the Singly-Ordered Contingency Table," Journal of the American Statistical Association, 69, 971-976.
SOMERS, ROBERT H. (1962), "A New Asymmetric Measure of Association for Ordinal Variables," American Sociological Review, 27, 799-811.
VIGDERHOUS, GIDEON (1979), "Equivalence Between Ordinal Measures of Association and Tests of Significant Differences Between Samples," Quality and Quantity, 13, 187-201.
WILLIAMS, O. DALE, and GRIZZLE, JAMES E. (1972), "Analysis of Contingency Tables Having Ordered Response Categories," Journal of the American Statistical Association, 67, 55-63.