Nominal-Ordinal Association Analysis

This article proposes measures for summarizing the strength of association between a nominal variable with two levels and an ordered categorical variable with c levels. The measures are based on differences or ratios of probabilities of events concerning pairs of observations from each of the two groups defined by the nominal variable. These measures complement results from fitting loglinear and logit models to cross-classification tables, and are useful when a single model does not adequately describe the data. The measures generalize to tables with an r-level nominal variable in a way that relates to Freeman's theta index. Examples using US population data illustrate the measures.

Measures of Nominal-Ordinal Association

Author(s): Alan Agresti


Source: Journal of the American Statistical Association, Vol. 76, No. 375 (Sep., 1981), pp. 524-529
Published by: Taylor & Francis, Ltd. on behalf of the American Statistical Association
Stable URL: http://www.jstor.org/stable/2287505
Accessed: 15/01/2015 15:04


This content downloaded from 159.178.22.27 on Thu, 15 Jan 2015 15:04:23 PM


All use subject to JSTOR Terms and Conditions
Measures of Nominal-Ordinal Association

ALAN AGRESTI*

* Alan Agresti is Associate Professor, Department of Statistics, University of Florida. The author is grateful to the referees and the editor for helpful comments. Part of this research was completed while the author was visiting the Department of Statistics at Oregon State University.

© Journal of the American Statistical Association, September 1981, Volume 76, Number 375, Applications Section

Measures are formulated for summarizing the strength of association between a nominal variable and an ordered categorical variable. The measures are differences or ratios of probabilities of events concerning two types of pairs of observations. They can be used to describe the degree of difference between two or more groups on an ordinal response variable. The measures summarize and complement the results of fitting models to nominal-ordinal cross-classification tables, especially when a single structural model form cannot be found that adequately describes an entire table or set of tables.

KEY WORDS: Cross-classification tables; Somers's d; Gamma; Freeman's theta; Mann-Whitney statistic; Logit and loglinear models.

1. INTRODUCTION

The most appropriate measures for summarizing the degree of association between two variables depend on the measurement scales of those variables. In this article we propose some summary measures of the degree of association between a nominal variable and an ordinal variable. These might be used to describe associations between pairs of variables such as religious affiliation and opinion about abortion, marital status and life satisfaction, race and severity of criminal punishment, and type of medical treatment and degree of recovery from disease. In most applications in which this combination of measurement scales occurs, the ordinal variable is naturally regarded as a response variable. One such setting is when the levels of the nominal variable represent $r$ groups (e.g., religious types, races, regions) that we want to compare with respect to their distribution on an ordered categorical response. To reflect this common direction in the relationship, the measures we propose are asymmetric in nature. We only consider discrete ordinal variables, for which case the data may be summarized in a cross-classification table having $r$ unordered rows and $c$ ordered columns.

In analyzing ordinal variables such as the ones just mentioned, different researchers would probably use quite different scoring patterns if asked to assign numerical values to the levels. Also, it is often advantageous to be able to present summaries or to make conclusions that are not founded on scoring systems or strong distributional assumptions concerning the variables. Because of these considerations, the measures of nominal-ordinal association we propose use only the ordering of the levels of the ordinal variable.

We start by considering the $2 \times c$ table in Section 2. For this case, the measures we suggest are related to more familiar measures of association, nonparametric test statistics, and ridit measures proposed in other contexts. In Section 3 we construct two generalized measures for the $r \times c$ case that can be expressed in terms of probabilities of events concerning two types of pairs of members. One of these measures is an alternative representation of Freeman's (1965) theta index.

Several authors have proposed various types of models for describing cross-classifications of nominal and ordinal variables. A well-fitting model can be used to test the null hypothesis of independence against meaningful alternatives, and its structural form describes the nature of the bivariate relationship. Our emphasis in this article is on developing summary measures of the strength of the association in order to complement such tests and many such models.

Table 1 contains a cross-classification of the estimated United States population in 1975 by region and by size of residential area. In Section 4.3 we illustrate how the measures of association presented in this article can provide useful summaries for these data. The complementarity of these measures to the model-building process is especially important for data like these, which we will see are poorly fit by unsaturated loglinear and logit models for this setting.

2. DICHOTOMOUS NOMINAL VARIABLE

Suppose that sampling units may be classified on a dichotomous nominal variable and on an ordinal variable having $c$ categories labeled $1, 2, \ldots, c$ from least to greatest in degree. Measures of nominal-ordinal association then correspond to measures of the difference between two groups in the distribution of an ordinal variable. Let $p_{ij}$ denote the probability that a randomly selected individual is classified in level $i$ of the nominal variable (which we refer to as group $i$, $i = 1, 2$) and level $j$ of the ordinal variable ($j = 1, 2, \ldots, c$). We let $\rho_{ij} = p_{ij}/p_{i\cdot}$ be the conditional probability that a member is classified in level $j$ of the ordinal variable, given membership in the $i$th group. Finally, let $Y_1$ and $Y_2$ be independent random variables giving the category numbers
of the ordinal variable for members selected at random from group 1 and group 2, respectively.

Table 1. Population Distribution (in thousands) in 1975 by Region and Size of Residential Area, With Statistics for Loglinear and Logit Models

                         Size of Residential Area
Region    Non-metropolitan         Other Metropolitan        Large Metropolitan

North     17,763                   17,290                    22,612
South     24,555                   28,546                    15,000

Expected Frequencies for Loglinear Model and Log Odds of Adjacent Cell Frequencies

North     15,842.1      (.03)      21,131.7      (-.27)      20,691.1
South     26,475.9      (-.15)     24,704.3      (.64)       16,920.8

Expected Frequencies for Logit Model and Observed Logits

North     15,815.7      (-.81)     21,184.6      (.44)       20,664.7
South     26,502.3      (-.57)     24,651.4      (1.26)      16,947.3

NOTE: For simplicity of illustration, we list data for only two of the four regions given in the original table.
Source: U.S. Bureau of the Census (1977), Current Population Reports, P-25, no. 709, Table A.

2.1 Delta

A simple measure of association that uses only the ordering of the levels of the ordinal variable is

$$\delta = P(Y_1 > Y_2) - P(Y_2 > Y_1) = \sum_{i>j} \rho_{1i}\rho_{2j} - \sum_{i<j} \rho_{1i}\rho_{2j}. \tag{2.1}$$

Clearly, $-1 \le \delta \le 1$, with $|\delta| = 1$ if and only if one of the $\{\rho_{ij}, 1 \le j \le c\}$ distributions is entirely below or entirely above the other. When there are but $c = 2$ response categories, $\delta = \rho_{21} - \rho_{11}$, the standard difference of proportions.

Our interest in $P(Y_1 > Y_2) - P(Y_2 > Y_1)$ instead of (say) $P(Y_1 > Y_2)$ alone is so that the range of possible values is centered around a number (zero) that always occurs when the two variables are independent. Alternatively, one might use the measure

$$P(Y_1 > Y_2) + P(Y_1 = Y_2)/2 = (\delta + 1)/2, \tag{2.2}$$

which takes on values between 0 and 1 and equals .5 when the variables are independent (see Klotz 1966).

Not surprisingly, $\delta$ is a special case of other descriptive measures commonly used for ordered categorical data. The asymmetric ordinal measure of association, Somers's d (Somers 1962), is defined to be the difference between the proportion of concordant pairs and the proportion of discordant pairs, out of those pairs of members that are untied on the independent variable. In this setting, if the dichotomous variable is treated as an independent variable, with group 1 arbitrarily considered to be the higher level, Somers's d equals $\delta$. The ridit measure introduced by Bross (1958) is also related to $\delta$. Suppose that ridit scores are assigned to the $c$ categories of the ordinal variable by treating $\{\rho_{2i}\}$ as the "identified distribution." Then the mean ridit score for the $\{\rho_{1j}\}$ distribution is $(\delta + 1)/2$, precisely the measure defined in (2.2). The relationship between ridit analysis and Somers's d measure was noted by Vigderhous (1979).

Given random samples of sizes $n_1$ and $n_2$ from the two groups and frequencies $\{n_{ij}\}$ in the cells, a sample analog of $\delta$ is

$$\hat{\delta} = \Big(\sum_{i>j} n_{1i}n_{2j} - \sum_{i<j} n_{1i}n_{2j}\Big)\Big/ n_1 n_2 = (U - U')/n_1 n_2, \tag{2.3}$$

where $U = \sum_{i>j} n_{1i}n_{2j}$ and $U' = \sum_{i<j} n_{1i}n_{2j}$ are discrete analogs of the statistics on which the Mann-Whitney test is based for continuous data. Equivalently, $\hat{\delta} = [S_1 - E(S_1)]/(n_1 n_2/2)$, where $S_1$ denotes the rank sum for the first group (average ranks being assigned to the levels of the ordinal variable) and $E(S_1) = n_1(n_1 + n_2 + 1)/2$ denotes the expected value of $S_1$ when $\{\rho_{1i} = \rho_{2i}, i = 1, \ldots, c\}$. Thus, $\hat{\delta}$ may be interpreted as the difference between $S_1$ and its expected value when the distributions are identical, divided by the maximum possible value of that difference. It follows that $\hat{\delta} > 0$ if and only if the mean rank for group 1 exceeds the mean rank for group 2.

2.2 Alpha

In some applications, it is informative to describe the relative sizes of $P(Y_1 > Y_2)$ and $P(Y_2 > Y_1)$ in ratio form,

$$\alpha = P(Y_1 > Y_2)/P(Y_2 > Y_1) = \sum_{i>j} \rho_{1i}\rho_{2j} \Big/ \sum_{i<j} \rho_{1i}\rho_{2j} = \sum_{i>j} p_{1i}p_{2j} \Big/ \sum_{i<j} p_{1i}p_{2j}. \tag{2.4}$$

We see that $0 \le \alpha \le \infty$, with $\alpha - 1$ having the same sign as $\delta$. Note that $\alpha = 0$ if $\delta = -1$ and $\alpha = \infty$ if $\delta = 1$, but the reverse implications do not hold unless $P(Y_1 = Y_2) = 0$.

The sample version $\hat{\alpha}$ of $\alpha$ is related to the Mann-Whitney statistics by $\hat{\alpha} = U/U'$. For the special case in which there are only $c = 2$ response categories, $\alpha$ reduces to the odds ratio, $\rho_{12}\rho_{21}/\rho_{11}\rho_{22}$. Like the odds ratio, it is often useful to measure $\alpha$ on the logarithmic scale, since $\ln(\alpha)$ is symmetric around the independence value of zero, and since the distribution of its sample analog converges to normality faster than the distribution of $\hat{\alpha}$. When the two groups are themselves naturally ordered, Goodman and Kruskal's (1954) gamma measure equals $(\alpha - 1)/(\alpha + 1)$.
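The sample measures in (2.3) and (2.4) can be checked numerically. The following is a minimal Python sketch, not code from the article: the function names `delta_alpha_2xc` and `delta_from_rank_sum` are illustrative, and both assume that each group has at least one observation. It computes $U$, $U'$, $\hat{\delta}$, and $\hat{\alpha}$ for a $2 \times c$ table of counts, recovers $\hat{\delta}$ from the midrank sum $S_1$ as a cross-check, and applies the result to the Table 1 counts with North taken as group 1. Section 4.3 reports $\hat{\delta} = .151$ and $\hat{\alpha} = 1.574$ for these data.

```python
def delta_alpha_2xc(row1, row2):
    """Sample delta and alpha for a 2 x c table of counts, following (2.3)-(2.4).

    row1, row2: cell counts for groups 1 and 2, columns ordered low to high.
    """
    n1, n2 = sum(row1), sum(row2)
    # U counts pairs with Y1 > Y2; U' counts pairs with Y1 < Y2.
    u = sum(a * sum(row2[:j]) for j, a in enumerate(row1))
    u_prime = sum(a * sum(row2[j + 1:]) for j, a in enumerate(row1))
    delta = (u - u_prime) / (n1 * n2)
    alpha = u / u_prime if u_prime > 0 else float("inf")
    return delta, alpha


def delta_from_rank_sum(row1, row2):
    """Equivalent form: delta = [S1 - E(S1)] / (n1 * n2 / 2), using midranks."""
    n1, n2 = sum(row1), sum(row2)
    s1, below = 0.0, 0
    for a, b in zip(row1, row2):
        midrank = below + (a + b + 1) / 2  # average rank within this category
        s1 += a * midrank
        below += a + b
    expected = n1 * (n1 + n2 + 1) / 2
    return (s1 - expected) / (n1 * n2 / 2)


# Table 1 counts (in thousands): non-metropolitan, other metropolitan, large metropolitan.
north = [17763, 17290, 22612]
south = [24555, 28546, 15000]

delta_hat, alpha_hat = delta_alpha_2xc(north, south)
print(round(delta_hat, 3), round(alpha_hat, 3))  # 0.151 1.574
```

Swapping the two rows changes the sign of $\hat{\delta}$ and inverts $\hat{\alpha}$, which reflects the arbitrary labeling of the two groups.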


3. NOMINAL-ORDINAL ASSOCIATION

We now generalize $\alpha$ and $\delta$ in order to describe the degree of association between an ordered categorical response variable $Y$ and a nominal variable $X$ having $r$ levels. For the population of interest, let $p_{ij}$ denote the probability that a randomly selected member is classified in level $i$ of $X$ and level $j$ of $Y$ and let $\rho_{ij} = p_{ij}/p_{i\cdot}$, $1 \le i \le r$, $1 \le j \le c$.

3.1 Generalized Delta

We define a generalized version of $\delta$ as a difference of probabilities of two types of pairs, so that it shares the simple structure exhibited by $\delta$ in the dichotomous ($r = 2$) case and by ordinal measures such as Kendall's tau and gamma.

Let $\delta_{ik}$ denote the value of $\delta$ for the $2 \times c$ table obtained by considering levels $i$ and $k$ of the nominal variable $X$ as groups 1 and 2, respectively. Let $Y_i$ be the category number of the ordinal variable $Y$ for a member selected at random from the $i$th level of $X$, $i = 1, 2, \ldots, r$. The pair of responses $Y_i$ and $Y_k$ has consistent order if $Y_i - Y_k$ has the same sign as $\delta_{ik}$. A pair has inconsistent order if it has the opposite sign. Let $C$ and $I$ denote consistent order and inconsistent order, respectively, of a randomly selected pair, and let the symbol $U_X$ mean that the members are classified in different levels of $X$. Finally, let

$$G_i^{+} = \{k: \delta_{ik} > 0\}, \qquad G_i^{-} = \{k: \delta_{ik} < 0\},$$

$$R_{ij}^{(C)} = \sum_{k \in G_i^{+}} \sum_{l<j} p_{kl} + \sum_{k \in G_i^{-}} \sum_{l>j} p_{kl}, \tag{3.1}$$

and

$$R_{ij}^{(I)} = \sum_{k \in G_i^{+}} \sum_{l>j} p_{kl} + \sum_{k \in G_i^{-}} \sum_{l<j} p_{kl}.$$

Notice that $R_{ij}^{(C)}$ is the probability that a pair of members, one in level $i$ of $X$ and level $j$ of $Y$ and the other randomly chosen, will have consistent order. It follows that

$$P(C) = \sum_{i,j} p_{ij}R_{ij}^{(C)}, \qquad P(I) = \sum_{i,j} p_{ij}R_{ij}^{(I)}, \tag{3.2}$$

and

$$P(U_X) = 2\sum_{i<k} p_{i\cdot}p_{k\cdot}.$$

We now define generalized delta as

$$\delta = P(C \mid U_X) - P(I \mid U_X) = \sum_{i,j} p_{ij}\big(R_{ij}^{(C)} - R_{ij}^{(I)}\big) \Big/ 2\sum_{i<k} p_{i\cdot}p_{k\cdot}. \tag{3.3}$$

When $r = 2$, note that the generalized $\delta$ equals $|\delta|$ as defined in (2.1).

This measure can be shown to equal a weighted average of the absolute values of the $\delta$ values for the $\binom{r}{2}$ $2 \times c$ tables representing the various pairs of levels of $X$; namely,

$$\delta = \sum_{i<k} p_{i\cdot}p_{k\cdot}\,|\delta_{ik}| \Big/ \sum_{i<k} p_{i\cdot}p_{k\cdot}. \tag{3.4}$$

The weight assigned to the $|\delta_{ik}|$ for a particular pair of levels of $X$ is the probability that two levels of $X$ chosen at random (according to the $\{p_{i\cdot}\}$ distribution) would yield that pair, given that they are different. It follows from this representation that $0 \le \delta \le 1$, with $\delta = 0$ iff all $\delta_{ik} = 0$ and $\delta = 1$ iff all $|\delta_{ik}| = 1$. This measure can be easily generalized to describe partial association by forming a weighted average of the $\delta$ values that are computed within combinations of levels of control variables.

Under full or independent multinomial sampling, the sample analog of $\delta$ is asymptotically normally distributed. The asymptotic variance formulas for the measures presented in this paper are given in the Appendix.

3.2 Generalized Alpha

In generalizing from two to several levels of $X$, we extended $\delta = P(Y_1 > Y_2) - P(Y_2 > Y_1)$ to $\delta = P(C \mid U_X) - P(I \mid U_X)$. Similarly, we generalize $\alpha = P(Y_1 > Y_2)/P(Y_1 < Y_2)$ to

$$\alpha = P(C \mid U_X)/P(I \mid U_X) = \sum_{i,j} p_{ij}R_{ij}^{(C)} \Big/ \sum_{i,j} p_{ij}R_{ij}^{(I)} = P(C)/P(I). \tag{3.5}$$

It is $\alpha$ times more likely for a randomly selected pair of members to have consistent order than to have inconsistent order. Whereas $0 \le \delta \le 1$, we have $1 \le \alpha \le \infty$ with $\alpha = 1$ iff $\delta = 0$ and $\alpha = \infty$ if (but not only if) $\delta = 1$. If it is preferred to use a measure having range $[0, 1]$, one obvious alternative is the inverse of $\alpha$. Also, we can express $\alpha$ as $\alpha = (1 + \gamma)/(1 - \gamma)$, where $\gamma = [P(C) - P(I)]/[P(C) + P(I)]$ is the difference in the proportion of consistent pairs and the proportion of inconsistent pairs, out of those pairs untied on both variables. Note that $\alpha$ may attain its upper limit for any values of $r$ and $c$, whereas it is impossible for $\delta = 1$ when $r > c$.

3.3 Use With Stochastic Orderings

The measures $\delta$ and $\alpha$ are most meaningful when the levels of $X$ are stochastically ordered on the ordinal variable. Suppose that level $i$ of $X$ is stochastically larger on $Y$ than level $k$ of $X$; that is, $\sum_{l=1}^{j} \rho_{il} \le \sum_{l=1}^{j} \rho_{kl}$ for $1 \le j \le c$. Then it may be shown that $\delta_{ik} \ge 0$ and $\ln \alpha_{ik} \ge 0$, and for any level $m$ of $X$, $\delta_{im} \ge \delta_{km}$ and $\alpha_{im} \ge \alpha_{km}$. If all $r$ levels of $X$ are stochastically ordered on $Y$, it follows that a labeling of these levels exists for which $\delta_{ik} \ge 0$ whenever $i \ge k$. If the nominal variable $X$ were instead ordinal and had levels ordered from low to high according to this labeling, a pair having consistent order would be concordant and a pair having inconsistent order would be discordant. In that case $\delta$ corresponds to Somers's d with $X$ as the independent variable, and $\alpha$ is related to gamma as in the $2 \times c$ case.
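The generalized measures of Sections 3.1 and 3.2 can be sketched in the same way. The Python below is an illustrative implementation under stated assumptions, not the author's code: the names `pair_counts` and `generalized_delta_alpha` are hypothetical, sample proportions $n_{ij}/n$ stand in for the $p_{ij}$, and every row is assumed to have a positive total. It builds $R_{ij}^{(C)}$ and $R_{ij}^{(I)}$ as in (3.1), sums them as in (3.2), and returns the sample versions of (3.3) and (3.5).

```python
def pair_counts(row_i, row_k):
    """Counts of pairs (one member per row) with Y_i > Y_k and with Y_i < Y_k."""
    greater = sum(a * sum(row_k[:j]) for j, a in enumerate(row_i))
    less = sum(a * sum(row_k[j + 1:]) for j, a in enumerate(row_i))
    return greater, less


def generalized_delta_alpha(table):
    """Sample versions of generalized delta (3.3) and alpha (3.5) for an r x c table.

    table: list of rows of counts, columns ordered low to high.
    Assumes every row has a positive total.
    """
    r, c = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    p = [[x / n for x in row] for row in table]
    # sign[i][k] is the sign of delta_ik, which defines G_i^+ and G_i^-.
    sign = [[0] * r for _ in range(r)]
    for i in range(r):
        for k in range(r):
            if i != k:
                greater, less = pair_counts(table[i], table[k])
                sign[i][k] = (greater > less) - (greater < less)
    prob_c = prob_i = 0.0
    for i in range(r):
        for j in range(c):
            r_c = r_i = 0.0  # R_ij^(C) and R_ij^(I) from (3.1)
            for k in range(r):
                below, above = sum(p[k][:j]), sum(p[k][j + 1:])
                if sign[i][k] > 0:
                    r_c += below
                    r_i += above
                elif sign[i][k] < 0:
                    r_c += above
                    r_i += below
            prob_c += p[i][j] * r_c
            prob_i += p[i][j] * r_i
    prob_ux = 1.0 - sum(sum(row) ** 2 for row in p)  # equals 2 * sum_{i<k} p_i. p_k.
    delta = (prob_c - prob_i) / prob_ux
    alpha = prob_c / prob_i if prob_i > 0 else float("inf")
    return delta, alpha


# With r = 2 the result reduces to |delta| and the corresponding alpha of Section 2.
table1 = [[17763, 17290, 22612], [24555, 28546, 15000]]
delta_hat, alpha_hat = generalized_delta_alpha(table1)
print(round(delta_hat, 3), round(alpha_hat, 3))  # 0.151 1.574
```

Pairs of rows with $\delta_{ik} = 0$ belong to neither $G_i^{+}$ nor $G_i^{-}$, so in this sketch they contribute to neither $P(C)$ nor $P(I)$. The result does not depend on the order in which the rows are listed, as expected for a nominal variable.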


3.4 Other Approaches

Although our formulation of $\delta$ as a difference between two probabilities seems to be original, we note that Freeman (1965, p. 112) defined a related sample measure, called theta (see also Freeman 1976 and Hubert 1974). Additional approaches to measuring nominal-ordinal association may be found in Agresti (1978), Crittenden and Montgomery (1980), Goodman and Kruskal (1959, Secs. 4.4 and 4.5), Jacobson (1972), Rehak (1976), and Sarndal (1974).

4. MODELS FOR NOMINAL-ORDINAL DATA

In the past decade, several authors have proposed models for describing patterns of associations in nominal-ordinal cross-classification tables. These include Andrich (1979), Bock (1975, pp. 541-550), Duncan (1979), Fienberg (1977, pp. 52-58), Goodman (1979), Haberman (1974), McCullagh (1979, 1980) and Williams and Grizzle (1972). In this section we briefly describe the two most commonly considered model types, logit and loglinear. We then discuss how the measures studied in this paper, which are not model based, are appropriate for use in diverse situations described by these and many other models.

4.1 A Logit Model

Let $F_{ij} = \sum_{l=1}^{j} \rho_{il}$ be the $j$th cumulative probability for category $i$ of the nominal variable, $1 \le i \le r$, $1 \le j \le c$. One way we can use the ordinal nature of the column classification without resorting to scoring methods is by constructing a model, using the accumulated logits $\ln[F_{ij}/(1 - F_{ij})]$, $j = 1, 2, \ldots, c - 1$ within each row. The most popular model of this type is the unsaturated one in which the difference between the distributions for any pair of rows is constant across the columns on this logit scale. That is,

$$\ln[F_{ij}/(1 - F_{ij})] = \ln[F_{kj}/(1 - F_{kj})] + \Delta_{ik}, \quad j = 1, 2, \ldots, c - 1, \tag{4.1}$$

for all pairs $1 \le i < k \le r$. This model has been suggested by several authors, including Clayton (1974), Simon (1974), McCullagh (1979, 1980), and Williams and Grizzle (1972). It may equivalently be described by noting that the odds ratios

$$\frac{(\rho_{i1} + \cdots + \rho_{ij})/(\rho_{i,j+1} + \cdots + \rho_{ic})}{(\rho_{k1} + \cdots + \rho_{kj})/(\rho_{k,j+1} + \cdots + \rho_{kc})} = \exp(\Delta_{ik}), \quad 1 \le j \le c - 1 \tag{4.2}$$

are identical for all collapsings of each $2 \times c$ subtable into a $2 \times 2$ table. Another formulation of the model, given by Simon (1974), is

$$\ln[F_{ij}/(1 - F_{ij})] = \alpha_i + \beta_j. \tag{4.3}$$

The logistic differences are $\{\Delta_{ik} = \alpha_i - \alpha_k\}$ in this parameterization.

Although model (4.1) is intuitively appealing, it is not always suitable, even when there are "nice" underlying distributions differing only in location. Fleiss (1970) showed, for example, that for two normal distributions with equal variances, the value of the odds ratio computed for a collapsing into a $2 \times 2$ table depends greatly on how the cutting point is chosen for forming the dichotomy. McCullagh (1980) suggested alternative models that also do not require scores. These models assume constant differences $\{\Delta_{ik}\}$ between distribution functions or their complements on a log-log scale.

4.2 A Loglinear Model

This model assumes that a meaningful set of ordered scores $\{V_j\}$ can be assigned to the columns. It contains an interaction term representing a deviation from independence that changes linearly within each row; specifically,

$$\ln p_{ij} = \gamma_i + \lambda_j + \alpha_i V_j, \quad 1 \le i \le r, \; 1 \le j \le c. \tag{4.4}$$

Various formulations of this model have been suggested by Simon (1974), Haberman (1974), Goodman (1979), Duncan (1979), and Fienberg (1977, pp. 52-55).

An interesting implication of model (4.4) is that

$$\ln(p_{ij}p_{kl}/p_{il}p_{kj}) = (\alpha_i - \alpha_k)(V_j - V_l). \tag{4.5}$$

In other words, for any pair of rows, the odds ratio is constant for all pairs of columns that are equidistant in score. For the equal-interval scores $\{V_j = j\}$, the log odds ratio equals $\Delta_{ik} = \alpha_k - \alpha_i$ for all pairs of adjacent columns ($l = j + 1$). Thus, any pair of rows has a constant difference between log odds of adjacent cell proportions,

$$\Delta_{ik} = \ln(p_{ij}/p_{i,j+1}) - \ln(p_{kj}/p_{k,j+1}), \quad 1 \le j \le c - 1. \tag{4.6}$$

4.3 Measuring Association

When model (4.1) or (4.4) provides an adequate fit to a table, the strength of the association can be quantified by the magnitude of the $\{\Delta_{ik}\}$ "difference" parameters. (Notice that $r - 1$ of these parameters (and hence the $\{\alpha_i\}$) determine the entire set of $\{\Delta_{ik}\}$.) Since the meaning of $\Delta_{ik}$ depends on the particular structural form for the model, however, the magnitudes of these parameters cannot be compared across differing structural models. This makes it difficult to compare associations in two or more tables for which different structural models are applicable. In addition, many cross-classifications occur in practice for which (a) none of the commonly used structural models provides an adequate fit, or (b) if a good-fitting model is obtained by trying several structural types, the result of "fishing for structure" may be that the same model type is inadequate when applied to other cross-classifications of the same variables.

The delta and alpha measures of nominal-ordinal association, not being model based, can often be used for comparing strengths of association across tables even if no single, simple structural model form is generally applicable to those tables.
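As a concrete check on what models (4.1) and (4.4) require, the observed cumulative logits and the observed log odds of adjacent cells listed in parentheses in Table 1 can be recomputed from the counts. This is a small Python sketch under the assumption that all cell counts are positive; the helper names `cumulative_logits` and `adjacent_log_odds` are illustrative and do not come from the article.

```python
import math


def cumulative_logits(row):
    """Observed cumulative logits ln[F_j / (1 - F_j)], j = 1, ..., c - 1, for one row."""
    n, cum, logits = sum(row), 0, []
    for count in row[:-1]:
        cum += count
        logits.append(math.log(cum / (n - cum)))
    return logits


def adjacent_log_odds(row):
    """Observed log odds ln(n_j / n_{j+1}) of adjacent cells, j = 1, ..., c - 1."""
    return [math.log(row[j] / row[j + 1]) for j in range(len(row) - 1)]


# Table 1 counts (in thousands): non-metropolitan, other metropolitan, large metropolitan.
north = [17763, 17290, 22612]
south = [24555, 28546, 15000]

# Under model (4.1) these North-minus-South differences would be the same for every j.
logit_diff = [a - b for a, b in zip(cumulative_logits(north), cumulative_logits(south))]
# Under model (4.4) with equal-interval scores, the differences in (4.6) would be constant.
adjacent_diff = [a - b for a, b in zip(adjacent_log_odds(north), adjacent_log_odds(south))]

print([round(x, 2) for x in cumulative_logits(north)])  # [-0.81, 0.44]
print([round(x, 2) for x in adjacent_log_odds(south)])  # [-0.15, 0.64]
print([round(x, 2) for x in logit_diff], [round(x, 2) for x in adjacent_diff])
```

For these counts neither set of differences is close to constant across the two column cutpoints, and the two adjacent-cell differences have opposite signs.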


These measures have a certain robustness in the sense that they are applicable in a broad range of settings that would encompass a variety of models. We showed in Section 3.3 that they are naturally suited to systems of stochastically ordered distributions. Now it can be seen that any cross-classification table for which the logistic model (4.1), loglinear model (4.4), or one of McCullagh's (1980) log-log models fits perfectly is such that the distributions within the rows are stochastically ordered. Thus, the delta and alpha measures are suitable for use whenever one of these important model types is deemed appropriate. Their robustness is illustrated by the fact that if any of these models fits a particular table perfectly, then the $\{\Delta_{ik}\}$ for that model will be matched in sign by the $\{\delta_{ik}\}$ and $\{\ln \alpha_{ik}\}$. When different model types fit different tables, these measures give us a common basis for comparing associations and summarizing the results of the models.

Table 1, introduced earlier, illustrates the above points. The loglinear model (4.4) provides a poor fit to these data. Goodness-of-fit tests are of little interest for these estimated population frequencies. Nevertheless, the likelihood ratio chi-squared statistic $G^2 = 2.08 \times 10^6$, based on df = 1, is large even for the size of this data set. Closer inspection reveals that the two differences between log odds of adjacent cell frequencies are quite different and even have different signs. The logit model provides a similar fit and also is inadequate, with $G^2 = 2.14 \times 10^6$ based on df = 1. In fact, note that the magnitudes of the estimated expected frequencies in each row for these models even differ in order from the observed frequencies (e.g., for the North row, the largest expected frequency occurs in the cell with the smallest observed frequency). Poor fits are also obtained with other unsaturated models we have considered, such as the log-log models.

The distributions in the two rows of Table 1 are stochastically ordered, however, so $\delta$ and $\alpha$ provide meaningful summaries. The tendency for people in the North to be more highly metropolitan is reflected by the simply interpretable values $\hat{\delta} = .151$ and $\hat{\alpha} = 1.574$. For example, $\hat{\alpha} = 1.574$ means that for a randomly selected pair from Table 1 (one observation from each row), it is 1.574 times as likely that the member from the North lives in the more highly metropolitan area than it is that the member from the South lives in the more highly metropolitan area.

These remarks are strengthened by the fact that similar behavior occurs when data from other years are analyzed and when region is measured with more categories. For example, the loglinear and logit models provide poor fits to the corresponding data from 1960 ($G^2 = 2.11 \times 10^6$ and $G^2 = 2.30 \times 10^6$, respectively). In each year the $G^2$ values are even larger when four levels (Northeast, North central, South, West) are used for region. However, in each year the regions have the stochastic ordering NE > W > NC > S on size of residential area, so the $\{\hat{\delta}_{ij}\}$ and $\{\hat{\alpha}_{ij}\}$ provide simple summaries. Their use also results in interesting and substantive conclusions. For example, all $\{|\hat{\delta}_{ij}|\}$ and $\{|\ln \hat{\alpha}_{ij}|\}$ from 1960 exceed the corresponding values from 1975. This indicates that differences between regions in the distribution of size of residential area tended to diminish over these 15 years. This slight decrease in variability among the regions is also reflected by the smaller values in 1975 of the summary measures $\hat{\delta}$ and $\hat{\alpha}$ (.224 and 2.008, respectively, compared with .259 and 2.244 in 1960). Within each region one could also compute $\hat{\delta}$ or $\hat{\alpha}$ for pairs of years to summarize change toward metropolitan populations.

The above remarks are not intended as a criticism of the model-building approach. It is important to attempt to describe table structures, and we believe that these measures help to complement that process. They describe strength of association on a common basis for the class of tables of stochastically ordered distributions, different elements of which may be well fit by different models or by none of the simple models in current use.

APPENDIX: DERIVATIONS OF ASYMPTOTIC SAMPLING DISTRIBUTIONS

The population values of $\delta$ and $\alpha$ can be expressed as $\zeta = \nu/\Delta$, where $\nu$ and $\Delta$ are functions of the $\{p_{ij}\}$. Let $\hat{\zeta}$ denote the sample value of $\zeta$, $\phi_{ij} = \nu(\partial\Delta/\partial p_{ij}) - \Delta(\partial\nu/\partial p_{ij})$, and $\bar{\phi} = \sum_{i,j} p_{ij}\phi_{ij}$. Using the "delta method," Goodman and Kruskal (1972) showed that for full multinomial sampling, $\sqrt{n}(\hat{\zeta} - \zeta)/\sigma \xrightarrow{d} N(0, 1)$ as the sample size $n \to \infty$, where

$$\sigma^2 = \sum_{i,j} p_{ij}(\phi_{ij} - \bar{\phi})^2/\Delta^4. \tag{A.1}$$

For independent multinomial sampling with the $\{p_{i\cdot}\}$ known and with sampling proportions $\{w_i\}$, the same asymptotic distribution occurs but with

$$\sigma^2 = \sum_i \frac{1}{w_i} \sum_j \rho_{ij}(\phi_{ij}^{+} - \bar{\phi}_i^{+})^2/\Delta^4, \tag{A.2}$$

where $\phi_{ij}^{+} = \nu(\partial\Delta/\partial\rho_{ij}) - \Delta(\partial\nu/\partial\rho_{ij})$ and $\bar{\phi}_i^{+} = \sum_j \rho_{ij}\phi_{ij}^{+}$. Substitution of the sample proportions into either asymptotic variance formula yields a consistent estimate $\hat{\sigma}^2$ of $\sigma^2$, which can be used in constructing confidence intervals for $\zeta$. In this Appendix we give the expressions for $\phi_{ij}$ and $\phi_{ij}^{+}$ to be inserted into (A.1) and (A.2) for the cases $\zeta = \delta$ and $\zeta = \alpha$.

We first consider the asymptotic distribution of the sample analog $\hat{\delta}$ of $\delta$ for full multinomial sampling. The consistency of all sample cell proportions implies that all $\hat{\delta}_{ik} \xrightarrow{p} \delta_{ik}$. We assume that all $\delta_{ik} \ne 0$, which implies that

$$P(\hat{G}_i^{+} = G_i^{+} \text{ and } \hat{G}_i^{-} = G_i^{-}, \text{ all } i) \to 1 \text{ as } n \to \infty. \tag{A.3}$$

It follows from a lemma in Goodman and Kruskal (1963, p. 357) that for asymptotic purposes we may treat $\hat{G}_i^{+} = G_i^{+}$ and $\hat{G}_i^{-} = G_i^{-}$. Now, letting $\delta = \nu/\Delta$ with $\Delta = 2\sum_{i<k} p_{i\cdot}p_{k\cdot}$, we obtain $\phi_{ij} = 2\nu(1 - p_{i\cdot}) - 2\Delta(R_{ij}^{(C)} - R_{ij}^{(I)})$ and $\bar{\phi} = 0$.
If $\delta = 1$, then $R_{ij}^{(C)} = 1 - p_{i\cdot}$ and $R_{ij}^{(I)} = 0$ all $i, j$, so that $\sigma^2 = 0$; then $\hat{\delta} = 1$ and $\hat{\sigma}^2 = 0$ with probability one.

In applications in which we are comparing several groups on an ordinal response, the sampling scheme is often independent multinomial within the levels of $X$. In that case we assume the $\{p_{i\cdot}\}$ are known, and we obtain $\phi_{ij}^{+} = -2p_{i\cdot}\Delta(R_{ij}^{(C)} - R_{ij}^{(I)})$ and $\bar{\phi}_i^{+} = -2\Delta\sum_j p_{ij}(R_{ij}^{(C)} - R_{ij}^{(I)})$.

Next we consider the sample version $\hat{\alpha}$ of $\alpha$, under the assumption that all $\alpha_{ij} \ne 1$. Letting $\alpha = \nu/\Delta$ with $\Delta = P(I)$, we obtain $\phi_{ij} = 2\nu R_{ij}^{(I)} - 2\Delta R_{ij}^{(C)}$ and $\bar{\phi} = 0$ for the case of full multinomial sampling. For the case of independent multinomial sampling with known $\{p_{i\cdot}\}$, we obtain $\phi_{ij}^{+} = 2\nu p_{i\cdot}R_{ij}^{(I)} - 2\Delta p_{i\cdot}R_{ij}^{(C)}$ and $\bar{\phi}_i^{+} = 2\nu\sum_j p_{ij}R_{ij}^{(I)} - 2\Delta\sum_j p_{ij}R_{ij}^{(C)}$.

Being a difference rather than a ratio, $\ln(\hat{\alpha})$ tends to converge faster to its limiting normal distribution. Its variance can be estimated for large samples by $\hat{\sigma}^2/n\hat{\alpha}^2$. Thus, one can form the $100(1 - p)$ percent confidence interval $\ln(\hat{\alpha}) \pm z_{p/2}\,\hat{\sigma}/\hat{\alpha}\sqrt{n}$ for $\ln(\alpha)$ and then exponentiate endpoints to obtain a corresponding confidence interval for $\alpha$.

[Received September 1978. Revised March 1981.]

REFERENCES

AGRESTI, ALAN (1978), "Descriptive Measures for Rank Comparisons of Groups," 1978 Proceedings of the American Statistical Association, Social Statistics Section, 585-590.
ANDRICH, DAVID (1979), "A Model for Contingency Tables Having an Ordered Response Classification," Biometrics, 35, 403-415.
BOCK, R. DARRELL (1975), Multivariate Statistical Methods in Behavioral Research, New York: McGraw-Hill.
BROSS, IRWIN D.J. (1958), "How to Use Ridit Analysis," Biometrics, 14, 18-38.
CLAYTON, D.G. (1974), "Some Odds Ratio Statistics for the Analysis of Ordered Categorical Data," Biometrika, 61, 525-531.
CRITTENDEN, KATHLEEN S., and MONTGOMERY, ANDREW C. (1979), "A System of Paired Asymmetric Measures of Association for Use With Ordinal Dependent Variables," Social Forces, 58, 1178-1194.
DUNCAN, OTIS DUDLEY (1979), "How Destination Depends on Origin in the Occupational Mobility Table," American Journal of Sociology, 84, 793-803.
FIENBERG, STEPHEN E. (1977), The Analysis of Cross-Classified Categorical Data, Cambridge, Mass.: MIT Press.
FLEISS, JOSEPH L. (1970), "On the Asserted Invariance of the Odds Ratio," British Journal of Preventive and Social Medicine, 24, 45-46.
FREEMAN, LINTON C. (1965), Elementary Applied Statistics, New York: John Wiley.
FREEMAN, LINTON C. (1976), "A Further Note on Freeman's Measure of Association," Psychometrika, 41, 273-275.
GOODMAN, LEO A. (1979), "Multiplicative Models for the Analysis of Occupational Mobility Tables and Other Kinds of Cross-Classification Tables," American Journal of Sociology, 84, 804-819.
GOODMAN, LEO A., and KRUSKAL, WILLIAM H. (1954), "Measures of Association for Cross Classifications," Journal of the American Statistical Association, 49, 723-764.
GOODMAN, LEO A., and KRUSKAL, WILLIAM H. (1959), "Measures of Association for Cross Classifications, II: Further Discussion and References," Journal of the American Statistical Association, 54, 123-163.
GOODMAN, LEO A., and KRUSKAL, WILLIAM H. (1963), "Measures of Association for Cross Classifications, III: Approximate Sampling Theory," Journal of the American Statistical Association, 58, 310-364.
GOODMAN, LEO A., and KRUSKAL, WILLIAM H. (1972), "Measures of Association for Cross Classifications, IV: Simplification of Asymptotic Variances," Journal of the American Statistical Association, 67, 415-421.
HABERMAN, SHELBY J. (1974), "Log-Linear Models for Frequency Tables With Ordered Classifications," Biometrics, 30, 589-600.
HUBERT, LAWRENCE (1974), "A Note on Freeman's Measure of Association for Relating an Ordered to an Unordered Factor," Psychometrika, 39, 517-520.
JACOBSON, PERRY E., JR. (1972), "Applying Measures of Association to Nominal-Ordinal Data," Pacific Sociological Review, 15, 41-60.
KLOTZ, JEROME H. (1966), "The Wilcoxon, Ties, and the Computer," Journal of the American Statistical Association, 61, 772-787.
McCULLAGH, PETER (1979), "The Use of the Logistic Function in the Analysis of Ordinal Data," Proceedings of the International Statistical Institute, Manila.
McCULLAGH, PETER (1980), "Regression Models for Ordinal Data," Journal of the Royal Statistical Society, Ser. B, 42, 109-142.
REHAK, JAN (1976), "Základní Deskriptivní Míry pro Rozložení Ordinálních Dat," Zvláštní otisk ze Sociologického časopisu, 416-431.
SARNDAL, C.E. (1974), "A Comparative Study of Association Measures," Psychometrika, 39, 165-187.
SIMON, GARY (1974), "Alternative Analyses for the Singly-Ordered Contingency Table," Journal of the American Statistical Association, 69, 971-976.
SOMERS, ROBERT H. (1962), "A New Asymmetric Measure of Association for Ordinal Variables," American Sociological Review, 27, 799-811.
VIGDERHOUS, GIDEON (1979), "Equivalence Between Ordinal Measures of Association and Tests of Significant Differences Between Samples," Quality and Quantity, 13, 187-201.
WILLIAMS, O. DALE, and GRIZZLE, JAMES E. (1972), "Analysis of Contingency Tables Having Ordered Response Categories," Journal of the American Statistical Association, 67, 55-63.
