Stepp Et Ali PDTRT 2012 - Integrating Competing Dimensional Models of Personality - Linking The SNAP, TCI, and NEO Using Item Response Theory
6 authors, including:
All content following this page was uploaded by Timothy J Trull on 16 May 2014.
Paul A. Pilkonis
University of Pittsburgh School of Medicine
Mounting evidence suggests that several inventories assessing both normal personality and personality disorders measure common dimensional personality traits (i.e., Antagonism, Constraint, Emotional Instability, Extraversion, and Unconventionality), albeit providing unique information along the underlying trait continuum. We used Widiger and Simonsen’s (2005) pantheoretical integrative model of dimensional personality assessment as a guide to create item pools. We then used Item Response Theory (IRT) to compare the assessment of these five personality traits across three established dimensional measures of personality: the Schedule for Nonadaptive and Adaptive Personality (SNAP), the Temperament and Character Inventory (TCI), and the Revised NEO Personality Inventory (NEO PI-R). We found that items from each inventory map onto these five common personality traits in predictable ways. The IRT analyses, however, documented considerable variability in the item and test information derived from each inventory. Our findings support the notion that the integration of multiple perspectives will provide greater information about personality while minimizing the weaknesses of any single instrument.
A variety of dimensional personality inventories have been advanced by several research groups and available data do not clearly support one proposal over another (Clark, 2007). Moreover, many empirical articles attempting to map the structure of personality originate from a particular theory or instrument, with relatively little cross-talk among theoretical perspectives (but see Clark & Livesley, 2002; Widiger, Livesley, & Clark, 2009, for examples). There is much evidence to suggest that dimensional personality inventories measure five common underlying traits, namely Antagonism, Constraint, Emotional Instability, Extraversion, and Unconventionality (e.g., see Widiger, 2011a). Widiger and Simonsen (2005) provide a theoretical framework toward an integrative dimensional model of personality, which highlights this common hierarchical structure found across 18 different dimensional personality inventories. We propose to use Widiger and Simonsen’s (2005) theoretical work as a heuristic to examine the links between three common dimensional personality inventories: the Schedule for Nonadaptive and Adaptive Personality-2nd edition (SNAP-2; Clark, Simms, Wu, & Casillas, in press), the Temperament and Character Inventory (TCI; Cloninger, Przybeck, Svrakic, & Wetzel, 1994), and the Revised NEO Personality Inventory (NEO PI-R; Costa & McCrae, 1992a).

This article was published Online First November 7, 2011.
Stephanie D. Stepp, Lan Yu, Michael N. Hallquist, and Paul A. Pilkonis, Department of Psychiatry, University of Pittsburgh School of Medicine; Joshua D. Miller, Department of Psychology, University of Georgia; Timothy J. Trull, Department of Psychological Sciences, University of Missouri.
Correspondence concerning this article should be addressed to Stephanie D. Stepp, Department of Psychiatry, University of Pittsburgh School of Medicine, Western Psychiatric Institute and Clinic, 3811 O’Hara St., Pittsburgh, PA 15213. E-mail: [email protected]

Common Dimensional Personality Traits

Theoretical and empirical reviews of the latent structure of personality provide increasing evidence for the salience of five dimensional personality traits that cut across theoretical perspectives and inventories (Clark, 2007; Krueger, 2005; Widiger, 2011a). Available evidence also supports the notion that abnormal and normal personality share a common hierarchical structure, with maladaptive traits representing extreme levels of normal traits (Markon, Krueger, & Watson, 2005; O’Connor, 2002). In addition, the redundancy across inventories observed in several incremental validity studies (e.g., Reynolds & Clark, 2001; Stepp, Trull, Burr, Wolfenstein, & Vieth, 2005) suggests that integrating items from competing inventories onto a common scale may be the most fruitful path for uncovering information about underlying personality traits.

Based on a thorough review of the empirical literature, Widiger and Simonsen (2005) provided a schematic that maps most of the 18 proposals for dimensional personality assessment onto five broad traits of Antagonism, Constraint, Emotional Instability, Extraversion, and Unconventionality. For example, the Mistrust scale from the SNAP and the Agreeableness scale from the NEO PI-R are hypothesized to measure an underlying Antagonism factor. Although evidence generally supports the notion of a common latent structure across personality inventories (Clark & Livesley, 2002; Markon et al., 2005), further study is required to validate the assertion that a shared factor structure underlies most of the 18 dimensional personality inventories. Moreover, the essential notion that scales developed from seemingly different theoretical perspectives are adequately represented by a single dimension needs to be validated. For example, does the psychometric evidence support the assertion that items from the Negative Temperament scale from the SNAP are isomorphic with items from the Neuroticism scale from the NEO PI-R, or are these scales better conceptualized as two related, but distinct, factors?

In addition to grappling with issues of factor structure and conceptual isomorphism across theoretical perspectives, an integrative view of dimensional personality assessment must address the pragmatic issue of comparing information provided by each item from competing inventories to enable researchers and clinicians to assess personality traits more efficiently, precisely, and flexibly. In view of the shared latent structure of abnormal and normal personality traits, the integration of multiple inventories will be best served by a final product that provides information across the range of trait levels (i.e., trait information at normal, subclinical, and clinical levels). Thus, it may be sensible to combine items from self-report inventories developed to assess maladaptive traits (e.g., the SNAP), with items gleaned from normal personality instruments (e.g., the NEO PI-R), as clinical inventories may provide better information about extremely high or low trait levels. That said, some have suggested that measures of maladaptive personality are redundant with normal personality inventories (Costa & McCrae, 1992b) and some recent evidence supports this assertion (e.g., Walton, Roberts, Krueger, Blonigen, & Hicks, 2008).

Previous research has utilized joint factor analyses of normal and abnormal personality instruments to argue that a common factor structure underlies different personality inventories and that normal and abnormal personality fall on a common continuum (Markon et al., 2005; Watson, Clark, & Chmielewski, 2008). Although factor analysis is a useful tool for understanding whether common traits are shared across personality inventories, it does not provide any information about whether abnormal and normal inventories yield information at different points along the trait continuum (i.e., ranging from normal to clinical), nor does it yield detailed feedback about which inventories (or items) provide the most information about the latent traits.

Item Response Theory

Item Response Theory (IRT) represents a class of modern psychometric techniques that model levels of a putative latent trait (e.g., Neuroticism) as a function of item characteristics, in which the probability of correct item response is modeled as a function of latent trait theta (θ) and one or more item parameters (Embretson & Reise, 2000; Lord, 1980). Of particular import
to our study, IRT methods provide specific feedback about the position along θ, where each item or inventory provides the most psychometric information about the trait. For example, an item tapping intense expressions of anger (e.g., throwing objects) would likely provide more information about Emotional Instability at high levels of θ than an item about occasional arguments with romantic partners. One advantage of IRT is that individuals’ θ estimates are independent of the number of items and the specific items used in the population for calibration. Thus, even when individuals take different sets of items with different response options (resulting in different patterns of missingness), the data can be combined and concurrently calibrated or linked, estimating item parameters across three measures within a single latent trait model (i.e., on one single computer run; Lord, 1980).

Item- and test-information functions in IRT are estimated on the same latent trait scale, yielding psychometric information that is directly comparable across inventories (Reise & Henson, 2003). For example, if one were attempting to develop an IRT-informed integrative scale of Extraversion, it would be important to know which items provide maximum information about the latent trait across the broadest range of θ levels, from extremely shy to extremely outgoing. Because item characteristic estimates are not tied to particular inventories or theories per se, it is likely that items from several inventories would provide the most valid trait scale, and inclusion of items from both normal and abnormal inventories may be crucial to ensure that average and extreme trait levels are represented. In summary, IRT models are ideal for the present purposes because (1) they yield detailed information about the position along the trait continuum accounted for by particular inventories (or items from those inventories), (2) latent trait estimates for items and inventories are directly comparable across scales, and (3) they provide specific feedback about which tests provide the most valid information about the underlying traits.

Integrating Dimensional Personality Inventories

To identify the items that optimally measure five common dimensional personality traits from competing dimensional personality assessment inventories, we selected three measures that represent different approaches to test construction: the SNAP-2, which was derived from a bottom-up approach to measure personality pathology; the TCI, which was rationally derived from Cloninger’s (1987) psychobiological theory of personality; and the NEO PI-R, which was derived from factor analytic work on normal personality.

Widiger and Simonsen (2005) developed a conceptual framework for creating a pantheoretical scale for assessing five common personality traits. This study sought to extend their conceptual work by examining how three widely used personality inventories map onto these five common personality traits. We sought to demonstrate that items drawn from different inventories map onto the five personality traits as predicted by Widiger and Simonsen (2005) by linking measures using IRT models. To demonstrate this, we first refined and reorganized the scales to include only the items that provided the most information about the underlying trait and examined the proportion of items contributed by each inventory in the measurement of the underlying trait to determine which inventory provided the most information regarding the underlying trait. For example, does the NEO PI-R provide more information about Conscientiousness compared to the SNAP-2 and TCI? Second, we linked these refined item pools from the individual inventories to create five pantheoretical scales that optimally measure Antagonism, Constraint, Emotional Instability, Extraversion, and Unconventionality. To develop these item pools, we used IRT to identify the best performing items for each personality trait and linked the SNAP-2, TCI, and NEO PI-R.

Method

Participants and procedures. Participants were recruited from undergraduate psychology courses at two universities, psychiatric inpatient, psychiatric outpatient, medical, and community settings. Written informed consent was obtained from all participants before administration of the questionnaires. Undergraduate student participants were compensated with credit toward their psychology course grade and patient and community participants were compensated monetarily for their participation.
tion) that contains 390 true-false self-report items. Using items drawn from DSM–III PD criteria, personality pathology concepts from other research programs (e.g., psychopathy; Cleckley, 1964), and trait-like symptoms of Axis I conditions (e.g., dysthymia), raters derived 22 item clusters, which were subsequently factor analyzed. Results from exploratory factor analysis indicated that 12 dimensions of pathological personality traits characterized the SNAP items, and the best indicators of these dimensions were retained to form personality pathology scales. Subsequent validation of the SNAP indicates the measure has strong psychometric properties, is correlated with DSM–IV PDs in predicted ways, and successfully distinguishes among distinct forms of personality pathology (Morey et al., 2003).

Temperament and Character Inventory. The TCI is a 240-item, true-false self-report inventory. It is a broadband personality assessment instrument developed a priori based on Cloninger’s (1987) seven-factor psychobiological theory of personality, which was strongly influenced by genetic and family studies of personality; longitudinal studies of personality stability and change; humanistic and transpersonal notions of personality development; and basic conditioning/learning studies in animals and humans (Cloninger, 1987; Cloninger, Svrakic, & Przybeck, 1993). The TCI measures four dimensions of temperament (Harm Avoidance, Novelty Seeking, Persistence, and Reward Dependence) and three dimensions of character (Cooperativeness, Self-Directedness, and Self-Transcendence). The seven main TCI dimensions comprise 25 facets. The TCI has generated a large and influential body of literature, spanning topics including the genetic heritability of personality (Ando et al., 2002), personality variability within Axis I diagnoses (e.g., Fassino et al., 2002), and the impact of personality on psychotherapy outcome (Joyce, Mulder, McKenzie, Luty, & Cloninger, 2004).

Revised NEO Personality Inventory. The NEO PI-R is a self-report inventory developed to measure the Big Five personality domains: Neuroticism, Extraversion, Openness, Agreeableness, and Conscientiousness. Each of the five broad domains is divided into six facets and each facet is assessed by eight items. It consists of 240 statements for which participants rate their level of agreement on a 5-point scale, with 1 indicating strongly agree and 5 indicating strongly disagree. Exploratory factor analyses of large sets of trait adjective self-ratings (e.g., “friendly,” “courageous”) or short trait descriptions (e.g., “I am not a worrier.”) consistently yield the Big Five personality domains. Several independent studies have replicated the essential factor structure of this inventory (e.g., Savla, Davey, Costa, & Whitfield, 2007; Wu, Lindsted, Tsai, & Lee, 2008). The NEO PI-R has been used extensively in empirical studies of normal and abnormal personality (e.g., Markon et al., 2005; Yamagata et al., 2006), and has been accepted among many as the dominant Big Five personality assessment instrument (Clark, 2007).

Creating item pools. We used Widiger and Simonsen’s (2005) pantheoretical integrative model of personality disorder classification to develop item pools for Antagonism, Constraint, Emotional Instability, Extraversion, and Unconventionality from the SNAP-2, TCI, and NEO PI-R. Widiger and Simonsen (2005) classified items from these three instruments into the five scales of interest. Table 1 lists the scales from each personality measure that were used to create the five item pools, as well as the number of items from each of the scales used in the analyses. The Entitlement SNAP-2 scale was included in both Antagonism and Extraversion (Widiger & Simonsen, 2005). The initial item pools consisted of a large set of items for each personality dimension: Antagonism (214 items: 48 NEO, 93 SNAP, and 73 TCI items), Constraint (232 items: 48 NEO, 92 SNAP, and 92 TCI items), Emotional Instability (190 items: 48 NEO, 62 SNAP, and 80 TCI items), Extraversion (186 items: 48 NEO, 77 SNAP, and 61 TCI items), and Unconventionality (96 items: 48 NEO, 15 SNAP, and 33 TCI items). Response frequencies for each item were inspected before data analysis to ensure that all scale values were endorsed by at least 1.0% of the sample.²

² Additional tables providing item content and response frequencies for the Antagonism, Constraint, Emotional Instability, Extraversion, and Unconventionality initial item pools are available upon request from the corresponding author.
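The initial item pool sizes reported under Creating item pools can be checked with simple arithmetic. The sketch below copies the counts from that paragraph into a dictionary (the variable and function names are illustrative, not part of the original analysis) and confirms that the per-inventory counts add up to each stated pool size, then reports each inventory's share of a pool.

```python
# Initial item pool counts as reported in the text (total, then per inventory).
initial_pools = {
    "Antagonism":            {"total": 214, "NEO": 48, "SNAP": 93, "TCI": 73},
    "Constraint":            {"total": 232, "NEO": 48, "SNAP": 92, "TCI": 92},
    "Emotional Instability": {"total": 190, "NEO": 48, "SNAP": 62, "TCI": 80},
    "Extraversion":          {"total": 186, "NEO": 48, "SNAP": 77, "TCI": 61},
    "Unconventionality":     {"total": 96,  "NEO": 48, "SNAP": 15, "TCI": 33},
}

INVENTORIES = ("NEO", "SNAP", "TCI")

def pool_is_consistent(pool):
    """True when the per-inventory counts sum to the stated pool size."""
    return sum(pool[inv] for inv in INVENTORIES) == pool["total"]

def inventory_shares(pool):
    """Proportion of the pool contributed by each inventory."""
    return {inv: pool[inv] / pool["total"] for inv in INVENTORIES}

for trait, pool in initial_pools.items():
    shares = {inv: round(s, 2) for inv, s in inventory_shares(pool).items()}
    print(trait, pool_is_consistent(pool), shares)
```

Because every NEO PI-R domain contributes six facets of eight items, the NEO count is 48 in each pool, while the SNAP and TCI contributions vary by trait.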
Table 1
NEO, SNAP, and TCI Scales to Create Item Pools for Antagonism, Constraint, Emotional Instability, Extraversion, and Unconventionality

Columns: Personality dimensions; Scales and initial item pools (SNAP, TCI, NEO); Items retained following psychometric analyses (Steps 1b, 2c, and 3d)
a Total number of items. b Number of items retained following combined factor analysis. c Number of items retained following removal of items with response frequencies less

Antagonism
  Initial item pool (N = 214); post-factor analysis item pool (N = 92); final item pool (N = 24)
  SNAP: Mistrust (19)a; Manipulativeness (20); Aggressiveness (20); Entitlement (16); Dependency* (18)
  TCI: Cooperativeness* (40); Reward dependence* (33)
  NEO: Agreeableness* (48)
  Items retained:
    Mistrust (1 = 9, 2 = 9, 3 = 0)
    Manipulativeness (1 = 13, 2 = 13, 3 = 1)
    Aggressiveness (1 = 19, 2 = 19, 3 = 14)
    Cooperativeness (1 = 23, 2 = 23, 3 = 8)
    Reward dependence (1 = 6, 2 = 6, 3 = 0)
    Agreeableness (1 = 22, 2 = 18, 3 = 1)

Emotional instability
  Initial item pool (N = 190); post-factor analysis item pool (N = 114); final item pool (N = 56)
  SNAP: Negative temperament (28); Self-harm (16); Dependency (18)
  TCI: Harm avoidance (36); Self-directedness* (44)
  NEO: Neuroticism (48)
  Items retained:
    Negative temperament (1 = 26, 2 = 26, 3 = 19)
    Self-harm (1 = 13, 2 = 13, 3 = 8)
    Dependency (1 = 4, 2 = 4, 3 = 0)
    Harm avoidance (1 = 29, 2 = 29, 3 = 10)
    Self-directedness (1 = 23, 2 = 23, 3 = 10)
    Neuroticism (1 = 29, 2 = 25, 3 = 9)

Extraversion
  Initial item pool (N = 186); post-factor analysis item pool (N = 91); final item pool (N = 59)
  SNAP: Positive affectivity (27); Exhibitionism (16); Entitlement (16); Detachment* (18)
  TCI: Reward dependence (33); Extravagance (9); Exploratory excitability (11); Shyness* (8); Attachment (9)
  NEO: Extraversion (48)
  Items retained:
    Positive affectivity (1 = 22, 2 = 22, 3 = 18)
    Exhibitionism (1 = 13, 2 = 13, 3 = 10)
    Entitlement (1 = 6, 2 = 6, 3 = 2)
    Detachment (1 = 17, 2 = 17, 3 = 15)
    Reward dependence (1 = 1, 2 = 1, 3 = 1)
    Extravagance (1 = 0, 2 = 0, 3 = 0)
    Exploratory excitability (1 = 1, 2 = 1, 3 = 1)
    Shyness (1 = 2, 2 = 2, 3 = 2)
    Attachment (1 = 2, 2 = 2, 3 = 1)
    Extraversion (1 = 28, 2 = 14, 3 = 9)

Constraint
  Initial item pool (N = 232); post-factor analysis item pool (N = 108); final item pool (N = 19)
  SNAP: Disinhibition* (35); Workaholism (18); Propriety (20); Impulsivity* (19)
  TCI: Self-directedness (44); Novelty seeking* (40); Persistence (8)
  NEO: Conscientiousness (48)
  Items retained:
    Disinhibition (1 = 23, 2 = 23, 3 = 4)
    Workaholism (1 = 8, 2 = 8, 3 = 2)
    Propriety (1 = 4, 2 = 4, 3 = 1)
    Impulsivity (1 = 9, 2 = 9, 3 = 1)
    Self-directedness (1 = 11, 2 = 11, 3 = 0)
    Novelty seeking (1 = 11, 2 = 11, 3 = 1)
    Persistence (1 = 6, 2 = 6, 3 = 3)
    Conscientiousness (1 = 36, 2 = 23, 3 = 7)
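The Step 3 counts in Table 1 can be tallied by inventory to see how much each instrument contributes to a final item pool. The sketch below is illustrative only: the data structure and function names are assumptions, the counts are transcribed from the Step 3 entries of Table 1, and each scale is assigned to an inventory according to the column in which the table lists it. Only the four dimensions that appear in the table are included.

```python
# Step 3 (final) retained-item counts from Table 1, as (scale, inventory, count).
final_retained = {
    "Antagonism": [
        ("Mistrust", "SNAP", 0), ("Manipulativeness", "SNAP", 1),
        ("Aggressiveness", "SNAP", 14), ("Cooperativeness", "TCI", 8),
        ("Reward dependence", "TCI", 0), ("Agreeableness", "NEO", 1),
    ],
    "Emotional instability": [
        ("Negative temperament", "SNAP", 19), ("Self-harm", "SNAP", 8),
        ("Dependency", "SNAP", 0), ("Harm avoidance", "TCI", 10),
        ("Self-directedness", "TCI", 10), ("Neuroticism", "NEO", 9),
    ],
    "Extraversion": [
        ("Positive affectivity", "SNAP", 18), ("Exhibitionism", "SNAP", 10),
        ("Entitlement", "SNAP", 2), ("Detachment", "SNAP", 15),
        ("Reward dependence", "TCI", 1), ("Extravagance", "TCI", 0),
        ("Exploratory excitability", "TCI", 1), ("Shyness", "TCI", 2),
        ("Attachment", "TCI", 1), ("Extraversion", "NEO", 9),
    ],
    "Constraint": [
        ("Disinhibition", "SNAP", 4), ("Workaholism", "SNAP", 2),
        ("Propriety", "SNAP", 1), ("Impulsivity", "SNAP", 1),
        ("Self-directedness", "TCI", 0), ("Novelty seeking", "TCI", 1),
        ("Persistence", "TCI", 3), ("Conscientiousness", "NEO", 7),
    ],
}

def tally_by_inventory(rows):
    """Sum the retained items per inventory for one personality dimension."""
    totals = {"SNAP": 0, "TCI": 0, "NEO": 0}
    for _scale, inventory, count in rows:
        totals[inventory] += count
    return totals

for dimension, rows in final_retained.items():
    totals = tally_by_inventory(rows)
    print(dimension, totals, "final pool size:", sum(totals.values()))
```

Under this transcription, the per-dimension sums reproduce the final item pool sizes listed in the table (24, 56, 59, and 19).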
Analytic Approach
than .05 indicate a good fit (McDonald & Ho, 2002). Given these established standards of CFA fit statistics, we also noted the caution against mechanical use of CFA fit criteria as a “permission slip” for modeling data using IRT, because CFA fit results can be affected dramatically by a large number of items and skewed data distributions (Cook, Kallen, & Amtmann, 2009), which are common characteristics of personality data. For example, Hays and his colleagues (2007) reported CFI = 0.95 and RMSEA = 0.12 as sufficiently unidimensional for IRT analysis on their Physical Functioning item bank development. Similarly, Revicki and his colleagues (2009) reported CFI = 0.902, TLI = 0.991, and RMSEA = 0.156 as sufficiently unidimensional for further IRT analysis. Buysse and his colleagues (2010) reported RMSEA = 0.140, TLI = 0.957, and CFI = 0.843 for the Sleep Disturbance item bank and RMSEA = 0.157, TLI = 0.955, and CFI = 0.812 for the Sleep-Related Impairment item bank as sufficiently unidimensional for further IRT analysis. Therefore, we also checked the scree plot of eigenvalues and the ratio of the first two eigenvalues from EFA in judging unidimensionality. Alternatives to the basic one-factor model were considered to improve fit (cf. McHorney & Cohen, 2000).

IRT calibration. Following the factor analyses to determine which items across the three personality measures were sufficiently unidimensional for IRT analysis, we further examined item response distributions because item parameter estimates may be biased for items with sparse cells (Thissen, 2003). Although it is common that the item response distributions are skewed for questionnaire data, the item response categories with few observations (less than 5% of total frequencies) have to be combined with their adjacent categories to achieve reliable estimates. We removed those items with at least one response category having less than 5% of total frequencies to keep the original response scales. Because we had such large item pools, we were able to delete these items; there was no need to retain items with sparse cells. Then we concurrently calibrated all items from each personality dimension of interest using the GRM for the NEO PI-R and 2PL for the SNAP-2 and TCI in Multilog 7.03 (Thissen, 2003). The advantages of this concurrent calibration include: (1) it retains the integrity of the original scale, (2) it allows individuals to take different sets of items, and (3) it tolerates inventories of differing lengths and rating scales (cf. McHorney & Cohen, 2000; Reise & Waller, 2009). The Multilog program for the GRM estimates a slope (a) parameter and four location (b) parameters for each five-category NEO PI-R item. The Multilog program 2PL model estimates a slope (a) and one location (b) parameter for each two-category SNAP-2 and TCI item.

After the initial concurrent calibration, we examined items in terms of item information and item content. Items with low item information were considered to be poor items in IRT calibration. Item discrimination parameter estimates affect an item’s total information function. The higher the discrimination parameter, the more peaked the item information function. Thus, items with discrimination parameter estimates less than 1.00 provide little information and were removed from the item pool. The reduced item pool was then recalibrated.

Results

Descriptive Statistics

Summed scores of the item pools were computed and examined for each personality dimension. The mean of the summed scores for the 214 Antagonism items was 41.51 (SD = 14.59) for the total sample, 43.31 (SD = 13.66) for the student sample, and 34.42 (SD = 15.97) for the nonstudent sample. The summed score distributions for Antagonism were not significantly skewed in the total sample or either of the subsamples when considered separately (total sample, skew = .27; student sample = .56; nonstudent sample = .08).

The mean of the summed 232 Constraint items for the total sample was 62.68 (SD = 14.51), 64.83 (SD = 14.25) for the student sample, and 54.35 (SD = 12.39) for the nonstudent sample. The summed score distributions for Constraint were also not significantly skewed (total sample, skew = .30; student sample = .38; nonstudent sample = −.20).

For the total sample, the mean of the summed scores for Emotional Instability was 52.61 (SD = 31.69), and 47.15 (SD = 14.82) and 73.77 (SD = 59.10) for the student and nonstudent samples, respectively. The summed score distribution in the total sample for Emotional Instability was positively skewed (total
sample, skew = 2.43); however, the distributions were not significantly skewed when examining the subsamples separately (student sample = .73; nonstudent sample = .74).

The mean of the summed scores for the 186 Extraversion items was 66.13 (SD = 25.74) for the total sample, 66.40 (SD = 15.90) for the student sample, and 65.06 (SD = 47.40) for the nonstudent sample. The summed score distributions for Extraversion were not significantly skewed (total sample, skew = .86; student sample = .14; nonstudent sample = .72).

Finally, the mean of the summed 96 Unconventionality items was 48.76 (SD = 33.85) for the total sample, 46.44 (SD = 12.10) for the student sample, and 57.74 (SD = 70.29) for the nonstudent sample. The summed score distribution in the total sample for Unconventionality was positively skewed (total sample, skew = 2.13); however, the distributions were not significantly skewed when examining the subsamples separately (student sample = .64; nonstudent sample = .78).

Because of the BIBD, internal consistency reliability coefficients could not be calculated for the student sample (i.e., because of planned missingness, too few student cases completed all the items from a scale or instrument for α to be calculated). However, reliability coefficients were calculated for the nonstudent sample by instrument within each of the five item banks. For the Antagonism item bank, α = .80 for the NEO items, α = .95 for the SNAP items, and α = .84 for the TCI items. The internal consistency coefficients (α) for the Constraint item pool were .83 for the NEO items, .88 for the SNAP items, and .68 for the TCI items. For the Emotional Instability item bank, α = .92 for the NEO items, α = .94 for the SNAP items, and α = .93 for the TCI items. For the Extraversion item bank, α = .87 for the NEO items, α = .94 for the SNAP items, and α = .78 for the TCI items. Lastly, the internal consistency coefficients (α) for the Unconventionality item bank were .90 for the NEO items, .86 for the SNAP items, and .85 for the TCI items.

Assessing Dimensionality

For each of the personality dimensions, the scree plot of eigenvalues from the EFA in the development sample was suggestive of a single factor, with the first value larger than the others. Specifically, the ratio of the first to the second eigenvalue was greater than 2.0 for all traits: Antagonism (3.10), Constraint (3.54), Emotional Instability (3.92), Extraversion (5.90), and Unconventionality (2.03). Although the ratio of the first two factors for Unconventionality was less than 3, all item loadings were above .35. Specifically, in the final single factor solution, item loadings ranged from .35 to .79 for Antagonism, .35 to .93 for Constraint, .38 to .88 for Emotional Instability, .43 to .92 for Extraversion, and .38 to .76 for Unconventionality.

The basic 1-factor CFA model for Extraversion fit well to the validation sample data (CFI = .901, TLI = .930, RMSEA = .035). The RMSEA index indicated at least acceptable fit for the remaining four dimensions. However, the CFI/TLI global fit indices indicated less than adequate fit for the basic 1-factor CFA model: Antagonism (CFI = .821, TLI = .838, RMSEA = .031), Constraint (1.85, CFI = .773, TLI = .781, RMSEA = .030), Emotional Instability (2.00, CFI = .861, TLI = .871, RMSEA = .030), and Unconventionality (3.01, CFI = .684, TLI = .696, RMSEA = .046).

Similar to the method used by McHorney and Cohen (2000), we next tried alternatives to the basic 1-factor CFA model. Correlated errors among indicators from the same measure were specified to reflect that some of the covariance among the items from the same inventory was due to measurement error. The addition of correlations among the residuals of the items from the same measure (cf. McHorney & Cohen, 2000) improved the CFI and TLI indexes to acceptable levels: Antagonism (CFI = .900, TLI = .907, RMSEA = .024), Emotional Instability (CFI = .910, TLI = .909, RMSEA = .027), and Unconventionality (CFI = .957, TLI = .948, RMSEA = .019); with the exception of Constraint (CFI = .884, TLI = .867, RMSEA = .024), where the CFI and TLI indexes remained slightly outside the acceptable range. Although the CFA fit indices were slightly outside the traditional acceptable range, CFA fit values were found to be sensitive to data distribution and number of items (Cook, Kallen, & Amtmann, 2009). As they suggested, using traditional cutoffs and standards for CFA fit statistics is not recommended for establishing unidimensionality of item banks because the impact of distribution and item number was quite large in some cases. We also examined
alternative models that posited additional factors (e.g., a model that allowed items from different measures to load on method factors). These solutions did not yield superior fit relative to the single-factor model with correlated errors. Further, the literature on the robustness of item parameter estimation to violations of the unidimensionality assumption indicates that studies using multidimensional data generated by a factor analytic approach tend to show that a unidimensional IRT model is robust to moderate degrees of multidimensionality (Harrison, 1986; Kirisci, Hsu, & Yu, 2001; Reckase, 1979). Given the EFA results and eigenvalues as well as CFA cutoff values used in previous IRT studies, we judged that these results overall provide sufficient evidence of unidimensionality.

Based on these results, we determined 92 Antagonism, 108 Constraint, 114 Emotional Instability, 91 Extraversion, and 49 Unconventionality items were sufficiently unidimensional for IRT analysis (Reeve et al., 2007; Tate, 2003; Zwick & Velicer, 1986). Table 1 delineates the name of the scales as well as the number of items retained from each of these scales following the factor analyses. These item pools reflected the overlapping content in the SNAP-2, TCI, and NEO PI-R. Specifically, the three measures overlapped in measuring anger and verbal and physical aggression for Antagonism; premeditation and perseverance for Constraint; stress susceptibility, negative affectivity, and impulsiveness for Emotional Instability; high activity, positive affectivity, and sociability for Extraversion; and curiosity, unusual experiences, and connectedness for Unconventionality.

IRT Calibration

For item response frequency distribution examinations, all items having at least one response category with less than 5% of total frequencies were NEO items. This was expected since NEO had 5 response categories while TCI and SNAP had only 2 response categories. Thirty-six NEO items were removed accordingly: 4 items from Antagonism, 13 items from Constraint, 4 items from Emotional Instability, 14 items from Extraversion, and 1 item from Unconventionality. Using the item banks selected from the factor analyses for each person-

range of item discrimination for Antagonism (as = .48 to 2.51), Constraint (as = .42 to 2.05), Emotional Instability (as = .55 to 2.75), Extraversion (as = .64 to 2.93), and Unconventionality (as = .38 to 1.58), with many items providing poor discrimination (i.e., as < 1.0). Based on these item parameters, we further reduced the number of items to yield a final number of items for each trait: Antagonism (24 items; 1 NEO, 15 SNAP, and 8 TCI items), Constraint (19 items; 7 NEO, 8 SNAP, and 4 TCI items), Emotional Instability (56 items; 9 NEO, 27 SNAP, and 20 TCI items), Extraversion (59 items; 9 NEO, 45 SNAP, and 5 TCI items), and Unconventionality (18 items; 7 NEO, 4 SNAP, and 7 TCI items). For illustrative purposes, Table 2 provides the parameter estimates and their standard errors from the final concurrent calibrations in rank order of their slope parameter estimates for the Constraint domain. The parameter estimates for the remaining domains are available upon request from the first author. The items retained for each scale maintained a reasonable balance of content and provided adequate coverage for the dimensions of interest. Interestingly, the final calibration for each of the personality dimensions contains items from all three measures, which suggests that each measure provides some utility in measuring the underlying trait. The slope estimates of the final item pools for Antagonism (as = 1.11 to 3.93), Constraint (as = 1.09 to 1.99), Emotional Instability (as = 1.00 to 2.59), Extraversion (as = 1.04 to 3.03), and Unconventionality (as = 1.07 to 2.45) indicated considerable variation in item discrimination. The location parameters for Antagonism (bs = −1.70 to 2.04), Constraint (bs = −2.81 to 2.49), Emotional Instability (bs = −2.75 to 2.57), Extraversion (bs = −2.70 to 1.90), and Unconventionality (bs = −2.50 to 1.61) reflect a sizable range of the underlying personality dimension of interest.

Next, we compared the psychometric information at the test level. One of the advantages of concurrent calibration is that all three measures are on the same metric. For each of the personality dimensions, the test information curves (see Figure 1) were plotted for the three measures separately and combined. Panel 1a displays the test information curves for Antagonism. For these three measures, the TCI pro-
ality dimension, the slope estimates from the vided the most information, followed by the
initial concurrent calibration indicated a wide SNAP-2. However, the SNAP-2 covered a
Table 2
Concurrent GRM and 2PLM Item Parameter Estimates and SEs for the 19-Item Constraint Scale From the NEO, SNAP, and TCI
Item | Original scale | Stem | a (SE) | b1 (SE) | b2 (SE) | b3 (SE) | b4 (SE)
T103 | Persistence | I usually push myself harder than most . . . | 1.99 (0.22) | −0.58 (0.08)
N130 | Conscientiousness | I never seem to be able to get organized (R) | 1.96 (0.16) | −1.71 (0.14) | −0.67 (0.08) | −0.16 (0.07) | 0.89 (0.10)
S29 | Workaholism | When I start a task, I am determined to finish . . . | 1.94 (0.23) | −0.96 (0.10)
N40 | Conscientiousness | I keep my belongings neat and clean | 1.90 (0.15) | −2.17 (0.19) | −0.80 (0.09) | −0.16 (0.08) | 0.97 (0.10)
S89 | Impulsivity | I tend to value and follow a rational, . . . | 1.85 (0.26) | −1.36 (0.13)
T148 | Novelty seeking | I like to pay close attention to details in . . . | 1.73 (0.22) | −0.47 (0.09)
N100 | Conscientiousness | I like to keep everything in its place so I . . . | 1.60 (0.14) | −2.30 (0.22) | −0.98 (0.11) | −0.15 (0.08) | 1.16 (0.12)
S214 | Workaholism | I enjoy working hard | 1.57 (0.20) | −1.28 (0.13)
N25 | Conscientiousness | I’m pretty good about pacing myself so as to . . . | 1.55 (0.13) | −2.21 (0.20) | −0.83 (0.11) | −0.27 (0.09) | 1.43 (0.14)
S173 | Disinhibition | I usually use careful reasoning when . . . | 1.50 (0.20) | −1.08 (0.13)
S154 | Disinhibition | I always try to be fully prepared before I . . . | 1.45 (0.18) | −1.04 (0.13)
T62 | Persistence | I am more hard-working than most people | 1.37 (0.20) | −0.36 (0.10)
S261 | Disinhibition | I work just hard enough to get by (R) | 1.34 (0.16) | −0.72 (0.12)
S202 | Propriety | When I’m working on something, I’m not . . . | 1.22 (0.33) | −1.05 (0.16)
N210 | Conscientiousness | I plan ahead carefully when I go on a trip | 1.22 (0.14) | −2.81 (0.37) | −1.37 (0.19) | −0.49 (0.13) | 1.11 (0.17)
N205 | Conscientiousness | There are so many little jobs that need . . . (R) | 1.22 (0.15) | −2.63 (0.37) | −1.10 (0.18) | −0.28 (0.14) | 1.45 (0.21)
T166 | Persistence | I often give up a job if it takes much . . . (R) | 1.21 (0.21) | −1.64 (0.24)
S254 | Disinhibition | Before making a decision, I carefully . . . | 1.10 (0.17) | −1.13 (0.17)
N230 | Conscientiousness | I’m something of a “workaholic” | 1.09 (0.13) | −1.79 (0.26) | −0.05 (0.14) | 0.90 (0.16) | 2.49 (0.32)
Note. Letters before the item number indicate the measure from which the item originated (N = NEO PI-R, S = SNAP-2, and T = TCI). Some items had to be shortened (. . .) because of space limitations. “R” indicates the item was reverse scored. The “a” parameter represents slope and the “b” parameter(s) represent location. Items with a single b parameter are dichotomous (2PLM); items with b1 through b4 have five response categories (GRM).
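
To make the a and b columns of Table 2 concrete, the following minimal Python sketch evaluates category probabilities and item information under the graded response model, which reduces to the 2PLM when an item has a single location parameter. It assumes a logistic metric with no 1.7 scaling constant, which the text here does not specify, and the function names are illustrative; they are not taken from the calibration software used in the study.

```python
import math

def boundary_curve(theta, a, b):
    # Probability of responding at or above the threshold located at b.
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def cumulative_probs(theta, a, bs):
    # Pad with 1 and 0 so that adjacent differences give category probabilities.
    return [1.0] + [boundary_curve(theta, a, b) for b in bs] + [0.0]

def category_probs(theta, a, bs):
    # One probability per response category (len(bs) + 1 categories).
    cum = cumulative_probs(theta, a, bs)
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

def item_information(theta, a, bs):
    # Sum over categories of (dP_k/dtheta)^2 / P_k.
    cum = cumulative_probs(theta, a, bs)
    info = 0.0
    for k in range(len(cum) - 1):
        p_k = cum[k] - cum[k + 1]
        slope_k = a * (cum[k] * (1 - cum[k]) - cum[k + 1] * (1 - cum[k + 1]))
        if p_k > 0:
            info += slope_k ** 2 / p_k
    return info

# Parameter estimates copied from Table 2 (assumed logistic metric).
T103 = (1.99, [-0.58])                     # dichotomous TCI item
N230 = (1.09, [-1.79, -0.05, 0.90, 2.49])  # five-category NEO PI-R item

for theta in (-2.0, 0.0, 2.0):
    print(theta,
          round(item_information(theta, *T103), 3),
          round(item_information(theta, *N230), 3))
```

Under these assumptions a dichotomous item's information peaks at its b value with height a²/4, whereas a five-category item tends to spread its information across the span of its thresholds.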
Figure 1. Test information curves for the SNAP-2, TCI, NEO PI-R, and combined test for
Antagonism (Panel 1a), Constraint (Panel 1b), Emotional Instability (Panel 1c), Extraversion
(Panel 1d), and Unconventionality (Panel 1e).
broader range relative to the TCI. The NEO PI-R covered a broad range, albeit providing relatively little information to the test information curve (it contributed only one item). Panel 1b displays the test information curves for Constraint. The NEO PI-R provided the most information and covered the widest range, followed by the SNAP-2 and TCI. Panel 1c displays the test information curves for Emotional Instability. The SNAP-2 provided more information at a narrower range of −1.5 to 2.5, while the NEO PI-R provided more information at the low tail (i.e., less than −1.5). The TCI covered a slightly narrower range for Emotional Instability than the SNAP-2 and provided less information. Panel 1d displays the test information curves for Extraversion. Similar to Emotional Instability, the SNAP-2 provided more information at a range of −3.0 to 2.5, while the NEO PI-R provided more information at the two tails (i.e., less than −3.0 and larger than 2.5). The TCI provided the least information. Finally, Panel 1e displays the test information curves for Unconventionality. The NEO PI-R covered the widest range and provided the most information, followed by the TCI and SNAP-2. Across all five personality domains, combining items from the three individual scales provides the maximum amount of information with the most precision across the widest range of θ when compared to the information provided by the subset of items retained from each individual scale.

Discussion

Our goal was to map the scales from the SNAP-2, NEO, and TCI onto five common personality traits (i.e., Antagonism, Constraint, Emotional Instability, Extraversion, and Unconventionality) using Widiger and Simonsen’s (2005) model as a guide. Thus, we specified the factors we sought to measure in advance. The results demonstrated that items from the SNAP-2, TCI, and NEO PI-R overlap in their measurement of Antagonism, Emotional Instability, Extraversion, Constraint, and Unconventionality in predictable ways. For example, items from the Negative Temperament, Self-harm, and Dependency SNAP-2 scales, Harm Avoidance and Self-Directedness TCI scales, and Neuroticism NEO PI-R scale all overlap in measuring Emotional Instability. Our results are consistent with recent meta-analytic and empirical evidence demonstrating that five personality dimensions are shared among dimensional measures of abnormal and normal personality (Markon et al., 2005; O’Connor, 2005; Samuel, Simms, Clark, Livesley, & Widiger, 2010). Our findings are consistent with those of Samuel and colleagues (2010), who demonstrated that common latent personality dimensions cut across the NEO PI-R as well as personality measures intended to assess more extreme variants of personality pathology (SNAP and Dimensional Assessment of Personality Pathology-Basic Questionnaire; Livesley & Jackson, 2011). Additionally, items from the NEO PI-R provided more information at the lower range of the latent trait compared to the measures of maladaptive personality.

The integration of multiple perspectives, specifically the SNAP-2, TCI, and NEO PI-R, provides the most information about the underlying trait, thereby minimizing the weaknesses of any single perspective. Our final item banks integrated information functions across personality scales from competing inventories, which provides the most information about the underlying trait compared with the subset of items retained from each individual inventory. Integrating multiple information functions always leads to an increase in the amount of information provided. For example, for Extraversion, the NEO PI-R provides the most information at the high and low ends of the trait, whereas the SNAP-2 provides more information in the middle range of the trait. Each measure contributed items to the final calibrated item pool, suggesting that each measure provides some utility in measuring the underlying domains of interest. Even though factor analyses pruned unrelated items, we further pruned items if they provided little information to the construct (i.e., items with slope parameters < 1.00 were eliminated). Thus, items from any measure could have been eliminated at this stage, and it is important to note that all three scales were represented in the final item pool. These results are also consistent with past reports that two-, three-, and four-factor models of personality, which have all been proposed as alternative accounts for normal and abnormal personality (e.g., Eysenck & Eysenck, 1976; O’Connor & Dyce, 1998; Tellegen, 2000), are well-represented within a five-factor
model hierarchy (Digman, 1997; Markon et al., 2005).

Comparing the SNAP, TCI, and NEO

Our data analytic strategy enabled us to directly compare scales from the SNAP-2, TCI, and NEO PI-R by linking the scales to the same metric. We retained items from each of the inventories that provided the best psychometric information to represent the five personality traits. By putting items across inventories on the same underlying latent trait scale, we were able to provide information to researchers and clinicians regarding the “best” functioning items from each of the three inventories. We did not intend to create a new measure based on these three inventories, but rather to provide information across them.

The results demonstrated the relative strengths and weaknesses of the SNAP-2, TCI, and NEO PI-R when measuring Antagonism, Constraint, Emotional Instability, Extraversion, and Unconventionality as separate personality dimensions. Specifically, the SNAP-2 and TCI provided information at narrower bands of θ for all personality traits relative to the NEO PI-R, with the exception of Antagonism. Moreover, the SNAP-2 provided more information at narrow bands of Antagonism, Emotional Instability, and Extraversion relative to the NEO PI-R. Additionally, the NEO PI-R provided the most information across all bands of Constraint and Unconventionality relative to the SNAP-2 and TCI. Lastly, the TCI provided the most information for Antagonism but provided the least information for Constraint, Emotional Instability, and Extraversion compared to the SNAP-2 and NEO PI-R. The TCI provided more information and measured a wider range of Unconventionality relative to the SNAP-2 but not the NEO PI-R.

A comparison of item parameters also demonstrated that the NEO PI-R items covered a wider range than SNAP-2 and TCI items across four personality traits. This can be partly attributed to the structure of the measure. Polytomous response items generally cover a wider theta range than dichotomous items because each polytomous response item can be treated as a series of dichotomous items. However, this structural difference does not fully account for our findings because increasing the number of categorical responses does not uniformly increase the information for θ levels over the entire range (Muraki, 1993). Moreover, combining items in a single item pool, even when the indicators have different response options, allows us to compare directly the information provided by the different scales. Each NEO item is spread out across four decision points (Strongly Disagree vs. Disagree; Disagree vs. Neutral; Neutral vs. Agree; and Agree vs. Strongly Agree). Thus, each NEO item can be thought of as 4 binary discriminators (k − 1, where k is the number of response options). When NEO, SNAP, and TCI indicators comprise an item pool, we are able to compare the aggregate information provided by all the items from the scale (test information curve). Concurrent calibration ensured that items from different measures could be compared on the same metric. Although the test information is a sum of individual item information, which means that the height of the information curve is often affected by the number of items included in the item pool, this does not inevitably mean that the more items included, the more information the corresponding “test” will have. With IRT, a longer test does not necessarily provide more information than a shorter test (Embretson & Reise, 2000).

By combining items from different scales, we are able to provide more information about the underlying construct than any one scale. This observation is important because many personality assessments are created with little cross-talk between them. This has resulted in a plethora of personality inventories often viewed as competing for “favored” status. However, we have demonstrated that, by using IRT calibrations across different instruments, we can equate them on the same metric and measure the broadest range of theta with the most precision when compared with any one instrument in isolation.

Constructs of Antagonism, Constraint, Emotional Instability, Extraversion, and Unconventionality

In addition to demonstrating the relative strengths and weaknesses of the SNAP-2, TCI, and NEO PI-R, latent trait models inform our understanding of the constructs of interest. The factor analyses and concurrent calibrations culled items from all inventories that did not
sufficiently overlap, which ensured the final item pool was representative of the unidimensional latent trait. Our results distilled core features of the construct, while culling peripheral aspects.

Antagonism. The majority of the Antagonism items measured argumentativeness (e.g., “I often get into arguments with my family and friends”), physical aggression (e.g., “When I get angry, I am often ready to hit someone”), animosity (e.g., “I enjoy getting revenge on people who hurt me”), and self-centeredness (e.g., “Some people think I am selfish and egotistical”). The original set of items was from the Aggressiveness and Manipulativeness SNAP-2 scales, the Cooperativeness TCI scale, and the Agreeableness NEO PI-R scale (i.e., NEO PI-R facets: Trust, Straightforwardness, Altruism, Compliance, and Modesty). However, the final calibrated scale only contained one item from the NEO PI-R Antagonism scale. Contrary to Widiger and Simonsen’s (2005) predictions, no items from the Reward Dependence TCI scale, Entitlement and Dependency SNAP-2 scales, and the Tender-mindedness facet of the Agreeableness NEO PI-R scale were represented in the final item pool. Interestingly, a relatively large number of items were dropped from the Antagonism item pool because item factor loadings were < .35, indicating that the Antagonism construct does not appear to be as unidimensional when compared with the other personality domains across the NEO, SNAP, and TCI.

Constraint. The majority of the Constraint items measured premeditation (e.g., “I plan ahead carefully when I go on a trip”), perseverance (e.g., “When I start a task, I am determined to finish it”), and diligence (e.g., “I enjoy working hard”). Items from the final pool originated from the Workaholism, Impulsivity, Propriety, and Disinhibition SNAP-2 scales, Novelty Seeking and Self-Directedness TCI scales, and Conscientiousness NEO PI-R scale (i.e., all facets: Competence, Order, Dutifulness, Achievement Striving, Self-Discipline, and Deliberation). At least one item from each of the proposed overlapping scales was represented in the final pool.

Emotional Instability. The preponderance of Emotional Instability items measured stress susceptibility (e.g., “I sometimes get too upset by minor setbacks”), anxiety (e.g., “I often feel nervous and stressed”), depression (e.g., “I rarely feel lonely or blue” [reverse-scored]), anger (e.g., “My anger frequently gets the better of me”), impulsivity (e.g., “Sometimes I get so upset, I feel like hurting myself”), helplessness (e.g., “I often feel that I am the victim of circumstance”), and self-consciousness (e.g., “I usually stay away from social situations”). The final item pool contained items from the Negative Temperament and Self-harm SNAP-2 scales, Harm Avoidance and Self-Directedness TCI scales, and the Neuroticism NEO PI-R scale (i.e., all facets: Anxiety, Hostility, Depression, Self-Consciousness, Impulsiveness, and Vulnerability). The Dependency SNAP-2 scale was not represented in the final item pool, which suggests that indecisiveness may not be best understood as part of the Emotional Instability construct.

Extraversion. The majority of Extraversion items measured high activity (e.g., “Most days I have a lot of ‘pep’ or vigor”), positive affectivity (e.g., “I laugh easily”), and sociability (e.g., “I go out of my way to meet people”). At least one item was represented from the Positive Affectivity, Exhibitionism, Entitlement, and Detachment SNAP-2 scales, the Reward Dependence, Exploratory Excitability, and Shyness TCI scales, and the Extraversion NEO PI-R scale (i.e., all facets: Warmth, Gregariousness, Assertiveness, Activity, Excitement-Seeking, and Positive Emotions). Contrary to Widiger and Simonsen’s (2005) predictions, the Extravagance TCI scale was not represented in the final item pool. This finding suggests that the ease with which one spends money may not be an important aspect of the Extraversion construct.

Unconventionality. The preponderance of Unconventionality items measured intellectual curiosity (e.g., “I am intrigued by the patterns I find in art”), unusual experiences (e.g., “Sometimes I have this strange experience in which things seem ‘more real’ than usual”), and connectedness (e.g., “I sometimes feel so connected to nature that everything seems to be part of one living organism”). As predicted by Widiger and Simonsen (2005), items in our final pool originated from the Eccentric Perceptions SNAP-2 scale, Self-Transcendence TCI scale, and Openness NEO PI-R scale (i.e., NEO PI-R facets: Ideas, Aesthetics, and Feelings). Three Open-
ness facets were not represented in the final item pool: Values, Fantasy, and Actions. Including Unconventionality in a dimensional model of personality pathology is somewhat controversial because it is not clear that this construct is relevant to personality pathology (O’Connor & Dyce, 1998; Watson et al., 2008; Watson, Clark, & Harkness, 1994). However, our findings are consistent with Widiger (2011b) and suggest that certain facets from the Openness NEO PI-R scale (i.e., Ideas, Aesthetics, and Feelings) can be linked with more extreme unusual experiences as measured with the Eccentric Perceptions SNAP-2 scale. Future research with this scale is required to determine its validity for psychotic features that may be important markers for particular manifestations of personality pathology (e.g., Schizotypal PD).

Additional information regarding the nature of the underlying traits can be found by examining the individual item distributions and test information curves. For Constraint, Emotional Instability, Extraversion, and Unconventionality, the frequency distributions for individual items were approximately normal and the resulting test information curves were also approximately normally distributed (peaks of the distributions range from approximately −1 for Constraint to 1 for Extraversion and Unconventionality). However, the frequency distributions for individual items in our final Antagonism item bank were positively skewed despite our attempt to enrich the sample at the ceiling with clinical patients. Thus, the test information curves for the SNAP-2, TCI, NEO PI-R, and combined test were displaced to the right, with more information and precision provided in the moderate to marked ranges (approximately 0 to +2 SD). It might seem appropriate to identify low threshold Antagonism items; however, it is unclear if such items would measure the same construct as problematic, higher levels of Antagonism. Reise and Waller (2009) observe that peaked and (most often positively) skewed information functions for clinical scales are indicative of an underlying “quasi-trait.” As a result, the construct may be less informative at the low end of the scale. Thus, low antagonism may not reflect cooperativeness/amicability but something entirely different, such as flexibility to engage in a wide range of interpersonal behavior. Reise and Waller note that quasi-traits have implications for many IRT applications. Specifically, finding items that provide information across the entire range of Antagonism could prove difficult, which poses complications for computer adaptive testing. This also has implications for measuring change, because precision differs greatly for individuals at different ranges of Antagonism.

Nosological Implications

The best classification scheme for personality disorders (PDs) is the topic of considerable debate (Clark, 2007). Overwhelming evidence indicates that the dominant psychiatric nosology, the DSM–IV–TR (American Psychiatric Association, 2000), which divides personality pathology into 10 separate diagnoses, fails to align with empirical classification research (e.g., Krueger, 2005; Livesley, 2001; O’Connor, 2005), and few clinicians or researchers maintain that the DSM–IV PD taxonomy adequately captures the range of personality pathology (Westen, 1997; Zimmerman et al., 2005).

Dimensional classification of PDs has been proposed as an attractive alternative approach because it addresses most of the limitations of the current categorical system (Widiger & Trull, 2007). Conceptualizing an individual’s personality as a multidimensional profile composed of distinct traits explains the co-occurrence among PDs as a function of shared trait liabilities, and heterogeneity within a disorder reflects differential interactions among traits (Krueger & Markon, 2006). Additionally, dimensional models preserve information about subclinical manifestations of personality pathology that may have significant functional consequences, such as excessive alcohol use and social maladjustment (Bagge et al., 2004; Stepp, Trull, & Sher, 2005). Several dimensional models have been explicitly developed to assess a wide range of maladaptive personality traits. Morey and colleagues (2007) demonstrated the incremental validity of these approaches over the extant diagnostic system.

Our current work demonstrates the advantages of linking competing personality inventories into an integrated framework. By linking inventories from different perspectives, we can develop a comprehensive classification system that capitalizes on the strengths of different inventories. For example, because the NEO PI-R provided information at the low end of
Emotional Instability and the SNAP-2 provided more information in the moderate and high ranges of Emotional Instability, integrating items from both of these models provides the most information along the personality trait continuum when compared with the subset of items from any one inventory. This finding is consistent with previous work demonstrating equivocal results when pitting one model against another (Harkness & McNulty, 1994; Morey et al., 2007; Reynolds & Clark, 2001; Stepp et al., 2005). Thus, it seems that selecting a single inventory to serve as our future taxonomy would result in a classification system that leaves out meaningful aspects of personality.

Our data analytic strategy illustrates IRT as one tool that can be used for linking personality inventories to develop an improved measurement system for personality traits. IRT models yield information about the position on the personality trait continuum where each item and inventory provides maximum psychometric information about the trait. This could enable us to develop an empirically informed measurement system that contains items that tap the low, middle, and high ranges of each trait. For example, a comprehensive inventory for Extraversion might include items that assess low (e.g., “I am a ‘people person’”), middle (e.g., “I prefer to start conversations, rather than waiting for others to talk to me”), and high (e.g., “I often feel as if I’m bursting with energy”) ranges of the trait. Future research can develop cut-points along the trait continuum to aid in clinical decision-making.

Limitations

We were only able to link a small subset of the 18 competing dimensional personality inventories in the current study. However, given that we have now integrated the SNAP-2, TCI, and NEO PI-R, we will be able to forge ahead. As participants in future studies complete at least one of the measures we have already concurrently calibrated in addition to another personality inventory (e.g., MPQ, DAPP-BQ), we will be able to link the additional personality inventory to the current scales. As more inventories are linked and we are able to directly compare each inventory, we will learn about the best path for a future integrative measurement tool and refine our understanding of the underlying traits.

Although this linking approach based on concurrent IRT calibration can be used to create a new integrated inventory, we did not intend to create a new inventory based on these three commercially available inventories. Rather, we intended to provide researchers with information on item performance from each of these three inventories, along with the advantages and disadvantages of each inventory.

Potential biases can sometimes be introduced by combining samples (cf. Waller, 2008). We wanted to bolster our student sample with psychiatric patient samples to expand the potential range of scores that would be endorsed. For IRT purposes, we felt that the potential increase in range of scores by also using patients outweighed the concerns of commingling the samples.

Future Directions for Personality Assessment

Our next set of objectives includes validating the integrated item banks by demonstrating their utility in predicting social functioning and treatment response compared with already existing measures of personality, including the extant classification system for personality disorders. One of the advantages of our approach is that we can further refine these item pools. The predictive utility of these refined item banks should also be tested against existing measures.

In summary, we encourage researchers to continue to investigate the utility of integrative personality inventories. We believe this approach provides the most information along the entire personality trait continuum and will yield the most comprehensive, flexible, and precise inventory. This approach brings together theories and inventories that are distinct but contain significant overlap. For this reason we feel that an integrative dimensional personality inventory will generate novel empirical studies and refine our understanding of underlying personality traits.

References

American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th
ed. revised). Washington, DC: American Psychiatric Association.
Ando, J., Ono, Y., Yoshimura, K., Onoda, N., Shinohara, M., Kanba, S., & Asai, M. (2002). The genetic structure of Cloninger’s seven-factor model of temperament and character in a Japanese sample. Journal of Personality, 70, 583–609. doi:10.1111/1467-6494.05018
Bagge, C., Nickell, A., Stepp, S., Durrett, C., Jackson, K., & Trull, T. J. (2004). Borderline personality disorder features predict negative outcomes 2 years later. Journal of Abnormal Psychology, 113, 279–288. doi:10.1037/0021-843X.113.2.279
Buysse, D. J., Yu, L., Moul, D. E., Germain, A., Stover, A., Dodds, N. E., . . . Pilkonis, P. A. (2010). Development and validation of Patient-Reported Outcome Measures for sleep disturbance and sleep-related impairments. Sleep, 33, 781–792.
Campbell, B. F., Sengupta, S., Santos, C., & Lorig, K. R. (1995). Balanced incomplete block design: Description, case study and implications for practice. Health Education Quarterly, 22, 201–210.
Clark, L. A. (2007). Assessment and diagnosis of personality disorder: Perennial issues and an emerging reconceptualization. Annual Review of Psychology, 58, 227–257. doi:10.1146/annurev.psych.57.102904.190200
Clark, L. A., & Livesley, W. J. (2002). Two approaches to identifying the dimensions of personality disorder: Convergence on the five-factor model. In P. T. Costa & T. A. Widiger (Eds.), Personality disorders and the five-factor model of personality (2nd ed., pp. 161–176). Washington, DC: American Psychological Association.
Clark, L. A., Simms, L. J., Wu, K. D., & Casillas, A. (in press). Schedule for Nonadaptive and Adaptive Personality: Manual for administration, scoring, and interpretation (2nd ed.). Minneapolis, MN: University of Minnesota Press.
Cleckley, H. (1964). The mask of sanity (4th ed.). St. Louis, MO: Mosby.
Cloninger, C. R. (1986). A unified biosocial theory of personality and its role in the development of anxiety states. Psychiatric Developments, 4, 167–226.
Cloninger, C. R. (1987). A systematic method for clinical description and classification of personality variants: A proposal. Archives of General Psychiatry, 44, 573–588.
Cloninger, C. R., Przybeck, T. R., & Svrakic, D. M. (1994). The Temperament and Character Inventory (TCI): A guide to its development and use. St. Louis, MO: Centre for Psychobiology of Personality.
Cloninger, C. R., Svrakic, D. M., & Przybeck, T. R. (1993). A psychobiological model of temperament and character. Archives of General Psychiatry, 50, 975–990.
Cochran, W. G., & Cox, G. M. (1957). Experimental designs. New York, NY: Wiley.
Cook, K. F., Kallen, M. A., & Amtmann, D. (2009). Having a fit: Fit statistics for simulated and actual data fit statistics for IRT models. Quality of Life Research, 18, 447–460. doi:10.1007/s11136-009-9464-4
Costa, P. T., & McCrae, R. R. (1985). The NEO Personality Inventory manual. Odessa, FL: Psychological Assessment Resources.
Costa, P. T., & McCrae, R. R. (1992a). NEO PI-R professional manual. Odessa, FL: Psychological Assessment Resources, Inc.
Costa, P. T., & McCrae, R. R. (1992b). “‘Normal’ personality inventories in clinical assessment: General requirements and the potential for using the NEO Personality Inventory”: Reply. Psychological Assessment, 4, 20–22. doi:10.1037/1040-3590.4.1.20
Digman, J. M. (1997). Higher-order factors of the Big Five. Journal of Personality and Social Psychology, 73, 1246–1256. doi:10.1037/0022-3514.73.6.1246
Embretson, S. E., & Reise, S. P. (2000). Item Response Theory for psychologists. Mahwah, NJ: Erlbaum.
Eysenck, H. J., & Eysenck, S. B. G. (1976). Psychoticism as a dimension of personality. New York, NY: Crane, Russak, & Company.
Fassino, S., Abbate-Daga, G., Amianto, F., Leombruni, P., Boggio, S., & Rovera, G. G. (2002). Temperament and character profile of eating disorders: A controlled study with the Temperament and Character Inventory. The International Journal of Eating Disorders, 32, 412–425. doi:10.1002/eat.10099
Harkness, A. R., & McNulty, J. L. (1994). The Personality Psychopathology Five (PSY-5): Issues from the pages of a diagnostic manual instead of a dictionary. In S. Strack & M. Lorr (Eds.), Differentiating normal and abnormal personality (1st ed., pp. 291–315). New York, NY: Springer.
Harrison, D. A. (1986). Robustness of IRT parameter estimation to violations of the unidimensionality assumption. Journal of Educational Statistics, 11, 91–115. doi:10.2307/1164972
Hays, R. D., Liu, H., Spritzer, K., & Cella, D. (2007). Item response theory analyses of physical functioning items in the medical outcomes study. Medical Care, 45, S32–S38. doi:10.1097/01.mlr.0000246649.43232.82
Joyce, P. R., Mulder, R. T., McKenzie, J. M., Luty, S. E., & Cloninger, C. R. (2004). Atypical depression, atypical temperament and a differential antidepressant response to fluoxetine and nortriptyline. Depression and Anxiety, 19, 180–186. doi:10.1002/da.20001
Kirisci, L., Hsu, T., & Yu, L. (2001). Robustness of item parameter estimation programs to assumptions of unidimensionality and normality. Applied Psychological Measurement, 25, 146–162. doi:10.1177/01466210122031975
Krueger, R. F. (2005). Continuity of Axes I and II: Toward a unified model of personality, personality disorders, and clinical disorders. Journal of Personality Disorders, 19, 233–261. doi:10.1521/pedi.2005.19.3.233
Krueger, R. F., & Markon, K. E. (2006). Reinterpreting comorbidity: A model-based approach to understanding and classifying psychopathology. Annual Review of Clinical Psychology, 2, 111–133. doi:10.1146/annurev.clinpsy.2.022305.095213
Livesley, W. J. (2001). Conceptual and taxonomic issues. In W. J. Livesley (Ed.), Handbook of personality disorders (pp. 3–38). New York, NY: Guilford Press.
Livesley, W. J., & Jackson, D. N. (2009). Dimensional Assessment of Personality Pathology-Basic Questionnaire. Port Huron, MI: Research Psychologists Press.
Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Erlbaum.
Markon, K. E., Krueger, R. F., & Watson, D. (2005). Delineating the structure of normal and abnormal personality: An integrative hierarchical approach. Journal of Personality and Social Psychology, 88, 139–157. doi:10.1037/0022-3514.88.1.139
McDonald, R. P., & Ho, M. H. (2002). Principles and practice in reporting structural equation analyses. Psychological Methods, 7, 64–82. doi:10.1037/1082-989X.7.1.64
McHorney, C. A., & Cohen, A. S. (2000). Equating health status measures with item response theory: Illustration with functional status items. Medical Care, 38, 11–45. doi:10.1097/00005650-200009002-00008
Morey, L. C., Hopwood, C. J., Gunderson, J. G., Skodol, A. E., Shea, M. T., Yen, S., . . . McGlashan, T. H. (2007). Comparison of alternative models for personality disorders. Psychological Medicine, 37, 983–994. doi:10.1017/S0033291706009482
Morey, L. C., Warner, M. B., Shea, M. T., Gunderson, J. G., Sanislow, C. A., Grilo, C., . . . McGlashan, T. H. (2003). The representation of four personality disorders by the Schedule for Nonadaptive and Adaptive Personality dimensional model of personality. Psychological Assessment, 15, 326–332. doi:10.1037/1040-3590.15.3.326
Muraki, E. (1993). Information functions of the generalized partial credit model. Applied Psychological Measurement, 17, 351–363. doi:10.1177/014662169301700403
Muthén, B., du Toit, S. H. C., & Spisic, D. (1997). Robust inference using weighted least squares and quadratic estimating equations in latent variable modeling with categorical and continuous outcomes. Unpublished technical report.
O’Connor, B. P. (2002). The search for dimensional structure differences between normality and abnormality: A statistical review of published data on personality and psychopathology. Journal of Personality and Social Psychology, 83, 962–982. doi:10.1037/0022-3514.83.4.962
O’Connor, B. P. (2005). A search for consensus on the dimensional structure of personality disorders. Journal of Clinical Psychology, 61, 323–345. doi:10.1002/jclp.20017
O’Connor, B. P., & Dyce, J. A. (1998). A test of models of personality disorder configuration. Journal of Abnormal Psychology, 107, 3–16. doi:10.1037/0021-843X.107.1.3
Reckase, M. D. (2009). Multidimensional item response theory. New York, NY: Springer. doi:10.1007/978-0-387-89976-3
Reeve, B. B., Hays, R. D., Bjorner, J. B., Cook, K. F., Crane, P. K., Teresi, J. A., . . . Cella, D. (2007). Psychometric evaluation and calibration of health-related quality of life item banks: Plans for the patient-reported outcomes measurement information system (PROMIS). Medical Care, 45, S22–S31. doi:10.1097/01.mlr.0000250483.85507.04
Reise, S. P., & Henson, J. M. (2003). A discussion of modern versus traditional psychometrics as applied to personality assessment scales. Journal of Personality Assessment, 81, 93–103. doi:10.1207/S15327752JPA8102_01
Reise, S. P., & Waller, N. G. (2009). Item response theory and clinical measurement. Annual Review of Clinical Psychology, 5, 25–46.
Revicki, D. A., Chen, W.-H., Harnam, N., Cook, K., Amtmann, D., Callahan, L. F., . . . Keefe, F. J. (2009). Development and psychometric analysis of the PROMIS pain behavior item bank. Pain, 146, 158–169. doi:10.1016/j.pain.2009.07.029
Reynolds, S. K., & Clark, L. A. (2001). Predicting dimensions of personality disorder from domains and facets of the Five-Factor Model. Journal of Personality, 69, 199–222. doi:10.1111/1467-6494.00142
Rock, D. A., & Nelson, J. (1992). Applications and extensions of NAEP concepts and technology. Journal of Educational Statistics, 17, 219–232. doi:10.2307/1165171
Samuel, D. B., Simms, L. J., Clark, L. A., Livesley, W. J., & Widiger, T. A. (2010). An item response theory integration of normal and abnormal personality scales. Personality Disorders: Theory, Research, and Treatment, 1, 5–21. doi:10.1037/a0018136
Savla, J., Davey, A., Costa, P. T., & Whitfield, K. E. (2007). Replicating the NEO PI-R factor structure in African-American older adults. Personality and Individual Differences, 43, 1279–1288. doi:10.1016/j.paid.2007.03.019
Stepp, S. D., Trull, T. J., Burr, R. M., Wolfenstein, M., & Vieth, A. Z. (2005). Incremental validity of the Structured Interview for the Five-Factor Model of personality (SIFFM). European Journal of Personality, 19, 343–357. doi:10.1002/per.565
Stepp, S. D., Trull, T. J., & Sher, K. J. (2005). Borderline personality features predict alcohol use problems. Journal of Personality Disorders, 19, 711–722. doi:10.1521/pedi.2005.19.6.711
Tate, R. (2003). A comparison of selected empirical methods for assessing the structure of responses to test items. Applied Psychological Measurement, 27, 159–203. doi:10.1177/0146621603027003001
Tellegen, A. (2000). Manual of the Multidimensional Personality Questionnaire. Minneapolis, MN: University of Minnesota Press.
Thissen, D. (2003). MULTILOG 7: Multiple categorical item analysis and test scoring using item response theory [Computer program]. Chicago, IL: Scientific Software.
Thissen, D., & Wainer, H. (Eds.). (2001). Test scoring. Mahwah, NJ: Erlbaum.
van der Linden, W. J., Veldkamp, B. P., & Carlson, J. E. (2004). Optimizing balanced incomplete block designs for educational assessments. Applied Psychological Measurement, 28, 317–331. doi:10.1177/0146621604264870
Waller, N. G. (2008). Commingled samples: A neglected source of bias in reliability analysis. Applied Psychological Measurement, 32, 211–223. doi:10.1177/0146621607300860
Walton, K. E., Roberts, B. W., Krueger, R. F., Blonigen, D. M., & Hicks, B. M. (2008). Capturing abnormal personality with normal personality inventories: An item response theory approach. Journal of Personality, 76, 1623–1647. doi:10.1111/j.1467-6494.2008.00533.x
Watson, D., Clark, L. A., & Chmielewski, M. (2008). Structures of personality and their relevance to psychopathology: II. Further articulation of a comprehensive unified trait structure. Journal of Personality, 76, 1545–1585. doi:10.1111/j.1467-6494.2008.00531.x
Watson, D., Clark, L. A., & Harkness, A. R. (1994). Structures of personality and their relevance to psychopathology. Journal of Abnormal Psychology, 103, 18–31. doi:10.1037/0021-843X.103.1.18
Westen, D. (1997). Divergences between clinical and research methods for assessing personality disorders: Implications for research and the evolution of Axis II. American Journal of Psychiatry, 154, 895–903.
Widiger, T. A. (2011a). Integrating normal and abnormal personality structure: A proposal for DSM-V. Journal of Personality Disorders, 25, 338–363. doi:10.1521/pedi.2011.25.3.338
Widiger, T. A. (2011b). The DSM-5 dimensional model of personality disorder: Rationale and empirical support. Journal of Personality Disorders, 25, 222–234. doi:10.1521/pedi.2011.25.2.222
Widiger, T. A., Livesley, W. J., & Clark, L. A. (2009). An integrative dimensional classification of personality disorder. Psychological Assessment, 21, 243–255. doi:10.1037/a0016606
Widiger, T. A., & Simonsen, E. (2005). Alternative dimensional models of personality disorder: Finding a common ground. Journal of Personality Disorders, 19, 110–130. doi:10.1521/pedi.19.2.110.62628
Widiger, T. A., & Trull, T. J. (2007). Plate tectonics in the classification of personality disorder: Shifting to a dimensional model. American Psychologist, 62, 71–83. doi:10.1037/0003-066X.62.2.71
Wu, K., Lindsted, K. D., Tsai, S., & Lee, J. W. (2008). Chinese NEO PI-R in Taiwanese adolescents. Personality and Individual Differences, 44, 656–667. doi:10.1016/j.paid.2007.09.025
Yamagata, S., Suzuki, A., Ando, J., Ono, Y., Kijima, N., Yoshimura, K., . . . Jang, K. L. (2006). Is the genetic structure of human personality universal? A cross-cultural twin study from North America, Europe, and Asia. Journal of Personality and Social Psychology, 90, 987–998. doi:10.1037/0022-3514.90.6.987
Yamamoto, K., & Mazzeo, J. (1992). Item response theory scale linking in NAEP. Journal of Educational Statistics, 17, 155–173. doi:10.2307/1165167
Zimmerman, M., Rothschild, L., & Chelminski, I. (2005). The prevalence of DSM–IV personality disorders in psychiatric outpatients. American Journal of Psychiatry, 162, 1911–1918. doi:10.1176/appi.ajp.162.10.1911
Zwick, W. R., & Velicer, W. F. (1986). Comparison of five rules for determining the number of components to retain. Psychological Bulletin, 99, 432–442. doi:10.1037/0033-2909.99.3.432