A Comprehensive Analysis of 50 Years of Eysenck
René T. Proyer (c)
(a) Department of Psychology, University of Zurich, Switzerland
(b) School of Psychology, University of Sunderland, United Kingdom
(c) Department of Psychology, Martin-Luther-University Halle-Wittenberg, Germany
Keywords: PEN model, Extraversion, Psychoticism, Neuroticism, Questionnaires

Instruments for the assessment of the Eysenckian superfactors of personality, Psychoticism (P),
Extraversion (E), and Neuroticism (N), were developed over the course of almost 50 years. Typically
the convergence with the precursor was examined when a new scale was published. In the present
study the continuity and change of the substance of P, E, and N is tested by administering all
instruments to a sample simultaneously, together with measures of the Five-Factor Model. A factor
analysis of the 19 markers of the PEN model clearly yielded three factors, with higher loadings for
E and N compared to P. The superfactors typically were measured purely after the historically
second (or third, for P) instrument. Analysing the item difficulty confirmed that the P items were
softened during the revisions but this created a confounding of item difficulty and content: The
earlier “tough” items (mostly low Agreeableness) were gradually complemented by “softer” items
representing the presumed obverse of P, superego strength (mostly low Conscientiousness). Finally,
a part of the observed heterogeneity of P was due to these differences in item difficulty. Overall,
the EPQ-R seems to be the most valid single measure of the PEN model.
⁎ Corresponding author at: University of Zurich, Department of Psychology, Personality and Assessment,
Binzmuehlestrasse 14, Box 7, CH-8050 Zürich, Switzerland.
E-mail address: [email protected] (W. Ruch).
https://doi.org/10.1016/j.paid.2020.110070
Received 7 February 2020; Received in revised form 14 April 2020; Accepted 17 April 2020
Available online 29 April 2020
0191-8869/ © 2020 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/4.0/).
W. Ruch, et al. Personality and Individual Differences 169 (2021) 110070
1.1. The PEN system

The PEN system of personality is a factor-analytically based descriptive taxonomy of personality
containing the three superfactors Psychoticism, Extraversion, and Neuroticism (Eysenck & Eysenck,
1985). The PEN system assumes a hierarchical arrangement of personality characteristics with
Psychoticism (versus Impulse Control), Extraversion (versus Introversion), and Neuroticism (versus
Emotional Stability) located at the highest level. They are referred to as types (or second-order
factors in factor-analytic terms) as opposed to traits (or first-order factors) defining them. The
type concept of Psychoticism is made up of traits like being aggressive, cold, egocentric,
impersonal, impulsive, antisocial, unempathic, creative, and tough-minded. The traits whose
intercorrelations give rise to the type concept of Extraversion are sociable, lively, active,
assertive, sensation-seeking, carefree, dominant, surgent, and venturesome. Finally, Neuroticism is
made up of traits like anxious, depressed, tense, irrational, shy, moody, emotional, and proneness
to guilt feelings and low self-esteem (Eysenck & Eysenck, 1985).

Eysenck started studying basic personality types by using ratings and objective tests applied to
individuals of chosen clinical groups (such as neurotics, psychotics) and later designed
questionnaires for their measurement. The first instrument, the MMQ (Eysenck, 1947) measured N
(with 40 items), the MPI (Eysenck, 1959a) measured E and N with 24 items each, and two forms of the
EPI (Eysenck & Eysenck, 1964) measured E and N (with 24 E and 24 N items in each form). The first
studies of P used unpublished instruments containing items of all factors. P items needed to be
identified that fulfil three criteria; specifically, the items had to a) intercorrelate together to
define a common factor; b) discriminate between non-clinical groups and psychotic and criminal
groups; and c) not correlate to any noteworthy extent with E and N. In the two studies using the PI
(Eysenck & Eysenck, 1968) and PEN (Eysenck & Eysenck, 1972), 20 items each fulfilled the criteria.
Further work then led to the publication of the EPQ (Eysenck & Eysenck, 1975) measuring P, E and N
(with 25 P, 21 E, and 23 N items) and the psychometrically improved EPQ-R (Eysenck et al., 1985)
measuring P, E, and N (with 32, 23, and 24 items, respectively). Finally, the EPP (Eysenck,
Barrett, Wilson, & Jackson, 1992; Eysenck & Wilson, 1991) was published that contained facets (7
per superfactor with 20 items each).

1.2. The sequential development of the structural model: Some inherent consequences

The sequential development of concepts (compared to a simultaneous one) bears some predictable
hurdles to master. A first one is that the substance of a factor most likely needs adjustments once
a further factor is added. The initial definition of N in the MMQ was strongly influenced by
dysthymia, and once E was added the more introverted N items had to be eliminated to have E and N
clearly separated. While with the EPI (but not the MPI) E and N were almost orthogonal, the
addition of P to the model posed problems with impulsivity, which, together with sociability,
formed E. Studies revealed that impulsivity could be broken down into four positively correlated
components with one of them (non-planning impulsiveness) being mostly aligned with P, while other
elements like venturesomeness remained with E (Eysenck & Eysenck, 1978; for a recent discussion,
see Zuckerman & Glicksohn, 2016). Consequently, the EPQ shifted its focus to assessing mostly
sociability, rather than a mixture of both impulsivity and sociability as in the EPI (Rocklin &
Revelle, 1981).

A second hurdle was to keep the meaning of the superfactors constant when including facets for each
superfactor, which started with the EPP (Eysenck et al., 1992; Eysenck & Wilson, 1991). Despite the
fact that since 1985 nine traits were always listed as defining facets, the EPP uses seven facets.
It is of course difficult to balance out the secondary loadings of the facets so that the total
scores only measure P, E, and N. This appears to be more problematic with the short version of the
EPP (the EPP-S; Eysenck, Wilson, & Jackson, 1999), which only contains three facets for each
factor. It should be noted that the first version of this scale employed partly different labels
for the subscales and also assumed a partly different assignment of the primaries to the three
factors (Eysenck & Wilson, 1976). Furthermore, it had 630 items compared to the 440 items.

Studies correlating the EPQ with the EPP-factors (Costa & McCrae, 1995) or the EPP-S total scales
(Knyazev, Belopolsky, Bodunov, & Wilson, 2004) clearly showed that the correlations a) were highest
for homologous scales (indicating the best correspondence for the markers of P, E, and N in the two
instruments); b) were consistently higher for N and E compared to P; and c) showed some patterns in
the off-diagonals (e.g., EPP P correlating positively with E, and EPP N correlating negatively with
EPQ E). This is not surprising as, for example, the three scales defining P (Risk-taking,
Impulsiveness, and Irresponsibility) seem to capture E as well, while the three scales defining N
(Anxiety, Inferiority, and Unhappiness) seem to stem from the introverted side. This can be
explored further by looking into the studies examining the factor structure of the standard and
short versions, which yielded solutions with three (EPP, EPP-S) and five (EPP only) factors. The
three-factor solution for the 21 facets of the EPP typically yielded factors of P, E, and N.
Several scales with double loadings (and for aggression even triple loadings) emerged, and one or
two scales (i.e., practical) that were outside the PEN model (Costa & McCrae, 1995; Eysenck et al.,
1992; Jackson & Francis, 2004; Jackson, Furnham, Forde, & Cotter, 2000). The latter result gave
rise to the idea to investigate whether the EPP scales might be better represented by the FFM
(Costa & McCrae, 1995); a view later refuted by Jackson et al. (2000). However, most importantly,
when P, E, and N were obtained most purely through a target rotation (Costa & McCrae, 1995; Eysenck
et al., 1992), it became apparent that the secondary loadings did not even out; for instance,
facets of P also tended to load on E, and more facets of N were on the introverted side of E
facets.

A third hurdle was to preserve the substance of the two clinically orientated factors N and P while
softening the item contents to make the scales applicable to the general population. While item
contents can be maintained in a softer version for N (e.g., “have you ever thought of suicide” may
be weakened to “are you occasionally really fed up” and medical symptoms may be reduced in
intensity and frequency), this is more difficult for psychotic symptoms (e.g., paranoid ideas) or
not possible at all for others, like hearing voices. Thus, it is important to see how the softening
of P was undertaken and what the final outcome was.

An empirical comparison among different Eysenckian questionnaires (MMQ, MPI, EPI, and EPQ) has only
been conducted for the N scale (Ferrando, 2001). The unidimensionality of the 47 N items in these
four questionnaires received support, and the MMQ items were found to be more difficult (i.e., had
lower means) than the items from the other three questionnaires. These items often referred to the
occurrence of physical symptoms, which were “softened” over time to make the items more suitable
for non-clinical rather than clinical populations.

1.3. The P-scale: two new perspectives on some of the existing criticisms and earlier controversies

We argue that this process of softening items is related to two criticisms or controversies related
to P; namely, the alleged heterogeneity of P and the lack of correspondence in an alternative
system of personality description, the FFM, in which it covers two factors (A and C). The P-scale
of the EPQ (Eysenck & Eysenck, 1975) had a lower internal consistency (i.e., Cronbach's alpha) than
the E and N scales, despite the higher number of items. The EPQ-R raised the number of items from
25 to 32 to obtain a satisfactory alpha. Several explanations were put forward; for instance, the P
facets might have a lower
reliability (Eysenck & Eysenck, 1991) or the P-scale might be factorially heterogeneous (Roger &
Morris, 1991). Additionally, the P-scales of different questionnaires were found to show a lower
convergence than the E and N scales, respectively (e.g., EPQ-A and EPQ-RS, Alexopoulos &
Kalaitzidis, 2004; EPP and EPQ, Knyazev et al., 2004).

Eysenck's (1992a) conceptual account of the P dimension shows the diversity of traits and
syndromes; he lists (from the low P pole to the middle) traits like altruistic, socialized,
empathic, conventional, and conformist, and locates (from above average to extreme) phenomena like
being criminal, impulsive, hostile, aggressive, psychopathic, schizoid, unipolar depressive,
schizoaffective, schizophrenic, or suffering from an affective disorder. Clearly, these lists are
prone to show some heterogeneity when packed into one scale.

Eysenck (1992b) listed a narrower segment of primaries of P that should also explain why P relates
to both low A and low C, despite the latter two being uncorrelated. In his controversy with Costa
and McCrae, Eysenck declared A and C to be (narrow) primaries of P assuming the two outermost
positions in the segment of primaries covering (low) A, coldness, Machiavellianism, hostility,
aggression, (low) empathy, and (low) C. The alternative interpretation that P is an arbitrary
combination of low C and low A was first raised in a factor analysis of the 25 P items by Goldberg
and Rosolack (1994). They found the two sets of positively and negatively keyed items to be
scattered in arcs of about 125 degrees in a space defined by two orthogonal components (see Fig.
1.1 in Goldberg & Rosolack, 1994). This implies negative correlations and hence a large
heterogeneity among the items. Additionally, they found the two factors to be correlated with A-
and C- of the FFM and concluded that P is heterogeneous and a blend of low C and A.

We would like to add two new perspectives on this matter. The first perspective is based on the
fact that while the Eysencks saw the typical high P-scorer as “cold, impersonal, hostile, lacking
in sympathy, unfriendly, untrustful, odd, unemotional, unhelpful, antisocial, lacking in human
feelings, inhumane, generally bloody-minded, lacking in insight, strange, with paranoid ideas that
people were against him” (Eysenck & Eysenck, 1976, p. 47), they also followed Royce (1973) who saw
the third factor (beyond E and N) in personality to be superego, as championed by Cattell. They
conceded that superego “is clearly the obverse of the psychoticism factor we are here
hypothesizing; all the traits characterizing the ‘high superego’ person are characteristically
absent in the high P scorer, as we shall see” (Eysenck & Eysenck, 1976, pp. 43–44). High superego,
of course, makes the low pole of P closer to impulse control (but also C in the FFM).

We might therefore expect to find that earlier approaches to P yield items that have a low
endorsement frequency (and low variance in case of binary answers), tougher content, and more
overlap with low A, in line with the description of the typical high P-scorer mentioned above.
Later approaches might include softer items, with higher endorsement rates (and hence a larger
variance, thereby affecting the scale variance more than the older items) and item contents that
also reflect C or impulse control. Thus, these two sources of heterogeneity (differences in item
endorsement and contents) might be confounded. Findings on the relationships between earlier and
later questionnaires and the FFM are in line with this interpretation of a shift from A- to C- in
the P items. For instance, using 53 P items from the EPQ and new items, McCrae and Costa (1985)
found correlations of −0.20 to −0.45 with A and −0.29 to −0.31 with C. Later studies using the
EPP-P scale (e.g., Costa & McCrae, 1995; Muris, Schmidt, Merckelbach, & Rassin, 2000), by contrast,
found the largest correlations with C- and smaller or even non-significant correlations with A-.

The second perspective is that keeping the tough items (that mark the content of P well) and
supplementing them with “softened” P items will lead to a) very skewed distributions for the tough
items (as the EPQ uses a yes/no answer format) and to b) a large range in the item means. In the
English norm data, the item with the lowest mean was Item 11¹ (M = 0.03; “Would it upset you a lot
to see a child or an animal suffer?”) and the one with the highest mean was Item 74 (M = 0.301;
“When you catch a train do you often arrive at the last minute?”). Thus, the correlation (PHI-max)
between these two items can maximally be 0.27. Consequently, if one allows for more factors (as,
for example, Goldberg and Rosolack (1994) did), the severely lowered upper limit for the
correlation makes it likely that these two items will load on different factors. Thus, differences
in item difficulty likely contribute to the heterogeneity of the P-scale, and it opens the
possibility that item difficulty and content might be confounded.

1.4. Aims of the present study

A multitude of self-rating forms (adjectives from different articles describing P covering the
entire time span, the nine primary traits depicted in the model, as well as adjectives from the
German trait taxonomy studies; Ostendorf, 1994) and questionnaires (from the MMQ to the EPP-S) will
be administered to examine the following questions: a) How did the three concepts P, E, and N
develop over the years in terms of basic statistics (M, SD, Cronbach's alpha) as well as their
factor loadings?; b) How does the correlation of different forms of the P scale and rating markers
of P change in relation to C and A?; and c) Is the alleged heterogeneity of the P-scale in part an
artefact due to the wide range in item means?

Regarding a), we expect that the means (and SD) for P increase from the early to the latest
versions of the scale to the midpoint of the scale. At the same time, the factor loadings in a
three-factor model should increase and display a clearer factor structure, reflecting the increased
reliability and purification of the scales, respectively. Regarding b), we expect that the earlier
versions of the scale will be mainly negatively related to A while the later versions will have an
equal contribution of C- to P. This pattern should also emerge for adjectives describing the P
scales sampled over the course of the development of the P scales based on expert ratings by 10 FFM
experts (prototypicality for C, A, N, E, and Openness to experience). This analysis will be
performed on items that received high prototypicality evaluations by a PEN expert. Regarding c), we
expect that the difference in the item means explains most but not all of the observed
heterogeneity of P in the EPQ, pertaining to a confound between the means and contents of the scale
(extending the analyses of Goldberg & Rosolack, 1994).

2. Material and methods

2.1. Participants and procedure

Sample 1 comprised 629 adults (63.3% women) from the general population aged 17 to 91 years
(M = 41.3, SD = 13.9). All participants completed the two latest versions of Eysenck's
questionnaires (i.e., the EPQ-R and EPP), a questionnaire for the assessment of the FFM of
personality (NEO-PI-R; Ostendorf & Angleitner, 2004), as well as self-ratings based on the 21
measured traits in the EPP scales (EPP-SR) and self-ratings on the 27 (9 × 3) adjectives depicted
in the PEN-model (PEN-SR). Participants in Sample 1 were recruited through radio and newspaper
reports, word of mouth, and a website dedicated to the project. The participants completed the
questionnaires in the lab or received them via mail. Upon request, participants received personal
feedback on their scores or general feedback on selected findings of the study.

Sample 2 was a subsample of Sample 1 (338 adults; 60.9% women) who additionally completed the older
versions of Eysenck's questionnaires (i.e., MMQ, MPI, and both forms of the EPI), and various
precursors of the P scales (i.e., PI 68, PEN 72, and EPQ 75 as well as items tested for the EPQ-R
that were excluded from the final scale). The non-redundant items of these scales were compiled
into one longer instrument.

¹ The information about the British norm data was kindly provided by Paul Barrett.

Sample 3 was composed of an expert sample of one PEN-expert (Sybil B. G. Eysenck), and 10
FFM-experts (Alois Angleitner, Peter Borkenau, Filip De Fruyt, Lewis R. Goldberg, A. A. Jolijn
Hendriks, Willem K. B. Hofstee, John A. Johnson, Robert R. McCrae, Ivan Mervielde, and Gerard
Saucier). These experts rated all P items of the Multiple Prototypicality Ratings Form, which
contained the P items of all P scales except the one in the EPP, regarding their prototypicality
for the PEN-model (Sybil B. G. Eysenck) or the FFM (the 10 FFM-experts).

We additionally conducted analyses and simulations based on the British norm data of the EPQ kindly
provided by Paul Barrett.

2.2. Instruments

The Maudsley Medical Questionnaire (MMQ; Eysenck, 1947; used in the German version by Eysenck,
1953) assesses N with 38 items. Additionally, it contains a lie scale (18 items). All items use a
dichotomous response format (0 = “no”, 1 = “yes”).

The Maudsley Personality Inventory (MPI; Eysenck, 1959a; used in the German version by Eysenck,
1959b) assesses E and N with 48 items (24 items per scale). All items are rated on a three-point
scale (0 = “no”, 1 = “can't decide”, 2 = “yes”).

The Eysenck Personality Inventory (EPI; Eysenck, 1970; used in the German version by Eggert, 1974)
assesses E and N with 48 items (24 items each). Additionally, it contains a lie scale (9 items).
All items use a dichotomous response format (0 = “no”, 1 = “yes”). There are two parallel forms of
the instrument (EPI-A and EPI-B).

The Eysenck Personality Questionnaire revised (EPQ-R; Eysenck & Eysenck, 1991, 1992c; Eysenck et
al., 1985; used in the German version by Ruch, 1999) contains 102 items for the assessment of P (32
items), E (23 items), N (25 items), while 22 items form a lie scale. All items use a dichotomous
response format (0 = “no”, 1 = “yes”).

Psychoticism items from different precursors of the P scale that did not make it into the EPQ-R
were combined in one instrument. These were P items from the PI (Eysenck & Eysenck, 1968), PEN
(Eysenck & Eysenck, 1972), EPQ (Eysenck & Eysenck, 1975), and an unpublished instrument that
finally led to the EPQ-R (Eysenck et al., 1985). All items use a dichotomous response format
(0 = “no”, 1 = “yes”). These items, together with the standard P items, were used to compute the
total scores representing the P scales of 1968, 1972, and 1975, but also scores for the new items
were derived.

The Eysenck Personality Profiler (EPP; Eysenck & Wilson, 1991; used in the German translation by
Bulheller & Häcker, 1998) contains 420 items for the assessment of P, E, and N (140 items each).
Additionally, 20 items form a lie scale. All items are rated on a three-point scale (0 = “no”,
1 = “can't decide”, 2 = “yes”). It should be noted that the three existing German adaptations
(EPP-D) did not use all 21 facets, namely the long (EPP-D BH) and short version (EPP-DS BH) by
Bulheller and Häcker (1998) and one (EPP-D M) with yet a different item scoring key (Moosbrugger,
Fischbach, & Schermelleh-Engel, 1998). These forms were derived from the pool of 420 items.

The NEO Personality Inventory-revised (NEO-PI-R; Costa & McCrae, 1992; German version by Ostendorf
& Angleitner, 2004) assesses the FFM traits (i.e., N, E, Openness to experience, A, and C) with 240
items (48 items per dimension). All items use a 5-point Likert-style scale (1 = “strongly disagree”
to 5 = “strongly agree”).

The Multiple Prototypicality Ratings Form-PEN contained 67 non-redundant P items used in precursors
of the P Scale until the EPQ-R. The instruction read “Please judge the degree of prototypicality of
each of the following 67 questions. For each question ask yourself: how prototypical is a
‘yes’-answer to an item for the dimensions of Psychoticism (P), Extraversion (E), and Neuroticism
(N). For your answer use a seven-point scale ranging from −3 (highly prototypical for the negative
pole) to +3 (highly prototypical for the positive pole), with ‘0’ meaning that this item is
orthogonal/unrelated to that dimension.” For the sake of the present study, these ratings will only
be used to identify the items highly prototypical of P but not yielding high scores for E or N at
the same time. This eliminated five items from the pool.

The Multiple Prototypicality Ratings Form-FFM contained the remaining 62 P items, and the
instruction given to the FFM experts was the same except the beginning: “Please judge the degree of
prototypicality of each of the following 62 questions. For each question ask yourself: how
prototypical would a ‘yes’-answer to these items be for the dimensions of Neuroticism (N),
Extraversion (E), Openness to Experience (O), Agreeableness (A), and Conscientiousness (C)”. The
answers were then averaged.

The PEN-SB contained 207 trait adjectives presumably measuring P and coming from different sources,
namely descriptions of the high P scorer in research papers on P and manuals (e.g., Eysenck, 1992a;
Eysenck & Eysenck, 1972, 1975, 1976, 1992), the model with nine defining subtraits (Eysenck &
Eysenck, 1985), and the German personality taxonomy project. As part of the latter, Ostendorf
(1994) collected prototypicality ratings for P, E, and N from H. J. Eysenck (who judged 430
adjectives) as well as from 10 students (who judged 823 adjectives). The students were very
familiar with the PEN system and they were also given materials to study. The judgments were done
on a 7-point rating scale (−3 = prototypical for negative pole of the trait, 0 = not prototypical,
+3 = prototypical for the positive pole of the trait). Adjectives were selected for the study if
they were more prototypical for P than for E and N combined; that is, for slightly prototypical
P-adjectives (+1/−1) the scores for E and N needed to be “0” to be included in the study. These
ratings were then added to a total score, but also separate scores were used in the analyses; for
example, one score for each of the six levels of prototypicality (−3, −2, −1, +1, +2, +3) in the
students' rating as well as Hans Eysenck's rating. Likewise, separate scores were computed for how
P was described in the above-mentioned six publications.

2.3. Data analysis

To test research question a), we first computed Cronbach's alpha, item difficulties (means),
corrected item-total correlations, and average inter-item correlations of the P-scales from the PI,
PEN, EPQ, EPQ-R, EPP, and EPP-S. These items were administered in Sample 2 and covered versions of
the P-scale from 1968 to 1999. Additionally, the item difficulties (means) of adjective
descriptions of P (from 1972 to 1992) were investigated as well (also completed by Sample 2).
Second, to determine the factor structure and loadings of P, E, and N, we subjected the scales from
1947 to 1991 as well as the means of the adjective descriptions of P, E, and N (see Eysenck &
Eysenck, 1985) to a principal components analysis with varimax rotation.

To test research question b), we first correlated the different scales and adjectives of P (from
1968 to 1999) with the A and C scales of the NEO-PI-R, based on self-reports in Samples 1 and 2.
Next, the ratings for the 62 non-redundant P items were averaged across the 10 FFM experts. Only
ratings of A and C were used, and each item was coded as belonging to “A” or “C” (depending on
which yielded the higher mean). The items were grouped according to their first appearance in a P
scale, and each item was used only once even if it reappeared in later versions of the P scale. The
use of non-redundant item sets helped to examine whether the relative importance of A and C shifted
throughout the different versions of the P items.

Finally, the sources of the heterogeneity of the P-scale (research question c) were investigated by
computing principal component analyses of 25 P items in the normative sample of the English EPQ
(Phi), a simulated data set (Phi-max) based on a perfect Guttman scale, and the corrected
correlation matrix (Phi-corr = Phi/Phi-max). These three factor solutions were compared using
rank-order correlations and by plotting the factor loadings.
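
The ceiling that unequal item means place on a phi coefficient, and the Phi/Phi-max correction named in the data analysis, can be illustrated with a short sketch. This is not the authors' code: the function names (`phi`, `phi_max`, `phi_corr`) and the ten-respondent toy data are hypothetical, and the bound shown is the standard upper limit of phi for positively associated binary items, which is what a perfect Guttman pattern attains. With the two item means quoted in Section 1.3 (0.03 and 0.301) the bound is about 0.27.

```python
import math


def phi_max(p1: float, p2: float) -> float:
    """Largest attainable phi for two binary items with endorsement rates p1, p2.

    Assumption: positive association and rates strictly between 0 and 1.
    """
    lo, hi = sorted((p1, p2))
    return math.sqrt((lo * (1.0 - hi)) / (hi * (1.0 - lo)))


def phi(x, y) -> float:
    """Phi coefficient (Pearson r) of two equally long 0/1 response vectors."""
    n = len(x)
    px = sum(x) / n
    py = sum(y) / n
    pxy = sum(a * b for a, b in zip(x, y)) / n
    return (pxy - px * py) / math.sqrt(px * (1 - px) * py * (1 - py))


def phi_corr(x, y) -> float:
    """Observed phi divided by its upper bound (Phi-corr = Phi / Phi-max)."""
    return phi(x, y) / phi_max(sum(x) / len(x), sum(y) / len(y))


# Item means quoted in the text for EPQ Items 11 and 74.
print(round(phi_max(0.03, 0.301), 2))  # 0.27

# Hypothetical Guttman-pattern data: everyone who endorses the hard item
# also endorses the easy one, so the observed phi sits at its upper bound.
hard = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
easy = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
print(round(phi(hard, easy), 3), round(phi_corr(hard, easy), 3))
```

The sketch only handles positive associations; a negative observed phi has a different lower bound, and how the study treated such cases is not stated in the text.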
[Fig. 1: plotted series are Cronbach's alpha, item difficulty, corrected item-total correlation, and inter-item correlation; y-axis from 0 to .9.]
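
The item statistics named in the data analysis (Cronbach's alpha, item difficulty as the item mean, corrected item-total correlation, and average inter-item correlation) can be computed from a respondents-by-items matrix as in the following sketch. The function names and the toy responses are hypothetical, and sample variances (n − 1) are an assumption; the text does not state which software or variance convention was used.

```python
from itertools import combinations
from statistics import mean, variance


def pearson(x, y) -> float:
    """Pearson correlation; raises ZeroDivisionError if either vector is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def item_statistics(rows):
    """rows: one list of item scores per respondent, all of equal length."""
    items = list(zip(*rows))                 # transpose: one tuple per item
    k = len(items)
    totals = [sum(r) for r in rows]
    alpha = k / (k - 1) * (1 - sum(variance(i) for i in items) / variance(totals))
    difficulty = [mean(i) for i in items]    # proportion of "yes" for 0/1 items
    # Corrected item-total: correlate each item with the sum of the OTHER items.
    item_rest_r = [pearson(i, [t - v for t, v in zip(totals, i)]) for i in items]
    mean_inter_item_r = mean([pearson(a, b) for a, b in combinations(items, 2)])
    return {
        "alpha": alpha,
        "difficulty": difficulty,
        "item_rest_r": item_rest_r,
        "mean_inter_item_r": mean_inter_item_r,
    }


# Hypothetical yes/no answers of six respondents to four items.
responses = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
for name, value in item_statistics(responses).items():
    print(name, value)
```

Because alpha depends on both the number of items and their average intercorrelation, reporting the mean inter-item correlation alongside alpha, as the study does, separates a gain in homogeneity from a gain due to scale length alone.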
3. Results

3.1. How did the three concepts P, E, and N develop over the years?

3.1.1. Changes in basic psychometric properties

We analysed the psychometric properties (i.e., Cronbach's alpha, item difficulties, corrected
item-total correlations, and average inter-item correlations) of different versions of the P-scale
that were completed by Sample 2 (Fig. 1).

Fig. 1 shows that, as expected, item difficulty decreased over time from on average rather
difficult items in the precursors of the P-scale (PI, PEN), over slightly less difficult items in
the EPQ-R, to considerably easier items in the EPP and the EPP-S. At the same time, Cronbach's
alpha increased. This increase was not only due to the inclusion of more items, since also the
average inter-item correlations and the average corrected item-total correlations showed a similar
increase over time.

Interestingly, different patterns were found for the E and N scales (not shown in detail): While
the E scale remained largely constant over time with regard to the relevant psychometric properties
(e.g., item means between 0.45 and 0.54), the N scale also showed some decrease in item difficulty
in one step from the MMQ (item mean of 0.29) to the MPI (and subsequent scales; 0.48 for the N
scale of the EPQ-R). However, the items were more difficult again in the latest additions; that is,
the EPP (item mean of 0.27) and the EPP-S (0.32). Thus, there was a softening of N, but it was over
a short time and not implemented in the EPP and EPP-S.

An even clearer picture was obtained when analysing the means of the self-ratings of adjectives
used to describe the P concept in several publications and manuals over time: The item difficulty
strongly decreased over time (see Fig. 2), thus paralleling the picture obtained by the
questionnaires. Interestingly, the typically observed gender difference for P (men with higher
scores than women) was not found in the adjectives used in the last two instruments.

3.1.2. Factor structure of all used markers for P, E, and N

For the main question of the continuity or change in the prime concepts between the MMQ (Eysenck,
1947) and the Eysenck Personality Profiler (Eysenck & Wilson, 1991), we computed a principal
component analysis of all scales completed by Sample 2. To avoid overlap among P items, only the
newly added items in later versions were considered. The first three components explained 71.7% of
the variance and were rotated to the varimax criterion. Table 1 gives the factor loadings on all
three components labelled in accordance with the theoretical expectations.

Table 1 shows that the three factors clearly may be identified as P, E and N, explaining 12.7%,
26.2%, and 32.8% of the variance, respectively. With one exception every marker had its highest
loading on the expected factor and most of these were high and pure. The core of the P items that
were used in 1968 (from the PI) was actually loading on N. The sets of items added in 1975 and 1985
(i.e., new in EPQ and EPQ-R, respectively) were good markers of P. The P scale of the EPP had a
high loading, but also loaded on E. The total score of adjectives for P (but also the one of E and
N) was a clear marker. Thus, the model and the item contents converged. E was mostly clearly
measured, but the MPI E scale, as well as the EPI-B E, were more on the emotionally stable side.
Extraversion in the EPI-A E scale tended to load positively on P, which was likely due to the
impulsivity items (which were removed from the EPQ). N was clearly marked by the scales, with the
exception that there were introverted elements in the MMQ and EPP N scales. Thus, overall there was
continuity across the scales, with some unexpected secondary loadings for certain scales.

3.2. Did the relation between P and C and A change over time?

3.2.1. Self-ratings

The correlations between self-ratings in the various indicators of P (scales and adjectives) and
the A and C scales of the NEO-PI-R were computed (Samples 1 and 2). The results regarding the
adjectives and scales were very clear-cut. The correlations were used in Fig. 3 as coordinates to
place the marker in this two-dimensional space with A and C serving as axes.

The results were quite different for different sets of markers. For H. J. Eysenck the prototypical
markers were exclusively A- (denoted by black squares in Fig. 3). This was also the case when
additionally considering the level of prototypicality (not shown in detail). Next, the adjectives
based on the description of P from articles and manuals (top left panel) were primarily A- with a
slight correlation with C-. There was no development; in fact, also the Eysenck and Eysenck (1985)
model is A-, and only the description in the manual of the EPQ-R (Eysenck & Eysenck, 1991, 1992c)
yielded a slight involvement of C-
W. Ruch, et al. Personality and Individual Differences 169 (2021) 110070
[Figure: means (y-axis "Means", ticks from .05 to .5) plotted separately for males and females.]
Table 1
Factor loadings on the three varimax-rotated factors.

Scales              Psychoticism   Extraversion   Neuroticism
PI P ('68)              0.09          −0.01           0.67
PEN P (new'72)          0.65          −0.22           0.22
EPQ P (new'75)          0.61           0.18           0.05
EPQ-R P (new'85)        0.68           0.13          −0.02
EPP P                   0.75           0.25          −0.12
Adjectives P            0.68          −0.04           0.06
MPI E                   0.04           0.89          −0.22
EPI-A E                 0.23           0.89          −0.04
EPI-B E                 0.13           0.85          −0.30
EPQ-R E                 0.03           0.93          −0.14
EPP E                   0.10           0.81           0.02
Adjectives E           −0.06           0.87          −0.22
MMQ N                  −0.09          −0.32           0.84
MPI N                   0.07          −0.06           0.91
EPI-A N                −0.06          −0.10           0.93
EPI-B N                 0.04          −0.09           0.93
EPQ-R N                 0.02          −0.08           0.91
EPP N                   0.03          −0.27           0.88
Adjectives N            0.12          −0.17           0.80
Eigenvalues             2.41           4.98           6.23

Note. N = 305. Expected loadings in boldface; anomalies italicized.

(−0.20). The entire span of traits as presented in Eysenck (1992a) included conformist and conventional and had a noticeable correlation with low C. Thus, in terms of the description of the concept, P did not get softened. However, there was a slight mismatch between Eysenck's view of the concept and the adjectives used to describe the concept. This was picked up by the students, who mostly saw prototypical P (+1, +2, +3) as mostly A- and slightly lower on C (white squares).
This pattern was different for measured P in questionnaires. A very strong change can be seen for the EPP (bottom right panel in Fig. 3). While Eysenck and Wilson's (1976) total score was measuring only A-, the total score in the EPP contributed equally to C and A and the short scale EPP-S was even more C- than A-; the same held true for all German adaptations of the scale (EPP-D; the regular and short version by Bulheller & Häcker and the version by Moosbrugger et al., 1998), which lacked substantial correlations with A- (these versions can be derived from the translated version of the EPP used; results are not shown in detail). The inspection of the P items at different times (in the EPQ-R and its precursors) should be most interesting as this signified the development of P. In 1968, P was marginally negatively related to both C and A. In 1972, the correlation to A was twice as high as the one to C. The EPQ P scale correlated more with C (−0.40) than with A (−0.30), and for the EPQ-R scale the two correlations were equally high. The same pattern can be observed when only analysing those items that were newly added to the scales at the respective time points (bottom left panel in Fig. 3): While the items added in 1972 (PEN) only had slight loadings on C-, subsequently added items had considerably higher loadings on C-, while simultaneously their loadings on A- decreased.

3.2.2. Expert ratings of P-items: prototypicality for A and C
Similar trends were found for the expert ratings conducted for the set of new P items introduced for different versions of the P scale. To highlight the change, two packages were distinguished: The early P scales of 1968 and 1972 had 15 (7 positively and 8 negatively keyed) items identified by the experts to represent A, and only 2 items relating to C (2 yes, 0 no-items). Later P scales (new items for EPQ and EPQ-R) that led to the EPQ-R had 9 items (4/5) relating to A and 13 to C (10/3). This clearly demonstrates that there was a shift in the substance of P once superego was noticed in the mid-1970s to be the obverse of P. Early items defined P purely as A-, and in later scales P was a mixture of C- and A-. Items of the EPP or EPP-S were not used, but it is evident that these would be more prototypical for C- than for A-.

3.3. Is the alleged heterogeneity of the P-scale in part an artefact due to the wide range in item means and heterogeneity?

Before answering research question c), we first conducted a simulation to estimate the effects of impaired maximal correlations due to different item difficulties in binary data and then computed a principal
[Fig. 3, four panels: Adjectives; EPQ-R P Scale and Precursors; New Items of P Scales; The Eysenck Personality Profiler. In each panel the x-axis is NEOPI-R Agreeableness and the y-axis is NEOPI-R Conscientiousness, both running from −.8 to .2. Plotted labels: E'94, S, E&E'72, E&E'75, E&E'76, E&E'85, E&E'92, E'92, PI'68, PEN'72, EPQ'75, EPQ-R'91, new'72, new'75, new'91, EPP(E&W'75), EPP'91.]
Fig. 3. Relationships between self-ratings in several sets of P-adjectives (Top Left Panel), the EPQ-R P Scale and its Precursors (Top Right Panel), newly added items
of P scales (Bottom Left Panel), and the Eysenck Personality Profiler P-Scale (Bottom Right Panel) with Agreeableness and Conscientiousness. Each panel also shows
the relationships of prototypical adjectives for P as rated by H. J. Eysenck with Agreeableness and Conscientiousness (black squares; E'94) and prototypical adjectives
as rated by students (white squares; S). Abbreviations denote the publication year of the adjectives' source (e.g., E&E'72: Eysenck & Eysenck, 1972).
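The placement of a marker in Fig. 3 can be sketched in a few lines: the marker's correlation with Agreeableness gives the x-coordinate and its correlation with Conscientiousness gives the y-coordinate. The sketch below is only an illustration on synthetic data; the function name `marker_coordinates`, the sample size, and the generating weights are assumptions and are not taken from the study.

```python
import numpy as np

def marker_coordinates(marker, agreeableness, conscientiousness):
    """Return (r with A, r with C), i.e. the x/y position of one marker."""
    r_a = np.corrcoef(marker, agreeableness)[0, 1]
    r_c = np.corrcoef(marker, conscientiousness)[0, 1]
    return float(r_a), float(r_c)

# Synthetic scores (illustrative only, not the study's data).
rng = np.random.default_rng(0)
n = 300
a = rng.standard_normal(n)            # hypothetical Agreeableness scores
c = rng.standard_normal(n)            # hypothetical Conscientiousness scores
# A marker built to be mainly low-A with only a small low-C component.
marker = -0.6 * a - 0.1 * c + rng.standard_normal(n)

x, y = marker_coordinates(marker, a, c)
print(f"x (r with A) = {x:.2f}, y (r with C) = {y:.2f}")
```

A marker that is "primarily A-" in the sense used in the text would sit far to the left (strongly negative x) and close to zero on the vertical axis.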
Fig. 4. The loadings on the first two unrotated principal components as a function of variation in item difficulty.
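The kind of effect summarized in Fig. 4 can be illustrated with a small simulation. The sketch below is not the authors' simulation; it assumes a single latent trait, equal loadings of 0.7 for all items, 20 binary items whose thresholds (difficulties) range from −1.5 to 1.5, and 5000 simulated persons. All of these values are illustrative assumptions. Although the data are unidimensional by construction, the second unrotated principal component of the phi correlations tends to order the items by difficulty.

```python
import numpy as np

rng = np.random.default_rng(1)
n_persons, n_items = 5000, 20          # illustrative sizes (assumption)
loading = 0.7                          # same loading for every item (assumption)

theta = rng.standard_normal(n_persons)             # one latent trait
thresholds = np.linspace(-1.5, 1.5, n_items)       # only difficulty varies
noise = rng.standard_normal((n_persons, n_items))
latent = loading * theta[:, None] + np.sqrt(1 - loading**2) * noise
items = (latent > thresholds).astype(float)        # binary yes/no answers

item_means = items.mean(axis=0)                    # item difficulty (p values)
r = np.corrcoef(items, rowvar=False)               # Pearson r on binary data = phi

# Unrotated principal components of the phi matrix.
eigvals, eigvecs = np.linalg.eigh(r)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
loadings = eigvecs[:, :2] * np.sqrt(eigvals[:2])

# How closely the second component tracks item difficulty (sign is arbitrary).
difficulty_link = abs(np.corrcoef(item_means, loadings[:, 1])[0, 1])
print(f"|r(item mean, loading on PC2)| = {difficulty_link:.2f}")
```

In this setup the second component separates easy from difficult items even though no second trait was simulated, which is the sense in which heterogeneity can be an artefact of the range in item means.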
Fig. 6. Factor space defined by the first and second (6.4% of variance) principal component derived from the intercorrelation of the 25 P items in the English norm
sample.
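Intercorrelations of binary items such as those analysed here are phi coefficients, whose attainable maximum depends on the two item means. The recommendations in Section 4.1 mention the correction PHI-corr = PHI/PHI-max. A minimal sketch follows, assuming the standard bound for positive phi; the function names are hypothetical, and how negative coefficients should be handled is not specified in the text and is left out here.

```python
import numpy as np

def phi_max(p_i, p_j):
    """Upper bound of a positive phi for two binary items with means p_i, p_j."""
    lo, hi = min(p_i, p_j), max(p_i, p_j)
    return np.sqrt(lo * (1 - hi) / (hi * (1 - lo)))

def phi_corrected(x, y):
    """PHI / PHI-max for two 0/1 item vectors (positive association assumed)."""
    phi = np.corrcoef(x, y)[0, 1]      # Pearson r of binary data is phi
    return phi / phi_max(x.mean(), y.mean())

# Two items answered consistently (everyone endorsing the hard item also
# endorses the easy one), but with very different difficulties.
easy = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0], dtype=float)   # mean 0.5
hard = np.array([1, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=float)   # mean 0.1

phi = np.corrcoef(easy, hard)[0, 1]
print(round(phi, 3))                          # about 0.333
print(round(phi_corrected(easy, hard), 3))    # about 1.0
```

The raw phi of about 0.33 understates a relationship that is as strong as the item means allow; dividing by PHI-max rescales it to 1.0.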
0.40 and 0.45). This addresses research question a).
However, and important for research question b), the P scales did not only change over time with regard to difficulty but there were also considerable changes in content that go beyond the above-mentioned initial overlap with N: While early studies assessed P more as the opposite of A, later studies assessed P more as the opposite of C. As Eysenck and Eysenck considered P to be the obverse of the superego factor, this leads to the question of what other evidence there is for the relation between P and superego. In a study of the 16 PF and the EPQ, McKenzie (1988) found separate factors for superego and P, but the P scale loaded −0.40 on the superego factor, which lends some support to Eysenck's contention that P is the obverse of superego. In a further analysis, a superego factor was found that was loaded substantially by some but not all items of the P scale. The 14 P-items not loading significantly on superego were those involving the cruelty or sadism element of the concept of P. This goes along very well with the finding that the definition of P was mostly related to A-, but the P scale of the EPQ related to both A- and C-, and the new P items introduced in the EPQ were primarily C-. Overall, based on these findings one might question Eysenck and Eysenck's (1976) depiction of P as the opposite of the superego factor as reported by Cattell.
The change in the nature of P is also reflected in the EPP. For Eysenck and Wilson (1976), toughmindedness (i.e., P) was composed of aggressiveness–peacefulness, assertiveness–submissiveness, achievement-orientation–unambitiousness, manipulation–empathy, sensation seeking–unadventurousness, dogmatism–flexibility, and masculinity–femininity. The facets of P in the later EPP were risk-taking, impulsivity, irresponsibility, manipulativeness, sensation seeking, tough-mindedness, and practicality. Using a confirmatory factor analysis on the German EPP, Moosbrugger and Fischbach (2002) only found three
facets fitting to the concept, namely impulsivity, irresponsibility, and sensation seeking. This means that the facets suitable to measure P were actually the ones of superego/impulse control/C, but no longer the ones that had typically been used in the descriptions of P; that is, facets that relate to A- (manipulation, empathy, risk-taking, and tough-mindedness).
Overall, the present study showed that the line of development of P can be traced using different methods, including the instruments (e.g., PI, PEN, EPQ, EPQ-R, EPP) and the adjectives used to describe the P concept in publications and manuals. Interestingly, this is in contrast to Eysenck's own characterization of prototypical adjectives of the P concept, which even in 1994 followed the original description of the P-scale as only A- rather than the later operationalization (and description) as A- and C-. In sum, three different clusters need to be kept apart: First, Eysenck's stipulation that P is only A-; second, a slight development in the description of the high P scorer that stopped in 1975; and third, the items of instruments that gradually increased the involvement of C- until it even dominated over A-. Interestingly, the students who studied the provided material also rated primarily A-, with the exception of slightly low expression of P, which was strongly C-. This is consistent with the finding that the gender differences were more prevalent in the early versions (see Fig. 2), as there typically are gender differences in A but not in C (see e.g., Weisberg, DeYoung, & Hirsh, 2011).
With respect to the scale characteristics and research question c), the reported low alpha of P was identified to be a function of item heterogeneity regarding item difficulty (which will be ameliorated if using Likert-type scales, such as a 6-point answer format). Part of the heterogeneity of the P scale stemmed from the differences in item means (this span increased for the revised P-scale of the EPQ-R) and was a by-product of applying data analysis methods to data not fulfilling the requirements. However, these effects of item difficulty were confounded with changes in item content: The easier items more strongly related to C-, while the more difficult items showed stronger relationships to A-. Thus, the present analysis limits, but does not rule out, the interpretation that P is factorially heterogeneous, and we tentatively conclude that both of these aspects contribute to the reported heterogeneity of P. Future studies should tackle this problem by using modified items that disentangle item content and item difficulty.

4.1. Recommendations and implications

Several recommendations can be made based on the present study. The results help to better integrate the findings derived with different P scales. They provide new perspectives on potential causes for the observed heterogeneity of P and on the debate of the relations between P and A and C. The study also demonstrates that the gap between the concept of P and measures of P got wider. We suggest using the EPQ-R for the testing of the PEN model, since it provided the clearest factor structure with the least amount of secondary loadings and since it came closest to Eysenck's conceptualization of the P-scale as primarily low A, while later instruments increased the scale's relationships to low C. We know which of the scales measured E and N most purely (i.e., those that had no second loadings), and for P we know the relative contribution of the different facets in a measure. For “tough” individuals and applications in a forensic context, it is important that elements of being cold, hostile, and aggressive are assessed, and hence the earlier versions of the P scale are recommended. If weaker expressions of P are to be measured, then the EPP scales are best for representing “soft” P. Due to the high item difficulty in several measures, it is recommended to compute factor analyses based on corrected coefficients (e.g., PHI-corr = PHI/PHI-max) or tetrachoric correlations for obtaining unbiased findings. Furthermore, when evaluating the internal consistency of the P scale, it is advisable to use split-half reliability (with items matched for difficulty; see Ruch, 1999) rather than Cronbach's alpha, for which the P scale does not meet the requirements (Feldt, Woodruff, & Salih, 1987).

4.2. Limitations

Of course, the present findings have to be interpreted while taking some strengths and weaknesses into account. While the present study used a large sample of mainly non-student adults and relied on different methods (e.g., using items and adjectives rated by experts), only self-report measures were employed. While we would expect the same patterns to emerge in peer-ratings, we did not collect data for settling this question. Further, as the participants completed consecutive measures, the questionnaire was redundant in many places. Although this might have decreased the participants' motivation in the study, we observed a generally very low dropout rate.

4.3. Conclusions

The present study is the first to simultaneously look at all scales measuring the Eysenckian concepts of P, E, and N over half a century. The different scales and adjectives of P, E, and N could be well separated in principal components analyses, supporting the general viability of the different measures to represent the PEN model. Still, it appeared to be most difficult to transfer P to a construct to be measured in a general adult population, which led to a shift in content over time. The “softening” of the P items during the revisions confounded item difficulty and content, shifting the content from low A to a mixture of low A and low C, and contributed to the heterogeneity of the P scale. Depending on the intended population under study or application, either earlier versions of the questionnaires (PI, EPQ) or later versions (EPP, EPP-S) can be recommended. Overall, the EPQ-R seems to be the most valid single measure of the PEN model.

CRediT authorship contribution statement

Willibald Ruch: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing - original draft, Writing - review & editing.
Sonja Heintz: Project administration, Writing - original draft, Writing - review & editing.
Fabian Gander: Visualization, Writing - original draft, Writing - review & editing.
Jennifer Hofmann: Writing - original draft, Writing - review & editing.
Tracey Platt: Writing - original draft, Writing - review & editing.
René T. Proyer: Writing - original draft, Writing - review & editing.

Acknowledgments

This research was supported by a grant of the Deutsche Forschungsgemeinschaft – DFG (HE 3143/1-1 “Untersuchungen zum PEN-Modell der Persönlichkeit”). Thanks to Olga Altfreder, Matthias Bergande, Kristina Dürscheidt, Cornelia Kirchhof, Gwendolin Linnenbrink, and Nina Rambech for collecting part of the data, and to the approximately 1000 participants who filled in up to 8 h worth of questionnaires over the time span of two years. The authors are grateful to Dr. Paul Barrett for providing access to the norm data on the EPQ for analysis of means. We also thank Dr. S. B. G. Eysenck for providing the expert rating on P and Drs. Alois Angleitner, Peter Borkenau, Filip De Fruyt, Lewis R. Goldberg, A. A. Jolijn Hendriks, Willem K. B. Hofstee, John A. Johnson, Robert R. McCrae, Ivan Mervielde, and Gerard Saucier for providing the FFM ratings.

References

Alexopoulos, D. S., & Kalaitzidis, I. (2004). Psychometric properties of Eysenck Personality Questionnaire-Revised (EPQ-R) short scale in Greece. Personality and Individual Differences, 37(6), 1205–1220. https://doi.org/10.1016/j.paid.2003.12.005.
Bowden, S. C., Saklofske, D. H., Van de Vijver, F. J. R., Sudarshan, N. J., & Eysenck, S. B. G. (2016). Cross-cultural measurement invariance of the Eysenck personality
questionnaire across 33 countries. Personality and Individual Differences, 103, 53–60. https://doi.org/10.1016/j.paid.2016.04.028.
Bulheller, S., & Häcker, H. (1998). Deutsche Bearbeitung. (German revision). In H. J. Eysenck, C. D. Wilson, & C. J. Jackson (Eds.). Eysenck Personality Profiler EPP-D: Manual (EPP, German version, manual). Frankfurt am Main, Germany: Swets Test Services.
Costa, P. T., & McCrae, R. R. (1992). Revised NEO personality inventory (NEO PI-R) and NEO five-factor inventory (NEO-FFI). Professional manual. Odessa, FL: PAR.
Costa, P. T., & McCrae, R. R. (1995). Primary traits of Eysenck's P-E-N system: Three- and five-factor solutions. Journal of Personality and Social Psychology, 69, 308–317. https://doi.org/10.1037/0022-3514.69.2.308.
Eggert, D. (1974). Eysenck-Persönlichkeits-Inventar: E-P-I: Handanweisung für die Durchführung und Auswertung [Eysenck Personality Inventory (EPI): Directions for its implementation and evaluation]. Göttingen, Germany: Hogrefe.
Eysenck, H. J. (1947). Dimensions of personality. London, UK: Routledge & Kegan Paul.
Eysenck, H. J. (1953). Maudsley Persönlichkeitsfragebogen [Maudsley Personality Inventory]. Göttingen, Germany: Hogrefe.
Eysenck, H. J. (1959a). Manual of the Maudsley personality inventory. London, UK: University of London Press.
Eysenck, H. J. (1959b). Das “Maudsley personality inventory” (MPI). Göttingen, Germany: Hogrefe.
Eysenck, H. J. (1970). EPI Eysenck personality inventory. London, UK: University of London Press.
Eysenck, H. J. (1974). Eysenck-Persönlichkeits-Inventar E-P-I. Eysenck Personality Inventory EPI. Göttingen, Germany: Hogrefe.
Eysenck, H. J. (1992a). The definition and measurement of psychoticism. Personality and Individual Differences, 13, 757–785. https://doi.org/10.1016/0191-8869(92)90050-Y.
Eysenck, H. J. (1992b). Four ways five factors are not basic. Personality and Individual Differences, 13, 667–673. https://doi.org/10.1016/0191-8869(92)90237-J.
Eysenck, S. B. G. (1992c). Manual of the EPQ-R and the impulsiveness, venturesomeness and empathy scales. London, UK: Hodder & Stoughton.
Eysenck, H. J. (1994). Normality–abnormality and the three-factor model of personality. In S. Strack, & M. Lorr (Eds.). Differentiating normal and abnormal personality (pp. 3–25). New York, NY: Springer.
Eysenck, H. J. (1997). Personality and experimental psychology: The unification of psychology and the possibility of a paradigm. Journal of Personality and Social Psychology, 73, 1224–1237. https://doi.org/10.1037/0022-3514.73.6.1224.
Eysenck, H. J., Barrett, P., Wilson, G., & Jackson, C. (1992). Primary trait measurement of the 21 components of the P-E-N system. European Journal of Psychological Assessment, 8(2), 109–117.
Eysenck, H. J., & Eysenck, M. W. (1985). Personality and individual differences: A natural science approach. New York, NY: Plenum Press.
Eysenck, H. J., & Eysenck, S. B. G. (1964). Manual of the Eysenck personality inventory. London, UK: University of London Press.
Eysenck, H. J., & Eysenck, S. B. G. (1975). Manual of the Eysenck personality questionnaire. London, UK: Hodder & Stoughton.
Eysenck, H. J., & Eysenck, S. B. G. (1976). Psychoticism as a dimension of personality. London, UK: Hodder and Stoughton.
Eysenck, H. J., & Eysenck, S. B. G. (1991). Manual of the Eysenck Personality Scales (EPS adults). London, UK: Hodder & Stoughton.
Eysenck, H. J., Eysenck, S. B. G., & Barrett, P. (1985). A revised version of the psychoticism scale. Personality and Individual Differences, 6, 21–29. https://doi.org/10.1016/0191-8869(85)90026-1.
Eysenck, H. J., & Wilson, G. (1991). The Eysenck Personality Profiler. London, UK: Corporate Assessment Network.
Eysenck, H. J., & Wilson, G. D. (1976). Know your own personality. New York, NY: Penguin Books.
Eysenck, H. J., Wilson, G. D., & Jackson, C. (1999). The Eysenck Personality Profiler (short) (2nd ed.). Guildford, UK: Psi-Press.
Eysenck, S. B. G., & Eysenck, H. J. (1968). The measurement of psychoticism: A study of factor stability and reliability. British Journal of Social & Clinical Psychology, 7(4), 286–294. https://doi.org/10.1111/j.2044-8260.1968.tb00571.x.
Eysenck, S. B. G., & Eysenck, H. J. (1972). The questionnaire measurement of psychoticism. Psychological Medicine, 2, 50–55. https://doi.org/10.1017/S0033291700045608.
Eysenck, S. B. G., & Eysenck, H. J. (1978). Impulsiveness and venturesomeness: Their position in a dimensional system of personality description. Psychological Reports, 43(3), 1247–1255. https://doi.org/10.2466/pr0.1978.43.3f.1247.
Feldt, L. S., Woodruff, D. J., & Salih, F. A. (1987). Statistical inference for coefficient alpha. Applied Psychological Measurement, 11(1), 93–103. https://doi.org/10.1177/014662168701100107.
Ferrando, P. J. (2001). The measurement of neuroticism using MMQ, MPI, EPI and EPQ items: A psychometric analysis based on item response theory. Personality and Individual Differences, 30(4), 641–656. https://doi.org/10.1016/S0191-8869(00)00062-3.
Furnham, A., Eysenck, S. B. G., & Saklofske, D. H. (2008). The Eysenck personality measures: Fifty years of scale development. In G. J. Boyle, G. Matthews, & D. H. Saklofske (Vol. Eds.), The SAGE handbook of personality theory and assessment. Vol. 2 (pp. 199–218). London, UK: SAGE Publications.
Goldberg, L. R., & Rosolack, T. K. (1994). The big-five factor structure as an integrative framework: An empirical comparison with Eysenck's P-E-N model. In C. F. Halverson, G. A. Kohnstamm, & R. P. Martin (Eds.). The developing structure of temperament and personality from infancy to adulthood (pp. 7–36). Hillsdale, NJ: Erlbaum.
Heaven, P. C., Ciarrochi, J., Leeson, P., & Barkus, E. (2013). Agreeableness, conscientiousness, and psychoticism: Distinctive influences of three personality dimensions in adolescence. British Journal of Psychology, 104, 481–494. https://doi.org/10.1111/bjop.12002.
Jackson, C. J., & Francis, L. J. (2004). Primary scale structure of the Eysenck Personality Profiler (EPP). Current Psychology, 22(4), 295–305. https://doi.org/10.1007/s12144-004-1035-9.
Jackson, C. J., Furnham, A., Forde, L., & Cotter, T. (2000). The structure of the Eysenck personality profiler. British Journal of Psychology, 91(2), 223–239. https://doi.org/10.1348/000712600161808.
Knyazev, G. G., Belopolsky, V. I., Bodunov, M. V., & Wilson, G. D. (2004). The factor structure of the Eysenck Personality Profiler in Russia. Personality and Individual Differences, 37(8), 1681–1692. https://doi.org/10.1016/j.paid.2004.03.003.
McCrae, R. R., & Costa, P. T., Jr (1985). Comparison of EPI and psychoticism scales with measures of the five-factor model of personality. Personality and Individual Differences, 6, 587–597. https://doi.org/10.1016/0191-8869(85)90008-X.
McKenzie, J. (1988). Three superfactors in the 16PF and their relation to Eysenck's P, E and N. Personality and Individual Differences, 9(5), 843–850. https://doi.org/10.1016/0191-8869(88)90002-5.
Moosbrugger, H., & Fischbach, A. (2002). Evaluating the dimensionality of the Eysenck Personality Profiler–German version (EPP-D): A contribution to the Super Three vs. Big Five discussion. Personality and Individual Differences, 33(2), 191–211. https://doi.org/10.1016/S0191-8869(02)00095-8.
Moosbrugger, H., Fischbach, A., & Schermelleh-Engel, K. (1998). Zur Konstruktvalidität des EPP-D. [On the construct validity of EPP-D]. In H. J. Eysenck, C. D. Wilson, & C. J. Jackson (Eds.). Eysenck Personality Profiler EPP-D. Manual. Frankfurt, Germany: Swets Test Services.
Muris, P., Schmidt, H., Merckelbach, H., & Rassin, E. (2000). Reliability, factor structure and validity of the Dutch Eysenck Personality Profiler. Personality and Individual Differences, 29(5), 857–868. https://doi.org/10.1016/S0191-8869(99)00237-8.
Ostendorf, F. (1994). Zur Taxonomie deutscher Dispositionsbegriffe [On the taxonomy of German disposition terms]. In W. Hager, & M. Hasselhorn (Eds.). Handbuch deutschsprachiger Wortnormen (pp. 382–441). Göttingen, Germany: Hogrefe.
Ostendorf, F., & Angleitner, A. (2004). NEO-PI-R - NEO-Persönlichkeitsinventar nach Costa und McCrae [NEO-PI-R - NEO-Personality Inventory by Costa and McCrae]. Göttingen, Germany: Hogrefe.
Rocklin, T., & Revelle, W. (1981). The measurement of extroversion: A comparison of the Eysenck Personality Inventory and the Eysenck Personality Questionnaire. British Journal of Social Psychology, 20(4), 279–284. https://doi.org/10.1111/j.2044-8309.1981.tb00498.x.
Roger, D., & Morris, J. (1991). The internal structure of the EPQ scales. Personality and Individual Differences, 12, 759–764. https://doi.org/10.1016/0191-8869(91)90232-Z.
Royce, J. R. (1973). Multivariate analysis and psychological theory. London, UK: Academic Press.
Ruch, W. (1999). Die revidierte Fassung des Eysenck Personality Questionnaire und die Konstruktion des deutschen EPQ-R bzw. EPQ-RK [The Eysenck Personality Questionnaire-Revised and the Construction of German Standard and Short Versions (EPQ-R and EPQ-RK)]. Zeitschrift für Differentielle und Diagnostische Psychologie, 20(1), 1–24. https://doi.org/10.1024//0170-1789.20.1.1.
Weisberg, Y. J., DeYoung, C. G., & Hirsh, J. B. (2011). Gender differences in personality across the ten aspects of the Big Five. Frontiers in Psychology, 2, 178. https://doi.org/10.3389/fpsyg.2011.00178.
Zuckerman, M., & Glicksohn, J. (2016). Hans Eysenck's personality model and the constructs of sensation seeking and impulsivity. Personality and Individual Differences, 103, 48–52. https://doi.org/10.1016/j.paid.2016.04.003.