Psychometrics: Psicometria Psicometría
Psychometrics
PSICOMETRIA
PSICOMETRÍA
Luiz Pasquali
Researcher Professor associated with the University of Brasilia. Brasilia, DF, Brazil.
Rev Esc Enferm USP. 2009; 43(Spe):992-9. www.ee.usp.br/reeusp/
Received: 15/06/2008. Approved: 15/12/2008.
Portuguese / English: www.scielo.br/reeusp
INTRODUCTION

Measurement in psychosocial sciences

Psychometrics is, etymologically, the theory and technique of measuring mental processes, and it is applied especially in the fields of psychology and education. It is grounded in the general theory of measurement in the sciences, that is, in the quantitative method, whose major characteristic is that it represents knowledge of nature more precisely than common language does when describing the observation of natural phenomena.

Psychometrics historically stems from the psychophysics of the Germans Ernst Heinrich Weber and Gustav Fechner. The Briton Francis Galton also contributed to its development by creating tests to measure mental processes; indeed, he is considered the creator of psychometrics. However, it was Leon Louis Thurstone, the inventor of multiple factor analysis, who enlivened psychometrics and set it apart from psychophysics. Psychophysics was defined as the measurement of directly observed processes, in other words, the organism's stimulus and response, while psychometrics consists in measuring the organism's behavior by means of mental processes (law of comparative judgment).

Measurement in the sciences has raised disputes among researchers, particularly in the social sciences. Nonetheless, the most widely accepted definition was given by Stanley Smith Stevens in 1946: to measure is to assign numbers to objects and events in accordance with given rules(1). The rules for assigning such numbers are defined by the same author's proposal of four measurement levels, or measurement typologies: nominal, ordinal, interval, and ratio.

Nominal measurement applies numbers to natural phenomena while preserving only the axioms of identity; that is, the number is employed only as a numeric or graphic symbol. Ordinal measurement additionally preserves the axioms of order, that is, the number's major characteristic, its magnitude (by definition, a given number is greater or smaller than another, not merely different from it, because its value is intrinsically higher or lower). The other typologies add the axioms of additivity. The history of these axioms was detailed by Whitehead and Russell between 1910 and 1913, and again in 1965, in their book Principia Mathematica, where they describe the 27 famous axioms of the mathematical number(2).

PSYCHOMETRICS: CONCEPT AND MODELS

Modern psychometrics can be traced back to two sources: classical test theory (CTT) and item response theory (IRT). CTT was axiomatized by Gulliksen(3); IRT was initially elaborated by Lord(4) and Rasch(5), and finally axiomatized by Birnbaum(6) and Lord(7).

In a general sense, psychometrics attempts to explain the meaning of the responses given by subjects to a series of tasks, typically called items. CTT aims to explain the final total result, that is, the sum of the responses given to a series of items, expressed as the so-called total score (S). For instance, the S in a test of 30 ability items would be the number of correctly answered items. If each correct item is scored 1 and each incorrect item 0, and the subject answers 20 items correctly and 10 incorrectly, this person's score S is 20. CTT then asks: what does this total of 20 mean for the subject? IRT, on the other hand, is not interested in the total test score; it looks specifically at each one of the 30 items and asks what the probability is, and which factors influence that probability, of each individual item being answered correctly or incorrectly (in ability tests) or being accepted or rejected (in preference tests: personality, interests, attitudes). Thus CTT is interested in producing quality tests, while IRT is focused on developing quality tasks (items). In the end, therefore, we have either valid tests (CTT) or valid items (IRT), and with valid items one can build as many valid tests as desired, or as many as the items allow. The richness of psychological or educational assessment within the scope of IRT therefore lies in building stores of valid items that evaluate latent traits; these stores are called item banks, and they make it possible to assemble countless tests.

The CTT model was elaborated by Spearman and detailed by Gulliksen, as follows:

T = TS + E

where,

T = the subject's total or empirical score, which is the sum of all the items achieved in the test;

TS = the true score, which is the real magnitude in the subject of what the test intends to measure; it equals the total score (S) itself when there is no measurement error;

E = the measurement error.

In this way, the empirical score is the sum of the true score and the error; consequently, E = T - TS, and TS = T - E.
Figure 1 shows the relationship among these various elements of the empirical score, where the union between the true (TS) and the error (ES) score can be observed; that is to say, the subject's empirical or gross score (T, the test result, known as the Tau score, τ) is comprised of two components: the subject's real or true score (TS) in what the test intends to measure, and the error score (ES) of the measurement, which is always present in any empirical operation. In other words, we are assuming here that as the subject's gross score differs from his true score, it is the error that accounts for such a disparity; this difference, then, is the error's concept itself.

[Figure 1: the empirical score T as the union of the true score (TS) and the error score (ES).]

[Figure: curve of Pi(θ) (vertical axis, 0 to 1.00) against Capability θ (horizontal axis, 0 to 8).]
ity, or the so-called instrument calibration. This measurement issue is also relevant to psychosocial sciences, although conceptually it has nothing to do with the issue of validity.

This is because validity refers to the congruence between the instrument used for measurement and the property under evaluation, not to the accuracy with which the object's property is described. In physics, the instrument is a physical object that measures physical properties, so it seems easy to recognize whether or not the measuring property of the instrument is congruous with the property of the measured object. Take the property of length, for example. The instrument that measures this property, the meter, uses its own length to measure another object's length; we are thus matching length with length as univocal terms. There is no need to prove that the meter's length property is congruous with the same property in the measured object; the terms are univocal, conceptually equivalent, and identical.

It is less clear, however, when the astronomer measures the speed at which a galaxy approaches or recedes via the Doppler effect, where the shift of the spectral lines of the galaxy's light serves as the measurement instrument. Here we do have a problem in validating the measurement instrument; the question is: is it or is it not true that spectral line distances have to do with the speed of galaxies? Such an inference can be made, but it has to be empirically demonstrated in some way, that is, at least its consequences should be indicated, as well as all the derived, derivable, or verifiable hypotheses. In this specific case, the problem of measurement precision concerns the precision of the distance measurements of the oscilloscope's spectral lines, whereas validity concerns whether or not the measurement of spectral line distances, regardless of its accuracy and perfection, has something to do with the galaxy's recession speed. In other words, validity in this case refers to demonstrating the compatibility (legitimacy) of representing or modeling galactic speed via spectral line distances.

This astronomy case illustrates what typically occurs with measurements in psychosocial sciences, and it consequently makes evidence of instrument validity in these sciences essential and crucial; showing the validity of instruments in these sciences is a sine qua non condition. This is particularly the case for the above-mentioned approaches that deal with the psychological concept of latent trait, where the correspondence (congruence) between the latent trait and its physical representation (behavior) must be demonstrated. It is not incidental, therefore, that the problem of validity has taken a central role in measurement theory throughout the history of psychology; in fact, it is its basic and indispensable parameter.

Psychometrics manuals usually define the validity of a test by stating that a test is valid if it measures what it is supposed to measure. Although this definition may sound like a tautology, it is not one within a psychometric theory that admits the latent trait. The definition clearly states that whenever behaviors (items) are measured, and behaviors are the physical representation of the latent trait, the latent trait itself is being measured. This supposition is only possible when a previously existing theory of the trait supports the behavioral representation as a hypothesis deducible from the theory. The validity of the test (the hypothesis) will therefore be established by empirically testing the hypothesis. This is simply the scientific methodology. Hence, the current psychometric practice of intuitively grouping a series of items and statistically verifying a posteriori what they are measuring is quite unusual. The emphasis on formulating the trait theory used to be quite weak in the past; under the influence of cognitive psychology, psychometrics is fortunately restoring this emphasis to its proper place.

Classical psychometrics, by the way, understands what supposedly has to be measured as the criterion, which is represented by a parallel test. Thus, the "what" is the latent trait in the cognitivist conception of psychometrics, and it is the criterion (the score in the parallel test) in the behaviorist perspective.

The validation process of any given test begins with the formulation of detailed definitions of specific traits or constructs, derived from psychological theory, previous research, or systematic observation and analysis of the relevant domains of behavior. The items of the test are then prepared to fit the construct's definitions. Next, empirical analyses of the items are carried out, and the most efficient (i.e., valid) items are finally selected from the initial sample of items(9).

Although it constitutes the core point of psychometrics, the validation of the trait's behavioral representation, that is, of the test, brings about significant difficulties located at three levels in the process of elaborating the instrument: the level of theory, the level of empirical data collection, and the level of statistical analysis of the data proper.

The most significant difficulties are probably centered at the level of theory. As a matter of fact, psychological theory is still in an embryonic state, and so it virtually lacks any level of axiomatization. As a result, a wide range of theories arises, even contradictory ones. It is worth remembering that we have several theories, such as behaviorism, psychoanalysis, existentialist psychology, dialectical psychology, and others; when existing simultaneously,
they postulate irreducible principles among the various theories; they can also combine principles weakly within the same theory, or even prove too insufficient to develop useful hypotheses for psychological knowledge. This confusion takes place in the theoretical field of the constructs, that is, in the formulation of clear and accurate hypotheses to test, or in the postulation of useful psychological hypotheses. Even when the operationalization process succeeds, empirical data collection is not free of difficulties, such as, for example, the unequivocal definition of criterion groups in which these constructs can ideally be studied. Problems are found even at the level of statistical analysis. According to the logic of instrument elaboration, the hypothesis that the representation of the construct is legitimate is verified by means of analyses such as (confirmatory) factor analysis, which attempts to identify in the empirical data the constructs previously operationalized in the instrument. But factor analysis makes some strong assumptions that do not always match the facts. For instance, factor analysis assumes that subjects' responses to the instrument's items are determined by a linear relationship with the latent traits. The rotation of axes is another serious problem, allowing for countless factor solutions for the same instrument(10).

With these difficulties in mind, psychometricians call upon a series of techniques to make it possible to demonstrate the validity of the instrument. These techniques can essentially be reduced to three large classes (the trinitarian model): construct validation, content validation, and criterion validation(11,12).

Construct validation, or concept validation, is deemed the most fundamental form of validating psychological instruments, and this is quite reasonable, since it constitutes the direct way of verifying the hypothesis that the behavioral representation of latent traits is legitimate; it is therefore connected with the psychometric theory defended here. Historically, the concept of construct entered psychometrics through the American Psychological Association Committee on Psychological Tests, which functioned between 1950 and 1954, and whose results later became technical recommendations for psychological tests(12).

The concept of construct validity was elaborated in the classical article by Cronbach and Meehl(13), Construct validity in psychological tests, although the concept was already part of history under other names, such as intrinsic validity, factorial validity, and face validity. These various terms show how confused the notion of construct was. Although Cronbach and Meehl attempted to clarify the concept of construct validity, they still define it as the characteristic that a test has of measuring an attribute or quality that has not been operationally defined(13). They recognize, however, that construct validity required a new scientific approach. In fact, to define validity as they did sounds rather unusual in science, as concepts that are not operationally defined are not susceptible to scientific knowledge. Concepts or constructs are scientifically researchable only when they can be given an adequate behavioral representation. Otherwise, they are merely metaphysical, non-scientific concepts. The problem stemming from the general attitude of the psychometricians of that time is that, whenever construct validity had to be defined, researchers started from the test, that is, from the behavioral representation, instead of beginning with the psychometric theory grounded in the elaboration of the construct's theory (or latent trait theory). The problem is not to identify the construct from some existing representation (test), but to find out whether or not the representation (test) constitutes a legitimate, adequate representation of the construct. This approach demands close collaboration between psychometricians and cognitive psychology(14). The construct validity of a test can be approached from several angles: the analysis of the construct's behavioral representation; the analysis by hypothesis; and the IRT information curve(15-16).

The criterion validity of a test consists of how efficiently it predicts a specific performance of a subject. The subject's performance thus becomes the criterion against which the measurement obtained by the test is assessed. The subject's performance must obviously be measured or assessed through techniques that are independent of the test itself.

Two types of criterion validity are distinguished: (1) predictive validity, and (2) concurrent validity. The core difference between them is basically the time elapsed between collecting the data for the test to be validated and collecting the data for the criterion. If both collections are performed almost simultaneously, the result is concurrent validity; if the data about the criterion are collected after the test data, the result is predictive validity. Whether the information is obtained simultaneously or after the test itself is not a technically relevant factor for the validity of the test. What matters is the determination of a valid criterion. This is where the central nature of this type of test validation lies: (1) defining an adequate criterion, and (2) measuring the criterion in a valid way, independently of the test itself.

As for the adequacy of criteria, a number of them are commonly employed, such as:

1) Academic performance. Perhaps this used to be, or still is, the most frequently applied criterion for validating intelligence tests. It consists in obtaining the students' school performance by means of teachers' grades, the students' general academic average, the academic honors received by students, or even the purely subjective assessment of these students' intelligence by teachers or colleagues. Despite being broadly used, this criterion has been
similarly quite criticized, mainly because of the deficiency of its assessment process. It is widely known that teachers tend to be biased in attributing grades to students; this bias is not always a conscious act, but it stems from their attitudes and sympathies towards this or that student. Teachers could overcome this challenge quite easily if they were used to applying performance tests based on content validity, for instance. As this is quite a laborious task, teachers typically make no effort to validate (content validity) the students' academic tests.

In this context, the subject's schooling level is also applied as an academic performance criterion: advanced, repeating, and dropout subjects. Supposedly, those who follow a regular course of study, or those who are academically advanced for their age, are more intelligent. Evidently, not only intelligence is at play here, but also several other social factors, personality aspects, etc., which makes this quite an ambiguous, deceptive criterion.

2) Performance in specialized training. It refers to the performance obtained in training courses for specific situations (musicians, pilots, mechanical or specialized electronic activities, etc.). At the end of the training a typical assessment takes place, producing useful data that serve as the criterion for the students' performance. The critical observations made under item 1 apply here as well.

3) Professional performance. In this case, test outcomes are compared with the subjects' success or failure, or with their quality level, in the work environment. Hence, a test of mechanical ability can be checked against the mechanical performance of subjects in a given workplace. Assessing the quality of the performance of subjects on the job is, again, evidently quite a difficult task.

4) Psychiatric diagnosis. This method is widely used to validate personality/psychiatric tests. The criterion groups are formed from the results of the psychiatric assessment that establishes clinical categories: normal versus neurotic, psychopath versus depressive, etc. Again, it is very hard to establish the adequacy of the psychiatrists' assessments.

5) Subjective diagnosis. Assessments performed by colleagues and friends can be a basis for establishing criterion groups. This technique is employed, above all, in personality tests, where more objective assessments are hard to obtain. Thus, subjects place their colleagues in categories, or score personality traits (aggressiveness, cooperation, etc.), based on their experience of living together. Needless to say, these assessments raise enormous difficulties in terms of objectivity; nonetheless, using a large number of judges can diminish the subjective biases of these evaluations.

6) Other available tests. The outcomes obtained by means of another valid test that predicts the same performance as the test to be validated can serve as a criterion to determine the validity of the new test. An obvious question arises: what is the purpose of creating another test if an existing one validly measures what it is supposed to measure? The answer is based on economy, that is, a test that takes longer to answer or to score is used as the criterion to validate another test that takes less time.

For this last type of validation, two situations must be distinguished. First, whenever there are demonstrably validated tests for the measurement of a trait, they certainly constitute a criterion against which a new test can be safely validated. Nevertheless, when tests accepted as definitely validated do not exist for the assessment of a latent trait, the application of concurrent validity is extremely precarious. This situation is unfortunately the most common one. As a matter of fact, there are tests available to measure practically anything, as attested by Buros's Mental Measurements Yearbooks, which are periodically published and list thousands of psychological tests on the market. In this case, these tests can be used as validation criteria, but the risk is excessively high, because a test whose own validity is at least questionable is being employed as a criterion.

We can conclude that concurrent validity only makes sense if demonstrably valid tests can serve as the criterion against which one wants to validate a new test, and if the new test has some advantage over the previous one (such as saving time).

A frustrating issue stands out at the end of this study of criterion validity processes. If the researcher has employed all his ability to build a test, under the highest possible degree of control, why would he validate this test against inferior measures, represented by the measurement of the various criteria presented here? Is it reasonable to validate supposedly superior measurements using a poorer measurement?(17). The criticisms of Thurstone in 1952 and, above all, those of Cronbach and Meehl in 1955(13,18) displaced criterion validity as the panacea for validating psychological tests, in favor of construct validity. However, these criteria can still be deemed good and useful for criterion validation. The significant difficulty with almost all of them lies in demonstrating the adequacy of their measurement; in other words, these measurements are generally precarious, thus leaving much doubt about the test validation process. Nonetheless, there are well-known examples of tests validated through this method, such as the MMPI (Minnesota Multiphasic Personality Inventory).

A test's content validity consists in verifying whether or not the test constitutes a representative sample of a finite universe of behaviors (domain). It is applicable whenever a finite universe of behaviors can be delimited a priori, as in the case of performance tests that intend to cover a content delimited by a specific course program(11).
Test reliability

The reliability or trustworthiness parameter of tests is referred to by a long and heterogeneous series of names. Some of those names stem from the very concept of this parameter; in other words, these terms attempt to express what the parameter really represents for the test. These names are, mainly: precision, trustworthiness, and reliability. Other names for this parameter result more directly from the type of technique applied in the empirical collection of data, or from the statistical technique employed in the analysis of the collected data. Among these names we mention the following: stability, constancy, equivalence, internal consistency.

The trustworthiness, or reliability, of a test refers to the major characteristic it must display, namely, measuring without error; hence the terms precision, reliability, and trustworthiness. Measuring without error means that the same test measuring the same subjects on different occasions, or equivalent tests measuring the same subjects on the same occasion, produce identical outcomes; in other words, the correlation between the two measurements must be 1. However, as error is always present in any measurement, the further this correlation departs from 1, the larger the measurement error. The reliability analysis of a psychological instrument shows precisely how far the instrument departs from the ideal correlation of 1; the closer the coefficient is to 1, the lower the error.

The problem of test trustworthiness used to be a favorite issue of classical psychometrics, where the statistical estimation apparatus for this parameter developed the most; but it lost importance within modern psychometrics in favor of the validity parameter. In any case, within CTT, the trustworthiness coefficient, $r_{tt}$, is statistically defined as the correlation between the scores of the same subjects on two parallel forms of a test, $T_1$ and $T_2$. Hence, the trustworthiness coefficient is defined as a function of the covariance [$\mathrm{Cov}(T_1, T_2)$] between the test forms and their own variances ($S_{T_1}^2$ and $S_{T_2}^2$), that is,

$$r_{tt} = \frac{S_V^2}{S_T^2}$$

where,

$r_{tt}$: reliability coefficient

$S_V^2$: test true variance

$S_T^2$: test total variance

There are essentially two statistical techniques to determine the precision of a test: correlation and the analysis of internal consistency.

The correlation technique is applied in the test-retest and parallel-forms conditions. In both cases there are outcomes from the same subjects, who either took the same test on two different occasions or responded to two parallel forms of the same test. The reliability index, in this case, is simply the bivariate correlation between the two sets of scores of the same subjects.

The internal consistency analysis demands a complex apparatus of statistical techniques that ultimately reduce to two situations: dividing the test into parts, most commonly two halves, with a subsequent correction by the Spearman-Brown prediction formula; and the several alpha coefficient techniques, Cronbach's alpha being the most widely known of them all. Here, only one test is applied on only one occasion; the analyses consist of verifying the internal consistency of the items that compose the test. It is, therefore, an estimate of precision, whose logic is as follows: if the items agree with one another, that is, covary, on a given occasion, they will also agree with one another on any other occasion on which the test is applied.

CONCLUSION

In order to guarantee that tests will present the scientifically required quality parameters, the American Psychological Association (APA) established the Standards for Educational and Psychological Testing, with several editions since 1985.
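As an illustration of the reliability estimates described under Test reliability, the following Python sketch computes a test-retest correlation, a split-half coefficient corrected by the Spearman-Brown formula, and Cronbach's alpha. The response data, the odd/even item split, and all function names are assumptions chosen for illustration; the article does not give these formulas explicitly, so the standard textbook forms are used.

```python
import math
import statistics


def pearson_r(x, y):
    """Bivariate (Pearson) correlation between two lists of scores."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half, factor=2):
    """Spearman-Brown prediction: reliability of a test `factor` times longer."""
    return factor * r_half / (1 + (factor - 1) * r_half)


def split_half_reliability(item_matrix):
    """Correlate odd-item and even-item half scores, then correct to full length.

    Assumption: an odd/even split; other ways of halving the test are possible.
    """
    odd = [sum(row[0::2]) for row in item_matrix]
    even = [sum(row[1::2]) for row in item_matrix]
    return spearman_brown(pearson_r(odd, even))


def cronbach_alpha(item_matrix):
    """Cronbach's alpha from a subjects-by-items matrix of item scores."""
    k = len(item_matrix[0])
    item_vars = [statistics.variance([row[j] for row in item_matrix]) for j in range(k)]
    total_var = statistics.variance([sum(row) for row in item_matrix])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 6 subjects (rows) by 4 dichotomous items (columns).
data = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]

# Hypothetical total scores of the same subjects on two occasions (test-retest).
first_occasion = [3, 3, 1, 4, 0, 3]
second_occasion = [3, 2, 1, 4, 1, 3]

print(f"test-retest r: {pearson_r(first_occasion, second_occasion):.2f}")
print(f"split-half (Spearman-Brown): {split_half_reliability(data):.2f}")
print(f"Cronbach's alpha: {cronbach_alpha(data):.2f}")
```

The test-retest estimate needs two administrations, whereas the split-half and alpha estimates need only one, which matches the distinction drawn between the correlation technique and internal consistency analysis.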
REFERENCES
1. Stevens SS. On the theory of scales of measurement. Science. 1946;103(2684):677-80.

2. Whitehead AN, Russell B. Principia mathematica. Cambridge: Cambridge University Press; 1910-1913, 1965. 3 v.

3. Gulliksen H. Theory of mental tests. New York: Wiley; 1950.

4. Lord FM. A theory of test scores. Iowa (IA): Psychometric Society; 1952. (Psychometric Monograph, n. 7).

5. Rasch G. Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research and St. Paul; 1960.

6. Birnbaum A. Some latent trait models and their use in inferring an examinee's ability. In: Lord FM, Novick MR. Statistical theories of mental test scores. Reading: Addison Wesley; 1968. p.17-20.

7. Lord FM. Applications of item response theory to practical testing problems. Hillsdale: Erlbaum; 1980.
8. Campbell DT, Stanley J. Experimental and quasi-experimental designs for research. Skokie: Rand McNally; 1973.

9. Anastasi A. Evolving concepts of test validation. Ann Rev Psychol. 1986;37(1):1-15.

10. Pasquali L, organizador. Instrumentos psicológicos: manual prático de elaboração. Brasília: LabPAM/IBAPP; 1999.

11. Pasquali L. Análise fatorial para pesquisadores. Porto Alegre: Artmed; 2005.

12. American Psychological Association (APA). Technical recommendations for psychological tests and diagnostic techniques. Washington; 1954.

14. Pasquali L. Validade dos testes psicológicos: será possível reencontrar o caminho? Psicol Teor Pesq. 2007;23(n.esp):99-107.

15. Pasquali L. Psicometria: teoria dos testes na psicologia e na educação. Petrópolis: Vozes; 2004.

16. Pasquali L. TRI – Teoria de Resposta ao Item: teoria, procedimentos e aplicações. Brasília: LabPAM/UnB; 2007.

17. Ebel RL. Must all tests be valid? Am Psychol. 1961;16(10):640-7.

18. Thurstone LL. The criterion problem in personality research. Chicago: University of Chicago Press; 1952.
Correspondence addressed to: Luiz Pasquali
Campus Darci Ribeiro, ICC Sul - LabPAM, sala AI-096
Plano Piloto - Asa Norte
CEP 70910-900 - Brasília, DF, Brazil