CHAPTER 4: OF TESTS AND TESTING

ASSUMPTIONS ABOUT PSYCHOLOGICAL TESTING AND ASSESSMENT
Psychological Traits and States Exist
Psychological Traits and States Can Be Quantified and Measured
Test-Related Behavior Predicts Non-Test-Related Behavior
Tests and Other Measurement Techniques Have Strengths and Weaknesses
Various Sources of Error Are Part of the Assessment Process
Testing and Assessment Can Be Conducted in a Fair and Unbiased Manner
Testing and Assessment Can Benefit Society

TRAIT
Any distinguishable, relatively enduring way in which one individual varies from another.

STATES
Distinguish one person from another but are relatively less enduring.

PSYCHOLOGICAL TRAIT
Examples are traits that relate to intelligence, specific intellectual abilities, cognitive style, adjustment, interests, attitudes, sexual orientation and preferences, psychopathology, personality in general, and specific personality traits.

ANDROGYNOUS
Referring to an absence of primacy of male or female characteristics.

LIBERATED
Freed from constraints of gender-dependent social expectations.

NEW AGE
Refers to a particular nonmainstream orientation to spirituality and health.

CONSTRUCT
An informed, scientific concept developed or constructed to describe or explain behavior; cannot be seen, heard, or touched, but its existence can be inferred from overt behavior.

OVERT BEHAVIOR
Refers to an observable action or the product of an observable action, including test- or assessment-related responses.

RELATIVELY ENDURING
Reminder that a trait is not expected to be manifested in behavior 100% of the time; it is important to be aware of the context or situation in which a particular behavior is displayed.

DEFINITION OF TRAIT AND STATE
Both refer to a way in which one individual varies from another.

REFERENCE GROUP
Can greatly influence one's conclusions or judgments.
WEIGHING A COMPARATIVE VALUE OF A TEST'S ITEMS
Comes about as a result of a complex interplay among many factors, including technical considerations, the way a construct has been defined for the purposes of the test, and the value society (and the test developer) attaches to the behaviors evaluated.

TEST SCORE
Presumed to represent the strength of the targeted ability, trait, or state and is frequently based on cumulative scoring.

DOMAIN SAMPLING
Refers to either a sample of behaviors from all possible behaviors that could conceivably be indicative of a particular construct or a sample of test items from all possible items that could conceivably be used to measure a particular construct.

FORENSIC MATTERS
Psychological tests may be used to postdict behavior.

POSTDICT
To aid in the understanding of behavior that has already taken place.

COMPETENT TEST USERS
Understand how a test was developed, the circumstances under which it is appropriate to administer the test, how the test should be administered and to whom, and how the test results should be interpreted; understand and appreciate the limitations of the tests they use as well as how those limitations might be compensated for by data from other sources.

ERROR IN ASSESSMENT
Something that is more than expected; actually a component of the measurement process; refers to a long-standing assumption that factors other than what a test attempts to measure will influence performance on the test.

ERROR VARIANCE
The component of a test score attributable to sources other than the trait or ability measured.

SOURCES OF ERROR VARIANCE
Assessees themselves, assessors, and measuring instruments.

CLASSICAL OR TRUE SCORE THEORY OF MEASUREMENT
Each testtaker has a true score on a test that would be obtained but for the random action of measurement error.

CHARACTERISTICS OF A GOOD TEST
Reliability and Validity

RELIABILITY
Involves consistency of the measuring tool: the precision with which the test measures and the extent to which error is present in measurements; the perfectly reliable measuring tool consistently measures in the same way; it yields the same numerical measurement every time it measures the same thing under the same conditions.
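
The CLASSICAL OR TRUE SCORE THEORY and RELIABILITY entries can be illustrated with a small simulation. This is only a sketch: the notation X = T + E (observed score = true score + random error), the function name, and all numbers are assumptions made for illustration and are not taken from these notes.

```python
import random
import statistics


def simulate_observed_scores(true_score, error_sd, n_administrations, seed=0):
    """Return hypothetical observed scores X = T + E.

    T is the testtaker's fixed true score; E is random measurement error
    drawn from a normal distribution with mean 0 and SD error_sd.
    """
    rng = random.Random(seed)
    return [true_score + rng.gauss(0.0, error_sd) for _ in range(n_administrations)]


true_score = 100.0  # hypothetical true score

# A perfectly reliable tool (no error) yields the same number every time.
perfect = simulate_observed_scores(true_score, error_sd=0.0, n_administrations=5)

# A precise tool and a noisy tool measuring the same true score.
precise = simulate_observed_scores(true_score, error_sd=1.0, n_administrations=1000)
noisy = simulate_observed_scores(true_score, error_sd=10.0, n_administrations=1000)

# The spread of observed scores around the true score reflects error variance.
print("precise SD:", round(statistics.pstdev(precise), 2))
print("noisy SD:  ", round(statistics.pstdev(noisy), 2))
```

The more error present in the measurements, the more the observed scores scatter around the true score, which is what lower reliability means in this simplified picture.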
VALIDITY
The test measures what it is supposed to measure; focuses on the items that collectively make up the test.

NORMS
Provide a standard with which the results of a measurement can be compared.

NORM-REFERENCED TESTING AND ASSESSMENT
A method of evaluation and a way of deriving meaning from test scores by evaluating an individual testtaker's score and comparing it to scores of a group of testtakers; the common goal is to yield information on a testtaker's standing or ranking relative to some comparison group of testtakers.

NORM
Refers to behavior that is usual, average, normal, standard, expected, or typical.

NORMS IN PSYCHOMETRIC CONTEXT
Test performance data of a particular group of testtakers that are designed for use as a reference when evaluating or interpreting individual test scores.

NORMATIVE SAMPLE
Group of people whose performance on a particular test is analyzed for reference in evaluating the performance of individual testtakers.

NORMING
Refers to the process of deriving norms; may be modified to describe a particular type of norm derivation.

USER/PROGRAM NORMS
Consist of descriptive statistics based on a group of testtakers in a given period of time rather than norms obtained by formal sampling methods.

TEST STANDARDIZATION/STANDARDIZATION
Process of administering a test to a representative sample of testtakers for the purpose of establishing norms.

STANDARDIZED TEST
Has clearly specified procedures for administration and scoring; typically includes normative data.

SAMPLING
Targeting some defined group as the population for which the test is designed.

TYPES OF NORMS
Age Norms
Grade Norms
National Norms
National Anchor Norms
Local Norms
Norms from a Fixed Reference Group
Subgroup Norms
Percentile Norms

PERCENTILE
Expression of the percentage of people whose score on a test or measure falls below a particular raw score; a popular way of organizing all test-related data, including standardization sample data.

PERCENTAGE CORRECT
Refers to the distribution of raw scores; more specifically, to the number of items that were answered correctly multiplied by 100 and divided by the total number of items.

AGE-EQUIVALENT SCORES/AGE NORMS
Indicate the average performance of different samples of testtakers who were at various ages at the time the test was administered.

GRADE NORMS
Developed by administering the test to representative samples of children over a range of consecutive grade levels; the mean or median for each level is calculated; do not provide information as to the content or type of items that a student could or could not answer correctly.

DEVELOPMENTAL NORMS
A term applied broadly to norms developed on the basis of any trait, ability, skill, or other characteristic that is presumed to develop, deteriorate, or otherwise be affected by chronological age, school grade, or stage of life.

NATIONAL NORMS
Derived from a normative sample that was nationally representative of the population at the time the norming study was conducted; may be obtained by testing large numbers of people representative of different variables of interest such as age, gender, racial/ethnic background, socioeconomic strata, geographical location, and different types of communities within the various parts of the country.

NATIONAL ANCHOR NORMS
Provide some stability to test scores by anchoring them to other test scores.

EQUIPERCENTILE METHOD
Method by which such equivalency tables or national anchor norms are established; begins with the computation of percentile norms for each of the tests to be compared.

SUBGROUP NORMS
Segmentation of a normative sample by any of the criteria initially used in selecting subjects for the sample.

LOCAL NORMS
Provide normative information with respect to the local population's performance on some test.

FIXED REFERENCE GROUP SCORING SYSTEM
The distribution of scores obtained on the test from one group of testtakers (the fixed reference group) is used as the basis for the calculation of test scores for future administrations of the test.

ANCHORING
A procedure that permits the conversion of raw scores on the new version of the test into fixed reference group scores.

NORM-REFERENCED SCORES
Approach to evaluation which seeks to derive meaning from a test score by evaluating the test score in relation to other scores on the same test.

CRITERION
Standard on which a judgment or decision may be based.

CRITERION-REFERENCED TESTING AND ASSESSMENT
Defined as a method of evaluation and a way of deriving meaning from test scores by evaluating an individual's score with reference to a set standard; the focus is on how scores relate to a particular content area or domain.

CORRELATION COEFFICIENT
Number that provides us with an index of the strength of the relationship between two things.

CORRELATION
Expression of the degree and direction of correspondence between two things; does not illustrate a causal relationship, but there is an implication of prediction; if we know that there is a high correlation between X and Y, then we should be able to predict one from the other with various degrees of accuracy.

COEFFICIENT OF CORRELATION (r)
Expresses a linear relationship between two (and only two) variables, usually continuous in nature; reflects the degree of concomitant variation between variable X and variable Y; a numerical index that expresses this relationship; tells us the extent to which X and Y are correlated; interpreted by sign and magnitude.

CORRELATION
Positive Correlation
Negative Correlation
None

MAGNITUDE OF CORRELATION COEFFICIENT
Judged by its absolute value; the coefficient can be as low as -1 or as high as +1; either extreme would mean that the correlation is perfect, without error in the statistical sense.

POSITIVE CORRELATION
When two variables simultaneously increase or simultaneously decrease.

NEGATIVE (INVERSE) CORRELATION
When one variable increases while the other variable decreases.

ZERO CORRELATION
No relationship exists between the two variables.

PEARSON CORRELATION COEFFICIENT/PEARSON PRODUCT-MOMENT COEFFICIENT OF CORRELATION/PEARSON R
Devised by Karl Pearson; r can be the statistical tool of choice when the relationship between the variables is linear and when the two variables being correlated are continuous (that is, they can take any value); the formula takes into account the relative position of each test score or measurement with respect to the mean of the distribution.

PEARSON R COMPUTATION
If the negative standard score values for measurements of X always corresponded with negative standard score values for Y scores, the resulting r would be positive (multiplying two negative values will result in a positive number); if positive standard score values on X always corresponded with negative standard score values for Y and vice versa, then an inverse relationship would exist and so a negative correlation would result; should only be used when the relationship between the variables is linear.

ZERO OR NEAR-ZERO CORRELATION
Could result when some products are positive and some are negative.

WHAT TO DO WITH PEARSON R
Ask: Is this number statistically significant given the size and nature of the sample?
Ask: Could this result have occurred by chance?
Significance at the .01 level tells you, with reference to these data, that a correlation such as this could have been expected to occur merely by chance only one time or less in a hundred if X and Y are not correlated in the population.

Significance Levels
The .05 level provides the basis for concluding that a correlation does indeed exist; means that the result could have been expected to occur by chance alone five times or less in a hundred.

COEFFICIENT OF DETERMINATION (r²)
Indication of how much variance is shared by the X- and the Y-variables; the remaining variance (1 - r²) could presumably be accounted for by chance, error, or otherwise unmeasured or unexplainable factors.

MOMENT
Describes a deviation about a mean of a distribution.

DEVIATES
Individual deviations about the mean of a distribution; first moments of the distribution.

MOMENTS SQUARED
Second moments of the distribution.

MOMENTS CUBED
Third moments of the distribution.

SPEARMAN'S RHO/RANK-ORDER CORRELATION COEFFICIENT/RANK-DIFFERENCE CORRELATION COEFFICIENT
Developed by Charles Spearman; frequently used when sample size is small (fewer than 30 pairs of measurements) and especially when both sets of measurements are in ordinal (or rank-order) form; special tables are used to determine if an obtained rho coefficient is or is not significant.

GRAPHIC REPRESENTATIONS OF CORRELATION
Bivariate Distribution
Scatter Diagram
Scattergram
Scatterplot

SCATTERPLOT
Simple graphing of the coordinate points for values of the X-variable (horizontal axis) and the Y-variable (vertical axis); provides a quick
indication of the direction and magnitude of the relationship, if any, between the two variables.

DIRECTION OF THE CURVE
Helps distinguish positive from negative correlations.

DEGREE TO WHICH THE POINTS FORM A STRAIGHT LINE
Helps estimate the strength or magnitude of the correlation.

CURVILINEARITY
Eyeball gauge of how curved a graph is.

OUTLIER
Extremely atypical point located at a relatively long distance (an outlying distance) from the rest of the coordinate points in a scatterplot; stimulates interpreters of test data to speculate about the reason for the atypical score; can provide a hint of some deficiency in the testing or scoring procedures.

REGRESSION
The analysis of relationships among variables for the purpose of understanding how one variable may predict another.

SIMPLE REGRESSION
Involves one independent variable (X), referred to as the predictor variable, and one dependent variable (Y), referred to as the outcome variable.

REGRESSION LINE
Line of best fit; the straight line that comes closest to the greatest number of points on the scatterplot of X and Y.

REGRESSION COEFFICIENTS
b = slope of the line
a = intercept (constant that indicates where the line crosses the Y-axis)

STANDARD ERROR OF THE ESTIMATE
Error in the prediction of Y from X; the higher the correlation between X and Y, the greater the accuracy of the prediction and the smaller the standard error of the estimate.

MULTIPLE REGRESSION
Takes into account the intercorrelations among all the variables involved; the correlation between each of the predictor scores and what is being predicted is reflected in the weight given to each predictor; predictors that correlate highly with the predicted variable are generally given more weight.

META-ANALYSIS
Analysis of data from several studies; refers to a family of techniques used to statistically combine information across studies to produce single estimates of the statistics being studied; more weight can be given to studies that have larger numbers of subjects.
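
The PEARSON R COMPUTATION, COEFFICIENT OF DETERMINATION, REGRESSION COEFFICIENTS, and STANDARD ERROR OF THE ESTIMATE entries above can be tied together in one short worked sketch. The data, variable names, and function names below are hypothetical and chosen only for illustration. The formulas are the common descriptive forms (standard deviations divided by N), which is an assumption: these notes do not state which form is intended, and some texts divide by N - 2 for the standard error of the estimate.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson r as the mean product of paired standard (z) scores."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sd_x, sd_y = statistics.pstdev(x), statistics.pstdev(y)
    products = [((xi - mean_x) / sd_x) * ((yi - mean_y) / sd_y)
                for xi, yi in zip(x, y)]
    # Products of same-signed z scores push r up; opposite signs push it down.
    return sum(products) / len(products)


def regression_line(x, y):
    """Return (a, b) for the line of best fit Y' = a + bX."""
    b = pearson_r(x, y) * statistics.pstdev(y) / statistics.pstdev(x)  # slope
    a = statistics.fmean(y) - b * statistics.fmean(x)                  # intercept
    return a, b


def standard_error_of_estimate(x, y):
    """Descriptive form: SD of Y times sqrt(1 - r^2)."""
    r = pearson_r(x, y)
    # max() guards against a tiny negative value from floating-point rounding.
    return statistics.pstdev(y) * math.sqrt(max(0.0, 1.0 - r * r))


# Hypothetical paired measurements (X = predictor, Y = outcome).
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = pearson_r(hours, scores)
a, b = regression_line(hours, scores)
see = standard_error_of_estimate(hours, scores)

print(f"r = {r:.4f}, r^2 = {r * r:.2f}")   # shared variance is r^2
print(f"Y' = {a:.1f} + {b:.1f}X")
print(f"standard error of the estimate = {see:.4f}")
```

For these five pairs r is about .77, so r² is .60: roughly 60% of the variance is shared, and the remaining 40% is left to chance, error, or unmeasured factors. A higher r would shrink the standard error of the estimate, as the entry above describes.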