Psychological Testing Overview

The document discusses standard scores and normalization in psychological testing. It covers common standard scores like z-scores, t-scores, stanines, and their characteristics. It also discusses issues regarding testing and cultural considerations that assessors should be aware of.

Uploaded by

ycaldito
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
27 views14 pages

Psychological Testing Overview

The document discusses standard scores and normalization in psychological testing. It covers common standard scores like z-scores, t-scores, stanines, and their characteristics. It also discusses issues regarding testing and cultural considerations that assessors should be aware of.

Uploaded by

ycaldito
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 14

PSYCH 162

MEASURE OF VARIANCE
- An indication of how scores in a distribution are scattered

RAW SCORE
- Number of items scored or coded in a specific manner, such as correct/incorrect, true or false, and so on

NORM-REFERENCED INTERPRETATIONS
- Standardization samples should be

STANDARD SCORES
● Z SCORES
- Results from the
● T SCORES
● STANINE

COMMON STANDARD SCORES
● Z-SCORE
- Mean: 0
- SD: 1
- Application: useful in research but difficult to use and interpret
● T-SCORE
- Mean: 50
- SD: 10
● IQ's
● CEEB
● PERCENTILE RANKING
● GRADE EQUIVALENT

NORMALIZED STANDARD SCORES
● STANINE
- Mean: 5
- SD: 2
● WECHSLER SCALED SCORES
- Mean: 10
- SD: 3
● NORMAL CURVE EQUIVALENT
- Mean: 50
- SD: 21.06

CORRELATION AND INFERENCE

CORRELATION
- An expression of the degree and direction of correspondence between two things

CORRELATION COEFFICIENT
- The numerical index that expresses the relationship: the extent to which X and Y are co-related
● STRONG POSITIVE
● WEAK POSITIVE
● STRONG NEGATIVE
● WEAK NEGATIVE
● MODERATE NEGATIVE
● NO CORRELATION

CORRELATION                                           RELATIONSHIP   NATURE OF VARIABLES
PEARSON R (PRODUCT-MOMENT CORRELATION COEFFICIENT)    LINEAR         BOTH CONTINUOUS
SPEARMAN'S RHO (RANK-ORDER CORRELATION COEFFICIENT)   MONOTONIC      ONE CONTINUOUS AND ONE ORDINAL / BOTH ORDINAL
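The means and SDs listed above define simple linear conversions between the standard scores. A minimal sketch (not from the notes; it assumes the raw-score mean and SD of the norm group are known):

```python
# Converting a raw score into the common and normalized standard scores
# above, given the norm group's raw-score mean and standard deviation.

def z_score(raw, mean, sd):
    """Standard score with mean 0, SD 1."""
    return (raw - mean) / sd

def t_score(z):
    """T-score: mean 50, SD 10."""
    return 50 + 10 * z

def normal_curve_equivalent(z):
    """Normal curve equivalent: mean 50, SD 21.06."""
    return 50 + 21.06 * z

def wechsler_scaled(z):
    """Wechsler scaled score: mean 10, SD 3."""
    return 10 + 3 * z

def stanine(z):
    """Stanine: mean 5, SD 2, truncated to the 1-9 range."""
    return max(1, min(9, round(5 + 2 * z)))

z = z_score(115, mean=100, sd=15)   # one SD above the mean
print(z, t_score(z), wechsler_scaled(z), stanine(z))
```

Because every conversion is linear in z, a score one SD above the mean lands one SD above each scale's mean (T = 60, NCE = 71.06, scaled score = 13, stanine = 7).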
HISTORY

CHINA
- Selection of applicants who would obtain a government job

SONG DYNASTY (960-1279 A.D.)
- Knowledge of classical writings

CULTURAL AND ETHICAL CONSIDERATIONS

CULTURE
- The socially transmitted behavior patterns, beliefs, and products of work of a particular population, community, or group of people

Issues Regarding Culture and Assessment

1. Verbal Communication
● The examiner and the examinee must speak the same language.
● This is necessary not only for the assessment to proceed but also for the assessor's conclusions regarding the assessment to be reasonably accurate.
● If a test is in written form and includes written instructions, then the testtaker must be able to read and comprehend what is written.

2. Nonverbal Communication and Behavior
● Facial expressions, finger and hand signs, and shifts in one's position in space may all convey messages.
● Let's not forget that the messages conveyed by body language may be different from culture to culture.

3. Standards of Evaluation
● Consider culturally appropriate measures in evaluating individuals.
● Behaviors differ from one culture to another. What seems to be normal in one culture may be considered pathological in another.
● Individualist Culture
- (e.g., United States and Great Britain) characterized by value being placed on traits such as self-reliance, autonomy, independence, uniqueness, and competitiveness
● Collectivist Culture
- Value is placed on traits such as conformity, cooperation, interdependence, and striving toward group goals
CHALLENGES, BEST PRACTICES, AND FACILITATING FACTORS IN THE PHILIPPINES

- Availability and cost of test materials
- Test appropriateness
- Applicability in the Philippine setting
- Testing environment
- Task demands
- Attitude of clients
- Report writing

- Communicating the findings
- Adherence to standard procedures
- Supervision
- Using a battery of tests
- Use of technology

LEGAL AND ETHICAL CONSIDERATIONS

LAWS
- Rules that individuals must obey for the good of society as a whole
- Rules thought to be good for society as a whole

ETHICS
- A body of principles of right, proper, or good conduct

The Concerns of the Profession
- Who should be allowed to purchase and use psychological test materials
- Three levels of tests, in terms of the degree to which the test's use requires knowledge of testing and psychology

1. Test-User Qualification (3 Levels)
- Level A
● Tests or aids that can be adequately administered, scored, and interpreted with the manual and a general orientation to the kind of institution or organization in which one is working
● Insurance, achievement, or proficiency tests
- Level B
● Tests or aids that require some technical knowledge of test construction and use, and of supporting educational fields such as statistics, individual differences, psychology of adjustment, personnel psychology, and guidance
● Aptitude tests and adjustment inventories applicable to normal populations
- Level C
● Tests and aids that require substantial understanding of testing and supporting psychological fields, together with supervised experience in the use of these devices
● Projective tests, individual mental tests
Psychology Law (RA 10029)
"Practice of Psychology" consists of the delivery of psychological services that involve the application of psychological principles and procedures for the purpose of describing, understanding, predicting, and influencing the behavior of individuals or groups, to assist in the attainment of optimal human growth and functioning.

2. Testing People with Disabilities
- Transforming the test into a form that can be taken by the testtaker
- Transforming the responses of the testtaker so that they are scorable
- Meaningfully interpreting the test data

BASICS OF PSYCHOLOGICAL TESTING AND ASSESSMENT

WHAT IS A "GOOD TEST"?
Assumptions (testing and assessment)

Assumption 1.
● Psychological Traits and States Exist
- Trait - any distinguishable, relatively enduring way in which one individual varies from another
- States - distinguish one person from another but are relatively less enduring

Assumption 2.
● Psychological Traits and States can be quantified and measured
- Once it is acknowledged that psychological traits and states exist, the specific traits and states to be measured and quantified need to be carefully defined
- Once the trait, state, or other construct to be measured has been defined, a test developer considers the types of item content that would provide insight into it
- Measuring traits and states by means of a test entails developing not only appropriate test items but also appropriate ways to score tests and interpret the results
- CUMULATIVE SCORING
● The test score is presumed to represent the strength of the targeted ability, trait, or state.
● Inherent in Cumulative Scoring
- The assumption that the more the testtaker responds in a particular direction keyed by the test manual as correct or consistent with a particular trait, the higher the testtaker is presumed to be on the targeted ability or trait

Assumption 3
● Test-related behavior predicts non-test-related behavior
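The cumulative-scoring assumption can be illustrated with a short sketch (hypothetical, not from the notes): each response matching the scoring key adds one point, so a higher total implies a stronger standing on the targeted trait.

```python
# Hypothetical sketch of cumulative scoring: the total score is the number
# of responses keyed as correct / consistent with the targeted trait.

def cumulative_score(responses, key):
    """Count responses that match the scoring key."""
    return sum(1 for given, keyed in zip(responses, key) if given == keyed)

key       = ["T", "F", "T", "T", "F"]    # keyed (trait-consistent) answers
responses = ["T", "F", "F", "T", "F"]    # one deviation from the key
print(cumulative_score(responses, key))  # 4 of 5 items keyed-consistent
```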
Assumption 4
● Tests and other measurement techniques have strengths and weaknesses

Assumption 5
● Various sources of error are part of the assessment process
- Error - refers to the long-standing assumption that factors other than what a test attempts to measure will influence performance on the test
- Error Variance - the component of a test score attributable to sources other than the trait or ability measured

CLASSICAL TEST THEORY (CTT)
- True Score Theory
- The assumption is made that each testtaker has a true score on a test that would be obtained but for the action of measurement error

Assumption 6
● Testing and Assessment can be conducted in a fair and unbiased manner
- The use of tests that are not relevant to the background of the testtaker can be a form of bias on the test user's side

Assumption 7
● Testing and Assessment benefit society

RELIABLE AND VALID

RELIABILITY
- The consistency of the measuring tool
- The precision with which the test measures and the extent to which error is present in measurements

VALIDITY
- A test is considered valid for a particular purpose if it does, in fact, measure what it purports to measure

A good test:
1. Clear instructions for administration, scoring, and interpretation
2. It is a plus if it is economical
3. Measures what it purports to measure
4. Test users are trained to administer them
5. A useful test yields actionable results that benefit society at large
6. Adequate norms

NORMS
1. Norm-Referenced Testing and Assessment
- A method of evaluation and a way of deriving meaning from test scores by evaluating an individual testtaker's score and comparing it to the scores of a group of testtakers
- The goal is to yield information on a testtaker's standing or ranking relative to some comparison group of testtakers
- The meaning of an individual test score is understood relative to other scores on the same test

Norms
- The test performance data of a particular group of testtakers that are designed for use as a reference when evaluating or interpreting individual test scores

Normative Sample
- The group of people whose performance on a particular test is analyzed for reference in evaluating the performance of individual testtakers

NORMING
- Refers to the process of deriving norms

TEST STANDARDIZATION
- The process of administering a test to a representative sample of testtakers for the purpose of establishing norms
- Ex. Race Norming - a controversial practice of norming on the basis of race or ethnic background

TYPES OF NORMS
1. Percentile
● An expression of the percentage of people whose score on a test or measure falls below a particular raw score
● Percentage Correct
- In the distribution of raw scores

DEVELOPMENTAL NORMS
- Term applied broadly to norms developed on the basis of any trait, ability, skill, or other characteristic that is presumed to develop, deteriorate, or otherwise be affected by chronological age, school grade, or stage of life

1. Age Norms
● Average performance of different samples of testtakers who were at various ages at the time the test was administered

2. Grade Norms
● Developed by administering the test to representative samples of children over a range of consecutive grade levels
- Grade norms are used as a convenience, a readily understandable gauge of how one student's performance compares with that of fellow students in the same grade
- Useful only with respect to years and months of schooling completed

NATIONAL NORMS
- Samples that were nationally representative of the population at the time the norming study was conducted

1. National Anchor Norms
- Compare the hypothetical test to the actual test that is given to a grade level
- Provide stability to test scores by anchoring them to other test scores

2. Subgroup Norms
- Analyzing from the perspective of the regional/community sample

3. Local Norms
- Normative information with respect to the local population's performance on some test

2 WAYS THAT DATA MAY BE VIEWED AND INTERPRETED
1. Fixed Reference Group Scoring System / Norm-Referenced
- Aids in providing a context for interpretation
- Used as the basis for the calculation of test scores for future administrations of the test
- Ex. SAT, College Admission Tests

2. Criterion-Referenced Evaluation
- Criterion
● A standard on which a judgment or decision may be based
- A method of evaluation and a way of deriving meaning from test scores by evaluating an individual's score with reference to a set standard
EXAMPLES:
● To be eligible for a high school diploma, students must demonstrate at least a sixth-grade reading level.
● To earn the privilege of driving an automobile, would-be drivers must take a road test and demonstrate their driving skill to the satisfaction of a state-appointed examiner.
● Mastery of a test, skill, or something similar

1. Utility
- Usefulness or practical value of testing to improve efficiency

Factors:
● A test is said to be psychometrically sound for a particular purpose if its reliability and validity coefficients are acceptably high.
● Cost in the context of test utility refers to disadvantages, losses, or expenses in both economic and noneconomic terms.
● Benefit refers to profits, gains, and advantages.
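A percentile rank of the kind described in this section can be computed directly; a minimal sketch with made-up norm-group data:

```python
# Norm-referenced interpretation: a percentile expresses the percentage of
# the norm group whose scores fall below a given raw score.

def percentile_rank(raw, norm_scores):
    """Percentage of norm-group scores strictly below `raw`."""
    below = sum(1 for s in norm_scores if s < raw)
    return 100 * below / len(norm_scores)

norm_group = [48, 52, 55, 60, 61, 67, 70, 72, 80, 91]  # made-up norm data
print(percentile_rank(70, norm_group))  # 60.0 -> 60th percentile
```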
TEST CONCEPTUALIZATION

MAKING A BLUEPRINT
● Known as the test specifications
● A framework for developing the questionnaire
● GRID STRUCTURE

Content Areas
- Cover everything that is relevant to the purpose of the questionnaire
● Write down the content areas to be covered by your questionnaire. If these are not clear-cut, consult experts in the field.

Manifestations
- What are the behaviors or affects that the content areas manifest?
- Makes sure that different aspects of the content areas will be elicited
● Write down ways in which the content areas of your questionnaire may become manifest
● Draw the blueprint, labeling each content area (column) and each manifestation (row)

PILOT WORK
- Preliminary research surrounding the creation of a prototype
- The test developer typically attempts to determine how best to measure a targeted construct
- Done to evaluate whether items should be included in the final form of the instrument

Guidelines in Item Writing
1. Define clearly what you want to measure
2. Generate an item pool
3. Avoid exceptionally long items
4. Keep the level of reading difficulty appropriate for those who will complete the scale
5. Avoid "double-barrelled" items that convey two or more ideas at the same time
6. Consider mixing positively and negatively worded items

TEST CONSTRUCTION

SCALING
- The process of setting rules for assigning numbers in measurement

Types of Scales
1. Age-Based Scale and Grade-Based Scale
- A function of age and grade

2. Unidimensional and Multidimensional Scale
- Number of dimensions that guide the testtaker's responses

3. Comparative and Categorical Scale
- Comparative
● Judgment of a stimulus in comparison with every other stimulus
● Ex. Providing testtakers with a list of 30 items on a sheet of paper and asking them to rank the justifiability of the items from 1 to 30
- Categorical
● Stimuli are placed into one or more alternative categories that differ quantitatively with respect to some continuum
● Ex. Testtakers might be given 30 index cards, on each of which is printed one of the 30 items. Testtakers would be asked to sort the cards into three piles: those behaviors that are never justified, those that are sometimes justified, and those that are always justified.

4. Guttman Scale
- Yields ordinal-level measures
- Items on it range from weaker to stronger expressions of the attitude, belief, or feeling being measured
- Scalogram Analysis
● An item-analysis procedure and approach to test development that involves graphic mapping of a testtaker's responses

ITEM FORMAT
- Refers to the form, plan, structure, arrangement, and layout of individual test items

2 Types
1. Selected-Response Format
- Requires testtakers to select a response from a set of alternative responses
- Ex. The examinee must select the response that is keyed as correct.
- If the test is designed to measure the strength of a particular trait and the items are written in a selected-response format, then examinees must select the alternative that best answers the question with respect to themselves.

2. Constructed-Response Format
- Requires testtakers to supply or create the correct answer, not merely to select it

3 Types of Selected-Response Format
1. Multiple Choice
- Stem
● Statement or question that contains the problem
- A correct alternative or option
- Distractors or foils (incorrect choices)

2. Matching Item
3. True-False (Binary-Choice Item)
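The multiple-choice anatomy above (stem, keyed alternative, distractors) can be sketched in code; a hypothetical illustration, not from the notes:

```python
# A selected-response (multiple-choice) item: a stem, one keyed correct
# alternative, and distractors; scoring checks the chosen option against
# the key.

from dataclasses import dataclass

@dataclass
class MultipleChoiceItem:
    stem: str       # statement or question that contains the problem
    options: dict   # label -> alternative text
    key: str        # label of the correct alternative

    def distractors(self):
        """Labels of the incorrect choices (foils)."""
        return [label for label in self.options if label != self.key]

    def score(self, chosen):
        """1 point if the keyed alternative was selected, else 0."""
        return 1 if chosen == self.key else 0

item = MultipleChoiceItem(
    stem="A z-score has a mean of:",
    options={"A": "0", "B": "50", "C": "100"},
    key="A",
)
print(item.score("A"), item.distractors())  # 1 ['B', 'C']
```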
WRITING ITEMS

Item Pool
- The reservoir or well from which items will or will not be drawn for the final version of the test

Things to Consider in Person-Based Questionnaires

1. Acquiescence
- The tendency to agree with items regardless of their content
Reduction
- Can be reduced by ensuring that an equal number of items is scored in each direction (reverse scoring)
- Best to avoid double-negative statements
- Less likely to occur with items that are clear, unambiguous, and specific

2. Social Desirability
- The tendency to respond to an item in a socially acceptable manner
Reduction
- Can be reduced by excluding items that are clearly socially desirable or undesirable
- Ask indirect questions to evoke a response that is not simply a reflection of how the respondent wishes to present himself
- Ask respondents to give an immediate response rather than a careful consideration of each item

3. Indecisiveness
- The tendency to use the "don't know" or "uncertain" option
Reduction
- Omitting the middle category

4. Extreme Response
- The tendency to choose an extreme option regardless of direction
- Some respondents will use one direction for a series of items and then switch to the other direction, and so on
Reduction
- Use of clear, unambiguous, and specific items

DESIGNING THE QUESTIONNAIRE
- Good design is crucial for producing a reliable and valid questionnaire
- Respondents feel less intimidated by a questionnaire that has a clear layout and is easy to understand, and take the task of completing the questionnaire more seriously

1. Background Information
- Include headings and sufficient space for the respondents to fill out their name, age, gender, or whatever other background information you require
- It is useful to obtain the date on which the questionnaire is completed, especially if it is to be administered again

2. Instructions
- Must be clear and unambiguous
- Tell the respondents how to choose a response and how to indicate the chosen response in the questionnaire

3. Layout

ITEM SCORING
1. Cumulative Model
- The higher the score on the test, the higher the testtaker is on the ability, trait, or other characteristic that the test purports to measure
- Commonly used
- Achievement testing

2. Class/Category Scoring
- Testtaker responses earn credit toward placement in a particular class or category with other testtakers whose pattern of responses is presumably similar in some way
- Used by some diagnostic systems in which individuals must exhibit a certain number of symptoms to qualify for a specific diagnosis

3. Ipsative Scoring
- Compares a testtaker's score on one scale within a test to another scale within that same test

ITEM ANALYSIS
- A set of methods used to evaluate test items; one of the most important aspects of test construction
● Index of the item's difficulty
● Index of the item's reliability
● Index of the item's validity
● Index of the item's discrimination

1. Item-Difficulty Index
- If everyone gets the item right, the item is too easy
- If everyone gets the item wrong, the item is too difficult

Computations
- Calculate the proportion of the total number of testtakers who answered the item correctly
● Step 1 - Note the total number of respondents (N = 100)
● Step 2 - Count the number of respondents who answered the item correctly (X = 50)
● Step 3 - Divide the result of Step 2 by the result of Step 1 (50 ÷ 100 = .50)

Item-Endorsement Index
- For a personality test or self-description construct
- A measure of the percentage of people who said yes to, agreed with, or otherwise endorsed the item

Computations
● Step 1 - Find half of the difference between 100% (or 1.00) success and chance performance
● Step 2 - Add this value to the probability of performing correctly by chance

2. Item-Reliability Index
- Provides an indication of the internal consistency of a test
- Equal to the product of the item-score standard deviation and the correlation between the item score and the total test score

Reliability Coefficient
- An index of reliability; a proportion that indicates the ratio between the true score variance on a test and the total variance

Classical Test Score Theory
- Assumes that each person has a true score that would be obtained if there were no errors in measurement
- Observed Score = True Score + Error
● X = T + E

Item Response Theory
- The probability that a person with X ability will be able to perform at a level of Y

Errors
- Refers to the component of the observed test score that does not have to do with the testtaker's ability

Measurement Error
● Random Error
- A source of error in measuring a targeted variable, caused by unpredictable fluctuations and inconsistencies of other variables in the measurement process
- Unanticipated events happening in the immediate vicinity of the test environment
- Unanticipated physical events happening within the testtaker

● Systematic Error
- A source of error in measuring a variable that is typically constant or proportionate to what is presumed to be the true value of the variable being measured
- Ex. weighing scales

Sources of Error
● Test construction
- The way the items were worded
● Test administration
- Test environment
- Testtaker's condition
- Examiner-related variables
● Test scoring and interpretation
- Element of subjectivity

VARIANCE
● Variance
- The standard deviation squared
● True Variance
- Variance from true differences
● Error Variance
- Variance from irrelevant, random sources

Reliability
- Refers to the proportion of the total variance attributed to true variance
- The greater the proportion of the total variance attributed to true variance, the more reliable the test

Factor Analysis and Inter-Item Consistency

Factor Analysis
- Used to determine whether items on a test appear to be measuring the same things
- Items that do not "load on" the factor that they were written to tap (or items that do not appear to measure what they were designed to measure) can be revised or eliminated

3. Item-Validity Index
- A statistic designed to provide an indication of the degree to which a test is measuring what it purports to measure
- The higher the item-validity index, the greater the test's criterion-related validity

Validity
- A judgment based on evidence about the appropriateness of inferences drawn from test scores

Validation
- The process of gathering and evaluating evidence about validity

4. Item-Discrimination Index
- How adequately an item separates or discriminates between high scorers and low scorers on an entire test
- Determines whether the people who have done well on particular items have also done well on the whole test
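The item-difficulty arithmetic in this section (p = X ÷ N), together with the chance-adjustment steps (add half of the distance between chance and 1.00 to the chance level), can be sketched as follows. Reading those two steps as an optimal-difficulty computation is my interpretation, not stated verbatim in the notes:

```python
# Item difficulty: proportion of testtakers who answered the item correctly.
# Chance adjustment: chance level plus half of (1.00 - chance).

def item_difficulty(num_correct, num_respondents):
    """Proportion of testtakers answering the item correctly."""
    return num_correct / num_respondents

def optimal_difficulty(num_choices):
    """Chance performance plus half of (1.00 - chance)."""
    chance = 1 / num_choices
    return chance + (1 - chance) / 2

print(item_difficulty(50, 100))  # .50, as in the worked example
print(optimal_difficulty(4))     # .625 for a four-option item
```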
Item Discrimination
- Compares people who have done well with those who have done poorly
- Good discrimination: high scorers > low scorers
- Consider revising/eliminating the item if the difference is negative

● Step 1
- Identify a group of students who have done well on the test (those in the 67th percentile and above)
- Also identify a group of students who have done poorly (those in the 33rd percentile and below)
● Step 2
- Find the proportion of students in the high group and the proportion of students in the low group who got each item correct
● Step 3
- For each item, subtract the proportion of correct responses for the low group from the proportion of correct responses for the high group
- This gives the item-discrimination index

Analysis of Item Alternatives
- Chart the number of testtakers in the U (upper) group and L (lower) group who choose each alternative
- The test developer can get an idea of the effectiveness of a distractor by means of a simple eyeball test

Other Considerations in Item Analysis
- Guessing
● One that has eluded any universally acceptable solution

3 Criteria
1. A correction for guessing must recognize that when a respondent guesses at an answer on an achievement test, the guess is not typically made on a totally random basis.
2. A correction for guessing must also deal with the problem of omitted items.
3. Just as some people may be luckier than others in front of a Las Vegas slot machine, so some testtakers may be luckier than others in guessing the choices that are keyed correct.

To Reduce
1. Explicit instructions regarding this point for the examiner to convey to the examinees
2. Specific instructions for scoring and interpreting omitted items

- Item Fairness
● Refers to the degree, if any, to which a test item is biased

- Speed Tests
● Item analyses of tests taken under speed conditions yield misleading or uninterpretable results
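The three-step discrimination computation in this section reduces to one subtraction; a sketch with made-up group data:

```python
# Item-discrimination index: proportion correct in the high-scoring group
# minus proportion correct in the low-scoring group. The index ranges from
# -1 to +1; negative values flag items to revise or eliminate.

def discrimination_index(high_correct, high_n, low_correct, low_n):
    """p(correct | high group) - p(correct | low group)."""
    return high_correct / high_n - low_correct / low_n

# Made-up data: 27 of 30 high scorers vs 12 of 30 low scorers got the item right.
d = discrimination_index(27, 30, 12, 30)
print(round(d, 2))  # 0.5
```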
