Psychological Assessment 1

The document outlines key assumptions and concepts related to psychological assessment, including the existence and measurement of psychological traits, types of tests (criterion-referenced and norm-referenced), levels of measurement, and reliability and validity of tests. It also discusses various statistical methods and concepts such as central tendency, skewness, and the relationship between reliability and validity. Additionally, it provides examples and scenarios to illustrate these concepts in practice.

Uploaded by

LOL Blaze - MID
Assumptions about Psychological Assessment

1. Psychological traits & states exist.
2. Psychological traits & states can be quantified & measured
3. Test-related behavior predicts non-test-related behavior.
4. Tests and other measurement techniques have strengths & weaknesses.
5. Various sources of error are part of the assessment process.
6. Testing & assessment can be conducted in a fair and unbiased manner
7. Testing & assessment benefit society.

4. In order for Sinag to pass the board exam and have her license, she needs to at least attain an
overall percentage of 75% among the four subjects. This is an example of:

a. Criterion-referenced test
b. Norm-referenced test
c. Group-referenced test
d. Standard-referenced test

CRITERION-REFERENCED TESTS
- evaluating an individual's test score on the basis of
whether or not some criterion has been met, or
with reference to a standard.
- ex: passing in school, driver's license test

NORM-REFERENCED TESTS
- a method of evaluation and a way of deriving meaning from
test scores by evaluating an individual test taker's score and
comparing it to scores of a group of test takers.

> NORMS - test performance data of a particular group of
test takers that are designed for use as a reference when
evaluating individual test scores

5. Most IQ tests are:


a. Content-referenced
b. Group-referenced
c. Criterion-referenced
d. Norm-referenced

6. This is the level of measurement that has all the 3 properties of measurement and in which
mathematical operations are most permissible.
a. Nominal
b. Ordinal
c. Interval
d. Ratio
4 LEVELS OF MEASUREMENT [NOIR]

1. NOMINAL SCALES
- observations can be named but not placed in any order
- words, letters, or numbers are used to classify the data
- no mathematical operations apply
- zero is just a label
- ex: SEX (male or female), COLOR (black, red, blue)

2. ORDINAL SCALES
- numbers can be ordered or arranged meaningfully from
highest to lowest or vice versa
- the intervals or levels are meaningful but not always equal
- median and percentile ranks may be employed
- ex: Top 10 (Top 1, Top 2, Top 3, etc.), Level of Satisfaction
(Likert Scale)

3. INTERVAL SCALES
- indicates an actual amount (numerical)
- the order and difference between the rankings can
be known
- has no absolute zero point
- mathematical operations (e.g., mean, SD, Pearson's r,
tests of significance) may be applied
- ex: Temperature (30°C, 20°C, 10°C, -10°C)

4. RATIO SCALES
- the order and difference can be described, there is an
absolute zero point, and the ratio between two points
has meaning
- all mathematical operations can meaningfully be
performed
- ex: weight (30 kg, 20 kg, 10 kg, 0 kg)

7. The following can be manipulated using MATHEMATICAL OPERATIONS except for:


a. Nominal
b. Ordinal
c. Interval
d. Ratio

8. Which among the following is an average of a data set?


a. Mean
b. Median
c. Mode
d. All of the above

Central Tendency
- is a statistical measure that determines a single
score that defines the center of a distribution.
- goal: find the single score that is most typical or
most representative of the entire group.
- in other words, this is the average value that
can be used to describe the population.

9. In a 50-point item quiz in experimental psychology, the class got an average score of 15.
Meanwhile, student A got a score of 48, student B got a score of 47. On the other hand, student C
obtained a score of 45, while student D got a score of 46. To know the central tendency of the class,
the teacher should use:
a. Mean
b. Median
c. Mode
d. None of the above

1. MEAN
- average; sum of the observations or test scores divided by the
number of observations or scores.
- appropriate for: interval and ratio & when the distribution is
approximately normal

2. MEDIAN
- middle score in the distribution
- appropriate for: ordinal, interval, and ratio
- used when the distribution is skewed and when there are outliers
3. MODE
- the most frequently occurring score (common or
typical)
- can have more than one mode
- the only central tendency that can be used in
nominal/categorical data
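
As a rough illustration of why the choice of measure matters, the sketch below uses Python's standard `statistics` module on a small, made-up set of quiz scores (the numbers are hypothetical and are not taken from any item in this reviewer). A few very high scores pull the mean upward, while the median stays close to where most of the class scored.

```python
import statistics

# Hypothetical scores on a 50-point quiz: most are low, four are very high.
scores = [10, 11, 12, 12, 13, 14, 45, 46, 47, 48]

mean_score = statistics.mean(scores)         # sum of scores / number of scores
median_score = statistics.median(scores)     # middle of the sorted scores
mode_scores = statistics.multimode(scores)   # every most-frequent score

print(mean_score, median_score, mode_scores)  # 25.8 13.5 [12]
```

No student actually scored near 25.8, which is why the median is usually preferred when a distribution is skewed or has outliers.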

10. Ms. Bianca gave an examination to her BSP 2D class. Supposedly, the chapters covered in the
exam were chapters 1 to 3. However, the items she gave were from chapters 4 to 5, which led the
majority of the class to have lower scores, and only a few people got higher scores. With this given
information you would expect that the distribution might be:
a. Left skewed
b. Right skewed
c. Positively skewed
d. Two options are correct

SKEWNESS
- the nature and extent to which symmetry is absent
- proportion of higher vs. lower scores
- piling of scores on one side

Positively Skewed
- right skewed
- relatively few of the scores fall at the high end of the distribution
- greater proportion of lower scores
Negatively Skewed
- left skewed
- relatively few of the scores fall at the low end of the distribution
- greater proportion of higher scores
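
The direction of skew can also be checked numerically. The sketch below is one possible way to do it, using the moment coefficient of skewness (third central moment divided by the second central moment raised to 1.5). The function name and both score lists are hypothetical, chosen only for illustration.

```python
def skewness(scores):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in scores) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical exams: one where most scores are low, one where most are high.
hard_exam = [5, 6, 6, 7, 7, 8, 8, 9, 25, 28]
easy_exam = [2, 5, 21, 22, 22, 23, 23, 24, 24, 25]

print(skewness(hard_exam) > 0)  # True: positively (right) skewed
print(skewness(easy_exam) < 0)  # True: negatively (left) skewed
```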
11. When the results of their exams were released, Donny saw that he got a score of 90 in his
Abnormal Psychology class, 92 in his Developmental Psychology class, 89 in his Psychological
Assessment class, and 87 in his IOP class. Given these data, you could infer that:
a. He performed best in Developmental Psychology.
b. He performed equally in Abnormal Psychology &
Developmental Psychology.
c. He performed the least in Psychological assessment
d. None of the above.

GOLDEN RULE:
Raw scores are relatively useless.
Because without any frame of reference, raw scores
provide us very little information

STANDARD SCORE (OR Z SCORE)
- a raw score that has been converted from one scale to another scale, where the latter scale has
some arbitrarily set mean and SD

Z SCORE
- conversion of a raw score into a number indicating how many SD units the raw score is from the mean
- mean of 0 & SD of 1
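
The conversion can be sketched in a few lines of Python. The class means and SDs below are hypothetical; they are there only to show that the same raw score can mean very different things once a frame of reference is supplied.

```python
def z_score(raw, mean, sd):
    """How many SD units the raw score lies above (+) or below (-) the mean."""
    return (raw - mean) / sd

# Hypothetical: the same raw score of 85 in two different classes.
z_class_a = z_score(85, mean=80, sd=5)    # above the class mean
z_class_b = z_score(85, mean=90, sd=10)   # below the class mean

print(z_class_a, z_class_b)  # 1.0 -0.5
```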
12. For his Abnormal Psychology class, Donny got a score of 92 with a mean of 90 and a standard
deviation of 2. What is his z score?

c. 1.5

13. Given the data in item 12, what is Donny's T score?


a. 50
b. 60
c. 70
d. 80

T SCORE
- the "fifty plus or minus ten" scale
- mean of 50 & SD of 10
- advantage: none of the scores is negative
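
A T score is a rescaled z score (T = 50 + 10z). The sketch below works this through with the figures given in item 12 (raw score 92, mean 90, SD 2); the function names are illustrative only.

```python
def z_score(raw, mean, sd):
    """How many SD units the raw score lies from the mean."""
    return (raw - mean) / sd

def t_score(z):
    """Rescale a z score to a scale with a mean of 50 and an SD of 10."""
    return 50 + 10 * z

z = z_score(92, mean=90, sd=2)
t = t_score(z)

print(z, t)            # 1.0 60.0
print(t_score(-1.5))   # 35.0
```

A z score of -1.5 becomes a T score of 35, which shows how the T scale keeps scores below the mean from turning negative.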
14. This is the standardized score with a mean of 50 and a standard deviation of 10.
a. z score
b. T score
c. sten
d. stanine

15. The tails of the bell curve are:
a. Asymptotic
b. Asymptomatic
c. Symptomatic
d. Symptotic

The tails of a bell curve are asymptotic, which means that they approach but never quite meet the
horizontal axis.

16. Deliberately assigning what's considered as "pass" or "fail" on your variable (grades of the
participants) is an example of:
a. True dichotomous variable
b. Artificially Dichotomous variable
c. Dichotomous variable
d. Continuous variable

True Dichotomous variable


- naturally form two categories
- ex: sex (male or female), pregnancy (pregnant or not
pregnant)

Artificially Dichotomous variable


- considered as "artificial" because they reflect an underlying
continuous scale forced into a dichotomy
- ex: pass or fail, age period (e.g., young and old)

17. Vera wants to know if there is a significant difference between males & females when it
comes to resilience. What statistical tool should Vera use?
a. Unpaired t-test
b. Paired t-test
c. Independent t-test
d. Two options are correct

T-test
- is used to test for the significant difference between two
groups.
- one independent, categorical variable that has two
levels/groups & one continuous dependent variable.
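
As a minimal sketch of what the test computes, the code below calculates the independent-samples (pooled-variance) t statistic by hand. The resilience scores are made up, and the function name is hypothetical. In practice a statistics package would also report the p-value.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Student's t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (divides by n - 1)
    var_b = statistics.variance(group_b)
    # Pooled variance: weighted average of the two sample variances.
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, n_a + n_b - 2

# Hypothetical resilience scores for two groups.
males = [20, 22, 24, 26, 28]
females = [25, 27, 29, 31, 33]

t, df = independent_t(males, females)
print(t, df)  # -2.5 8
```

The obtained t is then compared with the critical value for the given degrees of freedom (about 2.306 for df = 8 at the two-tailed .05 level).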

18. This is a component of the observed test score that does not have anything to do with the test
taker's ability.
a. True score
b. Reliability coefficient
c. Error
d. Validity

19. While the testing was on-going, a man with a knife was outside the room while shouting
profanities and threatening the people outside. This caused the test takers to be worried and anxious.
This could serve as a:
a. Systematic error
b. Random error
c. Measurement error
d. Bias error

MEASUREMENT ERROR - refers to all factors associated with the process of measuring some variable,
other than the variable being measured.

RANDOM ERROR — caused by unpredictable fluctuations and inconsistencies of other variables in the
measurement process.

SYSTEMATIC ERROR - typically constant or proportionate to what is presumed to be the true value of
the variable being measured.

20. This refers to the consistency of test scores and a measurement.


a. Reliability
b. Validity
c. Utility
d. NOTA

Reliability
- refers to the consistency of the scores in a measurement
- it is the degree to which a measurement instrument yields the same results on repeated trials

21. It requires the administration of the same test to the same people twice.
a. Alternate forms reliability
b. Split-half reliability
c. Test-retest reliability
d. Inter-rater reliability

a. Alternate forms reliability - different tests, two administrations, same sample
b. Split-half reliability - same test, one administration
c. Test-retest reliability
d. Inter-rater reliability - same test, different raters, at the same time

Test-Retest Reliability
- estimate of reliability by correlating pairs of scores from the same people on two different
administrations of the same test
- appropriate for: tests that measure a construct that is relatively stable over time & tests that employ
outcome measures (e.g., reaction time, grip test)
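
A test-retest estimate is simply the correlation between the two sets of scores. The sketch below computes Pearson's r by hand for five hypothetical test takers who took the same test twice; the data and function name are illustrative assumptions.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between paired scores x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical scores of the same five people on two administrations.
first_administration = [10, 12, 15, 18, 20]
second_administration = [11, 12, 16, 17, 21]

r = pearson_r(first_administration, second_administration)
print(round(r, 2))  # 0.98
```

An r close to 1 suggests that people kept roughly the same relative standing across the two administrations.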

22. Donny wants to establish the reliability of his newly constructed test. It is a multiple-choice type of
test with only one correct answer, further he made sure that all of the items have a p = 0.50. What
should Donny use?
a. Cronbach's alpha
b. KR20
c. KR21
d. Spearman-Brown

Measures of Internal Consistency


KR20
- or Kuder-Richardson 20
- calculates the reliability of a test in which the items are
dichotomous
KR21
- calculates the reliability of a test in which the items are
dichotomous with equal difficulty
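
The two formulas can be compared on a small data set. This is a sketch under stated assumptions: the responses are invented, the function names are hypothetical, and the total-score variance is computed by dividing by n (some texts divide by n - 1, which changes the values slightly).

```python
def total_stats(responses):
    """Return number of items, mean total score, and variance of total scores."""
    n, k = len(responses), len(responses[0])
    totals = [sum(person) for person in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    return k, mean_total, var_total

def kr20(responses):
    """KR-20: uses each item's own proportion correct (p) and incorrect (q)."""
    n = len(responses)
    k, _, var_total = total_stats(responses)
    sum_pq = 0.0
    for i in range(k):
        p = sum(person[i] for person in responses) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)

def kr21(responses):
    """KR-21: assumes all items are equally difficult, so only the mean is needed."""
    k, mean_total, var_total = total_stats(responses)
    return (k / (k - 1)) * (1 - mean_total * (k - mean_total) / (k * var_total))

# Hypothetical data: 6 test takers x 4 dichotomous items (1 = correct).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

print(round(kr20(responses), 3), round(kr21(responses), 3))  # 0.833 0.75
```

Because the items here differ in difficulty, KR21 comes out lower than KR20; the two agree only when every item has the same p.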

23. A Cronbach's alpha of .98 is considered as:


a. Acceptable range of reliability
b. Excellent range of reliability
c. Good range of reliability
d. Questionable

Measures of Internal Consistency


Coefficient Alpha
- it is the mean of all possible split-half correlations, corrected by the Spearman Brown formula
- useful for tests with non-dichotomous items (e.g., Likert scale)
- received the most acceptance and is widely used today
- ranges from 0 to 1
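
In practice, alpha is usually computed from item variances rather than by averaging split halves: alpha = (k / (k - 1)) x (1 - sum of item variances / variance of total scores). The sketch below applies that formula to hypothetical 5-point Likert responses; the data and function names are assumptions for illustration.

```python
def variance(values):
    """Population variance (divides by n)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def cronbach_alpha(responses):
    """Coefficient alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])
    item_variances = [variance([person[i] for person in responses]) for i in range(k)]
    total_variance = variance([sum(person) for person in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical data: 5 respondents x 3 Likert items (1 to 5).
likert_responses = [
    [4, 5, 4],
    [3, 3, 3],
    [5, 5, 4],
    [2, 2, 3],
    [1, 2, 1],
]

alpha = cronbach_alpha(likert_responses)
print(round(alpha, 2))  # 0.95
```

The result is the same whether variances divide by n or n - 1, as long as the same convention is used for the items and the totals.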

24. In order to measure the reliability of a behavioral checklist, Christine decided to ask three of her
colleagues to use it. Afterwards, she used a statistical tool to measure the degree of consistency of
their assessment. What method of reliability measure was used by Christine?
a. Test-retest reliability
b. Parallel forms reliability
c. Alternate forms reliability
d. Inter-rater reliability

Inter-rater Reliability
- is the degree of agreement or consistency between 2 or more scorers with regard to a particular
measure
- ex of behaviors scored by raters: depressed mood (as observed) or nonverbal performance tasks
- often used in coding nonverbal behavior
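
The notes do not name a specific statistic for rater agreement. As one common possibility (an assumption, not something stated in the source), the sketch below computes simple percent agreement and Cohen's kappa for two raters who coded the same ten observations. The ratings are made up, and kappa as written handles only two raters.

```python
from collections import Counter

def percent_agreement(rater_1, rater_2):
    """Proportion of observations on which the two raters gave the same code."""
    return sum(a == b for a, b in zip(rater_1, rater_2)) / len(rater_1)

def cohens_kappa(rater_1, rater_2):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_1)
    observed = percent_agreement(rater_1, rater_2)
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    categories = set(rater_1) | set(rater_2)
    # Chance agreement: product of each rater's marginal proportions per category.
    expected = sum(counts_1[c] * counts_2[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical codes: was the target behavior observed?
rater_1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes"]
rater_2 = ["yes", "yes", "no", "yes", "yes", "no", "yes", "no", "no", "yes"]

print(percent_agreement(rater_1, rater_2))       # 0.8
print(round(cohens_kappa(rater_1, rater_2), 2))  # 0.58
```

Kappa is lower than raw agreement because some of the matches would be expected by chance alone.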

25. Joan created a test for conscientiousness for the first time and she is now about to assess its
reliability. When she asked her mentor, she was advised to use test-retest reliability. Was her mentor
correct?
Choose the best answer.
a. Yes, because conscientiousness is a homogenous variable and, in such
instances, test-retest should be used.
b. No, because conscientiousness is a heterogeneous variable therefore
it should not be used.
c. Yes, because conscientiousness is a static variable and is stable
over time, therefore test-retest should be used.
d. No, because conscientiousness is dynamic and changes over time,
internal consistency should be used.

Nature of a Test
Stable vs. Dynamic
- If the test measures a variable or a trait that is stable and we
wish to assess the stability of the trait across different
situations, then we use Test-Retest Reliability.

- If the concern is still stability, but we would like to control for
practice and carryover effects, and if a test has more than one
form, then we use Alternate-Form Reliability.

26. This refers to the 'tailedness' of a distribution.


a. Skewness
b. Normality
c. Kurtosis
d. None of the above

Kurtosis
- is a measure of the tailedness of a distribution — how often outliers
occur
- Kurtosis is not concerned with peakedness nor variability, but with
the weight of the tails in comparison to a normal distribution
(Weunch, 2014)

27. When Gino checked the distribution of the scores of the test he made, he noticed that there were
many outliers at both tails of his distribution. Thus, we could say that this might be what
kind of distribution?
a. platykurtic
b. leptokurtic
c. normal
d. none of the above

Mesokurtic - medium tailed; outliers are neither highly frequent nor infrequent

Leptokurtic - tails are heavy & shoulders are light; more extreme values at the tails

Platykurtic - tails are light while shoulders are heavy; fewer outliers at the tails
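
Tail weight can be quantified with excess kurtosis (fourth central moment divided by the squared second central moment, minus 3), which is about 0 for a normal distribution, positive for leptokurtic data, and negative for platykurtic data. The sketch below uses two hypothetical score lists; the function name is an illustrative assumption.

```python
def excess_kurtosis(scores):
    """Excess kurtosis: m4 / m2**2 - 3 (population moments)."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in scores) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

# Hypothetical scores: one set with extreme values at both ends, one evenly spread.
heavy_tails = [0, 48, 49, 50, 50, 50, 50, 51, 52, 100]
light_tails = [10, 20, 30, 40, 50, 60, 70, 80, 90]

print(excess_kurtosis(heavy_tails) > 0)  # True: leptokurtic
print(excess_kurtosis(light_tails) < 0)  # True: platykurtic
```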

28. Which among the following is/are not TRUE?


I. Validity is the extent to which a test measures what it purports to
measure.
II. A test can be universally valid for all time, all uses, and with all types
of test takers.
III. The validity of a test may diminish as the culture or the time changes.
IV. A test's validity may not be reestablished with the same or other
populations.
a. Only I & III are false
b. Only II & IV are false
c. All are false
d. All are true

Validity
- a judgment or estimate of how well a test measures what it purports to measure in a particular
context.
- a "valid" test is a test that has been shown to be valid for a particular use with a particular population
of test takers at a particular time. No test is "universally valid" for all time, for all uses, and with all types
of populations.

- the validity of a test may diminish as the culture or the times change; thus a test's validity may have to be
reestablished with the same or other test taker populations.

Validation — is the process of gathering and evaluating evidence about validity. Both the test developer
& test user may play a role in this.

29. What is the relationship of reliability & validity?


a. A test cannot be valid if it is not reliable.
b. A test cannot be reliable if it is not valid.
c. Reliability does not limit validity.
d. There is no relationship between the two.

Relationship between Reliability and Validity


"From the psychometric perspective, evidence of score reliability is considered to be necessary, but not
sufficient, condition for validity (Urbina, 2014)."
30. This is the validation used to check for the appropriateness of a survey.
a. content
b. criterion
c. construct
d. face

Validity
Content validity
- evaluates how adequately an instrument covers all relevant parts of the construct it aims to measure
- the extent to which the items are appropriate/representative relative to the construct being measured
- ex: depression has behavioral, affective, and cognitive aspects which must be covered adequately in a
scale

31. To assess the validity of the newly created college admission test of Backburner University, they
have decided to use the first semester GWA of the incoming freshman students as a standard to
assess if the test was actually effective. What measure of validity will they use?
a. Content
b. Construct
c. Criterion
d. Face

Validity
Criterion validity
- a judgment of how adequately a test score can be used to infer an individual's most probable standing
on some measure of interest — a criterion
- it inquires into the relationship between scores on a test and an external criterion

Criterion — the standard against which a test/test score is compared or evaluated; it is a direct and
independent measure of what the test is designed to predict (can be another test or an outcome
measure)

Validity
b. Predictive validity
- measures the relationship between the test scores and a criterion measure obtained at a future time
- how accurately scores on the test predict some criterion measure
- ex: the relationship between a college admission test and freshman GPA provides evidence of the
predictive validity of the admission test

32. During the hiring process, Anjo scored high on the aptitude test, which led to his being hired.
However, after three months, his assessment showed that he was not really fit for the job because of
his low job performance. What kind of miss rate does this depict?
a. Sigma error
b. Alpha Error
c. Beta Error
d. Item error
Alpha Error/Type I Error
- False positive
- Rejecting the null hypothesis when it is true
- "He said he likes you, but it turns out he doesn't." (People like this should be stuffed in a sack and
thrown far away.)

Beta Error/Type II Error
- False negative
- Failing to reject the null hypothesis when it is false
- "You said you have no feelings for them, but you actually do." (Look at her, fully committed to being in denial.)

33. Based on the vignette in item 32, the aptitude test is lacking in what kind of validity?
a. Content
b. Concurrent
c. Predictive
d. Construct

34. To check for the validity of his newly constructed life satisfaction scale, Jae correlated the scores of
the test takers in this scale to their scores in Beck's Depression Inventory. The results showed that
there is no significant correlation between the two. This is evidence of what kind of validity?
a. Content
b. Construct
c. Criterion
d. None of the above

Validity
Construct validity
- the extent to which the test may be said to measure a theoretical construct or trait
- the judgment about the appropriateness of inferences drawn from test score regarding individual
standings on a construct

Construct validity
- evidences:
● homogeneity
● evidences of changes with age
● evidences of pretest & posttest
● evidences of distinct groups
● convergent evidence
● discriminant evidence

Construct validity
Convergent evidence
- is shown when scores on the test tend to correlate highly in the predicted direction with scores on older,
more established, and already validated tests designed to measure the same or a similar construct
Discriminant evidence
- is shown when the validity coefficient shows little (insignificant) relationship between the test scores
and scores on other tests or variables with which they should not theoretically be related

36. This is a rating error in which the rater's ratings are consistently overly negative due to his/her
tendency to be too strict.
a. Severity error
b. Negative error
c. Horn effect
d. Extreme effect

RATING ERRORS
LENIENCY — lenient in scoring, marking, or grading; rates too positively
SEVERITY — tendency of the rater to be too strict or negative; rates too negatively
CENTRAL TENDENCY ERROR — ratings tend to cluster in the middle of the rating continuum
HALO EFFECT - tendency to give a particular ratee a higher rating than he or she objectively deserves because
of the rater's failure to discriminate among conceptually distinct aspects of a ratee's behavior.

HORN EFFECT - the reverse of the halo effect; the rater is biased against the ratee and gives a lower rating than is objectively deserved.

37. What is the correct process of test development?
a. Test conceptualization, Test construction, Item analysis, Test tryout, and Test revision
b. Test construction, Item analysis, Test tryout, Test conceptualization, Test revision
c. Test conceptualization, Test construction, Test tryout, Item analysis, Test revision
d. Test conceptualization, Item analysis, Test conceptualization, Test Revision, Test tryout

38. Jay-F has already decided the scaling and scoring method that he will use, and is now writing the
test items for the depression scale that he is developing. Jay-F is in what stage of test development?
a. Test conceptualization
b. Test construction
c. Item analysis
d. Test tryout

Test Development
Test construction
- involves writing the items and the decision making with regard to the scaling and scoring method that
will be utilized.
Writing items
- Item pool — a reservoir from which items will or will not be drawn for the final version of the test
- The item pool should contain at least twice the number of the items expected to be included in the
final version of the test.
- Content validity should be kept in mind.

Item Analysis
- in which the item difficulty index, item reliability index, item validity index, and item discrimination
index are analyzed.

40. Jay-F's target number of items is 20, he should at least have how many items in his item pool?
a. 25
b. 30
c. 40
d. 50

41. The larger the item difficulty index, the ____ the item is.
a. More difficult
b. Easier
c. Moderately difficult
d. NOTA

Item Analysis
Item Difficulty Index
- it answers the question: what proportion of test takers
answered each item correctly?
- its value can theoretically range from 0 to 1
- the larger the difficulty index, the easier the item.
- an item is not good if everyone gets it right or if everyone gets it wrong
- range of difficulty should be: 0.30 to 0.80
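
The index is just a proportion, as the short sketch below shows. The response lists are hypothetical (1 = correct, 0 = incorrect), and the helper names are illustrative; the 0.30 to 0.80 range is the one given in the notes above.

```python
def item_difficulty(item_scores):
    """Proportion of test takers who answered the item correctly."""
    return sum(item_scores) / len(item_scores)

def within_recommended_range(p, low=0.30, high=0.80):
    """True if the difficulty index falls in the range suggested in the notes."""
    return low <= p <= high

# Hypothetical responses of ten test takers to three items.
easy_item = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
moderate_item = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0]
hard_item = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]

for name, item in [("easy", easy_item), ("moderate", moderate_item), ("hard", hard_item)]:
    p = item_difficulty(item)
    print(name, p, within_recommended_range(p))
# easy 0.9 False
# moderate 0.6 True
# hard 0.2 False
```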

42. An item should discriminate between high scorers and low scorers. It is a red flag if either the high
scorers or low scorers are more likely to answer the item correctly, instead half of the two groups
should answer it correctly to say that the said item discriminates between the two groups.
a. first statement is inaccurate
b. second statement is inaccurate
c. both statements are accurate
d. both statements are inaccurate

Item Discrimination Index


- a measure of the difference between the proportion of high scorers answering an item correctly and
the proportion of low scorers answering the item correctly
- the higher the value of d, the greater the number of high scorers answering the item correctly
- an item is good if most of the high scorers answered the item correctly
- a negative d-value is a red flag because it shows that low-scoring examinees are more likely to answer
the item correctly than high-scoring examinees
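
A minimal sketch of the computation, assuming the test takers have already been split into an upper-scoring and a lower-scoring group (how the groups are formed is not specified in these notes). The response lists are hypothetical, with 1 = correct and 0 = incorrect.

```python
def discrimination_index(upper_group, lower_group):
    """d = proportion correct in the upper group minus proportion correct in the lower group."""
    p_upper = sum(upper_group) / len(upper_group)
    p_lower = sum(lower_group) / len(lower_group)
    return p_upper - p_lower

# Hypothetical item that high scorers tend to get right.
d_good = discrimination_index(upper_group=[1, 1, 1, 1, 0], lower_group=[0, 1, 0, 0, 0])

# Hypothetical item that low scorers get right more often: a red flag.
d_flagged = discrimination_index(upper_group=[0, 0, 1, 0, 0], lower_group=[1, 1, 1, 0, 1])

print(round(d_good, 2), round(d_flagged, 2))  # 0.6 -0.6
```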

43. Bryan is a newly hired registered psychometrician in the HR department of Company B. During
the orientation, he was told that he will be administering the Thematic Apperception Test to the
applicants. What should Bryan do?
a. Just do what he is told for he might be fired.
b. Administer the test in the future because he can use it as a psychometrician.
c. Report his boss to the Psychological Association of the Philippines.
d. Do not administer the test.

44. What could be the reason for the answer in item 43?
a. He can because an RPm is allowed to take a Level B test.
b. He cannot because an RPm cannot take a Level C test.
c. He cannot because he has no prior training to conduct TAT.
d. Two of the options could be correct.

Test-User Qualifications - The APA Committee on Ethical Standards for Psychology, in Ethical
Standards for the Distribution of Psychological Tests and Diagnostic Aids, defined three levels of tests
in terms of the required knowledge of testing & psychology:
Level A - tests that can adequately be administered, scored, and interpreted with the aid of the manual & a general
orientation
Level B — tests that require some technical knowledge of test construction and use
Level C — tests that require substantial understanding of testing and supporting psychological fields
together with supervised experience

45. Ley, a psychologist, needs to conduct an intelligence test on her 15-year-old client. What test
should she use?
a. WAIS
b. WISC
c. WPPSI
d. WASI

WPPSI-IV - 2 y/o to 7 y/o
WISC-V - 6 y/o to 16 y/o
WAIS-IV - 16 y/o to 90 y/o
WASI-II - 6 y/o to 90 y/o
Note: all of these have a mean of 100 and SD of 15

46. The clinician needed a test that measures psychopathology in a relatively quick time. What would be
his best option?
a. MMPI-II
b. BPI
c. PNLT
d. WISC

Basic Personality Inventory (BPI)


- designed to measure aspects of personality and psychopathology.
- can be completed in approx. 30-40 minutes
- has 240 items, 12 scales
- 12 years and older
- scales measure traits such as: hypochondriasis, depression, denial, interpersonal problems, alienation,
persecutory ideas, anxiety, thinking disorder, impulse expression, social introversion, self-depreciation,
and deviation

47. The mother of your 24-year-old client is demanding the results of her son's psychological tests.
She wants to check the veracity of her son's results and insists that, as his
mother, it is her right. What is the best course of action?
a. Show the mother the results of her son because she is the mother after all.
b. Deceive the mother and show her fake results in order to protect his confidentiality.
c. Do not show the results
d. Report the mother to the authorities because she is trying to invade the privacy of her child.
F. Release of Test Data
1. It is our responsibility to ensure that test results and interpretations are not used by persons
other than those explicitly agreed upon by the referral sources prior to the assessment
procedure.
2. We do not release test data in the forms of raw and scaled scores, client's responses to test
questions or stimuli, and notes regarding the client's statements and behaviors during the
examination unless regulated by the court.

48. Jerem is a psychologist who fell in love with one of his former clients, Linette. He
pursued her and entered into a romantic relationship with her three years after their therapy sessions
had ended. Was this act of Jerem ethical?
a. Yes, because enough time has passed since their client-therapist relationship ended.
b. No, therapists and their former clients should never ever be in a relationship.
c. Yes, there isn't any stipulation in the code of ethics about client-therapist romantic relationships.
d. No, the time that has passed since their professional relationship ended is not yet sufficient.

49. After concluding that he is not competent enough to handle the case of his client, Dio has decided
to refer his client to his colleague named Lyden, and he gave her all the relevant psychological data
that he has. On the day of their supposed therapy session, Dio told his client that instead of having
their session, he will bring her to Lyden to introduce them to each other. Was Dio's behavior ethical?
a. Yes, because he was not competent enough to handle the case of his client. It goes without saying
that he has to refer her to another professional.
b. Yes, his colleague was competent enough to handle his referred client and the former also accepted
his referral.
c. No, he accepted his client's case therefore he should see through it until the end.
d. No, because he failed to respect the client's autonomy and rights to confidentiality.

H. Referrals
1. We ensure that referrals to colleagues are discussed with and consented to by our clients. We
provide an explanation to clients regarding the disclosure of information that accompanies the
referral.
2. We ensure that the recipient of the referral is competent in providing the service and the client
will likely benefit from the referral.
3. In considering referrals, we carefully assess the appropriateness of the referral, the benefits of the referral to the
client, and the adequacy of the client's consent for referral.

50. You have discussed with your colleagues the case of a particular client of yours in order to ask for
their professional insight about it. Is this unethical?
a. Yes, we should never talk about the case of our clients to anyone.
b. Maybe. It depends on how you discussed the case with them.
c. The PAP code of ethics does not have any stipulation about this.
d. No, since it is a professional discussion that might enrich your insights and findings about the case of
your client, it is not unethical

F. Consultation
1. We do not discuss with our colleagues or other professionals confidential information that could
lead to the identification of the client, unless the client gave consent or the disclosure cannot be
avoided.
2. When we seek opinion from our colleagues or other professionals, we make sure that the
extent to which we disclose information is limited to what is only needed to achieve the
purpose.
