
REPORTERS

Eusebio, Neshema Faith
Matillano, Abegail Joy
Danieles, Allysa Mae
Amoguez, Apple Jane
Bienes, Jessica
Unasin, Francis Mae
Chapter 6
ITEM ANALYSIS AND VALIDATION
LEARNING OUTCOMES

• Explain the meaning of item analysis, item validity, reliability, item difficulty, and discrimination index
• Determine the validity and reliability of given test items
• Determine the quality of a test item by its difficulty index, discrimination index, and plausibility of options (for a selected-response test)
ITEM ANALYSIS
• It is the act of analyzing student responses to
individual exam questions with the intention of
evaluating exam quality.
• It is an important tool to uphold test
effectiveness and fairness.
• It will provide information that will allow the
teacher to decide whether to revise or replace
an item.
Two important characteristics of an item

a) Difficulty Index
b) Discrimination Index
Difficulty Index

The difficulty index is defined as the number of students who are able to answer the item correctly divided by the total number of students. Thus:

Item difficulty = number of students with correct answer / total number of students

The item difficulty is usually expressed as a percentage.


Encoded Pilot Testing Result

Item difficulty = number of students with correct answer / total number of students
= 14/20
= 0.7 (70%)
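
A minimal Python sketch of the same computation, classifying the result with the ranges from the table on the next slide (the variable names are illustrative, not from the source):

```python
# Difficulty index for the pilot-test example above:
# 14 of 20 students answered the item correctly.
correct = 14
total = 20

difficulty = correct / total
print(f"Item difficulty: {difficulty:.2f} ({difficulty:.0%})")  # 0.70 (70%)

# Classify using the ranges from the table on the next slide.
if difficulty <= 0.25:
    print("Difficult -- revise or discard")
elif difficulty <= 0.75:
    print("Right difficulty -- retain")
else:
    print("Easy -- revise or discard")
```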
Encoded Pilot Testing Result
Range of Difficulty Index | Interpretation   | Action
0 - 0.25                  | Difficult        | Revise or discard
0.26 - 0.75               | Right difficulty | Retain
0.76 and above            | Easy             | Revise or discard

Encoded Pilot Testing Result
Discrimination Index

The discrimination index refers to the power of an item to discriminate between students who scored high and those who scored low on the overall test.
Encoded Pilot Testing Result
Index of discrimination = DU − DL (U = upper group; L = lower group)

Take the upper 25% of the class (highest scorers) and the lower 25% of the class (lowest scorers):

25% × total no. of students
= 0.25 × 20
= 5, i.e., 5 students in the upper 25% and 5 students in the lower 25%
Encoded Pilot Testing Result
Discrimination Index = difficulty index of upper 25% − difficulty index of lower 25%
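
A minimal Python sketch of this computation, assuming hypothetical counts of correct answers in each group and classifying the result with the index-range table on the next slide:

```python
# 25% x 20 students = 5 students in the upper group, 5 in the lower group.
group_size = int(0.25 * 20)           # = 5

# Assumed counts of correct answers per group for one item (illustrative).
upper_correct = 4
lower_correct = 1

p_upper = upper_correct / group_size  # difficulty index of upper 25%
p_lower = lower_correct / group_size  # difficulty index of lower 25%

discrimination = p_upper - p_lower    # DU - DL
print(f"Discrimination index: {discrimination:.2f}")  # 0.60

# Classify using the index-range table on the next slide.
if discrimination >= 0.46:
    print("Discriminating item -- include")
elif discrimination > -0.50:
    print("Non-discriminating -- revise")
else:
    print("Can discriminate but item is questionable -- discard")
```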
Encoded Pilot Testing Result
Index Range   | Interpretation                            | Action
-1.0 - -0.50  | Can discriminate but item is questionable | Discard
-0.55 - 0.45  | Non-discriminating                        | Revise
0.46 - 1.0    | Discriminating item                       | Include
Encoded Pilot Testing Result
More Sophisticated Discrimination Index

Item discrimination is the ability of an item to differentiate among students based on their understanding of the test material. Traditional hand-calculation methods compare item responses to total test scores, but computerized analyses offer more accurate assessment by considering all student responses.
More Sophisticated Discrimination Index
The item discrimination index by ScorePak® is a
Pearson Product Moment correlation between student
responses to a specific item and total scores on all other items
on the test. It estimates the degree to which an individual item
is measuring the same thing as the rest of the items. The
discrimination index reflects the degree to which an item and
the test as a whole measure a unitary ability or attribute.
Values of the coefficient tend to be lower for tests measuring a
wide range of content areas than for more homogeneous tests.
Item discrimination indices should be interpreted in the
context of the test type, with items with low discrimination
indices often being ambiguously worded.
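
ScorePak® itself is proprietary, so the sketch below only mirrors the statistic described: a Pearson correlation between each item's scored responses and the total score on all other items. The data set is hypothetical, and scipy is assumed to be available:

```python
import numpy as np
from scipy.stats import pearsonr

# Scored responses (1 = correct, 0 = wrong): rows = students, columns = items.
responses = np.array([
    [1, 1, 1, 0, 1],
    [1, 0, 1, 1, 1],
    [1, 1, 0, 1, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
])

for item in range(responses.shape[1]):
    # Total score on all OTHER items, so the item is not correlated with itself.
    rest_score = responses.sum(axis=1) - responses[:, item]
    r, _ = pearsonr(responses[:, item], rest_score)
    # ScorePak's published bands: good > .30, fair .10-.30, poor < .10.
    label = "good" if r > 0.30 else ("fair" if r >= 0.10 else "poor")
    print(f"Item {item + 1}: discrimination = {r:+.2f} ({label})")
```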
More Sophisticated Discrimination Index

Tests with high internal consistency consist of items with mostly positive relationships with total test score. In practice, values of the discrimination index will seldom exceed .50 because of the differing shapes of item and total score distributions. ScorePak® classifies item discrimination as "good" if the index is above .30; "fair" if it is between .10 and .30; and "poor" if it is below .10.

A good item is one that has good discriminating ability and a sufficient level of difficulty (neither too difficult nor too easy).
More Sophisticated Discrimination Index

At the end of the item analysis report, test items are listed according to their degrees of difficulty (easy, medium, hard) and discrimination (good, fair, poor). These distributions provide a quick overview of the test and can be used to identify items which are not performing well and which can perhaps be improved or discarded.
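
A sketch of how such a report could be assembled, bucketing each item by the difficulty and discrimination bands used in this deck (the helper below is illustrative, not ScorePak's actual code):

```python
import numpy as np

def item_analysis_report(responses):
    """Bucket each item by difficulty (easy/medium/hard) and discrimination
    (good/fair/poor); responses: rows = students, columns = items (0/1).
    Assumes every item column has some variation in responses."""
    totals = responses.sum(axis=1)
    report = []
    for j in range(responses.shape[1]):
        p = responses[:, j].mean()                    # difficulty index
        rest = totals - responses[:, j]               # score on all other items
        r = np.corrcoef(responses[:, j], rest)[0, 1]  # discrimination index
        difficulty = "hard" if p <= 0.25 else ("easy" if p > 0.75 else "medium")
        discrim = "good" if r > 0.30 else ("fair" if r >= 0.10 else "poor")
        report.append((j + 1, round(p, 2), difficulty, round(r, 2), discrim))
    return report
```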
VALIDATION AND VALIDITY

Validation is the process of collecting and analyzing evidence to support the meaningfulness and usefulness of the test.

Validity is the extent to which a test measures what it purports to measure, or the appropriateness, correctness, meaningfulness, and usefulness of the specific decisions a teacher makes based on the test results.
There are essentially three main types
of evidence that may be collected: content-
related evidence of validity, criterion-
related evidence of validity and
construct-related evidence of validity.

Content-related evidence of
validity refers to the content and format of
the instrument.
Criterion-related evidence of
validity refers to the relationship between
scores obtained using the instrument and
scores obtained using one or more other
tests (often called the criterion).

Construct-related evidence of
validity refers to the nature of the
psychological construct or characteristic
being measured by the test.
The usual procedure for determining content validity may be described as follows:

• The teacher writes out the objectives of the test based on the Table of Specifications and then gives these, together with the test, to at least two (2) experts along with a description of the intended test takers.
• The experts look at the objectives, read over the items in the test, and place a check mark in front of each question or item that they feel does not measure one or more objectives.
• They also place a check mark in front of each objective not assessed by any item in the test.
• The teacher then rewrites any item checked and resubmits it to the experts, and/or writes new items to cover those objectives not covered by the existing test.
• This continues until the experts approve of all items and agree that all of the objectives are sufficiently covered by the test.
In order to obtain evidence of criterion-related validity, the teacher usually compares scores on the test in question with scores on some other independent criterion test which presumably already has high validity. This type of criterion-related validity is called concurrent validity. Another type of criterion-related validity is called predictive validity, wherein the test scores in the instrument are correlated with scores on a later performance (criterion measure) of the students.
In summary, content validity refers to how well the test items reflect the knowledge actually required for a given topic area (e.g., math).

Criterion-related validity is also known as concrete validity because criterion validity refers to a test's correlation with a concrete outcome. In the case of a pre-employment test, the two variables that are compared are test scores and employee performance.

There are two main types of criterion validity: concurrent validity and predictive validity.

● Concurrent validity refers to a comparison between the measure in question and an outcome assessed at the same time.
● In predictive validity, we ask this question: Do the scores in the NAT Math exam predict the Math grade in Grade 12?
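
A minimal sketch of how this predictive-validity question could be checked numerically, correlating NAT Math scores with later Grade 12 Math grades (the numbers below are purely illustrative):

```python
from scipy.stats import pearsonr

# Hypothetical paired data for ten students (illustrative only).
nat_math = [82, 75, 91, 68, 88, 79, 95, 72, 85, 77]
g12_math = [85, 78, 92, 70, 86, 80, 94, 75, 83, 81]

r, p_value = pearsonr(nat_math, g12_math)
print(f"Predictive validity coefficient: r = {r:.2f}")  # high r -> good predictor
```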
RELIABILITY

Reliability refers to the consistency of the scores obtained: how consistent they are for each individual from one administration of an instrument to another and from one set of items to another. The reliability of a test can be computed in several ways; for internal consistency, for instance, we could use the split-half method or the Kuder-Richardson formulae (KR-20 or KR-21).
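
The deck cites these formulas without showing them; below is a minimal Python sketch of both KR-20 and the split-half method (stepped up with the Spearman-Brown correction), assuming dichotomously scored (0/1) items:

```python
import numpy as np

def kr20(responses):
    """Kuder-Richardson formula 20; responses is a 2-D 0/1 array,
    rows = students, columns = items."""
    k = responses.shape[1]                   # number of items
    p = responses.mean(axis=0)               # proportion correct per item
    q = 1 - p                                # proportion incorrect per item
    total_var = responses.sum(axis=1).var()  # variance of total scores
    return (k / (k - 1)) * (1 - (p * q).sum() / total_var)

def split_half(responses):
    """Split-half reliability: correlate odd- and even-numbered half
    scores, then apply the Spearman-Brown correction for full length."""
    odd = responses[:, 0::2].sum(axis=1)
    even = responses[:, 1::2].sum(axis=1)
    r_half = np.corrcoef(odd, even)[0, 1]
    return 2 * r_half / (1 + r_half)
```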
Reliability and validity are related concepts. If an instrument is unreliable, it cannot yield valid outcomes. As reliability improves, validity may improve (or it may not). However, if an instrument is shown scientifically to be valid, then it is almost certain that it is also reliable.

Predictive validity compares the question with an outcome assessed at a later time. An example of predictive validity is a comparison of scores in the National Achievement Test (NAT) with first-semester grade point average (GPA) in college: do NAT scores predict college performance? Construct validity refers to the ability of a test to measure what it is supposed to measure. If, as a researcher, you intend to measure depression but actually measure anxiety, your research is compromised.
The following table is a standard followed almost universally in educational test and measurement.

Reliability   | Interpretation
.90 and above | Excellent reliability; at the level of the best standardized tests.
.80 - .90     | Very good for a classroom test.
.70 - .80     | Good for a classroom test; in the range of most. There are probably a few items which could be improved.
.60 - .70     | Somewhat low. This test needs to be supplemented by other measures (e.g., more tests) to determine grades. There are probably some items which could be improved.
.50 - .60     | Suggests need for revision of test, unless it is quite short (ten or fewer items). The test definitely needs to be supplemented by other measures (e.g., more tests) for grading.
.50 or below  | Questionable reliability. This test should not contribute heavily to the course grade, and it needs revision.
THANK YOU
FOR
LISTENING!
