Item Analysis and Evaluation Statistical Analysis of Assessment Data

An item analysis examines student responses to test items to assess item and test quality. It provides useful information for improving student learning and developing better future tests. The item difficulty index measures the proportion of students answering correctly. The discrimination index measures how well items differentiate between higher- and lower-scoring students. Together, these indices help evaluate item quality and identify items needing revision or removal to improve the test's validity in measuring the intended construct.

Uploaded by

Robert Terrado

ITEM ANALYSIS

After you create your assessment items and give your test, how can you be sure that the items are appropriate -- not too difficult and not too easy? How will you know if the test effectively differentiates between students who do well on the overall test and those who do not? An item analysis is a valuable procedure that teachers can use to answer both of these questions.
What is item analysis?
- It is a process that examines student responses to individual test items in order to assess the quality of those items and of the test as a whole.
Benefits derived from Item Analysis
1. It provides useful information for class
discussion of the test.
2. It provides data which helps students
improve their learning.
3. It provides insights and skills that lead
to the preparation of better tests in the
future.
2 main characteristics of an
item:
-Item Difficulty index
-Item Discrimination index
Item Difficulty Index
- a common and very useful analytical tool for statistical analysis, especially when it comes to determining the validity of test questions in an educational setting.
- It is often called the p-value because it is a measure of proportion.

Difficulty index (p) = number of students with the correct answer ÷ total number of students

The item difficulty is usually expressed as a percentage.

The higher the difficulty index, the easier the item is understood to be (Wood, 1960).
Example:
Out of the 20 students who
answered question five, only four
answered correctly.
4 ÷ 20 = 0.2
Because the resulting p-value is
closer to 0.0, we know that this is a
difficult question.
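The calculation above can be sketched in Python (a minimal illustration; the function name is my own):

```python
def difficulty_index(num_correct, num_students):
    """Proportion of students who answered the item correctly (the p-value)."""
    return num_correct / num_students

# Question five from the example: 4 of 20 students answered correctly.
p = difficulty_index(4, 20)  # 0.2 -- close to 0.0, so the item is difficult
```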
For example, let's say you gave a multiple choice quiz and
there were four answer choices (A, B, C, and D). The
following table illustrates how many students selected each
answer choice for Question #1 and #2.
Question   A     B     C     D
#1         0     3     24*   3
#2         12*   13    3     2

* Denotes correct answer
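From the answer-choice counts we can read off each question's difficulty index (each row sums to 30 students; a quick sketch with illustrative variable names):

```python
# Answer counts per choice: Question #1 -> C correct (24 of 30),
# Question #2 -> A correct (12 of 30).
total_students = 30
p_q1 = 24 / total_students  # 0.8 -- a fairly easy item
p_q2 = 12 / total_students  # 0.4 -- a noticeably harder item
```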


One problem with this type of difficulty
index is that it may not actually indicate
that the item is difficult or easy. A student
who does not know the subject matter will
naturally be unable to answer the item
correctly even if the question is easy. How
do we decide on the basis of this index
whether the item is too difficult or too easy?
Difficult items tend to discriminate between those who know the answer and those who do not.
Easy items cannot discriminate between these two groups of students.
We are therefore interested in deriving a
measure that will tell us whether an item
can discriminate between these two groups
of students. Such a measure is called an
index of discrimination.
Item Discrimination Index
- a measure of how well the item discriminates between examinees who are knowledgeable in the content area and those who are not.
There are several different formulas that calculate
item discrimination, but the one that is most
commonly used is called the point-biserial
correlation, which compares a test taker’s score on
an individual item with their score on the test
overall. For highly discriminating questions,
students who answer correctly are those who
have done well on the rest of the test.
The opposite is also true. Students who answer
highly discriminating questions incorrectly
tend to do poorly on the rest of the test as well.
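As a sketch of the standard point-biserial formula (the function name and sample data here are illustrative, not from the text):

```python
import math

def point_biserial(item, totals):
    """Point-biserial correlation between a 0/1 item column and total scores:
    r = (M1 - M0) / s * sqrt(p * q), where M1 and M0 are the mean totals of
    students who got the item right and wrong, s is the population standard
    deviation of the totals, p is the proportion correct, and q = 1 - p."""
    n = len(totals)
    right = [t for i, t in zip(item, totals) if i == 1]
    wrong = [t for i, t in zip(item, totals) if i == 0]
    m1, m0 = sum(right) / len(right), sum(wrong) / len(wrong)
    mean = sum(totals) / n
    s = math.sqrt(sum((t - mean) ** 2 for t in totals) / n)
    p = len(right) / n
    return (m1 - m0) / s * math.sqrt(p * (1 - p))

# Students who answered this item correctly also scored highest overall,
# so the item discriminates well (r close to +1).
r = point_biserial([1, 1, 1, 0, 0, 0], [10, 9, 8, 3, 2, 1])
```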
Item discrimination is measured on a scale from -1.0 to 1.0. Negative discrimination
indicates that students who are scoring highly
on the rest of the test are answering that
question wrong. This could mean that there is
a problem with the question, such as bias or
even a typo in the answer key. Test writers
should reevaluate questions that result in
negative discrimination because they do not
help to show mastery.
How to Find the Item Discrimination Index

Item discrimination index = (hc − lc) ÷ t

hc = number of students in the higher-scoring group who answered correctly
lc = number of students in the lower-scoring group who answered correctly
t = half of the class
1. Arrange your students from highest scorers to lowest scorers, with the highest scorers at the top.

2. Divide the table in half between high and low scorers, with an equal number of students on each side of the dividing line.

3. Subtract the number of students in the lower-scoring group who answered the question correctly (lc) from the number of students in the higher-scoring group who answered the question correctly (hc).

4. Divide the resulting number by the number of students on each side of your dividing line, which should be half of the class (t).
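The four steps above can be sketched as a short function (a minimal illustration assuming the 0/1 responses are already sorted from highest to lowest total score; the function name is my own):

```python
def discrimination_index(responses):
    """D = (hc - lc) / t for 0/1 responses sorted highest scorers first,
    with an even number of students."""
    t = len(responses) // 2      # half of the class
    hc = sum(responses[:t])      # correct answers in the upper half
    lc = sum(responses[t:])      # correct answers in the lower half
    return (hc - lc) / t

# Question 1 in the worked example: 4 correct in each half of 10 students.
d = discrimination_index([1, 1, 0, 1, 1, 1, 1, 1, 1, 0])  # (4 - 4) / 5 = 0.0
```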
STUDENT   TOTAL SCORE   Q1   Q2   Q3
A         90            1    0    1
B         90            1    0    1
C         80            0    0    1
D         80            1    0    1
E         70            1    0    1
F         60            1    0    0
G         60            1    0    1
H         50            1    1    1
I         50            1    1    1
J         40            0    1    1

"1" indicates the answer was correct; "0" indicates it was incorrect.
For Question #1, that means you would subtract 4 from 4 and divide by 5, which results in a Discrimination Index of 0.
Item         # Correct       # Correct       Difficulty   Discrimination
             (Upper group)   (Lower group)   (p)          (D)
Question 1   4               4               .80          0
Question 2   0               3               .30          -0.6
Question 3   5               1               .60          0.8


Item Analysis Worksheet
Ten students have taken an objective assessment. The quiz contained 10 questions. In the table below, the students' scores have been listed from high to low (A, B, C, D, E are in the upper half). There are five students in the upper half and five students in the lower half. The number "1" indicates a correct answer on the question; a "0" indicates an incorrect answer.

Student   Total Score (%)   Q1  Q2  Q3  Q4  Q5  Q6  Q7  Q8  Q9  Q10
A         100               1   1   1   1   1   1   1   1   1   1
B         90                1   1   1   1   1   1   1   1   0   1
C         80                1   1   0   1   1   1   1   1   0   0
D         70                0   1   1   1   1   1   0   1   0   1
E         70                1   1   1   0   1   1   1   0   0   1
F         60                1   1   1   0   1   1   0   1   0   0
G         60                0   1   1   0   1   1   0   1   0   1
H         50                0   1   1   1   0   0   1   0   1   0
I         40                1   1   1   0   1   0   0   0   0   1
J         30                0   1   0   0   0   1   0   0   1   0
Calculate the Difficulty Index (p) and the Discrimination
Index (D) for each question.
Item          # Correct       # Correct       Difficulty   Discrimination   Action
              (Upper group)   (Lower group)   (p)          (D)
Question 1    4               2               0.6          0.4              Revise
Question 2    5               5               1            0                Discard
Question 3    4               4               0.8          0                Discard
Question 4    4               1               0.5          0.6              Include
Question 5
Question 6
Question 7
Question 8
Question 9
Question 10
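The worksheet computations can be automated with a sketch that derives p and D for every column of a 0/1 response matrix (assuming rows are already sorted from highest to lowest total score; the function and the small sample matrix are illustrative, not the worksheet data):

```python
def item_stats(matrix):
    """Return (p, D) for each item of a response matrix whose rows are
    students sorted highest scorers first, with an even number of rows."""
    n = len(matrix)
    t = n // 2
    stats = []
    for col in range(len(matrix[0])):
        flags = [row[col] for row in matrix]
        hc, lc = sum(flags[:t]), sum(flags[t:])
        stats.append((sum(flags) / n, (hc - lc) / t))
    return stats

# Four students, two items: item 1 splits the halves perfectly (D = 1.0),
# item 2 is answered correctly by everyone (p = 1.0, D = 0.0).
stats = item_stats([[1, 1], [1, 1], [0, 1], [0, 1]])
```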
VALIDATION
-Validation is the process of collecting and
analyzing evidence to support the
meaningfulness and usefulness of the test.
What is Validity?
The concept of validity was formulated by
Kelly (1927, p. 14) who stated that a test is
valid if it measures what it claims to
measure.
For example a test of intelligence should
measure intelligence and not something else
(such as memory).
Several Ways to Estimate the Validity of a Test
Internal Validity

Internal validity refers to the extent to which changes in the dependent variable can be attributed to the experimenter's manipulation rather than to other factors.
External Validity

• External validity refers to the extent to which the findings of a study can be generalized.
2 Types
• Ecological Validity
>refers to the extent to which the results and conclusions can be generalized to real life.
• Population Validity
>refers to the extent to which the sample can be generalized to similar and wider populations.
Temporal Validity

• Temporal validity refers to the extent to which the findings and conclusions of a study remain valid when we consider the differences and progressions that come with time.
Test Validity

• Test validity refers to the extent to which the results of a study or test can be said to have meaning.
CONTENT VALIDITY- refers to the content and format
of the instrument.
-How appropriate the items seem to a panel of
reviewers who have knowledge of the subject
matter.
-Does the instrument include everything it should
and nothing it should not?
Example:
Constructing a vocabulary test using a sample of
all vocabulary words studied in a semester.
A comprehensive math achievement test would lack content validity if good scores depended primarily on knowledge of English, or if it only had questions about one aspect of math (e.g., algebra).
CONSTRUCT VALIDITY
-refers to the agreement of test results with certain characteristics which the test aims to portray.
For example, a test of intelligence nowadays
must include measures of multiple
intelligences, rather than just logical-
mathematical and linguistic ability measures.
CRITERION-RELATED VALIDITY
Also referred to as instrumental validity, it states that the criteria should be clearly defined by the teacher in advance. It has to take into account other teachers' criteria to be standardized, and it also needs to demonstrate the accuracy of a measure or procedure compared to another measure or procedure which has already been demonstrated to be valid.
CONCURRENT VALIDITY
Concurrent validity is a statistical method using
correlation, rather than a logical method.
Examinees who are known to be either masters or non
masters on the content measured by the test are identified
before the test is administered. Once the tests have been
scored, the relationship between the examinees’ status as
either masters or non-masters and their performance (i.e.,
pass or fail) is estimated based on the test. This type of
validity provides evidence that the test is classifying
examinees correctly. The stronger the correlation is, the
greater the concurrent validity of the test is.
PREDICTIVE VALIDITY
This is another statistical approach to validity that
estimates the relationship of test scores to an examinee's
future performance as a master or nonmaster.
Predictive validity considers the question,
"How well does the test predict examinees' future
status as masters or non-masters?" For this type of
validity, the correlation that is computed is based on
the test results and the examinee’s later performance.
This type of validity is especially useful
for test purposes such as selection or admissions.
FACE VALIDITY
Like content validity, face validity is determined by a
review of the items and not through the use of
statistical analyses. Unlike content validity, face
validity is not investigated through formal procedures.
Instead, anyone who looks over the test, including
examinees, may develop an informal opinion as to
whether or not the test is measuring what it is
supposed to measure. While it is clearly of some value to
have the test appear to be valid, face validity alone is
insufficient for establishing that the test is
measuring what it claims to measure.
Statistical Conclusion Validity

• It refers to the extent to which we can conclude that the results are statistically significant, that is, that we can establish cause and effect above chance.
Instrumental Validity

• It refers to the extent to which the instruments used to measure the dependent variables are correct for that measurement.
Diagnostic Validity

• It refers to the extent to which a diagnosis made about a condition is accurate.
• It is most commonly used in clinical settings.
STATISTICAL ANALYSIS OF ASSESSMENT DATA
Once we have collected quantitative data, we will have a lot of numbers. It's now time to carry out some statistical analysis to make sense of and draw some inferences from our data.
The first thing to do with any data is to
summarize it, which means to present it in a
way that best tells the story.
One of the most common techniques is using
graphs…( line graph, pie, bar..etc.)

Drawing a graph gives an immediate "picture" of the data. It is always worth drawing a graph before we start any further analysis.
2 Methods of Statistical Analysis
DESCRIPTIVE STATISTICS- Descriptive
statistics try to describe the relationship
between variables in a sample or population.
-It provides a summary of data in the form of
mean, median and mode.
INFERENTIAL STATISTICS - use a random
sample of data taken from a population to
describe and make inferences about the whole
population.
DESCRIPTIVE STATISTICS - the first level of analysis. Commonly used descriptive statistics are the measures of central tendency (mean, mode, median) and the measures of variability (interquartile range, variance, standard deviation, coefficient of variation).
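The measures of central tendency and variability named above are available in Python's standard library (the scores are illustrative):

```python
import statistics

scores = [70, 80, 80, 90, 100]      # illustrative test scores
mean = statistics.mean(scores)      # 84
median = statistics.median(scores)  # 80
mode = statistics.mode(scores)      # 80
stdev = statistics.pstdev(scores)   # population standard deviation
```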
INFERENTIAL STATISTICS- show the
relationship between several different variables.
A few types of inferential analysis are:
Correlation - this describes the relationship between two variables. If a correlation is found, it means that there is a relationship between the variables. For example, taller people tend to have a higher weight; hence, height and weight are correlated with each other.
Regression - this shows the relationship between two variables. For example, regression can help us predict someone's weight based on their height.
Analysis of variance: statistical procedure used
to test the degree to which two or more groups
vary or differ in an experiment.
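The correlation and regression ideas above can be sketched in plain Python (the height/weight numbers are made up for illustration):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two variables."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def linear_regression(xs, ys):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

heights = [150, 160, 170, 180, 190]  # cm (illustrative)
weights = [52, 58, 66, 74, 80]       # kg (illustrative)
r = pearson_r(heights, weights)      # close to +1: strongly correlated
slope, intercept = linear_regression(heights, weights)
predicted = slope * 175 + intercept  # predicted weight for a 175 cm person
```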
INFERENTIAL STATISTICS - significant relationships are determined by rejecting the null hypothesis and accepting the alternative hypothesis.
The null hypothesis is rejected if:
the computed test statistic is greater than the critical value.
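The decision rule can be written as a one-line check (illustrative numbers; real critical values come from the relevant distribution table):

```python
def reject_null(test_statistic, critical_value):
    """Reject H0 when the computed statistic exceeds the critical value."""
    return test_statistic > critical_value

decision = reject_null(2.5, 1.96)  # True: reject the null hypothesis
```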
