VALIDITY AND
RELIABILITY
Validity
- is the ability of an instrument to
measure what it intends to measure.
Reliability
– refers to the consistency of
results.
METHODS OF ESTABLISHING
RELIABILITY
a. Test-retest or Stability test
measures test consistency:
the same test is given to the same
respondents twice.
In other words, you give the same
test twice to the same people at
different times to see if the scores
are the same.
For example, test on a Monday,
then again the following Monday.
The two scores are then correlated.
Test-Retest Reliability Coefficients
Also called coefficients of stability, these vary
between 0 and 1, where:
1 : perfect reliability,
≥ 0.9 : excellent reliability,
≥ 0.8 < 0.9 : good reliability,
≥ 0.7 < 0.8 : acceptable reliability,
≥ 0.6 < 0.7 : questionable reliability,
≥ 0.5 < 0.6 : poor reliability,
< 0.5 : unacceptable reliability,
0 : no reliability.
To measure the reliability between the two
test administrations, use the Pearson
Correlation Coefficient.
What is Pearson Correlation?
Correlation between sets of data is a
measure of how well they are related.
The most common measure of
correlation in statistics is the Pearson
Correlation.
The full name is the Pearson Product
Moment Correlation (PPMC).
SUBJECT   AGE (x)   GLUCOSE LEVEL (y)   xy      x²      y²
1         43        99                  4257    1849    9801
2         21        65                  1365    441     4225
3         25        79                  1975    625     6241
4         42        75                  3150    1764    5625
5         57        87                  4959    3249    7569
6         59        81                  4779    3481    6561
Σ         247       486                 20485   11409   40022
Step 6: Use the following correlation
coefficient formula:

r = [n(Σxy) − (Σx)(Σy)] / √{[n(Σx²) − (Σx)²][n(Σy²) − (Σy)²]}

For this data:
r = (6·20485 − 247·486) / √[(6·11409 − 247²)(6·40022 − 486²)]
  = 2868 / √(7445 · 3936)
  ≈ 0.53
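The arithmetic can be checked with a short script (a plain-Python sketch using only the six subjects from the table above):

```python
# Pearson correlation for the age/glucose data in the table above.
x = [43, 21, 25, 42, 57, 59]   # AGE (x)
y = [99, 65, 79, 75, 87, 81]   # GLUCOSE LEVEL (y)
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

# r = [n(Σxy) − (Σx)(Σy)] / √{[nΣx² − (Σx)²][nΣy² − (Σy)²]}
numerator = n * sum_xy - sum_x * sum_y
denominator = ((n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2)) ** 0.5
r = numerator / denominator
print(round(r, 4))  # → 0.5298
```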
The reliability of survey data may depend on the
following factors:
Respondents may not feel encouraged to
provide accurate, honest answers
Respondents may not feel comfortable
providing answers that present themselves in
an unfavorable manner.
Respondents may not be fully aware of their
reasons for any given answer, because of poor
memory of the subject or even boredom.
b. Internal Consistency
the degree of interrelationship or
homogeneity among the items on a
test, such that they are consistent
with one another and measure the
same thing.
Example:
You want to find out how satisfied your
customers are with the level of customer
service they receive at your call center.
You send out a survey with three
questions designed to measure overall
satisfaction.
Choices for each question are:
Strongly agree
Agree
Neutral
Disagree
Strongly disagree
1. I was satisfied with my experience.
2. I will probably recommend your
company to others.
3. If I write an online review, it would be
positive.
If the survey has good internal
consistency, respondents should
answer the same way for each question,
e.g. three “agrees” or three “strongly
disagrees.”
If different answers are given, this is
a sign that your questions are poorly
worded and are not reliably measuring
customer satisfaction.
There are three main techniques for
measuring internal consistency
reliability, depending upon the degree,
complexity and scope of the test.
1. Cronbach’s Alpha Test
2. Split half Test
3. Kuder-Richardson Test
Cronbach’s Alpha Test
most commonly used when you want to
assess the internal consistency of a
questionnaire (or survey) that is made
up of multiple Likert-type scales and
items.
Cronbach’s Alpha Formula

α = (k / (k − 1)) × (1 − Σsᵢ² / s_y²)

Where:
k = number of items
Σsᵢ² = the sum of the variances of each item
s_y² = the variance of the Total column (the total scores)
In the worked example (not shown here),
Cronbach’s Alpha turns out to be 0.773.
Values of Cronbach’s Alpha are usually
interpreted on the same scale as other
reliability coefficients:
≥ 0.9 : excellent,
≥ 0.8 < 0.9 : good,
≥ 0.7 < 0.8 : acceptable,
≥ 0.6 < 0.7 : questionable,
≥ 0.5 < 0.6 : poor,
< 0.5 : unacceptable.
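As a sketch, Cronbach’s Alpha can be computed by hand from an item-score matrix. The five respondents and three Likert items below are made-up illustration data (not the example that produced 0.773), and the sample-variance divisor (n − 1) is an assumption:

```python
# Cronbach's Alpha for a small made-up survey: 5 respondents x 3 items
# (Likert scores 1-5). Sample variance (n - 1 divisor) is used throughout.
scores = [
    [4, 4, 5],
    [3, 4, 3],
    [5, 5, 5],
    [2, 3, 2],
    [4, 5, 4],
]

def sample_var(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

k = len(scores[0])                          # number of items
items = list(zip(*scores))                  # columns = items
sum_item_vars = sum(sample_var(col) for col in items)
total_var = sample_var([sum(row) for row in scores])

alpha = (k / (k - 1)) * (1 - sum_item_vars / total_var)
print(round(alpha, 3))  # → 0.934
```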
Kuder-Richardson test
- Kuder-Richardson Formula 20, or KR-20,
is a measure of reliability for a test
with binary variables (i.e. answers that
are right or wrong). It should only be
used if there is a correct answer for each
question.
– used for dichotomous items
The scores for KR-20 range from 0 to 1,
where 0 is no reliability and 1 is perfect
reliability. The closer the score is to 1, the
more reliable the test.

KR-20 = (k / (k − 1)) × (1 − Σpq / s²)

Where:
k = number of items on the test
s² = variance of the total test scores
p = proportion of people passing the item
q = proportion of people failing the item (q = 1 − p)
Σ = sum up (add up)
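A minimal sketch of the KR-20 computation, using a made-up matrix of right/wrong answers (1 = correct); the sample-variance divisor (n − 1) is an assumption:

```python
# KR-20 for a made-up test: 5 examinees x 4 dichotomous items (1 = correct).
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
n = len(answers)                 # examinees
k = len(answers[0])              # items

# p = proportion passing each item, q = 1 - p
p = [sum(col) / n for col in zip(*answers)]
sum_pq = sum(pi * (1 - pi) for pi in p)

# s² = sample variance of the total test scores
totals = [sum(row) for row in answers]
mean = sum(totals) / n
s2 = sum((t - mean) ** 2 for t in totals) / (n - 1)

kr20 = (k / (k - 1)) * (1 - sum_pq / s2)
print(round(kr20, 3))  # → 0.907
```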
Split Half
Assesses the internal consistency of
the test: it measures the extent to which
all parts of the test contribute equally
to what is being measured.
It is commonly used for multiple-choice
tests.
A test for a single knowledge area is
split into two halves, and both halves are
given to one group of students at the
same time.
Since the reliability coefficient holds for only
half of the test, the reliability for the whole test
is calculated using the Spearman-Brown formula:

r_whole = 2 × r_half / (1 + r_half)

where r_half, the reliability of the half test, is the
correlation between the scores on the two halves.
A coefficient of 0 means no reliability
and 1.0 means perfect reliability.
Generally, if the reliability is above 0.80
the test is said to have good reliability;
if below 0.50 it would not be considered
a very reliable test.
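The split-half procedure can be sketched as follows; the data are made up, and the odd/even item split is one common (assumed) way of forming the two halves:

```python
# Split-half reliability: split items into odd/even halves, correlate the
# half scores with Pearson r, then step up with the Spearman-Brown formula.
scores = [  # 6 students x 4 items (1 = correct) - made-up data
    [1, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
odd  = [sum(row[0::2]) for row in scores]   # items 1 and 3
even = [sum(row[1::2]) for row in scores]   # items 2 and 4

def pearson(x, y):
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sx2, sy2 = sum(a * a for a in x), sum(b * b for b in y)
    return (n * sxy - sx * sy) / (((n * sx2 - sx**2) * (n * sy2 - sy**2)) ** 0.5)

r_half = pearson(odd, even)
r_whole = 2 * r_half / (1 + r_half)   # Spearman-Brown step-up
print(round(r_whole, 3))  # → 0.745
```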
Test-retest
Measuring a property that you
expect to stay the same over time.
Internal consistency
Using a multi-item test where all
the items are intended to measure
the same variable.
You devise a questionnaire to
measure the IQ of a group of
participants.
- Test-retest
A test of color blindness for trainee
pilot applicants should have high
test-retest reliability, because color
blindness is a trait that does not
change over time.
Example:
You want to find out how satisfied your
customers are with the level of customer
service they receive at your call center.
You send out a survey with three
questions designed to measure overall
satisfaction.
- Internal Consistency