
Unit 8

Measurement:
Scaling, Reliability, Validity

1
Scale
 Is a tool or mechanism by which
individuals are distinguished as to how
they differ from one another on the
variables of interest to our study.

3
Scales
 There are four basic types of scales:
1. Nominal Scale
2. Ordinal Scale
3. Interval Scale
4. Ratio Scale

3
Scales
 The degree of sophistication to which the scales are fine-tuned increases progressively as we move from the nominal to the ratio scale.
 The information on the variables can be obtained in greater detail with an interval or a ratio scale than with the other two scales.
4
Scales
 With more powerful scales,
increasingly sophisticated data
analyses can be performed, which in
turn, means that more meaningful
answers can be found to our research
questions.

5
Nominal Scale
 A nominal scale is one that allows the researcher to assign
subjects to certain categories or groups.

 What is your department?


O Marketing O Maintenance O Finance
O Production O Servicing O Personnel
O Sales O Public Relations O Accounting

 What is your gender?


O Male
O Female

7
Nominal Scale
 For example, for the variable gender, respondents can be grouped into two categories: male and female.

7
Nominal Scale
 The information that can be generated from nominal scaling is the percentage (or frequency) of males and females in our sample of respondents.

8
Example 1
 Nominally scale the nationality of
individuals in a group of tourists to a country
during a certain year.
 We could nominally scale this variable in the
following mutually exclusive and
collectively exhaustive categories.
American Japanese
Russian Malaysian
Chinese German
Arabian Other
9
Example 1
 Note that every respondent has to fit
into one of the above categories and
that the scale will allow computation of
the numbers and percentages of
respondents that fit into them.

10
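As an illustration of that computation, here is a minimal sketch assuming a small, hypothetical list of responses rather than real survey data:

```python
from collections import Counter

# Hypothetical nominal-scale responses (nationality of each tourist)
responses = ["American", "Japanese", "American", "German", "Malaysian",
             "Chinese", "American", "Russian", "Other", "Japanese"]

counts = Counter(responses)   # frequency of each category
total = len(responses)

for nationality, freq in counts.most_common():
    print(f"{nationality:<10} n = {freq:<3} ({100 * freq / total:.1f}%)")
```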
Ordinal Scale
 Ordinal scale: not only categorizes variables in such a way as to denote differences among the various categories, it also rank-orders the categories in some meaningful way.

 The preferences would be ranked (from best to worst, or from first to last) and numbered as I, II, III, and so on.

11
Example 2
 Rank the following five characteristics
in a job in terms of how important they
are for you.
You should rank the most important
item as 1st, the next in importance as
2nd, and so on, until you have ranked
each of them 1st, 2nd, 3rd , 4th or 5th.

12
Example 2 (Cont.)
 Job Characteristic Ranking
The opportunity provided by the job to:
1. Interact with others _____
2. Use different skills _____
3. Complete a task to the end _____
4. Serve others _____
5. Work independently _____

13
Example 2 (Cont.)
 This scale helps the researcher to
determine the percentage of respondents
who consider interaction with others as
most important, those who consider using a
number of skills as most important, and so
on.
 Such knowledge might help in designing
jobs that would be seen as most enriched
by the majority of the employees. 14
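A minimal sketch of how that percentage might be tallied, assuming hypothetical rank data (each row is one respondent's ranking of the five characteristics, where 1 = most important):

```python
# Hypothetical rankings: position i holds the rank given to characteristic i,
# in the order: interact, skills, complete, serve, independent.
rankings = [
    [1, 2, 3, 4, 5],
    [2, 1, 3, 5, 4],
    [1, 3, 2, 4, 5],
    [3, 1, 2, 5, 4],
]
characteristics = ["Interact with others", "Use different skills",
                   "Complete a task", "Serve others", "Work independently"]

n = len(rankings)
for i, name in enumerate(characteristics):
    ranked_first = sum(1 for r in rankings if r[i] == 1)
    print(f"{name:<22} ranked 1st by {100 * ranked_first / n:.0f}% of respondents")
```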
Example 2 (Cont.)

 We can see that the ordinal scale provides more information than the nominal scale. Even though differences in the ranking of objects or persons are clearly known, we do not know their magnitude.
 This deficiency is overcome by interval scaling.
15
Interval Scale

 Interval scale: Whereas the nominal


scale allows us only to qualitatively
distinguish groups by categorizing them
into mutually exclusive and collectively
exhaustive sets, and the ordinal scale to
rank-order the preferences, the interval
scale lets us measure the distance
between any two points on the scale.
16
Example 3a
 Indicate the extent to which you agree
with the following statements as they
relate to your job, by circling the
appropriate number against each,
using the scale given below.
Strongly Disagree = 1, Disagree = 2, Neither Agree Nor Disagree = 3, Agree = 4, Strongly Agree = 5.

17
Example 3a (Cont.)
 The following opportunities offered by
the job are very important to me (scale
on the basis of importance):
Interacting with others 1 2 3 4 5
Using a number of 1 2 3 4 5
different skills
Completing a task from 1 2 3 4 5
beginning to end
Serving others 1 2 3 4 5
Working independently 1 2 3 4 5

18
Example 3a (Cont.)
 Suppose that the employees circle the
numbers 3, 1, 2, 4, and 5 for the five items.
 The magnitude of difference represented
by the space between points 1 and 2 on the
scale is the same as the magnitude of
difference represented by the space between
points 4 and 5, or between any other two
points. Any number can be added to or
subtracted from the numbers on the scale,
still retaining the magnitude of the difference.
19
Example 3a (Cont.)
 If we add 6 to the five points on the scale, the interval scale will have the numbers 7, 8, …, 11 (instead of 1 to 5).
 The magnitude of the difference
between 7 and 8 is still the same as
the magnitude of the difference
between 9 and 10. It has an arbitrary
origin. 20
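The arbitrary-origin property can be illustrated with a small sketch: shifting every interval-scale score by a constant leaves the differences between points unchanged.

```python
scores = [1, 2, 3, 4, 5]           # original interval-scale points
shifted = [s + 6 for s in scores]  # same scale with the origin moved (7..11)

diffs_original = [b - a for a, b in zip(scores, scores[1:])]
diffs_shifted = [b - a for a, b in zip(shifted, shifted[1:])]

print(diffs_original)  # [1, 1, 1, 1]
print(diffs_shifted)   # [1, 1, 1, 1] -- magnitudes of difference are preserved
```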
Example 3b

 Circle the number that represents your feelings at this particular


moment best. There are no right or wrong answers. Please answer
every question.

1. I invest more in my work than I get out of it

I disagree completely 1 2 3 4 5 I agree completely

2. I exert myself too much considering what I get back in return

I disagree completely 1 2 3 4 5 I agree completely

3. For the efforts I put into the organization, I get much in return

I disagree completely 1 2 3 4 5 I agree completely

21
Ratio Scale
 Ratio scale: overcomes the
disadvantage of the arbitrary origin
point of the interval scale, in that it has
an absolute (in contrast to an arbitrary)
zero point, which is a meaningful
measurement point.

 What is your age?


26
Ratio Scale

 The ratio scale is the most


powerful of the four scales because
it has a unique zero origin (not
an arbitrary origin).
 The differences between scales are
summarized in the next Figure.

23
The differences between
scales

24
Properties of the Four Scales
Developing Scales
 The four types of scales that can be used
to measure the operationally defined
dimensions and elements of a variable are:
Nominal, Ordinal, Interval, and Ratio
scales.
 It is necessary to examine the methods of
scaling (assigning numbers or symbols) to
elicit the attitudinal responses of subjects
toward objects, events, or persons.
26
Developing Scales
 Categories of attitudinal scales:
(not to be confused with the four
different types of scales)
 The Rating Scales
 The Ranking Scales

27
Developing Scales
 Rating scales have several response
categories and are used to elicit
responses with regard to the object,
event, or person studied.
 Ranking scales make comparisons between or among objects, events, or persons, and elicit the preferred choices and ranking among them.
28
Rating Scales
 The following rating scales are often
used in organizational research.
1. Dichotomous scale
2. Category scale
3. Likert scale
4. Numerical scale

29
Rating Scales
5. Semantic differential scale
6. Itemized rating scale
7. Fixed or constant sum rating scale
8. Stapel scale
9. Graphic rating scale
10. Consensus scale

30
Dichotomous Scale
 Is used to elicit a Yes or No answer.
(Note that a nominal scale is used to
elicit the response)
 Example 4
Do you own a car? Yes No

31
Category Scale
 It uses multiple items to elicit a single
response.
 Example 5
In which state of India do you reside?
Andhra Pradesh
U.P.
Maharashtra
Goa
Other

32
Likert Scale
 Is designed to examine how strongly
subjects agree or disagree with
statements on a 5-point scale as
following:
Strongly Disagree = 1, Disagree = 2, Neither Agree Nor Disagree = 3, Agree = 4, Strongly Agree = 5.

33
Likert Scale

 This is an Interval scale and the


differences in responses between any
two points on the scale remain the
same.

34
Semantic Differential Scale
 We use this scale when several bipolar attributes are identified at the extremes of the scale. For instance, the scale would employ such terms as:
Good – Bad
Strong – Weak
Hot – Cold

35
Semantic Differential Scale
 This scale is treated as an Interval
scale.
 Example 6
What is your opinion on your supervisor?
Responsive--------------Unresponsive
Beautiful-----------------Ugly
Courageous-------------Timid

36
Numerical Scale
 Is similar to the semantic differential scale, with the difference that numbers on a 5-point or 7-point scale are provided, as illustrated in the following example:
How pleased are you with your new job?
Extremely pleased   5   4   3   2   1   Extremely displeased

37
Itemized Rating Scale
 A 5-point or 7-point scale is provided for each item and the respondent indicates the appropriate number beside each item. This uses an interval scale.
 Example 7(i)
Respond to each item using the scale below, and indicate your
response number on the line by each item.
1 = Very unlikely, 2 = Unlikely, 3 = Neither unlikely nor likely, 4 = Likely, 5 = Very likely
--------------------------------------------------------------------------------
I will be changing my job in the near future. --------

38
Itemized Rating Scale
 Note that the above is a balanced rating scale with a neutral point.
 An unbalanced rating scale, which does not have a neutral point, is presented in the following example.

39
Itemized Rating Scale
 Example 7(ii)
Circle the number that is closest to how you
feel for the item below:
1 = Not at all interested, 2 = Somewhat interested, 3 = Moderately interested, 4 = Very much interested
--------------------------------------------------------------------------------
How would you rate your interest in changing current organizational policies?   1  2  3  4

40
Fixed or Constant Sum Scale
 The respondents are asked to distribute a
given number of points across various items.

Example: In choosing a bath soap, indicate the importance you attach to each of the following aspects by allotting points for each to total 100 in all.
Fragrance -----
Color -----
Shape -----
Size -----
_________
Total points 100
This is more in the nature of an ordinal scale.

41
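A minimal sketch of how a constant sum response might be checked and interpreted, assuming hypothetical point allocations:

```python
# Hypothetical allocation of 100 points across the soap attributes
allocation = {"Fragrance": 40, "Color": 10, "Shape": 20, "Size": 30}

total = sum(allocation.values())
if total != 100:
    print(f"Invalid response: points total {total}, not 100")
else:
    # The result is ordinal in nature: it tells us the relative importance
    ranked = sorted(allocation, key=allocation.get, reverse=True)
    print("Importance order:", " > ".join(ranked))
```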
Stapel Scale
 The Stapel scale is named after its developer, Jan Stapel.
 This scale simultaneously measures both the direction and the intensity of the attitude toward the items under study. The characteristic of interest to the study is placed at the center, and a numerical scale ranging, say, from +3 to −3, is placed on either side of the item, as illustrated in the following example:
42
Example 8: Stapel Scale
 State how you would rate your supervisor’s abilities with respect
to each of the characteristics mentioned below, by circling the
appropriate number.
+3 +3 +3
+2 +2 +2
+1 +1 +1
Adopting modern Product Interpersonal
Technology Innovation Skills
-1 -1 -1
-2 -2 -2
-3 -3 -3

43
Graphic Rating Scale
 A graphical representation helps the
respondents to indicate on this scale
their answers to a particular question by
placing a mark at the appropriate point
on the line, as in the following example:

44
Graphic Rating Scale
 Example 9
 On a scale of 1 to 10, how would you
rate your supervisor?
10 ______________________________ 1

45
Ranking Scales
 Ranking scales are used to tap preferences between two objects or among more than two objects or items (they are ordinal in nature). However, such ranking may not give definitive clues to some of the answers sought.

46
Ranking Scales
 Example 10
There are 4 different brands of soft drinks produced by a company, and the manager seeks information that would help decide which brand is most preferred.
Assume:
35% of respondents choose the 1st brand.
25% of respondents choose the 2nd brand.
20% of respondents choose the 3rd brand.
20% of respondents choose the 4th brand.
100%
47
Ranking Scales
 The manager cannot conclude that the first
brand is the most preferred one. Why?
 Because 65% of respondents did not choose
that product. We have to use alternative
methods like Paired Comparisons, Forced
Choice and the Comparative Scale.

48
Paired Comparison
 It is used when, among a small number of objects, respondents are asked to choose between two objects at a time.
 It helps to assess preferences.
 In the previous example, only if the 1st brand is preferred over the 2nd, 3rd and 4th can the manager establish the preference.
 But as the number of objects increases, so does the number of paired comparisons.
 The number of pairs for n objects is n(n−1)/2.
 Hence, paired comparison is preferred when n is small.
49
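The growth in the number of paired comparisons can be seen with a one-line calculation:

```python
def number_of_pairs(n: int) -> int:
    """Number of distinct paired comparisons among n objects: n(n-1)/2."""
    return n * (n - 1) // 2

for n in (4, 10, 20):
    print(f"{n} objects -> {number_of_pairs(n)} paired comparisons")
# 4 objects -> 6, 10 objects -> 45, 20 objects -> 190
```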
Forced Choice
 The forced choice enables respondents to rank objects relative to one another, among the alternatives provided. This is easier for the respondents, particularly if the number of choices to be ranked is limited.

50
Forced Choice
 Example 11
Rank the following newspapers that you
would like to subscribe to in the order of
preference, assigning 1 for the most preferred
choice and 5 for the least preferred.
• الدستور --------
• الرأي --------
• أخبار اليوم --------
• الغد --------
• شيحان --------
51
Comparative Scale
 Provides a benchmark or a point of reference
to assess attitudes towards the current
object, event or situation.
 Example

In a volatile financial environment, compared to stocks, how useful is it to invest in Treasury bonds?
More Useful          About the Same          Less Useful
     1          2           3           4          5

52
Goodness of Measures

 It is important to make sure that the instrument we develop to measure a particular concept is accurately measuring the variable, and that we are actually measuring the concept that we set out to measure.

53
Goodness of Measures
 We need to assess the goodness of
the measures developed. That is, we
need to be reasonably sure that the
instruments we use in our research do
indeed measure the variables they
are supposed to, and that they
measure them accurately.

54
Goodness of Measures
 How can we ensure that the measures
developed are reasonably good?
 First an item analysis of the
responses to the questions tapping the
variable is done.
 Then the reliability and validity of
the measures are established.

56
Item Analysis
 Item analysis is done to see if the items in
the instrument belong there or not. Each item
is examined for its ability to discriminate
between those subjects whose total scores
are high, and those with low scores.
 In item analysis, the means between the
high-score group and the low-score group
are tested to detect significant differences
through the t-values.
57
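A minimal sketch of the t-test used in item analysis, assuming hypothetical scores on one item for respondents in the high total-score and low total-score groups (SciPy's `ttest_ind` is one common way to compute it):

```python
from scipy import stats

# Hypothetical scores on a single item for the high- and low-total-score groups
high_group = [5, 4, 5, 4, 5, 4, 5]
low_group = [2, 3, 2, 1, 3, 2, 2]

t_value, p_value = stats.ttest_ind(high_group, low_group)
print(f"t = {t_value:.2f}, p = {p_value:.4f}")
# A large, significant t-value suggests the item discriminates well
# between high and low scorers and is a candidate for retention.
```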
Item Analysis
 The items with a high t-value are then
included in the instrument. Thereafter,
tests for the reliability of the
instrument are done and the validity
of the measure is established.

58
Reliability
 The reliability of a measure indicates the extent to which it is without bias and hence ensures consistent measurement across time (stability) and across the various items in the instrument (internal consistency).

66
Stability
 Stability: ability of a measure to remain
the same over time, despite
uncontrollable testing conditions or the
state of the respondents themselves.
 Test-Retest Reliability: The reliability
coefficient obtained with a repetition of the
same measure on a second occasion.
 Parallel-Form Reliability: Responses on
two comparable sets of measures tapping the
same construct are highly correlated.
60
Test-Retest Reliability
 When a questionnaire containing some items
that are supposed to measure a concept is
administered to a set of respondents now, and
again to the same respondents, say several
weeks to 6 months later, then the correlation
between the scores obtained is called the test-
retest coefficient.
 The higher the coefficient is, the better the
test-retest reliability, and consequently, the
stability of the measure across time.

61
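A minimal sketch of a test-retest coefficient, assuming hypothetical scores from the same respondents on two occasions:

```python
from scipy import stats

time_1 = [12, 15, 9, 20, 17, 14, 11]   # scores at the first administration
time_2 = [13, 14, 10, 19, 18, 13, 12]  # same respondents, several weeks later

r, p = stats.pearsonr(time_1, time_2)
print(f"Test-retest reliability r = {r:.2f}")
# The closer r is to 1, the more stable the measure is across time.
```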
Parallel-Form Reliability
 When responses on two comparable
sets of measures tapping the same
construct are highly correlated, we have
parallel-form reliability.
 Both forms have similar items and the
same response format, the only changes
being the wording and the order or
sequence of the questions.
62
Parallel-Form Reliability
 What we try to establish in the parallel-
form is the error variability resulting from
wording and ordering of the questions.
 If two such comparable forms are highly
correlated (say, 0.8 and above), we may be
fairly certain that the measures are
reasonably reliable, with minimal error
variance caused by wording, ordering, or
other factors.

63
Internal Consistency
 Internal consistency of measures is indicative of the homogeneity of the items in the measure that tap the construct.
 Inter-item Consistency Reliability: This is a test of the consistency of respondents' answers to all the items in a measure. The most popular test of inter-item consistency reliability is Cronbach's coefficient alpha, which tests how closely related a set of items are as a group (a computational sketch follows this list). Cronbach's alpha ranges from 0 to 1, with higher values indicating greater internal consistency. Typically, values above 0.7 are considered acceptable for research purposes, values above 0.8 are considered good, and values above 0.9 are considered excellent.

 Split-Half Reliability: Split-half reliability reflects the correlations between two halves of an instrument.

72
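A minimal computational sketch of Cronbach's alpha, assuming a small hypothetical matrix of item scores (rows are respondents, columns are items of the same measure):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix."""
    k = items.shape[1]                           # number of items
    item_vars = items.var(axis=0, ddof=1).sum()  # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)    # variance of total scores
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Hypothetical 5-point responses to three satisfaction items
scores = np.array([[5, 4, 5],
                   [4, 4, 4],
                   [2, 3, 2],
                   [5, 5, 4],
                   [3, 3, 3]])
print(f"Cronbach's alpha = {cronbach_alpha(scores):.2f}")
```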
Internal Consistency: An Example

 Suppose you want to find out how satisfied your customers are with the level of customer service they receive at your call center. You send out a survey with three questions designed to measure overall satisfaction. The choices for each question are: Strongly agree / Agree / Neutral / Disagree / Strongly disagree.

 I was satisfied with my experience.


 I will probably recommend your company to others.
 If I write an online review, it would be positive.
65
If the survey has good internal consistency, respondents should answer each question the same way, i.e. three "agrees" or three "strongly disagrees". If different answers are given, this is a sign that your questions are poorly worded and are not reliably measuring customer satisfaction.
66
Validity
 Validity tests show how well an instrument
that is developed measures the particular
concept it is intended to measure.
 It is concerned with whether we measure the
right concept.
 Several types of validity tests are used to
test the goodness of measures: content
validity, criterion-related validity and
construct validity.
67
Content Validity
 Content validity ensures that the measure
includes an adequate and representative
set of items that tap the concept.
 The more the scale items represent the
domain of the concept being measured, the
greater the content validity.
 In other words, content validity is a
function of how well the dimensions and
elements of a concept have been
delineated.
68
Criterion-Related Validity

 Criterion-related validity is established when the measure differentiates individuals on a criterion (e.g., job performance) it is expected to predict. This can be done by establishing what is called concurrent validity or predictive validity.

 Concurrent validity is established when the scale discriminates individuals who are known to be different; that is, they should score differently on the instrument, as in the following example.

 It is particularly useful for selection and placement decisions in educational and employment settings, where the goal is to predict future performance based on current test scores.
69
Criterion-Related Validity
 Example 12
If a measure of work ethic is developed and
administered to a group of welfare recipients,
the scale should differentiate those who are
enthusiastic about accepting a job and glad of
an opportunity to be off welfare, from those
who would not want to work even when
offered a job.

70
Example (concurrent validity)
 An example of concurrent validity would be a study
that examines the relationship between scores on a
new depression screening test and scores on an
established depression screening test. The
researchers would administer both tests to a sample
of participants and then determine whether the
scores on the new test are highly correlated with
scores on the established test. If the scores on the
new test are highly correlated with scores on the
established test, then the new test would be
considered to have concurrent validity.
71
Example (predictive validity)

 An example of predictive validity would be a study that examines the relationship between scores on a college entrance exam and subsequent academic performance in college. The researchers would administer the college entrance exam to a sample of high school students and then track their academic performance over the next few years in college. If there is a strong positive correlation between the scores on the entrance exam and academic performance in college, then the entrance exam would be considered to have predictive validity.
72
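A minimal sketch of how either form of criterion-related validity might be quantified, assuming hypothetical test scores and later criterion values (here, entrance-exam scores and first-year GPA):

```python
from scipy import stats

exam_scores = [520, 610, 450, 700, 580, 640, 490]      # scores on the entrance exam
first_year_gpa = [2.8, 3.4, 2.5, 3.8, 3.1, 3.6, 2.7]   # later academic performance

r, p = stats.pearsonr(exam_scores, first_year_gpa)
print(f"Predictive validity correlation r = {r:.2f} (p = {p:.3f})")
# A strong positive correlation supports using the exam to predict performance.
```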
Construct Validity
 Construct Validity testifies to how well the results
obtained from the use of the measure fit the theories
around which the test is designed. This is assessed
through convergent and discriminant validity.
 Convergent validity is established when the scores
obtained with two different instruments measuring
the same concept are highly correlated.
 Discriminant validity is established when, based
on theory, two variables are predicted to be
uncorrelated, and the scores obtained by measuring
them are indeed empirically found to be so.

73
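A minimal sketch contrasting convergent and discriminant validity, assuming hypothetical scores from two instruments measuring the same concept and one measuring a theoretically unrelated concept:

```python
from scipy import stats

instrument_a = [10, 14, 9, 18, 13, 16]   # measure of the concept, instrument A
instrument_b = [11, 15, 10, 17, 12, 17]  # same concept, different instrument
unrelated = [3, 9, 12, 5, 14, 7]         # theoretically unrelated variable

r_convergent, _ = stats.pearsonr(instrument_a, instrument_b)
r_discriminant, _ = stats.pearsonr(instrument_a, unrelated)

print(f"Convergent validity (should be high):  r = {r_convergent:.2f}")
print(f"Discriminant validity (should be low): r = {r_discriminant:.2f}")
```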
Goodness of Measures
 Goodness of Measures is established
through the different kinds of validity and
reliability.
 The results of any research can only be as
good as the measures that tap the concepts
in the theoretical framework.
 Table 7.2 summarizes the kinds of validity
discussed in the lecture.

74
Validity

75
