Describing Data:
Measures of Variation or Dispersion
Chapter 4
McGraw-Hill/Irwin ©The McGraw-Hill Companies, Inc. 2008
GOALS
• Defining dispersion with a numerical example
• Identifying the significance of measuring dispersion
• Methods of measuring dispersion: computing and
interpreting the range, mean deviation, variance, and
standard deviation
• Understanding the characteristics, uses, advantages, and
disadvantages of each measure of dispersion
What is Dispersion?
A measure of location, such as the mean or the median, provides only
a single value to describe the entire data set. That single value
cannot adequately describe a set of observations; it is also
necessary to describe the variability, or spread, of the
observations. Consider the following example, in which three
factories pay the same average wage:
          Factory A       Factory B       Factory C
          (Wages, Tk.)    (Wages, Tk.)    (Wages, Tk.)
          1300            1310            1380
          1300            1300            1210
          1300            1304            1220
          1300            1306            1200
          1300            1280            1490
Total     6500            6500            6500
Average   1300            1300            1300
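The point of the table can be checked with a short Python sketch. It assumes only the wage figures shown above; the dictionary name `wages` and the `summary` layout are illustrative choices, not part of the original slides.

```python
from statistics import mean

# Wages (Tk.) copied from the table above.
wages = {
    "Factory A": [1300, 1300, 1300, 1300, 1300],
    "Factory B": [1310, 1300, 1304, 1306, 1280],
    "Factory C": [1380, 1210, 1220, 1200, 1490],
}

# For each factory: the average wage and the gap between highest and lowest wage.
summary = {
    name: {"mean": mean(values), "spread": max(values) - min(values)}
    for name, values in wages.items()
}

for name, stats in summary.items():
    print(f"{name}: mean = {stats['mean']}, highest - lowest = {stats['spread']}")
```

All three factories report the same mean of 1300, yet the gap between the highest and lowest wage is 0, 30, and 290 respectively, which is why the average alone is not enough.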
What is Dispersion?
Dispersion is an important characteristic of a frequency
distribution; it tells us how compactly the individual
values are distributed around the average. A measure
of dispersion (or variation) quantifies the extent to which
individual values deviate from the central value. It
therefore indicates how well the central value represents
the data: the smaller the dispersion, the more
representative the central value.
Samples of Dispersions
Significance of Measuring Dispersion
Measures of dispersion are needed for four basic purposes:
i. To determine the reliability of an average.
ii. To serve as a basis for the control of the variability.
iii. To compare two or more series with regard to their variability.
iv. To facilitate the use of other statistical measures.
Properties of a Good Measure of Dispersion
1. It should be easy to understand.
2. It should be simple to compute.
3. It should be rigidly defined.
4. It should be based on each and every item of the
distribution.
5. It should be amenable to further mathematical treatment.
6. It should not be unduly affected by the extreme values.
7. It should have sampling stability.
Methods of Measuring Dispersion
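Method of Dispersion   For Ungrouped Data                                   For Grouped Data
Range                  Highest value - Lowest value                         Upper limit of the highest class - Lower limit of the lowest class
Quartile Deviation     $Q.D. = \frac{Q_3 - Q_1}{2}$                         $Q.D. = \frac{Q_3 - Q_1}{2}$
Mean Deviation         $A.D. = \frac{\sum |X - \bar{X}|}{N}$                $A.D. = \frac{\sum f|X - \bar{X}|}{N}$
Standard Deviation     $\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$     $\sigma = \sqrt{\frac{\sum f(X - \bar{X})^2}{N}}$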
Range
Range is the simplest method of studying dispersion. It is
defined as the difference between the value of the smallest
item and the value of the largest item included in the
distribution. Symbolically:
Range = H - L
where H = highest observed value
      L = lowest observed value
For a frequency distribution, the range is computed as the
difference between the upper limit of the last class interval and
the lower limit of the first class interval.
EXAMPLE – Range
The numbers of cappuccinos sold at the Starbucks location in the
Orange County Airport between 4 and 7 p.m. for a sample of 5 days
last year were 20, 40, 50, 60, and 80. Determine the range for the
number of cappuccinos sold.
Range = Largest – Smallest value
= 80 – 20 = 60
Interpretation
The range can be computed easily and gives a rough estimate
of the spread of scores in any distribution. However, the range
depends on only two scores (the highest and lowest) and therefore is
subject to instability from one or two extreme scores.
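The calculation and the instability noted above can be sketched in a few lines of Python. The helper name `value_range` is an illustrative choice, and the extra value of 200 is a hypothetical outlier added only to show the effect of one extreme score.

```python
def value_range(values):
    """Return the highest observed value minus the lowest observed value (H - L)."""
    return max(values) - min(values)

# Cappuccinos sold on the 5 sampled days (from the example above).
cappuccinos = [20, 40, 50, 60, 80]
print(value_range(cappuccinos))  # 60

# One extreme day changes the range sharply; 200 is a hypothetical outlier.
print(value_range(cappuccinos + [200]))  # 180
```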
Uses of Range
Quality Control: The range is widely applied
to examine and control the quality of products in the
manufacturing process.
Fluctuations in the Price of Products: The range is
useful for studying the variation in prices, which
usually change from one time period to
another.
Weather Forecasts: The meteorological department
uses the range to report the difference between
the minimum temperature and the maximum temperature.
Quartile Deviation
Quartile deviation gives the average amount by which the two
quartiles differ from the median. The formula for calculating
quartile deviation is:
$Q.D. = \frac{Q_3 - Q_1}{2}$
When the quartile deviation is very small, it indicates
high uniformity (small variation) among the central 50% of
observations; a high quartile deviation means the variation
among the central observations is large.
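As a minimal sketch, the formula can be applied to the cappuccino data used elsewhere in this chapter. Quartile values depend on the interpolation convention; this example assumes the default "exclusive" method of Python's `statistics.quantiles`, which may differ from the convention used in a particular textbook.

```python
from statistics import quantiles

# Cappuccino data reused from the range example in this chapter.
data = [20, 40, 50, 60, 80]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
# The default method is "exclusive"; other conventions can give other quartiles.
q1, q2, q3 = quantiles(data, n=4)

quartile_deviation = (q3 - q1) / 2
print(q1, q3, quartile_deviation)  # 30.0 70.0 20.0
```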
Mean Deviation
Mean deviation is obtained by calculating the absolute
deviation of each observation from the mean and then
averaging these deviations by taking their arithmetic mean.
The formula for calculating mean deviation is
$A.D. = \frac{\sum |X - \bar{X}|}{N}$ (ungrouped data) or $A.D. = \frac{\sum f|X - \bar{X}|}{N}$ (grouped data)
Absolute deviations are used because what matters is the
size of each difference from the mean, not its direction;
signed deviations from the mean would cancel and always sum to zero.
EXAMPLE – Mean Deviation from ungrouped data
The numbers of cappuccinos sold at the Starbucks location in the
Orange County Airport between 4 and 7 p.m. for a sample of 5
days last year were 20, 40, 50, 60, and 80. Determine the mean
deviation for the number of cappuccinos sold.
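One way to work this example is sketched below in Python, applying the ungrouped formula $A.D. = \frac{\sum |X - \bar{X}|}{N}$ directly. The function name `mean_deviation` is an illustrative choice.

```python
def mean_deviation(values):
    """Average absolute deviation of the values from their arithmetic mean."""
    n = len(values)
    average = sum(values) / n
    return sum(abs(x - average) for x in values) / n

cappuccinos = [20, 40, 50, 60, 80]

# Mean is 50; absolute deviations are 30, 10, 0, 10, 30, which sum to 80.
print(mean_deviation(cappuccinos))  # 16.0
```

Under this calculation, the number of cappuccinos sold deviates from the mean of 50 by 16 per day on average.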
EXAMPLE – Mean Deviation from grouped data
Consider the following frequency distribution:
Class Frequency
0 up to 5 2
5 up to 10 7
10 up to 15 12
15 up to 20 6
Calculate the mean deviation from the above classified data.
EXAMPLE – Mean Deviation from grouped data
Class         Frequency (f)   Mid Point (X)   fX       $|X - \bar{X}|$   $f|X - \bar{X}|$
0 up to 5     2               2.5             5        9.07              18.14
5 up to 10    7               7.5             52.5     4.07              28.49
10 up to 15   12              12.5            150      0.93              11.16
15 up to 20   6               17.5            105      5.93              35.58
Total         N = 27                          312.5                      93.37
Here $\bar{X} = \frac{\sum fX}{N} = \frac{312.5}{27} = 11.57$
$A.D. = \frac{\sum f|X - \bar{X}|}{N} = \frac{93.37}{27} = 3.46$
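The grouped calculation can be reproduced with a short Python sketch, using class midpoints as stand-ins for the individual observations. Variable names such as `classes` and `mean_dev` are illustrative.

```python
# (lower limit, upper limit, frequency) copied from the table above.
classes = [(0, 5, 2), (5, 10, 7), (10, 15, 12), (15, 20, 6)]

# Each class is represented by its midpoint.
midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
freqs = [f for _, _, f in classes]

n = sum(freqs)
grouped_mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
mean_dev = sum(f * abs(x - grouped_mean) for f, x in zip(freqs, midpoints)) / n

print(round(grouped_mean, 2), round(mean_dev, 2))  # 11.57 3.46
```

The code keeps the unrounded mean, so its intermediate total (about 93.33) differs slightly from the 93.37 obtained after rounding the mean to 11.57; the final answer still rounds to 3.46.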
Standard Deviation
The concept of standard deviation was introduced by Karl
Pearson in 1893. It is by far the most important and widely
used measure of variation. It is a
measure of how much ‘spread’ or ‘variability’ is present in a
sample. If all the numbers are very close to each other, the
standard deviation is close to zero. If the numbers are widely
dispersed, the standard deviation tends to be large.
Interpreting and Understanding Standard Deviation: The Empirical Rule
In a bell-shaped distribution, approximately:
• 68% of all values fall within 1 standard deviation of the mean.
• 95% of all values fall within 2 standard deviations of the mean.
• 99.7% of all values fall within 3 standard deviations of the mean.
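These percentages can be checked against an exact normal distribution, which is the idealized bell shape the rule refers to. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `share_within` is an illustrative choice.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

Real data that is only roughly bell-shaped will match these proportions only approximately.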
Population and Sample Standard Deviation:
Ungrouped Data
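A minimal sketch of the distinction, assuming the standard definitions: the population standard deviation divides the sum of squared deviations by N, while the sample standard deviation divides by n - 1. Python's `statistics` module provides both as `pstdev` and `stdev`; the cappuccino data is reused from the earlier examples in this chapter.

```python
from statistics import pstdev, stdev

cappuccinos = [20, 40, 50, 60, 80]

# Squared deviations from the mean of 50 sum to 2000.
population_sd = pstdev(cappuccinos)  # sqrt(2000 / 5)
sample_sd = stdev(cappuccinos)       # sqrt(2000 / 4)

print(population_sd)        # 20.0
print(round(sample_sd, 2))  # 22.36
```

The sample figure is always the larger of the two for the same data, and the gap shrinks as the number of observations grows.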
Population and Sample Variance and
Standard Deviation: Grouped Data
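Formula for Calculating Standard Deviation
For Ungrouped Data:
$\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$ or $\sigma = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2}$
For Grouped Data:
$\sigma = \sqrt{\frac{\sum f(X - \bar{X})^2}{N}}$ or $\sigma = \sqrt{\frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2}$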
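The two ungrouped forms are algebraically equivalent, which the following Python sketch illustrates on the cappuccino data from earlier in the chapter. The function names `sd_definitional` and `sd_shortcut` are illustrative labels, not standard terminology from the slides.

```python
import math

def sd_definitional(values):
    """sqrt( sum((X - mean)^2) / N )"""
    n = len(values)
    average = sum(values) / n
    return math.sqrt(sum((x - average) ** 2 for x in values) / n)

def sd_shortcut(values):
    """sqrt( sum(X^2)/N - (sum(X)/N)^2 )"""
    n = len(values)
    return math.sqrt(sum(x * x for x in values) / n - (sum(values) / n) ** 2)

cappuccinos = [20, 40, 50, 60, 80]
print(sd_definitional(cappuccinos))  # 20.0
print(sd_shortcut(cappuccinos))      # 20.0
```

The shortcut form avoids computing each deviation separately, which is convenient for hand calculation; in floating-point code the definitional form is usually the numerically safer choice when values are large relative to their spread.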
Which Measure of Dispersion is the Best and Why:
Standard deviation is generally regarded as the best measure of
variation because of its mathematical characteristics. It is
calculated from the original data, is rigidly defined, and is based
on every item of the distribution. It is also amenable
to further algebraic treatment and is less affected by
fluctuations of sampling than most other measures of
dispersion. Most statistical theory is based on the
standard deviation. It helps in comparing the
variability of two or more sets of data and
in testing the significance of random samples.
Relative Measure of Dispersion
The dispersion of two distributions cannot be compared directly if
they are expressed in different units of measurement.
In that case, we need a relative measure of dispersion,
which is used to measure and compare the variation of data
in different series expressed in different units of
measurement. A measure of relative dispersion is the
ratio of a measure of absolute dispersion to an appropriate
average. It is sometimes called a coefficient of dispersion,
because a coefficient is a pure number that is
independent of the unit of measurement.
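A common relative measure of this kind is the coefficient of variation, the standard deviation divided by the mean and expressed as a percentage. The sketch below assumes that definition and applies it to the Factory B and Factory C wages from the start of the chapter; the function name is an illustrative choice.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Population standard deviation as a percentage of the mean (unit-free)."""
    return pstdev(values) / mean(values) * 100

# Wages (Tk.) from the opening example of this chapter.
factory_b = [1310, 1300, 1304, 1306, 1280]
factory_c = [1380, 1210, 1220, 1200, 1490]

cv_b = coefficient_of_variation(factory_b)
cv_c = coefficient_of_variation(factory_c)
print(round(cv_b, 2), round(cv_c, 2))  # 0.81 8.9
```

Because the result is a pure number, the same function could compare series measured in different units, such as wages in taka and output in kilograms.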
The Arithmetic Mean of Grouped Data -
Example
Recall that in Chapter 2 we constructed a frequency
distribution for the vehicle selling prices. The
information is repeated below. Determine the
arithmetic mean vehicle selling price and the
standard deviation of the selling prices.
Solution
Selling Prices ($ ‘000)   Frequency (f)   Mid Point (X)   fX      $fX^2$
15-18                     8               16.5            132     2178
18-21                     23              19.5            448.5   8745.75
21-24                     17              22.5            382.5   8606.25
24-27                     18              25.5            459     11704.5
27-30                     8               28.5            228     6498
30-33                     4               31.5            126     3969
33-36                     2               34.5            69      2380.5
Total                     N = 80                          1845    44082
Solution
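Standard Deviation
$\sigma = \sqrt{\frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2} = \sqrt{\frac{44082}{80} - \left(\frac{1845}{80}\right)^2} = 4.3756$ (in thousands of dollars)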
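The column totals and the final figure can be reproduced with a short Python sketch of the grouped shortcut formula. It assumes the class limits and frequencies from the table above; variable names are illustrative.

```python
import math

# (lower limit, upper limit, frequency), selling prices in $ '000.
classes = [
    (15, 18, 8), (18, 21, 23), (21, 24, 17), (24, 27, 18),
    (27, 30, 8), (30, 33, 4), (33, 36, 2),
]

# Each class is represented by its midpoint X.
rows = [((lower + upper) / 2, f) for lower, upper, f in classes]

n = sum(f for _, f in rows)
sum_fx = sum(f * x for x, f in rows)
sum_fx2 = sum(f * x * x for x, f in rows)

grouped_mean = sum_fx / n
sigma = math.sqrt(sum_fx2 / n - grouped_mean ** 2)

print(n, sum_fx, sum_fx2)                      # 80 1845.0 44082.0
print(round(grouped_mean, 4), round(sigma, 4)) # 23.0625 4.3756
```

The arithmetic mean selling price works out to about $23,063 and the standard deviation to about $4,376, both approximations because midpoints stand in for the individual prices.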
End of Chapter 4