Statistics &
Queuing Theory
Course No: MAT0541202
Topic 3: Measures of Dispersion
Tariq Bin Amir
Measures of Dispersion
The dispersion of a distribution reveals how the
observations are spread out or scattered on each side
of the center.
The measure of dispersion shows how the data is
spread or scattered around the mean.
Measuring the dispersion, scatter, or variation of a
distribution is as important as locating its central
tendency.
If the dispersion is small, it indicates high uniformity of the
observations in the distribution.
Absence of dispersion in the data indicates perfect
uniformity. This situation arises when all observations in
the distribution are identical.
Purpose of Measuring Dispersion
A measure of dispersion appears to serve two purposes.
First, it is one of the most important quantities used to
characterize a frequency distribution.
Second, it affords a basis of comparison between two
or more frequency distributions.
The study of dispersion derives its importance from the
fact that various distributions may have exactly the same
averages but substantial differences in their variability.
Range
Range is simply the difference between the largest and
smallest values in a set of data
Useful for: daily temperature fluctuations or share price
movement
The formula is:
Range = largest observation - smallest observation
Example 1: Find out the range of the given distribution:
1, 3, 5, 9, 11
The range is 11 – 1 = 10.
Example 2: Observations plotted on a number line running from 0 to 14,
with smallest observation 1 and largest observation 13:
Range = 13 - 1 = 12
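A minimal Python sketch of the range formula, applied to the data from Example 1. The function name `data_range` is an illustrative choice, not something defined in these notes.

```python
def data_range(values):
    """Return the largest observation minus the smallest observation."""
    if not values:
        raise ValueError("range is undefined for an empty data set")
    return max(values) - min(values)


example_1 = [1, 3, 5, 9, 11]
print(data_range(example_1))  # 11 - 1 = 10
```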
Why the Range can be Misleading
Ignores the way in which data are distributed
Two data sets plotted on the same axis (7 to 12) can be distributed very
differently and still have the same range:
Range = 12 - 7 = 5 in both cases
Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
Interquartile Range (IQR)
Quartiles split the ranked data into 4 segments with an
equal number of values per segment
25% | Q1 | 25% | Q2 | 25% | Q3 | 25%
The first quartile, Q1, is the value for which 25% of the
observations are smaller and 75% are larger
Q2 is the same as the median (50% of the observations
are smaller and 50% are larger)
Only 25% of the observations are greater than the third
quartile
Interquartile range measures the range of the middle
50% of the values only
Is defined as the difference between the upper and
lower quartiles
Interquartile range = upper quartile - lower quartile
= Q3 - Q1
The IQR is a measure of variability that is not influenced
by outliers or extreme values
Measures like Q1, Q3, and IQR that are not influenced
by outliers are called resistant measures
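To illustrate what "resistant" means, the sketch below compares the range and the IQR on the two data sets from the "Sensitive to outliers" example. Several quartile conventions exist and these notes do not say which one is intended, so the use of `statistics.quantiles` with `method="inclusive"` is an assumption; the helper name `iqr` is also just illustrative.

```python
import statistics


def iqr(values):
    """Interquartile range Q3 - Q1.

    Assumption: quartiles follow the 'inclusive' interpolation rule of
    statistics.quantiles; other conventions can give slightly different values.
    """
    q1, _q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


# The two data sets from the "Sensitive to outliers" example.
base = [1] * 11 + [2] * 8 + [3] * 4 + [4, 5]
with_outlier = [1] * 11 + [2] * 8 + [3] * 4 + [4, 120]

for name, data in [("base", base), ("with outlier", with_outlier)]:
    print(name, "range =", max(data) - min(data), "IQR =", iqr(data))
```

Replacing the largest value 5 by 120 changes the range from 4 to 119, while the IQR is the same for both data sets under this quartile rule.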
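Example (box plot summary):
minimum = 12, Q1 = 30, median (Q2) = 45, Q3 = 57, maximum = 70
Each of the four segments between these values contains 25% of the observations.
Interquartile range = Q3 - Q1 = 57 – 30 = 27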
Quartile Deviation
Mean Deviation
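Mean deviation about the mean for grouped data:
$$MD_{\bar{x}} = \frac{\sum_{i=1}^{k} f_i \, |x_i - \bar{x}|}{n}$$
where
k = number of classes
x_i = midpoint of the i-th class
f_i = frequency of the i-th class
n = total number of observations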
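The grouped-data mean deviation, the frequency-weighted average of the absolute deviations of the class midpoints from the mean, can be sketched as follows. The midpoints and frequencies are hypothetical values chosen only for illustration, and n is assumed to be the sum of the class frequencies.

```python
def grouped_mean_deviation(midpoints, frequencies):
    """Mean deviation about the mean: sum(f_i * |x_i - mean|) / n."""
    n = sum(frequencies)  # assumption: n is the total frequency
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / n
    return sum(f * abs(x - mean) for x, f in zip(midpoints, frequencies)) / n


# Hypothetical grouped data (not taken from the notes).
midpoints = [5, 15, 25]
frequencies = [2, 5, 3]

# mean = 160 / 10 = 16; weighted absolute deviations = 22 + 5 + 27 = 54
print(grouped_mean_deviation(midpoints, frequencies))  # 54 / 10 = 5.4
```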
Variance
Variance is a measure of how data points differ from the
mean
Example:
Data Set 1: 3, 5, 7, 10, 10
Data Set 2: 7, 7, 7, 7, 7
What are the mean and median of the above data sets?
Data Set 1: mean = 7, median = 7
Data Set 2: mean = 7, median = 7
But we know that the two data sets are not identical! The
variance shows how they are different.
We want a way to represent the difference between these two data sets
numerically.
Sample variance:
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
where
$\bar{X}$ = arithmetic mean
n = sample size
$X_i$ = ith value of the variable X
Sample variance with a frequency table:
$$s^2 = \frac{\sum f \, (x - \bar{x})^2}{n - 1}$$
where
$\bar{x}$ = arithmetic mean
n = sample size
x = value of the variable
f = frequency of that value
Population variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
where
μ = population mean
N = population size
$X_i$ = ith value of the variable X
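The sample and population formulas differ only in the divisor (n - 1 versus N). As a quick check, Python's standard `statistics` module provides both, shown here on Data Set 1 from the variance example; treating those five values first as a sample and then as a whole population is purely for illustration.

```python
import statistics

data_set_1 = [3, 5, 7, 10, 10]  # sum of squared deviations from the mean 7 is 38

sample_var = statistics.variance(data_set_1)       # divides by n - 1 = 4
population_var = statistics.pvariance(data_set_1)  # divides by N = 5

print(sample_var)      # 38 / 4 = 9.5
print(population_var)  # 38 / 5 = 7.6
```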
Calculate the Variance for Ungrouped Data
1. Find the Mean.
2. Calculate the difference between each score and the
mean.
3. Square the difference between each score and the
mean.
4. Add up all the squares of the difference between each
score and the mean.
5. Divide the obtained sum by n – 1.
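The five steps above can be sketched directly in Python. The function name `sample_variance` is illustrative, and the two inputs are Data Set 1 and Data Set 2 from the earlier variance example.

```python
def sample_variance(values):
    """Sample variance computed step by step, dividing by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    mean = sum(values) / n                    # step 1: find the mean
    deviations = [x - mean for x in values]   # step 2: score minus mean
    squared = [d ** 2 for d in deviations]    # step 3: square each difference
    total = sum(squared)                      # step 4: add up the squares
    return total / (n - 1)                    # step 5: divide by n - 1


data_set_1 = [3, 5, 7, 10, 10]
data_set_2 = [7, 7, 7, 7, 7]

print(sample_variance(data_set_1))  # (16 + 4 + 0 + 9 + 9) / 4 = 9.5
print(sample_variance(data_set_2))  # 0.0, all values are identical
```

Both data sets have mean 7 and median 7, but their variances (9.5 and 0) separate them.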
Calculate the Variance for Grouped Data
1. Calculate the mean.
2. Get the deviations by finding the difference of each
midpoint from the mean.
3. Square each deviation, multiply it by the class frequency, and sum the
results.
4. Substitute the sum into the frequency-table formula and divide by n – 1.
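A minimal sketch of the grouped-data procedure, in which each squared deviation is weighted by its class frequency before summing. The class midpoints and frequencies are hypothetical, and n is assumed to be the total frequency.

```python
def grouped_sample_variance(midpoints, frequencies):
    """Sample variance for grouped data: sum(f * (x - mean)**2) / (n - 1)."""
    n = sum(frequencies)  # assumption: sample size is the total frequency
    if n < 2:
        raise ValueError("need at least two observations")
    # Step 1: mean from class midpoints weighted by frequency.
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / n
    # Steps 2-3: deviation of each midpoint, squared and weighted by frequency.
    weighted_squares = sum(
        f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies)
    )
    # Step 4: substitute in the formula.
    return weighted_squares / (n - 1)


# Hypothetical grouped data (not taken from the notes).
midpoints = [5, 15, 25]
frequencies = [2, 5, 3]

# mean = 16; weighted squares = 2*121 + 5*1 + 3*81 = 490
print(grouped_sample_variance(midpoints, frequencies))  # 490 / 9, about 54.44
```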
Standard Deviation
Measures the variation of observations from the mean
The most common measure of dispersion
Takes into account every observation
Measures the ‘average deviation’ of observations from
the mean
Works with squares of residuals rather than absolute values,
which makes it easier to use in further calculations
Is the square root of the variance
Has the same units as the original data
Standard deviation of a sample, s
In practice, most populations are very large and it is
more common to calculate the sample standard
deviation.
Sample standard deviation:
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
where n is the number of observations in the sample
Standard deviation of a population, σ
Every observation in the population is used.
Population standard deviation:
$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$
where μ is the population mean and N is the population size
Characteristics of the Standard Deviation
The standard deviation is affected by the value of
every observation.
The process of squaring the deviations before adding
avoids the algebraic fallacy of disregarding the signs.
It has a definite mathematical meaning and is
perfectly adapted to algebraic treatment.
It is, in general, less affected by fluctuations of
sampling than the other measures of dispersion.
The standard deviation is the unit customarily used in
defining areas under the normal curve of error. It has,
thus, great practical utility in sampling and statistical
inference.
Steps for Calculating Standard Deviation
1. Calculate the difference between each value and the
mean.
2. Square each difference.
3. Add the squared differences.
4. Divide this total by n-1 to get the sample variance.
5. Take the square root of the sample variance to get
the sample standard deviation.
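These steps translate almost line for line into Python. The sketch below uses Data Set 1 from the variance section as input; the function name `sample_std_dev` is illustrative.

```python
import math


def sample_std_dev(values):
    """Sample standard deviation following the five steps above."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    squared_diffs = [(x - mean) ** 2 for x in values]  # steps 1 and 2
    sample_variance = sum(squared_diffs) / (n - 1)     # steps 3 and 4
    return math.sqrt(sample_variance)                  # step 5


data_set_1 = [3, 5, 7, 10, 10]
print(round(sample_std_dev(data_set_1), 2))  # sqrt(9.5), about 3.08
```

Because of the final square root, the result is in the same units as the original data.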
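Standard deviations for frequency distributions
If the data are given as a frequency distribution:
No. of units (x)    Frequency (f)
1                   85
2                   192
3                   123
Total               400
Calculate the standard deviation using:
$$s = \sqrt{\frac{\sum f \, (x - \bar{x})^2}{\sum f - 1}}$$
where $\sum f = 400$ is the number of observations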
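As a worked check, the frequency table above (1, 2, and 3 units with frequencies 85, 192, and 123) can be run through the frequency-weighted formula. This is a sketch that assumes the sample size is the total frequency, 400, and that the divisor is that total minus one.

```python
import math

units = [1, 2, 3]
frequencies = [85, 192, 123]

n = sum(frequencies)                                        # 400 observations
mean = sum(f * x for x, f in zip(units, frequencies)) / n   # 838 / 400 = 2.095
sum_sq = sum(f * (x - mean) ** 2 for x, f in zip(units, frequencies))
s = math.sqrt(sum_sq / (n - 1))                             # sample standard deviation

print(n, round(mean, 3), round(s, 3))
```

With these figures the sample standard deviation comes out at roughly 0.716 units.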
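Example: Find the Standard Deviation of Ungrouped Data
Here,
$$\bar{x} = \frac{\sum x_i}{n} = \frac{50}{10} = 5$$
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{20}{9} \approx 2.22$$
$$s = \sqrt{2.22} \approx 1.49$$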
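Example: Find the Standard Deviation of Grouped Data
$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{60}{10} = 6$$
$$s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n - 1} = \frac{40}{9} \approx 4.44$$
$$s = \sqrt{4.44} \approx 2.11$$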
Moments
A moment is a quantitative measure of the shape of a
set of points.
The first moment is called the mean which describes
the center of the distribution.
The second moment is the variance which describes
the spread of the observations around the center.
Other moments describe other aspects of a
distribution such as how the distribution is skewed
from its mean or peaked.
A moment designates the power to which deviations
are raised before averaging them.
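To make "the power to which deviations are raised before averaging" concrete, here is a small sketch of raw and central moments. These notes do not give explicit moment formulas, so the divisor n (rather than n - 1) and the function names are assumptions; the input is Data Set 1 from the variance section.

```python
def raw_moment(values, r):
    """r-th moment about the origin: the average of x**r."""
    return sum(x ** r for x in values) / len(values)


def central_moment(values, r):
    """r-th moment about the mean: the average of (x - mean)**r."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** r for x in values) / len(values)


data_set_1 = [3, 5, 7, 10, 10]

print(raw_moment(data_set_1, 1))      # first moment = mean = 7.0
print(central_moment(data_set_1, 2))  # second central moment = 38 / 5 = 7.6
```

With divisor n, the second central moment equals the population-style variance; the sample variance with divisor n - 1 is slightly larger.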
Skewness
A distribution in which values equidistant from the
mean have equal frequencies is called a symmetric
distribution.
Any departure from symmetry is called skewness.
In a perfectly symmetric distribution,
Mean=Median=Mode and the two tails of the
distribution are equal in length from the mean.
If the right tail is longer than the left tail, the distribution
is said to have positive skewness. In this case,
Mean > Median > Mode
If the left tail is longer than the right tail, the distribution
is said to have negative skewness. In this case,
Mean < Median < Mode
When the distribution is symmetric, the value of
skewness should be zero.
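Karl Pearson defined the coefficient of skewness as:
$$S_k = \frac{\text{Mean} - \text{Mode}}{SD}$$
Since in some cases the mode does not exist, we use the
empirical relation
$$\text{Mode} = 3\,\text{Median} - 2\,\text{Mean}$$
and write
$$S_k = \frac{3(\text{Mean} - \text{Median})}{SD}$$
(it ranges between -3 and +3)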
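Both versions of Pearson's coefficient, (Mean - Mode) / SD and 3(Mean - Median) / SD, can be sketched with the standard `statistics` module. The data sets are hypothetical, and using the sample standard deviation (`statistics.stdev`) for SD is an assumption, since the notes do not say which standard deviation is meant.

```python
import statistics


def pearson_skew_mode(values):
    """(Mean - Mode) / SD; only meaningful when the data have a single mode."""
    return (
        statistics.mean(values) - statistics.mode(values)
    ) / statistics.stdev(values)


def pearson_skew_median(values):
    """3 * (Mean - Median) / SD, based on the empirical mode relation."""
    return (
        3 * (statistics.mean(values) - statistics.median(values))
    ) / statistics.stdev(values)


# Hypothetical data: a longer right tail (mean 5, median 4, mode 4).
right_skewed = [2, 3, 3, 4, 4, 4, 5, 6, 9, 10]
# Hypothetical symmetric data (mean 3, median 3).
symmetric = [1, 2, 3, 4, 5]

print(pearson_skew_mode(right_skewed))    # positive
print(pearson_skew_median(right_skewed))  # positive
print(pearson_skew_median(symmetric))     # 0.0
```

The two coefficients generally differ in value, because the relation between mean, median, and mode is only empirical, but for this right-skewed sample both are positive.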
Kurtosis
Kurtosis is a measure of the "peakedness" of the
probability distribution of a real-valued random
variable. It is the standardized fourth central moment of
a distribution.
When the peak of a curve becomes
relatively high then that curve is
called Leptokurtic.
When the curve is flat-topped, then it
is called Platykurtic.
Since normal curve is neither very
peaked nor very flat topped, so it is
taken as a basis for comparison.
The normal curve is called
Mesokurtic.
$$\text{Kurt} = \beta_2 = \frac{\mu_4}{\mu_2^2}, \text{ for population data}$$
$$\text{Kurt} = b_2 = \frac{m_4}{m_2^2}, \text{ for sample data}$$
For a normal distribution, kurtosis is equal to 3.
When it is greater than 3, the curve is more sharply
peaked and has heavier tails than the normal curve
and is said to be leptokurtic.
When it is less than 3, the curve has a flatter top and
lighter tails than the normal curve and is said
to be platykurtic.
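A minimal sketch of the sample kurtosis, the fourth central moment divided by the square of the second, together with the three-way classification. The notes do not define the sample moments explicitly, so computing them with divisor n is an assumption; the function names and the data are illustrative.

```python
def central_moment(values, r):
    """r-th central moment with divisor n (assumed definition of m_r)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** r for x in values) / len(values)


def sample_kurtosis(values):
    """b2 = m4 / m2**2."""
    m2 = central_moment(values, 2)
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return central_moment(values, 4) / m2 ** 2


def classify(b2, tol=1e-9):
    """Compare b2 with 3, the kurtosis of the normal curve."""
    if b2 > 3 + tol:
        return "leptokurtic"
    if b2 < 3 - tol:
        return "platykurtic"
    return "mesokurtic"


data = [1, 2, 3, 4, 5]      # hypothetical, evenly spread values
b2 = sample_kurtosis(data)  # m2 = 2, m4 = 6.8, so b2 = 6.8 / 4 = 1.7
print(b2, classify(b2))
```

Evenly spread values like these give a kurtosis well below 3, so the sample is classified as platykurtic.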
Conclusion
The more the data are spread out, the greater the
range, variance, and standard deviation.
The less the data are spread out, the smaller the
range, variance, and standard deviation.
If the values are all the same (no variation), all these
measures will be zero.
None of these measures are ever negative.