Measures of Dispersion

The document discusses measures of dispersion used in descriptive statistics. It defines dispersion as how observations are spread out around the central tendency. Absolute measures of dispersion like range and standard deviation are expressed in the original data units, while relative measures like coefficient of variation allow comparison between distributions. The mean deviation is presented as the average of absolute deviations from the central value. Standard deviation is defined as the positive square root of the mean of squared deviations from the mean. Examples are provided to demonstrate calculating standard deviation for both ungrouped and grouped data distributions.


1

Lecture-4: Descriptive Statistics


[Measures of Dispersion]

Instructor:
Dr. Muhammad Umair Sohail
Institute of Business Management
Oct, 2021
2
Measures of Variability or Dispersion

 The dispersion of a distribution reveals how the observations are spread out or scattered on each side of the center.
 Measuring the dispersion, scatter, or variation of a distribution is as important as locating the central tendency.
 If the dispersion is small, it indicates high uniformity of the observations in
the distribution.
 Absence of dispersion in the data indicates perfect uniformity. This
situation arises when all observations in the distribution are identical.
 If this were the case, a description of any single observation would suffice.
3
Purpose of Measuring Dispersion

 A measure of dispersion serves two purposes.
 First, it is one of the most important quantities used to characterize a frequency distribution.
 Second, it affords a basis of comparison between two or more frequency distributions.

 The study of dispersion derives its importance from the fact that various distributions may have exactly the same average but substantial differences in their variability.
4
Measures of Dispersion

 Absolute measures of dispersion:
These are expressed in the same units as the original data and simply describe the variability of the data set.
 Range
 Percentile range
 Quartile deviation
 Mean deviation
 Variance and standard deviation

 Relative measures of dispersion:
These are not expressed in any units; each is a pure number, so these measures can be used to compare the variation of two or more series.
 Coefficient of variation
 Coefficient of mean deviation
 Coefficient of range
 Coefficient of quartile deviation
5
Mean Deviation

 The mean deviation is the average of the absolute deviations of individual observations from the central value of a series. The mean deviation about the mean is

MD_{\bar{x}} = \dfrac{\sum_{i=1}^{k} f_i \, |x_i - \bar{x}|}{n}
 k = number of classes
 xi = midpoint of the i-th class
 fi = frequency of the i-th class
 n = \sum f_i = total number of observations
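To make the grouped-data formula above concrete, here is a minimal Python sketch; the class midpoints and frequencies are hypothetical values chosen only for illustration.

```python
def mean_deviation_grouped(midpoints, freqs):
    """Mean deviation about the mean: sum(f * |x - xbar|) / n, with n = sum(f)."""
    n = sum(freqs)
    xbar = sum(f * x for x, f in zip(midpoints, freqs)) / n
    return sum(f * abs(x - xbar) for x, f in zip(midpoints, freqs)) / n

# Hypothetical class midpoints and frequencies, for illustration only
midpoints = [2, 4, 6, 8]
freqs = [3, 5, 4, 2]
print(mean_deviation_grouped(midpoints, freqs))
```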
6
Merits of Mean Deviation

 Merits
 It is simple to understand.
 It is easy to calculate.
 It is based on all the observations of a series.
 It shows the dispersion, or scatter, of the various items of a series from its central value.
 It is not very much affected by the values of extreme items of a series.
 It facilitates comparison between different items of a series.
 It truly represents the average of deviations of the items of a series.
 It has practical usefulness in the field of business and commerce.
7
Demerits of Mean Deviation

 It is not rigidly defined, in the sense that it may be computed from any central value (mean, median, mode, etc.) and can therefore produce different results.
 It violates the algebraic principle by ignoring the + and – signs while calculating
the deviations of the different items from the central value of a series.
 It is not capable of further algebraic treatment.
 It is much affected by fluctuations in sampling.
 It is difficult to calculate when the actual value of the average comes out as a fraction or a recurring figure, because such cases require the shortcut method, which involves a cumbersome formula subject to adjustment in different cases.
 It is not suitable for sociological study.
8
Standard Deviation

 Standard deviation is the positive square root of the mean of the squared deviations of the observations from their arithmetic mean.
Population: \sigma = \sqrt{\dfrac{\sum_i (x_i - \mu)^2}{N}}

Sample: s = \sqrt{\dfrac{\sum_i (x_i - \bar{x})^2}{n - 1}}

SD = \sqrt{\text{variance}}
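As a quick illustration of the two formulas above, here is a minimal Python sketch; the data are the family sizes from Example-1 later in the deck, and the only difference between the two functions is the divisor (N versus n − 1).

```python
import math

def population_sd(values):
    """Population standard deviation: divide the sum of squared deviations by N."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)

def sample_sd(values):
    """Sample standard deviation: divide the sum of squared deviations by n - 1."""
    n = len(values)
    xbar = sum(values) / n
    return math.sqrt(sum((x - xbar) ** 2 for x in values) / (n - 1))

data = [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]            # family sizes from Example-1
print(population_sd(data), sample_sd(data))      # about 1.414 and 1.491 for this data
```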
9
Standard Deviation for Grouped Data

 For grouped data the mean and standard deviation are

\bar{x} = \dfrac{\sum f_i x_i}{\sum f_i}, \qquad
s = \sqrt{\dfrac{\sum f_i (x_i - \bar{x})^2}{N}}, \qquad \text{where } N = \sum f_i

 Simplified formula:

s = \sqrt{\dfrac{\sum f x^2}{N} - \left(\dfrac{\sum f x}{N}\right)^2}
10
Merits of Standard Deviation

 It is based on every item of the distribution. It is also amenable to algebraic treatment and is less affected by fluctuations of sampling than most other measures of dispersion.
 It is possible to calculate the combined standard deviation of two or more groups.
This is not possible with any other measure.
 For comparing the variability of two or more distributions, the coefficient of variation is considered the most appropriate measure, and it is based on the mean and standard deviation.
 Standard deviation is most prominently used in further statistical work. For
example, in computing skewness, correlation, etc., use is made of standard
deviation. It plays a key role in sampling and provides a unit of measurement for the normal distribution.
11
Limitations:

(i) Compared to other measures, it is difficult to compute. However, this does not reduce the importance of the measure, because of the high degree of accuracy of the results it gives.

(ii) It gives more weight to extreme items and less to those near the mean, because the squares of large deviations are proportionately greater than the squares of comparatively small ones. For example, the deviations 2 and 8 are in the ratio 1:4, but their squares, 4 and 64, are in the ratio 1:16.
12
Example-1: Find Standard Deviation of Ungrouped Data

Family No.   1   2   3   4   5   6   7   8   9   10
Size (xi)    3   3   4   4   5   5   6   6   7   7
13
Example

Family No.   1   2   3   4   5   6   7   8   9   10   Total
xi           3   3   4   4   5   5   6   6   7   7    50
xi − x̄      −2  −2  −1  −1   0   0   1   1   2   2    0
(xi − x̄)²    4   4   1   1   0   0   1   1   4   4    20
xi²          9   9  16  16  25  25  36  36  49  49    270

\bar{x} = \dfrac{\sum x_i}{n} = \dfrac{50}{10} = 5, \qquad
s^2 = \dfrac{\sum (x_i - \bar{x})^2}{n - 1} = \dfrac{20}{9} \approx 2.22, \qquad
s = \sqrt{20/9} \approx 1.49
14
Example-2: Find Standard Deviation of Grouped Data

xi     fi    fi·xi   fi·xi²   xi − x̄   (xi − x̄)²   fi(xi − x̄)²
3      2     6       18       −3        9            18
5      3     15      75       −1        1            3
7      2     14      98        1        1            2
8      2     16      128       2        4            8
9      1     9       81        3        9            9
Total  10    60      400       –        –            40

\bar{x} = \dfrac{\sum f_i x_i}{\sum f_i} = \dfrac{60}{10} = 6, \qquad
s^2 = \dfrac{\sum f_i (x_i - \bar{x})^2}{n - 1} = \dfrac{40}{9} \approx 4.44
15
Coefficient of Variation

 The coefficient of variation is computed as the ratio of the standard deviation of a distribution to the mean of the same distribution.

CV = \dfrac{s_x}{\bar{x}}
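A small Python sketch of the ratio above, using the sample standard deviation and hypothetical height data.

```python
import statistics

def coefficient_of_variation(values):
    """CV = sample standard deviation / mean (undefined when the mean is zero)."""
    return statistics.stdev(values) / statistics.mean(values)

heights = [36, 38, 40, 42, 44]   # hypothetical heights in inches
print(coefficient_of_variation(heights))
```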
16
Advantages

 It is used to compare two different distributions, even when they have different units of measurement.
 It is a pure, unitless quantity.
17
Disadvantages

 There are some requirements that must be met in order for the
CV to be interpreted in the ways we have described.  The most
obvious problem arises when the mean of a variable is zero.  In
this case, the CV cannot be calculated. 
 If the mean of a variable is not zero, but the variable contains
both positive and negative values and the mean is close to
zero, then the CV can be misleading. 
18
Example-3: Comments on Children in a Community

        Height     Weight
Mean    40 inch    10 kg
SD      5 inch     2 kg
CV      0.125      0.20

 Since the coefficient of variation for weight is greater than that of height, we would
tend to conclude that weight has more variability than height in the population.
19
Measure of Shape

 Skewness
Absence of symmetry
Extreme values on one side of a distribution
 Kurtosis
Peakedness of a distribution
20
Skewness

[Figure: three curves illustrating a negatively skewed, a symmetric (not skewed), and a positively skewed distribution.]
21
Moments

Definition: A moment is a quantitative measure of the shape of a set of points.
1. The first moment is called the mean which describes the center of the
distribution.
2. The second moment is the variance which describes the spread of the
observations around the center.
3. Other moments describe other aspects of a distribution such as how the
distribution is skewed from its mean or peaked.
4. A moment designates the power to which deviations are raised before
averaging them.
22
Central (or Mean) Moments
In mean moments, the deviations are taken from the mean.

For ungrouped data:

r-th population moment about the mean: \mu_r = \dfrac{\sum_i (x_i - \mu)^r}{N}

r-th sample moment about the mean: m_r = \dfrac{\sum_i (x_i - \bar{x})^r}{n}
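A minimal Python sketch of the sample formula above (raise each deviation from the mean to the power r, then average over n); the data values are illustrative only.

```python
def central_moment(values, r):
    """r-th sample moment about the mean: m_r = sum((x - xbar)**r) / n."""
    n = len(values)
    xbar = sum(values) / n
    return sum((x - xbar) ** r for x in values) / n

data = [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]                     # illustrative values
print([central_moment(data, r) for r in (1, 2, 3, 4)])    # m1 is always 0
```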
23
Central (or Mean) Moments

Formula for grouped data:

r-th population moment about the mean: \mu_r = \dfrac{\sum f (x_i - \mu)^r}{\sum f}

r-th sample moment about the mean: m_r = \dfrac{\sum f (x_i - \bar{x})^r}{\sum f}
24
Central (or Mean) Moments

Example: Calculate the first four moments about the mean for the following set of examination marks:

X: 45, 32, 37, 46, 39, 36, 41, 48, 36

Solution:
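As an illustrative check (not the slide's worked solution), the four sample moments of these marks can be computed directly in Python.

```python
marks = [45, 32, 37, 46, 39, 36, 41, 48, 36]
n = len(marks)
xbar = sum(marks) / n                                    # arithmetic mean (works out to 40)
moments = [sum((x - xbar) ** r for x in marks) / n for r in (1, 2, 3, 4)]
print(xbar, moments)                                     # m1 = 0 by construction
```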
25
Central (or Mean) Moments

Example: Calculate the first four moments about the mean for the following frequency distribution:

Weights (grams)   Frequency (f)
65-84             9
85-104            10
105-124           17
125-144           10
145-164           5
165-184           4
185-204           5
Total             60

Solution:
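As an illustrative sketch (not the slide's worked solution), the grouped-data formula can be applied by representing each class by its midpoint; the midpoints 74.5, 94.5, ... are an assumption based on the class limits above.

```python
# Assumed class midpoints for 65-84, 85-104, ..., 185-204
midpoints = [74.5, 94.5, 114.5, 134.5, 154.5, 174.5, 194.5]
freqs = [9, 10, 17, 10, 5, 4, 5]

N = sum(freqs)                                            # total frequency (60)
xbar = sum(f * x for x, f in zip(midpoints, freqs)) / N   # grouped mean
moments = [sum(f * (x - xbar) ** r for x, f in zip(midpoints, freqs)) / N for r in (1, 2, 3, 4)]
print(xbar, moments)
```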
26
Moment Ratios

32 
1  3 ,  2  42
2 2

m32 m4
b1  3 , b2  2
m2 m2
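A self-contained Python sketch of the sample moment ratios b1 and b2, with illustrative data.

```python
def sample_moment(values, r):
    """r-th sample moment about the mean."""
    n = len(values)
    xbar = sum(values) / n
    return sum((x - xbar) ** r for x in values) / n

def moment_ratios(values):
    """Return (b1, b2): b1 = m3^2 / m2^3 relates to skewness, b2 = m4 / m2^2 to kurtosis."""
    m2, m3, m4 = (sample_moment(values, r) for r in (2, 3, 4))
    return m3 ** 2 / m2 ** 3, m4 / m2 ** 2

print(moment_ratios([3, 3, 4, 4, 5, 5, 6, 6, 7, 7]))   # symmetric data: b1 = 0
```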
27
Skewness

A distribution in which values equidistant from the mean have equal frequencies is called a symmetric distribution.
Any departure from symmetry is called skewness.

In a perfectly symmetric distribution, Mean = Median = Mode and the two tails of the distribution are equal in length from the mean. These values are pulled apart when the distribution departs from symmetry, and consequently one tail becomes longer than the other.

If the right tail is longer than the left tail, the distribution is said to have positive skewness. In this case, Mean > Median > Mode.

If the left tail is longer than the right tail, the distribution is said to have negative skewness. In this case, Mean < Median < Mode.
28
Skewness

When the distribution is symmetric, the value of skewness should be zero.

Karl Pearson defined the coefficient of skewness as:

Sk = \dfrac{\text{Mean} - \text{Mode}}{\text{SD}}

Since in some cases the mode does not exist, we use the empirical relation

\text{Mode} = 3\,\text{Median} - 2\,\text{Mean}

and can therefore write

Sk = \dfrac{3(\text{Mean} - \text{Median})}{\text{SD}}

(it ranges between −3 and +3)
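A brief Python sketch of the median-based form of the coefficient, using made-up data with a long right tail.

```python
import statistics

def pearson_skewness(values):
    """Pearson's coefficient of skewness: 3 * (mean - median) / standard deviation."""
    return 3 * (statistics.mean(values) - statistics.median(values)) / statistics.stdev(values)

print(pearson_skewness([2, 3, 3, 4, 4, 5, 12]))   # long right tail -> positive skewness
```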


29
Kurtosis

Karl Pearson introduced the term Kurtosis (literally the amount of hump) for
the degree of peakedness or flatness of a unimodal frequency curve.
When the peak of a curve becomes relatively high, that curve is called leptokurtic.

When the curve is flat-topped, it is called platykurtic.

Since the normal curve is neither very peaked nor very flat-topped, it is taken as the basis for comparison. The normal curve is called mesokurtic.
30
Kurtosis
\text{Kurt} = \beta_2 = \dfrac{\mu_4}{\mu_2^2}, \quad \text{for population data}

\text{Kurt} = b_2 = \dfrac{m_4}{m_2^2}, \quad \text{for sample data}
For a normal distribution, kurtosis is equal to 3.

When it is greater than 3, the curve is more sharply peaked and has narrower
tails than the normal curve and is said to be leptokurtic.

When it is less than 3, the curve has a flatter top and relatively wider tails
than the normal curve and is said to be platykurtic.
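A compact Python sketch of the sample formula and the classification rule above, with illustrative data.

```python
def kurtosis(values):
    """Sample kurtosis b2 = m4 / m2^2 (equals 3 for a normal distribution)."""
    n = len(values)
    xbar = sum(values) / n
    m2 = sum((x - xbar) ** 2 for x in values) / n
    m4 = sum((x - xbar) ** 4 for x in values) / n
    return m4 / m2 ** 2

def classify(b2):
    """Leptokurtic above 3, platykurtic below 3, mesokurtic at 3."""
    return "leptokurtic" if b2 > 3 else "platykurtic" if b2 < 3 else "mesokurtic"

b2 = kurtosis([3, 3, 4, 4, 5, 5, 6, 6, 7, 7])   # illustrative values
print(b2, classify(b2))                          # 1.7, platykurtic
```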
31
Kurtosis

Excess Kurtosis (EK) is defined as: EK = Kurtosis − 3
 For a normal distribution, EK=0.
 When EK>0, then the curve is said
to be Leptokurtic.
 When EK<0, then the curve is said
to be Platykurtic.
32
Describing a Frequency Distribution

To describe the major characteristics of a frequency distribution, we need to calculate the following five quantities:

 The total number of observations in the data.
 A measure of central tendency (e.g. mean, median, etc.) that provides information about the center or average value.
 A measure of dispersion (e.g. variance, SD etc.) that indicates the spread of
the data.
 A measure of skewness that shows lack of symmetry in frequency
distribution.
 A measure of kurtosis that gives information about its peakedness.
33
Describing a Frequency Distribution

It is interesting to note that all these quantities can be derived from the first four
moments.

For example,
 The first moment about zero is the arithmetic mean
 The second moment about mean is the variance.
 The third standardized moment is a measure of skewness.
 The fourth standardized moment is used to measure kurtosis.

Thus the first four moments play a key role in describing frequency distributions.
34
Skewness

[Figure: three distribution curves. Negatively skewed: Mean < Median < Mode. Symmetric (not skewed): Mean = Median = Mode. Positively skewed: Mode < Median < Mean.]
35
Peakedness of the Distribution

[Figure: leptokurtic, mesokurtic, and platykurtic frequency curves compared.]
36
Standard for kurtosis

1. For a normal distribution, kurtosis is equal to 3.

2. When it is greater than 3, the curve is more sharply peaked and has narrower tails than the normal curve and is said to be leptokurtic.

3. When it is less than 3, the curve has a flatter top and relatively
wider tails than the normal curve and is said to be platykurtic.
