Measures of Central Tendency
Measures of central tendency describe a set of data by identifying the central position in the data set as
a single representative value. We come across new data every day. We find them in newspapers,
articles, in our bank statements, mobile and electricity bills. Now the question arises whether we can
figure out some important features of the data by considering only certain representatives of the data.
In statistics, the three most common measures of central tendency are the mean, median, and mode.
Mean
The mean (often called the average) is most likely the measure of central tendency that we are most familiar
with. It is simply the sum of all the values in a group or
collection, divided by the number of values. It is denoted by x̅, pronounced “x bar”.
Mean = (Sum of the terms)/(Number of terms)
For example, let the weights of 8 boys in kilograms be 45, 39, 53, 45, 43, 48, 50, 45.
The average weight of the group is:
Average = Sum of the weights/Number of boys
= (45 + 39 + 53 + 45 + 43 + 48 + 50 + 45)/8
= 368/8 = 46
Thus, the average weight of the group is 46 kilograms.
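The same calculation can be checked with a few lines of Python. This is only a small sketch; the variable names are chosen here for illustration.

```python
# Weights of the 8 boys from the example above, in kilograms.
weights = [45, 39, 53, 45, 43, 48, 50, 45]

# Mean = sum of the terms / number of terms
mean_weight = sum(weights) / len(weights)
print(mean_weight)  # 46.0
```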
Let us now see how to calculate the mean for different types of data.
If a data set has n values x1, x2, …, xn, the sample mean x̅ (pronounced
"x bar") is: x̅ = (x1 + x2 + … + xn)/n
This formula is usually written more compactly using the Greek capital letter ∑, pronounced "sigma",
which means "sum of...": x̅ = ∑x/n
In statistics, a sample and a population are different things: a sample is a subset drawn from a population. To show that we are calculating
the population mean and not the sample mean, we use the Greek letter "mu", denoted μ: μ = ∑x/n
Example:
The heights of 5 people are 142 cm, 150 cm, 149 cm, 156 cm, and 153 cm. Find the mean height.
Mean height
x̅ = (142 + 150 + 149 + 156 + 153)/5
= 750/5 = 150 cm
Let there be n distinct items in a list: x1, x2, x3, …, xn.
Let the frequency of each item be f1, f2, f3, …, fn respectively.
The mean can be calculated using the formula:
x̅ = (f1x1 + f2x2 + f3x3 + … + fnxn)/(f1 + f2 + f3 + … + fn)
OR
x̅ = ∑fi xi/∑fi
Example: Find the mean of the following distribution:
x    4    6    9    10   15
f    5    10   10   7    8
Solution: Calculation table for the arithmetic mean:
xi       fi          xifi
4        5           20
6        10          60
9        10          90
10       7           70
15       8           120
Total    ∑fi = 40    ∑xifi = 360
Mean = x̅ = ∑xi fi/∑fi = 360/40 = 9
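The frequency-weighted mean can be sketched in Python as follows. The function name frequency_mean is hypothetical and chosen only for this illustration; it simply evaluates ∑fi xi/∑fi.

```python
def frequency_mean(values, freqs):
    """Return sum(f_i * x_i) / sum(f_i) for a frequency distribution."""
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    total_freq = sum(freqs)
    if total_freq == 0:
        raise ValueError("total frequency must be positive")
    return sum(x * f for x, f in zip(values, freqs)) / total_freq


# Distribution from the example above.
x = [4, 6, 9, 10, 15]
f = [5, 10, 10, 7, 8]
print(frequency_mean(x, f))  # 9.0
```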
When the items in a list are given as class intervals, for example 10-20, we first need to calculate the class
mark of each interval.
Class Mark = (Upper Limit + Lower Limit)/2
Then, the mean is calculated using the same formula as before, x̅ = ∑fi xi/∑fi, where xi is the class mark of each class and fi is its frequency.
Example: Here the data is in the form of class intervals. The following table gives the number of
patients visiting a hospital per day over a month. Find the average number of patients visiting the hospital in a day.
Number of patients visiting hospital    Number of days
0-10                                    2
10-20                                   6
20-30                                   9
30-40                                   7
40-50                                   4
50-60                                   2
Solution: First, we find the class mark (also called the mid-point of a class) for each class.
Class mark = (lower limit + upper limit)/2
Let x1, x2, x3, …, xn be the class marks of the respective classes.
Hence, we get the following table:
Class mark (xi)    Frequency (fi)    xifi
5                  2                 10
15                 6                 90
25                 9                 225
35                 7                 245
45                 4                 180
55                 2                 110
Total              ∑fi = 30          ∑fixi = 860
Mean = x̅ = ∑fi xi/∑fi
= 860/30 ≈ 28.67 patients per day
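As a rough check of the class-mark method, the table can be recomputed in Python. The list names below are assumptions made for this sketch only.

```python
# Class intervals (number of patients) and their frequencies (number of days).
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
days = [2, 6, 9, 7, 4, 2]

# Class mark = (lower limit + upper limit) / 2
class_marks = [(lower + upper) / 2 for lower, upper in classes]

# Mean = sum(f_i * x_i) / sum(f_i), with x_i the class marks.
grouped_mean = sum(m * f for m, f in zip(class_marks, days)) / sum(days)
print(class_marks)             # [5.0, 15.0, 25.0, 35.0, 45.0, 55.0]
print(round(grouped_mean, 2))  # 28.67
```

Note that the result is an estimate: using class marks assumes the observations in each class are centred on its mid-point.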
Mean Discrete Series
Calculate the mean for the following data:
Monthly Income (Rs.)    Number of Employees
5000                    8
6000                    7
7000                    6
8000                    9
9000                    10
Solution:
Monthly Income (X)    No. of Employees (f)    fX
5000                  8                       40000
6000                  7                       42000
7000                  6                       42000
8000                  9                       72000
9000                  10                      90000
Total                 40                      2,86,000
Mean X̅ = ∑fX/N = 2,86,000/40 = Rs. 7,150
where N = sum of the frequencies = ∑f = 40
Mean Discrete Series
Short Cut Method and Step Deviation Method
X̅ = A + (∑fd/N)*i
Here A is an assumed mean (any convenient value of X), i is a common factor of the deviations X - A, N = ∑f and d = (X - A)/i.
Taking A = 7000 and i = 1000:
X        f     d = (X-A)/i    fd
5000     8     -2             -16
6000     7     -1             -7
7000     6     0              0
8000     9     1              9
9000     10    2              20
Total    40                   6
X̅ = A + (∑fd/N)*i = 7000 + (6/40)*1000 = Rs. 7,150
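The step deviation method can be sketched in Python as below. The function name and its parameters are hypothetical; the point of the sketch is that the choice of assumed mean A does not change the answer, it only makes the hand arithmetic easier.

```python
def step_deviation_mean(values, freqs, assumed_mean, step):
    """Mean via A + (sum(f*d) / N) * i, where d = (x - A) / i."""
    n = sum(freqs)
    sum_fd = sum(f * (x - assumed_mean) / step for x, f in zip(values, freqs))
    # Multiply before dividing to keep the arithmetic exact for these inputs.
    return assumed_mean + sum_fd * step / n


incomes = [5000, 6000, 7000, 8000, 9000]
employees = [8, 7, 6, 9, 10]

print(step_deviation_mean(incomes, employees, assumed_mean=7000, step=1000))  # 7150.0
# A different assumed mean gives the same result.
print(step_deviation_mean(incomes, employees, assumed_mean=6000, step=1000))  # 7150.0
```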
Mean Continuous Series
Monthly Income      Number of Employees
5,000 - 10,000      6
10,000 - 15,000     7
15,000 - 20,000     9
20,000 - 25,000     4
25,000 - 30,000     5
Here m is the mid-point (class mark) of each class.
X                   f     m        fm
5,000 - 10,000      6     7500     45000
10,000 - 15,000     7     12500    87500
15,000 - 20,000     9     17500    157500
20,000 - 25,000     4     22500    90000
25,000 - 30,000     5     27500    137500
Total               31             517500
X̅ = ∑fm/N = 517500/31 = Rs. 16,693.55
Here N = ∑f = 31
Short Cut Method and Step Deviation Method
X̅ = A + (∑fd/N)*i
where N = ∑f, d = (m - A)/i, A is the assumed mean and i is the size of the class interval.
Taking A = 17500 and i = 5000:
X                   f     m        d = (m-A)/i    fd
5,000 - 10,000      6     7500     -2             -12
10,000 - 15,000     7     12500    -1             -7
15,000 - 20,000     9     17500    0              0
20,000 - 25,000     4     22500    1              4
25,000 - 30,000     5     27500    2              10
Total               31                            -5
X̅ = 17500 + (-5/31)*5000
= 17500 – (25000/31)
= 17500 – 806.45 = Rs. 16,693.55
When not to use the mean
The main disadvantage of the mean is that it is particularly sensitive to outliers.
These are values that are unusually large or small compared to the rest of the data. For example,
consider the salaries of staff at a factory:
Staff     1     2     3     4     5     6     7     8     9     10
Salary    15k   18k   16k   14k   15k   15k   12k   17k   90k   95k
The mean salary for these ten staff is $30.7k.
However, the raw data suggests that this mean value does not accurately reflect the typical salary of a
worker, because most workers have salaries in the $12k to $18k range.
Thus the mean is being skewed by the two large salaries, and we would like to have a better
measure of central tendency.
As we will see later, the median is a better measure of central tendency in this
situation.
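The effect of the two large salaries can be seen by comparing the mean and the median of the same data. The sketch below uses Python's standard statistics module; the variable name salaries_k is an assumption for illustration.

```python
import statistics

# Salaries of the ten staff, in thousands of dollars.
salaries_k = [15, 18, 16, 14, 15, 15, 12, 17, 90, 95]

print(statistics.mean(salaries_k))    # 30.7
print(statistics.median(salaries_k))  # 15.5
```

The median of $15.5k sits inside the $12k to $18k range where most salaries lie, while the mean does not.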
Median
The value of the middle-most observation that is obtained after arranging the data in ascending order is
called the median of the data.
The advantage of using the median as a central tendency is that it is less affected by outliers and skewed
data.
To calculate the median, assume we have the data:
65 55 89 56 35 14 56 55 87 45 92
First, we rearrange data into ascending order:
14 35 45 55 55 56 56 65 87 89 92
The median mark is the middle mark: here, the 6th value, 56.
It is the middle mark because there are 5 scores before it and 5 scores after it.
This works very well when we have an odd number of scores, but what happens when we have an even number of
scores?
What if we had 10 scores?
We take the middle two scores and find their average.
65 55 89 56 35 14 56 55 87 45
Rearranging that data into ascending order:
14 35 45 55 55 56 56 65 87 89
We take the 5th and 6th score in our data set and average them. We get a median of 55.5.
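Both cases (odd and even number of scores) can be handled by one small function. This is a minimal sketch; the function name median is chosen here for illustration and mirrors the manual procedure of sorting and picking the middle value(s).

```python
def median(data):
    """Median of a list: middle value, or average of the two middle values."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle observation.
        return ordered[mid]
    # Even count: average of the two middle observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]))  # 56
print(median([65, 55, 89, 56, 35, 14, 56, 55, 87, 45]))      # 55.5
```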
Let us see how to calculate median for different types of data.
Case 1: Ungrouped Data
Step 1: Arrange the data in ascending or descending order.
Step 2: Let the total number of observations be n.
To find the median, we need to consider if n is even or odd.
If n is odd, then use the formula:
Median = [(n+1)/2]th observation
Let's consider the data: 56, 67, 54, 34, 78, 43, 23. What is the median?
To find the median, arrange the data in ascending order: 23, 34, 43, 54, 56, 67, 78.
Here, n (number of observations) = 7
Median = [(7 + 1)/2]th observation = 4th observation. Median = 54.
If n is even, then use the formula:
Median = [(n/2)th obs. + ((n/2)+1)th obs.]/2
Case 2: Grouped Data
When the data is continuous and in the form of a frequency distribution, the median is found as shown
below:
Step 1: Find the median class.
Let n = total number of observations, i.e. ∑fi
Note: The median class is the class in which the (n/2)th observation lies, i.e. the first class whose cumulative frequency is at least n/2.
Step 2: Use the following formula to find the median.
Median = l + [(n/2 - c)/f] × h
where l = lower limit of the median class, c = cumulative frequency of the class preceding the median
class, f = frequency of the median class and h = class size
Median – Individual Observation – Even number of Observations
The following data gives the net worth (in billions of dollars) of the world's top 10 richest billionaires:
X 73 67 57 53.5 43 34 34 31 30 29
Calculate the median for the above data.
First, write the values in ascending order:
29, 30, 31, 34, 34, 43, 53.5, 57, 67, 73
Median = (N+1)/2 th observation
= (10+1)/2 th observation
= 5.5th observation
= (5th observation + 6th observation)/2
= (34+43)/2
= 38.5 billion
Hence the median net worth of the 10 billionaires is $38.5 billion.
It may be noted that 5 billionaires have net worth less than this value and 5 billionaires have net worth
more than this value.
Median – Individual Observation – Odd number of Observations
The following data gives the net worth (in billions of dollars) of the world's top 9 richest billionaires:
X 73 67 57 53.5 43 34 34 31 30
The values in ascending order are
30, 31, 34, 34, 43, 53.5, 57, 67 and 73
Median = (N+1)/2 th observation
= (9+1)/2 th observation
= 5th observation
= 43 billion
Hence the median net worth of the 9 billionaires is $43 billion.
It may be noted that 4 billionaires have net worth less than this value and 4 billionaires have net worth
more than this value.
Median – Discrete Series
Monthly Income (X)    No. of Employees (f)    Cumulative frequency (cf)
5000                  8                       8
6000                  7                       15 (8+7)
7000                  6                       21
8000                  9                       30
9000                  10                      40
Total                 40
Median = (N+1)/2 th observation
= (40+1)/2 th observation
= 20.5th observation
The 20th and 21st observations both fall in the group whose cumulative frequency is 21 (X = 7000), so Median = Rs. 7,000.
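For a discrete series, the median can be located with cumulative frequencies instead of listing every observation. The following is a sketch under the assumption that the values are already in ascending order; the function names are hypothetical.

```python
from itertools import accumulate


def discrete_median(values, freqs):
    """Median of a discrete frequency table (values sorted ascending)."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("total frequency must be positive")
    cum_freqs = list(accumulate(freqs))

    def kth_observation(k):
        # The k-th observation (1-based) is the first value whose
        # cumulative frequency reaches k.
        for x, cf in zip(values, cum_freqs):
            if cf >= k:
                return x

    if n % 2 == 1:
        return kth_observation((n + 1) // 2)
    return (kth_observation(n // 2) + kth_observation(n // 2 + 1)) / 2


incomes = [5000, 6000, 7000, 8000, 9000]
employees = [8, 7, 6, 9, 10]
print(discrete_median(incomes, employees))  # 7000.0
```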
Median – Continuous Series
Monthly Income      Number of Employees
5,000 - 10,000      6
10,000 - 15,000     7
15,000 - 20,000     9
20,000 - 25,000     4
25,000 - 30,000     5
X                   f     cf
5,000 - 10,000      6     6
10,000 - 15,000     7     13
15,000 - 20,000     9     22
20,000 - 25,000     4     26
25,000 - 30,000     5     31
Median = L + ((N/2 - cf)/f)*i
N/2 = ∑f/2 = 31/2 = 15.5, so the median class is 15,000 - 20,000 (the first class whose cumulative frequency, 22, reaches 15.5)
L = lower limit of the median class = 15,000 and f = frequency of the median class = 9
cf = cumulative frequency of the class preceding the median class = 13
i = size of the class interval = 5,000
Median = 15,000 + ((15.5-13)/9)*5,000
= 15,000 + 1,388.89 = Rs. 16,388.89
Hence the median income of the 31 employees is Rs. 16,388.89. It may be noted that about half of the employees have
income less than this value and about half have income more than this value.
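The grouped-data median formula can be sketched in Python as follows. The function name grouped_median and its parameters are assumptions made for this illustration, and equal class widths are assumed.

```python
def grouped_median(lower_limits, width, freqs):
    """Median = l + ((n/2 - c) / f) * h for equal-width classes."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("total frequency must be positive")
    half = n / 2
    cum_before = 0  # cumulative frequency of the classes before the current one
    for lower, f in zip(lower_limits, freqs):
        if cum_before + f >= half:
            # This is the median class: interpolate inside it.
            return lower + (half - cum_before) * width / f
        cum_before += f
    raise ValueError("median class not found")


lower_limits = [5000, 10000, 15000, 20000, 25000]
employees = [6, 7, 9, 4, 5]
print(round(grouped_median(lower_limits, 5000, employees), 2))  # 16388.89
```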
Mode
The value which appears most often in the given data i.e. the observation with
highest frequency is called mode of data.
Case 1: Ungrouped Data
For ungrouped data, we just need to identify the observation which occurs
maximum times.
Mode = Observation with maximum frequency
For example in the data: 6, 8, 9, 3, 4, 6, 7, 6, 3 the value 6 appears the most
number of times. Thus, mode = 6.
An easy way to remember mode is: Most Often Data Entered.
Depending upon the number of modes the data has, it can be called unimodal,
bimodal, trimodal, or multimodal. The example above has only one mode, so it
is unimodal. A data set may have no mode, one mode, or more than one mode.
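Counting frequencies is easy to sketch with collections.Counter from Python's standard library. The helper name modes is hypothetical. It returns every value tied for the highest count, so a bimodal data set yields two values; whether data in which every value occurs once should be reported as having "no mode" is a convention left to the caller.

```python
from collections import Counter


def modes(data):
    """Return all values that occur with the highest frequency, sorted."""
    counts = Counter(data)
    if not counts:
        return []
    top = max(counts.values())
    return sorted(x for x, c in counts.items() if c == top)


print(modes([6, 8, 9, 3, 4, 6, 7, 6, 3]))  # [6]       (unimodal)
print(modes([1, 1, 2, 2, 3]))              # [1, 2]    (bimodal)
```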
Case 2: Grouped Data
When the data is continuous, the mode can be found using the following steps:
Step 1: Find modal class i.e. the class with maximum frequency.
Step 2: Find mode using the following formula:
Mode = l + [(fm-f1)/ (2fm-f1-f2)] × h
l = lower limit of modal class,
fm = frequency of modal class,
f1= frequency of class preceding modal class,
f2 = frequency of class succeeding modal class,
h = class width
Example: Find the mode of the given data:
Marks       0-20    20-40    40-60    60-80    80-100
Students    5       10       12       6        3
The highest frequency is 12, so the modal class is 40-60.
l = lower limit of modal class = 40
fm = frequency of modal class = 12
f1 = frequency of class preceding modal class = 10
f2 = frequency of class succeeding modal class = 6
h = class width = 20
Using the mode formula,
Mode = l + [(fm-f1)/(2fm-f1-f2)] × h
= 40 + [(12-10)/(2 × 12-10-6)] × 20
= 40 + [2/8] × 20 = 45
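The grouped mode formula can be sketched in Python as below. The function name grouped_mode is hypothetical. Two assumptions are made that the text does not state: a missing neighbour of the modal class (at either end of the table) is treated as having frequency 0, and if several classes tie for the highest frequency the first one is used.

```python
def grouped_mode(lower_limits, width, freqs):
    """Mode = l + ((fm - f1) / (2*fm - f1 - f2)) * h for equal-width classes."""
    # Index of the modal class (first class with the highest frequency).
    i = max(range(len(freqs)), key=freqs.__getitem__)
    fm = freqs[i]
    f1 = freqs[i - 1] if i > 0 else 0               # assumption: 0 if no preceding class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # assumption: 0 if no succeeding class
    denom = 2 * fm - f1 - f2
    if denom == 0:
        raise ValueError("formula undefined: modal class equals both neighbours")
    return lower_limits[i] + (fm - f1) * width / denom


marks_lower = [0, 20, 40, 60, 80]
students = [5, 10, 12, 6, 3]
print(grouped_mode(marks_lower, 20, students))  # 45.0
```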
Mode Individual Observation
Find the mode for the following data:
The sizes of nine garments are as follows:
30, 31, 32, 33, 34, 32, 33, 32, 30
Here 32 is repeated the maximum number of times (three times), and hence the mode is 32.
Mode Discrete Series
A survey was taken to find the number of cars per house in the first street of Ram Nagar.
The result of the survey is shown in the form of a frequency distribution table. Find the mode for the
following data.
Number of cars in a house:    0    1     2    3    4
Number of houses:             5    12    7    4    2
X        f     fX    cf
0        5     0     5
1        12    12    17
2        7     14    24
3        4     12    28
4        2     8     30
Total    30    46
The highest frequency is 12, which belongs to X = 1, so Mode = 1.
Median = (N+1)/2 th observation
= (30+1)/2 th observation
= 15.5th observation
= 1
Mean = ∑fX/N = 46/30 ≈ 1.53
The empirical relation Mode = 3 Median – 2 Mean is only an approximation; here it would give 3 × 1 – 2 × 1.53 ≈ –0.07, so the mode is read directly from the frequencies instead.
Hence the mode, i.e. the most common number of cars in a house in the first street, is 1.
Mode Continuous Series
Find the mode for the following data:
Class:        0-10    10-20    20-30    30-40    40-50    50-60
Frequency:    8       11       14       7        9        6
Mode = L + (∆1/(∆1 + ∆2))*i
The highest frequency is 14, so the modal class is 20-30.
∆1 = the difference between the highest frequency and the preceding frequency, i.e. 14 - 11 = 3
∆2 = the difference between the highest frequency and the succeeding frequency, i.e. 14 - 7 = 7
L = lower limit of the modal class = 20
i = size of class interval = upper limit – lower limit of class = 10
Substituting these values in the formula, we get
Mode = L + (∆1/(∆1 + ∆2))*i
= 20 + (3/10)*10 = 23
Empirical Relation Between Measures of Central Tendency
The three measures of central tendency, i.e. mean, median, and mode, are approximately connected, for moderately skewed data, by the
following relation (called an empirical relationship).
2Mean + Mode = 3Median
For instance, if we are asked to calculate the mean, median, and mode of continuous grouped data,
then we can calculate mean and median using the formulae as discussed before and then find mode
using the empirical relation.
Example: Given data with a mode of 65 and a median of 61.6, we can find the mean using the
above relation.
2Mean + Mode = 3Median
2Mean = 3Median - Mode
2Mean = 3 × 61.6 - 65
2Mean = 119.8
Mean = 119.8/2 = 59.9
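The rearrangement used in this example can be written as a one-line function. The name mean_from_empirical is hypothetical, and the result is only as good as the empirical relation itself.

```python
def mean_from_empirical(median, mode):
    """Solve 2*Mean + Mode = 3*Median for Mean."""
    return (3 * median - mode) / 2


estimated_mean = mean_from_empirical(median=61.6, mode=65)
print(round(estimated_mean, 2))  # 59.9
```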
Difference between Mean and Average
The term average is frequently used in everyday life to denote a value that is typical for a group of
quantities.
The average rainfall in a month and the average age of employees of an organization are typical examples.
We might read an article stating "People spend an average of 2 hours every day on social media."
We understand from the use of the term average that not everyone is spending 2 hours a day on social
media but some spend more time and some less.
However, we can understand from the term average that 2 hours is a good indicator of the amount of
time spent on social media per day.
Average is the value that indicates what is most likely to be expected.
Averages help to summarize large data into a single value.
An average tends to lie centrally when the values of the observations are arranged in ascending order of
magnitude.
We therefore call an average a measure of the central tendency of the data. Averages are of different types.
What we refer to as the mean, i.e. the arithmetic mean, is one of the averages.
The mean is called the mathematical average whereas the median and mode are positional averages.
Difference between Mean and Median
The mean is known as the mathematical average whereas the median is known as the positional
average.
A department of an organization has 5 employees which include a supervisor and four executives.
The executives draw a salary of ₹10,000 per month while the supervisor gets ₹40,000.
Mean = (10000+10000+10000+10000+40000)/5 =80000/5 = 16000.
Thus, the mean salary is ₹16,000.
To find the median, we arrange the salaries in ascending order: 10000, 10000, 10000, 10000, 40000.
n = 5, so (n+1)/2 = 3.
Thus, the median is the 3rd observation.
Median = 10000. Thus, the median salary is ₹10,000 per month.
Let us compare the two measures of central tendency.
We can observe that the mean salary of ₹16,000 is not close to the salary of any of
the employees, whereas the median salary represents the data more effectively.
The weakness of the mean is that it is affected by extreme values.
Look at the following graph to understand how extreme values affect mean and median:
So, the mean is to be used when we don't have extremes in the data.
If we have extreme points, then the median gives a better estimation.
Example: The mean monthly salary of a group of 10 workers is $1445. One more worker, whose monthly
salary is $1500, joins the group. Find the mean monthly salary of the 11 workers.
Solution:
Here, n = 10, x̅ = 1445
Using the formula x̅ = ∑xi/n,
∑xi = x̅ × n
∑xi = 1445 × 10 = 14450
Total salary of the 10 workers = $14450
Total salary of the 11 workers = $14450 + $1500 = $15950
Average salary = 15950/11 = 1450
Answer: Average salary of the 11 workers = $1450
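The same idea, recovering the total from the mean and then adding the new value, can be sketched as a small function. The name updated_mean is an assumption for illustration.

```python
def updated_mean(old_mean, old_count, new_value):
    """Mean after one more value joins a group with a known mean."""
    old_total = old_mean * old_count  # sum of the original values
    return (old_total + new_value) / (old_count + 1)


print(updated_mean(old_mean=1445, old_count=10, new_value=1500))  # 1450.0
```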
Example: A survey of the heights of 50 girls of class X was done at a school, and the following data was obtained:
Height (in cm)     120-130    130-140    140-150    150-160    160-170    Total
Number of girls    2          8          12         20         8          50
Find the mode and median of the above data.
Solution: Modal class = 150-160 [as it has the maximum frequency]
l = 150, h = 10, fm = 20, f1 = 12, f2 = 8
Mode = l + [(fm-f1)/(2fm-f1-f2)] × h
= 150 + [(20-12)/(2 × 20-12-8)] × 10
= 150 + (8/20) × 10
= 150 + 4 = 154
To find the median, we need cumulative frequencies.
Class Intervals    No. of girls (fi)    Cumulative frequency
120-130            2                    2
130-140            8                    2+8 = 10
140-150            12 = f1              10+12 = 22 (c)
150-160            20 = fm              22+20 = 42
160-170            8 = f2               42+8 = 50 (n)
n = 50, n/2 = 25
Median class = 150-160 (the first class whose cumulative frequency reaches 25), l = 150, c = 22, f = 20, h = 10
Median = l + [(n/2 - c)/f] × h
= 150 + [(50/2 - 22)/20] × 10
= 150 + 1.5 = 151.5
Answer: Mode = 154, Median = 151.5