RAFFLES INSTITUTION
RAFFLES PROGRAMME 2024
YEAR 3 MATHEMATICS
TOPIC 11: STATISTICS (MATHS 1)
WORKSHEET 1
Name: ( ) Class: 3 ( ) Date:
WORKSHEET 1: MEAN AND STANDARD DEVIATION OF GROUPED DATA
(New Syllabus Mathematics 4, 7th Edition, P.65)
Learner Outcomes:
At the end of the lesson(s), students should be able to
1. Estimate mean for grouped data.
2. State the median class and modal class for grouped data.
3. Define and explain the use of standard deviation for a set of data.
4. Calculate standard deviation for a set of data.
5. Use the mean and standard deviation to compare two sets of data
(1) REVISION
(1.1) Data Representation
In Year 1 Statistics I, we learnt the following forms of data representation:
[Figures: Pie Chart, Bar Chart, Line Graph]
[Figures: Stem and Leaf Diagram (stems 0 to 4); Pictogram "No. of Cars Sold per Month" for Jan, Feb and March, where each symbol represents 100 cars]
[Figures: Histogram, Dot Diagram]
Page 1 of 22
(1.2) Measures of Central Tendency of Ungrouped Data
____________ is the data with the highest frequency.
A set of data may have more than one mode.
____________ is the middle data when the data is arranged in order of magnitude.
It gives a sense of average as a half-way mark.
If n is the number of data, the median is
the $\left(\frac{n+1}{2}\right)$th value if n is odd;
the average of the $\left(\frac{n}{2}\right)$th and $\left(\frac{n}{2}+1\right)$th values if n is even.
____________ is the average of the data, i.e. the sum of all individual data divided by the total
number of data, n:
$$\bar{x} = \frac{\text{sum of data}}{\text{number of data}} \quad \text{or} \quad \bar{x} = \frac{\sum x}{n}$$
Note: Summation ( ∑ ) is the addition of a sequence of numbers.
EG 1
The numbers of goals scored by a soccer team in 8 matches are 2, 5, 3, 3, 1, 4, 0, 2.
Calculate the mode, median and mean of the data.
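As a check on hand working, the three measures can be computed with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative and not part of the worksheet.

```python
import statistics

goals = [2, 5, 3, 3, 1, 4, 0, 2]  # goals scored in the 8 matches of EG 1

modes = statistics.multimode(goals)  # every value that shares the highest frequency
median = statistics.median(goals)    # averages the two middle values when n is even
mean = statistics.mean(goals)        # sum of data divided by number of data

print("mode(s):", modes)
print("median:", median)
print("mean:", mean)
```

`multimode` is used rather than `mode` because a set of data may have more than one mode.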
EG 2
A survey was conducted on a group of students to find out the number of hours of sleep that
they had on the 1st day of FHBL. The results are shown in the frequency table below. Find the
mode, median and mean number of hours of sleep.
Number of hours of sleep 4 5 6 7 8 9
Number of students 3 5 5 8 4 1
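For data given in a frequency table, one possible approach is to expand the table back into individual values before finding the mode and median, while the mean uses the sum of (value × frequency). The sketch below assumes the table of EG 2; the names `hours` and `students` are illustrative.

```python
import statistics

hours = [4, 5, 6, 7, 8, 9]      # number of hours of sleep
students = [3, 5, 5, 8, 4, 1]   # number of students (frequency)

# Expand the frequency table back into the 26 individual data values
data = [h for h, f in zip(hours, students) for _ in range(f)]

mode = statistics.mode(data)      # value with the highest frequency
median = statistics.median(data)  # average of the 13th and 14th values
mean = sum(h * f for h, f in zip(hours, students)) / sum(students)

print("mode:", mode)
print("median:", median)
print("mean:", round(mean, 2))
```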
(2) GROUPED FREQUENCY DISTRIBUTION
In real-world contexts, most data are grouped into class intervals because of the wide range of the data.
An example of grouped data in class intervals is shown below.
Test results of class 3N (30 students)
Test marks (x)     0 < x ≤ 10   10 < x ≤ 15   15 < x ≤ 20   20 < x ≤ 25   25 < x ≤ 30
No. of students    2            4             11            8             5
Questions to consider:
1. How do we calculate the mean?
2. How do we represent each class interval?
For the class interval 0 < x ≤ 10, the lower class limit is 0, the upper class limit is 10, the
class width is 10 and the class mark (midpoint) is 5.
Modal Class is the class interval with the _________________________ (i.e. 15 < x ≤ 20).
Median Class is the class interval in which the __________________ lies (i.e. 15 < x ≤ 20).
An estimate of the Mean of grouped data is given by
$$\bar{x} = \frac{\sum fx}{\sum f}$$
where x is the ________________________ and f is the frequency of the class interval.
EG 3
30 students took a test. The results are shown in the frequency table below. Find the estimated
mean test mark.
Test marks (x)     0 < x ≤ 10   10 < x ≤ 15   15 < x ≤ 20   20 < x ≤ 25   25 < x ≤ 30
Class mark         5            12.5          17.5          22.5          27.5
No. of students    2            4             11            8             5
We cannot find the exact mean because we have lost the exact value of each data in a grouped
frequency distribution. We can only find an estimate of the mean by assuming the
class mark (class mid-value) to be the mean of the data in a class interval.
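The estimate described above can be sketched in Python as follows. The class limits and frequencies are those of EG 3; the variable names are illustrative, and each class mark is taken as the midpoint of its class limits.

```python
# Class intervals of EG 3 as (lower class limit, upper class limit) pairs
intervals = [(0, 10), (10, 15), (15, 20), (20, 25), (25, 30)]
frequencies = [2, 4, 11, 8, 5]

# Class mark = midpoint of each class interval
class_marks = [(low + high) / 2 for low, high in intervals]

# Estimated mean = (sum of f * x) / (sum of f), with x the class mark
sum_fx = sum(f * x for f, x in zip(frequencies, class_marks))
estimated_mean = sum_fx / sum(frequencies)

print("class marks:", class_marks)
print("estimated mean:", estimated_mean)
```

The result is only an estimate because every mark in a class is treated as if it were equal to the class mark.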
EG 4
The following grouped frequency table shows the amount of time spent by students of class 3Z
watching television at home in April.
No. of hours (h)   10 < h ≤ 20   20 < h ≤ 30   30 < h ≤ 40   40 < h ≤ 50   50 < h ≤ 60   60 < h ≤ 70
Frequency          4             9             7             5             4             2
(a) State the modal class and the median class.
(b) Estimate the mean number of hours of the distribution.
Remember to obtain the class mark!
For estimation of the mean, answers should be rounded off to 3 s.f. and cannot be left as a fraction.
(3) Choice of Measures of Central Tendency in Real Life
You were introduced to three measures of central tendency at the beginning of this topic. In
real life, which one will you always choose to summarise the data you have collected?
There are 3 cases shown in the next two pages to illustrate how to choose a suitable measure
of central tendency.
Recall from Y1 Statistics I:
Measure of Central Tendency: Mean
  Advantages: Takes into account all the scores and thus makes the most of the information provided by the data.
  Disadvantages: Affected by unbalanced extreme scores.
Measure of Central Tendency: Median
  Advantages: It is less sensitive to extreme values.
  Disadvantages: Does not take into account the total quantity represented by the data.
Measure of Central Tendency: Mode
  Advantages: It can be used for both quantitative and qualitative data.
  Disadvantages: Does not take into account the total quantity represented by the data, and there may be more than one mode or it may not exist.
Case 1: Test Result Analysis (extreme data)
The following is a recent test result of a class of 26 students (A to Z). Find the mean, median
and mode of the test results.
Which measure of central tendency will best summarise the students’ performance in the
class test and why?
No   Student   Marks/30
1    A         4
2    K         4
3    E         5
4    F         5
5    H         5
6    M         5
7    Q         6
8    W         6
9    I         6
10   B         17
11   G         17
12   S         17
13   C         19
14   J         19
15   L         19
16   X         19
17   N         20
18   O         20
19   D         28
20   P         28
21   R         28
22   U         28
23   T         29
24   V         29
25   Y         29
26   Z         30
(The groups of lowest and highest marks are labelled "Extreme Values".)
Mean = ______
Median = ______
Mode = _______
Case 2: Research Education Survey Analysis (qualitative data)
A RE group researching a topic on “The impact of hand phone usage on teenagers” designed a
survey to gather some data. One of the survey questions is “What is your favourite hand phone
colour?”.
What measure of central tendency would be appropriate in interpreting the set of data obtained
for this question? Why?
Case 3: Class Survey Analysis (ordinal data)
A teacher conducted a survey of a class after a lesson. The question was as follows:
How satisfied are you with your examination preparation?
Very satisfied
Satisfied
Indifferent
Dissatisfied
Very Dissatisfied
What measure of central tendency should the teacher use to summarise the students' feedback?
(4) SPREAD OR DISPERSION OF DISTRIBUTION
Consider the two different sets of marks obtained by two different classes for their test.
Class 3X
Marks: 4, 5, 6, 6, 7, 8, 10, 11, 11, 12 Mean = ______
Class 3Y
Marks: 1, 2, 2, 2, 2, 2, 15, 18, 18, 18 Mean = ______
Both sets of data have the same mean, but what do you notice about the spread of the data?
The simplest measure of spread of distribution is the Range.
Range = largest data – smallest data
In the above data, the range of class 3X =
the range of class 3Y =
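The mean and the range of each class can be checked with a short Python sketch. The function name `data_range` is illustrative and not part of the worksheet.

```python
class_3x = [4, 5, 6, 6, 7, 8, 10, 11, 11, 12]
class_3y = [1, 2, 2, 2, 2, 2, 15, 18, 18, 18]

def data_range(values):
    """Range = largest data - smallest data."""
    return max(values) - min(values)

for name, marks in [("3X", class_3x), ("3Y", class_3y)]:
    mean = sum(marks) / len(marks)
    print("Class", name, "mean =", mean, "range =", data_range(marks))
```

The range uses only the two most extreme values, so it says nothing about how the remaining marks are spread between them.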
(5) STANDARD DEVIATION
In lower secondary, you learnt to compare data sets using single data points (e.g. the
highest/lowest data value in each data set, the frequency of one data value/group versus
another, etc.) or a single-value measure of central tendency (e.g. mean, mode and median).
However, in some situations, such comparisons may not give us enough information for a
good comparison between data sets, or may fail to help us solve a problem.
This section presents standard deviation.
Let’s take a look at the following context.
Football Fever
The organizers of the Premier League Federation have to decide which one of two players -
Mike Arwen and Dave Backhand – should receive “The Most Consistent Player” award.
Table 1 shows the number of goals that each striker scored in a 16-year period.
Table 1: Number of goals scored by two strikers in the Premier League between 2003 and 2018
Year Mike Dave
2003 14 12
2004 10 10
2005 15 17
2006 10 14
2007 15 11
2008 11 14
2009 15 14
2010 12 15
2011 16 15
2012 13 16
2013 17 16
2014 13 13
2015 18 12
2016 13 14
2017 18 13
2018 14 18
The organizers agreed to approach this decision mathematically by designing a measure of
consistency.
1. Design a measure of consistency. Your measure of consistency should make use
of all data points in the table.
2. Post your measure of consistency on the Interactive Thinking Tool in SLS.
3. Comment on your friends' measures of consistency. Your comment/feedback
should be constructive.
Measure of consistency: A method to measure the amount of variation in the data
set, i.e. how much the data is spread out.
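As an illustration only, here is one possible measure of consistency that uses all data points in Table 1: the average distance of each yearly total from the player's mean. This is an assumed example for discussion, not the measure the worksheet goes on to define, and the function name is hypothetical.

```python
# Goals scored per year from 2003 to 2018, copied from Table 1
mike = [14, 10, 15, 10, 15, 11, 15, 12, 16, 13, 17, 13, 18, 13, 18, 14]
dave = [12, 10, 17, 14, 11, 14, 14, 15, 15, 16, 16, 13, 12, 14, 13, 18]

def mean_absolute_deviation(values):
    """Average distance of each data point from the mean of the data."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)

for name, goals in [("Mike", mike), ("Dave", dave)]:
    print(name, "mean =", sum(goals) / len(goals),
          "mean absolute deviation =", mean_absolute_deviation(goals))
```

A smaller value suggests that the yearly totals stay closer to the player's mean, i.e. a more consistent player under this particular measure.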
Standard Deviation is used to measure the spread of data in a set from its mean. Standard
deviation is useful when comparing the spread of two data sets that have approximately the same
mean. Generally, the more widely spread a set of data is around its mean, the higher the standard
deviation. There are several notations that stand for standard deviation: S, SD or σ. We shall
use SD to represent standard deviation.
Standard deviation is used widely in our daily lives. Some examples of situations in which
standard deviation might help to understand the value of the data:
In the analysis of test scores, a small standard deviation shows that students scored very
close to the mean score.
In weather forecasting, the forecasts are compared with the actual temperatures recorded. A
low standard deviation would show a reliable weather forecast.
Here is a video clip on the significance of standard deviation in statistics.
Standard Deviation - Explained and Visualized
https://youtu.be/MRqtXL2WX2M
(5.1) Standard Deviation for Ungrouped Data
For a set of ungrouped data, the standard deviation is given by:
$$\text{Standard deviation, SD} = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} \quad \text{or} \quad \sqrt{\frac{\sum x^2}{n} - \bar{x}^2}$$
where $\bar{x}$ is the mean of the data set with n terms.
The expression $x - \bar{x}$ is the deviation of each data from the mean.
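The formula is stated in two forms: one built from the squared deviations $(x - \bar{x})^2$ and one built from $\sum x^2$. A short derivation, using only the fact that $\sum x = n\bar{x}$, shows why they give the same value:

```latex
\begin{align*}
\sum (x - \bar{x})^2 &= \sum x^2 - 2\bar{x}\sum x + n\bar{x}^2 \\
                     &= \sum x^2 - 2n\bar{x}^2 + n\bar{x}^2 && \text{since } \sum x = n\bar{x} \\
                     &= \sum x^2 - n\bar{x}^2 \\[4pt]
\frac{\sum (x - \bar{x})^2}{n} &= \frac{\sum x^2}{n} - \bar{x}^2
\end{align*}
```

Taking the square root of both sides of the last line gives the two forms of SD. The second form is often quicker by hand because it avoids subtracting the mean from every data value.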
EG 5
Consider the two different sets of marks obtained by two different classes for their test.
Class 3X
Marks: 4, 5, 6, 6, 7, 8, 10, 11, 11, 12 Mean = 8
Class 3Y
Marks: 1, 2, 2, 2, 2, 2, 15, 18, 18, 18 Mean = 8
From the given data above, find the standard deviation of the data for class 3X and class 3Y.
What information can we obtain from the results?
The two sets of marks have the same mean, but have very different values of standard
deviation.
Class 3X, with a smaller standard deviation, has a narrower spread of marks around the
mean. Hence the performance is more consistent, i.e. most pupils scored about 8 marks.
However, Class 3Y with a ___________________________, has a ________________
of marks from the mean. This means that the marks are _________________________
and the students in Class 3Y have extreme performances (“outliers”); hence the mean is
not a good indicator of how an average student in Class 3Y fared in the test.
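The standard deviations for EG 5 can be verified with the sketch below, which evaluates both forms of the formula for ungrouped data. The function names are illustrative and not part of the worksheet.

```python
import math

def sd_from_deviations(values):
    """SD = sqrt( sum of (x - mean)^2 / n )."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

def sd_from_squares(values):
    """SD = sqrt( (sum of x^2) / n - mean^2 )."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum(x ** 2 for x in values) / n - mean ** 2)

class_3x = [4, 5, 6, 6, 7, 8, 10, 11, 11, 12]
class_3y = [1, 2, 2, 2, 2, 2, 15, 18, 18, 18]

for name, marks in [("3X", class_3x), ("3Y", class_3y)]:
    print("Class", name,
          round(sd_from_deviations(marks), 2),
          round(sd_from_squares(marks), 2))
```

Both forms agree for each class, and the much larger value for Class 3Y reflects its wider spread of marks around the same mean of 8.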
Why do we still square $(x - \bar{x})$ in the formula for standard deviation when $x - \bar{x}$
is the deviation of each data from the mean?
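One way to explore this question is to add up the deviations without squaring them. The sketch below uses the Class 3X marks; the variable names are illustrative.

```python
class_3x = [4, 5, 6, 6, 7, 8, 10, 11, 11, 12]
mean = sum(class_3x) / len(class_3x)

# Deviation of each data from the mean
deviations = [x - mean for x in class_3x]

print(sum(deviations))                  # positive and negative deviations cancel out
print(sum(d ** 2 for d in deviations))  # squared deviations are never negative
```

Because the unsquared deviations always sum to zero, their average cannot measure spread; squaring makes every term non-negative so that the deviations no longer cancel.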
Infusion Of Computational Thinking
Topic (Strand): Statistics (Calculation of Standard Deviation)
Nature of Task: ☐ Implementing Procedures ☒ Solving Problems
CT Focus: ☐ Abstraction ☒ Generalisation
☐ Decomposition ☒ Algorithm Design
Task
Given a set of Ungrouped Data, design a code to find its Standard Deviation.
Function: Mean value of ungrouped data
Input: Ungrouped Data
Output: Standard Deviation of Ungrouped Data
Hints
Define the Mean value of Ungrouped Data
Define the Standard Deviation using the Mean definition, Ungrouped Data, total
data and appropriate Mathematical Functions
Notes
To find the Standard Deviation, we first need to find the Mean of the Ungrouped Data.
Use this task to bring up the idea of writing code in modules and re-using them,
which is an example of solving problems using generalisation and algorithm design.
Given below is Python code that calculates the mean of a set of ungrouped data.
Use this information to design code that calculates the Standard Deviation.
def mean(values):
    # Add up the values one at a time, then divide by the number of values
    length = len(values)
    total_sum = 0
    for i in range(length):
        total_sum += values[i]
    average = total_sum * 1.0 / length
    return average

x = [1, 12, 23, 44, 56, 126, 60]
m = mean(x)
print(m)
import math

def mean(values):
    return sum(values) * 1.0 / len(values)

def StanDev(values):
    # Standard deviation: square root of the mean of the squared deviations
    length = len(values)
    m = mean(values)  # re-use the mean() module defined above
    total_sum = 0
    for i in range(length):
        total_sum += (values[i] - m) ** 2
    under_root = total_sum * 1.0 / length
    return math.sqrt(under_root)

x = [1, 12, 23, 44, 56, 126, 60]
SD = StanDev(x)
print(SD)
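A hand-written routine like the one above can be cross-checked against Python's standard library. The sketch below re-implements the calculation compactly (the name `stan_dev` is illustrative) and compares it with `statistics.pstdev`, which divides by n in the same way as the formula in this worksheet.

```python
import math
import statistics

def stan_dev(values):
    """Square root of the mean squared deviation from the mean."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

x = [1, 12, 23, 44, 56, 126, 60]

print(stan_dev(x))
print(statistics.pstdev(x))  # population standard deviation (divides by n)
```

Note that `statistics.stdev` divides by n - 1 instead of n, so it gives a slightly larger value and is not the formula used here.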
(5.2) Standard Deviation for Grouped Data (Frequency Distribution)
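For a set of grouped data, the standard deviation is given by:
$$\text{Standard deviation, SD} = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}} \quad \text{or} \quad \sqrt{\frac{\sum fx^2}{\sum f} - \bar{x}^2}$$
where $\bar{x}$ is the mean of the data set, and f is the frequency of each data value x.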
EG 6
The table below shows the number of persons in cars passing the road junction outside RI
during lunch hour. Calculate the mean and the standard deviation, giving your answers correct
to 3 significant figures.
No. of persons in a car   1    2    3    4    5    6
No. of cars               14   11   9    6    4    1
Remember to use the exact value of the mean for the calculation of standard deviation!
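The working for EG 6 can be sketched in Python as follows, using the form of the formula based on the sum of f times x squared. The variable names are illustrative; the mean is kept at full precision and rounding to 3 significant figures happens only when printing.

```python
import math

persons = [1, 2, 3, 4, 5, 6]    # number of persons in a car (x)
cars = [14, 11, 9, 6, 4, 1]     # number of cars (f)

total_f = sum(cars)
sum_fx = sum(f * x for f, x in zip(cars, persons))
sum_fx2 = sum(f * x ** 2 for f, x in zip(cars, persons))

mean = sum_fx / total_f                          # exact mean, not rounded
sd = math.sqrt(sum_fx2 / total_f - mean ** 2)    # uses the exact mean

print(f"mean = {mean:.3g}, SD = {sd:.3g}")
```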
(5.3) Standard Deviation for Grouped Data (Grouped Frequency Distribution)
For a set of grouped data, the standard deviation is given by:
$$\text{Standard deviation, SD} = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}} \quad \text{or} \quad \sqrt{\frac{\sum fx^2}{\sum f} - \bar{x}^2}$$
where f is the frequency and x is the class mid-value of the class interval.
EG 7
The distribution table below shows the speed of 100 cars. Estimate the mean and the standard
deviation of the distribution.
Speed (v km/h)   0 < v ≤ 30   30 < v ≤ 40   40 < v ≤ 50   50 < v ≤ 60   60 < v ≤ 70   70 < v ≤ 80   80 < v ≤ 90
No. of cars      8            8             25            35            14            6             4
Remember to obtain the class mark!
Remember to indicate the units for your mean and standard deviation!
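For a grouped frequency distribution, the same calculation applies once each class interval is replaced by its class mark. The sketch below wraps the steps in a re-usable function applied to the data of EG 7; the function name `grouped_mean_sd` is hypothetical.

```python
import math

def grouped_mean_sd(intervals, frequencies):
    """Estimate the mean and SD of grouped data using class marks."""
    marks = [(low + high) / 2 for low, high in intervals]  # class mid-values
    total_f = sum(frequencies)
    mean = sum(f * x for f, x in zip(frequencies, marks)) / total_f
    mean_of_squares = sum(f * x ** 2 for f, x in zip(frequencies, marks)) / total_f
    return mean, math.sqrt(mean_of_squares - mean ** 2)

# Class limits (km/h) and number of cars from EG 7
speeds = [(0, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80), (80, 90)]
cars = [8, 8, 25, 35, 14, 6, 4]

mean, sd = grouped_mean_sd(speeds, cars)
print(f"estimated mean = {mean:.3g} km/h, estimated SD = {sd:.3g} km/h")
```

The class widths need not be equal: the first class here is 30 km/h wide while the others are 10 km/h wide, and only the class marks enter the calculation.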
EG 8
A group of 100 Secondary 3 students in School Alpha were asked for the amount of pocket
money they received each week, to the nearest dollar. The data is shown below:
Amount of money ($) 16 – 20 21 – 25 26 – 30 31 – 35 36 – 40 41 – 45
No. of students 3 12 19 36 22 8
(a) Complete the following table and use it to calculate the estimated mean and estimated
standard deviation of the amount of money received by the 100 students in School
Alpha.
Amount of Money ($)   Mid-value (x)   Frequency (f)   fx          fx²
16 – 20                               3
21 – 25                               12
26 – 30                               19
31 – 35                               36
36 – 40                               22
41 – 45                               8
                                      Σf = 100        Σfx =       Σfx² =
(b) Another group of 100 Secondary 4 students in School Bravo were asked for the amount
of pocket money they received each week. The following data analysis is obtained:
Estimated Mean of data = $32.30
Estimated Standard Deviation = $3.02
Compare and comment on the results for the two schools.
HOMEWORK
LEVEL 1
1 NSM4, Chapter 3, Ex 3D Q8 – 13 (P.119 – 120)
2 The table below gives the masses, in kg, of 150 workers of similar height.
Mass (x kg)      50 < x ≤ 55   55 < x ≤ 60   60 < x ≤ 65   65 < x ≤ 75   75 < x ≤ 95
No. of workers   25            35            40            30            20
Estimate the mean and the standard deviation, giving your answers correct to 3
significant figures.
[Ans: 64.2 kg; 9.92 kg]
LEVEL 2
1 A survey was conducted to find out the number of hours in a day spent by pupils on the
computer. The following table shows the results:
No. of hours 2 3 4 5 6 7 8
No. of pupils in Class A 2 3 6 11 10 7 1
No. of pupils in Class B 4 4 9 8 7 5 3
(a) Calculate the mean number of hours and the standard deviation for each class.
(b) Which class spent less time on the computer?
2 The table below shows the scores that were obtained in a quiz done by 46 students in a
class.
Scores 45 < 𝑥 ≤ 55 55 < 𝑥 ≤ 65 65 < 𝑥 ≤ 75 75 < 𝑥 ≤ 85 85 < 𝑥 ≤ 95
No. of students 5 12 7 12 10
Estimate the mean and the standard deviation of the quiz scores, giving your answers
correct to 3 significant figures.
3 The following table shows the mass of 20 durians from a Shop A:
Mass (x kg) 1<𝑥≤2 2<𝑥≤3 3<𝑥≤4 4<𝑥≤5
Frequency 2 p 8 q
(a) Find the value of p and of q given that an estimate for the mean mass of the
durians is 3.5 kg.
(b) Calculate the estimated standard deviation for the mass of the durians in Shop
A.
(c) Given that the mean and standard deviation of the mass of the durians in Shop
B are 4.4 kg and 1.2 kg respectively, comment on the distribution of the mass
of durians in both shops in 2 different ways.
[Ans: 1(a) Class A mean = 5.225 h, SD = 1.42 h; Class B mean = 4.925 h, SD = 1.69 h
1(b) Class B; 2. Mean = 72.2, SD = 13.3; 3(a) p = 3, q = 7; 3(b) 0.949 kg]
(6) CALCULATOR CORNER
We can make use of the calculator to estimate the mean and the standard deviation.
Note that the calculator should only be used to verify the values of mean and standard
deviation obtained in exams or tests. If no working is shown and the answer given is wrong, no
credit will be given.
Scan the respective QR code for your calculator model and learn how to use it to estimate the
mean and the standard deviation of the example provided below.
CASIO fx-97SG X: https://youtu.be/KIMGp4PNLFM
CASIO fx-96SG PLUS: https://youtu.be/MuMa81o6ozs
SHARP EL-W531S: https://youtu.be/-0I9mvlFOAw
Using EG 7:
The distribution table below shows the speed of 100 cars. Estimate the mean and the standard
deviation of the distribution.
Speed (v km/h)   0 < v ≤ 30   30 < v ≤ 40   40 < v ≤ 50   50 < v ≤ 60   60 < v ≤ 70   70 < v ≤ 80   80 < v ≤ 90
No. of cars      8            8             25            35            14            6             4
Answer: Estimated mean speed, $\bar{x}$ = 51.5 km/h (3 s.f.)
Estimated standard deviation, SD = 15.7 km/h (3 s.f.)
FOR YOUR INTEREST
Estimate Median in Grouped Data
What’s the formula to calculate an estimate of the median in grouped data?
(http://myhome.iolfree.ie/~wgf/AP/The%20Median2.pdf)
$$\text{Estimated median} = L + \left(\frac{\frac{n}{2} - \text{cumulative frequency before median class}}{\text{frequency of median class}}\right) \times \text{class width}$$
where L = the lower boundary of the median class interval and n = the total number of data.
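The interpolation formula for the estimated median can be sketched in Python as follows. The function name `estimated_median` is hypothetical, and the data used to try it out are the speeds of the 100 cars in EG 7, which is an assumed choice of example.

```python
def estimated_median(intervals, frequencies):
    """Estimate the median of grouped data by linear interpolation."""
    n = sum(frequencies)
    if n <= 0:
        raise ValueError("the total frequency must be positive")
    cumulative = 0  # cumulative frequency before the current class
    for (low, high), f in zip(intervals, frequencies):
        if cumulative + f >= n / 2:  # the median lies in this class
            return low + (n / 2 - cumulative) / f * (high - low)
        cumulative += f

# Class boundaries (km/h) and number of cars from EG 7
speeds = [(0, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80), (80, 90)]
cars = [8, 8, 25, 35, 14, 6, 4]

print(round(estimated_median(speeds, cars), 1), "km/h")
```

Here n/2 = 50, the cumulative frequency before the class 50 to 60 is 41 and that class has frequency 35, so the estimate is 50 + (9/35) × 10.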
Assumed Mean
In statistics, the assumed mean is a method for calculating the arithmetic mean of a data set. It
simplifies calculating accurate values by hand, especially for numbers that are large in value.
Example
Find the mean of the following numbers: 297, 295, 292, 315 and 311
Let the assumed mean be 300. You may choose other numbers.
Let A = 297 – 300 = –3
B = 295 – 300 = –5
C = 292 – 300 = –8
D = 315 – 300 = 15
E = 311 – 300 = 11
The mean of A, B, C, D and E = X = $\frac{-3 - 5 - 8 + 15 + 11}{5}$ = 2
Therefore the mean of 297, 295, 292, 315 and 311 = 300 + X = 302
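The assumed mean method can be sketched as a small Python function. The function name is illustrative; the numbers are those of the example above.

```python
def mean_with_assumed_mean(values, assumed):
    """Mean = assumed mean + mean of the deviations from the assumed mean."""
    deviations = [v - assumed for v in values]
    return assumed + sum(deviations) / len(deviations)

numbers = [297, 295, 292, 315, 311]

print(mean_with_assumed_mean(numbers, 300))  # deviations: -3, -5, -8, 15, 11
print(mean_with_assumed_mean(numbers, 290))  # a different assumed mean
```

Both calls give the same mean, which illustrates that the choice of assumed mean affects only how convenient the arithmetic is, not the answer.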
Practice
Using an assumed mean of 1240, find the mean of 1242, 1252, 1248, 1244 and 1249.
[Ans: 1247]