Revision Questions -- SA3(Q)
Question 1
The histogram above shows the salaries (in thousands of dollars) of a company's 2700 employees.
(a) Estimate, to the nearest percent, the percentage of employees whose salary is less than
$55 thousand. [57%]
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
(b) If a boxplot were constructed from this histogram, what would it be likely to look like?
Sketch an estimated boxplot; the exact five-number summary is not required.
…………………………………………………………………………………………………
Question 2
Test scores (in percent) in two units, Maths and Science, were recorded for 16 students as
follows:
Student             1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
Maths score (%)    47  50  26  56  71  75  76  90  85  56  80  94  52  76  65  57
Science score (%)  38  20  24  42  52  16  70  45  73  52  85  88  48  80  84  84
(a) Use a graphics calculator to calculate the following descriptive statistics for both units.
Give your answers correct to the nearest percent.
                     Maths    Science
Mean                 66       56
Median               68       52
Mode                 56, 76   52, 84
Standard deviation   18       25
Lower quartile       54       40
Upper quartile       78       82
Range = Max – Min    68       72
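The calculator values in part (a) can be cross-checked with a short script. This is a minimal sketch, assuming the sample standard deviation and quartiles taken as the medians of the lower and upper halves of the sorted data (both choices reproduce the tabulated values, but a particular calculator may use other conventions). The function name `summary` is illustrative only. Note that in these data 52 and 84 each occur twice among the Science scores.

```python
from statistics import mean, median, multimode, stdev

maths = [47, 50, 26, 56, 71, 75, 76, 90, 85, 56, 80, 94, 52, 76, 65, 57]
science = [38, 20, 24, 42, 52, 16, 70, 45, 73, 52, 85, 88, 48, 80, 84, 84]


def summary(scores):
    """Descriptive statistics rounded to the nearest percent."""
    s = sorted(scores)
    half = len(s) // 2
    lower, upper = s[:half], s[-half:]  # halves used for the quartiles
    return {
        "mean": round(mean(s)),
        "median": round(median(s)),
        "modes": sorted(multimode(s)),   # every value with the highest count
        "sd": round(stdev(s)),           # sample standard deviation (assumed)
        "q1": round(median(lower)),
        "q3": round(median(upper)),
        "range": max(s) - min(s),
    }


for name, scores in (("Maths", maths), ("Science", science)):
    print(name, summary(scores))
```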
(b) The highest 25% of the Science scores were all above or equal to ………… .
(c) Determine whether there are any outliers among the Science scores. Show all working to
justify your answer.
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
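One common way to answer part (c) is the 1.5 × IQR rule. The sketch below assumes that rule is the one intended (the worksheet does not name a method) and uses the Science quartiles from the table in part (a). The helper name `iqr_fences` is illustrative.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return the lower and upper fences Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


science = [38, 20, 24, 42, 52, 16, 70, 45, 73, 52, 85, 88, 48, 80, 84, 84]

# Quartiles taken from the table in part (a): Q1 = 40, Q3 = 82.
low, high = iqr_fences(40, 82)

# Any score outside the fences would be flagged as an outlier.
outliers = [x for x in science if x < low or x > high]
print(f"fences: {low} to {high}, outliers: {outliers}")
```

A score counts as an outlier only if it lies below the lower fence or above the upper fence.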
(d) The parallel boxplots for the Maths and Science scores are shown below.
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
Question 3
Twenty randomly selected students were asked to estimate the number of hours (n) that they
had spent studying in the past week (in and out of class). The responses are recorded below.
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
Johnny, the class teacher, noticed that the number of hours spent studying in the past
week (n), in and out of class, can be used to estimate the average marks (M) for academic
performance. He carried out some statistical analysis and obtained the following
information:
n̄ = 39.5,  M̄ = 76.8,  s_n = 13.2,  s_M = 20.6
(c) Using the given information, write the regression equation for predicting the average
marks (M) from the number of hours spent studying (n).
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
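For part (c), the least-squares line can be built from summary statistics: slope = r × s_M / s_n and intercept = M̄ − slope × n̄. This also requires the correlation coefficient r, which does not appear in the information listed above and is presumably found in an earlier part. The sketch below therefore takes r as an input. The value `r_placeholder` is purely illustrative and is NOT taken from the worksheet, and the function name is hypothetical.

```python
def regression_from_summary(r, x_bar, y_bar, s_x, s_y):
    """Least-squares line y = slope * x + intercept from summary statistics."""
    slope = r * s_y / s_x
    intercept = y_bar - slope * x_bar  # the line passes through (x_bar, y_bar)
    return slope, intercept


# Illustrative value only: replace with the correlation found for the data.
r_placeholder = 0.5

slope, intercept = regression_from_summary(
    r_placeholder, x_bar=39.5, y_bar=76.8, s_x=13.2, s_y=20.6
)
print(f"M = {slope:.2f} n + {intercept:.2f}")
```

The slope gives the estimated change in average marks for each additional hour of study, which is the quantity interpreted in part (d).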
(d) What is the slope of the regression equation? Interpret its value in the context
of the question.
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
(e) Use the regression equation obtained in part (c) to estimate the average marks of a student
who spent 35 hours studying in the past week. Give your answer to the nearest whole number.
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
Question 4
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
(b) Comment on the skewness of the distributions for these two villages.
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
Question 5
A chemical solution was gradually heated. At five-minute intervals the time, t minutes, and
the temperature, T ℃, were noted.
Time, t (min)          0     5    10    15    20    25    30    35
Temperature, T (°C)  0.8   3.0   6.8  10.9  15.6  19.6  23.4  26.7
(a) Which is the independent variable and which is the dependent variable?
…………………………………………………………………………………………………
…………………………………………………………………………………………………
(b) Use your graphics calculator to draw a scatter plot and comment on the relationship
between t and T by stating its strength, direction and form.
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
(c) Evaluate Pearson's correlation coefficient and comment on whether it supports your
answer in (b).
…………………………………………………………………………………………………
…………………………………………………………………………………………………
(d) Comment on the coefficient of determination for the relationship between t and T.
…………………………………………………………………………………………………
…………………………………………………………………………………………………
(e) Calculate the equation of the regression line of T on t. [T = 0.78t − 0.25]
…………………………………………………………………………………………………
(f) Use your equation to estimate the temperature after 12 minutes. [9.1]
Is this estimate reliable?
…………………………………………………………………………………………………
…………………………………………………………………………………………………
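The calculator work in parts (c) to (f) can be reproduced from the table. This is a minimal sketch using the standard least-squares formulas in plain Python. The variable names are illustrative.

```python
from math import sqrt

t = [0, 5, 10, 15, 20, 25, 30, 35]                   # time (min)
T = [0.8, 3.0, 6.8, 10.9, 15.6, 19.6, 23.4, 26.7]    # temperature (deg C)

n = len(t)
t_bar = sum(t) / n
T_bar = sum(T) / n

# Sums of squares and cross-products about the means.
s_tt = sum((ti - t_bar) ** 2 for ti in t)
s_TT = sum((Ti - T_bar) ** 2 for Ti in T)
s_tT = sum((ti - t_bar) * (Ti - T_bar) for ti, Ti in zip(t, T))

r = s_tT / sqrt(s_tt * s_TT)          # Pearson's correlation coefficient
r_squared = r ** 2                    # coefficient of determination
slope = s_tT / s_tt                   # regression line of T on t
intercept = T_bar - slope * t_bar

estimate = slope * 12 + intercept     # part (f): temperature at t = 12
interpolating = min(t) <= 12 <= max(t)

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
print(f"T = {slope:.2f} t + ({intercept:.2f})")
print(f"T(12) = {estimate:.1f}, within data range: {interpolating}")
```

Because t = 12 lies inside the observed range of times and the correlation is very strong, the estimate is an interpolation, which bears on the reliability question in part (f).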
Question 6
A medical officer wishes to study the relationship between the blood pressure and the age of
male patients. He gets the following results from 12 patients.
BP            118  147  143  160  145  125  115  149  152  130  152  150
Age (years)    36   56   47   72   49   42   38   63   68   42   60   55
(a) Which is the explanatory variable and which is the response variable?
…………………………………………………………………………………………………
…………………………………………………………………………………………………
The equation of the line of best fit for this set of data is
(b) Interpret the coefficient of Age and the constant in the context of this data set.
Is the value of the constant sensible? Explain.
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
(c) Use the given regression equation to estimate the blood pressure of a 25-year-old
patient. Is this estimate reliable? Why?
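The reliability question in part (c) turns on whether age 25 lies inside the range of ages in the sample. The sketch below fits a least-squares line of BP on Age directly from the table and checks that range. It is an independent fit computed from the listed data, so its coefficients may differ in rounding from the equation given on the worksheet. The variable names are illustrative.

```python
bp = [118, 147, 143, 160, 145, 125, 115, 149, 152, 130, 152, 150]
age = [36, 56, 47, 72, 49, 42, 38, 63, 68, 42, 60, 55]

n = len(age)
age_bar = sum(age) / n
bp_bar = sum(bp) / n

# Least-squares line of BP (response) on Age (explanatory).
s_xx = sum((a - age_bar) ** 2 for a in age)
s_xy = sum((a - age_bar) * (b - bp_bar) for a, b in zip(age, bp))
slope = s_xy / s_xx
intercept = bp_bar - slope * age_bar

target_age = 25
estimate = slope * target_age + intercept
in_range = min(age) <= target_age <= max(age)

print(f"BP = {slope:.2f} * Age + {intercept:.2f}")
print(f"estimate at age {target_age}: {estimate:.0f}")
print(f"ages observed: {min(age)} to {max(age)}, within range: {in_range}")
```

An estimate made outside the observed age range is an extrapolation, so the fitted line is not supported by data at that age.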
[Residual plot: residuals on the vertical axis (scale −8 to 12), horizontal axis scaled from 30 to 75]
Does this residual plot suggest that the linear equation is not appropriate? Explain.
…………………………………………………………………………………………………
…………………………………………………………………………………………………
Does this residual plot suggest a linear relationship? Explain.
…………………………………………………………………………………………………
…………………………………………………………………………………………………
r-value
(h) Which model is better? Does it suggest a stronger linear relationship? Give a reason.
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
(i) It is known that the data follow a relationship of the form Age = a × b^(k × BP), where
a, b and k are constants. Use the equation in part (f) to find the values of a, b and k.
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
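Part (i) amounts to rewriting a log-linear equation in exponential form. The equation from part (f) is not reproduced here, so the sketch below ASSUMES it has the form log10(Age) = c + d × BP. Under that assumption Age = 10^c × 10^(d × BP), giving a = 10^c, b = 10 and k = d. The coefficients c and d are fitted from the table for illustration and may differ from the worksheet's own equation.

```python
from math import log10

bp = [118, 147, 143, 160, 145, 125, 115, 149, 152, 130, 152, 150]
age = [36, 56, 47, 72, 49, 42, 38, 63, 68, 42, 60, 55]

# Assumed transformed model: log10(Age) = c + d * BP.
y = [log10(a) for a in age]
n = len(bp)
x_bar = sum(bp) / n
y_bar = sum(y) / n
d = sum((x - x_bar) * (yi - y_bar) for x, yi in zip(bp, y)) / sum(
    (x - x_bar) ** 2 for x in bp
)
c = y_bar - d * x_bar

# Back-transform: Age = 10**c * 10**(d * BP) = a * b**(k * BP).
a, b, k = 10 ** c, 10, d


def age_from_bp(x):
    """Exponential form of the assumed log-linear model."""
    return a * b ** (k * x)


print(f"a = {a:.3f}, b = {b}, k = {k:.5f}")
```

If the equation in part (f) uses natural logarithms instead, the same steps apply with b = e.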