Unit 4.3 Random Variables, Discrete and Continuous Probability Distribution
Specific Objectives:
1. To be able to find the expectations and variances of discrete and continuous probability distributions.
2. To learn the Binomial and Normal distributions and their daily-life applications.
20.1 Random Variables (Time ratio: 6)
(a) Discrete probability functions
A formal treatment of random variables is not expected. Instead, teachers can introduce the preliminary idea by using simple examples such as the throwing of coins (for a discrete random variable) and the lifetime of electric bulbs (for a continuous random variable).
The discrete probability function f(x) can be introduced as f(x) = P(X = x), where X is a discrete random variable and x is a fixed value of the random variable, through familiar examples such as the throwing of 2 coins:
\[
f(x) =
\begin{cases}
0.25 & \text{for } x = 0 \\
0.5 & \text{for } x = 1 \\
0.25 & \text{for } x = 2 \\
0 & \text{otherwise}
\end{cases}
\]
X (the number of heads obtained) is a discrete random variable which can take the
values 0, 1 or 2.
Emphasis should be laid on the conditions f(x) ≥ 0 and ∑ f(x) = 1.
Teachers should remind students that the capital letter X is usually reserved for the random variable and the lower-case x for the values the random variable can assume.
Example
X is the number of attempts required to get a ‘six’ in a throw of a die.
The discrete probability function f(x) is
\[
f(x) = P(X = x) = \frac{1}{6}\left(\frac{5}{6}\right)^{x-1}, \qquad x = 1, 2, 3, \ldots
\]
Clearly,
\[
\sum f(x) = \sum_{x=1}^{\infty} P(X = x) = \frac{\frac{1}{6}}{1 - \frac{5}{6}} = 1.
\]
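The claim that these probabilities sum to 1 can be checked numerically. The sketch below is only an illustration: the function name `f` mirrors the notation of the example, and the cut-off of 200 terms is an arbitrary choice, not something prescribed by the syllabus.

```python
# Probability of needing exactly x throws to get the first 'six'.
p = 1 / 6

def f(x):
    """f(x) = (1/6) * (5/6)**(x - 1) for x = 1, 2, 3, ..."""
    return p * (1 - p) ** (x - 1)

# Partial sums of f(x) approach 1 as more terms are included.
partial_sums = {n: sum(f(x) for x in range(1, n + 1)) for n in (5, 20, 200)}
for n, total in partial_sums.items():
    print(n, total)
```

Each partial sum equals 1 − (5/6)^n, so students can see how quickly the remaining probability shrinks.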
Representing the discrete probability function graphically (in the form of bar chart or
histogram as shown below) certainly helps students to visualize the concept.
(b) Probability density functions
At this stage, students should have a clear picture of the discrete probability function. We can extend this idea to continuous random variables and introduce the continuous probability density function (p.d.f.) f(x). Students should note that
\[
f(x) \ge 0 \quad \text{and} \quad \int_{-\infty}^{\infty} f(x)\,dx = 1.
\]
Students are expected to know that a continuous random variable X can take any value within a specified range and that it is related to the p.d.f. f(x) by
\[
P(a \le X < b) = \int_{a}^{b} f(x)\,dx.
\]
Detailed Content Time Ratio Notes on Teaching
k can be determined from \( \int_{-\infty}^{\infty} f(x)\,dx = 1 \).
Also \( P(-2 < X \le 1) = \int_{0}^{1} f(x)\,dx \) and \( P(X \ge 3) = 1 - \int_{0}^{3} f(x)\,dx = \int_{3}^{4} f(x)\,dx \).
Students may be asked to find M in terms of b if P(X ≤ M) = b. They should note that
when b = 0.5, M is the median.
Example 2
The scheduled time of arrival of a flight to a certain city is 8:00 a.m. However, the actual time of arrival is (8 + X) a.m., where X is a random variable having the following p.d.f.:
\[
f(x) =
\begin{cases}
\dfrac{3(4 - x^2)}{32} & \text{for } -2 < x < 2 \\
0 & \text{elsewhere}
\end{cases}
\]
Possible questions include finding the probability that a flight will arrive between 7:00 a.m. and 8:00 a.m., and between 9:00 a.m. and 10:00 a.m.
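The two probabilities asked for in the flight example can be worked out exactly by integrating the p.d.f. The sketch below assumes that X is measured in hours, so that 7:00 to 8:00 a.m. corresponds to −1 < X < 0 and 9:00 to 10:00 a.m. to 1 < X < 2. The helper name `cdf` is hypothetical.

```python
from fractions import Fraction

def cdf(x):
    """P(X <= x) for the p.d.f. f(t) = 3(4 - t^2)/32 on -2 < t < 2."""
    x = Fraction(x)
    if x <= -2:
        return Fraction(0)
    if x >= 2:
        return Fraction(1)
    # Integral of 3(4 - t^2)/32 from -2 to x equals (12x - x^3 + 16)/32.
    return (12 * x - x**3 + 16) / 32

early = cdf(0) - cdf(-1)  # arrival between 7:00 and 8:00 a.m.
late = cdf(2) - cdf(1)    # arrival between 9:00 and 10:00 a.m.
print(early, late)
```

Exact fractions are used so that students can compare the output directly with a hand calculation.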
It is worthwhile to spare some time to discuss with students the meaning of the term cumulative distribution function φ(t):
\[
\phi(t) = \sum_{x \le t} f(x) \quad \text{in the discrete case}
\]
and
\[
\phi(t) = \int_{-\infty}^{t} f(x)\,dx \quad \text{in the continuous case.}
\]
Examples such as the two shown below may be used to illustrate these two cases.
1. Discrete
In a throw of 2 dice, the probability of getting a sum greater than 10 is 1 − φ(10) .
( φ(10) is the probability that the sum is equal to or smaller than 10.)
2. Continuous
If φ(a ) denotes the probability that the life time of an electric bulb is smaller than a,
then P ( X < a ) = φ(a ) , P (a < X < b ) = φ(b ) − φ(a ) and P ( X > a ) = 1 − φ(a ) .
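The discrete case above can be illustrated by enumerating the 36 equally likely outcomes of two fair dice. This is a minimal sketch, assuming fair and independent dice; `cdf` is a hypothetical helper standing in for φ.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely sums of two fair dice.
sums = [a + b for a, b in product(range(1, 7), repeat=2)]

def cdf(t):
    """phi(t) = P(sum <= t)."""
    return Fraction(sum(1 for s in sums if s <= t), len(sums))

p_greater_than_10 = 1 - cdf(10)
print(p_greater_than_10)
```

Only the sums 11 and 12 exceed 10, which accounts for 3 of the 36 outcomes.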
20.2 Expectations and Variances (Time ratio: 5)
A brief revision of the meaning and physical significance of mean and standard deviation will facilitate students' learning of the concept of expectation. The meaning of expectation can be introduced through a simple example such as that shown below.
Example
A man has a probability p = 0.01 of winning a prize x = $200. We say that his chance is
worth px = ($200)⋅(0.01) = $2. Then the teacher can extend this idea to n discrete values
of X.
Teachers may also discuss with students the definition of expectation of a function of X.
The following shows the two definitions.
\[
E[g(X)] = \sum p\,g(x) \quad \text{(discrete random variable, where } p = P(X = x)\text{)}
\]
\[
E[g(X)] = \int_{-\infty}^{\infty} f(x)\,g(x)\,dx \quad \text{(continuous random variable)}
\]
In the case of discrete random variables, students are expected to know the meanings of E(X) (= µ) and E(X − µ)² (= Var(X) = σ²). In particular, teachers … dispersion of X about µ.
Interesting examples can be discussed.
Example
The probability of a candidate passing an examination at any one attempt is 0.4. If he fails, he keeps entering until he passes, and each entry costs him $120. Teachers may discuss with students the expected cost of passing the examination.
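One way to approach the examination example is to treat the number of entries as the number of attempts needed for a first success, assuming the attempts are independent with the same pass probability. Under that assumption, the sketch below sums the series for the expected number of entries; the cut-off of 500 terms is arbitrary.

```python
p = 0.4               # probability of passing at any one attempt
cost_per_entry = 120  # dollars per entry

# P(first pass occurs at attempt x) = p * (1 - p)**(x - 1)
expected_entries = sum(x * p * (1 - p) ** (x - 1) for x in range(1, 500))
expected_cost = cost_per_entry * expected_entries
print(expected_entries, expected_cost)
```

The sum agrees with the closed form 1/p = 2.5 entries, giving an expected cost of about $300.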
Calculations involving fair games, expected gain/loss are best illustrated by real-life
examples. The following are two of them.
Example 1
In an investment, a man can make a profit of $5 000 with a probability of 0.62 or a loss of
$8 000 with a probability of 0.38.
E(X) = µ and Var(X) = σ² can be calculated from the data given.
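As a worked illustration for the investment example, let X be the man's net gain in dollars (an assumed convention, with the loss counted as −8 000). Using the identity Var(X) = E(X²) − µ², the calculation might run as follows:

```latex
\begin{align*}
\mu = E(X) &= (5000)(0.62) + (-8000)(0.38) = 3100 - 3040 = 60, \\
E(X^2) &= (5000)^2(0.62) + (-8000)^2(0.38) = 15\,500\,000 + 24\,320\,000 = 39\,820\,000, \\
\sigma^2 = \operatorname{Var}(X) &= E(X^2) - \mu^2 = 39\,820\,000 - 3\,600 = 39\,816\,400, \\
\sigma &\approx 6310.
\end{align*}
```

The expected gain is small compared with the standard deviation, which is a useful point for class discussion about risk.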
Example 2
A gambling machine has four windows and each of them displays one of the four
different colours: red, orange, yellow and blue. Each of the colours is equally likely to be
displayed and the colour displayed by the machine on one window is independent of the
colour displayed on the other windows. A man pays $a for a game. He receives $5 if all
the colours displayed in the four windows are different. He receives $30 if all the colours
displayed are the same. In all other cases, he loses. $X is the net amount he received in
playing a game.
Most of the students should realize that E ( X ) < 0 in most of the gambling games.
Teachers may also ask students to work out the new µ and σ² when all the amounts of money are doubled and to find the relations between the new and old parameters. For abler students, teachers may ask them to guess the value of E(Y), where $Y is the net amount the man receives if he plays the game twice.
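The expected receipt in the gambling machine example can be found by listing all 4⁴ = 256 equally likely displays. The sketch below assumes that the man receives nothing in "all other cases" and that the net amount is the receipt minus the stake $a; the function name `payout` is hypothetical.

```python
from fractions import Fraction
from itertools import product

colours = ["red", "orange", "yellow", "blue"]
displays = list(product(colours, repeat=4))  # 256 equally likely displays

def payout(windows):
    """Amount received (before deducting the stake) for one display."""
    distinct = len(set(windows))
    if distinct == 4:   # all four colours different
        return 5
    if distinct == 1:   # all four colours the same
        return 30
    return 0            # assumed: nothing is received otherwise

expected_receipt = Fraction(sum(payout(w) for w in displays), len(displays))
print(expected_receipt)  # E(X) is this value minus the stake a
```

Under these assumptions E(X) = 15/16 − a, so the game is fair only when the stake is 15/16 of a dollar.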
Example
Orange juice is delivered to a fast food shop every morning. The daily demand for
orange juice is a continuous random variable X distributed with a probability density
function f ( x ) of the form
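\[
f(x) =
\begin{cases}
ax(b - x) & \text{for } 0 \le x \le 1 \\
0 & \text{elsewhere}
\end{cases}
\]
with
\[
\int_{-\infty}^{\infty} f(x)\,dx = 1 \quad \text{and} \quad \int_{-\infty}^{\infty} x f(x)\,dx = 0.625.
\]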
The orange juice container at this fast food shop is filled to its total capacity of 0.8 units every morning.
The probability P that, on a given day, the fast food shop cannot meet the demand for orange juice is given by
\[
P = 1 - \phi(0.8) = \int_{0.8}^{1} f(x)\,dx.
\]
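The orange juice example can be carried through numerically. The sketch below assumes the two integral conditions (total probability 1 and mean 0.625) are what determine a and b; integrating term by term gives a(b/2 − 1/3) = 1 and a(b/3 − 1/4) = 0.625, which are solved with exact fractions. The helper name `cdf` is hypothetical.

```python
from fractions import Fraction

mean = Fraction(5, 8)  # 0.625

# Dividing the second condition by the first gives an equation linear in b:
#   b/3 - 1/4 = mean * (b/2 - 1/3)
b = (Fraction(1, 4) - mean * Fraction(1, 3)) / (Fraction(1, 3) - mean * Fraction(1, 2))
a = 1 / (b / 2 - Fraction(1, 3))

def cdf(t):
    """phi(t) = integral of a*x*(b - x) from 0 to t, for 0 <= t <= 1."""
    t = Fraction(t)
    return a * (b * t**2 / 2 - t**3 / 3)

shortfall = 1 - cdf(Fraction(4, 5))  # P(demand exceeds the 0.8 units available)
print(a, b, float(shortfall))
```

With these conditions the density works out as f(x) = 1.5x(2 − x) on 0 ≤ x ≤ 1, which is non-negative throughout the range as a p.d.f. must be.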
For abler students, teachers may guide them to prove the two formulae E[ag(X) + b] = aE[g(X)] + b and Var(aX + b) = a² Var(X), where a, b are constants. Also, it is not difficult for an average student to show that
\[
E[g(X) + h(X)] = E[g(X)] + E[h(X)]
\]
and
\[
\operatorname{Var}(X) = E(X^2) - [E(X)]^2 = E(X^2) - \mu^2.
\]
Example
Given Z = 2X² − 3X + 5, where X is a random variable with mean µ and variance σ², E(Z) can be obtained from
\[
E(Z) = E(2X^2 - 3X + 5) = E(2X^2) - E(3X) + E(5) = 2(\mu^2 + \sigma^2) - 3\mu + 5.
\]
20.3 Binomial Distribution (Time ratio: 7)
(a) Bernoulli trials, Binomial probability
Teachers can introduce Bernoulli trials by using the familiar example of tossing a fair coin. Teachers should emphasize that in a Bernoulli trial, there are only two possible outcomes.
Repeated Bernoulli trials play an important role in probability and statistics, especially when the probabilities of the two possible outcomes are the same for each trial. Students should know that the probability of r successes in n trials is given by the expression
\[
P(r \text{ successes}) = C_r^n\, p^r q^{n-r},
\]
where p is the probability of success in each trial and q = 1 − p.
Teachers can easily quote numerous examples in our daily life to illustrate the Binomial
probability.
Example 1
A die is thrown n times. In order that the probability of getting at least one ‘six’ is greater
than 0.99, n should satisfy the following inequality:
\[
1 - \left(\frac{5}{6}\right)^{n} > 0.99
\]
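The least number of throws satisfying the inequality above can be found by direct search. This is a small sketch of one possible classroom check, not the only method; logarithms give the same answer.

```python
# Find the least n with P(at least one 'six' in n throws) > 0.99.
n = 1
while 1 - (5 / 6) ** n <= 0.99:
    n += 1
print(n)
```

Solving (5/6)^n < 0.01 by logarithms gives n > 25.26, so the search and the algebra agree on the least whole number of throws.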
Example 2
r balls are randomly distributed in n cells. Students may be asked to find the probability \( P_k \) that a specified cell contains exactly k balls (k ≤ r). In this case, students are expected to know that \( P_k = P(k \text{ successes in } r \text{ Bernoulli trials}) \) with \( p = \frac{1}{n} \).
(b) Binomial distribution
At this stage, teachers may point out that the Binomial distribution arises from repeated Bernoulli trials with the same probability of success. Teachers may also introduce the notation B(n, p) for the distribution.
Students are expected to know the formulae E(X) = np and Var (X) = npq for
Binomial distribution.
The probability graph of a Binomial distribution with different values of n and p can
be shown. Students should be able to see that when p = q = 0.5, the graph is symmetric.
For abler students, the mode of Binomial distribution can also be discussed.
Example 1
A student sits for a test which contains only 4 multiple choice questions. With his
knowledge of the subject, he has a probability of 0.7 of knowing the correct answer of
each question. There are 5 options in each question, thus the student has the probability
0.2 of getting the correct answer in each question through guessing. He has attempted
all the questions.
The probability that the student knows the correct answers of 3 questions is \( C_3^4 (0.7)^3 (0.3) \).
Since the student can get the correct answer of a question simply by guessing, P(correct answer for a question) = p can be calculated from the two cases (a) he knows the answer and (b) he does not know it but guesses it correctly.
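The two quantities in the multiple choice example can be computed as follows. The sketch assumes that the student guesses whenever he does not know the answer and that the questions are independent; the variable names are illustrative only.

```python
from math import comb

n = 4          # number of questions
p_know = 0.7   # probability of knowing the answer to a question
p_guess = 0.2  # probability of a correct guess among 5 options

# P(he knows the answers of exactly 3 of the 4 questions)
p_know_three = comb(n, 3) * p_know**3 * (1 - p_know)

# P(correct answer for a question): knows it, or does not know but guesses right
p_correct = p_know + (1 - p_know) * p_guess

# Distribution of the number of correct answers, B(4, p_correct)
pmf = [comb(n, k) * p_correct**k * (1 - p_correct) ** (n - k) for k in range(n + 1)]
print(p_know_three, p_correct, pmf)
```

The number of correct answers then follows B(4, 0.76) rather than B(4, 0.7), which is the point of distinguishing the two cases.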
Example 2
5% of light bulbs are defective. A large batch of light bulbs is tested according to the
following rules.
(a) A sample of 10 light bulbs is tested.
(i) If two or more light bulbs are defective, then the whole batch is rejected.
(ii) If there is no defective light bulb, the whole batch is accepted.
(iii) If there is only one defective light bulb, try rule (b).
(b) Another sample of 10 bulbs is tested. If there is no defective bulb, the whole batch is
accepted; otherwise it is rejected.
If X is the number of light bulbs examined, then it is not difficult to find \( P(X = 20) = 10(0.95)^9(0.05) \) and \( P(X = 10) = 1 - 10(0.95)^9(0.05) \). Students may be asked to find E(X) and Var(X).
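Since X takes only the values 10 and 20, E(X) and Var(X) follow directly from the two probabilities. The sketch below is one way to carry out the arithmetic, assuming the bulbs are defective independently of one another.

```python
n, p_defective = 10, 0.05

# A second sample is drawn only when the first has exactly one defective bulb.
p_twenty = n * p_defective * (1 - p_defective) ** (n - 1)
pmf = {10: 1 - p_twenty, 20: p_twenty}

mean = sum(x * pr for x, pr in pmf.items())
variance = sum(x**2 * pr for x, pr in pmf.items()) - mean**2
print(p_twenty, mean, variance)
```

Because X = 10 + 10I, where I indicates whether a second sample is taken, the results can be checked against E(X) = 10 + 10·P(X = 20) and Var(X) = 100·P(X = 20)·P(X = 10).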
Example 3
10% of the items produced by a machine are defective. The items are packed in large
batches, and a batch is accepted if a sample of n items from it contains no defectives;
otherwise it is rejected.
The least value of n to ensure the probability that the batch will be rejected is at least 0.95 satisfies \( (0.9)^n < 1 - 0.95 \). If n = 10, then P(the batch being accepted) = \( (0.9)^{10} \) = P.
20.4 Normal Distribution (Time ratio: 10)
(a) Basic definitions
The normal distribution is a very important example of a continuous probability distribution. Its p.d.f. f(x) is
\[
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^{2}\right]
\]
Students are expected to recognize that E ( X ) = µ and Var( X ) = σ2 , but the proof is
not necessary.
It is worthwhile for teachers to discuss with students why normal distribution is commonly
used in many subjects.
1. Easy to use.
2. Can be used as an approximation to other distributions.
Graphs with different µ and σ can be introduced. Students should realize that all the graphs shown are bell-shaped and are symmetric about x = µ. The notation N(µ, σ²), which means a normal distribution with mean µ and variance σ², may be introduced.
(b) Standard normal curve and the use of normal table
The normal distribution depends on µ and σ. Students should find that it is difficult to tabulate the probability function of each normal distribution with a different set of parameters. Therefore, it is necessary to express the random variable in standard units, using the transformation \( Z = \frac{X - \mu}{\sigma} \). Students should have no difficulty in seeing that E(Z) = 0, Var(Z) = 1 and
\[
P(a < X < b) = P\left(\frac{a - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{b - \mu}{\sigma}\right) = P(z_1 < Z < z_2)
\]
This area, for different values of z₁, is tabulated in the normal distribution table (the table only gives values up to z₁ = 3.59). Adequate practice is necessary to ensure that students can use the table properly.
Example
X is N(8, 4)
\[
P(6 < X < 9) = P\left(\frac{6 - 8}{2} < \frac{X - 8}{2} < \frac{9 - 8}{2}\right) = P(-1 < Z < 0.5) = P(0 < Z < 1) + P(0 < Z < 0.5)
\]
\[
P(X > 9) = P(Z > 0.5) = 0.5 - P(0 < Z < 0.5)
\]
In P(X < k) = 0.87, k can be obtained with greater accuracy if the method of linear interpolation is used.
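Table look-ups for this example can be cross-checked by computer. The sketch below uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later) in place of the printed table; note that N(8, 4) has variance 4, so the standard deviation passed in is 2.

```python
from statistics import NormalDist

X = NormalDist(mu=8, sigma=2)  # N(8, 4): variance 4, standard deviation 2

p_between = X.cdf(9) - X.cdf(6)  # P(6 < X < 9)
p_above = 1 - X.cdf(9)           # P(X > 9)
k = X.inv_cdf(0.87)              # value of k with P(X < k) = 0.87
print(round(p_between, 4), round(p_above, 4), round(k, 3))
```

Comparing the computed k with the value obtained from the table by linear interpolation gives students a sense of how accurate interpolation is.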
Teachers can remind students that in solving many of the problems, they have to
make use of symmetry and laws of complementary probability. Moreover, for z1 involving
more than 3 significant figures, the method of linear interpolation should be used.
(c) Applications
The standard normal distribution is essential in daily applications. Teachers should provide adequate demonstration. Examples like the following may be used.
Example 1
A manufacturer uses a machine to produce resistors. He found that 10% of the resistors
are less than 95Ω and 20% of the resistors are above 110Ω. The distribution of the
resistances X is normal.
µ and σ can be calculated from the two equations
\[
P(X < 95) = P\left(Z < \frac{95 - \mu}{\sigma}\right) = 0.1
\]
\[
P(X > 110) = P\left(Z > \frac{110 - \mu}{\sigma}\right) = 0.2
\]
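The two equations for the resistor example reduce to a pair of linear equations in µ and σ once the corresponding z-values are known. The sketch below obtains the z-values from `statistics.NormalDist` rather than from the printed table, so the answers may differ slightly from table-based ones.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, N(0, 1)

z_low = Z.inv_cdf(0.10)   # (95 - mu) / sigma, since P(Z < z_low) = 0.1
z_high = Z.inv_cdf(0.80)  # (110 - mu) / sigma, since P(Z > z_high) = 0.2

# Subtracting the two equations eliminates mu.
sigma = (110 - 95) / (z_high - z_low)
mu = 95 - z_low * sigma
print(round(mu, 2), round(sigma, 2))
```

The same approach applies to any problem in which two percentiles of a normal distribution are given.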
Example 2
Suppose X, the length of a rod, is a normally distributed random variable with mean µ
and variance 1. If X does not meet certain specifications, then the manufacturer will
suffer a loss. Specifically, the profit M (per rod) is the following function of X.
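\[
M =
\begin{cases}
3 & \text{if } 8 \le X \le 10 \\
-1 & \text{if } X < 8 \\
-5 & \text{if } X > 10
\end{cases}
\]
The expected profit, E(M), is given by \( E(M) = 8\phi(10 - \mu) - 4\phi(8 - \mu) - 5 \), where
\[
\phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}t^{2}}\,dt
\]
is the cumulative probability function.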
Suppose that the manufacturing process can be adjusted so that different values of µ
may be achieved. The value of µ corresponding to maximum profit can be determined by
differentiating E(M) with respect to µ.
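The differentiation mentioned above might be carried out as in the following sketch, which uses the fact that the derivative of the cumulative probability function φ is the standard normal p.d.f.:

```latex
\begin{align*}
\frac{d}{d\mu}E(M) &= -8\,\phi'(10-\mu) + 4\,\phi'(8-\mu) = 0,
  \qquad \phi'(z) = \frac{1}{\sqrt{2\pi}}\,e^{-z^{2}/2}, \\
e^{-(8-\mu)^{2}/2} &= 2\,e^{-(10-\mu)^{2}/2}, \\
(10-\mu)^{2} - (8-\mu)^{2} &= 2\ln 2, \\
36 - 4\mu &= 2\ln 2, \\
\mu &= 9 - \tfrac{1}{2}\ln 2 \approx 8.65.
\end{align*}
```

The derivative is positive at µ = 8 and negative at µ = 10, so this stationary value corresponds to a maximum of E(M).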
Example 3
A factory produces soft drinks contained in bottles. The normal volume contained in a
bottle is 1.25 litres. However, due to random fluctuations in the automatic bottling
machine, the actual volume per bottle varies according to a normal distribution. It is
observed that 15% of the bottles contain less than 1.25 litres whereas 10% contain more
than 1.30 litres.
Students should have no difficulty in finding the mean µ and standard deviation σ of the
volume distribution.
(d) Binomial approximated to normal distribution
It should be made clear to students that binomial probabilities can be calculated by using the normal approximation only when n is large. In this case, the mean and variance can be taken as np and npq respectively. Students should also be reminded that 'end continuity corrections' are required in this approximation.
Example
A coin is tossed 400 times.
Students may be interested to know that P (195 ≤ X ≤ 210) ≠ P (195 < X < 210) .
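The difference between the two probabilities in the coin example can be made concrete by computing both, exactly and by the normal approximation with continuity corrections. The sketch below assumes a fair coin and takes X as the number of heads (neither is stated explicitly in the example); the function names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 400, 0.5
q = 1 - p
normal = NormalDist(mu=n * p, sigma=sqrt(n * p * q))  # N(200, 100)

def exact(lo, hi):
    """Exact binomial probability P(lo <= X <= hi)."""
    return sum(comb(n, k) * p**k * q ** (n - k) for k in range(lo, hi + 1))

def approx(lo, hi):
    """Normal approximation to P(lo <= X <= hi) with continuity correction."""
    return normal.cdf(hi + 0.5) - normal.cdf(lo - 0.5)

# P(195 <= X <= 210) uses 194.5 to 210.5;
# P(195 < X < 210) = P(196 <= X <= 209) uses 195.5 to 209.5.
inclusive = approx(195, 210)
strict = approx(196, 209)
print(round(inclusive, 4), round(strict, 4))
print(round(exact(195, 210), 4), round(exact(196, 209), 4))
```

The two answers differ by roughly 0.06, which shows why the end points matter for a discrete variable even though they would not for a continuous one.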
20.5 Linear Combination of Independent Normal Variables (Time ratio: 6)
Students should recognize that the sum of scalar multiples of independent normal variables is also normal. From this, it is not hard to see that:
If X and Y are two independent normal variables such that \( X \sim N(\mu_1, \sigma_1^2) \) and \( Y \sim N(\mu_2, \sigma_2^2) \), then
\[
aX + bY \sim N\left(a\mu_1 + b\mu_2,\; a^{2}\sigma_1^{2} + b^{2}\sigma_2^{2}\right)
\]
for any real values a and b.
The above result can be extended easily to n independent normal variables. Teachers
should also quote examples to illustrate the usefulness of the above fact in daily-life
application.
Example 1
Cakes are sold in packets of 6. The mass of each cake is a normal variable with mean
25 g and standard deviation 2 g. The mass of the packing material is a normal variable
with mean 30 g and standard deviation 4 g. Find the distribution of the total mass of each
packet of cakes and hence find the probability that the total mass of a packet is less than
142 g.
Example 2
The thickness, A cm, of a paperback is normally distributed with mean 2 cm and variance 0.63 cm². The thickness, B cm, of a hardback is normally distributed with mean 5 cm and variance 1.42 cm². The distribution of X = 2A − B can be determined, and hence the probability that a randomly chosen paperback is less than half the thickness of a randomly chosen hardback can be evaluated by using the standard normal distribution table.
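For the paperback and hardback example, the event "paperback less than half as thick as hardback" is A < B/2, that is, X = 2A − B < 0. The sketch below applies the linear combination result with a = 2 and b = −1, assuming A and B are independent, and uses `statistics.NormalDist` instead of the printed table.

```python
from math import sqrt
from statistics import NormalDist

a, b = 2, -1                 # X = 2A - B
mean_A, var_A = 2, 0.63      # paperback thickness
mean_B, var_B = 5, 1.42      # hardback thickness

mean_X = a * mean_A + b * mean_B       # a*mu1 + b*mu2
var_X = a**2 * var_A + b**2 * var_B    # a^2*sigma1^2 + b^2*sigma2^2

X = NormalDist(mu=mean_X, sigma=sqrt(var_X))
p_half = X.cdf(0)                      # P(2A - B < 0) = P(A < B/2)
print(mean_X, round(var_X, 2), round(p_half, 4))
```

Note that the variances add even though B is subtracted, because the coefficient b enters as b².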