Unit 1
Objective(s)
By the end of this unit you should be able to:
Define and explain a random variable
Identify the types of random variables
State the properties of discrete and continuous random variables.
Explain the difference between discrete and continuous random variables
Perform statistical calculations involving random variables
When we perform an experiment, we are interested not in the particular outcome that occurs,
but rather in some number associated with that outcome.
For example, in the game of craps, a player is interested not in the particular numbers on the
two dice, but in their sum. In tossing a coin 50 times, we may be interested only in the
number of heads obtained, and not in the particular sequence of heads and tails that constitute
the result of 50 tosses.
In both examples, we have a rule which assigns to each outcome of the experiment a single
real number. Hence, we can say that a function is defined.
You are already familiar with the concept of a function. Now we are going to look at some
functions that are particularly useful in the study of probabilistic and statistical problems.
SECTION 1: Definition and Explanation of a Random Variable
In probability theory, certain functions of special interest are given special names.
A probability distribution is a table of values showing the probabilities of various outcomes
of an experiment.
For example, if a coin is tossed three times, the number of heads obtained can be 0, 1, 2 or 3.
The probabilities of each of these possibilities can be tabulated as shown:
Table 1.1: Probability distribution for the sample space for tossing three coins.
Number of heads   0     1     2     3
Probability      1/8   3/8   3/8   1/8
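As a quick check, this distribution can be reproduced by enumerating the eight equally likely outcomes directly. The following Python sketch (illustrative only; not part of the original unit) tallies the number of heads over all outcomes:

```python
from fractions import Fraction
from itertools import product

# Enumerate all 2**3 equally likely outcomes of three coin tosses
outcomes = list(product("HT", repeat=3))

# Count how many outcomes give each number of heads
pmf = {}
for outcome in outcomes:
    heads = outcome.count("H")
    pmf[heads] = pmf.get(heads, 0) + Fraction(1, len(outcomes))

for k in sorted(pmf):
    print(k, pmf[k])
```

Running it recovers the probabilities 1/8, 3/8, 3/8 and 1/8 in Table 1.1.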
A discrete variable is a variable which can only take a countable number of values. In this
example, the number of heads can only take 4 values (0, 1, 2, 3) and so the variable is
discrete. The variable is said to be random if the sum of the probabilities is one.
A function whose domain is a sample space and whose range is some set of real numbers is
called a random variable.
If the random variable is denoted by X and has the sample space Ω = {O1 , O2 , .., On } as
domain, then we write X(Ok ) for the value of X at element Ok. Thus, X(Ok ) is the real
number that the function rule assigns to the element Ok of Ω.
Let's look at some examples of random variables.
Example 1.1: Let Ω = {1, 2, 3, 4, 5, 6} and define X as follows:
X(1) = X(2) = X(3) = 1, X(4) = X(5) = X(6) = –1
Then X is a random variable whose domain is the sample space Ω and whose range is
the set {1, -1}. X can be interpreted as the gain of a player in a game in which a die is rolled,
the player winning GH₵ 1 if the outcome is 1, 2, or 3 and losing GH₵ 1 if the outcome is 4,
5, or 6.
Example 1.2: Two dice are rolled and we define the familiar sample space
Ω = {(1, 1), (1, 2), … (6, 6)}
containing 36 elements. Let X denote the random variable whose value for any element of Ω
is the sum of the numbers on the two dice.
Then the range of X is the set containing the 11 values of X:
{2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
Each ordered pair of Ω has associated with it exactly one element of the range as
required by the definition. But, in general, the same value of X arises from many different
outcomes.
For example, X(Ok) = 5 when Ok is any one of the four elements of the event
{(1, 4), (2, 3), (3, 2), (4, 1)}
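This many-to-one behaviour is easy to see by enumeration. The sketch below (an illustration, not part of the original text) builds the pmf of the sum of two dice over the 36 equally likely ordered pairs:

```python
from fractions import Fraction
from itertools import product

# X((a, b)) = a + b; tally P(X = x) over the 36 equally likely ordered pairs
pmf = {}
for a, b in product(range(1, 7), repeat=2):
    pmf[a + b] = pmf.get(a + b, 0) + Fraction(1, 36)

print(sorted(pmf))   # the 11 possible values 2, 3, ..., 12
print(pmf[5])        # 1/9, i.e. 4/36, from (1,4), (2,3), (3,2), (4,1)
```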
Example 1.3: A coin is tossed, and then tossed again. We define the sample space
Ω = {HH, HT, TH, TT}
If X is the random variable whose value for any element of Ω is the number of heads
obtained, then
𝑋 (𝐻𝐻) = 2, 𝑋 (𝐻𝑇) = 𝑋(𝑇𝐻) = 1, 𝑋(𝑇𝑇) = 0
Notice that more than one random variable can be defined on the same sample space. For
example, let Y denote the random variable whose value for any element of Ω is the number of
heads minus the number of tails. Then
Y(HH) = 2, Y(HT) = Y(TH) = 0, Y(TT) = −2
Suppose now that a sample space
Ω = {𝑂1, 𝑂2, …, 𝑂𝑛 }
is given, and that some acceptable assignment of probabilities has been made to the sample
points in Ω. Then if 𝑋 is a random variable defined on Ω, we can ask for the probability that
the value of 𝑋 is some number, say 𝑥.
The event that X has the value x is the subset of Ω containing those elements Ok for
which X(Ok) = x. If we denote by f(x) the probability of this event, then
f(x) = P({Ok ∈ Ω | X(Ok) = x})    (1)
Because this notation is cumbersome, we shall write
𝑓(𝑥) = 𝑃(𝑋 = 𝑥), (2)
adopting the shorthand “X = x” to denote the event written out in (1)
For instance, let X be the age of a randomly selected student here today and Y be the number
of planes completed in the past week. A random variable that takes on a finite or countably
infinite number of values is called a discrete random variable (e.g. Y), while one which
takes on an uncountably infinite number of values is called a continuous random variable
(e.g. X).
Example 1.5: Examples of discrete random variables.
1. The number of broken eggs in a batch.
2. The number of bits in error in a transmitted message.
Example 1.6: Examples of continuous random variables.
1. The current in a copper wire.
2. The length of a manufactured part.
Probability Mass Function
The probability mass function of a discrete random variable 𝑋 is a function that assigns
probability to each member of 𝑋 and is represented as:
𝑓(𝑥) = 𝑃(𝑋 = 𝑥)
Example 1.7: What is the probability mass function of the random variable that counts the
number of heads on 3 tosses of a fair coin?
Solution
The range of the variable is {0, 1, 2, 3}.
P(X = 0) = 1 × (1/2)³ = 1/8
P(X = 1) = 3 × (1/2)³ = 3/8
P(X = 2) = 3 × (1/2)³ = 3/8
P(X = 3) = 1 × (1/2)³ = 1/8
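The counting factor in each line is the binomial coefficient C(3, k): the number of ways to choose which k of the 3 tosses land heads. A short, illustrative Python sketch of the same computation:

```python
from fractions import Fraction
from math import comb

# P(X = k) = C(3, k) * (1/2)**3: choose which k of the 3 tosses are heads
pmf = {k: comb(3, k) * Fraction(1, 2) ** 3 for k in range(4)}

for k in sorted(pmf):
    print(k, pmf[k])
```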
Example 1.8: Consider the following game. A fair 4-sided die, with the numbers 1, 2, 3, 4 is
rolled twice. If the score on the second roll is strictly greater than the score on the first the
player wins the difference in euro. If the score on the second roll is strictly less than the score
on the first roll, the player loses the difference in euro. If the scores are equal, the player
neither wins nor loses. If we let 𝑋 denote the (possibly negative) winnings of the player, what
is the probability mass function of 𝑋? (𝑋 can take any of the values −3, −2, −1, 0, 1, 2, 3.)
Solution
The total number of outcomes of the experiment is 4 × 4 = 16.
P(X = 0): X will take the value 0 for the outcomes (1, 1), (2, 2), (3, 3), (4, 4). So f(0) = 4/16.
P(X = 1): X will take the value 1 for the outcomes (1, 2), (2, 3), (3, 4). So f(1) = 3/16.
P(X = 2): X will take the value 2 for the outcomes (1, 3), (2, 4). So f(2) = 2/16.
P(X = 3): Similarly, f(3) = 1/16.
By symmetry, f(−1) = 3/16, f(−2) = 2/16 and f(−3) = 1/16.
Table 1.2: The summary of all the probabilities for each member of X
x      −3    −2    −1    0     1     2     3
f(x)   1/16  2/16  3/16  4/16  3/16  2/16  1/16
A function f can only be a probability mass function if it satisfies the following conditions:
1. Since f(x) represents the probability that the variable X takes the value x, f(x) can
never be negative or greater than one, i.e. 0 ≤ f(x) ≤ 1 for all x.
2. If we sum f(x) over all values x in the range of X, the total must equal one:
∑ₓ f(x) = ∑ₓ P(X = x) = 1
x      −3    −2    −1    0     1     2     3
f(x)   1/16  2/16  3/16  4/16  3/16  2/16  1/16

∑ f(x) = 1/16 + 2/16 + 3/16 + 4/16 + 3/16 + 2/16 + 1/16 = 1
Example 1.9: Let X be a continuous random variable with the following pdf
f(x) = ce^(−x) for x ≥ 0, and f(x) = 0 otherwise,
where c is a positive constant.
i) Find 𝑐
ii) Find 𝑃(1 < 𝑋 < 3)
Solution
i) To find c, we can use the second property above, that is,
1 = ∫_{−∞}^{∞} f(x) dx = ∫_0^∞ ce^(−x) dx = c[−e^(−x)]_0^∞ = c
Thus, we must have c = 1.
ii) P(1 < X < 3) = ∫_1^3 e^(−x) dx = [−e^(−x)]_1^3 = e^(−1) − e^(−3) ≈ 0.318
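A numerical integration provides a useful sanity check on this kind of calculation. The sketch below (illustrative; not part of the original unit) approximates P(1 < X < 3) for the pdf with c = 1 by a midpoint-rule sum and compares it with the closed form:

```python
import math

# With c = 1 the pdf is f(x) = exp(-x) for x >= 0, so P(1 < X < 3)
# has the closed form e**-1 - e**-3.  A midpoint-rule sum over many
# small subintervals should agree closely with that exact value.
def f(x):
    return math.exp(-x)

a, b, n = 1.0, 3.0, 100_000
h = (b - a) / n
approx = h * sum(f(a + (i + 0.5) * h) for i in range(n))
exact = math.exp(-1) - math.exp(-3)
print(approx, exact)   # both about 0.3181
```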
SECTION 2: Discrete Random Variable and its Property
Dear learners, welcome to this section. In Section 1 we discussed random variables as variables
associated with probability, and classified variables as discrete or continuous by observing the
values the variable can assume. Some decisions in business, insurance, and other real-life
situations are made by assigning probabilities to all possible outcomes pertaining to the
situation and then evaluating the results. This section explains the concepts and applications
of a discrete probability distribution.
Objective(s)
By the end of this section you should be able to:
Describe a discrete random variable
Construct a probability distribution for a discrete random variable
Example 2.1: A coin is tossed, and then tossed again. We define the sample space
Ω = {HH, HT, TH, TT}
If X is the random variable whose value for any element of Ω is the number of heads
obtained, find the probability function corresponding to the random variable 𝑋 assuming that
the coin is fair.
Solution
From Example 1.3 in section 1, we have
P(HH) = 1/4, P(HT) = 1/4, P(TH) = 1/4, P(TT) = 1/4
Then
P(X = 0) = P(TT) = 1/4
P(X = 1) = P(HT) + P(TH) = 1/2
P(X = 2) = P(HH) = 1/4

x      0     1     2
f(x)   1/4   1/2   1/4
A probability distribution for a random variable describes how probabilities are distributed
over the values of a random variable. We can describe a discrete probability distribution with
a table, graph or equation.
Discrete probability distributions describe populations but not samples.
Example 2.2: Let the random variable X represent the number of girls in a family of four
children. Construct a table describing the probability distribution.
Solution
The total number of outcomes is 2⁴ = 16.
The number of ways to have 0 girls is 1, so P(0 girls) = 1/16.
The number of ways to have 1 girl is 4, so P(1 girl) = 4/16.
The number of ways to have 2 girls is 6, so P(2 girls) = 6/16.
The number of ways to have 3 girls is 4, so P(3 girls) = 4/16.
The number of ways to have 4 girls is 1, so P(4 girls) = 1/16.
Table 2.2: The probability distribution
x      0     1     2     3     4
f(x)   1/16  4/16  6/16  4/16  1/16

Check:
∑ f(x) = 1/16 + 4/16 + 6/16 + 4/16 + 1/16 = 1
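The counts 1, 4, 6, 4, 1 can also be found by listing all birth sequences. This Python sketch (illustrative; not part of the original text) enumerates the 16 equally likely sequences and counts girls:

```python
from fractions import Fraction
from itertools import product

# Enumerate the 2**4 = 16 equally likely birth sequences and count girls
pmf = {}
for family in product("GB", repeat=4):
    girls = family.count("G")
    pmf[girls] = pmf.get(girls, 0) + Fraction(1, 16)

for g in sorted(pmf):
    print(g, pmf[g])
```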
Example 2.3: Represent graphically the probability distribution for the sample space for
tossing three coins as in table 1.1.
Solution
The values that X assumes are located on the x axis, and the values for P(X) are located on
the y axis.
Figure 2.1: Probability distribution for the sample space for tossing three coins
Example 2.4: Suppose the range of a discrete random variable is {0, 1, 2, 3, 4}. If the
probability mass function is 𝑓 (𝑥) = 𝑐𝑥 for 𝑥 = 0, 1, 2, 3, 4, what is the value of 𝑐?
Solution
First of all, c ≥ 0 since f(x) ≥ 0. Then,
f(0) + f(1) + f(2) + f(3) + f(4) = 1
c(0 + 1 + 2 + 3 + 4) = 1
10c = 1
c = 1/10
SECTION 3: Examples of Discrete Random Variables
Dear learners, welcome to this section. At this stage, we can boast of having adequate
knowledge of random variables, their two types, and probability distributions. In this section,
we will explore more examples of discrete random variables. We will find the mean,
variance, standard deviation and expected value of a discrete random variable.
Objective(s)
By the end of this section you should be able to find the mean, variance, standard deviation
and expected value of a discrete random variable.
The expected value, or mean, of a discrete random variable is a measure of its central location
and is given by:
E(X) = μ = ∑ x P(x)
The variance summarises the variability in the values of a discrete random variable. It is the
probability-weighted average of the squared deviations of the x's from μ:
var(X) = σ² = ∑ (x − μ)² P(x)
or equivalently
var(X) = σ² = ∑ x² P(x) − μ²
The standard deviation, 𝜎, is defined as the positive square root of the variance.
Example 3.1: Calculate the mean, variance and standard deviation for the discrete probability
distribution below:
x      0     1     2     3     4
P(x)   1/16  4/16  6/16  4/16  1/16
Solution
mean, μ = ∑ x P(x) = 0(1/16) + 1(4/16) + 2(6/16) + 3(4/16) + 4(1/16) = 32/16 = 2
var(X) = σ² = ∑ x² P(x) − μ² = [0 + 4/16 + 24/16 + 36/16 + 16/16] − 2² = 5 − 4 = 1
SD = σ = √σ² = √1 = 1
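These three steps translate directly into code. The sketch below (illustrative; not part of the original unit) computes the mean, variance and standard deviation with exact fractions:

```python
from fractions import Fraction
import math

xs = [0, 1, 2, 3, 4]
ps = [Fraction(n, 16) for n in (1, 4, 6, 4, 1)]

mean = sum(x * p for x, p in zip(xs, ps))       # mu = sum of x * P(x)
e_x2 = sum(x**2 * p for x, p in zip(xs, ps))    # E(X^2)
var = e_x2 - mean**2                            # sigma^2 = E(X^2) - mu^2
sd = math.sqrt(var)
print(mean, var, sd)   # 2 1 1.0
```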
Example 3.2: Jackson is a midfielder in a division one team. He scores goals on a regular
basis. The probabilities that he will score 0, 1, 2, or 3 goals in his next match are
0.25, 0.35, 0.25, and 0.15 respectively.
a) What is the probability that Jackson will score fewer than 3 goals tonight?
b) What is the most likely number of goals that Jackson will score?
c) What is the probability that Jackson will score at least 1 goal tonight?
Solution
Let X be the number of goals Jackson scores.
a) The probability that X is fewer than 3 is
P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2)
         = 0.25 + 0.35 + 0.25
         = 0.85
b) The most likely number of goals is the value of x with the largest probability. Since
P(X = 1) = 0.35 is the largest, the most likely number of goals is 1.
c) P(X ≥ 1) = 1 − P(X = 0) = 1 − 0.25 = 0.75
The expected number of goals can also be computed:

x     P(x)    x · P(x)
0     0.25    0
1     0.35    0.35
2     0.25    0.50
3     0.15    0.45

mean, μ = ∑ x P(x) = 0 + 0.35 + 0.50 + 0.45 = 1.3
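All four quantities asked about in this example can be read off a pmf stored as a dictionary. A small illustrative sketch (not part of the original unit):

```python
pmf = {0: 0.25, 1: 0.35, 2: 0.25, 3: 0.15}

p_fewer_than_3 = sum(p for x, p in pmf.items() if x < 3)   # part a)
most_likely = max(pmf, key=pmf.get)                        # part b): the mode
p_at_least_1 = 1 - pmf[0]                                  # part c): complement rule
mean = sum(x * p for x, p in pmf.items())                  # expected goals

print(p_fewer_than_3, most_likely, p_at_least_1, mean)
```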
Section 4: Continuous Random Variable and its Properties
Welcome to this section of unit 1. In section 2 we explained discrete variables and their
distributions. Remember that a discrete variable cannot assume all values between any two
given values of the variables. In this section we will learn about continuous random variables
and their properties.
Objective(s)
By the end of this section you should be able to:
Describe continuous random variable
State the properties of a continuous probability function
Perform calculations on continuous random variables
Discrete random variables commonly arise from situations that involve counting. Situations
that involve measuring often result in a continuous random variable.
Example 4.1: Let X be a continuous random variable with pdf f(x) = 3x² for 0 ≤ x ≤ 1,
and f(x) = 0 otherwise.
a) Find P(0 ≤ X ≤ 1/3).
b) Find P(2/3 ≤ X ≤ 1).
Solution
First we verify that f is a valid density. Recall the two properties of f(x):
i) f(x) is always non-negative, which holds since 3x² ≥ 0;
ii) the integral of f(x) over its domain 0 ≤ x ≤ 1 must equal 1:
∫_0^1 3x² dx = [x³]_0^1 = 1
a) P(0 ≤ X ≤ 1/3) = ∫_0^{1/3} 3x² dx = [x³]_0^{1/3} = 1/27
b) P(2/3 ≤ X ≤ 1) = ∫_{2/3}^1 3x² dx = [x³]_{2/3}^1 = 1 − 8/27 = 19/27
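Since the antiderivative F(x) = x³ of this pdf is the cumulative distribution function on [0, 1], interval probabilities are simply differences of F. An illustrative sketch (not part of the original unit):

```python
# On [0, 1] the antiderivative of f(x) = 3x**2 is F(x) = x**3 (the cdf),
# so interval probabilities are differences of F.
def F(x):
    return x ** 3

p_a = F(1/3) - F(0)    # P(0 <= X <= 1/3)
p_b = F(1) - F(2/3)    # P(2/3 <= X <= 1)
print(p_a, p_b)        # about 1/27 and 19/27
```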
Exercise 4.2: For a continuous random variable X with density f, P(0 ≤ X ≤ 0.5) is the area
under the graph of f between x = 0 and x = 0.5. Now suppose
f(x) = x/2 if 0 ≤ x ≤ 2, and f(x) = 0 elsewhere.
Find the probability P(1 ≤ X ≤ 1.5).
Solution
P(1 ≤ X ≤ 1.5) = ∫_1^{1.5} (x/2) dx = [x²/4]_1^{1.5} = 2.25/4 − 1/4 = 5/16
Section 5: Examples of Continuous Random Variables
Introduction
Dear learners, you are welcome to this section. In Section 4 we learnt about continuous
random variables and their properties. In this section we will be solving more problems
relating to continuous random variables.
Objective(s)
By the end of this section, you should be able to calculate the mean, variance and standard
deviation of a continuous random variable.
In the case of a continuous random variable, the mean is calculated using
E(X) = ∫_{−∞}^{∞} x f(x) dx
which is the continuous analogue of the discrete sum ∑ x P(x).
We can find the variance of a continuous random variable, the probability-weighted average
of the squared deviations of the x's from μ, using the formula
var(X) = σ² = ∫_{−∞}^{∞} (x − μ)² f(x) dx
or
var(X) = σ² = E(X²) − [E(X)]²
Example 5.1: (a) Find the constant 𝑐 such that the function
P(x) = cx² for 0 < x < 3, and P(x) = 0 otherwise
is a density function, and
(b) Compute 𝑃(1 < 𝑋 < 2)
Solution
(a) Since P(x) satisfies property 1 if c ≥ 0, it must satisfy property 2 in order to be a density
function. Now,
∫_{−∞}^{∞} P(x) dx = ∫_0^3 cx² dx = [cx³/3]_0^3 = 9c
and since this must equal 1, we have c = 1/9.
(b) P(1 < X < 2) = ∫_1^2 (1/9)x² dx = [x³/27]_1^2 = 8/27 − 1/27 = 7/27
Example 5.2: Given the following probability density function (pdf), 𝑃(𝑥) = 3(1 − 𝑥)2 , 0 <
𝑥 < 1, what is the standard deviation?
Solution
Mean, E(X) = ∫_0^1 x · 3(1 − x)² dx
           = 3 ∫_0^1 (x − 2x² + x³) dx
           = 3(1/2 − 2/3 + 1/4)
           = 1/4
E(X²) = ∫_0^1 x² · 3(1 − x)² dx
      = 3 ∫_0^1 (x² − 2x³ + x⁴) dx
      = 3(1/3 − 1/2 + 1/5)
      = 1/10
Note: E(X) = μ, so
var(X) = σ² = E(X²) − μ² = 1/10 − 1/16 = 3/80
SD = σ = √(3/80) ≈ 0.194
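Because this pdf is a polynomial, its moments can be computed exactly term by term. The sketch below (illustrative; not part of the original unit) does this with fractions:

```python
from fractions import Fraction
import math

# f(x) = 3(1 - x)**2 = 3 - 6x + 3x**2 on (0, 1); integrating x**k * f(x)
# term by term gives exact moments with Fractions.
coeffs = [Fraction(3), Fraction(-6), Fraction(3)]   # f as a polynomial in x

def moment(k):
    # integral over [0, 1] of x**k * sum_j c_j x**j  =  sum_j c_j / (k + j + 1)
    return sum(c / (k + j + 1) for j, c in enumerate(coeffs))

mean = moment(1)                 # E(X)   = 1/4
var = moment(2) - mean**2        # E(X^2) - mu^2 = 1/10 - 1/16 = 3/80
sd = math.sqrt(var)
print(mean, var, sd)
```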
Example 5.3: Let X be a continuous random variable with the following pdf
f(x) = x/2 if 0 ≤ x ≤ 2, and f(x) = 0 elsewhere.
Using calculus we can compute the expectation, variance, and standard deviation of X:
E(X) = ∫_{−∞}^{∞} x f(x) dx = ∫_0^2 (1/2)x² dx = [x³/6]_0^2 = 8/6 = 4/3
E(X²) = ∫_{−∞}^{∞} x² f(x) dx = ∫_0^2 (1/2)x³ dx = [x⁴/8]_0^2 = 2
var(X) = E(X²) − μ² = 2 − 16/9 = 2/9, and σ_X = √(2/9)
Section 6: Joint Distribution of Random Variables
Introduction
Welcome to this section. We have been solving problems involving continuous and discrete
random variables of one distribution. In this section we will learn about joint distribution
probability distribution of random variables.
Objective(s)
By the end of this section, you should be able to:
Construct a joint probability distribution table
Calculate the mean of sums of random variables
Find the variance of sums of random variables
Example 6.1: A fair coin is tossed three independent times. We choose the familiar set
Ω = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}
as sample space and assign probability 1/8 to each simple event. We define the
following random variables:
X = 0 if the first toss is a tail, and X = 1 if the first toss is a head,
Y = the total number of heads,
Z = the absolute value of the difference between the number of heads and tails.
We can list the values of these three random variables for each element of the sample space Ω.
Table 6.1: Sample space
Element of Ω Value of X Value of Y Value of Z
HHH 1 3 3
HHT 1 2 1
HTH 1 2 1
THH 0 2 1
HTT 1 1 1
THT 0 1 1
TTH 0 1 1
TTT 0 0 3
Consider now the first pair 𝑋, 𝑌. We want to determine not only the possible pairs of
values of 𝑋 and 𝑌, but also the probability with which each such pair occurs.
To say, for example, that X has the value 0 and Y the value 1 is to say that the event
{THT, TTH} occurs. The probability of this event is therefore 2/8 or 1/4. We write
P(X = 0, Y = 1) = 1/4,
adopting the convention in which a comma is used in place of ∩ to denote the intersection of
the two events 𝑋 = 0 and 𝑌 = 1. We similarly find
P(X = 0, Y = 0) = P({TTT}) = 1/8,
Notice that the event Y = 0 is the union of the mutually exclusive events (X = 0, Y = 0)
and (X = 1, Y = 0). Hence
P(Y = 0) = P(X = 0, Y = 0) + P(X = 1, Y = 0) = 1/8 + 0 = 1/8
In the table, this probability is obtained as the sum of the entries in the column headed 𝑦 = 0.
By adding the entries in the other columns, we similarly find
P(Y = 1) = 3/8,  P(Y = 2) = 3/8,  P(Y = 3) = 1/8
In this way, we obtain the probability function of the random variable 𝑌 from the joint
probability table of 𝑋 and 𝑌. This function is commonly called the marginal probability
function of 𝑌. By adding across the rows in the joint table, one similarly obtains the
(marginal) probability function of X.
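Building the joint table, a marginal, and a conditional probability from the sample space is mechanical enough to code directly. The sketch below (illustrative; not part of the original unit) does so for this three-toss experiment:

```python
from fractions import Fraction
from itertools import product

omega = list(product("HT", repeat=3))   # 8 equally likely outcomes
p = Fraction(1, 8)

joint = {}                              # joint[(x, y)] = P(X = x, Y = y)
for w in omega:
    x = 1 if w[0] == "H" else 0         # X: 1 if the first toss is a head
    y = w.count("H")                    # Y: total number of heads
    joint[(x, y)] = joint.get((x, y), 0) + p

# marginal of Y: sum the joint probabilities down each column
p_y = {y: sum(pr for (xx, yy), pr in joint.items() if yy == y) for y in range(4)}

# conditional probability P(Y = 2 | X = 1) by the usual ratio
p_x1 = sum(pr for (xx, yy), pr in joint.items() if xx == 1)
p_y2_given_x1 = joint[(1, 2)] / p_x1

print(p_y, p_y2_given_x1)
```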
Notice that knowing the value of X changes the probability that a given value of Y
occurs. For example, P(Y = 2) = 3/8. But if we are told that the value of X is 1, then the
conditional probability of the event Y = 2 becomes 1/2. This follows from the definition of
conditional probability:
P(Y = 2 | X = 1) = P(X = 1, Y = 2) / P(X = 1) = (1/4) / (1/2) = 1/2
In other words, the events X = 1 and Y = 2 are not independent: knowing that the first toss
results in a head increases the probability of obtaining exactly two heads in three tosses.
What we have done for the pair 𝑋, 𝑌 can be done for 𝑋 and 𝑍. In this case, the joint
probability table looks like this
Table 6.3: Joint probability of X and Z
                 z
           1     3    P(X = x)
x    0    3/8   1/8   1/2
     1    3/8   1/8   1/2
P(Z = z)  3/4   1/4   1
Definition 4: Let X and Y be random variables on the same sample space Ω with respective
range spaces
R_X = {x₁, x₂, …, x_n} and R_Y = {y₁, y₂, …, y_m}.
The joint distribution or joint probability function of X and Y is the function h on the product
space R_X × R_Y defined by
h(x_i, y_j) = P(X = x_i, Y = y_j).
The function h is usually given in the form of a table, and has the following properties:
(i) h(x_i, y_j) ≥ 0,
(ii) ∑_i ∑_j h(x_i, y_j) = 1
From the marginal probability functions of X and Y, we find that
E(X) = 0(1/2) + 1(1/2) = 1/2,
E(Y) = 0(1/8) + 1(3/8) + 2(3/8) + 3(1/8) = 3/8 + 6/8 + 3/8 = 12/8 = 3/2
Observe that E(X + Y) = E(X) + E(Y).
For example, let X take the values −1, 0, 1 with probabilities 1/4, 1/2, 1/4, and let Y = X².
The joint probability table is:

                    x
          −1    0     1    P(Y = y)
y    0    0    1/2    0    1/2
     1   1/4    0    1/4   1/2
P(X = x) 1/4   1/2   1/4   1

Note that E(X) = 0, E(Y) = 1/2, and E(XY) = E(X³) = 0, so that the previous theorem
holds.
However, the theorem did not hold for the dependent random variables X and Y considered
earlier. We conclude that the preceding theorem holds for all pairs of independent random
variables and for some, but not all, pairs of dependent random variables.
Variance of Sums of Random Variables
We turn now to some results leading to a formula for the variance of a sum of random
variables. First, the following identity can be established:
E[(X − μ_X)(Y − μ_Y)] = E(XY) − μ_X E(Y) − μ_Y E(X) + μ_X μ_Y = E(XY) − μ_X μ_Y
Except for the sign, the last three terms are equal. Hence, if X and Y are independent, so
that E(XY) = μ_X μ_Y, then
E[(X − μ_X)(Y − μ_Y)] = 0.
Now write var(X + Y) = E[((X − μ_X) + (Y − μ_Y))²], where we have rearranged terms in
the bracket using Theorem 2. Performing the indicated squaring operation, we obtain
var(X + Y) = E[(X − μ_X)² + 2(X − μ_X)(Y − μ_Y) + (Y − μ_Y)²]
           = E[(X − μ_X)²] + 2E[(X − μ_X)(Y − μ_Y)] + E[(Y − μ_Y)²]
Note that if X and Y are independent, the middle term on the right-hand side vanishes.
The other two terms are, by definition, precisely var(X) and var(Y), so
var(X + Y) = var(X) + var(Y). Q.E.D.
Now if X and Y are independent, then so are aX and bY for any constants a and b. Thus,
we can extend the previous result to aX and bY:
var(aX + bY) = var(aX) + var(bY) = a²var(X) + b²var(Y).
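This identity can be checked exactly for a concrete pair of independent random variables. The sketch below (illustrative; not part of the original unit) takes X and Y to be two independent fair dice and a = 2, b = 3:

```python
from fractions import Fraction
from itertools import product

# Marginal pmf of a fair die; X and Y are independent copies of it
die = {k: Fraction(1, 6) for k in range(1, 7)}
a, b = 2, 3

def var(pmf):
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# pmf of aX + bY, using independence: P(X = x, Y = y) = P(X = x) P(Y = y)
pmf_sum = {}
for (x, p), (y, q) in product(die.items(), die.items()):
    s = a * x + b * y
    pmf_sum[s] = pmf_sum.get(s, 0) + p * q

print(var(pmf_sum), a**2 * var(die) + b**2 * var(die))
```

Both printed values agree, as the formula for the variance of a sum of independent random variables requires.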
Activity set 1
1. The probability that a cellular phone company kiosk sells X number of new phone
contracts per day is shown below. Find the mean, variance, and standard deviation for this
probability distribution.
x      4     5     6     8     10
P(x)   0.4   0.3   0.1   0.15  0.05
What is the probability that they will sell 6 or more contracts three days in a row?
2. The number of students using the Math Lab per day is found in the distribution below.
Find the mean, variance, and standard deviation for this probability distribution.
x      6     8     10    12    14
P(x)   0.15  0.3   0.35  0.1   0.1
What is the probability that fewer than 8 or more than 12 use the lab in a given day?
3. A continuous random variable X that can assume values between x = 1 and x = 3 has a
density function given by f(x) = 1/2.
a. Show that the area under the curve is equal to 1
b. Find 𝑃(2 < 𝑥 < 2.5)
Suggested Answers
1. μ = 5.4, σ² = 2.94, σ ≈ 1.71, and probability = 0.027
2. μ = 9.4, σ² = 5.24, σ ≈ 2.289, and probability = 0.25
3. (a) ∫_1^3 (1/2) dx = 1, (b) 0.25