Elementary Probability and Statistics

The document provides an introduction to probability theory, covering essential concepts such as random experiments, sample spaces, and events. It explains important terms like empirical probability, compound events, and mutually exclusive events, along with their mathematical properties and theorems. The document also includes examples and exercises to illustrate the application of probability rules.

Contents

1 Introduction to Probability 3
1.1 Some important terms and concepts . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Empirical or experimental Probability . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 Compound Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Mutually Exclusive Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.5 Independent Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.6 Conditional Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.7 Bayes’ Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.8 EXERCISES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2 Random Variables and Probability Distributions 12


2.1 Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2 Discrete Probability Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2.1 PMF and CDF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2.2 Expected Value and Variance . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2.3 Binomial Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2.4 Geometric Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2.5 Poisson Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.3 Continuous Probability Distributions . . . . . . . . . . . . . . . . . . . . . . . . 19
2.3.1 PDF and CDF . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.3.2 The Uniform Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.3.3 The Exponential Distribution . . . . . . . . . . . . . . . . . . . . . . . . 22

2.3.4 The Normal Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . 22


2.3.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

Chapter One

Introduction to Probability
Probability theory originated in the study of games of chance. Many problems today are based
on games of chance, such as coin tossing, die throwing and playing cards. Probability
is very important for statistics because it provides the rules that allow us to reason about uncertainty
and randomness, which is the basis of statistics. Independence and conditional probability are
profound ideas that must be understood in order to think clearly about any statistical
investigation.
Aims of the chapter: The aims of this chapter are to:

• understand the concept of probability

• work with independent events and determine conditional probabilities

• work with probability problems

Learning outcomes: After completing this chapter, you should be able to:

• Explain the fundamental ideas of random experiments, sample spaces and events

• List the axioms of probability and be able to derive probability rules

• Explain conditional probability and the concept of independent events

• Apply the probability laws to problems.

1.1 Some important terms and concepts

Definition 1.1. 1. An experiment is any operation whose outcome cannot be predicted
with certainty, for example tossing a coin or throwing a die.

2. A trial is a single performance of the experiment.

3. An outcome is one of the possible results of a trial of an experiment.

Example 1.1.

Tossing a coin is a trial.

Throwing a die is a trial.

Picking a ball from a bag containing balls of different colours is a trial.



4. The sample space, S, of an experiment is a set of all possible outcomes of one trial of
the experiment. Each outcome can then be called a sample point.
Example 1.2. • Dataset 1: S = {0, 1, 2, ....}
• Dataset 2: S = {x : x ≥ 0}
• Toss of a coin: S = {H, T }
• Roll of a six sided die: S = {1, 2, 3, 4, 5, 6}

5. An event is a subset of a sample space.


Example 1.3. • Roll of a die. Denote by A the event that an even number is obtained.
Then A = {2, 4, 6}. We see that A ⊂ S.
• Toss of a coin. Denote by B the event that a head is shown. Then B = {H}. We see
that B ⊂ S.

6. An event having no sample point is called a null event and is denoted by ∅.

7. The collection of all possible outcomes of a trial is known as the exhaustive events.
Definition 1.2 (Probability). The probability of an event A occurring is the ratio of the
number of favourable outcomes to the total number of possible outcomes, all equally likely
to occur. That is, the probability of an event A is given by P (A), where

P (A) = Number of favourable outcomes / Number of possible outcomes = n(A)/n(S)    (1.1)

Since A ⊂ S, it follows that Ac = Ā = S − A. Thus

P (Ac ) = P (S) − P (A) (1.2)

Now, since the probability of the sample space is 1, that is, P (S) = 1, we have that

P (Ac ) = 1 − P (A) (1.3)

Thus, in general, if an event A can occur in n ways out of a total of N equally likely
ways, then the probability of occurrence of the event (called its success) can be expressed as

P (A) = n/N    (1.4)

and the probability of non-occurrence of the event (called its failure) can be expressed as

P (Ac ) = (N − n)/N = 1 − n/N = 1 − P (A)    (1.5)

so that

P (success) + P (failure) = 1    (1.6)

From the above we have that,


0 ≤ P (A) ≤ 1 (1.7)
that is, the probability of an event can take any value from 0 to 1 inclusive.

Properties of probability

1. P (∅) = 0

2. For any event A, P (Ac ) = 1 − P (A)

3. For any event A, 0 ≤ P (A) ≤ 1

4. For any sequence of disjoint events A1, A2, · · ·

P (A1 ∪ A2 ∪ · · · ) = P (A1) + P (A2) + · · ·

Example 1.4. 1. In the roll of a pair of dice, find the probability that the two dice show the
same number.

2. Find the probability that both are even numbers

Solution:

1. S = {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6), (5, 1), (5, 2),
(5, 3), (5, 4), (5, 5), (5, 6), (6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)}
n(S) = 36
Let A be the set of same appearance, then
A = {(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)}
n(A) = 6
We have that, P (A) = n(A)/n(S) = 6/36 = 1/6

2. Let B be the set of a pair of even numbers


B = {(2, 2), (2, 4), (2, 6), (4, 2), (4, 4), (4, 6), (6, 2), (6, 4), (6, 6)}
n(B) = 9
Thus, P (B) = n(B)/n(S) = 9/36 = 1/4
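The counting above is easy to check by brute force. The following short Python sketch (an added
illustration, assuming a standard Python 3 interpreter) enumerates the 36 equally likely outcomes
and recovers both probabilities.

# Enumerate the sample space of two dice and count favourable outcomes.
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))                    # n(S) = 36
A = [(a, b) for (a, b) in S if a == b]                      # same number on both dice
B = [(a, b) for (a, b) in S if a % 2 == 0 and b % 2 == 0]   # both even

print(Fraction(len(A), len(S)))   # 1/6
print(Fraction(len(B), len(S)))   # 1/4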

Example 1.5. Given an experiment of tossing 3 coins, find the probability of

(i) As many heads as tails

(ii) More heads than tails

Solution:

(i) S = {HHH, HHT, HT T, T T T, T T H, T HH, HT H, T HT }


P (As many heads as tails) = 0/8 = 0.

Remark 1.1. If your probability is zero, it means the event can never occur.

(ii) Let H be heads and T be tails, then


4 1
P (More heads than tails) = P (H > T ) = 8
= 2

1.2 Empirical or experimental Probability

If after n (n large) repetitions of an experiment an event occurs h times, then the probability of the
event is h/n. This is called the empirical probability of the event.
For example, if a coin is tossed 1000 times and heads count is 489 then the probability of
getting a head on the next throw is
489
P (H) = = 0.489 = 48.9% (1.8)
1000
The probability of getting a tail on the next throw is

P (T ) = 1 − 0.489 = 0.511 = 51.1% (1.9)
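As a quick illustration of empirical probability, the experiment can be simulated; the sketch below
(Python 3, assuming a fair coin) estimates P (H) from n repetitions.

# Estimate P(H) empirically by simulating n coin tosses.
import random

n = 10_000
heads = sum(random.random() < 0.5 for _ in range(n))
print("empirical P(H) =", heads / n)        # close to 0.5 for large n
print("empirical P(T) =", 1 - heads / n)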

1.3 Compound Events

Let A and B be two events in the sample space S; then

1. A ∪ B is the event ”either A or B”.

2. A ∩ B is the event ”both A and B”.

3. A′ = Ac is the event ”not A”.

4. A ∩ B′ is the relative complement of B w.r.t. A.

A ∩ B′ = {x : x ∈ A and x ∉ B}    (1.10)

1.4 Mutually Exclusive Events

Definition 1.3. Two events A and B are mutually exclusive if they are disjoint. That is, if
A ∩ B = ∅.

The probability that one or the other of two mutually exclusive events A and B occurs is denoted
by P (A ∪ B) and is given by

P (A ∪ B) = P (A) + P (B)    (1.11)
Theorem 1.1. If ∅ is the empty set, then P (∅) = 0.

Proof: Let A be any event. Then A ∪ ∅ = A.


This implies P (A ∪ ∅) = P (A)
Also, P (A) + P (∅) = P (A)
Thus we have that

P (A ∪ ∅) = P (A) + P (∅)
=⇒ P (∅) = P (A) − P (A ∪ ∅)
= P (A) − P (A)
=0

Thus, P (∅) = 0; that is, the probability of the empty set is zero.


Theorem 1.2. If A′ is the complement of an event A, then

P (A′ ) = 1 − P (A)    (1.12)

Proof: The sample space S can be decomposed into two mutually exclusive events A and
A′ . That is, S = A ∪ A′ .

We have that

P (S) = P (A ∪ A′ )
1 = P (A) + P (A′ )
=⇒ P (A′ ) = 1 − P (A)

Which ends the proof


Theorem 1.3. If A and B are any two events, then

P (A ∩ B′ ) = P (A) − P (A ∩ B)    (1.13)

Proof:

A = (A ∩ B) ∪ (A ∩ B′ )
=⇒ P (A) = P [(A ∩ B) ∪ (A ∩ B′ )]
=⇒ P (A) = P (A ∩ B) + P (A ∩ B′ )
=⇒ P (A ∩ B′ ) = P (A) − P (A ∩ B)

Which ends the proof 


Theorem 1.4. If A and B are any two events, then

P (A ∪ B) = P (A) + P (B) − P (A ∩ B) (1.14)



Proof: The set A ∪ B can be decomposed into two mutually exclusive events A ∩ B′ and
B. Therefore

A ∪ B = (A ∩ B′ ) ∪ B
P (A ∪ B) = P [(A ∩ B′ ) ∪ B]
= P (A ∩ B′ ) + P (B)
= P (A) − P (A ∩ B) + P (B)

Thus,
P (A ∪ B) = P (A) + P (B) − P (A ∩ B) (1.15)
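For example, in the roll of two fair dice, let A be the event that the first die shows an even
number and B the event that the sum is 7. Then P (A) = 18/36, P (B) = 6/36 and P (A ∩ B) = 3/36,
so P (A ∪ B) = 18/36 + 6/36 − 3/36 = 21/36 = 7/12.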

Definition 1.4. Events A, B and C are collectively exhaustive if

P (A ∪ B ∪ C) = 1    (1.16)

By the inclusion–exclusion rule, this means that

P (A) + P (B) + P (C) − P (A ∩ B) − P (A ∩ C) − P (B ∩ C) + P (A ∩ B ∩ C) = 1    (1.17)

1.5 Independent Events

Definition 1.5. An event A is said to be independent of an event B if the probability that A
occurs is not influenced by whether or not B has occurred.

Thus, for independent events A and B, we have that

P (A ∩ B) = P (A) · P (B) (1.20)

Example 1.6. John tosses a fair coin and Daniel rolls a fair die. Find the probability of

(a) A head and a 4


(b) A head or a 4

Solution:

(a)
P (H ∩ 4) = P (H) · P (4) = 1/2 · 1/6 = 1/12

(b)
P (H ∪ 4) = P (H) + P (4) − P (H ∩ 4) = 1/2 + 1/6 − 1/12 = 7/12
Example 1.7. A fair coin is tossed 3 times. Consider the events A, B and C where

A = {the first toss is a head}
B = {the second toss is a head}
C = {two heads are tossed in a row}

decide whether

(i) A is independent of B.

(ii) B is independent of C.

(iii) A is independent of C.

Solution: S = {(HHT ), (HT T ), (HT H), (HHH), (T T H), (T HH), (T HT ), (T T T )}

Thus, n(S) = 8

P (A) = 4/8,    P (B) = 4/8,    P (C) = 3/8

(i) For A to be independent of B, P (A ∩ B) = P (A) · P (B)

Now, P (A ∩ B) = P (the first toss is a head and the second toss is a head) = 2/8 = 1/4
P (A) · P (B) = 1/2 × 1/2 = 1/4
Since P (A ∩ B) = 1/4 = P (A) · P (B), A and B are independent.

(ii) P (B ∩ C) = P (the second toss is a head and two heads in a row) = 3/8
P (B) · P (C) = 1/2 × 3/8 = 3/16
Since P (B ∩ C) ≠ P (B) · P (C), B and C are not independent.

(iii) Assignment
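Parts (i) and (ii) can also be verified by listing the 8 equally likely outcomes; the sketch below
(an added Python 3 illustration) does exactly that, leaving part (iii) as the assignment above.

# Check independence of A,B and of B,C for three tosses of a fair coin.
from fractions import Fraction
from itertools import product

S = list(product("HT", repeat=3))
def prob(event):
    return Fraction(sum(1 for s in S if event(s)), len(S))

A = lambda s: s[0] == "H"              # first toss is a head
B = lambda s: s[1] == "H"              # second toss is a head
C = lambda s: "HH" in "".join(s)       # two heads in a row

print(prob(lambda s: A(s) and B(s)) == prob(A) * prob(B))   # True: A, B independent
print(prob(lambda s: B(s) and C(s)) == prob(B) * prob(C))   # False: B, C not independent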

1.6 Conditional Probability

Definition 1.6. A conditional probability is one in which the occurrence of an event A
depends on the occurrence of another event B. It is denoted P (A | B), that is, the probability of A
given B.

P (A | B) = P (A ∩ B)/P (B),    P (B) ≠ 0    (1.21)
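For instance, in a single roll of a fair die, let B be the event that an even number shows and A
the event that a 2 shows. Then P (A | B) = P (A ∩ B)/P (B) = (1/6)/(3/6) = 1/3.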

1.7 Bayes’ Theorem

Bayes’ theorem comes into play when the conditional probability cannot be applied directly.
Therefore Bayes’ theorem is an extension of conditional probability.

Theorem 1.5. Let S be the sample space, let A1 , A2 , · · · , An be a partition of S and let B be any
event. Then for each i = 1, 2, ..., n we have that

P (Ai | B) = P (Ai ) · P (B | Ai ) / [P (A1 ) · P (B | A1 ) + P (A2 ) · P (B | A2 ) + · · · + P (An ) · P (B | An )]    (1.22)
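As a numerical illustration (the figures here are invented purely for this example), suppose
machine A1 produces 60% of all items with a 2% defect rate, machine A2 produces 40% with a
5% defect rate, and B is the event that a randomly chosen item is defective. A minimal Python
sketch of the theorem is:

# Bayes' theorem: posterior P(A_i | B) from priors P(A_i) and likelihoods P(B | A_i).
def bayes(priors, likelihoods):
    total = sum(p * l for p, l in zip(priors, likelihoods))       # P(B)
    return [p * l / total for p, l in zip(priors, likelihoods)]

print(bayes(priors=[0.6, 0.4], likelihoods=[0.02, 0.05]))   # [0.375, 0.625]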

1.8 EXERCISES

1. (a) In tossing a coin, what is the probability of getting a head?


(b) In tossing a die, What is the probability of getting a 6?
(c) Find the probability of throwing a 7 with two dice
(d) A bag contains 6 red and 7 black balls. Find the probability of drawing a red ball.
(e) From a pack of 52 cards, 1 card is drawn at random. Find the probability of getting
a queen.
(f) Given a fair die, find the probability of throwing
(i) a 4
(ii) an odd number
(iii) an even number
(iv) a prime number

2. A bag contains 8 white and 10 black balls. Two balls are drawn in succession. What is
the probability that the first is white and the second is black.

3. Two persons A and B appear in an interview for 2 vacancies for the same post. The
probability of A’s selection is 1/7 and that of B’s selection is 1/5. What is the probability
that

(i) both of them will be selected.


(ii) None of them will be selected.

4. What is the chance of getting two 6s in two rolls of a single die?

5. A problem in Mathematics is given to 3 students: A, B, C, whose chances of solving it are
1/2, 1/3, 1/4 respectively. What is the probability that the problem will be solved?

6. From a bag containing 4 white balls and 6 black balls, two balls are drawn at random. If
the balls are drawn one after the other without replacement, find the probability that

(i) both balls are white


(ii) both balls are black
(iii) the first ball is white and the second ball is black
(iv) one ball is white and the other is black.

7. Do the same exercise as above, but consider that the balls are replaced in this case.

8. If from a pack of cards a single card is drawn, what is the probability that it is either a
spade or a king?

9. A person is known to hit the target in 3 out of 4 shots, whereas another person is known
to hit the target in 2 out of 3 shots. Find the probability of the targets being hit at all
when they both try.

10. A bag contains 3 red and 4 white balls. Two draws are made without replacement. What
is the probability that both balls are red?

11. Find the probability of drawing a queen and a king from a pack of cards in two consec-
utive draws, if the cards drawn are not replaced.

12. The Guardian newspaper publishes three columns entitled Politics (A), Books (B) and Ad-
verts (C). Reading habits of a randomly selected reader with respect to the three columns
are P (A) = 0.14, P (B) = 0.23, P (C) = 0.37, P (A ∩ B) = 0.08, P (A ∩ C) = 0.09, P (B ∩ C) = 0.13 and
P (A ∩ B ∩ C) = 0.05. Find

(i) P (A | B)
(ii) P (A | B ∪ C)
(iii) P (A | reads at least one)
(iv) P (A ∪ B | C)

Chapter Two

Random Variables and Probability


Distributions
In this chapter, we introduce the concepts of random variables and probability distributions.
These distributions are univariate, which means that they are used to model a single numerical
quantity. The concepts of expected value and variance are also discussed.
Aims of the Chapter: The aims of this chapter are to

• Be familiar with the concepts of random variables

• Be able to explain what a probability distribution is

• Be able to determine the expected value and variance of a random variable.

Learning outcomes: After completing this chapter, you should be able to:

• Define a random variable and distinguish it from the values that it takes.

• Explain the difference between discrete and continuous random variables.

• Find the expectation and variance of simple random variables, whether discrete or con-
tinuous.

• Use simple properties of expected values and variances.

2.1 Random Variables

Definition 2.1. A random variable is a function that assigns a numerical value to each outcome of an experiment.

Suppose that to each point of a sample space we assign a number. We then have a function
defined on the sample space. This function is called a random variable (or stochastic variable)
or, more precisely, a random function (stochastic function). It is usually denoted by an upper-
case letter such as X or Y . In general, a random variable has some specified physical, geometric,
or other significance.

Example 2.1. Suppose that a coin is tossed twice so that the sample space is S = {HH, HT, T H, T T }.
Let X represent the number of heads that can come up. With each sample point we can associate
a number for X as shown in the Table below. It follows that X is a random variable.

Sample Point HH HT TH TT
X 2 1 1 0

It should be noted that many other random variables could also be defined on this sample
space, for example, the square of the number of heads or the number of heads minus the number of tails.
Example 2.2. Roll two dice and call their sum X. The sample space for X is

SX = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}

and the outcomes are not equally likely. However, we know the probabilities of the events
corresponding to each of these outcomes, and we can display them in a table as follows.

Outcome 2 3 4 5 6 7 8 9 10 11 12
1 2 3 4 5 6 5 4 3 2 1
Probability 36 36 36 36 36 36 36 36 36 36 36

A random variable that takes on a finite or countably infinite number of values is called a
discrete random variable. For example, the number of days that it rains yearly.
One that takes on an uncountably infinite (continuous) set of values is called a continuous
random variable. For example, the amount of preparation time for an exam.
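The idea that a random variable is a function on the sample space, and that its distribution is
obtained by collecting the probabilities of the sample points mapped to each value, can be made
concrete with a short added sketch (Python 3) for Example 2.1.

# X = number of heads in two tosses of a fair coin, built as a function on S.
from fractions import Fraction
from itertools import product
from collections import defaultdict

S = list(product("HT", repeat=2))        # {HH, HT, TH, TT}
X = lambda s: s.count("H")               # the random variable

pmf = defaultdict(Fraction)
for s in S:
    pmf[X(s)] += Fraction(1, len(S))     # add the probability of each sample point
print(dict(pmf))                         # P(X=2)=1/4, P(X=1)=1/2, P(X=0)=1/4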

2.2 Discrete Probability Distributions

Definition 2.2. A random variable that takes a finite or countably infinite number of values is
called a discrete random variable.

2.2.1 PMF and CDF

Definition 2.3. Let X be a discrete random variable. We define the probability mass func-
tion (PMF) to be the function which gives the probability of each x ∈ SX . That is,

P (X = x) = Σ_{s∈S : X(s)=x} P (s)    (2.1)

That is, the probability of getting a particular number is the sum of the probabilities of all
those outcomes which have that number associated with them. The set of all pairs

{(x, P (X = x)) : x ∈ SX }

is known as the probability distribution of X.


Example 2.3. For the example above concerning the sum of two dice, the probability mass
function can be tabulated as

x 2 3 4 5 6 7 8 9 10 11 12
1 2 3 4 5 6 5 4 3 2 1
P (X = x) 36 36 36 36 36 36 36 36 36 36 36

For any discrete random quantity, X, we have that

Σ_{x∈SX} P (X = x) = 1

as every outcome has some number associated with it. It can often be useful to know the
probability that a random quantity is no greater than some particular value. This leads us to the
cumulative distribution function.

Definition 2.4. Let X be a discrete random variable. We define the cumulative distribution
function (CDF) as

F (x) = P (X ≤ x) = Σ_{y∈SX : y≤x} P (X = y)

Example 2.4. For the sum of two dice, the cumulative distribution function for the outcomes
can be tabulated as

x 2 3 4 5 6 7 8 9 10 11 12
1 3 6 10 15 21 26 30 33 35 36
P (X ≤ x) 36 36 36 36 36 36 36 36 36 36 36

It is important to know that the CDF is defined for all real numbers - not just the possible
values. In our example, we have

F (−3) = P (X ≤ −3) = 0
F (4.5) = P (X ≤ 4.5) = P (X ≤ 4) = 6/36
F (25) = P (X ≤ 25) = 1

Example 2.5. Consider the following probability distribution for the household size, X, in
Cameroon, with x being the number of people in a household.

x 1 2 3 4 5 6 7 8
P (X = x) 0.3002 0.3417 0.1551 0.1336 0.0494 0.0145 0.0034 0.0021

These are clearly non-negative and their sum Σ_{x=1}^{8} P (X = x) = 1. We have that the CDF of
the household size is

x 1 2 3 4 5 6 7 8
F (x) = P (X ≤ x) 0.3002 0.6419 0.7970 0.9306 0.9800 0.9945 0.9979 1.0000

Thus,

P (X = 1) = F (1) = 0.3002 (2.2)


P (X = 2) = F (2) − F (1) = 0.3417 (2.3)
P (X ≤ 2) = F (2) = f (1) + f (2) = 0.6419 (2.4)
P (X = 3 or 4) = f (3) + f (4) = F (4) − F (2) = 0.2887 (2.5)
P (X > 5) = f (6) + f (7) + f (8) = 1 − F (5) = 0.0200 (2.6)
P (X ≥ 5) = f (5) + f (6) + f (7) + f (8) = 1 − F (4) = 0.0694 (2.7)
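These values are straightforward to reproduce from the tabulated PMF; a small Python sketch
(an added illustration) is given below.

# CDF and probabilities for the household-size distribution.
pmf = {1: 0.3002, 2: 0.3417, 3: 0.1551, 4: 0.1336,
       5: 0.0494, 6: 0.0145, 7: 0.0034, 8: 0.0021}

def F(x):                                  # CDF: F(x) = P(X <= x)
    return sum(p for k, p in pmf.items() if k <= x)

print(round(F(2), 4))           # 0.6419 = P(X <= 2)
print(round(F(4) - F(2), 4))    # 0.2887 = P(X = 3 or 4)
print(round(1 - F(5), 4))       # 0.02   = P(X > 5)
print(round(1 - F(4), 4))       # 0.0694 = P(X >= 5)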

2.2.2 Expected Value and Variance

Definition 2.5. The expected value or (population) mean of a random variable X, denoted
µ or E(X), is the probability-weighted sum over its n possible values. That is,

µ = E(X) = Σ_{i=1}^{n} xi P (X = xi ) = Σ_{i=1}^{n} xi f (xi )    (2.8)

n may be finite or infinite.

The location measure used to summarise random quantities is known as the expectation of
the random quantity. It is the ”centre of mass” of the probability distribution. Analogous to
the sample mean x̄, it represents the ”average” value of X.

Example 2.6. For the sum of two dice, we have

µ = E(X) = 2 × 1/36 + 3 × 2/36 + 4 × 3/36 + · · · + 12 × 1/36 = 7
By looking at the symmetry of the mass function, it is clear that in some sense 7 is the ”central”
value of the probability distribution.

Definition 2.6. The variance of a random variable X, denoted σ² or V ar(X), is given by

σ² = V ar(X) = Σ_{i=1}^{n} (xi − µ)² P (X = xi )    (2.9)

The variance measures the spread of the random quantity about its expected value, over all
values with positive probability. The variance can also be written as

σ² = V ar(X) = Σ_{i=1}^{n} xi² P (X = xi ) − µ²

and this expression is usually a bit easier to work with.

Definition 2.7. The standard deviation of X, denoted by σ or SD(X), is the square root
of its variance. That is,

σ = SD(X) = √V ar(X)    (2.10)
Example 2.7. For the sum of two dice, we have

V ar(X) = Σ_{i=1}^{n} xi² P (X = xi ) − µ²
= 2² × 1/36 + 3² × 2/36 + 4² × 3/36 + · · · + 12² × 1/36 − 7²
= 329/6 − 7²
= 35/6

and

SD(X) = √(35/6)
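The values E(X) = 7 and V ar(X) = 35/6 can be confirmed directly from the PMF, as in the
following added sketch (Python 3, using exact fractions).

# Expectation and variance of the sum of two dice from its PMF.
from fractions import Fraction
from itertools import product

pmf = {}
for a, b in product(range(1, 7), repeat=2):
    pmf[a + b] = pmf.get(a + b, Fraction(0)) + Fraction(1, 36)

mean = sum(x * p for x, p in pmf.items())
var = sum(x**2 * p for x, p in pmf.items()) - mean**2
print(mean, var)   # 7 35/6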
Example 2.8. Find the expectation and variance of the Household example.

2.2.3 Binomial Distribution

Now that we have an understanding of discrete random quantities, we look next at a few
standard families of discrete random variables. One of the most commonly encountered discrete
distributions is the binomial distribution. This is the distribution of the number of ”successes”
in a series of n independent trials, each of which results in a ”success” (with probability p) or
a ”failure” (with probability 1 − p). If the number of successes is X, we write

X ∼ B(n, p)

to indicate that X is a binomial random quantity based on n independent trials, each with success
probability p.
Example 2.9. 1. Toss a fair coin 100 times and let X be the number of heads. Then X ∼
B(100, 0.5).
2. A certain kind of lizard lays 8 eggs, each of which will hatch independently with probability
0.7. Let Y denote the number of eggs which hatch. Then Y ∼ B(8, 0.7)

Let us now derive the PMF for the binomial distribution X ∼ B(n, p). Clearly, X can take
on any value from 0 up to n, and no other. Therefore, we simply have to calculate P (X = k) for
k = 0, 1, 2, ..., n. The probability of k successes followed by n − k failures is p^k (1 − p)^(n−k).
Indeed, this is the probability of any particular sequence involving k successes. There are
C(n, k) = n!/(k!(n − k)!) such sequences, so by the multiplication principle, we have

P (X = k) = C(n, k) p^k (1 − p)^(n−k),    k = 0, 1, 2, ..., n    (2.11)

Example 2.10. For the lizard eggs, Y ∼ B(8, 0.7), we have

P (Y = k) = C(8, k) 0.7^k (1 − 0.7)^(8−k),    k = 0, 1, 2, ..., 8
= C(8, k) 0.7^k 0.3^(8−k),    k = 0, 1, 2, ..., 8

We can therefore tabulate the PMF and CDF (to two decimal places) as follows

k          0     1     2     3     4     5     6     7     8
P (Y = k)  0.00  0.00  0.01  0.05  0.14  0.25  0.30  0.20  0.06
F (k)      0.00  0.00  0.01  0.06  0.19  0.45  0.74  0.94  1.00
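The table can be reproduced with a few lines of Python (an added illustration using only the
standard library; math.comb needs Python 3.8 or later).

# PMF and CDF of Y ~ B(8, 0.7), rounded to two decimal places.
from math import comb

n, p = 8, 0.7
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
cdf = [sum(pmf[:k + 1]) for k in range(n + 1)]
for k in range(n + 1):
    print(k, round(pmf[k], 2), round(cdf[k], 2))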

The expectation, variance and standard deviation of the binomial distribution X ∼ B(n, p)
are given by

E(X) = np    (2.12)
V ar(X) = np(1 − p)    (2.13)
SD(X) = √(np(1 − p))    (2.14)

Example 2.11. For the coin tosses, X ∼ B(100, 0.5), we have

E(X) = np = 100 × 0.5 = 50
V ar(X) = np(1 − p) = 100 × 0.5 × (1 − 0.5) = 25
SD(X) = √V ar(X) = √25 = 5

2.2.4 Geometric Distribution

If X is the number of trials until a success is encountered, and each independent trial has
probability p of being a success, we write

X ∼ Geom(p)

Clearly, X can take on any positive integer, so to deduce the PMF, we need to calculate
P (X = k) for k = 1, 2, 3, ... In order to have X = k, we must have an ordered sequence of k − 1
failures followed by one success. By the multiplication rule we have that

P (X = k) = (1 − p)^(k−1) p,    k = 1, 2, 3, ...    (2.15)

The CDF of the geometric distribution X ∼ Geom(p) is given by

F (k) = P (X ≤ k) = 1 − (1 − p)^k    (2.16)

The expectation, variance and standard deviation of the geometric distribution X ∼
Geom(p) are given by

E(X) = 1/p    (2.17)
V ar(X) = (1 − p)/p²    (2.18)
SD(X) = √V ar(X)    (2.19)

Example 2.12. For X ∼ Geom(0.2),

E(X) = 1/p = 1/0.2 = 5    (2.20)
V ar(X) = (1 − p)/p² = 0.8/0.04 = 20    (2.21)
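These values can also be checked by simulation; the added sketch below (Python 3) generates
geometric random quantities directly from their definition as the number of trials up to the first
success.

# Simulate X ~ Geom(0.2) and compare the sample mean and variance with 5 and 20.
import random

def geometric(p):
    k = 1
    while random.random() >= p:   # keep trying until the first success
        k += 1
    return k

samples = [geometric(0.2) for _ in range(100_000)]
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
print(round(mean, 2), round(var, 2))   # approximately 5 and 20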

2.2.5 Poisson Distribution

The Poisson distribution is a very important discrete probability distribution, which arises in
many different contexts in probability and statistics. Typically, Poisson random quantities are
used in place of binomial random quantities in situations where n is large, p is small, and the
expectation np is stable.
A Poisson random variable, X with parameter λ is written as

X ∼ P (λ)

Example 2.13. Consider the number of calls made in a 1 minute interval to an internet service
provider (ISP). The ISP has thousands of subscribers, but each one will call with a very small
probability. The ISP knows that on average 5 calls will be made in the interval. The actual
number of calls will be a Poisson random variable, with mean 5.

If X ∼ P (λ), then the PMF of X is given by

P (X = k) = (λ^k / k!) e^(−λ),    k = 0, 1, 2, 3, ...    (2.22)
Example 2.14. content...

The expectation, variance and standard deviation of the Poisson distribution X ∼ P (λ)
are given by

E(X) = λ    (2.23)
V ar(X) = λ    (2.24)
SD(X) = √V ar(X) = √λ    (2.25)

Thus the Expectation and the variance of the Poisson distribution are both λ.
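The approximation of a binomial by a Poisson distribution mentioned at the start of this
subsection can be seen numerically; the added sketch below (Python 3) compares B(1000, 0.005)
with P (5), both having mean np = λ = 5.

# Poisson approximation to the binomial for large n, small p, np = 5.
from math import comb, exp, factorial

lam, n = 5, 1000
p = lam / n
for k in range(6):
    binom = comb(n, k) * p**k * (1 - p)**(n - k)
    poisson = lam**k * exp(-lam) / factorial(k)
    print(k, round(binom, 4), round(poisson, 4))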

2.3 Continuous Probability Distributions

In this section, we discuss techniques for handling continuous random quantities. Continuous
random variables are random quantities whose sample space is neither finite nor
countably infinite. Continuous probability models are appropriate if the result of an experiment
is a continuous measurement, rather than a count from a discrete set.
If X is a continuous random quantity with sample space S, then for any particular a ∈ S,
we generally have that

P (X = a) = 0

This is because the sample space is so ”large” and every possible outcome so ”small” that the
probability of any particular value is vanishingly small. Therefore the probability mass function
we defined for discrete random quantities is inappropriate for understanding continuous random
quantities. In order to understand continuous random quantities, we need a little
calculus.

2.3.1 PDF and CDF

Definition 2.8. Let X be a continuous random variable. The probability density function
(PDF) of X is a function f (x) which satisfies the following:

1. f (x) ≥ 0

2. ∫_{−∞}^{∞} f (x) dx = 1

3. P (a ≤ X ≤ b) = ∫_{a}^{b} f (x) dx for any a and b.

Consequently, we have

P (x ≤ X ≤ x + δx) = ∫_{x}^{x+δx} f (y) dy ≈ f (x) δx

=⇒ f (x) ≈ P (x ≤ X ≤ x + δx) / δx

So we may interpret the PDF as

f (x) = lim_{δx→0} P (x ≤ X ≤ x + δx) / δx    (2.26)
Example 2.15. The manufacturer of a certain kind of light bulb claims that the lifetime of the
bulb in hours, X, can be modelled as a random quantity with PDF

f (x) = 0 for x < 100,    f (x) = c/x² for x ≥ 100,

where c is a constant. What value must c take in order for this to define a valid PDF? What is
the probability that the bulb lasts no longer than 150 hours? Given that a bulb lasts longer than
150 hours, what is the probability that it lasts longer than 200 hours?
Remark 2.1. 1. PDFs are not probabilities. For example, the density can take values greater
than 1 in some regions as long as it still integrates to 1.
2. Because P (X = x) = 0, we have that P (X ≤ k) = P (X < k) for continuous random
quantities.
Definition 2.9. Let X be a continuous random variable. The cumulative distribution func-
tion (CDF) of X is a function F (x) such that for all x

F (x) = P (X ≤ x)

Hence, the CDF of a continuous random quantity is defined in the same way as the CDF of a discrete
random quantity, but for continuous random quantities we have the continuous analogue

F (x) = P (X ≤ x) = P (−∞ < X ≤ x) = ∫_{−∞}^{x} f (y) dy

Just as in the discrete case, the cumulative distribution function is defined for all x ∈ R, even
if the sample space S is not the whole of the real line.
Properties of the CDF

1. Since it represents a probability, F (x) ∈ [0, 1].


2. F (−∞) = 0 and F (∞) = 1.
3. If a < b, then F (a) ≤ F (b). That is, F is a nondecreasing function.
4. When X is continuous, F (x) is continuous and by the fundamental theorem of calculus
we have that dF (x)/dx = f (x). That is, the slope of the CDF is the PDF.
Example 2.16. For the light bulb lifetime, X, the CDF is

F (x) = 0 for x < 100,    F (x) = 1 − 100/x for x ≥ 100
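Requiring the density of Example 2.15 to integrate to 1 gives c = 100, which is consistent with
this CDF. The questions posed there can then be answered directly from F ; the added sketch
below (Python 3) does so numerically.

# Use the lifetime CDF F(x) = 1 - 100/x (x >= 100) to answer Example 2.15.
def F(x):
    return 0.0 if x < 100 else 1 - 100 / x

print(round(F(150), 4))                          # P(X <= 150) = 1/3 ~ 0.3333
print(round((1 - F(200)) / (1 - F(150)), 4))     # P(X > 200 | X > 150) = 0.75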

Definition 2.10. The median of a random quantity is the value m which is the ”middle” of
the distribution. That is, it is the value m such that

P (X ≤ m) = P (X ≥ m) = 1/2

Equivalently, it is the value m such that

F (m) = 0.5

Similarly, the lower quartile of a random quantity is the value l such that

F (l) = 0.25

and the upper quartile is the value u such that

F (u) = 0.75

Example 2.17. ...

Now, that we have the basic properties of continuous random variables, we can look at
some of the important standard continuous probability distribution models. We start with the
simplest of these which is the uniform distribution.

2.3.2 The Uniform Distribution

Definition 2.11. A random quantity X is said to have a uniform distribution over the
range [a, b], written

X ∼ U (a, b)

if it has PDF

f (x) = 1/(b − a) for a ≤ x ≤ b,    f (x) = 0 otherwise    (2.27)

and its CDF is

F (x) = 0 for x < a,    F (x) = (x − a)/(b − a) for a ≤ x ≤ b,    F (x) = 1 for x > b    (2.28)

The lower quartile, median and upper quartile of the uniform distribution are given respec-
tively by

(3/4)a + (1/4)b,    (a + b)/2,    (1/4)a + (3/4)b    (2.29)

The expectation of X ∼ U (a, b) is given by

E(X) = (a + b)/2    (2.30)

and the variance is given by

V ar(X) = (b − a)²/12    (2.31)

The uniform distribution is too simple to realistically model actual experimental data, but
is very useful for computer simulation, as random quantities from different distributions can be
obtained from U (0, 1) random quantities.
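As a sketch of that last point (an added Python 3 illustration), a U (0, 1) draw u can be pushed
through the inverse CDF of a target distribution; for the exponential distribution of the next
subsection, x = − ln(1 − u)/λ.

# Inverse-transform sampling: turn U(0,1) draws into Exp(lambda) draws.
import random
from math import log

def exponential(lam):
    u = random.random()            # u ~ U(0, 1)
    return -log(1 - u) / lam       # inverse of F(x) = 1 - exp(-lam*x)

samples = [exponential(0.5) for _ in range(100_000)]
print(sum(samples) / len(samples))   # approximately 1/lambda = 2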

2.3.3 The Exponential Distribution

Definition 2.12. The random variable X has an exponential distribution with parameter
λ > 0, written

X ∼ Exp(λ)

if it has PDF

f (x) = λe^(−λx) for x ≥ 0,    f (x) = 0 otherwise    (2.32)

and CDF

F (x) = 0 for x < 0,    F (x) = 1 − e^(−λx) for x ≥ 0    (2.33)

The expectation of the exponential distribution X ∼ Exp(λ) is given by

E(X) = 1/λ    (2.34)

and the variance by

V ar(X) = 1/λ²    (2.35)

2.3.4 The Normal Distribution

Definition 2.13. A random quantity X has a normal distribution with parameters µ and
σ², written

X ∼ N (µ, σ²)

if it has PDF

f (x) = (1/(σ√(2π))) exp(−(1/2)((x − µ)/σ)²),    −∞ < x < ∞    (2.36)

for σ > 0, and CDF

F (x) = P (X ≤ x) = Φ((x − µ)/σ)    (2.37)

Note that P (a < X ≤ b) = Φ((b − µ)/σ) − Φ((a − µ)/σ).

The expectation of the normal distribution X ∼ N (µ, σ²) is given by

E(X) = µ    (2.38)

and its variance by

V ar(X) = σ²    (2.39)
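Probabilities for the normal distribution are computed through Φ; a short added sketch
(Python 3, writing Φ with math.erf) is given below.

# Phi(z) via the error function, and P(a < X <= b) for X ~ N(mu, sigma^2).
from math import erf, sqrt

def Phi(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 0, 1
a, b = -1, 1
print(round(Phi((b - mu) / sigma) - Phi((a - mu) / sigma), 4))   # 0.6827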

2.3.5 Exercises

1. Let X be a discrete random variable with the following PMF

f (x) = 0.1 for x = 0.2,
       0.2 for x = 0.4,
       0.2 for x = 0.5,
       0.3 for x = 0.8,
       0.2 for x = 1.0,
       0 otherwise

(a) Find RX , the range of the random variable X.


(b) Find the CDF of X.
(c) Find P (X ≤ 0.5)
(d) Find P (0.25 < X < 0.75)
(e) Find P (X = 0.2|X < 0.6)
(f) Find the expectation and variance of X.

2. Mary rolls two fair dice and observes two numbers X and Y .

(a) Find RX , RY and the PMFs of X and Y .


(b) Find the CDF of X and Y
(c) Find P (X = 2, Y = 6)
(d) Find P (X > 3|Y = 2)
(e) Let Z = X + Y . Find the range and PMF of Z.
(f) Find P (X = 4|Z = 8)

3. Let X be a discrete random variable with the following PMF

f (k) = 1/4 for k = −2,
       1/8 for k = −1,
       1/8 for k = 0,
       1/4 for k = 1,
       1/4 for k = 2,
       0 otherwise

Define a new random variable Y as Y = (X + 1)².



(a) Find the range of Y


(b) Find the PMF of Y
(c) Find E(Y ), V ar(X) and SD(X).

4. Let X be a discrete random variable with range RX = {1, 2, 3, ...}. Suppose the PMF of
X is given by

f (x) = 1/2^x for x = 1, 2, 3, ...

(a) Find P (2 < X ≤ 5)
(b) Find P (X > 4)

5. Let X be a continuous random variable with the following PDF


fX (x) = ce^(−x) for x ≥ 0,    fX (x) = 0 otherwise

where c is a positive constant.

(a) Find c
(b) Find the CDF of X, FX (x)
(c) Find P (1 < X < 3)

6. Let X be a continuous random variable with PDF

fX (x) = 2x,    0 ≤ x ≤ 1

Find the expected value and variance of X.

7. Let X be a continuous random variable with PDF


fX (x) = 3/x^4 for x ≥ 1,    fX (x) = 0 otherwise

Find the mean and variance of X.

8. Let X be uniformly distributed with X ∼ U (0, 1), and let Y = e^X

(a) Find the CDF of Y


(b) Find the PDF of Y
(c) Find E(Y )

9. Let X be a continuous random variable with PDF


fX (x) = 4x^3 for 0 < x ≤ 1,    fX (x) = 0 otherwise

Find P (X ≤ 2/3 | X > 1/3).

10. Let X ∼ N (−5, 4)

(a) Find P (X < 0)


(b) Find P (−7 < X < −3)
(c) Find P (X > −3|X > −5)

11. Exercise 5.35 of William Mendenhall, Robert and Barbara Beaver

12. Exercise 5.43 of William Mendenhall, Robert and Barbara Beaver

13. Exercise 5.83 of William Mendenhall, Robert and Barbara Beaver

14. Exercise 6.27 of William Mendenhall, Robert and Barbara Beaver
