
PROBABILITY AND STATISTICS 2

UNIT 1

Definition
Probability is defined as a measure of the chance of things happening or not happening,
or a measure of the likelihood of the occurrence or non-occurrence of an event.

Definition of terms

Random Experiment: Any process whose outcome cannot be predicted
with certainty, e.g. tossing a fair die once.

Sample Space (S): The set of all possible outcomes of an experiment. E.g.
when we toss a fair die once the sample space S is given by
S = {1, 2, 3, 4, 5, 6}, n(S) = 6

Event (E): This is an outcome or a set of outcomes of an experiment. E.g.
when we toss a fair die once, the event that an even number shows
up is given by E = {2, 4, 6}, n(E) = 3. An event is always a subset of the
sample space.

AXIOMS OF PROBABILITY

1. If E is an event then the probability of E lies between 0 and 1 inclusive.

Mathematically 0 ≤ P ( E ) ≤ 1

Boundary cases:

P(E) = 1: the event is a sure event.

P(E) = 0: the event is an impossible event.

Proof of Axiom 1

Recall that

E ⊆ S

Take probabilities of both sides (probability is monotone over subsets):

P(E) ≤ P(S), but P(S) = 1,

meaning P(E) ≤ 1 ………………(4)

Also recall that ∅ ⊆ E,

so P(∅) ≤ P(E), but P(∅) = 0,

meaning 0 ≤ P(E) ………….(5)

Combining eqns (4) and (5), writing (5) first and then (4):

0 ≤ P(E) and P(E) ≤ 1,

meaning 0 ≤ P(E) ≤ 1, as required.

2. The probability that an event E would occur, denoted by P(E), and the
probability that E would not occur, denoted by P(E′), sum to 1.
Mathematically

P(E) + P(E′) = 1 ……………………….(1)

Results from (1)

P(E) = 1 − P(E′)

P(E′) = 1 − P(E)

Proof of axiom 2

Recall that E ∪ E′ = S ……………(1)

Since E and E′ are disjoint, n(E) + n(E′) = n(S)

Divide through by n(S):

P(E) + P(E′) = P(S) ……………………(2)

But P(S) = 1

⇒ P(E) + P(E′) = 1, as required.

3. For any events E and F in a given sample space

P(E ∪ F) = P(E) + P(F) − P(E ∩ F)

The theorem above is often called the addition rule of probability.

This would be proved in class

4. If S is the sample space for any experiment, then

P(S) = 1

Note: in set language, "or" corresponds to union (∪) and addition (+),
while "and" corresponds to intersection (∩) and multiplication (×).

Elementary Theorem

P ( ∅ )=0

CLASSICAL DEFINITION OF PROBABILITY

In an experiment for which all possible outcomes in the sample space (S) are equally
likely to occur,

the probability of the event E (meaning the likelihood that the event E would
occur) is given by

P(E) = n(E)/n(S) = (number of elements in E)/(number of elements in the sample space S)

Basic Example

A fair die is tossed once.

a) List the elements in the sample space


b)Find the probability that
(i) An even number shows up
(ii) An odd number shows up

Soln

a) S = {1, 2, 3, 4, 5, 6}, n(S) = 6

b (i) P(even number) = n(even number)/n(S) = 3/6 = 0.5

(ii) P(odd number) = n(odd number)/n(S) = 3/6 = 0.5
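As a quick check (not part of the original notes), here is a minimal Python sketch of the same counting argument; the sample space is listed explicitly and the classical formula n(E)/n(S) is applied.

from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}                # sample space for one toss of a fair die
E = {x for x in S if x % 2 == 0}      # event: an even number shows up

print(Fraction(len(E), len(S)))       # 1/2, i.e. 0.5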

Example

A fair coin is tossed 2 times.

a) List the elements in the sample space


b)Find the probability that
(i) two heads show up
(ii) At least one head shows up.

Soln.

      H    T

H    HH   HT

T    TH   TT

a) S = {HH, HT, TH, TT}, n(S) = 4

b (i) P(two heads show up) = n(two heads show up)/n(S) = 1/4 = 0.25

(ii) P(at least one head) = n(at least one head)/n(S) = 3/4 = 0.75
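For larger numbers of tosses, enumerating by hand becomes tedious; the sketch below (an illustration, not part of the original notes) builds the sample space with itertools and counts the same two events. Changing repeat=2 to repeat=3 handles the three-toss trial example that follows.

from itertools import product
from fractions import Fraction

S = list(product("HT", repeat=2))              # [('H','H'), ('H','T'), ('T','H'), ('T','T')]
two_heads    = [s for s in S if s.count("H") == 2]
at_least_one = [s for s in S if s.count("H") >= 1]

print(Fraction(len(two_heads), len(S)))        # 1/4 = 0.25
print(Fraction(len(at_least_one), len(S)))     # 3/4 = 0.75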

Examples

1. What is the probability of obtaining an odd or prime number if a fair die is


tossed once.

Solution

S = {1, 2, 3, 4, 5, 6}, n(S) = 6

Let E denote the event that an odd number shows up.

E = {1, 3, 5}, n(E) = 3

Let F denote the event that a prime number shows up.

F = {2, 3, 5}, n(F) = 3

E ∩ F = {3, 5}, n(E ∩ F) = 2

P(E or F) = P(E) + P(F) − P(E ∩ F)

= 3/6 + 3/6 − 2/6 = 2/3

• Example (TRIAL)

A fair coin is tossed 3 times.


a) List the elements in the sample space
b)Find the probability that
(i) two heads show up

(ii) At least one head shows up.

(iii) At most two heads show up


(iv) No head shows up

NOTE

S = {HHH, HHT, HTT, THH, TTH, HTH, THT, TTT}.


n(s) =8

Generally, the number of elements in the sample space for a fair coin tossed n times is 2^n, where n is the
number of times the coin is tossed.

Generally, the number of elements in the sample space for a fair die tossed n times is 6^n, where n is the
number of times the die is tossed.
Also, tossing one die twice is the same as tossing two dice once.
Also, tossing one die thrice is the same as tossing three dice once.

Example
Two fair dice red and black are tossed together.
a) List the elements in the sample space
b)Find the probability of obtaining
(i) a 3 on either die
(ii) A score of 10
(iii) a 5 and a score of 9
(iv) A 5 on the red die and a score of 10

SAMPLE SPACE

Red \ Black   1     2     3     4     5     6
1 1,1 1,2 1,3 1,4 1,5 1,6
2 2,1 2,2 2,3 2,4 2,5 2,6
3 3,1 3,2 3,3 3,4 3,5 3,6
4 4,1 4,2 4,3 4,4 4,5 4,6
5 5,1 5,2 5,3 5,4 5,5 5,6
6 6,1 6,2 6,3 6,4 6,5 6,6

n(s)=36

SOLN

i) P(3 on either die) = n(3 on either die)/n(S) = 11/36

ii) P(score of 10) = n(score of 10)/n(S) = 3/36

iii) P(a 5 and (∩) a score of 9) = 2/36

(iv) P(a 5 on the red die and (∩) a score of 10) = 1/36
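These counts can be verified by brute-force enumeration; the short Python sketch below (an illustration, not from the original notes) lists all 36 (red, black) outcomes and counts each event.

from itertools import product
from fractions import Fraction

S = list(product(range(1, 7), repeat=2))   # 36 equally likely (red, black) outcomes

three_on_either = [(r, b) for r, b in S if r == 3 or b == 3]
score_of_10     = [(r, b) for r, b in S if r + b == 10]
five_and_nine   = [(r, b) for r, b in S if (r == 5 or b == 5) and r + b == 9]
five_red_ten    = [(r, b) for r, b in S if r == 5 and r + b == 10]

for event in (three_on_either, score_of_10, five_and_nine, five_red_ten):
    print(len(event), Fraction(len(event), len(S)))
# 11 11/36, 3 1/12, 2 1/18, 1 1/36 (the fractions print in lowest terms)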

Since an event is a subset of the sample space, we can combine events to form new
events using the various set operations. The sample space is considered as the universal
set. If A and B are two events defined on the sample space, then

(1) A ∪ B denotes the event A or B or both. Thus the event A ∪ B occurs if either A
occurs or B occurs or both A and B occur.
(2) A ∩ B denotes the event both A and B. Thus the event A ∩ B occurs if both A
and B occur.
(3) Ā, A′ or Aᶜ denotes the complement of A, the event which occurs if and only if A does not occur.

De Morgan’s Law

Venn diagrams are often used to verify relationships among sets thereby making it
unnecessary to give formal proofs based on the algebra of sets.

To illustrate, let us show that (A ∪ B)′ = A′ ∩ B′, which expresses the fact that the
complement of the union of two sets equals the intersection of their respective
complements.

We would use a diagram to explain this in class

Summary for De Morgan’s laws

1. (A ∪ B)′ = A′ ∩ B′

2. (A ∩ B)′ = A′ ∪ B′

We can also use Venn diagrams to verify the following

Two-set problems

If A and B are two events defined on a sample space S, then S can be split into the
following four mutually exclusive events:

A ∩ B, A′ ∩ B, A ∩ B′ and A′ ∩ B′

Notice that A = (A ∩ B′) ∪ (A ∩ B)

Since A ∩ B′ and A ∩ B are mutually exclusive,

P(A) = P(A ∩ B′) + P(A ∩ B) ………………………….(1)

Similarly,

P(B) = P(A′ ∩ B) + P(A ∩ B) ……………………………..(2)

Moreover,

(A ∩ B′) ∪ (A ∩ B) ∪ (A′ ∩ B) ∪ (A′ ∩ B′) = S

And since the four events are mutually exclusive,

P(A ∩ B′) + P(A ∩ B) + P(A′ ∩ B) + P(A′ ∩ B′) = P(S) = 1

From eqn (1) above,

P(A ∩ B′) = P(A) − P(A ∩ B) ……………………… (4)

Also from eqn (2) above,

P(A′ ∩ B) = P(B) − P(A ∩ B) …………………………..(5)

Example

The probability that a new airport will get an award for its design is 0.04, the
probability that it would get an award for the efficient use of material is 0.2 and the
probability that it would get both awards is 0.03. Find the probability that it will get

(a) At least one of the awards


(b) Only one of the two awards
(c) None of the awards

Soln

Let D denote the event that the airport would get an award for its design,

E be the event that the airport would get an award for its efficient use of materials.

P(D) = 0.04, P(E) = 0.2 and P(D ∩ E) = 0.03

(a) We wish to find P(D ∪ E)

P(D ∪ E) = 0.04 + 0.2 − 0.03 = 0.21

(b) Probability that it would get only one of the awards is

P(D ∩ E′) + P(D′ ∩ E) = 0.01 + 0.17 = 0.18

(c) P(D′ ∩ E′) = P(D ∪ E)′ = 1 − P(D ∪ E) = 1 − 0.21 = 0.79

3. If A, B and C are events defined on the sample space, then

(A ∪ B ∪ C)′ = A′ ∩ B′ ∩ C′
(A ∩ B ∩ C)′ = A′ ∪ B′ ∪ C′
(A′ ∪ B′ ∪ C)′ = A ∩ B ∩ C′

Mutually exclusive events

The events E and F are said to be mutually exclusive if they cannot occur together,
meaning they are disjoint. Mathematically

P(E ∩ F) = 0

Recall from the total probability rule that if E and F are two events defined on a
sample space S, then the probability that the event E or F or both would
occur (meaning at least one must occur) is given by

P(E ∪ F) = P(E) + P(F) − P(E ∩ F)

Results from the above

If E and F are mutually exclusive then P(E ∩ F) = 0

⇒ P(E ∪ F) = P(E) + P(F)

This is often referred to as the addition rule of probability as well, meaning that for two
mutually exclusive events the total probability rule reduces to simply adding their
probabilities.

Example

What is the probability of obtaining a total of 7 or 11 when a pair of fair dice is
thrown once?

SOLN
SAMPLE SPACE

      1     2     3     4     5     6
1    1,1   1,2   1,3   1,4   1,5   1,6
2    2,1   2,2   2,3   2,4   2,5   2,6
3    3,1   3,2   3,3   3,4   3,5   3,6
4    4,1   4,2   4,3   4,4   4,5   4,6
5    5,1   5,2   5,3   5,4   5,5   5,6
6    6,1   6,2   6,3   6,4   6,5   6,6

P(T7 or T11) = P(T7) + P(T11) − P(T7 ∩ T11)

n(T7) = 6, n(T11) = 2, n(T7 ∩ T11) = 0

⇒ P(T7 or T11) = P(T7) + P(T11) = 6/36 + 2/36 = 2/9

This indicates that the two events are mutually exclusive.

Independent Events

Two events E and F are said to be independent if the occurrence or non occurrence of
one does not affect the occurrence or non occurrence of the other.

Mathematically P(E ∩ F) = P(E) × P(F)

Also, if P(E/F) = P(E), then E is independent of F.

Results from the above

Recall from the total probability rule that if E and F are two events defined on a
sample space S, then the probability that the event E or F or both would
occur (meaning at least one must occur) is given by

P(E ∪ F) = P(E) + P(F) − P(E ∩ F)

If E and F are independent, then P(E ∩ F) = P(E) × P(F)

⇒ P(E ∪ F) = P(E) + P(F) − P(E) × P(F)

The identity P(E ∩ F) = P(E) × P(F) is often referred to as the multiplication rule of probability.
CONDITIONAL PROBABILTY

In all examples so far, a sample space was defined and all probabilities were calculated
with respect to that sample space. In many instances, however, we are able to update the
sample space based on new information.

Example

Four cards are drawn one after the other without replacement from the top of a well
shuffled deck. What is the probability that they are the four kings?

Solution

The prob. that the first card is a king is 4/52. Given that the first card is a king,
the prob. that the second card is a king is 3/51. Given that the first two cards are kings,
the prob. that the third card is a king is 2/50. Given that the first three cards are kings,
the prob. that the 4th card is a king is 1/49.

⇒ the prob. that the first four cards are kings = (4/52) × (3/51) × (2/50) × (1/49)
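A quick way to evaluate this product exactly (an illustrative sketch, not part of the notes) is with Python's Fraction type:

from fractions import Fraction

p = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50) * Fraction(1, 49)
print(p)           # 1/270725
print(float(p))    # roughly 3.69e-06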

DEFINITION

If E and F are any events of a sample space S and P(F) > 0, then the probability that
the event E would occur given (/) that F has already occurred is denoted P(E/F), where

P(E/F) = P(E ∩ F) / P(F)

Note that P(E/F) can also be written as

P(E/F) = n(E ∩ F) / n(F)

Note: the key words for conditional probability are [Given, If, Supposed]. We
replace them with the slash (/).

Example

Two fair dice are thrown once . Given that the first one shows a three , what is the
probability that the sum is greater than six.

Soln

Let F be the event that the first one shows a three:

F = {(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6)}, n(F) = 6

Let E be the event that the sum is greater than 6.

E ∩ F = {(3, 4), (3, 5), (3, 6)}, n(E ∩ F) = 3

P(E/F) = n(E ∩ F)/n(F) = 3/6 = 1/2

Independent and dependent Events (Detailed Illustrations)

Suppose we calculate P(E/F) and find P(E/F) = P(E); then it implies that P(E) is
unaffected by the occurrence or non-occurrence of F. In such a situation we say E is
independent of F. If E is independent of F then F is independent of E. If E and F are
not independent, then they are dependent.

Proof

We know that P(E/F) = P(E ∩ F)/P(F) ………………….(1)

If E and F are independent then

P(E/F) = P(E) ……………………(2)

From (1) and (2),

P(E ∩ F) = P(E) × P(F)

THE TOTAL PROBABILITY AND BAYES’S THEOREM

Let E1, E2, …, En be mutually exclusive events, none of which has zero probability and
at least one of which must occur. Then for any event F (connected to E1, E2, …, En), the total
probability is given by

P(F) = P(F/E1)P(E1) + P(F/E2)P(E2) + … + P(F/En)P(En)

P(F) = Σ P(F/Ei)P(Ei), the sum running over i = 1, …, n
i=1

Example

Three machines x, y and z are used to produce greeting cards. During a day's
production machine x produces 720 cards, y produces 432 and z produces 288. The
probability of x producing a defective card is 0.02, that of y is 0.1
and that of z is 0.05. Find the probability that at the end of the day one card selected at
random would be defective.

Soln

Let D represent a defective card

Let P(x) denote the prob that x produced a card

Let P(y) denote the prob that y produced a card

Let P(z) denote the prob that z produced a card

Total number of cards = 720 + 432 + 288 = 1440 = n(S)

P(x) = 720/1440 = 0.5

P(y) = 432/1440 = 0.3

P(z) = 288/1440 = 0.2

P(D) = P(D/x)P(x) + P(D/y)P(y) + P(D/z)P(z)

= 0.02(0.5) + 0.1(0.3) + 0.05(0.2)

= 0.05
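The same weighted sum can be written in a few lines of Python (a sketch for checking the arithmetic, not part of the original notes):

priors      = {"x": 0.5, "y": 0.3, "z": 0.2}     # P(machine produced the card)
defect_rate = {"x": 0.02, "y": 0.1, "z": 0.05}   # P(D / machine)

P_D = sum(defect_rate[m] * priors[m] for m in priors)
print(round(P_D, 4))                             # 0.05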

Bayes' Theorem

Bayes' theorem is used to update a conditional probability based on new evidence

available.

Let E1, E2, …, En be a collection of n mutually exclusive events such that

E1 ∪ E2 ∪ … ∪ En = S and

Ei ∩ Ej = ∅ for i ≠ j.

Let F be an event associated with S such that P(F) > 0.

Then for i = 1, 2, …, n,

P(Ei/F) = P(F/Ei)P(Ei) / Σ P(F/Ej)P(Ej), where the sum in the denominator runs over j = 1, …, n.
Example

A consulting firm rents cars from three agencies: 30% from agency A, 20% from
agency B and 50% from agency C. 15% of the cars from A, 10% of the cars from B and
6% of the cars from C have bad tyres. If a car rented by the firm has bad tyres, find
the probability that it came from C.

Soln

Let E1 denote the event that the car came from agency A,

E2 denote the event that the car came from agency B,

E3 denote the event that the car came from agency C.

Let F denote the event that a car rented by the firm has bad tyres.

We wish to find P(E3/F). Now P(E1) = 0.3, P(E2) = 0.2, P(E3) = 0.5

P(F/E1) = 0.15, P(F/E2) = 0.1, P(F/E3) = 0.06

P(E3/F) = P(F/E3)P(E3) / [P(F/E1)P(E1) + P(F/E2)P(E2) + P(F/E3)P(E3)]

= (0.5 × 0.06) / (0.3 × 0.15 + 0.2 × 0.1 + 0.5 × 0.06)

= 0.3158
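For checking, here is the same Bayes computation as a short Python sketch (illustrative only):

priors = {"A": 0.3, "B": 0.2, "C": 0.5}       # P(agency)
bad    = {"A": 0.15, "B": 0.10, "C": 0.06}    # P(bad tyres / agency)

total_bad   = sum(bad[a] * priors[a] for a in priors)   # total probability of bad tyres
posterior_C = bad["C"] * priors["C"] / total_bad
print(round(posterior_C, 4))                  # 0.3158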

Example 2

It is estimated that there is a 20% chance that unemployment would increase by more
than 1% next year. If this increase does occur, then there would be a 90% chance
that congress would enact a federally funded job programme. Otherwise the probability
of such a programme being funded is 30%. Suppose that the job programme was
funded by congress.

(a) What is the probability that unemployment would increase by more than 1%?
(b) What is the probability that unemployment would not increase by more than
1%?

Soln

Let E be the event that unemployment will increase by more than 1%,

E′ denote the event that unemployment would not increase by more than 1%,

F be the event that congress enacts a job programme,

F′ be the event that congress will not enact a job programme.

P(E) = 0.2

P(E′) = 0.8

P(F/E) = 0.9

P(F′/E) = 0.1

P(F/E′) = 0.3

P(F′/E′) = 0.7

(a) P(E/F) = P(F/E)P(E) / [P(F/E)P(E) + P(F/E′)P(E′)]

= (0.9 × 0.2) / (0.9 × 0.2 + 0.3 × 0.8) = 0.43

(b) P(E′/F) = 1 − P(E/F)

= 1 − 0.43 = 0.57

TRIAL QUESTIONS

1. The Venn diagram below shows the sports that members of the
KNUST Sports Club participate in: Bowls (B), Tennis (T) and
Darts (D). The extra information below can be used to complete the
diagram.
n(T ∩ B) = 24, n(B ∪ D ∩ T) = 55, n(…)

a) Complete this Venn diagram.

Using the Venn diagram above or otherwise find the probability


that a member chosen at random:
b) Plays Bowls.

c) Does not play Tennis or Darts.

d) Plays Tennis but not Darts.

e) Plays Darts given the person plays Bowls.

f) Plays Tennis if the person plays Bowls but not Tennis.


[ANS: (b) 63/129 (c) 46/129 (d) 21/129 (e) 33/63 (f) 0/38 = 0]

2. If A and B are mutually exclusive events and it is known that

P(A) = 0.20, P(B) = 0.30. Estimate
(a) P(A′)  (b) P(B′)  (c) P(A ∪ B)  (d) P(A′ ∩ B′)
[ANS: (d) = 0.5]

3. Suppose in question 2 the events A and B are not mutually

exclusive; re-evaluate the probabilities assuming A and B are
independent. [ANS: (d) 0.56]
4. If A and B are disjoint events and P(A) = 0.3 and P(B) = 0.6,
find (a) P(A′ ∪ B′)  (b) P(A′ ∪ B)  (c) P(A′ ∩ B)
Note: if A and B are disjoint it doesn't mean A′ and B are also
disjoint.
[ANS: (a) 1  (b) 0.7  (c) 0.6]

5. An insurance company has insured 4000 doctors, 8000 teachers


and 12000 businessmen. The chances of a doctor, teacher and
businessman dying before the age of 58 is 0.01, 0.03 and 0.05,
respectively. If one of the insured people dies before 58, find the
probability that he is a doctor. [ANS: 1/22]
6. A manufacturing company has two plants, 1 and 2. Plant 1
produces 40% of the company’s output and plant 2 produces the
other 60% . Of the output produced by plant 1 , 95% are good
and of that produced by plant 2 , 10% are defective. If a product
is randomly selected from the output of this company, what is
the probability that the output would be good. [ANS: 0.92]

7. Refer to question 6 above. If a product selected at random was not

defective (good), what is the probability that it came from plant
1? [ANS: 0.4130]

8. M&R Electronics World is considering marketing a new model


of television. In the past, 40% of the new model televisions have
been successful and 60% have been unsuccessful. Before
introducing the new model television , the marketing research
department conducts an extensive study and releases a report
either favourable or unfavourable . In the past 80% of the
successful new model televisions had received favourable
market research reports , and 30% of the unsuccessful new
model televisions had received favourable reports . If the
marketing research department had issued a favourable report
for the new model of television under consideration, what is the
probability that the television would be successful? [ ANS:
0.64]

UNIT 2

PROBABILITY DISTRIBUTION OF A DISCRETE RANDOM VARIABLE

Experiments whose outcomes cannot be determined in advance are known as Random


Experiments. Our main interest here is to consider the random experiments whose
outcomes are discrete. In this session therefore, we will define key terms such as
variable, random variable, and discrete random variable. We will also study how to
find the probability distribution of a discrete random variable.

2.1 Random Variable

In order to understand the concept of probability distribution, we need to explain the


term random variable. A variable is any characteristic of a population or sample that
possesses different numerical values or categories. It is often of interest to the
researcher in an experiment . For instance, when a fair die is rolled, the characteristic
that may interest us is the number that appears.

Consider an experiment for which the sample space is denoted by S. A real-valued


function that is defined on the sample space S is called a random variable. In other
words, a random variable, usually written X, is a variable whose possible values are
numerical outcomes of a random phenomenon. Random events are events that are
unpredictable in the short run but show a pattern over many occurrences in the long
run, e.g. the outcomes of tossing a coin, rolling a die or drawing a card can
be modelled as random variables. As mentioned earlier, we will use
upper case letters such as X to denote a random variable and lower case letters such
as x to denote a particular value that a random variable may assume. Now with this, let
us define a random variable in a more conventional way.

Definition

Let S be the sample space associated with some experiment. A random variable X
is a function that assigns a real number X(s) to each sample element s ∈ S.

Example 1

Consider the experiment of tossing a fair coin three times. Define the random variable
X, to be the number of heads that showed up.

Solution

Let us denote H by a head showed up and T by a tail showed up, assuming we have
head at one side and tail at the other side of the coin. We can then represent the sample
space, S, by S = {HHH, HHT, HTT, THH, TTH, HTH, THT, TTT}.

HHH means that the first, second and third tosses showed head in that order. HHT
means that the first toss showed a head, the second showed a head and the third showed
a tail. Discover the meanings of the remaining on your own. Since the characteristic of
interest is the number of heads obtained, we only need to count the number of heads in
the three tosses; hence the possible values are 0, 1, 2 and 3 heads.
Thus, the random variable X could be written as {X/x = 0, 1, 2, 3}. This set forms the
range of the random variable X. Each possible value x ∈ X represents an event. For
instance, the event that one head appeared, written as {X/x = 1}, is simply the set

{HTT, TTH, THT}

2.2 Discrete Random Variable

A random variable which takes on a finite or countably infinite number of
values is called a Discrete Random Variable. This means that the random variable is
defined over a discrete sample space. Example 1 is an example of a discrete random
variable. If a random variable is not discrete then it is continuous.

Example 2

Two fair dice are tossed simultaneously. Define the random variable, X, as the sum of
numbers that showed up.

Solution

On each die the numbers expected are 1, 2, 3, 4, 5, or 6. If we represent one of the dice
by A and the other by B, then the sample space can be constructed in a table form as

A \ B   1     2     3     4     5     6
1      1,1   1,2   1,3   1,4   1,5   1,6
2      2,1   2,2   2,3   2,4   2,5   2,6
3      3,1   3,2   3,3   3,4   3,5   3,6
4      4,1   4,2   4,3   4,4   4,5   4,6
5      5,1   5,2   5,3   5,4   5,5   5,6
6      6,1   6,2   6,3   6,4   6,5   6,6

Since the random variable is the sum of numbers on the two dice, the highest value is
12 and the lowest value is 2. Therefore, the discrete random variable for this
experiment is given as {X/x = 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}

2.3 Probability Distribution of a Discrete Random Variable

If X is a discrete random variable then the function given by f(x) = P(X = xi),

i = 1, 2, …, for each xi within the range of X, is called a discrete probability

distribution. The distribution function is known as the probability mass function (pmf).
The probability distribution of a discrete random variable can be presented in tabular
form, formula form (in the form of an equation) or graphical form.

2.3.1 Tabular Form

The table comprises the possible values of the random variable, X, and their
corresponding probabilities, P(X = xi).

Possible values of X:   x1      x2      ……………   xk

P(X = x):               P(x1)   P(x2)   ..………   P(xk)

For instance, the probability distribution for Example 1, will be given by

(X = xi)   0     1     2     3

P(xi)      1/8   3/8   3/8   1/8

That is, for the probability of no head, P(X = 0), we have TTT. For a fair coin,

P(TTT) = 1/8

Similarly, P(X = 1) implies THT, TTH and HTT. Therefore, P(getting one head) will
be given by

P(THT) + P(TTH) + P(HTT) = 1/8 + 1/8 + 1/8 = 3/8

Example 3

Construct the probability distribution for Example 2

Solution

We can convert the sample space in Example 2 to show the sum of the numbers that
appeared. Thus, the sample space becomes

A \ B   1   2   3   4   5   6
1 2 3 4 5 6 7
2 3 4 5 6 7 8
3 4 5 6 7 8 9
4 5 6 7 8 9 10
5 6 7 8 9 10 11
6 7 8 9 10 11 12

The probabilities are calculated by counting the same numbers in the sample space and
dividing by 36. We divide by 36 because in the sample space we have 36 elements.

X =        2     3     4     5     6     7     8     9     10    11    12

P(X = xi)  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

2.3.2 Formula Form

The probability distribution can also be in an equation form. This is expressed in the
form P(X = x) = f(x), where f(x) is a function. An example of a probability
distribution function is given as

P(x) = f(x) = x/6,   x = 1, 2, 3

             = 0,    otherwise

This is often called a Probability Mass Function, as mentioned earlier.

At this point we need to know when a function qualifies to be called probability Mass
Function (Probability Distribution).

A function is called Probability Mass Function (Probability Distribution) if it satisfies


the following properties (conditions):

Property 1: P(X = xi) ≥ 0 for i = 1, 2, …

Property 2: Σ P(X = xi) = 1, the sum being taken over all values of x within its domain.

From these properties, it is obvious that Example 3 is a probability distribution: the

reason is that all the probabilities there are positive and the summation of the
probabilities is equal to 1.

Let us discuss one more example on the properties.

Example 4

Check if the function given by

f(x) = (x + 2)/25,   x = 1, 2, 3, 4, 5

     = 0,            otherwise

is a probability mass function.

Solution

When we put the values of x into the function, we have

x      1      2      3      4      5

f(x)   3/25   4/25   5/25   6/25   7/25

Property 1: All the probabilities are positive; hence, this condition is satisfied.

Property 2: if we sum all the probabilities, they sum up to 1; hence, this condition is also
satisfied. We can therefore conclude that the function is a probability mass function.
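Both properties can be checked mechanically; the following Python sketch (illustrative, not part of the notes) does so with exact fractions.

from fractions import Fraction

f = {x: Fraction(x + 2, 25) for x in range(1, 6)}   # the pmf of Example 4

non_negative = all(p >= 0 for p in f.values())      # Property 1
sums_to_one  = sum(f.values()) == 1                 # Property 2
print(non_negative, sums_to_one)                    # True True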

2.3.3 Probability Graph Form

A graph of P(X = xi) against xi is called a probability graph. We usually draw vertical
lines or bars above the possible values xi of the random variable X on the horizontal
axis. The graph is drawn as follows.

[Probability graph: vertical lines of heights P(x1), P(x2), P(x3), …, P(xk) drawn above the points x1, x2, x3, …, xk on the horizontal x-axis, with f(x) on the vertical axis.]

Let us demonstrate this with an example

Example 5

A discrete function is given by

f(x) = (x − 1)/9,   x = 3, 4, 5

     = 0,           otherwise

a. Show that function f (x) is a probability mass function.


b. Draw a graph for this distribution.

Solution

Let us construct a table for the distribution.

For x = 3, f(3) = (1/9)(3 − 1) = 2/9

For x = 4, f(4) = (1/9)(4 − 1) = 3/9

For x = 5, f(5) = (1/9)(5 − 1) = 4/9

The probability distribution table is therefore constructed as follows

x      3     4     5

f(x)   2/9   3/9   4/9

a. Now, all the values of f(x) are positive. Also, Σ (x = 3 to 5) f(x) = 1; hence the function is

a probability mass function.

b. [Probability graph: vertical bars of heights 2/9, 3/9 and 4/9 above the points x = 3, 4 and 5 respectively, with f(x) on the vertical axis.]

Cumulative distribution function

There are many problems where we may wish to compute the probability that the
observed value of the random variable X will be less than or equal to some real number
x. E.g. what are the chances that a certain candidate will get not more than 30% of the
votes? What are the chances that the price of gold would remain at or below 800 USD
per ounce? Writing F(x) = P(X ≤ x) for every real number x, we define F(x) to be the
cumulative distribution function of X, or simply the distribution function of the random
variable X.

The Cumulative distribution function of discrete random variable

The cumulative distribution function F(x) of a discrete random variable X with

probability mass function f(x) is defined by

F(x) = P(X ≤ x) = Σ f(xi), the sum being taken over all xi ≤ x.

If X takes on only a finite number of values x1, x2, …, xn, then the cumulative distribution
function of X is given by

F(x) = 0,                                 −∞ < x < x1
     = f(x1),                             x1 ≤ x < x2
     = f(x1) + f(x2),                     x2 ≤ x < x3
     ⋮
     = f(x1) + f(x2) + … + f(xn) = 1,     xn ≤ x < ∞

Example

The following table gives the probability mass function of X . Find the cumulative
distribution function of X and sketch its graph.

X 0 1 2 3 4
f(x) 1/16 1/4 3/8 1/4 1/16

Soln

If x < 0,       F(x) = 0

If 0 ≤ x < 1,   F(x) = f(0) = 1/16

If 1 ≤ x < 2,   F(x) = f(0) + f(1) = 1/16 + 1/4 = 5/16

If 2 ≤ x < 3,   F(x) = f(0) + f(1) + f(2) = 1/16 + 1/4 + 3/8 = 11/16

If 3 ≤ x < 4,   F(x) = f(0) + f(1) + f(2) + f(3) = 1/16 + 1/4 + 3/8 + 1/4 = 15/16

If x ≥ 4,       F(x) = f(0) + f(1) + f(2) + f(3) + f(4) = 1/16 + 1/4 + 3/8 + 1/4 + 1/16 = 1

The cumulative distribution function is given by

F(x) = 0,       x < 0
     = 1/16,    0 ≤ x < 1
     = 5/16,    1 ≤ x < 2
     = 11/16,   2 ≤ x < 3
     = 15/16,   3 ≤ x < 4
     = 1,       x ≥ 4

Notice that even if the random variable X can assume only integer values, the cdf of X
can be defined for non-integers. For example, in the above example, F(1.5) = 5/16 and

F(2.5) = 11/16.

Solve reverse case in class
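A step-function cdf like this is easy to compute programmatically; the sketch below (illustrative, not part of the notes) builds F(x) directly from the pmf of the example.

from fractions import Fraction

pmf = {0: Fraction(1, 16), 1: Fraction(1, 4), 2: Fraction(3, 8),
       3: Fraction(1, 4), 4: Fraction(1, 16)}

def F(x):
    # cumulative distribution: sum the pmf over all points not exceeding x
    return sum(p for xi, p in pmf.items() if xi <= x)

print(F(1.5), F(2.5), F(10))   # 5/16 11/16 1, matching the worked example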

Properties of Cumulative Distribution Function

1. The function F(x) is a probability; consequently

0 ≤ F(x) ≤ 1

for all x ∈ (−∞, ∞), an open interval.

2. F(x) is a nondecreasing function of x, meaning that for two particular values x1 and x2,

if x1 ≤ x2, then F(x1) ≤ F(x2).

3. The probability that a random variable X takes the value within an interval
(a,b) is equal to the increment of the distribution function in that interval

P ( a< x <b )=F ( b )−F (a)

This means that all probabilities of interest can be computed once the cumulative
distribution function F(x) is known.

Note: even though this is an open interval, for a discrete distribution we can rewrite
this for inclusive boundaries. For a continuous distribution we treat both inclusive and
exclusive boundaries the same way.

4. lim (x → +∞) F(x) = F(+∞) = 1

   lim (x → −∞) F(x) = F(−∞) = 0

5. F(x) is always right-continuous:

   lim (x → a⁺) F(x) = F(a)

Any function satisfying all five properties above is the c.d.f. of some random
variable.

Exercise

1. A fair coin is flipped four times. Let X represent the number of heads which
show up. Find the probability distribution of the random variable, X.
2. A discrete random variable, X, has probability mass function

f(x) = k(x + 2),   x = 1, 2, 3, 4, 5

     = 0,          otherwise

Find the value of the constant k

UNIT 3

MEAN AND VARIANCE OF A DISCRETE RANDOM VARIABLE

In section 1, we explained the term discrete random variable. This term helped us in the
discussions of the concept of probability distributions. In this session, we will learn
how to find the mode, the median, the mean and the variance of a discrete probability
distribution.

The Mean or Expectation of a Discrete Random Variable

The mean of a discrete probability distribution is also known as the mathematical
expectation of the distribution. It is usually used as the average value of the
probability distribution, even though the mode and the median are also considered
average values.

The mean (expectation) of the distribution is defined as

E(X) = Σ x f(x) = Σ x P(X = xi), the sums being taken over all values x of X.

This is also denoted by mx. Given that we have a probability distribution

Possible values of X:   x1      x2      …..   xk

P(X = x):               p(x1)   p(x2)   ……   p(xk)

then the expectation E(X) (or mx) is calculated as

mx = x1·p(x1) + x2·p(x2) + …… + xk·p(xk)

We now take an example to show how to calculate the mean of a given distribution.

Example 11

Suppose that the probability distribution of a discrete random variable, is given by

(X = x)   0        1        2        3

P(x)      27/125   54/125   36/125   8/125

Find the expected value (mean) of this distribution.

Solution

mx = Σ x P(X = xi) = 0·(27/125) + 1·(54/125) + 2·(36/125) + 3·(8/125)

= 0 + 54/125 + 72/125 + 24/125

= 150/125 = 1.2
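The same expectation as a one-line weighted sum in Python (illustrative, not from the notes):

from fractions import Fraction

pmf = {0: Fraction(27, 125), 1: Fraction(54, 125),
       2: Fraction(36, 125), 3: Fraction(8, 125)}

mean = sum(x * p for x, p in pmf.items())
print(mean, float(mean))   # 6/5 1.2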

Properties of the Mean

1. The mean of the distribution must be unique. This means that it should be a
single value.
2. The mean (expectation) of a constant is the constant, that is, if C is a constant
then E (C) = C
3. If C is a constant and X is a random variable then E (CX) = CE(X).

Example 12

Find the expectation of 2X of the distribution in Example 11.

Solution

Now, E(2X) = 2E(X). We see from Example 11 that E(X) = 1.2. Hence,

E(2X) = 2(1.2) = 2.4

3.2 The Variance

The variance of a distribution is one of the statistics that measures the spread or the
dispersion of the distribution about its mean. A small value of the variance is an
indication that the probability distribution is tightly concentrated around the mean, and
a large variance indicates that the probability distribution has a wide spread about the
mean.

Definition

Suppose that X is a discrete random variable with mean µ = E(X). Then the variance of
X, denoted by σ² = Var(X), is defined as

Var(X) = E[(X − µ)²] = Σ (xi − µ)² p(xi), the sum running over i = 1, …, n,

where p(xi) is the probability of the corresponding value xi. Using this
formula to compute the variance can be very tedious; hence we re-write the variance
as

Var(X) = E(X²) − [E(X)]²

       = Σ xi² p(xi) − (Σ xi p(xi))²

Laws of Variance

1. Var(C) = 0, where C is a constant
2. Var(X) = E(X²) − [E(X)]²
3. Var(CX) = C² Var(X)

Example 13

Suppose a probability distribution of a discrete random variable X, is

X       1      2      3      4      5

P(xi)   1/12   3/12   1/12   3/12   4/12

Find the variance of this distribution.

Solution

The variance is given by Var(X) = Σ xi² P(xi) − (Σ xi P(xi))²

Now, Σ xi² P(xi) = 1²·(1/12) + 2²·(3/12) + 3²·(1/12) + 4²·(3/12) + 5²·(4/12)

= 1/12 + 12/12 + 9/12 + 48/12 + 100/12

= 170/12 ≈ 14.1667

Similarly, Σ xi P(xi) = 1·(1/12) + 2·(3/12) + 3·(1/12) + 4·(3/12) + 5·(4/12)

= 42/12 = 3.5

Hence, Var(X) = 14.1667 − (3.5)²

= 1.9167

The non-negative square root of the variance of the distribution of a random variable X is

known as the standard deviation.

For instance, the standard deviation of the probability distribution in Example 13 is

given by σ = √Var(X) = √1.9167

≈ 1.3844
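These figures can be reproduced exactly with fractions, as in the sketch below (illustrative, not part of the notes); the variance comes out as 23/12 ≈ 1.9167.

from fractions import Fraction
from math import sqrt

pmf = {1: Fraction(1, 12), 2: Fraction(3, 12), 3: Fraction(1, 12),
       4: Fraction(3, 12), 5: Fraction(4, 12)}

EX  = sum(x * p for x, p in pmf.items())       # 7/2  = 3.5
EX2 = sum(x**2 * p for x, p in pmf.items())    # 85/6 ≈ 14.1667
var = EX2 - EX**2                              # 23/12 ≈ 1.9167
print(float(var), sqrt(var))                   # variance and standard deviation (≈ 1.3844)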

Exercise

1. A fair die is tossed once. Define a random variable as the number that showed
up. Find (a) the median; (b) the mean; and (c) the variance of this distribution
2. Suppose that two balanced dice are rolled, and let X denote the absolute value of
the difference between the two numbers that appeared. Determine the
probability distribution and calculate the variance of this distribution.
3. The following table lists the probability distribution for cash prizes in a lottery
conducted at Melcom Supermarket.

Prize (GH₵) Probability


0 0.45
10 0.30
100 0.20
500 0.05

Compute the mean, variance and standard deviation of this distribution.

SPECIAL DISCRETE PROBABILITY DISTRIBUTIONS

In the last two sessions, we discussed generally, discrete probability distributions. We
will turn our attention to special distributions that are widely used in applications of
Probability and Statistics. These distributions are meant to solve special type of
problems. In this session, we will consider two of these special types of probability
distributions. They are the Binomial and the Poisson distributions. Notwithstanding, there
are other discrete distributions; among these are the Geometric, Hypergeometric,
Negative Binomial, Discrete Uniform and Bernoulli distributions. Some of
these may be additionally treated in this text.

4.1The Binomial Distribution

The term binomial means two, thus binomial events have two options. The properties
stated here will help us to identify binomial experiments.

4.1.1Properties of the Binomial Distribution

The binomial distribution has some properties which identify it. A binomial experiment
is the one that possesses the following properties:

i. The experiment consists of n repeated trials under the same condition.


ii. Each of the n trials results in an outcome that may be classified as a
“Success” or a “failure”.
iii. The probability of a success, denoted by p remains constant.
iv. The n trials are independent of each other.
v. The random variable of interest, X, is the number of successes observed
during the n independent trials.

Let us now discuss the meaning of these properties. The first property means that the
trials should be performed under similar conditions. For instance, if we flip a fair coin
ten times, it is expected that each toss will be made under the same conditions. As the
name implies, the second property means that each trial should result in only two
outcomes, termed "success" or "failure". The third property means that if the probability of
success on the first trial is p, then the probability of success on each of the subsequent
trials is also p. For example, if you flip a fair coin three times, then in each trial the
probability of a head appearing is 1/2. Property (iv) means that the occurrence of the
first trial should not influence the occurrence of the second trial, and so on. In property
(v), we mean that the random variable of interest counts the outcomes labelled as successes.

4.2 Solving Problems Involving Binomial Experiments

We will now define the Binomial Distribution and then use it to solve problems
involving Binomial experiments.

4.2.1 Definition of the Binomial Distribution

If p and q (i.e. q = 1 − p) are the probability of success and the probability of failure
respectively on any one trial, then the probability of getting x observed successes and
n − x failures in n independent trials is given by

P(X = x) = nCx p^x (1 − p)^(n−x),   for x = 0, 1, …, n

         = 0,                       otherwise

where nCx is the number of ways of getting x observed successes out of n trials, and p
lies between 0 and 1 inclusive.

We need to remember that the random variable for the binomial distribution is discrete,
and that this is a legitimate probability distribution. It can be denoted by b(x; n, p).

Example 14

A fair coin is tossed ten times. Define the random variable X as the number of heads
that appear. Find the probability that: (i) no head appears; (ii) at most two heads
appear; (iii) at least two heads appear.

Solution

This is a binomial experiment since each toss has two options: either a head appears
(referred to as a success) or it does not (referred to as a failure).

i. n = 10 trials, p = 1/2, x = 0, q = 1 − 1/2 = 1/2

Using the binomial distribution, we have

P(X = 0) = 10C0 (1/2)^0 (1/2)^10 = 1·(1/2)^10 = 1/1024 = 0.00098

ii. At most two heads means that there could be 0, 1 or 2 heads. Therefore,
the probability is given by
P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2)
Now, from (i), P(X = 0) is 0.00098

P(X = 1) = 10C1 (1/2)^1 (1/2)^9 = 10·(1/2)^10 = 10/1024 = 0.00977

P(X = 2) = 10C2 (1/2)^2 (1/2)^8 = 45·(1/2)^10 = 45/1024 = 0.04395

P(obtaining at most two heads) = 0.00098 + 0.00977 + 0.04395 = 0.0547

(Corrected to 4 decimal places)

iii. We want to find P(X ≥ 2) which implies that we need
P(X = 2) + P(X=3) +………+ P(X = 10).

This is tedious and time consuming. The best way to find this is to find P(X < 2) and
subtract the results from 1. That is the Complementary Rule of Probability.

Thus, P(X ≥ 2) = 1 – P(x < 2), since 0 ≤ P≤ 1

Now, P(X < 2) = P(X = 0) + P(X = 1).

From (i), P(X = 0) = 0.00098, and from ii) P(X = 1) = 0.00977

Hence, P(X < 2) = 0.00098 + 0.00977

= 0.01075

Therefore, P(X ≥2) = 1 – 0.01075 = 0.98925
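These binomial probabilities are easy to reproduce with math.comb, as in the sketch below (illustrative, not part of the notes); keeping full precision for P(X ≥ 2) gives 0.98926 rather than the 0.98925 obtained from the rounded intermediate values.

from math import comb

n, p = 10, 0.5

def binom_pmf(x):
    # P(X = x) = nCx * p^x * (1 - p)^(n - x)
    return comb(n, x) * p**x * (1 - p)**(n - x)

P0, P1, P2 = binom_pmf(0), binom_pmf(1), binom_pmf(2)
print(round(P0, 5))             # 0.00098
print(round(P0 + P1 + P2, 4))   # 0.0547   P(X <= 2)
print(round(1 - (P0 + P1), 5))  # 0.98926  P(X >= 2)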

Example 15

For 800 families sampled, each has five children. How many of these families would
you expect to have three boys?

Solution

Let us define random variable, X, as observing a boy in the family, then we can
consider this experiment as a binomial since we may observe a boy or a girl in a family.

For the given problem, n = 5 children, p = 1/2 and q = 1 − 1/2 = 1/2

P(X = 3) = 5C3 (1/2)^3 (1/2)^2 = 10·(1/2)^5 = 10·(1/32) = 0.3125

Now to find the number of families expected to have 3 boys, we multiply this
probability by the number of families. That is, 0.3125 x 800

This gives 250 families. Therefore, 250 families are expected to have three boys.

4.2.2 Expectation and Variance of the Binomial Distribution

Expectation and variance of the binomial distribution is given by

E(X) = np and Var (X) = np(1-P) respectively

The above would be proved in class

Example 16

A fair coin is tossed ten times. Define the random variable X as the number of heads
that appear. Find the mean and the variance of this experiment.

Solution

From the experiment, n = 10, p = 1/2, q = 1/2

The mean is calculated as

Mean = E(X) = np = 10·(1/2) = 5,

and the variance as

Var(X) = np(1 − p) = 10·(1/2)·(1/2) = 2.5

4.3 The Poisson distribution

The Poisson distribution has most properties similar to the binomial distribution.
Generally, Poisson distribution deals with experiments that have to do with events
happening within time intervals. For example, the number of car accidents occurring at
a particular intersection during a time period of one week; number of cars passing at a
point on a main road in one second; number of telephone calls handled by a switch
board in a time interval; can all be classified as Poisson experiments.

4.3.1 Definition of the Poisson Distribution

The probability distribution of the Poisson random variable X, representing the number
of successes occurring in a given time interval or specified region, is defined as

P(x) = e^(−λ) λ^x / x!,   x = 0, 1, 2, …

     = 0,                 otherwise

where λ (λ > 0) is the mean number of successes occurring in the given time interval or
specified region, and e = 2.71828.

4.3.2 Solving Problems Involving Poisson Experiments

With the help of the definition of the Poisson Distribution we now want to solve some
Poisson problems.

Example 17

Suppose that a random variable X has a Poisson distribution with mean λ = 0.4. Find:

a. P(X = 0)   b. P(X = 1)   c. P(X ≥ 2)

Solution

a. P(X = 0) = e^(−0.4)(0.4)^0/0! = e^(−0.4) = 0.6703
b. P(X = 1) = e^(−0.4)(0.4)^1/1! = 0.4e^(−0.4) = 0.2681
c. P(X ≥ 2) = 1 − P(X < 2) = 1 − [P(X = 0) + P(X = 1)] = 1 − (0.6703 + 0.2681) = 0.0616

Example 18

The average number of road accidents per day recorded over 100 days at a certain
junction was 1.2. Calculate the probability that on a particular day

a. No accidents;
b. Less than 3 accidents; and
c. At least 1 accident will be recorded.

Solution

a. From the question, λ = 1.2. We want P(X = 0).

P(X = 0) = e^(−1.2)(1.2)^0/0! = e^(−1.2) = (2.718)^(−1.2) = 0.3012 (corrected to 4 decimal places)
b. P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2)
From (a), P(X = 0) = 0.3012

Now,

P(X = 1) = e^(−1.2)(1.2)^1/1! = 1.2(2.718)^(−1.2) = 0.3614

And

P(X = 2) = e^(−1.2)(1.2)^2/2! = (1.44)(0.3012)/2 = 0.2169

Hence, the probability of less than 3 accidents is

P(X < 3) = 0.3012 + 0.3614 + 0.2169 = 0.8795 (corrected to 4 decimal places).

c. At least one accident means P(X ≥ 1).


Now, P(X ≥ 1) = 1 – P(X < 1)
But
P(X < 1) = P(X = 0).
From (a) P(X = 0)= 0.3012

Therefore,

P(X ≥ 1) = 1 – 0.3012 = 0.6988 (corrected to 4 decimal places).
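The Poisson pmf is equally easy to evaluate directly; the sketch below (illustrative, not from the notes) reproduces the three answers of Example 18.

from math import exp, factorial

lam = 1.2   # mean number of accidents per day

def poisson_pmf(x):
    return exp(-lam) * lam**x / factorial(x)

print(round(poisson_pmf(0), 4))                          # 0.3012  P(X = 0)
print(round(sum(poisson_pmf(x) for x in range(3)), 4))   # 0.8795  P(X < 3)
print(round(1 - poisson_pmf(0), 4))                      # 0.6988  P(X >= 1)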

Since the Poisson distribution has some properties similar to the binomial distribution, it is
possible to consider some binomial experiments as Poisson. We can therefore solve
some binomial problems using the Poisson distribution. Thus, if n is large (n → ∞) and p is
small, close to zero (p → 0), then the Poisson distribution is used to approximate the
Binomial distribution, with mean given by λ = np.

We will demonstrate this by considering an example.

How large should n be and how small should p be? A common guide is n ≥ 20 and p ≤ 0.05.

Example 19

If 3% of the electric bulbs manufactured by a company are defective, find the probability

that in a sample of 100 bulbs exactly two will be defective.

Solution

From the problem, probability of success, P = 0.03, and n = 100. This can be classified
as binomial experiment, but we see that it will be cumbersome for us because n is large.
Hence, the best approach is the Poisson distribution.

Thus, λ = np = 100 × 0.03 = 3.

Now, we want to find P(X = 2) using the Poisson distribution.

P(X = 2) = e^(−3)(3)^2/2! = (9/2)(2.718)^(−3) = 0.2241 (corrected to 4 decimal places).

We want to state here that the mean and the variance of the Poisson distribution have
the same value. That is,

E(X) = Var(X) = λ.

Students should research on the above proofs.

For example, the mean and the variance of Example 18, is 1.2.
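As a check on the approximation in Example 19, the sketch below (illustrative, not part of the notes) compares the exact binomial value with the Poisson value; the two agree to about two decimal places.

from math import comb, exp, factorial

n, p, x = 100, 0.03, 2
binomial = comb(n, x) * p**x * (1 - p)**(n - x)       # exact binomial probability
poisson  = exp(-n * p) * (n * p)**x / factorial(x)    # Poisson approximation, lambda = np = 3

print(round(binomial, 4))   # about 0.2252
print(round(poisson, 4))    # about 0.2240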

Example

Show that the Poisson distribution is a legitimate discrete probability distribution or a


p.m.f

Soln

We need to show that

Σ (x = 0 to ∞) p(x) = 1

We take the LHS and manipulate it to get 1.

Σ (x = 0 to ∞) p(x) = Σ (x = 0 to ∞) e^(−λ) λ^x / x! = e^(−λ) Σ (x = 0 to ∞) λ^x / x!

But from Taylor's series, e^x = 1 + x + x^2/2! + x^3/3! + …

⇒ e^λ = 1 + λ + λ^2/2! + λ^3/3! + …

⇒ Σ p(x) = e^(−λ)·e^λ = e^(−λ+λ) = e^0 = 1

⇒ Σ p(x) = 1

As required

Assignment

Look for the proof of mean and variance of the Poisson distribution

E(x) = λ , var(x)= λ

Exercise

1. In Example 19, find the probability that

a. more than two; and
b. less than or equal to two bulbs will be defective.
2. The average number of radioactive particles passing through a counter in a
millisecond during an experiment is 4. What is the probability that 6 particles
entered the counter during a millisecond?
3. An automobile dealer has found that 30% of the cars sold are returned to the
workshop for repair during the first month. If 10 cars are sold this month what is the
probability that
(i) all will return for service in the next month?
(ii) at most two will return for service during the next month?

4. One percent of the letters mailed in an office have incorrect addresses. If on a given
day 200 letters are mailed,
(i) How many with incorrect address are expected?
(ii) What is the probability of finding 3 or more letters with incorrect address?

PROBABILITY DISTRIBUTION FOR A CONTINUOUS RANDOM
VARIABLE

In Session 1 of this unit, we discussed the probability distribution for a discrete random
variable. In this session, we will discuss probability distribution for a continuous
random variable. We will see that the major difference is the meaning of discrete and
continuous. We will therefore try to explain the meaning of continuous random
variable and then use it to discuss continuous probability distributions.

5.1. Definition of Continuous Random Variable

As defined in Session 1, a random variable is a real-valued function defined on a


sample space. Now, if this sample space is continuous in nature then that random
variable is said to be continuous random variable. Thus, a random variable defined
over a continuous sample space is known as continuous random variable.

Suppose that our concern is to find the possibility that an accident will occur on a
highway which is 100km long. Let us assume that our interest is that the accident will
occur at a given location on the highway, then this characteristic to be measured is a
continuous random variable.

Also, consider an experiment in which a person is selected at random from some


population and the height of the person is measured. If the interest is the height of the
person measured, then we can talk of a continuous random variable here.

5.2 Probability Distribution of a Continuous Random Variable

Let X be a continuous random variable. A function f(x), defined over the set of all real
numbers, is called a probability distribution function if

1. P(a ≤ X ≤ b) = ∫_a^b f(x) dx

2. P(a < X ≤ b) = ∫_a^b f(x) dx

3. P(a ≤ X < b) = ∫_a^b f(x) dx

4. P(a < X < b) = ∫_a^b f(x) dx

5. P(X ≤ a) = ∫_−∞^a f(x) dx

6. P(X < a) = ∫_−∞^a f(x) dx

7. P(X > a) = ∫_a^∞ f(x) dx

8. P(X ≥ a) = ∫_a^∞ f(x) dx

Illustration

E.g. for a continuous variable, 2 ≤ x ≤ 6 contains every real value from 2 to 6 (2, 2.1, 2.2, …, 5.9, 6), while 2 < x < 6 contains every real value strictly between the endpoints (2.1, 2.2, …, 5.9).

For any real constants a and b with a ≤ b.

The definition means that the probability that a random variable X takes a value in the
interval (a, b) is equal to the shaded area of the region defined by the curve y = f(x)

(see Figure 5.1), where f(x) is the probability distribution function. This is also known
as the probability density function (pdf).

[Figure 5.1: the curve y = f(x) with the area under the curve between x = a and x = b shaded.]

The shaded area of Figure 5.1 represents the area of the probability density function lying
between a and b. This gives the probability that the event is found between a and b.

A function f(x) can serve as probability density function (pdf) of a continuous random
variable, X, if the following conditions are satisfied:

1. f(x) ≥ 0

2. ∫_−∞^∞ f(x) dx = 1

We will now take some examples to demonstrate what we have discussed so far.

Example 19

Let the random variable X have the function

f(x) = x²/3,   for −1 < x < 2

     = 0,      otherwise

(a) Verify whether the function is a probability density function.

(b) Find the probability that X lies between 0 and 1. That is, P(0 < X < 1).

Solution

(a) Condition 1: for f(x) to be a probability density function, f(x) ≥ 0. We see that
the function will always be non-negative since x² cannot be negative.

Hence, condition 1 is satisfied for all the values between −1 and 2.

Condition 2: ∫_−∞^∞ f(x) dx = 1

∫_−1^2 (x²/3) dx = (1/3) ∫_−1^2 x² dx = (1/9)[x³] from −1 to 2 = 8/9 + 1/9 = 1
−1

We see from the calculation that the second condition is also satisfied. Since the two
conditions are satisfied, we conclude that the function f(x) is a probability density
function.

(b) We need to compute P(0 < X < 1).

P(0 < X < 1) = ∫_0^1 (x²/3) dx = (1/3) ∫_0^1 x² dx = (1/9)[x³] from 0 to 1 = 1/9

This means that the probability that X lies between 0 and 1 is 1/9.
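Both checks can also be done symbolically; the sketch below (illustrative, not part of the notes) uses sympy to evaluate the two integrals exactly.

import sympy as sp

x = sp.symbols('x')
f = x**2 / 3                           # the pdf of Example 19 on (-1, 2)

print(sp.integrate(f, (x, -1, 2)))     # 1    -> f integrates to 1 over its support
print(sp.integrate(f, (x, 0, 1)))      # 1/9  -> P(0 < X < 1)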

Example 20
A random variable has the pdf

f(x) = kx,   for 0 < x < 4

     = 0,    otherwise

(a) Find the value of the constant k.

(b) Compute P(1 < X < 3).

Solution

To find k, we need to use the second condition:

∫_−∞^∞ f(x) dx = 1

∫_0^4 kx dx = k ∫_0^4 x dx = (k/2)[x²] from 0 to 4 = 16k/2 = 8k = 1

k = 1/8

(b) P(1 < X < 3) = (1/8) ∫_1^3 x dx

= (1/16)[x²] from 1 to 3 = (9 − 1)/16 = 1/2

Example 21

Given that the function

f(x) = x²/3,   for −1 < x < b

     = 0,      otherwise

is a probability density function, find the value of the constant b.

Solution

Since the function is a pdf, it implies that

∫_−1^b f(x) dx = 1

∫_−1^b (x²/3) dx = (1/3) ∫_−1^b x² dx = (1/9)[x³] from −1 to b = b³/9 + 1/9 = 1

b³ + 1 = 9,  b³ = 8,  b³ = 2³

Comparing using properties of indices, b = 2.

It is important to mention here that the cumulative distribution function F(x) should be
differentiable; for probabilities to be found it is necessary that the derivative
d/dx F(x) = f(x) exists.

We must also note that if X is a continuous random variable having probability density
function f(x), then for any constant a, P(X = a) = 0. The reason is that if X is a
continuous random variable then

P(X = a) = P(a ≤ X ≤ a) = ∫_a^a f(x) dx = 0

Hence, the above statement is true.

Based on this fact, it is worth noting that for a continuous random variable the following
statement is true.

P(a ≤ X ≤ b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a < X < b).

Let us note carefully that this is not true in the case of a discrete random variable.

In Example 20b, assuming we want to find P(1 ≤ X ≤ 3), the answer will not be
different from what we had.

Thus, (1/8) ∫_1^3 x dx = 1/2.

Exercise

1. If X has the pdf

f(x) = k(x − 1),   for 1 ≤ x ≤ 2

     = 0,          otherwise

(a) Find the value of k.

(b) Hence find (i) P(1.0 < X < 1.5) (ii) P(X < 2.0), integrating from −∞ to 2 (iii) P(X > 3), integrating from 3 to ∞.

2. Show that the function

f(x) = (1/8)(x + 1),   for 2 < x < 4

     = 0,              otherwise

is a probability density function.

Cumulative distribution function for a continuous random variable

This would be discussed in class

MEAN AND VARIANCE OF A CONTINUOUS DISTRIBUTION

In session 4, we discussed the probability distribution of a continuous random variable.


Based on this discussion we will now study how to find the mode, the median and the
mean which form the measure of the central tendencies of the probability distribution.
We will then continue to learn how to find the variance and the standard deviation
which also measure the dispersion of the distribution. The meanings of the terms to be
discussed in this unit have already been dealt with in unit 3.

The Mean or Expected Value or Expectation of a continuous random Variable

The mean, which is also known as the mathematical expectation, is the most used measure of
central tendency. Suppose that X is a continuous random variable and f(x) is its
probability density function; then the mean (mathematical expectation) is defined as

E(X) = ∫_−∞^∞ x f(x) dx

We need to note that the mathematical expectation may or may not exist. Note here that f(x)
has been multiplied by x.

Example 24

Given that a random variable, X has the pdf.

f(x) = (2/27)(1 + x),   2 ≤ x ≤ 5

     = 0,               otherwise

Find the expectation of the random variable, X

Solution

E(X) = ∫_2^5 x·(2/27)(1 + x) dx = (2/27) ∫_2^5 (x + x²) dx

= (2/27)[x²/2 + x³/3] from 2 to 5

= (2/27)[(25/2 + 125/3) − (4/2 + 8/3)]

= (2/27)(99/2) = 99/27 = 11/3

6.2 The Variance

In unit 2, we discussed the meaning and the importance of the variance. Our concern
here therefore is to learn how to find it in the case of the continuous random variable.
Suppose X is a continuous random variable, then the variance is defined as

Var(X) = E[(X − μ)²] = ∫_−∞^∞ (x − μ)² f(x) dx

where μ = E(X). Equivalently, Var(X) = E(X²) − [E(X)]².

Hence,

Var(X) = ∫_−∞^∞ x² f(x) dx − (∫_−∞^∞ x f(x) dx)²

Example 25

A random variable X has the probability density function

f(x) = (1/8)x,   0 ≤ x ≤ 4

     = 0,        otherwise

Find the variance of the random variable, X and hence find the standard deviation.

Solution

Var(X) = ∫_−∞^∞ x² f(x) dx − (∫_−∞^∞ x f(x) dx)²

By using the formula, the variance is given by

Var(X) = ∫_0^4 x²·(1/8)x dx − (∫_0^4 x·(1/8)x dx)²

E(X²) = (1/8) ∫_0^4 x³ dx = (1/8)[x⁴/4] from 0 to 4 = (1/8)(4³) = 8

E(X) = (1/8) ∫_0^4 x² dx = (1/8)[x³/3] from 0 to 4 = (1/8)(64/3) = 64/24 = 8/3

Therefore

Var(X) = 8 − (8/3)² = 8 − 64/9

= (72 − 64)/9 = 8/9

The standard deviation is given by

σ = √Var(X)

Thus,

σ = √(8/9) = (2√2)/3.
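For Example 25, the integrals for E(X), E(X²) and Var(X) can be verified symbolically; here is an illustrative sympy sketch (not part of the original notes).

import sympy as sp

x = sp.symbols('x')
f = x / 8                                  # pdf of Example 25 on [0, 4]

EX  = sp.integrate(x * f, (x, 0, 4))       # 8/3
EX2 = sp.integrate(x**2 * f, (x, 0, 4))    # 8
var = sp.simplify(EX2 - EX**2)             # 8/9
print(EX, EX2, var, sp.sqrt(var))          # standard deviation = 2*sqrt(2)/3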

Exercise

1. If a random variable X has the probability density function

f(x) = 2(x − 1),   1 ≤ x ≤ 2

     = 0,          otherwise

(a) Find the mean


(b) Find also the standard deviation

2. Assume that the probability density function of the random variable X, is given
by

f(x) = 4x²(1 − x),   0 ≤ x < 1

     = 0,            otherwise

Find the expectation of the random variable X.

UNIT 7

SPECIAL CONTINUOUS PROBABILITY DISTRIBUTIONS

In units 5 and 6, we discussed continuous probability distributions.

The knowledge acquired from the five sessions so far, can also help us differentiate
between discrete and continuous probability distributions.

Now, just as we studied special discrete probability distributions in the previous unit, we

want to do a similar thing in this session, with our focus now on the continuous case.
There are several of them; these include the Normal, Exponential, Chi-square,
Continuous Uniform, Gamma, Beta, Lognormal,
Weibull and Log-logistic distributions, among others. However, we
will discuss only the Normal distribution for now and may add the Exponential, Gamma and
Beta distributions.

7.1 Normal Distribution

The most widely used probability distribution in the entire field of Statistics is the
Normal distribution. It is important to know that the term Normal used should not
be interpreted to mean that other types of distributions are “abnormal”. It is used
basically due to the fact that its curve provides approximation to the pattern
observed in so many diverse histograms based on real data sets.

The normal distribution is usually used to model problems found to have


approximately normal distributions, such as the variability of outputs from an industrial
line, the lifetime of devices which wear out, biological variability such as height and
weight, and so on. Also, many important random variables, such as those of the
Binomial and the Poisson, have distributions that can be approximated by the Normal
distribution.

7.1.1 Definition of the Normal Distribution

A random variable X has a Normal distribution with mean μ and variance σ²

(−∞ < μ < ∞ and σ > 0) if X has a continuous distribution for which the density function
f(x) is defined as

f(x) = (1/(σ√(2π))) e^(−(1/2)((x − μ)/σ)²),   for −∞ < x < ∞.

The curve is constructed so that the area under the curve bounded by the two ordinates X =
x1 and X = x2 equals the probability that the random variable X assumes a value between
x1 and x2. This area is shown in Figure 7.1.

[Fig. 7.1: the Normal curve with the area between x1 and x2, on either side of the mean μ, shaded.]

To find the probability of X lying between x1 and x2, as discussed in session 4, we need

to integrate the Normal function. That is,

P(x1 < X < x2) = ∫ from x1 to x2 of (1/(σ√(2π))) e^(−(1/2)((x − μ)/σ)²) dx

Integrating this function is indeed tedious. However, the way out will be discussed later.

7.1.2 Properties of the Normal Distribution

The properties of the Normal distribution are:

1. The mode, which is the point on the horizontal axis where the curve is a maximum,
occurs at x = μ.
2. The general Normal curve is bell-shaped and symmetric about the vertical axis
through the mean μ (mean = median = mode).
3. The curve has its points of inflection at x = μ ± σ.
4. The Normal curve approaches the horizontal axis asymptotically as you proceed in
either direction away from the mean.
5. The total area under the curve and above the horizontal axis is equal to 1.

Below is a typical shape of the normal curve.

[Figure 7.2: a bell-shaped Normal curve, symmetric about the mean μ, with the points μ − x and μ + x marked on either side of μ.]

The shape of the Normal curve depends largely on the standard deviation of the normal
curve. The probability density function of a Normal distribution with a small value of the
standard deviation has a high peak and is very much concentrated around the mean.
However, a large standard deviation gives much dispersion about the mean
and the peak is quite flat (that is, quite low).

Figure 7.3 shows a normal distribution with different values of standard deviation.

[Fig. 7.3: Normal curves with standard deviations σ1 > σ2 > σ3; the smaller the standard deviation, the higher and narrower the peak.]

7.1.3 The Standard Normal Distribution

The difficulty encountered in integrating the Normal density function, as mentioned

earlier, is solved by transforming the random variable X of the Normal distribution to the
random variable Z of the standard Normal distribution. This is done by way of
standardizing the random variable X.

Now, if X has a Normal distribution with mean μ and variance σ², then the random
variable Z given by Z = (X − μ)/σ has the Standard Normal distribution with mean μ = 0
and variance σ² = 1. The probability density function of the standard normal
distribution is given by

f(z) = (1/√(2π)) e^(−z²/2)

The advantage in using the standard Normal distribution is that standard normal tables are
available for use. We therefore need not do any direct integration when using the normal
distribution. For instance, if we want to find P(x1 < X < x2), we need to transform it,
using the formula Z = (X − μ)/σ, to the form P(z1 < Z < z2). Thus, z1 = (x1 − μ)/σ.

Example 26

A random variable X has a normal distribution with mean 50 and standard deviation 10.
Convert the following to the Z values.

(a) P(45 < X < 62)   (b) P(X > 20)

Solution
(a) x1 = 45, x2 = 62, μ = 50 and σ = 10; therefore,
z1 = (45 − 50)/10 = −0.5 and z2 = (62 − 50)/10 = 1.2
Thus, P(45 < X < 62) = P(−0.5 < Z < 1.2)
(b) Here, x = 20; therefore,
Z = (20 − 50)/10 = −3.0
Thus, P(X > 20) = P(Z > −3.0)

7.2 Determining probabilities for a Normal Distribution Using the Standard


Normal Table

We have two major types of the standard normal tables. One type comprises the use of
the entire area under the standard normal curve and the other type comprises the use of
half the area of the standard normal curve (50% of the total area). We will learn how to
use the half-area type since that is the most used standard normal table.

To know which type you are using, you need to look at the Table. You will see a graph
indicating Full- Table or a Half-Table. Figure 7.4 shows the half- table and figure 7.5
shows the full-table.

[Fig 7.4: half-table, shaded area from 0 to z.   Fig 7.5: full-table, shaded area from the far left up to z.]

The other way you can differentiate between the full table and the half-table is that, on
the full-table you have the Z- value showing both negative and positive values on the
table but the half- table has no negative values at all. The negative values are to be
deduced.

7.1.2 Reading the Probabilities from the Standard Normal Table (Area between
Vertical Lines)

We begin by stating the steps that will enable us learn how to read probabilities from
the standard normal Table.

Steps
1. Draw the diagram and the necessary vertical lines.
2. Indicate the required area on the diagram.

3. Break the Z- value into two parts: the first two form the first part; and the
second part will be the difference. For example if Z = 1.344 then the first part
will be 1.3 and the difference will be 0.044.
4. The first column indicating Z is for the first part and the other columns are for
the difference.
5. Trace the first part to meet the second part on the table for the required
probability.

We need to mention that the shaded area (required area) will determine the actual
solution to the problem. The symbol ∅ will be used to denote the probabilities to be
read from table.

Example 27

Find the probability of the following, by using the standard Normal Table.

(a) P(0.0 < Z < 1.74); (b) P(0.34 < Z < 2.23); (c) P(Z > 1.35);

(d) P(−2.30 < Z < 0.0); (e) P(Z > −0.41) and (f) P(Z < −2.01)

Solution

(a) We first sketch the range as

0 1.74

The required probability is the shaded area.

P(0.0 < Z < 1.74) = ∅(1.74) − ∅(0.0)

Note that ∅ (0.0) = 0

To read ∅(1.74) from the standard Normal Table, follow the steps above. Look for 1.7
in the first column and then look for the difference 0.04. Trace the two values to where
they meet in the table. The value there is the probability ∅(1.74). If this is done properly,
using the Table in the appendix, the value will be 0.4591. Therefore, the probability is
0.4591.

(b)

0 0.34 2.23
The required probability is the shaded area of the graph above.

P(0.34 < Z < 2.23) = ∅(2.23) − ∅(0.34)

To read ∅(2.23) from the standard Normal Table, we look for 2.2 in the first column
and then look for the difference 0.03. Trace the two values to where they meet in the table. The
value there is the probability ∅(2.23). The table value is 0.4871.

Similarly, for ∅(0.34), read 0.3 against the difference 0.04. Thus, ∅(0.34) = 0.1331.

Hence,

P(0.34<Z<2.23)= 0.4871- 0.1331

= 0.3540

The probability is therefore, 0.3540.

( c). We first sketch the range as

0 1.35

Since the area required is at the extreme right, we subtract whatever we read for ∅(1.35)
from 0.5. That is,

P( Z>1.35) = 0.5 - ∅ (1.35).

From Table ∅ (1.35) is 0.4110. Hence, the probability is 0.5 - 0.4110 = 0.0890.

(d) The sketch of the range is

-2.3 0

Since the Normal curve is symmetrical, ∅ (-2.3) = ∅ (2.3)

Now, P(−2.30 < Z < 0.0) = ∅(2.3).

From Table ∅ (2.3) is 0.4893.

Therefore, the probability is 0.4893.

( e) The range is sketched as

0.5

-0.41 0

From the diagram we see that the shaded area is more than half of the graph. Therefore,
the solution will be

P (Z > -0.41) = 0.5+ ∅ (0.41)

From the Table, ∅ (0.41) is 0.1591. Therefore, the probability is 0.6591, that is,

0.5 + 0.1591.

(f) The range is sketched as

-2.01 0

From the diagram the shaded area is at the extreme left of the graph. Therefore, the

Solution will be given as

P( Z<-2.01) = 0.5 -∅ (2.01)

= 0.5 − 0.4778 = 0.0222

Hence, the probability is 0.0222.

Let us now take a complete question and solve.

Example 28

An electric firm manufactures a light bulb whose length of life is normally
distributed with mean 800 hours and a standard deviation of 40 hours. Find the
probability that a bulb burns between 778 and 834 hours.

Solution

From the problem we see that σ = 40 hours, and μ= 800 hours.

We want to find P(778 < X < 834).

Now, P(778 < X < 834) = P((778 − 800)/40 < Z < (834 − 800)/40)
= P(-0.55 < Z < 0.85).

This can be sketched as

-0.55 0 0.85
P(-0.55<Z<0.85) = ∅ (0.85) + ∅ (0.55)

From the table

∅ (0.85) = 0.3023 and ∅ (0.55) = 0.2088

∅ ( 0.85 ) +¿ ∅ (0.55) = 0.3023+ 0.2088 = 0.5111

The probability that the bulb burns between 778 and 834 hours is therefore, 0.5111.
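Without a table, the same probability can be computed from the error function; the sketch below is an illustration (not part of the notes), and the tiny difference from 0.5111 comes from the 4-decimal rounding of the table values.

from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    # P(X <= x) for X ~ Normal(mu, sigma^2), via the standard relation with erf
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

p = normal_cdf(834, 800, 40) - normal_cdf(778, 800, 40)
print(round(p, 4))   # about 0.5112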

The Exponential, Gamma and Beta distributions would be discussed in class.

ADDITIONAL TOPICS

1.MOMENT GENERATING FUNCTIONS

2. JOINT DISTRIBUTIONS
