Probability - Sta 253 - Engineering
UNIT 1
Definition
Probability is defined as a measure of the chance of things happening or not
happening
Definition of terms
AXIOMS OF PROBABILITY
Mathematically, 0 ≤ P(E) ≤ 1
Test boundaries
Proof of Axiom 1
Recall that
E ⊆ S
≫ n(E) ≤ n(S)
≫ P(E) ≤ P(S), but P(S) = 1, so P(E) ≤ 1
Also ∅ ⊆ E
≫ P(∅) ≤ P(E), and P(∅) = 0, so 0 ≤ P(E)
Hence 0 ≤ P(E) ≤ 1
2. The probability that an event E would occur, denoted by P(E), plus the
probability that the event E would not occur, denoted by P(E'), equals 1.
Mathematically
P(E) + P(E') = 1 ……………………….(1)
Results from (1)
P(E) = 1 - P(E')
P(E') = 1 - P(E)
Proof of Axiom 2
≫ n(E) + n(E') = n(S)
Dividing through by n(S),
≫ P(E) + P(E') = P(S)
But P(S) = 1
≫ P(E) + P(E') = 1 as required
The above theorem is often called the Addition Rule of Probability.
Rule
P(S) = 1
Or = ∪ = +
And = ∩ = ×
Elementary Theorem
P ( ∅ )=0
In an experiment for which all possible outcomes in the sample space (S) are equally
likely to occur, the probability of the event E (meaning the likelihood that the event E
would occur) is given by
P(E) = n(E) / n(S)
Basic Example
Soln
a) S = {1, 2, 3, 4, 5, 6}, n(S) = 6
b) (i) P(even number) = n(even number) / n(S) = 3/6 = 0.5
(ii) P(odd number) = n(odd number) / n(S) = 3/6 = 0.5
Example
Soln.
     H    T
H    HH   HT
T    TH   TT
a) S = {HH, HT, TH, TT}, n(S) = 4
Examples
Solution
S = {1, 2, 3, 4, 5, 6}, n(S) = 6
E = {1, 3, 5}, n(E) = 3
Let F denote the event that a prime number shows up.
F = {2, 3, 5}, n(F) = 3
E ∩ F = {3, 5}, n(E ∩ F) = 2
P(E or F) = P(E) + P(F) − P(E ∩ F)
= 3/6 + 3/6 − 2/6 = 4/6 = 2/3
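The rule P(E or F) = P(E) + P(F) − P(E ∩ F) can be checked by listing the outcomes directly. The sketch below is only an illustration; the helper name `prob` is an assumption introduced here and is not part of the notes.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}          # sample space of one fair die
E = {1, 3, 5}                   # event E from the solution
F = {2, 3, 5}                   # event F: a prime number shows up

def prob(event, space=S):
    """Classical probability n(event) / n(space) for equally likely outcomes."""
    return Fraction(len(event & space), len(space))

p_union_direct = prob(E | F)                      # count E ∪ F directly
p_union_rule = prob(E) + prob(F) - prob(E & F)    # apply the addition rule
print(p_union_direct, p_union_rule)               # both are 2/3
```

Both routes give the same value because the outcomes 3 and 5 are counted once in E ∪ F, while the rule subtracts the double count explicitly.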
• Example (TRIAL)
NOTE
Generally, the number of outcomes in the sample space for a fair coin is 2^n, where n is
the number of times the coin is tossed.
Generally, the number of outcomes in the sample space for a fair die is 6^n, where n is
the number of times the die is tossed.
Also, tossing one die twice is the same as tossing two dice once.
Also, tossing one die thrice is the same as tossing three dice once.
Example
Two fair dice red and black are tossed together.
a) List the elements in the sample space
b)Find the probability of obtaining
(i) a 3 on either die
(ii) A score of 10
(iii) a 5 and a score of 9
(iv) A 5 on the red die and a score of 10
SAMPLE SPACE
1 2 3 4 5 6
1 1,1 1,2 1,3 1,4 1,5 1,6
2 2,1 2,2 2,3 2,4 2,5 2,6
3 3,1 3,2 3,3 3,4 3,5 3,6
4 4,1 4,2 4,3 4,4 4,5 4,6
5 5,1 5,2 5,3 5,4 5,5 5,6
6 6,1 6,2 6,3 6,4 6,5 6,6
n(s)=36
SOLN
ii) P(score of 10) = n(score of 10) / n(S) = 3/36 = 1/12
Since an event is a subset of the sample space, we can combine events to form new
events using the various set operations. The sample space is considered as the universal
set. If A and B are two events defined on the sample space, then
(1) A ∪ B denotes the event A or B or both. Thus the event A ∪ B occurs if either A
occurs or B occurs or both A and B occur.
(2) A ∩ B denotes the event both A and B. Thus the event A ∩ B occurs if both A
and B occur.
(3) Ā or A' or A^c denotes the event which occurs if and only if A does not occur.
De Morgan’s Law
Venn diagrams are often used to verify relationships among sets thereby making it
unnecessary to give formal proofs based on the algebra of sets.
To illustrate let us show that ( AUB)' = A' ∩ B' which expresses the fact that the
complement of the union of two sets equals the intersection of their respective
complements.
Two set Problems
If A and B are two events defined on a sample space S , then S can be split into the
following four mutually exclusive events
Similarly
Moreover
Example
The probability that a new airport will get an award for its design is 0.04, the
probability that it would get an award for the efficient use of material is 0.2 and the
probability that it would get both awards is 0.03. Find the probability that it will get
Soln
Let D denote the event that the airport would get an award for its design
E be the event that the airport would get an award for its efficient use of materials
The events E and F are said to be mutually exclusive if they cannot occur together,
meaning they are disjoint. Mathematically
P(E ∩ F) = 0
Recall from the total probability rule that if E and F are two events defined on a
sample space S, then the probability that the event E or F or both would
occur (meaning at least one must occur) is given by
P(E ∪ F) = P(E) + P(F) − P(E ∩ F)
Results from the above
For mutually exclusive events P(E ∩ F) = 0, so
≫ P(E ∪ F) = P(E) + P(F)
This is often referred to as the addition rule of probability as well, meaning that for two
mutually exclusive events the total probability rule reduces to the sum of their
probabilities.
Example
What is the probability of obtaining a total of 7 or 11 when a pair of fair dice are
thrown once.
SOLN
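SAMPLE SPACE
     1    2    3    4    5    6
1    1,1  1,2  1,3  1,4  1,5  1,6
2    2,1  2,2  2,3  2,4  2,5  2,6
3    3,1  3,2  3,3  3,4  3,5  3,6
4    4,1  4,2  4,3  4,4  4,5  4,6
5    5,1  5,2  5,3  5,4  5,5  5,6
6    6,1  6,2  6,3  6,4  6,5  6,6
Let T7 and T11 denote a total of 7 and a total of 11 respectively. These events are
mutually exclusive, with n(T7) = 6 and n(T11) = 2.
≫ P(T7 or T11) = P(T7) + P(T11) = 6/36 + 2/36 = 8/36 = 2/9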
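As a check on the counting, the 36 outcomes can be enumerated and the totals of 7 and 11 counted. This is a minimal sketch; the helper `prob` and the variable names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes for a pair of fair dice.
sample_space = list(product(range(1, 7), repeat=2))

def prob(predicate):
    """P(event) = n(event) / n(S), counting outcomes that satisfy predicate."""
    favourable = [o for o in sample_space if predicate(o)]
    return Fraction(len(favourable), len(sample_space))

p7 = prob(lambda o: sum(o) == 7)
p11 = prob(lambda o: sum(o) == 11)
p7_or_11 = prob(lambda o: sum(o) in (7, 11))
print(p7, p11, p7_or_11)    # 1/6 1/18 2/9
```

Because a throw cannot total 7 and 11 at once, the directly counted probability equals the sum of the two separate probabilities.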
Independent Events
Two events E and F are said to be independent if the occurrence or non occurrence of
one does not affect the occurrence or non occurrence of the other.
Mathematically P(E ∩ F) = P(E) × P(F)
Recall from the total probability rule that if E and F are two events defined on a
sample space S, then the probability that the event E or F or both would
occur (meaning at least one must occur) is given by
P(E ∪ F) = P(E) + P(F) − P(E ∩ F)
For independent events this becomes
≫ P(E ∪ F) = P(E) + P(F) − P(E) × P(F)
CONDITIONAL PROBABILITY
In all examples so far, a sample space was defined and all probabilities were calculated
with respect to that sample space. In many instances, however, we are able to update the
sample space based on new information.
Example
Four cards are drawn one after the other without replacement from the top of a well
shuffled deck. What is the probability that they are the four kings?
Solution
The probability that the first card is a king is 4/52. Given that the first card is a king,
the probability that the second card is a king is 3/51. Given that the first two cards are
kings, the probability that the third card is a king is 2/50. Given that the first three
cards are kings, the probability that the 4th card is a king is 1/49.
≫ the probability that the first four cards are kings = 4/52 × 3/51 × 2/50 × 1/49 = 1/270725
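The product of conditional probabilities can be evaluated exactly with fractions. The function name `prob_all_kings` and its parameters are assumptions introduced for this sketch.

```python
from fractions import Fraction

def prob_all_kings(draws=4, kings=4, deck=52):
    """Multiply the conditional probabilities of drawing a king each time,
    removing one king and one card from the deck after every draw."""
    p = Fraction(1)
    for i in range(draws):
        p *= Fraction(kings - i, deck - i)
    return p

print(prob_all_kings())    # 1/270725
```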
DEFINITION
If E and F are any events of a sample space S and P(F) ≠ 0, then the probability that
the event E would occur given (/) that F has already occurred is denoted by P(E/F) and
is given by
P(E/F) = P(E ∩ F) / P(F)
For equally likely outcomes this becomes
P(E/F) = n(E ∩ F) / n(F)
Note: the key words for conditional probability are "given", "if" or "suppose". We
replace them with the slash (/).
Example
Two fair dice are thrown once . Given that the first one shows a three , what is the
probability that the sum is greater than six.
Soln
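Let F be the event that the first die shows a three and E the event that the sum is
greater than six.
F = {(3,1), (3,2), (3,3), (3,4), (3,5), (3,6)}, n(F) = 6
E ∩ F = {(3,4), (3,5), (3,6)}, n(E ∩ F) = 3
P(E/F) = n(E ∩ F) / n(F) = 3/36 ÷ 6/36 = 1/2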
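The same conditional probability can be obtained by restricting the sample space to the outcomes in F and counting. The set names below mirror the solution; everything else is an illustrative assumption.

```python
from fractions import Fraction
from itertools import product

sample_space = list(product(range(1, 7), repeat=2))   # (first die, second die)

F = {o for o in sample_space if o[0] == 3}      # first die shows a three
E = {o for o in sample_space if sum(o) > 6}     # sum is greater than six

# P(E/F) = n(E ∩ F) / n(F) for equally likely outcomes
p_E_given_F = Fraction(len(E & F), len(F))
print(p_E_given_F)    # 1/2
```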
Suppose we calculate P(E/F) and find that P(E/F) = P(E). This implies that P(E) is
unaffected by the occurrence or non-occurrence of F. In such a situation we say E is
independent of F. If E is independent of F then F is independent of E. If E and F are
not independent, then they are dependent.
Proof
We know that P(E/F) = P(E ∩ F) / P(F) ………………….(1)
If E and F are independent then
P(E/F) = P(E) ……………………(2)
Substituting (2) into (1),
P(E) = P(E ∩ F) / P(F)
≫ P(E ∩ F) = P(E) × P(F)
Let E1, E2, …, En be mutually exclusive events, none of which has zero probability and
at least one of which must occur. Then for any event F (connected to E1, E2, …, En), the
total probability is given by
P(F) = P(F/E1) P(E1) + P(F/E2) P(E2) + ... + P(F/En) P(En)
$P(F) = \sum_{i=1}^{n} P(F/E_i)\,P(E_i)$
Example
Three machines x, y and z are used to produce greeting cards.. During a day’s
production machine x produces 720 cards , y produces 432 and z produces 288.. The
probability of x producing a defective card is 0.02, y producing a defective card is 0.1
and that of z is 0.05. Find the probability that at the end of the day one card selected at
random would be defective.
Soln
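Total production = 720 + 432 + 288 = 1440 cards
P(x) = 720/1440 = 0.5
P(y) = 432/1440 = 0.3
P(z) = 288/1440 = 0.2
Let D denote the event that the selected card is defective.
P(D) = P(D/x) P(x) + P(D/y) P(y) + P(D/z) P(z)
= 0.02(0.5) + 0.1(0.3) + 0.05(0.2)
= 0.05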
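The total probability rule is a weighted sum, which is easy to express in code. This is a sketch under the assumption that the priors and conditional probabilities are stored in dictionaries keyed by machine; the function name `total_probability` is hypothetical.

```python
def total_probability(priors, likelihoods):
    """P(F) = sum over i of P(F/E_i) * P(E_i) for a partition E_1, ..., E_n."""
    return sum(likelihoods[k] * priors[k] for k in priors)

output = {"x": 720, "y": 432, "z": 288}               # cards produced per day
total = sum(output.values())
priors = {m: n / total for m, n in output.items()}    # P(x), P(y), P(z)
defect_rate = {"x": 0.02, "y": 0.1, "z": 0.05}        # P(D/x), P(D/y), P(D/z)

p_defective = total_probability(priors, defect_rate)
print(round(p_defective, 4))    # 0.05
```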
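Bayes' Theorem
Let E1, E2, …, En be events such that
E1 ∪ E2 ∪ … ∪ En = S and
Ei ∩ Ej = ∅ for i ≠ j,
with P(Ei) ≠ 0 for every i. Then for any event F with P(F) ≠ 0,
$P(E_i/F) = \dfrac{P(F/E_i)\,P(E_i)}{\sum_{j=1}^{n} P(F/E_j)\,P(E_j)}$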
Example
A consulting firm rents cars from three agencies. 30% from agency A , 20% from
agency B and 50% from agency C 15% of the cars from A, 10% of the cars from B and
6% of the cars from C have bad tyres . If a car rented by the firm has bad tyres , find
the probability that it came from C
Soln
Let E1 denote the event that the car came from agency A
Let E2 denote the event that the car came from agency B
Let E3 denote the event that the car came from agency C
Let F denote the event that a car rented by the firm has bad tyres.
We wish to find P(E3/F). Now P(E1) = 0.3, P(E2) = 0.2, P(E3) = 0.5
and P(F/E1) = 0.15, P(F/E2) = 0.1, P(F/E3) = 0.06
P(E3/F) = P(F/E3)P(E3) / [P(F/E1)P(E1) + P(F/E2)P(E2) + P(F/E3)P(E3)]
= (0.5 × 0.06) / (0.3 × 0.15 + 0.2 × 0.1 + 0.5 × 0.06)
= 0.03 / 0.095
= 0.3158
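Bayes' theorem divides one term of the total probability sum by the whole sum. The sketch below assumes the same dictionary layout as a typical hand calculation; the function name `posterior` is hypothetical.

```python
def posterior(priors, likelihoods, target):
    """Bayes' theorem: P(E_target / F) from the P(E_i) and the P(F / E_i)."""
    evidence = sum(likelihoods[k] * priors[k] for k in priors)   # P(F)
    return likelihoods[target] * priors[target] / evidence

priors = {"A": 0.3, "B": 0.2, "C": 0.5}          # share of cars per agency
bad_tyres = {"A": 0.15, "B": 0.10, "C": 0.06}    # P(bad tyres / agency)

p_C_given_bad = posterior(priors, bad_tyres, "C")
print(round(p_C_given_bad, 4))    # 0.3158
```

Although agency C supplies half the cars, its low defect rate means fewer than a third of the cars with bad tyres come from it.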
Example 2
It is estimated that there is a 20% chance that unemployment would increase by more
than 1% next year. If this increase does occur, then there would be a 90% chance
that congress would enact a federally funded job programme. Otherwise the probability
of such a programme being funded is 30%. Suppose that the job programme was
funded by congress.
(a) What is the probability that unemployment would increase by more than 1%?
(b) What is the probability that unemployment would not increase by more than
1%?
Soln
Let E denote the event that unemployment would increase by more than 1%
E' denote the event that unemployment would not increase by more than 1%
F be the event that congress will enact a job programme
F' be the event that congress will not enact a job programme
P(E) = 0.2
P(E') = 0.8
P(F/E) = 0.9
P(F'/E) = 0.1
P(F/E') = 0.3
P(F'/E') = 0.7
(a) P(E/F) = P(F/E)P(E) / [P(F/E)P(E) + P(F/E')P(E')]
= (0.9 × 0.2) / (0.9 × 0.2 + 0.3 × 0.8) = 0.18/0.42 = 0.43
(b) P(E'/F) = 1 − P(E/F) = 1 − 0.43 = 0.57
TRIAL QUESTIONS
1. The Venn diagram below shows the sports that members of the
KNUST Sports Club participate in Bowls (B), Tennis (T) and
Darts (D). This extra information can be used to complete this
diagram.
n ( T ∩ b )=24 , ( B∪ D ∩T )=55 , n¿
d) Plays Tennis but not Darts.
6. A manufacturing company has two plants, 1 and 2. Plant 1
produces 40% of the company’s output and plant 2 produces the
other 60% . Of the output produced by plant 1 , 95% are good
and of that produced by plant 2 , 10% are defective. If a product
is randomly selected from the output of this company, what is
the probability that the output would be good. [ANS: 0.92]
UNIT 2
In other words, a random variable, usually written X, is a variable whose possible values
are numerical outcomes of a random phenomenon. Random events are events that are
unpredictable in the short run but show a pattern over many occurrences in the long
run. For example, the outcomes of tossing a coin, rolling a die or drawing a card can
be modelled as random variables. As mentioned earlier, we will use
upper case letters such as X to denote a random variable and lower case letters such
as x to denote a particular value that a random variable may assume. With this, let
us define a random variable in a more conventional way.
Definition
Let S be the sample space associated with some experiment. A random variable X
is a function that assigns a real number X(s) to each sample element s ∈ S.
Example 1
Consider the experiment of tossing a fair coin three times. Define the random variable
X, to be the number of heads that showed up.
Solution
Let H denote that a head showed up and T that a tail showed up, assuming the coin has a
head on one side and a tail on the other. We can then represent the sample
space, S, by S = {HHH, HHT, HTT, THH, TTH, HTH, THT, TTT}.
HHH means that the first, second and third tosses showed heads in that order. HHT
means that the first toss showed a head, the second showed a head and the third showed
a tail. Work out the meanings of the remaining elements on your own. Since the
characteristic of interest is the number of heads obtained, we only need to count the
number of heads in the three tosses; hence the possible values are 0, 1, 2 and 3 heads.
Thus, the random variable X could be written as {X / x = 0, 1, 2, 3}. This set forms the
range of the random variable X. Each possible value x of X represents an event. For
instance, the event that one head appeared, written as {X / x = 1}, is simply the set
{HTT, THT, TTH}.
A random variable which takes on a finite or countably infinite number of
values is called a Discrete Random Variable. This means that the random variable is
defined over a discrete sample space. Example 1 is an example of a discrete random
variable. If a random variable is not discrete then it is continuous.
Example 2
Two fair dice are tossed simultaneously. Define the random variable, X, as the sum of
numbers that showed up.
Solution
On each die the numbers expected are 1, 2, 3, 4, 5, or 6. If we represent one of the dice
by A and the other by B, then the sample space can be constructed in a table form as
A\B  1    2    3    4    5    6
1    1,1  1,2  1,3  1,4  1,5  1,6
2    2,1  2,2  2,3  2,4  2,5  2,6
3    3,1  3,2  3,3  3,4  3,5  3,6
4    4,1  4,2  4,3  4,4  4,5  4,6
5    5,1  5,2  5,3  5,4  5,5  5,6
6    6,1  6,2  6,3  6,4  6,5  6,6
Since the random variable is the sum of numbers on the two dice, the highest value is
12 and the lowest value is 2. Therefore, the discrete random variable for this
experiment is given as {X/x = 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
If X is a discrete random variable, then the function given by f(x) = P(X = x) for each
value x in the range of X is called the probability distribution of X. It is usually
presented as a table comprising the possible values of the random variable X and their
corresponding probabilities, P(X = xi).
Possible values of X:   x1      x2      ……   xk
P(X = xi):              P(x1)   P(x2)   ……   P(xk)
For Example 1 (three tosses of a fair coin) the table is
(X = xi)   0     1     2     3
P(xi)      1/8   3/8   3/8   1/8
That is, for the probability of no head, P(X = 0), we have TTT. For a fair coin,
P(TTT) = 1/8
Similarly, X = 1 corresponds to THT, TTH and HTT. Therefore, P(getting one head) is
given by
P(THT) + P(TTH) + P(HTT) = 1/8 + 1/8 + 1/8 = 3/8
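The whole table for the number of heads can be built by listing the eight outcomes and counting heads in each. This is a minimal sketch; the variable names are assumptions chosen for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for three tosses of a fair coin: HHH, HHT, ..., TTT.
sample_space = ["".join(t) for t in product("HT", repeat=3)]

# X = number of heads; count how many outcomes give each value of X.
counts = Counter(outcome.count("H") for outcome in sample_space)
pmf = {x: Fraction(n, len(sample_space)) for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(x, p)    # 0 1/8, 1 3/8, 2 3/8, 3 1/8
```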
Example 3
Solution
We can convert the sample space in Example 2 to suit the sum of the numbers that
appeared. Thus, the sample space becomes
A 1 2 3 4 5 6
1 2 3 4 5 6 7
2 3 4 5 6 7 8
3 4 5 6 7 8 9
4 5 6 7 8 9 10
5 6 7 8 9 10 11
6 7 8 9 10 11 12
The probabilities are calculated by counting how many times each sum appears in the
sample space and dividing by 36, because the sample space has 36 elements.
X = xi      2     3     4     5     6     7     8     9     10    11    12
P(X = xi)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
The probability distribution can also be given in equation form, P(X = x) = f(x), where
f(x) is a function. An example of a probability distribution function is
$P(x) = f(x) = \begin{cases} \dfrac{x}{6}, & x = 1, 2, 3 \\ 0, & \text{otherwise} \end{cases}$
At this point we need to know when a function qualifies to be called probability Mass
Function (Probability Distribution).
Example 4
$f(x) = \begin{cases} \dfrac{x+2}{25}, & x = 1, 2, 3, 4, 5 \\ 0, & \text{otherwise} \end{cases}$
Solution
x      1     2     3     4     5
f(x)   3/25  4/25  5/25  6/25  7/25
Property 1: All the probabilities are positive; hence this condition is satisfied.
Property 2: The probabilities sum to 3/25 + 4/25 + 5/25 + 6/25 + 7/25 = 25/25 = 1; hence
this condition is also satisfied. We can therefore conclude that the function is a
probability mass function.
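The two properties used above can be checked mechanically. The sketch assumes the function is tabulated as a dictionary of exact fractions; the name `is_pmf` is hypothetical.

```python
from fractions import Fraction

def is_pmf(table):
    """Check the two conditions: every f(x) >= 0 and the values sum to 1."""
    return all(p >= 0 for p in table.values()) and sum(table.values()) == 1

f = {x: Fraction(x + 2, 25) for x in range(1, 6)}   # f(x) = (x + 2)/25
print(is_pmf(f))    # True
```

Exact fractions are used so that the sum is compared with 1 without rounding error.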
A graph of P(X = xi) against xi is called a probability graph. We usually draw vertical
lines or bars above the possible values xi of the random variable X on the horizontal
axis. The graph is drawn as follows.
[Figure: probability graph, with vertical bars of heights P(x1), P(x2), …, P(xk) above x1, x2, …, xk on the x-axis]
Example 5
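$f(x) = \begin{cases} \dfrac{x-1}{9}, & x = 3, 4, 5 \\ 0, & \text{otherwise} \end{cases}$
Solution
For x = 3, f(3) = (1/9)(3 − 1) = 2/9
For x = 4, f(4) = (1/9)(4 − 1) = 3/9
For x = 5, f(5) = (1/9)(5 − 1) = 4/9
x      3    4    5
f(x)   2/9  3/9  4/9
a. All the values of f(x) are positive. Also, $\sum_{x=3}^{5} f(x) = 2/9 + 3/9 + 4/9 = 1$,
hence the function is a probability mass function.
b. [Figure: probability graph of f(x), with vertical bars of heights 2/9, 3/9 and 4/9 at x = 3, 4 and 5]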
There are many problems where we may wish to compute the probability that the
observed value of the random variable X will be less than or equal to some real number
x. For example: what are the chances that a certain candidate will get not more than 30%
of the votes? What are the chances that the price of gold will remain at or below 800 USD
per ounce? Writing F(x) = P(X ≤ x) for every real number x, we define F(x) to be the
cumulative distribution function of X, or simply the distribution function of the random
variable X.
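In general, if X takes the values x1 < x2 < … < xn with probabilities f(x1), f(x2), …, f(xn),
the distribution function of X is given by
$F(x) = \begin{cases} 0, & -\infty < x < x_1 \\ f(x_1), & x_1 \le x < x_2 \\ f(x_1) + f(x_2), & x_2 \le x < x_3 \\ \quad\vdots \\ f(x_1) + f(x_2) + \dots + f(x_n) = 1, & x_n \le x < \infty \end{cases}$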
Example
The following table gives the probability mass function of X . Find the cumulative
distribution function of X and sketch its graph.
X 0 1 2 3 4
f(x) 1/16 1/4 3/8 1/4 1/16
Soln
If x < 0, F(x) = 0
If 0 ≤ x < 1, F(x) = f(x < 0) + f(x = 0) = 0 + 1/16 = 1/16
If 1 ≤ x < 2, F(x) = f(x < 0) + f(x = 0) + f(x = 1) = 0 + 1/16 + 1/4 = 5/16
If 2 ≤ x < 3, F(x) = 0 + 1/16 + 1/4 + 3/8 = 11/16
If 3 ≤ x < 4, F(x) = 0 + 1/16 + 1/4 + 3/8 + 1/4 = 15/16
If 4 ≤ x < ∞ or x ≥ 4, F(x) = 0 + 1/16 + 1/4 + 3/8 + 1/4 + 1/16 = 1
$F(x) = \begin{cases} 0, & x < 0 \\ \dfrac{1}{16}, & 0 \le x < 1 \\ \dfrac{5}{16}, & 1 \le x < 2 \\ \dfrac{11}{16}, & 2 \le x < 3 \\ \dfrac{15}{16}, & 3 \le x < 4 \\ 1, & x \ge 4 \end{cases}$
Notice that even if the random variable X can assume only integer values, the cdf of X
is defined for non-integers as well. For example, in the above example F(1.5) = 5/16 and
F(2.5) = 11/16.
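The step-function behaviour of the cdf can be reproduced by accumulating the probability mass function. This is a sketch using the table from the example; the function name `cdf` and the list names are assumptions.

```python
from fractions import Fraction
from itertools import accumulate

xs = [0, 1, 2, 3, 4]
fs = [Fraction(1, 16), Fraction(1, 4), Fraction(3, 8), Fraction(1, 4), Fraction(1, 16)]
cumulative = list(accumulate(fs))    # running totals f(x1) + ... + f(xk)

def cdf(x):
    """F(x) = P(X <= x): total of f(xi) over all xi <= x (a step function)."""
    total = Fraction(0)
    for xi, Fi in zip(xs, cumulative):
        if xi <= x:
            total = Fi
    return total

print(cdf(-1), cdf(1.5), cdf(2.5), cdf(4))    # 0 5/16 11/16 1
```

The function is constant between the integer points and jumps by f(xi) at each xi, which is why F(1.5) equals F(1).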
1. 0 ≤ F(x) ≤ 1
2. If x1 ≤ x2, then F(x1) ≤ F(x2)
3. The probability that a random variable X takes a value within an interval
(a, b] is equal to the increment of the distribution function over that interval:
P(a < X ≤ b) = F(b) − F(a)
This means that all probabilities of interest can be computed once the cumulative
distribution function F(x) is known.
Note: Even though one end of this interval is open, for a discrete distribution we can
rewrite it with new inclusive boundaries. For a continuous distribution we treat
inclusive and exclusive boundaries the same way.
5. F(x) is always right continuous:
$\lim_{x \to a^{+}} F(x) = F(a)$
Any function satisfying all the properties above is the c.d.f. of some random
variable.
Exercise
1. A fair coin is flipped four times. Let X represent the number of heads which
show up. Find the probability distribution of the random variable, X.
2. A discrete random variable, X, has probability mass function
$f(x) = \begin{cases} K(x + 2), & x = 1, 2, 3, 4, 5 \\ 0, & \text{otherwise} \end{cases}$
UNIT 3
In section 1, we explained the term discrete random variable. This term helped us in the
discussion of the concept of probability distributions. In this session, we will learn
how to find the mode, the median, the mean and the variance of a discrete probability
distribution.
The mean of a discrete probability distribution is also known as the mathematical
expectation of the distribution. It is usually used as the average value of the
probability distribution, even though the mode and the median can also be considered as
average values.
If X takes the values x1, x2, …, xn with probabilities P(X = xi), then the expectation,
E(X) or µX, is calculated as
$E(X) = \mu_X = \sum_{i=1}^{n} x_i\,P(X = x_i)$
We now take an example to show how to calculate the mean of a given distribution.
Example 11
(X = x)   0        1        2        3
P(x)      27/125   54/125   36/125   8/125
Solution
$\mu_X = \sum_{i=0}^{3} x_i\,P(X = x_i) = 0\cdot\frac{27}{125} + 1\cdot\frac{54}{125} + 2\cdot\frac{36}{125} + 3\cdot\frac{8}{125}$
= 0 + 54/125 + 72/125 + 24/125
= 150/125 = 1.2
1. The mean of the distribution must be unique. This means that it should be a
single value.
2. The mean (expectation) of a constant is the constant, that is, if C is a constant
then E (C) = C
3. If C is a constant and X is a random variable then E (CX) = CE(X).
Example 12
Solution
Now, E(2X) = 2E(X). We see from Example 11 that E(X) = 1.2. Hence,
E(2X) = 2(1.2) = 2.4
The variance of a distribution is one of the statistics that measures the spread or the
dispersion of the distribution about its mean. A small value of the variance is an
indication that the probability distribution is tightly concentrated around the mean, and
a large variance indicates that the probability distribution has a wide spread about the
mean.
Definition
Suppose that X is a discrete random variable with mean µ = E(X). Then the variance of
X, denoted by σ² = Var(X), is defined as
$\mathrm{Var}(X) = E[(X - \mu)^2] = \sum_{i=1}^{n} (x_i - \mu)^2\,p(x_i)$
where p(xi) is the probability of the corresponding value xi. Using this
formula to compute the variance can be tedious; hence we re-define the variance
as
Var(X) = E(X²) − [E(X)]²
Laws of Variance
1. Var(C) = 0
2. Var(X) = E(X²) − [E(X)]²
3. Var(CX) = C² Var(X)
Example 13
X       1     2     3     4     5
P(xi)   1/12  3/12  1/12  3/12  4/12
Solution
The variance is given by $\mathrm{Var}(X) = \sum_{i=1}^{n} x_i^2\,P(x_i) - \left[\sum_{i=1}^{n} x_i\,P(x_i)\right]^2$
Now, $\sum_{i=1}^{5} x_i^2\,P(x_i)$ = 1²(1/12) + 2²(3/12) + 3²(1/12) + 4²(3/12) + 5²(4/12)
= 1/12 + 12/12 + 9/12 + 48/12 + 100/12
= 170/12 = 14.1667
Similarly, $\sum_{i=1}^{5} x_i\,P(x_i)$ = 1(1/12) + 2(3/12) + 3(1/12) + 4(3/12) + 5(4/12)
= 42/12 = 3.5
Var(X) = 14.1667 − (3.5)² = 1.9167
σ = √1.9167 = 1.3844
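The mean and the shortcut formula Var(X) = E(X²) − [E(X)]² translate directly into code. This is a sketch for the table in this example; the function names `mean` and `variance` are assumptions.

```python
from fractions import Fraction
from math import sqrt

def mean(pmf):
    """E(X) = sum of x * P(x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = E(X^2) - [E(X)]^2."""
    ex2 = sum(x * x * p for x, p in pmf.items())
    return ex2 - mean(pmf) ** 2

pmf = {1: Fraction(1, 12), 2: Fraction(3, 12), 3: Fraction(1, 12),
       4: Fraction(3, 12), 5: Fraction(4, 12)}

print(mean(pmf), variance(pmf))          # 7/2 23/12
print(round(sqrt(variance(pmf)), 4))     # standard deviation
```

Working in exact fractions gives E(X) = 7/2 and Var(X) = 23/12, which is about 1.9167.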
Exercise
1. A fair die is tossed once. Define a random variable as the number that showed
up. Find (a) the median; (b) the mean; and (c) the variance of this distribution
2. Suppose that two balanced dice are rolled, and let X denote the absolute value of
the difference between the two numbers that appeared. Determine the
probability distribution and calculate the variance of this distribution.
3. The following table lists the probability distribution for cash prizes in a lottery
condition at Melcom Supermarket.
In the last two sessions, we discussed discrete probability distributions in general. We
will now turn our attention to special distributions that are widely used in applications
of Probability and Statistics. These distributions are meant to solve special types of
problems. In this session, we will consider two of these special types of probability
distributions: the Binomial and the Poisson distributions. There
are other discrete distributions as well, among them the Geometric, Hypergeometric,
Negative Binomial, Discrete Uniform and Bernoulli distributions. Some of
these may be additionally treated in this text.
The term binomial means two, thus binomial events have two options. The properties
stated here will help us to identify binomial experiments.
The binomial distribution has some properties which identify it. A binomial experiment
is the one that possesses the following properties:
Let us now discuss the meaning of these properties. The first property means that the
trials should be performed under similar conditions. For instance, if we flip a fair coin
ten times, it is expected that each flip is made under the same conditions. As the
name implies, the second property means that each trial should result in only two
outcomes, termed "success" or "failure". The third property means that if the probability
of success on the first trial is p, then the probability of success on each of the
subsequent trials is also p. For example, if you flip a fair coin three times, then in each
trial the probability of a head appearing is 1/2. Property (iv) means that the outcome of
the first trial should not influence the outcome of the second trial, and so on. In
property (v), we mean that the outcome of interest is labelled as success.
We will now define the Binomial Distribution and then use it to solve problems
involving Binomial experiments.
If p and q (i.e. q = 1 – p) are the probability of success and the probability of failure
respectively on any one trial, then the probability of getting x observed successes and
n – x failures in n independent trials is given by
$P(X = x) = \begin{cases} {}^{n}C_{x}\,p^{x}q^{\,n-x}, & x = 0, 1, 2, \dots, n \\ 0, & \text{otherwise} \end{cases}$
where nCx is the number of ways of getting x observed successes out of n trials, and p
lies between 0 and 1 inclusive.
We need to remember that the random variable for the binomial distribution is discrete,
and that the above is a legitimate probability distribution. It can be denoted by b(x; n, p).
Example 14
A fair coin is tossed ten times. Define the random variable, X, as the number of heads
that appears. Find the probability that: (i) no head appeared. (ii) At most two heads
appears. (iii) At least two heads appeared.
Solution
This is a binomial experiment since each trial has two options: either a head appears
(referred to as a success) or no head appears (referred to as a failure).
i. n = 10 trials, p = 1/2, x = 0, q = 1 - 1/2 = 1/2
P(X = 0) = 10C0 (1/2)^0 (1/2)^10
= 1 · (1/2)^10
= 1/1024
= 0.00098
ii. At most two heads means that there could be 0, 1 or 2 heads. Therefore,
the probability is given by
P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2)
Now, from (i), P(X = 0) is 0.00098
P(X = 1) = 10C1 (1/2)^1 (1/2)^9 = 10 (1/2)^10
= 10 · 1/1024
= 0.00977
P(X = 2) = 10C2 (1/2)^2 (1/2)^8 = 45 (1/2)^10
= 45/1024
= 0.04395
≫ P(X ≤ 2) = 0.00098 + 0.00977 + 0.04395 = 0.0547
iii. We want to find P(X ≥ 2), which implies that we need
P(X = 2) + P(X = 3) + ……… + P(X = 10).
This is tedious and time consuming. The best way is to find P(X < 2) and
subtract the result from 1. That is the Complementary Rule of Probability.
P(X < 2) = P(X = 0) + P(X = 1) = 0.00098 + 0.00977 = 0.01075
≫ P(X ≥ 2) = 1 − 0.01075 = 0.98925
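The three parts of this example can be verified with a direct implementation of b(x; n, p). This is a minimal sketch; `binomial_pmf` is a hypothetical helper, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """b(x; n, p) = nCx * p^x * (1 - p)^(n - x) for x = 0, 1, ..., n."""
    if not 0 <= x <= n:
        return Fraction(0)
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, Fraction(1, 2)
p_none = binomial_pmf(0, n, p)
p_at_most_two = sum(binomial_pmf(x, n, p) for x in range(3))
p_at_least_two = 1 - sum(binomial_pmf(x, n, p) for x in range(2))
print(p_none, p_at_most_two, p_at_least_two)    # 1/1024 7/128 1013/1024
```

The last value uses the complementary rule, so only two terms are evaluated instead of nine.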
Example 15
For 800 families sampled, each has five children. How many of these families would
you expect to have three boys?
Solution
Let us define the random variable X as the number of boys in a family. We can
consider this a binomial experiment since each child is either a boy or a girl.
For the given problem, n = 5 children, p = 1/2 and q = 1 - 1/2 = 1/2
P(X = 3) = 5C3 (1/2)^3 (1/2)^2 = 10/32 = 0.3125
Now to find the number of families expected to have 3 boys, we multiply this
probability by the number of families. That is, 0.3125 × 800
This gives 250 families. Therefore, 250 families are expected to have three boys.
Example 16
A fair coin is tossed ten times. Define the random variable X as the number of heads
that appear. Find the mean and the variance of X.
Solution
From the experiment, n = 10, p = 1/2, q = 1/2
Mean = E(X) = np = 10 · 1/2 = 5
Var(X) = np(1 − p) = 10 · 1/2 · 1/2 = 2.5
The Poisson distribution has many properties similar to the binomial distribution.
Generally, the Poisson distribution deals with experiments in which events
happen within time intervals. For example, the number of car accidents occurring at
a particular intersection during one week, the number of cars passing
a point on a main road in one second, and the number of telephone calls handled by a
switch board in a time interval can all be classified as Poisson experiments.
The probability distribution of the Poisson random variable X, representing the number
of successes occurring in a given time interval or specified region, is defined as
$P(x) = \begin{cases} \dfrac{e^{-\lambda}\lambda^{x}}{x!}, & x = 0, 1, 2, \dots \\ 0, & \text{otherwise} \end{cases}$
where λ (λ > 0) is the mean number of successes occurring in the given time interval or
specified region, and e ≈ 2.71828.
With the help of the definition of the Poisson Distribution we now want to solve some
Poisson problems.
Example 17
Suppose that a random variable X has a Poisson distribution with mean λ = 0.4. Find;
Solution
a. P(X = 0) = e^{−0.4}(0.4)^0 / 0! = 0.6703
b. P(X = 1) = e^{−0.4}(0.4)^1 / 1! = 0.2681
c. P(X ≥ 2) = 1 – P(X < 2) = 1 – [P(X = 0) + P(X = 1)] = 1 – (0.6703 + 0.2681) = 0.0616
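These values can be reproduced with a short function for the Poisson probability. The name `poisson_pmf` is an assumption for this sketch; only the standard `math` module is used.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) = e^(-lam) * lam^x / x! for x = 0, 1, 2, ..."""
    if x < 0:
        return 0.0
    return exp(-lam) * lam**x / factorial(x)

lam = 0.4
p0 = poisson_pmf(0, lam)
p1 = poisson_pmf(1, lam)
p_at_least_two = 1 - (p0 + p1)
print(round(p0, 4), round(p1, 4), round(p_at_least_two, 4))    # 0.6703 0.2681 0.0616
```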
Example 18
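The average number of road accidents per day recorded over 100 days at a certain
junction was 1.2. Calculate the probability that on a particular day
a. No accidents;
b. Less than 3 accidents; and
c. At least 1 accident will be recorded.
Solution
Here λ = 1.2.
a. P(X = 0) = e^{−1.2}(1.2)^0 / 0! = 0.3012
b. P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2)
Now,
P(X = 1) = e^{−1.2}(1.2)^1 / 1! = 1.2(0.3012) = 0.3614
And
P(X = 2) = e^{−1.2}(1.2)^2 / 2! = 1.44(0.3012)/2 = 0.2169.
Therefore,
P(X < 3) = 0.3012 + 0.3614 + 0.2169 = 0.8795
c. P(X ≥ 1) = 1 − P(X = 0) = 1 − 0.3012 = 0.6988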
Since the Poisson distribution has some properties similar to the binomial distribution, it
is possible to treat some binomial experiments as Poisson. We can therefore solve
some binomial problems using the Poisson distribution. Thus, if n is large (n → ∞) and p
is small, close to zero (p → 0), then the Poisson distribution is used to approximate the
Binomial distribution, with mean given by λ = np.
Example 19
Solution
From the problem, the probability of success is p = 0.03 and n = 100. This can be
classified as a binomial experiment, but the computation would be cumbersome because n
is large. Hence, the best approach is the Poisson distribution, with λ = np = 100(0.03) = 3.
P(X = 2) = e^{−3}(3)^2 / 2! = (9/2) e^{−3} = 0.2240 (correct to 4 decimal places).
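To see how close the approximation is, the exact binomial probability can be computed alongside the Poisson value for n = 100, p = 0.03 and x = 2. This is a sketch; both helper functions are assumptions introduced here.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Exact binomial probability nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """Poisson probability e^(-lam) * lam^x / x!."""
    return exp(-lam) * lam**x / factorial(x)

n, p, x = 100, 0.03, 2
exact = binomial_pmf(x, n, p)
approx = poisson_pmf(x, n * p)      # lambda = np = 3
print(round(exact, 4), round(approx, 4))
```

The two values agree to about two decimal places here, which illustrates why the approximation is recommended only when n is large and p is small.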
We want to state here that the mean and the variance of the Poisson distribution have
the same value. That is,
E(X) = Var(X) = λ.
For example, the mean and the variance in Example 18 are both 1.2.
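Example
Show that the Poisson distribution is a legitimate probability distribution, that is,
$\sum_{x=0}^{\infty} p(x) = 1$
Soln
$\sum_{x=0}^{\infty} p(x) = \sum_{x=0}^{\infty} \dfrac{e^{-\lambda}\lambda^{x}}{x!} = e^{-\lambda}\sum_{x=0}^{\infty} \dfrac{\lambda^{x}}{x!}$
≫ $\sum_{x=0}^{\infty} p(x) = e^{-\lambda}\left(1 + \lambda + \dfrac{\lambda^{2}}{2!} + \dfrac{\lambda^{3}}{3!} + \dots\right)$
But from Taylor's series $e^{x} = 1 + x + \dfrac{x^{2}}{2!} + \dfrac{x^{3}}{3!} + \dots$
≫ $e^{\lambda} = 1 + \lambda + \dfrac{\lambda^{2}}{2!} + \dfrac{\lambda^{3}}{3!} + \dots$
≫ $\sum_{x=0}^{\infty} p(x) = e^{-\lambda}\cdot e^{\lambda} = e^{-\lambda+\lambda} = e^{0} = 1$
As required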
Assignment
Look for the proof of mean and variance of the Poisson distribution
E(x) = λ , var(x)= λ
Exercise
4. One percent of the letters mailed in an office have incorrect addresses. If on a given
day 200 letters are mailed,
(i) How many with incorrect address are expected?
(ii) What is the probability of finding 3 or more letters with incorrect address?
PROBABILITY DISTRIBUTION FOR A CONTINUOUS RANDOM
VARIABLE
In Session 1 of this unit, we discussed the probability distribution for a discrete random
variable. In this session, we will discuss probability distribution for a continuous
random variable. We will see that the major difference is the meaning of discrete and
continuous. We will therefore try to explain the meaning of continuous random
variable and then use it to discuss continuous probability distributions.
Suppose that our concern is to find the possibility that an accident will occur on a
highway which is 100km long. Let us assume that our interest is that the accident will
occur at a given location on the highway, then this characteristic to be measured is a
continuous random variable.
Let X be a continuous random variable. A function f(x), defined over the set of all real
numbers, is called a probability distribution function of X if
1. $P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$
2. $P(a < X \le b) = \int_{a}^{b} f(x)\,dx$
3. $P(a \le X < b) = \int_{a}^{b} f(x)\,dx$
4. $P(a < X < b) = \int_{a}^{b} f(x)\,dx$
5. $P(X \le a) = \int_{-\infty}^{a} f(x)\,dx$
6. $P(X < a) = \int_{-\infty}^{a} f(x)\,dx$
7. $P(X > a) = \int_{a}^{\infty} f(x)\,dx$
8. $P(X \ge a) = \int_{a}^{\infty} f(x)\,dx$
Illustration
The definition means that the probability that a random variable X takes a value in the
interval (a, b) is equal to the shaded area of the region under the curve y = f(x)
(see Figure 5.1), where f(x) is the probability distribution function. This is also known
as the probability density function (pdf).
[Figure 5.1: curve y = f(x) with the area between x = a and x = b shaded]
The shaded area of Figure 5.1 represents the area under the probability density function
lying between a and b. This gives the probability that the event is found between a and b.
A function f(x) can serve as probability density function (pdf) of a continuous random
variable, X, if the following conditions are satisfied:
1. f (x) ≥ 0
∞
2. ∫ f ( x ) dx=1
−∞
We will now take some examples to demonstrate what we have discussed so far.
Example 19
$f(x) = \begin{cases} \dfrac{x^{2}}{3}, & -1 < x < 2 \\ 0, & \text{otherwise} \end{cases}$
Solution
(a) Condition 1: for f(x) to be a probability density function, f(x) ≥ 0. We see that
the function is never negative since x² cannot be negative.
Hence, condition 1 is satisfied for all the values between -1 and 2.
Condition 2: $\int_{-\infty}^{\infty} f(x)\,dx = 1$
$\int_{-1}^{2} \dfrac{x^{2}}{3}\,dx = \dfrac{1}{3}\int_{-1}^{2} x^{2}\,dx = \dfrac{1}{9}\left[x^{3}\right]_{-1}^{2} = \dfrac{8}{9} + \dfrac{1}{9} = 1$
We see from the calculation that the second condition is also satisfied. Since the two
conditions are satisfied, we conclude that the function f(x) is a probability density
function.
(b) $P(0 < X < 1) = \int_{0}^{1} \dfrac{x^{2}}{3}\,dx = \dfrac{1}{9}\left[x^{3}\right]_{0}^{1} = \dfrac{1}{9}$
This means that the probability that X lies between 0 and 1 is 1/9.
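The two integrals in this example can be approximated numerically as a sanity check. The sketch below uses a simple midpoint rule written from scratch; `integrate` is a hypothetical helper, not a library function, and the step count is an arbitrary choice.

```python
def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def f(x):
    """pdf from the example: x^2 / 3 on (-1, 2), 0 elsewhere."""
    return x * x / 3 if -1 < x < 2 else 0.0

total_area = integrate(f, -1, 2)     # should be close to 1
p_0_to_1 = integrate(f, 0, 1)        # should be close to 1/9
print(round(total_area, 6), round(p_0_to_1, 6))
```

A numerical check like this does not replace the exact integration, but it quickly catches arithmetic slips in the limits or the antiderivative.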
Example 20
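A random variable has the pdf
$f(x) = \begin{cases} kx, & 0 < x < 4 \\ 0, & \text{otherwise} \end{cases}$
Solution
(a) Use $\int_{-\infty}^{\infty} f(x)\,dx = 1$
$\int_{0}^{4} kx\,dx = k\int_{0}^{4} x\,dx = \dfrac{k}{2}\left[x^{2}\right]_{0}^{4} = \dfrac{16k}{2} = 8k = 1$
k = 1/8
(b) $P(1 < X < 3) = \dfrac{1}{8}\int_{1}^{3} x\,dx = \dfrac{1}{16}\left[x^{2}\right]_{1}^{3} = \dfrac{9 - 1}{16} = \dfrac{1}{2}$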
Example 21
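Given that the function
$f(x) = \begin{cases} \dfrac{x^{2}}{3}, & -1 < x < b \\ 0, & \text{otherwise} \end{cases}$
is a probability density function
Solution
$\int_{-1}^{b} f(x)\,dx = 1$
$\int_{-1}^{b} \dfrac{x^{2}}{3}\,dx = \dfrac{1}{3}\int_{-1}^{b} x^{2}\,dx = \dfrac{1}{9}\left[x^{3}\right]_{-1}^{b}$
b³/9 + 1/9 = 1
b³ + 1 = 9, b³ = 8, b³ = 2³, so b = 2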
It is important to mention here that the distribution function F(x) should be
differentiable; thus, for a probability to be found it is necessary that the derivative
$\dfrac{d}{dx}F(x) = f(x)$ exists.
We must also note that if X is a continuous random variable having probability density
function f(x) then for any constant a, P(X = a) = 0. The reason is that if X is a
continuous random variable then
$P(X = a) = \int_{a}^{a} f(x)\,dx = 0$
Based on this fact, it is worth noting that for a continuous random variable the following
statement is true:
P(a ≤ X ≤ b) = P(a < X ≤ b) = P(a ≤ X < b) = P(a < X < b)
Let us note carefully that this is not true in the case of a discrete random variable.
In Example 20b, assuming we want to find P(1 ≤ X ≤ 3), the answer will not be
different from what we had.
Thus, $\dfrac{1}{8}\int_{1}^{3} x\,dx = \dfrac{1}{2}$.
Exercise
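$f(x) = \begin{cases} k(x - 1), & 1 \le x \le 2 \\ 0, & \text{otherwise} \end{cases}$
(a) Find the value of k.
(b) Hence find (i) P(1.0 < X < 1.5) (ii) P(X < 2.0), integrating from −∞ to 2
(iii) P(X > 3), integrating from 3 to ∞
$f(x) = \begin{cases} \dfrac{1}{8}(x + 1), & 2 < x < 4 \\ 0, & \text{otherwise} \end{cases}$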
The mean, which is also known as the mathematical expectation, is the most used measure
of central tendency. Suppose that X is a continuous random variable and f(x) is its
probability density function; then the mean (mathematical expectation) is defined as
$E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$
We need to note that the mathematical expectation may or may not exist. Note here that
f(x) has been multiplied by x.
Example 24
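Find the mean of the random variable X whose probability density function is
\[ f(x) = \begin{cases} \dfrac{2}{27}(1+x), & 2 \le x \le 5 \\ 0, & \text{otherwise} \end{cases} \]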
Solution
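\[ E(X) = \int_{2}^{5} x \cdot \frac{2}{27}(1+x)\,dx = \frac{2}{27}\int_{2}^{5} (x + x^2)\,dx \]
\[ = \frac{2}{27}\left[\frac{x^2}{2} + \frac{x^3}{3}\right]_{2}^{5} \]
\[ = \frac{2}{27}\left(\frac{25}{2} + \frac{125}{3} - \frac{4}{2} - \frac{8}{3}\right) \]
\[ = \frac{2}{27}\left(\frac{99}{2}\right) = \frac{198}{54} = \frac{99}{27} = \frac{11}{3} \]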
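The mean in Example 24 can be confirmed with exact polynomial integration. This is a sketch under the assumption that the pdf is (2/27)(1 + x) on 2 ≤ x ≤ 5; the helper `integrate_poly` is a hypothetical name introduced for illustration.

```python
from fractions import Fraction

def integrate_poly(coeffs, a, b):
    """Exact integral over [a, b] of sum(coeffs[i] * x**i)."""
    a, b = Fraction(a), Fraction(b)
    return sum(Fraction(c) / (i + 1) * (b ** (i + 1) - a ** (i + 1))
               for i, c in enumerate(coeffs))

c = Fraction(2, 27)
# f(x)     = (2/27)(1 + x)     has coefficients [c, c]
# x * f(x) = (2/27)(x + x**2)  has coefficients [0, c, c]
area = integrate_poly([c, c], 2, 5)      # total probability
mean = integrate_poly([0, c, c], 2, 5)   # E(X)
print(area, mean)
```

The same helper also confirms that the pdf integrates to 1 over its range, which is worth checking before computing any expectation.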
6.2 The Variance
In unit 2, we discussed the meaning and the importance of the variance. Our concern
here therefore is to learn how to find it in the case of the continuous random variable.
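Suppose X is a continuous random variable with mean μ = E(X), then the variance is defined as
\[ \mathrm{Var}(X) = E\left[(X - \mu)^2\right] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx \]
Hence,
\[ \mathrm{Var}(X) = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \left(\int_{-\infty}^{\infty} x f(x)\,dx\right)^2 \]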
Example 25
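The random variable X has the probability density function
\[ f(x) = \begin{cases} \dfrac{1}{8}x, & 0 \le x \le 4 \\ 0, & \text{otherwise} \end{cases} \]
Find the variance of the random variable X and hence find the standard deviation.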
Solution
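\[ \mathrm{Var}(X) = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \left(\int_{-\infty}^{\infty} x f(x)\,dx\right)^2 \]
\[ \mathrm{Var}(X) = \int_{0}^{4} x^2 \left(\frac{1}{8}x\right) dx - \left(\int_{0}^{4} x \left(\frac{1}{8}x\right) dx\right)^2 \]
\[ E(X^2) = \frac{1}{8}\int_{0}^{4} x^3\,dx = \frac{1}{8}\left[\frac{x^4}{4}\right]_{0}^{4} = \frac{1}{8}\cdot 4^3 = 8 \]
\[ E(X) = \int_{0}^{4} x\left(\frac{1}{8}x\right) dx = \frac{1}{8}\int_{0}^{4} x^2\,dx = \frac{1}{8}\left[\frac{x^3}{3}\right]_{0}^{4} = \frac{1}{8}\cdot\frac{64}{3} = \frac{64}{24} = \frac{16}{6} = \frac{8}{3} \]
Therefore
\[ \mathrm{Var}(X) = 8 - \left(\frac{8}{3}\right)^2 = \frac{8}{1} - \frac{64}{9} = \frac{72 - 64}{9} = \frac{8}{9} \]
Thus,
\[ \sigma = \sqrt{\mathrm{Var}(X)} = \sqrt{\frac{8}{9}} = \frac{2\sqrt{2}}{3}. \]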
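The variance and standard deviation in Example 25 can be reproduced in a few lines. The sketch assumes the pdf f(x) = x/8 on 0 ≤ x ≤ 4; the function `moment` is an illustrative helper that uses the closed form of the integral of x^(n+1)/8 from 0 to 4.

```python
from fractions import Fraction
import math

def moment(n):
    """E(X**n) for f(x) = x/8 on [0, 4]: integral of x**(n+1)/8 from 0 to 4."""
    return Fraction(4 ** (n + 2), 8 * (n + 2))

mean = moment(1)                       # E(X)
second_moment = moment(2)              # E(X**2)
variance = second_moment - mean ** 2   # Var(X) = E(X**2) - [E(X)]**2
sigma = math.sqrt(variance)            # standard deviation as a float
print(mean, second_moment, variance, sigma)
```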
Exercise
1.
\[ f(x) = \begin{cases} 2(x-1), & 1 \le x \le 2 \\ 0, & \text{otherwise} \end{cases} \]
2. Assume that the probability density function of the random variable X, is given
by
UNIT 7
The knowledge acquired from the five sessions so far, can also help us differentiate
between discrete and continuous probability distributions.
The most widely used probability distribution in the entire field of Statistics is the
Normal distribution. It is important to know that the term Normal used should not
be interpreted to mean that other types of distributions are “abnormal”. It is used
basically due to the fact that its curve provides approximation to the pattern
observed in so many diverse histograms based on real data sets.
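A random variable X is said to have a Normal distribution with mean μ and variance σ²
($-\infty < \mu < \infty$ and $\sigma > 0$) if X has a continuous distribution for which the density function
f(x) is defined as
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad \text{for } -\infty < x < \infty. \]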
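The density can be written directly as a function and compared with the one in Python's standard library. This is a sketch: `normal_pdf` is an illustrative name, and the values μ = 50, σ = 10 and the sample points are assumptions chosen only for the comparison.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density: exp(-0.5 * ((x - mu) / sigma)**2) / (sigma * sqrt(2*pi))."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

mu, sigma = 50, 10                      # illustrative parameters
reference = NormalDist(mu, sigma)       # library implementation for comparison
points = [20, 40, 50, 65]               # hypothetical sample points
pairs = [(normal_pdf(x, mu, sigma), reference.pdf(x)) for x in points]
for ours, theirs in pairs:
    print(ours, theirs)
```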
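The curve is constructed so that the area under the curve bounded by the two ordinates X =
x1 and X = x2 equals the probability that the random variable X assumes a value between
x1 and x2. This area is shown in Figure 7.1.
Fig. 7.1: Area under the normal curve between x1 and x2 (horizontal axis marks: x1, μ, x2)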
Integrating this function is indeed tedious. However, the way out will be discussed later.
1. The mode, which is a point on the horizontal axis (where the curve is maximum),
occurs at x = μ.
2. The general Normal curve is bell-shaped and symmetric about the vertical axis
through the mean, μ, (mean=median=mode).
3. The curve has its points of inflection at x = μ ± σ.
4. The Normal curve approaches the horizontal axis asymptotically as you proceed in
either direction away from the mean.
5. The total area under the curve and above the horizontal axis is equal to 1.
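The five properties listed above can be checked numerically. The sketch below assumes μ = 0 and σ = 1 purely for illustration, uses a finite-difference estimate of the second derivative to look for the change of curvature at μ ± σ, and approximates the total area by the probability within ten standard deviations of the mean.

```python
from statistics import NormalDist

mu, sigma = 0.0, 1.0                      # illustrative parameters
dist = NormalDist(mu, sigma)
f = dist.pdf

def second_diff(x, h=1e-3):
    """Central finite-difference estimate of the second derivative of f at x."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

grid = [mu + sigma * i / 100 for i in range(-500, 501)]
mode = max(grid, key=f)                                    # property 1
symmetric = all(abs(f(mu + d) - f(mu - d)) < 1e-12
                for d in (0.5, 1.0, 2.5))                  # property 2
inside = second_diff(mu + 0.9 * sigma)                     # property 3: concave inside
outside = second_diff(mu + 1.1 * sigma)                    # property 3: convex outside
tail = f(mu + 8 * sigma)                                   # property 4: near zero
area = dist.cdf(mu + 10 * sigma) - dist.cdf(mu - 10 * sigma)  # property 5
print(mode, symmetric, inside, outside, tail, area)
```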
Below is a typical shape of the normal curve.
Figure 7.2: Typical normal curve (horizontal axis marks: μ − x, μ, μ + x)
The shape of the Normal curve depends largely on its standard deviation. The
probability density function of a Normal distribution with a small standard
deviation has a high peak and is very much concentrated around the mean.
However, a large standard deviation gives much dispersion about the mean,
and the peak is quite flat (that is, quite low).
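The effect of the standard deviation on the peak can be made concrete: at x = μ the density equals 1/(σ√(2π)), so doubling σ halves the peak. The sketch below uses hypothetical σ values chosen only for illustration.

```python
import math

def peak_height(sigma):
    """Height of the normal curve at x = mu, i.e. 1 / (sigma * sqrt(2*pi))."""
    return 1 / (sigma * math.sqrt(2 * math.pi))

sigmas = [0.5, 1.0, 2.0]                  # hypothetical standard deviations
peaks = [peak_height(s) for s in sigmas]  # smaller sigma gives a taller peak
for s, p in zip(sigmas, peaks):
    print(s, round(p, 4))
```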
Figure 7.3 shows a normal distribution with different values of standard deviation.
Fig. 7.3: Normal curves for three standard deviations, σ1 > σ2 > σ3
Now, if X has a Normal distribution with mean μ and variance σ², then the random
variable Z, given by $Z = \frac{X - \mu}{\sigma}$, has the Standard Normal distribution with mean μ = 0
and variance σ² = 1. The probability density function of the standard normal
distribution is given by
\[ f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}} \]
The advantage in using the Normal distribution is that standard normal tables are
available for use. We therefore need not do any direct integration in using the normal
distribution. For instance, if we want to find P(x1 < X < x2), we need to transform it,
using the formula $Z = \frac{X - \mu}{\sigma}$, to the form P(z1 < Z < z2). Thus, $z_1 = \frac{x_1 - \mu}{\sigma}$ and $z_2 = \frac{x_2 - \mu}{\sigma}$.
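The transformation to Z values is a one-line calculation. In the sketch below, the function name `to_z`, the parameters μ = 50 and σ = 10, and the x values are all hypothetical and serve only to illustrate the formula.

```python
def to_z(x, mu, sigma):
    """Standardise x: Z = (x - mu) / sigma, with sigma required to be positive."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

mu, sigma = 50, 10          # illustrative mean and standard deviation
xs = [35, 50, 62]           # hypothetical values of X
zs = [to_z(x, mu, sigma) for x in xs]
print(zs)
```

A value below the mean gives a negative Z, the mean itself gives Z = 0, and a value above the mean gives a positive Z.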
Example 26
A random variable X has a normal distribution with mean 50 and standard deviation 10.
Convert the following to the Z values.
We have two major types of standard normal tables. One type uses the entire area under
the standard normal curve, and the other uses half the area under the curve
(50% of the total area). We will learn how to use the half-area type, since that is the
most used standard normal table.
To know which type you are using, you need to look at the Table. You will see a graph
indicating Full- Table or a Half-Table. Figure 7.4 shows the half- table and figure 7.5
shows the full-table.
Fig 7.4 (half-table) and Fig 7.5 (full-table): standard normal curves with 0 and z marked on the horizontal axis
The other way you can differentiate between the full table and the half-table is that, on
the full-table you have the Z- value showing both negative and positive values on the
table but the half- table has no negative values at all. The negative values are to be
deduced.
7.1.2 Reading the Probabilities from the Standard Normal Table (Area between
Vertical Lines)
We begin by stating the steps that will enable us to learn how to read probabilities from
the standard normal Table.
Steps
1. Draw the diagram and the necessary vertical lines.
2. Indicate the required area on the diagram.
3. Break the Z-value into two parts: the first two digits form the first part, and the
second part will be the difference. For example, if Z = 1.344 then the first part
will be 1.3 and the difference will be 0.044.
4. The first column indicating Z is for the first part and the other columns are for
the difference.
5. Trace the first part to meet the second part on the table for the required
probability.
We need to mention that the shaded area (required area) will determine the actual
solution to the problem. The symbol ∅ will be used to denote the probabilities to be
read from table.
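The table-reading steps above can be imitated in code. This is only a sketch: it assumes a half-table (area between 0 and z), assumes z is given to two decimal places as in printed tables, and computes the entry from the library's cumulative distribution instead of looking it up. The names `split_z` and `half_table` are illustrative.

```python
from statistics import NormalDist

def split_z(z):
    """Split a two-decimal z into the row value and the column difference."""
    hundredths = round(z * 100)
    row = (hundredths // 10) / 10     # e.g. 1.7 for z = 1.74
    col = (hundredths % 10) / 100     # e.g. 0.04 for z = 1.74
    return row, col

def half_table(z):
    """Area under the standard normal curve between 0 and z (z >= 0), to 4 d.p."""
    return round(NormalDist().cdf(z) - 0.5, 4)

row, col = split_z(1.74)
entry = half_table(1.74)
print(row, col, entry)
```

The row and column identify the cell on the printed table, and `half_table` gives the value one would expect to find in that cell.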
Example 27
Find the following probabilities by using the standard Normal Table.
(a) P(0.0 < Z < 1.74); (b) P(0.34 < Z < 2.23); (c) P(Z > 1.35);
(d) P(-2.30 < Z < 0.0); (e) P(Z > -0.41); and (f) P(Z < -2.01)
Solution
(a)
[Diagram: standard normal curve with the area between 0 and 1.74 shaded]
The required probability is P(0.0 < Z < 1.74) = ∅(1.74).
To read ∅(1.74) from the standard Normal Table, follow the steps above. Look for 1.7
in the first column and then look for the difference, 0.04. Trace the two values to meet
on the table. The value there is the probability for ∅(1.74). If this is done properly
using the Table in the appendix, the value will be 0.4591. Therefore, the probability is
0.4591.
(b)
[Diagram: standard normal curve with the area between 0.34 and 2.23 shaded]
The required probability is the shaded area of the graph above.
To read ∅(2.23) from the standard Normal Table, we look for 2.2 in the first column
and then look for the difference, 0.03. Trace the two values to meet on the table. The
value there is the probability for ∅(2.23). The table value is 0.4871. In the same way,
∅(0.34) is 0.1331.
Hence,
P(0.34 < Z < 2.23) = ∅(2.23) − ∅(0.34) = 0.4871 − 0.1331
= 0.3540
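(c)
[Diagram: standard normal curve with the area to the right of 1.35 shaded]
Since the area required is at the extreme right, we subtract whatever we read for ∅
(1.35) from 0.5. That is, P(Z > 1.35) = 0.5 − ∅(1.35).
From the Table, ∅(1.35) is 0.4115. Hence, the probability is 0.5 − 0.4115 = 0.0885.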
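(d)
[Diagram: standard normal curve with the area between -2.30 and 0 shaded]
By the symmetry of the curve about 0, P(-2.30 < Z < 0.0) = P(0.0 < Z < 2.30) = ∅(2.30) = 0.4893.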
(e)
[Diagram: standard normal curve with the area to the right of -0.41 shaded]
From the diagram we see that the shaded area is more than half of the graph. Therefore,
the solution will be P(Z > -0.41) = 0.5 + ∅(0.41).
From the Table, ∅(0.41) is 0.1591. Therefore, the probability is 0.6591, that is,
0.5 + 0.1591.
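(f)
[Diagram: standard normal curve with the area to the left of -2.01 shaded]
From the diagram the shaded area is at the extreme left of the graph. Therefore, the
probability is P(Z < -2.01) = 0.5 − ∅(2.01) = 0.5 − 0.4778 = 0.0222.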
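The six probabilities in Example 27 can be cross-checked against the standard normal cumulative distribution in Python's standard library. This is a sketch, not a replacement for the table method: part (c) is taken as the area to the right of 1.35, as worked in the solution, and small differences in the fourth decimal place can arise because table entries are rounded before they are combined.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal: mean 0, standard deviation 1

def between(a, b):
    """P(a < Z < b) from the cumulative distribution function."""
    return Z.cdf(b) - Z.cdf(a)

answers = {
    "a": between(0.0, 1.74),
    "b": between(0.34, 2.23),
    "c": 1 - Z.cdf(1.35),      # area to the right of 1.35
    "d": between(-2.30, 0.0),
    "e": 1 - Z.cdf(-0.41),     # area to the right of -0.41
    "f": Z.cdf(-2.01),         # area to the left of -2.01
}
for part, value in answers.items():
    print(part, round(value, 4))
```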
Example 28
An electric firm manufactures a light bulb that has a length of life that is normally
distributed with mean 800 hours and standard deviation of 40 hours. Find the
probability that a bulb burns between 778 and 834 hours.
Solution
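\[ P(778 < X < 834) = P\left(\frac{778 - 800}{40} < Z < \frac{834 - 800}{40}\right) \]
= P(-0.55 < Z < 0.85).
[Diagram: standard normal curve with the area between -0.55 and 0.85 shaded]
P(-0.55 < Z < 0.85) = ∅(0.85) + ∅(0.55) = 0.3023 + 0.2088 = 0.5111
The probability that the bulb burns between 778 and 834 hours is therefore 0.5111.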
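The light-bulb calculation can be verified without tables. The sketch below assumes the stated mean of 800 hours and standard deviation of 40 hours; the variable names are illustrative.

```python
from statistics import NormalDist

bulb = NormalDist(mu=800, sigma=40)     # bulb life in hours

z_low = (778 - 800) / 40                # standardised lower limit
z_high = (834 - 800) / 40               # standardised upper limit
p = bulb.cdf(834) - bulb.cdf(778)       # P(778 < X < 834)
print(z_low, z_high, round(p, 4))
```

Working with X directly or with the standardised Z values gives the same probability, which is the point of the transformation.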
ADDITIONAL TOPICS
2. JOINT DISTRIBUTIONS