Lecture 04
Discrete Probability Distributions
Introduction
We learnt in lecture 3 that in statistics we deal with two types of variables:
discrete and continuous. When the values of these variables are governed by chance,
they are called random variables. In this lecture we will learn about the probability
distributions associated with discrete random variables. Before we do that, we first
introduce the concept of a probability distribution through the following example.
Example 4.1
We consider the experiment of tossing two fair dice, as we did in example 3.7 of
lecture 3. We have already seen that the sample space for this experiment is
S = { (1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
      (2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
      (3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
      (4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
      (5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
      (6,1) (6,2) (6,3) (6,4) (6,5) (6,6) }
Let X denote a random variable defined as the sum of the numbers on the two
faces of the dice. We can find the probabilities of the different values of X as follows.
p{X = 2} = p{(1, 1)} = 1/36
p{X = 3} = p{(1, 2), (2, 1)} = 2/36
p{X = 4} = p{(1, 3), (2, 2), (3, 1)} = 3/36
p{X = 5} = p{(1, 4), (2, 3), (3, 2), (4, 1)} = 4/36
p{X = 6} = p{(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)} = 5/36
p{X = 7} = p{(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)} = 6/36
p{X = 8} = p{(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)} = 5/36
p{X = 9} = p{(3, 6), (4, 5), (5, 4), (6, 3)} = 4/36
p{X = 10} = p{(4, 6), (5, 5), (6, 4)} = 3/36
p{X = 11} = p{(5, 6), (6, 5)} = 2/36
p{X = 12} = p{(6, 6)} = 1/36
We observe from the above example that the random variable X can take the integer
values from 2 to 12. If we define a function
f(x) = p(X = x)
and if the random variable X is a discrete random variable, as defined in lecture 2, the
function is referred to as a discrete probability distribution function. The function f(x) is
defined on the entire real line, and for any given value of x, f(x) is the probability that
the random variable X assumes the value x. For example, f(2) is the probability that
the random variable X assumes the numerical value 2. The number of possible values of X
may be limited or unlimited. For example, the number of faces that can show in a roll of
a fair die is limited, but the possible number of defective items
on a production line can be (at least theoretically!) infinite. We must also observe that
since f(x) is a probability, 0 ≤ f(x) ≤ 1 and $\sum_{\text{all } x} f(x) = 1$.
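The enumeration in example 4.1 can also be carried out mechanically. The following Python sketch (the variable names are illustrative and not part of the lecture) lists the 36 equally likely outcomes of two fair dice, counts how many give each sum, and divides by 36 to obtain f(x) as exact fractions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first die, second die).
outcomes = list(product(range(1, 7), repeat=2))

# Count the outcomes that produce each sum, then divide by 36.
counts = Counter(a + b for a, b in outcomes)
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(f"f({x}) = {p}")  # fractions are printed in lowest terms

# The two properties of a probability distribution function.
assert all(0 <= p <= 1 for p in pmf.values())
assert sum(pmf.values()) == 1
```

Because `Fraction` reduces automatically, a value such as 2/36 is displayed as 1/18; the probabilities are the same as those listed in the example.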
Example 4.2
A voice communication system for a business contains 48 external lines. At a
particular time, the system is observed, and some of the lines are being used. Let the
random variable X denote the number of lines in use. Then, X can assume any of the
integer values 0 through 48. When the system is observed, if 10 lines are in use, x =
10.
Example 4.3
Define the random variable X to be the number of contamination particles on a wafer
in semiconductor manufacturing. Although wafers possess a number of
characteristics, the random variable X summarizes the wafer only in terms of the
number of particles. The possible values of X are integers from zero up to some large
value that represents the maximum number of particles that can be found on one of
the wafers. If this maximum number is very large, we might simply assume that the
range of X is the set of integers from zero to infinity.
Example 4.4
We consider an example of four fair coins tossed. The sample space for the problem is
S = { tttt httt thtt ttht
      ttth hhtt htht htth
      thht thth tthh hhht
      hhth hthh thhh hhhh }
We define the random variable X as the number of heads. Therefore, the possible
values for X are 0, 1, 2, 3, and 4. The values of the probabilities are as follows.
x          0     1     2     3     4
p(X = x)   1/16  1/4   3/8   1/4   1/16
The table shows the probability distribution function.
To find the cumulative probability for a random variable at a value x0, we add the
probabilities for all values of the random variable less than or equal to x0; that is,
F(x0) = p(X ≤ x0). First we demonstrate this with two examples; then we will see
the utility of this function.
Example 4.5
For example 4.1, the probability distribution function and the cumulative distribution
function are shown in the table below.
x      2     3     4     5      6      7      8      9      10     11     12
f(x)   1/36  2/36  3/36  4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36
F(x)   1/36  3/36  6/36  10/36  15/36  21/36  26/36  30/36  33/36  35/36  36/36
[Figure: plot of f(x) (left axis, 0 to 6/36) and F(x) (right axis, 0 to 1.00) against x.]
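As a rough illustration, the F(x) values for example 4.1 can be obtained as a running total of the f(x) values. The sketch below assumes the closed form f(x) = (6 − |x − 7|)/36 for the two-dice sum; this is only a shorthand for the values found in example 4.1, not a formula stated in the lecture.

```python
from fractions import Fraction
from itertools import accumulate

xs = list(range(2, 13))

# f(x) for the sum of two fair dice, written compactly (assumed shorthand).
f = [Fraction(6 - abs(x - 7), 36) for x in xs]

# F(x) is the running total of f over all values up to and including x.
F = list(accumulate(f))
cdf = dict(zip(xs, F))

for x in xs:
    # Show everything over the common denominator 36, as in the table.
    print(x, f"{int(f[x - 2] * 36)}/36", f"{int(cdf[x] * 36)}/36")
```

The last value of the running total must equal 1, which is a quick check that no probability has been lost.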
Example 4.6
Suppose that a day’s production of 850 manufactured parts contains 50 parts that do
not conform to customer requirements. Two parts are selected at random, without
replacement, from the batch. Let the random variable X equal the number of
nonconforming parts in the sample. What is the cumulative distribution function of
X?
The question can be answered by first finding the probability distribution function of
X.
$$p(X = 0) = \frac{800}{850} \cdot \frac{799}{849} = 0.886$$
$$p(X = 1) = 2 \cdot \frac{800}{850} \cdot \frac{50}{849} = 0.111$$
$$p(X = 2) = \frac{50}{850} \cdot \frac{49}{849} = 0.003$$
Therefore, the cumulative distribution function is
F(0) = p(X ≤ 0) = p(X = 0) = 0.886
F(1) = p(X ≤ 1) = p(X = 0) + p(X = 1) = 0.886 + 0.111 = 0.997
F(2) = p(X ≤ 2) = p(X = 0) + p(X = 1) + p(X = 2) = 0.886 + 0.111 + 0.003 = 1.0
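The arithmetic of example 4.6 can be checked with exact fractions. This is a minimal sketch; the names `N`, `D` and `G` are introduced here for illustration only.

```python
from fractions import Fraction

N, D = 850, 50      # batch size and number of nonconforming parts
G = N - D           # number of conforming parts

# Two draws without replacement.
p0 = Fraction(G, N) * Fraction(G - 1, N - 1)     # both conforming
p1 = 2 * Fraction(G, N) * Fraction(D, N - 1)     # exactly one nonconforming
p2 = Fraction(D, N) * Fraction(D - 1, N - 1)     # both nonconforming

pmf = {0: p0, 1: p1, 2: p2}

# Build F(x) as a running total of the probabilities.
cdf = {}
running = Fraction(0)
for x in sorted(pmf):
    running += pmf[x]
    cdf[x] = running
    print(x, round(float(pmf[x]), 3), round(float(cdf[x]), 3))
```

Working with fractions confirms that F(2) is exactly 1; the value 1.0 in the example is not merely the result of rounding.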
The cumulative distribution function is useful in many places. For example, for the
problem in example 4.3, if we are interested in finding p(4 ≤ x ≤ 8) using f(x), we
have to add the probabilities for all values of the random variable between 4 and 8,
inclusive. Using the cumulative distribution function, we simply write
p(4 ≤ x ≤ 8) = F(8) − F(3).
The usefulness of the cumulative distribution function may not be very clear at this
time, but when we study continuous distributions, we will see that the cumulative
distribution function is central to finding probabilities.
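The identity p(4 ≤ x ≤ 8) = F(8) − F(3) can be tried out on a distribution whose probabilities are known. The sketch below uses the two-dice sum of example 4.1 rather than the wafer problem, since the lecture gives no numbers for the latter; the helper `interval_probability` is a hypothetical name and assumes an integer-valued random variable.

```python
from fractions import Fraction
from itertools import accumulate

xs = list(range(2, 13))
pmf = {x: Fraction(6 - abs(x - 7), 36) for x in xs}      # two-dice sum
cdf = dict(zip(xs, accumulate(pmf[x] for x in xs)))


def interval_probability(cdf, a, b):
    """p(a <= X <= b) = F(b) - F(a - 1) for an integer-valued X.

    Assumes that a - 1 is either in the table or lies below the smallest
    possible value, in which case F(a - 1) is taken to be 0.
    """
    return cdf[b] - cdf.get(a - 1, Fraction(0))


via_cdf = interval_probability(cdf, 4, 8)
via_sum = sum(pmf[x] for x in range(4, 9))
print(via_cdf, via_sum)
```

Both routes give the same value; the cumulative route needs one subtraction regardless of how many values lie in the interval.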
Mathematical expectation
The list can continue. Therefore a relevant question is: what is mathematical
expectation, and how is it calculated?
The expected value of a variable x is its long-term average under the given
conditions. To understand how it is calculated, let us take the example of the waiting time
at a gas station. We have not yet learnt all the techniques needed to handle the problem in its
entirety. Therefore, we will look at a simplified version. Let us suppose that
the waiting times at the station have been recorded over a long period, and the
data are as follows.
Waiting time in minutes, x   0    2    4    6    8
No. of customers             12   22   28   26   16
With the tools we have learnt in the last lecture, we can find the mean time from the
above data.
$$x_{avg} = \frac{0 \times 12 + 2 \times 22 + 4 \times 28 + 6 \times 26 + 8 \times 16}{12 + 22 + 28 + 26 + 16} = \frac{440}{104} = 4.23 \text{ minutes}$$
Therefore the expected waiting time at the gas station is 4.23 minutes. We understand
that the expected waiting time is only an average estimate. The actual waiting time
can be more or less than this time.
The same average can be written as
$$x_{avg} = 0 \times \frac{12}{104} + 2 \times \frac{22}{104} + 4 \times \frac{28}{104} + 6 \times \frac{26}{104} + 8 \times \frac{16}{104}$$
Suppose that you drive into the gas station and ask: what is the probability that you
will wait for 4 minutes? From the above figures, the answer is 28/104. The probability of
waiting for 2 minutes is 22/104. Extending this concept, we can write $x_{avg}$ as
$$x_{avg} = 0 \times p(0) + 2 \times p(2) + 4 \times p(4) + 6 \times p(6) + 8 \times p(8)$$
or
$$x_{avg} = \sum_{\text{all } x} x \, p(x)$$
Written like this, $x_{avg}$ is referred to as the ‘expected value of x’, and is denoted by E(x).
We can extend this definition of expected values to expected values of other entities,
like
$$E(x^2) = \sum_{\text{all } x} x^2 \, p(x)$$
$$E(x^3) = \sum_{\text{all } x} x^3 \, p(x)$$
And, in general
$$E[H(x)] = \sum_{\text{all } x} H(x) \, p(x) \qquad (2)$$
The expected value of H(x) can be thought of as being obtained by weighting H(x) with
the probability function of x. The quantities E(x), E(x²), E(x³) are referred to as the
first, second, and third moments of the random variable x.
We learnt about point estimators in an earlier lecture. Expected values can be related to
those point estimators as follows.
Mean = E(x)
Variance = E(x²) – [E(x)]²
Skewness and kurtosis can also be written in terms of third and fourth moments,
respectively.
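The relations above can be tried out on the waiting-time data. The following is a minimal sketch in which the helper `expect` is a hypothetical name standing for equation (2); probabilities are taken as relative frequencies from the table of 104 customers.

```python
from fractions import Fraction

waits = [0, 2, 4, 6, 8]             # waiting time in minutes, x
customers = [12, 22, 28, 26, 16]    # number of customers for each x

total = sum(customers)
p = [Fraction(c, total) for c in customers]   # p(x) as relative frequency


def expect(h):
    """E[H(x)] = sum over all x of H(x) p(x)."""
    return sum(h(x) * px for x, px in zip(waits, p))


mean = expect(lambda x: x)               # first moment, E(x)
second_moment = expect(lambda x: x**2)   # second moment, E(x^2)
variance = second_moment - mean**2       # E(x^2) - [E(x)]^2

print(float(mean), float(second_moment), float(variance))
```

The mean reproduces the 4.23 minutes found earlier, and the variance follows from the first two moments without any further pass over the data.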
Example 4.7
Find the expected value of a roll of a fair 6-faced die.
From our earlier lecture, we know that all the faces are equally probable.
Therefore, we can construct the probability table as follows.
Face, x       1    2    3    4    5    6
Probability   1/6  1/6  1/6  1/6  1/6  1/6
Therefore
$$E(x) = 1 \times \frac{1}{6} + 2 \times \frac{1}{6} + 3 \times \frac{1}{6} + 4 \times \frac{1}{6} + 5 \times \frac{1}{6} + 6 \times \frac{1}{6} = 3.5$$
Example 4.8
Your friend runs a computer maintenance company. He charges his customers
Tk. 4000.00 per year, and performs all maintenance for one year. Past record shows
the cost distribution of computer maintenance as follows.
Cost (Tk.) 2000 4000 6000 8000
Percent 48 30 14 8
Will the company make a profit with this maintenance plan?
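One possible way to approach example 4.8 is to compare the yearly charge with the expected maintenance cost per customer. The sketch below assumes that "profit" means the charge minus the expected cost, averaged over many customers; the lecture does not spell out this interpretation.

```python
costs = [2000, 4000, 6000, 8000]    # maintenance cost in Tk.
percents = [48, 30, 14, 8]          # percentage of customers at each cost
charge = 4000                       # yearly charge per customer in Tk.

# E(cost) = sum of cost x probability, with probability = percent / 100.
expected_cost = sum(c * pct for c, pct in zip(costs, percents)) / 100
expected_profit = charge - expected_cost

print(expected_cost, expected_profit)
```

Under this assumption a positive expected profit means that the plan is profitable on average, although an individual customer may still cost more than the charge.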
Exercises
4.1 Verify that the following functions are probability distribution functions, and
determine the requested probabilities.
x –2 –1 0 1 2
f(x) 1/8 2/8 2/8 2/8 1/8
(a) p(x ≤ 2) (b) p(x > –2)
(c) p(–1 ≤ x < 1) (d) p(x ≤ –1 or x = 2)
4.3 $f(x) = \frac{2x + 1}{25}$, x = 0, 1, 2, 3, 4
(a) p(x = 4) (b) p(x ≤ 1)
(c) p(2 ≤ x ≤ 4) (d) p(x > –10)
4.5 Marketing estimates that a new instrument for the analysis of soil samples will
be very successful, moderately successful, or unsuccessful, with probabilities
0.3, 0.6, and 0.1, respectively. The yearly revenue associated with a very
successful, moderately successful, or unsuccessful product is $10 million, $5
million, and $1 million, respectively. Let the random variable X denote the
yearly revenue of the product. Determine the probability distribution function of
X.
4.6 A disk drive manufacturer estimates that in five years a storage device with 1
terabyte of capacity will sell with probability 0.5, a storage device with 500
gigabytes capacity will sell with a probability 0.3, and a storage device with 100
gigabytes capacity will sell with probability 0.2. The revenue associated with the
sales in that year are estimated to be $50 million, $25 million, and $10 million,
respectively. Let X be the revenue of storage devices during that year.
Determine the probability distribution function of X.
4.7 An optical inspection system is to distinguish among different part types. The
probability of a correct classification of any part is 0.98. Suppose that three parts
are inspected and that the classifications are independent. Let the random
variable X denote the number of parts that are correctly classified. Determine the
probability distribution function of X.
4.8 In a semiconductor manufacturing process, three wafers from a lot are tested.
Each wafer is classified as pass or fail. Assume that the probability that a wafer
passes the test is 0.8 and that wafers are independent. Determine the probability
distribution function of the number of wafers from a lot that pass the test.
4.9 The distributor of a machine for cytogenics has developed a new model. The
company estimates that when it is introduced into the market, it will be very
successful with a probability 0.6, moderately successful with a probability 0.3,
and not successful with probability 0.1. The estimated yearly profit associated
with the model being very successful is $15 million and being moderately
successful is $5 million; not successful would result in a loss of $500,000. Let X
be the yearly profit of the new model. Determine the probability distribution
function of X.
4.12 Determine the cumulative distribution function for the random variable in
Exercise 4.1; also determine the following probabilities:
(a) p(x ≤ 1.25) (b) p(x ≤ 2.2)
(c) p(–1.1 < x < 1) (d) p(x > 0)
4.13 Determine the cumulative distribution function for the random variable in
Exercise 4.3; also determine the following probabilities:
(a) p(x < 1.5) (b) p(x ≤ 3)
(c) p(x > 2) (d) p(1 < x ≤ 2)
4.14 Determine the cumulative distribution function for the random variable in
Exercise 4.5.
4.15 Determine the cumulative distribution function for the random variable in
Exercise 4.6.
4.16 Determine the cumulative distribution function for the random variable in
Exercise 4.8.
4.17 Determine the cumulative distribution function for the variable in Exercise 4.9.
Verify that the following functions are cumulative distribution functions, and
determine the probability distribution function and the requested probabilities.
4.18
$$F(x) = \begin{cases} 0.0 & x < 1 \\ 0.5 & 1 \le x < 3 \\ 1.0 & 3 \le x \end{cases}$$
(a) p(x ≤ 3) (b) p(x ≤ 2)
(c) p(1 ≤ x ≤ 2) (d) p(x > 2)
4.19 Errors in an experimental transmission channel are found when the transmission
is checked by a certifier that detects missing pulses. The number of errors found
in an eight-bit byte is a random variable with the following distribution:
$$F(x) = \begin{cases} 0.0 & x < 1 \\ 0.7 & 1 \le x < 4 \\ 0.9 & 4 \le x < 7 \\ 1.0 & 7 \le x \end{cases}$$
Determine each of the following probabilities:
(a) p(x ≤ 4) (b) p(x > 7)
(c) p(x ≤ 5) (d) p(x > 4)
(e) p(x ≤ 2)
4.20
4.21 The thickness of wood paneling (in inches) that a customer orders is a random
variable with the following cumulative distribution function:
$$F(x) = \begin{cases} 0.0 & x < 1/8 \\ 0.2 & 1/8 \le x < 1/4 \\ 0.9 & 1/4 \le x < 3/8 \\ 1.0 & 3/8 \le x \end{cases}$$
Determine the following probabilities:
(a) p(x ≤ 1/18) (b) p(x ≤ 1/4)
(c) p(x ≤ 5/16) (d) p(x > ¼)
(e) p(x ≤ 1/2)
4.22 If the range of X is the set {0, 1, 2, 3, 4} and P(X = x) = 0.2 determine the mean
and variance of the random variable.
4.23 Determine the mean and variance of the random variable in Exercise 4.1.
4.24 Determine the mean and variance of the random variable in Exercise 4.3.
4.25 Determine the mean and variance of the random variable in Exercise 4.5.
4.26 Determine the mean and variance of the random variable in Exercise 4.6.
4.27 The range of the random variable X is [0, 1, 2, 3, x], where x is unknown. If each
value is equally likely and the mean of X is 6, determine x.
We have seen earlier that the first four moments are very important for a distribution.
These are used to find the descriptive nature of the sample. In fact higher moments
also become important in many other applications. Moment generating functions can
be used to find these moments easily. For a random variable X with a probability
distribution function f, the moment generating function is defined as
$$m_X(t) = E(e^{tX})$$
Using this function, the kth moment can be found using
$$\left. \frac{d^k m_X(t)}{dt^k} \right|_{t=0} = E(X^k) \qquad (3)$$
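Equation (3) can be checked numerically for a simple case. The sketch below uses the fair six-faced die and approximates the derivatives at t = 0 by central finite differences; the step size `h` is an arbitrary choice made here, and the result is an approximation rather than a symbolic derivative.

```python
import math

faces = [1, 2, 3, 4, 5, 6]


def mgf(t):
    """m_X(t) = E(e^{tX}) for a fair six-faced die."""
    return sum(math.exp(t * x) for x in faces) / len(faces)


h = 1e-4  # finite-difference step (assumed; small but not too small)

# Central differences approximate the first and second derivatives at t = 0.
first_moment = (mgf(h) - mgf(-h)) / (2 * h)
second_moment = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2

# Direct sums E(X) and E(X^2) for comparison.
exact_first = sum(faces) / len(faces)
exact_second = sum(x**2 for x in faces) / len(faces)

print(first_moment, exact_first)
print(second_moment, exact_second)
```

The two approaches agree to several decimal places, which is all a finite-difference check can be expected to show.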
Binomial distribution
For this and some other distributions that we will discuss next, we need to understand
the Bernoulli trial. Bernoulli trials have the following properties.
• Each trial is identical and independent in the sense that the outcome of one
trial does not affect the outcome of another trial.
• The outcome of each trial can be classed as a ‘success’ (s) or a ‘failure’ (f).
Examples of Bernoulli trials are the flipping of a coin and inspection at a quality control
station (the outcome is either defective or non-defective). Bernoulli trials can also be
defined in experiments whose outcomes are not normally thought of in terms of success
and failure. For example, in the toss of a fair die, we know that there are 6 possible
outcomes. But if success is defined as ‘face 6 shows up’, then the experiment becomes
a Bernoulli trial.
In an attempt to find an expression for the probability function, let us assume that we
are looking for 3 successes in 5 trials, where each trial succeeds with probability p. If
there are 3 successes, the other two trials must have ended in failure. Let us consider one
possible outcome of this event, {sssff}. The probability of this outcome is $p^3(1-p)^2$.
Another possible outcome is {sfsfs}. The probability of this outcome is also $p^3(1-p)^2$. In
fact, since there are 3 successes and 2 failures, the probability of each such
outcome is $p^3(1-p)^2$. We are now faced with the question: in how many ways can 3
successes happen in 5 trials? This is a classic problem of combinations. We know from
high school arithmetic that there are 5C3 = 10 ways in which 3 successes can occur in 5 trials.
Therefore, the probability of 3 successes in 5 trials is $p(x = 3) = \binom{5}{3} p^3 (1-p)^2$.
Generalizing this to x successes in n trials, we obtain the expression for binomial
distribution.
$$f(x) = \binom{n}{x} p^x (1-p)^{n-x} \qquad (9)$$
The cumulative distribution is
$$F(t) = \sum_{x=0}^{t} \binom{n}{x} p^x (1-p)^{n-x} \qquad (10)$$
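Equations (9) and (10) translate directly into code. The following sketch uses `math.comb` from the Python standard library (available from Python 3.8); the function names are illustrative.

```python
from math import comb


def binomial_pmf(x, n, p):
    """f(x): probability of exactly x successes in n Bernoulli trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(t, n, p):
    """F(t): probability of at most t successes in n Bernoulli trials."""
    return sum(binomial_pmf(x, n, p) for x in range(0, t + 1))


# 3 successes in 5 trials with p = 1/2: 10 * (1/2)^5
print(binomial_pmf(3, 5, 0.5))
# F(n) must be 1 because it adds the probabilities of all possible values.
print(binomial_cdf(5, 5, 0.5))
```

For large n a library routine may be numerically safer, but the direct form keeps the correspondence with the equations visible.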
Binomial probability
Example 4.9
Studies of air traffic controllers have shown that it is difficult to maintain
accuracy when working for long periods of time on data display screens. A
surprising aspect of these studies is that the ability to detect spots on a radar screen
decreases as their appearance becomes too rare. The probability of correctly
identifying a signal is approximately 0.9 when 100 signals arrive per 30-minute
period. This probability drops to 0.5 when 10 signals arrive at random over a
30-minute period. The hypothesis is that unstimulated minds tend to wander. Let x
denote the number of signals correctly identified in a 30-minute time span in which
10 signals arrive. The probability of successful detection of a radar spot is then
p = ½.
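(a) The probability function is
$$f(x) = \binom{10}{x} \left(\frac{1}{2}\right)^{x} \left(\frac{1}{2}\right)^{10-x}$$
(b) The moment generating function is
$$m_x(t) = \left(\frac{1}{2} + \frac{1}{2} e^{t}\right)^{10}$$
(c) The mean, µ = 10 × ½ = 5, and the variance, σ² = 10 × ½ × ½ = 10/4.
(d) The cumulative distribution function is
$$F(t) = \sum_{x=0}^{t} \binom{10}{x} \left(\frac{1}{2}\right)^{x} \left(\frac{1}{2}\right)^{10-x}$$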
Example 4.10
A communication system consists of n components, each of which will,
independently, function with probability p. The total system will be able to operate
effectively if at least one-half of its components function correctly. For what value of
p is a 5-component system more likely to perform more effectively than a 3-
component system?
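A 3-component system operates effectively if at least 2 of its components function,
which happens with probability
$$\binom{3}{2} p^2 (1-p)^1 + \binom{3}{3} p^3 (1-p)^0$$
Similarly, a 5-component system operates effectively if at least 3 of its components
function. Therefore, if the 5-component system is to be more effective than the
3-component system, then we have
$$\binom{5}{3} p^3 (1-p)^2 + \binom{5}{4} p^4 (1-p)^1 + \binom{5}{5} p^5 (1-p)^0 \ge \binom{3}{2} p^2 (1-p)^1 + \binom{3}{3} p^3 (1-p)^0$$
or $10p^3(1-p)^2 + 5p^4(1-p) + p^5 \ge 3p^2(1-p) + p^3$
or, after collecting terms and dividing by $p^2$, $3(p-1)^2(2p-1) \ge 0$
or p ≥ ½, with the 5-component system strictly better when ½ < p < 1.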
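The conclusion of example 4.10 can be checked numerically. In this sketch `prob_effective` is a hypothetical helper that adds the binomial probabilities of having at least one-half of the n components functioning.

```python
from math import comb


def prob_effective(n, p):
    """P(at least one-half of n independent components function)."""
    k_min = (n + 1) // 2   # smallest whole number of components >= n / 2
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(k_min, n + 1))


for p in (0.3, 0.5, 0.7):
    print(p, prob_effective(3, p), prob_effective(5, p))
```

For p below ½ the 3-component system comes out ahead, at p = ½ the two are equal, and above ½ the 5-component system is the more reliable one, in line with the algebra.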
Exercises
4.28 For each scenario described below, state whether or not the binomial
distribution is a reasonable model for the random variable and why. State any
assumptions you make.
(a) A production process produces thousands of temperature transducers. Let X
denote the number of nonconforming transducers in a sample of size 30
selected at random from the process.
(b) From a batch of 50 temperature transducers, a sample of size 30 is selected
without replacement. Let X denote the number of nonconforming
transducers in the sample.
(c) Four identical electronic components are wired to a controller that can
switch from a failed component to one of the remaining spares. Let X denote
the number of components that have failed after a specified period of
operation.
(d) Let X denote the number of accidents that occur along the federal highways
in Arizona during a one-month period.
(e) Let X denote the number of correct answers by a student taking a multiple
choice exam in which a student can eliminate some of the choices as being
incorrect in some questions and all of the incorrect choices in other
questions.
(f) Defects occur randomly over the surface of a semiconductor chip. However,
only 80% of defects can be found by testing. A sample of 40 chips with one
defect each is tested. Let X denote the number of chips in which the test
finds a defect.
(g) Reconsider the situation in part (f). Now, suppose the sample of 40 chips
consists of chips with 1 and with 0 defects.
(h) A filling operation attempts to fill detergent packages to the advertised
weight. Let X denote the number of detergent packages that are underfilled.
(i) Errors in a digital communication channel occur in bursts that affect several
consecutive bits. Let X denote the number of bits in error in a transmission
of 100,000 bits.
(j) Let X denote the number of surface flaws in a large coil of galvanized steel.
4.29 The random variable X has a binomial distribution with n = 10 and p = 0.5.
Sketch the probability mass function of X.
(a) What value of X is most likely?
(b) What value(s) of X is (are) least likely?
4.30 The random variable X has a binomial distribution with n = 10 and p = 0.5.
Determine the following probabilities:
(a) p(X = 5) (b) p(X ≤ 2)
(c) p(X ≥ 9) (d) p(3 ≤ X < 5)
4.31 The random variable X has a binomial distribution with n = 10 and p = 0.01.
Determine the following probabilities.
(a) p(X = 5) (b) p(X ≤ 2)
(c) p(X ≥ 9) (d) p(3 ≤ X < 5)
4.34 An electronic product contains 40 integrated circuits. The probability that any
integrated circuit is defective is 0.01, and the integrated circuits are independent.
The product operates only if there are no defective integrated circuits. What is
the probability that the product operates?
4.35 Let X denote the number of bits received in error in a digital communication
channel, and assume that X is a binomial random variable with p = 0.001. If
1000 bits are transmitted, determine the following:
(a) p(X = 1) (b) p(X ≥ 1)
(c) p(X ≤ 2) (d) mean and variance of X
4.36 The phone lines to an airline reservation system are occupied 40% of the time.
Assume that the events that the lines are occupied on successive calls are
independent. Assume that 10 calls are placed to the airline.
(a) What is the probability that for exactly three calls the lines are occupied?
(b) What is the probability that for at least one call the lines are not occupied?
(c) What is the expected number of calls in which the lines are all occupied?
4.37 Batches that consist of 50 coil springs from a production process are checked for
conformance to customer requirements. The mean number of nonconforming
coil springs in a batch is 5. Assume that the number of nonconforming springs in
a batch, denoted as X, is a binomial random variable.
(a) What are n and p?
(b) What is p(X ≤ 2)?
4.38 A statistical process control chart example. Samples of 20 parts from a metal
punching process are selected every hour. Typically, 1% of the parts require
rework. Let X denote the number of parts in the sample of 20 that require
rework. A process problem is suspected if X exceeds its mean by more than
three standard deviations.
(a) If the percentage of parts that require rework remains at 1%, what is the
probability that X exceeds its mean by more than three standard deviations?
(b) If the rework percentage increases to 4%, what is the probability that X
exceeds 1?
(c) If the rework percentage increases to 4%, what is the probability that X
exceeds 1 in at least one of the next five hours of samples?
4.39 Because not all airline passengers show up for their reserved seat, an airline sells
125 tickets for a flight that holds only 120 passengers. The probability that a
passenger does not show up is 0.10, and the passengers behave independently.
(a) What is the probability that every passenger who shows up can take the
flight?
(b) What is the probability that the flight departs with empty seats?
4.40 This exercise illustrates that poor quality can affect schedules and costs. A
manufacturing process has 100 customer orders to fill. Each order requires one
component part that is purchased from a supplier. However, typically, 2% of the
components are identified as defective, and the components can be assumed to
be independent.
(a) If the manufacturer stocks 100 components, what is the probability that the
100 orders can be filled without reordering components?
(b) If the manufacturer stocks 102 components, what is the probability that the
100 orders can be filled without reordering components?
(c) If the manufacturer stocks 105 components, what is the probability that the
100 orders can be filled without reordering components?
4.41 A multiple choice test contains 25 questions, each with four answers. Assume a
student just guesses on each question.
(a) What is the probability that the student answers more than 20 questions
correctly?
(b) What is the probability the student answers less than 5 questions correctly?
4.42 A particularly long traffic light on your morning commute is green 20% of the
time that you approach it. Assume that each morning represents an independent
trial.
(a) Over five mornings, what is the probability that the light is green on exactly
one day?
(b) Over 20 mornings, what is the probability that the light is green on exactly
four days?
(c) Over 20 mornings, what is the probability that the light is green on more
than four days?
Geometric distribution
The geometric random variable x is defined as the number of trials needed to obtain
the first success. The sample space for this problem is
S = {s, fs, ffs, fffs, . . .}
Therefore, if success is obtained in the nth trial, the previous (n – 1)
trials ended in failure. If the probability of success is p, the
probability of failure is (1 – p). For the sake of simplicity, the probability of
failure is often denoted by q. Therefore the geometric probability distribution can be
written as
$$p(x = n) = (1-p)^{n-1} p = q^{n-1} p$$
Or simply
$$f(x) = q^{x-1} p \qquad (4)$$
$$\text{Standard deviation, } \sigma = \sqrt{\frac{q}{p^2}} \qquad (8)$$
We illustrate the use of these expressions through the following example.
Example 4.11
Random digits are integers selected from among the integers {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
one at a time in such a way that at each stage in the selection process the integer
chosen is just as likely to be one digit as any other. In simulation experiments it is
often necessary to generate a series of random digits. In generating such a series, let
x denote the number of trials needed to obtain the first instance of a particular number,
let us say, 0. The experiment consists of a series of independent, identical trials with
‘success’ being the generation of a 0. The probability of success is p = 1/10. It is clear
that x is a geometric random variable. Substituting this into equations (4)–(8), we
obtain
$$f(x) = \left(\frac{9}{10}\right)^{x-1} \frac{1}{10}$$
$$F(x) = 1 - (0.9)^{x}$$
$$\mu = \frac{1}{1/10} = 10$$
$$\sigma^2 = \frac{9/10}{(1/10)^2} = 90$$
$$\sigma = 9.487$$
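The numbers in example 4.11 can be reproduced with a few lines of code. The sketch below takes the general forms F(x) = 1 − q^x, µ = 1/p and σ² = q/p² from the substitutions shown in the example; the function names are illustrative.

```python
import math


def geometric_pmf(x, p):
    """f(x) = q^(x-1) p: first success occurs on trial x."""
    return (1 - p) ** (x - 1) * p


def geometric_cdf(x, p):
    """F(x) = 1 - q^x: first success occurs on or before trial x."""
    return 1 - (1 - p) ** x


p = 0.1                          # probability of generating a 0
mean = 1 / p
variance = (1 - p) / p**2
std_dev = math.sqrt(variance)

print(mean, variance, std_dev)

# F(5) should equal the sum of f(1) ... f(5).
print(geometric_cdf(5, p), sum(geometric_pmf(k, p) for k in range(1, 6)))
```

The same functions apply to any geometric problem once p is known, for instance to the wafer inspection example that follows.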
Example 4.12
The probability that a wafer contains a large particle of contamination is 0.015. If it
is assumed that the wafers are independent, what is the probability that exactly 125
wafers need to be analyzed before a large particle is detected?
Let X denote the number of samples analyzed until a large particle is detected. Then
X is a geometric random variable with p = 0.015. The requested probability is
$$p(X = 125) = (0.985)^{124} (0.015) = 0.0023$$
The average number of wafers that have to be tested before a contamination is found is
$\frac{1}{p} = \frac{1}{0.015} \approx 67$. The standard deviation of the number of wafers that have to be tested
is $\sqrt{\frac{q}{p^2}} = \sqrt{\frac{0.985}{0.015^2}} \approx 66$.
m x (t ) =
( pe )
t r
(1 − qe ) t r
r
The mean, µ =
p
rq
The variance, σ2 =
p2
Example 4.13
Cotton linters used in the production of solid rocket propellants are subjected to a
nitration process that enables the cotton fibers to go into the solution. The process is
90% effective in that the materials produced can be shaped as desired in a later
processing stage with probability 0.9. What is the probability that exactly 20 lots will
be produced in order to obtain the third defective lot?
Here ‘success’ is obtaining a defective lot; hence p = 0.1 and r = 3. The
probability that x = 20 is given by
$$f(20) = \binom{19}{2} (0.9)^{17} (0.1)^{3} = 0.0285$$
The expected value is 3/0.1 = 30, meaning that on average 30 trials would be
required to produce the third defective lot.
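The calculation in example 4.13 can be sketched in code. The general probability function used below, f(x) = C(x − 1, r − 1) p^r q^(x − r), is inferred from the form of f(20) in the example and is stated here as an assumption; the function name is illustrative.

```python
from math import comb


def neg_binomial_pmf(x, r, p):
    """P(the r-th success occurs on trial x), assuming independent trials."""
    if x < r:
        return 0.0   # at least r trials are needed for r successes
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)


p, r = 0.1, 3                      # defective lot is the 'success'
prob_20 = neg_binomial_pmf(20, r, p)
mean = r / p

# The probabilities over all possible x should add up to (nearly) 1;
# the sum is truncated at 1000 trials, beyond which the terms are negligible.
total = sum(neg_binomial_pmf(x, r, p) for x in range(r, 1001))

print(round(prob_20, 4), mean, total)
```

The factor C(x − 1, r − 1) counts the ways of placing the first r − 1 successes among the first x − 1 trials, the last trial being a success by definition.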
Exercises
4.43 Suppose the random variable X has a geometric distribution with p = 0.5.
Determine the following probabilities:
(a) p(X = 1) (b) p(X = 4)
(c) p(X = 8) (d) p(X ≤ 2)
(e) p(X > 2)
4.44 Suppose the random variable X has a geometric distribution with a mean of 2.5.
Determine the following probabilities:
(a) p(X = 1) (b) p(X = 4)
(c) p(X = 5) (d) p(X ≤ 3)
(e) p(X > 3)
4.46 In a clinical study, volunteers are tested for a gene that has been found to
increase the risk for a disease. The probability that a person carries the gene is
0.1.
(a) What is the probability 4 or more people will have to be tested before 2 with
the gene are detected?
(b) How many people are expected to be tested before 2 with the gene are
detected?
4.47 Assume that each of your calls to a popular radio station has a probability of
0.02 of connecting, that is, of not obtaining a busy signal. Assume that your
calls are independent.
(a) What is the probability that your first call that connects is your tenth call?
(b) What is the probability that it requires more than five calls for you to
connect?
(c) What is the mean number of calls needed to connect?
4.48 In Exercise 4.42, recall that a particularly long traffic light on your morning
commute is green 20% of the time that you approach it. Assume that each
morning represents an independent trial.
(a) What is the probability that the first morning that the light is green is the
fourth morning that you approach it?
(b) What is the probability that the light is not green for 10 consecutive
mornings?
4.49 A trading company has eight computers that it uses to trade on the New York
Stock Exchange (NYSE). The probability of a computer failing in a day is
0.005, and the computers fail independently. Computers are repaired in the
evening and each day is an independent trial.
(a) What is the probability that all eight computers fail in a day?
(b) What is the mean number of days until a specific computer fails?
(c) What is the mean number of days until all eight computers fail in the same
day?
4.50 In Exercise 4.38, recall that 20 parts are checked each hour and that X denotes
the number of parts in the sample of 20 that require rework.
(a) If the percentage of parts that require rework remains at 1%, what is the
probability that hour 10 is the first sample at which X exceeds 1?
(b) If the rework percentage increases to 4%, what is the probability that hour
10 is the first sample at which X exceeds 1?
(c) If the rework percentage increases to 4%, what is the expected number of
hours until X exceeds 1?
4.52 Show that the probability density function of a negative binomial random
variable equals the probability density function of a geometric random variable
when r = 1. Show that the formulas for the mean and variance of a negative
binomial random variable equal the corresponding results for geometric random
variable when r = 1.
4.53 Suppose that X is a negative binomial random variable with p = 0.2 and r = 4.
Determine the following:
(a) E(X) (b) p(X = 20)
(c) p(X = 19) (d) p(X = 21)
(e) The most likely value for X
4.56 A fault-tolerant system that processes transactions for a financial services firm
uses three separate computers. If the operating computer fails, one of the two
spares can be immediately switched online. After the second computer fails, the
last computer can be immediately switched online. Assume that the probability
of a failure during any transaction is and that the transactions can be considered
to be independent events.
(a) What is the mean number of transactions before all computers have failed?
(b) What is the variance of the number of transactions before all computers have
failed?
Hypergeometric distribution
Sampling from a finite population can be done in two ways. In the first, an item is
selected, inspected, and returned to the population for possible reselection. In the
second, the item is set aside so that it cannot be selected again. The former is
called sampling with replacement, and the latter is called sampling without
replacement. In the case of sampling without replacement, the draws are not independent
and hence do not form a binomial process. Rather, the number of items of a given type
in the sample follows a distribution known as the hypergeometric distribution.
Independent University, Bangladesh. Lecture Notes on Probability and Statistics.
Figure 4.3. General setting for hypergeometric distribution. (N objects in total, split into a group of r objects and a group of N – r objects; n objects are chosen, x from the group of r and n – x from the group of N – r.)
Suppose that n objects are to be selected, without replacement, from a total of N
objects. There are $\binom{N}{n}$ ways to perform this selection. We will choose x objects from
the group of r objects, and the remaining n − x from the group of N − r objects. The choice is
illustrated in figure 4.3. There are $\binom{r}{x}$ ways to choose x objects from the group of r
objects, and $\binom{N-r}{n-x}$ ways to choose n − x objects from the N − r objects. Therefore the
hypergeometric probability distribution is

$$f(x) = \frac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}}$$

The mean is $\mu = n\,\frac{r}{N}$, and the variance is $\sigma^2 = n\,\frac{r}{N}\,\frac{N-r}{N}\,\frac{N-n}{N-1}$.
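The formula and its moments can be checked numerically. The following is a minimal sketch, not part of the original notes: the function name `hypergeom_pmf` and the values N = 20, r = 8, n = 5 are assumptions chosen only for illustration. It sums the probabilities over all x and compares the directly computed mean and variance with the closed-form expressions above.

```python
from math import comb

def hypergeom_pmf(x, N, r, n):
    """P(X = x) when n objects are drawn without replacement from N,
    of which r belong to the group being counted."""
    # math.comb rejects negative arguments, so screen those out first;
    # it returns 0 when asked to choose more objects than are available.
    if x < 0 or n - x < 0:
        return 0.0
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

# Illustrative values (assumed for this sketch, not taken from the text)
N, r, n = 20, 8, 5
support = range(0, n + 1)
probs = [hypergeom_pmf(x, N, r, n) for x in support]

total = sum(probs)
mean = sum(x * p for x, p in zip(support, probs))
var = sum((x - mean) ** 2 * p for x, p in zip(support, probs))

# Closed-form mean and variance of the hypergeometric distribution
mean_formula = n * r / N
var_formula = n * (r / N) * ((N - r) / N) * ((N - n) / (N - 1))

print(f"sum of probabilities = {total:.6f}")
print(f"mean:     direct = {mean:.6f}, formula = {mean_formula:.6f}")
print(f"variance: direct = {var:.6f}, formula = {var_formula:.6f}")
```

The factor (N − n)/(N − 1) in the variance is what distinguishes sampling without replacement from the binomial case; it approaches 1 when N is much larger than n.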
Example 4.14
The components of a 6-component system are to be randomly chosen from a bin of
20 used components. The resulting system will be functional if at least 4 of its 6
components are in working condition. If 15 of the 20 components in the bin are in
working condition, what is the probability that the resulting system will be
functional?
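One way to work Example 4.14, sketched here rather than taken from the notes, is to let X be the number of working components among the 6 chosen. X is then hypergeometric with N = 20, r = 15 and n = 6, and the system is functional when X ≥ 4. The helper name `pmf` is our own.

```python
from math import comb

N, r, n = 20, 15, 6   # bin size, working components in the bin, components drawn

def pmf(x):
    """Hypergeometric probability of exactly x working components in the sample."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

# The system is functional when at least 4 of the 6 components work
p_functional = sum(pmf(x) for x in range(4, n + 1))
print(f"P(X >= 4) = {p_functional:.4f}")   # prints P(X >= 4) = 0.8687
```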
Exercises
4.59 A lot of 75 washers contains 5 in which the variability in thickness around the
circumference of the washer is unacceptable. A sample of 10 washers is selected
at random, without replacement.
(a) What is the probability that none of the unacceptable washers is in the
sample?
(b) What is the probability that at least one unacceptable washer is in the
sample?
(c) What is the probability that exactly one unacceptable washer is in the
sample?
(d) What is the mean number of unacceptable washers in the sample?
4.60 A company employs 800 men under the age of 55. Suppose that 30% carry a
marker on the male chromosome that indicates an increased risk for high blood
pressure.
(a) If 10 men in the company are tested for the marker in this chromosome,
what is the probability that exactly 1 man has the marker?
(b) If 10 men in the company are tested for the marker in this chromosome,
what is the probability that more than 1 has the marker?
4.61 Printed circuit cards are placed in a functional test after being populated with
semiconductor chips. A lot contains 140 cards, and 20 are selected without
replacement for functional testing.
(a) If 20 cards are defective, what is the probability that at least 1 defective card
is in the sample?
(b) If 5 cards are defective, what is the probability that at least 1 defective card
appears in the sample?
4.62 Magnetic tape is slit into half-inch widths that are wound into cartridges. A
slitter assembly contains 48 blades. Five blades are selected at random and
evaluated each day for sharpness. If any dull blade is found, the assembly is
replaced with a newly sharpened set of blades.
(a) If 10 of the blades in an assembly are dull, what is the probability that the
assembly is replaced the first day it is evaluated?
(b) If 10 of the blades in an assembly are dull, what is the probability that the
assembly is not replaced until the third day of evaluation? [Hint: Assume the
daily decisions are independent, and use the geometric distribution.]
(c) Suppose on the first day of evaluation, two of the blades are dull, on the
second day of evaluation six are dull, and on the third day of evaluation, ten
are dull. What is the probability that the assembly is not replaced until the
third day of evaluation? [Hint: Assume the daily decisions are independent.
However, the probability of replacement changes every day.]
4.63 A state runs a lottery in which 6 numbers are randomly selected from 40,
without replacement. A player chooses 6 numbers before the state’s sample is
selected.
(a) What is the probability that the 6 numbers chosen by a player match all 6
numbers in the state’s sample?
(b) What is the probability that 5 of the 6 numbers chosen by a player appear in
the state’s sample?
(c) What is the probability that 4 of the 6 numbers chosen by a player appear in
the state’s sample?
(d) If a player enters one lottery each week, what is the expected number of
weeks until a player matches all 6 numbers in the state’s sample?
Poisson distribution
The last discrete distribution that we discuss is the Poisson distribution. The
Poisson distribution arises as the limit of the binomial distribution when the number
of trials n becomes very large (formally, n → ∞) while the mean np is held fixed, so that
the success probability p becomes small. To derive the expression for the Poisson
probability distribution, we start with the expression for the binomial distribution.
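$$p(x) = \frac{n!}{x!\,(n-x)!}\, p^x (1-p)^{n-x}$$

We have learnt that for the binomial distribution, the mean is µ = np. To be consistent with
the literature on the Poisson distribution, we denote the mean with λ instead of µ.
Therefore, we can write $p = \lambda/n$. Therefore

$$
\begin{aligned}
p(x) &= \frac{n!}{x!\,(n-x)!}\left(\frac{\lambda}{n}\right)^{x}\left(1-\frac{\lambda}{n}\right)^{n-x} \\
     &= \frac{n!}{(n-x)!\,n^{x}}\,\frac{\lambda^{x}}{x!}\left(1-\frac{\lambda}{n}\right)^{n-x} \\
     &= \frac{n(n-1)(n-2)\cdots(n-x+1)}{n^{x}}\,\frac{\lambda^{x}}{x!}\,\frac{\left(1-\frac{\lambda}{n}\right)^{n}}{\left(1-\frac{\lambda}{n}\right)^{x}}
\end{aligned}
$$

In the limit n → ∞,

$$\frac{n(n-1)(n-2)\cdots(n-x+1)}{n^{x}} \to 1, \qquad \left(1-\frac{\lambda}{n}\right)^{n} \to e^{-\lambda}, \qquad \text{and} \qquad \left(1-\frac{\lambda}{n}\right)^{x} \to 1.$$

Therefore

$$p(x) = \frac{\lambda^{x} e^{-\lambda}}{x!}$$

This is the expression for the Poisson probability distribution. The moment generating
function is

$$m_x(t) = e^{\lambda(e^{t}-1)}$$

The mean and the variance are both λ.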
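The limiting argument can also be checked numerically. The sketch below is an illustration, not part of the original derivation: the values λ = 3 and x = 2 and the helper names `binom_pmf` and `poisson_pmf` are assumptions made for this example. It evaluates the binomial probability with p = λ/n for increasing n and compares it with the Poisson value.

```python
from math import comb, exp, factorial

lam, x = 3.0, 2   # illustrative values, assumed for this sketch

def binom_pmf(x, n, p):
    """Binomial probability of x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """Poisson probability of x events when the mean is lam."""
    return lam**x * exp(-lam) / factorial(x)

target = poisson_pmf(x, lam)
errors = []
for n in (10, 100, 1000, 10000):
    b = binom_pmf(x, n, lam / n)      # keep the mean np = lam fixed
    errors.append(abs(b - target))
    print(f"n = {n:>5}: binomial = {b:.6f}, Poisson = {target:.6f}")
```

The gap between the two probabilities shrinks as n grows, which is the behaviour the derivation predicts.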
Poisson random variables usually arise in connection with what are called Poisson
processes.
Some examples of random variables that usually obey, to a good approximation, the
Poisson probability law (that is, they usually follow the Poisson probability distribution
given above for some value of λ) are:
1. The number of misprints on a page (or a group of pages) of a book.
2. The number of people in a community living to 100 years of age.
3. The number of wrong telephone numbers that are dialed in a day.
4. The number of transistors that fail on their first day of use.
5. The number of customers entering a post office on a given day.
6. The number of α-particles discharged in a fixed period of time from some
radioactive particle.
Example 4.15
Suppose that the average number of accidents occurring weekly on a particular
stretch of a highway equals 3. Calculate the probability that there is at least one
accident this week.
Let X denote the number of accidents occurring on the stretch of highway in question
during this week. Because it is reasonable to suppose that there are a large number of
cars passing along that stretch, each having a small probability of being involved in
an accident, the number of such accidents should be approximately Poisson
distributed. Hence,
$$p(X \ge 1) = 1 - p(X = 0) = 1 - \frac{3^{0} e^{-3}}{0!} = 0.9502$$
Example 4.16
Suppose the probability that an item produced by a certain machine will be defective
is .1. Find the probability that a sample of 10 items will contain at most one defective
item. Assume that the quality of successive items is independent.
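One way to approach Example 4.16, sketched here as an illustration rather than as the notes' own solution: the number of defective items in the sample is binomial with n = 10 and p = 0.1, and it can be approximated by a Poisson random variable with λ = np = 1. The code computes both values so that the quality of the approximation can be seen.

```python
from math import comb, exp

n, p = 10, 0.1

# Exact binomial probability of at most one defective item
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in (0, 1))

# Poisson approximation with the same mean: p(0) + p(1) = e^(-lam) * (1 + lam)
lam = n * p
approx = exp(-lam) * (1 + lam)

print(f"binomial: {exact:.4f}")   # prints binomial: 0.7361
print(f"Poisson:  {approx:.4f}")  # prints Poisson:  0.7358
```

Even with n as small as 10, the two answers agree to about three decimal places.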
Example 4.17
Consider an experiment that consists of counting the number of α particles given off
in a one-second interval by one gram of radioactive material. If we know from past
experience that, on the average, 3.2 such α-particles are given off, what is a good
approximation to the probability that no more than 2 α-particles will appear?
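A worked sketch for Example 4.17, assuming the Poisson model applies: the gram of material contains a very large number of atoms, each with a very small probability of emitting an α-particle during the second, so the count X is approximately Poisson with λ = 3.2. Under that assumption,

```latex
p(X \le 2) = e^{-3.2} + 3.2\,e^{-3.2} + \frac{(3.2)^{2}}{2!}\,e^{-3.2}
           = 9.32\,e^{-3.2} \approx 0.3799
```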
Example 4.18
If the average number of claims handled daily by an insurance company is 5, what
proportion of days have fewer than 3 claims? What is the probability that there will be
4 claims in exactly 3 of the next 5 days? Assume that the number of claims on
different days is independent.
Because the company probably insures a large number of clients, each having a
small probability of making a claim on any given day, it is reasonable to suppose that
the number of claims handled daily, call it X, is a Poisson random variable. Since
E(X ) = 5, the probability that there will be fewer than 3 claims on any given day is
$$p(X < 3) = p(X = 0) + p(X = 1) + p(X = 2) = \frac{5^{0} e^{-5}}{0!} + \frac{5^{1} e^{-5}}{1!} + \frac{5^{2} e^{-5}}{2!} = 0.1247$$
Since any given day will have fewer than 3 claims with probability .125, it follows,
from the law of large numbers, that over the long run 12.5 percent of days will have
fewer than 3 claims.
It follows from the assumed independence of the number of claims over successive
days that the number of days in a 5-day span that have exactly 4 claims is a binomial
random variable with parameters 5 and p(X = 4). Because

$$p(X = 4) = \frac{5^{4} e^{-5}}{4!} = 0.1755$$

it follows that the probability that exactly 3 of the next 5 days will have 4 claims is equal to

$$\binom{5}{3}(0.1755)^{3}(1 - 0.1755)^{2} = 0.0367$$
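The numbers in Example 4.18 can be reproduced with a few lines of code. This is a minimal sketch with our own helper name `poisson_pmf`. Note that "fewer than 3 claims" means X takes the value 0, 1 or 2.

```python
from math import comb, exp, factorial

lam = 5.0   # average number of claims per day

def poisson_pmf(x, lam):
    """Poisson probability of x events when the mean is lam."""
    return lam**x * exp(-lam) / factorial(x)

# Fewer than 3 claims on a given day: X = 0, 1 or 2
p_fewer_than_3 = sum(poisson_pmf(x, lam) for x in range(3))

# Exactly 4 claims on a given day
p4 = poisson_pmf(4, lam)

# Exactly 3 of the next 5 days have 4 claims: binomial with parameters 5 and p4
p_three_of_five = comb(5, 3) * p4**3 * (1 - p4) ** 2

print(f"p(X < 3) = {p_fewer_than_3:.4f}")          # 0.1247
print(f"p(X = 4) = {p4:.4f}")                      # 0.1755
print(f"3 of 5 days with 4 claims = {p_three_of_five:.4f}")   # 0.0367
```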
Exercises
4.64 Suppose X has a Poisson distribution with a mean of 4. Determine the following
probabilities:
(a) p(X = 0) (b) p(X ≤ 2)
(c) p(X = 4) (d) p(X = 8)
4.65 Suppose X has a Poisson distribution with a mean of 0.4. Determine the
following probabilities:
(a) p(X = 0) (b) p(X ≤ 2)
(c) p(X = 4) (d) p(X = 8)
4.66 Suppose that the number of customers that enter a bank in an hour is a Poisson
random variable, and suppose that p(X = 0) = 0.5. Determine the mean and
variance of X.
4.67 The number of telephone calls that arrive at a phone exchange is often modeled
as a Poisson random variable. Assume that on the average there are 10 calls per
hour.
(a) What is the probability that there are exactly 5 calls in one hour?
(b) What is the probability that there are 3 or fewer calls in one hour?
(c) What is the probability that there are exactly 15 calls in two hours?
(d) What is the probability that there are exactly 5 calls in 30 minutes?
4.69 When a computer disk manufacturer tests a disk, it writes to the disk and then
tests it using a certifier. The certifier counts the number of missing pulses or
errors. The number of errors on a test area on a disk has a Poisson distribution
with λ = 0.2.
(a) What is the expected number of errors per test area?
(b) What percentage of test areas have two or fewer errors?
4.70 The number of cracks in a section of interstate highway that are significant
enough to require repair is assumed to follow a Poisson distribution with a mean
of two cracks per mile.
(a) What is the probability that there are no cracks that require repair in 5 miles
of highway?
(b) What is the probability that at least one crack requires repair in mile of
highway?
(c) If the number of cracks is related to the vehicle load on the highway and
some sections of the highway have a heavy load of vehicles whereas other
sections carry a light load, how do you feel about the assumption of a
Poisson distribution for the number of cracks that require repair?
4.72 The number of surface flaws in plastic panels used in the interior of automobiles
has a Poisson distribution with a mean of 0.05 flaw per square foot of plastic
panel. Assume an automobile interior contains 10 square feet of plastic panel.
(a) What is the probability that there are no surface flaws in an auto’s interior?
(b) If 10 cars are sold to a rental company, what is the probability that none of
the 10 cars has any surface flaws?
(c) If 10 cars are sold to a rental company, what is the probability that at most
one car has any surface flaws?
Supplemental Exercises
4.74 A shipment of chemicals arrives in 15 totes. Three of the totes are selected at
random, without replacement, for an inspection of purity. If two of the totes do
not conform to purity requirements, what is the probability that at least one of
the nonconforming totes is selected in the sample?
4.75 The probability that your call to a service line is answered in less than 30
seconds is 0.75. Assume that your calls are independent.
(a) If you call 10 times, what is the probability that exactly 9 of your calls are
answered within 30 seconds?
(b) If you call 20 times, what is the probability that at least 16 calls are
answered in less than 30 seconds?
(c) If you call 20 times, what is the mean number of calls that are answered in
less than 30 seconds?
(d) What is the probability that you must call four times to obtain the first
answer in less than 30 seconds?
(e) What is the mean number of calls until you are answered in less than 30
seconds?
(f) What is the probability that you must call six times in order for two of your
calls to be answered in less than 30 seconds?
(g) What is the mean number of calls to obtain two answers in less than 30
seconds?
4.76 The number of messages sent to a computer bulletin board is a Poisson random
variable with a mean of 5 messages per hour.
(a) What is the probability that 5 messages are received in 1 hour?
(b) What is the probability that 10 messages are received in 1.5 hours?
(c) What is the probability that fewer than two messages are received in one-half
hour?
4.77 A Web site is operated by four identical computer servers. Only one is used to
operate the site; the others are spares that can be activated in case the active
server fails. The probability that a request to the Web site generates a failure in
the active server is 0.0001. Assume that each request is an independent trial.
What is the mean time until failure of all four computers?
4.78 The number of errors in a textbook follows a Poisson distribution with a mean of
0.01 error per page. What is the probability that there are three or fewer errors in
100 pages?
4.79 The probability that an individual recovers from an illness in a one-week time
period without treatment is 0.1. Suppose that 20 independent individuals
suffering from this illness are treated with a drug and 4 recover in a one-week
time period. If the drug has no effect, what is the probability that 4 or more
people recover in a one-week time period?
4.80 Patient response to a generic drug to control pain is scored on a 5-point scale,
where a 5 indicates complete relief. Historically the distribution of scores is
Score        1     2     3     4     5
Probability  0.05  0.1   0.2   0.25  0.4
Two patients, assumed to be independent, are each scored.
(a) What is the probability mass function of the total score?
(b) What is the probability mass function of the average score?
4.83 Messages that arrive at a service center for an information systems manufacturer
have been classified on the basis of the number of keywords (used to help route
messages) and the type of message, either email or voice. Also, 70% of the
messages arrive via email and the rest are voice.
number of keywords 0 1 2 3 4
email 0.1 0.1 0.2 0.4 0.2
voice 0.3 0.4 0.2 0.1 0
Determine the probability mass function of the number of keywords in a
message.
4.85 Determine the probability mass function for the random variable with the
following cumulative distribution function:
$$F(x) = \begin{cases} 0.0 & x < 2 \\ 0.2 & 2 \le x < 5.7 \\ 0.5 & 5.7 \le x < 6.5 \\ 0.8 & 6.5 \le x < 8.5 \\ 1.0 & 8.5 \le x \end{cases}$$
4.86 Each main bearing cap in an engine contains four bolts. The bolts are selected at
random, without replacement, from a parts bin that contains 30 bolts from one
supplier and 70 bolts from another.
(a) What is the probability that a main bearing cap contains all bolts from the
same supplier?
(b) What is the probability that exactly three bolts are from the same supplier?
4.87 Assume the number of errors along a magnetic recording surface is a Poisson
random variable with a mean of one error every bits. A sector of data consists of
4096 eight-bit bytes.
(a) What is the probability of more than one error in a sector?
(b) What is the mean number of sectors until an error is found?
4.89 From 500 customers, a major appliance manufacturer will randomly select a
sample without replacement. The company estimates that 25% of the customers
will provide useful data. If this estimate is correct, what is the probability mass
function of the number of customers that will provide useful data?
(a) Assume that the company samples 5 customers.
(b) Assume that the company samples 10 customers.
4.90 It is suspected that some of the totes containing chemicals purchased from a
supplier exceed the moisture content target. Samples from 30 totes are to be
tested for moisture content. Assume that the totes are independent. Determine
the proportion of totes from the supplier that must exceed the moisture content
target so that the probability is 0.90 that at least one tote in the sample of 30 fails
the test.
4.92 Flaws occur in the interior of plastic used for automobiles according to a
Poisson distribution with a mean of 0.02 flaw per panel.
(a) If 50 panels are inspected, what is the probability that there are no flaws?
(b) What is the expected number of panels that need to be inspected before a
flaw is found?
(c) If 50 panels are inspected, what is the probability that the number of panels
that have one or more flaws is less than or equal to 2?
4.94 A communications channel transmits the digits 0 and 1. However, due to static,
the digit transmitted is incorrectly received with probability .2. Suppose that we
want to transmit an important message consisting of one binary digit. To reduce
the chance of error, we transmit 00000 instead of 0 and 11111 instead of 1. If
the receiver of the message uses “majority” decoding, what is the probability
that the message will be incorrectly decoded? What independence assumptions
are you making? (By majority decoding we mean that the message is decoded as
“0” if there are at least three zeros in the message received and as “1”
otherwise.)
4.95 Suppose that a particular trait (such as eye color or left-handedness) of a person
is classified on the basis of one pair of genes, and suppose that d represents a
dominant gene and r a recessive gene. Thus, a person with dd genes is pure
dominance, one with rr is pure recessive, and one with rd is hybrid. The pure
dominance and the hybrid are alike in appearance. Children receive 1 gene from
each parent. If, with respect to a particular trait, 2 hybrid parents have a total of
4 children, what is the probability that 3 of the 4 children have the outward
appearance of the dominant gene?
4.96 At least one-half of an airplane’s engines are required to function in order for it
to operate. If each engine independently functions with probability p, for what
values of p is a 4-engine plane more likely to operate than a 2-engine plane?