Probability Distribution
Introduction to Probability Distributions
• Random Variable
– Represents a possible numerical value from an
uncertain event
Random variables are classified as either discrete random variables or continuous random variables.
Probability Distributions
• A probability distribution is a listing or graph of all of the
values that a random variable, X, can take and the
probability of each of those values.
• A probability distribution is a theoretical frequency
distribution, i.e. it gives the expected frequencies.
• A frequency distribution lists the observed frequencies of all the
outcomes of an experiment that actually occurred when the
experiment was performed, whereas a probability distribution
lists the possible outcomes and their probabilities.
Discrete Random Variables
• Can only assume a countable number of values
Examples:
– Roll a die twice
Let X be the number of times 4 comes up
(then X could be 0, 1, or 2 times)
– Toss a coin 5 times.
Let X be the number of heads
(then X = 0, 1, 2, 3, 4, or 5)
Discrete Probability Distribution
Experiment: Toss 2 Coins. Let X = # heads.
4 possible outcomes: T T, T H, H T, H H
Probability Distribution
X Value   Probability
0         1/4 = 0.25
1         2/4 = 0.50
2         1/4 = 0.25
[Bar chart: probability 0.25 at X = 0, 0.50 at X = 1, 0.25 at X = 2]
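The same distribution can be tabulated directly. A minimal Python sketch that enumerates the four equally likely outcomes and counts heads:

```python
# Enumerate the 4 outcomes of tossing two fair coins and tabulate
# the distribution of X = number of heads.
from itertools import product
from collections import Counter

outcomes = list(product("HT", repeat=2))            # HH, HT, TH, TT
counts = Counter(seq.count("H") for seq in outcomes)

for x in sorted(counts):
    print(f"P(X = {x}) = {counts[x]}/{len(outcomes)} = {counts[x]/len(outcomes):.2f}")
# P(X = 0) = 0.25, P(X = 1) = 0.50, P(X = 2) = 0.25
```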
Discrete Probability Distributions
• Binomial
• Poisson
• Hypergeometric
Binomial Probability Distribution
Named after Swiss Mathematician Jacob Bernoulli.
A fixed number of observations, n
e.g., 15 tosses of a coin; ten light bulbs taken from a warehouse
Each observation results in one of only two possible outcomes
e.g., head or tail in each toss of a coin; defective or not defective light
bulb
Generally called “success” and “failure”
Probability of success is p, probability of failure is 1 – p
Constant probability for each observation
e.g., Probability of getting a tail is the same each time we toss the coin
Binomial Probability Distribution
Observations are independent
The outcome of one observation does not affect the
outcome of the other
Two sampling methods
Infinite population without replacement
Finite population with replacement
Possible Binomial Distribution Settings
• A manufacturing plant labels items as either
defective or acceptable
• A firm bidding for contracts will either get a
contract or not
• A marketing research firm receives survey
responses of “yes I will buy” or “no I will not”
• New job applicants either accept the offer or
reject it
Rule of Combinations
• The number of combinations of selecting X objects out
of n objects is
nCx = n! / [ X! (n - X)! ]
where:
n! =(n)(n - 1)(n - 2) . . . (2)(1)
X! = (X)(X - 1)(X - 2) . . . (2)(1)
0! = 1 (by definition)
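As a quick check, Python's standard library applies the same n! / [X!(n - X)!] formula; a minimal sketch:

```python
# Combinations rule: compute 5C2 from factorials and from math.comb.
from math import comb, factorial

n, x = 5, 2
by_formula = factorial(n) // (factorial(x) * factorial(n - x))
print(by_formula, comb(n, x))   # both print 10
```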
Binomial Distribution Formula
P(X) = [ n! / ( X! (n - X)! ) ] p^X (1 - p)^(n - X)
P(X) = probability of X successes in n trials,
with probability of success p on each trial
X = number of ‘successes’ in sample,
(X = 0, 1, 2, ..., n)
n = sample size (number of trials
or observations)
p = probability of “success”
Example:
Calculating a Binomial Probability
What is the probability of one success in five observations if the
probability of success is .1?
X = 1, n = 5, and p = 0.1
P(X = 1) = [ n! / ( X! (n - X)! ) ] p^X (1 - p)^(n - X)
         = [ 5! / ( 1! (5 - 1)! ) ] (0.1)^1 (1 - 0.1)^(5 - 1)
         = (5)(0.1)(0.9)^4
         = 0.32805
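A minimal Python sketch of this calculation, using the formula directly (the commented SciPy call, if SciPy is available, gives the same value):

```python
# Binomial probability P(X = 1) for n = 5 trials with p = 0.1.
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) = nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

print(round(binom_pmf(1, 5, 0.1), 5))   # 0.32805

# Optional cross-check, assuming SciPy is installed:
# from scipy.stats import binom
# print(binom.pmf(1, 5, 0.1))           # 0.32805
```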
Binomial Distribution
• The shape of the binomial distribution depends on the
values of p and n
[Bar charts of P(X): for n = 5, p = 0.1 the distribution is right-skewed with most of the probability at X = 0 and X = 1; for n = 5, p = 0.5 it is symmetric and centered between X = 2 and X = 3]
Binomial Distribution
Characteristics
• Mean: μ = E(X) = np
• Variance and standard deviation:
σ^2 = np(1 - p)
σ = sqrt[ np(1 - p) ]
Where n = sample size
p = probability of success
(1 – p) = probability of failure
Binomial Characteristics
Examples
For n = 5 and p = 0.1:
μ = np = (5)(0.1) = 0.5
σ = sqrt[ np(1 - p) ] = sqrt[ (5)(0.1)(1 - 0.1) ] = 0.6708

For n = 5 and p = 0.5:
μ = np = (5)(0.5) = 2.5
σ = sqrt[ np(1 - p) ] = sqrt[ (5)(0.5)(1 - 0.5) ] = 1.118

[Bar charts of P(X) for n = 5, p = 0.1 and for n = 5, p = 0.5, as on the previous slide]
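A short Python sketch reproducing these two sets of figures:

```python
# Mean and standard deviation of the binomial for n = 5, p = 0.1 and p = 0.5.
from math import sqrt

for n, p in [(5, 0.1), (5, 0.5)]:
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    print(f"n={n}, p={p}: mean={mu:.1f}, sd={sigma:.4f}")
# n=5, p=0.1: mean=0.5, sd=0.6708
# n=5, p=0.5: mean=2.5, sd=1.1180
```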
Using Binomial Tables
n = 10
x … p=.20 p=.25 p=.30 p=.35 p=.40 p=.45 p=.50
0 … 0.1074 0.0563 0.0282 0.0135 0.0060 0.0025 0.0010 10
1 … 0.2684 0.1877 0.1211 0.0725 0.0403 0.0207 0.0098 9
2 … 0.3020 0.2816 0.2335 0.1757 0.1209 0.0763 0.0439 8
3 … 0.2013 0.2503 0.2668 0.2522 0.2150 0.1665 0.1172 7
4 … 0.0881 0.1460 0.2001 0.2377 0.2508 0.2384 0.2051 6
5 … 0.0264 0.0584 0.1029 0.1536 0.2007 0.2340 0.2461 5
6 … 0.0055 0.0162 0.0368 0.0689 0.1115 0.1596 0.2051 4
7 … 0.0008 0.0031 0.0090 0.0212 0.0425 0.0746 0.1172 3
8 … 0.0001 0.0004 0.0014 0.0043 0.0106 0.0229 0.0439 2
9 … 0.0000 0.0000 0.0001 0.0005 0.0016 0.0042 0.0098 1
10 … 0.0000 0.0000 0.0000 0.0000 0.0001 0.0003 0.0010 0
 … p=.80 p=.75 p=.70 p=.65 p=.60 p=.55 p=.50 x
(For p > 0.50, read the table using the p values in the bottom row together with the x values in the right-hand column.)
Examples:
n = 10, p = 0.35, x = 3: P(x = 3 | n = 10, p = 0.35) = 0.2522
n = 10, p = 0.75, x = 2: P(x = 2 | n = 10, p = 0.75) = 0.0004 (read from the x = 8 row and the p = .25 column, using the bottom and right-hand labels)
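A short Python sketch (assuming SciPy is available) that reproduces both look-ups, including the complementary reading for p = 0.75:

```python
# Verify the binomial table values, including the p > 0.5 complement reading.
from scipy.stats import binom

print(round(binom.pmf(3, 10, 0.35), 4))   # 0.2522
print(round(binom.pmf(2, 10, 0.75), 4))   # 0.0004
print(round(binom.pmf(8, 10, 0.25), 4))   # 0.0004, same cell read from the bottom row
```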
The Poisson Distribution
• Named after the French mathematician Siméon Denis Poisson
• Example: the number of vehicles passing a single toll booth at rush hour
– We can estimate the average number of vehicles arriving per hour
– If we divide the time period into one-second intervals, the following will be true:
• The probability that exactly one vehicle arrives at any single booth in a given second
is very small and is constant for every one-second interval.
• The probability that two or more vehicles arrive within one second is so
small that we can assign it a value of zero.
• The number of vehicles arriving in a given one-second interval is independent of
the time at which that interval occurs during the rush hour.
• The number of arrivals in any one second does not depend on the number of
arrivals in any other one second.
The Poisson Distribution
• Distribution often used to model the number of incidences of
some characteristic in time or space:
– Arrivals of customers in a queue
– Numbers of flaws in a roll of fabric
– Number of typos per page of text.
Poisson Distribution Formula
P(X) = ( e^(-λ) λ^X ) / X!
where:
X = number of events in an area of opportunity
λ = mean number of occurrences per interval
e = base of the natural logarithm system (2.71828...)
Poisson Distribution
Characteristics
• Mean: μ = λ
• Variance and standard deviation:
σ^2 = λ
σ = sqrt(λ)
where λ = expected number of events
Using Poisson Tables
X \ λ   0.10    0.20    0.30    0.40    0.50    0.60    0.70    0.80    0.90
0 0.9048 0.8187 0.7408 0.6703 0.6065 0.5488 0.4966 0.4493 0.4066
1 0.0905 0.1637 0.2222 0.2681 0.3033 0.3293 0.3476 0.3595 0.3659
2 0.0045 0.0164 0.0333 0.0536 0.0758 0.0988 0.1217 0.1438 0.1647
3 0.0002 0.0011 0.0033 0.0072 0.0126 0.0198 0.0284 0.0383 0.0494
4 0.0000 0.0001 0.0003 0.0007 0.0016 0.0030 0.0050 0.0077 0.0111
5 0.0000 0.0000 0.0000 0.0001 0.0002 0.0004 0.0007 0.0012 0.0020
6 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0002 0.0003
7 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
Example: Find P(X = 2) if λ = 0.50
P(X = 2) = ( e^(-λ) λ^X ) / X! = ( e^(-0.50) (0.50)^2 ) / 2! = 0.0758
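A minimal Python sketch of this calculation from the Poisson formula (SciPy's poisson.pmf would give the same value, if installed):

```python
# Poisson probability P(X = 2) when lambda = 0.50.
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) = e^(-lambda) * lambda^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

print(round(poisson_pmf(2, 0.50), 4))   # 0.0758
```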
Graph of Poisson Probabilities
Graphically, for λ = 0.50:
X     P(X)
0     0.6065
1     0.3033
2     0.0758
3     0.0126
4     0.0016
5     0.0002
6     0.0000
7     0.0000
[Bar chart of P(x) against x for λ = 0.50; P(X = 2) = 0.0758]
Poisson Distribution Shape
• The shape of the Poisson distribution depends on the parameter λ:
[Bar charts of P(x) against x: for λ = 0.50 the probability is concentrated at x = 0 and 1 and the distribution is strongly right-skewed; for λ = 3.00 the distribution is more spread out and more nearly symmetric]
The Hypergeometric Distribution
• “n” trials in a sample taken from a finite
population of size N
• Sample taken without replacement
• Outcomes of trials are dependent
• Concerned with finding the probability of “X”
successes in the sample where there are “A”
successes in the population
The Hypergeometric Distribution
• It models the total number of successes in a fixed-size sample
drawn without replacement from a finite population.
•It differs from the binomial only in that the population
is finite and the sampling from the population is without
replacement.
•Trials are dependent
Hypergeometric Distribution Formula
P(X) = [ (A C X) × (N-A C n-X) ] / (N C n)
Where
N = population size
A = number of successes in the population
N – A = number of failures in the population
n = sample size
X = number of successes in the sample
n – X = number of failures in the sample
The Hypergeometric Distribution
Example 1 :A carton contains 24 light bulbs, three of which are defective.
What is the probability that, if a sample of six is chosen at random from the
carton of bulbs, x will be defective?
P(X = x) = [ (3 C x) × (21 C 6-x) ] / (24 C 6)

P(X = 0) = [ (3 C 0) × (21 C 6) ] / (24 C 6) = 0.40316
That is, the probability of no defective bulbs in the sample is 0.40316.
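A minimal Python sketch of this example using the hypergeometric formula (the commented SciPy lines show an equivalent library call, assuming SciPy is installed):

```python
# Hypergeometric example: N = 24 bulbs, A = 3 defective, sample of n = 6.
from math import comb

def hypergeom_pmf(x, N, A, n):
    """P(X = x) = [C(A, x) * C(N - A, n - x)] / C(N, n)."""
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)

print(round(hypergeom_pmf(0, 24, 3, 6), 5))   # 0.40316 = P(no defective)

# SciPy equivalent (its argument order is M, n, N = population size,
# successes in population, sample size):
# from scipy.stats import hypergeom
# print(hypergeom.pmf(0, 24, 3, 6))
```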
Continuous Probability Distributions
• A continuous random variable is a variable that can
assume any value on a continuum (can assume an
uncountable number of values)
– thickness of an item
– time required to complete a task
– temperature of a solution
– height, in inches
• These can potentially take on any value; how precisely a value
can be recorded depends only on the ability to measure
accurately, i.e. on exact measurement.
Continuous Probability Distributions
• Normal
• Uniform
• Exponential
The Normal Distribution
Overview
• Discovered in 1733 by Abraham de Moivre (1667-1754) as an approximation to the
binomial distribution when the number of trials is large
• Derived in 1809 by Karl F. Gauss (1777-1855)
• Its importance lies in the Central Limit Theorem, which states that the sum of a
large number of independent random variables (binomial, Poisson, etc.) will
approximate a normal distribution
– Example: Human height is determined by a large number of factors,
both genetic and environmental, which are additive in their effects.
Thus, it follows a normal distribution.
– Suppose that a sample is obtained containing a large number of
observations, each observation being randomly generated in a way that
does not depend on the values of the other observations, and that the
arithmetic average of the observed values is computed. If this procedure
is performed many times, the central limit theorem says that the
computed values of the average will be distributed according to the
normal distribution (commonly known as a "bell curve").
The Normal Distribution
The most widely used probability distribution
Used by researchers in business, mathematics and the sciences
Many real-life random variables are either normally
distributed or approximately follow the normal
distribution!
The Normal Distribution
Single peak, i.e. unimodal
Bell-shaped
Mode = Median = Mean
Mean in the center
Symmetrical
The two tails never touch the axis, which implies there is
some probability (even though very small) that the
random variable can take on an extremely large or small value.
The area to the left of the mean and the area to the right of
the mean are each equal to 0.50; values to the left of the mean
have negative deviations (negative Z values) and values to the
right have positive deviations (positive Z values).
Many Normal Distributions
Any individual normal distribution can be identified
by its mean, μ and its measure of variation σ (or σ2)
By varying the parameters μ and σ, we obtain many
different normal distributions
The Normal Distribution Shape
Changing μ shifts the distribution left or right.
Changing σ increases or decreases the spread.
[Figure: normal density curves f(X) against X, with location μ and spread σ]
The Normal Distribution
• ‘Bell shaped’
• Symmetrical
• Mean, median and mode are equal
• Location is determined by the mean, μ
• Spread is determined by the standard deviation, σ
• The random variable has an infinite theoretical range: -∞ to +∞
• Total area under the curve = 1
• Many real-life variables follow a normal distribution!
[Figure: normal density f(X), centered at μ = median = mode, with spread σ]
The Standardized
Normal Distribution
• Also known as the “Z” distribution or Standard Normal
• Mean is 0
• Standard Deviation is 1
[Figure: standard normal density f(Z), centered at Z = 0]
Values above the mean have positive Z values; values below the mean have negative Z values.
The Standardized Normal
• Any normal distribution (with any mean and standard
deviation combination) can be transformed into the
standardized (standard) normal distribution (Z)
• Need to transform X units into Z units
• The reason for standardizing is that the X variable may be
measured in any unit (length, rupees, etc.); standardization
removes this issue.
Transformation to the Standard Normal
Distribution
• Translate from X to the standardized normal
(the “Z” distribution) by subtracting the mean
of X and dividing by its standard deviation:
Z = ( X - μ ) / σ
The Z distribution always has mean = 0 and standard
deviation = 1
Example
• If X is distributed normally with mean of 100
and standard deviation of 50, the Z value for X
= 200 is
Z = ( X - μ ) / σ = ( 200 - 100 ) / 50 = 2.0
• This says that X = 200 is two standard
deviations (2 increments of 50 units) above the
mean of 100.
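A tiny Python sketch of this standardization step:

```python
# Standardize X = 200 for a normal variable with mean 100 and sd 50.
mu, sigma = 100, 50

def z_score(x, mu, sigma):
    return (x - mu) / sigma

print(z_score(200, mu, sigma))   # 2.0 -> two standard deviations above the mean
```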
Comparing X and Z units
[Number lines: X = 100 and 200 on the X scale (μ = 100, σ = 50) correspond to Z = 0 and 2.0 on the Z scale (μ = 0, σ = 1)]
Note that the distribution is the same, only the scale
has changed. We can express the problem in original
units (X) or in standardized units (Z)
Finding Probabilities
Probability is the area under the curve!
P( c ≤ X ≤ d ) = ?
[Figure: density curve f(X) with the area between X = c and X = d shaded]
Finding Normal Probabilities
Probability is measured by the area under the curve
P( a ≤ X ≤ b ) = P( a < X < b )
(Note that the probability of any individual value is zero)
[Figure: density curve f(X) with the area between a and b shaded]
Comparing X and Z units and their areas
The area between 100 and 200 for X is equal to the area between 0 and 2.0 for Z.
[Number lines: 100 and 200 on the X scale (μ = 100, σ = 50) correspond to 0 and 2.0 on the Z scale (μ = 0, σ = 1)]
Note that the distribution is the same, only the scale has changed. We can
express the problem in original units (X) or in standardized units (Z).
Probability as
Area Under the Curve
The total area under the curve is 1.0, and the curve is
symmetric, so half is above the mean, half is below
[Figure: symmetric density f(X) with area 0.5 on each side of μ]
P( -∞ < X < μ ) = 0.5          P( μ < X < ∞ ) = 0.5
P( -∞ < X < ∞ ) = 1.0
The Standardized Normal Table
• The Cumulative Standardized Normal table in
the textbook gives the probability less than a
desired value for Z (i.e., from negative infinity to
Z)
Example: P(Z < 2.00) = 0.9772
[Figure: standard normal curve with the area 0.9772 to the left of Z = 2.00 shaded]
The Standardized Normal Table
• The Standardized Normal table in the
textbook gives the probability from 0 to a
desired value for Z (or from 0 to the
negative value of Z)
Example: P(0 < Z < 2.00) = 0.4772
[Figure: standard normal curve with the area 0.4772 between Z = 0 and Z = 2.00 shaded]
The Standardized Normal Table
The row shows the value of Z to the first decimal point; the column gives
the value of Z to the second decimal point.
The value within the table gives the cumulative probability from -∞ up to
the desired Z value.
Z     0.00    0.01    0.02    …
0.0
0.1
 .
 .
2.0   .9772
Example: P(Z < 2.00) = 0.9772 (row 2.0, column 0.00)
Finding Normal Probabilities
• Suppose X is normal with mean 8.0 and standard
deviation 5.0. Find P(X < 8.6)
Z = ( X - μ ) / σ = ( 8.6 - 8.0 ) / 5.0 = 0.12
[Figures: X scale with μ = 8, σ = 5 showing 8 and 8.6; Z scale with μ = 0, σ = 1 showing 0 and 0.12]
P(X < 8.6) = P(Z < 0.12)
Solution: Finding P(Z < 0.12)
Standardized Normal Probability Table (portion, cumulative probabilities)
Z     .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255
P(X < 8.6) = P(Z < 0.12) = .5478
[Figure: standard normal curve with the area .5478 to the left of Z = 0.12 shaded]
Solution: Finding P(Z < 0.12) using the 0-to-Z table
Z     .00     .01     .02
0.0   .0000   .0040   .0080
0.1   .0398   .0438   .0478
0.2   .0793   .0832   .0871
0.3   .1179   .1217   .1255
P(X < 8.6) = P(Z < 0.12) = .5 + .0478 = .5478
[Figure: standard normal curve showing the area .5 below Z = 0 plus .0478 between Z = 0 and Z = 0.12]
Upper Tail Probabilities
• Suppose X is normal with mean 8.0 and
standard deviation 5.0.
• Now Find P(X > 8.6)
[Figure: normal curve for X with mean 8.0; the area to the right of 8.6 is shaded]
Upper Tail Probabilities
• Now Find P(X > 8.6)…
P(X > 8.6) = P(Z > 0.12) = 1.0 - P(Z ≤ 0.12) = 1.0 - 0.5478 = 0.4522
[Figures: standard normal curve with area 0.5478 to the left of Z = 0.12, and the complementary area 1.0 - 0.5478 = 0.4522 to the right]
Probability Between
Two Values
• Suppose X is normal with mean 8.0 and
standard deviation 5.0. Find P(8 < X < 8.6)
Calculate Z values:
Z = ( 8 - 8 ) / 5 = 0
Z = ( 8.6 - 8 ) / 5 = 0.12
P(8 < X < 8.6) = P(0 < Z < 0.12) = P(Z < 0.12) - P(Z < 0) = 0.5478 - 0.5000 = 0.0478
[Figure: X scale showing 8 and 8.6; Z scale showing 0 and 0.12, with the area between them shaded]
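A short Python sketch (assuming SciPy is available) reproducing the three probabilities computed above for X with mean 8.0 and standard deviation 5.0:

```python
# P(X < 8.6), P(X > 8.6) and P(8 < X < 8.6) for X ~ N(8, 5^2).
from scipy.stats import norm

mu, sigma = 8.0, 5.0
p_below   = norm.cdf(8.6, mu, sigma)                              # 0.5478
p_above   = 1 - norm.cdf(8.6, mu, sigma)                          # 0.4522
p_between = norm.cdf(8.6, mu, sigma) - norm.cdf(8.0, mu, sigma)   # 0.0478

print(round(p_below, 4), round(p_above, 4), round(p_between, 4))
```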
More Examples of Normal Distribution
A set of final exam grades was found to be normally distributed
with a mean of 73 and a standard deviation of 8.
What is the probability of getting a grade no higher than 91 on this
exam?
X ~ N(73, 8^2)
P(X ≤ 91) = ?
Mean = 73, standard deviation = 8, X value = 91
Z = ( 91 - 73 ) / 8 = 2.25
P(X ≤ 91) = P(Z ≤ 2.25) = 0.9877756 ≈ 0.9878
[Figure: normal curve with mean 73 and the area to the left of X = 91 (Z = 2.25) shaded]
Finding the X value for a Known Probability
• Steps to find the X value for a known
probability:
1. Find the Z value for the known probability
2. Convert to X units using the formula:
X = μ + Zσ
Finding the X value for a Known Probability
Example:
• Suppose X is normal with mean 8.0 and
standard deviation 5.0.
• Now find the X value so that only 20% of all
values are below this X
[Figure: normal curve with the lower-tail area 0.2000 below an unknown X value (X scale: ? and 8.0; Z scale: ? and 0)]
Find the Z value for 20% in the Lower Tail
1. Find the Z value for the known probability
Z      …   .03     .04     .05
-0.9   …   .1762   .1736   .1711
-0.8   …   .2033   .2005   .1977
-0.7   …   .2327   .2296   .2266
A lower-tail area of 20% (0.2000) is consistent with a Z value of -0.84
[Figure: normal curve with the 0.2000 lower-tail area below X = ? on the X scale, i.e. below Z = -0.84 on the Z scale]
Finding the X value
2. Convert to X units using the formula:
X = μ + Zσ = 8.0 + (-0.84)(5.0) = 3.80
So 20% of the values from a distribution with mean
8.0 and standard deviation 5.0 are less than 3.80
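A short Python sketch (assuming SciPy is available) of this inverse problem; norm.ppf returns the unrounded Z, so X comes out slightly below the table-based 3.80:

```python
# Find the X value with 20% of the distribution below it, for mean 8.0 and sd 5.0.
from scipy.stats import norm

z = norm.ppf(0.20)        # about -0.84
x = 8.0 + z * 5.0         # about 3.79 (the rounded table value -0.84 gives 3.80)
print(round(z, 2), round(x, 2))
```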
Normal Approximation to the
Binomial Distribution
• The binomial distribution is very common but becomes difficult
to use when n grows.
• The binomial distribution is a discrete distribution, but the
normal is continuous
• To use the normal to approximate the binomial, accuracy is
improved if you use a correction for continuity adjustment ( X
– 0.5 and X + 0.5 )
• Example: The Continuity Correction
– X is discrete in a binomial distribution, so P(10 ≤ X ≤ 14) can be
approximated with a continuous normal distribution by finding
P(9.5 < X < 14.5)
Normal Approximation to the Binomial
Distribution
• The closer p is to 0.5, the better the normal approximation to
the binomial
• The larger the sample size n, the better the normal
approximation to the binomial
• General rule:
– The normal distribution can be used to approximate the binomial
distribution if
np ≥ 5 and n(1 – p) ≥ 5, or if n ≥ 30
Normal Approximation to the Binomial
Distribution
• The mean and standard deviation of the
binomial distribution are
μ = np
σ = sqrt[ np(1 - p) ]
• Transform binomial to normal using the formula:
Z = ( X - μ ) / σ = ( X - np ) / sqrt[ np(1 - p) ]
Using the Normal Approximation
to the Binomial Distribution
Example: On a 100 question multiple choice
test with 5 possible answers (a through e), what
would be the probability of guessing 60 or more
correct answers?
P(60 ≤ X) = P(60 ≤ X ≤ 100) ≈ P(59.5 ≤ X ≤ 100.5),
where X is approximately N( 100 × 0.2, 100 × 0.2 × 0.8 ), i.e. mean np = 20 and variance np(1 - p) = 16
Using the Normal Approximation
to the Binomial Distribution
• If n = 1000 and p = 0.2, what is P(X ≤ 180)?
• Approximate P(X ≤ 180) using a continuity correction
adjustment:
P(-0.5 ≤ X ≤ 180.5)
• Transform to standardized normal:
Z = ( X - np ) / sqrt[ np(1 - p) ] = ( 180.5 - (1000)(0.2) ) / sqrt[ (1000)(0.2)(1 - 0.2) ] = -1.54
• So P(X ≤ 180) ≈ P(-15.8 ≤ Z ≤ -1.54) = 0.0618
[Figure: normal curve with mean 200; the area below X = 180.5 (Z = -1.54) is shaded and equals 0.0618]
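A short Python sketch (assuming SciPy is available) comparing the continuity-corrected normal approximation with the exact binomial probability; the approximation comes out near the 0.0618 above (slightly different because the Z value above was rounded to two decimals):

```python
# Normal approximation vs. exact binomial for P(X <= 180), n = 1000, p = 0.2.
from math import sqrt
from scipy.stats import norm, binom

n, p = 1000, 0.2
mu, sigma = n * p, sqrt(n * p * (1 - p))

approx = norm.cdf(180.5, mu, sigma)   # continuity-corrected normal approximation
exact = binom.cdf(180, n, p)          # exact binomial probability, for comparison
print(round(approx, 4), round(exact, 4))
```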
Evaluating Normality
• Not all continuous random variables are
normally distributed
• It is important to evaluate how well the data
set is approximated by a normal distribution
Evaluating Normality
• Construct charts or graphs
– For small- or moderate-sized data sets, do the stem-and-
leaf display and box-and-whisker plot look symmetric?
– For large data sets, does the histogram or polygon
appear bell-shaped?
• Compute descriptive summary measures
– Do the mean, median and mode have similar values?
– Is the interquartile range approximately 1.33 σ?
– Is the range approximately 6 σ?
Assessing Normality
• Observe the distribution of the data set
– Do approximately 2/3 of the observations lie within
mean ± 1 standard deviation?
– Do approximately 80% of the observations lie within
mean ± 1.28 standard deviations?
– Do approximately 95% of the observations lie within
mean ± 2 standard deviations?
• Evaluate normal probability plot
– Is the normal probability plot approximately linear
with positive slope?
The Normal Probability Plot
A normal probability plot for data from a normal
distribution will be approximately linear:
[Figure: normal probability plot, X (30 to 90) against Z (-2 to 2), approximately a straight line with positive slope]
Normal Probability Plot
[Figures: normal probability plots of X against Z for left-skewed, right-skewed and rectangular (uniform) data; each plot is nonlinear]
Nonlinear plots indicate a deviation from normality
The Uniform Distribution
• The uniform distribution is a probability
distribution that has equal probabilities for all
possible outcomes of the random variable
• Also called a rectangular distribution
The Uniform Distribution
The Continuous Uniform Distribution:
f(X) = 1 / (b - a)   if a ≤ X ≤ b
f(X) = 0             otherwise
where
f(X) = value of the density function at any X value
a = minimum value of X
b = maximum value of X
Properties of the
Uniform Distribution
• The mean of a uniform distribution is
μ = ( a + b ) / 2
• The standard deviation is
σ = sqrt[ ( b - a )^2 / 12 ]
Uniform Distribution Example
Example: Uniform probability distribution
over the range 2 ≤ X ≤ 6:
f(X) = 1 / ( 6 - 2 ) = 0.25   for 2 ≤ X ≤ 6
μ = ( a + b ) / 2 = ( 2 + 6 ) / 2 = 4
σ = sqrt[ ( b - a )^2 / 12 ] = sqrt[ ( 6 - 2 )^2 / 12 ] = 1.1547
[Figure: f(X) = 0.25 over the interval from 2 to 6]
Uniform Distribution Example
Example: Using the uniform probability
distribution to find P(3 ≤ X ≤ 5):
P(3 ≤ X ≤ 5) = (Base)(Height) = (2)(0.25) = 0.5
[Figure: f(X) = 0.25 over 2 to 6, with the area between X = 3 and X = 5 shaded]
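A minimal Python sketch reproducing the figures for this uniform example:

```python
# Uniform distribution on [2, 6]: density, mean, sd, and P(3 <= X <= 5).
from math import sqrt

a, b = 2, 6
density = 1 / (b - a)              # 0.25
mu = (a + b) / 2                   # 4.0
sigma = sqrt((b - a) ** 2 / 12)    # 1.1547
p_3_to_5 = (5 - 3) * density       # 0.5

print(density, mu, round(sigma, 4), p_3_to_5)
```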
The Exponential Distribution
• Often used to model the length of time
between two occurrences of an event (the time
between arrivals)
– Examples:
• Time between trucks arriving at an unloading dock
• Time between transactions at an ATM
• Time between phone calls to the main operator
The Exponential Distribution
• Defined by a single parameter, its mean λ (lambda)
• The probability that an arrival time is less than
some specified time X is
P(arrival time < X) = 1 - e^(-λX)
where e = mathematical constant approximated by 2.71828
λ = the population mean number of arrivals per unit
X = any value of the continuous variable, where 0 < X < ∞
Exponential Distribution
Example
Example: Customers arrive at the service counter at the
rate of 15 per hour. What is the probability that the
arrival time between consecutive customers is less than
three minutes?
The mean number of arrivals per hour is 15, so λ = 15
Three minutes is 0.05 hours
P(arrival time < 0.05) = 1 – e^(-λX) = 1 – e^(-(15)(0.05)) = 0.5276
So there is a 52.76% probability that the arrival time
between successive customers is less than three
minutes
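A minimal Python sketch of this calculation:

```python
# Exponential distribution: lambda = 15 arrivals per hour,
# probability the gap between arrivals is under 0.05 hours (3 minutes).
from math import exp

lam, x = 15, 0.05
p = 1 - exp(-lam * x)
print(round(p, 4))   # 0.5276
```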
Choosing an Appropriate distribution
It is very important to identify the theoretical distribution that
actually describes the real situation being modelled.