
Session 3 Distribution

The document discusses probability distributions and provides examples of some common discrete and continuous distributions. It begins by defining what a distribution is and why distributions are useful. It then covers the binomial, hypergeometric, and Poisson distributions in more detail, providing the formulas and characteristics of each. Examples are given to illustrate how to calculate probabilities using each distribution. The key points are that distributions describe the probability of outcomes, that the binomial applies when there is a fixed number of independent yes/no or success/failure trials, and that the Poisson applies when counting events in an area of opportunity.

Distribution

Dr. Rohit Joshi


Postdoc (UCLA, Los Angeles, USA), Ph.D (IIT Delhi), M.Tech
Let us see some situations
Election Contestants
Operations Manager
Marketing Research
A pharmaceutical manufacturer needs to determine whether a new drug is more effective than those currently in use
Effectiveness of a gym or a gym trainer
Performance evaluation and appraisal
Financial evaluation and forecasting
Basic Terms
Random variables
Random variables are of two kinds: discrete random variables and continuous random variables.
What is a distribution?

Describes the ‘shape’ of a batch of numbers

Gives the probability of each outcome occurring


Why Distribution?
• Can serve as a basis for standardized comparison of empirical distributions
• Can help us estimate confidence intervals for inferential statistics
• Form a basis for more advanced statistical methods
• The 'fit' between observed distributions and certain theoretical distributions is an assumption of many statistical procedures
Distributions
Discrete distributions
Binomial Distribution
Negative Binomial Distribution
Geometric Distribution
Hypergeometric Distribution
Poisson Distribution

Continuous Distribution
Normal Distribution
Uniform Distribution
Beta Distribution; t, F, and Chi-square Distributions
Exponential Distribution
An Example
A balanced die is rolled three times. What is the probability that a 5 comes up exactly twice?
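One way to answer the die question before any formula is introduced is to list every outcome and count. The sketch below is an illustration only (the variable names are not from the slides); it enumerates all 6 × 6 × 6 equally likely outcomes of three rolls and counts those containing exactly two 5s.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of rolling a balanced die three times.
rolls = list(product(range(1, 7), repeat=3))

# Outcomes in which the face 5 appears exactly twice.
favourable = [r for r in rolls if r.count(5) == 2]

prob_two_fives = Fraction(len(favourable), len(rolls))
print(len(favourable), "of", len(rolls), "outcomes:", float(prob_two_fives))
```

Counting gives 15 favourable outcomes out of 216, so the probability is 15/216, roughly 0.069.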
Important discrete probability
distribution: The binomial
The Binomial Distribution:
Jakob Bernoulli (1654-1705)
• A fixed number of observations, n
  ex. 15 tosses of a coin; ten light bulbs taken from a warehouse
• Two mutually exclusive and collectively exhaustive categories
  ex. head or tail in each toss of a coin; defective or not defective light bulb; having a boy or girl
  Generally called "success" and "failure"
  Probability of success is p, probability of failure is 1 – p
• Constant probability for each observation
• The outcome of one observation does not affect the outcome of the other
Binomial distribution
Conditions
1. There is a fixed number of trials
2. Each trial must be a Bernoulli trial
3. The outcomes of the trials must be independent
4. The probability of success in each trial must be constant
Binomial distribution
Note the general pattern emerging: if you have only two possible outcomes (call them 1/0 or yes/no or success/failure) in n independent trials, then the probability of exactly X "successes" is

$$P(X) = \binom{n}{X} p^{X} (1-p)^{n-X}$$

where
n = number of trials
X = number of successes out of n trials
p = probability of success
1 – p = probability of failure
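The binomial formula translates directly into a few lines of code. The following is a minimal sketch; the function name `binomial_pmf` is chosen here for illustration. It applies the formula to the earlier die question, with three rolls and "success" meaning a 5.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials,
    each with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Three rolls of a balanced die, success = rolling a 5 (p = 1/6).
p_two_fives = binomial_pmf(2, 3, 1 / 6)

# The probabilities over all possible counts 0..n should add up to 1.
total = sum(binomial_pmf(x, 3, 1 / 6) for x in range(4))
print(round(p_two_fives, 4), round(total, 4))
```

The result agrees with counting outcomes directly: 3 × (1/6)² × (5/6) = 15/216.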
Binomial distribution: example
If I toss a coin 20 times, what's the probability of getting exactly 10 heads?

$$\binom{20}{10} (.5)^{10} (.5)^{10} = .176$$
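Binomial distribution: example
If I toss a coin 20 times, what's the probability of getting 2 or fewer heads?

$$\binom{20}{0} (.5)^{0} (.5)^{20} = \frac{20!}{20!\,0!} (.5)^{20} = 9.5 \times 10^{-7}$$
$$\binom{20}{1} (.5)^{1} (.5)^{19} = \frac{20!}{19!\,1!} (.5)^{20} = 20 \times 9.5 \times 10^{-7} = 1.9 \times 10^{-5}$$
$$\binom{20}{2} (.5)^{2} (.5)^{18} = \frac{20!}{18!\,2!} (.5)^{20} = 190 \times 9.5 \times 10^{-7} = 1.8 \times 10^{-4}$$
$$P(X \le 2) = 9.5 \times 10^{-7} + 1.9 \times 10^{-5} + 1.8 \times 10^{-4} \approx 2.0 \times 10^{-4}$$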
The Binomial Distribution
[Figure: bar charts of the binomial probabilities for X = 0, 1, ..., 5 in five panels: Bin(0.1, 5), Bin(0.3, 5), Bin(0.5, 5), Bin(0.7, 5), Bin(0.9, 5)]
All probability distributions are characterized by an expected value and a variance:
If X follows a binomial distribution with parameters n and p: X ~ Bin(n, p)

Mean: $\mu = E(X) = np$

Variance and standard deviation: $\sigma^2 = np(1-p)$, $\sigma = \sqrt{np(1-p)}$

Where n = sample size
p = probability of success
(1 – p) = probability of failure
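The mean and variance formulas can be compared against the definitions of expected value and variance applied to the probabilities themselves. The sketch below uses n = 5 and p = 0.3 as illustrative values; any valid n and p would do.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 5, 0.3
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Expected value and variance computed from their definitions.
mean = sum(x * px for x, px in enumerate(probs))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(probs))

print(mean, n * p)                  # both close to 1.5
print(variance, n * p * (1 - p))    # both close to 1.05
print(sqrt(variance))               # standard deviation
```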
Applications
A manufacturing plant labels items as either
defective or acceptable
A firm bidding for contracts will either get a
contract or not
A marketing research firm receives survey responses
of “yes I will buy” or “no I will not”
New job applicants either accept the offer or reject it
Your team either wins or loses the football game at
the company picnic
Application
A recent study of how Americans spend their leisure time surveyed workers employed for more than 5 years. It determined the probability that an employee has 2 weeks of vacation time to be 0.45, 1 week to be 0.10, and 3 or more weeks to be 0.20. Suppose 20 workers are selected at random.
• What is the probability that 8 have 2 weeks of vacation time?
• What is the probability that only one worker has 1 week of vacation time?
• What is the probability that at most 2 of the workers have 3 or more weeks of vacation time?
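A possible way to work the three vacation questions is sketched below. It assumes each question is treated as its own binomial problem with n = 20 workers, where "success" is the vacation category named in that question; this is one reading of the exercise, not a solution given in the slides.

```python
from math import comb

def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n = 20

# Exactly 8 workers with 2 weeks of vacation (p = 0.45).
p_eight_two_weeks = binomial_pmf(8, n, 0.45)

# Exactly 1 worker with 1 week of vacation (p = 0.10).
p_one_one_week = binomial_pmf(1, n, 0.10)

# At most 2 workers with 3 or more weeks (p = 0.20): add X = 0, 1, 2.
p_at_most_two = sum(binomial_pmf(x, n, 0.20) for x in range(3))

print(round(p_eight_two_weeks, 4), round(p_one_one_week, 4), round(p_at_most_two, 4))
```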
The Hypergeometric Distribution
The binomial distribution is applicable when selecting from a finite population with replacement or from an infinite population without replacement.

The hypergeometric distribution is applicable when selecting from a finite population without replacement.
The Hypergeometric Distribution

$$P(X) = \frac{\binom{A}{X} \binom{N-A}{n-X}}{\binom{N}{n}}$$

Where
N = population size
A = number of successes in the population
N – A = number of failures in the population
n = sample size
X = number of successes in the sample
n – X = number of failures in the sample
The Hypergeometric Distribution
Example
3 different computers are checked from 10 in the department. 4 of the 10 computers have illegal software loaded. What is the probability that 2 of the 3 selected computers have illegal software loaded?
So, N = 10, n = 3, A = 4, X = 2

$$P(X = 2) = \frac{\binom{A}{X} \binom{N-A}{n-X}}{\binom{N}{n}} = \frac{\binom{4}{2} \binom{6}{1}}{\binom{10}{3}} = \frac{(6)(6)}{120} = 0.3$$

The probability that 2 of the 3 selected computers have illegal software loaded is .30, or 30%.
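The computer example can be reproduced in code. This is a minimal sketch; `hypergeom_pmf` is a name introduced here, and its arguments follow the slide's symbols N, A, n, and X.

```python
from math import comb

def hypergeom_pmf(x, N, A, n):
    """Probability of x successes in a sample of n drawn without
    replacement from a population of N containing A successes."""
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)

# 10 computers, 4 with illegal software, 3 checked, 2 found with it.
p_two = hypergeom_pmf(2, N=10, A=4, n=3)
print(p_two)  # 36 / 120
```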
The Hypergeometric Distribution
Characteristics

• The mean of the hypergeometric distribution is:

$$\mu = E(X) = \frac{nA}{N}$$

• The standard deviation is:

$$\sigma = \sqrt{\frac{nA(N-A)}{N^{2}}} \cdot \sqrt{\frac{N-n}{N-1}}$$

Where $\sqrt{\frac{N-n}{N-1}}$ is called the "Finite Population Correction Factor" from sampling without replacement from a finite population
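As a rough check of the mean and standard deviation formulas, the sketch below computes both from the probabilities of the computer example (N = 10, A = 4, n = 3) and compares them with the closed forms nA/N and the square-root expression including the finite population correction. The helper name is illustrative.

```python
from math import comb, sqrt

def hypergeom_pmf(x, N, A, n):
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)

N, A, n = 10, 4, 3
probs = {x: hypergeom_pmf(x, N, A, n) for x in range(min(n, A) + 1)}

# Mean and standard deviation from the definitions.
mean = sum(x * px for x, px in probs.items())
std = sqrt(sum((x - mean) ** 2 * px for x, px in probs.items()))

# Closed-form values, including the finite population correction factor.
mean_formula = n * A / N
std_formula = sqrt(n * A * (N - A) / N**2) * sqrt((N - n) / (N - 1))

print(mean, mean_formula)
print(std, std_formula)
```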


The Poisson Distribution
Overview
Simeon D. Poisson (1781-1840)
• When there is a large number of trials, but a small probability of success, binomial calculation becomes impractical
  Example: Number of deaths from horse kicks in the Army in different years
• The mean number of successes from n trials is µ = np
  Example: 64 deaths in 20 years from thousands of soldiers
The Poisson Distribution
An area of opportunity is a continuous unit or interval of time, volume, or such area in which more than one occurrence of an event can occur.
  ex. The number of scratches in a car's paint
  ex. The number of mosquito bites on a person
  ex. The number of computer crashes in a day
The Poisson Distribution Properties
Apply the Poisson Distribution when:
• You wish to count the number of times an event occurs in a given area of opportunity
• The probability that an event occurs in one area of opportunity is the same for all areas of opportunity
• The number of events that occur in one area of opportunity is independent of the number of events that occur in the other areas of opportunity
• The probability that two or more events occur in an area of opportunity approaches zero as the area of opportunity becomes smaller
• The average number of events per unit is λ (lambda)
The Poisson Distribution Formula (PMF)

$$P(X) = \frac{e^{-\lambda} \lambda^{X}}{X!}$$

where:
P(X) = the probability of X events in an area of opportunity
λ = expected number of events
e = mathematical constant approximated by 2.71828…
An example
Suppose that, on average, 5 cars enter a parking lot per minute. What is the probability that in a given minute, 7 cars will enter?

$$P(7) = \frac{e^{-\lambda} \lambda^{X}}{X!} = \frac{e^{-5}\, 5^{7}}{7!} = 0.104$$

So, there is a 10.4% chance 7 cars will enter the parking lot in a given minute.

Mean = Variance = λ
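The parking-lot calculation can be reproduced with a short function. This is a sketch; `poisson_pmf` is a name chosen here, with `lam` standing for λ.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Probability of exactly x events when lam events are expected."""
    return exp(-lam) * lam**x / factorial(x)

# On average 5 cars per minute; probability that exactly 7 arrive.
p_seven = poisson_pmf(7, 5)
print(round(p_seven, 3))
```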
An example
• On average, five birds hit the Kutub Minar and are killed each week. Mr. Prashant Mohanto, an official of the National Park Services, has requested the government to allocate funds for equipment to scare birds away from the monument. A government sub-committee has replied that funds cannot be allocated unless the chance of more than three birds being killed in any week exceeds 70 in a hundred. Will the funds be allocated?

$$\lambda = 5, \qquad P(X) = \frac{e^{-\lambda} \lambda^{X}}{X!}$$
$$P(X > 3) = 1 - [P(0) + P(1) + P(2) + P(3)]$$
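One way to finish the bird example is to evaluate the complement numerically. The sketch below assumes weekly bird deaths follow a Poisson distribution with λ = 5, as the slide sets up, and compares the chance of more than three deaths with the 70-in-a-hundred threshold.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

lam = 5

# P(X > 3) = 1 - [P(0) + P(1) + P(2) + P(3)]
p_at_most_3 = sum(poisson_pmf(x, lam) for x in range(4))
p_more_than_3 = 1 - p_at_most_3

funds_allocated = p_more_than_3 > 0.70
print(round(p_more_than_3, 4), funds_allocated)
```

Under these assumptions the probability comes out at roughly 0.735, which is above 0.70, so the committee's condition would be met.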
The Relation Between Binomial and Poisson Distribution

The Binomial Distribution tends towards the Poisson Distribution as n tends to infinity and p tends to zero
The Poisson Distribution with λ = np closely approximates the binomial distribution if n is large and p is small
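The approximation can be seen numerically by holding λ = np fixed and comparing the two distributions. In the sketch below the values n = 1000, p = 0.005 (large n, small p) and n = 10, p = 0.5 (small n, large p) are illustrative choices, both giving λ = 5.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

def max_gap(n, p, upto=10):
    """Largest absolute difference between Bin(n, p) and Poisson(np)
    over the counts 0..upto."""
    lam = n * p
    return max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam))
               for x in range(upto + 1))

gap_large_n = max_gap(1000, 0.005)   # large n, small p
gap_small_n = max_gap(10, 0.5)       # small n, large p
print(f"{gap_large_n:.5f} vs {gap_small_n:.5f}")
```

The gap is far smaller in the first case, which is the situation the slide describes.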
Continuous Probability Distribution
Continuous Distribution
• A continuous random variable is a variable that can assume any value on a continuum (can assume an uncountable number of values)
  thickness of an item
  time required to complete a task
  temperature of a solution
  height
• These can potentially take on any value, depending only on the ability to measure precisely and accurately.
The Normal Distribution Properties
• 'Bell Shaped'
• Symmetrical and asymptotic
• Mean, Median and Mode are equal
• Location is characterized by the mean, μ
• Spread is characterized by the standard deviation, σ
• Area to right and left of mean is 1/2
• The random variable has an infinite theoretical range: −∞ to +∞
[Figure: bell curve f(X) centred at μ with spread σ; Mean = Median = Mode]
The Normal Distribution Density Function
The formula for the normal probability density function is

$$f(X) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{X-\mu}{\sigma}\right)^{2}}$$

Where e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
μ = the population mean
σ = the population standard deviation
X = any value of the continuous variable
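The density function can be written out directly from its definition. The sketch below uses μ = 5 and σ = 10 purely as illustrative parameters, and adds a crude numerical check that the total area under the curve is close to 1.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Normal probability density at x for mean mu and standard deviation sigma."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

mu, sigma = 5, 10

# The curve peaks at the mean.
peak = normal_pdf(mu, mu, sigma)

# Rough midpoint-rule area over mu +/- 8 sigma.
step = 0.01
n_steps = int(16 * sigma / step)
area = sum(normal_pdf(mu - 8 * sigma + (i + 0.5) * step, mu, sigma) * step
           for i in range(n_steps))

print(round(peak, 5), round(area, 6))
```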
Sigma understanding of a normal curve

μ ± 3σ = 99.7 %
What is a Sigma (σ) level?
A metric that indicates how well a process
is performing.
• Higher is better
• Measures the capability of the process to
perform defect-free work
• Also known as “z”, it is based on standard
deviation for continuous data
Finding Probabilities
Probability is the area under the curve!

$$P(c \le X \le d) = ?$$

[Figure: normal curve f(X) with the area between X = c and X = d shaded]
Many Normal Distributions
There are an infinite number of normal distributions

By varying the parameters μ and σ, we obtain different normal distributions
Table Lookup of a Standard Normal Probability

$$P(0 \le Z \le 1) = 0.3413$$

Z      0.00    0.01    0.02
0.00   0.0000  0.0040  0.0080
0.10   0.0398  0.0438  0.0478
0.20   0.0793  0.0832  0.0871
...
1.00   0.3413  0.3438  0.3461
1.10   0.3643  0.3665  0.3686
1.20   0.3849  0.3869  0.3888

[Figure: standard normal curve, Z axis from -3 to 3]
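Entries of a table like this one can be regenerated from the error function. The sketch below relies on the standard identity P(0 ≤ Z ≤ z) = ½·erf(z/√2); the function and variable names are illustrative.

```python
from math import erf, sqrt

def area_zero_to_z(z):
    """Area under the standard normal curve between 0 and z."""
    return 0.5 * erf(z / sqrt(2))

row_starts = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2]
col_offsets = [0.00, 0.01, 0.02]

# Each row holds the areas for z = row start + column offset.
table = {r: [round(area_zero_to_z(r + c), 4) for c in col_offsets]
         for r in row_starts}

for r, values in table.items():
    print(f"{r:.2f}", values)
```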
The Cumulative Standardized Normal Distribution
Cumulative Standardized Normal Distribution Table (Portion), $\mu_Z = 0$, $\sigma_Z = 1$

Z     .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255

For Z = 0.12, the cumulative probability is .5478 (shaded area exaggerated)

Only One Table is Needed
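Standardizing Example

$$Z = \frac{X - \mu}{\sigma} = \frac{6.2 - 5}{10} = 0.12$$

[Figure: normal distribution with μ = 5, σ = 10 and X = 6.2 marked, beside the standardized normal distribution with $\mu_Z = 0$, $\sigma_Z = 1$ and Z = 0.12 marked; shaded area exaggerated]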
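Standardizing is a one-line computation, and the table lookup can be replaced by a library call. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the function name `standardize` is chosen here for illustration.

```python
from statistics import NormalDist

def standardize(x, mu, sigma):
    """Convert a value x from a normal(mu, sigma) scale to a Z value."""
    return (x - mu) / sigma

z = standardize(6.2, mu=5, sigma=10)

# Cumulative probability below z on the standardized normal distribution.
cumulative = NormalDist().cdf(z)
print(round(z, 2), round(cumulative, 4))
```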
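Example:

$$P(2.9 \le X \le 7.1) = .1664$$
$$Z = \frac{X - \mu}{\sigma} = \frac{2.9 - 5}{10} = -.21 \qquad Z = \frac{X - \mu}{\sigma} = \frac{7.1 - 5}{10} = .21$$

[Figure: normal distribution (μ = 5, σ = 10) with the area between X = 2.9 and X = 7.1 shaded, beside the standardized normal distribution ($\mu_Z = 0$, $\sigma_Z = 1$) with the area between Z = -0.21 and Z = 0.21 shaded as two pieces of .0832 each; shaded area exaggerated]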
Example:
P(2.9 ≤ X ≤ 7.1) = .1664 (continued)
Cumulative Standardized Normal Distribution Table (Portion), $\mu_Z = 0$, $\sigma_Z = 1$

Z     .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255

For Z = 0.21, the cumulative probability is .5832 (shaded area exaggerated)
Example:
P(2.9 ≤ X ≤ 7.1) = .1664 (continued)
Cumulative Standardized Normal Distribution Table (Portion), $\mu_Z = 0$, $\sigma_Z = 1$

Z      .00     .01     .02
-0.3   .3821   .3783   .3745
-0.2   .4207   .4168   .4129
-0.1   .4602   .4562   .4522
0.0    .5000   .4960   .4920

For Z = -0.21, the cumulative probability is .4168 (shaded area exaggerated)
So P(2.9 ≤ X ≤ 7.1) = .5832 − .4168 = .1664
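The interval probability from this example can be checked without a table. The sketch below uses `statistics.NormalDist` with μ = 5 and σ = 10 and subtracts the two cumulative probabilities, which is the same step as subtracting the table entries for Z = -0.21 from the one for Z = 0.21.

```python
from statistics import NormalDist

x_dist = NormalDist(mu=5, sigma=10)

# Cumulative probabilities at the two ends of the interval.
below_upper = x_dist.cdf(7.1)   # table value for Z = 0.21 is .5832
below_lower = x_dist.cdf(2.9)   # table value for Z = -0.21 is .4168

p_between = below_upper - below_lower
print(round(below_upper, 4), round(below_lower, 4), round(p_between, 4))
```

The unrounded answer is about .1663; the table gives .1664 because each entry is rounded to four decimal places before subtracting.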
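Example:

$$P(X \ge 8) = .3821$$
$$Z = \frac{X - \mu}{\sigma} = \frac{8 - 5}{10} = .30$$

[Figure: normal distribution (μ = 5, σ = 10) with the area to the right of X = 8 shaded (.3821), beside the standardized normal distribution ($\mu_Z = 0$, $\sigma_Z = 1$) with the area to the right of Z = 0.30 shaded; shaded area exaggerated]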
Example:
P(X ≥ 8) = .3821 (continued)
Cumulative Standardized Normal Distribution Table (Portion), $\mu_Z = 0$, $\sigma_Z = 1$

Z     .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255

For Z = 0.30, the cumulative probability is .6179 (shaded area exaggerated)
So P(X ≥ 8) = 1 − .6179 = .3821
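An upper-tail probability is the complement of the cumulative value. A brief sketch for this example, again using `statistics.NormalDist` with μ = 5 and σ = 10:

```python
from statistics import NormalDist

x_dist = NormalDist(mu=5, sigma=10)

# The table gives the area to the LEFT of X = 8; the upper tail is the rest.
p_below_8 = x_dist.cdf(8)
p_at_least_8 = 1 - p_below_8
print(round(p_below_8, 4), round(p_at_least_8, 4))
```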
Finding Z Values for Known Probabilities
What is Z given probability = 0.6217?
Cumulative Standardized Normal Distribution Table (Portion), $\mu_Z = 0$, $\sigma_Z = 1$

Z     .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255

The cumulative probability .6217 sits in row 0.3, column .01, so Z = .31 (shaded area exaggerated)
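Recovering X Values for Known Probabilities

$$X = \mu + Z\sigma = 5 + (.30)(10) = 8$$

[Figure: normal distribution (μ = 5, σ = 10) with the unknown X marked "?", cumulative area .6179 to its left and .3821 to its right, beside the standardized normal distribution ($\mu_Z = 0$, $\sigma_Z = 1$) with Z = 0.30]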
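Both reverse lookups, from a probability back to Z and from Z back to X, can be sketched with the inverse cumulative function `inv_cdf` of `statistics.NormalDist`. The values 0.6217, 0.6179, μ = 5, and σ = 10 come from the slides; the variable names are illustrative.

```python
from statistics import NormalDist

# From a cumulative probability back to Z (reverse table lookup).
z_for_6217 = NormalDist().inv_cdf(0.6217)

# From Z back to the original scale: X = mu + Z * sigma.
mu, sigma = 5, 10
x_from_z = mu + 0.30 * sigma

# The same X, recovered directly from the cumulative probability .6179.
x_from_prob = NormalDist(mu, sigma).inv_cdf(0.6179)

print(round(z_for_6217, 2), x_from_z, round(x_from_prob, 2))
```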
An Example
We have a training program designed to upgrade the supervisory skills of production line supervisors. Because the program is self-administered, supervisors require different numbers of hours to complete the program. A study of past participation indicates that the mean length of time spent on the program is 500 hours and that this normally distributed random variable has a standard deviation of 100 hours.
Solve: (individual exercise)
What is the probability that a participant selected at random will require
• more than 500 hours to complete the program?
• between 500 and 650 hours to complete the training program?
• more than 700 hours?
• less than 580 hours?
• between 450 and 650 hours?
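For checking answers to the exercise, the five probabilities can be computed as below. This is a sketch that takes the stated model at face value: completion time is normal with mean 500 hours and standard deviation 100 hours.

```python
from statistics import NormalDist

hours = NormalDist(mu=500, sigma=100)

p_more_than_500 = 1 - hours.cdf(500)
p_500_to_650 = hours.cdf(650) - hours.cdf(500)
p_more_than_700 = 1 - hours.cdf(700)
p_less_than_580 = hours.cdf(580)
p_450_to_650 = hours.cdf(650) - hours.cdf(450)

for label, value in [("> 500", p_more_than_500), ("500-650", p_500_to_650),
                     ("> 700", p_more_than_700), ("< 580", p_less_than_580),
                     ("450-650", p_450_to_650)]:
    print(f"{label:>8}: {value:.4f}")
```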
Lowest Stock decision at post office
The manager of a small postal substation is trying to quantify the variation in the weekly demand for mailing envelopes. She has decided to assume that this demand is normally distributed. She knows that on average 100 envelopes are purchased weekly and that 90 percent of the time, weekly demand is below 115. The manager wants to stock enough mailing envelopes each week so that the probability of running out of envelopes is no higher than 5 percent. Can you suggest the lowest such stock level?
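One possible approach to the post office problem is sketched below. It assumes that "90 percent of the time demand is below 115" means P(X < 115) = 0.90, which pins down the standard deviation, and that the stock level should sit at the 95th percentile of demand so that the chance of running out is 5 percent.

```python
from statistics import NormalDist

std_normal = NormalDist()
mean_demand = 100

# Step 1: 115 is the 90th percentile, so 115 = mean + z_90 * sigma.
z_90 = std_normal.inv_cdf(0.90)
sigma = (115 - mean_demand) / z_90

# Step 2: stock at the 95th percentile of weekly demand.
z_95 = std_normal.inv_cdf(0.95)
stock_level = mean_demand + z_95 * sigma

print(round(sigma, 2), round(stock_level, 2))
```

Under these assumptions the standard deviation is about 11.7 envelopes and the stock level about 119.3, so rounding up to 120 envelopes would keep the stock-out chance at or below 5 percent.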
Prediction of number of spectators in a match

Mr. John, the McDonald stand manager for the One day Series at Sri Lanka's cricket stadium, just had two cancellations on his crew. This means that if more than 72,000 people come to watch today's cricket match, the line for hot-dogs will constitute a disgrace to Mr. John and will harm business at future games. Mr. John knows from experience that the number of spectators who come to the game is normally distributed with mean 67,000 and standard deviation 4,000 people. Mr. John has the option to hire two temporary employees, at an additional cost of $200, to ensure that business won't be harmed in the future. If he believes the future harm to business of having more than 72,000 fans at the match would be $5,000, what would you suggest he do?
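A possible way to frame Mr. John's decision is to compare the certain $200 hiring cost with the expected harm of not hiring. The sketch below assumes an expected-cost criterion, which the slide does not state explicitly.

```python
from statistics import NormalDist

attendance = NormalDist(mu=67_000, sigma=4_000)

# Probability that more than 72,000 spectators show up.
p_overflow = 1 - attendance.cdf(72_000)

# Expected harm if no extra staff are hired, versus the fixed hiring cost.
expected_harm = p_overflow * 5_000
hire_cost = 200
should_hire = expected_harm > hire_cost

print(round(p_overflow, 4), round(expected_harm, 2), should_hire)
```

With Z = (72,000 − 67,000) / 4,000 = 1.25, the overflow probability is about 0.106 and the expected harm about $528, which exceeds $200, so hiring looks worthwhile under this criterion.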
Inspection Shop
On the basis of past experience, automobile inspectors at Maruti Udyog Limited in Gurgaon have noticed that 5 percent of the cars coming in for their annual inspection fail to pass. Find the probability that between 7 and 18 of the next 200 cars to enter the inspection shop will fail the inspection.
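The inspection problem is binomial with n = 200 and p = 0.05, and it can also be handled with a normal approximation. The sketch below computes both, assuming "between 7 and 18" includes the end points and using a continuity correction (6.5 to 18.5) for the normal version; these are assumptions, not choices stated on the slide.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 200, 0.05

# Exact binomial probability of 7 through 18 failures inclusive.
exact = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(7, 19))

# Normal approximation with mean np and standard deviation sqrt(np(1-p)).
mu = n * p
sigma = sqrt(n * p * (1 - p))
approx_dist = NormalDist(mu, sigma)
approx = approx_dist.cdf(18.5) - approx_dist.cdf(6.5)

print(round(exact, 4), round(approx, 4))
```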
