Chapter 5

The document explains the binomial probability distribution, which involves a fixed number of trials with binary outcomes and constant probabilities. It provides examples and formulas for calculating probabilities, expected values, variances, and standard deviations for various scenarios, including coin tosses and disease prevalence. Additionally, it introduces the Poisson distribution for modeling rare events and provides examples of its application.

BINOMIAL PROBABILITY DISTRIBUTION

◼ A fixed number of observations (trials), n
    ◼ e.g., 15 tosses of a coin; 20 patients; 1000 people surveyed

◼ A binary outcome
    ◼ e.g., head or tail in each toss of a coin; disease or no disease
    ◼ Generally called “success” and “failure”
    ◼ Probability of success is p, probability of failure is 1 – p

◼ Constant probability for each observation
    ◼ e.g., the probability of getting a tail is the same each time we toss the coin

BINOMIAL DISTRIBUTION – DISCRETE VARIABLES

Example 1

5 coin tosses. What’s the probability that you flip exactly 3 heads in 5 coin tosses?
Solution:
One way to get exactly 3 heads: HHHTT

What’s the probability of this exact arrangement?


P(heads) x P(heads) x P(heads) x P(tails) x P(tails)

= $(1/2)^3 \times (1/2)^2$

Another way to get exactly 3 heads: THHHT

Probability of this exact outcome
= $(1/2)^1 \times (1/2)^3 \times (1/2)^1$
= $(1/2)^3 \times (1/2)^2$

Outcome    Probability
THHHT      $(1/2)^3 \times (1/2)^2$
HHHTT      $(1/2)^3 \times (1/2)^2$
TTHHH      $(1/2)^3 \times (1/2)^2$
HTTHH      $(1/2)^3 \times (1/2)^2$
HHTTH      $(1/2)^3 \times (1/2)^2$
HTHHT      $(1/2)^3 \times (1/2)^2$
THTHH      $(1/2)^3 \times (1/2)^2$
HTHTH      $(1/2)^3 \times (1/2)^2$
HHTHT      $(1/2)^3 \times (1/2)^2$
THHTH      $(1/2)^3 \times (1/2)^2$

The Outcome column lists the $\binom{5}{3}$ ways to arrange 3 heads in 5 trials. The Probability column gives the probability of each unique outcome (note: they are all equal).

Total: 10 arrangements x $(1/2)^3 \times (1/2)^2$

5C3 = 5!/3!2! = 10

Factorial review: n! = n(n-1)(n-2)…


5
P(3 heads and 2 tails) =  3  x P(heads)3 x P(tails)2 =

= 10 x (½)5
=31.25%
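
The counting argument in Example 1 can be checked by brute force. The sketch below is illustrative and not part of the original slides: it lists every sequence of 5 tosses, keeps those with exactly 3 heads, and multiplies the count by the probability of a single sequence. The variable names are chosen for this example only.

```python
from itertools import product
from math import comb

# Enumerate every sequence of 5 tosses; each one has probability (1/2)**5.
outcomes = ["".join(seq) for seq in product("HT", repeat=5)]

# Keep only the sequences that contain exactly 3 heads.
three_heads = [seq for seq in outcomes if seq.count("H") == 3]

count = len(three_heads)       # number of arrangements with exactly 3 heads
prob = count * (1 / 2) ** 5    # every arrangement is equally likely

print(count, comb(5, 3))       # both are 10
print(prob)                    # 0.3125
```

The enumeration agrees with the shortcut 5!/(3!2!), which is why the formula can replace the listing once n gets large.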
BINOMIAL DISTRIBUTION

A general pattern emerges whenever there are only two possible outcomes (1/0, yes/no, or success/failure) in n independent trials.
EXAMPLE 2

Tossing a coin 20 times, what’s the probability of getting exactly 10 heads?

$\binom{20}{10} (0.5)^{10} (0.5)^{10} \approx 0.176$
BINOMIAL DISTRIBUTION – GENERAL FORMULA
Therefore, the probability of X “successes” in n trials can be found using the following formula:

$P(X) = \binom{n}{X} p^X (1-p)^{n-X}$

where
n = number of trials
X = # successes out of n trials
p = probability of success
1 – p = probability of failure
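
As a minimal sketch of the formula, the function below evaluates the binomial probability for given n, X and p. The name `binomial_pmf` is an assumption made for this illustration. It relies only on `math.comb` from the Python standard library (Python 3.8 or later).

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ binomial(n, p): C(n, x) * p**x * (1 - p)**(n - x)."""
    if not 0 <= x <= n:
        return 0.0  # counts outside 0..n are impossible
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Exactly 3 heads in 5 tosses of a fair coin.
print(binomial_pmf(3, 5, 0.5))               # 0.3125
# Exactly 10 heads in 20 tosses of a fair coin.
print(round(binomial_pmf(10, 20, 0.5), 4))   # 0.1762
```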
EXAMPLE 3

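If a coin is tossed 20 times, what’s the probability of getting 2 or fewer heads?

$P(X=0) = \binom{20}{0} (0.5)^0 (0.5)^{20} = \frac{20!}{20!\,0!} (0.5)^{20} = 9.5 \times 10^{-7}$

$P(X=1) = \binom{20}{1} (0.5)^1 (0.5)^{19} = \frac{20!}{19!\,1!} (0.5)^{20} = 20 \times 9.5 \times 10^{-7} = 1.9 \times 10^{-5}$

$P(X=2) = \binom{20}{2} (0.5)^2 (0.5)^{18} = \frac{20!}{18!\,2!} (0.5)^{20} = 190 \times 9.5 \times 10^{-7} = 1.8 \times 10^{-4}$

$P(X \le 2) = 9.5 \times 10^{-7} + 1.9 \times 10^{-5} + 1.8 \times 10^{-4} \approx 2.0 \times 10^{-4}$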
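
The three terms of Example 3 can be summed numerically. This is a sketch for checking the arithmetic, with variable names chosen for the illustration only.

```python
from math import comb

n, p = 20, 0.5

# P(X = x) for x = 0, 1, 2 using the binomial formula.
terms = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(3)]
total = sum(terms)

for x, term in enumerate(terms):
    print(f"P(X={x}) = {term:.2e}")
print(f"P(X<=2) = {total:.2e}")
```

Because p = 0.5, every term shares the factor (0.5)^20, so the total is simply (1 + 20 + 190) x (0.5)^20.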
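EXPECTED VALUE, VARIANCE & STANDARD DEVIATION
These are calculated from the number of trials, the probability of success and the probability of failure.
If X follows a binomial distribution with n trials and success probability p, then:
E(X) = np
Var(X) = np(1 – p)
SD(X) = $\sqrt{np(1-p)}$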
EXAMPLE 4

You are performing a cohort study. If the probability of developing disease in the exposed group is 0.05 for the study duration, and you randomly sample 500 exposed people, how many do you expect to develop the disease?
Give a margin of error (+/- 1 standard deviation) for the estimate.

What’s the probability that at most 10 exposed people develop the disease?
SOLUTION

How many are expected to develop the disease? Give a margin of error (+/- 1 standard deviation) for your estimate.

X ~ binomial (500, 0.05)

E(X) = 500 (0.05) = 25
Var(X) = 500 (0.05) (0.95) = 23.75
StdDev(X) = square root (23.75) = 4.87
25 ± 4.87
What’s the probability that at most 10 exposed subjects develop the disease?

This is asking for a CUMULATIVE PROBABILITY


The probability of 0 getting the disease or 1 or 2 or 3 or 4 or up to 10.

P(X≤10) = P(X=0) + P(X=1) + P(X=2) + P(X=3) + P(X=4)+….+ P(X=10)
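
Adding eleven binomial terms by hand is tedious, so a short script is a natural fit. The sketch below assumes X ~ binomial(500, 0.05) as stated in the solution. It recomputes the mean, variance and standard deviation, then sums P(X=0) through P(X=10). The variable names are illustrative.

```python
from math import comb, sqrt

n, p = 500, 0.05   # X ~ binomial(500, 0.05)

mean = n * p
var = n * p * (1 - p)
sd = sqrt(var)

# Cumulative probability: add up P(X = x) for x = 0, 1, ..., 10.
p_at_most_10 = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(11))

print(f"E(X) = {mean:.2f}, Var(X) = {var:.2f}, SD(X) = {sd:.2f}")
print(f"P(X <= 10) = {p_at_most_10:.6f}")
```

Since 10 lies about three standard deviations below the expected count of 25, the cumulative probability comes out very small.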


EXAMPLE 5

A case-control study of smoking and lung cancer is conducted in a hospital in the South of England. If the probability of being a smoker among lung cancer cases is 0.6, what’s the probability that in a group of 8 cases you have:
i. Fewer than 2 smokers?
ii. More than 5 smokers?
iii. What are the expected value and variance of the number of smokers?
SOLUTION

X ~ binomial (8, 0.6)

P(X<2) = P(X=0) + P(X=1) = 0.00066 + 0.00786 = 0.0085
P(X>5) = P(X=6) + P(X=7) + P(X=8) = 0.2090 + 0.0896 + 0.0168 = 0.3154

E(X) = 8 (0.6) = 4.8
Var(X) = 8 (0.6) (0.4) = 1.92
StdDev(X) = square root (1.92) = 1.39
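
The full distribution for Example 5 has only nine possible values, so it can be tabulated directly. The sketch below is illustrative. It prints each probability with a crude text bar, then adds up the two tails asked for in parts i and ii.

```python
from math import comb, sqrt

n, p = 8, 0.6   # 8 lung cancer cases, P(smoker) = 0.6

# P(X = x) for every possible number of smokers, x = 0..8.
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

for x, prob in enumerate(pmf):
    print(f"{x}  {prob:.5f}  {'#' * round(prob * 100)}")

p_fewer_than_2 = pmf[0] + pmf[1]
p_more_than_5 = pmf[6] + pmf[7] + pmf[8]

print(f"P(X < 2) = {p_fewer_than_2:.4f}")
print(f"P(X > 5) = {p_more_than_5:.4f}")
print(f"E(X) = {n * p:.2f}, Var(X) = {n * p * (1 - p):.2f}, "
      f"SD(X) = {sqrt(n * p * (1 - p)):.2f}")
```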
REVIEW QUESTION 1

In a case-control study of smoking and lung-cancer, 60% of cases are smokers


versus only 10% of controls. What is the odds ratio between smoking and lung
cancer?

a. 2.5
b. 13.5
c. 15.0
d. 6.0
e. 0.05
REVIEW QUESTION 1 - ANSWER

In a case-control study of smoking and lung-cancer, 60% of cases are smokers versus only 10% of controls. What is the odds ratio between smoking and lung cancer?

a. 2.5
b. 13.5
c. 15.0
d. 6.0
e. 0.05

$OR = \frac{0.6/0.4}{0.1/0.9} = \frac{3}{2} \times \frac{9}{1} = \frac{27}{2} = 13.5$, so the answer is b.
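
The odds ratio arithmetic can be written as two small helper functions. These are hypothetical names introduced for this sketch: `odds` converts a probability to odds, and `odds_ratio` divides the odds of smoking among cases by the odds among controls.

```python
def odds(prob: float) -> float:
    """Convert a probability into odds: p / (1 - p)."""
    return prob / (1 - prob)


def odds_ratio(p_exposed_cases: float, p_exposed_controls: float) -> float:
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return odds(p_exposed_cases) / odds(p_exposed_controls)


# 60% of cases smoke versus 10% of controls.
print(round(odds_ratio(0.6, 0.1), 2))   # 13.5
```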
REVIEW QUESTION 2

What’s the probability of getting exactly 5 heads in 10 coin tosses?

a. $\binom{10}{0} (0.50)^5 (0.50)^5$
b. $\binom{10}{5} (0.50)^5 (0.50)^5$
c. $\binom{10}{5} (0.50)^{10} (0.50)^5$
d. $\binom{10}{10} (0.50)^{10} (0.50)^0$
REVIEW QUESTION 3

A coin toss can be thought of as an example of a binomial distribution with N=1


and p=0.5. What are the expected value and variance of a coin toss?

a. 0.5, 0.25
b. 1.0, 1.0
c. 1.5, 0.5
d. 0.25, 0.5
e. 0.5, 0.5
REVIEW QUESTION 5

In a randomized trial with n=150, the goal is to randomize half to


treatment and half to control. The number of people randomized to
treatment is a random variable X. What is the probability
distribution of X?

a. X~Normal(μ=75, σ=10)
b. X~Exponential(μ=75)
c. X~Uniform
d. X~Binomial(N=150, p=0.5)
e. X~Binomial(N=75, p=0.5)
REVIEW QUESTION 6

In the test with n=150, if 69 end up in the treatment group and


81 in the control group, how far off is that from expected?

a. Less than 1 standard deviation


b. 1 standard deviation
c. Between 1 and 2 standard deviations
d. More than 2 standard deviations
PROPORTIONS

◼ The binomial distribution forms the basis of statistics for proportions.
◼ A proportion is just a binomial count divided by n.
    ◼ For example, if we sample 200 cases and find 60 smokers, X=60 but the observed proportion=0.30.
◼ Statistics for proportions are similar to binomial counts, but differ by a factor of n.
PROPORTIONS

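For binomial:
$\mu_x = np$
$\sigma_x^2 = np(1-p)$
$\sigma_x = \sqrt{np(1-p)}$

For proportion (the mean and standard deviation each differ from the binomial versions by a factor of n):
$\mu_{\hat{p}} = p$
$\sigma_{\hat{p}}^2 = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n}$
$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$

P-hat ($\hat{p}$) stands for “sample proportion.”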
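
The relationship between count statistics and proportion statistics can be seen numerically. The sketch below assumes, purely for illustration, that the true proportion p equals 0.30 with n = 200 cases. It computes the mean and standard deviation of the count X, then divides each by n to get the values for the sample proportion.

```python
from math import sqrt

n, p = 200, 0.30   # assumption for illustration: true proportion 0.30, 200 cases

# Binomial count X
mean_x = n * p
sd_x = sqrt(n * p * (1 - p))

# Sample proportion p-hat = X / n: divide the count statistics by n.
mean_phat = mean_x / n
sd_phat = sd_x / n

print(f"count:      mean = {mean_x:.1f}, SD = {sd_x:.2f}")
print(f"proportion: mean = {mean_phat:.2f}, SD = {sd_phat:.4f}")
```

Dividing the count's standard deviation by n gives the same number as the formula for the standard deviation of the sample proportion.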
POISSON DISTRIBUTION

The Poisson distribution is used to model the number of events occurring within a given time interval. The formula for the Poisson probability mass function is

$p(x) = \frac{e^{-\lambda} \lambda^x}{x!}$

λ is the shape parameter, which indicates the average number of events in the given time interval.
POISSON DISTRIBUTION

Some events are rather rare - they don't happen that often. For instance, car
accidents are the exception rather than the rule. Still, over a period of time,
we can say something about the nature of rare events.

An example is the improvement of traffic safety, where the government wants to know whether seat belts reduce the number of deaths in car accidents. Here, the Poisson distribution can be a useful tool to answer questions about the benefits of seat belt use.
POISSON DISTRIBUTION

Other phenomena that often follow a Poisson distribution are deaths of infants, the number of misprints in a book, the number of customers arriving, and the number of activations of a Geiger counter.

The distribution was derived by the French mathematician Siméon Poisson in 1837, and the first application was the description of the number of deaths by horse kicking in the Prussian army.
POISSON DISTRIBUTION

Example 6
Arrivals at a bus-stop follow a Poisson distribution with an average
of 4.5 every quarter of an hour.
Obtain a bar plot of the distribution (assume a maximum of 20
arrivals in a quarter of an hour) and calculate the probability of
fewer than 3 arrivals in a quarter of an hour.
Solution
The probabilities of 0 up to 2 arrivals can be calculated directly from the formula

$p(x) = \frac{e^{-\lambda} \lambda^x}{x!}$ with λ = 4.5

$p(0) = \frac{e^{-4.5} \, 4.5^0}{0!}$, so p(0) = 0.01111
Similarly p(1) = 0.04999 and p(2) = 0.11248

So the probability of fewer than 3 arrivals is 0.01111 + 0.04999 + 0.11248 = 0.17358
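
Example 6 also asks for a bar plot. As a rough sketch using only the standard library, the code below prints a text version of the plot for 0 to 20 arrivals and recomputes the probability of fewer than 3 arrivals. The function name `poisson_pmf` and the bar scaling are choices made for this illustration. A plotting library could draw the same values as a proper chart.

```python
from math import exp, factorial

lam = 4.5   # average number of arrivals per quarter of an hour


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)


# Probabilities for 0..20 arrivals, with a text bar for each value.
pmf = [poisson_pmf(x, lam) for x in range(21)]
for x, prob in enumerate(pmf):
    print(f"{x:2d}  {prob:.5f}  {'#' * round(prob * 200)}")

p_fewer_than_3 = sum(pmf[:3])   # P(X = 0) + P(X = 1) + P(X = 2)
print(f"P(X < 3) = {p_fewer_than_3:.5f}")
```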
POISSON DISTRIBUTION
PROPERTIES OF POISSON

The mean and variance are both equal to λ.


The sum of independent Poisson variables is a further Poisson variable with
mean equal to the sum of the individual means.
As well as cropping up in the situations already mentioned, the Poisson
distribution provides an approximation for the Binomial distribution.
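
The approximation works best when n is large and p is small, with λ set to np. The sketch below uses hypothetical values (n = 1000, p = 0.003) chosen only to illustrate the point. It compares the two probability functions side by side for small counts.

```python
from math import comb, exp, factorial

n, p = 1000, 0.003   # hypothetical values: many trials, rare event
lam = n * p          # matching Poisson mean


def binomial_pmf(x: int, n: int, p: float) -> float:
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x: int, lam: float) -> float:
    return exp(-lam) * lam**x / factorial(x)


for x in range(11):
    print(f"{x:2d}  binomial = {binomial_pmf(x, n, p):.5f}  "
          f"Poisson = {poisson_pmf(x, lam):.5f}")

# Largest gap between the two probability functions over x = 0..10.
max_diff = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(11))
print(f"largest difference: {max_diff:.6f}")
```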
