Introduction to Probability
1
Probability
Probability is the chance of observing a particular outcome, or the likelihood of
observing an event.
It assumes a “random” process, i.e. the outcome is not predetermined: there
is an element of chance
Probability theory developed from studying games of chance like dice and
cards.
Processes such as flipping a coin, rolling a die, or drawing a card from a deck are
probability experiments.
2
Common terms in probability
Experiment = any process with an uncertain outcome
Random experiment: an action or process that leads to one of several
possible outcomes whose results depend on chance, that is, the result cannot be
predicted with certainty.
An outcome is a specific result of a single trial of a probability experiment.
Event = something that may happen when the experiment is performed.
Events are represented by uppercase letters such as A, B, and C
Sample space = set of all possible outcomes of an experiment
3
We talk about probability when dealing with a process that has an
uncertain outcome
When an experiment is performed, one and only one outcome is
obtained
An event either occurs or it does not occur
4
Why Probability in Statistics and Medicine?
There is uncertainty and variation in scientific data.
Results are not certain
To evaluate how accurate our results are:
Given how our data were collected, are our results accurate?
Given the level of accuracy needed, how many observations need to
be collected?
Because medicine is an inexact science, physicians seldom
predict an outcome with absolute certainty.
5
Probability of an Event E
A number between 0 and 1 representing the proportion of
times that event E is expected to happen when the
experiment is done repeatedly under the same conditions
Any event can be expressed as a subset of the set of all
possible outcomes (S)
P(S) = 1
6
An understanding of probability is fundamental for quantifying the
uncertainty that is inherent in the decision-making process
Probability theory is a foundation for statistical inference and
allows us to draw conclusions about a population of patients based on
information obtained from a sample of patients drawn from that
population.
7
Probability concepts are used to understand:
Probability distributions: Binomial, Poisson, and Normal
distributions
Sampling and sampling distributions
Estimation
Hypothesis testing
Advanced statistical analysis
8
Two Categories of Probability
Objective and Subjective Probabilities.
Objective probability
1) Classical probability and
2) Relative frequency probability.
Classical Probability
Definition: If an event can occur in N mutually exclusive and equally
likely ways, and if m of these possess a characteristic E, the probability of
the occurrence of E is m/N.
P(E) = m/N
9
E.g. rolling a die
There are 6 possible outcomes:
Possible outcomes = {1, 2, 3, 4, 5, 6}.
Each is equally likely
P(i) = 1/6, i = 1, 2, ..., 6: P(1) = 1/6, P(2) = 1/6, ..., P(6) = 1/6, SUM = 1
If we toss a die, what is the probability of 4 coming up?
m = 1 (the outcome 4) and N = 6, so the probability of 4 coming up is 1/6.
Another “equally likely” setting is the tossing of a coin
There are 2 possible outcomes in the set of all possible outcomes {H, T}.
P(H) = 0.5, P(T) = 0.5, SUM = 1
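The classical definition P(E) = m/N can be sketched in a few lines of Python. This is only an illustration: the helper name `classical_probability` is hypothetical, and it assumes every outcome in the sample space is equally likely.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """Return m/N, assuming all outcomes in sample_space are equally likely."""
    favourable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favourable), len(sample_space))

die = {1, 2, 3, 4, 5, 6}
coin = {"H", "T"}

p_four = classical_probability({4}, die)       # m = 1, N = 6
p_head = classical_probability({"H"}, coin)    # m = 1, N = 2
# The probabilities of all six faces should add up to 1.
total = sum(classical_probability({face}, die) for face in die)

print(p_four, p_head, total)
```

Exact fractions are used instead of floats so that the sum over all outcomes is exactly 1.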
10
Relative Frequency Probability
Based on the long-run behaviour of a process:
the proportion of times the event A occurs in many trials repeated under
essentially identical conditions
Definition: If a process is repeated a large number of times (n), and if an event
with the characteristic E occurs m times, the relative frequency of E, m/n, is
approximately equal to the probability of E: P(E) = m/n.
11
If you toss a coin 100 times and heads comes up 40 times,
P(H) = 40/100 = 0.4.
If we toss a coin 10,000 times and heads comes up 5562 times,
P(H) = 5562/10,000 = 0.5562.
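A small simulation can illustrate the relative frequency idea. The sketch below assumes a fair coin (true P(H) = 0.5) and a fixed random seed; the function name `relative_frequency_of_heads` is hypothetical and the exact proportions depend on the seed.

```python
import random

def relative_frequency_of_heads(n_tosses, seed=0):
    """Simulate n_tosses of a fair coin and return m/n, the proportion of heads."""
    rng = random.Random(seed)          # seeded so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# The proportion of heads tends to settle near 0.5 as n grows.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

With few tosses the relative frequency can be noticeably off; with many tosses it is usually much closer to the underlying probability.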
12
Example: Of 158 people who attended a dinner party, 99 were ill.
P(Illness) = 99/158 = 0.63 = 63%.
In 1998, there were 2,500,000 registered live births; of these, 200,000
were low birth weight (LBW) infants.
Therefore, the probability that a newborn is LBW is estimated by
P(LBW) = 200,000/2,500,000 = 0.08
13
Subjective Probability
Personalistic (represents one’s degree of belief in the occurrence of
an event).
Personal assessment of which treatment is more effective at providing a cure:
traditional or modern
Personal assessment of which sports team will win a match.
Also uses classical and relative frequency methods to assess the
likelihood of an event.
14
E.g., If someone says that he is 95% certain that a cure for AIDS will be
discovered within 5 years, then he means that:
P(discovery of cure for AIDS within 5 years) = 95% = 0.95
Although the subjective view of probability has enjoyed increased
attention over the years, it has not been fully accepted by scientists.
15
Mutually Exclusive Events
Two events A and B are mutually exclusive if they cannot both happen
at the same time:
P (A ∩ B) = 0
Example:
A coin toss cannot produce heads and tails simultaneously.
The weight of an individual cannot be classified simultaneously as
“underweight”, “normal”, and “overweight”
16
Independent Events
Two events A and B are independent if the probability of the first one
happening is the same no matter how the second one turns out.
The outcome of one event does not affect the occurrence or non-occurrence of
the other.
P(A∩B) = P(A) x P(B) (Independent events)
P(A∩B) ≠ P(A) x P(B) (Dependent events)
Example:
The outcomes of the first and second coin tosses are independent
17
Intersection and union
The intersection of two events A and B, A ∩ B, is the event that A and
B happen simultaneously
P ( A and B ) = P (A ∩ B )
Let A represent the event that a randomly selected newborn is LBW,
and B the event that he or she is from a multiple birth
The intersection of A and B is the event that the infant is both LBW
and from a multiple birth
18
The union of A and B, A ∪ B, is the event that either A happens or B
happens or they both happen simultaneously
P(A or B) = P(A ∪ B)
In the example above, the union of A and B is the event that the
newborn is either LBW or from a multiple birth, or both
19
Properties of Probability
1. The numerical value of a probability always lies between 0 and 1,
inclusive.
0 ≤ P(E) ≤ 1
A value of 0 means the event cannot occur (impossible event)
A value of 1 means the event will definitely occur (sure event)
A value of 0.5 means that the probability that the event will occur
is the same as the probability that it will not occur.
20
2. The sum of the probabilities of all mutually exclusive outcomes is
equal to 1.
P(E1) + P(E2 ) + .... + P(En ) = 1.
3. For two mutually exclusive events A and B,
P(A or B) = P(A ∪ B) = P(A) + P(B).
If not mutually exclusive:
P(A or B) = P(A) + P(B) - P(A and B)
21
4. The complement of an event A, denoted by Ā or A^c, is the event
that A does not occur
Consists of all the outcomes in which event A does NOT occur
P(Ā) = P(not A) = 1 – P(A)
Ā occurs only when A does not occur.
These are complementary events.
22
In the example, the complement of A is the event that a newborn is
not LBW
In other words, Ā is the event that the child weighs ≥ 2500 grams at
birth
P(Ā) = 1 − P(A)
P(not low bwt) = 1 − P(low bwt)
= 1− 0.076
= 0.924
23
Basic Probability Rules
1. Addition rule
If events A and B are mutually exclusive:
P(A or B) = P(A) + P(B), since P(A ∩ B) = 0
More generally:
P(A or B) = P(A) + P(B) - P(A and B)
where P(A or B) is the probability that event A or event B occurs, or that they both occur
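The addition rule can be checked by direct counting on a small sample space. The sketch below uses one roll of a fair die; the events `even`, `greater_than_3`, and `one` are hypothetical choices made only for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # one roll of a fair die

def prob(event):
    """Classical probability of an event (a subset of the sample space)."""
    return Fraction(len(event & sample_space), len(sample_space))

even = {2, 4, 6}
greater_than_3 = {4, 5, 6}
one = {1}

# General rule: the overlap {4, 6} is counted twice, so it is subtracted once.
general = prob(even) + prob(greater_than_3) - prob(even & greater_than_3)

# Mutually exclusive events: the intersection is empty, so nothing is subtracted.
exclusive = prob(even) + prob(one)

print(general, prob(even | greater_than_3))
print(exclusive, prob(even | one))
```

In both cases the value from the rule matches the probability of the union computed by counting outcomes.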
24
Example: The probabilities below represent years of
schooling completed by mothers of newborn infants.
25
What is the probability that a mother has completed < 12 years of
schooling?
P(≤ 8 years) = 0.056 and
P(9-11 years) = 0.159
Since these two events are mutually exclusive,
P(≤ 8 or 9-11) = P(≤ 8 ∪ 9-11)
= P(≤ 8) + P(9-11)
= 0.056 + 0.159
= 0.215
26
What is the probability that a mother has completed 12 or more years of
schooling?
P(≥ 12) = P(12 or 13-15 or ≥ 16)
= P(12 ∪ 13-15 ∪ ≥ 16)
= P(12) + P(13-15) + P(≥ 16)
= 0.321 + 0.218 + 0.230
= 0.769
27
If A and B are not mutually exclusive events,
then subtract the overlap:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
28
2. Multiplication rule
If A and B are independent events, then
P(A ∩ B) = P(A) × P(B)
More generally,
P(A ∩ B) = P(A) P(B|A) = P(B) P(A|B)
P(A and B) denotes the probability that A and B both occur at
the same time.
29
Conditional Probability
Refers to the probability of an event, given that another event is
known to have occurred.
“What happened first is assumed”.
Hint - When thinking about conditional probabilities, think in stages.
Think of the two events A and B occurring chronologically, one after
the other, either in time or space.
30
The conditional probability that event B occurs, given that
event A has already occurred, is denoted P(B|A) and is defined as
P(B|A) = P(A ∩ B) / P(A),
provided that P(A) ≠ 0.
31
Example:
A study investigating the effect of prolonged exposure to
bright light on retina damage in premature infants.
                 Retinopathy   Retinopathy
                 YES           NO            TOTAL
Bright light     18            3             21
Reduced light    21            18            39
TOTAL            39            21            60
32
The probability of developing retinopathy is:
P(Retinopathy) = No. of infants with retinopathy / Total No. of infants
= (18+21)/(21+39)
= 39/60
= 0.65
33
We want to compare the probability of retinopathy given that the
infant was exposed to bright light with the probability given that the
infant was exposed to reduced light.
Exposure to bright light and exposure to reduced light are
conditioning events, events we want to take into account when
calculating conditional probabilities.
34
The conditional probability of retinopathy, given exposure to bright
light, is:
P(Retinopathy | exposure to bright light)
= No. of infants with retinopathy exposed to bright light / No. of infants exposed to bright light
= 18/21 = 0.86
35
P(Retinopathy | exposure to reduced light)
= No. of infants with retinopathy exposed to reduced light / No. of infants exposed to reduced light
= 21/39 = 0.54
The conditional probabilities suggest that premature infants exposed
to bright light have a higher risk of retinopathy than premature infants
exposed to reduced light.
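The retinopathy calculations can be reproduced from the 2×2 table of counts. The following is a minimal sketch; the dictionary layout and the function names are assumptions made for illustration, while the counts are those given in the table.

```python
# Counts from the table: (light exposure, retinopathy status) -> number of infants
counts = {
    ("bright", "yes"): 18, ("bright", "no"): 3,
    ("reduced", "yes"): 21, ("reduced", "no"): 18,
}
total = sum(counts.values())   # 60 infants

def p_retinopathy():
    """Unconditional probability of retinopathy."""
    return (counts[("bright", "yes")] + counts[("reduced", "yes")]) / total

def p_retinopathy_given(exposure):
    """P(retinopathy | exposure) = P(retinopathy and exposure) / P(exposure)."""
    p_joint = counts[(exposure, "yes")] / total
    p_exposure = (counts[(exposure, "yes")] + counts[(exposure, "no")]) / total
    return p_joint / p_exposure

p_overall = p_retinopathy()
p_bright = p_retinopathy_given("bright")
p_reduced = p_retinopathy_given("reduced")
print(round(p_overall, 2), round(p_bright, 2), round(p_reduced, 2))
```

The conditional probabilities are written as joint probability divided by the probability of the conditioning event, which gives the same result as restricting the count to the relevant row of the table.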
36
For independent events A and B
P(A|B) = P(A).
For any events A and B, independent or not,
P(A and B) = P(A|B) P(B)
(General Multiplication Rule)
37
Test for Independence
Two events A and B are independent if:
P(B|A) = P(B)
or
P(A and B) = P(A) • P(B)
Two events A and B are dependent if:
P(B|A) ≠ P(B)
or
P(A and B) ≠ P(A) • P(B)
38
Example
In a study of optic-nerve degeneration in Alzheimer’s disease,
postmortem examinations were conducted on 10 Alzheimer’s
patients.
The following table shows the distribution of these patients according
to sex and evidence of optic-nerve degeneration.
Are the events “patient has optic-nerve degeneration” and “patient is
female” independent for this sample of 10 patients?
39
              Optic-nerve Degeneration
Sex           Present    Not Present
Female        4          1
Male          4          1
40
Solution
P(Optic-nerve degeneration | Female)
= No. of females with optic-nerve degeneration / No. of females
= 4/5 = 0.80
P(Optic-nerve degeneration)
= No. of patients with optic-nerve degeneration / Total No. of patients
= 8/10 = 0.80
The events are independent for this sample.
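Both forms of the independence test give the same answer for this sample, which can be verified with exact fractions. The variable names below are hypothetical; the counts are those in the table.

```python
from fractions import Fraction

# Counts from the table: (sex, degeneration status) -> number of patients
table = {
    ("female", "present"): 4, ("female", "absent"): 1,
    ("male", "present"): 4, ("male", "absent"): 1,
}
total = sum(table.values())   # 10 patients

n_female = table[("female", "present")] + table[("female", "absent")]
n_degeneration = table[("female", "present")] + table[("male", "present")]

p_female = Fraction(n_female, total)
p_degeneration = Fraction(n_degeneration, total)
p_both = Fraction(table[("female", "present")], total)
p_degeneration_given_female = Fraction(table[("female", "present")], n_female)

# Test 1: P(B|A) = P(B).   Test 2: P(A and B) = P(A) * P(B).
independent_by_conditional = p_degeneration_given_female == p_degeneration
independent_by_product = p_both == p_female * p_degeneration
print(independent_by_conditional, independent_by_product)
```

Using `Fraction` avoids floating-point rounding, so the equality checks are exact.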
41
Exercise:
Culture and Gonodectin (GD) test results for 240 urethral
discharge specimens
                  Culture Result
GD Test Result    Gonorrhea    No Gonorrhea    Total
Positive          175          9               184
Negative          8            48              56
Total             183          57              240
42
1. What is the probability that a man has gonorrhea?
2. What is the probability that a man has a positive GD test?
3. What is the probability that a man has a positive GD test and
gonorrhea?
4. What is the probability that a man has a negative GD test and does
not have gonorrhea?
5. What is the probability that a man has a positive GD test given he
has gonorrhea?
43
6. What is the probability that a man has a negative GD test given he
does not have gonorrhea?
7. What is the probability that a man has a positive GD test given he
does not have gonorrhea?
8. What is the probability that a man has gonorrhea given he has a
positive GD test?
44
Probability Distribution
A probability distribution describes the way data are distributed and is used
to draw conclusions about a set of data.
It tells us how total probability 1 is distributed among the various values
that the random variable can take.
A probability distribution of a random variable can be displayed by a table, a
graph, or a mathematical formula.
Random Variable = Any quantity or characteristic that can assume several
different values such that any particular outcome is determined by chance.
45
Random variables: can be either discrete or continuous.
A discrete random variable is able to assume only a finite or countable
number of outcomes
A continuous random variable can take on any value in a specified
interval
The probability distribution can be displayed in the form of a table
giving the values and their associated probabilities and/or it can be
expressed as a mathematical formula giving the probability of all possible
values.
46
Common Probability Distributions
1. Binomial distribution
Consider a dichotomous variable (a nominal variable with only two
possible values).
The two mutually exclusive outcomes are referred to as “failure” and
“success”.
E.g. Let X represent smoking status: X = 1 for a smoker and X = 0 for a non-smoker.
The two outcomes are mutually exclusive.
E.g. in 1987, 29% of the adults in the USA were smokers; therefore
Pr(X = 1) = 0.29 and Pr(X = 0) = 1 − 0.29 = 0.71.
47
Binomial distribution…
In general, in a binomial distribution:
There is a fixed number, n, of trials, each of which results in one of
two mutually exclusive outcomes.
The outcomes of n trials are independent.
The probability of “success” is constant for each trial
Pr (X=success) = Pr (X=1) = p , Pr (X=failure) = Pr (X=0) = 1-p
48
If an experiment is repeated n times, the probability Pr(X = x) that
the outcome of interest occurs exactly x times is
Pr(X = x) = [n! / (x! (n − x)!)] p^x (1 − p)^(n − x)
n (the number of trials) and p (the probability of success on each trial) are
the parameters of the binomial distribution.
x is the number of successes, and n!, read as “n factorial”,
is the product of all integers from 1 to n inclusive.
By definition, 0! = 1! = 1.
49
Binomial distribution….
In addition to the probabilities of individual outcomes, we can also
compute the numerical summary measures associated with a
probability distribution.
The mean value for a binomial distribution, or the average number of
successes in repeated samples of n, is equal to n × p, and the standard
deviation is S = √(np(1 − p))
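The binomial formula and its summary measures can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, p = 0.29 is taken from the smoking example, and n = 10 is an assumed sample size.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """Pr(X = x) = [n! / (x!(n - x)!)] * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_mean_sd(n, p):
    """Closed-form mean n*p and standard deviation sqrt(n*p*(1 - p))."""
    return n * p, sqrt(n * p * (1 - p))

n, p = 10, 0.29   # n is an assumed value chosen for illustration
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Mean and SD computed directly from the distribution, for comparison.
mean_from_pmf = sum(x * px for x, px in enumerate(pmf))
var_from_pmf = sum((x - mean_from_pmf) ** 2 * px for x, px in enumerate(pmf))
mean_formula, sd_formula = binomial_mean_sd(n, p)

print(sum(pmf), mean_from_pmf, mean_formula)
print(sqrt(var_from_pmf), sd_formula)
```

The probabilities over x = 0, ..., n add up to 1, and the mean and standard deviation computed from the full distribution agree with n × p and √(np(1 − p)).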
50
Binomial distribution….
Suppose that in a certain population 52% of all recorded births are
males. If we randomly select 10 birth records, what is the probability
that:
A. Exactly 5 will be males? n = 10, x = 5, p = 0.52
Pr(X = x) = [n! / (x! (n − x)!)] p^x (1 − p)^(n − x)
Pr(X = 5) = [10! / (5! (10 − 5)!)] × 0.52^5 × (1 − 0.52)^(10 − 5) = 0.24
B. Less than 3 will be females? Here X counts females, so p = 1 − 0.52 = 0.48
Pr(X < 3) = Pr(X = 0) + Pr(X = 1) + Pr(X = 2)
= 0.001 + 0.013 + 0.055 = 0.069
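The two answers in this example can be recomputed as a check. This is a sketch with a hypothetical helper `binomial_pmf`; part B assumes X counts female births, so the success probability is 1 − 0.52 = 0.48.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n = 10

# A. Exactly 5 males, with P(male) = 0.52
p_five_males = binomial_pmf(5, n, 0.52)

# B. Fewer than 3 females, with P(female) = 0.48
p_fewer_than_3_females = sum(binomial_pmf(x, n, 0.48) for x in range(3))

print(round(p_five_males, 2))
print(round(p_fewer_than_3_females, 3))
```

Summing the unrounded terms in part B gives about 0.070; the value 0.069 arises from adding the three probabilities after each has been rounded to three decimal places.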
51
2. Normal Distributions
The normal distribution (ND) is the most important probability distribution in statistics
Frequently called the “Gaussian distribution” or bell-shaped curve.
Variables such as blood pressure, weight, height, and serum cholesterol
level are approximately normally distributed
The ND is vital to statistical work: most estimation procedures and
hypothesis tests are based on it.
52
Properties of the Normal Distribution
1. It is symmetrical about its mean, μ.
2. The mean, the median and the mode are equal, and it is unimodal.
3. The total area under the curve above the x-axis is 1 square unit.
4. The curve never touches the x-axis.
5. As the value of σ increases, the curve becomes more and more flat.
53
6. The distribution is completely determined by the parameters
μ and σ.
7. μ ± 1 SD contains about 68%;
μ ± 2 SD contains about 95%;
μ ± 3 SD contains about 99.7%
of the area under the curve.
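The 68%, 95%, and 99.7% figures can be verified with the error function from Python's standard library, using the identity P(|Z| < k) = erf(k/√2) for a standard normal Z. The function name `area_within` is hypothetical.

```python
from math import erf, sqrt

def area_within(k):
    """Area under a normal curve within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```

Because the result depends only on the number of standard deviations k, the same percentages hold for any normal distribution, whatever its mean and standard deviation.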
54
Standard Normal Distribution
It is a normal distribution that has a mean equal to 0 and a
SD equal to 1, and is denoted by N(0, 1).
The main idea is to standardize all the data that is given by
using Z-scores.
These Z-scores can then be used to find the area (and thus
the probability) under the normal curve.
The standard normal distribution has mean 0 and variance 1
55
Z - Transformation
If a random variable X ~ N(μ, σ), then we can transform it to the standard
normal distribution (SND) with the help of the Z-transformation:
Z = (X − μ) / σ
Z represents the Z-score for a given x value.
It tells us how many SDs a value lies from the mean of the normal distribution.
This process is known as standardization and gives the position on a
normal curve with μ=0 and σ=1, i.e., the SND, Z.
A Z-score is the number of standard deviations that a given x value is
above or below the mean.
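Standardization is a one-line computation, z = (x − μ)/σ. The sketch below uses a hypothetical function `z_score`, and the numbers passed to it are illustrative values only.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Illustrative values: a distribution with mean 80 and standard deviation 12
print(z_score(95, 80, 12))   # a value above the mean gives a positive z
print(z_score(80, 80, 12))   # the mean itself has z = 0
print(z_score(62, 80, 12))   # a value below the mean gives a negative z
```

The sign of the Z-score shows the side of the mean, and its size shows the distance in standard deviation units.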
56
Finding normal curve areas
1. The table gives areas between -∞ and the value of z.
2. Find the z value in tenths in the column at the left margin and locate its row.
Find the hundredth place in the appropriate column.
3. Read the value of the area (P) from the body of the table where the row and
column intersect.
Values of P are in the form of a decimal point and four places.
Following the model of the ND, a given value of x must be converted to a z
score before it can be looked up in the z table.
57
Some Useful Tips
Only a single curve for which μ = 0 and σ = 1 is tabulated.
58
59
a) What is the probability that z < -1.96?
(1) Sketch a normal curve
(2) Draw a perpendicular line for z = -1.96
(3) Find the area in the table
(4) The answer is the area to the left of the line P(z < -1.96) = 0.0250
60
b) What is the probability that -1.96 < z < 1.96?
The area between the values P(-1.96 < z < 1.96)
= .9750 - .0250 = .9500
61
c) What is the probability that z > 1.96?
The answer is the area to the right of the line, found by subtracting the table value
from 1.0000: P(z > 1.96) = 1.0000 - .9750 = .0250
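The three table look-ups for z = ±1.96 can be reproduced with `statistics.NormalDist` from the Python standard library (available from Python 3.8), whose `cdf` method returns the area to the left of a value. This is a sketch for checking the table values, not a replacement for learning to read the table.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal: mean 0, standard deviation 1

left = Z.cdf(-1.96)                     # a) P(z < -1.96)
between = Z.cdf(1.96) - Z.cdf(-1.96)    # b) P(-1.96 < z < 1.96)
right = 1 - Z.cdf(1.96)                 # c) P(z > 1.96)

print(round(left, 4), round(between, 4), round(right, 4))
```

Areas to the right are obtained by subtracting from 1, and areas between two values by subtracting the smaller left-hand area from the larger one, exactly as with the printed table.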
62
Exercise
1. Compute P(-1 ≤ Z ≤ 1.5)
2. Find the area under the SND from 0 to 1.45 (answer: 0.4265)
3. Compute P(-1.66 < Z < 2.85)
63
Example on z-transformation
The diastolic blood pressures (DBP) of males 35–44 years of age are normally
distributed with µ = 80 mm Hg and σ² = 144 mm Hg². Individuals with
a DBP above 95 mm Hg are considered to be hypertensive.
a. What is the probability that a randomly selected male has a DBP above
95 mm Hg?
64
With σ = √144 = 12, Z = (95 − 80)/12 = 1.25
P(Z > 1.25) = 0.1056
Approximately 10.6% of this population would be classified as
hypertensive.
65
b. What is the probability that a randomly selected male has a
DBP above 110 mm Hg?
Z = (110 − 80)/12 = 2.50
P (Z > 2.50) = 0.0062
Approximately 0.6% of the population has a DBP above 110
mm Hg
66
c. What is the probability that a randomly selected male has a DBP
below 60 mm Hg?
Z = (60 − 80)/12 = -1.67
P (Z < -1.67) = 0.0475
Approximately 4.8% of the population has a DBP below 60 mm Hg.
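The blood pressure probabilities can be checked without a table by working directly with a normal distribution with mean 80 and standard deviation √144 = 12. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# DBP ~ Normal(mean = 80, variance = 144), so the standard deviation is 12
dbp = NormalDist(mu=80, sigma=sqrt(144))

p_above_95 = 1 - dbp.cdf(95)     # a. P(DBP > 95)
p_above_110 = 1 - dbp.cdf(110)   # b. P(DBP > 110)
p_below_60 = dbp.cdf(60)         # c. P(DBP < 60)

print(round(p_above_95, 4), round(p_above_110, 4), round(p_below_60, 4))
```

Part c comes out at about 0.0478 rather than 0.0475, because the table look-up rounds the Z-score to -1.67 while the code uses the unrounded value -1.6667.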
67
The normal distribution depends on the two parameters μ and σ.
μ determines the location of the curve.
But σ determines the scale of the curve, i.e. the degree of flatness or
peakedness of the curve.
[Figure: normal curves with different means, μ1 < μ2 < μ3, and with different standard deviations, σ1 < σ2 < σ3]
68
Student’s t Distribution
The t distribution, a family of continuous probability distributions, was
introduced by W. S. Gosset in 1908.
He used the pseudonym “Student” to avoid getting fired for doing
statistics on the job!
The shape of the t distribution is very similar to the shape of the
standard normal distribution.
They are all symmetric and uni-modal.
They are all centered at 0.
69
The t distribution is flatter and broader than the Normal(0,1).
This means:
The variability of t is greater than that of a Z that is Normal(0,1).
Thus, there is more area in the tails and less at the center
Because variability is greater, resulting confidence intervals will be
wider.
70
Student’s t Distribution…….
The t distribution has a (slightly) different shape for each possible
sample size.
As the degrees of freedom (df) get larger, the Student’s t distribution looks more and
more like the SND with mean = 0 and variance = 1.
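The heavier tails of the t distribution, and its convergence to the standard normal as df grows, can be seen by comparing density values. The sketch below writes out the standard t density using only the Python standard library; the function names `t_pdf` and `normal_pdf` are hypothetical, and the df values are chosen only for illustration.

```python
from math import exp, lgamma, log, pi, sqrt

def t_pdf(t, df):
    """Density of Student's t with df degrees of freedom (log-gamma for stability)."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + t * t / df))

def normal_pdf(z):
    """Density of the standard normal distribution."""
    return exp(-z * z / 2) / sqrt(2 * pi)

print("normal", normal_pdf(0), normal_pdf(3))
for df in (2, 10, 30, 1000):
    # Height at the center (t = 0) and in the tail (t = 3)
    print(df, t_pdf(0, df), t_pdf(3, df))
```

For small df the t density is lower at the center and higher in the tails than the normal density; as df increases, both values move toward those of the standard normal.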
71
Student’s t Table
The body of the table contains t values, not probabilities
72