Chapter 4: Probability and Probabilistic Models: Statistics I
Summary
- Probability:
  - Random experiments, sample space, elementary and composite events.
  - Axioms of probability.
  - Conditional probability and its properties.
  - Random variables (RVs) and their properties.
  - Stochastic models for discrete RVs: the Bernoulli and other related models.
  - Stochastic models for continuous RVs: the Normal (or Gaussian) and related models.
Basic concepts
- Random experiment: the process of observing an outcome that cannot be predicted with certainty.
- Sample space: the set of all possible outcomes of a random experiment, denoted by

Ω = {e1, e2, . . . , en, . . .}

whose elements are called elementary events. These are disjoint (i.e. they cannot occur at the same time).
- Event: a collection of elementary events, e.g.

A = {e1, e3}
Examples:
- The result of a coin toss.
- The closing price of stock x at the end of next Monday.
Events: basic concepts
Union of events: Let A and B be two events of the sample space Ω; the union A ∪ B is the set of all elementary events of Ω that belong to A or to B (or to both).
Events: basic concepts
Trivial events:
- Sure event Ω: the event equal to the whole sample space.
- Impossible event ∅: the empty set.

Complementary event
The complement of an event A, denoted Ā, is the set of all elementary events of Ω that do not belong to A.
Example: throw of a die
Consider the outcome of throwing a regular die (i.e. a die with k = 6 faces). Since all the faces are equally likely, for any event A:

P(A) = (1/k) × |A|
Postulates:
1. 0 ≤ P(A) ≤ 1.
2. If A = {e1, e2, . . . , en}, then P(A) = Σ_{i=1}^{n} P(ei).
3. P(Ω) = 1.
Consequently:
- Complement: P(Ā) = 1 − P(A).
- P(∅) = 0.
- Probability of an even score: A = {2, 4, 6}, then

P(A) = P("2") + P("4") + P("6") = 1/6 + 1/6 + 1/6 = 1/2

- Probability of a score greater than 3: B = {4, 5, 6}, then

P(B) = P("4") + P("5") + P("6") = 1/6 + 1/6 + 1/6 = 1/2

- Probability of an odd score: the complement of A, so

P(Ā) = 1 − P(A) = 1 − 1/2 = 1/2
Example: throw of a die
- Probability of an even score or a score greater than 3: since A ∩ B = {4, 6}, P(A ∩ B) = 1/3 and

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 1/2 + 1/2 − 1/3 = 4/6 = 2/3

- Probability of an even score or face 1: the events A = {2, 4, 6} and C = {1} are incompatible (A ∩ C = ∅), therefore

P(A ∪ C) = P(A) + P(C) = 1/2 + 1/6 = 4/6 = 2/3
Example: conditional probability
- Therefore, the probability of winning is P(A) = 3/37.
- Suppose that before playing we were told that the roulette is unfair, as only odd numbers come out. What is the probability of winning when including this information? Is it the same as before?
Notion of conditional probability
Conditional Probability
Let A and B be two events such that P(B) > 0; the conditional probability of A given B is defined as:

P(A|B) = P(A ∩ B) / P(B)

- Note that, once we know that the roulette is unfair, the sample space changes from the initial one, as no even number can come out: it becomes Ω* = B = {1, 3, 5, . . . , 35}. The probability of A over Ω* is then 1/9.
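The definition can also be checked by enumeration. A minimal sketch reusing the die events above (rather than the roulette, whose bet A is not fully specified in these slides):

```python
# P(A|B) = P(A & B) / P(B), defined only when P(B) > 0
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    return Fraction(len(event & omega), len(omega))

def cond_prob(a, b):
    # conditioning on B shrinks the effective sample space to B
    return prob(a & b) / prob(b)

A = {2, 4, 6}   # even score
B = {4, 5, 6}   # score greater than 3
print(cond_prob(A, B))   # 2/3: within B, two of the three outcomes are even
```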
Events B1, B2, . . . form a partition of the sample space if B1 ∪ B2 ∪ · · · = Ω and they are pairwise disjoint:

Bi ∩ Bj = ∅, ∀i ≠ j.

- For the pack of Spanish cards, the following sets form a partition of the sample space:
- Assuming that he draws twice, and knowing that the second one is black, what is the probability that the first one was yellow?
Discrete r.v.
If X takes values in a finite or countably infinite set S ⊆ R, we say that X is a discrete r.v.
Continuous r.v.
If X takes values in an uncountably infinite set S ⊆ R, we say that X is a continuous r.v.
Examples
- X = "score of throwing a die" is a discrete r.v., with S = {1, 2, 3, 4, 5, 6}.
- Y = "number of cars crossing a bridge in a week" is a discrete r.v., with S = {0, 1, 2, . . .} = N ∪ {0}, as it is countably infinite.
- Z = "the height of a student" is a continuous r.v., with S = [0, +∞).
Discrete r.v.
Probability function
Let X be a discrete r.v. with values {x1, x2, . . .}. We call probability function, or probability mass function, the set of probabilities with which X takes its values, that is, pi = P[X = xi], for i = 1, 2, . . . .
Example
X = the score of throwing a die. The probability mass function for a fair die is:

x          1     2     3     4     5     6
P[X = x]  1/6   1/6   1/6   1/6   1/6   1/6

- The c.d.f. follows by accumulation: P[X ≤ x] = Σ_{i : xi ≤ x} P[X = xi].
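A short sketch of how this pmf and its accumulated c.d.f. can be tabulated numerically (numpy assumed available; not part of the original slides):

```python
import numpy as np

# pmf of a fair die and its c.d.f. F(x) = P[X <= x], a running sum of the pmf
xs  = np.arange(1, 7)
pmf = np.full(6, 1/6)
cdf = np.cumsum(pmf)

for x, F in zip(xs, cdf):
    print(f"F({x}) = {F:.4f}")   # 0.1667, 0.3333, ..., 1.0000
```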
Ω = {(f, f, f), (a, f, f), (f, a, f), (f, f, a), (a, a, f), (a, f, a), (f, a, a), (a, a, a)}

P(X ≥ 0) = P(X = 1) + P(X = 3) + P(X = 27) = 0.243 + 0.027 + 0.001 = 0.271
or, via the complement: P(X ≥ 0) = 1 − P(X = −3) = 1 − 0.729 = 0.271.
Distribution Function
The cumulative distribution function (c.d.f.) of a r.v. X is a function F : R → [0, 1] that assigns to each x ∈ R the probability:

F(x) = P[X ≤ x] = Σ_{xi ∈ S, xi ≤ x} P(X = xi)
The variance is Var[X] = E[X²] − E[X]², where:

E[X²] = (−3)² × 0.729 + 1² × 0.243 + 3² × 0.027 + 27² × 0.001 = 7.776

so Var[X] = 7.776 − (−1.836)² = 4.405, and therefore the standard deviation is S[X] = √4.405 = 2.0988.
Example
Let X count the number of tails in tossing a coin twice. The probability
function of X is
x          0     1     2
P[X = x]  1/4   1/2   1/4
P(|X + 1.836| ≥ 3) ≤ 4.405 / 3² = 0.4894

Considering the probability function, the exact probability is:

P(|X + 1.836| ≥ 3) = P(X = 3) + P(X = 27) = 0.027 + 0.001 = 0.028,

which shows that Chebyshev's bound can be quite far from the exact probability.
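The bound-versus-exact comparison can be reproduced directly from the probability function of the running example. A sketch (numpy assumed available):

```python
import numpy as np

# values and probabilities of the running example
xs = np.array([-3.0, 1.0, 3.0, 27.0])
ps = np.array([0.729, 0.243, 0.027, 0.001])

mu  = (xs * ps).sum()                 # -1.836
var = (xs**2 * ps).sum() - mu**2      #  4.405 (E[X^2] = 7.776)

k = 3
bound = var / k**2                      # Chebyshev: P(|X - mu| >= 3) <= 0.4894
exact = ps[np.abs(xs - mu) >= k].sum()  # 0.027 + 0.001 = 0.028
print(mu, var, bound, exact)
```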
Summary Example
- The support of the r.v. is S = {−3, −1, 1, 3}, since:
X (e1 ) = 3 − 0 = 3
X (e2 ) = X (e3 ) = X (e4 ) = 2 − 1 = 1
X (e5 ) = X (e6 ) = X (e7 ) = 1 − 2 = −1
X (e8 ) = 0 − 3 = −3
Description
This probability model describes the outcome of an experiment with only two possible outcomes, which we can denote (for instance) success and failure. The corresponding random variable is

X = 1   if success
    0   if failure

and we write X ∼ Ber(p), where p = P(success).
Bernoulli model
Example
For the throw of a fair coin, let

X = 1   if heads
    0   if tails

This is a Bernoulli experiment and X follows a Bernoulli distribution with p = 1/2.

Example
A certain airline assumes that a passenger has probability 0.05 of not showing up at check-in. Let

Y = 1   if the passenger checks in
    0   if not

Then Y follows a Bernoulli distribution with parameter p = 0.95.
Bernoulli model
Probability function:

P[X = 0] = 1 − p,   P[X = 1] = p

c.d.f.:

F(x) = 0        if x < 0
       1 − p    if 0 ≤ x < 1
       1        if x ≥ 1

Properties
- E[X] = p × 1 + (1 − p) × 0 = p
- E[X²] = p × 1² + (1 − p) × 0² = p
- V[X] = E[X²] − E[X]² = p − p² = p(1 − p)
- S[X] = √(p(1 − p))
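A simulation sketch (numpy assumed; the seed and sample size are arbitrary choices, not from the slides) showing that the empirical mean and variance of Bernoulli draws approach p and p(1 − p):

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.5                               # e.g. the fair-coin example above
x = rng.binomial(1, p, size=100_000)  # Bernoulli = binomial with n = 1

print(x.mean())   # close to E[X] = p = 0.5
print(x.var())    # close to V[X] = p(1 - p) = 0.25
```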
Binomial model
Description
This model describes the total number of successes in n identical Bernoulli experiments repeated independently. The r.v. represents the number of successes and follows a binomial distribution with parameters n ∈ N and p ∈ [0, 1].
Definition
A discrete r.v. X follows a binomial distribution with parameters n and p, written X ∼ B(n, p), if

P[X = x] = C(n, x) p^x (1 − p)^(n−x),   for x = 0, 1, . . . , n,

where C(n, x) = n! / (x! (n − x)!) is the binomial coefficient.
Example
Suppose that the previous airline sold 80 tickets for a certain flight and that the probability of each passenger not showing up at check-in is 0.05. Let X = the number of passengers who check in. Then (assuming independence between passengers)

X ∼ B(80, 0.95)

Properties
- E[X] = np
- Var[X] = np(1 − p)
- S[X] = √(np(1 − p))
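A sketch of how such binomial quantities can be computed with scipy (the specific probabilities queried, e.g. P[X = 80] and P[X ≤ 75], are illustrative and not computed in the slides):

```python
from scipy import stats

# airline example: X ~ B(80, 0.95), number of passengers who check in
X = stats.binom(n=80, p=0.95)

print(X.mean())    # E[X] = np       = 76.0
print(X.var())     # V[X] = np(1-p)  = 3.8
print(X.pmf(80))   # P[all 80 show up] = 0.95**80
print(X.cdf(75))   # P[X <= 75], e.g. for overbooking questions
```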
Poisson model
Description
It models the number of rare events occurring in a certain domain such as, for instance, an interval of time or a region of space.
Examples: telephone calls in an hour, typos in a page, traffic accidents in a week, particles in a m³ of air, "Prussian soldiers killed by horse kicks", . . .
Definition
A r.v. X follows a Poisson distribution of parameter λ > 0 if

P[X = x] = λ^x e^(−λ) / x!,   for x = 0, 1, 2, . . . ,

and we write X ∼ P(λ).
Poisson model
Properties (1)
- E[X] = λ
- Var[X] = λ
- S[X] = √λ

Property (2)
Let X ∼ P(λ) represent the number of events in a unit of time, with mean λ. If Y represents the number of events in a time interval of length t, then

Y ∼ P(tλ)
Poisson model
Example
The mean number of typos per slide is 0.2; let X represent this number, then

X ∼ P(0.2)

What is the probability of having no typos?

P[X = 0] = 0.2⁰ e^(−0.2) / 0! = e^(−0.2) = 0.8187.

What is the probability of having one typo in 4 slides?
Let Y be the number of typos in t = 4 slides; then Y ∼ P(0.2 × 4) = P(0.8) and

P[Y = 1] = 0.8¹ e^(−0.8) / 1! = 0.8 e^(−0.8) = 0.3595.
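The two Poisson probabilities can be verified with scipy. A sketch:

```python
from scipy import stats

X = stats.poisson(mu=0.2)        # typos in one slide
Y = stats.poisson(mu=0.2 * 4)    # typos in t = 4 slides: P(t * lambda)

print(X.pmf(0))   # 0.8187... = exp(-0.2)
print(Y.pmf(1))   # 0.3595... = 0.8 * exp(-0.8)
```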
Continuous r.v.
Distribution function
For a continuous r.v. X , the distribution function is
F (x) = P[X ≤ x], ∀x ∈ R
Properties
- 0 ≤ F(x) ≤ 1, for all x ∈ R.
- F(−∞) = 0.
- F(∞) = 1.
- If x1 ≤ x2, then F(x1) ≤ F(x2); that is, F(x) is non-decreasing.
- For all x1, x2 ∈ R, P(x1 ≤ X ≤ x2) = F(x2) − F(x1).
- F(x) is continuous.
Density function
For a continuous r.v. X with distribution function F(x), the density function of X is:

f(x) = dF(x)/dx = F′(x)

Properties
- f(x) ≥ 0, ∀x ∈ R
- P(a ≤ X ≤ b) = ∫_a^b f(x) dx, ∀a, b ∈ R
- F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(u) du
- ∫_{−∞}^{+∞} f(x) dx = 1
Continuous r.v.
Example
For a r.v. X with density function

f(x) = 12x²(1 − x)   if 0 < x < 1
       0             otherwise

we have

P(X ≤ 0.5) = ∫_{−∞}^{0.5} f(u) du = ∫_0^{0.5} 12u²(1 − u) du = 0.3125

P(0.2 ≤ X ≤ 0.5) = ∫_{0.2}^{0.5} f(u) du = ∫_{0.2}^{0.5} 12u²(1 − u) du = 0.2853

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(u) du =
       0                   if x ≤ 0
       12 (x³/3 − x⁴/4)    if 0 < x ≤ 1
       1                   if x > 1
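The three integrals can be checked numerically. A sketch using scipy's quadrature:

```python
from scipy.integrate import quad

f = lambda x: 12 * x**2 * (1 - x)    # density on (0, 1)

print(quad(f, 0, 1)[0])      # 1.0     (the density integrates to one)
print(quad(f, 0, 0.5)[0])    # 0.3125  = P(X <= 0.5)
print(quad(f, 0.2, 0.5)[0])  # 0.2853  = P(0.2 <= X <= 0.5)
```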
Expectation of a continuous r.v.
The variance is Var[X] = E[X²] − E[X]², where:

E[X²] = ∫_R x² f(x) dx = ∫_0^1 12x⁴(1 − x) dx = (12/5) x⁵ |_{x=0}^{x=1} − (12/6) x⁶ |_{x=0}^{x=1} = 12/5 − 2 = 2/5

Since E[X] = 3/5, we get Var[X] = 2/5 − (3/5)² = 1/25, and therefore the standard deviation is S[X] = √(1/25) = 1/5.
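The moments of the same density can be checked in the same way. A sketch:

```python
from scipy.integrate import quad

f = lambda x: 12 * x**2 * (1 - x)

EX  = quad(lambda x: x * f(x), 0, 1)[0]      # 0.6  = 3/5
EX2 = quad(lambda x: x**2 * f(x), 0, 1)[0]   # 0.4  = 2/5
var = EX2 - EX**2                            # 0.04 = 1/25
print(EX, EX2, var, var**0.5)                # standard deviation 0.2 = 1/5
```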
Uniform distribution
Description
For the uniform distribution, every subset of the same length has the same probability; that is, the density is constant over the bounded set where the r.v. takes its values.
Definition
A continuous r.v. X follows a uniform distribution over the interval (a, b) (where a and b are the parameters of the distribution) if

f(x) = 1/(b − a)   if a < x ≤ b
       0           otherwise

Properties
- Expectation: E[X] = (a + b)/2
- Variance: V[X] = (b − a)²/12
- Standard deviation: S[X] = (b − a)/√12
Example of uniform distribution
Distribution function
For X ∼ U(3, 5):

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(u) du = . . .

- If x > 5, then F(x) = P(X ≤ x) = ∫_3^5 (1/2) du = u/2 |_3^5 = (5 − 3)/2 = 1.

Summarizing, we have:

F(x) = 0            if x ≤ 3
       (x − 3)/2    if 3 < x ≤ 5
       1            if x > 5
Example of uniform distribution
Expectation

E[X] = ∫_R x · f(x) dx = ∫_3^5 x · (1/2) dx = x²/4 |_3^5 = (5² − 3²)/4 = 4

Variance

Var[X] = ∫_R x² · f(x) dx − E[X]² = ∫_3^5 (x²/2) dx − 4² = x³/6 |_3^5 − 16 = (5³ − 3³)/6 − 16 = 0.33
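A sketch checking these values against scipy's uniform distribution (note that scipy parametrizes U(a, b) as loc = a, scale = b − a):

```python
from scipy import stats

X = stats.uniform(loc=3, scale=2)   # U(3, 5)

print(X.mean())    # (a+b)/2    = 4.0
print(X.var())     # (b-a)^2/12 = 0.333...
print(X.cdf(4.0))  # F(4) = (4-3)/2 = 0.5
print(X.cdf(6.0))  # 1.0 (beyond b = 5)
```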
Exponential distribution
Description
The exponential distribution models the time between two consecutive events that occur independently and uniformly over time (i.e. the time between two Poisson events).
Definition
We say that X follows an exponential distribution of parameter λ > 0, X ∼ E(λ), if its density function is

f(x) = λ e^(−λx)   if x ≥ 0
       0           otherwise

Examples
- Time between the arrivals of two trucks at the discharge point.
- Time between two emergency calls.
- Lifetime of a lightbulb.
Exponential distribution
Properties
- Expectation: E[X] = 1/λ
- Variance: V[X] = 1/λ²
- Standard deviation: S[X] = 1/λ

c.d.f.:

F(x) = 1 − e^(−λx)   if x ≥ 0
       0             otherwise
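A sketch relating the formulas above to scipy's exponential distribution (the rate λ = 2 is an arbitrary illustration, not from the slides; scipy uses scale = 1/λ):

```python
import numpy as np
from scipy import stats

lam = 2.0                              # assumed rate, for illustration
X = stats.expon(scale=1/lam)           # scipy parametrizes by scale = 1/lambda

print(X.mean(), X.var())               # 1/lambda = 0.5, 1/lambda^2 = 0.25
x = 1.3
print(X.cdf(x), 1 - np.exp(-lam * x))  # both give F(x) = 1 - e^(-lambda*x)
```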
Description
The normal distribution models the measurement errors of a continuous quantity and approximates very well many real situations. Statistics makes wide use of this model and of the models derived from it.
Definition
The r.v. X follows a normal or Gaussian distribution with parameters µ ∈ R and σ ∈ R⁺, X ∼ N(µ, σ), if

f(x) = (1 / (σ√(2π))) exp( −(x − µ)² / (2σ²) )

Properties
E[X] = µ,   V[X] = σ²
If X ∼ N(µ, σ), the density f(x) is symmetric around µ, which is also the median.
Normal or Gaussian distribution
[Figure: density function for 3 different values of µ and σ]
Normal or Gaussian distribution
Property
If X ∼ N(µ, σ):
- P(µ − σ < X < µ + σ) ≈ 0.683
- P(µ − 2σ < X < µ + 2σ) ≈ 0.955
- P(µ − 3σ < X < µ + 3σ) ≈ 0.997
Chebyshev’s inequality
Chebyshev’s inequality applies also to continuous variables knowing only
its mean and standard deviation. In the case where X is Gaussian with
mean µ and standard deviation σ, we have that:
σ2
P (µ − k < X < µ + k) = P (|X − µ| < k) ≥ 1 −
k2
1
therefore, if k = cσ, we have that P (µ − cσ < X < µ + cσ) ≥ 1 − c2 .
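A sketch comparing the Chebyshev bound 1 − 1/c² with the exact Gaussian probabilities for c = 1, 2, 3 (scipy assumed available):

```python
from scipy import stats

for c in (1, 2, 3):
    exact = stats.norm.cdf(c) - stats.norm.cdf(-c)  # P(mu-c*sigma < X < mu+c*sigma)
    bound = 1 - 1/c**2                              # Chebyshev lower bound
    print(c, round(exact, 3), round(bound, 3))
# 1  0.683  0.0
# 2  0.955  0.75
# 3  0.997  0.889
```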
Normal or Gaussian distribution
Linear transformation
If X ∼ N (µ, σ), then:
Y = aX + b ∼ N (aµ + b, |a|σ)
Standardization
If X ∼ N(µ, σ), it is possible to consider the standardized r.v.

Z = (X − µ)/σ ∼ N(0, 1)

The special case N(0, 1) is called the standard normal distribution. It is symmetric around 0, and its c.d.f. (whose analytical expression is not available in closed form) is tabulated.
Table of N (0, 1)
Example of Normal Distribution
- Pr(Z < −1.5) = Pr(Z > 1.5) = 1 − Pr(Z < 1.5) = 1 − 0.9332 = 0.0668. Why not ≤? Because Z is continuous, Pr(Z = −1.5) = 0, so < and ≤ give the same probability.
- Pr(−1.5 < Z < 1.5) = Pr(Z < 1.5) − Pr(Z < −1.5) = 0.9332 − 0.0668 = 0.8664.
Example of Normal Distribution
Let X ∼ N(µ = 2, σ = 3); we want to calculate Pr(X < 4) and Pr(−1 < X < 3.5) using the table of the standard normal:
- First we rewrite the statement in terms of Z via standardization, then we use the table:

Pr(X < 4) = Pr( (X − 2)/3 < (4 − 2)/3 ) = Pr(Z < 0.6̇) ≈ 0.7454
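A sketch reproducing the standardization with scipy; it also evaluates the second probability, Pr(−1 < X < 3.5), which the slides leave to the reader. (The table lookup at z = 0.66 gives 0.7454, while the untruncated z = 2/3 gives ≈ 0.7475.)

```python
from scipy import stats

Z = stats.norm(0, 1)
print(Z.cdf((4 - 2) / 3))        # Pr(X < 4) via standardization, ~ 0.7475

X = stats.norm(loc=2, scale=3)   # or let scipy standardize internally
print(X.cdf(4))                  # same value
print(X.cdf(3.5) - X.cdf(-1))    # Pr(-1 < X < 3.5) ~ 0.5328
```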
If the sample were made of 5 packs, what is the probability that at least one has a loss between 3% and 5%? In this case we have n = 5 and p = 0.6827, then Y ∼ B(5, 0.6827) and

P(Y ≥ 1) = 1 − P(Y = 0) = 1 − (1 − 0.6827)⁵ ≈ 0.9968
The Central Limit Theorem (CLT) states that, in the limit for n large, the distribution of X̄ is Gaussian independently of the distribution of X. Given its generality, it is called "central".
Theorem
Let X1, X2, . . . , Xn be independent r.v., identically distributed, with mean µ and standard deviation σ (both finite). For n large enough,

(X̄ − µ) / (σ/√n) ∼ N(0, 1)
Approximations with the CLT
Binomial
Let X ∼ B(n, p) with n large enough (that is, n ≥ 30 and 0.1 ≤ p ≤ 0.9, or np ≥ 5 and n(1 − p) ≥ 5); then:

(X − np) / √(np(1 − p)) ∼ N(0, 1)

Poisson
If X ∼ P(λ) with λ large enough (λ > 5),

(X − λ) / √λ ∼ N(0, 1)

Equivalently, P(λ) ≈ N(λ, √λ).
Approximations with the CLT: Example
- Let X ∼ B(100, 1/3). Suppose we want to calculate Pr(X < 40); as the exact computation is heavy by hand, we use the CLT: X ∼ B(100, 1/3) ≈ N(33.3̇, 4.714), because

E[X] = 100 × (1/3) = 33.3̇
V[X] = 100 × (1/3) × (2/3) = 22.2̇
S[X] = √22.2̇ = 4.714

- Therefore,

Pr(X < 40) = Pr( (X − 33.3̇)/4.714 < (40 − 33.3̇)/4.714 ) ≈ Pr(Z < 1.414) ≈ 0.921, where Z ∼ N(0, 1).

- The exact value, computed with a PC, is 0.934, so the CLT approximation is not very far from the exact value.
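A sketch comparing the exact binomial value with the CLT approximation (scipy assumed; the continuity-corrected variant is included for comparison, though the slides do not use it):

```python
from scipy import stats

n, p = 100, 1/3
X = stats.binom(n, p)
Z = stats.norm(loc=n * p, scale=(n * p * (1 - p)) ** 0.5)  # N(33.33, 4.714)

print(X.cdf(39))    # exact Pr(X < 40) = Pr(X <= 39)
print(Z.cdf(40))    # plain CLT approximation, ~ 0.921
print(Z.cdf(39.5))  # with a continuity correction, usually closer
```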
Distributions related to the normal one
χ² (Chi-squared)
Let X1, X2, . . . , Xn be i.i.d. N(0, 1) r.v. The distribution of

S = Σ_{i=1}^{n} Xi²

is called the chi-squared distribution with n degrees of freedom, χ²_n.

Student's t
Let Y, X1, X2, . . . , Xn be i.i.d. N(0, 1) r.v. The distribution of

T = Y / √( Σ_{i=1}^{n} Xi² / n )

is called Student's t distribution with n degrees of freedom, t_n.

Fisher's F_{n,m}
Let X1, X2, . . . , Xn and Y1, Y2, . . . , Ym be independent N(0, 1) r.v. The distribution of

F = ( Σ_{i=1}^{n} Xi² / n ) / ( Σ_{i=1}^{m} Yi² / m ) = (m Σ_{i=1}^{n} Xi²) / (n Σ_{i=1}^{m} Yi²)

is called the Fisher distribution F_{n,m}. It depends on two parameters, n and m, called degrees of freedom (d.f.), with the following properties:
- E[F] = m/(m − 2) (for m > 2)
- Var[F] = 2m²(n + m − 2) / (n(m − 2)²(m − 4)) (for m > 4)
- 1/F ∼ F_{m,n}
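A simulation sketch (numpy/scipy assumed; the degrees of freedom n = 5, m = 10 and the seed are arbitrary choices) checking that the F ratio built from its definition has mean close to m/(m − 2):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, m = 5, 10

# build F_{n,m} from its definition: ratio of scaled sums of squared N(0,1)
X = rng.standard_normal((200_000, n))
Y = rng.standard_normal((200_000, m))
F = ((X**2).sum(axis=1) / n) / ((Y**2).sum(axis=1) / m)

print(F.mean(), m / (m - 2))   # simulated mean vs m/(m-2) = 1.25
print(stats.f(n, m).mean())    # scipy's F distribution agrees
```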