OCW 2020
Properties of one-dimensional random variables:
theory and practice
DISCRETE DISTRIBUTIONS OF
RANDOM VARIABLES
LESSON 3
Xabier Erdocia
Itsaso Leceta
OBJECTIVES
• Be able to identify the discrete distribution that a random variable follows.
• After identifying the discrete distribution that a random variable follows, be able to calculate different probabilities using the probability function or the distribution function.
• Be able to calculate and interpret the moments of the different discrete distributions.
• Know the conditions that are necessary to approximate the hypergeometric distribution by the binomial distribution, and the binomial distribution by the Poisson distribution, and be able to perform the approximation properly.
INDEX
3.1. Binary distribution
3.2. Binomial distribution
3.3. Geometric distribution
3.4. Negative binomial distribution
3.5. Hypergeometric distribution
3.6. Poisson distribution
3.1. Binary distribution
• When a single random test has only two possible results, the random variable follows a binary distribution, also called a Bernoulli distribution.
• The two possible results are $X = 1$ (Success) or $X = 0$ (Failure), with probabilities $P(X = 1) = p$ and $P(X = 0) = q = 1 - p$.
• The distribution that the variable follows is given by:
$$X \sim \text{Binary}(p)$$
• The probability function is given by:
$$p(x) = p^x (1 - p)^{1 - x} = p^x q^{1 - x}, \quad x = 0, 1$$
• The distribution function is given by:
$$F(x) = P(X \le x) = \begin{cases} q, & x = 0 \\ 1, & x = 1 \end{cases}$$
• The mean of a random variable that follows a binary distribution is:
$$E(X) = p$$
• Knowing the probability function, other moments and measures can be calculated. For example, the variance is:
$$\sigma^2 = pq$$
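As a small illustration, the mean and variance of the binary distribution can be recovered directly from its probability function. This is only a sketch: the function name `bernoulli_pmf` and the value p = 0.3 are assumptions chosen for the example, not part of the lesson.

```python
def bernoulli_pmf(x: int, p: float) -> float:
    """p(x) = p^x * (1 - p)^(1 - x) for x in {0, 1}; zero elsewhere."""
    if x not in (0, 1):
        return 0.0
    return p**x * (1 - p) ** (1 - x)


p = 0.3  # assumed success probability, for illustration only
q = 1 - p

# Moments computed from the definition: E(X) = sum of x * p(x),
# variance = sum of (x - E(X))^2 * p(x)
mean = sum(x * bernoulli_pmf(x, p) for x in (0, 1))
variance = sum((x - mean) ** 2 * bernoulli_pmf(x, p) for x in (0, 1))

print(f"mean = {mean:.4f}, p = {p:.4f}")
print(f"variance = {variance:.4f}, p*q = {p * q:.4f}")
```

Both printed pairs should agree, matching the formulas for the mean and the variance.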
3.2. Binomial distribution
• When the random test of the binary distribution is repeated $n$ times, each test being independent of the others, the random variable that counts the successes obtained over all the tests follows a binomial distribution.
• In each test the two possible results are “Success” with probability $p$ and “Failure” with probability $q = 1 - p$.
• The distribution that the variable follows is given by:
$$X \sim B(n, p)$$
• The probability function is given by:
$$p(x) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad x = 0, 1, \ldots, n$$
• The distribution function is given by:
$$F(x) = P(X \le x) = \sum_{i=0}^{x} p(i) = \sum_{i=0}^{x} \binom{n}{i} p^i q^{n - i}$$
• The mean of a random variable that follows a binomial distribution is:
$$E(X) = np$$
• Knowing the probability function, other moments and measures can be calculated. For example, the variance is:
$$\sigma^2 = npq$$
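The binomial probability and distribution functions can be sketched in a few lines using only the Python standard library. The names `binomial_pmf` and `binomial_cdf` and the values n = 10, p = 0.2 are assumptions made for this example.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """p(x) = C(n, x) * p^x * (1 - p)^(n - x) for x = 0, 1, ..., n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(x: int, n: int, p: float) -> float:
    """F(x) = P(X <= x), the sum of p(i) for i = 0, ..., x."""
    return sum(binomial_pmf(i, n, p) for i in range(x + 1))


n, p = 10, 0.2  # assumed parameters, for illustration only
q = 1 - p

# Moments computed from the probability function over the whole support
mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
variance = sum((x - mean) ** 2 * binomial_pmf(x, n, p) for x in range(n + 1))

print(f"P(X <= 2) = {binomial_cdf(2, n, p):.4f}")
print(f"mean = {mean:.4f}, n*p = {n * p:.4f}")
print(f"variance = {variance:.4f}, n*p*q = {n * p * q:.4f}")
```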
3.3. Geometric distribution
• When the random test defined in the binary distribution is repeated, each test being independent of the others, the random variable that counts the number of tests performed before the first success (the successful test itself is not counted) follows a geometric distribution.
• In each test the two possible results are “Success” with probability $p$ and “Failure” with probability $q = 1 - p$.
• The distribution that the variable follows is given by:
$$X \sim G(p)$$
• The probability function is given by:
$$p(x) = q^x p, \quad x = 0, 1, 2, \ldots$$
• The distribution function is given by:
$$F(x) = P(X \le x) = \sum_{i=0}^{x} p(i) = \sum_{i=0}^{x} q^i p$$
• The mean of a random variable that follows a geometric distribution is:
$$E(X) = \frac{q}{p}$$
• Knowing the probability function, other moments and measures can be calculated. For example, the variance is:
$$\sigma^2 = \frac{q}{p^2}$$
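The following sketch evaluates the geometric probability and distribution functions and checks the moment formulas numerically. The function names, the value p = 0.25 and the truncation at 500 terms are assumptions made for the example. Because the support is infinite, the moments are only approximated by a long finite sum.

```python
def geometric_pmf(x: int, p: float) -> float:
    """p(x) = q^x * p: probability of x failures before the first success."""
    if x < 0:
        return 0.0
    return (1 - p) ** x * p


def geometric_cdf(x: int, p: float) -> float:
    """F(x) = P(X <= x), the sum of q^i * p for i = 0, ..., x."""
    return sum(geometric_pmf(i, p) for i in range(x + 1))


p = 0.25  # assumed success probability, for illustration only
q = 1 - p

# Summing the finite geometric series gives the closed form F(x) = 1 - q^(x + 1)
closed_form = 1 - q ** (3 + 1)

# Truncated sums: the neglected tail is of the order of q^500, which is negligible
terms = range(500)
mean = sum(x * geometric_pmf(x, p) for x in terms)
variance = sum((x - mean) ** 2 * geometric_pmf(x, p) for x in terms)

print(f"F(3) = {geometric_cdf(3, p):.6f}, closed form = {closed_form:.6f}")
print(f"mean = {mean:.4f}, q/p = {q / p:.4f}")
print(f"variance = {variance:.4f}, q/p^2 = {q / p**2:.4f}")
```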
3.4. Negative binomial distribution
• When the random test defined in the binary distribution is repeated, each test being independent of the others, the random variable that counts the number of tests performed before the $n$-th success (the $n$ successful tests are not counted) follows a negative binomial distribution.
• In each test the two possible results are “Success” with probability $p$ and “Failure” with probability $q = 1 - p$.
• The distribution that the variable follows is given by:
$$X \sim BN(n, p)$$
• The probability function is given by:
$$p(x) = \binom{n + x - 1}{x} q^x p^n, \quad x = 0, 1, 2, \ldots$$
• The distribution function is given by:
$$F(x) = P(X \le x) = \sum_{i=0}^{x} p(i) = \sum_{i=0}^{x} \binom{n + i - 1}{i} q^i p^n$$
• The mean of a random variable that follows a negative binomial distribution is:
$$E(X) = \frac{nq}{p}$$
• Knowing the probability function, other moments and measures can be calculated. For example, the variance is:
$$\sigma^2 = \frac{nq}{p^2}$$
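A minimal sketch of the negative binomial probability function, with a numerical check of the moments, could look as follows. The function name, the parameters n = 3 and p = 0.4, and the truncation at 500 terms are assumptions made for the example.

```python
from math import comb


def negative_binomial_pmf(x: int, n: int, p: float) -> float:
    """p(x) = C(n + x - 1, x) * q^x * p^n: x failures before the n-th success."""
    if x < 0:
        return 0.0
    return comb(n + x - 1, x) * (1 - p) ** x * p**n


n, p = 3, 0.4  # assumed parameters, for illustration only
q = 1 - p

# The support is infinite, so the moments are approximated by truncated sums
terms = range(500)
mean = sum(x * negative_binomial_pmf(x, n, p) for x in terms)
variance = sum((x - mean) ** 2 * negative_binomial_pmf(x, n, p) for x in terms)

print(f"mean = {mean:.4f}, n*q/p = {n * q / p:.4f}")
print(f"variance = {variance:.4f}, n*q/p^2 = {n * q / p**2:.4f}")

# With n = 1 the probability function reduces to the geometric one, q^x * p
print(negative_binomial_pmf(2, 1, p), q**2 * p)
```

The last line illustrates that the geometric distribution is the special case n = 1.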
3.5. Hypergeometric distribution
• The hypergeometric distribution is similar to the binomial distribution, but the observations are not independent: they are made in a finite population without replacement.
• Because the population is finite, not returning an element to the population after observing it influences the subsequent observations. The proportions do not remain constant as the observations advance.
• In each observation of a population of size $N$ there are two possible results, “Success” and “Failure”. In the first observation (test) the probability of success is $p$ and the probability of failure is $q = 1 - p$.
• The distribution that the variable follows is given by:
$$X \sim H(N, n, p)$$
• The probability function is given by:
$$p(x) = \frac{\binom{Np}{x} \binom{Nq}{n - x}}{\binom{N}{n}}, \quad \max(0, n - Nq) \le x \le \min(n, Np)$$
• The distribution function is given by:
$$F(x) = P(X \le x) = \sum_{i = \max(0, n - Nq)}^{x} p(i) = \sum_{i = \max(0, n - Nq)}^{x} \frac{\binom{Np}{i} \binom{Nq}{n - i}}{\binom{N}{n}}$$
• The mean of a random variable that follows a hypergeometric distribution is:
$$E(X) = np$$
• Knowing the probability function, other moments and measures can be calculated. For example, the variance is:
$$\sigma^2 = npq \, \frac{N - n}{N - 1}$$
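The hypergeometric formulas can be checked with a short sketch. For convenience the code works with the number of successes in the population, written here as `K` (an assumed name standing for $Np$, so that $Nq = N - K$). The values N = 50, n = 5 and K = 20 are also assumptions for the example.

```python
from math import comb


def hypergeometric_pmf(x: int, N: int, n: int, K: int) -> float:
    """p(x) = C(K, x) * C(N - K, n - x) / C(N, n), with K = N*p successes."""
    lowest, highest = max(0, n - (N - K)), min(n, K)
    if x < lowest or x > highest:
        return 0.0
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)


N, n, K = 50, 5, 20  # assumed population size, sample size and successes
p = K / N
q = 1 - p

# Support: max(0, n - Nq) <= x <= min(n, Np)
support = range(max(0, n - (N - K)), min(n, K) + 1)
total = sum(hypergeometric_pmf(x, N, n, K) for x in support)
mean = sum(x * hypergeometric_pmf(x, N, n, K) for x in support)
variance = sum((x - mean) ** 2 * hypergeometric_pmf(x, N, n, K) for x in support)

print(f"sum of probabilities = {total:.6f}")
print(f"mean = {mean:.4f}, n*p = {n * p:.4f}")
print(f"variance = {variance:.4f}, formula = {n * p * q * (N - n) / (N - 1):.4f}")
```

The factor $(N - n)/(N - 1)$ is what makes the variance smaller than the binomial value $npq$ when sampling without replacement.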
Approximation of the hypergeometric distribution through the binomial distribution:
• For a random variable that follows a hypergeometric distribution, if the size of the population, $N$, is much larger than the number of observations, $n$, the hypergeometric distribution can be approximated by a binomial distribution.
• Therefore, if $N \to \infty$ and $n/N \to 0$, $H(N, n, p) \approx B(n, p)$.
• This approximation is acceptable when: $N > 10n$
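To get a feeling for the quality of this approximation, the sketch below compares the hypergeometric and binomial probability functions point by point. The helper `max_pmf_gap`, the variable `K` (standing for $Np$) and both parameter sets are assumptions chosen for illustration: one population is much larger than the sample, the other is only twice the sample size.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Binomial probability function B(n, p)."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def hypergeometric_pmf(x: int, N: int, n: int, K: int) -> float:
    """Hypergeometric probability function with K = N*p successes."""
    if x < max(0, n - (N - K)) or x > min(n, K):
        return 0.0
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)


def max_pmf_gap(N: int, n: int, K: int) -> float:
    """Largest absolute difference between H(N, n, p) and B(n, p) over x."""
    p = K / N
    return max(
        abs(hypergeometric_pmf(x, N, n, K) - binomial_pmf(x, n, p))
        for x in range(n + 1)
    )


# Same p = 0.3 and n = 10 in both cases; only the population size changes
gap_large_population = max_pmf_gap(N=1000, n=10, K=300)
gap_small_population = max_pmf_gap(N=20, n=10, K=6)

print(f"N = 1000: largest gap = {gap_large_population:.5f}")
print(f"N = 20:   largest gap = {gap_small_population:.5f}")
```

The gap should be visibly smaller for the large population, which is the situation in which the approximation is intended to be used.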
3.6. Poisson distribution
• When a random variable $X$ counts the number of times an event occurs in a continuous interval (time, space…), it follows a Poisson distribution. The events occurring in one interval are independent of those occurring in another interval, provided that the intervals do not overlap.
• The number of events is never negative; the parameter $\lambda$ defines the number of events expected in a given interval.
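• The distribution that the variable follows is given by:
$$X \sim P(\lambda)$$
• The probability function is given by:
$$p(x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x \ge 0$$
• The distribution function is given by:
$$F(x) = P(X \le x) = \sum_{i=0}^{x} p(i) = \sum_{i=0}^{x} \frac{e^{-\lambda} \lambda^i}{i!}$$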
• The mean of a random variable that follows a Poisson distribution is:
$$E(X) = \lambda$$
• Knowing the probability function, other moments and measures can be calculated. For example, the variance is:
$$\sigma^2 = \lambda$$
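The Poisson probability and distribution functions can be sketched with the standard library alone. The function names, the value of the parameter (written `lam`, since `lambda` is a reserved word in Python) and the truncation at 100 terms are assumptions made for this example.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """p(x) = e^(-lam) * lam^x / x! for x = 0, 1, 2, ..."""
    if x < 0:
        return 0.0
    return exp(-lam) * lam**x / factorial(x)


def poisson_cdf(x: int, lam: float) -> float:
    """F(x) = P(X <= x), the sum of p(i) for i = 0, ..., x."""
    return sum(poisson_pmf(i, lam) for i in range(x + 1))


lam = 2.5  # assumed expected number of events in the interval

# The support is infinite; 100 terms leave a negligible tail for this parameter
terms = range(100)
mean = sum(x * poisson_pmf(x, lam) for x in terms)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in terms)

print(f"P(X <= 3) = {poisson_cdf(3, lam):.4f}")
print(f"mean = {mean:.4f}, variance = {variance:.4f}, parameter = {lam:.4f}")
```

Mean and variance should both agree with the parameter, as stated above.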
Approximation of the binomial distribution through the Poisson distribution:
• For a random variable that follows a binomial distribution, when the number of observations or tests, $n$, is very high and the probability of success, $p$, is low, the binomial distribution can be approximated by a Poisson distribution.
• Therefore, if $n \to \infty$ and $p \to 0$, $B(n, p) \approx P(np)$.
• This approximation is acceptable when: $n > 30$ and $p < 0.1$
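As a rough numerical check of this approximation, the sketch below compares a binomial probability function with the Poisson one of parameter $np$. The helper `max_pmf_gap` and both parameter sets are assumptions chosen for illustration: the first has a large $n$ and a small $p$, the second does not.

```python
from math import comb, exp, factorial


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Binomial probability function B(n, p)."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x: int, lam: float) -> float:
    """Poisson probability function with parameter lam."""
    if x < 0:
        return 0.0
    return exp(-lam) * lam**x / factorial(x)


def max_pmf_gap(n: int, p: float) -> float:
    """Largest absolute difference between B(n, p) and P(n*p) for x = 0..n."""
    lam = n * p
    return max(
        abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(n + 1)
    )


gap_many_rare = max_pmf_gap(n=100, p=0.03)  # large n, small p
gap_few_common = max_pmf_gap(n=10, p=0.5)   # small n, large p

print(f"n = 100, p = 0.03: largest gap = {gap_many_rare:.5f}")
print(f"n = 10,  p = 0.5:  largest gap = {gap_few_common:.5f}")
```

The first gap should be much smaller than the second, in line with the conditions on $n$ and $p$ stated above.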