LEARNING MODULE 2
KEY STATISTICAL CONCEPTS
• Probability distributions
• Discrete probability distributions
• Continuous probability distributions
• Binomial distribution
• Normal distribution
This work is licensed under a Creative Commons Attribution 4.0 International License.
Random Variables
Definition:
• A random variable, usually written X, is a variable whose
possible values are numerical outcomes of a random
phenomenon. These values can be associated with
probabilities. There are two types of random variables,
discrete and continuous.
• Discrete random variables have a countable number of
outcomes, e.g., dice
• Continuous random variables have an infinite continuum
of possible values, e.g., blood pressure
Probability Functions/Distribution
• A probability distribution or function is a function that
describes the probability of a random variable taking
certain values
• A probability function maps the possible values of x to
their respective probabilities of occurrence, p(x)
Note:
• p(x) is a number between 0 and 1
• Area under a probability function is always 1
Distributions
[Figure: examples of common probability distributions]
Mean and Variance
• If we understand the underlying probability distribution
of a certain phenomenon, we know how x is expected to
behave on average
• The expected value E[X] is the weighted average or
mean (µ) of random variable X
• If a random variable X takes value x1 with probability p1, x2 with p2, …, and xn with pn, the expected value or mean is then given by

  E[X] = µ = x1·p1 + x2·p2 + … + xn·pn
Mean and Variance
• The variance describes how far the values of a random
variable deviate from the mean
• The variance Var[X] of a random variable X with expected value µ = E[X] is given by

  Var[X] = E[(X - µ)²] = Σi (xi - µ)²·pi

• Variance is often also denoted σ², where σ is the standard deviation of the random variable X
Questions:
• How can you relate the concepts of accuracy and precision to the measure of variance?
• What about reproducibility?
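The definitions above can be checked with a short computation; the sketch below uses a fair six-sided die as the random variable X (the die example appears later in this module).

```python
# Computing E[X] and Var[X] directly from the definitions above,
# using a fair six-sided die as the random variable X.
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6  # uniform: every face is equally likely

mean = sum(x * p for x, p in zip(values, probs))
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))

print(mean)      # 3.5
print(variance)  # ~2.9167 (= 35/12)
```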
Discrete Example: Roll of a Die
• There are six possible outcomes for a die roll: numbers from one
through six
• Assume the die is fair, i.e. all numbers have the same probability
of showing
• If all outcomes are equally likely, then the probabilities are equal
as well – and since the sum over all probabilities has to be one,
they are all 1/6
• The histogram below shows the probability of each number showing for a single roll of the die

[Figure: uniform histogram, p(x) = 1/6 for x = 1, 2, …, 6]
http://s522.photobucket.com/user/poka-dot-pocky/media/Gaia/Decorated%20images/dice_zps0d0b23cc.png.html
Discrete Example: Roll of a Die
• Probabilities are equal (uniformly distributed)
• Each roll of the die has to come up with some result
• The probability of obtaining any result at all is thus one
• Summing up all probabilities of a discrete random variable will thus give one

x    P
1    P(x = 1) = 1/6
2    P(x = 2) = 1/6
3    P(x = 3) = 1/6
4    P(x = 4) = 1/6
5    P(x = 5) = 1/6
6    P(x = 6) = 1/6
Cumulative Distribution Function
Definition:
The cumulative distribution function (CDF) or just
distribution function, describes the probability that random
variable X with a given probability distribution will be found
to have a value less than or equal to x.
For a discrete random variable X, the CDF is computed by summing up the probabilities of all possible values xi up to x:

  FX(x) = P(X ≤ x) = Σ p(xi), summed over all xi ≤ x

Note: For continuous random variables the summation is replaced by an integral.
Cumulative Distribution Function
Example: six-sided die
• Outcomes are mutually exclusive (only one side can be up)
• Probabilities can simply be summed up to obtain the CDF

x    P(X ≤ x)
1    1/6
2    2/6
3    3/6
4    4/6
5    5/6
6    6/6

[Figure: step plot of FX(x), rising in steps of 1/6 from 1/6 at x = 1 to 1 at x = 6]
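The discrete CDF is just a running sum over the pmf. A minimal sketch for the fair die, using exact fractions so the results print cleanly:

```python
from fractions import Fraction

# The pmf of a fair six-sided die; Fractions keep the arithmetic exact.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(x):
    # F_X(x) = sum of p(x_i) over all outcomes x_i <= x
    return sum(p for outcome, p in pmf.items() if outcome <= x)

print(cdf(3))  # 1/2
print(cdf(6))  # 1 (summing all probabilities gives one)
```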
Examples
1. What is the probability of rolling a 3 or less?
2. What is the probability of rolling a 5 or higher?
3. Is the probability of rolling 10 or higher on a 20-sided die higher than the probability of rolling a 6 or higher on a 12-sided die?
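These exercises can be checked numerically with the uniform CDF of a fair s-sided die, F(x) = x/s for x = 1, …, s; the sketch below is one way to do so.

```python
from fractions import Fraction

def die_cdf(x, sides):
    # P(X <= x) for a fair die with the given number of sides
    return Fraction(min(max(x, 0), sides), sides)

p_three_or_less = die_cdf(3, 6)        # 1/2
p_five_or_higher = 1 - die_cdf(4, 6)   # 1/3
p_ge10_on_d20 = 1 - die_cdf(9, 20)     # 11/20
p_ge6_on_d12 = 1 - die_cdf(5, 12)      # 7/12
print(p_ge10_on_d20 > p_ge6_on_d12)    # False: 7/12 is larger than 11/20
```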
Important Discrete Distributions
Binomial Distribution
[Figure: binomial pmfs for p = 0.5 and n = 20; p = 0.7 and n = 20; p = 0.5 and n = 40]
Example application: isotope distributions (Learning Unit 1C)
http://upload.wikimedia.org/wikipedia/commons/7/75/Binomial_distribution_pmf.svg

Poisson Distribution
[Figure: Poisson pmfs]
Example application: peptide identification (Learning Unit 7C/D)
http://upload.wikimedia.org/wikipedia/commons/1/16/Poisson_pmf.svg
Bernoulli Experiment
• A Bernoulli experiment is a random experiment where the random variable can take only two values:
  • Success (1)
  • Failure (0)
• Given a probability of success p, the probability of failure is q = 1 - p

Example: Roll a (six-sided) die, hoping for a six
• 6: success, p = 1/6
• 1, 2, 3, 4, 5: failure, q = 1 - p = 5/6

Jakob Bernoulli (1655-1705, Swiss mathematician)
http://en.wikipedia.org/wiki/Jacob_Bernoulli#mediaviewer/File:Jakob_Bernoulli.jpg
Binomial Distribution
• Independent Bernoulli experiments build the basis for binomial
distributions
• Experiments with two possible outcomes (e.g., flipping a coin)
• n independent (repeated) experiments are performed
• Probability of success p is the same in every experiment
• Example: N marbles in a jar, r black and N - r white
• What is the probability of drawing k black marbles if n marbles are drawn with replacement?
Binomial Distribution
• The binomial distribution B(k; n, p) describes the probability for an n-trial binomial experiment to result in exactly k successes:

  B(k; n, p) = C(n, k) · p^k · q^(n-k)

where
• k: the number of successes that result from the binomial experiment
• n: the number of trials in the binomial experiment
• p: the probability of success in an individual trial
• q: the probability of failure (q = 1 - p)
• C(n, k): the binomial coefficient (read: “n choose k”), the number of different ways to choose k things out of n things
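The pmf above translates directly into code; `math.comb` supplies the binomial coefficient.

```python
from math import comb

# Binomial pmf B(k; n, p) = C(n, k) * p^k * q^(n-k)
def binomial_pmf(k, n, p):
    q = 1 - p  # probability of failure
    return comb(n, k) * p**k * q**(n - k)

# e.g., exactly 10 heads in 20 fair coin flips:
print(binomial_pmf(10, 20, 0.5))  # ~0.176
```

Summing the pmf over k = 0, …, n gives 1, as required of any probability function.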
Example: Drawing Marbles
Experiment: Draw two marbles from a jar containing 10 white and 10 black marbles (with replacement)
• The probability of drawing k black marbles (n = 2, p = 0.5) is:

# of black marbles    probability
0                     0.25
1                     0.5
2                     0.25

• Mean and variance of the probability distribution are given by:

  µ = n·p = 2 · 0.5 = 1
  σ² = n·p·q = 2 · 0.5 · 0.5 = 0.5
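A quick check of the table and the moment formulas with B(k; n = 2, p = 0.5):

```python
from math import comb

# Verifying the marble example: two draws, half the marbles black.
n, p = 2, 0.5

def pmf(k):
    return comb(n, k) * p**k * (1 - p)**(n - k)

print([pmf(k) for k in range(3)])  # [0.25, 0.5, 0.25]
mean = n * p                # 1.0
variance = n * p * (1 - p)  # 0.5
```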
Example: Throwing a Coin
Experiment: Throw a fair coin ten times

[Figure: binomial pmf B(k; 10, 0.5), probability vs. number of heads]
Binomial Approximation of the
Poisson Distribution
• Let P(X = k) denote the binomial distribution

  P(X = k) = C(n, k) · p^k · (1 - p)^(n-k)

and let p = λ/n
• We then obtain, in the limit of very large n:

  lim n→∞ C(n, k) · (λ/n)^k · (1 - λ/n)^(n-k)
Binomial Approximation of the
Poisson Distribution
We thus obtain

  P(X = k) = (λ^k / k!) · e^(-λ),

the well-known Poisson distribution
• The Poisson distribution describes a Bernoulli experiment with a high number of repeats and low success probability (i.e., if p is small and n is large)
• Therefore it is also called the Poisson law of small numbers
• The mean as well as the variance of the Poisson distribution is λ
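The limit can be observed numerically: for large n and small p = λ/n, the binomial pmf is already very close to the Poisson pmf with the same λ. A minimal sketch:

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return lam**k * exp(-lam) / factorial(k)

# Large n, small p = lam/n: the two pmfs nearly coincide.
lam, n = 4.5, 10_000
max_diff = max(abs(binom_pmf(k, n, lam / n) - poisson_pmf(k, lam))
               for k in range(30))
print(max_diff)  # very small: the approximation is already excellent
```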
Binomial Distributions
[Figure: binomial pmfs for n = 100, p = 0.5; n = 100, p = 0.1; n = 1,000, p = 0.009; n = 5,000, p = 0.0009]
Poisson Distribution
[Figure: Poisson pmfs for n = 1000 with λ = 4.5 and λ = 2.5]
Continuous Random Variables
• The probability function fX for a continuous random
variable X is a non-negative, continuous function that
integrates to 1
• Cumulative distribution functions for continuous random variables are computed equivalently to those of discrete random variables
• Rather than summing over all suitable outcomes, we need to integrate, though:

  FX(x) = P(X ≤ x) = ∫ fX(t) dt, integrated from -∞ to x
Gaussian Distribution
• The probability function is given by

  f(x) = (1 / (σ·√(2π))) · exp(-(x - µ)² / (2σ²))

• By definition, it integrates to one over the whole real line
• The probability function results in the well-known bell-shaped Gaussian curve
Gaussian Mean and Variance
• The expectation value is calculated as follows,

  E[X] = ∫ x·f(x) dx = µ

• Furthermore,

  E[X²] = µ² + σ²

• resulting in the general variance of Gaussian distributions:

  Var[X] = E[X²] - E[X]² = σ²
Standard Normal Distribution
• The standard normal distribution corresponds to the
general form of the Gaussian distribution with µ = 0 and
σ2 = 1 (centered, unit variance)
• An arbitrary normal distribution can be converted to a standard normal distribution via the Z-transformation:

  Z = (X - µ) / σ

resulting in Z ~ N(0, 1)
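The Z-transformation is a one-liner in code. The numbers below (an IQ-style scale with mean 100 and standard deviation 15) are illustrative assumptions, not values from the slides:

```python
# Standardizing a value: Z = (X - mu) / sigma.
# mu = 100, sigma = 15 are hypothetical example parameters.
def z_transform(x, mu, sigma):
    return (x - mu) / sigma

print(z_transform(130, 100, 15))  # 2.0, i.e., two standard deviations above the mean
```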
Gaussian Distribution
[Figure: Gaussian density curves for different µ and σ]
Error Function
• Computing the CDF of a normal distribution is related to the error function (or Gauss error function) erf:

  erf(x) = (2/√π) · ∫ e^(-t²) dt, integrated from 0 to x

• Note that this integral cannot be evaluated in closed form in terms of elementary functions
• It can be approximated with elementary functions, though, or evaluated numerically

Note: The error function is an odd function, i.e., erf(-x) = -erf(x)

http://en.wikipedia.org/wiki/Error_function#mediaviewer/File:Error_Function.svg
Error Function
• We can compute the CDF of a normal distribution and thus obtain

  P{X ≤ x} = ∫ f(t) dt, integrated from -∞ to x

• With the Gaussian error function, this CDF simplifies to

  P{X ≤ x} = ½ · (1 + erf((x - µ) / (σ·√2)))

• This allows the evaluation of the probability that a Gaussian random variable Y lies in an interval of size r around the mean value µ:

  P{|Y - µ| ≤ r} = erf(r / (σ·√2))
Error Function
• What is the probability that a Gaussian random variable lies within twice the standard deviation of the mean?

  P{|Y - µ| ≤ 2σ} = erf(2σ / (σ·√2)) = erf(√2) ≈ 0.955

• The probability that a Gaussian random variable lies within the interval [µ - 2σ, µ + 2σ] is thus 95.5%
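The erf-based CDF and the 2σ probability can be verified with the standard library's `math.erf`:

```python
from math import erf, sqrt

# Normal CDF expressed via the error function.
def normal_cdf(x, mu=0.0, sigma=1.0):
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Probability of lying within [mu - 2*sigma, mu + 2*sigma]:
p = normal_cdf(2) - normal_cdf(-2)
print(round(p, 4))  # 0.9545
```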
p-Value
• Widely used as a measure of statistical significance
• p-value in the measurement context: the probability of observing an incorrect event with a given score or better
• Hence, a low p-value implies a low probability that the
observed measurement is incorrect
• The p-value can be derived from the false positive rate
(FPR), the fraction of incorrect measurements among a
set of measurements (i.e., all measurements above a
given threshold)
• Problems associated with p-value calculations
• The FPR is usually unknown
• p-values should be corrected for multiple hypothesis testing
p-Values in Statistical Testing
• p-values are used to judge the significance of a test for the
null-hypothesis
• Null hypothesis: corresponds to the default position, e.g., random-chance peptide identification, or that the mean values of two independent measurements are not different
• Alternative hypothesis: the opposite position, e.g., non-random peptide identification
• Usually, the null hypothesis cannot be formally proven, but statistical testing can accept or reject it
• The null hypothesis is rejected if the p-value is less than a significance level α (e.g., 0.05 or 0.01)
What is False?
• A general problem for any statistical assessment is that we usually do not know what is really true or false (the ground truth)
• All applied methods therefore need to make assumptions about false positive assignments
A Note on the Weibull Distribution
In the 3-parameter Weibull model, the scale parameter, η, defines where the bulk of the distribution lies; the shape parameter, β, defines the shape of the distribution; and the location parameter, γ, defines the location of the distribution in time.

[Figure: 3-parameter Weibull density]
Understanding Poisson
Example: If the average number of babies born in a hospital is 3 per hour (λ = 3), then over 2 hours the expected count is λ·t = 3 · 2 = 6, and the probability of exactly 4 newborns within those 2 hours is

  P(X = 4) = (6⁴ / 4!) · e⁻⁶ ≈ 0.134
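The birth example above reduces to a single Poisson pmf evaluation with rate λ·t = 6:

```python
from math import exp, factorial

# Poisson pmf: P(X = k) = lam^k * e^(-lam) / k!
def poisson_pmf(k, lam):
    return lam**k * exp(-lam) / factorial(k)

# lambda = 3 births/hour over t = 2 hours gives rate 6;
# probability of exactly 4 newborns in that window:
print(round(poisson_pmf(4, 3 * 2), 4))  # 0.1339
```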