
Lecture 5: Joint Probability Distributions: Bo Li

This document provides an overview of a lecture on joint probability distributions. It discusses key concepts such as: - Joint probability mass/density functions which define the probabilities of random variables occurring together. - Marginal probability mass/density functions which are the probabilities of individual random variables. - Independence of random variables and how it relates their joint and marginal distributions. - Additional topics covered include expectations, covariance, and an application to portfolio selection involving multiple random variables.

Uploaded by

Lee Chia

Lecture 5: Joint Probability Distributions

Bo LI

School of Economics and Management


Tsinghua University


Bo LI (Tsinghua SEM) Lec 5: Joint Probability Distributions 1 / 80


Overview

1 Jointly Distributed Random Variables

2 Sampling Distributions and Estimation
    The Law of Large Numbers
    The Mean, Variance and Moment Generating Functions for Several Variables
    Distributions Based on a Normal Random Sample



Jointly Distributed Random Variables

Joint Probability Mass Function

Let X and Y be two discrete rv’s defined on the sample space S of an experiment. The joint probability mass function p(x, y) is defined for each pair of numbers (x, y) by

p(x, y) = P(X = x and Y = y)

Let A be any set consisting of pairs of (x, y) values. Then the probability that the random pair (X, Y) lies in A is obtained by summing the joint pmf over pairs in A:

$$P[(X, Y) \in A] = \sum_{(x, y) \in A} p(x, y)$$


Joint Probability Table


An insurance agency services customers who have purchased both a homeowner’s policy and an automobile policy. For each type of policy, a deductible amount must be specified.
For an automobile policy, the choices are $100 and $250, whereas for a homeowner’s policy, the choices are $0, $100 and $200.
Suppose a customer is selected at random from the agency’s files. Let X be the deductible amount on the auto policy and Y the deductible amount on the homeowner’s policy.
The joint pmf specifies the probability associated with each possible (X, Y) pair, with any other pair having probability zero. Suppose the joint pmf is given in the accompanying joint probability table:
                       y
p(x, y)       0       100       200

Marginal Probability Mass Functions

The marginal probability mass functions of X and of Y, denoted by $p_X(x)$ and $p_Y(y)$, respectively, are given by

$$p_X(x) = \sum_y p(x, y), \qquad p_Y(y) = \sum_x p(x, y)$$


Calculate Marginal Probability


The possible X values are x = 100 and x = 250, so computing
row totals in the joint probability table yields

pX (100) = p(100, 0) + p(100, 100) + p(100, 200) = .50

pX (250) = p(250, 0) + p(250, 100) + p(250, 200) = .50


The marginal pmf of X is then

$$p_X(x) = \begin{cases} .5 & x = 100, 250 \\ 0 & \text{otherwise} \end{cases}$$

Similarly, the marginal pmf of Y is obtained from column totals as

$$p_Y(y) = \begin{cases} .25 & y = 0, 100 \\ .50 & y = 200 \\ 0 & \text{otherwise} \end{cases}$$
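
The row and column totals can be checked with a short script. This is only a sketch: the individual cell probabilities in `joint` are assumed values, chosen so that the totals agree with the marginals stated in the text, and are not read from the table itself. The names `joint`, `marginals`, `marginal_x` and `marginal_y` are illustrative.

```python
from collections import defaultdict

# ASSUMED cell probabilities p(x, y): illustrative values chosen to be
# consistent with the marginal pmfs quoted in the text.
joint = {
    (100, 0): 0.20, (100, 100): 0.10, (100, 200): 0.20,
    (250, 0): 0.05, (250, 100): 0.15, (250, 200): 0.30,
}

def marginals(joint_pmf):
    """Return (p_X, p_Y) by summing the joint pmf over the other variable."""
    p_x, p_y = defaultdict(float), defaultdict(float)
    for (x, y), p in joint_pmf.items():
        p_x[x] += p  # row total
        p_y[y] += p  # column total
    return dict(p_x), dict(p_y)

marginal_x, marginal_y = marginals(joint)
print(marginal_x)
print(marginal_y)
```

Storing the pmf as a dictionary keyed by (x, y) keeps the sums over one variable explicit, which mirrors the definition of a marginal pmf.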

Joint Probability Density Function

Let X and Y be continuous rv’s. Then f(x, y) is the joint probability density function for X and Y if for any two-dimensional set A

$$P[(X, Y) \in A] = \iint_A f(x, y)\, dx\, dy$$

In particular, if A is the two-dimensional rectangle {(x, y) : a ≤ x ≤ b, c ≤ y ≤ d}, then

$$P[(X, Y) \in A] = P(a \le X \le b,\ c \le Y \le d) = \int_a^b \int_c^d f(x, y)\, dy\, dx$$
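
As a rough numerical illustration of the rectangle formula, the sketch below approximates the double integral with a midpoint rule. The density f(x, y) = x + y on the unit square is a hypothetical example chosen for this illustration, not one taken from the lecture, and `rectangle_probability` is an illustrative helper name.

```python
def rectangle_probability(f, a, b, c, d, n=200):
    """Midpoint-rule approximation of the integral of f over [a, b] x [c, d]."""
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        for j in range(n):
            y = c + (j + 0.5) * hy
            total += f(x, y)
    return total * hx * hy

# HYPOTHETICAL joint density: f(x, y) = x + y on the unit square, 0 elsewhere.
def f(x, y):
    return x + y if 0 <= x <= 1 and 0 <= y <= 1 else 0.0

# P(0 <= X <= 0.5, 0 <= Y <= 0.5) and the total probability mass.
prob = rectangle_probability(f, 0.0, 0.5, 0.0, 0.5)
total_mass = rectangle_probability(f, 0.0, 1.0, 0.0, 1.0)
print(prob, total_mass)
```

For this density the exact values are 1/8 and 1, so the output gives a quick check that f integrates to one and that the rectangle probability is computed over the right region.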

Marginal Probability Density Functions

The marginal probability density functions of X and Y, denoted by $f_X(x)$ and $f_Y(y)$, respectively, are given by

$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\, dy \quad \text{for } -\infty < x < \infty$$

$$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\, dx \quad \text{for } -\infty < y < \infty$$

Marginal Probability Density Functions: Example2

Independence

More than Two Random Variables

Multinomial Distribution

Expectation of More than Two Random Variables

Expectation of More than Two Random Variables: Example

Covariance

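The covariance between two rv’s X and Y is

$$\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = \begin{cases} \sum_x \sum_y (x - \mu_X)(y - \mu_Y)\, p(x, y) & \text{if } X \text{ and } Y \text{ are discrete} \\ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y)\, f(x, y)\, dx\, dy & \text{if } X \text{ and } Y \text{ are continuous} \end{cases}$$

$$\mathrm{Cov}(X, Y) = E(XY) - \mu_X \cdot \mu_Y$$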
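
The two expressions for the covariance can be compared numerically. The sketch below does this for a small discrete joint pmf of two deductible amounts; the cell probabilities in `joint` are assumed for illustration only, and `expectation` is an illustrative helper.

```python
# ASSUMED joint pmf p(x, y) of two deductible amounts (illustrative values).
joint = {
    (100, 0): 0.20, (100, 100): 0.10, (100, 200): 0.20,
    (250, 0): 0.05, (250, 100): 0.15, (250, 200): 0.30,
}

def expectation(joint_pmf, g):
    """E[g(X, Y)] for a discrete joint pmf stored as {(x, y): probability}."""
    return sum(g(x, y) * p for (x, y), p in joint_pmf.items())

mu_x = expectation(joint, lambda x, y: x)
mu_y = expectation(joint, lambda x, y: y)

# Definition: E[(X - mu_X)(Y - mu_Y)]
cov_definition = expectation(joint, lambda x, y: (x - mu_x) * (y - mu_y))
# Shortcut: E(XY) - mu_X * mu_Y
cov_shortcut = expectation(joint, lambda x, y: x * y) - mu_x * mu_y

print(mu_x, mu_y, cov_definition, cov_shortcut)
```

Under these assumed probabilities both routes give the same covariance, which is positive: larger auto deductibles tend to go with larger homeowner deductibles.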

Expected Values, Cov. Corr.: Example

Application: Portfolios and Random Variables

How should money be allocated among several stocks that form a portfolio?
Need to manipulate several random variables at once to understand portfolios
Since stocks tend to rise and fall together, random variables for these events must capture dependence


Two Random Variables


Suppose a day trader can buy stock in two companies, IBM and
Microsoft, at $100 per share
X denotes the change in value of IBM
Y denotes the change in value of Microsoft


Comparisons and the Sharpe Ratio

The day trader can invest $200 in


Two shares of IBM;
Two shares of Microsoft; or
One share of each

Joint Probability Distribution of X and Y

Probability Distribution for the Two Stocks

Dependent Random Variables


Joint probability table shows changes in values of IBM and
Microsoft (X and Y ) are dependent
The dependence between them is positive

Which portfolio should she choose?

Sharpe Ratio for Mixed Portfolio

$$S(X + Y) = \frac{(\mu_X + \mu_Y) - 2 r_f}{\sqrt{\mathrm{Var}(X + Y)}} \approx \frac{0.22 - 0.03}{\sqrt{14.64}} \approx 0.050$$
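
The arithmetic in the Sharpe ratio can be reproduced in a few lines. The function name `sharpe_ratio` and its argument names are illustrative; the three inputs are the numbers quoted in the formula.

```python
import math

def sharpe_ratio(mean_return, risk_free, variance):
    """(expected return - risk-free return) / standard deviation of the return."""
    return (mean_return - risk_free) / math.sqrt(variance)

# Values quoted in the formula: mu_X + mu_Y = 0.22, 2 * r_f = 0.03,
# Var(X + Y) = 14.64.
s_mixed = sharpe_ratio(0.22, 0.03, 14.64)
print(round(s_mixed, 3))
```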

Summary of Sharpe Ratios (Advantage of Diversifying)

Linear Combination of Random Variables

Correlation

Association and Causation

Conditional Distributions

Conditional Distributions: Example1

Conditional Distributions: Example2

Conditional Mean/Conditional Expectation

Conditional Variance
The conditional mean of any function g(Y) can be obtained similarly. In the discrete case,

$$E(g(Y) \mid X = x) = \sum_{y \in D_Y} g(y)\, p_{Y|X}(y \mid x)$$

In the continuous case,

$$E(g(Y) \mid X = x) = \int_{-\infty}^{\infty} g(y)\, f_{Y|X}(y \mid x)\, dy$$

The conditional variance of Y given X = x is

$$\sigma^2_{Y|X=x} = V(Y \mid X = x) = E\{[Y - E(Y \mid X = x)]^2 \mid X = x\}$$

There is a shortcut formula for the conditional variance analogous to that for V(Y) itself:

$$\sigma^2_{Y|X=x} = V(Y \mid X = x) = E(Y^2 \mid X = x) - \mu^2_{Y|X=x}$$
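
A small discrete example shows the definition and the shortcut giving the same conditional variance. The joint pmf below is assumed for illustration only, and the helper names `conditional_pmf_y` and `conditional_expectation` are hypothetical.

```python
# ASSUMED joint pmf p(x, y) (illustrative values only).
joint = {
    (100, 0): 0.20, (100, 100): 0.10, (100, 200): 0.20,
    (250, 0): 0.05, (250, 100): 0.15, (250, 200): 0.30,
}

def conditional_pmf_y(joint_pmf, x0):
    """p_{Y|X}(y | x0) = p(x0, y) / p_X(x0)."""
    p_x0 = sum(p for (x, _), p in joint_pmf.items() if x == x0)
    return {y: p / p_x0 for (x, y), p in joint_pmf.items() if x == x0}

def conditional_expectation(joint_pmf, x0, g):
    """E(g(Y) | X = x0) in the discrete case."""
    return sum(g(y) * p for y, p in conditional_pmf_y(joint_pmf, x0).items())

x0 = 100
cond_mean = conditional_expectation(joint, x0, lambda y: y)
# Definition: E{[Y - E(Y | X = x)]^2 | X = x}
var_definition = conditional_expectation(joint, x0, lambda y: (y - cond_mean) ** 2)
# Shortcut: E(Y^2 | X = x) - (conditional mean)^2
var_shortcut = conditional_expectation(joint, x0, lambda y: y ** 2) - cond_mean ** 2

print(cond_mean, var_definition, var_shortcut)
```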


Conditional Distributions

The Bivariate Normal Distribution

Joint Normal Distribution (Density)

$$f_Y(y) = \det(2\pi \cdot \Sigma)^{-1/2} \exp\left(-\frac{1}{2} (y - \mu)' \Sigma^{-1} (y - \mu)\right)$$

Bivariate normal densities with µX = µY = 0 and σX = σY = 1
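
The density formula can be evaluated directly in the two-dimensional case. The sketch below writes out the 2 x 2 determinant and inverse by hand; the correlation value `rho = 0.5` is an assumption made for illustration, and `bivariate_normal_pdf` is an illustrative name.

```python
import math

def bivariate_normal_pdf(y, mu, sigma):
    """det(2*pi*Sigma)^(-1/2) * exp(-0.5 * (y - mu)' Sigma^{-1} (y - mu)), 2 x 2 case."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    z = [y[0] - mu[0], y[1] - mu[1]]
    quad = sum(z[i] * inv[i][j] * z[j] for i in range(2) for j in range(2))
    # In two dimensions det(2*pi*Sigma) = (2*pi)^2 * det(Sigma).
    return math.exp(-0.5 * quad) / math.sqrt((2 * math.pi) ** 2 * det)

rho = 0.5  # ASSUMED correlation, for illustration only
sigma = [[1.0, rho], [rho, 1.0]]  # unit variances, as in the caption
density_at_origin = bivariate_normal_pdf([0.0, 0.0], [0.0, 0.0], sigma)
print(density_at_origin)
```

With zero means and unit variances, the value at the origin equals 1 / (2π√(1 − ρ²)), which gives a simple check on the implementation.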


The Bivariate Normal Distribution

Regression to the Mean

Conditional Mean and Variance as Random Variables: A Key Theorem

Conditional Expectations: Example1

Conditional Expectations: Example2

Conditional Expectations: Random Sums


This example introduces sums of the type $T = \sum_{i=1}^{N} X_i$, where N is a random variable with a finite expectation and the $X_i$ are random variables that are independent of N and have the common mean E(X).
An insurance company might receive N claims in a given period of time, and the amounts of the individual claims might be modeled as random variables $X_1, X_2, \ldots$ The random variable N could denote the number of customers entering a store and $X_i$ the expenditure of the ith customer, or N could denote the number of jobs in a single-server queue and $X_i$ the service time for the ith job. For this last case, T is the time to serve all the jobs in the queue.
According to Theorem a, E(T) = E[E(T|N)]. Since E(T|N = n) = nE(X), we have E(T|N) = NE(X), and thus

$$E(T) = E[N E(X)] = E(N) E(X)$$




Assume the $X_i$ are independent random variables with the same mean and the same variance, Var(X), and that Var(N) < ∞. According to Theorem b,

$$\mathrm{Var}(T) = E[\mathrm{Var}(T \mid N)] + \mathrm{Var}[E(T \mid N)]$$

Because E(T|N) = NE(X), we have

$$\mathrm{Var}[E(T \mid N)] = [E(X)]^2\, \mathrm{Var}(N)$$

Also, since $\mathrm{Var}(T \mid N = n) = \mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = n\, \mathrm{Var}(X)$, we have Var(T|N) = N Var(X). Further,

$$E[\mathrm{Var}(T \mid N)] = E(N)\, \mathrm{Var}(X)$$

We thus obtain

$$\mathrm{Var}(T) = [E(X)]^2\, \mathrm{Var}(N) + E(N)\, \mathrm{Var}(X)$$
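
The mean and variance formulas for a random sum can be checked exactly on a toy case by building the distribution of T through repeated convolution. The two small pmfs `pmf_n` and `pmf_x` below are hypothetical, and the sketch assumes the X_i are independent of each other and of N.

```python
def convolve(p, q):
    """pmf of the sum of two independent discrete random variables."""
    out = {}
    for a, pa in p.items():
        for b, pb in q.items():
            out[a + b] = out.get(a + b, 0.0) + pa * pb
    return out

def mean_var(pmf):
    m = sum(v * p for v, p in pmf.items())
    return m, sum((v - m) ** 2 * p for v, p in pmf.items())

# HYPOTHETICAL toy distributions for N and for each X_i.
pmf_n = {0: 0.2, 1: 0.3, 2: 0.5}
pmf_x = {1: 0.4, 3: 0.6}

# Build the pmf of T = X_1 + ... + X_N by mixing over the values of N.
pmf_t = {}
partial = {0: 1.0}  # pmf of an empty sum (N = 0)
for n in range(max(pmf_n) + 1):
    if n > 0:
        partial = convolve(partial, pmf_x)  # pmf of X_1 + ... + X_n
    for t, p in partial.items():
        pmf_t[t] = pmf_t.get(t, 0.0) + pmf_n.get(n, 0.0) * p

e_n, var_n = mean_var(pmf_n)
e_x, var_x = mean_var(pmf_x)
e_t, var_t = mean_var(pmf_t)

print(e_t, e_n * e_x)
print(var_t, e_x ** 2 * var_n + e_n * var_x)
```

Each printed pair should agree: the first value comes from the enumerated distribution of T, the second from the formulas E(N)E(X) and [E(X)]² Var(N) + E(N) Var(X).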

As a concrete example, suppose that the number of insurance claims in a certain time period has expected value equal to 900 and standard deviation equal to 30, as would be the case if the number were a Poisson random variable with expected value 900. Suppose that the average claim value is $1000 and the standard deviation is $500. Then the expected value of the total, T, of the claims is E(T) = $900,000 and the variance of T is

$$\mathrm{Var}(T) = 1000^2 \times 900 + 900 \times 500^2 = 1.125 \times 10^9$$

so the standard deviation of T is $33,541.


The insurance company could then plan on total claims of $900,000 plus or minus a few standard deviations (by Chebyshev’s inequality). Observe that if the total number of claims were not variable but were fixed at N = 900, the variance of the total claims would be given by E(N) Var(X) in the previous expression. The result would be a standard deviation equal to $15,000. The variability in the number of claims thus contributes substantially to the uncertainty in the total.
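
The figures in this insurance example follow directly from the random-sum formulas E(T) = E(N)E(X) and Var(T) = [E(X)]² Var(N) + E(N) Var(X). A minimal sketch of the arithmetic, with illustrative variable names:

```python
import math

e_n, sd_n = 900, 30      # number of claims: mean and standard deviation
e_x, sd_x = 1000, 500    # claim size in dollars: mean and standard deviation

e_t = e_n * e_x                                  # E(T) = E(N) E(X)
var_t = e_x ** 2 * sd_n ** 2 + e_n * sd_x ** 2   # [E(X)]^2 Var(N) + E(N) Var(X)
sd_t = math.sqrt(var_t)

# If the number of claims were fixed at 900, only E(N) Var(X) would remain.
sd_fixed = math.sqrt(e_n * sd_x ** 2)

print(e_t, var_t, round(sd_t), sd_fixed)
```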


Prediction
Suppose we want to predict Y using an instrument X. Our predictor is hence denoted as h(X). We need some measure of the effectiveness of a prediction. One that is amenable to mathematical analysis and that is widely used is the mean squared error (MSE):

$$\mathrm{MSE} = E\left[(Y - h(X))^2\right]$$

Note that

$$\mathrm{MSE} = E\left[(Y - E(Y \mid X))^2\right] + E\left[(E(Y \mid X) - h(X))^2\right]$$

The first term does not depend on h, and the second term is nonnegative and equals zero when h(X) = E(Y|X). Thus the minimizing function h*(X) is

$$h^*(X) = E(Y \mid X)$$
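
The decomposition of the MSE can be seen numerically by comparing the conditional-mean predictor with a competing one. In this sketch the joint pmf is assumed for illustration only, the competing predictor is the constant h(X) = E(Y), and the helper names `expect`, `cond_mean` and `mse` are hypothetical.

```python
# ASSUMED joint pmf p(x, y) (illustrative values only).
joint = {
    (100, 0): 0.20, (100, 100): 0.10, (100, 200): 0.20,
    (250, 0): 0.05, (250, 100): 0.15, (250, 200): 0.30,
}

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

def cond_mean(x0):
    """E(Y | X = x0)."""
    p_x0 = sum(p for (x, _), p in joint.items() if x == x0)
    return sum(y * p for (x, y), p in joint.items() if x == x0) / p_x0

def mse(h):
    """E[(Y - h(X))^2] for a predictor h."""
    return expect(lambda x, y: (y - h(x)) ** 2)

mu_y = expect(lambda x, y: y)
mse_best = mse(cond_mean)          # h*(X) = E(Y | X)
mse_const = mse(lambda x: mu_y)    # competing predictor h(X) = E(Y)
# Second term of the decomposition for the constant predictor.
gap = expect(lambda x, y: (cond_mean(x) - mu_y) ** 2)

print(mse_best, mse_const, gap)
```

The MSE of the constant predictor equals the MSE of the conditional mean plus the gap term, so no predictor based on X alone can do better than E(Y|X).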


For the bivariate normal distribution, we found that

$$E(Y \mid X) = \mu_Y + \rho \frac{\sigma_Y}{\sigma_X} (X - \mu_X)$$

This linear function of X is thus the minimum mean squared error predictor of Y from X. It can be shown that for a general joint distribution of Y and X, the best linear predictor of Y in terms of X (having the form α + βX) in the sense of minimum MSE is also

$$\alpha + \beta X = \mu_Y + \rho \frac{\sigma_Y}{\sigma_X} (X - \mu_X)$$

Note that the optimal linear predictor depends on the joint distribution of X and Y only through their means, variances, and covariance.
Sampling Distributions and Estimation The Law of Large Numbers

The Law of Large Numbers(LLN)


Two Inequalities about Expectation and Variance

Markov Inequality: Assume X is a nonnegative rv for which E(X) exists. Then, for any t > 0,

$$P(X \ge t) \le \frac{E(X)}{t}$$

Chebyshev Inequality: for any t > 0,

$$P(|X - E(X)| \ge t) \le \frac{\mathrm{Var}(X)}{t^2}$$
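
Both inequalities can be checked exactly on a simple distribution. The sketch below uses a fair six-sided die, which is an example chosen here for illustration, and exact rational arithmetic so that no rounding is involved.

```python
from fractions import Fraction

# Example distribution (an illustration, not from the lecture): a fair die.
pmf = {k: Fraction(1, 6) for k in range(1, 7)}
mean = sum(k * p for k, p in pmf.items())
var = sum((k - mean) ** 2 * p for k, p in pmf.items())

def prob(event):
    """Probability that the die outcome satisfies the given condition."""
    return sum(p for k, p in pmf.items() if event(k))

t = 5
markov_lhs = prob(lambda k: k >= t)   # P(X >= t)
markov_rhs = mean / t                 # E(X) / t

s = 2
chebyshev_lhs = prob(lambda k: abs(k - mean) >= s)  # P(|X - E(X)| >= s)
chebyshev_rhs = var / s ** 2                        # Var(X) / s^2

print(markov_lhs, markov_rhs)
print(chebyshev_lhs, chebyshev_rhs)
```

In both cases the bound holds with room to spare, which is typical: the inequalities make no assumption about the shape of the distribution and are therefore often loose.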


Proof of the Law of Large Numbers

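Proof using Chebyshev’s inequality: for any ε > 0,

$$P(|\bar{X} - \mu| > \varepsilon) \le \frac{\mathrm{Var}(\bar{X})}{\varepsilon^2} = \frac{\sigma^2}{n \varepsilon^2} \to 0 \quad \text{as } n \to \infty$$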
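
The bound σ²/(nε²) shrinks like 1/n, which the sketch below tabulates for a few sample sizes and then compares with one simulated sample mean. The fair-die distribution, the tolerance ε = 0.1 and the random seed are all assumptions made for this illustration.

```python
import random

def chebyshev_bound(sigma2, n, eps):
    """Upper bound sigma^2 / (n * eps^2) on P(|sample mean - mu| > eps)."""
    return sigma2 / (n * eps ** 2)

# Illustrative setting: rolls of a fair die, mu = 3.5 and sigma^2 = 35/12.
sigma2, eps = 35 / 12, 0.1
bounds = {n: chebyshev_bound(sigma2, n, eps) for n in (1_000, 10_000, 100_000)}
for n, b in bounds.items():
    print(n, round(b, 5))

# One simulated sample mean for comparison (seed fixed for reproducibility).
random.seed(0)
n = 10_000
sample_mean = sum(random.randint(1, 6) for _ in range(n)) / n
print(sample_mean)
```

The bound only guarantees that large deviations become rare; in practice the sample mean is usually much closer to 3.5 than the bound alone would suggest.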

Sampling Distributions and Estimation The Mean, Variance and Moment Generating Functions for Several Variables

Linear Combination of Several Variables: Mean

Linear Combination of Several Variables: Variance

Linear Combination of Several Variables: Example

Linear Combination of Several Variables: Normal Variables

A Proof Using MGF

Linear Combination of Several Variables: Example cont’d

Sampling Distributions and Estimation Distributions Based on a Normal Random Sample

Chisquare Distribution

If $X_1 \sim \chi^2_{v_1}$, $X_2 \sim \chi^2_{v_2}$, and they are independent, then $X_1 + X_2 \sim \chi^2_{v_1 + v_2}$
If $Z_1, Z_2, \ldots, Z_n$ are independent and each has the standard normal distribution, then $Z_1^2 + Z_2^2 + \cdots + Z_n^2 \sim \chi^2_n$
If $X_1, X_2, \ldots, X_n$ are a random sample from a normal distribution, then $\bar{X}$ and $S^2$ are independent.
If $X_1, X_2, \ldots, X_n$ are a random sample from a normal distribution, then $(n - 1) S^2 / \sigma^2 \sim \chi^2_{n-1}$.
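
The second property, that a sum of n squared independent standard normals is chi-squared with n degrees of freedom, can be illustrated by simulation. The sketch relies on the standard facts that a chi-squared variable with n degrees of freedom has mean n and variance 2n; the degrees of freedom, number of replications and seed are arbitrary choices.

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

def chi_square_draw(n):
    """One draw of Z_1^2 + ... + Z_n^2 with independent standard normal Z_i."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n))

df, reps = 5, 20_000
draws = [chi_square_draw(df) for _ in range(reps)]

sample_mean = sum(draws) / reps
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (reps - 1)

# Compare with the theoretical mean df and variance 2 * df.
print(sample_mean, sample_var)
```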


t distribution

Let Z be a standard normal rv and let X be a $\chi^2_v$ rv independent of Z. Then the t distribution with degrees of freedom v is defined to be the distribution of the ratio

$$T = \frac{Z}{\sqrt{X / v}}$$

Sometimes we will include a subscript to indicate the df, $t = t_v$

If $X_1, X_2, \ldots, X_n$ is a random sample from a normal distribution $N(\mu, \sigma^2)$, then

$$T = \frac{\bar{X} - \mu}{S / \sqrt{n}}$$

has the t distribution with (n − 1) degrees of freedom, $t_{n-1}$
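
Computing the statistic T from data takes only the sample mean and sample standard deviation. In the sketch below the five observations and the hypothesised mean of 5.0 are made-up values for illustration, and `t_statistic` is an illustrative name.

```python
import math
import statistics

def t_statistic(sample, mu):
    """(sample mean - mu) / (S / sqrt(n)), with S the sample standard deviation."""
    n = len(sample)
    return (statistics.mean(sample) - mu) / (statistics.stdev(sample) / math.sqrt(n))

sample = [4.8, 5.1, 5.3, 4.9, 5.4]  # HYPOTHETICAL observations
t_value = t_statistic(sample, mu=5.0)
df = len(sample) - 1                # degrees of freedom n - 1

print(round(t_value, 3), df)
```

Note that `statistics.stdev` uses the divisor n − 1, which matches the definition of S used here.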

F distribution

Let $X_1$ and $X_2$ be independent chi-squared random variables with $v_1$ and $v_2$ degrees of freedom, respectively. The F distribution with $v_1$ numerator degrees of freedom and $v_2$ denominator degrees of freedom is defined to be the distribution of the ratio

$$F = \frac{X_1 / v_1}{X_2 / v_2}$$


Suppose that we have a random sample of m observations from the normal population $N(\mu_1, \sigma_1^2)$ and an independent random sample of n observations from a second normal population $N(\mu_2, \sigma_2^2)$. Then for the sample variance from the first group we know $(m - 1) S_1^2 / \sigma_1^2$ is $\chi^2_{m-1}$, and similarly for the second group $(n - 1) S_2^2 / \sigma_2^2$ is $\chi^2_{n-1}$. Thus,

$$F_{m-1, n-1} = \frac{\dfrac{(m - 1) S_1^2 / \sigma_1^2}{m - 1}}{\dfrac{(n - 1) S_2^2 / \sigma_2^2}{n - 1}} = \frac{S_1^2 / \sigma_1^2}{S_2^2 / \sigma_2^2}$$
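
The ratio can be computed from two samples as follows. The data are made-up values for illustration, and the population variances default to 1 (so that they cancel) purely as an assumption of this sketch; `f_statistic` is an illustrative name.

```python
import statistics

def f_statistic(sample1, sample2, sigma1_sq=1.0, sigma2_sq=1.0):
    """(S_1^2 / sigma_1^2) / (S_2^2 / sigma_2^2) for two independent samples."""
    s1_sq = statistics.variance(sample1)  # sample variance, divisor m - 1
    s2_sq = statistics.variance(sample2)  # sample variance, divisor n - 1
    return (s1_sq / sigma1_sq) / (s2_sq / sigma2_sq)

first = [4.8, 5.1, 5.3, 4.9, 5.4]   # HYPOTHETICAL sample of m = 5 observations
second = [2.0, 2.5, 3.0, 3.5]       # HYPOTHETICAL sample of n = 4 observations

f_value = f_statistic(first, second)
dfs = (len(first) - 1, len(second) - 1)  # (m - 1, n - 1)

print(round(f_value, 3), dfs)
```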
