$EX = \mu$, $\operatorname{var}(X) = \mu^3/\lambda$, and
$$M(t) = E\exp(tX) = \exp\left\{\frac{\lambda}{\mu}\left[1 - \left(1 - \frac{2t\mu^2}{\lambda}\right)^{1/2}\right]\right\}.$$
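As a quick check (not part of the original text), the stated moments can be recovered from this MGF by differentiation. A minimal sketch, assuming SymPy is available and the parametrization above:

```python
import sympy as sp

t, mu, lam = sp.symbols('t mu lambda', positive=True)
# MGF as given above: M(t) = exp{(lambda/mu)[1 - (1 - 2 t mu^2/lambda)^(1/2)]}
M = sp.exp((lam / mu) * (1 - sp.sqrt(1 - 2 * t * mu**2 / lam)))

EX = sp.diff(M, t).subs(t, 0)        # M'(0) = EX
EX2 = sp.diff(M, t, 2).subs(t, 0)    # M''(0) = E(X^2)
print(sp.simplify(EX))               # -> mu
print(sp.simplify(EX2 - EX**2))      # -> mu**3/lambda
```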
In this section we introduce the bivariate and multivariate normal distributions and investigate some of their important properties. We note that bivariate analogs of other PDFs are known, but they are not always uniquely identified. For example, there are several versions of bivariate exponential PDFs, so called because each has exponential marginals. We will not encounter any of these bivariate PDFs in this book.
The two-dimensional RV $(X, Y)$ is said to have a bivariate normal distribution if its joint PDF is of the form
$$f(x, y) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{Q(x, y)}{2}\right\}, \qquad (1)$$
where $\sigma_1 > 0$, $\sigma_2 > 0$, $|\rho| < 1$, and $Q$ is the positive definite quadratic form
$$Q(x, y) = \frac{1}{1-\rho^2}\left[\left(\frac{x-\mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{x-\mu_1}{\sigma_1}\right)\left(\frac{y-\mu_2}{\sigma_2}\right) + \left(\frac{y-\mu_2}{\sigma_2}\right)^2\right]. \qquad (2)$$
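To make (1) and (2) concrete, here is a minimal numerical sketch (the parameter values are illustrative, not from the text) that evaluates the density directly and checks it against SciPy's built-in bivariate normal:

```python
import numpy as np
from scipy.stats import multivariate_normal

mu1, mu2, s1, s2, rho = 1.0, -0.5, 2.0, 1.5, 0.6   # illustrative choices

def f(x, y):
    """Bivariate normal PDF assembled from (1) and (2)."""
    zx, zy = (x - mu1) / s1, (y - mu2) / s2
    Q = (zx**2 - 2 * rho * zx * zy + zy**2) / (1 - rho**2)
    return np.exp(-Q / 2) / (2 * np.pi * s1 * s2 * np.sqrt(1 - rho**2))

cov = [[s1**2, rho * s1 * s2], [rho * s1 * s2, s2**2]]
rv = multivariate_normal(mean=[mu1, mu2], cov=cov)
print(f(0.3, 0.7), rv.pdf([0.3, 0.7]))   # the two values agree
```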
[Figure: joint PDF surfaces of the bivariate normal for (a) ρ = −0.9, (b) ρ = −0.5, (c) ρ = 0.5, (d) ρ = 0.9.]
Fig. 1 (continued).
We first show that (1) indeed defines a joint PDF. In fact, we prove the following result.
Theorem 1. The function defined by (1) and (2) with $\sigma_1 > 0$, $\sigma_2 > 0$, $|\rho| < 1$ is a joint PDF. The marginal PDFs of $X$ and $Y$ are, respectively, $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$, and $\rho$ is the correlation coefficient between $X$ and $Y$.
Proof. Let $f_1(x) = \int_{-\infty}^{\infty} f(x, y)\, dy$. Note that
$$\begin{aligned}
(1-\rho^2)\,Q(x, y) &= \left(\frac{y-\mu_2}{\sigma_2} - \rho\,\frac{x-\mu_1}{\sigma_1}\right)^2 + (1-\rho^2)\left(\frac{x-\mu_1}{\sigma_1}\right)^2 \\
&= \left\{\frac{y-[\mu_2+\rho(\sigma_2/\sigma_1)(x-\mu_1)]}{\sigma_2}\right\}^2 + (1-\rho^2)\left(\frac{x-\mu_1}{\sigma_1}\right)^2.
\end{aligned}$$
It follows that
$$f_1(x) = \frac{1}{\sigma_1\sqrt{2\pi}}\exp\left\{\frac{-(x-\mu_1)^2}{2\sigma_1^2}\right\}\int_{-\infty}^{\infty}\frac{\exp\{-(y-\beta_x)^2/[2\sigma_2^2(1-\rho^2)]\}}{\sigma_2\sqrt{1-\rho^2}\,\sqrt{2\pi}}\, dy, \qquad (3)$$
where
$$\beta_x = \mu_2 + \rho\,\frac{\sigma_2}{\sigma_1}(x-\mu_1). \qquad (4)$$
The integrand in (3) is the PDF of an $N(\beta_x, \sigma_2^2(1-\rho^2))$ RV, so the inner integral equals 1.
Thus
$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\, dy\, dx = \int_{-\infty}^{\infty} f_1(x)\, dx = 1,$$
and $f(x, y)$ is a joint PDF of two RVs of the continuous type. It also follows that $f_1$ is the marginal PDF of $X$, so that $X$ is $N(\mu_1, \sigma_1^2)$. In a similar manner we can show that $Y$ is $N(\mu_2, \sigma_2^2)$.
Furthermore, we have
$$\frac{f(x, y)}{f_1(x)} = \frac{1}{\sigma_2\sqrt{1-\rho^2}\,\sqrt{2\pi}}\exp\left\{\frac{-(y-\beta_x)^2}{2\sigma_2^2(1-\rho^2)}\right\}, \qquad (5)$$
where $\beta_x$ is given by (4). It is clear, then, that the conditional PDF $f_{Y|X}(y \mid x)$ given by (5) is also normal, with parameters $\beta_x$ and $\sigma_2^2(1-\rho^2)$. We have
$$E\{Y \mid x\} = \beta_x = \mu_2 + \rho\,\frac{\sigma_2}{\sigma_1}(x-\mu_1) \qquad (6)$$
and
$$\begin{aligned}
E(XY) &= E\{E\{XY \mid X\}\} \\
&= E\left\{X\left[\mu_2 + \rho\,\frac{\sigma_2}{\sigma_1}(X-\mu_1)\right]\right\} \\
&= \mu_1\mu_2 + \frac{\rho\sigma_2}{\sigma_1}\,\sigma_1^2.
\end{aligned}$$
It follows that $\operatorname{cov}(X, Y) = E(XY) - \mu_1\mu_2 = \rho\sigma_1\sigma_2$, so that $\rho$ is indeed the correlation coefficient between $X$ and $Y$.
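A short simulation (again with arbitrary parameters, NumPy assumed) illustrates the content of Theorem 1 and the regression function (6): the sampled marginals have the stated means and standard deviations, the sample correlation is close to $\rho$, and the conditional mean of $Y$ near a fixed $x$ tracks $\beta_x$:

```python
import numpy as np

rng = np.random.default_rng(0)
mu1, mu2, s1, s2, rho = 1.0, -0.5, 2.0, 1.5, 0.6
cov = [[s1**2, rho * s1 * s2], [rho * s1 * s2, s2**2]]
x, y = rng.multivariate_normal([mu1, mu2], cov, size=200_000).T

print(x.mean(), x.std())              # approx mu1, sigma1
print(y.mean(), y.std())              # approx mu2, sigma2
print(np.corrcoef(x, y)[0, 1])        # approx rho
# E{Y | x} from (6), checked empirically in a thin slice around x = 2
sel = np.abs(x - 2.0) < 0.05
print(y[sel].mean(), mu2 + rho * (s2 / s1) * (2.0 - mu1))
```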
Next we compute the MGF M(t1 , t2 ) of a bivariate normal RV (X, Y). We have, if f (x, y)
is the PDF given in (1) and f1 is the marginal PDF of X,
$$\begin{aligned}
M(t_1, t_2) &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{t_1 x + t_2 y} f(x, y)\, dx\, dy \\
&= \int_{-\infty}^{\infty}\left[\int_{-\infty}^{\infty} f_{Y|X}(y \mid x)\, e^{t_2 y}\, dy\right] e^{t_1 x} f_1(x)\, dx \\
&= \int_{-\infty}^{\infty} e^{t_1 x} f_1(x) \exp\left\{\frac{1}{2}\sigma_2^2 t_2^2(1-\rho^2) + t_2\left[\mu_2 + \rho\,\frac{\sigma_2}{\sigma_1}(x-\mu_1)\right]\right\} dx \\
&= \exp\left\{\frac{1}{2}\sigma_2^2 t_2^2(1-\rho^2) + t_2\mu_2 - \rho t_2\mu_1\frac{\sigma_2}{\sigma_1}\right\}\int_{-\infty}^{\infty} e^{t_1 x}\, e^{(\rho\sigma_2/\sigma_1)t_2 x} f_1(x)\, dx.
\end{aligned}$$
Now
$$\int_{-\infty}^{\infty} e^{(t_1+\rho t_2\sigma_2/\sigma_1)x} f_1(x)\, dx = \exp\left\{\mu_1\left(t_1 + \rho\,\frac{\sigma_2}{\sigma_1}t_2\right) + \frac{1}{2}\sigma_1^2\left(t_1 + \rho\,\frac{\sigma_2}{\sigma_1}t_2\right)^2\right\}.$$
Therefore, collecting terms,
$$M(t_1, t_2) = \exp\left\{t_1\mu_1 + t_2\mu_2 + \frac{1}{2}\left(\sigma_1^2 t_1^2 + 2\rho\sigma_1\sigma_2 t_1 t_2 + \sigma_2^2 t_2^2\right)\right\}.$$
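The closed form is easy to validate by Monte Carlo; the sketch below (arbitrary parameters of our choosing) compares the sample mean of $\exp(t_1X + t_2Y)$ with the formula:

```python
import numpy as np

rng = np.random.default_rng(1)
mu1, mu2, s1, s2, rho = 0.2, -0.1, 1.0, 0.8, -0.4
t1, t2 = 0.3, 0.5
cov = [[s1**2, rho * s1 * s2], [rho * s1 * s2, s2**2]]
x, y = rng.multivariate_normal([mu1, mu2], cov, size=1_000_000).T

mc = np.exp(t1 * x + t2 * y).mean()   # Monte Carlo estimate of M(t1, t2)
exact = np.exp(t1 * mu1 + t2 * mu2
               + 0.5 * (s1**2 * t1**2 + 2 * rho * s1 * s2 * t1 * t2
                        + s2**2 * t2**2))
print(mc, exact)                      # close, up to sampling noise
```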
Theorem 2. If (X, Y) has a bivariate normal distribution, X and Y are independent if and
only if ρ = 0.
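A quick justification from the MGF just computed: $X$ and $Y$ are independent if and only if $M(t_1, t_2) = M(t_1, 0)\,M(0, t_2)$ for all $t_1, t_2$, and
$$\frac{M(t_1, t_2)}{M(t_1, 0)\,M(0, t_2)} = e^{\rho\sigma_1\sigma_2 t_1 t_2},$$
which is identically 1 if and only if $\rho = 0$.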
Remark 2. It is quite possible for an RV (X, Y) to have a bivariate density such that the
marginal densities of X and Y are normal and the correlation coefficient is 0, yet X and Y
are not independent. Indeed, if the marginal densities of X and Y are normal, it does not
follow that the joint density of (X, Y) is a bivariate normal. Let
$$\begin{aligned}
f(x, y) = \frac{1}{2}\,&\frac{1}{2\pi(1-\rho^2)^{1/2}}\exp\left\{\frac{-1}{2(1-\rho^2)}\left(x^2 - 2\rho xy + y^2\right)\right\} \\
{}+ \frac{1}{2}\,&\frac{1}{2\pi(1-\rho^2)^{1/2}}\exp\left\{\frac{-1}{2(1-\rho^2)}\left(x^2 + 2\rho xy + y^2\right)\right\}. \qquad (9)
\end{aligned}$$
Here f (x, y) is a joint PDF such that both marginal densities are normal, f (x, y) is not
bivariate normal, and X and Y have zero correlation. But X and Y are not independent. We
have
$$f_1(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \qquad -\infty < x < \infty,$$
$$f_2(y) = \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}, \qquad -\infty < y < \infty,$$
$$EXY = 0.$$
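The failure of independence can be seen numerically: at a test point the mixture density (9) differs from the product of its marginals. A sketch with $\rho = 0.8$ (any $0 < |\rho| < 1$ exhibits the effect):

```python
import numpy as np

rho = 0.8

def component(x, y, r):
    """Bivariate normal density, zero means, unit variances, correlation r."""
    return np.exp(-(x**2 - 2 * r * x * y + y**2) / (2 * (1 - r**2))) \
        / (2 * np.pi * np.sqrt(1 - r**2))

def f(x, y):
    """The equal-weight mixture (9)."""
    return 0.5 * component(x, y, rho) + 0.5 * component(x, y, -rho)

def phi(u):
    """Standard normal marginal density."""
    return np.exp(-u**2 / 2) / np.sqrt(2 * np.pi)

print(f(1.0, 1.0))            # approx 0.0770
print(phi(1.0) * phi(1.0))    # approx 0.0585: f != f1*f2, so X, Y dependent
```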
Example 1. (Rosenberg [93]). Let f and g be PDFs with corresponding DFs F and G.
Also, let
$$h(x, y) = f(x)g(y)\{1 + \alpha[2F(x) - 1][2G(y) - 1]\}, \qquad (10)$$
where $|\alpha| \le 1$ is a constant. It was shown in Example 4.3.1 that $h$ is a bivariate density function with given marginal densities $f$ and $g$.
In particular, take f and g to be the PDF of N(0, 1), that is,
$$f(x) = g(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \qquad -\infty < x < \infty, \qquad (11)$$
and let (X, Y) have the joint PDF h(x, y). We will show that X + Y is not normal except in
the trivial case α = 0, when X and Y are independent.
Let $Z = X + Y$. Then $EZ = 0$ and $\operatorname{var}(Z) = \operatorname{var}(X) + \operatorname{var}(Y) + 2\operatorname{cov}(X, Y)$. It is easy to show (Problem 2) that $\operatorname{cov}(X, Y) = \alpha/\pi$, so that $\operatorname{var}(Z) = 2[1 + (\alpha/\pi)]$. If $Z$ is normal, its MGF must be
$$M_Z(t) = e^{t^2[1+(\alpha/\pi)]}. \qquad (12)$$
Next we compute the MGF of Z directly from the joint PDF (10). We have
$$\begin{aligned}
M_1(t) = E\{e^{tX+tY}\} &= e^{t^2} + \alpha\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{tx+ty}[2F(x)-1][2F(y)-1]f(x)f(y)\, dx\, dy \\
&= e^{t^2} + \alpha\left\{\int_{-\infty}^{\infty} e^{tx}[2F(x)-1]f(x)\, dx\right\}^2.
\end{aligned}$$
Now
$$\begin{aligned}
\int_{-\infty}^{\infty} e^{tx}[2F(x)-1]f(x)\,dx &= e^{t^2/2} - 2\int_{-\infty}^{\infty} e^{tx}[1-F(x)]f(x)\,dx \\
&= e^{t^2/2} - 2\int_{-\infty}^{\infty}\int_x^{\infty} \frac{1}{2\pi}\exp\left\{-\frac{1}{2}\left(x^2+u^2-2tx\right)\right\} du\, dx \\
&= e^{t^2/2} - \int_{-\infty}^{\infty}\int_0^{\infty} \frac{\exp\left\{-\frac{1}{2}[x^2+(v+x)^2-2tx]\right\}}{\pi}\, dv\, dx \\
&= e^{t^2/2} - \int_0^{\infty} \frac{\exp\{-v^2/2+(v-t)^2/4\}}{\sqrt{\pi}}\int_{-\infty}^{\infty} \frac{\exp\{-[x+(v-t)/2]^2\}}{\sqrt{\pi}}\, dx\, dv \\
&= e^{t^2/2} - 2e^{t^2/2}\int_0^{\infty} \frac{\exp\{-(v+t)^2/4\}}{2\sqrt{\pi}}\, dv \\
&= e^{t^2/2} - 2e^{t^2/2}\, P\left(Z_1 > \frac{t}{\sqrt{2}}\right), \qquad (13)
\end{aligned}$$
where Z1 is an N(0, 1) RV.
It follows that
$$\begin{aligned}
M_1(t) &= e^{t^2} + \alpha\left[e^{t^2/2} - 2e^{t^2/2}\, P\left(Z_1 > \frac{t}{\sqrt{2}}\right)\right]^2 \\
&= e^{t^2}\left\{1 + \alpha\left[1 - 2P\left(Z_1 > \frac{t}{\sqrt{2}}\right)\right]^2\right\}. \qquad (14)
\end{aligned}$$
If $Z$ were normally distributed, we would have $M_Z(t) = M_1(t)$ for all $t$ and all $|\alpha| \le 1$, that is,
$$e^{t^2} e^{(\alpha/\pi)t^2} = e^{t^2}\left\{1 + \alpha\left[1 - 2P\left(Z_1 > \frac{t}{\sqrt{2}}\right)\right]^2\right\}. \qquad (15)$$
For $\alpha = 0$, the equality clearly holds. The expression within the braces on the right side of (15) is bounded by $1 + \alpha$, whereas for $\alpha > 0$ the factor $e^{(\alpha/\pi)t^2}$ is unbounded, so the equality cannot hold for all $t$ and $\alpha$.
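The two sides of (15) separate quickly as $t$ grows; this sketch (taking $\alpha = 1$ for definiteness, and using SciPy's standard normal survival function for $P(Z_1 > t/\sqrt{2})$) evaluates both:

```python
import numpy as np
from scipy.stats import norm

alpha = 1.0
for t in [0.5, 1.0, 2.0, 3.0]:
    lhs = np.exp(t**2) * np.exp((alpha / np.pi) * t**2)   # (12): Z normal
    rhs = np.exp(t**2) * (1 + alpha
                          * (1 - 2 * norm.sf(t / np.sqrt(2)))**2)  # (14)
    print(t, lhs, rhs)   # lhs grows strictly faster than rhs
```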
Let $x$ denote the column vector of real numbers $(x_1, x_2, \ldots, x_n)'$, and let $\mu$ denote the column vector $(\mu_1, \mu_2, \ldots, \mu_n)'$, where the $\mu_i$ $(i = 1, 2, \ldots, n)$ are real constants. Let $M$ be an $n \times n$ real, symmetric, positive definite matrix. Then
$$f(x) = c\exp\left\{-\frac{(x-\mu)'M(x-\mu)}{2}\right\} \qquad (16)$$
defines the joint PDF of some random vector $X = (X_1, X_2, \ldots, X_n)'$, provided that the constant $c$ is chosen appropriately. The MGF of $X$ exists and is given by
$$M(t_1, t_2, \ldots, t_n) = \exp\left\{t'\mu + \frac{t'M^{-1}t}{2}\right\}, \qquad (17)$$
where $t = (t_1, t_2, \ldots, t_n)'$.
Proof. Let
$$I = c\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}\exp\left\{t'x - \frac{(x-\mu)'M(x-\mu)}{2}\right\}\prod_{i=1}^n dx_i. \qquad (18)$$
Since $M$ is positive definite, it follows that all the $n$ characteristic roots of $M$, say $m_1, m_2, \ldots, m_n$, are positive. Moreover, since $M$ is symmetric there exists an $n \times n$ orthogonal matrix $L$ such that $L'ML$ is a diagonal matrix with diagonal elements $m_1, m_2, \ldots, m_n$. Substituting $y = x - \mu$ in (18), we get
$$I = c\exp(t'\mu)\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}\exp\left\{t'y - \frac{y'My}{2}\right\}\prod_{i=1}^n dy_i. \qquad (19)$$
Let us now change the variables to $z_1, z_2, \ldots, z_n$ by writing $y = Lz$, where $z = (z_1, z_2, \ldots, z_n)'$, and note that the Jacobian of this orthogonal transformation is $|L|$. Since $L'L = I_n$, where $I_n$ is the $n \times n$ unit matrix, the Jacobian has absolute value 1, and we have
$$I = c\exp(t'\mu)\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}\exp\left\{t'Lz - \frac{z'L'MLz}{2}\right\}\prod_{i=1}^n dz_i. \qquad (20)$$
It follows, on writing $u = L't$ and integrating coordinate by coordinate, that
$$I = c\exp(t'\mu)\,\frac{(2\pi)^{n/2}}{(m_1 m_2 \cdots m_n)^{1/2}}\exp\left\{\sum_{i=1}^n \frac{u_i^2}{2m_i}\right\}. \qquad (21)$$
Choose
$$c = \frac{(m_1 m_2 \cdots m_n)^{1/2}}{(2\pi)^{n/2}}, \qquad (22)$$
and note that
$$\sum_{i=1}^n \frac{u_i^2}{m_i} = u'(L'M^{-1}L)u = t'M^{-1}t.$$
Also, taking $t = 0$ in (18) shows that $I = 1$ with this choice of $c$, so that (16) indeed defines a joint PDF.
It follows from (21) and (22) that the MGF of $X$ is given by (17). Since $m_1 m_2 \cdots m_n = |M|$, we may also write
$$c = \frac{1}{\{(2\pi)^n |M^{-1}|\}^{1/2}}. \qquad (23)$$
Writing $M^{-1} = (\sigma_{ij})$ and setting all but one (or two) of the $t_i$ equal to 0 in (17), we get
$$M(0, 0, \ldots, 0, t_i, 0, \ldots, 0) = \exp\left\{t_i\mu_i + \frac{t_i^2}{2}\,\sigma_{ii}\right\},$$
which is the MGF of an $N(\mu_i, \sigma_{ii})$ RV, and
$$M(0, 0, \ldots, 0, t_i, 0, \ldots, 0, t_j, 0, \ldots, 0) = \exp\left\{t_i\mu_i + t_j\mu_j + \frac{\sigma_{ii}t_i^2 + 2\sigma_{ij}t_it_j + \sigma_{jj}t_j^2}{2}\right\}.$$
This is the MGF of a bivariate normal distribution with means μi , μj , variances σii , σjj ,
and covariance σij . Thus we see that
$$EX = \mu = (\mu_1, \mu_2, \ldots, \mu_n)' \qquad (24)$$
and
$$\sigma_{ij} = \operatorname{cov}(X_i, X_j), \qquad i, j = 1, 2, \ldots, n.$$
The matrix M−1 is called the dispersion (variance-covariance) matrix of the multivariate
normal distribution.
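To make the correspondence concrete: given a dispersion matrix $\Sigma$, the matrix appearing in (16) is $M = \Sigma^{-1}$, and (22) and (23) give the same normalizing constant. A numerical sketch with an arbitrarily chosen $3 \times 3$ $\Sigma$ (NumPy assumed):

```python
import numpy as np

Sigma = np.array([[2.0, 0.6, 0.3],
                  [0.6, 1.0, 0.2],
                  [0.3, 0.2, 1.5]])     # dispersion matrix, i.e., M^{-1}
M = np.linalg.inv(Sigma)                # the matrix in (16)
n = Sigma.shape[0]

m = np.linalg.eigvalsh(M)               # characteristic roots m_1, ..., m_n
c_22 = np.sqrt(np.prod(m)) / (2 * np.pi) ** (n / 2)          # via (22)
c_23 = 1 / np.sqrt((2 * np.pi) ** n * np.linalg.det(Sigma))  # via (23)
print(c_22, c_23)                       # identical

# samples drawn with this dispersion matrix reproduce it empirically
rng = np.random.default_rng(2)
X = rng.multivariate_normal(np.zeros(n), Sigma, size=200_000)
print(np.cov(X.T))                      # approx Sigma
```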
If $\sigma_{ij} = 0$ for $i \ne j$, the matrix $M^{-1}$ is a diagonal matrix, and it follows that the RVs $X_1, X_2, \ldots, X_n$ are independent. Thus we have the following analog of Theorem 2.

Theorem 4. Let $(X_1, X_2, \ldots, X_n)$ have a multivariate normal distribution. Then $X_1, X_2, \ldots, X_n$ are independent if and only if $\sigma_{ij} = 0$ for all $i \ne j$.
The following result is stated without proof. The proof is similar to the two-variate case, except that now we consider the quadratic form in $n$ variables: $E\{\sum_{i=1}^n t_i(X_i - \mu_i)\}^2 \ge 0$.
Theorem 5. The probability that the RVs X1 , X2 , . . . , Xn with finite variances satisfy at
least one linear relationship is 1 if and only if |M| = 0.
Accordingly, if |M| = 0 all the probability mass is concentrated on a hyperplane of
dimension < n.
Next consider linear functions of $X_1, X_2, \ldots, X_n$:
$$Y_p = \sum_{j=1}^n A_{pj}X_j, \qquad p = 1, 2, \ldots, k; \; k \le n. \qquad (27)$$
Then
$$\operatorname{cov}(Y_p, Y_q) = \sum_{i,j=1}^n A_{pi}A_{qj}\sigma_{ij}. \qquad (28)$$
Writing $u_j = \sum_{p=1}^k t_pA_{pj}$, $j = 1, 2, \ldots, n$, we have for the MGF $M^*$ of $(Y_1, Y_2, \ldots, Y_k)$ (taking $\mu = 0$ for simplicity)
$$\begin{aligned}
M^*(t_1, t_2, \ldots, t_k) &= E\exp\left\{\sum_{i=1}^n u_iX_i\right\} \\
&= \exp\left\{\frac{1}{2}\sum_{i,j=1}^n \sigma_{ij}u_iu_j\right\} \qquad \text{by (17)} \\
&= \exp\left\{\frac{1}{2}\sum_{i,j=1}^n \sigma_{ij}\sum_{l,m=1}^k t_lt_mA_{li}A_{mj}\right\} \\
&= \exp\left\{\frac{1}{2}\sum_{l,m=1}^k t_lt_m\sum_{i,j=1}^n A_{li}A_{mj}\sigma_{ij}\right\} \\
&= \exp\left\{\frac{1}{2}\sum_{l,m=1}^k t_lt_m\operatorname{cov}(Y_l, Y_m)\right\}. \qquad (29)
\end{aligned}$$
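Relation (28) is just $\operatorname{cov}(Y) = A\Sigma A'$ in matrix form, where $\Sigma = (\sigma_{ij})$; a quick empirical check with an arbitrary $A$ and $\Sigma$:

```python
import numpy as np

rng = np.random.default_rng(3)
Sigma = np.array([[1.0, 0.4, 0.1],
                  [0.4, 2.0, 0.3],
                  [0.1, 0.3, 1.5]])
A = np.array([[1.0, -1.0, 0.0],
              [0.5,  0.5, 2.0]])        # k = 2 linear functions of n = 3 RVs

X = rng.multivariate_normal(np.zeros(3), Sigma, size=300_000)
Y = X @ A.T                             # Y_p = sum_j A_pj X_j, as in (27)
print(np.cov(Y.T))                      # empirical covariance of (Y1, Y2)
print(A @ Sigma @ A.T)                  # exact value from (28)
```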
Comparing (29) with (17), we see that $(Y_1, Y_2, \ldots, Y_k)$ again has a multivariate normal distribution. Finally, $X$ has a multivariate normal distribution if and only if every linear function
$$X't = t_1X_1 + t_2X_2 + \cdots + t_nX_n$$
is normally distributed.

Proof. Suppose that $X't$ is normal for every $t$. Then the MGF of $X't$ is given by
$$M(s) = \exp\left\{bs + \frac{1}{2}\sigma^2s^2\right\}. \qquad (30)$$
Since $b = E(X't) = t'\mu$ and $\sigma^2 = \operatorname{var}(X't) = t'M^{-1}t$, this means
$$M(s) = \exp\left\{t'\mu\, s + \frac{1}{2}\, t'M^{-1}t\, s^2\right\}. \qquad (31)$$
Taking $s = 1$, we get
$$M(1) = \exp\left\{t'\mu + \frac{1}{2}\, t'M^{-1}t\right\}, \qquad (32)$$
and since (32), viewed as a function of $t$, coincides with (17) and the MGF is unique, it follows that $X$ has a multivariate normal distribution. The converse follows from Corollary 1 to Theorem 6.
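As an illustration of the forward direction, a fixed $t$ applied to multivariate normal samples produces a univariate sample whose mean and variance match $t'\mu$ and $t'M^{-1}t$ (a sketch; the parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[1.0, 0.2, 0.0],
                  [0.2, 1.5, 0.3],
                  [0.0, 0.3, 0.8]])     # dispersion matrix M^{-1}
t = np.array([0.7, -0.4, 1.2])

Z = rng.multivariate_normal(mu, Sigma, size=200_000) @ t   # samples of X't
print(Z.mean(), t @ mu)                 # approx t'mu
print(Z.var(), t @ Sigma @ t)           # approx t' M^{-1} t
```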
Many characterization results for the multivariate normal distribution are now available.
We refer the reader to Lukacs and Laha [70, p. 79].
PROBLEMS 5.4
is a joint PDF on Rn .
(b) Let (X1 , X2 , . . . , Xn ) have PDF f given in (a). Show that the RVs in any proper
subset of {X1 , X2 , . . . , Xn } containing two or more elements are independent
standard normal RVs.
Most of the distributions that we have so far encountered belong to a general family of
distributions that we now study. Let Θ be an interval on the real line, and let {fθ : θ ∈ Θ}
be a family of PDFs (PMFs). Here and in what follows we write x = (x1 , x2 , . . . , xn ) unless
otherwise specified.
Definition 1. If there exist real-valued functions Q(θ) and D(θ) on Θ and Borel-
measurable functions T(x1 , x2 , . . . , xn ) and S(x1 , x2 , . . . , xn ) on Rn such that