
4.2 Variance and Covariance of Random Variables: Definition 4.4

The document discusses the concepts of variance and covariance for random variables, defining covariance as a measure of the relationship between two variables. It presents formulas for calculating covariance for both discrete and continuous random variables, along with examples to illustrate the calculations. Additionally, it introduces the correlation coefficient as a scale-free measure of the strength of the relationship between two variables, highlighting its properties and providing further examples.


Definition 4.4: Let $X$ and $Y$ be random variables with joint probability distribution $f(x, y)$. The
covariance of $X$ and $Y$ is

$$\sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)] = \sum_x \sum_y (x - \mu_X)(y - \mu_Y) f(x, y)$$

if $X$ and $Y$ are discrete, and

$$\sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y) f(x, y) \, dx \, dy$$

if $X$ and $Y$ are continuous.


The covariance between two random variables is a measure of the nature of the
association between the two. If large values of $X$ often result in large values of $Y$
or small values of $X$ result in small values of $Y$, positive $X - \mu_X$ will often result in
positive $Y - \mu_Y$ and negative $X - \mu_X$ will often result in negative $Y - \mu_Y$. Thus, the
product $(X - \mu_X)(Y - \mu_Y)$ will tend to be positive. On the other hand, if large $X$
values often result in small $Y$ values, the product $(X - \mu_X)(Y - \mu_Y)$ will tend to be
negative. The sign of the covariance indicates whether the relationship between two
dependent random variables is positive or negative. When $X$ and $Y$ are statistically
independent, it can be shown that the covariance is zero (see Corollary 4.5). The
converse, however, is not generally true. Two variables may have zero covariance
and still not be statistically independent. Note that the covariance describes only
the linear relationship between two random variables. Therefore, if the covariance
between $X$ and $Y$ is zero, $X$ and $Y$ may still have a nonlinear relationship, so
zero covariance alone does not imply independence.
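
A small numeric illustration of this last point may help. The sketch below uses a distribution chosen here for illustration and not taken from the text: $X$ uniform on $\{-1, 0, 1\}$ with $Y = X^2$. The names `pmf_x` and `joint` are likewise assumptions. The covariance comes out to zero even though $Y$ is completely determined by $X$.

```python
from fractions import Fraction

# Hypothetical example (not from the text): X is uniform on {-1, 0, 1} and Y = X**2.
pmf_x = {-1: Fraction(1, 3), 0: Fraction(1, 3), 1: Fraction(1, 3)}

# Joint pmf f(x, y): all probability mass sits on the points (x, x**2).
joint = {(x, x * x): p for x, p in pmf_x.items()}

mu_x = sum(x * p for (x, y), p in joint.items())
mu_y = sum(y * p for (x, y), p in joint.items())

# Covariance straight from Definition 4.4 (discrete case).
cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

# Independence would require f(x, y) = g(x) h(y) at every point; check (0, 1).
g0 = sum(p for (x, y), p in joint.items() if x == 0)  # P(X = 0) = 1/3
h1 = sum(p for (x, y), p in joint.items() if y == 1)  # P(Y = 1) = 2/3
independent_at_0_1 = joint.get((0, 1), Fraction(0)) == g0 * h1

print(cov)                 # 0
print(independent_at_0_1)  # False
```

Here $f(0, 1) = 0$ while $g(0)h(1) = 2/9$, so the factorization required for independence fails at that point.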

The alternative and preferred formula for $\sigma_{XY}$ is stated by Theorem 4.4.

Theorem 4.4: The covariance of two random variables $X$ and $Y$ with means $\mu_X$ and $\mu_Y$,
respectively, is given by

$$\sigma_{XY} = E(XY) - \mu_X \mu_Y.$$

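Proof: For the discrete case, we can write

$$\begin{aligned}
\sigma_{XY} &= \sum_x \sum_y (x - \mu_X)(y - \mu_Y) f(x, y) \\
&= \sum_x \sum_y xy f(x, y) - \mu_X \sum_x \sum_y y f(x, y) \\
&\quad - \mu_Y \sum_x \sum_y x f(x, y) + \mu_X \mu_Y \sum_x \sum_y f(x, y).
\end{aligned}$$

Since

$$\mu_X = \sum_x \sum_y x f(x, y), \quad \mu_Y = \sum_x \sum_y y f(x, y), \quad \text{and} \quad \sum_x \sum_y f(x, y) = 1$$

for any joint discrete distribution, it follows that

$$\sigma_{XY} = E(XY) - \mu_X \mu_Y - \mu_Y \mu_X + \mu_X \mu_Y = E(XY) - \mu_X \mu_Y.$$

For the continuous case, the proof is identical with summations replaced by integrals.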
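
As a quick sanity check of the theorem, the sketch below evaluates both the defining formula and the shortcut formula on a small joint distribution. The probabilities and the helper name `expect` are made up for illustration and do not come from the text.

```python
from fractions import Fraction

# Hypothetical joint pmf on a 2 x 2 grid (illustrative values, not from the text).
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10),
    (1, 0): Fraction(3, 10), (1, 1): Fraction(4, 10),
}

def expect(func):
    """Return E[func(X, Y)] under the joint pmf."""
    return sum(func(x, y) * p for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)

cov_definition = expect(lambda x, y: (x - mu_x) * (y - mu_y))  # Definition 4.4
cov_shortcut = expect(lambda x, y: x * y) - mu_x * mu_y        # Theorem 4.4

print(cov_definition, cov_shortcut)  # -1/50 -1/50
```

Exact fractions are used so that the two results can be compared for equality without rounding error.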

Example 4.13: Example 3.14 on page 95 describes a situation involving the number of blue refills
$X$ and the number of red refills $Y$. Two refills for a ballpoint pen are selected at
random from a certain box, and the following is the joint probability distribution:

    f(x, y) |  x = 0   x = 1   x = 2 |  h(y)
    --------+------------------------+-------
     y = 0  |  3/28    9/28    3/28  |  15/28
     y = 1  |  3/14    3/14    0     |  3/7
     y = 2  |  1/28    0       0     |  1/28
    --------+------------------------+-------
      g(x)  |  5/14    15/28   3/28  |  1

Find the covariance of $X$ and $Y$.


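Solution: From Example 4.6, we see that $E(XY) = 3/14$. Now

$$\mu_X = \sum_{x=0}^{2} x\,g(x) = (0)\left(\frac{5}{14}\right) + (1)\left(\frac{15}{28}\right) + (2)\left(\frac{3}{28}\right) = \frac{3}{4},$$

and

$$\mu_Y = \sum_{y=0}^{2} y\,h(y) = (0)\left(\frac{15}{28}\right) + (1)\left(\frac{3}{7}\right) + (2)\left(\frac{1}{28}\right) = \frac{1}{2}.$$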

Therefore,

$$\sigma_{XY} = E(XY) - \mu_X \mu_Y = \frac{3}{14} - \left(\frac{3}{4}\right)\left(\frac{1}{2}\right) = -\frac{9}{56}.$$
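
The arithmetic of Example 4.13 can be reproduced directly from the joint table. The following is a minimal sketch using exact fractions; the dictionary layout and the helper name `expect` are choices made here, not part of the text.

```python
from fractions import Fraction as F

# Joint pmf f(x, y) from the table in Example 4.13; zero-probability cells are omitted.
joint = {
    (0, 0): F(3, 28), (1, 0): F(9, 28), (2, 0): F(3, 28),
    (0, 1): F(3, 14), (1, 1): F(3, 14),
    (0, 2): F(1, 28),
}

def expect(func):
    """Return E[func(X, Y)] under the joint pmf."""
    return sum(func(x, y) * p for (x, y), p in joint.items())

e_xy = expect(lambda x, y: x * y)
mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
cov = e_xy - mu_x * mu_y  # Theorem 4.4

print(e_xy, mu_x, mu_y, cov)  # 3/14 3/4 1/2 -9/56
```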

Example 4.14: The fraction $X$ of male runners and the fraction $Y$ of female runners who compete
in marathon races are described by the joint density function

$$f(x, y) = \begin{cases} 8xy, & 0 \le y \le x \le 1, \\ 0, & \text{elsewhere.} \end{cases}$$

Find the covariance of $X$ and $Y$.


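Solution: We first compute the marginal density functions. They are

$$g(x) = \begin{cases} 4x^3, & 0 \le x \le 1, \\ 0, & \text{elsewhere,} \end{cases}$$

and

$$h(y) = \begin{cases} 4y(1 - y^2), & 0 \le y \le 1, \\ 0, & \text{elsewhere.} \end{cases}$$

From these marginal density functions, we compute

$$\mu_X = E(X) = \int_0^1 4x^4 \, dx = \frac{4}{5} \quad \text{and} \quad \mu_Y = \int_0^1 4y^2(1 - y^2) \, dy = \frac{8}{15}.$$

From the joint density function given above, we have

$$E(XY) = \int_0^1 \int_y^1 8x^2 y^2 \, dx \, dy = \frac{4}{9}.$$

Then

$$\sigma_{XY} = E(XY) - \mu_X \mu_Y = \frac{4}{9} - \left(\frac{4}{5}\right)\left(\frac{8}{15}\right) = \frac{4}{225}.$$
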
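
The integrals in Example 4.14 can be checked exactly. The sketch below relies on a closed form that is worked out here rather than stated in the text: integrating $8x^{a+1}y^{b+1}$ over $0 \le y \le x \le 1$ gives $E(X^a Y^b) = 8 / [(b + 2)(a + b + 4)]$. The function name `moment` is an assumption made for this illustration.

```python
from fractions import Fraction

def moment(a, b):
    """E[X**a * Y**b] for the density f(x, y) = 8xy on 0 <= y <= x <= 1.

    Integrating 8 * x**(a+1) * y**(b+1) over y in [0, x] and then over
    x in [0, 1] gives 8 / ((b + 2) * (a + b + 4)).
    """
    return Fraction(8, (b + 2) * (a + b + 4))

total = moment(0, 0)   # total probability, should be 1
mu_x = moment(1, 0)
mu_y = moment(0, 1)
e_xy = moment(1, 1)
cov = e_xy - mu_x * mu_y  # Theorem 4.4

print(total, mu_x, mu_y, e_xy, cov)  # 1 4/5 8/15 4/9 4/225
```
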
Although the covariance between two random variables does provide information
regarding the nature of the relationship, the magnitude of $\sigma_{XY}$ does not indicate
anything regarding the strength of the relationship, since $\sigma_{XY}$ is not scale-free.
Its magnitude will depend on the units used to measure both $X$ and $Y$. There is a
scale-free version of the covariance called the correlation coefficient that is used
widely in statistics.
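
To make the dependence on units concrete, the following sketch recomputes the covariance for the joint table of Example 4.13 after rescaling both variables. The scale factors 100 and 10 are arbitrary stand-ins for a change of units, and the function name `covariance` is an assumption.

```python
from fractions import Fraction as F

# Joint pmf f(x, y) from Example 4.13; zero-probability cells are omitted.
joint = {
    (0, 0): F(3, 28), (1, 0): F(9, 28), (2, 0): F(3, 28),
    (0, 1): F(3, 14), (1, 1): F(3, 14),
    (0, 2): F(1, 28),
}

def covariance(scale_x=1, scale_y=1):
    """Covariance of (scale_x * X, scale_y * Y), computed via Theorem 4.4."""
    points = [((scale_x * x, scale_y * y), p) for (x, y), p in joint.items()]
    e_uv = sum(u * v * p for (u, v), p in points)
    mu_u = sum(u * p for (u, v), p in points)
    mu_v = sum(v * p for (u, v), p in points)
    return e_uv - mu_u * mu_v

cov = covariance()
cov_rescaled = covariance(scale_x=100, scale_y=10)  # hypothetical unit change

print(cov, cov_rescaled)  # -9/56 -1125/7
```

The rescaled covariance is exactly $100 \times 10 = 1000$ times the original, although the relationship between the two variables has not changed.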

Definition 4.5: Let $X$ and $Y$ be random variables with covariance $\sigma_{XY}$ and standard deviations
$\sigma_X$ and $\sigma_Y$, respectively. The correlation coefficient of $X$ and $Y$ is

$$\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}.$$

It should be clear to the reader that $\rho_{XY}$ is free of the units of $X$ and $Y$. The
correlation coefficient satisfies the inequality $-1 \le \rho_{XY} \le 1$. It assumes a value of
zero when $\sigma_{XY} = 0$. When there is an exact linear dependency, say $Y \equiv a + bX$,
$\rho_{XY} = 1$ if $b > 0$ and $\rho_{XY} = -1$ if $b < 0$. (See Exercise 4.48.) The correlation
coefficient is the subject of more discussion in Chapter 12, where we deal with
linear regression.
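
The linear-dependency claim can be checked numerically. The sketch below assumes a made-up distribution for $X$ and arbitrary values of $a$ and $b$, none of which come from the text; it illustrates the claim and does not replace the proof requested in Exercise 4.48.

```python
import math

# Hypothetical pmf for X (illustrative values, not from the text).
pmf_x = {0: 0.2, 1: 0.5, 2: 0.3}

def correlation_with_linear(a, b):
    """Return rho_XY for Y = a + b*X, computed from the definitions."""
    mu_x = sum(x * p for x, p in pmf_x.items())
    mu_y = sum((a + b * x) * p for x, p in pmf_x.items())
    cov = sum((x - mu_x) * (a + b * x - mu_y) * p for x, p in pmf_x.items())
    var_x = sum((x - mu_x) ** 2 * p for x, p in pmf_x.items())
    var_y = sum((a + b * x - mu_y) ** 2 * p for x, p in pmf_x.items())
    return cov / math.sqrt(var_x * var_y)

rho_up = correlation_with_linear(a=5.0, b=3.0)     # positive slope
rho_down = correlation_with_linear(a=5.0, b=-2.0)  # negative slope

print(rho_up, rho_down)
```

Up to floating-point rounding, the two printed values are $1$ and $-1$, whatever the intercept $a$.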

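Example 4.15: Find the correlation coefficient between $X$ and $Y$ in Example 4.13.

Solution: Since

$$E(X^2) = (0^2)\left(\frac{5}{14}\right) + (1^2)\left(\frac{15}{28}\right) + (2^2)\left(\frac{3}{28}\right) = \frac{27}{28}$$

and

$$E(Y^2) = (0^2)\left(\frac{15}{28}\right) + (1^2)\left(\frac{3}{7}\right) + (2^2)\left(\frac{1}{28}\right) = \frac{4}{7},$$

we obtain

$$\sigma_X^2 = \frac{27}{28} - \left(\frac{3}{4}\right)^2 = \frac{45}{112} \quad \text{and} \quad \sigma_Y^2 = \frac{4}{7} - \left(\frac{1}{2}\right)^2 = \frac{9}{28}.$$

Therefore, the correlation coefficient between $X$ and $Y$ is

$$\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y} = \frac{-9/56}{\sqrt{(45/112)(9/28)}} = -\frac{1}{\sqrt{5}}.$$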
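
The same joint table gives the variances and the correlation coefficient in a few lines. This is a sketch with names (`joint`, `expect`) chosen here; the variances are kept as exact fractions and only the final square root uses floating point.

```python
import math
from fractions import Fraction as F

# Joint pmf f(x, y) from Example 4.13; zero-probability cells are omitted.
joint = {
    (0, 0): F(3, 28), (1, 0): F(9, 28), (2, 0): F(3, 28),
    (0, 1): F(3, 14), (1, 1): F(3, 14),
    (0, 2): F(1, 28),
}

def expect(func):
    """Return E[func(X, Y)] under the joint pmf."""
    return sum(func(x, y) * p for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: x * x) - mu_x ** 2
var_y = expect(lambda x, y: y * y) - mu_y ** 2
cov = expect(lambda x, y: x * y) - mu_x * mu_y

# Definition 4.5: rho = cov / (sigma_X * sigma_Y).
rho = float(cov) / math.sqrt(float(var_x * var_y))

print(var_x, var_y)                            # 45/112 9/28
print(math.isclose(rho, -1 / math.sqrt(5)))    # True
```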

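Example 4.16: Find the correlation coefficient of $X$ and $Y$ in Example 4.14.

Solution: Because

$$E(X^2) = \int_0^1 4x^5 \, dx = \frac{2}{3} \quad \text{and} \quad E(Y^2) = \int_0^1 4y^3(1 - y^2) \, dy = 1 - \frac{2}{3} = \frac{1}{3},$$

we conclude that

$$\sigma_X^2 = \frac{2}{3} - \left(\frac{4}{5}\right)^2 = \frac{2}{75} \quad \text{and} \quad \sigma_Y^2 = \frac{1}{3} - \left(\frac{8}{15}\right)^2 = \frac{11}{225}.$$

Hence,

$$\rho_{XY} = \frac{4/225}{\sqrt{(2/75)(11/225)}} = \frac{4}{\sqrt{66}}.$$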
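
For the continuous example, the second moments and the correlation can be verified with the closed form $E(X^a Y^b) = 8 / [(b + 2)(a + b + 4)]$, which follows from integrating $8x^{a+1}y^{b+1}$ over $0 \le y \le x \le 1$. That closed form and the name `moment` are introduced here for illustration and are not part of the text.

```python
import math
from fractions import Fraction

def moment(a, b):
    """E[X**a * Y**b] for f(x, y) = 8xy on 0 <= y <= x <= 1 (closed form)."""
    return Fraction(8, (b + 2) * (a + b + 4))

var_x = moment(2, 0) - moment(1, 0) ** 2           # E(X^2) - mu_X^2
var_y = moment(0, 2) - moment(0, 1) ** 2           # E(Y^2) - mu_Y^2
cov = moment(1, 1) - moment(1, 0) * moment(0, 1)   # Theorem 4.4

# Definition 4.5: rho = cov / (sigma_X * sigma_Y).
rho = float(cov) / math.sqrt(float(var_x * var_y))

print(var_x, var_y, cov)                       # 2/75 11/225 4/225
print(math.isclose(rho, 4 / math.sqrt(66)))    # True
```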

Note that although the covariance in Example 4.15 is larger in magnitude (disregarding
the sign) than that in Example 4.16, the relationship of the magnitudes
of the correlation coefficients in these two examples is just the reverse. This is
evidence that we cannot look at the magnitude of the covariance to decide on how
strong the relationship is.
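
The reversal can be seen side by side. The sketch below simply plugs in the covariances and variances reported in Examples 4.13 through 4.16; the variable names are assumptions.

```python
import math
from fractions import Fraction

# Values reported in Examples 4.13/4.15 (discrete) and 4.14/4.16 (continuous).
cov_discrete, var_x_d, var_y_d = Fraction(-9, 56), Fraction(45, 112), Fraction(9, 28)
cov_cont, var_x_c, var_y_c = Fraction(4, 225), Fraction(2, 75), Fraction(11, 225)

rho_discrete = float(cov_discrete) / math.sqrt(float(var_x_d * var_y_d))
rho_cont = float(cov_cont) / math.sqrt(float(var_x_c * var_y_c))

covariance_larger_in_discrete = abs(cov_discrete) > abs(cov_cont)
correlation_larger_in_discrete = abs(rho_discrete) > abs(rho_cont)

print(covariance_larger_in_discrete)   # True
print(correlation_larger_in_discrete)  # False
```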

Exercises

4.33 Use Definition 4.3 on page 120 to find the variance of the random variable $X$ of
Exercise 4.7 on page 117.

4.34 Let $X$ be a random variable with the following probability distribution:

    x    |  −2    3     5
    f(x) |  0.3   0.2   0.5

Find the standard deviation of $X$.

4.35 The random variable $X$, representing the number of errors per 100 lines of
software code, has the following probability distribution:

    x    |  2      3      4     5     6
    f(x) |  0.01   0.25   0.4   0.3   0.04

Using Theorem 4.2 on page 121, find the variance of $X$.

4.36 Suppose that the probabilities are 0.4, 0.3, 0.2, and 0.1, respectively, that
0, 1, 2, or 3 power failures will strike a certain subdivision in any given year. Find
the mean and variance of the random variable $X$ representing the number of power
failures striking this subdivision.

4.37 A dealer's profit, in units of \$5000, on a new automobile is a random variable $X$
having the density function given in Exercise 4.12 on page 117. Find the variance of $X$.

4.38 The proportion of people who respond to a certain mail-order solicitation is a
random variable $X$ having the density function given in Exercise 4.14 on page 117.
Find the variance of $X$.

4.39 The total number of hours, in units of 100 hours, that a family runs a vacuum
cleaner over a period of one year is a random variable $X$ having the density
function given in Exercise 4.13 on page 117. Find the variance of $X$.

4.40 Referring to Exercise 4.14 on page 117, find $\sigma^2_{g(X)}$ for the function
$g(X) = 3X^2 + 4$.

4.41 Find the standard deviation of the random variable $g(X) = (2X + 1)^2$ in
Exercise 4.17 on page 118.

4.42 Using the results of Exercise 4.21 on page 118, find the variance of $g(X) = X^2$,
where $X$ is a random variable having the density function given in Exercise 4.12 on
page 117.

4.43 The length of time, in minutes, for an airplane to obtain clearance for takeoff
at a certain airport is a random variable $Y = 3X - 2$, where $X$ has the density
function

$$f(x) = \begin{cases} \frac{1}{4} e^{-x/4}, & x > 0, \\ 0, & \text{elsewhere.} \end{cases}$$

Find the mean and variance of the random variable $Y$.

4.44 Find the covariance of the random variables $X$ and $Y$ of Exercise 3.39 on page 105.

4.45 Find the covariance of the random variables $X$ and $Y$ of Exercise 3.49 on page 106.

4.46 Find the covariance of the random variables $X$ and $Y$ of Exercise 3.44 on page 105.

4.47 For the random variables $X$ and $Y$ whose joint density function is given in
Exercise 3.40 on page 105, find the covariance.

4.48 Given a random variable $X$, with standard deviation $\sigma_X$, and a random variable
$Y = a + bX$, show that if $b < 0$, the correlation coefficient $\rho_{XY} = -1$, and
if $b > 0$, $\rho_{XY} = 1$.

4.49 Consider the situation in Exercise 4.32 on page 119. The distribution of the
number of imperfections per 10 meters of synthetic fabric is given by

    x    |  0      1      2      3      4
    f(x) |  0.41   0.37   0.16   0.05   0.01

Find the variance and standard deviation of the number of imperfections.

4.50 For a laboratory assignment, if the equipment is working, the density function
of the observed outcome $X$ is

$$f(x) = \begin{cases} 2(1 - x), & 0 < x < 1, \\ 0, & \text{otherwise.} \end{cases}$$

Find the variance and standard deviation of $X$.

4.51 For the random variables $X$ and $Y$ in Exercise 3.39 on page 105, determine the
correlation coefficient between $X$ and $Y$.

4.52 Random variables $X$ and $Y$ follow a joint distribution

$$f(x, y) = \begin{cases} 2, & 0 < x \le y < 1, \\ 0, & \text{otherwise.} \end{cases}$$

Determine the correlation coefficient between $X$ and $Y$.
