
Transformation of random variables

Stephan Schmidt
February 27, 2024

1 Transformations of random variables


The purpose of this document is to derive the probability density functions
of transformed random variables, with a specific emphasis on linearly trans-
formed random variables. More theory on this can be found in textbooks
such as Papoulis and Unnikrishna Pillai (2002). This is a draft document, so please bring any mistakes to my attention.
Transformations are often used in models. For example, the linear regression model

y = w_0 + w_1 · x + ϵ    (1)

transforms a known x to y for a specific set of parameters w_0 and w_1, with ϵ ∼ p(ϵ). There are also data scaling methods, such as standardisation

z = (y − µ)/σ    (2)

and normalisation

z = (y − y_min)/(y_max − y_min),    (3)

that can be written as a linear transformation z = a · y + b. For the multivariate case, we often encounter transformations of this form:

y = Ax + b    (4)

For example, we can choose A and b so that y ∼ N(y; 0, I) for x ∼ N(x; µ, Σ).
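To make the claim concrete, the following minimal Python sketch (the function names are purely illustrative) expresses standardisation and min-max normalisation in the form z = a · y + b:

import numpy as np

def as_linear_transform_standardise(y):
    """Return (a, b) such that standardisation z = (y - mu)/sigma equals a*y + b."""
    mu, sigma = y.mean(), y.std()
    return 1.0 / sigma, -mu / sigma

def as_linear_transform_normalise(y):
    """Return (a, b) such that normalisation z = (y - ymin)/(ymax - ymin) equals a*y + b."""
    ymin, ymax = y.min(), y.max()
    return 1.0 / (ymax - ymin), -ymin / (ymax - ymin)

y = np.random.default_rng(0).normal(3.0, 0.5, size=1000)

a, b = as_linear_transform_standardise(y)
assert np.allclose(a * y + b, (y - y.mean()) / y.std())

a2, b2 = as_linear_transform_normalise(y)
assert np.allclose(a2 * y + b2, (y - y.min()) / (y.max() - y.min()))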

Reference:

• Papoulis, A. and Unnikrishna Pillai, S., 2002. Probability, Random Variables and Stochastic Processes. Fourth edition. McGraw-Hill.

1.1 Change of variables
If y = ψ(x), we can use a change of variables to obtain

p_y(y) = p_x(ψ^{-1}(y)) · |dx/dy|    (5)

if ψ(x) is monotonic and x = ψ^{-1}(y), where p_x(x) is the probability density function of the random variable x and p_y(y) is the probability density function over the random variable y.

For multivariate problems, y = f(x) and, if x = f^{-1}(y) exists, then

p(y) = p(x) |det(dx/dy)|    (6)

p(y) = p(f^{-1}(y)) |det(dx/dy)|    (7)
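To make Equation (5) concrete, the following Python sketch (my own illustrative check, using the monotonic transformation y = exp(x) with x ∼ N(0, 1)) compares the change-of-variables density with the known standard lognormal density:

import numpy as np
from scipy import stats

# Monotonic transformation y = psi(x) = exp(x), so x = psi^{-1}(y) = log(y)
# and |dx/dy| = 1/y.
y = np.linspace(0.05, 5.0, 200)
p_x = stats.norm(loc=0.0, scale=1.0).pdf           # p_x(x)
p_y_change_of_var = p_x(np.log(y)) * (1.0 / y)     # Equation (5)

# The analytical density of y = exp(x) is the standard lognormal.
p_y_analytical = stats.lognorm(s=1.0).pdf(y)

assert np.allclose(p_y_change_of_var, p_y_analytical)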

1.2 Linear transformations of random variables


If y = a · x + b, we can use a change of variables to obtain

p_y(y) = p_x((y − b)/a) · |1/a|    (8)

If y = Ax + b and A is invertible, we can write x = A^{-1}(y − b), and therefore

p_y(y) = p_x(f^{-1}(y)) |det(dx/dy)|    (9)

can be simplified as follows:

p_y(y) = p_x(A^{-1}(y − b)) |det(A^{-1})|    (10)
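Equation (10) can be implemented directly for any base density. The sketch below (illustrative function and variable names; the Gaussian comparison anticipates the result derived in Section 1.3) evaluates p_y for an affine transformation of a bivariate Gaussian and checks it against SciPy:

import numpy as np
from scipy import stats

def affine_transformed_pdf(p_x, A, b):
    """Return p_y for y = A x + b given a base density p_x, via Equation (10)."""
    A_inv = np.linalg.inv(A)
    jac = abs(np.linalg.det(A_inv))            # |det(A^{-1})|
    return lambda y: p_x(A_inv @ (y - b)) * jac

# Check against the known Gaussian result in two dimensions.
mu_x = np.array([1.0, -2.0])
Sigma_x = np.array([[1.0, 0.3], [0.3, 0.5]])
A = np.array([[2.0, 0.0], [1.0, -1.0]])
b = np.array([0.5, 1.0])

p_x = stats.multivariate_normal(mean=mu_x, cov=Sigma_x).pdf
p_y = affine_transformed_pdf(p_x, A, b)

y = np.array([3.0, 2.0])
expected = stats.multivariate_normal(mean=A @ mu_x + b, cov=A @ Sigma_x @ A.T).pdf(y)
assert np.isclose(p_y(y), expected)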

1.3 Linear transformations of Gaussian variables


We can show that, if

y = a · x + b    (11)

where a and b are deterministic and x ∼ N(µ_x, σ_x²), then

p(y) = N(y | a · µ_x + b, a² σ_x²)    (12)

Derivation: The Gaussian probability density function over x is given by

N(x | µ_x, σ_x²) = 1/√(2πσ_x²) · exp(−(x − µ_x)²/(2σ_x²))    (13)

The distribution of y is as follows:

p_y(y) = p_x((y − b)/a) · |1/a|    (14)
       = 1/√(2πσ_x²) · exp(−((y − b)/a − µ_x)²/(2σ_x²)) · |1/a|    (15)
       = 1/√(2πa²σ_x²) · exp(−(y − (a · µ_x + b))²/(2a²σ_x²))    (16)
       = 1/√(2πσ_y²) · exp(−(y − µ_y)²/(2σ_y²))    (17)
       = N(y | µ_y, σ_y²)    (18)
       = N(y | a · µ_x + b, a² σ_x²)    (19)
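As a quick numerical check of Equations (14)–(19), the change-of-variables density can be compared pointwise with N(y | a · µ_x + b, a² σ_x²); the sketch below uses arbitrary illustrative values, with a negative a to exercise the |1/a| factor:

import numpy as np
from scipy import stats

# Illustrative values; a is negative so that |1/a| matters.
a, b, mu_x, sigma_x = -2.0, 1.0, 3.0, 0.5

p_x = stats.norm(loc=mu_x, scale=sigma_x).pdf
y = np.linspace(-9.0, -1.0, 200)

p_y_change_of_var = p_x((y - b) / a) * abs(1.0 / a)                           # Equation (14)
p_y_analytical = stats.norm(loc=a * mu_x + b, scale=abs(a) * sigma_x).pdf(y)  # Equation (19)

assert np.allclose(p_y_change_of_var, p_y_analytical)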

Figure 1 shows an example of transforming a Gaussian variable with a linear operation y = a · x + b. If we draw samples x_sample ∼ N(µ_x, σ_x²) and substitute them into y_sample = a · x_sample + b, we can obtain an estimate of the probability density function from the histogram of the samples y_sample without knowing the distribution of y. Since x is Gaussian and we are performing a linear operation, the analytical probability density function of y is available and given by p(y) = N(y | a · µ_x + b, a² σ_x²). In the right plot of Figure 1, the sampling approach and the analytical probability density function are compared.
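A minimal Python sketch of this sampling experiment (my reconstruction, assuming the values suggested by Figure 1: µ_x = 3, σ_x = 0.5, a = 4 and b = 2) is given below:

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# Parameters assumed from Figure 1: x ~ N(3, 0.5^2), y = 4*x + 2.
mu_x, sigma_x, a, b = 3.0, 0.5, 4.0, 2.0

rng = np.random.default_rng(0)
x_samples = rng.normal(mu_x, sigma_x, size=100_000)
y_samples = a * x_samples + b                      # transform the samples

# Analytical density of y from Equation (12): N(a*mu_x + b, a^2*sigma_x^2).
y_grid = np.linspace(y_samples.min(), y_samples.max(), 300)
p_y = stats.norm(loc=a * mu_x + b, scale=abs(a) * sigma_x).pdf(y_grid)

plt.hist(y_samples, bins=100, density=True, alpha=0.5, label="samples")
plt.plot(y_grid, p_y, label="analytical")
plt.xlabel("y"); plt.ylabel("PDF"); plt.legend()
plt.show()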
Let’s consider the multivariate case, where x is a multivariate Gaussian variable. We can show that, if

y = Ax + b    (20)

with x ∼ N(µ_x, Σ_x), then

p(y) = N(y | Aµ_x + b, AΣ_x A^T)    (21)

The derivation of this equation is given at the end of this document.
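Before going through the derivation, Equation (21) can be sanity-checked by sampling; the sketch below uses arbitrarily chosen values of µ_x, Σ_x, A and b:

import numpy as np

rng = np.random.default_rng(1)

mu_x = np.array([1.0, -2.0])
Sigma_x = np.array([[1.0, 0.3], [0.3, 0.5]])
A = np.array([[2.0, 0.0], [1.0, -1.0]])
b = np.array([0.5, 1.0])

# Draw samples of x and push them through the linear transformation.
x = rng.multivariate_normal(mu_x, Sigma_x, size=200_000)
y = x @ A.T + b

print(np.mean(y, axis=0))          # should approach A @ mu_x + b
print(np.cov(y, rowvar=False))     # should approach A @ Sigma_x @ A.T
print(A @ mu_x + b)
print(A @ Sigma_x @ A.T)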


We can use expected values to calculate the mean and covariance of y for any distribution. Since y is Gaussian distributed,

p(y) = N(y | µ_y, Σ_y)    (22)

the expected value of y can be used to calculate its mean, and the covariance of y can be used to calculate its covariance matrix, i.e.

p(y) = N(y | E_{x∼N(µ_x, Σ_x)}{y}, cov_{x∼N(µ_x, Σ_x)}{y})    (23)

where µ_y = E_{x∼N(µ_x, Σ_x)}{y} and

cov_{x∼N(µ_x, Σ_x)}{y} = E_{x∼N(µ_x, Σ_x)}{(y − µ_y)(y − µ_y)^T}    (24)

Figure 1: The left plot shows the probability density function of x ∼ N(x | 3, 0.5²), comparing the histogram of samples x_sample with the analytical PDF; the right plot shows the probability density function of y = 4 · x + 2, comparing the histogram of the transformed samples y_sample = 4 · x_sample + 2 with the analytical PDF N(y | 14, 2.0²).

Note that other distributions are not necessarily parametrised by their expected values.
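To illustrate that the mean and covariance of y = Ax + b follow Aµ_x + b and AΣ_x A^T for any distribution of x (not only the Gaussian), the sketch below uses independent exponential components for x; the specific choice is purely illustrative:

import numpy as np

rng = np.random.default_rng(2)

# x is deliberately non-Gaussian (independent exponential components) to show
# that the mean and covariance formulas hold for any distribution of x.
mu_x = np.array([1.0, 2.0])            # means of the exponential components
Sigma_x = np.diag(mu_x**2)             # variance of an exponential with mean mu is mu^2
A = np.array([[2.0, 0.0], [1.0, -1.0]])
b = np.array([0.5, 1.0])

x = rng.exponential(scale=mu_x, size=(500_000, 2))
y = x @ A.T + b

print(np.mean(y, axis=0), A @ mu_x + b)                 # mean: A mu_x + b
print(np.cov(y, rowvar=False), A @ Sigma_x @ A.T)       # cov:  A Sigma_x A^T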

Exercise questions

• Derive the distribution of y if x is Laplacian distributed and y = a·x+b.

• Use the expected value formulas to derive Equation (21).

• How should A and b be selected so that y ∼ N(y; 0, I) for x ∼ N(x; µ, Σ) with y = Ax + b?

Derivation: The probability density function of x is Gaussian and given by

p_x(x) = (2π)^{-D/2} det(Σ_x)^{-1/2} exp(−(1/2)(x − µ_x)^T Σ_x^{-1} (x − µ_x))    (25)

and the probability density function of y is given by

p_y(y) = p_x(A^{-1}(y − b)) |det(A^{-1})|    (26)

if y = Ax + b and A is invertible. The probability density function over y is therefore

p_y(y) = κ · exp(−(1/2)(A^{-1}(y − b) − µ_x)^T Σ_x^{-1} (A^{-1}(y − b) − µ_x))    (27)
       = κ · exp(−(1/2)(y − (Aµ_x + b))^T A^{-T} Σ_x^{-1} A^{-1} (y − (Aµ_x + b)))    (28)
       = κ · exp(−(1/2)(y − (Aµ_x + b))^T (AΣ_x A^T)^{-1} (y − (Aµ_x + b)))    (29)

where κ = (2π)^{-D/2} det(Σ_x)^{-1/2} |det(A^{-1})| and A^{-T} Σ_x^{-1} A^{-1} = (AΣ_x A^T)^{-1}. The exponential term has the form of a Gaussian distribution with mean µ_y = Aµ_x + b and covariance Σ_y = AΣ_x A^T. Since det(AΣ_x A^T) = det(A)² det(Σ_x), the constant κ equals (2π)^{-D/2} det(AΣ_x A^T)^{-1/2} = (2π)^{-D/2} det(Σ_y)^{-1/2}, which is exactly the normalisation constant of the multivariate Gaussian probability density function with covariance Σ_y:

p_y(y) = (2π)^{-D/2} det(Σ_y)^{-1/2} exp(−(1/2)(y − µ_y)^T Σ_y^{-1} (y − µ_y))    (30)
       = exp(−(1/2)(y − Aµ_x − b)^T (AΣ_x A^T)^{-1} (y − Aµ_x − b)) / ((2π)^{D/2} det(AΣ_x A^T)^{1/2})    (31)

p(y) = N(y | Aµ_x + b, AΣ_x A^T)    (32)
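The two identities used above, A^{-T} Σ_x^{-1} A^{-1} = (AΣ_x A^T)^{-1} and det(AΣ_x A^T) = det(A)² det(Σ_x), can be verified numerically with a short sketch (randomly generated illustrative matrices):

import numpy as np

rng = np.random.default_rng(3)

# A random (almost surely invertible) A and a random symmetric positive definite Sigma_x.
D = 3
A = rng.normal(size=(D, D))
L = rng.normal(size=(D, D))
Sigma_x = L @ L.T + D * np.eye(D)

A_inv = np.linalg.inv(A)

# Precision identity: A^{-T} Sigma_x^{-1} A^{-1} == (A Sigma_x A^T)^{-1}
lhs = A_inv.T @ np.linalg.inv(Sigma_x) @ A_inv
rhs = np.linalg.inv(A @ Sigma_x @ A.T)
assert np.allclose(lhs, rhs)

# Determinant identity: det(A Sigma_x A^T) == det(A)^2 * det(Sigma_x)
assert np.isclose(np.linalg.det(A @ Sigma_x @ A.T),
                  np.linalg.det(A)**2 * np.linalg.det(Sigma_x))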
