Royen’s proof of the Gaussian correlation inequality

Rafał Latała and Dariusz Matlak


arXiv:1512.08776v1 [math.PR] 29 Dec 2015

Abstract
We present in detail Thomas Royen's proof of the Gaussian correlation inequality, which states that $\mu(K \cap L) \ge \mu(K)\mu(L)$ for any centered Gaussian measure $\mu$ on $\mathbb{R}^d$ and symmetric convex sets $K, L$ in $\mathbb{R}^d$.

1 Introduction
The aim of this note is to present in a self-contained way the beautiful proof of the Gaussian correlation inequality, due to Thomas Royen [7]. Although the method is rather simple and elementary, we found the original paper not easy to follow. One of the reasons is that in [7] the correlation inequality was established for a more general class of probability measures. Moreover, the author assumed that the reader is familiar with properties of certain distributions and can justify some calculations by herself/himself. We decided to reorganize Royen's proof a bit, restrict it to the Gaussian case and add some missing details. We hope that this way a wider readership may appreciate Royen's remarkable result.
The statement of the Gaussian correlation inequality is as follows.
The statement of the Gaussian correlation inequality is as follows.

Theorem 1. For any closed symmetric sets $K, L$ in $\mathbb{R}^d$ and any centered Gaussian measure $\mu$ on $\mathbb{R}^d$ we have
$$\mu(K \cap L) \ge \mu(K)\mu(L). \tag{1}$$

For $d = 2$ the result was proved by Pitt [5]. In the case when one of the sets $K, L$ is a symmetric strip (which corresponds to $\min\{n_1, n_2\} = 1$ in Theorem 2 below), inequality (1) was established independently by Khatri [3] and Šidák [9]. Hargé [2] generalized the Khatri-Šidák result to the case when one of the sets is a symmetric ellipsoid. Some other partial results may be found in the papers of Borell [1] and Schechtman, Schlumprecht and Zinn [8].
To the best of our knowledge, Thomas Royen was the first to present a complete proof of the Gaussian correlation inequality. Some other recent attempts may be found in [4] and [6]; however, both papers are very long and difficult to check. The first version of [4], posted on the arXiv before Royen's paper, contained a fundamental mistake (Lemma 6.3 there was wrong).

Since any symmetric closed set is a countable intersection of symmetric strips, it is enough to show (1) in the case when
$$K = \{x \in \mathbb{R}^d \colon \forall_{1 \le i \le n_1}\ |\langle x, v_i\rangle| \le t_i\} \quad\text{and}\quad L = \{x \in \mathbb{R}^d \colon \forall_{n_1+1 \le i \le n_1+n_2}\ |\langle x, v_i\rangle| \le t_i\},$$
where the $v_i$ are vectors in $\mathbb{R}^d$ and the $t_i$ are nonnegative numbers. If we set $n = n_1 + n_2$ and $X_i := \langle v_i, G\rangle$, where $G$ is the Gaussian random vector distributed according to $\mu$, we obtain the following equivalent form of Theorem 1.
Theorem 2. Let $n = n_1 + n_2$ and let $X$ be an $n$-dimensional centered Gaussian vector. Then for any $t_1, \dots, t_n > 0$,
$$\mathbb{P}(|X_1| \le t_1, \dots, |X_n| \le t_n) \ge \mathbb{P}(|X_1| \le t_1, \dots, |X_{n_1}| \le t_{n_1})\, \mathbb{P}(|X_{n_1+1}| \le t_{n_1+1}, \dots, |X_n| \le t_n).$$
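Before turning to the proof, here is a quick Monte Carlo sanity check of Theorem 2 (a minimal sketch of ours, not part of Royen's argument; the dimensions, seed, thresholds and sample size are arbitrary choices, and numpy is assumed to be available):

```python
import numpy as np

rng = np.random.default_rng(0)

# Random nondegenerate covariance matrix C = B B^T.
n1, n2 = 2, 3
n = n1 + n2
B = rng.standard_normal((n, n))
C = B @ B.T

# Thresholds t_i of the order of the standard deviations of X_i.
t = rng.uniform(0.5, 2.0, size=n) * np.sqrt(np.diag(C))

# Estimate the three probabilities in Theorem 2 by simulation.
X = rng.multivariate_normal(np.zeros(n), C, size=500_000)
inside = np.abs(X) <= t                      # componentwise |X_i| <= t_i
p_joint = inside.all(axis=1).mean()
p_first = inside[:, :n1].all(axis=1).mean()
p_second = inside[:, n1:].all(axis=1).mean()

# Theorem 2 predicts p_joint >= p_first * p_second (up to Monte Carlo error).
print(p_joint, p_first * p_second)
```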

Remark 3. i) A standard approximation argument shows that the Gaussian correlation inequality holds for centered Gaussian measures on separable Banach spaces.
ii) Thomas Royen established Theorem 2 for a more general class of random vectors $X$ such that $X^2 = (X_1^2, \dots, X_n^2)$ has an $n$-variate gamma distribution (see [7] for details).
Notation. By $N(0, C)$ we denote the centered Gaussian measure with covariance matrix $C$. We write $M_{n\times m}$ for the set of $n \times m$ matrices and $|A|$ for the determinant of a square matrix $A$. For a matrix $A = (a_{ij})_{i,j\le n}$ and $J \subset [n] := \{1, \dots, n\}$ we denote by $A_J$ the square matrix $(a_{ij})_{i,j\in J}$ and by $|J|$ the cardinality of $J$.

2 Proof of Theorem 2
Without loss of generality we may and will assume that the covariance matrix $C$ of $X$ is nondegenerate (i.e. strictly positive definite). We may write $C$ as
$$C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix},$$
where $C_{ij}$ is an $n_i \times n_j$ matrix. Let
$$C(\tau) := \begin{pmatrix} C_{11} & \tau C_{12} \\ \tau C_{21} & C_{22} \end{pmatrix}, \qquad 0 \le \tau \le 1.$$
Set $Z_i(\tau) := \frac{1}{2} X_i(\tau)^2$, $1 \le i \le n$, where $X(\tau) \sim N(0, C(\tau))$.


We may restate the assertion as
$$\mathbb{P}(Z_1(1) \le s_1, \dots, Z_n(1) \le s_n) \ge \mathbb{P}(Z_1(0) \le s_1, \dots, Z_n(0) \le s_n),$$
where $s_i = \frac{1}{2} t_i^2$. Therefore it is enough to show that the function
$$\tau \mapsto \mathbb{P}(Z_1(\tau) \le s_1, \dots, Z_n(\tau) \le s_n)$$
is nondecreasing on $[0, 1]$.
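The following minimal Monte Carlo sketch (our illustration, not part of the paper; the parameters are arbitrary and numpy is assumed) estimates this probability along the interpolation $C(\tau)$; note that $\mathbb{P}(Z_i(\tau) \le s_i \text{ for all } i)$ equals $\mathbb{P}(|X_i(\tau)| \le t_i \text{ for all } i)$:

```python
import numpy as np

rng = np.random.default_rng(1)

n1, n2 = 2, 2
n = n1 + n2
B = rng.standard_normal((n, n))
C = B @ B.T
t = np.sqrt(np.diag(C))                  # arbitrary thresholds, s_i = t_i^2 / 2

def C_tau(tau):
    """C(tau): the off-diagonal blocks C_12, C_21 scaled by tau."""
    M = C.copy()
    M[:n1, n1:] *= tau
    M[n1:, :n1] *= tau
    return M

def prob(tau, m=400_000):
    """Monte Carlo estimate of P(|X_1(tau)| <= t_1, ..., |X_n(tau)| <= t_n)."""
    X = rng.multivariate_normal(np.zeros(n), C_tau(tau), size=m)
    return (np.abs(X) <= t).all(axis=1).mean()

# The estimates should be nondecreasing in tau, up to Monte Carlo noise.
print([round(prob(tau), 4) for tau in (0.0, 0.25, 0.5, 0.75, 1.0)])
```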

Let $f(x, \tau)$ denote the density of the random vector $Z(\tau)$ and let $K = [0, s_1] \times \cdots \times [0, s_n]$. We have
$$\frac{\partial}{\partial\tau}\, \mathbb{P}(Z_1(\tau) \le s_1, \dots, Z_n(\tau) \le s_n) = \frac{\partial}{\partial\tau} \int_K f(x, \tau)\,dx = \int_K \frac{\partial}{\partial\tau} f(x, \tau)\,dx,$$
where the last equality follows by Lemma 6 applied to $\lambda_1 = \dots = \lambda_n = 0$. Therefore it is enough to show that $\int_K \frac{\partial}{\partial\tau} f(x, \tau)\,dx \ge 0$.

To this end we will compute the Laplace transform of $\frac{\partial}{\partial\tau} f(x, \tau)$. By Lemma 6, applied to $K = [0, \infty)^n$, we have for any $\lambda_1, \dots, \lambda_n \ge 0$,
$$\int_{[0,\infty)^n} e^{-\sum_{i=1}^n \lambda_i x_i}\, \frac{\partial}{\partial\tau} f(x, \tau)\,dx = \frac{\partial}{\partial\tau} \int_{[0,\infty)^n} e^{-\sum_{i=1}^n \lambda_i x_i}\, f(x, \tau)\,dx.$$
However, by Lemma 4 we have
$$\int_{[0,\infty)^n} e^{-\sum_{i=1}^n \lambda_i x_i}\, f(x, \tau)\,dx = \mathbb{E} \exp\Big(-\frac{1}{2}\sum_{k=1}^n \lambda_k X_k^2(\tau)\Big) = |I + \Lambda C(\tau)|^{-1/2},$$
where $\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$.
Formula (2) below yields
$$|I + \Lambda C(\tau)| = 1 + \sum_{\emptyset \ne J \subset [n]} |(\Lambda C(\tau))_J| = 1 + \sum_{\emptyset \ne J \subset [n]} |C(\tau)_J| \prod_{j \in J} \lambda_j.$$

Fix $\emptyset \ne J \subset [n]$. Then $J = J_1 \cup J_2$, where $J_1 := [n_1] \cap J$, $J_2 := J \setminus [n_1]$, and
$$C(\tau)_J = \begin{pmatrix} C_{J_1} & \tau C_{J_1 J_2} \\ \tau C_{J_2 J_1} & C_{J_2} \end{pmatrix}.$$
If $J_1 = \emptyset$ or $J_2 = \emptyset$ then $C(\tau)_J = C_J$; otherwise by (3) we get
$$|C(\tau)_J| = |C_{J_1}||C_{J_2}|\, \big|I_{|J_1|} - \tau^2 C_{J_1}^{-1/2} C_{J_1 J_2} C_{J_2}^{-1} C_{J_2 J_1} C_{J_1}^{-1/2}\big| = |C_{J_1}||C_{J_2}| \prod_{i=1}^{|J_1|} \big(1 - \tau^2 \mu_{J_1,J_2}(i)\big),$$
where $\mu_{J_1,J_2}(i)$, $1 \le i \le |J_1|$, denote the eigenvalues of $C_{J_1}^{-1/2} C_{J_1 J_2} C_{J_2}^{-1} C_{J_2 J_1} C_{J_1}^{-1/2}$ (by (4) they belong to $[0,1]$). Thus for any $\emptyset \ne J \subset [n]$ and $\tau \in [0,1]$ we have
$$a_J(\tau) := -\frac{\partial}{\partial\tau} |C(\tau)_J| \ge 0.$$
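As a numerical aside (our own check, not part of the proof; the matrix and the subsets $J_1, J_2$ are arbitrary, and numpy is assumed), one can verify this factorization and that the eigenvalues $\mu_{J_1,J_2}(i)$ lie in $[0,1]$:

```python
import numpy as np

rng = np.random.default_rng(2)

n1, n2 = 2, 3
n = n1 + n2
B = rng.standard_normal((n, n))
C = B @ B.T

J1, J2 = [0, 1], [2, 4]        # J1 inside [n1], J2 inside the complement
J = J1 + J2

def det_C_tau_J(tau):
    """Determinant of C(tau)_J (J1-block listed first)."""
    M = C[np.ix_(J, J)].copy()
    k = len(J1)
    M[:k, k:] *= tau
    M[k:, :k] *= tau
    return np.linalg.det(M)

CJ1 = C[np.ix_(J1, J1)]
CJ2 = C[np.ix_(J2, J2)]
C12 = C[np.ix_(J1, J2)]

# mu_{J1,J2}(i): eigenvalues of C_{J1}^{-1} C_{J1J2} C_{J2}^{-1} C_{J2J1},
# which coincide with those of the symmetrized matrix in the text.
mu = np.linalg.eigvals(np.linalg.solve(CJ1, C12 @ np.linalg.solve(CJ2, C12.T))).real

print("mu in [0,1]:", np.all((mu > -1e-10) & (mu < 1 + 1e-10)))
for tau in (0.0, 0.3, 0.7, 1.0):
    lhs = det_C_tau_J(tau)
    rhs = np.linalg.det(CJ1) * np.linalg.det(CJ2) * np.prod(1 - tau**2 * mu)
    print(tau, np.isclose(lhs, rhs))   # factorization should hold for each tau
```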
Therefore
$$\frac{\partial}{\partial\tau} |I + \Lambda C(\tau)|^{-1/2} = -\frac{1}{2}\, |I + \Lambda C(\tau)|^{-3/2} \sum_{\emptyset \ne J \subset [n]} \frac{\partial}{\partial\tau}|C(\tau)_J|\, |\Lambda_J| = \frac{1}{2}\, |I + \Lambda C(\tau)|^{-3/2} \sum_{\emptyset \ne J \subset [n]} a_J(\tau) \prod_{j \in J} \lambda_j.$$

We have thus shown that
$$\int_{[0,\infty)^n} e^{-\sum_{i=1}^n \lambda_i x_i}\, \frac{\partial}{\partial\tau} f(x, \tau)\,dx = \frac{1}{2} \sum_{\emptyset \ne J \subset [n]} a_J(\tau)\, |I + \Lambda C(\tau)|^{-3/2} \prod_{j \in J} \lambda_j.$$

Let $h_\tau := h_{3, C(\tau)}$ be the density function on $(0,\infty)^n$ defined by (5). By Lemmas 8 and 7 iii) we know that
$$|I + \Lambda C(\tau)|^{-3/2} \prod_{j \in J} \lambda_j = \int_{(0,\infty)^n} e^{-\sum_{i=1}^n \lambda_i x_i}\, \frac{\partial^{|J|}}{\partial x_J} h_\tau(x)\,dx.$$

Since an integrable function is determined (a.e.) by its Laplace transform, this shows that
$$\frac{\partial}{\partial\tau} f(x, \tau) = \sum_{\emptyset \ne J \subset [n]} \frac{1}{2}\, a_J(\tau)\, \frac{\partial^{|J|}}{\partial x_J} h_\tau(x).$$

Finally, recall that $a_J(\tau) \ge 0$ and observe that by Lemma 7 ii),
$$\lim_{x_i \to 0+} \frac{\partial^{|I|}}{\partial x_I} h_\tau(x) = 0 \quad \text{for } i \notin I \subset [n],$$
thus
$$\int_K \frac{\partial^{|J|}}{\partial x_J} h_\tau(x)\,dx = \int_{\prod_{j \in J^c} [0, s_j]} h_\tau(s_J, x_{J^c})\,dx_{J^c} \ge 0,$$
where $J^c = [n] \setminus J$ and $y = (s_J, x_{J^c})$ means $y_i = s_i$ for $i \in J$ and $y_i = x_i$ for $i \in J^c$.

3 Auxiliary Lemmas
Lemma 4. Let $X$ be an $n$-dimensional centered Gaussian vector with covariance matrix $C$. Then for any $\lambda_1, \dots, \lambda_n \ge 0$ we have
$$\mathbb{E} \exp\Big(-\sum_{i=1}^n \lambda_i X_i^2\Big) = |I_n + 2\Lambda C|^{-1/2},$$
where $\Lambda := \operatorname{diag}(\lambda_1, \dots, \lambda_n)$.

Proof. Let $A$ be a symmetric positive definite matrix. Then $A = U D U^T$ for some $U \in O(n)$ and $D = \operatorname{diag}(d_1, d_2, \dots, d_n)$ with $d_k > 0$. Hence
$$\int_{\mathbb{R}^n} \exp(-\langle Ax, x\rangle)\,dx = \int_{\mathbb{R}^n} \exp(-\langle Dx, x\rangle)\,dx = \prod_{k=1}^n \sqrt{\frac{\pi}{d_k}} = \pi^{n/2} |D|^{-1/2} = \pi^{n/2} |A|^{-1/2}.$$

Therefore, for a canonical Gaussian vector $Y \sim N(0, I_n)$ and a symmetric matrix $B$ such that $2B < I_n$ we have
$$\mathbb{E} \exp(\langle BY, Y\rangle) = (2\pi)^{-n/2} \int_{\mathbb{R}^n} \exp\Big(-\Big\langle \Big(\tfrac{1}{2} I_n - B\Big) x, x \Big\rangle\Big)\,dx = 2^{-n/2}\, \Big|\tfrac{1}{2} I_n - B\Big|^{-1/2} = |I_n - 2B|^{-1/2}.$$

We may represent $X \sim N(0, C)$ as $X = AY$ with $Y \sim N(0, I_n)$ and $C = AA^T$. Thus
$$\mathbb{E} \exp\Big(-\sum_{i=1}^n \lambda_i X_i^2\Big) = \mathbb{E} \exp(-\langle \Lambda X, X\rangle) = \mathbb{E} \exp(-\langle \Lambda A Y, A Y\rangle) = \mathbb{E} \exp(-\langle A^T \Lambda A Y, Y\rangle) = |I_n + 2 A^T \Lambda A|^{-1/2} = |I_n + 2\Lambda C|^{-1/2},$$
where to get the last equality we used the fact that $|I_n + A_1 A_2| = |I_n + A_2 A_1|$ for $A_1, A_2 \in M_{n\times n}$.
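A minimal numerical sketch of Lemma 4 (ours, for illustration only; arbitrary covariance, weights and sample size, assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(3)

n = 3
B = rng.standard_normal((n, n))
C = B @ B.T
lam = rng.uniform(0.0, 1.0, size=n)

# Monte Carlo estimate of E exp(-sum_i lambda_i X_i^2) for X ~ N(0, C).
X = rng.multivariate_normal(np.zeros(n), C, size=1_000_000)
mc = np.exp(-(X**2 @ lam)).mean()

# Closed form |I_n + 2 Lambda C|^{-1/2} from Lemma 4.
exact = np.linalg.det(np.eye(n) + 2 * np.diag(lam) @ C) ** -0.5

print(mc, exact)   # should agree to roughly three digits
```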

Lemma 5. i) For any matrix $A \in M_{n\times n}$,
$$|I_n + A| = 1 + \sum_{\emptyset \ne J \subset [n]} |A_J|. \tag{2}$$
ii) Suppose that $n = n_1 + n_2$ and $A \in M_{n\times n}$ has the block representation $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$, where $A_{ij} \in M_{n_i \times n_j}$ and $A_{11}, A_{22}$ are symmetric and positive definite. Then
$$|A| = |A_{11}||A_{22}|\, \big|I_{n_1} - A_{11}^{-1/2} A_{12} A_{22}^{-1} A_{21} A_{11}^{-1/2}\big|. \tag{3}$$
Moreover, if $A$ is symmetric and positive definite then
$$0 \le A_{11}^{-1/2} A_{12} A_{22}^{-1} A_{21} A_{11}^{-1/2} \le I_{n_1}. \tag{4}$$

Proof. i) This formula may be verified in several ways, e.g. by induction on $n$ or by using the Leibniz formula for the determinant.
ii) We have
$$\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} = \begin{pmatrix} A_{11}^{1/2} & 0 \\ 0 & A_{22}^{1/2} \end{pmatrix} \begin{pmatrix} I_{n_1} & A_{11}^{-1/2} A_{12} A_{22}^{-1/2} \\ A_{22}^{-1/2} A_{21} A_{11}^{-1/2} & I_{n_2} \end{pmatrix} \begin{pmatrix} A_{11}^{1/2} & 0 \\ 0 & A_{22}^{1/2} \end{pmatrix}$$
and
$$\begin{vmatrix} I_{n_1} & A_{11}^{-1/2} A_{12} A_{22}^{-1/2} \\ A_{22}^{-1/2} A_{21} A_{11}^{-1/2} & I_{n_2} \end{vmatrix} = \begin{vmatrix} I_{n_1} - A_{11}^{-1/2} A_{12} A_{22}^{-1} A_{21} A_{11}^{-1/2} & 0 \\ A_{22}^{-1/2} A_{21} A_{11}^{-1/2} & I_{n_2} \end{vmatrix} = \big|I_{n_1} - A_{11}^{-1/2} A_{12} A_{22}^{-1} A_{21} A_{11}^{-1/2}\big|.$$

To show the last part of the statement, notice that $A_{11}^{-1/2} A_{12} A_{22}^{-1} A_{21} A_{11}^{-1/2} = B^T B \ge 0$, where $B := A_{22}^{-1/2} A_{21} A_{11}^{-1/2}$ (recall that $A_{12} = A_{21}^T$ for symmetric $A$). If $A$ is positive definite then for any $t \in \mathbb{R}$, $x \in \mathbb{R}^{n_1}$ and $y \in \mathbb{R}^{n_2}$ we have $t^2 \langle A_{11} x, x\rangle + 2t \langle A_{21} x, y\rangle + \langle A_{22} y, y\rangle \ge 0$. This implies $\langle A_{21} x, y\rangle^2 \le \langle A_{11} x, x\rangle \langle A_{22} y, y\rangle$. Replacing $x$ by $A_{11}^{-1/2} x$ and $y$ by $A_{22}^{-1/2} y$ we get $\langle Bx, y\rangle^2 \le |x|^2 |y|^2$. Choosing $y = Bx$ we get $\langle B^T B x, x\rangle \le |x|^2$, i.e. $B^T B \le I_{n_1}$.
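Formula (2) is easy to test numerically; here is a small sketch of ours (arbitrary matrix, assuming numpy) summing the principal minors over all nonempty $J \subset [n]$:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(4)

n = 4
A = rng.standard_normal((n, n))

# Right-hand side of (2): 1 + sum of principal minors |A_J| over nonempty J.
total = 1.0
for r in range(1, n + 1):
    for J in combinations(range(n), r):
        total += np.linalg.det(A[np.ix_(J, J)])

print(np.linalg.det(np.eye(n) + A), total)   # the two values should coincide
```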

Lemma 6. Let $f(x, \tau)$ be the density of the random vector $Z(\tau)$ defined above. Then for any Borel set $K \subset [0, \infty)^n$ and any $\lambda_1, \dots, \lambda_n \ge 0$,
$$\int_K e^{-\sum_{i=1}^n \lambda_i x_i}\, \frac{\partial}{\partial\tau} f(x, \tau)\,dx = \frac{\partial}{\partial\tau} \int_K e^{-\sum_{i=1}^n \lambda_i x_i}\, f(x, \tau)\,dx.$$
Proof. The matrix $C$ is nondegenerate, therefore the matrices $C_{11}$ and $C_{22}$ are nondegenerate and $C(\tau)$ is nondegenerate for any $\tau \in [0, 1]$. The random vector $X(\tau) \sim N(0, C(\tau))$ has the density $|C(\tau)|^{-1/2} (2\pi)^{-n/2} \exp(-\frac{1}{2}\langle C(\tau)^{-1} x, x\rangle)$. A standard calculation shows that $Z(\tau)$ has the density
$$f(x, \tau) = |C(\tau)|^{-1/2}\, (4\pi)^{-n/2}\, \frac{1}{\sqrt{x_1 \cdots x_n}} \sum_{\varepsilon \in \{-1,1\}^n} e^{-\langle C(\tau)^{-1} \sqrt{x}_\varepsilon,\, \sqrt{x}_\varepsilon\rangle}\, \mathbf{1}_{(0,\infty)^n}(x),$$
where for $\varepsilon \in \{-1,1\}^n$ and $x \in (0,\infty)^n$ we set $\sqrt{x}_\varepsilon := (\varepsilon_i \sqrt{x_i})_i$.
The function $\tau \mapsto |C(\tau)|^{-1/2}$ is smooth on $[0, 1]$; in particular
$$\sup_{\tau \in [0,1]} |C(\tau)|^{-1/2} + \sup_{\tau \in [0,1]} \Big|\frac{\partial}{\partial\tau}\, |C(\tau)|^{-1/2}\Big| =: M < \infty.$$


Since $C(\tau) = \tau C(1) + (1 - \tau) C(0)$ we have $\frac{\partial}{\partial\tau} C(\tau) = C(1) - C(0)$, hence $\frac{\partial}{\partial\tau} C(\tau)^{-1} = -C(\tau)^{-1}(C(1) - C(0))C(\tau)^{-1}$ and
$$\frac{\partial}{\partial\tau}\, e^{-\langle C(\tau)^{-1} \sqrt{x}_\varepsilon,\, \sqrt{x}_\varepsilon\rangle} = \langle C(\tau)^{-1} (C(1) - C(0)) C(\tau)^{-1} \sqrt{x}_\varepsilon,\, \sqrt{x}_\varepsilon\rangle\, e^{-\langle C(\tau)^{-1} \sqrt{x}_\varepsilon,\, \sqrt{x}_\varepsilon\rangle}.$$
The continuity of the function $\tau \mapsto C(\tau)$ gives
$$\langle C(\tau)^{-1} \sqrt{x}_\varepsilon,\, \sqrt{x}_\varepsilon\rangle \ge a\, \langle \sqrt{x}_\varepsilon, \sqrt{x}_\varepsilon\rangle = a \sum_{i=1}^n x_i$$
and
$$\big|\langle C(\tau)^{-1} (C(1) - C(0)) C(\tau)^{-1} \sqrt{x}_\varepsilon,\, \sqrt{x}_\varepsilon\rangle\big| \le b\, \langle \sqrt{x}_\varepsilon, \sqrt{x}_\varepsilon\rangle = b \sum_{i=1}^n x_i$$
for some $a > 0$, $b < \infty$, uniformly in $\tau \in [0,1]$ and $\varepsilon$. Hence for $x \in (0, \infty)^n$,
$$\sup_{\tau \in [0,1]} \Big|\frac{\partial}{\partial\tau} f(x, \tau)\Big| \le g(x) := M \pi^{-n/2}\, \frac{1}{\sqrt{x_1 \cdots x_n}}\, \Big(1 + b \sum_{i=1}^n x_i\Big)\, e^{-a \sum_{i=1}^n x_i}.$$
Since $g \in L^1((0,\infty)^n)$ and $e^{-\sum_{i=1}^n \lambda_i x_i} \le 1$, the statement easily follows from the Lebesgue dominated convergence theorem.

For $\alpha > 0$ let
$$g_\alpha(x, y) := e^{-x-y} \sum_{k=0}^\infty \frac{x^{k+\alpha-1}\, y^k}{\Gamma(k+\alpha)\, k!}, \qquad x > 0,\ y \ge 0.$$
For $\mu, \alpha_1, \dots, \alpha_n > 0$ and a random vector $Y = (Y_1, \dots, Y_n)$ such that $\mathbb{P}(Y_i \ge 0) = 1$ we set
$$h_{\alpha_1, \dots, \alpha_n, \mu, Y}(x_1, \dots, x_n) := \mathbb{E} \Big[\prod_{i=1}^n \frac{1}{\mu}\, g_{\alpha_i}\Big(\frac{x_i}{\mu}, Y_i\Big)\Big], \qquad x_1, \dots, x_n > 0.$$
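The function $g_\alpha(\cdot, y)$ is a probability density on $(0,\infty)$ for every $y \ge 0$ (it is a Poisson mixture of gamma densities); the following small sketch of ours (truncated series, crude Riemann sum, arbitrary parameters, numpy assumed) illustrates this numerically:

```python
import numpy as np
from math import lgamma, exp, log

def g(alpha, x, y, kmax=80):
    """Truncated series for g_alpha(x, y); x > 0, y > 0."""
    s = 0.0
    for k in range(kmax):
        s += exp((k + alpha - 1) * log(x) + k * log(y)
                 - lgamma(k + alpha) - lgamma(k + 1))
    return exp(-x - y) * s

alpha, y = 2.5, 1.7
xs = np.linspace(1e-6, 50.0, 20_001)
vals = np.array([g(alpha, x, y) for x in xs])
dx = xs[1] - xs[0]
print(dx * vals.sum())   # Riemann sum of the density: should be close to 1.0
```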

Lemma 7. Let $\mu > 0$ and let $Y$ be a random $n$-dimensional vector with nonnegative coordinates. For $\alpha = (\alpha_1, \dots, \alpha_n) \in (0, \infty)^n$ set $h_\alpha := h_{\alpha_1, \dots, \alpha_n, \mu, Y}$.
i) For any $\alpha \in (0, \infty)^n$, $h_\alpha \ge 0$ and $\int_{(0,\infty)^n} h_\alpha(x)\,dx = 1$.
ii) If $\alpha \in (0, \infty)^n$ and $\alpha_i > 1$ then $\lim_{x_i \to 0+} h_\alpha(x) = 0$, $\frac{\partial}{\partial x_i} h_\alpha(x)$ exists and
$$\frac{\partial}{\partial x_i} h_\alpha(x) = h_{\alpha - e_i}(x) - h_\alpha(x).$$
iii) If $\alpha \in (1, \infty)^n$ then for any $J \subset [n]$, $\frac{\partial^{|J|}}{\partial x_J} h_\alpha(x)$ exists and belongs to $L^1((0,\infty)^n)$. Moreover, for $\lambda_1, \dots, \lambda_n \ge 0$,
$$\int_{(0,\infty)^n} e^{-\sum_{i=1}^n \lambda_i x_i}\, \frac{\partial^{|J|}}{\partial x_J} h_\alpha(x)\,dx = \prod_{i \in J} \lambda_i \int_{(0,\infty)^n} e^{-\sum_{i=1}^n \lambda_i x_i}\, h_\alpha(x)\,dx.$$

Proof. i) Obviously $h_\alpha \ge 0$. For any $y \ge 0$ and $\alpha > 0$,
$$\int_0^\infty \frac{1}{\mu}\, g_\alpha\Big(\frac{x}{\mu}, y\Big)\,dx = \int_0^\infty g_\alpha(x, y)\,dx = 1.$$
Hence by the Fubini theorem,
$$\int_{(0,\infty)^n} h_\alpha(x)\,dx = \mathbb{E} \prod_{i=1}^n \int_0^\infty \frac{1}{\mu}\, g_{\alpha_i}\Big(\frac{x_i}{\mu}, Y_i\Big)\,dx_i = 1.$$

ii) It is well known that $\Gamma$ is decreasing on $(0, x_0]$ and increasing on $[x_0, \infty)$, where $1 < x_0 < 2$ and $\Gamma(x_0) > 1/2$. Therefore for $k = 1, 2, \dots$ and $\alpha > 0$ we have $\Gamma(k + \alpha) \ge \frac{1}{2}\Gamma(k) = \frac{1}{2}(k-1)!$, and
$$g_\alpha(x, y) \le e^{-x} \sum_{k=0}^\infty \frac{x^{k+\alpha-1}}{\Gamma(k+\alpha)} \le 2\Big(x^{\alpha-1} e^{-x} + x^\alpha e^{-x} \sum_{k=1}^\infty \frac{x^{k-1}}{(k-1)!}\Big) = 2x^{\alpha-1}(e^{-x} + x).$$

This implies that for $\alpha > 0$ and $0 < a < b < \infty$ we have $g_\alpha(x, y) \le C(\alpha, a, b) < \infty$ for $x \in (a, b)$ and $y \ge 0$. Moreover,
$$h_\alpha(x) \le \Big(\frac{2}{\mu}\Big)^n \prod_{i=1}^n \Big(\frac{x_i}{\mu}\Big)^{\alpha_i - 1} \Big(1 + \frac{x_i}{\mu}\Big).$$
In particular $\lim_{x_i \to 0+} h_\alpha(x) = 0$ if $\alpha_i > 1$. Observe that for $\alpha > 1$, $\frac{\partial}{\partial x} g_\alpha = g_{\alpha-1} - g_\alpha$. A standard application of the Lebesgue dominated convergence theorem concludes the proof of part ii).
iii) By ii) we get
$$\frac{\partial^{|J|}}{\partial x_J} h_\alpha = \sum_{\delta \in \{0,1\}^J} (-1)^{|J| - \sum_{i \in J}\delta_i}\, h_{\alpha - \sum_{i \in J} \delta_i e_i} \in L^1((0,\infty)^n).$$
Moreover $\lim_{x_j \to 0+} \frac{\partial^{|J|}}{\partial x_J} h_\alpha(x) = 0$ for $j \notin J$. We finish the proof by induction on $|J|$, using integration by parts.
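The key identity $\frac{\partial}{\partial x} g_\alpha = g_{\alpha-1} - g_\alpha$ used in part ii) can be checked by finite differences; a small sketch of ours (truncated series, arbitrary parameters):

```python
from math import lgamma, exp, log

def g(alpha, x, y, kmax=80):
    """Truncated series for g_alpha(x, y); x > 0, y > 0."""
    s = 0.0
    for k in range(kmax):
        s += exp((k + alpha - 1) * log(x) + k * log(y)
                 - lgamma(k + alpha) - lgamma(k + 1))
    return exp(-x - y) * s

alpha, y, h = 2.5, 1.3, 1e-6
for x in (0.5, 1.0, 3.0):
    num = (g(alpha, x + h, y) - g(alpha, x - h, y)) / (2 * h)   # d/dx g_alpha
    print(num, g(alpha - 1, x, y) - g(alpha, x, y))             # should match
```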

Let $C$ be a strictly positive definite symmetric $n \times n$ matrix. Then there exists $\mu > 0$ such that $C - \mu I_n$ is positive definite, so $C = \mu I_n + AA^T$ for some $A \in M_{n\times n}$. Let $(g_j^{(l)})_{j \le n,\, l \le k}$ be i.i.d. $N(0,1)$ random variables. Set
$$Y_i = \frac{1}{2\mu} \sum_{l=1}^k \sum_{j,j' \le n} g_j^{(l)} g_{j'}^{(l)} a_{i,j} a_{i,j'} = \sum_{l=1}^k \Big(\frac{1}{\sqrt{2\mu}} \sum_{j=1}^n g_j^{(l)} a_{i,j}\Big)^2, \qquad 1 \le i \le n,$$
and
$$h_{k,C} := h_{\frac{k}{2}, \dots, \frac{k}{2}, \mu, Y}. \tag{5}$$

Lemma 8. For any $\lambda_1, \dots, \lambda_n \ge 0$ we have
$$\int_{(0,\infty)^n} e^{-\sum_{i=1}^n \lambda_i x_i}\, h_{k,C}(x)\,dx = |I_n + \Lambda C|^{-k/2},$$
where $\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$.

Proof. For any $\alpha, \mu > 0$ and $\lambda, y \ge 0$,
$$\int_0^\infty e^{-\lambda x}\, \frac{1}{\mu}\, g_\alpha\Big(\frac{x}{\mu}, y\Big)\,dx = e^{-y} \sum_{k=0}^\infty \frac{y^k}{k!\, \Gamma(k+\alpha)} \int_0^\infty e^{-(\lambda + \frac{1}{\mu})x}\, \frac{x^{k+\alpha-1}}{\mu^{k+\alpha}}\,dx = e^{-y} \sum_{k=0}^\infty \frac{y^k}{k!\,(1+\mu\lambda)^{k+\alpha}} = (1 + \mu\lambda)^{-\alpha}\, e^{-\frac{\mu\lambda}{1+\mu\lambda} y}.$$

By the Fubini theorem we have
$$\int_{(0,\infty)^n} e^{-\sum_{i=1}^n \lambda_i x_i}\, h_{k,C}(x)\,dx = \mathbb{E} \prod_{i=1}^n \int_0^\infty e^{-\lambda_i x_i}\, \frac{1}{\mu}\, g_{k/2}\Big(\frac{x_i}{\mu}, Y_i\Big)\,dx_i = |I_n + \mu\Lambda|^{-k/2}\, \mathbb{E}\, e^{-\sum_{i=1}^n \frac{\mu\lambda_i}{1+\mu\lambda_i} Y_i}.$$
Observe that $Y_i = \sum_{l=1}^k (X_i^{(l)})^2$, where $X^{(l)} := (X_i^{(l)})_{i \le n}$ are independent $N(0, \frac{1}{2\mu} AA^T)$. Therefore by Lemma 4 we have
$$\int_{(0,\infty)^n} e^{-\sum_{i=1}^n \lambda_i x_i}\, h_{k,C}(x)\,dx = |I_n + \mu\Lambda|^{-k/2}\, \Big|I_n + 2\mu\Lambda(I_n + \mu\Lambda)^{-1}\, \frac{1}{2\mu}\, AA^T\Big|^{-k/2} = |I_n + \Lambda C|^{-k/2},$$
where the last equality follows since the diagonal matrices $\Lambda$ and $(I_n + \mu\Lambda)^{-1}$ commute, so
$$|I_n + \mu\Lambda|\, \big|I_n + \Lambda(I_n + \mu\Lambda)^{-1} AA^T\big| = \big|I_n + \mu\Lambda + \Lambda AA^T\big| = |I_n + \Lambda C|.$$
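As a last sanity check, the scalar Laplace-transform identity from the beginning of the proof can be compared against a crude quadrature (our sketch; truncated series and arbitrary constants $\alpha, \mu, \lambda, y$, numpy assumed):

```python
import numpy as np
from math import lgamma, exp, log

def g(alpha, x, y, kmax=80):
    """Truncated series for g_alpha(x, y); x > 0, y > 0."""
    s = 0.0
    for k in range(kmax):
        s += exp((k + alpha - 1) * log(x) + k * log(y)
                 - lgamma(k + alpha) - lgamma(k + 1))
    return exp(-x - y) * s

alpha, mu, lam, y = 1.5, 0.7, 0.9, 2.0
xs = np.linspace(1e-6, 80.0, 40_001)
dx = xs[1] - xs[0]

# Riemann sum for the integral of e^{-lam x} (1/mu) g_alpha(x/mu, y) over (0, inf).
lhs = dx * sum(exp(-lam * x) * g(alpha, x / mu, y) / mu for x in xs)
rhs = (1 + mu * lam) ** (-alpha) * exp(-mu * lam * y / (1 + mu * lam))
print(lhs, rhs)   # the two values should agree to several digits
```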

References
[1] C. Borell, A Gaussian correlation inequality for certain bodies in Rn , Math. Ann. 256
(1981), 569–573.
[2] G. Hargé, A particular case of correlation inequality for the Gaussian measure, Ann.
Probab. 27 (1999), 1939–1951.
[3] C. G. Khatri, On certain inequalities for normal distributions and their applications to
simultaneous confidence bounds, Ann. Math. Statist. 38 (1967), 1853–1867.
[4] Y. Memarian, The Gaussian Correlation Conjecture Proof, arXiv:1310.8099.
[5] L. D. Pitt, A Gaussian correlation inequality for symmetric convex sets, Ann. Probab.
5 (1977), 470–474.
[6] G. Qingyang, The Gaussian Correlation Inequality for Symmetric Convex Sets,
arXiv:1012.0676.
[7] T. Royen, A simple proof of the Gaussian correlation conjecture extended to multivariate
gamma distributions, Far East J. Theor. Stat. 48 (2014), 139–145.
[8] G. Schechtman, T. Schlumprecht, J. Zinn, On the Gaussian measure of the intersection,
Ann. Probab. 26 (1998), 346–357.
[9] Z. Šidák, Rectangular confidence regions for the means of multivariate normal distri-
butions, J. Amer. Statist. Assoc. 62 (1967), 626–633.

Institute of Mathematics
University of Warsaw
Banacha 2
02-097 Warszawa, Poland
rlatala@[Link], ddmatlak@[Link]
