Convex Optimization — Boyd & Vandenberghe
2. Convex sets
• affine and convex sets
• some important examples
• operations that preserve convexity
• generalized inequalities
• separating and supporting hyperplanes
• dual cones and generalized inequalities
Affine set
line through x1, x2: all points
x = θx1 + (1 − θ)x2 (θ ∈ R)
[figure: the line through x1 and x2, with points marked at θ = 1.2, θ = 1 (x1), θ = 0.6, θ = 0 (x2), and θ = −0.2]
affine set: contains the line through any two distinct points in the set
example: solution set of linear equations {x | Ax = b}
(conversely, every affine set can be expressed as solution set of system of
linear equations)
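A small numerical illustration of the example above, offered as a sketch rather than a proof: if two points solve Ax = b, then every point θx1 + (1 − θ)x2 on the line through them also solves it, including for θ outside [0, 1]. The matrix `A`, the vector `b`, and the points `p` and `q` are made-up values chosen only for illustration.

```python
def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def affine_combination(x1, x2, theta):
    """Return theta*x1 + (1 - theta)*x2 for any real theta."""
    return [theta * a + (1.0 - theta) * b for a, b in zip(x1, x2)]

# Hypothetical system: x1 + x2 + x3 = 3 (one equation, three unknowns).
A = [[1.0, 1.0, 1.0]]
b = [3.0]
p = [1.0, 1.0, 1.0]   # satisfies A p = b
q = [3.0, 0.0, 0.0]   # satisfies A q = b

# Residual |Ax - b| at several points on the line through p and q.
residuals = []
for theta in (-0.2, 0.0, 0.6, 1.0, 1.2):
    x = affine_combination(p, q, theta)
    residuals.append(abs(matvec(A, x)[0] - b[0]))

print(residuals)  # all entries are zero up to rounding
```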
Convex sets 2–2
Convex set
line segment between x1 and x2: all points
x = θx1 + (1 − θ)x2
with 0 ≤ θ ≤ 1
convex set: contains line segment between any two points in the set
$x_1, x_2 \in C, \quad 0 \le \theta \le 1 \implies \theta x_1 + (1 - \theta) x_2 \in C$
examples (one convex, two nonconvex sets)
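The definition can be probed numerically. The sketch below uses two illustrative sets that are assumptions of this example, not the sets drawn on the slide: the unit disk (convex) and an annulus (nonconvex). It shows a segment whose endpoints lie in the annulus but whose midpoint does not.

```python
import math

def segment_point(x1, x2, theta):
    """Return theta*x1 + (1 - theta)*x2 for points given as tuples."""
    return tuple(theta * a + (1.0 - theta) * b for a, b in zip(x1, x2))

def in_disk(x):
    """Membership in the unit disk {x : ||x||_2 <= 1}."""
    return math.hypot(x[0], x[1]) <= 1.0

def in_annulus(x):
    """Membership in the annulus {x : 0.5 <= ||x||_2 <= 1} (not convex)."""
    return 0.5 <= math.hypot(x[0], x[1]) <= 1.0

x1, x2 = (-0.8, 0.0), (0.8, 0.0)
midpoint = segment_point(x1, x2, 0.5)

print(in_disk(x1), in_disk(x2), in_disk(midpoint))
print(in_annulus(x1), in_annulus(x2), in_annulus(midpoint))
```

A single failing segment is enough to show a set is not convex; passing checks on sampled segments only suggests convexity.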
Convex combination and convex hull
convex combination of x1, . . . , xk: any point x of the form
x = θ1 x1 + θ2 x2 + · · · + θk xk
with θ1 + · · · + θk = 1, θi ≥ 0
convex hull conv S: set of all convex combinations of points in S
Convex cone
conic (nonnegative) combination of x1 and x2: any point of the form
x = θ1 x1 + θ2 x2
with θ1 ≥ 0, θ2 ≥ 0
[figure: the cone of conic combinations of x1 and x2, with apex at 0]
convex cone: set that contains all conic combinations of points in the set
Hyperplanes and halfspaces
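hyperplane: set of the form $\{x \mid a^T x = b\}$ ($a \neq 0$)
[figure: hyperplane $a^T x = b$ through a point x0, with normal vector a]
halfspace: set of the form $\{x \mid a^T x \le b\}$ ($a \neq 0$)
[figure: the hyperplane through x0 separates the halfspaces $a^T x \ge b$ and $a^T x \le b$]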
• a is the normal vector
• hyperplanes are affine and convex; halfspaces are convex
Euclidean balls and ellipsoids
(Euclidean) ball with center $x_c$ and radius $r$:
$B(x_c, r) = \{x \mid \|x - x_c\|_2 \le r\} = \{x_c + ru \mid \|u\|_2 \le 1\}$
ellipsoid: set of the form
$\{x \mid (x - x_c)^T P^{-1} (x - x_c) \le 1\}$
with $P \in \mathbf{S}^n_{++}$ (i.e., P symmetric positive definite)
[figure: ellipsoid with center $x_c$]
other representation: $\{x_c + Au \mid \|u\|_2 \le 1\}$ with A square and nonsingular
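The two ellipsoid representations can be compared numerically. The sketch below assumes the link P = AAᵀ between them, which the slide does not state explicitly. Under that assumption a point x = xc + Au satisfies (x − xc)ᵀP⁻¹(x − xc) = ‖u‖₂². The 2 × 2 matrix `A` and center `xc` are illustrative values.

```python
import math

def mat_vec(M, v):
    """Multiply a 2x2 matrix (nested lists) by a length-2 vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def times_own_transpose(A):
    """Return A A^T for a 2x2 matrix A."""
    return [[sum(A[i][k] * A[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(M):
    """Inverse of a nonsingular 2x2 matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def quad_form(M, v):
    """Return v^T M v."""
    Mv = mat_vec(M, v)
    return v[0] * Mv[0] + v[1] * Mv[1]

A = [[2.0, 1.0], [0.0, 1.0]]   # illustrative nonsingular matrix
xc = [1.0, -1.0]               # illustrative center
P_inv = inverse_2x2(times_own_transpose(A))  # assumes P = A A^T

# Map points u with ||u||_2 = radius and record (radius, quadratic form).
samples = []
for k in range(12):
    angle = 2.0 * math.pi * k / 12
    for radius in (0.0, 0.5, 1.0):
        u = [radius * math.cos(angle), radius * math.sin(angle)]
        Au = mat_vec(A, u)
        x = [xc[0] + Au[0], xc[1] + Au[1]]
        d = [x[0] - xc[0], x[1] - xc[1]]
        samples.append((radius, quad_form(P_inv, d)))
```

Each recorded quadratic form equals radius squared up to rounding, so points with ‖u‖₂ ≤ 1 land inside the ellipsoid and points with ‖u‖₂ = 1 land on its boundary.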
Norm balls and norm cones
norm: a function $\|\cdot\|$ that satisfies
• $\|x\| \ge 0$; $\|x\| = 0$ if and only if $x = 0$
• $\|tx\| = |t|\,\|x\|$ for $t \in \mathbf{R}$
• $\|x + y\| \le \|x\| + \|y\|$
notation: $\|\cdot\|$ is general (unspecified) norm; $\|\cdot\|_{\mathrm{symb}}$ is particular norm
norm ball with center $x_c$ and radius $r$: $\{x \mid \|x - x_c\| \le r\}$
norm cone: $\{(x, t) \mid \|x\| \le t\}$
Euclidean norm cone is called second-order cone
[figure: second-order cone in R^3, plotted over axes x1, x2 (each from −1 to 1) and t (from 0 to 1)]
norm balls and cones are convex
Polyhedra
solution set of finitely many linear inequalities and equalities
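$Ax \preceq b, \quad Cx = d$
($A \in \mathbf{R}^{m \times n}$, $C \in \mathbf{R}^{p \times n}$, $\preceq$ is componentwise inequality)
[figure: polyhedron P, the intersection of five halfspaces with normal vectors a1, . . . , a5]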
polyhedron is intersection of finite number of halfspaces and hyperplanes
Positive semidefinite cone
notation:
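• $\mathbf{S}^n$ is set of symmetric n × n matrices
• $\mathbf{S}^n_+ = \{X \in \mathbf{S}^n \mid X \succeq 0\}$: positive semidefinite n × n matrices
$X \in \mathbf{S}^n_+ \iff z^T X z \ge 0$ for all z
$\mathbf{S}^n_+$ is a convex cone
• $\mathbf{S}^n_{++} = \{X \in \mathbf{S}^n \mid X \succ 0\}$: positive definite n × n matrices
example: $\begin{bmatrix} x & y \\ y & z \end{bmatrix} \in \mathbf{S}^2_+$
[figure: boundary of $\mathbf{S}^2_+$ plotted in (x, y, z) coordinates]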
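For the 2 × 2 example, membership in the positive semidefinite cone reduces to three scalar inequalities: x ≥ 0, z ≥ 0, and xz ≥ y². The sketch below uses that standard characterization (it is not written on the slide) to check that a conic combination of two positive semidefinite matrices stays in the cone. The matrices and weights are illustrative.

```python
def is_psd_2x2(x, y, z, tol=1e-12):
    """Check [[x, y], [y, z]] >= 0 via x >= 0, z >= 0, x*z - y^2 >= 0."""
    return x >= -tol and z >= -tol and x * z - y * y >= -tol

def conic_combination(m1, m2, t1, t2):
    """Return t1*m1 + t2*m2 for matrices stored as (x, y, z), t1, t2 >= 0."""
    return tuple(t1 * a + t2 * b for a, b in zip(m1, m2))

m1 = (1.0, 1.0, 1.0)    # rank-one PSD matrix [[1, 1], [1, 1]]
m2 = (2.0, -1.0, 0.5)   # PSD matrix on the boundary of the cone
combo = conic_combination(m1, m2, 0.3, 2.0)

print(is_psd_2x2(*m1), is_psd_2x2(*m2), is_psd_2x2(*combo))
print(is_psd_2x2(1.0, 2.0, 1.0))  # [[1, 2], [2, 1]] is indefinite
```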
Operations that preserve convexity
practical methods for establishing convexity of a set C
1. apply definition
$x_1, x_2 \in C, \quad 0 \le \theta \le 1 \implies \theta x_1 + (1 - \theta) x_2 \in C$
2. show that C is obtained from simple convex sets (hyperplanes,
halfspaces, norm balls, . . . ) by operations that preserve convexity
• intersection
• affine functions
• perspective function
• linear-fractional functions
Intersection
the intersection of (any number of) convex sets is convex
example:
$S = \{x \in \mathbf{R}^m \mid |p(t)| \le 1 \text{ for } |t| \le \pi/3\}$
where $p(t) = x_1 \cos t + x_2 \cos 2t + \cdots + x_m \cos mt$
for m = 2:
[figure: left panel, p(t) plotted against t for 0 ≤ t ≤ π; right panel, the set S in the (x1, x2) plane]
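The set S is an intersection of infinitely many slabs, one for each t. A rough membership test can sample t on a grid, as sketched below. The grid size and tolerance are assumptions of this example, and a finite grid can only approximate the condition "for all |t| ≤ π/3".

```python
import math

def p(x, t):
    """Trigonometric polynomial p(t) = x1 cos t + x2 cos 2t + ... + xm cos mt."""
    return sum(xk * math.cos((k + 1) * t) for k, xk in enumerate(x))

def in_S(x, samples=2001):
    """Approximate test of |p(t)| <= 1 on a grid of t in [-pi/3, pi/3]."""
    for i in range(samples):
        t = -math.pi / 3 + (2.0 * math.pi / 3) * i / (samples - 1)
        if abs(p(x, t)) > 1.0 + 1e-12:
            return False
    return True

a = [1.0, 0.0]          # p(t) = cos t
b = [0.0, 1.0]          # p(t) = cos 2t
mid = [0.5, 0.5]        # midpoint of a and b

print(in_S(a), in_S(b), in_S(mid))
print(in_S([2.0, 0.0]))  # p(0) = 2 violates the bound
```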
Affine function
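suppose $f : \mathbf{R}^n \to \mathbf{R}^m$ is affine ($f(x) = Ax + b$ with $A \in \mathbf{R}^{m \times n}$, $b \in \mathbf{R}^m$)
• the image of a convex set under f is convex
$S \subseteq \mathbf{R}^n$ convex $\implies f(S) = \{f(x) \mid x \in S\}$ convex
• the inverse image $f^{-1}(C)$ of a convex set under f is convex
$C \subseteq \mathbf{R}^m$ convex $\implies f^{-1}(C) = \{x \in \mathbf{R}^n \mid f(x) \in C\}$ convex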
examples
• scaling, translation, projection
• solution set of linear matrix inequality $\{x \mid x_1 A_1 + \cdots + x_m A_m \preceq B\}$
(with $A_i, B \in \mathbf{S}^p$)
• hyperbolic cone $\{x \mid x^T P x \le (c^T x)^2,\ c^T x \ge 0\}$ (with $P \in \mathbf{S}^n_+$)
Perspective and linear-fractional function
perspective function $P : \mathbf{R}^{n+1} \to \mathbf{R}^n$:
$P(x, t) = x/t, \quad \operatorname{dom} P = \{(x, t) \mid t > 0\}$
images and inverse images of convex sets under perspective are convex
linear-fractional function $f : \mathbf{R}^n \to \mathbf{R}^m$:
$f(x) = \dfrac{Ax + b}{c^T x + d}, \quad \operatorname{dom} f = \{x \mid c^T x + d > 0\}$
images and inverse images of convex sets under linear-fractional functions
are convex
example of a linear-fractional function
$f(x) = \dfrac{1}{x_1 + x_2 + 1}\, x$
[figure: left panel, a set C in the (x1, x2) plane; right panel, its image f(C); both axes run from −1 to 1]
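One way to see why this map preserves convexity is that it sends line segments to line segments, although not at uniform speed. The sketch below checks this for one segment under f(x) = x/(x1 + x2 + 1). The endpoints `a` and `b` are illustrative choices inside dom f.

```python
def linear_fractional(x):
    """f(x) = x / (x1 + x2 + 1), defined where x1 + x2 + 1 > 0."""
    denom = x[0] + x[1] + 1.0
    if denom <= 0.0:
        raise ValueError("x is outside dom f")
    return (x[0] / denom, x[1] / denom)

def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b); zero when collinear."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

a = (-0.2, 0.5)
b = (1.0, 0.0)
mid = (0.5 * (a[0] + b[0]), 0.5 * (a[1] + b[1]))

fa, fb, fmid = linear_fractional(a), linear_fractional(b), linear_fractional(mid)

# The image of the midpoint lies on the segment between the images ...
collinearity = cross(fa, fb, fmid)
# ... but is generally not the midpoint of the images.
image_midpoint = (0.5 * (fa[0] + fb[0]), 0.5 * (fa[1] + fb[1]))
```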
Generalized inequalities
a convex cone K ⊆ Rn is a proper cone if
• K is closed (contains its boundary)
• K is solid (has nonempty interior)
• K is pointed (contains no line)
examples
• nonnegative orthant $K = \mathbf{R}^n_+ = \{x \in \mathbf{R}^n \mid x_i \ge 0,\ i = 1, \ldots, n\}$
• positive semidefinite cone $K = \mathbf{S}^n_+$
• nonnegative polynomials on [0, 1]:
$K = \{x \in \mathbf{R}^n \mid x_1 + x_2 t + x_3 t^2 + \cdots + x_n t^{n-1} \ge 0 \text{ for } t \in [0, 1]\}$
generalized inequality defined by a proper cone K:
$x \preceq_K y \iff y - x \in K, \qquad x \prec_K y \iff y - x \in \operatorname{int} K$
examples
• componentwise inequality ($K = \mathbf{R}^n_+$)
$x \preceq_{\mathbf{R}^n_+} y \iff x_i \le y_i,\ i = 1, \ldots, n$
• matrix inequality ($K = \mathbf{S}^n_+$)
$X \preceq_{\mathbf{S}^n_+} Y \iff Y - X$ positive semidefinite
these two types are so common that we drop the subscript in $\preceq_K$
properties: many properties of $\preceq_K$ are similar to ≤ on R, e.g.,
$x \preceq_K y,\ u \preceq_K v \implies x + u \preceq_K y + v$
3. Convex functions
• basic properties and examples
• operations that preserve convexity
• the conjugate function
• quasiconvex functions
• log-concave and log-convex functions
• convexity with respect to generalized inequalities
Definition
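$f : \mathbf{R}^n \to \mathbf{R}$ is convex if dom f is a convex set and
$f(\theta x + (1 - \theta) y) \le \theta f(x) + (1 - \theta) f(y)$
for all $x, y \in \operatorname{dom} f$, $0 \le \theta \le 1$
[figure: the chord between (x, f(x)) and (y, f(y)) lies above the graph of f]
• f is concave if −f is convex
• f is strictly convex if dom f is convex and
$f(\theta x + (1 - \theta) y) < \theta f(x) + (1 - \theta) f(y)$
for $x, y \in \operatorname{dom} f$, $x \neq y$, $0 < \theta < 1$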
Convex functions 3–2
Examples on R
convex:
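• affine: $ax + b$ on R, for any $a, b \in \mathbf{R}$
• exponential: $e^{ax}$, for any $a \in \mathbf{R}$
• powers: $x^\alpha$ on $\mathbf{R}_{++}$, for $\alpha \ge 1$ or $\alpha \le 0$
• powers of absolute value: $|x|^p$ on R, for $p \ge 1$
• negative entropy: $x \log x$ on $\mathbf{R}_{++}$
concave:
• affine: $ax + b$ on R, for any $a, b \in \mathbf{R}$
• powers: $x^\alpha$ on $\mathbf{R}_{++}$, for $0 \le \alpha \le 1$
• logarithm: $\log x$ on $\mathbf{R}_{++}$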
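The examples above can be spot-checked against the defining inequality on a grid of sample points. The sketch below is a necessary test only: passing it does not prove convexity, while failing it disproves convexity. The helper name `passes_convexity_check`, the grid, and the tolerance are choices made for this illustration.

```python
import math

def passes_convexity_check(f, xs, thetas=(0.1, 0.25, 0.5, 0.75, 0.9)):
    """Test f(th*x + (1-th)*y) <= th*f(x) + (1-th)*f(y) on all sample pairs."""
    for x in xs:
        for y in xs:
            for theta in thetas:
                lhs = f(theta * x + (1.0 - theta) * y)
                rhs = theta * f(x) + (1.0 - theta) * f(y)
                # Relative tolerance guards against rounding when lhs == rhs.
                if lhs > rhs + 1e-9 * (1.0 + abs(rhs)):
                    return False
    return True

grid = [0.1 * k for k in range(1, 51)]   # sample points in R++

results = {
    "exp(2x)": passes_convexity_check(lambda x: math.exp(2.0 * x), grid),
    "x^1.5": passes_convexity_check(lambda x: x ** 1.5, grid),
    "x^-1": passes_convexity_check(lambda x: 1.0 / x, grid),
    "x log x": passes_convexity_check(lambda x: x * math.log(x), grid),
    "-log x": passes_convexity_check(lambda x: -math.log(x), grid),
    "sqrt(x)": passes_convexity_check(math.sqrt, grid),  # concave, should fail
}
print(results)
```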
Examples on Rn and Rm×n
affine functions are convex and concave; all norms are convex
examples on $\mathbf{R}^n$
• affine function $f(x) = a^T x + b$
• norms: $\|x\|_p = \left(\sum_{i=1}^n |x_i|^p\right)^{1/p}$ for $p \ge 1$; $\|x\|_\infty = \max_k |x_k|$
examples on $\mathbf{R}^{m \times n}$ (m × n matrices)
• affine function
$f(X) = \operatorname{tr}(A^T X) + b = \sum_{i=1}^m \sum_{j=1}^n A_{ij} X_{ij} + b$
• spectral (maximum singular value) norm
$f(X) = \|X\|_2 = \sigma_{\max}(X) = (\lambda_{\max}(X^T X))^{1/2}$
Restriction of a convex function to a line
$f : \mathbf{R}^n \to \mathbf{R}$ is convex if and only if the function $g : \mathbf{R} \to \mathbf{R}$,
$g(t) = f(x + tv), \quad \operatorname{dom} g = \{t \mid x + tv \in \operatorname{dom} f\}$
is convex (in t) for any $x \in \operatorname{dom} f$, $v \in \mathbf{R}^n$
can check convexity of f by checking convexity of functions of one variable
example. $f : \mathbf{S}^n \to \mathbf{R}$ with $f(X) = \log\det X$, $\operatorname{dom} f = \mathbf{S}^n_{++}$
$g(t) = \log\det(X + tV) = \log\det X + \log\det(I + tX^{-1/2} V X^{-1/2})$
$= \log\det X + \sum_{i=1}^n \log(1 + t\lambda_i)$
where $\lambda_i$ are the eigenvalues of $X^{-1/2} V X^{-1/2}$
g is concave in t (for any choice of $X \succ 0$, V); hence f is concave
Extended-value extension
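extended-value extension $\tilde f$ of f is
$\tilde f(x) = f(x),\ x \in \operatorname{dom} f, \qquad \tilde f(x) = \infty,\ x \notin \operatorname{dom} f$
often simplifies notation; for example, the condition
$0 \le \theta \le 1 \implies \tilde f(\theta x + (1 - \theta) y) \le \theta \tilde f(x) + (1 - \theta) \tilde f(y)$
(as an inequality in $\mathbf{R} \cup \{\infty\}$) means the same as the two conditions
• dom f is convex
• for $x, y \in \operatorname{dom} f$,
$0 \le \theta \le 1 \implies f(\theta x + (1 - \theta) y) \le \theta f(x) + (1 - \theta) f(y)$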
First-order condition
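f is differentiable if dom f is open and the gradient
$\nabla f(x) = \left( \dfrac{\partial f(x)}{\partial x_1}, \dfrac{\partial f(x)}{\partial x_2}, \ldots, \dfrac{\partial f(x)}{\partial x_n} \right)$
exists at each $x \in \operatorname{dom} f$
1st-order condition: differentiable f with convex domain is convex iff
$f(y) \ge f(x) + \nabla f(x)^T (y - x)$ for all $x, y \in \operatorname{dom} f$
[figure: graph of f(y) lying above the tangent $f(x) + \nabla f(x)^T (y - x)$, which touches the graph at (x, f(x))]
first-order approximation of f is global underestimator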
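The underestimator property is easy to test numerically. The sketch below uses an illustrative convex quadratic on R², f(x) = x1² + x1x2 + x2², which is an assumption of this example and does not come from the slide. It evaluates the gap f(y) − f(x) − ∇f(x)ᵀ(y − x) at random pairs of points.

```python
import random

def f(x):
    """Illustrative convex quadratic on R^2."""
    return x[0] ** 2 + x[0] * x[1] + x[1] ** 2

def grad_f(x):
    """Gradient of f, computed by hand."""
    return (2.0 * x[0] + x[1], x[0] + 2.0 * x[1])

def first_order_gap(x, y):
    """f(y) minus the first-order approximation of f around x, evaluated at y."""
    g = grad_f(x)
    linear = f(x) + g[0] * (y[0] - x[0]) + g[1] * (y[1] - x[1])
    return f(y) - linear

rng = random.Random(0)   # fixed seed so the run is repeatable
gaps = []
for _ in range(1000):
    x = (rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0))
    y = (rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0))
    gaps.append(first_order_gap(x, y))

print(min(gaps))  # nonnegative up to rounding
```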
Second-order conditions
f is twice differentiable if dom f is open and the Hessian $\nabla^2 f(x) \in \mathbf{S}^n$,
$\nabla^2 f(x)_{ij} = \dfrac{\partial^2 f(x)}{\partial x_i \partial x_j}, \quad i, j = 1, \ldots, n,$
exists at each $x \in \operatorname{dom} f$
2nd-order conditions: for twice differentiable f with convex domain
• f is convex if and only if
$\nabla^2 f(x) \succeq 0$ for all $x \in \operatorname{dom} f$
• if $\nabla^2 f(x) \succ 0$ for all $x \in \operatorname{dom} f$, then f is strictly convex
Examples
quadratic function: $f(x) = (1/2) x^T P x + q^T x + r$ (with $P \in \mathbf{S}^n$)
$\nabla f(x) = Px + q, \qquad \nabla^2 f(x) = P$
convex if $P \succeq 0$
least-squares objective: $f(x) = \|Ax - b\|_2^2$
$\nabla f(x) = 2A^T(Ax - b), \qquad \nabla^2 f(x) = 2A^T A$
convex (for any A)
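quadratic-over-linear: $f(x, y) = x^2/y$
$\nabla^2 f(x, y) = \dfrac{2}{y^3} \begin{bmatrix} y \\ -x \end{bmatrix} \begin{bmatrix} y \\ -x \end{bmatrix}^T \succeq 0$
convex for y > 0
[figure: surface plot of f(x, y) against x and y]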
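The Hessian formula for the quadratic-over-linear function f(x, y) = x²/y can be cross-checked against central finite differences. In the sketch below, the test point (1, 2) and the step size `h` are arbitrary illustrative choices.

```python
def f(x, y):
    """Quadratic-over-linear function, defined for y > 0."""
    return x * x / y

def hessian_formula(x, y):
    """(2 / y^3) * [y, -x] [y, -x]^T, returned as a 2x2 nested list."""
    c = 2.0 / y ** 3
    return [[c * y * y, -c * x * y],
            [-c * x * y, c * x * x]]

def hessian_finite_difference(x, y, h=1e-3):
    """Central-difference approximation of the Hessian of f at (x, y)."""
    fxx = (f(x + h, y) - 2.0 * f(x, y) + f(x - h, y)) / (h * h)
    fyy = (f(x, y + h) - 2.0 * f(x, y) + f(x, y - h)) / (h * h)
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4.0 * h * h)
    return [[fxx, fxy], [fxy, fyy]]

x0, y0 = 1.0, 2.0
H = hessian_formula(x0, y0)
H_fd = hessian_finite_difference(x0, y0)

max_error = max(abs(H[i][j] - H_fd[i][j]) for i in range(2) for j in range(2))
# A rank-one outer product has zero determinant and nonnegative diagonal.
determinant = H[0][0] * H[1][1] - H[0][1] * H[1][0]
```

The zero determinant together with a nonnegative diagonal confirms that the Hessian is positive semidefinite but singular at this point.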
log-sum-exp: $f(x) = \log \sum_{k=1}^n \exp x_k$ is convex
$\nabla^2 f(x) = \dfrac{1}{\mathbf{1}^T z} \operatorname{diag}(z) - \dfrac{1}{(\mathbf{1}^T z)^2} z z^T \qquad (z_k = \exp x_k)$
to show $\nabla^2 f(x) \succeq 0$, we must verify that $v^T \nabla^2 f(x) v \ge 0$ for all v:
$v^T \nabla^2 f(x) v = \dfrac{(\sum_k z_k v_k^2)(\sum_k z_k) - (\sum_k v_k z_k)^2}{(\sum_k z_k)^2} \ge 0$
since $(\sum_k v_k z_k)^2 \le (\sum_k z_k v_k^2)(\sum_k z_k)$ (from Cauchy-Schwarz inequality)
geometric mean: $f(x) = (\prod_{k=1}^n x_k)^{1/n}$ on $\mathbf{R}^n_{++}$ is concave
(similar proof as for log-sum-exp)
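A sketch of log-sum-exp and of the quadratic form vᵀ∇²f(x)v from the argument above. Subtracting max(x) before exponentiating is a common numerical-stability device and is not part of the slide. It leaves the quadratic form unchanged, because rescaling z multiplies the numerator and denominator by the same factor.

```python
import math
import random

def log_sum_exp(x):
    """log(sum_k exp(x_k)), shifted by max(x) to avoid overflow."""
    m = max(x)
    return m + math.log(sum(math.exp(xk - m) for xk in x))

def hessian_quadratic_form(x, v):
    """v^T H v, where H is the Hessian of log-sum-exp at x."""
    m = max(x)
    z = [math.exp(xk - m) for xk in x]   # z rescaled by exp(-m)
    sum_z = sum(z)
    sum_zv2 = sum(zk * vk * vk for zk, vk in zip(z, v))
    sum_zv = sum(zk * vk for zk, vk in zip(z, v))
    return (sum_zv2 * sum_z - sum_zv * sum_zv) / (sum_z * sum_z)

rng = random.Random(1)   # fixed seed so the run is repeatable
values = []
for _ in range(200):
    x = [rng.uniform(-5.0, 5.0) for _ in range(5)]
    v = [rng.uniform(-1.0, 1.0) for _ in range(5)]
    values.append(hessian_quadratic_form(x, v))

print(min(values))                    # nonnegative up to rounding
print(log_sum_exp([1000.0, 1000.0]))  # no overflow thanks to the shift
```

The quadratic form vanishes when v is a multiple of the all-ones vector, which is the equality case of Cauchy-Schwarz and shows the Hessian is singular.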
Epigraph and sublevel set
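α-sublevel set of $f : \mathbf{R}^n \to \mathbf{R}$:
$C_\alpha = \{x \in \operatorname{dom} f \mid f(x) \le \alpha\}$
sublevel sets of convex functions are convex (converse is false)
epigraph of $f : \mathbf{R}^n \to \mathbf{R}$:
$\operatorname{epi} f = \{(x, t) \in \mathbf{R}^{n+1} \mid x \in \operatorname{dom} f,\ f(x) \le t\}$
[figure: epi f, the region on and above the graph of f]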
f is convex if and only if epi f is a convex set
Jensen’s inequality
basic inequality: if f is convex, then for 0 ≤ θ ≤ 1,
f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y)
extension: if f is convex, then
f (E z) ≤ E f (z)
for any random variable z
basic inequality is special case with discrete distribution
prob(z = x) = θ, prob(z = y) = 1 − θ
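A small numerical instance of the extension, using a discrete random variable with three outcomes and f = exp. The outcomes and probabilities are invented for illustration.

```python
import math

def expectation(values, probs):
    """Expected value of a discrete random variable."""
    return sum(p * v for p, v in zip(probs, values))

f = math.exp                      # a convex function
outcomes = [-1.0, 0.0, 2.0]       # illustrative values of z
probs = [0.2, 0.5, 0.3]           # illustrative probabilities, summing to 1

f_of_mean = f(expectation(outcomes, probs))                 # f(E z)
mean_of_f = expectation([f(z) for z in outcomes], probs)    # E f(z)

print(f_of_mean, mean_of_f)  # Jensen: the first is no larger than the second
```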
Operations that preserve convexity
practical methods for establishing convexity of a function
1. verify definition (often simplified by restricting to a line)
2. for twice differentiable functions, show $\nabla^2 f(x) \succeq 0$
3. show that f is obtained from simple convex functions by operations
that preserve convexity
• nonnegative weighted sum
• composition with affine function
• pointwise maximum and supremum
• composition
• minimization
• perspective
Positive weighted sum & composition with affine function
nonnegative multiple: αf is convex if f is convex, α ≥ 0
sum: f1 + f2 convex if f1, f2 convex (extends to infinite sums, integrals)
composition with affine function: f (Ax + b) is convex if f is convex
examples
• log barrier for linear inequalities
$f(x) = -\sum_{i=1}^m \log(b_i - a_i^T x), \quad \operatorname{dom} f = \{x \mid a_i^T x < b_i,\ i = 1, \ldots, m\}$
• (any) norm of affine function: $f(x) = \|Ax + b\|$
Pointwise maximum
if f1, . . . , fm are convex, then f (x) = max{f1(x), . . . , fm(x)} is convex
examples
• piecewise-linear function: $f(x) = \max_{i=1,\ldots,m} (a_i^T x + b_i)$ is convex
• sum of r largest components of $x \in \mathbf{R}^n$:
$f(x) = x_{[1]} + x_{[2]} + \cdots + x_{[r]}$
is convex ($x_{[i]}$ is ith largest component of x)
proof:
$f(x) = \max\{x_{i_1} + x_{i_2} + \cdots + x_{i_r} \mid 1 \le i_1 < i_2 < \cdots < i_r \le n\}$
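The proof idea can be mirrored in code: the sum of the r largest components equals the maximum, over all index subsets of size r, of a linear function of x. The sketch below compares a sorting-based implementation with that brute-force maximum on small illustrative vectors.

```python
from itertools import combinations

def sum_r_largest(x, r):
    """Sum of the r largest components of x, computed by sorting."""
    return sum(sorted(x, reverse=True)[:r])

def sum_r_largest_as_max(x, r):
    """Same value, written as a pointwise maximum of linear functions of x."""
    return max(sum(x[i] for i in idx)
               for idx in combinations(range(len(x)), r))

x = [3.0, -1.0, 4.0, 1.5, 0.0]
y = [0.0, 2.0, -2.0, 5.0, 1.0]
r = 2
mid = [0.5 * (a + b) for a, b in zip(x, y)]

print(sum_r_largest(x, r), sum_r_largest_as_max(x, r))
# Convexity at the midpoint: f(mid) <= (f(x) + f(y)) / 2
print(sum_r_largest(mid, r), 0.5 * (sum_r_largest(x, r) + sum_r_largest(y, r)))
```

The brute-force version enumerates all subsets of size r, so it is useful only as a check on small vectors.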
Pointwise supremum
if f(x, y) is convex in x for each $y \in A$, then
$g(x) = \sup_{y \in A} f(x, y)$
is convex
examples
• support function of a set C: $S_C(x) = \sup_{y \in C} y^T x$ is convex
• distance to farthest point in a set C:
$f(x) = \sup_{y \in C} \|x - y\|$
• maximum eigenvalue of symmetric matrix: for $X \in \mathbf{S}^n$,
$\lambda_{\max}(X) = \sup_{\|y\|_2 = 1} y^T X y$
Composition with scalar functions
composition of $g : \mathbf{R}^n \to \mathbf{R}$ and $h : \mathbf{R} \to \mathbf{R}$:
$f(x) = h(g(x))$
f is convex if either
  g convex, h convex, $\tilde h$ nondecreasing
  g concave, h convex, $\tilde h$ nonincreasing
• proof (for n = 1, differentiable g, h)
$f''(x) = h''(g(x)) g'(x)^2 + h'(g(x)) g''(x)$
• note: monotonicity must hold for extended-value extension $\tilde h$
examples
• exp g(x) is convex if g is convex
• 1/g(x) is convex if g is concave and positive
Vector composition
composition of $g : \mathbf{R}^n \to \mathbf{R}^k$ and $h : \mathbf{R}^k \to \mathbf{R}$:
$f(x) = h(g(x)) = h(g_1(x), g_2(x), \ldots, g_k(x))$
f is convex if either
  $g_i$ convex, h convex, $\tilde h$ nondecreasing in each argument
  $g_i$ concave, h convex, $\tilde h$ nonincreasing in each argument
proof (for n = 1, differentiable g, h)
$f''(x) = g'(x)^T \nabla^2 h(g(x)) g'(x) + \nabla h(g(x))^T g''(x)$
examples
• $\sum_{i=1}^m \log g_i(x)$ is concave if $g_i$ are concave and positive
• $\log \sum_{i=1}^m \exp g_i(x)$ is convex if $g_i$ are convex
Minimization
if f(x, y) is convex in (x, y) and C is a convex set, then
$g(x) = \inf_{y \in C} f(x, y)$
is convex
examples
• $f(x, y) = x^T A x + 2 x^T B y + y^T C y$ with
$\begin{bmatrix} A & B \\ B^T & C \end{bmatrix} \succeq 0, \qquad C \succ 0$
minimizing over y gives $g(x) = \inf_y f(x, y) = x^T (A - B C^{-1} B^T) x$
g is convex, hence Schur complement $A - B C^{-1} B^T \succeq 0$
• distance to a set: $\operatorname{dist}(x, S) = \inf_{y \in S} \|x - y\|$ is convex if S is convex
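The Schur complement example can be worked through in the scalar case, where A, B, C are numbers a, b, c with c > 0. Setting the derivative in y to zero gives the minimizer y* = −bx/c, and substituting it back yields g(x) = (a − b²/c)x². The coefficients below are illustrative.

```python
def f(x, y, a, b, c):
    """Scalar version of x^T A x + 2 x^T B y + y^T C y."""
    return a * x * x + 2.0 * b * x * y + c * y * y

def partial_min(x, a, b, c):
    """Minimize f over y in closed form (requires c > 0): y* = -b*x/c."""
    y_star = -b * x / c
    return f(x, y_star, a, b, c), y_star

a, b, c = 2.0, 1.0, 1.0      # [[a, b], [b, c]] is PSD and c > 0
schur = a - b * b / c        # scalar Schur complement

x = 1.5
g_value, y_star = partial_min(x, a, b, c)

print(g_value, schur * x * x)   # the two values agree
# Perturbing y away from y* cannot decrease f.
print(f(x, y_star + 0.1, a, b, c) >= g_value,
      f(x, y_star - 0.1, a, b, c) >= g_value)
```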
Perspective
the perspective of a function $f : \mathbf{R}^n \to \mathbf{R}$ is the function $g : \mathbf{R}^n \times \mathbf{R} \to \mathbf{R}$,
$g(x, t) = t f(x/t), \quad \operatorname{dom} g = \{(x, t) \mid x/t \in \operatorname{dom} f,\ t > 0\}$
g is convex if f is convex
examples
• $f(x) = x^T x$ is convex; hence $g(x, t) = x^T x / t$ is convex for t > 0
• negative logarithm $f(x) = -\log x$ is convex; hence relative entropy
$g(x, t) = t \log t - t \log x$ is convex on $\mathbf{R}^2_{++}$
• if f is convex, then
$g(x) = (c^T x + d)\, f\big((Ax + b)/(c^T x + d)\big)$
is convex on $\{x \mid c^T x + d > 0,\ (Ax + b)/(c^T x + d) \in \operatorname{dom} f\}$
The conjugate function
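the conjugate of a function f is
$f^*(y) = \sup_{x \in \operatorname{dom} f} (y^T x - f(x))$
[figure: graph of f(x) together with the line xy; the point (0, −f*(y)) is marked on the vertical axis]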
• f ∗ is convex (even if f is not)
• will be useful in chapter 5
examples
• negative logarithm $f(x) = -\log x$
$f^*(y) = \sup_{x > 0} (xy + \log x) = \begin{cases} -1 - \log(-y) & y < 0 \\ \infty & \text{otherwise} \end{cases}$
• strictly convex quadratic $f(x) = (1/2) x^T Q x$ with $Q \in \mathbf{S}^n_{++}$
$f^*(y) = \sup_x (y^T x - (1/2) x^T Q x) = \dfrac{1}{2} y^T Q^{-1} y$
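The closed form for the conjugate of the negative logarithm can be compared with a direct numerical maximization of xy + log x over x > 0. The sketch below uses a ternary search, which is a choice made for this example and relies on the objective being concave in x. The search bracket [1e-6, 1e3] is an assumption that holds for the values of y tested here, since the maximizer is x = −1/y.

```python
import math

def conjugate_formula(y):
    """Closed form of f*(y) for f(x) = -log x."""
    return -1.0 - math.log(-y) if y < 0 else math.inf

def objective(x, y):
    """y*x - f(x) with f(x) = -log x, that is, x*y + log x."""
    return x * y + math.log(x)

def conjugate_numeric(y, lo=1e-6, hi=1e3, iters=200):
    """Ternary search for the sup over x; assumes y < 0 and maximizer in [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1, y) < objective(m2, y):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi), y)

comparison = [(y, conjugate_numeric(y), conjugate_formula(y))
              for y in (-0.5, -2.0, -10.0)]
for row in comparison:
    print(row)
```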
Quasiconvex functions
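$f : \mathbf{R}^n \to \mathbf{R}$ is quasiconvex if dom f is convex and the sublevel sets
$S_\alpha = \{x \in \operatorname{dom} f \mid f(x) \le \alpha\}$
are convex for all α
[figure: a quasiconvex function on R with points a, b, c marked on the horizontal axis]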
• f is quasiconcave if −f is quasiconvex
• f is quasilinear if it is quasiconvex and quasiconcave
Examples
• $\sqrt{|x|}$ is quasiconvex on R
• $\operatorname{ceil}(x) = \inf\{z \in \mathbf{Z} \mid z \ge x\}$ is quasilinear
• $\log x$ is quasilinear on $\mathbf{R}_{++}$
• $f(x_1, x_2) = x_1 x_2$ is quasiconcave on $\mathbf{R}^2_{++}$
• linear-fractional function
$f(x) = \dfrac{a^T x + b}{c^T x + d}, \quad \operatorname{dom} f = \{x \mid c^T x + d > 0\}$
is quasilinear
• distance ratio
$f(x) = \dfrac{\|x - a\|_2}{\|x - b\|_2}, \quad \operatorname{dom} f = \{x \mid \|x - a\|_2 \le \|x - b\|_2\}$
is quasiconvex
internal rate of return
• cash flow $x = (x_0, \ldots, x_n)$; $x_i$ is payment in period i (to us if $x_i > 0$)
• we assume $x_0 < 0$ and $x_0 + x_1 + \cdots + x_n > 0$
• present value of cash flow x, for interest rate r:
$\mathrm{PV}(x, r) = \sum_{i=0}^n (1 + r)^{-i} x_i$
• internal rate of return is smallest interest rate for which PV(x, r) = 0:
$\mathrm{IRR}(x) = \inf\{r \ge 0 \mid \mathrm{PV}(x, r) = 0\}$
IRR is quasiconcave: superlevel set is intersection of open halfspaces
$\mathrm{IRR}(x) \ge R \iff \sum_{i=0}^n (1 + r)^{-i} x_i > 0 \text{ for } 0 \le r < R$
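A sketch of how PV and IRR might be computed for a concrete cash flow. The root-finding strategy (a grid scan for the first sign change, then bisection) is an assumption of this example, not a method given in the slides, and a coarse grid could step over a pair of closely spaced roots. The cash flow and the search limit `r_max` are illustrative.

```python
def present_value(x, r):
    """PV(x, r) = sum_i (1 + r)^(-i) * x_i."""
    return sum(xi / (1.0 + r) ** i for i, xi in enumerate(x))

def internal_rate_of_return(x, r_max=10.0, steps=10000, tol=1e-12):
    """Smallest r in [0, r_max] with PV(x, r) = 0, up to grid resolution.

    Assumes x[0] < 0 and sum(x) > 0, so that PV(x, 0) > 0.
    """
    lo = 0.0
    for k in range(1, steps + 1):
        hi = r_max * k / steps
        if present_value(x, hi) <= 0.0:
            break                      # first sign change lies in (lo, hi]
        lo = hi
    else:
        raise ValueError("no root found in [0, r_max]")
    while hi - lo > tol:               # bisection keeps PV(lo) > 0 >= PV(hi)
        mid = 0.5 * (lo + hi)
        if present_value(x, mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return hi

cash_flow = [-100.0, 60.0, 60.0]       # pay 100 now, receive 60 twice
irr = internal_rate_of_return(cash_flow)

print(irr, present_value(cash_flow, irr))
# PV stays positive for rates below the IRR, as in the superlevel-set condition.
print(all(present_value(cash_flow, irr * k / 10.0) > 0.0 for k in range(10)))
```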