Robust Optimization
• definitions of robust optimization
• robust linear programs
• robust cone programs
• chance constraints
EE364b, Stanford University
Robust optimization
convex objective f0 : Rⁿ → R, uncertainty set U, and fᵢ : Rⁿ × U → R with
x ↦ fᵢ(x, u) convex for every u ∈ U
general form
minimize f0(x)
subject to fi(x, u) ≤ 0 for all u ∈ U , i = 1, . . . , m.
equivalent to
minimize f0(x)
subject to sup_{u∈U} fᵢ(x, u) ≤ 0, i = 1, . . . , m.
• Bertsimas, Ben-Tal, El-Ghaoui, Nemirovski (1990s–now)
Setting up robust problem
• can always replace an uncertain objective f0(x, u) with sup_{u∈U} f0(x, u), and rewrite in
epigraph form:
  minimize t
  subject to sup_{u∈U} f0(x, u) ≤ t, sup_{u∈U} fᵢ(x, u) ≤ 0, i = 1, . . . , m
• equality constraints make no sense: a robust equality aT (x + u) = b for
all u ∈ U ?
three questions:
• is robust formulation useful?
• is robust formulation computable?
• how should we choose U ?
Example failure for linear programming
c = (100, 199.9, −5500, −6100),

A = ⎡ −.01   −.02    .5     .6 ⎤        b = (0, 1000, 2000, 800, 100000, 0, 0, 0, 0)
    ⎢    1      1     0      0 ⎥
    ⎢    0      0    90    100 ⎥
    ⎢    0      0    40     50 ⎥
    ⎢  100  199.9   700    800 ⎥
    ⎣           −I4            ⎦
c is the vector of costs/profits for two raw materials and two drugs; the constraints Ax ⪯ b limit production
• what happens if we vary the percentages .01, .02 (chemical composition of the
raw materials) by .5% and 2%, i.e. .01 ± .00005 and .02 ± .0004?
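As a sanity check on the nominal data above, the LP can be solved with scipy.optimize.linprog; this is a minimal sketch, assuming the variable order (RawI, RawII, DrugI, DrugII) and handling the −I4 block through nonnegativity bounds:

```python
import numpy as np
from scipy.optimize import linprog

# cost vector: raw-material costs positive, net drug revenues negative
c = np.array([100.0, 199.9, -5500.0, -6100.0])

# the five structural rows of A (x >= 0 is handled by bounds below)
A = np.array([
    [-0.01, -0.02,   0.5,   0.6],   # active-agent balance
    [ 1.0,   1.0,    0.0,   0.0],   # raw-material storage
    [ 0.0,   0.0,   90.0, 100.0],   # manpower
    [ 0.0,   0.0,   40.0,  50.0],   # equipment
    [100.0, 199.9, 700.0, 800.0],   # budget
])
b = np.array([0.0, 1000.0, 2000.0, 800.0, 100000.0])

res = linprog(c, A_ub=A, b_ub=b, bounds=(0, None))
profit = -res.fun  # nominal optimal profit, roughly 8820
```

The nominal profit comes out to roughly 8820; the slides' point is that tiny perturbations of the .01/.02 entries wipe out much of it.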
Example failure for linear programming
[histogram: frequency vs. relative change in profit under the perturbed data]
Frequently lose 15–20% of profits
Alternative robust LP
minimize cᵀx
subject to (A + ∆)x ⪯ b for all ∆ ∈ U
where U = {∆ | |∆₁₁| ≤ .00005, |∆₁₂| ≤ .0004, ∆ᵢⱼ = 0 otherwise}
• solution xrobust has degradation provably no worse than 6%
How to choose uncertainty sets
• uncertainty set U a modeling choice
• common idea: model the uncertain parameter as a random variable U and ask for constraints such that
  Prob(fᵢ(x, U) ≥ 0) ≤ ǫ    (1)
• typically hard (non-convex except in special cases)
• find a set U such that Prob(U ∈ U) ≥ 1 − ǫ; then a sufficient condition for (1) is
  fᵢ(x, u) ≤ 0 for all u ∈ U
Uncertainty set with Gaussian data
minimize cᵀx
subject to Prob(aᵢᵀx > bᵢ) ≤ ǫ, i = 1, . . . , m

coefficient vectors aᵢ i.i.d. N(ā, Σ) and failure probability ǫ
• marginally aᵢᵀx ∼ N(āᵀx, xᵀΣx)
• for ǫ = .5, just the LP
  minimize cᵀx subject to āᵀx ≤ bᵢ, i = 1, . . . , m
• what about ǫ = .1, .9?
Gaussian uncertainty sets
{x | Prob(aᵢᵀx > bᵢ) ≤ ǫ} = {x | āᵀx − bᵢ − Φ⁻¹(ǫ) √(xᵀΣx) ≤ 0}

[figure: boundaries of this set for ǫ = .9, ǫ = .5, ǫ = .1]
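A quick Monte Carlo sanity check of this set description; a sketch with made-up data ā, Σ, x, choosing bᵢ so that x lies exactly on the boundary (Φ⁻¹ is scipy.stats.norm.ppf):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
eps = 0.1
abar = np.array([1.0, 2.0])                 # hypothetical mean vector
Sigma = np.array([[0.5, 0.1], [0.1, 0.3]])  # hypothetical covariance
x = np.array([0.7, 0.3])
sigma = np.sqrt(x @ Sigma @ x)

# pick b_i so abar^T x - b_i - Phi^{-1}(eps) * sqrt(x^T Sigma x) = 0
b_i = abar @ x - norm.ppf(eps) * sigma

# empirical estimate of Prob(a_i^T x > b_i) with a_i ~ N(abar, Sigma)
a = rng.multivariate_normal(abar, Sigma, size=200000)
freq = np.mean(a @ x > b_i)
```

On the boundary the violation probability should be exactly ǫ, and the empirical frequency lands close to 0.1.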
Problem is convex, so no problem?
not quite...
consider the quadratic constraint
  ‖Ax + Bu‖₂ ≤ 1 for all ‖u‖∞ ≤ 1
• convex quadratic maximization in u
• solutions at extreme points u ∈ {−1, 1}ⁿ
• and it is NP-hard to maximize (even approximately [Håstad]) convex
quadratics over the hypercube
Robust LPs
Important question: when is a robust LP still an LP (robust SOCP an
SOCP, robust SDP an SDP)?
  minimize cᵀx
  subject to (A + U)x ⪯ b for all U ∈ U
can always treat the constraints one at a time, so consider the single
inequality
  (a + u)ᵀx ≤ b for all u ∈ U
• simple example: U = {u ∈ Rⁿ | ‖u‖∞ ≤ δ}, then
  aᵀx + δ‖x‖₁ ≤ b
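The box-uncertainty reduction rests on sup_{‖u‖∞≤δ} uᵀx = δ‖x‖₁, attained at u = δ·sign(x); a tiny numpy check with random data (nothing specific to the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(5)
delta = 0.3

u_star = delta * np.sign(x)              # maximizing perturbation
analytic = delta * np.linalg.norm(x, 1)  # delta * ||x||_1
assert np.isclose(u_star @ x, analytic)

# no sampled u in the box does better
U = rng.uniform(-delta, delta, size=(1000, 5))
best_sampled = (U @ x).max()
```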
Polyhedral uncertainty
for a matrix F ∈ Rᵐˣⁿ and g ∈ Rᵐ,
  (a + u)ᵀx ≤ b for all u ∈ U = {u ∈ Rⁿ | Fu + g ⪰ 0}
duality is essential for transforming the (semi-)infinite inequality into a tractable
problem
• Lagrangian for maximizing uᵀx over U:
  L(u, λ) = xᵀu + λᵀ(Fu + g),   sup_u L(u, λ) = { λᵀg if Fᵀλ + x = 0
                                                 { +∞ if Fᵀλ + x ≠ 0
• gives equivalent inequality constraints
  aᵀx + λᵀg ≤ b,  Fᵀλ + x = 0,  λ ⪰ 0
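The duality can be checked numerically with scipy.optimize.linprog: the inner maximization of uᵀx over {u | Fu + g ⪰ 0} and the minimization of λᵀg subject to Fᵀλ + x = 0, λ ⪰ 0 attain the same value. A sketch with a small box-shaped polyhedron (my own choice of F, g, x):

```python
import numpy as np
from scipy.optimize import linprog

x = np.array([1.0, -2.0])
delta = 0.5
# U = {u | F u + g >= 0} is the box |u_i| <= delta
F = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
g = delta * np.ones(4)

# inner problem: maximize x^T u over U (linprog minimizes, so negate)
primal = linprog(-x, A_ub=-F, b_ub=g, bounds=(None, None))
primal_val = -primal.fun

# dual: minimize lambda^T g s.t. F^T lambda + x = 0, lambda >= 0
dual = linprog(g, A_eq=F.T, b_eq=-x, bounds=(0, None))
dual_val = dual.fun
```

For this box the common value is δ‖x‖₁, matching the simple example on the previous slide.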
Portfolio optimization (with robust LPs)
• n assets i = 1, . . . , n, random multiplicative return Rᵢ with
E[Rᵢ] = µᵢ ≥ 1, µ₁ ≥ µ₂ ≥ · · · ≥ µₙ
• "certain" problem has solution xnom = e₁:
  maximize µᵀx subject to 1ᵀx = 1, x ⪰ 0
• if asset i's return varies in the range µᵢ ± uᵢ, robust problem
  maximize Σ_{i=1}^n inf_{vᵢ∈[−uᵢ,uᵢ]} (µᵢ + vᵢ)xᵢ subject to 1ᵀx = 1, x ⪰ 0
and equivalently (since x ⪰ 0)
  maximize µᵀx − uᵀx subject to 1ᵀx = 1, x ⪰ 0
Robust LPs as SOCPs
norm-based uncertainty on the data vector a,
  (a + Pu)ᵀx ≤ b for all u ∈ U = {u ∈ Rᵐ | ‖u‖ ≤ 1},
gives the dual-norm constraint
  aᵀx + ‖Pᵀx‖∗ ≤ b
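For the Euclidean case ‖u‖₂ ≤ 1 the dual norm is again ℓ₂, and the worst u is Pᵀx/‖Pᵀx‖₂; a quick numpy check with random data:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 3, 4
P = rng.standard_normal((n, m))  # P maps u in R^m into R^n
x = rng.standard_normal(n)

w = P.T @ x                       # P^T x in R^m
u_star = w / np.linalg.norm(w)    # maximizer on the unit ball
val = u_star @ w                  # worst-case value of (P u)^T x

# sampled unit vectors never do better
U = rng.standard_normal((1000, m))
U /= np.linalg.norm(U, axis=1, keepdims=True)
best_sampled = (U @ w).max()
```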
Portfolio optimization (tighter control)
• Returns Ri ∈ [µi − ui, µi + ui] with E Ri = µi
• guarantee return with probability 1 − ǫ
  maximize_{x,t} t subject to Prob(Σ_{i=1}^n Rᵢxᵢ ≥ t) ≥ 1 − ǫ
• value at risk is non-convex in x, approximate it?
• approximate with high-probability bounds
• less conservative than LP (certain returns) approach
Portfolio optimization: probability approximation
• Hoeffding’s inequality
  Prob(Σ_{i=1}^n (Rᵢ − µᵢ)xᵢ ≤ −t) ≤ exp(−t² / (2 Σ_{i=1}^n uᵢ²xᵢ²)).
• written differently
  Prob(Σ_{i=1}^n Rᵢxᵢ ≤ µᵀx − t (Σ_{i=1}^n uᵢ²xᵢ²)^{1/2}) ≤ exp(−t²/2)
• set t = √(2 log(1/ǫ)), gives robust problem
  maximize µᵀx − √(2 log(1/ǫ)) ‖diag(u)x‖₂ subject to 1ᵀx = 1, x ⪰ 0.
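A Monte Carlo sanity check of the guarantee; a sketch with hypothetical µ, u, and a fixed (not optimized) portfolio, drawing returns uniformly on [µᵢ − uᵢ, µᵢ + uᵢ], which are bounded with mean µᵢ as Hoeffding requires:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
mu = 1.05 + 0.01 * np.arange(n, 0, -1)  # hypothetical mean returns
u = 0.05 + 0.02 * np.arange(n, 0, -1)   # hypothetical uncertainty widths
x = np.ones(n) / n                       # uniform portfolio
eps = 0.1
t = np.sqrt(2 * np.log(1 / eps))

# Hoeffding: return falls below this level with probability <= eps
level = mu @ x - t * np.linalg.norm(u * x)

# simulate bounded returns R_i uniform on [mu_i - u_i, mu_i + u_i]
R = mu + u * rng.uniform(-1, 1, size=(100000, n))
freq = np.mean(R @ x <= level)
```

Since Hoeffding is loose for uniform variables, the observed violation frequency is far below ǫ.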
Portfolio optimization comparison
• data µᵢ = 1.05 + 3(n−i)/(10n), uncertainty |vᵢ| ≤ uᵢ = .05 + (n−i)/(2n), and uₙ = 0
• nominal minimizer xnom = e₁
• conservative (LP) minimizer xcon = eₙ (guaranteed 5% return),
• robust (SOCP) minimizer xǫ for value-at-risk ǫ = 2 × 10⁻⁴
Portfolio optimization comparison
[histogram: frequency of portfolio return Rᵀx for xnom, xcon, xǫ]
Returns chosen randomly in µi ± ui, 10,000 experiments
LPs with conic uncertainty
• convex cone K ⊆ Rᵐ, dual cone K∗ = {v ∈ Rᵐ | vᵀx ≥ 0 for all x ∈ K}
• recall x ⪰_K y iff x − y ∈ K
• robust inequality
  (a + u)ᵀx ≤ b for all u ∈ U = {u ∈ Rⁿ | Fu + g ⪰_K 0}
• under a constraint qualification, equivalent to
  aᵀx + λᵀg ≤ b,  λ ⪰_{K∗} 0,  x + Fᵀλ = 0
Example calculation: LP with semidefinite uncertainty
• symmetric matrices A0, A1, . . . , Am ∈ Sᵏ, robust counterpart to
aᵀx ≤ b:
  (a + Pu)ᵀx ≤ b for all u s.t. A0 + Σ_{i=1}^m uᵢAᵢ ⪰ 0
• cones K = Sᵏ₊, K∗ = Sᵏ₊
• Slater condition: ∃ ū such that A0 + Σᵢ Aᵢūᵢ ≻ 0
• duality gives equivalent representation
  aᵀx + Tr(ΛA0) ≤ b,  Pᵀx + (Tr(ΛA1), . . . , Tr(ΛAm)) = 0,  Λ ⪰ 0.
Robust second-order cone problems
• Lorentz/SOCP cone, nominal inequality
  ‖Ax + b‖₂ ≤ cᵀx + d
• A = [a₁ · · · aₘ]ᵀ ∈ Rᵐˣⁿ, allow A, c to vary
• interval uncertainty
• ellipsoidal uncertainty
• matrix uncertainty
SOCPs with interval uncertainty
entries Aᵢⱼ perturbed by ∆ᵢⱼ with |∆ᵢⱼ| ≤ δ, c by u ∈ U:
  ‖(A + ∆)x + b‖₂ ≤ (c + u)ᵀx + d for all ‖∆‖∞ ≤ δ, u ∈ U
• split into two inequalities (the linear one, t ≤ (c + u)ᵀx + d, is a robust LP)
  ‖(A + ∆)x + b‖₂ ≤ t,  t ≤ (c + u)ᵀx + d
for the norm inequality,
  sup_{|∆ᵢⱼ|≤δ} ‖(A + ∆)x + b‖₂ = sup_{|∆ᵢⱼ|≤δ} (Σ_{i=1}^m [(aᵢ + ∆ᵢ)ᵀx + bᵢ]²)^{1/2}
  = sup {‖z‖₂ | zᵢ = aᵢᵀx + ∆ᵢᵀx + bᵢ, ‖∆ᵢ‖∞ ≤ δ}
  = inf {‖z‖₂ | zᵢ ≥ |aᵢᵀx + bᵢ| + δ‖x‖₁}.
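The row-wise worst case is attained at ∆ᵢ = δ·sign(aᵢᵀx + bᵢ)·sign(x)ᵀ; a numpy check with random data:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 4, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
x = rng.standard_normal(n)
delta = 0.2

# worst-case z_i = |a_i^T x + b_i| + delta * ||x||_1
z = np.abs(A @ x + b) + delta * np.linalg.norm(x, 1)
Delta_star = delta * np.outer(np.sign(A @ x + b), np.sign(x))
worst = np.linalg.norm((A + Delta_star) @ x + b)
assert np.isclose(worst, np.linalg.norm(z))

# random perturbations with |Delta_ij| <= delta never exceed it
for _ in range(200):
    D = rng.uniform(-delta, delta, size=(m, n))
    assert np.linalg.norm((A + D) @ x + b) <= worst + 1e-9
```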
SOCPs with ellipse-like uncertainty
• matrices P₁, . . . , Pm ∈ Rⁿˣⁿ and u ∈ Rⁿ with ‖u‖ ≤ 1
• robust/uncertain inequality
  (Σ_{i=1}^m [(aᵢ + Pᵢu)ᵀx + bᵢ]²)^{1/2} ≤ t for all u s.t. ‖u‖ ≤ 1.
• rewrite zᵢ ≥ sup_{‖u‖≤1} |aᵢᵀx + bᵢ + uᵀPᵢᵀx|, equivalent
  ‖z‖₂ ≤ t,  zᵢ ≥ |aᵢᵀx + bᵢ| + ‖Pᵢᵀx‖∗,  i = 1, . . . , m.
SOCPs with matrix uncertainty
• matrix P ∈ Rᵐˣⁿ and radius δ, uncertain inequality
  ‖(A + P∆)x + b‖₂ ≤ t, for ∆ ∈ Rⁿˣⁿ s.t. ‖∆‖ ≤ δ,
• tool one: Schur complements give equivalence of
  ‖x‖₂ ≤ t  and  ⎡ t   xᵀ ⎤
                 ⎣ x  tIₙ ⎦ ⪰ 0.
• tool two: homogeneous S-lemma
  xᵀAx ≥ 0 implies xᵀBx ≥ 0 if and only if ∃ λ ≥ 0 s.t. B ⪰ λA.
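Tool one can be checked directly: the block matrix is PSD exactly when ‖x‖₂ ≤ t (a small numeric sketch; `schur_psd` is my own helper name):

```python
import numpy as np

def schur_psd(t, x):
    # [[t, x^T], [x, t I]] is PSD iff ||x||_2 <= t (for t >= 0)
    n = x.size
    M = np.zeros((n + 1, n + 1))
    M[0, 0] = t
    M[0, 1:] = x
    M[1:, 0] = x
    M[1:, 1:] = t * np.eye(n)
    return np.linalg.eigvalsh(M).min() >= -1e-9

x = np.array([3.0, 4.0])      # ||x||_2 = 5
ok_boundary = schur_psd(5.0, x)
ok_interior = schur_psd(6.0, x)
bad = schur_psd(4.9, x)
```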
SOCPs with matrix uncertainty
‖(A + P∆)x + b‖₂ ≤ t, for ∆ ∈ Rⁿˣⁿ s.t. ‖∆‖ ≤ 1 (take δ = 1 w.l.o.g., absorbing δ into P),
equivalent to
  ⎡ t               ((A + P∆)x + b)ᵀ ⎤
  ⎣ (A + P∆)x + b   tIₘ              ⎦ ⪰ 0 for all ‖∆‖ ≤ 1,
or
  ts² + 2s((A + P∆)x + b)ᵀv + t‖v‖₂² ≥ 0 for all s ∈ R, v ∈ Rᵐ, ‖∆‖ ≤ 1.
SOCPs with matrix uncertainty: final result
‖(A + P∆)x + b‖₂ ≤ t, for ∆ ∈ Rⁿˣⁿ s.t. ‖∆‖ ≤ 1,
equivalent to: for some λ ≥ 0,
  ⎡ t         (Ax + b)ᵀ     xᵀ  ⎤
  ⎢ Ax + b    tIₘ − λPPᵀ    0   ⎥ ⪰ 0.
  ⎣ x         0             λIₙ ⎦
Example: robust regression
  minimize ‖Ax − b‖₂
where A is corrupted by Gaussian noise,
  A = A⋆ + ∆ for ∆ᵢⱼ ∼ N(0, 1)
decide to be robust to ∆ by
• bounding individual entries ∆ij
• bounding norms of rows ∆i
• bounding (ℓ2-operator) norm of ∆
Choice of uncertainty in robust regression
Theorem [e.g. Vershynin 2012] Let ∆ ∈ Rᵐˣⁿ have i.i.d. N(0, 1) entries.
For all t ≥ 0, the following hold:
• for each pair (i, j),
  Prob(|∆ᵢⱼ| ≥ t) ≤ 2 exp(−t²/2).
• for each row i,
  Prob(‖∆ᵢ‖₂ ≥ √n + t) ≤ exp(−t²/2).
• for the entire matrix ∆,
  Prob(‖∆‖ ≥ √m + √n + t) ≤ exp(−t²/2).
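The matrix-norm bound is easy to probe empirically; a sketch with 100 trials and t = 5, so the union failure probability is at most 100·e^{−12.5} ≈ 4 × 10⁻⁴:

```python
import numpy as np

rng = np.random.default_rng(5)
m, n, t = 30, 20, 5.0
bound = np.sqrt(m) + np.sqrt(n) + t

# spectral norms of 100 i.i.d. Gaussian matrices
norms = [np.linalg.norm(rng.standard_normal((m, n)), 2) for _ in range(100)]
violations = sum(s >= bound for s in norms)
```

Typical spectral norms sit near √m + √n ≈ 10, well inside the bound ≈ 15.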
Choice of uncertainty in robust regression
idea: choose bounds t(δ) to guarantee Prob(deviation ≥ t(δ)) ≤ δ
• coordinate-wise: t∞(δ)² = 2 log(2mn/δ), so
  Prob(max_{i,j} |∆ᵢⱼ| ≥ t∞(δ)) ≤ 2mn exp(−t∞(δ)²/2) = δ
• row-wise: t₂(δ)² = 2 log(m/δ), so
  Prob(maxᵢ ‖∆ᵢ‖₂ ≥ √n + t₂(δ)) ≤ m exp(−t₂(δ)²/2) = δ
• matrix-norm: top(δ)² = 2 log(1/δ), so
  Prob(‖∆‖ ≥ √n + √m + top(δ)) ≤ exp(−top(δ)²/2) = δ.
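The three thresholds are easy to compute from the formulas above; note the nesting top(δ) < t₂(δ) < t∞(δ) whenever m, n > 1 and δ < 1 (a sketch with hypothetical m, n, δ):

```python
import numpy as np

m, n, delta = 100, 50, 1e-3

t_inf = np.sqrt(2 * np.log(2 * m * n / delta))  # coordinate-wise
t_2 = np.sqrt(2 * np.log(m / delta))            # row-wise
t_op = np.sqrt(2 * np.log(1 / delta))           # matrix operator norm
```

The matrix-norm set needs the least slack because it controls a single scalar rather than m rows or mn entries.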
Robust regression results
  minimize_x sup_{∆∈U} ‖(A + ∆)x − b‖₂
where U is one of the three uncertainty sets
  U∞ = {∆ | ‖∆‖∞ ≤ t∞(δ)},
  U₂ = {∆ | ‖∆ᵢ‖₂ ≤ √n + t₂(δ) for i = 1, . . . , m},
  Uop = {∆ | ‖∆‖ ≤ √n + √m + top(δ)}.
Robust regression results
[log–log plot: ‖Ax̂ − b‖₂ − ‖Ax⋆ − b‖₂ versus δ for x∞, x₂, xop]

Objective value ‖Ax̂ − b‖₂ − ‖Ax⋆ − b‖₂ versus δ, where x⋆ minimizes the
nominal objective and x̂ denotes the robust solution
Robust regression results
[histograms: frequency of residual ‖(A + ∆)x − b‖₂, for xop vs. xnom (left) and x₂, x∞ vs. xnom (right)]
• residuals for the robust least squares problem ‖(A + ∆)x − b‖₂
• uncertainty sets Unom = {0} vs. U∞, U₂, Uop
• experiment with N = 10⁵ random Gaussian matrices