Advanced Gurobi Optimization Techniques
Advanced Gurobi Algorithms
Roland Wunderling
October 2022
• Problem Types
• Presolve
• Algorithms for Continuous Optimization
• Algorithms for Discrete Optimization
[Diagram: Original Model → Presolve → Presolved Model → Optimize → Solution → Unpresolve → Solution]
• Problem statement
$\min \; c^T x$
$\text{s.t.} \; Ax = b$
$x \ge 0$
• Optimal solution can be found at a vertex
• Intersection of n constraints satisfied with equality
• Pick $Ax = b$ and $x_N = 0$, $|N| = n - m$
• Then $x_B = A_B^{-1}(b - A_N x_N)$, $B = \{1, \dots, n\} \setminus N$
• Basis
• Partition: $\{1, \dots, n\} = B \cup N$, $B \cap N = \emptyset$
• Such that $A_B$ is non-singular
• Primal feasibility
• All constraints must be satisfied
• $x_B \ge 0$
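The basis definitions above can be illustrated by brute force. The following is a minimal sketch, not how a simplex code works: it enumerates every candidate basis B of a tiny LP, solves $A_B x_B = b$ exactly with fractions, keeps the primal feasible ones ($x_B \ge 0$), and reports the best vertex. The example data `A`, `b`, `c` and the helper names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations

def solve_square(M, rhs):
    """Solve M u = rhs by Gauss-Jordan elimination; return None if M is singular."""
    k = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(k):
        pivot = next((r for r in range(col, k) if aug[r][col] != 0), None)
        if pivot is None:
            return None  # A_B is singular, so B is not a basis
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(k):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[r][k] for r in range(k)]

def best_vertex(A, b, c):
    """Enumerate all bases, keep primal feasible ones, return (objective, x, B)."""
    m, n = len(A), len(c)
    best = None
    for B in combinations(range(n), m):
        A_B = [[A[i][j] for j in B] for i in range(m)]
        x_B = solve_square(A_B, b)          # x_B = A_B^{-1} b, since x_N = 0
        if x_B is None or any(v < 0 for v in x_B):
            continue                        # singular or not primal feasible
        x = [Fraction(0)] * n
        for j, v in zip(B, x_B):
            x[j] = v
        obj = sum(cj * xj for cj, xj in zip(c, x))
        if best is None or obj < best[0]:
            best = (obj, x, B)
    return best

# Illustrative LP: min -x1 - 2 x2  s.t.  x1 + x2 + x3 = 4,  x1 + 3 x2 + x4 = 6,  x >= 0
A = [[1, 1, 1, 0], [1, 3, 0, 1]]
b = [4, 6]
c = [-1, -2, 0, 0]
obj, x, B = best_vertex(A, b, c)
print(obj, x, B)
```

Enumeration grows combinatorially with n and m, which is why practical solvers move from vertex to vertex instead of listing them all.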
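Primal simplex, candidate entering variable: $z_j < 0$
Dual simplex, candidate leaving variable: $x_{B_i} < 0$
Primal update vector: $LU \, \Delta x_B = A_j$
Dual update vectors: $\Delta y^T LU = e_{B_i}^T$, $\Delta z_N = \Delta y^T A_N$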
• Dikin’s Algorithm:
apply affine transformation to stay away from the boundary at each iteration
• Karmarkar’s Algorithm:
apply projective transformation to re-center the solution at each iteration
• Logarithmic Barrier Algorithm:
use a logarithmic penalty function on the variable bounds to stay centered
[Figure: central path]
Degeneracy does no harm
$c^T x^* = y^{*T} b$
(if primal and dual are both feasible)
$Ax = b, \; x \ge 0$ (primal feasibility)
$A^T y + z = c, \; z \ge 0$ (dual feasibility)
$c^T x - b^T y = 0 \iff c^T x - (Ax)^T y = 0 \iff (c^T - y^T A)x = 0 \iff z^T x = 0$
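The chain of equivalences above says that, for any primal feasible x and dual feasible (y, z), the duality gap $c^T x - b^T y$ equals $z^T x$. A small numeric check, using an illustrative LP and hand-picked points that are assumptions rather than data from the slides:

```python
def gap_and_complementarity(A, b, c, x, y):
    """Return (z, c^T x - b^T y, z^T x) with z = c - A^T y."""
    m, n = len(A), len(c)
    z = [c[j] - sum(A[i][j] * y[i] for i in range(m)) for j in range(n)]
    gap = sum(c[j] * x[j] for j in range(n)) - sum(b[i] * y[i] for i in range(m))
    comp = sum(z[j] * x[j] for j in range(n))
    return z, gap, comp

# Illustrative LP: min -x1 - 2 x2  s.t.  x1 + x2 + x3 = 4,  x1 + 3 x2 + x4 = 6,  x >= 0
A = [[1, 1, 1, 0], [1, 3, 0, 1]]
b = [4, 6]
c = [-1, -2, 0, 0]

# A feasible but non-optimal pair: Ax = b, x >= 0, and z = c - A^T y >= 0.
z1, gap1, comp1 = gap_and_complementarity(A, b, c, x=[0, 0, 4, 6], y=[-2, -1])

# An optimal pair: the gap and z^T x both vanish.
z2, gap2, comp2 = gap_and_complementarity(A, b, c, x=[3, 1, 0, 0], y=[-0.5, -0.5])

print(gap1, comp1, gap2, comp2)
```

In both cases the two quantities agree; they are zero only for the optimal pair.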
2 Interpretations of the Barrier Algorithm
$x_j z_j = \mu > 0, \quad j = 1, \dots, n$
• Start with $\mu > 0$, systematically reduce it to 0 to converge to an optimal primal-dual pair of the LP
• Now we have an interior point method, but what makes it a barrier method?
• In both cases, differentiate the unconstrained optimization and apply Newton’s method
• These are duals of each other. From either, one can derive (e.g. multiply (5) by $Z$)
Does not matter how we got here; we can use Newton’s Method
Look familiar?
Applying Newton’s Method
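$x^{k+1} = x^k - [J(x^k)]^{-1} f(x^k)$, where $J(x)_{ij} = \frac{\partial f_i}{\partial x_j}(x)$
Starting point
[Equation not recoverable; surviving labels: $\nabla_x$, $\nabla_y$, $\nabla_z$]
At each iteration
$J(x^0) \, \Delta x = -f(x^0)$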
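As a reminder of how the Newton iteration behaves, here is a minimal sketch on a generic two-variable system, solving $J(x)\,\Delta x = -f(x)$ with Cramer's rule. The test system (a circle intersected with a line) and the function names are assumptions chosen for illustration; they are not the barrier equations themselves.

```python
def newton_2d(f, jacobian, x, tol=1e-12, max_iter=50):
    """Newton's method for f(x) = 0 with x in R^2; returns (x, iterations used)."""
    for iteration in range(max_iter):
        f0, f1 = f(x)
        if max(abs(f0), abs(f1)) < tol:
            return x, iteration
        (a, b), (c, d) = jacobian(x)
        det = a * d - b * c
        # Cramer's rule for the 2x2 system J * dx = -f
        dx0 = (-f0 * d + b * f1) / det
        dx1 = (-a * f1 + c * f0) / det
        x = (x[0] + dx0, x[1] + dx1)
    return x, max_iter

# Illustrative system: x0^2 + x1^2 = 4 and x0 = x1, with root (sqrt(2), sqrt(2)).
def f(x):
    return (x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1])

def jacobian(x):
    return ((2.0 * x[0], 2.0 * x[1]), (1.0, -1.0))

root, iterations = newton_2d(f, jacobian, (1.0, 0.5))
print(root, iterations)
```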
At each iteration
• Recall
$Ax = b, \; x \ge 0$ (primal feasibility)
$A^T y + z = c, \; z \ge 0$ (dual feasibility)
$c^T x - b^T y = 0 \iff z^T x = 0$ (duality gap $\iff$ complementary slackness)
• At each iteration
• Duality gap can be shown to reduce
• Albeit neither the primal nor the dual objective needs to change monotonically
• Terminate when normalized duality gap and complementary slackness within tolerance:
Update x, y, and z
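The iteration described above can be sketched for a toy LP. This is a simplified primal-dual path-following loop under stated assumptions, not Gurobi's implementation: the LP has a single equality row (so the normal-equations matrix $A D A^T$ is a scalar), one damped step length is shared by primal and dual, and the centering parameter `sigma`, the 0.9995 step damping, the tolerance, and the example data are all illustrative choices.

```python
def primal_dual_single_row(a, b, c, sigma=0.1, tol=1e-9, max_iter=100):
    """Infeasible-start primal-dual interior point sketch for
    min c^T x  s.t.  a^T x = b, x >= 0  (one equality constraint only)."""
    n = len(a)
    x, z, y = [1.0] * n, [1.0] * n, 0.0            # strictly positive start
    for iteration in range(max_iter):
        r_p = b - sum(a[j] * x[j] for j in range(n))          # primal residual
        r_d = [c[j] - a[j] * y - z[j] for j in range(n)]      # dual residual
        mu = sum(x[j] * z[j] for j in range(n)) / n           # average x_j z_j
        if abs(r_p) < tol and max(abs(r) for r in r_d) < tol and mu < tol:
            return x, y, z, iteration
        # Newton step toward x_j z_j = sigma * mu:
        #   a^T dx = r_p,  a dy + dz = r_d,  z_j dx_j + x_j dz_j = r_c_j
        r_c = [sigma * mu - x[j] * z[j] for j in range(n)]
        d = [x[j] / z[j] for j in range(n)]
        t = [r_c[j] / x[j] - r_d[j] for j in range(n)]
        ada = sum(a[j] * d[j] * a[j] for j in range(n))       # scalar A D A^T
        dy = (r_p - sum(a[j] * d[j] * t[j] for j in range(n))) / ada
        dx = [d[j] * (a[j] * dy + t[j]) for j in range(n)]
        dz = [(r_c[j] - z[j] * dx[j]) / x[j] for j in range(n)]
        # Damped step length that keeps x and z strictly positive
        alpha = 1.0
        for j in range(n):
            if dx[j] < 0:
                alpha = min(alpha, -0.9995 * x[j] / dx[j])
            if dz[j] < 0:
                alpha = min(alpha, -0.9995 * z[j] / dz[j])
        # Update x, y, and z
        x = [x[j] + alpha * dx[j] for j in range(n)]
        z = [z[j] + alpha * dz[j] for j in range(n)]
        y += alpha * dy
    return x, y, z, max_iter

# Illustrative LP: min x1 + 2 x2 + 3 x3  s.t.  x1 + x2 + x3 = 1,  x >= 0
x, y, z, iterations = primal_dual_single_row([1.0, 1.0, 1.0], 1.0, [1.0, 2.0, 3.0])
print(x, y, z, iterations)
```

For this example the iterates approach the vertex x = (1, 0, 0) with dual value y = 1 from the interior, without ever sitting exactly on the boundary.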
* Lustig, I.J. (1990). "Feasibility issues in a primal-dual interior point method for linear programming,” Mathematical Programming, 49(2), 145-162
© 2022 Gurobi Optimization, LLC. Confidential, All Rights Reserved | 39
Ordering time: 0.00s
Barrier statistics:
Dense cols : 1
AA' NZ : 1.056e+03
Factor NZ : 4.200e+03 (roughly 1 MB of memory)
Factor Ops : 8.603e+04 (less than 1 second per iteration)
Threads : 1
Barrier solved model in 7 iterations and 0.36 seconds (0.83 work units)
Optimal objective -3.00000000e+02
• Continue with crossover
Simplex
• Thousands/millions of iterations on extremely sparse matrices
• Each iteration extremely cheap
• Primal or dual degeneracy can be problematic
Barrier
• Dozens of iterations on denser matrices
• Each iteration is expensive
• No issues with degenerate extreme points
Performance results:
• Gurobi 9.5, Intel(R) Xeon(R) E3-1240 v5 (4 core at 3.5GHz)
• Simplex on 1 core, Barrier on 4 cores
• Concurrent with 1 thread dual, 3 threads barrier
• Result for models that take >1s
[Bar chart "LP ALGORITHMS": Primal, Dual, Barrier, Concurrent, Det. Concurrent, Det. Concurrent Simplex; values shown: 1.83, 1.68, 1.21, 1.09, 1 (assignment of values to bars not recoverable)]
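The six algorithm labels in the chart correspond to the values of Gurobi's `Method` parameter. The lookup table below is a small convenience sketch; the dictionary and helper names are assumptions, and the gurobipy usage is shown only in comments because it needs a Gurobi installation and license (the file name there is hypothetical).

```python
# Values of the Gurobi `Method` parameter for continuous models.
LP_METHODS = {
    "automatic": -1,
    "primal simplex": 0,
    "dual simplex": 1,
    "barrier": 2,
    "concurrent": 3,
    "deterministic concurrent": 4,
    "deterministic concurrent simplex": 5,
}

def method_value(name):
    """Translate a human-readable algorithm name into a Method value."""
    return LP_METHODS[name.lower()]

# Usage with gurobipy (not executed here):
#   import gurobipy as gp
#   model = gp.read("model.lp")                  # hypothetical file name
#   model.Params.Method = method_value("Barrier")
#   model.optimize()

print(method_value("Barrier"))
```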
LP based Branch-and-Bound
[Figure: branch-and-bound tree; Root: $v = 3.5$; labels: GAP, Cutoff]
[Figure labels: LP Relaxation [3], Cutting Planes [1], Heuristics]
[1] Achterberg and Wunderling: "Mixed Integer Programming: Analyzing 12 Years of Progress" (2013)
Presolve
• Tighten formulation and reduce problem size
Node selection
• Select next subproblem to process
Node presolve
• Additional presolve for subproblem
Solve continuous relaxations
• Gives a bound on the optimal integral objective
Conflict analysis
• Learn from infeasible subproblems
Cutting planes
• Cut off relaxation solutions
Primal heuristics
• Find integer feasible solutions
Branching variable selection
• Crucial for limiting search tree size
[Diagram: Presolving, Node Selection, Node Presolve, LP Relaxation, Conflict Analysis, Cutting Planes, Heuristics, Branching]
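Several of these building blocks (node selection, relaxation bound, cutoff, branching) can be seen in a toy branch-and-bound. The sketch below makes strong simplifying assumptions: the problem is a 0/1 knapsack in maximization form, whose LP relaxation can be solved greedily instead of by simplex, nodes are selected best-bound-first, and branching follows a fixed variable order. It is not how a general MIP solver is organized, and all names and data are illustrative.

```python
import heapq

def knapsack_branch_and_bound(values, weights, capacity):
    """Best-bound-first branch-and-bound for max sum(v*x), sum(w*x) <= capacity, x binary."""
    order = sorted(range(len(values)), key=lambda i: values[i] / weights[i], reverse=True)
    v = [values[i] for i in order]
    w = [weights[i] for i in order]
    n = len(v)

    def relaxation_bound(k, value, room):
        # Greedy fill is the optimal LP relaxation over items k..n-1 (sorted by ratio).
        for i in range(k, n):
            if w[i] <= room:
                room -= w[i]
                value += v[i]
            else:
                return value + v[i] * room / w[i]   # fractional item
        return value

    incumbent = 0                                   # value of best integer solution so far
    heap = [(-relaxation_bound(0, 0, capacity), 0, 0, capacity)]
    nodes = 0
    while heap:
        neg_bound, k, value, room = heapq.heappop(heap)   # node selection: best bound
        if -neg_bound <= incumbent or k == n:
            continue                                # pruned by cutoff, or leaf
        nodes += 1
        # Branch x_k = 1 (only if the item fits)
        if w[k] <= room:
            child_value = value + v[k]
            incumbent = max(incumbent, child_value) # fixing the rest to 0 is feasible
            bound = relaxation_bound(k + 1, child_value, room - w[k])
            if bound > incumbent:
                heapq.heappush(heap, (-bound, k + 1, child_value, room - w[k]))
        # Branch x_k = 0
        bound = relaxation_bound(k + 1, value, room)
        if bound > incumbent:
            heapq.heappush(heap, (-bound, k + 1, value, room))
    return incumbent, nodes

best, nodes = knapsack_branch_and_bound([60, 100, 120], [10, 20, 30], 50)
print(best, nodes)
```

The incumbent plays the role of the cutoff: any node whose relaxation bound cannot beat it is discarded without further branching.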
• A cut (cutting plane) is a constraint that reduces the feasible region of the continuous relaxation but not its integer hull
• Separation of cuts:
Given an x that is feasible for the relaxation, find a cut for which x is infeasible and add it to the relaxation
• Thus, the relaxation more closely approximates the integer hull
Cut types: Gomory, Mixed Integer Rounding (MIR), StrongCG cuts, Lift and Project, Infeasibility cuts, Flow cover, Flow path, Network, MIP separation cuts, Relax and Lift, User Cuts, Cover, Implied bound, Projected Implied bound, Clique, GUB Cover, Zero-half, Mod-K, RLT, BQP, SubMIP, Outer Approximation
Chvatal-Gomory Cuts
Chvatal-Gomory procedure:
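• Choose non-negative multipliers $\lambda \in \mathbb{R}^m_{\ge 0}$
• Aggregated inequality $\lambda^T A x \le \lambda^T b$ is valid for $P$ because $\lambda \ge 0$
• Relaxed inequality $\lfloor \lambda^T A \rfloor x \le \lambda^T b$ is still valid for $P$ because $x \ge 0$
• Rounded inequality $\lfloor \lambda^T A \rfloor x \le \lfloor \lambda^T b \rfloor$ is still valid for $P_I$ because $x \in \mathbb{Z}^n$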
CG procedure suffices to generate all non-dominated valid inequalities for $P_I$ in a finite number of iterations!
• $P^{(0)} = P$, $P^{(k)} = P^{(k-1)} \cap \{\text{CG cuts for } P^{(k-1)}\}$: k-th CG closure of $P$, which is a polyhedron!
• CG rank of a valid inequality for $P_I$: minimum k s.t. the inequality is valid for $P^{(k)}$
• Higher rank cuts get more and more dense and numerically unstable
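The three steps of the procedure (aggregate, relax, round) translate directly into code. A minimal sketch, assuming the system is given as $Ax \le b$ with $x \ge 0$ integer and using exact fractions so that the floors are not affected by rounding error; the function name and the example data are illustrative.

```python
from fractions import Fraction
from math import floor

def chvatal_gomory_cut(A, b, lam):
    """Return (coefficients, rhs) of the CG cut floor(lam^T A) x <= floor(lam^T b)."""
    if any(l < 0 for l in lam):
        raise ValueError("multipliers must be non-negative")
    m, n = len(A), len(A[0])
    # Aggregated inequality: lam^T A x <= lam^T b
    agg_coef = [sum(Fraction(lam[i]) * A[i][j] for i in range(m)) for j in range(n)]
    agg_rhs = sum(Fraction(lam[i]) * b[i] for i in range(m))
    # Relax the left-hand side (valid since x >= 0), then round the right-hand side
    # (valid since the left-hand side is integral for integer x).
    return [floor(a) for a in agg_coef], floor(agg_rhs)

# Illustrative system: 3 x1 + 2 x2 <= 7,  x1 + 4 x2 <= 8,  multipliers (1/4, 1/4)
coef, rhs = chvatal_gomory_cut([[3, 2], [1, 4]], [7, 8], [Fraction(1, 4), Fraction(1, 4)])
print(coef, rhs)
```

Here the aggregated inequality is $x_1 + 1.5\,x_2 \le 3.75$, and the resulting cut is $x_1 + x_2 \le 3$.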
Lifting
• If $x_5 = 1$, then $x_1 + x_2 + x_3 + x_4 \le 1$
Consider (1,1,1,0,1/17)
• Hence, $x_1 + x_2 + x_3 + x_4 + 2x_5 \le 3$ is valid
• Need to solve knapsack problem $\alpha_j := d_0 - \max\{dx \mid ax \le b - a_j\}$ to find lifting coefficient for variable $x_j$
• Use dynamic programming to solve knapsack problem
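The lifting formula can be evaluated with a standard 0/1 knapsack dynamic program. The sketch below assumes integer knapsack weights; the slide does not show the underlying knapsack row, so the coefficients used here (weights 5, 5, 5, 5, 11 and capacity 16) are hypothetical, chosen so that the cover inequality $x_1 + x_2 + x_3 + x_4 \le 3$ lifts to the coefficient 2 for $x_5$ as in the text.

```python
def knapsack_max(values, weights, capacity):
    """0/1 knapsack by dynamic programming over integer capacities."""
    best = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        for cap in range(capacity, w - 1, -1):   # descending: each item used at most once
            best[cap] = max(best[cap], best[cap - w] + v)
    return best[capacity]

def lifting_coefficient(d, d0, a, b, a_j):
    """alpha_j = d0 - max{ d x : a x <= b - a_j, x binary }."""
    if a_j > b:
        raise ValueError("x_j = 1 is infeasible on its own; no finite coefficient")
    return d0 - knapsack_max(d, a, b - a_j)

# Hypothetical knapsack: 5 x1 + 5 x2 + 5 x3 + 5 x4 + 11 x5 <= 16
# Cover inequality on x1..x4: x1 + x2 + x3 + x4 <= 3, i.e. d = (1, 1, 1, 1), d0 = 3
alpha_5 = lifting_coefficient(d=[1, 1, 1, 1], d0=3, a=[5, 5, 5, 5], b=16, a_j=11)
print(alpha_5)
```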
[Bar chart; bar labels not recoverable. Values shown: 48%, 28%, 14%, 8%, 7%, 6%, 4%, 4%, 3%, 2%]
Achterberg and Wunderling: "Mixed Integer Programming: Analyzing 12 Years of Progress" (2013)
benchmark data based on CPLEX 12.5
• At the end of a MIP log we usually find statistics about which cuts have been added to the LP relaxation:
Cutting planes:
Gomory: 37
Lift-and-project: 3
Cover: 8
Implied bound: 19
MIR: 326
StrongCG: 14
Flow cover: 624
Inf proof: 4
Zero half: 19
Mod-K: 1
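When comparing runs, it can be handy to pull these counts out of a log programmatically. A minimal sketch, assuming the section starts with a line `Cutting planes:` and continues with `name: count` lines as in the excerpt above; the function name is illustrative and the parser is not part of any Gurobi API.

```python
import re

def parse_cut_statistics(log_text):
    """Extract {cut name: count} from the 'Cutting planes:' section of a MIP log."""
    counts = {}
    in_section = False
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped == "Cutting planes:":
            in_section = True
            continue
        if in_section:
            match = re.fullmatch(r"(.+?):\s*(\d+)", stripped)
            if match is None:
                break                      # first non-matching line ends the section
            counts[match.group(1)] = int(match.group(2))
    return counts

log_excerpt = """Cutting planes:
  Gomory: 37
  Lift-and-project: 3
  Cover: 8
  Implied bound: 19
  MIR: 326
  StrongCG: 14
  Flow cover: 624
  Inf proof: 4
  Zero half: 19
  Mod-K: 1
"""
cuts = parse_cut_statistics(log_excerpt)
print(sum(cuts.values()), max(cuts, key=cuts.get))
```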
[Bar chart of branching rules: most fractional, random, pseudo-costs with SB init, reliability (baseline); values shown: 736%, 548%, 0%, 2% (assignment of values to bars not recoverable)]
Achterberg and Wunderling: "Mixed Integer Programming: Analyzing 12 Years of Progress" (2013)
benchmark data based on CPLEX 12.5
Achterberg, Koch, and Martin: "Branching Rules Revisited" (2005)
Constructive heuristics
• No knowledge about other solutions needed
• Goal is to find a solution early and to define a “starting” point for improvement heuristics
• May produce poor quality solutions
• Typically fast (but not always, e.g. NoRel, ZeroObj)
Improvement heuristics
• Can be more expensive
• Need at least one known solution to work on
• High quality solutions
• Provide better cutoff bound to prune tree
• Can be effective even on low quality solutions
Rounding Heuristic
• Start with:
• Solution of relaxation
• Round integer variables
Quick?
• Very quick
Captures problem structure?
• No
General?
• Finds solutions to lots of easy models
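A rounding heuristic in its simplest form fits in a few lines. This is a sketch under assumptions: constraints are given as $Ax \le b$, rounding is to the nearest integer, and the candidate is simply rejected if any row is violated. The function name and data are illustrative; this is not Gurobi's rounding heuristic.

```python
from math import floor

def round_and_check(relaxation_x, integer_vars, A, b, tol=1e-9):
    """Round the integer variables of a relaxation solution; return the rounded
    point if it satisfies A x <= b, otherwise None."""
    candidate = list(relaxation_x)
    for j in integer_vars:
        candidate[j] = floor(candidate[j] + 0.5)   # round to nearest, halves go up
    for row, rhs in zip(A, b):
        if sum(coef * v for coef, v in zip(row, candidate)) > rhs + tol:
            return None                            # rounding broke this constraint
    return candidate

# Illustrative constraint: 2 x1 + 2 x2 <= 3 with x1, x2 integer
A, b = [[2, 2]], [3]
found = round_and_check([0.2, 0.9], [0, 1], A, b)     # rounds to (0, 1): feasible
failed = round_and_check([0.75, 0.75], [0, 1], A, b)  # rounds to (1, 1): infeasible
print(found, failed)
```

The second call shows why the heuristic is quick but blind to problem structure: a perfectly feasible relaxation point can round to an infeasible one.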
No Relaxation Heuristic
• Start from some (feasible or infeasible) vector
• Constructed by quick heuristic
• Solve smaller sub-MIPs (with fixed variables) to decrease infeasibility or objective value
• Use multiple threads to solve sub-MIPs in parallel
• Various neighborhood strategies
• Adaptive: spends more time on the more successful ones
Quick?
• No – runs “forever”, until work (NoRelHeurWork) or time (NoRelHeurTime) limit is reached
Captures problem structure?
• No – ad-hoc generation of sub-MIPs
General?
• No – main use when relaxations solve too slowly
• Yes – can succeed in finding good solutions in limited time
Heuristic Log Lines
Root relaxation: objective 1.010525e+06, 1504 iterations, 0.06 seconds (0.13 work units)
Roland Wunderling
Senior Developer
[Bar chart: x-axis 2 threads, 4 threads, 6 threads, 8 threads, 10 threads, 12 threads; y-axis 0.00 to 1.60; values shown: 144%, 113%, 54%, 39%, 37%, 31%, 25%, 19%, 5% (assignment of values to bars not recoverable)]
Achterberg and Wunderling: "Mixed Integer Programming: Analyzing 12 Years of Progress" (2013)
benchmark data based on CPLEX 12.5, models with ≥ 10 seconds solve time
Consider:
• 5 b1 + 3 b2 + 3 b3 + 3 b4 + 8 b5 ≤ 8
Stronger…?
• 4 b1 + 4 b2 + 4 b3 + 4 b4 + 8 b5 ≤ 8
Probably, but…
• Doesn't strictly dominate original
• (1, 0, 0, 0, 0.5) satisfies second, but not first
• Could weaken relaxation
• No definitive metrics
• Hippocratic oath of presolve
• "First, do no harm"
• Lots of cases where it hurts
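The claim about the two inequalities can be checked by enumeration. The sketch below confirms that both admit exactly the same binary points, that (1, 0, 0, 0, 0.5) satisfies the second but not the first, and that there are also fractional points the second cuts off. The extra point (0, 1, 1, 2/3, 0) is an example added here for illustration; it is not from the slide.

```python
from fractions import Fraction
from itertools import product

def satisfies(coefs, rhs, point):
    """True if sum(coefs[i] * point[i]) <= rhs."""
    return sum(a * v for a, v in zip(coefs, point)) <= rhs

original = ([5, 3, 3, 3, 8], 8)
modified = ([4, 4, 4, 4, 8], 8)

# Same set of feasible binary points for both inequalities?
same_integer_set = all(
    satisfies(*original, p) == satisfies(*modified, p)
    for p in product((0, 1), repeat=5)
)

# The point from the slide: feasible for the modified row, cut off by the original.
p1 = (1, 0, 0, 0, Fraction(1, 2))
# An added illustrative point: feasible for the original row, cut off by the modified.
p2 = (0, 1, 1, Fraction(2, 3), 0)

print(same_integer_set,
      satisfies(*original, p1), satisfies(*modified, p1),
      satisfies(*original, p2), satisfies(*modified, p2))
```

Neither relaxation contains the other, which is why there is no definitive metric for calling one formulation stronger.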