
Integer Programming

Lecture Notes

School of Mathematics
University of Birmingham
Birmingham B15 2TT, UK

J-BJI – Autumn 2023

Version of November 17, 2023


Recommended literature
Below you find a list of relevant (though not required) books. The lecture notes are mainly
based on [1].
[1] L.A. Wolsey, Integer Programming, John Wiley & Sons, 1998.
[2] G.L. Nemhauser and L.A. Wolsey, Integer and Combinatorial Optimization, John Wiley & Sons, 1988 (reprinted 1999).
[3] A. Schrijver, Theory of Linear and Integer Programming, John Wiley & Sons, 1986.
[4] M. Conforti, G. Cornuéjols and G. Zambelli, Integer Programming, Springer, 2014.
Contents

1 Introduction
  1.1 Scope of the course

I Modeling & Some Methods

2 Modeling Part I: Fundamental Integer Programs
  2.1 Capital Budgeting Problems
  2.2 0-1 Knapsack Problem
  2.3 Integer Knapsack Problem
  2.4 Assignment Problem
  2.5 Set covering problem
  2.6 Traveling salesman problem (TSP)

3 First Methods for Solving IPs and BIPs
  3.1 Brute Force – Combinatorial Explosion
  3.2 Graphical method
  3.3 Solving IP by rounding the solution and objective of an LP?
  3.4 Solving Integer Knapsack Problems by Divide and Conquer & Dynamic Programming

4 Modeling Part II: Fundamental Mixed Integer Programs
  4.1 Uncapacitated Facility Location (UFL)
  4.2 Uncapacitated Lot-Sizing (ULS)
  4.3 Alternatives

5 Modeling Part III: Formulations
  5.1 Convex Geometry
  5.2 Formulations for (Mixed) Integer Programs
  5.3 Quality of Formulations

II Optimality, Bounds and Relaxation

6 Bounds and Optimality Criteria
  6.1 Optimality – Overview
  6.2 Finding primal bounds
  6.3 Finding Dual Bounds – Relaxation

7 Total Unimodularity (TU)
  7.1 A sufficient criterion for the LP relaxation to be integral
  7.2 Recognizing TU matrices
  7.3 Application to Minimum Cost Network Flow Problems

III Further Methods and Algorithms

8 Branch and Bound
  8.1 Solving general integer programming problems
  8.2 Pruning Mechanisms
  8.3 Branch and Bound Method
  8.4 An Example
  8.5 Preprocessing and Fine-Tuning
  8.6 Node Selection
  8.7 Branching Options
  8.8 Further Improving Branch-and-Bound Systems

9 Cutting Plane Algorithm
  9.1 Valid inequalities
  9.2 Generating valid inequalities for integer programs
  9.3 Gomory's method

IV More on Relaxation and Applications

10 Lagrangian relaxation
  10.1 Solving the Lagrangian Dual
  10.2 The Lagrangian dual as non-differentiable convex optimization
  10.3 Solving (LD) by the subgradient method

11 Further applications of integer programming
  11.1 Fixed Costs
  11.2 Piecewise Linear Representation
  11.3 Approximation of Nonlinear Functions

1 – Introduction
In design, construction, maintenance and similar tasks, engineers have to take decisions. The goal of all
such decisions is either to minimise effort or to maximise benefit. Sometimes it is important
to restrict the possible solutions to integer values; for instance, it is not profitable
to produce half a car. In such situations we speak of integer programming. Indeed, many
practical problems such as train and airline scheduling, vehicle routing, production plan-
ning, resource management, telecommunications and network design can be modeled as
integer or mixed-integer programs.

Before we delve deeper into the particulars of integer programming let us recall the steps
to be taken to model practical problems mathematically. For modeling, the following
three questions need to be addressed.
1.) What precisely shall be decided? This yields variables (also called design or decision variables), which typically are stored in a vector x ∈ Rn. Every assignment
of values to the vector x is a solution to the problem.
2.) Which conditions need to be satisfied? This leads to formulating constraints in
the form of equalities and inequalities. One needs to check which values of the
design variables satisfy these equalities and inequalities. If a solution x satisfies all
the constraints, then it is called a feasible solution.
3.) Which feasible solution is the best one? For this one needs to come up with a rating
of the different feasible solutions (e. g. in regards to effort or benefit) so that one
can compare them and choose the best one of them. Such an optimisation criterion
is called objective and a feasible solution which is optimal with respect to the
objective is called an optimal solution.
The result of the above process is a mathematical model or program. The objective
can usually be expressed as a function of the design variables, which leads to the following
definition.

Definition 1.1 (Optimization or mathematical programming problem). A (nonlinear)


optimization or mathematical programming problem is given by

(N LP ) min f (x)
s.t. gi (x) ≤ 0 ∀i ∈ I1 ,
gi (x) = 0 ∀i ∈ I2 ,
x ∈ B,

where
• I = I1 ∪ I2 is a finite index set with I1 ∩ I2 = ∅,
• B ⊆ Rn,
• f, gi : B → R for all i ∈ I.


Remark 1.2. (i) In general the minimum might not exist and thus strictly speaking
we need to consider the infimum.
(ii) There are applications, in which B ⊆ Rn does not hold or in which I is not finite.
(iii) It suffices to consider minimization, as max f (x) = − min(−f (x)).
(iv) Similarly, it suffices to consider constraints of the form gi (x) ≤ 0, as gi (x) ≥ 0 is
equivalent to −gi (x) ≤ 0.

Mathematical programming problems can be classified in regards to the structure of the


constraints as follows.
Unconstrained problems: Here, no constraints are imposed, meaning that every point
in Rn is feasible. Formally, (N LP ) is unconstrained if B = Rn and I = ∅.
Constrained problems: Here, not all points in Rn are feasible. That means that con-
straints are imposed either as inequalities of the form gi ≤ 0, as equalities of the
form gi = 0 or by restricting the feasible domain B.
Integer problems: Such problems form an important subclass of constrained problems.
Here, it is requested that the solutions be integer, that is B ⊆ Zn . Of course integer
programs can have further constraints. As mentioned above, many problems arising
in practice can be modeled by integer programs. An important special case is formed
by boolean (also known as binary or 0-1) problems, where the variables are
restricted to taking the values 0 or 1, i. e. B ⊆ {0, 1}^n. Such programs are likewise
very significant in applications and often are surprisingly difficult to solve.
Mixed integer problems: Also these programs form an important subclass of con-
strained programs. Here, it is requested that some but not all variables be integer,
that is xj ∈ Z for some j ∈ {1, . . . , n} and xj ∈ R for the others.
Combinatorial Optimization problems: Loosely speaking, we speak of combinato-
rial optimization problems if the feasible domain is of finite cardinality, i. e. consists
of finitely many elements. A formal definition was provided in the first part of the
course, Combinatorial Optimisation.
Mathematical programming problems can also be classified in regards to the structure of
the objective and constraints. Important classes in this respect are the following.
Linear programs: f and gi are affine linear ∀i ∈ I.
Convex programs: f and gi are convex ∀i ∈ I1 and gi are affine linear for i ∈ I2 .
Smooth programs: f and gi are differentiable for all i ∈ I.
Naturally, one can combine the above classifications and speak of
• Smooth convex programs,
• Convex integer programs or
• Linear integer programs
to name but a few.


1.1 Scope of the course


In this course we will focus on linear integer programming problems and linear mixed
integer programming problems. We will abbreviate such programs by IP and MIP
respectively. (Nonlinear integer and mixed integer programs are abbreviated by INLP
and MINLP respectively in the literature but these do not fall into the scope of the
present course.) In these lecture notes we will often leave out the term 'linear' and just
speak of integer programs; these shall, however, always be linear:
Definition 1.3 ((Mixed and Binary) Integer Program). Let A and G respectively denote
an m × n and m × p matrix. Further, let c, x ∈ Rn , b ∈ Rm and h, y ∈ Rp . We define
integer programs (IP), mixed integer programs (MIP), and binary or boolean
integer programs (BIP) to be linear programs of the form

(IP)  max c⊤x
      s.t. Ax ≤ b,
           x ≥ 0 and integer;

(MIP) max c⊤x + h⊤y
      s.t. Ax + Gy ≤ b,
           x ≥ 0,
           y ≥ 0 and integer;

(BIP) max c⊤x
      s.t. Ax ≤ b,
           x ∈ {0, 1}^n.
Remark 1.4. Any (BIP) is a combinatorial optimization problem, as {0, 1}^n is of cardinality 2^n. Can you formulate a corresponding COP as defined in the course Combinatorial
Optimisation?

In the present course we will examine how some fundamental examples of optimisation
problems can be modeled as IPs, BIPs and MIPs. Further, we will discuss methods
and algorithms for solving these. The methods and algorithms that we study build on
investigations concerning optimality criteria and finding upper and lower bounds on the
objective value.
This module presents a comprehensive theory as well as exact and approximate algorithms
for integer programming problems and a wide variety of its applications. More precisely,
this module will start with modeling, formulations and illustrative examples of integer
programming problems. Before following on to discussing optimality criteria and bounds
(including relaxation and total unimodularity) we will discuss some first methods for
solving certain IPs and BIPs. These methods include brute force, the graphical method,
divide and conquer and dynamic programming. We will then turn to some further im-
portant computational methods of integer programming, such as branch and bound, valid
inequalities and the cutting plane method. Finally, we will investigate computational
complexity of the problems and consider some further applications.

Acknowledgments. These notes are based on the lecture notes Integer Programming
by Dr. Yunbin Zhao, whom I wish to thank for sharing his notes.

Part I – Modeling & Some Methods

A central and fundamental part of Integer Programming is to find a good mathematical


description of integer optimization problems. Finding such models and deciding which
models are good ones is the content of the first part of the lecture course.

2 – Modeling Part I:
Fundamental Integer Programs
In this chapter we will introduce several well-known and fundamental optimization prob-
lems, which can be modeled as integer programs. We will focus on the modeling process
here and will return to these examples in later chapters to learn about algorithms and
methods to solve them.

2.1 Capital Budgeting Problems


Suppose we wish to invest £19 000. There are four investment opportunities. The required
investment sums and the net present values are listed in the following table.

Investment Required investment in £ Net present value in £


1 6 700 8 000
2 10 000 11 000
3 5 500 6 000
4 3 400 4 000

Into which investments should we place our money so as to maximize our total present
value? Each investment is a ‘take it or leave it’ opportunity: it is not allowed to invest
partially in any of the investments.
Recall from the introduction that we need to determine the decision variables, the con-
straints and the objective in order to arrive at a mathematical program.
Definition of Variables: We use binary variables xj for each investment, i. e. xj ∈ {0, 1}
for j ∈ {1, 2, 3, 4}. If xj = 1 then we will make investment j. If xj = 0, we will not
make the investment.


Constraint: The total investment may not exceed our budget:

6 700x1 + 10 000x2 + 5 500x3 + 3 400x4 ≤ 19 000.

Objective: We wish to maximize our profit:

max 8 000x1 + 11 000x2 + 6 000x3 + 4 000x4 .

Thus, we arrive at the (BIP)

(BIP ) max 8 000x1 + 11 000x2 + 6 000x3 + 4 000x4


s. t. 6 700x1 + 10 000x2 + 5 500x3 + 3 400x4 ≤ 19 000
x ∈ {0, 1}4 .

There are a number of additional constraints we might want to add. For instance, consider
the following additional constraints:
• Only make two investments, i. e. x1 + x2 + x3 + x4 ≤ 2.
• If investment 2 is made, then investment 4 must also be made, i. e. x2 ≤ x4 .
• If investment 1 is made, then investment 3 cannot be made, i. e. x1 + x3 ≤ 1.
This leads to the 0-1 integer programming problem:
max 8 000x1 + 11 000x2 + 6 000x3 + 4 000x4
s. t. 6 700x1 + 10 000x2 + 5 500x3 + 3 400x4 ≤ 19 000
x1 + x2 + x3 + x4 ≤ 2
(2.1)
x2 − x4 ≤ 0
x1 + x3 ≤ 1
x1 , . . . , x4 ∈ {0, 1}

2.2 0-1 Knapsack Problem


Example 2.1. A knapsack of volume b has to be packed with a selection of n items. Item
i has volume ai and value ci . How to pack the knapsack so as to maximize the total value
of items in it? (We will neglect the shapes of the objects and assume that any selection
of items can be packed into the knapsack without wasted space.)
Example 2.2. A budget b is available for investment in projects during the coming year.
n projects are under consideration, ai shall denote the outlay for project i, and ci shall
denote its expected return. The goal is to choose a set of projects so that the budget is
not exceeded and the expected return is maximized.

We can model these problems as follows.


Definition of the Variables:
x_i = 1 if item i is selected, and x_i = 0 otherwise.


Constraints:

∑_{i=1}^n a_i x_i ≤ b,
x_i ∈ {0, 1}, i ∈ {1, . . . , n}.

Objective:

max ∑_{i=1}^n c_i x_i

2.3 Integer Knapsack Problem


Here, the situation is the same as in Section 2.2, but multiple copies of each type of item
are available. In this case the problem can be modeled as
max ∑_{i=1}^n c_i x_i
s.t. ∑_{i=1}^n a_i x_i ≤ b,
     x_i ≥ 0 and integer for i ∈ {1, . . . , n}.

2.4 Assignment Problem


There are n people available to carry out n jobs. Each person is assigned to carry out
exactly one job. Some individuals are better suited to particular jobs than others, so there
is an estimated cost cij if person i is assigned to job j. The problem is to find a minimum
cost assignment.

Figure 2.1: Assignment problem, where 4 persons are assigned to 4 different jobs.


Conditions:
(i) n people carry out n jobs.
(ii) Each person carries out exactly one job.
(iii) If person i is assigned to job j, a cost cij is incurred
Which assignment minimizes the total cost?

Definition of the Variables: For i, j ∈ {1, . . . , n} define


x_ij = 1 if person i carries out job j, and x_ij = 0 otherwise.

So the variable x_ij only takes the values 0 or 1.


Constraints:
• Each person does exactly one job: ∑_{j=1}^n x_ij = 1, i ∈ {1, . . . , n}.
• Each job is done by exactly one person: ∑_{i=1}^n x_ij = 1, j ∈ {1, . . . , n}.
• Variables are 0-1 (binary): x_ij ∈ {0, 1}, i, j ∈ {1, . . . , n}.

Objective: The total cost is

∑_{i=1}^n ∑_{j=1}^n c_ij x_ij.

Thus, the assignment problem can be formulated as the following binary program:
min ∑_{i=1}^n ∑_{j=1}^n c_ij x_ij
s.t. ∑_{j=1}^n x_ij = 1, i ∈ {1, . . . , n},
     ∑_{i=1}^n x_ij = 1, j ∈ {1, . . . , n},
     x_ij ∈ {0, 1}, i, j ∈ {1, . . . , n}.
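Although solution methods are only discussed from Chapter 3 onwards, for very small n this model can already be solved by enumerating all n! permutations. Below is a minimal Python sketch of this brute force with illustrative cost data (the matrix is not from the notes):

    from itertools import permutations

    # cost[i][j] is the cost c_ij of assigning person i to job j (illustrative data).
    cost = [[9, 2, 7, 8],
            [6, 4, 3, 7],
            [5, 8, 1, 8],
            [7, 6, 9, 4]]
    n = len(cost)

    # Every feasible assignment corresponds to a permutation p (person i does job p[i]),
    # so brute force compares n! candidates.
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))
    print(best, sum(cost[i][best[i]] for i in range(n)))   # -> (1, 0, 2, 3) with cost 13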


2.5 Set covering problem


• A city has m neighborhoods, and n potential locations for installing fire stations.
• Sj for j ∈ {1, . . . , n} is the set of neighborhoods that can be served from location j.
• Establishing a fire station at location j incurs costs of cj .

Where to set up fire stations to minimize the total set-up costs?


To model this problem, let

M = {1, ..., m}, N = {1, ..., n}.

Then the problem can be formulated as a combinatorial problem:

min_{T ⊆ N} { ∑_{j∈T} c_j : ⋃_{j∈T} S_j = M }.

(Notice: The above model is not formulated as a Combinatorial Optimisation Problem


(COP) as we defined it in ’ Combinatorial Optimisation’. Can you modify the model to
obtain a COP? Hint: Consider complements.)
The above model is not in an explicit form of an integer program. However, it can be
reformulated as a binary integer program in the following way:

Defining the variables: Define an indicator variable x_j as follows: x_j = 1 if location j is selected, and x_j = 0 otherwise.

Constraints: Define a_ij = 1 if i ∈ S_j, and a_ij = 0 otherwise.

(We may construct a 0-1 incidence matrix A = (a_ij); this is nothing but preprocessing of the data.) With this notation we can describe the covering constraint (at
least one fire station must serve neighborhood i) as follows:

∑_{j=1}^n a_ij x_j ≥ 1, for each i ∈ {1, . . . , m}.

Objective: Minimizing the total costs can be formulated as


min ∑_{j=1}^n c_j x_j.


Therefore, the set covering problem (which we had formulated as a combinatorial problem
before) can be likewise formulated as a BIP:
min ∑_{j=1}^n c_j x_j
s.t. ∑_{j=1}^n a_ij x_j ≥ 1, i ∈ {1, . . . , m},
     x_j ∈ {0, 1} for all j ∈ {1, . . . , n}.
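For tiny instances the combinatorial formulation can be attacked directly by enumerating all subsets T ⊆ N. A minimal Python sketch with illustrative data (not from the notes):

    from itertools import combinations

    # Illustrative data: m = 5 neighborhoods, n = 4 potential locations.
    M = {1, 2, 3, 4, 5}
    S = {1: {1, 2}, 2: {2, 3, 4}, 3: {4, 5}, 4: {1, 3, 5}}
    c = {1: 3, 2: 4, 3: 4, 4: 6}

    best, best_T = None, None
    for r in range(len(S) + 1):
        for T in combinations(S, r):                  # candidate sets of locations
            covered = set().union(*(S[j] for j in T))
            if covered == M:                          # T covers every neighborhood
                total = sum(c[j] for j in T)
                if best is None or total < best:
                    best, best_T = total, T
    print(best_T, best)                               # -> (2, 4) with cost 10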

2.6 Traveling salesman problem (TSP)


The following example is perhaps the most notorious problem in operations research,
because it is easy to explain and tempting to try to solve; moreover, it arises
in a multitude of forms.
• A salesman must visit each of n cities exactly once and then return to his starting
point.
• The time taken to travel from city i to city j is cij .
• Find the order in which he should make his tour so as to finish as quickly as possible.

Remark 2.3. It may be the case that cij ̸= cji . If cij = cji for all i, j ∈ {1, ..., n}, then
we speak of the Symmetric Traveling salesman problem (STSP).

TSP can be stated in different forms: A truck driver has a list of clients he must
visit on a given day, or a machine must place modules on printed circuit boards, or a
stacker crane must pick up and deposit crates.

Formulation as a binary integer program:


Decision variables: For all i, j,

x_ij = 1 if he goes directly from city i to city j, and x_ij = 0 otherwise.

Constraints:
• He leaves city i exactly once: ∑_{j: j≠i} x_ij = 1, i ∈ {1, . . . , n}.
• He arrives in city j exactly once: ∑_{i: i≠j} x_ij = 1, j ∈ {1, . . . , n}.


Figure 2.2: Subtours

The two types of constraints above are related to the constraints of the assignment
problem. A solution to the assignment problem (with additional condition xii = 0)
might give a solution of the form shown as in Figure 2.2 (i. e., a set of disconnected
sub-tours).

To eliminate these solutions, we need more constraints that guarantee connectivity


by imposing that the salesman must pass from one set of cities to another, so-called
cut-set constraints:
∑_{i∈S} ∑_{j∉S} x_ij ≥ 1, ∀S ⊊ N, S ≠ ∅,

where N = {1, 2, ..., n}.

An alternative is to replace the above cut-set constraints by the following subtour


elimination constraints:

∑_{i∈S} ∑_{j∈S, j≠i} x_ij ≤ |S| − 1 for S ⊆ N, 2 ≤ |S| ≤ n − 1.

Thus, the TSP can be formulated as follows

min ∑_{i=1}^n ∑_{j=1}^n c_ij x_ij
s.t. ∑_{j: j≠i} x_ij = 1, i = 1, ..., n,
     ∑_{i: i≠j} x_ij = 1, j = 1, ..., n,
     ∑_{i∈S} ∑_{j∉S} x_ij ≥ 1, ∀S ⊊ N, S ≠ ∅,
     x_ij ∈ {0, 1} for all i, j = 1, ..., n,


or, alternatively,

min ∑_{i=1}^n ∑_{j=1}^n c_ij x_ij
s.t. ∑_{j: j≠i} x_ij = 1, i = 1, ..., n,
     ∑_{i: i≠j} x_ij = 1, j = 1, ..., n,
     ∑_{i∈S} ∑_{j∈S, j≠i} x_ij ≤ |S| − 1 for S ⊆ N, 2 ≤ |S| ≤ n − 1,
     x_ij ∈ {0, 1} for all i, j = 1, ..., n.
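For very small n one can solve the TSP by enumerating all (n − 1)! tours directly (cf. Section 3.1.4 below). A minimal Python sketch with an illustrative symmetric distance matrix (not from the notes):

    from itertools import permutations

    INF = float("inf")
    c = [[INF, 3, 4, 2, 7],       # c[i][j]: travel time from city i to city j
         [3, INF, 4, 6, 3],
         [4, 4, INF, 5, 8],
         [2, 6, 5, INF, 6],
         [7, 3, 8, 6, INF]]
    n = len(c)

    # Fix city 0 as the start; each tour is a permutation of the remaining n - 1 cities.
    def tour_length(order):
        route = (0,) + order + (0,)
        return sum(c[route[k]][route[k + 1]] for k in range(n))

    best = min(permutations(range(1, n)), key=tour_length)
    print((0,) + best + (0,), tour_length(best))   # -> (0, 2, 1, 4, 3, 0) with length 19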

3 – First Methods for Solving IPs and BIPs


3.1 Brute Force – Combinatorial Explosion
The integer programs that we got to know in Chapter 2 all have the property that the
set of feasible solutions is of finite cardinality. One strategy to solve such problems is
to explicitly determine all of the finitely many feasible solutions, calculate the objective
value for each of them and choose the one with maximal objective value. This approach
is known as the brute-force approach.
Example 3.1. Recall the Capital Budgeting Problem from Section 2.1 and its formulation
as a BIP in (2.1):

max 8 000x1 + 11 000x2 + 6 000x3 + 4 000x4


s. t. 6 700x1 + 10 000x2 + 5 500x3 + 3 400x4 ≤ 19 000
x1 + x2 + x3 + x4 ≤ 2
x2 − x4 ≤ 0
x1 + x3 ≤ 1
x1 , . . . , x4 ∈ {0, 1}

We can solve this relatively small problem by using brute-force, i. e. consider all sixteen
binary vectors (x1 , x2 , x3 , x4 ) ∈ {0, 1}4 , determine which of these vectors satisfy the con-
straints, calculate the objective value for all feasible vectors and choose the one with
maximal objective value. The set of feasible solutions is
              

 1 1 0 0 0 0 0 
              
0 , 0 , 1 , 0 , 0 , 0 , 0 .

 0 0 0 1 1 0 0
 
1 0 1 1 0 1 0
 

It is not difficult to check that (0, 1, 0, 1)⊤ gives the optimal solution. The associated
optimal objective value is 15 000.
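A minimal Python sketch of this brute-force enumeration (not part of the original notes):

    from itertools import product

    cost = [6700, 10000, 5500, 3400]    # required investments
    value = [8000, 11000, 6000, 4000]   # net present values

    best, best_x = None, None
    for x in product([0, 1], repeat=4):                     # all 16 binary vectors
        feasible = (sum(c * xi for c, xi in zip(cost, x)) <= 19000  # budget
                    and sum(x) <= 2                         # at most two investments
                    and x[1] <= x[3]                        # investment 2 requires 4
                    and x[0] + x[2] <= 1)                   # 1 and 3 exclude each other
        if feasible:
            obj = sum(v * xi for v, xi in zip(value, x))
            if best is None or obj > best:
                best, best_x = obj, x
    print(best_x, best)                                     # -> (0, 1, 0, 1) with value 15000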


3.1.1 BIP The idea described in the above example can be generalised to general BIP
problems. If we have n binary design variables, then there are 2^n possible solutions. For
each of these we check whether all constraints are satisfied, eliminate those which do not satisfy
all constraints, and determine the objective value of the remaining ones. Note that the
time needed to find a solution following this strategy grows exponentially in n.

3.1.2 The Assignment Problem There is a one-to-one correspondence between assignments and permutations of {1, . . . , n}. Thus, there are n! solutions to compare.
(Note, by Stirling's formula, we know that n! ∼ √(2πn) (n/e)^n as n → ∞, or equivalently,
lim_{n→∞} √(2πn) (n/e)^n / n! = 1.)

3.1.3 The Knapsack and Covering Problems In both cases the number of subsets
is 2^n. For the knapsack problem with b = ∑_{j=1}^n a_j / 2, at least half of the subsets are
feasible, and thus there are at least 2^{n−1} feasible solutions to compare.

3.1.4 The Traveling Salesman Problem Starting at city 1, the salesman has n − 1
choices. For the next choice n − 2 cities are possible, and so on. Thus, there are (n − 1)!
feasible tours.
The conclusion drawn from the above classes of examples is that it is only sensible to use
brute-force for solving such problems for very small values of n, as in all instances the
problem size grows exponentially with n.

Remark 3.2. Every programming problem whose set of feasible solutions is of finite
cardinality can be formulated as a combinatorial optimisation problem. Having the brute-
force approach in mind one can consider the following independence system (X, F), where
X is the set of all feasible solutions and where F = {Y ⊆ X : |Y | ≤ 1}.
Let us consider BIPs: Assume that m, n ∈ N and A ∈ Rm×n . Further, let c, x ∈ Rn and
b ∈ Rm .
(BIP ) max c⊤ x
s. t. Ax ≤ b
x ∈ {0, 1}n
can be formulated as a COP in the following way. For the seed set X we can take
X = {x ∈ {0, 1}n | Ax ≤ b}. Further, we let F = {Y ⊆ X : |Y | ≤ 1} as above and
w(x) = c⊤ x for x ∈ X. Then the above BIP is the COP associated with the independence
system (X, F) with weight function given by w. Note that using this independence system
is not very convenient and indeed corresponds to the brute-force approach.

3.2 Graphical method


For linear programs with two (or three) decision variables we can apply the graphical
method to find an optimal solution. Below, we demonstrate by means of an example that
the graphical method also allows for finding optimal integer solutions.


Example 3.3. We consider the following (IP) and its so-called relaxed (LP).

(IP ) max 4x1 − x2 (LP ) max 4x1 − x2


s. t. 7x1 − 2x2 ≤ 14 s. t. 7x1 − 2x2 ≤ 14
x2 ≤ 3 x2 ≤ 3
2x1 − 2x2 ≤ 3 2x1 − 2x2 ≤ 3
x1 , x2 ∈ Z≥0 x1 , x2 ≥ 0
Here, and in the remainder of this course, Z≥0 and Z+ shall denote the set of non-negative
integers, i. e. Z≥0 := Z+ := {0, 1, 2, 3, . . .}.
In the relaxed (LP) the conditions that x1 , x2 be integer are dropped. (We will study
relaxation in more detail in Chapters 6 and 10.)
We know how to solve (LP) graphically:
1.) First we determine the simplex of feasible solutions (grey region in Figure 3.1a) by
plotting the constraints (red lines in Figure 3.1a).
2.) Then we choose an arbitrary point in the simplex, evaluate its objective value and
plot the associated contour line of the objective function. In Figure 3.1a we chose
the point (0, 0)⊤ with objective value 0. The associated contour line is plotted in
green.
3.) Increasing the objective value corresponds to shifting this line parallel to itself. We
now shift as far as we can without leaving the simplex. It is apparent that the obtained
line intersects the simplex either in exactly one point or in a whole line segment.
In the first case we have arrived at the unique optimal solution. In the second
case there are infinitely many optimal solutions, namely all those which lie on the
described line segment. We conclude from Figure 3.1a that the optimal solution to
(LP) is (20/7, 3)⊤ with objective value 59/7.

Figure 3.1: Graphical solution of the (IP) and its associated relaxed (LP) from Example 3.3; panel (a) shows the LP with optimal solution (20/7, 3)⊤, panel (b) the IP with optimal solution (2, 1)⊤.


Let us now turn to (IP). We can proceed in a similar manner. But notice that the feasible
region for (LP) is a polyhedron, whereas the feasible region of (IP) is a discrete set of
lattice points. Thus, we need to restrict to integer lattice points:
1.) First we determine the feasible region by intersecting Z2≥0 with the simplex which
we determined for (LP). We end up with a finite set of points which are visualised
by black balls in Figure 3.1b.
2.) As before, we choose an arbitrary point in the feasible region (i. e. from the finitely
many points), evaluate its objective value and plot the associated contour line of the
objective function. In Figure 3.1b we again chose the point (0, 0)⊤ with objective
value 0. The associated contour line is plotted in green.
3.) Again we shift the green line as far as we can but only consider those possibilities
which pass through feasible points. We conclude from Figure 3.1b that the optimal
solution to (IP) is (2, 1)⊤ with objective value 7.
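The graphical reasoning can be double-checked by enumerating the finitely many feasible lattice points. A minimal Python sketch (not part of the original notes):

    # 7*x1 - 2*x2 <= 14 together with x2 <= 3 forces x1 <= 2, so a small box suffices.
    best, best_pt = None, None
    for x1 in range(0, 4):
        for x2 in range(0, 4):
            if 7*x1 - 2*x2 <= 14 and x2 <= 3 and 2*x1 - 2*x2 <= 3:
                obj = 4*x1 - x2
                if best is None or obj > best:
                    best, best_pt = obj, (x1, x2)
    print(best_pt, best)   # -> (2, 1) with objective value 7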

3.3 Solving IP by rounding the solution and objective of an LP?


The first idea that one could have to solve a general IP is the following. Instead of solving
the (IP) one first solves the associated relaxed (LP), e. g. by using the Simplex method.
Then one rounds the obtained solution and objective value. However, when proceeding
in this way the following difficulties can arise.
1.) It can happen that the rounded solution is infeasible, even if one allows both rounding up and rounding down. This is visualized by the example in R² that is displayed in
Figure 3.2a.

Figure 3.2: Rounding. (a) Rounding leaves the feasible region. (b) The rounded solution is far away from the optimal integer solution.

2.) Even if one obtains a feasible solution by rounding up or down, the optimal objective
value of the (IP) can be arbitrarily far away from the rounded optimal objective
value of the (LP). This is visualized by the following example in R2 that is displayed
in Figure 3.2b.


Example 3.4. Consider the integer program:

max x1 + 0.64x2
s. t. 50x1 + 31x2 ≤ 250,
3x1 − 2x2 ≥ −4,
x1 , x2 ≥ 0 and integer.

The optimal LP solution (376/193, 950/193)⊤ is a long way from the optimal integer
solution (5, 0)⊤ , as we see in Figure 3.2b.

Figure 3.3: Rounding the solution of the LP; the LP optimum (376/193, 950/193)⊤ lies far from the integer optimum (5, 0)⊤.

3.) Especially when considering boolean integer problems rounding will not be helpful.
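The gap in Example 3.4 is easy to check numerically. The sketch below (not part of the original notes, assuming SciPy is available) solves the LP relaxation and compares it with a brute-force search over the integer points:

    from itertools import product
    from scipy.optimize import linprog

    # LP relaxation of Example 3.4; linprog minimizes, so the objective is negated,
    # and 3*x1 - 2*x2 >= -4 is rewritten as -3*x1 + 2*x2 <= 4.
    res = linprog(c=[-1, -0.64], A_ub=[[50, 31], [-3, 2]], b_ub=[250, 4],
                  bounds=[(0, None), (0, None)], method="highs")
    print(res.x)   # approx. [1.948, 4.922], i.e. (376/193, 950/193)

    # Brute force over a box containing all feasible integer points:
    # 50*x1 + 31*x2 <= 250 forces x1 <= 5 and x2 <= 8.
    feasible = [(x1, x2) for x1, x2 in product(range(6), range(9))
                if 50*x1 + 31*x2 <= 250 and 3*x1 - 2*x2 >= -4]
    print(max(feasible, key=lambda p: p[0] + 0.64*p[1]))   # -> (5, 0)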

3.4 Solving Integer Knapsack Problems by Divide and Conquer


& Dynamic Programming
In this section we will discuss two related approaches by which some IP problems can
be solved by recursively reducing the problem dimension. These approaches are called
divide and conquer & dynamic programming (DP for short). We will see that
knapsack problems with integer coefficients are generally well solved via dynamic
programming. Indeed, DP is an effective approach for knapsack problems if the size of
the data is restricted.

What is meant by recursively reducing the problem dimension?


Consider the 0-1 knapsack problem. That is, think of a knapsack of volume b and a choice
of n different items, say 1, . . . , n, that we can pack into the knapsack. We have only one of
each item and can either choose to pack it or not. aj and cj shall respectively denote the
volume and the value of item j. Our goal is to maximize the total value of the contents
of the knapsack.


Suppose that item n fits into the knapsack, i. e. that an ≤ b. If we pack item n, then the
remaining space which is left in the knapsack is b − an and we have one less item to choose
from when wanting to pack more into the knapsack. This leads to solving the smaller
problem of maximizing the value of a knapsack of volume b − an when choosing from the
n − 1 items 1, . . . , n − 1. Let us call this problem A and its optimal objective value x∗A .
We shall also solve the smaller problem B, which corresponds to the case that item n is
not packed into the knapsack. So, here, we want to maximize the value of a knapsack
of volume b by choosing from the n − 1 items 1, . . . , n − 1. Let us denote the optimal
objective value of B by x∗B .
If x∗A +cn > x∗B then we know that we should pack item n. If x∗A +cn = x∗B then there is an
optimal solution with and without item n. If x∗A +cn < x∗B then we should not pack item n.

The above-described approach is known as a divide and conquer approach: One divides
the problem into smaller subproblems, conquers the subproblems recursively and combines
the solutions to give a solution to the original problem.
Dynamic programming is a related technique, which is more efficient when solving
problems with overlapping subproblems. In dynamic programming one starts with solving
the smallest subproblems and then recursively finds solutions for the bigger problems.
The benefit of DP is that each subproblem is solved only once. In DP the result of each
subproblem is stored in a table (generally implemented as an array or a hash table) for
future reference. These subsolutions may be used to obtain the original solution, and the
technique of storing the subproblem solutions is known as memoization.

3.4.1 0-1 Knapsack Problems Let n ∈ N and a_j, b ∈ Z≥0 for j ∈ {1, . . . , n}. Consider
the 0-1 knapsack problem

(P) max ∑_{j=1}^n c_j x_j
    s.t. ∑_{j=1}^n a_j x_j ≤ b,
         x ∈ {0, 1}^n.

For λ ∈ {0, . . . , b} and r ∈ {1, . . . , n} we consider the subproblems

(P_r(λ)) f_r(λ) = max ∑_{j=1}^r c_j x_j
         s.t. ∑_{j=1}^r a_j x_j ≤ λ,
              x ∈ {0, 1}^r.

Then z = f_n(b) gives us the optimal value of the knapsack problem, and furthermore,
the problems (P_r(λ)) are all knapsack problems of the same type but of smaller size.

To arrive at a recursive relationship between the values fr (λ), we let x∗ (λ) = (x∗1 (λ), . . . , x∗r (λ))⊤
denote an optimal solution of (Pr (λ)) and distinguish two cases:
• If x∗r (λ) = 0, then the choice of the remaining variables x∗1 (λ), . . . , x∗r−1 (λ) must be
optimal for the problem (Pr−1 (λ)). In other words, we then have

fr (λ) = fr−1 (λ).

• If x∗r (λ) = 1, then necessarily λ ≥ ar and the choice of the remaining variables
x∗1 (λ), . . . , x∗r−1 (λ) must be optimal for (Pr−1 (λ − ar )), and hence,

fr (λ) = cr + fr−1 (λ − ar ).

Note that if λ < a_r, then necessarily x*_r(λ) = 0 and hence f_r(λ) = f_{r−1}(λ). However, if
λ ≥ a_r, then both x*_r(λ) = 0 and x*_r(λ) = 1 are possible. In this case, since we are
looking for the maximum objective value, we simply compare the two function values
above and pick the one that produces the larger value; this larger value is f_r(λ).

Thus, we arrive at the recursion


f_r(λ) = f_{r−1}(λ), if λ < a_r;
f_r(λ) = max{ f_{r−1}(λ), c_r + f_{r−1}(λ − a_r) }, if λ ≥ a_r.    (3.1)

A divide and conquer approach

Example 3.5. Consider the 0 − 1 knapsack problem

max 10x1 + 7x2 + 25x3 + 24x4


s. t. 2x1 + x2 + 6x3 + 5x4 ≤ 7,
x ∈ {0, 1}4 .

We want to determine f4 (7). Using the recursion (3.1), we know that

f4 (7) = max{f3 (7), 24 + f3 (2)}.

Thus, we need to solve the two subproblems P3 (7) and P3 (2), which have a number of
subproblems themselves. We solve these recursively by using (3.1):


f_3(7) = max{f_2(7), 25 + f_2(1)},      f_3(2) = f_2(2),
f_2(7) = max{f_1(7), 7 + f_1(6)},       f_2(2) = max{f_1(2), 7 + f_1(1)},
f_1(7) = 10,  f_1(6) = 10,              f_1(2) = 10,  f_1(1) = 0,
f_2(1) = max{f_1(1), 7 + f_1(0)},
f_1(1) = 0,   f_1(0) = 0.

The subproblem-structure is visualized by a binary tree in Figure 3.4.

Figure 3.4: Binary decision tree of subproblems for the IP of Example 3.5; a labeled edge means that the corresponding item is packed, and its label is the value c_r gained.

We can determine the entries in the tree in the following way. It is not difficult to
determine the entries in the bottom row. For each entry in the second to last row we
consider its two children. We add the value on the edge to the value in the node of each
child (no value on the edge means 0), and compare which of the two children gives a larger
sum. The larger value is the one that we insert into the parental node.

Figure 3.5: The binary decision tree of Figure 3.4 with computed values; the root carries the optimal value 34, and the path leading to it is shown in bold.

The outcome of this procedure is shown in Figure 3.5. We can conclude that the optimal
objective value is 34. How can we determine the associated optimal solution? For this we


need to recall which route we have taken to arrive at the optimal objective value of 34.
In Figure 3.5 this route is displayed in bold. Starting from the root of the tree, we see
that along the bold path we first take a labeled edge. This means x∗4 = 1. Then we take
an unlabeled edge, implying x∗3 = 0. The next edge likewise is unlabeled, yielding x∗2 = 0.
For determining x∗1 we check whether the entry in the bottom node of the bold path is
zero or non-zero. As it is non-zero in our case, we conclude that x∗1 = 1.
Thus an optimal solution is x∗ = (1, 0, 0, 1)⊤ with objective value 34.

In the above example we have considered and solved the subproblem f_1(1) twice. In this
simple example this does not matter too much. However, when considering problems
with many more variables this phenomenon can occur often and should be avoided. Note
also that with n items the tree can have more than 2^{n−1} − 1 nodes in the bottom row
(compare with Section 3.1.3), so the tree size, and hence the computational complexity
of the algorithm, grows exponentially in n.
We now turn to dynamic programming, which, as mentioned before, avoids repeatedly
solving the same subproblems.

Dynamic Programming

To initialize the recursion, we set the obvious boundary values

fr (0) = 0, r ∈ {1, . . . , n}
f0 (λ) = 0, λ ∈ {0, . . . , b}.

And for any other r and λ, we have the following recursions:


f_r(λ) = f_{r−1}(λ), if λ < a_r;
f_r(λ) = max{ f_{r−1}(λ), c_r + f_{r−1}(λ − a_r) }, if λ ≥ a_r.

We use the recursion to successively calculate f1 , f2 , f3 , . . . , fn for all integral values of


λ ∈ {0, . . . , b}.
Once we have all these values, the question is how to find an associated optimal solution.
This could be done by iterating back from the optimal value fn (b):
• If fn (b) = fn−1 (b), we set x∗n = 0 and continue by checking fn−1 (b).
• If fn (b) = cn +fn−1 (b−an ), we set x∗n = 1 and then continue by checking fn−1 (b−an ).
The computation of the backtracking procedure can be avoided altogether by keeping
track of the indicator variables

p_r(λ) = 0 if f_r(λ) = f_{r−1}(λ), and p_r(λ) = 1 if f_r(λ) = c_r + f_{r−1}(λ − a_r).

Thus, the above procedure becomes


• If pn (b) = 0, we set x∗n = 0 and continue by checking the value pn−1 (b).
• If pn (b) = 1, we set x∗n = 1 and then continue by checking the value pn−1 (b − an ).
Therefore, the DP algorithm for 0-1 knapsack problem can be stated as follows:

Algorithm 3.6 (DP for 0-1 Knapsack with integer coefficients (maximization))

1.) Initialisation: Set

fr (0) = 0 for r ∈ {1, . . . , n},


f0 (λ) = 0 for λ ∈ {0, . . . , b}.

2.) Forward Recursion: For r ∈ {1, 2, . . . , n}, repeat


(a) For λ ∈ {0, . . . , ar − 1} set

fr (λ) = fr−1 (λ),


pr (λ) = 0.

(b) For λ ∈ {a_r, . . . , b} set

f_r(λ) = max{ f_{r−1}(λ), c_r + f_{r−1}(λ − a_r) },
p_r(λ) = 0 if f_r(λ) = f_{r−1}(λ), and p_r(λ) = 1 if f_r(λ) = c_r + f_{r−1}(λ − a_r).

If f_{r−1}(λ) = c_r + f_{r−1}(λ − a_r), then p_r(λ) is not well-defined. In this case
we may choose whether to take p_r(λ) to be 0 or 1. The different choices
will lead to different optimal solutions (both with the same objective value).
3.) Backward Recursion:
(a) Set λ = b, r = n, x = 0.
(b) While r ≥ 1:
    if p_r(λ) = 1, set x_r = 1 and λ ← λ − a_r.
    Then set r ← r − 1 and repeat step (b).
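A minimal Python sketch of Algorithm 3.6 (not part of the original notes; it assumes integer data and breaks ties by taking p_r(λ) = 0):

    def knapsack_01(c, a, b):
        """DP for the 0-1 knapsack problem; returns f_n(b) and an optimal x."""
        n = len(c)
        # f[r][lam] = f_r(lam) with boundary row f[0][lam] = 0; p[r][lam] = p_r(lam).
        f = [[0] * (b + 1) for _ in range(n + 1)]
        p = [[0] * (b + 1) for _ in range(n + 1)]
        for r in range(1, n + 1):                      # forward recursion
            for lam in range(b + 1):
                f[r][lam] = f[r - 1][lam]
                if lam >= a[r - 1] and c[r - 1] + f[r - 1][lam - a[r - 1]] > f[r][lam]:
                    f[r][lam] = c[r - 1] + f[r - 1][lam - a[r - 1]]
                    p[r][lam] = 1
        x, lam = [0] * n, b                            # backward recursion
        for r in range(n, 0, -1):
            if p[r][lam] == 1:
                x[r - 1] = 1
                lam -= a[r - 1]
        return f[n][b], x

    print(knapsack_01([10, 7, 25, 24], [2, 1, 6, 5], 7))   # -> (34, [1, 0, 0, 1])

The instance in the last line is the one from Example 3.5; the sketch reproduces the optimal value 34 and the solution (1, 0, 0, 1)⊤.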

Counting the number of calculations required to arrive at f_n(b), we see that each value
f_r(λ) for λ ∈ {0, 1, . . . , b} and r ∈ {1, . . . , n} requires only a constant number of
additions, subtractions and comparisons. Calculating the optimal solution requires at
most the same amount of work. Thus, the DP algorithm for 0-1 knapsack problems is
O(nb); here O denotes the Landau big-O notation, i. e. f(n) = O(nb) if and only if there
exist constants c, C such that c ≤ f(n)/(nb) ≤ C for all n ∈ N.


Example 3.7. Let us apply the above algorithm to the 0 − 1 knapsack problem

max 10x1 + 7x2 + 25x3 + 24x4


s.t. 2x1 + x2 + 6x3 + 5x4 ≤ 7,
x ∈ {0, 1}4 .

• Starting the forward recursion, f1 (1) = f0 (1) = 0, p1 (1) = 0,


• f1 (2) = max{f0 (2), c1 + f0 (2 − a1 )} = max{0, 10 + 0} = 10, p1 (2) = 1,
• and likewise f1 (3) = · · · = f1 (7) = 10, p1 (3) = · · · = p1 (7) = 1.
• Next, f2 (1) = max{f1 (1), c2 + f1 (1 − a2 )} = max{0, 7 + 0} = 7, p2 (1) = 1,
• f2 (2) = max{f1 (2), c2 + f1 (2 − a2 )} = max{10, 7 + 0} = 10, p2 (2) = 0,
• f2 (3) = max{f1 (3), c2 + f1 (3 − a2 )} = max{10, 7 + 10} = 17, p2 (3) = 1,
• and likewise, f2 (4) = · · · = f2 (7) = 17, p2 (4) = · · · = p2 (7) = 1,
• Continuing this way, we find the values

f1 f2 f3 f4 p1 p2 p3 p4
λ=0 0 0 0 0 0 0 0 0
1 0 7 7 7 0 1 0 0
2 10 10 10 10 1 0 0 0
3 10 17 17 17 1 1 0 0
4 10 17 17 17 1 1 0 0
5 10 17 17 24 1 1 0 1
6 10 17 25 31 1 1 1 1
7 10 17 32 34 1 1 1 1

Back-tracking is easy, using the pr (λ),


• p4 (7) = 1, thus x∗4 = 1.
• Therefore, we next have to check p3 (7 − a4 ) = p3 (2) = 0. Thus x∗3 = 0.
• We check p2 (2) = 0, showing that x∗2 = 0,
• and finally, p1 (2) = 1, so x∗1 = 1.
Thus, x∗ = (1, 0, 0, 1)⊤ is an optimal solution.

Example 3.8. Solve the following 0-1 knapsack problem by dynamic programming

max 7x1 + 8x2 + 3x3


s.t. x1 + 2x2 + x3 ≤ 3,
x1 , x2 , x3 ∈ {0, 1}.


3.4.2 Solving General Integer Knapsack Problems by DP We now extend the
DP method to integer knapsack problems

max ∑_{j=1}^n c_j x_j
s.t. ∑_{j=1}^n a_j x_j ≤ b,
     x ∈ Z^n_{≥0},

where the coefficients b and a_j for j ∈ {1, . . . , n} are positive integers.

We can again vary the right-hand side λ ∈ {0, . . . , b} and the number of variables r ∈
{0, . . . , n} and consider all of the following subproblems:

(P_r(λ)) g_r(λ) := max ∑_{j=1}^r c_j x_j
         s.t. ∑_{j=1}^r a_j x_j ≤ λ,
              x ∈ Z^r_{≥0}.

Clearly, our original knapsack problem is (Pn (b)), so that its optimal value is given by
gn (b).
The following boundary (starting) values are obvious,

gr (0) = 0, for r ∈ {0, . . . , n}


g0 (λ) = 0, for λ ∈ {0, . . . , b}.

If x*(λ) is an optimal solution to (P_r(λ)) giving value g_r(λ), then we consider the value of
x*_r(λ), which can be any integer in {0, . . . , ⌊λ/a_r⌋}. If x*_r(λ) = t, then necessarily
t a_r ≤ λ. Using the principle of optimality, we have that

g_r(λ) = c_r t + g_{r−1}(λ − t a_r).

Since x*_r(λ) = t can only occur if t a_r ≤ λ, or equivalently, if t ∈ {0, 1, . . . , ⌊λ/a_r⌋},
we find the recursion formula

g_r(λ) = max{ t c_r + g_{r−1}(λ − t a_r) | t ∈ {0, . . . , ⌊λ/a_r⌋} }.


This is the recursion to use for the divide and conquer approach that we introduced for
0-1 knapsack problems. Note that in the setting of integer knapsack problems the tree
might be much larger than in the 0-1 knapsack setting: a node with entry g_r(λ) has
⌊λ/a_r⌋ + 1 children. When considering a dynamic programming approach starting with
the above recursion, note the following. As ⌊λ/a_r⌋ = b in the worst case, the above recursion
gives an algorithm of complexity O(nb²). Can we do better?
Observe the following.
• If x∗r = 0, then the vector (x∗1 , . . . , x∗r−1 )⊤ must be optimal for the problem (Pr−1 (λ)),
so that gr (λ) = gr−1 (λ).
• If x∗r ≥ 1, then necessarily ar ≤ x∗r ar ≤ λ and the vector (x∗1 , . . . , x∗r−1 , x∗r − 1)⊤ must
be optimal for (Pr (λ − ar )), so that gr (λ) = cr + gr (λ − ar ).
Therefore, we arrive at the recursion
g_r(λ) = g_{r−1}(λ), if a_r > λ;
g_r(λ) = max{ g_{r−1}(λ), c_r + g_r(λ − a_r) }, if a_r ≤ λ.

We set the indicator variables

p_r(λ) = 0 if g_r(λ) = g_{r−1}(λ), and p_r(λ) = 1 if g_r(λ) = c_r + g_r(λ − a_r),

and note that p_r(λ) need not be well-defined, as both cases may hold simultaneously.

Algorithm 3.9 (DP for Integer Knapsack)

1.) Initialisation: Set

gr (0) = 0, for r ∈ {1, . . . , n},


g0 (λ) = 0, for λ ∈ {0, . . . , b}.

2.) Forward recursion: For r ∈ {1, . . . , n} repeat


(a) For λ ∈ {0, . . . , ar − 1} set

gr (λ) = gr−1 (λ)


pr (λ) = 0.

(b) For λ ∈ {a_r, . . . , b} set

g_r(λ) = max{ g_{r−1}(λ), c_r + g_r(λ − a_r) },
p_r(λ) = 0 if g_r(λ) = g_{r−1}(λ), and p_r(λ) = 1 if g_r(λ) = c_r + g_r(λ − a_r).


If gr−1 (λ) = cr + gr (λ − ar ) then pr (λ) is not well-defined. In this case we


may choose whether to take pr (λ) to be 0 or 1. The different choices will
lead to different optimal solutions (both with the same objective value).
3.) Backward Recursion:
(a) Set λ = b, r = n, x = 0.
(b) While r ≥ 1, repeat:
    if p_r(λ) = 0, set r ← r − 1;
    elseif p_r(λ) = 1, set x_r ← x_r + 1 and λ ← λ − a_r.
To back-track, we thus start by checking p_n(b):
• If p_n(b) = 1, then x*_n ≥ 1 and we must check p_n(b − a_n) to see whether x*_n ≥ 2, etc.
• If p_n(b) = 0, then x*_n = 0 and we must check p_{n−1}(b) to see whether x*_{n−1} ≥ 1, etc.

This now gives an algorithm of complexity O(nb), which is the same as that of the 0-1
knapsack problem.
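A minimal Python sketch of Algorithm 3.9 (not part of the original notes; integer data, ties broken by taking p_r(λ) = 0):

    def knapsack_integer(c, a, b):
        """DP for the integer knapsack problem via the O(nb) recursion
        g_r(lam) = max{g_{r-1}(lam), c_r + g_r(lam - a_r)}."""
        n = len(c)
        g = [[0] * (b + 1) for _ in range(n + 1)]
        p = [[0] * (b + 1) for _ in range(n + 1)]
        for r in range(1, n + 1):                      # forward recursion
            for lam in range(b + 1):
                g[r][lam] = g[r - 1][lam]
                if lam >= a[r - 1] and c[r - 1] + g[r][lam - a[r - 1]] > g[r][lam]:
                    g[r][lam] = c[r - 1] + g[r][lam - a[r - 1]]
                    p[r][lam] = 1
        x, lam, r = [0] * n, b, n                      # backward recursion
        while r >= 1:
            if p[r][lam] == 1:
                x[r - 1] += 1
                lam -= a[r - 1]
            else:
                r -= 1
        return g[n][b], x

    print(knapsack_integer([7, 9, 2, 15], [3, 4, 1, 7], 10))   # -> (23, [2, 1, 0, 0])

The last line is the instance of Example 3.10 below.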

Example 3.10. Consider the integer knapsack problem

max 7x1 + 9x2 + 2x3 + 15x4


s.t. 3x1 + 4x2 + x3 + 7x4 ≤ 10,
x ∈ Z4≥0 .

Applying Algorithm 3.9, we find the following table of values for gr (λ) and pr (λ) :

g1 g2 g3 g4 p1 p2 p3 p4
λ=0 0 0 0 0 0 0 0 0
1 0 0 2 2 0 0 1 0
2 0 0 4 4 0 0 1 0
3 7 7 7 7 1 0 0 0
4 7 9 9 9 1 1 0/1 0
5 7 9 11 11 1 1 1 0
6 14 14 14 14 1 0 0 0
7 14 16 16 16 1 1 0/1 0
8 14 18 18 18 1 1 0/1 0
9 21 21 21 21 1 0 0 0
10 21 23 23 23 1 1 0/1 0

Back-tracking:
• p4 (10) = 0, so x∗4 = 0.
• p3 (10) = 0, so x∗3 = 0.
• p2 (10) = 1, so x∗2 ≥ 1.
• p2 (10 − a2 ) = p2 (6) = 0, so x∗2 ̸≥ 2, and hence, x∗2 = 1.


• p1 (6) = 1, so x∗1 ≥ 1.
• p1 (6 − a1 ) = p1 (3) = 1, so x∗1 ≥ 2.
• p1 (3 − a1 ) = p1 (0) = 0, so x∗1 ̸≥ 3, and hence, x∗1 = 2.
We have found that x∗ = (2, 1, 0, 0)⊤ is an optimal solution.

Example 3.11. Solve the following integer programming problem by dynamic program-
ming

max 11x1 + 7x2 + 5x3


s. t. x1 + 2x2 + x3 ≤ 5,
x1 , x2 , x3 ≥ 0 and integer.

4 – Modeling Part II:


Fundamental Mixed Integer Programs

4.1 Uncapacitated Facility Location (UFL)


• Given is a set of potential depots N = {1, ..., n}, and a set of clients M = {1, ..., m}
who make orders.
• A fixed cost fj is incurred if depot j is in use.
• Each client makes orders. If all of client i's orders are delivered from depot j, the
transportation cost is c_ij. If only a portion x ∈ [0, 1] of client i's order is delivered
from depot j, then proportional transportation costs of x·c_ij are incurred.
• Decide which depots to open and what proportion of the demand of client i to satisfy
from depot j so as to minimize total costs (fixed and transportation costs).
Note: This problem is similar to the covering problem, except for the addition of the
variable transportation costs.
Definition of variables: • xij is the fraction of the demand of client i satisfied from
depot j.
• For each j ∈ N , let yj = 1 if depot j is used, and yj = 0 otherwise.
Objective: The aim is to minimize the total cost
min ∑_{i∈M} ∑_{j∈N} c_ij x_ij + ∑_{j∈N} f_j y_j.

Constraints: • Satisfaction of the demand of client i:


∑_{j=1}^n x_ij = 1, for i ∈ {1, . . . , m}.


• y_j = 1 if and only if there exists i ∈ M such that x_ij > 0. Instead of putting
this equivalence into a constraint, it suffices to require that y_j ≥ x_ij
for all i ∈ {1, . . . , m}, since we are minimizing over y_j (hence if x_ij = 0 for all
i ∈ {1, . . . , m}, then although y_j = 1 satisfies the constraint y_j ≥ x_ij for all
i ∈ {1, . . . , m}, y_j = 1 will not belong to an optimal solution unless f_j = 0).
With the same arguments we see that indeed it suffices to require

∑_{i=1}^m x_ij ≤ m y_j, for j ∈ {1, . . . , n}.

Indeed, we should require 0 ≤ x_ij ≤ 1 for all i, j. However, note that the constraints
x_ij ≤ 1 are superfluous, as they are implied by 0 ≤ x_ij and ∑_{j=1}^n x_ij = 1 for all
i ∈ {1, . . . , m}.
So, UFL can be modeled as

min_{x,y} ∑_{i∈M} ∑_{j∈N} c_ij x_ij + ∑_{j∈N} f_j y_j
s.t. ∑_{j=1}^n x_ij = 1, for i ∈ {1, . . . , m},
     ∑_{i=1}^m x_ij ≤ m y_j, for j ∈ {1, . . . , n},
     x ≥ 0, y_j ∈ {0, 1} for all j ∈ {1, . . . , n}.

Notice, that yj , xij for j ∈ {1, . . . , n}, i ∈ {1, . . . , m} are the decision variables for this
problem. The xij are allowed to take any value in [0, 1] while yj are binary. Thus, this
problem is an example of a mixed integer program.
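Observe that once the binary vector y is fixed, the problem decouples by client: the costs are linear and each client's demand must be fully covered (∑_j x_ij = 1), so every client is served entirely from its cheapest open depot. This yields a simple brute-force solver, exponential in n; a Python sketch with illustrative data (not from the notes):

    from itertools import combinations

    f = [4, 3, 4]                        # fixed opening costs of the depots
    c = [[1, 2, 3],                      # c[i][j]: cost of serving client i from depot j
         [3, 1, 2],
         [2, 2, 1],
         [3, 3, 2]]

    best, best_T = None, None
    for r in range(1, len(f) + 1):
        for T in combinations(range(len(f)), r):        # depots to open
            # With y fixed, each client is served fully from its cheapest open depot.
            total = (sum(f[j] for j in T)
                     + sum(min(c[i][j] for j in T) for i in range(len(c))))
            if best is None or total < best:
                best, best_T = total, T
    print(best_T, best)                                 # -> (1,) with cost 11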

4.2 Uncapacitated Lot-Sizing (ULS)


A factory produces a single type of product over a horizon of n periods, say 1, . . . , n. Over
time, demand varies, and there are certain production and storage costs:
• ft > 0 is the fixed cost of producing in period t.
• pt is the unit production cost in period t.
• Earlier production can be kept in storage for later sale at a unit storage cost of ht
during period t.
• Initially the storehouse is empty.
• The client demand in period t is dt .
How to plan production so as to satisfy all demands and minimize the total costs?
Definition of variables: • xt is the amount produced in period t (allowed to be
fractional).


Figure 4.1: Period t (incoming stock s_{t−1}, production x_t, demand d_t, outgoing stock s_t).

• yt = 1 if production occurs in period t, and yt = 0 otherwise.


• st is the stock at the end of period t.
Objective function: The aim is to minimize the following objective function:
∑_{t=1}^n (p_t x_t + f_t y_t + h_t s_t).

Constraints: This leaves us with modeling the constraints.


• In the beginning the storehouse is empty, so s0 = 0.
• The stock at the end of period t is the stock at the end of period t − 1 plus
production in period t minus demand of period t, i. e. st = st−1 + xt − dt
• All client demand in every period shall be satisfied, so st ≥ 0 for all t.
Here, it is more challenging to model the constraint 'y_t = 1 if production occurs
in period t and y_t = 0 otherwise', as no upper bound is given on x_t. One way of
resolving this is by using "a very large value M"; then the constraint becomes
x_t ≤ M y_t. Alternatively, one can calculate an upper bound based on the problem
data. For example, if we impose that s_n = 0, then we can tighten the variable upper
bound constraint to

x_t ≤ (∑_{ℓ=1}^n d_ℓ) y_t.

We arrive at the following mixed integer programming model for ULS:

min_{x,y,s} ∑_{t=1}^n p_t x_t + ∑_{t=1}^n f_t y_t + ∑_{t=1}^n h_t s_t
s.t. s_{t−1} + x_t = d_t + s_t, t ∈ {1, . . . , n},
     x_t ≤ M y_t, t ∈ {1, . . . , n},
     x_t ≥ 0, t ∈ {1, . . . , n},
     y_t ∈ {0, 1}, t ∈ {1, . . . , n},
     s_0 = 0, s_t ≥ 0, t ∈ {1, . . . , n}.


With the additional assumption that s_n = 0 we have the alternative formulation:

min_{x,y,s} ∑_{t=1}^n p_t x_t + ∑_{t=1}^n f_t y_t + ∑_{t=1}^n h_t s_t
s.t. s_{t−1} + x_t = d_t + s_t, t ∈ {1, . . . , n},
     x_t ≤ (∑_{ℓ=1}^n d_ℓ) y_t, t ∈ {1, . . . , n},
     x_t ≥ 0, t ∈ {1, . . . , n},
     y_t ∈ {0, 1}, t ∈ {1, . . . , n},
     s_0 = 0, s_t ≥ 0, t ∈ {1, . . . , n}.
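As with UFL, fixing the binary vector y makes the remaining problem easy: if production is allowed exactly in the periods t with y_t = 1, then each unit of demand d_t is best produced in the cheapest allowed period k ≤ t, at unit cost p_k + h_k + ... + h_{t−1}. A brute-force Python sketch over all y with illustrative data (not from the notes):

    from itertools import product

    d = [2, 4, 5, 1]          # demands
    p = [3, 3, 3, 3]          # unit production costs
    h = [1, 2, 1, 1]          # unit storage costs
    f = [10, 12, 10, 10]      # fixed production costs

    def unit_cost(k, t):      # produce in period k, sell in period t >= k
        return p[k] + sum(h[k:t])

    best, best_y = None, None
    for y in product([0, 1], repeat=len(d)):
        open_periods = [k for k, yk in enumerate(y) if yk]
        try:
            total = sum(f[k] for k in open_periods) + sum(
                d[t] * min(unit_cost(k, t) for k in open_periods if k <= t)
                for t in range(len(d)))
        except ValueError:    # some demand has no allowed production period k <= t
            continue
        if best is None or total < best:
            best, best_y = total, y
    print(best_y, best)       # -> (1, 0, 1, 0) with cost 61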

4.3 Alternatives
4.3.1 Discrete Alternatives (disjunctions)

Example 4.1. Suppose that two jobs must be processed on the same machine and cannot
be processed simultaneously. If pi (i = 1, 2) are the processing times, and the variables ti
the start times for i = 1, 2, then either job 1 precedes job 2 and so t2 ≥ t1 + p1 , or job 2
comes first and t1 ≥ t2 + p2 . Note that these two conditions are mutually exclusive, i. e.
exactly one of them is true, if pi > 0.

The above example demonstrates that in scheduling and other applications, problems of
the following type occur:

minx∈Rn c⊤ x
s.t. 0 ≤ x ≤ u and
either w⊤ x ≤ b1 (4.1)
or v ⊤ x ≤ b2 . (4.2)

Here, ’either or’ is to be understood as follows. We impose exactly one of the conditions
(4.1) and (4.2). If we impose Condition (4.1) then we do not impose (4.2) and vice
versa. In other words we require that at least one of the conditions (4.1) and (4.2) must
hold. (Notice the difference between ’a condition is imposed’ and ’a condition holds’: If a
condition is imposed, then it must hold. If a condition is not imposed it may still hold.)
c, u, w, v are given vectors and b_1, b_2 two real numbers. (In Example 4.1, x = (t_1, t_2)⊤,
w⊤ = (1, −1), b_1 = −p_1, v⊤ = (−1, 1), b_2 = −p_2.)

Remark 4.2. Note that we can alternatively state the above problem in the following
way:

minx∈Rn c⊤ x
s.t. x ∈ P1 ∪ P2 ,

where

P1 = {x ∈ Rn | w⊤ x ≤ b1 , 0 ≤ x ≤ u}
P2 = {x ∈ Rn | v ⊤ x ≤ b2 , 0 ≤ x ≤ u}.


We can model the problem as an integer program:


• Introduce extra variables which model which of the two constraints (4.1) and (4.2)
is imposed.
y_1 = 1 if (4.1) is imposed, and y_1 = 0 if (4.1) is not imposed;
y_2 = 1 if (4.2) is imposed, and y_2 = 0 if (4.2) is not imposed.

Recall that if a condition is not imposed, it may still hold, but when it is imposed,
it must hold. Since exactly one of the two conditions must be imposed, we have
y1 + y2 = 1.
• Let

M ≥ max_{0 ≤ x ≤ u} max{ w⊤x − b_1, v⊤x − b_2 }.

Then the problem is modeled as


minx∈Rn c⊤ x
s.t. 0 ≤ x ≤ u,
w⊤ x − b1 ≤ M (1 − y1 ),
v ⊤ x − b2 ≤ M (1 − y2 ),
y1 + y2 = 1,
y1 , y2 ∈ {0, 1}.
If y1 = 1 then x satisfies w⊤ x ≤ b1 while v ⊤ x ≤ b2 is inactive, and conversely if y2 = 1.
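One can convince oneself of this equivalence numerically. The sketch below (not part of the original notes, with illustrative processing times) checks, over an integer grid of start times for Example 4.1, that the big-M model admits a feasible choice of y exactly when the original disjunction holds:

    # Illustrative data: processing times p1, p2 and a horizon bound u on the start times.
    p1, p2, u = 2, 3, 10
    # Valid big-M on the box 0 <= t <= u: M >= max(t1 - t2 + p1, t2 - t1 + p2).
    M = u + max(p1, p2)

    for t1 in range(u + 1):
        for t2 in range(u + 1):
            disjunction = (t2 >= t1 + p1) or (t1 >= t2 + p2)
            big_m = any(t1 - t2 + p1 <= M * (1 - y1) and t2 - t1 + p2 <= M * y1
                        for y1 in (0, 1))
            assert disjunction == big_m
    print("big-M model agrees with the disjunction on the whole grid")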

4.3.2 More General Alternative Constraints Consider a situation with the alter-
native constraints:
f1 (x1 , . . . , xn ) ≤ b1 , f2 (x1 , . . . , xn ) ≤ b2 ,
where x = (x1 , ..., xn ) ∈ C, with a bounded region C in Rn and f1 (x1 , . . . , xn ), f2 (x1 , . . . , xn ),
b1 , b2 ∈ R. In this situation, at least one (but not necessarily both) of these constraints
must be satisfied. By introducing certain binary variables, we have seen that this restric-
tion can be modelled as follows:
f1 (x1 , . . . , xn ) − b1 ≤ M y1 ,
f2 (x1 , . . . , xn ) − b2 ≤ M y2 ,
y1 + y2 = 1,
y1 , y2 are binary.
The constant M is chosen as

M ≥ max_{x∈C} { f_1(x_1, . . . , x_n) − b_1, f_2(x_1, . . . , x_n) − b_2 }.


The multiple-choice constraint y1 + y2 = 1 implies that exactly one variable yj equals 0, so


that, as required, exactly one constraint is imposed and at least one constraint is satisfied.
We can save one integer variable in this formulation by noting that the multiple-choice
constraint can be replaced by y2 = 1 − y1 . The resulting formulation is given by:
f1 (x1 , . . . , xn ) − b1 ≤ M y1 ,
f2 (x1 , . . . , xn ) − b2 ≤ M (1 − y1 ),
y1 = 0 or 1.

Conditional Constraints

Consider constraints of the form:


f1 (x1 , x2 , . . . , xn ) > b1 implies f2 (x1 , x2 , . . . , xn ) ≤ b2 .
Note that this implication is not satisfied only when both
f1 (x1 , x2 , . . . , xn ) > b1 and f2 (x1 , x2 , . . . , xn ) > b2
hold true. Thus, the conditional constraint is logically equivalent to the alternative con-
straints
f1 (x1 , x2 , . . . , xn ) ≤ b1 or f2 (x1 , x2 , . . . , xn ) ≤ b2 ,
where at least one must be satisfied. Hence, this situation can be modeled by alternative
constraints as indicated above.

k-Fold Alternatives

Suppose that at least k of the following constraints must be satisfied:


fj (x1 , x2 , . . . , xn ) ≤ bj j ∈ {1, 2, . . . , n}.
[For example, these restrictions may correspond to manpower constraints for n potential
inspection systems for quality control in a production process. If management has decided
to adopt at least k inspection systems, then the k constraints specifying the manpower
restrictions for these systems must be satisfied, and the remaining constraints can be
ignored.]
Assuming that Mj for j ∈ {1, 2, . . . , n} are chosen so that the ignored constraints will not
be binding, the general problem can be formulated as follows:
f_j(x_1, x_2, . . . , x_n) − b_j ≤ M_j(1 − y_j), j ∈ {1, 2, . . . , n},
∑_{j=1}^n y_j ≥ k,
y_j ∈ {0, 1}, j ∈ {1, 2, . . . , n}.


That is, yj = 1 if the jth constraint is to be satisfied, and at least k of the constraints
must be satisfied.


5 – Modeling Part III:


Formulations
In Chapters 2 and 4 we derived mathematical models for some well-known optimization
problems. Observe that an optimization problem may have a number of different mathe-
matical formulations that are equivalent from a mathematical point of view. (The feasible
region and the optimal values of the problems are the same.)
Alternative formulations of the same problem do matter when we consider the algorithm
performance. For different formulations, the performance of algorithms can be very dif-
ferent. However, deriving alternative formulations of an IP is usually a nontrivial task.
Many algorithms for solving IPs can in fact be seen as generating better and better
formulations until a trivial formulation is found – one for which the optimal solution is
obvious.
Let us first review some definitions.

5.1 Convex Geometry


Definition 5.1 (Polyhedron, Polytope). A subset of R^n described by a finite set of linear
inequalities

P = {x ∈ R^n | Ax ≤ b}

is called a polyhedron. A polytope is the convex hull of finitely many points x_1, . . . , x_k ∈ R^n:

P = conv(x_1, . . . , x_k) = { ∑_{i=1}^k λ_i x_i | ∑_{i=1}^k λ_i = 1, λ_i ≥ 0 for i ∈ {1, . . . , k} }.

Definition 5.2 (Extreme point). Let C ⊆ Rn be a convex set. A point x ∈ C is an


extreme point of C if x is not a convex combination of two points in C distinct from x.

Theorem 5.3. Let P ⊆ Rn . The following are equivalent


(i) P is a bounded polyhedron.
(ii) P is a polytope.
(iii) P has finitely many extreme points {x1 , . . . , xk } and

P = conv(x1 , ..., xk ).

Definition 5.4 (Formulation). A polyhedron P ⊆ Rn+p is a formulation for a set


S ⊆ Zn × Rp , where n is maximal and p is minimal with the property S ⊆ Zn × Rp , if
and only if
S = P ∩ (Zn × Rp ).


5.2 Formulations for (Mixed) Integer Programs


Let us consider the mixed integer programming problem

(MIP)  max_{x∈Rn , y∈Rp}  c⊤ x + w⊤ y
s. t. Ax + By ≤ b
x ∈ Zn ,

where A and B are matrices, and c, w are two vectors. We can represent the feasible set
by
S = { (x, y) ∈ Rn+p | Ax + By ≤ b, x ∈ Zn }.

If we drop the constraint x ∈ Zn , the set of points that satisfy the remaining constraints
is a polyhedron:
P = { (x, y) ∈ Rn+p | Ax + By ≤ b }.
Clearly, we have
S = P ∩ (Zn × Rp ),
which shows that P is a formulation for S.

Example 5.5. Two different formulations for the set


             
X = { (1, 1)⊤ , (2, 1)⊤ , (3, 1)⊤ , (1, 2)⊤ , (2, 2)⊤ , (3, 2)⊤ , (2, 3)⊤ }

are displayed in Figure 5.1.

Figure 5.1: Two different formulations for X from Example 5.5.


Example 5.6. Consider the IP:

max x + y
s.t. −3x + 2y ≤ 2
x+y ≤5
1≤x≤3
1≤y≤3
x, y ∈ Z

Note that the feasible set X of the IP is


             
{ (1, 1)⊤ , (2, 1)⊤ , (3, 1)⊤ , (1, 2)⊤ , (2, 2)⊤ , (3, 2)⊤ , (2, 3)⊤ }.

Clearly, y ≤ 3 is actually a superfluous constraint. Notice that we obtain the same feasible
set if we add the constraint x − y ≥ −1. So we get the convex hull of all feasible solutions
for the IP, which is described by the following linear system (the integrality constraints become superfluous):

−3x + 2y ≤ 2
x + y ≤ 5
1 ≤ x ≤ 3
1 ≤ y ≤ 3
−x + y ≤ 1.

Example 5.7 (Equivalent formulations for a 0-1 knapsack set). Consider the set of points
             

 1 0 0 0 0 0 0 
             
0 1 1 0 0 0
 ,   ,   ,   ,   ,   , 0 .

X=  0 0 0 1 1 0 0

 
0 1 0 1 0 1 0
 

It is not difficult to check that the three polyhedra below are formulations for X.

P1 = {x ∈ [0, 1]4 : 83x1 + 61x2 + 49x3 + 20x4 ≤ 100},


P2 = {x ∈ [0, 1]4 : 4x1 + 3x2 + 2x3 + x4 ≤ 4},
P3 = {x ∈ [0, 1]4 : 4x1 + 3x2 + 2x3 + x4 ≤ 4,
x1 + x2 + x3 ≤ 1,
x1 + x4 ≤ 1}.

Notes: It can be seen that P3 ⊆ P2 ⊆ P1 , so the formulations are successively tighter.


Tighter formulations of X are better, because algorithms tend to find an optimal solution
faster.


Example 5.8 (Traveling Salesman). Recall the binary programming model of the trav-
eling salesman problem:
min_x  Σ_{i=1}^n Σ_{j=1}^n cij xij
s. t.  Σ_{j: j≠i} xij = 1,   i = 1, ..., n,
       Σ_{i: i≠j} xij = 1,   j = 1, ..., n,
       Σ_{i∈S} Σ_{j∉S} xij ≥ 1,   ∀S ⊊ N, S ≠ ∅,
       xij ∈ {0, 1} for all i, j = 1, ..., n.
The cut-set constraints
Σ_{i∈S} Σ_{j∉S} xij ≥ 1,   ∀S ⊊ N, S ≠ ∅

were introduced to eliminate solutions that contain subtours.


Recall that alternatively, this can be achieved by using subtour elimination constraints
Σ_{i∈S} Σ_{j∈S, j≠i} xij ≤ |S| − 1,   ∀S ⊆ N, 2 ≤ |S| ≤ n − 1,

leading to the alternative model:


min_x  Σ_{i=1}^n Σ_{j=1}^n cij xij
s. t.  Σ_{j: j≠i} xij = 1,   i = 1, ..., n,
       Σ_{i: i≠j} xij = 1,   j = 1, ..., n,
       Σ_{i∈S} Σ_{j∈S, j≠i} xij ≤ |S| − 1,   ∀S ⊆ N, 2 ≤ |S| ≤ n − 1,
       xij ∈ {0, 1} for all i, j = 1, ..., n.
This model is mathematically equivalent to the first one (The set of feasible tours has not
changed, and neither has the objective function) but from an algorithmic point of view
the formulation is very different.

5.3 Quality of Formulations


Geometrically we can see that there must be an infinite number of formulations, so how
can we choose between them? Because conv(X) has the property that
X ⊆ conv(X) ⊆ P


Figure 5.2: Ideal formulation for the set X from Example 5.5.

for all formulations P , whenever X is of finite cardinality, this suggests the following definition.

Definition 5.9. (i) Given a set X ⊆ Zn , and two formulations P1 and P2 for X, P1 is
a better formulation than P2 if P1 ⊆ P2 .
(ii) If X ⊆ Zn is of finite cardinality, the formulation with the property P = conv(X) is called the ideal formulation.

Returning to Example 5.5, we visualize the ideal formulation in Figure 5.2.

Theorem 5.10. When P is an ideal formulation of X ⊆ Zn and X is of finite cardinality,


then the optimal solution of the IP

max{c⊤ x | x ∈ X}

coincides with an extremal optimal solution of the linear programming problem

max{c⊤ x | x ∈ P }

which can for instance be solved by the Simplex method.

Proof. P being an ideal formulation of X ⊆ Zn means by definition that P = conv(X).


From Theorem 5.3 we infer that P has finitely many extreme points, all of which neces-
sarily are elements of X. In other words, the extremal points of P are all integer and the
result follows.

Thus, using an ideal formulation, an integer programming instance can become easy to
solve! However, often, this is only a theoretical solution, because in most cases there
is such an enormous (exponential) number of inequalities needed to describe conv(X),
and there is no simple characterization for them. Further, it might be very tedious to
determine conv(X).


Example 5.11 (Example 5.7 ctd.). It can be checked that P3 = conv(X) and thus, P3
is an ideal formulation.

The algorithms we will develop later in the course are based on approaches that can be
seen as attempting to successively improve the formulation until it is good enough to be
solved to optimality.

II
Optimality, Bounds and
Relaxation

Suppose that we have guessed a feasible solution to an (IP) or (MIP). How can we find
out how good it is? Further, if we are convinced that our solution is good, how can we
determine and prove that it is optimal?

6 – Bounds and Optimality Criteria

6.1 Optimality – Overview


Example 6.1 (Example 3.3 continued). Recall that we considered the following (IP) and
its associated relaxed (LP) in Example 3.3.

(IP ) max 4x1 − x2 (LP ) max 4x1 − x2


s. t. 7x1 − 2x2 ≤ 14 s. t. 7x1 − 2x2 ≤ 14
x2 ≤ 3 x2 ≤ 3
2x1 − 2x2 ≤ 3 2x1 − 2x2 ≤ 3
x1 , x2 ∈ Z≥0 x1 , x2 ≥ 0
In Example 3.3 we used the graphical method to solve (LP) and (IP) and arrived at the
optimal solution (2, 1)⊤ of (IP), see Figure 6.1. Now, suppose that we had not solved
this problem by the graphical method, but guessed the feasible solution x = (2, 1)⊤ of
(IP). How good is this solution? We want to find an answer without using the graphical
method, as this is unavailable for many IPs.
The objective value associated with x is c⊤ x = 7. Thus, we know that the objective value c⊤ x∗ of an optimal solution x∗ of (IP) satisfies

c⊤ x∗ ≥ 7.

We can solve (LP) by the Simplex method, which yields the optimal solution (20/7, 3)⊤ with optimal objective value 59/7. This gives an upper bound for the objective value of



Figure 6.1: Graphical solution of the (IP) and its associated relaxed (LP) from Exam-
ple 3.3, see Figure 3.1.

(IP), namely c⊤ x∗ ≤ 59/7.
This upper bound can even be improved, since we know that the objective value associated
with an integer solution (when the cost-vector c is integer-valued) has to be integer.
Therefore, we can round down and obtain c⊤ x∗ ≤ ⌊59/7⌋ = 8.

Thus, we conclude that c⊤ x∗ ∈ {7, 8} for any optimal solution x∗ of (IP). We do not yet
know if x is optimal but we know that it is not far off from being so (its objective value
is at most 1 less than the optimal one).

Let us address the following question for general IPs: How to prove that a given point x∗ is
optimal? Put differently, we are looking for some optimality conditions that will provide
stopping criteria in an algorithm for IPs. Note that the general strategy for estimating
the quality of a solution of an IP is as demonstrated in Example 6.1.
Consider the problem

(IP ) max{c⊤ x | x ∈ X = P ∩ Zn }

(where P ⊆ Rn is a formulation of X) with unknown optimal objective value z ∗ . The


naive but important strategy for obtaining a stopping criterion is to find a lower bound z ≤ z ∗ and an upper bound z̄ ≥ z ∗ such that z = z ∗ = z̄.
• The gap z̄ − z can be used to estimate the quality of the best found objective value
c⊤ x̂ as an approximation of the unknown value z ∗ .
• If the gap satisfies z̄ − z < ε, where ε is the chosen error tolerance, one may decide to stop the algorithm once a feasible point x̂ is found with objective value c⊤ x̂ ∈ [z, z̄].


• If z̄ = z, then z ∗ = z̄ = z, so any feasible solution with objective value z is optimal. Thus “z̄ = z” provides a certificate of optimality.

Thus, we need to find ways of deriving such upper and lower bounds.

Definition 6.2 (Primal and dual bounds). A primal bound for a maximization problem
(IP) is a lower bound z ≤ z ∗ . A dual bound is an upper bound z ∗ ≤ z̄.

In the above example we have determined an upper and a lower bound for the objective
value and from these bounds concluded how good the guessed feasible solution is. More-
over, we need to investigate how to find a feasible solution. This is in particular crucial
to arrive at a lower bound. We will look into this for general IPs below, and also estimate
the quality of a given feasible solution of a general IP.

6.2 Finding primal bounds


Every feasible solution x ∈ X provides a lower bound c⊤ x ≤ z ∗ . Thus, the mechanism
for finding primal bounds is conceptually simple.
• For some integer problems, finding a feasible solution is easy, and the real question
is how to find a good solution.
• For other IPs, finding a feasible solution can be as difficult as the IP problem itself.

Example 6.3.

z ∗ = max x1 − 2x2
s. t. 2x1 + x2 ≤ 4,
x1 , x2 ≥ 0 and integer.

Clearly, (1, 1)⊤ is a feasible solution, hence −1 is a lower bound on z ∗ ; and (2, 0)⊤ is also
a feasible solution, and hence 2 is a lower bound on z ∗ . It is better than (1, 1)⊤ in the
sense that it provides a better lower bound.

In general, finding a feasible solution is not always as easy as in Example 6.3. Heuristics
are typically used to overcome this problem. One important such heuristic is the greedy
heuristic. Greedy approaches construct a solution from scratch, and at each step choose
the item which brings the ’best’ immediate result. In other words, the idea of a greedy
algorithm is to take the best element and run. It is very shortsighted. It just chooses
one after the other whichever element gives the maximum profit and still gives a feasible
solution.

Example 6.4. We aim to find a feasible solution to the following problem

z ∗ = max 5x1 + 8x2 + 17x3


s. t. 4x1 + 3x2 + 7x3 ≤ 9
x1 , x2 , x3 ∈ {0, 1}.


This is a 0-1 knapsack problem for which a feasible solution can be found using the greedy
heuristic: Notice that
c2 /a2 = 8/3 > c3 /a3 = 17/7 > c1 /a1 = 5/4.
This means that relative to the cost (i. e. constraint restriction) the second variable gives
the largest contribution to the objective value. As the bound on the right-hand side of
the constraint is 9, we may set x2 = 1.
After this, the residual of the right-hand-side is 9 − a2 = 6. So we have to set x3 = 0,
and then we may set x1 = 1. Therefore, by greedy heuristic, we obtain the feasible point
(1, 1, 0)⊤ at which the objective value is 5 + 8 = 13.
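The computation in Example 6.4 is easy to automate. Below is a minimal sketch (not part of the original notes) of the greedy heuristic for a 0-1 knapsack instance; on the data of Example 6.4 it reproduces the feasible point (1, 1, 0)⊤ and the primal bound 13.

```python
def greedy_knapsack_01(c, a, b):
    """Greedy heuristic for max c.x s.t. a.x <= b, x binary.

    Items are scanned in decreasing order of the ratio c_i/a_i and an
    item is taken whenever it still fits. Returns a feasible point and
    the primal (lower) bound it certifies.
    """
    order = sorted(range(len(c)), key=lambda i: c[i] / a[i], reverse=True)
    x, residual, value = [0] * len(c), b, 0
    for i in order:
        if a[i] <= residual:
            x[i], residual, value = 1, residual - a[i], value + c[i]
    return x, value

# Example 6.4: ratios 8/3 > 17/7 > 5/4, so x2 is taken first.
print(greedy_knapsack_01([5, 8, 17], [4, 3, 7], 9))  # ([1, 1, 0], 13)
```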
Example 6.5. Find a lower bound for the optimal objective value of the problem below
by using the greedy heuristic method
z ∗ = max 5.5x1 + 8.2x2 + 13.4x3 + 21.2x4
s.t. 3x1 + 3x2 + 7.1x3 + 10.8x4 ≤ 30.9
xi ≥ 0 and integer, i ∈ {1, . . . , 4}.

6.3 Finding Dual Bounds – Relaxation


Finding upper bounds for a maximization problem presents a different challenge. The
most important approach is by relaxation, one type of which we have briefly encountered in
Example 3.3. The idea is to replace a difficult maximization IP by a simpler optimization
problem whose optimal value is at least as large as z ∗ .
There are two obvious possibilities to construct an optimization problem with this prop-
erty:
(i) Enlarge the set of feasible solutions so that one optimizes over a larger set, or
(ii) replace the maximization objective function by a function that has the same or a
larger value everywhere.
Definition 6.6. A problem
(RP ) max{f (x) | x ∈ T ⊆ Rn }
is a relaxation of
(P ) max{c(x) | x ∈ X ⊆ Rn }
if
(i) X ⊆ T , and
(ii) f (x) ≥ c(x) for all x ∈ X.
Proposition 6.7. If (RP) is a relaxation of (P) with respective optimal objective values
z R and z, then z R ≥ z.

Proof. If x∗ is an optimal solution of (P), then x∗ ∈ X ⊆ T and z = c(x∗ ) ≤ f (x∗ ). As


x∗ ∈ T we moreover know that f (x∗ ) does not exceed z R , and so z ≤ f (x∗ ) ≤ z R .


The next result shows that if a relaxation is tight enough it can either provide a certificate
of infeasibility or optimality for the original problem:
Proposition 6.8. Let X ⊆ Zn and T ⊆ Rn . Let
(RP ) z R = max{f (x) | x ∈ T }
be a relaxation of the following IP
(IP ) z ∗ = max{c⊤ x | x ∈ X}.
Then the following hold.
(i) If x∗ is an optimal solution of (RP) for which x∗ ∈ X and f (x∗ ) = c⊤ x∗ , then x∗ is
an optimal solution for (IP).
(ii) If (RP) is infeasible, then (IP) is infeasible.

Proof. (i) Any feasible solution z ∈ X of (IP ) is a feasible solution of (RP). Thus, as x∗
is an optimal solution of (RP), we know that f (z) ≤ f (x∗ ). As (RP) is a relaxation
of (IP), we know that c⊤ z ≤ f (z). Moreover, by assumption, f (x∗ ) = c⊤ x∗ and
x∗ ∈ X. Hence, c⊤ z ≤ f (z) ≤ f (x∗ ) = c⊤ x∗ for any z ∈ X and whence x∗ is
optimal for (IP ).
(ii) Clear: since X ⊆ T , if (RP) has no feasible point, then neither has (IP).

Example 6.9. Consider the binary programming problem


(BP ) max 7x1 + 4x2 + 5x3 + 2x4
s.t. 3x1 + 3x2 + 4x3 + 2x4 ≤ 6,
xi ∈ {0, 1}, i ∈ {1, 2, 3, 4}.
The relaxation
(LP ) max 7x1 + 4x2 + 5x3 + 2x4
s.t. 3x1 + 3x2 + 4x3 + 2x4 ≤ 6,
0 ≤ xi ≤ 1, i ∈ {1, 2, 3, 4}.
has optimal solution x∗ = (1, 1, 0, 0)⊤ . Since x∗ is binary and the objective function is
not changed, x∗ is also optimal for problem (BP).
There are a number of different types of relaxation:
• LP-relaxation (which we have already come across in Example 3.3), see Section 6.3.1
• Combinatorial relaxation, see Section 6.3.2
• Duality relaxation, see Section 6.3.3
• Lagrangian relaxation, see Chapter 10
We will discuss LP-relaxation, combinatorial relaxation and duality relaxation in the
remainder of this chapter and also see an example of Lagrangian relaxation. We will treat
Lagrangian relaxation in more detail later on in the course namely in Chapter 10 after
having covered some helpful preliminaries.


6.3.1 LP Relaxation Any IP problem

(IP ) max{c⊤ x | x ∈ P ∩ Zn }

with formulation P = {x ∈ Rn | Ax ≤ b} has an associated linear programming


relaxation
(LP ) max{c⊤ x | Ax ≤ b} = max{c⊤ x | x ∈ P }.
Clearly, (LP) is a relaxation of (IP), since P ∩ Zn ⊆ P and since the objective functions
of (LP) and (IP) are identical. We can conclude from Proposition 6.7 that

z ∗ ≤ z̄,

where z̄ is the optimal objective value of the LP relaxation (LP) and z ∗ the corresponding
value for the original problem (IP). Not using Proposition 6.7, this is intuitively clear,
since dropping the integrality constraints enlarges the feasible set.
The LP can be solved by the simplex method or interior-point method.
We already saw that under an ideal formulation the two values z̄ and z ∗ coincide. For other formulations this need not be the case. (Recall the warning from Section 3.3 that rounding does not help here either: 1.) It can happen that the rounded solution is infeasible. 2.) Even if one obtains a feasible solution by rounding up or down, the optimal objective value of the (IP) can be arbitrarily far away from the rounded optimal objective value of the (LP). 3.) Especially when considering Boolean integer problems rounding will not be helpful.)
The next result shows that better formulations of integer programming problems give
tighter dual bounds.

Proposition 6.10. Let P1 ⊂ P2 ⊆ Rn be two different formulations of the IP problem

(IP ) max{c⊤ x | x ∈ X ⊆ Zn }.

That is, X = P1 ∩ Zn = P2 ∩ Zn . Let

(LP1 ) z1LP = max{c⊤ x | x ∈ P1 },


(LP2 ) z2LP = max{c⊤ x | x ∈ P2 }

be the optimal objective values of the LP relaxations corresponding to these two formula-
tions. Then
z1LP ≤ z2LP ,
i. e. formulation P1 produces a tighter (dual) bound.

Proof. This is immediate as P1 ⊆ P2 and the maximum taken over a larger set is bigger.


6.3.2 Combinatorial Relaxation When the relaxed problem is a combinatorial opti-


mization problem, we speak of combinatorial relaxation. Some hard integer program-
ming problems can be relaxed by well-defined combinatorial problems. Here we illustrate
this with a few examples.
Example 6.11 (The knapsack problem).
(K)  max Σ_{i=1}^n ci xi
     s. t. Σ_{i=1}^n ai xi ≤ b,
     x ≥ 0, x ∈ Zn ,
where ai and b are real numbers. This problem can be relaxed as follows
max{ Σ_{i=1}^n ci xi | Σ_{i=1}^n ⌊ai ⌋xi ≤ ⌊b⌋, x ≥ 0, x ∈ Zn }.

We have seen in Section 3.4 that this relaxed problem can be well solved by dynamic
programming.
Example 6.12 (The traveling salesman problem).
min_x  Σ_{i=1}^n Σ_{j=1}^n cij xij
s. t.  Σ_{j: j≠i} xij = 1,   i ∈ {1, . . . , n},
       Σ_{i: i≠j} xij = 1,   j ∈ {1, . . . , n},
       Σ_{i∈S} Σ_{j∉S} xij ≥ 1,   ∀S ⊊ N, S ≠ ∅,
       xij ∈ {0, 1} for all i, j ∈ {1, . . . , n}.

Notice that the salesman tours are precisely the assignments containing no subtours.
Thus, if we remove the cut-set constraints the feasible set becomes larger, and we arrive
at the following assignment problem
min_x  Σ_{i=1}^n Σ_{j=1}^n cij xij
s. t.  Σ_{j: j≠i} xij = 1,   i ∈ {1, . . . , n},
       Σ_{i: i≠j} xij = 1,   j ∈ {1, . . . , n},
       xij ∈ {0, 1} for all i, j ∈ {1, . . . , n},
which can be solved efficiently.


Example 6.13 (The quadratic 0-1 problem). Many combinatorial optimization problems
can be formulated as the maximization of a 0–1 quadratic function subject to linear
constraints.
max{ Σ_{i,j: 1≤i<j≤n} qij xi xj − Σ_{j=1}^n pj xj : x ≠ 0, x ∈ {0, 1}n }.

Replacing all terms qij xi xj with qij < 0 by 0 gives a relaxation

max{ Σ_{i,j: 1≤i<j≤n} max{qij , 0} xi xj − Σ_{j=1}^n pj xj : x ≠ 0, x ∈ {0, 1}n }.

This relaxation can be solved as a series of maximum flow problems.

6.3.3 Duality Relaxation For LP, duality provides a standard way to obtain upper
bounds. It is therefore natural to ask the question: Is it possible to find duals for integer
programs? The important property of a dual is that the value of any feasible solution
provides an upper bound on the objective value. Note that an LP or combinatorial
relaxation first needs to be solved to optimality in order to be certain to have found an
upper bound for the IP.
Definition 6.14. The two problems
(IP ) z ∗ = max{c⊤ x | x ∈ X ⊆ Zn },
(D) s∗ = min{w(u) | u ∈ U ⊆ Rm }
form a weak dual pair if
c⊤ x ≤ w(u) for all x ∈ X, u ∈ U.
In this case (D) is called a weak dual of (IP). If z ∗ = s∗ , then (IP) and (D) are said to
form a strong dual pair and (D) is called a strong dual of (IP).

Note that for a given weak dual pair, strong duality does not always hold.
Dual problems can sometimes allow us to prove optimality, and provide a certificate of
infeasibility for the original problem.
Proposition 6.15. Let (D) be a weak dual of (IP).
(i) If x∗ ∈ X and u∗ ∈ U are such that c⊤ x∗ = w(u∗ ), then x∗ is optimal for (IP), and u∗ is optimal for (D).
(ii) If (D) is unbounded (i. e., it has “optimal” value −∞), then (IP) is infeasible.
Example 6.16 (IP and the dual of its LP relaxation form a weak dual pair). Consider
an (IP) and its LP relaxation (LP):
(IP ) max{c⊤ x | Ax ≤ b, x ∈ Zn+ }
(LP ) max{c⊤ x | Ax ≤ b, x ≥ 0}.


From previous modules you know that the dual of (LP) is


(D) min{b⊤ y | A⊤ y ≥ c, y ∈ Rm+ }.

From the weak duality theorem for linear programs it follows that (IP) and (D) form a weak dual pair.
Example 6.17 (Lagrangian relaxation). Consider the following IP
(IP ) max{c⊤ x | Ax ≤ b, x ∈ X ⊆ Zn }.
Let
z(u) = max{c⊤ x + u⊤ (b − Ax) | x ∈ X}. (6.1)
Then the following problem
(LD) min{z(u) | u ≥ 0}
is a weak dual of (IP). In fact, for any feasible point x of (IP), and any feasible point u
of (LD), we have
c⊤ x ≤ c⊤ x + u⊤ (b − Ax) ≤ z(u).
The first inequality above follows from the fact u⊤ (b − Ax) ≥ 0 and the second inequality
follows from (6.1).
Such a duality is called Lagrangian duality which will be dealt with in much greater
detail in later lectures, see Chapter 10.
Example 6.18 (Matching and Covering form a weak dual pair). Given an undirected
graph G = (V, E) where V is the set of nodes and E the set of edges,
• a matching M ⊆ E is a set of disjoint edges in E (in the sense that no two edges
share a common endpoint);
• a covering by nodes of G is a subset R ⊆ V of nodes such that every edge in E
has at least one endpoint in R.
Now consider the problem of finding a maximum cardinality matching
(M )  max_{M ⊆E} {|M | : M is a matching},

and the problem of finding a minimum cardinality covering


(C)  min_{R⊆V } {|R| : R is a covering by nodes}.

Then (M) and (C) form a weak dual pair, which can be seen as follows.
If M is a matching, say
M = {(i1 , j1 ), . . . , (ik , jk )},
then the 2k nodes
{i1 , j1 , . . . , ik , jk }
are distinct, and any covering by nodes R must contain at least one node from each pair
{is , js } for s ∈ {1, . . . , k}. Therefore,
|R| ≥ k = |M |.


7 – Total Unimodularity (TU)


In this chapter, we return to LP relaxation and discuss a sufficient condition which ensures
that the optimal solution of the LP relaxation is integral.

7.1 A Sufficient criterion for LP relaxation to be integral


In this section we study some integer programs that are well solved in the sense that an
efficient algorithm is known for solving all instances of the problem.
For example, consider the integer program
(IP ) max{c⊤ x | x ∈ X = P ∩ Zn },
where P = conv(X) is an ideal formulation for the (IP).
If the convex hull conv(X) can be given explicitly, then we may replace the (IP) by the
linear program
max{c⊤ x | x ∈ conv(X)}
which can be solved efficiently. Moreover, in this case strong duality holds between the linear program and its dual, which implies that the dual bound is exact.
Consider the problem
(IP ) max{c⊤ x | Ax ≤ b, x ≥ 0, x ∈ Zn+ }
with integral data A ∈ Zm×n and b ∈ Zm . Its LP-relaxation is given by
max{c⊤ x | Ax ≤ b, x ≥ 0, x ∈ Rn+ }.

Questions:
• When will the LP relaxation have an optimal solution that is integral?
• Does such a class of well-solved integer programs exist?
Note that if the LP relaxation has an optimal solution that is integral, then Proposition 6.8
yields that IP has an optimal solution and that these two coincide.
From LP theory, we know that basic feasible solutions take the form
x∗ = (x∗B , x∗N ) = (B −1 b, 0),
where B is an m × m nonsingular sub-matrix of [A, I] and I is an m × m identity matrix.
Theorem 7.1. Suppose A and b are integral. If the optimal basis B has
|det(B)| = 1,
then the LP relaxation has an integral optimal solution, and therefore the LP relaxation
solves IP.


Proof. From Cramer’s rule,


B −1 = B ∗ /det(B)
where B ∗ is the adjoint matrix, i. e. (B ∗ )ij = (−1)i+j det Bji , where Bji is the matrix
obtained from B by deleting the j-th row and i-th column.
The entries of B ∗ are (up to sign) determinants of submatrices of B, and hence integral. Thus B ∗ is an integral matrix. Since
det(B) = ±1, we see that B −1 is also integral. Thus, B −1 b is integral for all integral b.
This implies that the optimal basic feasible solution (i. e. the optimal extreme point) is
integral, and hence that the LP relaxation solves IP.

A particularly lucky situation occurs when every nonsingular m × m submatrix B of [A, I] (corresponding to any possible set of basic variables) satisfies |det(B)| = 1. The following definition guarantees that this is the case:

Definition 7.2. A matrix A is totally unimodular (TU) if every square submatrix


of A (of any size) has determinant +1, −1 or 0. A submatrix of A is a matrix which is obtained from A by deleting rows and columns.

Next, we investigate how we can recognize TU matrices. Immediate consequences of the


above definition are the following observations.

Proposition 7.3. All elements of a TU matrix A must lie in {0, 1, −1}, i. e. aij must be 1, −1 or 0 for any i, j.

Proof. Note: It suffices to consider sub-matrices of size 1 × 1.

Example 7.4. (i) The following matrix is TU:


 
1 −1 −1 0
 −1 0 0 1 
 
 0 1 0 −1 
0 0 1 0

(ii) The following matrix is not TU:


 
1 1 0
 1 0 1 
0 1 1
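For small matrices, Definition 7.2 can be checked by brute force: enumerate all square submatrices and test their determinants. The following sketch (not from the notes; it computes integer determinants by cofactor expansion to avoid floating-point issues, and is exponential in the matrix size) confirms, for instance, that the matrix in (ii) is not TU.

```python
from itertools import combinations

def int_det(M):
    """Determinant of a small integer matrix, by cofactor expansion."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * int_det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def is_totally_unimodular(A):
    """Check Definition 7.2 by enumerating every square submatrix."""
    m, n = len(A), len(A[0])
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                sub = [[A[i][j] for j in cols] for i in rows]
                if int_det(sub) not in (-1, 0, 1):
                    return False
    return True

# Matrix (ii) above: its full 3x3 determinant is -2, so it is not TU.
print(is_totally_unimodular([[1, 1, 0], [1, 0, 1], [0, 1, 1]]))  # False
```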

Remark 7.5. Clearly, total unimodularity is merely a sufficient criterion for when the
integer program
(IP ) max{c⊤ x | Ax ≤ b, x ∈ Zn+ }
is solved by its LP relaxation

max{c⊤ x | Ax ≤ b, x ≥ 0}.

The following result shows that in a certain sense the converse is also true.


Theorem 7.6. (i) If A is TU, then (LP) solves (IP) for all integer vectors b for which
it has a finite optimal value.
(ii) If (LP) solves (IP) for all integer vectors b for which it has a finite optimal value,
then A is TU.

Proof. Omitted.

When A is TU, strong duality holds and an ideal formulation can be obtained explicitly,
as the following result shows.

Theorem 7.7. Let A be TU and b ∈ Zm . Consider

(IP ) z ∗ = max{c⊤ x | Ax ≤ b, x ∈ Zn+ }.

Let (D) be the linear programming dual of the LP relaxation (LP) of (IP), i. e.

(LP ) z LP = max{c⊤ x | Ax ≤ b, x ≥ 0}
(D) w∗ = min{b⊤ y | A⊤ y ≥ c, y ≥ 0}.

(i) Then (D) is a strong dual of (IP).


(ii) The feasible set P = {x ∈ Rn | Ax ≤ b, x ≥ 0} of (LP) is an ideal formulation of
(IP), if the feasible region of (IP) is of finite cardinality.

Proof. (i) From the strong duality theorem from LP theory and the relaxation rela-
tionship, we know that w∗ = z LP ≥ z ∗ , and by total unimodularity we infer from
Theorem 7.6 that z LP = z ∗ . Thus, w∗ = z ∗ .
(ii) Every extreme point of P is a basic feasible solution of (LP), and hence every
extreme point (vertex) of P can be represented as x = (B −1 b, 0) for some nonsingular
submatrix B of (A, I).
Since A is TU and b ∈ Zm , Theorem 7.6 yields that all extreme points of P are
integral, and hence they all lie in the feasible set X = P ∩ Zn of (IP). Thus P ⊆
conv(X).
On the other hand, it follows from X ⊆ P that conv(X) ⊆ P . Thus, P = conv(X),
as desired.

7.2 Recognizing TU matrices


There are certain simple and important criteria that help us recognize TU matrices. We
need either

(i) certain sufficient criteria that can easily be checked and allow us to identify some
important families of TU matrices, or


(ii) rules by which small TU matrices can be assembled into larger ones, and conversely,
we also need the rules about how to decompose a matrix into smaller parts for which
the TU property can be easily verified.
Theorem 7.8. An m × n matrix A is TU if and only if any of the following matrices are
TU,
(i) A⊤
(ii) [A, I]
(iii) [A, −A]
(iv) P A or AQ, where P, Q are m × m, n × n permutation matrices, respectively
(v) the block matrix
    [ A   J1 ]
    [ J2  0  ]
    with Ji = Pi [ I 0 ; 0 0 ] Qi , where I are identity matrices, 0 blocks of zeros, and Pi , Qi permutation matrices of appropriate size
(vi) [A, ei ], where ei denotes a column of the identity matrix.
Example 7.9. Verify that the matrix
 
1 0 0 −1
A =  1 1 −1 −1 
0 0 1 0

is TU.
 
• Notice that the matrix
1 0
1 1
is TU.
• By Theorem 7.8 (iii), the following is TU
 
1 0 −1 0
1 1 −1 −1 ,

• and by Theorem 7.8 (i) and (vi), the following is TU


 
1 0 −1 0
 1 1 −1 −1  .
0 0 0 1

• Permuting the last two columns yields that A is a TU matrix.

The next result provides a simple sufficient condition for A to be TU.


Theorem 7.10 (Sufficient Condition). Let A = [aij ] be a matrix such that
(i) aij ∈ {+1, −1, 0} for all i, j.
(ii) Each column contains at most two nonzero coefficients, i. e.
Σ_{i=1}^m |aij | ≤ 2,   j ∈ {1, . . . , n}.


(iii) The set M of rows can be partitioned into (M1 , M2 ) such that each column j con-
taining two nonzero coefficients satisfies
Σ_{i∈M1} aij − Σ_{i∈M2} aij = 0.

Then A is totally unimodular.

Proof. Assume that (i)–(iii) are satisfied but that A is not TU. Let B be a smallest square
submatrix of A such that det(B) ∈ / {0, +1, −1}.
Then all columns of B contain exactly two nonzero coefficients. (Why? A zero column would give det(B) = 0; and if B contained a column with a single nonzero entry ±1, expanding the determinant along that column would yield a smaller square submatrix with determinant not in {0, +1, −1}, so B would not be minimal.)
Because of (iii), adding the rows of B with indices in M1 and subtracting the rows with
indices in M2 yields the zero vector, showing that the rows of B are linearly dependent
and det(B) = 0, which contradicts the choice of B.
Remark 7.11. Condition (iii) means that if the nonzero elements of column j are in rows i and k, and if aij = −akj , then {i, k} ⊆ M1 or {i, k} ⊆ M2 , whereas if aij = akj , then i ∈ M1 and k ∈ M2 , or vice versa.
Example 7.12. (i) We can use Theorem 7.10 to verify that the following matrix is
TU:  
1 0 0 1 0 0 1 0 0
0 1 0 0 1 0 0 1 0
 
0 0 1 0 0 1 0 0 1
 .
1 1 1 0 0 0 0 0 0
 
0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 1 1 1
(i) and (ii) are apparent and (iii) holds with M1 = {1, 2, 3} and M2 = {4, 5, 6}.
(ii) Consider the LP-relaxation of the assignment problem
min Σ_{i=1}^n Σ_{j=1}^n cij xij
s.t. Σ_{i=1}^n xij = 1 for j = 1, ..., n        (7.1)
     Σ_{j=1}^n xij = 1 for i = 1, ..., n        (7.2)
     0 ≤ x ≤ 1.                                 (7.3)
We can write the constraints (7.1) and (7.2) as Ax = e where e = (1, 1, ..., 1)⊤ , and
A is the node-edge incidence matrix. Then A is totally unimodular.
Corollary 7.13. Let A be any m × n matrix with entries taken from {0, +1, −1} with
the property that any column contains at most one +1 and at most one −1. Then A is
totally unimodular.


Proof. Assume first that A contains exactly two nonzero entries per column. Then the fact that A is TU follows from Theorem 7.10 with M1 = {1, . . . , m} and M2 = ∅. For the general case, observe that a column with at most one nonzero entry from {+1, −1} cannot destroy total unimodularity, since we can expand the determinant of any square submatrix along that column.
Theorem 7.14 (A general sufficient condition). A matrix A is totally unimodular if
(i) aij ∈ {0, +1, −1} for all i, j, and
(ii) for any subset M of the rows of A, there exists a partition (M1 , M2 ) of M such that
each column j satisfies
| Σ_{i∈M1} aij − Σ_{i∈M2} aij | ≤ 1.

7.3 Application to Minimum Cost Network Flow Problems


As one application of total unimodularity, we consider the minimum cost network flow
problem that is a very important class of problems with many applications lying at the
frontier between linear and integer programming.
Minimum cost network flow problems contain two important special cases:
• shortest path problems and
• maximum flow problems.
The minimum cost network flow problem is defined as follows.
• Consider a directed graph (V, E) with arc capacities bij for all (i, j) ∈ E.
• At each node i ∈ V, hi units of liquid are produced (negative production hi is to be
interpreted as consumption).
• Liquid can flow via the network at rates up to the specified capacities bij .
• Pumping one unit of liquid per unit of time along arc (i, j) ∈ E incurs a cost of cij .
We suppose that Σ_{i∈V} hi = 0, meaning that the total demand equals the total production.
We call nodes with positive production sources and nodes with negative production, i. e.
demand, sinks. Our aim is to transport the liquid from the sources to the sinks so that
the total costs are minimised.
Question: Which feasible flow minimises the total cost?

Let
• xij be the flow along arc (i, j),
• V + (i) = {k | (i, k) ∈ E} denote the set of nodes to which there is an arc from i
(successor nodes),
• V − (i) = {k | (k, i) ∈ E} be the set of nodes from which there is an arc into i
(predecessor nodes).


Figure 7.1: Network Flows

The following LP models the minimum cost network flow problem:


min Σ_{(i,j)∈E} cij xij
s.t. Σ_{k∈V + (i)} xik − Σ_{k∈V − (i)} xki = hi ,   i ∈ V,      (7.4)
     0 ≤ xij ≤ bij ,   (i, j) ∈ E.

which is of the form

min c⊤ x
s.t. Ax = h,                                    (7.5)
     0 ≤ x ≤ b,

or, equivalently,

min c⊤ x
s.t. [ A ; −A ; I ] x ≤ [ h ; −h ; b ],   x ≥ 0,      (7.6)

where the blocks are stacked vertically.

Example 7.15. Consider the digraph with 3 nodes and arcs

E = {(1, 2), (1, 3), (2, 1), (2, 3), (3, 2)}

with capacity
b = (b12 , b13 , b21 , b23 , b32 )⊤ = (3, 2, 4, 3, 6)⊤ ,
transportation costs

c = (c12 , c13 , c21 , c23 , c32 )⊤ = (1, 1, 2, 3, 4)⊤ ,

and node production rates h = (2, 4, −6)⊤ .


Denote the flow by


x = (x12 , x13 , x21 , x23 , x32 )⊤
Then it is easy to check that (7.4) can be given as

min x12 + x13 + 2x21 + 3x23 + 4x32


s.t x12 + x13 − x21 = 2,
−x12 + x21 + x23 − x32 = 4,
−x13 − x23 + x32 = −6,
0 ≤ x12 ≤ 3,
0 ≤ x13 ≤ 2,
0 ≤ x21 ≤ 4,
0 ≤ x23 ≤ 3,
0 ≤ x32 ≤ 6

which can be written as (7.6) where


   
1 1 −1 0 0 2
A =  −1 0 1 1 −1  , h =  4  .
0 −1 0 −1 1 −6

Note that the constraint matrix is TU.
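The instance of Example 7.15 can also be set up directly in a modeling language. The sketch below (assuming the open-source PuLP library; not part of the original notes) builds the flow-conservation and capacity constraints from the arc data. By total unimodularity, the LP optimum returned is automatically integral, so no integrality constraints are needed.

```python
import pulp

arcs = [(1, 2), (1, 3), (2, 1), (2, 3), (3, 2)]
cap  = {(1, 2): 3, (1, 3): 2, (2, 1): 4, (2, 3): 3, (3, 2): 6}
cost = {(1, 2): 1, (1, 3): 1, (2, 1): 2, (2, 3): 3, (3, 2): 4}
h = {1: 2, 2: 4, 3: -6}  # production rates, summing to zero

prob = pulp.LpProblem("min_cost_flow", pulp.LpMinimize)
x = {a: pulp.LpVariable(f"x_{a[0]}{a[1]}", lowBound=0, upBound=cap[a])
     for a in arcs}

prob += pulp.lpSum(cost[a] * x[a] for a in arcs)

for i in h:  # flow conservation: outflow minus inflow equals production
    prob += (pulp.lpSum(x[a] for a in arcs if a[0] == i)
             - pulp.lpSum(x[a] for a in arcs if a[1] == i)) == h[i]

prob.solve()
```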

Theorem 7.16. The constraint matrix arising in a minimum cost network flow problem
is totally unimodular.

Proof. As in (7.6), the constraint matrix has the form


 
A
 −A  .
I

It suffices therefore to show that A is TU. Each column of A has exactly two nonzero
entries, one +1, the other one −1, because each arc leaves one node as an outgoing arc, and
enters in another as an incoming arc. Therefore, the sufficient criterion of Theorem 7.10
is satisfied with M1 = M and M2 = ∅.

Corollary 7.17. In a minimum cost network flow problem, if the production rate vector h and the capacity vector b are integral, then the following hold.
(i) Each extreme point of the feasible polyhedron is integral.
(ii) If there exists an optimal flow, then there exists an integral optimal flow.
(iii) The constraints of the problem describe the convex hull of the set of integral feasible
flows.

Now, we consider two special cases.


7.3.1 Shortest Path problem


• Consider a directed graph G = (V, E) with nonnegative arc lengths cij for all (i, j) ∈
E.
• Two nodes s, t ∈ V are marked.
• Find a shortest path from s to t in G.
The shortest path problem can be viewed as a special case of a minimum cost network
flow problem with bij = 1 for all arcs.
Setting hs = 1, ht = −1 and hi = 0 for the remaining i ∈ V , and
xij = 1 if (i, j) is in the shortest path, and xij = 0 otherwise.

We can model the shortest path problem as a special case of the min-cost network flow
problem with additional integrality constraints:
(SP )  z = min Σ_{(i,j)∈E} cij xij
       s.t. Σ_{j∈V + (s)} xsj − Σ_{j∈V − (s)} xjs = 1,
            Σ_{j∈V + (t)} xtj − Σ_{j∈V − (t)} xjt = −1,
            Σ_{j∈V + (i)} xij − Σ_{j∈V − (i)} xji = 0,   i ∈ V \ {s, t},
            xij ∈ {0, 1},   (i, j) ∈ E.

Because the constraint matrix is TU, and the right-hand-side data is integral, the LP relaxation of (SP) is integral and hence solves the shortest path problem. Also we have the following
strong duality theorem for the shortest path problem.

Theorem 7.18. The dual

(D) wLP = max{πt − πs | πj − πi ≤ cij , (i, j) ∈ E}

of the LP relaxation is a strong dual of (SP).

Notice that replacing πj by πj + α for all j ∈ V does not change the dual, so we can set πs = 0
without loss of generality.

Example 7.19. One can use LP to find the length of a shortest path from node s to
node t in the directed graph displayed in Figure 7.2. Note that this LP problem has 16
variables and 8+16 constraints (not counting the non-negativity constraints).

Let us now consider the max-flow problem.


Figure 7.2: Shortest path instance

7.3.2 The Max Flow Problem


• Consider a directed graph G = (V, E) with nonnegative capacities bij for all (i, j) ∈
E.
• Two nodes s, t ∈ V are marked.
• Find a maximum flow from s to t in G.
Adding a backward arc from t to s, the maximum s − t flow problem can be formulated as

max  xts
s.t. Σ_{j∈V + (i)} xij − Σ_{j∈V − (i)} xji = 0,   i ∈ V,
     0 ≤ xij ≤ bij ,   (i, j) ∈ E.

The constraint matrix is TU, and the following is the strong dual problem for the max flow problem:
min Σ_{(i,j)∈E} bij wij
s.t. ui − uj + wij ≥ 0,   (i, j) ∈ E,
     ut − us ≥ 1,
     wij ≥ 0,   (i, j) ∈ E.

Example 7.20. Use LP to find a maximum s−t flow in the following capacitated network


Figure 7.3: Network instance

III
Further Methods and Algorithms

So far, we have learned


• how to solve the class of IP problems with total unimodularity,
• how to solve Knapsack problems by dynamic programming.
For general IP problems we have learned
• how to find dual bounds (i. e. upper bounds for maximization problems) on integer
programming problems by solving certain relaxations, and
• how to solve some relaxations (i. e. “easy” optimization problems).
What we have not learned yet is how to use these bounds to solve general IP problems,
which is the problem we are going to attack next!
Recall from Chapter 6 the following algorithmic approach to find optimal solutions.
Practically, the concept of finding primal and dual bounds (i. e. lower and upper bounds)
means that any algorithm will find a decreasing sequence

z̄1 > z̄2 > · · · > z̄s ≥ z ∗

of upper bounds, and an increasing sequence

z1 < z2 < · · · < zt ≤ z ∗

of lower bounds, and stop when

z̄s − zt ≤ ε,
where ε is some suitably chosen small nonnegative value.

8 – Branch and Bound


8.1 Solving general integer programming problems
A small problem is usually much easier to solve than a large-scale one. Given a large problem, a natural approach is therefore to break it into a series of smaller problems.


Consider the problem


z = max{c⊤ x | x ∈ S}.
Question: How can we break the problem into a series of smaller problems that are
easier, solve the smaller problems, and then put the information together again to solve
the original problem?
For Knapsack problems we discussed divide and conquer and dynamic programming. Now
we turn to the branch-and-bound method, which is a refinement of the divide and conquer
approach and can be applied to general (M)IPs.
Proposition 8.1. Consider the problem
z = max{c⊤ x | x ∈ S}.
If the set S of feasible solutions can be decomposed into a union of simpler sets S =
S1 ∪ · · · ∪ Sk and if
z j := max{c⊤ x | x ∈ Sj } j ∈ {1, . . . , k},
then
z = max_{j∈{1,...,k}} z j .

A typical way to represent such a divide and conquer approach is via an enumeration tree.
Example 8.2. Let S ⊆ {0, 1}3 . How do we break the set into smaller sets, and construct
the enumeration tree?
Example 8.3. Let S be the set of feasible tours of the traveling salesman problem on a
network of 4 cities. Let node 1 be the departure city.
• S can be subdivided into the disjoint sets of tours that start with an arc (12), (13)
or (14) respectively, i. e.
S = S(12) ∪ S(13) ∪ S(14),
where S(1i) means the tour starting with arc (1i).
• Each of the sets S(12), S(13) and S(14) can be further subdivided according to the
choice of the second arc, S(12) = S(12)(23) ∪ S(12)(24) etc.
• Each of these sets corresponds to a specific TSP tour and cannot be further subdi-
vided. We have found an enumeration tree of the TSP tours.
We see that S was decomposed on a first level, and then each of the constituent parts
was decomposed on a further level and so on.
Thus, Proposition 8.1 allows us to decompose a hard problem into a possibly large number
of easier branch problems, and to find an optimal solution of S by comparing the solutions
found for the branch problems.
However, for even quite moderately sized problems such a tree can no longer be explicitly
enumerated, as the number of leaves grows exponentially in the problem size.
The idea of implicit enumeration is based on building up the enumeration tree as we
explore it, and to prune certain parts that are not worth looking at before those parts are
even generated.


Figure 8.1: TSP enumeration tree

8.2 Pruning Mechanisms


The pruning mechanisms are based on the following insight.

Proposition 8.4. Consider the problem

z = max{c⊤ x | x ∈ S},

and let
S = S1 ∪ · · · ∪ Sk
be a decomposition of its feasible domain into smaller sets. Let

z̲ j ≤ z j ≤ z̄ j

be lower and upper bounds on

z j = max{c⊤ x | x ∈ Sj }

for all j. Then


z̲ := max_j z̲ j ≤ z ≤ max_j z̄ j =: z̄

gives a lower and an upper bound on z.

We speak of pruning a branch when we detect that we need no longer explore it further.
This can happen for a variety of reasons.

8.2.1 Pruning by Bound A branch Sj can be pruned when z̄ j ≤ z, i. e. when its dual bound cannot improve on the best primal bound z found so far.

8.2.2 Pruning by Infeasibility If Sj = ∅ then the corresponding branch can be


pruned.
Why would anyone introduce an empty Sj into the decomposition of S?

Figure 8.2: Pruning by bound

Remember that we do not set up the decomposition tree explicitly but use an implicit
enumeration, typically by introducing more and more extra constraints as we trickle down
towards the leaves of the enumeration tree.
As we proceed, the constraints may become incompatible and correspond to an empty set
Sj , but this may not be a-priori obvious and has to be detected.

8.2.3 Pruning by Optimality When z̲ j = z̄ j for some j, then the branch corresponding to Sj no longer has to be considered further, as an optimal solution for this branch (with value z j = z̲ j = z̄ j ) is already available.
However, we will not throw this solution away, as it may later turn out to be optimal for
the parent problem S.

Figure 8.3: Pruning by Optimality

8.3 Branch and Bound Method


A branch and bound method systematically exploits all of the above to break down the
solution of a hard optimization problem max{c⊤ x | x ∈ S} into easier parts:
• Implicitly build an enumeration tree for S by subdividing S = S1 ∪ · · · ∪ Sk on the
first level, and then subdividing each Sj further on the next level and so on.
• Only build the parts of the tree that are actually explored.
• For each subproblem Sj (corresponding to a branch of the enumeration tree), com-
pute primal and dual bounds: use heuristics for primal bounds, relaxation for dual

60
CHAPTER 8. BRANCH AND BOUND

bounds.
• Use Proposition 8.4 to tighten bounds at the root.
• Prune branches that need not be explored.
If LP relaxation is used for generating dual bounds in a branch-and-bound system, we
speak of LP based branch-and-bound. We will now explore this framework in more detail.

Algorithm 8.5 (LP based branch-and-bound)


Throughout the algorithm we will maintain and update a list of active nodes List,
a primal bound z on max{c⊤ x | x ∈ S}, and an incumbent x∗ , that is, the best
solution encountered so far.
1.) Initialisation: Set List := {S}, z := −∞ and x∗ := ∅.
2.) While List ̸= ∅, repeat:
(i) Choose a problem Sj ∈ List with formulation Pj , and solve the LP relax-
ation xj := arg max{c⊤ x | x ∈ Pj }.
. If an optimal solution xj to the relaxed problem exists, store z̄ j = c⊤ xj ,
. elseif the LP is unbounded, store z̄ j = +∞,
. else store Pj = ∅,
end.
(ii) Prune or branch:
. If Pj = ∅,

List ←− List \ {Sj }, (prune by infeasibility),


. elseif z̄ j ≤ z,

List ←− List \ {Sj }, (prune by bound),

. elseif xj is feasible for S (i. e. integer),

z ←− z̄ j , (update primal bound),

x∗ ←− xj , (update incumbent),
List ←− List \ {Sj }, (prune by optimality),
. else branch Sj into two subproblems Sj[1] , Sj[2] ,

List ←− (List \ {Sj }) ∪ {Sj[1] , Sj[2] }.

end.
3.) Stop with incumbent x∗ optimal (x∗ = ∅ is a certificate of infeasibility of the
problem).

A variety of modifications can be made to the above algorithm. For instance, the efficiency of the algorithm can be improved if in step 2.(i) we also compute a primal bound for Sj and use this to update z as well as an upper bound z̄.
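To make Algorithm 8.5 concrete, here is a compact sketch of LP based branch-and-bound for a pure integer program max{c⊤ x | Ax ≤ b, x ≥ 0, x ∈ Zn}. It is not part of the original notes: it uses scipy.optimize.linprog for the LP relaxations (resolving each node from scratch rather than warm-starting with the dual simplex method), selects nodes depth-first, and branches on the first fractional variable.

```python
import math
from scipy.optimize import linprog

def branch_and_bound(c, A, b, tol=1e-6):
    """Maximize c.x s.t. A x <= b, x >= 0 and integer, by LP based B&B."""
    n = len(c)
    best_val, best_x = -math.inf, None
    node_list = [[(0, None)] * n]   # one node: per-variable (lower, upper) bounds
    while node_list:
        bounds = node_list.pop()    # depth-first node selection
        res = linprog([-ci for ci in c], A_ub=A, b_ub=b, bounds=bounds)
        if not res.success:         # prune by infeasibility
            continue
        if -res.fun <= best_val:    # prune by bound
            continue
        frac = [j for j in range(n) if abs(res.x[j] - round(res.x[j])) > tol]
        if not frac:                # integral solution: prune by optimality
            best_val, best_x = -res.fun, [round(v) for v in res.x]
            continue
        j, xj = frac[0], res.x[frac[0]]
        down, up = list(bounds), list(bounds)
        down[j] = (bounds[j][0], math.floor(xj))  # branch x_j <= floor
        up[j] = (math.ceil(xj), bounds[j][1])     # branch x_j >= ceil
        node_list += [down, up]
    return best_val, best_x

# Example 8.7 below: max 4x1 - x2 s.t. 7x1 - 2x2 <= 14, x2 <= 3, 2x1 - 2x2 <= 3.
print(branch_and_bound([4, -1], [[7, -2], [0, 1], [2, -2]], [14, 3, 3]))
# (7.0, [2, 1])
```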
Four Questions
1.) How are the bounds to be obtained? This question has been answered already. The
primal (lower) bounds are provided by feasible solutions, and dual (upper) bounds
by relaxation or duality.
2.) How should the feasible region be separated into smaller regions? One simple idea is
to choose an integer variable that is fractional in the linear programming solution,
and split the problem into two about this fractional value. If xj = x̃j ∉ Z, one can take

S1 = S ∩ {x | xj ≤ ⌊x̃j ⌋}
S2 = S ∩ {x | xj ≥ ⌈x̃j ⌉}

It is clear that S = S1 ∪ S2 and S1 ∩ S2 = ∅.


Example 8.6. If xj = 4/3 is the solution of the associated relaxation problem, and
S is the current feasible region, then we can split S up into S = S1 ∪ S2 where

S1 = S ∩ {x | xj ≤ ⌊4/3⌋} = S ∩ {x | xj ≤ 1}
S2 = S ∩ {x | xj ≥ ⌈4/3⌉} = S ∩ {x | xj ≥ 2}

3.) If the LP relaxation is used, how do we solve the subproblems efficiently?


• We split up the feasible region by adding an inequality to the current LP
problems.
• Thus, we may use sensitivity analysis of LP to reoptimize the slightly changed
LP problems without starting again from scratch.
• As we have just added a single bound constraint to the linear program, our previous optimal basis remains dual feasible, and it is therefore natural to reoptimize from this basis using the dual simplex algorithm.
• Let x∗ , y ∗ be optimal solutions for the primal and dual LP instances

(P ) max{c⊤ x | Ax ≤ b, x ≥ 0} and
⊤ ⊤
(D) min{b y | A y ≥ c, y ≥ 0}.

Adding a new constraint, LP becomes

(P ′ ) max{c⊤ x | Ax ≤ b, a⊤_{m+1} x ≤ bm+1 , x ≥ 0}

This corresponds to the situation where a new variable appears in the dual
problem

(D′ ) min{b⊤ y + bm+1 ym+1 | A⊤ y + am+1 ym+1 ≥ c, y, ym+1 ≥ 0}.

4.) In what order should the subproblems (the nodes of the enumeration tree) be examined? There is no single answer that is best for all instances.


8.4 An Example
Example 8.7. Solve the following IP.

(IP ) z = max 4x1 − x2


s. t. 7x1 − 2x2 ≤ 14
x2 ≤ 3
2x1 − 2x2 ≤ 3
x ∈ Z2+

(i) Bounding: To obtain the first upper bound, solve the LP relaxation

(LP (S)) z = max 4x1 − x2


s. t. 7x1 − 2x2 ≤ 14
x2 ≤ 3
2x1 − 2x2 ≤ 3
x ∈ R2+

Let x3 , x4 , x5 be slack variables. By the simplex method, the resulting optimal basis
representation is

z̄ = max −(4/7)x3 − (1/7)x4 + 59/7
x1 + (1/7)x3 + (2/7)x4 = 20/7
x2 + x4 = 3
−(2/7)x3 + (10/7)x4 + x5 = 23/7
x1 , x2 , x3 , x4 , x5 ≥ 0

from which we obtain the nonintegral solution (x1 , x2 ) = (20/7, 3) and the upper (or dual) bound z̄ = 59/7. Is there any straightforward way to find a feasible solution to (IP)? Apparently not. By convention, as no feasible solution is yet available, we set z = −∞.
(ii) Branching: Since z < z̄, (IP) is not solved to optimality yet, and we need to branch. As x1 = 20/7, we take

S1 = S ∩ {x | x1 ≤ 2}, S2 = S ∩ {x | x1 ≥ 3}.

We now have the tree in the Figure 8.4. The subproblem (nodes) S1 , S2 that must
still be examined are called active. The node S on the other hand has been
processed and is inactive.
(iii) Choosing an active node: During the run of the algorithm, we maintain a list
of active nodes. This list currently consists of S1 , S2 . Later we will discuss breadth-first versus depth-first choices of the next active node to be processed. For now we
arbitrarily select S1 .


Figure 8.4: Branching 1

(iv) Bounding: Next we derive a bound z̄ 1 by solving the LP relaxation

(LP (S1 )) z = max 4x1 − x2


s.t. 7x1 − 2x2 ≤ 14
x2 ≤ 3
2x1 − 2x2 ≤ 3
x1 ≤ 2
x ∈ R2+

of the problem (IP1 ) z 1 = max{c⊤ x | x ∈ S1 } that corresponds to node S1 . Note that (LP (S1 )) differs from (LP (S)) by having one more constraint x1 ≤ 2 imposed. Using x1 = 20/7 − (1/7)x3 − (2/7)x4 from the optimal representation of (LP (S)), express the constraint x1 ≤ 2 as
−(1/7)x3 − (2/7)x4 + s = −6/7
where s is a new slack variable. Thus we have the dual feasible representation:

z̄ ′ = max −(4/7)x3 − (1/7)x4 + 59/7
x1 + (1/7)x3 + (2/7)x4 = 20/7
x2 + x4 = 3
−(2/7)x3 + (10/7)x4 + x5 = 23/7
−(1/7)x3 − (2/7)x4 + s = −6/7
x1 , x2 , x3 , x4 , x5 , s ≥ 0

After two simplex pivots, the linear program is reoptimized, giving z̄ 1 = 15/2 and (x̄11 , x̄12 ) = (2, 1/2).
(v) Branching: S1 is not solved to optimality and cannot be pruned, so using the same
branching rule as before, we have two new nodes

S11 = S1 ∩ {x | x2 ≤ 0}, S12 = S1 ∩ {x | x2 ≥ 1},


Figure 8.5: Branching 2

and add them to the node list. The tree is now as shown in Figure 8.5 and the new
list of active nodes is S11 , S12 , S2 .
(vi) Choosing an active node: We arbitrarily choose S2 for processing.
(vii) Bounding: We compute a bound z̄ 2 by solving the LP relaxation
(LP (S2 )) z = max 4x1 − x2
s.t. 7x1 − 2x2 ≤ 14
x2 ≤ 3
2x1 − 2x2 ≤ 3
x1 ≥ 3
x ∈ R2+
of the problem
(IP2 ) z 2 = max{c⊤ x | x ∈ S2 }.
To solve LP (S2 ), we use the dual simplex algorithm in the same way as above. The constraint x1 ≥ 3 is first written as x1 − t = 3, t ≥ 0, which expressed in terms of the nonbasic variables becomes:
(1/7)x3 + (2/7)x4 + t = −1/7.
The resulting linear program is
z̄ = max −(4/7)x3 − (1/7)x4 + 59/7
x1 + (1/7)x3 + (2/7)x4 = 20/7
x2 + x4 = 3
−(2/7)x3 + (10/7)x4 + x5 = 23/7
(1/7)x3 + (2/7)x4 + t = −1/7
x1 , x2 , x3 , x4 , x5 , t ≥ 0.


But this LP is infeasible, because the last constraint contradicts x3 , x4 , t ≥ 0.


Hence, S2 can be pruned by infeasibility.
(viii) Choosing an active node: The list of active nodes is S11 , S12 . We arbitrarily
choose S12 .
(ix) Bounding: S12 = S ∩ {x | x1 ≤ 2, x2 ≥ 1}. The resulting linear program has
optimal solution x̄12 = (2, 1) with value 7. Since this is an integer solution, z 12 = 7.
(x) Updating the incumbent: We store (2, 1) as the best integer solution found so far
and update the lower bound z = max{z, 7} = 7; S12 is now pruned by optimality.
(xi) Choosing an active node: Only S11 is active, so choose this node.
(xii) Bounding: S11 = S ∩ {x | x1 ≤ 2, x2 ≤ 0}. The resulting linear program has
optimal solution x̄11 = (3/2, 0) with value 6 < z = 7. Thus the node is pruned by
bound.
(xiii) Termination: There are no further active nodes left, and the algorithm terminates,
returning the optimal value z = 7 and the maximiser x = (2, 1) that achieves it.
The complete branch-bound tree is shown in Figure 8.6:


Figure 8.6: Complete branch and bound tree

8.5 Preprocessing and Fine-Tuning


In this section, we are going to discuss in more detail some of the practical aspects of
developing and using the branch and bound algorithm.
Commercial branch-and-bound systems for integer and mixed integer programming are
designed to solve and resolve the linear programming relaxations as rapidly as possible,
and to branch intelligently.
Given this philosophy, all recent algorithmic systems contain or offer:


• A powerful preprocessor, which simplifies the model by reducing the number of


constraints and variables, so that the linear programs are easier.
• The simplex method with a choice of pivoting strategies, and an interior-point option
for solving the linear programs.
• Limited choice of branching and node selection options
• Use of priorities

LP or IP models can often be simplified by reducing the number of variables and con-
straints (e.g. eliminating the redundant constraints), and IP models can be tightened
before any actual branch-and-bound computations are performed. All the commercial
branch-bound systems carry out such a check, called preprocessing.

Example 8.8. Consider the LP instance

max 2x1 + x2 − x3
s. t. 5x1 − 2x2 + 8x3 ≤ 15,
8x1 + 3x2 − x3 ≥ 9,
x1 + x2 + x3 ≤ 6,
0 ≤ x1 ≤ 3,
0 ≤ x2 ≤ 1,
1 ≤ x3 .

8.5.1 Tightening bounds


• Isolating x1 in the first constraint and using x2 ≤ 1 and x3 ≥ 1 yields

5x1 ≤ 15 + 2x2 − 8x3 ≤ 15 + 2 · 1 − 8 · 1 = 9,

and hence, x1 ≤ 9/5, which tightens the bound x1 ≤ 3.


• Similarly, isolating x3 in the first constraint, we obtain

8x3 ≤ 15 + 2x2 − 5x1 ≤ 15 + 2 · 1 − 5 · 0 = 17

which implies x3 ≤ 17/8 and it tightens x3 ≤ ∞.


• Isolating x2 in the first constraint, we get

2x2 ≥ 5x1 + 8x3 − 15 ≥ 5 · 0 + 8 · 1 − 15 = −7.

This yields x2 ≥ −7/2, which does not tighten x2 ≥ 0.


• Proceeding similarly with the second and third constraints, we obtain the tightened
bound
8x1 ≥ 9 − 3x2 + x3 ≥ 9 − 3 + 1 = 7,
yielding the improved bound x1 ≥ 7/8.


• As some of the bounds have changed after the first sweep, we may now go back to
the first constraint and tighten the bounds yet further. Isolating x3 , we obtain

8x3 ≤ 15 + 2x2 − 5x1 ≤ 15 + 2 − 5 · 7/8 = 101/8

yielding the improved bound x3 ≤ 101/64.


Continuing the second sweep by isolating each variable in turn in each of the constraints
1-3, and using the bound constraints, several bound constraints may further tighten in
general, but not in the present example. How many sweeps of this process are needed?
One can show that after two sweeps of all the constraints and variables, the bounds cannot
improve further!

8.5.2 Redundant Constraints Using the final upper bounds in constraint 3,

x1 + x2 + x3 ≤ 9/5 + 1 + 101/64 < 6,
which shows that this constraint x1 + x2 + x3 ≤ 6 is redundant and can be omitted from
the problem. The remaining problem is

max 2x1 + x2 − x3
5x1 − 2x2 + 8x3 ≤ 15,
8x1 + 3x2 − x3 ≥ 9,
7/8 ≤ x1 ≤ 9/5,
0 ≤ x2 ≤ 1,
1 ≤ x3 ≤ 101/64.

8.5.3 Variable fixing


• Increasing x2 makes the objective function grow and loosens all constraints except
x2 ≤ 1. Therefore, in an optimal solution we must have x2 = 1.
• Decreasing x3 makes the objective function grow and loosens all constraints except
1 ≤ x3 . Thus, in an optimal solution we must have x3 = 1.
This leads to the trivial problem:

max{2x1 | 7/8 ≤ x1 ≤ 9/5}.

Example 8.8 shows how to simplify linear programming instances. In the preprocessing
of IPs we have further possibilities:
(i) For all xj with an integrality constraint xj ∈ Z any bounds lj ≤ xj ≤ uj can be
tightened to ⌈lj ⌉ ≤ xj ≤ ⌊uj ⌋.
(ii) For binary variables new logical or Boolean constraints can be derived that tighten
the formulation and hence lead to fewer branching nodes in a branch-and-bound
procedure.


The latter point is illustrated in the next example.

Example 8.9. Consider a binary IP instance whose feasible set is defined by the following
constraints,

7x1 + 3x2 − 4x3 − 2x4 ≤ 1,


−2x1 + 7x2 + 3x3 + x4 ≤ 6,
−2x2 − 3x3 − 6x4 ≤ −5,
3x1 − 2x3 ≥ −1,
x ∈ {0, 1}4 .

8.5.4 Generating logical inequalities The first constraint shows that x1 = 1 implies
x3 = 1, which can be written as
x1 ≤ x3 .

Likewise, x1 = 1 implies x4 = 1, or equivalently,

x1 ≤ x4 .

Finally, constraint 1 also shows that the problem is infeasible if x1 = x2 = 1. Therefore,


the following constraint must hold,

x1 + x2 ≤ 1.

We can process the remaining constraints in a similar way:


(i) Constraint 2 yields the inequalities x2 ≤ x1 and x2 + x3 ≤ 1.
(ii) Constraint 3 yields x2 + x4 ≥ 1 and x3 + x4 ≥ 1.
(iii) Constraint 4 yields x1 ≥ x3 .
Although the introduction of the new logical constraints makes the problem seem more
complicated, the formulation becomes tighter and thus better. Furthermore, we can now
process the problem further.

8.5.5 Combining pairs of logical inequalities We now consider pairs involving the
same variables.

x1 ≤ x3 and x1 ≥ x3 =⇒ x1 = x3 .
x1 + x2 ≤ 1 and x2 ≤ x1 =⇒ x2 = 0,

and then
x2 + x4 ≥ 1 =⇒ x4 = 1.


8.5.6 Simplifying Substituting the identities x2 = 0, x3 = x1 and x4 = 1 derived above, all


four constraints become redundant. We are left with the choice x1 ∈ {0, 1}, and hence
the feasible set can contain at most two points

S ⊆ {(1, 0, 1, 1)⊤ , (0, 0, 0, 1)⊤ }.

Indeed, both points satisfy all four constraints and are binary. Thus,

S = {(1, 0, 1, 1)⊤ , (0, 0, 0, 1)⊤ }.
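Since there are only 2⁴ = 16 candidate points, the conclusion of Example 8.9 can also be verified by brute force. The snippet below (not from the notes) enumerates {0, 1}⁴ and keeps the points satisfying all four original constraints; it prints exactly the two points above.

```python
from itertools import product

feasible = [
    (x1, x2, x3, x4)
    for x1, x2, x3, x4 in product((0, 1), repeat=4)
    if 7*x1 + 3*x2 - 4*x3 - 2*x4 <= 1
    and -2*x1 + 7*x2 + 3*x3 + x4 <= 6
    and -2*x2 - 3*x3 - 6*x4 <= -5
    and 3*x1 - 2*x3 >= -1
]
print(feasible)  # [(0, 0, 0, 1), (1, 0, 1, 1)]
```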

8.6 Node Selection


To prune the tree significantly, one needs a good lower bound provided by a feasible
solution. Finding such a solution is sometimes possible via a heuristic, but in general this
problem is hard.

8.6.1 Depth-First search strategy A depth-first strategy aims at finding a feasible


solution quickly, by only choosing an active node that is a direct descendent of the previ-
ously processed node. This strategy also makes it easier to warm-start the LP calculations
in each iteration.

8.6.2 Breadth-First search strategy In the breadth-first strategy, one ensures that
the associated tree is as short as possible, by branching all problems on the current level
of the tree first, before turning to the next level of the tree.

8.6.3 Best-Node-First To minimise the total number of nodes processed during the
run of the algorithm, the optimal strategy is to always choose the node with the largest
upper bound, i.e., Sj such that

z̄ j = max{z̄ i : Si ∈ List}.

With such a rule, we will never branch on a node St whose upper bound z̄ t is smaller than the optimal value z ∗ of S. This is called a best-node-first strategy. The depth-first
and best-node-first strategies are usually mutually contradictory, so a compromise has to
be reached. Usually, depth-first is used initially until a feasible solution is found and a
lower bound z is established.

8.7 Branching Options


In our example, we branched on a fractional variable. Often several candidate variables
are available. In this case, a choice has to be made.


8.7.1 Most Fractional Variable Let C be the set of fractional variables of the solution
x∗ to an LP relaxation. The most fractional variable approach is to branch on the variable
that corresponds to the index

j = arg max{min{fi , 1 − fi } | i ∈ C},

where fi = x∗i − ⌊x∗i ⌋ is the fractional part of x∗i .
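As an illustration, this selection rule can be written in a few lines of Python; this is only a sketch of the rule itself (the function name is ours), not of a full branch-and-bound code.

import math

def most_fractional_index(x_star, tol=1e-6):
    # Return the index i in C maximising min(f_i, 1 - f_i),
    # where f_i is the fractional part of x_star[i].
    candidates = []
    for i, v in enumerate(x_star):
        f = v - math.floor(v)
        if tol < f < 1 - tol:          # i belongs to the set C of fractional variables
            candidates.append((min(f, 1 - f), i))
    return max(candidates)[1]

print(most_fractional_index([2.0, 0.5, 1.8]))   # prints 1, since f = 0.5 is most fractional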

8.7.2 Branching by Priorities In this approach the user can indicate a priority of
importance for the decision variables to be integer. The system will then branch on the
fractional variable with highest priority.
Suppose the underlying optimization problem is a minimization problem. Since rounding
up the variables yj corresponding to large fixed costs fj changes the objective function
more severely than rounding yj corresponding to small fixed costs, we prioritise the yj in
order of decreasing fixed costs fj .

8.8 Further Improving Branch-and-Bound Systems


LP based branch-and-bound may not be strong enough to solve really hard IP problems.
In these cases one has to resort to the following improvements:

• Finding feasible solutions via heuristics. This is hard, but when it succeeds, it
generates lower bounds z that allow pruning by bounding.
• Finding better dual bounds by using
– combinatorial relaxation (see Section 6.3.2),
– duality (see Section 6.3.3),
• Tightening the formulation of the IP, having the effect that fewer branching nodes
are needed until an optimal solution is found. This can be achieved
– by cutting plane methods
– or by Lagrangian relaxation.

9 – Cutting Plane Algorithm

9.1 Valid inequalities


Definition 9.1 (Valid Inequality). An inequality u⊤ x ≤ η is a valid inequality for
X ⊆ Rn if u⊤ x ≤ η for all x ∈ X.

Some Simple Valid Inequalities:


Example 9.2 (Integer Rounding). Consider the integer region

X = P ∩ Zn

where
P = {x ∈ R4+ | 13x1 + 20x2 + 11x3 + 6x4 ≥ 72}.
Dividing by 11 gives the valid inequality for P ,

(13/11)x1 + (20/11)x2 + x3 + (6/11)x4 ≥ 72/11.

Since x is nonnegative, rounding up the coefficients on the left to the nearest integer leads to

2x1 + 2x2 + x3 + x4 ≥ 72/11,

which is a weaker valid inequality for P .
Notice that x ∈ X is integer, and that the coefficients are integer. Thus, the left-hand-side
of the above inequality must be integer. An integer that is greater than or equal to 72/11
must be at least 7. So rounding the right-hand-side up to the nearest integer gives the valid
inequality for X:

2x1 + 2x2 + x3 + x4 ≥ 7.

Similarly, we can verify that all the following inequalities are valid inequalities for X:

x1 + x2 + x3 + x4 ≥ 4,
x1 + 2x2 + x3 + x4 ≥ 6,
3x1 + 4x2 + 2x3 + x4 ≥ 12.

(Do this exercise!)

Example 9.3 (Pure 0-1 set). Consider the following set

X = {x ∈ {0, 1}5 : 3x1 − 4x2 + 2x3 − 3x4 + x5 ≤ −2}.

Clearly,
• there is no point in X satisfying x2 = x4 = 0. So all feasible solutions satisfy

x2 + x4 ≥ 1

which thus is a valid inequality.


• there is no feasible solution satisfying x1 = 1 and x2 = 0 (since x1 = 1 ⇒ x2 = 1).
So
x1 ≤ x2
is a valid inequality.


Example 9.4 (Mixed 0-1 set). Consider the set

X = {(x, y) | x ≤ 5000y, 0 ≤ x ≤ 5, y ∈ {0, 1}}.

Since
X = {(0, 0)} ∪ {(x, 1) | 0 ≤ x ≤ 5},
it is easily checked that the inequality

x ≤ 5y

is valid.
Example 9.5 (Mixed integer set). Consider the set

X = {(x, y) | x ≤ 10y, 0 ≤ x ≤ 14, y ∈ Z1+ }.

Clearly,

X = {(0, 0)} ∪ {(x, 1) | 0 ≤ x ≤ 10} ∪ {(x, y) | 0 ≤ x ≤ 14, y ≥ 2, y ∈ Z}.

It is easy to verify that


x ≤ 4y + 6
is a valid inequality.
Since this example is 2-dimensional, we may represent X graphically, and see that the
addition of the valid inequality gives the convex hull of X.
Proposition 9.6. For the general case (0 < C < b), when C does not divide b, and

X = {(x, y) | x ≤ Cy, 0 ≤ x ≤ b, y ∈ Z1+ },

one obtains the valid inequality

x ≤ b − α(K − y),

where
K = ⌈b/C⌉ and α = b − (⌈b/C⌉ − 1) C.

Outline of Proof. Let K = ⌈b/C⌉. By definition of K, we have

K − 1 < b/C ≤ K,

i. e.,
C(K − 1) < b ≤ KC.
For y ∈ {0, 1, ..., K − 1}, define

S(y) = {(x, y) | 0 ≤ x ≤ Cy}.


Further, define
SK = {(x, y) | y ≥ K, y ∈ Z, 0 ≤ x ≤ b}.
It is easy to see that

X = {(x, y) | x ≤ Cy, 0 ≤ x ≤ b, y ∈ Z1+ }


= S(0) ∪ S(1) ∪ S(2) ∪ · · · ∪ S(K − 1) ∪ SK .

All feasible points of X lie on or to the left of the straight line through the two points

((K − 1)C, K − 1)⊤ and (b, K)⊤ .

We write this line in the form

x = αy + β.

Since it crosses the above two points, we have

α = b − (K − 1)C, β = b − αK.

Then the valid inequality is given as

x ≤ αy + b − αK
= b − α(K − y).

Indeed, for (x, y) ∈ S(y) with y ≤ K − 1 we have x ≤ Cy ≤ αy + β, since α ≤ C and
both sides agree at y = K − 1; and for (x, y) ∈ SK we have x ≤ b = αK + β ≤ αy + β.
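The formula of Proposition 9.6 is easy to evaluate; the following Python sketch (the function name is ours) computes K and α and reproduces the inequality x ≤ 4y + 6 of Example 9.5.

import math

def prop_9_6_cut(C, b):
    # K and alpha as in Proposition 9.6; the cut reads x <= b - alpha*(K - y).
    K = math.ceil(b / C)
    alpha = b - (K - 1) * C
    return K, alpha

K, alpha = prop_9_6_cut(10, 14)      # data of Example 9.5
print(K, alpha)                      # 2 4, i.e. x <= 14 - 4*(2 - y) = 4y + 6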

Example 9.7 (Mixed Integer Rounding). Consider the set

X = P ∩ (Z4 × R1 )

where
P = {(y, s) ∈ R4+ × R1+ | 13y1 + 20y2 + 11y3 + 6y4 + s ≥ 72}.
It is not difficult to prove that

⌈(72 − s)/11⌉ ≥ 7 − αs for some α.

For example, α = 1/6. Thus dividing the inequality

13y1 + 20y2 + 11y3 + 6y4 + s ≥ 72

by 11 yields

(13/11)y1 + (20/11)y2 + y3 + (6/11)y4 ≥ (72 − s)/11,
which suggests that

2y1 + 2y2 + y3 + y4 ≥ ⌈(72 − s)/11⌉ ≥ 7 − (1/6)s
is a valid inequality for X.
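The claimed bound ⌈(72 − s)/11⌉ ≥ 7 − s/6 can be sanity-checked numerically; the loop below merely samples s on a grid, so it is a spot check under the stated data, not a proof.

import math

for k in range(0, 721):              # s = 0.0, 0.1, ..., 72.0
    s = k / 10.0
    assert math.ceil((72 - s) / 11) >= 7 - s / 6 - 1e-9
print("bound holds on the sampled grid")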


9.2 Generating valid inequalities for integer programs


The following simple observation was already used in Example 9.2.
Proposition 9.8. Let X = {y ∈ Z1 | y ≤ b}. Then the inequality y ≤ ⌊b⌋ is valid for X.

We now use this observation again in the next example.


Example 9.9. Let X = P ∩ Zn be the set of integer points in P where P is given by
7x1 − 2x2 ≤ 14,
x2 ≤ 3,
2x1 − 2x2 ≤ 3,
x ≥ 0
• Combining the first three constraints with nonnegative weights w = (2/7, 37/63, 0), we
obtain the valid inequality for P ,

2x1 + (1/63)x2 ≤ 121/21.
• Rounding the coefficients on the left-hand-side down to the nearest integer gives the valid
inequality for P :

2x1 + 0x2 ≤ 121/21.
• Now as the left-hand-side is integral for all points of X, rounding the right-hand-side
down to the nearest integer leads to the valid inequality for X:

2x1 ≤ ⌊121/21⌋ = 5.

Repeating the procedure, and using a weight of 1/2 on this last constraint, yields the
tighter inequality

x1 ≤ ⌊5/2⌋ = 2.
An identical approach can be used to derive valid inequalities for any integer programming
region. The general procedure is described as follows:

9.2.1 Chvátal-Gomory procedure to construct a valid inequality Let X = P ∩ Zn , where

P = {x ∈ Rn+ | Ax ≤ b},

A is an m × n matrix with columns {a1 , a2 , . . . , an }, and w ∈ Rm+ .

• The inequality

∑_{j=1}^{n} (waj ) xj ≤ wb

is valid for P as w ≥ 0 and ∑_{j=1}^{n} aj xj ≤ b.


• The inequality

∑_{j=1}^{n} ⌊waj ⌋ xj ≤ wb

is valid for P as x ≥ 0.
• The inequality

∑_{j=1}^{n} ⌊waj ⌋ xj ≤ ⌊wb⌋

is valid for X as x is integer, and thus ∑_{j=1}^{n} ⌊waj ⌋ xj is integer.
This simple procedure is sufficient to generate all valid inequalities for an integer program.
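One round of the procedure is also straightforward to implement. The sketch below (a minimal Python function, name ours) uses exact rational arithmetic to avoid floating-point rounding; applied to the data of Example 9.9 with w = (2/7, 37/63, 0), it reproduces the inequality 2x1 ≤ 5.

from fractions import Fraction as F
import math

def chvatal_gomory_cut(A, b, w):
    # One Chvatal-Gomory round for P = {x >= 0 | Ax <= b} and weights w >= 0:
    # returns the coefficients and right-hand side of
    #   sum_j floor(w a_j) x_j <= floor(w b).
    m, n = len(A), len(A[0])
    coeffs = [math.floor(sum(w[i] * A[i][j] for i in range(m))) for j in range(n)]
    rhs = math.floor(sum(w[i] * b[i] for i in range(m)))
    return coeffs, rhs

A = [[7, -2], [0, 1], [2, -2]]       # constraint data of Example 9.9
b = [14, 3, 3]
w = (F(2, 7), F(37, 63), F(0))
print(chvatal_gomory_cut(A, b, w))   # ([2, 0], 5), i.e. 2x1 + 0x2 <= 5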

Theorem 9.10. Every valid inequality for X (the feasible set of integer programs) can be
obtained by applying the Chvátal-Gomory procedure a finite number of times.

9.3 Gomory’s method


9.3.1 Tightening the formulation by valid inequalities In the branch-and-bound
method, preprocessing is the first step in tightening a formulation of the integer region by
tightening bounds, removing redundant inequalities, identifying logical inequalities, and
so on.
In this section, we are going to see how valid inequalities can be used to tighten the
formulation, and how valid inequalities can be used in the cutting-plane method to solve
integer programs.
The idea here is to examine the initial formulation

P = {x | Ax ≤ b, x ≥ 0}

of
X = P ∩ Zn ,
find a set of valid inequalities Qx ≤ q for X, add these to the formulation, immediately
giving a new formulation

P ′ = {x | Ax ≤ b, Qx ≤ q, x ≥ 0}

with X = P ′ ∩ Zn . Since P ′ ⊆ P , P ′ is better than P . Then one can apply the branch-
and-bound method, or any other algorithm, to the formulation P ′ .

Example 9.11 (Tightening formulations by a valid inequality). Let


X = {(x, y) | 0 ≤ xi ≤ 1 (i ∈ {1, . . . , n}), ∑_{i=1}^{n} xi ≤ ny, y ∈ {0, 1}}
= P ∩ (Rn+ × {0, 1}),



where

P = {(x, y) | 0 ≤ xi ≤ 1 (i ∈ {1, . . . , n}), ∑_{i=1}^{n} xi ≤ ny, 0 ≤ y ≤ 1}.

For a point (x, y) ∈ X notice that ∑_{i=1}^{n} xi ≤ ny and x ≥ 0. Thus, if there exists an
i ∈ {1, . . . , n} with xi > 0, then necessarily y = 1. Thus, the following inequalities

xi ≤ y, i ∈ {1, . . . , n}

are valid inequalities for X. Adding all these valid inequalities to P yields the formulation

P ′ = {(x, y) | 0 ≤ xi ≤ 1, xi ≤ y (i ∈ {1, . . . , n}), ∑_{i=1}^{n} xi ≤ ny, 0 ≤ y ≤ 1}.

Since
xi ≤ y, i ∈ {1, . . . , n},
implies that
∑_{i=1}^{n} xi ≤ ny,
this inequality becomes redundant, and can be removed from the system. Thus, we obtain

P ′ = {(x, y) | 0 ≤ xi ≤ 1, xi ≤ y (i ∈ {1, . . . , n}), 0 ≤ y ≤ 1},

which is a better formulation than P ; in fact, P ′ ⊂ P . It is not difficult to verify that
P ′ = conv(X), which is the ideal formulation for X. (Prove this!)

9.3.2 Advantage and disadvantage of tightening formulations by valid inequal-


ities
• Advantage: If the valid inequalities are well chosen so that formulation P ′ is
significantly better than P , the bounds should be improved and hence the branch-
and-bound algorithm should be more effective. In addition, the chances of finding
feasible integer solutions in the course of the algorithm should increase.
• Disadvantage: Often the family of valid inequalities is enormous. The resulting
linear programs become very big. It becomes impossible to use the standard branch-
and-bound software because there are too many constraints.

9.3.3 Gomory’s Fractional Cutting Plane Algorithm The fundamental idea be-
hind the cutting plane method is to add constraints to a linear program that cut off its
current fractional optimal solution, until the optimal basic feasible solution takes on integer values.
Of course, we have to be careful which constraints we add: we would not want to change
the problem by adding additional constraints. Here, we will add a special type of con-
straint called a cut.
A cut relative to a current fractional solution satisfies the following criteria:


• Every feasible integer point is feasible for the cut (therefore, a cut must be a valid
inequality for the integer feasible set), and
• the current fractional solution is not feasible for the cut.
Two ways to generate cuts:
• The first, called Gomory cuts, generates cuts from any linear programming tableau.
This has the advantage of “solving” any problem but has the disadvantage that the
method can be very slow.
• The second approach is to use the structure of the problem to generate very good
cuts. This approach needs a problem-by-problem analysis, but can provide very
efficient solution techniques.
In what follows, we focus our attention on the first approach.
Example 9.12. Consider the integer set
X = {(x, y) | x ≤ 10y, 0 ≤ x ≤ 14, x ∈ Z1+ , y ∈ {0, 1, 2, 3}}.
Find a cut cutting off the point (14, 1.4).
Consider the IP problem:
max{c⊤ x | Ax = b, x ≥ 0 and integer}.
The idea is to first solve the associated linear programming relaxation and find an optimal
basis, choose a basic variable that is not integer, and then generate a Chvátal-Gomory
inequality on the constraint associated with this basic variable so as to cut off the linear
programming solution. We suppose, given an optimal basis, that the problem is rewritten
in the form:
Maximize f0 + ∑_{j∈N} c̄j xj
subject to xB + ∑_{j∈N} āj xj = b̄,
x ≥ 0 and integer,

where b̄ = B −1 b, āj is the column of the optimal tableau associated with the nonbasic
variable xj (j ∈ N ), and xB is the vector of basic variables.

If the basic optimal solution x∗ is not integer, there exists some row i with b̄i ∉ Z.
Choosing such a row, the Chvátal-Gomory inequality for row i is

(xB )i + ∑_{j∈N} ⌊āij ⌋ xj ≤ ⌊b̄i ⌋,

where āij denotes the entry of the column āj in row i. Subtracting this inequality from
row i of the tableau leads to


∑_{j∈N} (āij − ⌊āij ⌋) xj ≥ b̄i − ⌊b̄i ⌋.

This is called the Chvátal-Gomory cut (C-G cut for short).


By the definition and choice of row i, we notice that

0 ≤ āij − ⌊āij ⌋ < 1 and 0 < b̄i − ⌊b̄i ⌋ < 1.

As x∗j = 0 for all nonbasic variables j ∈ N in the optimal LP solution, x∗ does not satisfy
the Chvátal-Gomory cut, and hence x∗ will be cut off.
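Extracting the C-G cut from a tableau row is a purely mechanical computation on fractional parts. Here is a minimal sketch (exact rationals, names ours):

from fractions import Fraction as F

def gomory_fractional_cut(row, rhs):
    # Given the tableau row (x_B)_i + sum_{j in N} row[j] x_j = rhs with
    # fractional rhs, return the cut  sum_j frac(row[j]) x_j >= frac(rhs).
    frac = lambda a: a - (a // 1)     # exact fractional part for Fraction
    return [frac(a) for a in row], frac(rhs)

# Row 1 of Example 9.14 below: x1 + (1/7)x3 + (2/7)x4 = 20/7.
print(gomory_fractional_cut([F(1, 7), F(2, 7)], F(20, 7)))
# ([Fraction(1, 7), Fraction(2, 7)], Fraction(6, 7)), i.e. (1/7)x3 + (2/7)x4 >= 6/7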

Based on these considerations, we now describe a basic cutting plane algorithm for (IP):

max{c⊤ x | x ∈ X = P ∩ Zn }.

A framework of cutting plane algorithm:


Consider the integer program:
max{c⊤ x | x ∈ X}.
Initialization: Set k = 0 and P 0 = P (the initial formulation).
Iteration k: Solve the linear program

z̄ k = max{c⊤ x | x ∈ P k }.

Let xk be an optimal solution.
(i) If xk ∈ Zn , stop: xk is an optimal solution for (IP).
(ii) Else (i. e. if xk ∉ Zn ) find a valid inequality (uk )⊤ x ≤ γ k for X cutting off xk . Set

P k+1 = P k ∩ {x | (uk )⊤ x ≤ γ k },

increment k, and repeat.
The general idea of a cutting-plane algorithm is as follows:
(i) We find an optimal solution x∗ for the linear program max{c⊤ x | x ∈ P }. This can
be done by any linear programming algorithm (possibly a solver that is available
only as a black-box).
(ii) If x∗ is integral, we already have an optimal solution to the IP and we can terminate.
(iii) Otherwise, we search our family (or families) of valid inequalities for inequalities
which are violated by x∗ , that is, w⊤ x∗ > d where w⊤ x ≤ d is valid for X.
(iv) We add the inequalities found to our LP-relaxation and resolve to find a new optimal
solution x∗∗ of the improved formulation. This procedure is continued.
(v) If we are fortunate, we terminate with an optimal integral solution.


(vi) If we are not so lucky, we still have gained something. Namely, we have found a new
formulation for our initial problem which is better than the original one (since we
have cut off some non-integral points). The formulation obtained upon termination
gives an upper bound z̄ for the optimal objective function value z ∗ which is no worse
than the initial one (and usually is much better). We can now use z̄ in a branch
and bound algorithm.

Remark 9.13. If the algorithm terminates without finding an integer solution for (IP),

P k = P ∩ {x | (uj )⊤ x ≤ γ j , j ∈ {1, . . . , k}}

is an improved formulation that can be input to a branch-and-bound algorithm. It should


also be noted that in practice it is often better to add several violated cuts at each
iteration, and not just one at a time.
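In code, the framework is a short loop around two oracles. The sketch below assumes a black-box lp_solve returning an optimal vertex of max{c⊤x | x ∈ P} and a find_cut oracle returning a violated valid inequality (or None for an integral point); both names are placeholders, not a real API.

def cutting_plane(c, P, lp_solve, find_cut, max_rounds=1000):
    # Skeleton of the cutting plane framework; P is a list of inequalities.
    for _ in range(max_rounds):
        x = lp_solve(c, P)               # assumed black-box LP solver
        cut = find_cut(x)                # assumed separation oracle
        if cut is None:                  # x integral: optimal for (IP)
            return x, P
        P = P + [cut]                    # P^{k+1} = P^k with the cut added
    return None, P                       # improved formulation, usable in branch-and-bound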

The next two examples show how Gomory’s cutting plane algorithm works.

Example 9.14 (Gomory method). Consider the integer program

z = max 4x1 − x2
s.t. 7x1 − 2x2 ≤ 14,
x2 ≤ 3,
2x1 − 2x2 ≤ 3,
x1 , x2 ≥ 0 and integer.

Adding slack variables x3 , x4 , x5 , observe that as the constraint data is integer, the slack
variables must also take integer values. Now solving as a linear program gives:
z = max 59/7 − (4/7)x3 − (1/7)x4
x1 + (1/7)x3 + (2/7)x4 = 20/7,
x2 + x4 = 3,
− (2/7)x3 + (10/7)x4 + x5 = 23/7,
x1 , x2 , x3 , x4 , x5 ≥ 0.

The optimal linear programming solution is

x = (20/7, 3, 0, 0, 23/7)⊤ ∉ Z5+ ,

so we use the first row, in which the basic variable x1 is fractional, to generate the Gomory
cut:
(1/7)x3 + (2/7)x4 ≥ 6/7.

By adding the slack variable s ≥ 0, this can be written as

s = (1/7)x3 + (2/7)x4 − 6/7.


Adding this cut to the above LP problem, and reoptimizing it leads to the new optimal
tableau:
z = max 15/2 − (1/2)x5 − 3s
x1 + s = 2,
x2 − (1/2)x5 + s = 1/2,
x3 − x5 − 5s = 1,
x4 + (1/2)x5 − s = 5/2,
x1 , x2 , x3 , x4 , x5 , s ≥ 0.
Now the new optimal linear programming solution

x = (2, 1/2, 1, 5/2, 0)⊤
is still not integer, as the variable x2 is fractional. The Gomory fractional cut on row 2 is
(1/2)x5 ≥ 1/2

or

−(1/2)x5 + t = −1/2
with t ≥ 0. Adding the constraint and reoptimizing, we obtain
z = max 7 − 3s − t
x1 + s = 2,
x2 + s − t = 1,
x3 − 5s − 2t = 2,
x4 − s + t = 2,
x5 − 2t = 1,
x1 , x2 , x3 , x4 , x5 , s, t ≥ 0.

Now the linear programming solution is integer, thus optimal, and hence x∗ = (2, 1)⊤
solves the original integer program.
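Since the feasible region of Example 9.14 is small, the computed optimum is easy to double-check by enumeration (plain Python; the box bounds are chosen large enough to contain the feasible region):

from itertools import product

best = max((4*x1 - x2, (x1, x2))
           for x1, x2 in product(range(6), range(4))
           if 7*x1 - 2*x2 <= 14 and x2 <= 3 and 2*x1 - 2*x2 <= 3)
print(best)   # (7, (2, 1)), matching the cutting plane result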
Example 9.15 (Gomory method – understanding the Gomory cut in a different way).
Consider the following integer program:
z = max 7x1 + 9x2
s.t. −x1 + 3x2 ≤ 6,
7x1 + x2 ≤ 35,
x1 , x2 ≥ 0 and integer.
First ignoring the integrality condition and solving its LP relaxation by the simplex
method yields the following optimal tableau, where s1 and s2 are slack variables

x1 x2 s1 s2 RHS
x2 0 1 7/22 1/22 7/2
x1 1 0 −1/22 3/22 9/2
−z 0 0 28/11 15/11 63


Let us look at the first constraint:

x2 + (7/22)s1 + (1/22)s2 = 7/2.

We can manipulate this to put all of the integer parts on the left side, and all the fractional
parts on the right, to get:

x2 − 3 = 1/2 − (7/22)s1 − (1/22)s2 .
Now, note that the left-hand side consists only of integers, so the right hand side must
add up to an integer.
Which integer can it be?
Well, it consists of a positive fraction minus a sum of nonnegative values. Therefore, the
right-hand side can only be 0, −1, −2, . . .; it cannot be a positive value.
Therefore, we have derived the following constraint:

1/2 − (7/22)s1 − (1/22)s2 ≤ 0.
This constraint is satisfied by every feasible integer solution to our original problem. But
in our current solution, s1 and s2 are nonbasic variables and both equal to 0, which
violates the above constraint.
This means the above constraint is a Gomory cut. We can now add this constraint to the
linear program and be guaranteed to find a different solution, one that might be integer.
We can also generate a cut from the other constraint.
x1 − (1/22)s1 + (3/22)s2 = 9/2,
x1 + (−1 + 21/22)s1 + (3/22)s2 = 4 + 1/2.

Thus,

x1 − s1 − 4 = 1/2 − (21/22)s1 − (3/22)s2 ,

giving the constraint

1/2 − (21/22)s1 − (3/22)s2 ≤ 0,

which is a Gomory cut.
which is a Gomory cut.

IV More on Relaxation and Applications

10 – Lagrangian relaxation
Lagrangian relaxation provides an improvement to LP-relaxation for certain IP and MIP
problems. Consider integer programming problems of the following type:

(IP ) z = max c⊤ x
s. t. Ax ≤ a,
Dx ≤ d,
x ∈ Zn+ .

Suppose that the constraints Ax ≤ a are “nice” (e.g., A is totally unimodular) in the sense
that an IP with just these constraints is easy to solve. Thus if we drop the complicating
constraints “Dx ≤ d”, we have the following relaxation

max c⊤ x
s. t. Ax ≤ a
x ∈ Zn+ .

For such problems we will now derive a family of relaxations that generate stronger bounds
than LP relaxations. Consequently, branch-and-bound systems built around these relax-
ations are often more efficient than an LP-based approach.
We write (IP) in the slightly more general form

(IP ) z = max c⊤ x
s. t. Dx ≤ d (or Dx = d)
x ∈ X,

where X is a feasible set of “benign” type.


Definition 10.1. A Lagrangian relaxation of (IP) is a problem of the form

(IP (u)) z(u) = max{c⊤ x + u⊤ (d − Dx) | x ∈ X}


where u ∈ Rm is a fixed vector of so-called Lagrange multipliers, and where it is further


assumed that u ≥ 0 when the constraints are inequality “≤” type.

Proposition 10.2. (IP (u)) is a relaxation of (IP).

Proof. Assume that the constraints are all of inequality “≤” type. The equality case “=”
can be proved in a similar way.
• The feasible region of (IP (u)) is larger than that of (IP ),

{x ∈ X | Dx ≤ d} ⊆ X.

• For all x feasible to (IP ), the objective function of (IP (u)) is at least as large as
that of (IP), in fact,
c⊤ x + u⊤ (d − Dx) ≥ c⊤ x

due to u⊤ (d − Dx) ≥ 0.

We have infinitely many Lagrangian relaxations (IP (u)) to choose from. So, how should
we fix the vector u of Lagrange multipliers?
Since (IP (u)) is a relaxation of (IP ), we have z ≤ z(u). Therefore, to find the least upper
bound of this kind, we have to solve the Lagrangian dual problem

wLD = min{z(u) | u ∈ Rm+ } for the inequality case “ ≤ ”,
wLD = min{z(u) | u ∈ Rm } for the equality case “ = ”.

Sometimes, Lagrangian relaxation yields an optimal solution to (IP):

Proposition 10.3. (i) (Inequality case “Dx ≤ d”) Let u ∈ Rm+ . If x(u) is optimal
for (IP (u)) and satisfies Dx(u) ≤ d and

(Dx(u))i = di for ui > 0,

then x(u) is optimal for (IP).


(ii) (Equality case “Dx = d”) Let u ∈ Rm . If x(u) is optimal for (IP (u)) and satisfies
Dx(u) = d then x(u) is optimal for (IP).

Proof. Note that the conditions in both (i) and (ii) imply that z(u) = c⊤ x(u). Since
(IP (u)) is a relaxation of (IP ) we can apply Proposition 6.8, which tells us that x(u) is
optimal for (IP ).


Example 10.4 (UFL). Consider the uncapacitated facility location problem from Chap-
ter 4,
(IP ) z = max ∑_{i∈M} ∑_{j∈N} cij xij − ∑_{j∈N} fj yj
s. t. ∑_{j∈N} xij = 1 (i ∈ M ),
xij − yj ≤ 0 (i ∈ M, j ∈ N ),
x ∈ R+^{|M |×|N |} , y ∈ {0, 1}^{|N |} ,

where
• M is the set of customer locations,
• N is the set of potential facility locations,
• fj are the fixed costs for opening facility j,
• and where we replaced the original servicing costs cij with −cij to turn the problem
into a maximisation problem.
Dualising the demand constraints ∑_{j∈N} xij = 1, we find the Lagrangian relaxation

(IP (u)) z(u) = max ∑_{i∈M} ∑_{j∈N} (cij − ui )xij − ∑_{j∈N} fj yj + ∑_{i∈M} ui
s. t. xij − yj ≤ 0 (i ∈ M, j ∈ N ),
x ∈ R+^{|M |×|N |} , y ∈ {0, 1}^{|N |} .

Now note that the constraint that linked the different facility locations to one another
has been subsumed in the objective function, so that (IP (u)) decouples,
z(u) = ∑_{j∈N} zj (u) + ∑_{i∈M} ui ,

where zj (u) is the optimal solution of the following problem,


(IPj (u)) zj (u) = max ∑_{i∈M} (cij − ui )xij − fj yj
s. t. xij − yj ≤ 0 (i ∈ M ),
xij ≥ 0 (i ∈ M ), yj ∈ {0, 1}.

Furthermore, (IPj (u)) is easily solved by inspection:


• If yj = 0, then xij = 0 for all i, and the objective value is 0.
• If yj = 1, then all clients i for which cij − ui > 0 will be served, and the objective
value is ∑_{i∈M} max{0, cij − ui } − fj .

Therefore, zj (u) = max{0, ∑_{i∈M} max{0, cij − ui } − fj }.
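The inspection rule translates directly into code. The following Python sketch evaluates z(u) for a small made-up instance (the data and function name are ours, purely for illustration):

def ufl_lagrangian_bound(c, f, u):
    # z(u) = sum_j z_j(u) + sum_i u_i, with z_j(u) solved by inspection:
    # open facility j iff its net gain exceeds the fixed cost f_j.
    z = sum(u)
    for j in range(len(f)):
        gain = sum(max(0.0, c[i][j] - u[i]) for i in range(len(u)))
        z += max(0.0, gain - f[j])
    return z

c = [[6.0, 4.0], [3.0, 5.0]]   # c[i][j], 2 clients, 2 candidate sites
f = [4.0, 3.0]                 # fixed opening costs
print(ufl_lagrangian_bound(c, f, [2.0, 2.0]))   # 7.0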


10.1 Solving Lagrangian Dual


Let
X := {y ∈ Zn+ | Ay ≤ b}.
We continue to consider integer programming problems of the form

(IP ) z = max c⊤ x
s. t. Dx ≤ d,
x ∈ X.

where Dx ≤ d are m constraints that make the problem difficult, while optimising over
X is easy.
The Lagrangian dual of (IP) is

(LD) wLD = min{z(u) | u ∈ Rm+ },

where
(IP (u)) z(u) = max{c⊤ x + u⊤ (d − Dx) | x ∈ X}.
We saw that (LD) is a dual of (IP), so that z ≤ wLD . Solving (LD) is thus a good way of
generating a hopefully quite tight bound on z, but note that z(u) is generally a nonlinear
function, and we do not know yet how to solve such problems efficiently.
We would like to answer the questions:
(i) How good is the upper bound obtained by solving the Lagrangian dual?
(ii) More specifically, does the Lagrangian dual give a stronger bound on (IP) than the
respective LP relaxation?
(iii) How can one solve the Lagrangian dual?
The strength of the Lagrangian dual. To answer the aforementioned questions, we
must understand the structure of (LD) better.
Let us assume that X contains a large but finite number of points X = {x[1] , . . . , x[T ] }.
Then

wLD = min_{u≥0} z(u) = min_{u≥0} max{c⊤ x[t] + u⊤ (d − Dx[t] ) | t ∈ {1, . . . , T }},

which can be further written as

wLD = min η
s. t. η ≥ c⊤ x[t] + u⊤ (d − Dx[t] ) (for t ∈ {1, . . . , T }),
u ∈ Rm+ , η ∈ R.

The last problem is an LP instance with decision variables η, u.


Taking the LP dual,



wLD = max ∑_{t=1}^{T} µt (c⊤ x[t] )
s. t. ∑_{t=1}^{T} µt (Dx[t] − d) ≤ 0,
∑_{t=1}^{T} µt = 1,
µ ∈ RT+ ,

which can be rewritten as

wLD = max{c⊤ x | Dx ≤ d, x ∈ conv(X)}.

The result still holds true for arbitrary X = {x ∈ Zn+ | Ax ≤ b}, not only when X is
finite.
Theorem 10.5.
wLD = max{c⊤ x | Dx ≤ d, x ∈ conv(X)}.
Corollary 10.6. (i) If {x ∈ Rn+ | Ax ≤ b} is an ideal formulation of X, then

conv(X) = {x ∈ Rn | Ax ≤ b, x ≥ 0},

and hence, wLD coincides with the bound produced by the LP relaxation of (IP),

wLD = max{c⊤ x | Dx ≤ d, Ax ≤ b, x ≥ 0}.

(ii) If {x ∈ Rn+ | Ax ≤ b} is not an ideal formulation of X then

conv(X) ⊂ {x ∈ Rn+ | Ax ≤ b}

and wLD is generally a tighter bound than the one given by the LP relaxation of
(IP).

Some remarks:
• Situation (i) of Corollary 10.6 happens quite often, because

max{c⊤ x | x ∈ X}

is an easy problem mainly when {x ∈ Rn+ | Ax ≤ b} is an ideal formulation of X.


Solving the Lagrangian dual is nonetheless interesting in this case.
• In situation (ii) of Corollary 10.6 one is rewarded with a stronger bound, but the fact
that {x ∈ Rn+ | Ax ≤ b} is not an ideal formulation of X means that (IP (u)) may
not be that easy to solve; thus, the reward comes at a higher price.


10.2 Lagrangian dual as non-differentiable convex optimization

Let us first assume again that X is of finite cardinality and write X = {x[1] , . . . , x[T ] }.
• It is not very difficult to prove that z(u) is the piecewise linear function

u ↦ max_{t=1,...,T} c⊤ x[t] + u⊤ (d − Dx[t] ).   (10.1)

• Furthermore, this function is convex, since the linear functions u ↦ c⊤ x[t] + u⊤ (d −
Dx[t] ) are convex and the pointwise maximum of a set of convex functions is again
convex.
• The Lagrangian dual function z(u) is not differentiable at the breakpoints where
the maximum in (10.1) is achieved by more than one index.
To solve (LD) we need to minimise z(u) over u ≥ 0.
Convex nondifferentiable functions can often be reasonably well solved by the subgradient
algorithm which we will introduce next.

Definition 10.7. Let u ∈ Rm and let f : Rm → R be a convex function. A subgradient


of f at u is a vector γ ∈ Rm such that for all v ∈ Rm ,

f (v) ≥ f (u) + γ ⊤ (v − u).

Example 10.8. (i) If f is differentiable at u, then ∇f (u) is the unique subgradient of


f at u.
(ii) d − Dx[t] is a subgradient for z(u) for any index t that achieves the maximum in
(10.1), see Lemma 10.9 below.
(iii) Any convex combination of subgradients is a subgradient.

Lemma 10.9. Let u ∈ Rm and

t∗ = arg max{c⊤ x[t] + u⊤ (d − Dx[t] ) | t ∈ {1, . . . , T }}.

Then d − Dx[t∗] is a subgradient of z(u) at u.

Proof. For all v ∈ Rm ,

z(v) = max_{t=1,...,T} c⊤ x[t] + v ⊤ (d − Dx[t] )
≥ c⊤ x[t∗] + v ⊤ (d − Dx[t∗] )
= c⊤ x[t∗] + u⊤ (d − Dx[t∗] ) + (v − u)⊤ (d − Dx[t∗] )
= z(u) + (v − u)⊤ (d − Dx[t∗] ),

which implies that d − Dx[t∗] is a subgradient of z(u) at u.


10.3 Solving (LD) by the Subgradient Method


• Initialisation: Choose initial Lagrange multiplier u[0] ≥ 0.
• For k ∈ {0, 1, 2, . . .} repeat the following steps:
i) Find a maximiser x(u[k] ) of the Lagrangian relaxation (IP (u[k] )) corresponding
to the Lagrange multiplier u[k] ,
ii) Determine a step length µk > 0.
iii) For i ∈ {1, . . . , m} set

ui[k+1] = max(ui[k] − µk (d − Dx(u[k] ))i , 0).

End.
Some remarks:
• The motivation of the algorithm is very simple: in each iteration of the main loop
the Lagrange multiplier vector is improved by correcting it in a direction that makes
the objective function z(u) decrease.
• Note that we built in a mechanism that prevents individual components of u[k+1]
from becoming negative. For Lagrange multipliers corresponding to equality constraints
Dx = d, this safeguard is of course unnecessary.
• The implementation of the subgradient algorithm is very simple too, but the diffi-
culty lies in choosing the step length µk , as the following theorem shows.
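A bare-bones implementation of the loop above might look as follows; z_oracle is an assumed black box returning z(u[k]) and the subgradient d − Dx(u[k]), and the geometric step size corresponds to rule (ii) of Theorem 10.10 below.

def subgradient_method(z_oracle, u0, steps=100, mu0=1.0, rho=0.95):
    # Minimise z(u) over u >= 0 by projected subgradient steps.
    u = list(u0)
    best = float("inf")
    for k in range(steps):
        z_u, g = z_oracle(u)             # z(u) and a subgradient d - Dx(u)
        best = min(best, z_u)            # best dual bound found so far
        mu = mu0 * rho**k                # step length mu_k
        u = [max(ui - mu * gi, 0.0) for ui, gi in zip(u, g)]
        # for equality constraints Dx = d, drop the projection max(., 0)
    return best, u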

Theorem 10.10. (i) If ∑k µk → ∞ and µk → 0 as k → ∞, then z(u[k] ) → wLD .
(ii) If µk = µ0 ρ^k for some fixed ρ ∈ (0, 1), then z(u[k] ) → wLD if µ0 and ρ are sufficiently
large.
(iii) If w̄ ≥ wLD and

µk = εk (z(u[k] ) − w̄) / ∥d − Dx(u[k] )∥2 ,

where εk ∈ (0, 2) for all k, then either z(u[k] ) → wLD for k → ∞, or else w̄ ≥
z(u[k] ) ≥ wLD occurs for some finite k.

11 – Further applications of integer programming


Integer-Programming models arise in practically every area of application of mathematical
programming. To conclude this course, we discuss two types of applications.

11.1 Fixed Costs


Frequently, the objective function for a minimization problem contains fixed costs (pre-
liminary design costs, fixed investment costs, fixed contracts, and so forth). For example,


the cost of producing x units of a specific product might consist of a fixed cost of setting
up the equipment and a variable cost per unit produced on the equipment.
Assume that the equipment has a capacity of N units. Define y to be a binary variable
that indicates when the fixed cost is incurred, so that y = 1 when x > 0 and y = 0 when
x = 0. Then the contribution to cost due to x may be written as

Ky + cx,

with the constraints:


x ≤ N y, x ≥ 0, y = 0 or 1.
As required, these constraints imply that x = 0 when the fixed cost is not incurred, i.e.,
when y = 0. The constraints themselves do not imply that y = 0 if x = 0. But when
x = 0, the minimization will clearly select y = 0, so that the fixed cost is not incurred.
Finally, observe that if y = 1, then the added constraint becomes x ≤ N , which reflects
the capacity limit on the production equipment.
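The effect of the model can be checked on a toy instance by enumeration; in the sketch below x is restricted to integers only to keep the enumeration finite, and the data are made up. It confirms that minimisation drives y to 0 exactly when x = 0.

from itertools import product

K, c, N = 10.0, 2.0, 5
feasible = [(x, y) for x, y in product(range(N + 1), (0, 1)) if x <= N * y]
for target in (0, 3):
    cost, x, y = min((K*y + c*x, x, y) for x, y in feasible if x == target)
    print(x, y, cost)   # x = 0 picks y = 0 (cost 0); x = 3 forces y = 1 (cost 16)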

11.2 Piecewise Linear Representation


Another type of nonlinear function that can be represented by integer variables is a
piecewise linear curve. Consider a cost curve for plant expansion that consists of
three linear segments with variable costs of 5, 1, and 3 million dollars per 1000 items of
expansion.
To model the cost curve, we express any value of x as the sum of three variables δ1 , δ2 , δ3 ,
so that the cost for each of these variables is linear. Hence,

x = δ1 + δ2 + δ3 ,

where
0 ≤ δ1 ≤ 4, 0 ≤ δ2 ≤ 6, 0 ≤ δ3 ≤ 5; (11.1)
and the total variable cost is given by:

Cost = 5δ1 + δ2 + 3δ3 .

Note that we have defined the variables so that:


• δ1 corresponds to the amount by which x exceeds 0, but is less than or equal to 4;
• δ2 is the amount by which x exceeds 4, but is less than or equal to 10; and
• δ3 is the amount by which x exceeds 10, but is less than or equal to 15.
If this interpretation is to be valid, we must also require that δ1 = 4 whenever δ2 > 0 and
that δ2 = 6 whenever δ3 > 0. Otherwise, when x = 2, say, the cost would be minimized
by selecting δ1 = δ3 = 0 and δ2 = 2, since the variable δ2 has the smallest variable cost.
However, these restrictions on the variables are simply conditional constraints and can be
modeled by introducing binary variables, as before.


If we let

w1 = 1 if δ1 is at its upper bound and w1 = 0 otherwise,
w2 = 1 if δ2 is at its upper bound and w2 = 0 otherwise,

then constraints (11.1) can be replaced by

4w1 ≤ δ1 ≤ 4,
6w2 ≤ δ2 ≤ 6w1 ,
0 ≤ δ3 ≤ 5w2 ,       (11.2)
w1 and w2 binary,
to ensure that the proper conditional constraints hold. Note that if w1 = 0, then w2 = 0,
to maintain feasibility for the constraint imposed upon δ2 , and (11.2) reduces to

0 ≤ δ1 ≤ 4, δ2 = 0, and δ3 = 0.

If w1 = 1 and w2 = 0, then (11.2) reduces to

δ1 = 4, 0 ≤ δ2 ≤ 6, and δ3 = 0.

Finally, if w1 = 1 and w2 = 1, then (11.2) reduces to

δ1 = 4, δ2 = 6, and 0 ≤ δ3 ≤ 5.

Hence, we observe that there are three feasible combinations for the values of w1 and w2 :

w1 = 0, w2 = 0 corresponding to 0 ≤ x ≤ 4 since δ2 = δ3 = 0;
w1 = 1, w2 = 0 corresponding to 4 ≤ x ≤ 10 since δ1 = 4 and δ3 = 0;
w1 = 1, w2 = 1 corresponding to 10 ≤ x ≤ 15 since δ1 = 4 and δ2 = 6.

The same general technique can be applied to piecewise linear curves with any number of
segments. The general constraint imposed upon the variable δj for the jth segment will
read:
Lj wj ≤ δj ≤ Lj wj−1 ,
where Lj is the length of the segment.
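For checking purposes, the piecewise cost of Section 11.2 can also be evaluated directly, without the MIP variables; the δ-splits in the sketch below mirror the intended meaning of δ1 , δ2 , δ3 (the function name is ours).

def expansion_cost(x):
    # Piecewise linear cost: segments of lengths 4, 6, 5 with slopes 5, 1, 3.
    assert 0 <= x <= 15
    d1 = min(x, 4)                   # delta_1
    d2 = min(max(x - 4, 0), 6)       # delta_2, positive only once delta_1 = 4
    d3 = min(max(x - 10, 0), 5)      # delta_3, positive only once delta_2 = 6
    return 5*d1 + 1*d2 + 3*d3

print(expansion_cost(2), expansion_cost(7), expansion_cost(12))   # 10 23 32

Such a direct evaluation is useful for validating the integer-programming formulation on sample points.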

11.3 Approximation of Nonlinear Functions


One of the most useful applications of the piecewise linear representation is for approx-
imating nonlinear functions. If we draw linear segments joining selected points on the
curve, we obtain a piecewise linear approximation, which can be used instead of the curve
in the model. The piecewise approximation, of course, is represented by introducing in-
teger variables as indicated above. By using more points on the curve, we can make the
approximation as close as we desire.
