Computational Control
Model Predictive Control
Saverio Bolognani
Automatic Control Laboratory (IfA)
ETH Zurich
Optimal control beyond LQR
Beyond LQR
Constrained linear-quadratic optimal control
$$\min_{\substack{u_0, \dots, u_{K-1} \\ x_0, \dots, x_K}} \; \sum_{k=0}^{K-1} x_k^\top Q x_k + u_k^\top R u_k \;+\; x_K^\top S x_K$$
subject to $x_{k+1} = A x_k + B u_k$ for $k = 0, \dots, K-1$
$x_0 = X_0$
$x_k \in \mathcal{X}_k$ (polytope)
$u_k \in \mathcal{U}_k$ (polytope)
No backward induction, one single program with
quadratic cost
convex constraints
linear equality constraints
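For concreteness, a minimal sketch of this single program (assuming the cvxpy package; the matrices and box bounds are illustrative stand-ins for the generic polytopes $\mathcal{X}_k$, $\mathcal{U}_k$):

```python
# Sketch: the constrained LQ problem as one convex QP (no backward induction).
# Assumes cvxpy; A, B, Q, R, S and the box bounds are illustrative placeholders.
import cvxpy as cp
import numpy as np

n, m, K = 2, 1, 20
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q, R, S = np.eye(n), 0.1 * np.eye(m), np.eye(n)
x_init = np.array([1.0, 0.0])

x = cp.Variable((n, K + 1))
u = cp.Variable((m, K))

cost = cp.quad_form(x[:, K], S)                       # terminal cost
constraints = [x[:, 0] == x_init]                     # x_0 = X_0
for k in range(K):
    cost += cp.quad_form(x[:, k], Q) + cp.quad_form(u[:, k], R)
    constraints += [x[:, k + 1] == A @ x[:, k] + B @ u[:, k],  # dynamics
                    cp.norm(x[:, k + 1], "inf") <= 5.0,        # x_k in X_k
                    cp.norm(u[:, k], "inf") <= 1.0]            # u_k in U_k

cp.Problem(cp.Minimize(cost), constraints).solve()
u_star = u.value   # open-loop optimal sequence u_0*, ..., u_{K-1}*
```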
Open-loop optimal control
By solving the optimization in one shot, we get
$u_0^*, u_1^*, \dots, u_{K-1}^* \in \mathbb{R}^m$
instead of a state feedback
$u_0(x), u_1(x), \dots, u_{K-1}(x)$, like $u_t = \Gamma_t x_t$ (LQR)
[Figure: planned input $u_k$ and predicted state $x_k$ over the horizon $k = 0, \dots, K$]
→ open-loop optimal control strategy
[Figure: planned vs. applied input and predicted vs. measured state; at the current time ("now"), the measured state deviates from the predicted one]
What if a disturbance/model mismatch takes you elsewhere?
[Figure: previously planned vs. newly planned input, and previously predicted vs. newly predicted state, re-planned from the state measured now]
Recompute!
Incorporate the new information about the current state.
Markovian system: the current state is all you need to know to compute the optimal control for the future.
The previously computed optimal trajectory is invalid.
The computation of the new optimal trajectory needs to happen within one time step.
[Figure: input and state trajectories; as time advances, the $K$-step past/future window shifts forward]
Receding horizon
Apply only one control input, $u_0^*$
Use the new state measurement as the new initial condition
Same horizon length → receding horizon
If the system is time-invariant → same optimization problem, different $x_0$
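A minimal sketch of the receding-horizon loop (again assuming cvxpy; a `Parameter` makes "same optimization problem, different $x_0$" literal):

```python
# Sketch: receding-horizon MPC loop. Only u_0* is applied; the problem is
# re-solved at each step with the measured state as the new initial condition.
import cvxpy as cp
import numpy as np

n, m, K = 2, 1, 15
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q, R = np.eye(n), 0.1 * np.eye(m)

x0 = cp.Parameter(n)                       # the only thing that changes
x = cp.Variable((n, K + 1))
u = cp.Variable((m, K))
cost = sum(cp.quad_form(x[:, k], Q) + cp.quad_form(u[:, k], R) for k in range(K))
constraints = [x[:, 0] == x0, cp.abs(u) <= 1.0]
constraints += [x[:, k + 1] == A @ x[:, k] + B @ u[:, k] for k in range(K)]
problem = cp.Problem(cp.Minimize(cost), constraints)

state = np.array([1.0, 0.0])
for t in range(50):
    x0.value = state                       # new measurement -> new x_0
    problem.solve()
    u_applied = u.value[:, 0]              # apply only the first input
    state = A @ state + B @ u_applied      # plant step (disturbances would enter here)
```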
Receding-horizon control
The resulting open-loop optimal control problem corresponds to a parametric
optimization problem parametrized in the initial state x.
Parametric optimization
$$u_0^*(x) \;\text{determined by}\; \min_{u,\,x}\; \sum_{k=0}^{K-1} g_k(x_k, u_k) + g_K(x_K)$$
subject to $x_{k+1} = f(x_k, u_k)$
$x_0 = x$
$x_k \in \mathcal{X}_k, \quad u_k \in \mathcal{U}_k$
Feedforward or feedback?
[Block diagram: the feedback law $u_0^*(\cdot)$ maps the measured state $x_k$ to the input $u_k$ applied to the plant]
What do we know about the optimization problem?
Under continuity assumptions on $g$ and $f$, the minimum cost $V_K(x)$ is a continuous function of the initial state $x$.
Under compactness assumptions on the constraints, the parametric optimization problem has a solution whenever it is feasible.
However, the map $x \mapsto u_0^*(x)$ is not necessarily continuous.
MPC control law
MPC is a memory-less, nonlinear, time-invariant feedback control law.
Question: Can I achieve zero tracking error with an MPC controller?
Does MPC solve the optimal control problem?
Example
$$\min_{x,\,u}\; \sum_{k=0}^{K-1} a^k |u_k| \;+\; |x_K|^2, \qquad |a| < 1$$
subject to $x_{k+1} = x_k + u_k$
What is the solution $x, u$?
What is the MPC control law $x \mapsto u_0^*(x)$?
The optimal trajectory defined by
state cost function
input cost function
horizon length
constraints
is never realized (even with no disturbance or model mismatch)!
$$\min_{u,\,x}\; \sum_{k=0}^{K-1} x_k^\top Q x_k + u_k^\top R u_k \;+\; x_K^\top S x_K, \qquad Q, S \succeq 0, \; R \succ 0$$
subject to $x_{k+1} = A x_k + B u_k$
Finite-time LQR
Backward computation of $P_K, P_{K-1}, \dots, P_1$
Linear feedback $u_k = \Gamma_k x_k$ with $\Gamma_k = -(R + B^\top P_{k+1} B)^{-1} B^\top P_{k+1} A$
$u_0^{\text{LQR}} = -(R + B^\top P_1 B)^{-1} B^\top P_1 A \, x_0$
$u_1^{\text{LQR}} = -(R + B^\top P_2 B)^{-1} B^\top P_2 A \, (A x_0 + B u_0^{\text{LQR}})$
MPC
Backward computation of $P_K, P_{K-1}, \dots, P_1$
Linear time-invariant feedback $u_k = \Gamma_0 x_k$
$u_0^{\text{MPC}} = -(R + B^\top P_1 B)^{-1} B^\top P_1 A \, x_0 = u_0^{\text{LQR}}$
$u_1^{\text{MPC}} = -(R + B^\top P_1 B)^{-1} B^\top P_1 A \, (A x_0 + B u_0^{\text{MPC}})$
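The difference is easy to see in code; a sketch (plain NumPy, illustrative matrices) that computes the Riccati gains once and then rolls out both laws:

```python
# Sketch: finite-time LQR applies the time-varying gains Gamma_k;
# unconstrained linear MPC re-solves the same horizon at every step,
# which amounts to always applying Gamma_0.
import numpy as np

def riccati_gains(A, B, Q, R, S, K):
    P, gains = S, []
    for _ in range(K):                     # backward recursion from P_K = S
        G = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        gains.append(G)
        P = Q + A.T @ P @ A + A.T @ P @ B @ G
    return gains[::-1]                     # gains[k] = Gamma_k, k = 0..K-1

A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q, R, S, K = np.eye(2), 0.1 * np.eye(1), np.eye(2), 10

Gamma = riccati_gains(A, B, Q, R, S, K)
x_lqr = x_mpc = np.array([1.0, 0.0])
for k in range(K):
    x_lqr = (A + B @ Gamma[k]) @ x_lqr     # LQR: gain changes with k
    x_mpc = (A + B @ Gamma[0]) @ x_mpc     # MPC: first gain, every step
```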
Implementation of MPC
[Block diagram: the feedback law $u_0^*(\cdot)$ maps $x_k$ to $u_k$, applied to the plant]
Option 1: Compute the function $u_0^*(\cdot)$ offline
extremely complex except in very few cases
abundant offline computation power
large memory requirement
Option 2: Compute the value $u_0^*(x)$ online at every iteration
often a tractable optimization problem
limited time to perform the computation online
Option 1: Explicit computation of $u_0^*$
We already know one case.
Unconstrained linear MPC
→ first element of the finite-time LQR solution → linear feedback $u_0^*(\cdot)$
How about constraints?
Linear MPC + linear constraints∗
$$\min_{u,\,x}\; \sum_{k=0}^{K-1} x_k^\top Q x_k + u_k^\top R u_k \;+\; x_K^\top S x_K, \qquad Q, S \succeq 0, \; R \succ 0$$
subject to $x_{k+1} = A x_k + B u_k$
$x \in \mathcal{P}_x$ (polytope $Cx \le d$)
$u \in \mathcal{P}_u$ (polytope $Eu \le f$)
∗ Extremely common MPC problem formulation
Example
Consider the toy problem (time horizon 1, scalar integrator)
$$\min_{u_0,\,x_1}\; u_0^2 + x_1^2$$
subject to $x_1 = x_0 + u_0$
$x_1 \le 1$
Quadratic problem with linear constraints
Parametrized in $x_0$
The equality constraint (the model) allows us to eliminate $x_1$
Convex ⇒ KKT conditions are necessary and sufficient∗
KKT conditions (inequality constraints only)
For $\min_x f(x)$ subject to $g(x) \le 0$:
$$\nabla f(x) + \mu^\top \nabla g(x) = 0, \qquad \mu \ge 0, \qquad g(x) \le 0, \qquad \mu_i \, g_i(x) = 0 \;\; \forall i$$
$f(u_0) = u_0^2 + (x_0 + u_0)^2$
$g(u_0) = x_1 - 1 = x_0 + u_0 - 1$
$\nabla f(u_0) = 4u_0 + 2x_0$
$\nabla g(u_0) = 1$
Let's consider the complementary slackness condition $\mu \, g(u_0) = 0$.
Case 1: $\mu = 0$ (and $g(u_0) \le 0$)
We then have $\nabla f(u_0) = 0$, which means $u_0 = -x_0/2$.
With this input, $g(u_0)$ becomes $x_0/2 - 1$, so that $g(u_0) \le 0$ means $x_0 \le 2$.
If $x_0 \le 2$ then $u_0^* = -x_0/2$.
Case 2: $g(u_0) = 0$ (and $\mu \ge 0$)
$g(u_0) = 0$ implies that $u_0 = 1 - x_0$.
Because $\mu \ge 0$, we must have $\nabla f(u_0) \le 0$, which means $4u_0 + 2x_0 \le 0$.
With $u_0 = 1 - x_0$ this means $4 - 2x_0 \le 0$, that is, $x_0 \ge 2$.
If $x_0 \ge 2$ then $u_0^* = 1 - x_0$.
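A quick numeric sanity check of this piecewise law (a sketch assuming cvxpy):

```python
# Sketch: cross-check the explicit piecewise-affine law against a solver.
import cvxpy as cp

def u0_explicit(x0):
    return -x0 / 2 if x0 <= 2 else 1 - x0   # law derived from the KKT cases

for x0 in [-1.0, 0.5, 2.0, 3.0]:
    u0 = cp.Variable()
    problem = cp.Problem(cp.Minimize(u0**2 + (x0 + u0)**2), [x0 + u0 <= 1])
    problem.solve()
    print(x0, u0.value, u0_explicit(x0))     # solver and explicit law agree
```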
Piecewise-affine feedback
$$\min_{u_0,\,x_1}\; u_0^2 + x_1^2 \quad \text{subject to} \quad x_1 = x_0 + u_0, \quad x_1 \le 1$$
$u_0^*(\cdot)$ is continuous
$u_0^*(\cdot)$ is piecewise affine
The number of affine regions depends on the number of constraints (how?)
In each region, the same constraints are active/inactive
[Figure: the explicit law $u_0^*(x_0)$, equal to $-\tfrac{1}{2} x_0$ for $x_0 \le 2$ and to $1 - x_0$ for $x_0 \ge 2$, and the resulting state trajectory $x_t$]
Explicit MPC in practice
Tens or hundreds of constraints
exponentially many regions → very long offline optimization
even searching for the region is challenging in real time! → even longer offline preprocessing (binary search tree)
→ about 20 ms for $K = 5$, $n = 3$ (10 years ago)
[Flowchart: read the state $X$; check the state against the hyperplanes $H_i X < K_i$ ($n$ multiplications and $n+3$ memory accesses per test, about $\log N$ iterations with a binary search tree); when the region $i$ is found, compute the control law $U = F_i X + G_i$ ($mn$ multiplications, $m(n+1)$ memory accesses)]
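The online evaluation amounts to point location plus one affine map; a sketch (linear scan shown, which the binary search tree replaces in practice):

```python
# Sketch: evaluating an explicit MPC law online.
# Each region i is {x : H_i x <= K_i} with control law u = F_i x + G_i.
import numpy as np

def explicit_mpc(x, regions):
    for H, K, F, G in regions:               # a tree would make this O(log N)
        if np.all(H @ x <= K + 1e-9):
            return F @ x + G
    raise ValueError("x outside the feasible set")

# The two regions of the toy example above (scalar state and input):
regions = [
    (np.array([[1.0]]),  np.array([2.0]),  np.array([[-0.5]]), np.array([0.0])),  # x0 <= 2
    (np.array([[-1.0]]), np.array([-2.0]), np.array([[-1.0]]), np.array([1.0])),  # x0 >= 2
]
print(explicit_mpc(np.array([3.0]), regions))  # -> [-2.], i.e. u0 = 1 - x0
```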
Multi-Parametric Toolbox (MPT) from ETH
http://control.ee.ethz.ch/~mpt/
Option 2: online computation
[Image from www.embotech.com, an ETH spin-off]
Online MPC is increasingly common!
custom optimization solvers
approximate solutions of the optimization problem
distributed solvers
faster hardware
Stability
MPC as a feedback design tool
[Block diagram: the expert (you) chooses $K, Q, R, S$; the resulting MPC law $u_0^*(\cdot)$ maps the measured state $x_k$ to the input $u_k$ applied to the plant]
Ultimately, MPC is just a “clever” (principled) way to design static, time-invariant,
non-linear feedback laws.
Closed-loop stability
Analysis: compute $u_0^*$ (hard!) and assess stability
Design: derive conditions on $K, Q, R, S$ that guarantee closed-loop stability
Tools:
Small-signal (local) stability?
Large-signal (global) stability?
Lyapunov analysis
Stable equilibrium
The equilibrium $x = 0$ is stable for the dynamics $x_{k+1} = f(x_k)$ if a small perturbation of the state perturbs the subsequent state trajectory in a continuous manner:
$$\forall \epsilon > 0, \; \exists \delta > 0 \;\text{such that}\; \|x_0\| < \delta \;\Rightarrow\; \|x_k\| < \epsilon \;\; \forall k \ge 0.$$
Asymptotically stable equilibrium
The equilibrium $x = 0$ is asymptotically stable if it is stable and $\lim_{k \to \infty} \|x_k\| = 0$.
Lyapunov theorem
Let $W$ be a real-valued function such that
$W(0) = 0$ and $W(x) > 0 \;\; \forall x \neq 0$
$W(f(x)) < W(x) \;\; \forall x \neq 0$
Then $x = 0$ is asymptotically stable.
→ https://arxiv.org/abs/1809.05289
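A standard illustration (a known fact, not from the slides): for linear dynamics $x_{k+1} = A x_k$, a quadratic candidate $W$ works whenever the discrete Lyapunov inequality holds:
$$W(x) = x^\top P x, \quad P \succ 0, \quad A^\top P A - P \prec 0 \;\Rightarrow\; W(f(x)) - W(x) = x^\top (A^\top P A - P)\, x < 0 \quad \forall x \neq 0.$$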
Stability of infinite horizon MPC
At time $k$, the MPC controller determines a trajectory $u^*$ that minimizes
$$V_k^\infty(x, u) = \sum_{j=k}^{\infty} g_j(x_j, u_j)$$
By Bellman's principle, this trajectory also minimizes the "tail" cost
$$V_{k+1}^\infty(x, u) = \sum_{j=k+1}^{\infty} g_j(x_j, u_j)$$
[Figure: state trajectory; the tail from $k+1$ onward remains optimal]
This makes $W(x) = \min_{u,\,x} V_k^\infty(x, u)$ an obvious candidate Lyapunov function, under very reasonable assumptions (for example on the stage costs $g_j$):
it is zero only at the target equilibrium $x = 0$
it decreases along the closed-loop trajectories of the system:
$$W(x_{k+1}) = W(x_k) - g_k(x_k, u_0^*(x_k)) < W(x_k)$$
⇒ the equilibrium $x = 0$ is asymptotically stable for the closed-loop system.
Some subtleties
Technically, we have not defined the MPC controller via an infinite-dimensional optimization problem
We are assuming that the global minimum is found and attainable
The functions g need to satisfy some properties
(remember the observability assumptions on infinite-time LQR?)
Finite-time MPC stability
The same trick clearly does not apply to finite-time receding-horizon MPC.
[Figure: at time $k$ the planned trajectory spans $k, \dots, k+K$; at time $k+1$ it spans $k+1, \dots, k+1+K$]
At k + 1, a different cost is minimized → different minimizer x, u
The first element of the cost is removed, but a new element enters the tail.
Stability with terminal constraint
$$W(x) = \min_{u,\,x}\; \sum_{k=0}^{K-1} g_k(x_k, u_k)$$
subject to $x_{k+1} = f(x_k, u_k)$
$x_0 = x, \quad x_K = 0$
[Figure: at time $k$ the planned trajectory ends at $x_{k+K} = 0$; at time $k+1$ the new planned trajectory ends at $x_{k+1+K} = 0$]
Theorem
The equilibrium x = 0 is asymptotically stable for the closed-loop dynamics
generated by a finite-time MPC controller with zero terminal constraint.
$$W(x_{k+1}) = \min \sum_{s=k+1}^{k+K} g(x_s, u_s) = \min \left( \sum_{s=k}^{k+K-1} g(x_s, u_s) \;-\; g(x_k, u_k) \;+\; g(x_{k+K}, u_{k+K}) \right)$$
By definition of the minimum, this is smaller than the same quantity evaluated for an arbitrary $u$.
Let's evaluate it for the input that minimizes the optimization problem at time $k$, that is
$$\tilde u = \arg\min \sum_{s=k}^{k+K-1} g(x_s, u_s),$$
which yields $x_{k+K} = 0$, followed by $u_{k+K} = 0$, which maintains the system at $x_{k+K+1} = 0$ (so that the appended term $g(x_{k+K}, u_{k+K}) = g(0, 0) = 0$). We then have
$$W(x_{k+1}) \le W(x_k) - g(\tilde x_k, \tilde u_k) \le W(x_k).$$
Key proof idea
[Figure: key proof idea. Top: optimal trajectory at time $k$ with $x_{k+K} = 0$, cost $W(x_k)$. Middle: the same trajectory shifted to time $k+1$ and extended with a zero input still satisfies $x_{k+1+K} = 0$, giving an upper bound on $W(x_{k+1})$. Bottom: the re-optimized trajectory at time $k+1$ with $x_{k+1+K} = 0$, cost $W(x_{k+1})$.]
Receding horizon as an approximation of infinite horizon
Infinite horizon
$$\sum_{k=0}^{\infty} g(x_k, u_k)$$
Receding horizon
$$\sum_{k=0}^{K-1} g(x_k, u_k)$$
Receding horizon with terminal cost
$$\sum_{k=0}^{K-1} g(x_k, u_k) + g_K(x_K)$$
Receding horizon with terminal constraint
$$\sum_{k=0}^{K-1} g(x_k, u_k), \qquad x_K \in \mathcal{X}_K$$
MPC closed-loop stability
Many alternative stability conditions have been derived:
$K$ large enough
zero terminal constraint
terminal region for the state
...
Subtleties
We always assumed feasibility of the MPC problem, which may not hold (for example with a zero terminal constraint).
Future feasibility of the problem may depend on the current control actions.
We always assumed that the global minimum can be found.
Steady-state selection
$$\min_{u,\,x}\; \sum_{t=0}^{T-1} x_t^\top Q x_t + u_t^\top R u_t \;+\; x_T^\top S x_T, \qquad Q, S \succeq 0, \; R \succ 0$$
subject to $x_{t+1} = A x_t + B u_t$
$x \in \mathcal{X}, \quad u \in \mathcal{U}$
We have implicitly assumed that we are trying to regulate the system to the steady state
$$x_s = 0, \qquad u_s = 0$$
However
we usually want plants to operate at a desirable working point $x_s$ (not to come to a stop at $x_s = 0$!)
we can often allow a constant input $u_s$
Incremental formulation of MPC
Remember: MPC is “just” a well-designed non-linear static feedback law.
Standard MPC
[Block diagram: $u_0^*(\cdot) \to u_k \to$ plant, with state feedback $x_k$]
If a non-zero input is necessary to drive the system to zero, then MPC will not achieve $x = 0$.
Particularly fragile against
▶ input disturbances
▶ process noise
▶ multiplicative noise/uncertainty
Very low input weights may cause numerical issues.
Incremental MPC
[Block diagram: $\Delta u_0^*(\cdot) \to \Delta u_k \to \frac{1}{z-1} \to u_k \to$ plant, with state feedback $x_k$]
$u_{k+1} = u_k + \Delta u_k$
augment the plant with an input integrator (if not present already!)
integral nonlinear feedback law
the input weight penalizes changes in the input signal
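A sketch of the corresponding plant augmentation (plain NumPy; the augmented state is $(x_k, u_{k-1})$ and the MPC decision variable becomes $\Delta u_k$):

```python
# Sketch: augment the plant with an input integrator so that the MPC
# decision variable is Delta u_k and u_{k+1} = u_k + Delta u_k.
import numpy as np

def augment_with_input_integrator(A, B):
    n, m = B.shape
    A_aug = np.block([[A, B],
                      [np.zeros((m, n)), np.eye(m)]])   # carries u_{k-1} along
    B_aug = np.vstack([B, np.eye(m)])                   # injects Delta u_k
    return A_aug, B_aug   # z_{k+1} = A_aug z_k + B_aug Delta u_k, z = (x, u_prev)

A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
A_aug, B_aug = augment_with_input_integrator(A, B)
```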
Non-zero steady state
If a desired equilibrium $(x_s, u_s)$ is known, MPC can simply be used to stabilize the system around that point.
Change of coordinates: $\tilde x = x - x_s$, $\tilde u = u - u_s$.
For linear systems, the change of variables is inconsequential:
$$V(x_0) = \min_{\tilde x,\,\tilde u}\; \sum_{k=0}^{K-1} \tilde x_k^\top Q \tilde x_k + \tilde u_k^\top R \tilde u_k + \tilde x_K^\top S \tilde x_K \quad \text{subject to} \quad \tilde x_{k+1} = A \tilde x_k + B \tilde u_k$$
[Block diagram: $x_s$ is subtracted from $x_k$ to form $\tilde x_k$; the MPC law $\tilde u_0^*(\cdot)$ produces $\tilde u_k$; adding $u_s$ yields the input $u_k$ applied to the plant]
Steady-state selection
How do you decide the steady state (equilibrium) $(x_s, u_s)$?
In some cases, it can be derived from first principles
(e.g., $q = a$, $\dot q = 0$, $d = D$, $\dot d = 0$, $u = g \sin a$).
In other cases, specifications indicate a "desired" steady state $(x_\text{spec}, u_\text{spec})$.
Steady-state selection problem
$$\min_{x_s,\,u_s}\; \|x_s - x_\text{spec}\|^2_{Q_s} + \|u_s - u_\text{spec}\|^2_{R_s}$$
subject to $\begin{bmatrix} I - A & -B \end{bmatrix} \begin{bmatrix} x_s \\ u_s \end{bmatrix} = 0$ (a valid steady state)
$x_s \in \mathcal{X}, \quad u_s \in \mathcal{U}$
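A sketch of this selection step (assuming cvxpy; $Q_s$, $R_s$ taken as identity and boxes standing in for $\mathcal{X}$, $\mathcal{U}$):

```python
# Sketch: steady-state target selection as a small QP.
import cvxpy as cp
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
n, m = B.shape
x_spec, u_spec = np.array([1.0, 0.0]), np.zeros(m)

xs, us = cp.Variable(n), cp.Variable(m)
cost = cp.sum_squares(xs - x_spec) + cp.sum_squares(us - u_spec)
constraints = [(np.eye(n) - A) @ xs - B @ us == 0,     # a valid steady state
               cp.norm(xs, "inf") <= 5.0,
               cp.norm(us, "inf") <= 1.0]
cp.Problem(cp.Minimize(cost), constraints).solve()
print(xs.value, us.value)
```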
MPC control architecture
[Block diagram: an offline steady-state optimization produces $(x_s, u_s)$, which parametrizes the online MPC; the MPC measures $x_k$ and applies $u_k$ to the plant]
More on this later...
Disturbance and zero-offset
Not surprisingly, MPC relies on having access to a model of the system.
Model mismatch
▶ what is the effect of an approximate model on the feedback gain?
▶ is $(x_s, u_s)$ still a steady state?
Output/measurement noise
▶ what is the effect of additive noise on the measurements?
Process noise
▶ what is the effect of additive noise on the input?
$$x_{k+1} = A x_k + B u_k + B_d d_k, \qquad y_k = C x_k + C_d d_k$$
Desired behavior (without invoking tools like robust MPC)
In the presence of any of these disturbances, we want either
zero steady-state offset, or
an unbounded trajectory (instability), or
active constraints.
Disturbance rejection in MPC
1 Add a model for the disturbance
2 Based on the augmented model and the available measurements, estimate
the disturbance
3 Correct the steady-state target based on the estimate of the disturbance
[Block diagram: an estimator uses $u_k$ and $x_k$ to produce $\hat d_k$; the steady-state target selection turns $\hat d_k$ into the target $(x_s, u_s)$; the MPC, based on the model $\tilde x_{k+1} = A \tilde x_k + B \tilde u_k$ and the weights $Q, R$, computes $u_k$ for the plant]
Step 1: Add a model for the disturbance
Unless better information is available, a constant model is appropriate:
$$d_{k+1} = d_k, \qquad d_k \in \mathbb{R}^{n_d}, \quad n_d \le n$$
Good regulator theorem
“Every good regulator of a system must be a model of that system”
If a better model is available, use it!
periodic disturbance? ramps?
see Internal Model Principle
the closed-loop system will reject disturbances of the class that you modeled
how many integrators do you need to reject a step disturbance? a ramp?
$$\begin{bmatrix} x_{k+1} \\ d_{k+1} \end{bmatrix} = \begin{bmatrix} A & B_d \\ 0 & I \end{bmatrix} \begin{bmatrix} x_k \\ d_k \end{bmatrix} + \begin{bmatrix} B \\ 0 \end{bmatrix} u_k$$
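A sketch of this augmentation (plain NumPy, constant-disturbance model):

```python
# Sketch: state + constant-disturbance augmentation used for estimation.
import numpy as np

def augment_with_disturbance(A, B, Bd):
    n, m = B.shape
    nd = Bd.shape[1]
    A_aug = np.block([[A, Bd],
                      [np.zeros((nd, n)), np.eye(nd)]])  # d_{k+1} = d_k
    B_aug = np.vstack([B, np.zeros((nd, m))])            # u_k does not drive d
    return A_aug, B_aug
```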
Step 2: Estimate the disturbance
[Block diagram as above: the estimator produces $\hat d_k$ from the plant data $u_k$, $x_k$]
For example, a Luenberger state estimator:
prediction of $x_{k+1}$ based on the measurements $x_k, u_k$ and the estimate $\hat d_k$
$$\hat x_{k+1} = A x_k + B u_k + B_d \hat d_k$$
correction based on the prediction error
$$\hat d_{k+1} = \hat d_k + L (x_k - \hat x_k)$$
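A sketch of this update (plain NumPy; the gain `L` is assumed given, its design to make the error dynamics stable is not shown):

```python
# Sketch: one step of the Luenberger disturbance estimator from the slide.
import numpy as np

def estimator_step(x_meas, x_hat, d_hat, u, A, B, Bd, L):
    # Correction: update the disturbance estimate with the prediction error.
    d_hat_new = d_hat + L @ (x_meas - x_hat)
    # Prediction: next-state estimate from the measured state and d_hat.
    x_hat_next = A @ x_meas + B @ u + Bd @ d_hat
    return x_hat_next, d_hat_new
```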
Step 3: Correct the steady-state target
Based on the constant disturbance model, $\hat d_k$ is our best estimate of the steady-state disturbance $d_s$.
Modified steady-state selection problem
$$\min_{x_s,\,u_s}\; \|x_s - x_\text{spec}\|^2_{Q_s} + \|u_s - u_\text{spec}\|^2_{R_s}$$
subject to $\begin{bmatrix} I - A & -B \end{bmatrix} \begin{bmatrix} x_s \\ u_s \end{bmatrix} = B_d \hat d_k$
$x_s \in \mathcal{X}, \quad u_s \in \mathcal{U}$
[Block diagram as above: the estimated disturbance $\hat d_k$ now feeds the steady-state target selection]
And more...
Output-feedback MPC
Terminal constraints
More powerful stability guarantees
Examples
Chapters 1, 2, 7 of
Model Predictive Control:
Theory, Computation, and Design
James B. Rawlings, David Q. Mayne, Moritz M. Diehl
2nd edition, 2022
Free to download
https://sites.engineering.ucsb.edu/~jbraw/mpc/
This work is licensed under a
Creative Commons Attribution-ShareAlike 4.0 International License