Numerical Optimal Control

Moritz Diehl
Simplified Optimal Control Problem in ODE
[Figure: states x(t) evolving from the initial value x0 over [0, T], driven by controls u(t), subject to path constraints h(x, u) ≥ 0 and the terminal constraint r(x(T)) ≥ 0.]

minimize    ∫_0^T L(x(t), u(t)) dt + E(x(T))
x(·), u(·)

subject to

x(0) − x0 = 0,                            (fixed initial value)
ẋ(t) − f(x(t), u(t)) = 0,   t ∈ [0, T],   (ODE model)
h(x(t), u(t)) ≥ 0,          t ∈ [0, T],   (path constraints)
r(x(T)) ≥ 0.                              (terminal constraints)
More general optimal control problems

Many features left out here for simplicity of presentation:


▶ multiple dynamic stages
▶ differential algebraic equations (DAE) instead of ODE
▶ explicit time dependence
▶ constant design parameters
▶ multipoint constraints r(x(t0), x(t1), . . . , x(tend)) = 0
Optimal Control Family Tree

Three basic families:


▶ Hamilton-Jacobi-Bellman equation / dynamic programming
▶ Indirect Methods / calculus of variations / Pontryagin
▶ Direct Methods (control discretization)
Principle of Optimality

Any subarc of an optimal trajectory is also optimal.

[Figure: optimal trajectory from initial value x0 at t = 0, passing through an intermediate state value x̄ at time t̄, with the optimal controls u(t) shown below.]

The subarc on [t̄, T] is an optimal solution for initial value x̄.


Dynamic Programming Cost-to-go
IDEA:
▶ Introduce the optimal cost-to-go function on [t̄, T]:

  J(x̄, t̄) := min_{x,u} ∫_t̄^T L(x, u) dt + E(x(T))   s.t. x(t̄) = x̄, . . .

▶ Introduce a grid 0 = t0 < . . . < tN = T.

▶ Use the principle of optimality on the intervals [tk, tk+1]:

  J(xk, tk) = min_{x,u} ∫_{tk}^{tk+1} L(x, u) dt + J(x(tk+1), tk+1)   s.t. x(tk) = xk, . . .

[Figure: time axis with the interval [tk, tk+1] and the states xk and x(tk+1) marked.]
Dynamic Programming Recursion

Starting from J(x, tN) = E(x), compute recursively backwards, for k = N − 1, . . . , 0:

  J(xk, tk) := min_{x,u} ∫_{tk}^{tk+1} L(x, u) dt + J(x(tk+1), tk+1)   s.t. x(tk) = xk, . . .

by solution of short-horizon problems for all possible xk and tabulation in state space.
[Figure: the cost-to-go functions J(·, tN), J(·, tN−1), . . . , J(·, t0) are tabulated one after another, backwards over the state space.]
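To make the tabulation concrete, here is a minimal Python sketch of the backward recursion, applied to the scalar test problem used later in these slides (ẋ = (1 + x)x + u, L = x² + u², E ≡ 0). The grids, the explicit Euler step, and the linear interpolation of J are illustrative choices, not the lecture's own setup.

```python
import numpy as np

T, N = 3.0, 30                          # horizon and number of intervals
dt = T / N
x_grid = np.linspace(-1.0, 1.0, 201)    # tabulation grid for the state
u_grid = np.linspace(-1.0, 1.0, 41)     # admissible controls, |u| <= 1

def f(x, u):                            # ODE right-hand side
    return (1.0 + x) * x + u

def L(x, u):                            # running cost
    return x**2 + u**2

J = np.zeros_like(x_grid)               # J(x, t_N) = E(x) = 0 here

for k in range(N - 1, -1, -1):          # backwards: k = N-1, ..., 0
    J_new = np.empty_like(J)
    for i, x in enumerate(x_grid):
        # one Euler step per candidate control, then look up the
        # tabulated cost-to-go by linear interpolation
        x_next = x + dt * f(x, u_grid)
        cost = dt * L(x, u_grid) + np.interp(x_next, x_grid, J)
        J_new[i] = cost.min()           # minimize over the control grid
    J = J_new                           # this is J(., t_k)
```

The minimizing u in the inner loop is precisely the optimal feedback control at (x, tk). The work grows with the product of the grid sizes, which in higher state dimensions becomes the curse of dimensionality discussed on the next slides.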
Hamilton-Jacobi-Bellman (HJB) Equation

▶ Dynamic programming with infinitely small timesteps leads to the Hamilton-Jacobi-Bellman (HJB) equation:

  −∂J/∂t (x, t) = min_u [ L(x, u) + ∂J/∂x (x, t) f(x, u) ]   s.t. h(x, u) ≥ 0.

▶ Solve this partial differential equation (PDE) backwards for t ∈ [0, T], starting at the end of the horizon with J(x, T) = E(x).

▶ NOTE: The optimal controls for state x at time t are obtained from

  u*(x, t) = arg min_u [ L(x, u) + ∂J/∂x (x, t) f(x, u) ]   s.t. h(x, u) ≥ 0.
Dynamic Programming / HJB
▶ “Dynamic programming” applies to discrete time systems, “HJB” to continuous time systems.
▶ Pros and cons:
  + Searches the whole state space, finds the global optimum.
  + Optimal feedback controls are precomputed.
  + Analytic solution to some problems is possible (linear systems with quadratic cost → Riccati equation; a scalar sketch follows at the end of this slide).
  + “Viscosity solutions” (Lions et al.) exist for quite general nonlinear problems.
  - But: in general intractable, because it is a partial differential equation (PDE) in a high dimensional state space: the “curse of dimensionality”.
▶ Possible remedy: approximate J, e.g. in the framework of neuro-dynamic programming [Bertsekas 1996].
▶ Used for practical optimal control of small scale systems, e.g. by Bonnans, Zidani, Lee, Back, . . .
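As a worked instance of the Riccati bullet above, here is a scalar sketch; the dynamics ẋ = a x + b u and the unit cost weights are assumed for illustration, not taken from the slides.

```latex
\text{Scalar LQR: } \dot{x} = a x + b u,\quad
\min \int_0^T \big(x^2 + u^2\big)\,dt,\quad E \equiv 0.\\
\text{Quadratic ansatz } J(x,t) = p(t)\,x^2 \text{ in the HJB equation }
-\partial_t J = \min_u \big\{ x^2 + u^2 + \partial_x J\,(a x + b u) \big\}\\
\text{gives } u^* = -b\,p(t)\,x \text{ and, comparing coefficients of } x^2,
\text{ the scalar Riccati ODE}\\
-\dot{p} = 1 + 2\,a\,p - b^2 p^2,\qquad p(T) = 0.
```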
Indirect Methods

For simplicity, regard only the problem without inequality constraints:


[Figure: states x(t) from initial value x0, controls u(t), and the terminal cost E(x(T)).]

minimize    ∫_0^T L(x(t), u(t)) dt + E(x(T))
x(·), u(·)

subject to

x(0) − x0 = 0,                            (fixed initial value)
ẋ(t) − f(x(t), u(t)) = 0,   t ∈ [0, T].   (ODE model)
Pontryagin’s Minimum Principle
OBSERVATION: In the HJB equation, the optimal controls

  u*(t) = arg min_u [ L(x, u) + ∂J/∂x (x, t) f(x, u) ]

depend only on the derivative ∂J/∂x (x, t), not on J itself!

IDEA: Introduce the adjoint variables

  λ(t) := ∂J/∂x (x(t), t)ᵀ ∈ R^nx

and get the controls from Pontryagin’s Minimum Principle:

  u*(t, x, λ) = arg min_u [ L(x, u) + λᵀ f(x, u) ],

where H(x, u, λ) := L(x, u) + λᵀ f(x, u) is called the Hamiltonian.

QUESTION: How to obtain λ(t)?
Adjoint Differential Equation

▶ Differentiate the HJB equation

  −∂J/∂t (x, t) = min_u H(x, u, ∂J/∂x (x, t)ᵀ)

with respect to x and obtain the adjoint equation:

  −λ̇ᵀ = ∂H/∂x (x(t), u*(t, x, λ), λ(t)).

▶ Likewise, differentiate J(x, T) = E(x) with respect to x and obtain the terminal condition

  λ(T)ᵀ = ∂E/∂x (x(T)).
How to obtain an explicit expression for the controls?

▶ In the simplest case,

  u*(t) = arg min_u H(x(t), u, λ(t))

is defined by

  ∂H/∂u (x(t), u*(t), λ(t)) = 0

(calculus of variations, Euler-Lagrange).

▶ In the presence of path constraints, the expression for u*(t) changes whenever the set of active constraints changes. This leads to state dependent switches.

▶ If the minimum of the Hamiltonian is locally not unique, “singular arcs” occur. Their treatment needs higher order derivatives of H.
Necessary Optimality Conditions

Summarize the optimality conditions as a boundary value problem:

x(0) = x0,                                              (initial value)
ẋ(t) = f(x(t), u*(t)),                   t ∈ [0, T],    (ODE model)
−λ̇(t) = ∂H/∂x (x(t), u*(t), λ(t))ᵀ,      t ∈ [0, T],    (adjoint equations)
u*(t) = arg min_u H(x(t), u, λ(t)),      t ∈ [0, T],    (minimum principle)
λ(T) = ∂E/∂x (x(T))ᵀ.                                   (adjoint final value)

Solve with so-called
▶ gradient methods,
▶ shooting methods, or
▶ collocation
(a small numerical example follows below).
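A minimal sketch of solving such a boundary value problem with SciPy's collocation-based BVP solver. The example is a hypothetical unconstrained linear problem (ẋ = x + u, L = x² + u², E ≡ 0), chosen so that u* = arg min_u H has the closed form u* = −λ/2; it is not one of the lecture's examples.

```python
import numpy as np
from scipy.integrate import solve_bvp

T, x0 = 3.0, 1.0

# H(x, u, lam) = x^2 + u^2 + lam*(x + u);  dH/du = 0  =>  u* = -lam/2
def ode(t, y):                            # y = [x, lam], shape (2, m)
    x, lam = y
    u = -lam / 2.0
    return np.vstack([x + u,              # state equation  xdot = f
                      -(2.0 * x + lam)])  # adjoint: -lamdot = dH/dx

def bc(ya, yb):
    return np.array([ya[0] - x0,          # x(0)   = x0
                     yb[1]])              # lam(T) = dE/dx = 0

t = np.linspace(0.0, T, 50)
sol = solve_bvp(ode, bc, t, np.zeros((2, t.size)))
u_opt = -sol.sol(t)[1] / 2.0              # recover the optimal controls
```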
Indirect Methods

▶ “First optimize, then discretize.”
▶ Pros and cons:
  + Boundary value problem with only 2 × nx ODEs.
  + Can treat large scale systems.
  - Only necessary conditions for local optimality.
  - Need an explicit expression for u*(t); singular arcs are difficult to treat.
  - The ODE is strongly nonlinear and unstable.
  - Inequalities lead to an ODE with state dependent switches. Possible remedy: use an interior point method in function space for the inequalities, e.g. Weiser and Deuflhard, Bonnans and Laurent-Varin.
▶ Used for optimal control e.g. in satellite orbit planning at CNES . . .
Direct Methods

▶ “First discretize, then optimize.”
▶ Transcribe the infinite dimensional problem into a finite dimensional Nonlinear Programming Problem (NLP), and solve the NLP.
▶ Pros and cons:
  + Can use state-of-the-art methods for the NLP solution.
  + Can treat inequality constraints and multipoint constraints much more easily.
  - Obtains only a suboptimal/approximate solution.
▶ Nowadays the most commonly used methods, due to their easy applicability and robustness.
Direct Methods Overview

We treat three direct methods:


▶ Direct Single Shooting (sequential simulation and optimization)
▶ Direct Collocation (simultaneous simulation and optimization)
▶ Direct Multiple Shooting (simultaneous resp. hybrid)
Direct Single Shooting [Hicks1971,Sargent1978]

Discretize the controls u(t) on a fixed grid 0 = t0 < t1 < . . . < tN = T, and regard the states x(t) on [0, T] as dependent variables.

[Figure: states x(t; q) simulated from x0, above the piecewise constant discretized controls u(t; q) with values q0, q1, . . . , qN−1.]

Use numerical integration to obtain the state as a function x(t; q) of the finitely many control parameters q = (q0, q1, . . . , qN−1), as sketched below.
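A minimal sketch of this step: piecewise constant controls on a uniform grid and a fixed-step RK4 integrator, using the dynamics of the test problem introduced a few slides below. A production implementation would use an adaptive ODE/DAE solver instead of fixed steps.

```python
import numpy as np

def simulate(q, x0, T, steps_per_interval=10):
    """Return the states at the grid points t_0, ..., t_N for controls q."""
    N = len(q)
    h = T / N / steps_per_interval

    def f(x, u):                       # test problem dynamics (assumed here)
        return (1.0 + x) * x + u

    xs, x = [x0], x0
    for qi in q:                       # u(t) = q_i on [t_i, t_{i+1}]
        for _ in range(steps_per_interval):
            k1 = f(x, qi)
            k2 = f(x + 0.5 * h * k1, qi)
            k3 = f(x + 0.5 * h * k2, qi)
            k4 = f(x + h * k3, qi)
            x = x + h / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        xs.append(x)
    return np.array(xs)                # x(t_i; q), i = 0, ..., N
```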
NLP in Direct Single Shooting

After control discretization and numerical ODE solution, obtain the NLP:

minimize    ∫_0^T L(x(t; q), u(t; q)) dt + E(x(T; q))
    q

subject to

h(x(ti; q), u(ti; q)) ≥ 0,   i = 0, . . . , N,   (discretized path constraints)
r(x(T; q)) ≥ 0.                                  (terminal constraints)

Solve with a finite dimensional optimization solver, e.g. Sequential Quadratic Programming (SQP).
Solution by Standard SQP

Summarize the problem as

  min_q F(q)   s.t.   H(q) ≥ 0.

Solve e.g. by Sequential Quadratic Programming (SQP), starting with a guess q^0 for the controls; k := 0.

1. Evaluate F(q^k), H(q^k) by ODE solution, and their derivatives!
2. Compute a correction ∆q^k by solving the QP:

   min_{∆q} ∇F(q^k)ᵀ ∆q + ½ ∆qᵀ A^k ∆q   s.t.   H(q^k) + ∇H(q^k)ᵀ ∆q ≥ 0.

3. Perform the step q^{k+1} = q^k + α_k ∆q^k with step length α_k determined by line search.
ODE Sensitivities

How to compute the sensitivity ∂x(t; q)/∂q of a numerical ODE solution x(t; q) with respect to the controls q?

Four ways:
1. External Numerical Differentiation (END) (sketched below)
2. Variational Differential Equations
3. Automatic Differentiation
4. Internal Numerical Differentiation (IND)
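Way 1 is the easiest to sketch: perturb each control parameter and re-integrate. The snippet below reuses the `simulate` function from the single shooting sketch above. With adaptive solvers, END is the least reliable of the four, because the solver's adaptive components (step sizes, orders) also react to each perturbation.

```python
import numpy as np

def sensitivities_end(q, x0, T, eps=1e-6):
    """Forward-difference approximation of d x(t_i; q) / d q_j (END)."""
    base = simulate(q, x0, T)          # nominal trajectory
    cols = []
    for j in range(len(q)):
        qp = np.array(q, dtype=float)
        qp[j] += eps                   # perturb one control parameter
        cols.append((simulate(qp, x0, T) - base) / eps)
    return np.column_stack(cols)       # one column per q_j
```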
Numerical Test Problem

minimize    ∫_0^3 ( x(t)² + u(t)² ) dt
x(·), u(·)

subject to

x(0) = x0,                          (initial value)
ẋ = (1 + x) x + u,    t ∈ [0, 3],   (ODE model)
−1 ≤ x(t) ≤ 1,        t ∈ [0, 3],   (bounds)
−1 ≤ u(t) ≤ 1,        t ∈ [0, 3],   (bounds)
x(3) = 0.                           (zero terminal constraint)

Remark: With |u| ≤ 1 we have ẋ ≥ (1 + x)x − 1, so growth is uncontrollable for

  (1 + x0) x0 − 1 ≥ 0  ⇔  x0 ≥ (√5 − 1)/2 ≈ 0.618.
Single Shooting Optimization for x0 = 0.05

▶ Choose N = 30 equal control intervals.
▶ Initialize with the steady state controls u(t) ≡ 0.
▶ The initial value x0 = 0.05 is about the maximum possible, because the initial trajectory explodes otherwise: with u ≡ 0, the solution of ẋ = (1 + x)x escapes to infinity at time ln(1 + 1/x0), which exceeds T = 3 only for roughly x0 ≤ 0.05.
[Figures: single shooting iterations on the test problem, showing the initialization, iterations 1 to 6, and the 7th iteration, which is the solution. A sketch reproducing this experiment follows below.]
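A sketch reproducing this experiment, with SciPy's SLSQP solver standing in for the tailored SQP method of the slides (derivatives by finite differences, so the iteration count will differ) and `simulate` being the RK4 sketch from above.

```python
import numpy as np
from scipy.optimize import minimize

N, T, x0 = 30, 3.0, 0.05
dt = T / N

def objective(q):
    xs = simulate(q, x0, T)
    # rectangle rule for the Lagrange term  int (x^2 + u^2) dt
    return float(np.sum((xs[:-1]**2 + q**2) * dt))

cons = [
    {'type': 'eq',   'fun': lambda q: simulate(q, x0, T)[-1]},   # x(3) = 0
    {'type': 'ineq', 'fun': lambda q: 1.0 - simulate(q, x0, T)}, # x <= 1
    {'type': 'ineq', 'fun': lambda q: 1.0 + simulate(q, x0, T)}, # x >= -1
]

res = minimize(objective, np.zeros(N), method='SLSQP',  # u = 0 init
               bounds=[(-1.0, 1.0)] * N, constraints=cons)
q_opt = res.x
```

For noticeably larger x0 the very first simulation already blows up, which illustrates why single shooting needs the cautious initialization described above.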
Direct Single Shooting: Pros and Cons

▶ Sequential simulation and optimization.
  + Can use state-of-the-art ODE/DAE solvers.
  + Few degrees of freedom even for large ODE/DAE systems.
  + Active set changes are easily treated.
  + Need only an initial guess for the controls q.
  - Cannot use knowledge of x in the initialization (e.g. in tracking problems).
  - The ODE solution x(t; q) can depend very nonlinearly on q.
  - Unstable systems are difficult to treat.
▶ Often used in engineering applications, e.g. in the packages gOPT (PSE), DYOS (Marquardt), . . .
Direct Collocation (Sketch) [Tsang1975]

▶ Discretize controls and states on a fine grid, with node values si ≈ x(ti).
▶ Replace the infinite dimensional ODE constraint

  0 = ẋ(t) − f(x(t), u(t)),   t ∈ [0, T],

by finitely many equality constraints

  ci(qi, si, si+1) = 0,   i = 0, . . . , N − 1,

  e.g.  ci(qi, si, si+1) := (si+1 − si)/(ti+1 − ti) − f( (si + si+1)/2, qi ).

▶ Also approximate the integrals, e.g.

  ∫_{ti}^{ti+1} L(x(t), u(t)) dt ≈ li(qi, si, si+1) := L( (si + si+1)/2, qi ) (ti+1 − ti).
NLP in Direct Collocation

After discretization, obtain a large scale but sparse NLP:

minimize    Σ_{i=0}^{N−1} li(qi, si, si+1) + E(sN)
   s, q

subject to

s0 − x0 = 0,                                         (fixed initial value)
ci(qi, si, si+1) = 0,         i = 0, . . . , N − 1,   (discretized ODE model)
h(si, qi) ≥ 0,                i = 0, . . . , N,       (discretized path constraints)
r(sN) ≥ 0.                                           (terminal constraints)

Solve e.g. with an SQP method for sparse problems; a small sketch follows below.
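A minimal sketch of this NLP for the test problem, with the midpoint discretization from the previous slide; SciPy's dense SLSQP stands in for a sparsity-exploiting SQP solver, which is a real simplification for large N.

```python
import numpy as np
from scipy.optimize import minimize

N, T, x0 = 30, 3.0, 0.05
dt = T / N
f = lambda x, u: (1.0 + x) * x + u           # test problem dynamics

def split(w):                                # w = (s_0..s_N, q_0..q_{N-1})
    return w[:N + 1], w[N + 1:]

def objective(w):                            # sum of l_i(q_i, s_i, s_{i+1})
    s, q = split(w)
    mid = 0.5 * (s[:-1] + s[1:])
    return float(np.sum((mid**2 + q**2) * dt))

def defects(w):                              # c_i(q_i, s_i, s_{i+1}) = 0
    s, q = split(w)
    return (s[1:] - s[:-1]) / dt - f(0.5 * (s[:-1] + s[1:]), q)

cons = [
    {'type': 'eq', 'fun': defects},                        # discretized ODE
    {'type': 'eq', 'fun': lambda w: split(w)[0][0] - x0},  # s_0 = x0
    {'type': 'eq', 'fun': lambda w: split(w)[0][-1]},      # s_N = 0
]

w0 = np.zeros(2 * N + 1)                     # states and controls at zero
res = minimize(objective, w0, method='SLSQP',
               bounds=[(-1.0, 1.0)] * (2 * N + 1),         # path bounds
               constraints=cons)
```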


What is a sparse NLP?

A general NLP

  min_w F(w)   s.t.   G(w) = 0,   H(w) ≥ 0,

is called sparse if the Jacobians (derivative matrices)

  ∇_w Gᵀ = ∂G/∂w = ( ∂Gi/∂wj )_{ij}   and   ∇_w Hᵀ

contain many zero elements.

In SQP methods, this makes the QPs much cheaper to build and to solve.
Direct Collocation: Pros and Cons

▶ Simultaneous simulation and optimization.
  + Large scale, but very sparse NLP.
  + Can use knowledge of x in the initialization.
  + Can treat unstable systems well.
  + Robust handling of path and terminal constraints.
  - Adaptivity needs a new grid, which changes the NLP dimensions.
▶ Successfully used for practical optimal control, e.g. by Biegler and Wächter (IPOPT), Betts, . . .
Direct Multiple Shooting [Bock 1984]

▶ Discretize the controls piecewise on a coarse grid:

  u(t) = qi   for t ∈ [ti, ti+1].

▶ Solve the ODE on each interval [ti, ti+1] numerically, starting with an artificial initial value si:

  ẋi(t; si, qi) = f(xi(t; si, qi), qi),   t ∈ [ti, ti+1],
  xi(ti; si, qi) = si.

Obtain the trajectory pieces xi(t; si, qi).

▶ Also numerically compute the integrals

  li(si, qi) := ∫_{ti}^{ti+1} L(xi(t; si, qi), qi) dt.
Sketch of Direct Multiple Shooting

[Figure: trajectory pieces xi(t; si, qi) started at the node values s0, s1, . . . , sN over the grid t0 < t1 < . . . < tN, with the piecewise constant controls qi below; in general xi(ti+1; si, qi) ≠ si+1 before convergence.]
NLP in Direct Multiple Shooting
[Figure: discretized controls and the matched trajectory pieces at the solution.]

minimize    Σ_{i=0}^{N−1} li(si, qi) + E(sN)
   s, q

subject to

s0 − x0 = 0,                                          (initial value)
si+1 − xi(ti+1; si, qi) = 0,   i = 0, . . . , N − 1,   (continuity)
h(si, qi) ≥ 0,                 i = 0, . . . , N,       (discretized path constraints)
r(sN) ≥ 0.                                            (terminal constraints)

A small sketch of this NLP follows below.
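A minimal sketch of this NLP for the test problem: the node values si and controls qi are both variables, coupled by the continuity constraints. One RK4 step per shooting interval plays the role of the ODE solver, and SciPy's dense SLSQP stands in for a structure-exploiting SQP method.

```python
import numpy as np
from scipy.optimize import minimize

N, T, x0 = 30, 3.0, 0.05
dt = T / N
f = lambda x, u: (1.0 + x) * x + u           # test problem dynamics

def rk4_step(s, q):                          # x_i(t_{i+1}; s_i, q_i), vectorized
    k1 = f(s, q)
    k2 = f(s + 0.5 * dt * k1, q)
    k3 = f(s + 0.5 * dt * k2, q)
    k4 = f(s + dt * k3, q)
    return s + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def split(w):                                # w = (s_0..s_N, q_0..q_{N-1})
    return w[:N + 1], w[N + 1:]

def objective(w):                            # sum of l_i(s_i, q_i)
    s, q = split(w)
    return float(np.sum((s[:-1]**2 + q**2) * dt))

cons = [
    {'type': 'eq',                           # s_{i+1} = x_i(t_{i+1}; s_i, q_i)
     'fun': lambda w: split(w)[0][1:] - rk4_step(split(w)[0][:-1], split(w)[1])},
    {'type': 'eq', 'fun': lambda w: split(w)[0][0] - x0},   # s_0 = x0
    {'type': 'eq', 'fun': lambda w: split(w)[0][-1]},       # s_N = 0
]

w0 = np.zeros(2 * N + 1)                     # x = 0 and u = 0 initialization
res = minimize(objective, w0, method='SLSQP',
               bounds=[(-1.0, 1.0)] * (2 * N + 1), constraints=cons)
```

The continuity constraints give the constraint Jacobian its characteristic block structure; a dense solver ignores this, which is exactly the sparsity a tailored multiple shooting SQP exploits.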
Structured NLP

▶ Summarize all variables as w := (s0, q0, s1, q1, . . . , sN).
▶ Obtain the structured NLP

  min_w F(w)   s.t.   G(w) = 0,   H(w) ≥ 0.

▶ The Jacobian ∇G(w^k)ᵀ contains the dynamic model equations.
▶ The Jacobians and the Hessian of the NLP are block sparse, which can be exploited in the numerical solution procedure.
[Figures: multiple shooting on the test example, initialized with u(t) ≡ 0; the first, 2nd, and 3rd iterations are shown, the 3rd already being the solution.]
Direct Multiple Shooting: Pros and Cons

▶ Simultaneous simulation and optimization.
  + Uses adaptive ODE/DAE solvers,
  + but the NLP has fixed dimensions.
  + Can use knowledge of x in the initialization (here the bounds; more important in an online context).
  + Can treat unstable systems well.
  + Robust handling of path and terminal constraints.
  + Easy to parallelize.
  - Not as sparse as collocation.
▶ Used for practical optimal control e.g. by Franke (ABB) (“HQP”), Terwen (Daimler); Bock et al. (“MUSCOD-II”); in the ACADO Toolkit; . . .
Conclusions: Optimal Control Family Tree

The three basic families and their direct subdivision:

▶ Hamilton-Jacobi-Bellman equation: tabulation in state space.
▶ Indirect methods, Pontryagin: solve a boundary value problem.
▶ Direct methods: transform into a Nonlinear Program (NLP). These subdivide into:
  ▶ Single shooting: only the discretized controls in the NLP (sequential).
  ▶ Collocation: discretized controls and states in the NLP (simultaneous).
  ▶ Multiple shooting: controls and node start values in the NLP (simultaneous/hybrid).
Literature

▶ T. Binder, L. Blank, H. G. Bock, R. Bulirsch, W. Dahmen, M. Diehl, T. Kronseder, W. Marquardt, J. P. Schlöder, and O. v. Stryk: Introduction to Model Based Optimization of Chemical Processes on Moving Horizons. In Grötschel, Krumke, Rambau (eds.): Online Optimization of Large Scale Systems: State of the Art, Springer, 2001, pp. 295–340.
▶ John T. Betts: Practical Methods for Optimal Control Using Nonlinear Programming. SIAM, Philadelphia, 2001. ISBN 0-89871-488-5.
▶ Dimitri P. Bertsekas: Dynamic Programming and Optimal Control. Athena Scientific, Belmont, 2000 (Vol. I, ISBN 1-886529-09-4) & 2001 (Vol. II, ISBN 1-886529-27-2).
▶ A. E. Bryson and Y. C. Ho: Applied Optimal Control. Hemisphere/Wiley, 1975.
