Jay H Lee - MPC Lecture Notes
Jay H. Lee
School of Chemical and Biomolecular Engineering Center for Process Systems Engineering Georgia Inst. of Technology
Prepared for Pan American Advanced Studies Institute Program on Process Systems Engineering
Schedule
- Lecture 1: Introduction to MPC
- Lecture 2: Details of MPC Algorithm and Theory
- Lecture 3: Linear Model Identification
Lecture 1
Introduction to MPC
- Motivation
- History and status of industrial use of MPC
- Overview of commercial packages
$$\min_{u_0,u_1,\dots}\ \sum_{i=0}^{\infty}\ell(x_i,u_i)\qquad\text{s.t.}\quad g_i(x_i,u_i)\le 0,\qquad x_{i+1}=F(x_i,u_i)$$
Optimal feedback policy $u_0=\phi(x_0)$ from the HJB equation.
At t = k, set x_0 = x_k (the estimated current state), solve the optimization problem numerically, and implement the solution u_0 as the current move. Repeat at the next time step! (A minimal sketch of this receding-horizon loop follows below.)
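The sketch below is only an illustration of the receding-horizon loop just described, assuming a linear model $x_{k+1}=Ax_k+Bu_k$; the helper `solve_finite_horizon_ocp` is a hypothetical name, implemented here as an unconstrained finite-horizon LQ solve via a backward Riccati recursion.

```python
# Minimal receding-horizon (MPC) loop: a sketch, not any vendor's algorithm.
import numpy as np

def solve_finite_horizon_ocp(A, B, Q, R, x0, horizon):
    """Unconstrained finite-horizon LQ solution via backward Riccati recursion."""
    P = Q.copy()
    gains = []
    for _ in range(horizon):                       # backward pass
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    gains.reverse()
    u_seq, x = [], x0                              # forward pass from x0
    for K in gains:
        u = -K @ x
        u_seq.append(u)
        x = A @ x + B @ u
    return u_seq

A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), 0.1 * np.eye(1)

x = np.array([1.0, 0.0])                           # estimated current state x_k
for k in range(50):
    u_seq = solve_finite_horizon_ocp(A, B, Q, R, x, horizon=20)
    u0 = u_seq[0]                                  # implement only the first move
    x = A @ x + B @ u0                             # plant evolves; repeat at k+1
```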
$$\sum_{i=0}^{\infty}x_i^{T}Q\,x_i\;+\;\sum_{i=0}^{m-1}u_i^{T}R\,u_i$$
Fairly general: state regulation, output regulation, setpoint tracking.
The unconstrained linear least-squares problem has an analytical solution (Kalman's LQR), and the solution is smooth with respect to the parameters. In the presence of inequality constraints, there is no analytical solution.
PID controllers, lead/lag filters, switches, min/max selectors, if/then logic, sequence logic
The model is not explicitly used inside the control algorithm; there is no clearly stated objective and no explicit constraints.
Other Elements
- Inconsistent performance
- Complex control structure
- Not robust to changes and failures
- Focus on the performance of a local unit
[Block diagram: set-point r(t) → controller → input u(t) → plant → output y(t)]
Terminal cost $\phi(x_p)$; constraints:
$$g_i(x_i,u_i)\le 0,\qquad g_p(x_p)\le 0,\qquad \dot x=f(x,u)$$
[Block diagram: set-point r(t) → controller → input u(t) → plant → output y(t); measurements fed back to the controller]
Must be coupled with on-line state / model parameter update. Requires solving
$$\min_{u_0,\dots,u_{p-1}}\ \underbrace{\sum_{i=0}^{p-1}\ell(x_i,u_i)}_{\text{stage-wise cost}}\;+\;\underbrace{\phi(x_p)}_{\text{terminal cost}}$$
for each updated problem on-line, subject to the path constraints $g_i(x_i,u_i)\le 0$, the terminal constraint $g_p(x_p)\le 0$, and the model constraint $\dot x=f(x,u)$.
An analytical solution is possible only in a few cases (LQ control); computational limitations prevented routine numerical solution, especially back in the 50s and 60s.
Unit 2 - MPC Structure
- Global steady-state optimization (every day)
- Local steady-state optimization (every hour)
- Dynamic constraint control (every minute)
- Supervisory dynamic control (every minute)
- Basic dynamic control (every second)
[Diagram: a conventional advanced control structure built from PID loops, lead/lag blocks, summers, and selectors driving FC/PC/TC/LC regulatory loops]
$$\min_{u_j,\ j=k,\dots,k+p-1}\ \sum_{i=1}^{p}\Big[\ \cdots\ +\ \big(r_B(k+i\,|\,k)-r_B^{*}\big)^{2}\ \Big]$$
(a quadratic tracking objective on the controlled variable $r_B$ relative to its target $r_B^{*}$)
The models used are predominantly empirical models developed through plant testing. The technology is used not only for multivariable control but also for the most economic operation within constraint boundaries.
- Honeywell: Robust MPC Technology (RMPCT)
- Adersa: Predictive Functional Control (PFC), Hierarchical Constraint Control (HIECON), GLIDE (identification package)
- MDC Technology (Emerson)
- ABB: 3dMPC
- Aspen Technology: Aspen Target
- Multivariable Control (MVC): linear dynamics + static nonlinearity; NOVA Nonlinear Controller (NLC): first-principles model
- Pavilion Technologies: Process Perfecter (linear dynamics + static nonlinearity)
[Hierarchy: local optimization → MPC → distributed control system (PID loops: FC, PC, TC, LC)]
- A separate steady-state optimization determines steady-state targets for the inputs and outputs; RMPCT recently introduced a dynamic optimizer.
- A Linear Program (LP) is used for the SS optimization; the LP enforces input and output constraints and determines optimal input and output targets for the thin and fat plant cases.
- The RMPCT and PFC controllers allow for both linear and quadratic terms in the SS optimization.
- The DMCplus controller solves a sequence of separate QPs to determine optimal input and output targets; CVs are ranked in priority so that the SS control performance of a given CV is never sacrificed to improve the performance of lower-priority CVs; MVs are also ranked in priority order to determine how extra degrees of freedom are used.
Dynamic Optimization
At the dynamic optimization stage, all of the controllers can be described (approximately) as minimizing a performance index with up to three terms: an output penalty, an input penalty, and an input rate penalty:
$$J=\sum_{j=1}^{P}\big\|e^{y}_{k+j}\big\|^{2}_{Q_j}\;+\;\sum_{j=0}^{M-1}\big\|\Delta u_{k+j}\big\|^{2}_{S_j}\;+\;\sum_{j=0}^{M-1}\big\|e^{u}_{k+j}\big\|^{2}_{R_j}$$
A vector of inputs uM is found which minimizes J subject to constraints on the inputs and outputs:
$$u^{M}=\big[\,u_{0}^{T},\ u_{1}^{T},\ \dots,\ u_{M-1}^{T}\,\big]^{T}$$
subject to bounds on the inputs, input moves, and outputs,
$$\underline{u}\le u_{k}\le\overline{u},\qquad \underline{\Delta u}\le\Delta u_{k}\le\overline{\Delta u},\qquad \underline{y}\le y_{k}\le\overline{y}$$
and the model equations
$$x_{k+1}=f(x_k,u_k),\qquad y_{k+1}=g(x_{k+1})+b_{k+1}$$
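As a concrete illustration of this kind of dynamic optimization, the sketch below sets up the output, input-move, and input-target penalties with simple bound constraints for a linear state-space model, using the cvxpy modeling package. All matrices, horizons, weights, and bounds are illustrative assumptions, not from any commercial product.

```python
# Sketch of the three-term dynamic optimization for a linear model, using cvxpy.
import numpy as np
import cvxpy as cp

A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [0.5]])
C = np.array([[1.0, 0.0]])
P, M = 20, 5                          # prediction and control horizons
Q, S, R = 1.0, 0.1, 0.01              # output-error, input-move, input-target weights
y_sp, u_target = 1.0, 0.0             # output setpoint and input target
x0, u_prev = np.zeros(2), np.zeros(1)

x = cp.Variable((P + 1, 2))
u = cp.Variable((M, 1))
cost, cons = 0, [x[0] == x0]
for j in range(P):
    uj = u[min(j, M - 1)]             # hold the last move beyond the control horizon
    cons += [x[j + 1] == A @ x[j] + B @ uj]
    cost += Q * cp.sum_squares(C @ x[j + 1] - y_sp)       # output penalty
    cons += [C @ x[j + 1] <= 2.0, C @ x[j + 1] >= -2.0]   # output bounds
for j in range(M):
    du = u[j] - (u_prev if j == 0 else u[j - 1])
    cost += S * cp.sum_squares(du)                         # input-move penalty
    cost += R * cp.sum_squares(u[j] - u_target)            # input-target penalty
    cons += [u[j] <= 2.0, u[j] >= -2.0, du <= 0.5, du >= -0.5]
cp.Problem(cp.Minimize(cost), cons).solve()
print(u.value[0])                     # first move to implement; repeat at the next step
```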
Dynamic Optimization
Most control algorithms use a single quadratic objective. The HIECON algorithm uses a sequence of separate dynamic optimizations to resolve conflicting control objectives; CV errors are minimized first, followed by MV errors. Connoisseur allows for a multi-model approach and an adaptive approach. The RMPCT algorithm defines a funnel and finds the optimal trajectory y^r and input u^M which minimize the following objective:
$$\min_{y^{r},\,u^{M}}\ J=\sum_{j=1}^{P}\big\|y^{r}_{k+j}-y_{k+j}\big\|^{2}_{Q}\;+\;\sum_{j=0}^{M-1}\big\|u_{k+j}-u_{ss}\big\|^{2}_{S}$$
[Figure: output trajectory specifications used by different products (Aspen Tech's DMC, Adersa's IDCOM/PFC, Honeywell's RMPCT): quadratic penalty around a setpoint, zone, and funnel, shown over past and future horizons; the output horizon may use coincidence points. Input parameterization: multiple moves Δu, possibly with blocking.]
Model types (derived from; linear/nonlinear; stable/unstable processes):
- First principles (physics): L, NL; S, U
- Laplace transfer function: physics or data; L; S, U
- ARMAX / NARMAX: data; L, NL; S, U
- Convolution (finite impulse or step response): data; L; S
- Other (polynomial, neural net): data; L, NL; S, U
Identification Technology
- Most products use PRBS-like or multiple-step test signals; Glide uses non-PRBS signals.
- Most products use FIR, ARX, or step response models; Glide uses transfer functions G(s), RMPCT uses Box-Jenkins models, and SMOC uses state-space models.
- Connoisseur has adaptive capability using RLS.
- A few products (DMCplus, SMOC) have subspace identification methods available for MIMO identification.
- Most products provide an uncertainty estimate, but most do not make use of the uncertainty bound in control design.
Summary
MPC is a mature technology!
Many commercial vendors with packages differing in model form, objective function form, etc. Sound theory and experience
Challenges:
- Simplifying the model development process (plant testing & system identification)
- Nonlinear model development
- State estimation (lack of sensors for key variables)
FCCU Debutanizer
[Debutanizer diagram: feed from the stripper through a pre-heater (160 F) to the tower; overhead at about 190 lb pressure with fan condenser and reflux, overhead product (PCT) to the deethanizer; tray 20 temperature used as a flooding indicator; reboiler section around 400 F; bottoms gasoline (RVP) to blending; PC and TC loops shown.]
Process Limitation
Operation problems:
- Overloading: over design capacity.
- Flooding: usually jet flooding, causing very poor separation.
- Lack of overhead fan cooling, especially in summer.
Consequences:
- High RVP, giving away octane number.
- High OVHD C5, causing problems at the Alky unit.
Control Objectives
Constrained control:
- Prevent the safety valve from relieving
- Keep the tower from flooding
- Keep RVP lower than its target
Regulatory control:
- Regulate OVHD PCT or C5 at spec
- Reject disturbances not through slurry, if possible
Real-Time Optimization
Optimization objectives: while maintaining PCT and RVP on their specifications,
- minimize energy consumed
- minimize overhead reflux
- minimize overhead cooling required
- minimize overhead pressure
- maximize separation efficiency
MPC Configuration
Controlled variables: PCT, fan, RVP, differential pressure, feed temperature, tray 20 temperature, OVHD C5 %, OVHD pressure, R-PID internal reflux.
Manipulated variables: fan output, reflux, pre-heater by-pass, reboiler by-pass.
Disturbances: feed, stripper bottom temperature.
[Plots of MPC's MV moves vs. time (minutes): tower pressure (lb, about 0-200) and feed-heater by-pass reach the optimal point; reflux-to-feed ratio (about 0.42-0.54) over 0-3000 minutes, showing flood handling.]
[Plots vs. time (minutes): flooding indicator (potential flooding handled) and product specs; debutanizer schematic repeated, annotated "Tray 20 after MPC is on", showing feed, flooding, tray 20 temperature, 160 F and 400 F, RVP, and gasoline to blending.]
[Plots vs. date (April 1991): (1) flooded tray number (trays 11-17), (2) RVP (about 5-7.5), (3) overhead C5+ total % (about 2-20%).]
Other Benefits
- Reflux is 15% lower than before.
- Separation efficiency is increased.
- There is now room for lower RVP or PCT, if needed.
- Variance on PCT and RVP is reduced. (Note: variance on the tray-20 temperature is increased, and it should be!)
- Energy saving (~10%).
- The OVHD fan maximum bound may be avoided.
- Flooding is eliminated.
Acknowledgment
- Professor S. Joe Qin, UT Austin
- Dr. Joseph Lu, Honeywell, Phoenix, AZ
- Center for Process Systems Engineering Industrial Members
Lecture 2
General Setup
Dynamic model state: a compact representation of the past input record.
[Prediction flow: previous state (in memory) + new input move (just implemented) → state update → current state; current state + future input moves (to be determined) → prediction model for future outputs → to optimization.]
Options
- Model type: finite impulse response or step response model; state-space model; linear or nonlinear.
- Measurement correction: to the prediction (based on open-loop state calculation), or to the state (through state estimation).
- Objective function: linear or quadratic; constrained or unconstrained.
Input and output records $\{v_0,v_1,v_2,\dots\}$ and $\{y_0,y_1,y_2,\dots\}$, where v can be an MV (u) or a measured DV (d).
Assumptions: $H_0=0$ (no immediate effect); the response settles back in n steps, such that $H_{n+1}=H_{n+2}=\dots=0$: finite impulse response (reasonable for stable processes).
$$y(k)=H_1\,v(k-1)+\dots+H_n\,v(k-n)$$
Inserting the state representation,
$$y(k)=H_1\,v(k-1)+\dots+H_n\,v(k-n)=C\,x(k)$$
Measurement correction: $e(k)=y_m(k)-y(k)$.
Assumptions: $S_0=0$ (no immediate effect); the response settles in n steps, such that $S_n=S_{n+1}=\dots=S_\infty$ (the same as the finite impulse response assumption). Relationship with the impulse response coefficients: $S_i=\sum_{j=1}^{i}H_j$.
The state consists of the n future outputs assuming the input remains constant at the most recent value:
$$\tilde y_i(k)=y(k+i)\quad\text{with}\quad \Delta v(k)=\Delta v(k+1)=\dots=0$$
Note $\tilde y_{n-1}(k)=\tilde y_n(k)=\dots=\tilde y_\infty(k)$ and $y(k)=\tilde y_0(k)$.
State update (shift, drop $\tilde y_0$, add the step response):
$$x(k+1)=\begin{bmatrix}\tilde y_0(k+1)\\ \vdots\\ \tilde y_{n-2}(k+1)\\ \tilde y_{n-1}(k+1)\end{bmatrix}=\underbrace{M_1\,x(k)}_{\text{shift}}+\underbrace{\begin{bmatrix}S_1\\ S_2\\ \vdots\\ S_{n-1}\\ S_n\end{bmatrix}}_{\text{step response }S}\Delta v(k),\qquad x(k)=\begin{bmatrix}\tilde y_0(k)\\ \vdots\\ \tilde y_{n-2}(k)\\ \tilde y_{n-1}(k)\end{bmatrix}$$
Measurement correction: $e(k)=y_m(k)-y(k)$.
The state stored in memory is $x(k)$, with output read-out $y(k)=\tilde y_0(k)=[\,1,0,\dots,0\,]\,x(k)$.
Future input moves (to be decided).
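To make the step-response prediction concrete, here is a small sketch (SISO case, illustrative names only) that stores the state $x(k)=[\tilde y_0(k),\dots,\tilde y_{n-1}(k)]$, applies the shift-plus-step update above, and adds the constant output-error correction $e(k)$ to the predictions.

```python
# Sketch of the step-response (DMC-type) prediction model described above.
import numpy as np

class StepResponseModel:
    def __init__(self, S):
        self.S = np.asarray(S, dtype=float)   # step response coefficients S_1..S_n
        self.n = len(S)
        self.x = np.zeros(self.n)             # future outputs under constant input

    def update(self, dv):
        """State update x(k+1) = M1 x(k) + S*dv(k): shift, then add the step response."""
        shifted = np.append(self.x[1:], self.x[-1])   # drop y~_0, repeat settled value
        self.x = shifted + self.S * dv

    def predict(self, du_future, y_meas):
        """Predict outputs over the horizon for future moves, with bias correction e(k)."""
        e = y_meas - self.x[0]                # constant output-error correction
        x, y_pred = self.x.copy(), []
        for du in du_future:
            shifted = np.append(x[1:], x[-1])
            x = shifted + self.S * du
            y_pred.append(x[0] + e)
        return np.array(y_pred)

# usage: a first-order-like step response settling at 1.0 in 30 steps
S = 1.0 - 0.9 ** np.arange(1, 31)
model = StepResponseModel(S)
model.update(dv=1.0)                          # a unit input move was just implemented
print(model.predict(du_future=[0.0] * 5, y_meas=0.12))
```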
$$\dot{\tilde z}=\tilde A\,\tilde z+\tilde B_u\,u+\tilde B_d\,d,\qquad \tilde y=\tilde C\,\tilde z$$
Linearization
$$y(z)=G^{u}(z)\,u(z)+G^{d}(z)\,d(z)\quad\text{or}\quad y(s)=G(s)\,u(s)+G^{d}(s)\,d(s)$$
I/O model Identification
State-Space Identification
$$\dot z=f(z,u,d),\qquad y=g(z)$$
Fundamental Model
$$\Delta y(k+1)=C\,\Delta z(k+1)\quad\Rightarrow\quad y(k+1)=y(k)+C\big(A\,\Delta z(k)+B_u\,\Delta u(k)+B_d\,\Delta d(k)\big)$$
The output $y(k)$ is read out from the stored state $x(k)$; measurement correction: $e(k)=y_m(k)-y(k)$.
The state stored in memory Future input moves (to be decided)
Summary
Regardless of model form, one gets the prediction equation in the form of
Assumptions
Measured DV (d) remains constant at the current value of d(k) Model prediction error (e) remains constant at the current value of e(k)
e(k)-e(k-1)
To Optimization
Prediction Equation
Contains past feedback measurement corrections
[Diagram: a process whose outputs y1 and y2 are measured at different sampling rates.]
State estimation provides systematic handling of multi-rate measurements and optimal extrapolation of the output error and filtering of noise (based on the given stochastic system model).
Optimization
Objective Function
Minimization function: quadratic cost (as in DMC)
$$\min_{\Delta u(k),\dots,\Delta u(k+m-1)}\ \sum_{i=1}^{p}\big\|r(k+i)-y(k+i\,|\,k)\big\|^{2}_{\Lambda^{y}}\;+\;\sum_{i=0}^{m-1}\big\|\Delta u(k+i)\big\|^{2}_{\Lambda^{u}}$$
Consider only m input moves by assuming $\Delta u(k+j)=0$ for $j\ge m$. Penalize the tracking error as well as the magnitudes of the adjustments.
Substituting the prediction equation $Y(k)=b(k)+L_u\,U_m(k)$ gives
$$V(k)=U_m^{T}(k)\,H\,U_m(k)+g^{T}(k)\,U_m(k)+c(k)$$
where $c(k)$ is a constant (independent of $U_m(k)$).
Constraints (input, input-move, and output bounds) can be collected into linear inequalities:
$$C\,U_m(k)\le h(k)$$
Optimization Problem: Quadratic Program
$$\min_{U_m(k)}\ U_m^{T}(k)\,H\,U_m(k)+g^{T}(k)\,U_m(k)$$
Unconstrained solution (analytical):
$$U_m(k)=-\tfrac{1}{2}\,H^{-1}g(k)$$
Constrained solution: must be solved numerically.
Quadratic Program
Minimization of a quadratic function subject to linear constraints; convex and therefore fundamentally tractable.
Solution methods:
- Active set method: determination of the active set of constraints on the basis of the KKT conditions.
- Interior point method: use of a barrier function to keep the iterates inside the feasible region; Newton iteration.
Solvers
Off-the-shelf software is available (e.g., QPSOL); customization is desirable for large-scale problems. (A small sketch of forming and solving the unconstrained QP follows below.)
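The sketch below is illustrative only: it forms the dynamic matrix $L_u$ from step-response coefficients, builds $H$ and $g$ for the quadratic cost, and evaluates the unconstrained analytical solution $U_m=-\tfrac12 H^{-1}g$. With inequality constraints one would instead pass $H$, $g$, and the constraint set to a QP solver (active-set or interior-point). Weights and signals are assumptions.

```python
# Sketch: forming the QP from Y(k) = b(k) + Lu*Um(k) and solving the
# unconstrained case Um = -1/2 * H^{-1} g (SISO, illustrative numbers).
import numpy as np

def dynamic_matrix(S, p, m):
    """Dynamic matrix Lu (p x m) from step-response coefficients S_1..S_n."""
    S = np.asarray(S, dtype=float)
    S_ext = np.append(S, np.full(p, S[-1]))        # response stays at S_n after settling
    Lu = np.zeros((p, m))
    for j in range(m):
        Lu[j:, j] = S_ext[: p - j]
    return Lu

S = 1.0 - 0.9 ** np.arange(1, 31)                  # illustrative step response
p, m = 20, 5
Lu = dynamic_matrix(S, p, m)

r = np.ones(p)                                     # setpoint trajectory
b = np.zeros(p)                                    # open-loop prediction (incl. e(k))
ly, lu = 1.0, 0.1                                  # tracking and move-suppression weights

# V = Um' H Um + g' Um + const, from ly*|| r - (b + Lu Um) ||^2 + lu*|| Um ||^2
H = ly * Lu.T @ Lu + lu * np.eye(m)
g = -2.0 * ly * Lu.T @ (r - b)

Um_unconstrained = -0.5 * np.linalg.solve(H, g)    # analytical solution
print(Um_unconstrained)
```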
Two-Level Optimization
Steady-State Optimization (Linear Program)
$$\min_{u_s(k)}\ L\big(y_{\infty|k},\,u_s(k)\big)$$
subject to
$$C_s\begin{bmatrix}y_{\infty|k}\\ u_s(k)\end{bmatrix}\le c_s(k),\qquad u_s(k)=u(k-1)+\Delta u(k)+\dots+\Delta u(k+m-1),\qquad y_{\infty|k}=b_s(k)+L_s\,u_s(k)$$
[Two-level structure: the steady-state prediction equation (state feedforward plus measurement feedback error) feeds the steady-state optimization; its optimal settling values (setpoints) $y^{*}_{\infty|k},\ u^{*}_s(k)$ are passed to the dynamic prediction equation and dynamic optimization.]
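A small sketch of the steady-state target calculation as a linear program, here using scipy.optimize.linprog (an assumption; commercial packages embed their own LP solvers). The gain $L_s$, bias $b_s$, bounds, and economic cost are illustrative SISO numbers.

```python
# Sketch of the steady-state target LP: minimize an economic cost in u_s
# subject to y_inf = b_s + L_s*u_s and simple bounds (illustrative numbers).
import numpy as np
from scipy.optimize import linprog

Ls, bs = 2.0, 0.3          # steady-state gain and bias from the prediction model
c_econ = np.array([1.0])   # economic cost: push u_s down (energy-like objective)

# bounds on u_s and on y_inf = bs + Ls*u_s, written as A_ub @ [u_s] <= b_ub
u_lo, u_hi = -1.0, 1.0
y_lo, y_hi = 0.0, 1.5
A_ub = np.array([[1.0], [-1.0], [Ls], [-Ls]])
b_ub = np.array([u_hi, -u_lo, y_hi - bs, -(y_lo - bs)])

res = linprog(c=c_econ, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)])
u_ss = res.x[0]
y_ss = bs + Ls * u_ss
print(u_ss, y_ss)          # setpoints passed to the dynamic optimization
```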
Stability guarantee
The optimal cost function can be shown to be a control Lyapunov function.
Fewer parameters to tune; more consistent, intuitive effect of the weight parameters; close connection with the classical optimal control methods, e.g., LQG control.
$$V(k)=\sum_{i=1}^{m+n-1}\big\|r-y(k+i\,|\,k)\big\|^{2}_{\Lambda^{y}}\;+\;\sum_{i=0}^{m-1}\big\|\Delta u(k+i)\big\|^{2}_{\Lambda^{u}}$$
(a finite sum equivalent to the infinite-horizon cost once the output settles)
Additional Comments
- Use of a sufficiently large horizon (p ≥ m + the settling time) should have a similar effect.
- Can we always satisfy the settling constraint? y = y* may not be feasible due to input constraints or insufficient m → use the two-level approach.
Two-Level Optimization
The optimal steady-state targets $y^{*}_{\infty|k},\ u^{*}_s(k)$ are passed from the steady-state optimization to the dynamic optimization.
Difficulty (1)
$$\dot x=f(x,u,d),\qquad y=g(x)$$
Discretization?
Orthogonal Collocation
$$y_{k+2|k}=g\circ F\big(F(x(k),u(k),d(k)),\,u(k+1),\,d(k)\big)+e(k),\quad\dots,\quad y_{k+p|k}=g\circ F^{p}\big(x(k),u(k),\dots,u(k+p-1),d(k)\big)+e(k)$$
$$\dot x=f(x,u,d)+w,\qquad y=g(x)+\nu$$
Extended Kalman Filtering
$$\hat x(k+1)=F\big(\hat x(k),u(k),d(k)\big)+K(k)\big(y_m(k)-g(\hat x(k))\big)$$
- Computationally more demanding (e.g., calculation of K at each time step).
- Based on linearization at each time step → not optimal, and may not be stable.
- Best practical solution at the current time.
- Promising alternative: Moving Horizon Estimation (requires solving an NLP).
- Difficult to come up with an appropriate stochastic system model (no ID technique).
Practical Algorithm: the EKF provides the current state estimate $\hat x(k)$; the dynamic matrix is then built from the model linearized at the current state and input values.
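A minimal EKF sketch along these lines, for illustration only: the Jacobians are obtained by finite differences purely for compactness, and the toy model, noise covariances, and names are assumptions, not from the lecture.

```python
# Minimal extended Kalman filter sketch for x+ = F(x,u) + w, y = g(x) + v.
import numpy as np

def jacobian(fun, x, eps=1e-6):
    fx = np.atleast_1d(fun(x))
    J = np.zeros((fx.size, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (np.atleast_1d(fun(x + dx)) - fx) / eps
    return J

def ekf_step(xhat, P, u, y_meas, F, g, Qw, Rv):
    # predict using the model linearized at the current estimate
    A = jacobian(lambda x: F(x, u), xhat)
    x_pred = F(xhat, u)
    P_pred = A @ P @ A.T + Qw
    # correct: gain K(k) from the linearized output map
    C = jacobian(g, x_pred)
    K = P_pred @ C.T @ np.linalg.inv(C @ P_pred @ C.T + Rv)
    xhat_new = x_pred + K @ (np.atleast_1d(y_meas) - np.atleast_1d(g(x_pred)))
    P_new = (np.eye(len(xhat)) - K @ C) @ P_pred
    return xhat_new, P_new

# usage with a toy nonlinear model (illustrative)
F = lambda x, u: np.array([x[0] + 0.1 * x[1], x[1] + 0.1 * (-np.sin(x[0]) + u)])
g = lambda x: np.array([x[0]])
xhat, P = np.zeros(2), np.eye(2)
xhat, P = ekf_step(xhat, P, u=0.5, y_meas=0.05, F=F, g=g,
                   Qw=1e-3 * np.eye(2), Rv=1e-2 * np.eye(1))
print(xhat)
```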
System Identification
Building a dynamic system model using data obtained from the plant
Why Important?
- Almost all industrial MPC applications use an empirical model obtained through system identification.
- Poor model → poor prediction → poor performance.
- Up to 80% of the project time is spent on this step.
- Direct interaction with the plant: cost factor, safety issues, credibility issues.
- Issues and decisions are sufficiently complicated that systematic procedures must be used.
[ID workflow: plant data → pretreatment → conditioned data → ID algorithm → model → validation; if validation fails, revisit the earlier steps, otherwise accept the model.]
Model Structure
[I/O model: the inputs pass through the plant dynamics; a white noise sequence e(k) drives the disturbance model.]
$$y(k)=\underbrace{G(q)\,u(k)}_{\text{effect of inputs}}\;+\;\underbrace{H(q)\,e(k)}_{\text{effect of disturbances}}$$
FIR structure:
$$G(q)=b_1q^{-1}+\dots+b_mq^{-m},\qquad H(q)=1$$
$$\tilde y(k)=a_1\tilde y(k-1)+\dots+a_n\tilde y(k-n)+b_1u(k-1)+\dots+b_mu(k-m),\qquad y(k)=\tilde y(k)+e(k)$$
$$G(q)=\frac{b_1q^{-1}+\dots+b_mq^{-m}}{1-a_1q^{-1}-\dots-a_nq^{-n}},\qquad H(q)=1$$
$$y(k)=A_1y(k-1)+\dots+A_ny(k-n)+B_1u(k-1)+\dots+B_mu(k-m)+e(k)$$
$$G(q)=\big(I-A_1q^{-1}-\dots-A_nq^{-n}\big)^{-1}\big(B_1q^{-1}+\dots+B_mq^{-m}\big),\qquad H(q)=\big(I-A_1q^{-1}-\dots-A_nq^{-n}\big)^{-1}$$
$A_i$ is an $n_y\times n_y$ matrix and $B_i$ is an $n_y\times n_u$ matrix. Different sets of coefficient matrices can give exactly the same G(q) and H(q) through pole/zero cancellations → problems in parameter estimation → requires special parameterization to avoid the problem.
ARMAX Structure
Overview
[Overview: data + model structure → parameter estimation by minimizing the prediction error. Alternatives: IV method; statistical methods (MLE, Bayesian).]
Prediction error method (PEM) criterion:
$$\min\ \frac{1}{N}\sum_{k=1}^{N}\big\|e(k)\big\|^{2}$$
ARX, FIR: linear least squares. ARMAX, OE, BJ: nonlinear least squares.
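Because the ARX and FIR structures are linear in the parameters, the PEM criterion above reduces to ordinary linear least squares. A small SISO sketch (orders, data, and names are illustrative):

```python
# Sketch: ARX identification by linear least squares (SISO, illustrative orders).
# Regression: y(k) = a1*y(k-1)+...+an*y(k-n) + b1*u(k-1)+...+bm*u(k-m) + e(k)
import numpy as np

def arx_ls(u, y, n, m):
    N = len(y)
    rows, targets = [], []
    for k in range(max(n, m), N):
        phi = np.concatenate([y[k - n:k][::-1], u[k - m:k][::-1]])
        rows.append(phi)
        targets.append(y[k])
    Phi, Y = np.array(rows), np.array(targets)
    theta, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    return theta[:n], theta[n:]              # a-coefficients, b-coefficients

# generate data from a known ARX system and re-identify it
rng = np.random.default_rng(0)
u = rng.choice([-1.0, 1.0], size=500)        # PRBS-like input
y = np.zeros(500)
for k in range(2, 500):
    y[k] = 1.5 * y[k-1] - 0.7 * y[k-2] + 0.5 * u[k-1] + 0.2 * u[k-2] \
           + 0.01 * rng.standard_normal()
a_hat, b_hat = arx_ls(u, y, n=2, m=2)
print(a_hat, b_hat)                          # close to [1.5, -0.7] and [0.5, 0.2]
```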
Subspace Method
- More recent development; dates back to the classical realization theories but rediscovered and extended by several people.
- Identifies a state-space model.
- Some theories and software tools.
- Computationally simple: non-iterative, linear algebra.
- Not optimal in any sense; may need a lot of data for good results.
- May be combined with PEM: use the subspace method to obtain an initial guess for PEM.
The state $x(k)=[x_1(k),\dots,x_n(k)]^{T}$ summarizes the past inputs. Writing the future outputs in terms of past and future inputs:
$$\underbrace{\begin{bmatrix}\hat y_{k|k-1}\\ \vdots\\ \hat y_{k+n-1|k-1}\end{bmatrix}}_{\text{future output prediction}}+\underbrace{\begin{bmatrix}e_{k|k-1}\\ \vdots\\ e_{k+n-1|k-1}\end{bmatrix}}_{\text{prediction error}}=L_1\underbrace{\begin{bmatrix}u(k-1)\\ \vdots\\ u(k-n)\end{bmatrix}}_{\text{past inputs}}+L_2\underbrace{\begin{bmatrix}u(k)\\ \vdots\\ u(k+n-2)\end{bmatrix}}_{\text{future inputs}}$$
A state basis is extracted from the SVD of the matrix $M$ mapping the past data to the future output predictions:
$$M=\begin{bmatrix}Q_1 & Q_2\end{bmatrix}\begin{bmatrix}\Sigma_n & 0\\ 0 & 0\end{bmatrix}\begin{bmatrix}P_1^{T}\\ P_2^{T}\end{bmatrix},\qquad \Gamma_o=Q_1\,\Sigma_n^{1/2}$$
Some variations exist among different algorithms in terms of picking the state basis
Properties
N4SID (Van Overschee and De Moor): Kalman filter interpretation; proof of asymptotic unbiasedness of A, B, and C; efficient algorithm using QR factorization.
CVA (Larimore): founded on a statistical argument; the same idea, but the criterion for choosing the state basis (Q1) differs a bit from N4SID: it is based on the correlation between past I/O data and future output data, rather than on minimization of the prediction error for the given data.
Alternative: obtain C and A from $\Gamma_o$, and then B, as in MOESP (Verhaegen). (A rough sketch of the projection-and-SVD idea follows below.)
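For illustration only, here is a rough deterministic-case sketch of the projection-plus-SVD idea behind these methods. It is not N4SID, CVA, or MOESP as implemented in any package; the helper names are made up, and only A and C are returned (B and D would follow from a separate least-squares step).

```python
# Deterministic subspace-identification sketch: project out future inputs, SVD,
# then read C and A from the extended observability matrix (illustrative only).
import numpy as np

def block_hankel(w, i, j):
    """Block-Hankel matrix with i block rows built from the signal w (N x nw)."""
    nw = w.shape[1]
    H = np.zeros((i * nw, j))
    for r in range(i):
        H[r * nw:(r + 1) * nw, :] = w[r:r + j, :].T
    return H

def subspace_id(u, y, n, i=10):
    """Return (A, C) of x+ = Ax + Bu, y = Cx from I/O data (deterministic sketch)."""
    N = u.shape[0]
    j = N - 2 * i + 1
    Up, Uf = block_hankel(u, i, j), block_hankel(u[i:], i, j)
    Yp, Yf = block_hankel(y, i, j), block_hankel(y[i:], i, j)
    Wp = np.vstack([Up, Yp])                                # past input/output data
    Pi = np.eye(j) - Uf.T @ np.linalg.pinv(Uf @ Uf.T) @ Uf  # remove future-input effect
    O = (Yf @ Pi) @ np.linalg.pinv(Wp @ Pi) @ Wp            # oblique projection
    U_, S_, _ = np.linalg.svd(O, full_matrices=False)
    Gamma = U_[:, :n] * np.sqrt(S_[:n])                     # Gamma_o = Q1 * Sigma_n^(1/2)
    ny = y.shape[1]
    C = Gamma[:ny, :]
    A = np.linalg.pinv(Gamma[:-ny, :]) @ Gamma[ny:, :]      # shift invariance
    return A, C

# usage: noise-free data from a known 2-state system; eigenvalues should match
A0 = np.array([[0.9, 0.2], [0.0, 0.7]])
B0 = np.array([[0.0], [1.0]])
C0 = np.array([[1.0, 0.0]])
rng = np.random.default_rng(1)
u = rng.standard_normal((400, 1))
x, ys = np.zeros(2), []
for uk in u:
    ys.append(C0 @ x)
    x = A0 @ x + B0 @ uk
A_hat, C_hat = subspace_id(u, np.array(ys), n=2)
print(np.linalg.eigvals(A_hat), np.linalg.eigvals(A0))
```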
Error Analysis
Error Types
Bias: error due to structural mismatch.
- Bias = the error remaining as the number of data points → ∞; it is independent of the number of data points collected.
- The bias distribution (e.g., in the frequency domain) depends on the input spectrum, pre-filtering of the data, etc.
- The frequency-domain bias distribution under PEM was characterized by Ljung.
Main tradeoff: a richer structure (more parameters) → lower bias, higher variance.
$$\operatorname{cov}\big(\operatorname{vec}(\hat G_N)\big)\;\approx\;\frac{n}{N}\;\underbrace{\Phi_v(\omega)\otimes\Phi_u^{-T}(\omega)}_{\text{noise-to-signal ratio}}$$
Test Signals
Very Important
Signal-to-noise ratio Distribution and size of the variance Bias distribution
Popular Types
- Multiple steps: power mostly in the low-frequency region → good estimation of steady-state gains (even with step disturbances) but generally poor estimation of high-frequency dynamics.
- PRBS: flat spectrum → good estimation of the entire frequency response, provided the error also has a flat spectrum (often not true).
- Combine steps with PRBS?
MIT vs. SIT:
- MIT (multiple-input testing) gives a better signal-to-noise ratio for a given testing time.
- Control-relevant data generation requires MIT.
- MIT can be necessary for identification of highly interactive systems (e.g., systems with large RGA).
- SIT (single-input testing) is often preferred in practice because of its more predictable effect on the on-going operation.
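A sketch of generating such test signals: a PRBS-like random binary signal with a minimum switching time (not a true maximum-length sequence) and a multi-step signal. Amplitudes and switching times are illustrative; in practice they are chosen from the process time constants and the allowable perturbation size.

```python
# Sketch: test signals -- a PRBS-like binary signal with a minimum switching
# time, and a multi-step signal (amplitudes and timing are illustrative).
import numpy as np

def prbs(n_samples, amplitude=1.0, min_switch=5, seed=0):
    """Random binary signal holding each level for at least `min_switch` samples."""
    rng = np.random.default_rng(seed)
    sig = np.empty(n_samples)
    level, k = amplitude, 0
    while k < n_samples:
        hold = min_switch * rng.integers(1, 4)      # hold 1-3 clock periods
        sig[k:k + hold] = level
        level = -level if rng.random() < 0.5 else level
        k += hold
    return sig

def multi_step(n_samples, levels=(0.0, 1.0, -1.0, 0.5), step_len=100):
    """Sequence of steps, emphasizing low frequencies / steady-state gains."""
    sig = np.concatenate([np.full(step_len, lv) for lv in levels])
    return np.resize(sig, n_samples)

u1 = prbs(1000)
u2 = multi_step(1000)
```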
[Closed-loop testing diagram: a perturbation / dither signal r enters the loop at location 1 or location 2; the plant G0 output y is affected by the disturbance d and fed back through the controller to the input u.]
Cons:
- Correlation between the input perturbations and the disturbances / noise through the feedback.
- Many algorithms can fail or give problems; they give bias unless the assumed noise structure is perfect.
With closed-loop data, only the portion $\Phi_u^{r}$ of the input spectrum generated by the external perturbation contributes to variance reduction:
$$\operatorname{cov}\big(\operatorname{vec}(\hat G_N)\big)\;\approx\;\frac{n}{N}\;\Phi_v(\omega)\otimes\big(\Phi_u^{r}(\omega)\big)^{-T}$$
The level of the external perturbation signal also contributes to the size of the bias caused by the feedback-induced correlation:
$$E\{G_0-\hat G_N\}\;\approx\;(H_0-H)\,\Phi_{eu}\,\Phi_u^{-1}$$
where the term $\Phi_{eu}\Phi_u^{-1}$ scales with the noise-to-signal ratio.
Indirect Approach: identify closed-loop transfer functions from the data $D_N=\{y(i),\,r(i),\ i=1,\dots,N\}$ and back out the plant model using the known controller $C$, e.g.
$$\hat G_N=\hat T^{yr}_N\big(C\,(I-\hat T^{yr}_N)\big)^{-1}\qquad\text{or}\qquad \hat G_N=\hat T^{yr}_N\big(\hat T^{ur}_N\big)^{-1}$$
Data Pretreatment
Main Issues (1): time-consuming but very important.
- Remove outliers.
- Remove portions of data corresponding to unusual disturbances or operating conditions.
- Filter the data: this affects the bias distribution (emphasizing or de-emphasizing different frequency regions); it does NOT improve the S/N ratio (often a misconception).
Main Issues (2): difference the data? ($\Delta y(k)=y(k)-y(k-1)$, $\Delta u(k)=u(k)-u(k-1)$)
- Removes trends (e.g., the effect of step disturbances and setpoint changes) that can destroy the effectiveness of many ID methods (e.g., subspace ID); often used in practice.
- Also removes the input power in the low-frequency region (a PRBS has zero input power at ω = 0).
- Amplifies the high-frequency parts of the data (e.g., noise), so low-pass filtering may be necessary (see the sketch below).
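A sketch of these pretreatment steps: differencing to remove trends, followed by low-pass filtering to attenuate the amplified high-frequency content. The cutoff, filter order, and use of scipy.signal are assumptions for illustration.

```python
# Sketch of pretreatment: difference the data to remove trends, then low-pass
# filter to attenuate the amplified high-frequency content (illustrative cutoff).
import numpy as np
from scipy.signal import butter, filtfilt

def pretreat(u, y, cutoff=0.2):
    du, dy = np.diff(u), np.diff(y)              # u(k)-u(k-1), y(k)-y(k-1)
    b, a = butter(2, cutoff)                     # 2nd-order low-pass, normalized cutoff
    return filtfilt(b, a, du), filtfilt(b, a, dy)

# usage: a drifting output (step-like disturbance) whose trend is removed
rng = np.random.default_rng(0)
u = rng.choice([-1.0, 1.0], size=500)
y = np.cumsum(0.01 * np.ones(500)) + 0.5 * u + 0.05 * rng.standard_normal(500)
du_f, dy_f = pretreat(u, y)
```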
Model Validation
Overview
- Use fresh data, different from the data used for model building.
- Various methods: size of the prediction error; whiteness of the prediction error; cross-correlation tests (e.g., between the prediction error and the inputs); a sketch of these tests follows below.
- Good prediction with the test data but poor prediction with the validation data is a sign of overfit: reduce the order or use a more compact structure like ARMAX (instead of ARX).
- Good theories and systematic tools are available. System ID can also be used for constructing monitoring models.
For monitoring models, subspace identification yields a trend model (not a causal model), and active testing is not needed.
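A sketch of the residual tests listed above: autocorrelation of the prediction errors (whiteness) and cross-correlation between errors and inputs, compared against approximate $\pm 1.96/\sqrt{N}$ bounds. Implementation details are illustrative.

```python
# Sketch of validation tests on fresh data: whiteness of the prediction error
# and cross-correlation between error and input, with ~95% confidence bounds.
import numpy as np

def xcorr(a, b, max_lag=20):
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    N = len(a)
    return np.array([np.dot(a[l:], b[:N - l]) / N for l in range(max_lag + 1)])

def validate(residuals, inputs, max_lag=20):
    N = len(residuals)
    bound = 1.96 / np.sqrt(N)                      # approximate 95% confidence bound
    r_ee = xcorr(residuals, residuals, max_lag)    # whiteness test (lags >= 1)
    r_eu = xcorr(residuals, inputs, max_lag)       # independence of error and input
    white = np.all(np.abs(r_ee[1:]) < bound)
    uncorrelated = np.all(np.abs(r_eu) < bound)
    return white, uncorrelated

rng = np.random.default_rng(0)
e = rng.standard_normal(400)                       # residuals on the validation data
u = rng.choice([-1.0, 1.0], size=400)
print(validate(e, u))
```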
$$\min_{u_0,\dots,u_{p-1}}\ \underbrace{\sum_{j=0}^{p-1}\ell(x_j,u_j)}_{\text{stage-wise cost}}\;+\;\underbrace{\phi(x_p)}_{\text{terminal cost}}$$
subject to
$$g_j(x_j,u_j)\le 0,\qquad g_p(x_p)\le 0,\qquad x_{j+1}=f(x_j,u_j)$$
A general formulation for deterministic control and scheduling problems. Continuous and integer state / decision variables are possible. In control, the p = ∞ case is typically solved. Uncertainty is not explicitly addressed.
Solution Approaches
Analytical approach (50s-70s): derivation of a closed-form optimal policy $u_j=\phi^{*}(x_j)$; requires solution of the HJB equation (hard!).
Parametric programming:
General parameter-dependent solution (e.g., a lookup table); significantly higher computational burden.
Practical solution:
Resolve the problem on-line whenever parameters are updated or constraints are violated (e.g., in Model Predictive Control or Reactive Scheduling).
Next holy grail of control: a general form for control, scheduling, and other real-time decision problems in an uncertain dynamic environment. No satisfactory solution approach is currently available.
[Scenario tree: from the current state x_k the decision u_k branches to possible next states x_{k+1}, each with its own decision u_{k+1}, branching again to states x_{k+2} with decisions u_{k+2}, and so on.]
Total number of decision variables = (1 + 2 + 4 + … + 2^(p-1)) n_u. Number of branches to evaluate for each candidate decision = 2^p.
[Scenario tree with S branches per stage:]
Total number of decision variables = (1 + S + S^2 + … + S^(p-1)) n_u. Number of branches to evaluate for each decision candidate = S^p. Not feasible for large S (a large number of scenarios) and/or large p (a large number of stages);
practically limited to two-stage problems with a small number of scenarios.
Current practical approach: Evaluate most likely branch(es) only. BUT highly limited!
Cost of the optimal policy from state $x_k$:
$$\sum_{j=k+1}^{\infty}\alpha^{\,j-k-1}\,\ell\big(x_j,\phi^{*}(x_j)\big)$$
Optimal policy and Bellman equation:
$$\phi^{*}(x)=\arg\min_{u}\Big\{\ell(x,u)+J^{*}\big(f_h(x,u)\big)\Big\},\qquad J^{*}(x_k)=\min_{u(k)}E\Big[\ell(x_k,u_k)+J^{*}\big(f_h(x_k,u_k)\big)\Big]$$
Value iteration with state sampling & discretization:
$$\tilde J_{i+1}(x)=\min_{u}E\Big[\ell(x,u)+\tilde J_i\big(f(x,u)\big)\Big]$$
Iterate $i\leftarrow i+1$ until convergence to obtain the converged solution $\tilde J^{*}$; $\tilde J(x)$ at an arbitrary state is evaluated from the values $\tilde J(x_k)$ stored at the sampled states $x_k$.
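A tiny tabular value-iteration sketch on a sampled and discretized one-dimensional state space, illustrating the iteration above; the dynamics, cost, grid, and the discount factor added for convergence are all illustrative assumptions.

```python
# Sketch: value iteration on a sampled 1-D state space with interpolation,
# J_{i+1}(x) = min_u E[ l(x,u) + J_i(f(x,u)) ]   (toy dynamics and cost).
import numpy as np

x_grid = np.linspace(-2.0, 2.0, 41)                 # sampled / discretized states
u_grid = np.linspace(-1.0, 1.0, 11)                 # candidate decisions
w_vals = np.array([-0.1, 0.0, 0.1])                 # disturbance scenarios
w_prob = np.array([0.25, 0.5, 0.25])
gamma = 0.95                                        # discount factor (for convergence)

def f(x, u, w):                                     # toy stochastic dynamics
    return np.clip(0.9 * x + 0.5 * u + w, -2.0, 2.0)

def stage_cost(x, u):
    return x ** 2 + 0.1 * u ** 2

J = np.zeros_like(x_grid)
for _ in range(500):                                # value iteration: i <- i + 1
    J_new = np.empty_like(J)
    for i, x in enumerate(x_grid):
        q = [stage_cost(x, u) + gamma * np.dot(w_prob, np.interp(f(x, u, w_vals), x_grid, J))
             for u in u_grid]
        J_new[i] = min(q)
    if np.max(np.abs(J_new - J)) < 1e-8:            # converged solution ~J*
        J = J_new
        break
    J = J_new

def J_tilde(x):                                     # evaluate off-grid by interpolation
    return np.interp(x, x_grid, J)
```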