Quasi-Newton Methods in Optimization

This document discusses quasi-Newton methods for optimization of chemical processes. Quasi-Newton methods approximate the Hessian matrix to modify Newton's method, which requires second derivatives that are difficult to compute. The most popular quasi-Newton update is the Broyden-Fletcher-Goldfarb-Shanno method, which works well without line searches.


OPTIMIZATION OF CHEMICAL PROCESSES (CHE1011)

Dr. Dharmendra Kumar Bal
Assistant Professor (Sr.)
School of Chemical Engineering

QUASI-NEWTON METHODS

Newton method: Xk+1 = Xk − αk [∇²f(Xk)]⁻¹ ∇f(Xk)

This requires second derivatives, which are difficult to compute.

Modify the Newton method by approximating [∇²f(Xk)]⁻¹ and continually updating the approximation during iterations using the latest results.

The resulting methods are known as quasi-Newton methods.

For a quadratic function, f(X) = a + bᵀX + ½ XᵀHX,
the gradient is g(X) = ∇f(X) = b + HX.

Hence, H can be found from ∇f(X) evaluated at several X.

Assuming a quadratic function, many schemes have been proposed for approximating and updating [∇²f(Xk)]⁻¹.

The most popular update is that of Broyden-Fletcher-Goldfarb-Shanno (BFGS), which works well even without line searches (i.e., αk = 1).
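The quadratic relation ∆g = H ∆X is what lets quasi-Newton methods infer curvature from successive gradients. A minimal numerical check of this identity, using illustrative values of H and b (assumed here, not from the slides):

```python
import numpy as np

# Illustrative quadratic f(X) = a + b^T X + 0.5 X^T H X (coefficients assumed)
H = np.array([[4.0, 1.0],
              [1.0, 3.0]])        # symmetric Hessian
b = np.array([1.0, -2.0])

def grad(X):
    # g(X) = grad f(X) = b + H X
    return b + H @ X

X0 = np.array([0.0, 0.0])
X1 = np.array([1.0, 2.0])
dX = X1 - X0
dg = grad(X1) - grad(X0)

# For a quadratic, the gradient difference is exactly H times the step,
# so gradients at several points determine H
assert np.allclose(dg, H @ dX)
```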

Gradient Search Methods
BFGS method

Assume B is an estimate of H⁻¹ = [∇²f(X)]⁻¹.

The initial estimate of B is a positive definite matrix, B0.

New estimate by the BFGS method: Xk+1 = Xk − Bk ∇f(Xk)

Compare with the Newton method: Xk+1 = Xk − αk [∇²f(Xk)]⁻¹ ∇f(Xk)

Compute ∆Xk = Xk+1 − Xk

and ∆gk = ∇f(Xk+1) − ∇f(Xk)

Update Bk to find Bk+1 by the following matrix equation:

Bk+1 = Bk + [1 + (∆gkᵀ Bk ∆gk)/(∆Xkᵀ ∆gk)] (∆Xk ∆Xkᵀ)/(∆Xkᵀ ∆gk) − (∆Xk ∆gkᵀ Bk + Bk ∆gk ∆Xkᵀ)/(∆Xkᵀ ∆gk)
BFGS ALGORITHM
1) Select an initial estimate, X0, a positive definite matrix, B0, as the initial estimate of B, and set the iteration counter, k = 0. Compute f(Xk) and ∇f(Xk).
2) Find the new estimate using: Xk+1 = Xk − Bk ∇f(Xk)
3) Compute f(Xk+1), ∇f(Xk+1) and Bk+1 (using the update formula).
4) Check for convergence [on ∇f(Xk+1), Xk+1 − Xk and f(Xk+1) − f(Xk)] and the limit on the number of iterations. Either stop, or set k = k + 1 and repeat from Step 2.
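The steps above can be sketched in a few lines of Python (NumPy), assuming B0 = I and a unit step length; the test quadratic and tolerances are illustrative, not from the slides:

```python
import numpy as np

def bfgs_minimize(grad, X0, tol=1e-10, max_iter=500):
    # Steps 1-4 of the algorithm: B0 = I, no line search (alpha = 1),
    # convergence checked on the gradient norm only (for brevity)
    X = np.asarray(X0, dtype=float)
    B = np.eye(len(X))                    # positive definite initial estimate B0
    g = grad(X)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:       # necessary condition met
            break
        X_new = X - B @ g                 # Xk+1 = Xk - Bk grad f(Xk)
        g_new = grad(X_new)
        dX, dg = X_new - X, g_new - g
        sy = dX @ dg
        if abs(sy) > 1e-14:               # skip a degenerate update
            B = (B + (1.0 + (dg @ B @ dg) / sy) * np.outer(dX, dX) / sy
                   - (np.outer(dX, dg @ B) + np.outer(B @ dg, dX)) / sy)
        X, g = X_new, g_new
    return X

# Illustrative quadratic: f(X) = 0.5 X^T H X - b^T X has its minimum at H^-1 b
H = np.array([[1.5, 0.2], [0.2, 1.0]])
b = np.array([1.0, 1.0])
X_star = bfgs_minimize(lambda X: H @ X - b, [5.0, -3.0])
assert np.allclose(X_star, np.linalg.solve(H, b), atol=1e-6)
```

In practice the convergence test would combine all three criteria discussed under Application Issues, not just the gradient norm.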

 Summary

 All gradient search methods for unconstrained problems find the new estimate using Xk+1 = Xk + αk sk.

 The search direction sk is related to the gradient ∇f(Xk) and varies from one method to another.

 The algorithm of all indirect/gradient search methods is similar.

 Selection of αk is based on a one-dimensional (line) search, or αk is simply set to 1.0.

Application Issues
• Termination Criteria: Three criteria are possible.

(1) ||∇f(Xk+1)|| < ε1

(Is the necessary condition met?)

(2) |xi,k+1 − xi,k| < ε2 or |xi,k+1 − xi,k| / |xi,k| < ε3

(for i = 1, 2, 3, …, n)

(Is there a significant change in DV values?)

(3) |f(Xk+1) − f(Xk)| < ε4 or |f(Xk+1) − f(Xk)| / |f(Xk)| < ε5

(Is there a significant change in the Objective Function value?)

It is better to use all three criteria.
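The three criteria can be combined in one helper. The epsilon defaults below are illustrative, and the relative tests assume nonzero denominators:

```python
import numpy as np

def converged(X_new, X_old, f_new, f_old, g_new,
              eps1=1e-6, eps2=1e-8, eps3=1e-8, eps4=1e-8, eps5=1e-8):
    # (1) gradient norm small: is the necessary condition met?
    c1 = np.linalg.norm(g_new) < eps1
    # (2) small absolute or relative change in the decision variables
    #     (relative test assumes X_old has no zero components)
    dX = np.abs(X_new - X_old)
    c2 = np.all(dX < eps2) or np.all(dX / np.abs(X_old) < eps3)
    # (3) small absolute or relative change in the objective
    #     (relative test assumes f_old != 0)
    df = abs(f_new - f_old)
    c3 = df < eps4 or df / abs(f_old) < eps5
    return c1 and c2 and c3

X_old = np.array([1.0, 2.0])
# Tiny changes everywhere: all three criteria met
assert converged(X_old + 1e-10, X_old, 3.0, 3.0 + 1e-12, np.array([1e-9, 0.0]))
# Large gradient and large changes: not converged
assert not converged(np.array([2.0, 3.0]), X_old, 5.0, 3.0, np.array([1.0, 0.0]))
```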

• Derivative Evaluation

 Use finite difference approximations of derivatives if analytical derivatives are not available or are very expensive to compute.

 This approach is better than direct search methods for large problems.

 The partial derivative of f(x, y) w.r.t. x is defined as:

∂f(x, y)/∂x = lim(∆x→0) [f(x + ∆x, y) − f(x, y)] / ∆x
 The forward difference approximation is

∂f(x, y)/∂x ≈ [f(x + ∆x, y) − f(x, y)] / ∆x

which requires one function evaluation, and the truncation error in the approximation is proportional to ∆x.

 The central difference approximation is

∂f(x, y)/∂x ≈ [f(x + ∆x, y) − f(x − ∆x, y)] / (2∆x)

requiring two function evaluations and having truncation error proportional to (∆x)².
• Software

 Many computer programs are available. A particular method may be implemented differently in different programs, and hence its performance may vary.

 Relative comparison of methods and programs is based on (i) computational time, (ii) number of function evaluations, and (iii) reliability.

 In general, the BFGS (quasi-Newton) method performs best.

Summary
• Optimality Criteria
– Necessary and sufficient conditions
– Useful for solving simple problems
– Useful for checking the solution found

• Direct Search Methods

 Search direction is based on objective function values only. Derivatives are not needed.
 Satisfactory for small problems (< 5 variables) and/or when
derivatives are difficult to calculate.
 Modified simplex method of Nelder and Mead is simple and
popular for certain applications.

• Gradient Search Methods

 Search direction is based on the gradient vector, which may be obtained analytically or numerically.

 These methods are efficient and recommended for large problems or when derivatives can be calculated.

 The Newton method has faster convergence when good initial estimates are available, but it requires second derivatives.

 The Levenberg-Marquardt method combines the strengths of the steepest descent and Newton methods. It is popular for nonlinear least-squares (NLS) regression.

 Quasi-Newton methods have good convergence characteristics. In particular, the BFGS method works well without a line search.
