Mathematical Methods of Optimization
Nonlinear Programming
Introduction to Nonlinear Programming
• We already talked about a basic aspect of
nonlinear programming (NLP) in the
Introduction Chapter when we considered
unconstrained optimization.
Introduction to Nonlinear Programming
• The gradient (∇) of f(x_1, x_2, …, x_n) is:
\nabla f = \begin{bmatrix} \dfrac{\partial f}{\partial x_1} & \dfrac{\partial f}{\partial x_2} & \cdots & \dfrac{\partial f}{\partial x_n} \end{bmatrix}
Example: f = 15x_1 + 2x_2^3 - 3x_1 x_3^2
\nabla f = \begin{bmatrix} 15 - 3x_3^2 & 6x_2^2 & -6x_1 x_3 \end{bmatrix}
The Hessian
• The Hessian (∇²) of f(x_1, x_2, …, x_n) is:
\nabla^2 f = \begin{bmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}
\end{bmatrix}
Hessian Example
• Example (from previously):
f = 15x_1 + 2x_2^3 - 3x_1 x_3^2
\nabla f = \begin{bmatrix} 15 - 3x_3^2 & 6x_2^2 & -6x_1 x_3 \end{bmatrix}
\nabla^2 f = \begin{bmatrix} 0 & 0 & -6x_3 \\ 0 & 12x_2 & 0 \\ -6x_3 & 0 & -6x_1 \end{bmatrix}
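These derivatives are easy to double-check by machine; the following is a minimal SymPy sketch (an editorial aside, not part of the original slides):

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
f = 15*x1 + 2*x2**3 - 3*x1*x3**2

# Gradient: the vector of first partial derivatives
grad = [sp.diff(f, v) for v in (x1, x2, x3)]
print(grad)   # [15 - 3*x3**2, 6*x2**2, -6*x1*x3]

# Hessian: the matrix of second partial derivatives
print(sp.hessian(f, (x1, x2, x3)))
# Matrix([[0, 0, -6*x3], [0, 12*x2, 0], [-6*x3, 0, -6*x1]])
```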
Unconstrained Optimization
The optimization procedure for multivariable functions is:
1. Set the gradient of the function equal to zero and solve the resulting system to obtain candidate points.
2. Obtain the Hessian of the function and evaluate it at each of the candidate points.
• If the result is “positive definite” (defined later), then the point is a local minimum.
• If the result is “negative definite” (defined later), then the point is a local maximum.
Positive/Negative Definite
• A matrix is “positive definite” if all of the eigenvalues of the matrix are positive (> 0).
• A matrix is “negative definite” if all of the eigenvalues of the matrix are negative (< 0).
Unconstrained NLP Example
Minimize f(x) = x_1^2 + x_2^2 + x_3^2 - x_1 x_2 - x_2 x_3 + x_1 + x_3.
Setting \nabla f(x) = \begin{bmatrix} 2x_1 - x_2 + 1 & -x_1 + 2x_2 - x_3 & -x_2 + 2x_3 + 1 \end{bmatrix} = 0 gives the candidate point x = [-1, -1, -1]^T, and the Hessian is:
\nabla^2 f = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}
The eigenvalues of this matrix are:
\lambda_1 = 3.414 \quad \lambda_2 = 0.586 \quad \lambda_3 = 2
All of the eigenvalues are positive, so the Hessian is positive definite and the point
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix} is a minimum.
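Definiteness is also easy to check numerically; a minimal NumPy sketch (not from the slides):

```python
import numpy as np

# Hessian of the example function (constant, since f is quadratic)
H = np.array([[ 2., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  2.]])

w = np.linalg.eigvalsh(H)   # eigenvalues of a symmetric matrix
print(w)                    # [0.58578644 2.         3.41421356]
print(np.all(w > 0))        # True -> positive definite -> local minimum
```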
Unconstrained NLP Example
Unlike in Linear Programming, unless we know the shape of the function being minimized, or can determine whether it is convex, we cannot tell whether this point is the global minimum or whether the function takes on smaller values elsewhere.
Method of Solution
• In the previous example, when we set the
gradient equal to zero, we had a system of
3 linear equations & 3 unknowns.
• For other problems, these equations could
be nonlinear.
• Thus, the problem can become trying to
solve a system of nonlinear equations,
which can be very difficult.
Method of Solution
• To avoid this difficulty, NLP problems are usually solved numerically.
• Consider Newton’s Method for solving a single nonlinear equation f(x) = 0:
x_{k+1} = x_k - \dfrac{f(x_k)}{f'(x_k)}
where k is the current iteration.
• Iteration is continued until |x_{k+1} - x_k| < \varepsilon, where \varepsilon is some specified tolerance.
Newton’s Method Diagram
[Figure: the tangent to f(x) at x_k crosses the x-axis at x_{k+1}, which lies closer to the root x^* than x_k does.]
The slope of the tangent line is f'(x_k) = \dfrac{f(x_k) - 0}{x_k - x_{k+1}}
So, x_{k+1} = x_k - \dfrac{f(x_k)}{f'(x_k)}
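A minimal Python sketch of this iteration; the test function shown is a hypothetical example, not one from the slides:

```python
def newton_1d(f, f_prime, x0, eps=1e-8, max_iter=100):
    """Find a root of f(x) = 0 via x_{k+1} = x_k - f(x_k)/f'(x_k)."""
    xk = x0
    for _ in range(max_iter):
        xk1 = xk - f(xk) / f_prime(xk)
        if abs(xk1 - xk) < eps:      # stop when |x_{k+1} - x_k| < eps
            return xk1
        xk = xk1
    return xk

# Hypothetical example: the positive root of x^2 - 2 = 0
print(newton_1d(lambda x: x**2 - 2, lambda x: 2*x, x0=1.0))  # ~1.41421356
```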
Steepest Descent Method
• Because the gradient gives the direction in which the function changes most rapidly at a point, using the gradient (or, for minimization, the negative gradient) as the search direction helps reduce the number of iterations needed.
[Figure: contours of f(x) at f = 5, 20, and 25 in the (x_1, x_2) plane. At the point x_k, the gradient \nabla f(x_k) points uphill across the contours, while the search direction -\nabla f(x_k) points downhill toward lower contour values.]
Steepest Descent Method Steps
So the steps of the Steepest Descent
Method are:
1. Choose an initial point x^0.
2. Calculate the gradient \nabla f(x^k), where k is the iteration number.
3. Calculate the search vector: d^k = -\nabla f(x^k).
4. Calculate the next point: x^{k+1} = x^k + \lambda_k d^k, where the step size \lambda_k is found by a single-variable optimization of f(x^k + \lambda_k d^k) (as in the worked example below).
5. Evaluate \|x^{k+1} - x^k\| < \varepsilon_1 for convergence. Or, use another tolerance \varepsilon_2 and evaluate \|\nabla f(x^k)\| < \varepsilon_2 for convergence.
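These steps can be sketched compactly in Python. The line search for λ_k uses SciPy’s minimize_scalar as a convenience; the slides do not prescribe a particular single-variable method:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def steepest_descent(f, grad, x0, eps2=1e-6, max_iter=1000):
    """Steps 1-5 above: iterate x_{k+1} = x_k + lambda_k * d_k with d_k = -grad f(x_k)."""
    xk = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = grad(xk)
        if np.linalg.norm(g) < eps2:                     # step 5: ||grad f(x_k)|| test
            break
        d = -g                                           # step 3: search vector
        lam = minimize_scalar(lambda t: f(xk + t*d)).x   # step 4: find lambda_k
        xk = xk + lam*d
    return xk

# The quadratic consistent with the gradient used in the example below
f = lambda x: x[0]**2 + x[1]**2 + x[2]**2 - x[0]*x[1] - x[1]*x[2] + x[0] + x[2]
grad = lambda x: np.array([2*x[0] - x[1] + 1,
                           -x[0] + 2*x[1] - x[2],
                           -x[1] + 2*x[2] + 1])
print(steepest_descent(f, grad, [0, 0, 0]))   # -> approximately [-1, -1, -1]
```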
Convergence
• These two criteria can be used for any of
the multivariable optimization methods
discussed here
Steepest Descent Example
Let’s pick x^0 = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix}^T. The gradient of the function is:
\nabla f(x) = \begin{bmatrix} 2x_1 - x_2 + 1 & -x_1 + 2x_2 - x_3 & -x_2 + 2x_3 + 1 \end{bmatrix}
d^0 = -\nabla f(x^0) = -\begin{bmatrix} 2(0) - 0 + 1 & -0 + 2(0) - 0 & -0 + 2(0) + 1 \end{bmatrix} = -\begin{bmatrix} 1 & 0 & 1 \end{bmatrix} = \begin{bmatrix} -1 & 0 & -1 \end{bmatrix}
x^1 = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix} + \lambda_0 \begin{bmatrix} -1 & 0 & -1 \end{bmatrix} = \begin{bmatrix} -\lambda_0 & 0 & -\lambda_0 \end{bmatrix}
Inserting this into the original function:
f(x^1) = 2(\lambda_0)^2 - 2\lambda_0
\frac{df(x^1)}{d\lambda_0} = 4\lambda_0 - 2 = 0 \;\Rightarrow\; \lambda_0 = \frac{1}{2}
x^1 = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix} + \frac{1}{2}\begin{bmatrix} -1 & 0 & -1 \end{bmatrix} = \begin{bmatrix} -\frac{1}{2} & 0 & -\frac{1}{2} \end{bmatrix}
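This line-search calculation can be verified symbolically; a small SymPy sketch of the first iteration (an editorial aside, not from the slides):

```python
import sympy as sp

lam0 = sp.symbols('lambda_0')
x = sp.Matrix([0, 0, 0]) + lam0*sp.Matrix([-1, 0, -1])   # x^1 = x^0 + lambda_0 * d^0
f = (x[0]**2 + x[1]**2 + x[2]**2
     - x[0]*x[1] - x[1]*x[2] + x[0] + x[2])
print(sp.expand(f))                                 # 2*lambda_0**2 - 2*lambda_0
print(sp.solve(sp.Eq(sp.diff(f, lam0), 0), lam0))   # [1/2]
```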
Steepest Descent Example
Take the negative gradient to find the next
search direction:
d^1 = -\nabla f(x^1) = -\begin{bmatrix} -1 - 0 + 1 & \frac{1}{2} + 0 + \frac{1}{2} & -0 - 1 + 1 \end{bmatrix} = -\begin{bmatrix} 0 & 1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & -1 & 0 \end{bmatrix}
Steepest Descent Example
Update the iteration formula:
x^2 = \begin{bmatrix} -\frac{1}{2} & 0 & -\frac{1}{2} \end{bmatrix} + \lambda_1 \begin{bmatrix} 0 & -1 & 0 \end{bmatrix} = \begin{bmatrix} -\frac{1}{2} & -\lambda_1 & -\frac{1}{2} \end{bmatrix}
Steepest Descent Example
Insert into the original function & take the
derivative so that we can find 1:
f(x^2) = \frac{1}{4} + (\lambda_1)^2 + \frac{1}{4} - \frac{\lambda_1}{2} - \frac{\lambda_1}{2} - \frac{1}{2} - \frac{1}{2} = (\lambda_1)^2 - \lambda_1 - \frac{1}{2}
\frac{df(x^2)}{d\lambda_1} = 2\lambda_1 - 1
Steepest Descent Example
Now we can set the derivative equal to zero and solve for λ_1:
2\lambda_1 - 1 = 0 \;\Rightarrow\; \lambda_1 = \frac{1}{2}
Steepest Descent Example
Now, calculate x2:
x^2 = \begin{bmatrix} -\frac{1}{2} & 0 & -\frac{1}{2} \end{bmatrix} + \frac{1}{2}\begin{bmatrix} 0 & -1 & 0 \end{bmatrix} = \begin{bmatrix} -\frac{1}{2} & 0 - \frac{1}{2} & -\frac{1}{2} \end{bmatrix}
x^2 = \begin{bmatrix} -\frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} \end{bmatrix}
Steepest Descent Example
d^2 = -\nabla f(x^2) = -\begin{bmatrix} -1 + \frac{1}{2} + 1 & \frac{1}{2} - 1 + \frac{1}{2} & \frac{1}{2} - 1 + 1 \end{bmatrix} = -\begin{bmatrix} \frac{1}{2} & 0 & \frac{1}{2} \end{bmatrix}
d^2 = \begin{bmatrix} -\frac{1}{2} & 0 & -\frac{1}{2} \end{bmatrix}
So,
x^3 = \begin{bmatrix} -\frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} \end{bmatrix} + \lambda_2 \begin{bmatrix} -\frac{1}{2} & 0 & -\frac{1}{2} \end{bmatrix} = \begin{bmatrix} -\frac{1}{2}(\lambda_2 + 1) & -\frac{1}{2} & -\frac{1}{2}(\lambda_2 + 1) \end{bmatrix}
Steepest Descent Example
Find λ_2:
f(x^3) = \frac{1}{2}(\lambda_2 + 1)^2 - \frac{3}{2}(\lambda_2 + 1) + \frac{1}{4}
\frac{df(x^3)}{d\lambda_2} = (\lambda_2 + 1) - \frac{3}{2} = 0 \;\Rightarrow\; \lambda_2 = \frac{1}{2}
x^3 = \begin{bmatrix} -\frac{3}{4} & -\frac{1}{2} & -\frac{3}{4} \end{bmatrix}
Steepest Descent Example
Find the next search direction:
d^3 = -\nabla f(x^3) = -\begin{bmatrix} 0 & \frac{1}{2} & 0 \end{bmatrix} = \begin{bmatrix} 0 & -\frac{1}{2} & 0 \end{bmatrix}
x^4 = \begin{bmatrix} -\frac{3}{4} & -\frac{1}{2} & -\frac{3}{4} \end{bmatrix} + \lambda_3 \begin{bmatrix} 0 & -\frac{1}{2} & 0 \end{bmatrix} = \begin{bmatrix} -\frac{3}{4} & -\frac{1}{2}(\lambda_3 + 1) & -\frac{3}{4} \end{bmatrix}
Steepest Descent Example
Find λ_3:
f(x^4) = \frac{1}{4}(\lambda_3 + 1)^2 - \frac{3}{4}(\lambda_3 + 1) - \frac{3}{8}
\frac{df(x^4)}{d\lambda_3} = \frac{1}{2}(\lambda_3 + 1) - \frac{3}{4} = 0
\lambda_3 = \frac{1}{2}
Steepest Descent Example
So, x4 becomes:
x^4 = \begin{bmatrix} -\frac{3}{4} & -\frac{1}{2} & -\frac{3}{4} \end{bmatrix} + \frac{1}{2}\begin{bmatrix} 0 & -\frac{1}{2} & 0 \end{bmatrix}
x^4 = \begin{bmatrix} -\frac{3}{4} & -\frac{3}{4} & -\frac{3}{4} \end{bmatrix}
Steepest Descent Example
The next search direction:
d^4 = -\nabla f(x^4) = -\begin{bmatrix} \frac{1}{4} & 0 & \frac{1}{4} \end{bmatrix} = \begin{bmatrix} -\frac{1}{4} & 0 & -\frac{1}{4} \end{bmatrix}
x^5 = \begin{bmatrix} -\frac{3}{4} & -\frac{3}{4} & -\frac{3}{4} \end{bmatrix} + \lambda_4 \begin{bmatrix} -\frac{1}{4} & 0 & -\frac{1}{4} \end{bmatrix} = \begin{bmatrix} -\frac{1}{4}(3 + \lambda_4) & -\frac{3}{4} & -\frac{1}{4}(3 + \lambda_4) \end{bmatrix}
Steepest Descent Example
Find λ_4:
f(x^5) = \frac{1}{8}(3 + \lambda_4)^2 - \frac{7}{8}(3 + \lambda_4) + \frac{9}{16}
\frac{df(x^5)}{d\lambda_4} = \frac{1}{4}(3 + \lambda_4) - \frac{7}{8} = 0
\lambda_4 = \frac{1}{2}
Steepest Descent Example
Update x5:
x^5 = \begin{bmatrix} -\frac{3}{4} & -\frac{3}{4} & -\frac{3}{4} \end{bmatrix} + \frac{1}{2}\begin{bmatrix} -\frac{1}{4} & 0 & -\frac{1}{4} \end{bmatrix}
x^5 = \begin{bmatrix} -\frac{7}{8} & -\frac{3}{4} & -\frac{7}{8} \end{bmatrix}
Steepest Descent Example
Let’s check to see if the convergence criterion is satisfied.
Evaluate \|\nabla f(x^5)\|:
\nabla f(x^5) = \begin{bmatrix} 0 & \frac{1}{4} & 0 \end{bmatrix}
\|\nabla f(x^5)\| = \sqrt{0^2 + \left(\frac{1}{4}\right)^2 + 0^2} = 0.25
Steepest Descent Example
So, \|\nabla f(x^5)\| = 0.25. The gradient norm is shrinking with every iteration, but it is not yet within a tight tolerance, so the iterations would continue in the same fashion.
Notice that
x^5 = \begin{bmatrix} -\frac{7}{8} & -\frac{3}{4} & -\frac{7}{8} \end{bmatrix}
is steadily approaching the true minimum x^* = \begin{bmatrix} -1 & -1 & -1 \end{bmatrix}, though only slowly. Newton’s Method, developed next, reaches it much faster.
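A quick NumPy check of this final gradient evaluation (an editorial verification, not part of the slides):

```python
import numpy as np

grad = lambda x: np.array([2*x[0] - x[1] + 1,
                           -x[0] + 2*x[1] - x[2],
                           -x[1] + 2*x[2] + 1])

x5 = np.array([-7/8, -3/4, -7/8])
print(grad(x5))                   # [0.   0.25 0.  ]
print(np.linalg.norm(grad(x5)))   # 0.25
```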
Multivariable Newton’s Method
Expanding the gradient of f in a Taylor series about the current point x^0:
\nabla f(x) \approx \nabla f(x^0) + \nabla^2 f(x^0)\,(x - x^0)
We can set the right-hand side equal to zero and rearrange to give:
x = x^0 - [\nabla^2 f(x^0)]^{-1}\,\nabla f(x^0)
We can generalize this equation to give an iterative expression for Newton’s Method:
x^{k+1} = x^k - [\nabla^2 f(x^k)]^{-1}\,\nabla f(x^k)
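A minimal sketch of this iteration in Python. It solves the linear system \nabla^2 f(x^k)\,\Delta = \nabla f(x^k) instead of forming the inverse explicitly, which is standard numerical practice (a choice of this sketch, not of the slides):

```python
import numpy as np

def newton_method(grad, hess, x0, eps=1e-10, max_iter=50):
    """Iterate x_{k+1} = x_k - [hess f(x_k)]^{-1} grad f(x_k)."""
    xk = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(xk)
        if np.linalg.norm(g) < eps:              # gradient ~ 0: stationary point
            break
        xk = xk - np.linalg.solve(hess(xk), g)   # Newton step without explicit inverse
    return xk
```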
Newton’s Method Example
Let’s apply this to the same function as before, whose gradient is:
\nabla f(x) = \begin{bmatrix} 2x_1 - x_2 + 1 & -x_1 + 2x_2 - x_3 & -x_2 + 2x_3 + 1 \end{bmatrix}
The Hessian is:
\nabla^2 f(x) = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}
and its inverse is:
[\nabla^2 f(x)]^{-1} = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}^{-1} = \begin{bmatrix} \frac{3}{4} & \frac{1}{2} & \frac{1}{4} \\ \frac{1}{2} & 1 & \frac{1}{2} \\ \frac{1}{4} & \frac{1}{2} & \frac{3}{4} \end{bmatrix}
Newton’s Method Example
So, pick x^0 = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix}^T. Then:
\nabla f(x^0) = \begin{bmatrix} 1 & 0 & 1 \end{bmatrix}^T
Newton’s Method Example
So, the new x is:
x^1 = x^0 - [\nabla^2 f(x^0)]^{-1}\,\nabla f(x^0) = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} - \begin{bmatrix} \frac{3}{4} & \frac{1}{2} & \frac{1}{4} \\ \frac{1}{2} & 1 & \frac{1}{2} \\ \frac{1}{4} & \frac{1}{2} & \frac{3}{4} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}
x^1 = \begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix}
Newton’s Method Example
Now calculate the new gradient:
\nabla f(x^1) = \begin{bmatrix} -2 + 1 + 1 & 1 - 2 + 1 & 1 - 2 + 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix}
The gradient is zero, so Newton’s Method has converged to the minimum x^* = [-1, -1, -1]^T in a single iteration for this quadratic function.
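Running the Newton sketch from earlier on this example reproduces the one-step convergence (a verification aside, not from the slides):

```python
import numpy as np

grad = lambda x: np.array([2*x[0] - x[1] + 1,
                           -x[0] + 2*x[1] - x[2],
                           -x[1] + 2*x[2] + 1])
hess = lambda x: np.array([[ 2., -1.,  0.],
                           [-1.,  2., -1.],
                           [ 0., -1.,  2.]])

x0 = np.zeros(3)
x1 = x0 - np.linalg.solve(hess(x0), grad(x0))   # one Newton step
print(x1)        # [-1. -1. -1.]
print(grad(x1))  # [0. 0. 0.] -> converged in one iteration
```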