
1-D OPTIMIZATION

Dhish Kumar Saxena


Professor
Department of Mechanical & Industrial Engineering
(Joint Faculty, Department of Computer Science)
IIT Roorkee
A BROAD CLASSIFICATION OF METHODS
Category: Gradient-free methods (direct search methods: use only f(X), g(X) values) vs. Gradient-based methods (involve first- and second-order derivatives)

Single-variable (n = 1); often used repeatedly as a subtask (uni-directional search) in multi-variate problem solving
∙ Gradient-free: Region-elimination (Bracketing) methods
  • Interval Halving Method
  • Fibonacci Method
  • Golden Section Method
  and Point-estimation methods
∙ Gradient-based: Interpolation methods
  • Newton Raphson Method
  • Bisection Method
  • Secant Method

Multi-variate (n ≥ 2), Unconstrained
∙ Gradient-free:
  • Simplex Search Method
  • Hooke-Jeeves Pattern Search
  • Powell's Conjugate Direction
∙ Gradient-based:
  • Cauchy's Steepest Descent Method
  • Newton's Method
  • Marquardt's Method
  • Conjugate Gradient Method
  • Variable Metric Methods: Davidon-Fletcher-Powell, Broyden-Fletcher-Goldfarb-Shanno

Multi-variate (n ≥ 2), Constrained
∙ Linearized Search Techniques
  • Frank Wolfe Method
  • Cutting Plane Method
∙ Primal Methods
  • Feasible Direction Methods
  • Active Set Methods
  • Gradient Projection Method
  • Reduced Gradient Method
∙ Penalty Function Methods
∙ Primal-Dual Methods
  • First-order (steepest descent)
  • Conjugate directions
  • Modified Newton
SINGLE-VARIABLE: GRADIENT-FREE METHODS

The core assumption: Unimodality. The core concept: Bracketing.

Definition: Unimodal function

Consider: Minimize f(x); x ∈ R.
Let x* be a minimum point of f(x), such that x* ∈ [a b].

A function is said to be unimodal in [a b] if for a ≤ x1 ≤ x2 ≤ b the following holds:
∙ if x* < x1 < x2 (x* lower than the lower of the two points), then f(x*) < f(x1) < f(x2)
∙ if x* > x2 > x1 (x* higher than the higher of the two points), then f(x*) < f(x2) < f(x1)

Since x* is the lowest point, the function should be falling as it approaches x*, regardless of the direction of approach:
the f value at a point near x* should be lower than that at a farther point.
SINGLE-VARIABLE: GRADIENT-FREE METHODS

The core assumption: Unimodality. The core concept: Bracketing.

The Bracketing Concept

∙ Start with an interval [a b] in which the optimum is expected to lie, the Interval of Uncertainty (I.o.U): prior knowledge
∙ Based on function values at the bounds & intermediate point(s), and under the assumption of unimodality: keep eliminating parts of the I.o.U

Would one intermediate point help? Consider a < λ < b.
Can the function values at a, λ, b help reduce the I.o.U [a b] to either [a λ] or [λ b]?
NO: the function may be such that the optimum could lie in either of the two segments.

Would two intermediate points help? Consider a < λ < μ < b.
Can the function values help reduce the I.o.U?
The answer isn't straightforward …more deliberation is required.
SINGLE-VARIABLE: GRADIENT-FREE METHODS

The core assumption: Unimodality. The core concept: Bracketing.

The Bracketing Concept: would two intermediate points (a < λ < μ < b) help reduce the I.o.U?

Case 1: f(λ) < f(μ)
∙ under unimodality: function values ONLY ON ONE side of μ can be smaller than f(μ)
∙ under unimodality: x* ∉ [μ b]
∙ the I.o.U can be reduced to [a μ]
f(μ) > f(λ) ⟹ [μ b] can be eliminated

Case 2: f(λ) > f(μ)
∙ under unimodality: function values ONLY ON ONE side of λ can be smaller than f(λ)
∙ under unimodality: x* ∉ [a λ]
∙ the I.o.U can be reduced to [λ b]
f(λ) > f(μ) ⟹ [a λ] can be eliminated

In general: the interval between a bound point and the intermediate point with the higher f-value (on that bound's side) can be eliminated.
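The elimination rule above translates directly into code. Below is a minimal Python sketch, assuming f is unimodal on [a, b]; the name eliminate_step and the example function are illustrative, not from the slides.

```python
# Minimal sketch of the two-point elimination rule under unimodality.
def eliminate_step(a, lam, mu, b, f):
    """Given a < lam < mu < b, return the reduced Interval of Uncertainty."""
    if f(lam) < f(mu):
        return a, mu        # x* cannot lie in [mu, b]: keep [a, mu]
    else:
        return lam, b       # f(lam) >= f(mu): x* cannot lie in [a, lam], keep [lam, b]

# Example with a unimodal function whose minimum is at x = 2
f = lambda x: (x - 2.0) ** 2
print(eliminate_step(0.0, 1.0, 2.5, 4.0, f))   # -> (1.0, 4.0)
```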
SINGLE-VARIABLE: GRADIENT-FREE METHODS
Interval Halving Method: Dichotomous Search

(figure: λk and μk placed ϵ on either side of the midpoint mk of [ak bk])

∙ For the current I.o.U given by [ak bk], determine the midpoint mk = (ak + bk)/2; set λk = mk − ϵ; μk = mk + ϵ
∙ If f(μk) > f(λk): eliminate [μk bk]; shift the upper bound from bk to μk ⟹ bk+1 = μk
∙ Else (f(λk) ≥ f(μk)): eliminate [ak λk]; shift the lower bound from ak to λk ⟹ ak+1 = λk
∙ This process is repeated until bcurrent − acurrent ≤ l, where l is a user-defined threshold on the I.o.U

∙ Key highlights:
  ∙ ≈ 50 % reduction in the I.o.U in each iteration
  ∙ Two function evaluations per iteration
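A minimal Python sketch of the dichotomous search loop just described; eps plays the role of ϵ and l is the user-defined threshold on the I.o.U, while the function name and the example objective are illustrative.

```python
# Minimal sketch of dichotomous (interval-halving) search.
def dichotomous_search(f, a, b, eps=1e-4, l=1e-3):
    while (b - a) > l:
        m = 0.5 * (a + b)            # midpoint of the current I.o.U
        lam, mu = m - eps, m + eps
        if f(mu) > f(lam):
            b = mu                   # eliminate [mu, b]
        else:
            a = lam                  # eliminate [a, lam]
    return a, b                      # final interval of uncertainty (length <= l)

print(dichotomous_search(lambda x: (x - 2.0) ** 2, 0.0, 5.0))   # brackets x* = 2
```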
SINGLE-VARIABLE: GRADIENT-FREE METHODS
Fibonacci Search Method
Say, somehow, we have a way to determine λ1 and μ1 … for now: located symmetrically w.r.t. a and b, respectively, so that δ1 = | b1 − μ1 | = | λ1 − a1 | and δ2 = | b1 − λ1 | = | μ1 − a1 |

(figure: nested intervals [a1 λ1 μ1 b1] ⊃ [a2 λ2 μ2 b2] ⊃ [a3 λ3 μ3 b3], of lengths I1, I2, I3, with f(μ1) > f(λ1) and f(μ2) > f(λ2))

I1 = b1 − a1 = (μ1 − a1) + (b1 − μ1)
Since f(μ1) > f(λ1), [μ1 b1] is eliminated and I2 = μ1 − a1:
I1 = I2 + (b1 − μ1)
I1 = I2 + (λ1 − a1)                   since | b1 − μ1 | = | λ1 − a1 |
I1 = I2 + (μ2 − a2) = I2 + (b3 − a3)  since λ1 is reused as μ2, and with f(μ2) > f(λ2) the next interval is [a3 b3] = [a2 μ2]
I1 = I2 + I3
I2 = I3 + I4
⋮
In = In+1 + In+2 ⟹ we have n equations in n+2 interval lengths
As I1 is known, and we can let In+2 = 0 ⟹ n equations in n unknowns

Let us observe two sets of patterns:

Of λ & μ with intervals/bounds:
δ1 = | b1 − μ1 | = | λ1 − a1 | = I3
δ2 = | b1 − λ1 | = | μ1 − a1 | = I2
λ1 = b1 − I2 = a1 + I3 ⟹ λk = bk − Ik+1 = ak + Ik+2
μ1 = a1 + I2 = b1 − I3 ⟹ μk = ak + Ik+1 = bk − Ik+2
(an intermediate point can be expressed w.r.t. its neighbouring bound point ± the next-to-next interval, or w.r.t. the farther bound point ± the next interval; "+" if the intermediate point lies above the bound point, "−" otherwise)

Among intervals (with In+2 = 0):
In+1 = 1 In
In   = 1 In
In−1 = In + In+1 ⟹ In−1 = 2 In
In−2 = In−1 + In ⟹ In−2 = 3 In
In−3 = In−2 + In−1 ⟹ In−3 = 5 In
SINGLE-VARIABLE: GRADIENT-FREE METHODS
Fibonacci Search Method
Linking Intervals to Fibonacci Numbers

An interval expressed as a multiple of In:

Interval:          In+2  In+1  In   In−1  In−2  In−3  In−4  In−5  In−6  In−7  In−8  In−9
Multiple of In:     0     1    1     2     3     5     8    13    21    34    55    89
Fibonacci series:         F0   F1    F2    F3    F4    F5    F6    F7    F8    F9    …

In+1 = 1 In ⟹ In+1 = F0 In
In   = 1 In ⟹ In   = F1 In
In−1 = 2 In ⟹ In−1 = F2 In
In−2 = 3 In ⟹ In−2 = F3 In
In−3 = 5 In ⟹ In−3 = F4 In
⋮
⟹ I1 = Fn In

In = In+1/F0 = In/F1 = In−1/F2 = ⋯ = I1/Fn
(the ratios Ij/Fk are all equal as long as j + k = n + 1)

Example: Let the initial interval be [0 3] (I1 = 3), and let us say n = 6 iterations of the Fibonacci Search Method are to be applied. Determine λ1 and μ1.

λ1 = b1 − I2 = a1 + I3 ⟹ λk = bk − Ik+1 = ak + Ik+2
μ1 = a1 + I2 = b1 − I3 ⟹ μk = ak + Ik+1 = bk − Ik+2

I2/F5 = I1/F6 ⟹ I2 = I1 ⋅ F5/F6 = 3 ⋅ 8/13 = 1.846 ⟹ λ1 = b1 − I2 = 1.154; μ1 = a1 + I2 = 1.846
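The same arithmetic as a short Python sketch; the helper fib() encodes the slide's convention F0 = F1 = 1, and its name is mine.

```python
# Compute lambda_1 and mu_1 for Fibonacci search with n = 6 on [0, 3].
def fib(k):                      # slide convention: F0 = 1, F1 = 1, F2 = 2, F3 = 3, ...
    a, b = 1, 1
    for _ in range(k):
        a, b = b, a + b
    return a

a1, b1, n = 0.0, 3.0, 6
I2 = (b1 - a1) * fib(n - 1) / fib(n)   # I2 = I1 * F5 / F6 = 3 * 8 / 13 = 1.846...
lam1 = b1 - I2                         # 1.1538...
mu1  = a1 + I2                         # 1.8462...
print(I2, lam1, mu1)
```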
SINGLE-VARIABLE: GRADIENT-FREE METHODS
Fibonacci Search Method: an example
Minimize f(x) = 0.65 − 0.75/(1 + x²) − 0.65 x tan⁻¹(1/x) in the interval [0 3] by the Fibonacci Method, using n = 6

Step-1(a): a1 = 0; b1 = 3, and the need is to determine λ1 and μ1 based on n = 6

I2/F5 = I1/F6 ⟹ I2 = I1 ⋅ F5/F6 ⟹ I2 = 3 ⋅ 8/13 = 1.846154
⟹ λ1 = b1 − I2 = 1.153846, f(λ1) = −0.207270
⟹ μ1 = a1 + I2 = 1.846154, f(μ1) = −0.115843

Step-1(b): Re-define the I.o.U: f(λ1) < f(μ1) ⟹ [μ1 b1] can be eliminated

Step-2(a): a2 = 0; b2 = μ1 = 1.846154; μ2 = λ1 = 1.153846, and the need is to determine λ2

λ2 − a2 = b2 − μ2 ⟹ λ2 = 0.692308 ⟹ f(λ2) = −0.291364; f(μ2) = −0.207270

Step-2(b): Re-define the I.o.U: f(λ2) < f(μ2) ⟹ [μ2 b2] can be eliminated

Keep repeating it:
∙ If you keep repeating it, you will arrive at a reduced Interval of Uncertainty (not just its length, but also its location)
∙ Else, without going through the above, you can directly determine the length of the final interval: I6 = I1/F6
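A Python sketch of the full search applied to this example. To stay short it re-evaluates both interior points at every iteration (the method described above reuses one point, so it needs only one new evaluation per step), and it handles the final step, where the two interior points coincide at the midpoint, with a small eps shift; these simplifications and all names are mine, not from the slides.

```python
import math

def fib(k):                                   # slide convention: F0 = F1 = 1, F2 = 2, ...
    a, b = 1, 1
    for _ in range(k):
        a, b = b, a + b
    return a

def fibonacci_search(f, a, b, n, eps=1e-6):
    """Fibonacci search with parameter n: reduces [a, b] to a final interval of length ~ (b - a)/Fn."""
    for k in range(1, n):
        I = b - a
        lam = a + I * fib(n - 1 - k) / fib(n + 1 - k)   # lam_k = a_k + I_{k+2}
        mu  = a + I * fib(n - k)     / fib(n + 1 - k)   # mu_k  = a_k + I_{k+1}
        if mu == lam:                                   # last step: points coincide at the midpoint
            mu = lam + eps
        if f(lam) < f(mu):
            b = mu                                      # eliminate [mu, b]
        else:
            a = lam                                     # eliminate [a, lam]
    return a, b

# Objective of the example (valid for x > 0; the search only evaluates interior points)
f = lambda x: 0.65 - 0.75 / (1 + x * x) - 0.65 * x * math.atan(1 / x)
print(fibonacci_search(f, 0.0, 3.0, 6))   # ~ (0.4615, 0.6923): length I1/F6 = 3/13, bracketing x* ~ 0.48
```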
SINGLE-VARIABLE: GRADIENT-FREE METHODS
Golden Section Search Method
The major limitation of Fibonacci search is that one needs to know n a priori, even to compute λ1 and μ1:
In−k/Fk+1 = I1/Fn ⟹ (for n = 6) I2 = I1 ⋅ F5/F6

Let us revisit the ratio of successive Fibonacci numbers, Fk+1/Fk:
1, 1.5, 1.66, 1.60, 1.625, 1.615, 1.619, 1.618, 1.618, … ⟹ lim k→∞ Fk+1/Fk = 1.618

A more formal interpretation

We understand: Ik = Ik+1 + Ik+2. What if successive intervals bear a constant ratio, Ik/Ik+1 = Ik+1/Ik+2 = r?
Ik/Ik+2 = Ik+1/Ik+2 + 1 ⟹ (Ik/Ik+1) ⋅ (Ik+1/Ik+2) = (Ik+1/Ik+2) + 1 ⟹ r² = r + 1 ⟹ r = 1.618

Lengths of intervals: {I1, I1/r, I1/r², I1/r³, ⋯} ⟹ Ik^GS = I1/r^(k−1)

It can be shown that: Ik^GS ≈ 1.17 Ik^FBS


∙ Gain some; lose some
∙ Gain independence of n to determine Ik
∙ Lose speed of reduction in Ik
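For comparison, a minimal Python sketch of golden-section search: interior points are placed using the constant ratio r ≈ 1.618, so no n needs to be fixed up front; names and the example objective are illustrative.

```python
# Minimal sketch of golden-section search.
def golden_section_search(f, a, b, l=1e-4):
    r = (1 + 5 ** 0.5) / 2              # golden ratio, the positive root of r^2 = r + 1
    lam = b - (b - a) / r               # interior points at the constant ratio
    mu  = a + (b - a) / r
    f_lam, f_mu = f(lam), f(mu)
    while (b - a) > l:
        if f_lam < f_mu:                # eliminate [mu, b]; old lam becomes the new mu
            b, mu, f_mu = mu, lam, f_lam
            lam = b - (b - a) / r
            f_lam = f(lam)
        else:                           # eliminate [a, lam]; old mu becomes the new lam
            a, lam, f_lam = lam, mu, f_mu
            mu = a + (b - a) / r
            f_mu = f(mu)
    return a, b

print(golden_section_search(lambda x: (x - 2.0) ** 2, 0.0, 5.0))   # brackets x* = 2
```

Because one interior point is always reused, only one new function evaluation is needed per iteration.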
SINGLE-VARIABLE: GRADIENT-BASED METHODS

Bisection Method

(figure: the points λk, μk on [ak mk bk], and the derivative value f′(mk) at the midpoint mk)

∙ For the current I.o.U [ak bk], evaluate the derivative at the midpoint mk = (ak + bk)/2
∙ If f′(mk) > 0: reduce the I.o.U to [ak mk]
∙ Else: reduce the I.o.U to [mk bk]
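A minimal Python sketch of this derivative-sign bisection; fprime stands for a user-supplied derivative f′ (analytical or finite-difference) and, like the other names, is an assumption.

```python
# Minimal sketch of bisection on the derivative sign.
def bisection_min(fprime, a, b, l=1e-6):
    """Assumes f is unimodal on [a, b], i.e. fprime(a) < 0 < fprime(b)."""
    while (b - a) > l:
        m = 0.5 * (a + b)
        if fprime(m) > 0:
            b = m               # minimum lies to the left: reduce I.o.U to [a, m]
        else:
            a = m               # reduce I.o.U to [m, b]
    return 0.5 * (a + b)

# f(x) = (x - 2)^2  =>  f'(x) = 2 (x - 2)
print(bisection_min(lambda x: 2 * (x - 2.0), 0.0, 5.0))   # ~ 2.0
```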


SINGLE-VARIABLE: GRADIENT-BASED METHODS

Newton Raphson Method: Basic Version

Basically a root-finding method (a root is where the function crosses the x-axis):
∙ Start with an initial point xk
∙ Find the tangent to the function at xk and locate xk+1 as the point where this tangent hits the x-axis: xk+1 = xk − f(xk)/f′(xk)
∙ Repeat this process until f(xk) ≈ 0

If the initial point is sufficiently close to the root, then this method iteratively leads to the root.

Newton Raphson Method: for single variable optimization

∙ The first-order necessary condition for optimality is f′(x) = 0
∙ The N-R Method is used to find the root of f′(x), with a caveat:
  instead of the original function (f), its quadratic approximation (q) is used


SINGLE-VARIABLE: GRADIENT-BASED METHODS

Newton Raphson Method: for single variable optimization


∙ At any iteration k, construct a quadratic model q(x) which agrees with f at xk up to the 2nd derivative:
  q(x) = f(xk) + h f′(xk) + (h²/2) f′′(xk), where h = x − xk
       = f(xk) + (x − xk) f′(xk) + ((x − xk)²/2) f′′(xk)

∙ xk+1 is found by minimizing the quadratic model: q′(xk+1) = 0 ⟹ f′(xk) + (xk+1 − xk) f′′(xk) = 0

  xk+1 = xk − f′(xk)/f′′(xk)   …repeated until | f′(xk) | ≤ ϵ

Works only if:
∙ f′′(xk) is positive ⟹ positive curvature at every iterate
∙ the initial point is sufficiently close to the root of f′ (i.e., the optimum)
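A minimal Python sketch of the update xk+1 = xk − f′(xk)/f′′(xk), iterated until |f′(xk)| ≤ ϵ; the analytic derivatives and the test function f(x) = (x − 2)² + e^x are illustrative choices, not from the slides.

```python
import math

# Minimal sketch of Newton's method for 1-D minimization.
def newton_1d(fprime, fsecond, x0, eps=1e-8, max_iter=50):
    x = x0
    for _ in range(max_iter):
        if abs(fprime(x)) <= eps:          # first-order condition satisfied
            break
        x = x - fprime(x) / fsecond(x)     # minimize the local quadratic model of f
    return x

# f(x) = (x - 2)^2 + exp(x):  f'(x) = 2 (x - 2) + exp(x),  f''(x) = 2 + exp(x) > 0
fp  = lambda x: 2 * (x - 2.0) + math.exp(x)
fpp = lambda x: 2.0 + math.exp(x)
print(newton_1d(fp, fpp, x0=1.0))          # ~ 0.8408, where f'(x) = 0
```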
















Thank You
