Study On Lagrangian Methods
In this paper we consider methods for solving a general constrained optimization problem. We discuss penalty
methods for problems with equality constraints and barrier methods for problems with inequality constraints; these
are easy to combine. We then consider the related, more powerful but more complex class of augmented Lagrangian methods.
Received: Feb 17, 2022; Accepted: Mar 07, 2022; Published: Apr 18, 2022; Paper Id.: IJMCARJUN20224
INTRODUCTION
A general constrained optimization problem can be given as

$\min_{x \in \mathbb{R}^n} f(x) \quad \text{s.t. } c_i(x) = 0,\ i \in \mathcal{E}, \qquad c_i(x) \ge 0,\ i \in \mathcal{I}.$
Newton’s method, which is much less affected by ill-conditioning provided the linear system at each
iteration is solved directly, is affected by having its domain of attraction shrink; hence the importance of the
continuation technique.
1 Penalty Method

Consider first the problem with only equality constraints,

$\min_x f(x)$  (1a)

$\text{s.t. } c_i(x) = 0, \quad i = 1, \dots, m.$  (1b)

Define the constraint Jacobian

$A(x) = \big(\nabla c_1(x), \dots, \nabla c_m(x)\big)^T$  (1c)

and the Lagrangian

$L(x, \lambda) = f(x) - \sum_{i=1}^m \lambda_i c_i(x),$  (1d)
www.tjprc.org [email protected]
34 Dr. Updesh Kumar, Prashant Chauhan & Paras Bhatnagar
and recall that the KKT conditions require, in addition to (1b), that at $(x^*, \lambda^*)$,

$\nabla_x L = \nabla f(x) - \sum_{i=1}^m \lambda_i \nabla c_i(x) = 0.$
In the quadratic penalty method we solve a sequence of unconstrained minimization problems of the form

$\min_x \phi(x; \mu) = f(x) + \frac{1}{2\mu} \sum_{i=1}^m c_i^2(x)$  (2)

for a sequence of values $\mu_k \to 0$. We can use, for instance, the solution $x^*(\mu_{k-1})$ of the $(k-1)$st
unconstrained problem as an initial guess for the unconstrained problem (2) with $\mu_k$. This is a simple continuation
technique.
It is hard to imagine anything simpler to intuit. Unfortunately, however, the problem (2) becomes ill-conditioned
as $\mu$ gets small. Both the BFGS and CG methods are severely affected by this. An algorithmic framework for such a
method reads: Given $\mu_0 > 0$, for $k = 0, 1, 2, \dots$:

Find an approximate minimizer $x_k^*$ of $\phi(\cdot\,; \mu_k)$, terminating when $\|\nabla_x \phi(x; \mu_k)\| \le \mathrm{tol}_k$
If the final convergence test is satisfied, stop
Else
Choose $\mu_{k+1} \in (0, \mu_k)$
End

The choice of how to decrease $\mu$ can depend on how difficult it has been to solve the previous subproblem, e.g.,
decreasing $\mu$ only mildly after a difficult subproblem and more aggressively after an easy one.
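To make the framework concrete, here is a minimal Python sketch of quadratic penalty continuation. The toy problem ($\min x_1 + x_2$ subject to $x_1^2 + x_2^2 = 1$), the inner solver (scipy's BFGS), and the decrease factor for $\mu$ are our own illustrative choices, not taken from the text:

```python
# Quadratic penalty continuation (sketch) for
#   min x1 + x2   s.t.   c(x) = x1^2 + x2^2 - 1 = 0,
# whose exact solution is x* = -(1, 1)/sqrt(2) with lambda* = -1/sqrt(2).
import numpy as np
from scipy.optimize import minimize

def f(x):
    return x[0] + x[1]

def c(x):
    return x[0]**2 + x[1]**2 - 1.0

def phi(x, mu):
    # phi(x; mu) = f(x) + c(x)^2 / (2 mu), as in (2)
    return f(x) + c(x)**2 / (2.0 * mu)

mus = [10.0**(-k) for k in range(7)]   # mu = 1, .1, ..., 1e-6
x = np.array([-0.5, -0.5])             # initial guess
for mu in mus:
    # warm start each subproblem from the previous minimizer (continuation)
    x = minimize(phi, x, args=(mu,), method='BFGS').x

lam = -c(x) / mus[-1]                  # multiplier estimate as in (3)
```

With the warm starts, the last and worst-conditioned subproblems begin very close to their minimizers; solving directly with $\mu = 10^{-6}$ from a cold start is noticeably less reliable.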
When comparing the gradient of the unconstrained objective function $\phi(x; \mu)$ of (2),

$\nabla_x \phi(x; \mu) = \nabla f(x) + \frac{1}{\mu} \sum_{i=1}^m c_i(x) \nabla c_i(x),$

with the gradient of the Lagrangian $L(x, \lambda)$ of (1d), it appears that

$-\frac{c_i(x_k^*)}{\mu_k} \to \lambda_i^*, \quad i = 1, 2, \dots, m.$  (3)
Example 1

Let us consider the minimization of two objective functions under the constraint

$x^T x = 1.$

For each value of the penalty parameter $\mu$ encountered, we define the function $\phi$ of (2) and call a damped Newton solver with the line search option
and tolerance $\mathrm{tol}_k = 10^{-6}$ to solve the subproblem of minimizing $\phi$. If more than 9 iterations are needed then the
update is $\mu \leftarrow 0.7\mu$, otherwise it is $\mu \leftarrow 0.1\mu$.

For the quadratic 4-variable objective function we start with the unconstrained minimizer,
$x_0 = (1, 0, 1, 2)^T$, and obtain convergence after a total of 44 damped Newton iterations to

$x^* = (0.02477, 0.31073, 0.78876, 0.52980)^T.$

The objective value increases from the unconstrained (and infeasible) $f(x_0) = -167.28$ to $f(x^*) = -133.56$.
The penalty parameter sequence was

µ = 1, .1, .01, ..., 1.e-8,

i.e., all subproblems encountered were deemed “easy”, even though for one subproblem a damped Newton step
(i.e., step size < 1) was needed. The final approximation for the Lagrange multiplier is

$\lambda \approx (1 - x^{*T} x^*)/10^{-8} = 40.94.$
For the non-quadratic objective function we start with $x_0 = \frac{1}{\sqrt{2}}(1, 1)^T$, which satisfies the constraint. Convergence to $x^* = (0.99700, 0.07744)^T$
is reached after a total of 39 iterations. The objective value is $f(x^*) = 4.42$, up from 0 for the unconstrained minimum.
The penalty parameter sequence was

µ = 1, .1, .01, .001, 7.e-4, 7.e-5, ..., 7.e-8,

but the path to convergence was more tortuous than these numbers indicate, as the solution of the 3rd and 4th
subproblems failed. The final approximation for the Lagrange multiplier is

$\lambda \approx (1 - x^{*T} x^*)/(7 \times 10^{-8}) = 3.35.$
To understand the nature of the ill-conditioning better, note that the Hessian of $\phi$ of (2) is

$\nabla^2_{xx} \phi(x; \mu_k) = \nabla^2 f(x) + \frac{1}{\mu_k} \sum_{i=1}^m c_i(x) \nabla^2 c_i(x) + \frac{1}{\mu_k} A^T(x) A(x) \approx \nabla^2_{xx} L + \frac{1}{\mu_k} A^T A.$

The matrix $\frac{1}{\mu_k} A^T A$ has $n - m$ zero eigenvalues as well as $m$ eigenvalues of size $O(\mu_k^{-1})$. So, we have an
unholy mixture of very large and zero eigenvalues. This could give trouble even for Newton's method.
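This eigenvalue structure is easy to observe numerically. Below is a small sketch with a toy setup of our own (one circle constraint in $\mathbb{R}^2$, so $n = 2$, $m = 1$, and $\nabla^2_{xx} L$ modeled as the identity):

```python
# Eigenvalues of a model penalty Hessian  I + (1/mu) A^T A  on the circle
# x1^2 + x2^2 = 1, where A(x) = (2x1, 2x2) is the 1x2 constraint Jacobian:
# n - m = 1 eigenvalue stays O(1) while m = 1 eigenvalue grows like 1/mu.
import numpy as np

x = np.array([0.6, 0.8])                  # feasible point: x1^2 + x2^2 = 1
A = 2.0 * x.reshape(1, 2)                 # constraint Jacobian

conds = []
for mu in (1e-2, 1e-4, 1e-6):
    H = np.eye(2) + (1.0 / mu) * A.T @ A  # model Hessian of phi
    eig = np.linalg.eigvalsh(H)           # eigenvalues in ascending order
    conds.append(eig[-1] / eig[0])
# the condition number grows like 1/mu: roughly 4e2, 4e4, 4e6
```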
Fortunately, for Newton's iteration, to find the next direction $p$ we can write the linear system in augmented form
(verify!),

$\begin{pmatrix} \nabla^2 f(x) + \frac{1}{\mu_k} \sum_{i=1}^m c_i(x) \nabla^2 c_i(x) & A^T(x) \\ A(x) & -\mu_k I \end{pmatrix} \begin{pmatrix} p \\ \zeta \end{pmatrix} = \begin{pmatrix} -\nabla_x \phi(x; \mu_k) \\ 0 \end{pmatrix},$

where $\zeta$ is an auxiliary unknown. This matrix tends towards the KKT matrix and all is well in the limit. Finally, we note for later purposes that
instead of the quadratic penalty function (2) the function

$\phi_1(x; \mu) = f(x) + \frac{1}{\mu} \sum_{i=1}^m |c_i(x)|$  (4)

could be considered. This is an exact penalty function: for sufficiently small $\mu > 0$ one minimization yields the
optimal solution.

Unfortunately, that one unconstrained minimization problem turns out in general practice to be harder to solve
than applying the continuation method presented before with the quadratic penalty function (2). But the function (4) also
has other uses, namely, as a merit function. Thus, (4) can be used to assess the quality of iterates obtained by some other
method for constrained optimization.
2 Barrier Method

Consider now the problem with inequality constraints,

$\min_x f(x)$  (5a)

$\text{s.t. } c_i(x) \ge 0, \quad i = 1, \dots, m.$  (5b)

We may or may not have m > n, but we use the same notation A(x) as in (1c), dropping the requirement of
full row rank. In the log barrier method we solve a sequence of unconstrained minimization problems of the form

$\min_x \beta(x; \mu) = f(x) - \mu \sum_{i=1}^m \log c_i(x)$  (6)

for a sequence of values $\mu_k \to 0$. Starting with a feasible $x_0$ in the interior of the feasible region we always stay strictly in the interior, so this is an interior
point method. This feasibility is a valuable property if we stop before reaching the optimum.
Example 2

Consider

$\min_x x, \quad \text{s.t. } x \ge 0.$

By (6),

$\beta(x; \mu) = x - \mu \log x.$

Setting $\frac{d\beta}{dx} = 0$ we obtain $0 = 1 - \mu/x$, from which we get $x_k^* = \mu_k \to 0 = x^*$.
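This one-dimensional example is easily reproduced numerically (a sketch; the bounded scalar minimizer from scipy is our own choice of inner solver):

```python
# Log-barrier subproblems for  min x  s.t.  x >= 0:
# beta(x; mu) = x - mu*log(x) has its exact minimizer at x = mu.
import numpy as np
from scipy.optimize import minimize_scalar

def beta(x, mu):
    return x - mu * np.log(x)

xs = []
for mu in (1.0, 0.1, 0.01, 0.001):
    res = minimize_scalar(beta, args=(mu,), bounds=(1e-12, 10.0),
                          method='bounded')
    xs.append(res.x)
# the minimizers track mu: xs ~ (1, 0.1, 0.01, 0.001), approaching x* = 0
```

Note that every iterate stays strictly inside the feasible region, as the barrier blows up at its boundary.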
The continuation idea is the same as for the penalty method in Section 1. The algorithmic framework is also the same as for the penalty method, with $\beta$ replacing $\phi$
there.
Note that

$\nabla_x \beta(x; \mu) = \nabla f(x) - \sum_{i=1}^m \frac{\mu}{c_i(x)} \nabla c_i(x),$

so, comparing with the gradient of the Lagrangian as before,

$\frac{\mu_k}{c_i(x_k^*)} \to \lambda_i^*, \quad i = 1, 2, \dots, m.$  (7)

For the strictly inactive constraints ($c_i(x^*) > 0$) we get $\lambda_i^* = 0$ in (7), as we should.
Moreover, near the solution the Hessian satisfies

$\nabla^2_{xx} \beta(x; \mu_k) \approx \nabla^2_{xx} L(x^*, \lambda^*) + \sum_{i=1}^m \frac{(\lambda_i^*)^2}{\mu_k} \nabla c_i(x) \nabla c_i(x)^T \approx \nabla^2_{xx} L(x^*, \lambda^*) + \sum_{i \in \mathcal{A}(x^*)} \frac{(\lambda_i^*)^2}{\mu_k} \nabla c_i(x) \nabla c_i(x)^T,$

where $\mathcal{A}(x^*)$ is the set of active constraints. This expresses ill-conditioning exactly as in the penalty case (unless there are no active constraints at all).
Denote by $x(\mu)$ the minimizer of $\beta(\cdot\,; \mu)$ and define

$\lambda_i(\mu) = \frac{\mu}{c_i(x(\mu))}, \quad i = 1, 2, \dots, m.$  (8)

Then $\nabla_x L(x(\mu), \lambda(\mu)) = 0$. Also, $c(x(\mu)) > 0$, $\lambda(\mu) > 0$. So, all KKT conditions hold except for complementarity slackness, since instead

$c_i(x(\mu)) \lambda_i(\mu) = \mu, \quad i = 1, 2, \dots, m.$

The points

$\mathcal{C}_{pd} = \{(x(\mu), \lambda(\mu), s(\mu)) : \mu > 0\}$  (9)

form the (primal-dual) central path, where $s = c(x)$ are the slack variables. For the primal-dual formulation we can therefore write the above as

$\nabla f(x) - A^T(x) \lambda = 0,$  (10a)

$c(x) - s = 0,$  (10b)

$S \Lambda e = \mu e,$  (10c)

$\lambda \ge 0, \quad s \ge 0,$  (10d)

where, as in Section 8.2, $\Lambda$ and $S$ are diagonal matrices with entries $\lambda_i$ and $s_i$, respectively, and $e = (1, 1, \dots, 1)^T$.
The setup is therefore that of primal-dual interior point methods. A modified Newton iteration for the equalities in
(10) reads

$\begin{pmatrix} \nabla^2_{xx} L(x, \lambda) & -A^T(x) & 0 \\ A(x) & 0 & -I \\ 0 & S & \Lambda \end{pmatrix} \begin{pmatrix} \delta x \\ \delta \lambda \\ \delta s \end{pmatrix} = -\begin{pmatrix} \nabla f - A^T \lambda \\ c(x) - s \\ S \Lambda e - \mu e - r_{\lambda, s} \end{pmatrix}.$  (11)

The modification term $r_{\lambda, s}$ (which does not come out of Newton's methodology at all!) turns out to be crucial in practice. The update is then

$x \leftarrow x + \delta x, \quad \lambda \leftarrow \lambda + \delta \lambda, \quad s \leftarrow s + \delta s,$

with step lengths damped so that $\lambda$ and $s$ remain positive.
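To make (10)-(11) concrete, here is a minimal sketch for the one-variable problem $\min x$ s.t. $x \ge 0$, where $f(x) = x$, $c(x) = x$, $A = 1$ and $\nabla^2_{xx} L = 0$. The fraction-to-boundary damping rule and the simplification $r_{\lambda,s} = 0$ are our own illustrative choices:

```python
# Primal-dual Newton iteration for  min x  s.t.  x >= 0
# (slack s, multiplier lam); residuals F = (1 - lam, x - s, s*lam - mu).
# Solution: x* = 0 with lam* = 1.
import numpy as np

x, lam, s = 1.0, 1.0, 1.0            # strictly positive starting point
mu = 1.0
for _ in range(8):                   # outer loop: drive mu -> 0
    for _ in range(20):              # Newton iterations for fixed mu
        F = np.array([1.0 - lam, x - s, s * lam - mu])
        if np.linalg.norm(F) < 1e-12:
            break
        J = np.array([[0.0, -1.0,  0.0],
                      [1.0,  0.0, -1.0],
                      [0.0,    s,  lam]])
        dx, dlam, ds = np.linalg.solve(J, -F)
        # fraction-to-boundary damping keeps lam and s strictly positive
        alpha = 1.0
        for v, dv in ((lam, dlam), (s, ds)):
            if dv < 0:
                alpha = min(alpha, 0.95 * (-v / dv))
        x, lam, s = x + alpha * dx, lam + alpha * dlam, s + alpha * ds
    mu *= 0.1
# x and s shrink with mu while lam -> 1, the multiplier of the active bound
```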
3 Augmented Lagrangian Method

Consider the problem with only equality constraints, (1). The basic difficulty with the quadratic penalty method has been
that elusive limit of dividing 0 by 0. Let us therefore consider instead adding the same penalty term to the Lagrangian,
rather than to the objective function.
This gives the augmented Lagrangian

$L_A(x, \lambda; \mu) = L(x, \lambda) + \frac{1}{2\mu} \sum_{i=1}^m c_i^2(x) = f(x) - \sum_{i=1}^m \lambda_i c_i(x) + \frac{1}{2\mu} \sum_{i=1}^m c_i^2(x).$  (12)
The KKT conditions require, to recall, that $\nabla_x L(x^*, \lambda^*) = 0$ and $c(x^*) = 0$, so at the optimum the
augmented Lagrangian coincides with the Lagrangian, and $\mu$ no longer need be small.
Differentiating,

$\nabla_x L_A(x, \lambda; \mu) = \nabla f(x) - \sum_{i=1}^m \left( \lambda_i - \frac{c_i(x)}{\mu} \right) \nabla c_i(x),$

so, comparing with $\nabla_x L = 0$ at the solution,

$\lambda_i - \frac{c_i(x)}{\mu} \to \lambda_i^*, \quad i = 1, \dots, m.$  (13)
We can choose some $\mu > 0$ not very small. The minimization of the augmented Lagrangian (12) therefore yields a
stabilization, replacing $\nabla^2_{xx} L$ by $\nabla^2_{xx} L + \frac{1}{\mu} A^T A$. Thus, the reduced Hessian matrix (with respect to $x$) of the augmented
Lagrangian, $Z^T \nabla^2_{xx} L_A Z$, where the columns of $Z$ span the null space of $A$, is s.p.d. It can further be shown that for $\mu$ small enough the minimization of (12)
yields the solution of the equality-constrained problem (1).
Moreover, the formula (13) suggests a way to update $\lambda_i$ in a penalty-like sequence of iterates:

$\lambda_i^{k+1} = \lambda_i^k - \frac{c_i(x_k^*)}{\mu_k}.$

In fact, we should also update $\mu$ while updating $\lambda$. This then leads to the following algorithmic framework: Given $\mu_0 > 0$ and $\lambda^0$, for $k = 0, 1, 2, \dots$:

Find an approximate minimizer of $L_A(\cdot, \lambda^k; \mu_k)$, terminating when $\|\nabla_x L_A(x, \lambda^k; \mu_k)\| \le \mathrm{tol}_k$.
Call the result $x_k^*$.
If the final convergence test is satisfied, stop
Else
$\lambda^{k+1} = \lambda^k - \mu_k^{-1} c(x_k^*)$
Choose $\mu_{k+1} \in (0, \mu_k)$
End
This allows a gentler decrease of $\mu$: both primal and dual variables participate in the iteration. The constraints
satisfy

$\frac{c_i(x_k^*)}{\mu_k} = \lambda_i^k - \lambda_i^{k+1} \to 0, \quad 1 \le i \le m,$

a clear improvement over the expression (3) relevant for the quadratic penalty method.
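As an illustration of this framework, here is a minimal Python sketch on a toy problem of our own ($\min x_1 + x_2$ subject to $x_1^2 + x_2^2 = 1$, with exact solution $x^* = -(1,1)/\sqrt{2}$ and $\lambda^* = -1/\sqrt{2}$), with scipy's BFGS assumed as the inner solver; note that $\mu$ is never taken below $10^{-3}$:

```python
# Augmented Lagrangian iteration, following (12)-(13), for
#   min x1 + x2   s.t.   c(x) = x1^2 + x2^2 - 1 = 0.
import numpy as np
from scipy.optimize import minimize

def f(x):
    return x[0] + x[1]

def c(x):
    return x[0]**2 + x[1]**2 - 1.0

def LA(x, lam, mu):
    # L_A(x, lam; mu) = f(x) - lam*c(x) + c(x)^2/(2 mu), as in (12)
    return f(x) - lam * c(x) + c(x)**2 / (2.0 * mu)

x = np.array([-0.5, -0.5])
lam, mu = 0.0, 1.0
for _ in range(12):
    x = minimize(LA, x, args=(lam, mu), method='BFGS').x
    lam = lam - c(x) / mu      # multiplier update suggested by (13)
    mu = max(0.5 * mu, 1e-3)   # gentle decrease, bounded away from zero
# x approaches -(1, 1)/sqrt(2) and lam approaches -1/sqrt(2)
```

The subproblems stay well conditioned because $\mu$ never has to approach zero; the multiplier updates do the remaining work.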
Example 3

Let us repeat the experiments of Example 1 using the augmented Lagrangian method. We use the same starting points, tolerances and $\mu$-update strategy.

For the quadratic 4-variable objective function we obtain convergence after a total of 29 Newton iterations. No
damping was needed. The resulting penalty parameter sequence is

µ = 1, .1, .01, ..., 1.e-5.
For the non-quadratic objective function we obtain convergence after a total of 28 iterations. The penalty
parameter sequence is

µ = 1, .1, .01, .001, 1.e-4.
The smallest values of µ required here are much larger than in Example 1, and no difficulty is encountered in the
path to convergence for the augmented Lagrangian method: the advantage over the penalty method of Example 1 is more
than the iteration counts alone indicate.
It is possible to extend the augmented Lagrangian method directly to inequality constraints [26]. But instead we
can use slack variables. Thus, for a given constraint

$c_i(x) \ge 0, \quad i \in \mathcal{I},$

we write

$c_i(x) - s_i = 0, \quad s_i \ge 0.$

For the general problem we thus obtain a problem with equality constraints plus nonnegativity constraints,

$\min_x f(x)$  (14a)

$\text{s.t. } c_i(x) = 0, \quad i \in \mathcal{E},$  (14b)

$c_i(x) - s_i = 0, \quad i \in \mathcal{I},$  (14c)

$s \ge 0.$  (14d)
For the latter problem we can utilize a mix of the augmented Lagrangian method, applied to the equality
constraints, and the gradient projection method, as described in Section 9.3, applied to the nonnegativity constraints.

This is the approach taken by the highly successful general-purpose code LANCELOT by Conn, Gould and Toint
[11]. In the algorithmic framework presented earlier we now have the subproblem

$\min_{x, s} L_A(x, s, \lambda; \mu) \quad \text{s.t. } s \ge 0.$  (15)
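One way to realize subproblem (15) is to hand the bound constraint $s \ge 0$ to a bound-constrained inner solver. The sketch below is not LANCELOT's actual algorithm; the toy problem ($\min x_1^2 + x_2^2$ s.t. $x_1 + x_2 \ge 1$) and the use of scipy's L-BFGS-B are our own choices:

```python
# Augmented Lagrangian with a slack variable for
#   min x1^2 + x2^2   s.t.   x1 + x2 >= 1,
# rewritten as  c(x) - s = 0, s >= 0  with  c(x) = x1 + x2 - 1.
# Exact solution: x = (0.5, 0.5), s = 0, lam = 1.
import numpy as np
from scipy.optimize import minimize

def f(x):
    return x[0]**2 + x[1]**2

def c(x):
    return x[0] + x[1] - 1.0

def LA(z, lam, mu):
    # z = (x1, x2, s); penalize the equality residual r = c(x) - s
    x, s = z[:2], z[2]
    r = c(x) - s
    return f(x) - lam * r + r**2 / (2.0 * mu)

z = np.array([0.0, 0.0, 1.0])
lam, mu = 0.0, 1.0
bounds = [(None, None), (None, None), (0.0, None)]  # only s is bounded
for _ in range(15):
    z = minimize(LA, z, args=(lam, mu), method='L-BFGS-B', bounds=bounds).x
    lam = lam - (c(z[:2]) - z[2]) / mu   # multiplier update as before
    mu = max(0.5 * mu, 1e-3)
```

The inner solver handles the simple bound directly, while the multiplier iteration handles the general constraint, mirroring the division of labor described above.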
REFERENCES
1. Bertsekas, D.P., (1976), "Multiplier Methods: A Survey", Automatica, Vol. 12, pp. 133-145.
2. Bertsekas, D.P., (1980a), "Enlarging the Region of Convergence of Newton's Method for Constrained Optimization", LIDS
Report R-985, M.I.T., Cambridge, Mass. (to appear in J.O.T.A.).
3. Bertsekas, D.P., (1980b), "Variable Metric Methods for Constrained Optimization Based on Differentiable Exact Penalty
Functions", Proc. of Eighteenth Allerton Conference on Communication, Control and Computing, Allerton Park, Ill., pp. 584-
593.
4. Bertsekas, D.P., (1982), Constrained Optimization and Lagrange Multiplier Methods, Academic Press, N.Y.
5. Boggs, P.T., and Tolle, J.W., (1980), "Augmented Lagrangians which are Quadratic in the Multiplier", J.O.T.A., Vol. 31, pp.
17-26.
6. Boggs, P.T., and Tolle, J.W., (1981), "Merit Functions for Nonlinear Programming Problems", Operations Research and
Systems Analysis Report, Univ. of North Carolina, Chapel Hill.
7. Chamberlain, R.M., Lemarechal, C., Pedersen, H.C., and Powell, M.J.D., (1979), "The Watchdog Technique for Forcing
Convergence in Algorithms for Constrained Optimization", Presented at the Tenth International Symposium on Mathematical
Programming, Montreal.
8. DiPillo, G., and Grippo, L., (1979), "A New Class of Augmented Lagrangians in Non-linear Programming", SIAM J. on
Control and Optimization, Vol. 17, pp. 618-628.
9. DiPillo, G., Grippo, L., and Lampariello, F., (1979), "A Method for Solving Equality Constrained Optimization Problems by
Unconstrained Minimization", Proc.9th IFIP Conference on Optimization Techniques, Warsaw, Poland.
10. Dixon, L.C.W., (1980), "On the Convergence Properties of Variable Metric Recursive Quadratic Programming Methods",
Numerical Optimization Centre Report No. 110, The Hatfield Polytechnic, Hatfield, England. Fletcher, R., (1970), "A Class of
Methods for Nonlinear Programming with Termination and Convergence Properties", in Integer and Nonlinear Programming
(J.Abadie, ed.),North-Holland, Amsterdam.
11. Han, S.-P., (1977), "A Globally Convergent Method for Nonlinear Programming", J.O.T.A., Vol. 22, pp. 297-309.
12. Han, S.-P., and Mangasarian, O.L., (1981), "A Dual Differentiable Exact Penalty Function", Computer Sciences Tech. Report
#434, University of Wisconsin.
13. Maratos, N., (1978), "Exact Penalty Function Algorithms for Finite Dimensional and Control Optimization Problems", Ph.D.
Thesis, Imperial College of Science and Technology, University of London.
14. Mayne, D.Q., and Polak, E., (1978), "A Super linearly Convergent Algorithm for Constrained Optimization Problems",
Research Report 78-52, Department of Computing and Control, Imperial College of Science and Technology, University of
London.
15. Powell, M.J.D., (1978),"Algorithms for Nonlinear Constraints that Use Lagrangian Functions", Math. Programming, Vol. 14,
pp. 224-248.
16. Pschenichny, B.N., (1970), "Algorithms for the General Problem of Mathematical Programming", Kibernetica, pp. 120-125
(Translated in Cybernetics, 1974).
17. Pschenichny, B.N., and Danilin, Y.M., (1975), Numerical Methods in Extremal Problems, M.I.R. Publishers, Moscow (English
Translation 1978).
18. Rockafellar, R.T., (1976), "Solving a Nonlinear Programming Problem by Way of a Dual Problem", Symposia Mathematica,
Vol. XXVII, pp. 135-160.
19. Qashqaei, Amir, and Ramin Ghasemi Asl. "Numerical modeling and simulation of copper oxide nanofluids used in compact
heat exchangers." International Journal of Mechanical Engineering, 4 (2), 1-8 (2015).
20. Lakshmi, B., and Z. Abdulaziz al Suhaibani. "A Glimpse at the System of Slavery." International Journal of Humanities and
Social Sciences 5.1 (2016): 211.
21. Clementking, A., et al. "Neuron Optimization for Cognitive Computing Process Augmentation." International Journal of
Computer Science and Engineering (IJCSE) ISSN 2278-9960 Vol. 2, Issue 3, July 2013, 5-12