Simulated Annealing
Introduction
The simulated annealing method simulates the process of slow cooling of molten
metal to achieve the minimum function value in a minimization problem.
The cooling phenomenon of the molten metal is simulated by introducing a
temperature-like parameter and controlling it using the concept of Boltzmann’s
probability distribution.
Boltzmann's probability distribution implies that the energy (E) of a system in
thermal equilibrium at temperature T is distributed probabilistically according to
the relation
P(E) = e^(−E/kT)
where P(E) denotes the probability of achieving the energy level E, and k is called
Boltzmann's constant.
This equation shows that at high temperatures the system has a nearly uniform
probability of being at any energy state; at low temperatures, however, the system
has only a small probability of being at a high-energy state. This indicates that
when the search process is assumed to follow Boltzmann's probability distribution,
the convergence of the simulated annealing algorithm can be controlled by
controlling the temperature T.
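As a quick numerical illustration of this temperature effect, the following sketch (Python, with k = 1 and arbitrary illustrative energy and temperature values, not from the original text) evaluates P(E) at a high and a low temperature:

```python
import math

k = 1.0  # Boltzmann's constant, taken as 1 for scaling purposes

def boltzmann(E, T):
    # Relative probability of the system occupying energy level E at temperature T
    return math.exp(-E / (k * T))

# High temperature: low- and high-energy states are almost equally probable.
print(boltzmann(1.0, 100.0), boltzmann(5.0, 100.0))   # ~0.990 and ~0.951
# Low temperature: the high-energy state becomes very improbable.
print(boltzmann(1.0, 1.0), boltzmann(5.0, 1.0))       # ~0.368 and ~0.0067
```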
The method of implementing Boltzmann's probability distribution in simulated
thermodynamic systems, suggested by Metropolis et al. [13.37], can also be used
in the context of minimization of functions.
In the case of function minimization, let the current design point (state) be Xi,
with the corresponding value of the objective function given by fi = f(Xi). Similar
to the energy state of a thermodynamic system, the energy Ei at state Xi is given by
Ei = fi = f(Xi)
Then, according to the Metropolis criterion, the probability of the next design point
(state) Xi+1 depends on the difference in the energy state or function values at the two
design points (states) given by
∆E = Ei+1 − Ei = ∆f = fi+1 − fi ≡ f(Xi+1) − f(Xi) (13.17)
The new state or design point Xi+1 can be found using Boltzmann's probability
distribution:
P[Ei+1] = min{1, e^(−∆E/kT)} (13.18)
Boltzmann's constant serves as a scaling factor in simulated annealing and, as
such, can be chosen as 1 for simplicity. Note that if ∆E ≤ 0, Eq. (13.18) gives
P[Ei+1] = 1 and hence the point Xi+1 is always accepted. This is a logical choice in
the context of minimization of a function because the function value at Xi+1, fi+1,
is better (smaller) than at Xi, fi, and hence the design vector Xi+1 must be accepted.
On the other hand, when ∆E > 0, the function value fi+1 at Xi+1 is worse (larger)
than the one at Xi; in this case, Xi+1 is accepted only with probability e^(−∆E/kT),
which leaves a nonzero chance of moving uphill and thus allows the search to escape
local minima.
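The acceptance rule just described can be sketched in a few lines of Python; the function name metropolis_accept and the default k = 1 are illustrative assumptions, not part of the original text:

```python
import math
import random

def metropolis_accept(f_current, f_new, T, k=1.0):
    # The difference in function values plays the role of the energy change ∆E
    delta_f = f_new - f_current
    if delta_f <= 0:
        return True          # improvement: always accepted (probability 1)
    # Worse point: accepted only with probability e^(-∆E/kT)
    return random.random() < math.exp(-delta_f / (k * T))
```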
The SA algorithm proceeds as follows:
1. Start with an initial design vector X1 (iteration number i = 1) and a high value of temperature T.
2. Generate a new design point randomly in the vicinity of the current design point and find the
difference in function values:
∆E = ∆f = fi+1 − fi ≡ f(Xi+1) − f(Xi) (13.19)
3. If fi+1 is smaller than fi (with a negative value of ∆f), accept the point Xi+1 as the next design point.
Otherwise, when ∆f is positive, accept the point Xi+1 as the next design point only with probability
e^(−∆E/kT). In practice, a random number r is generated in the range (0, 1); if r is smaller than
e^(−∆E/kT), the point Xi+1 is accepted, and otherwise it is rejected.
4. If the point Xi+1 is rejected, the process of generating a new design point Xi+1 randomly in the
vicinity of the current design point, evaluating the corresponding objective function value fi+1, and
deciding whether to accept Xi+1 based on the Metropolis criterion, Eq. (13.18), is repeated.
5. To simulate the attainment of thermal equilibrium at every temperature, a predetermined number (n) of
new points Xi+1 is tested at any specific value of the temperature T.
6. Once the number of new design points Xi+1 tested at any temperature T exceeds the value of n, reduce
the temperature T by a prespecified fractional value c (0 < c < 1) and repeat the whole process.
7. Assume the procedure to have converged when the current value of the temperature T is sufficiently
small or when changes in the function values (∆f) are observed to be sufficiently small.
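Putting the steps together, a minimal Python sketch of the whole procedure might look as follows; all names, default parameter values, and the uniform neighborhood move are illustrative assumptions rather than the text's prescribed implementation:

```python
import math
import random

def simulated_annealing(f, x0, T, c=0.5, n=50, T_min=1e-6, step=0.1):
    # f: objective function; x0: initial design vector (list of floats)
    # T: initial temperature; c: temperature reduction factor (0 < c < 1)
    # n: points tested per temperature; T_min: convergence temperature
    x, fx = list(x0), f(x0)
    while T > T_min:
        for _ in range(n):  # simulate thermal equilibrium at temperature T
            # generate a candidate randomly in the vicinity of the current point
            x_new = [xi + random.uniform(-step, step) for xi in x]
            f_new = f(x_new)
            df = f_new - fx
            # Metropolis criterion (k taken as 1): accept improvements always,
            # accept worse points with probability e^(-∆f/T)
            if df <= 0 or random.random() < math.exp(-df / T):
                x, fx = x_new, f_new
        T *= c  # reduce the temperature by the fractional value c
    return x, fx

# Example: minimize a simple two-variable quadratic
xbest, fbest = simulated_annealing(lambda x: (x[0] - 3.0)**2 + x[1]**2,
                                   [0.0, 0.0], T=10.0)
```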
The choices of the initial temperature T , the number of iterations n before reducing the temperature,
and the temperature reduction factor c play important roles in the successful convergence of the SA
algorithm.
For example, if the initial temperature T is too large, it requires a larger number of temperature
reductions for convergence. On the other hand, if the initial temperature is chosen to be too small, the
search process may be incomplete in the sense that it might fail to thoroughly investigate the design
space in locating the global minimum before convergence.
The temperature reduction factor c has a similar effect. Too large a value of c (such as 0.8 or 0.9)
requires too much computational effort for convergence.
On the other hand, too small a value of c (such as 0.1 or 0.2) may result in a faster reduction in
temperature that might not permit a thorough exploration of the design space for locating the global
minimum solution.
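As a rough illustration (assuming, say, cooling from an initial temperature of 100 to a final temperature of 0.01): since the temperature after m reductions is c^m times the initial value, m = log(0.01/100)/log(c), which gives about 14 reductions for c = 0.5 but about 88 reductions for c = 0.9.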
Similarly, a large value of the number of iterations n will help in achieving a
quasi-equilibrium state at each temperature but will result in a larger computational
effort. A smaller value of n, on the other hand, might result either in premature
convergence or in convergence to a local minimum (due to inadequate exploration of
the design space for the global minimum).
Unfortunately, no unique set of values for T, n, and c is available that will work well
for every problem. However, certain guidelines can be given for selecting these values.
The initial temperature T can be chosen as the average value of the objective function
computed at a number of randomly selected points in the design space.
The number of iterations n can be chosen between 50 and 100 based on the computing
resources and the desired accuracy of solution. The temperature reduction factor c can
be chosen between 0.4 and 0.6 for a reasonable temperature reduction strategy (also
termed the cooling schedule).
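Following the guideline above, the initial temperature might be estimated as sketched below (Python; the function name, the bounds representation, and the sample count of 20 are illustrative assumptions):

```python
import random

def initial_temperature(f, bounds, samples=20):
    # Average objective value at randomly selected points in the design space;
    # bounds is a list of (low, high) pairs, one per design variable.
    values = []
    for _ in range(samples):
        x = [random.uniform(lo, hi) for lo, hi in bounds]
        values.append(f(x))
    return sum(values) / len(values)
```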
Features of Simulated Annealing