Molecular Modelling: Lecture 2: Geometry Optimization and Brief Repetition of Statistical Thermodynamics
To characterize a stationary point we set up the Hessian matrix of second derivatives of the energy with respect to the coordinates q_i,

H =
\begin{pmatrix}
\frac{\partial^2 E}{\partial q_1^2} & \frac{\partial^2 E}{\partial q_1 \partial q_2} & \cdots & \frac{\partial^2 E}{\partial q_1 \partial q_n} \\
\frac{\partial^2 E}{\partial q_2 \partial q_1} & \frac{\partial^2 E}{\partial q_2^2} & \cdots & \frac{\partial^2 E}{\partial q_2 \partial q_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 E}{\partial q_n \partial q_1} & \frac{\partial^2 E}{\partial q_n \partial q_2} & \cdots & \frac{\partial^2 E}{\partial q_n^2}
\end{pmatrix}
and solve the following eigenvalue problem
H = P k P^{-1}.

The P matrix is the eigenvector matrix whose columns are direction vectors for the vibrations, and the force constants of these vibrations are given by the eigenvalue matrix k. For a minimum in the PES, k_i > 0 for all i; for a maximum in the PES, k_i < 0 for all i.
All other stationary points are saddle points, where the number of negative eigenvalues (imaginary frequencies) indicates the order of the saddle point. In chemistry we are generally interested in first-order saddle points, but more about this in the lecture about rates and reaction paths.
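A minimal numpy sketch of this classification; the 2x2 Hessian below is a made-up example:

```python
import numpy as np

def classify_stationary_point(hessian, tol=1e-8):
    """Diagonalize H = P k P^-1 and classify the stationary point
    from the signs of the eigenvalues (force constants) k_i."""
    k, P = np.linalg.eigh(hessian)   # eigenvalues k_i, eigenvectors in the columns of P
    n_neg = int(np.sum(k < -tol))    # number of negative force constants
    if n_neg == 0:
        kind = "minimum"
    elif n_neg == len(k):
        kind = "maximum"
    else:
        kind = f"saddle point of order {n_neg}"
    return k, P, kind

# Made-up symmetric Hessian with one negative eigenvalue: a first-order saddle point
H = np.array([[ 2.0, 0.5],
              [ 0.5, -1.0]])
k, P, kind = classify_stationary_point(H)
print(k, kind)
```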
Minimization Methods
A geometry optimization tries to find a (local) minimum on the PES. One can generally distinguish three types of methods to do this:
methods that use no derivatives (only function values, such as the energy)
methods that use the first derivative (the gradient)
methods that use both the first and the second derivative (the Hessian)
An example of a derivative-free method is simulated annealing. It mimics the process undergone by misplaced atoms in a metal when it is heated and then slowly cooled, and it is good at finding a local minimum for a system with many free parameters. A synthetic temperature is lowered during the optimization; the cooling scheme and cooling rate have an important influence on the outcome and should be tailored for each application. Adaptive simulated annealing algorithms exist that address this problem by coupling the cooling schedule to the search progress.
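A minimal simulated-annealing sketch for a one-dimensional model PES; the double-well potential, step size and geometric cooling schedule below are made-up illustrations and would need tuning in a real application:

```python
import numpy as np

rng = np.random.default_rng(1)

def U(x):
    # Made-up tilted double-well potential; the lower minimum lies near x = -1
    return (x**2 - 1.0)**2 + 0.3 * x

def simulated_annealing(x0, T0=2.0, cooling=0.995, n_steps=5000, step=0.3):
    x, T = x0, T0
    for _ in range(n_steps):
        x_trial = x + step * rng.normal()            # random trial move
        dU = U(x_trial) - U(x)
        # Downhill moves are always accepted, uphill moves with probability exp(-dU/T)
        if dU < 0.0 or rng.random() < np.exp(-dU / T):
            x = x_trial
        T *= cooling                                  # geometric cooling schedule
    return x

print(simulated_annealing(x0=2.0))   # typically ends near the global minimum at x ~ -1
```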
In gradient-based (conjugate-gradient) methods the new search direction mixes the current gradient with the previous direction, weighted by a coefficient \beta that can be chosen as

\beta^{(k)} = \frac{(g^{(k)})^T g^{(k)}}{(g^{(k-1)})^T g^{(k-1)}} \qquad \text{(Fletcher-Reeves)}

or

\beta^{(k)} = \frac{(g^{(k)} - g^{(k-1)})^T g^{(k)}}{(g^{(k-1)})^T g^{(k-1)}} \qquad \text{(Polak-Ribière)}
Here T denotes the transpose of a matrix. Which expression for \beta is superior depends on the functional form of the surface one wants to optimize.
The step direction is not necessarily orthogonal to the previous direction, which results in faster convergence than steepest descent.
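A conjugate-gradient sketch using these two expressions for \beta; the direction update d^(k) = -g^(k) + \beta^(k) d^(k-1), the backtracking line search and the quadratic test surface are assumptions added for illustration:

```python
import numpy as np

def conjugate_gradient_min(f, grad, x0, n_iter=100, tol=1e-8, variant="PR"):
    """Nonlinear conjugate gradient with beta from the Fletcher-Reeves ("FR")
    or Polak-Ribiere ("PR") expression and a crude backtracking line search."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                            # first step along steepest descent
    for _ in range(n_iter):
        alpha = 1.0
        for _ in range(40):                           # backtracking line search along d
            if f(x + alpha * d) <= f(x) + 1e-4 * alpha * (g @ d):
                break
            alpha *= 0.5
        x_new = x + alpha * d
        g_new = grad(x_new)
        if np.linalg.norm(g_new) < tol:
            return x_new
        if variant == "FR":
            beta = (g_new @ g_new) / (g @ g)          # Fletcher-Reeves
        else:
            beta = ((g_new - g) @ g_new) / (g @ g)    # Polak-Ribiere
        d = -g_new + beta * d                         # new search direction
        x, g = x_new, g_new
    return x

# Made-up quadratic test surface with minimum at the origin
A = np.array([[3.0, 0.5], [0.5, 1.0]])
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x
print(conjugate_gradient_min(f, grad, x0=[1.0, 2.0]))   # converges towards (0, 0)
```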
Second-order methods
Methods that use both the first derivative and the second derivative can reach the minimum in the smallest number of steps, because the curvature gives an estimate of where the minimum lies. The simplest method in this category is based on the Newton-Raphson method for root finding (finding points where the function value is zero).
Newton-Raphson for root finding
1. Start in x^{(1)} and determine f^{(1)} (function evaluation) and g^{(1)} (gradient in point x^{(1)}).
2. The function can be linearly approximated by y = g^{(1)} x + f^{(1)} - g^{(1)} x^{(1)}.
3. This approximation has a root at x^{(2)} = x^{(1)} - f^{(1)}/g^{(1)}.
4. Repeat steps 1-3.
The starting point is crucial: for y = x \exp(-x^2) (Fig. 5.7), a starting point with |x| < 0.5 finds the root x = 0, while |x| > 0.5 finds the roots at infinity.
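A small numerical experiment (assuming the function above and its analytic derivative) reproducing this sensitivity to the starting point:

```python
import numpy as np

def newton_root(x, n_iter=20):
    """Newton-Raphson root search for f(x) = x * exp(-x^2)."""
    f  = lambda x: x * np.exp(-x**2)
    df = lambda x: np.exp(-x**2) * (1.0 - 2.0 * x**2)   # analytic derivative
    for _ in range(n_iter):
        x = x - f(x) / df(x)                            # x^(k+1) = x^(k) - f^(k)/g^(k)
    return x

print(newton_root(0.4))   # |x0| < 0.5: converges to the root x = 0
print(newton_root(0.6))   # |x0| > 0.5: wanders off towards the roots at infinity
```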
We can now apply this method to find stationary points by searching for roots of the gradient. A linear approximation of the gradient corresponds to a quadratic approximation of the function itself:
g^{(k)}(x) \approx H^{(k)} (x - x^{(k)}) + g^{(k)} = 0

x^{(k+1)} = x^{(k)} - \frac{g^{(k)}}{H^{(k)}}

with H^{(k)} the Hessian (second derivative) at the current point x^{(k)}.
In the case of multiple variables:
x^{(k+1)} = x^{(k)} - (H^{(k)})^{-1} g^{(k)}
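A one-step numpy sketch of this update; the linear system H \Delta x = g is solved instead of forming the inverse explicitly, and the quadratic model surface is made up:

```python
import numpy as np

def newton_step(x, grad, hess):
    """One Newton-Raphson step x^(k+1) = x^(k) - (H^(k))^-1 g^(k),
    implemented by solving H dx = g rather than inverting H."""
    return x - np.linalg.solve(hess(x), grad(x))

# Made-up quadratic surface E(x) = 0.5 x^T A x - b^T x with minimum where A x = b
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
grad = lambda x: A @ x - b
hess = lambda x: A

x = newton_step(np.array([5.0, -5.0]), grad, hess)   # exact minimum in one step for a quadratic
print(x, np.allclose(A @ x, b))
```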
Hessian
For geometry optimization of a single molecule, virtually all software packages use a method that needs a Hessian and/or its inverse. If the functional form of the PES is known, for instance in the case of a force field where only the force constants change, one could derive analytical expressions for the gradient and the Hessian. In most cases, however, the functional form is not known or too system-specific to use in a general modelling program, and one has to determine the gradient and the Hessian numerically. Analytically, differentiation is easier than integration; numerically, the reverse holds if accurate values are needed. Minimizers need accurate gradients to converge correctly; fortunately they do not need precise Hessians, and approximations are often used.
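One common way to obtain such numerical derivatives is central finite differences; a sketch with made-up step sizes and a made-up quadratic energy function:

```python
import numpy as np

def num_gradient(E, x, h=1e-5):
    """Central-difference gradient of the energy E at point x."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (E(x + e) - E(x - e)) / (2.0 * h)
    return g

def num_hessian(E, x, h=1e-4):
    """Finite-difference Hessian built from central differences of the gradient."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        e = np.zeros(n)
        e[i] = h
        H[:, i] = (num_gradient(E, x + e) - num_gradient(E, x - e)) / (2.0 * h)
    return 0.5 * (H + H.T)                 # symmetrize to damp numerical noise

# Made-up quadratic energy: exact gradient and Hessian are easy to verify
E = lambda x: 2.0 * x[0]**2 + x[0] * x[1] + 3.0 * x[1]**2
x0 = np.array([0.3, -0.2])
print(num_gradient(E, x0))                 # exact: [1.0, -0.9]
print(num_hessian(E, x0))                  # exact: [[4, 1], [1, 6]]
```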
Quasi-Newton methods start with an approximate (inverse) Hessian and update it every iteration. There are several routines for this. One is the Davidon-Fletcher-Powell formula, which updates the inverse Hessian B_k = H_k^{-1} in the following way:
y_k = g^{(k+1)} - g^{(k)}, \qquad \Delta x_k = x^{(k+1)} - x^{(k)}

B_{k+1} = B_k - \frac{B_k y_k y_k^T B_k}{y_k^T B_k y_k} + \frac{\Delta x_k (\Delta x_k)^T}{y_k^T \Delta x_k}
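A numpy sketch of this update formula; the quadratic model surface used to generate \Delta x and y is made up, and the surrounding quasi-Newton loop (line search, convergence test) is omitted:

```python
import numpy as np

def dfp_update(B, dx, y):
    """Davidon-Fletcher-Powell update of the inverse Hessian B = H^-1,
    with dx = x^(k+1) - x^(k) and y = g^(k+1) - g^(k)."""
    By = B @ y
    return B - np.outer(By, By) / (y @ By) + np.outer(dx, dx) / (y @ dx)

# Made-up quadratic surface: the gradient is A x, so y = A dx exactly
A = np.array([[2.0, 0.3], [0.3, 1.0]])
grad = lambda x: A @ x
B = np.eye(2)                                   # initial guess: unit matrix
x_old, x_new = np.array([1.0, 1.0]), np.array([0.2, -0.4])
dx = x_new - x_old
y = grad(x_new) - grad(x_old)
B = dfp_update(B, dx, y)
print(B @ y, dx)                                # the updated B satisfies B y = dx
```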
The initial estimate of the Hessian can be the unit matrix, or one can use internal coordinates: the individual diagonal elements of the Hessian can then be identified as bond-stretching and bond-bending force constants, etc. The Hessian should also have the correct number of degrees of freedom. A nonlinear molecule of N atoms can be described by 3N Cartesian coordinates, but there are only p = 3N - 6 vibrational degrees of freedom that can be varied during a geometry optimization. The six coordinates that one loses correspond to rotation and translation of the entire molecule and therefore have zero force constants; they should not be used in a geometry optimization. For this reason a Z-matrix, which defines the molecule in terms of internal coordinates, is often used as input; indicating symmetry is also more straightforward in this way. Cartesian coordinates, however, can be preferred and are often simply easier to use, but as mentioned they lead to too many degrees of freedom. How can this be circumvented?
q = A X \quad (q in internal and X in Cartesian coordinates)

A^T \nabla_q U = (\nabla U)_C \quad (A is rectangular)

\nabla_q U = G^{-1} A u (\nabla U)_C \qquad \text{with} \qquad G = A u A^T
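A numpy sketch of this transformation; the rectangular A matrix and the Cartesian gradient below are made-up numbers, and u is simply taken as the unit matrix:

```python
import numpy as np

def cartesian_to_internal_gradient(A, g_cart, u=None):
    """Transform a Cartesian gradient into internal coordinates:
    grad_q U = G^-1 A u (grad U)_C  with  G = A u A^T."""
    if u is None:
        u = np.eye(A.shape[1])                     # u taken as the unit matrix
    G = A @ u @ A.T
    return np.linalg.pinv(G) @ A @ u @ g_cart      # pseudo-inverse in case G is ill-conditioned

# Made-up example: 2 internal coordinates built from 6 Cartesian coordinates (2 atoms)
A = np.array([[1.0, 0.0, 0.0, -1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0,  0.0, -1.0, 0.0]])
g_cart = np.array([0.2, -0.1, 0.0, -0.2, 0.1, 0.0])
print(cartesian_to_internal_gradient(A, g_cart))
```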
Statistical thermodynamics
Geometry optimization results in the optimum structure at T = 0 K. Properties calculated on the basis of this structure are a good first starting point, but since experiments are mostly done at T > 0 K, they can be quite far off. At T > 0 K the system is not only in its optimum structure; other states are also accessible, which can have different properties and thus lead to a different average macroscopic (measurable) property. Determining this average property is the domain of statistical thermodynamics. Statistical thermodynamics predicts the probability that a molecule is in a certain energy state (ground state vs. excited state).
The probability that a molecule is in a state i is

P_i = \frac{\exp(-\epsilon_i/kT)}{\sum_j \exp(-\epsilon_j/kT)}

with discrete energy levels \epsilon_j and the molecular partition function

q = \sum_j \exp(-\epsilon_j/kT).

If \epsilon_0 = 0, the partition function gives the number of accessible energy levels.
In molecular mechanics one usually has a collection of interacting molecules and the
mutual potential energy of the system is considered instead of the internal energy of the
molecule. We now move from the microcanonical to the canonical ensemble with
P_i = \frac{\exp(-U_i/kT)}{\sum_j \exp(-U_j/kT)}

and

Q = \sum_j \exp(-U_j/kT).
To calculate a property of a system:

\langle A \rangle = \frac{\sum_i A_i \exp(-U_i/kT)}{Q}

So, to get the average mutual potential energy:

\langle U \rangle = \frac{\sum_i U_i \exp(-U_i/kT)}{Q}
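A small sketch of these sums for a made-up set of discrete state energies U_i (in joules) at room temperature:

```python
import numpy as np

k_B = 1.380649e-23                       # Boltzmann constant in J/K

def boltzmann_average(U, T, A=None):
    """Probabilities P_i, partition function Q and ensemble average <A>
    over discrete states with energies U_i; A_i defaults to U_i."""
    U = np.asarray(U, dtype=float)
    w = np.exp(-U / (k_B * T))           # Boltzmann factors
    Q = w.sum()                          # partition function
    P = w / Q                            # state probabilities
    A = U if A is None else np.asarray(A, dtype=float)
    return P, Q, np.sum(A * P)           # <A> = sum_i A_i P_i

# Made-up three-state system
U = [0.0, 1.0e-21, 3.0e-21]
P, Q, U_avg = boltzmann_average(U, T=298.0)
print(P, Q, U_avg)                       # <U> is the average mutual potential energy
```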
We can achieve this by considering a large number of cells with the same thermodynamic properties (N, V, T, E, p, etc.) but different molecular arrangements. If N, V and T are kept constant between cells, we speak of the canonical ensemble:

P_i = \left( \frac{\exp(-U_i/kT)}{Q} \right)_{N,V,T}

with Q the canonical partition function.
Ensembles
canonical ensemble with constant N, V, T
microcanonical ensemble with constant N, E, V (no energy flow between cells)
isothermal-isobaric ensemble with constant N, p, T
grand canonical ensemble with constant V, T, \mu (chemical potential)
In MM, the energy is a function of the coordinates of the atoms (parameter space), and all points in parameter space contribute to the partition function. For this reason, an integration over parameter space is often used instead of a sum:

Q = \frac{1}{N!} \int \exp\left(-\frac{U(r^N)}{kT}\right) dr^N
The factor 1/N! is for indistinguishable particles; for particles that can be distinguished
this term drops from the equation. The average mutual potential energy can be
determined by solving:
\langle U \rangle = \frac{\int U(r^N) \exp\left(-\frac{U(r^N)}{kT}\right) dr^N}{Q}
Remember the angle-dependent dipole-dipole mutual potential energy

(4\pi\epsilon_0) U_{AB} = -\frac{p_A p_B}{R^3} \left( 2\cos\theta_A \cos\theta_B - \sin\theta_A \sin\theta_B \cos\phi \right)
We would like to determine the average potential energy. Let's assume a parallel configuration: \theta_A = \theta_B = \theta and \phi = 0. Then

U_{AB} = \frac{p_A p_B}{4\pi\epsilon_0 R^3} \left( 1 - 3\cos^2\theta \right)
Due to thermal fluctuations \theta can change:
\langle U_{AB} \rangle_{dip-dip} = \frac{\int U_{AB} \exp(-U_{AB}/kT)\, d\tau}{\int \exp(-U_{AB}/kT)\, d\tau} = \frac{\int_{\theta=0}^{\pi} U_{AB} \exp(-U_{AB}/kT) \sin\theta \, d\theta}{\int_{\theta=0}^{\pi} \exp(-U_{AB}/kT) \sin\theta \, d\theta}
Let's assume U_{AB} \ll kT; then \exp(-U_{AB}/kT) \approx 1 - U_{AB}/kT, and use

U_{AB} = C\left(1 - 3\cos^2\theta\right), \qquad C = \frac{p_A p_B}{4\pi\epsilon_0 R^3}.
\langle U_{AB} \rangle_{dip-dip} = \frac{\int_{\theta=0}^{\pi} C\left(1 - 3\cos^2\theta\right)\left(1 - C\left(1 - 3\cos^2\theta\right)/kT\right)\sin\theta\, d\theta}{\int_{\theta=0}^{\pi} \left(1 - C\left(1 - 3\cos^2\theta\right)/kT\right)\sin\theta\, d\theta}
Substitute x = \cos\theta, which implies dx = -\sin\theta\, d\theta:
\langle U_{AB} \rangle_{dip-dip} \approx \frac{\int_{x=-1}^{1} C\left(1 - 3x^2\right)\left(1 - C\left(1 - 3x^2\right)/kT\right) dx}{\int_{x=-1}^{1} \left(1 - C\left(1 - 3x^2\right)/kT\right) dx} = -\frac{4C^2}{5kT} = -\frac{4 p_A^2 p_B^2}{5kT(4\pi\epsilon_0)^2} \frac{1}{R^6}
If all angles are taken into account:

\langle U_{AB} \rangle_{dip-dip} = -\frac{2 p_A^2 p_B^2}{3kT(4\pi\epsilon_0)^2} \frac{1}{R^6} = -\frac{C}{R^6} \qquad \text{(Keesom)}
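As a check, the parallel-configuration average can also be evaluated numerically without the U_{AB} \ll kT expansion; the sketch below uses a made-up value of C/kT and works in units of kT:

```python
import numpy as np

def avg_dipole_energy(c, n=200_001):
    """Boltzmann-weighted average of u = U_AB/kT = c*(1 - 3x^2) with x = cos(theta),
    evaluated on a uniform x grid (the grid spacing cancels in the ratio)."""
    x = np.linspace(-1.0, 1.0, n)
    u = c * (1.0 - 3.0 * x**2)
    w = np.exp(-u)                              # Boltzmann weight
    return np.sum(u * w) / np.sum(w)            # <U_AB>/kT

c = 0.01                                        # made-up C/kT << 1
print(avg_dipole_energy(c))                     # numerical average
print(-4.0 * c**2 / 5.0)                        # small-C result -4C^2/(5kT), in units of kT
```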
For the dipole-dipole mutual potential energy the integrals in

\langle U \rangle = \frac{\int U(r^N) \exp\left(-\frac{U(r^N)}{kT}\right) dr^N}{Q}

could be solved analytically using some approximations. This is however often not the case, and the integrals need to be solved numerically. This problem, together with the arrival of the first computers, led to the development of the Monte Carlo technique, which will be discussed in the next lecture.