Entropy and The Tao of Counting: A Brief Introduction To Statistical Mechanics and The Second Law of Thermodynamics
Kim Sharp
For Thi and Thuan
Preface
The second law of thermodynamics, which says that a certain quantity called
entropy always increases, is the law that scientists believe gives the direction to
time’s arrow. It is the reason why so many everyday events happen only one way.
Heat flows from hot to cold. Stirring milk into coffee always mixes it, never un-
mixes it. Friction slows things down, never speeds them up. Even the evolution
of the universe and its ultimate fate depend on entropy. It is no surprise that there
are numerous books and articles on the subjects of entropy and the second law of
thermodynamics. So why write another book, even a short one, on the subject?
In 1854 Rudolf Clausius showed that there was a well-defined thermodynamic
quantity for which, in 1865, he coined the name entropy. In his words “The energy
of the universe remains constant. The entropy of the universe tends to a maximum.”
Entropy could be measured and its behavior described, but what it was remained a
mystery. Twelve years later Ludwig Boltzmann provided the answer in atomistic,
probabilistic terms and helped create the field of statistical mechanics.
Boltzmann's explanation was the first, and arguably still the best, explanation of entropy. His classic paper, published in 1877, is rarely read anymore, yet much can still be learned from it. One of the things Boltzmann
explains very clearly is that there are two equally important contributions to entropy:
one from the distribution of atoms in space and the other from the motion of
atoms (heat); in other words from the distribution of kinetic energy among the
atoms.1 However, many introductory books and articles on entropy do not give
equal attention to each. The spatial contribution by its nature is just easier to
visualize and explain. Explanations of the kinetic energy contribution to entropy
often use fossilized concepts like Carnot cycles and efficiencies of heat engines. It
is also customary to teach thermodynamics before statistical mechanics, because
the former is thought to be easier, and this follows their historical order of development.
1 We now know of four sources of entropy in the universe. The third is radiation: photons have
entropy, a fact also discovered by Boltzmann. The fourth is black holes. These require a level of
physics and mathematics beyond the scope of this book.
Contents

1 Learning to Count
2 What Is Entropy?
3 Entropy and Free Energy
4 Entropic Forces
5 Summary
Appendix
References
Index
Chapter 1
Learning to Count
Unpacking the meaning of Boltzmann's quote (that a system tends toward its more probable states) provides the key to understanding entropy.
The term more probable simply means that some states or transformations between
states can happen in more ways than others. What we then observe in the real world
is that which can happen the most ways. This chapter first defines what is meant by
the term way, and then shows how to count these ways. In Chinese philosophy the
term Tao can be translated as 'The Way', so we may say that the subject of this chapter is the Tao of counting.
The lesson from the dice analogy is that different conditions can be realized
by different numbers of complexions. We are very close to the meaning of entropy
when we recognize that in a large number of rolls of two dice we are likely to see
a total of 7 more often than any other total simply because it is produced by more
complexions than any other total. To understand thermodynamics, we need only to
apply the same ideas to atoms.
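The dice counting is easy to verify directly. The following short Python sketch (an illustration added here, not part of the original text) enumerates all 36 complexions of two dice and tallies how many produce each total:

    from collections import Counter

    # Enumerate all 36 equally likely complexions (ordered pairs) of two dice
    # and count how many complexions produce each total.
    complexions = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))

    for total in sorted(complexions):
        ways = complexions[total]
        print(f"total {total:2d}: {ways} complexions, probability {ways}/36")

Running it shows that a total of 7 is produced by 6 complexions, more than any other total, exactly as argued above.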
The speed v of an atom is related to its velocity components by Pythagoras' theorem,

v = \sqrt{u^2 + γ^2 + w^2}    (1.1)

and v is the length of the velocity arrow in Fig. 1.2. In summary, at any moment in time an atom is described by three numbers for its position, and three numbers for its velocity, which will be abbreviated as a list of six numbers (x, y, z, u, γ, w).
1.3 The Kinetic Energy of an Atom
An atom that is in motion has a kinetic energy proportional to its mass m, and
proportional to the square of its speed v
E_k = \frac{1}{2} m v^2.    (1.2)
Because the kinetic energy of an atom depends on the square of its speed the total
kinetic energy is the sum of the contributions from the x, y and z components of
velocity:
E_k = \frac{1}{2} m u^2 + \frac{1}{2} m γ^2 + \frac{1}{2} m w^2    (1.3)
This can be verified by substituting for v in Eq. 1.2 using Eq. 1.1. So we can speak
of an atom as having three independent kinetic energy components.
If we have N atoms, the total kinetic energy is the sum of the atomic kinetic
energies
E_K = \frac{1}{2} \sum_{i=1}^{N} m_i (u_i^2 + γ_i^2 + w_i^2)    (1.4)
where the subscript i labels the atom. The mean kinetic energy per atom is
ÊK = EK /N. If N is large then the mean kinetic energy per atom and the
absolute temperature, T , are just the same thing but expressed in different units. In
fact Maxwell and Boltzmann, when developing statistical thermodynamics, mostly
used mean energy per atom not temperature.1 Because, in the development of
thermodynamics, the units of energy and the units of absolute temperature were
determined independently the scale factor relating them had to be determined
empirically. This was done, for example, using the properties of an ideal gas
and Boyle’s Law (see Appendix). It required knowing how many atoms are in a
given amount of material, i.e. knowing Avogadro’s constant. The relationship is
k_b T = \frac{2}{3} Ê_K, where k_b is called Boltzmann's constant. If energy is measured in Joules and temperature in kelvin, then k_b = 1.38 × 10^{-23} J/K.
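As a quick numerical illustration (a sketch added here, not from the book; the function name is mine), the relation k_b T = (2/3) Ê_K converts directly between mean kinetic energy per atom and temperature:

    kb = 1.38e-23  # Boltzmann's constant, J/K

    def temperature_from_mean_ke(e_hat):
        # Invert kb*T = (2/3)*E_hat: the temperature equivalent of a
        # given mean kinetic energy per atom (in Joules).
        return (2.0 / 3.0) * e_hat / kb

    # At T = 300 K the mean kinetic energy per atom is (3/2)*kb*T ~ 6.2e-21 J;
    # inverting it recovers the temperature.
    print(temperature_from_mean_ke(1.5 * kb * 300.0))  # -> 300.0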
The complete list of position and velocity values for all N atoms,

(x_1, y_1, z_1, u_1, γ_1, w_1, . . . , x_N, y_N, z_N, u_N, γ_N, w_N)    (1.5)
defines what will be called a complexion. The term complexion is the Anglicized
form of the word Komplexion used by Boltzmann in his original work explaining
entropy (Boltzmann 1877; Sharp and Matschinsky 2015). I resurrect this neutral
term to avoid baggage-laden terms such as microstate, disordered state, etc.
often used in explaining entropy. Because the atoms are in motion the system
will be continually moving from one complexion to another according to the
laws of dynamics. The set of 3N position coordinate values alone, (x1 , . . . zN ),
will be referred to as a spatial complexion. Similarly, the set of 3N velocity
coordinate values alone, (u1 , . . . wN ), will be referred to as a velocity complexion,
or equivalently, a kinetic energy complexion, since the kinetic energy components
of the atoms are completely specified by their velocity components. Figure 1.3
shows four of the many possible complexions that three atoms could adopt.
1 In their papers Maxwell and Boltzmann used the symbol T to refer to kinetic energy, not temperature; a convention physicists still use today and a source of confusion.
1.5 Boltzmann's Postulate
Fig. 1.3 Four possible complexions of three atoms. (a) and (b) have the same spatial complexion,
but different velocity complexions, as do (c) and (d). (a) and (c) have the same velocity complexion,
but different spatial complexions, as do (b) and (d). (a) and (d) have different spatial and velocity
complexions, as do (b) and (c)
Boltzmann's postulate can be stated as follows:

Every complexion of N atoms having the same total energy E is equally likely.

But how then do we count the ways of arranging the atoms and their velocities? This task is described in the next section. With Boltzmann's postulate and the ability to count complexions we will then derive the Second Law of Thermodynamics and the properties of entropy.

1.6 Counting Complexions: Part 1
The laws of dynamics describe how, given a set of atomic positions and velocities
at some instant, these positions and velocities change with time. These laws do not,
however, specify what positions and velocities are possible, only how they change.
Without violating the laws of dynamics, any given spatial arrangement of atoms can
in principle be combined with any given set of atomic velocities. In the language
of statistical mechanics positions and velocities are independent degrees of freedom
(see, for example, Fig. 1.3). So we can split the task of complexion counting into two
parts. First, that of counting all possible velocity complexions with the same total
kinetic energy EK for any given spatial complexion. Second, accounting for the
different spatial complexions including, if necessary, any change in potential energy
with atomic positions. In this section the machinery for counting velocity complex-
ions is derived using a geometric argument and basic concepts of mechanics.
Consider an atom i with velocity components (u, γ , w). It has kinetic energy
E_{k,i} = \frac{1}{2} m_i (u^2 + γ^2 + w^2).    (1.6)
The velocity of this atom can be represented by a vector arrow which has a length, using Pythagoras' theorem, of v_i = \sqrt{2E_{k,i}/m_i} (Fig. 1.4). There are many other combinations of x, y and z component velocities, (u′, γ′, w′), that have the same kinetic energy provided that they satisfy the equation E_{k,i} = \frac{1}{2} m_i (u′^2 + γ′^2 + w′^2). Each possibility corresponds to a velocity vector of the same length (magnitude) v_i but representing a different velocity complexion. Every possible velocity complexion with the same kinetic energy can be represented by a vector starting at the origin and ending somewhere on the surface of a sphere of radius R = \sqrt{2E_{k,i}/m_i}.
Figure 1.4 shows two other velocity vectors of the same magnitude whose velocities
differ by a small amount δv. It is clear from construction of this ‘velocity sphere’
that the number of different velocity complexions Wv with fixed kinetic energy Ek,i
for atom i is proportional to the surface area of its velocity-sphere:
W_v ∝ R^2 ∝ (\sqrt{E_{k,i}})^2.    (1.7)
For a collection of N atoms of the same mass m, the total kinetic energy is

E_K = \frac{1}{2} m \sum_{i=1}^{N} (u_i^2 + γ_i^2 + w_i^2).    (1.8)
All possible velocity complexions now correspond to 3N-dimensional velocity vectors ending on the surface of a 3N-dimensional sphere of radius R = \sqrt{2E_K/m}, whose surface area scales as the (3N − 1)th ≈ (3N)th power of R. Thus

W_v ∝ E_K^{3N/2}    (1.9)

with respect to the total kinetic energy. Now obviously the absolute number
of complexions depends on how finely we divide up the surface of the 3N -
dimensional sphere; how small is the surface area element δA3N which represents
a single velocity complexion. However, all we will need are ratios of numbers of
complexions corresponding to some change in EK . So the size of the area element
δA_{3N} will cancel out. To calculate changes in the number of complexions the only thing that matters is that this number scales as E_K^{3N/2}.
One of Boltzmann’s most important discoveries was how to compute the change in
number of complexions when heat is added to the system (Boltzmann 1877). His
original derivation, which is followed by most textbooks of statistical mechanics, is
quite involved mathematically and is thus hard to follow for the non-specialist. So
here we pursue the geometric argument of Fig. 1.4. If a small amount of heat δQ is
added to a collection of N atoms, the kinetic energy increases from EK to EK +δQ.
The radius of the 3N -dimensional sphere representing all possible velocity vectors
increases by a factor of
\frac{R′}{R} = \left( \frac{E_K + δQ}{E_K} \right)^{1/2} = \left( 1 + \frac{δQ}{N Ê_K} \right)^{1/2}    (1.10)
where ÊK is the mean kinetic energy per atom. The surface area of this sphere
increases by a factor of
\left( \frac{R′}{R} \right)^{3N} = \left( 1 + \frac{δQ}{N Ê_K} \right)^{3N/2}.    (1.11)
The number of complexions increases by the same factor as the surface area:

\frac{W′_v}{W_v} = \left( 1 + \frac{δQ}{N Ê_K} \right)^{3N/2} = \left( 1 + \frac{2}{3} \frac{δQ}{N k_b T} \right)^{3N/2},    (1.12)

where the second equality uses the relationship of Sect. 1.3 between temperature and mean kinetic energy, namely k_b T = \frac{2}{3} Ê_K. Since N is enormous, we can use the limit (1 + x/n)^n → e^x for large n, which for δQ << N Ê_K gives Boltzmann's equation

\frac{W′_v}{W_v} = \exp\left( \frac{δQ}{k_b T} \right).    (1.15)
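Numerically, the approach to the exponential limit is very fast. The following sketch (an illustration with arbitrarily chosen values of N and δQ, not from the original text) compares the logarithm of Eq. 1.12 with the exponent in Eq. 1.15:

    import math

    kb = 1.38e-23              # J/K
    T = 300.0                  # K
    N = 10**6                  # number of atoms
    E_hat = 1.5 * kb * T       # mean kinetic energy per atom
    dQ = 100 * E_hat           # add heat equal to 100 atoms' worth of KE

    exact = 1.5 * N * math.log(1.0 + dQ / (N * E_hat))  # ln of Eq. 1.12
    limit = dQ / (kb * T)                               # exponent in Eq. 1.15
    print(exact, limit)        # 149.99... vs 150.0: already nearly identical

For any realistic number of atoms (~10^23) the two expressions are indistinguishable.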
We now generalize the argument leading to Eq. 1.15 for atoms with different masses.
Equation 1.8 for the total kinetic energy is now replaced by
E_K = \frac{1}{2} \sum_{i=1}^{N} m_i (u_i^2 + γ_i^2 + w_i^2).    (1.16)
Dividing through by E_K, this can be rewritten as

1 = \sum_{i=1}^{N} \left( \frac{u_i^2}{a_i^2} + \frac{γ_i^2}{a_i^2} + \frac{w_i^2}{a_i^2} \right)    (1.17)

where a_i = \sqrt{2E_K/m_i}. This is the equation for a 3N-dimensional ellipsoid with
semi-axes of length ai . For the same kinetic energy a lighter atom has a greater
velocity than a heavy atom. So the 3N-dimensional velocity-sphere becomes elon-
gated along the axial directions of lighter atoms, and shortened along the directions
for heavier atoms, becoming an ellipsoid. All possible velocity complexions for
fixed EK are represented by a 3N -dimensional vector from the center of this
ellipsoid ending somewhere on its surface. The argument about how the number
of complexions depends on EK is the same as for the equal mass/sphere case: Wv
is given by how many different 3N-dimensional velocity vectors there are, which
is proportional to the surface area of the 3N-ellipsoid. This ellipsoid is of fixed
shape determined by the various masses of the atoms. Thus its surface area scales as
the (3N − 1)th ≈ (3N)th power of the linear dimension of the ellipsoid. The linear dimension scales as \sqrt{E_K}, so the area scales as the (3N/2)th power of E_K as before. The addition of a small amount of heat provides the same factor of (1 + δQ/(N Ê_K))^{3N/2} for the number of complexions. For δQ << N Ê_K this leads again to Boltzmann's equation 1.15.
Equation 1.15 says that if a small amount of heat (kinetic energy) δQ is added to N atoms that have average kinetic energy Ê_K, then the number of velocity complexions increases by a factor of W′_v/W_v given by Eq. 1.15, where W_v and W′_v are the number of velocity complexions before and after the heat addition. The
increase in number of velocity complexions comes from the increase in the number
of ways to distribute the kinetic energy among the N atoms. Conversely, if we
remove heat (δQ is negative) the number of complexions decreases. The exponential
form of Eq. 1.15 is a reflection of the combinatoric nature of complexion counting.
For example, if we add an amount of heat equal to the average kinetic energy of one
atom, Ê_K, the increase in velocity complexions is a factor of e^{3/2} ≈ 4.48. If twice
as much heat is added, then the number of complexions increases by the square of
this factor, about 20-fold.
Another feature of Boltzmann’s equation is that the effect of a given increment
of heat depends on how much heat is already there; how large δQ is compared to the
average kinetic energy of the atoms. In terms of the geometric picture of Fig. 1.4,
the increment in radius of the velocity sphere due to a fixed amount of heat δQ increases the surface area of the 3N-sphere by a larger factor when the radius of the sphere (which is proportional to \sqrt{E_K}) is smaller. The proportionally larger effect
of heat added at a lower temperature is graphically illustrated in Fig. 1.5 using the
velocity sphere of a single atom. Here 2 units of kinetic energy are added to an atom
starting with 4 units of kinetic energy (low T) or starting with 8 units of energy (high
T). It is apparent from the figure, which is drawn to scale, that the area of the low T
sphere (on the left) is increased by a larger factor than the high T sphere.
Most of what follows will make use of Boltzmann’s equation 1.15. Before we
move on to complete the counting of complexions, the next section gives two
applications of this equation to demonstrate its explanatory power.
If we examine Boltzmann’s equation 1.15 we see that in the exponent the amount
of heat δQ is divided by the mean kinetic energy per atom, ÊK . A given amount
of heat will cause a greater increase in the number of velocity complexions (ways
to distribute the kinetic energy) when it is added to a collection of atoms with a
smaller amount of kinetic energy, i.e. with a lower temperature (see Fig. 1.5). So
if a small amount of heat leaves a body A at temperature TA and enters a body B
at a lower temperature TB it is clear that the number of velocity complexions for A
decreases by a smaller factor than that by which the number of velocity complexions
for B increases, because TB < TA . The total number of velocity complexions
is the product of the complexion numbers of A and B, so the total changes by a
factor
\frac{W′}{W} = \frac{W′_A W′_B}{W_A W_B} = \exp\left( \frac{δQ}{k_b} \left( \frac{1}{T_B} − \frac{1}{T_A} \right) \right).    (1.18)
This factor is greater than one because 1/TB > 1/TA and the exponent is
positive. If heat flowed in the reverse direction, from cold to hot, the number of
complexions would decrease, which by Boltzmann’s fundamental postulate means
it is less likely to be observed. So heat will flow from A to B until the number
of complexions stops increasing, i.e. reaches a maximum. This happens when
TB = TA , which is thermal equilibrium. Put another way, when TB = TA ,
the exponent in Eq. 1.18 is zero, so the ratio W′/W is one. The transfer of a
small amount of heat in either direction does not change the number of velocity
complexions.
The real import of counting complexions comes when representative numbers are
calculated. Take two 0.1 kg blocks of copper. One is at a temperature of 300 K. The
other is one millionth of a degree hotter at 300.000001 K. From the heat capacity
of copper, 385 J/kg/K, we can calculate how much more heat the hotter block has:
38.5 µJ. To equalize the temperature, half of this extra heat must flow into the colder
block. Evaluating Eq. 1.18 with δQ = 19.25 µJ and kb = 1.38 × 10−23 J/K the
exponent has a value of about 7.8 × 10^6, giving

\frac{W′}{W} = e^{7.8×10^6} ≈ 10^{3,400,000}.    (1.19)
The number of kinetic energy complexions increases by a factor of 1 followed by
more than three million zeros.2 The equal temperature situation has overwhelmingly
more complexions. Clearly the possibility of anything more than a really micro-
scopic flow of heat in the reverse direction, from cold to hot, is close to zero.
[Technical note: Equation 1.15 was derived assuming that the change in temper-
ature upon addition of the heat was negligible, while of course in the above thermal
equilibration example the temperature of A does change, from TA to TA − ΔT /2,
while that of B changes from TB to TB + ΔT /2, where ΔT = TA − TB . So the
calculation was done assuming constant temperatures for A and B equal to their
“average values” during the equilibration, namely TA − ΔT /4 and TB + ΔT /4. For
this example the error compared to the exact result (obtained by integration over a
continuous temperature change) is tiny.]
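The arithmetic of the copper-block example is easy to reproduce (a sketch of the numbers above, not from the book, using the same "average temperature" device as the technical note):

    import math

    kb = 1.38e-23                  # J/K
    TA, TB = 300.000001, 300.0     # hotter block A, colder block B (K)
    dT = TA - TB
    dQ = 0.1 * 385.0 * dT / 2.0    # heat to transfer: 0.1 kg x 385 J/kg/K x dT/2

    # Average temperatures of A and B during the equilibration
    TA_avg, TB_avg = TA - dT / 4.0, TB + dT / 4.0
    exponent = (dQ / kb) * (1.0 / TB_avg - 1.0 / TA_avg)
    print(exponent)                 # ~7.8e6, the exponent in Eq. 1.19
    print(exponent / math.log(10))  # ~3.4e6: W'/W ~ 10^3,400,000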
A rubber ball is released from a certain height above a smooth hard surface. It
accelerates downwards, Fig. 1.6, hits the surface, bounces up a few times, each
time to a lesser height until finally it comes to rest on the surface. Since energy is
conserved why does this happen, why does the ball not bounce back to the original
height and keep on bouncing for ever? Or why cannot the ball, after remaining at rest
for a while, spontaneously move upwards to its initial height? Initially the ball has a
certain potential energy mgh where m is the mass of the ball, h is the release height
above the surface, and g is the acceleration due to Earth’s gravity. This potential
energy is converted into the kinetic energy of the ball's motion, E_K = \frac{1}{2} m v^2 = mgh, where v is the speed of the ball when it reaches the surface. How many ways
are there to distribute this kinetic energy among the atoms of the ball? In effect, just
one, where all atoms have identical velocity components (u, γ , w) = (0, 0, −v)
equal to the center of mass motion. When the ball comes to rest this kinetic
energy has been converted to exactly the same amount of heat: kinetic energy now
distributed at random among the atoms of the ball, the surface, the air.
Boltzmann’s equation 1.15 tells us how the number of velocity complexions has
increased. For example take a 0.1 kg ball released from a height of 0.1 m, at room temperature, 300 K. Taking the acceleration due to gravity as g = 10 m/s^2, the heat produced is δQ = mgh = 0.1 J and the exponent is δQ/(k_b T) = 2.42 × 10^{19}, giving

\frac{W′}{W} = e^{2.42×10^{19}} ≈ 10^{10^{19}}.    (1.20)
In other words, the number of complexions increases by a factor of 1 followed by ten quintillion (10^{19}) zeros. This is a mind-bogglingly large number. No wonder
the effect of friction is irreversible and the kinetic energy components in the
surroundings could never spontaneously ‘line up’ to push the ball back up in the
air once it has come to rest.
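Again the numbers are simple to check (an illustrative sketch of the arithmetic above):

    kb = 1.38e-23                        # J/K
    m, g, h, T = 0.1, 10.0, 0.1, 300.0   # kg, m/s^2, m, K

    dQ = m * g * h             # 0.1 J of kinetic energy degraded to heat
    exponent = dQ / (kb * T)   # ln(W'/W) from Eq. 1.15
    print(exponent)            # ~2.42e19, i.e. W'/W ~ 10^(10^19)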
1.10 Counting Complexions: Part 2

We again have a collection of N atoms with total kinetic energy E_K. First assume
that there are no forces acting on the atoms, either from other atoms or from external
factors. This means that there is no variation in potential energy with the positions
of the atoms. In this case if there are Ws spatial complexions each can be combined
with all Wv possible velocity complexions having the same total kinetic energy EK .
This gives a total of
W = Ws × Wv (1.21)
complexions. An example in miniature is shown in Fig. 1.3. Here there are two
different spatial complexions which are combined with two different velocity
complexions to give a total of four complexions. If we apply Boltzmann’s postulate
to Eq. 1.21, what it tells us is that in the absence of forces each spatial complexion is
equally likely, since each can be combined with exactly the same number of velocity
complexions.
Now assume that there are forces acting. Equation 1.21 no longer applies. When
an atom moves its potential energy is changed according to the relationship
δU = −f δx    (1.22)

where f is the force acting on the atom and δx is its displacement. By conservation of total energy, the heat produced or consumed is δQ = −δU.
It is immaterial whether the forces are external or come from interactions between
the atoms: When there is a change in potential energy heat is created or annihilated,
depending on the sign of the change. Because of this heat change, what Boltzmann’s
key discovery, Eq. 1.15, tells us is that the corresponding spatial complexions are
now no longer equally likely, because each has a different number of possible
velocity complexions available to it. With constant total energy, spatial complexions
of higher potential energy have lower kinetic energy. Therefore they have fewer kinetic energy complexions and so they are less likely, and vice versa. This is
illustrated qualitatively in Fig. 1.7 for the velocity sphere of a single atom at two
positions of different potential energy.
To quantify the effect of potential energy on complexions in the general case,
first compare two spatial complexions of N atoms, labelled A and B which have
different potential energies: UA and UB . When the atoms move from complexion A
to complexion B an amount of heat −(UB − UA ) is created if we move downhill
in potential energy (UB < UA ) or is annihilated if we move uphill in potential
energy (UB > UA ). Using Eq. 1.15 the ratio of the corresponding number of velocity
complexions is
\frac{W_v(B)}{W_v(A)} = \exp\left( − \frac{(U_B − U_A)}{k_b T} \right)    (1.23)
Combining this with Boltzmann's postulate, the relative probability of two spatial complexions i and j is

\frac{p_i}{p_j} = \exp\left( − \frac{(U_i − U_j)}{k_b T} \right).    (1.24)

Put another way, the presence of potential energy has introduced a coupling between the spatial and velocity complexions. From Eq. 1.24 it follows that we can write the absolute probability of any spatial complexion i as

p_i = \frac{\exp(−U_i/k_b T)}{\sum_{i=1}^{W_s} \exp(−U_i/k_b T)}    (1.25)
since any pair of probabilities pi and pj given by Eq. 1.25 satisfy Eq. 1.24, and the
probabilities sum to one. The probability distribution pi is called the Boltzmann
distribution. The term e−Ui /kb T in the numerator is called the Boltzmann factor.
The sum of these factors in the denominator is taken over all possible spatial
complexions Ws . This sum functions as a probability normalizing factor and in spite
of its mundane origin it plays an important role in entropy and statistical mechanics.
Because of this it is given a name, the partition function, often denoted by the
symbol Z from the German Zustandssumme (literally sum of states):
Z = \sum_{i=1}^{W_s} \exp(−U_i/k_b T)    (1.26)
Using Eq. 1.25 we can now calculate the average value of any property X using
\bar{X} = \sum_{i=1}^{W_s} p_i X_i = \frac{1}{Z} \sum_{i=1}^{W_s} X_i \exp(−U_i/k_b T)    (1.27)
where Xi is the value of the property in the i th spatial complexion. For example the
mean potential energy is given by
\bar{U} = \sum_{i=1}^{W_s} p_i U_i = \frac{1}{Z} \sum_{i=1}^{W_s} U_i \exp(−U_i/k_b T)    (1.28)
The total number of complexions W is obtained by summing the number of velocity complexions over all spatial complexions:

W = \sum_{i=1}^{W_s} W_v(i)    (1.29)
where in general Wv (i) will vary from spatial complexion to spatial complexion
depending on that complexion’s potential energy. We will pick an arbitrary spatial
complexion o, with potential energy Uo as a reference and use Eq. 1.23 to write the
number of velocity complexions possible with spatial complexion i in terms of the
number of velocity complexions of the reference state:
W_v(i) = W_v(o) \exp\left( \frac{−(U_i − U_o)}{k_b T} \right)    (1.30)
Substituting Eq. 1.30 into Eq. 1.29 gives

W = W_v(o) \exp(U_o/k_b T) \sum_{i=1}^{W_s} \exp(−U_i/k_b T) = C Z    (1.31)
The factor coming from the arbitrary complexion o used as a reference state is
common to all the spatial complexions. It is brought outside the sum as a constant,
C . The sum itself is just the partition function Z of Eq. 1.26.
Equation 1.31 gives the total number of complexions for a fixed number of atoms
with a constant total energy E. This assumes that the collection of N atoms we are
interested in is completely isolated from its environment: No heat can enter or leave,
which is known as adiabatic conditions. The more common situation is a system in
thermal equilibrium with its surroundings, i.e. under constant temperature condi-
tions. How does this affect the counting of complexions? We can apply the results
from the heat flow example, Sect. 1.7 to find out. Consider our collection of N atoms
at equilibrium with its surroundings at a temperature T . At a certain moment this
collection has total energy E. The total number of complexions is given by Eq. 1.31.
Since it is in thermal equilibrium with its surroundings, say a small fluctuation (due
to collisions with its surroundings, or thermal radiation emitted or received) causes a
small amount of heat δQ to pass from the surrounding environment to the system or
vice versa. Although Eq. 1.15 says that the number of complexions for the system of
N atoms changes, application of Eq. 1.18 with TA = TB = T shows that the number
of complexions of the surroundings changes by an equal and opposite amount, so
that the total number of complexions (of system plus environment) is unaffected by
the heat flow allowed by constant T conditions: Complexion counting is the same
as though we kept the system at constant energy E.
Summary
Complexion counting has produced the following results:

• Equation 1.9: The number of kinetic energy complexions is proportional to the total kinetic energy raised to the 3N/2th power,

W_v ∝ E_K^{3N/2}

• Equation 1.21: When there are no forces acting the total number of complexions is the product of the number of spatial and velocity complexions

W = W_s × W_v

• Equation 1.25: When there are forces, the probability of a spatial complexion i with potential energy U_i is given by the Boltzmann distribution

p_i = \frac{\exp(−U_i/k_b T)}{\sum_{i=1}^{W_s} \exp(−U_i/k_b T)}

• Equation 1.31: The total number of complexions is proportional to the partition function

W ∝ \sum_{i=1}^{W_s} \exp(−U_i/k_b T)

where the sum is over all possible spatial complexions, and the terms in the sum are the Boltzmann factors.
With the ability to count complexions and determine their probabilities we
can now explain what we see at equilibrium, by determining what state has
the most complexions. To paraphrase Feynman (1972) much of statistical
mechanics is either leading up to the summit of Eqs. 1.25 and 1.27 or coming
down from them. In this chapter on complexion counting we proceeded
entirely without the concept of entropy. Historically, however, entropy was
characterized as a thermodynamic quantity before its molecular, statistical
origins were known. So in the next chapter we connect the concepts of
complexion counting and entropy.
Chapter 2
What Is Entropy?
Clausius defined the change in entropy δS of a system when a small amount of heat δQ is transferred to it at temperature T as

δS = \frac{δQ}{T}    (2.1)
where δQ is small enough that it causes a negligible change in temperature. But this
ratio of added heat to temperature is just what appears in the exponent of Eq. 1.15.
If we substitute Eq. 2.1 in and take the logarithm of both sides we get:
δS = k_b \ln \frac{W′_v}{W_v}    (2.2)
The change in entropy is due to the change in the number of complexions; the
change in the number of ways to distribute the kinetic energy among the atoms. The
equivalence in Eq. 2.2 led Boltzmann to a more general definition of the entropy of
a state in terms of the total number of complexions for that state
S = kb lnW + C (2.3)
where W includes both spatial and velocity complexions. The more general
expression for a change in entropy is then
ΔS = k_b \ln \frac{W′}{W}    (2.4)
The use here of the notation ΔS instead of δS indicates that Boltzmann’s expression
is valid for any size change of entropy.
C in Eq. 2.3 is an undetermined constant which is unimportant since we are
always interested in changes in entropy, so it is convenient to set C = 0. Then
the logarithmic relationship between entropy and complexions means that entropies
of independent systems are additive because the total number of complexions is the
product of each system’s complexion number.
In Eqs. 2.2–2.4 the units of entropy are energy per degree of temperature, e.g. J/mol/K, so the numerical value depends on the units being used. However, S/k_b is dimensionless; strictly speaking only dimensionless quantities have physical meaning, but units of energy per kelvin are more commonly used. The crucial point is that
entropy is simply a logarithmic measure of the number of complexions. As we
have seen it is often insightful to express results directly in terms of numbers
of complexions rather than in entropy units: Although the molecular definition of
entropy is probabilistic, appreciation of the numbers involved more clearly conveys
the almost deterministic nature of entropic effects at the observable level.
The macroscopic state that we see at equilibrium is the state that can happen
in the most ways; that corresponds to the most complexions.
This principle is illustrated in miniature by the dice analogy: If you roll two dice,
the ‘macroscopic’ total of 7 is the most likely because 6 of the 36 complexions,
(1,6), (2,5), (3,4), (4,3), (5,2) and (6,1) produce this total, more complexions than
for any other total. We will observe a total of 7 on average one sixth of the time.
Equation 2.3 formalizes Boltzmann’s identification of the most probable state,
the state that can happen the most ways, with the state of maximum entropy. What
must be appreciated is that when you deal with the typical number of atoms involved
in a real world situation, the state with the most complexions dominates to an almost
unbelievable degree.
Consider 100 million gas molecules distributed between two equal volumes,
labeled left and right in Fig. 2.1. Each molecule could be in either the left hand
volume or the right hand volume. So at this macroscopic level of description there
is a total of 2^N different spatial complexions. A particular macroscopic distribution
would be L molecules in the left-hand volume, and R = N − L molecules in the
right-hand volume. The number of spatial complexions in this macroscopic state is
the different number of ways of splitting N molecules into two groups with L in one
group, R in the other. This is given by the binomial distribution
N!
W (L) = (2.5)
L!R!
example are for 100 million molecules, which is a rather small amount of matter,
about 5 million billionths of a gram of oxygen gas. For truly macroscopic amounts of
matter the complexions will cluster even more closely around the maximum entropy
50:50 state.
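How sharply the complexions cluster around the 50:50 state can be checked from Eq. 2.5 (a sketch added here, using logarithms of factorials to avoid overflow; the 50.1:49.9 offset is an arbitrary illustrative choice):

    from math import lgamma, log

    def lnW(N, L):
        # ln of Eq. 2.5: ways to put L of N molecules in the left volume
        return lgamma(N + 1) - lgamma(L + 1) - lgamma(N - L + 1)

    N = 100_000_000
    peak = lnW(N, N // 2)           # the 50:50 state
    off = lnW(N, int(N * 0.501))    # a 50.1 : 49.9 split
    print(peak / log(10))           # ~3.0e7: the 50:50 state alone has
                                    # ~10^30,000,000 complexions
    print((peak - off) / log(10))   # ~87: even a 0.1% deviation from 50:50
                                    # is ~10^87-fold rarer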
It is also important to understand what Boltzmann’s postulate is not saying. It is
not saying that the macroscopic behavior is a result of the atoms exploring all the
complexions (the so-called ergodic hypothesis), or even most of the complexions.
Even for this small system of 100 million atoms and at this coarse level of
description the total number of complexions is about 1 followed by 30 million
zeros. The system can only explore a tiny fraction of the possible complexions in the
time we observe it. Therefore it will almost never encounter the vanishing minority
of complexions that are detectably unlike the most probable or maximum entropy
states.
The partition of the volume in Fig. 2.1 into left and right regions is arbitrary. Another
equi-volume partition is illustrated by the checkerboard pattern in the lower left
of the figure. The same argument about distributions applies to any equi-volume
division: almost all complexions will correspond to an equal distribution. From
this we conclude that in the absence of external forces the maximum entropy
distribution, the one that can happen the most ways, the one that represents all those
that will be observed at equilibrium, is a uniform distribution of gas in the available
volume.
Now let the gas initially be confined to one volume. Then a valve is opened to a
second, equal volume, as in the right panel of the figure. As the gas molecules move
from one complexion to another because of their thermal motion, over time they
are overwhelmingly likely to end up in the equilibrium-like complexions, the ones
with the gas equally distributed between the volumes, simply because almost every
complexion is like this. So the gas expands to uniformly occupy all the volume.
This is not because the second volume has a vacuum—in fact the expansion will
occur even if the second volume is filled with a different gas. The mixing of the
two gases is just a by-product of each gas moving independently to states for which
there are the most complexions. A similar argument applies to mixing of liquids: A
drop of ink mixing with water, or milk stirred into coffee. The molecules of ink or
milk are simply moving to complexions belonging to the most likely distribution—a
uniform distribution in their available volume. Similarly with any solute dissolved
in a solvent. The counting of solute spatial complexions is analogous to that of a gas
in the same volume as the solution. If there are no strong interactions between the
solutes, and the solute concentration is not too high, then the most likely distribution
is a uniform density.
2.4 The Entropy of an Ideal Gas
An ideal gas is one in which the molecules have no volume, and make no
interactions. The molecules are basically modelled as points, whose only properties
are mass, position and velocity. An ideal gas is also a good model for gases
like nitrogen, oxygen and argon at normal temperatures and pressures. For these
reasons it is often used to illustrate thermodynamic principles. Since there are
no forces, the total number of complexions is given by Eq. 1.21. The number of
velocity complexions according to Eq. 1.9 is proportional to E_K^{3N/2} and so it is also proportional to T^{3N/2}, since temperature is proportional to the total kinetic energy
for a fixed number of atoms. For the spatial complexions, we imagine the volume
V occupied by the gas to be divided up into a large number of small cubic regions
of equal volume b. There are V /b ways of placing a single molecule into one of
the small boxes. In an ideal gas the molecules are all independent of each other so
there are (V /b)2 ways of placing two molecules into the boxes, and so on. Thus
the total number of spatial complexions is proportional to V N . Clearly the absolute
number of spatial complexions depends on how finely we divide space, but for any
process of interest involving a volume change, the entropy change will depend on
the ratio of the numbers of complexions, and b will cancel out. Putting the spatial
and velocity complexion results together, the total number of complexions for an
ideal gas is
W ∝ V^N T^{3N/2}    (2.6)

Taking the logarithm, using Eq. 2.3, gives the entropy of an ideal gas:

S_{id} = N k_b \ln V + \frac{3}{2} N k_b \ln T + C    (2.7)
Note the spatial and velocity contributions to the entropy are additive, which follows
from what was said earlier, that in the absences of forces the spatial and velocity
complexions can be realized independently of the other, so the total number of
complexions is given by their product (Eq. 1.21). Again, C is a constant of no
interest since it does not depend on T or V and so it will cancel out for any change
in these two thermodynamic quantities.
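For instance (an illustrative calculation added here, not from the text), Eq. 2.7 gives the entropy increase when one mole of ideal gas doubles its volume at constant temperature:

    import math

    kb = 1.38e-23        # J/K
    NA = 6.022e23        # Avogadro's number

    # dS = N kb ln(V2/V1); the T term and the constant C cancel at constant T.
    dS = NA * kb * math.log(2.0)
    print(dS)            # ~5.76 J/K per mole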
One of Boltzmann’s very first applications of the distribution that now bears his
name was to the distribution of a gas in a gravitational field (Boltzmann 1877).
Taking an altitude of zero (mean sea level) as a reference, the potential energy of a
gas molecule of mass m at a height h above sea level is mgh. Then Eq. 1.24 gives
the probability of that gas molecule being at height h as
p(h) ∝ \exp\left( \frac{−mgh}{k_b T} \right).    (2.8)
Since the density of gas at height h is proportional to this probability,

ρ(h) = ρ(0) \exp\left( \frac{−mgh}{k_b T} \right)    (2.9)

where ρ(0) is the density at sea level. Using an approximate molecular weight of
30 g/mole for atmospheric gas, the mass per molecule is 5 × 10−26 kg, g = 10 N/kg.
With T = 300 K and kb = 1.38 × 10−23 J/K, the factor kb T /mg has the value
8280 m. In other words, the equation gives an atmospheric density half that of
sea level at about 5700 m, or 70% of the way up Mt. Everest. Here we are not
correcting for the decrease in temperature with altitude. Equation 2.9 is known as the
barometric equation. It gives the maximum entropy arrangement of gas molecules
in the atmosphere at constant temperature. This demonstrates that the principle of
maximum entropy can operate on a very large scale. Note that on the laboratory
scale, these variations in density with height are tiny, so the uniform distribution of
Sect. 2.3 is still the maximum entropy distribution.
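The numbers quoted above follow directly from Eq. 2.9 (a sketch of the arithmetic, added for illustration):

    import math

    kb, T, g = 1.38e-23, 300.0, 10.0   # J/K, K, N/kg
    m = 30e-3 / 6.022e23               # ~5e-26 kg per molecule (30 g/mol)

    H = kb * T / (m * g)               # the scale height kb*T/mg in Eq. 2.9
    print(H)                           # ~8300 m
    print(H * math.log(2.0))           # ~5700 m: altitude of half sea-level density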
2.6 The Distribution of Velocities at Equilibrium

Sections 2.3 and 2.5 give the maximum entropy spatial distributions of a gas under
different conditions. We now consider the maximum entropy distribution of kinetic
energy in a gas. As illustrated in Fig. 1.4, for a fixed total kinetic energy EK all the
possible velocity complexions can be represented by a 3N-dimensional vector of length R = \sqrt{2E_K/m}, 3N being the number of velocity terms. Since these velocity
vectors are of constant length R they trace out the (3N − 1)-dimensional surface of
a 3N -sphere, the number of degrees of freedom being reduced by one because of
the constant energy constraint. We can use the velocity-sphere concept to describe
the equilibrium distribution of kinetic energy/velocity components in more detail.
First, we know from Sect. 1.7 on heat flow that in a region containing a large
number of atoms the temperature should be uniform, i.e. on average the kinetic
energy is uniformly distributed. However this does not mean that at the atomic
level at any particular moment all the atoms have exactly the same kinetic energy,
distributed equally among the x, y and z components. In fact this completely
uniform distribution is very unlikely: It corresponds to a complexion where |u1 | =
|γ1 | = |w1 | = |u2 | = . . . |wN |. Geometrically, this corresponds to a special
direction in 3N-dimensional space for the velocity vector, namely aligned exactly
on one of the 3N diagonals. But there are many more non-diagonal directions in the
3N -dimensional sphere to which the velocity vector could point. Thus there will be
a range of velocities among the atoms. The form of this velocity distribution can be
obtained from the velocity-sphere, as follows.
Consider any component of the velocity of any atom, say the x component of
the ith atom. If the distribution of velocity vectors is uniform over the (3N − 1)-
dimensional surface of the velocity sphere, as per Boltzmann’s postulate, then what
is the probability distribution of this velocity component? If this velocity component has a particular value u_i, where −R ≤ u_i ≤ +R, and R = \sqrt{2E_K/m} is the radius of the velocity sphere, then this constrains all possible 3N-dimensional velocity vectors to trace a (3N − 2)-dimensional 'circle' C of radius

R_c = \left( R^2 − u_i^2 \right)^{1/2} = R \left( 1 − \frac{\frac{1}{2} m u_i^2}{E_K} \right)^{1/2}    (2.10)
on the (3N − 1)-surface, see Fig. 2.2. In Eq. 2.10, \frac{1}{2} m u_i^2 is the x component of the kinetic energy of atom i.
Technically, in high dimensions the (3N − 1)-dimensional surface of a 3N-
sphere and a (3N − 2)-dimensional circle on this surface are both hyperspheres,
but retaining the surface/circle terminology for the minus-one and minus-two
dimensions makes the explanation easier.
The probability of observing a given value u_i is proportional to the area of this circle,

p(u_i) ∝ R_c^{3N−2}    (2.11)

which, using Eq. 2.10 and dropping constant factors, is for large N very nearly

p(u_i) ∝ \left( 1 − \frac{\frac{1}{2} m u_i^2}{E_K} \right)^{3N/2}    (2.12)

Re-writing the right hand side using the identity x = e^{\ln x} we have

p(u_i) ∝ \exp\left( \frac{3N}{2} \ln\left( 1 − \frac{\frac{1}{2} m u_i^2}{E_K} \right) \right).    (2.13)
Since there are very many atoms among which to partition the total kinetic
energy, we can assume that almost all the time a particular atom has only a small
fraction of the total, so \frac{1}{2} m u_i^2 << E_K, and we can use the logarithm approximation ln(1 − x) ≈ −x for small x to obtain

p(u_i) ∝ \exp\left( − \frac{\frac{1}{2} m u_i^2}{\frac{2}{3} E_K/N} \right) ∝ \exp\left( − \frac{\frac{1}{2} m u_i^2}{k_b T} \right)    (2.14)
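The quality of the approximation leading to Eq. 2.14 is easy to probe numerically (an illustrative sketch in reduced units where k_bT = m = 1; the values of N and u are arbitrary choices):

    import math

    N = 10**6                      # number of atoms
    kbT, m = 1.0, 1.0              # reduced units
    EK = 1.5 * N * kbT             # total kinetic energy, EK = (3/2) N kb T

    # Compare the exact sphere form (Eq. 2.13) with the Gaussian limit (Eq. 2.14)
    for u in [0.5, 1.0, 2.0, 3.0]:
        x = 0.5 * m * u * u / EK
        exact = math.exp(1.5 * N * math.log(1.0 - x))
        gauss = math.exp(-0.5 * m * u * u / kbT)
        print(u, exact, gauss)     # agreement improves further as N grows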
Chemical and biochemical reactions of all kinds start with the reactants interacting
with each other to form some kind of reactive complex, then this complex undergoes
some kind of rearrangement of atoms, followed by breakdown of the complex into
the products. Rearrangement of the atoms during the reaction most often involves
going over some potential energy barrier. As with evaporation, the molecules that
are most likely to react are the ones whose atoms have higher kinetic energy, those
at the upper end of the Maxwell-Boltzmann distribution. Figure 2.3 shows how this
distribution changes with temperature, in this case going from 300 to 360 K, a 20%
increase. We know that the average kinetic energy increases proportionally, because it is proportional to the absolute temperature (Sect. 1.3).
Fig. 2.3 Probability of kinetic energy component values at 300 K (dotted line) and 360 K (solid
line)
Summary
Clausius’ definition of an entropy change as the transferred heat divided by
the temperature is equivalent to the logarithm of the change in number of
kinetic energy complexions, the number of ways to distribute heat. More
generally the entropy of a state is a logarithmic measure of the total number
of complexions, kinetic plus spatial, available to that state. Equilibrium cor-
responds to the maximum entropy state, the state with the most complexions.
This leads, in the absence of any constraints, to uniform distributions of gases, mixtures and solutes in the accessible volume, to the exponential dependence of atmospheric pressure on altitude, and to the Gaussian distribution of atomic velocity components (the Maxwell-Boltzmann distribution).
Chapter 3
Entropy and Free Energy
Applying the relation S = k_b \ln W + C to the general expression for the total number of complexions, Eq. 1.31, we obtain the most general expression for the entropy, valid whether there are forces or not:

S = k_b \ln Z + C    (3.1)

where Z is the partition function. For an entropy difference between states A and B:

ΔS = k_b \ln \frac{Z_B}{Z_A} = k_b Δ\ln Z    (3.2)
Here Z is defined, as in Eq. 1.26, by

Z = \sum_{i=1}^{W_s} \exp(−U_i/k_b T)    (3.3)
which is a weighted sum over spatial complexions, where each spatial complexion is weighted by the number of kinetic energy complexions available to it. So the spatial and kinetic energy contributions to entropy seem to be combined in a single quantity. They can nevertheless be separated, as follows.
1. The mean potential energies of states A and B are computed from the Boltzmann distribution, Eq. 1.28:

\bar{U}_A = \frac{1}{Z_A} \sum_{i}^{W_s(A)} U_i \exp(−U_i/k_b T), \qquad \bar{U}_B = \frac{1}{Z_B} \sum_{i}^{W_s(B)} U_i \exp(−U_i/k_b T).    (3.4)
The change in average potential energy is Δ\bar{U} = \bar{U}_B − \bar{U}_A. Applying conservation of energy, the change in kinetic energy is ΔQ = −Δ\bar{U}.
2. Equation 1.15 is used to calculate the change in number of kinetic energy
complexions:
\frac{W_v(B)}{W_v(A)} = \exp(−Δ\bar{U}/k_b T),    (3.5)
assuming again that the number of atoms N is large enough that ΔQ << E_K, i.e. the temperature change is negligible.
3. We then take the logarithm to obtain the kinetic energy contribution to the entropy:

ΔS_v = k_b \ln \frac{W_v(B)}{W_v(A)} = −\frac{Δ\bar{U}}{T}    (3.6)

4. The spatial contribution to the entropy is then obtained by subtracting the kinetic energy contribution from the total:

ΔS_s = ΔS − ΔS_v = k_b Δ\ln Z + \frac{Δ\bar{U}}{T}    (3.7)

Put another way, the total entropy change can be cleanly separated into its two contributions

ΔS = ΔS_v + ΔS_s    (3.8)

where ΔS_v = −Δ\bar{U}/T and ΔS_s are the kinetic complexion and spatial complexion contributions to the entropy change, respectively.
There is an alternative approach to separating the total entropy into spatial and
kinetic components. It is less direct, but it leads to the third fundamental equation
for entropy. We can write as an identity
−k_b \sum_{i=1}^{W_s} p_i \ln p_i = k_b \sum_{i=1}^{W_s} p_i (\ln Z + U_i/k_b T)    (3.9)
where Eq. 1.25 has been used to substitute for ln p_i. The first term sums to k_b ln Z since ln Z can be brought in front of the summation and \sum_i p_i = 1. Using Eq. 1.28 the second term simply gives the average energy divided by temperature, \bar{U}/T. So
−k_b \sum_{i=1}^{W_s} p_i \ln p_i = k_b \ln Z + \bar{U}/T.    (3.10)
Comparing the right hand side with Eq. 3.7, it is just the spatial entropy up to a constant:

−k_b \sum_{i=1}^{W_s} p_i \ln p_i = S_s + C.    (3.11)
This is known as the Gibbs-Planck-Einstein expression for the spatial entropy. The
constant C is unimportant since we are only ever interested in changes in entropy,
wherein it cancels.
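The identity is easily confirmed on a toy system (a sketch added here, with three arbitrary spatial complexions, in units where k_b = T = 1):

    import math

    U = [0.0, 1.0, 3.0]        # potential energies of three spatial complexions

    Z = sum(math.exp(-u) for u in U)            # partition function, Eq. 1.26
    p = [math.exp(-u) / Z for u in U]           # Boltzmann distribution, Eq. 1.25
    U_bar = sum(pi * u for pi, u in zip(p, U))  # mean potential energy, Eq. 1.28

    lhs = -sum(pi * math.log(pi) for pi in p)   # -sum p ln p   (kb = 1)
    rhs = math.log(Z) + U_bar                   # ln Z + U/T    (Eq. 3.10)
    print(lhs, rhs)                             # the two agree exactly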
The final thing we have to account for is the possibility that, when there is a change
in the potential energy due to a change in the spatial complexions of the atoms,
not all that energy turns into kinetic energy. Some of that energy may be used to
do work w. Quite simply, the amount of heat produced is reduced by the amount of work done: ΔQ = −Δ\bar{U} − w. A common example is when there is a change in volume, ΔV, which does work w = PΔV against an external pressure P. The change in kinetic energy is now −Δ\bar{U} − PΔV and Eq. 3.8 for the change in total entropy is replaced by

ΔS = −\frac{Δ\bar{U} + PΔV}{T} + ΔS_s = −\frac{Δ\bar{H}}{T} + ΔS_s    (3.12)

where \bar{H} = \bar{U} + PV is the enthalpy.
Other types of work include surface free energy work w = γ ΔA done by a change
ΔA of surface area with free energy (surface tension) of γ , electrical work w =
qΔΨ caused by a charge q moving across a voltage difference ΔΨ , and magnetic
work w = Δ(B · m), where B and m are the magnetic field and dipole moment,
respectively.
Consider a molecule X that can adopt two conformations, A and B. From Eq. 1.25 the relative probability of the two conformations gives the equilibrium constant

K_{AB} = \frac{p_A}{p_B} = \frac{Z_A}{Z_B}    (3.17)

which is the ratio of the partition functions for the A and B conformations, respectively. If we use the identity

Z = \exp\left( \frac{k_b T \ln Z}{k_b T} \right)    (3.18)

then, defining the free energy of each conformation as G = −k_b T \ln Z, the equilibrium constant takes the familiar form K_{AB} = \exp(−(G_A − G_B)/k_b T).
The equilibrium will favor the conformation that has the greatest total entropy
(number of complexions).
Equation 3.17 was obtained for a system containing a single molecule of X, in
other words the equilibrium constant would refer to the average occupancy of the
A and B conformations over a period of time long compared to the time X spends
in each conformation. Alternatively, if the system contained a macroscopic number
of X molecules, at any instant the ratio of the number in the A and B state (or in
solution the ratio of concentrations) will be given by the same probability ratio:
\frac{N_A}{N_B} = \frac{[A]}{[B]} = \frac{p_A}{p_B} = K_{AB}    (3.21)
Summary
1. The partition function Z is the sum of the Boltzmann factors over the
spatial complexions.
Z = \sum_{i=1}^{W_s} \exp(−U_i/k_b T).

2. The change in total entropy between two states is the change in the logarithm of the partition function

ΔS = k_b Δ\ln Z
3. The change in the kinetic energy contribution to the entropy is

ΔS_v = \frac{ΔQ}{T} = −\frac{Δ\bar{U}}{T}
4. The change in the spatial contribution to the entropy is given by the
difference of (2) and (3), or directly from the Gibbs-Planck-Einstein
expression
ΔS_s = k_b Δ\ln Z + Δ\bar{U}/T = Δ\left( −k_b \sum_{i=1}^{W_s} p_i \ln p_i \right)
Chapter 4
Entropic Forces

Abstract This chapter examines the origin of entropic forces. These forces are
compared with forces that arise from interaction potentials, like gravitational and
electrical forces.
The gravitational potential energy of two atoms of mass m_1 and m_2 separated by a distance r is

U = −\frac{G m_1 m_2}{r}    (4.1)

where G is the gravitational constant. The force acting on atom 1 due to atom 2 is given by the gradient of this potential with respect to the position of atom 1:

f_1 = −\frac{dU}{dr_1}    (4.2)
The force is directed along the line between the two atoms in the direction of
decreasing potential. The force on atom 2 is obtained in the same manner and is
equal and opposite.
Another familiar example is a metal spring. For small displacements the spring
potential is
U = \frac{1}{2} K x^2    (4.4)
where K is the spring constant, and x is the displacement of the spring away from
its unstrained length. The gradient of this potential gives the restoring force
f = −Kx (4.5)
which is linear in the displacement (Hooke’s Law). The minus sign indicates that the
force is in the opposite direction to the displacement and acts to restore the spring
to its unstrained state.
Gravitational and electrical forces derived from interaction potentials (IP) act on individual atoms, but at the macroscopic level they are very concrete: We can feel the force exerted by a weight held in our hand. We can feel the resistance when
we stretch a metal spring: It arises from electrical forces bonding the metal atoms
together as we slightly displace the atoms from their equilibrium separations. Forces
also arise from entropic effects, but rather than coming from individual atom-atom
interactions they arise from the collective behavior of atoms. Because of this they
perhaps seem less concrete; but they are just as real and tangible. The ideal gas again
forms a good starting point.
Consider a cylinder fitted with a piston and filled with N molecules of an ideal gas,
Fig. 4.1. The area of the piston is A, and the position of the piston relative to the
bottom of the cylinder is h. The volume of the gas is V = Ah, and its pressure is
given by Boyle's Law, Eq. A.20, as

P = N k_b T / V.    (4.6)

The outward force exerted by the gas on the piston is this pressure times the piston area:

f = P A = N k_b T / h.    (4.7)
Alternatively, we can describe the same force using the ideal gas entropy, Eq. 2.7, with V = Ah, defining an entropic potential

−T S_{id} = −N k_b T \ln(Ah) − \frac{3}{2} N k_b T \ln T − T C.
We find, in exact correspondence with Eq. 4.2, that the negative gradient of this entropic potential with respect to the piston position gives a force

f = −\frac{d(−T S_{id})}{dh} = \frac{N k_b T}{h}

identical to Eq. 4.7: The force exerted by the gas is entirely entropic in origin.
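That the gradient of −T S_id reproduces the mechanical force can be checked numerically (an illustrative sketch with arbitrarily chosen N, T and A; the function name is mine):

    import math

    kb = 1.38e-23                   # J/K
    N, T, A = 1e22, 300.0, 0.01     # molecules, K, piston area (m^2)

    def entropic_potential(h):
        # -T*S_id for the gas; h-independent terms are dropped since
        # they do not contribute to the gradient.
        return -N * kb * T * math.log(A * h)

    h, dh = 0.1, 1e-8
    f_num = -(entropic_potential(h + dh) - entropic_potential(h - dh)) / (2 * dh)
    print(f_num, N * kb * T / h)    # both ~414 N, matching Eq. 4.7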
Consider now our cylinder of ideal gas when the piston is loaded with a weight of
mass m, as shown in Fig. 4.2. We assume that the mass of the piston is negligible.
With no load the pressure in the cylinder must equal atmospheric pressure, Patm .
The equilibrium position of the piston is then given by
V_0 = \frac{nRT}{P_{atm}}    (4.8)
where V0 is the volume of the gas, and n is the number of moles. When the piston
is loaded with the weight and comes to its new equilibrium position the pressure in
the cylinder increases to Patm + mg/A to counteract the external atmosphere and
the weight. The new equilibrium position in terms of the gas volume is
V_w = \frac{nRT}{P_{atm} + mg/A}    (4.9)
The surrounding atmosphere, of volume V_{atm}, expands by −ΔV as the gas in the cylinder is compressed, so by Eq. 2.7 its entropy changes by ΔS_{atm} = n_{atm} R \ln((V_{atm} − ΔV)/V_{atm}) ≈ −n_{atm} R ΔV/V_{atm}, since V_{atm} >> ΔV. We don't need to know the number of moles or the volume of the atmosphere, since by the ideal gas law their ratio is given by n_{atm} R/V_{atm} = P_{atm}/T, which finally gives the change in entropy of the atmosphere as

ΔS_{atm} = −\frac{P_{atm} ΔV}{T}    (4.12)
which is linear in displacement and positive. The total change in entropy is
ΔS_{tot} = −\frac{mgΔV}{AT} − \frac{P_{atm} ΔV}{T} + nR \ln\left( \frac{V_0 + ΔV}{V_0} \right)    (4.13)
To summarize, since ΔV is negative, the first two terms are positive and the third
term is negative. The first term is the change in the kinetic part of the entropy, while
the second and third terms give the spatial contribution, including both gas in the
cylinder and in the atmosphere. The maximum in entropy is obtained by taking the
derivative with respect to the displacement ΔV and setting it equal to zero:
0 = −\frac{mg}{AT} − \frac{P_{atm}}{T} + \frac{nR}{V_0 + ΔV}    (4.14)
which is the same as that given by force balance, Eq. 4.9. Figure 4.3 plots the
various entropy terms against linear displacement x = ΔV /A with parameters
T = 300 K, P = 1 atm, n = 0.1 moles, A = 0.01 m², with the piston loaded by a weight.
Fig. 4.3 Entropy contribution from gravitational potential energy change plus external atmosphere
(dashes), gas in the cylinder (dots) and total entropy (solid line)
More generally, for a process at constant pressure the total entropy change has the form

ΔS_{tot} = −\frac{Δ\bar{U}}{T} − \frac{P_{atm} ΔV}{T} + ΔS_s    (4.16)

and translating this into energeticist's terms by multiplying by −T we get

−T ΔS_{tot} = Δ\bar{U} + P_{atm} ΔV − T ΔS_s = ΔH − T ΔS_s = ΔG    (4.17)
Another example of an entropic force is osmotic pressure: a dilute solution of molar concentration m exerts an osmotic pressure

P = mRT    (4.18)
From the gradient of the entropic potential −T S_c(x) with respect to the end-to-end distance x, the restoring force is

f = −\frac{d(−T S_c)}{dx} = −\frac{4 k_b T}{L^2} x    (4.22)

an entropic spring obeying Hooke's law with spring constant K = 4k_b T/L^2. The
minus sign in Eq. 4.22 indicates that the force always acts to restore the polymer
to its unstrained, maximum entropy length of L. Rubber, for instance, consists of
cross-linked random polymers, so when two ends of a piece of rubber are pulled the
strain is transmitted to individual polymer chains which results in a restoring force
on their ends described by Eq. 4.22. In actual rubber there are other contributions to
elasticity, but the polymer conformational entropy reduction is the major source of
elasticity. Polymer conformational entropy is of course a type of spatial entropy.
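As an aside (an illustrative sketch; the 10 nm chain length is an arbitrary choice, and the function name is mine), the entropic spring constant K = 4k_bT/L² and its characteristic growth with temperature are simple to tabulate:

    kb = 1.38e-23     # J/K

    def polymer_spring_constant(L, T):
        # Entropic spring constant K = 4*kb*T/L^2 of Eq. 4.22
        return 4.0 * kb * T / L**2

    for T in (280.0, 300.0, 360.0):
        print(T, polymer_spring_constant(10e-9, T))  # ~1.7e-4 N/m at 300 K

Stiffness proportional to temperature is the signature of an entropic force; an ordinary metal spring, by contrast, softens slightly on heating.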
By analogy with the gas spring in Sect. 4.2.2 consider a weight of mass m attached
to an elastic cord (Fig. 4.6). The cord is made of rubber with a polymer chain density
ρ (number of chains per unit volume). The cord in the unstrained state has a length
B and a constant cross-sectional area A, so the volume of rubber is AB, and the
number of polymer chains is Np = ρAB. If the weight stretches the band by a
distance d the decrease in gravitational potential energy is ΔUg = −mgd with a
corresponding increase in the kinetic part of the entropy by ΔSv = −ΔUg /T =
+mgd/T . The strain induced in the band, as a ratio of its original length, is d/B.
For simplicity assume that this strain is homogeneously distributed throughout the
rubber. In other words, the strain ratio for each polymer molecule is equal to the
macroscopic strain. Hence in Eq. 4.21 x/L = d/B. Then the stretching of the band
reduces the conformational entropy by
d2 −2kb ρA 2
ΔSc = −2kb Np = d . (4.23)
B2 B
The chain entropy of the rubber produces a Hooke’s law spring with a force constant
that is proportional to the density of the rubber and the cross-sectional area, and
inversely proportional to the length, as one expects. The force constant is also
proportional to temperature, characteristic of entropic forces.1 The minimum of the
corresponding free energy also occurs at the point of force balance, with a value
given by
ΔGtot = ΔUg − T ΔSc . (4.26)
4.4 Maximum Entropy Is Equivalent to Minimum Free Energy

People who have only encountered free energies at the molecular scale, for example
in the thermodynamics of chemical and biochemical reactions, may be surprised
by their appearance in the simple laboratory scale mechanical examples given here.
Even the distinction between energy and enthalpy contributions (ΔU vs. ΔH) appears
here too, depending on whether the process is happening at constant volume (the weight
suspended by a rubber cord) or with a volume change at constant pressure (the gas
spring). The free energy is exactly the same quantity at both scales; only the origin
of the potential and the scale of the system are different. In chemical reactions for
1 The main cause of the space shuttle Challenger disaster was decreased rubber elasticity in the
solid rocket booster O-rings due to the unusually low temperatures before launch.
4.5 The Hydrophobic Force
The final entropic force considered here is the hydrophobic force. At the outset it
should be said that this is a complex and not yet fully understood force. In part this is
because it depends on the properties of liquid water, which is still a somewhat mys-
terious solvent, and in part because it involves interactions between solutes in water,
which by definition takes us into the realm of non-ideal solutions. We do know that
the force exists and that it is entropic at room temperature, although there are likely
some non-entropic contributions. Also, while the energies involved are quite well
known, the range and magnitude of the forces are not. For all these reasons it is
often called the hydrophobic effect rather than a force. The reader is referred to two
monographs on the subject for more details (Tanford 1980; Ben-Naim 1987).
It is well known that apolar compounds (oils, and organic solvents such as alkanes and
benzene) do not mix with water. Shake oil and water together and
they rapidly separate into two layers. Apolar solutes in general dissolve poorly in
water and if they exceed their solubility limit they will aggregate and come out
of solution. The cause of all this is the hydrophobic effect. Consider the case of
oil and water. After shaking vigorously, a uniform mixture should be the maximum
entropy situation according to the argument given in Chap. 2, but instead the two liquids
spontaneously de-mix, with an apparent decrease in entropy (Fig. 4.7). This decrease
in entropy is only apparent. Thermodynamic measurements show that the water
immediately surrounding an apolar solute is in a lower entropy state than bulk
water. The entropy of this hydrating water is lowered sufficiently that the mixed
state is lower in total entropy than the de-mixed state, hence de-mixing occurs. Put
another way, demixing of the oil from water releases this hydrating water into the
bulk aqueous phase, with a gain in entropy.
It is believed that the hydrating water, because it cannot hydrogen bond with an
apolar solute, is forced to form clathrate-like structures around the solute. Clathrates
are characterized by pentagonal rings of hydrogen bonded water (Fig. 4.8, Tanford
1980). The clathrate is more structured than bulk water. The positions and orien-
tations of clathrate waters are more restricted. This reduces their spatial entropy.
Following this line of argument, taking two separated apolar solute molecules and
bringing them close enough to exclude water from their interface should result in an
increase in total entropy, since the amount of water in contact with apolar surface is
decreased. In going thus to a state of higher entropy, somewhere along this pathway
there must be a positive gradient in entropy, and hence an attractive hydrophobic
force. This force presumably develops once the solute surfaces are close enough to
exclude water, a separation of the order of the diameter of a water molecule, ≈0.36 nm.
This model predicts that the hydrophobic effect is proportional to the change in
apolar surface area exposed to water. This area dependence is borne out quite well
by the data (Tanford 1980), which gives the change in water entropy per unit area of
about −5.8 × 10−23 J/K/nm2 , which from Eq. 2.2 corresponds to about a 66-fold
decrease in water spatial complexions per nm2 . This gives a hydrophobic strength,
defined as the free energy change per unit area exposed to water, of 104 J/mole/Å2 .
The magnitude of the hydrophobic effect is sufficient for it to play a major role in
biology: helping to determine the structure and stability of proteins and promoting
self-assembly of biological membranes, for example.
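The two numbers just quoted can be checked against each other with a few lines of Python (a sketch, not from the original text; the temperature, taken as 298 K, is an assumption):

    import math

    kb = 1.380649e-23    # Boltzmann's constant, J/K
    Na = 6.022e23        # Avogadro's number, 1/mol
    T = 298.0            # room temperature, K -- assumed
    dS = -5.8e-23        # water entropy change per nm^2 of apolar area, J/K (Tanford 1980)

    fold = math.exp(-dS / kb)    # decrease in complexions, from dS = kb ln(W'/W)
    dG_area = -T * dS / 100.0    # free energy per A^2 (1 nm^2 = 100 A^2), J
    print(f"complexion decrease: {fold:.0f}-fold per nm^2")        # ~66-67 fold
    print(f"hydrophobic strength: {Na * dG_area:.0f} J/mole/A^2")  # ~104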
While the structure of water around apolar molecules, especially large ones,
may not be as regular as the clathrate model suggests, the general concept of more
ordered, lower entropy water hydrating apolar molecules is widely accepted as the
origin of the hydrophobic force.
Summary
The maximum entropy state specifies what will be observed at equilibrium,
which is basically a state with no net forces acting at the macroscopic level.
Entropic effects can give rise to forces. These forces appear when a system is
not at equilibrium, or when part of a system is constrained from adopting
its maximum entropy state, for example a compressed gas or a stretched
rubber band. These forces are determined by the gradient in the spatial entropy
contribution with respect to the pertinent displacement, e.g. a piston or the end
of a polymer. Entropic forces are directed towards the maximum entropy state
and close to equilibrium they are typically linear in the displacement. Entropic
forces are distinguished from forces arising from interaction potentials by
their linear temperature dependence. By introducing the entropic potential
−T S, forces from either entropic effects or interaction potentials can be
obtained in the same way as the negative gradient of a potential with respect
to the relevant displacement. When entropic forces and interaction potential
forces occur together, equilibrium may be described equivalently as
1. The state of maximum kinetic+spatial entropy
2. The point of balance between all forces, from entropy and interaction
potentials
3. The minimum in free energy, where the free energy is a sum of a spatial
entropy term multiplied by −T and a potential energy or enthalpy term.
Chapter 5
Summary
Abstract Starting from the fact that atoms have both position and velocity, and
using basic concepts of mechanics—force, kinetic energy, potential energy and con-
servation of energy—we have used complexion counting to obtain all the equations
required to understand both entropy and elementary statistical mechanics, including
free energy. These are summarized here along with their physical meaning.
One mathematical factor appears over and over again in the discussion of entropy
and statistical mechanics. It is the Boltzmann factor, which has the form
exp(X/kb T ) (5.1)
where X is some quantity with the units of energy, scaled by the mean kinetic energy
(or temperature). It first appeared in Eq. 1.15 describing the increase in number of
velocity complexions upon adding heat. It next appeared in Eq. 1.18 describing ther-
mal equilibrium between two objects in contact. It appears as Eq. 1.25 describing
the probability of a particular spatial complexion. Most importantly, it appears as
the weighting term for each spatial complexion in the expression for the partition
function, Eq. 1.26. It appears as the factor describing the pressure distribution in the
atmosphere (the Barometric equation 2.9). It appears in the Maxwell-Boltzmann
distribution of atomic velocities, Eq. 2.14. Finally, it appears in the expression for
the equilibrium constant, Eq. 3.19.
The first important equation is the sum of the Boltzmann factors over all the spatial
complexions, also known as the partition function, Eq. 1.26:
Z = Σ_{i=1}^{Ws} exp(−Ui /kb T)
This quantity is central to statistical mechanics. This is not surprising since, as Eq. 1.31
tells us, it is to within a multiplicative constant just the total number of complexions
W = CZ
The Boltzmann factor of a spatial complexion, normalized by the partition function,
pi = exp(−Ui /kb T) / Z
gives us the probability of each spatial complexion. From this it is possible to
calculate many average, or macroscopic, properties by forming their probability
weighted sum over the spatial complexions.
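A toy example (with assumed energies, not from the text) shows how these pieces fit together: three spatial complexions with potential energies 0, kbT and 2kbT.

    import math

    kb = 1.380649e-23          # Boltzmann's constant, J/K
    T = 300.0                  # temperature, K
    kT = kb * T                # thermal energy, ~4.14e-21 J

    U = [0.0, 1.0 * kT, 2.0 * kT]             # assumed complexion energies
    boltz = [math.exp(-u / kT) for u in U]    # Boltzmann factor of each complexion
    Z = sum(boltz)                            # partition function, Eq. 1.26
    p = [b / Z for b in boltz]                # complexion probabilities

    U_mean = sum(pi * ui for pi, ui in zip(p, U))   # mean potential energy, Eq. 1.28
    print("probabilities:", [round(pi, 3) for pi in p])   # [0.665, 0.245, 0.09]
    print(f"mean U = {U_mean / kT:.3f} kT")                # 0.425 kT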
The entropy is, to within an additive constant, the logarithm of the total number
of complexions. The entropy change given by Eq. 2.4 is thus the difference in the
logarithm of the total number of complexions
ΔS = kb (ln W′ − ln W)
For the special case of an ideal gas, the total number of complexions is just the
product of the number of spatial complexions and kinetic energy complexions
(Eq. 1.21). The former is proportional to the volume raised to the power of the
number of atoms, V^N, while the latter is proportional to the total kinetic energy
raised to the power of 3N/2, namely E_K^(3N/2). This separability leads to a simple additive
formula for the ideal gas entropy, Eq. 2.7
Sid = N kb ln V + (3/2) N kb ln T + C
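For an isothermal change the temperature term cancels, so Sid predicts ΔS = Nkb ln(V2/V1). A quick check (a sketch assuming standard conditions) reproduces the gas-expansion example quoted in the summary below:

    import math

    R = 8.314                   # gas constant, J/K/mol
    P, T = 101325.0, 273.0      # assumed standard pressure (Pa) and temperature (K)
    V1 = 10e-6                  # 10 cc, in m^3

    n = P * V1 / (R * T)              # moles of gas in 10 cc
    dS = n * R * math.log(2.0)        # entropy change on doubling the volume
    print(f"dS = {1e3 * dS:.1f} mJ/K")    # ~2.6 mJ/K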
The partition function can be used to obtain one of the most important average
quantities, the mean potential energy, Eq. 1.28
Ū = (1/Z) Σ_{i=1}^{Ws} Ui exp(−Ui /kb T)
The logarithm of the partition function (difference) yields a free energy (difference).
For constant pressure conditions, this is the Gibbs free energy, Eq. 3.14:
ΔG = −kb T ln(Z′/Z)
The treatment here uses classical mechanics only. Yet the real world obeys the rules
of quantum mechanics. How does this change things? The answer is very little. First,
Boltzmann’s postulate still applies: All possible complexions (read quantum states)
with a given number of atoms and total energy are equally likely (Tolman 1938). The
definition of entropy as a logarithmic measure of the number of complexions still
applies. Only the counting of complexions is affected by quantization. However,
all the applications here involve large numbers of atoms at normal temperatures.
Under these conditions, the number of quantum states is so vast and their energy
levels so close together that the classical treatment of counting given here is more
than adequate. The Boltzmann distribution is closely followed and the macroscopic,
thermodynamic behavior is accurately described. Generally speaking, a quantum
mechanical treatment is only necessary when studying a very small number of
atoms, very low temperatures or the interaction of radiation with atoms.
5.4 Modest Entropies but Very Very Large Numbers
The machinery for complexion counting was applied to several everyday events.
We now give both the change in number of complexions and the change in entropy,
using Eq. 2.4
1. Heat flows from hot to cold. Temperature equilibration of two 0.1 kg blocks of
copper differing in temperature by 1 millionth of a degree:
W′/W ≈ 10^3,400,000    ΔS ≈ 1 × 10⁻¹⁶ J/K
2. A moving object comes to rest due to friction. For a 0.1 kg ball falling 0.1 m:
W′/W ≈ 10^(10^19)    ΔS ≈ 0.33 mJ/K
3. A gas expands into the available volume. For 10 cc of gas expanding to twice its
volume:
W′/W ≈ 10^(10^20)    ΔS ≈ 2.6 mJ/K
W′/W ≈ 10^(1.4×10^19)    ΔS ≈ 0.43 mJ/K
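Inverting Eq. 2.4 shows where these huge numbers come from; a short sketch converts each entropy change back into a complexion ratio via log10(W′/W) = ΔS/(kb ln 10):

    import math

    kb = 1.380649e-23   # Boltzmann's constant, J/K
    events = [("heat flow", 1e-16),       # dS values in J/K, from the list above
              ("friction", 0.33e-3),
              ("gas expansion", 2.6e-3)]
    for name, dS in events:
        log10_ratio = dS / (kb * math.log(10.0))   # exponent of W'/W
        print(f"{name:14s} W'/W = 10^{log10_ratio:.3g}")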
5.5 Different Interpretations of Entropy
From its inception, entropy has been given a surprising variety of physical inter-
pretations. The most common are discussed here in the context of the method of
complexion counting described in this book.
Most of the pioneers in the development of statistical mechanics have at some time
equated an increase in entropy with an increase in disorder. A completely non-
technical but remarkably accurate description of entropy as disorder is provided
by the anthropologist Gregory Bateson in a charming fictional dialogue with his
daughter (Bateson 1972). Nevertheless, some care is required with this definition
to avoid apparent contradictions. Take for example supercooled liquid water. Being
a liquid, its molecular structure is quite disordered. Thermodynamic equilibrium is
reached when the water freezes. Now the molecules are more ordered spatially as ice
crystals. Yet this spontaneous process must have an overall increase in entropy. The
contradiction is only apparent. When water freezes it releases the latent heat
of fusion, which causes an increase in velocity complexions. Because the spatial
component is literally the only part of the entropy change that is visible, it is easy
to assume that it is the only part. This illustrates the limitation, alluded to in the
introduction, of popular explanations of entropy that discuss only the spatial contri-
bution, not the kinetic part. The latter, invisible, contribution must be included using
Boltzmann’s equation 1.15. The overall amount of disorder, spatial plus kinetic,
does increase and the entropy change provides a quantitative measure of this.
A quite different interpretation comes from information theory, where Shannon (1948) introduced a measure of the information content of a message.
The measure was derived using only general requirements such as consistency and
additivity, yet remarkably it has the same form as the Gibbs-Einstein expression for
spatial entropy, Eq. 3.10. For this reason Shannon called his measure information
entropy. Jaynes (1957) saw this as more than a mathematical coincidence and
asked how an information theory perspective could add to our understanding of
thermodynamic entropy: Given knowledge of the macroscopic state of the system,
what information does that provide us about the states of the system at the atomic or
molecular level? The answer is that the thermodynamic entropy is a measure of this
information. The extreme examples illustrate the point. If the entropy were zero,
then knowledge of the macrostate would give us complete knowledge as there is
just one corresponding complexion or microstate. Conversely, the maximum entropy
state corresponds to the minimum amount of information: The number of alternative
complexions the system might be in consistent with the observed state is maximal.
A larger entropy means less information since we become less certain of the detailed
state of the system.
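The extremes are easy to reproduce with Shannon's measure itself; a small sketch (with assumed distributions) compares a macrostate that pins down the complexion with one that leaves four equally likely alternatives:

    import math

    def shannon(p):
        """Shannon's measure -sum p ln p over a probability distribution."""
        return -sum(pi * math.log(pi) for pi in p if pi > 0)

    certain = [1.0, 0.0, 0.0, 0.0]   # one possible complexion: complete knowledge
    uniform = [0.25] * 4             # four equally likely complexions
    print(f"certain: S = {shannon(certain):.3f}")   # 0.000 -- zero entropy
    print(f"uniform: S = {shannon(uniform):.3f}")   # 1.386 = ln 4 -- maximal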
First, equilibrium behavior can with almost no error be predicted from the
properties of the macrostate with the most complexions. Even if the system never
samples that particular set of complexions it will with almost certainty sample only
those indistinguishable from it. Here there is no requirement for ergodicity. Because
of the vast numbers of complexions involved, the statistical limit to the accuracy of
predictions about observable states and thermodynamic equilibria is better than that
of most measuring instruments. The accuracy is in fact as good as, or better than,
predictions of physical behaviors governed by non-statistical laws such as those of
electromagnetism or classical dynamics.
Second, we can predict with almost certainty that a non-equilibrium system will
change in such a way as to increase its entropy. In fact, provided the system is
not too far from equilibrium the gradient of the entropy provides the forces, and
hence the pathway of change. The famous tension between the time-reversibility of
fundamental physical laws and the irreversibility of the Second Law was resolved, in
Boltzmann’s time, by recognizing the crucial role of initial conditions (Boltzmann
1895). Tracing the origins of lower entropy back further and further in time,
the source of the time asymmetry that sets the direction for the Second Law is
cosmological (Davies 1977). For reasons as yet unclear, the universe started
off in a very low entropy state because of the uniformity of the gravitational field,
and it has been running uphill in entropy ever since.
The theme of this book is that entropy is simply a logarithmic measure of the
number of ways the positions and velocities of the atoms can be arranged consistent
with a given macroscopic, observable state. The corollary is: what we see is
what can happen the most ways, what has the largest entropy. Even a very tiny
entropy increase corresponds to a truly huge increase in the number of complexions,
ineluctably orienting the arrow of time we experience every day.
Appendix
Aside from standard algebra, the only other mathematics used in this book involves
the exponential and logarithm functions. Their basic properties are summarized
here.
The exponential function, base-10, is written as y = 10^x, where x is the
exponent. This is familiar from the common notation used for large decimal numbers.
For example
100 = 10², 1000 = 10³, 1,000,000 = 10⁶
although the exponent need not be an integer. If two exponentials of the same base
are multiplied/divided, their exponents are added/subtracted:
10^c × 10^d = 10^(c+d)    10^c / 10^d = 10^(c−d)
Repeated application of the addition rule for the same number c gives the power rule
(10^c)^n = 10^(nc)
The logarithm, base-10, is the inverse of the exponential: if y = 10^x then x = log10(y).
Exponents and logarithms in the natural base follow all the same rules as for base-
10.
e^c × e^d = e^(c+d)    (A.9)
y = e^x ⇒ x = ln(y), or y = e^(ln y)    (A.10)
ln(c) + ln(d) = ln(c × d)    (A.11)
ln(c^n) = n ln(c)    (A.12)
A useful approximation for the natural logarithm when the argument is close to 1 is:
ln(1 + x) ≈ x    (x ≪ 1)    (A.13)
For example, when x = 0.01, ln(1 + x) = 0.00995 with less than half a percent
error. For the much smaller values of x encountered here, the approximation is
effectively exact.
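The quality of the approximation is easy to verify numerically:

    import math

    for x in (0.1, 0.01, 0.001):
        exact = math.log(1.0 + x)
        rel_err = abs(exact - x) / exact          # relative error of ln(1+x) ~ x
        print(f"x = {x:5g}: ln(1+x) = {exact:.6f}, relative error = {100 * rel_err:.3f}%")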
A.2 Mechanical Origins of the Ideal Gas Law
To round out the mechanical presentation of statistical mechanics, the ideal gas
law, also known as Boyle’s law, is derived using basic mechanical concepts. This
also provides the relationship between the mean kinetic energy of the atoms
and temperature required in Sect. 1.2. Consider a rectangular box of volume V
containing N molecules of an ideal gas. The molecules of the gas will continually
be colliding with the walls of the box and rebounding. These collisions produce a
force on the walls given by the rate of change of momentum of the gas molecules.
The pressure exerted by the gas is just the average force per unit area. Consider the
molecules colliding with the right hand wall of the box (Fig. A.1). The area of the
wall is A. If in a short time dt a number of molecules b with mass m and velocity
u in the x direction collide with the wall, then the change in momentum is 2mub.
There’s a factor of two because the collision reverses the velocity in the x direction.
The pressure is the rate of change of momentum per unit area
P = 2mub/(A dt)    (A.14)
The number of molecules b that collide with the wall in time dt is equal to the
number of molecules within a distance dx = udt of the wall that are moving
towards the wall, namely with a positive velocity in the x-direction. This is half the
total number of molecules in a volume Adx, since on average half the molecules
will be moving to the right, and half to the left. The gas density is N/V , so
b = N A dx/(2V). Substituting for dx, the number of collisions is
b = N A u dt/(2V)    (A.15)
Substituting for b in Eq. A.14, the pressure is
P = N m u²/V    (A.16)
Finally, since there will be a range of velocities, u² in Eq. A.16 must be replaced by
the mean squared velocity, û². This is obtained from the expression for the mean
kinetic energy of the molecules, Eq. 1.4:
ÊK = (m/2N) Σ_{i=1}^{N} (ui² + vi² + wi²) = (m/2)(û² + v̂² + ŵ²)    (A.17)
Now the mean squared velocities in the x, y and z directions will be the same since
the gas is homogeneous and at equilibrium. So
û² = v̂² = ŵ² = 2ÊK/(3m)    (A.18)
Using this expression for û² in Eq. A.16 we obtain the ideal gas law
P V = (2/3) N ÊK    (A.19)
in terms of the mean kinetic energy. P, V and T had previously been measured for
various gases, leading to the empirical Boyle’s Law
P V = nRT = N kb T    (A.20)
where n is the number of moles of gas and R = 8.314 J/K/mole is the gas constant.
The second form of Boyle’s Law uses the fact that the number of molecules in n
moles is N = Na n, where Na = 6.02 × 10²³ is Avogadro’s number, and Boltzmann’s
constant is kb = R/Na. So
the scale factor relating temperature, measured on the ideal gas or Kelvin scale, and
mean kinetic energy per molecule is
(2/3) ÊK = kb T    (A.21)
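As a numerical illustration (the choice of gas and its molecular mass are assumed, not from the text), Eq. A.21 fixes the typical molecular speed at room temperature:

    import math

    kb = 1.380649e-23         # Boltzmann's constant, J/K
    T = 300.0                 # temperature, K
    m = 28 * 1.6605e-27       # mass of an N2 molecule, kg -- assumed example gas

    EK = 1.5 * kb * T                  # mean kinetic energy per molecule, from Eq. A.21
    v_rms = math.sqrt(2.0 * EK / m)    # since EK = (1/2) m <v^2>
    print(f"EK = {EK:.3e} J, v_rms = {v_rms:.0f} m/s")   # ~517 m/s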
References
Bateson G (1972) Metalogue: why do things get in a Muddle? In: Steps to an ecology of mind. Ballantine Books, New York
Ben-Naim A (1987) Solvation thermodynamics. Plenum Press, New York
Biot MA (1955) Variational principles in irreversible thermodynamics with application to viscoelasticity. Phys Rev 97:1463–1469
Boltzmann L (1877) Über die Beziehung zwischen dem zweiten Hauptsatze der mechanischen Wärmetheorie und der Wahrscheinlichkeitsrechnung respektive den Sätzen über das Wärmegleichgewicht. Wiener Berichte 76:373–435
Boltzmann L (1895) On certain questions of the theory of gases. Nature 51:413–415
Bridgman PW (1972) The nature of thermodynamics. Harper and Brothers, New York
Craig NC (2005) Let’s drive “driving force” out of chemistry. J Chem Educ 82:827
Davies PCW (1977) The physics of time asymmetry. Surrey University Press, London
Feynman RP (1972) Statistical mechanics; a set of lectures. W.A. Benjamin, Reading
Jaynes ET (1957) Information theory and statistical mechanics. Phys Rev 106:620–630
Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27:379–423
Sharp KA, Matschinsky F (2015) Translation of Ludwig Boltzmann’s paper “On the Relationship
between the Second Fundamental Theorem of the Mechanical Theory of Heat and Probability
Calculations Regarding the Conditions for Thermal Equilibrium” (Wien. Ber. 1877, 76:373–
435). Entropy 17:1971–2009
Tanford C (1961) Physical chemistry of macromolecules. Wiley, New York
Tanford C (1980) The hydrophobic effect, 2nd edn. Wiley, New York
Tolman RC (1938) The principles of statistical mechanics. Clarendon Press, Oxford