Statistical Physics
G. Falkovich
http://www.weizmann.ac.il/home/fnfal/statphys.pdf
Contents

1 Basic principles  3
  1.1 Distribution in the phase space  3
  1.2 Microcanonical distribution  4
  1.3 Canonical distribution  6
  1.4 Two simple examples  8
    1.4.1 Two-level system  8
    1.4.2 Harmonic oscillators  11
  1.5 Grand canonical ensemble  13
  1.6 Entropy  15
  1.7 Information theory approach  18
2 Gases  23
  2.1 Ideal Gases  23
    2.1.1 Boltzmann (classical) gas  23
  2.2 Fermi and Bose gases  29
    2.2.1 Degenerate Fermi Gas  31
    2.2.2 Photons  33
    2.2.3 Phonons  34
    2.2.4 Bose gas of particles and Bose-Einstein condensation  36
  2.3 Chemical reactions  39
3 Non-ideal gases  40
  3.1 Coulomb interaction and screening  40
  3.2 Cluster and virial expansions  45
  3.3 Van der Waals equation of state  48
4 Phase transitions  51
  4.1 Thermodynamic approach  51
    4.1.1 Necessity of the thermodynamic limit  51
    4.1.2 First-order phase transitions  53
    4.1.3 Second-order phase transitions  55
    4.1.4 Landau theory  56
  4.2 Ising model  58
    4.2.1 Ferromagnetism  59
    4.2.2 Impossibility of phase coexistence in one dimension  64
    4.2.3 Equivalent models  64
5 Fluctuations  69
  5.1 Thermodynamic fluctuations  69
  5.2 Spatial correlation of fluctuations  71
  5.3 Universality classes and renormalization group  76
  5.4 Response and fluctuations  80
  5.5 Temporal correlation of fluctuations  82
  5.6 Brownian motion  86
6 Kinetics  88
  6.1 Boltzmann equation  88
  6.2 H-theorem  89
  6.3 Conservation laws  90
  6.4 Transport and dissipation  92
1 Basic principles.
Here we introduce the microscopic statistical description in the phase space and
describe two principal ways (microcanonical and canonical) to derive
thermodynamics from statistical mechanics.
The main idea is that ρ(p, q) for a subsystem does not depend on the initial
states of this and other subsystems, so it can be found without actually solving
the equations of motion. We define statistical equilibrium as a state where
macroscopic quantities are equal to their mean values. Assuming short-range
forces, we conclude that different macroscopic subsystems interact weakly and
are statistically independent, so that the distribution for a composite system
ρ12 is factorized: ρ12 = ρ1 ρ2 .
Now, we take the ensemble of identical systems starting from different
points in phase space. In a flow with the velocity v = (ṗ, q̇) the density
changes according to the continuity equation: ∂ρ/∂t + div (ρv) = 0. If the
motion is considered for not very large time it is conservative and can be
described by the Hamiltonian dynamics: q̇i = ∂H/∂pi and ṗi = −∂H/∂qi .
Hamiltonian flow in the phase space is incompressible: div v = ∂ q̇i /∂qi +
∂ ṗi /∂pi = 0. That gives the Liouville theorem: dρ/dt = ∂ρ/∂t + (v · ∇)ρ = 0
that is, the statistical distribution is conserved along the phase trajectories
of any subsystem. As a result, an equilibrium ρ must be expressed solely via
the integrals of motion. Since ln ρ is an additive quantity, it must be
expressed linearly via the additive integrals of motion, which for a general
mechanical system are the energy E(p, q), the momentum P(p, q) and the angular
momentum M(p, q):

ln ρa = αa + βEa(p, q) + c · Pa(p, q) + d · Ma(p, q) . (2)
Here αa is the normalization constant for a given subsystem while the seven
constants β, c, d are the same for all subsystems (to ensure additivity) and
are determined by the values of the seven integrals of motion for the whole
system. We thus conclude that the additive integrals of motion are all we
need to get the statistical distribution of a closed system (and any
subsystem); those integrals replace all the enormous microscopic information.
Considering a system which neither moves nor rotates, we are down to the
single integral, the energy. For any subsystem (or any system in contact with
a thermostat) we get Gibbs' canonical distribution.
Usually one considers the energy fixed with the accuracy ∆, so that the
microcanonical distribution is

ρ = 1/Γ for E ∈ (E0, E0 + ∆), and ρ = 0 otherwise. (5)
For example, for N noninteracting particles (ideal gas) the states with the
energy E = Σi pi²/2m lie in the p-space near the hyper-sphere with the
radius √(2mE). Recall that the surface area of the hyper-sphere with the
radius R in 3N-dimensional space is 2π^{3N/2} R^{3N−1}/(3N/2 − 1)! and we have
One can link statistical physics with thermodynamics using either canon-
ical or microcanonical distribution. We start from the latter and introduce
the entropy as
S(E, V, N ) = ln Γ(E, V, N ) . (8)
This is one of the most important formulas in physics¹ (on a par with F =
ma, E = mc² and E = h̄ω).
Noninteracting subsystems are statistically independent so that the sta-
tistical weight of the composite system is a product and entropy is a sum.
For interacting subsystems, this is true only for short-range forces in the
thermodynamic limit N → ∞. Consider two subsystems, 1 and 2, that
can exchange energy. Assume that the indeterminacy in the energy of any
subsystem, ∆, is much less than the total energy E. Then

Γ(E) = Σ_{i=1}^{E/∆} Γ1(Ei) Γ2(E − Ei) . (9)

We denote by Ē1, Ē2 = E − Ē1 the values that correspond to the maximal
term in the sum (9); the extremum condition is evidently (∂S1/∂E1)Ē1 =
(∂S2/∂E2)Ē2. It is obvious that Γ1(Ē1)Γ2(Ē2) ≤ Γ(E) ≤ Γ1(Ē1)Γ2(Ē2)E/∆. If
the system consists of N particles and N1, N2 → ∞ then S(E) = S1(Ē1) +
S2(Ē2) + O(log N), where the last term is negligible.
Identification with the thermodynamic entropy can be done considering
any system, for instance, an ideal gas. The problem is that the logarithm
of (7) contains the non-extensive term N ln V. The resolution of this
controversy is that quantum particles (atoms and molecules) are indistinguishable,
so one needs to divide Γ (7) by the number of permutations N!,
which makes the resulting entropy of the ideal gas extensive: S(E, V, N) =
(3N/2) ln(E/N) + N ln(V/N) + const². Defining temperature in the usual way,
T⁻¹ = ∂S/∂E = 3N/2E, we get the correct expression E = 3NT/2.
We express here temperature in the energy units. To pass to Kelvin degrees,
one transforms T → kT and S → kS, where the Boltzmann constant
k = 1.38 · 10⁻²³ J/K. The value of the classical entropy (8) depends on the units.
Proper quantitative definition comes from quantum physics with Γ being the
number of microstates that correspond to a given value of macroscopic pa-
¹ It is inscribed on Boltzmann's gravestone.
² One can only wonder at the genius of Gibbs who introduced N! long before quantum
mechanics. See L&L Sect. 40 or Pathria Sects. 1.5 and 6.1.
rameters. In the quasi-classical limit the number of states is obtained by
dividing the phase space into units with ∆p∆q = 2πh̄.
The same definition (entropy as a logarithm of the number of states)
is true for any system with a discrete set of states. For example, con-
sider the set of N two-level systems with levels 0 and ϵ. If energy of the
set is E then there are L = E/ϵ upper levels occupied. The statistical
weight is determined by the number of ways one can choose L out of N :
Γ(N, L) = C_N^L = N!/[L!(N − L)!]. We can now define the entropy (i.e. find the
fundamental relation): S(E, N) = ln Γ. Considering N ≫ 1 and L ≫ 1
we can use the Stirling formula in the form d ln L!/dL = ln L and derive
the equation of state (temperature-energy relation) T⁻¹ = ∂S/∂E =
ϵ⁻¹(∂/∂L) ln[N!/L!(N − L)!] = ϵ⁻¹ ln[(N − L)/L], and the specific heat C =
dE/dT = N(ϵ/2T)² cosh⁻²(ϵ/2T). Note that the ratio of the number of particles
on the upper level to those on the lower level is exp(−ϵ/T) (the Boltzmann
relation). The specific heat turns into zero both at low temperatures (too small
portions of energy are "in circulation") and at high temperatures (the occupation
numbers of the two levels are already close to equal).
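These relations are easy to check numerically. A minimal sketch (hypothetical parameters; log-gamma is used in place of the Stirling formula):

```python
import math

# Numerical check (not from the text) of the microcanonical results for
# N two-level systems: S = ln C(N, L), T^-1 = eps^-1 ln[(N - L)/L].
def entropy(N, L):
    """S = ln [N! / (L! (N-L)!)] via log-gamma, stable for large N."""
    return math.lgamma(N + 1) - math.lgamma(L + 1) - math.lgamma(N - L + 1)

N, eps = 10**6, 1.0
L = N // 4                       # energy E = L*eps: a quarter of the levels excited
# Temperature from the analytic relation T^-1 = eps^-1 ln[(N - L)/L]
T_analytic = eps / math.log((N - L) / L)
# Temperature from a finite difference of S(E): T^-1 = dS/dE ~ [S(L+1) - S(L)]/eps
T_numeric = eps / (entropy(N, L + 1) - entropy(N, L))
assert abs(T_analytic - T_numeric) < 1e-4
# Boltzmann relation: upper/lower occupation ratio equals exp(-eps/T)
assert abs(L / (N - L) - math.exp(-eps / T_analytic)) < 1e-9
```

For L > N/2 the logarithm becomes negative, reproducing the negative temperatures discussed below.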
The derivation of thermodynamic fundamental relation S(E, . . .) in the
microcanonical ensemble is thus via the number of states or phase volume.
Note that there is no trace of thermostat left except for the temperature.
The normalization factor Z(T, V, N ) is a sum over all states accessible to the
system and is called the partition function.
The probability to have a given energy is the probability of the state (10)
times the number of states:

W(E) ∝ Γ(E) exp(−E/T) . (12)
Here Γ(E) grows fast while exp(−E/T ) decays fast when the energy E grows.
As a result, W (E) is concentrated in a very narrow peak and the energy
fluctuations around Ē are very small (see Sect. 1.5 below for more details).
For example, for an ideal gas W (E) ∝ E 3N/2 exp(−E/T ). Let us stress again
that the Gibbs canonical distribution (10) tells that the probability of a given
microstate exponentially decays with the energy of the state while (12) tells
that the probability of a given energy has a peak.
An alternative and straightforward way to derive the canonical distri-
bution is to use consistently the Gibbs idea of the canonical ensemble as a
virtual set, of which the single member is the system under consideration
and the energy of the total set is fixed. The probability to have our system
in the state a is then given by the average number of systems n̄a in this state
divided by the total number of systems N . The set of occupation numbers
{na } = (n0 , n1 , n2 . . .) satisfies obvious conditions
Σa na = N ,  Σa Ea na = E = ϵN . (13)
Any given set is realized in W{na} = N!/(n0! n1! n2! . . .) number of ways and
the probability to realize the set is proportional to the respective W:

n̄a = Σ na W{na} / Σ W{na} , (14)
where the summation goes over all the sets that satisfy (13). We assume that
in the limit when N, na → ∞ the main contribution into (14) is given by
the most probable distribution, which is found by looking at the extremum
of ln W − α Σa na − β Σa Ea na. Using the Stirling formula ln n! = n ln n − n
we write ln W = N ln N − Σa na ln na, and the extremum n*a corresponds to
ln n*a = −α − 1 − βEa, which gives

n*a / N = exp(−βEa) / Σa exp(−βEa) . (15)

The parameter β is given implicitly by the relation

E/N = ϵ = Σa Ea exp(−βEa) / Σa exp(−βEa) . (16)
equivalent to F = E − T S in thermodynamics.
One can also come to this by defining entropy. Recall that for a closed
system we defined S = ln Γ, while the probability of a state is wa = 1/Γ. Since
ln wa is linear in E, then

S(Ē) = −ln wa(Ē) = −⟨ln wa⟩ = −Σa wa ln wa = Σa wa(Ea/T + ln Z) = Ē/T + ln Z . (18)
L ln[(N − L)/L]. The temperature in the microcanonical approach is as
follows:

T⁻¹ = ∂S/∂E = ϵ⁻¹ (∂/∂L) ln[N!/L!(N − L)!] = ϵ⁻¹ ln[(N − L)/L] . (19)
The entropy as a function of energy is drawn on the Figure: S(E) vanishes at
E = 0 and E = Nϵ, with T = +0 on the left slope, T = ±∞ at the maximum
E = Nϵ/2, and T = −0 on the right slope.
Indeed, entropy is zero at E = 0, N ϵ when all the particles are in the same
state. The entropy is symmetric about E = N ϵ/2. We see that when
E > N ϵ/2 then the population of the higher level is larger than of the
lower one (inverse population as in a laser) and the temperature is negative.
Negative temperature may happen only in systems with the upper limit of
energy levels and simply means that by adding energy beyond some level we
actually decrease the entropy, i.e. the number of accessible states. Available
(non-equilibrium) states lie below the S(E) plot; notice that the entropy
maximum corresponds to the energy minimum for positive temperatures and
to the energy maximum for the negative-temperature part. A glance at
the figure also shows that when a system with a negative temperature is
brought into contact with a thermostat (having positive temperature),
our system gives away energy (a laser generates and emits light), decreasing
the temperature further until it passes through infinity to positive values
and eventually reaches the temperature of the thermostat. That is, negative
temperatures are actually "hotter" than positive ones.
Let us stress that there is no volume in S(E, N): that is, we consider only a
subsystem or only part of the degrees of freedom. Indeed, real particles have
kinetic energy unbounded from above and can correspond only to positive
temperatures [a negative temperature and infinite energy would give an infinite
Gibbs factor exp(−E/T)]. Apart from the laser, an example of a two-level system
is a spin 1/2 in the magnetic field H. Because the interaction between the spins
and atom motions (spin-lattice relaxation) is weak, the spin system keeps its
separate temperature for a long time (tens of minutes) and can be considered
separately.
External fields are parameters (like volume and chemical potential) that
determine the energy levels of the system. They are sometimes called gen-
eralized thermodynamic coordinates, and the derivatives of the energy with
respect to them are called respective forces. Let us derive the generalized
force M that corresponds to the magnetic field and determines the work done
under the change of magnetic field: dE = T dS − M dH. Since the projection
of every magnetic moment on the direction of the field can take two values ±µ
then the magnetic energy of the particle is ∓µH and E = −µ(N+ − N− )H.
The force (the partial derivative of the energy with respect to the field at a
fixed entropy) is called magnetization or magnetic moment of the system:
M = −(∂E/∂H)_S = µ(N+ − N−) = Nµ [exp(µH/T) − exp(−µH/T)] / [exp(µH/T) + exp(−µH/T)] = Nµ tanh(µH/T) . (20)
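As a consistency check, the same M follows from differentiating the canonical free energy of N independent spins, F = −NT ln[2 cosh(µH/T)] (this form of F is assumed here by analogy with the two-level partition function below, not given in the text at this point); a sketch:

```python
import math

# Sketch: with the assumed free energy F = -N*T*ln[2*cosh(mu*H/T)] of N
# independent spins +-mu*H, the derivative M = -dF/dH should reproduce (20):
# M = N*mu*tanh(mu*H/T).
N, mu, T, H = 1000, 2.0, 1.5, 0.3

def F(h):
    return -N * T * math.log(2.0 * math.cosh(mu * h / T))

dH = 1e-6
M_numeric = -(F(H + dH) - F(H - dH)) / (2 * dH)   # central finite difference
M_formula = N * mu * math.tanh(mu * H / T)
assert abs(M_numeric - M_formula) < 1e-4
```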
To conclude let us treat the two-level system by the canonical approach
where we calculate the partition function and the free energy:
Z(T, N) = Σ_{L=0}^{N} C_N^L exp(−Lϵ/T) = [1 + exp(−ϵ/T)]^N , (22)

F(T, N) = −T ln Z = −NT ln[1 + exp(−ϵ/T)] . (23)
We can now re-derive the entropy as S = −∂F/∂T and derive the (mean)
energy and specific heat:
Ē = Z⁻¹ Σa Ea exp(−βEa) = −∂ ln Z/∂β = T² ∂ ln Z/∂T (24)
  = Nϵ / [1 + exp(ϵ/T)] , (25)

C = dE/dT = N exp(ϵ/T) (ϵ/T)² / [1 + exp(ϵ/T)]² . (26)
Note that (24) is a general formula which we shall use in the future. The
specific heat turns into zero both at low temperatures (too small portions of
energy are "in circulation") and at high temperatures (the occupation numbers
of the two levels are already close to equal).
(Figure: the specific heat per particle C/N of the two-level system versus T/ϵ,
vanishing at both small and large T with a maximum below 1/2 in between.)
We start from the quasi-classical limit, h̄ω ≪ T , when the single-oscillator
partition function is obtained by Gaussian integration:
Z1(T) = (2πh̄)⁻¹ ∫_{−∞}^{∞} dp ∫_{−∞}^{∞} dq exp(−H/T) = T/h̄ω . (28)
We can now get the partition function of N independent oscillators as Z(T, N) =
Z1(T)^N = (T/h̄ω)^N, the free energy F = NT ln(h̄ω/T) and the mean energy
from (24): E = NT. This is an example of equipartition (every oscillator
has two degrees of freedom, with T/2 energy for each)³. The thermodynamic
equations of state are µ(T) = T ln(h̄ω/T) and S = N[ln(T/h̄ω) + 1], while the
pressure is zero because there is no volume dependence. The specific heat
CP = CV = N.
Apart from thermodynamic quantities one can write the probability distribution
of the coordinate, which is given by the Gibbs distribution using the
potential energy:

dwq = (ω²/2πT)^{1/2} exp(−ω²q²/2T) dq . (29)

Using the kinetic energy and simply replacing q → p/ω one obtains a similar
formula, dwp = (2πT)^{−1/2} exp(−p²/2T) dp, which is the Maxwell distribution.
For a quantum case, the energy levels are given by En = h̄ω(n + 1/2).
The single-oscillator partition function is

Z1(T) = Σ_{n=0}^{∞} exp[−h̄ω(n + 1/2)/T] = [2 sinh(h̄ω/2T)]⁻¹ , (30)
where one sees the contribution of the zero-point quantum oscillations and the
breakdown of classical equipartition. The specific heat is as follows: CP = CV =
N(h̄ω/T)² exp(h̄ω/T)[exp(h̄ω/T) − 1]⁻². Comparing with (26) we see the
same behavior at T ≪ h̄ω: CV ∝ exp(−h̄ω/T), because "too small energy
portions are in circulation" and they cannot move the system to the next level.
³ If some variable x enters the energy as x²ⁿ then the mean energy associated with that
degree of freedom is ∫x²ⁿ exp(−x²ⁿ/T)dx / ∫exp(−x²ⁿ/T)dx = T/2n.
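The footnote's integral can be checked by direct quadrature (a sketch; the cutoff and step count are arbitrary numerical choices):

```python
import math

# Sketch: if the energy enters as x^(2n), the associated mean energy is T/2n
# (generalized equipartition; n = 1 recovers the usual T/2).
def mean_energy(n, T, xmax=20.0, steps=100000):
    # midpoint-rule quadrature of the two integrals over 0..xmax
    dx = xmax / steps
    num = den = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        w = math.exp(-x ** (2 * n) / T)
        num += x ** (2 * n) * w * dx
        den += w * dx
    return num / den

for n in (1, 2, 3):
    assert abs(mean_energy(n, T=1.0) - 1.0 / (2 * n)) < 1e-3
```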
At large T the specific heat of the two-level system turns into zero because the
occupation numbers of both levels are almost equal, while for the oscillator we
have classical equipartition (every oscillator has two degrees of freedom, so it
has T in energy and 1 in CV).
(Figure: the specific heat per particle C/N of the oscillator versus T/ε, rising
from zero to the classical value 1 at high temperature.)
The distribution of the coordinate at arbitrary temperature is
dwq = [(ω/πh̄) tanh(h̄ω/2T)]^{1/2} exp[−(q²ω/h̄) tanh(h̄ω/2T)] dq. At h̄ω ≪ T it
coincides with (29), while in the opposite (quantum) limit it gives
dwq = (ω/πh̄)^{1/2} exp(−q²ω/h̄) dq, which is the purely quantum formula |ψ0|² for
the ground state of the oscillator.
See also Pathria Sect. 3.7 for more details.
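The quantum-classical crossover in (30) and (28) can be verified directly; a sketch in units where h̄ω = 1:

```python
import math

# Sketch: check (30), Z1 = [2 sinh(hw/2T)]^-1, against the direct sum over
# levels, and the classical limit Z1 -> T/(hbar*omega) of (28).
# Units: hbar*omega = 1.
def Z1_sum(T, nmax=2000):
    return sum(math.exp(-(n + 0.5) / T) for n in range(nmax))

for T in (0.2, 1.0, 5.0):
    Z_closed = 1.0 / (2.0 * math.sinh(1.0 / (2.0 * T)))
    assert abs(Z1_sum(T) / Z_closed - 1.0) < 1e-9

# classical limit: at T >> hbar*omega, Z1 approaches T/(hbar*omega)
assert abs(Z1_sum(50.0) / 50.0 - 1.0) < 1e-3
```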
the first-order terms as in (10,11)
Here we used ∂S/∂E = 1/T , ∂S/∂N = −µ/T and introduced the grand
canonical potential, which can be expressed through the grand partition function

Ω(T, V, µ) = −T ln Σ_N exp(µN/T) Σ_a exp(−E_{aN}/T) . (34)
where F (T, V, N ) is the free energy calculated from the canonical distribu-
tion for N particles in volume V and temperature T . The mean value N̄
is determined by the extremum of probability: (∂F/∂N)_N̄ = µ. The second
derivative determines the width of the distribution over N, that is, the
variance:

⟨(N − N̄)²⟩ = T (∂²F/∂N²)⁻¹ = −T N v⁻² (∂P/∂v)⁻¹ ∝ N . (36)
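The scaling ⟨(N − N̄)²⟩ ∝ N can be illustrated by a toy simulation: M molecules placed independently in a vessel and counted in a small subvolume (the binomial/Poisson setup is an illustration, not the derivation above):

```python
import random
import statistics

# Sketch: M molecules are placed independently; a subvolume holds each with
# probability p = v/V. For p << 1 the count N in the subvolume is nearly
# Poisson, so Var(N) ~ N_bar, illustrating (N - N_bar)^2 ∝ N as in (36).
random.seed(42)
M, p, trials = 1000, 0.05, 2000
counts = [sum(1 for _ in range(M) if random.random() < p)
          for _ in range(trials)]
n_bar = statistics.fmean(counts)
var = statistics.variance(counts)
assert abs(n_bar - M * p) < 2.0          # mean close to M*p = 50
assert abs(var / n_bar - 1.0) < 0.1      # variance tracks the mean
```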
1.6 Entropy
By definition, the entropy of a closed system determines the number of available
states (or, classically, the phase volume). Assuming that the system spends
comparable time in different available states, we conclude that, since the
equilibrium must be the most probable state, it corresponds to the entropy
maximum. If the system happens to be out of equilibrium at a given moment of
time [say, the energy distribution between the subsystems is different from the
most probable Gibbs distribution (16)], then it is more probable to go towards
equilibrium, that is, towards increasing entropy. This is a microscopic
(probabilistic) interpretation of the second law of thermodynamics formulated
by Clausius in 1865. Note that the probability maximum is very sharp in the
thermodynamic limit, since exp(S) grows exponentially with the system size. That
means that for macroscopic systems the probability to pass into states
with lower entropy is so vanishingly small that such events are never observed.
Dynamics (classical and quantum) is time reversible. Entropy growth
is related not to the trajectory of a single point in phase space but to the
behavior of finite regions (i.e. sets of such points). Consideration of finite
regions is called coarse graining, and it is the main feature of the
stat-physical approach responsible for the irreversibility of statistical laws.
The dynamical background of entropy growth is the separation of trajec-
tories in phase space so that trajectories started from a small finite region
fill larger and larger regions of phase space as time proceeds. The relative
motion is determined by the velocity difference between neighboring points
δvi = rj ∂vi /∂xj . We can decompose the tensor of velocity derivatives into an
antisymmetric part (which describes rotation) and a symmetric part (which
describes deformation). The symmetric tensor, Sij = (∂vi /∂xj + ∂vj /∂xi )/2,
can be always transformed into a diagonal form by an orthogonal transfor-
mation (i.e. by the rotation of the axes). The diagonal components are the
rates of stretching in different directions. Solving the equation for the dis-
tance between two points ṙi = δvi = rj Sij one recognizes that distances grow
(or decay) exponentially in time. On the figure, one can see how the black
square of initial conditions (at the central box) is stretched in one (unstable)
direction and contracted in another (stable) direction so that it turns into a
long narrow strip (left and right boxes). Rectangles in the right box show
finite resolution (coarse-graining). Viewed with such resolution, our set of
points occupies larger phase volume (i.e. corresponds to larger entropy) at
t = ±T than at t = 0. Time reversibility of any particular trajectory in
the phase space does not contradict the time-irreversible filling of the phase
space by the set of trajectories considered with a finite resolution. By re-
versing time we exchange stable and unstable directions but the fact of space
filling persists.
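A toy illustration of this stretching and coarse-graining (the area-preserving cat map stands in for Hamiltonian dynamics here; all parameters are arbitrary):

```python
import random

# Sketch: an area-preserving hyperbolic map (Arnold's cat map) stretches a
# small phase-space blob; counted with finite resolution (coarse-graining),
# the blob occupies more and more cells even though areas are preserved.
random.seed(1)
pts = [(0.4 + 0.01 * random.random(), 0.4 + 0.01 * random.random())
       for _ in range(20000)]

def cat(point):
    x, y = point
    return ((x + y) % 1.0, (x + 2.0 * y) % 1.0)

def occupied_cells(points, res=50):
    # number of coarse-graining cells (of size 1/res) touched by the set
    return len({(int(x * res), int(y * res)) for x, y in points})

n0 = occupied_cells(pts)
for _ in range(6):
    pts = [cat(p) for p in pts]
n6 = occupied_cells(pts)
assert n0 == 1                  # initially the blob fits in a single cell
assert n6 > 100 * n0            # after stretching it covers many cells
```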
(Figure: three snapshots of the set of points in the phase plane, at t = −T,
t = 0 and t = T, with coarse-graining cells shown in the right box.)
enough then the entropy does not change, i.e. the process is adiabatic. Indeed,
the positivity of Ṡ = dS/dt requires that the expansion of Ṡ(λ̇) starts from
the second term,

dS/dt = (dS/dλ) · (dλ/dt) = A (dλ/dt)²  ⇒  dS/dλ = A dλ/dt . (37)

We see that when dλ/dt goes to zero, the entropy becomes independent of λ.
That means that we can change λ (say, the volume) by a finite amount while
making the entropy change as small as we wish by doing it slowly enough.
During the adiabatic process the system is assumed to be in thermal
equilibrium at any instant of time (as in quasi-static processes defined in
thermodynamics). Changing λ (called coordinate) one changes the energy
levels Ea and the total energy. Respective force (pressure when λ is volume,
magnetic or electric moments when λ is the respective field) is obtained as the
average (over the equilibrium statistical distribution) of the energy derivative
with respect to λ:

⟨∂H(p, q, λ)/∂λ⟩ = Σa wa ∂Ea/∂λ = (∂/∂λ) Σa wa Ea = (∂E(S, λ, . . .)/∂λ)_S . (38)

We see that the force is equal to the derivative of the thermodynamic energy
at constant entropy, because the probabilities wa = 1/Γ do not change.
Note that in an adiabatic process all wa are assumed to be constant, i.e.
the entropy of any subsystem is conserved. This is more restrictive than
the condition of reversibility, which requires only the total entropy to be
conserved. In other words, the process can be reversible but not adiabatic.
See Landau & Lifshitz (Section 11) for more details.
The last statement we make here about entropy is the third law of thermo-
dynamics (Nernst theorem) which claims that S → 0 as T → 0. A standard
argument is that since stability requires the positivity of the specific heat
cv, the energy must increase monotonically with the temperature, and
zero temperature corresponds to the ground state. If the ground state is
non-degenerate (unique) then S = 0. Since generally the degeneracy of the
ground state grows slower than exponentially with N, the entropy per
particle is zero in the thermodynamic limit. While this argument is correct
it is relevant only for temperatures less than the energy difference between
the first excited state and the ground state. As such, it has nothing to do
with the third law established generally for much higher temperatures and
related to the density of states as function of energy. We shall discuss it later
considering Debye theory of solids. See Huang (Section 9.4) for more details.
(Figure: a message of N symbols is composed by choosing, N times, one of the
letters A . . . Z; the example message reads L, O, V, E.)
In reality though it brings even less information (no matter how emotional
we can get) since we know that letters are used with different frequencies.
Indeed, consider the situation when there is a probability wi assigned to
each letter (or box) i = 1, . . . , n. Now if we want to evaluate the missing
information (or, the information that one symbol brings us on average), we
ought to think about repeating our choice N times. As N → ∞ we know
that the candy is in the i-th box in N wi cases, but we do not know the order
in which the different possibilities appear. The total number of orders is
N!/Πi(N wi)! and the missing information is

I_N = k ln[N!/Πi(N wi)!] ≈ −N k Σi wi ln wi + O(ln N) . (40)
The missing information per problem (or per symbol in the language) coincides
with the entropy (18):

I = lim_{N→∞} I_N/N = −k Σ_{i=1}^{n} wi ln wi . (41)
This is generally less than (39) and coincides with it only for wi = 1/n. Note
that when n → ∞ then (39) diverges while (41) may well be finite.
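A small numeric illustration of (41) versus the uniform case, with made-up frequencies:

```python
import math

# Sketch: the missing information (41) for a non-uniform distribution is less
# than ln(n), with equality only for w_i = 1/n. The "skewed" frequencies are
# made-up illustrative numbers, not real letter statistics.
def info(ws):
    return -sum(w * math.log(w) for w in ws if w > 0)

n = 4
uniform = [1.0 / n] * n
skewed = [0.7, 0.15, 0.1, 0.05]
assert abs(sum(skewed) - 1.0) < 1e-12          # a proper distribution
assert abs(info(uniform) - math.log(n)) < 1e-12
assert info(skewed) < math.log(n)              # non-uniform carries less info
```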
We can generalize this for a continuous distribution by dividing into cells
(that is considering a limit of discrete points). Here, different choices of
variables to define equal cells give different definitions of information. It is in
such a choice that physics enters. We use canonical coordinates in the phase
space and write the missing information in terms of the density which may
also depend on time:
I(t) = − ∫ ρ(p, q, t) ln[ρ(p, q, t)] dp dq . (42)
systematic way of guessing, making use of incomplete information. The main
problem is how to get the best guess for the probability distribution ρ(p, q, t)
based on any given information presented as ⟨Rj (p, q, t)⟩ = rj , i.e. as the
expectation (mean) values of some dynamical quantities. Our distribution
must contain the whole truth (i.e. all the given information) and nothing
but the truth; that is, it must maximize the missing information I. This is to
provide for the widest set of possibilities for future use, compatible with the
existing information. Looking for the maximum of

I − Σj λj⟨Rj(p, q, t)⟩ = −∫ ρ(p, q, t){ln[ρ(p, q, t)] + Σj λj Rj(p, q, t)} dpdq ,
We can explicitly solve this for k ≪ 1 ≪ kn when one can approximate the
sum by the integral so that Z(λ) ≈ nI0 (λ) where I0 is the modified Bessel
function. Equation I0′ (λ) = 0.3I0 (λ) has an approximate solution λ ≈ 0.63.
Note in passing that the set of equations (45) may be self-contradictory
or insufficient, so that the data do not allow one to define the distribution,
or allow it non-uniquely. If, however, the solution exists, then (42,44) define
the missing information I{ri}, which is analogous to the thermodynamic entropy
as a function of (measurable) macroscopic parameters. It is clear that I has
a tendency to increase whenever a constraint is removed (when we measure
fewer quantities Ri).
If we know the given information at some time t1 and want to make
guesses about some other time t2 then our information generally gets less
relevant as the distance |t1 − t2| increases. In the particular case of guessing
the distribution in the phase space, the mechanism of losing information
is due to the separation of trajectories described in Sect. 1.6. Indeed, if we
know that at t1 the system was in some region of the phase space, the set
of trajectories started at t1 from this region generally fills larger and larger
regions as |t1 − t2 | increases. Therefore, missing information (i.e. entropy)
increases with |t1 − t2 |. Note that it works both into the future and into
the past. The information approach allows one to see clearly that there is really
no contradiction between the reversibility of the equations of motion and the
growth of entropy. Also, the concept of entropy as missing information⁴
allows one to understand that entropy does not really decrease in the system
with Maxwell demon or any other information-processing device (indeed,
if at the beginning one has an information on position or velocity of any
molecule, then the entropy was less by this amount from the start; after using
and processing the information the entropy can only increase). Consider, for
instance, a particle in a box. If we know that it is in one half, then the
entropy (the logarithm of the number of available states) is ln(V/2). That also
teaches us that information has a thermodynamic (energetic) value: by placing a
piston at the middle of the box and allowing the particle to hit and move it,
we can get the work T∆S = T ln 2 done; on the other hand, to get such
information one must make a measurement whose minimum energetic cost is
T∆S = T ln 2 (Szilard 1929).
Yet there is one class of quantities where information does not age. They
are integrals of motion. A situation in which only integrals of motion are
known is called equilibrium. The distribution (44) takes the canonical form
(2,3) in equilibrium. From the information point of view, the statement that
systems approach equilibrium is equivalent to saying that all information is
⁴ That is, entropy is not a property of the system but of our knowledge about the system.
forgotten except the integrals of motion. If, however, we possess the infor-
mation about averages of quantities that are not integrals of motion and
those averages do not coincide with their equilibrium values then the distri-
bution (44) deviates from equilibrium. Examples are currents, velocity or
temperature gradients such as those considered in kinetics.
At the end, let us briefly mention communication theory, which studies
transmissions through imperfect channels. Here, the message (measurement)
A we receive gives the information about the event B as follows:
I(A, B) = ln P (B|A)/P (B), where P (B|A) is the so-called conditional prob-
ability (of B in the presence of A). Summing over all possible B1 , . . . , Bn and
A1 , . . . , Am we obtain Shannon’s “mutual information” used to evaluate the
quality of communication systems
I(A, B) = Σ_{i=1}^{m} Σ_{j=1}^{n} P(Ai, Bj) ln[P(Bj|Ai)/P(Bj)]
  → I(Z, Y) = ∫ dz dy p(z, y) ln[p(z|y)/p(z)] . (46)
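A sketch evaluating the discrete form of (46) for a tiny made-up joint distribution; mutual information vanishes exactly when A and B are independent:

```python
import math

# Sketch: Shannon's mutual information (46), discrete form, for a small
# made-up joint distribution P(A_i, B_j).
def mutual_info(P):
    m, n = len(P), len(P[0])
    pa = [sum(P[i][j] for j in range(n)) for i in range(m)]  # marginal P(A_i)
    pb = [sum(P[i][j] for i in range(m)) for j in range(n)]  # marginal P(B_j)
    I = 0.0
    for i in range(m):
        for j in range(n):
            if P[i][j] > 0:
                # P(B_j|A_i)/P(B_j) = P(A_i, B_j)/(P(A_i) P(B_j))
                I += P[i][j] * math.log(P[i][j] / (pa[i] * pb[j]))
    return I

independent = [[0.2 * 0.5, 0.2 * 0.5], [0.8 * 0.5, 0.8 * 0.5]]
correlated = [[0.4, 0.1], [0.1, 0.4]]
assert abs(mutual_info(independent)) < 1e-12
assert mutual_info(correlated) > 0.0
```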
2 Gases
From this Chapter on, we start applying the general theory given in the first
Chapter. Here we consider systems with the kinetic energy exceeding the
potential energy of inter-particle interactions: ⟨U(r1 − r2)⟩ ≪ ⟨mv²/2⟩.
To get a feeling for the orders of magnitude, one can make an estimate with
m = 1.6 · 10⁻²⁴ g (proton) and n = 10²¹ cm⁻³, which gives T ≫ 0.5 K. Another
way to interpret (48) is to say that the mean distance between molecules
n−1/3 must be much larger than the wavelength h/p. In this case, one can
pass from the distribution over the quantum states to the distribution in the
phase space:

n̄(p, q) = exp{[µ − ϵ(p, q)]/T} . (49)
In particular, the distribution over momenta is always quasi-classical for the
Boltzmann gas. Indeed, the distance between energy levels is determined by
the size of the box, ∆E ≃ h2 m−1 V −2/3 ≪ h2 m−1 (N/V )2/3 which is much less
than temperature according to (48). To put it simply, if the thermal quantum
wavelength h/p ≃ h(mT )−1/2 is less than the distance between particles it is
also less than the size of the box. We conclude that the Boltzmann gas has the
Maxwell distribution over momenta. If such is the case even in the external
field then n(q, p) = exp{[µ − ϵ(p, q)]/T } = exp{[µ − U (q) − p2 /2m]/T }. That
gives, in particular, the particle density in space n(r) = n0 exp[−U (r)/T ]
where n0 is the concentration without field. In the uniform gravity field we
get the barometric formula n(z) = n(0) exp(−mgz/T ).
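A quick numeric illustration of the barometric formula, with rough, assumed values for nitrogen in the terrestrial atmosphere:

```python
import math

# Sketch: barometric formula n(z) = n(0) exp(-m*g*z/T), temperature in energy
# units T = k*T_K. Rough assumed numbers: nitrogen molecule m = 4.65e-26 kg,
# T_K = 273 K, g = 9.8 m/s^2.
k = 1.38e-23                      # Boltzmann constant, J/K
m, g = 4.65e-26, 9.8
T = 273 * k                       # temperature in energy units, J
H = T / (m * g)                   # scale height: density falls by 1/e
assert 7000 < H < 9000            # about 8 km

ratio = math.exp(-m * g * 5500 / T)   # density ratio at z = 5.5 km
assert 0.45 < ratio < 0.55            # roughly half the ground density
```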
Partition function of the Boltzmann gas can be obtained from the
partition function of a single particle (as we did for the two-level system and
the oscillator), with the only difference that the particles are now real and
indistinguishable, so that we must divide the sum by the number of permutations:

Z = (1/N!) [Σa exp(−ϵa/T)]^N .

Using the Stirling formula ln N! ≈ N ln(N/e), we write the free energy

F = −NT ln[(e/N) Σa exp(−ϵa/T)] . (50)
Since the motion of the particle as a whole is always quasi-classical for the
Boltzmann gas, one can single out the kinetic energy: ϵa = p2 /2m + ϵ′a .
If in addition there is no external field (so that ϵ′a describes rotation and
the internal degrees of freedom of the particle) then one can integrate over
d3 pd3 q/h3 and get for the ideal gas:
F = −N T ln[(eV /N )(mT /2πh̄²)^{3/2} ∑_a exp(−ϵ′a /T )] . (51)
Mono-atomic gas. At temperatures much less than the distance to the
first excited state, all atoms are in the ground state (we put ϵ0 = 0). That
means that the excitation energies are much less than the Rydberg energy
ε0 = e²/aB = me⁴/h̄² and the temperatures are less than ε0 /k ≃ 3 · 10⁵ K
(otherwise atoms are ionized).
If there is neither orbital angular momentum nor spin (L = S = 0 —
such are the atoms of noble gases) we get ∑_a exp(−ϵ′a /T ) = 1, as the ground
state is non-degenerate, and
F = −N T ln[(eV /N )(mT /2πh̄²)^{3/2}] = −N T ln(eV /N ) − N cv T ln T − N ζT , (52)

cv = 3/2 , ζ = (3/2) ln(m/2πh̄²) . (53)
Here ζ is called the chemical constant. Note that for F = AT + BT ln T the
energy is linear in temperature, E = F − T ∂F/∂T = −BT , that is, the specific
heat, Cv = −B, is independent of temperature. The formulas thus derived
allow one to establish the conditions for the Boltzmann statistics to be appli-
cable, which requires n̄a ≪ 1. Evidently, it is enough to require exp(µ/T ) ≪ 1
where
µ = (E − T S + P V )/N = (F + P V )/N = (F + N T )/N = T ln[(N/V )(2πh̄²/mT )^{3/2}] .
In the opposite limit of temperature smaller than all the fine structure level
differences, only the ground state with ϵJ = 0 contributes and one gets
ζJ = ln(2J + 1) ,
where J is the total angular momentum in the ground state.
[Figure: the chemical constant ζ(T ) crossing over from ζJ to ζSL , and the
specific heat cv (T ) equal to 3/2 at low and high T with a maximum in between.]
Note that cv = 3/2 in both limits, that is, the specific heat is constant at
low and high temperatures (no contribution of the electronic degrees of free-
dom), having some maximum in between (due to the contribution of the elec-
trons). We have already seen this in considering the two-level system, and the
lesson is general: a finite number of levels does not contribute to the specific
heat at either low or high temperatures.
We estimate the parameters here assuming the typical scale to be Bohr radius
aB = h̄2 /me2 ≃ 0.5 · 10−8 cm and the typical energy to be Rydberg ε0 =
e2 /aB = me4 /h̄2 ≃ 4 · 10−11 erg. Note that m = 9 · 10−28 g is the electron
mass here. Now the frequency of the atomic oscillations is given by the ratio
of the Coulomb restoring force and the mass of the ion:
ω ≃ √(ε0 /M aB²) = √(e²/M aB³) .
Rotational energy is determined by the moment of inertia I ≃ M aB². We
may thus estimate the typical energies of vibrations and rotations as follows:

h̄ω ≃ ε0 √(m/M ) , h̄²/I ≃ ε0 (m/M ) . (55)

Since m/M ≃ 10⁻⁴, both energies are much smaller than the energy of
dissociation ≃ ε0 and the rotational energy is smaller than the vibrational
one, so that rotations start to contribute at lower temperatures: ε0 /k ≃
3 · 10⁵ K, h̄ω/k ≃ 3 · 10³ K and h̄²/Ik ≃ 30 K.
The harmonic oscillator was considered in Sect. 1.4.2. In the quasi-
classical limit, h̄ω ≪ T , the partition function of N independent oscillators
is Z(T, N ) = Z1^N (T ) = (T /h̄ω)^N , the free energy F = N T ln(h̄ω/T ) and the
mean energy from (24): E = N T . The specific heat CV = N .
For a quantum case, the energy levels are given by En = h̄ω(n + 1/2).
The single-oscillator partition function is

Z1 (T ) = ∑_{n=0}^∞ exp[−h̄ω(n + 1/2)/T ] = [2 sinh(h̄ω/2T )]^{−1} , (56)
where one sees the contribution of zero quantum oscillations and the break-
down of classical equipartition. The specific heat (per molecule) of vibrations
is thus as follows: cvib = (h̄ω/T )² exp(h̄ω/T )[exp(h̄ω/T ) − 1]^{−2}. At T ≪ h̄ω
we have CV ∝ exp(−h̄ω/T ). At large T we have classical equipartition (every
oscillator has two degrees of freedom, so it has T in energy and 1 in CV ).
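Both limits of the vibrational specific heat are easy to check numerically; a minimal sketch (our own illustration) in the dimensionless variable x = h̄ω/T:

```python
import math

def c_vib(x):
    """Vibrational specific heat per molecule in units of k_B; x = hbar*omega/T:
    c_vib = x^2 e^x / (e^x - 1)^2."""
    return x**2 * math.exp(x) / (math.exp(x) - 1.0)**2

print(c_vib(0.01))   # T >> hbar*omega: classical equipartition, close to 1
print(c_vib(10.0))   # T << hbar*omega: exponentially frozen out
```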
To calculate the contribution of rotations one ought to calculate the par-
tition function
z_rot = ∑_K (2K + 1) exp[−h̄²K(K + 1)/2IT ] . (57)
Again, when temperature is much smaller than the distance to the first
level, T ≪ h̄2 /2I, the specific heat must be exponentially small. Indeed,
retaining only the first two terms in the sum (57), we get z_rot = 1 + 3 exp(−h̄²/IT ),
which gives in the same approximation F_rot = −3N T exp(−h̄²/IT ) and c_rot =
3(h̄²/IT )² exp(−h̄²/IT ). We thus see that at low temperatures a diatomic gas
behaves as a mono-atomic one.
At large temperatures, T ≫ h̄2 /2I, the terms with large K give the main
contribution to the sum (57). They can be treated quasi-classically replacing
the sum by the integral:
z_rot = ∫_0^∞ dK (2K + 1) exp[−h̄²K(K + 1)/2IT ] = 2IT /h̄² . (58)
That gives the constant specific heat crot = 1. The resulting specific heat of
the diatomic molecule, cv = 3/2 + crot + cvibr , is shown on the figure:
[Figure: cv (T ) of a diatomic gas, with plateaus 3/2, 5/2 and 7/2 separated by
the rotational (T ∼ h̄²/I) and vibrational (T ∼ h̄ω) thresholds.]
Note that for h̄2 /I < T ≪ h̄ω the specific heat (weakly) decreases be-
cause the distance between rotational levels increases so that the level density
(which is actually cv ) decreases.
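Both limits of the rotational partition function can be verified by direct summation of (57); a small sketch (our own check) in the dimensionless variable t = 2IT/h̄²:

```python
import math

def z_rot(t, kmax=1000):
    """Direct sum of the rotational partition function (57); t = 2*I*T/hbar^2."""
    return sum((2*K + 1) * math.exp(-K*(K + 1)/t) for K in range(kmax))

# High temperature: approaches the classical result (58), z_rot -> t.
print(z_rot(100.0) / 100.0)
# Low temperature: dominated by the two lowest levels, z_rot ~ 1 + 3 exp(-2/t).
print(z_rot(0.1) - (1.0 + 3.0*math.exp(-2.0/0.1)))
```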
For (non-linear) molecules with N > 2 atoms we have 3 translations, 3
rotations and 3N − 6 vibrational modes (out of the total 3N coordinates one
subtracts 3 for the motion as a whole and 3 for rotations). That makes for the
high-temperature specific heat cv = ctr + crot + cvib = 3/2 + 3/2 + 3N − 6 =
3N − 3. Indeed, every variable that enters ϵ(p, q) quadratically (i.e. every
degree of freedom) contributes 1/2 to cv . Translations and rotations each
contribute only a momentum and thus give 1/2, while each vibration con-
tributes both a momentum and a coordinate (i.e. kinetic and potential
energy) and gives 1.
Landau & Lifshitz, Sects. 47, 49, 51.
2.2 Fermi and Bose gases
As we did at the beginning of Section 2.1, we consider all particles in
the same quantum state as a Gibbs subsystem and apply the grand canonical
distribution with the potential
Ωa = −T ln ∑_{na} exp[na (µ − ϵa )/T ] . (59)
Here the sum is over all possible occupation numbers na . For fermions, there
are only two terms in the sum with na = 0, 1 so that
Ωa = −T ln {1 + exp[β(µ − ϵa )]} .
For bosons, one must sum the infinite geometric progression (which converges
when µ < 0) to get Ωa = T ln {1 − exp[β(µ − ϵa )]}. Recall that Ω depends
on T, V, µ. The average number of particles in the state with the energy ϵ is
thus
n̄(ϵ) = −∂Ωa /∂µ = 1/{exp[β(ϵ − µ)] ± 1} . (60)
Upper sign here and in the subsequent formulas corresponds to the Fermi
statistics, lower to Bose. Note that at exp[β(ϵ − µ)] ≫ 1 both distributions
turn into Boltzmann distribution (47). The thermodynamic potential of the
whole system is obtained by summing over the states
Ω = ∓T ∑_a ln[1 ± e^{β(µ−ϵa )}] . (61)
Integrating over volume we get the quantum analog of the Maxwell dis-
tribution: √
dN (ϵ) = (gV m^{3/2}/√2π²h̄³) √ϵ dϵ/{exp[β(ϵ − µ)] ± 1} . (63)
In the same way we rewrite (61):
Ω = ∓(gV T m^{3/2}/√2π²h̄³) ∫_0^∞ √ϵ ln[1 ± e^{β(µ−ϵ)}] dϵ
  = −(2/3)(gV m^{3/2}/√2π²h̄³) ∫_0^∞ ϵ^{3/2} dϵ/{exp[β(ϵ − µ)] ± 1} = −(2/3)E . (64)
Since also Ω = −P V we get the equation of state
P V = 2E/3 . (65)
We see that this relation is the same as for a classical gas, it actually is true for
any non-interacting particles with ϵ = p2 /2m in 3-dimensional space. Indeed,
consider a cube with the side l. Every particle hits a wall |px |/2ml times per
unit time transferring the momentum 2|px | in every hit. The pressure is the
total momentum transferred per unit time p2x /ml divided by the wall area l2
(see Kubo, p. 32):
P = ∑_{i=1}^N p_{ix}²/ml³ = ∑_{i=1}^N p_i²/3ml³ = 2E/3V . (66)
In the limit of Boltzmann statistics we have E = 3N T /2 so that (65)
reproduces P V = N T . Let us obtain the (small) quantum corrections to the
pressure assuming exp(µ/T ) ≪ 1. Expanding the integral in (64),

∫_0^∞ ϵ^{3/2} dϵ/[e^{β(ϵ−µ)} ± 1] ≈ ∫_0^∞ ϵ^{3/2} e^{β(µ−ϵ)} [1 ∓ e^{β(µ−ϵ)}] dϵ
  = (3√π/4β^{5/2}) e^{βµ} (1 ∓ 2^{−5/2} e^{βµ}) ,
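This two-term expansion is easy to test numerically. The sketch below (our own check; β = 1 and fugacity e^{βµ} = 0.1 are assumed values) compares a direct integration with the formula for both statistics:

```python
import math

def quantum_integral(mu, sign, beta=1.0, emax=60.0, n=100000):
    """int_0^inf e^{3/2} de / (exp(beta(e - mu)) + sign) by a simple Riemann sum;
    sign = +1 for fermions, -1 for bosons."""
    h = emax / n
    total = 0.0
    for i in range(1, n + 1):       # integrand vanishes at e = 0
        e = i * h
        total += e**1.5 / (math.exp(beta*(e - mu)) + sign)
    return total * h

def two_term_expansion(mu, sign, beta=1.0):
    """(3 sqrt(pi)/4 beta^{5/2}) e^{beta mu} (1 -+ 2^{-5/2} e^{beta mu})."""
    z = math.exp(beta*mu)
    return 0.75*math.sqrt(math.pi)/beta**2.5 * z * (1.0 - sign*z/2**2.5)

mu = math.log(0.1)                  # small fugacity: near-Boltzmann regime
for sign in (+1, -1):
    print(quantum_integral(mu, sign), two_term_expansion(mu, sign))
```

The residual discrepancy is of the order of the neglected third term of the series.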
2.2.1 Degenerate Fermi Gas
The main goal of the theory here is to describe the electrons in the metals
(it is also applied to the Thomas-Fermi model of electrons in large atoms,
to protons and neutrons in large nuclei, to electrons in white dwarf stars,
to neutron stars and the early Universe). Drude and Lorentz at the beginning
of the 20th century applied the Boltzmann distribution and obtained decent
results for conductivity but a disastrous discrepancy for the specific heat
(which they expected to be 3/2 per electron). That was cleared up by Sommerfeld in 1928
with the help of Fermi-Dirac distribution. Since the energy of an electron
in a metal is comparable to Rydberg and so is the chemical potential (see
below) then for most temperatures we may assume T ≪ µ so that the Fermi
distribution is close to the step function:
[Figure: the Fermi distribution n̄(ϵ) at low T : a step at ϵ = ϵF smeared over a
width ∼ T .]
At T = 0 electrons fill all the momenta up to pF that can be expressed
via the concentration (g = 2 for s = 1/2):
N/V = (2 · 4π/h³) ∫_0^{p_F} p² dp = p_F³/3π²h̄³ , (68)
which gives the Fermi energy
ϵF = (3π²)^{2/3} (h̄²/2m)(N/V )^{2/3} . (69)
2m V
The chemical potential at T = 0 coincides with the Fermi energy (already
with one electron per unit cell one obtains ϵF /k ≃ 10⁴ K). The condition
T ≪ ϵF is evidently opposite to (48). Note that the condition of ideality
requires that the electrostatic energy Ze2 /a is much less than ϵF where Ze
is the charge of ion and a ≃ (ZV /N )1/3 is the mean distance between elec-
trons and ions. We see that the condition of ideality, N/V ≫ (e2 m/h̄2 )3 Z 2 ,
surprisingly improves with increasing concentration. Note nevertheless that
in most metals the interaction is substantial. Why one can still use the Fermi
distribution (only introducing an effective electron mass) is the subject of the
Landau theory of Fermi liquids, to be described in the course of condensed
matter physics (in a nutshell, it is because the main effect of the interaction
reduces to some mean effective periodic field).
To obtain the specific heat, Cv = (∂E/∂T )V,N one must find E(T, V, N )
i.e. exclude µ from two relations, (63) and (64):
N = (2V m^{3/2}/√2π²h̄³) ∫_0^∞ √ϵ dϵ/{exp[β(ϵ − µ)] + 1} ,

E = (2V m^{3/2}/√2π²h̄³) ∫_0^∞ ϵ^{3/2} dϵ/{exp[β(ϵ − µ)] + 1} .
At T ≪ µ ≈ ϵF this can be done perturbatively using the formula
∫_0^∞ f (ϵ) dϵ/{exp[β(ϵ − µ)] + 1} ≈ ∫_0^µ f (ϵ) dϵ + (π²/6)T ² f ′(µ) , (70)
which gives
N = (2V m^{3/2}/√2π²h̄³)(2/3)µ^{3/2}(1 + π²T ²/8µ²) ,

E = (2V m^{3/2}/√2π²h̄³)(2/5)µ^{5/2}(1 + 5π²T ²/8µ²) .
From the first equation we find µ(N, T ) perturbatively
µ = ϵF (1 − π²T ²/8ϵF²)^{2/3} ≈ ϵF (1 − π²T ²/12ϵF²) .
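The Sommerfeld result can be checked by solving the particle-number constraint numerically; a sketch (our own check, in units where ϵ_F = 1):

```python
import math

def n_integral(mu, T, n=20000):
    """int_0^inf sqrt(e) de / (exp((e - mu)/T) + 1), cut off far above mu."""
    emax = mu + 40.0*T
    h = emax / n
    return sum(math.sqrt(i*h) / (math.exp((i*h - mu)/T) + 1.0)
               for i in range(1, n + 1)) * h

def mu_numeric(T):
    """Solve particle conservation: the integral keeps its T = 0 value 2/3."""
    target = 2.0/3.0
    lo, hi = 0.5, 1.5
    for _ in range(40):
        mid = 0.5*(lo + hi)
        if n_integral(mid, T) > target:
            hi = mid
        else:
            lo = mid
    return 0.5*(lo + hi)

T = 0.05                                   # in units of eps_F
mu_sommerfeld = 1.0 - math.pi**2 * T**2 / 12.0
print(mu_numeric(T), mu_sommerfeld)        # agree up to terms of order T^4
```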
2.2.2 Photons
Consider electromagnetic radiation in an empty cavity kept at the temper-
ature T . Since electromagnetic waves are linear (i.e. they do not interact)
thermalization of radiation comes from interaction with walls (absorption
and re-emission)5 . One can derive the equation of state without all the for-
malism of the partition function. Indeed, consider the plane electromagnetic
wave with the fields having amplitudes E and B. The average energy density
is (E 2 + B 2 )/2 = E 2 while the momentum flux modulus is |E × B| = E 2 .
The radiation field in the box can be considered as an incoherent superpo-
sition of plane waves propagating in all directions. Since all waves contribute
to the energy density while only one-third of the waves contribute to the
radiation pressure on any given wall, then
P V = E/3 . (73)
In a quantum consideration we treat electromagnetic waves as photons,
which are massless particles with spin 1 that can have only two indepen-
dent orientations (corresponding to two independent polarizations of a clas-
sical electromagnetic wave). The energy is related to the momentum by
ϵ = cp. Now, exactly as we did for particles [where the law ϵ = p²/2m gave
P V = 2E/3 — see (66)] we can derive (73) by considering⁶ that every incident
photon brings momentum 2p cos θ to the wall, that the normal velocity is
c cos θ, and by integrating ∫ cos² θ sin θ dθ. Photon pressure is relevant inside
the stars, particularly inside the Sun.
Let us now apply the Bose distribution to the system of photons in a
cavity. Since the number of photons is not fixed, minimality of the free
energy, F (T, V, N ), requires zero chemical potential: (∂F/∂N )T,V = µ = 0.
The Bose distribution over the quantum states with fixed polarization, mo-
mentum h̄k and energy ϵ = h̄ω = h̄ck is called the Planck distribution:
n̄_k = 1/(e^{h̄ω/T} − 1) . (74)
At T ≫ h̄ω it gives the Rayleigh-Jeans distribution h̄ω n̄_k = T , which is
classical equipartition. Assuming the cavity to be large, we consider the distribution
⁵ It is meaningless to take perfect mirror walls, which do not change the frequency of
light under reflection and formally correspond to zero T .
⁶ This consideration is not restricted to bosons. Indeed, ultra-relativistic fermions have
ϵ = cp and P = E/3V . Note that in field theory energy and momentum are parts
of the energy-momentum tensor, whose trace must be positive; this requires cp ≤ ϵ and
P ≤ E/3V , where E is the total energy including the rest mass N mc², L&L 61.
over wave vectors continuous. Multiplying by 2 (the number of polarizations)
we get the spectral distribution of energy
dE_ω = h̄ck · 2V 4πk² dk/(2π)³(e^{h̄ck/T} − 1) = V h̄ω³ dω/π²c³(e^{h̄ω/T} − 1) . (75)
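Integrating (75) over all frequencies gives the T⁴ law; the dimensionless integral behind it, ∫₀^∞ x³ dx/(e^x − 1) = π⁴/15, is easy to confirm numerically (our own check):

```python
import math

def planck_integral(xmax=50.0, n=100000):
    """Riemann sum for int_0^inf x^3/(e^x - 1) dx (tail beyond xmax negligible)."""
    h = xmax / n
    return sum((i*h)**3 / math.expm1(i*h) for i in range(1, n + 1)) * h

print(planck_integral(), math.pi**4 / 15.0)
```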
I = (cE/2V ) ∫_0^{π/2} cos θ sin θ dθ = cE/4V = σT ⁴ . (77)
Landau & Lifshitz, Sect. 63 and Huang, Sect. 12.1.
2.2.3 Phonons
The specific heat of a crystal lattice can be calculated considering the oscil-
lations of the atoms as acoustic waves with three branches (two transversal
and one longitudinal) ωi = ui k where ui is the respective sound velocity. De-
bye took this expression for the spectrum and imposed a maximal frequency
ωmax so that the total number of degrees of freedom is equal to 3 times the
number of atoms:
(4πV /(2π)³) ∑_{i=1}^3 ∫_0^{ω_max} ω² dω/u_i³ = V ω_max³/2π²u³ = 3N . (78)
Here we introduced an effective sound velocity u defined by 3u^{−3} = 2u_t^{−3} + u_l^{−3}.
One usually introduces the Debye temperature Θ = h̄ω_max .
At T ≪ Θ for the specific heat we have the same cubic law as for photons:
C = N (12π⁴/5)(T /Θ)³ . (81)
For liquids, there is only one (longitudinal) branch of phonons so C =
N (4π 4 /5)(T /Θ)3 which works well for He IV at low temperatures.
At T ≫ Θ we have classical specific heat (Dulong-Petit law) C = 3N .
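Both limits of the Debye specific heat, the T³ law (81) and the Dulong-Petit value 3N, can be verified with a short numerical sketch (our own check; per atom, in units of k_B):

```python
import math

def debye_c(t, n=20000):
    """Debye specific heat per atom (units of k_B); t = T/Theta:
    C/N = 9 t^3 int_0^{1/t} x^4 e^x/(e^x - 1)^2 dx."""
    xmax = 1.0/t
    h = xmax / n
    s = 0.0
    for i in range(1, n + 1):
        x = i*h
        ex = math.exp(x)
        s += x**4 * ex / (ex - 1.0)**2
    return 9.0 * t**3 * s * h

print(debye_c(10.0))                                  # Dulong-Petit: close to 3
print(debye_c(0.02), 12.0*math.pi**4/5.0 * 0.02**3)   # T^3 law (81)
```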
Debye temperatures of different solids are between 100 and 1000 degrees
Kelvin. We can also write the free energy of the phonons
F = 9N T (T /Θ)³ ∫_0^{Θ/T} z² ln(1 − e^{−z}) dz = N T [3 ln(1 − e^{−Θ/T}) − D(Θ/T )] , (82)
independent of temperature. One can also express it via the so-called mean
geometric frequency, defined as follows: ln ω̄ = (3N )^{−1} ∑_a ln ωa . Then δF =
δG = T ∑_a ln(h̄ωa /T ) = 3N T ln[h̄ω̄(P )/T ], and α = (N/V ω̄)dω̄/dP . When the
pressure increases, the atoms get closer, the restoring force increases and
so does the frequency of oscillations, so that α ≥ 0.
Note that we’ve got a constant contribution 9N Θ/8 in (80), which is due
to quantum zero oscillations. While it does not contribute to the specific heat,
it manifests itself in X-ray scattering, the Mössbauer effect etc. Incidentally,
this is not the whole energy of a body at zero temperature, this is only the
energy of excitations due to atoms shifting from their equilibrium positions.
There is also a negative energy of attraction when the atoms are precisely in
their equilibrium positions. The total (so-called binding) energy must be
negative for the crystal to exist at T = 0.
One may ask why we didn’t account for zero oscillations when considering
photons in (75,76). Since the frequency of photons is not restricted from
above, the respective contribution seems to be infinite. How to make sense
out of such infinities is considered in quantum electrodynamics; note that
the zero oscillations of the electromagnetic field are real and manifest them-
selves, for example, in the Lamb shift of the levels of a hydrogen atom. In
thermodynamics, zero oscillations of photons are of no importance.
Landau & Lifshitz, Sects. 64–66; Huang, Sect. 12.2
We introduced the thermal wavelength λ = (2πh̄²/mT )^{1/2} and the function

g_a (z) = (1/Γ(a)) ∫_0^∞ x^{a−1} dx/(z^{−1}e^x − 1) = ∑_{i=1}^∞ z^i/i^a . (83)
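The series form of (83) is straightforward to evaluate; at z = 1 it reduces to the Riemann zeta function, e.g. g_{3/2}(1) = ζ(3/2) ≈ 2.612, used below. A sketch (our own; the tail estimate at z = 1 is a simple Euler-Maclaurin approximation):

```python
def g(a, z, terms=10000):
    """Bose function g_a(z) = sum_{i>=1} z^i / i^a from (83)."""
    s = sum(z**i / i**a for i in range(1, terms + 1))
    if z == 1.0:
        # Euler-Maclaurin estimate of the truncated tail of sum i^{-a}
        s += (terms + 0.5)**(1.0 - a) / (a - 1.0)
    return s

print(g(1.5, 1.0))   # zeta(3/2), about 2.612
print(g(2.5, 1.0))   # zeta(5/2), about 1.341
```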
One may wonder why we single out the contribution of the zero-energy level, as it
is not supposed to contribute in the thermodynamic limit V → ∞. Yet this
is not true at sufficiently low temperatures. Indeed, let us rewrite it denoting
n0 = z/(1 − z) the number of particles at p = 0:

n0 /V = 1/v − g_{3/2}(z)/λ³ . (84)
The function g_{3/2}(z) behaves as shown in the figure: it monotonically grows
while z changes from zero (µ = −∞) to unity (µ = 0). Recall that the
chemical potential of bosons is non-positive (otherwise one would have in-
finite occupation numbers). At z = 1, the value is g_{3/2}(1) = ζ(3/2) ≈ 2.6
and the derivative is infinite. When the temperature and the specific volume
v = V /N are such that λ3 /v > g3/2 (1) (notice that the thermal wavelength
is now larger than the inter-particle distance) then there is a finite frac-
tion of particles that occupies the zero-energy level. The solution of (84)
looks as shown in the figure. When V → ∞ we have a sharp transition at
λ3 /v = g3/2 (1) i.e. at T = Tc = 2πh̄2 /m[vg3/2 (1)]2/3 : at T ≤ Tc we have
z ≡ 1 that is µ ≡ 0. At T > Tc we obtain z solving λ3 /v = g3/2 (z).
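Above T_c the fugacity follows from λ³/v = g_{3/2}(z); a bisection sketch (our own, with the series truncation assumed adequate for z well below 1):

```python
def g32(z, terms=20000):
    """g_{3/2}(z) as the series from (83)."""
    return sum(z**i / i**1.5 for i in range(1, terms + 1))

def fugacity(lam3_over_v):
    """Solve lambda^3/v = g_{3/2}(z) by bisection above the transition;
    in the condensed phase (lambda^3/v > g_{3/2}(1) ~ 2.612) z = 1."""
    if lam3_over_v >= 2.612:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(40):
        mid = 0.5*(lo + hi)
        if g32(mid) > lam3_over_v:
            hi = mid
        else:
            lo = mid
    return 0.5*(lo + hi)

print(fugacity(1.0))    # warm/dilute side: z < 1
print(fugacity(3.0))    # condensed side: z = 1 and n0/N = 1 - (T/Tc)^{3/2}
```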
[Figure: g_{3/2}(z) growing monotonically from 0 to 2.6 at z = 1 (with infinite
slope there); the solution z(v/λ³), equal to 1 − O(1/V ) for v/λ³ < 1/2.6.]
Therefore, at the thermodynamic limit we put n0 = 0 at T > Tc and
n0 /N = 1 − (T /Tc )3/2 as it follows from (84). All thermodynamic relations
have now different expressions above and below Tc (upper and lower cases
respectively):
E = (3/2)P V = (2πV /mh³) ∫_0^∞ p⁴ dp/[z^{−1} exp(p²/2mT ) − 1]
  = { (3V T /2λ³) g_{5/2}(z) above Tc ; (3V T /2λ³) g_{5/2}(1) below Tc } , (85)

cv = { (15v/4λ³) g_{5/2}(z) − 9g_{3/2}(z)/4g_{1/2}(z) above Tc ; (15v/4λ³) g_{5/2}(1) below Tc } . (86)
phonons and photons, and µ = 0 too) yet slower than cv ∝ T ³ (which we had
for ϵp = cp) because the particle levels, ϵp = p²/2m, are denser at lower
energies. On the other hand, since the distance between levels increases with
energy, at high temperatures cv decreases with T as for the rotators in
Sect. 2.1.1:
[Figure: cv (T ) rising to a maximum at Tc and approaching 3/2 at large T ;
the P –v isotherms with the transition line P v^{5/3} = const and the boundary vc (T ).]
At T < Tc the pressure is independent of the volume, which prompts the
analogy with a phase transition of the first order. Indeed, this resembles the
properties of saturated vapor (particles with nonzero energy) in contact
with the liquid (particles with zero energy): changing the volume at fixed tem-
perature we change the fraction of the particles in the liquid but not the pres-
sure. This is why the phenomenon is called Bose-Einstein condensation.
Increasing the temperature we cause evaporation (particles leaving the conden-
sate in our case), which increases cv ; after all the liquid has evaporated (at T = Tc )
cv starts to decrease. It is sometimes said that it is a “condensation in the momentum
space” but if we put the system in a gravity field then there will be a spatial
separation of two phases just like in a gas-liquid condensation (liquid at the
bottom).
We can also obtain the entropy [above Tc by the usual formulas that follow
from (64) and below Tc by just integrating the specific heat, S = ∫ dE/T =
∫ N cv (T ) dT /T = 5E/3T = 2N cv /3]:

S/N = { (5v/2λ³) g_{5/2}(z) − ln z above Tc ; (5v/2λ³) g_{5/2}(1) below Tc } . (87)
The entropy is zero at T = 0 which means that the condensed phase has no
entropy. At finite T all the entropy is due to gas phase. Below Tc we can
write S/N = (T /Tc )^{3/2} s = (v/vc )s where s is the entropy per gas particle:
s = 5g_{5/2}(1)/2g_{3/2}(1). The latent heat of condensation per particle is T s, so
it is indeed a phase transition of the first order.
Landau & Lifshitz, Sect. 62; Huang, Sect. 12.3.
2.3 Chemical reactions
Time to learn why µ is called chemical potential. Reactions in the mixture
of ideal gases. Law of mass action. Heat of reaction. Ionization equilibrium.
Landau & Lifshitz, Sects. 101–104.
3 Non-ideal gases
Here we take into account a weak interaction between particles. There are
two limiting cases when the consideration is simplified:
i) when the typical range of interaction is much smaller than the mean dis-
tance between particles so that it is enough to consider only two-particle
interactions,
ii) when the interaction is long-range, so that every particle effectively interacts
with many other particles and one can apply some mean-field description.
We start from ii) (even though it is conceptually more complicated) so
that after consideration of i) we can naturally turn to phase transitions.
sphere: n0 r_D³ ≫ 1 (in electrolytes r_D is of order 10⁻³ ÷ 10⁻⁴ cm while in
ionosphere plasma it can be kilometers). Everywhere n0 is the mean density
of either ions or electrons.
We can now estimate the electrostatic contribution to the energy of the
system of N particles (what is called correlation energy):
Ū ≃ −N e²/r_D ≃ −N^{3/2}e³/√(V T) = −A/√(V T) . (90)
The (positive) addition to the specific heat is

∆CV = A/2V^{1/2}T^{3/2} ≃ N e²/r_D T ≪ N . (91)
One can get the correction to the entropy by integrating the specific heat:
∆S = − ∫_T^∞ CV (T ) dT /T = −A/3V^{1/2}T^{3/2} . (92)
We set the limits of integration here so as to ensure that the effect of screening
disappears at large temperatures. We can now get the corrections to the free
energy and pressure:
∆F = Ū − T ∆S = −2A/3V^{1/2}T^{1/2} , ∆P = −A/3V^{3/2}T^{1/2} . (93)
The total pressure is P = N T /V − A/3V^{3/2}T^{1/2} — a decrease at small V (see
figure) hints at the possibility of a phase transition, which indeed happens
(droplet creation) for electron-hole plasma in semiconductors, even though
our calculation does not work at those concentrations.
[Figure: the isotherm P (V ) lying below the ideal-gas curve.]
repulsion, so the corrections to the energy, entropy, free energy and pressure
must all be negative. The positive addition to the specific heat can be inter-
preted as follows: increasing the temperature weakens the screening and thus
increases the energy.
Now, we can do all the consideration in a more consistent way calculating
exactly the value of the constant A. To calculate the correlation energy of
electrostatic interaction one needs to multiply every charge by the potential
created by other charges at its location. The electrostatic potential ϕ(r)
around an ion determines the distribution of ions (+) and electrons (-) by
the Boltzmann formula n± (r) = n0 exp[∓eϕ(r)/T ] while the charge density
e(n+ − n− ) in its turn determines the potential by the Poisson equation
∆ϕ = −4πe(n+ − n− ) = −4πen0 (e^{−eϕ/T} − e^{eϕ/T}) ≈ (8πe²n0 /T )ϕ , (94)
where we expanded the exponents assuming the weakness of interaction.
This equation has a central-symmetric solution ϕ(r) = (e/r) exp(−κr), where
κ² = 8πe²n0 /T = r_D^{−2}. We are interested in this potential near the ion, i.e.
at small r: ϕ(r) ≈ e/r − eκ, where the first term is the field of the ion itself
while the second term is precisely what we need, i.e. the contribution of all
other charges.
We can now write the energy of every ion and electron as −e2 κ and get the
total electrostatic energy multiplying by the number of particles (N = 2n0 V )
and dividing by 2 so as not to count every couple of interacting charges twice:
Ū = −n0 V κe² = −√π N^{3/2}e³/√(V T) . (95)

Comparing with the rough estimate (90), we have just fixed the numerical factor √π.
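As a numerical aside (the parameter values are assumed for illustration; Gaussian units), the screening radius r_D = κ⁻¹ = (T/8πe²n₀)^{1/2} and the ideality parameter n₀r_D³ are easily evaluated:

```python
import math

e_ch = 4.8e-10    # esu, elementary charge
k_B = 1.38e-16    # erg/K

def debye_radius(n0, T_kelvin):
    """r_D = kappa^{-1} = sqrt(T / (8 pi e^2 n0)) in cm (Gaussian units)."""
    return math.sqrt(k_B*T_kelvin / (8.0*math.pi*e_ch**2*n0))

n0, T = 1e5, 1000.0            # assumed: dilute hot plasma, n0 in cm^-3
r_D = debye_radius(n0, T)
print(r_D)                     # a fraction of a centimeter here
print(n0 * r_D**3)             # many particles in the Debye sphere: weak coupling
```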
The consideration by Debye-Hückel is the right way to account for the
first-order corrections in the small parameter e²n0^{1/3}/T . One cannot though
get next corrections within the method [further expanding the exponents in
(94)]. That would miss multi-point correlations which contribute the next
orders. To do this one needs Bogolyubov’s method of correlation functions.
Such functions are multi-point joint probabilities to find simultaneously par-
ticles at given places. The correlation energy is expressed via the two-point
correlation function wab where the indices mark both the type of particles
(electrons or ions) and the positions ra and rb :
E = (1/2) ∑_{a,b} (Na Nb /V ²) ∫∫ u_ab w_ab dVa dVb . (96)
Here uab is the energy of the interaction. The pair correlation function is
determined by the Gibbs distribution integrated over the positions of all
particles except the given pair:
w_ab = V^{2−N} ∫ exp{[F − F_id − U (r1 . . . rN )]/T } dV1 . . . dVN −2 . (97)

Here
U = u_ab + ∑_c (u_ac + u_bc ) + ∑_{c,d≠a,b} u_cd .
Expanding this equation in U/T we get terms like uab wab and in addition
(uac + ubc )wabc which involves the third particle c and the triple correlation
function that one can express via the integral similar to (97):
w_abc = V^{3−N} ∫ exp{[F − F_id − U (r1 . . . rN )]/T } dV1 . . . dVN −3 , (98)
One observes that the equation on w_ab is not closed: it contains w_abc ; the
similar equation on w_abc will contain w_abcd , etc. The Debye-Hückel approxi-
mation corresponds to closing this hierarchical system of equations already at
the level of the first equation (99), putting w_abc ≈ w_ab w_bc w_ac and assuming
ω_ab = w_ab − 1 ≪ 1, that is assuming that two particles rarely come close
while three particles never come together:
∂ω_ab /∂rb = −(1/T ) ∂u_ab /∂rb − (V T )^{−1} ∑_c Nc ∫ ω_ac (∂u_bc /∂rb ) dVc . (100)
For other contributions to wabc , the integral turns into zero due to isotropy.
This is the general equation valid for any form of interaction. For Coulomb
interaction, we can turn the integral equation (100) into the differential equa-
tion by using ∆r−1 = −4πδ(r). For that we differentiate (100) once more:
∆ω_ab (r) = (4πza zb e²/T ) δ(r) + (4πzb e²/T V ) ∑_c Nc zc ω_ac (r) . (101)
The dependence on ion charges and types is trivial, ωab (r) = za zb ω(r) and
we get ∆ω = 4πe2 δ(r)/T + κ2 ω which is (94) with delta-function enforc-
ing the condition at zero. We see that the pair correlation function satis-
fies the same equation as the potential. Substituting the solution ω(r) =
−(e2 /rT ) exp(−κr) into wab (r) = 1 + za zb ω(r) and that into (96) one gets
contribution of 1 vanishing because of electro-neutrality and the term linear
in ω giving (95). To get to the next order, one considers (99) together with
the equation for wabc , where one expresses wabcd via wabc .
The quantum (actually quasi-classical) variant of such a mean-field consid-
eration is called the Thomas-Fermi method (1927) and is traditionally studied
in courses of quantum mechanics, as it is applied to the electron distri-
bution in large atoms (such placement is probably right: even though the
method is statistical-physical, the objects of study are more important than
the methods). In this method we consider the effect of electrostatic interaction on a
degenerate electron gas at zero temperature. According to the Fermi distri-
bution (68) the maximal kinetic energy is related to the local concentration
n(r) by p20 /2m = (3π 2 n)2/3 h̄2 /2m. We need to find the electrostatic poten-
tial ϕ(r) which determines the total electron energy p2 /2m − eϕ. Denote
−eϕ0 = p20 /2m − eϕ the maximal value of the total energy of the electron (it
is zero for neutral atoms and negative for ions). The maximal total energy
must be space-independent otherwise the electrons drift to the places with
lower −ϕ0 . We can now relate the local electron density n(r) to the local po-
tential ϕ(r): p20 /2m = eϕ − eϕ0 = (3π 2 n)2/3 h̄2 /2m — that relation one must
now substitute into the Poisson equation ∆ϕ = 4πen ∝ (ϕ − ϕ0 )3/2 . This
equation has a power-law solution at large distances ϕ ∝ r−4 , n ∝ r−6 which
does not make much sense as atoms are supposed to have finite sizes. It also
is inapplicable at distances below the Bohr radius. The Thomas-Fermi
approximation works well for large atoms, where there is an intermediate in-
terval of distances (Landau & Lifshitz, Quantum Mechanics, Sect. 70).
For more details on the Coulomb interaction, see Landau & Lifshitz,
Sects. 78, 79.
3.2 Cluster and virial expansions
Consider a dilute gas with the short-range inter-particle energy of interaction
u(r). We assume that u(r) decays on the scale r0 and
Integrating over momenta we get the partition function Z and the grand
partition function Z as

Z(N, V, T ) = (1/N !λ_T^{3N}) ∫ dr1 . . . drN exp[−U (r1 , . . . , rN )/T ] ≡ Z_N (V, T )/N !λ_T^{3N} ,

Z(z, V, T ) = ∑_{N =0}^∞ z^N Z_N /N !λ_T^{3N} . (102)
The first term does not account for interaction. The second one accounts
for the interaction of only one pair (under the assumption that when one
pair of particles happens to be close and interact, this is such a rare event
that the rest can be considered non-interacting). The third term accounts for
simultaneous interaction of three particles, etc. We can now write the grand
canonical potential Ω = −P V = −T ln Z and expand the logarithm in powers
of z/λ_T³:

P = (T /λ_T³) ∑_{l=1}^∞ b_l z^l . (104)
b1 = 1 , b2 = (1/2)λ_T^{−3} ∫ f12 dr12 ,

b3 = (1/6)λ_T^{−6} ∫ (e^{−U123 /T} − e^{−U12 /T} − e^{−U23 /T} − e^{−U13 /T} + 2) dr12 dr13

   = (1/6)λ_T^{−6} ∫ (3f12 f13 + f12 f13 f23 ) dr12 dr13 . (105)
Using the cluster expansion we can now show that the cluster integrals bl
indeed appear in the expansion (104). For l = 1, 2, 3 we saw that this is
indeed so.
Denote by m_l the number of l-clusters and by {m_l } the whole set of m1 , . . . .
In calculating Z_N we need to include the number of ways to distribute N
particles over clusters, which is N !/∏_l (l!)^{m_l}. We then must multiply it by
the sum of all possible clusters in the power m_l , divided by m_l ! (since an
exchange of all particles of one cluster with another cluster of the same
size does not matter). Since the sum of all l-clusters is b_l l! λ_T^{3(l−1)} V , then

Z_N = N !λ_T^{3N} ∑_{m_l} ∏_l (b_l λ_T^{−3} V )^{m_l}/m_l ! .
Here we used N = ∑_l l m_l . The problem is that the sum over different
partitions {m_l } needs to be taken under this restriction too, and this is techni-
cally very cumbersome. Yet when we pass to calculating the grand canonical
partition function⁷ and sum over all possible N , we obtain an unrestricted
summation over all possible {m_l }. Writing z^N = z^{∑ l m_l} = ∏_l (z^l )^{m_l} we get

Z = ∑_{m1 ,m2 ,...=0}^∞ ∏_{l=1}^∞ (1/m_l !)(V b_l z^l /λ_T³)^{m_l}
  = [∑_{m1 =0}^∞ (1/m1 !)(V b1 z/λ_T³)^{m1}] [∑_{m2 =0}^∞ (1/m2 !)(V b2 z²/λ_T³)^{m2}] · · ·
  = exp[(V /λ_T³) ∑_{l=1}^∞ b_l z^l ] . (108)
We can now reproduce (104) and write the total number of particles:

P V = −Ω = T ln Z(z, V ) = (V T /λ_T³) ∑_{l=1}^∞ b_l z^l , (109)

1/v = (z/V ) ∂ ln Z/∂z = λ_T^{−3} ∑_{l=1}^∞ l b_l z^l . (110)
To get the equation of state one now must express z via v/λ³ from (110) and
substitute into (109). That will generate the series called the virial expansion:

P v/T = ∑_{l=1}^∞ a_l (T ) (λ_T³/v)^{l−1} . (111)
⁷ Sometimes the choice of the ensemble is dictated by the physical situation, sometimes
by a technical convenience, like now. The equation of state must be the same in the
canonical and grand canonical ensembles, as we expect the pressure on the wall restricting
the system to be equal to the pressure measured inside.
The dimensionless virial coefficients can be expressed via the cluster coefficients,
i.e. they depend on the interaction potential and temperature:

a1 = b1 = 1 , a2 = −b2 , a3 = 4b2² − 2b3 = −λ_T^{−6} ∫ f12 f13 f23 dr12 dr13 /3 , . . . .
can be estimated by splitting the integral into two parts, from 0 to r0 (where
we can neglect the exponent, assuming u large and positive) and from r0 to ∞
(where we can assume a small negative energy, |u| ≪ T , and expand the expo-
nent). That gives

B(T ) = b − a/T , a ≡ 2π ∫_{r0}^∞ |u(r)| r² dr . (113)
molecules hit each other often and negative at low temperatures when long-
range attraction between molecules decreases the pressure. Since N B/V <
N b/V ≪ 1 the correction is small. Note that a/T ≪ 1 since we assume weak
interaction.
While by its very derivation the formula (114) is valid only for a dilute
gas, one may desire to change it a bit so that it can (at least qualitatively)
describe the limit of incompressible liquid. That would require the pressure
to go to infinity when the density reaches some value. This is usually done by
replacing in (114) 1 + bn by (1 − bn)^{−1}, which is equivalent for bn ≪ 1 but
for bn → 1 gives P → ∞. The resulting equation of state is called the van der
Waals equation:

(P + an²)(1 − nb) = nT . (115)
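A quick numerical check of the critical point of (115) (in units with a = b = 1, assumed here): at n_c = 1/3b, T_c = 8a/27b both ∂P/∂n and ∂²P/∂n² vanish, and P_c = a/27b²:

```python
def vdw_pressure(n, T, a=1.0, b=1.0):
    """Van der Waals equation of state (115): P = nT/(1 - nb) - a n^2."""
    return n*T/(1.0 - n*b) - a*n**2

nc, Tc = 1.0/3.0, 8.0/27.0          # critical point for a = b = 1
h = 1e-4
dP = (vdw_pressure(nc + h, Tc) - vdw_pressure(nc - h, Tc)) / (2*h)
d2P = (vdw_pressure(nc + h, Tc) - 2*vdw_pressure(nc, Tc)
       + vdw_pressure(nc - h, Tc)) / h**2
print(dP, d2P)                      # both vanish at the critical point
print(27*vdw_pressure(nc, Tc))      # P_c = a/27 b^2: equals 1 up to rounding
```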
There is though an alternative way to obtain (115) without assuming the gas
dilute. This is some variant of the mean field, even though it is not a first
step of any consistent procedure. Namely, we assume that every molecule
moves in some effective field Ue (r) which is a strong repulsion (Ue → +∞)
in some region of volume bN and an attraction of order −aN/V outside:

F − F_id ≈ −T N ln{∫ e^{−Ue (r)/T} dr/V } = −T N [ln(1 − bn) + aN/V T ] . (116)
Differentiating (116) with respect to V gives (115). That “derivation” also
helps understand better the role of the parameters b (excluded volume) and
a (mean interaction energy per molecule). From (116) one can also find the
entropy of the van der Waals gas S = −(∂F/∂T )V = Sid + N ln(1 − nb)
and the energy E = Eid − N 2 a/V , which are both lower than those for an
ideal gas, while the sign of the correction to the free energy depends on the
temperature. Since the correction to the energy is T -independent, CV is the same as for the ideal gas.
Let us now look closer at the equation of state (115). The set of isotherms
is shown on the figure:
[Figure: a van der Waals isotherm in the P − V plane with the points C, D, E, J, L, N, Q marked, together with the corresponding v(P ) and µ(P ) curves.]
Since it is expected to describe both gas and liquid then it must show
phase transition. Indeed, we see the region with (∂P/∂V )T > 0 at the lower
isotherm in the first figure. When the pressure corresponds to the level NLC,
it is clear that L is an unstable point and cannot be realized. But which
stable point is realized, N or C? To get the answer, one must minimize the
Gibbs potential G(T, P, N ) = N µ(T, P ) since we have T and P fixed. For
one mole, integrating the relation dµ(T, P ) = −s dT + v dP at constant temperature we find: G = µ = ∫ v(P ) dP . It is clear that the pressure that
corresponds to D (having equal areas below and above the horizontal line)
separates the absolute minimum at the left branch Q (liquid-like) from that
on the right one C (gas-like). The states E (over-cooled or over-compressed
gas) and N (overheated or overstretched liquid) are metastable, that is they
are stable with respect to small perturbations but they do not correspond to
the global minimum of chemical potential. We thus conclude that the true
equation of state must have isotherms that look as follows:
[Figure: corrected van der Waals isotherms in the P − V plane; below the critical isotherm Tc the unstable part is replaced by a horizontal coexistence segment.]
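The equal-area rule that fixes the coexistence pressure can be checked numerically. The sketch below is an illustration, not part of the text; it assumes units a = b = 1 (so that Tc = 8/27 and Pc = 1/27) and bisects on the pressure level until the two areas cut by the horizontal line cancel:

```python
# Numerical Maxwell construction for the van der Waals isotherm
# (a sketch in units a = b = 1, so that Tc = 8/27 and Pc = 1/27).
import numpy as np

a, b = 1.0, 1.0
Tc, Pc = 8 * a / (27 * b), a / (27 * b**2)

def P(v, T):
    return T / (v - b) - a / v**2

def trapz(y, x):
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x) / 2.0))

def coexistence_pressure(T):
    """Bisect on the level P* until the two areas cut by the line P = P*
    from the isotherm loop are equal."""
    v = np.linspace(1.05 * b, 60.0 * b, 200001)
    p = P(v, T)
    dp = np.diff(p)
    i_min = np.where((dp[:-1] < 0) & (dp[1:] > 0))[0][0] + 1   # loop minimum
    j = np.where((dp[:-1] > 0) & (dp[1:] < 0))[0]
    i_max = j[j > i_min][0] + 1                                # loop maximum

    def mismatch(Pstar):              # > 0 while the lower area dominates
        sc = np.where(np.diff(np.sign(p - Pstar)) != 0)[0]
        vv = np.linspace(v[sc[0]], v[sc[-1]], 4001)
        return trapz(P(vv, T) - Pstar, vv)

    lo, hi = p[i_min] + 1e-9, p[i_max] - 1e-9
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if mismatch(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

print(coexistence_pressure(0.9 * Tc) / Pc)   # ≈ 0.647, the known vdW value
```

At T = 0.9 Tc the construction reproduces the reduced saturation pressure of the van der Waals gas, and the pressure rises toward Pc as T → Tc.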
4 Phase transitions
4.1 Thermodynamic approach
The main theme of this Chapter is the competition (say, in minimizing the
free energy) between the interaction energy, which tends to order systems,
and the entropy, which tends to bring disorder. Upon the change of some
parameters, systems can undergo a phase transition from more to less ordered
states. We start this chapter from the phenomenological approach to the
transitions of both first and second orders. We then proceed to develop a
microscopic statistical theory based on Ising model.
4.1.1 Necessity of the thermodynamic limit
For Z(z) being a polynomial, both P and v are analytic functions of z in a
region of the complex plane that includes the real positive axis. Therefore,
P (v) is an analytic function in a region of the complex plane that includes
the real positive axis. Note that V /Nm ≤ v < ∞. One can also prove that
∂v −1 /∂z > 0 so that ∂P/∂v = (∂P/∂z)/(∂v/∂z) < 0.
For a phase transition of the first order the pressure must be independent
of v in the transition region. We see that strictly speaking in a finite volume
we cannot have that since P (v) is analytic, nor can we have ∂P/∂v > 0. That
means that singularities, jumps etc can appear only in the thermodynamic
limit N → ∞, V → ∞ (where, formally speaking, the singularities that
existed in the complex plane of z can come to the real axis). Such singularities
are related to zeroes of Z(z). When such a zero z0 tends to a real axis at the
limit N → ∞ (like the root e−iπ/2N of the equation z N + 1 = 0) then 1/v(z)
and P (z) are determined by two different analytic functions in two regions:
one, including the part of the real axis with z < z0 and another with z > z0 .
Depending on the order of zero of Z(z), 1/v itself may have a jump or its n-th
derivative may have a jump, which corresponds to the n + 1 order of phase
transition. For n = 0, P ∝ limN →∞ N −1 ln |z − z0 − O(N −1 )| is continuous
at z → z0 but ∂P/∂z and 1/v are discontinuous; this is the transition of the
first order. For the second-order phase transition, volume is continuous but
its derivative jumps. We see now what happens as we increase T towards
Tc : another zero comes from the complex plane into real axis and joins the
zero that existed there before, turning 1st order phase transition into the
2nd order transition; at T > Tc the zeroes leave the real axis. Huang, Sect.
15.1-2.
[Figure: P (z) and 1/v(z) near the zero z0 , and the resulting isotherms P (v): a jump of 1/v for a first-order transition, a jump of its derivative for a second-order one.]
[Figure: P − T phase diagram with the triple and critical points (the melting line of water has dP/dT < 0), the corresponding T − V diagram, and the chemical potentials µ1 (T ), µ2 (T ) crossing at the transition with L > 0.]
The Clausius-Clapeyron equation relates the slope of the equilibrium curve to the latent heat L = T (s2 − s1 ):

dP/dT = (s1 − s2 )/(v1 − v2 ) = L/T (v2 − v1 ) . (119)
Since the entropy of a liquid is usually larger than that of a solid, L > 0, that is, heat is absorbed upon melting and released upon freezing. Most
of the substances also expand upon melting, so the solid-liquid equilibrium
line has dP/dT > 0. Water, on the contrary, contracts upon melting so the
slope of the melting curve is negative as on the P-T diagram above. Note
that symmetries of solid and liquid states are different so that one cannot
continuously transform solid into liquid. That means that the melting line
starts on another line and goes to infinity since it cannot end in a critical
point (like the liquid-gas line).
Clausius-Clapeyron equation allows one, in particular, to obtain the pres-
sure of vapor in equilibrium with liquid or solid. In this case, v1 ≪ v2 .
We may treat the vapor as an ideal gas so that v2 = T /P and (119) gives
d ln P/dT = L/T ² . We may further assume that L is approximately independent of T and obtain P ∝ exp(−L/T ), which is a fast-increasing function
of temperature. Landau & Lifshitz, Sects. 81–83.
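The integration of d ln P/dT = L/T ² with constant L is easy to illustrate numerically (a sketch with hypothetical values of L, T0 , P0 ):

```python
# Vapor pressure from Clausius-Clapeyron with constant latent heat L:
# d ln P / dT = L / T**2  integrates to  P ∝ exp(-L/T).
import math

def vapor_pressure(T, L, T0, P0):
    """Closed form of d ln P/dT = L/T^2 with constant L."""
    return P0 * math.exp(L / T0 - L / T)

L, T0, P0 = 10.0, 1.0, 1.0          # hypothetical units
T, lnP, dT = T0, math.log(P0), 1e-5
while T < 2.0:                      # explicit Euler integration
    lnP += (L / T**2) * dT
    T += dT
print(math.exp(lnP), vapor_pressure(2.0, L, T0, P0))   # both ≈ e^5 ≈ 148.4
```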
[Figure: chemical potentials µ1 (P ) and µ2 (P ) crossing at the transition pressure.]
Other examples of continuous phase transitions (i.e. such that correspond to a continuous change in the system) are all related to the change in symmetry upon the change of P or T . Since symmetry is a qualitative characteristic, it can change even upon an infinitesimal change (for example, however small ferromagnetic magnetization breaks isotropy). Here too
every phase can exist only on one side of the transition point. The tran-
sition with first derivatives of the thermodynamic potentials continuous is
called second order phase transition. Because the phases end in the point
of transition, such point must be singular for the thermodynamic potential,
and indeed second derivatives, like specific heat, are generally discontinuous.
One set of such transitions is related to the shifts of atoms in a crystal lattice;
close to the transition such a shift is small (i.e. the state of matter is
almost the same) but the symmetry of the lattice changes abruptly at the
transition point. Another set is a spontaneous appearance of macroscopic
magnetization (i.e. ferromagnetism) below Curie temperature. Transition
to superconductivity is of the second order. A variety of second-order phase
transitions happen in liquid crystals etc. Let us stress that some transitions
with a symmetry change are first-order (like melting) but all second-order
phase transitions correspond to a symmetry change.
4.1.4 Landau theory
To describe general properties of the second-order phase transitions Lan-
dau suggested to characterize symmetry breaking by some order parameter
η which is zero in the symmetrical phase and is nonzero in nonsymmetric
phase. Example of an order parameter is magnetization. The choice of order
parameter is non-unique; to reduce arbitrariness, it is usually required to
transform linearly under the symmetry transformation. The thermodynamic
potential can be formally considered as G(P, T, η) even though η is not an
independent parameter and must be found as a function of P, T from requir-
ing the minimum of G. We can now consider the thermodynamic potential
near the transition as a series in small η:
G(P, T, η) = G0 + A(P, T )η 2 + B(P, T )η 4 . (120)
The linear term is absent to keep the first derivative continuous. The coeffi-
cient B must be positive, since arbitrarily large values of η must cost a lot
of energy. The coefficient A must be positive in the symmetric phase when
minimum in G corresponds to η = 0 (left figure below) and negative in the
non-symmetric phase where η ̸= 0. Therefore, at the transition Ac (P, T ) = 0
and Bc (P, T ) > 0:
[Figure: G(η) with a single minimum at η = 0 for A > 0 and two symmetric minima at η ≠ 0 for A < 0.]
In the lowest order in η the entropy is S = −∂G/∂T = S0 + a2 (T − Tc )/2B at
T < Tc and S = S0 at T > Tc . Entropy is lower at lower-temperature phase
(which is generally less symmetric). Specific heat Cp = T ∂S/∂T has a jump
at the transition: ∆Cp = a² Tc /2B. Specific heat increases when symmetry
is broken since more types of excitations are possible.
If symmetries allow the cubic term C(P, T )η 3 (like in a gas or liquid
near the critical point discussed in Sect. 5.2 below) then one generally has a
first-order transition, say, when A < 0 and C changes sign:
[Figure: G(η) with a cubic term; an asymmetric second minimum appears and deepens, producing a first-order transition.]
In an external field h (adding −hη to the potential), the equilibrium condition ∂G/∂η = 2Aη + 4Bη³ − h = 0 has one solution η(h) above the transition and may have three solutions (two locally stable, one unstable) below the transition:

[Figure: η(h) above, at, and below the transition.]
The similarity to the van der Waals isotherm is not accidental: changing the field at T < Tc one encounters a first-order phase transition at h = 0 where the two phases with η = ±[a(Tc − T )/2B]^{1/2} coexist. We see that p − pc is analogous to h and 1/v − 1/vc to the order parameter (magnetization) η.
Susceptibility,

χ = (∂η/∂h)_{h=0} = [2a(T − Tc ) + 12Bη²]^{−1} = [2a(T − Tc )]^{−1} at T > Tc , [4a(Tc − T )]^{−1} at T < Tc . (124)
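The two branches of (124) can be verified directly by solving the equilibrium condition 2a(T − Tc )η + 4Bη³ = h for a small field and differentiating (a sketch; the units a = B = Tc = 1 are an assumption of the example):

```python
# Numerical check of (124) from the condition 2a(T-Tc)η + 4Bη³ = h
# (i.e. G = G0 + Aη² + Bη⁴ - hη minimized); units a = B = Tc = 1 assumed.

a, B, Tc = 1.0, 1.0, 1.0

def eta_of_h(T, h, eta):
    """Newton iteration for 2a(T-Tc)η + 4Bη³ = h, started from eta."""
    for _ in range(100):
        f = 2 * a * (T - Tc) * eta + 4 * B * eta**3 - h
        fp = 2 * a * (T - Tc) + 12 * B * eta**2
        eta -= f / fp
    return eta

def chi(T, h=1e-6):
    # below Tc start near the spontaneous value so Newton stays in one minimum
    eta0 = (a * (Tc - T) / (2 * B)) ** 0.5 if T < Tc else 0.1
    return (eta_of_h(T, h, eta0) - eta_of_h(T, -h, eta0)) / (2 * h)

print(chi(1.5), 1 / (2 * a * (1.5 - Tc)))   # T > Tc : χ = [2a(T-Tc)]^-1
print(chi(0.5), 1 / (4 * a * (Tc - 0.5)))   # T < Tc : χ = [4a(Tc-T)]^-1
```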
4.2 Ising model
of different nature and the technical difficulties related with the commuta-
tion relations of spin operators. It is remarkable that there exists one highly
simplified approach that allows one to study systems so diverse as ferromag-
netism, condensation and melting, order-disorder transitions in alloys, phase
separation in binary solutions, and also model phenomena in economics, so-
ciology, genetics, to analyze the spread of forest fires etc. This approach is
based on the consideration of lattice sites with the nearest-neighbor interac-
tion that depends upon the manner of occupation of the neighboring sites.
We shall formulate it initially on the language of ferromagnetism and then
establish the correspondence to some other phenomena.
4.2.1 Ferromagnetism
Experiments show that ferromagnetism is associated with the spins of elec-
trons (not with their orbital motion). Spin 1/2 may have two possible pro-
jections. We thus consider lattice sites with elementary magnetic moments
±µ. We already considered (Sect. 1.4.1) this system in an external magnetic
field H without any interaction between moments and got the magnetization
(20):
M = nµ [exp(µH/T ) − exp(−µH/T )]/[exp(µH/T ) + exp(−µH/T )] = nµ tanh(µH/T ) . (125)
First phenomenological treatment of the interacting system was done by
Weiss who assumed that there appears some extra magnetic field proportional
to magnetization which one adds to H and thus describes the influence that
M causes upon itself:
M = N µ tanh[µ(H + ξM )/T ] . (126)
And now put the external field to zero H = 0. The resulting equation can
be written as
η = tanh(Tc η/T ) , (127)
where we denoted η = M/µN and Tc = ξµ2 N . At T > Tc there is a
single solution η = 0 while at T < Tc there are two more nonzero solutions
which exactly means the appearance of the spontaneous magnetization. At
Tc − T ≪ Tc one has η² = 3(Tc − T )/Tc exactly as in Landau theory (122).
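The self-consistency equation (127) is easy to solve by straightforward iteration; a minimal sketch (t = T /Tc ):

```python
# Iterative solution of (127), η = tanh(Tc η/T), with t = T/Tc.
import math

def spontaneous_eta(t, iters=20000):
    eta = 1.0                     # start from the fully ordered state
    for _ in range(iters):
        eta = math.tanh(eta / t)
    return eta

print(spontaneous_eta(1.1))       # ≈ 0: no spontaneous magnetization above Tc
for t in (0.99, 0.999):
    print(t, spontaneous_eta(t)**2, 3 * (1 - t))   # η² ≈ 3(Tc - T)/Tc near Tc
```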
[Figure: spontaneous magnetization η versus T /Tc , rising from zero at T /Tc = 1 to η = 1 at T = 0.]
One can compare Tc with experiments and find a surprisingly high ξ ∼ 10³ ÷ 10⁴ . That means that the real interaction between moments is much higher
than the interaction between neighboring dipoles µ2 n = µ2 /a3 . Frenkel and
Heisenberg solved this puzzle (in 1928): it is not the magnetic energy but
the difference of electrostatic energies of electrons with parallel and antipar-
allel spins, so-called exchange energy, which is responsible for the interaction
(parallel spins have antisymmetric coordinate wave function and much lower
energy of interaction than antiparallel spins).
We can now at last write the Ising model (formulated by Lenz in 1920
and solved in one dimension by his student Ising in 1925): we have the
variable σi = ±1 at every lattice site. The energy includes interaction with
the external field and between neighboring spins:
H = −µH Σ_{i}^{N} σi + (J/4) Σ_{ij} (1 − σi σj ) . (128)
we get:

γJ (N − 2N+ )/N − T ln[(N − N+ )/N+ ] = 0 . (129)
Here we can again introduce the variables η = M/µN and Tc = γJ/2 and
reduce (129) to (127). We thus see that indeed Weiss approximation is equiv-
alent to the mean field. The only addition is that now we have the expression
for the free energy, 2F/N = Tc (1 − η²) + T (1 + η) ln(1 + η) + T (1 − η) ln(1 − η) − 2T ln 2, so that we can indeed make sure that the nonzero η at T < Tc corresponds to minima. Here is the free energy plotted as a function of magnetization; we
see that it has exactly the form we assumed in the Landau theory (which as
we see near Tc corresponds to the mean field approximation). The energy is
symmetrical with respect to flipping all the spins simultaneously. The free
energy is symmetric with respect to η ↔ −η. But the system at T < Tc lives
in one of the minima (positive or negative η). When the symmetry of the
state is less than the symmetry of the potential (or Hamiltonian) it is called
spontaneous symmetry breaking.
[Figure: free energy F (η) for T > Tc , T = Tc and T < Tc .]
two-level system (22): Z = 2[1 + exp(−J/T )]^{N−1} . The factor 2 is there because there are
two possible orientations of the first spin. There are N − 1 links. Now, as
we know, there is no phase transition for a two-level system. In particular
one can compare the mean-field energy E = Tc (1 − η 2 ) with the exact 1d
expression (25) which can be written as E(T ) = N J/(1 + eJ/T ) and compare
the mean field specific heat with the exact 1d expression:
[Figure: specific heat of the 1d Ising chain: the mean-field jump versus the smooth exact 1d curve.]
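Using the exact 1d expression E(T ) = N J/(1 + e^{J/T }) quoted above, the specific heat can be evaluated directly and is indeed smooth at every temperature (a sketch in units N = J = 1):

```python
# Specific heat from the exact 1d expression E(T) = NJ/(1 + e^{J/T}),
# in units N = J = 1: C(T) is finite and smooth, no transition in 1d.
import math

def energy(T):
    return 1.0 / (1.0 + math.exp(1.0 / T))

def specific_heat(T, dT=1e-6):
    return (energy(T + dT) - energy(T - dT)) / (2 * dT)

# compare with the analytic derivative C = (1/T²) e^{1/T} / (1 + e^{1/T})²
T = 0.5
x = math.exp(1.0 / T)
print(specific_heat(T), x / (T**2 * (1.0 + x)**2))
```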
We can improve the mean-field approximation by accounting exactly for
the interaction of a given spin σ0 with its γ nearest neighbors and replacing
the interaction with the rest of the lattice by a new mean field H ′ (this is
called Bethe-Peierls or BP approximation):
H_{γ+1} = −µH′ Σ_{j=1}^{γ} σj − (J/2) Σ_{j=1}^{γ} σ0 σj . (130)
we obtain

η = [(γ − 1)/2] ln[cosh(η + ν)/ cosh(η − ν)] (131)
instead of (127) or (129). Condition that the derivatives with respect to η at
zero are the same, (γ − 1) tanh ν = 1, gives the critical temperature:
Tc = J ln^{−1} [γ/(γ − 2)] , γ ≥ 2 . (132)
It is lower than the mean field value γJ/2 and tends to it when γ → ∞
— mean field is exact in an infinite-dimensional space. More important, it
shows that there is no phase transition in 1d when γ = 2 and Tc = 0 (in
fact, BP is exact in 1d). Note that η is now not a magnetization, which is
given by the mean spin σ̄0 = sinh(2η)/[cosh(2η) + exp(−2ν)]. BP also gives
nonzero specific heat at T > Tc : C = γν²/(2 cosh² ν) (see Pathria 11.6 for
more details):
[Figure: specific heat versus T /J : exact 2d solution (transition at T /J ≈ 1.13), Bethe-Peierls (1.44) and mean field (2).]
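The comparison of the two critical temperatures in (132) is a one-liner (units J = 1):

```python
# Critical temperatures in units J = 1: Bethe-Peierls (132) vs mean field.
import math

def tc_bp(gamma):
    return 1.0 / math.log(gamma / (gamma - 2.0)) if gamma > 2 else 0.0

def tc_mf(gamma):
    return gamma / 2.0

for gamma in (2, 3, 4, 6, 100):
    print(gamma, tc_bp(gamma), tc_mf(gamma))
# γ = 2 (1d): Tc = 0; γ = 4 (square lattice): 1/ln 2 ≈ 1.44 vs 2;
# tc_bp approaches γJ/2 as γ → ∞ (mean field exact in high dimension).
```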
4.2.2 Impossibility of phase coexistence in one dimension
It is physically natural that fluctuations have much influence in one dimension: it is enough for one spin to flip to lose the information of the preferred
orientation. It is thus not surprising that phase transitions are impossible
in one-dimensional systems with short-range interaction. Another way to
understand that the ferromagnetism is possible only starting from two di-
mensions is to consider the spin lattice and ask if we can make temperature
low enough to have a nonzero magnetization. The state of lowest energy has
all spins parallel. The first excited state corresponds to one spin flip and has
an energy higher by ∆E = γJ, the concentration of such opposite spins is
proportional to exp(−γJ/T ) and must be low at low temperatures so that
the magnetization is close to µN and η ≈ 1. In one dimension, however, the
lowest excitation is not the flip of one spin (energy 2J) but flipping all the
spins to the right or left from some site (energy J). Again the mean number
of such flips is N exp(−J/T ) and in sufficiently long chain this number is
larger than unity i.e. the mean magnetization is zero. Note that short pieces
with N < exp(J/T ) are magnetized.
That argument can be generalized for arbitrary systems with the short-
range interaction in the following way (Landau, 1950): assume we have n
contact points of two different phases. Those points add nϵ − T S to the
thermodynamic potential. The entropy is ln CLn where L is the length of
the chain. Evaluating entropy at 1 ≪ n ≪ L we get the addition to the
potential nϵ − T n ln(eL/n). The derivative of the thermodynamic potential
with respect to n is thus ϵ − T ln(L/n) and it is negative for sufficiently small
n/L. That means that one decreases the thermodynamic potential creating
the mixture of two phases all the way until the derivative comes to zero which
happens at L/n = exp(ϵ/T ) — this length can be called the correlation scale
of fluctuations and it is always finite in 1d at a finite temperature as in a
disordered state. Landau & Lifshitz, Sect. 163.
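Landau's counting can be checked directly with hypothetical numbers (ϵ = 1, T = 1/4, L = 10⁶ below are assumptions of the example): the addition nϵ − T n ln(eL/n) is indeed minimal at L/n = e^{ϵ/T }.

```python
# Landau's 1d counting: n phase-boundary points add ΔΩ(n) = nϵ - T n ln(eL/n);
# the minimum sits at L/n = exp(ϵ/T). Hypothetical numbers below.
import math

def delta_omega(n, L, eps, T):
    return n * eps - T * n * math.log(math.e * L / n)

L, eps, T = 1.0e6, 1.0, 0.25
n_star = L * math.exp(-eps / T)          # predicted minimum, L/n = e^(ϵ/T)
for n in (0.5 * n_star, n_star, 2.0 * n_star):
    print(n, delta_omega(n, L, eps, T))  # smallest (and negative) at n_star
```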
one of the sublattices. Therefore we have a second-order phase transition at zero field, at the temperature called the Neel temperature. The difference from a ferromagnet is that there is a phase transition also at a nonzero external field (there is a line of transitions in the H − T plane).
One can try to describe the condensation transition by considering a regular lattice with N sites that can be occupied or not. We assume our lattice to be in contact with a reservoir of atoms so that the total number of atoms, Na , is not fixed. We thus use a grand canonical description with Z(z, N, T ) given by (102). We model the hard-core repulsion by requiring that a given site cannot be occupied by more than one atom. The number of sites plays the role of volume (choosing the volume of the unit cell unity). If the neighboring sites are occupied by atoms it corresponds to the (attraction) energy −2J so we have the energy E = −2J Naa where Naa is the total number of nearest-neighbor pairs of atoms. The partition function at fixed Na is

Z(Na , T ) = Σ exp(2J Naa /T ) , (133)

where the sum runs over the ways to distribute Na atoms over the lattice; the grand partition function Z(z, N, T ) = Σ_{Na} z^{Na} Z(Na , T ) (134)
gives the equation of state in the implicit form (like in Sect. 4.1.1): P =
T ln Z/N and 1/v = (z/V )∂ ln Z/∂z. The correspondence with the Ising
model can be established by saying that an occupied site has σ = 1 and
unoccupied one has σ = −1. Then Na = N+ and Naa = N++ . Recall that
for Ising model, we had E = −µH(N+ − N− ) + JN+− = µHN + (Jγ −
2µH)N+ − 2JN++ . Here we used the identity γN+ = 2N++ + N+− which
one derives counting the number of lines drawn from every up spin to its
nearest neighbors. The partition function of the Ising model can be written
similarly to (134) with z = exp[(γJ − 2µH)/T ]. Further correspondence can
be established: the pressure P of the lattice gas can be expressed via the
free energy per site of the Ising model: P ↔ −F/N + µH and the inverse
specific volume 1/v = Na /N of the lattice gas is equivalent to N+ /N =
(1 + M/µN )/2 = (1 + η)/2. We see that generally (for given N and T ) the
lattice gas corresponds to the Ising model with a nonzero field H so that the
transition is generally of the first-order in this model. Indeed, when H = 0
we know that η = 0 for T > Tc which gives a single point v = 2, to get the
whole isotherm one needs to consider nonzero H, i.e. fugacity different from exp(γJ/T ). In the same way, the solutions of the zero-field Ising model at T < Tc give us two values of η, that is, two values of the specific volume for a given pressure P . Those two values, v1 and v2 , precisely correspond to two phases in coexistence at the given pressure. Since v = 2/(1 + η) then as T → 0 we have two roots: η1 → 1, which corresponds to v1 → 1, and η2 → −1, which corresponds to v2 → ∞. For example, in the mean field approximation
(129) we get (denoting B = µH)
P = B − (γJ/4)(1 + η²) − (T /2) ln[(1 − η²)/4] ,  B = γJ/2 − (T /2) ln z ,

v = 2/(1 + η) ,  η = tanh(B/T + γJη/2T ) . (135)
As usual in a grand canonical description, to get the equation of state one
expresses v(z) [in our case B(η)] and substitutes it into the equation for the
pressure. On the figure, the solid line corresponds to B = 0 at T < Tc where
we have a first-order phase transition with the jump of the specific volume,
the isotherms are shown by broken lines. The right figure gives the exact
two-dimensional solution.
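The isotherm (135) can be traced parametrically in η: expressing B(η) = T atanh η − γJη/2 from the self-consistency relation, one checks that at B = 0 and T < Tc the two spontaneous roots ±η0 give two specific volumes at one and the same pressure. A sketch (γJ = 4, so the mean-field Tc = 2, is an assumption of the example):

```python
# Mean-field lattice-gas isotherm (135) traced parametrically in η,
# with B(η) = T*atanh(η) - γJη/2 from the self-consistency equation.
import math

gJ = 4.0                         # γJ ; mean-field Tc = γJ/2 = 2

def point_on_isotherm(T, eta):
    """Return (v, P) of (135) for a given η."""
    B = T * math.atanh(eta) - gJ * eta / 2.0
    P = B - (gJ / 4.0) * (1 + eta**2) - (T / 2.0) * math.log((1 - eta**2) / 4.0)
    return 2.0 / (1 + eta), P

T = 1.0                          # below Tc
eta0 = 0.9                       # iterate to the spontaneous root
for _ in range(2000):
    eta0 = math.tanh(gJ * eta0 / (2 * T))

v_liq, P_liq = point_on_isotherm(T, eta0)
v_gas, P_gas = point_on_isotherm(T, -eta0)
print(v_liq, v_gas, P_liq, P_gas)    # two specific volumes, equal pressures
```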
[Figure: mean-field lattice-gas isotherms P/Tc versus v (broken lines) with the first-order jump of the specific volume at B = 0 (solid line), and the exact two-dimensional solution.]
the case when ϵ1 + ϵ2 > 2ϵ12 so that it is indeed preferable to have alternating
atoms and two sublattices may exist at least at low temperatures. The phase
transition is of the second order with the specific heat observed to increase
as the temperature approaches the critical value. Huang, Chapter 16 and
Pathria, Chapter 12.
As we have seen, to describe the phase transitions of the second order
near Tc we need to describe strongly fluctuating systems. We shall study
fluctuations more systematically in the next section and return to critical
phenomena in Sects. 5.2 and 5.3.
5 Fluctuations
5.1 Thermodynamic fluctuations
Consider fluctuations of energy and volume of a given (small) subsystem. The
probability of a fluctuation is determined by the entropy change of the whole
system w ∝ exp(∆S0 ) which is determined by the minimal work needed for
a reversible creation of such a fluctuation: T ∆S0 = −Rmin = T ∆S − ∆E −
P ∆V where ∆S, ∆E, ∆V relate to the subsystem. Here we only assumed the
subsystem to be small i.e. ∆S0 ≪ S0 , E ≪ E0 , V ≪ V0 while fluctuations
can be substantial.
[Figure: a fluctuation of a small subsystem of volume V inside a closed system (E0 , V0 , S0 ); the minimal work Rmin determines the total entropy change −∆S0 .]
and obtain

w ∝ exp[ −(Cv /2T ²)(∆T )² + (1/2T )(∂P/∂V )T (∆V )² ] . (139)
From (139), ⟨(∆V )²⟩ = −T (∂V /∂P )T , and ⟨(∆v)²⟩ = N ^{−2} ⟨(∆V )²⟩, which can be converted into the mean squared fluctuation of the number of particles in a fixed volume:

∆v = ∆(V /N ) = V ∆(1/N ) = −(V /N ²)∆N ,  ⟨(∆N )²⟩ = −T (N ²/V ²)(∂V /∂P )T . (140)
We see that the fluctuations cannot be considered small near the critical
point where ∂V /∂P is large.
For a classical ideal gas with V = N T /P it gives ⟨(∆N )2 ⟩ = N . In this
case, we can do more than considering small fluctuations (or large volumes).
Namely, we can find the probability of fluctuations comparable to the mean
value N̄ = N0 V /V0 . The probability for N (noninteracting) particles to be
inside some volume V out of the total volume V0 is
wN = [N0 !/N !(N0 − N )!] (V /V0 )^N [(V0 − V )/V0 ]^{N0 −N}
≈ (N̄ ^N /N !)(1 − N̄ /N0 )^{N0} ≈ N̄ ^N exp(−N̄ )/N ! . (141)
Here we assumed that N0 ≫ N and N0 ! ≈ (N0 − N )! N0^N . The distribution
(141) is called Poisson law which takes place for independent events. Mean
squared fluctuation is the same as for small fluctuations:
⟨(∆N )²⟩ = ⟨N ²⟩ − N̄ ² = exp(−N̄ ) Σ_{N=1} N N̄ ^N /(N − 1)! − N̄ ²
= exp(−N̄ ) [ Σ_{N=2} N̄ ^N /(N − 2)! + Σ_{N=1} N̄ ^N /(N − 1)! ] − N̄ ² = N̄ . (142)
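The Poisson statistics (141)-(142) are easy to check numerically: place N0 independent particles, each inside the subvolume with probability V /V0 , and compare the variance of the count with N̄ (a sketch; numpy's binomial sampler stands in for the placement of particles):

```python
# Sampling check of (141)-(142): binomial placement of N0 particles,
# N̄ = N0 V/V0 = 100, plus a direct evaluation of the Poisson variance.
import math
import numpy as np

rng = np.random.default_rng(0)
counts = rng.binomial(100000, 0.001, size=200000)
print(counts.mean(), counts.var())     # both ≈ N̄ = 100

# direct evaluation of (142) for the Poisson law with N̄ = 5
nbar, var, term = 5.0, 0.0, math.exp(-5.0)
for N in range(100):
    var += (N - nbar)**2 * term        # Σ (N - N̄)² w_N
    term *= nbar / (N + 1)
print(var)                             # = N̄ = 5 up to truncation error
```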
5.2 Spatial correlation of fluctuations
We now consider systems with interaction and discuss a spatial correlation of
fluctuations of concentration n = N/V , which is particularly interesting near
the critical point. Indeed, the level of fluctuations depends on the volume of
the subsystem. To estimate whether the fluctuations are important or not
(say in destroying the order appearing in a phase transition) we need to con-
sider the subsystem size of the order of the correlation scale of fluctuations.
Since the fluctuations of n and T are independent, we assume T = const
so that the minimal work is the change in the free energy, which we again
expand to the quadratic terms
w ∝ exp(−∆F/T ) ,  ∆F = (1/2) ∫ ϕ(r12 )∆n(r1 )∆n(r2 ) dV1 dV2 . (143)
Here ϕ is the second (variational) derivative of F with respect to n(r). After
Fourier transform,
∑ ∫ ∫
1
∆n(r) = ∆nk eikr , ∆nk = ∆n(r)e−ikr dr , ϕ(k) = ϕ(r)e−ikr dr .
k V
the free energy change takes the form
∆F = (V /2) Σk ϕ(k)|∆nk |² ,
which corresponds to a Gaussian probability distribution of independent vari-
ables - amplitudes of the harmonics. The mean squared fluctuation is as
follows:

⟨|∆nk |²⟩ = T /V ϕ(k) . (144)
Usually, the largest fluctuations correspond to small k where we can use the
expansion called the Ornstein-Zernike approximation

ϕ(k) ≈ ϕ0 + 2gk² . (145)
This presumes a short-range interaction, which makes the large-scale limit regular.
From the previous section, ϕ0 (T ) = n−1 (∂P/∂n)T .
Making the inverse Fourier transform we find (the large-scale part of) the
pair correlation function of the concentration in 3d:
⟨∆n(0)∆n(r)⟩ = Σk |∆nk |² e^{ikr} = V ∫ |∆nk |² e^{ikr} d³k/(2π)³
= V ∫_0^∞ |∆nk |² [(e^{ikr} − e^{−ikr})/ikr] k² dk/(2π)² = (T /8πgr) exp(−r/rc ) . (146)
One can derive that by recalling (from Sect. 3.1) that (κ2 − ∆) exp(−κr)/r =
4πδ(r) or directly: expand the integral to −∞ and then close the contour in
the complex upper half plane for the first exponent and in the lower half plane
for the second exponent so that the integral is determined by the respective
poles k = ±iκ = ±i/rc . We defined the correlation radius of fluctuations
rc = [2g(T )/ϕ0 (T )]1/2 . Far from any phase transition, the correlation radius
is typically the mean distance between molecules.
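One can check the form (146) directly: f(r) = exp(−r/rc )/r must satisfy (κ² − ∆)f = 0 away from the origin, with κ = 1/rc and the radial Laplacian ∆f = (rf)″/r, as recalled from Sect. 3.1 above. A finite-difference sketch in arbitrary units:

```python
# Finite-difference check that f(r) = exp(-r/rc)/r of (146) satisfies
# (κ² - Δ) f = 0 for r > 0, with κ = 1/rc and Δf = (r f)''/r.
import math

kappa = 2.0                            # κ = 1/rc, arbitrary units

def f(r):
    return math.exp(-kappa * r) / r

def radial_laplacian(g, r, h=1e-4):
    """Δg = (r g)''/r via a central finite difference."""
    return ((r + h) * g(r + h) - 2 * r * g(r) + (r - h) * g(r - h)) / (h**2 * r)

for r in (0.5, 1.0, 3.0):
    print(r, kappa**2 * f(r) - radial_laplacian(f, r))   # ≈ 0
```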
Not only for the concentration but also for other quantities, (146) is a
general form of the correlation function at long distances. Near the critical
point, ϕ0 (T ) ∝ A(T )/V = α(T − Tc ) decreases and the correlation radius in-
creases. To generalize the Landau theory for inhomogeneous η(r) one writes
the space density of the thermodynamic potential uniting (121) and (145) as
g|∇η|2 + α(T − Tc )η 2 + bη 4 . That means that having an inhomogeneous state
costs us extra value of the thermodynamic potential. We again assume that
only the coefficient at η² turns into zero at the transition. The correlation radius diverges at the transition: rc (T ) = [g/α(T − Tc )]^{1/2} = rc0 [Tc /(T − Tc )]^{1/2} .
only outside the fluctuation region, in particular, the jump in the specific
heat ∆Cp is related to the values on the boundaries of the fluctuation region.
Landau theory predicts rc ∝ (T − Tc )−1/2 and the correlation function
approaching the power law 1/r as T → Tc in 3d. Of course, those scalings
are valid under the condition that the criterion (147) is satisfied, that is
not very close to Tc . As we have seen from the exact solution of 2d Ising
model, the true asymptotics at T → Tc (i.e. inside the fluctuation region)
are different: rc ∝ (T − Tc )−1 and φ(r) = ⟨σ(0)σ(r)⟩ ∝ r−1/4 at T = Tc
in that case. Yet the fact of the radius divergence remains. It means the
breakdown of the Gaussian approximation for the probability of fluctuations
since we cannot divide the system into independent subsystems. Indeed, far
from the critical point, the probability distribution of the density has two
approximately Gaussian peaks, one at the density of liquid nl , another at
the density of gas ng . As we approach the critical point and the distance
between peaks is getting comparable to their widths, the distribution is non-
Gaussian. In other words, one needs to describe a strongly interacting system
near the critical point which makes it similar to other great problems of
physics (quantum field theory, turbulence).
[Figure: probability distribution w(n) with two nearly Gaussian peaks at the gas and liquid densities ng , nl .]
transverse modes do not cost any energy. This is an example of the Gold-
stone theorem which claims that whenever continuous symmetry is sponta-
neously broken (i.e. the symmetry of the state is less than the symmetry
of the thermodynamic potential or Hamiltonian) then the mode must ex-
ist with the energy going to zero with the wavenumber. This statement is
true beyond the mean-field approximation or Landau theory as long as the
force responsible for symmetry breaking is short-range. Goldstone modes
are easily excited by thermal fluctuations and they destroy long-range order for d ≤ 2. Indeed, in less than three dimensions, the integral (146) at ϕ0 = 0 describes the correlation function which grows with the distance: ⟨∆η(0)∆η(r)⟩ ∝ ∫ d^d k k^{−2} ∝ [r^{2−d} − L^{2−d} ]/(d − 2). For example,
⟨∆n(0)∆n(r)⟩ ∝ ln r in 2d. Simply speaking, if at some point you have some
value then far enough from this point the value can be much larger. That
means that the state is actually disordered despite rc = ∞: soft (Goldstone)
modes with no energy price for long fluctuations (ϕ0 = 0) destroy long or-
der (this statement is called Mermin-Wagner theorem). One can state this
in another way by saying that the mean variance of the order parameter,
⟨(∆η)2 ⟩ ∝ L2−d , diverges with the system size L → ∞ at d ≤ 2. In exactly
the same way phonons with ωk ∝ k make 2d crystals impossible: the energy
of the lattice vibrations is proportional to the squared atom velocity (which
is the frequency times displacement); that makes the mean squared displacement proportional to ∫ d^d k T /ωk² ∝ L^{2−d} : in large enough samples the amplitude
of displacement is getting comparable to the distance between atoms in the
lattice, which means the absence of a long-range order.
Another example is the case of the complex scalar order parameter Ψ =
ϕ1 + ıϕ2 (say, the amplitude of the quantum condensate). In this case, the
density of the Landau thermodynamic potential invariant with respect to the
phase change Ψ → Ψ exp(iα) (called global gauge invariance) has the form
g|∇Ψ|2 + α(T − Tc )|Ψ|2 + b|Ψ|4 . At T < Tc , the minima of the potential form
a circle ϕ21 + ϕ22 = α(Tc − T )/b = ϕ20 in ϕ1 − ϕ2 plane. Any ordered (coherent)
state would correspond to some choice of the phase, say ϕ1 = ϕ0 , ϕ2 = 0. For
small perturbations around this state, Ψ = ϕ0 + φ + ıξ, the quadratic part of
the potential takes the form g|∇ξ|2 + g|∇φ|2 + bϕ20 |φ|2 /2 i.e. fluctuations of ξ
have an infinite correlation radius even at T < Tc ; their correlation function
(146) diverges at d ≤ 2, which means that phase fluctuations destroy 2d
quantum condensate.
While long-range order is absent in 2d, local order exists, which means
that at sufficiently low temperatures superfluidity and superconductivity can
exist in 2d films and 2d crystals can support transverse sound. At such
state, correlation function decays by a power law, and there exists a peculiar
Berezinskii-Kosterlitz-Thouless phase transition to a high-T state with an
exponential decay of correlations. For example, a locally coherent condensate can support vortices; the transition is from a low-T state of bound vortex dipole pairs to a high-T state of free vortices, which provide a Debye screening with the Debye radius being the (finite) correlation length. For crystals, the role of vortices is played by dislocations.
It is different however if the condensate is made of charged particles which
can interact with the electromagnetic field, defined by the vector potential A and the field tensor Fµν = ∇µ Aν − ∇ν Aµ . That corresponds to the
thermodynamic potential |∇Ψ − ıeA|2 + α(T − Tc )|Ψ|2 + b|Ψ|4 − Fµν F µν /4,
which is invariant with respect to the local (inhomogeneous) gauge transfor-
mations Ψ → Ψ exp[iα(r)], A → A + ∇α/e for any differentiable function
α(r). Perturbations can now be defined as Ψ = (ϕ0 + h) exp[ıα/ϕ0 ], absorb-
ing also the phase shift into the vector potential: A → A + ∇α/eϕ0 . The
quadratic part of the thermodynamic potential (148) now contains nonzero homogeneous terms quadratic in both h and A, i.e. the correlation radii of both modes h, A are finite. This case does
not belong to the validity domain of the Mermin-Wagner theorem since the
Coulomb interaction is long-range.
Thinking dynamically, if the energy of a perturbation contains a nonzero
homogeneous term quadratic in the perturbation amplitude, then the spec-
trum of excitations has a gap at k → 0, so that what we have just seen is that
the Coulomb interaction leads to a gap in the plasmon spectrum at the superconducting transition. In the language of relativistic quantum field
theory, Lagrangian plays a role of the thermodynamic potential and vacuum
plays a role of the equilibrium. Here, the presence of such a term means a fi-
nite energy (mc2 ) at zero momentum which means a finite mass. Interaction
with the gauge field described by (148) is called Higgs mechanism by which
particles acquire mass in quantum field theory, the excitation described by
A is called vector Higgs boson. When you hear about the infinite correlation
radius, gapless excitations or massless particles, remember that people talk
about the same phenomenon.
Let us summarize the relation between the space dimensionality and the
possibility of phase transition and ordered state for a short-range interaction.
At d ≥ 3 an ordered phase is always possible for a continuous as well as
discrete broken symmetry. For d = 2, a phase transition is possible that
breaks a discrete symmetry and corresponds to a real scalar order parameter
like in the Ising model. At d = 1, no symmetry breaking or ordered state is
possible.
Landau & Lifshitz, Sects. 116, 152.
inantly up or down. We assume that the phenomena very near the critical point
can be described equally well in terms of block spins with the energy of the
same form as the original, E′ = −h′ Σᵢ σᵢ′ + (J′/4) Σᵢⱼ (1 − σᵢ′σⱼ′), but with differ-
ent parameters J′ and h′. Let us demonstrate how it works using the 1d Ising
model with h = 0 and J/2T ≡ K. Let us transform the partition function
Σ_{σ} exp[K Σᵢ σᵢσᵢ₊₁] by the procedure (called decimation¹²) of eliminating
degrees of freedom by ascribing (undemocratically) to every block of k = 3
spins the value of the central spin. Consider two neighboring blocks σ1 , σ2 , σ3
and σ4 , σ5 , σ6 and sum over all values of σ3 , σ4 keeping σ1′ = σ2 and σ2′ = σ5
fixed. The respective factors in the partition function can be written as fol-
lows: exp[Kσ3 σ4 ] = cosh K + σ3 σ4 sinh K. Denote x = tanh K. Then only
the terms with even powers of σ3 , σ4 contribute and
Σ_{σ3,σ4=±1} cosh³K (1 + xσ1′σ3)(1 + xσ3σ4)(1 + xσ4σ2′) = 4 cosh³K (1 + x³σ1′σ2′)
has the form of the Boltzmann factor exp(K ′ σ1′ σ2′ ) with the re-normalized
constant K′ = tanh⁻¹(tanh³K), or x′ = x³. Note that T → ∞ corresponds
to x → 0+ and T → 0 to x → 1−. One is interested in the set of the
parameters which does not change under the RG, i.e. represents a fixed point
of this transformation. Both x = 0 and x = 1 are fixed points, the first one
stable and the second one unstable. Indeed, after iterating the process we see
that x approaches zero and effective temperature infinity. That means that
large-scale degrees of freedom are described by the partition function where
the effective temperature is high so the system is in a paramagnetic state.
We see that there is no phase transition since there is no long-range order
for any T (except exactly for T = 0). RG can be useful even without critical
behavior, for example, the correlation length measured in lattice units must
satisfy rc(x′) = rc(x³) = rc(x)/3, which has the solution rc(x) ∝ 1/ln(1/x), an exact
result for the 1d Ising model. It diverges at x → 1 (T → 0) as exp(2K) = exp(J/T), in
agreement with the general argument of Sec. 4.2.2.
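The decimation step is easy to check by brute force: summing the Boltzmann weight over the two eliminated spins must produce a factor of the form exp(K′σ1′σ2′) with tanh K′ = tanh³K. A minimal sketch in plain Python (function names are mine):

```python
import math

def decimate(K):
    # Sum the Boltzmann weight exp[K(s1'*s3 + s3*s4 + s4*s2')] over the two
    # eliminated spins s3, s4, for the two values of the product s1'*s2'.
    def partial(s2p):
        return sum(math.exp(K * (1 * s3 + s3 * s4 + s4 * s2p))
                   for s3 in (-1, 1) for s4 in (-1, 1))
    # The result has the form A*exp(K' s1' s2'); K' follows from the ratio.
    return 0.5 * math.log(partial(1) / partial(-1))

K = 0.7
# Renormalized coupling obeys tanh(K') = tanh(K)^3, i.e. x' = x^3.
assert abs(math.tanh(decimate(K)) - math.tanh(K) ** 3) < 1e-12
# Iteration drives x -> 0 (infinite effective temperature): no 1d transition.
x = math.tanh(K)
for _ in range(5):
    x = math.tanh(decimate(math.atanh(x)))
assert x < 1e-3
```

Iterating the map shows the flow toward the stable paramagnetic fixed point x = 0 described in the text.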
(Figure: blocks of three spins σ1 σ2 σ3 and σ4 σ5 σ6; decimation in 1d and its 2d analog.)
The picture of RG flow is different in higher dimensions. Indeed, in 1d
in the low-temperature region (x ≈ 1, K → ∞) the interaction constant K
is not changed upon renormalization: K ′ ≈ K⟨σ3 ⟩σ2 =1 ⟨σ4 ⟩σ5 =1 ≈ K. This is
clear because the interaction between k-blocks is mediated by their boundary
spins (which all point in the same direction). In d dimensions, there are k^{d−1}
spins on the block side, so that K′ ∝ k^{d−1} K as K → ∞. That means that
K′ > K, that is, the low-temperature fixed point is stable at d > 1. On
the other hand, the paramagnetic fixed point K = 0 is stable too, so that
there must be an unstable fixed point in between at some Kc which precisely
corresponds to Tc . Indeed, consider rc (K0 ) ∼ 1 at some K0 that corresponds
to sufficiently high temperature, K0 < Kc . Since rc (K) ∼ k n(K) , where
n(K) is the number of RG iterations one needs to come from K to K0 ,
and n(K) → ∞ as K → Kc then rc → ∞ as T → Tc . Critical exponent
ν = −d ln rc/d ln t is expressed via the derivative of the RG at Tc. Indeed, denote
dK′/dK = k^y at K = Kc. Since rc(K) = k rc(K′) and K′ − Kc ≈ k^y (K − Kc),
it follows that rc ∝ |K − Kc|^{−1/y}, i.e. ν = 1/y. We see that
in general, the RG transformation of the set of parameters K is nonlinear.
Linearizing it near the fixed point one can find the critical exponents from
the eigenvalues of the linearized RG and, more generally, classify different
types of behavior. That requires generally the consideration of RG flows in
multi-dimensional spaces.
(Figure: RG flow with two couplings K1, K2; the projection on the K1, K2 plane
shows the critical surface and the fixed point.)
contracting stable directions, like the projection on K1 , K2 plane shown in
the Figure. The line of points attracted to the fixed point is the projection of
the critical surface, so called because the long-distance properties of each sys-
tem corresponding to a point on this surface are controlled by the fixed point.
The critical surface is a separatrix, dividing points that flow to high-T (para-
magnetic) behavior from those that flow to low-T (ferromagnetic) behavior
at large scales. We can now understand the universality of critical behavior in
the sense that systems in different regions of the parameter K-space flow to
the same fixed point and thus have the same exponents. Indeed, changing
the temperature in a system with only nearest-neighbor coupling, we move
along the line K2 = 0. The point where this line meets critical surface defines
K1c and respective Tc1 . At that temperature, the large-scale behavior of the
system is determined by the RG flow i.e. by the fixed point. In another sys-
tem with nonzero K2, by changing T we move along some other path in the
parameter space, indicated by the broken line in the figure. Intersection of
this line with the critical surface defines some other critical temperature Tc2 .
But the long-distance properties of this system are again determined by the
same fixed point i.e. all the critical exponents are the same. For example, the
critical exponents of a simple fluid are the same as of a uniaxial ferromagnet.
See Cardy, Sect. 3 and http://www.weizmann.ac.il/home/fedomany/
5.4 Response and fluctuations
The mean squared thermodynamic fluctuation of any quantity is determined
by the second derivative of the thermodynamic potential with respect to
this quantity. Those second derivatives are related to susceptibilities with
respect to the properly defined external forces. One can formulate a general
relation. Consider a system with the Hamiltonian H and add some small
static external force f so that the Hamiltonian becomes H − xf where x is
called the coordinate. The examples of force-coordinate pairs are magnetic
field and magnetization, pressure and volume etc. The mean value of any
other variable B can be calculated by the canonical distribution with the
new Hamiltonian:
B̄ = Σ B exp[(xf − H)/T] / Σ exp[(xf − H)/T] .
Note that we assume that the perturbed state is also in equilibrium. The
susceptibility of B with respect to f is as follows
χ ≡ ∂B̄/∂f = (⟨Bx⟩ − B̄x̄)/T ≡ ⟨Bx⟩c/T .    (150)
Here the cumulant (also called the irreducible correlation function) is defined
for quantities with the subtracted mean values ⟨xy⟩c ≡ ⟨(x − x̄)(y − ȳ)⟩ and
it is thus the measure of statistical correlation between x and y. We thus
learn that the susceptibility is the measure of the statistical coherence of the
system, increasing with the statistical dependence of the variables. Consider
a few examples of this relation.
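Relation (150) can be verified by exact enumeration for a toy system. Below, x is the total magnetization of a few independent Ising spins and B = x, so the susceptibility should equal both the numerical derivative ∂x̄/∂f and ⟨x²⟩c/T (a sketch; the model choice and names are mine):

```python
import math
from itertools import product

def moments(f, T, n=5):
    # n independent Ising spins; unperturbed H = 0, coordinate x = sum of
    # spins, perturbed Boltzmann weight exp[(x*f - H)/T]; here B = x.
    Z = xbar = x2bar = 0.0
    for spins in product((-1, 1), repeat=n):
        x = sum(spins)
        w = math.exp(x * f / T)
        Z += w
        xbar += x * w
        x2bar += x * x * w
    return xbar / Z, x2bar / Z

T, f, df = 2.0, 0.3, 1e-6
xb, x2b = moments(f, T)
chi_fluct = (x2b - xb * xb) / T      # <x x>_c / T, right-hand side of (150)
chi_deriv = (moments(f + df, T)[0] - moments(f - df, T)[0]) / (2 * df)
assert abs(chi_fluct - chi_deriv) < 1e-4
```

The fluctuation formula and the direct derivative agree to the accuracy of the finite difference.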
The mean linear response can be written as an integral with the response
(Green) function:
ā(r) = ∫ G(r − r′) f(r′) dr′ ,   āk = Gk fk .    (151)
One relates the Fourier components of the Green function and the correlation
function of the coordinate fluctuations choosing B = ak , x = a−k in (150):
V Gk = (1/T) ∫∫ ⟨a(r)a(r′)⟩c e^{ik·(r′−r)} dr dr′ = (V/T) ∫ ⟨a(r)a(0)⟩c e^{−ik·r} dr ,
i.e. T Gk = (a²)k .    (152)
4. If B = x = N then f is the chemical potential µ:
(∂N/∂µ)_{T,V} = ⟨N²⟩c/T = ⟨(∆N)²⟩/T = (V/T) ∫ ⟨n(r)n(0)⟩c dr .
This formula coincides with (140) if one accounts for
−n² (∂V/∂P)_{T,N} = N (∂n/∂P)_{T,N} = n (∂N/∂P)_{T,V} ,
(∂N/∂µ)_{T,V} = (∂N/∂P)_{T,V} (∂P/∂µ)_{T,V} = n (∂N/∂P)_{T,V} ,
where the last step used the Gibbs-Duhem relation (∂P/∂µ)_T = N/V = n.
Hence the response of the density to the pressure is related to the density
fluctuations.
These fluctuation-response relations can be related to the change of the
thermodynamic potential (free energy) under the action of the force:
F = −T ln Z = −T ln Σ exp[(xf − H)/T]
= −T ln Z₀ − T ln⟨exp(xf/T)⟩₀ = F₀ − f⟨x⟩₀ − (f²/2T)⟨x²⟩₀c + . . .    (153)
⟨x⟩ = −∂F/∂f ,   ⟨x²⟩c/T = ∂⟨x⟩/∂f = −∂²F/∂f² .    (154)
The subscript 0 means an average over the state with f = 0; we do not write it
in (154), which holds around any value of f. Formula
(153) is based on the cumulant expansion theorem (which we already used
implicitly in making virial expansion in Sect. 3.2):
⟨exp(ax)⟩ = Σ_{n=0}^{∞} (aⁿ/n!)⟨xⁿ⟩ ,   ln⟨exp(ax)⟩ = Σ_{n=1}^{∞} (aⁿ/n!)⟨xⁿ⟩c = ⟨e^{ax} − 1⟩c .    (155)
We now use (158) to derive the relation between the fluctuations and
response in the time-dependent case. Indeed, the linear response of the co-
ordinate to the force is as follows
⟨x(t)⟩ ≡ ∫_{−∞}^{t} α(t, t′) f(t′) dt′ = ∫ x ρ₁(x, t) dx ,    (159)
We can now calculate the average dissipation using (158)
dE/dt = −∫ x ḟ ρ₁ dp dx = βω² |fω|² (x²)ω ,    (164)
where the spectral density of the fluctuations is calculated with ρ0 (i.e. at
unperturbed equilibrium). Comparing (163) and (164) we obtain the spectral
form of the fluctuation-dissipation theorem (Callen and Welton, 1951):
2T α′′ (ω) = ω(x2 )ω . (165)
We can also obtain it directly by making a Fourier transform of (160):
Tα(ω)/iω = ∫_{0}^{∞} ⟨x(0)x(t)⟩ e^{iωt} dt ,
(x²)ω = ∫_{0}^{∞} ⟨x(0)x(t)⟩ e^{iωt} dt + ∫_{−∞}^{0} ⟨x(0)x(t)⟩ e^{iωt} dt
= T[α(ω) − α(−ω)]/iω = 2Tα′′(ω)/ω .
This truly amazing formula relates the dissipation coefficient that governs
non-equilibrium kinetics under the external force with the equilibrium fluc-
tuations. The physical idea is that to know how a system reacts to a force
one might as well wait until the fluctuation appears which is equivalent to
the result of that force. Note that the force f disappeared from the final
result which means that the relation is true even when the (equilibrium)
fluctuations of x are not small. Integrating (165) over frequencies we get
⟨x²⟩ = ∫_{−∞}^{∞} (x²)ω dω/2π = (T/π) ∫_{−∞}^{∞} α′′(ω) dω/ω = (T/ıπ) ∫_{−∞}^{∞} α(ω) dω/ω = Tα(0) .    (166)
The spectral density has a universal form in the low-frequency limit when
the period of the force is much longer than the relaxation time for establishing
the partial equilibrium characterized by the given value x̄ = α(0)f . In this
case, the evolution of x is the relaxation towards x̄:
ẋ = −λ(x − x̄) . (167)
For harmonics, (λ − iω) xω = λx̄ = λα(0) fω , so that
α(ω) = α(0) λ/(λ − iω) ,   α′′(ω) = α(0) λω/(λ² + ω²) .    (168)
The spectral density of such (so-called quasi-stationary) fluctuations is as
follows:
(x²)ω = ⟨x²⟩ · 2λ/(λ² + ω²) .    (169)
It corresponds to the long-time exponential decay of the temporal correla-
tion function: ⟨x(t)x(0)⟩ = ⟨x2 ⟩ exp(−λ|t|). That exponent is a temporal
analog of the large-scale formula (146). The non-smooth behavior at zero is an
artefact of the long-time approximation; a consistent consideration would give
zero derivative at t = 0.
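Both statements — the sum rule (166) and the fact that the Lorentzian (169) is the Fourier transform of ⟨x²⟩e^{−λ|t|} — can be checked numerically (a sketch; the parameter values are arbitrary):

```python
import math

lam, var = 1.7, 0.9            # relaxation rate lambda and variance <x^2>

def spectrum(w):
    # Quasi-stationary spectral density (169): (x^2)_w = <x^2> 2*lam/(lam^2+w^2)
    return var * 2 * lam / (lam**2 + w**2)

# Sum rule (166): integrating (x^2)_w over dw/2pi must give back <x^2>.
W, N = 4000.0, 400000
dw = 2 * W / N
total = sum(spectrum(-W + (i + 0.5) * dw) for i in range(N)) * dw / (2 * math.pi)
assert abs(total - var) < 1e-3

# The Lorentzian is the Fourier transform of <x^2> exp(-lam|t|).
w0, Tmax, M = 0.8, 40.0, 400000
dt = Tmax / M
ft = 2 * sum(var * math.exp(-lam * (j + 0.5) * dt) * math.cos(w0 * (j + 0.5) * dt)
             for j in range(M)) * dt
assert abs(ft - spectrum(w0)) < 1e-6
```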
When several degrees of freedom are weakly deviated from equilibrium,
the relaxation must be described by the system of linear equations (consider
all xᵢ = 0 at the equilibrium): ẋᵢ = −λᵢⱼ xⱼ .    (170)
The dissipation coefficients are generally non-symmetric: λij ≠ λji. One can,
however, find proper coordinates in which the coefficients are symmetric.
The single-time probability distribution of small fluctuations is Gaussian, w(x) ∼
exp(∆S) ≈ exp(−βjk xj xk /2). Introduce the generalized forces Xj = −∂S/∂xj =
βjk xk, so that ẋi = −γij Xj with γij = λik (β̂⁻¹)kj. Then ⟨xi Xj⟩ = ∫ dx xi Xj w =
−∫ dx xi ∂w/∂xj = δij — we have seen that the coordinates and the gener-
alized forces do not cross-correlate already in the simplest case of uniform
fluctuations described by (137), which gave ⟨∆T ∆V⟩ = 0, for instance. Re-
turning to the general case, note also that ⟨Xi Xj⟩ = βij and ⟨xj xk⟩ = (β̂⁻¹)jk.
If xi all have the same properties with respect to the time reversal then their
correlation function is symmetric too: ⟨xi (0)xk (t)⟩ = ⟨xi (t)xk (0)⟩. Differen-
tiating it with respect to t at t = 0 we get the Onsager symmetry principle,
γik = γki . For example, the conductivity tensor is symmetric in crystals
without magnetic field. Also, a temperature difference produces the same
electric current as the heat current produced by a voltage. Such symmetry
relations due to time reversibility are valid only near equilibrium steady state
and are manifestations of the detailed balance (i.e. absence of any persistent
currents in the phase space).
See Landau & Lifshitz, Sects. 119-120 for the details and Sect. 124 for the
quantum case. Also Kittel Sects. 33-34.
5.6 Brownian motion
The momentum of a particle in a fluid, p = M v, changes because of collisions
with the molecules. When the particle is much heavier than the molecules,
its velocity is small compared to the typical velocities of the molecules.
Then one can write the force acting on it as a Taylor expansion with the parts
independent of p and linear in p:
ṗ = −λp + f . (171)
We now assume that ⟨f⟩ = 0 and that ⟨f(t′) · f(t′ + t)⟩ = 3C(t) decays with
t during the correlation time τ, which is much smaller than λ⁻¹. Since the
integration time in (172) is of order λ⁻¹, the condition λτ ≪ 1 means
that the momentum of a Brownian particle can be considered as a sum of
many independent random numbers (integrals over intervals of order τ ) and
so it must have a Gaussian statistics ρ(p) = (2πσ 2 )−3/2 exp(−p2 /2σ 2 ) where
σ² = ⟨px²⟩ = ⟨py²⟩ = ⟨pz²⟩ = ∫₀^∞ ∫₀^∞ C(t₁ − t₂) e^{−λ(t₁+t₂)} dt₁ dt₂
≈ ∫₀^∞ e^{−2λt} dt ∫_{−2t}^{2t} C(t′) dt′ ≈ (1/2λ) ∫_{−∞}^{∞} C(t′) dt′ .    (173)
On the other hand, equipartition guarantees that ⟨p2x ⟩ = M T so that we
can express the friction coefficient via the correlation function of the force
fluctuations (a particular case of the fluctuation-dissipation theorem):
λ = (1/2TM) ∫_{−∞}^{∞} C(t′) dt′ .    (174)
The displacement
∆r = r(t + t′) − r(t) = ∫₀^{t′} v(t′′) dt′′
is also Gaussian with a zero mean. To get its second moment we need the
different-time correlation function of the velocities
which can be obtained from (172)¹⁴. That gives
⟨(∆r)²⟩ = ∫₀^{t′} dt₁ ∫₀^{t′} dt₂ ⟨v(t₁) · v(t₂)⟩ = 6T t′/Mλ ,
and the probability distribution of displacement,
ρ(∆r, t′) = (4πDt′)^{−3/2} exp[−(∆r)²/4Dt′] ,
that satisfies the diffusion equation ∂ρ/∂t′ = D∇²ρ with the diffusivity D =
T/Mλ — the Einstein relation. If we have many particles initially distributed
according to n(r, 0), then their distribution at any time, n(r, t) = ∫ ρ(r −
r′, t) n(r′, 0) dr′, also satisfies the diffusion equation: ∂n/∂t = D∇²n.
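The Einstein relation can be checked by direct simulation of the Langevin equation (171). A minimal 1d sketch (discretization, seed, and parameter values are mine; the noise strength is fixed by (174) so that equipartition holds):

```python
import math, random

random.seed(1)
M, T, lam = 1.0, 1.0, 1.0           # particle mass, temperature, friction
dt, steps, n_part = 0.02, 500, 2000
t_final = dt * steps
D = T / (M * lam)                    # Einstein relation for the diffusivity

x2_sum = p2_sum = 0.0
for _ in range(n_part):
    p = x = 0.0
    for _ in range(steps):
        # Discretized Langevin equation (171): the kick variance 2*lam*M*T*dt
        # is fixed by (174) so that equipartition <p^2> = MT holds.
        p += -lam * p * dt + math.sqrt(2 * lam * M * T * dt) * random.gauss(0, 1)
        x += (p / M) * dt
    x2_sum += x * x
    p2_sum += p * p

x2_mean, p2_mean = x2_sum / n_part, p2_sum / n_part
e = math.exp(-lam * t_final)
# Exact 1d displacement variance for a particle starting from rest:
# <x^2> = 2D[t - (1-e)/lam] - (D/lam)(1-e)^2  ->  2Dt at t >> 1/lam.
x2_exact = 2 * D * (t_final - (1 - e) / lam) - (D / lam) * (1 - e) ** 2
assert abs(p2_mean - M * T) / (M * T) < 0.15
assert abs(x2_mean - x2_exact) / x2_exact < 0.15
```

The tolerances are statistical; with more particles the agreement tightens as 1/√n_part.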
Ma, Sect. 12.7
¹⁴Note that the friction makes the velocity correlated on a longer timescale than the force.
6 Kinetics
Here we consider non-equilibrium behavior of a rarefied classical gas.
Note that we assumed here that the particle velocity is independent of the po-
sition and that two particles are statistically independent, that is, the prob-
ability to find two particles simultaneously is the product of single-particle
probabilities. This is sometimes called the hypothesis of molecular chaos and
has been proved only for a few simple cases. We believe that (176) must work
well when the distribution function evolves on a time scale much longer than
that of a single collision. Since w ∝ |v − v1|, one may introduce the scat-
tering cross-section dσ = w dv′ dv1′/|v − v1|, which in principle can be found
for any given law of particle interaction by solving a kinematic problem. Here
we describe the general properties. Since mechanical laws are time reversible
then
w(−v, −v1 ; −v′ , −v1′ ) = w(v′ , v1′ ; v, v1 ) . (177)
If, in addition, the medium is invariant with respect to inversion r → −r
then we have the detailed equilibrium:
and so the sums are equal to each other:
∫ w(v, v1; v′, v1′) dv′ dv1′ = ∫ w(v′, v1′; v, v1) dv′ dv1′ .    (179)
We can now write the collision term as the difference between the number of
particles coming and leaving the given region of phase space around v:
I = ∫ (w′ f′ f1′ − w f f1) dv1 dv′ dv1′ = ∫ w′ (f′ f1′ − f f1) dv1 dv′ dv1′ .    (180)
Here we used (179) in transforming the second term. We can now write the
famous Boltzmann kinetic equation (1872)
∂f/∂t + v · ∂f/∂r + (F/m) · ∂f/∂v = ∫ w′ (f′ f1′ − f f1) dv1 dv′ dv1′ ,    (181)
6.2 H-theorem
The entropy of the ideal classical gas can be derived for an arbitrary (not
necessarily equilibrium) distribution in the phase space. Consider an element
dp dr, which has Gi = dp dr/h³ states and Ni = f Gi particles. The entropy of
the element is Si = ln(Gi^{Ni}/Ni!) ≈ Ni ln(eGi/Ni) = f ln(e/f) dp dr/h³. We write
the total entropy up to the factor m³/h³: S = ∫ f ln(e/f) dr dv. Let us look
at the evolution of the entropy
dS/dt = −∫ (∂f/∂t) ln f dr dv = −∫ I ln f dr dv ,    (182)
since
∫ ln f [v · ∂f/∂r + (F/m) · ∂f/∂v] dr dv = ∫ [v · ∂/∂r + (F/m) · ∂/∂v] f ln(f/e) dr dv = 0 .
The integral (182) contains the integrations over all velocities so we may
exploit two interchanges, v1 ↔ v and v, v1 ↔ v′ , v1′ :
dS/dt = ∫ w′ ln f (f f1 − f′ f1′) dv dv1 dv′ dv1′ dr
= (1/2) ∫ w′ ln(f f1) (f f1 − f′ f1′) dv dv1 dv′ dv1′ dr
= (1/2) ∫ w′ ln(f f1/f′ f1′) f f1 dv dv1 dv′ dv1′ dr ≥ 0 ,    (183)
Here we may add the integral (1/2) ∫ w′ (f f1 − f′ f1′) dv dv1 dv′ dv1′ dr = 0 and
then use the inequality x ln x − x + 1 ≥ 0 with x = f f1/f′ f1′. Note that
entropy production is positive in every element dr.
Even though we use scattering cross-sections obtained from mechanics
reversible in time, our use of the molecular-chaos hypothesis made the kinetic
equation irreversible. The reason for irreversibility is coarse-graining, that is,
finite resolution in space and time, as was explained in Sect. 1.6.
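The monotonic entropy growth can be illustrated with a toy relaxation model: replace the collision integral by a BGK-type relaxation toward equilibrium (my simplification, not the full Boltzmann I) and watch the entropy grow:

```python
import math

def entropy(p):
    return -sum(q * math.log(q) for q in p if q > 0)

# Toy "collision" term of BGK form: f -> f + nu*(feq - f), with the
# equilibrium feq uniform over four discrete velocities of equal energy.
f = [0.7, 0.2, 0.05, 0.05]
feq = [0.25] * 4
nu = 0.3
S_prev = entropy(f)
for _ in range(30):
    f = [fi + nu * (ei - fi) for fi, ei in zip(f, feq)]
    S = entropy(f)
    assert S >= S_prev - 1e-12       # H-theorem: entropy never decreases
    S_prev = S
assert abs(sum(f) - 1) < 1e-12       # relaxation conserves particle number
assert abs(S_prev - entropy(feq)) < 1e-4   # f has reached equilibrium
```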
Equilibrium distribution realizes the entropy maximum and so must be a
steady solution of the Boltzmann equation. Indeed, the equilibrium dis-
tribution depends only on the integrals of motion. For any function of
the conserved quantities, the left-hand-side of (181) (which is a total time
derivative) is zero. Also the collision integral turns into zero by virtue of
f0(v)f0(v1) = f0(v′)f0(v1′), since ln f0 is a linear function of the integrals
of motion, as was explained in Sect. 1.1. Note that all this is true also for the
inhomogeneous equilibrium in the presence of an external force.
density ρ = m ∫ f(r, v, t) dv and velocity u = ∫ vf dv / ∫ f dv. Collisions do not change
total number of particles, momentum and energy so that if we multiply (181)
respectively by m, mvα , ϵ and integrate over dv we get three conservation laws
(mass, momentum and energy):
∂ρ/∂t + div ρu = 0 ,    (184)
∂ρuα/∂t = nFα − ∂/∂xβ ∫ m vα vβ f dv ≡ nFα − ∂Pαβ/∂xβ ,    (185)
∂nϵ̄/∂t = n(F · u) − div ∫ ϵ v f dv ≡ n(F · u) − div q ,    (186)
While the form of those equations is suggestive, to turn them into hydro-
dynamic equations ready for practical use, one needs to find f and express
the tensor of momentum flux Pαβ and the vector of the energy flux q via
the macroscopic quantities ρ, u, nϵ̄. Since we consider situations when ρ and
u are both inhomogeneous then the system is clearly not in equilibrium.
Closed macroscopic equations can be obtained when those inhomogeneities
are smooth so that in every given region (much larger than the mean free path
but much smaller than the scale of variations in ρ and u) the distribution is
close to equilibrium.
At the first step we assume that f = f0 which (as we shall see) means
neglecting dissipation and obtaining so-called ideal hydrodynamics. Equilib-
rium in the piece moving with the velocity u just corresponds to the changes
v = v′ + u and ϵ = ϵ′ + m(u · v′) + mu²/2, where primed quantities relate to the
co-moving frame where the distribution is isotropic and ⟨vα′ vβ′⟩ = ⟨v′²⟩δαβ/3.
The fluxes are thus
volume, which turns into surface integrals of two terms. One is unϵ̄, which is
the energy brought into the volume; another is the pressure term P u, which
gives the work done. The closed first-order equations (184-188) constitute
ideal hydrodynamics. While we derived them only for a dilute gas, they are
used for liquids as well, which can be argued heuristically.
We split the energy of a molecule into kinetic and internal: ϵ = ϵi + mv 2 /2.
Having in mind both viscosity and thermal conductivity, we assume all macro-
scopic parameters to be functions of coordinates and put F = 0. We can
simplify calculations by doing them at a point where u = 0, because the answer
must depend only on velocity gradients. Differentiating (192) one gets
(T/f₀) ∂f₀/∂t = [(∂µ/∂T)_P − (µ − ϵ)/T] ∂T/∂t + (∂µ/∂P)_T ∂P/∂t + mv · ∂u/∂t
= [(ϵ − w)/T] ∂T/∂t + (1/n) ∂P/∂t + mv · ∂u/∂t ,
(T/f₀) v · ∇f₀ = [(ϵ − w)/T] v · ∇T + (1/n) v · ∇P + m va vb uab .
Here uab = (∂ua /∂xb + ∂ub /∂xa )/2. We now add those expressions and
substitute the time derivatives from the ideal expressions (184-188): ∂u/∂t = −ρ⁻¹∇P,
ρ⁻¹ ∂ρ/∂t = (T/P) ∂(P/T)/∂t = −div u,
∂s/∂t = (∂s/∂T)_P ∂T/∂t + (∂s/∂P)_T ∂P/∂t = (cp/T) ∂T/∂t − P⁻¹ ∂P/∂t , etc.
After some manipulations one gets the kinetic equation (for the classical gas
with w = cp T ) in the following form:
(ϵ/T − cp) v · ∇T + (m va vb − δab ϵ/cv) uab = I(χ) .    (193)
The expansion in gradients or in the parameter l/L where l is the mean-
free path and L is the scale of velocity and temperature variations is called
Chapman-Enskog method (1917). Note that the pressure gradient cancels out,
which means that it does not lead to deviations in the distribution (and
to dissipation).
Thermal conductivity. Put uab = 0. The solution of the linear integral equation
(ϵ − cp T) v · ∇T = T I(χ) has the form χ(r, v) = g(v) · ∇T(r). One can find
g by specifying the scattering cross-section for any material. In the simplest
case of the τ-approximation, g = v(mv²/2T − 5/2)τ¹⁵. And generally, one
can estimate g ≃ l and obtain the applicability condition for the Chapman-
Enskog expansion: χ ≪ T ⇒ l ≪ L ≡ T/|∇T| .
The correction χ to the distribution makes for the correction to the energy
flux (which for u = 0 is the total flux):
√
1 ∫ v̄ 1 T
q = −κ∇T , κ=− f0 ϵ(v · g) dv ≃ lv̄ ≃ ≃ . (194)
3T nσ nσ m
¹⁵Check that it satisfies (191), i.e. ∫ f₀ (v · g) dv = 0.
Note that the thermal conductivity κ does not depend on the gas density
(or pressure). This is because we accounted only for binary collisions, which
is OK for a dilute gas.
Viscosity. We put ∇T = 0 and separate the compressible part div u
from other derivatives which turns (193) into
m va vb (uab − δab div u/3) + (mv²/3 − ϵ/cv) div u = I(χ) .    (195)
The two terms on the left-hand side give χ = gab uab + g′ div u, which give the
following viscous contributions to the momentum flux Pab :
¹⁶ζ = 0 for monatomic gases, which have ϵ = mv²/2, cv = 3/2, so that the second term
on the lhs of (195) vanishes.
Cited basic books
L. D. Landau and E. M. Lifshitz, Statistical Physics Part 1, 3rd edition
(Course of Theor. Phys, Vol. 5).
R. K. Pathria, Statistical Mechanics.
R. Kubo, Statistical Mechanics.
K. Huang, Statistical Mechanics.
C. Kittel, Elementary Statistical Physics.
Additional reading
S.-K. Ma, Statistical Mechanics.
E. M. Lifshitz and L.P. Pitaevsky, Physical Kinetics.
A. Katz, Principles of Statistical Mechanics.
J. Cardy, Scaling and renormalization in statistical physics.
M. Kardar, Statistical Physics of Particles, Statistical Physics of Fields.
J. Sethna, Entropy, Order Parameters and Complexity.
Exam 2007
1. A lattice in one dimension has N sites and is at temperature T.
At each site there is an atom which can be in either of two energy states:
Ei = ±ϵ. When L consecutive atoms are in the +ϵ state, we say that they
form a cluster of length L (provided that the atoms adjacent to the ends of
the cluster are in the state −ϵ). In the limit N → ∞,
a) Compute the probability PL that a given site belongs to a cluster of
length L (don't forget to check that Σ_{L=0}^{∞} PL = 1);
b) Calculate the mean length of a cluster ⟨L⟩ and determine its low- and
high-temperature limits.
2. Consider a box containing an ideal classical gas at pressure P and
temperature T. The walls of the box have N0 absorbing sites, each of which
can absorb at most two molecules of the gas. Let −ϵ be the energy of an
absorbed molecule. Find the mean number of absorbed molecules ⟨N ⟩. The
dimensionless ratio ⟨N ⟩/N0 must be a function of a dimensionless parameter.
Find this parameter and consider the limits when it is small and large.
3. Consider the spin-1 Ising model on a cubic lattice in d dimensions,
given by the Hamiltonian
H = −J Σ_{⟨i,j⟩} Si Sj − ∆ Σᵢ Si² − h Σᵢ Si ,
where Si = 0, ±1, Σ_{⟨ij⟩} denotes a sum over the z nearest-neighbor sites, and
J, ∆ > 0.
(a) Write down the equation for the magnetization m = ⟨Si ⟩ in the mean-
field approximation.
(b) Calculate the transition line in the (T, ∆) plane (take h = 0) which
separates the paramagnetic and the ferromagnetic phases. Here T is
the temperature.
phase is given by
χ = (1/kB T) [1 + (1/2) e^{−β∆} − Jz/kB T]^{−1} .
Answers
Problem 1.
a) The probabilities for a given site to have energy ±ϵ are
P± = e^{∓βϵ} (e^{βϵ} + e^{−βϵ})^{−1} .
The probability for a given site to belong to an L-cluster is PL = L P+^L P−²
for L ≥ 1, since sites are independent and we also need two adjacent sites to
have −ϵ. The cluster of zero length corresponds to a site having −ϵ, so that
PL = P− for L = 0. We ignore the possibility that a given site is within L
of the ends of the lattice, which is legitimate at N → ∞.
Σ_{L=0}^{∞} PL = P− + P−² Σ_{L=1}^{∞} L P+^L = P− + P−² P+ (∂/∂P+) Σ_{L=1}^{∞} P+^L
= P− + P−² P+/(1 − P+)² = P− + P+ = 1 .
b)
⟨L⟩ = Σ_{L=0}^{∞} L PL = P−² P+ (∂/∂P+) Σ_{L=1}^{∞} L P+^L = P+(1 + P+)/P− = e^{−2βϵ} (e^{βϵ} + 2e^{−βϵ})/(e^{βϵ} + e^{−βϵ}) .
At T = 0 all sites are in the lower level and ⟨L⟩ = 0. As T → ∞, the proba-
bilities P+ and P− are equal and the mean length approaches its maximum
⟨L⟩ = 3/2.
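The closed-form answers can be verified against a direct (truncated) summation of the series (a sketch; parameter values are arbitrary):

```python
import math

def cluster_stats(beta, eps, Lmax=2000):
    Z = math.exp(beta * eps) + math.exp(-beta * eps)
    P_plus = math.exp(-beta * eps) / Z      # Boltzmann weight of the +eps state
    P_minus = math.exp(beta * eps) / Z
    # P_0 = P_minus and P_L = L * P_plus^L * P_minus^2 for L >= 1
    PL = [P_minus] + [L * P_plus**L * P_minus**2 for L in range(1, Lmax)]
    return PL, P_plus, P_minus

beta, eps = 0.5, 1.0
PL, Pp, Pm = cluster_stats(beta, eps)
assert abs(sum(PL) - 1) < 1e-10                      # normalization
L_mean = sum(L * p for L, p in enumerate(PL))
assert abs(L_mean - Pp * (1 + Pp) / Pm) < 1e-10      # closed form for <L>
# High-temperature limit: P_plus -> 1/2 and <L> -> 3/2
PL_hot, _, _ = cluster_stats(1e-9, eps)
assert abs(sum(L * p for L, p in enumerate(PL_hot)) - 1.5) < 1e-3
```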
Problem 2.
Since each absorbing site is in equilibrium with the gas, the site and
the gas must have the same chemical potential µ and the same temperature
T. The fugacity of the gas z = exp(βµ) can be expressed via the pressure
from the grand canonical partition function:
Zg(T, V, µ) = exp[zV (2πmT)^{3/2} h^{−3}] ,   P V = T ln Zg = zV T^{5/2} (2πm)^{3/2} h^{−3} .
The grand canonical partition function of an absorbing site, Zsite = 1 + z e^{βϵ} +
z² e^{2βϵ}, gives the average number of absorbed molecules per site:
⟨N⟩/N₀ = z ∂ ln Zsite/∂z = (x + 2x²)/(1 + x + x²) ,
where the dimensionless parameter is x = P T −5/2 eβϵ h3 (2πm)−3/2 . The limits
are ⟨N ⟩/N0 → 0 as x → 0 and ⟨N ⟩/N0 → 2 as x → ∞.
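The crossover between the two limits is monotonic in the parameter x, as a quick numerical check confirms (a sketch):

```python
def fill_fraction(x):
    # <N>/N0 = z d(ln Z_site)/dz with Z_site = 1 + x + x^2, x = z*exp(beta*eps)
    return (x + 2 * x**2) / (1 + x + x**2)

assert fill_fraction(1e-8) < 1e-6            # hot/dilute limit: empty sites
assert abs(fill_fraction(1e8) - 2) < 1e-6    # cold/dense limit: double occupancy
vals = [fill_fraction(10.0**k) for k in range(-3, 4)]
assert all(a < b for a, b in zip(vals, vals[1:]))   # monotonic crossover
```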
Problem 3.
a) Hef f (S) = −JmzS − ∆S 2 − hS, S = 0, ±1.
m = e^{β∆} [e^{β(Jzm+h)} − e^{−β(Jzm+h)}] / {1 + e^{β∆} [e^{β(Jzm+h)} + e^{−β(Jzm+h)}]} .
b) h = 0,
m ≈ e^{β∆} [2βJzm + (βJzm)³/3] / {1 + 2e^{β∆} [1 + (βJzm)²/2]} .
Transition line: βc Jz = 1 + (1/2) e^{−βc∆}. At ∆ → ∞ it turns into the Ising result.
c)
m² = (β − βc)Jz / [(βc Jz)²/2 − (βc Jz)³/6] .
d)
m ≈ e^{β∆} (2βJzm + 2βh)/(1 + 2e^{β∆}) ,   m ≈ 2βh (2 + e^{−β∆} − 2βJz)^{−1} ,   χ = ∂m/∂h .
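The mean-field equation can also be iterated numerically, and the transition temperature extracted this way agrees with the line βcJz = 1 + (1/2)e^{−βc∆} (a sketch; the fixed-point iteration and parameter values are mine):

```python
import math

def m_self(beta, Jz, Delta, h=0.0, iters=500):
    # Iterate the mean-field equation for the spin-1 model (part a):
    # m = e^{bD}(e^a - e^{-a}) / (1 + e^{bD}(e^a + e^{-a})), a = beta(Jzm+h)
    m = 0.5
    for _ in range(iters):
        a = beta * (Jz * m + h)
        eD = math.exp(beta * Delta)
        m = eD * (math.exp(a) - math.exp(-a)) / (1 + eD * (math.exp(a) + math.exp(-a)))
    return m

Jz, Delta = 1.0, 1.0
bc = 1.0                     # solve beta_c*Jz = 1 + exp(-beta_c*Delta)/2
for _ in range(100):
    bc = (1 + 0.5 * math.exp(-bc * Delta)) / Jz
assert m_self(0.95 * bc, Jz, Delta) < 1e-4    # T above Tc: paramagnet
assert m_self(1.05 * bc, Jz, Delta) > 1e-2    # T below Tc: spontaneous m
```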
Problem 4
Since there are 2^5 = 32 different characters, every character brings
5 bits, and the entropy decrease is 5 × 3000/log₂e. The energy emitted by
the lamp, P t, brings the entropy increase P t/kT, which is 100 × 100 × 10^{23} ×
log₂e/(1.38 × 300 × 5 × 3000) ≃ 10^{20} times larger.
Exam 2009
4. One may think of certain networks, such as the internet, as a directed
graph G of N vertices. Every pair of vertices, say i and j, can be connected by
multiple edges (e.g. hyperlinks), and loops may connect vertices to themselves.
The graph can therefore be described in terms of an adjacency matrix Aij
with N 2 elements; each Aij counts the number of edges connecting i and j
and can be any non-negative integer, 0, 1, 2...
The entropy of the ensemble of all possible graphs is S = −Σ_G pG ln pG =
−Σ_{Aij} p(Aij) ln p(Aij). Consider such an ensemble with the fixed average
number of edges per vertex ⟨k⟩.
(i) Write an expression for the number of edges per vertex k for a given
graph Aij . Use the maximum entropy principle to calculate pG (Aij ) and the
partition function Z (denote the Lagrange multiplier that accounts for fixed
⟨k⟩ by τ ). What is the equivalent of the Hamiltonian? What are the degrees
of freedom? What kind of “particles” are they? Are they interacting?
(ii) Calculate the free energy F = − ln Z, and express it in terms of τ . Is
it extensive with respect to any number?
(iii) Write down an expression for the mean occupation number ⟨Aij⟩ as
a function of τ. What is the name of this statistics? What is the "chemical
potential" and why?
(iv) Express F via N and ⟨k⟩. Express pG for a given graph as a function
of k, ⟨k⟩ and N .
Solutions
Problem 1.
a) n(x) = n(0) exp(−βAx2 /2) = 2n(Aβ/2π)1/2 exp(−βAx2 /2).
b) For xᵢ > 0,
H = Σᵢ (pᵢ²/2m + Axᵢ²/2) .
Equipartition gives E = 4(T /2)N = 2T N and Cv = 2N .
c)
ZN = (N!)^{−1} h^{−3N} ∫ Π dp dx exp[−β Σᵢ (pᵢ²/2m + Axᵢ²/2)]
= (N!)^{−1} h^{−3N} (2πmT)^{3N/2} L^{2N} (√(2πT/A)/2)^N ,
F = −T [−3N ln h − N ln N + N + 2N ln L + N ln(2π²T²m^{3/2}/A^{1/2})] ,
µ = ∂F/∂N = −T [−3 ln h − ln N + 2 ln L + ln(2π²T²m^{3/2}/A^{1/2})]
= −T [−3 ln h − ln n + ln(2π²T²m^{3/2}/A^{1/2})] ,   n = N/L² .
Problem 2
Our main task is to find the number of particles per unit time escaping
from the cavity. To this end, we first suppose that the cavity is a cube of side
L and that single-particle wavefunctions satisfy periodic boundary conditions
at its walls. Then the allowed momentum eigenvalues are pi = hni/L (i =
1, 2, 3), where the ni are positive or negative integers. Then (allowing for
two spin polarizations) the number of states with momentum in the range d3 p
is (2V /h3 )d3 p, where V = L3 is the volume. (To an excellent approximation,
this result is independent of the shape of the cavity.) Multiplying by the
grand canonical occupation numbers for these states, we find that the number
of particles per unit volume with momentum in the range d3 p near p is
n(p)d3 p, where
n(p) = (2/h³) {exp[β(ϵ(p) − η)] + 1}^{−1}    (197)
with ϵ(p) = |p|2 /2m. Now we consider, in particular, a small volume enclos-
ing the hole in the cavity wall, and adopt a polar coordinate system with
the usual angles θ and ϕ, such that the axis θ = 0 (the z axis, say) is the
outward normal to the hole (see corresponding figure).
The number of electrons per unit volume whose momentum has a magnitude
between p and p + dp and is directed into a solid angle dΩ = sin θ dθ dϕ
surrounding the direction (θ, ϕ) is
n(p)p2 sin θdpdθdϕ (198)
and, since these electrons have a speed p/m, the number per unit time cross-
ing a unit area normal to the (θ, ϕ) direction is
(1/m) n(p) p³ sin θ dp dθ dϕ .    (199)
The hole subtends an area δA cos θ normal to this direction, so the number
of electrons per unit time passing through the hole with momentum between
p and p + dp into the solid angle dΩ is
(δA/m) n(p) p³ sin θ cos θ dp dθ dϕ .    (200)
It is useful to check this for the case of a classical gas, with n(p) =
n (2πmkB T)^{−3/2} exp(−p²/2mkB T) ,
where n is the total number density, and with V = 0. Bearing in mind that
only those particles escape for which 0 ≤ θ < π/2, we then find that the
total number of particles escaping per unit time is
[δAn/m(2πmkB T)^{3/2}] ∫₀^{∞} dp ∫₀^{π/2} dθ ∫₀^{2π} dϕ p³ exp[−p²/2mkB T] sin θ cos θ = (1/4) n c̄ δA ,    (201)
where c̄ = √(8kB T/πm) is the mean speed. This standard result can be
obtained by several methods in elementary kinetic theory. For the case in
hand, the number of electrons escaping per unit time is
dN/dt = (δA/m) ∫_{√(2mV)}^{∞} dp ∫₀^{π/2} dθ ∫₀^{2π} dϕ p³ n(p) sin θ cos θ    (202)
= (2πδA/mh³) ∫_{√(2mV)}^{∞} p³ [e^{β(ϵ(p)−µ)} + 1]^{−1} dp .    (203)
If V − µ ≫ kB T , then β(ϵ(p) − µ) is large for all values of p in the range of
integration, and we can use the approximation
dN/dt ≃ (2πδA/mh³) ∫_{√(2mV)}^{∞} p³ e^{−(ϵ(p)−µ)β} dp ≃ (πδA/mh³)(2mkB T)² e^{−(V−µ)β} (1 + V/kB T) .    (204)
Since µ is positive for a Fermi gas at low temperatures, we also have V ≫
kB T, so the 1 in (1 + V/kB T) can be neglected. Finally, therefore, on multi-
plying by the charge −e of each electron, we estimate the current as
I = −(4πme/h³) δA V kB T exp[−β(V − µ)] .    (205)
If the temperature is low enough that kB T ≪ ϵF , then µ can be replaced
by the Fermi energy ϵF = (h2 /8m)(3n/π)2/3 , where n is the total number of
particles per unit volume. Of course, the current that we have calculated is
the charge per unit time emerging from the hole. The number of electrons
per unit solid angle at an angle θ to the normal (z direction) is proportional
to cos θ, so, although no electrons emerge tangentially to the wall (θ = π/2),
not all of them travel in the z direction.
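The classical effusion flux (201) quoted in this solution can be reproduced by direct numerical integration over the Maxwell distribution (a sketch; grid parameters are mine):

```python
import math

# Maxwell distribution n(p) = n (2*pi*m*kB*T)^(-3/2) exp(-p^2/2m*kB*T);
# effusion flux = integral of (p/m) n(p) p^2 sin(theta)cos(theta) dp dtheta dphi.
m, kT, n = 1.0, 1.0, 1.0
norm = n / (2 * math.pi * m * kT) ** 1.5

Np, Nt = 4000, 2000
pmax = 12.0 * math.sqrt(m * kT)
dp, dth = pmax / Np, (math.pi / 2) / Nt
radial = sum((((i + 0.5) * dp) ** 3 / m) * norm
             * math.exp(-((i + 0.5) * dp) ** 2 / (2 * m * kT)) * dp
             for i in range(Np))
angular = sum(math.sin((j + 0.5) * dth) * math.cos((j + 0.5) * dth) * dth
              for j in range(Nt))
flux = radial * angular * 2 * math.pi
cbar = math.sqrt(8 * kT / (math.pi * m))   # mean speed
assert abs(flux - 0.25 * n * cbar) / (0.25 * n * cbar) < 1e-4
```

The midpoint-rule integral matches the (1/4)nc̄ result to the grid accuracy.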
Problem 3
In general, particles flow from a region of higher chemical potential to a
region of lower chemical potential. We therefore need to find out in which
region the chemical potential is higher, and we do this by considering the
grand canonical expression for the number of particles per unit volume. In
the presence of a magnetic field, the single-particle energy is ϵ ± τH, where ϵ
is the kinetic energy, depending on whether the magnetic moment is parallel
or antiparallel to the field. The total number of particles is then given by
$$
N=\int_0^{\infty}\frac{g(\epsilon)\, d\epsilon}{\exp[\beta(\epsilon-\mu-\tau H)]+1}+\int_0^{\infty}\frac{g(\epsilon)\, d\epsilon}{\exp[\beta(\epsilon-\mu+\tau H)]+1}\,. \qquad (206)
$$
For non-relativistic particles in a d−dimensional volume V , the density of
states is $g(\epsilon) = \gamma V \epsilon^{d/2-1}$, where γ is a constant. At T = 0, the Fermi
distribution function is
$$
\lim_{\beta\to\infty}\left(\frac{1}{e^{\beta(\epsilon-\mu\pm\tau H)}+1}\right)=\theta(\mu\mp\tau H-\epsilon)\,, \qquad (207)
$$
where θ(·) is the step function, so the integrals are easily evaluated with the
result
$$
\frac{N}{V}=\frac{2\gamma}{d}\left[(\mu+\tau H)^{d/2}+(\mu-\tau H)^{d/2}\right]. \qquad (208)
$$
At the moment that the wall is removed, N/V is the same in regions A and
B; so (with H = 0 in region B) we have
$$
(\mu_A+\tau H)^{d/2}+(\mu_A-\tau H)^{d/2}=2\mu_B^{d/2}\,. \qquad (209)
$$
For small fields, we can make use of the Taylor expansions
$$
(1\pm x)^{d/2}=1\pm\frac{d}{2}\, x+\frac{d}{4}\left(\frac{d}{2}-1\right)x^2+\dots \qquad (210)
$$
to obtain
$$
\left(\frac{\mu_B}{\mu_A}\right)^{d/2}=1+\frac{d(d-2)}{8}\left(\frac{\tau H}{\mu_A}\right)^{2}+\dots \qquad (211)
$$
We see that, for d = 2, the chemical potentials are equal, so there is no flow of
particles. For d > 2, we have µB > µA so particles flow towards the magnetic
field in region A while, for d < 2, the opposite is true. We can prove that the
same result holds for any magnetic field strength as follows. For compactness,
we write λ = τ H. Since our basic equation $(\mu_A+\lambda)^{d/2}+(\mu_A-\lambda)^{d/2}=2\mu_B^{d/2}$ is
unchanged if we change λ to −λ, we can take λ > 0 without loss of generality.
Bearing in mind that µB is fixed, we calculate dµA /dλ as
$$
\frac{d\mu_A}{d\lambda}=\frac{(\mu_A-\lambda)^{d/2-1}-(\mu_A+\lambda)^{d/2-1}}{(\mu_A-\lambda)^{d/2-1}+(\mu_A+\lambda)^{d/2-1}}\,. \qquad (212)
$$
Since µA + λ > µA − λ, we have $(\mu_A+\lambda)^{d/2-1} > (\mu_A-\lambda)^{d/2-1}$ if d > 2 and
vice versa. Therefore, if d > 2, then dµA /dλ is negative and, as the field
is increased, µA decreases from its zero-field value µB and is always smaller
than µB . Conversely, if d < 2, then µA is always greater than µB . For d = 2,
we have µA = µB independent of the field.
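The sign of µB − µA can also be confirmed numerically by solving equation (209) for µA by bisection; this is a sketch with µB = 1 and λ = τ H = 0.5 as illustrative values:

```python
# Sketch: solve (mu_A + lam)^(d/2) + (mu_A - lam)^(d/2) = 2*mu_B^(d/2),
# equation (209) with lam = tau*H, for mu_A by bisection. Values illustrative.
def mu_A(d, lam, mu_B=1.0):
    f = lambda mu: (mu + lam)**(d/2) + (mu - lam)**(d/2) - 2.0 * mu_B**(d/2)
    lo, hi = lam, 10.0 * mu_B + lam    # f is increasing in mu on [lam, inf)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if f(mid) > 0.0 else (mid, hi)
    return 0.5 * (lo + hi)

# d > 2: mu_A < mu_B (particles flow toward the field region);
# d < 2: mu_A > mu_B; d = 2: the chemical potentials stay equal.
print(mu_A(3, 0.5), mu_A(1, 0.5), mu_A(2, 0.5))
```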
Problem 4
(i) $k = N^{-1}\sum_{i,j} A_{ij}$. The thermodynamic potential (Lagrangian in me-
chanical terms) to be minimized is therefore
$$
L=-\sum_{\{A_{ij}\}} p(A_{ij})\ln p(A_{ij})+\lambda_0\sum_{\{A_{ij}\}} p(A_{ij})-\tau N^{-1}\sum_{\{A_{ij}\}} p(A_{ij})\sum_{i,j}A_{ij}\,. \qquad (213)
$$
Minimizing L, that is ∂L/∂p(Aij ) = 0, one finds the distribution and the
partition function
$$
p_G(A_{ij})=\frac{1}{Z}\exp\left[-(\tau/N)\sum_{i,j}A_{ij}\right],\qquad Z=\sum_{\{A_{ij}\}}\exp\left[-(\tau/N)\sum_{i,j}A_{ij}\right]. \qquad (214)
$$
(ii) F is extensive in $N^2$.
(iii) $\langle A_{ij}\rangle=\left(e^{\tau/N}-1\right)^{-1}$, that is, a Bose-Einstein statistics with zero
chemical potential, since additional edges are free to form.
(iv) $\langle k\rangle=N^{-1}\sum_{i,j}\langle A_{ij}\rangle=N\left(e^{\tau/N}-1\right)^{-1}\Rightarrow\tau/N=\ln(N/\langle k\rangle+1)$. It
follows that the free energy is $F=N^2\ln\left(1-e^{-\tau/N}\right)=-N^2\ln(1+\langle k\rangle/N)$.
Similarly $p_G(A_{ij})=(N/\langle k\rangle+1)^{-Nk}\,(1+\langle k\rangle/N)^{-N^2}$.
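As a numerical sketch (with the illustrative values τ/N = 0.7 and N = 50), the Bose-Einstein mean in (iii) and the inversion in (iv) follow from the per-edge geometric sum:

```python
import math

# Sketch: check of (iii) and (iv) with illustrative values tau/N = 0.7, N = 50.
# Each edge contributes independently with weight exp(-x*A), A = 0, 1, 2, ...
x = 0.7
weights = [math.exp(-x * A) for A in range(200)]   # truncated geometric series
Z1 = sum(weights)                                  # single-edge partition sum
mean_A = sum(A * w for A, w in enumerate(weights)) / Z1

# (iii): <A> = 1/(exp(x) - 1), Bose-Einstein with zero chemical potential
print(mean_A, 1.0 / (math.exp(x) - 1.0))

# (iv): x = tau/N is recovered from <k> = N*<A> via tau/N = ln(N/<k> + 1)
N = 50
k_mean = N * mean_A
print(math.log(N / k_mean + 1.0))   # recovers 0.7
```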