
© 2014-2017 Leon A. Takhtajan
MAT 560 Mathematical Physics I.
Classical Field Theory

Leon A. Takhtajan

Department of Mathematics, Stony Brook University, Stony Brook, NY 11794-3651, USA
E-mail address: leontak@[Link]
Contents

Part 1. Classical Mechanics

Lecture 1. Equations of motion
1.1. Generalized coordinates
1.2. The principle of least action
1.3. Newtonian spacetime
1.4. Examples of Lagrangian systems

Lecture 2. Integrals of motion and Noether’s theorem
2.1. Conservation of energy
2.2. Noether theorem
2.3. Examples of conservation laws

Lecture 3. Integration of equations of motion
3.1. One-dimensional motion
3.2. Two-body problem
3.3. Kepler problem
3.4. The motion of a rigid body

Lecture 4. Legendre transform and Hamilton’s equations
4.1. Legendre transform
4.2. Hamiltonian function
4.3. Hamilton’s equations

Lecture 5. Hamiltonian formalism
5.1. The action functional in the phase space
5.2. The action as a function of coordinates
5.3. Classical observables and Poisson bracket
5.4. Canonical transformations and generating functions

Lecture 6. Symplectic and Poisson manifolds
6.1. Symplectic manifolds
6.2. Poisson manifolds
6.3. Noether theorem with symmetries

Lecture 7. Hamiltonian systems with constraints
7.1. First order formalism
7.2. Singular Lagrangians
7.3. First class constraints and reduced phase space
7.4. Second class constraints and Dirac bracket

Notes and references

Part 2. Classical gauge theories

Lecture 8. Maxwell equations
8.1. Physics formulation
8.2. Using differential forms
8.3. Maxwell’s equations with sources
8.4. The principle of least action

Lecture 9. Electrodynamics as U(1) gauge theory
9.1. Bundles, connections and curvature
9.2. Line bundles and Maxwell equations
9.3. Self-duality equations

Lecture 10. Yang-Mills theory
10.1. Yang-Mills equations
10.2. Self-duality equations
10.3. Hitchin equations

Lecture 11. Electromagnetic waves in a free space
11.1. Energy-momentum tensor
11.2. Gauge fixing
11.3. Plane waves
11.4. The general solution

Lecture 12. Hamiltonian formalism. Real scalar field
12.1. Lagrangian formulation
12.2. The energy-momentum tensor
12.3. Hamiltonian formulation
12.4. Fourier modes for the Klein-Gordon model

Lecture 13. Hamiltonian formalism. Gauge theories
13.1. Classical electrodynamics
13.2. Yang-Mills equations

Notes and references

Part 3. Special relativity and theory of gravity

Lecture 14. Special relativity
14.1. The relativity principle and the Lorentz group
14.2. The Lorentz contraction and time delay
14.3. Lie algebra of the Lorentz group
14.4. Lorentz group as deformation of the Galilean group

Lecture 15. Relativistic particle
15.1. The principle of the least action
15.2. Energy-momentum vector
15.3. Charged particle in the electromagnetic field

Lecture 16. Hamiltonian formulation
16.1. Poincaré group and Noether integrals
16.2. Hamiltonian action of the Poincaré group
16.3. No-interaction theorem

Lecture 17. General relativity
17.1. Spacetime in general relativity
17.2. Particle in a gravitation field
17.3. The Riemann tensor

Lecture 18. Einstein equations – I
18.1. Einstein field equations
18.2. Particle in a weak gravitational field
18.3. Hilbert action

Lecture 19. Einstein equations – II
19.1. Palatini formalism
19.2. The Schwarzschild solution

Lecture 20. Kaluza-Klein theory
20.1. Geodesic equation on M
20.2. Hilbert action on M
20.3. Criticism of the Kaluza-Klein theory

Part 1

Classical Mechanics
LECTURE 1

Equations of motion

We assume that the reader is familiar with the basic notions from the theory of smooth ($C^\infty$) manifolds, and recall here the standard notation. Unless it is stated explicitly otherwise, all maps are assumed to be smooth, and all functions are assumed to be smooth and real-valued. Local coordinates $q = (q^1, \dots, q^n)$ on a smooth $n$-dimensional manifold $M$ at a point $q \in M$ are Cartesian coordinates on $\varphi(U) \subset \mathbb{R}^n$, where $(U, \varphi)$ is a coordinate chart on $M$ centered at $q \in U$. For $f : U \to \mathbb{R}^n$ we denote $(f \circ \varphi^{-1})(q^1, \dots, q^n)$ by $f(q)$, and we let

$$\frac{\partial f}{\partial q} = \left( \frac{\partial f}{\partial q^1}, \dots, \frac{\partial f}{\partial q^n} \right)$$

stand for the gradient of a function $f$ at a point $q \in \mathbb{R}^n$ with Cartesian coordinates $(q^1, \dots, q^n)$. We denote by

$$\mathcal{A}^\bullet(M) = \bigoplus_{k=0}^{n} \mathcal{A}^k(M)$$

the graded algebra of smooth differential forms on $M$ with respect to the wedge product, and by $d$ the de Rham differential, a graded derivation of $\mathcal{A}^\bullet(M)$ of degree 1 such that $df$ is the differential of a function $f \in \mathcal{A}^0(M) = C^\infty(M)$. Let $\mathrm{Vect}(M)$ be the Lie algebra of smooth vector fields on $M$ with the Lie bracket $[\ ,\ ]$, given by the commutator of vector fields. For $X \in \mathrm{Vect}(M)$ we denote by $L_X$ and $i_X$, respectively, the Lie derivative along $X$ and the inner product with $X$. The Lie derivative is a degree 0 derivation of $\mathcal{A}^\bullet(M)$ which commutes with $d$ and satisfies $L_X(f) = X(f)$ for $f \in \mathcal{A}^0(M)$, and the inner product is a degree $-1$ derivation of $\mathcal{A}^\bullet(M)$ satisfying $i_X(f) = 0$ and $i_X(df) = X(f)$ for $f \in \mathcal{A}^0(M)$. They satisfy the Cartan formulas

$$L_X = i_X \circ d + d \circ i_X = (d + i_X)^2, \tag{1.1}$$
$$i_{[X,Y]} = L_X \circ i_Y - i_Y \circ L_X. \tag{1.2}$$
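
The second equality in (1.1) can be checked in one line. The sketch below relies on two standard facts that are assumed here rather than stated in the text: $d^2 = 0$, and $i_X^2 = 0$ (a form evaluated twice on the same vector field vanishes).

```latex
\begin{align*}
(d + i_X)^2 &= d^2 + d \circ i_X + i_X \circ d + i_X^2 \\
            &= d \circ i_X + i_X \circ d = L_X .
\end{align*}
```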

For a smooth mapping of manifolds $f : M \to N$ we denote by $f_* : TM \to TN$ and $f^* : T^*N \to T^*M$, respectively, the induced mappings on tangent and cotangent bundles. Other notations, including those traditional for classical mechanics, will be introduced in the main text.


1.1. Generalized coordinates


Classical mechanics describes systems of finitely many interacting particles. The position of a system in space is specified by the positions of its particles and determines a point in some smooth, finite-dimensional manifold $M$, called the configuration space of the system. Coordinates on $M$ are called generalized coordinates of the system, and the dimension $n = \dim M$ is called the number of degrees of freedom.

A state of the system at any instant of time is described by a point $q \in M$ and by a tangent vector $v \in T_q M$ at this point. The basic principle of classical mechanics is the Newton-Laplace determinacy principle, which asserts that a state of the system at a given instant of time completely determines its motion at all times $t$ (in the future and in the past). The motion is described by a classical trajectory, a path $\gamma(t)$ in the configuration space $M$. In generalized coordinates $\gamma(t) = (q^1(t), \dots, q^n(t))$, and the corresponding derivatives $\dot q^i = \dfrac{dq^i}{dt}$ are called generalized velocities. The Newton-Laplace principle is a fundamental experimental fact. It implies that generalized accelerations $\ddot q^i = \dfrac{d^2 q^i}{dt^2}$ are uniquely determined by generalized coordinates $q^i$ and generalized velocities $\dot q^i$, so that classical trajectories satisfy a system of second order ordinary differential equations, called equations of motion.

A Lagrangian system on a configuration space $M$ is defined by a smooth, real-valued function $L$ on $TM \times \mathbb{R}$, the direct product of the tangent bundle $TM$ of $M$ and the time axis¹, called the Lagrangian function (or simply, the Lagrangian).

¹ It follows from the Newton-Laplace principle that $L$ can depend only on generalized coordinates and velocities, and on time.

1.2. The principle of least action

The most general principle governing the motion of Lagrangian systems is the principle of least action in the configuration space (or Hamilton’s principle), formulated as follows.

Let

$$P(M)^{q_1, t_1}_{q_0, t_0} = \{ \gamma : [t_0, t_1] \to M ;\ \gamma(t_0) = q_0,\ \gamma(t_1) = q_1 \}$$

be the space of smooth parametrized paths in $M$ connecting points $q_0$ and $q_1$. The path space $P(M) = P(M)^{q_1, t_1}_{q_0, t_0}$ is an infinite-dimensional real Fréchet manifold, and the tangent space $T_\gamma P(M)$ to $P(M)$ at $\gamma \in P(M)$ consists of all smooth vector fields along the path $\gamma$ in $M$ which vanish at the endpoints $q_0$ and $q_1$. A smooth path $\Gamma$ in $P(M)$, passing through $\gamma \in P(M)$, is called a variation with fixed ends of the path $\gamma(t)$ in $M$. A variation $\Gamma$ is a family $\gamma_\varepsilon(t) = \Gamma(t, \varepsilon)$ of paths in $M$ given by a smooth map

$$\Gamma : [t_0, t_1] \times [-\varepsilon_0, \varepsilon_0] \to M$$

such that $\Gamma(t, 0) = \gamma(t)$ for $t_0 \le t \le t_1$ and $\Gamma(t_0, \varepsilon) = q_0$, $\Gamma(t_1, \varepsilon) = q_1$ for $-\varepsilon_0 \le \varepsilon \le \varepsilon_0$. The tangent vector

$$\delta\gamma = \left. \frac{\partial \Gamma}{\partial \varepsilon} \right|_{\varepsilon = 0} \in T_\gamma P(M)$$

corresponding to a variation $\gamma_\varepsilon(t)$ is traditionally called an infinitesimal variation. Explicitly,

$$\delta\gamma(t) = \Gamma_*\!\left( \tfrac{\partial}{\partial \varepsilon} \right)(t, 0) \in T_{\gamma(t)} M, \quad t_0 \le t \le t_1,$$

where $\frac{\partial}{\partial \varepsilon}$ is a tangent vector to the interval $[-\varepsilon_0, \varepsilon_0]$ at 0. Finally, a tangential lift of a path $\gamma : [t_0, t_1] \to M$ is the path $\gamma' : [t_0, t_1] \to TM$ defined by $\gamma'(t) = \gamma_*\!\left( \frac{\partial}{\partial t} \right) \in T_{\gamma(t)} M$, $t_0 \le t \le t_1$, where $\frac{\partial}{\partial t}$ is a tangent vector to $[t_0, t_1]$ at $t$. In other words, $\gamma'(t)$ is the velocity vector of a path $\gamma(t)$ at time $t$.

Definition. The action functional $S : P(M) \to \mathbb{R}$ of a Lagrangian system $(M, L)$ is defined by

$$S(\gamma) = \int_{t_0}^{t_1} L(\gamma'(t), t)\, dt.$$

Principle of Least Action (Hamilton’s principle). A path $\gamma \in P(M)$ describes the motion of a Lagrangian system $(M, L)$ between the position $q_0 \in M$ at time $t_0$ and the position $q_1 \in M$ at time $t_1$ if and only if it is a critical point of the action functional $S$,

$$\left. \frac{d}{d\varepsilon} \right|_{\varepsilon = 0} S(\gamma_\varepsilon) = 0$$

for all variations $\gamma_\varepsilon(t)$ of $\gamma(t)$ with fixed ends.


The critical points of the action functional are called extremals, and the principle of least action states that a Lagrangian system $(M, L)$ moves along the extremals². The extremals are characterized by equations of motion, a system of second order differential equations in local coordinates on $TM$. The equations of motion have the most elegant form for the following choice of local coordinates on $TM$.

² The principle of least action does not state that an extremal connecting points $q_0$ and $q_1$ is a minimum of $S$, nor that such an extremal is unique. It also does not state that any two points can be connected by an extremal.

Definition. Let $(U, \varphi)$ be a coordinate chart on $M$ with local coordinates $q = (q^1, \dots, q^n)$. Coordinates

$$(q, v) = (q^1, \dots, q^n, v^1, \dots, v^n)$$

on a chart $TU$ on $TM$, where $v = (v^1, \dots, v^n)$ are coordinates in the fiber corresponding to the basis $\dfrac{\partial}{\partial q^1}, \dots, \dfrac{\partial}{\partial q^n}$ for $T_q M$, are called standard coordinates.

Standard coordinates are Cartesian coordinates on $\varphi_*(TU) \subset T\mathbb{R}^n \simeq \mathbb{R}^n \times \mathbb{R}^n$ and have the property that for $(q, v) \in TU$ and $f \in C^\infty(U)$,

$$v(f) = \sum_{i=1}^{n} v^i \frac{\partial f}{\partial q^i} = v\, \frac{\partial f}{\partial q}. \tag{1.3}$$

Let $(U, \varphi)$ and $(U', \varphi')$ be coordinate charts on $M$ with the transition functions $F = (F^1, \dots, F^n) = \varphi' \circ \varphi^{-1} : \varphi(U \cap U') \to \varphi'(U \cap U')$, and let $(q, v)$ and $(q', v')$, respectively, be the standard coordinates on $TU$ and $TU'$. We have $q' = F(q)$, and it follows from (1.3) that

$$v' = F_*(q)\, v, \quad \text{where} \quad F_*(q) = \left\{ \frac{\partial F^i}{\partial q^j}(q) \right\}_{i,j=1}^{n} \tag{1.4}$$

is a matrix-valued function on $\varphi(U \cap U')$. In other words, “vertical” coordinates $v = (v^1, \dots, v^n)$ in the fibers of $TM \to M$ transform like components of a tangent vector on $M$ under the change of coordinates on $M$. In classical terminology, $v$ is a contravariant vector.
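
The transformation law (1.4) can be checked numerically. The following is only an illustrative sketch, not part of the original notes: it assumes $M = \mathbb{R}^2$ away from the origin, with Cartesian coordinates playing the role of $q$ and polar coordinates the role of $q'$, and the helper names (`to_polar`, `jacobian_polar`, `push_forward`, `path`) are hypothetical. It compares $F_*(q)v$ with a finite-difference velocity of the same path written in polar coordinates.

```python
import math

def to_polar(x, y):
    """Transition function F(x, y) = (r, theta), valid away from the origin."""
    return math.hypot(x, y), math.atan2(y, x)

def jacobian_polar(x, y):
    """Matrix F_*(q) = (dF^i/dq^j) of the transition function at q = (x, y)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def push_forward(x, y, vx, vy):
    """Velocity components in polar coordinates, v' = F_*(q) v as in (1.4)."""
    J = jacobian_polar(x, y)
    return (J[0][0] * vx + J[0][1] * vy,
            J[1][0] * vx + J[1][1] * vy)

def path(t):
    """A sample path in Cartesian coordinates (an arbitrary choice)."""
    return 1.0 + t, 2.0 * t

def polar_velocity_fd(t, h=1e-6):
    """Central-difference velocity of the same path in polar coordinates."""
    r_plus, th_plus = to_polar(*path(t + h))
    r_minus, th_minus = to_polar(*path(t - h))
    return (r_plus - r_minus) / (2 * h), (th_plus - th_minus) / (2 * h)

t = 0.5
x, y = path(t)
v_transformed = push_forward(x, y, 1.0, 2.0)  # Cartesian velocity of the path is (1, 2)
v_direct = polar_velocity_fd(t)
print(v_transformed, v_direct)
```

The two printed pairs should agree up to the finite-difference error, which is what "transform like components of a tangent vector" means in practice.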

The tangential lift $\gamma'(t)$ of a path $\gamma(t)$ in $M$ in standard coordinates on $TU$ is $(q(t), \dot q(t)) = (q^1(t), \dots, q^n(t), \dot q^1(t), \dots, \dot q^n(t))$, where the dot stands for the time derivative, so that

$$L(\gamma'(t), t) = L(q(t), \dot q(t), t).$$

Following a centuries-long tradition³, we will usually denote standard coordinates by

$$(q, \dot q) = (q^1, \dots, q^n, \dot q^1, \dots, \dot q^n),$$

where the dot does not stand for the time derivative. Since we only consider paths in $TM$ that are tangential lifts of paths in $M$, there will be no confusion⁴.

³ Used in all texts on classical mechanics and theoretical physics.

⁴ We reserve the notation $(q(t), v(t))$ for general paths in $TM$.

Theorem 1.1. The equations of motion of a Lagrangian system $(M, L)$ in standard coordinates on $TM$ are given by the Euler-Lagrange equations

$$\frac{\partial L}{\partial q}(q(t), \dot q(t), t) - \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q}(q(t), \dot q(t), t) \right) = 0.$$

Proof. Suppose first that an extremal $\gamma(t)$ lies in a coordinate chart $U$ of $M$. Then a simple computation in standard coordinates, using integration by parts, gives

$$\begin{aligned}
0 &= \left. \frac{d}{d\varepsilon} \right|_{\varepsilon = 0} S(\gamma_\varepsilon) \\
&= \left. \frac{d}{d\varepsilon} \right|_{\varepsilon = 0} \int_{t_0}^{t_1} L(q(t, \varepsilon), \dot q(t, \varepsilon), t)\, dt \\
&= \sum_{i=1}^{n} \int_{t_0}^{t_1} \left( \frac{\partial L}{\partial q^i}\, \delta q^i + \frac{\partial L}{\partial \dot q^i}\, \delta \dot q^i \right) dt \\
&= \sum_{i=1}^{n} \int_{t_0}^{t_1} \left( \frac{\partial L}{\partial q^i} - \frac{d}{dt} \frac{\partial L}{\partial \dot q^i} \right) \delta q^i\, dt + \sum_{i=1}^{n} \left. \frac{\partial L}{\partial \dot q^i}\, \delta q^i \right|_{t_0}^{t_1}.
\end{aligned}$$

The second sum in the last line vanishes due to the property $\delta q^i(t_0) = \delta q^i(t_1) = 0$, $i = 1, \dots, n$. The first sum is zero for arbitrary smooth functions $\delta q^i$ on the interval $[t_0, t_1]$ which vanish at the endpoints. This implies that for each term in the sum the integrand is identically zero,

$$\frac{\partial L}{\partial q^i}(q(t), \dot q(t), t) - \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q^i}(q(t), \dot q(t), t) \right) = 0, \quad i = 1, \dots, n.$$

Since the restriction of an extremal of the action functional $S$ to a coordinate chart on $M$ is again an extremal, each extremal in standard coordinates on $TM$ satisfies the Euler-Lagrange equations. □
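
The statement of Theorem 1.1 can be illustrated numerically. The sketch below is an assumption-laden toy, not the author's construction: it takes the one-dimensional Lagrangian $L = \tfrac12 \dot q^2 - \tfrac12 q^2$, discretizes the action with the midpoint rule, and estimates $\frac{d}{d\varepsilon} S(\gamma_\varepsilon)$ at $\varepsilon = 0$ for a variation with fixed ends. All function and variable names are hypothetical.

```python
import math

def action(path, t0, t1, lagrangian):
    """Midpoint-rule approximation of S = integral of L(q, q_dot, t) dt
    for a path sampled on a uniform grid of [t0, t1]."""
    n = len(path) - 1
    dt = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        q_mid = 0.5 * (path[k] + path[k + 1])
        q_dot = (path[k + 1] - path[k]) / dt
        t_mid = t0 + (k + 0.5) * dt
        total += lagrangian(q_mid, q_dot, t_mid) * dt
    return total

def oscillator(q, q_dot, t):
    """L = q_dot^2 / 2 - q^2 / 2 (unit mass and frequency, an assumed example)."""
    return 0.5 * q_dot ** 2 - 0.5 * q ** 2

def first_variation(path, bump, t0, t1, lagrangian, eps=1e-4):
    """Central-difference estimate of d/d(eps) S(path + eps * bump) at eps = 0."""
    plus = [q + eps * b for q, b in zip(path, bump)]
    minus = [q - eps * b for q, b in zip(path, bump)]
    return (action(plus, t0, t1, lagrangian)
            - action(minus, t0, t1, lagrangian)) / (2 * eps)

N = 1000
t0, t1 = 0.0, 1.0
ts = [t0 + (t1 - t0) * k / N for k in range(N + 1)]

extremal = [math.sin(t) for t in ts]        # solves q'' + q = 0
straight = [t * math.sin(t1) for t in ts]   # same endpoints, not a solution
bump = [math.sin(math.pi * t) for t in ts]  # infinitesimal variation, zero at the ends

dS_extremal = first_variation(extremal, bump, t0, t1, oscillator)
dS_straight = first_variation(straight, bump, t0, t1, oscillator)
print(dS_extremal, dS_straight)
```

For the solution of the Euler-Lagrange equation the estimated derivative is zero up to discretization error, while for the straight line with the same endpoints it is close to $-\sin(1)/\pi$, the value of $\int_0^1 (-q)\,\delta q\, dt$ for that path.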
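
Remark. In calculus of variations, the directional derivative of a functional $S$ with respect to a tangent vector $V \in T_\gamma P(M)$, the Gâteaux derivative, is defined by

$$\delta_V S = \left. \frac{d}{d\varepsilon} \right|_{\varepsilon = 0} S(\gamma_\varepsilon),$$

where $\gamma_\varepsilon$ is a path in $P(M)$ with a tangent vector $V$ at $\gamma_0 = \gamma$. The result of the above computation (when $\gamma$ lies in a coordinate chart $U \subset M$) can be written as

$$\begin{aligned}
\delta_V S &= \int_{t_0}^{t_1} \sum_{i=1}^{n} \left( \frac{\partial L}{\partial q^i} - \frac{d}{dt} \frac{\partial L}{\partial \dot q^i} \right)(q(t), \dot q(t), t)\, v^i(t)\, dt \\
&= \int_{t_0}^{t_1} \left( \frac{\partial L}{\partial q} - \frac{d}{dt} \frac{\partial L}{\partial \dot q} \right)(q(t), \dot q(t), t)\, v(t)\, dt.
\end{aligned} \tag{1.5}$$

Here $V(t) = \sum_{i=1}^{n} v^i(t) \dfrac{\partial}{\partial q^i}$ is a vector field along the path $\gamma$ in $M$. Formula (1.5) is called the formula for the first variation of the action with fixed ends. The principle of least action is a statement that $\delta_V S(\gamma) = 0$ for all $V \in T_\gamma P(M)$.

Remark. It is also convenient to consider a space $\widehat{P(M)} = \{ \gamma : [t_0, t_1] \to M \}$ of all smooth parametrized paths in $M$. The tangent space $T_\gamma \widehat{P(M)}$ to $\widehat{P(M)}$ at $\gamma \in \widehat{P(M)}$ is the space of all smooth vector fields along the path $\gamma$ in $M$ (no condition at the endpoints). The computation in the proof of Theorem 1.1 yields the following formula for the first variation of the action with free ends:

$$\delta_V S = \int_{t_0}^{t_1} \left( \frac{\partial L}{\partial q} - \frac{d}{dt} \frac{\partial L}{\partial \dot q} \right) v\, dt + \left. \frac{\partial L}{\partial \dot q}\, v \right|_{t_0}^{t_1}. \tag{1.6}$$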

In expanded form, the Euler-Lagrange equations are given by the following system of second order ordinary differential equations:

$$\begin{aligned}
\frac{\partial L}{\partial q^i}(q, \dot q, t) &= \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q^i}(q, \dot q, t) \right) \\
&= \sum_{j=1}^{n} \left( \frac{\partial^2 L}{\partial \dot q^i \partial \dot q^j}(q, \dot q, t)\, \ddot q^j + \frac{\partial^2 L}{\partial \dot q^i \partial q^j}(q, \dot q, t)\, \dot q^j \right) + \frac{\partial^2 L}{\partial \dot q^i \partial t}(q, \dot q, t), \quad i = 1, \dots, n.
\end{aligned}$$

In order for this system to be solvable for the highest derivatives for all initial conditions in $TU$, the symmetric $n \times n$ matrix

$$H_L(q, \dot q, t) = \left\{ \frac{\partial^2 L}{\partial \dot q^i \partial \dot q^j}(q, \dot q, t) \right\}_{i,j=1}^{n}$$

should be invertible on $TU$.

Definition. A Lagrangian system $(M, L)$ is called non-degenerate if for every coordinate chart $U$ on $M$ the matrix $H_L(q, \dot q, t)$ is invertible on $TU$. Otherwise the Lagrangian system is called singular.

Remark. Note that the $n \times n$ matrix $H_L$ is the Hessian of the Lagrangian function $L$ for vertical directions on $TM$. Under the change of standard coordinates $q' = F(q)$ and $\dot q' = F_*(q) \dot q$ it has the transformation law

$$H_L(q, \dot q, t) = F_*(q)^\tau\, H_L(q', \dot q', t)\, F_*(q),$$

where $F_*(q)^\tau$ is the transposed matrix, so that the condition $\det H_L \neq 0$ does not depend on the choice of standard coordinates.

Inverting the matrix $H_L$, we can write the Euler-Lagrange equations for a non-degenerate Lagrangian in the form

$$\ddot q^i = F^i(q, \dot q, t), \quad i = 1, \dots, n. \tag{1.7}$$
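
For orientation, here is how (1.7) looks in the simplest case. The Lagrangian $L = \tfrac12 m \dot q^2 - V(q)$ on $\mathbb{R}^n$ with $m \neq 0$ is an assumed example chosen for illustration (the same form appears in Example 1.4); it is non-degenerate because $\det H_L = m^n \neq 0$.

```latex
\begin{gather*}
\frac{\partial L}{\partial q^i} = -\frac{\partial V}{\partial q^i}, \qquad
\frac{\partial L}{\partial \dot q^i} = m\,\dot q^i, \qquad
H_L = \{ m\,\delta_{ij} \}_{i,j=1}^{n}, \\
\ddot q^i = F^i(q) = -\frac{1}{m}\,\frac{\partial V}{\partial q^i}, \qquad i = 1, \dots, n.
\end{gather*}
```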

1.3. Newtonian spacetime


To describe mechanical phenomena it is necessary to choose a frame of reference. The properties of the spacetime where the motion takes place depend on this choice. The spacetime is characterized by the following postulates⁵.

⁵ Strictly speaking, these postulates are valid only in the non-relativistic limit of special relativity, when the speed of light in the vacuum is assumed to be infinite.

Newtonian Space-Time. The space is a three-dimensional affine Euclidean space $E^3$. A choice of the origin $0 \in E^3$, a reference point, establishes the isomorphism $E^3 \simeq \mathbb{R}^3$, where the vector space $\mathbb{R}^3$ carries the Euclidean inner product and has a fixed orientation. The time is one-dimensional, a time axis $\mathbb{R}$, and the spacetime is a direct product $E^3 \times \mathbb{R}$. Points in the spacetime are called events. Two events $(\mathbf{r}, t)$ and $(\mathbf{r}', t')$ are called simultaneous if $t = t'$. The distance can be defined only for simultaneous events and is the Euclidean distance $|\mathbf{r} - \mathbf{r}'|$.

An inertial reference frame is a coordinate system with respect to the origin $0 \in E^3$, initial time $t_0$, and an orthonormal basis in $\mathbb{R}^3$. In an inertial frame the space is homogeneous and isotropic and the time is homogeneous. The laws of motion are invariant with respect to the transformations

$$\mathbf{r} \mapsto g \cdot \mathbf{r} + \mathbf{r}_0, \quad t \mapsto t + t_0,$$

where $\mathbf{r}, \mathbf{r}_0 \in \mathbb{R}^3$ and $g \in O(3)$ is an orthogonal linear transformation in $\mathbb{R}^3$. The time in classical mechanics is absolute.

The Galilean group $G$ is the group of all affine transformations of $E^3 \times \mathbb{R}$ which preserve time intervals and which for every $t \in \mathbb{R}$ are isometries in $E^3$. Every Galilean transformation is a composition of a rotation, a spacetime translation, and a special Galilean transformation

$$\mathbf{r} \mapsto \mathbf{r} + \mathbf{v} t, \quad t \mapsto t, \tag{1.8}$$

where $\mathbf{v} \in \mathbb{R}^3$. Any two inertial frames are related by a Galilean transformation.

The homogeneous Galilean group $G_0$ consists of rotations and special Galilean transformations (1.8). As a Lie group, $G_0$ is isomorphic to the Euclidean Lie group $E(3)$, a semi-direct product $\mathbb{R}^3 \rtimes O(3)$. Explicitly,

$$G_0 = \left\{ \begin{pmatrix} g & \mathbf{v} \\ 0 & 1 \end{pmatrix} : g \in O(3),\ \mathbf{v} \in \mathbb{R}^3 \right\},$$

so that

$$\begin{pmatrix} \mathbf{r} \\ t \end{pmatrix} \mapsto \begin{pmatrix} g & \mathbf{v} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \mathbf{r} \\ t \end{pmatrix} = \begin{pmatrix} g \cdot \mathbf{r} + \mathbf{v} t \\ t \end{pmatrix}.$$
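
The block-matrix description makes the group law explicit: multiplying two such matrices gives $(g_1, \mathbf{v}_1)(g_2, \mathbf{v}_2) = (g_1 g_2,\ g_1 \mathbf{v}_2 + \mathbf{v}_1)$. A minimal sketch of this in code, with hypothetical helper names and arbitrarily chosen sample elements, checks that acting with the product agrees with acting with the two elements in succession.

```python
import math

def mat_vec(g, r):
    """Product of a 3x3 matrix with a 3-vector."""
    return [sum(g[i][k] * r[k] for k in range(3)) for i in range(3)]

def mat_mat(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def act(element, r, t):
    """Apply (g, v) to the event (r, t): r -> g.r + v t, t -> t."""
    g, v = element
    gr = mat_vec(g, r)
    return [gr[i] + v[i] * t for i in range(3)], t

def compose(a, b):
    """Block-matrix product [[g1, v1], [0, 1]] [[g2, v2], [0, 1]]
    = [[g1 g2, g1 v2 + v1], [0, 1]]."""
    (g1, v1), (g2, v2) = a, b
    g1v2 = mat_vec(g1, v2)
    return mat_mat(g1, g2), [g1v2[i] + v1[i] for i in range(3)]

def rotation_z(angle):
    """Rotation about the third axis, an element of O(3)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

a = (rotation_z(0.3), [1.0, 0.0, -2.0])
b = (rotation_z(-1.1), [0.5, 4.0, 0.0])
event = ([1.0, 2.0, 3.0], 0.7)

r_ab, t_ab = act(compose(a, b), *event)   # act with the product a.b
r_seq, t_seq = act(a, *act(b, *event))    # act with b first, then with a
print(r_ab, r_seq)
```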

Galileo’s Relativity Principle. The laws of motion are invariant with respect to the Galilean group.

These postulates impose restrictions on Lagrangians of mechanical systems. In particular, the Lagrangian $L$ of a closed system⁶ does not explicitly depend on time.

⁶ A system is called closed if its particles do not interact with outside material bodies.

1.4. Examples of Lagrangian systems

Physical systems are described by special Lagrangians, in agreement with the experimental facts about the motion of material bodies.

Example 1.1 (Free particle). The configuration space for a free particle is $M = \mathbb{R}^3$, and it can be deduced from Galileo’s relativity principle that the Lagrangian for a free particle is

$$L = \tfrac12 m \dot{\mathbf{r}}^2.$$

Here $m > 0$⁷ is the mass of a particle and $\dot{\mathbf{r}}^2 = |\dot{\mathbf{r}}|^2$ is the squared length of the velocity vector $\dot{\mathbf{r}} \in T_{\mathbf{r}} \mathbb{R}^3 \simeq \mathbb{R}^3$. Indeed, under the Galilean transformation (1.8)

$$L = \tfrac12 m \dot{\mathbf{r}}^2 \mapsto L' = \tfrac12 m (\dot{\mathbf{r}} + \mathbf{v})^2 = L + \frac{d}{dt}\left( m\, \mathbf{r} \mathbf{v} + \tfrac12 m \mathbf{v}^2 t \right), \tag{1.9}$$

so that Lagrangians $L$ and $L'$ have the same equations of motion (cf. Problem 1.2). Specifically, Euler-Lagrange equations give Newton’s law of inertia,

$$\ddot{\mathbf{r}} = 0.$$

⁷ Otherwise the action functional is not bounded from below.

Example 1.2 (Interacting particles). A closed system of $N$ interacting particles in $\mathbb{R}^3$ with masses $m_1, \dots, m_N$ is described by a configuration space

$$M = \mathbb{R}^{3N} = \underbrace{\mathbb{R}^3 \times \cdots \times \mathbb{R}^3}_{N}$$

with a position vector $\mathbf{r} = (\mathbf{r}_1, \dots, \mathbf{r}_N)$, where $\mathbf{r}_a \in \mathbb{R}^3$ is a position vector of the $a$-th particle, $a = 1, \dots, N$. It is found that the Lagrangian is given by

$$L = \sum_{a=1}^{N} \tfrac12 m_a \dot{\mathbf{r}}_a^2 - V(\mathbf{r}) = T - V,$$

where

$$T = \sum_{a=1}^{N} \tfrac12 m_a \dot{\mathbf{r}}_a^2$$

is called the kinetic energy of a system and $V(\mathbf{r})$ is the potential energy. The Euler-Lagrange equations give Newton’s equations

$$m_a \ddot{\mathbf{r}}_a = \mathbf{F}_a,$$

where

$$\mathbf{F}_a = -\frac{\partial V}{\partial \mathbf{r}_a}$$

is the force on the $a$-th particle, $a = 1, \dots, N$. Forces of this form are called conservative. Thus the interaction of particles is through the action of potential forces, and is an instantaneous action at a distance⁸.

⁸ This means a phenomenon in which a change in intrinsic properties of one system induces an instantaneous change in the intrinsic properties of a distant system without a process that carries this influence contiguously in space and time.

It follows from homogeneity of space that the potential energy $V(\mathbf{r})$ of a closed system of $N$ interacting particles with conservative forces depends only on relative positions of the particles, i.e., $V(\mathbf{r}_1 + \mathbf{c}, \dots, \mathbf{r}_N + \mathbf{c}) = V(\mathbf{r}_1, \dots, \mathbf{r}_N)$ for all $\mathbf{c} \in \mathbb{R}^3$, which leads to the equation

$$\sum_{a=1}^{N} \mathbf{F}_a = 0.$$

In particular, for a closed system of two particles $\mathbf{F}_1 + \mathbf{F}_2 = 0$, which is the equality of action and reaction forces, also called Newton’s third law.

The potential energy of a closed system with only pair-wise interaction between the particles has the form

$$V(\mathbf{r}) = \sum_{1 \le a < b \le N} V_{ab}(\mathbf{r}_a - \mathbf{r}_b).$$

It follows from the isotropy of space that $V(\mathbf{r})$ depends only on relative distances between the particles, so that the Lagrangian of a closed system of $N$ particles with pair-wise interaction has the form

$$L = \sum_{a=1}^{N} \tfrac12 m_a \dot{\mathbf{r}}_a^2 - \sum_{1 \le a < b \le N} V_{ab}(|\mathbf{r}_a - \mathbf{r}_b|).$$

Example 1.3 (Universal gravitation). According to Newton’s law of gravitation, the potential energy of the gravitational force between two particles with masses $m_a$ and $m_b$ is

$$V(\mathbf{r}_a - \mathbf{r}_b) = -G\, \frac{m_a m_b}{|\mathbf{r}_a - \mathbf{r}_b|},$$

where $G$ is the gravitational constant. The configuration space of $N$ particles with gravitational interaction is

$$M = \{ (\mathbf{r}_1, \dots, \mathbf{r}_N) \in \mathbb{R}^{3N} : \mathbf{r}_a \neq \mathbf{r}_b \text{ for } a \neq b,\ a, b = 1, \dots, N \}.$$
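
The relation $\sum_a \mathbf{F}_a = 0$ for a translation-invariant potential can be seen numerically for the gravitational potential of Example 1.3. The sketch below is illustrative only: the masses, positions and the value $G = 1$ are arbitrary assumptions, the function names are hypothetical, and the forces $\mathbf{F}_a = -\partial V / \partial \mathbf{r}_a$ are estimated by central differences rather than computed in closed form.

```python
import math

G = 1.0  # gravitational constant in arbitrary units (an assumption for illustration)

def potential(positions, masses):
    """V(r) = - sum over pairs a < b of G m_a m_b / |r_a - r_b|."""
    total = 0.0
    n = len(positions)
    for a in range(n):
        for b in range(a + 1, n):
            total -= G * masses[a] * masses[b] / math.dist(positions[a], positions[b])
    return total

def force(positions, masses, a, h=1e-6):
    """F_a = -dV/dr_a, estimated by central differences in each component."""
    result = []
    for i in range(3):
        plus = [list(p) for p in positions]
        minus = [list(p) for p in positions]
        plus[a][i] += h
        minus[a][i] -= h
        result.append(-(potential(plus, masses) - potential(minus, masses)) / (2 * h))
    return result

masses = [1.0, 2.0, 3.0]
positions = [[0.0, 0.0, 0.0], [1.0, 0.5, 0.0], [-0.5, 1.0, 2.0]]

forces = [force(positions, masses, a) for a in range(len(masses))]
total_force = [sum(f[i] for f in forces) for i in range(3)]
print(forces)
print(total_force)
```

The individual forces are of order one, while each component of their sum vanishes up to the finite-difference error.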

Example 1.4 (Small oscillations). Consider a particle of mass $m$ with $n$ degrees of freedom moving in a potential field $V(q)$, and suppose that the potential energy $V$ has a minimum at $q = 0$. Expanding $V(q)$ in Taylor series around 0 and keeping only quadratic terms, one obtains a Lagrangian system which describes small oscillations from equilibrium. Explicitly,

$$L = \tfrac12 m \dot q^2 - V_0(q),$$

where $V_0$ is a positive-definite quadratic form on $\mathbb{R}^n$ given by

$$V_0(q) = \frac12 \sum_{i,j=1}^{n} \frac{\partial^2 V}{\partial q^i \partial q^j}(0)\, q^i q^j.$$

Since every quadratic form can be diagonalized by an orthogonal transformation, we can assume from the very beginning that coordinates $q = (q^1, \dots, q^n)$ are chosen so that $V_0(q)$ is diagonal and

$$L = \tfrac12 m \Big( \dot q^2 - \sum_{i=1}^{n} \omega_i^2 (q^i)^2 \Big), \tag{1.10}$$

where $\omega_1, \dots, \omega_n > 0$. Such coordinates $q$ are called normal coordinates. In normal coordinates the Euler-Lagrange equations take the form

$$\ddot q^i + \omega_i^2 q^i = 0, \quad i = 1, \dots, n,$$

and describe $n$ decoupled (i.e., non-interacting) harmonic oscillators with frequencies $\omega_1, \dots, \omega_n$.
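
The passage to normal coordinates can be made concrete for $n = 2$. The following sketch is an illustration under stated assumptions: the Hessian matrix `K` of $V$ at the minimum is a made-up positive-definite example, the function name is hypothetical, and the diagonalizing orthogonal transformation is taken to be a plane rotation. Comparing $V_0 = \tfrac12 \sum_i \lambda_i (q^i)^2$ with (1.10) gives $\omega_i = \sqrt{\lambda_i / m}$.

```python
import math

def normal_modes_2x2(k, m):
    """Diagonalize V_0(q) = (1/2) q^T K q for a symmetric positive-definite
    K = [[a, b], [b, c]] by a rotation R; return (R, D, frequencies)."""
    a, b, c = k[0][0], k[0][1], k[1][1]
    theta = 0.5 * math.atan2(2.0 * b, a - c)  # angle that removes the off-diagonal term
    cs, sn = math.cos(theta), math.sin(theta)
    R = [[cs, -sn], [sn, cs]]
    # D = R^T K R is the matrix of V_0 in the rotated (normal) coordinates.
    KR = [[sum(k[i][l] * R[l][j] for l in range(2)) for j in range(2)]
          for i in range(2)]
    D = [[sum(R[l][i] * KR[l][j] for l in range(2)) for j in range(2)]
         for i in range(2)]
    omegas = [math.sqrt(D[i][i] / m) for i in range(2)]
    return R, D, omegas

K = [[2.0, -1.0], [-1.0, 2.0]]  # Hessian of V at the minimum (hypothetical values)
R, D, omegas = normal_modes_2x2(K, m=1.0)
print(D)
print(omegas)
```

For this `K` the eigenvalues are 1 and 3, so with $m = 1$ the two decoupled oscillators have frequencies $1$ and $\sqrt{3}$.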
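
Example 1.5 (Free particle on a Riemannian manifold). Let $(M, ds^2)$ be a Riemannian manifold with the Riemannian metric $ds^2$. In local coordinates $x^1, \dots, x^n$ on $M$,

$$ds^2 = g_{\mu\nu}(x)\, dx^\mu dx^\nu,$$

where we are using summation over repeated indices. The Lagrangian of a free particle on $M$ is

$$L(v) = \tfrac12 \langle v, v \rangle = \tfrac12 \| v \|^2, \quad v \in TM,$$

where $\langle\ ,\ \rangle$ stands for the inner product in fibers of $TM$, given by the Riemannian metric. The corresponding functional

$$S(\gamma) = \frac12 \int_{t_0}^{t_1} \| \gamma'(t) \|^2\, dt = \frac12 \int_{t_0}^{t_1} g_{\mu\nu}(x)\, \dot x^\mu \dot x^\nu\, dt$$

is called the action functional in Riemannian geometry. The Euler-Lagrange equations are

$$g_{\mu\nu} \ddot x^\mu + \frac{\partial g_{\mu\nu}}{\partial x^\lambda}\, \dot x^\mu \dot x^\lambda = \frac12 \frac{\partial g_{\mu\lambda}}{\partial x^\nu}\, \dot x^\mu \dot x^\lambda,$$

and after multiplying by the inverse metric tensor $g^{\sigma\nu}$ and summation over $\nu$ they take the form

$$\ddot x^\sigma + \Gamma^\sigma_{\mu\nu}\, \dot x^\mu \dot x^\nu = 0, \quad \sigma = 1, \dots, n,$$

where

$$\Gamma^\sigma_{\mu\nu} = \frac12 g^{\sigma\lambda} \left( \frac{\partial g_{\mu\lambda}}{\partial x^\nu} + \frac{\partial g_{\nu\lambda}}{\partial x^\mu} - \frac{\partial g_{\mu\nu}}{\partial x^\lambda} \right)$$

are Christoffel’s symbols. The Euler-Lagrange equations of a free particle moving on a Riemannian manifold are geodesic equations.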
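
The formula for Christoffel’s symbols is easy to evaluate numerically. The sketch below is an illustration rather than part of the notes: it assumes the round metric $ds^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$ on the unit 2-sphere as the test case, approximates the derivatives of $g_{\mu\nu}$ by central differences, and uses hypothetical function names. For this metric the known nonzero symbols are $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta$.

```python
import math

def sphere_metric(x):
    """Metric g_{mu nu} of the unit 2-sphere in coordinates x = (theta, phi)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def metric_derivative(metric, x, lam, h=1e-5):
    """Central-difference approximation of d g_{mu nu} / d x^lam as a 2x2 matrix."""
    xp, xm = list(x), list(x)
    xp[lam] += h
    xm[lam] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def christoffel(metric, x):
    """gamma[s][mu][nu] = (1/2) g^{s lam} (d_nu g_{mu lam} + d_mu g_{nu lam} - d_lam g_{mu nu})."""
    n = 2
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, lam) for lam in range(n)]  # dg[lam][mu][nu]
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for s in range(n):
        for mu in range(n):
            for nu in range(n):
                gamma[s][mu][nu] = 0.5 * sum(
                    g_inv[s][lam] * (dg[nu][mu][lam] + dg[mu][nu][lam] - dg[lam][mu][nu])
                    for lam in range(n))
    return gamma

theta0 = 1.0
gamma = christoffel(sphere_metric, [theta0, 0.3])
print(gamma[0][1][1], -math.sin(theta0) * math.cos(theta0))
print(gamma[1][0][1], math.cos(theta0) / math.sin(theta0))
```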
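
Let $\nabla$ be the Levi-Civita connection, the metric connection in the tangent bundle $TM$, and let $\nabla_\xi$ be a covariant derivative with respect to the vector field $\xi \in \mathrm{Vect}(M)$. Explicitly,

$$(\nabla_\xi \eta)^\mu = \left( \frac{\partial \eta^\mu}{\partial x^\nu} + \Gamma^\mu_{\nu\lambda}\, \eta^\lambda \right) \xi^\nu, \quad \text{where} \quad \xi = \xi^\mu(x) \frac{\partial}{\partial x^\mu}, \ \eta = \eta^\mu(x) \frac{\partial}{\partial x^\mu}.$$

For a path $\gamma(t) = (x^\mu(t))$ denote by $\nabla_{\dot\gamma}$ a covariant derivative along $\gamma$,

$$(\nabla_{\dot\gamma} \eta)^\mu(t) = \frac{d\eta^\mu(t)}{dt} + \Gamma^\mu_{\nu\lambda}(\gamma(t))\, \dot x^\nu(t)\, \eta^\lambda(t), \quad \text{where} \quad \eta = \eta^\mu(t) \frac{\partial}{\partial x^\mu}$$

is a vector field along $\gamma$. Formula (1.5) can now be written in an invariant form

$$\delta S = -\int_{t_0}^{t_1} \langle \nabla_{\dot\gamma} \dot\gamma, \delta\gamma \rangle\, dt,$$

which is known as the formula for the first variation of the action in Riemannian geometry.
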
Problem 1.1. Show that the action functional is given by the evaluation of the 1-form $L\,dt$ on $TM \times \mathbb{R}$ over the 1-chain $\tilde\gamma$ on $TM \times \mathbb{R}$,

$$S(\gamma) = \int_{\tilde\gamma} L\, dt,$$

where $\tilde\gamma = \{ (\gamma'(t), t);\ t_0 \le t \le t_1 \}$ and $L\,dt\left( w, c \frac{\partial}{\partial t} \right) = c L(q, v)$, $w \in T_{(q,v)} TM$, $c \in \mathbb{R}$.

Problem 1.2. Let $f \in C^\infty(M)$. Show that Lagrangian systems $(M, L)$ and $(M, L + df)$ (where $df$ is a fibre-wise linear function on $TM$) have the same equations of motion. In general, the Lagrangian is defined up to an addition of a total time derivative of a function of coordinates and time.

Problem 1.3. Give examples of Lagrangian systems such that an extremal connecting two given points (i) is not a local minimum; (ii) is not unique; (iii) does not exist.

Problem 1.4. For $\gamma$ an extremal of the action functional $S$, the second variation of $S$ is defined by

$$\delta^2_{V_1 V_2} S = \left. \frac{\partial^2}{\partial \varepsilon_1 \partial \varepsilon_2} \right|_{\varepsilon_1 = \varepsilon_2 = 0} S(\gamma_{\varepsilon_1, \varepsilon_2}),$$

where $\gamma_{\varepsilon_1, \varepsilon_2}$ is a smooth two-parameter family of paths in $M$ such that the paths $\gamma_{\varepsilon_1, 0}$ and $\gamma_{0, \varepsilon_2}$ in $P(M)$ at the point $\gamma_{0,0} = \gamma \in P(M)$ have tangent vectors $V_1$ and $V_2$, respectively. For a Lagrangian system $(M, L)$ find the second variation of $S$ and verify that for given $V_1$ and $V_2$ it does not depend on the choice of $\gamma_{\varepsilon_1, \varepsilon_2}$.

Problem 1.5. Prove that the second variation of the action functional in Riemannian geometry is given by

$$\delta^2 S = \int_{t_0}^{t_1} \langle \mathcal{J}(\delta_1 \gamma), \delta_2 \gamma \rangle\, dt.$$

Here $\delta_1 \gamma, \delta_2 \gamma \in T_\gamma P(M)$, $\mathcal{J} = -\nabla^2_{\dot\gamma} - R(\dot\gamma, \cdot\,)\dot\gamma$ is the Jacobi operator, and $R$ is a curvature operator, a fibre-wise linear mapping $R : TM \otimes TM \to \mathrm{End}(TM)$ of vector bundles, defined by $R(\xi, \eta) = \nabla_\eta \nabla_\xi - \nabla_\xi \nabla_\eta + \nabla_{[\xi, \eta]} : TM \to TM$, where $\xi, \eta \in \mathrm{Vect}(M)$.

LECTURE 2

Integrals of motion and Noether’s theorem

To describe the motion of a mechanical system one needs to solve the Euler-Lagrange equations, a system of second order ordinary differential equations for the generalized coordinates. This could be a very difficult problem. Therefore of particular interest are those functions of generalized coordinates and velocities which remain constant during the motion.

Definition. A smooth function $I : TM \to \mathbb{R}$ is called an integral of motion (first integral, or conservation law) for a Lagrangian system $(M, L)$ if

$$\frac{d}{dt} I(\gamma'(t)) = 0$$

for all extremals $\gamma$ of the action functional.

2.1. Conservation of energy


Definition. The energy of a Lagrangian system $(M, L)$ is a function $E$ on $TM \times \mathbb{R}$, defined in standard coordinates on $TM$ by
\[
E(q, \dot q, t) = \sum_{i=1}^n \dot q^i \frac{\partial L}{\partial \dot q^i}(q, \dot q, t) - L(q, \dot q, t).
\]

Lemma 2.1. The energy $E = \frac{\partial L}{\partial \dot q}\dot q - L$ is a well-defined function on $TM \times \mathbb{R}$.

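Proof. Let $(U, \varphi)$ and $(U', \varphi')$ be coordinate charts on $M$ with the transition functions $F = (F^1, \dots, F^n) = \varphi' \circ \varphi^{-1} : \varphi(U \cap U') \to \varphi'(U \cap U')$. Corresponding standard coordinates $(q, \dot q)$ and $(q', \dot q')$ are related by $q' = F(q)$ and $\dot q' = F_*(q)\dot q$ (see formula (1.3) in Lecture 1). We have
\[
dq' = F_*(q)\,dq \quad\text{and}\quad d\dot q' = G(q, \dot q)\,dq + F_*(q)\,d\dot q,
\]
where
\[
G(q, \dot q) = \left\{ \sum_{k=1}^n \frac{\partial^2 F^i}{\partial q^j \partial q^k}\,\dot q^k \right\}_{i,j=1}^n,
\]
so that
\begin{align*}
dL &= \frac{\partial L}{\partial q'}\,dq' + \frac{\partial L}{\partial \dot q'}\,d\dot q' + \frac{\partial L}{\partial t}\,dt \\
&= \left( \frac{\partial L}{\partial q'} F_*(q) + \frac{\partial L}{\partial \dot q'} G(q, \dot q) \right) dq + \frac{\partial L}{\partial \dot q'} F_*(q)\,d\dot q + \frac{\partial L}{\partial t}\,dt \\
&= \frac{\partial L}{\partial q}\,dq + \frac{\partial L}{\partial \dot q}\,d\dot q + \frac{\partial L}{\partial t}\,dt.
\end{align*}
Thus under a change of coordinates
\[
\frac{\partial L}{\partial \dot q'} F_*(q) = \frac{\partial L}{\partial \dot q} \quad\text{and}\quad \dot q' \frac{\partial L}{\partial \dot q'} = \dot q \frac{\partial L}{\partial \dot q},
\]
so that $E$ is a well-defined function on $TM$. □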
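Corollary 2.1. Under a change of local coordinates on $M$, components of a vector $\frac{\partial L}{\partial \dot q}(q, \dot q, t) = \left( \frac{\partial L}{\partial \dot q^1}, \dots, \frac{\partial L}{\partial \dot q^n} \right)$ transform like components of a 1-form on $M$. In classical terminology, $\frac{\partial L}{\partial \dot q}$ is a covariant vector.
Let $\theta_L$ be a 1-form on $TM$, defined in standard coordinates associated with a coordinate chart $U \subset M$ by
\[
\theta_L = \sum_{i=1}^n \frac{\partial L}{\partial \dot q^i}\,dq^i = \frac{\partial L}{\partial \dot q}\,dq. \tag{2.1}
\]
It follows from Corollary 2.1 that $\theta_L$ is a well-defined 1-form on $TM$.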
Proposition 2.1 (Conservation of energy). The energy of a closed system
is an integral of motion.
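Proof. For an extremal $\gamma$ put $E(t) = E(\gamma'(t))$. We have, according to the Euler-Lagrange equations,
\begin{align*}
\frac{dE}{dt} &= \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q} \right) \dot q + \frac{\partial L}{\partial \dot q}\,\ddot q - \frac{\partial L}{\partial q}\,\dot q - \frac{\partial L}{\partial \dot q}\,\ddot q - \frac{\partial L}{\partial t} \\
&= \left( \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q} \right) - \frac{\partial L}{\partial q} \right) \dot q - \frac{\partial L}{\partial t} = -\frac{\partial L}{\partial t}.
\end{align*}
Since for a closed system $\frac{\partial L}{\partial t} = 0$, the energy is conserved. □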
Conservation of energy for a closed mechanical system is a fundamental law
of physics, which follows from the homogeneity of time. For a general closed
system of N interacting particles considered in Example 1.2,
\[
E = \sum_{a=1}^N m_a \dot{\mathbf r}_a^2 - L = \sum_{a=1}^N \tfrac{1}{2} m_a \dot{\mathbf r}_a^2 + V(\mathbf r).
\]

In other words, the total energy E = T + V is a sum of the kinetic energy and
the potential energy.

2.2. Noether theorem


Definition. A Lagrangian $L : TM \to \mathbb{R}$ is invariant with respect to the diffeomorphism $g : M \to M$ if $L(g_*(v)) = L(v)$ for all $v \in TM$. The diffeomorphism $g$ is called a symmetry of a closed Lagrangian system $(M, L)$. A Lie group $G$ is the symmetry group of $(M, L)$ (group of continuous symmetries), if there is a left $G$-action on $M$ such that for every $g \in G$ the mapping $M \ni x \mapsto g \cdot x \in M$ is a symmetry.
Continuous symmetries give rise to conservation laws.
Theorem 2.2 (Noether). Suppose that a Lagrangian $L : TM \to \mathbb{R}$ is invariant under a one-parameter group $\{g_s\}_{s \in \mathbb{R}}$ of diffeomorphisms of $M$. Then the Lagrangian system $(M, L)$ admits an integral of motion $I$, given in standard coordinates on $TM$ by
\[
I(q, \dot q) = \sum_{i=1}^n \frac{\partial L}{\partial \dot q^i}(q, \dot q) \left( \left. \frac{d g_s^i(q)}{ds} \right|_{s=0} \right) = \frac{\partial L}{\partial \dot q}\,a,
\]
where $X = \sum_{i=1}^n a^i(q) \frac{\partial}{\partial q^i}$ is the vector field on $M$ associated with the flow $g_s$. The integral of motion $I$ is called the Noether integral.
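Proof. It follows from Corollary 2.1 that $I$ is a well-defined function on $TM$. Now differentiating $L((g_s)_*(\gamma'(t))) = L(\gamma'(t))$ with respect to $s$ at $s = 0$ and using the Euler-Lagrange equations, we get
\[
0 = \frac{\partial L}{\partial q}\,a + \frac{\partial L}{\partial \dot q}\,\dot a = \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q} \right) a + \frac{\partial L}{\partial \dot q}\,\frac{da}{dt} = \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q}\,a \right),
\]
where $a(t) = \left( a^1(\gamma(t)), \dots, a^n(\gamma(t)) \right)$. □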
Remark. A vector field X on M is called an infinitesimal symmetry, if the
corresponding “time s” local flow gs of X (defined for each s ∈ R on some
Us ⊆ M as a diffeomorphism gs : Us → U−s ) is a symmetry: L ◦ (gs )∗ = L on
Us . Every vector field X on M lifts to a vector field X 0 on T M , defined by a
local flow on T M , induced from the corresponding local flow on M . In standard
coordinates on T M ,
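\[
X = \sum_{i=1}^n a^i(q) \frac{\partial}{\partial q^i},
\]
and the corresponding local flow on $M$ is given by
\[
\frac{dq^i}{ds} = a^i(q),
\]
and induces the local flow on $TM$,
\[
\frac{d\dot q^i}{ds} = \sum_{j=1}^n \dot q^j \frac{\partial a^i}{\partial q^j}(q), \quad i = 1, \dots, n.
\]
Thus
\[
X' = \sum_{i=1}^n a^i(q) \frac{\partial}{\partial q^i} + \sum_{i,j=1}^n \dot q^j \frac{\partial a^i}{\partial q^j}(q) \frac{\partial}{\partial \dot q^i}, \tag{2.2}
\]
and for every path $\gamma$ in $M$,
\[
dL(X')(\gamma'(t)) = \frac{\partial L}{\partial q}\,a + \frac{\partial L}{\partial \dot q}\,\dot a.
\]
It is easy to verify that $X$ is an infinitesimal symmetry if and only if $dL(X') = 0$ on $TM$, and $I(q, \dot q) = \frac{\partial L}{\partial \dot q}\,a$ is an integral of motion.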
Remark. Using the 1-form θL , the Noether integral I in Theorem 2.2 can
be written as

(2.3) I = θL (X 0 ).

Remark. Noether's theorem generalizes to time-dependent Lagrangians $L : TM \times \mathbb{R} \to \mathbb{R}$. Namely, on the extended configuration space $M_1 = M \times \mathbb{R}$ define a time-independent Lagrangian $L_1$ by
\[
L_1(q, \tau, \dot q, \dot\tau) = L\left( q, \frac{\dot q}{\dot\tau}, \tau \right) \dot\tau,
\]
where $(q, \tau)$ are local coordinates on $M_1$ and $(q, \tau, \dot q, \dot\tau)$ are standard coordinates on $TM_1$. The Noether integral $I_1$ for a closed system $(M_1, L_1)$ defines an integral of motion $I$ for a system $(M, L)$ by the formula
\[
I(q, \dot q, t) = I_1(q, t, \dot q, 1).
\]
When the Lagrangian $L$ does not depend on time, $L_1$ is invariant with respect to the one-parameter group of translations $\tau \mapsto \tau + s$, and the Noether integral $I_1 = \frac{\partial L_1}{\partial \dot\tau}$ gives $I = -E$.
Noether’s theorem can be generalized further as follows.
Proposition 2.2. Suppose that for a given Lagrangian L : T M → R there
exist a vector field X on M and a function K on T M , such that for every path
γ in M ,
d
dL(X 0 )(γ 0 (t)) = K(γ 0 (t)).
dt
Then
n
X ∂L
I= ai (q) i (q, q̇) − K(q, q̇)
i=1
∂ q̇

is an integral of motion for the Lagrangian system (M, L).



Proof. Using Euler-Lagrange equations, we have along the extremal γ,


 
d ∂L ∂L ∂L dK
a = a+ ȧ = . 
dt ∂ q̇ ∂q ∂ q̇ dt

For a closed, non-degenerate Lagrangian system (M, L) this result can be


generalized further by allowing coefficients ai (q) of the vector field X to depend
also on q̇. Namely, rewrite Euler-Lagrange equations as in (1.7), and consider
a vector field X̃ on T M , given in the standard coordinates by
n n  i
∂ai

X
i ∂ X
j ∂a j ∂
(2.4) X̃ = a (q, q̇) i + q̇ j
(q, q̇) + F (q, q̇) j (q, q̇) .
i=1
∂q i,j=1
∂q ∂ q̇ ∂ q̇ i

Proposition 2.3. Suppose that for a closed, non-degenerate Lagrangian L


there exist a vector field X̃ on T M of the form (2.4), and a function K on T M ,
such that for every path γ in M ,
d
(2.5) dL(X̃)(γ 0 (t)) = K(γ 0 (t)).
dt
Then
n
X ∂L
I= ai (q, q̇) (q, q̇) − K(q, q̇)
i=1
∂ q̇ i

is an integral of motion.

Proof. Along the extremal γ(t),

dI ∂L ∂L dK d
= a+ ȧ − = dL(X̃)(γ 0 (t)) − K(γ 0 (t)) = 0. 
dt ∂q ∂ q̇ dt dt
2.3. Examples of conservation laws
Example 2.1 (Conservation of momentum). Let M = V be a vector space,
and suppose that a Lagrangian L is invariant with respect to a one-parameter
group gs (q) = q + sv, v ∈ V . According to Noether’s theorem,
n
X ∂L
I= vi
i=1
∂ q̇ i

is an integral of motion. Now let (M, L) be closed Lagrangian system of N


interacting particles, considered in Example 1.2. We have M = V = R3N , and
the Lagrangian L is invariant under a simultaneous translation of coordinates
ra = (ra1 , ra2 , ra3 ) of all particles by the same vector c ∈ R3 . Thus v = (c, . . . , c) ∈
R3N , and for every c = (c1 , c2 , c3 ) ∈ R3 ,
N  
1 ∂L 2 ∂L 3 ∂L
X
I= c 1
+c 2
+c 3
= c1 P1 + c2 P2 + c3 P3
a=1
∂ ṙa ∂ ṙa ∂ ṙa

is an integral of motion. The integrals of motion P1 , P2 , P3 define the vector


N
X ∂L
P = ∈ R3
a=1
∂ ṙa

(or rather a vector in a dual space to R3 ), called the momentum of a system.


Explicitly,
XN
P = ma ṙa ,
a=1
so that the total momentum of a closed system is a sum of momenta of individ-
ual particles. Conservation of momentum is a fundamental physical law which
reflects the homogeneity of space.
Traditionally, $p_i = \frac{\partial L}{\partial \dot q^i}$ are called generalized momenta corresponding to generalized coordinates $q^i$, and $F_i = \frac{\partial L}{\partial q^i}$ are called generalized forces. In this notation, the Euler-Lagrange equations have the same form
\[
\dot p = F
\]
as Newton's equations in Cartesian coordinates. Conservation of momentum implies Newton's third law.
Example 2.2 (Conservation of angular momentum). Let M = V be a vector
space with Euclidean inner product. Let G = SO(V ) be the connected Lie group
of automorphisms of V preserving the inner product, and let g = so(V ) be the
Lie algebra of G. Suppose that a Lagrangian L is invariant with respect to the
action of a one-parameter subgroup gs (q) = esx · q of G on V , where x ∈ g and
ex is the exponential map. According to Noether’s theorem,
n
X ∂L
I= (x · q)i
i=1
∂ q̇ i

is an integral of motion. Now let (M, L) be a closed Lagrangian system of N


interacting particles, considered in Example 1.2. We have M = V = R3N ,
and the Lagrangian L is invariant under a simultaneous rotation of coordinates
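$\mathbf r_a$ of all particles by the same orthogonal transformation in $\mathbb{R}^3$. Thus $x = (u, \dots, u) \in \underbrace{\mathfrak{so}(3) \oplus \cdots \oplus \mathfrak{so}(3)}_{N}$, and for every $u \in \mathfrak{so}(3)$,
\[
I = \sum_{a=1}^N \left( (u \cdot \mathbf r_a)^1 \frac{\partial L}{\partial \dot r_a^1} + (u \cdot \mathbf r_a)^2 \frac{\partial L}{\partial \dot r_a^2} + (u \cdot \mathbf r_a)^3 \frac{\partial L}{\partial \dot r_a^3} \right)
\]
is an integral of motion. Let $u = u^1 X_1 + u^2 X_2 + u^3 X_3$, where
\[
X_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad
X_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad
X_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\]
is the basis in $\mathfrak{so}(3) \simeq \mathbb{R}^3$ corresponding to the rotations about the vectors $e_1, e_2, e_3$ of the standard orthonormal basis in $\mathbb{R}^3$. Since $u \cdot \mathbf r_a = \mathbf u \times \mathbf r_a$, where $\mathbf u = (u^1, u^2, u^3)$, we have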

I = u1 M1 + u2 M2 + u3 M3 ,

where M = (M1 , M2 , M3 ) ∈ R3 (or rather a vector in a dual space to so(3)) is


given by
N
X ∂L
M= ra × .
a=1
∂ ṙa
The vector M is called the angular momentum of a system. Explicitly,
N
X
M= ra × ma ṙa ,
a=1

so that the total angular momentum of a closed system is a sum of angular


momenta of individual particles. Conservation of angular momentum is a fun-
damental physical law which reflects the isotropy of space.
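Example 2.3 (The center of mass). Let $(M, L)$ be a closed Lagrangian system of $N$ interacting particles, considered in Example 1.2. Under a simultaneous Galilean transformation (1.8) of all coordinates, $\mathbf r_a \mapsto \mathbf r_a + \mathbf v t$, and corresponding transformation of velocities $\dot{\mathbf r}_a \mapsto \dot{\mathbf r}_a + \mathbf v$, we have
\[
L \mapsto L' = L + \frac{d}{dt} \sum_{a=1}^N m_a \left( \mathbf r_a \cdot \mathbf v + \tfrac{1}{2} \mathbf v^2 t \right).
\]
Therefore for the infinitesimal Galilean transformation, that is, for the time-dependent vector field
\[
\tilde X = \sum_{a=1}^N \left( t \mathbf v \frac{\partial}{\partial \mathbf r_a} + \mathbf v \frac{\partial}{\partial \dot{\mathbf r}_a} \right),
\]
equation (2.5) holds, where the function $K$ is given by
\[
K = \sum_{a=1}^N m_a\, \mathbf r_a \cdot \mathbf v.
\]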

According to Proposition 2.3, the vector


N
X
I = tP − ma ra
a=1

is an integral of motion, I˙ = 0 on the solutions of the Euler-Lagrange equations.


This is equivalent to the statement that the center of mass of the system
N N
1 X X
R= ma ra , where M= ma
M a=1 a=1

is the total mass, moves with the constant velocity V = P /M .



Problem 2.1. Prove that a Lagrangian system (M, L) is non-degenerate if and


only if the 2-form dθL on T M is non-degenerate.
Problem 2.2 (Second tangent bundle). Let π : T M → M be the canonical
projection and let TV (T M ) be a vertical tangent bundle of T M along the fibers of π —
the kernel of the bundle mapping π∗ : T (T M ) → T M . Prove that there is a natural
bundle isomorphism i : π ∗ (T M ) ' TV (T M ), where π ∗ (T M ) → T M is the pullback of
the tangent bundle T M of M under the map π.
Problem 2.3 (Invariant definition of the 1-form θL ). Show that θL (v) =
dL((i ◦ π∗ )v), where v ∈ T (T M ).
Problem 2.4. Prove that if a vector field X on M is an infinitesimal symmetry
of the Lagrangian system (M, L), then LX 0 (θL ) = 0, where LX 0 stands for the Lie
derivative.
Problem 2.5. Prove that a path γ(t) in M is a trajectory for the Lagrangian
system (M, L) if and only if

iγ̇ 0 (t) (dθL ) + dEL (γ 0 (t)) = 0,

where γ̇ 0 (t) is the velocity vector of the path γ 0 (t) in T M .


LECTURE 3

Integration of equations of motion

A complete general solution can be obtained for three very important exam-
ples: a motion on the real line, a system of two interacting particles, including
the Kepler problem, and the rotation of a rigid body.

3.1. One-dimensional motion


The motion of systems with one degree of freedom is called one-dimensional.
In terms of a Cartesian coordinate x on M = R, the Lagrangian takes the form

\[
L = \tfrac{1}{2} m \dot x^2 - V(x).
\]
The conservation of energy
\[
E = \tfrac{1}{2} m \dot x^2 + V(x)
\]
allows one to solve the equation of motion in closed form by separation of variables. We have
\[
\frac{dx}{dt} = \sqrt{\frac{2}{m}\,(E - V(x))},
\]
so that
\[
t = \sqrt{\frac{m}{2}} \int \frac{dx}{\sqrt{E - V(x)}}.
\]
The inverse function $x(t)$ is a general solution of Newton's equation
\[
m \ddot x = -\frac{dV}{dx},
\]
with two arbitrary constants, the energy $E$ and the constant of integration.
Since kinetic energy is non-negative, for a given value of E the actual motion
takes place in the region of R where V (x) ≤ E. The points where V (x) = E are
called turning points. The motion which is confined between two turning points
is called finite. The finite motion is periodic — the particle oscillates between
the turning points x1 and x2 with the period
\[
T(E) = \sqrt{2m} \int_{x_1}^{x_2} \frac{dx}{\sqrt{E - V(x)}}.
\]
If the region V (x) ≤ E is unbounded, then the motion is called infinite and the
particle eventually goes to infinity. The regions where V (x) > E are forbidden.
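The period formula can be checked numerically. The following is a minimal sketch, not part of the original notes: the function name `period`, the substitution $x = c + h \sin\theta$ used to tame the integrable singularities at the turning points, and the sample values of $m$, $k$, $A$ are all illustrative assumptions. For the harmonic potential the quadrature should reproduce $2\pi\sqrt{m/k}$.

```python
import math

def period(V, E, x1, x2, m=1.0, n=2000):
    """Approximate T(E) = sqrt(2m) * integral_{x1}^{x2} dx / sqrt(E - V(x)).

    Assumes x1 < x2 are turning points with V(x) < E strictly between them.
    The substitution x = c + h*sin(theta) cancels the inverse-square-root
    singularity at the turning points; the midpoint rule never evaluates
    the integrand at theta = -pi/2 or pi/2.
    """
    c, h = 0.5 * (x1 + x2), 0.5 * (x2 - x1)
    dtheta = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -0.5 * math.pi + (k + 0.5) * dtheta
        x = c + h * math.sin(theta)
        total += h * math.cos(theta) / math.sqrt(E - V(x)) * dtheta
    return math.sqrt(2.0 * m) * total

# Illustrative values (assumptions): harmonic potential V(x) = k x^2 / 2.
m, k, A = 2.0, 8.0, 0.7
E = 0.5 * k * A * A                      # energy of an oscillation with amplitude A
T_numeric = period(lambda x: 0.5 * k * x * x, E, -A, A, m=m)
T_exact = 2.0 * math.pi * math.sqrt(m / k)
print(T_numeric, T_exact)
```

For a non-quadratic potential the same routine can be used once the turning points are located, for instance by bisection on $V(x) - E$.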


[Figure 1: graph of the potential energy $V(x)$ with the level $V(x) = E$ and the points $x_1$, $x_2$, $x_3$ on the $x$-axis]

Thus on Fig. 1 the motion between points x1 and x2 is periodic, and in the
region x3 ≤ x the motion is infinite; all other regions there are forbidden.
On the phase plane with coordinates (x, y) Newton’s equation reduces to the
first order system
\[
m \dot x = y, \quad \dot y = -\frac{dV}{dx}.
\]
Trajectories correspond to the phase curves $(x(t), y(t))$, which lie on the level sets
\[
\frac{y^2}{2m} + V(x) = E
\]
of the energy function. The points (x0 , 0), where x0 is a critical point of the po-
tential energy V (x), correspond to the equilibrium solutions. The local minima
correspond to the stable solutions and local maxima correspond to the unstable
solutions. For the values of E which do not correspond to the equilibrium solu-
tions the level sets are smooth curves. These curves are closed if the motion is
finite.
The simplest non-trivial one-dimensional system, besides the free particle, is the harmonic oscillator with $V(x) = \tfrac{1}{2} k x^2$ ($k > 0$), considered in Example 1.4. The general solution of the equation of motion is
\[
x(t) = A \cos(\omega t + \alpha),
\]
where $A$ is the amplitude, $\omega = \sqrt{\frac{k}{m}}$ is the frequency, and $\alpha$ is the phase of a simple harmonic motion with the period $T = \frac{2\pi}{\omega}$. The energy is $E = \tfrac{1}{2} m \omega^2 A^2$ and the motion is finite with the same period $T$ for $E > 0$.

3.2. Two-body problem


The motion of a system of two interacting particles — the two-body problem
— can also be solved completely. Namely, in this case (see Example 1.2) M = R6
and
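\[
L = \frac{m_1 \dot{\mathbf r}_1^2}{2} + \frac{m_2 \dot{\mathbf r}_2^2}{2} - V(|\mathbf r_1 - \mathbf r_2|).
\]
Introducing on $\mathbb{R}^6$ new coordinates
\[
\mathbf r = \mathbf r_1 - \mathbf r_2 \quad\text{and}\quad \mathbf R = \frac{m_1 \mathbf r_1 + m_2 \mathbf r_2}{m_1 + m_2},
\]
we get
\[
L = \tfrac{1}{2} m \dot{\mathbf R}^2 + \tfrac{1}{2} \mu \dot{\mathbf r}^2 - V(|\mathbf r|),
\]
where $m = m_1 + m_2$ is the total mass and $\mu = \frac{m_1 m_2}{m_1 + m_2}$ is the reduced mass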
of a two-body system. The Lagrangian L depends only on the velocity Ṙ of
the center of mass and not on its position R. A generalized coordinate with
this property is called cyclic. It follows from the Euler-Lagrange equations that
generalized momentum corresponding to the cyclic coordinate is conserved. In
our case it is a total momentum of the system,
∂L
P = = mṘ,
∂ Ṙ
so that the center of mass R moves uniformly. Thus in the reference frame
R = 0 the two-body problem reduces to the problem of a single particle of mass
µ in the external central field V (|r|).
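The splitting of the kinetic energy into a center-of-mass part and a relative part can be verified on sample data. This is a small sketch under assumed masses and velocities (the numbers and variable names are illustrative, not from the notes).

```python
def dot(a, b):
    """Euclidean inner product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

# Illustrative masses and velocities (assumptions).
m1, m2 = 3.0, 5.0
v1 = (1.0, -2.0, 0.5)
v2 = (-0.3, 0.4, 2.0)

m_total = m1 + m2                     # m = m1 + m2
mu = m1 * m2 / (m1 + m2)              # reduced mass
V_cm = tuple((m1 * a + m2 * b) / m_total for a, b in zip(v1, v2))  # dR/dt
v_rel = tuple(a - b for a, b in zip(v1, v2))                       # dr/dt

# Kinetic energy in particle coordinates and in (R, r) coordinates.
T_particles = 0.5 * m1 * dot(v1, v1) + 0.5 * m2 * dot(v2, v2)
T_split = 0.5 * m_total * dot(V_cm, V_cm) + 0.5 * mu * dot(v_rel, v_rel)
print(T_particles, T_split)
```

Since the potential depends only on $\mathbf r_1 - \mathbf r_2$, agreement of the two kinetic energies is all that the change of variables requires.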
It follows from the conservation of angular momentum M = µr × ṙ that
during the motion position vector r lies in the plane P orthogonal to M in R3 .
Choosing the z-axis along M , the plane P becomes the xy-plane and in polar
coordinates
x = r cos ϕ, y = r sin ϕ
the Lagrangian takes the form

L = 21 µ(ṙ2 + r2 ϕ̇2 ) − V (r).


The coordinate ϕ is cyclic and its generalized momentum µr2 ϕ̇ coincides with
|M | if ϕ̇ > 0 and with −|M | if ϕ̇ < 0. Denoting this quantity by M , we get
the equation
(3.1) µr2 ϕ̇ = M,
which is equivalent to Kepler's second law¹. Using (3.1) we get for the total energy
\[
E = \tfrac{1}{2} \mu (\dot r^2 + r^2 \dot\varphi^2) + V(r) = \tfrac{1}{2} \mu \dot r^2 + V(r) + \frac{M^2}{2\mu r^2}. \tag{3.2}
\]
¹ It is the statement that sectorial velocity of a particle in a central field is constant.

Thus the radial motion reduces to a one-dimensional motion on the half-line $r > 0$ with the effective potential energy
\[
V_{\mathrm{eff}}(r) = V(r) + \frac{M^2}{2\mu r^2},
\]
where the second term is called the centrifugal energy. As in the previous section, the solution is given by
\[
t = \sqrt{\frac{\mu}{2}} \int \frac{dr}{\sqrt{E - V_{\mathrm{eff}}(r)}}. \tag{3.3}
\]
It follows from (3.1) that the angle $\varphi$ is a monotonic function of $t$, given by another quadrature
\[
\varphi = \frac{M}{\sqrt{2\mu}} \int \frac{dr}{r^2 \sqrt{E - V_{\mathrm{eff}}(r)}}, \tag{3.4}
\]
yielding an equation of the trajectory in polar coordinates.
The set Veff (r) ≤ E is a union of annuli 0 ≤ rmin ≤ r ≤ rmax ≤ ∞, and the
motion is finite if 0 < rmin ≤ r ≤ rmax < ∞. Though for a finite motion r(t)
oscillates between rmin and rmax , corresponding trajectories are not necessarily
closed. The necessary and sufficient condition for a finite motion to have a
closed trajectory is that the angle
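\[
\Delta\varphi = \frac{M}{\sqrt{2\mu}} \int_{r_{\min}}^{r_{\max}} \frac{dr}{r^2 \sqrt{E - V_{\mathrm{eff}}(r)}}
\]
is commensurable with $2\pi$, i.e., $\Delta\varphi = 2\pi \frac{m}{n}$ for some $m, n \in \mathbb{Z}$. If the angle $\Delta\varphi$ is not commensurable with $2\pi$, the orbit is everywhere dense in the annulus $r_{\min} \le r \le r_{\max}$. If
\[
\lim_{r \to \infty} V_{\mathrm{eff}}(r) = \lim_{r \to \infty} V(r) = V < \infty,
\]
the motion is infinite for $E > V$: the particle goes to $\infty$ with finite velocity $\sqrt{\frac{2}{\mu}(E - V)}$.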

3.3. Kepler problem


A very important special case is when
\[
V(r) = -\frac{\alpha}{r}.
\]
It describes Newton’s gravitational attraction (α > 0) and Coulomb electrostatic
interaction (either attractive or repulsive). First consider the case when α > 0
— Kepler’s problem. The effective potential energy is
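\[
V_{\mathrm{eff}}(r) = -\frac{\alpha}{r} + \frac{M^2}{2\mu r^2}
\]
and has the global minimum
\[
V_0 = -\frac{\alpha^2 \mu}{2M^2}
\]
at $r_0 = \frac{M^2}{\alpha\mu}$ (see Fig. 2).

[Figure 2: graph of $V_{\mathrm{eff}}(r)$, with the minimum value $V_0$ attained at $r = r_0$]

The motion is infinite for $E \ge 0$ and is finite for $V_0 \le E < 0$. Since
\[
2\mu (E - V_{\mathrm{eff}}(r)) = 2\mu (E - V_0) - M^2 \left( \frac{1}{r} - \frac{1}{r_0} \right)^2,
\]
elementary integration in (3.4) gives
\[
\varphi = \cos^{-1} \frac{\dfrac{M}{r} - \dfrac{M}{r_0}}{\sqrt{2\mu (E - V_0)}} + C,
\]
which allows one to determine the explicit form of trajectories.
Namely, choosing a constant of integration $C = 0$ and introducing the notation
\[
p = r_0 \quad\text{and}\quad e = \sqrt{1 - \frac{E}{V_0}},
\]
we get the equation of the orbit (trajectory)
\[
\frac{p}{r} = 1 + e \cos\varphi. \tag{3.5}
\]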

This is the equation of a conic section with one focus at the origin. Quantity
2p is called the latus rectum of the orbit, and e is called the eccentricity. The
choice C = 0 is such that the point with ϕ = 0 is the point nearest to the origin
(called the perihelion). When V0 ≤ E < 0, the eccentricity e < 1 so that the
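orbit is the ellipse² with the major and minor semi-axes
\[
a = \frac{p}{1 - e^2} = \frac{\alpha}{2|E|}, \quad b = \frac{p}{\sqrt{1 - e^2}} = \frac{|M|}{\sqrt{2\mu |E|}}. \tag{3.6}
\]
Correspondingly, $r_{\min} = \frac{p}{1 + e}$, $r_{\max} = \frac{p}{1 - e}$, and the period $T$ of elliptic orbit is given by
\[
T = \pi\alpha \sqrt{\frac{\mu}{2|E|^3}}.
\]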
The last formula is Kepler’s third law. When E > 0, the eccentricity e > 1
and the motion is infinite — the orbit is a hyperbola with the origin as internal
focus. When E = 0, the eccentricity e = 1 — the particle starts from rest at ∞
and the orbit is a parabola.
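The relations between $E$, $M$ and the geometric parameters of a bound orbit can be cross-checked numerically. The sketch below is illustrative only: the helper name `kepler_elements` and the sample values $E = -0.4$, $M = 0.9$, $\alpha = \mu = 1$ are assumptions. It evaluates the formulas of this section and compares the period with the value $2\pi a b \mu / |M|$ that follows from Kepler's second law (area of the ellipse divided by the sectorial velocity $|M|/2\mu$).

```python
import math

def kepler_elements(E, M, alpha=1.0, mu=1.0):
    """Orbit parameters for V(r) = -alpha/r, alpha > 0, with V0 <= E < 0."""
    r0 = M * M / (alpha * mu)                     # position of the minimum of V_eff
    V0 = -alpha * alpha * mu / (2.0 * M * M)      # minimum value of V_eff
    p = r0                                        # semi-latus rectum
    e = math.sqrt(1.0 - E / V0)                   # eccentricity
    return {
        "p": p,
        "e": e,
        "a": p / (1.0 - e * e),                   # major semi-axis
        "b": p / math.sqrt(1.0 - e * e),          # minor semi-axis
        "r_min": p / (1.0 + e),
        "r_max": p / (1.0 - e),
        "T": math.pi * alpha * math.sqrt(mu / (2.0 * abs(E) ** 3)),
    }

# Illustrative values (assumptions): E = -0.4, M = 0.9, alpha = mu = 1.
E, M, alpha, mu = -0.4, 0.9, 1.0, 1.0
el = kepler_elements(E, M, alpha, mu)
T_from_area = 2.0 * math.pi * el["a"] * el["b"] * mu / abs(M)
print(el, T_from_area)
```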
For the repulsive case α < 0 the effective potential energy Veff (r) is always
positive and decreases monotonically from ∞ to 0. The motion is always infinite
and the trajectories are hyperbolas (parabola if E = 0)
\[
\frac{p}{r} = -1 + e \cos\varphi
\]
with
\[
p = \frac{M^2}{|\alpha| \mu} \quad\text{and}\quad e = \sqrt{1 + \frac{2EM^2}{\mu\alpha^2}}.
\]
Kepler’s problem is very special: for every α ∈ R the Lagrangian system on
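$\mathbb{R}^3$ with
\[
L = \tfrac{1}{2} \mu \dot{\mathbf r}^2 + \frac{\alpha}{r} \tag{3.7}
\]
has three extra integrals of motion $W_1, W_2, W_3$ in addition to the components of the angular momentum $\mathbf M$. The corresponding vector $\mathbf W = (W_1, W_2, W_3)$, called the Laplace-Runge-Lenz vector, is given by
\[
\mathbf W = \dot{\mathbf r} \times \mathbf M - \frac{\alpha \mathbf r}{r}. \tag{3.8}
\]
Indeed, using equations of motion $\mu \ddot{\mathbf r} = -\frac{\alpha \mathbf r}{r^3}$ and conservation of the angular momentum $\mathbf M = \mu\, \mathbf r \times \dot{\mathbf r}$, we get
\begin{align*}
\dot{\mathbf W} &= \mu \ddot{\mathbf r} \times (\mathbf r \times \dot{\mathbf r}) - \frac{\alpha \dot{\mathbf r}}{r} + \frac{\alpha (\dot{\mathbf r} \cdot \mathbf r)\,\mathbf r}{r^3} \\
&= (\mu \ddot{\mathbf r} \cdot \dot{\mathbf r})\,\mathbf r - (\mu \ddot{\mathbf r} \cdot \mathbf r)\,\dot{\mathbf r} - \frac{\alpha \dot{\mathbf r}}{r} + \frac{\alpha (\dot{\mathbf r} \cdot \mathbf r)\,\mathbf r}{r^3} \\
&= 0.
\end{align*}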

2The statement that planets have elliptic orbits with a focus at the Sun is Kepler’s first
law.

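Using $\mu (\dot{\mathbf r} \times \mathbf M) \cdot \mathbf r = M^2$ and the identity $(\mathbf a \times \mathbf b)^2 = a^2 b^2 - (\mathbf a \cdot \mathbf b)^2$, we get
\[
W^2 = \alpha^2 + \frac{2M^2 E}{\mu}, \tag{3.9}
\]
where, with $\mathbf p = \mu \dot{\mathbf r}$ denoting the momentum,
\[
E = \frac{\mathbf p^2}{2\mu} - \frac{\alpha}{r}
\]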
is the energy corresponding to the Lagrangian (3.7). The fact that all orbits are
conic sections follows from this extra symmetry of the Kepler problem.
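Conservation of the Laplace-Runge-Lenz vector and relation (3.9) can be observed along a numerically integrated orbit. The following sketch is an illustration under stated assumptions: the classical fourth-order Runge-Kutta scheme is a choice made here (the notes do not prescribe a numerical method), and $\alpha = \mu = 1$, the step size, and the initial data are sample values.

```python
import math

ALPHA, MU = 1.0, 1.0   # illustrative values of alpha and mu (assumptions)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rhs(state):
    """state = (r, v) as a 6-tuple; returns (v, -alpha r / (mu |r|^3))."""
    r, v = state[:3], state[3:]
    d = math.sqrt(dot(r, r))
    acc = tuple(-ALPHA * x / (MU * d ** 3) for x in r)
    return v + acc

def rk4_step(state, dt):
    """One step of the classical Runge-Kutta method."""
    def shifted(k, c):
        return tuple(s + c * x for s, x in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, dt / 2))
    k3 = rhs(shifted(k2, dt / 2))
    k4 = rhs(shifted(k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def laplace_runge_lenz(state):
    """W = v x M - alpha r / |r| with M = mu r x v, as in (3.8)."""
    r, v = state[:3], state[3:]
    M = tuple(MU * c for c in cross(r, v))
    d = math.sqrt(dot(r, r))
    return tuple(x - ALPHA * y / d for x, y in zip(cross(v, M), r))

def energy(state):
    r, v = state[:3], state[3:]
    return 0.5 * MU * dot(v, v) - ALPHA / math.sqrt(dot(r, r))

state = (1.0, 0.0, 0.0, 0.0, 1.2, 0.0)          # bound orbit, E = -0.28
W0, E0 = laplace_runge_lenz(state), energy(state)
M_vec = cross(state[:3], state[3:])
M2 = MU * MU * dot(M_vec, M_vec)
for _ in range(20000):                           # slightly more than one period
    state = rk4_step(state, 1e-3)
W1 = laplace_runge_lenz(state)
drift = max(abs(a - b) for a, b in zip(W0, W1))
print(W0, W1, drift)
```

With these initial data the orbit starts at the perihelion, so $\mathbf W$ points along the first coordinate axis, in agreement with Problem 3.8.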

3.4. The motion of a rigid body


The configuration space of a rigid body in R3 with a fixed point is a Lie
group G = SO(3) of orientation preserving orthogonal linear transformations
in R3 . Every left-invariant Riemannian metric h , i on G defines a Lagrangian
L : T G → R by
L(v) = 12 hv, vi, v ∈ T G.
According to Example 1.5, equations of motion of a rigid body are geodesic
equations on G with respect to the Riemannian metric h , i. Let g = so(3) be
the Lie algebra of G. A velocity vector ġ ∈ Tg G determines the angular velocity
of the body Ω = (Lg−1 )∗ ġ ∈ g, where Lg : G → G are left translations on G. In
terms of angular velocity, the Lagrangian takes the form

(3.10) L = 21 hΩ, Ωie ,

where h , ie is an inner product on g = Te G given by the Riemannian metric


h , i.
Let
B(x, y) = − 12 Tr xy
be the Killing form on the Lie algebra g = so(3) — the Lie algebra of 3 × 3
skew-symmetric matrices. It determines ad g-invariant inner product on g,

B([x, z], y) + B(x, [y, z]) = 0

for all x, y, z ∈ g. Thus we have hΩ, Ωie = B(A·Ω, Ω) for some symmetric linear
operator A : g → g, which is positive-definite with respect to the Killing form.
Such linear operator A is called the inertia tensor of the body, and Lagrangian
(3.10) takes the form

(3.11) L = 12 B(A · Ω, Ω).

Now we are ready to derive equations of motion for Lagrangian (3.11). Sim-
ilar to Sect. 1.2, for a path g : [t0 , t1 ] → G, consider the family

g(t, ε) = g(t) exp{εu(t)}, where u : [t0 , t1 ] → g, u(t0 ) = u(t1 ) = 0,



and exp : g → G is the exponential map. We have


∂g(t, ε)
δg(t) = = (Lg )∗ u(t) ∈ Tg(t) G and u(t) = (Lg(t)−1 )∗ δg(t) ∈ g.
∂ε ε=0

The corresponding angular velocity Ω(t, ε) = g −1 (t, ε)ġ(t, ε) ∈ g takes the form
Ω(t, ε) = Adhε (t) Ω(t) + εu̇(t), where hε (t) = exp{−εu(t)}, Ω(t) = Ω(t, 0),
and Adg stands for the adjoint action of G on g. Thus for the infinitesimal
variation
∂Ω(t, ε)
δΩ(t) = ∈g
∂ε ε=0
we readily obtain
(3.12) δΩ = u̇ + [Ω, u].
As in Sect. 1.2 in Lecture 1, consider the action functional
1 t1
Z
S(g, ġ) = B(A · Ω(t), Ω(t))dt.
2 t0
Using the symmetry of the operator A we obtain
1 t1
Z Z t1
δS = (B(A · δΩ(t), Ω(t)) + B(A · Ω(t), δΩ(t))) dt = B(A·Ω(t), δΩ(t))dt,
2 t0 t0

and using (3.12), ad g-invariance of the Killing form and integration by parts,
we get
Z t1
δS = B(A · Ω(t), u̇(t) + [Ω(t), u(t)])dt
t0
Z t1
= B(−A · Ω̇(t) + [A · Ω(t), Ω(t)], u(t))dt.
t0

Since the Killing form is non-degenerate and $u(t)$ is an arbitrary smooth $\mathfrak g$-valued function with $u(t_0) = u(t_1) = 0$, from $\delta S = 0$ we obtain the following equations of motion
\[
A \cdot \dot\Omega = [A \cdot \Omega, \Omega]. \tag{3.13}
\]
Remark. Our derivation of the equations of motion (3.13), which are called Euler equations, is valid for any compact Lie group $G$. In general, for Lagrangian (3.10) we obtain the following equations of motion,

A · Ω̇ = ad∗Ω (A · Ω),
where ad∗Ω is the adjoint of the operator adΩ on g with respect to the inner
product h , ie . These equations are called Euler-Arnold equations for the
geodesics of a left-invariant Riemannian metric on a Lie group G, finite or
infinite-dimensional.

Returning to the case G = SO(3), the principal axes of inertia of the body
are orthonormal eigenvectors e1 , e2 , e3 of A; corresponding eigenvalues I1 , I2 , I3
are called the principal moments of inertia. Choosing the principal axes of
inertia as a basis in g and setting Ω = Ω1 e1 + Ω2 e2 + Ω3 e3 , we get the Lie
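algebra isomorphism $\mathfrak g \simeq \mathbb{R}^3$,
\[
\mathfrak g \ni \Omega = \begin{pmatrix} 0 & -\Omega_3 & \Omega_2 \\ \Omega_3 & 0 & -\Omega_1 \\ -\Omega_2 & \Omega_1 & 0 \end{pmatrix} \mapsto (\Omega_1, \Omega_2, \Omega_3) \in \mathbb{R}^3,
\]
where the Lie bracket in $\mathbb{R}^3$ is given by the cross-product (see Example 2.2). Indeed, for the matrices
\[
a = \begin{pmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{pmatrix} \quad\text{and}\quad b = \begin{pmatrix} 0 & -b_3 & b_2 \\ b_3 & 0 & -b_1 \\ -b_2 & b_1 & 0 \end{pmatrix}
\]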

corresponding to the vectors a = (a1 , a2 , a3 ) and b = (b1 , b2 , b3 ) we have

[a, b] = c,

where c corresponds to the vector c = a × b. Moreover,

B(a, b) = a · b.

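It is easy to see that if $A = \mathrm{diag}(I_1, I_2, I_3)$, then
\[
A \cdot \Omega = \mathbf{A}\Omega + \Omega\mathbf{A},
\]
where $\mathbf{A} = \mathrm{diag}(l_1, l_2, l_3)$ is a diagonal $3 \times 3$ matrix and
\[
l_1 = \frac{I_2 + I_3 - I_1}{2}, \quad l_2 = \frac{I_1 + I_3 - I_2}{2}, \quad l_3 = \frac{I_1 + I_2 - I_3}{2}.
\]
Thus
\[
[A \cdot \Omega, \Omega] = \mathbf{A}\Omega^2 - \Omega^2 \mathbf{A}
\]
and (3.13) become celebrated Euler's equations for rotation of a free rigid body around a fixed point,
\begin{align*}
I_1 \dot\Omega_1 &= (I_2 - I_3)\,\Omega_2 \Omega_3, \\
I_2 \dot\Omega_2 &= (I_3 - I_1)\,\Omega_1 \Omega_3, \\
I_3 \dot\Omega_3 &= (I_1 - I_2)\,\Omega_1 \Omega_2
\end{align*}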

— the system of first order differential equations. Finally, the position g(t) of a
rigid body is determined from the first order linear matrix differential equation,

ġ = gΩ.

It is easy to see by direct computation that Euler's equations have two integrals of motion, the total kinetic energy
\[
T = \tfrac{1}{2} \left( I_1 \Omega_1^2 + I_2 \Omega_2^2 + I_3 \Omega_3^2 \right)
\]
and the square of the total angular momentum
\[
M^2 = I_1^2 \Omega_1^2 + I_2^2 \Omega_2^2 + I_3^2 \Omega_3^2.
\]
Leaving aside the trivial case $I_1 = I_2 = I_3$, we conclude that the motion in $\mathbb{R}^3$ is constrained to the intersection of two quadrics, which is a real form of an elliptic curve.
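The two quadratic integrals can be watched along a numerical solution of Euler's equations. The sketch below is an illustration only: the principal moments $I_1, I_2, I_3 = 1, 2, 3$, the initial angular velocity, and the use of the classical Runge-Kutta scheme are assumptions made for the example. It monitors the quadratic forms $\sum_i I_i \Omega_i^2$ and $\sum_i I_i^2 \Omega_i^2$.

```python
I1, I2, I3 = 1.0, 2.0, 3.0   # illustrative principal moments of inertia (assumptions)

def euler_rhs(w):
    """Right-hand side of Euler's equations solved for dOmega/dt."""
    w1, w2, w3 = w
    return ((I2 - I3) * w2 * w3 / I1,
            (I3 - I1) * w1 * w3 / I2,
            (I1 - I2) * w1 * w2 / I3)

def rk4_step(w, dt):
    """One step of the classical Runge-Kutta method."""
    def shifted(k, c):
        return tuple(x + c * y for x, y in zip(w, k))
    k1 = euler_rhs(w)
    k2 = euler_rhs(shifted(k1, dt / 2))
    k3 = euler_rhs(shifted(k2, dt / 2))
    k4 = euler_rhs(shifted(k3, dt))
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(w, k1, k2, k3, k4))

def quadrics(w):
    """Returns (sum I_i w_i^2, sum I_i^2 w_i^2)."""
    w1, w2, w3 = w
    return (I1 * w1 ** 2 + I2 * w2 ** 2 + I3 * w3 ** 2,
            I1 ** 2 * w1 ** 2 + I2 ** 2 * w2 ** 2 + I3 ** 2 * w3 ** 2)

w = (0.3, 1.0, 0.2)          # rotation close to the intermediate axis
q_start = quadrics(w)
for _ in range(10000):
    w = rk4_step(w, 1e-3)
q_end = quadrics(w)
print(q_start, q_end)
```

The trajectory $(\Omega_1(t), \Omega_2(t), \Omega_3(t))$ therefore stays, up to integration error, on the intersection of the two ellipsoids fixed by the initial data.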
Problem 3.1. Prove all statements in Sect. 3.2.
Problem 3.2. Show that if
lim Veff (r) = −∞,
r→0

then there are orbits with rmin = 0 — “fall” of the particle to the center.
Problem 3.3. Prove that all finite trajectories in the central field are closed only
when
α
V (r) = kr2 , k > 0, and V (r) = − , α > 0.
r
Problem 3.4 (Hamilton’s Theorem). Prove that the velocity vector v = ṙ(t)
of the Kepler problem moves along a circle C in the plane P from Sect. 3.2, not in
general centered at the origin. Any such “velocity circle” can occur, and a circle C,
together with its orientation, determines the orbit r = r(t) uniquely.
Problem 3.5. Derive Kepler’s third law from Kepler’s second law and equation
(3.6).
Problem 3.6. Find parametric equations for orbits in the Kepler’s problem.
Problem 3.7. For the Kepler problem, consider vector fields Y = (Y 1 , Y 2 , Y 3 )
on R6 , defined by (2.4) with aij (r, ṙ) = 2ṙi rj − ri ṙj − δ ij r · ṙ. Prove that they
2αr
satisfy (2.5) with K = = (K 1 , K 2 , K 3 ), and show that corresponding integrals
r
of motions are components of the Laplace-Runge-Lenz vector.
Problem 3.8. Prove that the Laplace-Runge-Lenz vector W points in the di-
rection of the major axis of the orbit and that |W | = αe, where e is the eccentricity
of the orbit.
Problem 3.9. Using the conservation of the Laplace-Runge-Lenz vector, prove
that trajectories in Kepler’s problem with E < 0 are ellipses. (Hint: Evaluate W · r
and use the previous problem.)
Problem 3.10. Derive Euler-Arnold equations.
Problem 3.11. In case g = so(3) prove that for every symmetric A ∈ End g
there is a symmetric 3 × 3 matrix A such that

A · Ω = AΩ + ΩA.

Problem 3.12. Solve Euler’s equations.


LECTURE 4

Legendre transform and Hamilton’s equations

4.1. Legendre transform



Let $T^*M$ be the cotangent bundle of $M$. As in the case of the tangent bundle, we have the following definition.
Definition. Let $(U, \varphi)$ be a coordinate chart on $M$. Coordinates
\[
(p, q) = (p_1, \dots, p_n, q^1, \dots, q^n)
\]
on the chart $T^*U \simeq \mathbb{R}^n \times U$ on the cotangent bundle $T^*M$ are called standard coordinates¹ if for $(p, q) \in T^*U$ and $f \in C^\infty(U)$
\[
p_i(df) = \frac{\partial f}{\partial q^i}, \quad i = 1, \dots, n.
\]
Equivalently, standard coordinates on $T^*U$ are uniquely characterized by the condition that $p = (p_1, \dots, p_n)$ are coordinates in the fiber corresponding to the basis $dq^1, \dots, dq^n$ for $T_q^*M$, dual to the basis $\frac{\partial}{\partial q^1}, \dots, \frac{\partial}{\partial q^n}$ for $T_qM$.
Definition. The 1-form θ on T ∗ M , defined in standard coordinates by
n
X
θ= pi dq i = pdq,
i=1

is called Liouville’s canonical 1-form.


Corollary 2.1 shows that θ is a well-defined 1-form on T ∗ M . It also admits
invariant definition,

θ(u) = p(π∗ (u)), where u ∈ T(p,q) T ∗ M,

and π : T ∗ M → M is the canonical projection.


Definition. A fibre-wise mapping τL : T M → T ∗ M is called a Legendre
transform associated with the Lagrangian L, if

θL = τL∗ (θ).
1Following tradition, the first n coordinates parametrize the fiber of T ∗ U and the last n
coordinates parametrize the base.


In standard coordinates the Legendre transform is given by


∂L
τL (q, q̇) = (p, q), where p= (q, q̇).
∂ q̇
The mapping τL is a local diffeomorphism if and only if the Lagrangian L is
non-degenerate.

4.2. Hamiltonian function


Definition. Suppose that the Legendre transform τL : T M → T ∗ M is a
diffeomorphism. The Hamiltonian function H : T ∗ M → R, associated with the
Lagrangian L : T M → R, is defined by
\[
H \circ \tau_L = E_L = \frac{\partial L}{\partial \dot q}\,\dot q - L.
\]
In standard coordinates,
\[
H(p, q) = \left. \left( p \dot q - L(q, \dot q) \right) \right|_{p = \frac{\partial L}{\partial \dot q}},
\]
where $\dot q$ is a function of $p$ and $q$ defined by the equation $p = \frac{\partial L}{\partial \dot q}(q, \dot q)$ through the implicit function theorem. The cotangent bundle $T^*M$ is called the phase space of the Lagrangian system $(M, L)$. It turns out that on the phase space the equations of motion take a very simple and symmetric form.
Theorem 4.1. Suppose that the Legendre transform τL : T M → T ∗ M is a
diffeomorphism. Then the Euler-Lagrange equations in standard coordinates on
TM,
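\[
\frac{d}{dt} \frac{\partial L}{\partial \dot q^i} - \frac{\partial L}{\partial q^i} = 0, \quad i = 1, \dots, n,
\]
are equivalent to the following system of first order differential equations in standard coordinates on $T^*M$:
\[
\dot p_i = -\frac{\partial H}{\partial q^i}, \quad \dot q^i = \frac{\partial H}{\partial p_i}, \quad i = 1, \dots, n.
\]
Proof. We have
\begin{align*}
dH &= \frac{\partial H}{\partial p}\,dp + \frac{\partial H}{\partial q}\,dq \\
&= \left. \left( p\,d\dot q + \dot q\,dp - \frac{\partial L}{\partial q}\,dq - \frac{\partial L}{\partial \dot q}\,d\dot q \right) \right|_{p = \frac{\partial L}{\partial \dot q}} \\
&= \left. \left( \dot q\,dp - \frac{\partial L}{\partial q}\,dq \right) \right|_{p = \frac{\partial L}{\partial \dot q}}.
\end{align*}
Thus under the Legendre transform,
\[
\dot q = \frac{\partial H}{\partial p} \quad\text{and}\quad \dot p = \frac{d}{dt} \frac{\partial L}{\partial \dot q} = \frac{\partial L}{\partial q} = -\frac{\partial H}{\partial q}. \qquad □
\]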

Corresponding first order differential equations on T ∗ M are called Hamil-


ton’s equations (canonical equations).
Corollary 4.2. The Hamiltonian H is constant on the solutions of Hamil-
ton’s equations.
Proof. For H(t) = H(p(t), q(t)) we have
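\[
\frac{dH}{dt} = \frac{\partial H}{\partial q}\,\dot q + \frac{\partial H}{\partial p}\,\dot p = \frac{\partial H}{\partial q} \frac{\partial H}{\partial p} - \frac{\partial H}{\partial p} \frac{\partial H}{\partial q} = 0. \qquad □
\]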
For the Lagrangian
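\[
L = \frac{m \dot{\mathbf r}^2}{2} - V(\mathbf r) = T - V, \quad \mathbf r \in \mathbb{R}^3,
\]
of a particle of mass $m$ in a potential field $V(\mathbf r)$ we have
\[
\mathbf p = \frac{\partial L}{\partial \dot{\mathbf r}} = m \dot{\mathbf r}.
\]
Thus the Legendre transform $\tau_L : T\mathbb{R}^3 \to T^*\mathbb{R}^3$ is a global diffeomorphism, linear on the fibers, and
\[
H(\mathbf p, \mathbf r) = \left. (\mathbf p \dot{\mathbf r} - L) \right|_{\dot{\mathbf r} = \frac{\mathbf p}{m}} = \frac{\mathbf p^2}{2m} + V(\mathbf r) = T + V.
\]
Hamilton's equations
\[
\dot{\mathbf r} = \frac{\partial H}{\partial \mathbf p} = \frac{\mathbf p}{m}, \qquad
\dot{\mathbf p} = -\frac{\partial H}{\partial \mathbf r} = -\frac{\partial V}{\partial \mathbf r}
\]
are equivalent to Newton's equations with the force $\mathbf F = -\frac{\partial V}{\partial \mathbf r}$.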
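Hamilton's equations for a particle in a potential are easy to integrate numerically, and the constancy of $H$ along solutions gives a convenient check. The sketch below makes several assumptions that are not in the notes: one degree of freedom, the pendulum potential $V(q) = -\cos q$ with $m = 1$, and the Störmer-Verlet (leapfrog) scheme, chosen here because it is a symplectic integrator. The scheme conserves $H$ only approximately, with an error of order $dt^2$.

```python
import math

def hamiltonian(p, q, m=1.0):
    """H(p, q) = p^2 / (2m) + V(q) with the sample potential V(q) = -cos(q)."""
    return p * p / (2.0 * m) - math.cos(q)

def leapfrog(p, q, dt, steps, m=1.0):
    """Stormer-Verlet integration of dq/dt = dH/dp, dp/dt = -dH/dq."""
    for _ in range(steps):
        p -= 0.5 * dt * math.sin(q)   # half step: dp/dt = -dV/dq = -sin(q)
        q += dt * p / m               # full step: dq/dt = p / m
        p -= 0.5 * dt * math.sin(q)   # second half step in p
    return p, q

p0, q0 = 0.0, 1.0                     # pendulum released from rest at q = 1
p1, q1 = leapfrog(p0, q0, 1e-3, 10000)
drift = abs(hamiltonian(p1, q1) - hamiltonian(p0, q0))
print(p1, q1, drift)
```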
For the Lagrangian system describing small oscillators, considered in Example 1.4, we have $p = m \dot q$, and using normal coordinates we get
\[
H(p, q) = \left. \left( p \dot q - L(q, \dot q) \right) \right|_{\dot q = \frac{p}{m}} = \frac{p^2}{2m} + V_0(q) = \frac{1}{2m} \sum_{i=1}^n \left( p_i^2 + m^2 \omega_i^2 (q^i)^2 \right).
\]

Similarly, for the system of N interacting particles, considered in Example 1.2,


we have p = (p1 , . . . , pN ), where
∂L
pa = = ma ṙa , a = 1, . . . , N.
∂ ṙa
The Legendre transform τL : T R3N → T ∗ R3N is a global diffeomorphism, linear
on the fibers, and
N
X p2a
H(p, r) = (pṙ − L)|ṙ= p = + V (r) = T + V.
m
a=1
2ma

In particular, for a closed system with pair-wise interaction,


N
X p2a X
H(p, r) = + Vab (ra − rb ).
a=1
2ma
1≤a<b≤N

In general, consider the Lagrangian


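\[
L = \tfrac{1}{2} \sum_{i,j=1}^n a_{ij}(q)\,\dot q^i \dot q^j - V(q), \quad q \in \mathbb{R}^n,
\]
where $A(q) = \{a_{ij}(q)\}_{i,j=1}^n$ is a symmetric $n \times n$ matrix. We have
\[
p_i = \frac{\partial L}{\partial \dot q^i} = \sum_{j=1}^n a_{ij}(q)\,\dot q^j, \quad i = 1, \dots, n,
\]
and the Legendre transform is a global diffeomorphism, linear on the fibers, if and only if the matrix $A(q)$ is non-degenerate for all $q \in \mathbb{R}^n$. In this case,
\[
H(p, q) = \left. \left( p \dot q - L(q, \dot q) \right) \right|_{p = \frac{\partial L}{\partial \dot q}} = \tfrac{1}{2} \sum_{i,j=1}^n a^{ij}(q)\,p_i p_j + V(q),
\]
where $\{a^{ij}(q)\}_{i,j=1}^n = A^{-1}(q)$ is the inverse matrix.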
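The formula for $H$ in terms of the inverse matrix can be checked directly in a small case. The following sketch works with $n = 2$ at a fixed point $q$; the helper name `legendre_check`, the sample matrix, the velocity, and the value of $V(q)$ are illustrative assumptions.

```python
def legendre_check(A, qdot, V=0.0):
    """Compare p*qdot - L with (1/2) p^T A^{-1} p + V for a 2x2 symmetric A.

    A is the matrix a_ij(q) at a fixed point q, qdot the velocity,
    V the value of the potential at q. Returns (energy, H).
    """
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    A_inv = ((a22 / det, -a12 / det), (-a21 / det, a11 / det))
    # Legendre transform: p_i = sum_j a_ij qdot^j.
    p = (a11 * qdot[0] + a12 * qdot[1], a21 * qdot[0] + a22 * qdot[1])
    lagrangian = 0.5 * (qdot[0] * p[0] + qdot[1] * p[1]) - V
    energy = p[0] * qdot[0] + p[1] * qdot[1] - lagrangian
    H = 0.5 * sum(A_inv[i][j] * p[i] * p[j]
                  for i in range(2) for j in range(2)) + V
    return energy, H

# Illustrative data (assumptions): a positive-definite symmetric matrix.
energy, H = legendre_check(((2.0, 0.5), (0.5, 1.0)), (0.3, -1.1), V=0.7)
print(energy, H)
```

Both numbers equal $\tfrac{1}{2} \dot q^{T} A \dot q + V$, which is the statement $H \circ \tau_L = E_L$ for this class of Lagrangians.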

4.3. Hamilton’s equations


With every function H : T ∗ M → R on the phase space T ∗ M there are
associated Hamilton’s equations — a first-order system of ordinary differential
equations, which in the standard coordinates on T ∗ U has the form
∂H ∂H
(4.1) ṗ = − , q̇ = .
∂q ∂p
The corresponding vector field XH on T ∗ U ,
n  
X ∂H ∂ ∂H ∂ ∂H ∂ ∂H ∂
XH = i
− i = − ,
i=1
∂p i ∂q ∂q ∂p i ∂p ∂q ∂q ∂p

gives rise to a well-defined vector field XH on T ∗ M , called the Hamiltonian


vector field. Suppose now that the vector field XH on T ∗ M is complete, i.e.,
its integral curves exist for all times. The corresponding one-parameter group
{gt }t∈R of diffeomorphisms of T ∗ M generated by XH is called the Hamiltonian
phase flow. It is defined by gt (p, q) = (p(t), q(t)), where p(t), q(t) is a solution
of Hamilton’s equations satisfying p(0) = p, q(0) = q.
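As an illustration of the phase flow, consider the one-dimensional harmonic oscillator H = p²/2m + mω²q²/2, whose Hamilton’s equations can be solved in closed form. The sketch below assumes this particular Hamiltonian; the names `oscillator_flow` and `oscillator_energy` and the default parameter values are hypothetical.

```python
import math


def oscillator_flow(t, p, q, m=1.0, omega=2.0):
    """Exact phase flow g_t(p, q) for H = p^2/(2m) + m*omega^2*q^2/2."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    # p(t) and q(t) solve dp/dt = -m*omega^2*q, dq/dt = p/m with p(0)=p, q(0)=q.
    return (p * c - m * omega * q * s, q * c + p * s / (m * omega))


def oscillator_energy(p, q, m=1.0, omega=2.0):
    """Value of the Hamiltonian at the point (p, q)."""
    return p * p / (2.0 * m) + 0.5 * m * omega ** 2 * q * q


if __name__ == "__main__":
    p0, q0 = 0.3, -1.1
    # One-parameter group property: g_{s+t} = g_s o g_t.
    print(oscillator_flow(1.2, p0, q0))
    print(oscillator_flow(0.5, *oscillator_flow(0.7, p0, q0)))
```

The two printed points coincide up to rounding, which reflects the group property of the flow; the energy is constant along each trajectory.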
Liouville’s canonical 1-form θ on T ∗ M defines a 2-form ω = dθ. In standard
coordinates on T ∗ M it is given by
\[ \omega = \sum_{i=1}^{n} dp_i \wedge dq^i = dp \wedge dq, \]

and is a non-degenerate 2-form. The form ω is called canonical symplectic form


on T ∗ M . The symplectic form ω defines an isomorphism

J : T ∗ (T ∗ M ) → T (T ∗ M )

between tangent and cotangent bundles to T ∗ M . For every (p, q) ∈ T ∗ M the linear mapping $J^{-1} : T_{(p,q)}T^*M \to T^*_{(p,q)}T^*M$ is given by
\[ \omega(u_1, u_2) = J^{-1}(u_2)(u_1), \quad u_1, u_2 \in T_{(p,q)}T^*M. \]

The mapping J induces the isomorphism

\[ \mathcal{A}^1(T^*M) \simeq \mathrm{Vect}(T^*M) \]

between the infinite-dimensional vector spaces, which is linear over the ring
C ∞ (T ∗ M ). Namely, if ϑ is a 1-form on T ∗ M , then the corresponding vector
field X = J(ϑ) on T ∗ M satisfies

(4.2) ω(Y, X) = ϑ(Y ) for all Y ∈ Vect(T ∗ M ),

and, correspondingly,

(4.3) ϑ = J −1 (X) = −iX ω.

In particular, in standard coordinates,


\[ J(dp) = \frac{\partial}{\partial q} \quad\text{and}\quad J(dq) = -\frac{\partial}{\partial p}, \]

so that XH = J(dH). In this notation, for every f ∈ C ∞ (T ∗ M ),

(4.4) df = −iXf ω.

Theorem 4.3. The Hamiltonian phase flow on T ∗ M preserves the canonical


symplectic form.
Proof. We need to prove that (gt )∗ ω = ω. Since gt is a one-parameter
group of diffeomorphisms, it is sufficient to show that

\[ \frac{d}{dt}(g_t)^*\omega\,\Big|_{t=0} = L_{X_H}\omega = 0, \]

where LXH is the Lie derivative along the Hamiltonian vector field XH . Since
for every vector field X,
LX (df ) = d(X(f )),
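we compute
\[ L_{X_H}(dp_i) = -d\left(\frac{\partial H}{\partial q^i}\right) \quad\text{and}\quad L_{X_H}(dq^i) = d\left(\frac{\partial H}{\partial p_i}\right), \]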

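so that
\begin{align*}
L_{X_H}\omega &= \sum_{i=1}^{n}\left(L_{X_H}(dp_i) \wedge dq^i + dp_i \wedge L_{X_H}(dq^i)\right) \\
&= \sum_{i=1}^{n}\left(-d\left(\frac{\partial H}{\partial q^i}\right) \wedge dq^i + dp_i \wedge d\left(\frac{\partial H}{\partial p_i}\right)\right) = -d(dH) = 0.
\end{align*}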

The canonical symplectic form ω on T ∗ M defines the volume form


\[ \frac{\omega^n}{n!} = \frac{1}{n!}\,\underbrace{\omega \wedge \cdots \wedge \omega}_{n} \]

on T ∗ M , called Liouville’s volume form.


Corollary 4.4 (Liouville’s theorem). The Hamiltonian phase flow on T ∗ M
preserves Liouville’s volume form.
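For n = 1 Liouville’s theorem says that the Jacobian determinant of the map (p, q) ↦ g_t(p, q) equals 1. The sketch below checks this numerically for the pendulum Hamiltonian H = p²/2 − cos q, which is an assumed example not taken from the notes. The flow is only approximated by a Runge-Kutta scheme and the Jacobian by central differences, so the determinant equals 1 only up to a small numerical error; all function names are illustrative.

```python
import math


def pendulum_rhs(p, q):
    """Hamilton's equations for H = p^2/2 - cos q: returns (dp/dt, dq/dt)."""
    return -math.sin(q), p


def pendulum_flow(t, p, q, steps=2000):
    """Approximate g_t(p, q) with the classical fourth-order Runge-Kutta scheme."""
    dt = t / steps
    for _ in range(steps):
        k1p, k1q = pendulum_rhs(p, q)
        k2p, k2q = pendulum_rhs(p + 0.5 * dt * k1p, q + 0.5 * dt * k1q)
        k3p, k3q = pendulum_rhs(p + 0.5 * dt * k2p, q + 0.5 * dt * k2q)
        k4p, k4q = pendulum_rhs(p + dt * k3p, q + dt * k3q)
        p += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
        q += dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0
    return p, q


def flow_jacobian_det(t, p, q, h=1e-5):
    """Determinant of d(P, Q)/d(p, q) for (P, Q) = g_t(p, q), by central differences."""
    pa, qa = pendulum_flow(t, p + h, q)
    pb, qb = pendulum_flow(t, p - h, q)
    pc, qc = pendulum_flow(t, p, q + h)
    pd, qd = pendulum_flow(t, p, q - h)
    dP_dp, dQ_dp = (pa - pb) / (2 * h), (qa - qb) / (2 * h)
    dP_dq, dQ_dq = (pc - pd) / (2 * h), (qc - qd) / (2 * h)
    return dP_dp * dQ_dq - dP_dq * dQ_dp


if __name__ == "__main__":
    print(flow_jacobian_det(1.5, 0.3, 1.0))
```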
The restriction of the symplectic form ω on T ∗ M to the configuration space
M is 0. Generalizing this property, we get the following notion.
Definition. A submanifold L of the phase space T ∗ M is called a La-
grangian submanifold if dim L = dim M and ω|L = 0.
It follows from Theorem 4.3 that the image of a Lagrangian submanifold
under the Hamiltonian phase flow is a Lagrangian submanifold.
Problem 4.1. Suppose that for a Lagrangian system (Rn , L) the Legendre transform τL is a diffeomorphism and let H be the corresponding Hamiltonian. Prove that for fixed q and q̇ the function pq̇ − H(p, q) has a single critical point at $p = \dfrac{\partial L}{\partial \dot{q}}$.
Problem 4.2. Give an example of a non-degenerate Lagrangian system (M, L)
such that the Legendre transform τL : T M → T ∗ M is one-to-one but not onto.
Problem 4.3. Verify that XH is a well-defined vector field on T ∗ M .
Problem 4.4. Show that if all level sets of the Hamiltonian H are compact
submanifolds of T ∗ M , then the Hamiltonian vector field XH is complete.
Problem 4.5. Prove that LXH (θ) = d(−H + θ(XH )), where θ is Liouville’s
canonical 1-form.
LECTURE 5

Hamiltonian formalism

5.1. The action functional in the phase space


With every function H on the phase space T ∗ M there is an associated 1-form
θ − Hdt = pdq − Hdt
on the extended phase space T ∗ M × R, called the Poincaré-Cartan form. Let γ :
[t0 , t1 ] → T ∗ M be a smooth parametrized path in T ∗ M such that π(γ(t0 )) = q0
and π(γ(t1 )) = q1 , where π : T ∗ M → M is the canonical projection. By
definition, the lift of a path γ to the extended phase space T ∗ M × R is a path
σ : [t0 , t1 ] → T ∗ M × R given by σ(t) = (γ(t), t), and a path σ in T ∗ M × R
is called an admissible path if it is a lift of a path γ in T ∗ M . The space of admissible paths in T ∗ M × R is denoted by $\tilde{P}(T^*M)^{q_1,t_1}_{q_0,t_0}$. A variation of an admissible path σ is a smooth family of admissible paths σε , where ε ∈ [−ε0 , ε0 ] and σ0 = σ, and the corresponding infinitesimal variation is
\[ \delta\sigma = \frac{\partial \sigma_\varepsilon}{\partial \varepsilon}\,\Big|_{\varepsilon=0} \in T_\sigma \tilde{P}(T^*M)^{q_1,t_1}_{q_0,t_0} \]

(cf. Section 1.2). The principle of least action in the phase space is the following
statement.
Theorem 5.1 (Poincaré). The admissible path σ in T ∗ M × R is an extremal
for the action functional
\[ S(\sigma) = \int_\sigma (p\,dq - H\,dt) = \int_{t_0}^{t_1} (p\dot{q} - H)\,dt \]

if and only if it is a lift of a path γ(t) = (p(t), q(t)) in T ∗ M , where p(t) and
q(t) satisfy canonical Hamilton’s equations
\[ \dot{p} = -\frac{\partial H}{\partial q}, \qquad \dot{q} = \frac{\partial H}{\partial p}. \]
Proof. As in the proof of Theorem 1.1, for an admissible family σε (t) =
(p(t, ε), q(t, ε), t) we compute using integration by parts,
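\[ \frac{d}{d\varepsilon}\,\Big|_{\varepsilon=0} S(\sigma_\varepsilon) = \sum_{i=1}^{n}\int_{t_0}^{t_1}\left(\dot{q}^i\,\delta p_i - \dot{p}_i\,\delta q^i - \frac{\partial H}{\partial q^i}\,\delta q^i - \frac{\partial H}{\partial p_i}\,\delta p_i\right)dt + \sum_{i=1}^{n} p_i\,\delta q^i\,\Big|_{t_0}^{t_1}. \]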


Since δq(t0 ) = δq(t1 ) = 0, the path σ is critical if and only if p(t) and q(t)
satisfy canonical Hamilton’s equations (4.1). 
Remark. For a Lagrangian system (M, L), every path γ(t) = (q(t)) in the configuration space M connecting points q0 and q1 defines an admissible path $\hat{\gamma}(t) = (p(t), q(t), t)$ in the extended phase space T ∗ M × R by setting $p = \dfrac{\partial L}{\partial \dot{q}}$. If the Legendre transform τL : T M → T ∗ M is a diffeomorphism, then
\[ S(\hat{\gamma}) = \int_{t_0}^{t_1} (p\dot{q} - H)\,dt = \int_{t_0}^{t_1} L(\gamma'(t), t)\,dt. \]

Thus the principle of the least action in a configuration space — Hamilton’s


principle — follows from the principle of the least action in a phase space. In
fact, in this case the two principles are equivalent (see Problem 4.1).
From Corollary 4.2 we immediately get the following result.
Corollary 5.2. Solutions of canonical Hamilton’s equations lying on the hypersurface H(p, q) = E are extremals of the functional $\int_\sigma p\,dq$ in the class of admissible paths σ lying on this hypersurface.
Corollary 5.3 (Maupertuis’ principle). The trajectory γ = (q(τ )) of a
closed Lagrangian system (M, L) connecting points q0 and q1 and having energy
E is the extremal of the functional
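\[ \int_\gamma p\,dq = \int_\gamma \frac{\partial L}{\partial \dot{q}}(q(\tau), \dot{q}(\tau))\,\dot{q}(\tau)\,d\tau \]

on the space of all paths in the configuration space M connecting points q0 and q1 and parametrized such that $H\left(\frac{\partial L}{\partial \dot{q}}(\tau), q(\tau)\right) = E$.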

The functional
\[ S_0(\gamma) = \int_\gamma p\,dq \]
is called the abbreviated action.¹


Proof. Every path γ = q(τ ), parametrized such that $H\left(\frac{\partial L}{\partial \dot{q}}, q\right) = E$, lifts to an admissible path $\sigma = \left(\frac{\partial L}{\partial \dot{q}}(\tau), q(\tau), \tau\right)$, a ≤ τ ≤ b, lying on the hypersurface H(p, q) = E.

5.2. The action as a function of coordinates


Consider a non-degenerate Lagrangian system (M, L) and denote by γ(t; q0 , v0 )
the solution of Euler-Lagrange equations
\[ \frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = 0 \]
¹ The accurate formulation of Maupertuis’ principle is due to Euler and Lagrange.

with the initial conditions γ(t0 ) = q0 ∈ M and γ̇(t0 ) = v0 ∈ Tq0 M . Suppose


that there exist a neighborhood V0 ⊂ Tq0 M of v0 and t1 > t0 such that for all
v ∈ V0 the extremals γ(t; q0 , v), which start at time t0 at q0 , do not intersect in
the extended configuration space M × R for times t0 < t < t1 . Such extremals
are said to form a central field which includes the extremal γ0 (t) = γ(t; q0 , v0 ).
The existence of the central field of extremals is equivalent to the condition that
for every t0 < t < t1 there is a neighborhood Ut ⊂ M of γ0 (t) ∈ M such that
the mapping
(5.1) V0 ∋ v ↦ q(t) = γ(t; q0 , v) ∈ Ut
is a diffeomorphism. Basic theorems in the theory of ordinary differential
equations guarantee that for t1 sufficiently close to t0 every extremal γ(t) for
t0 < t < t1 can be included into the central field. In standard coordinates the
mapping (5.1) is given by q̇ ↦ q(t) = γ(t; q0 , q̇).
For the central field of extremals γ(t; q0 , q̇), t0 < t < t1 , we define the action
as a function of coordinates and time (or, classical action) by
\[ S(q, t; q_0, t_0) = \int_{t_0}^{t} L(\gamma'(\tau))\,d\tau, \]

where γ(τ ) is the extremal from the central field that connects q0 and q. For given q0 and t0 , the classical action is defined for t ∈ (t0 , t1 ) and $q \in \bigcup_{t_0 < t < t_1} U_t$.
For a fixed energy E,
(5.2) S(q, t; q0 , t0 ) = S0 (q, t; q0 , t0 ) − E(t − t0 ),
where S0 is the abbreviated action from the previous section.
Theorem 5.4. The differential of the classical action S(q, t) with fixed initial
point is given by
dS = pdq − Hdt,
where $p = \dfrac{\partial L}{\partial \dot{q}}(q, \dot{q})$ and H = pq̇ − L(q, q̇) are determined by the velocity q̇ of the extremal γ(τ ) at time t.
Proof. Let qε be a path in M passing through q at ε = 0 with the tangent
vector v ∈ Tq M ' Rn , and for ε small enough let γε (τ ) be the family of
extremals from the central field satisfying γε (t0 ) = q0 and γε (t) = qε . For the
infinitesimal variation δγ we have δγ(t0 ) = 0 and δγ(t) = v, and for fixed t we
get from the formula for variation with the free ends (1.6) that
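\[ dS(v) = \frac{\partial L}{\partial \dot{q}}\,v. \]
This shows that $\dfrac{\partial S}{\partial q} = p$. Setting q(t) = γ(t), we obtain
\[ \frac{d}{dt}S(q(t), t) = \frac{\partial S}{\partial q}\,\dot{q} + \frac{\partial S}{\partial t} = L, \]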

so that $\dfrac{\partial S}{\partial t} = L - p\dot{q} = -H$.
Corollary 5.5. The classical action satisfies the following nonlinear partial differential equation
\[ \frac{\partial S}{\partial t} + H\left(\frac{\partial S}{\partial q}, q\right) = 0. \tag{5.3} \]
This equation is called the Hamilton-Jacobi equation. Hamilton’s equations
(4.1) can be used for solving the Cauchy problem

(5.4) S(q, t)|t=0 = s(q), s ∈ C ∞ (M ),

for Hamilton-Jacobi equation (5.3) by the method of characteristics.
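A simple case where equation (5.3) can be checked by hand is the free particle on the line, H = p²/2m, for which the extremals are straight lines and the classical action is S(q, t; q0, t0) = m(q − q0)²/(2(t − t0)). The sketch below, assuming this example and hypothetical function names, evaluates the left-hand side of (5.3) by central differences and also returns ∂S/∂q, which should equal the momentum m(q − q0)/(t − t0).

```python
def free_action(q, t, q0=0.0, t0=0.0, m=1.5):
    """Classical action of a free particle of mass m going from (q0, t0) to (q, t)."""
    return m * (q - q0) ** 2 / (2.0 * (t - t0))


def hamilton_jacobi_residual(q, t, m=1.5, h=1e-5):
    """Return (dS/dt + (dS/dq)^2 / (2m), dS/dq) using central differences."""
    dS_dq = (free_action(q + h, t, m=m) - free_action(q - h, t, m=m)) / (2 * h)
    dS_dt = (free_action(q, t + h, m=m) - free_action(q, t - h, m=m)) / (2 * h)
    return dS_dt + dS_dq ** 2 / (2.0 * m), dS_dq


if __name__ == "__main__":
    residual, momentum = hamilton_jacobi_residual(0.8, 2.0)
    print(residual, momentum)
```

The residual vanishes up to discretization error, in agreement with Theorem 5.4: ∂S/∂q = p and ∂S/∂t = −H.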


We can also consider the action S(q, t; q0 , t0 ) as a function of both variables
q and q0 . The analog of Theorem 5.4 is the following statement.
Proposition 5.1. The differential of the classical action as a function of
initial and final points is given by

dS = pdq − p0 dq0 − H(p, q)dt + H(p0 , q0 )dt0 .

5.3. Classical observables and Poisson bracket


Smooth real-valued functions on the phase space T ∗ M are called classical
observables. The vector space C ∞ (T ∗ M ) is an R-algebra — an associative
algebra over R with a unit given by the constant function 1, and with a mul-
tiplication given by the point-wise product of functions. The commutative al-
gebra C ∞ (T ∗ M ) is called the algebra of classical observables. Assuming that
the Hamiltonian phase flow gt exists for all times, the time evolution of every
observable f ∈ C ∞ (T ∗ M ) is given by

ft (p, q) = f (gt (p, q)) = f (p(t), q(t)), (p, q) ∈ T ∗ M.

Equivalently, using the Hamiltonian vector field


\[ X_H = \frac{\partial H}{\partial p}\frac{\partial}{\partial q} - \frac{\partial H}{\partial q}\frac{\partial}{\partial p}, \]
the time evolution is described by the differential equation
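\begin{align*}
\frac{df_t}{dt} &= \frac{df_{s+t}}{ds}\,\Big|_{s=0} = \frac{d(f_t \circ g_s)}{ds}\,\Big|_{s=0} = X_H(f_t) \\
&= \sum_{i=1}^{n}\left(\frac{\partial H}{\partial p_i}\frac{\partial f_t}{\partial q^i} - \frac{\partial H}{\partial q^i}\frac{\partial f_t}{\partial p_i}\right) = \frac{\partial H}{\partial p}\frac{\partial f_t}{\partial q} - \frac{\partial H}{\partial q}\frac{\partial f_t}{\partial p},
\end{align*}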

called Hamilton’s equation for classical observables. Setting


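\[ \{f, g\} = X_f(g) = \frac{\partial f}{\partial p}\frac{\partial g}{\partial q} - \frac{\partial f}{\partial q}\frac{\partial g}{\partial p}, \quad f, g \in C^\infty(T^*M), \tag{5.5} \]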

we can rewrite Hamilton’s equation in the concise form


\[ \frac{df}{dt} = \{H, f\}, \tag{5.6} \]
where it is understood that (5.6) is a differential equation for a family of func-
tions ft on T ∗ M with the initial condition ft (p, q)|t=0 = f (p, q). The properties
of the bilinear mapping

{ , } : C ∞ (T ∗ M ) × C ∞ (T ∗ M ) → C ∞ (T ∗ M )

are summarized below.


Theorem 5.6. The mapping { , } satisfies the following properties.
(i) (Relation with the symplectic form)

{f, g} = ω(J(df ), J(dg)) = ω(Xf , Xg ).

(ii) (Skew-symmetry)

{f, g} = −{g, f }.

(iii) (Leibniz rule)

{f g, h} = f {g, h} + g{f, h}.

(iv) (Jacobi identity)

{f, {g, h}} + {g, {h, f }} + {h, {f, g}} = 0

for all f, g, h ∈ C ∞ (T ∗ M ).
Proof. Property (i) immediately follows from the definitions of ω and J
in Section 4.3. Namely, it follows from (4.2) that

ω(Xf , Xg ) = ω(Xf , J(dg)) = dg(Xf ) = Xf (g) = {f, g}.

Properties (ii)-(iii) are obvious. The Jacobi identity could be verified by a


direct computation using (5.5), or by the following elegant argument. Observe
that {f, g} is a bilinear form in the first partial derivatives of f and g, and every
term in the left-hand side of the Jacobi identity is a linear homogeneous function
of second partial derivatives of f, g, and h. Now the only terms in the Jacobi
identity which could actually contain second partial derivatives of a function h
are the following:

{f, {g, h}} + {g, {h, f }} = (Xf Xg − Xg Xf )(h).

However, this expression does not contain second partial derivatives of h since
it is a commutator of two differential operators of the first order which is again
a differential operator of the first order! 
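The properties in Theorem 5.6 can also be verified mechanically on polynomial observables. The sketch below treats the case n = 1 and represents a polynomial in (p, q) as a dictionary mapping exponent pairs (a, b) to integer coefficients of pᵃqᵇ; this representation and all function names are illustrative assumptions. The bracket follows the sign convention of (5.5), so that {p, q} = 1.

```python
from collections import defaultdict


def clean(f):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {k: c for k, c in f.items() if c != 0}


def add(f, g, sign=1):
    """Return f + sign * g."""
    h = defaultdict(int)
    for k, c in f.items():
        h[k] += c
    for k, c in g.items():
        h[k] += sign * c
    return clean(h)


def mul(f, g):
    """Point-wise product of two polynomials."""
    h = defaultdict(int)
    for (a, b), c in f.items():
        for (a2, b2), c2 in g.items():
            h[(a + a2, b + b2)] += c * c2
    return clean(h)


def d_p(f):
    """Partial derivative with respect to p."""
    return clean({(a - 1, b): a * c for (a, b), c in f.items() if a > 0})


def d_q(f):
    """Partial derivative with respect to q."""
    return clean({(a, b - 1): b * c for (a, b), c in f.items() if b > 0})


def bracket(f, g):
    """Canonical Poisson bracket {f, g} = f_p g_q - f_q g_p, as in (5.5)."""
    return add(mul(d_p(f), d_q(g)), mul(d_q(f), d_p(g)), sign=-1)


def jacobi_sum(f, g, h):
    """Left-hand side {f,{g,h}} + {g,{h,f}} + {h,{f,g}} of the Jacobi identity."""
    first = bracket(f, bracket(g, h))
    second = bracket(g, bracket(h, f))
    third = bracket(h, bracket(f, g))
    return add(add(first, second), third)


if __name__ == "__main__":
    f = {(2, 1): 1, (0, 3): 3}    # p^2 q + 3 q^3
    g = {(1, 1): 1, (3, 0): -2}   # p q - 2 p^3
    h = {(1, 0): 1, (0, 2): 1}    # p + q^2
    print(jacobi_sum(f, g, h))    # the zero polynomial is the empty dictionary
```

Since the arithmetic is exact over the integers, the Jacobi sum is the zero polynomial identically, not just approximately.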

The observable {f, g} is called the canonical Poisson bracket of the observ-
ables f and g. The Poisson bracket map { , } : C ∞ (T ∗ M ) × C ∞ (T ∗ M ) →
C ∞ (T ∗ M ) turns the algebra of classical observables C ∞ (T ∗ M ) into a Lie al-
gebra with a Lie bracket given by the Poisson bracket. It has an important
property that the Lie bracket is a bi-derivation with respect to the multiplica-
tion in C ∞ (T ∗ M ). The algebra of classical observables C ∞ (T ∗ M ) is an example
of the Poisson algebra — a commutative algebra over R carrying a structure of
a Lie algebra with the property that the Lie bracket is a derivation with respect
to the algebra product.
In Lagrangian mechanics, a function I on T M is an integral of motion for the
Lagrangian system (M, L) if it is constant along the trajectories. In Hamiltonian
mechanics, an observable I — a function on the phase space T ∗ M — is called an
integral of motion (first integral) for Hamilton’s equations (4.1) if it is constant
along the Hamiltonian phase flow. According to (5.6), this is equivalent to the
condition
{H, I} = 0.
It is said that the observables H and I are in involution (Poisson commute).
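The criterion {H, I} = 0 is easy to test numerically. The sketch below assumes a particle in the plane with phase-space point x = (p₁, p₂, q¹, q²) and approximates the bracket (5.5) by central differences; the two Hamiltonians and the function names are hypothetical examples. The angular momentum q¹p₂ − q²p₁ Poisson commutes with a Hamiltonian whose potential depends only on |q|, and fails to do so for the potential V = q¹.

```python
import math


def grad(f, x, h=1e-6):
    """Central-difference gradient of f at the point x."""
    g = []
    for k in range(len(x)):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g


def poisson_bracket(f, g, x):
    """{f, g} = sum_i f_{p_i} g_{q^i} - f_{q^i} g_{p_i} at x = (p_1..p_n, q^1..q^n)."""
    n = len(x) // 2
    df, dg = grad(f, x), grad(g, x)
    return sum(df[i] * dg[n + i] - df[n + i] * dg[i] for i in range(n))


def h_central(x):
    """H = |p|^2/2 - 1/|q|: the potential depends only on |q|."""
    p1, p2, q1, q2 = x
    return 0.5 * (p1 * p1 + p2 * p2) - 1.0 / math.hypot(q1, q2)


def h_tilted(x):
    """H = |p|^2/2 + q^1: the potential is not rotationally symmetric."""
    p1, p2, q1, q2 = x
    return 0.5 * (p1 * p1 + p2 * p2) + q1


def angular_momentum(x):
    """I = q^1 p_2 - q^2 p_1."""
    p1, p2, q1, q2 = x
    return q1 * p2 - q2 * p1


if __name__ == "__main__":
    x = [0.4, -0.7, 1.0, 0.5]
    print(poisson_bracket(h_central, angular_momentum, x))
    print(poisson_bracket(h_tilted, angular_momentum, x))
```

For the second Hamiltonian a direct computation with (5.5) gives {H, I} = q², so angular momentum is not an integral of motion there.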

5.4. Canonical transformations and generating functions


Definition. A diffeomorphism g of the phase space T ∗ M is called a canon-
ical transformation, if it preserves the canonical symplectic form ω on T ∗ M , i.e.,
g ∗ (ω) = ω. By Theorem 4.3, the Hamiltonian phase flow gt is a one-parameter
group of canonical transformations.
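For a linear map of R²ⁿ with coordinates (p₁, …, pₙ, q¹, …, qⁿ), the condition g∗(ω) = ω becomes the matrix identity AᵀΩA = Ω, where Ω is the matrix of ω = dp ∧ dq. The sketch below checks this identity directly; the function name `is_linear_canonical` and the tolerance are illustrative assumptions.

```python
import math


def is_linear_canonical(a, tol=1e-12):
    """Check A^T Omega A = Omega for a 2n x 2n matrix A acting on (p, q)."""
    dim = len(a)
    n = dim // 2
    # Matrix of omega = sum_i dp_i ^ dq^i in coordinates (p_1..p_n, q^1..q^n):
    # omega(u, v) = u^T Omega v with Omega = [[0, I], [-I, 0]].
    omega = [[0.0] * dim for _ in range(dim)]
    for i in range(n):
        omega[i][n + i] = 1.0
        omega[n + i][i] = -1.0
    for i in range(dim):
        for j in range(dim):
            # (A^T Omega A)_{ij} = sum_{k,l} A_{ki} Omega_{kl} A_{lj}
            pulled = sum(a[k][i] * omega[k][l] * a[l][j]
                         for k in range(dim) for l in range(dim))
            if abs(pulled - omega[i][j]) > tol:
                return False
    return True


if __name__ == "__main__":
    c, s = math.cos(0.3), math.sin(0.3)
    print(is_linear_canonical([[c, -s], [s, c]]))          # rotation of the phase plane
    print(is_linear_canonical([[2.0, 0.0], [0.0, 0.5]]))   # p -> 2p, q -> q/2
    print(is_linear_canonical([[2.0, 0.0], [0.0, 2.0]]))   # uniform scaling
```

For n = 1 the condition reduces to det A = 1, which is why the rotation and the squeeze pass while the uniform scaling does not.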
Proposition 5.2. Canonical transformations preserve Hamilton’s equations.
Proof. From $g^*(\omega) = \omega$ it follows that the mapping $J : T^*(T^*M) \to T(T^*M)$ satisfies
\[ g_* \circ J \circ g^* = J. \tag{5.7} \]
Indeed, for all X, Y ∈ Vect(M ) we have²
\[ \omega(X, Y) = g^*(\omega)(X, Y) = \omega(g_*(X), g_*(Y)) \circ g, \]
so that for every 1-form ϑ on M ,
\[ \omega(X, J(g^*(\vartheta))) = g^*(\vartheta)(X) = \vartheta(g_*(X)) \circ g = \omega(g_*(X), J(\vartheta)) \circ g, \]
which gives $J(g^*(\vartheta)) = g_*^{-1}(J(\vartheta))$. Using (5.7), we get
\[ g_*(X_H) = g_*(J(dH)) = J((g^*)^{-1}(dH)) = X_K, \]

where K = H ◦ g −1 . Thus the canonical transformation g maps trajectories of


the Hamiltonian vector field XH into the trajectories of the Hamiltonian vector
field XK . 
² Since g is a diffeomorphism, $g_* X$ is a well-defined vector field on M .


Remark. In classical terms, Proposition 5.2 means that canonical Hamilton’s equations
\[ \dot{p} = -\frac{\partial H}{\partial q}(p, q), \qquad \dot{q} = \frac{\partial H}{\partial p}(p, q) \]
in new coordinates (P , Q) = g(p, q) continue to have the canonical form
\[ \dot{P} = -\frac{\partial K}{\partial Q}(P, Q), \qquad \dot{Q} = \frac{\partial K}{\partial P}(P, Q) \]
with the old Hamiltonian function K(P , Q) = H(p, q).
Consider now the classical case M = Rn . For a canonical transformation
(P , Q) = g(p, q) set P = P (p, q) and Q = Q(p, q). Since dP ∧ dQ = dp ∧ dq
on T ∗ M ' R2n , the 1-form pdq − P dQ — the difference between the canonical
Liouville 1-form and its pullback by the mapping g — is closed. From the
Poincaré lemma it follows that there exists a function F (p, q) on R2n such that

\[ p\,dq - P\,dQ = dF(p, q). \tag{5.8} \]

Now assume that at some point (p0 , q0 ) the n × n matrix $\dfrac{\partial P}{\partial p} = \left\{\dfrac{\partial P_i}{\partial p_j}\right\}_{i,j=1}^{n}$ is
non-degenerate. By the inverse function theorem, there exists a neighborhood
U of (p0 , q0 ) in R2n for which the functions P , q are coordinate functions. The
function
S(P , q) = F (p, q) + P Q
is called a generating function of the canonical transformation g in U . It follows
from (5.8) that
dS = pdq + QdP ,
whence in new coordinates P , q on U ,
\[ p = \frac{\partial S}{\partial q}(P, q) \quad\text{and}\quad Q = \frac{\partial S}{\partial P}(P, q). \]
The converse statement below easily follows from the implicit function theorem.
Proposition 5.3. Let S(P , q) be a function in some neighborhood U of a point (P0 , q0 ) ∈ R2n such that the n × n matrix
\[ \frac{\partial^2 S}{\partial P\,\partial q}(P_0, q_0) = \left\{\frac{\partial^2 S}{\partial P_i\,\partial q^j}(P_0, q_0)\right\}_{i,j=1}^{n} \]
is non-degenerate. Then S is a generating function of a local (i.e., defined in some neighborhood of (P0 , q0 ) in R2n ) canonical transformation.
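To see how a generating function produces a canonical transformation in practice, take n = 1 and the hypothetical choice S(P, q) = Pq + εq³/3, for which ∂²S/∂P∂q = 1 is non-degenerate. Solving p = ∂S/∂q and Q = ∂S/∂P gives P = p − εq², Q = q. The sketch below, with an arbitrarily chosen ε and illustrative function names, checks the generating relations and the bracket {P, Q} = 1 by central differences.

```python
EPS = 0.25  # hypothetical parameter chosen only for this illustration


def generating_function(P, q):
    """S(P, q) = P q + EPS q^3 / 3."""
    return P * q + EPS * q ** 3 / 3.0


def transformation(p, q):
    """(P, Q) = g(p, q), obtained by solving p = dS/dq and Q = dS/dP."""
    return p - EPS * q * q, q


def generating_relation_errors(p, q, h=1e-6):
    """Return (dS/dq - p, dS/dP - Q) evaluated at (P(p, q), q)."""
    P, Q = transformation(p, q)
    dS_dq = (generating_function(P, q + h) - generating_function(P, q - h)) / (2 * h)
    dS_dP = (generating_function(P + h, q) - generating_function(P - h, q)) / (2 * h)
    return dS_dq - p, dS_dP - Q


def bracket_P_Q(p, q, h=1e-6):
    """{P, Q} = P_p Q_q - P_q Q_p, with derivatives taken by central differences."""
    Pa, Qa = transformation(p + h, q)
    Pb, Qb = transformation(p - h, q)
    Pc, Qc = transformation(p, q + h)
    Pd, Qd = transformation(p, q - h)
    P_p, Q_p = (Pa - Pb) / (2 * h), (Qa - Qb) / (2 * h)
    P_q, Q_q = (Pc - Pd) / (2 * h), (Qc - Qd) / (2 * h)
    return P_p * Q_q - P_q * Q_p


if __name__ == "__main__":
    print(generating_relation_errors(0.7, 1.3))
    print(bracket_P_Q(0.7, 1.3))
```

In one degree of freedom, {P, Q} = 1 is equivalent to dP ∧ dQ = dp ∧ dq, so the map is canonical.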
Suppose there is a canonical transformation (P , Q) = g(p, q) such that
H(p, q) = K(P ) for some function K. Then in the new coordinates Hamilton’s
equations take the form
\[ \dot{P} = 0, \qquad \dot{Q} = \frac{\partial K}{\partial P}, \tag{5.9} \]

and are trivially integrated:

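\[ P(t) = P(0), \qquad Q(t) = Q(0) + t\,\frac{\partial K}{\partial P}(P(0)). \]
Assuming that the matrix $\dfrac{\partial P}{\partial p}$ is non-degenerate, the generating function S(P , q) satisfies the differential equation
\[ H\left(\frac{\partial S}{\partial q}(P, q), q\right) = K(P), \tag{5.10} \]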

where after the differentiation one should substitute q = q(P , Q), defined by
the canonical transformation g −1 . The differential equation (5.10) for fixed P ,
as it follows from (5.2), coincides with the Hamilton-Jacobi equation for the
abbreviated action S0 = S − Et where E = K(P ),
\[ H\left(\frac{\partial S_0}{\partial q}(P, q), q\right) = E. \]

Theorem 5.7 (Jacobi). Suppose that there is a function S(P , q) which depends on n parameters P = (P1 , . . . , Pn ), satisfies the Hamilton-Jacobi equation (5.10) for some function K(P ), and has the property that the n × n matrix $\dfrac{\partial^2 S}{\partial P\,\partial q}$ is non-degenerate. Then Hamilton’s equations
\[ \dot{p} = -\frac{\partial H}{\partial q}, \qquad \dot{q} = \frac{\partial H}{\partial p} \]
can be solved explicitly, and the functions P (p, q) = (P1 (p, q), . . . , Pn (p, q)), defined by the equations $p = \dfrac{\partial S}{\partial q}(P, q)$, are integrals of motion in involution.

Proof. Set $p = \dfrac{\partial S}{\partial q}(P, q)$ and $Q = \dfrac{\partial S}{\partial P}(P, q)$. By the inverse function theorem, g(p, q) = (P , Q) is a local canonical transformation with the generating function S. It follows from (5.10) that H(p(P , Q), q(P , Q)) = K(P ), so that Hamilton’s equations take the form (5.9). Since ω = dP ∧ dQ, integrals of motion P1 (p, q), . . . , Pn (p, q) are in involution.

The solution of the Hamilton-Jacobi equation satisfying conditions in Theo-


rem 5.7 is called the complete integral. At first glance it seems that solving the
Hamilton-Jacobi equation, which is a nonlinear partial differential equation, is
a more difficult problem than solving Hamilton’s equations, which is a system of
ordinary differential equations. It is quite remarkable that for many problems
of classical mechanics one can find the complete integral of the Hamilton-Jacobi
equation by the method of separation of variables. By Theorem 5.7, this solves
the corresponding Hamilton’s equations.

Problem 5.1. Let π : T ∗ M → M be the canonical projection, and let L be a


Lagrangian submanifold. Show that if the mapping π|L : L → M is a diffeomor-
phism, then L is a graph of a smooth function on M . Give examples when for some
t > 0 the corresponding projection of gt (L ) onto M is no longer a diffeomorphism.
Problem 5.2. Find the generating function for the identity transformation P =
p, Q = q.
Problem 5.3. Prove Proposition 5.3.
Problem 5.4. Suppose that the canonical transformation g(p, q) = (P , Q) is such that locally (Q, q) can be considered as new coordinates (canonical transformations with this property are called free). Prove that S1 (Q, q) = F (p, q), also called a generating function, satisfies
\[ p = \frac{\partial S_1}{\partial q} \quad\text{and}\quad P = -\frac{\partial S_1}{\partial Q}. \]

Problem 5.5. Find the complete integral for the case of a particle in R3 moving
in a central field.
LECTURE 6

Symplectic and Poisson manifolds

6.1. Symplectic manifolds


The notion of a symplectic manifold is a generalization of the example of a
cotangent bundle T ∗ M .
Definition. A non-degenerate, closed 2-form ω on a manifold M is called
a symplectic form, and the pair (M , ω) is called a symplectic manifold.
Since a symplectic form ω is non-degenerate, a symplectic manifold M is necessarily even-dimensional, dim M = 2n. The nowhere vanishing 2n-form $\omega^n$ defines a canonical orientation on M , and as in the case M = T ∗ M , $\dfrac{\omega^n}{n!}$ is called Liouville’s volume form. We also have the general notion of a Lagrangian submanifold.
Definition. A submanifold L of a symplectic manifold (M , ω) is called a Lagrangian submanifold, if $\dim L = \frac{1}{2}\dim M$ and the restriction of the symplectic form ω to L is 0.
Besides cotangent bundles, another important class of symplectic manifolds is given by Kähler manifolds. The simplest compact Kähler manifold is $\mathbb{CP}^1 \simeq S^2$ with the symplectic form given by the area 2-form of the Hermitian metric of Gaussian curvature 1, that is, the round metric on the 2-sphere. In terms of the local coordinate z associated with the stereographic projection $\mathbb{CP}^1 \simeq \mathbb{C} \cup \{\infty\}$,
\[ \omega = 2i\,\frac{dz \wedge d\bar{z}}{(1 + |z|^2)^2}. \]
Similarly, the natural symplectic form on the complex projective space CP n
is the symplectic form of the Fubini-Study metric. By pull-back, it defines
symplectic forms on complex projective varieties.
The simplest non-compact Kähler manifold is the n-dimensional complex
vector space Cn with the standard Hermitian metric. In complex coordinates
z = (z 1 , . . . , z n ) on Cn it is given by
\[ h = dz \otimes d\bar{z} = \sum_{\alpha=1}^{n} dz^\alpha \otimes d\bar{z}^\alpha. \]

In terms of real coordinates $(x, y) = (x^1, \dots, x^n, y^1, \dots, y^n)$ on $\mathbb{R}^{2n} \simeq \mathbb{C}^n$, where z = x + iy, the corresponding symplectic form ω = − Im h has the canonical


form
\[ \omega = \frac{i}{2}\,dz \wedge d\bar{z} = \sum_{\alpha=1}^{n} dx^\alpha \wedge dy^\alpha = dx \wedge dy. \]
This example naturally leads to the following definition.
Definition. A symplectic vector space is a pair (V, ω), where V is a vector
space over R and ω is a non-degenerate, skew-symmetric bilinear form on V .
It follows from basic linear algebra that every symplectic vector space V has a symplectic basis, that is, a basis e1 , . . . , en , f1 , . . . , fn of V , where 2n = dim V , such that
\[ \omega(e_i, e_j) = \omega(f_i, f_j) = 0 \quad\text{and}\quad \omega(e_i, f_j) = \delta_{ij}, \quad i, j = 1, \dots, n. \]

In coordinates $(p, q) = (p_1, \dots, p_n, q^1, \dots, q^n)$ corresponding to this basis, $V \simeq \mathbb{R}^{2n}$ and
\[ \omega = dp \wedge dq = \sum_{i=1}^{n} dp_i \wedge dq^i. \]
Thus every symplectic vector space is isomorphic to a direct product of the
phase planes R2 with the canonical symplectic form dp∧dq. Introducing complex
coordinates z = p+iq, we get the isomorphism V ' Cn , so that every symplectic
vector space admits a Kähler structure.
It is a basic fact of symplectic geometry that every symplectic manifold is
locally isomorphic to a symplectic vector space.
Theorem 6.1 (Darboux’ theorem). Let (M , ω) be a 2n-dimensional sym-
plectic manifold. For every point x ∈ M there is a neighborhood U of x with
local coordinates (p, q) = (p1 , . . . , pn , q 1 , . . . , q n ) such that on U
\[ \omega = dp \wedge dq = \sum_{i=1}^{n} dp_i \wedge dq^i. \]

Coordinates p, q are called canonical coordinates (Darboux coordinates).


The proof proceeds by induction on n with the two main steps stated as Prob-
lems 6.1 and 6.2.
A non-degenerate 2-form ω for every x ∈ M defines an isomorphism $J : T_x^*M \to T_xM$ by
\[ \omega(u_1, u_2) = J^{-1}(u_2)(u_1), \quad u_1, u_2 \in T_xM. \]
Explicitly, for every X ∈ Vect(M ) and ϑ ∈ A1 (M ) we have
\[ \omega(X, J(\vartheta)) = \vartheta(X) \quad\text{and}\quad J^{-1}(X) = -i_X(\omega). \]
Defining the Hamiltonian vector field associated with the function f by the formula $X_f = J(df)$ we have
\[ df = -i_{X_f}(\omega), \tag{6.1} \]
cf. formulas (4.2)–(4.4). This proves the following result.

Lemma 6.1. A vector field X on M is a Hamiltonian vector field if and only


if the 1-form iX (ω) is exact.
In local coordinates $x = (x^1, \dots, x^{2n})$ for the coordinate chart (U, ϕ) on M , the 2-form ω is given by
\[ \omega = \tfrac{1}{2}\sum_{i,j=1}^{2n} \omega_{ij}(x)\,dx^i \wedge dx^j, \]

where $\{\omega_{ij}(x)\}_{i,j=1}^{2n}$ is a non-degenerate, skew-symmetric matrix-valued function on ϕ(U ). Denoting the inverse matrix by $\{\omega^{ij}(x)\}_{i,j=1}^{2n}$, we have
\[ J(dx^i) = -\sum_{j=1}^{2n} \omega^{ij}(x)\,\frac{\partial}{\partial x^j}, \quad i = 1, \dots, 2n. \]

Definition. A Hamiltonian system is a pair consisting of a symplectic man-


ifold (M , ω), called a phase space, and a smooth real-valued function H on M ,
called Hamiltonian. The motion of points on the phase space is described by
the vector field
XH = J(dH),
called a Hamiltonian vector field.
The trajectories of a Hamiltonian system ((M , ω), H) are the integral curves
of a Hamiltonian vector field XH on M . In canonical coordinates (p, q) they
are described by the canonical Hamilton’s equations (4.1),

\[ \dot{p} = -\frac{\partial H}{\partial q}, \qquad \dot{q} = \frac{\partial H}{\partial p}. \]
Suppose now that the Hamiltonian vector field XH on M is complete. The
Hamiltonian phase flow on M associated with a Hamiltonian H is a one-
parameter group {gt }t∈R of diffeomorphisms of M generated by XH . The
following statement generalizes Theorem 4.3.
Theorem 6.2. The Hamiltonian phase flow preserves the symplectic form.
Proof. It is sufficient to show that LXH ω = 0. Using Cartan’s formula
(1.1) and dω = 0, we get for every X ∈ Vect(M ),

LX ω = (d ◦ iX )(ω),

and it follows from Lemma 6.1 that

LXH ω = −d(dH) = 0. 

Definition. A vector field X on a symplectic manifold (M , ω) is called a


symplectic vector field if the 1-form iX (ω) is closed, which is equivalent to the
condition LX ω = 0.

The commutative algebra C ∞ (M ), with a multiplication given by the point-


wise product of functions, is called the algebra of classical observables. Assuming
that the Hamiltonian phase flow gt exists for all times, the time evolution of
every observable f ∈ C ∞ (M ) is given by

ft (x) = f (gt (x)), x ∈ M,

and is described by the differential equation

\[ \frac{df_t}{dt} = X_H(f_t), \]
which is Hamilton’s equation for classical observables. Hamilton’s equations for observables on M have the same form as Hamilton’s equations on M = T ∗ M , considered in Section 5.3. Since

XH (f ) = df (XH ) = ω(XH , J(df )) = ω(XH , Xf ),

we have the following definition.


Definition. A Poisson bracket on the algebra C ∞ (M ) of classical observ-
ables on a symplectic manifold (M , ω) is a bilinear mapping { , } : C ∞ (M ) ×
C ∞ (M ) → C ∞ (M ), defined by

{f, g} = ω(Xf , Xg ), f, g ∈ C ∞ (M ).

Now Hamilton’s equation takes the concise form

\[ \frac{df}{dt} = \{H, f\}, \tag{6.2} \]
understood as a differential equation for a family of functions ft on M with the initial condition ft |t=0 = f . In local coordinates $x = (x^1, \dots, x^{2n})$ on M ,
\[ \{f, g\}(x) = -\sum_{i,j=1}^{2n} \omega^{ij}(x)\,\frac{\partial f(x)}{\partial x^i}\frac{\partial g(x)}{\partial x^j}. \]
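The local formula can be compared with the canonical bracket (5.5) in the simplest case 2n = 2 with coordinates x = (p, q) and ω = dp ∧ dq, where ω₁₂ = 1 and ω₂₁ = −1. The sketch below inverts this constant matrix and evaluates both expressions by central differences; the test functions and all names are illustrative assumptions.

```python
def grad(f, x, h=1e-6):
    """Central-difference gradient of f at the point x."""
    g = []
    for k in range(len(x)):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g


def bracket_from_symplectic_matrix(f, g, x, omega):
    """{f, g}(x) = - sum_{i,j} omega^{ij} df/dx^i dg/dx^j for a constant 2x2 matrix omega_ij."""
    det = omega[0][0] * omega[1][1] - omega[0][1] * omega[1][0]
    # Inverse matrix {omega^{ij}}, written out for the 2x2 case.
    inv = [[omega[1][1] / det, -omega[0][1] / det],
           [-omega[1][0] / det, omega[0][0] / det]]
    df, dg = grad(f, x), grad(g, x)
    return -sum(inv[i][j] * df[i] * dg[j] for i in range(2) for j in range(2))


def canonical_bracket(f, g, x):
    """{f, g} = f_p g_q - f_q g_p with x = (p, q), as in (5.5)."""
    df, dg = grad(f, x), grad(g, x)
    return df[0] * dg[1] - df[1] * dg[0]


def f_example(x):
    return x[0] ** 2 * x[1]      # p^2 q


def g_example(x):
    return x[0] + x[1] ** 3      # p + q^3


if __name__ == "__main__":
    omega = [[0.0, 1.0], [-1.0, 0.0]]   # matrix of dp ^ dq in coordinates (p, q)
    x = [0.5, -1.2]
    print(bracket_from_symplectic_matrix(f_example, g_example, x, omega))
    print(canonical_bracket(f_example, g_example, x))
```

Both expressions give the same value, so the minus sign in the local formula is consistent with the convention {p, q} = 1.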

Theorem 6.3. The Poisson bracket { , } on a symplectic manifold (M , ω)


is skew-symmetric and satisfies Leibniz rule and the Jacobi identity.

Proof. The first two properties are obvious. It follows from the definition
of a Poisson bracket and the formula

\[ [X_f, X_g](h) = (X_f X_g - X_g X_f)(h) = \{f, \{g, h\}\} - \{g, \{f, h\}\} \]

that the Jacobi identity is equivalent to the property

(6.3) [Xf , Xg ] = X{f,g} .



Let X and Y be symplectic vector fields. Using Cartan’s formulas (1.1)–(1.2)


and (4.4), we get
\begin{align*}
i_{[X,Y]}(\omega) &= L_X(i_Y(\omega)) - i_Y(L_X(\omega)) \\
&= d(i_X \circ i_Y(\omega)) + i_X d(i_Y(\omega)) \\
&= d(\omega(Y, X)) = i_Z(\omega),
\end{align*}
where Z is a Hamiltonian vector field corresponding to ω(X, Y ) ∈ C ∞ (M ).
Since the 2-form ω is non-degenerate, this implies [X, Y ] = Z, so that setting
X = Xf , Y = Xg and using {f, g} = ω(Xf , Xg ), we get (6.3). 
From (6.3) we immediately get the following result.
Corollary 6.4. The subspace Ham(M ) of Hamiltonian vector fields on M
is a Lie subalgebra of Vect(M ). The mapping C ∞ (M ) → Ham(M ), given by
f ↦ Xf , is a Lie algebra homomorphism with the kernel consisting of locally
constant functions on M .
As in the case M = T ∗ M (see Section 5.3), an observable I — a function
on the phase space M — is called an integral of motion (first integral) for the
Hamiltonian system ((M , ω), H) if it is constant along the Hamiltonian phase
flow. According to (6.2), this is equivalent to the condition
(6.4) {H, I} = 0.
It is said that the observables H and I are in involution (Poisson commute).
From the Jacobi identity for the Poisson bracket we get the following result.
Corollary 6.5 (Poisson’s theorem). The Poisson bracket of two integrals
of motion is an integral of motion.
Proof. If {H, I1 } = {H, I2 } = 0, then
{H, {I1 , I2 }} = {{H, I1 }, I2 } − {{H, I2 }, I1 } = 0. 
It follows from Poisson’s theorem that integrals of motion form a Lie algebra
and, by (6.3), corresponding Hamiltonian vector fields form a Lie subalgebra in
Vect(M ). Since {I, H} = dH(XI ) = 0, the vector fields XI are tangent to
submanifolds ME = {x ∈ M : H(x) = E} — the level sets of the Hamiltonian
H. This defines a Lie algebra of integrals of motion for the Hamiltonian system
((M , ω), H) at the level set ME .

6.2. Poisson manifolds


The notion of a Poisson manifold generalizes the notion of a symplectic
manifold.
Definition. A Poisson manifold is a manifold M equipped with a Poisson
structure — a skew-symmetric bilinear mapping
{ , } : C ∞ (M ) × C ∞ (M ) → C ∞ (M )
which satisfies the Leibniz rule and Jacobi identity.

Equivalently, M is a Poisson manifold if the algebra A = C ∞ (M ) of classical


observables is a Poisson algebra — a Lie algebra such that the Lie bracket is
a bi-derivation with respect to the multiplication in A (a point-wise product
of functions). It follows from the derivation property that in local coordinates
x = (x1 , . . . , xN ) on M , the Poisson bracket has the form
\[ \{f, g\}(x) = \sum_{i,j=1}^{N} \eta^{ij}(x)\,\frac{\partial f(x)}{\partial x^i}\frac{\partial g(x)}{\partial x^j}. \]

The 2-tensor η ij (x), called a Poisson tensor, defines a global section η of the
vector bundle T M ∧ T M over M .
The evolution of classical observables on a Poisson manifold is given by
Hamilton’s equations, which have the same form as (6.2),
\[ \frac{df}{dt} = X_H(f) = \{H, f\}. \]
The phase flow gt for a complete Hamiltonian vector field XH = {H, · } defines
the evolution operator Ut : A → A by

Ut (f )(x) = f (gt (x)), f ∈ A.

Theorem 6.6. Suppose that every Hamiltonian vector field on a Poisson


manifold (M , { , }) is complete. Then for every H ∈ A, the corresponding
evolution operator Ut is an automorphism of the Poisson algebra A, i.e.,

(6.5) Ut ({f, g}) = {Ut (f ), Ut (g)} for all f, g ∈ A.

Conversely, if a skew-symmetric bilinear mapping { , } : C ∞ (M ) × C ∞ (M )


→ C ∞ (M ) is such that XH = {H, · } are complete vector fields for all H ∈ A,
and corresponding evolution operators Ut satisfy (6.5), then (M , { , }) is a
Poisson manifold.
Proof. Let ft = Ut (f ), gt = Ut (g), and¹ ht = Ut ({f, g}). By definition,
\[ \frac{d}{dt}\{f_t, g_t\} = \{\{H, f_t\}, g_t\} + \{f_t, \{H, g_t\}\} \quad\text{and}\quad \frac{dh_t}{dt} = \{H, h_t\}. \]
If (M , { , }) is a Poisson manifold, then it follows from the Jacobi identity that

{{H, ft }, gt } + {ft , {H, gt }} = {H, {ft , gt }},

so that ht and {ft , gt } satisfy the same differential equation (6.2). Since these
functions coincide at t = 0, (6.5) follows from the uniqueness theorem for the
ordinary differential equations.
Conversely, we get the Jacobi identity for the functions f, g, and H by dif-
ferentiating (6.5) with respect to t at t = 0. 
¹ Here $g_t$ is not the phase flow!

Corollary 6.7. A global section η of T M ∧ T M is a Poisson tensor if and


only if
LXf η = 0 for all f ∈ A.
Definition. The center of a Poisson algebra A is

Z(A) = {f ∈ A : {f, g} = 0 for all g ∈ A}.

A Poisson manifold (M , { , }) is called non-degenerate if the center of a Poisson


algebra of classical observables A = C ∞ (M ) consists only of locally constant
functions (Z(A) = R for connected M ).
Equivalently, a Poisson manifold (M , { , }) is non-degenerate if the Poisson
tensor η is non-degenerate everywhere on M , so that M is necessarily an even-
dimensional manifold. A non-degenerate Poisson tensor for every x ∈ M defines
an isomorphism J : Tx∗ M → Tx M by

η(u1 , u2 ) = u2 (J(u1 )), u1 , u2 ∈ Tx∗ M .

In local coordinates $x = (x^1, \dots, x^N)$ for the coordinate chart (U, ϕ) on M , we have
\[ J(dx^i) = \sum_{j=1}^{N} \eta^{ij}(x)\,\frac{\partial}{\partial x^j}, \quad i = 1, \dots, N. \]

A map ϕ : M1 → M2 of Poisson manifolds (M1 , { , }1 ) and (M2 , { , }2 ) is


called a Poisson mapping, if

{f ◦ ϕ, g ◦ ϕ}1 = {f, g}2 ◦ ϕ for all f, g ∈ C ∞ (M2 ).

A symplectic manifold carries a natural Poisson structure. Its non-degeneracy follows from the non-degeneracy of the symplectic form. The converse statement also holds.
Theorem 6.8. A non-degenerate Poisson manifold is a symplectic manifold.

Proof. Let (M , { , }) be a non-degenerate Poisson manifold. Define the


2-form ω on M by

ω(X, Y ) = J −1 (Y )(X), X, Y ∈ Vect(M ),

where the isomorphism $J : T^*M \to TM$ is defined by the Poisson tensor $\eta$. In local coordinates $x = (x^1, \dots, x^N)$ on $M$,
\[
\omega = -\sum_{1 \le i < j \le N} \eta_{ij}(x)\, dx^i \wedge dx^j,
\]
where $\{\eta_{ij}(x)\}_{i,j=1}^{N}$ is the inverse matrix to $\{\eta^{ij}(x)\}_{i,j=1}^{N}$. The 2-form $\omega$ is skew-symmetric and non-degenerate. For every $f \in A$ let $X_f = \{f, \cdot\,\}$ be the

corresponding vector field on M . The Jacobi identity for the Poisson bracket
{ , } is equivalent to LXf η = 0 for every f ∈ A, so that
LXf ω = 0.
Since Xf = Jdf , we have ω(X, Jdf ) = df (X) for every X ∈ Vect(M ), so that
ω(Xf , Xg ) = {f, g}.
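By Cartan's formula for the exterior differential,
\[
d\omega(X, Y, Z) = \tfrac{1}{3}\big(L_X\omega(Y, Z) + L_Y\omega(Z, X) + L_Z\omega(X, Y) - \omega([X, Y], Z) - \omega([Z, X], Y) - \omega([Y, Z], X)\big),
\]
where $X, Y, Z \in \mathrm{Vect}(M)$. Now setting $X = X_f$, $Y = X_g$, $Z = X_h$, we get
\begin{align*}
d\omega(X_f, X_g, X_h) &= \tfrac{1}{3}\big(\omega(X_h, [X_f, X_g]) + \omega(X_f, [X_g, X_h]) + \omega(X_g, [X_h, X_f])\big) \\
&= \tfrac{1}{3}\big(\omega(X_h, X_{\{f,g\}}) + \omega(X_f, X_{\{g,h\}}) + \omega(X_g, X_{\{h,f\}})\big) \\
&= \tfrac{1}{3}\big(\{h, \{f, g\}\} + \{f, \{g, h\}\} + \{g, \{h, f\}\}\big) \\
&= 0.
\end{align*}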
The exact 1-forms df, f ∈ A, generate the vector space of 1-forms A1 (M )
as a module over A, so that Hamiltonian vector fields Xf = Jdf generate the
vector space Vect(M ) as a module over A. Thus dω = 0 and (M , ω) is a
symplectic manifold associated with the Poisson manifold (M , { , }). 
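Remark. One can also prove this theorem by a straightforward computation in local coordinates $x = (x^1, \dots, x^N)$ on $M$. Just observe that the condition
\[
\frac{\partial \eta_{ij}(x)}{\partial x^l} + \frac{\partial \eta_{jl}(x)}{\partial x^i} + \frac{\partial \eta_{li}(x)}{\partial x^j} = 0, \quad i, j, l = 1, \dots, N,
\]
which is a coordinate form of $d\omega = 0$, follows from the condition
\[
\sum_{j=1}^{N}\left(\eta^{ij}(x)\frac{\partial \eta^{kl}(x)}{\partial x^j} + \eta^{lj}(x)\frac{\partial \eta^{ik}(x)}{\partial x^j} + \eta^{kj}(x)\frac{\partial \eta^{li}(x)}{\partial x^j}\right) = 0,
\]
which is a coordinate form of the Jacobi identity, by multiplying it three times by the inverse matrix $\eta_{ij}(x)$ using
\[
\sum_{p=1}^{N}\left(\eta^{ip}(x)\frac{\partial \eta_{pk}(x)}{\partial x^j} + \frac{\partial \eta^{ip}(x)}{\partial x^j}\,\eta_{pk}(x)\right) = 0.
\]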

Remark. Let M = T ∗ Rn with the Poisson bracket { , } given by the


canonical symplectic form ω = dp ∧ dq, where (p, q) = (p1 , . . . , pn , q 1 , . . . , q n )
are coordinate functions on T ∗ Rn . The non-degeneracy of the Poisson manifold
(T ∗ Rn , { , }) can be formulated as the property that the only observable f ∈
C ∞ (T ∗ Rn ) satisfying
{f, p1 } = · · · = {f, pn } = 0, {f, q 1 } = · · · = {f, q n } = 0
is f (p, q) = const.

6.3. Noether theorem with symmetries


Let $G$ be a finite-dimensional Lie group that acts on a connected symplectic manifold $(M, \omega)$ by symplectomorphisms. The Lie algebra $\mathfrak{g}$ of $G$ acts on $M$ by vector fields
\[
X_\xi(f)(x) = \left.\frac{d}{ds}\right|_{s=0} f(e^{-s\xi} \cdot x),
\]
and the linear mapping $\mathfrak{g} \ni \xi \mapsto X_\xi \in \mathrm{Vect}(M)$ is a homomorphism of Lie algebras,
\[
[X_\xi, X_\eta] = X_{[\xi,\eta]}, \quad \xi, \eta \in \mathfrak{g}.
\]

The G-action on M is called a Hamiltonian action, if Xξ are Hamiltonian vector


fields, i.e., for every ξ ∈ g there is Φξ ∈ C ∞ (M ), defined up to an additive
constant, such that Xξ = XΦξ = J(dΦξ ). Using (6.3), we see that for the
Hamiltonian action
X{Φξ ,Φη } = XΦ[ξ,η] ,

so that
{Φξ , Φη } = Φ[ξ,η] + c(ξ, η)

for some constants c(ξ, η). The Hamiltonian action is called a Poisson action if
there is a choice of functions Φξ such that the linear mapping Φ : g → C ∞ (M )
is a homomorphism of Lie algebras,

(6.6) {Φξ , Φη } = Φ[ξ,η] , ξ, η ∈ g.

Definition. A Lie group G is a symmetry group of the Hamiltonian system


((M , ω), H) if there is a Hamiltonian action of G on M such that

H(g · x) = H(x), g ∈ G, x ∈ M .

Theorem 6.9 (Noether theorem with symmetries). If G is a symmetry


group of the Hamiltonian system ((M , ω), H), then the functions Φξ , ξ ∈ g, are
the integrals of motion. If the action of G is Poisson, the integrals of motion
satisfy (6.6).

Proof. By definition of the Hamiltonian action, for every ξ ∈ g,

0 = Xξ (H) = XΦξ (H) = {Φξ , H}. 

Corollary 6.10. Let (M, L) be a Lagrangian system such that the Legendre
transform τL : T M → T ∗ M is a diffeomorphism. Then if a Lie group G
is a symmetry of (M, L), then G is a symmetry group of the corresponding
Hamiltonian system ((T ∗ M, ω), H = EL ◦ τL−1 ), and the corresponding G-action
on T ∗ M is Poisson. In particular, Φξ = −Iξ ◦ τL−1 , where Iξ are Noether
integrals of motion for the one-parameter subgroups of G generated by ξ ∈ g.

Proof. Let $X$ be the vector field associated with the one-parameter subgroup $\{e^{s\xi}\}_{s \in \mathbb{R}}$ of diffeomorphisms of $M$, used in Theorem 2.2, and let $X'$ be its lift to $TM$. We have²
\[
X_\xi = -(\tau_L)_*(X'), \tag{6.7}
\]
and it follows from (2.3) that $\Phi_\xi = i_{X_\xi}(\theta) = \theta(X_\xi)$, where $\theta$ is the canonical Liouville 1-form on $T^*M$. From Cartan's formula (1.1) and the formula $L_{X'}(\theta_L) = 0$ (see Problem 2.4) we get
\[
d\Phi_\xi = d(i_{X_\xi}(\theta)) = -i_{X_\xi}(d\theta) + L_{X_\xi}(\theta) = -i_{X_\xi}(\omega).
\]
It follows from (6.1) that $X_\xi = J(d\Phi_\xi)$ and the $G$-action is Hamiltonian. Using again the formula $L_{X'}(\theta_L) = 0$ and Cartan's formula (1.2), we obtain
\[
\Phi_{[\xi,\eta]} = i_{[X_\xi, X_\eta]}(\theta) = L_{X_\xi}(i_{X_\eta}(\theta)) - i_{X_\eta}(L_{X_\xi}(\theta)) = X_\xi(\Phi_\eta) = \{\Phi_\xi, \Phi_\eta\}.
\]

Example 6.1. The Lagrangian
\[
L = \tfrac{1}{2} m \dot{\mathbf{r}}^2 - V(r)
\]

for a particle in R3 moving in a central field (see Section 3.2) is invariant with
respect to the action of the group SO(3) of orthogonal transformations of the
Euclidean space R3 . Let u1 , u2 , u3 be a basis for the Lie algebra so(3) corre-
sponding to the rotations with the axes given by the vectors of the standard
basis e1 , e2 , e3 for R3 (see Example 2.2 in Section 2.2). These generators satisfy
the commutation relations
[ui , uj ] = εijk uk ,
where i, j, k = 1, 2, 3, and εijk is a totally anti-symmetric tensor, ε123 = 1.
Corresponding Noether integrals of motion are given by Φui = −Mi , where

M1 = (r × p)1 = r2 p3 − r3 p2 ,
M2 = (r × p)2 = r3 p1 − r1 p3 ,
M3 = (r × p)3 = r1 p2 − r2 p1

are components of the angular momentum vector $\mathbf{M} = \mathbf{r} \times \mathbf{p}$. (Here it is convenient to lower the indices of the coordinates $r^i$ by the Euclidean metric on $\mathbb{R}^3$.) For the Hamiltonian
\[
H = \frac{\mathbf{p}^2}{2m} + V(r)
\]
we have
\[
\{H, M_i\} = 0.
\]

² The negative sign reflects the difference in definitions of $X$ and $X_\xi$.

According to Theorem 6.9 and Corollary 6.10, Poisson brackets of the compo-
nents of the angular momentum satisfy

{Mi , Mj } = −εijk Mk ,

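which is also easy to verify directly using (5.5),
\[
\{f, g\}(\mathbf{p}, \mathbf{r}) = \frac{\partial f}{\partial \mathbf{p}}\frac{\partial g}{\partial \mathbf{r}} - \frac{\partial f}{\partial \mathbf{r}}\frac{\partial g}{\partial \mathbf{p}}.
\]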
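The direct verification can be sketched in a few lines of code. This is only an illustrative check under stated assumptions, not part of the notes: the bracket is evaluated at one arbitrarily chosen point, derivatives are taken by central differences with exact rational arithmetic (exact here because all functions involved are polynomials of degree at most two), and the helper names `grad`, `poisson`, `eps` as well as the test potential $V = |\mathbf{r}|^2$ are hypothetical choices.

```python
from fractions import Fraction

def grad(f, x):
    """Gradient by central differences with step 1 (exact for polynomials of degree <= 2)."""
    out = []
    for i in range(len(x)):
        up, dn = list(x), list(x)
        up[i] += 1
        dn[i] -= 1
        out.append((f(up) - f(dn)) / 2)
    return out

def poisson(f, g, x, n=3):
    """{f,g} = sum_k (df/dp_k dg/dr_k - df/dr_k dg/dp_k), with x = (p_1..p_n, r_1..r_n)."""
    df, dg = grad(f, x), grad(g, x)
    return sum(df[k] * dg[n + k] - df[n + k] * dg[k] for k in range(n))

def M(i):
    """Component i (0-based) of the angular momentum r x p."""
    j, k = (i + 1) % 3, (i + 2) % 3
    return lambda x: x[3 + j] * x[k] - x[3 + k] * x[j]

def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def H(x, m=Fraction(2)):
    """Assumed central-field Hamiltonian p^2/2m + |r|^2 (quadratic, so derivatives stay exact)."""
    return sum(x[k] ** 2 for k in range(3)) / (2 * m) + sum(x[3 + k] ** 2 for k in range(3))

# An arbitrary rational test point (p_1, p_2, p_3, r_1, r_2, r_3).
x0 = [Fraction(1, 2), Fraction(-3), Fraction(2), Fraction(5), Fraction(1, 3), Fraction(-7, 4)]

brackets_ok = all(
    poisson(M(i), M(j), x0) == -sum(eps(i, j, k) * M(k)(x0) for k in range(3))
    for i in range(3) for j in range(3)
)
conserved = all(poisson(H, M(i), x0) == 0 for i in range(3))
print(brackets_ok, conserved)
```

A check at a single point is of course not a proof; it only guards against sign errors in the conventions.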
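Example 6.2 (Kepler's problem). For every $\alpha \in \mathbb{R}$ the Lagrangian system on $\mathbb{R}^3$ with
\[
L = \tfrac{1}{2} m \dot{\mathbf{r}}^2 + \frac{\alpha}{r}
\]
has three extra integrals of motion: the components $W_1, W_2, W_3$ of the Laplace-Runge-Lenz vector, given by
\[
\mathbf{W} = \frac{\mathbf{p}}{m} \times \mathbf{M} - \frac{\alpha \mathbf{r}}{r}
\]
(see Section 3.3). Using Poisson brackets from the previous example, together with $\{r_i, M_j\} = -\varepsilon_{ijk} r_k$ and $\{p_i, M_j\} = -\varepsilon_{ijk} p_k$, we get by a straightforward computation,
\[
\{W_i, M_j\} = -\varepsilon_{ijk} W_k \quad\text{and}\quad \{W_i, W_j\} = \frac{2H}{m}\,\varepsilon_{ijk} M_k,
\]
where $H = \dfrac{\mathbf{p}^2}{2m} - \dfrac{\alpha}{r}$ is the Hamiltonian of Kepler's problem.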
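The brackets of Example 6.2 can be spot-checked numerically. The sketch below is an assumption-laden illustration rather than the computation intended in the notes: it uses floating-point central differences, one arbitrary phase-space point, and hypothetical values for $m$ and $\alpha$ (the names `M_MASS`, `ALPHA`, `max_error` are introduced here only for the example).

```python
import math

M_MASS, ALPHA = 1.3, 0.7   # hypothetical values of m and alpha

def grad(f, x, h=1e-5):
    """Numerical gradient by central differences."""
    out = []
    for i in range(len(x)):
        up, dn = list(x), list(x)
        up[i] += h
        dn[i] -= h
        out.append((f(up) - f(dn)) / (2 * h))
    return out

def poisson(f, g, x, n=3):
    """{f,g} = sum_k (df/dp_k dg/dr_k - df/dr_k dg/dp_k), with x = (p, r)."""
    df, dg = grad(f, x), grad(g, x)
    return sum(df[k] * dg[n + k] - df[n + k] * dg[k] for k in range(n))

def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def norm_r(x):
    return math.sqrt(sum(x[3 + k] ** 2 for k in range(3)))

def M(i):
    """Component i of r x p."""
    j, k = (i + 1) % 3, (i + 2) % 3
    return lambda x: x[3 + j] * x[k] - x[3 + k] * x[j]

def H(x):
    """Kepler Hamiltonian p^2/2m - alpha/r."""
    return sum(x[k] ** 2 for k in range(3)) / (2 * M_MASS) - ALPHA / norm_r(x)

def W(i):
    """Component i of (p x M)/m - alpha r/|r|."""
    j, k = (i + 1) % 3, (i + 2) % 3
    return lambda x: (x[j] * M(k)(x) - x[k] * M(j)(x)) / M_MASS - ALPHA * x[3 + i] / norm_r(x)

def max_error(x):
    """Largest deviation from the two bracket relations of Example 6.2 at the point x."""
    err = 0.0
    for i in range(3):
        for j in range(3):
            wm = -sum(eps(i, j, k) * W(k)(x) for k in range(3))
            ww = 2 * H(x) / M_MASS * sum(eps(i, j, k) * M(k)(x) for k in range(3))
            err = max(err, abs(poisson(W(i), M(j), x) - wm),
                      abs(poisson(W(i), W(j), x) - ww))
    return err

x0 = [0.4, -0.9, 0.3, 1.1, 0.5, -0.8]   # arbitrary point (p, r) with r != 0
print(max_error(x0))
```

The printed deviation should be at the level of finite-difference noise; the sign of $\{W_i, W_j\}$ depends on the bracket convention (5.5) used above.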
The Hamiltonian system ((M , ω), H), dim M = 2n, is called completely
integrable if it has n independent integrals of motion F1 = H, . . . , Fn in involu-
tion. The former condition means that dF1 (x), . . . , dFn (x) ∈ Tx∗ M are linearly
independent for almost all x ∈ M . Hamiltonian systems with one degree of
freedom such that dH has only finitely many zeros are completely integrable.
Complete separation of variables in the Hamilton-Jacobi equation (see Section
5.4) provides other examples of completely integrable Hamiltonian systems.
Let ((M , ω), H) be a completely integrable Hamiltonian system. Suppose
that the level set Mf = {x ∈ M : F1 (x) = f1 , . . . , Fn (x) = fn } is compact and
tangent vectors JdF1 , . . . , JdFn are linearly independent for all x ∈ Mf . Then
by the Liouville-Arnold theorem, in a neighborhood of Mf there exist so-called
action-angle variables: coordinates I = (I1 , . . . , In ) ∈ Rn+ = (R>0 )n and ϕ =
(ϕ1 , . . . , ϕn ) ∈ T n = (R/2πZ)n such that ω = dI ∧ dϕ and H = H(I1 , . . . , In ).
According to Hamilton's equations,
\[
\dot{I}_i = 0 \quad\text{and}\quad \dot{\varphi}_i = \omega_i = \frac{\partial H}{\partial I_i}, \quad i = 1, \dots, n,
\]
so that action variables are constants, and angle variables change uniformly, $\varphi_i(t) = \varphi_i(0) + \omega_i t$, $i = 1, \dots, n$. The classical motion is almost-periodic with the frequencies $\omega_1, \dots, \omega_n$.
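For one degree of freedom the action variable can be computed numerically. The following sketch assumes a harmonic oscillator $H = p^2/2m + m\omega_0^2 x^2/2$ (an example not taken from the notes) and the normalization $I = \frac{1}{2\pi}\oint p\,dx$, which is the one compatible with $\omega = dI \wedge d\varphi$ and a $2\pi$-periodic angle; the function names and parameter values are hypothetical. Under these assumptions $I = E/\omega_0$ and $\partial H/\partial I = \omega_0$.

```python
import math

def action(E, m=1.0, w0=2.0, steps=20000):
    """I(E) = (1/2 pi) * closed-orbit integral of p dx for H = p^2/2m + m w0^2 x^2 / 2."""
    x_max = math.sqrt(2 * E / (m * w0 ** 2))        # turning points are +-x_max
    dx = 2 * x_max / steps
    half = 0.0
    for k in range(steps):
        x = -x_max + (k + 0.5) * dx                 # midpoint rule on the branch p >= 0
        half += math.sqrt(max(0.0, 2 * m * (E - 0.5 * m * w0 ** 2 * x ** 2))) * dx
    return 2 * half / (2 * math.pi)                 # the branch p <= 0 contributes the same area

def frequency(E, dE=1e-2, **kw):
    """omega = dH/dI, estimated by a symmetric difference quotient in the energy."""
    return 2 * dE / (action(E + dE, **kw) - action(E - dE, **kw))

print(action(3.0), frequency(3.0))
```

With the default parameters the two printed numbers should be close to $E/\omega_0 = 1.5$ and $\omega_0 = 2$, up to quadrature error.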

Problem 6.1. Let $(M, \omega)$ be a symplectic manifold. For $x \in M$ choose a function $q^1$ on $M$ such that $q^1(x) = 0$ and $dq^1$ does not vanish at $x$, and set $X = -X_{q^1}$. Show that there is a neighborhood $U$ of $x \in M$ and a function $p_1$ on $U$ such that $X(p_1) = 1$ on $U$, and there exist coordinates $p_1, q^1, z^1, \dots, z^{2n-2}$ on $U$ such that
\[
X = \frac{\partial}{\partial p_1} \quad\text{and}\quad Y = X_{p_1} = \frac{\partial}{\partial q^1}.
\]
Problem 6.2. Continuing Problem 6.1, show that the 2-form ω − dp1 ∧ dq 1 on
U depends only on coordinates z 1 , . . . , z 2n−2 and is non-degenerate.
Problem 6.3 (Coadjoint orbits). Let G be a finite-dimensional Lie group, let g
be its Lie algebra, and let g∗ be the dual vector space to g. For u ∈ g∗ let M = Ou
be the orbit of u under the coadjoint action of G on g∗ . Show that the formula
ω(u1 , u2 ) = u([x1 , x2 ]),
where u1 = ad∗ x1 (u), u2 = ad∗ x2 (u) ∈ Tu M , and ad∗ stands for the coadjoint action
of a Lie algebra g on g∗ , gives rise to a well-defined 2-form on M , which is closed and
non-degenerate. (The 2-form ω is called the Kirillov-Kostant symplectic form.)
Problem 6.4 (Symplectic quotients). For a Poisson action of a Lie group $G$ on a symplectic manifold $(M, \omega)$, define the moment map $P : M \to \mathfrak{g}^*$ by
\[
P(x)(\xi) = \Phi_\xi(x), \quad \xi \in \mathfrak{g},\ x \in M,
\]
where $\mathfrak{g}$ is the Lie algebra of $G$. For every $p \in \mathfrak{g}^*$ such that the stabilizer $G_p$ of $p$ acts freely and properly on the level set $\mathcal{M}_p = P^{-1}(p)$ (such $p$ is called a regular value of the moment map), the quotient $M_p = G_p \backslash \mathcal{M}_p$ is called a reduced phase space. Show that $M_p$ is a symplectic manifold with the symplectic form uniquely characterized by the condition that its pull-back to $\mathcal{M}_p$ coincides with the restriction to $\mathcal{M}_p$ of the symplectic form $\omega$.
Problem 6.5 (Dual space to a Lie algebra). Let g be a finite-dimensional Lie
algebra with a Lie bracket [ , ], and let g∗ be its dual space. For f, g ∈ C ∞ (g∗ ) define
\[
\{f, g\}(u) = u([df, dg]),
\]
where $u \in \mathfrak{g}^*$ and $T_u^*\mathfrak{g}^* \simeq \mathfrak{g}$. Prove that { , } is a Poisson bracket. (It was introduced
by Sophus Lie and is called a linear, or Lie-Poisson bracket.) Show that this bracket
is degenerate and determine the center of A = C ∞ (g∗ ).
Problem 6.6. A Poisson bracket { , } on M restricts to a Poisson bracket { , }0
on a submanifold N , if the inclusion ı : N → M is a Poisson mapping. Show that the
Lie-Poisson bracket on g∗ restricts to a non-degenerate Poisson bracket on a coadjoint
orbit, associated with the Kirillov-Kostant symplectic form.
Problem 6.7. Do the computation in Example 6.2 and show that the Lie alge-
bra of the integrals M1 , M2 , M3 , W1 , W2 , W3 in Kepler’s problem at H(p, r) = E is
isomorphic to the Lie algebra so(4), if E < 0, to the Euclidean Lie algebra e(3), if
E = 0, and to the Lie algebra so(1, 3), if E > 0.
Problem 6.8. Find the action-angle variables for a particle with one degree of freedom, when the potential $V(x)$ is a convex function on $\mathbb{R}$ satisfying $\lim_{|x| \to \infty} V(x) = \infty$. (Hint: Define $I = \oint p\,dx$, where integration goes over the closed orbit with $H(p, x) = E$.)
Problem 6.9. Show that a Hamiltonian system describing a particle in R3 mov-
ing in a central field is completely integrable, and find the action-angle variables.
LECTURE 7

Hamiltonian systems with constraints

7.1. First order formalism


As in Lecture 1, consider a Lagrangian system $(M, L)$ with the Lagrangian function $L : TM \to \mathbb{R}$. If the Lagrangian $L$ is non-degenerate, by doubling the number of degrees of freedom, we can replace $(M, L)$ with another Lagrangian system $(\mathcal{M}, \mathcal{L})$, where the Lagrangian function $\mathcal{L} : T\mathcal{M} \to \mathbb{R}$ is linear in generalized velocities.
Namely, consider $\mathcal{M} = TM$ as a new configuration space with generalized coordinates $\xi$ and define the Lagrangian $\mathcal{L}$ on $T\mathcal{M}$ by the following formula
\[
\mathcal{L}(\xi, \dot{\xi}) = \sum_{i=1}^{n} \frac{\partial L}{\partial v^i}(\dot{q}^i - v^i) + L(q, v) = \frac{\partial L}{\partial v}(\dot{q} - v) + L(q, v). \tag{7.1}
\]
Here $\xi = (q, v)$ are standard coordinates¹ on $\mathcal{M} = TM$, and $(\xi, \dot{\xi})$, where $\dot{\xi} = (\dot{q}, \dot{v})$, are the corresponding standard coordinates on $T\mathcal{M}$.
Lemma 7.1. If the Lagrangian function $L : TM \to \mathbb{R}$ is non-degenerate, the Lagrangian systems $(M, L)$ and $(\mathcal{M}, \mathcal{L})$ are equivalent: the corresponding Euler-Lagrange equations coincide.
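Proof. The Euler-Lagrange equations
\[
\frac{d}{dt}\frac{\partial \mathcal{L}}{\partial \dot{\xi}} - \frac{\partial \mathcal{L}}{\partial \xi} = 0
\]
for the Lagrangian system $(\mathcal{M}, \mathcal{L})$ reduce to
\[
\frac{d}{dt}\frac{\partial \mathcal{L}}{\partial \dot{q}} - \frac{\partial \mathcal{L}}{\partial q} = 0 \quad\text{and}\quad \frac{d}{dt}\frac{\partial \mathcal{L}}{\partial \dot{v}} - \frac{\partial \mathcal{L}}{\partial v} = 0. \tag{7.2}
\]
It follows from (7.1) that $\dfrac{\partial \mathcal{L}}{\partial \dot{v}} = 0$, and from the second equation in (7.2) we obtain
\[
0 = \frac{\partial \mathcal{L}}{\partial v} = \frac{\partial^2 L}{\partial v \partial v}(\dot{q} - v) - \frac{\partial L}{\partial v} + \frac{\partial L}{\partial v} = \frac{\partial^2 L}{\partial v \partial v}(\dot{q} - v).
\]
Since the Lagrangian $L$ is non-degenerate, this implies
\[
\dot{q} = v. \tag{7.3}
\]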
¹ To avoid confusion, here we do not denote standard coordinates on $TM$ by $(q, \dot{q})$.


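Using (7.1) we can rewrite the first equation in (7.2) as
\[
0 = \frac{d}{dt}\frac{\partial L}{\partial v} - \frac{\partial^2 L}{\partial q \partial v}(\dot{q} - v) - \frac{\partial L}{\partial q}.
\]
Using (7.3), we obtain the Euler-Lagrange equations for the Lagrangian system $(M, L)$,
\[
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = 0.
\]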
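As an illustration of the lemma (an assumed example, not taken from the notes), consider one degree of freedom with $L(q, v) = \tfrac{1}{2} m v^2 - V(q)$. Formula (7.1) and the two equations in (7.2) then become:

```latex
\begin{align*}
\mathcal{L}(q, v, \dot q, \dot v)
  &= m v(\dot q - v) + \tfrac12 m v^2 - V(q)
   = m v\,\dot q - \tfrac12 m v^2 - V(q), \\
\frac{d}{dt}\frac{\partial \mathcal{L}}{\partial \dot v} - \frac{\partial \mathcal{L}}{\partial v}
  &= -m(\dot q - v) = 0
  \;\Longrightarrow\; \dot q = v, \\
\frac{d}{dt}\frac{\partial \mathcal{L}}{\partial \dot q} - \frac{\partial \mathcal{L}}{\partial q}
  &= m \dot v + V'(q) = 0
  \;\Longrightarrow\; m \ddot q = -V'(q).
\end{align*}
```

In this example the first-order Lagrangian is linear in the velocities $(\dot q, \dot v)$, with coefficient $m v$ in front of $\dot q$, no $\dot v$ term, and the energy $\tfrac12 m v^2 + V(q)$ playing the role of the velocity-independent part.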

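In general, a Lagrangian system $(\mathcal{M}, \mathcal{L})$ in the first order formalism is defined by the Lagrangian function $\mathcal{L} : T\mathcal{M} \to \mathbb{R}$, which in standard coordinates on $T\mathcal{M}$ is given by
\[
\mathcal{L}(\xi, \dot{\xi}) = \sum_{\alpha=1}^{N} f_\alpha(\xi)\,\dot{\xi}^\alpha - H(\xi), \tag{7.4}
\]
where $H$ is a function on $\mathcal{M}$ and $N = \dim \mathcal{M}$. It is also said that the Lagrangian $\mathcal{L}$ is linear in generalized velocities. It is associated with the 1-form $\vartheta_{\mathcal{L}}$ on $\mathcal{M} \times \mathbb{R}$,
\[
\vartheta_{\mathcal{L}} = \sum_{\alpha=1}^{N} f_\alpha(\xi)\,d\xi^\alpha - H(\xi)\,dt. \tag{7.5}
\]
It has the property that for every path $\gamma : [t_0, t_1] \to \mathcal{M}$,
\[
\int_{t_0}^{t_1} \mathcal{L}(\gamma'(t))\,dt = \int_\sigma \vartheta_{\mathcal{L}},
\]
where $\gamma'(t)$ is a vertical lift of $\gamma(t)$ to $T\mathcal{M}$ and $\sigma = \{(\gamma(t), t);\ t_0 \le t \le t_1\}$ is a 1-chain on $\mathcal{M} \times \mathbb{R}$ (cf. Problem 1.1 in Lecture 1).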
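Remark. In the case when $\mathcal{M} = TM$ and the Lagrangian $\mathcal{L}$ is given by (7.1),
\[
\vartheta_{\mathcal{L}} = \theta_L - E\,dt,
\]
where $\theta_L = \dfrac{\partial L}{\partial v}\,dq$ is the 1-form associated with the Lagrangian $L : TM \to \mathbb{R}$ (see formula (2.1) in Lecture 2), and $E = \dfrac{\partial L}{\partial v}\,v - L$ is the corresponding energy (see Sect. 2.1 in Lecture 2).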
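Definition. The Lagrangian $\mathcal{L}$, given by (7.4), is called non-degenerate, if the 2-form
\[
\omega = d\left(\sum_{\alpha=1}^{N} f_\alpha(\xi)\,d\xi^\alpha\right) = \sum_{\alpha,\beta=1}^{N} \frac{\partial f_\beta}{\partial \xi^\alpha}(\xi)\,d\xi^\alpha \wedge d\xi^\beta
\]
is non-degenerate on $\mathcal{M}$.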

It follows from the previous remark and Problem 2.1 in Lecture 2, that
for Lagrangians (7.1) this definition agrees with the one given in Lecture 1.
If the Lagrangian L is non-degenerate, it follows from the Darboux theorem
that N = 2n is even and there exist local canonical coordinates (p, q) =
(p1 , . . . , pn , q 1 , . . . , q n ) on M such that

ϑL = pdq − H(p, q)dt

and the Euler-Lagrange equations for the Lagrangian

L = pq̇ − H(p, q)

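are Hamilton's equations
\[
\dot{p} = -\frac{\partial H}{\partial q}, \quad \dot{q} = \frac{\partial H}{\partial p}
\]
with the Hamiltonian function $H(p, q)$. This repeats the derivation of Hamilton's equations given in Sect. 4.2 in Lecture 4, but without explicitly using the Legendre transform.
Remark. In this case we trivially have
\[
p = \frac{\partial \mathcal{L}}{\partial \dot{q}} \quad\text{and}\quad H = p\dot{q} - \mathcal{L}.
\]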
7.2. Singular Lagrangians
Here we consider the important case when the Lagrangian (7.4) is singular. The Darboux theorem is still applicable and guarantees the existence of local coordinates $(p, q, \lambda) = (p_1, \dots, p_n, q^1, \dots, q^n, \lambda_1, \dots, \lambda_m)$ on $\mathcal{M}$, where $N = 2n + m$, such that
\[
\vartheta_{\mathcal{L}} = p\,dq - H(p, q, \lambda)\,dt + dS
\]
for some (local) function $S(p, q, \lambda)$. Since the addition of an exact form does not change the equations of motion (see Problem 1.2 in Lecture 1), the Euler-Lagrange equations for the Lagrangian $\mathcal{L}$ have the following form
\[
\dot{p} = -\frac{\partial H}{\partial q}, \quad \dot{q} = \frac{\partial H}{\partial p} \quad\text{and}\quad \frac{\partial H}{\partial \lambda} = 0. \tag{7.6}
\]
Now suppose that the $m \times m$ matrix $\left\{\dfrac{\partial^2 H}{\partial \lambda_a \partial \lambda_b}\right\}_{a,b=1}^{m}$ has constant rank $k$ on $\mathcal{M}$. If $k = m$, it follows from the implicit function theorem that the equations $\dfrac{\partial H}{\partial \lambda} = 0$ in (7.6) determine a submanifold $\tilde{\mathcal{M}}$ in $\mathcal{M}$ of dimension $N - m = 2n$, given by the equations $\lambda_a = \lambda_a(p, q)$, $a = 1, \dots, m$. In other words, in this case it is possible to exclude the coordinates $\lambda = (\lambda_1, \dots, \lambda_m)$. Putting
\[
\tilde{H}(p, q) = H(p, q, \lambda(p, q)) \quad\text{and}\quad \tilde{\mathcal{L}} = p\dot{q} - \tilde{H}(p, q),
\]



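we obtain a non-degenerate Lagrangian system $(\tilde{\mathcal{M}}, \tilde{\mathcal{L}})$ whose Euler-Lagrange equations coincide with the restriction to $\tilde{\mathcal{M}}$ of the first two equations in (7.6). Indeed, we have
\[
\frac{\partial \tilde{H}}{\partial p} = \left.\left(\frac{\partial H}{\partial p} + \frac{\partial H}{\partial \lambda}\frac{\partial \lambda}{\partial p}\right)\right|_{\tilde{\mathcal{M}}} = \left.\frac{\partial H}{\partial p}\right|_{\tilde{\mathcal{M}}},
\]
\[
\frac{\partial \tilde{H}}{\partial q} = \left.\left(\frac{\partial H}{\partial q} + \frac{\partial H}{\partial \lambda}\frac{\partial \lambda}{\partial q}\right)\right|_{\tilde{\mathcal{M}}} = \left.\frac{\partial H}{\partial q}\right|_{\tilde{\mathcal{M}}}.
\]
Correspondingly, $\tilde{\mathcal{M}}$ is a symplectic manifold with the symplectic form $dp \wedge dq$ and Hamiltonian $\tilde{H}(p, q)$, and the Euler-Lagrange equations (7.6), restricted to $\tilde{\mathcal{M}}$, become Hamilton's equations.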
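A minimal worked instance of this elimination, with a Hamiltonian chosen here purely for illustration (it does not appear in the notes): take $n = m = 1$ and $H(p, q, \lambda) = \tfrac12 p^2 + \lambda q - \tfrac12 \lambda^2$, so that $\partial^2 H / \partial \lambda^2 = -1$ has rank $k = m = 1$.

```latex
\begin{align*}
\frac{\partial H}{\partial \lambda} &= q - \lambda = 0
  \;\Longrightarrow\; \lambda(p, q) = q, \\
\tilde H(p, q) &= H(p, q, \lambda(p, q)) = \tfrac12 p^2 + q^2 - \tfrac12 q^2
  = \tfrac12 p^2 + \tfrac12 q^2, \\
\frac{\partial \tilde H}{\partial q} &= q
  = \left.\frac{\partial H}{\partial q}\right|_{\lambda = q}, \qquad
\frac{\partial \tilde H}{\partial p} = p
  = \left.\frac{\partial H}{\partial p}\right|_{\lambda = q}, \\
\dot p &= -q, \qquad \dot q = p.
\end{align*}
```

In this assumed example the excluded coordinate simply reproduces a harmonic oscillator on the $(p, q)$-plane.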
In case $k < m$, by using an appropriate change of coordinates $\lambda$, we can exclude the first $k$ coordinates $\lambda_1, \dots, \lambda_k$ (such coordinates are called excludable), while the remaining $m - k$ coordinates $\lambda_{k+1}, \dots, \lambda_m$ satisfy
\[
\frac{\partial^2 H}{\partial \lambda_a \partial \lambda_b} = 0, \quad a, b = k + 1, \dots, m,
\]
so that $H$ is a linear function of $\lambda_{k+1}, \dots, \lambda_m$. Thus from the very beginning we can assume that M = M0 × Rm , where M0 has canonical coordinates (p, q) and symplectic form ω = dp ∧ dq, and consider singular Lagrangians on T M of the form
\[
\mathcal{L} = p\dot{q} - H(p, q) - \sum_{a=1}^{m} \lambda_a \varphi^a(p, q), \tag{7.7}
\]
where the coordinates $\lambda$ play the role of Lagrange multipliers. The Euler-Lagrange equations are
\[
\dot{p} = -\frac{\partial H}{\partial q} - \sum_{a=1}^{m} \lambda_a \frac{\partial \varphi^a}{\partial q}, \tag{7.8}
\]
\[
\dot{q} = \frac{\partial H}{\partial p} + \sum_{a=1}^{m} \lambda_a \frac{\partial \varphi^a}{\partial p}, \tag{7.9}
\]
and
\[
\varphi^a(p, q) = 0, \quad a = 1, \dots, m. \tag{7.10}
\]

The functions $\varphi^a(p, q)$ are called constraints and equations (7.10) determine a subset M0 ⊂ M. In case when the $m \times 2n$ matrix $\left(\dfrac{\partial \varphi^a}{\partial p_i}, \dfrac{\partial \varphi^a}{\partial q^i}\right)$ has rank $m$ on M0 , the set M0 is a submanifold of M of dimension 2n − m. Restricting the 1-form $\vartheta_{\mathcal{L}}$ on M0 × R and using the Darboux theorem, we either obtain a non-degenerate Lagrangian system that corresponds to the Hamiltonian system, or we get a singular Lagrangian. In that case we repeat the above procedure

and obtain a Lagrangian as in (7.7), and the corresponding constraints determine a submanifold M1 of M0 . Iterating this procedure, in finitely many steps we obtain a non-degenerate Lagrangian.
At each step of this procedure one needs to solve equations (7.10) in order to apply the Darboux theorem to the restriction of the 1-form $\vartheta_{\mathcal{L}}$ to the submanifold M0 . This could be a very difficult problem. However, there is an important class of constraints for which one does not need to solve equations (7.10).

7.3. First class constraints and reduced phase space


It seems natural to identify functions on M with the same restrictions on M0 . Namely, let I be an ideal in the algebra A = C ∞ (M), consisting of functions that vanish on M0 . The condition that the matrix $\left(\dfrac{\partial \varphi^a}{\partial p_i}, \dfrac{\partial \varphi^a}{\partial q^i}\right)$ has constant rank $m$ on M0 leads to the following result.
Lemma 7.2. The ideal I is generated by the constraints $\varphi^1, \dots, \varphi^m$.
However, there are two questions one needs to address in order to formulate consistent dynamics for Hamiltonian systems with constraints.

• Whether trajectories (p(t), q(t)) of Hamilton's equations (7.8)–(7.9) lie on M0 if (p(0), q(0)) ∈ M0 .

• Describing the algebra of observables whose evolution does not depend on the arbitrary parameters λ1 , . . . , λm in (7.8)–(7.9).

It is remarkable that, according to Dirac, the affirmative answer to both of these questions is obtained by using the following definition. Let { , } be the Poisson bracket on M associated with the symplectic form dp ∧ dq.
Definition. Constraints $\varphi^1, \dots, \varphi^m$ for the singular Lagrangian (7.7) are called first class constraints if $\{\varphi^a, \varphi^b\}, \{H, \varphi^a\} \in I$, $a, b = 1, \dots, m$.
In other words, there are functions $g^{ab}_c$ and $h^a_b$ on M such that
\[
\{\varphi^a, \varphi^b\} = \sum_{c=1}^{m} g^{ab}_c\,\varphi^c \quad\text{and}\quad \{H, \varphi^a\} = \sum_{b=1}^{m} h^a_b\,\varphi^b. \tag{7.11}
\]

Lemma 7.3. For the first class constraints, trajectories (p(t), q(t)) of Hamilton's equations (7.8)–(7.9) lie on M0 if (p(0), q(0)) ∈ M0 .

Proof. It follows from (7.8)–(7.9) that
\[
\dot{\varphi}^a = \{H, \varphi^a\} + \sum_{b=1}^{m} \lambda_b \{\varphi^b, \varphi^a\},
\]
and it follows from (7.11) that $\dot{\varphi}^a = 0$ on M0 . Thus $\varphi^a(p(t), q(t)) = \varphi^a(p(0), q(0)) = 0$, $a = 1, \dots, m$.

In general, according to (7.8)–(7.9), the evolution of an arbitrary f ∈ A is given by
\[
\dot{f} = \{H, f\} + \sum_{a=1}^{m} \lambda_a \{\varphi^a, f\}, \tag{7.12}
\]
and it follows from (7.11) that the restriction of this equation to M0 does not depend on the choice of a representative in f mod I and defines the evolution in the algebra A/I. Still, this evolution depends on the choice of arbitrary parameters λ1 , . . . , λm .
Definition. Admissible observables are functions f ∗ on M0 whose exten-
sions f to M satisfy

(7.13) {f, ϕa }|M0 = 0, a = 1, . . . , m.

In particular, $H^* = H|_{M_0}$ is an admissible observable. It follows from Lemma 7.2 and (7.11) that (7.13) is valid for any smooth extension $f$ of a function $f^*$ on M0 . The Poisson bracket of admissible observables is defined by
\[
\{f^*, g^*\}_0 = \{f, g\}|_{M_0},
\]
and it follows from (7.13) that admissible observables form a Poisson algebra $A^*$. For admissible observables equation (7.12) takes the form
\[
\dot{f}^* = \{H^*, f^*\}_0 \tag{7.14}
\]
and no longer depends on the choice of arbitrary parameters λ1 , . . . , λm .
Put
\[
A_0 = \{f \in A : \{f, \varphi^a\}|_{M_0} = 0,\ a = 1, \dots, m\}.
\]
Lemma 7.4. A0 is a Poisson subalgebra of A: if f, g ∈ A0 , then f g ∈ A0
and {f, g} ∈ A0 . Moreover, I ⊂ A0 is a Poisson algebra ideal of A0 and

A0 /I ' A∗ .

Proof. Follows from Lemma 7.2, equations (7.11) and Jacobi identity. 

The functions f ∗ ∈ A∗ depend on 2n − m − m = 2(n − m) variables and in many cases can be thought of as functions on the reduced phase space, a symplectic manifold Γ of dimension 2n − 2m. This can be described geometrically as follows. Let
Xϕa ∈ Vect(M) be the Hamiltonian vector fields corresponding to the functions
ϕa on M. We have, according to formula (6.3) in Lecture 6,

(7.15) [Xϕa , Xϕb ] = X{ϕa ,ϕb } , a, b = 1, . . . , m.

We also have
ω(Xϕa , Xϕb ) = {ϕa , ϕb },

so that
\[
\omega_x(X_{\varphi^a}, X_{\varphi^b}) = 0 \quad\text{for all } x \in M_0.
\]
Denote by $Y_a$ the vector fields $X_{\varphi^a}$ along M0 . It follows from (7.10) and (7.11) that $Y_a$ are tangent to M0 and

ω|M0 (Ya , Yb ) = 0, a, b = 1, . . . , m.

Thus the closed 2-form ω0 , the restriction of the symplectic form ω to M0 , has an m-dimensional kernel, generated by the vector fields $Y_a$ ∈ Vect(M0 ). It follows from (7.11) and (7.15) that
\[
[X_{\varphi^a}, X_{\varphi^b}] = \sum_{c=1}^{m} g^{ab}_c\,X_{\varphi^c} \quad\text{on } M_0.
\]
This means that the vector fields $Y_1, \dots, Y_m$ generate a smooth involutive distribution on M0 , that is, a subbundle P of the tangent bundle T M0 such that [X, Y ] ∈ P if X, Y ∈ P. By the Frobenius theorem, M0 is a foliation with m-dimensional leaves given by the integral manifolds of the distribution P.
In case this foliation is a fibration with the base M∗ , a 2n − 2m dimensional submanifold of M0 , we have A∗ = C ∞ (M∗ ) and the closed 2-form ω ∗ , the restriction of the 2-form ω0 to M∗ , is non-degenerate! Indeed, equations (7.13)
imply that the functions f are constant along the fibers. Locally, M∗ can be
defined by the equations

(7.16) χa (p, q) = 0, a = 1, . . . , m,

called additional constraints. Condition
\[
\det\{\chi_a, \varphi^b\}_{a,b=1}^{m} \neq 0 \tag{7.17}
\]
guarantees that the submanifold of M0 , defined by (7.16), intersects transversally the integral manifolds of the distribution P. If the intersection of every integral manifold with this submanifold consists of only one point, equations (7.10) and (7.16) in M determine the reduced phase space, a (2n − 2m)-dimensional manifold M∗ with the symplectic form ω ∗ = ω|M∗ .
In the special case when
\[
\{\chi_a, \chi_b\} = 0, \quad a, b = 1, \dots, m, \tag{7.18}
\]
one can easily find canonical coordinates on M∗ . Indeed, put $p_a = \chi_a$, $a = 1, \dots, m$. By the Darboux theorem, there are coordinates $q^a$ and
\[
(p^*, q^*) = \big(p^*_1, \dots, p^*_{n-m}, (q^*)^1, \dots, (q^*)^{n-m}\big)
\]
such that
\[
\omega = \sum_{a=1}^{m} dp_a \wedge dq^a + dp^* \wedge dq^*.
\]

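Transversality condition (7.17) becomes
\[
\det\left\{\frac{\partial \varphi^a}{\partial q^b}\right\}_{a,b=1}^{m} \neq 0,
\]
so that $q^a = q^a(p^*, q^*)$. Thus the reduced phase space M∗ is given by the equations
\[
p_a = 0, \quad q^a = q^a(p^*, q^*), \quad a = 1, \dots, m,
\]
and
\[
\omega^* = dp^* \wedge dq^*.
\]
We also have $f^*(p^*, q^*) = f(0, p^*, q^a(p^*, q^*), q^*)$ and
\[
\{f^*, g^*\} = \frac{\partial f^*}{\partial p^*}\frac{\partial g^*}{\partial q^*} - \frac{\partial f^*}{\partial q^*}\frac{\partial g^*}{\partial p^*}. \tag{7.19}
\]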
7.4. Second class constraints and Dirac bracket
Constraints (7.10), for which
\[
\det\{\varphi^a, \varphi^b\} \neq 0,
\]

are called the second class constraints. In this case m is necessarily even, m =
2k, and the submanifold M0 , determined by equations (7.10) is a symplectic
manifold with a symplectic form ω0 = ω|M0 . In this case Poisson bracket { , }0
corresponding to ω0 is obtained by the following construction. Let Cab be the
inverse matrix to ({ϕa , ϕb }), and let { , } be a Poisson bracket on M associated
with the symplectic form ω.
Definition. The Dirac bracket $\{\,,\}_{DB}$ on M is given by the following formula
\[
\{f, g\}_{DB} = \{f, g\} - \sum_{a,b=1}^{2k} \{f, \varphi^a\}\,C_{ab}\,\{\varphi^b, g\}. \tag{7.20}
\]
It follows from this definition that for all f ∈ A,
\[
\{f, \varphi^a\}_{DB} = 0, \quad a = 1, \dots, 2k. \tag{7.21}
\]
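Formula (7.20) and property (7.21) can be illustrated on a small example. The sketch below makes several assumptions that are not in the notes: the phase space is $\mathbb{R}^4$ with coordinates $(p_1, p_2, q^1, q^2)$, the bracket is $\{f, g\} = \partial_p f\,\partial_q g - \partial_q f\,\partial_p g$, and the second class constraints are those of a particle on the unit circle, $\varphi^1 = |q|^2 - 1$ and $\varphi^2 = p \cdot q$. Derivatives are taken by central differences in exact rational arithmetic, which is exact for the quadratic functions used here; all helper names are hypothetical.

```python
from fractions import Fraction

def grad(f, x):
    """Gradient by central differences with step 1 (exact for polynomials of degree <= 2)."""
    out = []
    for i in range(len(x)):
        up, dn = list(x), list(x)
        up[i] += 1
        dn[i] -= 1
        out.append((f(up) - f(dn)) / 2)
    return out

def poisson(f, g, x, n=2):
    """{f,g} = sum_k (df/dp_k dg/dq_k - df/dq_k dg/dp_k), with x = (p_1, p_2, q_1, q_2)."""
    df, dg = grad(f, x), grad(g, x)
    return sum(df[k] * dg[n + k] - df[n + k] * dg[k] for k in range(n))

# Assumed second class constraints: particle on the unit circle.
PHI = [lambda x: x[2] ** 2 + x[3] ** 2 - 1,     # phi^1 = |q|^2 - 1
       lambda x: x[0] * x[2] + x[1] * x[3]]     # phi^2 = p . q

def dirac(f, g, x):
    """Dirac bracket (7.20) for the two constraints above, evaluated at the point x."""
    G = [[poisson(a, b, x) for b in PHI] for a in PHI]          # matrix {phi^a, phi^b}
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    C = [[G[1][1] / det, -G[0][1] / det],                        # inverse of the 2 x 2 matrix G
         [-G[1][0] / det, G[0][0] / det]]
    corr = sum(poisson(f, PHI[a], x) * C[a][b] * poisson(PHI[b], g, x)
               for a in range(2) for b in range(2))
    return poisson(f, g, x) - corr

observables = [lambda x: x[0],
               lambda x: x[3],
               lambda x: x[0] * x[3] - x[1] * x[2] + x[1] ** 2]
x0 = [Fraction(2), Fraction(-1, 3), Fraction(3, 5), Fraction(4, 5)]

residuals = [dirac(f, phi, x0) for f in observables for phi in PHI]
print(residuals)
```

All entries of `residuals` are Dirac brackets of an observable with a constraint, so by (7.21) they should vanish identically, also at points off the constraint surface.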

Lemma 7.5. Dirac bracket is a degenerate Poisson bracket on M whose


center consists of the functions F (ϕ1 , . . . , ϕ2k ), where F : R2k → R. More-
over, Dirac bracket restricts to M0 as a non-degenerate Poisson bracket that
corresponds to the symplectic form ω0 .
It follows from the transversality condition (7.17) that the first class constraints $\varphi^a$ and additional constraints $\chi_a$ can be combined into the second class constraints $\varphi^1, \dots, \varphi^m, \chi_1, \dots, \chi_m$. Lemma 7.5 implies
Corollary 7.1. Poisson bracket on the reduced phase space M∗ for the
first class constraints coincides with the Dirac bracket for the associated set of
the second class constraints.

Problem 7.1. Prove that formula (7.1) gives a well defined function L on T M.
Problem 7.2. Prove Lemma 7.2.
Problem 7.3. Prove that the symplectic quotient construction (see Problem 6.4
in Lecture 6) in case p = 0 is a particular case of the Dirac formalism, where ϕa are
the Hamiltonian functions of the Hamiltonian vector fields Xξa associated with a basis
ξ a of the Lie algebra g.
Problem 7.4. Prove (7.19) by computing Poisson bracket {f, g} on M in coor-
dinates η = (pa , p∗ , ϕa , q ∗ ).
Problem 7.5. Prove Lemma 7.5.
Problem 7.6. Prove Corollary 7.1.
Notes and references

In addition to references in §3 in Chapter 1 in the book

• Leon A. Takhtajan, Quantum Mechanics for Mathematicians, GSM


95, Amer. Math. Soc., Providence, RI, 2008,

there are the following monographs and lecture notes

• Jerrold E. Marsden and Tudor S. Ratiu, Introduction to Mechanics


and Symmetry, 2nd Edition, Springer, 2002.

• Juan-Pablo Ortega and Tudor S. Ratiu, Momentum Maps and Hamil-


tonian Reduction, Birkhäuser, 2004.

• Alan Weinstein, Lectures on symplectic manifolds, Amer. Math. Soc., Providence, RI, 1977.

The paper

• J.-M. Lévy-Leblond, Conservation Laws for Gauge-Variant Lagrangians


in Classical Mechanics, Am. J. Phys. 39 (1971), 502-506,

discusses the generalized Noether theorem and Laplace-Runge-Lenz vector. More


details on Kepler problem can be found in Milnor’s paper

• John Milnor, On the geometry of Kepler problem, The Amer. Math.


Monthly, 90 (1983), 353-365

and in the book

• Victor Guillemin and Shlomo Sternberg, Variations on a Theme by Kepler, Amer. Math. Soc., Providence, RI, 1990.

Symplectic quotients (Marsden-Weinstein quotients) construction was intro-


duced in the paper

• J. E. Marsden and A. Weinstein, Reduction of symplectic manifolds


with symmetry, Reports on Mathematical Physics, 5(1) (1974), 121–
130.

Generalized Hamiltonian dynamics for singular Lagrangians was developed by Dirac in the classic paper


• P.A.M. Dirac, Generalized Hamiltonian Dynamics, Proc. R. Soc. Lond.


A 246 (1958) 326-332.
Our exposition follows Faddeev’s paper
• L.D. Faddeev, The Feynman integral for singular Lagrangians, Theo-
ret. and Math. Phys., 1:1 (1969), 1–13.
More details on Hamiltonian systems with constraints can be found in the
physics book
• Marc Henneaux and Claudio Teitelboim, Quantization of Gauge Sys-
tems, Princeton University Press, Princeton, NJ, 1992.
Part 2

Classical gauge theories


LECTURE 8

Maxwell equations

8.1. Physics formulation


The electromagnetic force is a fundamental force responsible for the interaction of electrically charged particles. Particles with positions $\mathbf{r}_a \in \mathbb{R}^3$, $a = 1, \dots, N$, may carry electric charges $e_a$ with the density function
\[
\rho(\mathbf{r}) = \sum_{a=1}^{N} e_a\,\delta(\mathbf{r} - \mathbf{r}_a).
\]

In general one considers the charge density — a signed σ-additive measure,


which is absolutely continuous with respect to the standard Lebesgue measure
on R3 , i.e., a signed measure ρ(r)d3 r. Moving charges produce electric current.
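A single charge $e_0$ at a moving point $\mathbf{r}_0(t)$ produces a current
\[
\mathbf{j}(\mathbf{r}, t) = e_0\,\mathbf{v}(t)\,\delta(\mathbf{r} - \mathbf{r}_0(t)), \quad\text{where}\quad \mathbf{v}(t) = \frac{d\mathbf{r}_0(t)}{dt}.
\]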
In general, the current density is

j(r, t) = ρ(r, t)v(r, t),

where v(r, t) is a charge velocity at point r ∈ R3 at time t.


An electric field $\mathbf{E}(\mathbf{r}, t)$, where $\mathbf{r} \in \mathbb{R}^3$, is generated by electric charges and by a time-varying magnetic field $\mathbf{B}(\mathbf{r}, t)$, which is produced by moving electric charges. They satisfy Maxwell equations, which summarize the basic laws of electromagnetism. In free space these equations have the following beautiful form¹
\[
\nabla \cdot \mathbf{E} = \frac{1}{\varepsilon_0}\,\rho \quad\text{(Gauss law)} \tag{8.1}
\]
(the electric flux leaving a volume is proportional to the charge inside);
\[
\nabla \cdot \mathbf{B} = 0 \quad\text{(Gauss law for magnetism)} \tag{8.2}
\]
¹ We are using standard notations for the divergence and curl from multivariable calculus.


(there are no magnetic charges, the total magnetic flux through a closed surface is zero);
\[
\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \quad\text{(Faraday's induction law)} \tag{8.3}
\]
(the voltage induced in a closed circuit is proportional to the rate of change of the magnetic flux it encloses);
\[
\nabla \times \mathbf{B} = \mu_0\,\mathbf{j} + \mu_0 \varepsilon_0\,\frac{\partial \mathbf{E}}{\partial t} \quad\text{(Ampère's circuital law)} \tag{8.4}
\]
(the magnetic field induced around a closed loop is proportional to the electric current plus displacement current (rate of change of electric field) it encloses).
Here the constant $\varepsilon_0$ is called the permittivity of free space and the constant $\mu_0$ is called the permeability of free space, or magnetic constant. They satisfy
\[
\mu_0 \varepsilon_0 = \frac{1}{c^2},
\]
where $c$ is the speed of light in free space.² Maxwell equations imply all laws of electromagnetism: Coulomb law, Biot-Savart-Laplace law, etc.
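The relation between the two constants and the speed of light is easy to check numerically. The snippet below is only an arithmetic illustration; it uses the rounded SI values quoted in the footnote of this section, so the result agrees with the speed of light only to about three significant digits.

```python
import math

EPS0 = 8.85e-12            # permittivity of free space, C^2 N^-1 m^-2 (rounded value from the footnote)
MU0 = 4 * math.pi * 1e-7   # permeability of free space, N A^-2

c = 1 / math.sqrt(MU0 * EPS0)   # from mu_0 * eps_0 = 1 / c^2
print(f"c = {c:.4e} m/s")
```

The printed value should be close to $3 \times 10^8$ m/s.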
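It follows from equation (8.2) that there is a vector-valued function $\mathbf{A}(\mathbf{r}, t)$, called the vector potential, $\mathbf{A} = (A_x, A_y, A_z)$, such that
\[
\mathbf{B} = \nabla \times \mathbf{A}. \tag{8.5}
\]
Plugging (8.5) into (8.3) we get
\[
\nabla \times \left(\mathbf{E} + \frac{\partial \mathbf{A}}{\partial t}\right) = 0,
\]
so that there is a function $\varphi(\mathbf{r}, t)$, called the scalar potential, such that
\[
\mathbf{E} = -\nabla \varphi - \frac{\partial \mathbf{A}}{\partial t}. \tag{8.6}
\]
Formulas (8.5) and (8.6) solve the first pair of Maxwell equations, that is, equations (8.2)–(8.3).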

8.2. Using differential forms


One can rewrite (8.5)–(8.6) as a single equation by introducing the following four-dimensional notations³. Put $x^0 = ct$, $x^1 = x$, $x^2 = y$, $x^3 = z$ and consider four-vectors $x \in \mathbb{R}^4$, $x = (x^\mu)$, where $\mu = 0, 1, 2, 3$. Put⁴
\[
A = A_\mu\,dx^\mu,
\]
² In the SI system of units $\varepsilon_0 = 8.85 \times 10^{-12}\ \mathrm{C^2\,N^{-1}\,m^{-2}}$, where C = Coulomb and N = Newton, and $\mu_0 = 4\pi \times 10^{-7}\ \mathrm{N\,A^{-2}}$, A = Ampère. In the Gaussian system of units (a part of the CGS system of units based on centimetre-gram-second) $\varepsilon_0 = \dfrac{1}{4\pi c}$, $\mu_0 = \dfrac{4\pi}{c}$ and $\mathbf{E}_{CGS} = c^{-1}\mathbf{E}_{SI}$.
³ No reference to the special relativity yet!
⁴ Here and in what follows we always use summation over repeated indices.

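where $A_0 = \tfrac{1}{c}\varphi$, $A_1 = -A_x$, $A_2 = -A_y$, $A_3 = -A_z$, and consider the 2-form $F = dA$. Explicitly,
\[
F = \frac{1}{2} F_{\mu\nu}\,dx^\mu \wedge dx^\nu, \quad\text{where}\quad F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \tag{8.7}
\]
and $\partial_\mu = \dfrac{\partial}{\partial x^\mu}$, $\mu = 0, 1, 2, 3$.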
It follows from (8.5)–(8.6) that the skew-symmetric 2-tensor Fµν is repre-
sented by the following 4 × 4 matrix
1 1 1
 
0 c Ex c Ey c Ez
− 1 Ex 0 −Bz By 
(8.8) F = c ,
− 1 Ey Bz 0 −Bx 
c
− 1c Ez −By Bx 0
or
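$$\begin{aligned}
F = {} & \frac{1}{c}E_x\, dx^0 \wedge dx^1 + \frac{1}{c}E_y\, dx^0 \wedge dx^2 + \frac{1}{c}E_z\, dx^0 \wedge dx^3 \\
& - B_x\, dx^2 \wedge dx^3 - B_y\, dx^3 \wedge dx^1 - B_z\, dx^1 \wedge dx^2 .
\end{aligned}$$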
The 2-tensor Fµν is called the electromagnetic field tensor, or the field strength
tensor or Faraday tensor.
Equation F = dA gives expressions (8.5)–(8.6) for electric and magnetic
fields in terms of the four-vector potential Aµ . The first pair of Maxwell equa-
tions — equations (8.2)–(8.3) — directly follow from this representation, and
can be written succinctly as
dF = 0,
or, equivalently,
(8.9) ∂ν Fλµ + ∂λ Fµν + ∂µ Fνλ = 0, λ, µ, ν = 0, 1, 2, 3.
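Indeed, by an elementary computation we have
$$\begin{aligned}
dF = {} & -\nabla \cdot B\; dx^1 \wedge dx^2 \wedge dx^3 - \left( \partial_0 B_x + \frac{1}{c}(\nabla \times E)_x \right) dx^0 \wedge dx^2 \wedge dx^3 \\
& - \left( \partial_0 B_y + \frac{1}{c}(\nabla \times E)_y \right) dx^0 \wedge dx^3 \wedge dx^1 - \left( \partial_0 B_z + \frac{1}{c}(\nabla \times E)_z \right) dx^0 \wedge dx^1 \wedge dx^2 .
\end{aligned}$$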
To rewrite the second pair of Maxwell equations, equations (8.1) and (8.4),
we observe that in the absence of the sources these equations can be obtained
from the first pair (8.2)–(8.3) by the electro-magnetic duality
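$$\frac{1}{c}E \mapsto -B \quad \text{and} \quad B \mapsto \frac{1}{c}E.$$
Under this transformation $F \mapsto *F$, the dual field strength 2-form, given by
$$\begin{aligned}
*F = {} & -B_x\, dx^0 \wedge dx^1 - B_y\, dx^0 \wedge dx^2 - B_z\, dx^0 \wedge dx^3 \\
& - \frac{1}{c}E_x\, dx^2 \wedge dx^3 - \frac{1}{c}E_y\, dx^3 \wedge dx^1 - \frac{1}{c}E_z\, dx^1 \wedge dx^2 ,
\end{aligned}$$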

so that equations (8.1) and (8.4) can be written as a single equation


d ∗ F = 0.
What is the geometric meaning of the dual 2-form ∗F ? It is easy to check
that it is a Hodge dual to the 2-form F with respect to the Minkowski metric
ds2 = ηµν dxµ dxν
on R4 , given by the diagonal 4 × 4 matrix η = diag(1, −1, −1, −1)! In other
words, Minkowski metric is a pseudo-Riemannian metric on R4 , given explicitly
by
(8.10) ds2 = (dx0 )2 − (dx1 )2 − (dx2 )2 − (dx3 )2 .
Indeed, let V be an oriented n-dimensional real vector space with a non-degenerate
inner product $\langle\ ,\ \rangle$ (not necessarily positive-definite). The inner
product on the vector spaces $\Lambda^k V$ is defined by
$$\langle u_1 \wedge \cdots \wedge u_k, v_1 \wedge \cdots \wedge v_k \rangle = \det(\langle u_i, v_j \rangle).$$
Let $\omega \in \Lambda^n V$ be the unit vector associated with the orientation of V, that is, the
element of $\Lambda^n V$ uniquely characterized by the property that its image is 1
under the isomorphism $\Lambda^n V \simeq \mathbb{R}$. Then for $v \in \Lambda^k V$ its Hodge dual is the vector
$*v \in \Lambda^{n-k} V$ satisfying
$$u \wedge *v = \langle u, v \rangle\, \omega \quad \text{for all } u \in \Lambda^k V.$$
Applying this definition to the vector space V with the basis $dx^0, dx^1, dx^2, dx^3$
and the inner product $\langle dx^\mu, dx^\nu \rangle = \eta^{\mu\nu}$, we get
∗(aµν dxµ ∧ dxν ) = bµν dxµ ∧ dxν ,
where
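$$b_{\mu\nu} = \frac{1}{2}\, \varepsilon_{\alpha\beta\mu\nu}\, \eta^{\alpha\lambda} \eta^{\beta\rho} a_{\lambda\rho}$$
and $\varepsilon_{\alpha\beta\gamma\delta}$ is the totally antisymmetric tensor with $\varepsilon_{0123} = 1$. We have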
∗(dx0 ∧ dx1 ) = −dx2 ∧ dx3 ,
∗(dx0 ∧ dx2 ) = dx1 ∧ dx3 ,
∗(dx0 ∧ dx3 ) = −dx1 ∧ dx2 ,
∗(dx1 ∧ dx2 ) = dx0 ∧ dx3 ,
∗(dx3 ∧ dx1 ) = dx0 ∧ dx2 ,
∗(dx2 ∧ dx3 ) = dx0 ∧ dx1 ,

and the formula for ∗F follows from the definition of the 2-form F .
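The table of Hodge duals above can be reproduced mechanically from the defining relation $u \wedge *v = \langle u, v \rangle \omega$. The sketch below does this for basis forms only, assuming the orientation $\omega = dx^0 \wedge dx^1 \wedge dx^2 \wedge dx^3$ and the diagonal metric $\eta = \mathrm{diag}(1,-1,-1,-1)$; the helper names `perm_sign`, `hodge_basis` and `label` are illustrative and not part of the notes.

```python
ETA = (1, -1, -1, -1)  # diagonal entries of the Minkowski metric (8.10)


def perm_sign(seq):
    """Sign of the permutation that sorts a sequence of distinct integers."""
    sign = 1
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            if seq[i] > seq[j]:
                sign = -sign
    return sign


def hodge_basis(idx):
    """Return (sign, comp) such that *(dx^idx) = sign * dx^comp.

    For a basis k-form v the relation v ^ *v = <v, v> omega fixes the sign:
    <v, v> is the product of the diagonal metric entries over idx, and
    dx^idx ^ dx^comp equals perm_sign(idx + comp) * omega.
    """
    idx = tuple(idx)
    comp = tuple(i for i in range(4) if i not in idx)
    norm = 1
    for i in idx:
        norm *= ETA[i]
    return norm * perm_sign(idx + comp), comp


def label(indices):
    """Human-readable wedge product of coordinate differentials."""
    return " ^ ".join(f"dx{i}" for i in indices)


for idx in [(0, 1), (0, 2), (0, 3), (1, 2), (3, 1), (2, 3)]:
    sign, comp = hodge_basis(idx)
    print(f"*({label(idx)}) = {'-' if sign < 0 else ''}{label(comp)}")
```

The same function can be applied to 1-forms and 3-forms, which gives a way to cross-check the sign tables used in the next section.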
To summarize, Maxwell equations in an empty space (without sources) can
be written succinctly as
(8.11) dF = 0 and d ∗ F = 0.
Remark. The signs in Maxwell equations, reflected in electro-magnetic duality,
force the use of a pseudo-Riemannian metric (8.10). This may be considered
an alternative discovery of the Minkowski spacetime, without
reference to special relativity.

8.3. Maxwell’s equations with sources


We have

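$$\begin{aligned}
d * F = {} & \left( (\nabla \times B)_x - \frac{1}{c}\partial_0 E_x \right) dx^0 \wedge dx^2 \wedge dx^3 - \left( (\nabla \times B)_y - \frac{1}{c}\partial_0 E_y \right) dx^0 \wedge dx^1 \wedge dx^3 \\
& + \left( (\nabla \times B)_z - \frac{1}{c}\partial_0 E_z \right) dx^0 \wedge dx^1 \wedge dx^2 - \frac{1}{c} \nabla \cdot E\; dx^1 \wedge dx^2 \wedge dx^3 .
\end{aligned}$$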
Define the four-current
J = Jµ dxµ ,
where J0 = −cρ and J1 = jx , J2 = jy , J3 = jz . Using

∗(dx0 ∧ dx2 ∧ dx3 ) = dx1 ,


∗(dx0 ∧ dx1 ∧ dx3 ) = −dx2 ,
∗(dx0 ∧ dx1 ∧ dx2 ) = dx3 ,
∗(dx1 ∧ dx2 ∧ dx3 ) = dx0 ,

we can succinctly rewrite equations (8.1) and (8.4) as

∗d ∗ F = µ0 J.

Equivalently, since $*^2 = -(-1)^k$ on the space of k-forms on $\mathbb{R}^4$, we have

d ∗ F = µ0 ∗ J,

so that d ∗ J = 0, which is a continuity equation. Using that

∗dx0 = dx1 ∧ dx2 ∧ dx3 ,


∗dx1 = dx0 ∧ dx2 ∧ dx3 ,
∗dx2 = −dx0 ∧ dx1 ∧ dx3 ,
∗dx3 = dx0 ∧ dx1 ∧ dx2 ,

we can rewrite it as follows


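$$\frac{\partial J^\mu}{\partial x^\mu} = 0, \quad \text{where} \quad J^\mu = \eta^{\mu\nu} J_\nu .$$
Explicitly, the continuity equation has the form
$$\frac{\partial \rho}{\partial t} + \nabla \cdot j = 0.$$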

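Remark. If J has compact support or is of rapid decay, the continuity
equation leads to total charge conservation. Namely, let
$$Q(t) = -\frac{1}{c} \int_{\{ct\} \times \mathbb{R}^3} *J = \int_{\mathbb{R}^3} \rho(t, r)\, d^3 r$$
be the total charge at time t. Then it follows from Stokes's theorem for
$M = [ct_1, ct_2] \times \mathbb{R}^3$ that
$$0 = \int_M d * J = \int_{\partial M} *J = -c\,\bigl( Q(t_2) - Q(t_1) \bigr).$$
Also, for any compact 3-manifold with boundary $V \subset \mathbb{R}^3$ we have
$$\frac{\partial}{\partial t} \int_V \rho(t, r)\, d^3 r = - \int_{\partial V} j \cdot dS.$$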

It is also convenient to introduce the tensor

F µν = η µα η νβ Fαβ ,

which has the same form as $F_{\mu\nu}$, with E replaced by −E. It is related
to the dual field strength tensor by
$$(*F)_{\mu\nu} = \frac{1}{2}\, \varepsilon_{\mu\nu\alpha\beta} F^{\alpha\beta} .$$
Then the second pair of Maxwell equations can be written in the following form

(8.12) ∂µ F µν = J ν , ν = 0, 1, 2, 3,

which is often used by physicists.


To summarize, the Maxwell’s equations on R4 have the following form

(8.13) dF = 0 and ∗ d ∗ F = J,

where the 4-current J satisfies the continuity equation. By Poincaré lemma, the
first equation has a solution

F = dA where A = Aµ dxµ .

Upon the identification $A_0 = \frac{1}{c}\varphi$ and $(A_1, A_2, A_3) = -A$ we get expressions
(8.5) and (8.6) for the magnetic and electric fields in terms of the vector and
scalar potentials A and ϕ. Maxwell's equations are invariant under the gauge
transformations
$$A \mapsto A + df,$$
where f is a smooth real-valued function on R4 .
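The invariance can be checked in one line on the level of forms, and unpacked in components. The following is a sketch in the notation of this lecture; it assumes the identification $A_0 = \frac{1}{c}\varphi$, $(A_1, A_2, A_3) = -\mathbf{A}$ stated above, with bold $\mathbf{A}$ used here only to distinguish the three-dimensional vector potential from the 1-form A.

```latex
\begin{aligned}
F' &= d(A + df) = dA + d^2 f = F, \\
\varphi' &= \varphi + \frac{\partial f}{\partial t}, \qquad \mathbf{A}' = \mathbf{A} - \nabla f, \\
E' &= -\nabla \varphi' - \frac{\partial \mathbf{A}'}{\partial t}
    = -\nabla \varphi - \nabla \frac{\partial f}{\partial t} - \frac{\partial \mathbf{A}}{\partial t} + \frac{\partial}{\partial t} \nabla f = E, \\
B' &= \nabla \times (\mathbf{A} - \nabla f) = \nabla \times \mathbf{A} = B .
\end{aligned}
```

So the fields E and B, and hence both pairs of Maxwell equations, do not see the change of potentials.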

8.4. The principle of least action


The Maxwell equations (8.13) can be obtained from the principle of least
action.
Namely, let A = Ω1 (R4 ) be a vector space of smooth (C ∞ ) real-valued 1-
forms A = Aµ dxµ on R4 such that corresponding 2-forms F = dA have compact
support (or decay sufficiently fast as |x| → ∞). Let J be a smooth real-valued
1-form on R4 with compact support (or decaying sufficiently fast as |x| → ∞)
satisfying the continuity equation. Define the action functional S : A → R by
$$S(A) = -\frac{1}{4\pi} \int_{\mathbb{R}^4} (F \wedge *F + 2 A \wedge *J), \tag{8.14}$$
where F = dA.
Proposition 8.1. The critical points of the action functional S(A) are given
by the Maxwell equations.
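Proof. For a given $a \in \mathcal{A}$ put
$$\delta S(A) = \left. \frac{d}{d\varepsilon} \right|_{\varepsilon = 0} S(A + \varepsilon a).$$
Using the symmetry property of the Hodge star operator
$$\alpha \wedge *\beta = \beta \wedge *\alpha \tag{8.15}$$
and the Stokes theorem, we have
$$\begin{aligned}
\delta S(A) &= -\frac{1}{2\pi} \int_{\mathbb{R}^4} (da \wedge *F + a \wedge *J) \\
&= -\frac{1}{2\pi} \int_{\mathbb{R}^4} (a \wedge d * F + a \wedge *J) - \frac{1}{2\pi} \int_{\mathbb{R}^4} d(a \wedge *F) \\
&= -\frac{1}{2\pi} \int_{\mathbb{R}^4} a \wedge (d * F + *J).
\end{aligned}$$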

Whence δS(A) = 0 for all a ∈ A yields


d ∗ F = − ∗ J. 
Remark. As in Sect. 1.2 in Lecture 1, one can consider a vector space
$\mathcal{A}^{A^1, t_1}_{A^0, t_0}$ of real-valued 1-forms $A = A_\mu dx^\mu$ on $[ct_0, ct_1] \times \mathbb{R}^3$ satisfying
$A_\mu(ct_0, r) = A^0_\mu(r)$ and $A_\mu(ct_1, r) = A^1_\mu(r)$. It follows from the above computation, using
variations with fixed ends, that critical points of the functional
$$-\frac{1}{4\pi} \int_{t_0}^{t_1} \int_{\mathbb{R}^3} (F \wedge *F + 2 A \wedge *J)\, d^3 r\, dt$$
are given by the Maxwell equations.

Remark. In physics notation,


Z  
1
S(A) = L (A) − Aµ J µ d4 x,

R4

where
$$\mathcal{L}(A) = -\frac{1}{16\pi} F_{\mu\nu} F^{\mu\nu} = \frac{c}{8\pi} \left( \frac{1}{c^2} E^2 - B^2 \right) \tag{8.16}$$
is the Lagrangian function of the free electromagnetic field.


LECTURE 9

Electrodynamics as U(1) gauge theory

Electrodynamics — theory of electromagnetism, described by Maxwell equa-


tions — is a gauge theory with the symmetry group G = U(1). To explain this
fundamental fact, and to formulate the gauge theory with arbitrary compact
symmetry group G — the celebrated Yang-Mills theory — one needs to use dif-
ferential geometry of principal and vector bundles. It is succinctly summarized
below.

9.1. Bundles, connections and curvature


Let G be a Lie group and M be a smooth manifold. A principal G-bundle
over M is a fiber bundle π : P → M with a smooth right G-action
$$P \times G \ni (p, g) \mapsto p \cdot g \in P,$$
which preserves the fibers and is free and transitive on them. By definition, there is an
open covering $M = \bigcup_{\alpha \in A} U_\alpha$ such that over each $U_\alpha$ there is a local trivialization,
a diffeomorphism
$$\varphi_\alpha : \pi^{-1}(U_\alpha) \to U_\alpha \times G$$
such that

$$\pi(\varphi_\alpha^{-1}(x, g)) = x \quad \text{and} \quad \varphi_\alpha^{-1}(x, g) = \varphi_\alpha^{-1}(x, e) \cdot g \quad \text{for all } x \in U_\alpha,\ g \in G,$$

where e is identity in G. Putting

$$\lambda_{\alpha\beta} = \varphi_\alpha \circ \varphi_\beta^{-1} : U_{\alpha\beta} \times G \to U_{\alpha\beta} \times G,$$

where Uαβ = Uα ∩ Uβ , introduces transition functions tαβ : Uαβ → G by

λαβ (x, g) = (x, tαβ (x)g).

The transition functions satisfy

$$t_{\alpha\beta} = t_{\beta\alpha}^{-1} \quad \text{on } U_{\alpha\beta} \tag{9.1}$$

and

(9.2) tαβ tβγ tγα = e on Uαβγ = Uα ∩ Uβ ∩ Uγ .


Conversely, a principal G-bundle P can be defined by transition functions,
maps $t_{\alpha\beta} : U_{\alpha\beta} \to G$ satisfying (9.1)–(9.2), by
$$P = \bigsqcup_{\alpha \in A} (U_\alpha \times G) / \sim, \tag{9.3}$$
where $(x, g) \sim (y, h)$ if and only if $x = y \in U_{\alpha\beta}$ and $g = t_{\alpha\beta}(x) h$. Transition
functions $t_{\alpha\beta}$ and $f_\alpha^{-1} t_{\alpha\beta} f_\beta$, where $f_\alpha : U_\alpha \to G$ are arbitrary smooth functions,
define the same bundle P. Sections of P over U ⊆ M are the maps S : U → P
satisfying $\pi \circ S = \mathrm{id}|_U$. They are determined by the maps $S_\alpha : U_\alpha \to G$
satisfying
$$S_\beta = S_\alpha t_{\alpha\beta} \quad \text{on } U_{\alpha\beta}. \tag{9.4}$$

The gauge group $\mathcal{G}(P)$ of a principal G-bundle P consists of bundle isomorphisms
f : P → P that commute with the right action. Such f can be uniquely
written as $f(p) = p \cdot f_*(p)$, where the function $f_* : P \to G$ satisfies
$$f_*(p \cdot g) = g^{-1} f_*(p)\, g \quad \text{for all } p \in P,\ g \in G.$$
Elements of the gauge group $\mathcal{G}(P)$ are collections $\{f_\alpha\}_{\alpha \in A}$ of arbitrary smooth
functions $f_\alpha : U_\alpha \to G$ that map sections to sections by the formula $S' = f \circ S$.
Explicitly,
$$S'_\alpha = S_\alpha f_\alpha : U_\alpha \to G,$$
and $S'_\alpha$ satisfy (9.4) with the transition functions
$$t'_{\alpha\beta} = f_\alpha^{-1} t_{\alpha\beta} f_\beta .$$
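The compatibility of the transformed local sections with the transformed transition functions is a one-line computation. The sketch below uses only relation (9.4) and the definition $S'_\alpha = S_\alpha f_\alpha$; primes denote the gauge-transformed objects.

```latex
S'_\beta = S_\beta f_\beta
         = S_\alpha\, t_{\alpha\beta}\, f_\beta
         = (S_\alpha f_\alpha)\,(f_\alpha^{-1} t_{\alpha\beta} f_\beta)
         = S'_\alpha\, t'_{\alpha\beta}
         \quad \text{on } U_{\alpha\beta} .
```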

With every representation R : G → GL(V) of a Lie group G in a complex
vector space V there is a vector bundle E → M of rank n = dim V, associated
with a principal G-bundle P → M. It has fiber V and the structure group G,
and is defined as a quotient
$$E = (P \times V)/G, \tag{9.5}$$

where the right G-action is given by

(p, v) · g = (p · g, R(g −1 )v), p ∈ P, v ∈ V.

Equivalently, a vector bundle E → M can be defined by the transition


functions gαβ . Sections of E over U ⊆ M are the functions sα : Uα → V ,
satisfying

(9.6) sα = gαβ sβ on Uαβ .

If a vector bundle E → M is associated with a principal G-bundle P → M


through a representation R : G → GL(V ), then gαβ = R(tαβ ).
Denote by $\Omega^0(E)$ the sheaf of smooth sections of a vector bundle E and by
$\Omega^1(E)$ the sheaf of 1-forms on M with values in E, that is, the sheaf of smooth sections
of $E \otimes T^*M \to M$. A connection¹ on E is a linear map $\nabla : \Omega^0(E) \to \Omega^1(E)$
satisfying the Leibniz rule
$$\nabla(f \zeta) = df \otimes \zeta + f (\nabla \zeta) \tag{9.7}$$
for all sections $\zeta \in \Omega^0(E)(U)$ and functions $f \in C^\infty(U)$, U ⊂ M. Connections
can be thought of as a way of differentiating sections of E.
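In terms of transition functions $g_{\alpha\beta}$ of the bundle E, a connection ∇ is a
collection $\{d + A_\alpha\}_{\alpha \in A}$, where d is the de Rham differential and $A_\alpha$ are End V-valued
1-forms on $U_\alpha$, satisfying the transformation law
$$A_\alpha = g_{\alpha\beta} A_\beta\, g_{\alpha\beta}^{-1} - dg_{\alpha\beta}\, g_{\alpha\beta}^{-1} \quad \text{on } U_{\alpha\beta}. \tag{9.8}$$
Indeed, if $s_\alpha$ satisfy (9.6), then the V-valued 1-forms on $U_\alpha$
$$\nabla s_\alpha = (d + A_\alpha) s_\alpha \tag{9.9}$$
satisfy $\nabla s_\alpha = g_{\alpha\beta} \nabla s_\beta$ on $U_{\alpha\beta}$ if and only if
$$(d + A_\alpha)(g_{\alpha\beta} s_\beta) = dg_{\alpha\beta}\, s_\beta + g_{\alpha\beta}\, ds_\beta + A_\alpha g_{\alpha\beta} s_\beta$$
and
$$g_{\alpha\beta} (d + A_\beta) s_\beta = g_{\alpha\beta}\, ds_\beta + g_{\alpha\beta} A_\beta s_\beta$$
are equal for all $s_\beta|_{U_{\alpha\beta}}$, which is equation (9.8). We will sometimes use
the notation $\nabla_A = d + A$.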
In local coordinates x1 , . . . , xn on a chart U ⊆ M ,

∇s = ∇µ (s)dxµ , where ∇µ = ∂µ + Aµ and A = Aµ dxµ .

Operators ∇µ are called covariant derivatives.


Remark. In the physics literature the notation ∇µ = ∂µ + ieAµ is customarily
used, where e is the elementary charge, i.e. the magnitude of the electron charge
−e. In quantum field theory the electromagnetic field describes photons, the
exchange particles 'of light' for the electromagnetic interaction.
If a vector bundle E → M is associated with a principal G-bundle P → M
through a representation R : G → GL(V ), denote by ρ = de R the corresponding
infinitesimal representation — a representation of a Lie algebra g in End V .
Connections ∇ on E with the symmetry group2 G have the property that Aα
are 1-forms on Uα with values in ρ(g).
Connections ∇ form an affine space $\mathcal{A}(E)$ over the complex vector space
$\Omega^1(M, \mathrm{End}\,E)$ of End E-valued 1-forms on M. Here $\mathrm{End}\,E = E \otimes E^*$, where
$E^*$ is the dual bundle to E, is the endomorphism bundle of E with the transition
functions $g_{\alpha\beta} \otimes g^*_{\alpha\beta}$, where $g^*_{\alpha\beta} = (g^t_{\alpha\beta})^{-1}$. The gauge group $\mathcal{G}(E)$ consists of
maps $\phi = \{\phi_\alpha : U_\alpha \to \mathrm{End}\,V\}_{\alpha \in A}$, and it acts on $\mathcal{A}(E)$ by $A^\phi = \{A^\phi_\alpha\}_{\alpha \in A}$,
where
$$A^\phi_\alpha = \phi_\alpha A_\alpha \phi_\alpha^{-1} - d\phi_\alpha\, \phi_\alpha^{-1} \quad \text{on } U_\alpha. \tag{9.10}$$

¹ Here we use a definition that does not use the notion of a connection on a principal
G-bundle.
² Such connections are obtained from connections on a principal G-bundle P.

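A connection $\nabla_A$ on a bundle E determines a connection $\nabla_A^{\mathrm{End}\,E}$ on the bundle
End E,
$$\nabla^{\mathrm{End}\,E} s_\alpha = ds_\alpha + [A_\alpha, s_\alpha], \tag{9.11}$$
where $s_\alpha : U_\alpha \to \mathrm{End}\,V$ satisfy
$$s_\alpha = g_{\alpha\beta}\, s_\beta\, g_{\alpha\beta}^{-1} \quad \text{on } U_{\alpha\beta}.$$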

A linear map ∇ : Ω0 (E) → Ω1 (E) satisfying (9.7), by Leibniz rule extends


to a map Ωk (E) → Ωk+1 (E), which we continue to denote by ∇. Explicitly, it
is determined by
∇(ψ ⊗ ζ) = dψ ⊗ ζ + (−1)k ψ ∧ ∇ζ,
where ζ ∈ Ω0 (E)(U ) and ψ ∈ Ωk (M )(U ). In particular, for the map

∇2 : Ω0 (E) → Ω2 (E)

we obtain

∇2 (f ζ) = ∇(df ⊗ ζ + f ∇ζ)
= −df ∧ ∇ζ + df ∧ ∇ζ + f ∇2 ζ = f ∇2 ζ.

This means that ∇2 : Ω0 (E) → Ω2 (E) is determined by a 2-form F on M with


values in End E — a global section of the bundle Λ2 T ∗ M ⊗ End E — by

∇2 sα = Fα sα on Uα .

In terms of the transition functions,

∇2 sα = (d + Aα )(dsα + Aα sα )
= dAα sα − Aα ∧ dsα + Aα ∧ dsα + Aα ∧ Aα sα
= (dAα + Aα ∧ Aα )sα ,

where Aα ∧Aα is understood as a product in End V together with the usual exte-
rior multiplication. Thus End E-valued 2-form F on M is a collection {Fα }α∈A
of End V -valued 2-forms on Uα ,

(9.12) Fα = dAα + Aα ∧ Aα ,

satisfying
$$F_\alpha = g_{\alpha\beta} F_\beta\, g_{\alpha\beta}^{-1} \quad \text{on } U_{\alpha\beta}. \tag{9.13}$$

Transformation law (9.13) follows from (9.8)–(9.12). We will often use notation

F = F (A) = dA + A ∧ A.

If a vector bundle E → M is associated with a principal G-bundle P over M


through a representation R : G → GL(V ), corresponding 2-forms Fα on Uα
take values in ρ(g), where $\rho = d_e R$. It follows from (9.10) that the action of the
gauge group $\mathcal{G}(E)$ on F is given by $F \mapsto F^\phi$, where
$$F^\phi_\alpha = \phi_\alpha F_\alpha \phi_\alpha^{-1} \quad \text{on } U_\alpha. \tag{9.14}$$

In local coordinates x1 , . . . , xn on a chart U ⊆ M ,

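$$F = \frac{1}{2} F_{\mu\nu}\, dx^\mu \wedge dx^\nu, \quad \text{where} \quad F_{\mu\nu} = [\nabla_\mu, \nabla_\nu] = \frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu} + [A_\mu, A_\nu]$$
and $[A_\mu, A_\nu] = A_\mu A_\nu - A_\nu A_\mu$. It follows from (9.11) that the curvature satisfies
the Bianchi identity,
$$\nabla_A^{\mathrm{End}\,E}(F) = dF + A \wedge F - F \wedge A = 0, \tag{9.15}$$
which we will simply write as $\nabla_A F = 0$. It can also be obtained from the Jacobi
identity
$$[[\nabla_\mu, \nabla_\nu], \nabla_\sigma] + [[\nabla_\nu, \nabla_\sigma], \nabla_\mu] + [[\nabla_\sigma, \nabla_\mu], \nabla_\nu] = 0.$$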
Remark. In general, for $B \in \Omega^k(M, \mathrm{End}\,E)$ we have
$$\nabla_A^{\mathrm{End}\,E}(B) = dB + A \wedge B - (-1)^k B \wedge A.$$
Let $\Phi : \mathrm{End}\,V \to \mathbb{C}$ be a homogeneous polynomial of degree k, invariant under
the adjoint action of GL(V) on End V,
$$\Phi(B) = \Phi(g B g^{-1}) \quad \text{for all } B \in \mathrm{End}\,V \text{ and } g \in GL(V).$$
It follows from (9.13) that
$$\Phi(F_\alpha) = \Phi(F_\beta) \quad \text{on } U_{\alpha\beta},$$
so that $\Phi(F) \in \Omega^{2k}(M)$. The Chern-Weil theory establishes the following facts.

1. The 2k-form Φ(F ) on M is closed,

dΦ(F ) = 0.

2. The cohomology class
$$[\Phi(F)] \in H^{2k}(M)$$
does not depend on the choice of a connection d + A in the vector bundle E.

3. The map
$$\Phi \mapsto [\Phi(F)]$$
is a homomorphism of the commutative algebra of invariant polynomials
on End V into the commutative algebra $H^{\mathrm{even}}(M)$ of cohomology classes
of even degree on M.
The map $\Phi \mapsto [\Phi(F)]$ is called the Weil homomorphism, and the cohomology classes
$[\Phi(F)]$ are called characteristic classes of the bundle E associated with the invariant
polynomial Φ. Let $P^i$ be the elementary invariant polynomials of degree i =
1, . . . , n, defined by
$$\det(B + tI) = \sum_{k=0}^{n} P^{n-k}(B)\, t^k .$$
The forms $c_i(F) = P^i\!\left( \frac{\sqrt{-1}}{2\pi} F \right)$ are called Chern forms, and the corresponding
cohomology classes are called Chern classes. It is a fundamental fact in the theory of
characteristic classes that
$$c_i(E) = \left[ P^i\!\left( \frac{\sqrt{-1}}{2\pi} F \right) \right] \in \check{H}^{2i}(M, \mathbb{Z}), \quad i = 1, \dots, n,$$
where $\check{H}^{2i}(M, \mathbb{Z})$ stands for the Čech cohomology with coefficients in the constant
sheaf $\mathbb{Z}$.
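For n = 2 the elementary invariant polynomials are simply $P^1(B) = \mathrm{tr}\,B$ and $P^2(B) = \det B$ (with $P^0 = 1$). The following sketch checks, on one example with exact rational arithmetic, both the expansion of $\det(B + tI)$ and the invariance under conjugation $B \mapsto g B g^{-1}$. The matrices and helper names are illustrative choices, not taken from the notes.

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def trace2(m):
    """Trace of a 2x2 matrix."""
    return m[0][0] + m[1][1]


def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse2(m):
    """Inverse of an invertible 2x2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def invariant_polynomials(b):
    """Return (P^1(B), P^2(B)) for n = 2: det(B + tI) = t^2 + P^1 t + P^2."""
    return trace2(b), det2(b)


B = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]
g = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(1)]]
conjugated = matmul2(matmul2(g, B), inverse2(g))

p1, p2 = invariant_polynomials(B)
for t in (Fraction(0), Fraction(1), Fraction(-3), Fraction(5, 2)):
    shifted = [[B[0][0] + t, B[0][1]], [B[1][0], B[1][1] + t]]
    assert det2(shifted) == t * t + p1 * t + p2  # expansion of det(B + tI)

print("P^1, P^2 of B:          ", p1, p2)
print("P^1, P^2 of g B g^(-1): ", *invariant_polynomials(conjugated))
```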

9.2. Line bundles and Maxwell equations


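Let L → M be a complex line bundle over an n-dimensional manifold M
associated with a principal U(1)-bundle P over M. Let $\{U_\alpha\}_{\alpha \in A}$ be an open
cover of M and let $g_{\alpha\beta} : U_\alpha \cap U_\beta \to U(1)$ be transition functions for L,
satisfying the cocycle condition
$$g_{\alpha\beta}\, g_{\beta\gamma} = g_{\alpha\gamma} \quad \text{on } U_\alpha \cap U_\beta \cap U_\gamma .$$
A unitary connection ∇, that is, a connection with symmetry group U(1), is given
by
$$\nabla = d + A_\alpha, \tag{9.16}$$
where $A_\alpha \in \Omega^1(U_\alpha)$ are 1-forms on $U_\alpha$ with values in the Lie algebra
$u(1) \simeq \sqrt{-1}\,\mathbb{R}$ of the Lie group U(1), satisfying
$$A_\alpha = A_\beta - g_{\alpha\beta}^{-1}\, dg_{\alpha\beta} \quad \text{on } U_\alpha \cap U_\beta .$$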

Corresponding curvature 2-form F = ∇2 is a global 2-form on M given by

F = dA

and is a closed form, dF = 0.
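That the local expressions $dA_\alpha$ patch together into a global 2-form can be seen directly from the transformation law of $A_\alpha$. A short sketch, using only that $g_{\alpha\beta}$ is a scalar U(1)-valued function, so that $dg_{\alpha\beta} \wedge dg_{\alpha\beta} = 0$ and $A_\alpha \wedge A_\alpha = 0$:

```latex
dA_\alpha = dA_\beta - d\!\left( g_{\alpha\beta}^{-1}\, dg_{\alpha\beta} \right)
          = dA_\beta + g_{\alpha\beta}^{-2}\, dg_{\alpha\beta} \wedge dg_{\alpha\beta}
          = dA_\beta \quad \text{on } U_\alpha \cap U_\beta ,
\qquad
dF = d(dA_\alpha) = 0 .
```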


Suppose that M carries either a Riemannian or a pseudo-Riemannian metric
$ds^2$, and let ∗ be the corresponding Hodge star operator. As in Sect. 8.4 in
Lecture 8, consider the functional
$$S(A) = -\frac{1}{4\pi} \int_M F \wedge *F, \tag{9.17}$$
defined on the affine space $\mathcal{A}$ of unitary connections on the line bundle L. In
case M is non-compact it is assumed that the connection ∇ = d + A is such that
the integral (9.17) with F = dA is convergent. As in Sect. 8.4, the critical points
of the functional S(A) are given by the equations
$$dF = 0 \quad \text{and} \quad d * F = 0. \tag{9.18}$$

In case $M = \mathbb{R}^4$ with the Minkowski metric, and $L = \mathbb{R}^4 \times \mathbb{C}$ is a trivial line
bundle, these equations are Maxwell equations (8.11) in empty space.³
The functional S(A) is invariant under the action of the gauge group $\mathcal{G}(L)$ and
defines a gauge theory with the symmetry group U(1). The corresponding equations
of motion are given by (9.18), and in case when M is a four-manifold with a
metric $ds^2$ of signature (+, −, −, −), they generalize Maxwell equations (8.11) in
empty space to a 'curved' spacetime.

9.3. Self-duality equations


In the Riemannian case equations (9.18) do not have a physical interpretation.
However, in case when M is a compact Riemannian 4-manifold, they have extra
mathematical structure, which will be very important for non-abelian gauge
theories. Namely, in this case the first Chern form of the line bundle L with
connection ∇ is
$$c_1(L, \nabla) = \frac{\sqrt{-1}}{2\pi} F, \qquad c_1(L) = [c_1(L, \nabla)] \in \check{H}^2(M, \mathbb{Z}).$$
Due to the isomorphism
$$U(1) \ni e^{i\theta} \mapsto \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \in SO(2),$$
for a complex U(1)-line bundle L there is a real rank 2 vector bundle over M
with the symmetry group SO(2), denoted here by $L_{\mathbb{R}}$. Its first Pontryagin class
$p_1(L_{\mathbb{R}}) \in \check{H}^4(M, \mathbb{Z})$ is given by
$$p_1(L_{\mathbb{R}}) = -c_2(L_{\mathbb{R}} \otimes \mathbb{C}),$$
where $L_{\mathbb{R}} \otimes \mathbb{C}$ is the complexification of the real bundle $L_{\mathbb{R}}$, a rank 2 complex vector
bundle over M, and $c_2(L_{\mathbb{R}} \otimes \mathbb{C})$ is its second Chern class. It is easy to see
that $L_{\mathbb{R}} \otimes \mathbb{C} \simeq L \oplus \bar{L}$, where $\bar{L}$ is the line bundle with the transition functions
$\bar{g}_{\alpha\beta}$, so that $c_2(L_{\mathbb{R}} \otimes \mathbb{C})$ is represented by the differential form $\frac{1}{4\pi^2} F \wedge F$. The
corresponding first Pontryagin number is
$$p_1 = -\frac{1}{4\pi^2} \int_M F \wedge F \in \mathbb{Z}.$$

³ Note that if $A = A_\mu dx^\mu$ is a real-valued 1-form on $M = \mathbb{R}^4$, used in Sect. 8.2 in Lecture
8 in case L is a trivial line bundle, then in (9.16) we have $\nabla = d + \sqrt{-1}\, A$.
In case when M is a Riemannian manifold with the metric $ds^2$, the
Maxwell's equations on M have the form
$$dF = 0 \quad \text{and} \quad d * F = 0,$$
where $F \in \Omega^2(M, \sqrt{-1}\,\mathbb{R})$ and ∗ is the Hodge star of the metric $ds^2$. They
characterize curvature forms F as harmonic 2-forms. Since in the Riemannian
case $*^2 = 1$ on 2-forms on a 4-manifold, we have a decomposition
$$\Omega^2(M, \sqrt{-1}\,\mathbb{R}) = \Omega^2_+(M, \sqrt{-1}\,\mathbb{R}) \oplus \Omega^2_-(M, \sqrt{-1}\,\mathbb{R}) \tag{9.19}$$
according to the eigenspaces of the Hodge ∗-operator corresponding to the eigenvalues
1 and −1. The 2-form F on M is called self-dual or anti-self-dual if
$F \in \Omega^2_+(M, \sqrt{-1}\,\mathbb{R})$ or $F \in \Omega^2_-(M, \sqrt{-1}\,\mathbb{R})$ respectively,
$$*F = \pm F.$$
Correspondingly, connection ∇ = d + A on a U(1)-line bundle L is called self-
dual or anti-self-dual, if its curvature 2-form F = dA is, respectively, self-dual or
anti-self-dual. Curvature forms of self-dual or anti-self-dual connections satisfy
Maxwell’s equations on a Riemannian 4-manifold M automatically!
From the inequality
$$-\int_M \omega \wedge *\omega \ge 0$$
for all $\omega \in \Omega^2(M, \sqrt{-1}\,\mathbb{R})$, we get for a curvature 2-form F of a line bundle
L → M,
$$-\int_M F \wedge *F - 4\pi^2 p_1 = -\int_M (F \wedge *F - F \wedge F) = -\frac{1}{2} \int_M (F - *F) \wedge *(F - *F) \ge 0$$
and
$$-\int_M F \wedge *F + 4\pi^2 p_1 = -\int_M (F \wedge *F + F \wedge F) = -\frac{1}{2} \int_M (F + *F) \wedge *(F + *F) \ge 0.$$
Thus we obtain the inequality
$$S(A) \ge \pi |p_1|,$$
where the absolute minima of the action are given by the self-dual connections
in case p1 > 0, by the anti-self-dual connections in case p1 < 0 and by both
these types in case p1 = 0.
Remark. In the pseudo-Riemannian case $*^2 = -1$ on 2-forms and the analog of
decomposition (9.19) is valid only for complex-valued 2-forms. The corresponding
self-duality equations take the form
$$*F = \pm \sqrt{-1}\, F$$
and have no nonzero solutions in $\Omega^2(M, \sqrt{-1}\,\mathbb{R})$. In other words, these equations have
only "non-physical" solutions.
Problem 9.1. Find local trivializations for a vector bundle defined by (9.5) and
show that in this case definition (9.7) reduces to (9.8).
Problem 9.2. Show that property 1 follows from the Bianchi identity (9.15).
Problem 9.3. Prove property 2. (Hint: given two connections ∇0 and ∇1 on
E, consider a homotopy ∇t = (1 − t)∇0 + t∇1 ).
Problem 9.4. Prove that for every closed 2-form F on a compact manifold M
with the property
$$\left[ \frac{\sqrt{-1}}{2\pi} F \right] \in \check{H}^2(M, \mathbb{Z}),$$
there is a line bundle L → M and a connection ∇ = d + A such that F = dA.
LECTURE 10

Yang-Mills theory

Here we consider the case when G is a compact, connected, semi-simple Lie
group.

10.1. Yang-Mills equations


Let E → M be a complex rank r vector bundle over an n-dimensional
manifold M , which may be considered as a vector bundle with a non-compact
symmetry group G = GL(r, C). There is a natural bundle map of the bundle
End E — the endomorphism bundle of E — to the trivial line bundle over M ,
given by the trace map tr : End V → C in the fibers. Explicitly,

$$\mathrm{End}\,V = V \otimes V^* \ni v \otimes w \mapsto \mathrm{tr}(v \otimes w) = w(v) \in \mathbb{C}.$$

This determines a map

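$$\Omega^p(M, \mathrm{End}\,E) \otimes \Omega^q(M, \mathrm{End}\,E) \ni \omega_1 \otimes \omega_2 \mapsto \mathrm{tr}(\omega_1 \wedge \omega_2) \in \Omega^{p+q}(M). \tag{10.1}$$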

Namely, if ω1 = ψ1 ⊗ ζ1 , ω2 = ψ2 ⊗ ζ2 , where ψ1 ∈ Ωp (M ), ψ2 ∈ Ωq (M ) and


ζ1 , ζ2 ∈ Ω0 (M, End E), then

tr(ω1 ∧ ω2 ) = tr(ζ1 ζ2 ) ψ1 ∧ ψ2 .

A choice of a Riemannian or pseudo-Riemannian metric $ds^2$ on M defines
a Hodge star operator on the algebra $\Omega^\bullet(M)$ of differential forms on M. It
extends to the operator
$$\star : \Omega^p(M, \mathrm{End}\,E) \to \Omega^{n-p}(M, \mathrm{End}\,E)$$
by
$$\star(\psi \otimes \zeta) = *\psi \otimes \zeta, \quad \psi \in \Omega^p(M),\ \zeta \in \Omega^0(M, \mathrm{End}\,E).$$
Denote by $\mathcal{A}_E$ the affine space of connections on E.
Definition. A Yang-Mills action functional $S : \mathcal{A}_E \to \mathbb{C}$ is given by
$$S(A) = -\frac{1}{4\pi} \int_M \mathrm{tr}(F \wedge \star F), \quad F = dA + A \wedge A, \quad A \in \mathcal{A}_E . \tag{10.2}$$
If the manifold M is non-compact, we assume that connections A are such that
the integral in (10.2) is convergent (e.g., F has compact support). It follows
from (9.14) that the functional S is invariant under the action of the gauge group
$\mathcal{G}$ with the symmetry group GL(r, C).


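Proposition 10.1. The critical points of the Yang-Mills action functional
are given by the solutions of the Yang-Mills equations
$$\nabla_A F = 0 \quad \text{and} \quad \nabla_A \star F = 0. \tag{10.3}$$
Proof. The first equation is just the Bianchi identity (9.15) in Lecture 9,
while the derivation of the second equation repeats the proof of Proposition 8.1 in
Lecture 8. Namely, for $a \in \Omega^1(M, \mathrm{End}\,E)$ we have
$$F(A + a) = F(A) + da + A \wedge a + a \wedge A + a \wedge a = F(A) + \nabla_A a + a \wedge a.$$
Whence, using the cyclic property of the trace, formula (8.15), the Leibniz rule
$$d(a \wedge \star F) = da \wedge \star F - a \wedge d \star F$$
and the Stokes theorem, we obtain
$$\begin{aligned}
\delta S(A) &= \left. \frac{d}{d\varepsilon} \right|_{\varepsilon = 0} S(A + \varepsilon a) \\
&= -\frac{1}{4\pi} \int_M \mathrm{tr}\bigl( (da + A \wedge a + a \wedge A) \wedge \star F + F \wedge \star (da + A \wedge a + a \wedge A) \bigr) \\
&= -\frac{1}{2\pi} \int_M \mathrm{tr}\bigl( (da + A \wedge a + a \wedge A) \wedge \star F \bigr) \\
&= -\frac{1}{2\pi} \int_M \mathrm{tr}\bigl( a \wedge (d \star F + A \wedge \star F - \star F \wedge A) \bigr) \\
&= -\frac{1}{2\pi} \int_M \mathrm{tr}( a \wedge \nabla_A \star F ). \qquad \square
\end{aligned}$$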
Suppose that the vector bundle E is associated with a principal G-bundle
P over M through a representation R : G → GL(V) of a compact Lie group
G. When the representation R is unitary with respect to a Hermitian inner product
in V, the restriction of the Yang-Mills functional to the connections $\mathcal{A}_E^G$ with the
symmetry group G gives a functional taking non-negative values. Indeed, in
this case ρ(g) consists of skew-Hermitian endomorphisms of V, and
$$-\mathrm{tr}\, B^2 \ge 0 \quad \text{for } B = -B^* \in \mathrm{End}\,V,$$
where ∗ stands for the Hermitian conjugation.


Another important example is when V = g is a real vector space and the
representation R is given by the adjoint action Ad of G on g. A Lie algebra g carries
an Ad-invariant symmetric bilinear form, the Killing form, given by
$$\langle x, y \rangle = -\mathrm{tr}(\mathrm{ad}_x\, \mathrm{ad}_y), \quad x, y \in g,$$
where $\mathrm{ad}_x \in \mathrm{End}\,g$ is given by the adjoint action, $\mathrm{ad}_x(y) = [x, y]$. The Killing
form defines a positive-definite inner product if and only if the Lie group G is
compact and semi-simple. The corresponding Lie algebra g of such a Lie
group G is characterized by the property that there is a basis $x_1, \dots, x_n$ of g
such that in the adjoint representation the n × n matrices $X_a = \mathrm{ad}_{x_a}$ satisfy
$$\mathrm{tr}(X_a X_b) = -2\delta_{ab}, \quad a, b = 1, \dots, n. \tag{10.4}$$
Equivalently, there is a basis $x_1, \dots, x_n$ of g such that the corresponding structure
constants $t^c_{ab}$,
$$[x_a, x_b] = \sum_{c=1}^{n} t^c_{ab}\, x_c ,$$
are totally anti-symmetric. In case g = su(2) such a basis in the defining two-dimensional
representation is given by the matrices
$$x_1 = \frac{1}{2i} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad x_2 = \frac{1}{2i} \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad x_3 = \frac{1}{2i} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$
where $t_{abc} = t^c_{ab}$ is totally anti-symmetric and $t_{123} = 1$. The corresponding matrices
$X_1, X_2, X_3$ in the adjoint representation of su(2) are given by (see Example 2.2
in Lecture 2)
$$X_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad X_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad X_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
and
$$\langle x, y \rangle = -4\, \mathrm{tr}_{\mathbb{C}^2}(xy).$$
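The su(2) statements above are easy to confirm numerically. The sketch below builds the matrices $x_a$ and $X_a$ as plain nested lists and checks the commutation relations with $t_{123} = 1$, the normalization (10.4), and the relation between the Killing form and the trace in the defining representation. All helper names (`matmul`, `commutator`, `trace`, `close`) are illustrative.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def trace(a):
    """Trace of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))


s = 1 / 2j  # the common factor 1/(2i)
# Basis of su(2) in the defining two-dimensional representation
x = [
    [[0, s], [s, 0]],
    [[0, s * (-1j)], [s * 1j, 0]],
    [[s, 0], [0, -s]],
]
# Matrices of ad_{x_a} in the basis x_1, x_2, x_3
X = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
]

for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    assert close(commutator(x[a], x[b]), x[c])   # [x_a, x_b] = x_c cyclically
    assert close(commutator(X[a], X[b]), X[c])   # same relations for ad_{x_a}

killing = [[-trace(matmul(X[a], X[b])) for b in range(3)] for a in range(3)]
defining = [[-4 * trace(matmul(x[a], x[b])) for b in range(3)] for a in range(3)]
print("Killing form <x_a, x_b>:", killing)
print("-4 tr(x_a x_b):         ", defining)
```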
The real vector bundle associated with a principal G-bundle P through the
adjoint representation of a Lie group G on its Lie algebra g is called an adjoint
bundle and is denoted by ad P . In case when (M, ds2 ) is a Riemannian manifold,
the Killing form defines on Ωp (M, ad P ) an inner product
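$$(\omega_1, \omega_2) = \int_M \langle \omega_1 \wedge \star\, \omega_2 \rangle \tag{10.5}$$
with the $L^2$-norm
$$\|\omega\|^2 = \int_M \langle \omega \wedge \star\, \omega \rangle .$$
The symmetry
$$(\omega_1, \omega_2) = (\omega_2, \omega_1) \tag{10.6}$$
follows from (8.15) in Lecture 8 and the cyclic property of the trace. The Yang-Mills
functional is, up to a factor, the squared $L^2$-norm of the curvature form $F(A) \in \Omega^2(M, \mathrm{ad}\,P)$,
$$S(A) = \frac{1}{4\pi} \|F(A)\|^2 .$$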

In physics applications M is a four-manifold with pseudo-Riemannian metric


of signature (+, −, −, −) and E = ad P for some principal G-bundle P , where
G is compact semi-simple Lie group. Of special importance is the case M = R4
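with Minkowski metric, and ad P = M × g. Introduce $A = A_\mu dx^\mu$ and
$$F = \frac{1}{2} F_{\mu\nu}\, dx^\mu \wedge dx^\nu, \quad F_{\mu\nu} = \frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu} + [A_\mu, A_\nu], \tag{10.7}$$
where¹ $A_\mu = A^a_\mu X_a$, $F_{\mu\nu} = F^a_{\mu\nu} X_a \in g$ and the generators $X_a$ satisfy (10.4). The
corresponding Yang-Mills functional (10.2) takes the form
$$S(A) = \frac{1}{16\pi} \int_{\mathbb{R}^4} \langle F_{\mu\nu}, F^{\mu\nu} \rangle\, d^4 x = -\frac{1}{8\pi} \int_{\mathbb{R}^4} F^a_{\mu\nu} (F^a)^{\mu\nu}\, d^4 x, \tag{10.8}$$
where $F^{\mu\nu} = (F^a)^{\mu\nu} X_a$, and Yang-Mills equations (10.3) become
$$\nabla_\mu F^{\mu\nu} = \frac{\partial F^{\mu\nu}}{\partial x^\mu} + [A_\mu, F^{\mu\nu}] = 0. \tag{10.9}$$
Yang-Mills equations (10.7) and (10.9) generalize U(1)-invariant Maxwell equations
to the case of a non-abelian symmetry group G. In terms of the components,
equation (10.9) takes the form
$$\frac{\partial (F^a)^{\mu\nu}}{\partial x^\mu} + t_{abc}\, A^b_\mu (F^c)^{\mu\nu} = 0,$$
where $t_{abc}$ are the totally anti-symmetric structure constants of g.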
Remark. In physics one uses ∇µ = ∂µ − gAµ for the covariant derivative,
where g is a coupling constant of the theory. In Quantum Chromodynamics
(QCD) one considers G = SU(3) in the adjoint representation, and the corresponding
components $A^a_\mu(x)$, a = 1, . . . , 8, are the gluon fields; the corresponding quark fields
are in the fundamental representation of SU(3). In our notation the gluon part of
the QCD Lagrangian is
$$\mathcal{L}(A) = -\frac{1}{4g^2} F^a_{\mu\nu} (F^a)^{\mu\nu}, \tag{10.10}$$
where $F^a_{\mu\nu}$ plays the role of the gluon field strength tensor. The corresponding
elementary particle, a gluon (or gauge boson), is the exchange particle for the
strong force between quarks. In the Standard Model of electroweak and strong
interactions one uses the symmetry group G = SU(3) × SU(2) × U(1).

10.2. Self-duality equations


In the Riemannian case Yang-Mills equations do not have a direct physical
interpretation. However, in case when M is a compact Riemannian four-manifold,
these equations have a fundamental mathematical significance.

¹ Here summation over repeated indices is understood.

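Recall that the first Pontryagin class of a real vector bundle ad P over M is
defined by
$$p_1(\mathrm{ad}\,P) = -c_2(\mathrm{ad}_{\mathbb{C}} P) \in \check{H}^4(M, \mathbb{Z}),$$
where $\mathrm{ad}_{\mathbb{C}} P = \mathrm{ad}\,P \otimes_{\mathbb{R}} \mathbb{C}$ is a complex vector bundle. Since $c_1(\mathrm{ad}_{\mathbb{C}} P) = 0$, it is
easy to see that $c_2(\mathrm{ad}_{\mathbb{C}} P)$ is represented by the differential form $\frac{1}{8\pi^2} \mathrm{tr}(F \wedge F)$,
and the corresponding first Pontryagin number is
$$p_1 = -\frac{1}{8\pi^2} \int_M \mathrm{tr}(F \wedge F) \in \mathbb{Z}.$$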
In case when M is a Riemannian four-manifold with the metric $ds^2$, we have
a decomposition
$$\Omega^2(M, \mathrm{ad}\,P) = \Omega^2_+(M, \mathrm{ad}\,P) \oplus \Omega^2_-(M, \mathrm{ad}\,P) \tag{10.11}$$
according to the eigenspaces of the Hodge star operator ⋆ corresponding to
the eigenvalues 1 and −1. Since the operator ⋆ is symmetric with respect to the inner
product (10.5) in $\Omega^2(M, \mathrm{ad}\,P)$, these subspaces are orthogonal. Equivalently,
for $F = F_+ + F_-$, where $F_\pm \in \Omega^2_\pm(M, \mathrm{ad}\,P)$,
$$(F_+, F_-) = -\int_M \langle F_+ \wedge F_- \rangle = -\int_M \langle F_- \wedge \star F_+ \rangle = -(F_-, F_+),$$
and it follows from (10.6) that $(F_+, F_-) = 0$.
The curvature $F \in \Omega^2(M, \mathrm{ad}\,P)$ is called self-dual or anti-self-dual if
$F \in \Omega^2_+(M, \mathrm{ad}\,P)$ or $F \in \Omega^2_-(M, \mathrm{ad}\,P)$ respectively,
$$\star F = \pm F.$$
Correspondingly, connection ∇ = d + A on a real vector bundle ad P is called
self-dual or anti-self-dual, if its curvature is self-dual or anti-self-dual. Curvature
forms of self-dual or anti-self-dual connections satisfy Yang-Mills equations on
a Riemannian four-manifold M automatically!
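Using the orthogonality of $F_+$ and $F_-$, we obtain
$$S(A) = \frac{1}{4\pi} \|F(A)\|^2 = \frac{1}{4\pi} \left( \|F_+\|^2 + \|F_-\|^2 \right)$$
and
$$\begin{aligned}
p_1 &= -\frac{1}{8\pi^2} \int_M \mathrm{tr}\bigl( (F_+ + F_-) \wedge (F_+ + F_-) \bigr) \\
&= -\frac{1}{8\pi^2} (F_+ + F_-, F_+ - F_-) \\
&= -\frac{1}{8\pi^2} \left( \|F_+\|^2 - \|F_-\|^2 \right).
\end{aligned}$$
From here we obtain the inequalities
$$S(A) - 2\pi p_1 \ge \frac{1}{2\pi} \|F_+\|^2 \quad \text{and} \quad S(A) + 2\pi p_1 \ge \frac{1}{2\pi} \|F_-\|^2 .$$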
Thus we see that the absolute minima of the Yang-Mills action on $\mathcal{A}^G_{\mathrm{ad}\,P}$ are
given by the self-dual connections in case p1 > 0, by the anti-self-dual connections
in case p1 < 0 and by both these types in case p1 = 0. The number p1 is
called the instanton number. Solutions of the self-dual Yang-Mills equations for
$M = S^4$ and G = SU(2) in case p1 = k > 0 form the instanton moduli space
$\mathcal{M}_k$, a smooth manifold of dimension 8k − 3.

10.3. Hitchin equations


Let G be a compact real form of a complex Lie group, and denote by ∗ the
corresponding anti-involution on the complex Lie algebra $g_{\mathbb{C}} = g \otimes_{\mathbb{R}} \mathbb{C}$. Consider
the self-duality equations in a trivial bundle ad P over $\mathbb{R}^4$ with Euclidean metric.
The corresponding connections are g-valued 1-forms $A = A_\mu dx^\mu$ on $\mathbb{R}^4$ with the
curvature 2-form
$$F = \frac{1}{2} F_{\mu\nu}\, dx^\mu \wedge dx^\nu, \quad F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu].$$
The self-duality equations $F = \star F$ take a simple form
$$F_{12} = F_{34}, \quad F_{13} = F_{42}, \quad F_{14} = F_{23}.$$
Suppose that $A_\mu$ do not depend on the variables $x^3$ and $x^4$. Introducing the so-called
Higgs fields, the g-valued functions $\phi_1 = A_3$, $\phi_2 = A_4$ on $\mathbb{R}^2$, we can
rewrite the self-duality equations as
$$\begin{aligned}
F_{12} &= [\phi_1, \phi_2] = F_{34}, \\
F_{13} &= [\nabla_1, \phi_1] = [\phi_2, \nabla_2] = F_{42}, \\
F_{14} &= [\nabla_1, \phi_2] = [\nabla_2, \phi_1] = F_{23}.
\end{aligned}$$
Let
$$F = F_{12} = \partial_1 A_2 - \partial_2 A_1 + [A_1, A_2]$$
be the curvature form of a connection $d + A_1 dx^1 + A_2 dx^2$ on a trivial ad P
bundle over $\mathbb{R}^2$. Introducing the complex Higgs field $\phi = \phi_1 - \sqrt{-1}\,\phi_2$, the
above equations can be written as
$$F = \frac{\sqrt{-1}}{2} [\phi, \phi^*] \quad \text{and} \quad [\nabla_1 + \sqrt{-1}\,\nabla_2, \phi] = 0. \tag{10.12}$$
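The passage from the three reduced equations to (10.12) can be spelled out as follows. This is a sketch under the assumption that g is the fixed-point set of $x \mapsto -x^*$ in $g_{\mathbb{C}}$, so that $\phi_1^* = -\phi_1$ and $\phi_2^* = -\phi_2$; the anti-involution is taken to be anti-linear and to reverse products.

```latex
\begin{aligned}
\phi^* &= (\phi_1 - \sqrt{-1}\,\phi_2)^* = -\phi_1 - \sqrt{-1}\,\phi_2, \\
[\phi, \phi^*] &= -2\sqrt{-1}\,[\phi_1, \phi_2]
  \quad\Longrightarrow\quad
  \frac{\sqrt{-1}}{2}[\phi, \phi^*] = [\phi_1, \phi_2] = F_{12}, \\
[\nabla_1 + \sqrt{-1}\,\nabla_2,\ \phi]
  &= \bigl( [\nabla_1, \phi_1] + [\nabla_2, \phi_2] \bigr)
   + \sqrt{-1}\,\bigl( [\nabla_2, \phi_1] - [\nabla_1, \phi_2] \bigr).
\end{aligned}
```

The first bracket in the last line vanishes exactly when $F_{13} = F_{42}$, that is $[\nabla_1, \phi_1] = -[\nabla_2, \phi_2]$, and the second vanishes exactly when $F_{14} = F_{23}$.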

Put z = x1 + −1x2 and introduce
Φ = 12 φ dz ∈ Ω1,0 (C, adC P ), Φ∗ = 21 φ dz̄ ∈ Ω0,1 (C, adC P ).
Introducing connection 1-form
A = A1 dx1 + A2 dx2 = A1,0 dz + A0,1 dz̄
in the complex vector bundle adC P over C, we can rewrite equations (10.12) as
(10.13) F + [Φ, Φ∗ ] = 0,
(10.14) ∂¯A Φ = 0.
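Here
$$[\Phi, \Phi^*] = \Phi \wedge \Phi^* + \Phi^* \wedge \Phi$$
is a graded Lie bracket on $\mathrm{ad}_{\mathbb{C}} P$-valued 1-forms, and $\bar{\partial}_A$ is the (0, 1)-component
of
$$\nabla_A = \partial + A^{1,0} dz + \bar{\partial} + A^{0,1} d\bar{z} = \partial_A + \bar{\partial}_A .$$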
It is remarkable that equations (10.13)–(10.14) make sense over a Riemann
surface M ! Namely, consider a principal G-bundle P over M , a connection A
in the adjoint bundle ad P and the Higgs field Φ ∈ Ω1,0 (M, adC P ). The pair
(A, Φ) satisfies self-duality equations over a Riemann surface M , if

(10.15) F (A) + [Φ, Φ∗ ] = 0 and ∂¯A Φ = 0.

The second equation states that Φ is a holomorphic section of the complex vector bundle $\mathrm{ad}_{\mathbb{C}} P \otimes \Omega^{1,0}(M, \mathbb{C})$ with respect to the complex structure in $\mathrm{ad}_{\mathbb{C}} P$ determined by the Cauchy-Riemann operator $\bar\partial_A = \bar\partial + A^{0,1}$ and the natural complex structure in $\Omega^{1,0}(M, \mathbb{C})$. A solution (A, Φ) of the self-duality equations (10.15) determines a flat complex connection $d + A + \Phi + \Phi^*$ on $\mathrm{ad}_{\mathbb{C}} P$.
LECTURE 11

Electromagnetic waves in a free space

11.1. Energy-momentum tensor


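Suppose that F satisfies Maxwell equations without sources. Using equations (8.9) and (8.12) we have
\[
\frac{\partial}{\partial x^\alpha}(F_{\mu\nu}F^{\mu\nu})
= \frac{\partial F_{\mu\nu}}{\partial x^\alpha}F^{\mu\nu} + F_{\mu\nu}\frac{\partial F^{\mu\nu}}{\partial x^\alpha}
= 2\frac{\partial F_{\mu\nu}}{\partial x^\alpha}F^{\mu\nu}
= -2\left(\frac{\partial F_{\alpha\mu}}{\partial x^\nu} + \frac{\partial F_{\nu\alpha}}{\partial x^\mu}\right)F^{\mu\nu}
= -4\frac{\partial}{\partial x^\mu}(F_{\nu\alpha}F^{\mu\nu}).
\]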
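Thus
\[ \frac{\partial}{\partial x^\alpha}(F_{\mu\nu}F^{\mu\nu}) = -4\frac{\partial}{\partial x^\beta}(F_{\nu\alpha}F^{\beta\nu}), \]
and introducing
\[ T_\alpha^\beta = F_{\nu\alpha}F^{\beta\nu} + \frac{1}{4}\delta_\alpha^\beta F_{\mu\nu}F^{\mu\nu}, \]
we can rewrite this equation as a conservation law
\[ \frac{\partial T_\alpha^\beta}{\partial x^\beta} = 0, \qquad \alpha = 0, 1, 2, 3. \tag{11.1} \]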
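The tensor $T_\alpha^\beta$ is traceless, $T_\alpha^\alpha = 0$, and symmetric, $T^{\alpha\beta} = T^{\beta\alpha}$, where
\[ T^{\alpha\beta} = \eta^{\alpha\gamma}T_\gamma^\beta = -\eta_{\mu\nu}F^{\alpha\mu}F^{\beta\nu} + \frac{1}{4}\eta^{\alpha\beta}F_{\mu\nu}F^{\mu\nu}. \tag{11.2} \]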
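The symmetry and tracelessness of the tensor (11.2) are purely algebraic and can be confirmed for a random antisymmetric array. The following is a minimal sketch assuming the metric signature (+, −, −, −) and units with c = 1; the variable names are hypothetical.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])  # Minkowski metric, equal to its inverse

rng = np.random.default_rng(2)
M = rng.normal(size=(4, 4))
F_up = M - M.T                     # contravariant components F^{mu nu}
F_low = eta @ F_up @ eta           # covariant components F_{mu nu}
invariant = np.sum(F_low * F_up)   # F_{mu nu} F^{mu nu}

# T^{ab} = -eta_{mn} F^{am} F^{bn} + (1/4) eta^{ab} F_{mn} F^{mn}
T = -F_up @ eta @ F_up.T + 0.25 * eta * invariant
trace = np.sum(eta * T)            # eta_{ab} T^{ab}

# T^{00} reduces to half the sum of squares of the independent components.
energy_density = 0.5 * (
    np.sum(F_up[0, 1:] ** 2) + np.sum(np.triu(F_up[1:, 1:], 1) ** 2)
)

print(np.allclose(T, T.T), np.isclose(trace, 0.0))
print(np.isclose(T[0, 0], energy_density))
```

The last check shows that $T^{00}$ is non-negative, in agreement with its interpretation as an energy density.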
The tensor $T^{\alpha\beta}$ is called the energy-momentum tensor. Its components contain the energy density
\[ T^{00} = \frac{1}{2}\left(\frac{1}{c^2}E^2 + B^2\right) \]
and the momentum density
\[ T^{0i} = F^{0k}F^{ik} = \frac{1}{c}(E \times B)_i, \qquad i = 1, 2, 3. \]
The vector S = E × B is called the Poynting vector.
Remark. The conservation law (11.1) for α = 0,
\[ \frac{\partial T^{00}}{\partial t} = -\nabla \cdot S, \]
can be verified directly using Maxwell's equations and the calculus formula
\[ \nabla \cdot (a \times b) = b \cdot (\nabla \times a) - a \cdot (\nabla \times b). \]
It also implies that the total energy of the electromagnetic field
\[ \mathcal{E} = \frac{1}{4\pi}\int_{\{ct\}\times\mathbb{R}^3} T^{00}\, d^3r \]
does not depend on time.

11.2. Gauge fixing


Maxwell equations in the empty space

(11.3) dF = 0 and d ∗ F = 0

describe harmonic 2-forms on R4 . Their general solution is given by F = dA,


where

(11.4) ∗ d ∗ dA = 0.

This equation is not hyperbolic: if A ∈ Ω1 (R4 ) is a solution then A + df for


any smooth function f on R4 is also a solution. However, one can impose an
additional condition

(11.5) d ∗ A = 0,

which turns (11.4) into the hyperbolic equation
\[ \Box A = 0, \]
where
\[ \Box = d * d * + * d * d \]
is the d'Alembertian, the Laplace operator of the Minkowski metric on $\mathbb{R}^4$, acting on 1-forms. In terms of $A = A_\mu dx^\mu$ equation (11.5) becomes
\[ \partial_\mu A^\mu = 0, \quad\text{where}\quad \partial_\mu = \frac{\partial}{\partial x^\mu}, \tag{11.6} \]
and is called the Lorenz¹ gauge condition. Since equation (11.4) can be written as
\[ \partial_\mu F^{\mu\nu} = 0, \quad\text{where}\quad F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu \]
(see (8.12) in Lecture 8), we readily obtain that in the Lorenz gauge
\[ \Box A^\mu = 0, \qquad \mu = 0, 1, 2, 3, \]
¹Named after the Danish physicist and mathematician Ludvig Lorenz, not to be confused with the Dutch physicist Hendrik Lorentz!

where
\[ \Box = \partial_\mu\partial^\mu = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} - \frac{\partial^2}{\partial z^2}. \]
For every $A \in \Omega^1(\mathbb{R}^4)$ there is a gauge equivalent 1-form $A^f = A + df$ satisfying the Lorenz condition. Indeed, (11.5) gives
\[ * d * df = - * d * A. \]
Using (11.6), for the function f we get the hyperbolic equation
\[ \Box f = -\partial_\mu A^\mu. \]
Remark. Maxwell equations with sources in the Lorenz gauge have the form
\[ \Box A^\mu = J^\mu. \]
The Lorenz gauge is not unique: if A satisfies (11.6), so does $A^f$, where $\Box f = 0$. In free and empty space one can make a unique choice by imposing $A_0 = 0$. In general, this gauge condition is called Hamilton gauge. Together with the Lorenz gauge it yields the Coulomb gauge,

(11.7) ∇ · A = 0.

Indeed, we can always make $A_0 = 0$ by using $A^f$, where $\partial_0 f = -A_0$. The remaining gauge transformations preserving $A_0 = 0$ are of the form $A \mapsto A + d\chi$, where χ is independent of $x^0$. In the free and empty space ρ = 0 and since $\varphi = cA_0 = 0$, it follows from equation (8.6) in Lecture 8 that
\[ 0 = \nabla \cdot E = -\frac{\partial}{\partial t}(\nabla \cdot A), \]
whence ∇ · A does not depend on t. Determining χ from the elliptic equation
\[ \Delta\chi = -\nabla \cdot A, \quad\text{where}\quad \Delta = \nabla \cdot \nabla = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}, \]
we arrive at (11.7). Note that in the free and empty space in the Coulomb gauge we also have $A_0 = 0$. In the presence of electric charges the Coulomb gauge condition is
\[ -\nabla^2 A_0 = \frac{c}{\varepsilon_0}\rho \quad\text{and}\quad \nabla \cdot A = 0, \]
where ρ(r, t) is the electric charge density.
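To summarize, in the Coulomb gauge Maxwell equations in free and empty space take the form
\[ \Box A = 0 \quad\text{and}\quad E = -\frac{\partial A}{\partial t}, \quad B = \nabla \times A, \tag{11.8} \]
so that also
\[ \Box E = 0 \quad\text{and}\quad \Box B = 0. \tag{11.9} \]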

11.3. Plane waves


In the Coulomb gauge consider the case when the potential A depends only on the coordinate x. The wave equation reduces to
\[ \frac{\partial^2 A}{\partial t^2} - c^2\frac{\partial^2 A}{\partial x^2} = 0 \]
and has a general solution
\[ A(t, x) = A_1\left(t - \frac{x}{c}\right) + A_2\left(t + \frac{x}{c}\right). \]
The wave moving in the positive direction of the x-axis is
\[ A\left(t - \frac{x}{c}\right), \]
and the Coulomb gauge condition gives
\[ \frac{\partial A_x}{\partial x} = 0. \]
Thus $A_x = at$, where a is a constant, which gives rise to a constant electric field in the x-direction. Since such a field has nothing to do with the electromagnetic wave, we can set $A_x = 0$. Introducing the direction of the wave, the unit vector $n = e_x$, we obtain that always A ⊥ n. Correspondingly,
\[ E = -A' \quad\text{and}\quad B = -\frac{1}{c}\, n \times A' = \frac{1}{c}\, n \times E, \]
where the prime indicates the t-derivative. Thus the electric and magnetic fields are perpendicular to the direction of propagation of the wave, and the corresponding electromagnetic plane waves are transverse. Moreover, the electric and magnetic fields are orthogonal, and their strengths are related by E = cB. The vectors n, E/E, B/B (the last two being E and B divided by their lengths) form an orthonormal positively oriented basis of $\mathbb{R}^3$.
The components of the energy-momentum tensor of a plane wave are given by
\[ T^{00} = \frac{E^2}{c^2} \quad\text{and}\quad S = \frac{1}{c^2}\, E \times (n \times E) = \frac{E^2}{c^2}\, n, \]
so that $(T^{00})^2 = S^2$.
A monochromatic wave is a simply periodic function of t, with a vector potential
\[ A = \mathrm{Re}\left\{A_0 e^{-i\omega(t - x/c)}\right\}. \]
Here $A_0 \in \mathbb{C}^3$ is a constant complex vector, ω is the frequency, $\lambda = 2\pi c/\omega$ is the wave length and $k = \frac{\omega}{c}\, n$ is the wave vector, where n is a unit vector in the direction of propagation of the wave (in our case $n = e_x$). We have
\[ A = \mathrm{Re}\left\{A_0 e^{i(k\cdot r - \omega t)}\right\}, \]

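where k · r − ωt is the phase of the wave. Correspondingly,
\[ E = \mathrm{Re}\left\{E_0 e^{i(k\cdot r - \omega t)}\right\} \quad\text{and}\quad B = \mathrm{Re}\left\{B_0 e^{i(k\cdot r - \omega t)}\right\}, \]
where
\[ E_0 = i\omega A_0 \quad\text{and}\quad B_0 = ik \times A_0. \]
Consider the vector $E_0 \in \mathbb{C}^3$ and put $b = E_0 e^{i\alpha}$, where $E_0^2 = E_0 \cdot E_0 = |E_0|^2 e^{-2i\alpha}$. Then $b^2 = b \cdot b = |E_0|^2$ and
\[ E = \mathrm{Re}\left\{b\, e^{i(k\cdot r - \omega t - \alpha)}\right\}. \]
Putting $b = b_1 + ib_2$, where $b_1, b_2 \in \mathbb{R}^3$, we have
\[ b^2 = b_1^2 - b_2^2 + 2i\, b_1 \cdot b_2 \in \mathbb{R}, \]
so that $b_1$ and $b_2$ are orthogonal. Since $A_0$ is orthogonal to the wave vector k, the vectors $b_1$ and $b_2$ are also orthogonal to k.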
Choosing the xyz coordinate axes along the positively oriented orthogonal basis $k, b_1, \pm b_2$, we get
\[ E_y = b_1\cos(\omega t - k\cdot r + \alpha), \qquad E_z = \pm b_2\sin(\omega t - k\cdot r + \alpha), \]
where $b_1 = |b_1|$ and $b_2 = |b_2|$. If $b_1, b_2$ are non-zero, we have
\[ \frac{E_y^2}{b_1^2} + \frac{E_z^2}{b_2^2} = 1, \]
so that at each point of the space the electric field vector E rotates in the plane perpendicular to the direction of propagation and describes an ellipse. Such a wave is called elliptically polarized. If $b_1 = b_2$, the wave is called circularly polarized, and in case $b_1$ or $b_2$ is zero, the wave is called linearly polarized.
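The decomposition of the complex amplitude into two orthogonal real semi-axes can be illustrated numerically. The sketch below assumes a wave propagating along the x-axis and a randomly chosen transverse amplitude; the phase α is extracted from the bilinear square $E_0 \cdot E_0$, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
k_hat = np.array([1.0, 0.0, 0.0])            # direction of propagation
E0 = rng.normal(size=3) + 1j * rng.normal(size=3)
E0 -= k_hat * (k_hat @ E0)                   # make the amplitude transverse

E0_sq = E0 @ E0                              # bilinear square, no conjugation
alpha = -0.5 * np.angle(E0_sq)               # E0^2 = |E0^2| exp(-2 i alpha)
b = E0 * np.exp(1j * alpha)                  # b.b is now real and non-negative
b1, b2 = b.real, b.imag

# Field at a fixed point as the phase theta runs over one period.
theta = np.linspace(0.0, 2.0 * np.pi, 50)
E = np.real(b[None, :] * np.exp(-1j * theta)[:, None])
ellipse = (E @ b1 / (b1 @ b1)) ** 2 + (E @ b2 / (b2 @ b2)) ** 2

print(abs(b1 @ b2) < 1e-12)          # semi-axes are orthogonal
print(np.allclose(ellipse, 1.0))     # the tip of E stays on the ellipse
```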
Remark. Introduce the 4-vectors $(k^\mu) = \left(\frac{\omega}{c}, k\right)$ and $(k_\mu) = \left(\frac{\omega}{c}, -k\right)$ with the property $k_\mu k^\mu = 0$. We have $k_\mu x^\mu = \omega t - k\cdot r$, so that
\[ A(x) = \mathrm{Re}\left\{A_0 e^{-ik_\mu x^\mu}\right\}. \]
The electromagnetic waves describe photons, particles with 4-wave vector satisfying $k_0^2 = k^2$.

11.4. The general solution


The Cauchy problem for equation (11.8) has the form
\[ \Box A = 0, \qquad A(0, r) = A_0(r), \qquad \frac{\partial A}{\partial t}(0, r) = A_1(r), \]

where Cauchy data A0 (r) and A1 (r) satisfy Coulomb gauge condition
∇ · A0 = 0 and ∇ · A1 = 0
and rapidly decay as |r| → ∞.
The Cauchy problem for the wave equation in $\mathbb{R}^4$ is solved by the Fourier transform. Namely, let
\[ A_0(r) = \frac{1}{(2\pi)^{3/2}}\int_{\mathbb{R}^3} e^{ik\cdot r} a_0(k)\, d^3k, \qquad A_1(r) = \frac{1}{(2\pi)^{3/2}}\int_{\mathbb{R}^3} e^{ik\cdot r} a_1(k)\, d^3k, \]
where $a_0(k) = \bar a_0(-k)$, $a_1(k) = \bar a_1(-k)$ and $k\cdot a_0(k) = k\cdot a_1(k) = 0$. The solution is given by
\[ A(t, r) = \frac{1}{(2\pi)^{3/2}}\int_{\mathbb{R}^3} e^{ik\cdot r} a(t, k)\, d^3k, \tag{11.10} \]
where
\[ a(t, k) = \cos(c|k|t)\, a_0(k) + \frac{\sin(c|k|t)}{c|k|}\, a_1(k). \]
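For a fixed k the coefficient a(t, k) must solve the oscillator equation $\ddot a = -c^2|k|^2 a$ with initial data $a_0(k)$, $a_1(k)$. The sketch below checks this by finite differences; the numerical values of k, $a_0$, $a_1$ and the step h are arbitrary assumptions chosen for illustration.

```python
import numpy as np

def mode(t, a0, a1, omega):
    """a(t, k) = cos(omega t) a0 + sin(omega t) / omega * a1 for one fixed k."""
    return np.cos(omega * t) * a0 + np.sin(omega * t) / omega * a1

c = 1.0
k = np.array([0.3, -1.2, 0.5])                 # hypothetical wave vector
omega = c * np.linalg.norm(k)
a0 = np.array([0.4 + 0.1j, -0.2j, 1.0])        # hypothetical initial amplitude
a1 = np.array([0.0 + 0.5j, 0.7, -0.3 + 0.2j])  # hypothetical initial velocity

h, t = 1e-4, 0.8
# Central differences for the first derivative at 0 and the second at t.
velocity_at_0 = (mode(h, a0, a1, omega) - mode(-h, a0, a1, omega)) / (2 * h)
accel_at_t = (
    mode(t + h, a0, a1, omega) - 2 * mode(t, a0, a1, omega)
    + mode(t - h, a0, a1, omega)
) / h**2

print(np.allclose(mode(0.0, a0, a1, omega), a0))
print(np.allclose(velocity_at_0, a1, atol=1e-6))
print(np.allclose(accel_at_t, -omega**2 * mode(t, a0, a1, omega), atol=1e-5))
```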
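Introducing
\[ a(k) = \frac{1}{2}a_0(k) + \frac{i}{2c|k|}a_1(k), \]
we can rewrite (11.10) as
\[ A(t, r) = \frac{1}{(2\pi)^{3/2}}\int_{\mathbb{R}^3}\left(e^{-i(\omega_k t - k\cdot r)}a(k) + e^{i(\omega_k t - k\cdot r)}\bar a(k)\right) d^3k, \tag{11.11} \]
where $\omega_k = c|k|$. For electric and magnetic fields we have
\[ E = -\frac{\partial A}{\partial t} = \frac{i}{(2\pi)^{3/2}}\int_{\mathbb{R}^3}\omega_k\left(e^{-i(\omega_k t - k\cdot r)}a(k) - e^{i(\omega_k t - k\cdot r)}\bar a(k)\right) d^3k \]
and
\[ B = \nabla \times A = \frac{i}{(2\pi)^{3/2}}\int_{\mathbb{R}^3} k \times \left(e^{-i(\omega_k t - k\cdot r)}a(k) - e^{i(\omega_k t - k\cdot r)}\bar a(k)\right) d^3k. \]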
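By the Plancherel theorem we have for the total energy of the electromagnetic field,
\[
\mathcal{E} = \frac{1}{8\pi}\int_{\mathbb{R}^3}\left(\frac{1}{c^2}E^2 + B^2\right) d^3r
= \frac{1}{4\pi}\int_{\mathbb{R}^3}\left(\frac{\omega_k^2}{c^2}\, a(k)\cdot\bar a(k) + (k \times a(k))\cdot(k \times \bar a(k))\right) d^3k
= \frac{1}{2\pi c^2}\int_{\mathbb{R}^3}\omega_k^2\, a(k)\cdot\bar a(k)\, d^3k,
\]
where we have used the identity $(k \times a(k))\cdot(k \times \bar a(k)) = |k|^2\, a(k)\cdot\bar a(k)$, which follows from $k\cdot a(k) = 0$.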
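The identity used here is the Lagrange identity $(k \times a)\cdot(k \times \bar a) = |k|^2 (a\cdot\bar a) - (k\cdot a)(k\cdot\bar a)$ specialised to transverse a. A short numerical sketch with random vectors (hypothetical names) confirms it, together with the related triple-product reduction $a \times (k \times \bar a) = (a\cdot\bar a)\,k$ for transverse a.

```python
import numpy as np

rng = np.random.default_rng(4)
k = rng.normal(size=3)
a = rng.normal(size=3) + 1j * rng.normal(size=3)
a -= k * (k @ a) / (k @ k)        # enforce the transversality k . a = 0

lhs = np.cross(k, a) @ np.cross(k, a.conj())
rhs = (k @ k) * (a @ a.conj())

triple = np.cross(a, np.cross(k, a.conj()))
expected = (a @ a.conj()) * k

print(np.isclose(lhs, rhs))
print(np.allclose(triple, expected))
```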
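Similarly,
\[
\frac{1}{4\pi}\int_{\mathbb{R}^3} S\, d^3r = \frac{1}{4\pi c}\int_{\mathbb{R}^3}(E \times B)\, d^3r
= \frac{1}{2\pi c}\int_{\mathbb{R}^3}\omega_k\, a(k) \times (k \times \bar a(k))\, d^3k
= \frac{1}{2\pi c}\int_{\mathbb{R}^3}\omega_k\, (a(k)\cdot\bar a(k))\, k\, d^3k.
\]
Finally, putting
\[ P(k) = \frac{\omega_k}{2c\sqrt{\pi}}\,(a(k) + \bar a(k)), \qquad Q(k) = \frac{i}{2c\sqrt{\pi}}\,(a(k) - \bar a(k)), \]
we obtain a representation of the energy and momentum of the electromagnetic field in terms of the oscillators
\[ \frac{1}{8\pi}\int_{\mathbb{R}^3}\left(\frac{1}{c^2}E^2 + B^2\right) d^3r = \frac{1}{2}\int_{\mathbb{R}^3}\left(P^2(k) + \omega_k^2 Q^2(k)\right) d^3k, \]
\[ \frac{1}{4\pi c}\int_{\mathbb{R}^3}(E \times B)\, d^3r = \frac{c}{2}\int_{\mathbb{R}^3}\left(\omega_k^{-1}P^2(k) + \omega_k Q^2(k)\right) k\, d^3k, \]
where the normal modes P(k) and Q(k) satisfy
\[ k\cdot P(k) = k\cdot Q(k) = 0. \]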
LECTURE 12

Hamiltonian formalism. Real scalar field

Here we consider four-dimensional spacetime R4 with coordinates x = (x0 , x1 , x2 , x3 )


and Minkowski metric (dx0 )2 − (dx1 )2 − (dx2 )2 − (dx3 )2 . We put c = 1 so that
x0 = t.

12.1. Lagrangian formulation


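The scalar field ϕ(x) is a smooth real-valued function on $\mathbb{R}^4$ of the Schwartz class for each time slice $t = t_0$. The corresponding Lagrangian function has the form
\[ \mathscr{L}(\phi(x), \partial_\mu\phi(x)) = \frac{1}{2}\left(\partial_\mu\phi(x)\,\partial^\mu\phi(x) - m^2\phi^2(x)\right) - V_{\mathrm{int}}(\phi(x)), \]
where
\[ \partial_\mu\phi = \frac{\partial\phi}{\partial x^\mu}, \qquad \mu = 0, 1, 2, 3. \]
In particular, $V_{\mathrm{int}}(\phi) = 0$ corresponds to the Klein-Gordon model, and $V_{\mathrm{int}}(\phi) = g\phi^4/4!$ to the $\phi^4$-model.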
The action functional is
\[ S(\phi) = \int \mathscr{L}(\phi, \partial_\mu\phi)\, d^4x, \]
where integration goes over the part of $\mathbb{R}^4$ between the slices $t = t_0$ and $t = t_1$ with fixed $\phi(t_0, x) = \phi_0(x)$ and $\phi(t_1, x) = \phi_1(x)$, or over $\mathbb{R}^4$, where ϕ(x) is assumed to be rapidly decaying as |x| → ∞. The corresponding Euler-Lagrange equation δS = 0 takes the form
\[ \frac{\partial\mathscr{L}}{\partial\phi} - \frac{\partial}{\partial x^\mu}\frac{\partial\mathscr{L}}{\partial(\partial_\mu\phi)} = 0 \tag{12.1} \]
and yields the equation of motion of the massive real scalar field
\[ (\Box + m^2)\phi + V_{\mathrm{int}}'(\phi) = 0. \tag{12.2} \]
For the $\phi^4$-model this equation takes the form
\[ (\Box + m^2)\phi + g\frac{\phi^3}{3!} = 0, \]
and is a nonlinear Klein-Gordon equation with cubic nonlinearity.
Remark. Let $\mathscr{F}$ be the space of scalar fields on $\mathbb{R}^4$. The Lagrangian L is a map from $\mathscr{F}$ to the functions on $\mathbb{R}^4$ such that L(ϕ)(x) depends only on the 1-jet of ϕ at $x \in \mathbb{R}^4$, i.e., $L(\phi)(x) = \mathscr{L}(\phi(x), \partial_\mu\phi(x))$.


12.2. The energy-momentum tensor


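Since the Lagrangian function does not depend explicitly on x, we have
\[
\partial_\nu\mathscr{L} = \frac{\partial\mathscr{L}}{\partial\phi}\partial_\nu\phi + \frac{\partial\mathscr{L}}{\partial(\partial_\mu\phi)}\partial_\nu\partial_\mu\phi
= \left(\frac{\partial\mathscr{L}}{\partial\phi} - \partial_\mu\frac{\partial\mathscr{L}}{\partial(\partial_\mu\phi)}\right)\partial_\nu\phi + \partial_\mu\left(\frac{\partial\mathscr{L}}{\partial(\partial_\mu\phi)}\partial_\nu\phi\right).
\]
Thus on the solutions of the Euler-Lagrange equation (12.1) we have
\[ \partial_\nu\mathscr{L} - \partial_\mu\left(\frac{\partial\mathscr{L}}{\partial(\partial_\mu\phi)}\partial_\nu\phi\right) = 0, \]
or
\[ \partial_\mu T^\mu_\nu = 0, \tag{12.3} \]
where
\[ T^\mu_\nu = \frac{\partial\mathscr{L}}{\partial(\partial_\mu\phi)}\partial_\nu\phi - \delta^\mu_\nu\mathscr{L} \]
is the energy-momentum tensor. The tensor $T^{\mu\nu} = \eta^{\nu\lambda}T^\mu_\lambda$ satisfies the conservation law
\[ \partial_\mu T^{\mu\nu} = 0, \]
and is defined up to the addition of $\partial_\sigma\Psi^{\mu\nu\sigma}$, where $\Psi^{\mu\nu\sigma} = -\Psi^{\mu\sigma\nu}$.
For the scalar field the tensor $T^{\mu\nu} = \partial^\mu\phi\,\partial^\nu\phi - \eta^{\mu\nu}\mathscr{L}$ is symmetric and
\[ T^{00} = \frac{1}{2}\left((\partial_0\phi)^2 + (\nabla\phi)^2 + m^2\phi^2\right) + V_{\mathrm{int}}(\phi), \]
\[ T^{0k} = \partial^0\phi\,\partial^k\phi, \qquad T^{ij} = \partial^i\phi\,\partial^j\phi. \]
The conservation law for the energy-momentum vector (h, p), where $h = T^{00}$ and $p = (T^{01}, T^{02}, T^{03})$, reads
\[ \frac{\partial h}{\partial t} + \nabla\cdot p = 0. \]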
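For the electromagnetic field $\mathscr{L} = -\frac{1}{16\pi}F_{\mu\nu}F^{\mu\nu}$, and the tensor
\[ \frac{\partial\mathscr{L}}{\partial(\partial_\mu A_\sigma)}\partial^\nu A_\sigma - \eta^{\mu\nu}\mathscr{L} \]
is no longer symmetric. Adding to it
\[ -\frac{1}{4\pi}\partial_\sigma(A^\nu F^{\sigma\mu}) = -\frac{1}{4\pi}\partial_\sigma A^\nu\, F^{\sigma\mu} \]
(remember that equations of motion are used!), we get the energy-momentum tensor discussed in Sect. 11.1 of Lecture 11.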

Remark. In physics textbooks one proves (12.3) by using the invariance of the action functional under the translations $x \mapsto \tilde x = x + a$,
\[ \int_{\tilde V}\mathscr{L}(\tilde\phi, \partial_\mu\tilde\phi)\, d^4\tilde x - \int_V\mathscr{L}(\phi, \partial_\mu\phi)\, d^4x = 0, \]
where $\tilde\phi(\tilde x) = \phi(x)$, $\tilde V = V + a$ for an arbitrary domain $V \subset \mathbb{R}^4$, and expressing the resulting zero as the variation of the action with $\delta\phi = -a^\mu\partial_\mu\phi$, using Stokes' theorem and the fact that ϕ(x) satisfies the Euler-Lagrange equations.

12.3. Hamiltonian formulation


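As in classical mechanics, let
\[ \pi(x) = \frac{\partial\mathscr{L}}{\partial(\partial_0\phi(x))} = \partial_0\phi(x) \]
be the momentum canonically conjugate to the field ϕ(x), and define the Hamiltonian functional density $\mathscr{H}(\pi, \phi)$ by the Legendre transform
\[
\mathscr{H}(\pi(x), \phi(x)) = \pi^2(x) - \mathscr{L}(\phi(x), \partial_\mu\phi(x))\big|_{\partial_0\phi = \pi}
= \frac{1}{2}\left(\pi^2(x) + (\nabla\phi(x))^2 + m^2\phi^2(x)\right) + V_{\mathrm{int}}(\phi(x)).
\]
Equations of motion of the theory are Hamiltonian equations for the infinite-dimensional Hamiltonian system $(\mathscr{M}, \Omega, H)$ with the phase space $\mathscr{M} = \mathscr{S}(\mathbb{R}^3, \mathbb{R}) \times \mathscr{S}(\mathbb{R}^3, \mathbb{R})$, the symplectic form
\[ \Omega = \int_{\mathbb{R}^3}(d\pi(x) \wedge d\phi(x))\, d^3x, \]
and the Hamiltonian functional
\[ H = \int_{\mathbb{R}^3}\mathscr{H}\, d^3x. \]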

Remark. The Schwartz space $\mathscr{S}(\mathbb{R}^3)$ is a Fréchet space with the topology defined by the system of the semi-norms
\[ \|f\|_{\alpha,\beta} = \sup_{x\in\mathbb{R}^3}|x^\alpha D^\beta f(x)| \]
for all multi-indices $\alpha, \beta \in \mathbb{Z}^3_{\geq 0}$. The symplectic form Ω is a continuous skew-symmetric bilinear form on $\mathscr{M}$ defined by
\[ \Omega((\pi_1, \phi_1), (\pi_2, \phi_2)) = \int_{\mathbb{R}^3}(\pi_1(x)\phi_2(x) - \pi_2(x)\phi_1(x))\, d^3x. \]
The symplectic form Ω is (weakly) non-degenerate: $\Omega((\pi_1, \phi_1), (\pi_2, \phi_2)) = 0$ for all $(\pi_2, \phi_2) \in \mathscr{M}$ implies $(\pi_1, \phi_1) = 0$.

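Darboux coordinates on $\mathscr{M}$ are π(x), ϕ(x), $x \in \mathbb{R}^3$, and canonical Hamilton's equations
\[ \partial_0\pi(t, x) = -\frac{\delta H}{\delta\phi(x)}(\pi(t, x), \phi(t, x)), \tag{12.4} \]
\[ \partial_0\phi(t, x) = \frac{\delta H}{\delta\pi(x)}(\pi(t, x), \phi(t, x)) \tag{12.5} \]
give equation (12.2). Indeed, by calculus of variations we obtain
\[ \frac{\delta H}{\delta\pi(x)}(\pi(x), \phi(x)) = \pi(x) \]
and
\[ \frac{\delta H}{\delta\phi(x)}(\pi(x), \phi(x)) = -\Delta\phi(x) + m^2\phi(x) + V_{\mathrm{int}}'(\phi(x)), \]
so that (12.4)–(12.5) yield
\[ \partial_0^2\phi(x) = \Delta\phi(x) - m^2\phi(x) - V_{\mathrm{int}}'(\phi(x)). \]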
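Hamilton's equations $\partial_0\phi = \pi$, $\partial_0\pi = \Delta\phi - m^2\phi$ can be integrated numerically to watch the Hamiltonian stay constant. What follows is only a rough sketch under simplifying assumptions that are not in the text: one space dimension, a periodic lattice, $V_{\mathrm{int}} = 0$, and a leapfrog time stepper; all function names and parameter values are hypothetical.

```python
import numpy as np

def force(phi, dx, m):
    """Right-hand side of d(pi)/dt = Laplacian(phi) - m^2 phi on a periodic lattice."""
    lap = (np.roll(phi, -1) - 2.0 * phi + np.roll(phi, 1)) / dx**2
    return lap - m**2 * phi

def energy(phi, pi, dx, m):
    """Lattice version of H = 1/2 * integral(pi^2 + (d_x phi)^2 + m^2 phi^2)."""
    grad = (np.roll(phi, -1) - phi) / dx
    return 0.5 * np.sum(pi**2 + grad**2 + m**2 * phi**2) * dx

def evolve(phi, pi, dx, dt, m, steps):
    """Leapfrog (velocity Verlet) integration of Hamilton's equations."""
    phi, pi = phi.copy(), pi.copy()
    for _ in range(steps):
        pi += 0.5 * dt * force(phi, dx, m)   # half kick
        phi += dt * pi                       # drift: d(phi)/dt = pi
        pi += 0.5 * dt * force(phi, dx, m)   # half kick
    return phi, pi

n, length, m = 256, 20.0, 1.0
dx, dt = length / n, 0.01
x = dx * np.arange(n)
phi0 = np.exp(-(x - length / 2) ** 2)        # localized initial field
pi0 = np.zeros(n)                            # field initially at rest

e_start = energy(phi0, pi0, dx, m)
phi1, pi1 = evolve(phi0, pi0, dx, dt, m, steps=500)
e_end = energy(phi1, pi1, dx, m)
rel_drift = abs(e_end - e_start) / e_start
print(rel_drift < 1e-2)
```

The leapfrog scheme is chosen because it is symplectic, so the energy error stays bounded instead of growing steadily.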
To make these arguments rigorous, we need to define the algebra A of clas-
sical observables on M .
Definition. A functional $F : \mathscr{M} \to \mathbb{R}$ is called real-analytic if F(π, ϕ) for all $(\pi, \phi) \in \mathscr{M}$ is represented by the absolutely convergent series
\[
F(\pi, \phi) = \sum_{m,n=0}^{\infty}\frac{1}{m!\,n!}\int_{\mathbb{R}^3}\!\cdots\!\int_{\mathbb{R}^3} c_{mn}(x_1, \dots, x_m; y_1, \dots, y_n)\,
\pi(x_1)\cdots\pi(x_m)\,\phi(y_1)\cdots\phi(y_n)\, d^3x_1\cdots d^3x_m\, d^3y_1\cdots d^3y_n,
\]
where $c_{00} = c$ is a constant, and the tempered distributions
\[ c_{mn}(x_1, \dots, x_m; y_1, \dots, y_n) \in \mathscr{S}(\underbrace{\mathbb{R}^3 \times\cdots\times \mathbb{R}^3}_{m+n})' \]
are independently symmetric with respect to the variables $x_1, \dots, x_m$ and $y_1, \dots, y_n$.
Definition. The real-analytic functional F is called admissible, if the variational derivatives
\[
\frac{\delta F}{\delta\pi(x)} = \sum_{m=1}^{\infty}\sum_{n=0}^{\infty}\frac{1}{(m-1)!\,n!}\int_{\mathbb{R}^3}\!\cdots\!\int_{\mathbb{R}^3} c_{mn}(x, x_2, \dots, x_m; y_1, \dots, y_n)\,
\pi(x_2)\cdots\pi(x_m)\,\phi(y_1)\cdots\phi(y_n)\, d^3x_2\cdots d^3x_m\, d^3y_1\cdots d^3y_n
\]
and
\[
\frac{\delta F}{\delta\phi(x)} = \sum_{m=0}^{\infty}\sum_{n=1}^{\infty}\frac{1}{m!\,(n-1)!}\int_{\mathbb{R}^3}\!\cdots\!\int_{\mathbb{R}^3} c_{mn}(x_1, \dots, x_m; x, y_2, \dots, y_n)\,
\pi(x_1)\cdots\pi(x_m)\,\phi(y_2)\cdots\phi(y_n)\, d^3x_1\cdots d^3x_m\, d^3y_2\cdots d^3y_n
\]
belong to the Schwartz class $\mathscr{S}(\mathbb{R}^3)$.

Clearly the product of admissible functionals is an admissible functional.


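Remark. For every real-analytic functional $F : \mathscr{M} \to \mathbb{R}$ its differential dF at every point $(\pi, \phi) \in \mathscr{M}$ is a continuous linear map $dF : \mathscr{M} \to \mathbb{R}$, so that $dF \in \mathscr{S}(\mathbb{R}^3 \times \mathbb{R}^3)'$. A functional F is admissible if $dF \in \mathscr{S}(\mathbb{R}^3 \times \mathbb{R}^3)$, i.e. there exist Schwartz class functions, denoted by $\frac{\delta F}{\delta\pi(x)}$ and $\frac{\delta F}{\delta\phi(x)}$, such that
\[ dF(u, v) = \int_{\mathbb{R}^3}\left(\frac{\delta F}{\delta\pi(x)}u(x) + \frac{\delta F}{\delta\phi(x)}v(x)\right) d^3x \]
for all $(u, v) \in \mathscr{M}$.
Remark. The condition that F is admissible means that for all m, n ≥ 0 and $\pi_1, \dots, \pi_m, \phi_1, \dots, \phi_n \in \mathscr{S}(\mathbb{R}^3)$ the distributions
\[ c_{mn}(\pi_2 \otimes\cdots\otimes \pi_m \otimes \phi_1 \otimes\cdots\otimes \phi_n) \in \mathscr{S}(\mathbb{R}^3)' \]
and
\[ c_{mn}(\pi_1 \otimes\cdots\otimes \pi_m \otimes \phi_2 \otimes\cdots\otimes \phi_n) \in \mathscr{S}(\mathbb{R}^3)' \]
are represented by Schwartz class functions.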
Definition. The algebra A of classical observables on M is the algebra of
all admissible functionals on M .
The following result provides a rigorous foundation for the Hamiltonian me-
chanics with the infinite-dimensional phase space M .
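Lemma 12.1. The symplectic form Ω endows $\mathscr{A}$ with the Poisson algebra structure given by the Poisson bracket
\[ \{F, G\}(\pi, \phi) = \int_{\mathbb{R}^3}\left(\frac{\delta F}{\delta\pi(x)}\frac{\delta G}{\delta\phi(x)} - \frac{\delta F}{\delta\phi(x)}\frac{\delta G}{\delta\pi(x)}\right) d^3x, \tag{12.6} \]
where variational derivatives are evaluated at $(\pi, \phi) \in \mathscr{M}$.
Proof. It follows from the definition of real-analytic functionals and the above remark that $\{F, G\} \in \mathscr{A}$ for $F, G \in \mathscr{A}$. As in the case of the canonical Poisson bracket on $\mathbb{R}^{2n}$ (see Sect. 5.3 in Lecture 5) the Jacobi identity for the bracket given by (12.6) is proved by a direct computation. □
The Darboux coordinates π(x), ϕ(x), considered as evaluation functionals of (π, ϕ) at $x \in \mathbb{R}^3$, do not belong to $\mathscr{A}$. Nevertheless, we have in the distributional sense,
\[ \frac{\delta\pi(x)}{\delta\pi(y)} = \delta(x - y), \quad \frac{\delta\pi(x)}{\delta\phi(y)} = 0 \quad\text{and}\quad \frac{\delta\phi(x)}{\delta\pi(y)} = 0, \quad \frac{\delta\phi(x)}{\delta\phi(y)} = \delta(x - y), \]
and it follows from (12.6) that
\[ \{F, \pi(x)\} = -\frac{\delta F}{\delta\phi(x)} \quad\text{and}\quad \{F, \phi(x)\} = \frac{\delta F}{\delta\pi(x)}. \]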

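Since for $F \in \mathscr{A}$
\[ \partial_0 F(\pi, \phi) = \int_{\mathbb{R}^3}\left(\frac{\delta F(\pi, \phi)}{\delta\pi(x)}\partial_0\pi(t, x) + \frac{\delta F(\pi, \phi)}{\delta\phi(x)}\partial_0\phi(t, x)\right) d^3x, \]
Hamilton's equations for smooth observables
\[ \partial_0 F = \{H, F\} \]
are equivalent to canonical Hamilton's equations (12.4)–(12.5).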


Remark. In physics textbooks, Poisson structure (12.6) on A is defined by
the following Poisson brackets

(12.7) {π(x), π(y)} = {ϕ(x), ϕ(y)} = 0 and {π(x), ϕ(y)} = δ(x − y),

understood in the distributional sense.

12.4. Fourier modes for the Klein-Gordon model


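The Klein-Gordon equation
\[ (\Box + m^2)\phi(x) = 0 \tag{12.8} \]
in terms of the Fourier transform
\[ \hat\phi(k) = \frac{1}{(2\pi)^2}\int_{\mathbb{R}^4} e^{ik\cdot x}\phi(x)\, d^4x, \quad\text{where}\quad k\cdot x = k^\mu x_\mu = k^0x^0 - \mathbf{k}\mathbf{x}, \]
takes the form
\[ (k^2 - m^2)\hat\phi(k) = 0. \]
Its general solution is a distribution supported on the two-sheeted mass hyperboloid $k^2 = (k^0)^2 - \mathbf{k}^2 = m^2$, which can be written as
\[ \hat\phi(k) = \delta(k^2 - m^2)\rho(k). \]
Here
\[ \rho(k) = \theta(k^0)\rho_1(\mathbf{k}) + \theta(-k^0)\rho_2(\mathbf{k}), \]
where $\theta(k^0)$ is the Heaviside function and $\rho_1, \rho_2$ are distributions supported on $\mathbb{R}^3$. By definition of the distribution $\delta(k^2 - m^2) = \delta((k^0)^2 - \omega_{\mathbf{k}}^2)$, where $\omega_{\mathbf{k}} = \sqrt{\mathbf{k}^2 + m^2} > 0$, for a test function $u(k) \in \mathscr{S}(\mathbb{R}^4)$ we have
\[ (\theta(k^0)\rho_1(\mathbf{k})\delta(k^2 - m^2), u) = (\rho_1(\mathbf{k}), u_1), \]
\[ (\theta(-k^0)\rho_2(\mathbf{k})\delta(k^2 - m^2), u) = (\rho_2(\mathbf{k}), u_2), \]
where
\[ u_1(\mathbf{k}) = \frac{u(\omega_{\mathbf{k}}, \mathbf{k})}{2\omega_{\mathbf{k}}}, \qquad u_2(\mathbf{k}) = \frac{u(-\omega_{\mathbf{k}}, \mathbf{k})}{2\omega_{\mathbf{k}}} \in \mathscr{S}(\mathbb{R}^3). \]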

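Whence
\[ \hat\phi(k) = \frac{1}{2\omega_{\mathbf{k}}}\rho_1(\mathbf{k})\delta(k^0 - \omega_{\mathbf{k}}) + \frac{1}{2\omega_{\mathbf{k}}}\rho_2(\mathbf{k})\delta(k^0 + \omega_{\mathbf{k}}), \]
where the reality condition $\overline{\rho(k)} = \rho(-k)$ gives $\rho_2(\mathbf{k}) = \overline{\rho_1(-\mathbf{k})}$.
Substituting this $\hat\phi(k)$ into the inverse Fourier transform
\[ \phi(x) = \frac{1}{(2\pi)^2}\int_{\mathbb{R}^4} e^{-ik\cdot x}\hat\phi(k)\, d^4k, \]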

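introducing a(k) = 2πρ1(k), $\bar a(\mathbf{k}) = \overline{a(\mathbf{k})}$ and changing in the second integral k by −k we obtain
\[ \phi(x) = \frac{1}{(2\pi)^{3/2}}\int_{\mathbb{R}^3}\left(a(\mathbf{k})e^{-ik\cdot x} + \bar a(\mathbf{k})e^{ik\cdot x}\right)\frac{d^3k}{2\omega_{\mathbf{k}}}, \quad\text{where}\quad k^0 = \omega_{\mathbf{k}}. \tag{12.9} \]
From this general distributional solution we can obtain a solution of the Cauchy problem for the Klein-Gordon equation, which consists in finding a solution ϕ(x) of (12.8) satisfying
\[ \phi(0, \mathbf{x}) = \phi(\mathbf{x}) \quad\text{and}\quad \partial_0\phi(0, \mathbf{x}) = \pi(\mathbf{x}). \]
Namely, from
\[ \phi(\mathbf{x}) = \frac{1}{(2\pi)^{3/2}}\int_{\mathbb{R}^3}\hat\phi(\mathbf{k})e^{i\mathbf{k}\mathbf{x}}\, d^3k = \frac{1}{(2\pi)^{3/2}}\int_{\mathbb{R}^3}\left(a(\mathbf{k})e^{i\mathbf{k}\mathbf{x}} + \bar a(\mathbf{k})e^{-i\mathbf{k}\mathbf{x}}\right)\frac{d^3k}{2\omega_{\mathbf{k}}}, \]
\[ \pi(\mathbf{x}) = \frac{1}{(2\pi)^{3/2}}\int_{\mathbb{R}^3}\hat\pi(\mathbf{k})e^{i\mathbf{k}\mathbf{x}}\, d^3k = \frac{-i}{(2\pi)^{3/2}}\int_{\mathbb{R}^3}\omega_{\mathbf{k}}\left(a(\mathbf{k})e^{i\mathbf{k}\mathbf{x}} - \bar a(\mathbf{k})e^{-i\mathbf{k}\mathbf{x}}\right)\frac{d^3k}{2\omega_{\mathbf{k}}} \]
we get
\[ a(\mathbf{k}) = \omega_{\mathbf{k}}\hat\phi(\mathbf{k}) + i\hat\pi(\mathbf{k}) \in \mathscr{S}(\mathbb{R}^3), \]
so that (12.9) gives the classical solution of the Cauchy problem.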
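It follows from Poisson brackets (12.7) that in the distributional sense
\[ \{\hat\pi(\mathbf{k}), \hat\pi(\mathbf{l})\} = \{\hat\phi(\mathbf{k}), \hat\phi(\mathbf{l})\} = 0 \]
and
\[
\{\hat\pi(\mathbf{k}), \hat\phi(\mathbf{l})\} = \frac{1}{(2\pi)^3}\int_{\mathbb{R}^3}\!\int_{\mathbb{R}^3}\{\pi(\mathbf{x}), \phi(\mathbf{y})\}e^{-i(\mathbf{k}\mathbf{x} + \mathbf{l}\mathbf{y})}\, d^3x\, d^3y
= \frac{1}{(2\pi)^3}\int_{\mathbb{R}^3} e^{-i(\mathbf{k} + \mathbf{l})\mathbf{x}}\, d^3x = \delta(\mathbf{k} + \mathbf{l}),
\]
\[
\{\hat\pi(\mathbf{k}), \overline{\hat\phi(\mathbf{l})}\} = \frac{1}{(2\pi)^3}\int_{\mathbb{R}^3}\!\int_{\mathbb{R}^3}\{\pi(\mathbf{x}), \phi(\mathbf{y})\}e^{-i(\mathbf{k}\mathbf{x} - \mathbf{l}\mathbf{y})}\, d^3x\, d^3y
= \frac{1}{(2\pi)^3}\int_{\mathbb{R}^3} e^{-i(\mathbf{k} - \mathbf{l})\mathbf{x}}\, d^3x = \delta(\mathbf{k} - \mathbf{l}).
\]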

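Thus we obtain
\[ \{a(\mathbf{k}), a(\mathbf{l})\} = \{\bar a(\mathbf{k}), \bar a(\mathbf{l})\} = 0 \quad\text{and}\quad \{a(\mathbf{k}), \bar a(\mathbf{l})\} = 2i\omega_{\mathbf{k}}\delta(\mathbf{k} - \mathbf{l}). \tag{12.10} \]
Now it follows from Plancherel's theorem that
\[
H = \frac{1}{2}\int_{\mathbb{R}^3}\left(\pi^2(\mathbf{x}) + (\nabla\phi)^2(\mathbf{x}) + m^2\phi^2(\mathbf{x})\right) d^3x
= \frac{1}{2}\int_{\mathbb{R}^3}\left(|\hat\pi(\mathbf{k})|^2 + \omega_{\mathbf{k}}^2|\hat\phi(\mathbf{k})|^2\right) d^3k
= \int_{\mathbb{R}^3}\omega_{\mathbf{k}}\,\bar a(\mathbf{k})a(\mathbf{k})\frac{d^3k}{2\omega_{\mathbf{k}}}.
\]
A similar computation gives for the total momentum
\[
P = -\int_{\mathbb{R}^3}\pi(\mathbf{x})(\nabla\phi)(\mathbf{x})\, d^3x
= i\int_{\mathbb{R}^3}\hat\pi(\mathbf{k})\hat\phi(-\mathbf{k})\,\mathbf{k}\, d^3k
= \int_{\mathbb{R}^3}\bar a(\mathbf{k})a(\mathbf{k})\,\mathbf{k}\frac{d^3k}{2\omega_{\mathbf{k}}}.
\]
Thus we see that in terms of Fourier modes Hamilton's equations (12.4)–(12.5) decouple,
\[ \dot a(\mathbf{k}) = \{H, a(\mathbf{k})\} = -i\omega_{\mathbf{k}}a(\mathbf{k}), \qquad \dot{\bar a}(\mathbf{k}) = \{H, \bar a(\mathbf{k})\} = i\omega_{\mathbf{k}}\bar a(\mathbf{k}), \]
and in accordance with (12.9)
\[ a(t, \mathbf{k}) = e^{-i\omega_{\mathbf{k}}t}a(\mathbf{k}), \qquad \bar a(t, \mathbf{k}) = e^{i\omega_{\mathbf{k}}t}\bar a(\mathbf{k}). \]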

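The real coordinates in the Fourier space
\[ P(\mathbf{k}) = \frac{a(\mathbf{k}) + \bar a(\mathbf{k})}{2}, \qquad Q(\mathbf{k}) = \frac{i(a(\mathbf{k}) - \bar a(\mathbf{k}))}{2\omega_{\mathbf{k}}} \]
are Darboux coordinates for the symplectic form Ω,
\[ \Omega = \int_{\mathbb{R}^3}(dP(\mathbf{k}) \wedge dQ(\mathbf{k}))\, d^3k, \]
and the Hamiltonian of the Klein-Gordon model takes the form
\[ H = \frac{1}{2}\int_{\mathbb{R}^3}\left(P^2(\mathbf{k}) + \omega_{\mathbf{k}}^2Q^2(\mathbf{k})\right) d^3k. \]
Thus in terms of Fourier modes the classical Klein-Gordon field is a collection of infinitely many non-interacting harmonic oscillators, parametrized by $\mathbf{k} \in \mathbb{R}^3$, with the frequencies $\omega_{\mathbf{k}} = \sqrt{\mathbf{k}^2 + m^2}$.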
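The oscillator picture can be checked for a single mode: with $P = (a + \bar a)/2$ and $Q = i(a - \bar a)/(2\omega)$, the rotation $a \mapsto e^{-i\omega t}a$ should coincide with the flow $\dot Q = P$, $\dot P = -\omega^2 Q$ of $H = \frac12(P^2 + \omega^2Q^2)$. The sketch below uses arbitrary sample values for ω, a and t, and a hypothetical helper `to_PQ`.

```python
import numpy as np

def to_PQ(a, omega):
    """Real coordinates P = (a + conj a)/2 and Q = i (a - conj a)/(2 omega)."""
    P = (a + np.conj(a)) / 2
    Q = 1j * (a - np.conj(a)) / (2 * omega)
    return P.real, Q.real

omega = 1.7          # hypothetical frequency of the mode
a = 0.8 - 0.3j       # hypothetical amplitude a(k) at t = 0
t = 0.9

P0, Q0 = to_PQ(a, omega)
Pt, Qt = to_PQ(np.exp(-1j * omega * t) * a, omega)

# Exact harmonic-oscillator flow with dQ/dt = P and dP/dt = -omega^2 Q.
P_exact = P0 * np.cos(omega * t) - omega * Q0 * np.sin(omega * t)
Q_exact = Q0 * np.cos(omega * t) + P0 / omega * np.sin(omega * t)

h_mode = 0.5 * (P0**2 + omega**2 * Q0**2)   # oscillator energy of the mode
print(np.isclose(Pt, P_exact), np.isclose(Qt, Q_exact))
print(np.isclose(h_mode, 0.5 * abs(a) ** 2))
```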
LECTURE 13

Hamiltonian formalism. Gauge theories.

13.1. Classical electrodynamics


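Here we continue with c = 1 and use the Lagrangian function
\[ \mathscr{L}(A) = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} = \frac{1}{2}(E^2 - B^2), \]
where
\[ F_{\mu\nu} = \frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu} \]
(thus absorbing the factor 1/4π in (8.16) in Lecture 8). One can also rewrite the Lagrangian function in the first order formalism (see Sect. 7.1 in Lecture 7),
\[ \mathscr{L} = -\frac{1}{2}\left(\partial_\mu A_\nu - \partial_\nu A_\mu - \frac{1}{2}F_{\mu\nu}\right)F^{\mu\nu}, \tag{13.1} \]
where $A_\mu$ and $F_{\mu\nu}$ are considered to be independent. Indeed, the corresponding Euler-Lagrange equations for $\mathscr{L}$ are Maxwell equations in free and empty space
\[ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \quad\text{and}\quad \partial_\mu F^{\mu\nu} = 0. \]
Plugging the formula for $F_{\mu\nu}$ back in (13.1), we obtain the Lagrangian function $\mathscr{L}(A)$. Using $A_\mu = (A_0, A_1, A_2, A_3) = (A_0, -A)$, formula (8.8) and equations $F_{ij} = \partial_iA_j - \partial_jA_i$, we can rewrite (13.1), up to a total divergence term $-\nabla\cdot(A_0E)$, as
\[ \mathscr{L} = -E\cdot\partial_0A - \frac{1}{2}(E^2 + B^2) + A_0\nabla\cdot E, \quad\text{where}\quad B = \nabla \times A. \tag{13.2} \]
Thus for the electromagnetic field Lagrangian we obtain¹
\[ L = \int_{\mathbb{R}^3}\left(-E(x)\cdot\partial_0A(x) - \tfrac{1}{2}(E^2(x) + B^2(x)) + A_0(x)\nabla\cdot E(x)\right) d^3x. \tag{13.3} \]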

Comparison with formula (7.7) in Lecture 7 shows that the Lagrangian of classical electrodynamics is singular. Namely, it follows from the first term in (13.3) that the phase space of the theory is the following infinite-dimensional real vector space²
\[ \mathscr{M} = \mathscr{S}(\mathbb{R}^3, \mathbb{R}^3) \times \mathscr{S}(\mathbb{R}^3, \mathbb{R}^3) \]
¹By Stokes' theorem, the contribution of the total divergence term is zero.
²Here $\mathscr{S}(\mathbb{R}^3, \mathbb{R}^3)$ stands for the $\mathbb{R}^3$-valued Schwartz functions on $\mathbb{R}^3$.


with the symplectic form Ω
\[ \Omega = \int_{\mathbb{R}^3}(dE_i(x) \wedge dA_i(x))\, d^3x, \tag{13.4} \]
so that the pairs $(E_i(x), A_i(x))$ are Darboux coordinates on $\mathscr{M}$ with the canonical Poisson brackets
\[ \{E_i(x), A_j(y)\} = \delta_{ij}\delta(x - y), \qquad i, j = 1, 2, 3. \tag{13.5} \]
The second term in (13.3) is the Hamiltonian of the electromagnetic field,
\[ H = \int_{\mathbb{R}^3}\mathscr{H}(x)\, d^3x = \frac{1}{2}\int_{\mathbb{R}^3}\left(E^2 + B^2\right) d^3x, \qquad B = \nabla \times A. \tag{13.6} \]
Comparing the last term in (13.3) with the corresponding term in (7.7) we conclude that components $A_0(x)$ of the gauge field are the Lagrange multipliers, and the constraints C(x) are given by the Gauss law,
\[ C(x) = \nabla\cdot E(x) = 0, \qquad x \in \mathbb{R}^3. \]

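It is instructive to analyze Hamilton's equations for the Hamiltonian system $(\mathscr{M}, \Omega, H)$. We have, using (13.5),
\[ \dot A_i(x) = \{H, A_i(x)\} = E_i(x), \tag{13.7} \]
and since $A = -(A_1, A_2, A_3)$, it gives
\[ E = -\frac{\partial A}{\partial t}, \]
which implies the Faraday law
\[ \nabla \times E = -\frac{\partial B}{\partial t}. \]
Since the Gauss law for the magnetic field follows from the definition of B = ∇ × A, we get the first pair of Maxwell equations. Using $B_j = (\nabla \times A)_j = -\varepsilon_{jkl}\partial_kA_l$ (note the negative sign!), we obtain
\[
\dot E_i(x) = \{H, E_i(x)\} = \int_{\mathbb{R}^3} B_j(y)\{(\nabla \times A)_j(y), E_i(x)\}\, d^3y
= -\varepsilon_{jkl}\int_{\mathbb{R}^3} B_j(y)\{\partial_kA_l(y), E_i(x)\}\, d^3y
= \varepsilon_{jki}\int_{\mathbb{R}^3} B_j(y)\frac{\partial}{\partial y^k}\delta(x - y)\, d^3y = \varepsilon_{ikj}\partial_kB_j(x),
\]
which gives the Ampère-Maxwell law
\[ \frac{\partial E}{\partial t} = \nabla \times B. \]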

However, the remaining equation in the second pair of Maxwell equations


— the Gauss law for the electric field ∇ · E = 0 — is missing from Hamilton’s
equations and appears as constraints C(x) = 0. This is a manifestation of the
fact that Maxwell equations are described by a singular Lagrangian, and for
their Hamiltonian formulation one needs to reduce the phase space M .
It is easy to see that constraints C(x) are the first class constraints (see Sect.
7.3 in Lecture 7). Indeed, it follows from (13.5) that

{C(x), C(y)} = 0 x, y ∈ R3

and also

{H, C(x)} = {H, ∇ · E(x)} = ∇ · (∇ × B)(x) = 0, x ∈ R3 .

According to Sect. 7.3 in Lecture 7, to determine the reduced phase space we need to introduce additional constraints D(x) = 0 such that the integral operator with the kernel {C(x), D(y)} is non-degenerate in $L^2(\mathbb{R}^3)$. A convenient choice is
\[ D(x) = -\nabla\cdot A(x), \]
which forces the Coulomb gauge! Indeed, it follows from (13.5) that
\[ \{C(x), D(y)\} = \frac{\partial^2}{\partial x^i\partial y^i}\delta(x - y), \]
which is the integral kernel of the operator −∆, where ∆ is the Laplace operator of the Euclidean metric on $\mathbb{R}^3$. Thus the reduced phase space $\mathscr{M}_0$ of classical electrodynamics is a linear subspace in $\mathscr{M}$ defined by
\[ \mathscr{M}_0 = \left\{(E(x), A(x)) \in \mathscr{M} : C(x) = D(x) = 0,\ x \in \mathbb{R}^3\right\}. \]
Since
\[ \{D(x), D(y)\} = 0, \]
Darboux coordinates for the symplectic form $\Omega_0 = \Omega|_{\mathscr{M}_0}$ can be found by the general procedure described in Sect. 7.3 in Lecture 7.
Using Corollary 7.1 in Lecture 7, the Poisson bracket $\{\ ,\ \}_0$ on $\mathscr{M}_0$, associated with the symplectic form $\Omega_0$, can be written as a restriction of the Dirac bracket on $\mathscr{M}$, associated with the second class constraints (C(x), D(x)). Namely, it follows from (7.20) in Lecture 7 that
\[
\{F, G\}_{DB} = \{F, G\} + \int_{\mathbb{R}^3}\!\int_{\mathbb{R}^3}\Big(\{F, C(x)\}G(y - x)\{D(y), G\} - \{F, D(x)\}G(x - y)\{C(y), G\}\Big)\, d^3x\, d^3y, \tag{13.8}
\]
where G(x − y) is a distribution satisfying
\[ \int_{\mathbb{R}^3} G(x - z)\{C(z), D(y)\}\, d^3z = \delta(x - y), \]

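or
\[ G(x) = \frac{1}{(2\pi)^3}\int_{\mathbb{R}^3}\frac{e^{ikx}}{k^2}\, d^3k. \]
Using (13.5) we readily compute that
\[ \{E_i(x), E_j(y)\}_{DB} = \{A_i(x), A_j(y)\}_{DB} = 0 \tag{13.9} \]
and
\[ \{E_i(x), A_j(y)\}_{DB} = \delta^\perp_{ij}(x - y), \qquad x, y \in \mathbb{R}^3, \tag{13.10} \]
where the distribution $\delta^\perp_{ij}(x)$ is the transverse δ-function,
\[ \delta^\perp_{ij}(x) = \frac{1}{(2\pi)^3}\int_{\mathbb{R}^3}\left(\delta_{ij} - \frac{k_ik_j}{k^2}\right)e^{ikx}\, d^3k, \qquad i, j = 1, 2, 3. \tag{13.11} \]
It satisfies
\[ \partial_i\delta^\perp_{ij}(x) = 0, \qquad j = 1, 2, 3. \]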
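In Fourier space the transverse δ-function is represented by the matrix $\delta_{ij} - k_ik_j/k^2$, and its defining properties reduce to linear algebra. The sketch below (with a hypothetical helper `transverse_projector`) checks that this matrix annihilates k, is a projector, and acts as the identity on vectors orthogonal to k, which is the Fourier-space counterpart of divergence-free fields.

```python
import numpy as np

def transverse_projector(k):
    """Fourier symbol delta_ij - k_i k_j / k^2 of the transverse delta-function."""
    k = np.asarray(k, dtype=float)
    return np.eye(3) - np.outer(k, k) / (k @ k)

rng = np.random.default_rng(5)
k = rng.normal(size=3)
P = transverse_projector(k)

f = rng.normal(size=3)
f_perp = f - k * (k @ f) / (k @ k)   # component of f orthogonal to k

print(np.allclose(k @ P, 0.0))       # k_i P_ij = 0, i.e. zero divergence
print(np.allclose(P @ P, P))         # P is a projector
print(np.allclose(P @ f_perp, f_perp))
```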
Thus the Dirac bracket (13.8) yields a 'transverse' Poisson structure $\{\ ,\ \}_{DB}$ on $\mathscr{M}$, determined by (13.9)–(13.10). It is degenerate and its center is generated by C(x) and D(x) for $x \in \mathbb{R}^3$. The Dirac bracket $\{\ ,\ \}_{DB}$ restricts to $\mathscr{M}_0$ and yields a non-degenerate Poisson bracket $\{\ ,\ \}_0$ associated with the symplectic form $\Omega_0$. Since
\[ \int_{\mathbb{R}^3}\delta^\perp_{ij}(x - y)f_j(y)\, d^3y = f_i(x) \]
for any $f(x) \in \mathscr{S}(\mathbb{R}^3, \mathbb{R}^3)$ satisfying ∇ · f(x) = 0, it immediately follows from previous computations that Hamilton's equations on $\mathscr{M}_0$
\[ \dot E(x) = \{H, E(x)\}_0, \qquad \dot A(x) = \{H, A(x)\}_0, \]
yield
\[ \frac{\partial E}{\partial t} = \nabla \times B, \quad\text{where}\quad B = \nabla \times A \quad\text{and}\quad \frac{\partial A}{\partial t} = -E. \]
Together with the Gauss law, they give the full set of Maxwell equations in the Coulomb gauge.
In terms of the normal modes P(k) and Q(k) (see Sect. 11.4 in Lecture 11), satisfying
\[ k\cdot P(k) = k\cdot Q(k) = 0, \]
the Poisson structure $\{\ ,\ \}_0$ is given by the transverse Poisson brackets
\[ \{P_i(k), Q_j(l)\}_0 = \left(\delta_{ij} - \frac{k_il_j}{k\cdot l}\right)\delta(k - l). \]
This finishes the Hamiltonian formulation of Maxwell's equations.



13.2. Yang-Mills equations


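Let G be a semi-simple compact Lie group, $\mathfrak{g}$ be its Lie algebra with generators $X_a$ satisfying
\[ \mathrm{tr}(X_aX_b) = -2\delta_{ab}, \qquad a, b = 1, \dots, n. \]
Let $A = A_\mu dx^\mu$ be a connection on a trivial bundle $\mathbb{R}^4 \times \mathrm{ad}\,\mathfrak{g}$ over $\mathbb{R}^4$, $A_\mu = A^a_\mu X_a$, and let
\[ F = \frac{1}{2}F_{\mu\nu}\, dx^\mu \wedge dx^\nu, \quad\text{where}\quad F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu], \]
be its curvature. Consider the Yang-Mills Lagrangian function (see Lecture 10),
\[ \mathscr{L}(A) = \frac{1}{8}\mathrm{tr}\, F_{\mu\nu}F^{\mu\nu} = -\frac{1}{4}F^a_{\mu\nu}(F^a)^{\mu\nu}, \]
where $F_{\mu\nu} = F^a_{\mu\nu}X_a$, and we put g = 1 in formula (10.10). As in the case of classical electrodynamics, it can be written in the first order formalism
\[ \mathscr{L} = \frac{1}{4}\mathrm{tr}\left(\partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu] - \frac{1}{2}F_{\mu\nu}\right)F^{\mu\nu}. \tag{13.12} \]
Put
\[ E_i = F_{0i} \quad\text{and}\quad B_i = \varepsilon_{ijk}F^{jk}, \qquad i = 1, 2, 3. \]
Using equations $F_{ij} = \partial_iA_j - \partial_jA_i + [A_i, A_j]$ and the cyclic property of the trace, we can rewrite (13.12) as follows
\[
\mathscr{L} = -\frac{1}{2}\mathrm{tr}\left(\partial_0A_k - \partial_kA_0 + [A_0, A_k] - \frac{1}{2}E_k\right)E_k + \frac{1}{8}\mathrm{tr}\, F_{ij}F^{ij}
= -\frac{1}{2}\mathrm{tr}\left(E_k\partial_0A_k - \frac{1}{2}(E_k^2 + B_k^2) + A_0(\partial_kE_k + [A_k, E_k])\right) + \frac{1}{2}\partial_k(\mathrm{tr}\, A_0E_k).
\]
Thus up to a total divergence,
\[ \mathscr{L} = -\frac{1}{2}\mathrm{tr}\left(E_k\partial_0A_k - \frac{1}{2}(E_k^2 + B_k^2) + A_0C\right), \quad\text{where}\quad C = \partial_kE_k + [A_k, E_k], \]
or
\[ \mathscr{L} = E^a_k\partial_0A^a_k - \frac{1}{2}\left((E^a_k)^2 + (B^a_k)^2\right) + A^a_0C^a, \tag{13.13} \]
where $E_k = E^a_kX_a$, $B_k = B^a_kX_a$ and $C = C^aX_a$. Thus for the Yang-Mills Lagrangian we obtain
\[ L_{YM} = \int_{\mathbb{R}^3}\left(E^a_k(x)\partial_0A^a_k(x) - \tfrac{1}{2}\left((E^a_k)^2(x) + (B^a_k)^2(x)\right) + A^a_0(x)C^a(x)\right) d^3x. \tag{13.14} \]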

As formula (13.3) in the case of classical electrodynamics, formula (13.14) shows that the Yang-Mills Lagrangian is singular. Namely, it follows from the first term in (13.14) that the phase space of the theory is the following infinite-dimensional real vector space
\[ \mathscr{M} = \mathscr{S}(\mathbb{R}^3, \mathbb{R}^{3n}) \times \mathscr{S}(\mathbb{R}^3, \mathbb{R}^{3n}) \]
with the symplectic form Ω
\[ \Omega = \int_{\mathbb{R}^3}(dE^a_k(x) \wedge dA^a_k(x))\, d^3x, \tag{13.15} \]
so that the pairs $(E^a_k(x), A^a_k(x))$ are Darboux coordinates on $\mathscr{M}$ with the canonical Poisson brackets
\[ \{E^a_k(x), A^b_l(y)\} = \delta_{kl}\delta^{ab}\delta(x - y), \qquad k, l = 1, 2, 3 \ \text{and}\ a, b = 1, \dots, n. \tag{13.16} \]
The second term in (13.14) is the Hamiltonian of the Yang-Mills field,
\[ H = \int_{\mathbb{R}^3}\mathscr{H}(x)\, d^3x = \frac{1}{2}\int_{\mathbb{R}^3}\left((E^a_k)^2(x) + (B^a_k)^2(x)\right) d^3x. \tag{13.17} \]
Comparing the last term in (13.14) with the corresponding term in (7.7) we conclude that components $A^a_0(x)$ of the gauge field are the Lagrange multipliers, and the constraints $C^a(x)$ are given by the nonabelian Gauss law,
\[ C^a(x) = \partial_kE^a_k(x) + t^{abc}A^b_k(x)E^c_k(x) = 0, \qquad x \in \mathbb{R}^3. \]
As in the case of classical electrodynamics, these are the first class constraints, and we verify it by the following computation. Namely, it directly follows from (13.16) that
\[ \{E^a_k(x), C^b(y)\} = -t^{ab}{}_c\, E^c_k(x)\delta(x - y) \]
and
\[
\{A^a_k(x), C^b(y)\} = \{A^a_k(x), \partial_lE^b_l(y) + t^{bcd}A^c_l(y)E^d_l(y)\}
= -\delta^{ab}\frac{\partial}{\partial y_k}\delta(x - y) - t^{bca}A^c_k(x)\delta(x - y)
= -\left(\delta^{ab}\frac{\partial}{\partial y_k} + t^{ab}{}_c\, A^c_k(y)\right)\delta(x - y).
\]
Introducing
\[ C(f) = -\frac{1}{2}\int_{\mathbb{R}^3}\mathrm{tr}\, C(x)f(x)\, d^3x = \int_{\mathbb{R}^3} C^a(x)f^a(x)\, d^3x \]
for a test function $f : \mathbb{R}^3 \to \mathrm{ad}\,\mathfrak{g}$, where $f(x) = f^a(x)X_a$ and $f^a(x) \in \mathscr{S}(\mathbb{R}^3, \mathbb{R})$, we can succinctly rewrite these formulas as
\[ \{E_k(x), C(f)\} \overset{\mathrm{def}}{=} \{E^a_k(x), C(f)\}X_a = [E_k(x), f(x)], \tag{13.18} \]
\[ \{A_k(x), C(f)\} \overset{\mathrm{def}}{=} \{A^a_k(x), C(f)\}X_a = (\nabla_kf)(x). \tag{13.19} \]

From here we obtain

{∂k Ek (x), C(g)} = [∂k Ek (x), g(x)] + [Ek (x), ∂k g(x)]

and

{[Ak (x), Ek (x)], C(g)} = Ak (x){Ek (x), C(g)} + {Ak (x), C(g)}Ek (x)
− Ek (x){Ak (x), C(g)} − {Ek (x), C(g)}Ak (x)
= [Ak (x), [Ek (x), g(x)]] + [(∇k g)(x), Ek (x)].

Whence

{C(x), C(g)} = [∂k Ek (x), g(x)] + [Ak (x), [Ek (x), g(x)]] + [[Ak (x), g(x)], Ek (x)]
= [∂k Ek (x), g(x)] + [[Ak (x), Ek (x)], g(x)]
= [C(x), g(x)]

and
\[
\{C(f), C(g)\} = -\frac{1}{2}\int_{\mathbb{R}^3}\mathrm{tr}\left([C(x), g(x)]f(x)\right) d^3x
= \frac{1}{2}\int_{\mathbb{R}^3}\mathrm{tr}\left(C(x)[f(x), g(x)]\right) d^3x.
\]
Therefore, we finally obtain
\[ \{C(f), C(g)\} = -C([f, g]). \tag{13.20} \]
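The last step above uses only the cyclic property of the trace, $-\tfrac12\mathrm{tr}([C, g]f) = \tfrac12\mathrm{tr}(C[f, g])$, together with the normalization $\mathrm{tr}(X_aX_b) = -2\delta_{ab}$. A pointwise numerical sketch follows; it assumes G = SU(2) with generators $X_a = -\sqrt{-1}\,\sigma_a$ built from the Pauli matrices, which is one possible choice satisfying that normalization, and the helper names are hypothetical.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
X = [-1j * s for s in sigma]   # assumed generators, tr(X_a X_b) = -2 delta_ab

def comm(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a

def random_element(rng):
    """Random real linear combination of the generators and its coefficients."""
    coeffs = rng.normal(size=3)
    return sum(c * Xa for c, Xa in zip(coeffs, X)), coeffs

gram = np.array([[np.trace(Xa @ Xb) for Xb in X] for Xa in X])

rng = np.random.default_rng(6)
(C, C_coeffs), (f, f_coeffs), (g, _) = (random_element(rng) for _ in range(3))

lhs = -0.5 * np.trace(comm(C, g) @ f)
rhs = 0.5 * np.trace(C @ comm(f, g))
pairing = -0.5 * np.trace(C @ f)     # should equal the sum of C^a f^a

print(np.allclose(gram, -2.0 * np.eye(3)))
print(np.isclose(lhs, rhs))
print(np.isclose(pairing, C_coeffs @ f_coeffs))
```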

Equivalently, (13.20) can be written as
\[ \{C^a(x), C^b(y)\} = -t^{ab}{}_c\, C^c(x)\delta(x - y). \]
Remark. One can also obtain formulas (13.18)–(13.20) using the representation
\[ C(f) = -\frac{1}{2}\int_{\mathbb{R}^3}\mathrm{tr}\left(E_k(x)(\nabla_kf)(x)\right) d^3x. \]
To compute $\{H, C^a(x)\}$ we observe that it follows from (13.18) that
\[ \{E_k^2(x), C(f)\} = [E_k^2(x), f(x)]. \]
Using (13.19), we also obtain
\[
\begin{aligned}
\{F_{ij}(x), C(f)\} &= \{\partial_iA_j(x) - \partial_jA_i(x) + [A_i(x), A_j(x)], C(f)\} \\
&= \partial_i(\nabla_jf)(x) - \partial_j(\nabla_if)(x) + A_i(x)(\nabla_jf)(x) + (\nabla_if)(x)A_j(x) - (\nabla_jf)(x)A_i(x) - A_j(x)(\nabla_if)(x) \\
&= [\partial_iA_j(x) - \partial_jA_i(x), f(x)] + [A_i(x), [A_j(x), f(x)]] + [[A_i(x), f(x)], A_j(x)] \\
&= [F_{ij}(x), f(x)],
\end{aligned}
\]

so that
\[ \{B_k^2(x), C(f)\} = [B_k^2(x), f(x)]. \]
Whence
\[ \{E_k^2(x) + B_k^2(x), C(f)\} = [E_k^2(x) + B_k^2(x), f(x)] \]
and for $\mathscr{H}(x) = -\frac{1}{4}\mathrm{tr}\left(E_k^2(x) + B_k^2(x)\right)$ we obtain
\[ \{\mathscr{H}(x), C(f)\} = 0. \]
Therefore
\[ \{H, C(f)\} = 0 \]
or equivalently,
\[ \{H, C^a(x)\} = 0. \]
This finishes the proof that Yang-Mills theory is a Hamiltonian theory with first class constraints. As in the U(1)-case, for additional constraints one can use the non-abelian Coulomb gauge
\[ D(x) = \partial_kA_k(x) = 0. \]
Putting $D(x) = D^a(x)X_a$, we readily compute
\[ \{C^a(x), D^b(y)\} = \delta^{ab}\frac{\partial^2}{\partial x^k\partial y^k}\delta(x - y) + t^{ab}{}_c\, A^c_k(x)\frac{\partial}{\partial y^k}\delta(x - y). \tag{13.21} \]
Thus $M^{ab}(x, y) = \{C^a(x), D^b(y)\}$ is an integral kernel of the differential operator
\[ M = -\Delta + \mathrm{ad}\, A_k(x)\,\partial_k, \tag{13.22} \]
acting on square summable $\mathrm{ad}\,\mathfrak{g}$-valued functions on $\mathbb{R}^3$. As in the U(1)-case, this operator is formally invertible, at least for small $A_k(x)$, which allows one to define the reduced phase space of the theory.
Problem 13.1. The abelian group C ∞ (R3 , R) of gauge transformations acts on
the phase space M by f · (E, A) = (E, A + ∇f ). Prove that this action is Poisson
and find the corresponding moment map (see Problems 6.4 and 7.3). Show that the
reduced phase space for the regular value 0 is M0 and the corresponding symplectic
structure is given by transverse Poisson brackets (13.10).
Problem 13.2. The nonabelian group C ∞ (R3 , G) of gauge transformations acts
on the phase space M by g · (Ek , Ak ) = (gEk g −1 , gAk g −1 − ∂k gg −1 ). Prove that this
action is Poisson and find the corresponding moment map.
Notes and references

Classic text
• L.D. Landau and E.M. Lifshitz, The Classical Theory of Fields, 4th
edition, Butterworth-Heinemann, 1980
is the basic reference for classical electrodynamics, special relativity and theory
of gravity. For more detailed exposition, including applications, see
• D.J. Griffiths, Introduction to Electrodynamics, Prentice-Hall, NJ, 1999
• J.D. Jackson, Classical Electrodynamics, 3rd edition, Wiley, 1998.
The book
• B.A. Dubrovin, A.T. Fomenko and S.P. Novikov, Modern Geometry
— Methods and Applications: Part II: The Geometry and Topology of
Manifolds, 2nd edition, Springer, 1991
is a good introduction to theory of connections on principal and vector bundles,
see also
• T. Frankel, The Geometry of Physics: an Introduction, Cambridge
University Press, 1999.
For concise exposition for vector bundles, see the books
• P. Griffiths and J. Harris, Principles of algebraic geometry, Wiley-
Interscience, 1994.
• R.O. Wells, Differential Analysis on Complex Manifolds, Springer-
Verlag New York, 2008.
For global existence and uniqueness theorems for the Yang-Mills equations
on Minkowski spacetime see the papers
• D.M. Eardley and V. Moncrief, The global existence of Yang-Mills-
Higgs fields in 4-dimensional Minkowski space I. Local existence and
smoothness properties Comm. Math. Phys., 83 (1982), 171–191; II.
Completion of proof, ibid. 193–212
• M.V. Goganov and L.V. Kapitanskiĭ, Global solvability of the Cauchy
problem for Yang-Mills-Higgs equations, J. Sov. Math. 37 (1987),
802–822


and
• P.T. Chruściel and J. Shatah, Global existence of solutions of the Yang-
Mills equations on globally hyperbolic Lorentzian manifolds, Asian J.
Math. 1:3 (1997), 530–548

for the general case.


For application of self-duality equations to the geometry of 4-manifolds, see
the monograph
• S.K. Donaldson and P.B. Kronheimer, The geometry of four-manifolds,
The Clarendon Press, Oxford University Press, New York, 1990.

The Hitchin equations were introduced in


• N. J. Hitchin, The self-duality equations over a Riemann surface, Proc.
London Math. Soc. (3) 55 (1987) 59–126.
Our exposition of the Hamiltonian formalism for the Yang-Mills theory is
based on
• L.D. Faddeev, The Feynman integral for singular Lagrangians, Theo-
ret. and Math. Phys., 1:1 (1969), 1–13
• L.D. Faddeev and A.A. Slavnov, Gauge Fields. Introduction to Quan-
tum Theory, 2nd edition, Addison-Wesley, 1991.

The elegant proof in Lecture 13, that for the Yang-Mills theory C a (x) are the
first class constraints, is based on Faddeev’s lectures on Feynman path integral
and gauge fields (unpublished, 1974).
Note that for large Ak the Faddeev-Popov operator M , defined by formula
(13.22), may have a zero eigenvalue, so that the Coulomb gauge condition
intersects the orbits of the gauge group more than once. This is the so-called Gribov
ambiguity, which has been rigorously considered in
• I. M. Singer, Some remarks on the Gribov ambiguity, Commun. Math.
Phys. 60 (1978), 7–12.

However, this problem does not affect the perturbation theory (Feynman rules)
based on the Hamiltonian formulation of the Yang-Mills theory.
Part 3

Special relativity and theory of gravity
LECTURE 14

Special relativity

Maxwell’s equations in vacuum are invariant with respect to the Lorentz group L = O(1, 3), the isometry group of Minkowski spacetime $M^4$, that is, the vector space $\mathbb{R}^4$ with the Minkowski metric
\[
ds^2 = \eta_{\mu\nu}\,dx^\mu dx^\nu = c^2dt^2 - dx^2 - dy^2 - dz^2.
\]
Points in the spacetime are thought of as coordinates of events and the Minkowski distance between two events $P_1 = (ct_1, x_1, y_1, z_1)$ and $P_2 = (ct_2, x_2, y_2, z_2)$ is called the interval,
\[
s_{12}^2 = c^2(t_2 - t_1)^2 - (x_2 - x_1)^2 - (y_2 - y_1)^2 - (z_2 - z_1)^2.
\]

14.1. The relativity principle and the Lorentz group


The Minkowski structure of physical spacetime is a mathematical formulation of Einstein’s relativity principle: “the speed of light is the same in all inertial frames of reference”. If $K$ and $K'$ are two inertial reference frames, then the relativity principle is the statement that if $ds = 0$ in $K$ then $ds' = 0$ in $K'$. From here it follows that
\[
ds^2 = a(v)\,ds'^2,
\]
where the constant $a(v)$ can depend only on the absolute value $v = |\mathbf{v}|$ of the relative velocity $\mathbf{v}$ of the inertial frames $K$ and $K'$. Applying this to three reference frames $K, K_1, K_2$ we get
\[
\frac{a(v_1)}{a(v_2)} = a(v_{12}),
\]
where $v_{12} = |\mathbf{v}_2 - \mathbf{v}_1|$. The right-hand side depends on the angle between $\mathbf{v}_1$ and $\mathbf{v}_2$ while the left-hand side does not, so $a(v)$ is a constant, and the same relation then implies that $a(v) = 1$.


The Einstein relativity principle states that the physical laws are invari-
ant with respect to the Lorentz group L, and replaces the Galilean relativity
principle in Newtonian mechanics.
The orbits of the Lorentz group L in M4 have the form

Om = {x ∈ M4 : xµ xµ = c2 t2 − x2 − y 2 − z 2 = m2 }

for all m2 ∈ R and are two-sheeted hyperboloids when m2 > 0, one-sheeted


hyperboloids for m2 < 0 and a cone c2 t2 = x2 + y 2 + z 2 for m = 0, the light


Figure 1. Light cone

cone (see Fig. 1). Correspondingly, two events $P_1, P_2 \in M^4$ are called timelike if $s_{12}^2 > 0$, spacelike if $s_{12}^2 < 0$ and lightlike if $s_{12}^2 = 0$. It follows from the transitivity of the L-action on orbits that for two timelike events there is a Lorentz transformation such that they take place at the same point in space, $P_2 - P_1 = (c(t_2 - t_1), 0, 0, 0)$, while for two spacelike events there is a Lorentz transformation such that they take place at the same time, $P_2 - P_1 = (0, \mathbf{x}_2 - \mathbf{x}_1)$.
Clearly the spacelike events cannot be causally related. Correspondingly, the
points inside the light cone with t > 0 represent the absolute future of the event
at the origin O, while the points inside with t < 0 belong to the absolute past.
The points outside the light cone are not causally related to the origin O and are
absolutely remote relative to O. This means that the concepts “simultaneous”,
“earlier” and “later” are relative for these regions.
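The Lorentz group L = O(1, 3) consists of 4 × 4 matrices $\Lambda = \{\Lambda^\mu_{\ \alpha}\}$ satisfying
\[
\Lambda^t \eta \Lambda = \eta, \tag{14.1}
\]
where η = diag{1, −1, −1, −1}. Equivalently,
\[
\Lambda^\mu_{\ \alpha}\Lambda^\nu_{\ \beta}\,\eta_{\mu\nu} = \eta_{\alpha\beta}.
\]
The group L acts linearly on $M^4$, $x \mapsto x' = \Lambda x$, where $x'^\mu = \Lambda^\mu_{\ \nu}x^\nu$. Setting α = β = 0 we have
\[
(\Lambda^0_{\ 0})^2 - (\Lambda^1_{\ 0})^2 - (\Lambda^2_{\ 0})^2 - (\Lambda^3_{\ 0})^2 = 1,
\]
so that $\Lambda^0_{\ 0} \ge 1$ or $\Lambda^0_{\ 0} \le -1$. We also have det Λ = ±1, so that the Lorentz group L has four connected components.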
The component of the identity L↑+ preserves the future and past light cones
and is called the proper orthochronous Lorentz group or restricted Lorentz group.
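
The defining relation (14.1) is easy to test numerically. The following is a minimal sketch, not part of the original notes: the helper names `matmul`, `transpose`, `boost_x` and `is_lorentz` are illustrative choices, and coordinates are ordered as (ct, x, y, z).

```python
import math

# Minkowski metric eta = diag(1, -1, -1, -1) in coordinates (ct, x, y, z).
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    """Transpose of a 4x4 matrix."""
    return [list(col) for col in zip(*a)]

def boost_x(psi):
    """Hyperbolic rotation by angle psi in the x^0 x^1-plane (a Lorentz boost)."""
    ch, sh = math.cosh(psi), math.sinh(psi)
    return [[ch, sh, 0, 0], [sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def is_lorentz(lam, tol=1e-9):
    """Check the defining relation Lambda^t eta Lambda = eta up to tol."""
    lhs = matmul(matmul(transpose(lam), ETA), lam)
    return all(abs(lhs[i][j] - ETA[i][j]) < tol
               for i in range(4) for j in range(4))

P = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]  # space inversion
T = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]    # time reversal
```

For instance, `is_lorentz(boost_x(0.7))`, `is_lorentz(P)` and `is_lorentz(T)` all hold, while the sign of the entry $\Lambda^0_{\ 0}$ of `matmul(T, boost_x(0.7))` shows that this product lies outside the orthochronous part of the group.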

Other components are obtained from it by applying the space inversion P = diag{1, −1, −1, −1}, the time reversal T = diag{−1, 1, 1, 1}, or PT.
The restricted Lorentz group $L^\uparrow_+$ is a six-dimensional connected Lie group generated by rotations in the $x^\mu x^\nu$-planes, 0 ≤ µ < ν ≤ 3. Spatial rotations generate a subgroup SO(3), while rotations in the $x^0x^i$-planes give Lorentz boosts. Explicitly, the rotation in the $x^0x^1$-plane preserves $c^2t^2 - x^2$, where $x = x^1$. The corresponding transformation $x^\mu \mapsto x'^\mu$ can be written as
\[
x = x'\cosh\psi + ct'\sinh\psi, \qquad ct = x'\sinh\psi + ct'\cosh\psi.
\]
Putting
\[
\cosh\psi = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}, \qquad \sinh\psi = \frac{\frac{v}{c}}{\sqrt{1 - \frac{v^2}{c^2}}},
\]
where |v| < c, we get
\[
x = \frac{x' + vt'}{\sqrt{1 - \frac{v^2}{c^2}}}, \quad y = y', \quad z = z', \quad t = \frac{t' + \frac{v}{c^2}x'}{\sqrt{1 - \frac{v^2}{c^2}}}. \tag{14.2}
\]

This transformation relates coordinates $(t, x, y, z)$ in the inertial reference frame $K$ with the coordinates $(t', x', y', z')$ in the inertial reference frame $K'$ moving relative to $K$ with velocity $v$ along the x-axis. The formula for $(t', x', y', z')$ in terms of $(t, x, y, z)$ is given by replacing $v$ by $-v$. When $|v| \ll c$ (or in the limit c → ∞) the Lorentz boost (14.2) becomes the Galilean transformation (1.8) in Lecture 1,
\[
x = x' + vt', \quad y = y', \quad z = z', \quad t = t'.
\]
Consider a particle in a reference frame $K$ moving with velocity $\mathbf{v} = \dfrac{d\mathbf{r}}{dt}$. In the reference frame $K'$ moving relative to $K$ with velocity $V$ in the x direction, the velocity of the particle is $\mathbf{v}' = \dfrac{d\mathbf{r}'}{dt'}$. Using
\[
dx = \frac{dx' + V\,dt'}{\sqrt{1 - \frac{V^2}{c^2}}}, \quad dy = dy', \quad dz = dz', \quad dt = \frac{dt' + \frac{V}{c^2}\,dx'}{\sqrt{1 - \frac{V^2}{c^2}}}
\]
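we obtain
\[
v_x = \frac{dx}{dt} = \frac{v'_x + V}{1 + \frac{v'_x V}{c^2}}, \qquad
v_y = \frac{dy}{dt} = \frac{v'_y\sqrt{1 - \frac{V^2}{c^2}}}{1 + \frac{v'_x V}{c^2}}, \qquad
v_z = \frac{dz}{dt} = \frac{v'_z\sqrt{1 - \frac{V^2}{c^2}}}{1 + \frac{v'_x V}{c^2}}.
\]
When $|V| \ll c$ we get
\[
v_x = v'_x + V, \quad v_y = v'_y, \quad v_z = v'_z.
\]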
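
As an illustration of the velocity transformation law, here is a small sketch (the function name `add_velocity` and the default c = 1 are assumptions made for the example, not notation from the notes). It shows that a signal moving with the speed of light in K′ also moves with the speed of light in K.

```python
import math

def add_velocity(v_prime, V, c=1.0):
    """Velocity in K of a particle with velocity v_prime in K',
    where K' moves relative to K with velocity V along the x-axis."""
    vx_p, vy_p, vz_p = v_prime
    denom = 1.0 + vx_p * V / c**2
    root = math.sqrt(1.0 - V**2 / c**2)
    return ((vx_p + V) / denom, vy_p * root / denom, vz_p * root / denom)
```

For example, with c = 1, composing two velocities 0.5 along the x-axis gives 0.8 rather than 1, and a light ray sent along the y′-axis of K′ acquires an x-component V while its total speed stays equal to c.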

14.2. The Lorentz contraction and time delay


Consider a rod at rest in the $K$ reference frame and suppose that it is parallel to the x-axis with the endpoints $x_1$ and $x_2$. The length of the rod, measured in the $K$ reference frame, is just $\Delta x = x_2 - x_1$. To determine the length of the rod in the moving reference frame $K'$, we need to find its endpoints $x'_1$ and $x'_2$ in $K'$ at the same time $t'$. From (14.2) we obtain
\[
x_1 = \frac{x'_1 + vt'}{\sqrt{1 - \frac{v^2}{c^2}}}, \qquad x_2 = \frac{x'_2 + vt'}{\sqrt{1 - \frac{v^2}{c^2}}}
\]
and
\[
\Delta x = \frac{\Delta x'}{\sqrt{1 - \frac{v^2}{c^2}}}.
\]
Denoting by $l_0 = \Delta x$ the proper length of the rod, the length in a reference frame where it is at rest, and by $l = \Delta x'$ its length in a moving reference frame $K'$, we obtain the Lorentz contraction
\[
l = l_0\sqrt{1 - \frac{v^2}{c^2}},
\]
so that $l < l_0$.
Next consider a clock which is at rest in the moving reference frame $K'$. Let $(t'_1, x', y', z')$ and $(t'_2, x', y', z')$ be two events occurring at the same point $(x', y', z')$ in space in the reference frame $K'$, so that the time between these events in $K'$ is $\Delta t' = t'_2 - t'_1$. It follows from (14.2) that in the fixed reference frame $K$
\[
t_1 = \frac{t'_1 + \frac{v}{c^2}x'}{\sqrt{1 - \frac{v^2}{c^2}}}, \qquad t_2 = \frac{t'_2 + \frac{v}{c^2}x'}{\sqrt{1 - \frac{v^2}{c^2}}}.
\]
Thus the time that elapses between these two events in the reference frame at rest $K$ is
\[
\Delta t = \frac{\Delta t'}{\sqrt{1 - \frac{v^2}{c^2}}},
\]
so that $\Delta t' < \Delta t$. This is time dilation in special relativity: the time between events occurring at the same place in a moving reference frame is always smaller than the time between these events in a reference frame at rest. The time $\Delta t'$ is called a proper time.
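
A short numerical sketch of both effects may be helpful. The function names below are hypothetical and the units are chosen so that c = 1 by default.

```python
import math

def gamma(v, c=1.0):
    """Factor 1 / sqrt(1 - v^2/c^2); requires |v| < c."""
    if abs(v) >= c:
        raise ValueError("the speed must be smaller than c")
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def contracted_length(l0, v, c=1.0):
    """Length l = l0 * sqrt(1 - v^2/c^2) of a rod with proper length l0."""
    return l0 / gamma(v, c)

def dilated_time(dt_proper, v, c=1.0):
    """Time in the frame at rest for the proper time interval dt_proper."""
    return gamma(v, c) * dt_proper
```

At v = 0.6c the factor is 1.25, so a rod of proper length 10 is measured to have length 8, and a proper time interval 4 corresponds to the interval 5 in the reference frame at rest.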
Remark. Note that the notion of being at the same point in space depends on the reference frame. Thus events $(t'_1, x', y', z')$ and $(t'_2, x', y', z')$ occur at the same point in space in the reference frame $K'$, but in the reference frame $K$
\[
x_1 = \frac{x' + vt'_1}{\sqrt{1 - \frac{v^2}{c^2}}}, \qquad x_2 = \frac{x' + vt'_2}{\sqrt{1 - \frac{v^2}{c^2}}},
\]
and $x_1 \neq x_2$.

14.3. Lie algebra of the Lorentz group


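The Lie algebra so(1, 3) of the Lorentz group is a Lie algebra of 4 × 4 matrices X satisfying
\[
X^t\eta + \eta X = 0,
\]
which is obtained from (14.1) by setting $\Lambda = e^{sX} = I + sX + O(s^2)$. It is a semi-simple six-dimensional Lie algebra with the generators $M^{\lambda\mu}$, 0 ≤ λ < µ ≤ 3, and the Lie brackets
\[
[M^{\lambda\mu}, M^{\rho\sigma}] = -\eta^{\lambda\rho}M^{\mu\sigma} + \eta^{\lambda\sigma}M^{\mu\rho} - \eta^{\mu\sigma}M^{\lambda\rho} + \eta^{\mu\rho}M^{\lambda\sigma}.
\]
Here it is understood that $M^{\lambda\lambda} = 0$ (no summation over repeated indices!) and $M^{\lambda\mu} = -M^{\mu\lambda}$ for λ > µ. The generators $M^{\lambda\mu}$ can be realized as the following 4 × 4 matrices
\[
(M^{\lambda\mu})^{\alpha}_{\ \beta} = \eta^{\alpha\lambda}\delta^{\mu}_{\beta} - \eta^{\alpha\mu}\delta^{\lambda}_{\beta}.
\]
Introducing
\[
J_i = \frac{1}{2}\varepsilon_{ikl}M^{kl} \quad\text{and}\quad K_i = M^{0i}, \quad i = 1, 2, 3,
\]
we obtain the following Lie brackets
\[
[J_i, J_j] = \varepsilon_{ijl}J_l, \qquad [K_i, K_j] = -\varepsilon_{ijl}J_l, \qquad [J_i, K_j] = \varepsilon_{ijl}K_l, \quad i, j = 1, 2, 3.
\]
The generators $J_1, J_2, J_3$ correspond to the rotations in $\mathbb{R}^3$ and $K_1, K_2, K_3$ to the Lorentz boosts. Explicitly¹,
\[
J_1 = \begin{pmatrix} 0&0&0&0\\ 0&0&0&0\\ 0&0&0&-1\\ 0&0&1&0 \end{pmatrix}, \quad
J_2 = \begin{pmatrix} 0&0&0&0\\ 0&0&0&1\\ 0&0&0&0\\ 0&-1&0&0 \end{pmatrix}, \quad
J_3 = \begin{pmatrix} 0&0&0&0\\ 0&0&-1&0\\ 0&1&0&0\\ 0&0&0&0 \end{pmatrix}
\]
and
\[
K_1 = \begin{pmatrix} 0&1&0&0\\ 1&0&0&0\\ 0&0&0&0\\ 0&0&0&0 \end{pmatrix}, \quad
K_2 = \begin{pmatrix} 0&0&1&0\\ 0&0&0&0\\ 1&0&0&0\\ 0&0&0&0 \end{pmatrix}, \quad
K_3 = \begin{pmatrix} 0&0&0&1\\ 0&0&0&0\\ 0&0&0&0\\ 1&0&0&0 \end{pmatrix}.
\]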
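
The commutation relations of the generators can be checked by a direct matrix computation. The sketch below is only an illustration: it builds the matrices from the formula $(M^{\lambda\mu})^{\alpha}_{\ \beta} = \eta^{\alpha\lambda}\delta^{\mu}_{\beta} - \eta^{\alpha\mu}\delta^{\lambda}_{\beta}$ with 0-based indices, and the helper names (`M`, `comm`, `combo`, `brackets_hold`) are assumptions introduced for the example.

```python
# Diagonal of the inverse Minkowski metric eta^{alpha beta} = diag(1, -1, -1, -1).
ETA = [1, -1, -1, -1]

def M(lam, mu):
    """Matrix (M^{lam mu})^a_b = eta^{a lam} delta^mu_b - eta^{a mu} delta^lam_b."""
    return [[ETA[a] * (a == lam) * (b == mu) - ETA[a] * (a == mu) * (b == lam)
             for b in range(4)] for a in range(4)]

def matmul(a, b):
    """Product of two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]

def eps(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def combo(sign, i, j, gens):
    """The matrix sign * sum_l eps_{ijl} gens[l]."""
    return [[sign * sum(eps(i, j, l) * gens[l][a][b] for l in range(3))
             for b in range(4)] for a in range(4)]

J = [M(2, 3), M(3, 1), M(1, 2)]   # J_i = (1/2) eps_{ikl} M^{kl}
K = [M(0, 1), M(0, 2), M(0, 3)]   # K_i = M^{0i}

def brackets_hold():
    """Check [J,J] = eps J, [K,K] = -eps J and [J,K] = eps K for all i, j."""
    return all(comm(J[i], J[j]) == combo(1, i, j, J)
               and comm(K[i], K[j]) == combo(-1, i, j, J)
               and comm(J[i], K[j]) == combo(1, i, j, K)
               for i in range(3) for j in range(3))
```

Since all entries are integers, the comparison in `brackets_hold()` is exact.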

Remark. The complexified Lie algebra so(1, 3) is isomorphic to so(4, C) with the generators
\[
J_i^{(\pm)} = \frac{1}{2}\big(J_i \pm \sqrt{-1}\,K_i\big)
\]
satisfying
\[
[J_i^{(+)}, J_j^{(+)}] = \varepsilon_{ijl}J_l^{(+)}, \qquad [J_i^{(-)}, J_j^{(-)}] = \varepsilon_{ijl}J_l^{(-)}, \qquad [J_i^{(+)}, J_j^{(-)}] = 0,
\]
which establishes the Lie algebra isomorphism so(4) ≅ so(3) ⊕ so(3). Note that over R there is a Lie group isomorphism
\[
SO(3) \times SO(3) \cong SO(4)/\{I, -I\}.
\]

Remark. Replacing η = diag(1, −1, −1, −1) by $\eta_c = \mathrm{diag}(c^2, -1, -1, -1)$, we get generators $J_i$ and $K_i^c$, and since $\eta_c^{-1} = \mathrm{diag}(1/c^2, -1, -1, -1)$ we obtain
\[
[K_i^c, K_j^c] = -\frac{1}{c^2}\,\varepsilon_{ijl}J_l.
\]
Thus in the non-relativistic limit c → ∞ for the generators $J_i$ and $\tilde{K}_i = \lim_{c\to\infty}K_i^c$ we obtain the relations
\[
[J_i, J_j] = \varepsilon_{ijl}J_l, \qquad [J_i, \tilde{K}_j] = \varepsilon_{ijl}\tilde{K}_l, \qquad [\tilde{K}_i, \tilde{K}_j] = 0,
\]
which characterize the Lie algebra se(3) of the Euclidean group E(3), the homogeneous Galilean group $G_0$ discussed in Sect. 1.3 in Lecture 1. Thus we see that the Euclidean Lie algebra se(3) is a contraction of the Lorentz Lie algebra so(1, 3).
¹ Compare with formulas for $X_1$, $X_2$ and $X_3$ in Example 2.2 in Lecture 2.

14.4. Lorentz group as deformation of the Galilean group


Specifically, the Lorentz Lie algebra so(1, 3) can be considered as a deformation of the Galilean Lie algebra se(3), with the deformation parameter being the inverse square of the speed of light c.
Namely, recall that a formal deformation of a Lie algebra g with a Lie bracket [ , ] is a Lie algebra g̃ over R[[t]], the ring of formal power series in the variable² t, with the Lie bracket
\[
[x, y]_t = [x, y] + t\,m_1(x, y) + t^2 m_2(x, y) + \cdots
\]
The Jacobi identity for the bracket $[\ ,\ ]_t$ implies that the linear map $m_1 : \Lambda^2 g \to g$ satisfies
\[
[m_1(x, y), z] + m_1([x, y], z) + [m_1(y, z), x] + [m_1(z, x), y] + m_1([z, x], y) + m_1([y, z], x) = 0 \tag{14.3}
\]
for all x, y, z ∈ g. This is the equation of a 2-cocycle in the Chevalley-Eilenberg complex $\mathrm{Hom}(\Lambda^\bullet g, g)$, where g is considered as a left g-module with respect to the adjoint action. Specifically, for any g-module M the coboundary map $\delta_k : \mathrm{Hom}(\Lambda^k g, M) \to \mathrm{Hom}(\Lambda^{k+1} g, M)$ is defined by
\[
(\delta_k f)(x_1, \dots, x_{k+1}) = \sum_{i=1}^{k+1}(-1)^{i+1}\,x_i \cdot f(x_1, \dots, \hat{x}_i, \dots, x_{k+1})
+ \sum_{1 \le i < j \le k+1}(-1)^{i+j} f([x_i, x_j], x_1, \dots, \hat{x}_i, \dots, \hat{x}_j, \dots, x_{k+1}).
\]

Remark. In case when g = Vect(X), where X is a smooth manifold, and


M = C ∞ (X), the Chevalley-Eilenberg complex Hom(Λ• g, M ) becomes the de
Rham complex Ω•dR (X, R).
Equation (14.3) for m1 can be written as δ2 m1 = 0. Coboundaries
(δ1 f )(x, y) = [x, f (y)] − [y, f (x)] − f ([x, y])
give infinitesimally trivial deformations: the linear map Ft (x) = x + tf (x)
establishes the infinitesimal isomorphism
Ft ([x, y]t ) = [Ft (x), Ft (y)] + O(t2 ).
Thus nontrivial infinitesimal deformations are in one-to-one correspondence
with the second cohomology group H 2 (g, g).
Definition. A Lie algebra g is called stable if $H^2(g, g) = 0$.
The semi-simple Lie algebras are stable. However, for the Lie algebra g = se(3) we have $H^2(g, g) = \mathbb{R}$ and for the 2-cocycle $m_1$ with the only non-zero values $m_1(\tilde{K}_i, \tilde{K}_j) = -\varepsilon_{ijk}J_k$ we obtain that the bracket
\[
[x, y]_t = [x, y] + t\,m_1(x, y)
\]
is a Lie bracket (the contribution of the terms proportional to $t^2$ to the Jacobi identity is zero). Putting $t = c^{-2}$ we obtain the Lorentz Lie algebra!
The Lorentz algebra is semi-simple and therefore is stable. Whence the passage from the Newtonian spacetime to the Minkowski spacetime is a deformation from an unstable structure to a stable one, and special relativity is a natural deformation of Newtonian mechanics.

² Should not be confused with the time variable!
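
The claim that the deformed bracket satisfies the Jacobi identity for every value of t can be verified by brute force on basis elements. The following sketch is an illustration under stated assumptions: elements are stored as coefficient lists in the basis $(J_1, J_2, J_3, \tilde{K}_1, \tilde{K}_2, \tilde{K}_3)$, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def eps(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def bracket(x, y, t):
    """Deformed bracket [x, y]_t with [J,J] = eps J, [J,K] = eps K,
    [K,K] = -t eps J; x and y are coefficient lists of length 6."""
    out = [Fraction(0)] * 6
    for i, j, k in product(range(3), repeat=3):
        e = eps(i, j, k)
        if e:
            out[k] += e * (x[i] * y[j] - t * x[3 + i] * y[3 + j])
            out[3 + k] += e * (x[i] * y[3 + j] + x[3 + i] * y[j])
    return out

def jacobi_holds(t):
    """Check the Jacobi identity for [ , ]_t on all triples of basis elements."""
    basis = [[Fraction(int(a == b)) for b in range(6)] for a in range(6)]
    for x, y, z in product(basis, repeat=3):
        terms = (bracket(x, bracket(y, z, t), t),
                 bracket(y, bracket(z, x, t), t),
                 bracket(z, bracket(x, y, t), t))
        if any(sum(c) != 0 for c in zip(*terms)):
            return False
    return True
```

Here t = 0 corresponds to se(3) and t = 1 to so(1, 3) in units with c = 1; exact rational arithmetic avoids rounding issues for intermediate values such as t = 1/9.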
LECTURE 15

Relativistic particle

A motion of a particle in $M^4$ is described by a world line. By definition, it is a map $\gamma : [t_1, t_2] \to M^4$, $\gamma(t) = x^\mu(t)$, such that at each $t \in [t_1, t_2]$ the tangent vector $\gamma'(t)$ is timelike. Explicitly, $\gamma(t) = (ct, \mathbf{r}(t))$ where $\mathbf{v}(t) = \dot{\mathbf{r}}(t)$ satisfies $v < c$, where $v = |\mathbf{v}|$. In terms of the natural parameter s on the world line,
\[
ds = c\sqrt{1 - \frac{v^2}{c^2}}\,dt,
\]
the unit tangent vector is given by
\[
u^\mu = \frac{dx^\mu}{ds} = \left(\frac{1}{\sqrt{1 - \frac{v^2}{c^2}}},\ \frac{\mathbf{v}}{c\sqrt{1 - \frac{v^2}{c^2}}}\right), \qquad u^\mu u_\mu = 1,
\]
and the acceleration is
\[
a^\mu = \frac{du^\mu}{ds}, \qquad a^\mu u_\mu = 0.
\]
Remark. The natural parameter is c times the proper time along the world line,
\[
s(t) = c\int_{t_1}^{t}\sqrt{1 - \frac{v^2(\tau)}{c^2}}\,d\tau.
\]

15.1. The principle of the least action


Let $a, b \in M^4$ be two events with a timelike interval $s_{ab}^2 > 0$. It is natural to define the action of a relativistic particle along the world line $\gamma : [t_0, t_1] \to M^4$, $\gamma(t_0) = a$ and $\gamma(t_1) = b$, by the following expression
\[
S(\gamma) = -\alpha\int_a^b ds.
\]
Here integration goes over the world line γ and α is a constant.
It follows from the pseudo-Euclidean structure of the Minkowski spacetime that the integral $\int_a^b ds$ takes a maximal value when it is taken along a straight world line connecting a and b. Indeed, applying a Lorentz transformation, we can assume that $a = (ct'_0, x', y', z')$ and $b = (ct'_1, x', y', z')$, so that along a world line γ
\[
\int_a^b ds \le c(t'_1 - t'_0)
\]
and the equality occurs for γ being a straight line connecting a and b with zero velocity.
Thus to have a minimum of the action we need α > 0, so that for $\gamma(t) = (ct, \mathbf{r}(t))$,
\[
S(\gamma) = \int_{t_0}^{t_1} L(\gamma'(t))\,dt, \quad\text{where}\quad L = -\alpha c\sqrt{1 - \frac{v^2}{c^2}} \quad\text{and}\quad v = |\dot{\mathbf{r}}|.
\]
The quantity α characterizes the particle. In classical mechanics a particle is characterized by its mass m (see Lecture 1). Whence in the non-relativistic limit c → ∞ we should recover the Lagrangian of a free particle $mv^2/2$, and this comparison yields a relation between α and m. Namely, we have as c → ∞
\[
L = -\alpha c\sqrt{1 - \frac{v^2}{c^2}} = -\alpha c + \frac{\alpha v^2}{2c} + O(c^{-3}).
\]
Omitting the constant term −αc, which does not affect the equations of motion, we obtain α = mc. Thus the action of a free relativistic particle of mass m is
\[
S(\gamma) = -mc\int_a^b ds = \int_{t_0}^{t_1} L(\gamma'(t))\,dt \tag{15.1}
\]
with the Lagrangian function
\[
L = -mc^2\sqrt{1 - \frac{v^2}{c^2}}. \tag{15.2}
\]
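Proposition 15.1. The Euler-Lagrange equations of a free relativistic particle are
\[
\frac{du^\mu}{ds} = 0
\]
and describe a motion with constant velocity.
Proof. Since $ds = \sqrt{dx_\mu dx^\mu}$, we have along the world line γ,
\[
\begin{aligned}
\delta(ds) &= \frac{1}{2}\left(\frac{dx^\mu}{ds}\,\delta dx_\mu + \frac{dx_\mu}{ds}\,\delta dx^\mu\right) \\
&= u_\mu\,d\delta x^\mu \\
&= d(u_\mu\delta x^\mu) - \frac{du_\mu}{ds}\,\delta x^\mu\,ds,
\end{aligned}
\]
and using $\delta x^\mu(a) = \delta x^\mu(b) = 0$, we obtain
\[
\delta S = -mc\int_a^b \delta(ds) = mc\int_a^b \frac{du_\mu}{ds}\,\delta x^\mu\,ds. \qquad\square
\]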

15.2. Energy-momentum vector


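The momentum p canonically conjugated to the position r of the particle is given by
\[
\mathbf{p} = \frac{\partial L}{\partial \mathbf{v}} = \frac{m\mathbf{v}}{\sqrt{1 - \frac{v^2}{c^2}}}.
\]
The corresponding energy is
\[
\mathcal{E} = \mathbf{p}\cdot\mathbf{v} - L = \frac{mv^2}{\sqrt{1 - \frac{v^2}{c^2}}} + mc^2\sqrt{1 - \frac{v^2}{c^2}} = \frac{mc^2}{\sqrt{1 - \frac{v^2}{c^2}}}.
\]
At v = 0 we obtain the rest energy $\mathcal{E}_0$ of the particle,
\[
\mathcal{E}_0 = mc^2.
\]
At small velocities we obtain
\[
\mathcal{E} = \mathcal{E}_0 + \frac{mv^2}{2} + O(v^4)
\]
which, except for the rest energy, is the classical expression for the kinetic energy of a free particle. We have
\[
\frac{\mathcal{E}^2}{c^2} = p^2 + m^2c^2, \qquad p^2 = \mathbf{p}\cdot\mathbf{p},
\]
so that the corresponding Hamiltonian function is
\[
\mathcal{H} = c\sqrt{p^2 + m^2c^2},
\]
and Hamilton’s equations
\[
\dot{\mathbf{p}} = -\frac{\partial\mathcal{H}}{\partial\mathbf{r}}, \qquad \dot{\mathbf{r}} = \frac{\partial\mathcal{H}}{\partial\mathbf{p}}
\]
give the Euler-Lagrange equations of a free relativistic particle (see Proposition 15.1). Introducing the energy-momentum four vector $p^\mu = (\mathcal{E}/c, \mathbf{p})$, so that $p_\mu = (\mathcal{E}/c, -\mathbf{p})$, we have
\[
p^\mu p_\mu = m^2c^2.
\]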
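
The relations between velocity, momentum and energy can be illustrated numerically. The sketch below works in one spatial dimension for brevity; the function names and the sample values are assumptions made for the example.

```python
import math

def momentum(m, v, c=1.0):
    """Relativistic momentum p = m v / sqrt(1 - v^2/c^2)."""
    return m * v / math.sqrt(1.0 - (v / c) ** 2)

def energy(m, v, c=1.0):
    """Energy E = m c^2 / sqrt(1 - v^2/c^2)."""
    return m * c**2 / math.sqrt(1.0 - (v / c) ** 2)

def hamiltonian(m, p, c=1.0):
    """Hamiltonian H = c sqrt(p^2 + m^2 c^2)."""
    return c * math.sqrt(p**2 + (m * c) ** 2)

def velocity_from_hamiltonian(m, p, c=1.0, h=1e-6):
    """Hamilton's equation r' = dH/dp, approximated by a central difference."""
    return (hamiltonian(m, p + h, c) - hamiltonian(m, p - h, c)) / (2.0 * h)
```

For m = 2, v = 0.6 and c = 1 this gives p = 1.5 and energy 2.5, so that the mass shell relation reads 6.25 = 2.25 + 4, and the derivative of the Hamiltonian with respect to p returns the velocity 0.6.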

Note that $\mathbf{p} = -(p_1, p_2, p_3)$ and
\[
p_\mu = -\frac{\partial L}{\partial\dot{x}^\mu}.
\]

15.3. Charged particle in the electromagnetic field


Here we consider the interaction of a relativistic particle of mass m and charge e with the external electromagnetic field with a potential $A = A_\mu dx^\mu$, where $A_\mu = (c\varphi, -\mathbf{A})$. To every world line $\gamma : [t_0, t_1] \to M^4$ one associates a holonomy of the connection d + A along γ, the integral
\[
\int_a^b A_\mu\,dx^\mu
\]
of A along γ. It is natural to define the action of a charged particle in the electromagnetic field as a linear combination of the action of a free particle and the holonomy, and we put
\[
S(\gamma) = -mc\int_a^b ds - \frac{e}{c}\int_a^b A_\mu\,dx^\mu
= \int_{t_0}^{t_1}\left(-mc^2\sqrt{1 - \frac{v^2}{c^2}} + \frac{e}{c}\,\mathbf{A}\cdot\mathbf{v} - e\varphi\right)dt. \tag{15.3}
\]
Proposition 15.2. The Euler-Lagrange equations for the action functional (15.3) have the form
\[
\frac{d\mathbf{p}}{dt} = \mathbf{F},
\]
where F is the Lorentz force,
\[
\mathbf{F} = e\left(\mathbf{E} + \frac{\mathbf{v}}{c}\times\mathbf{B}\right).
\]
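Proof. We have
\[
\begin{aligned}
\delta\int_a^b A_\mu\,dx^\mu &= \int_a^b\left(\frac{\partial A_\mu}{\partial x^\nu}\frac{dx^\mu}{ds}\,\delta x^\nu + A_\mu\frac{d\delta x^\mu}{ds}\right)ds \\
&= \int_a^b\left(\frac{\partial A_\mu}{\partial x^\nu}\frac{dx^\mu}{ds}\,\delta x^\nu - \frac{\partial A_\mu}{\partial x^\nu}\frac{dx^\nu}{ds}\,\delta x^\mu\right)ds \\
&= -\int_a^b F_{\mu\nu}\frac{dx^\mu}{ds}\,\delta x^\nu\,ds.
\end{aligned}
\]
Now using Proposition 15.1 we obtain
\[
\delta S = \int_a^b\left(mc\,\frac{du_\nu}{ds} + \frac{e}{c}F_{\mu\nu}\frac{dx^\mu}{ds}\right)\delta x^\nu\,ds,
\]
and the Euler-Lagrange equations take the following invariant form
\[
mc\,\frac{du_\nu}{ds} + \frac{e}{c}F_{\mu\nu}\frac{dx^\mu}{ds} = 0. \tag{15.4}
\]
Using formula (8.8) in Lecture 8, the relation $mcu^\nu = p^\nu$ and equation (15.4) for ν = 1, 2, 3, we readily obtain
\[
\frac{d\mathbf{p}}{dt} = e\mathbf{E} + \frac{e}{c}\,\mathbf{v}\times\mathbf{B}. \qquad\square \tag{15.5}
\]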

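Remark. Since $mcu^0 = \sqrt{m^2c^2 + p^2}$, equation (15.4) for ν = 0 follows from (15.5).
Remark. In the non-relativistic limit $|\mathbf{v}| \ll c$ equation (15.5) turns into
\[
m\frac{d\mathbf{v}}{dt} = e\left(\mathbf{E} + \frac{\mathbf{v}}{c}\times\mathbf{B}\right),
\]
which is Newton’s equation with the Lorentz force.
The Lagrangian of a charged particle in the electromagnetic field is
\[
L = -mc^2\sqrt{1 - \frac{v^2}{c^2}} + \frac{e}{c}\,\mathbf{A}\cdot\mathbf{v} - e\varphi.
\]
The momentum of the charged particle canonically conjugated to r, the generalized momentum, is defined by
\[
\mathbf{P} = \frac{\partial L}{\partial\mathbf{v}} = \frac{m\mathbf{v}}{\sqrt{1 - \frac{v^2}{c^2}}} + \frac{e}{c}\mathbf{A} = \mathbf{p} + \frac{e}{c}\mathbf{A},
\]
and the corresponding energy is
\[
\mathcal{E} = \mathbf{v}\cdot\frac{\partial L}{\partial\mathbf{v}} - L = \frac{mc^2}{\sqrt{1 - \frac{v^2}{c^2}}} + e\varphi = c\sqrt{m^2c^2 + p^2} + e\varphi.
\]
The Hamiltonian function is obtained from the energy $\mathcal{E}$ by replacing $\mathbf{p} = \mathbf{P} - \dfrac{e}{c}\mathbf{A}$ and is given by
\[
\mathcal{H} = c\sqrt{m^2c^2 + \left(\mathbf{P} - \frac{e}{c}\mathbf{A}\right)^2} + e\varphi.
\]
Hamilton’s equations of motion
\[
\dot{\mathbf{P}} = -\frac{\partial\mathcal{H}}{\partial\mathbf{r}}, \qquad \dot{\mathbf{r}} = \frac{\partial\mathcal{H}}{\partial\mathbf{P}},
\]
together with the definitions
\[
\mathbf{E} = -\nabla\varphi - \frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{B} = \nabla\times\mathbf{A},
\]
give the Euler-Lagrange equations for a charged particle in the electromagnetic field.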
LECTURE 16

Hamiltonian formulation

16.1. Poincaré group and Noether integrals


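The Poincaré group P is a ten-dimensional Lie group, the group of isometries
\[
x^\mu \mapsto x'^\mu = \Lambda^\mu_{\ \nu}x^\nu + a^\mu \tag{16.1}
\]
of Minkowski spacetime $M^4$. The group multiplication in P is given by
\[
(\Lambda_1, a_1)(\Lambda_2, a_2) = (\Lambda_1\Lambda_2,\ a_1 + \Lambda_1 a_2), \qquad \Lambda_{1,2} \in L, \quad a_{1,2} \in \mathbb{R}^4.
\]
There is an embedding $P \hookrightarrow GL(5, \mathbb{R})$ given by
\[
(\Lambda, a) \mapsto \begin{pmatrix} \Lambda & a \\ 0 & 1 \end{pmatrix}.
\]
The Lie algebra p of the Poincaré group P is a ten-dimensional Lie algebra, a semi-direct sum of the abelian Lie algebra $\mathbb{R}^4$ and the Lorentz Lie algebra so(1, 3). Denoting by $P^\mu$ the generators of p corresponding to space-time translations we obtain the following set of relations:
\[
\begin{aligned}
[P^\mu, P^\nu] &= 0, \\
[M^{\lambda\mu}, P^\sigma] &= \eta^{\lambda\sigma}P^\mu - \eta^{\mu\sigma}P^\lambda, \\
[M^{\lambda\mu}, M^{\rho\sigma}] &= -\eta^{\lambda\rho}M^{\mu\sigma} + \eta^{\lambda\sigma}M^{\mu\rho} - \eta^{\mu\sigma}M^{\lambda\rho} + \eta^{\mu\rho}M^{\lambda\sigma}.
\end{aligned}
\]
The Lagrangian function of a free relativistic particle
\[
L = -mc\sqrt{\dot{x}_\mu\dot{x}^\mu}, \qquad \dot{x}^\mu = \frac{dx^\mu}{dt}, \quad t = \frac{x^0}{c},
\]
is invariant under the action (16.1) of the Poincaré group on $M^4$,
\[
L\,dt = L'\,dt', \quad\text{where}\quad L' = -mc\sqrt{\dot{x}'_\mu\dot{x}'^\mu}, \qquad \dot{x}'^\mu = \frac{dx'^\mu}{dt'}, \quad t' = \frac{x'^0}{c}.
\]
According to Noether’s theorem in Sect. 2.2 in Lecture 2, there are ten integrals of motion corresponding to the generators $P^\mu$ and $M^{\lambda\mu}$. The integrals of motion for the abelian Lie algebra $\mathbb{R}^4$ are
\[
p_\mu = -\frac{\partial L}{\partial\dot{x}^\mu},
\]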

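that is,
\[
p_0 = \frac{\mathcal{H}}{c} = \sqrt{p^2 + m^2c^2}, \qquad \mathbf{p} = \frac{m\mathbf{v}}{\sqrt{1 - \frac{v^2}{c^2}}}
\]
(recall that $p_\mu = (p_0, -\mathbf{p})$, see Sect. 15.2 in Lecture 15). The vector fields on $\mathbb{R}^4$ which correspond to the one-parameter subgroups $e^{sM^{\mu\nu}}$ of the Lorentz group generated by $M^{\mu\nu}$ are
\[
X^{\mu\nu} = (M^{\mu\nu}\cdot x)^\sigma\frac{\partial}{\partial x^\sigma} = (\eta^{\sigma\nu}x^\mu - \eta^{\sigma\mu}x^\nu)\frac{\partial}{\partial x^\sigma}.
\]
The corresponding Noether integrals are given by
\[
J^{\mu\nu} = (\eta^{\sigma\nu}x^\mu - \eta^{\sigma\mu}x^\nu)\frac{\partial L}{\partial\dot{x}^\sigma} = x^\mu p^\nu - x^\nu p^\mu.
\]
Thus we obtain components of the total angular momentum
\[
J_x = J^{23} = x^2p^3 - x^3p^2, \qquad J_y = J^{31} = x^3p^1 - x^1p^3, \qquad J_z = J^{12} = x^1p^2 - x^2p^1
\]
and integrals of motion corresponding to Lorentz boosts
\[
K_x = J^{01} = x^0p^1 - x^1p^0, \qquad K_y = J^{02} = x^0p^2 - x^2p^0, \qquad K_z = J^{03} = x^0p^3 - x^3p^0.
\]
Of course it is easy to verify directly that these functions are integrals of motion. Thus we have
\[
\dot{J}^{0i} = cp^i - \dot{x}^ip^0 = 0
\]
due to the relation
\[
\mathbf{v} = \frac{c\,\mathbf{p}}{\sqrt{p^2 + m^2c^2}},
\]
which follows from
\[
\mathbf{p} = \frac{m\mathbf{v}}{\sqrt{1 - \frac{v^2}{c^2}}}.
\]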
16.2. Hamiltonian action of the Poincaré group
The Legendre transform
\[
\mathbf{p} = \frac{m\mathbf{v}}{\sqrt{1 - \frac{v^2}{c^2}}} \tag{16.2}
\]
maps B(0, c), the ball of radius c in $\mathbb{R}^3$, onto $\mathbb{R}^3$ and the phase space of a free relativistic particle of mass m is $\mathbb{R}^6$. The inverse transform is
\[
\mathbf{v} = \frac{c\,\mathbf{p}}{\sqrt{p^2 + m^2c^2}} = \frac{c\,\mathbf{p}}{p^0}. \tag{16.3}
\]

The symplectic form is given by
\[
\omega = d\mathbf{p}\wedge d\mathbf{r} = dp^1\wedge dx^1 + dp^2\wedge dx^2 + dp^3\wedge dx^3
\]
with Darboux coordinates¹ $(\mathbf{p}, \mathbf{r}) = (p^1, p^2, p^3, x^1, x^2, x^3)$.
It is remarkable that there is a Hamiltonian action of the Poincaré group P on $\mathbb{R}^6$!
Indeed, let $\mathcal{L}$ be the set of all timelike straight lines in $\mathbb{R}^4$. Every $l \in \mathcal{L}$ has the form $l = \{x + sv,\ s \in \mathbb{R}\}$, where $x, v \in \mathbb{R}^4$ and v is timelike, $v^\mu v_\mu > 0$. The Poincaré group P acts on $\mathcal{L}$ by
\[
(\Lambda, a)(l) = \{\Lambda x + a + s\Lambda v\}.
\]
Each timelike l admits a unique representation $l = \{x + sv,\ s \in \mathbb{R}\}$ where $x = (0, \mathbf{r})$ and $v = (c, \mathbf{v})$ with $|\mathbf{v}| < c$. Thus $\mathcal{L} \cong \mathbb{R}^3\times B(0, c)$, which is isomorphic to $\mathbb{R}^6$ by the Legendre transform $\mathbf{v} \mapsto \mathbf{p}$, and we obtain the Poincaré group action on $\mathbb{R}^6$.
This action preserves the symplectic form and is Hamiltonian. Specifically, the action of the Euclidean group E(3) < P on $\mathbb{R}^6 \cong \mathbb{R}^3\times B(0, c)$ is Hamiltonian with the Hamiltonian functions
\[
J_1 = x^2p^3 - x^3p^2, \qquad J_2 = x^3p^1 - x^1p^3, \qquad J_3 = x^1p^2 - x^2p^1
\]
(see Example 6.1 in Lecture 6) and $P_i = -p^i$. Indeed, the abelian group of translations of $\mathbb{R}^3$ acts on $\mathbb{R}^6$ by $(\mathbf{p}, \mathbf{r}) \mapsto (\mathbf{p}, \mathbf{r} + \mathbf{a})$ and the corresponding vector field $X_{\mathbf{a}}$ is given by
\[
X_{\mathbf{a}}(f)(\mathbf{p}, \mathbf{r}) = \frac{d}{du}\bigg|_{u=0} f(\mathbf{p}, \mathbf{r} - u\mathbf{a}) = -a^i\frac{\partial f}{\partial x^i}(\mathbf{p}, \mathbf{r}).
\]
Thus the vector fields $X_{e_i}$ are Hamiltonian vector fields with Hamiltonian functions $-p^i$, i.e.,
\[
X_{e_i} = -\frac{\partial}{\partial x^i} = -J(dp^i), \qquad i = 1, 2, 3.
\]
The one-parameter subgroup T of time translations acts on $\mathcal{L}$ by $l \mapsto l + (x^0, 0, 0, 0)$ with the representative $(\mathbf{r} - x^0\mathbf{v}/c, \mathbf{v})$. Thus T acts on $\mathbb{R}^6$ by
\[
\mathbf{r} \mapsto \mathbf{r} - \frac{x^0\mathbf{p}}{p^0}, \qquad \mathbf{p} \mapsto \mathbf{p}
\]
and the corresponding vector field is $X = \dfrac{p^i}{p^0}\dfrac{\partial}{\partial x^i}$. Using that
\[
J(d\mathbf{p}) = \frac{\partial}{\partial\mathbf{r}} \quad\text{and}\quad J(d\mathbf{r}) = -\frac{\partial}{\partial\mathbf{p}}
\]
(see Sect. 4.3 in Lecture 4) we obtain that $X = J(dp^0)$, i.e., X is a Hamiltonian vector field with the Hamiltonian function $p^0 = \sqrt{p^2 + m^2c^2}$, which is 1/c times the Hamiltonian of a free relativistic particle of mass m.

¹ Note that in accordance with Sect. 15.2 in Lecture 15 we have $\mathbf{p} = (p^1, p^2, p^3)$.
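Next, consider the one-parameter subgroup $K_1$ of P which consists of Lorentz boosts in the $x^0x^1$-plane,
\[
\Lambda(\psi)x = (x^0\cosh\psi + x^1\sinh\psi,\ x^0\sinh\psi + x^1\cosh\psi,\ x^2,\ x^3), \qquad \psi \in \mathbb{R}.
\]
To find the action of Λ(ψ) on $\mathbb{R}^6$ we need to determine how it acts on the representative $(\mathbf{r}, \mathbf{v})$ of a straight line l. We have
\[
\begin{aligned}
\Lambda(\psi)(0, \mathbf{r}) &= (x^1\sinh\psi,\ x^1\cosh\psi,\ x^2,\ x^3), \\
\Lambda(\psi)(c, \mathbf{v}) &= (c\cosh\psi + v^1\sinh\psi,\ c\sinh\psi + v^1\cosh\psi,\ v^2,\ v^3),
\end{aligned}
\]
so that
\[
\Lambda(\psi)(\mathbf{v}) = \left(\frac{cv^1\cosh\psi + c^2\sinh\psi}{v^1\sinh\psi + c\cosh\psi},\ \frac{cv^2}{v^1\sinh\psi + c\cosh\psi},\ \frac{cv^3}{v^1\sinh\psi + c\cosh\psi}\right)
\]
and from this we obtain
\[
\begin{aligned}
\Lambda(\psi)(\mathbf{r}) &= \left(x^1\cosh\psi - x^1\sinh\psi\,\frac{v^1\cosh\psi + c\sinh\psi}{v^1\sinh\psi + c\cosh\psi},\ x^2 - \frac{x^1v^2\sinh\psi}{v^1\sinh\psi + c\cosh\psi},\ x^3 - \frac{x^1v^3\sinh\psi}{v^1\sinh\psi + c\cosh\psi}\right) \\
&= \left(\frac{cx^1}{v^1\sinh\psi + c\cosh\psi},\ x^2 - \frac{x^1v^2\sinh\psi}{v^1\sinh\psi + c\cosh\psi},\ x^3 - \frac{x^1v^3\sinh\psi}{v^1\sinh\psi + c\cosh\psi}\right).
\end{aligned}
\]
Using (16.3), we get
\[
\Lambda(\psi)(\mathbf{r}) = \left(\frac{x^1p^0}{p^1\sinh\psi + p^0\cosh\psi},\ x^2 - \frac{x^1p^2\sinh\psi}{p^1\sinh\psi + p^0\cosh\psi},\ x^3 - \frac{x^1p^3\sinh\psi}{p^1\sinh\psi + p^0\cosh\psi}\right).
\]
To obtain the action of the Lorentz boost on the momentum vector p we need to use equation (16.2). Namely, $\Lambda(\psi)(\mathbf{p}) = \tilde{\mathbf{p}}$ is the relativistic momentum for the velocity vector $\tilde{\mathbf{v}} = \Lambda(\psi)(\mathbf{v})$. Denoting $\tilde{v} = |\tilde{\mathbf{v}}|$ we get
\[
1 - \frac{\tilde{v}^2}{c^2} = \frac{c^2}{(v^1\sinh\psi + c\cosh\psi)^2}\left(1 - \frac{v^2}{c^2}\right).
\]
Using
\[
p^0 = \frac{mc}{\sqrt{1 - \frac{v^2}{c^2}}},
\]
we obtain
\[
\tilde{\mathbf{p}} = \frac{m\tilde{\mathbf{v}}}{\sqrt{1 - \frac{\tilde{v}^2}{c^2}}} = (p^1\cosh\psi + p^0\sinh\psi,\ p^2,\ p^3),
\]
so that
\[
\Lambda(\psi)(\mathbf{p}) = (p^1\cosh\psi + p^0\sinh\psi,\ p^2,\ p^3).
\]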
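
The formula for the boosted momentum can be cross-checked numerically by composing the inverse Legendre transform (16.3), the boost of the velocity, and the Legendre transform (16.2). The sketch below is an illustration only; the function names and the sample values in the usage note are assumptions.

```python
import math

def p0(p, m, c=1.0):
    """Zeroth component p^0 = sqrt(p^2 + m^2 c^2)."""
    return math.sqrt(sum(x * x for x in p) + (m * c) ** 2)

def velocity(p, m, c=1.0):
    """Inverse Legendre transform (16.3): v = c p / p^0."""
    e = p0(p, m, c)
    return tuple(c * x / e for x in p)

def momentum(v, m, c=1.0):
    """Legendre transform (16.2): p = m v / sqrt(1 - v^2/c^2)."""
    root = math.sqrt(1.0 - sum(x * x for x in v) / c**2)
    return tuple(m * x / root for x in v)

def boost_velocity(v, psi, c=1.0):
    """Action of the boost Lambda(psi) in the x^0 x^1-plane on a velocity."""
    ch, sh = math.cosh(psi), math.sinh(psi)
    d = v[0] * sh + c * ch
    return ((c * v[0] * ch + c * c * sh) / d, c * v[1] / d, c * v[2] / d)

def boost_momentum(p, psi, m, c=1.0):
    """Closed formula (p^1 cosh psi + p^0 sinh psi, p^2, p^3)."""
    ch, sh = math.cosh(psi), math.sinh(psi)
    return (p[0] * ch + p0(p, m, c) * sh, p[1], p[2])
```

For example, with p = (0.3, −1.2, 0.5), m = 1.5, ψ = 0.8 and c = 2, the tuple `momentum(boost_velocity(velocity(p, m, c), psi, c), m, c)` agrees with `boost_momentum(p, psi, m, c)` up to rounding.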
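The vector field corresponding to the $K_1$ action on $\mathbb{R}^6$ is given by
\[
X_1(f)(\mathbf{p}, \mathbf{r}) = \frac{d}{d\psi}\bigg|_{\psi=0} f(\Lambda(-\psi)\mathbf{p}, \Lambda(-\psi)\mathbf{r})
= \left[\frac{x^1}{p^0}\left(p^1\frac{\partial}{\partial x^1} + p^2\frac{\partial}{\partial x^2} + p^3\frac{\partial}{\partial x^3}\right) - p^0\frac{\partial}{\partial p^1}\right]f(\mathbf{p}, \mathbf{r}).
\]
Thus we obtain that $X_1$ is a Hamiltonian vector field with the Hamiltonian function $K_1(\mathbf{p}, \mathbf{r}) = x^1\sqrt{p^2 + m^2c^2}$, i.e.,
\[
X_1 = J(dK_1).
\]
Similarly, we see that the vector fields $X_2$ and $X_3$ for the one-parameter subgroups $K_2$ and $K_3$ are Hamiltonian vector fields with the Hamiltonian functions $K_2(\mathbf{p}, \mathbf{r}) = x^2\sqrt{p^2 + m^2c^2}$ and $K_3(\mathbf{p}, \mathbf{r}) = x^3\sqrt{p^2 + m^2c^2}$.
Since Hamiltonian vector fields preserve the symplectic form, the Poincaré group P acts on $\mathbb{R}^6$ by canonical transformations (symplectomorphisms). The following theorem summarizes the obtained results.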
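Theorem 16.1. The action of the Poincaré group P on the phase space $\mathbb{R}^6$ of a free relativistic particle of mass m defined above is Hamiltonian. The Hamiltonian functions corresponding to space-time translations, space rotations and Lorentz boosts are
\[
P_0 = \sqrt{p^2 + m^2c^2}, \qquad P_i = -p^i, \qquad J_i = \varepsilon_{ijk}x^jp^k, \qquad K_i = x^i\sqrt{p^2 + m^2c^2},
\]
i = 1, 2, 3. They satisfy the following Poisson brackets
\[
\{P_i, P_j\} = \{P_i, P_0\} = \{J_i, P_0\} = 0, \qquad \{J_i, J_j\} = -\varepsilon_{ijk}J_k, \tag{16.4}
\]
\[
\{K_i, K_j\} = \varepsilon_{ijk}J_k, \qquad \{J_i, K_j\} = -\varepsilon_{ijk}K_k, \tag{16.5}
\]
\[
\{K_i, P_0\} = P_i, \qquad \{K_i, P_j\} = \delta_{ij}P_0, \qquad \{J_i, P_j\} = -\varepsilon_{ijk}P_k. \tag{16.6}
\]
Proof. Straightforward computation using the Poisson bracket
\[
\{f, g\}(\mathbf{p}, \mathbf{r}) = \frac{\partial f}{\partial\mathbf{p}}\frac{\partial g}{\partial\mathbf{r}} - \frac{\partial f}{\partial\mathbf{r}}\frac{\partial g}{\partial\mathbf{p}}. \qquad\square
\]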
Remark. As in Example 6.1 in Lecture 6, Poisson brackets between Hamil-
tonian functions have the same form as Lie brackets of the corresponding gen-
erators of Poincaré Lie algebra, taken with the negative sign.
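
Some of these Poisson brackets can be spot-checked numerically with finite differences. The following sketch is an illustration under stated assumptions: a phase space point is a tuple z = (p¹, p², p³, x¹, x², x³), the bracket is the one used in the proof of Theorem 16.1, and the mass, the speed of light and all function names are hypothetical choices.

```python
import math

MASS, C = 1.0, 1.0  # hypothetical values of m and c

def P0(z):
    """P_0 = sqrt(p^2 + m^2 c^2)."""
    return math.sqrt(z[0]**2 + z[1]**2 + z[2]**2 + (MASS * C) ** 2)

def P(i):
    """P_i = -p^i for i = 0, 1, 2."""
    return lambda z: -z[i]

def J(i):
    """J_i = eps_{ijk} x^j p^k for i = 0, 1, 2."""
    j, k = (i + 1) % 3, (i + 2) % 3
    return lambda z: z[3 + j] * z[k] - z[3 + k] * z[j]

def K(i):
    """K_i = x^i sqrt(p^2 + m^2 c^2) for i = 0, 1, 2."""
    return lambda z: z[3 + i] * P0(z)

def partial(f, z, n, h=1e-5):
    """Central difference approximation of the n-th partial derivative of f."""
    up = list(z); up[n] += h
    dn = list(z); dn[n] -= h
    return (f(up) - f(dn)) / (2.0 * h)

def bracket(f, g, z):
    """{f, g} = sum_k (df/dp^k dg/dx^k - df/dx^k dg/dp^k)."""
    return sum(partial(f, z, k) * partial(g, z, 3 + k)
               - partial(f, z, 3 + k) * partial(g, z, k) for k in range(3))
```

At a sample point such as z = (0.3, −0.7, 1.1, 0.5, 1.4, −0.9) one finds, up to discretization error, that `bracket(J(0), J(1), z)` equals `-J(2)(z)`, `bracket(K(0), K(1), z)` equals `J(2)(z)`, `bracket(J(0), K(1), z)` equals `-K(2)(z)`, and `bracket(K(0), P0, z)` equals `P(0)(z)`.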
Using that cp0 = H , the Hamiltonian of a free particle, we obtain from
(16.4)–(16.6),
(16.7) {Ji , xj } = −εijk xk ,
(16.8) c{Ki , xj } = xi {H , xj },
(16.9) {Pi , xj } = −δij , i, j = 1, 2, 3.
These Poisson brackets exemplify that $\mathbb{R}^6$ is a phase space of a relativistic particle.

16.3. No-interaction theorem


The relativity principle imposes a very strong restriction on Hamiltonian systems: it implies that the interaction of finitely many relativistic particles is not possible! The precise statement is the following.
Theorem 16.2. Consider the Hamiltonian system of n particles with the phase space $\mathbb{R}^{6n}$, the symplectic form
\[
\omega = \sum_{a=1}^n d\mathbf{p}_a\wedge d\mathbf{r}_a,
\]
where $\mathbf{r}_a$ and $\mathbf{p}_a$ are coordinates and momenta of the a-th particle, and with the Hamiltonian function $\mathcal{H}$. Suppose that $(\mathbb{R}^{6n}, \omega, \mathcal{H})$ is a system of n relativistic particles, that is, the principle of relativity holds in the following form:
a) There exists a set of ten generators of the Poincaré Lie algebra, that is, ten functions $P_0 = \mathcal{H}/c$, $P_i$, $J_i$ and $K_i$ on $\mathbb{R}^{6n}$ with Poisson brackets (16.4)–(16.6).
b) The coordinates of the particles transform correctly under the Poincaré group: the coordinates $\mathbf{r}_a$, a = 1, . . . , n, and the generators of the Poincaré Lie algebra have Poisson brackets (16.7)–(16.9).
In addition, suppose that the system is non-degenerate,
\[
\det\left\{\frac{\partial^2\mathcal{H}}{\partial p_a^i\,\partial p_b^j}\right\} \neq 0.
\]
Then the acceleration of each particle vanishes,
\[
\{\mathcal{H}, \{\mathcal{H}, \mathbf{r}_a\}\} = 0, \qquad a = 1, \dots, n.
\]
Equivalently, there are Darboux coordinates $\tilde{\mathbf{p}}_a$ and $\mathbf{r}_a$ (the coordinates of the particles are unchanged) and constants $m_a > 0$ such that
\[
\begin{aligned}
\mathbf{P} &= -\sum_{a=1}^n \tilde{\mathbf{p}}_a, \\
\mathcal{H} &= \sum_{a=1}^n c\sqrt{\tilde{p}_a^2 + m_a^2c^2}, \\
J_i &= \sum_{a=1}^n \varepsilon_{ijk}\,x_a^j\,\tilde{p}_a^k, \\
K_i &= \sum_{a=1}^n x_a^i\sqrt{\tilde{p}_a^2 + m_a^2c^2}.
\end{aligned}
\]
The theorem is a manifestation of the fundamental fact that relativistic invariant Hamiltonian systems of interacting particles in Minkowski spacetime should have infinitely many degrees of freedom, and the interaction is described by a field theory. The examples we have seen so far are the theory of electromagnetism and a charged relativistic particle interacting with the external electromagnetic field. Another fundamental example in classical physics is the theory of gravity and a massive relativistic particle interacting with the external gravitational field.
Problem 16.1. Prove the no-interaction theorem for n = 1.
LECTURE 17

General relativity

Newton’s law of universal gravitation states that a particle with mass $m_1$ at point $\mathbf{r}_1$ attracts a particle with mass $m_2$ at point $\mathbf{r}_2$ with the force
\[
\mathbf{F}_2 = -Gm_1m_2\,\frac{\mathbf{r}_2 - \mathbf{r}_1}{|\mathbf{r}_2 - \mathbf{r}_1|^3}
\]
and $\mathbf{F}_1 = -\mathbf{F}_2$. Obviously Newton’s law is not Lorentz invariant and one needs to find a Lorentz invariant description of gravity.
The first attempt¹ was to include the theory of gravity into special relativity by assuming that the gravitational field is determined by the four-potential $A^G_\mu$. The interaction of a relativistic particle of charge e and mass m would be described by the action
\[
S = -mc\int ds - \frac{e}{c}\int A_\mu\,dx^\mu - m\int A^G_\mu\,dx^\mu.
\]
Considering the case e = 0 and using $A^G_\mu = (\varphi, 0, 0, 0)$, one gets a Lorentz invariant modification of Newton’s law of universal gravitation,
\[
\frac{d\mathbf{p}}{dt} = -m\frac{\partial\varphi}{\partial\mathbf{r}}, \qquad \mathbf{p} = \frac{m\mathbf{v}}{\sqrt{1 - \frac{v^2}{c^2}}}.
\]
However, this approach does not give a correct answer for the precession of the perihelion of Mercury.

17.1. Spacetime in general relativity


A smooth connected four-manifold M is called a Lorentzian manifold if it
carries a pseudo-Riemannian metric
\[
ds^2 = g_{\mu\nu}(x)\, dx^\mu dx^\nu
\]
with the signature (+, −, −, −) at every x ∈ M . The Minkowski space is a
non-compact Lorentzian manifold, and it is easy to see that every non-compact
manifold admits a Lorentzian metric. However, a compact manifold M admits
a Lorentzian metric if and only if its Euler characteristic vanishes. In other
words, a manifold M admits a Lorentzian metric if and only if it has a nowhere
vanishing vector field².
¹ A. Poincaré in 1905.
As in the case of the Minkowski metric, a tangent vector v ∈ Tx M is timelike,
null, or spacelike if, respectively, its squared length v · v is positive, zero, or negative. A curve
γ : [u1 , u2 ] → M is timelike if γ′(u) is timelike for all u ∈ [u1 , u2 ] and is causal
if γ′(u) is timelike or null for all u ∈ [u1 , u2 ]. A Lorentzian manifold M is
time-orientable if it admits a timelike vector field X ∈ Vec(M ), which defines a
time orientation of M . The opposite time orientation is given by the vector
field −X. Specifically, a timelike or null vector u ∈ Tx M is future-directed (or
past-directed) if u · Xx > 0 (or u · Xx < 0). A timelike curve γ : [u1 , u2 ] → M is
future-directed (or past-directed) if γ′(u) is future-directed (or past-directed)
for all u ∈ [u1 , u2 ].
Definition. A spacetime is a time-oriented Lorentzian four-manifold M .
Definition. The chronological future $I_+^M(x)$ of x ∈ M is the set of points
that can be reached from x by future-directed timelike curves. The causal future
$J_+^M(x)$ of x ∈ M is the set of points that can be reached from x by future-directed
causal curves, together with x itself. Similarly, the chronological past $I_-^M(x)$ and causal
past $J_-^M(x)$ of x ∈ M are defined by using past-directed timelike and causal
curves.
Proposition 17.1. If the spacetime M is compact, there exists a closed
timelike curve in M .
Proof. The family $\{I_+^M(x)\}_{x\in M}$ is an open covering of M . By compactness,
$M = I_+^M(x_1) \cup \dots \cup I_+^M(x_m)$, and we may assume that m is the smallest possible.
If $x_1 \in I_+^M(x_2) \cup \dots \cup I_+^M(x_m)$, then $x_1 \in I_+^M(x_k)$ for some 2 ≤ k ≤ m.
Then $I_+^M(x_1) \subseteq I_+^M(x_k)$ and we could omit $I_+^M(x_1)$ from the covering,
contradicting the minimality of m. Thus $x_1 \in I_+^M(x_1)$, so that there is a timelike
future-directed curve starting and ending at x1 . □
Since this allows for time travel, we will consider only non-compact
spacetimes. Recall that a piecewise C¹-curve in M is called inextendible if
no piecewise C¹-reparametrization of the curve can be continuously extended
beyond any of the end points of the parameter interval. A set S is called achronal
if there is no timelike curve which intersects S twice.
Definition. An achronal hypersurface Σ in M is a Cauchy hypersurface if
every inextendible causal curve intersects Σ exactly once.
Proposition 17.2. If a spacetime M admits two Cauchy hypersurfaces Σ1
and Σ2 , then Σ1 is diffeomorphic to Σ2 .
Definition. A spacetime M satisfies the causality condition if it does not
contain any closed causal curve. A spacetime M satisfies the strong causality
condition if there are no almost closed causal curves. That is, for each x ∈ M
and for each open neighborhood U of x there exists an open neighborhood
V ⊆ U of x such that no causal curve in M intersects V more than once.
Clearly the strong causality condition implies the causality condition.
² Indeed, according to a theorem by Steenrod, a compact manifold admits an everywhere
defined, continuous quadratic form of signature k if and only if it admits a continuous field of
tangent k-planes.
Definition. A spacetime M is globally hyperbolic if it satisfies the strong
causality condition and for all x, y ∈ M the intersection $J_+^M(x) \cap J_-^M(y)$ is
compact.
The following fundamental result holds³. It describes the structure of globally
hyperbolic spacetimes explicitly: they are foliated by smooth spacelike
Cauchy hypersurfaces.
Theorem 17.1. Let M be a spacetime. The following are equivalent.
(1) M is globally hyperbolic.
(2) There exists a Cauchy hypersurface in M .
(3) M is isometric to R × Σ with the Lorentzian metric $\beta\,dt^2 - \gamma_t$, where β
is a smooth positive function on M , $\gamma_t$ is a Riemannian metric on
Σ depending smoothly on t ∈ R, and each {t} × Σ is a smooth spacelike
Cauchy hypersurface in M .
Corollary 17.2. On every globally hyperbolic spacetime M there exists a
smooth function h : M → R whose gradient ∇h ∈ Vect(M ) is timelike and
future-directed and all level sets of h are spacelike Cauchy hypersurfaces.
Such a function h is called a Cauchy time function, and its gradient ∇h is
defined by
\[
\nabla h = g^{\mu\nu}\,\frac{\partial h}{\partial x^\mu}\,\frac{\partial}{\partial x^\nu},
\]
where $g^{\mu\nu}$ is the inverse matrix of $g_{\mu\nu}$. In fact⁴, for every Cauchy hypersurface Σ in
M there is a Cauchy time function h such that $\Sigma = h^{-1}(0)$.
From the physics point of view, the proper time τ along a timelike curve γ is
defined by
\[
\tau(u) = \frac{1}{c}\int_{u_1}^{u} ds,
\]
where the integration goes over γ. It is natural to consider only those coordinates
$x^\mu$ for which $x^0$ plays the role of a time variable, and $x^1, x^2, x^3$ are space
coordinates. Specifically, two events occurring at the same point $(x^1, x^2, x^3)$ in
space should be connected by a timelike curve $\gamma(u) = (x^0(u), x^1, x^2, x^3)$. This
implies that $g_{00} > 0$ and the proper time between these two events is
\[
\tau = \frac{1}{c}\int \sqrt{g_{00}}\; dx^0 .
\]
³ Bernal, A.N., Sánchez, M.: Smoothness of time functions and the metric splitting of
globally hyperbolic spacetimes, Commun. Math. Phys. 257 (2005), 43.
⁴ Bernal, A.N., Sánchez, M.: Further results on the smoothability of Cauchy hypersurfaces
and Cauchy time functions, Lett. Math. Phys. 77 (2006), 183.
To determine the metric $dl^2 = \gamma_{ij}\,dx^i dx^j$ in space induced by $ds^2$ we cannot
simply put $dx^0 = 0$, since the proper time at different points in space depends
differently on the coordinate $x^0$. However,
\[
ds^2 = g_{00}(dx^0)^2 + 2g_{0i}\,dx^0 dx^i + g_{ij}\,dx^i dx^j
     = g_{00}\left(dx^0 + \frac{g_{0i}}{g_{00}}\,dx^i\right)^2 - \gamma_{ij}\,dx^i dx^j,
\]
where
\[
\gamma_{ij} = -g_{ij} + \frac{g_{0i}\,g_{0j}}{g_{00}}, \qquad i, j = 1, 2, 3 \tag{17.1}
\]
is a three-dimensional metric tensor. Since $g_{00} > 0$ it is a Riemannian metric
tensor. It depends on $x^0$, so that the distance in real space depends on time.
The relation
\[
dx^0 + \frac{g_{0i}}{g_{00}}\,dx^i = 0
\]
can be integrated over any curve in space to define $x^0$ along the curve. This
allows one to synchronize the clocks in general relativity along any curve in space.
However, this synchronization depends on the curve connecting two points in
space. Theorem 17.1 asserts that for a globally hyperbolic spacetime one can
choose coordinates such that the $g_{0i}$ vanish, and one can synchronize clocks over all
space. The corresponding coordinates (reference system in physics terminology)
are called synchronous.
It is easy to see from (17.1) that
\[
-\gamma_{ij}\, g^{jk} = \delta_i^k .
\]
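The identity above can be checked on a concrete example. The following is a minimal sketch in exact rational arithmetic, not part of the original notes: the sample metric `g` is a hypothetical symmetric matrix chosen only for illustration, and the helper names `inverse` and `spatial_metric` are assumptions of this sketch.

```python
from fractions import Fraction


def inverse(m):
    """Invert a square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(m)
    # Augment m with the identity matrix.
    a = [list(row) + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        # Bring a row with a non-zero pivot into position and normalize it.
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [x / p for x in a[col]]
        # Eliminate the pivot column from all other rows.
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]


def spatial_metric(g):
    """gamma_ij = -g_ij + g_0i g_0j / g_00 for i, j = 1, 2, 3, as in (17.1)."""
    return [[-g[i][j] + g[0][i] * g[0][j] / g[0][0] for j in range(1, 4)]
            for i in range(1, 4)]


# Hypothetical sample metric (symmetric, invertible, with g_00 > 0).
g = [[Fraction(x) for x in row] for row in (
    ("3/2", "1/5", "-1/7", "1/3"),
    ("1/5", "-1", "1/4", "0"),
    ("-1/7", "1/4", "-2", "1/6"),
    ("1/3", "0", "1/6", "-5/4"),
)]

g_inv = inverse(g)
gamma = spatial_metric(g)

# product[i][k] = -gamma_ij g^{jk}, using the spatial block of the inverse metric.
product = [[-sum(gamma[i][j] * g_inv[j + 1][k + 1] for j in range(3))
            for k in range(3)] for i in range(3)]
identity = [[Fraction(int(i == k)) for k in range(3)] for i in range(3)]
print(product == identity)
```

Because the arithmetic is exact, the comparison tests the algebraic identity itself rather than a floating-point approximation of it.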

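The conditions that $g_{00} > 0$ and that $\gamma_{ij}$ is a positive-definite 3 × 3 matrix are equivalent to
\[
g_{00} > 0, \qquad
\det\begin{pmatrix} g_{00} & g_{01}\\ g_{10} & g_{11} \end{pmatrix} < 0, \qquad
\det\begin{pmatrix} g_{00} & g_{01} & g_{02}\\ g_{10} & g_{11} & g_{12}\\ g_{20} & g_{21} & g_{22} \end{pmatrix} > 0
\]
and
\[
g = \det\begin{pmatrix}
g_{00} & g_{01} & g_{02} & g_{03}\\
g_{10} & g_{11} & g_{12} & g_{13}\\
g_{20} & g_{21} & g_{22} & g_{23}\\
g_{30} & g_{31} & g_{32} & g_{33}
\end{pmatrix} < 0 .
\]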
Physically these conditions should hold for any choice of coordinates on M which
can be realized with the aid of “physical bodies”.

17.2. Particle in a gravitational field


A gravitational field is a change of the metric of the spacetime and is described
by the metric tensor gµν (x). The action of a relativistic particle of mass m in a
gravitational field has the same form as in Lecture 15,
\[
S(\gamma) = -mc\int ds = -mc\int \sqrt{g_{\mu\nu}\,u^\mu u^\nu}\; ds, \qquad u^\mu = \frac{dx^\mu}{ds} .
\]
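In other words, the action functional is −mc times the length functional in the
pseudo-Riemannian geometry. Correspondingly, the Euler-Lagrange equations
are the geodesic equations with respect to the natural parameter,
\[
\frac{d^2 x^\lambda}{ds^2} + \Gamma^\lambda_{\mu\nu}\,\frac{dx^\mu}{ds}\,\frac{dx^\nu}{ds} = 0,
\]
where
\[
\Gamma^\lambda_{\mu\nu} = \frac{1}{2}\, g^{\lambda\sigma}\left(
\frac{\partial g_{\mu\sigma}}{\partial x^\nu} + \frac{\partial g_{\nu\sigma}}{\partial x^\mu} - \frac{\partial g_{\mu\nu}}{\partial x^\sigma}\right) \tag{17.2}
\]
are the Christoffel symbols. A free particle in a gravitational field moves along
geodesics.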

17.3. The Riemann tensor


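Recall that the metric gµν (x) on the spacetime M determines a Levi-Civita
connection⁵ in the tangent bundle T M . Explicitly it is given by
\[
\nabla = d + A, \quad\text{where}\quad A = A_\mu\, dx^\mu .
\]
Here Aµ (x) are linear operators in Tx M which in the basis $\partial/\partial x^\mu$ are given by
the matrices
\[
(A_\mu)^\lambda_\nu = \Gamma^\lambda_{\nu\mu} . \tag{17.3}
\]
Thus the derivative of a (1, 0)-tensor, a vector field $V = v^\mu\,\partial/\partial x^\mu$, in the direction
$\partial/\partial x^\mu$ is given by
\[
(\nabla_\mu V)^\lambda = \frac{\partial v^\lambda}{\partial x^\mu} + \Gamma^\lambda_{\nu\mu}\, v^\nu,
\]
while the derivative of a (0, 1)-tensor, a 1-form $\theta = a_\mu\,dx^\mu$, is
\[
(\nabla_\mu \theta)_\lambda = \frac{\partial a_\lambda}{\partial x^\mu} - \Gamma^\nu_{\lambda\mu}\, a_\nu .
\]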
The directional derivative ∇µ of an arbitrary (p, q)-tensor is defined similarly. We
have
\[
\nabla_\lambda g_{\mu\nu} = 0 \quad\text{and}\quad \nabla_\lambda g^{\mu\nu} = 0 . \tag{17.4}
\]

The curvature of the connection ∇ is F = dA + A ∧ A, a 2-form with values
in End T M (see Sect. 9.1 in Lecture 9). We have
\[
F = \sum_{\mu<\nu} F_{\mu\nu}\, dx^\mu \wedge dx^\nu,
\]
where
\[
F_{\mu\nu} = \frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu} + [A_\mu, A_\nu] .
\]
⁵ A metric connection with no torsion.
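On 2-forms B with values in End T M the connection ∇ acts by
\[
\nabla B = dB + A \wedge B - B \wedge A,
\]
which gives the Bianchi identity
\[
\nabla F = 0
\]
for the curvature 2-form. Equivalently,
\[
\nabla_\lambda F_{\mu\nu} + \nabla_\mu F_{\nu\lambda} + \nabla_\nu F_{\lambda\mu} = 0 .
\]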

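Using (17.3), we obtain the following formula for the Riemann curvature
tensor $R^\lambda{}_{\rho\mu\nu} = (F_{\mu\nu})^\lambda_\rho$,
\[
R^\lambda{}_{\rho\mu\nu} = \frac{\partial \Gamma^\lambda_{\rho\nu}}{\partial x^\mu} - \frac{\partial \Gamma^\lambda_{\rho\mu}}{\partial x^\nu}
+ \Gamma^\lambda_{\sigma\mu}\Gamma^\sigma_{\rho\nu} - \Gamma^\lambda_{\sigma\nu}\Gamma^\sigma_{\rho\mu} . \tag{17.5}
\]
The Bianchi identity for the Riemann tensor has the form
\[
\nabla_\sigma R^\lambda{}_{\rho\mu\nu} + \nabla_\nu R^\lambda{}_{\rho\sigma\mu} + \nabla_\mu R^\lambda{}_{\rho\nu\sigma} = 0 . \tag{17.6}
\]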

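The Ricci curvature
\[
R_{\mu\nu} = R^\lambda{}_{\mu\lambda\nu}
\]
is the trace of the Riemann tensor and is given explicitly by
\[
R_{\mu\nu} = \frac{\partial \Gamma^\lambda_{\mu\nu}}{\partial x^\lambda} - \frac{\partial \Gamma^\lambda_{\mu\lambda}}{\partial x^\nu}
+ \Gamma^\lambda_{\mu\nu}\Gamma^\sigma_{\lambda\sigma} - \Gamma^\sigma_{\mu\lambda}\Gamma^\lambda_{\sigma\nu} . \tag{17.7}
\]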
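It follows from (17.2) that
\[
\begin{aligned}
\Gamma^\lambda_{\mu\lambda} &= \frac{1}{2}\, g^{\lambda\sigma}\left(
\frac{\partial g_{\mu\sigma}}{\partial x^\lambda} + \frac{\partial g_{\lambda\sigma}}{\partial x^\mu} - \frac{\partial g_{\mu\lambda}}{\partial x^\sigma}\right)\\
&= \frac{1}{2}\, g^{\lambda\sigma}\,\frac{\partial g_{\sigma\lambda}}{\partial x^\mu}\\
&= \frac{1}{2g}\,\frac{\partial g}{\partial x^\mu} = \frac{\partial \log\sqrt{-g}}{\partial x^\mu} .
\end{aligned}
\]
Thus the Ricci tensor is symmetric, Rµν = Rνµ , and determines a symmetric
bilinear form Rµν dxµ dxν on the tangent space.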
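Finally, the scalar curvature R is the trace of the Ricci curvature tensor,
\[
R = g^{\mu\nu} R_{\mu\nu} .
\]
Contracting λ and ν in (17.6) and using the antisymmetry of the Riemann tensor in its last two indices, we get
\[
\nabla_\lambda R^\lambda{}_{\rho\sigma\mu} + \nabla_\mu R_{\rho\sigma} - \nabla_\sigma R_{\rho\mu} = 0,
\]
and using (17.4) to raise the index ρ we obtain
\[
\nabla_\lambda R^{\lambda\rho}{}_{\sigma\mu} + \nabla_\mu R^\rho_\sigma - \nabla_\sigma R^\rho_\mu = 0 .
\]
Finally contracting µ and ρ, and using $R^{\lambda\rho}{}_{\sigma\rho} = R^\lambda_\sigma$, we get
\[
2\nabla_\mu R^\mu_\sigma - \nabla_\sigma R = 0,
\]
or
\[
\nabla_\mu\left(R^\mu_\nu - \frac{1}{2}\,\delta^\mu_\nu R\right) = 0 . \tag{17.8}
\]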
LECTURE 18

Einstein equations – I

18.1. Einstein field equations


In general relativity the Lorentzian metric gµν of the spacetime M satisfies the
Einstein equations
\[
R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R = \frac{8\pi G}{c^4}\, T_{\mu\nu},
\]
where Rµν is the Ricci curvature, R is the scalar curvature and Tµν is the
stress-energy tensor of matter. Up to a normalization factor (fixed in Section 18.3), it is defined as
\[
T_{\mu\nu} = \frac{\delta S_{\mathrm{matter}}}{\delta g^{\mu\nu}} .
\]
It follows from the Bianchi identity (17.8) that the Einstein equations necessarily imply
\[
\nabla_\mu T^\mu_\nu = 0, \qquad \nu = 0, 1, 2, 3 .
\]
These are conservation laws in general relativity.
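Rewriting the Einstein equations in the form
\[
R^\mu_\nu - \frac{1}{2}\,\delta^\mu_\nu R = \frac{8\pi G}{c^4}\, T^\mu_\nu
\]
and taking traces we obtain
\[
R = -\frac{8\pi G}{c^4}\, T,
\]
where $T = T^\mu_\mu$. Thus the Einstein equations can also be written as
\[
R^\mu_\nu = \frac{8\pi G}{c^4}\left(T^\mu_\nu - \frac{1}{2}\,\delta^\mu_\nu T\right) . \tag{18.1}
\]
In particular, the empty space Einstein equations reduce to
\[
R_{\mu\nu} = 0 .
\]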

18.2. Particle in a weak gravitational field


Here we solve the geodesic equation and the Einstein equations in the case of a weak
gravitational field. Namely, suppose that $M = \mathbb{R}^4$ and
\[
g_{\mu\nu}(x) = \eta_{\mu\nu} + \frac{1}{c^2}\, g^{(2)}_{\mu\nu}(x) + O\!\left(\frac{1}{c^3}\right), \tag{18.2}
\]
where ηµν is the Minkowski metric. It is also assumed that these asymptotics can
be differentiated with respect to xµ .
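A timelike geodesic is slow if $|\dot{x}^i(t)| \ll c$, where i = 1, 2, 3 and $t = x^0/c$. Since
\[
d\tau = \frac{1}{c}\sqrt{g_{\mu\nu}\,\dot{x}^\mu \dot{x}^\nu}\; dt = \left(1 + O\!\left(\frac{1}{c^2}\right)\right) dt,
\]
the equation for a slow geodesic takes the form
\[
\frac{d^2 x^\lambda}{dt^2} + \Gamma^\lambda_{\mu\nu}\,\frac{dx^\mu}{dt}\,\frac{dx^\nu}{dt} = O\!\left(\frac{1}{c}\right) .
\]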

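It follows from (18.2) that
\[
\Gamma^0_{00} = O\!\left(\frac{1}{c^3}\right), \qquad
\Gamma^i_{00} = \frac{1}{2c^2}\,\frac{\partial g^{(2)}_{00}}{\partial x^i} + O\!\left(\frac{1}{c^3}\right),
\]
and all other Christoffel symbols are of order $O(1/c^2)$. Since $dx^0/dt = c$, putting
\[
g^{(2)}_{00}(x) = 2\varphi(x^0, \mathbf{r})
\]
we see that up to the order O(1/c) the geodesic equation becomes Newton’s
equation
\[
\ddot{\mathbf{r}} = -\frac{\partial\varphi}{\partial\mathbf{r}},
\]
and the force acting on a particle is $\mathbf{F} = -m\,\partial\varphi/\partial\mathbf{r}$.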
To find the potential ϕ we need to use the Einstein equations. The energy-momentum
tensor of a macroscopic body which consists of slowly moving particles
is given by
\[
T^{\mu\nu} = M(x)\, c^2\, u^\mu u^\nu,
\]
where M (x) is the mass density of the body and $u^\mu$ is its four-velocity vector.
If the macroscopic motion of the body is slow, we can put $u^0 = 1$ and $u^i = 0$,
i = 1, 2, 3. Thus the energy-momentum tensor takes the form
\[
T^\mu_\nu = M c^2\, \delta^\mu_0\, \delta^0_\nu .
\]

It follows from formula (17.7) in Lecture 17 that in the weak gravitational field
$R^\mu_\nu = O(1/c^2)$ and the only nontrivial contribution to the Einstein equation (18.1)
is
\[
R^0_0 = \frac{4\pi G}{c^4}\, T = \frac{4\pi G M}{c^2} .
\]
Since
\[
R^0_0 = \frac{\partial \Gamma^i_{00}}{\partial x^i} + O\!\left(\frac{1}{c^3}\right)
      = \frac{1}{c^2}\,\nabla^2\varphi + O\!\left(\frac{1}{c^3}\right),
\]
the Einstein equations for the weak gravitational field reduce to the Poisson equation
\[
\nabla^2 \varphi = 4\pi G M
\]
for the gravitational potential. Namely,
\[
\varphi(\mathbf{r}) = -G\int \frac{M(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\; d^3\mathbf{r}',
\]
and in the case of a point mass, $M(\mathbf{r}') = M\,\delta(\mathbf{r}')$, we obtain the Newtonian potential
\[
\varphi(\mathbf{r}) = -\frac{GM}{r} .
\]
Thus the force acting on a slow particle of mass m in a weak gravitational
field generated by a particle of mass M is the Newtonian force!
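To get a feeling for how weak such a field is, the sketch below evaluates the Newtonian potential and the corresponding correction to $g_{00}$ from (18.2) at the surface of the Earth. It is an illustration only: the numerical constants are rounded reference values assumed here, and the function names are not from the notes.

```python
# Rounded physical constants in SI units (assumed reference values).
G = 6.674e-11           # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8             # speed of light, m/s
EARTH_MASS = 5.972e24   # kg
EARTH_RADIUS = 6.371e6  # m


def newtonian_potential(mass_kg, r_m):
    """phi(r) = -G M / r for a point mass M."""
    return -G * mass_kg / r_m


def g00(mass_kg, r_m):
    """Weak-field approximation g_00 = 1 + 2 phi / c^2, i.e. (18.2) with g^(2)_00 = 2 phi."""
    return 1.0 + 2.0 * newtonian_potential(mass_kg, r_m) / C**2


phi = newtonian_potential(EARTH_MASS, EARTH_RADIUS)
deviation = g00(EARTH_MASS, EARTH_RADIUS) - 1.0
# Magnitude of the Newtonian acceleration, |d phi / dr| = G M / r^2.
surface_gravity = -phi / EARTH_RADIUS

print(f"phi = {phi:.3e} m^2/s^2")
print(f"g00 - 1 = {deviation:.2e}")
print(f"|dphi/dr| = {surface_gravity:.2f} m/s^2")
```

With these inputs the deviation of $g_{00}$ from 1 is of order $10^{-9}$, while the gradient of ϕ reproduces the familiar surface acceleration of roughly 9.8 m/s².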

18.3. Hilbert action


On the space $\mathcal{M}$ of smooth Lorentzian metrics on the spacetime M consider
the celebrated Hilbert (or Hilbert-Einstein) functional
\[
S(g_{\mu\nu}) = \int R\,\sqrt{-g}\; d^4x,
\]
where R is the scalar curvature of the metric $ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu \in \mathcal{M}$, and
$\sqrt{-g}\,d^4x$ is the corresponding volume form on M . Here the integration goes over a
domain D in M (usually bounded by two spacelike Cauchy hypersurfaces) and
it is assumed that all metrics in $\mathcal{M}$ have the same boundary value on ∂D. In
addition, the normal derivatives of gµν on ∂D are fixed.
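Proposition 18.1. Let $u_{\mu\nu} = \delta g_{\mu\nu}$ be a tangent vector to $\mathcal{M}$ at a point
$g_{\mu\nu} \in \mathcal{M}$, and let $u^{\mu\nu} = \delta g^{\mu\nu} = -g^{\mu\alpha} g^{\nu\beta} u_{\alpha\beta}$ be the corresponding
variation of the inverse metric. Then the Gâteaux derivative of the Hilbert functional
S in the direction u is given by
\[
\delta_u S = \int_D \left(R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R\right) u^{\mu\nu}\,\sqrt{-g}\; d^4x .
\]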
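Proof. Putting
\[
\delta S = \frac{d}{d\varepsilon}\bigg|_{\varepsilon=0} S(g_{\mu\nu} + \varepsilon\,\delta g_{\mu\nu})
\]
we have
\[
\delta S = \int_D \left(\delta g^{\mu\nu} R_{\mu\nu} + g^{\mu\nu}\,\delta R_{\mu\nu}\right)\sqrt{-g}\; d^4x + \int_D R\;\delta(\sqrt{-g})\; d^4x .
\]
To compute δRµν (x) we use geodesic normal coordinates at x ∈ M to obtain
\[
\delta R_{\mu\nu} = \frac{\partial\,\delta\Gamma^\sigma_{\mu\nu}}{\partial x^\sigma} - \frac{\partial\,\delta\Gamma^\sigma_{\mu\sigma}}{\partial x^\nu} .
\]
Since $\delta\Gamma^\lambda_{\mu\nu}$ is a (1, 2)-tensor, we get the formula
\[
\delta R_{\mu\nu} = \nabla_\sigma\,\delta\Gamma^\sigma_{\mu\nu} - \nabla_\nu\,\delta\Gamma^\sigma_{\mu\sigma},
\]
called the Palatini identity. Since $\nabla_\sigma g^{\mu\nu} = 0$, we obtain from the Palatini identity
\[
g^{\mu\nu}\,\delta R_{\mu\nu} = \nabla_\sigma\!\left(g^{\mu\nu}\,\delta\Gamma^\sigma_{\mu\nu}\right) - \nabla_\nu\!\left(g^{\mu\nu}\,\delta\Gamma^\sigma_{\mu\sigma}\right),
\]
so that
\[
g^{\mu\nu}\,\delta R_{\mu\nu} = \nabla_\sigma W^\sigma, \quad\text{where}\quad
W^\sigma = g^{\mu\nu}\,\delta\Gamma^\sigma_{\mu\nu} - g^{\mu\sigma}\,\delta\Gamma^\rho_{\mu\rho} .
\]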
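Since
\[
\Gamma^\nu_{\mu\nu} = \frac{\partial}{\partial x^\mu}\log(\sqrt{-g}),
\]
we obtain
\[
\nabla_\mu W^\mu = \frac{\partial W^\mu}{\partial x^\mu} + \Gamma^\mu_{\nu\mu} W^\nu
= \frac{1}{\sqrt{-g}}\,\frac{\partial}{\partial x^\mu}\left(\sqrt{-g}\; W^\mu\right) .
\]
Thus we have
\[
g^{\mu\nu}\,\delta R_{\mu\nu} = \frac{1}{\sqrt{-g}}\,\frac{\partial}{\partial x^\mu}\left(\sqrt{-g}\; W^\mu\right) . \tag{18.3}
\]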

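To find $\delta(\sqrt{-g})$, we use
\[
\frac{\partial g}{\partial g_{\mu\nu}} = G^{\mu\nu} = g\, g^{\mu\nu},
\]
where $G^{\mu\nu}$ is the cofactor of $g_{\mu\nu}$, so that
\[
\delta g = \frac{\partial g}{\partial g_{\mu\nu}}\,\delta g_{\mu\nu} = g\, g^{\mu\nu}\,\delta g_{\mu\nu} = -g\, g_{\mu\nu}\,\delta g^{\mu\nu}
\]
and we obtain
\[
\delta(\sqrt{-g}) = -\frac{1}{2}\sqrt{-g}\; g_{\mu\nu}\,\delta g^{\mu\nu} . \tag{18.4}
\]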
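Substituting (18.3)–(18.4) into the formula for δS we obtain
\[
\begin{aligned}
\delta S &= \int_D \left(R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R\right) u^{\mu\nu}\,\sqrt{-g}\; d^4x
 + \int_D \frac{\partial}{\partial x^\mu}\left(\sqrt{-g}\; W^\mu\right) d^4x\\
&= \int_D \left(R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R\right) u^{\mu\nu}\,\sqrt{-g}\; d^4x .
\end{aligned}
\]
Here we used the Stokes theorem and the condition that $\delta\Gamma^\lambda_{\mu\nu} = 0$ on ∂D, which
follows from our assumptions on the space $\mathcal{M}$ of Lorentzian metrics on M . □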
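Remark. ‘Tautologically’ computing the variation of the Hilbert-Einstein action,
we obtain the relation
\[
R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R = \frac{1}{\sqrt{-g}}\left\{
\frac{\partial(\sqrt{-g}\,R)}{\partial g^{\mu\nu}}
- \frac{\partial}{\partial x^\lambda}\,\frac{\partial(\sqrt{-g}\,R)}{\partial\,\dfrac{\partial g^{\mu\nu}}{\partial x^\lambda}}\right\} .
\]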
Remark. If one fixes only the values of the metric tensor gµν on ∂D, then δS
will contain a boundary term. It is possible to add to the Hilbert-Einstein
functional S the so-called Gibbons-Hawking-York boundary term so that δS
is still given by Hilbert’s formula. This boundary term is the integral over ∂D
of the trace of the second fundamental form with respect to the volume form of the induced
metric on ∂D.
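Denote
\[
S_{\mathrm{gravity}} = -\frac{c^3}{16\pi G}\, S(g) .
\]
The total action of the gravitational field in the presence of matter with the
density function Λ(x), depending only on gµν and its first derivatives, is given
by
\[
S = S_{\mathrm{gravity}} + S_{\mathrm{matter}},
\]
where
\[
S_{\mathrm{matter}} = \frac{1}{c}\int \Lambda\,\sqrt{-g}\; d^4x .
\]
Defining the symmetric stress-energy tensor by
\[
T_{\mu\nu} = \frac{2c}{\sqrt{-g}}\,\frac{\delta S_{\mathrm{matter}}}{\delta g^{\mu\nu}}
= \frac{2}{\sqrt{-g}}\left\{
\frac{\partial(\sqrt{-g}\,\Lambda)}{\partial g^{\mu\nu}}
- \frac{\partial}{\partial x^\lambda}\,\frac{\partial(\sqrt{-g}\,\Lambda)}{\partial\,\dfrac{\partial g^{\mu\nu}}{\partial x^\lambda}}\right\},
\]
from δS = 0 we obtain the Einstein equations
\[
R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R = \frac{8\pi G}{c^4}\, T_{\mu\nu} .
\]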
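When Λ depends only on gµν , the formula for the stress-energy tensor simplifies to
\[
T_{\mu\nu} = 2\,\frac{\partial \Lambda}{\partial g^{\mu\nu}} - g_{\mu\nu}\,\Lambda .
\]
Thus for the electromagnetic field
\[
\Lambda = -\frac{1}{16\pi}\, F_{\alpha\beta} F^{\alpha\beta} = -\frac{1}{16\pi}\, F_{\alpha\beta} F_{\gamma\delta}\, g^{\alpha\gamma} g^{\beta\delta}
\]
and we obtain
\[
T_{\mu\nu} = \frac{1}{4\pi}\left(-F_{\mu\lambda} F_{\nu\sigma}\, g^{\lambda\sigma} + \frac{1}{4}\, g_{\mu\nu}\, F_{\alpha\beta} F^{\alpha\beta}\right) .
\]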
Up to the factor 1/4π this is formula (11.2) in Lecture 11. For a macroscopic
body the energy-momentum tensor is
\[
T_{\mu\nu} = (p + \varepsilon)\, u_\mu u_\nu - p\, g_{\mu\nu},
\]
where p is the pressure and ε is the energy density of the body.
For a complete determination of the distribution and motion of the matter
one must add to the Einstein equations the equation of state of the matter, that is,
an equation relating the pressure, density and temperature. This equation must be
given along with the Einstein equations.
LECTURE 19

Einstein equations – II

19.1. Palatini formalism


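In this approach to general relativity we consider the metric tensor gµν on
the spacetime M and an affine torsion-free connection $\Gamma^\lambda_{\mu\nu}$ on T M as independent
fields (due to the condition $\Gamma^\lambda_{\mu\nu} = \Gamma^\lambda_{\nu\mu}$ there are 50 = 10 + 40 independent
functions). Consider the action
\[
S_P = \int_M g^{\mu\nu} R_{\mu\nu}\,\sqrt{-g}\; d^4x,
\]
where Rµν is given by formula (17.7) in Lecture 17,
\[
R_{\mu\nu} = \frac{\partial \Gamma^\lambda_{\mu\nu}}{\partial x^\lambda} - \frac{\partial \Gamma^\lambda_{\mu\lambda}}{\partial x^\nu}
+ \Gamma^\lambda_{\mu\nu}\Gamma^\sigma_{\lambda\sigma} - \Gamma^\sigma_{\mu\lambda}\Gamma^\lambda_{\sigma\nu} .
\]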
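Its variation with respect to $\Gamma^\lambda_{\mu\nu}$ is still given by the Palatini identity
\[
\delta R_{\mu\nu} = \nabla_\lambda(\delta\Gamma^\lambda_{\mu\nu}) - \nabla_\nu(\delta\Gamma^\lambda_{\mu\lambda}),
\]
whereas the variation of $\sqrt{-g}$ is given by formula (18.4) in Lecture 18,
\[
\delta(\sqrt{-g}) = -\frac{1}{2}\sqrt{-g}\; g_{\mu\nu}\,\delta g^{\mu\nu} .
\]
Indeed,
\[
\begin{aligned}
\delta R_{\mu\nu} &= \frac{\partial\,\delta\Gamma^\lambda_{\mu\nu}}{\partial x^\lambda} - \frac{\partial\,\delta\Gamma^\lambda_{\mu\lambda}}{\partial x^\nu}
+ \delta\Gamma^\lambda_{\mu\nu}\,\Gamma^\sigma_{\lambda\sigma} + \Gamma^\lambda_{\mu\nu}\,\delta\Gamma^\sigma_{\lambda\sigma}
- \delta\Gamma^\sigma_{\mu\lambda}\,\Gamma^\lambda_{\sigma\nu} - \Gamma^\sigma_{\lambda\mu}\,\delta\Gamma^\lambda_{\sigma\nu}\\
&= \left(\frac{\partial\,\delta\Gamma^\lambda_{\mu\nu}}{\partial x^\lambda} + \Gamma^\sigma_{\lambda\sigma}\,\delta\Gamma^\lambda_{\mu\nu}
- \Gamma^\sigma_{\lambda\mu}\,\delta\Gamma^\lambda_{\sigma\nu} - \Gamma^\lambda_{\sigma\nu}\,\delta\Gamma^\sigma_{\mu\lambda}\right)
- \left(\frac{\partial\,\delta\Gamma^\lambda_{\mu\lambda}}{\partial x^\nu} - \Gamma^\lambda_{\mu\nu}\,\delta\Gamma^\sigma_{\lambda\sigma}\right)\\
&= \nabla_\lambda(\delta\Gamma^\lambda_{\mu\nu}) - \nabla_\nu(\delta\Gamma^\lambda_{\mu\lambda}) .
\end{aligned}
\]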
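Denoting $R = g^{\mu\nu} R_{\mu\nu}$ and using Stokes’ theorem we obtain
\[
\begin{aligned}
\delta S_P &= \int_M \left(R_{\mu\nu}\,\delta g^{\mu\nu} + g^{\mu\nu}\,\delta R_{\mu\nu} + R\,\frac{\delta(\sqrt{-g})}{\sqrt{-g}}\right)\sqrt{-g}\; d^4x\\
&= \int_M \left(R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R\right)\delta g^{\mu\nu}\,\sqrt{-g}\; d^4x + \int_M g^{\mu\nu}\,\delta R_{\mu\nu}\,\sqrt{-g}\; d^4x\\
&= \int_M \left(\left(R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R\right)\delta g^{\mu\nu} + Q^{\mu\nu}_\lambda\,\delta\Gamma^\lambda_{\mu\nu}\right)\sqrt{-g}\; d^4x,
\end{aligned}
\]
where
\[
Q^{\mu\nu}_\lambda = -\frac{1}{\sqrt{-g}}\,\frac{\partial(\sqrt{-g}\, g^{\mu\nu})}{\partial x^\lambda}
+ g^{\mu\nu}\,\Gamma^\sigma_{\lambda\sigma} - g^{\mu\sigma}\,\Gamma^\nu_{\lambda\sigma} - g^{\nu\sigma}\,\Gamma^\mu_{\lambda\sigma}
+ \delta^\nu_\lambda\left(\frac{1}{\sqrt{-g}}\,\frac{\partial(\sqrt{-g}\, g^{\mu\sigma})}{\partial x^\sigma} + g^{\rho\sigma}\,\Gamma^\mu_{\rho\sigma}\right) .
\]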

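Thus the equation $\delta S_P = 0$ yields
\[
R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R = 0 \quad\text{and}\quad Q^{\mu\nu}_\lambda = 0 .
\]
Using
\[
\frac{\partial\sqrt{-g}}{\partial x^\lambda} = -\frac{1}{2}\sqrt{-g}\; g_{\mu\nu}\,\frac{\partial g^{\mu\nu}}{\partial x^\lambda}
\]
and the definition of the covariant derivative,
\[
\nabla_\lambda g^{\mu\nu} = \frac{\partial g^{\mu\nu}}{\partial x^\lambda} + \Gamma^\mu_{\lambda\sigma}\, g^{\sigma\nu} + \Gamma^\nu_{\lambda\sigma}\, g^{\mu\sigma},
\]
we can rewrite the equation $Q^{\mu\nu}_\lambda = 0$ as
\[
-\nabla_\lambda g^{\mu\nu} + \frac{1}{2}\, g^{\mu\nu} g_{\sigma\rho}\,\nabla_\lambda g^{\sigma\rho}
+ \delta^\nu_\lambda\left(\nabla_\sigma g^{\mu\sigma} - \frac{1}{2}\, g^{\mu\alpha} g_{\sigma\rho}\,\nabla_\alpha g^{\sigma\rho}\right) = 0 . \tag{19.1}
\]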

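Equation (19.1) has free indices λ, µ and ν. Putting λ = ν and summing
over ν gives
\[
-\nabla_\nu g^{\mu\nu} + \frac{1}{2}\, g^{\mu\nu} g_{\sigma\rho}\,\nabla_\nu g^{\sigma\rho}
+ 4\left(\nabla_\sigma g^{\mu\sigma} - \frac{1}{2}\, g^{\mu\alpha} g_{\sigma\rho}\,\nabla_\alpha g^{\sigma\rho}\right) = 0,
\]
whence
\[
\nabla_\nu g^{\mu\nu} = \frac{1}{2}\, g^{\mu\nu} g_{\sigma\rho}\,\nabla_\nu g^{\sigma\rho} .
\]
Substituting this formula into (19.1) gives
\[
\nabla_\lambda g^{\mu\nu} = \frac{1}{2}\, g^{\mu\nu} g_{\sigma\rho}\,\nabla_\lambda g^{\sigma\rho} . \tag{19.2}
\]
Contracting (19.2) with $g_{\mu\nu}$ and using $g_{\mu\nu} g^{\mu\nu} = 4$ yields
\[
g_{\sigma\rho}\,\nabla_\lambda g^{\sigma\rho} = 0,
\]
and putting it back into (19.2) we finally obtain
\[
\nabla_\lambda g^{\mu\nu} = 0 .
\]
This shows that ∇ is the Levi-Civita connection. Thus in the Palatini formalism
the equations (17.2) for the Christoffel symbols follow from the principle of
least action.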

19.2. The Schwarzschild solution


For the case of a static spherically symmetric metric in empty space we
consider the following ansatz
\[
ds^2 = g_{00}(r)\, c^2 dt^2 - g_{11}(r)\, dr^2 - r^2\left(d\theta^2 + \sin^2\theta\; d\varphi^2\right),
\]
where we are using spherical coordinates
\[
x = r\sin\theta\cos\varphi, \qquad y = r\sin\theta\sin\varphi, \qquad z = r\cos\theta .
\]

It describes the gravitational field outside a spherical mass, on the assumption
that the electric charge and the angular momentum of the mass are
both zero. Computing $\Gamma^\lambda_{\mu\nu}$, where $x^0 = ct$, $x^1 = r$, $x^2 = \theta$, $x^3 = \varphi$, and solving
Rµν = 0 we obtain
\[
g_{00}(r) = 1 - \frac{a}{r}, \qquad g_{11}(r) = \frac{1}{1 - \dfrac{a}{r}},
\]
where a is a constant. Thus
\[
ds^2 = \left(1 - \frac{a}{r}\right) c^2 dt^2 - \frac{dr^2}{1 - \dfrac{a}{r}} - r^2 d\Omega^2,
\]
where $d\Omega^2$ is the induced metric on $S^2 \subset \mathbb{R}^3$. In the limit r → ∞ we should
have
\[
g_{\mu\nu} = \eta_{\mu\nu} + \frac{1}{c^2}\, g^{(2)}_{\mu\nu} + O\!\left(\frac{1}{c^3}\right),
\]
so
\[
g^{(2)}_{00} = -\frac{a c^2}{r} = -\frac{2MG}{r},
\]
where M is the mass of the body creating the gravitational field. By definition, the
quantity
\[
a = \frac{2MG}{c^2}
\]
is called the Schwarzschild radius and is denoted by $r_s$¹.
Thus the Schwarzschild metric is
\[
ds^2 = \left(1 - \frac{r_s}{r}\right) c^2 dt^2 - \frac{dr^2}{1 - \dfrac{r_s}{r}} - r^2 d\Omega^2,
\]
and it is applicable for r > R, where R is the radius of the body. At $r = r_s$ we have
the event horizon, and $r < r_s$ describes the black hole, where the time coordinate t
becomes spacelike and the radial coordinate r becomes timelike. The singularity
at $r = r_s$ is only apparent and can be eliminated by a change of coordinates, for example to
Gullstrand-Painlevé coordinates.
¹ For the Earth $r_s \approx 8.9$ mm, while for the Sun $r_s \approx 3$ km.
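The values quoted in the footnote follow directly from $r_s = 2MG/c^2$. The following short sketch reproduces them; the constants and masses are rounded reference values assumed for this illustration, and the function name `schwarzschild_radius` is hypothetical.

```python
# Rounded physical constants in SI units (assumed reference values).
G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8            # speed of light, m/s
EARTH_MASS = 5.972e24  # kg
SUN_MASS = 1.989e30    # kg


def schwarzschild_radius(mass_kg):
    """r_s = 2 M G / c^2, in metres."""
    return 2.0 * G * mass_kg / C**2


rs_earth_mm = schwarzschild_radius(EARTH_MASS) * 1e3  # metres -> millimetres
rs_sun_km = schwarzschild_radius(SUN_MASS) / 1e3      # metres -> kilometres

print(f"Earth: r_s = {rs_earth_mm:.2f} mm")
print(f"Sun:   r_s = {rs_sun_km:.2f} km")
```

Both radii are far smaller than the radii of the bodies themselves, so for the Earth and the Sun the exterior solution is only ever used in the region $r \gg r_s$.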
LECTURE 20

Kaluza-Klein theory

In the 1920s the only known fundamental forces were electromagnetism and
the force of gravity, and the only known elementary particles were the electron and
the proton. Einstein’s idea of the unified field theory was to obtain electromagnetism
and general relativity from a single fundamental field. Toward this goal, T.
Kaluza (1921) and O. Klein (1926) proposed to consider the five-dimensional
spacetime $\mathbf{M} = M \times S^1_r$, where the fifth dimension is a circle of small radius
\[
r = \sqrt{\frac{\hbar G}{c^3}} \sim 10^{-35}\ \mathrm{m},
\]
the Planck length $\ell_P$. The coordinates on $\mathbf{M}$ will be denoted by $\tilde{x}^a$, a =
0, 1, 2, 3, 4, where $\tilde{x}^4 = \theta$, so that using $x^\mu$, µ = 0, 1, 2, 3, for coordinates on M
we have $\tilde{x}^\mu = x^\mu$. Consider the following pseudo-Riemannian metric on $\mathbf{M}$ of
signature (+, −, −, −, −),
\[
\tilde{g}_{ab} = \begin{pmatrix}
g_{00} - A_0 A_0 & g_{01} - A_0 A_1 & g_{02} - A_0 A_2 & g_{03} - A_0 A_3 & A_0\\
g_{10} - A_1 A_0 & g_{11} - A_1 A_1 & g_{12} - A_1 A_2 & g_{13} - A_1 A_3 & A_1\\
g_{20} - A_2 A_0 & g_{21} - A_2 A_1 & g_{22} - A_2 A_2 & g_{23} - A_2 A_3 & A_2\\
g_{30} - A_3 A_0 & g_{31} - A_3 A_1 & g_{32} - A_3 A_2 & g_{33} - A_3 A_3 & A_3\\
A_0 & A_1 & A_2 & A_3 & -1
\end{pmatrix}
\]
so that
\[
d\tilde{s}^2 = \tilde{g}_{ab}\, d\tilde{x}^a d\tilde{x}^b = g_{\mu\nu}\, dx^\mu dx^\nu - \left(A_\mu\, dx^\mu - d\theta\right)^2 .
\]
Also assume that the metric $g_{\mu\nu}\,dx^\mu dx^\nu$ and the 1-form $A_\mu\,dx^\mu$ on M do not
depend on θ.
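We have the following basic facts.
1) For $\tilde{g} = \det \tilde{g}_{ab}$ one has $\tilde{g} = -g$, where $g = \det g_{\mu\nu}$.
2) The inverse matrix $\tilde{g}^{ab}$ is given by
\[
\tilde{g}^{ab} = \begin{pmatrix}
g^{00} & g^{01} & g^{02} & g^{03} & A^0\\
g^{10} & g^{11} & g^{12} & g^{13} & A^1\\
g^{20} & g^{21} & g^{22} & g^{23} & A^2\\
g^{30} & g^{31} & g^{32} & g^{33} & A^3\\
A^0 & A^1 & A^2 & A^3 & -1 + A_\mu A^\mu
\end{pmatrix},
\]
where $A^\mu = g^{\mu\nu} A_\nu$.
3) Under the change of coordinates $x \mapsto x' = F(x)$, $\theta \mapsto \theta + \lambda(x)$ we have
$A_\mu \mapsto A'_\mu + \partial_\mu \lambda$, so that U(1)-gauge invariance is a relativity in the
fifth dimension!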
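Facts 1) and 2) are purely algebraic and can be tested on a concrete example. The sketch below does so in exact rational arithmetic; it is an illustration rather than a proof, the sample metric `g` and covector `A` are hypothetical values chosen here, and the helper names are assumptions of this sketch.

```python
from fractions import Fraction


def inverse(m):
    """Invert a square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(m)
    a = [list(row) + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [x / p for x in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]


def det(m):
    """Determinant by Laplace expansion along the first row (fine for 5 x 5)."""
    if len(m) == 1:
        return m[0][0]
    total = Fraction(0)
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Hypothetical sample data: a symmetric invertible 4 x 4 metric and a covector A_mu.
g = [[Fraction(x) for x in row] for row in (
    ("3/2", "1/5", "-1/7", "1/3"),
    ("1/5", "-1", "1/4", "0"),
    ("-1/7", "1/4", "-2", "1/6"),
    ("1/3", "0", "1/6", "-5/4"),
)]
A = [Fraction(1, 2), Fraction(-1, 3), Fraction(1, 4), Fraction(1, 5)]

# Five-dimensional metric: blocks g - A A^T, A, A^T and -1.
g_tilde = [[g[m][n] - A[m] * A[n] for n in range(4)] + [A[m]] for m in range(4)]
g_tilde.append(A + [Fraction(-1)])

# Claimed inverse: blocks g^{-1}, A^mu = g^{mu nu} A_nu, and -1 + A_mu A^mu.
g_inv = inverse(g)
A_up = [sum(g_inv[m][n] * A[n] for n in range(4)) for m in range(4)]
A_sq = sum(A[m] * A_up[m] for m in range(4))
claimed_inverse = [g_inv[m] + [A_up[m]] for m in range(4)]
claimed_inverse.append(A_up + [A_sq - 1])

identity5 = [[Fraction(int(i == j)) for j in range(5)] for i in range(5)]
det_ok = det(g_tilde) == -det(g)
inverse_ok = matmul(g_tilde, claimed_inverse) == identity5
print(det_ok, inverse_ok)
```

The determinant relation is the Schur complement formula applied to the last row and column, which is why it holds for any choice of $A_\mu$.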


20.1. Geodesic equation on M


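From the formulas for the Christoffel symbols we get for the metric $\tilde{g}_{ab}$:
\[
\begin{aligned}
\tilde{\Gamma}^\mu_{\alpha\beta} &= \Gamma^\mu_{\alpha\beta} + \frac{1}{2}\, g^{\mu\sigma}\left(A_\alpha F_{\sigma\beta} + A_\beta F_{\sigma\alpha}\right),\\
\tilde{\Gamma}^\mu_{\alpha 4} &= \frac{1}{2}\, g^{\mu\sigma} F_{\alpha\sigma},\\
\tilde{\Gamma}^4_{\alpha\beta} &= A_\mu \Gamma^\mu_{\alpha\beta} - \frac{1}{2}\, A^\mu\left(A_\alpha F_{\beta\mu} + A_\beta F_{\alpha\mu}\right)
 - \frac{1}{2}\left(\frac{\partial A_\alpha}{\partial x^\beta} + \frac{\partial A_\beta}{\partial x^\alpha}\right),\\
\tilde{\Gamma}^4_{\alpha 4} &= \frac{1}{2}\, A^\mu F_{\alpha\mu},\\
\tilde{\Gamma}^a_{44} &= 0 .
\end{aligned}
\]
As usual, here
\[
F_{\alpha\beta} = \frac{\partial A_\beta}{\partial x^\alpha} - \frac{\partial A_\alpha}{\partial x^\beta} .
\]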
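For the free particle of mass m on the five-dimensional spacetime $\mathbf{M}$ we
have the action
\[
S = -mc\int d\tilde{s} = -mc\int \sqrt{\tilde{g}_{ab}\,\frac{d\tilde{x}^a}{d\tilde{s}}\,\frac{d\tilde{x}^b}{d\tilde{s}}}\; d\tilde{s} .
\]
Using the formulas for the Christoffel symbols $\tilde{\Gamma}^a_{bc}$ and putting $u^a = d\tilde{x}^a/d\tilde{s}$, we get
the following equations
\[
\frac{du^\mu}{d\tilde{s}} + \Gamma^\mu_{\alpha\beta}\, u^\alpha u^\beta
= -g^{\mu\sigma} A_\alpha F_{\sigma\beta}\, u^\alpha u^\beta - g^{\mu\sigma} F_{\alpha\sigma}\, u^\alpha u^4, \qquad \mu = 0, 1, 2, 3,
\]
and
\[
\frac{du^4}{d\tilde{s}} + A_\mu \Gamma^\mu_{\alpha\beta}\, u^\alpha u^\beta
= -A^\sigma F_{\alpha\sigma}\, u^\alpha u^4 + A^\sigma A_\alpha F_{\beta\sigma}\, u^\alpha u^\beta + \frac{\partial A_\alpha}{\partial x^\beta}\, u^\alpha u^\beta .
\]
Multiplying the first equations by Aµ and subtracting them from the second equation yields
\[
\frac{du^4}{d\tilde{s}} - A_\mu \frac{du^\mu}{d\tilde{s}} - \frac{\partial A_\alpha}{\partial x^\beta}\, u^\alpha u^\beta = 0,
\]
so that
\[
\frac{d}{d\tilde{s}}\left(u^4 - A_\mu u^\mu\right) = 0 .
\]
Thus $u^4 - A_\mu u^\mu = \xi$ is constant and the first equation takes the form
\[
\frac{du^\mu}{d\tilde{s}} + \Gamma^\mu_{\alpha\beta}\, u^\alpha u^\beta = -\xi\, g^{\mu\nu} F_{\alpha\nu}\, u^\alpha .
\]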
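Since $1 = g_{\mu\nu} u^\mu u^\nu - (u^4 - A_\mu u^\mu)^2$, we have $g_{\mu\nu} u^\mu u^\nu = 1 + \xi^2$, i.e.,
\[
\frac{ds}{d\tilde{s}} = \sqrt{1 + \xi^2} .
\]
Whence
\[
\frac{dx^\mu}{ds} = u^\mu\,\frac{d\tilde{s}}{ds} = \frac{u^\mu}{\sqrt{1 + \xi^2}}
\]
and we obtain
\[
\frac{d^2 x^\mu}{ds^2} + \Gamma^\mu_{\alpha\beta}\,\frac{dx^\alpha}{ds}\,\frac{dx^\beta}{ds}
= -\frac{\xi}{\sqrt{1 + \xi^2}}\; g^{\mu\sigma} F_{\alpha\sigma}\,\frac{dx^\alpha}{ds} .
\]
Putting
\[
\xi = \frac{e}{\sqrt{m^2 c^4 - e^2}}
\]
we see that the right hand side becomes
\[
-\frac{e}{mc^2}\; g^{\mu\sigma} F_{\alpha\sigma}\,\frac{dx^\alpha}{ds} .
\]
Thus we get the equation of a charged particle moving in external gravitational
and electromagnetic fields, obtained from the action
\[
-mc\int ds - \frac{e}{c}\int A_\mu\, dx^\mu .
\]
This is the so-called first Kaluza miracle.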

20.2. Hilbert action on M


By a direct and lengthy computation one gets
\[
\tilde{R} = R + \frac{1}{4}\, F_{\mu\nu} F^{\mu\nu},
\]
which is Kaluza’s second miracle. The pure gravity action on $\mathbf{M}$ is proportional
to the Hilbert action,
\[
S_{\mathbf{M}} = -\frac{c^3}{16\pi \tilde{G}}\int_{\mathbf{M}} \tilde{R}\,\sqrt{\tilde{g}}\; d^5\tilde{x},
\]
where $\tilde{G}$ is the gravitational constant on $\mathbf{M}$. Putting $\tilde{G} = 2\pi r G$, replacing Aµ by
κAµ , where $\kappa = 2\sqrt{G}/c^2$, and trivially integrating over $S^1_r$ we finally obtain
\[
S_{\mathbf{M}} = -\frac{c^3}{16\pi G}\int_M R\,\sqrt{-g}\; d^4x - \frac{1}{16\pi c}\int_M F_{\mu\nu} F^{\mu\nu}\,\sqrt{-g}\; d^4x .
\]
This is the desired unification of general relativity and electromagnetism. It
yields the Einstein equations
\[
R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R = \frac{8\pi G}{c^4}\, T_{\mu\nu}
\]
with the energy-momentum tensor of the electromagnetic field on M ,
\[
T_{\mu\nu} = \frac{1}{4\pi}\left(-F_{\mu\lambda} F_{\nu\sigma}\, g^{\lambda\sigma} + \frac{1}{4}\, g_{\mu\nu}\, F_{\alpha\beta} F^{\alpha\beta}\right),
\]
and Maxwell’s equations
\[
\nabla_\nu F^{\mu\nu} = 0
\]
on M in the presence of the gravitational field gµν . Thus the Kaluza-Klein pure
gravity action in the five-dimensional space $\mathbf{M}$ naturally produces the
Einstein-Hilbert-Maxwell action on the spacetime M .

20.3. Criticism of the Kaluza-Klein theory


Though mathematically elegant, Kaluza-Klein theory gives unrealistic predictions
for the masses of particles. Namely, consider the massless scalar field
Φ(x, θ) on $\mathbf{M}$ satisfying the five-dimensional wave equation
\[
\left(\Box_4 - \frac{\partial^2}{\partial\theta^2}\right)\Phi = 0,
\]
where gµν is the Minkowski metric. The corresponding Fourier coefficients in
\[
\Phi(x, \theta) = \sum_{n=-\infty}^{\infty} \varphi_n(x)\, e^{\frac{in\theta}{r}}
\]
satisfy the Klein-Gordon equations
\[
\left(\Box_4 + m_n^2\right)\varphi_n = 0
\]
with masses
\[
m_n^2 = \frac{n^2}{r^2} .
\]
However, these masses are very large! Thus assuming that n = 1 gives the electron,
the resulting mass, $m_1 = \hbar/(rc)$ in ordinary units, would be the Planck mass,
$m_e \sim 10^{22}$ MeV, while the actual electron mass is only 0.5 MeV.
Geometrically one can consider general Kaluza-Klein metrics
\[
\tilde{g}_{ab}(x, \theta) = \begin{pmatrix}
g_{\mu\nu} - \Phi A_\mu A_\nu & \Phi A_\mu\\
\Phi A_\nu & -\Phi
\end{pmatrix},
\]
where Φ(x, θ) is a function on $\mathbf{M}$, and consider the corresponding pure gravity
Hilbert action. However, even assuming that the metric $\tilde{g}_{ab}$ does not depend
on θ, setting Φ = 1 in the field equations is not the same as first setting Φ = 1
and then considering the resulting field equations, which unify general relativity and
electromagnetism. In other words, this unification is obtained by considering only a
special subvariety of metrics on $\mathbf{M}$ which have Φ = 1.
