
Math 428/609E: Mathematical Classical Mechanics

Lecture Notes

Lior Silberman
Dr. Lior Silberman, UBC Department of Mathematics
[email protected], http://www.math.ubc.ca/~lior

These are rough notes for the Spring 2025 course, copyright © Lior Silberman 2025. They
may not be posted wholesale elsewhere; instead, link to the file posted on the original course
website. Traditional academic reuse is permitted, however.
Compiled February 6, 2025.
Contents

Administrivia
0.1. Course plan (subject to revision)
Chapter 1. Review of Newtonian mechanics
1.1. Newton’s laws
1.2. Galilean group; spacetime
1.3. Energy and work
Chapter 2. Kinematics
2.1. Configurations and configuration space
2.2. Coordinates
Chapter 3. Lagrangian mechanics
3.1. Introduction
3.2. Calculus of Variations
3.3. The Euler–Lagrange equations
3.4. More on conserved quantities: symmetries and Noether’s Theorem
3.5. Rotations and angular momentum
3.6. Small oscillations
Bibliography

Administrivia

• See syllabus, especially about problem sets.


• Textbooks
– Mathematical point of view: [2, 7, 3]
– Physical point of view: [4, 6]
0.1. Course plan (subject to revision)
    Physics                 Mathematics
1   Kinematics              Coordinates, tangent vectors, implicit and inverse function theorems
2   Newtonian mechanics     ODE, cotangent vectors
3   Lagrangian mechanics    Calculus of variations, convexity, symmetry and conservation laws
4   Angular momentum        The rotation group
5   Hamiltonian mechanics   Manifolds, measures

CHAPTER 1

Review of Newtonian mechanics

1.1. Newton’s laws


1.1.1. Mathematics: Elementary kinematics. Equip the vector space $\mathbb{R}^d$ with the standard
inner product and associated distance function $|v - v'| = \sqrt{\sum_{i=1}^{d} (v_i - v'_i)^2}$. Euclidean space $\mathbb{E}^d$
is the affine space modeled on $\mathbb{R}^d$, in other words a principal homogeneous space for $\mathbb{R}^d$. For
any $x \in \mathbb{E}^d$ and $v \in \mathbb{R}^d$ we have the translate $x + v \in \mathbb{E}^d$, and conversely given $x, x' \in \mathbb{E}^d$ there
is a unique displacement vector $v \in \mathbb{R}^d$ with $x' = x + v$, in which case $d(x, x') = |v|$ is called the
Euclidean distance from $x$ to $x'$.
Let $\gamma : I \to \mathbb{E}^d$ denote the path of a particle through $\mathbb{E}^d$. Then $\gamma(t + h) - \gamma(t)$ is a displacement
(a vector in $\mathbb{R}^d$) and if $\gamma$ is differentiable we can talk about the velocity vector $\dot\gamma(t) \in \mathbb{R}^d$.
In coordinates, if $\gamma(t) = (x_1(t), \ldots, x_d(t))$ with respect to some orthonormal coordinate system
then $\dot\gamma$ is the vector of derivatives of the functions $x_i(t)$. We can similarly define the acceleration
$\ddot\gamma(t) = (\ddot{x}_i(t))_{i=1}^{d}$. If we have $N$ particles moving in the space, their joint position is given by
a point $\gamma(t) \in \mathbb{E}^{dN}$ and we can similarly talk about the velocity vector $\dot\gamma(t)$ and acceleration vector
$\ddot\gamma(t)$, both in $\mathbb{R}^{dN}$.
See also [2, §2D].
1.1.2. Physics: The second law.
AXIOM 1 (Newton’s second law). There is a function $F : \mathbb{E}^{dN} \times \mathbb{R}^{dN} \times \mathbb{R} \to \mathbb{R}^{dN}$ called the
force so that the path $\gamma = (x(t))_{t \in I}$ is determined by the ODE
\[ M\ddot{x} = F(x, \dot{x}; t) . \]
REMARK 2 (Physics). The force F represents (1) interactions between the particles; (2) any
external forces on the system; and (3) any constraint forces.
REMARK 3 (Mathematics). Writing the differential equation in the form
\[ \frac{d}{dt}\begin{pmatrix} x \\ \dot{x} \end{pmatrix} = \begin{pmatrix} \dot{x} \\ M^{-1}F(x, \dot{x}; t) \end{pmatrix} \]
we see that it suffices to analyze equations of the form $\dot{y}(t) = F(y; t)$ for $y \in \mathbb{R}^n$.
EXAMPLE 4. Some standard systems include:
(1) The free particle $m\ddot{x} = 0$ (“Newton’s first law”)
(2) The Hookean spring, a.k.a. harmonic oscillator: $m\ddot{x} = -kx$.
(3) The physical pendulum $mL\ddot\theta = -mg\sin\theta$
(4) Pulley systems
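The reduction in Remark 3 can be tried numerically on the harmonic oscillator of Example 4. The following is a minimal sketch, not part of the notes: the integrator (classical Runge-Kutta), the function names, and the parameter values are all illustrative assumptions.

```python
import math

def rk4_step(f, y, t, h):
    """One classical Runge-Kutta step for the first-order system y' = f(y, t)."""
    k1 = f(y, t)
    k2 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k1)], t + 0.5 * h)
    k3 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k2)], t + 0.5 * h)
    k4 = f([yi + h * ki for yi, ki in zip(y, k3)], t + h)
    return [yi + h / 6.0 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def oscillator(m, k):
    """First-order form of m x'' = -k x: the state is y = (x, x')."""
    def f(y, t):
        x, v = y
        return [v, -k / m * x]
    return f

def integrate(f, y0, t0, t1, steps):
    """Integrate y' = f(y, t) from t0 to t1 with a fixed number of steps."""
    y, t = list(y0), t0
    h = (t1 - t0) / steps
    for _ in range(steps):
        y = rk4_step(f, y, t, h)
        t += h
    return y

if __name__ == "__main__":
    # Hypothetical values m = k = 1: the exact solution is x(t) = cos t.
    x, v = integrate(oscillator(1.0, 1.0), [1.0, 0.0], 0.0, math.pi, 1000)
    print(x, v)
```

With m = k = 1 and initial state (1, 0), the exact solution is x(t) = cos t, so the state after time π should be close to (−1, 0).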
REMARK 5. Newton’s first law is the statement “there is a way to match physical space with
$\mathbb{E}^d$ so that free particles
1.1.3. Mathematics: ODE¹.
DEFINITION 6. An ordinary differential equation is a pair $(\Omega, F)$ where $\Omega \subset \mathbb{E}^n \times \mathbb{E}^1$ is open and
$F : \Omega \to \mathbb{R}^n$ is continuous. A solution to the differential equation is a pair $(I, \gamma)$ where $I \subset \mathbb{E}^1$ is an
interval, and $\gamma \in C^1(I; \mathbb{E}^n)$ is a curve such that for all $t \in I$ we have $(\gamma(t), t) \in \Omega$ and
\[ \dot\gamma(t) = F(\gamma(t); t) . \]
DEFINITION 7. We say that F is locally Lipschitz if for every $(y_0, t_0) \in \Omega$ there is a number $L > 0$
and a neighbourhood $(y_0, t_0) \in U \subset \Omega$ such that for all $(y, t), (y', t) \in U$ we have
$|F(y, t) - F(y', t)| \le L |y - y'|$.
EXERCISE 8. If $F \in C^1$ then it is locally Lipschitz.
Fix an ODE $(\Omega, F)$. The following argument is called “Picard iteration”.
PROPOSITION 9 (Picard existence and uniqueness). Suppose that F is locally Lipschitz. Then
for every initial condition $(y_0, t_0) \in \Omega$ there is $\varepsilon > 0$ so that the equation has a unique solution on
$(t_0 - \varepsilon, t_0 + \varepsilon)$.
LEMMA 10. There are $\varepsilon, R, L, M > 0$ such that:
(1) $B = B(y_0, R) \times [t_0 - \varepsilon, t_0 + \varepsilon] \subset \Omega$.
(2) For all $(y, t) \in B$ we have $|F(y, t)| \le M$.
(3) For all $(y, t), (y', t) \in B$ we have $|F(y, t) - F(y', t)| \le L |y - y'|$.
(4) We have $\varepsilon \le \frac{R}{M+1}$ and $\varepsilon \le \frac{1}{L+1}$.
PROOF. Let U be a neighbourhood of $(y_0, t_0)$ on which we have a Lipschitz constant L. Since
U is open, we can choose R and $\tilde\varepsilon$ small enough so that the closed box
$\tilde{B} = B(y_0, R) \times [t_0 - \tilde\varepsilon, t_0 + \tilde\varepsilon] \subset U$. Since F is continuous and $\tilde{B}$ is compact, there is M such that
$|F| \le M$ on $\tilde{B}$. Finally let $\varepsilon = \min\left\{ \frac{R}{M+1}, \frac{1}{L+1}, \tilde\varepsilon \right\}$. Then $B = B(y_0, R) \times [t_0 - \varepsilon, t_0 + \varepsilon] \subset \tilde{B}$, giving (1), (2), and
(3), and (4) holds by the choice of $\varepsilon$. □
LEMMA 11 (A-priori estimate). Let $I = [t_0 - \varepsilon, t_0 + \varepsilon]$ and let $\gamma \in C^1(I; \mathbb{E}^n)$ be a solution to the
ODE on I satisfying $\gamma(t_0) = y_0$. Then $\gamma(t) \in B(y_0, R)$ for all $t \in I$.
PROOF. Suppose there is t such that $\gamma(t) \notin B(y_0, R)$. Wlog $t > t_0$ (otherwise reverse time and
replace F with $-F$). Let $t_1 = \inf\{t > t_0 : |\gamma(t) - y_0| \ge R\}$; since $\gamma$ is continuous, $|\gamma(t_1) - y_0| = R$
and in particular $t_1 > t_0$. By construction for all $s \in (t_0, t_1)$ we have $\gamma(s) \in B(y_0, R)$, so $(\gamma(s), s) \in B$
and $|F(\gamma(s), s)| \le M$. By the FTC
\[ |\gamma(t_1) - y_0| = |\gamma(t_1) - \gamma(t_0)| = \left| \int_{t_0}^{t_1} \dot\gamma(s)\,ds \right| = \left| \int_{t_0}^{t_1} F(\gamma(s), s)\,ds \right| \le M |t_1 - t_0| \le M\varepsilon \le \frac{M}{M+1} R < R, \]
contradicting the fact that $|\gamma(t_1) - y_0| = R$. □

¹ For a historical survey see [1] which is Chapter 11 of [5]. The Existence and Uniqueness Theorem is due to
Cauchy (~1821) in one dimension and where F is differentiable; the “Lipschitz condition” is due to Lipschitz, who
treated d > 1 as well. Their proofs relied on what is today called the Euler Scheme, which we discuss in the homework.
The proof given here is due to Picard; the older argument also gives the Peano Existence Theorem; see the Homework
for that.
PROOF OF PROPOSITION 9. Let $\varepsilon, R, L, M, B, I$ be as in Lemma 10. Let $C(I; \mathbb{E}^n)$ be the space
of continuous functions $[t_0 - \varepsilon, t_0 + \varepsilon] = I \to \mathbb{E}^n$ equipped with the metric
\[ d(\gamma, \gamma') \overset{\mathrm{def}}{=} \|\gamma - \gamma'\|_\infty = \sup_{t \in I} |\gamma(t) - \gamma'(t)| . \]
Then $(C(I; \mathbb{E}^n), d)$ is a complete metric space. Let $X \subset C(I; \mathbb{E}^n)$ be the set of functions $\gamma : I \to \mathbb{E}^n$ such
that $\gamma(t) \in B(y_0, R)$ for all t. Then X is a closed subset (if $\gamma_n$ converge uniformly in $C(I; \mathbb{E}^n)$ then
they converge pointwise and $B(y_0, R)$ is closed), hence a complete metric space in its own right.
Given $\gamma \in X$ and $t \in I$ define
\[ (G(\gamma))(t) = y_0 + \int_{t_0}^{t} F(\gamma(s), s)\,ds . \]
The integral is well-defined since $\gamma$ and F are continuous and since, by the definition of X and Lemma 10(1), for all $s \in I$ we
have $(\gamma(s), s) \in B \subset \Omega$. We observe that this also means that $|F(\gamma(s), s)| \le M$ for all s. Now by the
FTC the function $G(\gamma) : I \to \mathbb{E}^n$ is continuously differentiable. Moreover we have
\[ |(G(\gamma))(t) - y_0| = \left| \int_{t_0}^{t} F(\gamma(s), s)\,ds \right| \le \left| \int_{t_0}^{t} |F(\gamma(s), s)|\,ds \right| \le |t - t_0| M \le \varepsilon M \le \frac{M}{M+1} R < R . \]
It follows that $G(\gamma)$ is a continuous function on I valued in $B(y_0, R)$, hence also an element of X.
Finally let $\gamma, \gamma' \in X$. Then for each t we have
\[ |(G(\gamma))(t) - (G(\gamma'))(t)| = \left| \int_{t_0}^{t} F(\gamma(s), s)\,ds - \int_{t_0}^{t} F(\gamma'(s), s)\,ds \right| \le \left| \int_{t_0}^{t} |F(\gamma(s), s) - F(\gamma'(s), s)|\,ds \right| \le \left| \int_{t_0}^{t} L |\gamma(s) - \gamma'(s)|\,ds \right| \le L\varepsilon \sup_{s \in I} |\gamma(s) - \gamma'(s)| . \]
Taking the supremum over t we get for $\rho = \frac{L}{L+1} < 1$ that
\[ d(G(\gamma), G(\gamma')) \le \rho\, d(\gamma, \gamma') . \]
By the Banach Fixed Point Theorem (contractive mapping principle) there is a unique $\gamma \in X$
such that $G(\gamma) = \gamma$, in other words such that for all t we have
\[ \gamma(t) = y_0 + \int_{t_0}^{t} F(\gamma(s), s)\,ds . \]
Clearly $\gamma(t_0) = y_0$. In addition, by the Fundamental Theorem of Calculus $\gamma$ is a differentiable
function and, for all $t \in I$,
\[ \dot\gamma(t) = F(\gamma(t), t) . \]
Conversely, any solution defined on I belongs to X by Lemma 11 and is a fixed point for G, so
the solution is unique. □
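The map G from the proof can be iterated numerically. The sketch below is an illustration under stated assumptions, not the argument of the notes: it treats a scalar equation, works only on the one-sided interval [t0, t0 + ε], and replaces the exact integral by the trapezoid rule on a grid; the names `picard_step` and `picard` are hypothetical.

```python
import math

def picard_step(F, gamma, ts, y0):
    """Apply G once on a grid: (G gamma)(t) = y0 + integral from t0 to t of F(gamma(s), s) ds,
    with the integral approximated by the cumulative trapezoid rule."""
    out = [y0]
    acc = 0.0
    for i in range(1, len(ts)):
        h = ts[i] - ts[i - 1]
        acc += 0.5 * h * (F(gamma[i - 1], ts[i - 1]) + F(gamma[i], ts[i]))
        out.append(y0 + acc)
    return out

def picard(F, y0, t0, eps, n_grid=1000, n_iter=30):
    """Iterate G starting from the constant curve; also record the sup-distance
    between successive iterates."""
    ts = [t0 + eps * i / n_grid for i in range(n_grid + 1)]
    gamma = [y0] * len(ts)
    dists = []
    for _ in range(n_iter):
        new = picard_step(F, gamma, ts, y0)
        dists.append(max(abs(a - b) for a, b in zip(new, gamma)))
        gamma = new
    return ts, gamma, dists

if __name__ == "__main__":
    # Assumed test equation y' = y, y(0) = 1, with L = 1 and eps = 1/(L+1) = 0.5.
    ts, gamma, dists = picard(lambda y, t: y, 1.0, 0.0, 0.5)
    print(gamma[-1], math.exp(0.5))
    print(dists[:5])
```

For this test equation the recorded distances shrink by a factor of at most Lε = 1/2 per step, which is the contraction estimate of the proof, and the final iterate approximates e^t up to the discretization error of the trapezoid rule.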
THEOREM 12 (Picard Existence and Uniqueness Theorem). Given the Lipschitz ODE $(\Omega, F)$
and an initial condition $(y_0, t_0)$:
(1) (Existence) There exists a solution $\gamma$ of the ODE on some interval $I = (t_0 - \varepsilon, t_0 + \varepsilon)$.
(2) (Uniqueness) If $(I, \gamma)$ and $(I', \gamma')$ are two solutions, then $\gamma = \gamma'$ on $I \cap I'$.
(3) (Blowup) There exists a solution $(I_{\max}, \gamma_{\max})$ defined on an open interval such that any
other solution is obtained by restricting $\gamma_{\max}$ to a subinterval of $I_{\max}$. Furthermore if
$I_{\max} = (a, b)$ and either a or b is finite then $\gamma_{\max}(t)$ “escapes” as $t \to a^+$ or $t \to b^-$ in the
sense that for any compact set $K \subset \Omega$ there is $\delta > 0$ such that if $t < a + \delta$ or $t > b - \delta$ we
have $(\gamma_{\max}(t), t) \notin K$.
(4) (Autonomous equation) Suppose $F_0 : \Omega_0 \to \mathbb{R}^n$ for $\Omega_0 \subset \mathbb{E}^n$ and $F(y, t) = F_0(y)$ is
independent of t. Then in the blowup we get that eventually $\gamma_{\max}(t) \notin K$ for compact subsets K of
$\Omega_0$.
PROOF. The first claim is Proposition 9. For the second claim suppose first that $I, I'$ are open
and let $J = \{t \in I \cap I' \mid \gamma(t) = \gamma'(t)\}$. By assumption $\gamma(t_0) = \gamma'(t_0) = y_0$ so $t_0 \in J$. This set is
closed since $\gamma, \gamma'$ are continuous. To see that it is open let $t_1 \in J$. Applying Proposition 9 to
the initial condition $(\gamma(t_1), t_1)$ we see that $\gamma = \gamma'$ on an interval containing $t_1$. By connectedness
$J = I \cap I'$. Finally if an endpoint of I or $I'$ is contained in both intervals then it is a limit point of
the intersection, and the solutions agree there by continuity.
Let S be the set of solutions $\gamma$ defined on open intervals containing $t_0$. By (2), $\gamma_{\max} = \bigcup S$ is a
function and its domain $I_{\max}$ is a union of intervals containing $t_0$, hence an open interval. For any $t \in I_{\max}$
there is some solution $\gamma \in S$ defined at t, hence on a neighbourhood of t, and since $\gamma_{\max}$ agrees with
$\gamma$ on that interval, $\gamma_{\max}$ is differentiable at t and is a solution. Given a compact set $K \subset \Omega$, for each
$(y, t) \in K$ obtain $\varepsilon, R, L, M, B, I$ as in Lemma 10. By compactness we can cover K with finitely
many boxes
\[ B'_k = B(y_k, R_k/2) \times [t_k - \varepsilon_k/2, t_k + \varepsilon_k/2] \subset B_k = B(y_k, R_k) \times [t_k - \varepsilon_k, t_k + \varepsilon_k] \subset \Omega \]
such that F is bounded by $M_k$ and has Lipschitz constant $L_k$ on $B_k$. Let $M = \max_k M_k$, $L = \max_k L_k$,
$R = \frac{1}{2} \min_k R_k$ and let $\varepsilon = \min\left\{ \frac{R}{M+1}, \frac{1}{L+1}, \frac{1}{2} \min_k \varepsilon_k \right\}$. Then for each $(y, t) \in K$ the parameters $\varepsilon, R, L, M$
work on $B(y, R) \times [t - \varepsilon, t + \varepsilon]$. It follows that any solution passing through $(y, t)$ can be extended
by at least time $\varepsilon$, contradicting the minimality of a or the maximality of b if the solution returns to K
at a time close enough to a or b respectively.
When F is autonomous the bounds above are independent of t so the time to live is uniform
on compacta of $\Omega_0$ and the same argument applies. □

1.2. Galilean group; spacetime


In principle the force can be anything (say, as “external” forces). The interactions between
particles, however, are more restricted. To see this we need to introduce symmetry.
1.2.1. Mathematics. We extend our affine space $\mathbb{E}^d$ to spacetime $\mathbb{A}^{d,1} = \mathbb{E}^d \times \mathbb{E}^1$, equipped
with the projection $t : \mathbb{A}^{d,1} \to \mathbb{E}^1$ on the last coordinate, which we call “time”. A point in spacetime is
called an event. Recall that we have already equipped $\mathbb{E}^d$ with the Euclidean metric. The subset
$\mathbb{E}^d \times \{t\}$ is called a timeslice; two events in it are said to be simultaneous.
EXERCISE 13. $\mathrm{Isom}(\mathbb{E}^d)$ contains $\mathbb{R}^d$ acting by translations, which is a simply transitive
subgroup. The point stabilizer is $O(d)$ and thus we have $\mathrm{Isom}(\mathbb{E}^d) \simeq O(d) \rtimes \mathbb{R}^d$. In particular,
$\mathrm{Isom}(\mathbb{E}^d) \subset \mathrm{Aff}(\mathbb{E}^d)$.
DEFINITION 14. A Galilean transformation is an invertible affine map between two spacetimes
of the same dimension, which (1) preserves simultaneity; (2) restricts to an isometry on
each timeslice. The Galilean group is the group of Galilean automorphisms of a single spacetime
(“Galilean symmetries”).
EXAMPLE 15. Fix $v \in \mathbb{R}^d$. Then the mapping $(x, t) \mapsto (x + vt, t)$ (“uniform motion”) is a Galilean
symmetry. Similarly for the mapping (“time translation”) $(x, t) \mapsto (x, t + s)$.
EXERCISE 16. The Galilean group is a group; it is an appropriate semidirect product.
1.2.2. Physics.
AXIOM 17 (Galilean symmetry). The laws of physics are invariant under the action of the
Galilean group. In other words, when the force F only represents internal forces between the
particles, it must be equivariant for the Galilean group.
AXIOM 18. The force on each particle is the vector sum of an external force on the particle,
and interaction forces between pairs of particles.
LEMMA 19. Suppose $F_{ij}$ only depends on $x_i, x_j$. Then $F_{ij}$ is parallel to the displacement
$x_j - x_i \in \mathbb{R}^d$.
AXIOM 20 (Newton’s third law). $F_{ij} = -F_{ji}$.
COROLLARY 21 (Conservation of total momentum). Suppose there are no external forces.
Then $\sum_j m_j v_j$ is constant.
EXAMPLE 22. Two masses connected by a spring freely moving in 1d.

1.3. Energy and work


The inner product on $\mathbb{R}^d$, represented by a map $g : \mathbb{R}^d \to (\mathbb{R}^d)^*$, extends to an inner product
$\bigoplus_{j=1}^{N} g_j$ on $\mathbb{R}^{dN}$.
DEFINITION 23. Associating to each particle a mass $m_j > 0$ we let $M = \bigoplus_{j=1}^{N} m_j g$ be the mass
matrix. If the jth particle has velocity $v_j \in \mathbb{R}^d$ we call
\[ T = \frac{1}{2} \langle Mv, v \rangle = \frac{1}{2} \sum_j m_j \langle v_j, v_j \rangle = \frac{1}{2} \sum_j m_j |v_j|^2 \]
the kinetic energy of the system.
What is the time derivative of this quantity?
\[ \frac{dT}{dt} = \sum_j m_j \langle a_j, v_j \rangle = \langle Ma, v \rangle = \langle F, v \rangle . \]
DEFINITION 24. The work done by the force when the system moves along the path $\gamma$ is the
integral
\[ \int_\gamma \langle F, d\gamma \rangle = \int_{t_0}^{t_1} \langle F, v \rangle\,dt . \]
Note that $\langle F, v \rangle$ is the sum of terms $\langle F_j, v_j \rangle$ corresponding to the individual particles.
COROLLARY 25. The change in kinetic energy is the total work done by the force.
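Corollary 25 can be checked numerically along a known path. The sketch below assumes a single particle in one dimension with the hypothetical mass m = 2 moving along x(t) = sin t; the force is read off from Newton's second law, and the work integral is approximated by the midpoint rule.

```python
import math

def work_along_path(force, velocity, t0, t1, n=20000):
    """Midpoint-rule approximation of the work integral of <F, v> dt over [t0, t1]."""
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * h
        total += force(t) * velocity(t) * h
    return total

m = 2.0                                        # hypothetical mass

def x_dot(t):
    """Velocity along the assumed path x(t) = sin t."""
    return math.cos(t)

def x_ddot(t):
    """Acceleration along the same path."""
    return -math.sin(t)

def force(t):
    """Force along the path, from Newton's second law F = m a."""
    return m * x_ddot(t)

def kinetic(t):
    """Kinetic energy T = (1/2) m v^2 at time t."""
    return 0.5 * m * x_dot(t) ** 2

if __name__ == "__main__":
    W = work_along_path(force, x_dot, 0.0, 1.3)
    print(W, kinetic(1.3) - kinetic(0.0))
```

The two printed numbers should agree up to the quadrature error.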
We now divide the forces into categories.
DEFINITION 26. A force on a single particle is conservative if locally $F = -dU$ for some
function U of position called the potential.
EXERCISE 27. This is equivalent to $\oint F\,dx = 0$ for small loops (all loops if true globally).
• For a force between two particles we get $F_{ij} = -F_{ji}$. This is called “Newton’s Third Law”.
LEMMA 28. For a conservative force we have $\frac{dU}{dt} = -\langle F, v \rangle$.
OBSERVATION 29. Constraint forces do no work.
PROOF. A constraint force acts dually to the level sets, whereas v is in the tangent space. □
We have shown
PROPOSITION 30 (Conservation of mechanical energy). Suppose that all inter-particle forces
are conservative, with total potential U, and let $F_j$ be the external force on the jth particle. Let
$E = T + U$. Then
\[ \frac{dE}{dt} = \sum_j \langle F_j, v_j \rangle . \]
Adding to U the potential due to any conservative external forces gives the same result with $F_j$
representing any non-conservative forces.

CHAPTER 2

Kinematics

We begin by developing the language to describe motion, material which will lead directly to
predicting motion in Chapter 1.
Physics keywords: configuration space,
Mathematics keyword: inverse and implicit function theorems,
2.1. Configurations and configuration space
2.1.1. Physics.
DEFINITION 31 (Informal). A mechanical system consists of several point particles moving in
some ambient space subject to interactions and constraints.
The ambient space will be Euclidean d-space, denoted $\mathbb{E}^d$. For the distinction between $\mathbb{E}^d$ and
$\mathbb{R}^d$ see the problems on “Affine Algebra” in Problem Set 1, and also [2, §2A].
DEFINITION 32. A configuration of the system is then a point $x \in (\mathbb{E}^d)^N$ satisfying the constraints.
The configuration space of the system is the set X of all configurations.
• We will think of a particle moving on the round 2-sphere as moving on the surface
$x^2 + y^2 + z^2 = 1$ in $\mathbb{E}^3$, but it is also possible to think of it as moving on the sphere directly.
• We will almost always have $d \in \{1, 2, 3\}$ but this is a contingent fact about our everyday
experience, not a mathematical requirement.
EXAMPLE 33. Single free particle; single particle at the end of a Hookean spring; physical
pendulum with massless rod (2d; 3d); rope-and-pulley system.
• From the mathematical point of view one can dispense with this definition and just talk
about the configuration manifold from the start, but that’s not how physicists think. More
importantly we need to be able to construct the configuration space of a physical system.
• We will pretend that continuum systems (e.g. a solid rod) actually consist of a finite but
large number of particles. As long as the rod is rigid this will not cause a problem since
the configuration space will be finite-dimensional.
• Truly infinite-dimensional systems (e.g. fluid flow, or deformation of a plastic material)
are the subject of continuum mechanics and outside the scope of this course.
2.1.2. Mathematics. Often the constraints are holonomic in that, locally at least, they can be
written in the form¹ $F(x) = 0$ (in other words they concern the configuration of the system only),
e.g. the particle constrained to move on the sphere (say of radius r(t)), or (“rigid motion”) where
the distance between a pair of particles is fixed. Other examples of constraints include
• Hard boundaries, e.g. a bouncing ball restricted to the upper half-plane $y \ge 0$. Mathematically
we can either work on a manifold with boundary or handle the situation at the
boundary separately.
• Constraints on the motion (e.g. rolling without slipping), which we will develop later.
Formally we fix an open set $\Omega \subset \mathbb{E}^{dN}$, a time interval I, and assume the constraints take the form
$F = 0$ for some continuously differentiable function $F : \Omega \to \mathbb{R}^m$. We will generally assume F is
non-degenerate in that $\operatorname{rank} dF = m$, at least locally on X (i.e. possibly we express the constraints
by different functions in different places).

¹ In general one should permit the constraints to depend on time, in which case the configuration space would be
X = X(t), and the analysis below should be extended. This may be developed in a later version of the notes or in a
problem set.
EXAMPLE 34. In $\mathbb{E}^2$ suppose we have a massless slider moving along a horizontal wire. A
point of mass m is attached to a massless rigid rod of length L freely swinging from the slider. We
want to say something like:
“Let the slider be at $(x_s, y_s)$ and the mass be at $(x, y)$, with constraints
\[ \begin{cases} x_s \in [a, b] \\ y_s = 0 \\ (x - x_s)^2 + (y - y_s)^2 = L^2 \end{cases} \]”
Of course a better parametrization would be through the angle $\theta$ the rod makes with the vertical
axis (say). We also want to say:
“Instead use as coordinates the location $x_s$ of the slider and the angle $\theta$ of the mass.
The slider is then at $(x_s, 0)$ and the mass is at $(x_s + L \sin\theta, -L \cos\theta)$”.
Finally, we would like to say
“The potential energy of the system will then be $U = -mgL \cos\theta$. The kinetic
energy will be
\[ \frac{1}{2} m \left( \dot{x}^2 + \dot{y}^2 \right) = \frac{1}{2} m \left( \dot{x}_s^2 + 2L \cos\theta\, \dot{x}_s \dot\theta + L^2 \dot\theta^2 \right) . \]”
Our goal is to make sense of all these statements. For this we need to understand what we
mean by the coordinates $x_s, y_s, x, y, \theta$, what we mean by $U(\theta)$ where U ought to be a function on
X, what we mean by derivatives on configuration space and of coordinates, and so on. For technical
reasons we will begin with the derivatives, and then discuss coordinates, coordinate systems, and
parametrization.
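The kinetic energy identity of Example 34 can be sanity-checked numerically. This is a minimal sketch under assumptions: the function names and the sample values are hypothetical, and the Cartesian velocities are obtained by a central difference along the motion rather than symbolically.

```python
import math

def positions(xs, theta, L):
    """Euclidean position of the mass in terms of the generalized coordinates (xs, theta)."""
    return xs + L * math.sin(theta), -L * math.cos(theta)

def kinetic_cartesian(m, L, xs, theta, xs_dot, theta_dot, h=1e-6):
    """(1/2) m (xdot^2 + ydot^2), with xdot and ydot from a central difference in time."""
    x1, y1 = positions(xs + h * xs_dot, theta + h * theta_dot, L)
    x0, y0 = positions(xs - h * xs_dot, theta - h * theta_dot, L)
    x_dot, y_dot = (x1 - x0) / (2 * h), (y1 - y0) / (2 * h)
    return 0.5 * m * (x_dot ** 2 + y_dot ** 2)

def kinetic_generalized(m, L, theta, xs_dot, theta_dot):
    """The closed form from Example 34 in the coordinates (xs, theta)."""
    return 0.5 * m * (xs_dot ** 2
                      + 2 * L * math.cos(theta) * xs_dot * theta_dot
                      + L ** 2 * theta_dot ** 2)

if __name__ == "__main__":
    # Hypothetical sample state.
    args = dict(m=1.5, L=2.0, theta=0.7, xs_dot=-0.4, theta_dot=1.1)
    print(kinetic_cartesian(xs=0.3, **args), kinetic_generalized(**args))
```

Both expressions should agree up to the finite-difference error, for any choice of sample state.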
REMARK 35. Observe that the angle $\theta$ is not really a function on configuration space: if we
go around a full circle we acquire a phase of $2\pi$. We can define branches locally, but not globally.
The same will apply to constraints written in terms of $\theta$.
2.1.3. Tangent vectors and derivatives.
REMARK 36. Recall that $F : \mathbb{E}^n \to \mathbb{E}^m$ is differentiable at $x \in \mathbb{E}^n$ if there is a linear map
$dF_x : \mathbb{R}^n \to \mathbb{R}^m$ so that for $v \in \mathbb{R}^n$, $F(x + v) = F(x) + dF_x(v) + R(x, v)$ where $\frac{|R(x, v)|}{|v|} \to 0$ as $|v| \to 0$.
Note that there is no dot product or metric here; just the notion of displacement on affine space.
EXAMPLE 37. When $m = 1$ note that $dF_x$ is not a vector: it is a linear map $\mathbb{R}^n \to \mathbb{R}$, in
other words a linear functional; in physics language a dual vector or a covector.
REMARK 38. If we wish to speak of the gradient vector $\vec\nabla F(x)$ we need a way to associate a
vector to each linear functional. This is provided by an inner product (look up “Riesz Representation
Theorem”), but note that the choice of inner product matters, and different inner products will
produce different gradients for the same function.
Let $x, x' \in X$ be close to each other. We can write $x' = x + \varepsilon v$ where v is some unit vector and
$\varepsilon$ is small. Then
\[ 0 = F(x') = F(x + \varepsilon v) = F(x) + \varepsilon\, dF_x(v) + o(\varepsilon) = \varepsilon\, dF_x(v) + o(\varepsilon) , \]
so
\[ dF_x(v) = o(1) . \]
By compactness of the unit sphere, as $x' \to x$ the vectors v converge (after passing to a subsequence)
to a point on the sphere satisfying $dF_x(v) = 0$, that is to $v \in \operatorname{Ker} dF_x$.
DEFINITION 39. For $x \in X(t)$ the tangent space is $T_x X(t) = \operatorname{Ker} dF_x$.
NOTATION 40. We use Newton’s dot to denote derivatives with respect to time. By $\dot\gamma(t)$ we
mean the vector of partial derivatives in $\mathbb{R}^{dN}$, which is also the image of the standard basis vector
of $T_t \mathbb{R}$ by the linear map $d\gamma_t$.
LEMMA 41. Let I be an interval, and let $\gamma : I \to \mathbb{E}^{dN}$ be a differentiable curve. Suppose that
$\gamma(t_0) \in X$ for some $t_0$. Then the image of $\gamma$ lies in X iff $\dot\gamma(t) \in \operatorname{Ker} dF_{\gamma(t)}$ for all $t \in I$.
PROOF. PS1. □
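As an illustration of Definition 39 and Lemma 41, take the unit sphere, cut out by the constraint F(x) = |x|² − 1, so that dF_x(v) = 2⟨x, v⟩. The sketch below uses a hypothetical test curve (a circle of constant colatitude) and checks that its velocity lies in the kernel of dF along the curve.

```python
import math

def dF(x, v):
    """Differential of F(x) = |x|^2 - 1 at x applied to v: dF_x(v) = 2 <x, v>."""
    return 2.0 * sum(xi * vi for xi, vi in zip(x, v))

def gamma(t, colat=1.0):
    """A circle of constant colatitude on the unit sphere (a hypothetical test curve)."""
    return (math.sin(colat) * math.cos(t),
            math.sin(colat) * math.sin(t),
            math.cos(colat))

def gamma_dot(t, colat=1.0):
    """Time derivative of gamma, computed by hand."""
    return (-math.sin(colat) * math.sin(t),
            math.sin(colat) * math.cos(t),
            0.0)

if __name__ == "__main__":
    for t in (0.0, 0.5, 2.0):
        print(t, dF(gamma(t), gamma_dot(t)))   # tangent: approximately 0
    print(dF(gamma(0.0), gamma(0.0)))          # radial direction: not tangent
```

The radial direction x itself gives dF_x(x) = 2|x|² = 2, so it is not a tangent vector, as expected.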

2.2. Coordinates
2.2.1. Mathematics.
THEOREM 42 (Implicit function theorem). Let $x \in \Omega$ and suppose that $\operatorname{rank} dF_x = m$. Then we
can choose some m coordinates of $\mathbb{E}^{dN}$ so that locally these coordinates are uniquely determined
by the others. Furthermore this function has the expected derivative. If F is k times differentiable
so is the function defined implicitly.
COROLLARY 43 (Inverse function theorem). When $m = dN$ we have an inverse with the inverse
derivative.
By the implicit function theorem we can, at least locally, parametrize configuration space as
follows: we choose some set of m indices $\neg C \in \binom{[dN]}{m}$, with complement C. Then any configuration
$x = (x_i)_{i \in [dN]}$ is uniquely determined by $(x_i)_{i \in C}$.
EXAMPLE 44. Particle on incline parametrized by x or y coordinate. Same for particle on
circle; note different coordinates at different points.
• No constraints in such a parametrization.
It is often more convenient, however, to parametrize by something other than the Euclidean
coordinate axes. The key observation is that the $x_i$ are not functions “valued in X” but rather functions
on X!
DEFINITION 45. A coordinate (in the physics literature “generalized coordinate”) is a pair
$(U, q_\alpha)$ where $U \subset X$ is an open set and $q_\alpha : U \to \mathbb{R}$ is any function. A system of coordinates or
coordinate patch is a tuple $q = \left( U, \{q_\alpha\}_{\alpha=1}^{\dim X} \right)$ of coordinates defined on the same neighbourhood
such that q is continuously differentiable and $dq_x$ is invertible on $T_x X$ for $x \in U$.
What do we mean by dq? Note that by the implicit function theorem, if $x, x' \in X$ are close
enough we have $x' = x + v + e$ where $v \in T_x X$ and $|e| = o(|x - x'|)$ as $x' \to x$. We can thus differentiate
functions on X (without extending them to the ambient $\mathbb{E}^{dN}$) by asking whether $f(x') - f(x)$
is approximately linear in v.
DEFINITION 46. Let $f : U \to \mathbb{E}^r$ where $U \subset X$ is open. We shall say f is differentiable at x if
there is a linear map $df_x : T_x X \to \mathbb{R}^r$ such that $f(x') = f(x) + df_x(v) + o(|x - x'|)$.
LEMMA 47 (Differentiation on X).
(1) Let $V \subset \mathbb{E}^{dN}$ be an open set, and let $f : V \to \mathbb{E}^r$
be a function differentiable at some $x \in U = V \cap X$. Then $f{\restriction}_U : U \to \mathbb{E}^r$ is differentiable
in the sense above and $d(f{\restriction}_U)_x = (df_x){\restriction}_{T_x X}$.
(2) $df_x$ is linear in f and satisfies the chain rule in both directions (i.e. for composition with
$g : \mathbb{E}^s \to X$ and with $h : \mathbb{E}^r \to \mathbb{E}^t$). It therefore also satisfies the Leibniz rule.
PROOF. Exercise. □
EXAMPLE 48. The angle $\theta$ for a circle, e.g. the pendulum. Let $S^1 \subset \mathbb{E}^2$ be the unit circle
$\{x^2 + y^2 = 1\}$. On the open right semicircle we define $\theta = \arctan(y/x)$. We then conversely have
the parametrization (inverse map) $\theta \mapsto (\cos\theta, \sin\theta)$. It’s also possible to define $\theta$ on any arc not
covering the whole circle.
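The chart of Example 48 and its inverse can be written out directly. This is a small sketch with hypothetical function names; it rejects points outside the open right semicircle, where this particular formula for θ is not used.

```python
import math

def theta(x, y):
    """Angle coordinate on the open right semicircle x > 0 of the unit circle."""
    if x <= 0:
        raise ValueError("this chart is only defined where x > 0")
    return math.atan(y / x)

def parametrize(th):
    """Inverse map: angle -> point of the unit circle."""
    return math.cos(th), math.sin(th)

if __name__ == "__main__":
    for th in (-1.2, 0.0, 0.9):
        print(th, theta(*parametrize(th)))   # round trip recovers the angle
```

For angles in (−π/2, π/2) the round trip through the parametrization recovers the angle; on other arcs a different branch has to be chosen.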
REMARK 49. Since $\theta$ is only locally defined, we sometimes prefer to have the coordinates
NOT be valued in $\mathbb{R}$, e.g. have $\theta$ valued in $S^1$. This requires some care, but has some advantages.
EXERCISE 50. Given a mechanical system, find a coordinate system, and then a parametrization
of configuration space by the coordinates.
NOTATION 51. A (locally defined) function $f : \mathbb{R}^{\dim X} \to \mathbb{R}$ induces a function on configuration
space by composing with the coordinates: when we write $f(q)$ we really mean $f \circ (q_\alpha)_\alpha$ (example:
the potential energy $U(\theta) = -mgL \cos\theta$ from Example 34). Conversely if f is a function on X
we can identify it with a function defined on the coordinates by composing with the inverse of q.
Similarly a curve $\gamma : I \to X$ induces a coordinate curve via $q \circ \gamma$, and a coordinate curve induces a
curve in configuration space by composition with $q^{-1}$.
EXAMPLE 52. Potential energy due to external gravity, or due to interaction between pairs (or
groups) of particles.
REMARK 53. It is in fact useful to have the coordinates depend on time, that is to have $q_\alpha : U \times I \to \mathbb{R}$
where I is some time interval. This is significant and will play a role in the sequel.
2.2.2. Physics: computation in coordinates. We first clarify something.
Warning: Let $f : X \to \mathbb{R}$ be some function (say potential energy). Then $f \circ q^{-1}$ is a function
on coordinate space (i.e. if you plug in values for the coordinates you get the value of f at the
corresponding point of X). Following the physics convention we will use the same letter for both
functions, and leave it to the reader to figure out which we mean. In particular when we write
$x_i = x_i(q; t)$ we might mean the standard coordinate functions on X coming by restriction from
$\mathbb{E}^{dN}$, or the Euclidean coordinates as functions of the generalized coordinates.
• Suppose the system moves according to a curve $\gamma \subset X$. Then the coordinates change
according to $q_\alpha(\gamma(t); t)$. We usually write $q_\alpha(t)$ for these functions; the vector
$(q_\alpha(t))_\alpha \in \mathbb{R}^{\dim X}$ is the coordinate curve. We often compute these functions directly, and then see
the implications in physics space by applying $q^{-1}(\cdot; t)$ to get points in X.
• In particular we will usually write the equations of motion as ODEs for $q_\alpha(t)$ and solve
those.
• We often try to choose the coordinates $q_\alpha$ to make the expression for relevant functions
or for the equations of motion simpler.
Let $\gamma : I \to X$ be a differentiable curve through configuration space. As we saw in Lemma 41, at
every time t we have $\dot\gamma(t) \in T_{\gamma(t)} X$.
DEFINITION 54. We call $\dot\gamma$ the velocity of the path. This is a vector of the N velocities of the
individual particles.
DEFINITION 55. Let $\langle \cdot, \cdot \rangle$ denote the inner product on $\mathbb{R}^d = T_x \mathbb{E}^d$ with associated norm $|\cdot|$;
think of $\dot\gamma(t)$ as a collection of N vectors $v_j \in \mathbb{R}^d$, and suppose the jth particle has mass $m_j$. Then
the kinetic energy of the particles is
\[ K = \frac{1}{2} \sum_j m_j |v_j|^2 . \]
Observe that this is exactly the restriction of a positive-definite quadratic form from $T_{\gamma(t)} \mathbb{E}^{dN}$ to $T_{\gamma(t)} X$,
hence again a positive-definite quadratic form.
Via the parametrization $x_i = x_i(q; t)$ we obtain a linear relation
\[ \dot{x}_i = \sum_\alpha \frac{\partial x_i}{\partial q_\alpha} \dot{q}_\alpha \]
which allows us to change variables in K, writing K as a quadratic form in the $\dot{q}_\alpha$ instead. Again
it is positive definite.
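EXAMPLE 56 (Rotating frame). Suppose we have a particle moving in the plane, where we
name points by their coordinates $\begin{pmatrix} x \\ y \end{pmatrix}$. Let $R(t) = \begin{pmatrix} \cos(\omega t) & -\sin(\omega t) \\ \sin(\omega t) & \cos(\omega t) \end{pmatrix}$ (warning: this is the
matrix of rotation by $-\omega t$), and consider the time-dependent coordinate system
\[ \begin{pmatrix} X \\ Y \end{pmatrix} = R(t) \begin{pmatrix} x \\ y \end{pmatrix} \]
(so $X = X(x, y; t)$ and $Y = Y(x, y; t)$ as indicated). Conversely we have
\[ \begin{pmatrix} x \\ y \end{pmatrix} = R(-t) \begin{pmatrix} X \\ Y \end{pmatrix} \]
so
\[ \begin{pmatrix} \ddot{x} \\ \ddot{y} \end{pmatrix} = R(-t) \begin{pmatrix} \ddot{X} \\ \ddot{Y} \end{pmatrix} - 2\dot{R}(-t) \begin{pmatrix} \dot{X} \\ \dot{Y} \end{pmatrix} + \ddot{R}(-t) \begin{pmatrix} X \\ Y \end{pmatrix} \]
and
\[ m \begin{pmatrix} \ddot{X} \\ \ddot{Y} \end{pmatrix} = R(t)\, m \begin{pmatrix} \ddot{x} \\ \ddot{y} \end{pmatrix} + 2R(t)\dot{R}(-t)\, m \begin{pmatrix} \dot{X} \\ \dot{Y} \end{pmatrix} - R(t)\ddot{R}(-t)\, m \begin{pmatrix} X \\ Y \end{pmatrix} . \]
Now Newton’s 2nd law reads $m \begin{pmatrix} \ddot{x} \\ \ddot{y} \end{pmatrix} = F(x, y; t)$ where we
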
SCHOLIUM 57. This definition clarifies that velocity is a pointwise notion: we need more
structure to compare velocities at different points. For example, a particle moving around the
circle has velocities tangent to the circle. To study $v(t + h) - v(t)$ we need a connection.
REMARK 58. In fact we could have started with any positive-definite quadratic form. If the
ambient space is a Riemannian manifold $(M, g)$ then on $\left( M^N, \bigoplus_{j=1}^{N} g \right)$ the kinetic energy is the
quadratic form on the tangent space which, relative to $\bigoplus_{j=1}^{N} g$, is block-diagonal with eigenvalues
$m_j$.
REMARK 59. As we shall see later, the most important fact is the convexity of K as a function
on $T_x X$.

CHAPTER 3

Lagrangian mechanics

3.1. Introduction
3.1.1. Historical overview.
• Euler: the equations of motion of Newtonian mechanics can be written in a form that
works for any coordinate system.
• Lagrange: even if there are constraints.
• Hamilton: these equations follow from a variational principle.
3.1.2. Plan.
(1) (Mathematics) Calculus of variations I
(2) (Physics) Hamilton’s principle and the Euler–Lagrange equations; examples
(3) (Physics) conservation laws
(4) (Mathematics) Lagrange multipliers for variational problems
(5) (Physics) Constraint forces
REMARK 60. We will not derive the Euler–Lagrange equations (i.e. show that they are equivalent
to Newtonian mechanics), which is essentially a tedious calculation.

3.2. Calculus of Variations


3.2.1. The problem; formal calculation. Fix a bounded domain $\Omega \subset \mathbb{R}^r$, and consider the
problem of minimizing the expression
\[ S = \int_\Omega L(u(t), du_t; t)\,dt \]
over the space of sufficiently nice functions $u : \bar\Omega \to \mathbb{R}^n$, subject to a boundary condition $u{\restriction}_{\partial\Omega} = g$
for some fixed g. Here $L : \mathbb{R}^n \times M_{n \times r} \times \Omega \to \mathbb{R}$ is some sufficiently nice function.
EXAMPLE 61 (Brachistochrone; Johann Bernoulli 1696 after Galileo). Given two points A, B
in a vertical plane with A higher, find the curve $y = u(x)$ such that a mass sliding along the graph
of u subject to gravity alone will reach B from A in the shortest time.
Align the y-axis vertically down, and suppose $A = (0, 0)$ and $B = (x_B, y_B)$. When the mass is at
$(x, u(x))$ its speed v satisfies $\frac{1}{2} m v^2 = mg\,u(x)$ by conservation of energy, so $v(x) = \sqrt{2g\,u(x)}$. The length
of the part of the curve from x to $x + dx$ is $\sqrt{1 + u'(x)^2}\,dx$ so the time it takes to cover that segment
is $\sqrt{\frac{1 + u'^2}{2g\,u}}\,dx$. Up to the constant factor $1/\sqrt{2g}$ we therefore need to minimize
\[ \int_0^{x_B} \sqrt{\frac{1 + u'^2}{u}}\,dx \]
subject to $u(0) = 0$; $u(x_B) = y_B$.
• Newton famously solved the problem overnight, having found Bernoulli’s challenge letter on
returning from his work at the Mint; he sent the solution anonymously to the Royal Society
by first post and Bernoulli famously remarked “I recognize the lion from his claw
mark” (The Bernoullis took two weeks to solve the problem).
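The travel-time functional of Example 61 can be evaluated numerically on particular curves. The sketch below is an illustration under assumptions, not part of the notes: curves are given parametrically, derivatives are taken by central differences, the integral uses the midpoint rule, and the endpoint B = (πa, 2a) with a = 1 is a hypothetical choice for which the cycloid through A and B has a simple form. The cycloid is the classical solution of the problem; the straight line is included only for comparison.

```python
import math

def travel_time(curve, s0, s1, g=9.81, n=20000):
    """Midpoint-rule value of the integral of ds / v with v = sqrt(2 g y), y measured downward.

    curve(s) returns (x, y); the derivative of the curve is taken by central differences.
    """
    h = (s1 - s0) / n
    d = h * 1e-3
    total = 0.0
    for i in range(n):
        s = s0 + (i + 0.5) * h
        (xa, ya), (xb, yb) = curve(s - d), curve(s + d)
        speed_param = math.hypot(xb - xa, yb - ya) / (2 * d)   # |d(x, y)/ds|
        _, y = curve(s)
        total += speed_param / math.sqrt(2 * g * y) * h
    return total

a, g = 1.0, 9.81                     # hypothetical scale and gravitational acceleration
xB, yB = math.pi * a, 2.0 * a        # endpoint B, the lowest point of the cycloid

def cycloid(phi):
    """Cycloid through A = (0, 0) and B, for phi in [0, pi]."""
    return a * (phi - math.sin(phi)), a * (1.0 - math.cos(phi))

def line(r):
    """Straight segment from A to B, parametrized by r in [0, 1] with s = r^2
    so that the integrand stays bounded near A."""
    return xB * r * r, yB * r * r

t_cycloid = travel_time(cycloid, 0.0, math.pi, g)
t_line = travel_time(line, 0.0, 1.0, g)

if __name__ == "__main__":
    print(t_cycloid, math.pi * math.sqrt(a / g))   # numerical vs. closed form
    print(t_line)                                  # the straight line is slower
```

Along the cycloid the integrand is the constant √(a/g), so the travel time is π√(a/g), which the numerical value should reproduce; the straight line takes longer.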
E XAMPLE 62 (Catenary). A chain of length L made from a material of constant density hangs
from two points A, B in the vertical plane. It is known (“principle of virtual work”) that the chain
will hang so as to minimize total potential energy. What shape will it take?
EXAMPLE 63 (Minimal surface). Let Ω ⊂ R^d be a domain and u : Ω → R a function. Then
\[
  \int_\Omega \sqrt{1 + |Du(x)|^2}\, dx
\]
is the area of the hypersurface y = u(x) given by the graph of u in R^{d+1}.


• Idea: Differentiate wrt u, set derivative to zero.
3.2.2. Differentiation in function spaces. Let V be a (real) vector space (example: the space of paths γ : I → R^n in coordinate space satisfying some differentiability conditions). A function f : V → R is called a functional. Let x ∈ V.
DEFINITION 64. We say that f is
(1) Fréchet differentiable at x if V is a Banach space and there is a linear functional λ = df_x ∈ V^∗ such that f(x + v) = f(x) + ⟨λ, v⟩ + o(∥v∥).
(2) Gâteaux differentiable at x if V is topological and there is a linear functional λ = df_x ∈ V^∗ such that df_x(v) = lim_{h→0} (f(x + hv) − f(x))/h holds for each v ∈ V.
EXAMPLE 65. If V = R^n the first notion is the usual derivative, and the second is the directional derivative (when linear).
REMARK 66. One can directly define the differentiability of functionals on manifolds of paths γ : I → X but we’ll elide this point.
Since we will only be interested in critical points of f, we will consider a much weaker condition, where we only differentiate in a single direction at a time, and only consider a subset of the possible directions. We will also concentrate on the case r = 1 where the situation is considerably simpler (for functional-analytic reasons).
3.2.3. Formal calculation in coordinates. Let I = [t0, t1] be a closed interval. Let Ω ⊂ R^n be open (a “coordinate patch”). Fix a function (“Lagrangian”) L : TΩ × I → R as nice as needed (write this as L = L(q, v; t)), and let γ : I → Ω be as nice as needed. The associated action is
\[
  S(\gamma) = \int_{t_0}^{t_1} L(\gamma, \dot\gamma; t)\, dt .
\]
Suppose γ minimizes S among reasonable curves with γ(t0) = a, γ(t1) = b.
To investigate this let η ∈ C_c^∞(I°) be a “bump function”. Then
\[
  S(\gamma + \varepsilon\eta) = \int_{t_0}^{t_1} L(\gamma + \varepsilon\eta, \dot\gamma + \varepsilon\dot\eta; t)\, dt .
\]
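Taylor expansion gives (suppressing the dependence of L on time)
\[
  L(\gamma(t) + \varepsilon\eta(t), \dot\gamma(t) + \varepsilon\dot\eta(t); t) - L(\gamma(t), \dot\gamma(t); t)
  = \varepsilon \left\langle \left.\frac{\partial L}{\partial q}\right|_{(\gamma(t),\dot\gamma(t))}, \eta(t) \right\rangle
  + \varepsilon \left\langle \left.\frac{\partial L}{\partial v}\right|_{(\gamma(t),\dot\gamma(t))}, \dot\eta(t) \right\rangle
  + O(\varepsilon^2) .
\]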
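Integrating this in t we get
\[
  S(\gamma + \varepsilon\eta) - S(\gamma)
  = \varepsilon \int_{t_0}^{t_1} \left[
      \left\langle \left.\frac{\partial L}{\partial q}\right|_{(\gamma(t),\dot\gamma(t))}, \eta(t) \right\rangle
    + \left\langle \left.\frac{\partial L}{\partial v}\right|_{(\gamma(t),\dot\gamma(t))}, \dot\eta(t) \right\rangle
    \right] dt + O(\varepsilon^2) .
\]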
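Thus if γ is extremal (or even critical) for S we must have
\[
  \int_{t_0}^{t_1} \left[
      \left\langle \left.\frac{\partial L}{\partial q}\right|_{(\gamma(t),\dot\gamma(t))}, \eta(t) \right\rangle
    + \left\langle \left.\frac{\partial L}{\partial v}\right|_{(\gamma(t),\dot\gamma(t))}, \dot\eta(t) \right\rangle
    \right] dt = 0 .
\]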
REMARK 67. If γ needs to be valued in some open domain we can always choose ε small enough to ensure that γ + εη remains in the domain.
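We now integrate the second term by parts. Since η has compact support we have η(t0) = η(t1) = 0 so there are no boundary terms and we get
\[
  \int_{t_0}^{t_1} \left[
      \left\langle \left.\frac{\partial L}{\partial q}\right|_{(\gamma(t),\dot\gamma(t))}, \eta(t) \right\rangle
    - \left\langle \frac{d}{dt}\left.\frac{\partial L}{\partial v}\right|_{(\gamma(t),\dot\gamma(t))}, \eta(t) \right\rangle
    \right] dt = 0 ,
\]
that is
\[
  \int_{t_0}^{t_1} \left\langle
      \left.\frac{\partial L}{\partial q}\right|_{(\gamma(t),\dot\gamma(t))}
    - \frac{d}{dt}\left.\frac{\partial L}{\partial v}\right|_{(\gamma(t),\dot\gamma(t))},
    \eta(t) \right\rangle dt = 0 .
\]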
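Now if γ is extremal this must hold for all η. However if γ ∈ C^2 and L ∈ C^2 then the function ∂L/∂q|_{(γ(t),γ̇(t))} − (d/dt) ∂L/∂v|_{(γ(t),γ̇(t))} is continuous; if it were nonzero somewhere we could choose η to make the integral nonzero. It follows (“Euler–Lagrange equation”) that along the path we have
\[
  \frac{d}{dt}\frac{\partial L}{\partial v}(\gamma(t), \dot\gamma(t); t) = \frac{\partial L}{\partial q}(\gamma(t), \dot\gamma(t); t) .
\]
By the chain rule we can also write this as
\[
  \left(\frac{\partial^2 L}{\partial v^2}(\gamma(t), \dot\gamma(t); t)\right)\ddot\gamma(t)
  + \left(\frac{\partial^2 L}{\partial v\,\partial q}(\gamma(t), \dot\gamma(t); t)\right)\dot\gamma(t)
  + \frac{\partial^2 L}{\partial v\,\partial t}(\gamma(t), \dot\gamma(t); t)
  - \frac{\partial L}{\partial q}(\gamma(t), \dot\gamma(t); t) = 0 ,
\]
which is visibly a second order ODE we can hope to solve.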
• Classical approach: “indirect method”, that is first solve this equation, then show that the
solution is extremal.
• “Direct method”: show a-priori that the action has minimizers and that they must satisfy
the equation.
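The formal calculation above can be checked numerically on a simple case. The sketch below assumes the Lagrangian L = v^2/2 − q^2/2 on I = [0, 1] (an illustrative choice, as are the names `action` and `first_variation`), and uses the perturbation η(t) = sin(πt), which vanishes at the endpoints but is not compactly supported in the interior; that is enough for the integration by parts. The first variation should vanish for the solution γ(t) = sin t of q̈ = −q and not for the straight line with the same endpoints.

```python
import math

def action(q, q_dot, t0=0.0, t1=1.0, n=2000):
    """Midpoint-rule approximation of S = integral of (v^2/2 - q^2/2) dt."""
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * h
        total += 0.5 * q_dot(t) ** 2 - 0.5 * q(t) ** 2
    return total * h

def first_variation(q, q_dot, eta, eta_dot, eps=1e-3):
    """Central difference (S(q + eps*eta) - S(q - eps*eta)) / (2 eps)."""
    plus = action(lambda t: q(t) + eps * eta(t),
                  lambda t: q_dot(t) + eps * eta_dot(t))
    minus = action(lambda t: q(t) - eps * eta(t),
                   lambda t: q_dot(t) - eps * eta_dot(t))
    return (plus - minus) / (2 * eps)

# Perturbation vanishing at both endpoints of [0, 1].
eta = lambda t: math.sin(math.pi * t)
eta_dot = lambda t: math.pi * math.cos(math.pi * t)

# gamma(t) = sin t solves the Euler-Lagrange equation q'' = -q.
dS_solution = first_variation(math.sin, math.cos, eta, eta_dot)

# The straight line with the same endpoints does not solve it.
c = math.sin(1.0)
dS_line = first_variation(lambda t: c * t, lambda t: c, eta, eta_dot)

print(dS_solution, dS_line)
```

For the straight line the exact first variation is −∫(γ̈ + γ)η dt = −c/π, which is what the second number approximates.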
3.2.4. From informal to formal.
LEMMA 68. Let f ∈ C(a, b). Suppose that for all non-negative η ∈ C_c^∞(a, b) we have ∫_a^b f(t)η(t) dt ≥ 0. Then f ≥ 0.
PROOF. Suppose f(t0) < 0. Then there are δ > 0 and a small interval J about t0 on which f↾J ≤ −δ. Let η be any nonzero non-negative function supported on J. Then ∫_a^b f(t)η(t) dt ≤ −δ ∫_J η(t) dt < 0, a contradiction. □

LEMMA 69. Let f ∈ L^1(a, b). Suppose that for all non-negative η ∈ C_c^∞(a, b) we have ∫_a^b f(t)η(t) dt ≥ 0. Then f ≥ 0 almost everywhere.
PROOF. Let dμ = f dt be the measure on [a, b] with density f wrt Lebesgue. Then μ(η) = ∫_a^b f(t)η(t) dt, and from the Riesz Representation Theorem we get that μ is a positive measure, so its Radon–Nikodym derivative f with respect to dt must be non-negative almost everywhere (if f were negative on a set of positive measure, the measure of that set would be negative). □
COROLLARY 70. In either case, if the integrals always vanish so does the function.
We have proved:

PROPOSITION 71. Let L ∈ C^2(TΩ × I → R) and suppose γ ∈ C^2(I; Ω) is critical for S = ∫_{t0}^{t1} L dt given its endpoints. Then
\[
  \frac{d}{dt}\frac{\partial L}{\partial v}(\gamma(t), \dot\gamma(t); t) = \frac{\partial L}{\partial q}(\gamma(t), \dot\gamma(t); t) .
\]
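Furthermore, suppose (“Ellipticity”) that ∂^2L/∂v^2 is positive definite. We can then write the ODE in the form
\[
  \ddot\gamma = \left(\frac{\partial^2 L}{\partial v^2}\right)^{-1}
  \left[ \frac{\partial L}{\partial q}(\gamma(t), \dot\gamma(t); t)
       - \left(\frac{\partial^2 L}{\partial v\,\partial q}(\gamma(t), \dot\gamma(t); t)\right)\dot\gamma(t)
       - \frac{\partial^2 L}{\partial v\,\partial t}(\gamma(t), \dot\gamma(t); t) \right] ,
\]
which will have a unique solution for each initial condition (γ(t0), γ̇(t0)).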
• The ellipticity condition is exactly the positive definiteness of the mass matrix.
We would like to do two related things:
(1) Show that there actually exists a minimizer. We will concentrate on a particular class of Lagrangians.
(2) Extend the class of acceptable paths γ.
DEFINITION 72. Say the Lagrangian is standard if it has the form
\[
  L(q, \dot q; t) = \tfrac12 \langle M(t)\dot q, \dot q\rangle - U(q, t)
\]
where M = M(t) is symmetric and satisfies M ≥ μ for some constant μ > 0, and U is continuous.
3.2.5. Existence of minimizers.
DEFINITION 73 (Sobolev space). For a sufficiently differentiable function γ : I → R^n define
\[
  \|\gamma\|_{H^k}^2 = \sum_{i=0}^{k} \bigl\|\gamma^{(i)}\bigr\|_{L^2(I)}^2
\]
and let H^k(I; R^n) be the completion of the space of smooth functions wrt this norm.
FACT 74. This is the space of γ such that the kth distributional derivative is represented by an L^2-function.
THEOREM 75 (Sobolev embedding). The inclusion map (C^k(I), ∥·∥_{H^k}) → (C^{k−1}(I), ∥·∥_{C^{k−1}}) is compact.
COROLLARY 76. Let L be a standard Lagrangian. Then γ ↦ S(γ) is continuous with respect to ∥·∥_{H^1} and thus extends to a continuous function on H^1(I).
LEMMA 77 (Poincaré inequality). Let u : I → R^n be a differentiable function with u(t0) = u(t1) = 0. Then
\[
  \int_{t_0}^{t_1} |u|^2\, dt \le \left(\frac{t_1 - t_0}{2}\right)^2 \int_{t_0}^{t_1} |\dot u|^2\, dt .
\]

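PROOF. Wlog the interval is [−Δ, Δ] with Δ = (t1 − t0)/2. Integrating by parts, using u(±Δ) = 0 to kill the boundary term and then Cauchy–Schwarz, we have
\[
  \int_{-\Delta}^{\Delta} |u|^2\, dt
  = \Bigl[ t\,|u|^2 \Bigr]_{-\Delta}^{\Delta} - 2\int_{-\Delta}^{\Delta} t\,\langle u, \dot u\rangle\, dt
  \le 2\Delta \left(\int_{-\Delta}^{\Delta} |u|^2\, dt\right)^{1/2} \left(\int_{-\Delta}^{\Delta} |\dot u|^2\, dt\right)^{1/2} ,
\]
so that ∫|u|^2 dt ≤ 4Δ^2 ∫|u̇|^2 dt. This is weaker than the stated bound by a factor of 4; the stated constant Δ^2 follows from the sharp inequality, whose constant is (2Δ/π)^2 ≤ Δ^2. □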

REMARK 78. This is not the optimal constant, which is the reciprocal of the smallest eigenvalue of the Dirichlet Laplacian on I, namely ((t1 − t0)/π)^2.
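A quick numerical illustration of the inequality and of the remark on the optimal constant. This is a sketch assuming the standard fact that the lowest Dirichlet eigenfunction on an interval of length ℓ is sin(π(t − t0)/ℓ) with eigenvalue (π/ℓ)^2; the interval [0, 3] and the helper name `l2_norm_sq` are illustrative choices.

```python
import math

def l2_norm_sq(f, t0, t1, n=4000):
    """Midpoint-rule approximation of the integral of f(t)^2 over [t0, t1]."""
    h = (t1 - t0) / n
    return h * sum(f(t0 + (i + 0.5) * h) ** 2 for i in range(n))

t0, t1 = 0.0, 3.0
length = t1 - t0

# Lowest Dirichlet eigenfunction on [t0, t1] and its derivative.
u = lambda t: math.sin(math.pi * (t - t0) / length)
u_dot = lambda t: (math.pi / length) * math.cos(math.pi * (t - t0) / length)

ratio = l2_norm_sq(u, t0, t1) / l2_norm_sq(u_dot, t0, t1)
lemma_constant = (length / 2) ** 2        # constant in the Poincare lemma
sharp_constant = (length / math.pi) ** 2  # reciprocal of the lowest eigenvalue

print(ratio, sharp_constant, lemma_constant)
```

For this u the ratio of the two integrals equals the sharp constant, which sits below the constant in the lemma.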

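LEMMA 79 (Coercivity). Suppose U(q) ≤ A + C|q|^2. For ε > 0 we can find δ > 0 such that
\[
  \int_{t_0}^{t_0+\delta} U(q, t)\, dt \le B + \varepsilon \int_{t_0}^{t_0+\delta} |\dot q|^2\, dt ,
\]
for a constant B depending only on A, C, δ and the endpoint values q(t0), q(t0 + δ).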

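PROOF. Let q̃ be the linear function of time interpolating a = q(t0), b = q(t0 + δ) and let u = q − q̃. Then u̇ = q̇ − (b − a)/δ, so |u̇|^2 ≤ 2|q̇|^2 + 2|b − a|^2/δ^2. Also
\[
  U(q) \le A + C|\tilde q + u|^2 \le A + 2C|\tilde q|^2 + 2C|u|^2 .
\]
Now integrate on [t0, t0 + δ]. Using |q̃|^2 ≤ |a|^2 + |b|^2 and the Poincaré inequality for u (which vanishes at both endpoints) we get
\[
\begin{aligned}
  \int_{t_0}^{t_0+\delta} U(q)\, dt
  &\le A\delta + 2C\delta\left(|a|^2 + |b|^2\right) + \tfrac12 C\delta^2 \int_{t_0}^{t_0+\delta} |\dot u|^2\, dt \\
  &\le A\delta + 2C\delta\left(|a|^2 + |b|^2\right) + C\delta\,|b - a|^2 + C\delta^2 \int_{t_0}^{t_0+\delta} |\dot q|^2\, dt .
\end{aligned}
\]
Now take δ small enough so that Cδ^2 < ε. □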

COROLLARY 80 (Lower bound). Suppose U grows at most quadratically and that the time interval is short enough (depending on the constants including the initial conditions) to get ε < μ. Then
(1) The values S(γ) are bounded below. In particular inf_γ S(γ) exists.
(2) Sublevel sets are bounded in H^1.

Now let {γ_n}_{n=1}^∞ ⊂ C^1(I) have S(γ_n) → inf_γ S(γ). By the Sobolev embedding theorem we can pass to a subsequence so that γ_n converge uniformly to a continuous function γ_∞. By Banach–Alaoglu we can also assume that γ_n converge weakly, so that
3.3. The Euler–Lagrange equations
Let us now calculate with this formalism.

DEFINITION 81. We call ∂L/∂v the (generalized) momentum. In a particular coordinate system call p_α = ∂L/∂q̇_α the momentum associated to the coordinate q_α (though of course this depends on the entire coordinate system). We call ∂L/∂q the (generalized) force. In a particular coordinate system we get a generalized force F_α = ∂L/∂q_α associated to each coordinate, with equation of motion
\[
  \frac{d}{dt} p_\alpha = F_\alpha .
\]
EXAMPLE 82. For a standard Lagrangian L(x, v) = (1/2)⟨M(x, t)v, v⟩ − U(x, t) we have p = M(x, t)ẋ and, when M does not depend on x, F = −dU.

3.3.1. Cyclic coordinates and conserved quantities. Fix a coordinate system q = (q_α)_{α=1}^n : X → Ω ⊂ R^n. The Euler–Lagrange equations take the form
\[
  \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_\alpha}\right) = \frac{\partial L}{\partial q_\alpha}, \qquad 1 \le \alpha \le n .
\]
Here we think of L as a function on TΩ × R via composition with q^{−1}.

DEFINITION 83. Call a coordinate q_β cyclic if ∂L/∂q_β = 0, in other words if L does not depend on q_β explicitly (of course in the given coordinate system, that is when the other coordinates are the other q_α).

OBSERVATION 84. If q_α is cyclic then (d/dt)(∂L/∂q̇_α) = 0. In other words, the associated generalized momentum p_α = ∂L/∂q̇_α : TX × R → R is constant along the physical path.

DEFINITION 85. We say that a quantity f : TX × R → R is a conserved quantity in that situation, in other words if (d/dt) f(γ(t), γ̇(t); t) = 0 along the physical path γ(t).

EXAMPLE 86. Consider a particle moving in the plane with downward-pointing gravity, that is the Lagrangian
\[
  L = \tfrac12 m\left(\dot x^2 + \dot y^2\right) + mgy .
\]
Clearly the x-coordinate is cyclic. Now retain the x-coordinate but switch to the coordinate system (x, z) where z = x + y. Then y = z − x so we also have
\[
  L = \tfrac12 m\left(2\dot x^2 + \dot z^2 - 2\dot z\dot x\right) + mgz - mgx .
\]
Now the x-coordinate is not cyclic, showing that the notion of cyclicity depends on the coordinate system and not just on a single coordinate.
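The dependence on the coordinate system can be seen numerically. The sketch below evaluates the momentum conjugate to x in both coordinate systems of Example 86 along the exact solution ẍ = 0, ÿ = g of the Euler–Lagrange equations for that Lagrangian; the initial data, the mass, and the function names are illustrative assumptions.

```python
M, G = 2.0, 9.81  # illustrative mass and gravitational acceleration

def state(t, x0=0.0, y0=0.0, vx0=1.5, vy0=-0.5):
    """Exact solution of x'' = 0, y'' = g for L = m(x'^2 + y'^2)/2 + m g y."""
    x = x0 + vx0 * t
    y = y0 + vy0 * t + 0.5 * G * t * t
    return x, y, vx0, vy0 + G * t

def p_x_in_xy(t):
    """p_x = dL/dx' in the coordinates (x, y): m x'."""
    _, _, vx, _ = state(t)
    return M * vx

def p_x_in_xz(t):
    """p_x = dL/dx' in the coordinates (x, z = x + y): m(2x' - z')."""
    _, _, vx, vy = state(t)
    vz = vx + vy
    return M * (2 * vx - vz)

for t in (0.0, 1.0, 2.0):
    print(t, p_x_in_xy(t), p_x_in_xz(t))
```

In the (x, y) system the momentum stays constant, while in the (x, z) system it changes at the rate ∂L/∂x = −mg, as the Euler–Lagrange equation for that coordinate predicts.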
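L itself is a function which we can differentiate along the physical path. We have
\[
\begin{aligned}
  \frac{d}{dt}\bigl(L(\gamma, \dot\gamma; t)\bigr)
  &= \frac{\partial L}{\partial x}\dot\gamma + \frac{\partial L}{\partial v}\ddot\gamma + \frac{\partial L}{\partial t}
     && \text{(chain rule)} \\
  &= \left(\frac{d}{dt}\frac{\partial L}{\partial v}\right)\dot\gamma + \frac{\partial L}{\partial v}\frac{d}{dt}(\dot\gamma) + \frac{\partial L}{\partial t}
     && \text{(equations of motion)} \\
  &= \frac{d}{dt}\left(\frac{\partial L}{\partial v}\dot\gamma\right) + \frac{\partial L}{\partial t}
     && \text{(Leibniz rule).}
\end{aligned}
\]
Rearranging we obtain the Beltrami identity
\[
  \frac{d}{dt}\left(\frac{\partial L}{\partial v} v - L\right) = -\frac{\partial L}{\partial t} .
\]
Here we interpret (∂L/∂v)v − L as a function on TX × R, which is to be evaluated at (γ(t), γ̇(t); t) and then differentiated wrt t.
DEFINITION 87. The energy of the system is E = (∂L/∂v)v − L.
COROLLARY 88 (Conservation of energy). Suppose ∂L/∂t = 0, that is that L does not depend on time explicitly. Then E is a conserved quantity.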
• Observe that E = C is a first-order ODE. With one degree of freedom that is the first
integral of the equations of motion.
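As a quick check of the definition of the energy, here is the computation for a standard Lagrangian in the sense of Definition 72, assuming M and U do not depend on time and writing T for the kinetic term; it uses only the symmetry of M.

```latex
\[
  \frac{\partial L}{\partial v} = M v ,
  \qquad
  E = \frac{\partial L}{\partial v} v - L
    = \langle M v, v\rangle - \Bigl(\tfrac12 \langle M v, v\rangle - U(q)\Bigr)
    = \tfrac12 \langle M v, v\rangle + U(q)
    = T + U .
\]
```

So in this case the energy is the familiar sum of kinetic and potential energy.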
EXERCISE 89. Use this to solve the catenary and brachistochrone problems.
REMARK 90. We will discuss conserved quantities further in Section 3.4.
3.3.2. Constraints. Paying debt.
3.3.3. Examples.
3.4. More on conserved quantities: symmetries and Noether’s Theorem
3.4.1. Symmetries of configuration space.
DEFINITION 91. A one-parameter semigroup is a smooth function g : I × X → X (which we write g_r(x) instead of g(r, x)) so that g_0(x) = x and so that g_{r+s} = g_r ∘ g_s.
OBSERVATION 92. g_{−r} = g_r^{−1}.
EXAMPLE 93. In E^d fix a vector v ∈ R^d and let g_r(x) = x + rv. In R^2 (i.e. fixing an origin) let
\[
  g_\theta \begin{pmatrix} x \\ y \end{pmatrix}
  = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}
    \begin{pmatrix} x \\ y \end{pmatrix}
\]
be the rotation by θ.
We can differentiate g with respect to each variable separately. In particular g_r induces a map TX → TX (which we denote with the same symbol) via
g_r(x, v) = (g_r(x), (d_x g_r)(v)) .
DEFINITION 94. We say that the one-parameter semigroup {g_r} is a symmetry of L, or that L is invariant under the semigroup, if L ∘ g_r = L for all r.
EXAMPLE 95. Translation by a cyclic coordinate.
3.4.2. Noether’s Theorem. For a fixed x, r ↦ g_r(x) is a differentiable curve. Write g′(x) ∈ T_x X for its derivative at r = 0. This is a vector field on X.
LEMMA 96. g_r(x) are the integral curves of this vector field.
PROOF. We have g_{r+ε}(x) = g_ε(g_r(x)). It follows that (d/dr) g_r(x) = g′(g_r(x)), so we have the (unique) solution to dy/dr = g′(y). □
LEMMA 97. (d/dr)(d_x g_r(γ̇)) = (d/dr)(d/dt) g_r(γ(t)) = (d/dt)(d/dr) g_r(γ(t)) = (d/dt) g′(g_r(γ(t))).

THEOREM 98 (Noether; weak version). Suppose that g_r is a symmetry. Then the quantity ⟨∂L/∂v, g′(x)⟩ is conserved.

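PROOF. By assumption we have S(g_r ∘ γ) = S(γ) for all r. We now differentiate this identity with respect to r:
\[
\begin{aligned}
  0 &= \frac{d}{dr} S(g_r \circ \gamma)
     = \frac{d}{dr} \int_{t_0}^{t_1} L\bigl(g_r(\gamma(t), \dot\gamma(t)); t\bigr)\, dt \\
    &= \int_{t_0}^{t_1} \left[
         \left\langle \frac{\partial L}{\partial x}, g'(g_r(\gamma(t))) \right\rangle
       + \left\langle \frac{\partial L}{\partial v}, \frac{d}{dr}(d_x g_r)(\dot\gamma(t)) \right\rangle
       \right] dt \\
    &= \int_{t_0}^{t_1} \left[
         \left\langle \frac{\partial L}{\partial x}, g'(g_r(\gamma(t))) \right\rangle
       + \left\langle \frac{\partial L}{\partial v}, \frac{d}{dt} g'(g_r(\gamma(t))) \right\rangle
       \right] dt ,
\end{aligned}
\]
where the partial derivatives of L are evaluated at (g_r(γ(t), γ̇(t)); t) and the last step uses Lemma 97. Now setting r = 0 and integrating by parts we get
\[
  0 = \int_{t_0}^{t_1} \left\langle \frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial v}, g'(\gamma(t)) \right\rangle dt
    + \left[ \left\langle \frac{\partial L}{\partial v}, g'(\gamma(t)) \right\rangle \right]_{t=t_0}^{t=t_1} .
\]
Along a physical path the integrand vanishes by the Euler–Lagrange equations, so the boundary term vanishes for every choice of t0 < t1, as claimed. □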
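To illustrate the theorem in the simplest case, the following worked computation applies it to the rotations g_θ of Example 93. It is a sketch assuming a planar particle of mass m in a rotationally invariant potential U(r), r = \sqrt{x^2 + y^2}, with polar angle θ; this Lagrangian is an illustrative choice.

```latex
\[
  g'(x, y) = \left.\frac{d}{d\theta}\right|_{\theta=0}
  \begin{pmatrix} x\cos\theta + y\sin\theta \\ -x\sin\theta + y\cos\theta \end{pmatrix}
  = \begin{pmatrix} y \\ -x \end{pmatrix},
  \qquad
  \frac{\partial L}{\partial v} = (m\dot x, m\dot y)
  \quad\text{for}\quad
  L = \tfrac12 m\left(\dot x^2 + \dot y^2\right) - U(r),
\]
\[
  \left\langle \frac{\partial L}{\partial v}, g'(x, y) \right\rangle
  = m\left(\dot x\, y - \dot y\, x\right)
  = -m r^2 \dot\theta .
\]
```

The sign reflects that the matrix in Example 93 rotates clockwise for θ > 0; up to this sign the conserved quantity is the angular momentum m r^2 θ̇.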
3.4.3. Total derivatives. Let f : X × R → R be any function, and define formally the “total derivative” df/dt : TX × R → R by
\[
  \frac{df}{dt}(x, v, t) = \left\langle \frac{\partial f}{\partial x}, v \right\rangle + \frac{\partial f}{\partial t} .
\]
LEMMA 99. Let γ be any path. Then (d/dt) f(γ(t); t) = (df/dt)(γ(t), γ̇(t); t). In particular
\[
  \int_{t_0}^{t_1} \frac{df}{dt}(\gamma(t), \dot\gamma(t); t)\, dt = f(\gamma(t_1); t_1) - f(\gamma(t_0); t_0) .
\]
COROLLARY 100. Let L̃ = L + df/dt. Then for any path γ with endpoints a, b we have S̃(γ) = S(γ) + f(b; t1) − f(a; t0) and in particular S, S̃ have the same critical points and the same Euler–Lagrange equations.
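The last claim can also be checked directly. The following worked computation is a sketch in one degree of freedom, assuming f = f(q, t) is twice continuously differentiable; it shows that the extra term df/dt contributes nothing to the Euler–Lagrange equation.

```latex
\[
  \frac{df}{dt}(q, v, t) = f_q\, v + f_t ,
  \qquad
  \frac{\partial}{\partial q}\frac{df}{dt} = f_{qq}\, v + f_{qt} ,
  \qquad
  \frac{\partial}{\partial v}\frac{df}{dt} = f_q ,
\]
\[
  \frac{d}{dt}\Bigl( f_q(\gamma(t), t) \Bigr) = f_{qq}\,\dot\gamma + f_{qt}
  = \left.\frac{\partial}{\partial q}\frac{df}{dt}\right|_{(\gamma(t), \dot\gamma(t); t)} .
\]
```

So the two sides of the Euler–Lagrange equation change by the same amount when L is replaced by L + df/dt.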
EXAMPLE 101. Let L = T − U be time independent, with conserved energy E = T + U. Then L̃ = T − U + (1/2)t^2 has the same conserved quantity despite not being time independent.
We now generalize the previous discussion.
DEFINITION 102. A one-parameter group is a smooth family of smooth maps g_r : X × E^1 → X × E^1 so that g_0(x, t) = (x, t) and so that g_{r+s} = g_r ∘ g_s.
We say that the one-parameter group is a symmetry of the Lagrangian L if L ∘ g_r − L is a total derivative for each r.
• Now write (d/dr)|_{r=0} g_r(x, t) = (g′(x, t), T′(x, t)).
The following is a common generalization of the law of conservation of energy and the weak version of the theorem.
THEOREM 103 (Noether). Suppose that {g_r}_r is a symmetry. Then the quantity
\[
  \left\langle \frac{\partial L}{\partial v}, g'(x, t) \right\rangle - T'(x, t)\left( \frac{\partial L}{\partial v} v - L \right)
\]
is conserved.

PROOF. Exercise. □

3.5. Rotations and angular momentum


3.5.1. Linear algebra. Equip Rd with its standard inner product and Euclidean metric, and
let O(d) be the group of rigid motions fixing the origin.
LEMMA 104. Each g ∈ O(d) is linear and satisfies g^∗g = Id. Conversely O(d) = {g ∈ M_d(R) | g^∗g = Id}.
LEMMA 105. The (upper triangular part of the) constraints g^∗g = Id are non-degenerate. The tangent space at the identity is so(d) = {X ∈ M_d(R) : X^∗ + X = 0}.
PROOF. Let F(g) = g^∗g. Given a deformation Y ∈ M_d(R) we have
\[
  F(g + Y) = (g^* + Y^*)(g + Y) = g^* g + g^* Y + Y^* g + O(Y^2) ,
\]
so dF_g(Y) = g^∗Y + (g^∗Y)^∗ is (twice) the symmetrization of g^∗Y. Where g is invertible the image of this map is the space of symmetric matrices, which has the same dimension as the target space of F. □
COROLLARY 106. so(d) = T_1 O(d) = {X ∈ M_d(R) | X^∗ + X = 0}; T_g O(d) = {gX | X ∈ so(d)}.


DEFINITION 107. We call elements X ∈ so(d) infinitesimal rotations.


DEFINITION 108. The matrix exponential is given by
\[
  \exp X = \sum_{k=0}^{\infty} \frac{1}{k!} X^k .
\]
The matrix logarithm is
\[
  \log g = \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k} (g - \mathrm{Id})^k .
\]
LEMMA 109. log converges in a small neighbourhood of the identity, exp converges for all X; for small enough g, X they are inverse to each other. If X, Y commute we have exp(X + Y) = exp X exp Y; if g, h commute we have log(gh) = log g + log h.
LEMMA 110. On the neighbourhood V ⊂ O(d) of absolute convergence log : V → so(d) is a coordinate system with parametrization exp.
PROOF. For g ∈ O(d) close enough to the identity we have (log g) + (log g)^∗ = (log g) + (log g^∗) = log(g^∗g) = log Id = 0 since g, g^∗ commute (they are inverse to each other). Conversely if X^∗ = −X then X, X^∗ commute and exp(X)^∗ exp(X) = exp(X^∗ + X) = Id.
We also remark that near the identity we have
\[
  \log\bigl(I + X + O(X^2)\bigr) = X + O(X^2) ,
\]
giving a different confirmation that log has full rank (and in fact its derivative is the identity). □
COROLLARY 111. Each X ∈ so(d) defines a one-parameter subgroup g_r = exp(rX).
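The statement that exp maps so(d) into O(d) can be tested numerically. The sketch below sums the power series for exp(rX) for one antisymmetric 3×3 matrix and measures how far g^T g is from the identity; the particular matrix, the value of r, the truncation at 40 terms, and the helper names are illustrative assumptions.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(x, terms=40):
    """Truncated power series: sum of x^k / k! for k < terms."""
    n = len(x)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term x^k / k!, starting at k = 0
    for k in range(1, terms):
        term = mat_mul(term, x)
        term = [[entry / k for entry in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def transpose(a):
    return [list(col) for col in zip(*a)]

# An arbitrary antisymmetric matrix X in so(3) and the group element exp(rX).
X = [[0.0, 0.3, -0.2],
     [-0.3, 0.0, 0.5],
     [0.2, -0.5, 0.0]]
r = 1.7
g = mat_exp([[r * entry for entry in row] for row in X])

# Largest deviation of g^T g from the identity; expected at rounding-error level.
gtg = mat_mul(transpose(g), g)
defect = max(abs(gtg[i][j] - float(i == j)) for i in range(3) for j in range(3))
print(defect)
```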
FACT 112. Write R^d ≃ (R^2)^{d/2} or (R^2)^{(d−1)/2} ⊕ R (orthogonal sum) depending on whether d is even or odd. For each 1 ≤ i ≤ d/2 let
\[
  X_i = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \in \mathfrak{so}(2)
\]
in the relevant coordinates. Then {exp(r_i X_i)}_{i≤d/2} is a maximal family of one-parameter subgroups; {exp(Σ_{i≤d/2} r_i X_i)} is a maximal commutative subgroup (“maximal torus”).
3.5.2. Angular momentum. Suppose we have the standard Lagrangian
\[
  \frac12 \sum_{j=1}^{N} m_j |v_j|^2 - U(x) .
\]
Each g ∈ O(d) acts by matrix multiplication on the coordinates of each particle. We have (d/dt) g x_j = g v_j and |g v_j|^2 = |v_j|^2. Accordingly if U(gx) = U(x) (g acting diagonally) we have a rotationally invariant Lagrangian. Now for each X ∈ so(d) we obtain a one-parameter subgroup g_r = exp(rX) with (d/dr)|_{r=0} exp(rX) x_j = X x_j. It follows that the quantity
\[
  \sum_{j=1}^{N} m_j \langle v_j, X x_j \rangle = \sum_{j=1}^{N} m_j \sum_{i=1}^{d} (v_j)_i (X x_j)_i
\]
is conserved.
DEFINITION 113. Fix x0 ∈ E^d. The angular momentum of a particle of mass m at position x moving at velocity v ∈ R^d is the linear functional L ∈ so(d)′ given by
L(X) = m (x − x0)^T X v .
EXERCISE 114. Using the basis
\[
  X_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}, \quad
  X_2 = \begin{pmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \quad
  X_3 = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\]
see that in 3d we recover the usual angular momentum.
COROLLARY 115. If the potential is invariant under rotation, total angular momentum is conserved.
3.5.3. More linear algebra. Better to think of X as a map Rn → Rn∗ with X ∗ the dual map
Rn = Rn∗∗ → Rn∗ required to equal −X. Note that this still forces ⟨Xu, u⟩ = ⟨X ∗ u, u⟩ = − ⟨Xu, u⟩.
LEMMA 116. Let u, v ∈ R^d be vectors. Then the functional X ↦ uXv depends only on the plane spanned by u, v (modulo rescaling) and (if nonzero) conversely.
PROOF. By antisymmetry (au + bv)X(cu + dv) = (ad − bc)uXv. Conversely let w be independent of u, v and let u^∗, v^∗, w^∗ be corresponding elements of a dual basis, X = uw^∗ − wu^∗. Then uXv = 0 since both w^∗, u^∗ vanish at v. On the other hand uXw = |u|^2 ≠ 0. □
PROPOSITION 117. Given L we can find 2k ≤ d orthonormal vectors {u_i, v_i}_{i=1}^k such that L is a linear combination of the functionals u_i^T X v_i.
PROOF. Think of L as an antisymmetric matrix; find an orthonormal eigenbasis invariant under complex conjugation, and let u, v be the real and imaginary parts of an eigenvector. Alternatively apply Darboux’s Theorem. □
• Multiparticle motion is more complicated.
3.5.4. Central potential. Suppose a single mass in R^d is moving in a central potential U(x) = U(r) where r = |x|. At some particular time t either v, x are proportional to each other, and then we have a 1d problem, or they are not. By Lemma 116 and conservation of angular momentum the motion is restricted to the plane spanned by x, v. In polar coordinates (r, θ) on that plane we therefore have the Lagrangian
\[
  \tfrac12 m\left(\dot r^2 + r^2\dot\theta^2\right) - U(r) .
\]
We have two conserved quantities:
\[
  E = \tfrac12 m\left(\dot r^2 + r^2\dot\theta^2\right) + U(r) ,
  \qquad
  L = m r^2 \dot\theta .
\]
COROLLARY 118 (Kepler equal area law). The angle is monotone; the area swept by the orbit between times t0, t1 is (1/2)∫_{t0}^{t1} r^2 θ̇ dt = (L/2m)(t1 − t0).
Combining the two equations we get
\[
  E = \tfrac12 m\dot r^2 + \tilde U(r)
\]
where Ũ(r) = U(r) + L^2/(2mr^2) is the “effective potential”. This is a separable ODE, which (in theory) can be integrated to give r = r(t). We can then determine the angle from θ̇ = L/(mr^2) and thus obtain the orbit.
EXAMPLE 119. If U blows up at zero slower than 1/r^2 then we can’t have r → 0.
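• Clearly the orbit is either unbounded (coming from infinity to a least radius and returning to infinity) or bounded (oscillating between r_min and r_max). These extrema are determined by ṙ = 0, that is E = Ũ(r).
• The orbit is periodic only if the angle swept while going between the extreme radii and back is a rational multiple of 2π. Note that this angle is
\[
  2\int \dot\theta\, dt
  = 2\int_{r_{\min}}^{r_{\max}} \frac{\dot\theta}{\dot r}\, dr
  = \frac{2L}{\sqrt{2m}} \int_{r_{\min}}^{r_{\max}} \frac{dr}{r^2 \sqrt{E - \tilde U(r)}} ,
\]
where the first integral is over the time taken to go from r_min to r_max.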
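The two conservation laws of this section can be observed in a direct simulation. The sketch below integrates Newton's equations in Cartesian coordinates for the assumed potential U(r) = −1/r with a classical fourth-order Runge–Kutta step, then reports the drift in E and L; the potential, mass, initial state, step size, and function names are all illustrative choices rather than anything fixed by the notes.

```python
import math

M = 1.0  # particle mass (illustrative)

def accel(x, y):
    """Acceleration -grad U / m for the assumed potential U(r) = -1/r."""
    r3 = (x * x + y * y) ** 1.5
    return -x / (M * r3), -y / (M * r3)

def rk4_step(s, dt):
    """One classical Runge-Kutta step for the state s = (x, y, vx, vy)."""
    def f(state):
        x, y, vx, vy = state
        ax, ay = accel(x, y)
        return (vx, vy, ax, ay)
    def shift(state, k, h):
        return tuple(si + h * ki for si, ki in zip(state, k))
    k1 = f(s)
    k2 = f(shift(s, k1, dt / 2))
    k3 = f(shift(s, k2, dt / 2))
    k4 = f(shift(s, k3, dt))
    return tuple(si + dt / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def energy(s):
    x, y, vx, vy = s
    return 0.5 * M * (vx * vx + vy * vy) - 1.0 / math.hypot(x, y)

def angular_momentum(s):
    x, y, vx, vy = s
    return M * (x * vy - y * vx)  # equals m r^2 theta' in polar coordinates

state = (1.0, 0.0, 0.0, 1.2)  # a bound orbit: E < 0
e0, l0 = energy(state), angular_momentum(state)
for _ in range(10000):
    state = rk4_step(state, 1e-3)
e_drift = abs(energy(state) - e0)
l_drift = abs(angular_momentum(state) - l0)
print(e0, l0, e_drift, l_drift)
```

The integrator is not exactly conservative, so the drifts are small but nonzero; they shrink as the step size is reduced.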
3.6. Small oscillations
Let L = (1/2)⟨M(x)v, v⟩ − U(x) with equation of motion
\[
  \frac{d}{dt}\bigl(M(x)v\bigr) = -dU .
\]
In particular if dU(x0) = 0 then γ(t) ≡ x0 solves the equations of motion. Letting q denote the displacement x − x0 we have, to first order in q,
M(x0)q̈ ≈ −H(x0)q
where H(x0) is the Hessian of U at x0, since ((d/dt)M(x)) q̇ = (dM · q̇) q̇ is of second order. We are thus interested in solving the equation
\[
  M\ddot q = -Hq
\]
where q ∈ R^n and M, H are symmetric positive-definite matrices (i.e. we are working near a potential minimum). Letting y = M^{1/2} q this takes the form
\[
  \ddot y = -\tilde H y
\]
where H̃ = M^{−1/2} H M^{−1/2}. Since H̃ is symmetric and positive-definite it is diagonalizable, with eigenvectors {y_j}_{j=1}^n and eigenvalues ω_j^2 > 0. The corresponding vectors q_j = M^{−1/2} y_j (“normal modes”) satisfy H q_j = ω_j^2 M q_j. It follows that
\[
  y(t) = \Re \sum_{j=1}^{n} \left( A_j e^{i\omega_j t} + B_j e^{-i\omega_j t} \right) y_j
\]
and hence
\[
  q(t) = \Re \sum_{j=1}^{n} \left( A_j e^{i\omega_j t} + B_j e^{-i\omega_j t} \right) q_j .
\]

EXAMPLE 120. N equal masses connected by identical springs, say with x0 pinned, q_j the displacement from equilibrium of the jth mass (so q_0 = 0). Then U = (1/2) k Σ_{j=1}^N (q_j − q_{j−1})^2 so
\[
  H = k \begin{pmatrix}
    2 & -1 & & & \\
    -1 & 2 & -1 & & \\
    & -1 & \ddots & \ddots & \\
    & & \ddots & 2 & -1 \\
    & & & -1 & 1
  \end{pmatrix} .
\]
Bibliography

[1] Tom Archibald. Differential equations: a historical overview to circa 1900. In A history of analysis, volume 24 of
Hist. Math., pages 325–353. Amer. Math. Soc., Providence, RI, 2003.
[2] V. I. Arnol′ d. Mathematical methods of classical mechanics, volume 60 of Graduate Texts in Mathematics.
Springer-Verlag, New York, [1989?]. Translated from the 1974 Russian original by K. Vogtmann and A. We-
instein, Corrected reprint of the second (1989) edition.
[3] Richard H. Cushman and Larry M. Bates. Global aspects of classical integrable systems. Birkhäuser/Springer,
Basel, second edition, 2015.
[4] Herbert Goldstein. Classical mechanics. Addison-Wesley Series in Physics. Addison-Wesley Publishing Co.,
Reading, MA, second edition, 1980.
[5] Hans Niels Jahnke, editor. A history of analysis, volume 24 of History of Mathematics. American Mathematical
Society, Providence, RI; London Mathematical Society, London, 2003. Translated from the German.
[6] L. D. Landau and E. M. Lifshitz. Mechanics, volume Vol. 1 of Course of Theoretical Physics. Pergamon Press,
Oxford; Addison-Wesley Publishing Co., Inc., Reading, MA, 1960. Translated from the Russian by J. B. Bell.
[7] Michael Spivak. Physics for mathematicians—mechanics I. Publish or Perish, Inc., Houston, TX, 2010.
