428 Notes
Lecture Notes
Lior Silberman
Dr. Lior Silberman, UBC Department of Mathematics
http://www.math.ubc.ca/~lior
These are rough notes for the Spring 2025 course, copyright ©Lior Silberman 2025. They
may not be posted wholesale elsewhere; instead, link to the file posted on the original course
website. Traditional academic reuse is permitted, however.
Compiled February 6, 2025.
Contents
Administrivia
0.1. Course plan (subject to revision)
Chapter 1. Review of Newtonian mechanics
1.1. Newton’s laws
1.2. Galilean group; spacetime
1.3. Energy and work
Chapter 2. Kinematics
2.1. Configurations and configuration space
2.2. Coordinates
Chapter 3. Lagrangian mechanics
3.1. Introduction
3.2. Calculus of Variations
3.3. The Euler–Lagrange equations
3.4. More on conserved quantities: symmetries and Noether’s Theorem
3.5. Rotations and angular momentum
3.6. Small oscillations
Bibliography
Administrivia
CHAPTER 1
Then (X, d) is a complete metric space. Let X ⊂ C1 (I; En ) be the set of functions γ : I → En such
that γ(t) ∈ B(y0 , R) for all t. Then X is a closed subset (if γn converge uniformly in C (I; En ) then
they converge pointwise and B(y0 , R) is closed), hence a complete metric space in its own right.
Given γ ∈ X and t ∈ I define
\[ (G(\gamma))(t) = y_0 + \int_{t_0}^{t} F(\gamma(s), s)\,ds . \]
The integral is well-defined since γ and F are continuous and since by the Lemma for all s ∈ I we
have (γ(s), s) ∈ B ⊂ Ω. We observe that this also means that |F(γ(s), s)| ≤ M for all s. Now by the
FTC the function G(γ) : I → En is continuously differentiable. Moreover we have
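\begin{align*}
|(G(\gamma))(t) - y_0| &= \left| \int_{t_0}^{t} F(\gamma(s), s)\,ds \right| \\
&\le \left| \int_{t_0}^{t} |F(\gamma(s), s)|\,ds \right| \\
&\le |t - t_0|\, M \le \varepsilon M \le \frac{M}{M+1} R < R .
\end{align*}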
It follows that G(γ) is a continuous function on I valued in B(y0 , R), hence also an element of X.
Finally let γ, γ ′ ∈ X. Then for each t we have
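\begin{align*}
\left| (G(\gamma))(t) - (G(\gamma'))(t) \right| &= \left| \int_{t_0}^{t} F(\gamma(s), s)\,ds - \int_{t_0}^{t} F(\gamma'(s), s)\,ds \right| \\
&\le \left| \int_{t_0}^{t} \left| F(\gamma(s), s) - F(\gamma'(s), s) \right| ds \right| \\
&\le \left| \int_{t_0}^{t} L \left| \gamma(s) - \gamma'(s) \right| ds \right| \\
&\le L\varepsilon \sup_{s \in I} \left| \gamma(s) - \gamma'(s) \right| .
\end{align*}
Taking the supremum over t we get for $\rho = \frac{L}{L+1} < 1$ that
\[ d\left(G(\gamma), G(\gamma')\right) \le \rho\, d\left(\gamma, \gamma'\right) . \]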
By the Banach Fixed Point Theorem (contractive mapping principle) there is a unique γ ∈ X
such that G(γ) = γ, in other words such that for all t we have
\[ \gamma(t) = y_0 + \int_{t_0}^{t} F(\gamma(s), s)\,ds . \]
Clearly γ(t0) = y0. In addition, by the Fundamental Theorem of Calculus γ is a differentiable
function and, for all t ∈ I,
\[ \dot\gamma(t) = F(\gamma(t), t) . \]
Conversely, any solution defined on I belongs to X by Lemma 11 and is a fixed point for G, so
the solution is unique. □
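The fixed-point argument is constructive: iterating G from any starting curve converges to the solution. The following is a minimal Python sketch of that iteration, not part of the notes. It assumes a scalar equation, approximates the integral by the trapezoid rule on a uniform grid, and uses the test equation ẏ = y, y(0) = 1 on [0, 1/2] (so that Lε = 1/2 < 1). The names `picard_iterate`, `n_steps` and `n_iter` are hypothetical.

```python
import math

def picard_iterate(F, y0, t0, t1, n_steps=1000, n_iter=30):
    """Iterate gamma -> G(gamma), (G gamma)(t) = y0 + int_{t0}^t F(gamma(s), s) ds,
    with the integral replaced by the trapezoid rule on a uniform grid."""
    h = (t1 - t0) / n_steps
    ts = [t0 + i * h for i in range(n_steps + 1)]
    gamma = [y0] * (n_steps + 1)          # start from the constant curve
    for _ in range(n_iter):
        vals = [F(g, t) for g, t in zip(gamma, ts)]
        new_gamma = [y0]
        acc = 0.0
        for i in range(n_steps):
            acc += 0.5 * h * (vals[i] + vals[i + 1])
            new_gamma.append(y0 + acc)
        gamma = new_gamma
    return ts, gamma

# Test equation (an assumption of this sketch): y' = y, y(0) = 1, exact solution exp(t).
ts, gamma = picard_iterate(lambda y, t: y, 1.0, 0.0, 0.5)
error_at_end = abs(gamma[-1] - math.exp(0.5))
```

The iterates converge to the fixed point of the discretized map, which differs from the true solution only by the quadrature error of the grid.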
THEOREM 12 (Picard Existence and Uniqueness Theorem). Given the Lipschitz ODE (Ω, F)
and an initial condition (y0, t0):
(1) (Existence) There exists a solution γ of the ODE on some interval I = (t0 − ε, t0 + ε).
(2) (Uniqueness) If (I, γ) and (I′, γ′) are two solutions, then γ = γ′ on I ∩ I′.
(3) (Blowup) There exists a solution (Imax, γmax) defined on an open interval such that any
other solution is obtained by restricting γmax to a subinterval of Imax. Furthermore if
Imax = (a, b) and either a or b is finite then γmax(t) “escapes” as t → a+ or t → b− in the
sense that for any compact set K ⊂ Ω there is δ > 0 such that if t < a + δ or t > b − δ we
have (γmax(t), t) ∉ K.
(4) (Autonomous equation) Suppose F0 : Ω0 → Rn for Ω0 ⊂ En and F(y,t) = F0(y) is
independent of t. Then in the blowup we get that eventually γmax(t) ∉ K for compact subsets K of
Ω0.
PROOF. The first claim is Proposition 9. For the second claim suppose first that I, I′ are open
and let J = {t ∈ I ∩ I′ | γ(t) = γ′(t)}. By assumption γ(t0) = γ′(t0) = y0 so t0 ∈ J. This set is
closed since γ, γ′ are continuous. To see that it is open let t1 ∈ J. Applying Proposition 9 to
the initial condition (γ(t1), t1) we see that γ = γ′ on an interval containing t1. By connectedness
J = I ∩ I′. Finally if an endpoint of I or I′ is contained in both intervals then it is a limit point of
the intersection, and the solutions agree there by continuity.
Let S be the set of solutions γ defined on open intervals containing t0. By (2), γmax = ⋃S is a
function and its domain I is a union of intervals containing t0 hence an open interval. For any t ∈ I
there is some solution γ ∈ S defined at t hence on a neighbourhood of t and since γmax agrees with
γ on that interval γmax is differentiable at t and is a solution. Given a compact set K ⊂ Ω for each
(y,t) ∈ K obtain ε, R, L, M, B, I as in Lemma 10. By compactness we can cover K with finitely
many boxes
\[ B'_k = B(y_k, R_k/2) \times [t_k - \varepsilon_k/2, t_k + \varepsilon_k/2] \subset B_k = B(y_k, R_k) \times [t_k - \varepsilon_k, t_k + \varepsilon_k] \subset \Omega \]
such that F is bounded by $M_k$ and has Lipschitz constant $L_k$ on $B_k$. Let $M = \max_k M_k$, $L = \max_k L_k$,
$R = \frac12 \min_k R_k$ and let $\varepsilon = \min\left\{ \frac{R}{M+1}, \frac{1}{L+1}, \frac12 \min_k \varepsilon_k \right\}$. Then for each (y,t) ∈ K the parameters ε, R, L, M
work on B(y, R) × [t − ε, t + ε]. It follows that any solution passing through (y,t) can be extended
by at least time ε, contradicting the minimality of a and the maximality of b if the return time is
close enough to a or b respectively.
When F is autonomous the bounds above are independent of t so the time to live is uniform
on compacta of Ω0 and the same argument applies. □
CHAPTER 2
Kinematics
We begin by developing the language to describe motion, material which will lead directly to
predicting motion in Chapter 1.
Physics keywords: configuration space,
Mathematics keyword: inverse and implicit function theorems,
2.1. Configurations and configuration space
2.1.1. Physics.
DEFINITION 31 (Informal). A mechanical system consists of several point particles moving in
some ambient space subject to interactions and constraints.
The ambient space will be Euclidean d-space, denoted $E^d$. For the distinction between $E^d$ and
$R^d$ see the problems on “Affine Algebra” in Problem Set 1, and also [2, §2A].
DEFINITION 32. A configuration of the system is then a point $x \in (E^d)^N$ satisfying the
constraints. The configuration space of the system is the set X of all configurations.
• We will think of a particle moving on the round 2-sphere as moving on the surface
$x^2 + y^2 + z^2 = 1$ in $E^3$, but it is also possible to think of it as moving on the sphere directly.
• We will almost always have d ∈ {1, 2, 3} but this is a contingent fact about our everyday
experience, not a mathematical requirement.
EXAMPLE 33. Single free particle; single particle at the end of a Hookean spring; physical
pendulum with massless rod (2d; 3d); rope-and-pulley system.
• From the mathematical point of view one can dispense with this definition and just talk
about the configuration manifold from the start, but that’s not how physicists think. More
importantly we need to be able to construct the configuration space of a physical system.
• We will pretend that continuum systems (e.g. a solid rod) actually consist of a finite but
large number of particles. As long as the rod is rigid this will not cause a problem since
the configuration space will be finite-dimensional.
• Truly infinite-dimensional systems (e.g. fluid flow, or deformation of a plastic material)
are the subject of continuum mechanics and outside the scope of this course.
2.1.2. Mathematics. Often the constraints are holonomic in that, locally at least, they can be
written in the form F(x) = 0 (see Footnote 1) – in other words they concern the configuration of the system only –
e.g. the particle constrained to move on the sphere (say of radius r(t)), or (“rigid motion”) where
the distance between a pair of particles is fixed. Other examples of constraints include
• Hard boundaries, e.g. a bouncing ball restricted to the upper half-plane y ≥ 0. Mathematically
we can either work on a manifold with boundary or handle the situation at the
boundary separately.
• Constraints on the motion (e.g. rolling without slipping), which we will develop later.
Footnote 1. In general one should permit the constraints to depend on time, in which case the configuration space would be
X = X(t), and the analysis below should be extended. This may be developed in a later version of the notes or in a
problem set.
Formally we fix an open set $\Omega \subset E^{dN}$, a time interval I, and assume the constraints take the form
F = 0 for some continuously differentiable function $F : \Omega \to R^m$. We will generally assume F is
non-degenerate in that rank dF = m, at least locally on X (i.e. possibly we express the constraints
by different functions in different places).
EXAMPLE 34. In $E^2$ suppose we have a massless slider moving along a horizontal wire. A
point of mass m is attached to a massless rigid rod of length L freely swinging from the slider. We
want to say something like:
“Let the slider be at $(x_s, y_s)$ and the mass be at (x, y), with constraints
\[ x_s \in [a, b], \qquad y_s = 0, \qquad (x - x_s)^2 + (y - y_s)^2 = L^2 . \]”
Of course a better parametrization would be through the angle θ the rod makes with the vertical
axis (say). We also want to say:
“Instead use as coordinates the location $x_s$ of the slider and the angle θ of the mass.
The slider is then at $(x_s, 0)$ and the mass is at $(x_s + L \sin\theta, -L \cos\theta)$”.
Finally, we would like to say
“The potential energy of the system will then be U = −mgL cos θ. The kinetic
energy will be
\[ \frac12 m \left( \dot x^2 + \dot y^2 \right) = \frac12 m \left( \dot x_s^2 + 2L \cos\theta\, \dot x_s \dot\theta + L^2 \dot\theta^2 \right) . \]”
Our goal is to make sense of all these statements. For this we need to understand what we
mean by the coordinates xs , ys , x, y, θ , what we mean by U(θ ) where U ought to be a function on
X, what we mean by derivatives on configuration space and of coordinates, and so on. For technical
reasons we will begin with the derivatives, and then discuss coordinates, coordinate systems, and
parametrization.
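As a sanity check on the kinetic energy formula of Example 34, here is a minimal Python sketch that is not part of the notes. It compares the expression in the generalized coordinates $(x_s, \theta)$ with the Cartesian expression, where the Cartesian velocity is obtained by a central finite difference of the parametrization. The numerical values of the rod length and mass, and all function names, are hypothetical.

```python
import math

ROD_LENGTH = 1.5   # hypothetical value of L
MASS = 2.0         # hypothetical value of m

def position(xs, theta):
    """Position of the mass from the parametrization in Example 34."""
    return (xs + ROD_LENGTH * math.sin(theta), -ROD_LENGTH * math.cos(theta))

def kinetic_generalized(theta, xs_dot, theta_dot):
    """Kinetic energy written in the generalized velocities."""
    return 0.5 * MASS * (xs_dot ** 2
                         + 2 * ROD_LENGTH * math.cos(theta) * xs_dot * theta_dot
                         + ROD_LENGTH ** 2 * theta_dot ** 2)

def kinetic_cartesian(xs, theta, xs_dot, theta_dot, h=1e-6):
    """Kinetic energy (1/2) m (xdot^2 + ydot^2), with (xdot, ydot) from a
    central difference along the curve t -> (xs + t*xs_dot, theta + t*theta_dot)."""
    x0, y0 = position(xs - h * xs_dot, theta - h * theta_dot)
    x1, y1 = position(xs + h * xs_dot, theta + h * theta_dot)
    vx, vy = (x1 - x0) / (2 * h), (y1 - y0) / (2 * h)
    return 0.5 * MASS * (vx ** 2 + vy ** 2)

gap = abs(kinetic_generalized(0.7, 0.4, -1.3) - kinetic_cartesian(0.2, 0.7, 0.4, -1.3))
```

The two values agree up to the finite-difference error, which is the numerical content of the change of variables from $(\dot x, \dot y)$ to $(\dot x_s, \dot\theta)$.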
REMARK 35. Observe that the angle θ is not really a function on configuration space – if we
go around a full circle we acquire a phase of 2π. We can define branches locally, but not globally.
The same will apply to constraints written in terms of θ.
2.1.3. Tangent vectors and derivatives.
REMARK 36. Recall that $F : E^n \to E^m$ is differentiable at $x \in E^n$ if there is a linear map
$dF_x : R^n \to R^m$ so that for $v \in R^n$, $F(x + v) = F(x) + dF_x(v) + R(x, v)$ where $\frac{|R(x,v)|}{|v|} \to 0$ as $|v| \to 0$.
Note that there is no dot product or metric here; just the notion of displacement on affine space.
EXAMPLE 37. When m = 1 note that $dF_x$ is not a vector – it is a linear map from $R^n \to R$, in
other words a linear functional; in physics language a dual vector or a covector.
REMARK 38. If we wish to speak of the gradient vector ∇F(x) we need a way to associate a
vector to each linear functional. This is provided by an inner product (look up “Riesz Representation
Theorem”), but note that the choice of inner product matters, and different inner products will
produce different gradients for the same function.
Let x, x′ ∈ X be close to each other. We can write x′ = x + εv where v is some unit vector and
ε > 0 is small. Then
\begin{align*}
0 = F(x') &= F(x + \varepsilon v) \\
&= F(x) + \varepsilon\, dF_x(v) + o(\varepsilon) \\
&= \varepsilon\, dF_x(v) + o(\varepsilon) ,
\end{align*}
so
\[ dF_x(v) = o(1) . \]
By compactness of the unit sphere, as x′ → x the unit vectors v converge (along a subsequence) to a point on the sphere satisfying
$dF_x(v) = 0$, that is to $v \in \operatorname{Ker} dF_x$.
DEFINITION 39. For x ∈ X(t) the tangent space is $T_x X(t) = \operatorname{Ker} dF_x$.
NOTATION 40. We use Newton’s dot to denote derivatives with respect to time. By γ̇(t) we
mean the vector of partial derivatives in $R^{dN}$, which is also the image of the standard basis vector
of $T_t R$ by the linear map $d\gamma_t$.
LEMMA 41. Let I be an interval, and let $\gamma : I \to E^{dN}$ be a differentiable curve. Suppose that
γ(t0) ∈ X for some t0. Then the image of γ lies in X iff $\dot\gamma(t) \in \operatorname{Ker} dF_{\gamma(t)}$ for all t ∈ I.
PROOF. PS1. □
2.2. Coordinates
2.2.1. Mathematics.
THEOREM 42 (Implicit function theorem). Let x ∈ Ω and suppose that rank $dF_x$ = m. Then we
can choose some m coordinates of $E^{dN}$ so that locally these coordinates are uniquely determined
by the others. Furthermore this function has the expected derivative. If F is k times differentiable
so is the function defined implicitly.
COROLLARY 43 (Inverse function theorem). When m = dN we have an inverse with the inverse
derivative.
By the implicit function theorem we can, at least locally, parametrize configuration space as
follows: we choose some set ¬C ⊂ [dN] of m indices, with complement C. Then any configuration $x = (x_i)_{i \in [dN]}$ is uniquely
determined by $(x_i)_{i \in C}$.
EXAMPLE 44. Particle on incline parametrized by x or y coordinate. Same for particle on
circle, note different coordinates at different points.
• No constraints in such a parametrization.
It is often more convenient, however, to parametrize by something other than the Euclidean coor-
dinate axes. The key observation is that the xi are not functions “valued in X” but rather functions
on X!
DEFINITION 45. A coordinate (in the physics literature “generalized coordinate”) is a pair
$(U, q_\alpha)$ where U ⊂ X is an open set and $q_\alpha : U \to R$ is any function. A system of coordinates or
coordinate patch is a tuple $q = \left( U, \{q_\alpha\}_{\alpha=1}^{\dim X} \right)$ of coordinates defined on the same neighbourhood
such that q is continuously differentiable and $dq_x$ is invertible on $T_x X$ for x ∈ U.
What do we mean by dq? Note that by the implicit function theorem, if x, x′ ∈ X are close
enough we have x′ = x + v + e where v ∈ Tx X and |e| = o (|x − x′ |) as x → x′ . We can thus differen-
tiate functions on X (without extending them to the ambient EdN ) by asking whether f (x′ ) − f (x)
is approximately linear in v.
DEFINITION 46. Let $f : U \to E^r$ where U ⊂ X is open. We shall say f is differentiable at x if
there is a linear map $df_x : T_x X \to R^r$ such that $f(x') = f(x) + df_x(v) + o(|x - x'|)$.
LEMMA 47 (Differentiation on X).
(1) Let $V \subset E^{dN}$ be an open set, and let $f : V \to E^r$ be a function differentiable at some
x ∈ U = V ∩ X. Then $f{\restriction}_U : U \to E^r$ is differentiable
in the sense above and $d(f{\restriction}_U)_x = (df_x){\restriction}_{T_x X}$.
(2) $df_x$ is linear in f and satisfies the chain rule in both directions (i.e. for composition with
$g : E^s \to X$ and with $h : E^r \to E^t$). It therefore also satisfies the Leibniz rule.
PROOF. Exercise. □
EXAMPLE 48. The angle θ for a circle, e.g. the pendulum. Let $S^1 \subset E^2$ be the unit circle
$\{x^2 + y^2 = 1\}$. On the open right semicircle we define θ = arctan(y/x). We then conversely have
the parametrization (inverse map) θ ↦ (cos θ, sin θ). It’s also possible to define θ on any arc not
covering the whole circle.
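The coordinate θ and its inverse parametrization can be made concrete as follows. This is a minimal Python sketch, not part of the notes: `theta_right` is the coordinate on the open right semicircle from Example 48, and `theta_on_arc` is one hypothetical way to define an angle on the circle minus the single point at angle `cut`, with values in (cut − 2π, cut).

```python
import math

def theta_right(x, y):
    """Angle coordinate on the open right semicircle {x > 0} of the unit circle."""
    if x <= 0:
        raise ValueError("theta_right is only defined where x > 0")
    return math.atan(y / x)

def parametrize(theta):
    """Inverse map theta -> (cos theta, sin theta)."""
    return (math.cos(theta), math.sin(theta))

def theta_on_arc(x, y, cut):
    """Angle on the circle with the point at angle `cut` removed,
    taking values in the open interval (cut - 2*pi, cut)."""
    shift = (math.atan2(y, x) - cut) % (2 * math.pi)
    if shift == 0.0:
        raise ValueError("the point at angle `cut` is not in this patch")
    return cut - 2 * math.pi + shift

roundtrip = theta_right(*parametrize(0.3))
left_point = parametrize(2.5)                 # not in the right semicircle
theta_left = theta_on_arc(*left_point, cut=0.0)
```

No single choice of `cut` covers the whole circle, which is the computational face of Remark 35: the angle is a coordinate on each patch but not a global function.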
REMARK 49. Since θ is only locally defined, we sometimes prefer to have the coordinates
NOT be valued in R – e.g. have θ valued in $S^1$. This requires some care, but has some advantages.
EXERCISE 50. Given a mechanical system, find a coordinate system, and then a parametrization
of configuration space by the coordinates.
NOTATION 51. A (locally defined) function $f : R^{\dim X} \to R$ induces a function on configuration
space by composing with the coordinates: when we write f(q) we really mean $f \circ (q_\alpha)_\alpha$ (example:
the potential energy U(θ) = −mgL cos θ from Example 34). Conversely if f is a function on X
we can identify it with a function defined on the coordinates by composing with the inverse of q.
Similarly a curve γ : I → X induces a coordinate curve via q ◦ γ, and a coordinate curve induces a
curve in configuration space by composition with $q^{-1}$.
EXAMPLE 52. Potential energy due to external gravity, or due to interaction between pairs (or
groups) of particles.
REMARK 53. It is in fact useful to have the coordinates depend on time – to have $q_\alpha : U \times I \to R$
where I is some time interval. This is significant and will play a role in the sequel.
2.2.2. Physics: computation in coordinates. We first clarify something.
Warning: Let f : X → R be some function (say potential energy). Then f ◦ q−1 is a function
on coordinate space (i.e. if you plug in values for the coordinates you get the value of f at the
corresponding point of X). Following the physics convention we will use the same letter for both
functions, and leave it to the reader to figure out which we mean. In particular when we write
xi = xi (q;t) we might mean the standard coordinate functions on X coming by restriction from
EdN , or the Euclidean coordinates as functions of the generalized coordinates.
• Suppose the system moves according to a curve γ ⊂ X. Then the coordinates change
according to qα (γ(t);t). We usually write qα (t) for these functions; the vector (qα (t))α ∈
Rdim X is the coordinate curve. We often compute these functions directly, and then see
the implications in physics space by applying q−1 (·;t) to get points in X.
• In particular we will usually write the equations of motion as ODEs for qα (t) and solve
those.
• We often try to choose the coordinates qα to make the expression for relevant functions
or for the equations of motion simpler.
Let γ : I → X be a differentiable curve through configuration space. As we saw in Lemma 41, at
every time t we have γ̇(t) ∈ Tγ(t) X.
DEFINITION 54. We call γ̇ the velocity of the path. This is a vector of the N velocities of the
individual particles.
DEFINITION 55. Let ⟨·, ·⟩ denote the inner product on $R^d = T_x E^d$ with associated norm |·|;
think of γ̇(t) as a collection of N vectors $v_j \in R^d$, and suppose the jth particle has mass $m_j$. Then
the kinetic energy of the particles is
\[ K = \frac12 \sum_j m_j \left| v_j \right|^2 . \]
Observe that this is exactly the restriction of a positive-definite quadratic form from $T_{\gamma(t)} E^{dN}$ to $T_{\gamma(t)} X$,
hence again a positive-definite quadratic form.
Via the parametrization $x_i = x_i(q;t)$ we obtain a linear relation
\[ \dot x_i = \sum_\alpha \frac{\partial x_i}{\partial q_\alpha} \dot q_\alpha \]
which allows us to change variables in K, writing K as a quadratic form in the $\dot q_\alpha$ instead. Again
it is positive definite.
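EXAMPLE 56 (Rotating frame). Suppose we have a particle moving in the plane, where we
name points by their coordinates $\begin{pmatrix} x \\ y \end{pmatrix}$. Let
\[ R(t) = \begin{pmatrix} \cos(\omega t) & -\sin(\omega t) \\ \sin(\omega t) & \cos(\omega t) \end{pmatrix} \]
(warning: this is the matrix of rotation by −ωt), and consider the time-dependent coordinate system
\[ \begin{pmatrix} X \\ Y \end{pmatrix} = R(t) \begin{pmatrix} x \\ y \end{pmatrix} \]
(so X = X(x, y;t) and Y = Y(x, y;t) as indicated). Conversely we have
\[ \begin{pmatrix} x \\ y \end{pmatrix} = R(-t) \begin{pmatrix} X \\ Y \end{pmatrix} \]
so
\[ \begin{pmatrix} \ddot x \\ \ddot y \end{pmatrix} = R(-t) \begin{pmatrix} \ddot X \\ \ddot Y \end{pmatrix} - 2 \dot R(-t) \begin{pmatrix} \dot X \\ \dot Y \end{pmatrix} + \ddot R(-t) \begin{pmatrix} X \\ Y \end{pmatrix} \]
and
\[ m \begin{pmatrix} \ddot X \\ \ddot Y \end{pmatrix} = R(t)\, m \begin{pmatrix} \ddot x \\ \ddot y \end{pmatrix} + 2 R(t) \dot R(-t)\, m \begin{pmatrix} \dot X \\ \dot Y \end{pmatrix} - R(t) \ddot R(-t)\, m \begin{pmatrix} X \\ Y \end{pmatrix} . \]
Now Newton’s 2nd law reads
\[ m \begin{pmatrix} \ddot x \\ \ddot y \end{pmatrix} = F(x, y;t) \]
where we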
SCHOLIUM 57. This definition clarifies that velocity is a pointwise notion: we need more
structure to compare velocities at different points. For example, a particle moving around the
circle has velocities tangent to the circle. To study v(t + h) − v(t) we need a connection.
REMARK 58. In fact we could have started with any positive-definite quadratic form. If the
ambient space is a Riemannian manifold (M, g) then on $\left( M^N, \bigoplus_{j=1}^N g \right)$ the kinetic energy is the
quadratic form on the tangent space which, relative to $\bigoplus_{j=1}^N g$, is block-diagonal with eigenvalues
$m_j$.
REMARK 59. As we shall see later, the most important fact is the convexity of K as a function
on $T_x X$.
CHAPTER 3
Lagrangian mechanics
3.1. Introduction
3.1.1. Historical overview.
• Euler: the equations of motion of Newtonian mechanics can be written in a form that
works for any coordinate system.
• Lagrange: even if there are constraints.
• Hamilton: these equations follow from a variational principle.
3.1.2. Plan.
(1) (Mathematics) Calculus of variations I
(2) (Physics) Hamilton’s principle and the Euler–Lagrange equations; examples
(3) (Physics) conservation laws
(4) (Mathematics) Lagrange multipliers for variational problems
(5) (Physics) Constraint forces
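REMARK 60. We will not derive the Euler–Lagrange equations (i.e. show that they are
equivalent to Newtonian mechanics), which is essentially a tedious calculation.
PROPOSITION 71. Let $L \in C^2(T\Omega \times I \to R)$ and suppose $\gamma \in C^2(I; \Omega)$ is critical for
$S = \int_{t_0}^{t_1} L\,dt$ given its endpoints. Then
\[ \frac{d}{dt} \left[ \frac{\partial L}{\partial v} (\gamma(t), \dot\gamma(t);t) \right] = \frac{\partial L}{\partial q} (\gamma(t), \dot\gamma(t);t) . \]
Furthermore, suppose (“Ellipticity”) that $\frac{\partial^2 L}{\partial v^2}$ is positive definite. We can then write the ODE
in the form
\[ \ddot\gamma = \left( \frac{\partial^2 L}{\partial v^2} \right)^{-1} \left[ \frac{\partial L}{\partial q} - \frac{\partial^2 L}{\partial v\, \partial q}\, \dot\gamma(t) - \frac{\partial^2 L}{\partial v\, \partial t} \right] , \]
with all derivatives of L evaluated at (γ(t), γ̇(t);t),
which will have a unique solution for each initial condition (γ(t0), γ̇(t0)).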
• The ellipticity condition is exactly the positive definiteness of the mass matrix.
We would like to do two related things:
(1) Show that there actually exists a minimizer. We will concentrate on a particular class of
Lagrangians
(2) Extend the class of acceptable paths γ.
DEFINITION 72. Say the Lagrangian is standard if it has the form
\[ L(q, \dot q;t) = \frac12 \langle M(t) \dot q, \dot q \rangle - U(q,t) \]
where M = M(t) is symmetric and satisfies M ≥ µ for some constant µ > 0, and U is continuous.
3.2.5. Existence of minimizers.
DEFINITION 73 (Sobolev space). For a sufficiently differentiable function $\gamma : I \to R^n$ define
\[ \|\gamma\|_{H^k}^2 = \sum_{i=0}^{k} \left\| \gamma^{(i)} \right\|_{L^2(I)}^2 \]
and let $H^k(I; R^n)$ be the completion of the space of smooth functions with respect to this norm.
FACT 74. This is the space of γ such that the kth distributional derivative is represented by an
$L^2$-function.
THEOREM 75 (Sobolev embedding). The inclusion map $\left( C^k(I), \|\cdot\|_{H^k} \right) \to \left( C^{k-1}(I), \|\cdot\|_{C^{k-1}} \right)$
is compact.
COROLLARY 76. Let L be a standard Lagrangian. Then γ ↦ S(γ) is continuous with respect to $\|\cdot\|_{H^1}$
and thus extends to a continuous function on $H^1(I)$.
LEMMA 77 (Poincaré inequality). Let $u : I \to R^n$ be a differentiable function with $u(t_0) = u(t_1) = 0$. Then
\[ \int_{t_0}^{t_1} |u|^2\,dt \le \left( \frac{t_1 - t_0}{2} \right)^2 \int_{t_0}^{t_1} |\dot u|^2\,dt . \]
REMARK 78. This is not the optimal constant – which is the reciprocal of the smallest eigenvalue of the
Dirichlet Laplacian.
LEMMA 79 (Coercivity). Suppose $U(q) \le A + C|q|^2$. For ε > 0 we can find δ > 0 such that
\[ \int_{t_0}^{t_0+\delta} U(q,t)\,dt \le B + \varepsilon \int_{t_0}^{t_0+\delta} |\dot q|^2\,dt . \]
PROOF. Let q̃ be a linear function of time interpolating $a = q(t_0)$, $b = q(t_0 + \delta)$ and let
$u = q - \tilde q$. Then $\dot u = \dot q - \frac{b-a}{\delta}$ so $|\dot u|^2 \le 2|\dot q|^2 + 2\frac{|b-a|^2}{\delta^2}$
COROLLARY 80 (Lower bound). Suppose U grows at most quadratically and that the time
interval is short enough (depending on the constants including the initial conditions) to get ε < µ.
Then
(1) S(γ) are bounded below. In particular $\inf_\gamma S(\gamma)$ exists.
(2) Sublevel sets are bounded in $H^1$.
\[ \frac{d}{dt} p_\alpha = F_\alpha . \]
EXAMPLE 82. For a standard Lagrangian $L(x, v) = \frac12 \langle M(x,t)v, v \rangle - U(x,t)$ we have $p = M(x,t)\dot x$
and F = −dU.
3.3.1. Cyclic coordinates and conserved quantities. Fix a coordinate system $q = (q_\alpha)_{\alpha=1}^{n} : X \to \Omega \subset R^n$.
The Euler–Lagrange equations take the form
\[ \frac{d}{dt} \frac{\partial L}{\partial \dot q_\alpha} = \frac{\partial L}{\partial q_\alpha} , \qquad 1 \le \alpha \le n . \]
DEFINITION 83. Call a coordinate $q_\beta$ cyclic if $\frac{\partial L}{\partial q_\beta} = 0$, in other words if L does not depend on
$q_\beta$ explicitly (of course in the given coordinate system, that is when the other coordinates are the
other $q_\alpha$).
OBSERVATION 84. If $q_\alpha$ is cyclic then $\frac{d}{dt} \frac{\partial L}{\partial \dot q_\alpha} = 0$. In other words, the associated
generalized momentum $p_\alpha = \frac{\partial L}{\partial \dot q_\alpha} : TX \times R \to R$ is constant along the physical path.
EXAMPLE 86. Consider a particle moving in the plane with downward-pointing gravity, that
is the Lagrangian
\[ L = \frac12 m \left( \dot x^2 + \dot y^2 \right) + mgy . \]
Clearly the x-coordinate is cyclic. Now retain the x-coordinate but switch to the coordinate system
(x, z) where z = x + y. Then y = z − x so we also have
\[ L = \frac12 m \left( 2\dot x^2 + \dot z^2 - 2\dot z \dot x \right) + mgz - mgx . \]
Now the x-coordinate is not cyclic – showing that the notion of cyclicity depends on the coordinate
system and not just on a single coordinate.
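L itself is a function which we can differentiate along the physical path. We have
\begin{align*}
\frac{d}{dt} \left( L(\gamma, \dot\gamma;t) \right) &= \frac{\partial L}{\partial x} \dot\gamma + \frac{\partial L}{\partial v} \ddot\gamma + \frac{\partial L}{\partial t} && \text{chain rule} \\
&= \left( \frac{d}{dt} \frac{\partial L}{\partial v} \right) \dot\gamma + \frac{\partial L}{\partial v} \frac{d}{dt} (\dot\gamma) + \frac{\partial L}{\partial t} && \text{equations of motion} \\
&= \frac{d}{dt} \left( \frac{\partial L}{\partial v} \dot\gamma \right) + \frac{\partial L}{\partial t} && \text{Leibniz rule} .
\end{align*}
Rearranging we obtain the Beltrami identity
\[ \frac{d}{dt} \left( \frac{\partial L}{\partial v} v - L \right) = - \frac{\partial L}{\partial t} . \]
Here we interpret $\frac{\partial L}{\partial v} v - L$ as a function on T X × R, which is to be evaluated at (γ(t), γ̇(t);t) and
then differentiated with respect to t.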
DEFINITION 87. The energy of the system is $E = \frac{\partial L}{\partial v} v - L$.
COROLLARY 88 (Conservation of energy). Suppose $\frac{\partial L}{\partial t} = 0$, that is that L does not depend on
time explicitly. Then E is a conserved quantity.
• Observe that E = C is a first-order ODE. With one degree of freedom that is the first
integral of the equations of motion.
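Conservation of energy can be observed numerically. The following minimal Python sketch is not part of the notes. It assumes the planar pendulum with a fixed pivot, Lagrangian $L = \frac12 m \ell^2 \dot\theta^2 + mg\ell \cos\theta$ (the potential of Example 34 with the slider held still), integrates the Euler–Lagrange equation with a classical Runge–Kutta step, and tracks $E = \frac{\partial L}{\partial v} v - L$ along the path. The parameter values and function names are hypothetical.

```python
import math

MASS, ROD, GRAVITY = 1.0, 1.0, 9.81   # hypothetical parameter values

def lagrangian(theta, omega):
    return 0.5 * MASS * ROD ** 2 * omega ** 2 + MASS * GRAVITY * ROD * math.cos(theta)

def energy(theta, omega):
    """E = (dL/dv) v - L, with dL/dv = m l^2 omega."""
    momentum = MASS * ROD ** 2 * omega
    return momentum * omega - lagrangian(theta, omega)

def accel(theta):
    """Euler-Lagrange equation: m l^2 theta'' = -m g l sin(theta)."""
    return -(GRAVITY / ROD) * math.sin(theta)

def rk4_step(theta, omega, dt):
    k1t, k1o = omega, accel(theta)
    k2t, k2o = omega + 0.5 * dt * k1o, accel(theta + 0.5 * dt * k1t)
    k3t, k3o = omega + 0.5 * dt * k2o, accel(theta + 0.5 * dt * k2t)
    k4t, k4o = omega + dt * k3o, accel(theta + dt * k3t)
    theta += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
    omega += dt * (k1o + 2 * k2o + 2 * k3o + k4o) / 6
    return theta, omega

def energy_drift(theta0=1.0, omega0=0.0, dt=1e-3, steps=5000):
    """Largest deviation of E from its initial value along the computed path."""
    theta, omega = theta0, omega0
    e0 = energy(theta, omega)
    worst = 0.0
    for _ in range(steps):
        theta, omega = rk4_step(theta, omega, dt)
        worst = max(worst, abs(energy(theta, omega) - e0))
    return worst

drift = energy_drift()
```

Any residual drift comes from the integrator, not from the mechanics; the exact flow keeps E constant because this L has no explicit time dependence.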
EXERCISE 89. Use this to solve the catenary and brachistochrone problems.
REMARK 90. We will discuss conserved quantities further in Section 3.4.
3.3.2. Constraints. Paying debt.
3.3.3. Examples.
3.4. More on conserved quantities: symmetries and Noether’s Theorem
3.4.1. Symmetries of configuration space.
DEFINITION 91. A one-parameter semigroup is a smooth function g : I × X → X (which we
write $g_r(x)$ instead of g(r, x)) so that $g_0(x) = x$ and so that $g_{r+s} = g_r \circ g_s$.
OBSERVATION 92. $g_{-r} = g_r^{-1}$.
EXAMPLE 93. In $E^d$ fix a vector $v \in R^d$ and let $g_r(x) = x + rv$. In $R^2$ (i.e. fixing an origin) let
\[ g_\theta \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \]
be the rotation by θ.
We can differentiate g with respect to each variable separately. In particular $g_r$ induces a map
T X → T X (which we denote with the same symbol) via
\[ g_r(x, v) = \left( g_r(x), (d_x g_r)(v) \right) . \]
DEFINITION 94. We say that the one-parameter semigroup $\{g_r\}$ is a symmetry of L, or that L is
invariant by the semigroup, if $L \circ g_r = L$ for all r.
EXAMPLE 95. Translation by a cyclic coordinate.
3.4.2. Noether’s Theorem. For a fixed x, $r \mapsto g_r(x)$ is a differentiable curve. Write $g'(x) \in T_x X$
for its derivative at r = 0. This is a vector field on X.
LEMMA 96. The curves $r \mapsto g_r(x)$ are the integral curves of this vector field.
PROOF. We have $g_{r+\varepsilon}(x) = g_\varepsilon(g_r(x))$. It follows that $\frac{d}{dr} g_r(x) = g'(g_r(x))$ so we have the
(unique) solution to $\frac{dy}{dr} = g'(y)$. □
LEMMA 97. $\frac{d}{dr} \left( d_x g_r (\dot\gamma) \right) = \frac{d}{dr} \frac{d}{dt} g_r(\gamma(t)) = \frac{d}{dt} \frac{d}{dr} g_r(\gamma(t)) = \frac{d}{dt} g'(g_r(\gamma(t)))$.
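THEOREM 98 (Noether; weak version). Suppose that $g_r$ is a symmetry. Then the quantity
$\left\langle \frac{\partial L}{\partial v}, g'(x) \right\rangle$ is conserved.
PROOF. By assumption we have $S(g_r \circ \gamma) = S(\gamma)$ for all r. We now differentiate this identity
with respect to r:
\begin{align*}
0 &= \frac{d}{dr} S(g_r \circ \gamma) \\
&= \frac{d}{dr} \int_{t_0}^{t_1} L \left( g_r(\gamma(t), \dot\gamma(t));t \right) dt \\
&= \int_{t_0}^{t_1} \left[ \left\langle \frac{\partial L}{\partial x}, g'(g_r(\gamma(t))) \right\rangle + \left\langle \frac{\partial L}{\partial v}, \frac{d}{dr} (d_x g_r)(\dot\gamma(t)) \right\rangle \right] dt \\
&= \int_{t_0}^{t_1} \left[ \left\langle \frac{\partial L}{\partial x}, g'(g_r(\gamma(t))) \right\rangle + \left\langle \frac{\partial L}{\partial v}, \frac{d}{dt} g'(g_r(\gamma(t))) \right\rangle \right] dt ,
\end{align*}
where the derivatives of L are evaluated at $(g_r(\gamma(t), \dot\gamma(t));t)$.
Now setting r = 0 and integrating by parts we get
\[ 0 = \int_{t_0}^{t_1} \left\langle \frac{\partial L}{\partial x} - \frac{d}{dt} \frac{\partial L}{\partial v}, g'(\gamma(t)) \right\rangle dt + \left[ \left\langle \frac{\partial L}{\partial v}, g'(\gamma(t)) \right\rangle \right]_{t=t_0}^{t=t_1} . \]
Along the physical path the integral vanishes by the Euler–Lagrange equations, so the boundary term vanishes,
as claimed. □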
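The theorem can be checked numerically in the simplest rotationally symmetric case. The following minimal Python sketch is not part of the notes. It assumes a single particle in the plane with the isotropic potential $U = \frac12 k (x^2 + y^2)$ (a choice made for this sketch), uses the rotation $g_\theta$ of Example 93, whose generator is $g'(x, y) = (y, -x)$, and monitors $\left\langle \frac{\partial L}{\partial v}, g' \right\rangle = m (v_x y - v_y x)$ along a Runge–Kutta trajectory. All names and parameter values are hypothetical.

```python
MASS, STIFFNESS = 1.0, 2.0   # hypothetical parameter values

def deriv(state):
    """Equations of motion m x'' = -k x for U = (k/2)(x^2 + y^2)."""
    x, y, vx, vy = state
    return (vx, vy, -STIFFNESS * x / MASS, -STIFFNESS * y / MASS)

def rk4_step(state, dt):
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, 0.5 * dt))
    k3 = deriv(shifted(state, k2, 0.5 * dt))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def noether_charge(state):
    """<dL/dv, g'(x)> with dL/dv = m v and g'(x, y) = (y, -x)."""
    x, y, vx, vy = state
    gx, gy = y, -x
    return MASS * (vx * gx + vy * gy)

def charge_drift(state=(1.0, 0.0, 0.3, 1.2), dt=1e-3, steps=5000):
    """Largest deviation of the Noether charge from its initial value."""
    q0 = noether_charge(state)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, dt)
        worst = max(worst, abs(noether_charge(state) - q0))
    return worst

drift = charge_drift()
```

With this sign convention for $g_\theta$ the conserved quantity is minus the usual angular momentum $m(x v_y - y v_x)$.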
3.4.3. Total derivatives. Let f : X × R → R be any function, and define formally the “total
derivative” $\frac{df}{dt} : TX \times R \to R$ by
\[ \frac{df}{dt}(x, v,t) = \left\langle \frac{\partial f}{\partial x}, v \right\rangle + \frac{\partial f}{\partial t} . \]
LEMMA 99. Let γ be any path. Then $\frac{d}{dt} f(\gamma(t);t) = \frac{df}{dt}(\gamma(t), \dot\gamma(t);t)$. In particular
\[ \int_{t_0}^{t_1} \frac{df}{dt}(\gamma(t), \dot\gamma(t);t)\,dt = f(\gamma(t_1);t_1) - f(\gamma(t_0);t_0) . \]
COROLLARY 100. Let $\tilde L = L + \frac{df}{dt}$. Then for any path γ with endpoints a, b we have
$\tilde S(\gamma) = S(\gamma) + f(b;t_1) - f(a;t_0)$ and in particular S, S̃ have the same critical points and the same
Euler–Lagrange equations.
EXAMPLE 101. Let L = T − U be time independent, with conserved energy E = T + U. Then
$\tilde L = T - U + \frac12 t^2$ has the same conserved quantity despite not being time independent.
We now generalize the previous discussion.
DEFINITION 102. A one-parameter group is a smooth family of smooth maps $g_r : X \times E^1 \to X \times E^1$
so that $g_0(x,t) = (x,t)$ and so that $g_{r+s} = g_r \circ g_s$.
We say that the one-parameter group is a symmetry of the Lagrangian L if $L \circ g_r - L$ is a total
derivative for each r.
• Now write $\frac{d}{dr}\Big|_{r=0} g_r(x,t) = (g'(x,t), T'(x,t))$.
The following is a common generalization of the law of conservation of energy and the weak
version of the theorem.
THEOREM 103 (Noether). Suppose that $\{g_r\}_r$ is a symmetry. Then the quantity
\[ \left\langle \frac{\partial L}{\partial v}, g'(x,t) \right\rangle - T'(x,t) \left( \frac{\partial L}{\partial v} v - L \right) \]
is conserved.
PROOF. Exercise. □
giving a different confirmation that log has full rank (and in fact its derivative is the identity). □
COROLLARY 111. Each X ∈ so(d) defines a one-parameter subgroup $g_r = \exp(rX)$.
FACT 112. Write $R^d \simeq (R^2)^{d/2}$ or $(R^2)^{(d-1)/2} \oplus R$ (orthogonal sum) depending on whether d
is even or odd. For each 1 ≤ i ≤ d/2 let $X_i = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \in so(2)$ in the relevant coordinates. Then
$\{\exp(r_i X_i)\}_{i \le d/2}$ is a maximal family of one-parameter subgroups; $\left\{ \exp\left( \sum_{i \le d/2} r_i X_i \right) \right\}$ is a maximal
commutative subgroup (“maximal torus”).
3.5.2. Angular momentum. Suppose we have the standard Lagrangian
\[ \frac12 \sum_{j=1}^{N} m_j |v_j|^2 - U(x) . \]
Each g ∈ O(d) acts by matrix multiplication on the coordinates of each particle. We have
$\frac{d}{dt} g x_j = g v_j$, so the velocities transform by g as well, and $|g v_j|^2 = |v_j|^2$. Accordingly if U(gx) = U(x) (g acting diagonally) we have a
rotationally invariant Lagrangian. Now for each X ∈ so(d) we obtain a one-parameter subgroup
$g_r = \exp(rX)$ with $\frac{d}{dr}\Big|_{r=0} \exp(rX) x_j = X x_j$. It follows that the quantity
\[ \sum_{j=1}^{N} m_j \sum_{i=1}^{d} (v_j)_i (X x_j)_i \]
is conserved.
DEFINITION 113. Fix $x_0 \in E^d$. The angular momentum of a particle of mass m at position x
moving at velocity $v \in R^d$ is the linear functional L ∈ so(d)′ given by
\[ L(X) = (x - x_0)^T X v . \]
EXERCISE 114. Using the basis
\[ X_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix} , \quad X_2 = \begin{pmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix} , \quad X_3 = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \]
see that in 3d we recover the usual angular momentum.
COROLLARY 115. If the potential is invariant under rotation, total angular momentum is conserved.
3.5.3. More linear algebra. Better to think of X as a map $R^n \to R^{n*}$ with $X^*$ the dual map
$R^n = R^{n**} \to R^{n*}$ required to equal −X. Note that this still forces $\langle Xu, u \rangle = \langle X^* u, u \rangle = -\langle Xu, u \rangle$.
LEMMA 116. Let $u, v \in R^d$ be vectors. Then the functional X ↦ uXv depends only on the plane
spanned by u, v (modulo rescaling) and (if nonzero) conversely.
PROOF. By antisymmetry (au + bv)X(cu + dv) = (ad − bc)uXv. Conversely let w be independent
of u, v and let $u^*, v^*, w^*$ be corresponding elements of a dual basis, $X = uw^* - wu^*$. Then
uXv = 0 since both $w^*, u^*$ vanish at v. On the other hand uXw = u ≠ 0. □
PROPOSITION 117. Given L we can find 2k ≤ d orthonormal vectors $\{u_i, v_i\}_{i=1}^{k}$ such that L is
a linear combination of the functionals $u_i^T X v_i$.
PROOF. Think of L as an antisymmetric matrix; find an orthonormal eigenbasis invariant under
complex conjugation, let u, v be the real and imaginary parts of an eigenvector. Alternatively apply
Darboux’s Theorem. □
• Multiparticle motion is more complicated.
3.5.4. Central potential. Suppose a single mass in $R^d$ is moving in a central potential U(x) =
U(r) where r = |x|. At some particular time t either v, x are proportional to each other, and then we
have a 1d problem, or they are not. By Lemma 116 and conservation of angular momentum the
motion is restricted to the plane spanned by x, v. We therefore have the Lagrangian
\[ \frac12 m \left( \dot r^2 + r^2 \dot\theta^2 \right) - U(r) . \]
We have two conserved quantities:
\begin{align*}
E &= \frac12 m \left( \dot r^2 + r^2 \dot\theta^2 \right) + U(r) \\
L &= m r^2 \dot\theta .
\end{align*}
COROLLARY 118 (Kepler equal area law). The angle is monotone; the area swept by the orbit
between times $t_0, t_1$ is $\frac12 \int_{t_0}^{t_1} r^2\,d\theta = \frac{L}{2m} (t_1 - t_0)$.
Combining the two equations we get
\[ E = \frac12 m \dot r^2 + \tilde U(r) \]
where $\tilde U(r) = U(r) + \frac{L^2}{2mr^2}$ is the “effective potential”. This is a separable ODE, which (in theory)
can be integrated to give r = r(t). We can then determine the angle from $\dot\theta = \frac{L}{mr^2}$ and thus obtain
the orbit.
EXAMPLE 119. If U blows up at zero slower than $\frac{1}{r^2}$ then we can’t have r → 0.
• Clearly the orbit is either unbounded (coming from infinity, to a least radius and returning
to infinity) or bounded (oscillating between $r_{\min}, r_{\max}$). These extrema are determined by
ṙ = 0, E = Ũ(r).
• The orbit is periodic only if the angle swept while going between the extreme radii and back is a rational multiple of 2π. Note that
\[ 2 \int_{r = r_{\min}}^{r = r_{\max}} \dot\theta\,dt = 2 \int_{r_{\min}}^{r_{\max}} \frac{\dot\theta}{\dot r}\,dr = \frac{2L}{\sqrt{2m}} \int_{r_{\min}}^{r_{\max}} \frac{dr}{r^2 \sqrt{E - \tilde U(r)}} . \]
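The turning points and the angle integral can be evaluated numerically. The following minimal Python sketch is not part of the notes. It assumes the Kepler potential U(r) = −k/r (a choice made for this sketch), finds $r_{\min}, r_{\max}$ from E = Ũ(r) by solving a quadratic in u = 1/r, and evaluates the angle integral after the substitutions u = 1/r and u = c + a cos φ, which remove the endpoint singularities. The function names and parameter values are hypothetical.

```python
import math

def kepler_turning_points(E, L_ang, m=1.0, k=1.0):
    """Solve E = -k/r + L^2/(2 m r^2) for r. With u = 1/r this is the
    quadratic L^2 u^2 - 2 m k u - 2 m E = 0 (bounded orbits need E < 0)."""
    disc = (m * k) ** 2 + 2 * m * E * L_ang ** 2
    if disc < 0 or E >= 0:
        raise ValueError("no bounded orbit for these values of E and L")
    root = math.sqrt(disc)
    u_max = (m * k + root) / L_ang ** 2
    u_min = (m * k - root) / L_ang ** 2
    return 1.0 / u_max, 1.0 / u_min          # (r_min, r_max)

def effective_potential(U, r, L_ang, m=1.0):
    return U(r) + L_ang ** 2 / (2 * m * r ** 2)

def apsidal_angle(U, E, L_ang, r_min, r_max, m=1.0, n=2000):
    """Angle swept from r_min to r_max and back:
    2 * int L du / sqrt(2 m (E - U(1/u)) - L^2 u^2), u from 1/r_max to 1/r_min,
    computed by the midpoint rule in phi where u = c + a cos(phi)."""
    u_lo, u_hi = 1.0 / r_max, 1.0 / r_min
    c, a = 0.5 * (u_lo + u_hi), 0.5 * (u_hi - u_lo)
    h = math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * h
        u = c + a * math.cos(phi)
        radicand = 2 * m * (E - U(1.0 / u)) - (L_ang * u) ** 2
        total += L_ang * a * math.sin(phi) / math.sqrt(radicand) * h
    return 2 * total

kepler = lambda r: -1.0 / r                   # U(r) = -k/r with k = 1
r_min, r_max = kepler_turning_points(E=-0.25, L_ang=1.0)
angle = apsidal_angle(kepler, -0.25, 1.0, r_min, r_max)
```

For the Kepler potential the computed angle is 2π, consistent with the bounded orbits being closed; for other potentials the turning points would have to be found numerically before calling `apsidal_angle`.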
3.6. Small oscillations
Let $L = \frac12 \langle M(x)v, v \rangle - U(x)$ with equation of motion
\[ \frac{d}{dt} \left( M(x)v \right) = -dU . \]
In particular if $dU(x_0) = 0$ then $\gamma(t) \equiv x_0$ solves the equations of motion. Letting q denote the
displacement $x - x_0$ we have to first order in q,
\[ M(x_0) \ddot q \approx -H(x_0) q \]
where $H(x_0)$ is the Hessian of U at $x_0$, since $\left( \frac{d}{dt} M(q) \right) \dot q = dM \cdot \dot q^2$ is of second order. We are thus
interested in solving the equation
\[ M \ddot q = -Hq \]
where $q \in R^n$ and M, H are symmetric positive-definite matrices (i.e. we are working near a
potential minimum). Letting $y = \sqrt{M} q$ this takes the form
\[ \ddot y = -\tilde H y \]
where $\tilde H = M^{-1/2} H M^{-1/2}$. Suppose $M^{-1}H$ is diagonable with eigenvectors (“normal modes”) $\{q_j\}_{j=1}^{n}$,
eigenvalues $\omega_j^2$ (that is, $H q_j = \omega_j^2 M q_j$). Then H̃ has the same eigenvalues, but eigenvectors $\left\{ M^{1/2} q_j \right\}_{j=1}^{n}$. It follows that
\[ y(t) = \Re \sum_{j=1}^{n} \left( A_j e^{i\omega_j t} + B_j e^{-i\omega_j t} \right) M^{1/2} q_j \]
and hence
\[ q(t) = \Re \sum_{j=1}^{n} \left( A_j e^{i\omega_j t} + B_j e^{-i\omega_j t} \right) q_j . \]
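EXAMPLE 120. N equal masses connected by identical springs, say with $x_0$ pinned, $q_j$ the
displacement from equilibrium of the jth mass. Then $U = \frac12 k \sum_{j=1}^{N} \left( q_j - q_{j-1} \right)^2$ so
\[ H = \begin{pmatrix} 1 & -1 & & & \\ -1 & 2 & -1 & & \\ & -1 & \ddots & \ddots & \\ & & \ddots & 2 & -1 \\ & & & -1 & 1 \end{pmatrix} . \]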
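For a chain of two masses the normal modes can be computed by hand, which the following minimal Python sketch (not part of the notes) carries out. It assumes N = 2 with $q_0 = 0$ held fixed, so that differentiating $U = \frac12 k \left( q_1^2 + (q_2 - q_1)^2 \right)$ twice gives the Hessian $k \begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix}$; that reading of the pinned end, the diagonal mass matrix, and all names are assumptions of the sketch. The frequencies come from det(H − λM) = 0 with λ = ω².

```python
import math

def normal_modes_two_masses(m1, m2, k):
    """Solve H q = lam * M q for H = k*[[2, -1], [-1, 1]], M = diag(m1, m2).
    det(H - lam*M) = m1*m2*lam^2 - k*(m1 + 2*m2)*lam + k^2 = 0."""
    a = m1 * m2
    b = -k * (m1 + 2 * m2)
    c = k * k
    disc = math.sqrt(b * b - 4 * a * c)      # b^2 - 4ac = k^2 (m1^2 + 4 m2^2) > 0
    modes = []
    for lam in ((-b - disc) / (2 * a), (-b + disc) / (2 * a)):
        q = (k, 2 * k - lam * m1)            # from the first row of (H - lam*M) q = 0
        modes.append((math.sqrt(lam), q))
    return modes                              # list of (omega, mode vector)

def residual(m1, m2, k, omega, q):
    """Largest entry of H q - omega^2 M q; zero for an exact normal mode."""
    lam = omega ** 2
    r1 = 2 * k * q[0] - k * q[1] - lam * m1 * q[0]
    r2 = -k * q[0] + k * q[1] - lam * m2 * q[1]
    return max(abs(r1), abs(r2))

modes = normal_modes_two_masses(1.0, 1.0, 1.0)
```

Each pair (ω, q) returned gives a solution q(t) = cos(ωt) q of M q̈ = −Hq; for larger N one would use a numerical eigenvalue routine instead of the quadratic formula.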
Bibliography
[1] Tom Archibald. Differential equations: a historical overview to circa 1900. In A history of analysis, volume 24 of
Hist. Math., pages 325–353. Amer. Math. Soc., Providence, RI, 2003.
[2] V. I. Arnol’d. Mathematical methods of classical mechanics, volume 60 of Graduate Texts in Mathematics.
Springer-Verlag, New York, [1989?]. Translated from the 1974 Russian original by K. Vogtmann and A. Weinstein,
Corrected reprint of the second (1989) edition.
[3] Richard H. Cushman and Larry M. Bates. Global aspects of classical integrable systems. Birkhäuser/Springer,
Basel, second edition, 2015.
[4] Herbert Goldstein. Classical mechanics. Addison-Wesley Series in Physics. Addison-Wesley Publishing Co.,
Reading, MA, second edition, 1980.
[5] Hans Niels Jahnke, editor. A history of analysis, volume 24 of History of Mathematics. American Mathematical
Society, Providence, RI; London Mathematical Society, London, 2003. Translated from the German.
[6] L. D. Landau and E. M. Lifshitz. Mechanics, volume 1 of Course of Theoretical Physics. Pergamon Press,
Oxford; Addison-Wesley Publishing Co., Inc., Reading, MA, 1960. Translated from the Russian by J. B. Bell.
[7] Michael Spivak. Physics for mathematicians—mechanics I. Publish or Perish, Inc., Houston, TX, 2010.