
School of Mathematical and Computing Sciences

Te Kura Pāngarau, Rorohiko

MATH 464 Differential Geometry Autumn 2004

Math 464:
Notes on differential geometry

Matt Visser
School of Mathematical and Computing Sciences,
Victoria University of Wellington,
New Zealand.
E-mail: [email protected]
URL: http://www.mcs.vuw.ac.nz/~visser

Draft form: 26 February 2004; LaTeX-ed February 26, 2004

Warning:
These notes are provided as a supplement to the lectures.
They are not a substitute for attending the lectures.
There are still a few rough edges:
If you find errors, typos, and/or obscurities, let me know.
Contents

1 Introduction 6
1.1 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Textbook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3 Supplementary texts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.4 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

2 Fundamentals 9
2.1 Elementary topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2 The “usual topology” on IR . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3 Hausdorff topologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4 “Point doubling” topologies . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.5 “Train track” topologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.6 Topological closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.7 Compactness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.8 Elementary metric spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.9 Locally Euclidean spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.10 Charts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.11 Atlases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.12 Manifolds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.13 Differentiable Manifolds . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.14 Connected sum: M1 #M2 . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.15 “Modding out” . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.16 Tangent vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.17 Covectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.18 Dual spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.19 Tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.20 Tensor components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.21 Tensor fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.22 Fibre bundles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.23 Final comments on fundamentals . . . . . . . . . . . . . . . . . . . . . . . 39


3 Affine Connexions 40
3.1 Partial derivatives are not tensors . . . . . . . . . . . . . . . . . . . . . . . 40
3.2 Parallel transport . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.2.1 Basic idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.2.2 Connexion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.2.3 Covariantly constant vector fields . . . . . . . . . . . . . . . . . . . 45
3.2.4 Covariantly auto-parallel vector fields . . . . . . . . . . . . . . . . . 45
3.2.5 Path ordering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.3 Covariant derivative: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.3.1 Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.3.2 Covectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.3.3 Tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.4 Geodesics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.5 Torsion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.5.1 Geometrical interpretation 1 . . . . . . . . . . . . . . . . . . . . . . 54
3.5.2 Geometrical interpretation 2 . . . . . . . . . . . . . . . . . . . . . . 55
3.5.3 Commutator acting on a scalar . . . . . . . . . . . . . . . . . . . . 56
3.6 Riemann curvature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.7 General commutator identities [Ricci identities] . . . . . . . . . . . . . . 59
3.8 Basic curvature identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.8.1 Anti-symmetries . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.8.2 Antisymmetric part of the Ricci tensor . . . . . . . . . . . . . . . . 61
3.8.3 Nonmetricity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.9 Weitzenbock identities [generalized Bianchi identities] . . . . . . . . . . . . 64
3.10 Integrability [geometric route] . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.10.1 Basic definitions and results . . . . . . . . . . . . . . . . . . . . . . 66
3.10.2 Zero curvature implies integrability . . . . . . . . . . . . . . . . . . 68
3.10.3 Affine flat connexions . . . . . . . . . . . . . . . . . . . . . . . . . . 71
3.11 Integrability [PDE route] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.11.1 Frobenius–Mayer systems . . . . . . . . . . . . . . . . . . . . . . . 74
3.11.2 Frobenius complete integrability theorem: . . . . . . . . . . . . . . 75
3.11.3 From auto-parallel to Frobenius–Mayer . . . . . . . . . . . . . . . . 76
3.12 n-beins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3.13 Weitzenbock connexion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
3.14 Deforming general connexions . . . . . . . . . . . . . . . . . . . . . . . . . 82
3.15 Preserving auto-parallelism . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
3.16 Preserving geodesics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
3.17 Decomposing the general connexion . . . . . . . . . . . . . . . . . . . . . . 87
3.18 Locally geodesic coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . 88
3.19 Riemann normal coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . 90

4 Symmetric Connexions 92
4.1 Semi-symmetric connexions . . . . . . . . . . . . . . . . . . . . . . . . . . 92
4.2 Symmetric connexions: Identities . . . . . . . . . . . . . . . . . . . . . . . 93
4.3 Locally geodesic coordinates for symmetric connexions . . . . . . . . . . 95
4.4 Weyl symmetric connexion . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
4.5 Weyl tensor for symmetric connexions . . . . . . . . . . . . . . . . . . . . . 96
4.6 Projective connexions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99

5 Metric Connexions 100


5.1 Metric connexions with torsion . . . . . . . . . . . . . . . . . . . . . . . . 100
5.2 Metric semi-symmetric connexions . . . . . . . . . . . . . . . . . . . . . . . 101
5.3 Metric connexions without torsion — the GR “standard connexion” . . . . 102

6 Metric tensor 104


6.1 Measuring distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.2 Riemannian geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.3 pseudo-Riemannian geometry . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.4 Finsler/pseudo-Finsler geometry . . . . . . . . . . . . . . . . . . . . . . . . 107
6.5 Metric geodesics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
6.6 Metric connexion revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
6.7 Bianchi identities revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.8 The Weyl tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
6.9 The differential geometry of Newton’s second law . . . . . . . . . . . . . . 119

7 Exterior derivatives 125


7.1 Partial derivatives are not tensors . . . . . . . . . . . . . . . . . . . . . . . 125
7.2 Exterior differential forms . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
7.3 “Index free” notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
7.4 Closed and exact forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
7.5 Integration of forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
7.6 Generalized Kronecker tensors . . . . . . . . . . . . . . . . . . . . . . . . . 137
7.7 The Levi–Civita tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
7.8 The Hodge star operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
7.9 The divergence operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
7.10 Gauss’ theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
7.11 Euclidean three-space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
7.12 Laplacian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
7.13 Riemann curvature forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

8 Lie derivatives 154


8.1 Partial derivatives are not tensors . . . . . . . . . . . . . . . . . . . . . . . 154
8.2 Lie derivatives on T^r_0 tensors . . . . . . . . . . . . . . . . . . . . . . 155
8.3 Lie derivatives on covariant vectors . . . . . . . . . . . . . . . . . . . . . . 157
8.4 Lie derivatives on general tensors . . . . . . . . . . . . . . . . . . . . . . . 157

8.5 Geometry of the Lie derivative — Lie dragging . . . . . . . . . . . . . . . . 158


8.6 Symmetries and Killing vectors . . . . . . . . . . . . . . . . . . . . . . . . 161

9 Extrinsic curvature 162


9.1 Embeddings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
9.2 First fundamental form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
9.3 Second fundamental form (extrinsic curvature) . . . . . . . . . . . . . . . . 162
9.4 Third fundamental form . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
9.5 Equations of Gauss and Codazzi . . . . . . . . . . . . . . . . . . . . . . . . 162
9.6 Equations of Weingarten . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162

10 Distribution valued curvature 163


10.1 Discontinuities in the connexion . . . . . . . . . . . . . . . . . . . . . . . . 163
10.2 Discontinuities in the metric? . . . . . . . . . . . . . . . . . . . . . . . . . 163
10.3 Colombeaux algebra? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163

11 Gauge-fields 164
11.1 Connexions on vector bundles . . . . . . . . . . . . . . . . . . . . . . . . . 164
11.2 Abelian gauge fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
11.3 non-Abelian gauge fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167

12 Coda 169

A Notation 170

B Symmetries and counting arguments 171


B.1 Totally antisymmetric tensors . . . . . . . . . . . . . . . . . . . . . . . . 171
B.2 Totally symmetric tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
B.2.1 Enumeration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
B.2.2 Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
B.2.3 Some simple results . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

C Some matrix identities 176


C.1 Determinants, traces, and matrix logarithms . . . . . . . . . . . . . . . . . 176
C.2 Derivatives of determinants . . . . . . . . . . . . . . . . . . . . . . . . . . 178

D Path ordered integrals 179

E Elements of the calculus of variations 182

F Hilbert’s 19th, 20th, and 23rd problems 186

G Riemann: On the hypotheses which underlie... 196

Bibliography 208
Chapter 1

Introduction

1.1 Outline

In this course I will present an overview of differential geometry, also known as the theory of manifolds (sometimes loosely known as non-Euclidean geometry or Riemannian geometry, though that is actually a more specialized topic).

Be aware that these are notes, not a textbook, and I make no pretense or claim of
completeness...

Things I hope to cover in the course:

— Topological Manifolds and differentiable structure.

— Tangent and cotangent spaces.

— Fibre bundles.

— Geodesics and connexions — parallel transport.

— Riemann curvature (intrinsic curvature).

— Exterior differential forms: generalized Stokes’ theorem.

— Lie derivatives; symmetries.

— Extrinsic curvature; equations of Gauss, Codazzi, and Weingarten.

— Gauge fields (vector bundle connexions).


Each topic will expand to roughly three to four hours of lectures.

Some of the unusual features of this course are:

• I will discuss general affine connexions, with both non-metricity and torsion. Doing so complicates parts of the discussion where the metric-tensor-based Christoffel connexion would lead to considerably simpler results. (For instance there are extra terms in the differential Bianchi identities, which in this context should more properly be called the Weitzenbock identities.) I go to this trouble because torsion and nonmetricity are playing an increasing role in modern research.

• I will make extensive use of “index-based” methods. Mathematicians typically denigrate such methods as “index gymnastics” and prefer pure abstract “coordinate-free” methods. However, when it comes to specific computations of the geometric properties of a specific manifold, “index-based” methods are often superior — one can waste an awful lot of time unwrapping “coordinate-free” notation to figure out what one is actually calculating.

• On the other hand, I will not eschew “coordinate-free” methods completely. One
place where such methods are clearly advantageous is in the theory of differential
forms and the exterior derivative.

• The presentation is very much in the spirit of applied mathematics (and here my training as a physicist shows through); rigour will be subservient to getting the job done. If rigour is needed, it will be used. If correct results can be obtained through less rigorous mental pictures, that's fine — we can always pick up an appropriately rigorous textbook to flesh out the arguments.

• Somewhat unusually, I will take the parallel transport operator on the manifold as
the primitive notion; and then use it to define the connexion and so the covariant
derivative.

• I also show how to reverse the procedure: starting from the connexion, the parallel transport operator can be constructed as a “path-ordered” exponential integral. Path-ordering is a technique that is extremely useful, but is currently almost completely confined to the theoretical physics community (and more specifically to the particle physics community).

• Whenever I need an example it will almost always come from within physics — for
example interpreting (re-interpreting) Newton’s second law in terms of the geodesics
of a conformally flat space.

• For cultural reasons I've added two historical appendices — one on Bernhard Riemann's inaugural lecture wherein he basically set out his programme for developing differential geometry, and another on some of the Hilbert problems (those to do with the calculus of variations, which is a very useful general-purpose tool of great importance in applications of differential geometry).

1.2 Textbook
• Introducing Einstein’s Relativity
Ray D’Inverno
Oxford University Press, reprinted 2000.
Cost US$50.00; about NZ$100.00

The Math Dept has some copies available for loan; see me.

This will be the only book you will need any access to.

For almost all purposes these notes will be sufficient.

1.3 Supplementary texts

[For additional background; *not* required textbooks.]

• General relativity for mathematicians


R.K. Sachs and H. Wu
Springer graduate texts in Mathematics #48
(out of print)
• B.F. Schutz — A first course in general relativity.
Cambridge, 1985, US$40.00; about NZ$100.00
• J.B. Hartle — Gravity: An introduction to Einstein’s General Relativity.
Addison–Wesley, 2002; US$56; about NZ$120.00
• Additional textbooks and reference books are listed in the bibliography.

1.4 Background

This course is an introduction, so we will pretty much be sticking with classical mathematical tools — much of the mathematics I will develop would make sense to a late 19th century mathematician. (Though some bits of notation, and some of the applications I'll mention, would be a bit of a surprise.)
Chapter 2

Fundamentals

Differential geometry was originally developed to handle very practical problems in large-scale surveying. For small parcels of land the “flat-Earth approximation” is perfectly adequate (except in Wellington). For large parcels of land the curvature of the Earth does have an effect on surveying: the angles of the surveyor's triangles do not add up to π radians.

Similarly, for long-distance navigation, you had better be prepared to understand spherical trigonometry.

Karl Friedrich Gauss is generally credited with the first steps towards what is now called
differential geometry. He was specifically interested in 2-dimensional curved surfaces
embedded in flat Euclidean 3-space, but the notions generalize — eventually it’s better
to discard the idea of an embedding space completely.

Bernhard Riemann is then generally credited with developing notions of n-dimensional “curved spaces”, now generally known as manifolds. (I have included a copy of Riemann's inaugural lecture as one of the appendices.)

We will develop these notions in stages, using the flat-Earth approximation as our guide.

(This chapter is a little more abstract than strictly needed for standard general relativity; but if you ever plan on looking at a paper on “braneworlds” or “string-inspired GR” this vocabulary will be both useful and necessary.)

Quiz: A hunter leaves camp and walks one mile south. He then spots a bear, and walks
one mile east, and then shoots the bear. After that he walks one mile north and finds he
is back at his camp. What colour was the bear? Where is his campsite? ♦

Quiz: A geologist leaves camp and walks one mile south. She then spots a likely location, and walks one mile east to her sampling site. After a bit of digging, she encounters bedrock and extracts a rock sample. After that she walks one mile north and finds she is back at her camp. What did she have to dig through to find her rock sample? Where is her campsite? ♦

Quiz: A navigator directs his ship to follow a “rhumb line”. Ignoring trivia such as the
continents, islands, and rocks that might get in the way, where will his ship eventually
go? ♦

2.1 Elementary topology

In this brief section I want to present just enough abstract topology for you to understand
some of the technical issues in setting up the definition of manifold.

Topology is an abstract branch of mathematics designed to let you handle notions of “continuity” — without the additional baggage of “distance” or “differentiability”. If the only thing you ever work with is the real or complex numbers, you might be excused for thinking that “continuity”, “distance”, and “differentiability” are inextricably entwined. They are not, and it is extremely useful to keep the concepts distinct.

Note: You can get a lot of reference material from Google; it’s generally pretty good
and reasonably reliable. ♦

In fact there is a hierarchy of concepts:

• Topological spaces: Good enough to define “continuity”, but it may not even make
sense to ask questions about “distance” and/or “differentiability”.

• Metric spaces: Good enough to define a notion of distance. All metric spaces
have a natural topology, but it may not even make sense to ask questions about
“differentiability”.

• Differentiable structure: [Considerably more subtle] With minor exceptions, it means you had better have some real [or complex] numbers floating around somewhere. The real numbers will definitely specify a natural topology. [They might also specify a natural metric, but more typically will specify an equivalence class of metrics.]

Definition 1 A topological space (E, T) is a set E together with a collection T of distinguished subsets of E called the “open sets” which satisfy the axioms:

• ∅ ∈ T;

• E ∈ T;

• the intersection of any finite number of open sets is open;

• the union of any [possibly infinite] collection of open sets is open.

Note that U ∈ T ⇒ ∅ ⊆ U ⊆ E. Strictly speaking one should reserve the phrase “topological space” for the ordered pair (E, T), but it is not uncommon to simply call E the topological space and T the topology.
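These axioms are concrete enough to verify mechanically. As a sanity check in the applied spirit of these notes, here is a minimal Python sketch (the helper name `is_topology` and the toy examples are mine, not part of the formal development) that tests the axioms of Definition 1 on a finite set:

```python
from itertools import chain, combinations

def is_topology(E, T):
    """Check the axioms of Definition 1 for a finite set E and a
    candidate collection T of open sets (each given as a frozenset)."""
    E, T = frozenset(E), set(T)
    if frozenset() not in T or E not in T:
        return False            # need both the empty set and E itself
    # for a finite collection, closure under pairwise intersection
    # implies closure under all finite intersections
    for U, V in combinations(T, 2):
        if U & V not in T:
            return False
    # check closure under unions of every sub-collection
    for r in range(2, len(T) + 1):
        for sub in combinations(T, r):
            if frozenset(chain.from_iterable(sub)) not in T:
                return False
    return True

E = {1, 2, 3}
T_good = {frozenset(), frozenset({1}), frozenset({1, 2}), frozenset(E)}
T_bad = {frozenset(), frozenset({1}), frozenset({2}), frozenset(E)}
print(is_topology(E, T_good))   # True
print(is_topology(E, T_bad))    # False: {1} ∪ {2} = {1,2} is missing
```

Note the second collection fails only the union axiom; its pairwise intersections are all present.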

Remarkably enough, this very abstract definition is sufficient to capture the essential
notion of “continuity”.

Definition 2 Suppose a function f maps one topological space (E1, T1) into another (E2, T2). Then the function is continuous iff [if and only if] whenever U ∈ T2 we have f⁻¹(U) ∈ T1.

That is: “continuous” iff “the inverse image of every open set is open”.

Note: f⁻¹(U) is simply the “set inverse” defined by:

    f⁻¹(U) ≡ {x ∈ E1 : f(x) ∈ U};    (2.1)

we are not [at this stage] assuming that the inverse function exists as a function. This is simply the set of points in E1 which map into U in E2. ♦

Note: This is a definition; it is meaningless to try to prove it. You can however, seek to
justify it by verifying that it makes sense in situations you are already familiar with. ♦
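One such verification: on finite examples Definition 2 can be tested directly, by tabulating f, pulling back each open set, and checking the preimage is open. A small sketch (the helper names and the two toy maps are my own illustrations):

```python
def preimage(f, U, domain):
    """Set inverse f^{-1}(U) = {x in domain : f(x) in U}, as in eq. (2.1)."""
    return frozenset(x for x in domain if f[x] in U)

def is_continuous(f, E1, T1, T2):
    """Continuous iff the preimage of every open set is open."""
    return all(preimage(f, U, E1) in T1 for U in T2)

E1 = {1, 2}; E2 = {'a', 'b'}
T1 = {frozenset(), frozenset({1}), frozenset(E1)}
T2 = {frozenset(), frozenset({'a'}), frozenset(E2)}

f = {1: 'a', 2: 'b'}    # preimage of {'a'} is {1}, which is open
g = {1: 'b', 2: 'a'}    # preimage of {'a'} is {2}, which is NOT open
print(is_continuous(f, E1, T1, T2))   # True
print(is_continuous(g, E1, T1, T2))   # False
```

The two maps differ only by swapping the images of 1 and 2, yet one is continuous and the other is not: continuity here depends entirely on which sets have been declared open.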

2.2 The “usual topology” on IR

Consider the real line IR.

• Consider the collection of “open intervals” (a, b) in the usual sense.

• That is, “open intervals” are subsets of IR of the form {x : a < x < b}.

• Note: ∅ = (0, 0) is an “open interval”.

• Note: IR = (−∞, ∞) is an “open interval”.



• Now consider the “set closure” of the collection of “open intervals” under finite intersection and arbitrary union.
Sometimes this “set closure” is called the “algebraic closure” — we want to consider the class of objects that can be constructed by arbitrary unions and finite intersections of the collection of “open intervals” (a, b).
• This set closure is a topology on IR, called the “usual topology”.

In the “usual topology” everything is as expected from previous courses you might have
taken.

Note that the set of “open intervals” itself does not form a topology. It is however a “sub-basis” for a topology. (Meaning that after you invoke set closure under arbitrary unions and finite intersections, you have a topology.)

Comment: I’m being a little loose with the technical definition of “basis of a topology”.
If you want to know precise details, consult textbooks on general topology. Brief sketch
of details below. ♦

Comment: The “usual topology” is not the only topology you can put on IR. There
are other perverse topologies you can place on IR [or any other set]. ♦

Definition 3 The set 2^E is the collection of all subsets of E.

Exercise: [trivial] Show that for any topology (E, T), we have T ⊆ 2^E. ♦

Definition 4 A set B ⊂ 2^E is called a “basis” for some topology on E if it satisfies:

• ∅ ∈ B.

• ∪_{B∈B} B = E.

• If B1 and B2 are in B, then there exists a subset B̃ of B such that

    B1 ∩ B2 = ∪_{B∈B̃} B.    (2.2)

Exercise: Let B be a basis defined on the set E. Show that

    T_B = {U ∈ 2^E : U is a union of elements of B}    (2.3)

is a topology on E. ♦

Definition 5 A set A ⊆ 2^E is called a “sub-basis” if the collection B of finite intersections of elements of A forms a basis.

Exercise: Show that any arbitrary collection A of subsets of E can be used as a sub-basis for some topology on E.

That is, given an arbitrary collection A of subsets of E, a topology T can be formed as follows: First take the collection B of finite intersections of members of A, and show that this is a basis. Then take the topology T generated by taking B as basis. T will then be the “smallest” topology such that A ⊆ T. ♦
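For a finite set, the sub-basis → basis → topology construction of this exercise can be carried out exhaustively by brute force. A toy sketch (the function name is mine; the search is exponential in the sizes involved, so this is strictly an illustration):

```python
from itertools import chain, combinations

def topology_from_subbasis(E, A):
    """Carry out sub-basis -> basis -> topology on a finite set E."""
    E = frozenset(E)
    A = {frozenset(s) for s in A}
    # basis: all finite intersections of members of A
    # (the empty intersection is conventionally E itself)
    basis = {E}
    for r in range(1, len(A) + 1):
        for sets in combinations(A, r):
            inter = E
            for s in sets:
                inter &= s
            basis.add(inter)
    # topology: all unions of sub-collections of the basis
    # (the empty union gives the empty set)
    topology = set()
    for r in range(len(basis) + 1):
        for sets in combinations(basis, r):
            topology.add(frozenset(chain.from_iterable(sets)))
    return topology

T = topology_from_subbasis({1, 2, 3}, [{1, 2}, {2, 3}])
print(sorted(sorted(U) for U in T))   # [[], [1, 2], [1, 2, 3], [2], [2, 3]]
```

Here the intersection {1, 2} ∩ {2, 3} = {2} enters at the basis stage, and the union {1, 2} ∪ {2, 3} = E enters at the topology stage, exactly as the exercise prescribes.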

Exercise: Show that the set A of all open intervals (a, b) on the reals is closed under finite intersection. Consequently, if we treat this as a sub-basis, then the basis B it generates is A itself: B = A. By definition the topology T_B defined by this basis is the usual one. ♦

Exercise: Suppose we are not dealing with the reals. Show that you can do similar things for any set that is ordered by a notion of “greater”, “lesser”, or “equal”. [Technically, consider a “total order”; look up the definition if necessary.] ♦

Exercise: Look up the definition of a “partial order”, or “poset”, and repeat the construction of sub-basis → basis → topology. ♦

Discrete and indiscrete topologies

Two particularly common perverse topologies are:

• Discrete topology: all subsets of E are declared open; T = 2^E.

• Indiscrete topology: the only open sets are ∅ and E; T = {∅, E}.

Exercise: If (E1 , T1 ) has its discrete topology then show that all functions f : E1 → E2
are continuous. Note I don’t have to say anything about the topology on E2 . ♦

Exercise: If (E2 , T2 ) has its indiscrete topology then show that all functions f : E1 → E2
are continuous. Note I don’t have to say anything about the topology on E1 . ♦

This means that the discrete and indiscrete topologies are close to useless [except as a
source of interesting counterexamples].

The key message to take from this discussion is:

Topology and continuity are very primitive [very basic] concepts that make sense
even if there are no real [or complex] numbers anywhere in the problem.

2.3 Hausdorff topologies

Definition 6 A topology is “Hausdorff” iff for any two distinct points x1 and x2 in E there exist open sets U1 and U2 in T such that:

    x1 ∈ U1;   x2 ∈ U2;   U1 ∩ U2 = ∅.    (2.4)

Colloquially: A topology is Hausdorff iff distinct points can be “housed off” from each
other using open sets.

Examples:

• IR in the usual topology is Hausdorff.

• IR in the discrete topology is Hausdorff.

• IR in the indiscrete topology is not Hausdorff.
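The separation condition of Definition 6 can likewise be tested exhaustively on finite examples. This sketch (the helper name is mine) confirms the pattern in the finite analogues of the examples above: the discrete topology separates points, while the indiscrete topology cannot:

```python
from itertools import combinations

def is_hausdorff(E, T):
    """For each pair of distinct points, search for disjoint open sets
    separating them (Definition 6). Feasible only for finite examples."""
    for x1, x2 in combinations(E, 2):
        if not any(x1 in U1 and x2 in U2 and not (U1 & U2)
                   for U1 in T for U2 in T):
            return False
    return True

E = {1, 2, 3}
discrete = {frozenset(s) for s in
            [(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]}
indiscrete = {frozenset(), frozenset(E)}
print(is_hausdorff(E, discrete))     # True: singletons separate any pair
print(is_hausdorff(E, indiscrete))   # False: only ∅ and E are open
```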

2.4 “Point doubling” topologies

More drastically, there are things like “point-doubling topologies”:

Define a set E1 by considering the ordinary real line IR. Remove the point 0, and replace it by two points 0₁ and 0₂.

Define a basis for the topology on E1 :

1. Any open set on IR that does not contain 0 is also an open set in E1.

2. ∀ε > 0, (−ε, 0) ∪ {0₁} ∪ (0, ε) is an open set in E1 that contains 0₁.

3. ∀ε > 0, (−ε, 0) ∪ {0₂} ∪ (0, ε) is an open set in E1 that contains 0₂.



Now consider the topology formed by the set closure of this basis under finite intersection
and arbitrary union.

In this topology, every open set containing 0₁ has a non-empty intersection with every open set containing 0₂. Thus this topology fails to be Hausdorff for the pair of points {0₁, 0₂}. This is an elementary example of “point doubling”.

2.5 “Train track” topologies

Other perversities include “train track topologies”.

Define a set E2 by considering the ordinary real line IR. Remove the half-closed interval I = [0, +∞), and replace it by two copies I₁ = [0₁, +∞₁) and I₂ = [0₂, +∞₂).

Define a basis for the topology on E2 :

1. Any open interval in (−∞, 0) is also an open set in E2.

2. Any open interval in I₁ is also an open set in E2.

3. Any open interval in I₂ is also an open set in E2.

4. ∀ε > 0, (−ε, 0) ∪ [0₁, 0₁ + ε) is an open set in E2 that contains 0₁.

5. ∀ε > 0, (−ε, 0) ∪ [0₂, 0₂ + ε) is an open set in E2 that contains 0₂.

Now consider the topology formed by the set closure of this basis under finite intersection
and arbitrary union.

This topology fails to be Hausdorff for the pair of points {0₁, 0₂}. Because of its explicit construction in terms of the real line, this set can be given a “locally Euclidean” structure in the obvious fashion (see below). This is an elementary example of a one-dimensional manifold that “branches” into two.

Exercise: Since this “line splitting” topology is “locally Euclidean” you might be
tempted to try setting up a theory of ordinary differential equations on this space, similar
to that for ODEs on IR. Think a little about what might happen to an ODE when it hits
the branch point. ♦

The key message to take from this discussion is:

Non-Hausdorff topologies are rarely of use in applied mathematics.



We went through this discussion so that you could see why it might be a good idea to
exclude these types of behaviour from further consideration.

2.6 Topological closure

Some other definitions:

Definition 7 A set X is “closed” iff its complement E − X is open; that is E − X ∈ T .

Definition 8 A set X is “clopen” iff it is both closed and open.

Note ∅ and E are both clopen.

Definition 9 A topological space is “connected” iff the only clopen sets are ∅ and E.

Lemma 1 A topological space is connected iff it is not the union of two disjoint nonempty
open sets.

Definition 10 The “closure” cl(X) of a set X is the smallest closed set that contains X; more formally, E − cl(X) is the union of all open sets that lie completely in E − X:

    cl(X) = E − ∪ {U : U ∈ T and U ⊆ E − X}    (2.5)

Notation: It is quite common to see the closure denoted by an over-bar, as in

cl(X) = X̄. (2.6)

Many analysts strongly condemn such usage as potentially confusing; but it is almost
universal in the physics literature. ♦

Definition 11 The “interior” int(X) of a set X is the biggest open set contained in X; more formally, int(X) is the union of all open sets that lie completely in X:

    int(X) = ∪ {U : U ∈ T and U ⊆ X}    (2.7)

Definition 12 The “boundary” bd(X) of a set X is the set difference

bd(X) = cl(X) − int(X). (2.8)

Notation: It is quite common to see the boundary denoted by a partial derivative operator ∂, as in

    bd(X) = ∂X.    (2.9)

Many analysts strongly condemn such usage as potentially confusing; but it is almost universal in the physics literature. ♦
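Definitions 10–12 translate directly into set operations, so for a finite topology they can be computed by brute force. A sketch (helper names mine), using the complement form of (2.5), namely cl(X) = E − int(E − X):

```python
def interior(X, T):
    """int(X): the union of all open sets contained in X, as in eq. (2.7)."""
    result = frozenset()
    for U in T:
        if U <= X:
            result |= U
    return result

def closure(X, E, T):
    """cl(X): the smallest closed set containing X, computed via the
    complement form of eq. (2.5): cl(X) = E - int(E - X)."""
    E = frozenset(E)
    return E - interior(E - X, T)

def boundary(X, E, T):
    """bd(X) = cl(X) - int(X), as in eq. (2.8)."""
    return closure(X, E, T) - interior(X, T)

E = {1, 2, 3}
T = {frozenset(), frozenset({1}), frozenset({1, 2}), frozenset(E)}
X = frozenset({2})
print(sorted(interior(X, T)))      # []
print(sorted(closure(X, E, T)))    # [2, 3]
print(sorted(boundary(X, E, T)))   # [2, 3]
```

In this little topology the closed sets are ∅, {3}, {2, 3}, and E, so the smallest closed set containing {2} is indeed {2, 3}, while no nonempty open set fits inside {2}; the boundary is everything that remains.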

2.7 Compactness

You will often see the word “compact” flying around in the discussion. The technical
definition is this:

Definition 13 A set U ⊆ E is compact with respect to the topology (E, T ) iff every open
cover of U has a finite refinement.

That is: suppose we have any collection {Oi : i ∈ I} such that ∀i Oi ∈ T and U ⊆ ∪_{i∈I} Oi .
Then U is compact iff for any such collection there exists a finite sub-collection {Oj : j ∈
J; #(J) < ∞} which does the same job: U ⊆ ∪_{j∈J} Oj .

Theorem 1 (Heine-Borel) A set U ⊂ IRn is compact iff it is closed and bounded.

The proof is well outside the scope of this course.

Example: IRn is not compact. ♦


Example: An open interval in IR is not compact. ♦
Warning: In older analysis textbooks you may encounter the old-style version of the
Heine–Borel theorem. ♦

Definition 14 (Old-style compactness) A set U ⊂ IRn is compact iff it is closed and
bounded.

Theorem 2 (Old-style Heine-Borel) A set U ⊂ IRn is compact [with respect to the
usual topology] iff every open cover of U has a finite refinement.

That is, “old style” Heine–Borel was limited to statements concerning IR n , and the defi-
nition of compactness and the statement of the theorem were interchanged.

Another word you will sometimes encounter is “paracompactness”; for our purposes this
represents a highly technical issue that almost never impinges on the things we want to
do.

Definition 15 A set U ⊆ E is paracompact with respect to the topology (E, T ) iff every
open cover of U has a locally finite refinement.

That is, for each x ∈ E there exists a U ∈ T containing x such that the original open
cover of E, restricted by intersection with U , has a finite refinement that covers U .

Warning: Some authors prefer to make “paracompactness” part of the technical defi-
nition of a manifold; in preference to “second countability” defined below.
The constructions are largely equivalent — see for instance appendix A of Wald, “Gravi-
tation”. ♦

2.8 Elementary metric spaces

Warning: The metric defined here is subtly different from the “metric tensor” to be
introduced later.

Worse, the “physicist’s metric” in special relativity [SR] and general relativity [GR] is
not a metric in this sense. The SR/GR metric should really be called a “pseudo-metric”.
More on this next semester. Unfortunately the physics usage is well-established and
unlikely to change. Fortunately, most of the people working in the field are smart
enough to figure out what is meant from context. ♦

Definition 16 A metric space (E, d) is a set equipped with a distance function d : E × E →
IR. Here d(x, y) is non-negative and is to be thought of as the distance from x to y.

The metric satisfies the axioms:

• d(x, x) = 0.

• d(x, y) = 0 ⇒ x = y. [“non-degeneracy”.]

• d(x, y) = d(y, x). [“symmetry”.]



• d(x, y) + d(y, z) ≥ d(x, z). [“triangle inequality”.]

In any metric space we can define the “open balls”

B(x, r) = {y : d(x, y) < r}. (2.10)

Note: ∅ = B(x, 0). ♦


Note: E = B(x, ∞). ♦
If we now take the set closure of the collection of “open balls” under finite intersection
and arbitrary union, we will obtain a topology. This is the metric topology for the set E.

That is: The collection of open balls is a sub-basis. The collection of finite intersections
of open balls is a basis. The collection of arbitrary unions of finite intersections of open
balls is a topology.

Theorem 3 In any metric space the metric topology is Hausdorff.

Proof: Let x and y be distinct, and let δ = d(x, y) be the distance between x and y
which is guaranteed positive. Now consider the open balls

B1 = B(x, δ/3); B2 = B(y, δ/3). (2.11)

Then B1 ∩ B2 = ∅. (You should be able to prove this using the triangle inequality.) QED
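The disjointness step in this proof is easy to probe numerically. A sketch (my own, with an arbitrary pair of points in IR2 and a brute-force grid of candidate points):

```python
import math

def d(p, q):
    # the usual Euclidean metric on IR^2
    return math.hypot(p[0] - q[0], p[1] - q[1])

x, y = (0.0, 0.0), (1.0, 0.0)
delta = d(x, y)

# scan a grid of candidate points z; none can lie in both balls, since
# z in both would force d(x, y) <= d(x, z) + d(z, y) < 2 delta / 3
grid = [(i / 10.0, j / 10.0) for i in range(-20, 21) for j in range(-20, 21)]
overlap = [z for z in grid if d(x, z) < delta / 3 and d(y, z) < delta / 3]
print(overlap)   # []
```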

Since all metric spaces are Hausdorff, and nearly all of applied mathematics takes place
in some sort of metric space [or at least a metrizable space] this is another reason why the
applied mathematics community doesn’t worry too much about non-Hausdorff topologies.

Note: In metric spaces you can certainly define continuity; but you do not need to be
in a metric space to do so. That’s the whole point of going to the trouble of defining a
general topological space. ♦
Note: In IR we have (a, b) = B((a + b)/2, (b − a)/2). ♦
Thus for IR the “usual topology” is the metric topology [with the usual metric].

Even better, in IR [or metric spaces generally] continuity in the topological sense is
equivalent to the standard ∀ε ∃δ sense used in formal analysis.

Reminder: A function f : IR → IR is said to be continuous at the point x iff

    ∀ε > 0 ∃δ > 0 : |y − x| < δ ⇒ |f(y) − f(x)| < ε.    (2.12)

This is the version of continuity most often used in classical analysis, but it has now
largely been superseded by the topological version. ♦
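For concreteness, here is a small numerical sketch (entirely my own; the function f(x) = x^2 and the choice of δ are illustrative) of the ε–δ condition in action:

```python
def f(x):
    return x * x

x0 = 2.0
for eps in (1.0, 0.1, 0.01):
    # delta = eps/(2|x0|+1), capped at 1, works because
    # |y^2 - x0^2| = |y - x0| |y + x0| < delta (2|x0| + 1) <= eps when |y - x0| < delta <= 1
    delta = min(1.0, eps / (2.0 * abs(x0) + 1.0))
    for k in range(-99, 100):
        y = x0 + delta * k / 100.0
        assert abs(f(y) - f(x0)) < eps
print("epsilon-delta check passed")
```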

Notation: homeomorphism ⇐⇒ the function and its inverse are both continuous.

(That is, 1 to 1 and continuous in both directions.) ♦

2.9 Locally Euclidean spaces

Definition 17 Locally Euclidean space
A locally Euclidean space is a topological space (E, T ), a set E together with a topology T
(a collection of sets that we define to be “open”) that satisfies the following:

    ∀x ∈ E ∃ O ∈ T and n ∈ Z^+ such that x ∈ O, and ∃ X ⊂ IRn and a homeomorphism f : O ↔ X.    (2.13)

That is: surrounding any point there is at least one open neighborhood that can be
mapped onto a subset of IRn in a 1–to–1 and bi-continuous manner.

Warning: This definition is actually very weak and still permits a number of things
that physicists would consider pathologies:

• non-Hausdorff topologies.
• dimensionality that is position-dependent.
• dimensionality n that does not seem to match the underlying space E.

Examples of these perversities:

• Point doubling and/or line splitting — see previous discussion of Hausdorff topology.
(It is easy to verify that those spaces are locally Euclidean in this sense.)
• Disconnected topologies.
Take the set union of a line with a plane, each with its usual topology. The result
is a disconnected topology, which is a locally Euclidean space, but in some regions
the space is 1-dimensional while in other regions it is 2-dimensional.
• Square with 1d topology.
This is particularly perverse. See Chillingworth p 121.
Take a square [0, 1] × [0, 1] ⊂ IR2 .
Define a topology: A set X is open iff it is either of the form
X = {(x, y) : {y} open in the usual topology on IR} , (2.14)

or is the arbitrary union of such sets.


Then there is a basis for the open sets which is in a natural sense 1d [parts of IR],
though the space itself is in some natural sense 2d [part of IR2 ].
Warning: Chillingworth still calls this a manifold; we [and most of the com-
munity] will refine our definition to exclude this perversity. Chillingworth uses the
word manifold where we use the phrase “locally Euclidean space”. ♦

Note that this peculiar example requires an uncountable infinity of open sets. In
particular any open cover of the square must contain an uncountable infinity of open
sets.

Definition 18 A set is infinite but countable if all its elements can be put into 1 to 1
correspondence with the natural numbers IN = {0, 1, 2, 3, . . .}.

This level of infinity is denoted by the symbol ℵ0 , read as aleph-null.

Exercise: [trivial] Show that the set Z of integers is countable. ♦

Exercise: [tricky] Show that the set Q of rationals is countable. ♦

Exercise: Show that the set IR of reals is not countable. ♦

Definition 19 A set is infinite and uncountable if it is infinite but not countable.

For example 2^ℵ0 , the cardinality of the continuum, is uncountably infinite. (The assertion ℵ1 = 2^ℵ0 is the continuum hypothesis.)

Exercise: [trivial] Show that the set of all possible subsets of the natural numbers is
uncountable. ♦
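Explicit enumerations make these countability claims concrete. The sketch below (my own; the Calkin-Wilf recurrence is a standard trick, not discussed in the notes) enumerates Z with a zigzag and lists positive rationals without repeats:

```python
from fractions import Fraction

def zigzag(n):
    # bijection from IN = {0, 1, 2, ...} onto Z: 0, 1, -1, 2, -2, ...
    return (n + 1) // 2 if n % 2 else -(n // 2)

def calkin_wilf(count):
    # the Calkin-Wilf sequence hits every positive rational exactly once
    q = Fraction(1)
    out = []
    for _ in range(count):
        out.append(q)
        q = 1 / (2 * (q.numerator // q.denominator) - q + 1)
    return out

print([zigzag(n) for n in range(7)])   # [0, 1, -1, 2, -2, 3, -3]
print(calkin_wilf(6))                  # the rationals 1, 1/2, 2, 1/3, 3/2, 2/3
```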

Warning: [In case you like to read mathematics in French] The French word “variété”
is equivalent to the English word “manifold”; it does not translate to “variety”. ♦

Comment: Why locally Euclidean? Why the importance of IRn ? This is a purely
pragmatic decision based on the extremely useful differentiation and integration theories
you can build on IRn , sufficient to develop theories of ODEs and PDEs.

There are speculations that quantum gravity might eventually force us to modify our
notions of space and time at short distances, in which case “locally Euclidean” might not
be the most appropriate choice. Maybe we should look at “locally X” spaces. There is
currently no consensus as to what the X might be in “locally X”, so I will not explore
that route in this course. The speculations will remain speculations unless and until some
clearly useful notion of “locally X” space is developed. ♦

2.10 Charts

Definition 20 Chart
A chart (O, f, U ) on an open subset O ∈ T is simply a set U ⊂ IRn together with a
homeomorphism f : O ↔ U = f (O).

That is: in any locally Euclidean space each point x lies in at least one chart. These
charts can be thought of as “maps” (in the cartographic sense) that cover part of the
space E.

Comment: The word “map” already has a technical meaning in mathematics, which
is why in manifold theory it is better to adopt the word “chart”. ♦

Definition 21 Coordinates
The image of the chart, f (O) = U ⊆IRn , is a set of n-tuples of real numbers. Each point
x in O corresponds to a single n-tuple, which are referred to as the coordinates of the
point x in the given chart. It is common to write the coordinates as xa with a = 1, 2, ...n,
or xµ with µ = 1, 2, ...n.

Notation: In the specific case of general relativity it has become conventional to write
a = 0, 1, 2, 3 or µ = 0, 1, 2, 3 with 0th coordinate typically [not always] being the time
coordinate. This is merely convention, it is not fundamental. ♦

2.11 Atlases

Now patch all the charts together:

Definition 22 Atlas
An atlas is a collection of charts that covers the entire locally Euclidean space E.

By definition, any locally Euclidean space has at least one atlas.

Note that whenever two charts overlap, the intersection Oij ≡ Oi ∩ Oj ≠ ∅ is an open
set, and the mapping fi ◦ fj^{−1} , suitably restricted, is a homeomorphism from some open
subset of IRn to another open subset of IRn .

Specifically, if Oij ≡ Oi ∩ Oj ≠ ∅ then

    fi ◦ fj^{−1} : fj (Oij ) ↔ fi (Oij ).    (2.15)

Lemma 2 In any locally Euclidean space, the local dimensionality n(x) is piecewise con-
stant on the connected components of E.

Exercise: Prove this. ♦
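A concrete toy atlas may help here. The sketch below (my own construction; the two angle charts are not taken from the notes) builds overlapping charts on the circle S1 and evaluates the transition map fB ◦ fA^{-1} on the two halves of the overlap:

```python
import math

def chart_A(p):    # defined on S^1 minus (-1, 0); angle in (-pi, pi)
    return math.atan2(p[1], p[0])

def inv_A(theta):  # f_A^{-1}
    return (math.cos(theta), math.sin(theta))

def chart_B(p):    # defined on S^1 minus (1, 0); angle in (0, 2 pi)
    theta = math.atan2(p[1], p[0])
    return theta if theta > 0 else theta + 2.0 * math.pi

def transition(theta):
    # f_B o f_A^{-1}, a homeomorphism between open subsets of IR
    return chart_B(inv_A(theta))

print(transition(1.0))    # upper half of the overlap: the identity
print(transition(-1.0))   # lower half: shifted by 2 pi
```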

2.12 Manifolds

Definition 23 Manifold
A manifold M is a locally Euclidean space which:

• has the same dimension everywhere;

• is Hausdorff;

• is second-countable.

Note that we have to put the dimension and Hausdorff conditions in “by hand”, while
the second-countable condition is there to make sure the dimensionality of the chart
“matches” the dimensionality of the space E.

Second-countable means there is at least one countable basis for the topology on M.
It implies (not that this is blindingly obvious) that there is at least one way of covering
the space with a countable number of charts. This is much weaker than the restriction
“compact”, but implied by it, since compactness would require every atlas [since it is in
particular an open cover] to have a finite refinement.

Lemma 3 Any compact, connected, metrizable, locally Euclidean space is a manifold.



Proof: Connected ⇒ globally constant dimension; Metrizable ⇒ Hausdorff; Compact ⇒


second countable. QED

Comment: In a manifold context the words “second countable”, “paracompact”, and


“metrizable” are essentially equivalent (in that they imply one another). I have encoun-
tered an extensive list of at least 88 subtly different topological concepts that are known
to be mutually equivalent in a manifold setting.

The only [mild] danger is that differing texts sometimes use slightly different definitions
as being more basic — in all situations of relevance to applied mathematics and physics the
definitions agree and the “second countable”/ “paracompact”/ “metrizable” restriction
simply serves to keep the number of charts under control.

In particular, appendix A (topological spaces) of Wald, “General Relativity”, contains


a brief description of the interplay between second countability, paracompactness, and
partitions of unity. ♦

So far we only have some continuity information about the manifold — where does
differentiability come in?

2.13 Differentiable Manifolds

Definition 24 Differentiable Manifold


A differentiable manifold is a manifold such that for all Oij ≡ Oi ∩ Oj ≠ ∅ the transition map

    fi ◦ fj^{−1} : fj (Oij ) ↔ fi (Oij )    (2.16)

is a diffeomorphism from a subset of IRn to a subset of IRn .

Reminder: diffeomorphism ⇐⇒ 1 to 1 and differentiable in both directions. ♦


By placing suitable constraints on fi ◦ fj^{−1} we can define C 0 , C r , C ∞ , and C ω manifolds.
[C 0 is automatic.]

Reminder: C r ⇐⇒ the r-th derivative exists and is continuous; C ∞ ⇐⇒ all
derivatives of arbitrary order exist and are continuous; C ω ⇐⇒ analytic — around every
point there exists a Taylor series expansion with a non-zero radius of convergence. ♦
As usual in applied mathematics and mathematical physics: We assume everything is as
smooth as needed.

Example: Euclidean space IR3


Euclidean IR3 is a manifold with a particularly simple globally defined coordinate chart

(and so it has a one-chart atlas).


The same applies to any IRn . ♦

Example: Minkowski space (IR4 , η)


Minkowski space is also a manifold with a globally defined coordinate chart (a one-chart
atlas). ♦
Abstract mathematicians can mostly ignore Minkowski space. It is a flat pseudo-Riemannian
geometry which becomes useful when dealing with special relativity. It is vitally important
to physical applications of manifold theory.

Example: Sphere S 2
The sphere S 2 is a manifold.
It is not enough to just mutter “latitude and longitude” since this does not cover the
whole sphere; the way we have set things up latitude and longitude would be defined on
an open set that covers most of the sphere except for both poles and one line of longitude.
How would we fix this? ♦

Exercise: What is the minimum number of charts needed to provide an atlas for S 2 ? ♦

Example: Sphere S n
The sphere S n is a manifold.
To see this use stereographic projection. ♦

Exercise: What is the minimum number of charts needed to provide an atlas for S n ? ♦
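A minimal sketch of the stereographic charts (my own; written out for S2 , though the formulae work for any Sn ), checking that the transition map on the overlap is u → u/|u|^2 :

```python
def stereo_north(x):
    # projection from the north pole; defined where x_{n+1} != 1
    return [xi / (1.0 - x[-1]) for xi in x[:-1]]

def stereo_south(x):
    # projection from the south pole; defined where x_{n+1} != -1
    return [xi / (1.0 + x[-1]) for xi in x[:-1]]

x = [0.6, 0.0, 0.8]            # a point on S^2, away from both poles
u = stereo_north(x)            # [3.0, 0.0]
v = stereo_south(x)            # [1/3, 0.0]

r2 = sum(ui * ui for ui in u)
print([ui / r2 for ui in u])   # transition map u -> u/|u|^2 reproduces v
```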

Example: Products M × N
If M is a manifold and N is a manifold then the product M × N is a manifold of dimen-
sion dim(M × N ) = dim(M) + dim(N ). ♦

Exercise: Go through the definition in detail to verify that if M is a manifold and N


is a manifold then the product M × N is also a manifold. What topology should you put
on M × N ? ♦

Example: Torus T n ≡ (S 1 )n = S 1 × S 1 × · · · × S 1
The n-torus T n is a manifold. ♦

Example: Tangent bundle T (S n )


The tangent bundle T (S n ) to the sphere S n is a manifold. This is a simple special case

of a more general construction to be developed later.

View S n as a subset of IRn+1 with |~x| = 1. A vector ~t ∈ IRn+1 is tangent to S n at ~x if
~t ⊥ ~x. To keep track of the point ~x at which the tangent is defined write (~x, ~t) ∈ IRn+1 × IRn+1 .
The set of all tangents to S n is then

    T (S n ) = { (~x, ~t) ∈ IRn+1 × IRn+1 : |~x| = 1 and ~x ⊥ ~t }.    (2.17)

This “bundle” of all the tangents is called, surprise, the tangent bundle.
It has a natural topology as a subset of IRn+1 × IRn+1 .
To specify a point in the bundle you need 2n real numbers.
To prove it is a manifold is a relatively simple exercise; not attempted here. ♦

Exercise: Go through the definition in detail to verify that the tangent bundle T (S n )
is actually a manifold. ♦

Comment:

• Locally T (S n ) is always a product of the form Ui × IRn with the set of Ui ’s covering
S n ; but this need not hold globally.

• Globally T (S n ) = S n × IRn iff n = 0, 1, 3, 7.


This is related to the existence of special number systems in n+1 dimensions:
IR, C, the quaternions, and the octonions.

2.14 Connected sum: M1#M2

Definition 25 Let the n-disk (closed n-ball) be defined by:

    D n = cl(B n ) = {~x ∈ IRn : |~x| ≤ 1}.    (2.18)

Then the boundary is

    bd(D n ) = S n−1 .    (2.19)

Definition 26 Connected sum:
Let M1 and M2 be two n-dimensional manifolds (without boundary), with D1 an n-disk
in M1 and D2 an n-disk in M2 . Construct

    M′i = Mi − int(Di ).    (2.20)

These two manifolds both have boundaries bd(M′i ) ∼ S n−1 , which we can identify by some
homeomorphism. The result, denoted M1 #M2 , is an n-dimensional manifold (without
boundary) called the connected sum of M1 and M2 .

Comment: More precisely it is an equivalence class of homeomorphic manifolds, because
you needed to make some choice in carrying out the identification bd(M′1 ) ∼ bd(M′2 ). ♦

Exercise: Understand why:

• M1 #M2 = M2 #M1 ;

• M#S n = M.

2.15 “Modding out”

Definition 27 Equivalence relation:
An equivalence relation ∼ on a set X is a binary relation such that

• x ∼ x. [reflexive]

• x∼y ⇒ y ∼ x. [symmetric]

• x∼y & y∼z ⇒ x ∼ z. [transitive]

Definition 28 Modding out:


If X is a set with an equivalence relation ∼ then X/ ∼ denotes the set of equivalence
classes in X. Going from X to X/ ∼ is called “modding out”. Note that there is a
canonical projection
π : X → X/ ∼ (2.21)
This canonical projection is automatically a surjection (that is, onto).

• If X is a topological space then X/ ∼ is also a topological space, and π is continuous


in this natural topology.

– If X is compact then X/ ∼ is also compact.



– If X is Hausdorff then X/ ∼ may fail to be Hausdorff.

• If X is a manifold then X/ ∼ is sometimes but certainly not always a manifold.

– IRn /Z n = (S 1 )n = T n is a manifold. IRn modded out by discrete linearly


independent translations is the n-torus.
– Sometimes you don’t get a manifold; look for words like “orbifold” or “variety”.
The result of modding out a manifold will, almost always, be a locally Euclidean
space, and will “almost” be a manifold (apart from a few singular points or
surfaces). The most likely failure is the Hausdorff property.
– Orbifolds are now relatively common in string-inspired theoretical physics as
“low-energy” not-quite manifolds arising in “string theory” [brane theory].
– Consider this simple orbifold: O is defined to be IRn where ~x and −~x are iden-
tified. Then ~0 is an orbifold point, and for ~x 6= ~0 the space is locally Euclidean.
(In fact, if you delete the point ~0, the remaining points are a manifold, home-
omorphic to IR+ × IRP n−1 , the positive real line times real projective space.)
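The construction is easy to model on a finite set. A toy sketch (my own; the set and relation are invented for illustration) of the quotient X/∼ and the canonical projection π:

```python
# X = {-3, ..., 3} with the relation x ~ -x
X = range(-3, 4)

def cls(x):
    # the equivalence class of x under x ~ -x
    return frozenset({x, -x})

quotient = {cls(x) for x in X}        # the set of equivalence classes X/~
pi = {x: cls(x) for x in X}           # the canonical projection pi : X -> X/~

print(len(quotient))                  # 4 classes: {0}, {1,-1}, {2,-2}, {3,-3}
print(set(pi.values()) == quotient)   # pi is onto: True
```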

2.16 Tangent vectors

Suppose we have a curve (line) in a manifold. How would we represent this?

    h : IR → M ,   with h(λ) ∈ M for λ ∈ IR.    (2.22)

Note that any homeomorphism from IR to IR would allow us to re-parameterize the


curve λ → λ̄(λ) without changing the image h(IR) in M.

If we have any sense we would at least want to make the curve h :IR→ M continuous.
Question: How would we define continuity? ♦
Better yet:
Question: How would we define differentiability? ♦
Answer: In each chart U the curve h induces a map f ◦ h from IR to f (U ) ⊂IR n , and we
can certainly define continuity and differentiability for this map f ◦ h. We write

xa (λ) = f ◦ h(λ) = f (h(λ)) ∈ f (U ) ⊂IRn . (2.23)

Definition 29 Tangent vector (components):


In a specific chart U , and subject to the particular parameterization λ, let me define the
components of the tangent vector to the curve h(λ) as

    t^a = dx^a /dλ .    (2.24)

Note that this does the obvious thing for Cartesian space M = IRn . If we reparameterize
λ, then by the chain rule

    t̄^a = dx^a /dλ̄ = (dλ/dλ̄) (dx^a /dλ) = (dλ/dλ̄) t^a .    (2.25)

If we change coordinate patches, U to Ū , using

    f̄ ◦ f^{−1} : U → Ū ,    (2.26)

then, again by the chain rule,

    t̄^a = dx̄^a /dλ = Σ_{b=1}^{n} (∂x̄^a /∂x^b ) (dx^b /dλ) = Σ_{b=1}^{n} (∂x̄^a /∂x^b ) t^b .    (2.27)

That is: Once we have the components of the tangent vector in any one chart, there are
specific unambiguous rules for changing to any other chart. Also note that it only makes
sense to talk about the tangent vector components at a particular point p(λ) on the curve
and in the manifold.
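These transformation rules can be checked with finite differences. A minimal sketch (my own example curve, not from the notes) verifying that under the reparameterization λ̄ = 2λ the components pick up the factor dλ/dλ̄ = 1/2:

```python
def x(lam):                  # curve components in some chart: (lam, lam^2)
    return (lam, lam * lam)

def deriv(f, s, h=1e-6):     # central finite difference, componentwise
    return tuple((a - b) / (2 * h) for a, b in zip(f(s + h), f(s - h)))

lam0 = 1.0
t = deriv(x, lam0)                      # t^a = dx^a/dlam = (1, 2)

x_bar = lambda lb: x(lb / 2.0)          # same curve, reparameterized: lam = lambar/2
t_bar = deriv(x_bar, 2.0 * lam0)        # should be (dlam/dlambar) t^a = 0.5 * t^a
print(t, t_bar)
```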

Definition 30 Tangent vector:


The tangent vector to a parameterized curve h(λ) is the abstract object, denoted t, that is
defined as follows: Pick any chart U surrounding the point p = h(λ0 ) and calculate
the components ta . Then t is defined to be that geometrical object which in chart U has
components ta (and so we know its components in any chart).

Definition 31 Vector:
Anything that transforms the same way as a tangent vector is called a vector, or more
specifically a contravariant vector.

Theorem 4 Tangent space:


If we look at the set of all possible tangent vectors at a specified point p, this set is a vector
space Tp , which we call the tangent space at p.

How do we (informally) prove it is a vector space? By working in some specific chart U


where it is clear that the components ta do span a vector space (indeed IRn ), and then
noting that this property is invariant under change of chart.

Indeed, pick a chart U surrounding p and write coordinates xa ∈ f (U ). Then every


distinct mapping from IR to IRn corresponds to a distinct [parameterized] curve in M.
The tangents defined by the distinct mappings from IR to IRn certainly span a vector
space.

To verify this, pick a curve xa (λ) in IRn defined by xa (λ) = ta λ and “pull it back” to
U by defining
h(λ) = f −1 (xa (λ)) = f −1 (ta λ). (2.28)

Definition 32 Tangent bundle:
T (M): Consider the set {(p, Tp ) : p ∈ M}, the collection of all tangent spaces defined at
all points of the manifold M. This set is called the “tangent bundle” to the manifold M.
M is called the “base space”.
All the Tp are homeomorphic to IRn , and IRn is called the “fibre” of the tangent bundle.

Theorem 5 The tangent bundle is a manifold of dimension 2 dim(M).

To (informally) prove this, note that “above” any chart U , the tangent bundle will locally
look like U × IRn , so that sets of this form provide charts for T (M).
Since each of these charts has dimension 2n, so does T (M).
Since M (and hence U ) is Hausdorff, and IRn is Hausdorff, so is U × IRn , and so is T (M).
Since M is second countable, pick a countable atlas Ui . Then Ui × IRn provides a countable
atlas for T (M), which is thereby second countable.

Comment: Once we have the components of the vector ta , it is useful to consider the
directional derivative in the direction ta . Define:

    t = t^a ∂/∂x^a    (2.29)

It is then easy to verify (by the chain rule) that t is independent of coordinate chart —
it is a “geometrical object”. This provides a natural isomorphism between the set of all
vectors at a point and the set of all directional derivatives at that point.

Indeed, in more abstract formulations of differential geometry, this construction is taken


to be primary, and the concept of contravariant vector is defined this way. ♦
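This isomorphism can be tested numerically. The sketch below (my own; the scalar field and curve are invented) checks that the operator t^a ∂/∂x^a applied to φ agrees with dφ/dλ along a curve with tangent t^a :

```python
import math

def phi(x, y):                      # an arbitrary scalar field (my choice)
    return x * x + math.sin(y)

def curve(lam):                     # a curve through p = (1, 0) at lam = 0
    return (1.0 + 2.0 * lam, -3.0 * lam)

h = 1e-6
t = ((curve(h)[0] - curve(-h)[0]) / (2 * h),
     (curve(h)[1] - curve(-h)[1]) / (2 * h))      # tangent components (2, -3)

p = curve(0.0)
dphi_dx = (phi(p[0] + h, p[1]) - phi(p[0] - h, p[1])) / (2 * h)
dphi_dy = (phi(p[0], p[1] + h) - phi(p[0], p[1] - h)) / (2 * h)

directional = t[0] * dphi_dx + t[1] * dphi_dy                 # t^a dphi/dx^a
along_curve = (phi(*curve(h)) - phi(*curve(-h))) / (2 * h)    # dphi/dlam
print(directional, along_curve)    # both equal 2*2 - 3*1 = 1
```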

Definition 33 [Alternative definition of vector] A contravariant vector at a point p is a


directional derivative at the point p (which has to satisfy certain axioms appropriate to
being a linear differential operator).

See for instance, Wald pp 14–16.

This leads to statements that at first glance look a bit strange, such as:

• “For each a, the linear operator ∂/∂xa is a vector.”


• “The n linear operators ∂/∂xa provide a linearly independent basis for the tangent
space Tp at the point p.”

You should be prepared, when encountering such terminology, to make appropriate trans-
lations.

2.17 Covectors

There is a second set of distinct vectors (not tangent vectors) that it is natural to define
on arbitrary manifolds, the co-vectors, a natural extension of the notion of “gradient”.

Let φ(p) be a mapping M →IR. Then in a specific coordinate chart (O, f, U ) we can
write
φ(xa ) ∈IR meaning φ ◦ f −1 : f (O) = U ⊆IRn →IR . (2.30)

Now copy over (with minor modifications) the discussion we had for tangent vectors:

Definition 34 Gradient vector (components):


In this chart (O, f, U ), let me define the components of the gradient vector as

    g_a = ∂φ/∂x^a .    (2.31)
Index placement is important!

If we change coordinate patches to Ū , using f̄ ◦ f^{−1} : U → Ū , then, again by the
chain rule,

    ḡ_a = ∂φ/∂x̄^a = Σ_{b=1}^{n} (∂x^b /∂x̄^a ) (∂φ/∂x^b ) = Σ_{b=1}^{n} (∂x^b /∂x̄^a ) g_b .    (2.32)
Index placement is important!
That is: Once we have the components of the gradient vector in any one chart, there
are specific unambiguous rules for changing to any other chart. Also note that it only
makes sense to talk about the gradient vector components at a particular point p in the
manifold.

Warning: The gradient vectors do not transform the same way as the tangent vectors;
the index placement is different; the use of the chain rule is different; to define tangent
vectors you need a map IR → M, to define gradient vectors you need a map M → IR. ♦
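The covariant transformation rule can also be checked numerically. A sketch (my own example: Cartesian versus polar coordinates on IR2 , with φ = x) computing ḡ_a = (∂x^b /∂x̄^a ) g_b via a finite-difference Jacobian:

```python
import math

# chart xbar = (r, theta), related to x = (x, y) by x = r cos(theta), y = r sin(theta)
def to_cartesian(r, th):
    return (r * math.cos(th), r * math.sin(th))

r0, th0, h = 2.0, 0.7, 1e-6

def jacobian():
    # J[a][b] = partial x^b / partial xbar^a, by central differences
    J = []
    for a in range(2):
        dp, dm = [r0, th0], [r0, th0]
        dp[a] += h
        dm[a] -= h
        xp, xm = to_cartesian(*dp), to_cartesian(*dm)
        J.append([(xp[b] - xm[b]) / (2 * h) for b in range(2)])
    return J

g = (1.0, 0.0)        # gradient of phi = x in the Cartesian chart
J = jacobian()
g_bar = [sum(J[a][b] * g[b] for b in range(2)) for a in range(2)]
print(g_bar)          # expect (cos(theta), -r sin(theta))
```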

Definition 35 Gradient:
The gradient of the function φ at a point p in M is the abstract object g defined as
follows: Pick any chart U surrounding p and calculate the components ga as above.
g is defined to be the object which in chart U has components ga (and so we know its
components in any chart).

Definition 36 Co-vector:
A co-vector at the point p in M is an abstract geometrical object, denoted g, defined as
follows: Pick any chart U surrounding p and in that chart suppose g has components
ga . Suppose that in any other chart it has components

    ḡ_a = Σ_{b=1}^{n} (∂x^b /∂x̄^a ) g_b .    (2.33)

Then we say that g is a co-vector that is independent of the specific choice of chart used
to initiate the definition.

Covectors are also called covariant vectors; they are simply not the same as contravariant
vectors.

Theorem 6 Cotangent space:
If we look at the set of all possible co-vectors at a specified point p, this set is a
vector space Tp∗ , which we call the cotangent space at p.

How do we prove it is a vector space? By working in some specific chart U where it is
clear that the components ga do span a vector space (indeed IRn ), and then noting that
this property is invariant under change of chart.

Indeed, pick a chart U surrounding p and write coordinates {xa } ∈ f (U ). Then every
distinct mapping from IRn to IR corresponds to a distinct real function φ on M. The
gradients defined by the distinct mappings from IRn to IR certainly span a vector space.

To see this pick a function on IRn defined by ga xa and interpret it as a function on U


by defining
φ(p) = ga xa (p). (2.34)

Definition 37 Cotangent bundle:
T ∗ (M): Consider the set {(p, Tp∗ ) : p ∈ M}, the collection of all cotangent spaces defined
at all points of the manifold M. This set is called the “cotangent bundle”, T ∗ (M), to the
manifold M.
M is called the “base space”.
All the Tp∗ are homeomorphic to IRn , and IRn is called the “fibre” of the cotangent bundle.

Theorem 7 The cotangent bundle is a manifold of dimension 2 dim(M).

To prove this (informally), note that “above” any chart U , the cotangent bundle will
locally look like U × IRn , so that sets of this form provide charts for T ∗ (M).
Since each of these charts has dimension 2n, so does T ∗ (M ).
Since M (and hence U ) is Hausdorff, and IRn is Hausdorff, so is U × IRn , and so is T ∗ (M ).

Since M is second countable, pick a countable atlas Ui . Then Ui × IRn provides a countable
atlas for T ∗ (M), which is thereby second countable.

Warning: In Euclidean space with Cartesian coordinates we normally do not [and do


not need to] distinguish vectors (tangents) from covectors (gradients).
This is because Cartesian coordinates have very special properties.
But even in Euclidean space, if we go to general curvilinear coordinates this vector/covector
distinction would be important. ♦

2.18 Dual spaces

Now that we have defined both tangent space and cotangent space, I’ll explain the nomen-
clature by pointing out that the tangent space Tp and cotangent space Tp∗ are dual to each
other in the sense of vector space duality. Specifically any cotangent g ∈ Tp∗ can be viewed
as a linear mapping from Tp to IR.

The most down to earth way of seeing this is by working in terms of coordinates:

g ∈ Tp∗ → ga ; t ∈ Tp → ta (2.35)

Define

    g(t) = Σ_{a=1}^{n} g_a t^a ∈ IR    (2.36)

and note that the combination ga ta is independent of coordinate system — ga transforms
“covariantly”, but ta transforms “contravariantly”, and in the combination ga ta the two
transformation matrices, being inverses, cancel.
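The cancellation of the two transformation matrices can be verified directly. A sketch (my own sample components, using the polar/Cartesian Jacobian in closed form):

```python
import math

r0, th0 = 1.5, 0.4
# J[a][b] = partial x^b / partial xbar^a, for xbar = (r, theta), x = (x, y)
J = [[math.cos(th0), math.sin(th0)],
     [-r0 * math.sin(th0), r0 * math.cos(th0)]]
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
# Jinv = J^{-1} as a matrix, so Jinv[b][a] = partial xbar^a / partial x^b
Jinv = [[J[1][1] / det, -J[0][1] / det],
        [-J[1][0] / det, J[0][0] / det]]

t_bar = [0.3, -0.2]   # contravariant components in the polar chart
g_bar = [1.1, 0.7]    # covariant components in the polar chart

# contravariant rule: t^b = (dx^b/dxbar^a) tbar^a; covariant rule uses the inverse
t = [sum(J[a][b] * t_bar[a] for a in range(2)) for b in range(2)]
g = [sum(Jinv[b][a] * g_bar[a] for a in range(2)) for b in range(2)]

pairing_polar = sum(g_bar[a] * t_bar[a] for a in range(2))
pairing_cart = sum(g[b] * t[b] for b in range(2))
print(pairing_polar, pairing_cart)   # equal: the contraction is chart-independent
```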

Note: By using the chain rule:

    Σ_{b=1}^{n} (∂x^a /∂x̄^b ) (∂x̄^b /∂x^c ) = δ^a_c    (2.37)

    Σ_{b=1}^{n} (∂x̄^a /∂x^b ) (∂x^b /∂x̄^c ) = δ^a_c    (2.38)

That is, the transformation matrices for vectors and covectors are matrix inverses of each
other. ♦

Note: Einstein summation convention:
Einstein introduced the very useful convention that repeated indices should be summed
over, thus for example

    g_a t^a ≡ Σ_{a=1}^{n} g_a t^a .    (2.39)

This greatly simplifies many formulae...
(Though if you don’t want to sum on a repeated index you now have to say so explicitly.) ♦

Warning: Einstein summation convention:
If you have any sense at all, you will only apply the Einstein summation convention to
combinations with one index that is up (contravariant) and one index that is down
(covariant). Why? ♦

Notation:
Bra-ket notation:

    g(t) = ⟨g|t⟩    (2.40)

Dirac’s bra-ket notation (widely used in quantum mechanics to represent inner products
on an appropriate Hilbert space) is an example of distinguishing a vector space from its
dual. Of course Dirac was dealing with a much larger vector space, that of suitably
defined “functions” on IR3 × IR. ♦
Comment: “bra” vectors are written ⟨g|; “ket” vectors are written |t⟩; put them
together, as in ⟨g|t⟩, and you get a “bra-ket” [bracket]. Blame Paul Dirac for this abuse
of the English language. ♦

Because it is independent of coordinate system, the definition

    g(t) = Σ_{a=1}^{n} g_a t^a ∈ IR    (2.41)

actually defines a pairing

    Tp∗ × Tp → IR .    (2.42)

This pairing is by construction linear in Tp , which is why Tp∗ is referred to as the dual
vector space to Tp . If we construct the dual to the dual we get back where we started:

    (Tp∗ )∗ = Tp .    (2.43)

Note that as a side effect of the above discussion we have

    dim(T ∗ ) = dim(T ),    (2.44)

at least for finite-dimensional manifolds.



There is an important interpretation of ga ta in terms of directional derivatives: If
g ∈ Tp∗ arises from the gradient of a scalar φ(p), and t ∈ Tp arises as the tangent vector
to a curve h(λ), then

    g(t) = g_a t^a = (∂φ/∂x^a ) (dx^a /dλ) = dφ/dλ .    (2.45)

That is, g(t) = ⟨g|t⟩ is simply the total derivative of the scalar field φ(p) along the curve
h(λ).

So, of course it’s a coordinate invariant. (Since φ(h(λ)) : IR → IR has a meaning that is
quite independent of any coordinates you introduce on the manifold.)

2.19 Tensors

Now that we have defined tangents, cotangents, and duality, we are ready to define tensors.
[Adapted from a discussion I first saw presented by Chris Grigson.]

Definition 38 Tensor
A tensor Tsr (p) at a point p is a multi-linear mapping from (Tp∗ )r ⊗ (Tp )s into the real
numbers IR.

The symbol ⊗ denotes the usual Cartesian product of two structures, in this case the
Cartesian product of a number of vector spaces.

That is, let g 1 , g 2 , . . . ,g r be r cotangent vectors at p, and let t1 , t2 , . . . ,ts be s tangent


vectors at p, then
Tsr (g 1 , g 2 , . . . g r ; t1 , t2 , . . . ts ) ∈IR (2.46)
and this function is defined to be linear in each one of its r + s arguments.

Why are we doing such a perverse thing? Because this is the abstract way of defining
a tensor without first dealing with components.

Because a tensor Tsr (p) is defined in this way, as a multi-linear mapping from (Tp∗ )r ⊗
(Tp )s into the real numbers IR, and because (Tp∗ )r ⊗ (Tp )s is itself a vector space [of
dimension dim(M)r+s ], we could equally well view the tensor as an element of the dual
of (Tp∗ )r ⊗ (Tp )s . That is
T^r_s(p) ∈ [ (T_p^*)^r ⊗ (T_p)^s ]^*    (2.47)

It is then an easy exercise to show

[ (T_p^*)^r ⊗ (T_p)^s ]^* = [(T_p^*)^r]^* ⊗ [(T_p)^s]^* = [(T_p^*)^*]^r ⊗ [(T_p)^*]^s = (T_p)^r ⊗ (T_p^*)^s    (2.48)
That is:
Tsr (p) ∈ (Tp )r ⊗ (Tp∗ )s (2.49)

Some people may find this characterization more “intuitive”.

Definition 39 Tensor [alternative definition]


A tensor Tsr (p) at a point p is an element of (Tp∗ )r ⊗ (Tp )s .

2.20 Tensor components

The components of a tensor of type Tsr (p) at a point p are defined as follows:

Let g 1 , g 2 , . . . , g r be r cotangent vectors at p, with components ga1 , ga2 , . . . , gar .

Let t1 , t2 , . . . , ts be s tangent vectors at p, with components tb1 , tb2 , . . . , tbs .

Then, because the tensor Tsr is a multi-linear mapping, there must exist a collection of
numbers T a1 a2 ...ar b1 b2 ...bs such that:

T^r_s(g^1, g^2, ..., g^r; t_1, t_2, ..., t_s) = T^{a1 a2 ... ar}{}_{b1 b2 ... bs} g^1_{a1} g^2_{a2} ... g^r_{ar} t^{b1}_1 t^{b2}_2 ... t^{bs}_s.    (2.50)

These numbers are called the components of the tensor Tsr in the specified chart/ coordi-
nate patch/ coordinates.

Index placement is important!

A “quick and dirty” characterization of a tensor of type T^r_s is that it is a collection of numbers with r indices up (r contravariant indices) and s indices down (s covariant indices).

Note that under a change of charts (change of coordinates) Tsr transforms as you would
expect: each one of the up (contravariant) indices transforms like the components of a
tangent vector; each one of the down (covariant) indices transforms like a gradient. Thus
in particular for a tensor of type:

T^1_1 : X^a{}_b → X^ā{}_b̄ = (∂x^ā/∂x^a) (∂x^b/∂x^b̄) X^a{}_b    (2.51)

T^2_0 : X^{ab} → X^{āb̄} = (∂x^ā/∂x^a) (∂x^b̄/∂x^b) X^{ab}    (2.52)

T^0_2 : X_{ab} → X_{āb̄} = (∂x^a/∂x^ā) (∂x^b/∂x^b̄) X_{ab}    (2.53)
(Note that I have put the over-bars on the indices, not on the coordinates themselves.
This is supposed to make things clearer.)
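In components these transformation laws are just contractions with Jacobian matrices. A NumPy sketch (the particular chart change — Cartesian to plane polar — is my own illustrative choice), which also confirms that the contraction X^a{}_a is chart independent:

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = 1.0, 2.0
r = np.hypot(x, y)

# Jacobian J[abar, a] = d(xbar^abar)/d(x^a) for (x, y) -> (r, theta).
J = np.array([[x / r,     y / r],
              [-y / r**2, x / r**2]])
Jinv = np.linalg.inv(J)                  # dx^a / d(xbar^abar)

X = rng.standard_normal((2, 2))          # components X^a_b of a T^1_1 tensor

# X^abar_bbar = (dxbar^abar/dx^a) (dx^b/dxbar^bbar) X^a_b
Xbar = np.einsum('Aa,bB,ab->AB', J, Jinv, X)

# The contraction X^a_a is a scalar, hence the same in both charts.
assert np.isclose(np.trace(Xbar), np.trace(X))
```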

Index placement is important!



Exercise: Consider the T^1_1 tensor whose components are (in the chart whose coordinates are x^a) given by the Kronecker delta δ^a{}_b. What are its components in some other coordinate system x̄^a(x^b)? ♦

2.21 Tensor fields

Once you have the notion of a tensor at a point, the notion of a tensor field is immediate:

Definition 40 Vector field:


A vector field is an assignment of a vector t ∈ Tp to each point p ∈ M.
A vector field is continuous iff on each patch the mapping xb → ta (xb ) is continuous as a
mapping from IRn →IRn .
Similarly we can define differentiable vector fields.
In fact we can define C 0 , C 1 , C r , C ∞ , C ω vector fields.

Definition 41 Covector field:


A covector field is an assignment of a covector t ∈ Tp∗ to each point p ∈ M.
Similarly we can define C 0 , C 1 , C r , C ∞ , and C ω covector fields.

Definition 42 Tensor field:


A tensor field is an assignment of a tensor Tsr (p) ∈ (Tp )r ⊗ (Tp∗ )s to each point p ∈ M.
Similarly for C 0 , C 1 , C r , C ∞ , and C ω tensor fields.

2.22 Fibre bundles

A fibre bundle [US spelling: fiber bundle] is a particular type of manifold with special
structure. Standard examples of fibre bundles are the tangent bundle T (M) and the
cotangent bundle T ∗ (M), but the basic idea is much more general.

Definition 43 Fibre bundle:


A “fibre bundle” is a quartet (M, F , T , π) where

• M is a manifold called the “base space”.

• F is a manifold called the “fibre”.



• T is a manifold called the “total space” [warning: not a topology].


• π : T → M is a projection with ∀x ∈ M, π −1 (x) ∼ F .
That is, the inverse image of any point x ∈ M is homeomorphic to the fibre F .

Additionally we demand that T is “locally” a product space. That is, suppose we have an
open cover of the base space M by a collection of open sets {U }. Then for each U we
demand π −1 (U ) ∼ U × F ; the inverse image of any open set in M is homeomorphic to
U × F.

Examples:

• Any product manifold M1 × M2 can be viewed as a fibre bundle in two different


ways. Either take M1 as base and M2 as the fibre, or vice versa.
• T 2 = S 1 × S 1 is a fibre bundle.
• The tangent bundle T (M) is a fibre bundle with T = T (M), M = M, F = IRn .
Similarly for the cotangent bundle.
• The “Möbius band” is a fibre bundle. Take M = S¹, F = (−1, 1) ⊂ IR, and T locally of the form (a, b) × (−1, 1), with a single twist on going all the way round the circle.

Special cases:

• If the fibre is a vector space [not necessarily the tangent space], the fibre bundle is
called a “vector bundle”.
• If the fibre is a group manifold, and the charts respect the group structure, then the
fibre bundle is called a “principal bundle”.
• If the fibre is IR, the fibre bundle is called a “line bundle”.
• If the fibre is C, the fibre bundle is called a “complex line bundle”.
• In quantum mechanics on configuration spaces of nontrivial topology the Schrödinger wavefunction ψ(x) is not a function. It is instead a section of a complex line bundle defined over the configuration space. This is actually important for a proper analysis of both the Aharonov–Bohm effect, and flux quantization in superconducting rings.
• In quantum field theory the Dirac spinor describing an electron ψ(x) is not a func-
tion. It is instead a section of a spinor bundle defined over Minkowski space. This
is actually important for a deeper understanding of the electromagnetic field.

• Don’t be afraid of terms like “[something] bundle”; they are just special types of
bundle where you have additional information about the fibre.

2.23 Final comments on fundamentals

All the technical machinery and terminology I have set up so far is very general, and
very flexible; it is used both in standard GR, and in various extensions to GR currently
under investigation. It will be common, at least in some approximation, to all physically
reasonable extensions of GR and is also used in other quite different physical situations
such as topological solitons, ...

So far, I have not introduced the notion of a metric tensor. The tangent and cotangent
spaces, though dual to each other, have otherwise been independent of each other. This
will change, in a chapter or two.

Aside: The relationship between mathematicians and theoretical physicists is often


fruitful but sometimes just a little bit strained. I have heard some physicists bemoaning
the grim disease of “bundle fibrosis” that has swept through the theoretical physics com-
munity in the last 25 years. ♦
Chapter 3

Affine Connexions

The fundamental reason that tensor calculus is (relatively) difficult is that partial deriva-
tives of a tensor are not themselves a tensor; so that it takes quite a bit of work to even
define tensor differentiation. We will focus on one way of defining tensor differentiation
in this chapter; there are at least two other types of tensor derivative we’ll get round to
later in the course.

3.1 Partial derivatives are not tensors

Suppose V a are the components of a vector, then under a change of chart


V^ā = (∂x^ā/∂x^b) V^b    (3.1)

Take partial derivatives¹

∂_c̄ V^ā = ∂_c̄ [ (∂x^ā/∂x^b) V^b ]    (3.2)

= [ ∂_c̄ (∂x^ā/∂x^b) ] V^b + (∂x^ā/∂x^b) ∂_c̄ V^b    (3.3)

= [ (∂/∂x^c̄) (∂x^ā/∂x^b) ] V^b + (∂x^ā/∂x^b) (∂x^d/∂x^c̄) ∂_d V^b    (3.4)

= (∂x^d/∂x^c̄) (∂²x^ā/∂x^d ∂x^b) V^b + (∂x^ā/∂x^b) (∂x^d/∂x^c̄) ∂_d V^b    (3.5)
If only the last term were present, then we would have a T^1_1 tensor. Unfortunately the presence of the first term destroys the tensorial properties of the partial derivative ∂V —

¹ As usual we will use ∂_a as shorthand for ∂/∂x^a.

unless, that is, you can guarantee that the only coordinate changes you will ever have to
make are linear.

This happens for instance if you are working in Euclidean space using Cartesian coor-
dinates and are only interested in changing coordinates to another Cartesian system.

Comment: There is a whole vast subject called “Cartesian tensors”, mainly used in
geophysics, elasticity theory, and engineering analysis for which all these simplifications
do occur — so all the complications of this chapter quietly go away. ♦
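The offending second-derivative term is easy to exhibit symbolically. A SymPy sketch (my own example of a nonlinear change of chart): the matrix of second derivatives ∂²x̄^a/∂x^d∂x^b is nonzero for Cartesian → polar, but vanishes identically for a linear change:

```python
import sympy as sp

# Nonlinear change of chart: Cartesian (x, y) -> polar radius r.
x, y = sp.symbols('x y', positive=True)
r = sp.sqrt(x**2 + y**2)

# The troublesome term in (3.5) involves the second derivatives
# d^2 xbar^a / dx^d dx^b of the new coordinates with respect to the old.
hessian_r = sp.hessian(r, (x, y))
assert sp.simplify(hessian_r) != sp.zeros(2, 2)   # nonzero: dV is not a tensor

# For a linear change of chart the second derivatives vanish identically,
# which is why "Cartesian tensors" can get away with partial derivatives.
u = 3 * x + 5 * y
assert sp.hessian(u, (x, y)) == sp.zeros(2, 2)
```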

In general though, you simply have to live with the first term and either develop ad-
ditional structure to deal with it or suitably restrict the questions you ask. The three
standard routes to tensor differentiation are:

1. the covariant derivative [affine connexion];


[additional structure].
2. the exterior derivative [antisymmetric tensors];
[restrict the tensors you differentiate].
3. the Lie derivative [differentiation along a curve];
[restrict the directions in which you differentiate].

For this chapter we will stick with the affine connexion [covariant derivative]. Note that
the first term, which causes the problems, is linear in the components of V ; this is one
hint of how to fix things up. The second hint comes from a deeper look at what went
wrong: In the partial derivative we are calculating

∂_c̄ V^ā ≡ lim_{δx→0} [ V(x + δx) − V(x) ] / δx    (3.6)
But that means we are really subtracting two different vectors in two different vector
spaces, one at x, [V (x) ∈ Tx ], and one at x + δx, [V (x + δx) ∈ Tx+δx ] — no wonder the
result is not a tensor.

3.2 Parallel transport

3.2.1 Basic idea

To define a covariant notion of derivative we will need to do something like this


∇V ≡ lim_{δx→0} [ transport_{x+δx→x}[V(x + δx)] − V(x) ] / δx    (3.7)

But what properties should this “transport” function have? In general,

transport(y→x;γ) : Ty → Tx (3.8)

should be an invertible mapping — distinct vectors should map to distinct vectors — but
it might well depend on the specific path chosen to go from y → x. (In fact, in general
it will be path dependent.) Now we certainly want parallel transport along the null path
γ0 (the path that does not move anywhere) to be the trivial identity operator

transport(x→x;γ0 ) = I : Tx → Tx (3.9)

Additionally, if we parallel transport along a path and then parallel transport back along
the same path in a reversed sense, we would want the composition of these parallel
transport processes to be trivial. Specifically let γ : y → x and let the reversed path be
γ̃ : x → y, then we would expect

transport(x→y;γ̃) ◦ transport(y→x;γ) = transport(y→y;γ0 ) = I (3.10)

that is

transport_{(x→y;γ̃)} = [ transport_{(y→x;γ)} ]^{−1}    (3.11)
so that reversing the path corresponds to the inverse transformation.

Finally, since it maps vector spaces to vector spaces, the transport operator should at
the very least be linear:

transport(y→x;γ) [V1 + V2 ] = transport(y→x;γ) [V1 ] + transport(y→x;γ) [V2 ] (3.12)

This is also necessary if we wish to enforce the basic property ∇(V1 + V2 ) = ∇V1 + ∇V2 .
So if we have a coordinate chart that covers both the points x and y we will be able to
write the components of the transport operator as:

[transport(y→x;γ) ]a b (3.13)

Warning: This is not a tensor, it is a bi-tensor — the index a is a vector index at x


while the index b is a covector index at y. ♦

Warning: It is possible to think of generalizations of the usual transport operator that


do not act linearly, and do not invert themselves under reversal of the path you traverse.
Of course if you do this, you will not get the usual definition of covariant derivative.

As always, the most basic properties are assumed, not derived, and the relevant ques-
tion is whether the assumptions you make [the axioms] lead to a useful mathematical
structure. ♦

3.2.2 Connexion

The “affine connexion” Γa bc is now informally defined by:

Γ^a{}_{bc} = (∂/∂y^c) {transport[y → x; γ]}^a{}_b |_{y→x} ,    (3.14)

where the limit y → x approaches the trivial curve γ0 . The affine connexion is not a
tensor, for exactly the same reason that the partial derivative of a vector is not a tensor.
Note that the leading indices a and b have to do with “matrix multiplication” of the vector
components, while the trailing index c has to do with the “direction of differentiation”.
To emphasise the different roles these indices play it is sometimes useful to introduce an
abstract “bullet” notation and write:

Γ^•{}_{•c} = (∂/∂y^c) {transport[y → x; γ]}^•{}_• |_{y→x} .    (3.15)

The “bulleted” indices should, whenever possible, be chained together by matrix multi-
plication.

Warning: This “bullet” notation is my own slightly nonstandard notation. You will
soon see why it is often useful. ♦

Warning: Authors differ regarding where to place the indices on the affine connex-
ion. Check whichever book you are using. The convention of these notes is [putting the
direction you are differentiating in as the trailing lower index] is compatible with Misner–
Thorne–Wheeler but is not [strictly speaking] compatible with either Hartle or Wald.
Fortunately both of these authors deal explicitly with symmetric connexions, so the order
in which they put the lower two indices does not really matter [for them]. ♦

Comment: A slightly more formal version of this construction works as follows: Pick
a coordinate chart around the point xa and in some “small” neighborhood of xa set up a
special collection of curves

γ^a_{y,x}(λ) = λ y^a + (1 − λ) x^a ;    λ ∈ [0, 1]    (3.16)

Now these curves are all “straight lines” in the specified coordinate chart, but this implies
no loss of generality because we are always looking at “small” regions and relying on the
locally Euclidean nature of the manifold. (So that any complicated [but differentiable]
curve can in “small” enough regions be safely approximated by straight line segments.)
Note that by construction

lim_{y→x} γ_{x,y} = γ_0    (3.17)

the trivial curve, and that on an open region surrounding xa


{transport[y → x; γx,y ]}• • (3.18)
is now well defined as a function of y^a. Then

Γ^a{}_{bc} = (∂/∂y^c) {transport[y → x; γ]}^a{}_b |_{y→x} ,    (3.19)

is guaranteed to be well-defined. ♦

If you are only transporting the vector a short distance, then via Taylor’s theorem the
change in components due to the transport should be linear in δx, (and path independent
at this level of approximation). In components:
transport(x+δx→x) [V (x + δx)]a = V a (x + δx) + Γa bc V b δxc + O[(δx)2 ] (3.20)
This implies
transport(x+δx→x) [V (x + δx)]a = V a (x) + {∂c V a (x) + Γa bc V b }δxc + O[(δx)2 ] (3.21)

Exercise: In • notation, justify the statement


transport(x+δx→x) [V (x + δx)]• = V • (x) + {∂c V • (x) + Γ• •c V • }δxc + O[(δx)2 ] (3.22)
Note that this emphasises the difference between those indices that have to do with the
direction in which you are parallel transporting, and the other indices that are just “com-
ing along for the ride”. ♦

Exercise: In • notation, define


J^•{}_• = ∂x^• / ∂x^• = ∂_• x^•    (3.23)

and use this to justify the change of coordinates formula

V^• = V^• J^•{}_•    (3.24)

Thereby justify

∂_• V^• = [J^{−1}]^•{}_• (∂_• V^•) J^•{}_• + [J^{−1}]^•{}_• (V^m ∂_m) J^•{}_•    (3.25)

or the even more compact

∂_• V^• = [J^{−1}]^•{}_• (∂_• V^•) J^•{}_• + [J^{−1}]^•{}_• (V · ∂) J^•{}_•    (3.26)
This focuses attention on the fact that all the complicated index structure in the (non
covariant) transformation law for partial derivatives is really just a matter of chaining
together a few matrix multiplications. ♦

3.2.3 Covariantly constant vector fields

This now lets you write the notion of a vector field that is “covariantly constant” to itself
in differential form. By definition of the transport operator a vector field is “covariantly
constant” iff
transport(x+δx→x) [V (x + δx)]a = V a (x) (3.27)
But this then implies

{∂c V a (x) + Γa bc V b } δxc + O[(δx)2 ] = 0 (3.28)

Parameterizing the curve with some parameter λ and taking the limit as δx^c → 0 and δλ → 0, we have

{∂_c V^a(x) + Γ^a{}_{bc} V^b} (dx^c/dλ) = 0    (3.29)

or

(∂V^a/∂x^c) (dx^c/dλ) + Γ^a{}_{bc}(λ) V^b(λ) (dx^c/dλ) = 0.    (3.30)

That is

dV^a(λ)/dλ + Γ^a{}_{bc}(λ) V^b(λ) (dx^c/dλ) = 0.    (3.31)
This differential version of the “parallel transport” process shows that if you (1) know what
the curve xa (λ) is, and you know (2) the components of the affine connexion Γa bc (x), [and
hence Γa bc (λ)], then finding the parallel transported coefficients of V is simply a matter
of solving a first-order linear ODE. And remember from Math301 that you know that
the solutions of first-order ODEs enjoy nice existence and uniqueness properties. [Finally,
once you have the parallel transported components at x, you can compare them to the
actual components at x to define the covariant derivative — see below.]
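Equation (3.31) really is just a first-order linear ODE. As an illustrative sketch (my own example, borrowing the standard round-sphere connexion components Γ^θ_{φφ} = −sin θ cos θ, Γ^φ_{θφ} = Γ^φ_{φθ} = cot θ as given data; metric connexions as such come later in the notes), parallel transport a vector once around the circle of colatitude θ0. The transported vector comes back rotated, with V^θ(2π) = cos(2π cos θ0):

```python
import numpy as np
from scipy.integrate import solve_ivp

theta0 = np.pi / 3          # colatitude of the circle we transport around

def rhs(lam, V):
    """dV^a/dlam = -Gamma^a_bc V^b dx^c/dlam along theta = theta0, phi = lam."""
    Vth, Vph = V
    dVth = np.sin(theta0) * np.cos(theta0) * Vph      # -Gamma^th_phph Vph
    dVph = -(np.cos(theta0) / np.sin(theta0)) * Vth   # -Gamma^ph_thph Vth
    return [dVth, dVph]

sol = solve_ivp(rhs, [0.0, 2.0 * np.pi], [1.0, 0.0], rtol=1e-10, atol=1e-12)
Vth_final = sol.y[0, -1]

# Holonomy of the round sphere: after one loop V^theta = cos(2 pi cos(theta0)).
assert np.isclose(Vth_final, np.cos(2.0 * np.pi * np.cos(theta0)), atol=1e-6)
```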

3.2.4 Covariantly auto-parallel vector fields

One could also consider a slightly different notion: that of a vector field that is parallel
to itself (“auto-parallel” but not necessarily “covariantly constant”). By definition of the
transport operator a vector field is parallel to itself iff

transport(x+δx→x) [V (x + δx)]a ∝ V a (x) (3.32)

That is
transport(x+δx→x) [V (x + δx)]a = f (x, δx) V a (x) (3.33)
But this then implies

{∂c V a (x) + Γa bc V b } δxc + O[(δx)2 ] = {f (x, δx) − 1} V a (x) (3.34)



Parameterizing the curve with some parameter λ and taking the limit as δx^c → 0 and δλ → 0, we have

{∂_c V^a(x) + Γ^a{}_{bc} V^b} (dx^c/dλ) = f_c (dx^c/dλ) V^a    (3.35)

or

dV^a(λ)/dλ + Γ^a{}_{bc}(λ) V^b(λ) (dx^c/dλ) = f_c (dx^c/dλ) V^a.    (3.36)
This is the differential version of an “auto-parallel” vector field. The covector fc is un-
constrained.

3.2.5 Path ordering

To go back from affine connexion to transport operator, consider the object:


[transport_{(y→x;γ)}]^•{}_• = P exp( ∫_x^y Γ^•{}_{•c} dx^c )    (3.37)

Here the integral is to be taken along the curve γ and P denotes the process of “path or-
dering”. The transport operator then forms a “group” under path composition. (Strictly
speaking, a pseudo-group, since composition is defined only if the end of the first curve cor-
responds to the beginning of the second curve.) The technical definition of path-ordering
is (see appendix for details)

P exp { ∫_x^y Γ^•{}_{•c} dx^c } ≡ lim_{N→∞} ∏_{n=1}^{N} exp { Γ^•{}_{•c}(x_n) δx^c_n }    (3.38)

It is the matrix multiplication analogue of the Riemann sum; in quantum physics this is
exactly the same as the “time-ordered exponential integral”.

Aside: To a physicist, and in particular to a particle physicist, the underlying logic here
is obvious. If we think of the affine connexion as analogous to a non-Abelian [Yang–Mills]
gauge field (e.g., in QCD) then the parallel transport operator is analogous to what physi-
cists would call a gauge transport operator — and parallel transport around a closed path
is analogous to what physicists would call a Wilson loop. While I am not doing anything
with non-Abelian gauge theories at this stage, see the last chapter. ♦

Note that affine connexion and transport operator are interchangeable. From the trans-
port operator you can extract the connexion by differentiation or by looking at “short”
curves; from the connexion you can construct the transport operator by path-ordered
exponentiation.

Note that (at this stage) the transport operator and affine connexion Γa bc are introduced
as primitive a priori concepts that have no further resolution into more primitive concepts
— this will change once I get around to discussing the metric tensor and metric connexions.

Note that there is no unique best way of introducing the concepts of parallel transport
and affine connexion — many textbooks use an axiomatic formation in terms of the
“covariant derivative” ∇V .

Aside: The determinant of the transport operator is easily seen to be independent of


these “path ordering” problems. Indeed

det( [transport_{(y→x;γ)}]^•{}_• ) = det( P exp( ∫_x^y Γ^•{}_{•c} dx^c ) )    (3.39)

= det( lim_{N→∞} ∏_{n=1}^{N} exp( Γ^•{}_{•c}(x_n) δx^c_n ) )    (3.40)

= lim_{N→∞} ∏_{n=1}^{N} exp( [tr Γ^•{}_{•c}(x_n)] δx^c_n )    (3.41)

= exp( ∫_x^y Γ^a{}_{ac} dx^c )    (3.42)

So the determinant of the transport operator has a simple connection with the trace Γa ac .
There is no special word for this object (it’s not a vector), but maybe there should be. ♦
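The step from (3.40) to (3.42) rests on det(e^A) = e^{tr A}, applied factor by factor. A quick NumPy/SciPy check of the discretized identity, with random matrices standing in for the factors Γ^•_{•c}(x_n) δx^c_n (my own stand-ins, purely for illustration):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)

# Stand-ins for the factors Gamma^._.c(x_n) delta x^c_n along a discretized path.
steps = [rng.standard_normal((3, 3)) * 0.1 for _ in range(20)]

# Path-ordered product of exponentials (order matters: the factors don't commute).
transport = np.eye(3)
for A in steps:
    transport = expm(A) @ transport

# det(P exp) = exp(sum of traces) -- with no ordering subtleties at all.
assert np.isclose(np.linalg.det(transport),
                  np.exp(sum(np.trace(A) for A in steps)))
```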

Exercise: Explicitly calculate the transformation law for Γa ac . ♦

Exercise: Look up the definition of tensor density and convince yourself that Γa ac
transforms as the gradient of a scalar density. ♦

Notation: Because it shows up relatively often it is useful to define the quantity


Γ_a ≡ Γ^m{}_{ma} ≡ tr [Γ^•{}_{•a}]    (3.43)
Note that the trace is on the first two indices. This quantity does not seem to have a
special name. ♦

3.3 Covariant derivative:

3.3.1 Vectors

Now formalize the definition of covariant derivative:


∇V ≡ lim_{δx→0} [ transport_{(x+δx→x)}[V(x + δx)] − V(x) ] / δx    (3.44)

In terms of components

∇_b V^a ≡ lim_{δx→0} { V^a(x + δx) + Γ^a{}_{dc} V^d(x + δx) δx^c + O[(δx)^2] − V^a(x) } / δx^b    (3.45)
Then
∇b V a = ∂b V a (x) + Γa db (x) V d (x) (3.46)
This is the component version of the covariant derivative acting on vectors. And if everything has worked right, ∇V should be a T^1_1 tensor. Let's look at the transformation law
for the connexion

Γ^ā{}_{b̄c̄} = (∂/∂y^c̄) {transport[y → x; γ]}^ā{}_{b̄} |_{y→x}    (3.47)

= (∂y^c/∂y^c̄) (∂/∂y^c) [ (∂x^ā/∂x^a) {transport[y → x; γ]}^a{}_b (∂y^b/∂y^b̄) ] |_{y→x}    (3.48)

= (∂x^c/∂x^c̄) (∂x^ā/∂x^a) (∂x^b/∂x^b̄) Γ^a{}_{bc} + (∂x^ā/∂x^a) (∂²x^a/∂x^c̄ ∂x^b̄)    (3.49)

= (∂x^c/∂x^c̄) (∂x^ā/∂x^a) (∂x^b/∂x^b̄) Γ^a{}_{bc} − (∂x^b/∂x^b̄) (∂²x^ā/∂x^b ∂x^c) (∂x^c/∂x^c̄).    (3.50)
So the affine connexion Γ has the transformation law of a T21 tensor except for an addi-
tional second-derivative term. This second-derivative terms has of course been carefully
cooked up to cancel the second-derivative terms arising in the transformation of the par-
tial derivative of a vector. Result: the covariant derivative (of a contravariant vector)
really is a tensor.

Now that we have the notion of covariant derivative of a vector, we can revisit our
differential equation for an auto-parallel vector field
{∂_c V^a(x) + Γ^a{}_{bc} V^b} (dx^c/dλ) = f_c (dx^c/dλ) V^a    (3.51)

and write it in the form

{∇_c V^a} (dx^c/dλ) = f_c (dx^c/dλ) V^a    (3.52)

Multiply by V^b and anti-symmetrize, then

∇_c V^{[a} V^{b]} (dx^c/dλ) = 0    (3.53)

which has the advantage that the unknown covector fc has been eliminated. If we only
specify that the vector field is auto-parallel along the one curve, this is as far as we can
go. But if this is supposed to hold for all curves, that is for all dxa /dλ, then
∇c V [a V b] = 0 (3.54)
which is our final form of the differential constraint for an auto-parallel vector field. Note
that a covariantly constant vector field satisfies the stronger constraint
∇c V a = 0 (3.55)

3.3.2 Covectors

How would we define the covariant derivative of a covector? Use the fact that t a ga is a
scalar, and make some extra assumptions:

• For a scalar
∇a φ ≡ ∂ a φ (3.56)
This is justified since a gradient of a scalar does transform in a tensorial manner.
• Preserve the Leibniz rule:

∇(XY ) = X∇Y + Y ∇X (3.57)

This is too useful a rule to abandon (unless absolutely necessary).

Then, subject to these two assumptions,

∇a (tb gb ) = (∇a tb )gb + tb (∇a gb ) = ∂a (tb gb ) = (∂a tb )gb + tb (∂a gb ) (3.58)

So
(∂a tb + Γb ca tc )gb + tb (∇a gb ) = (∂a tb )gb + tb (∂a gb ) (3.59)
That is
(Γb ca tc )gb + tb (∇a gb ) = tb (∂a gb ) (3.60)
Rearrange

t^b (∇_a g_b) = t^b (∂_a g_b − Γ^c{}_{ba} g_c)    (3.61)

But this is supposed to hold for arbitrary t^a, therefore

∇a gb = ∂a gb − Γc ba gc (3.62)

And so the rule for covariantly differentiating covectors is implicit in that for vectors.

3.3.3 Tensors

The generalization to any Tsr is now obvious. There will be a total of r + s + 1 terms.
The first will be the partial derivative. Then r terms involving +Γ, and finally s terms
involving −Γ. On the Γ’s the last trailing index will always be the same as the index on
the ∇ and the remaining indices can be reconstructed to get the tensor structure correct.
Thus

∇_a T^{b1 b2 ... br}{}_{c1 c2 ... cs} ≡ ∂_a T^{b1 b2 ... br}{}_{c1 c2 ... cs} + Σ_{α=1}^{r} Γ^{bα}{}_{σa} T^{b1 b2 ... σ ... br}{}_{c1 c2 ... cs} − Σ_{β=1}^{s} Γ^{σ}{}_{cβ a} T^{b1 b2 ... br}{}_{c1 c2 ... σ ... cs}    (3.63)

This should be compared to

∇b V a = ∂b V a + Γa db V d (3.64)

and
∇a gb = ∂a gb − Γc ba gc (3.65)
As a specific example

∇a X b c = ∂a X b c + Γb ma X m c − Γm ca X b m (3.66)
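The index pattern can be checked on a case where the answer is known in advance. A SymPy sketch (my own standard example, assuming the usual flat-plane polar connexion components Γ^r_{θθ} = −r and Γ^θ_{rθ} = Γ^θ_{θr} = 1/r as given data): the constant Cartesian field e_x has polar components V^r = cos θ, V^θ = −sin θ/r, and its covariant derivative should vanish identically.

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
coords = (r, th)

# Assumed connexion of the Euclidean plane in polar coordinates.
Gamma = [[[0] * 2 for _ in range(2)] for _ in range(2)]   # Gamma[a][b][c] = Gamma^a_bc
Gamma[0][1][1] = -r                        # Gamma^r_thth
Gamma[1][0][1] = Gamma[1][1][0] = 1 / r    # Gamma^th_rth = Gamma^th_thr

# Polar components of the constant Cartesian vector field e_x.
V = [sp.cos(th), -sp.sin(th) / r]

# nabla_b V^a = partial_b V^a + Gamma^a_db V^d
nablaV = [[sp.simplify(sp.diff(V[a], coords[b])
                       + sum(Gamma[a][d][b] * V[d] for d in range(2)))
           for b in range(2)] for a in range(2)]

assert nablaV == [[0, 0], [0, 0]]   # a constant field is covariantly constant
```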

3.4 Geodesics

Geodesics are now defined as the “straightest possible” curves in the manifold, where
straight means that the tangent vector to the curve, parallel propagated along the curve,
is still parallel to the tangent vector.

Suppose we have a curve γ with an arbitrary parameterization λ. Define the tangent


vector by

t^a = dx^a/dλ    (3.67)
Then the curve is a geodesic iff:

transport_{(γ(λ2)→γ(λ1);γ)} [t(λ2)] ∝ t(λ1)    (3.68)

This can be written in infinitesimal form by differentiating with respect to λ2, and then letting y → x (i.e., λ2 → λ1). Then

dt/dλ + [ (d/dλ) transport_{(γ(λ2)→γ(λ1);γ)} ] t ∝ t    (3.69)

In coordinates:

dt/dλ + (dx^a/dλ) [ (∂/∂x^a) transport_{(γ(λ2)→γ(λ1);γ)} ] t ∝ t    (3.70)

That is

dt^a/dλ + {t^c Γ^a{}_{bc}} t^b ∝ t^a    (3.71)

Alternatively

dt^a/dλ + Γ^a{}_{bc} t^b t^c ∝ t^a    (3.72)

So that:

d²x^a/dλ² + Γ^a{}_{bc} (dx^b/dλ) (dx^c/dλ) ∝ dx^a/dλ.    (3.73)

Here are six other equivalent forms of the geodesic equation:

[ ∂t^a/∂x^c + Γ^a{}_{bc} t^b ] t^c ∝ t^a ;    [ ∂t^a/∂x^c + Γ^a{}_{bc} t^b ] t^c = f(λ) t^a ;    (3.74)

(∇_c t^a) t^c ∝ t^a ;    (∇_c t^a) t^c = f(λ) t^a ;    (3.75)

(t · ∇)t ∝ t ;    (t · ∇)t = f(λ) t.    (3.76)
These are all equivalent statements regarding the notion of “straightest possible curve”
in a manifold armed with an affine connexion.
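With an affine parameterization (constructed below) the proportionality factor vanishes and the geodesic equation is just a second-order ODE, d²x^a/dλ² + Γ^a_{bc} (dx^b/dλ)(dx^c/dλ) = 0. A numerical sketch (my own example, again quoting the round-sphere connexion components as given data): a geodesic fired along a meridian should stay on that meridian, i.e. φ remains constant.

```python
import numpy as np
from scipy.integrate import solve_ivp

def geodesic_rhs(lam, state):
    """Affinely parameterized geodesic on the unit sphere:
    d^2 x^a/dlam^2 = -Gamma^a_bc (dx^b/dlam)(dx^c/dlam)."""
    th, ph, dth, dph = state
    ddth = np.sin(th) * np.cos(th) * dph**2               # -Gamma^th_phph dph dph
    ddph = -2.0 * (np.cos(th) / np.sin(th)) * dth * dph   # -2 Gamma^ph_thph dth dph
    return [dth, dph, ddth, ddph]

# Fire a geodesic due south (dtheta/dlam = 1, dphi/dlam = 0) from theta = pi/4.
sol = solve_ivp(geodesic_rhs, [0.0, 1.0], [np.pi / 4, 0.3, 1.0, 0.0],
                rtol=1e-10, atol=1e-12)

# A meridian is a great circle: phi stays constant along the whole curve.
assert np.allclose(sol.y[1], 0.3, atol=1e-8)
```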

Where does the notion of metric tensor come into this? It doesn’t — not yet anyway.

Suppose we now change the parameterization: λ → λ̄(λ). Then

t = dx/dλ    →    t̄ = (dλ/dλ̄) t    (3.77)

Therefore

(t̄ · ∇)t̄ = (dλ/dλ̄) (t · ∇) [ (dλ/dλ̄) t ] = (dλ/dλ̄)² (t · ∇)t + (d²λ/dλ̄²) t    (3.78)

Using the geodesic equation

(t̄ · ∇)t̄ = { f(λ) (dλ/dλ̄)² + d²λ/dλ̄² } t    (3.79)

So by suitably choosing λ̄(λ) you can make the term proportional to t vanish so that:
(t̄ · ∇)t̄ = 0. (3.80)
A parameterization of this type is called an affine parameterization of the curve. Affine
parameterizations are invariant under affine transformations of the form
λ → λ̄ = Aλ + B.    (3.81)

Exercise: Show that to make the term proportional to t vanish you need to solve the ODE

d²λ/dλ̄² = −f(λ) (dλ/dλ̄)²    (3.82)
What are appropriate boundary conditions? What can you say about the existence and
uniqueness of solutions to this ODE? ♦

Warning: The use of the word “affine” in “affine connexion” is subtly different from its use in “affine parameterization”; in the connexion context it arises because the transformation laws from one chart to another are qualitatively of the form

Γ → Γ̄ = J Γ J^{−1} + ∂J    (3.83)

where J is a transformation matrix; it's affine in the sense that it's linear with an inhomogeneous term. ♦

3.5 Torsion

Definition 44 Let Γa bc be an affine connexion, then

T^a{}_{bc} ≡ (1/2) [Γ^a{}_{bc} − Γ^a{}_{cb}] = Γ^a{}_{[bc]}    (3.84)

is a tensor of type T^1_2 called the torsion of the connexion.

Notation: Because it shows up relatively often it is useful to define the quantity

Ta ≡ T m ma (3.85)

Note that the trace is on the first two indices. This quantity does not seem to have a
special name. ♦

Warning: There is no real standardization as to whether or not to include the 1/2 in


the definition of torsion. You will simply have to check the definitions in whatever book
or article you are reading. (Similarly, the sign of the torsion can depend on convention.) ♦

Lemma 4 Ta is a covector of type T10 .

Exercise: How would you prove this? ♦

Comment: In standard GR the torsion is always zero. There are extensions of GR


(not tremendously popular except for the string-inspired extensions to GR) in which the
torsion need not be zero.

Torsion is also used to study the continuum approximation of line defects in solid state
crystals. ♦

Warning: (Reinforcement by reiteration) Some authors define torsion slightly differ-


ently, without the factor of 1/2. ♦

Lemma 5 Let Γa bc be an affine connexion, then

Γ^a{}_{(bc)} ≡ (1/2) [Γ^a{}_{bc} + Γ^a{}_{cb}]    (3.86)
is also an affine connexion, called a symmetric connexion, and it has exactly the same
geodesics as the original connexion.

Exercise: How would you prove this? ♦

Comment: Symmetric connexions are very important; the entire next chapter (thank-
fully a small chapter) will be devoted to them. ♦

Note that we can always write

Γa bc = Γa (bc) + T a bc (3.87)

which has the effect of decomposing an arbitrary affine connexion into a symmetric con-
nexion plus the torsion.
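At the level of components this decomposition is a single antisymmetrization; a quick NumPy sketch with randomly chosen connexion components (my own illustrative data):

```python
import numpy as np

rng = np.random.default_rng(2)
Gamma = rng.standard_normal((3, 3, 3))   # Gamma[a, b, c] = Gamma^a_bc at one point

T = 0.5 * (Gamma - Gamma.transpose(0, 2, 1))          # torsion Gamma^a_[bc]
Gamma_sym = 0.5 * (Gamma + Gamma.transpose(0, 2, 1))  # symmetric part Gamma^a_(bc)

# Arbitrary connexion = symmetric connexion + torsion, component by component.
assert np.allclose(Gamma, Gamma_sym + T)

# The torsion trace T_a = T^m_ma of (3.85) is a covector (one free index).
T_trace = np.einsum('mma->a', T)
assert T_trace.shape == (3,)
```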

Lemma 6 Let Γa bc be an affine connexion and let X a bc be any tensor of type T21 . Then

Γ̃a bc = Γa bc + X a bc ; Γ̃• •c = Γ• •c + X • •c (3.88)

is also an affine connexion.

More generally:

Lemma 7 Let [Γ1]^a{}_{bc} and [Γ2]^a{}_{bc} be two affine connexions. Then

(1/2) {[Γ1]^a{}_{bc} + [Γ2]^a{}_{bc}} ;    (1/2) {[Γ1]^•{}_{•c} + [Γ2]^•{}_{•c}} ;    (3.89)

is also an affine connexion, and

(1/2) {[Γ1]^a{}_{bc} − [Γ2]^a{}_{bc}}    (3.90)

is a tensor of type T^1_2.

Obvious generalizations of this process can be developed.



3.5.1 Geometrical interpretation 1

How should we try to interpret torsion geometrically? Torsion can be interpreted in terms
of “infinitesimal parallelograms” and the fact that they fail to close. (There will now be
a brief agony of index gymnastics.)

Specifically, pick some point x and a pair of tangent vectors t1 , t2 . Take the geodesic
passing through x in the direction t1 and, adopting an affine parameterization, move out
a “distance” λ1. This puts you at the point

x^a(0 → 1) = x^a + t^a_1 λ1 + (1/2) Γ^a{}_{bc} t^b_1 t^c_1 λ1² + O[λ1³].    (3.91)

Meanwhile, parallel transport the vector t2 along this geodesic. The components of t2 at x^a + t^a_1 λ1 will be

t^a_2(x^b(0 → 1)) = t^a_2 + Γ^a{}_{bc} t^b_2 [t^c_1 λ1] + O[λ1²].    (3.92)
Now move out a “distance” λ2 along the geodesic specified by this vector; that will now place you at the point

[ x^a + t^a_1 λ1 + (1/2) Γ^a{}_{bc} t^b_1 t^c_1 λ1² + O[λ1³] ]
+ [ t^a_2 + Γ^a{}_{bc} t^b_2 [t^c_1 λ1] + O[λ1²] ] λ2 + (1/2) Γ^a{}_{bc} t^b_2 t^c_2 λ2² + O[λ2³]    (3.93)
That is, at the point

x^a(0 → 1 → 2) = x^a + t^a_1 λ1 + t^a_2 λ2 + (1/2) Γ^a{}_{bc} t^b_1 t^c_1 λ1² + (1/2) Γ^a{}_{bc} t^b_2 t^c_2 λ2² + Γ^a{}_{bc} t^b_2 t^c_1 λ1 λ2 + O[λ³]    (3.94)
Now go back to the original point x and repeat in the opposite order, first travelling out
along t2 a “distance” λ2 and then along the parallel transported t1 a “distance” λ1 . You
will now end up at the points

x^a(0 → 2) = x^a + t^a_2 λ2 + (1/2) Γ^a{}_{bc} t^b_2 t^c_2 λ2² + O[λ2³].    (3.95)

and

x^a(0 → 2 → 1) = x^a + t^a_1 λ1 + t^a_2 λ2 + (1/2) Γ^a{}_{bc} t^b_1 t^c_1 λ1² + (1/2) Γ^a{}_{bc} t^b_2 t^c_2 λ2² + Γ^a{}_{bc} t^b_1 t^c_2 λ1 λ2 + O[λ³]    (3.96)
If we were in flat space the “parallelogram” would certainly close and we would have
xa (0 → 1 → 2) = xa (0 → 2 → 1). In the general situation however we see that the
parallelogram does not close and that

xa (0 → 1 → 2) − xa (0 → 2 → 1) = [Γa bc − Γa cb ] tb2 tc1 λ1 λ2 + O[λ3 ]. (3.97)



That is:
xa (0 → 1 → 2) − xa (0 → 2 → 1) = 2 T a bc tb2 tc1 λ1 λ2 + O[λ3 ]. (3.98)
In other words: torsion has to do with the fact that travelling along “parallel” geodesics is
not a commutative process, and that in a manifold with torsion there are no “infinitesimal
parallelograms”; at least no infinitesimal parallelograms defined via geodesic motion and
parallel transport.

3.5.2 Geometrical interpretation 2

Now wait a minute? We have already seen that given an arbitrary affine connexion Γ a bc
the related symmetric connexion Γa (bc) has the same geodesics, and by definition, no
torsion.

So using Γa (bc) instead of the original Γa bc , we will be able to construct “infinitesimal


parallelograms”, and the sides of these “infinitesimal parallelograms” will still be geodesics
in terms of the original Γa bc . What gives?

Well, although the sides of the “infinitesimal parallelograms” defined by Γa (bc) are
still geodesic when viewed in terms of Γa bc , the sides of the “infinitesimal parallelograms”
defined by Γa (bc) fail to be parallel when viewed in terms of Γa bc .
Exercise: Check this. Show that the lack of parallelism is proportional to the torsion. ♦
In other words, there is more to parallelism than geodesics. Even if you know what all the
geodesics are in a given manifold, this does not mean you know how to define “parallel”.

Example: To really drive this home, consider the extreme case in which Γa (bc) is trivial,
meaning that there is a coordinate system in which the symmetric connexion is zero
throughout the manifold. In that case the geodesics are simply the geodesics of Euclidean
space IRn — ordinary straight lines. But parallelism is still complicated: the differential
equation for parallel transport [in this situation] simplifies to

dV a (λ)/dλ + T a bc V b (λ) dxc /dλ = 0. (3.99)
Let’s make the further brutal simplification that the torsion is position independent.
Because the geodesics are straight lines we can wlog [without loss of generality] parameterize them by ordinary distance, in which case tc = dxc /ds is a constant vector. Then
the equation of parallel transport reduces to

dV a (s)/ds + T a bc tc V b (s) = 0. (3.100)
with the explicit solution

V a (s) = [exp{−s T • •c tc }]a b V b (0). (3.101)
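Comment: The closed-form solution (3.101) can be checked against direct numerical integration of (3.100). The torsion components and tangent vector below are illustrative assumptions, and the matrix exponential is a simple truncated power series (adequate for small matrices).

```python
import numpy as np

# Illustrative constant torsion T^a_{bc} (antisymmetric in bc) and constant tangent t.
T = np.zeros((2, 2, 2))
T[0, 0, 1], T[0, 1, 0] = 0.7, -0.7
T[1, 0, 1], T[1, 1, 0] = -0.3, 0.3
t = np.array([1.0, 2.0])
M = np.einsum('abc,c->ab', T, t)          # M^a_b = T^a_{bc} t^c

def expm(A, terms=40):                    # truncated exponential series
    out, P = np.eye(len(A)), np.eye(len(A))
    for k in range(1, terms):
        P = P @ A / k
        out = out + P
    return out

V0, s, steps = np.array([1.0, -1.0]), 0.5, 1000
V, h = V0.copy(), s/steps
for _ in range(steps):                    # RK4 for dV/ds = -M V, eq (3.100)
    k1 = -M @ V; k2 = -M @ (V + h/2*k1); k3 = -M @ (V + h/2*k2); k4 = -M @ (V + h*k3)
    V = V + h/6*(k1 + 2*k2 + 2*k3 + k4)

V_closed = expm(-s*M) @ V0                # the claimed solution (3.101)
```

The integrated and closed-form vectors agree to numerical precision. ♦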



Though this example is very special [Γa (bc) trivial; T a bc constant] there is a general message
that can be inferred from the above:

Torsion arises from the part of the parallel transport operation that is independent
from the geodesics.

3.5.3 Commutator acting on a scalar

Suppose we take a scalar function φ(x) and calculate [∇a , ∇b ]φ. Now

∇a ∇b φ = ∇a (∂b φ) = ∂a ∂b φ − Γm ba ∂m φ. (3.102)

Therefore, anti-symmetrizing

[∇a , ∇b ]φ = 2 T m ab ∇m φ. (3.103)

Thus we see that the torsion is related to the failure of covariant derivatives, acting on a
scalar, to commute. The generic failure of the covariant derivatives, acting on a vector,
to commute leads naturally to the concept of curvature.

Warning: Note that the commutator [a, b] = ab−ba, as opposed to the anti-symmetrization
process A[ab] = (Aab − Aba )/2, is defined without any factor of 1/2. This usage is unfor-
tunately standard and I’m not about to try to change it. ♦

Exercise: Show that for any covector

∇[a Vb] = ∂[a Vb] + T m ab Vm . (3.104)
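Comment: The scalar commutator identity (3.103) is easy to verify with a computer algebra system; the connexion components below are illustrative polynomial assumptions.

```python
import sympy as sp

x, y = sp.symbols('x y')
co = (x, y)
d = lambda f, i: sp.diff(f, co[i])

# illustrative asymmetric connexion components Gamma^a_{bc} (an assumption)
G = [[[sp.Integer(0)]*2 for _ in range(2)] for _ in range(2)]
G[0][0][1] = x*y
G[0][1][0] = x - y**2           # differs from G[0][0][1] => nonzero torsion
G[1][0][0] = y
G[1][1][1] = x**2

phi = sp.sin(x)*y + x**3

def nab2(a, b):                 # nabla_a nabla_b phi = d_a d_b phi - Gamma^m_{ba} d_m phi
    return d(d(phi, b), a) - sum(G[m][b][a]*d(phi, m) for m in range(2))

# 2 T^m_{ab} = Gamma^m_{ab} - Gamma^m_{ba}
ok = all(sp.simplify(nab2(a, b) - nab2(b, a)
                     - sum((G[m][a][b] - G[m][b][a])*d(phi, m) for m in range(2))) == 0
         for a in range(2) for b in range(2))
```
♦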

3.6 Riemann curvature

There are two natural ways of getting from the affine connexion to the notion of curvature:
Either by using the commutativity properties of the covariant derivative or by looking at
parallel transport around a small curve.

One of the nice features of partial derivatives is that they commute. Unfortunately
partial derivatives are not tensorial. We have fixed the tensor properties by introducing
the connexion, but this will have an effect on the commutativity properties:

∇a ∇b X c = ∇a (∂b X c + Γc db X d ) (3.105)
= ∂a (∂b X c + Γc db X d ) + Γc ma (∂b X m + Γm db X d ) − Γn ba (∂n X c + Γc dn X d ) (3.106)
= ∂a ∂b X c + (∂a Γc db )X d + Γc db (∂a X d ) + Γc ma ∂b X m − Γn ba ∂n X c + (Γc ma Γm db − Γn ba Γc dn )X d (3.107)

Now interchange a ↔ b and subtract

[∇a , ∇b ]X c = (∂a Γc db − ∂b Γc da + Γc ma Γm db − Γn ba Γc dn − Γc mb Γm da + Γn ab Γc dn )X d
+Γc db ∂a X d + Γc ma ∂b X m − Γn ba ∂n X c − Γc da ∂b X d − Γc mb ∂a X m + Γn ab ∂n X c
(3.108)

Regroup:

[∇a , ∇b ]X c = (∂a Γc db − ∂b Γc da + Γc ma Γm db − Γc mb Γm da )X d
+2T n ab ∇n X c
+Γc db ∂a X d + Γc ma ∂b X m − Γc da ∂b X d − Γc mb ∂a X m (3.109)

Stare at that last line for a second — it’s zero. (Check out the free indices and the dummy
indices.) Therefore

[∇a , ∇b ]X c = (∂a Γc db − ∂b Γc da + Γc ma Γm db − Γc mb Γm da )X d
+2T n ab ∇n X c . (3.110)

Now the LHS is by construction a tensor. Similarly on the RHS the torsion T and covariant
derivative ∇X are by construction tensors. Therefore the combination in brackets must
be a tensor — in fact it is called the Riemann tensor and we define:

Rc dba ≡ − (∂a Γc db − ∂b Γc da + Γc ma Γm db − Γc mb Γm da ) (3.111)

Re-labelling indices

Ra bcd ≡ − (∂d Γa bc − ∂c Γa bd + Γa md Γm bc − Γa mc Γm bd ) (3.112)

The index placement is important!

Note: This agrees with the sign chosen in MTW, D’Inverno, and Wald. ♦
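Comment: Definitions (3.110)–(3.112) can be checked end-to-end symbolically. The sketch below (illustrative two-dimensional connexion and vector field, both assumptions) verifies [∇a , ∇b ]X c = −Rc dba X d + 2T n ab ∇n X c with the index placement of (3.112).

```python
import sympy as sp

x, y = sp.symbols('x y'); n = 2
co = (x, y)
d = lambda f, i: sp.diff(f, co[i])

# illustrative asymmetric connexion and vector field (assumptions)
G = [[[sp.Integer(0)]*n for _ in range(n)] for _ in range(n)]
G[0][0][1] = x*y; G[0][1][0] = y**2; G[1][0][0] = x; G[1][1][1] = x + y
Xv = [x**2*y, sp.sin(x*y)]

def nabX(b, c):                 # nabla_b X^c = d_b X^c + Gamma^c_{db} X^d
    return d(Xv[c], b) + sum(G[c][m][b]*Xv[m] for m in range(n))

def nab2(a, b, c):              # nabla_a of the (1,1) tensor (nabla X)^c_b
    return ( d(nabX(b, c), a)
             + sum(G[c][m][a]*nabX(b, m) for m in range(n))
             - sum(G[m][b][a]*nabX(m, c) for m in range(n)) )

def Riem(a, b, c, dd):          # eq (3.112)
    return -( d(G[a][b][c], dd) - d(G[a][b][dd], c)
              + sum(G[a][m][dd]*G[m][b][c] - G[a][m][c]*G[m][b][dd] for m in range(n)) )

ok = all(sp.simplify(
            nab2(a, b, c) - nab2(b, a, c)
            + sum(Riem(c, dd, b, a)*Xv[dd] for dd in range(n))
            - sum((G[m][a][b] - G[m][b][a])*nabX(m, c) for m in range(n))) == 0
         for a in range(n) for b in range(n) for c in range(n))
```
♦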

Note that technically there is something to prove here. Lawden calls it the “quotient
theorem” and it states:

Theorem 8 (Quotient theorem) If we have objects A, B, and C, and we know that


B and C are tensors, and if for some contraction of indices
A B = C, (3.113)
then, provided B is an arbitrary tensor, we can conclude that A is also a tensor.

Exercise: The proof is best developed by looking at a few simple examples. Try it. ♦

Warning: Various authors differ in their index placement, and sign conventions, but
once you pick a convention stick to it. The convention here is relatively standard. To
compare with other conventions look at the flyleaf of Misner–Thorne–Wheeler (MTW). ♦

Warning: Schouten “Ricci calculus” does something particularly weird with his index
placement; be warned. ♦

Warning: Maple also does something particularly weird; details later. ♦

Note: The way we have placed the indices on the Riemann tensor, the first two can
be thought of as carrying internal matrix structure, while the last two carry information
about the directions you are differentiating in. If we introduce a bullet • to stand for a
generic index we can schematically write
R• •cd ≡ −2 Γ• •[c,d] + 2 Γ• •[c Γ• •d] (3.114)
where the anti-symmetrization is only on [cd], the directions you are differentiating along,
and the bullets • denote general matrix indices with implied matrix multiplication.
There are of course several good reasons for this notation. It is helpful in a purely mathe-
matical sense and for physicists it does serve to make manifest the deep analogies between
a general affine connexion and the non-Abelian Yang–Mills gauge field, and between Rie-
mann curvature and Yang-Mills field strength. (This comment need not make sense now,
see the lecture on Yang-Mills at the end of the course.) ♦

Note the remarkable feature of this result: the affine connexion is not a tensor, but this
particular combination of derivatives and quadratic terms is a tensor (type T31 in fact).

From the Riemann tensor you can easily construct two additional tensors:
Rab ≡ Rc acb = ∂c Γc ab − ∂b Γc ac + Γc dc Γd ab − Γc db Γd ac (3.115)
is called the Ricci tensor, while
Sab ≡ Rc cab = ∂a Γc cb − ∂b Γc ca = ∂a Γb − ∂b Γa (3.116)

does not seem to have a special name. (I’ll just call it the S-tensor.) In general both R(ab)
and R[ab] are nonzero, though R[ab] vanishes in GR. Furthermore the S-tensor Sab = 0 in
GR; which is why you don’t see much discussion of it.
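Comment: That the ΓΓ terms cancel in the trace Rc cab, leaving only the curl of Γa ≡ Γc ca, can be verified symbolically (the connexion components are illustrative assumptions):

```python
import sympy as sp

x, y = sp.symbols('x y'); n = 2
co = (x, y)
d = lambda f, i: sp.diff(f, co[i])

# illustrative asymmetric connexion (an assumption)
G = [[[sp.Integer(0)]*n for _ in range(n)] for _ in range(n)]
G[0][0][1] = x*y; G[0][1][0] = y**2; G[1][0][0] = x; G[1][1][1] = x + y

def Riem(a, b, c, dd):          # eq (3.112)
    return -( d(G[a][b][c], dd) - d(G[a][b][dd], c)
              + sum(G[a][m][dd]*G[m][b][c] - G[a][m][c]*G[m][b][dd] for m in range(n)) )

S = [[sum(Riem(c, c, a, b) for c in range(n)) for b in range(n)] for a in range(n)]
Gtr = [sum(G[c][c][a] for c in range(n)) for a in range(n)]   # Gamma_a = Gamma^c_{ca}
ok = all(sp.simplify(S[a][b] - (d(Gtr[b], a) - d(Gtr[a], b))) == 0
         for a in range(n) for b in range(n))
```
♦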

Note again the remarkable feature of this result: the affine connexion is not a tensor,
but these particular combinations of derivatives and quadratic terms are tensors (type T20
in fact).

It is often preferable to re-organize the expression for the Ricci tensor by using
Γc ab = Γc ba + 2T c ab = Γc ba − 2T c ba (3.117)
and the more specialized
Γc ac = Γc ca + 2T c ac = Γc ca − 2T c ca = Γa − 2Ta (3.118)
to write
Rab = ∂c Γc ab − ∂b Γa + Γd Γd ab − Γc db Γd ca + 2Ta;b + 2Γc db T d ca (3.119)

We still have not used the metric tensor — the Riemann tensor exists independent of
whether or not there is a metric tensor present; as long as there’s an affine connexion
that’s enough.
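Comment: The re-organized expression (3.119) should agree identically with the defining trace (3.115); here is a symbolic check (the connexion components are illustrative assumptions):

```python
import sympy as sp

x, y = sp.symbols('x y'); n = 2
co = (x, y)
d = lambda f, i: sp.diff(f, co[i])

# illustrative asymmetric connexion (an assumption)
G = [[[sp.Integer(0)]*n for _ in range(n)] for _ in range(n)]
G[0][0][1] = x*y; G[0][1][0] = y**2; G[1][0][0] = x; G[1][1][1] = x + y; G[1][0][1] = y

Tor = lambda a, b, c: (G[a][b][c] - G[a][c][b])/2              # T^a_{bc}
Gtr = [sum(G[c][c][a] for c in range(n)) for a in range(n)]    # Gamma_a = Gamma^c_{ca}
Ttr = [sum(Tor(c, c, a) for c in range(n)) for a in range(n)]  # T_a = T^c_{ca}

def ricci(a, b):        # eq (3.115): R_ab = R^c_{acb}
    r = sum(d(G[c][a][b], c) - d(G[c][a][c], b) for c in range(n))
    r += sum(G[c][dd][c]*G[dd][a][b] - G[c][dd][b]*G[dd][a][c]
             for c in range(n) for dd in range(n))
    return r

def ricci2(a, b):       # eq (3.119), with T_{a;b} = d_b T_a - Gamma^m_{ab} T_m
    covT = d(Ttr[a], b) - sum(G[m][a][b]*Ttr[m] for m in range(n))
    r = sum(d(G[c][a][b], c) for c in range(n)) - d(Gtr[a], b)
    r += sum(Gtr[dd]*G[dd][a][b] for dd in range(n))
    r -= sum(G[c][dd][b]*G[dd][c][a] for c in range(n) for dd in range(n))
    r += 2*covT + 2*sum(G[c][dd][b]*Tor(dd, c, a) for c in range(n) for dd in range(n))
    return r

ok = all(sp.simplify(ricci(a, b) - ricci2(a, b)) == 0 for a in range(n) for b in range(n))
```
♦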

3.7 General commutator identities


[Ricci identities]

We have already seen that on scalars


[∇a , ∇b ]φ = 2T m ab ∇m φ (3.120)
while on vectors
[∇a , ∇b ]X c = −Rc dba X d + 2T n ab ∇n X c . (3.121)
The same calculation applied to covectors yields
[∇a , ∇b ]Xc = +Rd cba Xd + 2T n ab ∇n Xc . (3.122)
and slightly more generally
[∇a , ∇b ]Xcd = +Xmd Rm cba + Xcm Rm dba + 2∇m Xcd T m ab (3.123)
In general, a formula qualitatively similar to that for the general covariant derivative holds
[∇a , ∇b ]Xc1 ...cr d1 ...ds = − Σα=1..r Xc1 ...cα−1 m cα+1 ...cr d1 ...ds Rcα mba
+ Σβ=1..s Xc1 ...cr d1 ...dβ−1 m dβ+1 ...ds Rm dβ ba
+ 2∇m Xc1 ...cr d1 ...ds T m ab (3.124)

Note the general pattern:

[∇a , ∇b ]X → Riemann X + Torsion ∇X (3.125)

These equations are often referred to as the “Ricci identities”.

Exercise: Try to do these explicitly, see if you can get the indices in the right place.
Check if I have the indices in the right place. ♦

Comment: One place to find this stuff, modulo some notation changes, is on page 7 of
Eisenhart. ♦

3.8 Basic curvature identities

3.8.1 Anti-symmetries

Note that the Riemann tensor is antisymmetric on its last pair of indices

Ra b(cd) = 0 (3.126)

For a general affine connexion this is in fact the only symmetry.

Schouten calls this the “first identity”.

Exercise: Show that for a generic asymmetric connexion the Riemann tensor has
n^3 (n − 1)/2 algebraically independent components. ♦

If we anti-symmetrize on the three covariant indices

Ra [bcd] = −2 Γa [bc,d] − 2 Γa m[d Γm bc] (3.127)


= −2 T a [bc,d] − 2 Γa m[d T m bc] (3.128)
= −2 T a [bc;d] − 2 A[bcd] (T a mc Γm bd + T a bm Γm cd ) (3.129)
= −2 T a [bc;d] + 4 T a m[d T m bc] (3.130)

(This is Schouten equation (5.2) p 144.) For typographic convenience we have introduced
the anti-symmetrization operator, A[abc] , which in obvious notation anti-symmetrizes the
indicated indices.

Schouten calls this the “second identity”. The authors of the “Encyclopedic Dictionary
of Mathematics” [EDM] refer to it as the “first Bianchi identity”, see EDM §80.J page
304, with a terrifyingly obscure notation.

Exercise: Check all signs and coefficients. ♦

3.8.2 Antisymmetric part of the Ricci tensor

If we expand out the anti-symmetrization in Schouten’s “second identity”

Ra bcd + Ra cdb + Ra dbc = −2 (T a bc;d + T a cd;b + T a db;c )


+4 (T a md T m bc + T a mb T m cd + T a mc T m db ) (3.131)

and now contract on ab

Scd − Rcd + Rdc = −2 (T a ac;d + T a cd;a + T a da;c )


+4 (T a md T m ac + T a ma T m cd + T a mc T m da ) (3.132)

then

Scd − 2R[cd] = −4T a a[c;d] − 2T a cd;a + 4T a ma T m cd − 8T n m[c T m |n|d] (3.133)

That is:
1
R[cd] = Scd + 2T a a[c;d] + T a cd;a − 2T a ma T m cd + 4T n m[c T m |n|d] (3.134)
2
Rearranging:
R[cd] = (1/2) Scd + T a cd;a + 2T a a[c;d] + 2T a am T m cd + 4T n m[c T m |n|d] (3.135)
Notation: The vertical bar in the previous equation indicates that the relevant index
is not to be included in the anti-symmetrization process. ♦
Now focus on the last term, noting that

T n mc T m nd (3.136)

is symmetric in (cd). That is


T n m[c T m |n|d] = 0 (3.137)
Thus
R[cd] = (1/2) Scd + T a cd;a + 2T a a[c;d] + 2T a am T m cd (3.138)
This is [equivalent to] Schouten’s equation (5.6) on page 144.

From Schouten’s point of view this is merely an application of what he calls the
“second identity”. Maybe we should call it the “contracted second identity”.

Alternatively, we could proceed by brute force. (Which is a useful check on this formula.)
If we look at the antisymmetric part of the Ricci tensor
R[ab] = (1/2) Sab + 2T[a;b] + T m ab,m + Γd T d ab − 2Γc d[b T d a]c (3.139)
Convert the partial derivatives to covariant derivatives using

T m ab;m = T m ab,m + Γm nm T n ab − Γn am T m nb − Γn bm T m an (3.140)

The second term rearranges to yield

T m ab;m = T m ab,m + Γn T n ab − 2Tm T m ab − 2Γn [a|m| T m |n|b] . (3.141)

And the last term now converts to

T m ab;m = T m ab,m + Γn T n ab − 2Tm T m ab − 2Γn m[a T m |n|b] + 4T n m[a T m |n|b] . (3.142)

This last term vanishes by symmetry, so

T m ab;m = T m ab,m + Γn T n ab − 2Tm T m ab − 2Γn m[a T m |n|b] . (3.143)

Collecting terms
R[ab] = (1/2) Sab + 2T[a;b] + T m ab;m + 2Tm T m ab (3.144)
This agrees with the previous derivation above, and with Schouten.

Note the relative paucity of symmetries for the Riemann tensor generated by the general
affine connexion. Once you add symmetry and metricity things improve tremendously.

Exercise: Check all signs and coefficients. ♦

3.8.3 Nonmetricity

Let gab be some arbitrary symmetric type T20 tensor, that is nondegenerate in the sense
that there is locally some coordinate system such that det(gab ) ≠ 0. Then we can define
a type T02 tensor by
g ab = [g −1 ]ab (3.145)
and use these objects gab and g ab to raise and lower indices. Though this looks rather
similar to what one might be expecting to see — the long-awaited metric tensor — it
is not. What we are emphatically not demanding at this stage is ∇c gab = 0, indeed we
define
∇c gab = qabc ≠ 0 (3.146)

to be the “non-metricity” tensor.

Then in terms of the non-metricity and the torsion

[∇a , ∇b ]gcd = ∇a qcdb − ∇b qcda = Rm cab gmd + Rm dab gcm − T m ab ∇m gcd (3.147)

= 2R(cd)ab − T m ab qcdm (3.148)


Rearranging
R(cd)ab = qcd[a;b] + (1/2) qcdm T m ab . (3.149)
Schouten calls this the “third identity”.

Exercise: Check all signs and coefficients. ♦

Warning: Remember that gab is at this stage merely some arbitrary symmetric nonsin-
gular T20 tensor. ♦

Exercise: Use the auxiliary tensor gab and its inverse g ab to show

g ab Rabcd = Scd ; and hence g ab R(ab)cd = Scd (3.150)

Using this, and the third identity, show that

Sab = g cd [ qcd[a;b] + (1/2) qcdm T m ab ] (3.151)

Note that the LHS of this expression is explicitly independent of gab (since gab never shows
up in the definition of the S-tensor). Thus the RHS must be independent of the specific
auxiliary tensor gab that we chose to evaluate it. ♦

Exercise: Use the auxiliary tensor gab and its inverse g ab to define

S̄ab = g mn Rambn = g mn gae Re mbn (3.152)

Thus the S̄ tensor is defined in terms of a “trace” on the second and fourth indices of the
Riemann tensor. Using this, and the third identity, show that

S̄ab = Rab − 2 g mn [ qma[n;b] + qmak T k nb ] (3.153)

Can you simplify this any further? ♦

Exercise: What are the integrability conditions for making the nonmetricity vanish? ♦

3.9 Weitzenbock identities


[generalized Bianchi identities]

The Riemann tensor satisfies some important constraints in the form of differential iden-
tities. It is relatively rare to see these identities discussed for a general asymmetric
connexion.

Theorem 9 (Weitzenbock)

Ra b[cd;e] = 2Ra bm[e T m cd] (3.154)

Schouten gives some comments on the history: in the case of a symmetric metric connex-
ion early results were due to Voss, Ricci, Padova, and Bianchi. Schouten appears to be
the first for a general symmetric connexion, and Weitzenbock the general affine connex-
ion. Nevertheless it now seems common to call them the Bianchi identities — perhaps
Weitzenbock identities would be more appropriate. For details see Schouten, “Ricci Cal-
culus”, p 146. (EDM calls this the “second Bianchi identity”, see §80.J page 304, with a
terrifyingly obscure notation.)

Proof: The definition of the Riemann tensor can be rewritten as


Γ• •[c,d] = Γ• •[c Γ• •d] − (1/2) R• •cd (3.155)
(Whenever we see a •, think “matrix multiplication”.) Now take the partial derivative
with respect to xe and anti-symmetrize on [cde]
Γ• •[c,de] = Γ• •[c,e Γ• •d] + Γ• •[c Γ• •d,e] − (1/2) R• •[cd,e] (3.156)
Now the LHS is zero because it is both symmetric and antisymmetric in de. The first two
terms on the RHS can be simplified using the definition of the Riemann tensor to give
0 = Γ• •[c Γ• •e Γ• •d] − (1/2) R• •[ce Γ• •d] (3.157)
+ Γ• •[c Γ• •d Γ• •e] − (1/2) Γ• •[c R• •de] − (1/2) R• •[cd,e] (3.158)
The two trilinear terms in Γ cancel whence

R• •[cd,e] + R• •[ce Γ• •d] + Γ• •[c R• •de] = 0. (3.159)

This formula involves only partial derivatives. If we replace them by covariant derivatives
we see [before anti-symmetrization]
R• •cd;e = R• •cd,e + Γ• •e R• •cd − R• •cd Γ• •e − R• •c′d Γc′ ce − R• •cd′ Γd′ de (3.160)

Now anti-symmetrize
R• •[cd;e] = R• •[cd,e] + Γ• •[e R• •cd] − R• •[cd Γ• •e] − R• •c′[d T c′ ce] − R• •[c|d′| T d′ de] (3.161)

Rearrange a few indices

R• •[cd;e] = R• •[cd,e] + Γ• •[c R• •de] + R• •[ce Γ• •d] − 2R• •m[d T m ce] (3.162)

But we have just seen that the first three terms on the RHS add to zero, thus

R• •[cd;e] = +2R• •m[e T m cd] (3.163)

Explicitly reinstating the •-ed indices

Ra b[cd;e] = +2Ra bm[e T m cd] (3.164)

as required. QED

Exercise: Check all signs and coefficients. ♦
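Comment: A brute-force symbolic check of the Weitzenbock identity, for a sparse illustrative connexion in three dimensions (in two dimensions both sides vanish identically, since the antisymmetrization runs over three indices):

```python
import sympy as sp

xs = sp.symbols('x0:3'); n = 3
d = lambda f, i: sp.diff(f, xs[i])

# sparse illustrative asymmetric connexion (an assumption)
G = [[[sp.Integer(0)]*n for _ in range(n)] for _ in range(n)]
G[0][1][2] = xs[0]*xs[1]
G[0][2][1] = xs[1]              # asymmetric partner => nonzero torsion
G[1][2][0] = xs[2]**2
G[2][0][1] = xs[0] + xs[2]

Tor = lambda a, b, c: (G[a][b][c] - G[a][c][b])/2

def Riem(a, b, c, dd):          # eq (3.112)
    return -( d(G[a][b][c], dd) - d(G[a][b][dd], c)
              + sum(G[a][m][dd]*G[m][b][c] - G[a][m][c]*G[m][b][dd] for m in range(n)) )

R = {(a, b, c, dd): sp.expand(Riem(a, b, c, dd))
     for a in range(n) for b in range(n) for c in range(n) for dd in range(n)}

def covR(a, b, c, dd, e):       # R^a_{bcd;e}
    return ( d(R[a, b, c, dd], e)
             + sum(G[a][m][e]*R[m, b, c, dd] for m in range(n))
             - sum(G[m][b][e]*R[a, m, c, dd] for m in range(n))
             - sum(G[m][c][e]*R[a, b, m, dd] for m in range(n))
             - sum(G[m][dd][e]*R[a, b, c, m] for m in range(n)) )

perms = [((0, 1, 2), 1), ((1, 2, 0), 1), ((2, 0, 1), 1),
         ((0, 2, 1), -1), ((2, 1, 0), -1), ((1, 0, 2), -1)]
asym = lambda f: sum(sg*f(*p) for p, sg in perms)/6   # antisymmetrize over (c,d,e)

ok = all(sp.simplify(
            asym(lambda c, dd, e: covR(a, b, c, dd, e))
            - 2*asym(lambda c, dd, e: sum(R[a, b, m, e]*Tor(m, c, dd) for m in range(n))))
         == 0
         for a in range(n) for b in range(n))
```

Checking the totally antisymmetric (0,1,2) component suffices, since in three dimensions every other component is proportional to it. ♦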

Note: The proof for the generic affine connexion presented above is significantly more
complicated than what is needed in the case of the symmetric connexion. (Where the
entire RHS vanishes for a start.) When dealing with the symmetric connexion judicious
use of Riemann normal coordinates, described later, greatly simplifies the proof. ♦

Since we have this identity we can also deduce related identities by contracting on the
up-down indices:
Ra a[cd;e] = 2Ra am[e T m cd] (3.165)
That is
S[cd;e] = 2Sm[e T m cd] (3.166)
The more interesting contraction is:

(Ra bad;e + Ra bde;a + Ra bea;d ) = 2 (Ra bme T m ad + Ra bma T m de + Ra bmd T m ea ) (3.167)

That is

Rbd;e − Rbe;d + Ra bde;a = −2Rbm T m de + 2Rn bme T m nd − 2Rn bmd T m ne (3.168)

This is true for generic affine connexions, without needing any auxiliary arbitrary sym-
metric nonsingular T20 tensor gab .

Exercise: Check all signs and coefficients. ♦



Exercise: (Non-trivial) Let’s re-introduce the arbitrary symmetric nonsingular T20 tensor gab and its covariant derivative qabc . Use this metric to contract the contracted Bianchi
identities a second time. Define the “Ricci scalar”
R = g ab Rab (3.169)
and using the contracted Bianchi identity, and the S̄ tensor defined above, derive
[ R δd c − Rd c − S̄ d c ];d = “an expression involving Riemann and Torsion tensors” (3.170)


and evaluate the expression on the RHS. ♦

Exercise: (Non-trivial) Now define the “Einstein tensor” as


Gab = Rab − (1/2) gab R (3.171)
(note that it explicitly depends on the auxiliary T20 tensor gab ). Show that for a general
affine connexion
Gab ;b = “an expression involving the Riemann, Torsion, and nonmetricity tensors” (3.172)
and evaluate the expression on the RHS. ♦

3.10 Integrability [geometric route]

Warning: If you look at the details of his proofs, D’Inverno assumes all his manifolds
are topologically trivial. I will work around this restriction at the cost of some additional
technical machinery. ♦

3.10.1 Basic definitions and results

Definition 45 Integrable connexion: The connexion Γ is said to be “integrable” iff


in each contractible region of the manifold parallel propagation is path independent.

Note: “Contractible region” means that any closed loop in this region is contractible;
it can be shrunk to a point without the loop ever leaving the region of interest. Formal
topologists will say that the [first] homotopy is trivial. ♦
Note: Typically in the older literature, you might also see the word “teleparallel” [tele-
parallel; long-distance parallelism] or the phrase “absolute parallelism” used as a synonym
for integrable. Sometimes you will even see “Fernparallelismus”. ♦

Lemma 8 The connexion Γ is “integrable” iff parallel propagation around any contractible
closed path is trivial. That is

Γ integrable ⇐⇒ transport(x→x;γ) = I (3.173)

Proof: Proving the lemma is straightforward. Suppose the connexion is integrable,


and we have some contractible closed path γ : x → x. Then parallel transport around
that closed path must by definition be identical to parallel transport around the trivial
closed path γ0 that never leaves x. But parallel transport around the trivial path is
manifestly trivial (the identity). Conversely, consider any two [homotopically equivalent]
paths connecting the same endpoints, γ1 : x → y and γ2 : x → y. Now reverse the sense
of one path γ̃1 : y → x to make a closed loop γ2 ◦ γ̃1 : x → x. Then because γ1 and γ2
are homotopically equivalent, γ2 ◦ γ̃1 is contractible. Consequently

transport(x→y;γ2 ) ◦ transport(y→x;γ̃1 ) = transport(x→x;γ2◦γ̃1 ) = I (3.174)

Now using
transport(y→x;γ̃1 ) = [transport(x→y;γ1 ) ]−1 (3.175)
so that the effect of reversing the path is to obtain the inverse parallel transport operator,
we have
transport(x→y;γ2 ) = transport(x→y;γ1 ) (3.176)
proving path independence [for homotopically equivalent paths]. Consequently paral-
lel transport is path independent on contractible regions, and so the connexion is inte-
grable. QED
Now we want to relate integrability to the Riemann tensor:

Theorem 10 If the connexion Γ is “integrable” then the Riemann tensor is everywhere


zero.

Proof: Suppose the connexion is integrable. Pick an arbitrary point p with coordinates
x and some vector X a at that point. Now extend this vector to a vector field defined on
a topologically trivial [in particular, contractible] open region surrounding p by parallel
propagation (which by assumption of integrability is supposed to define a unique vector
X a (y) at all y ≠ x).2 Then along any arbitrary curve in the topologically trivial region
of interest
transport[y → x; γ]X(y) = X(x) (3.177)
2
Aside: There is actually no need to extend X a (y) to the entire manifold. If this could be done,
then X a (y) would now be an everywhere nonzero vector field; but there are topologies, e.g. S 2 , for which
you know such things do not exist. Therefore there are topological manifolds for which you are forced to
work on topologically trivial regions. D’Inverno quietly ignores all such complications. ♦

which by definition of covariant derivative implies


(t · ∇)X = 0 (3.178)
for any tangent vector t. That is
∇a X b = 0 (3.179)
But then
0 = [∇a , ∇b ]X c = −Rc dba X d + 2T n ab ∇n X c = Rc dab X d . (3.180)
So for all X d
Rc dab X d = 0 (3.181)
which implies
Rc dab = 0. (3.182)
Of course this only means the Riemann tensor is zero on the topologically trivial region
we used to set up our vector fields X a (y). But we can always cover any manifold by an
atlas of topologically trivial regions, so this allows us to deduce that the Riemann tensor
vanishes throughout the manifold. QED
Note: This proof is somewhat more general than D’Inverno. There is no assumption
about the torsion vanishing. More significantly, we can now deal with nontrivial topol-
ogy. ♦
You can flesh out the details of the previous theorem a little by splitting it into two
lemmata

Lemma 9 If the connexion Γ is “integrable” then on topologically trivial regions of the


manifold there exist n = dim(M) linearly independent covariantly constant vector fields.

Lemma 10 If there exist n = dim(M) linearly independent covariantly constant vector


fields on some region of the manifold, then the Riemann tensor vanishes on that region
of the manifold.

Exercise: Prove these two lemmata; you can do this by adding a few technical details
to the preceding proof. ♦
Proving the converse of this theorem is technically more difficult.

3.10.2 Zero curvature implies integrability

Theorem 11 If the Riemann tensor is zero everywhere on the manifold then the connex-
ion is integrable.

Comment: There is no condition on the torsion. ♦


There are two main routes one can take:

• a “geometric” route that concentrates on a geometric picture of parallel transporting


vectors around closed loops.

• a “PDE” route that concentrates on properties of partial differential equations, and


conditions for the existence of solutions to certain PDEs.

The “geometric route” is traditional, and I will deal with that first. The “PDE route”
is actually technically easier, but requires a little extra machinery. It has the advantage
of providing you with techniques of somewhat wider applicability. I’ll deal with the PDE
route in the next section.
Proof: (Geometric route) We want to show that the result of parallel propagation
around any [topologically trivial] closed loop is the identity. It is sufficient to prove this
for infinitesimal loops, since any finite loop can be subdivided into many smaller loops.
(More precisely, you want an analytic estimate of the possible deviation from triviality as
a function of loop size; see below.)

Warning: The proof in the presence of torsion is somewhat messier than the case with-
out torsion. The cleanest “geometric” version I have found is on pp 23-24 of Eisenhart. ♦

(There will now be another brief agony of index gymnastics.)

Consider a small rectangle of points in a plane parameterized by (u, v). Let the points
in question be p(u, v), q(u + du, v), r(u + du, v + dv), and s(u, v + dv). Along each
line segment [not necessarily a geodesic] we parallel propagate the vector V a using the
equations
dV a /dλ + Γa bc V b dxc /dλ = 0. (3.183)
Then along each side of the “rectangle” we have
V a (q) = V a (p) + [dV a /du]p du + (1/2) [d2 V a /du2 ]p du2 + O(du3 ) (3.184)
V a (r) = V a (q) + [dV a /dv]q dv + (1/2) [d2 V a /dv 2 ]q dv 2 + O(dv 3 ) (3.185)
V a (s) = V a (r) − [dV a /du]r du + (1/2) [d2 V a /du2 ]r du2 + O(du3 ) (3.186)
V a return (p) = V a (s) − [dV a /dv]s dv + (1/2) [d2 V a /dv 2 ]s dv 2 + O(dv 3 ) (3.187)
Now add these equations
" # " #
a dV a dV a dV a dV a
Vreturn (p) = V a (p) + − du + − dv (3.188)
du p du r dv q dv s
Math 464: Differential Geometry 70
" # " #
1 d2 V a d2 V a 2 1 d2 V a d2 V a
+ + du + + dv 2 + O(du3 ) + O(dv 3 )
2 du2 p du2 r 2 dv 2 q dv 2 s
(3.189)
Now note that generically, for any quantity X(u, v)
X(r) = X(p) + [∂X/∂u]p du + [∂X/∂v]p dv + O(dλ2 ) (3.190)

where dλ now stands for either du or dv. Similarly


X(q) = X(p) + [∂X/∂u]p du + O(du2 ) (3.191)

X(s) = X(p) + [∂X/∂v]p dv + O(dv 2 ) (3.192)
So to the required accuracy
∆V a ≡ V a return (p) − V a (p) = − [ ∂/∂v (dV a /du) − ∂/∂u (dV a /dv) ]p du dv + O(dλ3 ) (3.193)

But the equation of parallel transport is


dV a /dλ = −Γa bc V b dxc /dλ. (3.194)
So
∂/∂v (dV a /du) = − (∂[Γa bc V b ]/∂v) dxc /du − Γa bc V b d2 xc /(du dv) (3.195)
Using the equation of parallel transport again
∂/∂v (dV a /du) = − [ ∂Γa bc /∂xd − Γa b′c Γb′ bd ] V b (dxc /du)(dxd /dv) − Γa bc V b d2 xc /(du dv) (3.196)
∂v du ∂xd du dv du dv
Now reinsert this into the computation of ∆V a and use the definition of the Riemann
tensor. We obtain
∆V a (p) = − [ Ra bcd V b (dxc /du)(dxd /dv) ]p du dv + O(dλ3 ) (3.197)
Therefore, if the Riemann tensor is zero, then parallel propagation around the small
“rectangle” is independent of path taken; to order O[(dλ)3 ]. This then bootstraps to
arbitrary paths; as long as they are topologically trivial.

To see this take a finite “square” closed loop of size L × L (rectangular in the particular
coordinate chart adopted) and subdivide it into N × N smaller squares. Then simply
from the definition of path composition
transport(L×L;γ) = ∏α=1..N×N transport(L/N ×L/N ;γα ) (3.198)

But from the analytic estimate above, for each one of the smaller squares we have
transport(L/N ×L/N ;γα ) = exp(O[(L/N )3 ]) (3.199)
That is:
transport(L×L;γ) = exp(N × N × O[(L/N )3 ]) = exp(O[1/N ]) → I. (3.200)
So far this works for “square” paths; arbitrarily shaped (but topologically trivial) paths
may now be handled by approximating them with finer and finer collections of square
paths — these are standard techniques usually developed in the theory of surface inte-
gration. QED

Exercise: Verify that if the Riemann tensor is nonzero, then our analytic estimate im-
plies that parallel propagation around “square” paths will generally lead to a nontrivial
result. ♦
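Comment: Here is the analytic estimate in action. For the illustrative single-component connexion Γ0 10 = x1 (an assumption chosen so that the only relevant curvature component is R0 101 = −1), transporting a vector around a small coordinate square reproduces the prediction (3.197):

```python
import numpy as np

h = 1e-3
p = np.array([0.1, 0.2])
V0 = np.array([0.3, 0.7])

def Gam(x):                      # illustrative connexion: Gamma^0_{10} = x^1, rest zero
    G = np.zeros((2, 2, 2))
    G[0, 1, 0] = x[1]
    return G

def edge(V, a, b, steps=100):    # parallel transport V along the straight edge a -> b
    t = b - a
    f = lambda lam, W: -np.einsum('abc,b,c->a', Gam(a + lam*t), W, t)
    lam, hh = 0.0, 1.0/steps
    for _ in range(steps):
        k1 = f(lam, V); k2 = f(lam + hh/2, V + hh/2*k1)
        k3 = f(lam + hh/2, V + hh/2*k2); k4 = f(lam + hh, V + hh*k3)
        V = V + hh/6*(k1 + 2*k2 + 2*k3 + k4); lam += hh
    return V

q, r, s = p + [h, 0], p + [h, h], p + [0, h]
V = edge(edge(edge(edge(V0, p, q), q, r), r, s), s, p)   # p -> q -> r -> s -> p
dV = V - V0
# eq (3.197): Delta V^a = -R^a_{bcd} V^b dx^c/du dx^d/dv du dv, with R^0_{101} = -1
predicted = np.array([V0[1]*h*h, 0.0])
```

For this particular connexion the transport integrals are exact, so the agreement is to machine precision. ♦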

Comment: The argument simplifies somewhat if the torsion is zero. See, for instance
D’Inverno. Note that even for zero torsion D’Inverno does not really complete the proof,
he stops at the stage
∆V = O(dλ3 ) (3.201)
Also note that he implicitly assumes everything is topologically trivial. ♦

Comment: We now have necessary and sufficient conditions. A connexion is integrable


iff the Riemann tensor is zero — independent of what happens to the torsion. ♦

Question: Is it possible to extend this sort of analysis to a non-Abelian Stokes theorem?


(This question will not even make sense to you at this stage. It will require searching
thru the technical literature.) ♦

3.10.3 Affine flat connexions

Definition 46 Affine flat: A connexion is “affine flat” if there exists an atlas in which
Γa bc = 0 in all charts.

Warning: The notion “affine flat” is most useful for symmetric connexions. For general
asymmetric affine connexions there are flat connexions (integrable, zero curvature) which
are not affine flat. ♦

Comment: Note that if an affine flat connexion exists then this places a topological
constraint on the tangent bundle
T (M) = M × IRn (3.202)

Lemma 11 If a manifold is affine flat then the torsion vanishes identically (the connexion
is symmetric).
[Trivial]

Lemma 12 If a manifold is affine flat then the connexion is integrable.


[Trivial]

Lemma 13 If a manifold is affine flat then the Riemann tensor vanishes identically.
[Trivial]

Lemma 14 If the connexion is symmetric (torsion zero) and integrable, then the connex-
ion is affine flat.
[Easy; I think]

Finally a nice analytic result:

Theorem 12
For a general affine connexion the Riemann tensor is zero iff there exists a coordinate
system such that
Γa bc = ∂c D a b (3.203)
where D a b is a diagonal matrix.

Proof: Sufficiency is easy; just plug it into the definition of Riemann. The ∂Γ terms
cancel because of the partial derivative and the Γ Γ terms cancel because it is diagonal in
the first 2 indices.
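Comment: The sufficiency computation is easily automated. With D a b diagonal (the entries below are illustrative assumptions), every component of the Riemann tensor cancels:

```python
import sympy as sp

xs = sp.symbols('x0 x1'); n = 2
d = lambda f, i: sp.diff(f, xs[i])

# diagonal matrix D^a_b with illustrative entries (an assumption)
D = [[sp.sin(xs[0])*xs[1], sp.Integer(0)],
     [sp.Integer(0), sp.exp(xs[0]) + xs[1]**3]]
G = [[[d(D[a][b], c) for c in range(n)] for b in range(n)] for a in range(n)]

def Riem(a, b, c, dd):          # eq (3.112)
    return -( d(G[a][b][c], dd) - d(G[a][b][dd], c)
              + sum(G[a][m][dd]*G[m][b][c] - G[a][m][c]*G[m][b][dd] for m in range(n)) )

flat = all(sp.simplify(Riem(a, b, c, dd)) == 0
           for a in range(n) for b in range(n) for c in range(n) for dd in range(n))
```
♦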

There is an alternative proof of sufficiency by using the notion of integrability. Since


the matrices D a b are diagonal they commute and the “path ordered” expression for the
transport operator reduces to
[transport(y→x;γ) ]• • = P exp( ∫x y Γ• •c dxc ) (3.204)
= exp( ∫x y ∂c D• • dxc ) (3.205)
= exp { D• • (y) − D• • (x) } (3.206)

This demonstrates integrability, because the transport operator is path independent.

On the other hand, establishing necessity takes several pages of rather turgid analysis
of PDEs in Eisenhart [pp 14-22]. There is a nicer version of the proof using n-beins that
I will discuss later, after I’ve introduced n-beins.

In contrast, an easy result is that if the Riemann tensor is zero then in any coordinate
patch
Γa ac = ∂c Θ (3.207)
One way to see this limited result is from the parallel transport operator — if the Riemann
tensor is zero then the transport operator is integrable, and so for any closed loop it is
the identity operator, whose determinant is unity. But then by our general result for the
determinant of the transport operator we have
∮ Γa ac dxc = 0 (3.208)

for arbitrary closed loops. We can now apply Stokes’ theorem [not that we have proved
it yet].

A more prosaic way proceeds from Sab which equals zero simply because the Riemann
tensor is zero, but that implies (as an algebraic identity) that

∂a Γm mb − ∂b Γm ma = 0 (3.209)

whence locally there is a potential

Γm ma = ∂a Θ (3.210)

There is no guarantee that this quantity is a scalar (actually it is a scalar density), or


that it is globally defined. [See the discussion of “closed” versus “exact” forms later in
these notes.] QED

Exercise: Explicitly evaluate the transformation law for Θ. ♦

We still have not seen the metric tensor — what gives?

3.11 Integrability [PDE route]

We have defined the word “integrable” in terms of the path independence of the parallel
transport operator. Now there is another more basic notion of integrability you can use
— that of the integrability of a system of PDEs. The ideas are of course related.

• The condition for the existence of a covariantly constant vector field, namely
∂V a /∂xb + Γa mb V m = 0 (3.211)
is a particular case of a Frobenius–Mayer system of PDEs.

• The condition required in order to apply the Frobenius complete integrability theorem, and thereby guarantee that this system of PDEs actually has a solution, is exactly the vanishing of the Riemann tensor.

• So integrability in the sense of PDEs is equivalent to integrability in the differential geometry sense.

For those of you who did Math 301 last year, here’s a gentle reminder.

3.11.1 Frobenius–Mayer systems

Notation slightly changed to be more “geometrical”.

Definition 47 Frobenius/Mayer system:

One special system of PDEs that is very important is the Frobenius or Mayer system

∂U^A/∂x^i = F^A_i(x^1, . . . , x^n, U^1, . . . , U^m)   (F) (3.212)

A = 1, 2, . . . , m;   i = 1, 2, . . . , n (3.213)

where the m functions {U^A} depend on the n independent variables {x^i}.

All these equations are of first order.

In such a system there are as many PDEs as there are first-order derivatives of the dependent functions (i.e., nm of them).

Notes:

• The “A” superscripts tell you which of the U ’s you are dealing with; not the order
of the derivative.

• The only derivatives occurring above are first-order on the LHS.

• The RHS of the system does not involve any derivatives.

• Just because it’s important does not mean it’s easy to find any discussion of this
system.

• You can find a discussion in Volume 1 of Spivak, chapter 6. See especially pages
254–257.
(The notation is slightly different).

• You can find a discussion in Volume 5 of Forsyth, chapter 4. See especially pages
100 ff.
(The notation is, unfortunately, seriously archaic).

References:

• Courant R., and D. Hilbert, Methods of Mathematical Physics Vols 1 and 2, Inter-
science 1966.

• Forsyth R., Differential Equations, in six volumes, OUP (1906 onwards).

• Spivak, M., A comprehensive introduction to differential geometry, in five volumes, (Publish or Perish, Berkeley, 1979).

3.11.2 Frobenius complete integrability theorem:

Theorem 13 Suppose the functions F^A_i are smooth functions of all their variables in a neighbourhood of the origin, for A = 1, 2, . . . , m and i = 1, 2, . . . , n.

Then the Frobenius system (F) has a unique solution satisfying the IC

U^A(0, 0, . . . , 0) = b^A   (A = 1, 2, . . . , m) (3.214)

for arbitrary given b^A if and only if

∂F^A_i/∂x^j + Σ_{B=1}^m (∂F^A_i/∂U^B) F^B_j = ∂F^A_j/∂x^i + Σ_{B=1}^m (∂F^A_j/∂U^B) F^B_i   (C) (3.215)

for all i, j, and A in their respective ranges.



• This Frobenius integrability theorem is an extremely important result. The condition (C) is effectively the requirement that the second partial derivatives should all commute:

∂²U^A/∂x^i ∂x^j = ∂²U^A/∂x^j ∂x^i (3.216)
• You can find a proof in Volume 1 of Spivak, chapter 6, pages 254–257. Note that
Spivak’s notation is slightly different.
• You can get a feel for how important the Frobenius integrability theorem is from
Spivak’s comment:
The Frobenius theorem (which represents everything we know about partial
differential equations) was used in [long list of topics].
(See Spivak, volume 5, page 1). This should be balanced against his further comment:
Now it’s really rather laughable to call these things partial differential equa-
tions at all. True [...] partial differential equations are involved, but we do
not posit any relationship between different partial derivatives; this comes out
quite clearly in the proof [of the integrability theorem] where the equations
are reduced to ordinary differential equations.
• You can also find a statement of the theorem (not a proof) in Eisenhart, "Non-Riemannian geometry", (AMS, 1927) p 14.
• There’s a very brief statement of the result in the “Encyclopedic Dictionary of
Mathematics”, see EDM page 1775.
• It is useful to rewrite condition (C) in the equivalent form

∂F^A_i/∂x^j − ∂F^A_j/∂x^i + Σ_{B=1}^m [(∂F^A_i/∂U^B) F^B_j − (∂F^A_j/∂U^B) F^B_i] = 0.   (C′) (3.217)

Alternatively

F^A_{[i,j]} + Σ_{B=1}^m ∂_B F^A_{[i} F^B_{j]} = 0.   (C″) (3.218)
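For the curious, here is a tiny sympy sanity check of condition (C) in the simplest case m = 1. Both systems below are invented toy examples, not taken from the text; the first satisfies (C) (its solution is U = b exp(x + 2y)), the second violates it.

```python
import sympy as sp

x, y, U = sp.symbols('x y U')

def frobenius_condition(F, xs):
    # Condition (C) for m = 1: dF_i/dx^j + (dF_i/dU) F_j must be
    # symmetric under the interchange i <-> j
    checks = []
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            lhs = sp.diff(F[i], xs[j]) + sp.diff(F[i], U) * F[j]
            rhs = sp.diff(F[j], xs[i]) + sp.diff(F[j], U) * F[i]
            checks.append(sp.simplify(lhs - rhs) == 0)
    return all(checks)

# Integrable: dU/dx = U, dU/dy = 2U, solved by U = b exp(x + 2y)
assert frobenius_condition([U, 2*U], [x, y])
# Not integrable: dU/dx = U, dU/dy = x violates (C)
assert not frobenius_condition([U, x], [x, y])
```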

3.11.3 From auto-parallel to Frobenius–Mayer

The condition for the existence of a covariantly constant vector field is

∂V^a/∂x^b + Γ^a_{mb} V^m = 0 (3.219)

This is a Frobenius–Mayer system in which

U^A ←→ V^a;   F^A_i(x, U) ←→ −Γ^a_{mb}(x) V^m (3.220)

The key point is that F^A_i(x, U) is linear in U [that is, V]; this greatly simplifies the analysis. The integrability condition is

∂[−Γ^a_{mb}(x) V^m]/∂x^c − ∂[−Γ^a_{mc}(x) V^m]/∂x^b + {Γ^a_{nb}(x) Γ^n_{mc}(x) V^m − Γ^a_{nc}(x) Γ^n_{mb}(x) V^m} = 0. (3.221)

That is

{Γ^a_{m[b,c]} + Γ^a_{n[c} Γ^n_{|m|b]}} V^m = 0 (3.222)

The vertical strokes surrounding the m, that is |m|, indicate that we do not include it in the antisymmetrization process. If we only knew of the existence of one covariantly constant vector field we would have to stop here. But because we have n linearly independent covariantly constant vector fields we can make the stronger statement

Γ^a_{m[b,c]} + Γ^a_{n[c} Γ^n_{|m|b]} = 0 (3.223)

Therefore

Γ^•_{•[b,c]} − Γ^•_{•[b} Γ^•_{•c]} = 0 (3.224)

But this is precisely the statement that the Riemann tensor vanishes.

Of course this had to be the case, but it’s still nice to see explicitly.

But now the Frobenius integrability theorem actually tells us more: If the Riemann tensor is zero then there is a unique covariantly constant vector field for each possible choice of V^a(p). Since the possible choices of V^a(p) span the tangent space T_p at p we deduce the existence of n = dim(M) linearly independent covariantly constant vector fields; let's call them E^a_A(x).

That the E^a_A(x) are linearly independent at the point p follows by definition. That they are then linearly independent at all other points x on the manifold can be established by contradiction.

Exercise: Suppose that at some q ≠ p the E^a_A(x) are linearly dependent. By using the PDE of parallel transport demonstrate that they will then also be linearly dependent at p, contrary to hypothesis. ♦

Warning: We do not assert that the E^a_A(x) are globally defined on the entire manifold; all that the Frobenius integrability theorem guarantees is that they will exist on some open neighborhood of the point p. ♦

If we combine this with its converse (one of the previous lemmata, already set as an
exercise) we have:

Lemma 15 The Riemann tensor is zero iff (locally) there exist n = dim(M) linearly
independent covariantly constant vector fields.

Furthermore, this tells us a lot about the parallel transport operator. Let us start with some vector X^a(p) and parallel transport it along some fixed but arbitrary curve γ. Because we have these n = dim(M) linearly independent covariantly constant vector fields, we know that at any arbitrary point on the curve we have

X^a(λ) = c^A(λ) E^a_A(x(λ)) (3.225)

But now consider the differential equation of parallel transport

(t · ∇)X = 0;   ⟺   dX^a/dλ + Γ^a_{bc} X^b (dx^c/dλ) = 0 (3.226)

Since the E^a_A(x) are covariantly constant, and the covariant derivative was explicitly cooked up to satisfy the Leibnitz rule, this implies

dc^A/dλ = 0 (3.227)

so the c^A are constants along any specified curve γ. But this means that for any initial choice of vector at p,

X^a(p) = c^A(p) E^a_A(x) (3.228)

we see that the result of any curve that goes from p (coordinates x^a) to q (coordinates y^a), while always remaining in the open region on which we are guaranteed that the E^a_A(x) are defined, yields the same parallel-transported vector

X^a(q) = c^A(p) E^a_A(y) (3.229)

That is: parallel transport is path independent on the open region where the E^a_A(x) are defined. This is enough to imply that the connexion is locally integrable. We can even give an explicit formula for the path transport operator. Since the E^a_A(y) are linearly independent the n × n matrix E^a_A(y) is nonsingular. It has a nonzero determinant and an inverse that we will write as [E^{−1}]_b^A(y). Then

[transport(y→x)]^a_b = E^a_A(x) [E^{−1}]_b^A(y) (3.230)

which is explicitly path independent.
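The algebra of eq (3.230) is easy to check numerically. The bein field below is an invented example (any smooth field of nonsingular matrices will do); the point is that an operator of the form E(x) E(y)^{−1} automatically composes path-independently and is the identity on closed loops.

```python
import numpy as np

def bein(point):
    # A hypothetical smooth 2-bein field E^a_A(x); any field of nonsingular
    # matrices suffices for checking the algebra of eq (3.230).
    x1, x2 = point
    return np.array([[1.0 + x1**2, x2],
                     [np.sin(x1), 2.0 + x2**2]])

def transport(y, x):
    # [transport(y -> x)]^a_b = E^a_A(x) [E^{-1}(y)]^A_b
    return bein(x) @ np.linalg.inv(bein(y))

p, q, r = (0.1, 0.2), (0.5, -0.3), (1.0, 0.7)

# Routing through any intermediate point gives the same operator,
# i.e. the prescription is path independent...
assert np.allclose(transport(p, r), transport(q, r) @ transport(p, q))
# ...and transport around any closed loop is the identity:
assert np.allclose(transport(r, p) @ transport(p, r), np.eye(2))
```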

We therefore (using the converse lemma that was one of the previous exercises) have:

Lemma 16 The connexion is integrable iff on topologically trivial regions there exist n =
dim(M) linearly independent covariantly constant vector fields.

Combining the two previous local lemmata:

Lemma 17 The connexion is integrable iff the Riemann tensor is zero.

Note that we have now proved this lemma without the agony of index gymnastics
involved in propagating around an infinitesimal loop. The price paid is a minor excursion
into the theory of PDEs, but it is well worthwhile.

I may wind up using Frobenius integrability again later on in the course.

There is a nice discussion of the geometrical version of Frobenius integrability in appendix B of Wald.

3.12 n-beins

Definition 48 In an n-dimensional manifold an n-bein is simply a collection e^a_A of n linearly independent vectors at a point.

The word is most common among physicists; it is a corruption of the German for "n legs", based on the dimension-dependent zweibein, dreibein, vierbein, fünfbein, etc. Depending on dimensionality, you will also see words like "triad" or "tetrad". Sometimes (in old mathematical literature) you might run across the word "ennuple" [as in "n-tuple" → "en-tuple" → "ennuple"]. Mathematicians (modern abstract mathematicians) will also sometimes use the word "frame" to express the same idea.

Note: The index a is an ordinary tangent space index, whereas the index A is simply a label indicating which of the vectors we are dealing with. The a index transforms in the usual way under a change of charts; the A index is simply unaffected. ♦

Because we assert that the vectors in the n-bein are linearly independent we know det(e^a_A) ≠ 0. Therefore this matrix has an inverse, which we will denote by e_a^A. [Note that index placement is important here.] By definition of this object as an inverse matrix

e_a^A e^a_B = δ^A_B;   e^a_A e_b^A = δ^a_b. (3.231)

Note that the first one of these Kronecker deltas is not a tensor; it simply lives in "label space" where it labels the legs of the n-bein. The second Kronecker delta is a tensor; as always it is a chart-independent tensor. (It transforms to itself under a change in coordinates.)

Note: Still no metric tensor yet; these n-beins are more primitive [more fundamental] than those typically arising in GR. ♦

Though simple in concept, this idea is remarkably powerful. Suppose we have a field of n-beins defined on the manifold. Make sure it's suitably differentiable. Then consider the derivatives

∇_a e^b_B (3.232)

and use them to define

γ^A_{BC} = e_b^A (∇_a e^b_B) e^a_C (3.233)

Note that these γ^A_{BC} are carefully cooked up to be chart independent. They are often called the "invariants" of the connexion; sometimes you will see people muttering about "going to an anholonomic basis". They are also related to something called the "spin connexion", for reasons I won't go into right now. It should not surprise you to realise that [given the n-bein field e^b_B] they encode the same information as the Γ^a_{bc}. Indeed if we substitute the explicit form of the covariant derivative, and contract with appropriate n-beins, we get

Γ^a_{bc} = −e_b^A ∂e^a_A/∂x^c + γ^A_{BC} e^a_A e_b^B e_c^C (3.234)

Theorem 14 The Riemann tensor is zero iff there exists an n-bein field such that

Γ^a_{bc} = −e_b^A ∂e^a_A/∂x^c (3.235)

Proof: Suppose the Riemann tensor is zero; then the connexion is locally integrable. Pick any n linearly independent vectors at some fixed but arbitrary point x and use the parallel transport operator to extend them to a locally defined n-bein field. (We know this can be done because the connexion is integrable.) But this n-bein field also satisfies the differential form of the parallel transport equations, so we have

∇_a e^b_B = 0 (3.236)

which implies that for this particular n-bein field

γ^A_{BC} = 0 (3.237)

whence

Γ^a_{bc} = −e_b^A ∂e^a_A/∂x^c = +(∂e_b^A/∂x^c) e^a_A (3.238)

as claimed.

Conversely, suppose the connexion is of this form. Insert into the definition of Riemann to see Riemann = 0. QED

Aside: There is a very similar result for non-Abelian gauge connexions. ♦



Corollary 1 If the Riemann tensor is zero then there exists an n-bein field defined on topologically trivial regions such that the parallel transport operator can be explicitly calculated to be

transport[y → x; γ]^a_b = e^a_A(x) e_b^A(y) (3.239)

Corollary 2 Once we have constructed a local n-bein field in this way, we can adopt coordinate charts so that e^a_A(x) is diagonal.

To do this just pick a point x and choose the coordinate axes to lie along the n vectors ~e_A. This can now be extended away from the point x by integrability.

Corollary 3 Once we have constructed diagonalizing coordinate charts for the global n-bein field, we define

D^a_b(x) = −ln[e^a_{A→b}(x)] (3.240)

which makes sense because A is now [by construction] doing double duty as both a label specifying "which vector" and as a coordinate index. Then in these diagonalizing charts

Γ^a_{bc} = ∂_c D^a_b (3.241)

as per Eisenhart's claim.

3.13 Weitzenböck connexion

There is a lovely (and perhaps unexpected) result due to Weitzenböck:

Theorem 15 In any manifold there exists a [non-unique] asymmetric affine connexion (now often called the Weitzenböck connexion) such that the Riemann tensor of that connexion vanishes.

Note: This does not mean all manifolds have trivial curvature. It means I can cook up a suitably perverse connexion to make the curvature of that connexion zero, but this Weitzenböck connexion may not be the mathematically or physically interesting one. ♦

Proof: Pick any differentiable n-bein field e^a_A. That is, at each point of the manifold pick an n-bein and make sure this is done in a differentiable manner.

We are not [at this stage] demanding the n-bein field be covariantly constant.

We are also not asserting that this can be done globally. After all, some manifolds have no everywhere-nonzero vector fields, let alone everywhere-nonsingular n-beins. But we can certainly do this locally on topologically trivial open regions.

Now construct

[Γ_Weitzenböck]^a_{bc} = −e_b^A ∂e^a_A/∂x^c (3.242)

By our previous arguments

[R_Weitzenböck]^a_{bcd} = 0 (3.243)

and in fact the n-bein field e^a_A is covariantly constant in this Weitzenböck connexion, but not generally in any other connexion. QED

The Weitzenböck connexion is used in the "teleparallel equivalent to general relativity", but you really do not want to know. It appears to be more of a mathematical curiosity than an issue of any great importance.

Note that the Weitzenböck connexion is most useful on a topologically trivial manifold.

Note that the torsion is definitely nonzero for the Weitzenböck connexion.
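If you want to see Theorem 15 in action, the following sympy sketch builds the Weitzenböck connexion of eq (3.242) from an invented 2-bein field and verifies by brute force that the resulting Riemann tensor vanishes while the torsion does not. The particular bein field chosen here is arbitrary; only its nonsingularity matters.

```python
import sympy as sp

x, y = sp.symbols('x y')
coords = [x, y]
n = 2

# An invented, everywhere-nonsingular 2-bein field e^a_A (row index a, column label A)
E = sp.Matrix([[sp.exp(x), 0],
               [x*y, 1 + x**2]])
Einv = E.inv()   # the inverse bein e_a^A, indexed [A, a]

# Weitzenbock connexion, eq (3.242): Gamma^a_{bc} = -e_b^A d(e^a_A)/dx^c
Gamma = [[[sp.simplify(-sum(Einv[A, b] * sp.diff(E[a, A], coords[c])
                            for A in range(n)))
           for c in range(n)] for b in range(n)] for a in range(n)]

def riemann(a, b, c, d):
    # R^a_{bcd} = Gamma^a_{bd,c} - Gamma^a_{bc,d}
    #             + Gamma^a_{mc} Gamma^m_{bd} - Gamma^a_{md} Gamma^m_{bc}
    val = sp.diff(Gamma[a][b][d], coords[c]) - sp.diff(Gamma[a][b][c], coords[d])
    val += sum(Gamma[a][m][c] * Gamma[m][b][d] - Gamma[a][m][d] * Gamma[m][b][c]
               for m in range(n))
    return sp.simplify(val)

# The curvature of this connexion vanishes identically...
assert all(riemann(a, b, c, d) == 0 for a in range(n) for b in range(n)
           for c in range(n) for d in range(n))
# ...but the connexion is not symmetric: the torsion is nonzero
assert any(sp.simplify(Gamma[a][b][c] - Gamma[a][c][b]) != 0
           for a in range(n) for b in range(n) for c in range(n))
```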

3.14 Deforming general connexions

We know that if we have two different connexions Γ1 and Γ2 on the same manifold, then their difference is a tensor

[Γ2]^a_{bc} = [Γ1]^a_{bc} + X^a_{bc} (3.244)

but this means we could calculate the Riemann tensor for Γ1 and Γ2 separately and compare them:

[R2]^a_{bcd} = [R1]^a_{bcd} + X^a_{bcd} (3.245)

The difference between the two Riemann tensors will itself be a tensor. Indeed

X^a_{bcd} = X^a_{bd;c} − X^a_{bc;d} + X^a_{mc} X^m_{bd} − X^a_{md} X^m_{bc} − 2[T1]^m_{cd} X^a_{bm} (3.246)

where the semicolon denotes covariant differentiation using the connexion Γ1.

Exercise: Check all index placement, coefficients and the like. ♦

Comment: This is a place where bullet notation is useful. Write

[Γ2]^•_{•c} = [Γ1]^•_{•c} + X^•_{•c} (3.247)


and

[R2]^•_{•cd} = [R1]^•_{•cd} + X^•_{•cd} (3.248)

Then

[R2]^•_{•cd} = −2 ( [Γ2]^•_{•[c,d]} − [Γ2]^•_{•[c} [Γ2]^•_{•d]} ) (3.249)
            = −2 ( [Γ1 + X]^•_{•[c,d]} − [Γ1 + X]^•_{•[c} [Γ1 + X]^•_{•d]} ) (3.250)
            = [R1]^•_{•cd} − 2 ( X^•_{•[c,d]} − [Γ1]^•_{•[c} X^•_{•d]} − X^•_{•[c} [Γ1]^•_{•d]} − X^•_{•[c} X^•_{•d]} ) (3.251)

But

X^•_{•[c;d]} = X^•_{•[c,d]} + [Γ1]^•_{•[d} X^•_{•c]} − X^•_{•[c} [Γ1]^•_{•d]} − [Γ1]^m_{[cd]} X^•_{•m} (3.252)

(and [Γ1]^m_{[cd]} is just the torsion [T1]^m_{cd}). Consequently

[R2]^•_{•cd} = [R1]^•_{•cd} − 2 ( X^•_{•[c;d]} + [T1]^m_{cd} X^•_{•m} − X^•_{•[c} X^•_{•d]} ) (3.253)

Exercise: Calculate [S2 ]ab − [S1 ]ab . Bullet notation may be helpful, though it’s a simple
calculation either way. ♦

Exercise: Calculate [R2 ]ab − [R1 ]ab . Bullet notation is unlikely to be helpful. Why? ♦

Exercise: Check all index placement, coefficients and the like. ♦

3.15 Preserving auto-parallelism

Question: Suppose two connexions lead to the same notion of parallelism, does this
mean the connexions are equal? ♦

Question: Suppose we have two affine connexions on the manifold that lead to the
same notion of parallelism, can we usefully characterize one in terms of the other? ♦

Recall that a specific vector field is said to be parallel to itself [auto-parallel] iff

∇_a V^b = f_a V^b;   ⟺   ∇_a V^{[b} V^{c]} = 0. (3.254)

If we have two different connexions, and want the same vector field to be auto-parallel with respect to both of them, then

(1)∇_a V^b = (1)f_a V^b;   and   (2)∇_a V^b = (2)f_a V^b. (3.255)

Therefore

[(2)∇_a − (1)∇_a] V^b = [(2)f_a − (1)f_a] V^b (3.256)

But

[(2)∇_a − (1)∇_a] V^b = [(2)Γ^b_{ma} − (1)Γ^b_{ma}] V^m = X^b_{ma} V^m (3.257)

where the quantity X^b_{ma}, being the difference between two connexions, is a T^1_2 tensor. So, writing f_a = (2)f_a − (1)f_a, we require

X^b_{ma} V^m = f_a V^b = f_a δ^b_m V^m (3.258)

But that implies (since we can take the components of V^m at any fixed but arbitrary point to be arbitrary)

X^a_{bc} = δ^a_b f_c (3.259)

That is:

Theorem 16 Two affine connexions define the same notion of parallelism iff there exists a covector f_c such that

(2)Γ^a_{bc} = (1)Γ^a_{bc} + δ^a_b f_c (3.260)

Notation: Two connexions related in this manner are said to differ from each other by
a “projective transformation”. You can also call them “projectively equivalent”. ♦
We can now ask how this impacts on the Riemann, Ricci, S, and torsion tensors.

(2)R^a_{bcd} = (1)R^a_{bcd} + δ^a_b f_{[c,d]} (3.261)
(2)R_{ab} = (1)R_{ab} + f_{[a,b]} (3.262)
(2)S_{ab} = (1)S_{ab} + n f_{[a,b]} (3.263)
(2)T^a_{bc} = (1)T^a_{bc} + δ^a_{[b} f_{c]} (3.264)
(2)T^a_{ac} = (1)T^a_{ac} + ((n−1)/2) f_c (3.265)

This implies that

(2)R^a_{bcd} − (1/n) δ^a_b (2)S_{cd} = (1)R^a_{bcd} − (1/n) δ^a_b (1)S_{cd} (3.266)
(2)R_{(ab)} = (1)R_{(ab)} (3.267)
(2)S_{ab} − n (2)R_{ab} = (1)S_{ab} − n (1)R_{ab} (3.268)

while for the torsion a brief computation yields

(2)T^a_{bc} − (2/(n−1)) δ^a_{[b} (2)T_{c]} = (1)T^a_{bc} − (2/(n−1)) δ^a_{[b} (1)T_{c]} (3.269)

This is a forerunner of a class of results that are typically very useful: One looks for a restricted class of distortions of the affine connexion, asks how that impacts the Riemann and torsion tensors, and then looks for invariants of the distortion process.

Comment: If we define

T̃^a_{bc} = T^a_{bc} − (2/(n−1)) δ^a_{[b} T_{c]} (3.270)

then

T̃^a_{ac} = 0 (3.271)

and preserving parallelism implies

(2)T̃^a_{bc} = (1)T̃^a_{bc}. (3.272)
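A quick numerical check of the claims (3.269)–(3.272), using a randomly chosen connexion and covector at a single point (all of the data below is invented purely for the test):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# Invented data: a random connexion Gamma1^a_{bc} at a point, and a random
# covector f_c for the projective transformation of eq (3.260)
G1 = rng.normal(size=(n, n, n))
f = rng.normal(size=n)
G2 = G1 + np.einsum('ab,c->abc', np.eye(n), f)   # Gamma2^a_{bc} = Gamma1^a_{bc} + delta^a_b f_c

def T_tilde(G):
    # T^a_{bc} = Gamma^a_{[bc]},  T_c = T^a_{ac},
    # T~^a_{bc} = T^a_{bc} - (2/(n-1)) delta^a_{[b} T_{c]}    (eq 3.270)
    T = 0.5 * (G - G.transpose(0, 2, 1))
    Tr = np.einsum('aac->c', T)
    delta = np.eye(n)
    skew = 0.5 * (np.einsum('ab,c->abc', delta, Tr) - np.einsum('ac,b->abc', delta, Tr))
    return T - 2.0 / (n - 1) * skew

# T~ is traceless, and is invariant under the projective transformation:
assert np.allclose(np.einsum('aac->c', T_tilde(G1)), 0.0)
assert np.allclose(T_tilde(G1), T_tilde(G2))
```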

Theorem 17 We can choose f_a in such a way as to make (2)R^a_{bcd} = 0 iff

(1)R^a_{bcd} − (1/n) δ^a_b (1)S_{cd} = 0 (3.273)

Proof: The condition is clearly necessary. Conversely, if the condition is satisfied we deduce

(1)R^a_{bcd} = (1/n) δ^a_b (1)S_{cd} = (1/n) δ^a_b (1)Γ_{[c,d]} (3.274)

[Here Γ_c is shorthand for the trace Γ^m_{mc}.] So choose f_c so that

f_{[c,d]} = −(1/n) (1)Γ_{[c,d]} (3.275)

This can always be done (at least locally). To see this pick some fixed but arbitrary coordinate chart and in that chart set

f_c = −(1/n) (1)Γ_c (3.276)

and then extend f_c to all other charts by the rules for covector transformation under a change of coordinates. This f_c is not unique, but we made no claim that it had to be unique. QED

We also have:

Theorem 18 We can always choose f_a in such a way as to make (2)S_{ab} = 0.

Proof: Note that (1)S_{cd} = (1)Γ_{[c,d]} and choose f_c so that

f_{[c,d]} = −(1/n) (1)Γ_{[c,d]} (3.277)

This can always be done (at least locally). QED

That is, without modifying the parallels (or the geodesics) we can always "get rid of" the S-tensor.

Exercise: Check all index placement, coefficients and the like. ♦

3.16 Preserving geodesics

Preserving the geodesics is less difficult than preserving the notion of parallelism. Remember that we have already seen that an arbitrary torsion, while it leaves the geodesics untouched, will modify notions of parallelism. From the above we have

Theorem 19 Two affine connexions define the same geodesics iff there exists a covector f_c, and an arbitrary T^1_2 tensor Z^a_{bc} skew-symmetric on its covariant indices, such that

(2)Γ^a_{bc} = (1)Γ^a_{bc} + δ^a_b f_c + Z^a_{bc} (3.278)

Corollary 4 By writing

δ^a_b f_c = δ^a_{(b} f_{c)} + δ^a_{[b} f_{c]} (3.279)

we can rephrase this condition in terms of the symmetric part of the connexion and the torsion

(2)Γ^a_{(bc)} = (1)Γ^a_{(bc)} + δ^a_{(b} f_{c)} (3.280)
(2)T^a_{bc} = (1)T^a_{bc} + δ^a_{[b} f_{c]} + Z^a_{bc} (3.281)

By absorbing the antisymmetric part into a redefined Z̃^a_{bc} we also have

(2)T^a_{bc} = (1)T^a_{bc} + Z̃^a_{bc} (3.282)

Corollary 5 Two affine connexions define the same geodesics iff there exists a covector f_c, and an arbitrary T^1_2 tensor Z̃^a_{bc} skew-symmetric on its covariant indices, such that

(2)Γ^a_{bc} = (1)Γ^a_{bc} + δ^a_{(b} f_{c)} + Z̃^a_{bc} (3.283)

Exercise: What effects do changes of this type have on the Riemann, Ricci, and S tensors? ♦

Exercise: Check all index placement, coefficients and the like. ♦

Exercise: What happens if (2)Γ^a_{bc} and (1)Γ^a_{bc} have both the same geodesics, and the same affine parameter? ♦

Exercise: What happens if (2)Γ^a_{bc} and (1)Γ^a_{bc} both define the same notion of parallelism, and when considering geodesics they have the same affine parameter? ♦

3.17 Decomposing the general connexion

If you have a distinguished tensor g_ab and its inverse g^ab, together with the non-metricity q_abc, you can use this to obtain a canonical decomposition for a general affine connexion.

Start with the fact that

∇_c g_ab = q_abc = ∂_c g_ab − g_mb Γ^m_{ac} − g_am Γ^m_{bc} (3.284)

and rewrite it as

q_abc = ∂_c g_ab − Γ_bac − Γ_abc = g_ab,c − Γ_bac − Γ_abc (3.285)

Now consider the craftily chosen linear combination defined by

{ab; c} = ½ {g_ca,b + g_cb,a − g_ab,c} (3.286)

This combination of partial derivatives of any symmetric T^0_2 tensor is called a "Christoffel symbol of the first kind".

We compute

{ab; c} = ½ {q_cab + q_cba − q_abc} + ½ {Γ_acb + Γ_cab + Γ_bca + Γ_cba − Γ_bac − Γ_abc} (3.287)

That is

{ab; c} = ½ {q_cab + q_cba − q_abc} − {T_abc + T_bac} + ½ {Γ_cab + Γ_cba} (3.288)

That is

{ab; c} = ½ {q_cab + q_cba − q_abc} − {T_abc + T_bac − T_cba} + Γ_cab. (3.289)
Now just rearrange

Γ_cab = {ab; c} − ½ {q_cab + q_cba − q_abc} + {T_abc + T_bac − T_cba} (3.290)

That is

Γ^c_{ab} = {^c_{ab}} + g^{cm} [ −½ {q_mab + q_mba − q_abm} + {T_abm + T_bam − T_mba} ] (3.291)

where we have defined the "Christoffel symbol of the second kind" by

{^c_{ab}} = g^{cm} {ab; m} (3.292)

Comment: Note what has happened here: although g_ab is not "the" metric, merely some random non-degenerate symmetric tensor, we have nevertheless been able to decompose an arbitrary asymmetric affine connexion in terms of partial derivatives of the g_ab, the tensor of non-metricity q_abc, and the torsion T^a_{bc}. Note also that both first and second Christoffel symbols make good sense for arbitrary non-degenerate symmetric tensors; they do not intrinsically have anything to do with the metric tensor. ♦

Comment: The combination

T_abc + T_bac − T_cba (3.293)

is sometimes called the "contorsion". ♦

Exercise: Check all index placement, coefficients and the like. ♦
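As a sanity check on the Christoffel symbols of both kinds, the following sympy fragment uses a sample g_ab (nothing to do with any metric, as stressed above) and verifies the identity g_{ab,c} = {ac; b} + {bc; a}; this is exactly what makes eq (3.291) reduce to a metric-compatible connexion when q and T vanish.

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
coords = [r, th]
n = 2

# Any nondegenerate symmetric tensor g_ab will do; a polar-style example
g = sp.Matrix([[1, 0], [0, r**2]])
ginv = g.inv()

def first_kind(a, b, c):
    # {ab; c} = (g_{ca,b} + g_{cb,a} - g_{ab,c}) / 2     (eq 3.286)
    return (sp.diff(g[c, a], coords[b]) + sp.diff(g[c, b], coords[a])
            - sp.diff(g[a, b], coords[c])) / 2

def second_kind(c, a, b):
    # {c; ab} = g^{cm} {ab; m}                            (eq 3.292)
    return sum(ginv[c, m] * first_kind(a, b, m) for m in range(n))

# With q_abc = 0 and T_abc = 0, eq (3.291) says Gamma^c_{ab} = {c; ab}; the
# identity g_{ab,c} = {ac; b} + {bc; a} then shows this connexion has
# vanishing non-metricity:
assert all(sp.simplify(sp.diff(g[a, b], coords[c])
                       - first_kind(a, c, b) - first_kind(b, c, a)) == 0
           for a in range(n) for b in range(n) for c in range(n))

# A familiar sample value for this g: {r; theta theta} = -r
assert sp.simplify(second_kind(0, 1, 1) + r) == 0
```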

3.18 Locally geodesic coordinates

Theorem 20 At any point p we can introduce locally geodesic coordinates such that at that point:

Γ^a_{(bc)} |_p = 0. (3.294)

This is very different from saying that the symmetric quantity Γ^a_{(bc)} vanishes throughout the manifold. Note that the transformation law for connexions is inhomogeneous (affine) — this is the only reason this result has even the slightest hope of being true.

To prove this theorem, start in some random coordinate system where

Γ^a_{(bc)} |_p = Q^a_{bc} ≠ 0. (3.295)

Now, assume that p is located at x^a = 0, and transform coordinates

x^a → x^ā = x^a + ½ Q^a_{bc} x^b x^c + O[x³] (3.296)

(So Q^a_{bc}, by its definition, is automatically symmetric in its lower two indices.) Then

∂x^ā/∂x^b = δ^a_b + Q^a_{bc} x^c + O(x²) = δ^a_b + O(x) (3.297)

∂²x^ā/∂x^b ∂x^c = Q^a_{bc} + O(x). (3.298)

And using the transformation law for connexions:

Γ^ā_{b̄c̄} = (∂x^ā/∂x^a)(∂x^b/∂x^b̄)(∂x^c/∂x^c̄) Γ^a_{bc} − (∂x^b/∂x^b̄)(∂x^c/∂x^c̄)(∂²x^ā/∂x^b ∂x^c) (3.299)
        = Γ^a_{bc} + O(x) − Q^a_{bc} + O(x) (3.300)
        = Γ^a_{(bc)} + T^a_{bc} + O(x) − Q^a_{bc} + O(x) (3.301)
        = T^a_{bc} + O(x). (3.302)

In particular (dropping the bars on the indices)

Γ^a_{(bc)} |_p = 0. (3.303)

Or equivalently

Γ^a_{bc} |_p = T^a_{bc}. (3.304)

Note that this only means that the symmetric part of the connexion vanishes at the particular point p. This is called a locally geodesic coordinate system at the point p.

Exercise: Show that in a locally geodesic coordinate system x^a the curves

x^a(λ) = t^a λ (3.305)

with t^a being any set of constant coefficients, are all geodesics at λ = 0 [this point corresponding to x^a = 0]. Furthermore the parameterization of these geodesics is affine. (The curves need not be geodesic once λ ≠ 0; hence the phrase "locally geodesic".) ♦

Theorem 21 In any locally geodesic coordinate system we have

R^a_{bcd} |_p = Γ^a_{bd,c} − Γ^a_{bc,d} + T^a_{mc} T^m_{bd} − T^a_{md} T^m_{bc} (3.306)
            = −2Γ^a_{b[c,d]} + T^a_{mc} T^m_{bd} − T^a_{md} T^m_{bc} (3.307)

In other words, we can use the locally geodesic coordinate system to [at a point] simplify the appearance of the Riemann tensor.

Note: These "locally geodesic" coordinates are much more useful if the torsion vanishes. ♦

3.19 Riemann normal coordinates

In general, although locally geodesic coordinates enforce Γ^a_{(bc)} |_p = 0 it is not possible to make all the Γ^a_{bc,d} vanish. However we can (among other things) enforce

Γ^a_{(bc,d)} |_p = 0 (3.308)

To see this, start with a locally geodesic coordinate system and define new improved coordinates. Assume that p is located at x^a = 0, and transform coordinates

x^a → x^ā = x^a + (1/3!) Q^a_{bcd} x^b x^c x^d + O(x⁴) (3.309)

(So Q^a_{bcd}, by its definition, is automatically symmetric in its lower three indices.) Then

∂x^ā/∂x^b = δ^a_b + (1/2!) Q^a_{bcd} x^c x^d + O(x³) = δ^a_b + O(x²) (3.310)

∂²x^ā/∂x^b ∂x^c = Q^a_{bcd} x^d = O(x). (3.311)

∂³x^ā/∂x^b ∂x^c ∂x^d = Q^a_{bcd} + O(x). (3.312)

And using the transformation law for connexions:

Γ^ā_{b̄c̄,d̄} = ∂_d̄ [ (∂x^ā/∂x^a)(∂x^b/∂x^b̄)(∂x^c/∂x^c̄) Γ^a_{bc} − (∂x^b/∂x^b̄)(∂x^c/∂x^c̄)(∂²x^ā/∂x^b ∂x^c) ] (3.313)
           = Γ^a_{bc,d} + O(x) − Q^a_{bcd} + O(x) (3.314)

That is:

Γ^ā_{(b̄c̄,d̄)} = Γ^a_{(bc,d)} − Q^a_{bcd} + O(x) (3.315)

In particular (choosing Q^a_{bcd} = Γ^a_{(bc,d)} |_p and subsequently dropping the bars on the indices)

Γ^a_{(bc,d)} |_p = 0. (3.316)

Note that this does not imply that all the partial derivatives of the connexion vanish at p; it only means that the completely symmetric part of the connexion derivatives can be made to vanish.

You can use the same logic iteratively to enforce

Γ^a_{(bc,def...)} |_p = 0 (3.317)

for any number of partial derivatives. If all these symmetric partial derivatives are zero then this improvement on the notion of a locally geodesic coordinate system is called a Riemann normal coordinate chart at the point p.

Historical note: Schouten [Ricci calculus, p 158] points out that this is actually a generalization of Riemann's construction [Riemann worked with both nonmetricity and torsion set to zero] and that these might more reasonably be called Veblen normal coordinates. See also Eisenhart, p 58 ff. ♦

Generally it is this first step beyond locally geodesic coordinates, the fact that you can enforce

Γ^a_{(bc,d)} |_p = 0 (3.318)

that is the most important.

Note: These "Riemann normal" coordinates are much more useful if the torsion vanishes. ♦
Chapter 4

Symmetric Connexions

For symmetric connexions several of the complications inherent in the general [asymmetric] affine connexion simplify. These issues are sufficiently important to merit a separate chapter.

4.1 Semi-symmetric connexions

A connexion is said to be semi-symmetric if it has the same parallelism properties (defines the same parallel lines, and in particular geodesics) as some symmetric connexion.

In view of the previous discussion concerning the preservation of parallelism this requires

(2)Γ^a_{bc} = (1)Γ^a_{(bc)} + δ^a_b f_c (4.1)

Equivalently

(2)Γ^a_{(bc)} = (1)Γ^a_{(bc)} + δ^a_{(b} f_{c)} (4.2)
(2)T^a_{bc} = δ^a_{[b} f_{c]} (4.3)

That is: a necessary and sufficient condition that the asymmetric connexion Γ^a_{bc} have the same parallels as some symmetric affine connexion is that

T^a_{bc} = (2/(n−1)) δ^a_{[b} T_{c]} (4.4)

There are now several simplifications in the various identities satisfied by the Riemann tensor. Of course we still have

R^a_{b(cd)} = 0 (4.5)

but now

R^a_{[bcd]} = (4/(n−1)) δ^a_{[b} T_{c;d]} (4.6)

while

R_{[ab]} = ½ S_{ab} + 2 T_{[a;b]} (4.7)

and

R_{(cd)ab} = q_{cd[a;b]} + (1/(n−1)) q_{cd[a} T_{b]}. (4.8)

Finally the Weitzenböck [Bianchi] identities specialize to

R^a_{b[cd;e]} = (4/(n−1)) R^a_{b[cd} T_{e]} (4.9)

and for the contracted Weitzenböck [Bianchi] identities

S_{[cd;e]} = (4/(n−1)) S_{[cd} T_{e]} (4.10)

and

R_{bd;e} − R_{be;d} + R^a_{bde;a} = −(8/(n−1)) R^n_{bde} T_n. (4.11)

Exercise: Check all signs and coefficients. ♦

Exercise: Show that the number of algebraically independent components of the Riemann tensor for any semi-symmetric connexion is (as for the general asymmetric affine connexion)

n³(n−1)/2 (4.12)
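The counting in eq (4.12) is easy to verify by brute-force enumeration, since the only identity in play is the antisymmetry R^a_{b(cd)} = 0:

```python
from itertools import product

def count_independent(n):
    # Brute-force count of components of R^a_{bcd} subject only to the
    # antisymmetry R^a_{b(cd)} = 0: components with c == d vanish, and
    # (a, b, c, d) is paired with (a, b, d, c).
    seen = set()
    for a, b, c, d in product(range(n), repeat=4):
        if c != d:
            seen.add((a, b, min(c, d), max(c, d)))
    return len(seen)

# Matches the closed form n^3 (n - 1)/2 of eq (4.12)
assert all(count_independent(n) == n**3 * (n - 1) // 2 for n in range(1, 7))
```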

4.2 Symmetric connexions: Identities

For a fully symmetric connexion the various identities simplify even further. We still have

R^a_{b(cd)} = 0 (4.13)

but now, since the torsion is identically zero,

R^a_{[bcd]} = 0 (4.14)

while

R_{[ab]} = ½ S_{ab} (4.15)

and

R_{(cd)ab} = q_{cd[a;b]}. (4.16)

Finally the Bianchi identities specialize to

R^a_{b[cd;e]} = 0, (4.17)

while for the contracted Bianchi identities

S_{[cd;e]} = 0 (4.18)

and

R_{bd;e} − R_{be;d} + R^a_{bde;a} = 0. (4.19)

We can rewrite this final identity as

R^m_{abc;m} = 2 R_{a[b;c]}. (4.20)

Exercise: Show that the number of algebraically independent components of the Riemann tensor for any symmetric connexion is

n²(n+1)(n−1)/3 (4.21)
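The count in eq (4.21) can be checked by linear algebra, not in the notes but straightforward: impose R^a_{b(cd)} = 0 and the cyclic identity (equivalent, given the antisymmetry, to R^a_{[bcd]} = 0) as linear constraints on the n⁴ components, and read off the dimension of the solution space.

```python
import numpy as np
from itertools import product

def count_independent_symmetric(n):
    # Impose R^a_{b(cd)} = 0 and the cyclic sum identity as linear
    # constraints; the independent components span the nullspace, whose
    # dimension is n^4 minus the rank of the constraint matrix.
    idx = {t: k for k, t in enumerate(product(range(n), repeat=4))}
    rows = []
    for a, b, c, d in product(range(n), repeat=4):
        row = np.zeros(n**4)
        row[idx[(a, b, c, d)]] += 1          # R^a_{bcd} + R^a_{bdc} = 0
        row[idx[(a, b, d, c)]] += 1
        rows.append(row)
        row = np.zeros(n**4)
        row[idx[(a, b, c, d)]] += 1          # R^a_{bcd} + R^a_{cdb} + R^a_{dbc} = 0
        row[idx[(a, c, d, b)]] += 1
        row[idx[(a, d, b, c)]] += 1
        rows.append(row)
    rank = np.linalg.matrix_rank(np.array(rows))
    return n**4 - rank

# Matches the closed form n^2 (n+1)(n-1)/3 of eq (4.21)
assert all(count_independent_symmetric(n) == n**2 * (n + 1) * (n - 1) // 3
           for n in (2, 3))
```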

From our previous discussion on parallelism-preserving deformations of a generic asymmetric affine connexion we know that:

Theorem 22 If two symmetric connexions define the same parallels they are equal.

Preservation of geodesics is a less demanding constraint:

Theorem 23 If two symmetric connexions define the same geodesics then

(2)Γ^a_{(bc)} = (1)Γ^a_{(bc)} + δ^a_{(b} f_{c)} (4.22)

Exercise: Under a deformation of this type what happens to the Riemann, Ricci, and S tensors? ♦

Exercise: What happens if two symmetric connexions (2)Γ^a_{(bc)} and (1)Γ^a_{(bc)} have both the same geodesics and the same affine parameter? ♦

4.3 Locally geodesic coordinates for symmetric connexions

For a symmetric connexion the geodesic and normal coordinates come to play a role of special importance. We now have, in any locally geodesic coordinate chart,

Γ^a_{bc} |_p = 0. (4.23)

whence

R^a_{bcd} |_p = Γ^a_{bd,c} − Γ^a_{bc,d} = −2Γ^a_{b[c,d]} (4.24)

But then, automatically,

R^a_{[bcd]} |_p = 0 (4.25)

(Note that Γ^a_{[bc,d]} = 0 simply because Γ^a_{bc} is symmetric in bc while the brackets antisymmetrize over them.) Now we may have derived the above equation in a special coordinate system, but this is a tensor equation so it must hold in all coordinate charts.

Similarly in any locally geodesic coordinate chart

R^a_{bcd;e} |_p = Γ^a_{bd,ce} − Γ^a_{bc,de} = −2Γ^a_{b[c,d]e} (4.26)

whence

R^a_{b[cd;e]} |_p = −2Γ^a_{b[c,de]} = 0 (4.27)

Again, though we have derived it in a special coordinate system, this is a tensor equation so it must hold in all coordinate charts. Consequently we have demonstrated the Bianchi identity for an arbitrary symmetric connexion in a rather straightforward way.

Exercise: Use locally geodesic coordinates to deduce that whenever we decompose a general affine connexion as

Γ = ^S Γ + (torsion) (4.28)

then

R[Γ]^a_{bcd} = R[^S Γ]^a_{bcd} + "∇(torsion)" + "(torsion)²". (4.29)

On the RHS evaluate the coefficients and put all the indices in the right places. ♦

Theorem 24 (Fermi normal coordinates) For any symmetric connexion, and any curve x^a(λ), it is possible to choose coordinates to make the symmetric connexion zero everywhere along the curve.

Exercise: Develop an extension of the argument leading to the existence of locally geodesic coordinates to deduce the existence of these "Fermi normal coordinates". ♦

4.4 Weyl symmetric connexion

Suppose one has an auxiliary tensor gab with covariant derivative qabc . Then defining the
conformally related ḡab = exp(2θ)gab but keeping the symmetric connexion fixed

q̄abc = ∇cḡab = ∇c [exp(2θ) gab ] = exp(2θ) [qabc + 2gab ∂c θ] (4.30)

This suggests that it may be useful to consider connexions with non-metricity of the form

qabc = ∇c gab = −2gab fc (4.31)

since then whenever ḡab = exp(2θ)gab we have

f¯c = fc − ∂c θ (4.32)

Now using our general result for decomposing the affine connexion in terms of the auxiliary
tensor gab , the nonmetricity qabc and the torsion [zero for symmetric connexions] we have

Γa bc = {a bc} + δ a b fc + δ a c fb − gbc f a (4.33)

(here {a bc} denotes the Christoffel symbol of the second kind built from gab ).
Symmetric connexions of this type are called Weyl connexions — they were used by Weyl
in one particular attempt [now abandoned] to geometrically unify electromagnetism with
gravity.
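Note: As a quick sanity check (a sympy sketch, not part of the notes; the 2-dimensional diagonal metric and the functions fc are arbitrary illustrative choices) one can verify that the Weyl connexion (4.33) really does produce the nonmetricity qabc = −2 gab fc of (4.31):

```python
import sympy as sp

xs = sp.symbols('x0 x1')
# Arbitrary illustrative choices: a diagonal metric diag(p, q) and a covector f.
p, q, f0, f1 = [sp.Function(name)(*xs) for name in ('p', 'q', 'f0', 'f1')]
g = sp.Matrix([[p, 0], [0, q]])
ginv = g.inv()
f = [f0, f1]
fup = [sum(ginv[a, m] * f[m] for m in range(2)) for a in range(2)]  # f^a = g^{am} f_m

def christoffel(a, b, c):
    # {a bc} = (1/2) g^{am} (g_mb,c + g_mc,b - g_bc,m)
    return sp.Rational(1, 2) * sum(
        ginv[a, m] * (sp.diff(g[m, b], xs[c]) + sp.diff(g[m, c], xs[b]) - sp.diff(g[b, c], xs[m]))
        for m in range(2))

def Gamma(a, b, c):
    # The Weyl connexion (4.33).
    return christoffel(a, b, c) + sp.KroneckerDelta(a, b) * f[c] \
           + sp.KroneckerDelta(a, c) * f[b] - g[b, c] * fup[a]

for a in range(2):
    for b in range(2):
        for c in range(2):
            # q_abc = nabla_c g_ab = g_ab,c - Gamma^m_ac g_mb - Gamma^m_bc g_am
            nablag = sp.diff(g[a, b], xs[c]) \
                     - sum(Gamma(m, a, c) * g[m, b] + Gamma(m, b, c) * g[a, m] for m in range(2))
            assert sp.simplify(nablag + 2 * g[a, b] * f[c]) == 0
print("nabla_c g_ab = -2 g_ab f_c verified")
```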

Schouten’s third identity now simplifies to

R(ab)cd = −2gab f[c,d] (4.34)

and upon taking the trace on the first two indices

Scd = −2n f[c,d] (4.35)

4.5 Weyl tensor for symmetric connexions

There is a version of the Weyl tensor that can be defined for a limited class of symmetric
connexions. Suppose we have two symmetric connexions that define the same geodesics
(though not the same affine parameter). That is, suppose
(2)Γa bc = (1)Γa bc + δ a (b fc) (4.36)

Exercise: Show that the affine parameters are related by


(2)λ = c ∫ exp( ∫ fc dxc ) d (1)λ (4.37)

Exercise: Show that


(2)Ra bcd = (1)Ra bcd − δ a b (fcd − fdc ) − δ a c fbd + δ a d fbc (4.38)

where
fcd = fc;d − fc fd (4.39)

Exercise: Show that the Weyl tensor defined by


W a bcd = Ra bcd + (1/(n + 1)) δ a b (Rcd − Rdc ) + (1/(n² − 1)) [ δ a c (nRbd + Rdb ) − δ a d (nRbc + Rcb ) ] (4.40)
is a projective invariant in the sense that
(2)W a bcd = (1)W a bcd (4.41)

Exercise: Show that the Weyl tensor satisfies the identities

W a b(cd) = 0 (4.42)

W a [bcd] = 0 (4.43)
W a acd = 0 (4.44)

Exercise: Show that a necessary and sufficient condition that the Riemann tensor be
invariant under a projective transformation is that

fc;d − fc fd = 0 (4.45)

Exercise: Show that a necessary and sufficient condition that the Ricci tensor be
invariant under a projective transformation is that

fc;d − fc fd = 0 (4.46)

Exercise: Show that a necessary and sufficient condition that the symmetric part of
the Ricci tensor be invariant under a projective transformation is that

f(c;d) − fc fd = 0 (4.47)

Exercise: Show that a necessary and sufficient condition that the anti-symmetric part
of the Ricci tensor be invariant under a projective transformation is that

fc = ∂ c θ (4.48)

Exercise: Show that a necessary and sufficient condition that the S-tensor Sab be
invariant under a projective transformation is that

fc = ∂ c θ (4.49)

Exercise: [Trivial] Show that a necessary and sufficient condition that a symmetric
connexion be projectively equivalent to an affine flat connexion is that

Γa bc = δ a (b fc) (4.50)

Exercise: Show that (for n ≥ 3) a necessary and sufficient condition that a symmetric
connexion be projectively equivalent to an affine flat connexion is that the Weyl tensor
vanish. ♦

Exercise: Show that (for any n) a necessary and sufficient condition that a symmetric
connexion be projectively equivalent to an affine flat connexion is that
Rab;c − Rac;b = (2/(n + 1)) [ R[ab];c − R[ac];b ] (4.51)


Exercise: Show that a necessary and sufficient condition that a symmetric connexion be
projectively equivalent to an affine flat connexion, and that the Ricci tensor be symmetric,
is that
Γa bc = δ a (b θ,c) (4.52)
and that in this case
Ra bcd = e−θ [ δ a c ∂b ∂d eθ − δ a d ∂b ∂c eθ ] (4.53)

Exercise: Check all signs and index placements. ♦

4.6 Projective connexions

Suppose we have two projectively equal symmetric connexions


(2)Γa bc = (1)Γa bc + δ a (b fc) (4.54)

Exercise: Show that the quantity


Πa bc = Γa bc − (2/(n + 1)) δ a (b Γm mc) (4.55)
is a projective invariant. What are the transformation laws for this object under a change
of coordinates? ♦

Exercise: Show that

Πa ac = 0 (4.56)

There are many more results that can be derived for symmetric connexions; this
is more than enough for this particular course.

See Eisenhart and Schouten if you want more details.


Chapter 5

Metric Connexions

The irony is that given the way I have set things up, I can now define a metric connexion
without defining a metric tensor. Patience, the metric tensor is [finally] defined in the
next chapter.

5.1 Metric connexions with torsion

The generic metric connexion may still have torsion, but is characterized by the vanishing
of the non-metricity tensor qabc . More precisely the otherwise arbitrary non-degenerate
symmetric tensor gab is defined to be covariantly constant (it will soon be identified as
the metric tensor) and we demand
∇c gab = 0 (5.1)
so that
Γc ab = {c ab} + g cm (Tabm + Tbam − Tmba ) (5.2)

Exercise: What are the integrability conditions for ∇c gab = 0? ♦

The standard symmetries are now

Ra b(cd) = 0 (5.3)

Ra [bcd] = −2 T a [bc;d] + 4 T a m[d T m bc] (5.4)


R[cd] = T a cd;a + 2T a a[c;d] + 2T a am T m cd (5.5)
Scd = 0 (5.6)
R(cd)ab = 0. (5.7)


However we still need the full Weitzenbock form of the Bianchi identities

Ra b[cd;e] = 2Ra bm[e T m cd] (5.8)

Note that it is the anti-symmetry on the first two indices of the Riemann tensor that
forces Sab = 0.

Exercise: Show that the number of algebraically independent components of the Rie-
mann tensor for any metric connexion (with torsion) is

n²(n − 1)²/4 (5.9)

5.2 Metric semi-symmetric connexions

If the connexion is both metric and semi-symmetric we have further simplifications. We


still have
Ra b(cd) = 0 (5.10)
but now
Ra [bcd] = (4/(n − 1)) δ a [b Tc;d] (5.11)
while
R[ab] = 2T[a;b] (5.12)
Sab = 0 (5.13)
and
R(cd)ab = 0. (5.14)
Finally the Bianchi identities specialize to
Ra b[cd;e] = (4/(n − 1)) Ra b[cd Te] (5.15)

Exercise: Show that the number of algebraically independent components of the Rie-
mann tensor for any semi-symmetric metric connexion is (as for the general metric con-
nexion with torsion)
n²(n − 1)²/4 (5.16)


5.3 Metric connexions without torsion — the GR “standard connexion”

The metric connexion without torsion is the “standard connexion” for doing general rel-
ativity. Because of this most texts on GR deal only with this case, ignoring other com-
plications. There are however a number of good physics and mathematics reasons for
having kept the discussion quite general up to now. The torsion-free metric connexion is
characterized by the vanishing of both the non-metricity tensor qabc and the torsion T a bc .
Then we have
Γc ab = {c ab}. (5.17)
The standard symmetries are now
Ra b(cd) = 0 (5.18)
Ra [bcd] = 0 (5.19)
R[cd] = Scd = 0 (5.20)
R(cd)ab = 0, (5.21)
and the simplified form of the Bianchi identities

Ra b[cd;e] = 0. (5.22)

Exercise: Show that the interplay between Rab(cd) = 0 = R(ab)cd and Ra[bcd] = 0 implies

Rabcd = Rcdab (5.23)

Is there a converse for this result? ♦

Exercise: Show that the number of algebraically independent components of the Rie-
mann tensor for any torsion-free metric connexion is
n²(n² − 1)/12 (5.24)
This is not so trivial, see the next chapter for hints. ♦

Exercise: By contracting the Bianchi identities [simplified Weitzenbock identities] once


verify that
Rm abc;m = 2 Ra[b;c] . (5.25)
You already saw this result for generic symmetric connexions. ♦

Exercise: By contracting the Bianchi identities [simplified Weitzenbock identities] a
second time verify that

( Rab − (1/2) R g ab );b = 0. (5.26)

(You will need to use the symmetries implied by the vanishing of nonmetricity to obtain
this result.) ♦

Comment: The combination


Gab = Rab − (1/2) R gab (5.27)
is known as the Einstein tensor and is of central importance in general relativity. ♦
Chapter 6

Metric tensor

In good old ordinary Euclidean geometry (e.g., IR3 ) you probably never saw the distinction
between vectors and covectors, tangents and gradients, indices up and indices down.
Something special happens in Euclidean geometry to simplify life a lot. Let’s try to see
what is so special and how to modify it to a more general context.

6.1 Measuring distance

Let’s start with Euclidean space in curvilinear coordinates. In Cartesian coordinates X a


we know how to define the distance between two nearby points:
ds = √( Σ_{a=1}^{n} (δX a )² ) (6.1)

In curvilinear coordinates the chain rule gives


δX a = (∂X a /∂xb ) δxb (6.2)
and then
ds = √( Σ_{a=1}^{n} (∂X a /∂xb )(∂X a /∂xc ) δxb δxc ) (6.3)

Let’s define:
gbc = Σ_{a=1}^{n} (∂X a /∂xb )(∂X a /∂xc ) (6.4)
So far, it’s only defined for Euclidean spaces, and uses the Cartesian coordinates as a
fundamental part of its definition, but once it is defined in this way it does transform as

104

a T20 tensor. And then in general coordinates


ds = √( gbc δxb δxc ) (6.5)
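Comment: As a concrete sketch of (6.4), here is a sympy computation for plane polar coordinates (an illustrative choice, not part of the original notes); the familiar metric diag(1, r²) drops out automatically:

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
X = [r * sp.cos(th), r * sp.sin(th)]   # Cartesian coordinates X^a
xcurv = [r, th]                        # curvilinear coordinates x^b

# g_bc = sum_a (dX^a/dx^b)(dX^a/dx^c), equation (6.4)
g = sp.zeros(2, 2)
for b in range(2):
    for c in range(2):
        g[b, c] = sp.simplify(sum(sp.diff(X[a], xcurv[b]) * sp.diff(X[a], xcurv[c])
                                  for a in range(2)))
print(g)   # Matrix([[1, 0], [0, r**2]])
```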

Note:
You can alternatively define gbc this way, at least for Euclidean space — It is the T20 tensor
which in Cartesian coordinates takes on the special values

gab (X) = δab (6.6)

Comment: Suppose you were a physicist working in Minkowski space, then in Cartesian
coordinates you would want to take
 
gab (X) = ηab = diag(−1, +1, +1, +1) (6.7)

and in any other set of coordinates you would just use the tensor transformation laws to
calculate the components. ♦

If you are not working in Euclidean/Minkowski space, then you define a “metric” on
the manifold to be any non-singular tensor of type T20 (the tensor components typically
determine a positive definite matrix), and define the distance between nearby points as
ds = √( gbc δxb δxc ) (6.8)

Because the matrix gab is assumed nonsingular it has a matrix inverse, and without loss
of generality we can write:
g ab = [g −1 ]ab (6.9)

Whenever a metric g exists it establishes a natural isomorphism between vectors and


covectors. That is:
g : T∗ ↔ T (6.10)
In the vulgate, the metric is used to “raise and lower indices”.

Let t be a vector, and g a metric, then you define a covector t↓ by taking:

(t↓ )a = gab tb . (6.11)



Similarly
ta = g ab (t↓ )b . (6.12)

As we have already seen (remember the non-metricity tensor) any nonsingular T20 tensor
picked at random would handle this particular job of raising and lowering indices; it’s only
because of the separate notion of “distance” that the metric acquires so much specific
importance.

6.2 Riemannian geometry

Definition 49 If the metric (considered as a matrix) is positive definite then the geometry
is Riemannian.

Exercise: Note that this is a coordinate independent statement. Why? ♦

Note: One of the appendices gives Riemann’s complete inaugural lecture — it is fasci-
nating to see how these ideas first came into use, and the logic that Riemann was using
to initiate this branch of mathematics. ♦

6.3 pseudo-Riemannian geometry

Definition 50 If the metric is nonsingular but indefinite, and in particular has signature
−, +, +, +... then the geometry is called pseudo-Riemannian (aka Lorentzian).

In particular, with a −, +, +, +... signature, the condition ds = 0 defines a cone structure


in the vicinity of every point in the manifold — we will want to use these as the light-cones
of general relativity.

Note: This pseudo-Riemannian geometry is most useful in the context of Einstein’s


special and general relativities.

Though you can also use pseudo–Riemannian geometry, for instance, for investigating
the propagation of sound in a moving fluid. This ultimately is the basis of a minor indus-
try called “analogue gravity” where various condensed matter systems are used to model
aspects of general relativity. See the book “Artificial Black Holes”. ♦

6.4 Finsler/pseudo-Finsler geometry

To quote Bernhard Riemann (in slightly contorted English translation, full version pro-
vided in one of the appendices):

The next case in simplicity includes those manifoldnesses in which the line-element
may be expressed as the fourth root of a quartic differential expression. The
investigation of this more general kind would require no really different principles,
but would take considerable time and throw little new light on the theory of space,
especially as the results cannot be geometrically expressed. . .

. . . A method entirely similar may for this purpose be applied also to the manifold-
ness in which the line-element has a less simple expression, e.g., the fourth root
of a quartic differential. In this case the line-element, generally speaking, is no
longer reducible to the form of the square root of a sum of squares, and therefore
the deviation from flatness in the squared line-element is an infinitesimal of the
second order, while in those manifoldnesses it was of the fourth order.

Such manifolds, and their generalizations, have now come to be called Finsler geometries, and for instance
ds = [ gabcd δxa δxb δxc δxd ]^(1/4) (6.13)
or more generally
ds = [ ga1 a2 ...an δxa1 δxa2 . . . δxan ]^(1/n) (6.14)
are general notions of “Finsler distance”.

Pseudo-Finsler geometries allow the ga1 a2 ...an to become indefinite, and it is more useful
to write:
(ds)n = ga1 a2 ...an δxa1 δxa2 . . . δxan (6.15)
In this case both sides of the equation are at least real.

The only reason I mention this is because the 4th-order pseudo-Finsler geometry does
have a physics application:

(ds)4 = gabcd δxa δxb δxc δxd (6.16)

is a useful “metric” describing light “cones” (nested light-sheets) in a birefringent crystal


(or more generally a nematic liquid crystal). This is an open research topic. (It’s one of
the things I’m working on.)

6.5 Metric geodesics

The metric can be used to define a whole different class of geodesics, based on the notion
of shortest distance (not straightest line). The two concepts agree in Euclidean space (or
Riemannian geometry), but do not necessarily agree in all the models physicists cook up.

Suppose we define the length of a curve as:


s = ∫γ √( gab (dxa /dλ)(dxb /dλ) ) dλ (6.17)

(Note that the distance defined in this way is independent of the parameterization of the
curve.)

Exercise: Suppose we change the parameterization of the curve from x(λ) to x(λ̄) where
λ̄ = f (λ) and f (·) is a 1-1 differentiable function from IR → IR. Verify that the distance
s is independent of f (·). ♦
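Note: This exercise can at least be spot-checked numerically. The sympy sketch below (the curve and the reparameterization λ = µ² are arbitrary illustrative choices) compares the two arc-lengths:

```python
import sympy as sp

lam, mu = sp.symbols('lambda mu', nonnegative=True)

# Original parameterization: the Euclidean plane curve x(lambda) = (lambda, lambda^2).
x1 = [lam, lam**2]
integrand1 = sp.sqrt(sum(sp.diff(xi, lam)**2 for xi in x1))
s1 = sp.Integral(integrand1, (lam, 0, 1)).evalf()

# Reparameterized: lambda = mu^2, a 1-1 differentiable map of [0,1] onto itself.
x2 = [mu**2, mu**4]
integrand2 = sp.sqrt(sum(sp.diff(xi, mu)**2 for xi in x2))
s2 = sp.Integral(integrand2, (mu, 0, 1)).evalf()

assert abs(s1 - s2) < 1e-10   # same length in both parameterizations
print(s1)
```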

Now if we fix the endpoints x and y we can use the Euler–Lagrange equations of varia-
tional calculus to find the equation of the shortest curve between those fixed points.

Note: I have added an appendix with the elements of variational calculus; enough to
understand what is going on here. ♦

Exercise: For the adventurous I have also added another appendix that gives the three
Hilbert problems that relate to the calculus of variations. Completely solving those three
problems would be a nice exercise. ♦

The relevant Euler–Lagrange equation is:

d/dλ [ gab (dxb /dλ) / √( gcd (dxc /dλ)(dxd /dλ) ) ] − (1/2) (∂gbc /∂xa ) (dxb /dλ)(dxc /dλ) / √( gcd (dxc /dλ)(dxd /dλ) ) = 0. (6.18)

This simplifies tremendously if you parameterize using arc-length s, since in that case:

d/ds [ gab (dxb /ds) ] − (1/2) (∂gbc /∂xa ) (dxb /ds)(dxc /ds) = 0. (6.19)
Expand the first term:

gab (d²xb /ds²) + (∂gab /∂xc )(dxc /ds)(dxb /ds) − (1/2) (∂gbc /∂xa )(dxb /ds)(dxc /ds) = 0. (6.20)

Rearrange:
d²xa /ds² + g am [ ∂gmb /∂xc − (1/2) ∂gbc /∂xm ] (dxb /ds)(dxc /ds) = 0. (6.21)

This is qualitatively very similar to the notion of an affine geodesic with the analogy
 
Γa bc ∼ g am [ ∂gmb /∂xc − (1/2) ∂gbc /∂xm ] (6.22)

However, the RHS here is not quite in its nicest form. It is traditional and usual to
symmetrize and introduce Christoffel symbols of the first and second kind
{ab, c} ≡ (1/2) [gac,b + gbc,a − gab,c ] = gc(a,b) − (1/2) gab,c (6.23)

{a bc} = g am {bc, m} = g am [ gm(b,c) − (1/2) gbc,m ] = (1/2) g am [ ∂gmb /∂xc + ∂gmc /∂xb − ∂gbc /∂xm ] (6.24)
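Comment: As a sketch (not from the notes), here is (6.24) evaluated with sympy for the unit 2-sphere, ds² = dθ² + sin²θ dφ², an illustrative choice of metric:

```python
import sympy as sp

th, ph = sp.symbols('theta phi')
xs = [th, ph]
g = sp.Matrix([[1, 0], [0, sp.sin(th)**2]])   # unit 2-sphere metric
ginv = g.inv()

def christoffel(a, b, c):
    # {a bc} = (1/2) g^{am} (g_mb,c + g_mc,b - g_bc,m), equation (6.24)
    return sp.simplify(sp.Rational(1, 2) * sum(
        ginv[a, m] * (sp.diff(g[m, b], xs[c]) + sp.diff(g[m, c], xs[b]) - sp.diff(g[b, c], xs[m]))
        for m in range(2)))

print(christoffel(0, 1, 1))  # the component {theta, phi phi} = -sin(theta) cos(theta)
print(christoffel(1, 0, 1))  # the component {phi, theta phi} = cos(theta)/sin(theta)
```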
Exercise:
It is now a straightforward (but tedious) exercise to verify that the Christoffel symbol of the second kind {a bc} defined above really does transform in the same way as an affine connexion.

This is already implicit in our analysis for the general affine connexion, but explicit
verification in this particular situation is still a useful exercise. ♦

If we set Γa bc = {a bc} then:

• Geodesics in the sense of straightest possible paths coincide with geodesics in the
sense of shortest possible paths.

• The torsion tensor is by definition zero.

• The nonmetricity tensor is by definition zero.

• The curvature tensor can be calculated directly in terms of the metric.
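Comment: A numeric sketch of the first bullet point (not from the notes): integrating the metric geodesic equation (6.21) on the unit 2-sphere, an illustrative choice, with a hand-rolled RK4 stepper. Since the connexion is metric, the speed gab (dxa /ds)(dxb /ds) should be conserved along the geodesic:

```python
import math

def accel(th, dth, dph):
    # Geodesic equation on the unit sphere, using
    # Gamma^theta_phiphi = -sin(th)cos(th) and Gamma^phi_thetaphi = cot(th).
    ddth = math.sin(th) * math.cos(th) * dph**2
    ddph = -2.0 * (math.cos(th) / math.sin(th)) * dth * dph
    return ddth, ddph

def rk4_step(state, h):
    def f(s):
        th, ph, dth, dph = s
        ddth, ddph = accel(th, dth, dph)
        return (dth, dph, ddth, ddph)
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def speed2(state):
    th, ph, dth, dph = state
    return dth**2 + math.sin(th)**2 * dph**2   # g_ab x'^a x'^b

state = (1.0, 0.0, 0.3, 1.0)   # (theta, phi, dtheta/ds, dphi/ds): generic initial data
s0 = speed2(state)
for _ in range(2000):
    state = rk4_step(state, 0.001)
assert abs(speed2(state) - s0) < 1e-8
print("speed^2 conserved along the geodesic")
```

The initial data are an arbitrary choice; any starting point away from the poles works.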

6.6 Metric connexion revisited

Another key feature of the metric connexion is:

Theorem 25 If
Γa bc = {a bc} (6.25)

then
∇a gbc = 0, (6.26)
that is: the metric is covariantly constant.
Conversely, if ∇a gbc = 0, and the torsion is zero (T a bc = 0), then Γa bc = {a bc}.


(This is already explicit in our analysis for the general affine connexion.)

There are a number of other places where the metric connexion simplifies things, for
instance in the symmetry properties of the Riemann tensor. I already pointed out to you
that
Ra b(cd) = 0 (6.27)
for an arbitrary affine connexion. For the metric connexion, we start by using locally
geodesic coordinates so that at any arbitrary but fixed point Γa bc |p = {a bc}|p = 0, which
implies that at that point {ab, c} |p = 0. But in general

{ab, c} + {ac, b} = gbc,a (6.28)

so we deduce gbc,a |p = 0 at the arbitrary but fixed point p. But this means that in local
geodesic coordinates
∂d Γa bc = g am ∂d {bc, m} = (1/2) g am (gbm,cd + gcm,bd − gbc,md ) (6.29)
and (still in locally geodesic coordinates)
Ra bcd |p = ∂c Γa bd − ∂d Γa bc = (1/2) g am (gdm,bc − gbd,mc − gcm,bd + gbc,md ) (6.30)

Let’s lower the index on the Riemann tensor, then at any arbitrary but fixed point p
we can locally choose geodesic coordinates to make:
Rabcd |p = (1/2) (gda,bc − gbd,ac − gca,bd + gbc,ad ) (6.31)
That is:
Rabcd |p = (1/2) (gad,bc − gbd,ac − gac,bd + gbc,ad ) (6.32)
Notice that in this coordinate system, not only does

Rab(cd) |p = 0 (6.33)

(we already knew this), but also


R(ab)cd |p = 0 (6.34)
and
Ra[bcd] |p = 0 ⇐⇒ (Rabcd + Racdb + Radbc )|p = 0 (6.35)

and finally
Rabcd |p = Rcdab |p (6.36)
But these last three equations are new tensor properties — if we have proved them in one
chart they are true in all charts; and the point p though fixed, was arbitrary, so we have
derived additional symmetry properties for a metric-derived Riemann tensor.

(We had already derived these results by specializing the results for the general asym-
metric affine connexion, but this use of locally geodesic coordinates makes life particularly
simple.)
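Note: These symmetry properties are easy to machine-check on a concrete example. The sympy sketch below (unit 2-sphere, an illustrative choice, not part of the notes) computes Rabcd from the definitions and tests the symmetries just derived:

```python
import sympy as sp

th, ph = sp.symbols('theta phi')
xs = [th, ph]
g = sp.Matrix([[1, 0], [0, sp.sin(th)**2]])   # unit 2-sphere metric
ginv = g.inv()
n = 2

def Gam(a, b, c):
    # Christoffel symbol of the second kind {a bc}.
    return sp.Rational(1, 2) * sum(
        ginv[a, m] * (sp.diff(g[m, b], xs[c]) + sp.diff(g[m, c], xs[b]) - sp.diff(g[b, c], xs[m]))
        for m in range(n))

def Rup(a, b, c, d):
    # R^a_bcd = d_c Gam^a_bd - d_d Gam^a_bc + Gam^a_mc Gam^m_bd - Gam^a_md Gam^m_bc
    expr = sp.diff(Gam(a, b, d), xs[c]) - sp.diff(Gam(a, b, c), xs[d])
    for m in range(n):
        expr += Gam(a, m, c) * Gam(m, b, d) - Gam(a, m, d) * Gam(m, b, c)
    return sp.simplify(expr)

# Lower the first index: R_abcd = g_am R^m_bcd.
R = [[[[sp.simplify(sum(g[a, m] * Rup(m, b, c, d) for m in range(n)))
        for d in range(n)] for c in range(n)] for b in range(n)] for a in range(n)]

for a in range(n):
    for b in range(n):
        for c in range(n):
            for d in range(n):
                assert sp.simplify(R[a][b][c][d] + R[a][b][d][c]) == 0   # R_ab(cd) = 0
                assert sp.simplify(R[a][b][c][d] + R[b][a][c][d]) == 0   # R_(ab)cd = 0
                assert sp.simplify(R[a][b][c][d] - R[c][d][a][b]) == 0   # pair interchange
print(sp.simplify(R[0][1][0][1]))   # the single independent component R_thetaphithetaphi
```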

Exercise: Verify that


2 (Rabcd + Racdb + Radbc )|p = (gad,bc − gac,bd − gbd,ac + gbc,ad )
+ (gab,cd − gad,cb − gcb,ad + gcd,ab )
+ (gac,db − gab,dc − gdc,ab + gdb,ac ) (6.37)
= 0. (6.38)

Exercise: Verify that from


Ra bcd |p = ∂c Γa bd − ∂d Γa bc (6.39)
we have
(Ra bcd + Ra cbd )p = ∂c Γa bd − ∂d Γa bc + ∂b Γa cd − ∂d Γa cb (6.40)
= {∂b Γa cd + ∂c Γa db + ∂d Γa bc } − 3∂d Γa bc (6.41)
= 3Γa (bc,d) − 3∂d Γa bc (6.42)
So far, this holds in any locally geodesic coordinate system.

If we now go to a normal coordinate system (Riemann normal coordinate system, in


which Γa (bc,d) |p = 0) we have the considerably stronger statement
Γa bc,d |p = − (1/3) (Ra bcd + Ra cbd )|p (6.43)
Also
Γabc,d |p = − (1/3) (Rabcd + Racbd )|p (6.44)
But in general (for a [torsion-free] metric connexion)
gab,c = Γabc + Γbac (6.45)
Therefore
gab,cd |p = − (1/3) (Rbcad + Racbd )|p (6.46)

Note that this only works in a normal coordinate system, and does not hold in an arbi-
trary locally geodesic chart. ♦

Let’s Taylor series expand the metric in the normal chart around p, where p corresponds
to xa = 0. Then
gab (x) = gab (p) + (1/2) gab,cd xc xd + O(x)³ (6.47)
But then

gab (x) = gab (p) − (1/3!) (Racbd + Rbcad ) xc xd + O(x)³ (6.48)

We can always find a constant transformation matrix to take


gab (p) → ηab . (6.49)
That then corresponds to a Gaussian normal (not just a geodesic) coordinate system, in
terms of which
gab (x) = ηab − (1/3!) (Racbd + Rbcad ) xc xd + O(x)³ (6.50)
The interpretation is now simple: The Riemann tensor gives you information about de-
viations from flatness; in fact second-order deviations from flatness. It is a generalization
of the notion of “radius of curvature” you will already be familiar with from elementary
mechanics or geometry — all the indices are there to deal with the many “directions” you
now have available for curves to move in.

Exercise: From
gab (x) = ηab − (1/3!) (Racbd + Rbcad ) xc xd + O(x)³ (6.51)
it follows that
gab,cd = − (1/3) (Racbd + Rbcad ) (6.52)
Verify that this is consistent with the definition of the Riemann tensor in locally geodesic
coordinates
Rabcd |p = (1/2) (gad,bc − gbd,ac − gac,bd + gbc,ad ) |p (6.53)

Exercise: (Small research project)


Generalize this discussion, as much as possible, to include torsion and nonmetricity. As
we have seen previously, symmetric torsion-free connexions are not too bad. Still in the
general case there should be some sort of short distance expansion that would yield a
Taylor series for the connexion in terms of the torsion and the Riemann tensor? (And
possibly the nonmetricity?) Find it! ♦

6.7 Bianchi identities revisited

As we have already seen, the Riemann tensor satisfies some important constraints in the
form of differential identities. Let’s step back to the stage of assuming the torsion is zero,
but make no assumption that the connexion is metric; it’s a generic symmetric affine
connexion. Then in locally geodesic coordinates

∇e Ra bcd (p) = ∂e Ra bcd (p) (6.54)
= ∂e (∂c Γa bd − ∂d Γa bc + Γa mc Γm bd − Γa md Γm bc )|p (6.55)
= (Γa bd,ce − Γa bc,de )|p (6.56)
= −2 Γa b[c,d]e |p (6.57)

But then
Ra b[cd;e] = 0 (6.58)
(because antisymmetrizing the partial derivatives in the last line above gives zero).

Note: The generalization of this identity to the case of nonzero torsion or general affine
connexion is somewhat messier. See previous discussion. ♦

Since we have this identity we can also deduce related identities by contracting on the
up-down indices:
(Ra acd;e + Ra ade;c + Ra aec;d ) = 0 (6.59)
That is
Scd;e + Sde;c + Sec;d = 0 ⇐⇒ S[cd;e] = 0. (6.60)

Note: Remember that for a metric connection Sab = 0; so this equation is non-trivial
only for symmetric but non-metric connexions. ♦

The more interesting contraction is:

(Ra bad;e + Ra bde;a + Ra bea;d ) = 0 (6.61)

That is
Rbd;e − Rbe;d + Ra bde;a = 0. (6.62)
This is true for symmetric but not necessarily metric connexions. If we make the additional
assumption that the connexion is metric, we can additionally contract by using g be to get

g be Rbd;e − g be Rbe;d + g be Ra bde;a = 0. (6.63)

That is
Re d;e − R;d + Re d;e = 0 (6.64)
 
( Ra b − (1/2) R δ a b );a = 0 (6.65)

This is now traditionally referred to as the contracted Bianchi identity, and the combina-
tion
Gab = Rab − (1/2) R gab (6.66)
is now traditionally referred to as the Einstein tensor.
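Note: The contracted Bianchi identity can also be machine-checked. The sympy sketch below (the spatially flat FRW-style test metric is an arbitrary illustrative choice, nothing special to the argument) computes Gab directly from the metric connexion and verifies ∇a Ga b = 0:

```python
import sympy as sp
from functools import lru_cache

t, X, Y, Z = sp.symbols('t x y z')
xs = [t, X, Y, Z]
a = sp.Function('a')(t)
g = sp.diag(-1, a**2, a**2, a**2)   # illustrative test metric
ginv = g.inv()
n = 4

@lru_cache(maxsize=None)
def Gam(i, j, k):
    return sp.Rational(1, 2) * sum(
        ginv[i, m] * (sp.diff(g[m, j], xs[k]) + sp.diff(g[m, k], xs[j]) - sp.diff(g[j, k], xs[m]))
        for m in range(n))

@lru_cache(maxsize=None)
def Rup(i, j, k, l):
    expr = sp.diff(Gam(i, j, l), xs[k]) - sp.diff(Gam(i, j, k), xs[l])
    for m in range(n):
        expr += Gam(i, m, k) * Gam(m, j, l) - Gam(i, m, l) * Gam(m, j, k)
    return expr

Ric = [[sp.simplify(sum(Rup(m, i, m, j) for m in range(n))) for j in range(n)] for i in range(n)]
Rscalar = sp.simplify(sum(ginv[i, j] * Ric[i][j] for i in range(n) for j in range(n)))
# Mixed Einstein tensor G^i_j = g^{im} R_mj - (1/2) R delta^i_j.
Gmix = [[sp.simplify(sum(ginv[i, m] * Ric[m][j] for m in range(n))
         - sp.Rational(1, 2) * Rscalar * sp.KroneckerDelta(i, j)) for j in range(n)] for i in range(n)]

# Covariant divergence: nabla_a G^a_b = d_a G^a_b + Gam^a_am G^m_b - Gam^m_ab G^a_m.
for b in range(n):
    div = sum(sp.diff(Gmix[m][b], xs[m]) for m in range(n))
    for m in range(n):
        for c in range(n):
            div += Gam(c, c, m) * Gmix[m][b] - Gam(m, c, b) * Gmix[c][m]
    assert sp.simplify(div) == 0
print("nabla_a G^a_b = 0 for this metric")
```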

Warning: Note that to even define the Einstein tensor you need to have a metric (or
at the very least a nonsingular T20 tensor) available to fully contract the Ricci tensor to
obtain the Ricci scalar
R = Rab g ab (6.67)
The Ricci tensor can be defined for an arbitrary affine connexion; the Einstein tensor
needs the metric. Even when the metric is available, you then additionally need to make
sure you are using the metric connexion in order to deduce

Gab ;b = 0 ⇐⇒ ∇b Gab = 0 (6.68)

Note: If you crawl down into the basement of Rankine–Brown (not currently an option),
you will discover the VUW library has a complete set of Professor Bianchi’s nicely typeset
lecture notes from the graduate course in differential geometry he taught for many years
— of course they are all in the original Italian! ♦

6.8 The Weyl tensor

Suppose we are dealing with a [torsion-free] metric connexion; then the number of free
components of the Riemann tensor is strictly limited by all the symmetries. In fact:

• If n = 1 the Riemann tensor is identically zero.

• If n = 2 the Riemann tensor has only one independent component, essentially R.


Indeed
1
Rabcd = R (gac gbd − gad gbc ) (6.69)
2
1
Rab = R gab (6.70)
2

• If n = 3 the Riemann tensor has only six independent components, essentially Rab .
Indeed
Rabcd = −2 [ ga[d Rc]b + gb[c Rd]a ] − R ga[c gd]b (6.71)
That is
Rabcd = {gac Rbd + gbd Rac − gad Rbc − gbc Rad } − (1/2) R {gac gbd − gad gbc } (6.72)

• If n = 4 the Riemann tensor has only twenty independent components. Ten of them
are the Ricci tensor Rab and the other ten are hidden in the “Weyl tensor”.
Cabcd = Rabcd + [ ga[d Rc]b + gb[c Rd]a ] + (1/3) R ga[c gd]b (6.73)
That is
Cabcd = Rabcd − (1/2) {gac Rbd + gbd Rac − gad Rbc − gbc Rad } + (1/6) R {gac gbd − gad gbc } (6.74)

• Generally (n ≥ 3) we define
Cabcd = Rabcd + (2/(n − 2)) [ ga[d Rc]b + gb[c Rd]a ] + (2/((n − 1)(n − 2))) R ga[c gd]b (6.75)
That is
Cabcd = Rabcd − (1/(n − 2)) {gac Rbd + gbd Rac − gad Rbc − gbc Rad } + (1/((n − 1)(n − 2))) R {gac gbd − gad gbc } (6.76)
The Weyl tensor has the same symmetries as Riemann and is the unique linear
combination of the Riemann tensor, Ricci tensor, Ricci scalar, and metric such that

C a bad = 0 (6.77)

The Weyl tensor can also be characterized as the part of the Riemann tensor that
is covariant under conformal deformations of the metric. If

g̃ab = Ω2 gab (6.78)

then
C̃abcd = Ω2 Cabcd (6.79)

Exercise:
Verify that the number of algebraically independent components of the Riemann tensor
is
1 2 2
n (n − 1). (6.80)
12

Warning:
If you only use the “basic” symmetries Rab(cd) = 0 = R(ab)cd and Rabcd = Rcdab you would find
   
(1/2) [ n(n − 1)/2 ] [ n(n − 1)/2 + 1 ] (6.81)
independent components. It is only after adding the additional constraint Ra[bcd] = 0 that
you get the quoted result. ♦

Note:
Up to n(n + 1)/2 of these components can be assigned to the Ricci tensor, so that the
Weyl tensor has at most
n²(n² − 1)/12 − n(n + 1)/2 = (n − 3)n(n + 1)(n + 2)/12 (6.82)
independent components. ♦
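Note: The component-count bookkeeping above is a one-line sympy check (a sketch, not part of the notes):

```python
import sympy as sp

n = sp.symbols('n')
riemann = n**2 * (n**2 - 1) / 12            # independent components of Riemann
ricci = n * (n + 1) / 2                     # components of a symmetric Ricci tensor
weyl = (n - 3) * n * (n + 1) * (n + 2) / 12 # what is left for the Weyl tensor

assert sp.simplify(riemann - ricci - weyl) == 0
assert riemann.subs(n, 4) == 20 and weyl.subs(n, 4) == 10   # the n = 4 counts quoted above
print("counts agree")
```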

Exercise:
[This exercise makes sense only in Lorentzian signature, i.e., for pseudo-Riemannian man-
ifolds.]
(Trivial) Verify that g̃ab = Ω2 gab has the same null cones as gab .
(A little harder) Verify that g̃ab has the same null geodesics as gab .
(A little harder) Even though g̃ab has the same null geodesics as gab , these geodesics will
not have the same affine parameter. Verify this. ♦

Exercise:
Verify that
Γ̃a bc = Γa bc + Ω−1 {δ a b ∇c Ω + δ a c ∇b Ω − gbc ∇a Ω} . (6.83)
Can you use this result to come up with a “slick” way of showing that g̃ab has the same
null geodesics as gab ?
[The second part of this exercise makes sense only in Lorentzian signature, i.e., for pseudo-
Riemannian manifolds.] ♦

Exercise:
Define
Ωa b = 4Ω−1 ∇a ∇b (Ω−1 ) − 2δ a b ||∇(Ω−1 )||2 (6.84)
−1
(The apparently perverse use of Ω minimises the number of explicit occurrences of n.)
Verify that
R̃ab cd = Ω−2 Rab cd + δ a [c Ωb d] (6.85)

Verify that (n > 2)


R̃ab = Rab + (n − 2) Ω ∇a ∇b (Ω−1 ) − (1/(n − 2)) gab Ω2−n ∇²(Ωn−2 ) (6.86)
Verify that (n ≥ 1)

R̃ = Ω−2 R + 2(n − 1)Ω−3 ∇2 Ω − (n − 1)(n − 4) Ω−4 ||∇Ω||2 (6.87)

Exercise:
If n = 2 verify that
√g̃ R̃ = √g [ R + 2∇² ln Ω ] (6.88)
Deduce that on a compact 2-manifold

Z
g R d2 x (6.89)

is a conformal invariant — in fact it’s a topological invariant; the Euler characteristic. ♦

Warning: Strictly speaking, I have not yet defined the notion of integration on mani-
folds, but this particular calculation should be do-able by elementary means. ♦

Exercise:
By taking the formal limit n → 2 verify that

R̃ab = Rab + gab ∇2 ln Ω (6.90)

Exercise:
If n = 3 verify that
p √ 
g̃ R̃ = g Ω R + 4∇2 Ω − 2||∇ ln Ω||2 (6.91)

Deduce that in a compact manifold


∫ √g̃ R̃ d³x = ∫ √g exp(ln Ω) [ R − 2||∇ ln Ω||² ] d³x (6.92)

= ∫ √g exp(θ) [ R − 2||∇θ||² ] d³x (6.93)

Considered as a function of θ ≡ ln Ω, with R held fixed, this is a Lagrangian of so-called


Liouville form. ♦

Exercise:
If n = 4 verify that
√g̃ R̃ = √g [ Ω² R + 6Ω∇²Ω ] (6.94)
Deduce that in a compact manifold
∫ √g̃ R̃ d⁴x = ∫ √g Ω² [ R − 6||∇ ln Ω||² ] d⁴x (6.95)

Considered as a function of Ω, with R held fixed, this is a Lagrangian for a “massive


scalar particle” (with a possibly position dependent mass; and definitely with the wrong
sign for the kinetic term). ♦

Exercise:
Look up “Yamabe’s theorem”.
Under suitable conditions (find them), manifolds can be conformally related to constant
scalar curvature “cousins”. That is, given g it is possible to find Ω such that g̃ has a
constant Ricci scalar: R̃ ∈ {−1, 0, +1}.
(If n = 2 this is the so-called “uniformization theorem” of 2-manifolds; for n > 2 this
seems to be the best analog of the uniformization theorem on the market.) ♦

Definition 51
A manifold is “locally conformally flat” iff in some coordinate chart
gab ∝ ηab . (6.96)
It is globally conformally flat iff there is an atlas of coordinate charts with this property;
and if the transition functions on the overlap regions are trivial.

Lemma 18
Any 2-manifold is locally conformally flat.
(It’s globally conformally flat iff it has the topology of IR2 or T 2 ; otherwise the uniformiza-
tion theorem implies it’s globally conformal to either S 2 or the hyperbolic plane H 2 modded
out by some Mobius group.)

Lemma 19
A 3-manifold is conformally flat iff the Cotton tensor is everywhere zero.
(3 dimensions is a special case: The Cotton tensor is
 
Rabc = Rab;c − Rac;b − (1/4) (gac R;b − gab R;c ) = 2 [ Ra[b;c] + (1/4) ga[b R;c] ] (6.97)

See MTW p 550.)

Lemma 20
A manifold (n ≥ 4) is conformally flat iff the Weyl tensor is everywhere zero.

Exercise: Check all signs, coefficients, and index placements! ♦

6.9 The differential geometry of Newton’s second law

To see that the notion of a conformally flat geometry contains a lot of interesting mathe-
matics, I will now present an “unusual” route from Newton’s second law to Maupertuis’
variational principle.

Even abstract mathematicians should recognize Newton’s second law

F~ = m ~a (6.98)

so that for a body subject to a “conservative” force field

m d²~x/dt² = − ∂V (x)/∂~x (6.99)

Now suppose that you have good surveying equipment but very bad clocks. So you can
tell where the particle is, and its path through space, but you have poor information on
when it is at a particular point. Can you reformulate Newton’s second law in such a way
as to nevertheless be able to get good information about the path the body follows?

Now we are in good old Euclidean geometry and Cartesian coordinates, so we can write
the distance travelled in space as

ds = √( d~x · d~x ) (6.100)

Can we find a differential equation for d~x/ds (instead of d~x/dt)? By using the chain rule

d²~x/dt² = (ds/dt) d/ds [ (ds/dt) d~x/ds ] (6.101)

which implies
d²~x/dt² = (ds/dt)² d²~x/ds² + (ds/dt) [ d(ds/dt)/ds ] d~x/ds (6.102)

That is
d²~x/dt² = (ds/dt)² d²~x/ds² + (1/2) d/ds [ (ds/dt)² ] d~x/ds (6.103)
Therefore, putting this into Newton’s second law
m (ds/dt)² d²~x/ds² = − ∂V (x)/∂~x − (1/2) m d/ds [ (ds/dt)² ] d~x/ds (6.104)

Now let’s simplify this a little.

Directly from Newton’s second law


m d²~x/dt² = − ∂V (x)/∂~x (6.105)
we have, taking the dot product of both sides by d~x/dt,

d2 ~x d~x ∂V (x) d~x


m 2
· =− · (6.106)
dt dt ∂~x dt
that is, using the chain rule,
"  2 #
d 1 d~x
m + V (x) = 0 (6.107)
dt 2 dt
or  2
1 d~x
m + V (x) = E (6.108)
2 dt
where E is simply a constant of integration (physicists call it the “energy”).
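As a sanity check of (6.108), here is a minimal numerical sketch (the quartic potential and all parameter values are my own illustrative choices, not from the notes): integrating $m\,\ddot{x} = -\partial V/\partial x$ with a standard RK4 stepper, the combination $\frac{1}{2}m\dot{x}^2 + V(x)$ should stay fixed along the trajectory.

```python
# Check that E = (1/2) m (dx/dt)^2 + V(x) is conserved along solutions of
# m x'' = -dV/dx.  The quartic potential is an arbitrary illustrative choice.

def V(x):    return 0.25 * x**4
def dVdx(x): return x**3

m = 2.0

def deriv(x, v):
    return v, -dVdx(x) / m                  # (dx/dt, dv/dt)

def rk4_step(x, v, h):
    k1x, k1v = deriv(x, v)
    k2x, k2v = deriv(x + 0.5*h*k1x, v + 0.5*h*k1v)
    k3x, k3v = deriv(x + 0.5*h*k2x, v + 0.5*h*k2v)
    k4x, k4v = deriv(x + h*k3x, v + h*k3v)
    return (x + h*(k1x + 2*k2x + 2*k3x + k4x) / 6,
            v + h*(k1v + 2*k2v + 2*k3v + k4v) / 6)

def energy(x, v):
    return 0.5 * m * v*v + V(x)

x, v = 1.0, 0.3
E0 = energy(x, v)
for _ in range(10000):
    x, v = rk4_step(x, v, 1e-3)
print(abs(energy(x, v) - E0))               # stays tiny: E is conserved
```

The drift is limited only by the integrator's truncation error, which is the numerical shadow of the exact conservation law derived above.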

But this means

$\left(\frac{d\vec{x}}{dt}\right)^2 = \left(\frac{ds}{dt}\right)^2 = \frac{2[E - V(x)]}{m}$    (6.109)

so that Newton's second law becomes (note that m drops out)

$2[E - V(x)]\,\frac{d^2\vec{x}}{ds^2} = -\frac{\partial V(x)}{\partial\vec{x}} - \frac{1}{2}\,\frac{d}{ds}\Big\{2[E - V(x)]\Big\}\,\frac{d\vec{x}}{ds}$    (6.110)

That is

$\frac{d^2\vec{x}}{ds^2} = -\frac{1}{2[E - V(x)]}\,\frac{\partial V(x)}{\partial\vec{x}} - \frac{1}{2}\,\frac{1}{2[E - V(x)]}\,\frac{d}{ds}\Big\{2[E - V(x)]\Big\}\,\frac{d\vec{x}}{ds}$    (6.111)

Alternatively

$\frac{d^2\vec{x}}{ds^2} = \frac{1}{2}\,\frac{1}{E - V(x)}\,\frac{\partial[E - V(x)]}{\partial\vec{x}} - \frac{1}{2}\,\frac{1}{E - V(x)}\,\frac{d[E - V(x)]}{ds}\,\frac{d\vec{x}}{ds}$    (6.112)

This can be rewritten in terms of a projection operator as

$\frac{d^2\vec{x}}{ds^2} = \frac{1}{2}\left(I - \frac{d\vec{x}}{ds}\otimes\frac{d\vec{x}}{ds}\right)\frac{\partial\ln[E - V(x)]}{\partial\vec{x}}$    (6.113)

Now this has done the job of completely removing “time” from the equation of motion.
You now have an equation strictly in terms of position x and distance along the path s
— “time” has been completely eliminated.
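The time-eliminated equation (6.113) can be spot-checked at a single phase-space point without solving any differential equation, since along a Newtonian trajectory $d^2\vec{x}/ds^2 = (\vec{a} - (\vec{a}\cdot\hat{T})\hat{T})/|\vec{v}|^2$ with $\hat{T} = \vec{v}/|\vec{v}|$ the unit tangent. The quadratic potential and sample point below are arbitrary choices of mine.

```python
# Spot check of the projection-operator form of the time-eliminated equation
# of motion, for the arbitrarily chosen potential V = (x^2 + 2 y^2)/2.
# Along a trajectory of m x'' = -grad V, with s the Euclidean arc length,
#   d^2 x/ds^2 = (a - (a.T) T) / |v|^2,   T = v/|v|,  a = -grad V/m,
# which should equal (1/2)(I - T (x) T) . grad ln[E - V].

def grad_V(x, y):
    return (x, 2.0 * y)

m = 1.5
x, y = 0.7, -0.4              # arbitrary position on the trajectory
vx, vy = 0.9, 0.2             # arbitrary velocity there

gVx, gVy = grad_V(x, y)
ax, ay = -gVx / m, -gVy / m
v2 = vx*vx + vy*vy
vn = v2 ** 0.5
Tx, Ty = vx / vn, vy / vn

# LHS: second derivative with respect to arc length
aT = ax*Tx + ay*Ty
lhs = ((ax - aT*Tx) / v2, (ay - aT*Ty) / v2)

# RHS: on shell E - V = (1/2) m |v|^2, so grad ln[E - V] = -grad V/(E - V)
EmV = 0.5 * m * v2
gx, gy = -gVx / EmV, -gVy / EmV
gT = gx*Tx + gy*Ty
rhs = (0.5 * (gx - gT*Tx), 0.5 * (gy - gT*Ty))

diff = max(abs(lhs[0] - rhs[0]), abs(lhs[1] - rhs[1]))
print(diff)                   # zero up to rounding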

But now let's go one step further and re-write this in terms of geometry — you should
not be too surprised to see a three-dimensional conformally flat geometry drop out.

Start with a conformally flat geometry with metric

$g_{ab} = \Omega^2(x)\,\delta_{ab}$    (6.114)

and note that the geodesic equations are (in arbitrary parameterization)

$\frac{d^2x^a}{d\lambda^2} + \Gamma^a{}_{bc}\,\frac{dx^b}{d\lambda}\frac{dx^c}{d\lambda} = f(\lambda)\,\frac{dx^a}{d\lambda}$    (6.115)

with (indices in this section are raised and lowered using the flat metric $\delta_{ab}$)

$\Gamma^a{}_{bc} = \Omega^{-1}\left\{\delta^a{}_b\,\Omega_{,c} + \delta^a{}_c\,\Omega_{,b} - \delta_{bc}\,\Omega^{,a}\right\}$    (6.116)
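Formula (6.116) is easy to check numerically. A minimal sketch (the conformal factor Ω = 1 + x² + y² and the sample point are my own test choices): build Γ^a_bc from the usual metric-derivative formula with central finite differences, and compare against the closed form.

```python
# Finite-difference check of the Christoffel symbols of g_ab = Omega^2 delta_ab
# in 2 dimensions, against the closed form
#   Gamma^a_bc = Omega^{-1} { delta^a_b Omega_,c + delta^a_c Omega_,b - delta_bc Omega^,a }.
# Omega = 1 + x^2 + y^2 is an arbitrary smooth test choice.

N = 2        # dimension
H = 1e-5     # finite-difference step

def Omega(p):
    x, y = p
    return 1.0 + x*x + y*y

def d(f, p, a):
    """Central-difference partial derivative of a scalar function f."""
    q1 = list(p); q1[a] += H
    q2 = list(p); q2[a] -= H
    return (f(q1) - f(q2)) / (2 * H)

def g_component(a, b):
    return lambda q: (Omega(q) ** 2 if a == b else 0.0)

def christoffel_fd(p):
    """Gamma^a_bc = (1/2) g^{ad} (g_db,c + g_dc,b - g_bc,d); the inverse
    metric is delta^{ad}/Omega^2, so only d = a contributes."""
    O2 = Omega(p) ** 2
    return [[[ (d(g_component(a, b), p, c) + d(g_component(a, c), p, b)
                - d(g_component(b, c), p, a)) / (2 * O2)
               for c in range(N)] for b in range(N)] for a in range(N)]

def christoffel_formula(p):
    O = Omega(p)
    dO = [d(Omega, p, a) for a in range(N)]
    return [[[ ((a == b) * dO[c] + (a == c) * dO[b] - (b == c) * dO[a]) / O
               for c in range(N)] for b in range(N)] for a in range(N)]

p = [0.4, -0.9]
G1, G2 = christoffel_fd(p), christoffel_formula(p)
err = max(abs(G1[a][b][c] - G2[a][b][c])
          for a in range(N) for b in range(N) for c in range(N))
print(err)      # small: limited only by the finite-difference step
```

The agreement is to the accuracy of the finite-difference stencil, which is all one can ask of a numerical check.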

Now if we choose our parameter λ to be arc-length s, as measured by the flat metric $\delta_{ab}$,
then

$\delta_{ab}\,\frac{dx^a}{ds}\frac{dx^b}{ds} = 1$    (6.117)

and

$\delta_{ab}\,\frac{d^2x^a}{ds^2}\frac{dx^b}{ds} = \frac{1}{2}\,\frac{d}{ds}\left(\delta_{ab}\,\frac{dx^a}{ds}\frac{dx^b}{ds}\right) = 0$    (6.118)

so that

$f(s) = \Gamma^a{}_{bc}\,\frac{dx_a}{ds}\frac{dx^b}{ds}\frac{dx^c}{ds} = [\ln\Omega]_{,a}\,\frac{dx^a}{ds}$    (6.119)

and consequently

$\frac{d^2x^a}{ds^2} - \left(\delta^{ab} - \frac{dx^a}{ds}\frac{dx^b}{ds}\right)\partial_b\ln\Omega = 0$    (6.120)

Therefore

$\frac{d^2x^a}{ds^2} = \left(\delta^{ab} - \frac{dx^a}{ds}\frac{dx^b}{ds}\right)\partial_b\ln\Omega$    (6.121)

This is exactly the form of the equations previously derived for the path of a particle
subjected to Newton's second law provided we identify

$\Omega = \sqrt{E - V(x)}$    (6.122)

That is: paths of particles subject to Newton’s second law follow geodesics of the confor-
mally flat geometry defined by

gab = [E − V (x)] δab (6.123)

If we denote “length” as measured by this conformal metric as ℓ then we have

$d\ell^2 = [E - V(x)]\,ds^2$    (6.124)

While we have completely eliminated time from the equation for the paths there is a price
— you now need to consider a separate geometry for each value of the energy.

Furthermore

$d\ell = \sqrt{E - V(x)}\;ds = \sqrt{\frac{m}{2}}\,\sqrt{\frac{2[E - V(x)]}{m}}\;ds = \sqrt{\frac{m}{2}}\,\frac{ds}{dt}\,ds = \frac{1}{\sqrt{2m}}\,\vec{p}\cdot d\vec{x}$    (6.125)
So minimizing $\ell = \int d\ell$, which leads to the geodesic equations and thence to Newton's
second law, is also equivalent to minimizing

$W[a,b] = \int_a^b \vec{p}\cdot d\vec{x}$    (6.126)

subject to the “energy conservation equation”. But this is exactly Maupertuis’ (constant
energy) variational principle, which we now see is equivalent to minimizing the arc-length
` in the conformal geometry.
Comment: [If you have a little physics background] Note that $W[a,b] = \int_a^b p\;ds$ is the
quantity that shows up in the phase of the WKB approximation; this is not an accident
and ultimately can be traced back to the fact that quantum physics (via Feynman's sum
over histories approach) can be viewed as classical mechanics plus fluctuations. The
classical paths come from the saddle points in the path integral, and saddle points can be
located using the calculus of variations. ♦

Exercise: [If you have a little physics background] Since Maupertuis’ (constant energy)
variational principle is not the most common of the variational principles, let’s rewrite
this in terms of the Hamiltonian and the Lagrangian. Re-introduce time and write
$\int_a^b d\ell = \int_a^b \sqrt{E - V(x)}\;ds = \int_a^b \sqrt{E - V(x)}\,\frac{ds}{dt}\,dt = \int_a^b [E - V(x)]\,dt$    (6.127)

(in the last step an overall constant factor of $\sqrt{2/m}$ has been dropped, since it cannot
affect the extremization). But now, for the sort of non-relativistic kinetic energy terms we
have been assuming, $L = \frac{1}{2}m(d\vec{x}/dt)^2 - V(x)$ and $H = p^2/(2m) + V(x) \to \frac{1}{2}m(d\vec{x}/dt)^2 + V(x)$.
Combining the Lagrangian and Hamiltonian gives

$V(x) = \frac{H - L}{2} \to \frac{E - L}{2}$    (6.128)

on the constant-energy surface. Thus

$\int_a^b d\ell = \int_a^b \frac{E + L}{2}\,dt = \frac{1}{2}\left[E\,T + \int_a^b L\,dt\right]$    (6.129)

where T is the total elapsed time. So extremizing $\int d\ell$ is equivalent to extremizing
$\int L\,dt$, which was the variational principle we originally started with. ♦

One message to take from all of this is that physics applications of differential geometry
are not solely restricted to general relativity. Differential geometry is a much more basic
and fundamental tool.

Exercise: For N coupled particles of mass $m_i$ (i ∈ [1..N]) Newton's second law becomes
the system of equations

$m_i\,\frac{d^2\vec{x}_i}{dt^2} = -\frac{\partial V(\vec{x}_1,\ldots,\vec{x}_N)}{\partial\vec{x}_i}$    (6.130)

Show that the paths swept out by this system of ODEs are geodesics in a conformally flat
3N dimensional space with metric

$d\ell^2 = \{E - V(\vec{x}_1,\ldots,\vec{x}_N)\}\left\{\sum_{i=1}^N m_i\,ds_i^2\right\}$    (6.131)

where $ds_i$ is ordinary physical distance for the i'th particle. Note that we now need to
keep track of the individual particles' masses. ♦

Exercise: [A nice little project; it is open ended] One of the most famous problems of
analytical mechanics is the “three-body problem” of celestial mechanics. Suppose three
bodies attract each other according to Newtonian gravity. Then

$m_1\,\frac{d^2\vec{x}_1}{dt^2} = G\,\frac{m_1 m_2\,\hat{n}_{12}}{|\vec{x}_1 - \vec{x}_2|^2} + G\,\frac{m_1 m_3\,\hat{n}_{13}}{|\vec{x}_1 - \vec{x}_3|^2}$    (6.132)

$m_2\,\frac{d^2\vec{x}_2}{dt^2} = G\,\frac{m_2 m_3\,\hat{n}_{23}}{|\vec{x}_2 - \vec{x}_3|^2} + G\,\frac{m_2 m_1\,\hat{n}_{21}}{|\vec{x}_2 - \vec{x}_1|^2}$    (6.133)

$m_3\,\frac{d^2\vec{x}_3}{dt^2} = G\,\frac{m_3 m_1\,\hat{n}_{31}}{|\vec{x}_3 - \vec{x}_1|^2} + G\,\frac{m_3 m_2\,\hat{n}_{32}}{|\vec{x}_3 - \vec{x}_2|^2}$    (6.134)

Show that (for fixed energy E) the three body problem is equivalent to geodesic motion
in a nine dimensional conformally flat manifold with metric

$d\ell^2 = \left\{E + \frac{G}{2}\sum_{i\neq j}\frac{m_i m_j}{|\vec{x}_i - \vec{x}_j|}\right\}\left\{\sum_{i=1}^3 m_i\,ds_i^2\right\}$    (6.135)

Now this nine-dimensional representation is redundant. You can wlog go to the center of
mass frame so that
m1 ~x1 + m2 ~x2 + m3 ~x3 = ~0 (6.136)
This lets you eliminate one of the ~x’s, say ~x3 .

Show that (for fixed energy E) the three body problem is equivalent to geodesic motion
in a six dimensional conformally flat manifold and find the corresponding metric.

Are there any further reductions you can make? ♦

Exercise: Is there anything you can say for a general Lagrangian L(ẋ, x) that does not
conveniently separate into kinetic energy and potential energy contributions? ♦
Chapter 7

Exterior derivatives

The “exterior derivative” is another way of producing covariant objects from derivatives
of a tensor; this time without invoking any connexion [or parallel transport] at all. Of
course there is a different price to pay.

7.1 Partial derivatives are not tensors

Suppose $V_a$ are the components of a covector, then under a change of chart

$\bar{V}_a = \frac{\partial x^b}{\partial\bar{x}^a}\,V_b$    (7.1)

Take partial derivatives

$\partial_{\bar{c}}\bar{V}_a = \partial_{\bar{c}}\left(\frac{\partial x^b}{\partial\bar{x}^a}\,V_b\right)$    (7.2)

$= \left(\partial_{\bar{c}}\,\frac{\partial x^b}{\partial\bar{x}^a}\right)V_b + \frac{\partial x^b}{\partial\bar{x}^a}\,\partial_{\bar{c}}V_b$    (7.3)

$= \frac{\partial^2 x^b}{\partial\bar{x}^c\,\partial\bar{x}^a}\,V_b + \frac{\partial x^b}{\partial\bar{x}^a}\,\frac{\partial x^d}{\partial\bar{x}^c}\,\partial_d V_b$    (7.4)

If only the last term were present, then we would have a $T^0_2$ tensor. Unfortunately the
presence of the first term destroys the tensorial properties of the partial derivative ∂V —
unless, that is, you are working in Cartesian coordinates and are only interested in changing
coordinates to another Cartesian system.

In previous chapters we discussed the notion of the covariant derivative (affine connex-
ion) in considerable detail. Now we will look at a different route to obtaining a covariant


object. Note that the leading term, which gives all the problems, is symmetric in (ac).
So if we take the antisymmetric part of both sides of the equation above we have

$\partial_{[\bar{c}}V_{\bar{a}]} = \frac{\partial x^b}{\partial\bar{x}^a}\,\frac{\partial x^d}{\partial\bar{x}^c}\,\partial_{[d}V_{b]}$    (7.5)

And this is a tensor equation. This is the second of the two standard routes to defining a
sensible notion of differentiation on tensors. This time we will not alter the definition of
differentiation (essentially that was the role of the affine connexion, to apply a “correction”
to the partial derivative to obtain a covariant derivative). Instead what we will now do
is to restrict the class of objects we want to differentiate. This is the central idea behind
the “exterior differential calculus”.

So far we have seen that for a scalar φ and a covector Va the quantities

φ,a and V[a,b] (7.6)

transform as tensors. How can we generalize this? Let’s start by looking at a tensor that
is completely covariant (all indices down) so that

$X_{abc\ldots} \to \bar{X}_{abc\ldots} = \frac{\partial x^a}{\partial\bar{x}^a}\,\frac{\partial x^b}{\partial\bar{x}^b}\,\frac{\partial x^c}{\partial\bar{x}^c}\cdots X_{abc\ldots}$    (7.7)

and differentiate

$\partial_{\bar{z}}\bar{X}_{abc\ldots} = \frac{\partial x^a}{\partial\bar{x}^a}\,\frac{\partial x^b}{\partial\bar{x}^b}\,\frac{\partial x^c}{\partial\bar{x}^c}\cdots\frac{\partial x^z}{\partial\bar{x}^z}\,\partial_z X_{abc\ldots}$    (7.8)

$\quad + \frac{\partial^2 x^a}{\partial\bar{x}^z\,\partial\bar{x}^a}\,\frac{\partial x^b}{\partial\bar{x}^b}\,\frac{\partial x^c}{\partial\bar{x}^c}\cdots X_{abc\ldots}$    (7.9)

$\quad + \frac{\partial x^a}{\partial\bar{x}^a}\,\frac{\partial^2 x^b}{\partial\bar{x}^z\,\partial\bar{x}^b}\,\frac{\partial x^c}{\partial\bar{x}^c}\cdots X_{abc\ldots}$    (7.10)

$\quad + \frac{\partial x^a}{\partial\bar{x}^a}\,\frac{\partial x^b}{\partial\bar{x}^b}\,\frac{\partial^2 x^c}{\partial\bar{x}^z\,\partial\bar{x}^c}\cdots X_{abc\ldots} + \ldots$    (7.11)
To get rid of all the unwanted terms you must anti-symmetrize on za, and on zb, and on
zc, etc. But if you are anti-symmetrizing on all these pairs, it implies anti-symmetrization
on all indices.

If we anti-symmetrize on all indices then

$\partial_{[\bar{z}}\bar{X}_{abc\ldots]} = \frac{\partial x^a}{\partial\bar{x}^a}\,\frac{\partial x^b}{\partial\bar{x}^b}\,\frac{\partial x^c}{\partial\bar{x}^c}\cdots\frac{\partial x^z}{\partial\bar{x}^z}\,\partial_{[z}X_{abc\ldots]}$    (7.12)

or equivalently

$\bar{X}_{[abc\ldots,z]} = \frac{\partial x^a}{\partial\bar{x}^a}\,\frac{\partial x^b}{\partial\bar{x}^b}\,\frac{\partial x^c}{\partial\bar{x}^c}\cdots\frac{\partial x^z}{\partial\bar{x}^z}\,X_{[abc\ldots,z]}$    (7.13)

That is, completely anti-symmetrized partial derivatives of fully covariant tensors trans-
form as tensors. The beauty here is that you do not have to introduce any affine con-
nexion, the differentiation is purely one of partial differentiation. The drawback is that
you are only allowed to use this trick for a rather restricted class of tensors — fully anti-
symmetrized covariant tensors.

Still this technique is of sufficient importance that a whole terminology and technology
has grown up around it.

Exercise: Show that for a symmetric connexion (in particular the Christoffel connexion)
X[abc...;z] = X[abc...,z] (7.14)
so that the connexion actually “drops out” completely. ♦

Exercise: Show that for a connexion with torsion

$X_{[abc\ldots;z]} \neq X_{[abc\ldots,z]}$    (7.15)
and evaluate the extra contributions. ♦

7.2 Exterior differential forms

Definition 52 An “s-form” is a completely antisymmetric tensor of type $T^0_s$.

• 0-forms [zero-forms] are scalars.


• 1-forms [one-forms] are covectors.
• 2-forms [two-forms] are antisymmetric in two covariant indices.
• n-forms are completely antisymmetric in n covariant indices.
• (n + 1)-forms and higher are all zero. Why?

Notation: A generic s-form will be denoted F . ♦

Definition 53 The exterior derivative, denoted d, maps s-forms onto (s + 1)-forms ac-
cording to the rule

$\{dF\}_{a_1 a_2\ldots a_s a_{s+1}} = (s+1)\,\partial_{[a_1}F_{a_2 a_3\ldots a_s a_{s+1}]}$    (7.16)
$\qquad = (s+1)\,F_{[a_2 a_3\ldots a_s a_{s+1},a_1]}$    (7.17)
$\qquad = (-1)^s\,(s+1)\,F_{[a_1 a_2\ldots a_s,a_{s+1}]}$    (7.18)

Putting the factor (s + 1) here is purely a matter of convention — it maximizes agreement


with the notation of other books. Likewise with the sign (−1)s .

Exercise: These three definitions are really equivalent — check! ♦

Warning: The symbol d is really over-used in differential geometry. It sometimes means


you should differentiate something, it sometimes means you should integrate something,
it sometimes refers to an exterior derivative, and it sometimes just refers to a “small
change” in some quantity. Unfortunately, the notation is standard, you will have to learn
to live with it. ♦

Lemma 21 The exterior derivative when applied twice to any s-form always gives zero.
That is
d2 F = 0 (7.19)

From the definition you know that d2 maps s-forms into (s + 2)-forms, but the lemma
above tells you that the (s + 2)-form in question is always equal to zero. Why? (It should
be obvious.)

Definition 54 The exterior product, denoted ∧, is a bilinear product on forms that maps
an s₁-form and an s₂-form onto an (s₁ + s₂)-form according to the rule

$\{F_1\wedge F_2\}_{a_1 a_2\ldots a_{s_1} b_1 b_2\ldots b_{s_2}} = \frac{(s_1+s_2)!}{s_1!\,s_2!}\,\{F_1\}_{[a_1 a_2\ldots a_{s_1}}\{F_2\}_{b_1 b_2\ldots b_{s_2}]}$    (7.20)

Putting the binomial factor here is purely a matter of convention — it maximizes
agreement with the notation of other books.

Lemma 22

$F_1\wedge F_2 = (-1)^{s_1 s_2}\,F_2\wedge F_1$    (7.21)

The proof is, or should be, obvious.
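The convention (7.20) is easy to implement directly on components. The sketch below (my own implementation, with arbitrary sample components) checks the graded commutativity of Lemma 22 for two 1-forms in n = 3, where $(-1)^{s_1 s_2} = -1$.

```python
# Component-level wedge product using the convention of (7.20), and a check
# of Lemma 22 for two 1-forms in n = 3 (so F1 ^ F2 = -F2 ^ F1).
from itertools import permutations, product
from math import factorial

n = 3

def parity(perm):
    """+1 for even permutations of 0..k-1, -1 for odd."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

def antisymmetrize(T, k):
    """T maps k-index tuples to numbers; return its antisymmetric part."""
    out = {}
    for idx in product(range(n), repeat=k):
        out[idx] = sum(parity(p) * T[tuple(idx[i] for i in p)]
                       for p in permutations(range(k))) / factorial(k)
    return out

def wedge(F1, s1, F2, s2):
    outer = {i1 + i2: F1[i1] * F2[i2]
             for i1 in product(range(n), repeat=s1)
             for i2 in product(range(n), repeat=s2)}
    coeff = factorial(s1 + s2) / (factorial(s1) * factorial(s2))
    return {idx: coeff * v for idx, v in antisymmetrize(outer, s1 + s2).items()}

g1 = {(0,): 1.0, (1,): -2.0, (2,): 0.5}     # arbitrary 1-form components
g2 = {(0,): 3.0, (1,): 0.25, (2,): -1.0}

w12 = wedge(g1, 1, g2, 1)
w21 = wedge(g2, 1, g1, 1)
print(all(abs(w12[i] + w21[i]) < 1e-12      # (-1)^{s1 s2} = -1 here
          for i in product(range(n), repeat=2)))
```

With these conventions $\{g_1\wedge g_2\}_{ab} = \{g_1\}_a\{g_2\}_b - \{g_1\}_b\{g_2\}_a$, exactly the index-free formula $g_1\wedge g_2 = g_1\otimes g_2 - g_2\otimes g_1$ given later in this chapter.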

7.3 “Index free” notation

Now is a suitable time to introduce some “index free” notation, though it should more
properly be called “limited index” notation. Dealing with exterior differential forms and

the exterior derivative is the one place where “index free” methods are clearly worth the
trouble.

First we step back to the idea of a contravariant vector and note that there is a natural
isomorphism between vectors and directional derivatives. If $t^a$ are the components of a
contravariant vector then we define

$t^a \;\longleftrightarrow\; t = t^a\,\frac{\partial}{\partial x^a}$    (7.22)
note that because the partial derivatives transform in the same way as a covariant vector
the RHS is a coordinate invariant. Abstract mathematicians are likely to say that the
coordinate invariant t is the primary definition of a vector. Physicists generally take the
component based definition as primary. Fortunately all roads lead to Rome...

Warning: You will often see statements to the effect that the partial derivatives
$\partial_a = \partial/\partial x^a$ are “vectors”. In an abstract sense this is quite true, they are linear differential
operators and they do form a basis for the tangent space $T = T^1_0$. Under a change of
chart the chain rule gives

$\frac{\partial}{\partial\bar{x}^a} = \frac{\partial x^b}{\partial\bar{x}^a}\,\frac{\partial}{\partial x^b}$    (7.23)
So that this set of basis vectors transforms in a manner opposite to the components ta ,
which then makes the combination t = ta ∂a a coordinate invariant. Even though it is a
coordinate invariant I do not want to call this a scalar as that has additional connota-
tions... ♦

If we now have a vector field defined by components $t^a(x)$ then we can define a field of
linear differential operators

$t^a(x) \;\longleftrightarrow\; t(x) = t^a(x)\,\frac{\partial}{\partial x^a}$    (7.24)

This now lets you do things that at first glance are really perverse — like calculate the
“commutator” of two vector fields. Let t₁ and t₂ be two vector fields and define

$[t_1, t_2] = t_1 t_2 - t_2 t_1 = t_1^a(x)\,\frac{\partial}{\partial x^a}\,t_2^b(x)\,\frac{\partial}{\partial x^b} - t_2^b(x)\,\frac{\partial}{\partial x^b}\,t_1^a(x)\,\frac{\partial}{\partial x^a}$    (7.25)

$= \{t_1^a\,t_2^b - t_2^a\,t_1^b\}\,\frac{\partial^2}{\partial x^a\,\partial x^b} + \{t_1^a\,\partial_a t_2^b - t_2^a\,\partial_a t_1^b\}\,\frac{\partial}{\partial x^b}$    (7.26)

$= \{t_1^a\,\partial_a t_2^b - t_2^a\,\partial_a t_1^b\}\,\frac{\partial}{\partial x^b}$    (7.27)

(the second-derivative terms cancel by the symmetry of mixed partial derivatives). That is

$[t_1, t_2]^b = t_1^a\,\partial_a t_2^b - t_2^a\,\partial_a t_1^b$    (7.28)
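The cancellation of the second-derivative terms can be watched happening numerically. The sketch below (my own choice of polynomial vector fields and test function on ℝ²) composes the two operators with nested central differences and compares against the first-order formula (7.28).

```python
# Numerical check, in 2 dimensions, that [t1, t2] = t1 t2 - t2 t1 is again a
# first-order operator with the components of (7.28).  The vector fields and
# the test function are arbitrary smooth choices.

H = 1e-4

def d(f, p, a):
    q1 = list(p); q1[a] += H
    q2 = list(p); q2[a] -= H
    return (f(q1) - f(q2)) / (2 * H)

def t1(p): x, y = p; return (y, x * x)
def t2(p): x, y = p; return (x * y, 1.0)
def f(p):  x, y = p; return x**3 + x * y

def apply_field(t, func):
    """The directional-derivative operator t = t^a d/dx^a acting on func."""
    return lambda p: sum(t(p)[a] * d(func, p, a) for a in range(2))

p = [0.3, -0.7]

# left: compose the operators, ( t1 t2 - t2 t1 ) f
left = (apply_field(t1, apply_field(t2, f))(p)
        - apply_field(t2, apply_field(t1, f))(p))

# right: the claimed first-order operator (7.28) acting on f
bracket = [sum(t1(p)[a] * d(lambda q, b=b: t2(q)[b], p, a)
               - t2(p)[a] * d(lambda q, b=b: t1(q)[b], p, a)
               for a in range(2))
           for b in range(2)]
right = sum(bracket[b] * d(f, p, b) for b in range(2))

print(abs(left - right))    # small: the second-derivative terms cancel
```

Even though each composed operator individually involves second derivatives of f, the difference agrees with a purely first-order operator, which is the content of Lemma 23 below.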

Lemma 23 The quantities $[t_1, t_2]^b = t_1^a\,\partial_a t_2^b - t_2^a\,\partial_a t_1^b$ transform as the components of a
contravariant vector (a $T^1_0$ tensor).

Proof: Since the ta1 transform as the components of a contravariant vector t1 is coordinate
independent.

Since the ta2 transform as the components of a contravariant vector t2 is coordinate


independent.

Therefore [t1 , t2 ] is coordinate independent.

But [t1 , t2 ] is by construction also a linear differential operator (a directional derivative),


therefore its components [t1 , t2 ]b transform as a contravariant vector. QED

Exercise: Provide a “low-brow” proof of the same statement by explicitly making the
change of coordinates

$[t_1, t_2]^{\bar{b}} = t_1^{\bar{a}}\,\partial_{\bar{a}}\,t_2^{\bar{b}} - t_2^{\bar{a}}\,\partial_{\bar{a}}\,t_1^{\bar{b}}$    (7.29)

and verifying

$[t_1, t_2]^{\bar{b}} = \frac{\partial\bar{x}^b}{\partial x^b}\,[t_1, t_2]^b$    (7.30)

Hint: What happens to the $\partial^2\bar{x}/(\partial x)^2$ term? ♦

We can now use this to give a “coordinate free” definition of the exterior derivative of a
0-form f. Recall that $T^0_1$ tensors are elements of the co-tangent space T*, so they can be
thought of as linear mappings from the tangent space T into IR. That is we can define the
covariant vector g by its action g(t) on vectors t; as long as g(t) is linear in t we can use
the isomorphism $t^a \leftrightarrow t = t^a\,\partial/\partial x^a$ to re-write t in terms of its components $t^a$ and thereby
deduce the existence of components $g_a$ such that

$g(t) = g_a\,t^a$    (7.31)

Definition 55 If f is a 0-form [scalar] then df is the 1-form uniquely defined by its


action on arbitrary tangent vectors t by

df (t) = tf (7.32)

Unwrapping the index free notation, on the one hand we have

$df(t) = \{df\}_a\,t^a$    (7.33)

while on the other hand

$tf = t^a\,\partial_a f$    (7.34)

whence, since t is arbitrary, and so its components are arbitrary,

$\{df\}_a = \partial_a f = f_{,a}$    (7.35)

Now let’s generalize this construction... First, pick a particular coordinate chart, and
let xa (p) be the coordinates in this chart. Then we can certainly define a set of n 1-forms
dxa . Using these particular 1-forms as a basis for T ∗ define a generic 1-form as

g = ga dxa . (7.36)

where the $g_a$ are the components of a $T^0_1$ tensor — the g defined this way is independent of
the choice of coordinate system.

Warning: The basis one-forms dxa manifestly do depend on the coordinate chart. ♦

Exercise: Show that under a change of chart

$d\bar{x}^a = \frac{\partial\bar{x}^a}{\partial x^b}\,dx^b$    (7.37)
∂x

Now step back a little: When we defined Tsr tensors we phrased the definition in terms
of the “outer product” or “Cartesian product” of vectors in the tangent and cotangent
spaces. I just want to remind you that we had considered the “tensor product”

$\otimes : T^{r_1}_{s_1} \otimes T^{r_2}_{s_2} \to T^{r_1+r_2}_{s_1+s_2}$    (7.38)

and in particular that for two 1-forms [covariant vectors] we had already decided that
things like
g1 ⊗ g 2 (7.39)
make sense — it’s an element of $T^0_2$.

This now lets us define the exterior product of two generic 1-forms as

g1 ∧ g 2 = g 1 ⊗ g 2 − g 2 ⊗ g 1 (7.40)

note the absence of any 1/2! (This is convention, but a useful convention.) Furthermore
for three one-forms

$g_1\wedge g_2\wedge g_3 = g_1\otimes g_2\otimes g_3 + g_2\otimes g_3\otimes g_1 + g_3\otimes g_1\otimes g_2 - g_3\otimes g_2\otimes g_1 - g_2\otimes g_1\otimes g_3 - g_1\otimes g_3\otimes g_2$    (7.41)

note the absence of any 1/3!, with the sum being over all six (3!) signed permutations of
the labels 1, 2, 3. More generally

$\wedge_{i=1}^N\,g_i = \sum_\pi \mathrm{signum}(\pi)\,\otimes_{i=1}^N\,g_{\pi(i)}$    (7.42)

where the sum is over all permutations π of the labels [1,N] — these are just labels, not
coordinate indices. Here signum(π) is +1 if π is an even permutation and −1 if π is an
odd permutation. In particular this lets us use $dx^a$, $dx^a\wedge dx^b$, $dx^a\wedge dx^b\wedge dx^c$, ... as a
basis for the 1, 2, 3-forms. So we can write

$F = F_{a_1 a_2\ldots a_s}\,dx^{a_1}\wedge dx^{a_2}\wedge\ldots\wedge dx^{a_s}$    (7.43)

$d = dx^a\,\partial_a$    (7.44)
and then in “index free” notation
dF = d ∧ F (7.45)
so that
ddF = d ∧ d ∧ F = 0 (7.46)
or more formally
d2 = d ∧ d = 0 (7.47)
To see that this is equivalent to the index-based notation previously given, note that in
component notation, the “outer product” [also called “tensor product”] of two tensors of
type $T^{r_1}_{s_1}$ and $T^{r_2}_{s_2}$ is a tensor of type $T^{r_1+r_2}_{s_1+s_2}$ defined by

$\{X_1\otimes X_2\}^{a_1 a_2\ldots a_{r_1} b_1 b_2\ldots b_{r_2}}{}_{c_1 c_2\ldots c_{s_1} d_1 d_2\ldots d_{s_2}} = \{X_1\}^{a_1 a_2\ldots a_{r_1}}{}_{c_1 c_2\ldots c_{s_1}}\,\{X_2\}^{b_1 b_2\ldots b_{r_2}}{}_{d_1 d_2\ldots d_{s_2}}$    (7.48)
1 2 2

Exercise: Verify that


dF = d ∧ F (7.49)
is compatible with the previous component definition. Keep careful track of the signs
arising from permuting both the components and the basis one-forms dxa . ♦
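A direct component implementation of the definition (7.16) makes Lemma 21 concrete. The sketch below (my own illustrative setup) computes partial derivatives by central finite differences, so that mixed partials commute to machine precision and d(df) vanishes numerically.

```python
# Component implementation of the exterior derivative (7.16) using central
# finite differences, and a check of Lemma 21 (d(dF) = 0) starting from a
# 0-form.  The scalar f below is an arbitrary smooth choice.
from itertools import permutations, product
from math import factorial

n, H = 3, 1e-2

def d_partial(F, p, a):
    q1 = list(p); q1[a] += H
    q2 = list(p); q2[a] -= H
    return (F(q1) - F(q2)) / (2 * H)

def parity(perm):
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

def ext_d(F, s):
    """F : point -> dict over s-index tuples; returns the analogous map for
    dF, with {dF}_{a0 a1...as} = (s+1) partial_[a0 F_{a1...as]}."""
    def dF(p):
        raw = {idx: d_partial(lambda q: F(q)[idx[1:]], p, idx[0])
               for idx in product(range(n), repeat=s + 1)}
        # (s+1) times the antisymmetrization = signed sum over perms / s!
        return {idx: sum(parity(pm) * raw[tuple(idx[k] for k in pm)]
                         for pm in permutations(range(s + 1))) / factorial(s)
                for idx in product(range(n), repeat=s + 1)}
    return dF

f   = lambda p: {(): p[0] * p[1] + p[2] ** 2}   # a 0-form (scalar field)
df  = ext_d(f, 0)
ddf = ext_d(df, 1)

worst = max(abs(v) for v in ddf([0.2, 0.5, -0.3]).values())
print(worst)    # essentially zero: d^2 = 0
```

The result is tiny not because of fine tuning but because nested central differences are exactly symmetric in the two differentiation directions, mirroring the symmetry of mixed partials that underlies Lemma 21.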

Exercise: Show that

$d(F_1\wedge F_2) = (dF_1)\wedge F_2 + (-1)^{s_1}\,F_1\wedge (dF_2)$    (7.50)



7.4 Closed and exact forms

Definition 56 An s-form F is said to be closed iff dF = 0.

Definition 57 An s-form F is said to be exact iff there exists an (s − 1)-form A such


that F = dA.

We have already seen that all exact forms are closed.

Exercise: Show that the exterior product of two closed forms is closed. ♦

Are all closed forms exact? Locally yes; in a topologically trivial region all closed forms
are exact, but if the topology is nontrivial there may be obstructions. This leads you into
the theory of de Rham cohomology.

Define an equivalence relation on closed forms by

F1 ∼ F 2 ⇐⇒ F1 − F2 is exact (7.51)

Exercise: Show that this binary relation ∼ really is an equivalence relation. ♦

Definition 58
Hs = {closed s-forms}/ ∼ (7.52)
is the s’th de Rham cohomology class.

Exercise: Show that each Hs is a vector space. ♦

Definition 59 Bs = dim[Hs ] is the sth Betti number.

Exercise: Show that B₀ = 1 and Bₙ = 1. [Easy] ♦

Exercise: Show that B₁ equals the number of topologically distinct [independent] non-
contractible closed loops. [Not so easy, but still “elementary”] ♦

Hint: A 1-form g is closed if dg = 0, which implies $g_{[a,b]} = 0$. A 1-form g is exact if
g = df, which implies $g_a = \partial_a f$. When does

$g_{[a,b]} = 0 \;\Rightarrow\; g_a = \partial_a f\,?$    (7.53)

You have already seen problems like this in Euclidean space. In 3-dimensional Euclidean
space the analogous question is: When does

$\vec{\nabla}\times\vec{A} = 0 \;\Rightarrow\; \vec{A} = \vec{\nabla}\Phi\,?$    (7.54)

When is a curl free vector field the gradient of a scalar? ♦
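The standard concrete example behind this hint is the angular 1-form on the punctured plane. A small numerical sketch (my own choices of sample point and loop) shows it is closed yet has a nonzero loop integral, so it represents a nontrivial de Rham class.

```python
# The 1-form g = (-y dx + x dy)/(x^2 + y^2) on the punctured plane is closed
# but not exact: g_[a,b] = 0 everywhere it is defined, yet its line integral
# around a loop enclosing the puncture is 2*pi rather than 0.
from math import cos, sin, pi

def g(x, y):
    r2 = x * x + y * y
    return (-y / r2, x / r2)

# closedness at a sample point, via central differences
H = 1e-6
x0, y0 = 0.8, -0.6
curl = ((g(x0 + H, y0)[1] - g(x0 - H, y0)[1]) / (2 * H)
        - (g(x0, y0 + H)[0] - g(x0, y0 - H)[0]) / (2 * H))

# line integral around the unit circle (which encloses the puncture)
N = 10000
total = 0.0
for k in range(N):
    th = 2 * pi * k / N
    gx, gy = g(cos(th), sin(th))
    total += gx * (-sin(th)) * (2 * pi / N) + gy * cos(th) * (2 * pi / N)

print(curl)     # ~ 0    : the form is closed
print(total)    # ~ 2*pi : so g cannot be df on the whole punctured plane
```

If g were exact, g = df, every closed loop integral would vanish; the value 2π counts how many times the loop winds around the missing point, which is exactly the B₁ = 1 of the punctured plane.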

7.5 Integration of forms

Finally we need to be able to say something about how to integrate on a manifold. If n is


the dimension of the manifold then the set of n-forms is a one dimensional vector space
(why?) and we have that for any n-form there exists some f (x)

ω = f (x) dx1 ∧ dx2 ∧ · · · ∧ dxn (7.55)

Now f(x) is not a scalar since

$d\bar{x}^1\wedge d\bar{x}^2\wedge\cdots\wedge d\bar{x}^n = \det\left[\frac{\partial\bar{x}^a}{\partial x^a}\right]\,dx^1\wedge dx^2\wedge\cdots\wedge dx^n$    (7.56)

so that

$f(x) \to \bar{f}(x) = \det\left[\frac{\partial\bar{x}^a}{\partial x^a}\right]^{-1}\,f(x)$    (7.57)
This means that f (x) is a scalar density. To easily check the plausibility of the assertion,
consider a diagonal matrix ∂xā /∂xa corresponding to a simple rescaling of coordinates.

Exercise: Prove this assertion in general using properties of the determinant and the
fact that the n-fold exterior product is completely antisymmetric. ♦

Exercise: Prove that


f (x) = ω123...n (7.58)
That is
ω = ω123...n dx1 ∧ dx2 ∧ · · · ∧ dxn (7.59)


Now define the integral of an n-form over an n-dimensional region of a manifold by

$\int_\Omega \omega = \int_\Omega f(x)\,d^n x$    (7.60)

Note that while f(x) is not a scalar, the combination $f(x)\,d^n x$ is coordinate independent.
Here $d^n x$ is the ordinary Riemann measure (or Lebesgue measure) on IRⁿ. The definition
as given above makes sense only if Ω is a topologically trivial region that can be contained
within a single coordinate chart — but that is not really a significant restriction. If Ω is not
topologically trivial simply split it up into a number of regions Ωᵢ that are topologically
trivial and define

$\int_\Omega \omega = \sum_{\{i\}} \int_{\Omega_i} \omega$    (7.61)

If you want even more technical precision you can set up a “subordinate partition of
unity” using the assumed second countability (or equivalently, paracompactness) of the
underlying manifold.

Exercise: Prove that

$\int \omega = \int \omega_{123\ldots n}(x)\,d^n x$    (7.62)

We are now ready for an important theorem.

Theorem 26 (Stokes theorem) Let Ω be an n-dimensional region with (n−1)-dimensional
boundary ∂Ω, and let ω be an (n − 1)-form. Then

$\int_\Omega d\omega = \int_{\partial\Omega} \omega$    (7.63)

Proof: Note that the LHS integrates an n-form over a n-dimensional region while the
RHS integrates a (n − 1)-form over a (n − 1)-dimensional region.

Let’s work for now with an n-dimensional rectangular region $\Omega \sim \prod_i [A_i, B_i]$. Then in
components

$\{d\omega\}_{a_1 a_2\ldots a_n} = n\,\omega_{[a_1 a_2\ldots a_{n-1},a_n]} = \sum_{j=1}^n \partial_{a_j}\,\omega_{a_{j+1}\ldots a_n a_1\ldots a_{j-1}}$    (7.64)

Then the appropriate f(x) is

$f(x) = \{d\omega\}_{123\ldots n} = \sum_{j=1}^n \partial_j\,\omega_{(j+1)\ldots(n-1)\,n\,1\,2\ldots(j-1)}$    (7.65)

and

$\int_\Omega d\omega = \int_\Omega f(x)\,d^n x = \int_\Omega \sum_{j=1}^n \partial_j\,\omega_{(j+1)\ldots n\,1\ldots(j-1)}\,d^n x$    (7.66)

But in each of these n terms we can do the $dx^j$ integral explicitly

$\int_\Omega d\omega = \sum_{j=1}^n \int \left[\omega(B_j)_{(j+1)\ldots n\,1\,2\ldots(j-1)} - \omega(A_j)_{(j+1)\ldots n\,1\,2\ldots(j-1)}\right]\,d^{n-1}x$    (7.67)

Each one of the terms on the RHS consists of an (n − 1)-form being integrated over one
(n − 1)-dimensional face of the rectangular region. That is

$\int_\Omega d\omega = \sum_{\mathrm{faces}} \int_{\mathrm{face}} \omega = \int_{\partial\Omega} \omega$    (7.68)

This proves Stokes’ theorem for rectangular regions.

For general regions merely approximate the general region with an increasingly fine
mesh of rectangular regions — you have seen such techniques often enough before in the
undergraduate versions of Gauss, Stokes, and Green’s theorem... QED

Comment: The only tricky part of the proof is keeping the signs correct. ♦

Comment: In n = 2 dimensional Euclidean space this reduces to Green’s theorem. To
see this pick a square in IR² with Cartesian coordinates x and y. Then ω is a 1-form and

$\int_\Omega d\omega = \int_\Omega [\omega_{y,x} - \omega_{x,y}]\,dx\,dy = \int_\Omega [\omega_{y,x} - \omega_{x,y}]\,d(\mathrm{Area})$    (7.69)

But

$\int_{\partial\Omega} \omega = \int_{\partial\Omega} \omega_a\,dx^a = \oint \omega_a\,dx^a$    (7.70)

(Warning: Note that the penultimate “d” is quite distinct from the ultimate “d”. The
penultimate “d” is an exterior differential operator acting on the coordinate $x^a$. The
ultimate “d” is an ordinary differential being used for a real integration. The notation is
unfortunately standard.) That is

$\int [\omega_{y,x} - \omega_{x,y}]\,d(\mathrm{Area}) = \oint \omega_a\,dx^a$    (7.71)

or even

$\int [\omega_{y,x} - \omega_{x,y}]\,d(\mathrm{Area}) = \oint \vec{\omega}\cdot d\vec{x}$    (7.72)

which you should remember from Math 206. As usual arbitrary areas can now be built
up by taking unions of squares of different sizes.

(Warning: In these last two equations both “d”s are ordinary differentials being used for
area and line integrations.) ♦
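Here is a minimal numerical illustration of the n = 2 case (the 1-form and the square are my own choices): with the boundary of the unit square traversed counterclockwise, the line integral matches the area integral of $\partial_x\omega_y - \partial_y\omega_x$.

```python
# Numerical check of Green's theorem (the n = 2 case) on the unit square with
# counterclockwise boundary, for the arbitrarily chosen 1-form
#   omega = (x*y) dx + (x^2) dy,  so  d_x omega_y - d_y omega_x = 2x - x = x.

def wx(x, y): return x * y
def wy(x, y): return x * x

N = 500
h = 1.0 / N

# area integral by the midpoint rule
area = 0.0
for i in range(N):
    for j in range(N):
        x = (i + 0.5) * h
        area += x * h * h            # integrand is 2x - x = x

# boundary integral, traversed counterclockwise
line = 0.0
for i in range(N):
    t = (i + 0.5) * h
    line += wx(t, 0.0) * h           # bottom edge, dx > 0
    line += wy(1.0, t) * h           # right edge,  dy > 0
    line -= wx(1.0 - t, 1.0) * h     # top edge,    dx < 0
    line -= wy(0.0, 1.0 - t) * h     # left edge,   dy < 0

print(area, line)    # both ~ 0.5
```

The overall sign tracks the orientation convention: reversing the direction in which the boundary is traversed flips the sign of the line integral, not of the area integral.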

Comment: In n = 3 dimensional Euclidean space this reduces to the usual Stokes
theorem. To see this pick a 2 dimensional square in IR³, and on that square set up
Cartesian coordinates x and y. Then let ω be a 1-form and we can mostly follow the
discussion of Green’s theorem

$\int_\Omega d\omega = \int_\Omega [\omega_{y,x} - \omega_{x,y}]\,dx\,dy = \int_\Omega [\omega_{y,x} - \omega_{x,y}]\,d(\mathrm{Area})$    (7.73)

But

$\int_{\partial\Omega} \omega = \int_{\partial\Omega} \omega_a\,dx^a = \oint \omega_a\,dx^a$    (7.74)

That is

$\int [\omega_{y,x} - \omega_{x,y}]\,d(\mathrm{Area}) = \oint \omega_a\,dx^a$    (7.75)

But now let’s write $d\vec{S} = \hat{n}\,d(\mathrm{Area})$ with n̂ being the unit normal to the surface, which in
the present coordinates lies in the z direction. Then we can write the surface integral as

$\int (\vec{\nabla}\times\vec{\omega})\cdot d\vec{S} = \oint \vec{\omega}\cdot d\vec{x}$    (7.76)

which you should remember from Math 206. As usual arbitrary areas can now be built
up by taking unions of squares of different sizes and orientations. ♦

Comment: Thus the generalized Stokes theorem takes the ordinary Stokes theorem and
modifies it to arbitrary manifolds in arbitrary many dimensions — and the generalized
Stokes theorem does not even need to make use of the metric tensor. ♦

Exercise: Prove the integration by parts formula:

$\int_\Omega (dF_1)\wedge F_2 = \int_{\partial\Omega} (F_1\wedge F_2) - (-1)^{s_1} \int_\Omega F_1\wedge (dF_2)$    (7.77)

7.6 Generalized Kronecker tensors

In addition to the Kronecker delta $\delta^a{}_b$ other useful tensors are

$\delta^{a_1 a_2}{}_{b_1 b_2} = 2!\,\delta^{[a_1}{}_{b_1}\,\delta^{a_2]}{}_{b_2}$    (7.78)

and

$\delta^{a_1 a_2 a_3}{}_{b_1 b_2 b_3} = 3!\,\delta^{[a_1}{}_{b_1}\,\delta^{a_2}{}_{b_2}\,\delta^{a_3]}{}_{b_3}$    (7.79)

etc...

This series of tensors terminates with

$\delta^{a_1 a_2 a_3\ldots a_n}{}_{b_1 b_2 b_3\ldots b_n} = n!\,\delta^{[a_1}{}_{b_1}\,\delta^{a_2}{}_{b_2}\,\delta^{a_3}{}_{b_3}\cdots\delta^{a_n]}{}_{b_n}$    (7.80)

Exercise: Why does the series of tensors terminate at the n’th step? ♦

Exercise: Show that these tensors are all anti-symmetric on their lower indices; anti-
symmetry in the upper indices is true by construction. ♦

Exercise: Demonstrate that

$\delta^{a_1 a_2}{}_{b_1 b_2} = [\delta^{a_1}{}_{b_1}\,\delta^{a_2}{}_{b_2} - \delta^{a_2}{}_{b_1}\,\delta^{a_1}{}_{b_2}] = \begin{vmatrix} \delta^{a_1}{}_{b_1} & \delta^{a_1}{}_{b_2} \\ \delta^{a_2}{}_{b_1} & \delta^{a_2}{}_{b_2} \end{vmatrix}$    (7.81)

Exercise: Demonstrate that

$\delta^{a_1 a_2 a_3}{}_{b_1 b_2 b_3} = \begin{vmatrix} \delta^{a_1}{}_{b_1} & \delta^{a_1}{}_{b_2} & \delta^{a_1}{}_{b_3} \\ \delta^{a_2}{}_{b_1} & \delta^{a_2}{}_{b_2} & \delta^{a_2}{}_{b_3} \\ \delta^{a_3}{}_{b_1} & \delta^{a_3}{}_{b_2} & \delta^{a_3}{}_{b_3} \end{vmatrix}$    (7.82)

Exercise: Demonstrate that

$\delta^{a_1 a_2 a_3\ldots a_n}{}_{b_1 b_2 b_3\ldots b_n} = \mathrm{signum}(a_1 a_2 a_3\ldots a_n)\;\mathrm{signum}(b_1 b_2 b_3\ldots b_n)$    (7.83)

and verify that the LHS really transforms as a tensor. ♦
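These deltas are straightforward to realize by brute force. The sketch below (my own implementation) expands the antisymmetrization as a signed sum over permutations and spot-checks (7.83) by enumeration in n = 3.

```python
# Generalized Kronecker deltas realized as signed sums of products of ordinary
# deltas, with a brute-force check of (7.83) in n = 3:
#   delta^{a1 a2 a3}_{b1 b2 b3} = signum(a1 a2 a3) signum(b1 b2 b3).
from itertools import permutations, product

n = 3

def signum(seq):
    """+1 / -1 for even / odd arrangements of distinct entries, else 0."""
    if len(set(seq)) != len(seq):
        return 0
    sign = 1
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            if seq[i] > seq[j]:
                sign = -sign
    return sign

def gen_delta(uppers, lowers):
    """delta^{a1...ak}_{b1...bk} = k! delta^{[a1}_{b1} ... delta^{ak]}_{bk},
    expanded as a signed sum over permutations of the lower slots."""
    k = len(uppers)
    total = 0
    for perm in permutations(range(k)):
        term = signum(perm)
        for i in range(k):
            if uppers[i] != lowers[perm[i]]:
                term = 0
                break
        total += term
    return total

ok = all(gen_delta(a, b) == signum(a) * signum(b)
         for a in product(range(n), repeat=n)
         for b in product(range(n), repeat=n))
print(ok)    # True
```

For k = 2 the same `gen_delta` reproduces the 2 × 2 determinant of (7.81): for instance `gen_delta((0, 1), (1, 0))` evaluates to −1.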

Exercise: Demonstrate that

signum(a1 a2 a3 . . . an ) (7.84)

does not transform like a tensor. In fact if we try to set

X a1 a2 a3 ...an = signum(a1 a2 a3 . . . an ); Yb1 b2 b3 ...bn = signum(b1 b2 b3 . . . bn ) (7.85)



then it is easy to show that X and Y are both tensor densities, but of opposite weights.
What are the transformation properties? ♦

Exercise: Use the signum function, signum(a1 a2 a3 . . . an ), to write down an explicit


formula for the determinant of a matrix. ♦

Warning: So far this chapter on exterior derivatives has not made use of the metric
tensor. Once you have the metric tensor available to add extra structure even more in-
teresting things begin to happen. ♦

7.7 The Levi–Civita tensor

Suppose we have a metric tensor available, with components gab , then we can certainly
calculate its determinant g = det[gab ].

Definition 60 The Levi–Civita tensor is defined as

$\epsilon_{a_1 a_2 a_3\ldots a_n} = \sqrt{\det[g_{ab}]}\;\mathrm{signum}(a_1 a_2 a_3\ldots a_n)$    (7.86)

In particular the Levi–Civita tensor is an n-form.

Exercise: Check that this really does transform as a $T^0_n$ tensor. ♦

Note: If you are in Euclidean space using Cartesian coordinates you can afford to forget
the √g. ♦

Exercise: If you do not have a metric tensor available, any non-singular T20 tensor
would be good enough — just make sure you have a good reason for whatever particular
choice of T20 tensor you decide on. ♦

Exercise: What are the transformation properties of

$\sqrt{\det[R_{ab}]}\;\mathrm{signum}(a_1 a_2 a_3\ldots a_n)$    (7.87)

where $R_{ab}$ is the Ricci tensor? ♦



Exercise: Now raise all the indices on the Levi–Civita tensor by using the inverse metric
$g^{ab}$. Show that

$\epsilon^{a_1 a_2 a_3\ldots a_n} = \frac{1}{\sqrt{\det[g_{ab}]}}\;\mathrm{signum}(a_1 a_2 a_3\ldots a_n)$    (7.88)

Check that this really does transform as a $T^n_0$ tensor. Show that

$\epsilon^{a_1 a_2 a_3\ldots a_n}\,\epsilon_{a_1 a_2 a_3\ldots a_n} = n!$    (7.89)

Warning: In pseudo-Riemannian (Lorentzian) manifolds this becomes

$\epsilon_{a_1 a_2 a_3\ldots a_n} = \sqrt{-\det[g_{ab}]}\;\mathrm{signum}(a_1 a_2 a_3\ldots a_n)$    (7.90)

$\epsilon^{a_1 a_2 a_3\ldots a_n} = \frac{-1}{\sqrt{-\det[g_{ab}]}}\;\mathrm{signum}(a_1 a_2 a_3\ldots a_n)$    (7.91)

$\epsilon^{a_1 a_2 a_3\ldots a_n}\,\epsilon_{a_1 a_2 a_3\ldots a_n} = -n!$    (7.92)

More generally, if the “metric tensor” has S negative eigenvalues [and n − S positive
eigenvalues] then

$\epsilon_{a_1 a_2 a_3\ldots a_n} = \sqrt{|\det[g_{ab}]|}\;\mathrm{signum}(a_1 a_2 a_3\ldots a_n)$    (7.93)

$\epsilon^{a_1 a_2 a_3\ldots a_n} = \frac{(-1)^S}{\sqrt{|\det[g_{ab}]|}}\;\mathrm{signum}(a_1 a_2 a_3\ldots a_n)$    (7.94)

$\epsilon^{a_1 a_2 a_3\ldots a_n}\,\epsilon_{a_1 a_2 a_3\ldots a_n} = (-1)^S\,n!$    (7.95)

Unfortunately in physical applications these minus signs are often important. (Which is
why I’ve gone to the trouble to keep track of them.) ♦

Exercise: Show that (in a Riemannian manifold)

$\epsilon^{a_1 a_2 a_3\ldots a_n}\,\epsilon_{b_1 b_2 b_3\ldots b_n} = 0!\;\delta^{a_1 a_2 a_3\ldots a_n}{}_{b_1 b_2 b_3\ldots b_n}$    (7.96)

$\epsilon^{a_1 a_2 a_3\ldots a_{n-1} c}\,\epsilon_{b_1 b_2 b_3\ldots b_{n-1} c} = 1!\;\delta^{a_1 a_2 a_3\ldots a_{n-1}}{}_{b_1 b_2 b_3\ldots b_{n-1}}$    (7.97)

$\epsilon^{a_1 a_2 a_3\ldots a_{n-2} c_1 c_2}\,\epsilon_{b_1 b_2 b_3\ldots b_{n-2} c_1 c_2} = 2!\;\delta^{a_1 a_2 a_3\ldots a_{n-2}}{}_{b_1 b_2 b_3\ldots b_{n-2}}$    (7.98)

$\epsilon^{a_1 a_2 a_3\ldots a_{n-3} c_1 c_2 c_3}\,\epsilon_{b_1 b_2 b_3\ldots b_{n-3} c_1 c_2 c_3} = 3!\;\delta^{a_1 a_2 a_3\ldots a_{n-3}}{}_{b_1 b_2 b_3\ldots b_{n-3}}$    (7.99)

and so on...

$\epsilon^{a_1 a_2 c_1 c_2 c_3\ldots c_{n-2}}\,\epsilon_{b_1 b_2 c_1 c_2 c_3\ldots c_{n-2}} = (n-2)!\;\delta^{a_1 a_2}{}_{b_1 b_2}$    (7.100)

$\epsilon^{a c_1 c_2 c_3\ldots c_{n-1}}\,\epsilon_{b c_1 c_2 c_3\ldots c_{n-1}} = (n-1)!\;\delta^a{}_b$    (7.101)

$\epsilon^{c_1 c_2 c_3\ldots c_n}\,\epsilon_{c_1 c_2 c_3\ldots c_n} = n!$    (7.102)

Though somewhat tedious these identities are often tremendously useful in component
calculations. ♦
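In n = 3 Euclidean space (Cartesian coordinates, det g = 1, so upper and lower epsilons coincide with the signum symbol) these identities can be checked by direct enumeration; a short sketch:

```python
# Spot check of the epsilon-contraction identities in a 3-dimensional
# Euclidean setting, where eps^{abc} = eps_{abc} = signum(abc).
from itertools import product

def signum(seq):
    if len(set(seq)) != len(seq):
        return 0
    sign = 1
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            if seq[i] > seq[j]:
                sign = -sign
    return sign

eps = {idx: signum(idx) for idx in product(range(3), repeat=3)}

# full contraction: eps^{abc} eps_{abc} = 3! = 6
full = sum(eps[idx] ** 2 for idx in eps)

# one free index pair: eps^{a c1 c2} eps_{b c1 c2} = 2! delta^a_b
ok1 = all(sum(eps[(a, c1, c2)] * eps[(b, c1, c2)]
              for c1 in range(3) for c2 in range(3)) == 2 * (a == b)
          for a in range(3) for b in range(3))

# two free pairs: eps^{a1 a2 c} eps_{b1 b2 c} = delta^{a1 a2}_{b1 b2}
#   = delta^a1_b1 delta^a2_b2 - delta^a2_b1 delta^a1_b2
ok2 = all(sum(eps[(a1, a2, c)] * eps[(b1, b2, c)] for c in range(3))
          == (a1 == b1) * (a2 == b2) - (a2 == b1) * (a1 == b2)
          for a1, a2, b1, b2 in product(range(3), repeat=4))

print(full, ok1, ok2)    # 6 True True
```

The two-free-pair identity here is the familiar "epsilon–delta" rule used constantly in 3-dimensional vector calculus.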

Exercise: What are the relevant formulae in general signature? It is easiest to work in
a coordinate system in which the metric is diagonal at the point in question. ♦

Exercise: Explicitly write down, and keep handy, the specialization of these identities
to 2, 3, and 4 dimensions. ♦

Exercise: Show that [using the Christoffel connexion] the Levi–Civita tensor is covariantly
constant

$\nabla_b\,\epsilon_{a_1 a_2 a_3\ldots a_n} = 0 = \epsilon_{a_1 a_2 a_3\ldots a_n;b}$    (7.103)
It’s easiest to do this in locally geodesic coordinates. ♦

7.8 The Hodge star operation

Once we have the Levi–Civita tensor available, we can introduce the “Hodge star opera-
tion” which maps s-forms to (n − s)-forms.

Definition 61

$\{*F\}_{a_1 a_2\ldots a_{n-s}} = \frac{1}{s!}\;\epsilon_{a_1 a_2\ldots a_{n-s}}{}^{b_1 b_2\ldots b_s}\;\{F\}_{b_1 b_2\ldots b_s}$    (7.104)
Note that the metric is hiding both in the definition of the Levi–Civita tensor and in the
raising and lowering of indices on this tensor.

Exercise: Show that [in a Riemannian manifold]

$*1 = \epsilon\,; \qquad *\epsilon = 1.$    (7.105)

That is, the number 1 (a particular zero-form) is mapped into the Levi–Civita n-form,
while the Levi–Civita n-form is mapped into the zero-form 1. ♦

Exercise: Show that if the metric has S negative eigenvalues

$*1 = \epsilon\,; \qquad *\epsilon = (-1)^S.$    (7.106)



Exercise: Show that if the metric has S negative eigenvalues and ω is any n-form that
ω123...n
∗ ω = (−1)S p (7.107)
| det[gab ]|

and consequently p
ω123...n = (−1)S | det[gab ]| ∗ ω (7.108)

Applying the Hodge star twice maps s-forms back into s-forms. Explicitly
1
{∗ ∗ F }c1 c2 ...cs = c c ...c a1 a2 ...an−s a1 a2 ...an−s b1 b2 ...bs {F }b1 b2 ...bs (7.109)
(n − s)! s! 1 2 s

But then
1
(−1)s(n−s) δ b1 b2 ...bs c1 c2 ...cs {F }b1 b2 ...bs
{∗ ∗ F }c1 c2 ...cs = (7.110)
s!
The (−1)s(n−s) comes from re-arranging the order of indices

ε_{a1 a2 ... a(n−s) b1 b2 ... bs} = (−1)^{s(n−s)} ε_{b1 b2 ... bs a1 a2 ... a(n−s)}   (7.111)

so that you can apply the results of the previous section. But then

{∗∗F}_{c1 c2 ... cs} = (−1)^{s(n−s)} δ^{[b1}_{c1} δ^{b2}_{c2} δ^{b3}_{c3} ... δ^{bs]}_{cs} {F}_{b1 b2 ... bs}   (7.112)

and so
∗∗F = (−1)^{s(n−s)} F   (7.113)
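The sign (−1)^{s(n−s)} can be checked numerically in Euclidean signature. The following sketch (numpy only; `hodge` and `levi_civita` are illustrative helper names, not anything from the notes) applies the double star to a 1-form in n = 3, a 1-form in n = 4, and a 2-form in n = 4:

```python
from math import factorial
import itertools
import numpy as np

def levi_civita(n):
    """Totally antisymmetric symbol with eps[0,1,...,n-1] = +1."""
    eps = np.zeros((n,) * n)
    for p in itertools.permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        eps[p] = (-1) ** inversions
    return eps

def hodge(F, n):
    """{*F}_{a1..a(n-s)} = (1/s!) eps_{a1..a(n-s) b1..bs} F_{b1..bs}, Euclidean metric."""
    s = F.ndim
    letters = 'abcdefgh'
    free, dummy = letters[:n - s], letters[n - s:n]
    return np.einsum(f'{free}{dummy},{dummy}->{free}', levi_civita(n), F) / factorial(s)

rng = np.random.default_rng(0)

v3 = rng.standard_normal(3)          # 1-form, n = 3: s(n-s) = 2, so ** = +1
assert np.allclose(hodge(hodge(v3, 3), 3), v3)

v4 = rng.standard_normal(4)          # 1-form, n = 4: s(n-s) = 3, so ** = -1
assert np.allclose(hodge(hodge(v4, 4), 4), -v4)

A = rng.standard_normal((4, 4))
A = A - A.T                          # 2-form, n = 4: s(n-s) = 4, so ** = +1
assert np.allclose(hodge(hodge(A, 4), 4), A)
```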

Exercise: Show that in any odd-dimensional Riemannian manifold

∗ ∗F = F (7.114)

Exercise: Show that in any even-dimensional Riemannian manifold

∗∗F = (−1)^s F   (7.115)

Exercise: Show that if the metric has S negative eigenvalues the general situation is

∗∗F = (−1)^{S+s(n−s)} F   (7.116)

Exercise: Write down the pattern for Riemannian manifolds [S = 0] in 2, 3, and 4 dimensions. ♦

Exercise: Write down the pattern for pseudo-Riemannian manifolds [Lorentzian signa-
ture] S = 1 in (1+1), (2+1), and (3+1) dimensions. ♦

The 0-forms
∗ (F ∧ ∗F ); ∗(F1 ∧ ∗F2 ) (7.117)
are particularly useful.

Lemma 24 In a Riemannian manifold

∗(F1 ∧ ∗F2) = (−1)^{s(n−s)} ((n−s)!/n!) {F1}_{a1 a2 ... as} {F2}^{a1 a2 ... as}   (7.118)

Proof:

∗(F1 ∧ ∗F2) = (1/(n! s!)) ε^{a1 a2 ... an} {F1}_{a1 a2 ... as} ε_{a(s+1) a(s+2) ... an}^{b1 b2 ... bs} {F2}_{b1 b2 ... bs}   (7.119)

= (−1)^{s(n−s)} ((n−s)!/(n! s!)) δ^{a1 a2 ... as}_{b1 b2 ... bs} {F1}_{a1 a2 ... as} {F2}^{b1 b2 ... bs}   (7.120)

= (−1)^{s(n−s)} ((n−s)!/n!) δ^{[a1}_{b1} δ^{a2}_{b2} δ^{a3}_{b3} ... δ^{as]}_{bs} {F1}_{a1 a2 ... as} {F2}^{b1 b2 ... bs}   (7.121)

= (−1)^{s(n−s)} ((n−s)!/n!) {F1}_{a1 a2 ... as} {F2}^{a1 a2 ... as}   (7.122)
QED

Exercise: Better check signs and coefficients ♦

Exercise: Show that if the metric has S negative eigenvalues

∗(F1 ∧ ∗F2) = (−1)^{S+s(n−s)} ((n−s)!/n!) {F1}_{a1 a2 ... as} {F2}^{a1 a2 ... as}   (7.123)


Exercise: Show that [regardless of signature]


∫ (F1 ∧ ∗F2) = ∫ (F2 ∧ ∗F1) = (−1)^{s(n−s)} ∫ (∗F1 ∧ F2)   (7.124)

use this to define an “inner product” on s-forms according to the rule


⟨F1, F2⟩ = ∫ (F1 ∧ ∗F2)   (7.125)

and show that this inner product is symmetric. In a Riemannian manifold it will also
be positive definite and satisfy the triangle inequality — these properties may fail if the
metric has negative eigenvalues. ♦

7.9 The divergence operation

Now that we have the exterior derivative d and the Hodge star operation ∗, we can define
a new differential operation on s-forms, the divergence.

Definition 62 The divergence operator δ maps s-forms to (s − 1)-forms and is defined by

δF = (−1)^{S+n(s+1)+1} ∗ d ∗ F   (7.126)

Comment: The phase (−1)^{S+n(s+1)} is there for convenience elsewhere in the development — you will not go too far wrong in ignoring it to get a qualitative feel for what is going on. ♦

Lemma 25 With this choice of phase the divergence δ is adjoint to the exterior derivative d in the sense that

⟨α, dβ⟩ = ⟨δα, β⟩   (7.127)
or in terms of the underlying integral
∫ α ∧ (∗dβ) = ∫ δα ∧ (∗β)   (7.128)

Here α is an s-form and β is an (s − 1)-form.



Proof:

∫ δα ∧ (∗β) = (−1)^{S+n(s+1)+1} ∫ (∗ d ∗ α) ∧ (∗β)   (7.129)

= (−1)^{S+n(s+1)+1} ∫ (d ∗ α) ∧ (∗ ∗ β)   (7.130)

= (−1)^{S+n(s+1)+1+S+s(n−s)} ∫ (d ∗ α) ∧ β   (7.131)

= (−1)^{n−s+1} ∫ (d ∗ α) ∧ β   (7.132)

Now use the integration by parts formula and assume either no boundary, ∂Ω = 0, or
suitable falloff boundary conditions. Then
∫ δα ∧ (∗β) = (−1)^{(n−s)+1+(n−s)+1} ∫ (∗α) ∧ (dβ)   (7.133)

= ∫ α ∧ (∗ dβ)   (7.134)

and we are done. QED

Exercise: Derive the related result [α is an s-form]


∫ δα ∧ β = (−1)^{s+1} ∫ α ∧ δβ   (7.135)

Lemma 26
δδF = 0 (7.136)

Proof:
δδF = ∗ d ∗ ∗ d ∗ F ∝ ∗ dd ∗ F = 0 (7.137)
QED

Lemma 27 The components of δF can be calculated in terms of the covariant divergence of the components of F.

Proof: Let’s have a look at this divergence in components:

{δF}_{c1 c2 ... c(s−1)} = (−1)^{S+n(s+1)+1} {∗ d ∗ F}_{c1 c2 ... c(s−1)}   (7.138)

= [ (−1)^{S+n(s+1)+1} / ((n−s+1)! s!) ] ε_{c1 c2 ... c(s−1)}^{a1 a2 ... a(n−s+1)} × (n−s+1) ε_{a1 a2 ... a(n−s)}^{b1 b2 ... bs} {F}_{b1 b2 ... bs , a(n−s+1)}   (7.139)

= [ (−1)^{S+n(s+1)+1} / ((n−s)! s!) ] ε_{c1 c2 ... c(s−1)}^{a1 a2 ... a(n−s+1)} ε_{a1 a2 ... a(n−s)}^{b1 b2 ... bs} {F}_{b1 b2 ... bs ; a(n−s+1)}   (7.140)

= [ (−1)^{S+n(s+1)+1} / ((n−s)! s!) ] ε_{c1 c2 ... c(s−1)}^{a1 a2 ... a(n−s+1)} ε_{a1 a2 ... a(n−s)}^{b1 b2 ... bs} [ {F}_{b1 b2 ... bs} ]_{; a(n−s+1)}   (7.141)

= [ (−1)^{S+n(s+1)+1+(n−s)} / ((n−s)! s!) ] ε_{c1 c2 ... cs}^{a1 a2 ... a(n−s)} ε_{a1 a2 ... a(n−s)}^{b1 b2 ... bs} g^{cs d} [ {F}_{b1 b2 ... bs} ]_{; d}   (7.142)

Here we have first used the complete anti-symmetry on a1 a2 . . . an−s+1 to replace the
partial derivative with the Christoffel covariant derivative, and then used the covariant
constancy of the Levi–Civita tensor to move the Levi–Civita tensor outside the brackets.
Finally I have rearranged the indices. Now we can contract over the indices of the two
Levi–Civita tensors to get

(n − s)!
δFc1 c2 ...cs−1 = (−1)S+n(s+1)+1+(n−s)+s(n−s)+S δ b1 b2 ...bs c1 c2 ...cs
(n − s)! s!
×g cs d {F }b1 b2 ...bs ;d (7.143)
(n − s)!s!
= (−1)1 δ [b1 c1 δ b2 c2 δ b3 c3 . . . δ bs ] cs
(n − s)! s!
×g cs d {F }b1 b2 ...bs ;d (7.144)
= − g cs d Fc1 c2 ...cs ;d (7.145)
= −Fc1 c2 ...cs−1 a;b g ab (7.146)
= −Fc1 c2 ...cs−1 a ;a (7.147)

So up to a multiplicative factor, this really is the covariant divergence of the s-form. QED

Exercise: Check coefficients and signs. ♦



7.10 Gauss’ theorem

Let ω be a one-form and apply Stokes’ theorem to the (n − 1)-form ∗ ω. Then


∫_Ω d ∗ ω = ∫_{∂Ω} ∗ω   (7.148)

But using the properties of the Hodge star


∫_Ω d ∗ ω = (−1)^S ∫_Ω √|det[g_ab]| (∗ d ∗ ω) d^n x = (−1)^{1+n(s+1)} ∫_Ω √|det[g_ab]| (δω) d^n x   (7.149)

Since ω is a one-form, s = 1, so

∫_Ω d ∗ ω = − ∫_Ω √|det[g_ab]| (δω) d^n x   (7.150)

But for a one-form we also have

δω = −ω_a^{;a}   (7.151)

Therefore the integral over the interior Ω is

∫_Ω d ∗ ω = ∫_Ω √|det[g_ab]| ω_a^{;a} d^n x   (7.152)

On the other hand, go to the boundary and compute ∫_{∂Ω} ∗ω. It is useful to adopt normal coordinates on the boundary such that

ds² = g_ab dx^a dx^b = g̃_ij dx^i dx^j ± dη²   (7.153)


where indices i and j now run over n − 1 variables and η is a normal coordinate. The number of negative eigenvalues of the metric g̃_ij is denoted S′. The sign is ± depending on whether the unit normal vector has

g_ab n^a n^b = ±1 = (−1)^{S−S′}   (7.154)
in the interior metric — this never causes a problem for Riemannian manifolds, and is
only ever an issue if the metric has negative eigenvalues. Now by definition of integration
for forms

∫_{∂Ω} ∗ω = (−1)^{S′} ∫_{∂Ω} √|det[g̃_ij]| (∗ ∗ ω) d^{n−1} x   (7.155)
There is a subtlety here: the first Hodge star lives in n dimensions and maps a 1-form to
a (n − 1)-form; this (n − 1)-form is then restricted to the (n − 1)-dimensional manifold ∂Ω
and the second Hodge star lives in (n − 1) dimensions and maps this (n − 1)-form back
into a zero-form. It is then a useful exercise to verify that

{∗ω}_{a1 a2 ... a(n−1)} = ε_{a1 a2 ... a(n−1) m} ω^m   (7.156)



When we restrict this to the boundary ∂Ω and introduce a normal vector na we have, in
terms of the induced (n − 1)-dimensional Levi–Civita tensor,

{∗ω}_{a1 a2 ... a(n−1)} = ε̃_{a1 a2 ... a(n−1)} ω_m n^m   (7.157)

But once we have established this


{∗ ∗ ω} = (∗ ∗ 1) (ω_m n^m) = (−1)^{S′} (ω_m n^m)   (7.158)

Therefore

∫_{∂Ω} ∗ω = ∫_{∂Ω} (ω_m n^m) √|det[g̃_ij]| d^{n−1} x   (7.159)
Now defining

d(Volume) = √|det[g_ab]| d^n x   (7.160)

and

d(Area) = √|det[g̃_ij]| d^{n−1} x   (7.161)

we have

∫_Ω ω_a^{;a} d(Volume) = ∫_{∂Ω} ω_a n^a d(Area)   (7.162)
which we recognize as Gauss’ theorem.

The original Gauss theorem was proved in flat Euclidean 3-space. This current version
applies to any curved manifold with a metric — note particularly the occurrence of the
Christoffel connexion in the covariant divergence on the LHS.
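As a concrete sanity check of (7.162), here is a numerical verification in flat Euclidean 2-space on the unit disk, for the vector field ω = (x, y): then ω_a^{;a} = 2 everywhere and ω_a n^a = 1 on the unit circle, so both sides should equal 2π. (numpy only; the trapezoid helper is an illustrative choice, used to avoid version-specific numpy quadrature functions):

```python
import numpy as np

def trapezoid(y, x):
    # simple trapezoid rule
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

theta = np.linspace(0.0, 2 * np.pi, 4001)
r = np.linspace(0.0, 1.0, 4001)

# interior: integral of div(omega) = 2 over the unit disk, in polar coordinates,
# i.e. integrand 2 * r over r in [0, 1], then over theta in [0, 2*pi]
radial = trapezoid(2 * r, r)                       # = 1 up to rounding
interior = trapezoid(np.full_like(theta, radial), theta)

# boundary: integral of omega . n = cos^2 + sin^2 = 1 around the unit circle
boundary = trapezoid(np.cos(theta) ** 2 + np.sin(theta) ** 2, theta)

assert abs(interior - 2 * np.pi) < 1e-6
assert abs(boundary - 2 * np.pi) < 1e-9
```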

7.11 Euclidean three-space

A particularly nice way of making sure you understand differential forms and exterior
derivatives is to work in flat Euclidean three-space where everything reduces to vector
cross products and the “curl” [“rot”] operation on vector fields. Once the Hodge star is
introduced you can reconstruct ordinary dot products and the divergence operator.

Recall that in 3-dimensions you only have

• zero-forms — scalars — Φ;

• one-forms — vectors — Va ;

• two-forms — antisymmetric 3 × 3 matrices — F[ab] ;

• three-forms — completely antisymmetric 3 × 3 × 3 block — G[abc] .



Because it is Euclidean space in Cartesian coordinates I am not bothering to make the


covariant-contravariant distinction.

In 3-d the Levi-Civita tensor is just

ε_{123} = ε_{231} = ε_{312} = +1;  ε_{213} = ε_{132} = ε_{321} = −1;   (7.163)

and zero otherwise. The relevant contractions are

ε^{a1 a2 a3} ε_{b1 b2 b3} = 0! δ^{a1 a2 a3}_{b1 b2 b3}   (7.164)

ε^{a1 a2 c} ε_{b1 b2 c} = 1! δ^{a1 a2}_{b1 b2}   (7.165)

ε^{a1 c1 c2} ε_{b1 c1 c2} = 2! δ^{a1}_{b1}   (7.166)

ε^{a1 a2 a3} ε_{a1 a2 a3} = 3!   (7.167)
where we remind the reader that

δ^{a1 a2}_{b1 b2} = 2! δ^{[a1}_{b1} δ^{a2]}_{b2}   (7.168)

and

δ^{a1 a2 a3}_{b1 b2 b3} = 3! δ^{[a1}_{b1} δ^{a2}_{b2} δ^{a3]}_{b3}   (7.169)
Equivalently (this was an exercise)

δ a1 b 1 δ a1 b 2
δ a1 a2 b1 b2 = [δ a1 b1 δ a2 b2 − δ a2 b1 δ a1 b2 ] = (7.170)
δ a2 b 1 δ a2 b 2

and
δ a1 b 1 δ a1 b 2 δ a1 b 3
a1 a2 a3
δ b1 b2 b3 = δ a2 b 1 δ a2 b 2 δ a2 b 3 (7.171)
δ a3 b 1 δ a3 b 2 δ a3 b 3
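The contraction (7.165) and the determinant form (7.170) can be checked against each other numerically (numpy only; array names are illustrative):

```python
import itertools
import numpy as np

eps = np.zeros((3, 3, 3))
for p in itertools.permutations(range(3)):
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    eps[p] = (-1) ** inversions

# left-hand side of (7.165): eps^{a1 a2 c} eps_{b1 b2 c} = 1! delta^{a1 a2}_{b1 b2}
delta2 = np.einsum('abc,dec->abde', eps, eps)   # index order (a1, a2, b1, b2)

# right-hand side of (7.170): delta^{a1}_{b1} delta^{a2}_{b2} - delta^{a2}_{b1} delta^{a1}_{b2}
I = np.eye(3)
det_form = np.einsum('ac,bd->abcd', I, I) - np.einsum('bc,ad->abcd', I, I)

assert np.allclose(delta2, det_form)
```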
Because we are in 3-d the series of tensors stops here. Because the signature of the metric
is + + + we do not need to worry about any (−1)S factors.

Now in 3-d the Hodge star operation maps two-forms into one-forms and three-forms
into zero forms — so all exterior forms can ultimately be reduced to scalars and vectors.
Explicitly:


• {∗Φ}_{abc} = Φ ε_{abc}   (7.172)

• {∗V}_{ab} = ε_{ab}^c V_c =
[   0    Vz   −Vy ]
[ −Vz    0    Vx  ]
[  Vy   −Vx   0   ]_{ab}   (7.173)


• {∗F}_a = (1/2!) ε_a^{bc} F_{bc} = [F_{yz}, F_{zx}, F_{xy}]_a   (7.174)

• {∗G} = (1/3!) ε^{abc} G_{abc} = G_{xyz}   (7.175)

Because it is an odd-dimensional Riemannian manifold

∗∗(any form) = (the same form);  ∗∗ = I.   (7.176)

So the Hodge star is an involution — applied twice it is the identity.

Now take a pair of vectors f and g and compute:


 
{f ∧ g}_{ab} = f_{[a} g_{b]} =
[       0            fx gy − fy gx    fx gz − fz gx ]
[ fy gx − fx gy            0          fy gz − fz gy ]
[ fz gx − fx gz      fz gy − fy gz          0       ]_{ab}   (7.177)

then
{∗(f ∧ g)}a = [fy gz − fz gy , fz gx − fx gz , fx gy − fy gx ]a (7.178)
But this last set of components is immediately recognizable as that of the vector cross
product in 3-dimensions. By contracting with an appropriate basis for the one-forms
dxa = {dx, dy, dz}, this can be written in index free notation as

∗ (f ∧ g) = f × g (7.179)

or equivalently
f ∧ g = ∗(f × g) (7.180)
Of course this is the primary reason so much work went into the development of the exte-
rior product of differential forms — they are the natural multi-dimensional generalization
of the notion of “vector cross product”.
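The identification ∗(f ∧ g) = f × g is easy to verify for a concrete pair of vectors, using the component conventions of (7.173)–(7.177). A numpy sketch (names illustrative):

```python
import numpy as np

# Levi-Civita symbol in 3-d
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = +1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

f = np.array([1.0, -2.0, 3.0])
g = np.array([0.5, 4.0, -1.0])

# (7.177): {f ^ g}_{ab} = f_a g_b - f_b g_a
wedge = np.outer(f, g) - np.outer(g, f)

# (7.174)-style star of a two-form: {*(f ^ g)}_a = (1/2!) eps_a^{bc} {f ^ g}_{bc}
star_wedge = 0.5 * np.einsum('abc,bc->a', eps, wedge)

assert np.allclose(star_wedge, np.cross(f, g))
```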

But now consider the exterior derivative of a one-form


 
{df}_{ab} = ∂_{[a} f_{b]} =
[       0            ∂x fy − ∂y fx    ∂x fz − ∂z fx ]
[ ∂y fx − ∂x fy            0          ∂y fz − ∂z fy ]
[ ∂z fx − ∂x fz      ∂z fy − ∂y fz          0       ]_{ab}   (7.181)

and calculate
{∗(df )}a = [∂y fz − ∂z fy , ∂z fx − ∂x fz , ∂x fy − ∂y fx ]a (7.182)
But this last set of components is immediately recognizable as that of the “curl” of a
vector field in 3-dimensions; also called “rot”. In index free notation

∗ (df ) = ∇ × f = curlf = rotf (7.183)



or equivalently
df = ∗(∇ × f ) = ∗(curlf ) = ∗(rotf ) (7.184)

What about the divergence operator δ? On general s-forms

δF = (−1)^{S+n(s+1)+1} ∗ d ∗ F   (7.185)

On one-forms it yields

δf = − ∗ d ∗ f = −fa ;a = −∇ · f = −divf (7.186)

while on two-forms
δF = ∗d ∗ F = ∗d(∗F ) = curl(∗F ) (7.187)

Exercise: Check all coefficients and signs. ♦

Exercise: Provide a low-brow proof of

δf = −divf (7.188)

directly in terms of the 3-d components (instead of invoking the general n-dimensional
result as we have done above). ♦

Exercise: Show that the standard 3-d identity

∇ × (∇Φ) = 0 = curl (grad Φ) (7.189)

is a specialization of ddΦ = 0. ♦

Exercise: Show that the standard 3-d identity

∇ · (∇ × f ) = 0 = div(curl f ) (7.190)

is a specialization of ddf = 0. ♦
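Both of these identities, together with ∗(df) = curl f itself, can be spot-checked by central differences. In the sketch below the fields are low-degree polynomials, so the finite differences are exact up to rounding (numpy only; helper names are illustrative):

```python
import numpy as np

h = 1e-3  # central differences are exact for the polynomial fields below, up to rounding

def partial(F, x, i):
    e = np.zeros(3)
    e[i] = h
    return (F(x + e) - F(x - e)) / (2 * h)

def curl(F, x):
    # {*(dF)}_a = [d_y F_z - d_z F_y, d_z F_x - d_x F_z, d_x F_y - d_y F_x], as in (7.182)
    J = np.array([partial(F, x, i) for i in range(3)])  # J[i, j] = d_i F_j
    return np.array([J[1, 2] - J[2, 1], J[2, 0] - J[0, 2], J[0, 1] - J[1, 0]])

def div(F, x):
    return sum(partial(F, x, i)[i] for i in range(3))

x0 = np.array([0.3, -0.7, 1.1])

# curl of a rigid-rotation field (-y, x, 0): expect (0, 0, 2)
rot = lambda x: np.array([-x[1], x[0], 0.0])
assert np.allclose(curl(rot, x0), [0.0, 0.0, 2.0], atol=1e-8)

# curl(grad Phi) = 0 for Phi = x^2 y + z, i.e. dd(Phi) = 0
grad_phi = lambda x: np.array([2 * x[0] * x[1], x[0] ** 2, 1.0])
assert np.allclose(curl(grad_phi, x0), 0.0, atol=1e-8)

# div(curl f) = 0 for a quadratic field f, i.e. dd(f) = 0
f = lambda x: np.array([x[1] ** 2, x[2] ** 2, x[0] ** 2])
assert abs(div(lambda y: curl(f, y), x0)) < 1e-7
```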

Exercise: Show that for a pair of vectors f and g

∗ (f ∧ ∗g) = f · g (7.191)

where the RHS is the inner product [dot product] of the two vectors. ♦
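This last identity can also be spot-checked numerically. Wedge-product normalizations vary (as several exercises in this chapter warn), so the sketch below assumes the cyclic normalization {f ∧ B}_{abc} = f_a B_{bc} + f_b B_{ca} + f_c B_{ab} for a one-form wedged with a two-form, in the same spirit as (7.177) (numpy only; names illustrative):

```python
import numpy as np

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = +1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

f = np.array([1.0, 2.0, -0.5])
g = np.array([3.0, -1.0, 4.0])

star_g = np.einsum('abc,c->ab', eps, g)   # {*g}_{ab} = eps_{ab}^c g_c

# assumed convention: {f ^ B}_{abc} = f_a B_{bc} + f_b B_{ca} + f_c B_{ab}
w = np.einsum('a,bc->abc', f, star_g)
wedge = w + np.transpose(w, (1, 2, 0)) + np.transpose(w, (2, 0, 1))

# (7.175)-style star of a three-form: *(f ^ *g) = (1/3!) eps^{abc} {f ^ *g}_{abc}
result = np.einsum('abc,abc->', eps, wedge) / 6.0

assert np.isclose(result, f @ g)
```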

Exercise: Show that for a pair of vectors f and g

[∗(f ∧ g)] = [∗f ] [∗g] (7.192)



where the product on the RHS refers to matrix multiplication of the two antisymmetric
matrices. ♦

Exercise: Look up every vector and differential identity you can find for 3-d Euclidean
space and translate them into the language of differential forms. ♦

7.12 Laplacian

In 3-d there is a vector identity

∇ × ∇ × f = −∇2 f + ∇(∇ · f ) (7.193)

or equivalently
∇2 f = −∇ × ∇ × f + ∇(∇ · f ) (7.194)
∇2 f = −curl curl f + grad (divf ) (7.195)
If we write this in terms of d and δ

curl curl f = ∗d(curlf ) = ∗d(∗(df )) = ∗d ∗ (df ) = δdf (7.196)

while
grad(divf ) = grad(−δf ) = −dδf (7.197)
That is:
∇2 f = − (δd + dδ) f (7.198)
But the LHS has been defined only for 3-d one-forms on Euclidean space in Cartesian co-
ordinates, while the RHS now can be meaningfully defined on arbitrary forms on arbitrary
curved manifolds. This motivates the definition, for all s-forms in arbitrary dimension:

Definition 63 The Laplacian of an s-form is defined by

∆F = −∇2 F = (δd + dδ) F (7.199)

Exercise: Show that this is equivalent to

∆F = (d + δ)2 F (7.200)



7.13 Riemann curvature forms

Now that we have developed the language of exterior differential forms, it can be used to
tidy up some of the formulae for the Riemann tensor.

Remember how we defined the transport operator,

[transport(y→x;γ) ]• • , (7.201)

and used this to construct the affine connexion as



Γ• •m = (∂/∂y^m) {transport[y → x; γ]}• • |_{y→x} .   (7.202)

Now using the notation of differential forms we can introduce a tensor-valued one-form Γ• • by writing:

Γ• • = Γ• •m dx^m = (∂/∂y^m) {transport[y → x; γ]}• • |_{y→x} dx^m .   (7.203)

Now define a tensor-valued curvature two-form by

R• • = R• •ab dxa ∧ dxb . (7.204)

This two-form contains all the information encoded in the Riemann tensor. The definition
of the Riemann tensor in terms of Γ• • is now written as

R• • = dΓ• • − Γ• • ∧ Γ• • (7.205)

and the Weitzenbock identities take the simple and elegant form

dR• • + R• • ∧ Γ• • − Γ• • ∧ R• • = 0. (7.206)

Exercise: As always, check normalizations, index placement, and sign conventions. ♦

Warning: This is not exactly the same as MTW's use of curvature forms as developed on
pages 354 ff. MTW make explicit use of the n-bein formalism and must distinguish mani-
fold indices from n-bein indices. [In physicist’s language, they must distinguish spacetime
indices from local Lorentz indices.] In the current formalism everything is done with
manifold indices and coordinate charts — the •s are just placeholders for ordinary tensor
indices. ♦
Chapter 8

Lie derivatives

The “Lie derivative” is the third standard way of producing covariant objects from partial
derivatives of a tensor; this time without invoking any connexion [or parallel transport]
at all. Of course, as usual there is a price to pay.

8.1 Partial derivatives are not tensors

Generically partial derivatives of tensor quantities are not themselves tensors. We have
investigated in detail two ways of dealing with this issue — through the covariant deriva-
tive and via the exterior derivative. There is a third standard way of building tensor
quantities out of partial derivatives, known as the Lie derivative. (Named after Sophus
Lie, he of Lie groups, Lie algebras, etc.)

If you have been paying attention, you might have noticed the first hint of the Lie
derivative sneak by as we talked about the commutator of two vector fields. Recall that
if ta1 (x) and ta2 (x) are two contravariant vector fields then the commutator, defined as

[t1 , t2 ]b = ta1 ∂a tb2 − ta2 ∂a tb1 (8.1)

is also a contravariant vector field.

We will now use this quantity to iteratively define a new type of derivative on Tsr tensors.
For a change, I will adopt the axiomatic approach.

Axiom 1 The Lie derivative Lv of a tensor with respect to a contravariant vector field v
is a linear mapping
Lv : Tsr → Tsr (8.2)


that is also linear in the argument v and satisfies the Leibnitz rule on Cartesian products
of tensors
Lv (X1 ⊗ X2 ) = (Lv X1 ) ⊗ X2 + X1 ⊗ (Lv X2 ) (8.3)

That is:
Lv1+v2 X = Lv1 X + Lv2 X   (8.4)
Lv (X1 + X2 ) = Lv X1 + Lv X2 (8.5)
So far, this just specifies the general structure of the derivative, but does not specify a
particular notion of derivative. To do that we add two additional axioms:

Axiom 2 Acting on scalars


Lv f = vf = v a ∂a f (8.6)

Axiom 3 Acting on contravariant vectors



Lv t = [v, t] = ( v^a ∂_a t^b − t^a ∂_a v^b ) ∂/∂x^b   (8.7)

From these axioms we can now deduce the behaviour of Lv on arbitrary Tsr tensors.

Note, in this "third route" to a tensorial notion of derivative, the use of the auxiliary vector field v. Essentially the decision to limit the directions in which we are differentiating is the "extra ingredient" that lets us get somewhere. Compare to the covariant derivative, where we added the extra structure of parallel transport, and the exterior derivative, where we restricted the set of tensors to be differentiated.

8.2 Lie derivatives on T0r tensors

To see what happens on a T0r tensor, recall that any such tensor can be written as a linear combination of terms of the form

t1 ⊗ t 2 ⊗ t 3 ⊗ · · · ⊗ t r (8.8)

But by repeated application of the Leibnitz rule axiom

Lv (t1 ⊗ t2 ⊗ t3 ⊗ · · · ⊗ tr ) = (Lv t1 ) ⊗ (t2 ⊗ t3 ⊗ · · · ⊗ tr )


+t1 ⊗ (Lv t2 ) ⊗ t3 ⊗ · · · ⊗ tr
+···
+t1 ⊗ t2 ⊗ t3 ⊗ · · · ⊗ (Lv tr ) (8.9)

= [v, t1 ] ⊗ t2 ⊗ t3 ⊗ · · · ⊗ tr
+t1 ⊗ [v, t2 ] ⊗ t3 ⊗ · · · ⊗ tr
+···
+t1 ⊗ t2 ⊗ t3 ⊗ · · · ⊗ [v, tr ] (8.10)
In components this means

{Lv (t1 ⊗ t2 ⊗ t3 ⊗ · · · ⊗ tr)}^{a1 a2 ... ar}
= (Lv t1)^{a1} t2^{a2} t3^{a3} · · · tr^{ar}
+ t1^{a1} (Lv t2)^{a2} t3^{a3} · · · tr^{ar}
+ · · ·
+ t1^{a1} t2^{a2} t3^{a3} · · · (Lv tr)^{ar}   (8.11)

= (v ∂ t1 − t1 ∂ v)^{a1} t2^{a2} t3^{a3} · · · tr^{ar}
+ t1^{a1} (v ∂ t2 − t2 ∂ v)^{a2} t3^{a3} · · · tr^{ar}
+ · · ·
+ t1^{a1} t2^{a2} t3^{a3} · · · (v ∂ tr − tr ∂ v)^{ar}   (8.12)

= v^m ∂_m ( t1^{a1} t2^{a2} t3^{a3} · · · tr^{ar} )
− Σ_{i=1}^{r} [ Π_{j=1}^{i−1} tj^{aj} ] ti^b ∂_b v^{ai} [ Π_{j=i+1}^{r} tj^{aj} ]   (8.13)

So for X a1 a2 ...ar a generic T0r tensor


{Lv X}^{a1 a2 ... ar} = v^m ∂_m X^{a1 a2 ... ar} − Σ_{i=1}^{r} X^{a1 a2 ... b ... ar} ∂_b v^{ai}   (8.14)

Example: If r = 2

{Lv X}^{a1 a2} = v^m ∂_m X^{a1 a2} − [ X^{b a2} ∂_b v^{a1} + X^{a1 b} ∂_b v^{a2} ]   (8.15)

Example: If r = 3

{Lv X}^{a1 a2 a3} = v^m ∂_m X^{a1 a2 a3} − [ X^{b a2 a3} ∂_b v^{a1} + X^{a1 b a3} ∂_b v^{a2} + X^{a1 a2 b} ∂_b v^{a3} ]   (8.16)
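Formula (8.15) can be sanity-checked on a product tensor X = t1 ⊗ t2, where the Leibnitz rule gives an independent expression for Lv X. A numpy sketch with polynomial fields on R² and analytically entered partial derivatives (all names illustrative):

```python
import numpy as np

x0 = np.array([1.1, -0.3])

v = lambda x: np.array([x[0] ** 2, x[0] * x[1]])
dv = lambda x: np.array([[2 * x[0], x[1]], [0.0, x[0]]])   # dv[b, a] = d_b v^a

t1 = lambda x: np.array([x[1], x[0]])
dt1 = np.array([[0.0, 1.0], [1.0, 0.0]])                   # dt1[m, a] = d_m t1^a
t2 = lambda x: np.array([x[0], 2 * x[1]])
dt2 = np.array([[1.0, 0.0], [0.0, 2.0]])

# X^{a1 a2} = t1^{a1} t2^{a2}, with d_m X entered via the product rule
X = np.outer(t1(x0), t2(x0))
dX = np.einsum('ma,b->mab', dt1, t2(x0)) + np.einsum('a,mb->mab', t1(x0), dt2)

# (8.15): {Lv X}^{a1 a2} = v^m d_m X^{a1 a2} - [X^{b a2} d_b v^{a1} + X^{a1 b} d_b v^{a2}]
lie_X = (np.einsum('m,mab->ab', v(x0), dX)
         - np.einsum('bd,ba->ad', X, dv(x0))
         - np.einsum('ab,bd->ad', X, dv(x0)))

# Leibnitz rule: Lv(t1 x t2) = (Lv t1) x t2 + t1 x (Lv t2)
lie_t1 = v(x0) @ dt1 - t1(x0) @ dv(x0)
lie_t2 = v(x0) @ dt2 - t2(x0) @ dv(x0)
assert np.allclose(lie_X, np.outer(lie_t1, t2(x0)) + np.outer(t1(x0), lie_t2))
```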

Exercise: Show, by explicitly performing the coordinate transformations and watching the "obnoxious pieces" cancel, that {Lv X}^{a1 a2} and {Lv X}^{a1 a2 a3} really are tensor quantities. ♦

8.3 Lie derivatives on covariant vectors

By combining the axioms for the Lie derivative on scalars and on contravariant vectors with the Leibnitz rule, we can deduce the effect of a Lie derivative on covariant vectors. Recall that for a contravariant vector t and covariant vector g we construct a scalar ⟨g|t⟩ via the pairing
scalar hg|ti via the pairing

hg|ti = hga dxa |tb ∂b i = ga tb hdxa |∂b i = ga tb δ a b = ga ta = tr(g ⊗ t) (8.17)

But then
Lv tr(g ⊗ t) = trLv (g ⊗ t) = tr{(Lv g) ⊗ t + g ⊗ (Lv t)} (8.18)
So that
tr{(Lv g) ⊗ t} = Lv tr(g ⊗ t) − tr{g ⊗ [v, t]} (8.19)
In components
(Lv g)a ta = v m ∂m (ga ta ) − ga (v m ∂m ta − tm ∂m v a ) (8.20)
So that
(Lv g)a ta = v m (∂m ga )ta + v m (∂m ta )ga − ga (v m ∂m ta − tm ∂m v a ) (8.21)
The terms involving ∂t cancel:

(Lv g)a ta = v m (∂m ga )ta + gm ta (∂a v m ) = {v m (∂m ga ) + gm (∂a v m )}ta (8.22)

Since this now holds for arbitrary ta

(Lv g)a = v m (∂m ga ) + gm (∂a v m ) (8.23)

Exercise: Verify by making a coordinate transformation that

(Lv g)a = v m (∂m ga ) + gm (∂a v m ) (8.24)

transform as the components of a T10 tensor. ♦
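The formula just derived can also be cross-checked against the Leibnitz rule: for fields on R², Lv applied to the scalar g_a t^a must equal (Lv g)_a t^a + g_a (Lv t)^a. A minimal numpy sketch with polynomial fields and analytically entered partial derivatives (names illustrative):

```python
import numpy as np

x0 = np.array([0.5, 1.2])

v = lambda x: np.array([x[0] * x[1], x[1] ** 2])
t = lambda x: np.array([x[1], x[0]])
g = lambda x: np.array([x[0] ** 2, x[0] * x[1]])

dv = lambda x: np.array([[x[1], 0.0], [x[0], 2 * x[1]]])   # dv[a, b] = d_a v^b
dt = lambda x: np.array([[0.0, 1.0], [1.0, 0.0]])          # dt[a, b] = d_a t^b
dg = lambda x: np.array([[2 * x[0], x[1]], [0.0, x[0]]])   # dg[a, b] = d_a g_b

# (Lv g)_a = v^m d_m g_a + g_m d_a v^m
lie_g = v(x0) @ dg(x0) + dv(x0) @ g(x0)
# {Lv t}^a = v^m d_m t^a - t^m d_m v^a
lie_t = v(x0) @ dt(x0) - t(x0) @ dv(x0)
# Lv(g_a t^a) = v^m d_m (g_a t^a)
lie_scalar = v(x0) @ (dg(x0) @ t(x0) + dt(x0) @ g(x0))

# Leibnitz rule for the pairing
assert np.isclose(lie_scalar, lie_g @ t(x0) + g(x0) @ lie_t)
```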

8.4 Lie derivatives on general tensors

By considering terms of the form {g1}_{a1} {g2}_{a2}, you can bootstrap this to T20 tensors and beyond.

Exercise: Show that on T20 tensors

(Lv X)ab = v m (∂m Xab ) + {Xmb (∂a v m ) + Xam (∂b v m )} (8.25)



Exercise: Show that on T30 tensors

(Lv X)abc = v m (∂m Xabc ) + {Xmbc (∂a v m ) + Xamc (∂b v m ) + Xabm (∂c v m )} (8.26)

Exercise: Find the appropriate generalization to Ts0 tensors. ♦

Exercise: Combine with the previous section and find the appropriate generalization
to Tsr tensors. ♦

This is a “low-brow” approach to the Lie derivative — it has been defined recursively
in terms of basic rules for scalars and covariant vectors, and the need for preservation of
some basic features of anything that we would like to call a derivative. There is, as yet,
no geometrical interpretation of the Lie derivative.

8.5 Geometry of the Lie derivative — Lie dragging

The underlying geometry of the Lie derivative is the act of “dragging” geometrical quan-
tities along integral curves of a vector field. Suppose we are given some vector field v(x),
then we can define the integral curves via
dx^a/dλ = v^a(x^b(λ))   (8.27)

There is no requirement that these curves be geodesic, or in any way nice — as long
as they are sufficiently smooth. Now moving a certain parameter distance λ along these
curves defines a set of mappings from the manifold to itself

φλ : M → M (8.28)

These mappings essentially “drag” all points x on the manifold a “distance” λ along the
integral curve and may be represented (for suitably small λ) as coordinate functions

ṽ(λ) : xa (xb0 ; v(x); λ) = xa0 + v a (x0 )λ + O[λ2 ] (8.29)

Instead of dragging a point we can ask what it means to drag a scalar function, a
suitable definition is:
f ∗ (x0 ; λ) = f (x(x0 ; v; λ)) (8.30)
That is, the dragged function f ∗ at the point x0 is defined to be the original function at
the point that x0 is dragged to by the action of φλ .

But now we can ask what it means to drag a vector field; let t(x) be a second vector
field, defined by a second set of integral curves
dx^a/dµ = t^a(x^b(µ))   (8.31)

with solution
t̃(λ) : xa (xb0 ; t(x); µ) = xa0 + ta (x0 )µ + O[µ2 ] (8.32)
Each one of these curves, when acted on by φλ will be pushed a certain distance along
the “v direction”; yielding a new set of curves

t̃∗(µ; λ) : x^a(x^b(x0; t; µ); v(x^b(x0; t; µ)); λ)

= x^a(x0^a + t^a(x0)µ + O[µ²]; v(x0^b + t^b(x0)µ + O[µ²]); λ)   (8.33)

= x0^a + t^a(x0)µ + O[µ²] + v^a(x0^b + t^b(x0)µ + O[µ²])λ + O(λ²).   (8.34)

Here µ is the parameter along the t̃∗ curve; λ is the distance the whole collection of t
curves has been pulled along the v direction. But the curves t̃∗ (µ; λ) themselves have
tangent vectors, and so define a new set of tangent vector fields
t∗ (x; λ) (8.35)

that depend on the parameter λ and reside at the point x(x0; v; λ) — this is the "Lie dragged vector field". Now you can define

Lv t = lim_{λ→0} [ t(x(x0; v; λ)) − t∗(x; λ) ] / λ   (8.36)
secure in the knowledge that it is at least a contravariant vector. Note that we have been
very careful to ensure that t(x(x0 ; v; λ)) and t∗ (x; λ) both reside in the same tangent space
[at x(x0 ; v; λ)].

Of course it remains to be shown that this agrees with the previous definition. Now in
components we have
[t∗(x; λ)]^a = d/dµ { x^a(x^b(x0; t; µ); v(x^b(x0; t; µ)); λ) }   (8.37)

which we can evaluate as

[t∗ (x; λ)]a = ta (x0 ) + ∂b v a |x0 tb (x0 )λ + O(λ2 ) (8.38)


whereas

t^a(x(x0; v; λ)) = t^a(x0^b + v^b(x0)λ + O[λ²]) = t^a(x0) + v^b(x0) ∂_b t^a|_{x0} λ + O[λ²]   (8.39)
so that

(d/dλ) { t^a(x(x0; v; λ)) − [t∗(x; λ)]^a } |_{x0} = v^b(x0) ∂_b t^a|_{x0} − ∂_b v^a|_{x0} t^b(x0)   (8.40)

which verifies that it is the same quantity as per the previous component-based definition:

{Lv t}a = v b ∂b ta − tb ∂b v a . (8.41)
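The dragging construction itself can be imitated numerically: push t forward with the first-order flow map x ↦ x + λ v(x), form the difference quotient (8.36) at small λ, and compare with the component formula (8.41). A sketch assuming only numpy, with the fields and their partial derivatives entered analytically (names illustrative):

```python
import numpy as np

# polynomial fields on R^2, with their Jacobians entered by hand
v = lambda x: np.array([x[1], x[0] * x[1]])
dv = lambda x: np.array([[0.0, x[1]], [1.0, x[0]]])        # dv[a, b] = d_a v^b
t = lambda x: np.array([x[0] ** 2, x[1]])
dt = lambda x: np.array([[2 * x[0], 0.0], [0.0, 1.0]])     # dt[a, b] = d_a t^b

x0 = np.array([0.7, -0.4])
lam = 1e-6

# drag: compare t at the dragged point with the pushed-forward t*
x_lam = x0 + lam * v(x0)                   # x(x0; v; lam), to O(lam^2)
t_push = t(x0) + lam * dv(x0).T @ t(x0)    # (8.38): [t*]^a = t^a + lam d_b v^a t^b
lie_numeric = (t(x_lam) - t_push) / lam    # the difference quotient (8.36)

lie_formula = v(x0) @ dt(x0) - t(x0) @ dv(x0)   # (8.41): v^b d_b t^a - t^b d_b v^a
assert np.allclose(lie_numeric, lie_formula, atol=1e-5)
```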

Note that the process which takes the vector field t(x) into the vector field t∗ (x; λ) is
often called the “push forward” of the function φλ : M → M. One often sees notation
like
t∗ (x; λ) = φ∗ (t(x)) (8.42)

Exercise: Verify that “Lie dragging” on scalars reproduces the previous definition

Lv f = v a ∂ a f (8.43)

How would we Lie drag a covariant vector? Well, we know how to Lie drag a scalar, and we know how to Lie drag a contravariant vector, so consider the scalar quantity g(t) = g_a t^a. The dragged covector g∗(x; λ) is defined at the point x0 by its action on vector fields via

g∗(x0; λ) t(x0) = g(x(x0; v; λ)) t∗(x0; λ)   (8.44)

The process which takes the covector field g(x) into the covector field g∗ (x; λ) is often
called the “pull back” of the function φλ : M → M. One often sees notation like

g∗ (x; λ) = φ∗ (g(x)) (8.45)

What does this mean in components?

[g∗(x0; λ)]_a [t(x0)]^a = [g(x(x0; v; λ))]_a [t∗(x0; λ)]^a   (8.46)

= ( g_a(x0) + v^b(x0) ∂_b g_a|_{x0} λ + O[λ²] ) × ( t^a(x0) + ∂_b v^a|_{x0} t^b(x0) λ + O(λ²) )   (8.47)

= ( g_a(x0) + v^b ∂_b g_a λ + O[λ²] ) × ( δ^a_c + ∂_c v^a|_{x0} λ + O(λ²) ) t^c(x0)   (8.48)

So that, factoring out the ta (x0 ) (and it is a non-trivial check of internal consistency that
this vector does factor out in this way)

[g∗(x0; λ)]_a = g_a(x0) + λ [ v^b ∂_b g_a + g_b ∂_a v^b ] + O(λ²)   (8.49)

Note that this is a one-form at x0 , so that it makes sense to define

Lv g = lim_{λ→0} ( [g∗(x0; λ)]_a − g_a(x0) ) / λ = v^b ∂_b g_a + g_b ∂_a v^b   (8.50)

Note that this is the same definition as previously deduced — so everything is consistent.

Exercise: Since we know how to “push forward” a vector; it should be easy enough for
you to see how to “push forward” a T0r tensor. Go through the formalism to make sure it
holds together and verify that you can reproduce the previous result for the Lie derivative
on T0r tensors. ♦

Exercise: Since we know how to “pull back” a covector; it should be easy enough for
you to see how to “pull back” a Ts0 tensor. Go through the formalism to make sure it
holds together and verify that you can reproduce the previous result for the Lie derivative
on Ts0 tensors. ♦

Exercise: Hence verify that this “Lie dragging” on arbitrary Tsr tensors reproduces the
previous definition. There is a minor technical issue to deal with — define the pullback for
a T0r tensor by starting at the point x(x0; v; λ) and pushing forward a parameter distance
−λ. This is necessary in order to make sure you really are subtracting tensors at the same
point x0 . ♦

8.6 Symmetries and Killing vectors

— to be written —
Chapter 9

Extrinsic curvature

— to be written —

9.1 Embeddings

9.2 First fundamental form

9.3 Second fundamental form (extrinsic curvature)

9.4 Third fundamental form

9.5 Equations of Gauss and Codazzi

9.6 Equations of Weingarten

— to be written —

Chapter 10

Distribution valued curvature

— to be written —

10.1 Discontinuities in the connexion

10.2 Discontinuities in the metric?

10.3 Colombeaux algebra?

— to be written —

Chapter 11

Gauge-fields

Gauge fields are an extension of the idea of the affine connexion to general vector bundles,
not just the tangent bundle.

11.1 Connexions on vector bundles

Remember how we defined the transport operator,

[transport(y→x;γ) ]• • , (11.1)

and used this to construct the affine connexion as



Γ• •m = (∂/∂y^m) {transport[y → x; γ]}• • |_{y→x} .   (11.2)

Eventually we defined the Riemann tensor, and using the notation of differential forms


Γ• • = Γ• •m dx^m = (∂/∂y^m) {transport[y → x; γ]}• • |_{y→x} dx^m ,   (11.3)

wrote it in the form


R• • = dΓ• • − Γ• • ∧ Γ• • (11.4)
Now the •s are just place-holders for indices corresponding to the tangent and cotangent
spaces.

But, and here is the beauty of the formalism, nothing in the above depends on us actually
working with the tangent bundle — the •s can be interpreted as place-holders for indices
in any arbitrary vector space we like.


Specifically, let V(M) be some vector bundle with base space M, and fibres some vector
space V . If we assume the existence of some sort of parallel transport operator in this
vector bundle we can use the above to construct a connexion Γ• •m where the •s are
place-holders for indices in the vector space V .

Conversely if we assume the existence of Γ• •m we can use the path ordering process to
generate the parallel transport operator
[transport(y→x;γ)]• • = P exp( ∫_x^y Γ• •c dx^c )   (11.5)

again using any arbitrary vector space V as fibre.

Much of what we discussed for arbitrary affine connexions still holds true in this case.
In coordinates we can write
R• •ab = Γ• •[a,b] − Γ• •[a Γ• •b] (11.6)
One of the obvious things we cannot do is that we cannot now define a Ricci tensor, since
that would involve trying to “trace” over incompatible indices — one vector index with
one coordinate index.

The S-tensor on the other hand, continues to make perfectly good sense, so some of
Schouten’s identities continue to work.

Exercise: What happens to the Bianchi identities [Weitzenbock identities]? ♦

Exercise: What happens to the S–tensor? It is often [but not always] zero. Can you
find a pattern there? ♦

Exercise: Work your way through the chapter on general affine connexions in detail
and see just how much of it will survive in this current context. Does torsion make any
sense? Does nonmetricity make any sense? ♦

Example: Suppose we have a complex line bundle L(M). Then the fibre is a vector
space with one complex dimension and we can simplify
Γ• • = Γ• •m dx^m → i A_m dx^m = i A.   (11.7)
That is, the affine connexion [on this line bundle] is just a complex valued one-form. (The
presence of the i is pure convention.) Therefore
Γ• • ∧ Γ• • → iA ∧ iA = 0, (11.8)
and the Riemann curvature simplifies to
R• • = dΓ• • − Γ• • ∧ Γ• • → idA = iF. (11.9)

Thus in this situation the Riemann tensor simplifies to a complex-valued two form
F = dA (11.10)
In this particular case the Bianchi identities simply reduce to
dF = ddA = 0 (11.11)

Exercise: Suppose the parallel transport operator preserves the absolute value of the
complex “vector” as you move around the manifold M. (We have not used any such
property up to this stage.) What does that tell you about the one-form A? ♦

Exercise: Suppose we work in 4 dimensions (time+space). [You can consider flat space
with Cartesian coordinates for simplicity, though the construction generalizes to arbitrary
manifolds.] Pick
A = φ dt + A · dx = φ dt + A_i dx^i

where φ is the electromagnetic scalar potential and A is the electromagnetic vector potential. Relate the curvature F to the electromagnetic fields E and B. Demonstrate that

F = dA = E_i dx^i ∧ dt + ε_{ijk} B^i dx^j ∧ dx^k

Exercise: Furthermore, demonstrate that the Bianchi identities dF = 0 are equivalent to the two homogeneous Maxwell equations. ♦

Exercise: Finally, demonstrate that the "gauge transformations" of electromagnetism are equivalent to coordinate transformations on the one-dimensional complex fibre of the vector bundle. ♦

11.2 Abelian gauge fields

General Abelian gauge fields are a minor generalization of electromagnetism. Whenever


the connexion Γ• •m commutes with itself (when viewed as a matrix on the vector space
indexed by the •s), we can say that this matrix is Abelian, and call the resulting gauge
connexion Abelian. Analogously to the previous example, define
Γ• • = iA• • (11.12)

In this case (because the matrices A• • commute with each other) we still have

A• • ∧ A• • = 0. (11.13)

The curvature simplifies to


F • • = dA• • , (11.14)
and the Bianchi identities are
dF • • = 0. (11.15)
Effectively this reduces A• • to a number of independent fields, each rather similar to the
“electromagnetic” field, with one such field for each dimension of the vector space V .

[There is no known physics use for this mathematical object.]

11.3 non-Abelian gauge fields

Now let V be an arbitrary N dimensional vector space. Then

Γ• • = iA• • (11.16)

defines some N × N matrix of one-forms, and in general these need not commute. Unless
we add further constraints to the geometry, these are GL(N )-valued one-forms, where
GL(N ) stands for general linear N × N matrices.

We now have
A• • ∧ A• • ≠ 0.   (11.17)
The curvature now contains contributions from both terms

F • • = dA• • − iA• • ∧ A• • , (11.18)

and the Bianchi identity is more complicated. (Note the explicit occurrence of i; this is a
choice of convention and convenience. Physicists tend to keep the i explicit, mathemati-
cians tend to suppress it by absorbing it into the definition of A• • .)
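To see concretely why the wedge term now survives, note that for constant matrix-valued one-forms A = M dx + N dy one has A ∧ A = (MN − NM) dx ∧ dy, a commutator. A throwaway numerical sketch (the matrices are invented for illustration):

```python
# For constant matrix-valued one-forms A = M dx + N dy the wedge product is
# A ^ A = (M N - N M) dx ^ dy, so the wedge term vanishes iff M and N commute.

def matmul(X, Y):
    """2x2 matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(M, N):
    MN = matmul(M, N)
    NM = matmul(N, M)
    return [[MN[i][j] - NM[i][j] for j in range(2)] for i in range(2)]

# Abelian case: diagonal matrices always commute.
M_ab = [[1, 0], [0, 2]]
N_ab = [[3, 0], [0, 4]]

# non-Abelian case: two generic non-commuting matrices.
M_na = [[0, 1], [0, 0]]
N_na = [[0, 0], [1, 0]]

print(commutator(M_ab, N_ab))  # [[0, 0], [0, 0]] -> A ^ A = 0
print(commutator(M_na, N_na))  # [[1, 0], [0, -1]] -> A ^ A contributes to F
```

This is why the Abelian case of the previous section was special: there the commutator vanished identically.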

Exercise: Find the appropriate version of the Bianchi identity. ♦

Exercise: Suppose that the vector space V which fibres V(M) has a “norm” (so that
“lengths” of vectors make sense; we have a metric on V ), and suppose that the transport
operator is chosen to always preserve the length of the vectors it transports. (This is an
extremely natural restriction.)

Show that in this situation the matrix-valued one-forms A• • are actually Hermitian
N × N matrices.

(Mathematicians, who typically absorb the explicit i into the definition of A• • , would
instead be working with anti-Hermitian matrices.) ♦

Comment: [Not examinable] In physics, these non-Abelian gauge fields are often called
Yang–Mills fields. They are used, in particular, in quantum chromo-dynamics [QCD]
and in the electroweak standard model of particle physics. Further afield they underlie
the grand unified field theories [GUTS], though it should be noted that the GUTS are
“neither grand, nor unified, nor even theories”.

In particle physics, the transport operation always preserves the norm, and one is deal-
ing with Hermitian N × N matrices. The Hermitian N × N matrices generate the group
U (N ) of unitary N × N matrices. These unitary matrices are called the “gauge group”,
they are the permissible coordinate transformations you can make in the fibre V that do
not affect the norm of the vector. ♦

Exercise: Show that in the case of a single Abelian gauge field (that is, electromag-
netism) the gauge group is U (1). ♦

Comment: [Not examinable] In QCD one works with a 3 dimensional complex vector
space of three “colours”, preserves the norm, and factors out a physically irrelevant over-
all phase. The result is that one works with Hermitian traceless 3 × 3 matrices, and the
“gauge group” [the set of permissible coordinate transformations] is SU (3), the set of
special [determinant one] unitary 3 × 3 matrices. ♦

Comment: [Not examinable] In the electroweak model, the relevant gauge group is
SU (2) × U (1). Roughly speaking the SU (2) has to do with the weak interactions while
the U (1) has to do with electromagnetism, but the situation is made more complicated
by the presence of spontaneous symmetry breaking. ♦

Warning: Everything I have said here has to do with classical gauge theories, as I have
not even hinted at what would then be needed to build a quantum theory along these
lines. ♦

Warning: Gauge theories are also used outside of particle physics — for instance, I have seen engineers trying to use gauge theories to analyze the recognition problem, where a "target" may be translated and rotated by an arbitrary amount, and you need to recognize it despite the peculiar point of view.

I have also seen attempts at modeling the swimming motions of bacteria and other small organisms through a gauge-theoretic representation of the shape of the organism. ♦
Chapter 12

Coda

• Differential geometry is a basic tool in abstract mathematics, applied mathematics, and theoretical physics.

• There are also engineering applications in elasticity and deformation theory.

• In my own research I have [among other things] been using the language of dif-
ferential geometry and curved spaces to investigate sound propagation in moving
fluids.

• There are also applications in theoretical biology — everything from studying the
conformation space of a protein molecule to the “fitness landscape” of biological
systems.

—###—

• Of course I did not finish all the topics I wanted to cover — but we have done quite a
bit.

• The material I have managed to cover is a good solid introduction to differential geometry
— at a level appropriate to both applied mathematicians and physicists.

—###—

Appendix A

Notation

Some key points of tricky notation:

iff if and only if.


wlog without loss of generality.
LHS Left hand side (of some equation).
RHS Right hand side (of some equation).
homeomorphism the function is continuous, it has an inverse function, and the inverse
function is continuous.
diffeomorphism the function is a homeomorphism [typically from some subset of IR^n to another] that in addition is differentiable in both directions.
symmetrize

S_(a1 a2 ...ar) = (1/r!) Σ_π S_π(a1 a2 ...ar)

anti-symmetrize

A_[a1 a2 ...ar] = (1/r!) Σ_π signum(π) A_π(a1 a2 ...ar)

partial derivatives ∂_a is shorthand for ∂/∂x^a.
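The symmetrization and antisymmetrization operations are easy to implement directly; a small throwaway sketch (the sample tensor T is invented):

```python
# Sketch of the (anti-)symmetrization maps: T is any function of an index
# tuple, and we average over all permutations of the index positions.
from itertools import permutations

def sign(perm):
    """Parity of a permutation given as a tuple of positions 0..r-1."""
    s = 1
    perm = list(perm)
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            s = -s
    return s

def symmetrize(T, indices):
    """Returns S_(a1...ar): the average of T over permuted index positions."""
    perms = list(permutations(range(len(indices))))
    return sum(T(tuple(indices[p[i]] for i in range(len(indices))))
               for p in perms) / len(perms)

def antisymmetrize(T, indices):
    """Returns A_[a1...ar]: the signed average over permuted index positions."""
    perms = list(permutations(range(len(indices))))
    return sum(sign(p) * T(tuple(indices[p[i]] for i in range(len(indices))))
               for p in perms) / len(perms)

# Example: T(a, b) = a + 2 b, so its antisymmetric part is (T_ab - T_ba)/2.
T = lambda idx: idx[0] + 2 * idx[1]
print(antisymmetrize(T, (1, 3)))   # (T(1,3) - T(3,1)) / 2 = (7 - 5)/2 = 1.0
print(symmetrize(T, (1, 3)))       # (T(1,3) + T(3,1)) / 2 = (7 + 5)/2 = 6.0
```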

—###—

Appendix B

Symmetries and counting arguments

We start with the trivial observation that an r-index tensor with no symmetries has n^r algebraically independent components.

B.1 Totally antisymmetric tensors

We define
A_[a1 a2 ...ar] = (1/r!) Σ_π signum(π) A_π(a1 a2 ...ar)

where the sum runs over all permutations π of the r indices and signum(π) denotes the parity of the permutation. (That is, −1 if an odd number of indices are flipped out of order, and +1 otherwise.) For example

A_[a1] = A_{a1} ;    A_[a1 a2] = (A_{a1 a2} − A_{a2 a1}) / 2

A_[a1 a2 a3] = (A_{a1 a2 a3} + A_{a2 a3 a1} + A_{a3 a1 a2} − A_{a2 a1 a3} − A_{a3 a2 a1} − A_{a1 a3 a2}) / 6

As another example, if we already happen to know that A_abc is anti-symmetric on its first two indices then the above simplifies to

A_[a1 a2 a3] = (A_{a1 a2 a3} + A_{a2 a3 a1} + A_{a3 a1 a2}) / 3

To calculate the number A^n_r of algebraically independent components in a completely anti-symmetric tensor of r indices proceed as follows:

Pick the first index in any of n ways; then because of antisymmetry the second cannot

equal the first, and must be picked out of the remaining n − 1 indices. Proceeding through
all r indices leads to
n(n − 1)(n − 2) · · · (n − r + 1)

distinct ways of assigning distinct indices. But by the antisymmetry all r! rearrangements of these indices are algebraically related. Thus the number of algebraically independent components is simply the binomial coefficient

A^n_r = n(n − 1)(n − 2) · · · (n − r + 1) / r! = n! / [(n − r)! r!] = (n choose r)

Note:

A^n_r ≤ n^r / r!

The first few of these numbers are

A^n_r → 1; n; n(n − 1)/2; n(n − 1)(n − 2)/6; . . .
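This counting argument is easy to confirm by brute force: the independent components are labelled by the strictly increasing index strings. A throwaway sketch:

```python
# Brute-force check of the counting argument: the independent components of a
# totally antisymmetric r-index tensor in n dimensions are labelled by the
# strictly increasing index strings, and there are C(n, r) of them.
from itertools import combinations
from math import comb

def antisym_count(n, r):
    # one representative per set of r distinct index values
    return sum(1 for _ in combinations(range(n), r))

for n in range(1, 7):
    for r in range(0, n + 1):
        assert antisym_count(n, r) == comb(n, r)

print(antisym_count(4, 2))  # 6 = 4*3/2: e.g. an antisymmetric 2-tensor in 4 dimensions
```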

B.2 Totally symmetric tensors

We define
S_(a1 a2 ...ar) = (1/r!) Σ_π S_π(a1 a2 ...ar)

where the sum runs over all permutations π of the r indices. For example

S_(a1) = S_{a1} ;    S_(a1 a2) = (S_{a1 a2} + S_{a2 a1}) / 2

S_(a1 a2 a3) = (S_{a1 a2 a3} + S_{a2 a3 a1} + S_{a3 a1 a2} + S_{a2 a1 a3} + S_{a3 a2 a1} + S_{a1 a3 a2}) / 6

As another example, if we already happen to know that S_abc is symmetric on its first two indices then the above simplifies to

S_(a1 a2 a3) = (S_{a1 a2 a3} + S_{a2 a3 a1} + S_{a3 a1 a2}) / 3

Calculating the number S^n_r of algebraically independent components in a completely symmetric tensor of r indices is a trifle trickier than the antisymmetric case. Proceed as follows: How many inequivalent ways are there of choosing the values of the r indices from the n possibilities? It equals the number of ways of choosing r items from n possibilities, including repetitions. This is a standard problem for which the textbook answer is:

S^n_r = (n + r − 1 choose r) = (n + r − 1)! / [(n − 1)! r!] = (n + r − 1 choose n − 1)

That is

S^n_r = (r + 1)(r + 2) · · · (r + n − 1) / (n − 1)! = n(n + 1)(n + 2) · · · (n + r − 1) / r!

There is a strong tendency not to actually derive this result, and merely to quote it. That is because, in contrast to A^n_r, the derivation is a little messy.

Obvious results are

S^n_0 = 1; S^n_1 = n; S^n_2 = n(n + 1)/2

But even S^n_3 is a bit of a mess. (S^n_0 represents a scalar, which has exactly 1 component!)

There are two ways of proceeding:

• tricky re-labelling of the items to be enumerated;


• induction.
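Before either derivation, the claimed formula can be confirmed by brute-force enumeration (a throwaway sketch):

```python
# Brute-force check: independent components of a totally symmetric r-index
# tensor in n dimensions are labelled by non-decreasing index strings, and
# there are C(n + r - 1, r) of them.
from itertools import combinations_with_replacement
from math import comb

def sym_count(n, r):
    return sum(1 for _ in combinations_with_replacement(range(n), r))

for n in range(1, 7):
    for r in range(0, 6):
        assert sym_count(n, r) == comb(n + r - 1, r)

print(sym_count(4, 2))  # 10 = 4*5/2: e.g. a symmetric 2-tensor (a metric) in 4 dimensions
```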

B.2.1 Enumeration

Suppose we have a completely symmetric tensor S_(a1 a2 a3 ...ar); then wlog we can always rearrange the indices in non-decreasing order, with

1 ≤ a1 ≤ a2 ≤ a3 ≤ · · · ≤ ar ≤ n.

Then S^n_r equals the number of such sequences #{ai}.

But to each non-decreasing sequence of the above form, the equation b_i = a_i + i − 1 defines a strictly increasing sequence of the form

1 ≤ b1 < b2 < b3 < · · · < br ≤ n + r − 1.

Conversely, to any strictly increasing sequence of the form b_i above, the formula a_i = b_i + 1 − i associates a non-decreasing sequence of the form a_i given above. Then S^n_r equals the number of such sequences #{bi}.

But the number of these b_i sequences is very easy to compute, as it equals the number of independent components of an anti-symmetric tensor with r indices in n + r − 1 dimensions. That is

S^n_r = A^{n+r−1}_r = (n + r − 1 choose r)

as was to be shown.

This enumerative technique is well-known in the field of discrete mathematics (specif-


ically combinatorics and enumeration), but is the sort of technique that most applied
mathematicians and theoretical physicists do not see in their training.
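The bijection at the heart of the enumeration argument can be checked explicitly; a small sketch (the values of n and r are arbitrary):

```python
# Sketch of the bijection used above: non-decreasing sequences (a_i) with
# 1 <= a_1 <= ... <= a_r <= n correspond to strictly increasing sequences
# (b_i) via b_i = a_i + i - 1, so 1 <= b_1 < ... < b_r <= n + r - 1.
from itertools import combinations, combinations_with_replacement

def to_strict(a):
    return tuple(ai + i for i, ai in enumerate(a))      # 0-based: b_i = a_i + (i-1)

def to_nondecreasing(b):
    return tuple(bi - i for i, bi in enumerate(b))

n, r = 4, 3
nondec = set(combinations_with_replacement(range(1, n + 1), r))
strict = set(combinations(range(1, n + r), r))          # values 1 .. n + r - 1

assert {to_strict(a) for a in nondec} == strict
assert {to_nondecreasing(b) for b in strict} == nondec
print(len(nondec), len(strict))  # 20 20  (= C(6, 3))
```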

B.2.2 Induction

If you don’t know the enumeration trick, the most straightforward way of proceeding is
by induction on the number of dimensions of spacetime. Clearly

S^1_k = 1

and

S^2_k = k + 1

This last is obtained by noting that if n = 2 there are only two distinct indices, so that k of them can be partitioned as follows — 0 : k, 1 : (k − 1), . . . , (k − 1) : 1, k : 0 — in k + 1 distinct ways.

Now consider S^{n+1}_k. When we add one more possible value for the index to take, there could be 0, 1, . . . , k − 1 or k occurrences of this new index. That corresponds to k, k − 1, . . . , 1, or 0 slots being available for the old indices. That is

S^{n+1}_k = S^n_k + S^n_{k−1} + · · · + S^n_1 + S^n_0 = Σ_{j=0}^{k} S^n_j

This recursion relation now completely specifies the remaining S^n_k. For instance

S^3_k = Σ_{j=0}^{k} S^2_j = Σ_{j=0}^{k} (j + 1) = k(k + 1)/2 + (k + 1) = (k + 1)(k + 2)/2

S^4_k = Σ_{j=0}^{k} S^3_j = Σ_{j=0}^{k} (j + 1)(j + 2)/2 = (k + 1)(k + 2)(k + 3)/6

At this stage the pattern is clear.

To complete the job apply induction, using, for instance

Σ_{s=0}^{a−1} (b + s choose b) = Σ_{s=0}^{a−1} (b + s choose s) = (a + b choose b + 1) = (a + b choose a − 1)
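The recursion and the closed form can be cross-checked numerically (a throwaway sketch):

```python
# Check of the recursion S^{n+1}_k = sum_{j=0}^{k} S^n_j against the closed
# form S^n_k = C(n + k - 1, k), for small n and k.
from math import comb

def S_closed(n, k):
    return comb(n + k - 1, k)

for n in range(1, 6):
    for k in range(0, 6):
        recursion = sum(S_closed(n, j) for j in range(k + 1))
        assert recursion == S_closed(n + 1, k)

print(S_closed(3, 2))  # 6 = (k+1)(k+2)/2 with k = 2, matching S^3_k above
```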

B.2.3 Some simple results

Note

S^n_r ≥ n^r / r! ≥ A^n_r

A simple way of deriving S^n_r ≥ A^n_r is this: If all indices are distinct one finds (n choose r) independent components, as for the antisymmetric case. In addition (if r > 1) there are now additional nonzero components where two of the indices are identical, and distinct from the remaining r − 2 indices.

The first few of these numbers are

S^n_r → 1; n; n(n + 1)/2; n(n + 1)(n + 2)/6; n(n + 1)(n + 2)(n + 3)/24; . . .

—###—
Appendix C

Some matrix identities

C.1 Determinants, traces, and matrix logarithms

A very useful result is that for any square matrix X

ln det X = tr ln X. (C.1)

This is most easily proved for diagonal matrices of the form

D = diag[λ1 , · · · , λn ] (C.2)

since for such a matrix

det D = Π_{i=1}^{n} λ_i, (C.3)

while

ln D = diag[ln λ1 , · · · , ln λn ], (C.4)

so

tr ln D = Σ_{i=1}^{n} ln λ_i = ln [ Π_{i=1}^{n} λ_i ] = ln det D. (C.5)

Though this is not a general proof, this is enough to guess that the result is correct. To complete the proof we work in stages. (This presentation is longer than it has to be; I'm making it rather explicit in order to lead up to the general result gently.)

Having proved it for diagonal matrices, it now holds for arbitrary real symmetric ma-
trices because you can always diagonalize them using an orthogonal transformation

X = ODO T (C.6)


and then

det X = det[ODO^T] = det D; (C.7)

whereas

tr ln X = tr ln[ODO^T] (C.8)
        = tr ln[I + (ODO^T − I)] (C.9)
        = tr ln[I + (O[D − I]O^T)] (C.10)
        = tr Σ_{n=1}^{∞} (−1)^{n+1} (O[D − I]O^T)^n / n (C.11)
        = Σ_{n=1}^{∞} (−1)^{n+1} tr [ (O[D − I]O^T)^n ] / n (C.12)
        = Σ_{n=1}^{∞} (−1)^{n+1} tr [ (D − I)^n ] / n (C.13)
        = tr ln[D] (C.14)

Similarly for any Hermitian matrix X we have

X = U DU −1 (C.15)

for U a unitary matrix. Apply the same argument as previously.

Next: if all the eigenvalues of an arbitrary square matrix X are distinct (a sufficient, though not necessary, condition) then it can be diagonalized using non-singular linear transformations:
X = LDL−1 (C.16)
(The matrix L is allowed to be complex.) Apply the same argument as previously.

Finally, for a completely arbitrary matrix (possibly with degenerate eigenvalues) even
if you cannot diagonalize it you can always put it into upper (or lower) triangular form

X = LT L−1 (C.17)

where T has values only on the diagonal and above (or below), and the matrix L is again
allowed to be complex.

Exercise: Check that the previous argument holds for upper (or lower) triangular
matrices. What is
det T ? ln T ? tr ln T ? (C.18)


Exercise: For a somewhat tidier proof, look up “Jordan canonical form”. For a matrix
J in Jordan canonical form what is
det J? ln J? tr ln J? (C.19)

You can also write the general result as


det exp(Z) = exp[trZ] (C.20)
det X = exp[tr ln(X)] (C.21)
ln det X = tr ln(X) (C.22)
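The identity det exp(Z) = exp[tr Z] is easy to spot-check numerically; a throwaway 2 × 2 sketch, with the matrix exponential summed as a Taylor series (the matrix Z is invented):

```python
# Numerical spot-check of det exp(Z) = exp(tr Z) for a 2x2 matrix,
# with the matrix exponential computed from its Taylor series.
from math import exp

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(Z, terms=30):
    result = [[1.0, 0.0], [0.0, 1.0]]   # running sum, starts at the identity
    power = [[1.0, 0.0], [0.0, 1.0]]    # running power Z^n
    fact = 1.0
    for n in range(1, terms):
        power = mat_mul(power, Z)
        fact *= n
        result = [[result[i][j] + power[i][j] / fact for j in range(2)]
                  for i in range(2)]
    return result

def det2(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

Z = [[0.3, 1.1], [-0.7, 0.2]]           # any made-up 2x2 matrix
lhs = det2(mat_exp(Z))
rhs = exp(Z[0][0] + Z[1][1])            # exp(tr Z)
print(abs(lhs - rhs) < 1e-10)           # True
```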

C.2 Derivatives of determinants

Now suppose the elements of the matrix X depend on some variable z; then

d/dz [ln det X] = tr [ d/dz ln(X) ] (C.23)

that is

(1/det X) (d det X/dz) = tr [ X^{−1} dX/dz ] (C.24)

There is something to prove in this last step; as previously, start with diagonal matrices and bootstrap your way up...

Rearranging this result

d det X/dz = det(X) tr [ X^{−1} dX/dz ] (C.25)

This can also be written

d det X/dz = tr [ cof X dX/dz ] (C.26)

where cof X is the cofactor matrix defined by

cof X = det(X) X^{−1} (C.27)

You can also prove this result directly from the definition of determinant. With the convention used here (cof X is really the adjugate, the transpose of the classical matrix of cofactors), the ijth element of cof X is (−1)^{i+j} times the determinant of the (n − 1) × (n − 1) matrix obtained by deleting the jth row and ith column of the matrix X.

You could also just look it up somewhere — the important thing is to realise that such
a result exists and to know how to use it.
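The derivative-of-determinant formula (C.25) can likewise be spot-checked by finite differences; a throwaway 2 × 2 sketch (the matrix family X(z) is invented):

```python
# Finite-difference spot-check of Jacobi's formula
#   d(det X)/dz = det(X) tr( X^{-1} dX/dz )
# for a made-up 2x2 matrix family X(z).
from math import cos, sin

def X(z):
    return [[1.0 + z * z, sin(z)], [cos(z), 2.0 - z]]

def dX(z):
    # entrywise derivative of X(z)
    return [[2.0 * z, cos(z)], [-sin(z), -1.0]]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inv2(M):
    d = det2(M)
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

def trace_prod(A, B):
    """tr(A B) for 2x2 matrices."""
    return sum(A[i][k] * B[k][i] for i in range(2) for k in range(2))

z, h = 0.4, 1e-6
numeric = (det2(X(z + h)) - det2(X(z - h))) / (2 * h)
jacobi = det2(X(z)) * trace_prod(inv2(X(z)), dX(z))
print(abs(numeric - jacobi) < 1e-6)  # True
```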
Appendix D

Path ordered integrals

Path-ordered exponentials are a very convenient trick for formally solving certain matrix
differential equations. Suppose we have a differential equation of the form
dU (t)
= H(t) U (t) (D.1)
dt
where U (t) and H(t) are matrices [or more generally linear operators on some vector
space] and the matrix H(t) is generally not a constant. [So in particular H(t1 ) need not
commute with H(t2 ).]

If H(t) = H0 is a constant then we have the simple solution

U(t) = exp[H0 t] U(0) (D.2)

If H(t) is not a constant then we define the formal process of "path ordering" in terms of the exact solution U(t), which we know exists because of standard existence and uniqueness theorems. That is

U(t) = P exp [ ∫_0^t H(t') dt' ] U(0) (D.3)

That is

P exp [ ∫_0^t H(t') dt' ] = U(t) U^{−1}(0) (D.4)

If we take this as our definition of path ordering then

d/dt P exp [ ∫_0^t H(t') dt' ] = H(t) U(t) U^{−1}(0) = H(t) P exp [ ∫_0^t H(t') dt' ] (D.5)

But then by basic notions of Taylor series expansion

P exp [ ∫_0^{t+∆t} H(t') dt' ] = { I + H(t) ∆t + O[(∆t)^2] } P exp [ ∫_0^t H(t') dt' ] (D.6)
                              = exp [H(t) ∆t] P exp [ ∫_0^t H(t') dt' ] + O[(∆t)^2] (D.7)
Let’s now bootstrap this result into a general limit formula for the path ordered integral.
Split the interval (0, t) into n equal segments and evaluate H(t) at the points
i
ti = t ; i ∈ (0, n − 1) (D.8)
n
then
Z t 
0 0
P exp H(t ) dt = exp [H(tn−1 ) ∆t] exp [H(tn−2 ) ∆t] · · · (D.9)
0
 
1
· · · exp [H(t1 ) ∆t] exp [H(t0 ) ∆t] + O
n
Alternatively
Z t 
0 0
P exp H(t ) dt = lim exp [H(tn−1 ) ∆t] exp [H(tn−2 ) ∆t] · · · (D.10)
0 n→∞

· · · exp [H(t1 ) ∆t] exp [H(t0 ) ∆t] .

This limiting process should remind you of the way the Riemann integral is defined, except
of course that the H(ti ) need not commute with each other so that the order in which the
matrix exponentials are multiplied together is critically important. This is why we call it
“path ordered”.

The parameter t can be any real parameter — in differential geometry it tends to be a


parameter along a curve, sometimes an affine parameter, sometimes even arc length, but
any old parameter would do.
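The limit formula (D.10) can be tested numerically for a non-commuting H(t). In this throwaway sketch (H(t) is invented) the ordered product at moderate n is compared against a much finer partition standing in for the n → ∞ limit:

```python
# The path-ordered exponential as a limit of ordered products of ordinary
# matrix exponentials, for the made-up non-commuting family H(t) = [[0,1],[t,0]];
# note [H(t1), H(t2)] != 0 for t1 != t2, so the ordering of the factors matters.

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(Z, terms=15):
    R = [[1.0, 0.0], [0.0, 1.0]]   # running sum, starts at the identity
    P = [[1.0, 0.0], [0.0, 1.0]]   # running power Z^n
    f = 1.0
    for n in range(1, terms):
        P = mat_mul(P, Z)
        f *= n
        R = [[R[i][j] + P[i][j] / f for j in range(2)] for i in range(2)]
    return R

def H(t):
    return [[0.0, 1.0], [t, 0.0]]

def path_ordered(t, n):
    """Ordered product exp[H(t_{n-1}) dt] ... exp[H(t_0) dt]; later times on the LEFT."""
    dt = t / n
    U = [[1.0, 0.0], [0.0, 1.0]]
    for i in range(n):
        Ht = H(i * dt)
        U = mat_mul(mat_exp([[Ht[a][b] * dt for b in range(2)] for a in range(2)]), U)
    return U

fine = path_ordered(1.0, 5000)     # stand-in for the n -> infinity limit

def err(n):
    U = path_ordered(1.0, n)
    return max(abs(U[i][j] - fine[i][j]) for i in range(2) for j in range(2))

print(err(400) < err(100) < 0.1)   # True: the ordered product converges as n grows
```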

Note what happens if for some reason the H(ti ) do happen to commute with each other.
Then for instance
 
exp [H(t1 ) ∆t] exp [H(t0 ) ∆t] = exp {H( t1 ) + H(t0 )} ∆t (D.11)

which is not true unless the matrices commute. Continuing in this vein, when the matrices
do commute we have
P exp [ ∫_0^t H(t') dt' ] = lim_{n→∞} exp [ {H(t_{n−1}) + H(t_{n−2}) + · · · + H(t_1) + H(t_0)} ∆t ] . (D.12)

But now the argument of the exponential on the RHS really is the Riemann integral, so we have

P exp [ ∫_0^t H(t') dt' ] = exp [ ∫_0^t H(t') dt' ] . (D.13)

That is, the path ordered integral reduces to the ordinary integral whenever the matrices
H(t) commute with each other. (You could also derive this directly from the original
differential equation for U (t).)

In a quantum mechanical setting you are more likely to think of t as the time, and
consider the slightly different differential equation

dU (t)
= −iH(t) U (t) (D.14)
dt
where H(t) is now the Hamiltonian operator on an appropriate Hilbert space and U is
the unitary time evolution operator. Then
U(t) = T exp [ −i ∫_0^t H(t') dt' ] U(0) (D.15)

where T is the "time ordering operator" and

T exp [ −i ∫_0^t H(t') dt' ] = lim_{n→∞} exp [−iH(t_{n−1}) ∆t] exp [−iH(t_{n−2}) ∆t] · · · exp [−iH(t_1) ∆t] exp [−iH(t_0) ∆t] . (D.16)

But note that there is nothing fundamentally new here — the physicists' "time ordering" and the mathematicians' "path ordering" are fundamentally the same thing.

The other place where path ordering shows up in a physics setting is in Yang-Mills
theory when you are constructing objects such as “Wilson loops” or “Polyakov loops”. I
won’t explain them now but might get around to it later in the course.
Appendix E

Elements of the calculus of variations

The “calculus of variations” has to do with the study of integrals (defined on some suitable
set of functions) and the conditions under which the integral is “extremal”; meaning that
the value of the integral is a [local] maximum, minimum, or a “point of inflexion”.

The canonical example is to suppose we have a function L(·, ·) which itself depends on
a function x(t) and its first derivative ẋ(t) = dx(t)/dt, that is
L = L (ẋ(t), x(t)) (E.1)
Now consider the integral

S[a, b; x(t)] = ∫_a^b L (ẋ(t), x(t)) dt (E.2)

which is a functional mapping some suitable set {x(t)} of functions x(t) into the real
numbers IR. Under what conditions is this integral extremal?

To analyse this question, write


x(t) → x(t) + δx(t) (E.3)
and note that by definition

S[a, b; x(t) + δx(t)] = ∫_a^b L ( ẋ(t) + d[δx(t)]/dt , x(t) + δx(t) ) dt (E.4)

Now expand L(·, ·) as a Taylor series in its arguments, so that

S[a, b; x(t) + δx(t)] = S[a, b; x(t)]
    + ∫_a^b { [∂L (ẋ(t), x(t)) / ∂ẋ(t)] d[δx(t)]/dt + [∂L (ẋ(t), x(t)) / ∂x(t)] δx(t) } dt
    + O( [δx(t)]^2 ) (E.5)


Now integrate by parts. Then

δS[a, b; x(t)] = [ (∂L (ẋ(t), x(t)) / ∂ẋ(t)) δx(t) ]_a^b
    + ∫_a^b { − d/dt [∂L (ẋ(t), x(t)) / ∂ẋ(t)] + [∂L (ẋ(t), x(t)) / ∂x(t)] } δx(t) dt
    + O( [δx(t)]^2 ) (E.6)

Now let us restrain the set of functions {x(t)} to consist only of functions x(t) that are
fixed at the end-points a and b. That is

δx(a) = 0 = δx(b). (E.7)

For this set of functions the integral S[a, b; x(t)] is extremal (meaning δS[a, b; x(t)] = 0)
iff for arbitrary δx(t) satisfying the endpoint constraints we have
∫_a^b { − d/dt [∂L (ẋ(t), x(t)) / ∂ẋ(t)] + [∂L (ẋ(t), x(t)) / ∂x(t)] } δx(t) dt = 0. (E.8)

But this is true iff

d/dt [ ∂L (ẋ(t), x(t)) / ∂ẋ(t) ] = ∂L (ẋ(t), x(t)) / ∂x(t). (E.9)
This is called the Euler–Lagrange equation. It is the basic equation of the calculus of
variations.

There are many generalizations and applications of this equation — the geodesic equation (in the sense of shortest-distance geodesics) being one of them.
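As a concrete numerical illustration of extremality, consider the free particle with L = ẋ²/2, whose Euler–Lagrange equation gives ẍ = 0, i.e. straight lines. A throwaway finite-difference sketch showing that the discretized action of the straight-line path beats a same-endpoint perturbation:

```python
# Numerical illustration of the variational principle for the free particle,
# L = (1/2) x_dot^2: the Euler-Lagrange equation gives x_ddot = 0, and the
# straight-line path should minimize the discretized action among paths with
# the same endpoints.
from math import sin, pi

def action(path, dt):
    """Discretized S = sum (1/2) ((x_{i+1} - x_i)/dt)^2 dt."""
    return sum(0.5 * ((path[i + 1] - path[i]) / dt) ** 2 * dt
               for i in range(len(path) - 1))

N, T = 200, 1.0
dt = T / N
ts = [i * dt for i in range(N + 1)]

straight = [t for t in ts]                          # x(t) = t, from x=0 to x=1
wiggle = [t + 0.1 * sin(pi * t / T) for t in ts]    # same endpoints, perturbed

print(action(straight, dt) < action(wiggle, dt))    # True
```

Any other same-endpoint perturbation gives the same verdict; the quadratic growth of the action in δx is exactly the O([δx]²) term of (E.6).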

Exercise: Consider the special case

L (ẋ(t), x(t)) = (1/2) m (dx/dt)^2 − V (x(t)) . (E.10)

Find [and interpret] the corresponding Euler–Lagrange equation. ♦

Note: The use of t is just a label, we could just as well write

L = L (f 0 (x), f (x)) , (E.11)

and consider the integral

S[a, b; f (x)] = ∫_a^b L (f ′(x), f (x)) dx, (E.12)

which would then lead to the Euler–Lagrange equation in the form

d/dx [ ∂L (f ′(x), f (x)) / ∂f ′(x) ] = ∂L (f ′(x), f (x)) / ∂f (x). (E.13)
This f (x) form of the Euler–Lagrange equation is completely equivalent to the x(t)
form. ♦

Note: In a mechanics context, L (ẋ(t), x(t)) is typically referred to as the Lagrangian; S[a, b; x(t)] is typically referred to as the action. ♦

Exercise: Suppose L(· · ·) depends not only on the function and its first derivative, but also on second and higher derivatives up to order N . That is

L = L ( d^N x/dt^N , · · · , ẍ(t), ẋ(t), x(t) ) . (E.14)

Show that in this case [with suitable restrictions on the functions x(t) at the end-points a and b] the Euler–Lagrange equations generalize to

Σ_{n=0}^{N} (−1)^n (d^n/dt^n) [ ∂L/∂(d^n x/dt^n) ] = 0. (E.15)

Explicitly find what the restrictions on x(t) should be at the end-points a and b. ♦

Exercise: Suppose L(· · ·) depends on a quantity φ(x^b) which is a function of many variables x^a (typically space ~x and time t), and that it depends on first derivatives ∂_a φ(x^b). That is

L = L ( ∂_a φ(x^b), φ(x^b) ) . (E.16)

For simplicity assume the x^a are Cartesian coordinates in a Euclidean geometry. Then integrating over some region Ω we can write

S[Ω; φ(x^a)] = ∫_Ω L ( ∂_a φ(x^b), φ(x^b) ) d^n x (E.17)

Now go through the same sort of steps as previously. Let ∂Ω denote the boundary of Ω and show that

δS[Ω; φ(x)] = ∫_∂Ω [ ∂L( ∂_a φ(x^b), φ(x^b) ) / ∂[∂_a φ(x^b)] ] δφ(x) (unit normal)_a d^{n−1}(area)
    + ∫_Ω { − (∂/∂x^a) [ ∂L( ∂_a φ(x^b), φ(x^b) ) / ∂[∂_a φ(x^b)] ] + ∂L( ∂_a φ(x^b), φ(x^b) ) / ∂φ(x^b) } δφ(x^b) d^n x
    + O( [δφ]^2 ) . (E.18)

From this, formulate suitable restrictions on the field φ(x^b) at the boundary ∂Ω, and demonstrate that the relevant Euler–Lagrange equations are

(∂/∂x^a) [ ∂L( ∂_a φ(x^b), φ(x^b) ) / ∂[∂_a φ(x^b)] ] = ∂L( ∂_a φ(x^b), φ(x^b) ) / ∂φ(x^b) . (E.19)

The generalization to non-Cartesian coordinates on a Euclidean space is straightforward


— try it. [Since I have not yet defined integration on a general manifold we cannot at
this stage derive the equivalent result for general manifolds]. ♦

Exercise: Suppose L(· · ·) depends on the field φ(x^c), plus its first and second derivatives. Show that the Euler–Lagrange equations (for Cartesian coordinates in a Euclidean space) are now

∂L/∂φ(x^c) − (∂/∂x^a) [ ∂L/∂[∂_a φ(x^c)] ] + (∂²/∂x^a ∂x^b) [ ∂L/∂[∂_a ∂_b φ(x^c)] ] = 0. (E.20)

The generalization to higher derivatives is obvious but notationally messy. ♦

Note: The calculus of variations is a general tool that has applications in many fields; beyond the simple application to (shortest length) geodesics in the text this procedure is also used in Lagrangian mechanics and its generalizations, in classical field theories [such as say Maxwell's electromagnetism or Einstein's general relativity] where it is often the easiest way of obtaining the field equations, and also in quantum field theories where classical solutions satisfying the Euler–Lagrange equation often dominate the physics. ♦

Exercise: Generalize this discussion to arbitrary manifolds. There will have to be some minimal restrictions on the type of manifold considered. Find them, but keep the formalism as general as possible. ♦
Appendix F

Hilbert’s 19th, 20th, and 23rd


problems

In the year 1900 Professor David Hilbert gave a key-note address at the International
Congress of Mathematicians which was that year held in Paris. Hilbert’s address set
out a list of 23 problems that he thought were important — and much of 20th century
mathematics was devoted to solving about half of these problems. Dr. Maby Winton Newson translated this address into English with the author's permission for Bulletin of the American Mathematical Society 8 (1902), 437–479. A reprint appears in Mathematical Developments Arising from Hilbert Problems, edited by Felix Browder, American Mathematical Society, 1976. Various versions are also available on the internet; go to Google
and search on “Hilbert problems”. Three of the 23 problems directly involve the calculus
of variations, the 19th, 20th, and 23rd problems. Excerpts from the lecture are presented
below. Note especially that Hilbert’s 23rd problem was somewhat more open-ended than
the others ...

Mathematical Problems
Lecture delivered before the International Congress of Mathematicians
Paris 1900
By Professor David Hilbert

... lacuna ...

It is difficult and often impossible to judge the value of a problem correctly in advance;
for the final award depends upon the gain which science obtains from the problem. Nev-
ertheless we can ask whether there are general criteria which mark a good mathematical
problem. An old French mathematician said: "A mathematical theory is not to be considered complete until you have made it so clear that you can explain it to the first man


whom you meet on the street." This clearness and ease of comprehension, here insisted
on for a mathematical theory, I should still more demand for a mathematical problem if
it is to be perfect; for what is clear and easily comprehended attracts, the complicated
repels us.

Moreover a mathematical problem should be difficult in order to entice us, yet not
completely inaccessible, lest it mock at our efforts. It should be to us a guide post on the
mazy paths to hidden truths, and ultimately a reminder of our pleasure in the successful
solution.
... lacuna ...

19. Are the solutions of regular problems in the calculus of variations always
necessarily analytic?

One of the most remarkable facts in the elements of the theory of analytic functions
appears to me to be this: That there exist partial differential equations whose integrals are
all of necessity analytic functions of the independent variables, that is, in short, equations
susceptible of none but analytic solutions. The best known partial differential equations
of this kind are the potential equation

∂²f /∂x² + ∂²f /∂y² = 0

and certain linear differential equations investigated by Picard; also the equation

∂²f /∂x² + ∂²f /∂y² = e^f ,
the partial differential equation of minimal surfaces, and others. Most of these partial
differential equations have the common characteristic of being the Lagrangian differential
equations of certain problems of variation, viz., of such problems of variation
∫∫ F (p, q, z; x, y) dx dy = minimum,    [ p = ∂z/∂x , q = ∂z/∂y ],
as satisfy, for all values of the arguments which fall within the range of discussion, the
inequality
(∂²F/∂p²) · (∂²F/∂q²) − (∂²F/∂p ∂q)² > 0,
F itself being an analytic function. We shall call this sort of problem a regular variation
problem. It is chiefly the regular variation problems that play a role in geometry, in
mechanics, and in mathematical physics; and the question naturally arises, whether all
solutions of regular variation problems must necessarily be analytic functions. In other

words, does every Lagrangian partial differential equation of a regular variation problem
have the property of admitting analytic integrals exclusively? And is this the case even
when the function is constrained to assume, as, e.g., in Dirichlet’s problem on the potential
function, boundary values which are continuous, but not analytic?

I may add that there exist surfaces of constant negative gaussian curvature which are
representable by functions that are continuous and possess indeed all the derivatives, and
yet are not analytic; while on the other hand it is probable that every surface whose
gaussian curvature is constant and positive is necessarily an analytic surface. And we
know that the surfaces of positive constant curvature are most closely related to this
regular variation problem: To pass through a closed curve in space a surface of minimal
area which shall inclose, in connection with a fixed surface through the same closed curve,
a volume of given magnitude.

20. The general problem of boundary values

An important problem closely connected with the foregoing is the question concerning
the existence of solutions of partial differential equations when the values on the boundary
of the region are prescribed. This problem is solved in the main by the keen methods of H.
A. Schwarz, C. Neumann, and Poincare for the differential equation of the potential. These
methods, however, seem to be generally not capable of direct extension to the case where
along the boundary there are prescribed either the differential coefficients or any relations
between these and the values of the function. Nor can they be extended immediately to
the case where the inquiry is not for potential surfaces but, say, for surfaces of least area,
or surfaces of constant positive gaussian curvature, which are to pass through a prescribed
twisted curve or to stretch over a given ring surface. It is my conviction that it will be
possible to prove these existence theorems by means of a general principle whose nature
is indicated by Dirichlet’s principle. This general principle will then perhaps enable us
to approach the question: Has not every regular variation problem a solution, provided
certain assumptions regarding the given boundary conditions are satisfied (say that the
functions concerned in these boundary conditions are continuous and have in sections one
or more derivatives), and provided also if need be that the notion of a solution shall be
suitably extended?
... lacuna ...

23. Further development of the methods of the calculus of variations

So far, I have generally mentioned problems as definite and special as possible, in the
opinion that it is just such definite and special problems that attract us the most and from
which the most lasting influence is often exerted upon science. Nevertheless, I should like
to close with a general problem, namely with the indication of a branch of mathematics
repeatedly mentioned in this lecture-which, in spite of the considerable advancement lately
given it by Weierstrass, does not receive the general appreciation which, in my opinion,
is its due-I mean the calculus of variations.

The lack of interest in this is perhaps due in part to the need of reliable modern text
books. So much the more praiseworthy is it that A. Kneser in a very recently published
work has treated the calculus of variations from the modern points of view and with
regard to the modern demand for rigor.

The calculus of variations is, in the widest sense, the theory of the variation of functions,
and as such appears as a necessary extension of the differential and integral calculus. In
this sense, Poincare’s investigations on the problem of three bodies, for example, form a
chapter in the calculus of variations, in so far as Poincare derives from known orbits by
the principle of variation new orbits of similar character.

I add here a short justification of the general remarks upon the calculus of variations
made at the beginning of my lecture.

The simplest problem in the calculus of variations proper is known to consist in finding
a function y of a variable x such that the definite integral
J = ∫_a^b F (y_x , y; x) dx,    y_x = dy/dx
assumes a minimum value as compared with the values it takes when y is replaced by
other functions of x with the same initial and final values.

The vanishing of the first variation in the usual sense

δJ = 0

gives for the desired function y the well-known differential equation

dF_{y_x}/dx − F_y = 0,    [ F_{y_x} = ∂F/∂y_x , F_y = ∂F/∂y ] (1)

In order to investigate more closely the necessary and sufficient criteria for the occur-
rence of the required minimum, we consider the integral
J* = ∫_a^b { F (y_x , y; x) + (y_x − p) F_p } dx,    [ F = F (p, y; x), F_p = ∂F (p, y; x)/∂p ].

Now we inquire how p is to be chosen as function of x, y in order that the value of this integral J* shall be independent of the path of integration, i.e., of the choice of the function y of the variable x. The integral J* has the form

J* = ∫_a^b { A y_x − B } dx,

where A and B do not contain y_x, and the vanishing of the first variation

\[ \delta J^* = 0 \]

in the sense which the new question requires gives the equation

\[ \frac{\partial A}{\partial x} + \frac{\partial B}{\partial y} = 0, \]

i.e., we obtain for the function p of the two variables x, y the partial differential equation
of the first order

\[ \frac{\partial F_p}{\partial x} + \frac{\partial (p F_p - F)}{\partial y} = 0. \tag{1*} \]

The ordinary differential equation of the second order (1) and the partial differential
equation (1*) stand in the closest relation to each other. This relation becomes immediately
clear to us by the following simple transformation:

\begin{align*}
\delta J^* &= \int_a^b \left\{ F_y\, \delta y + F_p\, \delta p + (\delta y_x - \delta p) F_p + (y_x - p)\, \delta F_p \right\} dx \\
&= \int_a^b \left\{ F_y\, \delta y + \delta y_x\, F_p + (y_x - p)\, \delta F_p \right\} dx \\
&= \delta J + \int_a^b (y_x - p)\, \delta F_p\, dx.
\end{align*}

We derive from this, namely, the following facts: If we construct any simple family of
integral curves of the ordinary differential equation (1) of the second order and then form
an ordinary differential equation of the first order

\[ y_x = p(x, y) \tag{2} \]

which also admits these integral curves as solutions, then the function p(x, y) is always
an integral of the partial differential equation (1*) of the first order; and conversely, if
p(x, y) denotes any solution of the partial differential equation (1*) of the first order, all
the non-singular integrals of the ordinary differential equation (2) of the first order are at
the same time integrals of the differential equation (1) of the second order; or, in short, if
y_x = p(x, y) is an integral equation of the first order of the differential equation (1) of the
second order, then p(x, y) represents an integral of the partial differential equation (1*), and
conversely; the integral curves of the ordinary differential equation of the second order
are therefore, at the same time, the characteristics of the partial differential equation (1*)
of the first order.

In the present case we may find the same result by means of a simple calculation; for
this gives us the differential equations (1) and (1*) in question in the form

\[ y_{xx} F_{y_x y_x} + y_x F_{y_x y} + F_{y_x x} - F_y = 0, \tag{1} \]

\[ (p_x + p\, p_y) F_{pp} + p\, F_{py} + F_{px} - F_y = 0, \tag{1*} \]

where the lower indices indicate the partial derivatives with respect to x, y, p, y_x. The
correctness of the affirmed relation is clear from this.
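The equivalence of equation (1*) with its expanded form can also be verified symbolically, here for a sample integrand of our choosing, F = p³ + y p² + x p, with a generic field p(x, y).

```python
import sympy as sp

x, y, u = sp.symbols('x y u')
p = sp.Function('p')(x, y)      # a generic slope field p(x, y)

# Sample integrand (our illustrative choice): F(p, y; x) = p**3 + y*p**2 + x*p,
# written with the placeholder u standing for its first argument.
Fu = u**3 + y*u**2 + x*u

Fp = sp.diff(Fu, u).subs(u, p)          # F_p evaluated at p(x, y)
Fval = Fu.subs(u, p)

# Equation (1*) in its original form:
lhs = sp.diff(Fp, x) + sp.diff(p*Fp - Fval, y)

# The expanded form (p_x + p p_y) F_pp + p F_py + F_px - F_y:
expanded = ((sp.diff(p, x) + p*sp.diff(p, y))*sp.diff(Fu, u, 2).subs(u, p)
            + p*sp.diff(Fu, u, y).subs(u, p)
            + sp.diff(Fu, u, x).subs(u, p)
            - sp.diff(Fu, y).subs(u, p))

print(sp.expand(lhs - expanded))  # 0: the two forms agree identically
```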

The close relation derived before and just proved between the ordinary differential
equation (1) of the second order and the partial differential equation (1*) of the first
order is, as it seems to me, of fundamental significance for the calculus of variations. For,
from the fact that the integral J* is independent of the path of integration, it follows that

\[ \int_a^b \left\{ F(p) + (y_x - p) F_p(p) \right\} dx = \int_a^b F(\bar{y}_x)\, dx, \tag{3} \]

if we think of the left hand integral as taken along any path y and the right hand integral
along an integral curve of the differential equation

\[ \bar{y}_x = p(x, \bar{y}). \]

With the help of equation (3) we arrive at Weierstrass's formula

\[ \int_a^b F(y_x)\, dx - \int_a^b F(\bar{y}_x)\, dx = \int_a^b E(y_x, p)\, dx, \tag{4} \]

where E designates Weierstrass's expression, depending upon y_x, p, y, x,

\[ E(y_x, p) = F(y_x) - F(p) - (y_x - p) F_p(p). \]

Since, therefore, the solution depends only on finding an integral p(x, y) which is single
valued and continuous in a certain neighborhood of the integral curve which we are
considering, the developments just indicated lead immediately (without the introduction
of the second variation, but only by the application of the polar process to the differential
equation (1)) to the expression of Jacobi's condition and to the answer to the question:
how far this condition of Jacobi's, in conjunction with Weierstrass's condition E > 0, is
necessary and sufficient for the occurrence of a minimum.
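A small symbolic computation (an illustration added here, not Hilbert's) shows why the condition E > 0 is natural for a convex integrand: for the sample choice F = y_x² the excess function is a perfect square.

```python
import sympy as sp

yx, p = sp.symbols('y_x p')

def F(u):
    # Illustrative convex integrand (our choice), not tied to the text.
    return u**2

# Weierstrass's excess function E(y_x, p) = F(y_x) - F(p) - (y_x - p) F_p(p):
E = F(yx) - F(p) - (yx - p)*sp.diff(F(p), p)

# For this convex F the excess is a perfect square, hence E >= 0 everywhere:
print(sp.factor(E))
```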

The developments indicated may be transferred, without necessitating further calculation,
to the case of two or more required functions, and also to the case of a double or a
multiple integral. So, for example, in the case of a double integral

\[ J = \int F(z_x, z_y, z; x, y)\, d\omega, \qquad z_x = \frac{\partial z}{\partial x}, \quad z_y = \frac{\partial z}{\partial y}, \]

to be extended over a given region ω, the vanishing of the first variation (to be understood
in the usual sense)

\[ \delta J = 0 \]

gives the well-known differential equation of the second order

\[ \frac{d F_{z_x}}{dx} + \frac{d F_{z_y}}{dy} - F_z = 0, \qquad F_{z_x} = \frac{\partial F}{\partial z_x}, \quad F_{z_y} = \frac{\partial F}{\partial z_y}, \quad F_z = \frac{\partial F}{\partial z}, \tag{I} \]

for the required function z of x and y.

On the other hand we consider the integral

\[ J^* = \int \left\{ F + (z_x - p) F_p + (z_y - q) F_q \right\} d\omega, \qquad F = F(p, q, z; x, y), \quad F_p = \frac{\partial F}{\partial p}, \quad F_q = \frac{\partial F}{\partial q}, \]

and inquire how p and q are to be taken as functions of x, y and z in order that the value
of this integral may be independent of the choice of the surface passing through the given
closed twisted curve, i.e., of the choice of the function z of the variables x and y.

The integral J* has the form

\[ J^* = \int \{ A z_x + B z_y - C \}\, d\omega, \]

and the vanishing of the first variation

\[ \delta J^* = 0, \]

in the sense which the new formulation of the question demands, gives the equation

\[ \frac{\partial A}{\partial x} + \frac{\partial B}{\partial y} + \frac{\partial C}{\partial z} = 0, \]

i.e., we find for the functions p and q of the three variables x, y and z the differential
equation of the first order

\[ \frac{\partial F_p}{\partial x} + \frac{\partial F_q}{\partial y} + \frac{\partial (p F_p + q F_q - F)}{\partial z} = 0. \tag{I*} \]
∂x ∂y ∂z

If we add to this differential equation the partial differential equation

\[ p_y + q\, p_z = q_x + p\, q_z, \tag{I**} \]

resulting from the equations

\[ z_x = p(x, y, z), \qquad z_y = q(x, y, z), \]

then the partial differential equation (I) for the function z of the two variables x and y and the
simultaneous system of the two partial differential equations of the first order (I*), (I**) for the
two functions p and q of the three variables x, y, and z stand toward one another in a
relation exactly analogous to that in which the differential equations (1) and (1*) stood
in the case of the simple integral.
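The compatibility condition relating p and q, which follows from cross-differentiating z_x = p and z_y = q, can be checked on any concrete integral surface; here z = e^{x+y} is an illustrative choice of ours, for which p = q = z.

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

# Take the integral surface z = exp(x + y) (our illustrative choice); then
# z_x = z and z_y = z, so the fields are p(x, y, z) = z and q(x, y, z) = z.
p, q = z, z

# Compatibility condition p_y + q*p_z = q_x + p*q_z from z_x = p, z_y = q:
lhs = sp.diff(p, y) + q*sp.diff(p, z)
rhs = sp.diff(q, x) + p*sp.diff(q, z)

print(sp.simplify(lhs - rhs))  # 0
```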

It follows from the fact that the integral J* is independent of the choice of the surface
of integration z that

\[ \int \left\{ F(p, q) + (z_x - p) F_p(p, q) + (z_y - q) F_q(p, q) \right\} d\omega = \int F(\bar{z}_x, \bar{z}_y)\, d\omega, \tag{III} \]

if we think of the right hand integral as taken over an integral surface z̄ of the partial
differential equations

\[ \bar{z}_x = p(x, y, \bar{z}), \qquad \bar{z}_y = q(x, y, \bar{z}); \]

and with the help of this formula we arrive at once at the formula

\[ \int F(z_x, z_y)\, d\omega - \int F(\bar{z}_x, \bar{z}_y)\, d\omega = \int E(z_x, z_y, p, q)\, d\omega, \tag{IV} \]

\[ E(z_x, z_y, p, q) = F(z_x, z_y) - F(p, q) - (z_x - p) F_p(p, q) - (z_y - q) F_q(p, q), \]

which plays the same role for the variation of double integrals as the previously given
formula (4) for simple integrals. With the help of this formula we can now answer the
question how far Jacobi's condition, in conjunction with Weierstrass's condition E > 0, is
necessary and sufficient for the occurrence of a minimum.

Connected with these developments is the modified form in which A. Kneser,[52] beginning
from other points of view, has presented Weierstrass’s theory. While Weierstrass employed
integral curves of equation (1) which pass through a fixed point in order to derive sufficient
conditions for the extreme values, Kneser on the other hand makes use of any simple family
of such curves and constructs for every such family a solution, characteristic for that
family, of that partial differential equation which is to be considered as a generalization
of the Jacobi-Hamilton equation.

—###—

The problems mentioned are merely samples of problems, yet they will suffice to show
how rich, how manifold and how extensive the mathematical science of today is, and
the question is urged upon us whether mathematics is doomed to the fate of those other
sciences that have split up into separate branches, whose representatives scarcely under-
stand one another and whose connection becomes ever more loose. I do not believe this
nor wish it. Mathematical science is in my opinion an indivisible whole, an organism
whose vitality is conditioned upon the connection of its parts. For with all the variety
of mathematical knowledge, we are still clearly conscious of the similarity of the logical
devices, the relationship of the ideas in mathematics as a whole and the numerous analo-
gies in its different departments. We also notice that, the farther a mathematical theory
is developed, the more harmoniously and uniformly does its construction proceed, and

unsuspected relations are disclosed between hitherto separate branches of the science. So
it happens that, with the extension of mathematics, its organic character is not lost but
only manifests itself the more clearly.

But, we ask, with the extension of mathematical knowledge will it not finally become
impossible for the single investigator to embrace all departments of this knowledge? In
answer let me point out how thoroughly it is ingrained in mathematical science that every
real advance goes hand in hand with the invention of sharper tools and simpler methods
which at the same time assist in understanding earlier theories and cast aside older more
complicated developments. It is therefore possible for the individual investigator, when
he makes these sharper tools and simpler methods his own, to find his way more easily in
the various branches of mathematics than is possible in any other science.

The organic unity of mathematics is inherent in the nature of this science, for math-
ematics is the foundation of all exact knowledge of natural phenomena. That it may
completely fulfill this high mission, may the new century bring it gifted masters and
many zealous and enthusiastic disciples!

—###—

Original references

46 — Picard: Jour. de l'École Polytech., 1890.

47 — Cf. D. Hilbert: "Über das Dirichlet'sche Princip," Jahresber. d. Deutschen Math.-
Vereinigung, 8 (1900), 184-188.

50 — Text-books:
Moigno and Lindelöf, Leçons du calcul des variations, Mallet-Bachelier, Paris, 1861, and
A. Kneser, Lehrbuch der Variationsrechnung, Vieweg, Braunschweig, 1900.

51 — As an indication of the contents of this work, it may here be noted that for the
simplest problems Kneser derives sufficient conditions of the extreme even for the case that
one limit of integration is variable, and employs the envelope of a family of curves satisfying
the differential equations of the problem to prove the necessity of Jacobi’s conditions of
the extreme. Moreover, it should be noticed that Kneser applies Weierstrass’s theory also
to the inquiry for the extreme of such quantities as are defined by differential equations.

52 — Cf. Kneser’s above-mentioned textbook, §§14, 16, 19 and 20.

Other references

1996 — S. Chern: Remarks on Hilbert’s 23rd problem. Math. Intelligencer 18, no. 4, 7-8.

1976 — G. Stampacchia: Hilbert’s twenty-third problem. Extensions of the calculus of


variations. Proceedings of the Symposium in Pure Mathematics of the American Math-
ematical Society, held at Northern Illinois University 1974 (edited by F. E. Browder),
611-628.

1900 — D. Hilbert: Weiterführung der Methoden der Variationsrechnung. Akad. Wiss.
Göttingen 1900, 291-296.

Note: These problems are sufficiently open ended that there is no general agreement as
to whether or not they have been “solved”. ♦

Exercise: Do a literature survey to judge the extent to which these problems have
actually been “solved”. If you find something new and interesting, publish. ♦
Appendix G

Riemann:
On the hypotheses which underlie
the foundations of geometry

Ueber die Hypothesen, welche der Geometrie zu Grunde liegen
Bernhard Riemann
Translated by William Kingdon Clifford

[Nature, Vol. VIII. Nos. 183, 184, pp. 14–17, 36, 37.]

Transcribed by D. R. Wilkins

Plan of the Investigation.

It is known that geometry assumes, as things given, both the notion of space and the
first principles of constructions in space. She gives definitions of them which are merely
nominal, while the true determinations appear in the form of axioms. The relation of
these assumptions remains consequently in darkness; we neither perceive whether and
how far their connection is necessary, nor a priori, whether it is possible.

From Euclid to Legendre (to name the most famous of modern reforming geometers)
this darkness was cleared up neither by mathematicians nor by such philosophers as
concerned themselves with it. The reason of this is doubtless that the general notion of


multiply extended magnitudes (in which space-magnitudes are included) remained entirely
unworked. I have in the first place, therefore, set myself the task of constructing the notion
of a multiply extended magnitude out of general notions of magnitude. It will follow
from this that a multiply extended magnitude is capable of different measure-relations,
and consequently that space is only a particular case of a triply extended magnitude.
But hence flows as a necessary consequence that the propositions of geometry cannot
be derived from general notions of magnitude, but that the properties which distinguish
space from other conceivable triply extended magnitudes are only to be deduced from
experience. Thus arises the problem, to discover the simplest matters of fact from which
the measure-relations of space may be determined; a problem which from the nature of
the case is not completely determinate, since there may be several systems of matters
of fact which suffice to determine the measure-relations of space—the most important
system for our present purpose being that which Euclid has laid down as a foundation.
These matters of fact are—like all matters of fact—not necessary, but only of empirical
certainty; they are hypotheses. We may therefore investigate their probability, which
within the limits of observation is of course very great, and inquire about the justice of
their extension beyond the limits of observation, on the side both of the infinitely great
and of the infinitely small.

I. Notion of an n-ply extended magnitude.

In proceeding to attempt the solution of the first of these problems, the development
of the notion of a multiply extended magnitude, I think I may the more claim indulgent
criticism in that I am not practised in such undertakings of a philosophical nature where
the difficulty lies more in the notions themselves than in the construction; and that besides
some very short hints on the matter given by Privy Councillor Gauss in his second memoir
on Biquadratic Residues, in the Göttingen Gelehrte Anzeige, and in his Jubilee-book, and
some philosophical researches of Herbart, I could make use of no previous labours.

§ 1. Magnitude-notions are only possible where there is an antecedent general notion


which admits of different specialisations. According as there exists among these speciali-
sations a continuous path from one to another or not, they form a continuous or discrete
manifoldness; the individual specialisations are called in the first case points, in the second
case elements, of the manifoldness. Notions whose specialisations form a discrete mani-
foldness are so common that at least in the cultivated languages any things being given
it is always possible to find a notion in which they are included. (Hence mathematicians
might unhesitatingly found the theory of discrete magnitudes upon the postulate that
certain given things are to be regarded as equivalent.) On the other hand, so few and far
between are the occasions for forming notions whose specialisations make up a continuous
manifoldness, that the only simple notions whose specialisations form a multiply extended
manifoldness are the positions of perceived objects and colours. More frequent occasions
for the creation and development of these notions occur first in the higher mathematic.

Definite portions of a manifoldness, distinguished by a mark or by a boundary, are called


Quanta. Their comparison with regard to quantity is accomplished in the case of discrete
magnitudes by counting, in the case of continuous magnitudes by measuring. Measure
consists in the superposition of the magnitudes to be compared; it therefore requires a
means of using one magnitude as the standard for another. In the absence of this, two
magnitudes can only be compared when one is a part of the other; in which case also we
can only determine the more or less and not the how much. The researches which can
in this case be instituted about them form a general division of the science of magnitude
in which magnitudes are regarded not as existing independently of position and not as
expressible in terms of a unit, but as regions in a manifoldness. Such researches have
become a necessity for many parts of mathematics, e.g., for the treatment of many-valued
analytical functions; and the want of them is no doubt a chief cause why the celebrated
theorem of Abel and the achievements of Lagrange, Pfaff, Jacobi for the general theory
of differential equations, have so long remained unfruitful. Out of this general part of
the science of extended magnitude in which nothing is assumed but what is contained
in the notion of it, it will suffice for the present purpose to bring into prominence two
points; the first of which relates to the construction of the notion of a multiply extended
manifoldness, the second relates to the reduction of determinations of place in a given
manifoldness to determinations of quantity, and will make clear the true character of an
n-fold extent.

§ 2. If in the case of a notion whose specialisations form a continuous manifoldness, one


passes from a certain specialisation in a definite way to another, the specialisations passed
over form a simply extended manifoldness, whose true character is that in it a continuous
progress from a point is possible only on two sides, forwards or backwards. If one now
supposes that this manifoldness in its turn passes over into another entirely different, and
again in a definite way, namely so that each point passes over into a definite point of the
other, then all the specialisations so obtained form a doubly extended manifoldness. In
a similar manner one obtains a triply extended manifoldness, if one imagines a doubly
extended one passing over in a definite way to another entirely different; and it is easy
to see how this construction may be continued. If one regards the variable object instead
of the determinable notion of it, this construction may be described as a composition of
a variability of n + 1 dimensions out of a variability of n dimensions and a variability of
one dimension.

§ 3. I shall show how conversely one may resolve a variability whose region is given
into a variability of one dimension and a variability of fewer dimensions. To this end
let us suppose a variable piece of a manifoldness of one dimension—reckoned from a
fixed origin, that the values of it may be comparable with one another—which has for
every point of the given manifoldness a definite value, varying continuously with the
point; or, in other words, let us take a continuous function of position within the given
manifoldness, which, moreover, is not constant throughout any part of that manifoldness.

Every system of points where the function has a constant value, forms then a continuous
manifoldness of fewer dimensions than the given one. These manifoldnesses pass over
continuously into one another as the function changes; we may therefore assume that out
of one of them the others proceed, and speaking generally this may occur in such a way
that each point passes over into a definite point of the other; the cases of exception (the
study of which is important) may here be left unconsidered. Hereby the determination
of position in the given manifoldness is reduced to a determination of quantity and to a
determination of position in a manifoldness of less dimensions. It is now easy to show that
this manifoldness has n − 1 dimensions when the given manifold is n-ply extended. By
repeating then this operation n times, the determination of position in an n-ply extended
manifoldness is reduced to n determinations of quantity, and therefore the determination
of position in a given manifoldness is reduced to a finite number of determinations of
quantity when this is possible. There are manifoldnesses in which the determination
of position requires not a finite number, but either an endless series or a continuous
manifoldness of determinations of quantity. Such manifoldnesses are, for example, the
possible determinations of a function for a given region, the possible shapes of a solid
figure, &c.

II. Measure-relations of which a manifoldness of n dimensions is capable on the


assumption that lines have a length independent of position, and consequently that every
line may be measured by every other.

Having constructed the notion of a manifoldness of n dimensions, and found that its true
character consists in the property that the determination of position in it may be reduced
to n determinations of magnitude, we come to the second of the problems proposed above,
viz. the study of the measure-relations of which such a manifoldness is capable, and of
the conditions which suffice to determine them. These measure-relations can only be
studied in abstract notions of quantity, and their dependence on one another can only be
represented by formulæ. On certain assumptions, however, they are decomposable into
relations which, taken separately, are capable of geometric representation; and thus it
becomes possible to express geometrically the calculated results. In this way, to come to
solid ground, we cannot, it is true, avoid abstract considerations in our formulæ, but at
least the results of calculation may subsequently be presented in a geometric form. The
foundations of these two parts of the question are established in the celebrated memoir
of Gauss, Disquisitiones generales circa superficies curvas.

§ 1. Measure-determinations require that quantity should be independent of position,


which may happen in various ways. The hypothesis which first presents itself, and which
I shall here develop, is that according to which the length of lines is independent of their
position, and consequently every line is measurable by means of every other. Position-
fixing being reduced to quantity-fixings, and the position of a point in the n-dimensioned
manifoldness being consequently expressed by means of n variables x1 , x2 , x3 , . . . , xn , the

determination of a line comes to the giving of these quantities as functions of one variable.
The problem consists then in establishing a mathematical expression for the length of a
line, and to this end we must consider the quantities x as expressible in terms of certain
units. I shall treat this problem only under certain restrictions, and I shall confine myself
in the first place to lines in which the ratios of the increments dx of the respective variables
vary continuously. We may then conceive these lines broken up into elements, within
which the ratios of the quantities dx may be regarded as constant; and the problem is
then reduced to establishing for each point a general expression for the linear element
ds starting from that point, an expression which will thus contain the quantities x and
the quantities dx. I shall suppose, secondly, that the length of the linear element, to the
first order, is unaltered when all the points of this element undergo the same infinitesimal
displacement, which implies at the same time that if all the quantities dx are increased in
the same ratio, the linear element will vary also in the same ratio. On these suppositions,
the linear element may be any homogeneous function of the first degree of the quantities
dx, which is unchanged when we change the signs of all the dx, and in which the arbitrary
constants are continuous functions of the quantities x. To find the simplest cases, I
shall seek first an expression for manifoldnesses of n − 1 dimensions which are everywhere
equidistant from the origin of the linear element; that is, I shall seek a continuous function
of position whose values distinguish them from one another. In going outwards from the
origin, this must either increase in all directions or decrease in all directions; I assume
that it increases in all directions, and therefore has a minimum at that point. If, then,
the first and second differential coefficients of this function are finite, its first differential
must vanish, and the second differential cannot become negative; I assume that it is
always positive. This differential expression, of the second order remains constant when
ds remains constant, and increases in the duplicate ratio when the dx, and therefore also
ds, increase in the same ratio; it must therefore be ds2 multiplied by a constant, and
consequently ds is the square root of an always positive integral homogeneous function of
the second order of the quantities dx, in which the coefficients are continuous functions
of the quantities x. For Space, when the position of points is expressed by rectilinear
co-ordinates, ds = √(Σ(dx)²); Space is therefore included in this simplest case. The
next case in simplicity includes those manifoldnesses in which the line-element may be
expressed as the fourth root of a quartic differential expression. The investigation of this
more general kind would require no really different principles, but would take considerable
time and throw little new light on the theory of space, especially as the results cannot be
geometrically expressed; I restrict myself, therefore, to those manifoldnesses in which the
line element is expressed as the square root of a quadric differential expression. Such an
expression we can transform into another similar one if we substitute for the n independent
variables functions of n new independent variables. In this way, however, we cannot
transform any expression into any other; since the expression contains ½n(n+1) coefficients
which are arbitrary functions of the independent variables; now by the introduction of
new variables we can only satisfy n conditions, and therefore make no more than n of the
coefficients equal to given quantities. The remaining ½n(n−1) are then entirely determined
by the nature of the continuum to be represented, and consequently ½n(n−1) functions

of positions are required for the determination of its measure-relations. Manifoldnesses in
which, as in the Plane and in Space, the line-element may be reduced to the form √(Σ dx²),
are therefore only a particular case of the manifoldnesses to be here investigated; they
require a special name, and therefore these manifoldnesses in which the square of the
line-element may be expressed as the sum of the squares of complete differentials I will
call flat. In order now to review the true varieties of all the continua which may be
represented in the assumed form, it is necessary to get rid of difficulties arising from the
mode of representation, which is accomplished by choosing the variables in accordance
with a certain principle.
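Riemann's coefficient count above is simple arithmetic, sketched here for small n (a modern aside, not part of the text):

```python
# Riemann's count: a quadratic line element ds^2 = sum g_ij dx_i dx_j in n
# variables has n(n+1)/2 independent coefficients; a change of the n
# coordinates can normalise only n of them, leaving n(n-1)/2 functions of
# position that describe the intrinsic geometry.
for n in (2, 3, 4):
    coefficients = n*(n + 1)//2
    essential = coefficients - n          # equals n*(n - 1)//2
    print(n, coefficients, essential)     # e.g. n=4: 10 components, 6 essential
```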

§ 2. For this purpose let us imagine that from any given point the system of shortest
lines going out from it is constructed; the position of an arbitrary point may then be
determined by the initial direction of the geodesic in which it lies, and by its distance
measured along that line from the origin. It can therefore be expressed in terms of the
ratios dx0 of the quantities dx in this geodesic, and of the length s of this line. Let us
introduce now instead of the dx0 linear functions dx of them, such that the initial value
of the square of the line-element shall equal the sum of the squares of these expressions,
so that the independent variables are now the length s and the ratios of the quantities
dx. Lastly, take instead of the dx quantities x1 , x2 , x3 , . . . , xn proportional to them, but
such that the sum of their squares = s². When we introduce these quantities, the square
of the line-element is Σ dx² for infinitesimal values of the x, but the term of next order
in it is equal to a homogeneous function of the second order of the ½n(n−1) quantities
(x1 dx2 − x2 dx1 ), (x1 dx3 − x3 dx1 ) . . . an infinitesimal, therefore, of the fourth order; so
that we obtain a finite quantity on dividing this by the square of the infinitesimal triangle,
whose vertices are (0, 0, 0, . . .), (x1 , x2 , x3 , . . .), (dx1 , dx2 , dx3 , . . .). This quantity retains
the same value so long as the x and the dx are included in the same binary linear form, or so
long as the two geodesics from 0 to x and from 0 to dx remain in the same surface-element;
it depends therefore only on place and direction. It is obviously zero when the manifold
represented is flat, i.e., when the squared line-element is reducible to Σ dx², and may
therefore be regarded as the measure of the deviation of the manifoldness from flatness at
the given point in the given surface-direction. Multiplied by −¾ it becomes equal to the
quantity which Privy Councillor Gauss has called the total curvature of a surface. For
the determination of the measure-relations of a manifoldness capable of representation in
the assumed form we found that ½n(n−1) place-functions were necessary; if, therefore,
the curvature at each point in ½n(n−1) surface-directions is given, the measure-relations
of the continuum may be determined from them—provided there be no identical relations
among these values, which in fact, to speak generally, is not the case. In this way the
measure-relations of a manifoldness in which the line-element is the square root of a
quadric differential may be expressed in a manner wholly independent of the choice of
independent variables. A method entirely similar may for this purpose be applied also to
the manifoldness in which the line-element has a less simple expression, e.g., the fourth
root of a quartic differential. In this case the line-element, generally speaking, is no longer
reducible to the form of the square root of a sum of squares, and therefore the deviation

from flatness in the squared line-element is an infinitesimal of the second order, while in
those manifoldnesses it was of the fourth order. This property of the last-named continua
may thus be called flatness of the smallest parts. The most important property of these
continua for our present purpose, for whose sake alone they are here investigated, is that
the relations of the twofold ones may be geometrically represented by surfaces, and of
the morefold ones may be reduced to those of the surfaces included in them; which now
requires a short further discussion.

§ 3. In the idea of surfaces, together with the intrinsic measure-relations in which only
the length of lines on the surfaces is considered, there is always mixed up the position
of points lying out of the surface. We may, however, abstract from external relations if
we consider such deformations as leave unaltered the length of lines—i.e., if we regard
the surface as bent in any way without stretching, and treat all surfaces so related to
each other as equivalent. Thus, for example, any cylindrical or conical surface counts as
equivalent to a plane, since it may be made out of one by mere bending, in which the
intrinsic measure-relations remain, and all theorems about a plane—therefore the whole
of planimetry—retain their validity. On the other hand they count as essentially different
from the sphere, which cannot be changed into a plane without stretching. According
to our previous investigation the intrinsic measure-relations of a twofold extent in which
the line-element may be expressed as the square root of a quadric differential, which is
the case with surfaces, are characterised by the total curvature. Now this quantity in the
case of surfaces is capable of a visible interpretation, viz., it is the product of the two
curvatures of the surface, or multiplied by the area of a small geodesic triangle, it is equal
to the spherical excess of the same. The first definition assumes the proposition that the
product of the two radii of curvature is unaltered by mere bending; the second, that in
the same place the area of a small triangle is proportional to its spherical excess. To give
an intelligible meaning to the curvature of an n-fold extent at a given point and in a given
surface-direction through it, we must start from the fact that a geodesic proceeding from
a point is entirely determined when its initial direction is given. According to this we
obtain a determinate surface if we prolong all the geodesics proceeding from the given
point and lying initially in the given surface-direction; this surface has at the given point
a definite curvature, which is also the curvature of the n-fold continuum at the given point
in the given surface-direction.
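The two classical facts just quoted, that the product of the two radii of curvature is unaltered by bending and that the area of a small geodesic triangle is proportional to its spherical excess, are usually packaged today into a single formula. As an editorial gloss (not part of Riemann's text), for a small geodesic triangle of area A with interior angles α, β, γ on a surface of Gaussian curvature K:

```latex
% Gaussian curvature K = k_1 k_2, the product of the two principal curvatures,
% is unchanged by bending (Gauss's Theorema Egregium); for a small geodesic
% triangle it measures the angular ("spherical") excess per unit area:
\[
  \alpha + \beta + \gamma - \pi \;=\; K\,A .
\]
```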

§ 4. Before we make the application to space, some considerations about flat manifoldnesses
in general are necessary; i.e., about those in which the square of the line-element is
expressible as a sum of squares of complete differentials.

In a flat n-fold extent the total curvature is zero at all points in every direction; it is
sufficient, however (according to the preceding investigation), for the determination of
measure-relations, to know that at each point the curvature is zero in ½n(n − 1) independent
surface directions. Manifoldnesses whose curvature is constantly zero may be
treated as a special case of those whose curvature is constant. The common character
of those continua whose curvature is constant may be also expressed thus, that figures
may be moved in them without stretching. For clearly figures could not be arbitrarily
shifted and turned round in them if the curvature at each point were not the same in all
directions. On the other hand, however, the measure-relations of the manifoldness are
entirely determined by the curvature; they are therefore exactly the same in all directions
at one point as at another, and consequently the same constructions can be made from it:
whence it follows that in aggregates with constant curvature figures may have any arbi-
trary position given them. The measure-relations of these manifoldnesses depend only on
the value of the curvature, and in relation to the analytic expression it may be remarked
that if this value is denoted by α, the expression for the line-element may be written
1
qP
dx2 .
1 + 14 α x2
P

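As an editorial consistency check, not part of Riemann's text, one can verify the constancy of the curvature directly. In two dimensions the metric above is conformal, ds² = e^{2φ}(dx² + dy²) with e^{φ} = 1/(1 + ¼α(x² + y²)), and the Gaussian curvature of such a metric is K = −e^{−2φ}Δφ. A short SymPy computation (assuming SymPy is available; the variable names are our own) confirms K = α everywhere:

```python
# Editorial aside: verify that the 2-d line-element
#   ds^2 = (1 + (alpha/4)(x^2 + y^2))^(-2) (dx^2 + dy^2)
# has constant Gaussian curvature alpha.  For a conformal metric
#   ds^2 = exp(2*phi) (dx^2 + dy^2)
# the Gaussian curvature is K = -exp(-2*phi) * (phi_xx + phi_yy).
import sympy as sp

x, y, alpha = sp.symbols('x y alpha', real=True)
u = 1 + sp.Rational(1, 4) * alpha * (x**2 + y**2)
phi = -sp.log(u)                          # conformal factor: exp(phi) = 1/u
lap_phi = sp.diff(phi, x, 2) + sp.diff(phi, y, 2)
K = sp.simplify(-sp.exp(-2 * phi) * lap_phi)
print(K)    # simplifies to alpha, independent of x and y: constant curvature
```

The same computation in n dimensions (with the appropriate conformal-curvature formula) gives constant sectional curvature α, as Riemann asserts.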
§ 5. The theory of surfaces of constant curvature will serve for a geometric illustration.
It is easy to see that a surface whose curvature is positive may always be rolled on a sphere
whose radius is unity divided by the square root of the curvature; but to review the
entire manifoldness of these surfaces, let one of them have the form of a sphere and the
rest the form of surfaces of revolution touching it at the equator. The surfaces with
greater curvature than this sphere will then touch the sphere internally, and take a form
like the outer portion (from the axis) of the surface of a ring; they may be rolled upon
zones of spheres having new radii, but will go round more than once. The surfaces with
less positive curvature are obtained from spheres of larger radii, by cutting out the lune
bounded by two great half-circles and bringing the section-lines together. The surface
with curvature zero will be a cylinder standing on the equator; the surfaces with negative
curvature will touch the cylinder externally and be formed like the inner portion (towards
the axis) of the surface of a ring. If we regard these surfaces as locus in quo for surface-
regions moving in them, as Space is locus in quo for bodies, the surface-regions can be
moved in all these surfaces without stretching. The surfaces with positive curvature can
always be so formed that surface-regions may also be moved arbitrarily about upon them
without bending, namely (they may be formed) into sphere-surfaces; but not those with
negative curvature. Besides this independence of surface-regions from position there is in
surfaces of zero curvature also an independence of direction from position, which in the
former surfaces does not exist.
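The rule quoted at the start of § 5, that a surface of positive curvature rolls on a sphere of radius "unity divided by the square root of the curvature", reads in modern notation (an editorial gloss):

```latex
% A sphere of radius R has constant Gaussian curvature
\[
  K = \frac{1}{R^2}, \qquad\text{equivalently}\qquad R = \frac{1}{\sqrt{K}} .
\]
```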

III. Application to Space.

§ 1. By means of these inquiries into the determination of the measure-relations of an
n-fold extent the conditions may be declared which are necessary and sufficient to determine
the metric properties of space, if we assume the independence of line-length from position
and expressibility of the line-element as the square root of a quadric differential, that is
to say, flatness in the smallest parts.

First, they may be expressed thus: that the curvature at each point is zero in three
surface-directions; and thence the metric properties of space are determined if the sum of
the angles of a triangle is always equal to two right angles.

Secondly, if we assume with Euclid not merely an existence of lines independent of
position, but of bodies also, it follows that the curvature is everywhere constant; and
then the sum of the angles is determined in all triangles when it is known in one.

Thirdly, one might, instead of taking the length of lines to be independent of position
and direction, assume also an independence of their length and direction from position.
According to this conception changes or differences of position are complex magnitudes
expressible in three independent units.

§ 2. In the course of our previous inquiries, we first distinguished between the relations of
extension or partition and the relations of measure, and found that with the same extensive
properties, different measure-relations were conceivable; we then investigated the system
of simple size-fixings by which the measure-relations of space are completely determined,
and of which all propositions about them are a necessary consequence; it remains to discuss
the question how, in what degree, and to what extent these assumptions are borne out
by experience. In this respect there is a real distinction between mere extensive relations,
and measure-relations; in so far as in the former, where the possible cases form a discrete
manifoldness, the declarations of experience are indeed not quite certain, but still not
inaccurate; while in the latter, where the possible cases form a continuous manifoldness,
every determination from experience remains always inaccurate: be the probability ever
so great that it is nearly exact. This consideration becomes important in the extensions of
these empirical determinations beyond the limits of observation to the infinitely great and
infinitely small; since the latter may clearly become more inaccurate beyond the limits of
observation, but not the former.

In the extension of space-construction to the infinitely great, we must distinguish between
unboundedness and infinite extent, the former belongs to the extent relations, the
latter to the measure-relations. That space is an unbounded three-fold manifoldness, is
an assumption which is developed by every conception of the outer world; according to
which every instant the region of real perception is completed and the possible positions
of a sought object are constructed, and which by these applications is for ever confirm-
ing itself. The unboundedness of space possesses in this way a greater empirical certainty
than any external experience. But its infinite extent by no means follows from this; on the
other hand if we assume independence of bodies from position, and therefore ascribe to
space constant curvature, it must necessarily be finite provided this curvature has ever so
small a positive value. If we prolong all the geodesics starting in a given surface-element,
we should obtain an unbounded surface of constant curvature, i.e., a surface which in a
flat manifoldness of three dimensions would take the form of a sphere, and consequently
be finite.
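Riemann's qualitative conclusion can be made quantitative. As an editorial aside (not in the original text), a three-dimensional space of constant positive curvature k is, in the simply connected case, a 3-sphere of radius R = 1/√k, whose total volume is finite:

```latex
% Total volume of the 3-sphere of radius R = 1/sqrt(k):
\[
  V = 2\pi^2 R^3 = \frac{2\pi^2}{k^{3/2}} \;<\; \infty .
\]
```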

§ 3. The questions about the infinitely great are for the interpretation of nature useless
questions. But this is not the case with the questions about the infinitely small. It
is upon the exactness with which we follow phenomena into the infinitely small that our
knowledge of their causal relations essentially depends. The progress of recent centuries in
the knowledge of mechanics depends almost entirely on the exactness of the construction
which has become possible through the invention of the infinitesimal calculus, and through
the simple principles discovered by Archimedes, Galileo, and Newton, and used by modern
physic. But in the natural sciences which are still in want of simple principles for such
constructions, we seek to discover the causal relations by following the phenomena into
great minuteness, so far as the microscope permits. Questions about the measure-relations
of space in the infinitely small are not therefore superfluous questions.

If we suppose that bodies exist independently of position, the curvature is everywhere
constant, and it then results from astronomical measurements that it cannot be different
from zero; or at any rate its reciprocal must be an area in comparison with which the
range of our telescopes may be neglected. But if this independence of bodies from position
does not exist, we cannot draw conclusions from metric relations of the great, to those of
the infinitely small; in that case the curvature at each point may have an arbitrary value
in three directions, provided that the total curvature of every measurable portion of space
does not differ sensibly from zero. Still more complicated relations may exist if we no
longer suppose the linear element expressible as the square root of a quadric differential.
Now it seems that the empirical notions on which the metrical determinations of space
are founded, the notion of a solid body and of a ray of light, cease to be valid for the
infinitely small. We are therefore quite at liberty to suppose that the metric relations of
space in the infinitely small do not conform to the hypotheses of geometry; and we ought
in fact to suppose it, if we can thereby obtain a simpler explanation of phenomena.

The question of the validity of the hypotheses of geometry in the infinitely small is
bound up with the question of the ground of the metric relations of space. In this last
question, which we may still regard as belonging to the doctrine of space, is found the
application of the remark made above; that in a discrete manifoldness, the ground of
its metric relations is given in the notion of it, while in a continuous manifoldness, this
ground must come from outside. Either therefore the reality which underlies space must
form a discrete manifoldness, or we must seek the ground of its metric relations outside
it, in binding forces which act upon it.

The answer to these questions can only be got by starting from the conception of
phenomena which has hitherto been justified by experience, and which Newton assumed
as a foundation, and by making in this conception the successive changes required by facts
which it cannot explain. Researches starting from general notions, like the investigation
we have just made, can only be useful in preventing this work from being hampered by
too narrow views, and progress in knowledge of the interdependence of things from being
checked by traditional prejudices.

This leads us into the domain of another science, of physic, into which the object of this
work does not allow us to go to-day.

Synopsis.

Plan of the Inquiry:


I. Notion of an n-ply extended magnitude.

§ 1. Continuous and discrete manifoldnesses. Defined parts of a manifoldness are
called Quanta. Division of the theory of continuous magnitude into the theories,
(1) Of mere region-relations, in which an independence of magnitudes from
position is not assumed;
(2) Of size-relations, in which such an independence must be assumed.
§ 2. Construction of the notion of a one-fold, two-fold, n-fold extended magnitude.
§ 3. Reduction of place-fixing in a given manifoldness to quantity-fixings. True
character of an n-fold extended magnitude.

II. Measure-relations of which a manifoldness of n-dimensions is capable on the assumption
that lines have a length independent of position, and consequently that every
line may be measured by every other.

§ 1. Expression for the line-element. Manifoldnesses to be called Flat in which the
line-element is expressible as the square root of a sum of squares of complete
differentials.
§ 2. Investigation of the manifoldness of n-dimensions in which the line element
may be represented as the square root of a quadric differential. Measure of its
deviation from flatness (curvature) at a given point in a given surface-direction.
For the determination of its measure-relations it is allowable and sufficient that
the curvature be arbitrarily given at every point in ½n(n − 1) surface directions.
§ 3. Geometric illustration.
§ 4. Flat manifoldnesses (in which the curvature is everywhere = 0) may be treated
as a special case of manifoldnesses with constant curvature. These can also be
defined as admitting an independence of n-fold extents in them from position
(possibility of motion without stretching).
§ 5. Surfaces with constant curvature.
III. Application to Space.

§ 1. System of facts which suffice to determine the measure-relations of space
assumed in geometry.
§ 2. How far is the validity of these empirical determinations probable beyond the
limits of observation towards the infinitely great?
§ 3. How far towards the infinitely small? Connection of this question with the
interpretation of nature.

—###—
Bibliography

[1] R. D’Inverno, Introducing Einstein’s Relativity, (Oxford University Press, Oxford, England).

Some books on differential geometry:


[Often with a lot of useful background information.]

• Differential topology with a view to applications
DRJ Chillingworth
Research notes in Mathematics #9
Pitman Publishing, London, 1976.
• The geometry of physics
Theodore Frankel
Cambridge University Press, 1997.
• Curvature and Homology
Samuel I. Goldberg
Dover [1962] 1982.
• Geometrical methods of mathematical physics
B Schutz
Cambridge University Press, 1980.
• Lecture notes on elementary topology and geometry
I.M. Singer and J.A. Thorpe
Springer–Verlag, 1967.
• Topology
Solomon Lefschetz
Chelsea, New York, 1930.
• A comprehensive introduction to differential geometry
[in five volumes]
M. Spivak
Publish or Perish, 1979.

“Old-style” books on differential geometry:


• Ricci Calculus
J.A. Schouten
Springer–Verlag, 1954.
• Non–Riemannian geometry
L.P. Eisenhart
American Mathematical Society, 1927.
• Applications of tensor analysis
[formerly: Applications of the absolute differential calculus]
A.J. McConnell
Dover, 1957 [1931].

Some books on general relativity:


[Often with a lot of differential geometry in the discussion.]

• Introducing Einstein’s Relativity
Ray D’Inverno
Oxford University Press, reprinted 2000.
Cost US$50.00; I’d guess about NZ$100.00
• General relativity for mathematicians
R.K. Sachs and H. Wu
Springer graduate texts in Mathematics #48
(out of print)
• A first course in general relativity.
B. F. Schutz
Cambridge University Press, 1985, US$40.00; NZ$100?
• Gravity: An introduction to Einstein’s General Relativity.
J. B. Hartle
Addison–Wesley, December 2002; US$56.
• Gravitation
Misner, Thorne, and Wheeler
[“the phone book”]
Freeman, San Francisco, 1973.
• General Relativity
R.M. Wald
University of Chicago Press, 1984.
• General Relativity: an Einstein centenary survey
Edited by S.W. Hawking and W. Israel
Cambridge University Press, 1979.
• 300 years of gravitation
Edited by S.W. Hawking and W. Israel
Cambridge University Press, 1989.
• Principles of physical cosmology
P.J.E. Peebles
Princeton series in physics
Princeton University press, 1993.
• Gravitation and Cosmology
Steven Weinberg
Wiley, 1972.
• Tensor calculus and relativity
D.F. Lawden
Methuen, 1967.

Specialized items:
[With lots of differential geometry in the discussion.]

• Lorentzian Wormholes: from Einstein to Hawking
Matt Visser
American Institute of Physics Press, 1995
(Now Springer–Verlag).
• Artificial Black Holes
Edited by Mario Novello, Matt Visser, and Grigori Volovik.
World Scientific, Singapore, 2002.

—###—
