Visser - Notes On Differential Geometry, Victoria University of Wellington, New Zealand - 2004
Math 464:
Notes on differential geometry
Matt Visser
School of Mathematical and Computing Sciences,
Victoria University of Wellington,
New Zealand.
E-mail: [email protected]
URL: http://www.mcs.vuw.ac.nz/~visser
Warning:
These notes are provided as a supplement to the lectures.
They are not a substitute for attending the lectures.
There are still a few rough edges:
If you find errors, typos, and/or obscurities, let me know.
Contents
1 Introduction 6
1.1 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Textbook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3 Supplementary texts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.4 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2 Fundamentals 9
2.1 Elementary topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2 The “usual topology” on IR . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3 Hausdorff topologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4 “Point doubling” topologies . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.5 “Train track” topologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.6 Topological closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.7 Compactness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.8 Elementary metric spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.9 Locally Euclidean spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.10 Charts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.11 Atlases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.12 Manifolds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.13 Differentiable Manifolds . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.14 Connected sum: M1 #M2 . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.15 “Modding out” . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.16 Tangent vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.17 Covectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.18 Dual spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.19 Tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.20 Tensor components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.21 Tensor fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.22 Fibre bundles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.23 Final comments on fundamentals . . . . . . . . . . . . . . . . . . . . . . . 39
Math 464: Differential Geometry 3
3 Affine Connexions 40
3.1 Partial derivatives are not tensors . . . . . . . . . . . . . . . . . . . . . . . 40
3.2 Parallel transport . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.2.1 Basic idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.2.2 Connexion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.2.3 Covariantly constant vector fields . . . . . . . . . . . . . . . . . . . 45
3.2.4 Covariantly auto-parallel vector fields . . . . . . . . . . . . . . . . . 45
3.2.5 Path ordering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.3 Covariant derivative: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.3.1 Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.3.2 Covectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.3.3 Tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.4 Geodesics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.5 Torsion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.5.1 Geometrical interpretation 1 . . . . . . . . . . . . . . . . . . . . . . 54
3.5.2 Geometrical interpretation 2 . . . . . . . . . . . . . . . . . . . . . . 55
3.5.3 Commutator acting on a scalar . . . . . . . . . . . . . . . . . . . . 56
3.6 Riemann curvature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.7 General commutator identities [Ricci identities] . . . . . . . . . . . . . . . . 59
3.8 Basic curvature identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.8.1 Anti-symmetries . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.8.2 Antisymmetric part of the Ricci tensor . . . . . . . . . . . . . . . . 61
3.8.3 Nonmetricity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.9 Weitzenböck identities [generalized Bianchi identities] . . . . . . . . . . . . 64
3.10 Integrability [geometric route] . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.10.1 Basic definitions and results . . . . . . . . . . . . . . . . . . . . . . 66
3.10.2 Zero curvature implies integrability . . . . . . . . . . . . . . . . . . 68
3.10.3 Affine flat connexions . . . . . . . . . . . . . . . . . . . . . . . . . . 71
3.11 Integrability [PDE route] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.11.1 Frobenius–Mayer systems . . . . . . . . . . . . . . . . . . . . . . . 74
3.11.2 Frobenius complete integrability theorem: . . . . . . . . . . . . . . 75
3.11.3 From auto-parallel to Frobenius–Mayer . . . . . . . . . . . . . . . . 76
3.12 n-beins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3.13 Weitzenböck connexion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
3.14 Deforming general connexions . . . . . . . . . . . . . . . . . . . . . . . . . 82
3.15 Preserving auto-parallelism . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
3.16 Preserving geodesics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
3.17 Decomposing the general connexion . . . . . . . . . . . . . . . . . . . . . . 87
3.18 Locally geodesic coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . 88
3.19 Riemann normal coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4 Symmetric Connexions 92
4.1 Semi-symmetric connexions . . . . . . . . . . . . . . . . . . . . . . . . . . 92
4.2 Symmetric connexions: Identities . . . . . . . . . . . . . . . . . . . . . . . 93
4.3 Locally geodesic coordinates for symmetric connexions . . . . . . . . . . . . 95
4.4 Weyl symmetric connexion . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
4.5 Weyl tensor for symmetric connexions . . . . . . . . . . . . . . . . . . . . . 96
4.6 Projective connexions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
11 Gauge-fields 164
11.1 Connexions on vector bundles . . . . . . . . . . . . . . . . . . . . . . . . . 164
11.2 Abelian gauge fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
11.3 non-Abelian gauge fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
12 Coda 169
A Notation 170
Bibliography 208
Chapter 1
Introduction
1.1 Outline
In this course I will present an overview of differential geometry, also known as the theory of manifolds (sometimes loosely known as non-Euclidean geometry or Riemannian geometry, but those are actually more specialized topics).
Be aware that these are notes, not a textbook, and I make no pretense or claim of
completeness...
— Fibre bundles.
• I will discuss general affine connexions, with both non-metricity and torsion. Do-
ing so complicates parts of the discussion where the metric-tensor-based Christoffel
connexion would lead to considerably simpler results. (For instance there are extra
terms in the differential Bianchi identities, which in this context should more properly be called the Weitzenböck identities.) The reason I go to this trouble is because
torsion and nonmetricity are playing an increasing role in modern research.
• On the other hand, I will not eschew “coordinate-free” methods completely. One
place where such methods are clearly advantageous is in the theory of differential
forms and the exterior derivative.
• The presentation is very much in the spirit of applied mathematics (and here my
training as a physicist shows through); rigour will be subservient to getting the job
done. If rigour is needed, it will be used. If correct results can be obtained through
less rigorous mental pictures that’s fine — we can always pick up an appropriately
rigorous textbook to flesh out the arguments.
• Somewhat unusually, I will take the parallel transport operator on the manifold as
the primitive notion; and then use it to define the connexion and so the covariant
derivative.
• I also show how to reverse the procedure, starting from the connexion the paral-
lel transport operator can be constructed as a “path-ordered” exponential integral.
Path-ordering is a technique that is extremely useful, but is currently almost com-
pletely confined to the theoretical physics community (and more specifically to the
particle physics community).
• Whenever I need an example it will almost always come from within physics — for
example interpreting (re-interpreting) Newton’s second law in terms of the geodesics
of a conformally flat space.
• For cultural reasons I’ve added two historical appendices — one on Bernhard Rie-
mann’s inaugural lecture wherein he basically set out his programme for developing
differential geometry, and another on some of the Hilbert problems (those to do
with the calculus of variations, which is a very useful general purpose tool of great
importance in applications of differential geometry).
1.2 Textbook
• Introducing Einstein’s Relativity
Ray D’Inverno
Oxford University Press, reprinted 2000.
Cost US$50.00; about NZ$100.00
The Math Dept has some copies available for loan; see me.
This will be the only book you will need any access to.
1.4 Background
This course is an introduction, so we will pretty much be sticking with classical mathe-
matical tools — much of the mathematics I will develop would make sense to a late 19th
century mathematician. (Though some bits of notation, and some of the applications I’ll
mention, would be a bit of a surprise.)
Chapter 2
Fundamentals
Differential geometry was originally developed to handle very practical problems in large-
scale surveying. For small parcels of land the “flat-Earth approximation” is perfectly
adequate (except in Wellington). For large parcels of land the curvature of the Earth
does have an effect on surveying. The angles of the surveyor’s triangles do not add up to
π radians.
Similarly, for long-distance navigating, you had better be prepared to understand spher-
ical trigonometry.
Carl Friedrich Gauss is generally credited with the first steps towards what is now called
differential geometry. He was specifically interested in 2-dimensional curved surfaces
embedded in flat Euclidean 3-space, but the notions generalize — eventually it’s better
to discard the idea of an embedding space completely.
We will develop these notions in stages, using the flat-Earth approximation as our guide.
(This chapter is a little more abstract than strictly needed for standard general relativ-
ity; but if you ever plan on looking at a paper on “braneworlds” or “string-inspired GR”
this vocabulary will be both useful and necessary.)
Quiz: A hunter leaves camp and walks one mile south. He then spots a bear, and walks
one mile east, and then shoots the bear. After that he walks one mile north and finds he
is back at his camp. What colour was the bear? Where is his campsite? ♦
Quiz: A geologist leaves camp and walks one mile south. She then spots a likely location, and walks one mile east to her sampling site. After a bit of digging, she encounters
bedrock and extracts a rock sample. After that she walks one mile north and finds she is
back at her camp. What did she have to dig through to find her rock sample? Where is
her campsite? ♦
Quiz: A navigator directs his ship to follow a “rhumb line”. Ignoring trivia such as the
continents, islands, and rocks that might get in the way, where will his ship eventually
go? ♦
In this brief section I want to present just enough abstract topology for you to understand
some of the technical issues in setting up the definition of manifold.
Note: You can get a lot of reference material from Google; it’s generally pretty good
and reasonably reliable. ♦
• Topological spaces: Good enough to define “continuity”, but it may not even make
sense to ask questions about “distance” and/or “differentiability”.
• Metric spaces: Good enough to define a notion of distance. All metric spaces
have a natural topology, but it may not even make sense to ask questions about
“differentiability”.
• ∅∈T;
• E ∈T.
Remarkably enough, this very abstract definition is sufficient to capture the essential
notion of “continuity”.
Definition 2 Suppose a function f maps one topological space (E1 , T1 ) into another (E2 , T2 ).
Then the function is continuous iff [if and only if ] whenever U ∈ T2 we have f −1 (U ) ∈ T1 .
That is: “continuous” iff “the inverse image of every open set is open”.
f −1 (U ) ≡ {x ∈ E1 : f (x) ∈ U }, (2.1)
we are not [at this stage] assuming the inverse function exists as a function. This is simply the set of points in E1 which map into U in E2 . ♦
Note: This is a definition; it is meaningless to try to prove it. You can however, seek to
justify it by verifying that it makes sense in situations you are already familiar with. ♦
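For finite topological spaces this definition can be checked mechanically. Here is a minimal Python sketch (the function names are my own, not from these notes) that tests continuity by verifying that the inverse image of every open set is open, using the two-point Sierpiński space as a toy example:

```python
def preimage(f, U, domain):
    """Inverse image f^{-1}(U): the points of the domain that f maps into U."""
    return frozenset(x for x in domain if f[x] in U)

def is_continuous(f, domain, T1, T2):
    """Continuous iff the inverse image of every open set is open."""
    return all(preimage(f, U, domain) in T1 for U in T2)

# Sierpinski space: E = {0, 1} with open sets {}, {1}, {0, 1}.
E = frozenset({0, 1})
T = {frozenset(), frozenset({1}), E}

ident = {0: 0, 1: 1}   # the identity map
swap = {0: 1, 1: 0}    # swap the two points

print(is_continuous(ident, E, T, T))  # True
print(is_continuous(swap, E, T, T))   # False: preimage of {1} is {0}, not open
```

Note that no notion of distance or of real numbers enters anywhere; only the open-set structure is used.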
• That is, “open intervals” are subsets of IR of the form {x : a < x < b}.
• Now consider the “set closure” of the collection of “open intervals” under finite in-
tersection and arbitrary union.
Sometimes this “set closure” is called the “algebraic closure” — we want to con-
sider the class of objects that can be constructed by arbitrary unions and finite
intersections of the collection of “open intervals” (a, b).
• This set closure is a topology on IR, called the “usual topology”.
In the “usual topology” everything is as expected from previous courses you might have
taken.
Note the set of “open intervals” itself does not form a topology. It is however a “sub-basis” for a topology. (Meaning that after you invoke set closure under arbitrary unions and finite intersections, you then have a topology.)
Comment: I’m being a little loose with the technical definition of “basis of a topology”.
If you want to know precise details, consult textbooks on general topology. Brief sketch
of details below. ♦
Comment: The “usual topology” is not the only topology you can put on IR. There
are other perverse topologies you can place on IR [or any other set]. ♦
• ∅ ∈ B.
• ∪B∈B B = E.
Exercise: Show that any arbitrary collection A of subsets of E can be used as a sub-basis
for some topology on E.
Exercise: Show that the set A of all open intervals (a, b) on the reals is closed under
finite intersection. Consequently, if we treat this as a sub-basis, then the basis B it gener-
ates is itself: B = A. By definition the topology TB defined by this basis is the usual one. ♦
Exercise: Suppose we are not dealing with the reals. Show that you can do similar
things for any set that is ordered by a notion of “greater”, “lesser”, or “equal”. [Techni-
cally consider a “total order”, look up the definition if necessary.] ♦
Exercise: Look up the definition for a “partial order”, or “poset”, and repeat the con-
struction of sub-basis → basis → topology. ♦
• Indiscrete topology: the only open sets are ∅ and E; T = {∅, E}.
Exercise: If (E1 , T1 ) has its discrete topology then show that all functions f : E1 → E2
are continuous. Note I don’t have to say anything about the topology on E2 . ♦
Exercise: If (E2 , T2 ) has its indiscrete topology then show that all functions f : E1 → E2
are continuous. Note I don’t have to say anything about the topology on E1 . ♦
This means that the discrete and indiscrete topologies are close to useless [except as a
source of interesting counterexamples].
Topology and continuity are very primitive [very basic] concepts that make sense
even if there are no real [or complex] numbers anywhere in the problem.
Definition 6 A topology is “Hausdorff ” iff for any two distinct points x1 and x2 in E
there exist open sets U1 and U2 in T such that:
x1 ∈ U1 ;  x2 ∈ U2 ;  U1 ∩ U2 = ∅. (2.4)
Colloquially: A topology is Hausdorff iff distinct points can be “housed off” from each
other using open sets.
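For a finite topology the Hausdorff condition can be checked by brute force. A small Python sketch (names of my own choosing) comparing the discrete topology with the Sierpiński topology {∅, {1}, E}:

```python
from itertools import combinations

def is_hausdorff(E, T):
    """Hausdorff iff every pair of distinct points has disjoint open neighbourhoods."""
    return all(
        any(x in U and y in V and not (U & V) for U in T for V in T)
        for x, y in combinations(E, 2)
    )

E = frozenset({0, 1})
discrete = {frozenset(), frozenset({0}), frozenset({1}), E}
sierpinski = {frozenset(), frozenset({1}), E}

print(is_hausdorff(E, discrete))    # True: {0} and {1} house the points off
print(is_hausdorff(E, sierpinski))  # False: the only open set containing 0 is E
```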
Examples:
Define a set E1 by considering the ordinary real line IR. Remove the point 0, and replace
it by two points 01 and 02 .
1. Any open set on IR that does not contain 0 is also an open set in E1 .
Now consider the topology formed by the set closure of this basis under finite intersection
and arbitrary union.
In this topology, every open set containing 01 has a non-empty intersection with every
open set containing 02 . Thus this topology fails to be Hausdorff for the pair of points
{01 , 02 }. This is an elementary example of “point doubling”.
Define a set E2 by considering the ordinary real line IR. Remove the half-closed interval
I = [0, +∞), and replace it by two copies I1 = [01 , +∞1 ) and I2 = [02 , +∞2 ).
Now consider the topology formed by the set closure of this basis under finite intersection
and arbitrary union.
This topology fails to be Hausdorff for the pair of points {01 , 02 }. Because of its explicit
construction in terms of the real line, this set can be given a “locally Euclidean” structure
in the obvious fashion (see below). This is an elementary example of a one-dimensional
manifold that “branches” into two.
Exercise: Since this “line splitting” topology is “locally Euclidean” you might be
tempted to try setting up a theory of ordinary differential equations on this space, similar
to that for ODEs on IR. Think a little about what might happen to an ODE when it hits
the branch point. ♦
We went through this discussion so that you could see why it might be a good idea to
exclude these types of behaviour from further consideration.
Definition 9 A topological space is “connected” iff the only clopen sets are ∅ and E.
Lemma 1 A topological space is connected iff it is not the union of two disjoint nonempty
open sets.
Definition 10 The “closure” cl(X) of a set X is the smallest closed set that contains X,
more formally E − cl(X) is the union of all open sets that lie completely in E − X.
cl(X) = E − ∪ {U : U ∈ T and U ⊆ E − X} . (2.5)
Many analysts strongly condemn such usage as potentially confusing; but it is almost
universal in the physics literature. ♦
Definition 11 The “interior” int(X) of a set X is the biggest open set contained in X,
more formally int(X) is the union of all open sets that lie completely in X.
int(X) = ∪ {U : U ∈ T and U ⊆ X} . (2.7)
2.7 Compactness
You will often see the word “compact” flying around in the discussion. The technical
definition is this:
Definition 13 A set U ⊆ E is compact with respect to the topology (E, T ) iff every open
cover of U has a finite refinement.
That is: suppose we have any collection {Oi : i ∈ I} such that ∀i Oi ∈ T and U ⊆ ∪i∈I Oi .
Then U is compact iff for any such collection there exists a finite sub-collection {Oj : j ∈ J; #(J) < ∞} which does the same job: U ⊆ ∪j∈J Oj .
That is, “old style” Heine–Borel was limited to statements concerning IR n , and the defi-
nition of compactness and the statement of the theorem were interchanged.
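As a concrete illustration (a standard example, not from these notes), the half-open interval [0, 1) ⊂ IR fails to be compact in the usual topology:

```latex
% U = [0,1) \subset \mathbb{R} is not compact. Consider the open cover
O_n = \left(-1,\; 1 - \tfrac{1}{n}\right), \qquad n = 2, 3, 4, \ldots
% Every x \in [0,1) satisfies x < 1 - 1/n for n large enough, so the O_n cover U.
% But any finite sub-collection has a largest index N, and
\bigcup_{n \le N} O_n = \left(-1,\; 1 - \tfrac{1}{N}\right),
% which misses the points of [1 - 1/N,\, 1). Hence no finite sub-collection
% suffices, and U is not compact. (In Heine--Borel language: U is bounded
% but not closed.)
```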
Another word you will sometimes encounter is “paracompactness”; for our purposes this
represents a highly technical issue that almost never impinges on the things we want to
do.
Definition 15 A set U ⊆ E is paracompact with respect to the topology (E, T ) iff every
open cover of U has a locally finite refinement.
That is, for each x ∈ E there exists a U ∈ T containing x such that the original open
cover of E, restricted by intersection with U , has a finite refinement that covers U .
Warning: Some authors prefer to make “paracompactness” part of the technical defi-
nition of a manifold; in preference to “second countability” defined below.
The constructions are largely equivalent — see for instance appendix A of Wald, “General Relativity”. ♦
Warning: The metric defined here is subtly different from the “metric tensor” to be
introduced later.
Worse, the “physicist’s metric” in special relativity [SR] and general relativity [GR] is
not a metric in this sense. The SR/GR metric should really be called a “pseudo-metric”.
More on this next semester. Unfortunately the physics usage is well-established and unlikely to change. Fortunately, most of the people working in the field are smart enough to figure out what is meant from context. ♦
• d(x, x) = 0.
• d(x, y) = 0 ⇒ x = y. [“non-degeneracy”.]
That is: The collection of open balls is a sub-basis. The collection of finite intersections
of open balls is a basis. The collection of arbitrary unions of finite intersections of open
balls is a topology.
Proof: Let x and y be distinct, and let δ = d(x, y) be the distance between x and y, which is guaranteed positive. Now consider the open balls
B1 = B(x, δ/2);  B2 = B(y, δ/2).
Then B1 ∩ B2 = ∅. (You should be able to prove this using the triangle inequality.) QED
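The suggested triangle-inequality argument can be written out in one line (with the balls B1 = B(x, δ/2) and B2 = B(y, δ/2) of the proof):

```latex
% If some z \in B_1 \cap B_2 existed, the triangle inequality would give
d(x,y) \;\le\; d(x,z) + d(z,y)
       \;<\; \tfrac{\delta}{2} + \tfrac{\delta}{2}
       \;=\; \delta \;=\; d(x,y),
% a contradiction. Hence B_1 \cap B_2 = \emptyset.
```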
Since all metric spaces are Hausdorff, and nearly all of applied mathematics takes place
in some sort of metric space [or at least a metrizable space] this is another reason why the
applied mathematics community doesn’t worry too much about non-Hausdorff topologies.
Note: In metric spaces you can certainly define continuity; but you do not need to be
in a metric space to do so. That’s the whole point of going to the trouble of defining a
general topological space. ♦
Note: In IR we have (a, b) = B((a + b)/2, (b − a)/2). ♦
Thus for IR the “usual topology” is the metric topology [with the usual metric].
Even better, in IR [or metric spaces generally] continuity in the topological sense is equivalent to the standard ∀ε ∃δ sense used in formal analysis: f is continuous at x iff for every ε > 0 there is a δ > 0 such that d(x, y) < δ implies d(f(x), f(y)) < ε.
This is the version of continuity most often used in classical analysis, but it has now largely been superseded by the topological version. ♦
Notation: homeomorphism ⇐⇒ the function and its inverse are both continuous.
That is: surrounding any point there is at least one open neighborhood that can be
mapped onto a subset of IRn in a 1–to–1 and bi-continuous manner.
Warning: This definition is actually very weak and still permits a number of things
that physicists would consider pathologies:
• non-Hausdorff topologies.
• dimensionality that is position-dependent.
• dimensionality n that does not seem to match the underlying space E.
• Point doubling and/or line splitting — see previous discussion of Hausdorff topology.
(It is easy to verify that those spaces are locally Euclidean in this sense.)
• Disconnected topologies.
Take the set union of a line with a plane, each with its usual topology. The result
is a disconnected topology, which is a locally Euclidean space, but in some regions
the space is 1-dimensional while in other regions it is 2-dimensional.
• Square with 1d topology.
This is particularly perverse. See Chillingworth p 121.
Take a square [0, 1] × [0, 1] ⊂ IR2 .
Define a topology: A set X is open iff it is either of the form
X = {(x, y) : y ∈ A} for some A open in the usual topology on IR , (2.14)
Note that this peculiar example requires an uncountable infinity of open sets. In
particular any open cover of the square must contain an uncountable infinity of open
sets.
Definition 18 A set is infinite but countable if all its elements can be put into 1 to 1 correspondence with the natural numbers IN = {0, 1, 2, 3, . . . }.
Exercise: [trivial] Show that the set of all possible subsets of the natural numbers is
uncountable. ♦
Warning: [In case you like to read mathematics in French] The French word “variété” is equivalent to the English word “manifold”; it does not translate to “variety”. ♦
There are speculations that quantum gravity might eventually force us to modify our
notions of space and time at short distances, in which case “locally Euclidean” might not be the most appropriate choice. Maybe we should look at “locally X” spaces. There is currently no consensus as to what the X might be in “locally X”, so I will not explore that route in this course. The speculations will remain speculations unless and until some clearly useful notion of “locally X” space is developed. ♦
2.10 Charts
Definition 20 Chart
A chart (O, f, U ) on an open subset O ∈ T is simply a set U ⊂IR n together with a
homeomorphism f : O ↔ U = f (O).
That is: in any locally Euclidean space each point x lies in at least one chart. These
charts can be thought of as “maps” (in the cartographic sense) that cover part of the
space E.
Comment: The word “map” already has a technical meaning in mathematics, which
is why in manifold theory it is better to adopt the word “chart”. ♦
Definition 21 Coordinates
The image of the chart, f (O) = U ⊆IRn , is a set of n-tuples of real numbers. Each point
x in O corresponds to a single n-tuple, whose entries are referred to as the coordinates of the
point x in the given chart. It is common to write the coordinates as xa with a = 1, 2, ...n,
or xµ with µ = 1, 2, ...n.
Notation: In the specific case of general relativity it has become conventional to write
a = 0, 1, 2, 3 or µ = 0, 1, 2, 3, with the 0th coordinate typically [not always] being the time coordinate. This is merely convention; it is not fundamental. ♦
2.11 Atlases
Definition 22 Atlas
An atlas is a collection of charts that covers the entire locally Euclidean space E.
Note that whenever two charts overlap, the intersection Oij ≡ Oi ∩ Oj ≠ ∅ is an open set, and the mapping fi ◦ fj−1 , suitably restricted, is a homeomorphism from some open subset of IRn to another open subset of IRn .
Lemma 2 In any locally Euclidean space, the local dimensionality n(x) is piecewise con-
stant on the connected components of E.
2.12 Manifolds
Definition 23 Manifold
A manifold M is a locally Euclidean space which:
• is Hausdorff;
• is second-countable.
Note that we have to put the dimension and Hausdorff conditions in “by hand”, while
the second-countable condition is there to make sure the dimensionality of the chart
“matches” the dimensionality of the space E.
Second-countable means there is at least one countable basis for the topology on M.
It implies (not that this is blindingly obvious) that there is at least one way of covering
the space with a countable number of charts. This is much weaker than the restriction
“compact”, but implied by it, since compactness would require every atlas [since it is in
particular an open cover] to have a finite refinement.
The only [mild] danger is that differing texts sometimes use slightly different definitions
as being more basic — in all situations of relevance to applied mathematics and physics the
definitions agree and the “second countable”/ “paracompact”/ “metrizable” restriction
simply serves to keep the number of charts under control.
So far we only have some continuity information about the manifold — where does
differentiability come in?
Example: Sphere S 2
The sphere S 2 is a manifold.
It is not enough to just mutter “latitude and longitude” since this does not cover the
whole sphere; the way we have set things up latitude and longitude would be defined on
an open set that covers most of the sphere except for both poles and one line of longitude.
How would we fix this? ♦
Exercise: What is the minimum number of charts needed to provide an atlas for S 2 ? ♦
Example: Sphere S n
The sphere S n is a manifold.
To see this use stereographic projection. ♦
Exercise: What is the minimum number of charts needed to provide an atlas for S n ? ♦
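To make the stereographic charts concrete, here is a small Python sketch (the formulas are the standard stereographic projection; the function names are my own):

```python
def stereographic(p, pole=+1):
    """Project p on S^n in R^{n+1} from the pole (0, ..., 0, pole)
    onto the equatorial plane R^n.  Undefined at the pole itself."""
    return [x / (1.0 - pole * p[-1]) for x in p[:-1]]

def stereographic_inverse(u, pole=+1):
    """Inverse map: R^n back onto S^n minus the pole."""
    s = sum(x * x for x in u)
    return [2.0 * x / (s + 1.0) for x in u] + [pole * (s - 1.0) / (s + 1.0)]

# A point on S^2, the unit sphere in R^3:
p = [0.6, 0.0, 0.8]
u = stereographic(p, pole=+1)          # chart omitting the north pole
q = stereographic_inverse(u, pole=+1)  # round trip recovers p
print(u, q)
```

The two charts pole = +1 and pole = −1 together cover all of S^n; a single chart cannot suffice, since S^n is compact while one global chart would make it homeomorphic to an open subset of IRn.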
Example: Products M × N
If M is a manifold and N is a manifold then the product M × N is a manifold of dimen-
sion dim(M × N ) = dim(M) + dim(N ). ♦
Example: Torus T n ≡ (S 1 )n = S 1 × S 1 × · · · × S 1
The n-torus T n is a manifold. ♦
This “bundle” of all the tangents is called, surprise, the tangent bundle.
It has a natural topology as a subset of IRn+1 × IRn+1 .
To specify a point in the bundle you need 2n real numbers.
To prove it is a manifold is a relatively simple exercise; not attempted here. ♦
Exercise: Go through the definition in detail to verify that the tangent bundle T (S n )
is actually a manifold. ♦
Comment:
• Locally T (S n ) is always a product of the form Ui × IRn with the set of Ui ’s covering
S n ; but this need not hold globally.
M′i = Mi − Di . (2.20)
These two manifolds both have boundaries bd(M′i ) ∼ S n−1 , which we can identify by some homeomorphism. The result, denoted M1 #M2 , is an n-dimensional manifold (without boundary) called the connected sum of M1 and M2 .
• M1 #M2 = M2 #M1 ;
• M#S n = M.
• x ∼ x. [reflexive]
• x∼y ⇒ y ∼ x. [symmetric]
If we have any sense we would at least want to make the curve h :IR→ M continuous.
Question: How would we define continuity? ♦
Better yet:
Question: How would we define differentiability? ♦
Answer: In each chart U the curve h induces a map f ◦ h from IR to f (U ) ⊂IR n , and we
can certainly define continuity and differentiability for this map f ◦ h. We write
ta = dxa /dλ . (2.24)
Note that this does the obvious thing for Cartesian space M =IRn . If we reparameterize
λ, then by the chain rule
t̄a = dxa /dλ̄ = (dxa /dλ)(dλ/dλ̄) = (dλ/dλ̄) ta . (2.25)
If we change coordinate patches, U to Ū , using
f¯ ◦ f −1 : f −1 (U ) → f¯(U ) (2.26)
then, again by the chain rule
t̄a = dx̄a /dλ = Σ(b=1..n) (∂x̄a /∂xb ) (dxb /dλ) = Σ(b=1..n) (∂x̄a /∂xb ) tb . (2.27)
That is: Once we have the components of the tangent vector in any one chart, there are
specific unambiguous rules for changing to any other chart. Also note that it only makes
sense to talk about the tangent vector components at a particular point p(λ) on the curve
and in the manifold.
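The transformation rule (2.27) can be checked numerically. The sketch below (a hypothetical curve, chosen only for illustration) computes the tangent components of a plane curve in the Cartesian chart, transforms them to polar coordinates with the Jacobian, and compares with direct differentiation in the polar chart:

```python
import math

def curve(lam):
    # A hypothetical curve in the Cartesian chart, chosen only for illustration.
    return (1.0 + lam, lam * lam)

def cart_to_polar(x, y):
    return (math.hypot(x, y), math.atan2(y, x))

def jacobian(x, y):
    """J[a][b] = d(xbar^a)/d(x^b) for xbar = (r, theta), x = (x, y)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def tangent(f, lam, h=1e-6):
    """Components dx^a/dlambda by central differences."""
    fp, fm = f(lam + h), f(lam - h)
    return [(a - b) / (2.0 * h) for a, b in zip(fp, fm)]

lam = 0.7
t = tangent(curve, lam)                                  # t^a in the Cartesian chart
t_bar_direct = tangent(lambda l: cart_to_polar(*curve(l)), lam)
J = jacobian(*curve(lam))
t_bar_chain = [sum(J[a][b] * t[b] for b in range(2)) for a in range(2)]
print(t_bar_direct, t_bar_chain)   # the two agree, as rule (2.27) asserts
```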
Definition 31 Vector:
Anything that transforms the same way as a tangent vector is called a vector, or more
specifically a contravariant vector.
To verify this, pick a curve xa (λ) in IRn defined by xa (λ) = ta λ and “pull it back” to
U by defining
h(λ) = f −1 (xa (λ)) = f −1 (ta λ). (2.28)
To (informally) prove this, note that “above” any chart U , the tangent bundle will locally
look like U × IRn , so that sets of this form provide charts for T (M).
Since each of these charts has dimension 2n, so does T (M).
Since M (and hence U ) is Hausdorff, and IRn is Hausdorff, so is U × IRn , and so is T (M).
Since M is second countable, pick a countable atlas Ui . Then Ui × IRn provides a countable
atlas for T (M), which is thereby second countable.
Comment: Once we have the components of the vector ta , it is useful to consider the
directional derivative in the direction ta . Define:
t = ta ∂/∂xa . (2.29)
It is then easy to verify (by the chain rule) that t is independent of coordinate chart —
it is a “geometrical object”. This provides a natural isomorphism between the set of all
vectors at a point and the set of all directional derivatives at that point.
This leads to statements that at first glance look a bit strange, such as:
You should be prepared, when encountering such terminology, to make appropriate trans-
lations.
2.17 Covectors
There is a second set of distinct vectors (not tangent vectors) that it is natural to define
on arbitrary manifolds, the co-vectors, a natural extension of the notion of “gradient”.
Let φ(p) be a mapping M → IR. Then in a specific coordinate chart (O, f, U) we can write
φ(x^a) ∈ IR, meaning φ ∘ f⁻¹ : f(O) = U ⊆ IR^n → IR.   (2.30)
Now copy over (with minor modifications) the discussion we had for tangent vectors:
g_a = ∂φ/∂x^a.   (2.31)
Index placement is important!
Warning: The gradient vectors do not transform the same way as the tangent vectors; the index placement is different; the use of the chain rule is different; to define tangent vectors you need a map IR → M, while to define gradient vectors you need a map M → IR. ♦
Definition 35 Gradient:
The gradient of the function φ at a point p in M is the abstract object g defined as follows: Pick any chart U surrounding p and calculate the components g_a as above. g is defined to be the object which in chart U has components g_a (and so we know its components in any chart).
Definition 36 Co-vector:
A co-vector at the point p in M is an abstract geometrical object, denoted g, defined as follows: Pick any chart U surrounding p and in that chart suppose g has components g_a. Suppose that in any other chart it has components
ḡ_a = Σ_{b=1}^{n} (∂x^b/∂x̄^a) g_b.   (2.33)
Then we say that g is a co-vector that is independent of the specific choice of chart used
to initiate the definition.
Covectors are also called covariant vectors; they are simply not the same as contravariant
vectors.
Indeed, pick a chart U surrounding p and write coordinates {xa } ∈ f (U ). Then every
distinct mapping from IRn to IR corresponds to a distinct real function φ on M. The
gradients defined by the distinct mappings from IRn to IR certainly span a vector space.
To prove this (informally), note that “above” any chart U , the cotangent bundle will
locally look like U × IRn , so that sets of this form provide charts for T ∗ (M).
Since each of these charts has dimension 2n, so does T ∗ (M ).
Since M (and hence U ) is Hausdorff, and IRn is Hausdorff, so is U × IRn , and so is T ∗ (M ).
Since M is second countable, pick a countable atlas Ui . Then Ui × IRn provides a countable
atlas for T ∗ (M), which is thereby second countable.
Now that we have defined both tangent space and cotangent space, I'll explain the nomenclature by pointing out that the tangent space T_p and cotangent space T_p^* are dual to each other in the sense of vector space duality. Specifically, any cotangent vector g ∈ T_p^* can be viewed as a linear mapping from T_p to IR.
The most down to earth way of seeing this is by working in terms of coordinates:
g ∈ T_p^* → g_a ;   t ∈ T_p → t^a   (2.35)
Define
g(t) = Σ_{a=1}^{n} g_a t^a ∈ IR   (2.36)
and note that the combination g_a t^a is independent of coordinate system — g_a transforms "covariantly", but t^a transforms "contravariantly", and in the combination g_a t^a the two transformation matrices, being inverses, cancel.
Σ_{b=1}^{n} (∂x̄^a/∂x^b) (∂x^b/∂x̄^c) = δ^a_c   (2.38)
That is, the transformation matrices for vectors and covectors are matrix inverses of each
other. ♦
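Equation (2.38) is easy to check numerically. The sketch below is my own illustration, not part of the notes: it uses the polar/Cartesian change of chart on IR², builds both Jacobian matrices by central differences, and confirms that they are matrix inverses, which is exactly why the pairing g_a t^a comes out the same in both charts.

```python
import math

def jacobian(f, x, h=1e-6):
    # J[a][b] = ∂f^a/∂x^b by central differences.
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for b in range(n):
        xp, xm = list(x), list(x)
        xp[b] += h
        xm[b] -= h
        fp, fm = f(xp), f(xm)
        for a in range(n):
            J[a][b] = (fp[a] - fm[a]) / (2 * h)
    return J

# Charts: x = (r, θ) and x̄ = (u, v) = (r cos θ, r sin θ).
to_cart = lambda x: [x[0] * math.cos(x[1]), x[0] * math.sin(x[1])]
to_polar = lambda y: [math.hypot(y[0], y[1]), math.atan2(y[1], y[0])]

p = [1.5, 0.7]                           # an arbitrary point, in polar coordinates
J = jacobian(to_cart, p)                 # ∂x̄^a/∂x^b
Jinv = jacobian(to_polar, to_cart(p))    # ∂x^b/∂x̄^c at the same point

# Transform a vector contravariantly and a covector covariantly ...
t = [0.3, -1.2]
g = [2.0, 0.7]
t_bar = [sum(J[a][b] * t[b] for b in range(2)) for a in range(2)]
g_bar = [sum(Jinv[b][a] * g[b] for b in range(2)) for a in range(2)]

# ... and the pairing g_a t^a is the same number in both charts.
pairing = sum(g[a] * t[a] for a in range(2))
pairing_bar = sum(g_bar[a] * t_bar[a] for a in range(2))
```

The specific charts and the point p are arbitrary; any differentiable change of coordinates would do.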
Notation:
Bra-ket notation:
g(t) = hg|ti (2.40)
Dirac's bra-ket notation (widely used in quantum mechanics to represent inner products on an appropriate Hilbert space) is an example of distinguishing a vector space from its dual. Of course Dirac was dealing with a much larger vector space, that of suitably defined "functions" on IR³ × IR. ♦
Comment: "bra" vectors are written ⟨g|; "ket" vectors are written |t⟩; put them together, as in ⟨g|t⟩, and you get a "bra-ket" [bracket]. Blame Paul Dirac for this abuse of the English language. ♦
(Tp∗ )∗ = Tp (2.43)
So, of course it’s a coordinate invariant. (Since φ(λ) : IR→IR has a meaning that is
quite independent of any coordinates you introduce on the manifold.)
2.19 Tensors
Now that we have defined tangents, cotangents, and duality, we are ready to define tensors.
[Adapted from a discussion I first saw presented by Chris Grigson.]
Definition 38 Tensor
A tensor T^r_s(p) at a point p is a multi-linear mapping from (T_p^*)^r ⊗ (T_p)^s into the real numbers IR.
The symbol ⊗ denotes the usual Cartesian product of two structures, in this case the
Cartesian product of a number of vector spaces.
Why are we doing such a perverse thing? Because this is the abstract way of defining
a tensor without first dealing with components.
Because a tensor T^r_s(p) is defined in this way, as a multi-linear mapping from (T_p^*)^r ⊗ (T_p)^s into the real numbers IR, and because (T_p^*)^r ⊗ (T_p)^s is itself a vector space [of dimension dim(M)^{r+s}], we could equally well view the tensor as an element of the dual of (T_p^*)^r ⊗ (T_p)^s. That is
T^r_s(p) ∈ [ (T_p^*)^r ⊗ (T_p)^s ]^*   (2.47)
It is then an easy exercise to show
[ (T_p^*)^r ⊗ (T_p)^s ]^* = [(T_p^*)^r]^* ⊗ [(T_p)^s]^* = [(T_p^*)^*]^r ⊗ [(T_p)^*]^s = (T_p)^r ⊗ (T_p^*)^s   (2.48)
That is:
T^r_s(p) ∈ (T_p)^r ⊗ (T_p^*)^s   (2.49)
The components of a tensor of type T^r_s at a point p are defined as follows: Pick r covectors g^1, g^2, ..., g^r and s vectors t_1, t_2, ..., t_s at p. Then, because the tensor T^r_s is a multi-linear mapping, there must exist a collection of numbers T^{a_1 a_2 ... a_r}_{b_1 b_2 ... b_s} such that:
T^r_s(g^1, g^2, ..., g^r; t_1, t_2, ..., t_s) = T^{a_1 a_2 ... a_r}_{b_1 b_2 ... b_s} g^1_{a_1} g^2_{a_2} ... g^r_{a_r} t_1^{b_1} t_2^{b_2} ... t_s^{b_s}.   (2.50)
These numbers are called the components of the tensor T^r_s in the specified chart/coordinate patch/coordinates.
Note that under a change of charts (change of coordinates) Tsr transforms as you would
expect: each one of the up (contravariant) indices transforms like the components of a
tangent vector; each one of the down (covariant) indices transforms like a gradient. Thus
in particular for tensors of type:
T^1_1 :   X^a_b → X^ā_b̄ = (∂x^ā/∂x^a) (∂x^b/∂x^b̄) X^a_b   (2.51)
T^2_0 :   X^{ab} → X^{āb̄} = (∂x^ā/∂x^a) (∂x^b̄/∂x^b) X^{ab}   (2.52)
T^0_2 :   X_{ab} → X_{āb̄} = (∂x^a/∂x^ā) (∂x^b/∂x^b̄) X_{ab}   (2.53)
(Note that I have put the over-bars on the indices, not on the coordinates themselves.
This is supposed to make things clearer.)
Exercise: Consider the T^1_1 tensor whose components are (in the chart whose coordinates are x^a) given by the Kronecker delta δ^a_b. What are its components in some other coordinate system x̄^a(x^b)? ♦
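A numerical spot-check for this exercise (my own sketch, using the polar to Cartesian change of chart at an arbitrary point, and the transformation law (2.51)):

```python
import math

# Chart change x = (r, θ) → x̄ = (r cos θ, r sin θ), at an arbitrary point.
r, th = 1.3, 0.4

# Analytic Jacobians of the chart change and its inverse:
J = [[math.cos(th), -r * math.sin(th)],
     [math.sin(th),  r * math.cos(th)]]          # J[ā][a] = ∂x̄^ā/∂x^a
Jinv = [[ math.cos(th),      math.sin(th)],
        [-math.sin(th) / r,  math.cos(th) / r]]  # Jinv[b][b̄] = ∂x^b/∂x̄^b̄

delta = [[1.0, 0.0], [0.0, 1.0]]                 # δ^a_b in the x chart

# Apply (2.51): X̄^ā_b̄ = (∂x̄^ā/∂x^a)(∂x^b/∂x̄^b̄) X^a_b
Xbar = [[sum(J[abar][a] * Jinv[b][bbar] * delta[a][b]
             for a in range(2) for b in range(2))
         for bbar in range(2)]
        for abar in range(2)]
```

Xbar comes out as the identity matrix again: contracting the two Jacobians, which are matrix inverses, reproduces δ^ā_b̄, so the mixed Kronecker delta has the same components in every chart.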
Once you have the notion of a tensor at a point, the notion of a tensor field is immediate:
A fibre bundle [US spelling: fiber bundle] is a particular type of manifold with special
structure. Standard examples of fibre bundles are the tangent bundle T (M) and the
cotangent bundle T ∗ (M), but the basic idea is much more general.
Additionally we demand that T is "locally" a product space. That is, suppose we have an open cover of the base space M by a collection of open sets {U}. Then for each U we demand π⁻¹(U) ≅ U × F; the inverse image of each such open set is homeomorphic to U × F.
Examples:
Special cases:
• If the fibre is a vector space [not necessarily the tangent space], the fibre bundle is
called a “vector bundle”.
• If the fibre is a group manifold, and the charts respect the group structure, then the
fibre bundle is called a “principal bundle”.
• If the fibre is IR, the fibre bundle is called a “line bundle”.
• If the fibre is C, the fibre bundle is called a “complex line bundle”.
• In quantum mechanics on configuration spaces of nontrivial topology the Schrödinger wavefunction ψ(x) is not a function. It is instead a section of a complex line bundle defined over the configuration space. This is actually important for a proper analysis of both the Aharonov–Bohm effect and flux quantization in superconducting rings.
• In quantum field theory the Dirac spinor describing an electron, ψ(x), is not a function. It is instead a section of a spinor bundle defined over Minkowski space. This is actually important for a deeper understanding of the electromagnetic field.
• Don’t be afraid of terms like “[something] bundle”; they are just special types of
bundle where you have additional information about the fibre.
All the technical machinery and terminology I have set up so far is very general, and
very flexible; it is used both in standard GR, and in various extensions to GR currently
under investigation. It will be common, at least in some approximation, to all physically
reasonable extensions of GR and is also used in other quite different physical situations
such as topological solitons, ...
So far, I have not introduced the notion of a metric tensor. The tangent and cotangent
spaces, though dual to each other, have otherwise been independent of each other. This
will change, in a chapter or two.
Affine Connexions
The fundamental reason that tensor calculus is (relatively) difficult is that partial deriva-
tives of a tensor are not themselves a tensor; so that it takes quite a bit of work to even
define tensor differentiation. We will focus on one way of defining tensor differentiation
in this chapter; there are at least two other types of tensor derivative we’ll get round to
later in the course.
unless, that is, you can guarantee that the only coordinate changes you will ever have to
make are linear.
This happens for instance if you are working in Euclidean space using Cartesian coordinates and are only interested in changing coordinates to another Cartesian system.
Comment: There is a whole vast subject called “Cartesian tensors”, mainly used in
geophysics, elasticity theory, and engineering analysis for which all these simplifications
do occur — so all the complications of this chapter quietly go away. ♦
In general though, you simply have to live with the first term and either develop additional structure to deal with it or suitably restrict the questions you ask. The three standard routes to tensor differentiation are:
For this chapter we will stick with the affine connexion [covariant derivative]. Note that
the first term, which causes the problems, is linear in the components of V ; this is one
hint of how to fix things up. The second hint comes from a deeper look at what went
wrong: In the partial derivative we are calculating
∂_c̄ V^ā ≡ lim_{δx→0} [ V^ā(x + δx) − V^ā(x) ] / δx^c̄   (3.6)
But that means we are really subtracting two different vectors in two different vector
spaces, one at x, [V (x) ∈ Tx ], and one at x + δx, [V (x + δx) ∈ Tx+δx ] — no wonder the
result is not a tensor.
transport(y→x;γ) : Ty → Tx (3.8)
should be an invertible mapping — distinct vectors should map to distinct vectors — but
it might well depend on the specific path chosen to go from y → x. (In fact, in general
it will be path dependent.) Now we certainly want parallel transport along the null path
γ0 (the path that does not move anywhere) to be the trivial identity operator
transport(x→x;γ0 ) = I : Tx → Tx (3.9)
Additionally, if we parallel transport along a path and then parallel transport back along
the same path in a reversed sense, we would want the composition of these parallel
transport processes to be trivial. Specifically let γ : y → x and let the reversed path be
γ̃ : x → y, then we would expect
that is
transport(x→y; γ̃) = [transport(y→x; γ)]⁻¹   (3.11)
so that reversing the path corresponds to the inverse transformation.
Finally, since it maps vector spaces to vector spaces, the transport operator should at
the very least be linear:
This is also necessary if we wish to enforce the basic property ∇(V1 + V2 ) = ∇V1 + ∇V2 .
So if we have a coordinate chart that covers both the points x and y we will be able to
write the components of the transport operator as:
[transport(y→x;γ) ]a b (3.13)
As always, the most basic properties are assumed, not derived, and the relevant ques-
tion is whether the assumptions you make [the axioms] lead to a useful mathematical
structure. ♦
3.2.2 Connexion
Γ^a_{bc} = ∂/∂y^c {transport[y → x; γ]}^a_b |_{y→x} ,   (3.14)
where the limit y → x approaches the trivial curve γ0 . The affine connexion is not a
tensor, for exactly the same reason that the partial derivative of a vector is not a tensor.
Note that the leading indices a and b have to do with “matrix multiplication” of the vector
components, while the trailing index c has to do with the “direction of differentiation”.
To emphasise the different roles these indices play it is sometimes useful to introduce an
abstract “bullet” notation and write:
Γ^•_{•c} = ∂/∂y^c {transport[y → x; γ]}^•_• |_{y→x} .   (3.15)
The “bulleted” indices should, whenever possible, be chained together by matrix multi-
plication.
Warning: This “bullet” notation is my own slightly nonstandard notation. You will
soon see why it is often useful. ♦
Warning: Authors differ regarding where to place the indices on the affine connex-
ion. Check whichever book you are using. The convention of these notes is [putting the
direction you are differentiating in as the trailing lower index] is compatible with Misner–
Thorne–Wheeler but is not [strictly speaking] compatible with either Hartle or Wald.
Fortunately both of these authors deal explicitly with symmetric connexions, so the order
in which they put the lower two indices does not really matter [for them]. ♦
Comment: A slightly more formal version of this construction works as follows: Pick
a coordinate chart around the point xa and in some “small” neighborhood of xa set up a
special collection of curves
γ^a_{y,x}(λ) = λ y^a + (1 − λ) x^a ;   λ ∈ [0, 1]   (3.16)
Now these curves are all “straight lines” in the specified coordinate chart, but this implies
no loss of generality because we are always looking at “small” regions and relying on the
locally Euclidean nature of the manifold. (So that any complicated [but differentiable]
curve can in “small” enough regions be safely approximated by straight line segments.)
Note that by construction
lim γx,y = γ0 (3.17)
y→x
is guaranteed to be well-defined. ♦
If you are only transporting the vector a short distance, then via Taylor’s theorem the
change in components due to the transport should be linear in δx, (and path independent
at this level of approximation). In components:
transport(x+δx→x) [V (x + δx)]a = V a (x + δx) + Γa bc V b δxc + O[(δx)2 ] (3.20)
This implies
transport(x+δx→x) [V (x + δx)]a = V a (x) + {∂c V a (x) + Γa bc V b }δxc + O[(δx)2 ] (3.21)
This now lets you write the notion of a vector field that is “covariantly constant” to itself
in differential form. By definition of the transport operator a vector field is “covariantly
constant” iff
transport(x+δx→x) [V (x + δx)]a = V a (x) (3.27)
But this then implies
Parameterizing the curve with some parameter λ and taking the limit as δxc → 0 and
δλ → 0, we have
{∂_c V^a(x) + Γ^a_{bc} V^b} (dx^c/dλ) = 0   (3.29)
or
(∂V^a/∂x^c) (dx^c/dλ) + Γ^a_{bc}(λ) V^b(λ) (dx^c/dλ) = 0.   (3.30)
That is
dV^a(λ)/dλ + Γ^a_{bc}(λ) V^b(λ) (dx^c/dλ) = 0.   (3.31)
This differential version of the “parallel transport” process shows that if you (1) know what
the curve xa (λ) is, and you know (2) the components of the affine connexion Γa bc (x), [and
hence Γa bc (λ)], then finding the parallel transported coefficients of V is simply a matter
of solving a first-order linear ODE. And remember from Math 301 that solutions of first-order ODEs enjoy nice existence and uniqueness properties. [Finally,
once you have the parallel transported components at x, you can compare them to the
actual components at x to define the covariant derivative — see below.]
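As a concrete sketch of this procedure (my own example, not from the notes): take the standard polar-coordinate connexion of the flat plane, Γ^r_{θθ} = −r, Γ^θ_{rθ} = Γ^θ_{θr} = 1/r, and integrate the transport ODE (3.31) around the unit circle with a Runge–Kutta step. Since the plane is flat, transporting around a closed loop should return the vector unchanged.

```python
import math

def gamma(r):
    # Polar-coordinate connexion of the flat plane (an assumed, standard
    # example). G[a][b][c] = Γ^a_{bc}, with indices (r, θ) = (0, 1).
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r          # Γ^r_{θθ} = -r
    G[1][0][1] = 1.0 / r     # Γ^θ_{rθ} = 1/r
    G[1][1][0] = 1.0 / r     # Γ^θ_{θr} = 1/r
    return G

def rhs(lam, V):
    # Curve: unit circle r = 1, θ = λ, so dx^c/dλ = (0, 1).
    r, dx = 1.0, (0.0, 1.0)
    G = gamma(r)
    # dV^a/dλ = -Γ^a_{bc} V^b dx^c/dλ, as in eq. (3.31).
    return [-sum(G[a][b][c] * V[b] * dx[c]
                 for b in range(2) for c in range(2))
            for a in range(2)]

def transport(V, lam0, lam1, steps=1000):
    # Classical RK4 integration of the parallel-transport ODE.
    h = (lam1 - lam0) / steps
    lam = lam0
    for _ in range(steps):
        k1 = rhs(lam, V)
        k2 = rhs(lam + h/2, [v + h/2*k for v, k in zip(V, k1)])
        k3 = rhs(lam + h/2, [v + h/2*k for v, k in zip(V, k2)])
        k4 = rhs(lam + h, [v + h*k for v, k in zip(V, k3)])
        V = [v + h/6*(a + 2*b + 2*c + d)
             for v, a, b, c, d in zip(V, k1, k2, k3, k4)]
        lam += h
    return V

# Transport V = ∂_r once around the circle; flat space has trivial
# holonomy around a closed loop, so V should return to itself.
V_final = transport([1.0, 0.0], 0.0, 2 * math.pi)
```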
One could also consider a slightly different notion: that of a vector field that is parallel
to itself (“auto-parallel” but not necessarily “covariantly constant”). By definition of the
transport operator a vector field is parallel to itself iff
That is
transport(x+δx→x) [V (x + δx)]a = f (x, δx) V a (x) (3.33)
But this then implies
Parameterizing the curve with some parameter λ and taking the limit as δxc → 0 and
δλ → 0, we have
{∂_c V^a(x) + Γ^a_{bc} V^b} (dx^c/dλ) = f_c (dx^c/dλ) V^a   (3.35)
or
dV^a(λ)/dλ + Γ^a_{bc}(λ) V^b(λ) (dx^c/dλ) = f_c (dx^c/dλ) V^a.   (3.36)
This is the differential version of an "auto-parallel" vector field. The covector f_c is unconstrained.
Here the integral is to be taken along the curve γ and P denotes the process of "path ordering". The transport operator then forms a "group" under path composition. (Strictly speaking, a pseudo-group, since composition is defined only if the end of the first curve corresponds to the beginning of the second curve.) The technical definition of path-ordering is (see appendix for details)
P exp{ ∫_x^y Γ^•_{•c} dx^c } ≡ lim_{N→∞} ∏_{n=1}^{N} exp{ Γ^•_{•c}(x_n) δx^c_n }   (3.38)
It is the matrix multiplication analogue of the Riemann sum; in quantum physics this is
exactly the same as the “time-ordered exponential integral”.
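To see why the ordering matters, here is a toy sketch (my own, with 2×2 nilpotent matrices standing in for Γ^•_{•c} δx^c, so that exp(Γ δx) = I + Γ δx is exact). The path-ordered product converges to a well-defined limit, and reversing the order of the non-commuting factors gives a genuinely different matrix:

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B, s=1.0):
    return [[A[i][j] + s * B[i][j] for j in range(2)] for i in range(2)]

I = [[1.0, 0.0], [0.0, 1.0]]
A1 = [[0.0, 1.0], [0.0, 0.0]]   # connexion matrix on the first half of the path
A2 = [[0.0, 0.0], [1.0, 0.0]]   # connexion matrix on the second half

def ordered_product(n):
    # Approximate P exp{∫ Γ dx} by a product of n factors exp(Γ δx),
    # with later points multiplied on the left. A1, A2 are nilpotent,
    # so each factor exp(Γ δx) = I + Γ δx is exact.
    dx = 1.0 / n
    P = I
    for k in range(n):
        G = A1 if (k + 0.5) * dx < 0.5 else A2
        P = matmul(add(I, G, dx), P)
    return P

left = ordered_product(1000)                      # ≈ (I + A2/2)(I + A1/2)
right = matmul(add(I, A1, 0.5), add(I, A2, 0.5))  # the reversed ordering
```

Since A1 and A2 do not commute, `left` and `right` differ; for a single Abelian "connexion" the ordering would be irrelevant and the P symbol could be dropped.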
Aside: To a physicist, and in particular to a particle physicist, the underlying logic here is obvious. If we think of the affine connexion as analogous to a non-Abelian [Yang–Mills] gauge field (e.g., in QCD) then the parallel transport operator is analogous to what physicists would call a gauge transport operator — and parallel transport around a closed path is analogous to what physicists would call a Wilson loop. While I am not doing anything with non-Abelian gauge theories at this stage, see the last chapter. ♦
Note that the affine connexion and the transport operator are interchangeable. From the transport operator you can extract the connexion by differentiation, or by looking at "short" curves; from the connexion you can construct the transport operator by path-ordered exponentiation.
Note that (at this stage) the transport operator and affine connexion Γa bc are introduced
as primitive a priori concepts that have no further resolution into more primitive concepts
— this will change once I get around to discussing the metric tensor and metric connexions.
Note that there is no unique best way of introducing the concepts of parallel transport
and affine connexion — many textbooks use an axiomatic formation in terms of the
“covariant derivative” ∇V .
So the determinant of the transport operator has a simple connection with the trace Γa ac .
There is no special word for this object (it’s not a vector), but maybe there should be. ♦
Exercise: Look up the definition of tensor density and convince yourself that Γa ac
transforms as the gradient of a scalar density. ♦
3.3.1 Vectors
In terms of components
∇_b V^a ≡ lim_{δx→0} { V^a(x + δx) + Γ^a_{dc} V^d(x + δx) δx^c + O[(δx)²] − V^a(x) } / δx^b   (3.45)
Then
∇b V a = ∂b V a (x) + Γa db (x) V d (x) (3.46)
This is the component version of the covariant derivative acting on vectors. And if everything has worked right, ∇V should be a T^1_1 tensor. Let's look at the transformation law
for the connexion
Γ^ā_{b̄c̄} = ∂/∂y^c̄ {transport[y → x; γ]}^ā_{b̄} |_{y→x}   (3.47)
 = (∂y^c/∂y^c̄) ∂/∂y^c [ (∂x^ā/∂x^a) {transport[y → x; γ]}^a_b (∂y^b/∂y^b̄) ] |_{y→x}   (3.48)
 = (∂x^c/∂x^c̄) (∂x^ā/∂x^a) (∂x^b/∂x^b̄) Γ^a_{bc} + (∂x^ā/∂x^a) (∂²x^a/∂x^c̄ ∂x^b̄)   (3.49)
 = (∂x^c/∂x^c̄) (∂x^ā/∂x^a) (∂x^b/∂x^b̄) Γ^a_{bc} − (∂x^b/∂x^b̄) (∂²x^ā/∂x^b ∂x^c) (∂x^c/∂x^c̄).   (3.50)
So the affine connexion Γ has the transformation law of a T^1_2 tensor except for an additional second-derivative term. This second-derivative term has of course been carefully cooked up to cancel the second-derivative terms arising in the transformation of the partial derivative of a vector. Result: the covariant derivative (of a contravariant vector) really is a tensor.
Now that we have the notion of covariant derivative of a vector, we can revisit our
differential equation for an auto-parallel vector field
{∂_c V^a(x) + Γ^a_{bc} V^b} (dx^c/dλ) = f_c (dx^c/dλ) V^a   (3.51)
and write it in the form
{∇_c V^a} (dx^c/dλ) = f_c (dx^c/dλ) V^a   (3.52)
Multiply by V^b and anti-symmetrize; then
∇_c V^{[a} V^{b]} (dx^c/dλ) = 0   (3.53)
which has the advantage that the unknown covector fc has been eliminated. If we only
specify that the vector field is auto-parallel along the one curve, this is as far as we can
go. But if this is supposed to hold for all curves, that is for all dxa /dλ, then
∇c V [a V b] = 0 (3.54)
which is our final form of the differential constraint for an auto-parallel vector field. Note
that a covariantly constant vector field satisfies the stronger constraint
∇c V a = 0 (3.55)
3.3.2 Covectors
How would we define the covariant derivative of a covector? Use the fact that t^a g_a is a scalar, and make some extra assumptions:
• For a scalar
∇_a φ ≡ ∂_a φ   (3.56)
This is justified since the gradient of a scalar does transform in a tensorial manner.
• Preserve the Leibniz rule:
So
(∂a tb + Γb ca tc )gb + tb (∇a gb ) = (∂a tb )gb + tb (∂a gb ) (3.59)
That is
(Γb ca tc )gb + tb (∇a gb ) = tb (∂a gb ) (3.60)
Rearrange
tb (∇a gb ) = tb (∂a gb − Γc ba gc ) (3.61)
But this is supposed to hold for arbitrary t^a, therefore
∇a gb = ∂a gb − Γc ba gc (3.62)
And so the rule for covariantly differentiating covectors is implicit in that for vectors.
3.3.3 Tensors
The generalization to any Tsr is now obvious. There will be a total of r + s + 1 terms.
The first will be the partial derivative. Then r terms involving +Γ, and finally s terms
involving −Γ. On the Γ’s the last trailing index will always be the same as the index on
the ∇ and the remaining indices can be reconstructed to get the tensor structure correct.
Thus
∇_a T^{b_1 b_2 ... b_r}_{c_1 c_2 ... c_s} ≡ ∂_a T^{b_1 b_2 ... b_r}_{c_1 c_2 ... c_s} + Σ_{α=1}^{r} Γ^{b_α}_{σa} T^{b_1 b_2 ... σ ... b_r}_{c_1 c_2 ... c_s} − Σ_{β=1}^{s} Γ^{σ}_{c_β a} T^{b_1 b_2 ... b_r}_{c_1 c_2 ... σ ... c_s}   (3.63)
∇b V a = ∂b V a + Γa db V d (3.64)
and
∇a gb = ∂a gb − Γc ba gc (3.65)
As a specific example
∇a X b c = ∂a X b c + Γb ma X m c − Γm ca X b m (3.66)
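The index bookkeeping in (3.63) is easy to get wrong, so here is a sketch (my own, with arbitrary fixed pseudo-random component values at a single point) that implements the general formula and checks it against the hand-written T^1_1 case (3.66):

```python
import itertools
import random

n = 2
rng = random.Random(42)

# Hypothetical point data: a connexion filled with arbitrary numbers,
# purely to exercise the index bookkeeping. Gamma[a][b][c] = Γ^a_{bc}.
Gamma = [[[rng.uniform(-1, 1) for _ in range(n)]
          for _ in range(n)] for _ in range(n)]

def nabla(a, r, s, T, dT):
    # General formula (3.63). T maps an (r+s)-tuple (uppers..., lowers...)
    # of indices to a component; dT maps (a,) + that tuple to ∂_a T.
    out = {}
    for idx in itertools.product(range(n), repeat=r + s):
        val = dT[(a,) + idx]
        for alpha in range(r):          # +Γ^{b_α}_{σa} T^{..σ..} terms
            for sig in range(n):
                jdx = idx[:alpha] + (sig,) + idx[alpha + 1:]
                val += Gamma[idx[alpha]][sig][a] * T[jdx]
        for beta in range(s):           # -Γ^σ_{c_β a} T_{..σ..} terms
            j = r + beta
            for sig in range(n):
                jdx = idx[:j] + (sig,) + idx[j + 1:]
                val -= Gamma[sig][idx[j]][a] * T[jdx]
        out[idx] = val
    return out

# A T^1_1 tensor X^b_c and its partials ∂_a X^b_c at the point:
X = {(b, c): rng.uniform(-1, 1) for b in range(n) for c in range(n)}
dX = {(a, b, c): rng.uniform(-1, 1)
      for a in range(n) for b in range(n) for c in range(n)}

covX = nabla(0, 1, 1, X, dX)   # components of ∇_0 X^b_c
```

Comparing `covX` against the explicit expansion (3.66) term by term confirms the sign and index placement conventions.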
3.4 Geodesics
Geodesics are now defined as the “straightest possible” curves in the manifold, where
straight means that the tangent vector to the curve, parallel propagated along the curve,
is still parallel to the tangent vector.
This can be written in infinitesimal form by differentiating with respect to λ₂, and then letting y → x (i.e., λ₂ → λ₁). Then
dt/dλ + [ (d/dλ) transport(γ(λ₂)→γ(λ₁); γ) ] t ∝ t   (3.69)
In coordinates:
dt/dλ + [ (dx^a/dλ) (∂/∂x^a) transport(γ(λ₂)→γ(λ₁); γ) ] t ∝ t   (3.70)
That is
dt^a/dλ + {t^c Γ^a_{bc}} t^b ∝ t^a   (3.71)
Alternatively
dt^a/dλ + Γ^a_{bc} t^b t^c ∝ t^a   (3.72)
So that:
d²x^a/dλ² + Γ^a_{bc} (dx^b/dλ) (dx^c/dλ) ∝ dx^a/dλ.   (3.73)
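As a sketch of this equation in action (my own example, with the proportionality constant set to zero, i.e. an affine parameterization): using the polar-coordinate connexion of the flat plane, Γ^r_{θθ} = −r, Γ^θ_{rθ} = Γ^θ_{θr} = 1/r, as the assumed connexion, the geodesics should be ordinary straight lines, and integrating (3.73) numerically confirms this.

```python
import math

def deriv(state):
    # State is (r, θ, dr/dλ, dθ/dλ); return its λ-derivative using the
    # geodesic equation d²x^a/dλ² = -Γ^a_{bc} (dx^b/dλ)(dx^c/dλ).
    r, th, rdot, thdot = state
    return [rdot, thdot,
            r * thdot * thdot,             # -Γ^r_{θθ} θ'² = +r θ'²
            -2.0 * rdot * thdot / r]       # -2 Γ^θ_{rθ} r' θ'

def rk4(state, lam_end, steps=2000):
    # Classical RK4 integration from λ = 0 to λ = lam_end.
    h = lam_end / steps
    for _ in range(steps):
        k1 = deriv(state)
        k2 = deriv([s + h/2*k for s, k in zip(state, k1)])
        k3 = deriv([s + h/2*k for s, k in zip(state, k2)])
        k4 = deriv([s + h*k for s, k in zip(state, k3)])
        state = [s + h/6*(a + 2*b + 2*c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state

# Start at (r, θ) = (1, 0) with the velocity of the straight line
# x = 1, y = λ expressed in polar coordinates, and integrate to λ = 2.
r, th, _, _ = rk4([1.0, 0.0, 0.0, 1.0], 2.0)
x, y = r * math.cos(th), r * math.sin(th)
```

Converting the endpoint back to Cartesian coordinates gives (x, y) ≈ (1, 2): the geodesic is the straight line it should be, even though its polar components look complicated.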
Where does the notion of metric tensor come into this? It doesn’t — not yet anyway.
So by suitably choosing λ̄(λ) you can make the term proportional to t vanish, so that:
(t̄ · ∇)t̄ = 0.   (3.80)
A parameterization of this type is called an affine parameterization of the curve. Affine parameterizations are invariant under affine transformations of the form
λ → λ̄ = Aλ + B.   (3.81)
Exercise: Show that to make the term proportional to t vanish you need to solve the ODE
d²λ/dλ̄² = −f(λ) (dλ/dλ̄)²   (3.82)
What are appropriate boundary conditions? What can you say about the existence and uniqueness of solutions to this ODE? ♦
Warning: The use of the word "affine" in "affine connexion" is subtly different from its use in "affine parameterization"; in the connexion context it arises because the transformation laws from one chart to another are qualitatively of the form
Γ → Γ̄ = J J J⁻¹ Γ + ∂J   (3.83)
where J is a transformation matrix; it's affine in the sense that it's linear with an inhomogeneous term. ♦
3.5 Torsion
Ta ≡ T m ma (3.85)
Note that the trace is on the first two indices. This quantity does not seem to have a
special name. ♦
Torsion is also used to study the continuum approximation of line defects in solid-state crystals. ♦
Comment: Symmetric connexions are very important; the entire next chapter (thankfully a small chapter) will be devoted to them. ♦
Γ^a_{bc} = Γ^a_{(bc)} + T^a_{bc}   (3.87)
which has the effect of decomposing an arbitrary affine connexion into a symmetric connexion plus the torsion.
Lemma 6 Let Γa bc be an affine connexion and let X a bc be any tensor of type T21 . Then
More generally:
How should we try to interpret torsion geometrically? Torsion can be interpreted in terms
of “infinitesimal parallelograms” and the fact that they fail to close. (There will now be
a brief agony of index gymnastics.)
Specifically, pick some point x and a pair of tangent vectors t1 , t2 . Take the geodesic
passing through x in the direction t1 and, adopting an affine parameterization, move out
a “distance” λ1 . This puts you at the point
x^a(0 → 1) = x^a + t_1^a λ_1 + (1/2) Γ^a_{bc} t_1^b t_1^c λ_1² + O[λ_1³].   (3.91)
Meanwhile, parallel transport the vector t_2 along this geodesic. The components of t_2 at x^a + t_1^a λ_1 will be
t_2^a(x^b(0 → 1)) = t_2^a + Γ^a_{bc} t_2^b [t_1^c λ_1] + O[λ_1²].   (3.92)
Now move out a "distance" λ_2 along the geodesic specified by this vector; that will now place you at the point
x^a + t_1^a λ_1 + (1/2) Γ^a_{bc} t_1^b t_1^c λ_1² + O[λ_1³] + { t_2^a + Γ^a_{bc} t_2^b [t_1^c λ_1] + O[λ_1²] } λ_2 + (1/2) Γ^a_{bc} t_2^b t_2^c λ_2² + O[λ_2³]   (3.93)
That is, at the point
x^a(0 → 1 → 2) = x^a + t_1^a λ_1 + t_2^a λ_2 + (1/2) Γ^a_{bc} t_1^b t_1^c λ_1² + (1/2) Γ^a_{bc} t_2^b t_2^c λ_2² + Γ^a_{bc} t_2^b t_1^c λ_1 λ_2 + O[λ³]   (3.94)
Now go back to the original point x and repeat in the opposite order, first travelling out along t_2 a "distance" λ_2 and then along the parallel transported t_1 a "distance" λ_1. You will now end up at the point
x^a(0 → 2) = x^a + t_2^a λ_2 + (1/2) Γ^a_{bc} t_2^b t_2^c λ_2² + O[λ_2³].   (3.95)
and
x^a(0 → 2 → 1) = x^a + t_1^a λ_1 + t_2^a λ_2 + (1/2) Γ^a_{bc} t_1^b t_1^c λ_1² + (1/2) Γ^a_{bc} t_2^b t_2^c λ_2² + Γ^a_{bc} t_1^b t_2^c λ_1 λ_2 + O[λ³]   (3.96)
If we were in flat space the “parallelogram” would certainly close and we would have
xa (0 → 1 → 2) = xa (0 → 2 → 1). In the general situation however we see that the
parallelogram does not close and that
That is:
x^a(0 → 1 → 2) − x^a(0 → 2 → 1) = 2 T^a_{bc} t_2^b t_1^c λ_1 λ_2 + O[λ³].   (3.98)
In other words: torsion has to do with the fact that travelling along “parallel” geodesics is
not a commutative process, and that in a manifold with torsion there are no “infinitesimal
parallelograms”; at least no infinitesimal parallelograms defined via geodesic motion and
parallel transport.
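The non-closure result (3.98) can be checked mechanically. The sketch below (my own, with an arbitrary fixed constant connexion in two dimensions) evaluates the second-order endpoints (3.94) and (3.96) and compares their gap with 2 T^a_{bc} t_2^b t_1^c λ_1 λ_2:

```python
import random

n = 2
rng = random.Random(7)
# An arbitrary constant connexion, filled with fixed pseudo-random
# numbers; G[a][b][c] = Γ^a_{bc}.
G = [[[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
     for _ in range(n)]
# Its torsion T^a_{bc} = Γ^a_{[bc]}:
T = [[[(G[a][b][c] - G[a][c][b]) / 2 for c in range(n)]
      for b in range(n)] for a in range(n)]

x = [0.0, 0.0]
t1 = [1.0, 0.5]
t2 = [-0.3, 2.0]
l1, l2 = 0.1, 0.2

def endpoint(ta, la, tb, lb):
    # Second-order endpoint: first move along ta, then along the
    # transported tb, exactly as in (3.94) / (3.96).
    return [x[a] + ta[a]*la + tb[a]*lb
            + sum(G[a][b][c] * (0.5*ta[b]*ta[c]*la*la
                                + 0.5*tb[b]*tb[c]*lb*lb
                                + tb[b]*ta[c]*la*lb)
                  for b in range(n) for c in range(n))
            for a in range(n)]

gap = [p - q for p, q in zip(endpoint(t1, l1, t2, l2),
                             endpoint(t2, l2, t1, l1))]
predicted = [2 * sum(T[a][b][c] * t2[b] * t1[c]
                     for b in range(n) for c in range(n)) * l1 * l2
             for a in range(n)]
```

The gap and the torsion prediction agree component by component; with a symmetric connexion (T = 0) the parallelogram would close at this order.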
Now wait a minute: We have already seen that given an arbitrary affine connexion Γ^a_{bc}, the related symmetric connexion Γ^a_{(bc)} has the same geodesics and, by definition, no torsion.
Well, although the sides of the "infinitesimal parallelograms" defined by Γ^a_{(bc)} are still geodesic when viewed in terms of Γ^a_{bc}, the sides of the "infinitesimal parallelograms" defined by Γ^a_{(bc)} fail to be parallel when viewed in terms of Γ^a_{bc}.
Exercise: Check this. Show that the lack of parallelism is proportional to the torsion. ♦
In other words, there is more to parallelism than geodesics. Even if you know what all the
geodesics are in a given manifold, this does not mean you know how to define “parallel”.
Example: To really drive this home, consider the extreme case in which Γa (bc) is trivial,
meaning that there is a coordinate system in which the symmetric connexion is zero
throughout the manifold. In that case the geodesics are simply the geodesics of Euclidean
space IRn — ordinary straight lines. But parallelism is still complicated: the differential
equation for parallel transport [in this situation] simplifies to
dV^a(λ)/dλ + T^a_{bc}(λ) V^b(λ) (dx^c/dλ) = 0.   (3.99)
Let's make the further brutal simplification that the torsion is position independent. Because the geodesics are straight lines we can wlog [without loss of generality] parameterize them by ordinary distance, in which case t^c = dx^c/ds is a constant vector. Then the equation of parallel transport reduces to
dV^a(s)/ds + T^a_{bc} t^c V^b(s) = 0.   (3.100)
with the explicit solution
Though this example is very special [Γa (bc) trivial; T a bc constant] there is a general message
that can be inferred from the above:
Torsion arises from the part of the parallel transport operation that is independent
from the geodesics.
Suppose we take a scalar function φ(x) and calculate [∇a , ∇b ]φ. Now
∇a ∇b φ = ∇a (∂b φ) = ∂a ∂b φ − Γm ba ∂m φ. (3.102)
Therefore, anti-symmetrizing
[∇a , ∇b ]φ = 2 T m ab ∇m φ. (3.103)
Thus we see that the torsion is related to the failure of covariant derivatives, acting on a
scalar, to commute. The generic failure of the covariant derivatives, acting on a vector,
to commute leads naturally to the concept of curvature.
Warning: Note that the commutator [a, b] = ab−ba, as opposed to the anti-symmetrization
process A[ab] = (Aab − Aba )/2, is defined without any factor of 1/2. This usage is unfor-
tunately standard and I’m not about to try to change it. ♦
There are two natural ways of getting from the affine connexion to the notion of curvature:
Either by using the commutativity properties of the covariant derivative or by looking at
parallel transport around a small curve.
One of the nice features of partial derivatives is that they commute. Unfortunately
partial derivatives are not tensorial. We have fixed the tensor properties by introducing
the connexion, but this will have an effect on the commutativity properties:
∇_a ∇_b X^c = ∇_a (∂_b X^c + Γ^c_{db} X^d)   (3.105)
= ∂_a (∂_b X^c + Γ^c_{db} X^d) + Γ^c_{ma} (∂_b X^m + Γ^m_{db} X^d) − Γ^n_{ba} (∂_n X^c + Γ^c_{dn} X^d)   (3.106)
= ∂_a ∂_b X^c + (∂_a Γ^c_{db}) X^d + Γ^c_{db} (∂_a X^d) + Γ^c_{ma} ∂_b X^m − Γ^n_{ba} ∂_n X^c + (Γ^c_{ma} Γ^m_{db} − Γ^n_{ba} Γ^c_{dn}) X^d   (3.107)
[∇_a, ∇_b] X^c = (∂_a Γ^c_{db} − ∂_b Γ^c_{da} + Γ^c_{ma} Γ^m_{db} − Γ^n_{ba} Γ^c_{dn} − Γ^c_{mb} Γ^m_{da} + Γ^n_{ab} Γ^c_{dn}) X^d + Γ^c_{db} ∂_a X^d + Γ^c_{ma} ∂_b X^m − Γ^n_{ba} ∂_n X^c − Γ^c_{da} ∂_b X^d − Γ^c_{mb} ∂_a X^m + Γ^n_{ab} ∂_n X^c   (3.108)
Regroup:
[∇_a, ∇_b] X^c = (∂_a Γ^c_{db} − ∂_b Γ^c_{da} + Γ^c_{ma} Γ^m_{db} − Γ^c_{mb} Γ^m_{da}) X^d + 2 T^n_{ab} ∇_n X^c + Γ^c_{db} ∂_a X^d + Γ^c_{ma} ∂_b X^m − Γ^c_{da} ∂_b X^d − Γ^c_{mb} ∂_a X^m   (3.109)
Stare at that last line for a second — it’s zero. (Check out the free indices and the dummy
indices.) Therefore
[∇a , ∇b ]X c = (∂a Γc db − ∂b Γc da + Γc ma Γm db − Γc mb Γm da )X d
+2T n ab ∇n X c . (3.110)
Now the LHS is by construction a tensor. Similarly on the RHS the torsion T and covariant
derivative ∇X are by construction tensors. Therefore the combination in brackets must
be a tensor — in fact it is called the Riemann tensor and we define:
Re-labelling indices
Note: This agrees with the sign chosen in MTW, D’Inverno, and Wald. ♦
Note that technically there is something to prove here. Lawden calls it the “quotient
theorem” and it states:
Exercise: The proof is best developed by looking at a few simple examples. Try it. ♦
Warning: Various authors differ in their index placement, and sign conventions, but
once you pick a convention stick to it. The convention here is relatively standard. To
compare with other conventions look at the flyleaf of Misner–Thorne–Wheeler (MTW). ♦
Warning: Schouten “Ricci calculus” does something particularly weird with his index
placement; be warned. ♦
Note: The way we have placed the indices on the Riemann tensor, the first two can
be thought of as carrying internal matrix structure, while the last two carry information
about the directions you are differentiating in. If we introduce a bullet • to stand for a
generic index we can schematically write
R^•_{•cd} ≡ −2 Γ^•_{•[c,d]} − Γ^•_{•[c} Γ^•_{•d]}   (3.114)
where the anti-symmetrization is only on [cd], the directions you are differentiating along,
and the bullets • denote general matrix indices with implied matrix multiplication.
There are of course several good reasons for this notation. It is helpful in a purely mathe-
matical sense and for physicists it does serve to make manifest the deep analogies between
a general affine connexion and the non-Abelian Yang–Mills gauge field, and between Rie-
mann curvature and Yang-Mills field strength. (This comment need not make sense now,
see the lecture on Yang-Mills at the end of the course.) ♦
Note the remarkable feature of this result: the affine connexion is not a tensor, but this
particular combination of derivatives and quadratic terms is a tensor (type T31 in fact).
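As a sanity check on the bracket in (3.110), here is a finite-difference sketch (my own, not from the notes) that assembles R^c_{dab} for the polar-coordinate connexion of the flat plane; since that connexion describes flat IR², every component should vanish:

```python
def Gamma(x):
    # Polar-coordinate connexion of the flat plane at x = (r, θ);
    # G[c][d][b] = Γ^c_{db} (an assumed, standard example).
    r = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r          # Γ^r_{θθ}
    G[1][0][1] = 1.0 / r     # Γ^θ_{rθ}
    G[1][1][0] = 1.0 / r     # Γ^θ_{θr}
    return G

def dGamma(a, x, h=1e-5):
    # Central-difference derivative of the whole connexion in direction a.
    xp, xm = list(x), list(x)
    xp[a] += h
    xm[a] -= h
    Gp, Gm = Gamma(xp), Gamma(xm)
    return [[[(Gp[c][d][b] - Gm[c][d][b]) / (2 * h) for b in range(2)]
             for d in range(2)] for c in range(2)]

def riemann(x):
    # R^c_{dab} = ∂_a Γ^c_{db} - ∂_b Γ^c_{da}
    #           + Γ^c_{ma} Γ^m_{db} - Γ^c_{mb} Γ^m_{da}
    G = Gamma(x)
    dG = [dGamma(a, x) for a in range(2)]
    return [[[[dG[a][c][d][b] - dG[b][c][d][a]
               + sum(G[c][m][a] * G[m][d][b] - G[c][m][b] * G[m][d][a]
                     for m in range(2))
               for b in range(2)] for a in range(2)]
             for d in range(2)] for c in range(2)]

R = riemann([1.5, 0.3])   # R[c][d][a][b] = R^c_{dab} at an arbitrary point
max_abs = max(abs(R[c][d][a][b]) for c in range(2) for d in range(2)
              for a in range(2) for b in range(2))
```

Replacing `Gamma` with, say, the round-sphere connexion would give a nonzero result, which is a cheap way to explore the formula.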
From the Riemann tensor you can easily construct two additional tensors:
Rab ≡ Rc acb = ∂c Γc ab − ∂b Γc ac + Γc dc Γd ab − Γc db Γd ac (3.115)
is called the Ricci tensor, while
Sab ≡ Rc cab = − (∂a Γc cb − ∂b Γc ca ) = − (∂a Γb − ∂b Γa ) (3.116)
does not seem to have a special name. (I’ll just call it the S-tensor.) In general both R(ab)
and R[ab] are nonzero, though R[ab] vanishes in GR. Furthermore the S-tensor Sab = 0 in
GR; which is why you don’t see much discussion of it.
Note again the remarkable feature of this result: the affine connexion is not a tensor, but these particular combinations of derivatives and quadratic terms are tensors (type T20 in fact).
It is often preferable to re-organize the expression for the Ricci tensor by using
\Gamma^c{}_{ab} = \Gamma^c{}_{ba} + 2T^c{}_{ab} = \Gamma^c{}_{ba} - 2T^c{}_{ba}   (3.117)
and the more specialized
\Gamma^c{}_{ac} = \Gamma^c{}_{ca} + 2T^c{}_{ac} = \Gamma^c{}_{ca} - 2T^c{}_{ca} = \Gamma_a - 2T_a   (3.118)
to write
R_{ab} = \partial_c \Gamma^c{}_{ab} - \partial_b \Gamma_a + \Gamma_d\,\Gamma^d{}_{ab} - \Gamma^c{}_{db}\,\Gamma^d{}_{ca} + 2T_{a;b} + 2\Gamma^c{}_{db}\,T^d{}_{ca}   (3.119)
We still have not used the metric tensor — the Riemann tensor exists independent of
whether or not there is a metric tensor present; as long as there’s an affine connexion
that’s enough.
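As a concrete sanity check on the index convention used in (3.115), here is a short sketch (not part of the original notes) in Python using sympy. It computes R^a_{bcd} = ∂_c Γ^a_{bd} − ∂_d Γ^a_{bc} + Γ^a_{mc} Γ^m_{bd} − Γ^a_{md} Γ^m_{bc} for a standard example connexion, the Levi-Civita connexion of the unit 2-sphere, and recovers the familiar component R^θ_{φθφ} = sin²θ.

```python
import sympy as sp

th, ph = sp.symbols('theta phi')
x = [th, ph]
n = 2

# Christoffel symbols of the Levi-Civita connexion on the unit 2-sphere,
# metric ds^2 = dtheta^2 + sin^2(theta) dphi^2.
Gamma = [[[0]*n for _ in range(n)] for _ in range(n)]
Gamma[0][1][1] = -sp.sin(th)*sp.cos(th)   # Gamma^theta_{phi phi}
Gamma[1][0][1] = sp.cot(th)               # Gamma^phi_{theta phi}
Gamma[1][1][0] = sp.cot(th)               # Gamma^phi_{phi theta}

def riemann(a, b, c, d):
    """R^a_{bcd} = d_c Gamma^a_{bd} - d_d Gamma^a_{bc}
                 + Gamma^a_{mc} Gamma^m_{bd} - Gamma^a_{md} Gamma^m_{bc}"""
    expr = sp.diff(Gamma[a][b][d], x[c]) - sp.diff(Gamma[a][b][c], x[d])
    for m in range(n):
        expr += Gamma[a][m][c]*Gamma[m][b][d] - Gamma[a][m][d]*Gamma[m][b][c]
    return sp.simplify(expr)

print(riemann(0, 1, 0, 1))   # R^theta_{phi theta phi}; equals sin^2(theta)
```

The same function works for any connexion, symmetric or not, since no metric enters the definition; the sphere is used here only because the answer is well known.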
Exercise: Try to do these contractions explicitly, and see if you can get the indices in the right place. Check whether I have the indices in the right place. ♦
Comment: One place to find this stuff, modulo some notation changes, is on page 7 of
Eisenhart. ♦
3.8.1 Anti-symmetries
Note that the Riemann tensor is antisymmetric on its last pair of indices:
R^a{}_{b(cd)} = 0   (3.126)
Exercise: Show that for a generic asymmetric connexion the Riemann tensor has n^3(n-1)/2 algebraically independent components. ♦
(This is Schouten equation (5.2) p 144.) For typographic convenience we have introduced
the anti-symmetrization operator, A[abc] , which in obvious notation anti-symmetrizes the
indicated indices.
Schouten calls this the “second identity”. The authors of the “Encyclopedic Dictionary
of Mathematics” [EDM] refer to it as the “first Bianchi identity”, see EDM §80.J page
304, with a terrifyingly obscure notation.
That is:
R_{[cd]} = \tfrac{1}{2}\,S_{cd} + 2T^a{}_{a[c;d]} + T^a{}_{cd;a} - 2T^a{}_{ma}\,T^m{}_{cd} + 4T^n{}_{m[c}\,T^m{}_{|n|d]}   (3.134)
Rearranging:
R_{[cd]} = \tfrac{1}{2}\,S_{cd} + T^a{}_{cd;a} + 2T^a{}_{a[c;d]} + 2T^a{}_{am}\,T^m{}_{cd} + 4T^n{}_{m[c}\,T^m{}_{|n|d]}   (3.135)
Notation: The vertical bar in the previous equation indicates that the relevant index
is not to be included in the anti-symmetrization process. ♦
Now focus on the last term, noting that
T^n{}_{mc}\,T^m{}_{nd}   (3.136)
is symmetric under interchange of c and d (relabel the dummy indices m and n), so its anti-symmetrization on [cd] vanishes.
From Schouten's point of view this is merely an application of what he calls the “second identity”. Maybe we should call it the “contracted second identity”.
Alternatively, we could proceed by brute force. (Which is a useful check on this formula.)
If we look at the antisymmetric part of the Ricci tensor
R_{[ab]} = \tfrac{1}{2}\,S_{ab} + 2T_{[a;b]} + T^m{}_{ab,m} + \Gamma_d\,T^d{}_{ab} - 2\Gamma^c{}_{d[b}\,T^d{}_{a]c}   (3.139)
Converting the partial derivatives to covariant derivatives and collecting terms:
R_{[ab]} = \tfrac{1}{2}\,S_{ab} + 2T_{[a;b]} + T^m{}_{ab;m} + 2T_m\,T^m{}_{ab}   (3.144)
This agrees with the previous derivation above, and with Schouten.
Note the relative paucity of symmetries for the Riemann tensor generated by the general
affine connexion. Once you add symmetry and metricity things improve tremendously.
3.8.3 Nonmetricity
Let g_{ab} be some arbitrary symmetric type T20 tensor, that is nondegenerate in the sense that there is locally some coordinate system such that \det(g_{ab}) \neq 0. Then we can define a type T02 tensor by
g^{ab} = [g^{-1}]^{ab}   (3.145)
and use these objects gab and g ab to raise and lower indices. Though this looks rather
similar to what one might be expecting to see — the long-awaited metric tensor — it
is not. What we are emphatically not demanding at this stage is \nabla_c g_{ab} = 0; indeed we define
\nabla_c g_{ab} = q_{abc} \neq 0   (3.146)
[\nabla_a, \nabla_b]\, g_{cd} = \nabla_a q_{cdb} - \nabla_b q_{cda} = R^m{}_{cab}\, g_{md} + R^m{}_{dab}\, g_{cm} - T^m{}_{ab}\, \nabla_m g_{cd}   (3.147)
Warning: Remember that gab is at this stage merely some arbitrary symmetric nonsin-
gular T20 tensor. ♦
Exercise: Use the auxiliary tensor gab and its inverse g ab to show
Note that the LHS of this expression is explicitly independent of gab (since gab never shows
up in the definition of the S-tensor). Thus the RHS must be independent of the specific
auxiliary tensor gab that we chose to evaluate it. ♦
Exercise: Use the auxiliary tensor gab and its inverse g ab to define
Thus the S̄ tensor is defined in terms of a “trace” on the second and fourth indices of the
Riemann tensor. Using this, and the third identity, show that
Exercise: What are the integrability conditions for making the nonmetricity vanish? ♦
The Riemann tensor satisfies some important constraints in the form of differential iden-
tities. It is relatively rare to see these identities discussed for a general asymmetric
connexion.
Theorem 9 (Weitzenbock)
Schouten gives some comments on the history: in the case of a symmetric metric connexion early results were due to Voss, Ricci, Padova, and Bianchi. Schouten appears to be the first for a general symmetric connexion, and Weitzenbock for the general affine connexion. Nevertheless it now seems common to call them the Bianchi identities — perhaps
Weitzenbock identities would be more appropriate. For details see Schouten, “Ricci Cal-
culus”, p 146. (EDM calls this the “second Bianchi identity”, see §80.J page 304, with a
terrifyingly obscure notation.)
This formula involves only partial derivatives. If we replace them by covariant derivatives
we see [before anti-symmetrization]
R^\bullet{}_{\bullet cd;e} = R^\bullet{}_{\bullet cd,e} + \Gamma^\bullet{}_{\bullet e}\,R^\bullet{}_{\bullet cd} - R^\bullet{}_{\bullet cd}\,\Gamma^\bullet{}_{\bullet e} - R^\bullet{}_{\bullet c'd}\,\Gamma^{c'}{}_{ce} - R^\bullet{}_{\bullet cd'}\,\Gamma^{d'}{}_{de}   (3.160)
Now anti-symmetrize
R^\bullet{}_{\bullet[cd;e]} = R^\bullet{}_{\bullet[cd,e]} + \Gamma^\bullet{}_{\bullet[e}\,R^\bullet{}_{\bullet cd]} - R^\bullet{}_{\bullet[cd}\,\Gamma^\bullet{}_{\bullet e]} - R^\bullet{}_{\bullet c'[d}\,T^{c'}{}_{ce]} - R^\bullet{}_{\bullet[c|d'|}\,T^{d'}{}_{de]}   (3.161)
R^\bullet{}_{\bullet[cd;e]} = R^\bullet{}_{\bullet[cd,e]} + \Gamma^\bullet{}_{\bullet[c}\,R^\bullet{}_{\bullet de]} + R^\bullet{}_{\bullet[ce}\,\Gamma^\bullet{}_{\bullet d]} - 2R^\bullet{}_{\bullet m[d}\,T^m{}_{ce]}   (3.162)
But we have just seen that the first three terms on the RHS add to zero, thus
as required. QED
Note: The proof for the generic affine connexion presented above is significantly more
complicated than what is needed in the case of the symmetric connexion. (Where the
entire RHS vanishes for a start.) When dealing with the symmetric connexion judicious
use of Riemann normal coordinates, described later, greatly simplifies the proof. ♦
Since we have this identity we can also deduce related identities by contracting on the
up-down indices:
R^a{}_{a[cd;e]} = 2R^a{}_{am[e}\,T^m{}_{cd]}   (3.165)
That is
S_{[cd;e]} = 2S_{m[e}\,T^m{}_{cd]}   (3.166)
The more interesting contraction is:
That is
This is true for generic affine connexions, without needing any auxiliary arbitrary sym-
metric nonsingular T20 tensor gab .
Exercise: (Non-trivial) Let's re-introduce the arbitrary symmetric nonsingular T20 tensor g_{ab} and its covariant derivative q_{abc}. Use this tensor to contract the contracted Bianchi identities a second time. Define the “Ricci scalar”
R = g ab Rab (3.169)
and using the contracted Bianchi identity, and the S̄ tensor defined above, derive
\left[ R\,\delta^d{}_c - R^d{}_c - \bar{S}^d{}_c \right]_{;d} = \text{“an expression involving Riemann and Torsion tensors”}   (3.170)
Warning: If you look at the details of his proofs, D’Inverno assumes all his manifolds
are topologically trivial. I will work around this restriction at the cost of some additional
technical machinery. ♦
Note: “Contractible region” means that any closed loop in this region is contractible;
it can be shrunk to a point without the loop ever leaving the region of interest. Formal topologists will say that the [first] homotopy group is trivial. ♦
Note: Typically in the older literature, you might also see the word “teleparallel” [tele-
parallel; long-distance parallelism] or the phrase “absolute parallelism” used as a synonym
for integrable. Sometimes you will even see “Fernparallelismus”. ♦
Lemma 8 The connexion Γ is “integrable” iff parallel propagation around any contractible
closed path is trivial. That is
Now using
transport(y→x;γ̃1 ) = [transport(x→y;γ1 ) ]−1 (3.175)
so that the effect of reversing the path is to obtain the inverse parallel transport operator,
we have
transport(x→y;γ2 ) = transport(x→y;γ1 ) (3.176)
proving path independence [for homotopically equivalent paths]. Consequently paral-
lel transport is path independent on contractible regions, and so the connexion is inte-
grable. QED
Now we want to relate integrability to the Riemann tensor:
Proof: Suppose the connexion is integrable. Pick an arbitrary point p with coordinates
x and some vector X a at that point. Now extend this vector to a vector field defined on
a topologically trivial [in particular, contractible] open region surrounding p by parallel
propagation (which by assumption of integrability is supposed to define a unique vector
X^a(y) at all y ≠ x).² Then along any arbitrary curve in the topologically trivial region
of interest
transport[y → x; γ]X(y) = X(x) (3.177)
² Aside: There is actually no need to extend X^a(y) to the entire manifold. If this could be done, then X^a(y) would now be an everywhere nonzero vector field; but there are topologies, e.g. S², for which you know such things do not exist. Therefore there are topological manifolds for which you are forced to work on topologically trivial regions. D'Inverno quietly ignores all such complications. ♦
Exercise: Prove these two lemmata; you can do this by adding a few technical details
to the preceding proof. ♦
Proving the converse of this theorem is technically more difficult.
Theorem 11 If the Riemann tensor is zero everywhere on the manifold then the connex-
ion is integrable.
The “geometric route” is traditional, and I will deal with that first. The “PDE route”
is actually technically easier, but requires a little extra machinery. It has the advantage
of providing you with techniques of somewhat wider applicability. I’ll deal with the PDE
route in the next section.
Proof: (Geometric route) We want to show that the result of parallel propagation
around any [topologically trivial] closed loop is the identity. It is sufficient to prove this
for infinitesimal loops, since any finite loop can be subdivided into many smaller loops.
(More precisely, you want an analytic estimate of the possible deviation from triviality as
a function of loop size; see below.)
Warning: The proof in the presence of torsion is somewhat messier than the case with-
out torsion. The cleanest “geometric” version I have found is on pp 23-24 of Eisenhart. ♦
Consider a small rectangle of points in a plane parameterized by (u, v). Let the points
in question be p(u, v), q(u + du, v), r(u + du, v + dv), and s(u, v + dv). Along each
line segment [not necessarily a geodesic] we parallel propagate the vector V a using the
equations
\frac{dV^a}{d\lambda} + \Gamma^a{}_{bc}\, V^b\, \frac{dx^c}{d\lambda} = 0.   (3.183)
Then along each side of the “rectangle” we have
V^a(q) = V^a(p) + \left.\frac{dV^a}{du}\right|_p du + \frac{1}{2}\left.\frac{d^2V^a}{du^2}\right|_p du^2 + O(du^3)   (3.184)
V^a(r) = V^a(q) + \left.\frac{dV^a}{dv}\right|_q dv + \frac{1}{2}\left.\frac{d^2V^a}{dv^2}\right|_q dv^2 + O(dv^3)   (3.185)
V^a(s) = V^a(r) - \left.\frac{dV^a}{du}\right|_r du + \frac{1}{2}\left.\frac{d^2V^a}{du^2}\right|_r du^2 + O(du^3)   (3.186)
V^a_{\rm return}(p) = V^a(s) - \left.\frac{dV^a}{dv}\right|_s dv + \frac{1}{2}\left.\frac{d^2V^a}{dv^2}\right|_s dv^2 + O(dv^3)   (3.187)
Now add these equations:
V^a_{\rm return}(p) = V^a(p) + \left[\left.\frac{dV^a}{du}\right|_p - \left.\frac{dV^a}{du}\right|_r\right] du + \left[\left.\frac{dV^a}{dv}\right|_q - \left.\frac{dV^a}{dv}\right|_s\right] dv   (3.188)
+ \frac{1}{2}\left[\left.\frac{d^2V^a}{du^2}\right|_p + \left.\frac{d^2V^a}{du^2}\right|_r\right] du^2 + \frac{1}{2}\left[\left.\frac{d^2V^a}{dv^2}\right|_q + \left.\frac{d^2V^a}{dv^2}\right|_s\right] dv^2 + O(du^3) + O(dv^3)   (3.189)
Now note that generically, for any quantity X(u, v)
X(r) = X(p) + \left.\frac{\partial X}{\partial u}\right|_p du + \left.\frac{\partial X}{\partial v}\right|_p dv + O(d\lambda^2)   (3.190)
X(q) = X(p) + \left.\frac{\partial X}{\partial u}\right|_p du + O(du^2)   (3.191)
X(s) = X(p) + \left.\frac{\partial X}{\partial v}\right|_p dv + O(dv^2)   (3.192)
So to the required accuracy
\Delta V^a \equiv V^a_{\rm return}(p) - V^a(p) = -\left[\frac{\partial}{\partial v}\frac{dV^a}{du} - \frac{\partial}{\partial u}\frac{dV^a}{dv}\right]_p du\, dv + O(d\lambda^3)   (3.193)
To see this take a finite “square” closed loop of size L × L (rectangular in the particular coordinate chart adopted) and subdivide it into N × N smaller squares. Then simply from the definition of path composition
{\rm transport}_{(L\times L;\gamma)} = \prod_{\alpha=1}^{N\times N} {\rm transport}_{(L/N\times L/N;\gamma_\alpha)}   (3.198)
But from the analytic estimate above, for each one of the smaller squares we have
{\rm transport}_{(L/N\times L/N;\gamma_\alpha)} = \exp(O[(L/N)^3])   (3.199)
That is:
{\rm transport}_{(L\times L;\gamma)} = \exp(N\times N\times O[(L/N)^3]) = \exp(O[1/N]) \to I.   (3.200)
So far this works for “square” paths; arbitrarily shaped (but topologically trivial) paths
may now be handled by approximating them with finer and finer collections of square
paths — these are standard techniques usually developed in the theory of surface inte-
gration. QED
Exercise: Verify that if the Riemann tensor is nonzero, then our analytic estimate im-
plies that parallel propagation around “square” paths will generally lead to a nontrivial
result. ♦
Comment: The argument simplifies somewhat if the torsion is zero. See, for instance
D’Inverno. Note that even for zero torsion D’Inverno does not really complete the proof,
he stops at the stage
∆V = O(dλ3 ) (3.201)
Also note that he implicitly assumes everything is topologically trivial. ♦
Definition 46 Affine flat: A connexion is “affine flat” if there exists an atlas in which
Γa bc = 0 in all charts.
Warning: The notion “affine flat” is most useful for symmetric connexions. For general
asymmetric affine connexions there are flat connexions (integrable, zero curvature) which
are not affine flat. ♦
Comment: Note that if an affine flat connexion exists then this places a topological constraint on the tangent bundle
T(M) = M \times {\rm IR}^n   (3.202)
♦
Lemma 11 If a manifold is affine flat then the torsion vanishes identically (the connexion
is symmetric).
[Trivial]
Lemma 13 If a manifold is affine flat then the Riemann tensor vanishes identically.
[Trivial]
Lemma 14 If the connexion is symmetric (torsion zero) and integrable, then the connex-
ion is affine flat.
[Easy; I think]
Theorem 12
For a general affine connexion the Riemann tensor is zero iff there exists a coordinate
system such that
\Gamma^a{}_{bc} = \partial_c D^a{}_b   (3.203)
where D^a{}_b is a diagonal matrix.
Proof: Sufficiency is easy; just plug it into the definition of Riemann. The ∂Γ terms cancel because partial derivatives commute, and the Γ Γ terms cancel because D is diagonal in the first two indices.
On the other hand, establishing necessity takes several pages of rather turgid analysis
of PDEs in Eisenhart [pp 14-22]. There is a nicer version of the proof using n-beins that
I will discuss later, after I’ve introduced n-beins.
In contrast, an easy result is that if the Riemann tensor is zero then in any coordinate
patch
\Gamma^a{}_{ac} = \partial_c \Theta   (3.207)
One way to see this limited result is from the parallel transport operator — if the Riemann
tensor is zero then the transport operator is integrable, and so for any closed loop it is
the identity operator, whose determinant is unity. But then by our general result for the
determinant of the transport operator we have
\oint \Gamma^a{}_{ac}\, dx^c = 0   (3.208)
for arbitrary closed loops. We can now apply Stokes’ theorem [not that we have proved
it yet].
A more prosaic way proceeds from Sab which equals zero simply because the Riemann
tensor is zero, but that implies (as an algebraic identity) that
\partial_a \Gamma^m{}_{mb} - \partial_b \Gamma^m{}_{ma} = 0   (3.209)
so that, locally,
\Gamma^m{}_{ma} = \partial_a \Theta   (3.210)
We have defined the word “integrable” in terms of the path independence of the parallel
transport operator. Now there is another more basic notion of integrability you can use
— that of the integrability of a system of PDEs. The ideas are of course related.
• The condition for the existence of a covariantly constant vector field, namely
\frac{\partial V^a}{\partial x^b} + \Gamma^a{}_{mb}\, V^m = 0   (3.211)
is a particular case of a Frobenius–Mayer system of PDEs.
• The condition required in order to apply the Frobenius complete integrability the-
orem, and thereby guarantee that this system of PDEs actually has a solution, is
exactly the vanishing of the Riemann tensor.
For those of you who did Math 301 last year, here’s a gentle reminder.
One special system of PDEs that is very important is the Frobenius or Mayer system
\frac{\partial U^A}{\partial x^i} = F^A{}_i(x^1, \ldots, x^n, U^1, \ldots, U^m) \qquad (F)   (3.212)
A = 1, 2, \ldots, m; \qquad i = 1, 2, \ldots, n   (3.213)
where the m functions {U A } depend on the n independent variables {xi }.
In such a system there are as many PDEs as there are first-order derivatives of the
dependent functions (i.e., nm of them)
Notes:
• The “A” superscripts tell you which of the U ’s you are dealing with; not the order
of the derivative.
• Just because it’s important does not mean it’s easy to find any discussion of this
system.
• You can find a discussion in Volume 1 of Spivak, chapter 6. See especially pages
254–257.
(The notation is slightly different).
• You can find a discussion in Volume 5 of Forsyth, chapter 4. See especially pages
100 ff.
(The notation is, unfortunately, seriously archaic).
References:
• Courant R., and D. Hilbert, Methods of Mathematical Physics Vols 1 and 2, Inter-
science 1966.
Theorem 13 Suppose the functions F^A{}_i are smooth functions of all their variables in a neighbourhood of the origin, for A = 1, 2, \ldots, m and i = 1, 2, \ldots, n.
Then the Frobenius system (F) has a unique solution satisfying the IC
U^A(0, 0, \ldots, 0) = b^A \qquad (A = 1, 2, \ldots, m)   (3.214)
\frac{\partial^2 U^A}{\partial x^i\,\partial x^j} = \frac{\partial^2 U^A}{\partial x^j\,\partial x^i}   (3.216)
• You can find a proof in Volume 1 of Spivak, chapter 6, pages 254–257. Note that
Spivak’s notation is slightly different.
• You can get a feel for how important the Frobenius integrability theorem is from
Spivak’s comment:
The Frobenius theorem (which represents everything we know about partial
differential equations) was used in [long list of topics].
(See Spivak, volume 5, page 1). This should be balanced against his further com-
ment:
Now it’s really rather laughable to call these things partial differential equa-
tions at all. True [...] partial differential equations are involved, but we do
not posit any relationship between different partial derivatives; this comes out
quite clearly in the proof [of the integrability theorem] where the equations
are reduced to ordinary differential equations.
• You can also find a statement of the theorem (not a proof) in Eisenhart, “Non-
Riemannian geometry”, (AMS, 1927) p 14.
• There’s a very brief statement of the result in the “Encyclopedic Dictionary of
Mathematics”, see EDM page 1775.
• It is useful to rewrite condition (C) in the equivalent form
\frac{\partial F^A{}_i}{\partial x^j} - \frac{\partial F^A{}_j}{\partial x^i} + \sum_{B=1}^{m} \left[ \frac{\partial F^A{}_i}{\partial U^B}\, F^B{}_j - \frac{\partial F^A{}_j}{\partial U^B}\, F^B{}_i \right] = 0. \qquad (C')   (3.217)
Alternatively
F^A{}_{[i,j]} + \sum_{B=1}^{m} \partial_B F^A{}_{[i}\, F^B{}_{j]} = 0. \qquad (C'')   (3.218)
Of course this had to be the case, but it’s still nice to see explicitly.
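To make the “reduced to ordinary differential equations” remark concrete, here is a toy sketch (my own illustration, not from the notes): a Frobenius–Mayer system with m = 1, n = 2, for which the compatibility condition (C′) holds identically, integrated one coordinate direction at a time with a standard Runge–Kutta marcher.

```python
import math

# Toy Frobenius--Mayer system with m = 1, n = 2:
#   dU/dx = F_x(x, y, U) = U,    dU/dy = F_y(x, y, U) = 2U.
# Compatibility (C'):  (dF_x/dU) F_y - (dF_y/dU) F_x = 1*(2U) - 2*(U) = 0,
# so the system is completely integrable; exact solution U = b * exp(x + 2y).

def rk4(f, u, t0, t1, steps=1000):
    """Integrate du/dt = f(t, u) from t0 to t1 (classical Runge-Kutta)."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = f(t, u)
        k2 = f(t + h/2, u + h*k1/2)
        k3 = f(t + h/2, u + h*k2/2)
        k4 = f(t + h, u + h*k3)
        u += (h/6) * (k1 + 2*k2 + 2*k3 + k4)
        t += h
    return u

def solve(b, x, y):
    """March along the x-axis first, then parallel to the y-axis."""
    u = rk4(lambda t, u: u,   b, 0.0, x)   # dU/dx = U   at y = 0
    u = rk4(lambda t, u: 2*u, u, 0.0, y)   # dU/dy = 2U  at fixed x
    return u

b, x, y = 1.5, 0.7, 0.3
# Because (C') holds, integrating in the opposite order gives the same answer.
u_alt = rk4(lambda t, u: 2*u, b, 0.0, y)
u_alt = rk4(lambda t, u: u, u_alt, 0.0, x)
print(solve(b, x, y), u_alt, b * math.exp(x + 2*y))
```

The three printed numbers agree to Runge–Kutta accuracy; it is exactly the path independence guaranteed by (C′) that makes the coordinate-by-coordinate marching well defined.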
But now the Frobenius integrability theorem actually tells us more: If the Riemann
tensor is zero then there is a unique covariantly constant vector field for each possible
choice of V a (p). Since the possible choices of V a (p) span the tangent space Tp at p we
deduce the existence of n = dim(M) linearly independent covariantly constant vector
fields, let’s call them E a A (x).
That the E a A (x) are linearly independent at the point p follows by definition. That
they are then linearly independent at all other points x on the manifold can be established
by contradiction.
Exercise: Suppose that at some q 6= p the E a A (x) are linearly dependent. By using the
PDE of parallel transport demonstrate that they will then also be linearly dependent at
p, contrary to hypothesis. ♦
Warning: We do not assert that the E a A (x) are globally defined on the entire man-
ifold; all that the Frobenius integrability theorem guarantees is that they will exist on
some open neighborhood of the point p. ♦
If we combine this with its converse (one of the previous lemmata, already set as an
exercise) we have:
Lemma 15 The Riemann tensor is zero iff (locally) there exist n = dim(M) linearly
independent covariantly constant vector fields.
Furthermore, this tells us a lot about the parallel transport operator. Let us start
with some vector X a (p) and parallel transport it along some fixed but arbitrary curve
γ. Because we have these n = dim(M) linearly independent covariantly constant vector
fields, we know that at any arbitrary point on the curve we have
\frac{dc^A}{d\lambda} = 0   (3.227)
so the cA are constants along any specified curve γ. But this means that for any initial
choice of vector at p,
X a (p) = cA (p) E a A (x) (3.228)
we see that the result of any curve that goes from p (coordinates xa ) to q (coordinates y a ),
while always remaining in the open region on which we are guaranteed that the E a A (x)
are defined, yields the same parallel-transported vector
That is: parallel transport is path independent on the open region where the E a A (x) are
defined. This is enough to imply that the connexion is locally integrable. We can even
give an explicit formula for the path transport operator. Since the E^a{}_A(y) are linearly independent the n × n matrix E^a{}_A(y) is nonsingular. It has a nonzero determinant and an inverse that we will write as [E^{-1}]_a{}^A(y). Then
We therefore (using the converse lemma that was one of the previous exercises) have:
Lemma 16 The connexion is integrable iff on topologically trivial regions there exist n =
dim(M) linearly independent covariantly constant vector fields.
Note that we have now proved this lemma without the agony of index gymnastics
involved in propagating around an infinitesimal loop. The price paid is a minor excursion
into the theory of PDEs, but it is well worthwhile.
3.12 n-beins
The word is most common among physicists; it is a corruption of the German [“Bein” = “leg”], based on the dimension-dependent zwei-bein, drei-bein, vier-bein, funf-bein, etc.
Depending on dimensionality, you will also see words like “triad” or “tetrad”. Sometimes
(in old mathematical literature) you might run across the word “ennuple” [as in “n-tuple”
→ “en-tuple” → “ennuple”]. Mathematicians (modern abstract mathematicians) will also
sometimes use the word “frame” to express the same idea.
Note: The index a is an ordinary tangent space index, whereas the index A is simply a
label indicating which of the vectors we are dealing with. The a index transforms in the
usual way under a change of charts, the A index is simply unaffected. ♦
Because we assert that the vectors in the n-bein are linearly independent we know
det(e^a{}_A) ≠ 0. Therefore this matrix has an inverse, which we will denote by e_a{}^A. [Note that index placement is important here.] By definition of this object as an inverse matrix
e^a{}_A\, e_a{}^B = \delta^B{}_A; \qquad e^a{}_A\, e_b{}^A = \delta^a{}_b;   (3.231)
Note that the first of these Kronecker deltas is not a tensor; it simply lives in “label space” where it labels the legs of the n-bein. The second Kronecker delta is a tensor; as always it is a chart-independent tensor. (It transforms to itself under a change of coordinates.)
Note: Still no metric tensor yet, these n-beins are more primitive [more fundamental]
than those typically arising in GR. ♦
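A trivial numerical illustration of the two Kronecker-delta relations (3.231), with a hypothetical randomly chosen n-bein (numpy; my own example, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# A generic n-bein at a point: n linearly independent vectors e^a_A, stored
# as the columns of an n x n matrix (row = tangent index a, column = label
# index A).  A random Gaussian matrix is invertible with probability one.
e = rng.standard_normal((n, n))
e_inv = np.linalg.inv(e)          # components e_a^A of the inverse bein

# e^a_A e_b^A = delta^a_b  (the chart-independent tensor) ...
assert np.allclose(e @ e_inv, np.eye(n))
# ... and e^a_A e_a^B = delta^B_A  (living purely in "label space").
assert np.allclose(e_inv @ e, np.eye(n))
print("both delta relations hold")
```

Both relations are just the statement that a matrix commutes with its inverse; the conceptual content is in which index is a tangent-space index and which is a mere label.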
Though simple in concept, this idea is remarkably powerful. Suppose we have a field of n-beins defined on the manifold. Make sure it's suitably differentiable. Then consider the derivatives
\nabla_a e^b{}_B   (3.232)
and use them to define
\gamma^A{}_{BC} = e_b{}^A\, (\nabla_a e^b{}_B)\, e^a{}_C   (3.233)
Note that these \gamma^A{}_{BC} are carefully cooked up to be chart independent. They are often called the “invariants” of the connexion; sometimes you will see people muttering about “going to an anholonomic basis”. They are also related to something called the “spin connexion”, for reasons I won't go into right now. It should not surprise you to realise that [given the n-bein field e^b{}_B] they encode the same information as the \Gamma^a{}_{bc}. Indeed if we substitute the explicit form of the covariant derivative, and contract with appropriate n-beins, we get
\Gamma^a{}_{bc} = -e_b{}^A\, \frac{\partial e^a{}_A}{\partial x^c} + \gamma^A{}_{BC}\, e^a{}_A\, e_b{}^B\, e_c{}^C   (3.234)
Theorem 14 The Riemann tensor is zero iff there exists an n-bein field such that
\Gamma^a{}_{bc} = -e_b{}^A\, \frac{\partial e^a{}_A}{\partial x^c}   (3.235)
Proof: Suppose the Riemann tensor is zero; then the connexion is locally integrable.
Pick any n linearly independent vectors at some fixed but arbitrary point x and use the
parallel transport operator to extend them to a locally defined n-bein field. (We know
this can be done because the connexion is integrable.) But this n-bein field also satisfies
the differential form of the parallel transport equations so we have
\nabla_a e^b{}_B = 0   (3.236)
\gamma^A{}_{BC} = 0   (3.237)
whence
\Gamma^a{}_{bc} = -e_b{}^A\, \frac{\partial e^a{}_A}{\partial x^c} = +\frac{\partial e_b{}^A}{\partial x^c}\, e^a{}_A   (3.238)
as claimed.
Conversely, suppose the connexion is of this form. Insert into the definition of Riemann
to see Riemann = 0. QED
Corollary 1 If the Riemann tensor is zero then there exists an n-bein field defined on
topologically trivial regions such that the parallel transport operator can be explicitly cal-
culated to be
{\rm transport}[y \to x; \gamma]^a{}_b = e^a{}_A(x)\, e_b{}^A(y)   (3.239)
Corollary 2 Once we have constructed a local n-bein field in this way, we can adopt
coordinate charts so that ea A (x) is diagonal.
To do this just pick a point x and choose the coordinate axes to lie along the n vectors
~eA . This can now be extended away from the point x by integrability.
Corollary 3 Once we have constructed diagonalizing coordinate charts for the global n-
bein field, we define
D^a{}_b(x) = -\ln\left[ e^a{}_{A\to b}(x) \right]   (3.240)
which makes sense because A is now [by construction] doing double duty as both a label
specifying “which vector” and as a coordinate index. Then in these diagonalizing charts
\Gamma^a{}_{bc} = \partial_c D^a{}_b   (3.241)
Theorem 15 In any manifold there exists a [non unique] asymmetric affine connexion
(now often called the Weitzenbock connexion) such that the Riemann tensor of that con-
nexion vanishes.
Note: This does not mean all manifolds have trivial curvature. It means I can cook
up a suitably perverse connexion to make the curvature of that connexion zero, but this
Weitzenbock connexion may not be the mathematically or physically interesting one. ♦
Proof: Pick any differentiable n-bein field ea A . That is, at each point of the manifold
pick an n-bein and make sure this is done in a differentiable manner.
We are not [at this stage] demanding the n-bein field be covariantly constant.
We are also not asserting that this can be done globally. After all some manifolds have
no everywhere nonzero vector fields, let alone everywhere nonsingular n-beins. But we
can certainly do this locally on topologically trivial open regions.
Now construct
[\Gamma_{\rm Weitzenbock}]^a{}_{bc} = -e_b{}^A\, \frac{\partial e^a{}_A}{\partial x^c}   (3.242)
By our previous arguments
[R_{\rm Weitzenbock}]^a{}_{bcd} = 0   (3.243)
and in fact the n-bein field ea A is covariantly constant in this Weitzenbock connexion, but
not generally in any other connexion. QED
Note that the Weitzenbock connexion is most useful on a topologically trivial manifold.
Note that the torsion is definitely nonzero for the Weitzenbock connexion.
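The whole construction can be checked mechanically. The sketch below (sympy; the 2-bein field is a hypothetical example of my own choosing) builds the Weitzenbock connexion (3.242) from a bein field, verifies that its Riemann tensor vanishes identically, and verifies that its torsion does not.

```python
import sympy as sp

x, y = sp.symbols('x y')
coords = [x, y]
n = 2

# A hypothetical smooth 2-bein field e^a_A (rows = tangent index a,
# columns = label index A); det = exp(x) is nowhere zero.
E = sp.Matrix([[1, y], [0, sp.exp(x)]])
M = E.inv()   # inverse bein e_b^A, indexed as M[A, b]

# Weitzenbock connexion (3.242):  Gamma^a_{bc} = -e_b^A  d_c e^a_A
Gamma = [[[sp.simplify(-sum(sp.diff(E[a, A], coords[c]) * M[A, b]
                            for A in range(n)))
           for c in range(n)] for b in range(n)] for a in range(n)]

def riemann(a, b, c, d):
    # Same index convention as the Ricci contraction (3.115).
    expr = sp.diff(Gamma[a][b][d], coords[c]) - sp.diff(Gamma[a][b][c], coords[d])
    for m in range(n):
        expr += Gamma[a][m][c]*Gamma[m][b][d] - Gamma[a][m][d]*Gamma[m][b][c]
    return sp.simplify(expr)

# Curvature vanishes identically ...
assert all(riemann(a, b, c, d) == 0
           for a in range(n) for b in range(n)
           for c in range(n) for d in range(n))
# ... but the torsion (the antisymmetric part Gamma^a_{[bc]}) does not.
assert any(sp.simplify(Gamma[a][b][c] - Gamma[a][c][b]) != 0
           for a in range(n) for b in range(n) for c in range(n))
print("flat but torsionful, as advertised")
```

Any other differentiable choice of bein field gives a different, equally flat, Weitzenbock connexion, which is the non-uniqueness asserted in Theorem 15.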
We know that if we have two different connexions Γ1 and Γ2 on the same manifold, then
their difference is a tensor
[Γ2 ]a bc = [Γ1 ]a bc + X a bc (3.244)
but this means we could calculate the Riemann tensor for Γ1 and Γ2 separately and
compare them:
[R2 ]a bcd = [R1 ]a bcd + X a bcd (3.245)
The difference between the two Riemann tensors will itself be a tensor. Indeed
and
[R2 ]• •cd = [R1 ]• •cd + X • •cd (3.248)
Then
But
X • •[c;d] = X • •[c,d] + [Γ1 ]• •d X • •c − X • •c [Γ1 ]• •d − [Γ1 ]m cd X • •m . (3.252)
Consequently
Exercise: Calculate [S2 ]ab − [S1 ]ab . Bullet notation may be helpful, though it’s a simple
calculation either way. ♦
Exercise: Calculate [R2 ]ab − [R1 ]ab . Bullet notation is unlikely to be helpful. Why? ♦
Question: Suppose two connexions lead to the same notion of parallelism, does this
mean the connexions are equal? ♦
Question: Suppose we have two affine connexions on the manifold that lead to the
same notion of parallelism, can we usefully characterize one in terms of the other? ♦
Recall that a specific vector field is said to be parallel to itself [auto-parallel] iff
∇a V b = f a V b ; ⇐⇒ ∇a V [b V c] = 0. (3.254)
If we have two different connexions, and want the same vector field to be auto-parallel
with respect to both of them then
{}^{(1)}\nabla_a V^b = {}^{(1)}f_a\, V^b; \qquad {}^{(2)}\nabla_a V^b = {}^{(2)}f_a\, V^b.   (3.255)
Therefore
\left( {}^{(2)}\nabla_a - {}^{(1)}\nabla_a \right) V^b = \left( {}^{(2)}f_a - {}^{(1)}f_a \right) V^b   (3.256)
But
\left( {}^{(2)}\nabla_a - {}^{(1)}\nabla_a \right) V^b = \left( {}^{(2)}\Gamma^c{}_{ma} - {}^{(1)}\Gamma^c{}_{ma} \right) V^m = X^c{}_{ma}\, V^m   (3.257)
where the quantity X^c{}_{ma}, being the difference between two connexions, is a T21 tensor. So we require
X^c{}_{ma}\, V^m = f_a\, V^c = f_a\, \delta^c{}_m\, V^m   (3.258)
But that implies (since we can take the components of V^m at any fixed but arbitrary point to be arbitrary)
X^a{}_{bc} = \delta^a{}_b\, f_c   (3.259)
That is:
Theorem 16 Two affine connexions define the same notion of parallelism iff there exists
a covector fc such that
{}^{(2)}\Gamma^a{}_{bc} = {}^{(1)}\Gamma^a{}_{bc} + \delta^a{}_b\, f_c   (3.260)
Notation: Two connexions related in this manner are said to differ from each other by
a “projective transformation”. You can also call them “projectively equivalent”. ♦
We can now ask how this impacts on the Riemann, Ricci, S, and torsion tensors.
{}^{(2)}R^a{}_{bcd} = {}^{(1)}R^a{}_{bcd} + \delta^a{}_b\, f_{[c,d]}   (3.261)
{}^{(2)}R_{ab} = {}^{(1)}R_{ab} + f_{[a,b]}   (3.262)
{}^{(2)}S_{ab} = {}^{(1)}S_{ab} + n\, f_{[a,b]}   (3.263)
{}^{(2)}T^a{}_{bc} = {}^{(1)}T^a{}_{bc} + \delta^a{}_{[b}\, f_{c]}   (3.264)
{}^{(2)}T^a{}_{ac} = {}^{(1)}T^a{}_{ac} + \frac{n-1}{2}\, f_c   (3.265)
This implies that
{}^{(2)}R^a{}_{bcd} - \frac{1}{n}\,\delta^a{}_b\, {}^{(2)}S_{cd} = {}^{(1)}R^a{}_{bcd} - \frac{1}{n}\,\delta^a{}_b\, {}^{(1)}S_{cd}   (3.266)
{}^{(2)}R_{(ab)} = {}^{(1)}R_{(ab)}   (3.267)
{}^{(2)}S_{ab} - n\, {}^{(2)}R_{ab} = {}^{(1)}S_{ab} - n\, {}^{(1)}R_{ab}   (3.268)
{}^{(2)}T^a{}_{bc} - \frac{2}{n-1}\,\delta^a{}_{[b}\, {}^{(2)}T_{c]} = {}^{(1)}T^a{}_{bc} - \frac{2}{n-1}\,\delta^a{}_{[b}\, {}^{(1)}T_{c]}   (3.269)
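To see where the first of these invariants comes from, it is worth writing out one cancellation explicitly: subtract (1/n) δ^a_b times (3.263) from (3.261), and the f_{[c,d]} terms drop out.

```latex
{}^{(2)}R^{a}{}_{bcd}-\frac{1}{n}\,\delta^{a}{}_{b}\,{}^{(2)}S_{cd}
  = {}^{(1)}R^{a}{}_{bcd}+\delta^{a}{}_{b}\,f_{[c,d]}
    -\frac{1}{n}\,\delta^{a}{}_{b}\left({}^{(1)}S_{cd}+n\,f_{[c,d]}\right)
  = {}^{(1)}R^{a}{}_{bcd}-\frac{1}{n}\,\delta^{a}{}_{b}\,{}^{(1)}S_{cd}
```

which is (3.266); the remaining invariants (3.267)–(3.269) follow just as mechanically from (3.262)–(3.265).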
This is a forerunner of a class of results that are typically very useful: one looks for a restricted class of distortions of the affine connexion, asks how that impacts the Riemann and torsion tensors, and then looks for invariants of the distortion process.
Comment: If we define
\tilde{T}^a{}_{bc} = T^a{}_{bc} - \frac{2}{n-1}\,\delta^a{}_{[b}\, T_{c]}   (3.270)
then
\tilde{T}^a{}_{ac} = 0   (3.271)
and preserving parallelism implies
{}^{(2)}\tilde{T}^a{}_{bc} = {}^{(1)}\tilde{T}^a{}_{bc}.   (3.272)
Theorem 17 We can choose f_a in such a way as to make {}^{(2)}R^a{}_{bcd} = 0 iff
{}^{(1)}R^a{}_{bcd} - \frac{1}{n}\,\delta^a{}_b\, {}^{(1)}S_{cd} = 0   (3.273)
We also have:
Theorem 18 We can always choose f_a in such a way as to make {}^{(2)}S_{ab} = 0.
Proof: Note that {}^{(1)}S_{ab} = {}^{(1)}\Gamma_{[a,b]} and choose f_c so that
f_{[c,d]} = -\frac{1}{n}\, {}^{(1)}\Gamma_{[c,d]}   (3.277)
This can always be done (at least locally). QED
That is, without modifying the parallels (or the geodesics) we can always “get rid of” the
S-tensor.
Preserving the geodesics is less difficult than preserving the notion of parallelism. Re-
member that we have already seen that an arbitrary torsion, while it leaves the geodesics
untouched, will modify notions of parallelism. From the above we have
Theorem 19 Two affine connexions define the same geodesics iff there exists a covector f_c, and an arbitrary T21 tensor Z^a{}_{bc} skew symmetric on its covariant indices, such that
{}^{(2)}\Gamma^a{}_{bc} = {}^{(1)}\Gamma^a{}_{bc} + \delta^a{}_b\, f_c + Z^a{}_{bc}   (3.278)
Corollary 4 By writing
\delta^a{}_b\, f_c = \delta^a{}_{(b}\, f_{c)} + \delta^a{}_{[b}\, f_{c]}   (3.279)
we can rephrase this condition in terms of the symmetric part of the connexion and the torsion
{}^{(2)}\Gamma^a{}_{(bc)} = {}^{(1)}\Gamma^a{}_{(bc)} + \delta^a{}_{(b}\, f_{c)}   (3.280)
{}^{(2)}T^a{}_{bc} = {}^{(1)}T^a{}_{bc} + \delta^a{}_{[b}\, f_{c]} + Z^a{}_{bc}   (3.281)
By absorbing the antisymmetric part into a redefined Z^a{}_{bc} we also have
{}^{(2)}T^a{}_{bc} = {}^{(1)}T^a{}_{bc} + \tilde{Z}^a{}_{bc}   (3.282)
Corollary 5 Two affine connexions define the same geodesics iff there exists a covector f_c, and an arbitrary T21 tensor \tilde{Z}^a{}_{bc} skew symmetric on its covariant indices, such that
{}^{(2)}\Gamma^a{}_{bc} = {}^{(1)}\Gamma^a{}_{bc} + \delta^a{}_{(b}\, f_{c)} + \tilde{Z}^a{}_{bc}   (3.283)
Exercise: What effects do changes of this type have on the Riemann, Ricci, and S
tensors? ♦
Exercise: What happens if {}^{(2)}\Gamma^a{}_{bc} and {}^{(1)}\Gamma^a{}_{bc} have both the same geodesics, and the same affine parameter? ♦
Exercise: What happens if {}^{(2)}\Gamma^a{}_{bc} and {}^{(1)}\Gamma^a{}_{bc} both define the same notion of parallelism, and when considering geodesics they have the same affine parameter? ♦
If you have a distinguished tensor gab and its inverse g^ab, together with the non-metricity qabc, you can use this to obtain a canonical decomposition for a general affine connexion. Start from the definition of the non-metricity,

    q_abc = ∇_c g_ab,    (3.284)

and rewrite it as

    q_abc = ∂_c g_ab − Γ_bac − Γ_abc = g_ab,c − Γ_bac − Γ_abc    (3.285)
Now consider the craftily chosen linear combination defined by

    {ab; c} = (1/2) {g_ca,b + g_cb,a − g_ab,c}    (3.286)

This combination of partial derivatives of any symmetric T20 tensor is called a “Christoffel symbol of the first kind”.
We compute

    {ab; c} = (1/2) {q_cab + q_cba − q_abc} + (1/2) {Γ_acb + Γ_cab + Γ_bca + Γ_cba − Γ_bac − Γ_abc}    (3.287)

That is

    {ab; c} = (1/2) {q_cab + q_cba − q_abc} − {T_abc + T_bac} + (1/2) {Γ_cab + Γ_cba}    (3.288)

That is

    {ab; c} = (1/2) {q_cab + q_cba − q_abc} − {T_abc + T_bac − T_cba} + Γ_cab.    (3.289)
Comment: Note what has happened here: although gab is not “the” metric, merely some random non-degenerate symmetric tensor, we have nevertheless been able to decompose an arbitrary asymmetric affine connexion in terms of partial derivatives of the gab, the tensor of non-metricity qabc, and the torsion T^a_bc. Note also that both first and second Christoffel symbols make good sense for arbitrary non-degenerate symmetric tensors; they do not intrinsically have anything to do with the metric tensor. ♦
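The comment above is easy to check with a computer algebra system. The following sketch (my addition, not from the notes; the tensor g used here is invented purely for illustration) computes the Christoffel symbols of the first kind (3.286) for an arbitrary symmetric T20 tensor and confirms the automatic symmetry {ab; c} = {ba; c}.

```python
# Sketch: Christoffel symbols of the first kind for an arbitrary
# non-degenerate symmetric T20 tensor (no metric interpretation needed).
import sympy as sp

x, y = sp.symbols('x y')
coords = [x, y]
# Any symmetric non-degenerate tensor will do; this one is invented.
g = sp.Matrix([[sp.exp(x), x*y],
               [x*y, 1 + y**2]])

def first_kind(g, coords):
    # {ab; c} = (1/2){g_ca,b + g_cb,a - g_ab,c}
    n = len(coords)
    return [[[sp.simplify((sp.diff(g[c, a], coords[b])
                           + sp.diff(g[c, b], coords[a])
                           - sp.diff(g[a, b], coords[c])) / 2)
              for c in range(n)] for b in range(n)] for a in range(n)]

C1 = first_kind(g, coords)

# {ab; c} is automatically symmetric in its first two indices.
sym_ok = all(sp.simplify(C1[a][b][c] - C1[b][a][c]) == 0
             for a in range(2) for b in range(2) for c in range(2))
```

Note that nothing in the computation knows (or cares) whether g measures distances.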
Theorem 20
At any point p we can introduce locally geodesic coordinates such that at that point:

    Γ^a_(bc) |p = 0.    (3.294)
This is very different from saying that the symmetric quantity Γa (bc) vanishes throughout
the manifold. Note that the transformation law for connexions is inhomogeneous (affine)
— this is the only reason this result has even the slightest hope of being true.
Exercise:
Show that in a locally geodesic coordinate system x^a the curves

    x^a(λ) = t^a λ    (3.305)

with t^a being any set of constant coefficients, are all geodesics at λ = 0 [this point corresponding to x^a = 0]. Furthermore the parameterization of these geodesics is affine.
(The curves need not be geodesic once λ ≠ 0; hence the phrase “locally geodesic”.) ♦
Theorem 21
In any locally geodesic coordinate system we have

    R^a_bcd |p = Γ^a_bd,c − Γ^a_bc,d + T^a_mc T^m_bd − T^a_md T^m_bc    (3.306)
              = −2 Γ^a_b[c,d] + T^a_mc T^m_bd − T^a_md T^m_bc    (3.307)
In other words, we can use the locally geodesic coordinate system to [at a point] simplify
the appearance of the Riemann tensor.
Note: These “locally geodesic” coordinates are much more useful if the torsion van-
ishes. ♦
In fact one can do better: at the point p we can additionally arrange

    Γ^a_(bc,d) |p = 0    (3.308)
To see this, start with a locally geodesic coordinate system and define new improved
coordinates. Assume that p is located at xa = 0, and transform coordinates
    x^a → x^ā = x^a + (1/3!) Q^a_bcd x^b x^c x^d + O(x^4)    (3.309)

(So Q^a_bcd, by its definition, is automatically symmetric in its lower three indices.) Then

    ∂x^ā/∂x^b = δ^a_b + (1/2!) Q^a_bcd x^c x^d + O(x^3) = δ^a_b + O(x^2)    (3.310)

    ∂^2 x^ā / ∂x^b ∂x^c = Q^a_bcd x^d + O(x^2) = O(x).    (3.311)

    ∂^3 x^ā / ∂x^b ∂x^c ∂x^d = Q^a_bcd + O(x).    (3.312)
That is:

    Γ^ā_(b̄c̄,d̄) = Γ^a_(bc,d) − Q^a_bcd + O(x)    (3.315)

In particular (choosing Q^a_bcd = Γ^a_(bc,d) |p and subsequently dropping the bars on the indices)

    Γ^a_(bc,d) |p = 0.    (3.316)
Note that this does not imply that all the partial derivatives of the connexion vanish at
p, it only means that the completely symmetric part of the connexion derivatives can be
made to vanish.
Iterating this construction order by order, we can in fact arrange

    Γ^a_(bc,def...) |p = 0    (3.317)

for any number of partial derivatives. If all these symmetric partial derivatives are zero
then this improvement on the notion of a locally geodesic coordinate system is called a
Riemann normal coordinate chart at the point p.
Historical note: Schouten [Ricci calculus, p 158] points out that this is actually a
generalization of Riemann’s construction [Riemann worked with both nonmetricity and
torsion set to zero] and that these might more reasonably be called Veblen normal coor-
dinates. See also Eisenhart, p 58 ff. ♦
Generally it is the fact that you can enforce

    Γ^a_(bc,d) |p = 0    (3.318)

(the first step beyond locally geodesic coordinates) that is the most important.
Note: These “Riemann normal” coordinates are much more useful if the torsion van-
ishes. ♦
Chapter 4
Symmetric Connexions
For symmetric connexions several of the complications inherent in the general [asymmetric] affine connexion disappear. These issues are sufficiently important to merit a separate chapter.
Suppose first that an asymmetric connexion (2)Γ^a_bc defines the same parallels as some symmetric connexion (1)Γ^a_(bc). In view of the previous discussion concerning the preservation of parallelism this requires

    (2)Γ^a_bc = (1)Γ^a_(bc) + δ^a_b f_c    (4.1)

Equivalently

    (2)Γ^a_(bc) = (1)Γ^a_(bc) + δ^a_(b f_c)    (4.2)

    (2)T^a_bc = δ^a_[b f_c]    (4.3)

That is: a necessary and sufficient condition that the asymmetric connexion Γ^a_bc have the same parallels as some symmetric affine connexion is that

    T^a_bc = 2/(n−1) δ^a_[b T_c]    (4.4)
There are now several simplifications in the various identities satisfied by the Riemann
tensor. Of course we still have
Ra b(cd) = 0 (4.5)
but now

    R^a_[bcd] = 4/(n−1) δ^a_[b T_c;d]    (4.6)

while

    R_[ab] = (1/2) S_ab + 2 T_[a;b]    (4.7)

and

    R_(cd)ab = q_cd[a;b] + 1/(n−1) q_cd[a T_b].    (4.8)
Finally the Weitzenbock [Bianchi] identities specialize to

    R^a_b[cd;e] = 4/(n−1) R^a_b[cd T_e]    (4.9)

and for the contracted Weitzenbock [Bianchi] identities

    S_[cd;e] = 4/(n−1) S_[cd T_e]    (4.10)

and

    R_bd;e − R_be;d + R^a_bde;a = − 8/(n−1) R^n_bde T_n.    (4.11)
Exercise: Show that the number of algebraically independent components of the Riemann tensor for any semi-symmetric connexion is (as for the general asymmetric affine connexion)

    n^3 (n − 1) / 2    (4.12)
♦
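As a quick sanity check on this count (an illustration I am adding, not part of the notes): with R^a_b(cd) = 0 as the only symmetry, the indices a and b are free while (c, d) range over antisymmetric pairs, and a brute-force enumeration reproduces n^3(n−1)/2.

```python
# Enumerate the index quadruples (a, b, c, d) surviving R^a_b(cd) = 0:
# a, b free, and one representative c < d per antisymmetric pair (c, d).
import itertools

def riemann_count_general(n):
    return sum(1 for a, b, c, d in itertools.product(range(n), repeat=4)
               if c < d)

count4 = riemann_count_general(4)   # should equal 4^3 * 3 / 2 = 96
```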
For a fully symmetric connexion the various identities simplify even further. We still have
Ra b(cd) = 0 (4.13)
Ra [bcd] = 0 (4.14)
while

    R_[ab] = (1/2) S_ab    (4.15)

and

    R_(cd)ab = q_cd[a;b].    (4.16)
Finally the Bianchi identities specialize to
Ra b[cd;e] = 0, (4.17)
S[cd;e] = 0 (4.18)
and
Rbd;e − Rbe;d + Ra bde;a = 0. (4.19)
We can rewrite this final identity as

    2 R_b[d;e] + R^a_bde;a = 0.    (4.20)
Exercise: Show that the number of algebraically independent components of the Riemann tensor for any symmetric connexion is

    n^2 (n + 1)(n − 1) / 3    (4.21)
♦
From our previous discussion on parallelism preserving deformations of a generic asym-
metric affine connexion we know that:
Theorem 22 If two symmetric connexions define the same parallels they are equal.
Exercise: Under a deformation of this type what happens to the Riemann, Ricci, and
S tensors? ♦
Exercise: What happens if two symmetric connexions (2)Γ^a_(bc) and (1)Γ^a_(bc) have both the same geodesics and the same affine parameter? ♦
For a symmetric connexion the geodesic and normal coordinates come to play a role of
special importance. We now have, in any locally geodesic coordinate chart
Γa bc |p = 0. (4.23)
whence
Ra bcd |p = Γa bd,c − Γa bc,d = −2Γa b[c,d] (4.24)
But then, automatically
    R^a_[bcd] |p = 0    (4.25)

(Note that Γ^a_[bc,d] = 0 simply because Γ^a_bc is both symmetric and antisymmetric in bc.) Now
we may have derived the above equation in a special coordinate system, but this is a
tensor equation so it must hold in all coordinate charts.
Differentiating (4.24), and noting that in locally geodesic coordinates covariant derivatives of the Riemann tensor reduce at p to partial derivatives, we have

    R^a_b[cd;e] |p = −2 Γ^a_b[c,de] = 0    (4.27)
Again, though we have derived it in a special coordinate system, this is a tensor equation
so it must hold in all coordinate charts. Consequently we have demonstrated the Bianchi
identity for an arbitrary symmetric connexion in a rather straightforward way.
Theorem 24 Fermi normal coordinates For any symmetric connexion, and any curve
xa (λ), it is possible to choose coordinates to make the symmetric connexion zero every-
where along the curve.
Suppose one has an auxiliary tensor gab with covariant derivative qabc. Then defining the conformally related ḡab = exp(2θ) gab, but keeping the symmetric connexion fixed, the non-metricity of ḡab is

    q̄_abc = ∇_c ḡ_ab = exp(2θ) {q_abc + 2 g_ab ∂_c θ}

This suggests that it may be useful to consider connexions with non-metricity of the form

    q_abc = −2 g_ab f_c

since under a conformal rescaling of the auxiliary tensor the covector fc then simply shifts:

    f̄_c = f_c − ∂_c θ    (4.32)

Now using our general result for decomposing the affine connexion in terms of the auxiliary tensor gab, the nonmetricity qabc, and the torsion [zero for symmetric connexions] we have

    Γ^a_bc = {a \atop bc} + δ^a_b f_c + δ^a_c f_b − g_bc f^a    (4.33)

Symmetric connexions of this type are called Weyl connexions; they were used by Weyl in one particular attempt [now abandoned] to geometrically unify electromagnetism with gravity.
There is a version of the Weyl tensor that can be defined for a limited class of symmetric connexions. Suppose we have two symmetric connexions that define the same geodesics (though not the same affine parameter). That is, suppose

    (2)Γ^a_bc = (1)Γ^a_bc + δ^a_(b f_c)    (4.36)

Such a change of connexion is called a projective transformation; the induced changes in the curvature tensors are controlled by the combination

    f_cd = f_c;d − f_c f_d    (4.39)
♦

The resulting projectively invariant combination of the Riemann and Ricci tensors, the projective Weyl tensor W^a_bcd, satisfies the symmetries

    W^a_b(cd) = 0    (4.42)

    W^a_[bcd] = 0    (4.43)

    W^a_acd = 0    (4.44)

♦
Exercise: Show that a necessary and sufficient condition that the Riemann tensor be
invariant under a projective transformation is that
fc;d − fc fd = 0 (4.45)
Exercise: Show that a necessary and sufficient condition that the Ricci tensor be
invariant under a projective transformation is that
fc;d − fc fd = 0 (4.46)
Exercise: Show that a necessary and sufficient condition that the symmetric part of
the Ricci tensor be invariant under a projective transformation is that
f(c;d) − fc fd = 0 (4.47)
Exercise: Show that a necessary and sufficient condition that the anti-symmetric part
of the Ricci tensor be invariant under a projective transformation is that
fc = ∂ c θ (4.48)
Exercise: Show that a necessary and sufficient condition that the S-tensor Sab be
invariant under a projective transformation is that
fc = ∂ c θ (4.49)
Exercise: [Trivial] Show that a necessary and sufficient condition that a symmetric connexion be projectively equivalent to an affine flat connexion is that

    Γ^a_bc = δ^a_(b f_c)    (4.50)
Exercise: Show that (for n ≥ 3) a necessary and sufficient condition that a symmetric
connexion be projectively equivalent to an affine flat connexion is that the Weyl tensor
vanish. ♦
Exercise: Show that (for any n) a necessary and sufficient condition that a symmetric connexion be projectively equivalent to an affine flat connexion is that

    R_ab;c − R_ac;b = 2/(n+1) { R_[ab];c − R_[ac];b }    (4.51)
♦
Exercise: Show that a necessary and sufficient condition that a symmetric connexion be projectively equivalent to an affine flat connexion, and that the Ricci tensor be symmetric, is that

    Γ^a_bc = δ^a_(b θ,c)    (4.52)

and that in this case

    R^a_bcd = e^{−θ} { δ^a_c ∂_b ∂_d e^θ − δ^a_d ∂_b ∂_c e^θ }    (4.53)
♦
From a symmetric connexion one can also construct a trace-free projective combination Π^a_bc (the Thomas projective parameters), normalized so that

    Π^a_ac = 0    (4.56)
There are many more results that can be derived for symmetric connexions; this
is more than enough for this particular course.
Chapter 5
Metric Connexions
The irony is that given the way I have set things up, I can now define a metric connexion
without defining a metric tensor. Patience, the metric tensor is [finally] defined in the
next chapter.
The generic metric connexion may still have torsion, but is characterized by the vanishing
of the non-metricity tensor qabc . More precisely the otherwise arbitrary non-degenerate
symmetric tensor gab is defined to be covariantly constant (it will soon be identified as
the metric tensor) and we demand
∇c gab = 0 (5.1)
so that

    Γ^c_ab = {c \atop ab} + g^cm {T_abm + T_bam − T_mba}    (5.2)

As for any affine connexion we still have

    R^a_b(cd) = 0    (5.3)
However we still need the full Weitzenbock form of the Bianchi identities
Note that it is the anti-symmetry on the first two indices of the Riemann tensor that
forces Sab = 0.
Exercise: Show that the number of algebraically independent components of the Riemann tensor for any metric connexion (with torsion) is

    n^2 (n − 1)^2 / 4    (5.9)
♦
Exercise: Show that the number of algebraically independent components of the Riemann tensor for any semi-symmetric metric connexion is (as for the general metric connexion with torsion)

    n^2 (n − 1)^2 / 4    (5.16)
♦
The metric connexion without torsion is the “standard connexion” for doing general rel-
ativity. Because of this most texts on GR deal only with this case, ignoring other com-
plications. There are however a number of good physics and mathematics reasons for
having kept the discussion quite general up to now. The torsion-free metric connexion is
characterized by the vanishing of both the non-metricity tensor qabc and the torsion T a bc .
Then we have

    Γ^c_ab = {c \atop ab}.    (5.17)
The standard symmetries are now
Ra b(cd) = 0 (5.18)
Ra [bcd] = 0 (5.19)
R[cd] = Scd = 0 (5.20)
R(cd)ab = 0, (5.21)
and the simplified form of the Bianchi identities
Ra b[cd;e] = 0. (5.22)
Exercise: Show that the interplay between Rab(cd) = 0 = R(ab)cd and Ra[bcd] = 0 implies

    R_abcd = R_cdab    (5.23)
Exercise: Show that the number of algebraically independent components of the Riemann tensor for any torsion-free metric connexion is

    (1/12) n^2 (n^2 − 1)    (5.24)

This is not so trivial, see the next chapter for hints. (You will need to use the symmetries implied by the vanishing of nonmetricity to obtain this result.) ♦
Chapter 6
Metric tensor
In good old ordinary Euclidean geometry (e.g., IR3 ) you probably never saw the distinction
between vectors and covectors, tangents and gradients, indices up and indices down.
Something special happens in Euclidean geometry to simplify life a lot. Let’s try to see
what is so special and how to modify it to a more general context.
Let’s define:

    g_bc = Σ_{a=1}^{n} (∂X^a/∂x^b) (∂X^a/∂x^c)    (6.4)

So far, it’s only defined for Euclidean spaces, and uses the Cartesian coordinates X^a as a fundamental part of its definition, but once it is defined in this way it does transform as a T20 tensor under arbitrary changes of the coordinates x^a.
Note:
You can alternatively define gbc this way, at least for Euclidean space: it is the T20 tensor which in Cartesian coordinates takes on the special values

    g_ab(X) = δ_ab
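Definition (6.4) is easy to evaluate symbolically. The sketch below (an added illustration, not part of the notes) computes the induced components g_bc for plane polar coordinates, recovering the familiar diag(1, r²).

```python
# Induced metric components g_bc = sum_a (dX^a/dx^b)(dX^a/dx^c)
# for plane polar coordinates (r, theta) in Euclidean IR^2.
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
X = [r*sp.cos(th), r*sp.sin(th)]   # Cartesian X^a as functions of x^b
x = [r, th]

g = sp.Matrix(2, 2, lambda b, c: sp.simplify(
    sum(sp.diff(X[a], x[b]) * sp.diff(X[a], x[c]) for a in range(2))))
# g comes out as diag(1, r^2): the polar-coordinate form of the flat metric.
```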
Comment: Suppose you were a physicist working in Minkowski space, then in Cartesian coordinates you would want to take

    g_ab(X) = η_ab = diag(−1, +1, +1, +1)    (6.7)

and in any other set of coordinates you would just use the tensor transformation laws to calculate the components. ♦
If you are not working in Euclidean/Minkowski space, then you define a “metric” on the manifold to be any non-singular symmetric tensor of type T20 (the tensor components typically determine a positive definite matrix), and define the distance between nearby points as

    ds = √( g_bc δx^b δx^c )    (6.8)
Because the matrix gab is assumed nonsingular it has a matrix inverse, and without loss of generality we can write:

    g^ab = [g^{−1}]^{ab}    (6.9)

The metric can then be used to lower indices,

    (t↓)_a = g_ab t^b.

Similarly

    t^a = g^ab (t↓)_b.    (6.12)
As we have already seen (remember the non-metricity tensor) any nonsingular T20 tensor
picked at random would handle this particular job of raising and lowering indices; it’s only
because of the separate notion of “distance” that the metric acquires so much specific
importance.
Definition 49 If the metric (considered as a matrix) is positive definite then the geometry
is Riemannian.
Note: One of the appendices gives Riemann’s complete inaugural lecture — it is fasci-
nating to see how these ideas first came into use, and the logic that Riemann was using
to initiate this branch of mathematics. ♦
Definition 50 If the metric is nonsingular but indefinite, and in particular has signature
−, +, +, +... then the geometry is called pseudo-Riemannian (aka Lorentzian).
You can also use pseudo-Riemannian geometry, for instance, for investigating the propagation of sound in a moving fluid. This ultimately is the basis of a minor industry called “analogue gravity” where various condensed matter systems are used to model aspects of general relativity. See the book “Artificial Black Holes”. ♦
To quote Bernhard Riemann (in slightly contorted English translation, full version pro-
vided in one of the appendices):
The next case in simplicity includes those manifoldnesses in which the line-element
may be expressed as the fourth root of a quartic differential expression. The
investigation of this more general kind would require no really different principles,
but would take considerable time and throw little new light on the theory of space,
especially as the results cannot be geometrically expressed. . .
. . . A method entirely similar may for this purpose be applied also to the manifold-
ness in which the line-element has a less simple expression, e.g., the fourth root
of a quartic differential. In this case the line-element, generally speaking, is no
longer reducible to the form of the square root of a sum of squares, and therefore
the deviation from flatness in the squared line-element is an infinitesimal of the
second order, while in those manifoldnesses it was of the fourth order.
Such manifolds, and their generalizations, have now come to be called Finsler geometries, and for instance

    ds = ( g_abcd δx^a δx^b δx^c δx^d )^{1/4}    (6.13)

or more generally

    ds = ( g_{a1 a2 ... an} δx^{a1} δx^{a2} ... δx^{an} )^{1/n}    (6.14)

are general notions of “Finsler distance”.
Pseudo-Finsler geometries allow the g_{a1 a2 ... an} to become indefinite, and it is then more useful to write:

    (ds)^n = g_{a1 a2 ... an} δx^{a1} δx^{a2} ... δx^{an}    (6.15)

In this case both sides of the equation are at least real.
The only reason I mention this is that 4th-order pseudo-Finsler geometry does have physics applications.
The metric can be used to define a whole different class of geodesics, based on the notion
of shortest distance (not straightest line). The two concepts agree in Euclidean space (or
Riemannian geometry), but do not necessarily agree in all the models physicists cook up.
(Note that the distance defined in this way is independent of the parameterization of the
curve.)
Exercise: Suppose we change the parameterization of the curve from x(λ) to x(λ̄) where
λ̄ = f (λ) and f (·) is a 1-1 differentiable function from IR → IR. Verify that the distance
s is independent of f (·). ♦
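Numerically the claim in this exercise is easy to illustrate. The following sketch (my addition; the specific curve and reparameterization are invented for the example) computes a chordal approximation to the arc length of a half circle under two different parameterizations and gets the same answer.

```python
# Arc length of a plane curve is independent of the parameterization:
# approximate it by summing chord lengths over a fine partition.
import numpy as np

def arc_length(curve, lam):
    pts = np.array([curve(l) for l in lam])
    return np.sum(np.sqrt(np.sum(np.diff(pts, axis=0)**2, axis=1)))

lam = np.linspace(0.0, 1.0, 20001)
curve = lambda l: (np.cos(np.pi*l), np.sin(np.pi*l))   # half circle, length pi
reparam = lambda l: curve(l**3)                        # lambda-bar = f(lambda), 1-1

s1 = arc_length(curve, lam)
s2 = arc_length(reparam, lam)
# s1 and s2 agree (both approximate pi), despite the reparameterization.
```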
Now if we fix the endpoints x and y we can use the Euler–Lagrange equations of varia-
tional calculus to find the equation of the shortest curve between those fixed points.
Note: I have added an appendix with the elements of variational calculus; enough to
understand what is going on here. ♦
Exercise: For the adventurous I have also added another appendix that gives the three
Hilbert problems that relate to the calculus of variations. Completely solving those three
problems would be a nice exercise. ♦
This simplifies tremendously if you parameterize using arc-length s. Rearranging the resulting Euler–Lagrange equations:

    d^2 x^a/ds^2 + g^am { ∂g_mb/∂x^c − (1/2) ∂g_bc/∂x^m } (dx^b/ds)(dx^c/ds) = 0.    (6.21)

This is qualitatively very similar to the notion of an affine geodesic with the analogy

    Γ^a_bc ∼ g^am { ∂g_mb/∂x^c − (1/2) ∂g_bc/∂x^m }    (6.22)
However, the RHS here is not quite in its nicest form. It is traditional and usual to symmetrize and introduce Christoffel symbols of the first and second kind

    {ab, c} ≡ (1/2) [g_ac,b + g_bc,a − g_ab,c] = g_c(a,b) − (1/2) g_ab,c    (6.23)

    {a \atop bc} = g^am {bc, m} = g^am { g_m(b,c) − (1/2) g_bc,m } = (1/2) g^am { ∂g_mb/∂x^c + ∂g_mc/∂x^b − ∂g_bc/∂x^m }    (6.24)
Exercise:
It is now a straightforward (but tedious) exercise to verify that the second Christoffel symbol {a \atop bc} defined above really does transform in the same way as an affine connexion. This is already implicit in our analysis for the general affine connexion, but explicit verification in this particular situation is still a useful exercise. ♦

If we set Γ^a_bc = {a \atop bc} then:
• Geodesics in the sense of straightest possible paths coincide with geodesics in the
sense of shortest possible paths.
Theorem 25 If

    Γ^a_bc = {a \atop bc}    (6.25)

then

    ∇_a g_bc = 0,    (6.26)

that is: the metric is covariantly constant.
Conversely, if ∇_a g_bc = 0, and the torsion is zero (T^a_bc = 0), then Γ^a_bc = {a \atop bc}.
(This is already explicit in our analysis for the general affine connexion.)
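Theorem 25 can be verified symbolically on a concrete example. The sketch below (not from the notes; the unit 2-sphere metric is just a convenient test case) builds the Christoffel symbols of the second kind and checks that ∇_a g_bc = 0 for every index combination.

```python
# Verify metric compatibility of the Christoffel connexion on the unit 2-sphere.
import sympy as sp

th, ph = sp.symbols('theta phi', positive=True)
coords = [th, ph]
g = sp.diag(1, sp.sin(th)**2)   # unit 2-sphere metric
ginv = g.inv()
n = 2

def christoffel(a, b, c):
    # second kind: {a \atop bc} = (1/2) g^am (g_mb,c + g_mc,b - g_bc,m)
    return sum(ginv[a, m]*(sp.diff(g[m, b], coords[c])
                           + sp.diff(g[m, c], coords[b])
                           - sp.diff(g[b, c], coords[m]))/2 for m in range(n))

def cov_deriv_g(b, c, a):
    # nabla_a g_bc = g_bc,a - Gamma^m_ab g_mc - Gamma^m_ac g_bm
    expr = sp.diff(g[b, c], coords[a])
    expr -= sum(christoffel(m, a, b)*g[m, c] for m in range(n))
    expr -= sum(christoffel(m, a, c)*g[b, m] for m in range(n))
    return sp.simplify(expr)

metricity_ok = all(cov_deriv_g(b, c, a) == 0
                   for a in range(n) for b in range(n) for c in range(n))
```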
There are a number of other places where the metric connexion simplifies things, for
instance in the symmetry properties of the Riemann tensor. I already pointed out to you
that
Ra b(cd) = 0 (6.27)
for an arbitrary affine connexion. For the metric connexion, we start by using locally geodesic coordinates so that at any arbitrary but fixed point Γ^a_bc |p = {a \atop bc} |p = 0, which implies that at that point {ab, c} |p = 0. But in general

    g_ab,c = {ac, b} + {bc, a}    (6.28)

so we deduce g_bc,a |p = 0 at the arbitrary but fixed point p. But this means that in locally geodesic coordinates

    ∂_d Γ^a_bc = g^am ∂_d {bc, m} = (1/2) g^am ( g_bm,cd + g_cm,bd − g_bc,md )    (6.29)

and (still in locally geodesic coordinates)

    R^a_bcd |p = ∂_c Γ^a_bd − ∂_d Γ^a_bc = (1/2) g^am ( g_dm,bc − g_bd,mc − g_cm,bd + g_bc,md )    (6.30)

Let’s lower the index on the Riemann tensor; then at any arbitrary but fixed point p we can locally choose geodesic coordinates to make:

    R_abcd |p = (1/2) ( g_da,bc − g_bd,ac − g_ca,bd + g_bc,ad )    (6.31)

That is:

    R_abcd |p = (1/2) ( g_ad,bc − g_bd,ac − g_ac,bd + g_bc,ad )    (6.32)
Notice that in this coordinate system, not only does

    R_ab(cd) |p = 0    (6.33)

but also

    R_(ab)cd |p = 0    (6.34)

    R_a[bcd] |p = 0    (6.35)

and finally

    R_abcd |p = R_cdab |p    (6.36)
But these last three equations are new tensor properties — if we have proved them in one
chart they are true in all charts; and the point p though fixed, was arbitrary, so we have
derived additional symmetry properties for a metric-derived Riemann tensor.
(We had already derived these results by specializing the results for the general asym-
metric affine connexion, but this use of locally geodesic coordinates makes life particularly
simple.)
Note that this only works in a normal coordinate system, and does not hold in an arbi-
trary locally geodesic chart. ♦
Let’s Taylor series expand the metric in the normal chart around p, where p corresponds to x^a = 0. Then

    g_ab(x) = g_ab(p) + (1/2) g_ab,cd x^c x^d + O(x)^3    (6.47)

But then

    g_ab(x) = g_ab(p) − (1/3!) ( R_acbd + R_bcad ) x^c x^d + O(x)^3    (6.48)
Exercise: From

    g_ab(x) = η_ab − (1/3!) ( R_acbd + R_bcad ) x^c x^d + O(x)^3    (6.51)

it follows that

    g_ab,cd = − (1/3) ( R_acbd + R_bcad )    (6.52)

Verify that this is consistent with the definition of the Riemann tensor in locally geodesic coordinates

    R_abcd |p = (1/2) ( g_ad,bc − g_bd,ac − g_ac,bd + g_bc,ad ) |p    (6.53)

♦
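The consistency claimed in this exercise can be checked numerically without ever choosing a metric: generate a random tensor with the full Riemann symmetries, define g_ab,cd by (6.52), and feed it back through (6.53). The sketch below (my addition, not from the notes) does exactly that; the projection used to impose the symmetries is a standard trick, not something defined in the text.

```python
# Check: (6.53) applied to (6.52) reconstructs the Riemann tensor exactly.
import numpy as np

rng = np.random.default_rng(0)
n = 4
T = rng.standard_normal((n, n, n, n))

# Impose the pair symmetries of a (lowered-index) Riemann tensor.
T = T - np.einsum('bacd->abcd', T)   # antisymmetry in (ab)
T = T - np.einsum('abdc->abcd', T)   # antisymmetry in (cd)
T = T + np.einsum('cdab->abcd', T)   # symmetry under pair exchange
# Enforce the first Bianchi identity Ra[bcd] = 0 by removing the cyclic
# part (which, given the pair symmetries, is totally antisymmetric).
R = T - (T + np.einsum('acdb->abcd', T) + np.einsum('adbc->abcd', T)) / 3.0

# Equation (6.52): g2[a,b,c,d] plays the role of g_ab,cd.
g2 = -(np.einsum('acbd->abcd', R) + np.einsum('bcad->abcd', R)) / 3.0

# Equation (6.53), with the overall factor 1/2:
R_rec = (np.einsum('adbc->abcd', g2) - np.einsum('bdac->abcd', g2)
         - np.einsum('acbd->abcd', g2) + np.einsum('bcad->abcd', g2)) / 2.0

consistent = np.allclose(R_rec, R)
```

The reconstruction fails without the first Bianchi identity, which is the real content of the exercise.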
As we have already seen, the Riemann tensor satisfies some important constraints in the
form of differential identities. Let’s step back to the stage of assuming the torsion is zero,
but make no assumption that the connexion is metric; it’s a generic symmetric affine
connexion. Then in locally geodesic coordinates the Riemann tensor at p again reduces to R^a_bcd = −2 Γ^a_b[c,d]. But then

    R^a_b[cd;e] = 0    (6.58)

(because we are antisymmetrizing partial derivatives of the symmetric connexion).
Note: The generalization of this identity to the case of nonzero torsion or general affine
connexion is somewhat messier. See previous discussion. ♦
Since we have this identity we can also deduce related identities by contracting on the
up-down indices:
    R^a_acd;e + R^a_ade;c + R^a_aec;d = 0    (6.59)

That is

    S_cd;e + S_de;c + S_ec;d = 0  ⟺  S_[cd;e] = 0.    (6.60)
Note: Remember that for a metric connection Sab = 0; so this equation is non-trivial
only for symmetric but non metric connexions. ♦
Contracting instead the contravariant index with one of the antisymmetrized indices gives

    R_bd;e − R_be;d + R^a_bde;a = 0.    (6.62)
This is true for symmetric but not necessarily metric connexions. If we make the additional
assumption that the connexion is metric, we can additionally contract by using g be to get
That is

    R^e_d;e − R_;d + R^e_d;e = 0    (6.64)

so that

    [ R^a_b − (1/2) R δ^a_b ]_{;a} = 0    (6.65)
This is now traditionally referred to as the contracted Bianchi identity, and the combination

    G_ab = R_ab − (1/2) R g_ab    (6.66)

is now traditionally referred to as the Einstein tensor.
Warning: Note that to even define the Einstein tensor you need to have a metric (or
at the very least a nonsingular T20 tensor) available to fully contract the Ricci tensor to
obtain the Ricci scalar
R = Rab g ab (6.67)
The Ricci tensor can be defined for an arbitrary affine connexion; the Einstein tensor
needs the metric. Even when the metric is available, you then additionally need to make
sure you are using the metric connexion in order to deduce the conservation law

    ∇_a G^a_b = 0.
Note: If you crawl down into the basement of Rankine–Brown (not currently an option),
you will discover the VUW library has a complete set of Professor Bianchi’s nicely typeset
lecture notes from the graduate course in differential geometry he taught for many years
— of course they are all in the original Italian! ♦
Suppose we are dealing with a [torsion-free] metric connexion; then the number of free
components of the Riemann tensor is strictly limited by all the symmetries. In fact:
• If n = 3 the Riemann tensor has only six independent components, essentially Rab .
Indeed

    R_abcd = −2 { g_a[d R_c]b + g_b[c R_d]a } − R g_a[c g_d]b    (6.71)

That is

    R_abcd = { g_ac R_bd + g_bd R_ac − g_ad R_bc − g_bc R_ad } − (1/2) R { g_ac g_bd − g_ad g_bc }    (6.72)
• If n = 4 the Riemann tensor has only twenty independent components. Ten of them are the Ricci tensor Rab and the other ten are hidden in the “Weyl tensor”

    C_abcd = R_abcd + g_a[d R_c]b + g_b[c R_d]a + (1/3) R g_a[c g_d]b    (6.73)

That is

    C_abcd = R_abcd − (1/2) { g_ac R_bd + g_bd R_ac − g_ad R_bc − g_bc R_ad } + (1/6) R { g_ac g_bd − g_ad g_bc }    (6.74)
• Generally (n ≥ 3) we define

    C_abcd = R_abcd + 2/(n−2) { g_a[d R_c]b + g_b[c R_d]a } + 2/((n−1)(n−2)) R g_a[c g_d]b    (6.75)

That is

    C_abcd = R_abcd − { g_ac R_bd + g_bd R_ac − g_ad R_bc − g_bc R_ad } / (n−2) + R { g_ac g_bd − g_ad g_bc } / ((n−1)(n−2))    (6.76)
The Weyl tensor has the same symmetries as Riemann and is the unique linear
combination of the Riemann tensor, Ricci tensor, Ricci scalar, and metric such that
C a bad = 0 (6.77)
The Weyl tensor can also be characterized as the part of the Riemann tensor that
is covariant under conformal deformations of the metric. If

    g̃_ab = Ω^2 g_ab    (6.78)

then

    C̃_abcd = Ω^2 C_abcd    (6.79)
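The trace-free property (6.77) uses only the definition (6.76) and the Ricci contraction, so it can be verified on random data. The sketch below (my addition, not from the notes; a standard projection is used to impose the Riemann symmetries, and a flat auxiliary metric keeps the contractions simple) checks that C^a_bad vanishes identically.

```python
# Verify that the Weyl combination (6.76) is trace-free for random
# Riemann-symmetric input, using g = identity as the auxiliary metric.
import numpy as np

rng = np.random.default_rng(1)
n = 5
g = np.eye(n)

T = rng.standard_normal((n, n, n, n))
T = T - np.einsum('bacd->abcd', T)   # antisymmetry in (ab)
T = T - np.einsum('abdc->abcd', T)   # antisymmetry in (cd)
T = T + np.einsum('cdab->abcd', T)   # pair-exchange symmetry
T = T - (T + np.einsum('acdb->abcd', T) + np.einsum('adbc->abcd', T)) / 3.0

Ric = np.einsum('abad->bd', T)       # Ricci contraction (g = identity)
Rs = np.trace(Ric)                   # Ricci scalar

grp = (np.einsum('ac,bd->abcd', g, Ric) + np.einsum('bd,ac->abcd', g, Ric)
       - np.einsum('ad,bc->abcd', g, Ric) - np.einsum('bc,ad->abcd', g, Ric))
gg = np.einsum('ac,bd->abcd', g, g) - np.einsum('ad,bc->abcd', g, g)

C = T - grp/(n - 2) + Rs*gg/((n - 1)*(n - 2))
trace_free = np.allclose(np.einsum('abad->bd', C), 0)
```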
Exercise:
Verify that the number of algebraically independent components of the Riemann tensor is

    (1/12) n^2 (n^2 − 1).    (6.80)

Warning:
If you only use the “basic” symmetries R(ab)(cd) = 0 and Rabcd = Rcdab you would find

    (1/2) [n(n−1)/2] [n(n−1)/2 + 1]    (6.81)

independent components. It is only after adding the additional constraint Ra[bcd] = 0 that you get the quoted result. ♦
Note:
Up to n(n + 1)/2 of these components can be assigned to the Ricci tensor, so that the Weyl tensor has at most

    (1/12) n^2 (n^2 − 1) − (1/2) n(n + 1) = (1/12) (n − 3) n (n + 1)(n + 2)    (6.82)

independent components. ♦
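These counting formulas can be cross-checked numerically. The sketch below (my addition, not from the notes) uses the fact that imposing the first Bianchi identity removes exactly C(n, 4) components from the "basic" count, which is the dimension of the totally antisymmetric part.

```python
# Cross-check the component counts (6.80), (6.81), (6.82).
from math import comb

def basic_count(n):            # only R(ab)(cd) = 0 and Rabcd = Rcdab
    m = n * (n - 1) // 2       # number of antisymmetric index pairs
    return m * (m + 1) // 2    # symmetric matrix in pair-indices

def riemann_count(n):          # after also imposing Ra[bcd] = 0
    return basic_count(n) - comb(n, 4)   # Bianchi removes C(n,4) components

def weyl_count(n):             # subtract the (symmetric) Ricci components
    return riemann_count(n) - n * (n + 1) // 2

checks = all(riemann_count(n) == n**2 * (n**2 - 1) // 12 and
             weyl_count(n) == (n - 3) * n * (n + 1) * (n + 2) // 12
             for n in range(3, 10))
```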
Exercise:
[This exercise makes sense only in Lorentzian signature, i.e., for pseudo-Riemannian man-
ifolds.]
(Trivial) Verify that g̃ab = Ω2 gab has the same null cones as gab .
(A little harder) Verify that g̃ab has the same null geodesics as gab .
(A little harder) Even though g̃ab has the same null geodesics as gab , these geodesics will
not have the same affine parameter. Verify this. ♦
Exercise:
Verify that
    Γ̃^a_bc = Γ^a_bc + Ω^{−1} { δ^a_b ∇_c Ω + δ^a_c ∇_b Ω − g_bc ∇^a Ω }.    (6.83)
Can you use this result to come up with a “slick” way of showing that g̃ab has the same
null geodesics as gab ?
[The second part of this exercise makes sense only in Lorentzian signature, i.e., for pseudo-
Riemannian manifolds.] ♦
Exercise:
Define

    Ω^a_b = 4 Ω^{−1} ∇^a ∇_b (Ω^{−1}) − 2 δ^a_b ||∇(Ω^{−1})||^2    (6.84)

(The apparently perverse use of Ω^{−1} minimises the number of explicit occurrences of n.)
Verify that

    R̃^{ab}_{cd} = Ω^{−2} { R^{ab}_{cd} + δ^{[a}_{[c} Ω^{b]}_{d]} }    (6.85)
Exercise:
If n = 2 verify that

    √g̃ R̃ = √g { R + 2 ∇^2 ln Ω }    (6.88)

Deduce that on a compact 2-manifold

    ∫ √g R d^2x    (6.89)

is invariant under conformal deformations of the metric.
Warning: Strictly speaking, I have not yet defined the notion of integration on manifolds, but this particular calculation should be do-able by elementary means. ♦
Exercise:
By taking the formal limit n → 2 of the general conformal transformation law, verify that you recover the previous result. ♦

Exercise:
If n = 3 verify that

    √g̃ R̃ = √g { Ω R + 4 ∇^2 Ω − 2 Ω ||∇ ln Ω||^2 }    (6.91)

Deduce that on a compact 3-manifold

    ∫ √g̃ R̃ d^3x = ∫ √g exp(θ) { R − 2 ||∇θ||^2 } d^3x    (6.93)
Exercise:
If n = 4 verify that

    √g̃ R̃ = √g { Ω^2 R + 6 Ω ∇^2 Ω }    (6.94)

Deduce that on a compact manifold

    ∫ √g̃ R̃ d^4x = ∫ √g Ω^2 { R − 6 ||∇ ln Ω||^2 } d^4x    (6.95)
Exercise:
Look up “Yamabe’s theorem”.
Under suitable conditions (find them), manifolds can be conformally related to constant
scalar curvature “cousins”. That is, given g it is possible to find Ω such that g̃ has a
constant Ricci scalar: R̃ ∈ {−1, 0, +1}.
(If n = 2 this is the so-called “uniformization theorem” of 2-manifolds; for n > 2 this
seems to be the best analog of the uniformization theorem on the market.) ♦
Definition 51
A manifold is “locally conformally flat” iff in some coordinate chart
    g_ab ∝ η_ab.    (6.96)

It is globally conformally flat iff there is an atlas of coordinate charts with this property, and the transition functions on the overlap regions are trivial.
Lemma 18
Any 2-manifold is locally conformally flat.
(It’s globally conformally flat iff it has the topology of IR^2 or T^2; otherwise the uniformization theorem implies it’s globally conformal to either S^2 or the hyperbolic plane H^2 modded out by some Möbius group.)
Lemma 19
A 3-manifold is conformally flat iff the Cotton tensor is everywhere zero.
(3 dimensions is a special case.) The Cotton tensor is

    R_abc = R_ab;c − R_ac;b − (1/4) { g_ac R_;b − g_ab R_;c } = 2 { R_a[b;c] + (1/4) g_a[b R_;c] }    (6.97)
Lemma 20
A manifold (n ≥ 4) is conformally flat iff the Weyl tensor is everywhere zero.
To see that the notion of a conformally flat geometry contains a lot of interesting mathe-
matics, I will now present an “unusual” route from Newton’s second law to Maupertuis’
variational principle.
Newton’s second law

    F⃗ = m a⃗    (6.98)

in a potential reads

    m d^2 x⃗/dt^2 = − ∂V(x)/∂x⃗    (6.99)
Now suppose that you have good surveying equipment but very bad clocks. So you can
tell where the particle is, and its path through space, but you have poor information on
when it is at a particular point. Can you reformulate Newton’s second law in such a way
as to nevertheless be able to get good information about the path the body follows?
Now we are in good old Euclidean geometry and Cartesian coordinates, so we can write
the distance travelled in space as
    ds = √( dx⃗ · dx⃗ )    (6.100)
Can we find a differential equation for dx⃗/ds (instead of dx⃗/dt)? By using the chain rule

    d^2 x⃗/dt^2 = (ds/dt) (d/ds) [ (ds/dt) (dx⃗/ds) ]    (6.101)

which implies

    d^2 x⃗/dt^2 = (ds/dt)^2 (d^2 x⃗/ds^2) + (ds/dt) [ (d/ds)(ds/dt) ] (dx⃗/ds)    (6.102)

That is

    d^2 x⃗/dt^2 = (ds/dt)^2 (d^2 x⃗/ds^2) + (1/2) [ (d/ds)(ds/dt)^2 ] (dx⃗/ds)    (6.103)

Therefore, putting this into Newton’s second law

    m (ds/dt)^2 (d^2 x⃗/ds^2) = − ∂V(x)/∂x⃗ − (1/2) m [ (d/ds)(ds/dt)^2 ] (dx⃗/ds)    (6.104)
Now this has done the job of completely removing “time” from the equation of motion.
You now have an equation strictly in terms of position x and distance along the path s
— “time” has been completely eliminated.
But now let’s go one step further and re-write this in terms of geometry; you should not be too surprised to see a three-dimensional conformally flat geometry drop out.
Define the conformally flat geometry

    dℓ^2 = [E − V(x)] ds^2 = [E − V(x)] δ_ab dx^a dx^b

and note that the geodesic equations are (in arbitrary parameterization)

    d^2 x^a/dλ^2 + Γ^a_bc (dx^b/dλ)(dx^c/dλ) = f(λ) dx^a/dλ    (6.115)

with Γ^a_bc the Christoffel symbols of this conformal metric (indices in this section are raised and lowered using the flat metric δab).
That is: paths of particles subject to Newton’s second law follow geodesics of the conformally flat geometry defined by dℓ^2 = [E − V(x)] ds^2.
While we have completely eliminated time from the equation for the paths there is a price
— you now need to consider a separate geometry for each value of the energy.
Furthermore

    dℓ = √(E − V(x)) ds = √(m/2) √( 2[E − V(x)]/m ) ds = √(m/2) (ds/dt) ds = (1/√(2m)) p⃗ · dx⃗    (6.125)

So minimizing ℓ = ∫ dℓ, which leads to the geodesic equations and thence to Newton’s second law, is also equivalent to minimizing

    W[a, b] = ∫_a^b p⃗ · dx⃗    (6.126)
subject to the “energy conservation equation”. But this is exactly Maupertuis’ (constant
energy) variational principle, which we now see is equivalent to minimizing the arc-length
` in the conformal geometry.
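Equation (6.125) can be sanity-checked numerically along an actual Newtonian trajectory. The sketch below (my addition; the harmonic potential, initial data, and step sizes are all invented for illustration) integrates Newton's second law with a velocity Verlet scheme and compares ∫√(E−V) ds with (1/√(2m)) ∫ p⃗·dx⃗ along the resulting path.

```python
# Compare the conformal arc length with the Maupertuis action along a
# numerically integrated Newtonian trajectory.
import numpy as np

m = 1.0
V = lambda q: 0.5*np.dot(q, q)          # toy harmonic potential (invented)
gradV = lambda q: q

q = np.array([1.0, 0.0]); v = np.array([0.0, 0.5])
E = 0.5*m*np.dot(v, v) + V(q)

dt, steps = 1e-4, 20000
I_conf, I_maup = 0.0, 0.0   # int sqrt(E-V) ds  and  (1/sqrt(2m)) int p.dx
for _ in range(steps):
    a = -gradV(q)/m                     # velocity Verlet step
    q_new = q + v*dt + 0.5*a*dt*dt
    a_new = -gradV(q_new)/m
    v_new = v + 0.5*(a + a_new)*dt
    dq = q_new - q
    mid = 0.5*(q + q_new)
    I_conf += np.sqrt(max(E - V(mid), 0.0))*np.linalg.norm(dq)
    I_maup += m*np.dot(0.5*(v + v_new), dq)/np.sqrt(2*m)
    q, v = q_new, v_new

rel_diff = abs(I_conf - I_maup)/abs(I_maup)   # should be tiny
```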
Comment: [If you have a little physics background] Note that W[a, b] = ∫_a^b p ds is the quantity that shows up in the phase of the WKB approximation; this is not an accident
quantity that shows up in the phase of the WKB approximation; this is not an accident
and ultimately can be traced back to the fact that quantum physics (via Feynman’s sum
over histories approach) can be viewed as classical mechanics plus fluctuations. The clas-
sical paths come from the saddle points in the path integral, and saddle points can be
located using the calculus of variations. ♦
Exercise: [If you have a little physics background] Since Maupertuis’ (constant energy)
variational principle is not the most common of the variational principles, let’s rewrite
this in terms of the Hamiltonian and the Lagrangian. Re-introduce time and write
\int_a^b d\ell = \int_a^b \sqrt{E-V(x)}\; ds = \int_a^b \sqrt{E-V(x)}\, \frac{ds}{dt}\; dt = \sqrt{\frac{2}{m}} \int_a^b [E - V(x)]\; dt    (6.127)
But now, for the sort of non-relativistic kinetic energy terms we have been assuming,
L = \frac{1}{2} m (d\vec x/dt)^2 - V(x) and H = p^2/(2m) + V(x) \to \frac{1}{2} m (d\vec x/dt)^2 + V(x). Combining
the Lagrangian and Hamiltonian gives
V(x) = \frac{H - L}{2} \;\to\; \frac{E - L}{2}    (6.128)
Math 464: Differential Geometry 123
One message to take from all of this is that physics applications of differential geometry
are not solely restricted to general relativity. Differential geometry is a much more basic
and fundamental tool.
Exercise: For N coupled particles of mass m_i (i ∈ [1..N]) Newton’s second law becomes
the system of equations
m_i\, \frac{d^2 \vec x_i}{dt^2} = -\frac{\partial V(\vec x_1, \ldots, \vec x_N)}{\partial \vec x_i}    (6.130)
Show that the paths swept out by this system of ODEs are geodesics in a conformally flat
3N-dimensional space with metric
d\ell^2 = \{E - V(\vec x_1, \ldots, \vec x_N)\}\, \sum_{i=1}^{N} m_i\, ds_i^2    (6.131)
where ds_i is ordinary physical distance for the i’th particle. Note that we now need to
keep track of the individual particles’ masses. ♦
Exercise: [A nice little project; it is open ended] One of the most famous problems of
analytical mechanics is the “three-body problem” of celestial mechanics. Suppose three
bodies attract each other according to Newtonian gravity. Then
Now this nine-dimensional representation is redundant. You can wlog go to the center of
mass frame so that
m1 ~x1 + m2 ~x2 + m3 ~x3 = ~0 (6.136)
This lets you eliminate one of the ~x’s, say ~x3 .
Show that (for fixed energy E) the three body problem is equivalent to geodesic motion
in a six dimensional conformally flat manifold and find the corresponding metric.
Exercise: Is there anything you can say for a general Lagrangian L(ẋ, x) that does not
conveniently separate into kinetic energy and potential energy contributions? ♦
Chapter 7
Exterior derivatives
The “exterior derivative” is another way of producing covariant objects from derivatives
of a tensor; this time without invoking any connexion [or parallel transport] at all. Of
course there is a different price to pay.
V_{\bar a} = \frac{\partial x^b}{\partial \bar x^a}\, V_b    (7.1)
Take partial derivatives
\partial_{\bar c}\, V_{\bar a} = \partial_{\bar c} \left[ \frac{\partial x^b}{\partial \bar x^a}\, V_b \right]    (7.2)
= \left( \partial_{\bar c}\, \frac{\partial x^b}{\partial \bar x^a} \right) V_b + \frac{\partial x^b}{\partial \bar x^a}\, \partial_{\bar c} V_b    (7.3)
= \frac{\partial^2 x^b}{\partial \bar x^c\, \partial \bar x^a}\, V_b + \frac{\partial x^b}{\partial \bar x^a}\, \frac{\partial x^d}{\partial \bar x^c}\, \partial_d V_b    (7.4)
If only the last term were present, then we would have a T^0_2 tensor. Unfortunately the
presence of the first term destroys the tensorial properties of the partial derivative —
unless, that is, you are working in Cartesian coordinates and are only interested in changing
coordinates to another Cartesian system.
In previous chapters we discussed the notion of the covariant derivative (affine connex-
ion) in considerable detail. Now we will look at a different route to obtaining a covariant
object. Note that the leading term, which gives all the problems, is symmetric in (ac).
So if we take the antisymmetric part of both sides of the equation above we have
\partial_{[\bar c}\, V_{\bar a]} = \frac{\partial x^b}{\partial \bar x^a}\, \frac{\partial x^d}{\partial \bar x^c}\, \partial_{[d} V_{b]}    (7.5)
And this is a tensor equation. This is the second of the two standard routes to defining a
sensible notion of differentiation on tensors. This time we will not alter the definition of
differentiation (essentially that was the role of the affine connexion, to apply a “correction”
to the partial derivative to obtain a covariant derivative). Instead what we will now do
is to restrict the class of objects we want to differentiate. This is the central idea behind
the “exterior differential calculus”.
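As a concrete sanity check of the claim in (7.5), here is a small sympy sketch (sympy assumed available; the covector field V is an arbitrary illustrative choice) verifying that the antisymmetrized partial derivative of a covector transforms as a T^0_2 tensor under a Cartesian-to-polar change of chart:

```python
import sympy as sp

x, y = sp.symbols('x y')
r, th = sp.symbols('r theta', positive=True)

# coordinate change: barred coordinates (r, theta), unbarred (x, y)
subs_map = {x: r*sp.cos(th), y: r*sp.sin(th)}
coords = [x, y]
bar = [r, th]

# an arbitrary (but concrete) covector field in Cartesian components
V = [x*y, x - y**2]

# antisymmetrized derivative in Cartesian coordinates: A_{ca} = (∂_c V_a − ∂_a V_c)/2
A = [[sp.Rational(1, 2)*(sp.diff(V[a], coords[c]) - sp.diff(V[c], coords[a]))
      for a in range(2)] for c in range(2)]

# Jacobian J[a][b] = ∂x^b/∂x̄^a
J = [[sp.diff(subs_map[coords[b]], bar[a]) for b in range(2)] for a in range(2)]

# route 1: transform A as a T^0_2 tensor, per (7.5)
A_bar_tensor = [[sum(J[c][d]*J[a][b]*A[d][b].subs(subs_map)
                     for d in range(2) for b in range(2))
                 for a in range(2)] for c in range(2)]

# route 2: transform V first, then antisymmetrize the barred derivative directly
V_bar = [sum(J[a][b]*V[b].subs(subs_map) for b in range(2)) for a in range(2)]
A_bar_direct = [[sp.Rational(1, 2)*(sp.diff(V_bar[a], bar[c]) - sp.diff(V_bar[c], bar[a]))
                 for a in range(2)] for c in range(2)]

ok = all(sp.simplify(A_bar_tensor[c][a] - A_bar_direct[c][a]) == 0
         for c in range(2) for a in range(2))
print(ok)  # True: the symmetric second-derivative term has dropped out
```

The inhomogeneous \partial^2 x^b/\partial\bar x^c \partial\bar x^a term is symmetric in (c,a), so it cancels in the antisymmetrization, exactly as the derivation above asserts.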
So far we have seen that for a scalar φ and a covector V_a the quantities \partial_a \phi and \partial_{[a} V_{b]}
transform as tensors. How can we generalize this? Let’s start by looking at a tensor that
is completely covariant (all indices down) so that
That is, completely anti-symmetrized partial derivatives of fully covariant tensors trans-
form as tensors. The beauty here is that you do not have to introduce any affine con-
nexion; the differentiation is purely partial differentiation. The drawback is that
you are only allowed to use this trick for a rather restricted class of tensors — fully anti-
symmetrized covariant tensors.
Still this technique is of sufficient importance that a whole terminology and technology
has grown up around it.
Exercise: Show that for a symmetric connexion (in particular the Christoffel connexion)
X[abc...;z] = X[abc...,z] (7.14)
so that the connexion actually “drops out” completely. ♦
Definition 53 The exterior derivative, denoted d, maps s-forms onto (s + 1)-forms ac-
cording to the rule
{dF }a1 a2 ...as as+1 = (s + 1)∂[a1 Fa2 a3 ...as as+1 ] (7.16)
= (s + 1)F[a2 a3 ...as as+1 ,a1 ] (7.17)
= (−1)s (s + 1) F[a1 a2 ...as ,as+1 ] (7.18)
Lemma 21 The exterior derivative when applied twice to any s-form always gives zero.
That is
d2 F = 0 (7.19)
From the definition you know that d2 maps s-forms into (s + 2)-forms, but the lemma
above tells you that the (s + 2)-form in question is always equal to zero. Why? (It should
be obvious.)
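A quick symbolic sketch of the lemma (sympy assumed available; the 1-form F is an arbitrary illustrative choice): compute dF and then its fully antisymmetrized derivative, and watch the mixed partials cancel:

```python
import itertools
import sympy as sp

x0, x1, x2 = coords = sp.symbols('x0 x1 x2')
F = [x0*x1**2, sp.sin(x2)*x0, x2 + sp.exp(x1)]  # a concrete 1-form F_a

# {dF}_{ab} = 2 ∂_[a F_b] = ∂_a F_b − ∂_b F_a
dF = [[sp.diff(F[b], coords[a]) - sp.diff(F[a], coords[b]) for b in range(3)]
      for a in range(3)]

def sgn(perm):
    # sign of a permutation of (0, 1, 2)
    s = 1
    for i in range(3):
        for j in range(i + 1, 3):
            if perm[i] > perm[j]:
                s = -s
    return s

def ddF(a, b, c):
    # {ddF}_{abc} ∝ ∂_[a {dF}_{bc]}: antisymmetrize over all three slots
    idx = (a, b, c)
    total = 0
    for p in itertools.permutations(range(3)):
        i, j, k = idx[p[0]], idx[p[1]], idx[p[2]]
        total += sgn(p) * sp.diff(dF[j][k], coords[i])
    return sp.simplify(total)

ok = all(ddF(a, b, c) == 0
         for a, b, c in itertools.product(range(3), repeat=3))
print(ok)  # True: d^2 F = 0, by equality of mixed partials
```

The "obvious" reason is visible in the code: dF is antisymmetric while the second partial derivatives are symmetric, so every term in the antisymmetrized sum cancels pairwise.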
Definition 54 The exterior product, denoted ∧, is a bilinear product on forms that maps
an s_1-form and an s_2-form onto an (s_1 + s_2)-form according to the rule
\{F_1 \wedge F_2\}_{a_1 a_2 \ldots a_{s_1} b_1 b_2 \ldots b_{s_2}} = \frac{(s_1+s_2)!}{s_1!\; s_2!}\; \{F_1\}_{[a_1 a_2 \ldots a_{s_1}}\, \{F_2\}_{b_1 b_2 \ldots b_{s_2}]}    (7.20)
Putting the reciprocal binomial factor here is purely a matter of convention — it maximizes
agreement with the notation of other books.
Lemma 22
F1 ∧ F2 = (−1)s1 s2 F2 ∧ F1 (7.21)
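Lemma 22 can be spot-checked numerically by implementing definition (7.20) directly (numpy assumed available; forms are represented as fully antisymmetric arrays, and the random components are purely illustrative):

```python
import itertools
from math import factorial

import numpy as np

rng = np.random.default_rng(0)
n = 4  # dimension (illustrative)

def perm_sign(p):
    # sign of a permutation given as a sequence of distinct integers
    s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

def antisymmetrize(T):
    # full antisymmetrization T_[a1...as] over all index slots
    s = T.ndim
    out = np.zeros_like(T)
    for p in itertools.permutations(range(s)):
        out += perm_sign(p) * np.transpose(T, p)
    return out / factorial(s)

def wedge(F1, F2):
    # (7.20): {F1 ∧ F2} = (s1+s2)!/(s1! s2!) times the antisymmetrized outer product
    s1, s2 = F1.ndim, F2.ndim
    coeff = factorial(s1 + s2) // (factorial(s1) * factorial(s2))
    return coeff * antisymmetrize(np.multiply.outer(F1, F2))

f = rng.normal(size=n)                        # a 1-form
h = rng.normal(size=n)                        # another 1-form
G = antisymmetrize(rng.normal(size=(n, n)))   # a 2-form

# Lemma 22: F1 ∧ F2 = (−1)^{s1 s2} F2 ∧ F1
print(np.allclose(wedge(f, h), -wedge(h, f)))  # True (s1 s2 = 1: anticommute)
print(np.allclose(wedge(f, G), wedge(G, f)))   # True (s1 s2 = 2: commute)
```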
Now is a suitable time to introduce some “index free” notation, though it should more
properly be called “limited index” notation. Dealing with exterior differential forms and
the exterior derivative is the one place where “index free” methods are clearly worth the
trouble.
First we step back to the idea of a covariant vector and note that there is a natural
isomorphism between vectors and directional derivatives. If ta are the components of a
contravariant vector then we define
t^a \;\longleftrightarrow\; t = t^a\, \frac{\partial}{\partial x^a}    (7.22)
note that because the partial derivatives transform in the same way as a covariant vector
the RHS is a coordinate invariant. Abstract mathematicians are likely to say that the
coordinate invariant t is the primary definition of a vector. Physicists generally take the
component based definition as primary. Fortunately all roads lead to Rome...
Warning: You will often see statements to the effect that the partial derivatives
\partial_a = \partial/\partial x^a are “vectors”. In an abstract sense this is quite true; they are linear differential
operators and they do form a basis for the tangent space T = T^1_0. Under a change of
chart the chain rule gives
\frac{\partial}{\partial \bar x^a} = \frac{\partial x^b}{\partial \bar x^a}\, \frac{\partial}{\partial x^b}    (7.23)
So that this set of basis vectors transforms in a manner opposite to the components ta ,
which then makes the combination t = ta ∂a a coordinate invariant. Even though it is a
coordinate invariant I do not want to call this a scalar as that has additional connota-
tions... ♦
If we now have a vector field defined by components ta (x) then we can define a field of
linear differential operators
t^a(x) \;\longleftrightarrow\; t(x) = t^a(x)\, \frac{\partial}{\partial x^a}    (7.24)
This now lets you do things that at first glance are really perverse — like calculate the
“commutator” of two vector fields. Let t_1 and t_2 be two vector fields and define
[t_1, t_2] = t_1 t_2 - t_2 t_1 = t_1^a(x)\, \frac{\partial}{\partial x^a}\left( t_2^b(x)\, \frac{\partial}{\partial x^b} \right) - t_2^b(x)\, \frac{\partial}{\partial x^b}\left( t_1^a(x)\, \frac{\partial}{\partial x^a} \right)    (7.25)
= \{ t_1^a\, t_2^b - t_2^a\, t_1^b \}\, \frac{\partial^2}{\partial x^a\, \partial x^b} + \{ t_1^a\, \partial_a t_2^b - t_2^a\, \partial_a t_1^b \}\, \frac{\partial}{\partial x^b}    (7.26)
= \{ t_1^a\, \partial_a t_2^b - t_2^a\, \partial_a t_1^b \}\, \frac{\partial}{\partial x^b}    (7.27)
That is
[t_1, t_2]^b = t_1^a\, \partial_a t_2^b - t_2^a\, \partial_a t_1^b    (7.28)
Lemma 23 The quantities [t_1, t_2]^b = t_1^a\, \partial_a t_2^b - t_2^a\, \partial_a t_1^b transform as the components of a
contravariant vector (a T^1_0 tensor).
Proof: Since the t_i^a transform as the components of contravariant vectors, the operators
t_1 and t_2 are coordinate independent; hence so is the commutator [t_1, t_2], and therefore
its components transform as those of a contravariant vector. QED
Exercise: Provide a “low-brow” proof of the same statement by explicitly making the
change of coordinates
[t1 , t2 ]b̄ = tā1 ∂ā tb̄2 − tā2 ∂ā tb̄1 (7.29)
and verifying
[t_1, t_2]^{\bar b} = \frac{\partial \bar x^b}{\partial x^a}\, [t_1, t_2]^a    (7.30)
Hint: What happens to the \partial^2 \bar x/(\partial x)^2 term? ♦
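The "low-brow" computation in this exercise can be sketched in sympy (assumed available; the fields t1, t2 and the Cartesian-to-polar change of chart are illustrative choices):

```python
import sympy as sp

x, y = sp.symbols('x y')
r, th = sp.symbols('r theta', positive=True)
to_xy = {x: r*sp.cos(th), y: r*sp.sin(th)}
cart, polar = [x, y], [r, th]

# two concrete vector fields, given in Cartesian components
t1 = [y, x**2]
t2 = [x + y, 1 + x*y]

def commutator(u, v, coords):
    # (7.28): [u, v]^b = u^a ∂_a v^b − v^a ∂_a u^b
    return [sum(u[a]*sp.diff(v[b], coords[a]) - v[a]*sp.diff(u[b], coords[a])
                for a in range(2)) for b in range(2)]

# Jacobian of the barred (polar) coordinates with respect to Cartesian
xbar = [sp.sqrt(x**2 + y**2), sp.atan2(y, x)]
Jbar = [[sp.diff(xbar[b], cart[a]) for a in range(2)] for b in range(2)]

def polar_components(v):
    # v̄^b = (∂x̄^b/∂x^a) v^a, expressed as a function of (r, θ)
    return [sum(Jbar[b][a]*v[a] for a in range(2)).subs(to_xy) for b in range(2)]

# route 1: commute in Cartesian coordinates, then transform the result
lhs = polar_components(commutator(t1, t2, cart))
# route 2: transform both fields first, then commute in polar coordinates
rhs = commutator(polar_components(t1), polar_components(t2), polar)

# spot-check the equality (7.30) numerically at a generic point
pt = {r: 1.3, th: 0.7}
ok = all(abs(sp.N((lhs[b] - rhs[b]).subs(pt))) < 1e-9 for b in range(2))
print(ok)  # True
```

The second-derivative terms of the chart change are symmetric and cancel between the two halves of the commutator, which is exactly what the hint is pointing at.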
We can now use this to give a “coordinate free” definition of the exterior derivative of a
0-form f . Recall that T10 tensors are elements of the co-tangent space T ∗ , so they can be
thought of as linear mapping from the tangent space T into IR. That is we can define the
covariant vector g by its action g(t) on vectors t; as long as g(t) is linear in t we can use
the isomorphism ta ↔ t = ta ∂x∂ a to re-write t in terms of its components ta and thereby
deduce the existence of components ga such that
g(t) = ga ta (7.31)
In particular, for a 0-form f we can define the 1-form df by its action on an arbitrary vector t:
df(t) = t f    (7.32)
Now let’s generalize this construction... First, pick a particular coordinate chart, and
let xa (p) be the coordinates in this chart. Then we can certainly define a set of n 1-forms
dxa . Using these particular 1-forms as a basis for T ∗ define a generic 1-form as
g = ga dxa . (7.36)
where g_a are the components of a T^0_1 tensor — the g defined this way is independent of
the choice of coordinate system.
Warning: The basis one-forms dxa manifestly do depend on the coordinate chart. ♦
d\bar x^a = \frac{\partial \bar x^a}{\partial x^b}\, dx^b    (7.37)
♦
Now step back a little: When we defined Tsr tensors we phrased the definition in terms
of the “outer product” or “Cartesian product” of vectors in the tangent and cotangent
spaces. I just want to remind you that we had considered the “tensor product”
and in particular that for two 1-forms [covariant vectors] we had already decided that
things like
g1 ⊗ g 2 (7.39)
make sense — it’s an element of T^0_2.
This now lets us define the exterior product of two generic 1-forms as
g1 ∧ g 2 = g 1 ⊗ g 2 − g 2 ⊗ g 1 (7.40)
note the absence of any 1/2! (This is convention, but a useful convention.) Furthermore
for three one-forms
g1 ∧ g 2 ∧ g 3 = g 1 ⊗ g 2 ⊗ g 3 + g 2 ⊗ g 3 ⊗ g 1 + g 3 ⊗ g 1 ⊗ g 2
−g3 ⊗ g2 ⊗ g1 − g2 ⊗ g1 ⊗ g3 − g1 ⊗ g3 ⊗ g2 (7.41)
note the absence of any 1/3!, with the sum being over all six (3!) signed permutations of
the labels 1, 2, 3. More generally
\wedge_{i=1}^{N}\; g_i = \sum_{\pi} \mathrm{signum}(\pi)\;\; \otimes_{i=1}^{N}\; g_{\pi(i)}    (7.42)
where the sum is over all permutations π of the labels [1,N] — these are just labels, not
coordinate indices. Here signum(π) is +1 if π is an even permutation and −1 if π is an
odd permutation. In particular this lets us use dxa , dxa ∧ dxb , dxa ∧ dxb ∧ dxc , . . . as a
basis for the 1, 2, 3-forms. So we can write
d = dxa ∂a (7.44)
and then in “index free” notation
dF = d ∧ F (7.45)
so that
ddF = d ∧ d ∧ F = 0 (7.46)
or more formally
d2 = d ∧ d = 0 (7.47)
To see that this is equivalent to the index-based notation previously given, note that in
component notation, the “outer product” [also called “tensor product”] of two tensors of
type T^{r_1}_{s_1} and T^{r_2}_{s_2} is a tensor of type T^{r_1+r_2}_{s_1+s_2} defined by
\{X_1 \otimes X_2\}^{a_1 a_2 \ldots a_{r_1}\, b_1 b_2 \ldots b_{r_2}}{}_{c_1 c_2 \ldots c_{s_1}\, d_1 d_2 \ldots d_{s_2}} = \{X_1\}^{a_1 a_2 \ldots a_{r_1}}{}_{c_1 c_2 \ldots c_{s_1}}\; \{X_2\}^{b_1 b_2 \ldots b_{r_2}}{}_{d_1 d_2 \ldots d_{s_2}}    (7.48)
♦
Exercise: Show that the exterior product of two closed forms is closed. ♦
Are all closed forms exact? Locally yes; in a topologically trivial region all closed forms
are exact, but if the topology is nontrivial there may be obstructions. This leads you into
the theory of de Rham cohomology.
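The standard example behind de Rham cohomology can be checked in a few sympy lines (sympy assumed available): the angular 1-form on the punctured plane is closed but not exact, since its integral around a non-contractible loop is non-zero:

```python
import sympy as sp

x, y, t = sp.symbols('x y t')

# the angular 1-form ω = (−y dx + x dy)/(x² + y²) on the punctured plane
w = [-y/(x**2 + y**2), x/(x**2 + y**2)]

# closed: dω = (∂_x ω_y − ∂_y ω_x) dx∧dy = 0 wherever ω is defined
dw = sp.simplify(sp.diff(w[1], x) - sp.diff(w[0], y))
print(dw)  # 0

# ...but not exact: its integral around the unit circle is 2π, not 0
circle = {x: sp.cos(t), y: sp.sin(t)}
integrand = sp.simplify(w[0].subs(circle)*sp.diff(sp.cos(t), t)
                        + w[1].subs(circle)*sp.diff(sp.sin(t), t))
loop = sp.integrate(integrand, (t, 0, 2*sp.pi))
print(loop)  # 2*pi
```

If ω were df for a single-valued f, the loop integral would vanish; the non-zero answer is an element of the obstruction measured by H1.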
F1 ∼ F 2 ⇐⇒ F1 − F2 is exact (7.51)
Definition 58
Hs = {closed s-forms}/ ∼ (7.52)
is the s’th de Rham cohomology class.
Exercise: Show that the dimension of H1 equals the number of topologically distinct
[independent] non-contractible closed loops. [Not so easy, but still “elementary”] ♦
g_{[a,b]} = 0 \quad\Rightarrow\quad g_a = \partial_a f\;?    (7.53)
You have already seen problems like this in Euclidean space. In 3-dimensional Euclidean
space the analogous question is: When does
\vec\nabla \times \vec A = 0 \quad\Rightarrow\quad \vec A = \vec\nabla \Phi\;?    (7.54)
d\bar x^1 \wedge d\bar x^2 \wedge \cdots \wedge d\bar x^n = \det\!\left[ \frac{\partial \bar x^a}{\partial x^b} \right]\, dx^1 \wedge dx^2 \wedge \cdots \wedge dx^n    (7.56)
so that
f(x) \to \bar f(x) = \det\!\left[ \frac{\partial \bar x^a}{\partial x^b} \right]^{-1} f(x)    (7.57)
This means that f (x) is a scalar density. To easily check the plausibility of the assertion,
consider a diagonal matrix ∂xā /∂xa corresponding to a simple rescaling of coordinates.
Exercise: Prove this assertion in general using properties of the determinant and the
fact that the n-fold exterior product is completely antisymmetric. ♦
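A numerical sketch of (7.56) (numpy assumed available; a random matrix stands in for the Jacobian): the top component of the wedge of the barred basis 1-forms reproduces the Leibniz expansion of the determinant:

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n = 4
J = rng.normal(size=(n, n))  # stand-in for the Jacobian ∂x̄^ā/∂x^a

def perm_sign(p):
    # sign of a permutation given as a sequence of distinct integers
    s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

# (7.42) with g_i = dx̄^i: the (1, 2, ..., n) component of dx̄^1∧...∧dx̄^n
# in the dx basis is  Σ_π sgn(π) Π_k J[π(k), k]
component = sum(perm_sign(p) * np.prod([J[p[k], k] for k in range(n)])
                for p in itertools.permutations(range(n)))

# ...which is precisely the Leibniz expansion of det(∂x̄/∂x), confirming (7.56)
print(np.isclose(component, np.linalg.det(J)))  # True
```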
Note that while f(x) is not a scalar, the combination f(x)\, d^n x is coordinate independent.
Here d^n x is the ordinary Riemann measure (or Lebesgue measure) on IR^n. The definition
as given above makes sense only if Ω is a topologically trivial region that can be contained
within a single coordinate chart — but that is not really a significant restriction. If Ω is not
topologically trivial simply split it up into a number of regions Ωi that are topologically
trivial and define
\int_\Omega \omega = \sum_{\{i\}} \int_{\Omega_i} \omega    (7.61)
If you want even more technical precision you can set up a “subordinate partition of
unity” using the assumed second countability (or equivalently, paracompactness) of the
underlying manifold.
Proof: Note that the LHS integrates an n-form over a n-dimensional region while the
RHS integrates a (n − 1)-form over a (n − 1)-dimensional region.
Let’s work for now with a n-dimensional rectangular region Ω ∼ [Ai , Bi ]n . Then in
components
\{d\omega\}_{a_1 a_2 \ldots a_n} = n\, \omega_{[a_1 a_2 \ldots a_{n-1}, a_n]} = \sum_{j=1}^{n} \partial_{a_j}\, \omega_{a_{j+1} \ldots a_n\, a_1 \ldots a_{j-1}}    (7.64)
and
\int_\Omega d\omega = \int_\Omega f(x)\, d^n x = \int_\Omega \sum_{j=1}^{n} \partial_j\, \omega_{(j+1)\ldots n\, 1\ldots(j-1)}\; d^n x    (7.66)
Each one of the terms on the RHS consists of an (n − 1) form being integrated over one
n − 1 dimensional face of the rectangular region. That is
\int_\Omega d\omega = \sum_{\mathrm{faces}} \int_{\mathrm{face}} \omega = \int_{\partial\Omega} \omega    (7.68)
For general regions merely approximate the general region with an increasingly fine
mesh of rectangular regions — you have seen such techniques often enough before in the
undergraduate versions of Gauss, Stokes, and Green’s theorem... QED
Comment: The only tricky part of the proof is keeping the signs correct. ♦
But
\int_{\partial\Omega} \omega = \int_{\partial\Omega} \omega_a\, dx^a = \oint \omega_a\, dx^a    (7.70)
(Warning: Note that the penultimate “d” is quite distinct from the ultimate “d”. The
penultimate “d” is an exterior differential operator acting on the coordinate xa . The
ultimate “d” is an ordinary differential being used for a real integration. The notation is
unfortunately standard.) That is
\int_\Omega [\omega_{y,x} - \omega_{x,y}]\; d(\mathrm{Area}) = \oint \omega_a\, dx^a    (7.71)
or even
\int_\Omega [\omega_{y,x} - \omega_{x,y}]\; d(\mathrm{Area}) = \oint \vec\omega \cdot d\vec x    (7.72)
which you should remember from Math 206. As usual arbitrary areas can now be built
up by taking unions of squares of different sizes.
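For those who like numerical evidence, here is a small check of the Green's theorem statement on the unit square (numpy assumed available; the components ω_x, ω_y are arbitrary illustrative choices):

```python
import numpy as np

# ω = ω_x dx + ω_y dy with concrete components
wx = lambda x, y: x * y**2
wy = lambda x, y: np.sin(x) + y

# area integral: ∬_Ω (∂_x ω_y − ∂_y ω_x) dA over [0,1]², midpoint rule
N = 400
h = 1.0 / N
c = (np.arange(N) + 0.5) * h
X, Y = np.meshgrid(c, c, indexing='ij')
curl = np.cos(X) - 2*X*Y          # ∂_x ω_y − ∂_y ω_x computed by hand
lhs = curl.sum() * h * h

# boundary integral: ∮_∂Ω ω_a dx^a, four edges traversed counterclockwise
t = c
rhs = (wx(t, 0.0).sum()*h         # bottom: y = 0, dx > 0
       + wy(1.0, t).sum()*h       # right:  x = 1, dy > 0
       - wx(t, 1.0).sum()*h       # top:    y = 1, dx < 0
       - wy(0.0, t).sum()*h)      # left:   x = 0, dy < 0

print(abs(lhs - rhs) < 1e-4)  # True
```

Refining the mesh (larger N) drives the two sides together, mirroring the "increasingly fine mesh of rectangular regions" argument in the proof above.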
(Warning: In these last two equations both “d”s are ordinary differentials being used for
area and line integrations.) ♦
But
\int_{\partial\Omega} \omega = \int_{\partial\Omega} \omega_a\, dx^a = \oint \omega_a\, dx^a    (7.74)
That is
\int_\Omega [\omega_{y,x} - \omega_{x,y}]\; d(\mathrm{Area}) = \oint \omega_a\, dx^a    (7.75)
But now let’s write d\vec S = \hat n\, d(\mathrm{Area}) with \hat n being the unit normal to the surface; in the
present coordinates this lies in the z direction. Then we can write the surface integral as
\int_\Omega (\vec\nabla \times \vec\omega) \cdot d\vec S = \oint \vec\omega \cdot d\vec x    (7.76)
which you should remember from Math 206. As usual arbitrary areas can now be built
up by taking unions of squares of different sizes and orientations. ♦
Comment: Thus the generalized Stokes theorem takes the ordinary Stokes theorem and
extends it to arbitrary manifolds in arbitrarily many dimensions — and the generalized
Stokes theorem does not even need to make use of the metric tensor. ♦
and
δ a1 a2 a3 b1 b2 b3 = 3! δ [a1 b1 δ a2 b2 δ a3 ] b3 (7.79)
etc...
Exercise: Why does the series of tensors terminate at the n’th step? ♦
Exercise: Show that these tensors are all anti-symmetric on their lower indices; anti-
symmetry in the upper indices is true by construction. ♦
signum(a1 a2 a3 . . . an ) (7.84)
then it is easy to show that X and Y are both tensor densities, but of opposite weights.
What are the transformation properties? ♦
Warning: So far this chapter on exterior derivatives has not made use of the metric
tensor. Once you have the metric tensor available to add extra structure even more in-
teresting things begin to happen. ♦
Suppose we have a metric tensor available, with components gab , then we can certainly
calculate its determinant g = det[gab ].
Note: If you are in Euclidean space using Cartesian coordinates you can afford to forget
the \sqrt{g}. ♦
Exercise: If you do not have a metric tensor available, any non-singular T^0_2 tensor
would be good enough — just make sure you have a good reason for whatever particular
choice of T^0_2 tensor you decide on. ♦
Exercise: Now raise all the indices on the Levi–Civita tensor by using the inverse metric
g^{ab}. Show that
\epsilon^{a_1 a_2 a_3 \ldots a_n} = \frac{1}{\sqrt{\det[g_{ab}]}}\; \mathrm{signum}(a_1 a_2 a_3 \ldots a_n)    (7.88)
Check that this really does transform as a T^n_0 tensor. Show that
\epsilon^{a_1 a_2 a_3 \ldots a_n} = \frac{(-1)^S}{\sqrt{|\det[g_{ab}]|}}\; \mathrm{signum}(a_1 a_2 a_3 \ldots a_n)    (7.94)
\epsilon^{a_1 a_2 a_3 \ldots a_n}\; \epsilon_{a_1 a_2 a_3 \ldots a_n} = (-1)^S\, n!    (7.95)
Unfortunately in physical applications these minus signs are often important. (Which is
why I’ve gone to the trouble to keep track of them.) ♦
Though somewhat tedious these identities are often tremendously useful in component
calculations. ♦
Exercise: What are the relevant formulae in general signature? It is easiest to work in
a coordinate system in which the metric is diagonal at the point in question. ♦
Exercise: Explicitly write down, and keep handy, the specialization of these identities
to 2, 3, and 4 dimensions. ♦
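In 3 Euclidean dimensions these contraction identities are easy to verify numerically (numpy assumed available):

```python
import itertools
import numpy as np

# Levi-Civita symbol in 3 dimensions
eps = np.zeros((3, 3, 3))
for p in itertools.permutations(range(3)):
    sign = 1
    for i in range(3):
        for j in range(i + 1, 3):
            if p[i] > p[j]:
                sign = -sign
    eps[p] = sign

delta = np.eye(3)

# epsilon^{abc} epsilon_{abc} = 3!
print(np.einsum('abc,abc->', eps, eps))  # 6.0

# epsilon^{abc} epsilon_{dbc} = 2! delta^a_d
print(np.allclose(np.einsum('abc,dbc->ad', eps, eps), 2 * delta))  # True

# epsilon^{abc} epsilon_{dec} = delta^a_d delta^b_e - delta^a_e delta^b_d
lhs = np.einsum('abc,dec->abde', eps, eps)
rhs = (np.einsum('ad,be->abde', delta, delta)
       - np.einsum('ae,bd->abde', delta, delta))
print(np.allclose(lhs, rhs))  # True
```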
Exercise: Show that [using the Christoffel connexion] the Levi–Civita tensor is covari-
antly constant
\nabla_b\, \epsilon_{a_1 a_2 a_3 \ldots a_n} = 0 = \epsilon_{a_1 a_2 a_3 \ldots a_n\, ; b}    (7.103)
It’s easiest to do this in locally geodesic coordinates. ♦
Once we have the Levi–Civita tensor available, we can introduce the “Hodge star opera-
tion” which maps s-forms to (n − s)-forms.
Definition 61
\{*F\}_{a_1 a_2 \ldots a_{n-s}} = \frac{1}{s!}\; \epsilon_{a_1 a_2 \ldots a_{n-s}}{}^{b_1 b_2 \ldots b_s}\; \{F\}_{b_1 b_2 \ldots b_s}    (7.104)
Note that the metric is hiding both in the definition of the Levi–Civita tensor and in the
raising and lowering of indices on this tensor.
*1 = \epsilon\,; \qquad *\epsilon = 1.    (7.105)
That is, the number 1 (a particular zero-form) is mapped into the Levi-Civita n-form,
while the Levi-Civita n-form is mapped into the zero-form 1. ♦
*1 = \epsilon\,; \qquad *\epsilon = (-1)^S.    (7.106)
♦
Exercise: Show that if the metric has S negative eigenvalues and ω is any n-form that
*\omega = (-1)^S\; \frac{\omega_{123\ldots n}}{\sqrt{|\det[g_{ab}]|}}    (7.107)
and consequently
\omega_{123\ldots n} = (-1)^S\, \sqrt{|\det[g_{ab}]|}\;\; *\omega    (7.108)
♦
Applying the Hodge star twice maps s-forms back into s-forms. Explicitly
\{*\,{*F}\}_{c_1 c_2 \ldots c_s} = \frac{1}{(n-s)!\; s!}\; \epsilon_{c_1 c_2 \ldots c_s}{}^{a_1 a_2 \ldots a_{n-s}}\; \epsilon_{a_1 a_2 \ldots a_{n-s}}{}^{b_1 b_2 \ldots b_s}\; \{F\}_{b_1 b_2 \ldots b_s}    (7.109)
But then
\{*\,{*F}\}_{c_1 c_2 \ldots c_s} = \frac{1}{s!}\; (-1)^{s(n-s)}\; \delta^{b_1 b_2 \ldots b_s}{}_{c_1 c_2 \ldots c_s}\; \{F\}_{b_1 b_2 \ldots b_s}    (7.110)
The (-1)^{s(n-s)} comes from re-arranging the order of indices
so that you can apply the results of the previous section. But then
and so
*\,{*F} = (-1)^{s(n-s)}\, F    (7.113)
In particular for n odd (such as n = 3) the exponent s(n-s) is always even, so
*\,{*F} = F    (7.114)
while for n even
*\,{*F} = (-1)^s\, F    (7.115)
Exercise: Show that if the metric has S negative eigenvalues the general situation is
∗ ∗F = (−1)S+s(n−s) F (7.116)
Exercise: Write down the pattern for pseudo-Riemannian manifolds [Lorentzian signa-
ture] S = 1 in (1+1), (2+1), and (3+1) dimensions. ♦
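A minimal numerical sketch of ∗∗F = (−1)^{s(n−s)} F in Euclidean 3-space (numpy assumed available; here S = 0, n = 3, s = 1, so ∗∗ is the identity on 1-forms):

```python
import itertools
import numpy as np

# Levi-Civita symbol in 3 Euclidean dimensions
eps = np.zeros((3, 3, 3))
for p in itertools.permutations(range(3)):
    sign = 1
    for i in range(3):
        for j in range(i + 1, 3):
            if p[i] > p[j]:
                sign = -sign
    eps[p] = sign

rng = np.random.default_rng(2)
V = rng.normal(size=3)  # an arbitrary 1-form (s = 1, n = 3)

star_V = np.einsum('abc,c->ab', eps, V)                  # (7.104) with s = 1: a 2-form
star_star_V = 0.5 * np.einsum('abc,bc->a', eps, star_V)  # apply * again: back to a 1-form

print(np.allclose(star_star_V, V))  # True, since (-1)^{1*(3-1)} = +1
```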
The 0-forms
∗ (F ∧ ∗F ); ∗(F1 ∧ ∗F2 ) (7.117)
are particularly useful.
*\,(F_1 \wedge *F_2) = (-1)^{s(n-s)}\; \frac{(n-s)!}{n!}\; \{F_1\}^{a_1 a_2 \ldots a_s}\; \{F_2\}_{a_1 a_2 \ldots a_s}    (7.118)
Proof:
*\,(F_1 \wedge *F_2) = \frac{1}{n!\; s!}\; \epsilon^{a_1 a_2 \ldots a_n}\; \{F_1\}_{a_1 a_2 \ldots a_s}\; \epsilon_{a_{s+1} a_{s+2} \ldots a_n}{}^{b_1 b_2 \ldots b_s}\; \{F_2\}_{b_1 b_2 \ldots b_s}    (7.119)
= (-1)^{s(n-s)}\; \frac{(n-s)!}{n!\; s!}\; \delta^{a_1 a_2 \ldots a_s}{}_{b_1 b_2 \ldots b_s}\; \{F_1\}_{a_1 a_2 \ldots a_s}\; \{F_2\}^{b_1 b_2 \ldots b_s}    (7.120)
= (-1)^{s(n-s)}\; \frac{(n-s)!}{n!}\; \delta^{[a_1}{}_{b_1}\, \delta^{a_2}{}_{b_2} \cdots \delta^{a_s]}{}_{b_s}\; \{F_1\}_{a_1 a_2 \ldots a_s}\; \{F_2\}^{b_1 b_2 \ldots b_s}    (7.121)
= (-1)^{s(n-s)}\; \frac{(n-s)!}{n!}\; \{F_1\}^{a_1 a_2 \ldots a_s}\; \{F_2\}_{a_1 a_2 \ldots a_s}    (7.122)
QED
*\,(F_1 \wedge *F_2) = (-1)^{S+s(n-s)}\; \frac{(n-s)!}{n!}\; \{F_1\}^{a_1 a_2 \ldots a_s}\; \{F_2\}_{a_1 a_2 \ldots a_s}    (7.123)
♦
and show that this inner product is symmetric. In a Riemannian manifold it will also
be positive definite and satisfy the triangle inequality — these properties may fail if the
metric has negative eigenvalues. ♦
Now that we have the exterior derivative d and the Hodge star operation ∗, we can define
a new differential operation on s-forms, the divergence.
Comment: The phase (−1)S+n(s+1) is there for convenience elsewhere in the develop-
ment — you will not go too far wrong in ignoring it to get a qualitative feel for what is
going on. ♦
Lemma 25 With this choice of phase the divergence δ is adjoint to the exterior derivative
d in the sense that
\langle \alpha, d\beta \rangle = \langle \delta\alpha, \beta \rangle    (7.127)
or in terms of the underlying integral
\int \alpha \wedge (*\, d\beta) = \int \delta\alpha \wedge (*\beta)    (7.128)
Proof:
\int \delta\alpha \wedge (*\beta) = (-1)^{S+n(s+1)+1} \int (*\, d\, {*\alpha}) \wedge (*\beta)    (7.129)
= (-1)^{S+n(s+1)+1} \int (d\, {*\alpha}) \wedge (*\,{*\beta})    (7.130)
= (-1)^{S+n(s+1)+1+S+s(n-s)} \int (d\, {*\alpha}) \wedge \beta    (7.131)
= (-1)^{n-s+1} \int (d\, {*\alpha}) \wedge \beta    (7.132)
Now use the integration by parts formula and assume either no boundary, ∂Ω = 0, or
suitable falloff boundary conditions. Then
\int \delta\alpha \wedge (*\beta) = (-1)^{(n-s)+1+(n-s)+1} \int (*\alpha) \wedge (d\beta)    (7.133)
= \int \alpha \wedge (*\, d\beta)    (7.134)
Lemma 26
δδF = 0 (7.136)
Proof:
δδF = ∗ d ∗ ∗ d ∗ F ∝ ∗ dd ∗ F = 0 (7.137)
QED
Here we have first used the complete anti-symmetry on a1 a2 . . . an−s+1 to replace the
partial derivative with the Christoffel covariant derivative, and then used the covariant
constancy of the Levi–Civita tensor to move the Levi–Civita tensor outside the brackets.
Finally I have rearranged the indices. Now we can contract over the indices of the two
Levi–Civita tensors to get
\{\delta F\}_{c_1 c_2 \ldots c_{s-1}} = (-1)^{S+n(s+1)+1+(n-s)+s(n-s)+S}\; \frac{(n-s)!}{(n-s)!\; s!}\; \delta^{b_1 b_2 \ldots b_s}{}_{c_1 c_2 \ldots c_s}\; g^{c_s d}\; \{F\}_{b_1 b_2 \ldots b_s ; d}    (7.143)
= (-1)^{1}\; \frac{(n-s)!\; s!}{(n-s)!\; s!}\; \delta^{[b_1}{}_{c_1}\, \delta^{b_2}{}_{c_2} \cdots \delta^{b_s]}{}_{c_s}\; g^{c_s d}\; \{F\}_{b_1 b_2 \ldots b_s ; d}    (7.144)
= -\; g^{c_s d}\; F_{c_1 c_2 \ldots c_s ; d}    (7.145)
= -\; F_{c_1 c_2 \ldots c_{s-1} a ; b}\; g^{ab}    (7.146)
= -\; F_{c_1 c_2 \ldots c_{s-1} a}{}^{;a}    (7.147)
So up to a multiplicative factor, this really is the covariant divergence of the s-form. QED
On the other hand, go to the boundary and compute \int_{\partial\Omega} *\omega. It is useful to adopt
normal coordinates on the boundary such that
When we restrict this to the boundary ∂Ω and introduce a normal vector na we have, in
terms of the induced (n − 1)-dimensional Levi–Civita tensor,
Therefore
\int_{\partial\Omega} *\omega = \int_{\partial\Omega} (\omega_m n^m)\, \sqrt{|\det[g_{ij}]|}\; d^{n-1}x    (7.159)
Now defining
d(\mathrm{Volume}) = \sqrt{|\det[g_{ab}]|}\; d^n x    (7.160)
and
d(\mathrm{Area}) = \sqrt{|\det[\tilde g_{ij}]|}\; d^{n-1} x    (7.161)
we have
\int_\Omega \omega^a{}_{;a}\; d(\mathrm{Volume}) = \int_{\partial\Omega} \omega_a\, n^a\; d(\mathrm{Area})    (7.162)
which we recognize as Gauss’ theorem.
The original Gauss theorem was proved in flat Euclidean 3-space. This current version
applies to any curved manifold with a metric — note particularly the occurrence of the
Christoffel connexion in the covariant divergence on the LHS.
A particularly nice way of making sure you understand differential forms and exterior
derivatives is to work in flat Euclidean three-space where everything reduces to vector
cross products and the “curl” [“rot”] operation on vector fields. Once the Hodge star is
introduced you can reconstruct ordinary dot products and the divergence operator.
• zero-forms — scalars — Φ;
• one-forms — vectors — Va ;
\delta^{a_1 a_2}{}_{b_1 b_2} = 2!\; \delta^{[a_1}{}_{b_1}\, \delta^{a_2]}{}_{b_2}    (7.168)
and
\delta^{a_1 a_2 a_3}{}_{b_1 b_2 b_3} = 3!\; \delta^{[a_1}{}_{b_1}\, \delta^{a_2}{}_{b_2}\, \delta^{a_3]}{}_{b_3}    (7.169)
Equivalently (this was an exercise)
\delta^{a_1 a_2}{}_{b_1 b_2} = \delta^{a_1}{}_{b_1}\, \delta^{a_2}{}_{b_2} - \delta^{a_2}{}_{b_1}\, \delta^{a_1}{}_{b_2} = \det\begin{bmatrix} \delta^{a_1}{}_{b_1} & \delta^{a_1}{}_{b_2} \\ \delta^{a_2}{}_{b_1} & \delta^{a_2}{}_{b_2} \end{bmatrix}    (7.170)
and
\delta^{a_1 a_2 a_3}{}_{b_1 b_2 b_3} = \det\begin{bmatrix} \delta^{a_1}{}_{b_1} & \delta^{a_1}{}_{b_2} & \delta^{a_1}{}_{b_3} \\ \delta^{a_2}{}_{b_1} & \delta^{a_2}{}_{b_2} & \delta^{a_2}{}_{b_3} \\ \delta^{a_3}{}_{b_1} & \delta^{a_3}{}_{b_2} & \delta^{a_3}{}_{b_3} \end{bmatrix}    (7.171)
Because we are in 3-d the series of tensors stops here. Because the signature of the metric
is + + + we do not need to worry about any (−1)S factors.
Now in 3-d the Hodge star operation maps two-forms into one-forms and three-forms
into zero forms — so all exterior forms can ultimately be reduced to scalars and vectors.
Explicitly:
•
\{*\Phi\}_{abc} = \Phi\; \epsilon_{abc}    (7.172)
•
\{*V\}_{ab} = \epsilon_{ab}{}^{c}\, V_c = \begin{bmatrix} 0 & V_z & -V_y \\ -V_z & 0 & V_x \\ V_y & -V_x & 0 \end{bmatrix}_{ab}    (7.173)
•
\{*F\}_a = \frac{1}{2!}\; \epsilon_a{}^{bc}\, F_{bc} = [F_{yz},\, F_{zx},\, F_{xy}]_a    (7.174)
•
\{*G\} = \frac{1}{3!}\; \epsilon^{abc}\, G_{abc} = G_{xyz}    (7.175)
then
{∗(f ∧ g)}a = [fy gz − fz gy , fz gx − fx gz , fx gy − fy gx ]a (7.178)
But this last set of components is immediately recognizable as that of the vector cross
product in 3-dimensions. By contracting with an appropriate basis for the one-forms
dxa = {dx, dy, dz}, this can be written in index free notation as
∗ (f ∧ g) = f × g (7.179)
or equivalently
f ∧ g = ∗(f × g) (7.180)
Of course this is the primary reason so much work went into the development of the exte-
rior product of differential forms — they are the natural multi-dimensional generalization
of the notion of “vector cross product”.
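The identity ∗(f∧g) = f×g is easy to verify numerically (numpy assumed available; random components are purely illustrative):

```python
import itertools
import numpy as np

# Levi-Civita symbol in 3 Euclidean dimensions
eps = np.zeros((3, 3, 3))
for p in itertools.permutations(range(3)):
    sign = 1
    for i in range(3):
        for j in range(i + 1, 3):
            if p[i] > p[j]:
                sign = -sign
    eps[p] = sign

rng = np.random.default_rng(3)
f = rng.normal(size=3)
g = rng.normal(size=3)

# f ∧ g per (7.40): {f∧g}_{ab} = f_a g_b − f_b g_a
wedge_fg = np.outer(f, g) - np.outer(g, f)

# (7.174): {∗(f∧g)}_a = (1/2!) ε_a^{bc} {f∧g}_{bc}
star_wedge = 0.5 * np.einsum('abc,bc->a', eps, wedge_fg)

print(np.allclose(star_wedge, np.cross(f, g)))  # True
```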
and calculate
{∗(df )}a = [∂y fz − ∂z fy , ∂z fx − ∂x fz , ∂x fy − ∂y fx ]a (7.182)
But this last set of components is immediately recognizable as that of the “curl” of a
vector field in 3-dimensions; also called “rot”. In index free notation
*(df) = \nabla \times f = \mathrm{curl}\, f = \mathrm{rot}\, f    (7.183)
or equivalently
df = *(\nabla \times f) = *(\mathrm{curl}\, f) = *(\mathrm{rot}\, f)    (7.184)
\delta F = (-1)^{S+n(s+1)+1}\; *\, d\, {*F}    (7.185)
On one-forms it yields
\delta f = -\,\mathrm{div}\, f    (7.186)
while on two-forms
\delta F = *\, d\, {*F} = \mathrm{curl}(*F)    (7.187)
Exercise: Verify
\delta f = -\,\mathrm{div}\, f    (7.188)
directly in terms of the 3-d components (instead of invoking the general n-dimensional
result as we have done above). ♦
\vec\nabla \times (\vec\nabla f) = 0 = \mathrm{curl}(\mathrm{grad}\, f)    (7.189)
is a specialization of ddf = 0. ♦
\vec\nabla \cdot (\vec\nabla \times f) = 0 = \mathrm{div}(\mathrm{curl}\, f)    (7.190)
is a specialization of ddf = 0. ♦
∗ (f ∧ ∗g) = f · g (7.191)
where the RHS is the inner product [dot product] of the two vectors. ♦
where the product on the RHS refers to matrix multiplication of the two antisymmetric
matrices. ♦
Exercise: Look up every vector and differential identity you can find for 3-d Euclidean
space and translate them into the language of differential forms. ♦
7.12 Laplacian
or equivalently
∇2 f = −∇ × ∇ × f + ∇(∇ · f ) (7.194)
∇2 f = −curl curl f + grad (divf ) (7.195)
If we write this in terms of d and δ
while
grad(divf ) = grad(−δf ) = −dδf (7.197)
That is:
∇2 f = − (δd + dδ) f (7.198)
But the LHS has been defined only for 3-d one-forms on Euclidean space in Cartesian co-
ordinates, while the RHS now can be meaningfully defined on arbitrary forms on arbitrary
curved manifolds. This motivates the definition, for all s-forms in arbitrary dimension:
∆F = (d + δ)2 F (7.200)
♦
Now that we have developed the language of exterior differential forms, it can be used to
tidy up some of the formulae for the Riemann tensor.
[transport(y→x;γ) ]• • , (7.201)
Now using the notation of differential forms we can introduce a tensor-valued one-form
•
Γ • by writing:
\Gamma^\bullet{}_\bullet = \Gamma^\bullet{}_{\bullet m}\; dx^m = \left. \frac{\partial}{\partial y^m}\, \{\mathrm{transport}[y \to x; \gamma]\}^\bullet{}_\bullet \right|_{y \to x} dx^m.    (7.203)
This two-form contains all the information encoded in the Riemann tensor. The definition
of the Riemann tensor in terms of Γ• • is now written as
R• • = dΓ• • − Γ• • ∧ Γ• • (7.205)
and the Weitzenbock identities take the simple and elegant form
dR• • + R• • ∧ Γ• • − Γ• • ∧ R• • = 0. (7.206)
Warning: This is not exactly the same as MTW’s use of curvature forms as developed on
pages 354 ff. MTW make explicit use of the n-bein formalism and must distinguish mani-
fold indices from n-bein indices. [In physicist’s language, they must distinguish spacetime
indices from local Lorentz indices.] In the current formalism everything is done with
manifold indices and coordinate charts — the •s are just placeholders for ordinary tensor
indices. ♦
Chapter 8
Lie derivatives
The “Lie derivative” is the third standard way of producing covariant objects from partial
derivatives of a tensor; this time without invoking any connexion [or parallel transport]
at all. Of course, as usual there is a price to pay.
Generically partial derivatives of tensor quantities are not themselves tensors. We have
investigated in detail two ways of dealing with this issue — through the covariant deriva-
tive and via the exterior derivative. There is a third standard way of building tensor
quantities out of partial derivatives, known as the Lie derivative. (Named after Sophus
Lie, he of Lie groups, Lie algebras, etc.)
If you have been paying attention, you might have noticed the first hint of the Lie
derivative sneak by as we talked about the commutator of two vector fields. Recall that
if ta1 (x) and ta2 (x) are two contravariant vector fields then the commutator, defined as
We will now use this quantity to iteratively define a new type of derivative on Tsr tensors.
For a change, I will adopt the axiomatic approach.
Axiom 1 The Lie derivative L_v of a tensor with respect to a contravariant vector field v
is a linear mapping
L_v : T^r_s \to T^r_s    (8.2)
that is also linear in the argument v and satisfies the Leibnitz rule on Cartesian products
of tensors
Lv (X1 ⊗ X2 ) = (Lv X1 ) ⊗ X2 + X1 ⊗ (Lv X2 ) (8.3)
That is:
L_{v_1 + v_2} X = L_{v_1} X + L_{v_2} X    (8.4)
L_v (X_1 + X_2) = L_v X_1 + L_v X_2    (8.5)
So far, this just specifies the general structure of the derivative, but does not specify a
particular notion of derivative. To do that we add two additional axioms:
From these axioms we can now deduce the behaviour of Lv on arbitrary Tsr tensors.
Note, in this “third route” to a tensorial notion of derivative, the use of the auxiliary
vector field v. Essentially the decision to limit the directions in which we are differentiating
is the “extra ingredient” that lets us get somewhere. Compare to the covariant derivative,
where we added the extra structure of parallel transport, and the exterior derivative, where
we restricted the set of tensors to be differentiated.
To see what happens on a T^r_0 tensor recall that any such tensor is an element of T^r and
can be written as a linear combination of terms of the form
t1 ⊗ t 2 ⊗ t 3 ⊗ · · · ⊗ t r (8.8)
= [v, t_1] \otimes t_2 \otimes t_3 \otimes \cdots \otimes t_r
+ t_1 \otimes [v, t_2] \otimes t_3 \otimes \cdots \otimes t_r
+ \cdots
+ t_1 \otimes t_2 \otimes t_3 \otimes \cdots \otimes [v, t_r]    (8.10)
In components this means
\{L_v(t_1 \otimes t_2 \otimes t_3 \otimes \cdots \otimes t_r)\}^{a_1 a_2 \ldots a_r} = (L_v t_1)^{a_1}\, t_2^{a_2}\, t_3^{a_3} \cdots t_r^{a_r}
+ t_1^{a_1}\, (L_v t_2)^{a_2}\, t_3^{a_3} \cdots t_r^{a_r}
+ \cdots
+ t_1^{a_1}\, t_2^{a_2}\, t_3^{a_3} \cdots (L_v t_r)^{a_r}    (8.11)
= (v\partial t_1 - t_1 \partial v)^{a_1}\, t_2^{a_2}\, t_3^{a_3} \cdots t_r^{a_r}
+ t_1^{a_1}\, (v\partial t_2 - t_2 \partial v)^{a_2}\, t_3^{a_3} \cdots t_r^{a_r}
+ \cdots
+ t_1^{a_1}\, t_2^{a_2}\, t_3^{a_3} \cdots (v\partial t_r - t_r \partial v)^{a_r}    (8.12)
= v^m\, \partial_m \big( t_1^{a_1}\, t_2^{a_2}\, t_3^{a_3} \cdots t_r^{a_r} \big)
- \sum_{i=1}^{r} \Big[ \prod_{j=1}^{i-1} t_j^{a_j} \Big]\; t_i^{b}\, \partial_b v^{a_i}\; \Big[ \prod_{j=i+1}^{r} t_j^{a_j} \Big]    (8.13)
Example: If r = 2
\{L_v X\}^{a_1 a_2} = v^m\, \partial_m X^{a_1 a_2} - \left\{ X^{b a_2}\, \partial_b v^{a_1} + X^{a_1 b}\, \partial_b v^{a_2} \right\}    (8.15)
♦
Example: If r = 3
\{L_v X\}^{a_1 a_2 a_3} = v^m\, \partial_m X^{a_1 a_2 a_3} - \left\{ X^{b a_2 a_3}\, \partial_b v^{a_1} + X^{a_1 b a_3}\, \partial_b v^{a_2} + X^{a_1 a_2 b}\, \partial_b v^{a_3} \right\}    (8.16)
♦
By combining the axioms for the Lie derivative on contravariant vectors, on scalars, and
the Leibnitz rule, we can deduce the effect of a Lie derivative on covariant
vectors. Recall that for a contravariant vector t and covariant vector g we construct a
scalar \langle g | t \rangle via the pairing
But then
Lv tr(g ⊗ t) = trLv (g ⊗ t) = tr{(Lv g) ⊗ t + g ⊗ (Lv t)} (8.18)
So that
tr{(Lv g) ⊗ t} = Lv tr(g ⊗ t) − tr{g ⊗ [v, t]} (8.19)
In components
(Lv g)a ta = v m ∂m (ga ta ) − ga (v m ∂m ta − tm ∂m v a ) (8.20)
So that
(Lv g)a ta = v m (∂m ga )ta + v m (∂m ta )ga − ga (v m ∂m ta − tm ∂m v a ) (8.21)
The terms involving ∂t cancel, leaving
(L_v g)_a = v^m\, \partial_m g_a + g_m\, \partial_a v^m
By considering terms of the form g^1_{a_1} g^2_{a_2}, you can bootstrap this to T^0_2 tensors and beyond.
♦
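As a consistency check on the Leibnitz rule used above (sympy assumed available; the fields v, t, g are arbitrary illustrative choices), one can verify that the component formulas for L_v on vectors and covectors combine to give the ordinary directional derivative on the scalar pairing g_a t^a:

```python
import sympy as sp

x0, x1 = coords = sp.symbols('x0 x1')

# concrete illustrative fields: vector v, vector t, covector g
v = [x0*x1, x1**2]
t = [sp.sin(x0), x0 + x1]
g = [x1, sp.exp(x0)]

def lie_vector(v, t):
    # (Lv t)^a = v^m ∂_m t^a − t^m ∂_m v^a, i.e. the commutator [v, t]
    return [sum(v[m]*sp.diff(t[a], coords[m]) - t[m]*sp.diff(v[a], coords[m])
                for m in range(2)) for a in range(2)]

def lie_covector(v, g):
    # (Lv g)_a = v^m ∂_m g_a + g_m ∂_a v^m
    return [sum(v[m]*sp.diff(g[a], coords[m]) + g[m]*sp.diff(v[m], coords[a])
                for m in range(2)) for a in range(2)]

# Leibnitz rule on the scalar pairing: Lv(g_a t^a) = (Lv g)_a t^a + g_a (Lv t)^a
scalar = sum(g[a]*t[a] for a in range(2))
lhs = sum(v[m]*sp.diff(scalar, coords[m]) for m in range(2))  # Lv on a scalar
rhs = sum(lie_covector(v, g)[a]*t[a] + g[a]*lie_vector(v, t)[a]
          for a in range(2))

print(sp.simplify(lhs - rhs))  # 0
```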
(Lv X)abc = v m (∂m Xabc ) + {Xmbc (∂a v m ) + Xamc (∂b v m ) + Xabm (∂c v m )} (8.26)
Exercise: Combine with the previous section and find the appropriate generalization
to Tsr tensors. ♦
This is a “low-brow” approach to the Lie derivative — it has been defined recursively
in terms of basic rules for scalars and covariant vectors, and the need for preservation of
some basic features of anything that we would like to call a derivative. There is, as yet,
no geometrical interpretation of the Lie derivative.
The underlying geometry of the Lie derivative is the act of “dragging” geometrical quan-
tities along integral curves of a vector field. Suppose we are given some vector field v(x),
then we can define the integral curves via
\frac{dx^a}{d\lambda} = v^a(x^b(\lambda))    (8.27)
There is no requirement that these curves be geodesics, or in any way nice — as long
as they are sufficiently smooth. Now moving a certain parameter distance λ along these
curves defines a set of mappings from the manifold to itself
φλ : M → M (8.28)
These mappings essentially “drag” all points x on the manifold a “distance” λ along the
integral curve and may be represented (for suitably small λ) as coordinate functions
x^a(x_0; v; λ) = x_0^a + v^a(x_0) λ + O[λ²] (8.29)
Instead of dragging a point we can ask what it means to drag a scalar function; a suitable definition is:
f ∗ (x0 ; λ) = f (x(x0 ; v; λ)) (8.30)
That is, the dragged function f ∗ at the point x0 is defined to be the original function at
the point that x0 is dragged to by the action of φλ .
But now we can ask what it means to drag a vector field; let t(x) be a second vector
field, defined by a second set of integral curves
dx^a/dµ = t^a(x^b(µ)) (8.31)
with solution
t̃(λ) : xa (xb0 ; t(x); µ) = xa0 + ta (x0 )µ + O[µ2 ] (8.32)
Each one of these curves, when acted on by φλ will be pushed a certain distance along
the “v direction”; yielding a new set of curves
Here µ is the parameter along the t̃∗ curve; λ is the distance the whole collection of t
curves has been pulled along the v direction. But the curves t̃∗ (µ; λ) themselves have
tangent vectors, and so define a new set of tangent vector fields
t∗ (x; λ) (8.35)
that depend on the parameter λ and reside at the point x(x_0; v; λ) — this is the “Lie
dragged vector field”. Now you can define
Lv t = lim_{λ→0} [ t(x(x_0; v; λ)) − t*(x; λ) ] / λ (8.36)
secure in the knowledge that it is at least a contravariant vector. Note that we have been
very careful to ensure that t(x(x0 ; v; λ)) and t∗ (x; λ) both reside in the same tangent space
[at x(x0 ; v; λ)].
Of course it remains to be shown that this agrees with the previous definition. Now in
components we have
[t*(x; λ)]^a = d x^a( x^b(x_0; t; µ); v(x^b(x_0; t; µ)); λ ) / dµ (8.37)
which we can evaluate as
[t*(x; λ)]^a = t^a(x_0) + t^b(x_0) ∂_b v^a|_{x_0} λ + O[λ²] (8.38)
while
t^a(x(x_0; v; λ)) = t^a(x_0^b + v^b(x_0) λ + O[λ²]) = t^a(x_0) + v^b(x_0) ∂_b t^a|_{x_0} λ + O[λ²] (8.39)
so that
d/dλ { t^a(x(x_0; v; λ)) − [t*(x; λ)]^a }|_{x_0} = v^b(x_0) ∂_b t^a|_{x_0} − ∂_b v^a|_{x_0} t^b(x_0) (8.40)
which verifies that it is the same quantity as per the previous component-based definition:
(Lv t)^a = v^b ∂_b t^a − t^b ∂_b v^a (8.41)
Note that the process which takes the vector field t(x) into the vector field t∗ (x; λ) is
often called the “push forward” of the function φλ : M → M. One often sees notation
like
t∗ (x; λ) = φ∗ (t(x)) (8.42)
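The geometric definition can be tested numerically as well. The sketch below (plain Python; the fields, step sizes, and tolerance are arbitrary illustrative choices, not from the notes) integrates the flow of v with RK4, computes the push-forward t* through a finite-difference Jacobian of the flow map, and checks that [t(x(x_0; v; λ)) − t*(x; λ)]/λ is close to the Lie bracket [v, t] for small λ:

```python
# Lie dragging by the flow of v: the limit (8.36) approaches the bracket [v, t].
# Fields, step sizes and tolerance are arbitrary illustrative choices.
import math

def v(x): return [math.sin(x[1]), x[0] * x[1]]           # flow field v^a
def t(x): return [x[0] ** 2, x[0] + math.cos(x[1])]      # field to be dragged

def flow(x0, lam, steps=200):
    """RK4 integration of dx/dlam = v(x): the map x0 -> x(x0; v; lam)."""
    x = list(x0)
    h = lam / steps
    for _ in range(steps):
        k1 = v(x)
        k2 = v([x[i] + 0.5 * h * k1[i] for i in range(2)])
        k3 = v([x[i] + 0.5 * h * k2[i] for i in range(2)])
        k4 = v([x[i] + h * k3[i] for i in range(2)])
        x = [x[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6 for i in range(2)]
    return x

def pushforward(x0, lam, eps=1e-6):
    """t*(x; lam)^a = (d x^a / d x0^b) t^b(x0): flow Jacobian applied to t."""
    J = [[0.0] * 2 for _ in range(2)]
    for b in range(2):
        xp, xm = list(x0), list(x0)
        xp[b] += eps
        xm[b] -= eps
        fp, fm = flow(xp, lam), flow(xm, lam)
        for a in range(2):
            J[a][b] = (fp[a] - fm[a]) / (2 * eps)
    return [sum(J[a][b] * t(x0)[b] for b in range(2)) for a in range(2)]

def bracket(x, h=1e-5):
    """[v, t]^a = v^b d_b t^a - t^b d_b v^a by central differences."""
    out = []
    for a in range(2):
        s = 0.0
        for b in range(2):
            xp, xm = list(x), list(x)
            xp[b] += h
            xm[b] -= h
            s += v(x)[b] * (t(xp)[a] - t(xm)[a]) / (2 * h)
            s -= t(x)[b] * (v(xp)[a] - v(xm)[a]) / (2 * h)
        out.append(s)
    return out

x0, lam = [0.4, 0.9], 1e-4
xl = flow(x0, lam)
tstar = pushforward(x0, lam)
diff = [(t(xl)[a] - tstar[a]) / lam for a in range(2)]   # finite-lambda version of (8.36)
lb = bracket(x0)
print(all(abs(diff[a] - lb[a]) < 1e-2 for a in range(2)))
```

Note that both t(x(λ)) and t* are evaluated at the dragged point, exactly as the careful bookkeeping in the text requires.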
Exercise: Verify that “Lie dragging” on scalars reproduces the previous definition
Lv f = v^a ∂_a f (8.43)
♦
How would we Lie drag a covariant vector? Well, we know how to Lie drag a scalar, and
we know how to Lie drag a contravariant vector, so consider the scalar quantity g(t) = g_a t^a.
The dragged covector g∗ (x; λ) is defined at the point x0 by its action on vector fields via
The process which takes the covector field g(x) into the covector field g∗ (x; λ) is often
called the “pull back” of the function φλ : M → M. One often sees notation like
[g*(x_0; λ)]_a [t(x_0)]^a = [g(x(x_0; v; λ))]_a [t*(x_0; λ)]^a (8.46)
where [g(x(x_0; v; λ))]_a = g_a(x_0) + v^b(x_0) ∂_b g_a|_{x_0} λ + O[λ²]
So that, factoring out the ta (x0 ) (and it is a non-trivial check of internal consistency that
this vector does factor out in this way)
Note that this is the same definition as previously deduced — so everything is consistent.
Exercise: Since we know how to “push forward” a vector; it should be easy enough for
you to see how to “push forward” a T0r tensor. Go through the formalism to make sure it
holds together and verify that you can reproduce the previous result for the Lie derivative
on T0r tensors. ♦
Exercise: Since we know how to “pull back” a covector; it should be easy enough for
you to see how to “pull back” a Ts0 tensor. Go through the formalism to make sure it
holds together and verify that you can reproduce the previous result for the Lie derivative
on Ts0 tensors. ♦
Exercise: Hence verify that this “Lie dragging” on arbitrary Tsr tensors reproduces the
previous definition. There is a minor technical issue to deal with — define the pullback for
a T0r tensor by starting at the point x(x_0; v; λ) and pushing forward a parameter distance
−λ. This is necessary in order to make sure you really are subtracting tensors at the same
point x0 . ♦
— to be written —
Chapter 9
Extrinsic curvature
— to be written —
9.1 Embeddings
— to be written —
Chapter 10
— to be written —
— to be written —
163
Chapter 11
Gauge-fields
Gauge fields are an extension of the idea of the affine connexion to general vector bundles,
not just the tangent bundle.
[transport(y→x;γ) ]• • , (11.1)
Eventually we defined the Riemann tensor, and using the notation of differential forms
Γ• • = Γ• •m dx^m = (∂/∂y^m) {transport[y → x; γ]}• • |_{y→x} dx^m , (11.3)
But, and here is the beauty of the formalism, nothing in the above depends on us actually
working with the tangent bundle — the •s can be interpreted as place-holders for indices
in any arbitrary vector space we like.
Specifically, let V(M) be some vector bundle with base space M, and fibres some vector
space V . If we assume the existence of some sort of parallel transport operator in this
vector bundle we can use the above to construct a connexion Γ• •m where the •s are
place-holders for indices in the vector space V .
Conversely if we assume the existence of Γ• •m we can use the path ordering process to
generate the parallel transport operator
[transport(y→x;γ) ]• • = P exp( ∫_x^y Γ• •c dx^c ) (11.5)
Much of what we discussed for arbitrary affine connexions still holds true in this case.
In coordinates we can write
R• •ab = Γ• •[a,b] − Γ• •[a Γ• •b] (11.6)
One of the obvious things we cannot do is that we cannot now define a Ricci tensor, since
that would involve trying to “trace” over incompatible indices — one vector index with
one coordinate index.
The S-tensor, on the other hand, continues to make perfectly good sense, so some of
Schouten's identities continue to work.
Exercise: What happens to the S–tensor? It is often [but not always] zero. Can you
find a pattern there? ♦
Exercise: Work your way through the chapter on general affine connexions in detail
and see just how much of it will survive in this current context. Does torsion make any
sense? Does nonmetricity make any sense? ♦
Example: Suppose we have a complex line bundle L(M). Then the fibre is a vector
space with one complex dimension and we can simplify
Γ• = Γ• •m dxm → iAm dxm = iA. (11.7)
That is, the affine connexion [on this line bundle] is just a complex valued one-form. (The
presence of the i is pure convention.) Therefore
Γ• • ∧ Γ• • → iA ∧ iA = 0, (11.8)
and the Riemann curvature simplifies to
R• • = dΓ• • − Γ• • ∧ Γ• • → idA = iF. (11.9)
Thus in this situation the Riemann tensor simplifies to a complex-valued two form
F = dA (11.10)
In this particular case the Bianchi identities simply reduce to
dF = ddA = 0 (11.11)
♦
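The identity dF = ddA = 0 is easy to confirm numerically. In the sketch below (plain Python; the quadratic potential A_a on R⁴ is an arbitrary choice, not from the notes) F_ab = ∂_a A_b − ∂_b A_a is formed by central differences — exact for polynomials of this degree — and the component form of the Bianchi identity, ∂_a F_bc + ∂_b F_ca + ∂_c F_ab = 0, is checked on all index triples:

```python
# Check dF = ddA = 0 numerically for a U(1) gauge potential on R^4.
# The quadratic potential A_a is an arbitrary choice, not from the notes.
from itertools import combinations

def A(x):
    t, x1, x2, x3 = x
    return [t * x1, x2 ** 2 + t * x3, x1 * x3, t * t - x2 * x1]

H = 1e-3
def d(f, x, a):
    """Central difference d f / d x^a (exact here: A is quadratic)."""
    xp, xm = list(x), list(x)
    xp[a] += H
    xm[a] -= H
    return (f(xp) - f(xm)) / (2 * H)

def F(x, a, b):
    """F_ab = d_a A_b - d_b A_a, the curvature of the line bundle."""
    return d(lambda y: A(y)[b], x, a) - d(lambda y: A(y)[a], x, b)

x0 = [0.3, -1.2, 0.5, 2.0]
ok = True
for a, b, c in combinations(range(4), 3):
    # component form of dF = 0: cyclic sum of d_a F_bc vanishes
    bianchi = (d(lambda y: F(y, b, c), x0, a)
               + d(lambda y: F(y, c, a), x0, b)
               + d(lambda y: F(y, a, b), x0, c))
    ok = ok and abs(bianchi) < 1e-8
print(ok)
```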
Exercise: Suppose the parallel transport operator preserves the absolute value of the
complex “vector” as you move around the manifold M. (We have not used any such
property up to this stage.) What does that tell you about the one-form A? ♦
Exercise: Suppose we work in 4 dimensions (time+space). [You can consider flat space
with Cartesian coordinates for simplicity, though the construction generalizes to arbitrary
manifolds.] Pick
A = φ dt + A_i dx^i
where φ is the electromagnetic scalar potential and A_i is the electromagnetic vector potential. Relate the curvature F to the electromagnetic fields E_i and B^i. Demonstrate that
F = dA = E_i dx^i ∧ dt + ε_{ijk} B^i dx^j ∧ dx^k
♦
In this case (because the matrices A• • commute with each other) we still have
A• • ∧ A• • = 0. (11.13)
Now suppose instead that the fibre is an N-dimensional complex vector space; then
Γ• • = iA• • (11.16)
defines some N × N matrix of one-forms, and in general these need not commute. Unless
we add further constraints to the geometry, these are GL(N)-valued one-forms, where
GL(N) stands for the general linear N × N matrices.
We now have
A• • ∧ A• • 6= 0. (11.17)
The curvature now contains contributions from both terms
and the Bianchi identity is more complicated. (Note the explicit occurrence of i; this is a
choice of convention and convenience. Physicists tend to keep the i explicit, mathemati-
cians tend to suppress it by absorbing it into the definition of A• • .)
Exercise: Suppose that the vector space V which fibres V(M) has a “norm” (so that
“lengths” of vectors make sense; we have a metric on V ), and suppose that the transport
operator is chosen to always preserve the length of the vectors it transports. (This is an
extremely natural restriction.)
Show that in this situation the matrix-valued one-forms A• • are actually Hermitian
N × N matrices.
(Mathematicians, who typically absorb the explicit i into the definition of A• • , would
instead be working with anti-Hermitian matrices.) ♦
Comment: [Not examinable] In physics, these non-Abelian gauge fields are often called
Yang–Mills fields. They are used, in particular, in quantum chromodynamics [QCD]
and in the electroweak standard model of particle physics. Further afield they underlie
the grand unified field theories [GUTs], though it should be noted that the GUTs are
“neither grand, nor unified, nor even theories”.
In particle physics, the transport operation always preserves the norm, and one is deal-
ing with Hermitian N × N matrices. The Hermitian N × N matrices generate the group
U(N) of unitary N × N matrices. These unitary matrices are called the “gauge group”;
they are the permissible coordinate transformations you can make in the fibre V that do
not affect the norm of the vector. ♦
Exercise: Show that in the case of a single Abelian gauge field (that is, electromag-
netism) the gauge group is U (1). ♦
Comment: [Not examinable] In QCD one works with a 3-dimensional complex vector
space of three “colours”, preserves the norm, and factors out a physically irrelevant over-
all phase. The result is that one works with Hermitian traceless 3 × 3 matrices, and the
“gauge group” [the set of permissible coordinate transformations] is SU (3), the set of
special [determinant one] unitary 3 × 3 matrices. ♦
Comment: [Not examinable] In the electroweak model, the relevant gauge group is
SU (2) × U (1). Roughly speaking the SU (2) has to do with the weak interactions while
the U (1) has to do with electromagnetism, but the situation is made more complicated
by the presence of spontaneous symmetry breaking. ♦
Warning: Everything I have said here has to do with classical gauge theories, as I have
not even hinted at what would then be needed to build a quantum theory along these
lines. ♦
Warning: Gauge theories are also used outside of particle physics — for instance, I
have seen engineers trying to use gauge theories to analyze the recognition problem —
when a “target” may be translated and rotated by an arbitrary amount, and you need to
recognize it despite the peculiar point of view.
I have also seen attempts at modeling the swimming motions of bacteria and other small
organisms through a gauge-theoretic representation of the shape of the organism. ♦
Chapter 12
Coda
• In my own research I have [among other things] been using the language of dif-
ferential geometry and curved spaces to investigate sound propagation in moving
fluids.
• There are also applications in theoretical biology — everything from studying the
conformation space of a protein molecule to the “fitness landscape” of biological
systems.
—# # #—
• Of course I did not finish all the topics I wanted to cover — but we have done quite a
bit.
• The material I have managed to cover is a good solid introduction to differential geometry
— at a level appropriate to both applied mathematicians and physicists.
—# # #—
Appendix A
Notation
anti-symmetrization:
A_{[a1 a2 ...ar]} = (1/r!) Σ_π signum(π) A_{π(a1 a2 ...ar)}
partial derivatives: ∂_a is shorthand for ∂/∂x^a.
—###—
Appendix B
We start with the trivial observation that an r-index tensor with no symmetries has n^r
algebraically independent components.
We define
A_{[a1 a2 ...ar]} = (1/r!) Σ_π signum(π) A_{π(a1 a2 ...ar)}
where the sum runs over all permutations π of the r indices and signum(π) denotes the
parity of the permutation. (That is, −1 if an odd number of indices are flipped out of
order, and +1 otherwise.) For example
A_{[a1]} = A_{a1} ;   A_{[a1 a2]} = (A_{a1 a2} − A_{a2 a1}) / 2
A_{[a1 a2 a3]} = (A_{a1 a2 a3} + A_{a2 a3 a1} + A_{a3 a1 a2} − A_{a2 a1 a3} − A_{a3 a2 a1} − A_{a1 a3 a2}) / 6
As another example, if we already happen to know that Aabc is anti-symmetric on its first
two indices then the above simplifies to
A_{[a1 a2 a3]} = (A_{a1 a2 a3} + A_{a2 a3 a1} + A_{a3 a1 a2}) / 3
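The defining sum is straightforward to implement by brute force. A minimal sketch (plain Python, illustration only; the sample tensor entries are arbitrary) that reproduces the two-index example and checks that antisymmetrizing twice changes nothing:

```python
# Brute-force antisymmetrizer: A_[a1...ar] = (1/r!) sum_pi signum(pi) A_pi(...).
# The sample tensor entries are arbitrary; illustration only.
from itertools import permutations, product
from math import factorial

def signum(perm):
    """Parity of a permutation, computed by sorting with transpositions."""
    sign, seen = 1, list(perm)
    for i in range(len(seen)):
        while seen[i] != i:
            j = seen[i]
            seen[i], seen[j] = seen[j], seen[i]
            sign = -sign
    return sign

def antisymmetrize(T, r):
    """Antisymmetrize a tensor stored as a dict keyed by index tuples."""
    return {idx: sum(signum(p) * T[tuple(idx[i] for i in p)]
                     for p in permutations(range(r))) / factorial(r)
            for idx in T}

n, r = 3, 2
T = {(a, b): (a + 1) * (2 * b + 1) for a, b in product(range(n), repeat=r)}
A1 = antisymmetrize(T, r)
# the two-index example: A_[ab] = (T_ab - T_ba)/2
assert all(abs(A1[a, b] - (T[a, b] - T[b, a]) / 2) < 1e-12 for a, b in A1)
# idempotency: antisymmetrizing twice changes nothing
A2 = antisymmetrize(A1, r)
assert all(abs(A2[idx] - A1[idx]) < 1e-12 for idx in A1)
print("ok")
```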
Now let us count the algebraically independent components of a completely antisymmetric tensor in n dimensions. The first index can take any of n values; the second must not equal the first, and must be picked out of the remaining n − 1 values. Proceeding through all r indices leads to
n(n − 1)(n − 2) . . . (n − r + 1)
distinct ways of assigning distinct indices. But by the antisymmetry all r! rearrangements
of these indices are algebraically related. Thus the number of algebraically independent
components is simply the binomial coefficient
A^n_r = n(n − 1)(n − 2) · · · (n − r + 1) / r! = n! / [(n − r)! r!] = (n choose r)
Note:
A^n_r ≤ n^r / r!
The first few of these numbers are
A^n_r → 1; n; n(n − 1)/2; n(n − 1)(n − 2)/6; . . .
We define
S_{(a1 a2 ...ar)} = (1/r!) Σ_π S_{π(a1 a2 ...ar)}
where the sum runs over all permutations π of the r indices. For example
S_{(a1)} = S_{a1} ;   S_{(a1 a2)} = (S_{a1 a2} + S_{a2 a1}) / 2
S_{(a1 a2 a3)} = (S_{a1 a2 a3} + S_{a2 a3 a1} + S_{a3 a1 a2} + S_{a2 a1 a3} + S_{a3 a2 a1} + S_{a1 a3 a2}) / 6
As another example, if we already happen to know that Sabc is symmetric on its first two
indices then the above simplifies to
S_{(a1 a2 a3)} = (S_{a1 a2 a3} + S_{a2 a3 a1} + S_{a3 a1 a2}) / 3
That is
S^n_r = (r + 1)(r + 2) · · · (r + n − 1) / (n − 1)! = n(n + 1)(n + 2) · · · (n + r − 1) / r!
There is a strong tendency not to actually derive this result, and merely to quote it. That
is because, in contrast to Anr , the derivation is a little messy.
B.2.1 Enumeration
Suppose we have a completely symmetric tensor S(a1 a2 a3 ...ar ) , then wlog we can always
rearrange the indices in non-decreasing order, with
1 ≤ a1 ≤ a2 ≤ a3 ≤ · · · ≤ ar ≤ n.
Then S^n_r equals the number of such sequences #{a_i}. Now define b_i = a_i + (i − 1); the b_i then form a strictly increasing sequence with 1 ≤ b_1 < b_2 < · · · < b_r ≤ n + r − 1.
But the number of these b_i sequences is very easy to compute, as it equals the number
of anti-symmetric tensors with r indices in n + r − 1 dimensions. That is
S^n_r = A^{n+r−1}_r = (n + r − 1 choose r)
as was to be shown.
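The enumeration argument can be confirmed by brute-force counting. A short sketch (plain Python, not from the notes):

```python
# Brute-force count of independent symmetric / antisymmetric components.
from itertools import product
from math import comb

def count_sym(n, r):
    """Non-decreasing sequences 1 <= a1 <= ... <= ar <= n."""
    return sum(1 for a in product(range(1, n + 1), repeat=r)
               if all(a[i] <= a[i + 1] for i in range(r - 1)))

def count_antisym(n, r):
    """Strictly increasing sequences 1 <= a1 < ... < ar <= n."""
    return sum(1 for a in product(range(1, n + 1), repeat=r)
               if all(a[i] < a[i + 1] for i in range(r - 1)))

for n in range(1, 6):
    for r in range(1, 5):
        assert count_antisym(n, r) == comb(n, r)         # A^n_r
        assert count_sym(n, r) == comb(n + r - 1, r)     # S^n_r
        # shifting a_i -> b_i = a_i + (i-1) maps non-decreasing sequences to
        # strictly increasing ones in n + r - 1 values, so the counts agree:
        assert count_sym(n, r) == count_antisym(n + r - 1, r)
print("ok")
```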
B.2.2 Induction
If you don’t know the enumeration trick, the most straightforward way of proceeding is
by induction on the number of dimensions of spacetime. Clearly
S^1_k = 1
and
S^2_k = k + 1
This last is obtained by noting that if n = 2 there are only two distinct indices, so that
k of them can be partitioned as follows — 0 : k, 1 : (k − 1), . . . , (k − 1) : 1, k : 0 — in
k + 1 distinct ways.
Now consider S^{n+1}_k. When we add one more possible value for the index to take, there
could be 0, 1, . . . , k − 1 or k occurrences of this new index. That corresponds to k, k − 1,
. . . , 1, or 0 slots being available for the old indices. That is
S^{n+1}_k = S^n_k + S^n_{k−1} + · · · + S^n_1 + S^n_0 = Σ_{j=0}^{k} S^n_j
This recursion relation now completely specifies the remaining S^n_k. For instance
S^3_k = Σ_{j=0}^{k} S^2_j = Σ_{j=0}^{k} (j + 1) = k(k + 1)/2 + (k + 1) = (k + 1)(k + 2)/2
S^4_k = Σ_{j=0}^{k} S^3_j = Σ_{j=0}^{k} (j + 1)(j + 2)/2 = (k + 1)(k + 2)(k + 3)/6
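Both the recursion and these closed forms can be checked mechanically against the binomial formula S^n_k = C(n + k − 1, k). A brief sketch (plain Python, illustration only):

```python
# Check the recursion S^{n+1}_k = sum_j S^n_j against the closed form.
from math import comb

def S(n, k):
    """S^n_k = C(n + k - 1, k): independent components of a symmetric tensor."""
    return comb(n + k - 1, k)

for n in range(1, 6):
    for k in range(8):
        assert S(n + 1, k) == sum(S(n, j) for j in range(k + 1))
assert all(S(1, k) == 1 for k in range(10))
assert all(S(2, k) == k + 1 for k in range(10))
assert all(S(3, k) == (k + 1) * (k + 2) // 2 for k in range(10))
assert all(S(4, k) == (k + 1) * (k + 2) * (k + 3) // 6 for k in range(10))
print("ok")
```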
Note
S^n_r ≥ n^r / r! ≥ A^n_r
A simple way of deriving S^n_r ≥ A^n_r is this: If all indices are distinct one finds (n choose r) independent components, as for the antisymmetric case. In addition (if r > 1) there are now additional nonzero components where two of the indices are identical, and distinct from the remaining r − 2 indices.
—###—
Appendix C
ln det X = tr ln X. (C.1)
First consider a diagonal matrix
D = diag[λ_1, · · · , λ_n] (C.2)
Then
det D = λ_1 λ_2 · · · λ_n , (C.3)
while
ln D = diag[ln λ_1, · · · , ln λ_n], (C.4)
so
tr ln D = Σ_{i=1}^{n} ln λ_i = ln [ Π_{i=1}^{n} λ_i ] = ln det D (C.5)
Though this is not a general proof, this is enough to guess that the result is correct. To
complete the proof we work in stages. (This presentation is longer than it has to be; I'm
making it rather explicit in order to lead up to the general result gently.)
Having proved it for diagonal matrices, it now holds for arbitrary real symmetric ma-
trices because you can always diagonalize them using an orthogonal transformation
X = ODO T (C.6)
and then
det X = det[ODO T ] = det D; (C.7)
whereas
tr ln X = tr ln[O D O^T] (C.8)
= tr ln[I + (O D O^T − I)] (C.9)
= tr ln[I + O[D − I]O^T] (C.10)
= tr Σ_{n=1}^{∞} (−1)^{n+1} (O[D − I]O^T)^n / n (C.11)
= tr Σ_{n=1}^{∞} (−1)^{n+1} O [D − I]^n O^T / n (C.12)
= tr Σ_{n=1}^{∞} (−1)^{n+1} [D − I]^n / n (C.13)
= tr ln[D] (C.14)
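A quick numerical sanity check of this symmetric case (plain Python sketch; the rotation angle and eigenvalues are arbitrary choices): building X = O D Oᵀ makes tr ln X = Σ ln λᵢ available by construction, which we compare against ln det X computed directly.

```python
# Sanity check of ln det X = tr ln X for a real symmetric 2x2 matrix.
# Rotation angle and eigenvalues are arbitrary (positive, so ln is defined).
import math

theta, lams = 0.6, [2.5, 0.7]
O = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

D = [[lams[0], 0.0], [0.0, lams[1]]]
OT = [[O[j][i] for j in range(2)] for i in range(2)]
X = matmul(matmul(O, D), OT)          # X = O D O^T

det_X = X[0][0] * X[1][1] - X[0][1] * X[1][0]
tr_ln_X = sum(math.log(lam) for lam in lams)   # tr ln X, by the argument above
print(abs(math.log(det_X) - tr_ln_X) < 1e-12)
```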
The same argument applies to any Hermitian matrix, since a Hermitian matrix can be diagonalized by a unitary transformation
X = U D U^{−1} (C.15)
Next, it is a sufficient (though not necessary) condition that if all the eigenvalues of
an arbitrary square matrix X are distinct then it can be diagonalized using non-singular
linear transformations:
X = LDL−1 (C.16)
(The matrix L is allowed to be complex.) Apply the same argument as previously.
Finally, for a completely arbitrary matrix (possibly with degenerate eigenvalues) even
if you cannot diagonalize it you can always put it into upper (or lower) triangular form
X = LT L−1 (C.17)
where T has nonzero entries only on the diagonal and above (or below), and the matrix L is again
allowed to be complex.
Exercise: Check that the previous argument holds for upper (or lower) triangular
matrices. What is
det T ? ln T ? tr ln T ? (C.18)
♦
Exercise: For a somewhat tidier proof, look up “Jordan canonical form”. For a matrix
J in Jordan canonical form what is
det J? ln J? tr ln J? (C.19)
♦
Now suppose the elements of the matrix X depend on some variable z then
d/dz ln det X = d/dz tr ln(X) (C.23)
so that
(1/det X) d(det X)/dz = tr[ X^{−1} dX/dz ] (C.24)
There is something to prove in this last step; as previously start with diagonal matrices
and bootstrap your way up...
You can also prove this result directly from the definition of determinant. The ijth
element of the cofactor matrix is (−1)i+j times the determinant of the (n − 1) × (n − 1)
matrix defined by deleting the ith row and jth column of the matrix X.
You could also just look it up somewhere — the important thing is to realise that such
a result exists and to know how to use it.
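The derivative identity is also easy to check by finite differences. In the sketch below (plain Python; the parameter-dependent matrix X(z) is an arbitrary choice, not from the notes) both sides of d/dz ln det X = tr[X⁻¹ dX/dz] are evaluated numerically at a sample point:

```python
# Finite-difference check of d/dz ln det X = tr[X^{-1} dX/dz] for 2x2 X(z).
# The parameter-dependent matrix X(z) is an arbitrary choice.
import math

def X(z):
    return [[math.exp(z), z], [z * z, 2.0 + math.sin(z)]]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inv2(M):
    d = det2(M)
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

z0, h = 0.4, 1e-6
lhs = (math.log(det2(X(z0 + h))) - math.log(det2(X(z0 - h)))) / (2 * h)
Xp, Xm = X(z0 + h), X(z0 - h)
dX = [[(Xp[i][j] - Xm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
Xi = inv2(X(z0))
rhs = sum(Xi[i][k] * dX[k][i] for i in range(2) for k in range(2))  # tr[X^{-1} dX/dz]
print(abs(lhs - rhs) < 1e-6)
```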
Appendix D
Path-ordered exponentials are a very convenient trick for formally solving certain matrix
differential equations. Suppose we have a differential equation of the form
dU(t)/dt = H(t) U(t) (D.1)
where U (t) and H(t) are matrices [or more generally linear operators on some vector
space] and the matrix H(t) is generally not a constant. [So in particular H(t1 ) need not
commute with H(t2 ).]
If H is a constant then the solution is simply U(t) = exp(Ht) U(0). More generally, we define the formal process of “path ordering” in terms of the exact solution U(t), which we know exists because of standard existence and uniqueness theorems. That is
U(t) = P exp( ∫_0^t H(t′) dt′ ) U(0) (D.3)
That is
P exp( ∫_0^t H(t′) dt′ ) = U(t) U^{−1}(0) (D.4)
If we take this as our definition of path ordering then
d/dt [ P exp( ∫_0^t H(t′) dt′ ) ] = H(t) U(t) U^{−1}(0) = H(t) P exp( ∫_0^t H(t′) dt′ ) (D.5)
P exp( ∫_0^{t+∆t} H(t′) dt′ ) = exp[H(t) ∆t] P exp( ∫_0^t H(t′) dt′ ) + O[(∆t)²] (D.7)
Let’s now bootstrap this result into a general limit formula for the path ordered integral.
Split the interval (0, t) into n equal segments of width ∆t = t/n and evaluate H(t) at the points
t_i = t i/n ; i ∈ {0, 1, . . . , n − 1} (D.8)
then
P exp( ∫_0^t H(t′) dt′ ) = exp[H(t_{n−1}) ∆t] exp[H(t_{n−2}) ∆t] · · · exp[H(t_1) ∆t] exp[H(t_0) ∆t] + O[1/n] (D.9)
Alternatively
P exp( ∫_0^t H(t′) dt′ ) = lim_{n→∞} exp[H(t_{n−1}) ∆t] exp[H(t_{n−2}) ∆t] · · · exp[H(t_1) ∆t] exp[H(t_0) ∆t] (D.10)
This limiting process should remind you of the way the Riemann integral is defined, except
of course that the H(ti ) need not commute with each other so that the order in which the
matrix exponentials are multiplied together is critically important. This is why we call it
“path ordered”.
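The limit formula (D.10) can be exercised numerically. The sketch below (plain Python, illustration only; the generator H(t), slice count, and tolerances are arbitrary choices, not from the notes) approximates the path-ordered exponential by a product of slice exponentials, compares it against an RK4 integration of dU/dt = H(t)U, and confirms that the naive (unordered) exponential of ∫H dt genuinely differs when the H(tᵢ) do not commute:

```python
# Path-ordered vs naive exponential for a non-commuting generator H(t).
# All choices (H, interval, slice count, tolerances) are arbitrary illustrations.
import math

def H(t):
    """H(t1) and H(t2) do not commute for t1 != t2."""
    return [[0.0, 1.0], [t, 0.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm(M, terms=40):
    """2x2 matrix exponential by its Taylor series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        power = matmul(power, M)
        power = [[entry / k for entry in row] for row in power]  # now M^k / k!
        result = [[result[i][j] + power[i][j] for j in range(2)] for i in range(2)]
    return result

t_final, n = 1.0, 4000
dt = t_final / n

# Path-ordered product, eq. (D.10): later times act on the LEFT.
U_ordered = [[1.0, 0.0], [0.0, 1.0]]
for i in range(n):
    step = expm([[entry * dt for entry in row] for row in H(i * dt)])
    U_ordered = matmul(step, U_ordered)

# Reference: RK4 integration of dU/dt = H(t) U.
U_ode = [[1.0, 0.0], [0.0, 1.0]]
h = t_final / 4000
for i in range(4000):
    t = i * h
    k1 = matmul(H(t), U_ode)
    k2 = matmul(H(t + h / 2), [[U_ode[a][b] + h / 2 * k1[a][b] for b in range(2)] for a in range(2)])
    k3 = matmul(H(t + h / 2), [[U_ode[a][b] + h / 2 * k2[a][b] for b in range(2)] for a in range(2)])
    k4 = matmul(H(t + h), [[U_ode[a][b] + h * k3[a][b] for b in range(2)] for a in range(2)])
    U_ode = [[U_ode[a][b] + h * (k1[a][b] + 2 * k2[a][b] + 2 * k3[a][b] + k4[a][b]) / 6
              for b in range(2)] for a in range(2)]

# Naive (unordered) exponential of the integral: int_0^1 H(t) dt = [[0, 1], [1/2, 0]].
U_naive = expm([[0.0, 1.0], [0.5, 0.0]])

err_ordered = max(abs(U_ordered[a][b] - U_ode[a][b]) for a in range(2) for b in range(2))
err_naive = max(abs(U_naive[a][b] - U_ode[a][b]) for a in range(2) for b in range(2))
print(err_ordered < 5e-3, err_naive > 2e-2)
```

The ordered product tracks the true evolution as n grows, while the naive exponential misses the commutator corrections.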
Note what happens if for some reason the H(ti ) do happen to commute with each other.
Then for instance
exp[H(t_1) ∆t] exp[H(t_0) ∆t] = exp[{H(t_1) + H(t_0)} ∆t] (D.11)
which is not true unless the matrices commute. Continuing in this vein, when the matrices
do commute we have
P exp( ∫_0^t H(t′) dt′ ) = lim_{n→∞} exp[{H(t_{n−1}) + H(t_{n−2}) + · · · + H(t_1) + H(t_0)} ∆t]. (D.12)
But now the argument of the exponential on the RHS really is the Riemann integral, so
we have
P exp( ∫_0^t H(t′) dt′ ) = exp( ∫_0^t H(t′) dt′ ). (D.13)
That is, the path ordered integral reduces to the ordinary integral whenever the matrices
H(t) commute with each other. (You could also derive this directly from the original
differential equation for U (t).)
In a quantum mechanical setting you are more likely to think of t as the time, and
consider the slightly different differential equation
dU(t)/dt = −i H(t) U(t) (D.14)
where H(t) is now the Hamiltonian operator on an appropriate Hilbert space and U is
the unitary time evolution operator. Then
U(t) = T exp( −i ∫_0^t H(t′) dt′ ) U(0) (D.15)
But note that there is nothing fundamentally new here — the physicists' “time ordering”
and the mathematicians' “path ordering” are fundamentally the same thing.
The other place where path ordering shows up in a physics setting is in Yang-Mills
theory when you are constructing objects such as “Wilson loops” or “Polyakov loops”. I
won’t explain them now but might get around to it later in the course.
Appendix E
The “calculus of variations” has to do with the study of integrals (defined on some suitable
set of functions) and the conditions under which the integral is “extremal”; meaning that
the value of the integral is a [local] maximum, minimum, or a “point of inflexion”.
The canonical example is to suppose we have a function L(·, ·) which itself depends on
a function x(t) and its first derivative ẋ(t) = dx(t)/dt, that is
L = L (ẋ(t), x(t)) (E.1)
Now consider the integral
S[a, b; x(t)] = ∫_a^b L(ẋ(t), x(t)) dt (E.2)
which is a functional mapping some suitable set {x(t)} of functions x(t) into the real
numbers IR. Under what conditions is this integral extremal?
Now let us restrict the set of functions {x(t)} to consist only of functions x(t) that are
fixed at the end-points a and b. That is
For this set of functions the integral S[a, b; x(t)] is extremal (meaning δS[a, b; x(t)] = 0)
iff for arbitrary δx(t) satisfying the endpoint constraints we have
∫_a^b { −d/dt [ ∂L(ẋ(t), x(t))/∂ẋ(t) ] + ∂L(ẋ(t), x(t))/∂x(t) } δx(t) dt = 0. (E.8)
There are many generalizations and applications of this equation — the geodesic equation
(in the sense of shortest-distance geodesics) being one of them.
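The stationarity condition can be demonstrated on a discretized action. In the sketch below (plain Python; the Lagrangian L = ẋ²/2 − x²/2, the interval, and the perturbation are arbitrary illustrative choices, not from the notes) the first variation of the discrete action is tiny at a solution of the corresponding Euler–Lagrange equation ẍ = −x, and order one at a non-solution with the same endpoints:

```python
# First variation of a discretized action: tiny at a solution, order one otherwise.
# Lagrangian, interval and perturbation are arbitrary illustrative choices.
import math

b_end, N = 2.0, 2000
h = b_end / N
ts = [i * h for i in range(N + 1)]

def action(xs):
    """Midpoint-rule discretization of S = int (xdot^2/2 - x^2/2) dt."""
    S = 0.0
    for i in range(N):
        xdot = (xs[i + 1] - xs[i]) / h
        xmid = 0.5 * (xs[i] + xs[i + 1])
        S += (0.5 * xdot ** 2 - 0.5 * xmid ** 2) * h
    return S

solution = [math.sin(t) for t in ts]                 # satisfies xddot = -x
trial = [math.sin(b_end) * t / b_end for t in ts]    # same endpoints, not a solution
bump = [math.sin(math.pi * t / b_end) for t in ts]   # variation vanishing at endpoints

def first_variation(xs, eps=1e-6):
    plus = action([x + eps * d for x, d in zip(xs, bump)])
    minus = action([x - eps * d for x, d in zip(xs, bump)])
    return (plus - minus) / (2 * eps)

dS_solution = abs(first_variation(solution))
dS_trial = abs(first_variation(trial))
print(dS_solution < 1e-4, dS_trial > 1e-2)
```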
Exercise: Suppose L(· · ·) depends not only on the function and its first derivative, but
also on second and higher derivatives up to order N . That is
L = L( d^N x(t)/dt^N , · · · , ẍ(t), ẋ(t), x(t) ). (E.14)
Show that in this case [with suitable restrictions on the functions x(t) at the end-points
a and b] the Euler–Lagrange equations generalize to
Σ_{n=0}^{N} (−1)^n (d^n/dt^n) [ ∂L( d^N x(t)/dt^N , · · · , ẍ(t), ẋ(t), x(t) ) / ∂(d^n x/dt^n) ] = 0. (E.15)
Explicitly find what the restrictions on x(t) should be at the end-points a and b. ♦
Now go through the same sort of steps as previously. Let ∂Ω denote the boundary of Ω
and show that
" #
∂L ∂a φ(xb ), φ(xb )
Z
δS[Ω; φ(x)] = b )]
δφ(x) (unit normal)a dn−1 (area)
∂Ω ∂[∂ a φ(x
Z ( " # )
∂ ∂L ∂a φ(xb ), φ(xb ) ∂L ∂a φ(xb ), φ(xb )
+ − a b )]
+ b)
δφ(xb ) dn x
Ω ∂x ∂[∂ a φ(x ∂φ(x
2
+ O [δx(t)] . (E.18)
From this, formulate suitable restrictions on the field φ(xb ) at the boundary ∂Ω, and
demonstrate that the relevant Euler–Lagrange equations are
" #
∂ ∂L ∂a φ(xb ), φ(xb ) ∂L ∂a φ(xb ), φ(xb )
= . (E.19)
∂xa ∂[∂a φ(xb )] ∂φ(xb )
Exercise: Suppose L(· · ·) depends on the field φ(xc ), plus its first and second derivatives.
Show that the Euler-Lagrange equations (for Cartesian coordinates in a Euclidean space)
are now
∂L/∂φ(x^c) − (∂/∂x^a) [ ∂L/∂[∂_a φ(x^c)] ] + (∂²/∂x^a ∂x^b) [ ∂L/∂[∂_a ∂_b φ(x^c)] ] = 0. (E.20)
The generalization to higher derivatives is obvious but notationally messy. ♦
Note: The calculus of variations is a general tool that has applications in many fields;
beyond the simple application to (shortest length) geodesics in the text this procedure is
also used in Lagrangian mechanics and its generalizations, in classical field theories [such
as say Maxwell's electromagnetism or Einstein's general relativity] where it is often the
easiest way of obtaining the field equations, and also in quantum field theories where
classical solutions satisfying the Euler–Lagrange equation often dominate the physics. ♦
Exercise: Generalize this discussion to arbitrary manifolds. There will have to be some
minimal restrictions on the type of manifold considered. Find them, but keep the formalism as general as possible. ♦
Appendix F
In the year 1900 Professor David Hilbert gave a keynote address at the International
Congress of Mathematicians which was that year held in Paris. Hilbert’s address set
out a list of 23 problems that he thought were important — and much of 20th century
mathematics was devoted to solving about half of these problems. Dr. Maby Winton New-
son translated this address into English with the author’s permission for Bulletin of the
American Mathematical Society 8 (1902), 437–479. A reprint appears in Mathematical
Developments Arising from Hilbert Problems, edited by Felix Browder, American Mathematical Society, 1976. Various versions are also available on the internet; go to Google
and search on “Hilbert problems”. Three of the 23 problems directly involve the calculus
of variations, the 19th, 20th, and 23rd problems. Excerpts from the lecture are presented
below. Note especially that Hilbert’s 23rd problem was somewhat more open-ended than
the others ...
Mathematical Problems
Lecture delivered before the International Congress of Mathematicians
Paris 1900
By Professor David Hilbert
It is difficult and often impossible to judge the value of a problem correctly in advance;
for the final award depends upon the gain which science obtains from the problem. Nev-
ertheless we can ask whether there are general criteria which mark a good mathematical
problem. An old French mathematician said: ”A mathematical theory is not to be con-
sidered complete until you have made it so clear that you can explain it to the first man
whom you meet on the street.” This clearness and ease of comprehension, here insisted
on for a mathematical theory, I should still more demand for a mathematical problem if
it is to be perfect; for what is clear and easily comprehended attracts, the complicated
repels us.
Moreover a mathematical problem should be difficult in order to entice us, yet not
completely inaccessible, lest it mock at our efforts. It should be to us a guide post on the
mazy paths to hidden truths, and ultimately a reminder of our pleasure in the successful
solution.
... lacuna ...
19. Are the solutions of regular problems in the calculus of variations always
necessarily analytic?
One of the most remarkable facts in the elements of the theory of analytic functions
appears to me to be this: That there exist partial differential equations whose integrals are
all of necessity analytic functions of the independent variables, that is, in short, equations
susceptible of none but analytic solutions. The best known partial differential equations
of this kind are the potential equation
∂²f/∂x² + ∂²f/∂y² = 0
and certain linear differential equations investigated by Picard;46 also the equation
∂²f/∂x² + ∂²f/∂y² = e^f ,
the partial differential equation of minimal surfaces, and others. Most of these partial
differential equations have the common characteristic of being the Lagrangian differential
equations of certain problems of variation, viz., of such problems of variation
∫∫ F(p, q, z; x, y) dx dy = minimum ,   p = ∂z/∂x , q = ∂z/∂y ,
as satisfy, for all values of the arguments which fall within the range of discussion, the
inequality
(∂²F/∂p²) · (∂²F/∂q²) − (∂²F/∂p ∂q)² > 0,
F itself being an analytic function. We shall call this sort of problem a regular variation
problem. It is chiefly the regular variation problems that play a role in geometry, in
mechanics, and in mathematical physics; and the question naturally arises, whether all
solutions of regular variation problems must necessarily be analytic functions. In other
words, does every Lagrangian partial differential equation of a regular variation problem
have the property of admitting analytic integrals exclusively? And is this the case even
when the function is constrained to assume, as, e.g., in Dirichlet’s problem on the potential
function, boundary values which are continuous, but not analytic?
I may add that there exist surfaces of constant negative gaussian curvature which are
representable by functions that are continuous and possess indeed all the derivatives, and
yet are not analytic; while on the other hand it is probable that every surface whose
gaussian curvature is constant and positive is necessarily an analytic surface. And we
know that the surfaces of positive constant curvature are most closely related to this
regular variation problem: To pass through a closed curve in space a surface of minimal
area which shall inclose, in connection with a fixed surface through the same closed curve,
a volume of given magnitude.
An important problem closely connected with the foregoing is the question concerning
the existence of solutions of partial differential equations when the values on the boundary
of the region are prescribed. This problem is solved in the main by the keen methods of H.
A. Schwarz, C. Neumann, and Poincare for the differential equation of the potential. These
methods, however, seem to be generally not capable of direct extension to the case where
along the boundary there are prescribed either the differential coefficients or any relations
between these and the values of the function. Nor can they be extended immediately to
the case where the inquiry is not for potential surfaces but, say, for surfaces of least area,
or surfaces of constant positive gaussian curvature, which are to pass through a prescribed
twisted curve or to stretch over a given ring surface. It is my conviction that it will be
possible to prove these existence theorems by means of a general principle whose nature
is indicated by Dirichlet’s principle. This general principle will then perhaps enable us
to approach the question: Has not every regular variation problem a solution, provided
certain assumptions regarding the given boundary conditions are satisfied (say that the
functions concerned in these boundary conditions are continuous and have in sections one
or more derivatives), and provided also if need be that the notion of a solution shall be
suitably extended? 47
... lacuna ...
So far, I have generally mentioned problems as definite and special as possible, in the
opinion that it is just such definite and special problems that attract us the most and from
which the most lasting influence is often exerted upon science. Nevertheless, I should like
to close with a general problem, namely with the indication of a branch of mathematics
repeatedly mentioned in this lecture — which, in spite of the considerable advancement lately
given it by Weierstrass, does not receive the general appreciation which, in my opinion,
is its due — I mean the calculus of variations.50
The lack of interest in this is perhaps due in part to the need of reliable modern text
books. So much the more praiseworthy is it that A. Kneser in a very recently published
work has treated the calculus of variations from the modern points of view and with
regard to the modern demand for rigor.51
The calculus of variations is, in the widest sense, the theory of the variation of functions,
and as such appears as a necessary extension of the differential and integral calculus. In
this sense, Poincaré's investigations on the problem of three bodies, for example, form a
chapter in the calculus of variations, in so far as Poincaré derives from known orbits by
the principle of variation new orbits of similar character.
I add here a short justification of the general remarks upon the calculus of variations
made at the beginning of my lecture.
The simplest problem in the calculus of variations proper is known to consist in finding
a function y of a variable x such that the definite integral
$$J = \int_a^b F(y_x, y; x)\, dx, \qquad y_x = \frac{dy}{dx},$$
assumes a minimum value as compared with the values it takes when y is replaced by
other functions of x with the same initial and final values. The vanishing of the first
variation in the usual sense,
$$\delta J = 0,$$
gives for the required function y the well-known differential equation of the second order
$$\frac{d F_{y_x}}{dx} - F_y = 0, \qquad \frac{d F_{y_x}}{dx} = y_{xx} F_{y_x y_x} + y_x F_{y_x y} + F_{y_x x}. \tag{1}$$
In order to investigate more closely the necessary and sufficient criteria for the occurrence
of the required minimum, we consider the integral
$$J^{*} = \int_a^b \left\{ F(p, y; x) + (y_x - p) F_p \right\} dx, \qquad F = F(p, y; x), \quad F_p = \frac{\partial F(p, y; x)}{\partial p},$$
and inquire how p is to be chosen as a function of x and y in order that the value of this
integral shall be independent of the path of integration, i.e., of the choice of the function
y of the variable x. The integral J* has the form
$$J^{*} = \int_a^b \left\{ A\, y_x - B \right\} dx,$$
where A and B do not contain y_x, and the vanishing of the first variation
$$\delta J^{*} = 0$$
in the sense which the new question requires gives the equation
$$\frac{\partial A}{\partial x} + \frac{\partial B}{\partial y} = 0,$$
i.e., we obtain for the function p of the two variables x, y the partial differential equation
of the first order
$$\frac{\partial F_p}{\partial x} + \frac{\partial (p F_p - F)}{\partial y} = 0. \tag{1*}$$
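The functions A and B referred to here are determined by the requirement that the integrand of J* be linear in y_x. The following symbolic check (an editorial sketch using sympy, not part of Hilbert's text) confirms that the integrand equals A y_x − B with A = F_p and B = pF_p − F, neither of which contains y_x:

```python
# Editorial check (not in Hilbert's text): the integrand of
#   J* = \int { F(p, y; x) + (y_x - p) F_p } dx
# is linear in y_x, with A = F_p and B = p*F_p - F.
import sympy as sp

x, y, yx, p = sp.symbols("x y y_x p")
F = sp.Function("F")(p, y, x)   # F(p, y; x), an arbitrary smooth function
Fp = sp.diff(F, p)              # F_p = dF/dp

integrand = F + (yx - p) * Fp   # integrand of Hilbert's J*

A = Fp                          # coefficient of y_x
B = p * Fp - F                  # the part free of y_x (with a sign)

# The integrand equals A*y_x - B identically:
assert sp.expand(integrand - (A * yx - B)) == 0
print("J* integrand = A*y_x - B, with A = F_p and B = p*F_p - F")
```

With this A and B, the integrability condition ∂A/∂x + ∂B/∂y = 0 is exactly equation (1*).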
The ordinary differential equation of the second order (1) and the partial differential
equation (1*) stand in the closest relation to each other. This relation becomes immedi-
ately clear to us by the following simple transformation
$$\delta J^{*} = \int_a^b \left\{ F_y\, \delta y + F_p\, \delta p + (\delta y_x - \delta p) F_p + (y_x - p)\, \delta F_p \right\} dx$$
$$\phantom{\delta J^{*}} = \int_a^b \left\{ F_y\, \delta y + \delta y_x\, F_p + (y_x - p)\, \delta F_p \right\} dx$$
$$\phantom{\delta J^{*}} = \delta J + \int_a^b (y_x - p)\, \delta F_p\, dx.$$
We derive from this, namely, the following facts: If we construct any simple family of
integral curves of the ordinary differential equation (1) of the second order and then form
an ordinary differential equation of the first order
$$y_x = p(x, y) \tag{2}$$
which also admits these integral curves as solutions, then the function p(x, y) is always
an integral of the partial differential equation (1*) of the first order; and conversely, if
p(x, y) denotes any solution of the partial differential equation (1*) of the first order, all
the non-singular integrals of the ordinary differential equation (2) of the first order are at
the same time integrals of the differential equation (1) of the second order; or, in short, if
y_x = p(x, y) is an integral equation of the first order of the differential equation (1) of the
second order, then p(x, y) represents an integral of the partial differential equation (1*) and
conversely; the integral curves of the ordinary differential equation of the second order
are therefore, at the same time, the characteristics of the partial differential equation (1*)
of the first order.
In the present case we may find the same result by means of a simple calculation; for
this gives us the differential equations (1) and (1*) in question in the form
... lacuna ...
The close relation derived before and just proved between the ordinary differential
equation (1) of the second order and the partial differential equation (1*) of the first
order, is, as it seems to me, of fundamental significance for the calculus of variations. For,
from the fact that the integral J* is independent of the path of integration it follows that
$$\int_a^b \left\{ F(p) + (y_x - p) F_p(p) \right\} dx = \int_a^b F(\bar{y}_x)\, dx, \tag{3}$$
if we think of the left hand integral as taken along any path y and the right hand integral
along an integral curve ȳ of the differential equation
$$\bar{y}_x = p(x, \bar{y}).$$
Since, therefore, the solution depends only on finding an integral p(x, y) which is single
valued and continuous in a certain neighborhood of the integral curve ȳ which we are
considering, the developments just indicated lead immediately—without the introduction
of the second variation, but only by the application of the polar process to the differential
equation (1)—to the expression of Jacobi’s condition and to the answer to the question:
How far this condition of Jacobi’s in conjunction with Weierstrass’s condition E > 0 is
necessary and sufficient for the occurrence of a minimum.
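Formula (4), to which the text refers below in connection with (IV), does not survive in this transcription; by analogy with formula (IV) for double integrals, and subtracting (3) from the integral of F(y_x), it reads (this reconstruction is editorial):

$$\int_a^b F(y_x)\, dx - \int_a^b F(\bar{y}_x)\, dx = \int_a^b \mathrm{E}(y_x, p)\, dx, \qquad \mathrm{E}(y_x, p) = F(y_x) - F(p) - (y_x - p) F_p(p), \tag{4}$$

where E is Weierstrass's E-function.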
... lacuna ...
$$J = \int F(z_x, z_y)\, d\omega,$$
to be extended over a given region ω, the vanishing of the first variation (to be understood
in the usual sense)
$$\delta J = 0$$
... lacuna ...
$$\delta J^{*} = 0$$
in the sense which the new formulation of the question demands, gives the equation
$$\frac{\partial A}{\partial x} + \frac{\partial B}{\partial y} + \frac{\partial C}{\partial z} = 0,$$
i.e., we find for the functions p and q of the three variables x, y and z the differential
equation of the first order
$$\frac{\partial F_p}{\partial x} + \frac{\partial F_q}{\partial y} + \frac{\partial (p F_p + q F_q - F)}{\partial z} = 0. \tag{I}$$
... lacuna ...
two functions p and q of the three variables x, y, and z stand toward one another in a
relation exactly analogous to that in which the differential equations (1) and (1*) stood
in the case of the simple integral.
It follows from the fact that the integral J* is independent of the choice of the surface
of integration z that
$$\int \left\{ F(p, q) + (z_x - p) F_p(p, q) + (z_y - q) F_q(p, q) \right\} d\omega = \int F(\bar{z}_x, \bar{z}_y)\, d\omega, \tag{III}$$
if we think of the right hand integral as taken over an integral surface z̄ of the partial
differential equations
$$\bar{z}_x = p(x, y, \bar{z}), \qquad \bar{z}_y = q(x, y, \bar{z});$$
and with the help of this formula we arrive at once at the formula
$$\int F(z_x, z_y)\, d\omega - \int F(\bar{z}_x, \bar{z}_y)\, d\omega = \int \mathrm{E}(z_x, z_y, p, q)\, d\omega,$$
$$\mathrm{E}(z_x, z_y, p, q) = F(z_x, z_y) - F(p, q) - (z_x - p) F_p(p, q) - (z_y - q) F_q(p, q), \tag{IV}$$
which plays the same role for the variation of double integrals as the previously given
formula (4) for simple integrals. With the help of this formula we can now answer the
question how far Jacobi’s condition in conjunction with Weierstrass’s condition E > 0 is
necessary and sufficient for the occurrence of a minimum.
Connected with these developments is the modified form in which A. Kneser,[52] beginning
from other points of view, has presented Weierstrass’s theory. While Weierstrass employed
integral curves of equation (1) which pass through a fixed point in order to derive sufficient
conditions for the extreme values, Kneser on the other hand makes use of any simple family
of such curves and constructs for every such family a solution, characteristic for that
family, of that partial differential equation which is to be considered as a generalization
of the Jacobi-Hamilton equation.
—###—
The problems mentioned are merely samples of problems, yet they will suffice to show
how rich, how manifold and how extensive the mathematical science of today is, and
the question is urged upon us whether mathematics is doomed to the fate of those other
sciences that have split up into separate branches, whose representatives scarcely under-
stand one another and whose connection becomes ever more loose. I do not believe this
nor wish it. Mathematical science is in my opinion an indivisible whole, an organism
whose vitality is conditioned upon the connection of its parts. For with all the variety
of mathematical knowledge, we are still clearly conscious of the similarity of the logical
devices, the relationship of the ideas in mathematics as a whole and the numerous analo-
gies in its different departments. We also notice that, the farther a mathematical theory
is developed, the more harmoniously and uniformly does its construction proceed, and
unsuspected relations are disclosed between hitherto separate branches of the science. So
it happens that, with the extension of mathematics, its organic character is not lost but
only manifests itself the more clearly.
But, we ask, with the extension of mathematical knowledge will it not finally become
impossible for the single investigator to embrace all departments of this knowledge? In
answer let me point out how thoroughly it is ingrained in mathematical science that every
real advance goes hand in hand with the invention of sharper tools and simpler methods
which at the same time assist in understanding earlier theories and cast aside older more
complicated developments. It is therefore possible for the individual investigator, when
he makes these sharper tools and simpler methods his own, to find his way more easily in
the various branches of mathematics than is possible in any other science.
The organic unity of mathematics is inherent in the nature of this science, for math-
ematics is the foundation of all exact knowledge of natural phenomena. That it may
completely fulfill this high mission, may the new century bring it gifted masters and
many zealous and enthusiastic disciples!
—###—
Original references
50 — Text-books:
Moigno and Lindelöf, Leçons du calcul des variations, Mallet-Bachelier, Paris, 1861, and
A. Kneser, Lehrbuch der Variationsrechnung, Vieweg, Braunschweig, 1900.
51 — As an indication of the contents of this work, it may here be noted that for the
simplest problems Kneser derives sufficient conditions of the extreme even for the case that
one limit of integration is variable, and employs the envelope of a family of curves satisfying
the differential equations of the problem to prove the necessity of Jacobi’s conditions of
the extreme. Moreover, it should be noticed that Kneser applies Weierstrass’s theory also
to the inquiry for the extreme of such quantities as are defined by differential equations.
Other references
1996 — S. Chern: Remarks on Hilbert’s 23rd problem. Math. Intelligencer 18, no. 4, 7-8.
Note: These problems are sufficiently open ended that there is no general agreement as
to whether or not they have been “solved”. ♦
Exercise: Do a literature survey to judge the extent to which these problems have
actually been “solved”. If you find something new and interesting, publish. ♦
Appendix G
Riemann:
On the hypotheses which underlie
the foundations of geometry
[Nature, Vol. VIII. Nos. 183, 184, pp. 14–17, 36, 37.]
Transcribed by D. R. Wilkins
It is known that geometry assumes, as things given, both the notion of space and the
first principles of constructions in space. She gives definitions of them which are merely
nominal, while the true determinations appear in the form of axioms. The relation of
these assumptions remains consequently in darkness; we neither perceive whether and
how far their connection is necessary, nor a priori, whether it is possible.
From Euclid to Legendre (to name the most famous of modern reforming geometers)
this darkness was cleared up neither by mathematicians nor by such philosophers as
concerned themselves with it. The reason of this is doubtless that the general notion of
multiply extended magnitudes (in which space-magnitudes are included) remained entirely
unworked. I have in the first place, therefore, set myself the task of constructing the notion
of a multiply extended magnitude out of general notions of magnitude. It will follow
from this that a multiply extended magnitude is capable of different measure-relations,
and consequently that space is only a particular case of a triply extended magnitude.
But hence flows as a necessary consequence that the propositions of geometry cannot
be derived from general notions of magnitude, but that the properties which distinguish
space from other conceivable triply extended magnitudes are only to be deduced from
experience. Thus arises the problem, to discover the simplest matters of fact from which
the measure-relations of space may be determined; a problem which from the nature of
the case is not completely determinate, since there may be several systems of matters
of fact which suffice to determine the measure-relations of space—the most important
system for our present purpose being that which Euclid has laid down as a foundation.
These matters of fact are—like all matters of fact—not necessary, but only of empirical
certainty; they are hypotheses. We may therefore investigate their probability, which
within the limits of observation is of course very great, and inquire about the justice of
their extension beyond the limits of observation, on the side both of the infinitely great
and of the infinitely small.
In proceeding to attempt the solution of the first of these problems, the development
of the notion of a multiply extended magnitude, I think I may the more claim indulgent
criticism in that I am not practised in such undertakings of a philosophical nature where
the difficulty lies more in the notions themselves than in the construction; and that besides
some very short hints on the matter given by Privy Councillor Gauss in his second memoir
on Biquadratic Residues, in the Göttingen Gelehrte Anzeige, and in his Jubilee-book, and
some philosophical researches of Herbart, I could make use of no previous labours.
§ 3. I shall show how conversely one may resolve a variability whose region is given
into a variability of one dimension and a variability of fewer dimensions. To this end
let us suppose a variable piece of a manifoldness of one dimension—reckoned from a
fixed origin, that the values of it may be comparable with one another—which has for
every point of the given manifoldness a definite value, varying continuously with the
point; or, in other words, let us take a continuous function of position within the given
manifoldness, which, moreover, is not constant throughout any part of that manifoldness.
Every system of points where the function has a constant value, forms then a continuous
manifoldness of fewer dimensions than the given one. These manifoldnesses pass over
continuously into one another as the function changes; we may therefore assume that out
of one of them the others proceed, and speaking generally this may occur in such a way
that each point passes over into a definite point of the other; the cases of exception (the
study of which is important) may here be left unconsidered. Hereby the determination
of position in the given manifoldness is reduced to a determination of quantity and to a
determination of position in a manifoldness of less dimensions. It is now easy to show that
this manifoldness has n − 1 dimensions when the given manifold is n-ply extended. By
repeating then this operation n times, the determination of position in an n-ply extended
manifoldness is reduced to n determinations of quantity, and therefore the determination
of position in a given manifoldness is reduced to a finite number of determinations of
quantity when this is possible. There are manifoldnesses in which the determination
of position requires not a finite number, but either an endless series or a continuous
manifoldness of determinations of quantity. Such manifoldnesses are, for example, the
possible determinations of a function for a given region, the possible shapes of a solid
figure, &c.
Having constructed the notion of a manifoldness of n dimensions, and found that its true
character consists in the property that the determination of position in it may be reduced
to n determinations of magnitude, we come to the second of the problems proposed above,
viz. the study of the measure-relations of which such a manifoldness is capable, and of
the conditions which suffice to determine them. These measure-relations can only be
studied in abstract notions of quantity, and their dependence on one another can only be
represented by formulæ. On certain assumptions, however, they are decomposable into
relations which, taken separately, are capable of geometric representation; and thus it
becomes possible to express geometrically the calculated results. In this way, to come to
solid ground, we cannot, it is true, avoid abstract considerations in our formulæ, but at
least the results of calculation may subsequently be presented in a geometric form. The
foundations of these two parts of the question are established in the celebrated memoir
of Gauss, Disquisitiones generales circa superficies curvas.
... lacuna ...
determination of a line comes to the giving of these quantities as functions of one variable.
The problem consists then in establishing a mathematical expression for the length of a
line, and to this end we must consider the quantities x as expressible in terms of certain
units. I shall treat this problem only under certain restrictions, and I shall confine myself
in the first place to lines in which the ratios of the increments dx of the respective variables
vary continuously. We may then conceive these lines broken up into elements, within
which the ratios of the quantities dx may be regarded as constant; and the problem is
then reduced to establishing for each point a general expression for the linear element
ds starting from that point, an expression which will thus contain the quantities x and
the quantities dx. I shall suppose, secondly, that the length of the linear element, to the
first order, is unaltered when all the points of this element undergo the same infinitesimal
displacement, which implies at the same time that if all the quantities dx are increased in
the same ratio, the linear element will vary also in the same ratio. On these suppositions,
the linear element may be any homogeneous function of the first degree of the quantities
dx, which is unchanged when we change the signs of all the dx, and in which the arbitrary
constants are continuous functions of the quantities x. To find the simplest cases, I
shall seek first an expression for manifoldnesses of n − 1 dimensions which are everywhere
equidistant from the origin of the linear element; that is, I shall seek a continuous function
of position whose values distinguish them from one another. In going outwards from the
origin, this must either increase in all directions or decrease in all directions; I assume
that it increases in all directions, and therefore has a minimum at that point. If, then,
the first and second differential coefficients of this function are finite, its first differential
must vanish, and the second differential cannot become negative; I assume that it is
always positive. This differential expression of the second order remains constant when
ds remains constant, and increases in the duplicate ratio when the dx, and therefore also
ds, increase in the same ratio; it must therefore be ds2 multiplied by a constant, and
consequently ds is the square root of an always positive integral homogeneous function of
the second order of the quantities dx, in which the coefficients are continuous functions
of the quantities x. For Space, when the position of points is expressed by rectilinear
co-ordinates, $ds = \sqrt{\sum (dx)^2}$; Space is therefore included in this simplest case. The
next case in simplicity includes those manifoldnesses in which the line-element may be
expressed as the fourth root of a quartic differential expression. The investigation of this
more general kind would require no really different principles, but would take considerable
time and throw little new light on the theory of space, especially as the results cannot be
geometrically expressed; I restrict myself, therefore, to those manifoldnesses in which the
line element is expressed as the square root of a quadric differential expression. Such an
expression we can transform into another similar one if we substitute for the n independent
variables functions of n new independent variables. In this way, however, we cannot
transform any expression into any other; since the expression contains $\frac{1}{2}n(n+1)$ coefficients
which are arbitrary functions of the independent variables; now by the introduction of
new variables we can only satisfy n conditions, and therefore make no more than n of the
coefficients equal to given quantities. The remaining $\frac{1}{2}n(n-1)$ are then entirely determined
by the nature of the continuum to be represented, and consequently $\frac{1}{2}n(n-1)$ functions
of position are required for the determination of its measure-relations.
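As an editorial illustration of Riemann's count (not part of the memoir): a quadric line-element in n variables has $\frac{1}{2}n(n+1)$ coefficient functions; a change of variables can be used to fix n of them, leaving $\frac{1}{2}n(n-1)$ determined by the continuum itself. For n = 2 this is a single function of position (Gauss's curvature of a surface); for n = 3 it is three. A familiar quadric line-element with position-dependent coefficients is the plane in polar co-ordinates,

$$ds^2 = dr^2 + r^2\, d\theta^2,$$

which nevertheless describes a flat manifoldness, since the substitution $x = r\cos\theta$, $y = r\sin\theta$ reduces it to $ds^2 = dx^2 + dy^2$.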
§ 2. For this purpose let us imagine that from any given point the system of shortest
lines going out from it is constructed; the position of an arbitrary point may then be
determined by the initial direction of the geodesic in which it lies, and by its distance
measured along that line from the origin. It can therefore be expressed in terms of the
ratios dx0 of the quantities dx in this geodesic, and of the length s of this line. Let us
introduce now instead of the dx0 linear functions dx of them, such that the initial value
of the square of the line-element shall equal the sum of the squares of these expressions,
so that the independent variables are now the length s and the ratios of the quantities
dx. Lastly, take instead of the dx quantities $x_1, x_2, x_3, \ldots, x_n$ proportional to them, but
such that the sum of their squares $= s^2$. When we introduce these quantities, the square
of the line-element is $\sum dx^2$ for infinitesimal values of the x, but the term of next order
in it is equal to a homogeneous function of the second order of the $\frac{1}{2}n(n-1)$ quantities
$(x_1\, dx_2 - x_2\, dx_1), (x_1\, dx_3 - x_3\, dx_1), \ldots$, an infinitesimal, therefore, of the fourth order; so
that we obtain a finite quantity on dividing this by the square of the infinitesimal triangle,
whose vertices are (0, 0, 0, . . .), (x1 , x2 , x3 , . . .), (dx1 , dx2 , dx3 , . . .). This quantity retains
the same value so long as the x and the dx are included in the same binary linear form, or so
long as the two geodesics from 0 to x and from 0 to dx remain in the same surface-element;
it depends therefore only on place and direction. It is obviously zero when the manifold
represented is flat, i.e., when the squared line-element is reducible to $\sum dx^2$, and may
therefore be regarded as the measure of the deviation of the manifoldness from flatness at
the given point in the given surface-direction. Multiplied by $-\frac{3}{4}$ it becomes equal to the
quantity which Privy Councillor Gauss has called the total curvature of a surface. For
the determination of the measure-relations of a manifoldness capable of representation in
the assumed form we found that $\frac{1}{2}n(n-1)$ place-functions were necessary; if, therefore,
the curvature at each point in $\frac{1}{2}n(n-1)$ surface-directions is given, the measure-relations
of the continuum may be determined from them—provided there be no identical relations
among these values, which in fact, to speak generally, is not the case. In this way the
measure-relations of a manifoldness in which the line-element is the square root of a
quadric differential may be expressed in a manner wholly independent of the choice of
independent variables. A method entirely similar may for this purpose be applied also to
the manifoldness in which the line-element has a less simple expression, e.g., the fourth
root of a quartic differential. In this case the line-element, generally speaking, is no longer
reducible to the form of the square root of a sum of squares, and therefore the deviation
from flatness in the squared line-element is an infinitesimal of the second order, while in
those manifoldnesses it was of the fourth order. This property of the last-named continua
may thus be called flatness of the smallest parts. The most important property of these
continua for our present purpose, for whose sake alone they are here investigated, is that
the relations of the twofold ones may be geometrically represented by surfaces, and of
the morefold ones may be reduced to those of the surfaces included in them; which now
requires a short further discussion.
§ 3. In the idea of surfaces, together with the intrinsic measure-relations in which only
the length of lines on the surfaces is considered, there is always mixed up the position
of points lying out of the surface. We may, however, abstract from external relations if
we consider such deformations as leave unaltered the length of lines—i.e., if we regard
the surface as bent in any way without stretching, and treat all surfaces so related to
each other as equivalent. Thus, for example, any cylindrical or conical surface counts as
equivalent to a plane, since it may be made out of one by mere bending, in which the
intrinsic measure-relations remain, and all theorems about a plane—therefore the whole
of planimetry—retain their validity. On the other hand they count as essentially different
from the sphere, which cannot be changed into a plane without stretching. According
to our previous investigation the intrinsic measure-relations of a twofold extent in which
the line-element may be expressed as the square root of a quadric differential, which is
the case with surfaces, are characterised by the total curvature. Now this quantity in the
case of surfaces is capable of a visible interpretation, viz., it is the product of the two
curvatures of the surface, or multiplied by the area of a small geodesic triangle, it is equal
to the spherical excess of the same. The first definition assumes the proposition that the
product of the two radii of curvature is unaltered by mere bending; the second, that in
the same place the area of a small triangle is proportional to its spherical excess. To give
an intelligible meaning to the curvature of an n-fold extent at a given point and in a given
surface-direction through it, we must start from the fact that a geodesic proceeding from
a point is entirely determined when its initial direction is given. According to this we
obtain a determinate surface if we prolong all the geodesics proceeding from the given
point and lying initially in the given surface-direction; this surface has at the given point
a definite curvature, which is also the curvature of the n-fold continuum at the given point
in the given surface-direction.
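The relation between curvature, area, and spherical excess mentioned above can be checked on a sphere of radius R (an editorial example, not in Riemann's text): the total curvature is $K = 1/R^2$, and for the geodesic triangle with three right angles, one octant of the sphere,

$$\alpha + \beta + \gamma - \pi = \frac{3\pi}{2} - \pi = \frac{\pi}{2}, \qquad K \times \text{area} = \frac{1}{R^2} \cdot \frac{4\pi R^2}{8} = \frac{\pi}{2},$$

so the spherical excess indeed equals the curvature multiplied by the area; on a surface of constant curvature the relation holds exactly, not only for small triangles.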
§ 4. Before we make the application to space, some considerations about flat manifold-
ness in general are necessary; i.e., about those in which the square of the line-element
is expressible as a sum of squares of complete differentials; such manifoldnesses I will
call flat. In order now
to review the true varieties of all the continua which may be represented in the assumed
form, it is necessary to get rid of difficulties arising from the mode of representation,
which is accomplished by choosing the variables in accordance with a certain principle.
In a flat n-fold extent the total curvature is zero at all points in every direction; it is
... lacuna ...
§ 5. The theory of surfaces of constant curvature will serve for a geometric illustration.
It is easy to see that a surface whose curvature is positive may always be rolled on a sphere
whose radius is unity divided by the square root of the curvature; but to review the
entire manifoldness of these surfaces, let one of them have the form of a sphere and the
rest the form of surfaces of revolution touching it at the equator. The surfaces with
greater curvature than this sphere will then touch the sphere internally, and take a form
like the outer portion (from the axis) of the surface of a ring; they may be rolled upon
zones of spheres having new radii, but will go round more than once. The surfaces with
less positive curvature are obtained from spheres of larger radii, by cutting out the lune
bounded by two great half-circles and bringing the section-lines together. The surface
with curvature zero will be a cylinder standing on the equator; the surfaces with negative
curvature will touch the cylinder externally and be formed like the inner portion (towards
the axis) of the surface of a ring. If we regard these surfaces as locus in quo for surface-
regions moving in them, as Space is locus in quo for bodies, the surface-regions can be
moved in all these surfaces without stretching. The surfaces with positive curvature can
always be so formed that surface-regions may also be moved arbitrarily about upon them
without bending, namely (they may be formed) into sphere-surfaces; but not those with
negative-curvature. Besides this independence of surface-regions from position there is in
surfaces of zero curvature also an independence of direction from position, which in the
former surfaces does not exist.
... lacuna ...
the metric properties of space, if we assume the independence of line-length from position
and expressibility of the line-element as the square root of a quadric differential, that is
to say, flatness in the smallest parts.
First, they may be expressed thus: that the curvature at each point is zero in three
surface-directions; and thence the metric properties of space are determined if the sum of
the angles of a triangle is always equal to two right angles.
... lacuna ...
Thirdly, one might, instead of taking the length of lines to be independent of position
and direction, assume also an independence of their length and direction from position.
According to this conception changes or differences of position are complex magnitudes
expressible in three independent units.
§ 2. In the course of our previous inquiries, we first distinguished between the relations of
extension or partition and the relations of measure, and found that with the same extensive
properties, different measure-relations were conceivable; we then investigated the system
of simple size-fixings by which the measure-relations of space are completely determined,
and of which all propositions about them are a necessary consequence; it remains to discuss
the question how, in what degree, and to what extent these assumptions are borne out
by experience. In this respect there is a real distinction between mere extensive relations,
and measure-relations; in so far as in the former, where the possible cases form a discrete
manifoldness, the declarations of experience are indeed not quite certain, but still not
inaccurate; while in the latter, where the possible cases form a continuous manifoldness,
every determination from experience remains always inaccurate: be the probability ever
so great that it is nearly exact. This consideration becomes important in the extensions of
these empirical determinations beyond the limits of observation to the infinitely great and
infinitely small; since the latter may clearly become more inaccurate beyond the limits of
observation, but not the former.
§ 3. The questions about the infinitely great are for the interpretation of nature useless
questions. But this is not the case with the questions about the infinitely small. It
is upon the exactness with which we follow phenomena into the infinitely small that our
knowledge of their causal relations essentially depends. The progress of recent centuries in
the knowledge of mechanics depends almost entirely on the exactness of the construction
which has become possible through the invention of the infinitesimal calculus, and through
the simple principles discovered by Archimedes, Galileo, and Newton, and used by modern
physic. But in the natural sciences which are still in want of simple principles for such
constructions, we seek to discover the causal relations by following the phenomena into
great minuteness, so far as the microscope permits. Questions about the measure-relations
of space in the infinitely small are not therefore superfluous questions.
The question of the validity of the hypotheses of geometry in the infinitely small is
bound up with the question of the ground of the metric relations of space. In this last
question, which we may still regard as belonging to the doctrine of space, is found the
application of the remark made above; that in a discrete manifoldness, the ground of
its metric relations is given in the notion of it, while in a continuous manifoldness, this
ground must come from outside. Either therefore the reality which underlies space must
form a discrete manifoldness, or we must seek the ground of its metric relations outside
it, in binding forces which act upon it.
The answer to these questions can only be got by starting from the conception of
phenomena which has hitherto been justified by experience, and which Newton assumed
as a foundation, and by making in this conception the successive changes required by facts
which it cannot explain. Researches starting from general notions, like the investigation
we have just made, can only be useful in preventing this work from being hampered by
too narrow views, and progress in knowledge of the interdependence of things from being
checked by traditional prejudices.
This leads us into the domain of another science, of physic, into which the object of this
work does not allow us to go to-day.
Synopsis.
—###—
Bibliography
Specialized items:
[With lots of differential geometry in the discussion.]
—###—