Manifolds 2014 Parallelization

The document discusses manifolds, which are mathematical spaces that are locally similar to Euclidean space. It defines topological manifolds as spaces that can be covered by coordinate charts that map open sets to open sets in Euclidean space, making the space locally homeomorphic to Euclidean space. Examples of topological manifolds include circles, spheres, tori, and the general linear group of invertible matrices. The document then introduces smooth manifolds, which require coordinate charts to be smoothly compatible on overlaps through transition functions.


Manifolds

Ed Segal
Autumn 2014

Contents

1 Introduction
2 Topological manifolds and smooth manifolds
2.1 Topological manifolds
2.2 Smooth atlases
2.3 Smooth structures
3 Submanifolds
3.1 Definition of a submanifold
3.2 Level sets
4 Smooth functions
4.1 Definition of a smooth function
4.2 The rank of a smooth function
4.3 Some special kinds of smooth functions
5 Tangent spaces
5.1 Tangent vectors via curves
5.2 Tangent spaces to submanifolds
5.3 A second definition of tangent vectors
5.4 A third definition of tangent vectors
6 Vector fields
6.1 The tangent bundle
6.2 Vector fields and flows
6.3 Other definitions of vector fields
6.4 Vector bundles
7 Covectors and one-forms
7.1 Covectors
7.2 The cotangent bundle and one-forms
8 Differential forms
8.1 Antisymmetric multi-linear maps and the wedge product
8.2 p-forms
9 Integration
9.1 Orientations
9.2 Defining integration
9.3 Stokes' Theorem
A Topological spaces
B The Hausdorff condition and bump functions

1 Introduction

A manifold is a particular kind of mathematical space, which encodes an
idea of "smoothness". Manifolds are the most general kind of space on which we
can easily do calculus, i.e. differentiation and integration. This makes them very
important, and they're fundamental objects in geometry, topology, and analysis,
as well as having lots of uses in applied maths and theoretical physics.
The simplest example of a manifold is the real vector space $\mathbb{R}^n$, for any
$n$. More generally, a manifold is a space that "locally looks like $\mathbb{R}^n$", so if you
zoom in close enough, you can't tell that you're not in $\mathbb{R}^n$.
Example 1.1. The surface of the Earth is approximately a 2-dimensional
sphere, a space that we denote $S^2$. There's a myth that people used to think
the Earth was flat. The myth is false: the ancient Greeks already
had a decent estimate of the radius of the Earth! But the story has a grain of
plausibility, because close up the Earth does look flat, and we could imagine
that we're living on the surface of the plane $\mathbb{R}^2$. Hence the sphere $S^2$ is an
example of a 2-dimensional manifold.
Example 1.2. The surface of a ring doughnut is a space we call a (2-dimensional)
torus, and denote $T^2$ (see Figure 1). If you were a very small
creature sitting on the doughnut, it wouldn't be immediately obvious that
you weren't sitting on $\mathbb{R}^2$. So $T^2$ is another example of a 2-dimensional
manifold.

Figure 1: A torus.
Let's go down a dimension:
Example 1.3. A circle, sometimes denoted $S^1$, is an example of a 1-dimensional
manifold. A small piece of a circle looks just like a small piece of the real
line $\mathbb{R}$.
Here's an example of a different flavour:
Example 1.4. Let $\mathrm{Mat}_{2\times 2}(\mathbb{R})$ be the set of all $2 \times 2$ real matrices. This is a
4-dimensional real vector space, so it's isomorphic to $\mathbb{R}^4$. Now let
$$GL_2(\mathbb{R}) \subset \mathrm{Mat}_{2\times 2}(\mathbb{R})$$
denote the subset of invertible matrices. If $M$ is an invertible matrix, and $N$
is any matrix whose entries are sufficiently small numbers, then the matrix
$M + N$ will still be invertible. So every matrix near $M$ also lies in $GL_2(\mathbb{R})$
(i.e. $GL_2(\mathbb{R})$ is an open subset). This means that a small neighbourhood of
$M$ looks exactly like a small neighbourhood of the origin in $\mathrm{Mat}_{2\times 2}(\mathbb{R}) \cong \mathbb{R}^4$.
Hence $GL_2(\mathbb{R})$ is an example of a 4-dimensional manifold.
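As a numerical sanity check (an illustration only, not a proof), the following Python sketch perturbs an invertible matrix by a matrix with small entries and confirms that the determinant stays away from zero. The particular matrices `M` and `N` and the helper names `det2` and `perturb` are our own choices for this illustration.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as a nested list [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c


def perturb(m, n):
    """Entrywise sum M + N of two 2x2 matrices."""
    return [[m[i][j] + n[i][j] for j in range(2)] for i in range(2)]


# An invertible matrix (its determinant is 1) ...
M = [[2.0, 1.0], [1.0, 1.0]]
# ... and a matrix whose entries are small (hypothetical values).
N = [[1e-3, -2e-3], [5e-4, 1e-3]]

print("det(M)     =", det2(M))
print("det(M + N) =", det2(perturb(M, N)))
```

The determinant is a continuous (indeed polynomial) function of the four entries, which is why a small enough perturbation cannot make it vanish.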
This is an example of a Lie group, a group that is also a manifold. Lie
groups are very important, but they won't really be covered in this course.
We often picture manifolds as being subsets of some larger vector space,
e.g. we think of $S^2$ or $T^2$ as smooth surfaces sitting inside $\mathbb{R}^3$. This is very
helpful for our intuition, but the theory becomes much more powerful when
we can talk about manifolds abstractly, without reference to any ambient
vector space. A lot of the hard work in this course will involve developing
the necessary machinery so that we can do this.

2 Topological manifolds and smooth manifolds

2.1 Topological manifolds

We now begin formalizing the concept of a manifold. The full definition is
rather complicated, so we begin with a simpler version, called a topological
manifold.
Definition 2.1. Let $X$ be a topological space. A co-ordinate chart on $X$
is the data of:
• An open set $U \subset X$.
• An open set $\tilde{U} \subset \mathbb{R}^n$, for some $n$.
• A homeomorphism $f : U \to \tilde{U}$.
When we want to specify a co-ordinate chart we always need to specify
this triple $(U, \tilde{U}, f)$, but often we'll be lazy and just write $(U, f)$, leaving the
$\tilde{U}$ implicit.
The key distinguishing property of manifolds is that co-ordinate charts
exist!
Definition 2.2. Let $X$ be a topological space, and fix a natural number
$n \in \mathbb{N}$. We say that $X$ is an $n$-dimensional topological manifold iff for
any point $x \in X$ we can find a co-ordinate chart
$$f : U \to \tilde{U} \subset \mathbb{R}^n$$
with $x \in U$.
In words, this says that at any point in $X$ we can find an open neighbourhood
which is homeomorphic to some open set in $\mathbb{R}^n$. A concise (but
slightly imprecise) way to say this is "$X$ is locally homeomorphic to $\mathbb{R}^n$".
Example 2.3. The circle $S^1$ is a 1-dimensional topological manifold. Let's
prove this carefully. Firstly, let's define $S^1$ to be the subset
$$S^1 = \{(x, y);\ x^2 + y^2 = 1\} \subset \mathbb{R}^2$$
and equip it with the subspace topology. Next we need to find some co-ordinate
charts; we'll do this using stereographic projection.
Let $(x, y)$ be a point in $S^1$, not equal to $(0, -1)$. Draw a straight line
through $(x, y)$ and the point $(0, -1)$, and let $\tilde{x} \in \mathbb{R}$ be the point where this
line crosses the $x$-axis, so:
$$\tilde{x} = \frac{x}{1 + y}$$
Figure 2: Stereographic projection.


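This sets up a bijection between points in $S^1$ (apart from $(0, -1)$) and
points in the $x$-axis. So let's set
$$U_1 = S^1 \setminus \{(0, -1)\}$$
and note that this is an open set, since it's the intersection of $S^1$ with the
open set $\{y \neq -1\} \subset \mathbb{R}^2$. Now set $\tilde{U}_1 = \mathbb{R}$, and
$$f_1 : U_1 \to \tilde{U}_1, \qquad (x, y) \mapsto \tilde{x} = \frac{x}{1 + y}$$
Then $f_1$ is continuous, since it's the restriction to $U_1$ of a continuous function
defined on $\{y \neq -1\} \subset \mathbb{R}^2$. To show that $f_1$ is a bijection we write down the
inverse function:
$$f_1^{-1} : \tilde{U}_1 \to U_1, \qquad \tilde{x} \mapsto \left( \frac{2\tilde{x}}{1 + \tilde{x}^2},\ \frac{1 - \tilde{x}^2}{1 + \tilde{x}^2} \right)$$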

An elementary calculation shows that $f_1^{-1}(\tilde{x})$ really does lie in $U_1$ for any
$\tilde{x} \in \mathbb{R}$, and that $f_1$ and $f_1^{-1}$ really are inverse to each other. Also $f_1^{-1}$ is
continuous (since it's evidently continuous when viewed as a function to $\mathbb{R}^2$),
so we conclude that $f_1$ is a homeomorphism. The triple $(U_1, \tilde{U}_1, f_1)$ defines
our first co-ordinate chart.
For our second co-ordinate chart, we use the same trick but we project
from the point $(0, 1)$ instead. So we define $U_2 = S^1 \setminus \{(0, 1)\}$ and $\tilde{U}_2 = \mathbb{R}$, and:
$$f_2 : U_2 \to \tilde{U}_2, \qquad (x, y) \mapsto \frac{x}{1 - y}$$
We repeat the previous arguments to check that this is also a co-ordinate
chart. Now any point in $S^1$ lies in either $U_1$ or $U_2$ (most points lie in both)
so we have proved that $S^1$ is a 1-dimensional topological manifold.
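The "elementary calculation" for the first chart can be spot-checked numerically. The sketch below is only a sanity check at a few sample points, not a proof; the function names and the sample values are our own.

```python
import math


def f1(x, y):
    """Stereographic projection from (0, -1); defined when y != -1."""
    return x / (1 + y)


def f1_inv(t):
    """Claimed inverse of f1: sends a real number t to a point of the circle."""
    return (2 * t / (1 + t * t), (1 - t * t) / (1 + t * t))


def f2(x, y):
    """Stereographic projection from (0, 1); defined when y != 1."""
    return x / (1 - y)


for t in (-3.0, -0.5, 0.0, 0.25, 10.0):
    x, y = f1_inv(t)
    # The image point lies on the circle x^2 + y^2 = 1 ...
    assert math.isclose(x * x + y * y, 1.0)
    # ... and projecting it back recovers t.
    assert math.isclose(f1(x, y), t, abs_tol=1e-12)
```

Note that `f1_inv` never returns the point $(0, -1)$, since that would require $1 + t^2 = 0$; this matches the fact that $(0, -1)$ is excluded from $U_1$.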
Now let's do the same thing for the $n$-dimensional sphere $S^n$.
Example 2.4. Let
$$S^n = \Big\{ (x_0, ..., x_n);\ \sum x_i^2 = 1 \Big\} \subset \mathbb{R}^{n+1}$$
with the subspace topology. We get our first co-ordinate chart using stereographic
projection from the point $(0, ..., 0, -1)$ (the "south pole"). So we
define
$$U_1 = S^n \setminus \{(0, ..., 0, -1)\}, \qquad \tilde{U}_1 = \mathbb{R}^n$$
and
$$f_1 : U_1 \to \tilde{U}_1, \qquad (x_0, ..., x_n) \mapsto \left( \frac{x_0}{1 + x_n}, ..., \frac{x_{n-1}}{1 + x_n} \right)$$
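We can prove that $f_1$ is a homeomorphism using the arguments from the
previous example; in particular the inverse to $f_1$ is the function
$$f_1^{-1} : (\tilde{x}_0, ..., \tilde{x}_{n-1}) \mapsto \left( \frac{2\tilde{x}_0}{1 + \sum \tilde{x}_i^2},\ ...,\ \frac{2\tilde{x}_{n-1}}{1 + \sum \tilde{x}_i^2},\ \frac{1 - \sum \tilde{x}_i^2}{1 + \sum \tilde{x}_i^2} \right)$$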
For our second co-ordinate chart we project from the point $(0, ..., 0, 1)$ (the
"north pole"), i.e. we set
$$U_2 = S^n \setminus \{(0, ..., 0, 1)\}, \qquad \tilde{U}_2 = \mathbb{R}^n$$
and:
$$f_2 : U_2 \to \tilde{U}_2, \qquad (x_0, ..., x_n) \mapsto \left( \frac{x_0}{1 - x_n}, ..., \frac{x_{n-1}}{1 - x_n} \right)$$
Since every point in $S^n$ lies in at least one of $U_1$ and $U_2$, this proves that $S^n$
is a topological manifold, of dimension $n$.

2.2 Smooth atlases

Let's go back to $S^1$ again. Pick a point $(x, y) \in S^1$ which isn't $(0, 1)$ or
$(0, -1)$. We've found two co-ordinate charts that we could use near this point;
if we use $U_1$ (and $f_1$) then our point has co-ordinate $\frac{x}{1+y}$, but if we use $U_2$
(and $f_2$) then our point has co-ordinate $\frac{x}{1-y}$. How can we switch between the
two co-ordinate systems?
The intersection of our two co-ordinate charts is
$$U_1 \cap U_2 = S^1 \setminus \{(0, 1), (0, -1)\}$$
In this locus, both the functions $f_1$ and $f_2$ are defined. Now notice that in
the co-ordinate chart $U_1$, the point $(0, 1)$ gets mapped to the origin in $\mathbb{R}$. So
the function $f_1$ defines a homeomorphism:
$$f_1 : U_1 \cap U_2 \to \mathbb{R} \setminus 0$$
Similarly, the point $(0, -1)$ lies in the co-ordinate chart $U_2$, and it gets
mapped by $f_2$ to the origin in $\mathbb{R}$. So the function $f_2$ also defines a homeomorphism:
$$f_2 : U_1 \cap U_2 \to \mathbb{R} \setminus 0$$
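To change between co-ordinates we must consider the composition:
$$\phi_{21} = f_2 \circ f_1^{-1} : \mathbb{R} \setminus 0 \to \mathbb{R} \setminus 0$$
This sends $\tilde{x} \in \mathbb{R} \setminus 0$ to:
$$\phi_{21}(\tilde{x}) = f_2\left( \frac{2\tilde{x}}{1 + \tilde{x}^2},\ \frac{1 - \tilde{x}^2}{1 + \tilde{x}^2} \right) = \frac{1}{\tilde{x}}$$
So if our point has co-ordinate $\tilde{x}$ under our first co-ordinate chart, then it has
co-ordinate $1/\tilde{x}$ in our second chart. The function $\phi_{21}$ is called a transition
function.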
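The claim that the transition function on the circle is $\tilde{x} \mapsto 1/\tilde{x}$ can be checked by composing the two maps directly. This is a minimal sketch under our own naming (`phi21` and the sample values are not from the notes), intended only as a numerical sanity check.

```python
import math


def f1_inv(t):
    """Inverse of the first stereographic chart on the circle."""
    return (2 * t / (1 + t * t), (1 - t * t) / (1 + t * t))


def f2(x, y):
    """Second stereographic chart (projection from (0, 1))."""
    return x / (1 - y)


def phi21(t):
    """Transition function f2 composed with the inverse of f1, for t != 0."""
    return f2(*f1_inv(t))


for t in (-4.0, -0.3, 0.7, 5.0):
    assert math.isclose(phi21(t), 1 / t)
```

The value $t = 0$ is excluded because it corresponds to the point $(0, 1)$, which does not lie in $U_2$.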

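More generally, suppose $X$ is any topological manifold, and let
$$f_1 : U_1 \to \tilde{U}_1 \subset \mathbb{R}^n \qquad \text{and} \qquad f_2 : U_2 \to \tilde{U}_2 \subset \mathbb{R}^n$$
be two co-ordinate charts on $X$. The intersection $U_1 \cap U_2$ is an open subset
of $U_1$, and $f_1$ gives us a homeomorphism:
$$f_1 : U_1 \cap U_2 \to f_1(U_1 \cap U_2) \subset \tilde{U}_1$$
The image of this homeomorphism is some open subset of $\mathbb{R}^n$, contained in
$\tilde{U}_1$. Similarly, $f_2$ gives us a homeomorphism
$$f_2 : U_1 \cap U_2 \to f_2(U_1 \cap U_2) \subset \tilde{U}_2$$
onto some other open subset of $\mathbb{R}^n$, which is contained in $\tilde{U}_2$.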


Definition 2.5. Let $X$ be a topological manifold, and let $(U_1, f_1)$ and $(U_2, f_2)$
be two co-ordinate charts on $X$. The transition function between these
two co-ordinate charts is the function:
$$\phi_{21} = f_2 \circ f_1^{-1} : f_1(U_1 \cap U_2) \to f_2(U_1 \cap U_2)$$
The transition function is automatically a homeomorphism between these
two open subsets of $\mathbb{R}^n$, since it's a composition of two homeomorphisms.
Notice that it's possible that the intersection of $U_1$ and $U_2$ is empty, but then
the transition function isn't very interesting!
Also notice that $\phi_{21}$ depends on the ordering of the two co-ordinate charts;
it's the transition function from the chart $U_1$ to the chart $U_2$. If we reverse
the order then we get the transition function
$$\phi_{12} = f_1 \circ f_2^{-1} : f_2(U_1 \cap U_2) \to f_1(U_1 \cap U_2)$$
but these two functions are inverse to each other:
$$\phi_{12} = (\phi_{21})^{-1}$$
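Example 2.6. Let $X = S^n$, and consider the two co-ordinate charts that
we found in Example 2.4. We have
$$U_1 \cap U_2 = S^n \setminus \{(0, ..., 0, 1), (0, ..., 0, -1)\}$$
$$f_1(U_1 \cap U_2) = \mathbb{R}^n \setminus 0$$
$$f_2(U_1 \cap U_2) = \mathbb{R}^n \setminus 0$$
(in this example both $f_1(U_1 \cap U_2)$ and $f_2(U_1 \cap U_2)$ happen to be the same
subset of $\mathbb{R}^n$, but this is a coincidence). The transition function between
these two charts is the function:
$$\phi_{21} : \mathbb{R}^n \setminus 0 \to \mathbb{R}^n \setminus 0, \qquad (\tilde{x}_0, ..., \tilde{x}_{n-1}) \mapsto \left( \frac{\tilde{x}_0}{\sum \tilde{x}_i^2},\ ...,\ \frac{\tilde{x}_{n-1}}{\sum \tilde{x}_i^2} \right)$$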
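The formula for the transition function on $S^n$ can be compared against the composition $f_2 \circ f_1^{-1}$ at sample points. The sketch below represents points as Python lists; the function names and test vector are our own, and the check is numerical rather than a proof.

```python
import math


def f1_inv(xs):
    """Inverse of stereographic projection from the south pole of S^n."""
    s = sum(x * x for x in xs)
    return [2 * x / (1 + s) for x in xs] + [(1 - s) / (1 + s)]


def f2(p):
    """Stereographic projection from the north pole (last co-ordinate != 1)."""
    *head, last = p
    return [x / (1 - last) for x in head]


def phi21(xs):
    """Claimed transition function: divide by the squared length."""
    s = sum(x * x for x in xs)
    return [x / s for x in xs]


xs = [0.5, -1.5, 2.0]  # a non-zero point of R^3, so this tests S^3
composed = f2(f1_inv(xs))
assert all(math.isclose(a, b) for a, b in zip(composed, phi21(xs)))
```

Applying `phi21` twice returns the original point, which is one way to see that the transition function in the other direction is given by the same formula.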
A transition function tells us how to change co-ordinates between two
different charts in some region of our manifold, so it tells us each new co-ordinate
as some continuous function of the old co-ordinates. However, we
don't want our change-of-co-ordinate functions to be merely continuous; we
really want them to be smooth.
Recall that a function
$$F : \mathbb{R}^n \to \mathbb{R}^m$$
is called smooth (or $C^\infty$) if we can take partial derivatives of $F$ to any order,
in any direction. This definition also makes sense if $F$ is only defined on some
open subset of $\mathbb{R}^n$. Since a transition function is a function from an open set
in $\mathbb{R}^n$ to some other open set in $\mathbb{R}^n$, it makes sense to ask if the transition
function is smooth.
Definition 2.7. Let $X$ be a topological manifold. An atlas for $X$ is a
collection of co-ordinate charts on $X$
$$f_i : U_i \to \tilde{U}_i \subset \mathbb{R}^n, \qquad i \in I$$
indexed by some (possibly-infinite) set $I$, such that
$$\bigcup_{i \in I} U_i = X$$
So an atlas is a set of co-ordinate charts that collectively cover the whole
of $X$. By the definition of a topological manifold, an atlas always exists. The
next definition is more important:
Definition 2.8. An atlas for a topological manifold $X$ is called smooth iff
for any two charts in the atlas, the transition function
$$\phi_{ij} : f_j(U_i \cap U_j) \to f_i(U_i \cap U_j), \qquad i, j \in I$$
is a smooth function.
So a smooth atlas is a collection of co-ordinate charts that cover the whole
of $X$, and such that whenever we change co-ordinates the new co-ordinates
depend smoothly on the old co-ordinates.
Example 2.9. Let $X = S^n$, and consider the atlas consisting of the two
co-ordinate charts that we found in Example 2.4. In Example 2.6 we wrote
down the transition function $\phi_{21}$ between the two charts, and it is clearly
a smooth function. By symmetry, the transition function $\phi_{12}$ in the other
direction is also a smooth function (in fact it's easy to check that $\phi_{12} = \phi_{21}$
in this example). Therefore this is a smooth atlas.
The next example shows another way that we might approach the circle.
Example 2.10. Let the group $\mathbb{Z}$ act on the real numbers $\mathbb{R}$ by translations,
so the orbit of a real number $x$ is the set:
$$[x] = \{x + n;\ n \in \mathbb{Z}\}$$
Let $T^1 = \mathbb{R}/\mathbb{Z}$ be the set of orbits, i.e. $T^1$ is the quotient set of $\mathbb{R}$ by the
equivalence relation $x \sim y \iff x - y \in \mathbb{Z}$. Let
$$q : \mathbb{R} \to T^1$$
be the quotient map, which sends $x$ to $[x]$. We give $T^1$ the quotient topology,
so a set $U \subset T^1$ is open iff its preimage $q^{-1}(U)$ is open; this means that $q$
is automatically continuous. Notice that $q$ is also an open mapping, i.e. it
sends open sets to open sets. This is because if $W \subset \mathbb{R}$ is any open set then
$q^{-1}(q(W))$ is the union of all translates of $W$, hence it is an open set, hence
$q(W)$ is an open set in $T^1$.
Notice that every equivalence class apart from $[0]$ has a unique representative
in the interval $[0, 1]$, so:
$$T^1 = [0, 1]/(0 \sim 1)$$
i.e. we take an interval and then glue the two ends together. This gives us a
circle! The notation $T^1$ here means "1-dimensional torus".
We want to find a (smooth) atlas for $T^1$. Let $\tilde{U}_1$ be the open interval
$$\tilde{U}_1 = (0, 1) \subset \mathbb{R}$$
and let:
$$U_1 = q(\tilde{U}_1) \subset T^1$$
Then the quotient map $q : \tilde{U}_1 \to U_1$ is a bijection. We let
$$f_1 : U_1 \to \tilde{U}_1 = (0, 1) \subset \mathbb{R}$$
be the inverse function to $q : \tilde{U}_1 \to U_1$, so $f_1([x])$ is the unique representative
of the orbit $[x]$ which lies in the interval $(0, 1)$. Since $q$ is an open mapping,
the function $f_1$ is continuous, and thus a homeomorphism. Hence $(U_1, f_1)$ is
a co-ordinate chart on $T^1$.
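Now let $\tilde{U}_2 = (-\tfrac{1}{2}, \tfrac{1}{2}) \subset \mathbb{R}$, so we get a second co-ordinate chart by
defining $U_2 = q(\tilde{U}_2)$ and defining $f_2$ to be the inverse of the quotient map
$q : \tilde{U}_2 \to U_2$. These two charts cover the whole of $T^1$; now let's look at the
transition functions. We have
$$U_1 \cap U_2 = T^1 \setminus \{[0], [\tfrac{1}{2}]\}, \quad f_1(U_1 \cap U_2) = (0, \tfrac{1}{2}) \sqcup (\tfrac{1}{2}, 1), \quad f_2(U_1 \cap U_2) = (-\tfrac{1}{2}, 0) \sqcup (0, \tfrac{1}{2})$$
and the transition function is:
$$\phi_{21} : (0, \tfrac{1}{2}) \sqcup (\tfrac{1}{2}, 1) \to (-\tfrac{1}{2}, 0) \sqcup (0, \tfrac{1}{2}), \qquad \tilde{x} \mapsto \begin{cases} \tilde{x}, & \text{for } \tilde{x} < \tfrac{1}{2} \\ \tilde{x} - 1, & \text{for } \tilde{x} > \tfrac{1}{2} \end{cases}$$
Since this is a smooth function, and the inverse function $\phi_{12}$ is also smooth,
this is a smooth atlas on $T^1$.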
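The two charts on $T^1$ amount to two ways of choosing a representative of an orbit. The following sketch models an orbit $[x]$ by any real number $x$ representing it, and uses Python's `%` operator to pick representatives; this modelling choice and all names are our own assumptions. The sample values are dyadic fractions so that the floating-point arithmetic is exact.

```python
def f1(x):
    """Representative of the orbit [x] in (0, 1); undefined on the orbit [0]."""
    r = x % 1.0
    if r == 0.0:
        raise ValueError("[0] is not in the chart U1")
    return r


def f2(x):
    """Representative of [x] in (-1/2, 1/2); undefined on the orbit [1/2]."""
    r = (x + 0.5) % 1.0 - 0.5
    if r == -0.5:
        raise ValueError("[1/2] is not in the chart U2")
    return r


def phi21(t):
    """Claimed transition function on (0, 1/2) and (1/2, 1)."""
    return t if t < 0.5 else t - 1.0


# Since f1 is inverse to q, composing f2 with the inverse of f1 is just f2
# applied to a representative t in (0, 1).
for t in (0.125, 0.25, 0.625, 0.75):
    assert f2(t) == phi21(t)
```

On each of the two intervals the transition function is either the identity or a translation by $-1$, which is why it is smooth even though it is given by two different formulas.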
Now let's do the 2-dimensional version of the previous example:
Example 2.11. Let the group $\mathbb{Z}^2$ act on $\mathbb{R}^2$ by translations, so the orbits
are:
$$[(x, y)] = \{(x + n, y + m);\ n, m \in \mathbb{Z}\}$$
Let $T^2 = \mathbb{R}^2/\mathbb{Z}^2$ be the quotient space. We can picture this as a square with
opposite sides glued together:
$$T^2 = [0, 1] \times [0, 1] \,/\, (x, 0) \sim (x, 1) \text{ and } (0, y) \sim (1, y)$$
(see Figure 3). Hopefully it's clear that this produces a 2-dimensional torus.
We can cover $T^2$ with four co-ordinate charts
$$\tilde{U}_1 = (0, 1) \times (0, 1), \qquad \tilde{U}_2 = (-\tfrac{1}{2}, \tfrac{1}{2}) \times (0, 1),$$
$$\tilde{U}_3 = (0, 1) \times (-\tfrac{1}{2}, \tfrac{1}{2}), \qquad \tilde{U}_4 = (-\tfrac{1}{2}, \tfrac{1}{2}) \times (-\tfrac{1}{2}, \tfrac{1}{2})$$
where in each chart $U_i = q(\tilde{U}_i)$ we define a map $f_i : U_i \to \tilde{U}_i \subset \mathbb{R}^2$ to be the inverse of the
quotient map $q : \mathbb{R}^2 \to T^2$ (restricted to $\tilde{U}_i$). Using the same arguments as in Example 2.10
we can show that these are indeed co-ordinate charts, and it's easy to check
that this is a smooth atlas.
We can generalize this to $T^n = \mathbb{R}^n/\mathbb{Z}^n$ for any $n$; this gives us an $n$-dimensional torus.

Figure 3: The 2-dimensional torus $T^2 = \mathbb{R}^2/\mathbb{Z}^2$.


Here's a trivial but important example:
Example 2.12. Let $X = \mathbb{R}^n$. Now let $U_0 = X$, let $\tilde{U}_0 = \mathbb{R}^n$, and let
$$f_0 : U_0 \to \tilde{U}_0$$
be the identity function. This is a co-ordinate chart, and since it covers all of
$X$ it is in fact an atlas (consisting of a single chart). Hence $\mathbb{R}^n$ is a topological
manifold, of dimension $n$. Furthermore this is a smooth atlas, because there
are no non-trivial transition functions!
More generally we can let $X$ be any open set inside $\mathbb{R}^n$; then the same
procedure provides a smooth atlas on $X$ (with a single chart).

2.3 Smooth structures

We've just seen (in Example 2.12) that there is a trivial smooth atlas on the
topological manifold $X = \mathbb{R}$. Here is another smooth atlas on this topological
manifold:
Example 2.13. Let $X = \mathbb{R}$. Here are two co-ordinate charts on $X$:
$$U_1 = \mathbb{R}_{>0}, \qquad \tilde{U}_1 = \mathbb{R}_{>0}, \qquad f_1(x) = \frac{1}{x}$$
$$U_2 = \mathbb{R}_{<1}, \qquad \tilde{U}_2 = \mathbb{R}_{>0}, \qquad f_2(x) = \frac{1}{1 - x}$$
The union of $U_1$ and $U_2$ is the whole of $\mathbb{R}$, so together they form an atlas.
Let's look at the transition functions; we have
$$U_1 \cap U_2 = (0, 1), \qquad f_1(U_1 \cap U_2) = \mathbb{R}_{>1}, \qquad f_2(U_1 \cap U_2) = \mathbb{R}_{>1}$$
and the transition functions are:
$$\phi_{21}(\tilde{x}) = \frac{\tilde{x}}{\tilde{x} - 1} = \phi_{12}(\tilde{x})$$
Both transition functions are smooth, so this is a smooth atlas.
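Both transition functions in Example 2.13 can be recomputed from the chart maps. The sketch below writes out the inverses of $f_1$ and $f_2$ by hand (these inverse formulas are our own working, not stated in the notes) and compares both compositions with $\tilde{x}/(\tilde{x} - 1)$ at a few points of $\mathbb{R}_{>1}$.

```python
import math


def f1(x):
    """Chart on U1 = (0, infinity)."""
    return 1.0 / x


def f1_inv(t):
    return 1.0 / t


def f2(x):
    """Chart on U2 = (-infinity, 1)."""
    return 1.0 / (1.0 - x)


def f2_inv(t):
    return 1.0 - 1.0 / t


def phi(t):
    """Claimed common formula for both transition functions, for t > 1."""
    return t / (t - 1.0)


for t in (1.5, 2.0, 10.0):
    assert math.isclose(f2(f1_inv(t)), phi(t))  # transition from chart 1 to 2
    assert math.isclose(f1(f2_inv(t)), phi(t))  # transition from chart 2 to 1
```

Since the two transition functions are inverse to each other and are given by the same formula, the map $\tilde{x} \mapsto \tilde{x}/(\tilde{x} - 1)$ is its own inverse on $\mathbb{R}_{>1}$.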

In an important sense, these two smooth atlases on $\mathbb{R}$ are really "the
same".
Definition 2.14. Let $X$ be a topological manifold, and let
$$A = \{(U_i, f_i);\ i \in I\}$$
be a smooth atlas for $X$. Let $(U, f)$ be any co-ordinate chart on $X$. We say
that $(U, f)$ is compatible with the atlas $A$ iff the transition functions (in
either direction) between $(U, f)$ and any chart in $A$ are smooth.
In other words, the new chart $(U, f)$ is compatible with the atlas $A$ iff
the union $A \cup \{(U, f)\}$ is still a smooth atlas.
Once we've fixed a smooth atlas $A$, it's not very important to know which
co-ordinate charts are actually in $A$: the important thing is to know which
charts are compatible with $A$. These are the charts that we are allowed to
use; we should disregard all the charts that are not compatible with $A$.
Definition 2.15. Let $X$ be a topological manifold, and let $A$ and $B$ be two
smooth atlases for $X$. We say that $A$ and $B$ are compatible iff every chart
in $B$ is compatible with the atlas $A$.
Equivalently, we could say that $A$ and $B$ are compatible iff every chart in
$A$ is compatible with the atlas $B$, or iff the union $A \cup B$ is a smooth atlas.
Example 2.16. Let $X = \mathbb{R}$. Let $(U_0, f_0)$ be the co-ordinate chart considered
in Example 2.12, and let $A = \{(U_0, f_0)\}$ be the resulting (trivial) atlas. Let
$B$ be the atlas
$$B = \{(U_1, f_1), (U_2, f_2)\}$$
from Example 2.13. Let's calculate the transition functions between the
single chart in $A$ and the two charts in $B$. Since $f_0$ is the identity this is easy;
the transition functions are simply
$$\phi_{10} = f_1 : \mathbb{R}_{>0} \to \mathbb{R}_{>0}$$
and
$$\phi_{20} = f_2 : \mathbb{R}_{<1} \to \mathbb{R}_{>0}$$
and their inverses. Since all four of $f_1$, $f_2$, $f_1^{-1}$ and $f_2^{-1}$ are smooth, this shows
that both charts in $B$ are compatible with the atlas $A$ (and the chart in $A$
is compatible with the atlas $B$). Hence these two atlases are compatible.

As we've said, knowing exactly which charts are in our atlas $A$ is not so
important; all we really care about is the set of charts that are compatible
with $A$. The next lemma says that if we replace $A$ by a compatible atlas $B$,
then this information does not change: the set of compatible charts remains
the same.
Lemma 2.17. Let $X$ be a topological manifold, and let
$$A = \{(U_i, f_i);\ i \in I\} \qquad \text{and} \qquad B = \{(U_j, f_j);\ j \in J\}$$
be two compatible smooth atlases for $X$. Let $(U_0, f_0)$ be a co-ordinate chart
on $X$ which is compatible with the atlas $A$. Then $(U_0, f_0)$ is also compatible
with $B$.
Proof. Pick any chart $(U_j, f_j)$ in $B$ and consider the transition function:
$$\phi_{j0} : f_0(U_0 \cap U_j) \to f_j(U_0 \cap U_j)$$
We need to show that both $\phi_{j0}$ and $\phi_{j0}^{-1}$ are smooth functions. Pick any point
$x \in U_0 \cap U_j$. We're going to show that $\phi_{j0}$ is smooth at the point $f_0(x)$, and that
$\phi_{j0}^{-1}$ is smooth at the point $f_j(x) = \phi_{j0}(f_0(x))$. If we can do this for any point
$x$, then we'll have shown that both functions are smooth, and proved the
lemma.
Since $A$ is an atlas, there exists some chart $(U_i, f_i) \in A$ with $x \in U_i$. Set
$W = U_0 \cap U_j \cap U_i$; this is an open neighbourhood of $x$. We have homeomorphisms:
$$f_0 : W \to f_0(W), \qquad f_i : W \to f_i(W), \qquad f_j : W \to f_j(W)$$

The set $f_0(W)$ is an open set inside $f_0(U_0 \cap U_j)$, and the composition
$$f_j \circ f_0^{-1} : f_0(W) \to f_j(W)$$
is just the restriction of the transition function $\phi_{j0}$. Similar statements apply
when we move between $f_i(W)$ and either of the other two charts, so we have
that:
$$\phi_{j0}|_{f_0(W)} = \phi_{ji}|_{f_i(W)} \circ \phi_{i0}|_{f_0(W)}$$
Since $\phi_{ji}$ and $\phi_{i0}$ are smooth by assumption, it follows that $\phi_{j0}$ is smooth
within the open set $f_0(W)$, and in particular it is smooth at the point $f_0(x)$.
Since $\phi_{ji}^{-1}$ and $\phi_{i0}^{-1}$ are also smooth, $\phi_{j0}^{-1} = \phi_{i0}^{-1} \circ \phi_{ji}^{-1}$ is smooth at the point $f_j(x)$.
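The key step of the proof is that a transition function factors through a third chart. As a concrete illustration (our own choice of example, not part of the proof), take the three charts on $\mathbb{R}$ from Examples 2.12 and 2.13 and check that the transition from chart 0 to chart 2 equals the transition from chart 0 to chart 1 followed by the transition from chart 1 to chart 2, on the common domain $W = (0, 1)$.

```python
import math


def phi10(x):
    """Transition from the trivial chart to chart 1: this is just f1."""
    return 1.0 / x


def phi21(t):
    """Transition from chart 1 to chart 2, defined for t > 1."""
    return t / (t - 1.0)


def phi20(x):
    """Transition from the trivial chart to chart 2: this is just f2."""
    return 1.0 / (1.0 - x)


# On W = (0, 1), all three charts are defined and the composite
# of phi10 followed by phi21 agrees with phi20.
for x in (0.1, 0.5, 0.9):
    assert math.isclose(phi21(phi10(x)), phi20(x))
```

Because a composition of smooth functions is smooth, this factorisation is exactly what lets smoothness pass from two of the transition functions to the third.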

Corollary 2.18. Compatibility is an equivalence relation on smooth atlases.
Proof. Exercise.
So the two atlases $A$ and $B$ that we found on $\mathbb{R}$ (in Examples 2.12 and
2.13) are equivalent, under the equivalence relation of compatibility. This is
the sense in which they are "the same".
Finally we can define the objects we really care about!
Definition 2.19. A smooth manifold is a topological manifold $X$ together
with an equivalence class $[A]$ of compatible smooth atlases on $X$. We call
the equivalence class of atlases a smooth structure on $X$.
If we want to specify a smooth structure on $X$ then we have to give a
specific smooth atlas $A$, but once we've done that then we are free to change
$A$ to any other compatible atlas. Notice that it makes sense to say that
a co-ordinate chart is compatible with a smooth structure, since the set of
compatible charts is independent of the specific choice of atlas.
Example 2.20. Let $X = S^n$ and let $A$ be the stereographic projection
atlas from Example 2.4. Since this is a smooth atlas (Example 2.9), the pair
$(S^n, [A])$ defines a smooth manifold.
Example 2.21. Let $X = T^1$, and let $B$ be the smooth atlas from Example
2.10. Then $(T^1, [B])$ is a smooth manifold.
As we shall see later, our two versions of the circle, $(S^1, [A])$ and $(T^1, [B])$,
really are the same smooth manifold.
Example 2.22. Let $X = \mathbb{R}^n$ and let $A$ be the trivial atlas from Example
2.12. Then $(\mathbb{R}^n, [A])$ is a smooth manifold. This is called the standard smooth
structure on $\mathbb{R}^n$.
Now let's see an example of two atlases which are not compatible.
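Example 2.23. Let $X = \mathbb{R}$ again, and consider the function:
$$g : \mathbb{R} \to \mathbb{R}, \qquad x \mapsto \begin{cases} x, & \text{for } x \leq 0 \\ 2x, & \text{for } x > 0 \end{cases}$$
This is a homeomorphism, so $(\mathbb{R}, g)$ is a co-ordinate chart on $X$. Furthermore
the chart covers the whole of $X$, so it gives us an atlas
$$C = \{(\mathbb{R}, g)\}$$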

Figure 4: The function g from Example 2.23.


which is automatically smooth since there are no transition functions to
check. Hence $(\mathbb{R}, [C])$ defines a smooth manifold.
However, because the function $g$ is not smooth (it fails to have a derivative
at the point $x = 0$), this co-ordinate chart is not compatible with the trivial
atlas $A$ from Example 2.12. So $[C]$ is a different smooth structure from the
standard one $[A]$.
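The failure of differentiability at $x = 0$ can be seen from the one-sided difference quotients, which do not agree. The helper `one_sided_slopes` below is our own illustrative name; the sketch just evaluates the two quotients for a small step $h$.

```python
def g(x):
    """The piecewise-linear homeomorphism from Example 2.23."""
    return x if x <= 0 else 2 * x


def one_sided_slopes(h):
    """Difference quotients of g at 0 from the left and from the right."""
    left = (g(0.0) - g(-h)) / h
    right = (g(h) - g(0.0)) / h
    return left, right


left, right = one_sided_slopes(1e-3)
print("slope from the left: ", left)
print("slope from the right:", right)
```

Since $g$ is linear on each side of the origin, the two quotients do not depend on $h$, so they cannot converge to a common limit as $h \to 0$.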
Nevertheless, it will turn out that the two smooth manifolds $(\mathbb{R}, [A])$ and
$(\mathbb{R}, [C])$ are still the same smooth manifold (in the same sense that $S^1$ and
$T^1$ are the same).
Now that we've laid the technical foundations, we can get on with actually
studying some manifolds. From this point on we're going to assume that
everything in sight is smooth, i.e. we're going to say
• "Manifold" when we mean "smooth manifold".
• "Atlas" when we mean "smooth atlas".
• "Co-ordinate chart" when we mean "compatible co-ordinate chart".

3 Submanifolds

3.1 Definition of a submanifold

Consider the following two subsets of $\mathbb{R}^2$:
$$Z_1 = \{(x, \sin x)\}$$
$$Z_2 = \{(x, x),\ x \leq 0\} \cup \{(x, 2x),\ x \geq 0\}$$
So $Z_1$ is the graph of the function $x \mapsto \sin x$, and $Z_2$ is the graph of the
function $g$ from Example 2.23 and Figure 4. The subset $Z_1$ is nice and
smooth; it looks like a (1-dimensional) manifold. But $Z_2$ doesn't look like a
manifold; it has a sharp corner at the origin.
However, both of these subsets are homeomorphic to $\mathbb{R}$; in fact the graph
of any continuous function $\mathbb{R} \to \mathbb{R}$ is always homeomorphic to $\mathbb{R}$. So $Z_2$ is
certainly a topological manifold, and we could equip it with a smooth atlas
if we wanted to. So why doesn't it look like a smooth manifold?
The problem is (of course) the way in which $Z_2$ is sitting inside the ambient
space $\mathbb{R}^2$. To understand what's happening precisely, we need to introduce
the concept of a submanifold.
If $n$ and $m$ are natural numbers with $m \leq n$, we can consider the subspace
$$\mathbb{R}^m \subset \mathbb{R}^n$$
of vectors of the form $(x_1, ..., x_m, 0, ..., 0)$. Roughly speaking: a manifold is a
space $X$ that locally looks like $\mathbb{R}^n$, and a submanifold is a subset of $X$ which
locally looks like this subspace $\mathbb{R}^m$.
Example 3.1. Consider the subset $Z_1 = \{(x, \sin x)\} \subset \mathbb{R}^2$ again. Now let
$U = \mathbb{R}^2$, $\tilde{U} = \mathbb{R}^2$, and:
$$f : U \to \tilde{U}, \qquad (x, y) \mapsto (x, y - \sin x)$$
It's easy to check that $f$ is a smooth bijection with a smooth inverse, so $(U, f)$
defines a co-ordinate chart on $\mathbb{R}^2$ (compatible with the standard smooth
structure). Furthermore, $f(Z_1)$ is the subset:
$$\{(x, 0)\} \subset \tilde{U}$$
So in these co-ordinates, $Z_1$ is just the subspace $\mathbb{R} \subset \mathbb{R}^2$.
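The chart in Example 3.1 "straightens" the sine curve onto the $x$-axis. A minimal sketch (the inverse formula and the sample points are our own working) checks both that $f$ has the expected inverse and that points of $Z_1$ land on the axis.

```python
import math


def f(x, y):
    """The chart from Example 3.1: subtract sin(x) from the second co-ordinate."""
    return (x, y - math.sin(x))


def f_inv(u, v):
    """Its inverse: add sin(u) back to the second co-ordinate."""
    return (u, v + math.sin(u))


# Points on the graph of sin are sent to the x-axis.
for x in (-2.0, 0.0, 0.3, 4.0):
    assert f(x, math.sin(x)) == (x, 0.0)

# f_inv undoes f at an arbitrary point of the plane.
u, v = f_inv(*f(1.0, 2.0))
assert math.isclose(u, 1.0) and math.isclose(v, 2.0)
```

Both `f` and `f_inv` are built from $\sin$ and addition, which is why each is smooth.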
Now we write down the formal definition.

Definition 3.2. Let $X$ be a (smooth) manifold. Let $Z$ be any subset of $X$.
We say that $Z$ is an $m$-dimensional submanifold of $X$ iff, for any point
$z \in Z$, there exists a co-ordinate chart on $X$
$$f : U \to \tilde{U} \subset \mathbb{R}^n$$
with $z \in U$, such that $f(U \cap Z)$ is the intersection of $\tilde{U}$ with the subspace:
$$\mathbb{R}^m \subset \mathbb{R}^n$$
In Example 3.1 we proved that the subset $Z_1$ inside $\mathbb{R}^2$ is a 1-dimensional
submanifold (when $\mathbb{R}^2$ is equipped with the standard smooth structure). The
subset $Z_2$ is not a submanifold, but we shall not prove this fact.
Example 3.3. Recall our definition of $S^1$:
$$S^1 = \{x^2 + y^2 = 1\} \subset \mathbb{R}^2$$
Let's prove that this subset is a (1-dimensional) submanifold of $\mathbb{R}^2$. We just
need to use polar co-ordinates! Set
$$U = \mathbb{R}^2 \setminus \{(x, 0),\ x \leq 0\}, \qquad \tilde{U} = \mathbb{R}_{>0} \times (-\pi, \pi)$$
and:
$$f^{-1} : \tilde{U} \to U, \qquad (r, \theta) \mapsto (r \cos\theta, r \sin\theta)$$
Then it's clear that $f^{-1}$ is a smooth bijection. It's slightly less obvious that
the inverse function $f : U \to \tilde{U}$ is also smooth, but this can be shown using
the Inverse Function Theorem (more on this shortly). Hence $(U, f)$ is a
co-ordinate chart on $\mathbb{R}^2$, compatible with the standard atlas. We have:
$$f(S^1 \cap U) = \{(1, \theta)\} \subset \tilde{U}$$
Now consider the map:
$$\tau : \mathbb{R}^2 \to \mathbb{R}^2, \qquad (r, \theta) \mapsto (\theta, r - 1)$$
This is an affine map (i.e. the composition of a linear map and a translation);
it's invertible, smooth, and has a smooth inverse. So if we compose $f$ with
$\tau$ then we get a new co-ordinate chart
$$\tau \circ f : U \to \tilde{U}' = (-\pi, \pi) \times \mathbb{R}_{>-1}$$
and we have:
$$(\tau \circ f)(S^1 \cap U) = \{(\theta, 0)\} = \mathbb{R} \cap \tilde{U}'$$
Every point on $S^1$, apart from the point $(-1, 0)$, lies in $U$, so this co-ordinate
chart demonstrates that $S^1$ satisfies the submanifold condition at every point
except for $(-1, 0)$. To deal with this final point we need a second co-ordinate
chart; we can get one by using polar co-ordinates with $\theta \in (0, 2\pi)$ (so we
delete the positive $x$-axis).
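The polar chart and the affine adjustment can be combined into a single function and tested on points of the circle. In this sketch `math.atan2` is used to compute the angle in $(-\pi, \pi)$, which is our own choice of formula for the chart $f$ (the notes only give its inverse); the name `tau` for the affine map and the name `straightened` are likewise labels introduced here.

```python
import math


def f(x, y):
    """Polar chart on U = R^2 minus {(x, 0), x <= 0}; returns (r, theta)."""
    if y == 0.0 and x <= 0.0:
        raise ValueError("point lies outside the chart domain U")
    return (math.hypot(x, y), math.atan2(y, x))


def tau(r, theta):
    """Affine map swapping the co-ordinates and shifting r by 1."""
    return (theta, r - 1.0)


def straightened(x, y):
    """The composite chart: tau applied after f."""
    return tau(*f(x, y))


# Points of the unit circle (away from (-1, 0)) land on the first axis.
for t in (-3.0, -1.0, 0.0, 0.5, 2.0):
    theta, height = straightened(math.cos(t), math.sin(t))
    assert math.isclose(theta, t, abs_tol=1e-12)
    assert abs(height) < 1e-12
```

A point off the circle, such as $(2, 0)$, is sent to a point with non-zero second co-ordinate, so the chart separates the circle from the rest of $U$.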
As the previous example shows, it's enough to find a co-ordinate chart
such that $f(U \cap Z)$ is the intersection of $\tilde{U}$ with some $m$-dimensional affine
subspace of $\mathbb{R}^n$. This is because we can always apply an affine change-of-co-ordinates
to turn it into our standard subspace $\mathbb{R}^m \subset \mathbb{R}^n$.
As the name suggests, an $m$-dimensional submanifold is itself a manifold,
of dimension $m$.
Proposition 3.4. Let $X$ be an $n$-dimensional manifold, and let $Z \subset X$ be
an $m$-dimensional submanifold of $X$. Then we can equip $Z$ with the structure
of an $m$-dimensional manifold.
Proof. Firstly we equip $Z$ with the subspace topology to make it a topological
space. Now consider a co-ordinate chart
$$f : U \to \tilde{U} \subset \mathbb{R}^n$$
on $X$ having the property that:
$$f(Z \cap U) = \mathbb{R}^m \cap \tilde{U} \qquad (3.5)$$
The intersection $V = U \cap Z$ is an open set in $Z$, and $f$ induces a homeomorphism
between $V$ and the open set:
$$\tilde{V} = \mathbb{R}^m \cap \tilde{U}$$
Hence $(V, f)$ is a co-ordinate chart on $Z$. Let $A_0$ be the set of all co-ordinate
charts on $X$ having the property (3.5) (this is a hugely infinite set!); this
gives a corresponding set $A$ of co-ordinate charts on $Z$. By the definition of
a submanifold, every point in $Z$ lies in at least one of the charts in $A$, so $A$
is an atlas.
Now let's look at the transition functions. Pick two charts $(U_1, f_1)$,
$(U_2, f_2)$ in $A_0$, and let $(V_1, f_1)$, $(V_2, f_2)$ be the corresponding charts in $A$.
Note that $V_1 \cap V_2 = Z \cap (U_1 \cap U_2)$, so
$$f_1(V_1 \cap V_2) = \mathbb{R}^m \cap f_1(U_1 \cap U_2)$$
and:
$$f_2(V_1 \cap V_2) = \mathbb{R}^m \cap f_2(U_1 \cap U_2)$$
We have a smooth transition function
$$\phi_{21} : f_1(U_1 \cap U_2) \to f_2(U_1 \cap U_2)$$
between the charts on $X$, and restricting this to the subspace $\mathbb{R}^m$ gives the
transition function
$$\psi_{21} : f_1(V_1 \cap V_2) \to f_2(V_1 \cap V_2)$$
between the charts on $Z$. Since $\phi_{21}$ is smooth, the function $\psi_{21}$ is also smooth.
So this is a smooth atlas.

3.2 Level sets

Let's think again about our two subsets
$$Z_1 = \{y - \sin x = 0\} \subset \mathbb{R}^2$$
which is a submanifold (Example 3.1), and
$$Z_2 = \{y - g(x) = 0\} \subset \mathbb{R}^2$$
(where $g$ is the function defined in Example 2.23), which is not a submanifold.
We also know that the subset
$$S^1 = \{y^2 + x^2 = 1\} \subset \mathbb{R}^2$$
is a submanifold (Example 3.3). All of these subsets are of the form
$$\{h(x, y) = \alpha\}$$
for some function $h : \mathbb{R}^2 \to \mathbb{R}$ and some real number $\alpha \in \mathbb{R}$. So we should
ask: when is such a subset in fact a submanifold?
More generally, if $h$ is a function
$$h : \mathbb{R}^n \to \mathbb{R}^k$$
and $\alpha \in \mathbb{R}^k$, when is the subset $\{h(x) = \alpha\} \subset \mathbb{R}^n$ a submanifold? This is an
important question, which we will explore in some detail.
The subsets $\{h(x) = \alpha\} \subset \mathbb{R}^n$ are called the level sets of $h$. Firstly, note
that the functions
$$h_1(x, y) = y - \sin x \qquad \text{and} \qquad h_3(x, y) = x^2 + y^2$$
are both smooth functions from $\mathbb{R}^2$ to $\mathbb{R}$, whereas the function
$$h_2(x, y) = y - g(x)$$
is not a smooth function (since $g$ is not smooth). One might reasonably
guess that the level sets of $h$ are submanifolds provided that $h$ is a smooth
function. Unfortunately this is not enough, as the next example shows.

Figure 5: Level sets of $h = xy$.
Example 3.6. Consider the smooth function:
$$h : \mathbb{R}^2 \to \mathbb{R}, \quad (x, y) \mapsto xy$$
For any $\alpha \in \mathbb{R}$ let's denote the level set of h by
$$Z_\alpha = \{h(x, y) = \alpha\} \subset \mathbb{R}^2$$
(see Figure 5).
If $\alpha \neq 0$, then $Z_\alpha$ is the set $\{(x, \alpha/x);\ x \in \mathbb{R}_{\neq 0}\}$, which consists of both branches of
a hyperbola. This is a (1-dimensional) submanifold of $\mathbb{R}^2$: consider the co-ordinate chart with
$$U = \tilde U = \{x \neq 0\} \subset \mathbb{R}^2$$
and:
$$f : U \to \tilde U, \quad (x, y) \mapsto (x, y - \alpha/x)$$
Note that this really is a co-ordinate chart (f is smooth with a smooth
inverse), and that:
$$f(Z_\alpha \cap U) = \{(x, 0)\} \cap \tilde U$$
Since $Z_\alpha$ is entirely contained in U (provided that $\alpha \neq 0$), this demonstrates
that $Z_\alpha$ is a submanifold.
However, for $\alpha = 0$, the level set
$$Z_0 = \{xy = 0\} = \{x = 0\} \cup \{y = 0\}$$
consists of both co-ordinate axes. This certainly doesn't look like a submanifold, because of the singularity at the point where the two axes cross. In
fact one can prove that $Z_0$ is not even a topological manifold.
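As a quick numerical sanity check of Example 3.6, the following sketch samples a few points on the hyperbola xy = α and applies the chart to them. The function names `chart` and `chart_inverse`, the value of `alpha` and the sample points are illustrative choices, not part of the notes.

```python
def chart(alpha, x, y):
    """Chart f(x, y) = (x, y - alpha/x) from Example 3.6; requires x != 0."""
    return (x, y - alpha / x)


def chart_inverse(alpha, u, v):
    """Inverse of the chart: (u, v) -> (u, v + alpha/u); requires u != 0."""
    return (u, v + alpha / u)


alpha = 2.0                      # any non-zero value would do here
xs = [-3.0, -0.5, 0.25, 1.0, 4.0]

# Points of the level set {xy = alpha} have the form (x, alpha/x).
images = [chart(alpha, x, alpha / x) for x in xs]

# The chart should send every such point onto the x-axis {(x, 0)}.
flattened = all(abs(v) < 1e-12 for (_, v) in images)

# The inverse should undo the chart at an arbitrary point of U = {x != 0}.
u, v = chart(alpha, 1.5, -0.7)
round_trip = chart_inverse(alpha, u, v)

print(flattened, round_trip)
```

Both branches of the hyperbola are covered by the sample points (negative and positive x), which matches the observation that the whole level set lies inside U.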
So we need to find an additional condition to guarantee that a level set
of h is a submanifold. To do this we need to look at the derivative of h.
For a smooth function
$$h = (h_1, \dots, h_k) : \mathbb{R}^n \to \mathbb{R}^k$$
and a point $x = (x_1, \dots, x_n) \in \mathbb{R}^n$, recall that the derivative of h at x is the
linear map
$$Dh|_x : \mathbb{R}^n \to \mathbb{R}^k$$
given by the k-by-n matrix
$$Dh|_x = \left( \frac{\partial h_i}{\partial x_j}\bigg|_x \right)$$
of all partial derivatives of the components of h (evaluated at the point x).
This is also known as the Jacobian of h.
Example 3.7. If h is the function $h(x, y) = xy$ we considered in Example
3.6, and we fix a point $(x, y) \in \mathbb{R}^2$, then the derivative of h at this point is
the 1-by-2 matrix (or linear map $\mathbb{R}^2 \to \mathbb{R}$):
$$Dh|_{(x,y)} = (y, x)$$

Definition 3.8. Let $h : \mathbb{R}^n \to \mathbb{R}^k$ be a smooth function. A point $x \in \mathbb{R}^n$ is
called a regular point of h iff the derivative
$$Dh|_x : \mathbb{R}^n \to \mathbb{R}^k$$
of h at x is a surjection. If x is not a regular point of h then it's called a
critical point.
Definition 3.9. A point $\alpha \in \mathbb{R}^k$ is called a regular value of h iff every
point in the level set
$$h^{-1}(\alpha) \subset \mathbb{R}^n$$
is a regular point of h. If $\alpha$ is not a regular value of h then it's called a
critical value.
Notice that we can't have any regular points unless $k \leq n$! Also note
that all these definitions work perfectly well if h is only defined on an open
subset of $\mathbb{R}^n$.
Example 3.10. Consider the function $h(x, y) = xy$ again, whose derivative
at a point (x, y) is the matrix $(y, x) : \mathbb{R}^2 \to \mathbb{R}$ (Example 3.7). This is a
surjection provided that at least one of x or y is not zero, so the origin
(x, y) = (0, 0) is a critical point of h and all other points are regular points.
Hence the only critical value of h is $\alpha = 0$; if $\alpha \in \mathbb{R}$ is not zero then it is a
regular value.
Now look back at Example 3.6: we see that the level sets $Z_\alpha$ are submanifolds provided that $\alpha$ is a regular value of h (i.e. for $\alpha \neq 0$). However for
the critical value $\alpha = 0$, the level set $Z_0$ contains a critical point (0, 0), and
$Z_0$ fails to be a submanifold near this point.
Here is the general result:
Proposition 3.11. Let $h : \mathbb{R}^n \to \mathbb{R}^k$ be a smooth function. If $\alpha \in \mathbb{R}^k$ is a
regular value of h then the level set
$$Z_\alpha = h^{-1}(\alpha) \subset \mathbb{R}^n$$
is a submanifold of $\mathbb{R}^n$, of dimension $n - k$.
If X is any n-dimensional manifold, and $Z \subset X$ is an m-dimensional
submanifold, then the difference
$$n - m$$
is called the codimension of Z. So the proposition says that the level set
$Z_\alpha \subset \mathbb{R}^n$ is a submanifold of codimension k (provided that $\alpha$ is a regular
value). Hopefully this makes intuitive sense: we start with a space having n
degrees-of-freedom, then we impose k equations, so we have $n - k$ degrees-of-freedom left.
The proposition will follow fairly easily from the following important theorem, which you should recall from real analysis.
Theorem 3.12 (Inverse Function Theorem). Let $W_1$ and $W_2$ be two open
subsets of $\mathbb{R}^n$, and let
$$F : W_1 \to W_2$$
be a smooth function. Let $x \in W_1$ be a point such that the derivative of F at x
$$DF|_x : \mathbb{R}^n \to \mathbb{R}^n$$
is an isomorphism. Then there exists an open neighbourhood $U \subset W_1$ of
x such that the function
$$F : U \to F(U) \subset W_2$$
is a bijection, and the inverse function
$$F^{-1} : F(U) \to U$$
is smooth.
Now we can prove our result about level sets.
Proof of Proposition 3.11. Let $h : \mathbb{R}^n \to \mathbb{R}^k$ be a smooth function, and
$\alpha \in \mathbb{R}^k$ be a regular value of h. By replacing h with the function $h - \alpha$, we can
assume that $\alpha = 0$. Now let $Z_0 = h^{-1}(0)$ be the level set, and pick any point
$x \in Z_0$. We want to show that $Z_0$ satisfies the submanifold condition at the
point x.
Let $\pi : \mathbb{R}^n \to \mathbb{R}^k$ be the projection map onto the last k co-ordinates, so
$\pi^{-1}(0)$ is our standard subspace $\mathbb{R}^{n-k} \subset \mathbb{R}^n$. Our goal is to find a co-ordinate
chart
$$f : U \to \tilde U$$
on $\mathbb{R}^n$, containing the point x, such that the composition
$$\pi \circ f : U \to \mathbb{R}^k$$
is our function h (restricted to U). In words, we want to find co-ordinates
around x in which the function h is just the projection function $\pi$. If we can
do this, then
$$f(U \cap Z_0) = \pi^{-1}(0) \cap \tilde U = \mathbb{R}^{n-k} \cap \tilde U$$
and so $Z_0$ is a submanifold of codimension k, near the point x. We now prove
that we can indeed find such a co-ordinate chart.
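Since $\alpha = 0$ is a regular value, any $x \in Z_0$ is a regular point, so the
derivative
$$Dh|_x : \mathbb{R}^n \to \mathbb{R}^k$$
is a surjection. Let $(h_1, \dots, h_k)$ be the components of h, and let $x_1, \dots, x_n$ be
the co-ordinates on $\mathbb{R}^n$. The j-th column of the matrix $Dh|_x$ is the vector:
$$v_j = \begin{pmatrix} \frac{\partial h_1}{\partial x_j}\big|_x \\ \vdots \\ \frac{\partial h_k}{\partial x_j}\big|_x \end{pmatrix} \in \mathbb{R}^k$$
This collection of n vectors $v_1, \dots, v_n$ together span $\mathbb{R}^k$ (since $Dh|_x$ is a surjection), so by elementary linear algebra there must be some subset of them
which forms a basis for $\mathbb{R}^k$. After re-ordering the co-ordinates $x_j$ if necessary,
we may assume that the subset $v_{n-k+1}, \dots, v_n$ is a basis, i.e. the k-by-k matrix
$$M = \begin{pmatrix} \frac{\partial h_1}{\partial x_{n-k+1}}\big|_x & \dots & \frac{\partial h_1}{\partial x_n}\big|_x \\ \vdots & & \vdots \\ \frac{\partial h_k}{\partial x_{n-k+1}}\big|_x & \dots & \frac{\partial h_k}{\partial x_n}\big|_x \end{pmatrix} : \mathbb{R}^k \to \mathbb{R}^k \qquad (3.13)$$
is an isomorphism.
Now consider the function:
$$f : \mathbb{R}^n \to \mathbb{R}^n, \quad x \mapsto (x_1, \dots, x_{n-k}, h_1(x), \dots, h_k(x))$$
The derivative of f at our point $x \in Z_0$ is an n-by-n matrix of the form
$$Df|_x = \begin{pmatrix} I_{n-k} & 0 \\ \ast & M \end{pmatrix}$$
where M is the matrix (3.13), and $I_{n-k}$ is the identity matrix. Hence
$\det Df|_x = \det M$, and this is non-zero since M is an isomorphism. Applying the Inverse Function Theorem (Theorem 3.12), we see that there is
an open set $U \subset \mathbb{R}^n$ containing x such that the function
$$f : U \to f(U) \subset \mathbb{R}^n$$
is a bijection with a smooth inverse. Since $\pi \circ f = h$ by construction, this is our required co-ordinate chart.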
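To make the construction in the proof concrete, here is a small numerical sketch for the unit circle, i.e. n = 2, k = 1 and h(x, y) = x^2 + y^2 - 1. The helper `jacobian` (a central-difference approximation) and the sample point are assumptions made for illustration; they are not part of the notes.

```python
def h(p):
    x, y = p
    return x * x + y * y - 1.0   # the level set h = 0 is the unit circle


def f(p):
    """The map from the proof: keep the first n - k co-ordinates, append h."""
    return (p[0], h(p))


def jacobian(func, p, step=1e-6):
    """Central-difference approximation to the matrix of partial derivatives."""
    p = list(p)
    rows = len(func(p))
    J = [[0.0] * len(p) for _ in range(rows)]
    for j in range(len(p)):
        plus, minus = p[:], p[:]
        plus[j] += step
        minus[j] -= step
        f_plus, f_minus = func(plus), func(minus)
        for i in range(rows):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * step)
    return J


point = (0.6, 0.8)               # a point on the circle with y != 0
J = jacobian(f, point)

# Expected block form: first row (1, 0), second row (2x, 2y) = (1.2, 1.6).
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
print(J, det)
```

Here M is the 1-by-1 matrix (2y), so the determinant is non-zero exactly when y is non-zero. Near the points (1, 0) and (-1, 0) one would first re-order the co-ordinates, as the proof allows.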

Intuitively, critical points and critical values are rather rare. If we pick
a point $x \in \mathbb{R}^n$ at random, then it is vanishingly unlikely that the derivative $Dh|_x$ is not surjective, since almost all k-by-n matrices are surjective
(provided that $k \leq n$). This suggests that almost all level sets of a smooth
function are submanifolds. This intuition is correct, and can be turned into
a result known as Sard's Theorem. However it would take us on a significant
detour to even state this result precisely!
Proposition 3.11 gives us an easy way to find new manifolds: just pick
any smooth function from $\mathbb{R}^n$ to $\mathbb{R}^k$ and then look at a level set. Provided
that we're at a regular value (and by Sard's Theorem we almost always will
be), the level set is a manifold of dimension $n - k$.
Example 3.14. Consider the smooth function
$$h : \mathbb{R}^n \to \mathbb{R}, \quad (x_1, \dots, x_n) \mapsto x_1^2 + \dots + x_n^2$$
The derivative of h at a point $(x_1, \dots, x_n)$ is the 1-by-n matrix
$$(2x_1, \dots, 2x_n) : \mathbb{R}^n \to \mathbb{R}$$
which is a surjection if at least one of the $x_i$ is not zero. Hence the origin is
the only critical point of h, and $0 \in \mathbb{R}$ is the only critical value. For $\alpha \in \mathbb{R}_{>0}$,
the level set $Z_\alpha = h^{-1}(\alpha)$ is an (n - 1)-dimensional sphere of radius $\sqrt{\alpha}$. So
this sphere is a codimension-1 submanifold of $\mathbb{R}^n$, and in particular it is an
(n - 1)-dimensional manifold.
If $\alpha \in \mathbb{R}_{<0}$ then $Z_\alpha = \emptyset$ (the empty set). Technically this is also a
codimension-1 submanifold of $\mathbb{R}^n$, but it's less interesting!
We've now seen two ways to get an atlas on the n-dimensional sphere $S^n$:
in Example 2.4 we used stereographic projection to get an atlas with two
charts, or since $S^n$ is a submanifold of $\mathbb{R}^{n+1}$ we can use Proposition 3.4 to
get an atlas with infinitely-many charts. In fact these two atlases define the
same smooth structure on $S^n$ (i.e. they're compatible), but we won't fill in
all the details of this fact.
We can be slightly more general. Take an open set
$$X \subset \mathbb{R}^n$$
and consider a smooth function
$$h : X \to \mathbb{R}^k$$
(and recall from Example 2.12 that X is an n-dimensional manifold). Because
the proof of Proposition 3.11 was entirely local, it shows immediately that
the level sets
$$h^{-1}(\alpha) \subset X$$
are submanifolds of X, provided that $\alpha$ is a regular value.
Example 3.15. Consider the smooth function:
$$r : \mathbb{R}^2 \setminus (0, 0) \to \mathbb{R}, \quad (x, y) \mapsto \sqrt{x^2 + y^2}$$
Now let X be the open set
$$X = \{(x, y, z);\ (x, y) \neq (0, 0)\} \subset \mathbb{R}^3$$
and let h be the smooth function:
$$h : X \to \mathbb{R}, \quad (x, y, z) \mapsto (r - 2)^2 + z^2$$
The derivative of h at (x, y, z) is the 1-by-3 matrix:
$$\big(2x(r - 2)/r,\ 2y(r - 2)/r,\ 2z\big) : \mathbb{R}^3 \to \mathbb{R}$$
This only fails to be a surjection if z = 0 and r = 2, meaning that the only
critical value of h is $\alpha = 0$. For any other value of $\alpha$, the level set $Z_\alpha$ is a
2-dimensional manifold.
If $\alpha$ lies in the interval (0, 4) then $Z_\alpha$ is the surface-of-revolution of the
graph drawn in Figure 6, so it's a 2-dimensional torus.
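The derivative in Example 3.15 can be checked numerically. The sketch below compares the formula for the derivative against a central-difference approximation and evaluates it on the circle where z = 0 and r = 2. The helper `numeric_grad` and the sample points are illustrative assumptions.

```python
import math


def h(x, y, z):
    r = math.hypot(x, y)                 # r = sqrt(x^2 + y^2), assumed non-zero
    return (r - 2.0) ** 2 + z ** 2


def grad_h(x, y, z):
    """The 1-by-3 derivative matrix of h, written as a tuple."""
    r = math.hypot(x, y)
    return (2 * x * (r - 2) / r, 2 * y * (r - 2) / r, 2 * z)


def numeric_grad(x, y, z, step=1e-6):
    """Central-difference approximation to the derivative of h."""
    return (
        (h(x + step, y, z) - h(x - step, y, z)) / (2 * step),
        (h(x, y + step, z) - h(x, y - step, z)) / (2 * step),
        (h(x, y, z + step) - h(x, y, z - step)) / (2 * step),
    )


sample = (1.0, 2.0, 0.5)
max_error = max(abs(a - b) for a, b in zip(grad_h(*sample), numeric_grad(*sample)))

# On the circle {z = 0, r = 2} the derivative vanishes, so these are critical points.
critical_gradient = grad_h(2.0, 0.0, 0.0)

# The point (3, 0, 0) lies on the level set h = 1, and the derivative there is non-zero.
torus_point_value = h(3.0, 0.0, 0.0)
torus_point_gradient = grad_h(3.0, 0.0, 0.0)

print(max_error, critical_gradient, torus_point_value, torus_point_gradient)
```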

4 Smooth functions

4.1 Definition of a smooth function
Suppose we have a manifold X, and a function:
$$h : X \to \mathbb{R}$$
If $X = \mathbb{R}^n$, or an open set in $\mathbb{R}^n$, then we know what it means to say that h is
a smooth function. If X is an arbitrary manifold, how should we decide if h
is smooth or not?

Figure 6: Level set of $(r - 2)^2 + z^2$.

The answer is that we should look at h in co-ordinates. If we pick a
co-ordinate chart
$$f : U \to \tilde U \subset \mathbb{R}^n$$
on X then we can consider the function
$$\tilde h = h \circ f^{-1} : \tilde U \to \mathbb{R}$$
We should think of $\tilde h$ as the function h written in this choice of co-ordinates.
If $\tilde h$ is a smooth function, then we should declare that h is also smooth, at
least within the open set $U \subset X$. If we want to be more specific then we can
choose a particular point $x \in U$, and declare that h is smooth at x iff the
function $\tilde h$ is smooth at the point f(x).
However, there might be a problem with this definition: it might depend
on which co-ordinates we chose! Let's check that it doesn't. Suppose that
$(U_1, f_1)$ and $(U_2, f_2)$ are two co-ordinate charts on X, both containing the
point x. Then the functions
$$\tilde h_1 = h \circ f_1^{-1} \quad \text{and} \quad \tilde h_2 = h \circ f_2^{-1}$$
are related by the transition function $\phi_{12}$ between the two charts:
$$\tilde h_2 = \tilde h_1 \circ \phi_{12}$$
(note that this equality only makes sense on the open set $f_2(U_1 \cap U_2) \subset \tilde U_2$
where both sides are defined). Since $\phi_{12}$ is a smooth function with a smooth
inverse, the function $\tilde h_2$ is smooth at the point $f_2(x)$ iff the function $\tilde h_1$ is
smooth at the point $f_1(x)$.
So if h looks smooth (at x) in one co-ordinate chart, then it will look
smooth (at x) in any co-ordinate chart. Let's record this definition formally:

Definition 4.1. Let X be a manifold, let
$$h : X \to \mathbb{R}$$
be a function, and let x be a point in X. We say that h is smooth at x iff,
for any co-ordinate chart
$$f : U \to \tilde U \subset \mathbb{R}^n$$
with $x \in U$, the function
$$h \circ f^{-1} : \tilde U \to \mathbb{R}$$
is smooth at the point f(x). We say that h is smooth everywhere, or
simply smooth, iff h is smooth at all points $x \in X$.
Notice that h is smooth everywhere iff, for any co-ordinate chart (U, f),
the function $h \circ f^{-1}$ is smooth. However if we want to check if h is smooth
we don't need to check every co-ordinate chart; it's enough to pick an atlas
$\{(U_i, f_i)\}$ for X, and verify that each function $h \circ f_i^{-1}$ is smooth.
Example 4.2. Let $X = S^1 = \{x^2 + y^2 = 1\} \subset \mathbb{R}^2$, and let:
$$h : S^1 \to \mathbb{R}, \quad (x, y) \mapsto x^2$$
Let's verify that h is a smooth function, using the stereographic projection
atlas from Example 2.3. Recall that we have a co-ordinate chart with $U_1 = S^1 \setminus (0, -1)$, $\tilde U_1 = \mathbb{R}$, and:
$$f_1^{-1} : \tilde x \mapsto \left( \frac{2\tilde x}{1 + \tilde x^2},\ \frac{1 - \tilde x^2}{1 + \tilde x^2} \right)$$
Then
$$h \circ f_1^{-1} : \tilde x \mapsto \left( \frac{2\tilde x}{1 + \tilde x^2} \right)^2$$
and this is a smooth function. We can similarly check that the function
$h \circ f_2^{-1}$ is smooth, where $(U_2, f_2)$ is the other co-ordinate chart.
Obviously we could replace h with any other smooth function of x and y
in this example, and it would again define a smooth function on $S^1$.
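The chart formulas in Example 4.2 can be sanity-checked numerically. The sketch below assumes the chart map is f_1(x, y) = x / (1 + y), the same stereographic formula that appears as g_1 in Example 4.5; the function names and sample values are illustrative.

```python
def f1_inverse(t):
    """Inverse stereographic chart: a real number t goes to a point of S^1."""
    d = 1.0 + t * t
    return (2.0 * t / d, (1.0 - t * t) / d)


def f1(x, y):
    """Stereographic chart on S^1 minus (0, -1); assumed to be x / (1 + y)."""
    return x / (1.0 + y)


def h(x, y):
    return x * x


def h_in_chart(t):
    """The function h written in the co-ordinate t, i.e. h composed with f1_inverse."""
    return h(*f1_inverse(t))


samples = [-5.0, -1.0, 0.0, 0.3, 2.0]

# f1_inverse really lands on the unit circle.
on_circle = all(abs(x * x + y * y - 1.0) < 1e-12 for x, y in map(f1_inverse, samples))

# f1 undoes f1_inverse.
round_trips = all(abs(f1(*f1_inverse(t)) - t) < 1e-12 for t in samples)

# h in co-ordinates agrees with the closed formula (2t / (1 + t^2))^2.
matches_formula = all(
    abs(h_in_chart(t) - (2.0 * t / (1.0 + t * t)) ** 2) < 1e-12 for t in samples
)

print(on_circle, round_trips, matches_formula)
```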
If $X = \mathbb{R}^n$ (or an open set in $\mathbb{R}^n$) equipped with the standard smooth
structure, it's clear that this definition of a smooth function agrees with the
ordinary definition.
It's obvious how to generalize Definition 4.1 to get a definition of when a
function
$$h : X \to \mathbb{R}^k$$
is smooth. But why stop there? What we really need is a definition of a
smooth function between any two manifolds.
Suppose X is a manifold of dimension n, and Y is a manifold of dimension
k, and we have a function:
$$H : X \to Y$$
How should we decide if H is smooth or not? Or more specifically, if we
pick a point $x \in X$, how should we decide if H is smooth at this point?
Again, we need to look at H in co-ordinates. So, pick a co-ordinate
chart on X
$$f : U \to \tilde U \subset \mathbb{R}^n$$
with $x \in U$, and pick a co-ordinate chart on Y
$$g : V \to \tilde V \subset \mathbb{R}^k$$
with $H(x) \in V$. To study the function H in co-ordinates (near the point x)
we would like to form the composition:
$$g \circ H \circ f^{-1} : \tilde U \to \tilde V$$
Unfortunately this expression doesn't make sense, because H(U) might not
be contained in V. However, if we assume that H is continuous, we can solve
this issue by passing to a smaller co-ordinate chart on X.
Let $U_{H,V}$ denote the intersection:
$$U_{H,V} = U \cap H^{-1}(V)$$
If H is continuous then this is a smaller open neighbourhood of x, contained
in U, and restricting f to $U_{H,V}$ gives a smaller co-ordinate chart:
$$f : U_{H,V} \to f(U_{H,V}) \subset \mathbb{R}^n$$
Then H defines a function
$$H : U_{H,V} \to V$$
and we can consider the composition:
$$g \circ H \circ f^{-1} : f(U_{H,V}) \to \tilde V$$
This is a function between an open set in $\mathbb{R}^n$ and an open set in $\mathbb{R}^k$, so it
makes sense to ask if it is a smooth function.

Definition 4.3. Let X and Y be two manifolds, of dimension n and k, and
let
$$H : X \to Y$$
be a continuous function. Fix a point $x \in X$. We say that H is smooth at
x iff, given any co-ordinate chart
$$f : U \to \tilde U \subset \mathbb{R}^n$$
on X containing the point x, and any co-ordinate chart
$$g : V \to \tilde V \subset \mathbb{R}^k$$
on Y containing the point H(x), the function
$$g \circ H \circ f^{-1} : f(U_{H,V}) \to \tilde V$$
is smooth.
We say that H is smooth everywhere, or just smooth, iff H is smooth
at x for every point $x \in X$.
Notice that a continuous function H is smooth iff, for any co-ordinate
chart (U, f) on X, and any co-ordinate chart (V, g) on Y, the composition
$$g \circ H \circ f^{-1} : f(U_{H,V}) \to \tilde V$$
is a smooth function.
If we set $Y = \mathbb{R}$ (with the standard smooth structure), then it's easy to
check that this definition agrees with Definition 4.1. Recall that in that case
we observed that if a function looks smooth in one co-ordinate chart then
it looks smooth in all co-ordinate charts. This is still true for our more
general definition of a smooth function. Suppose $(U_1, f_1)$ and $(U_2, f_2)$ are
two co-ordinate charts on X, and let $\phi_{12} = f_1 \circ f_2^{-1}$ be the transition function between
them. Now let $(V_1, g_1)$ and $(V_2, g_2)$ be two co-ordinate charts on Y, and let
$\psi_{21} = g_2 \circ g_1^{-1}$ be the transition function between them. Then we have an equality of
functions
$$g_2 \circ H \circ f_2^{-1} = \psi_{21} \circ (g_1 \circ H \circ f_1^{-1}) \circ \phi_{12} \qquad (4.4)$$
(on the open subset in $\tilde U_2$ where both sides are defined). Since all the transition functions are smooth, the function $g_2 \circ H \circ f_2^{-1}$ will be smooth iff the
function $g_1 \circ H \circ f_1^{-1}$ is smooth.
In particular if we want to check that H is a smooth function, we can
pick an atlas $\{(U_i, f_i)\}$ for X and an atlas $\{(V_j, g_j)\}$ for Y, and then check
that every function $g_j \circ H \circ f_i^{-1}$ is smooth.

Example 4.5. Let $X = T^1$ (from Example 2.10) and let $Y = S^1$. Now let:
$$H : T^1 \to S^1, \quad [t] \mapsto (\cos 2\pi t, \sin 2\pi t)$$
(note that this is well-defined). Let's show that H is smooth.
Take the co-ordinate chart $(U_1, f_1)$ on $T^1$, where $U_1 = T^1 \setminus [0]$ and
$$f_1 : U_1 \to \tilde U_1 = (0, 1) \subset \mathbb{R}$$
Let $(V_1, g_1)$ be the first stereographic projection chart on $S^1$ (from Example
2.3), so $V_1 = S^1 \setminus (0, -1)$ and
$$g_1 : (x, y) \mapsto \frac{x}{1 + y}$$
(and $\tilde V_1 = \mathbb{R}$). Then
$$f_1(U_1 \cap H^{-1}(V_1)) = f_1(U_1 \setminus [\tfrac{3}{4}]) = (0, \tfrac{3}{4}) \cup (\tfrac{3}{4}, 1)$$
and:
$$g_1 \circ H \circ f_1^{-1} : (0, \tfrac{3}{4}) \cup (\tfrac{3}{4}, 1) \to \mathbb{R}, \quad t \mapsto \frac{\cos 2\pi t}{1 + \sin 2\pi t}$$
This is a smooth function, which proves that H is smooth at any point
in $T^1$ apart from the points [0] and $[\tfrac{3}{4}]$. To prove that H is smooth at
these remaining points we do a similar calculation using the other charts
$U_2 = T^1 \setminus [\tfrac{1}{2}]$ and $V_2 = S^1 \setminus (0, 1)$.
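The two claims used in Example 4.5 (that H is well-defined on classes [t], and that [3/4] is the point sent to the excluded point (0, -1)) can be checked numerically. The sketch assumes T^1 is the quotient of the real line by the integers, so that [t] = [t + 1]; the names below are illustrative.

```python
import math


def H(t):
    """H([t]) = (cos 2*pi*t, sin 2*pi*t), evaluated on a representative t."""
    angle = 2.0 * math.pi * t
    return (math.cos(angle), math.sin(angle))


def g1(x, y):
    """Stereographic chart on S^1 minus (0, -1)."""
    return x / (1.0 + y)


def H_in_charts(t):
    """g1 composed with H, in the co-ordinate t on (0, 3/4) or (3/4, 1)."""
    return g1(*H(t))


ts = [0.1, 0.3, 0.5, 0.7, 0.9]

# Well-defined: replacing t by t + 1 does not change H.
well_defined = all(
    math.isclose(a, b, abs_tol=1e-9)
    for t in ts
    for a, b in zip(H(t), H(t + 1.0))
)

# The class [3/4] is sent to (0, -1), the point missing from the chart V_1.
x_bad, y_bad = H(0.75)
hits_excluded_point = abs(x_bad) < 1e-12 and abs(y_bad + 1.0) < 1e-12

values = [H_in_charts(t) for t in ts]
print(well_defined, hits_excluded_point, values)
```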
The next result should not be surprising.
Lemma 4.6. Let X, Y and Z be three manifolds, of dimensions n, k and m
respectively. Let
$$H : X \to Y \quad \text{and} \quad G : Y \to Z$$
be smooth functions. Then $G \circ H$ is smooth.
Proof. Fix a point $x \in X$. Now pick a co-ordinate chart (U, f) on X containing the point x, a co-ordinate chart (V, g) on Y containing the point H(x),
and a co-ordinate chart (W, h) on Z containing the point G(H(x)). Since H
is smooth, the function $g \circ H \circ f^{-1}$ is a smooth function, defined on some
open neighbourhood of the point $f(x) \in \mathbb{R}^n$. Similarly since G is smooth,
the function $h \circ G \circ g^{-1}$ is a smooth function, defined on some open neighbourhood of the point $g(H(x)) \in \mathbb{R}^k$. To prove that $G \circ H$ is smooth at x,
we need to know that the function
$$h \circ G \circ H \circ f^{-1} : f(U_{G \circ H, W}) \to \tilde W$$
is smooth at the point f(x). But in a sufficiently small open neighbourhood
of f(x) we can factor this function as
$$h \circ G \circ H \circ f^{-1} = (h \circ G \circ g^{-1}) \circ (g \circ H \circ f^{-1})$$
and both factors are smooth.
If you know what a category is, then this shows that there is a category
whose objects are manifolds and whose arrows are smooth functions.

4.2 The rank of a smooth function

We now begin to think about an extremely important concept: the derivative
of a smooth function. It will take us a long time to really get to grips with
this idea!
Suppose we have functions
$$F : \mathbb{R}^n \to \mathbb{R}^k \quad \text{and} \quad G : \mathbb{R}^k \to \mathbb{R}^m$$
and we form their composition $G \circ F : \mathbb{R}^n \to \mathbb{R}^m$. If we pick a point $x \in \mathbb{R}^n$
then the derivative of $G \circ F$ at x is a linear map
$$D(G \circ F)|_x : \mathbb{R}^n \to \mathbb{R}^m$$
and you should recall that the formula
$$D(G \circ F)|_x = DG|_{F(x)} \circ DF|_x$$
holds. This is nothing but the chain rule for functions of more than one
variable. Of course the formula still holds if F is only defined in some open
neighbourhood of x, and G is only defined in some open neighbourhood of
F(x).
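For readers who like to see the chain rule numerically, the following sketch compares both sides of the formula for one arbitrarily chosen pair of functions. The functions `F` and `G`, the central-difference helper `jacobian` and the helper `matmul` are all illustrative assumptions.

```python
import math


def F(p):
    x, y = p
    return [x * y, x + y, math.sin(x)]           # a sample F : R^2 -> R^3


def G(q):
    a, b, c = q
    return [a * b, c * c]                        # a sample G : R^3 -> R^2


def jacobian(func, p, step=1e-6):
    """Central-difference approximation to the matrix of partial derivatives."""
    p = list(p)
    rows = len(func(p))
    J = [[0.0] * len(p) for _ in range(rows)]
    for j in range(len(p)):
        plus, minus = p[:], p[:]
        plus[j] += step
        minus[j] -= step
        f_plus, f_minus = func(plus), func(minus)
        for i in range(rows):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * step)
    return J


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


x = [0.7, -1.2]
lhs = jacobian(lambda p: G(F(p)), x)              # derivative of the composition at x
rhs = matmul(jacobian(G, F(x)), jacobian(F, x))   # derivative of G at F(x) times derivative of F at x
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_gap)
```

The two matrices agree up to the error of the finite-difference approximation, which is all this kind of check can show.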
In particular, if n = k = m, and $G = F^{-1}$, we get that
$$D(F^{-1})|_{F(x)} = (DF|_x)^{-1}$$
(since the derivative of the identity function $\mathbb{R}^n \to \mathbb{R}^n$ is the identity linear
map, at all points). So if F is a smooth function with a smooth inverse, then
the derivative of F is an isomorphism at all points. This is the (much easier!)
converse to the Inverse Function Theorem.
Now suppose we have two manifolds X and Y, of dimensions n and k
respectively, and we have a smooth function:
$$F : X \to Y$$
Fix a point $x \in X$. Let's write F in co-ordinates near the point x, so we pick
a co-ordinate chart $(U_1, f_1)$ on X containing the point x, and a co-ordinate
chart $(V_1, g_1)$ on Y containing the point F(x), and we consider the function:
$$\tilde F_1 = g_1 \circ F \circ f_1^{-1}$$
This is defined in some open neighbourhood of the point $f_1(x) \in \mathbb{R}^n$, and it
lands in $\mathbb{R}^k$. This means we can take the derivative of this function at the
point $f_1(x)$; it is some linear map:
$$D\tilde F_1|_{f_1(x)} : \mathbb{R}^n \to \mathbb{R}^k$$
What happens if we change co-ordinates? If we pick new charts $(U_2, f_2)$
(containing x) and $(V_2, g_2)$ (containing F(x)), then our function becomes:
$$\tilde F_2 = g_2 \circ F \circ f_2^{-1}$$
(which is defined on some open neighbourhood of $f_2(x) \in \mathbb{R}^n$). The derivative of $\tilde F_2$ at the point $f_2(x)$ is also a linear map:
$$D\tilde F_2|_{f_2(x)} : \mathbb{R}^n \to \mathbb{R}^k$$
How are the two linear maps $D\tilde F_1|_{f_1(x)}$ and $D\tilde F_2|_{f_2(x)}$ related to each other?
We've already observed (4.4) that the functions $\tilde F_1$ and $\tilde F_2$ are related by
the equation
$$\tilde F_2 = \psi_{21} \circ \tilde F_1 \circ \phi_{12} \qquad (4.7)$$
where $\phi_{12}$ is the transition function between $U_2$ and $U_1$, and $\psi_{21}$ is the transition function between $V_1$ and $V_2$ (we might have to restrict to a smaller
open neighbourhood of $f_2(x)$ before this equation makes sense).
Now take the derivative of the equation (4.7) at the point $f_2(x)$. By the
chain rule, we have:
$$D\tilde F_2|_{f_2(x)} = D\psi_{21}|_{g_1(F(x))} \circ D\tilde F_1|_{f_1(x)} \circ D\phi_{12}|_{f_2(x)} \qquad (4.8)$$
So the linear maps $D\tilde F_1|_{f_1(x)}$ and $D\tilde F_2|_{f_2(x)}$ are not the same, but they are related by this formula. Now we can make an important observation: the linear
maps $D\psi_{21}|_{g_1(F(x))}$ and $D\phi_{12}|_{f_2(x)}$ are isomorphisms (because the transition
functions are bijections with smooth inverses). This means that the rank of
$D\tilde F_2|_{f_2(x)}$ must be the same as the rank of $D\tilde F_1|_{f_1(x)}$. Consequently we can
make the following definition:
Definition 4.9. Let X and Y be manifolds (of dimensions n and k respectively) and let $F : X \to Y$ be a smooth function. Fix a point $x \in X$. Now
pick a co-ordinate chart (U, f) containing x and a co-ordinate chart (V, g)
containing F(x), and consider the function:
$$\tilde F = g \circ F \circ f^{-1} : f(U_{F,V}) \to \tilde V$$
We define the rank of F at x to be the rank of the derivative
$$D\tilde F|_{f(x)} : \mathbb{R}^n \to \mathbb{R}^k$$
of $\tilde F$ at f(x).
This makes sense because of the formula (4.8); it doesn't matter which
co-ordinate charts we choose, the rank of $D\tilde F|_{f(x)}$ will always be the same.
Now we can generalize Definitions 3.8 and 3.9.
Definition 4.10. Let $F : X \to Y$ be a smooth function between two manifolds, of dimensions n and k respectively. We say that a point $x \in X$ is a
regular point of F if the rank of F at x is equal to k. If x is not a regular
point then we call it a critical point.
We say that a point $y \in Y$ is a regular value of F if every point $x \in F^{-1}(y)$ is a regular point. If y is not a regular value then we call it a critical
value.
So x is a regular point of F iff the derivative
$$D\tilde F|_{f(x)} : \mathbb{R}^n \to \mathbb{R}^k$$
is a surjection, where $\tilde F$ is F written in any co-ordinate charts. In other
words x is a regular point of F iff f(x) is a regular point of $\tilde F$, for any choice
of co-ordinates. Clearly if we set X to be an open set in $\mathbb{R}^n$, and Y to be
$\mathbb{R}^k$, then we recover our previous definitions.
We can also generalize Proposition 3.11 fairly easily:
Proposition 4.11. Let $F : X \to Y$ be a smooth function between two manifolds, of dimensions n and k respectively. Let $y \in Y$ be a regular value of F.
Then the level set
$$Z_y = F^{-1}(y) \subset X$$
is a submanifold of X of codimension k.

Proof. Pick a point $x \in Z_y$, a co-ordinate chart (U, f) containing x, and a
co-ordinate chart (V, g) containing y. Assume that F(U) is contained in V (if
not then replace U with the smaller chart $U_{F,V}$), and consider the function:
$$\tilde F = g \circ F \circ f^{-1} : \tilde U \to \tilde V$$
Note that the level set $\tilde F^{-1}(g(y))$ is just $f(Z_y \cap U)$. Now since y is a regular
value of F, the point x must be a regular point of F, which means that f(x)
is a regular point of $\tilde F$. Now we can apply the Inverse Function Theorem
just as we did in the proof of Proposition 3.11, and conclude that there is an
open set $W \subset f(U)$ containing f(x), and a homeomorphism
$$h : W \to \tilde W \subset \mathbb{R}^n$$
such that $h(\tilde F^{-1}(g(y)) \cap W) = \mathbb{R}^{n-k} \cap \tilde W$. Then we can use the co-ordinate chart
$(f^{-1}(W), h \circ f)$ on X to demonstrate that $Z_y$ satisfies the submanifold condition at the point x.
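Example 4.12. Consider the 2-sphere
$$S^2 = \{(x, y, z);\ x^2 + y^2 + z^2 = 1\} \subset \mathbb{R}^3$$
and let $F : S^2 \to \mathbb{R}$ be the function:
$$F : (x, y, z) \mapsto x$$
Take the stereographic projection chart $(U_1, f_1)$ from Example 2.4, so $U_1 = S^2 \setminus (0, 0, -1)$ and:
$$f_1^{-1} : \mathbb{R}^2 \to U_1, \quad (\tilde x, \tilde y) \mapsto \left( \frac{2\tilde x}{1 + \tilde x^2 + \tilde y^2},\ \frac{2\tilde y}{1 + \tilde x^2 + \tilde y^2},\ \frac{1 - \tilde x^2 - \tilde y^2}{1 + \tilde x^2 + \tilde y^2} \right)$$
In this chart, the function F becomes the function
$$\tilde F = F \circ f_1^{-1} : (\tilde x, \tilde y) \mapsto \frac{2\tilde x}{1 + \tilde x^2 + \tilde y^2}$$
and this has derivative:
$$D\tilde F|_{(\tilde x, \tilde y)} = \left( \frac{2(1 - \tilde x^2 + \tilde y^2)}{(1 + \tilde x^2 + \tilde y^2)^2},\ \frac{-4\tilde x \tilde y}{(1 + \tilde x^2 + \tilde y^2)^2} \right) : \mathbb{R}^2 \to \mathbb{R}$$
This has rank 1 except when it goes to zero, which occurs exactly at $(\tilde x, \tilde y) = (\pm 1, 0)$. Hence inside the open set $U_1$, the only critical points of F are:
$$f_1^{-1}(\pm 1, 0) = (\pm 1, 0, 0) \in S^2$$
and we can use another chart to check that these are really the only critical
points in $S^2$. Thus the critical values of F are $\alpha = 1$ and $\alpha = -1$.
If $|\alpha| < 1$ then the level set $F^{-1}(\alpha)$ is a circle, the intersection of the 2-sphere with the plane $\{x = \alpha\}$. Proposition 4.11 says that this is a 1-dimensional
submanifold of $S^2$.
If $|\alpha| > 1$ then the level set $F^{-1}(\alpha)$ is empty. At the critical values $\alpha = \pm 1$
the level set consists of a single point; this is evidently not a 1-dimensional
submanifold.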
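The derivative computed in Example 4.12 can be checked against a finite-difference approximation, and evaluated at the two points where it vanishes. In the sketch below `u` and `v` stand for the chart co-ordinates; the helper `numeric_gradient` and the sample point are illustrative assumptions.

```python
def F_tilde(u, v):
    """F written in the stereographic chart: 2u / (1 + u^2 + v^2)."""
    return 2.0 * u / (1.0 + u * u + v * v)


def dF_tilde(u, v):
    """The 1-by-2 derivative matrix of F_tilde, written as a tuple."""
    d = (1.0 + u * u + v * v) ** 2
    return (2.0 * (1.0 - u * u + v * v) / d, -4.0 * u * v / d)


def f1_inverse(u, v):
    """Inverse stereographic chart from the plane to S^2 minus (0, 0, -1)."""
    d = 1.0 + u * u + v * v
    return (2.0 * u / d, 2.0 * v / d, (1.0 - u * u - v * v) / d)


def numeric_gradient(u, v, step=1e-6):
    """Central-difference approximation to the derivative of F_tilde."""
    return (
        (F_tilde(u + step, v) - F_tilde(u - step, v)) / (2 * step),
        (F_tilde(u, v + step) - F_tilde(u, v - step)) / (2 * step),
    )


sample = (0.4, -1.3)
max_error = max(abs(a - b) for a, b in zip(dF_tilde(*sample), numeric_gradient(*sample)))

# The derivative vanishes at (1, 0) and (-1, 0) in the chart ...
gradients_at_critical = [dF_tilde(1.0, 0.0), dF_tilde(-1.0, 0.0)]

# ... and these correspond to the points (1, 0, 0) and (-1, 0, 0) on the sphere.
critical_points = [f1_inverse(1.0, 0.0), f1_inverse(-1.0, 0.0)]

print(max_error, gradients_at_critical, critical_points)
```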

4.3 Some special kinds of smooth functions

We can use our definition of rank to single out some particularly important
kinds of smooth functions.
Definition 4.13. A smooth function $F : X \to Y$ is called a submersion if
the rank of F at any point is equal to the dimension of Y.
So F is a submersion iff the derivative at any point (in any co-ordinates)
is a surjection, i.e. a submersion is exactly a smooth function that has no
critical points. There is a dual notion to this:
Definition 4.14. A smooth function $F : X \to Y$ is called an immersion if
the rank of F at any point $x \in X$ is equal to the dimension of X.
In other words, F is an immersion iff the derivative at any point (in any
co-ordinates) is an injection.
Now let X be a manifold, and let $Z \subset X$ be a submanifold of X. Recall
from Proposition 3.4 that there is an induced smooth structure on Z, making
it into a manifold in its own right.
Lemma 4.15. Let X be a manifold, and let $Z \subset X$ be a submanifold of X.
Then the inclusion map
$$\iota : Z \hookrightarrow X$$
is smooth, and an immersion.
Proof. Exercise.
However, not every immersion is of this form.
Example 4.16. Let $X = \mathbb{R}$ and $Y = \mathbb{R}^2$, and let:
$$F : \mathbb{R} \to \mathbb{R}^2, \quad t \mapsto (t^2, t^3 - t)$$
We can take the trivial co-ordinate charts on X and Y. The derivative of F
at the point $t \in \mathbb{R}$ is the linear map
$$DF|_t = (2t, 3t^2 - 1)^\top : \mathbb{R} \to \mathbb{R}^2$$
which is an injection for every t, since the two entries never vanish simultaneously. Hence F is an immersion. The image of
F is not a submanifold; the problem occurs at the self-intersection point
$(1, 0) = F(\pm 1)$ (see Figure 7).

Figure 7: The image of an immersion need not be a submanifold.
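The claims in Example 4.16 are easy to test numerically. The sketch below checks that the curve passes through (1, 0) twice and that the velocity vector is non-zero on a grid of parameter values; the grid and the names are illustrative choices.

```python
def F(t):
    """The curve t -> (t^2, t^3 - t)."""
    return (t * t, t ** 3 - t)


def dF(t):
    """Velocity of the curve: (2t, 3t^2 - 1)."""
    return (2.0 * t, 3.0 * t * t - 1.0)


# The curve passes through (1, 0) at both t = 1 and t = -1.
double_point = (F(1.0), F(-1.0))

# It does so with two different velocities, so the image crosses itself there.
velocities_at_double_point = (dF(1.0), dF(-1.0))

# The velocity never vanishes: 2t = 0 forces t = 0, where the second entry is -1.
grid = [k / 100.0 for k in range(-300, 301)]
min_speed_squared = min(a * a + b * b for a, b in map(dF, grid))

print(double_point, velocities_at_double_point, min_speed_squared)
```

A grid check cannot prove the derivative is injective everywhere, but it agrees with the one-line argument in the comment.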
One might hope that if F is an injective immersion then the image of F
must be a submanifold, but this is not true either! For a counter-example,
just restrict the function F from Example 4.16 to the open interval $(-\infty, 1) \subset \mathbb{R}$.
Our final class of smooth functions is perhaps the most important:
Definition 4.17. Let X and Y be two smooth manifolds. A function
$$F : X \to Y$$
is called a diffeomorphism if F is smooth, bijective, and the inverse function
$F^{-1}$ is also smooth. If there exists a diffeomorphism between X and Y then
we say that X and Y are diffeomorphic.
If two manifolds are diffeomorphic then they are exactly the same, for all
practical purposes (it may help to think of "diffeomorphic" as another word
for "isomorphic").
Suppose $F : X \to Y$ is a diffeomorphism, and we pick a point $x \in X$, a co-ordinate chart (U, f) containing x, and a co-ordinate chart (V, g) containing
F(x). By shrinking U and V if necessary, we can assume that F is a bijection
from U to V and hence
$$\tilde F = g \circ F \circ f^{-1} : \tilde U \to \tilde V$$
is a smooth bijection with a smooth inverse. This means that the derivative
of $\tilde F$ at any point must be an isomorphism, so F is both a submersion and
an immersion. In particular, diffeomorphic manifolds must have the same
dimension, which is reassuring!
There do exist smooth functions which are bijections, but whose inverse
functions are not smooth. These functions are not diffeomorphisms. This
means the following criterion is useful:
Lemma 4.18. Let X and Y be n-dimensional manifolds, and let
$$F : X \to Y$$
be a smooth bijection. If the rank of F is n at every point then F is a
diffeomorphism.
Proof. We just need to show that the inverse function $F^{-1}$ is smooth. Fix a
point $y \in Y$, let $x = F^{-1}(y)$, and choose co-ordinate charts (U, f) containing
x and (V, g) containing y. Assume (by shrinking U if necessary) that $F(U) \subset V$, and consider the function $\tilde F = g \circ F \circ f^{-1}$. Since the rank of F is n at
the point x, the derivative
$$D\tilde F|_{f(x)} : \mathbb{R}^n \to \mathbb{R}^n$$
is an isomorphism. By the Inverse Function Theorem, there is some open
neighbourhood of g(y) on which the function $\tilde F^{-1}$ is smooth. This proves
that $F^{-1}$ is smooth at y.
Example 4.19. Recall from Example 4.5 that we have a smooth function:
$$H : T^1 \to S^1, \quad [t] \mapsto (\cos 2\pi t, \sin 2\pi t)$$
This function is obviously a bijection. Moreover, we computed that for a
particular choice of co-ordinate charts H became the function
$$\tilde H : (0, \tfrac{3}{4}) \cup (\tfrac{3}{4}, 1) \to \mathbb{R}, \quad t \mapsto \frac{\cos 2\pi t}{1 + \sin 2\pi t}$$
This has derivative:
$$D\tilde H|_t = \frac{-2\pi}{1 + \sin 2\pi t} : \mathbb{R} \to \mathbb{R}$$
This is never zero, which shows that H has rank 1 at all points other than
$[0], [\tfrac{3}{4}] \in T^1$. We can check using other charts that H also has rank 1 at these
remaining points, so by Lemma 4.18 H is a diffeomorphism.
So our two versions of the circle, $T^1$ and $S^1$, are diffeomorphic manifolds.
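The derivative used in Example 4.19 can be checked numerically. The sketch assumes the co-ordinate expression is cos(2πt) / (1 + sin(2πt)), so that its derivative is -2π / (1 + sin(2πt)); the helper `numeric_derivative` and the sample values of t are illustrative.

```python
import math


def H_tilde(t):
    """H written in the charts of Example 4.5 (for t not equal to 3/4)."""
    return math.cos(2.0 * math.pi * t) / (1.0 + math.sin(2.0 * math.pi * t))


def dH_tilde(t):
    """Closed form for the derivative of H_tilde."""
    return -2.0 * math.pi / (1.0 + math.sin(2.0 * math.pi * t))


def numeric_derivative(func, t, step=1e-6):
    """Central-difference approximation to the derivative of func at t."""
    return (func(t + step) - func(t - step)) / (2 * step)


ts = [0.1, 0.25, 0.5, 0.6, 0.9]          # sample points away from t = 3/4
max_error = max(abs(dH_tilde(t) - numeric_derivative(H_tilde, t)) for t in ts)

# The derivative is strictly negative at every sample point, so in particular non-zero.
always_negative = all(dH_tilde(t) < 0.0 for t in ts)

print(max_error, always_negative)
```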
Example 4.20. Recall from Example 2.23 that we can find a non-standard
atlas C on the topological manifold $\mathbb{R}$ which is not compatible with the standard atlas A. However, the two smooth manifolds $(\mathbb{R}, [A])$ and $(\mathbb{R}, [C])$ are
diffeomorphic (we leave the proof as an exercise).
This leaves an interesting question: does there exist any smooth atlas D
on $\mathbb{R}^n$ such that the resulting smooth manifold $(\mathbb{R}^n, [D])$ is not diffeomorphic to the standard $\mathbb{R}^n$? This question was comprehensively answered in
the 1980s, and the answer is one of the most astonishing results in all of
mathematics!

5 Tangent spaces

If we have open sets $U \subset \mathbb{R}^n$ and $V \subset \mathbb{R}^m$ and a smooth function $F : U \to V$,
we know that we can define the derivative of F at any point $x \in U$, and this
is a linear map:
$$DF|_x : \mathbb{R}^n \to \mathbb{R}^m$$
If we replace U and V by arbitrary manifolds X and Y, then we don't yet
know how to generalize this definition. We saw in the last section that it's
possible to define the rank of the derivative of F at a point in X, but this is
just a number. We want to upgrade it to a linear map!
In this section we'll show that to any point x in a manifold X there's
an associated vector space, called the tangent space to X at x, and denoted
$T_x X$. Then we'll show that if we have a smooth map $F : X \to Y$ then we can
define the derivative of F at x, and it's a linear map:
$$DF|_x : T_x X \to T_{F(x)} Y$$
In fact defining tangent spaces is the hard bit; the fact that we can define
$DF|_x$ will follow almost automatically.

5.1 Tangent vectors via curves

Roughly, a tangent vector to a point x in a manifold is a direction that you
can go in from x. There are several ways to make this precise; the most
intuitively appealing way is via equivalence classes of curves. We start by
explaining how this works for the simplest kind of manifolds, namely open
sets in $\mathbb{R}^n$.
Fix an open set $U \subset \mathbb{R}^n$, and pick a point $x \in U$. Let's declare that a
curve through x is a smooth function
$$\sigma = (\sigma_1, \dots, \sigma_n) : (-\epsilon, \epsilon) \to U$$
with $\sigma(0) = x$, where here $\epsilon$ is some positive real number and $(-\epsilon, \epsilon) \subset \mathbb{R}$ is
the corresponding open interval. This is indeed a smooth parametrized curve
in U, passing through the point x. Note that we really mean the function $\sigma$
and not just its image in U (which would be an unparametrized curve).
The derivative of $\sigma$ at the point $0 \in \mathbb{R}$ is a linear map:
$$D\sigma|_0 : \mathbb{R} \to \mathbb{R}^n$$
given by the n-by-1 matrix:
$$D\sigma|_0 = (\sigma_1'(0), \dots, \sigma_n'(0))^\top$$
We're going to think of $D\sigma|_0$ as a column vector in $\mathbb{R}^n$ rather than as a
matrix (or if you prefer, we're going to write $D\sigma|_0$ when we mean $D\sigma|_0(1)$).
Of course, this is just the tangent vector to $\sigma$ when it hits the point x. It's the
direction that $\sigma$ is travelling when it passes through x, or more accurately
it's the velocity of $\sigma$, since we don't forget the length of $D\sigma|_0$.
Now suppose that we have two curves through x:
$$\sigma : (-\epsilon_1, \epsilon_1) \to U, \qquad \tau : (-\epsilon_2, \epsilon_2) \to U$$
Let's declare that $\sigma$ and $\tau$ are tangent at x iff they have the same tangent
vector at this point, so $D\sigma|_0 = D\tau|_0$. This is a restrictive use of the word
"tangent", since we're requiring that the tangent vectors are actually
equal and not just proportional. For example if $\tau$ was a reparametrization
of $\sigma$, then under our definition it probably wouldn't be tangent to $\sigma$ at x.
Obviously being tangent at x is an equivalence relation on curves through
x, and by definition we have a well-defined function
$$\Delta_U : \{\text{curves through } x\} / (\text{tangency at } x) \to \mathbb{R}^n$$
sending each equivalence class $[\sigma]$ to its tangent vector $D\sigma|_0$. By definition
this function $\Delta_U$ is an injection. It's also a surjection; this is because for any
vector $v \in \mathbb{R}^n$ we can consider a straight line
$$\hat\sigma_v : \mathbb{R} \to \mathbb{R}^n, \quad t \mapsto x + vt \qquad (5.1)$$
and if $\epsilon$ is small enough this defines a function:
$$\hat\sigma_v : (-\epsilon, \epsilon) \to U$$
This is a curve through x, and obviously $\Delta_U([\hat\sigma_v]) = v$. So $\Delta_U$ is a bijection
of sets.
Now we want to generalize this to other manifolds. Let X be an n-dimensional manifold, and pick a point $x \in X$.
Definition 5.2. A curve through x is a smooth function
$$\sigma : (-\epsilon, \epsilon) \to X$$
with $\sigma(0) = x$, where $\epsilon$ is any positive real number.
We want to find a definition of when two curves through x are tangent
to each other. As usual, we need to look at our curves in co-ordinates.
Let $\sigma : (-\epsilon, \epsilon) \to X$ be a curve through x, and pick a co-ordinate chart
(U, f) on X containing the point x. The composition
$$\tilde\sigma = f \circ \sigma$$
is a curve through the point $\tilde x = f(x) \in \tilde U$ (we might have to shrink $\epsilon$ to
ensure that the image of $\sigma$ lies within U). Hence this curve has an associated
tangent vector:
$$D\tilde\sigma|_0 \in \mathbb{R}^n$$
If we have two curves through x, say $\sigma$ and $\tau$, then we could say that they
are tangent at x if when we look at them in co-ordinates then the two curves
$\tilde\sigma = f \circ \sigma$ and $\tilde\tau = f \circ \tau$ are tangent at $\tilde x$. So we would declare that $\sigma$ and $\tau$
are tangent at x iff the two tangent vectors
$$D\tilde\sigma|_0,\ D\tilde\tau|_0 \in \mathbb{R}^n$$
42

are the same. This is a reasonable definition, but we need to check that it
doesnt depend on which co-ordinates we chose.
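So, pick two different charts $(U_1, f_1)$ and $(U_2, f_2)$, both containing $x$. Looking at the curve $\sigma$ in these two charts gives a curve $\tilde{\sigma}_1 = f_1 \circ \sigma$ through the point $\tilde{x}_1 = f_1(x)$, and a curve $\tilde{\sigma}_2 = f_2 \circ \sigma$ through the point $\tilde{x}_2 = f_2(x)$. These curves are related by the transition function $\phi_{21}$ between the two charts:
\[ \tilde{\sigma}_2 = \phi_{21} \circ \tilde{\sigma}_1 \]
(we can always shrink $\epsilon$ to make sure that $\sigma$ lands in $U_1 \cap U_2$). If we use our first chart, the tangent vector to $\sigma$ would be the vector $D\tilde{\sigma}_1|_0 \in \mathbb{R}^n$, and if we use our second chart, it would be $D\tilde{\sigma}_2|_0$. By the chain rule, we have that
\[ D\tilde{\sigma}_2|_0 = D\phi_{21}|_{\tilde{x}_1} \circ D\tilde{\sigma}_1|_0 \tag{5.3} \]
so these two vectors are related by the linear map:
\[ D\phi_{21}|_{\tilde{x}_1} : \mathbb{R}^n \to \mathbb{R}^n \]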
This means that our definition of when two curves through $x$ are tangent is indeed independent of co-ordinates. Suppose we have two curves $\sigma$ and $\tau$ through $x$, and they have the same tangent vector when we look at them in the chart $(U_1, f_1)$, i.e.
\[ D\tilde{\sigma}_1|_0 = D\tilde{\tau}_1|_0 \]
Then by equation (5.3), we must also have
\[ D\tilde{\sigma}_2|_0 = D\tilde{\tau}_2|_0 \]
so $\sigma$ and $\tau$ have the same tangent vector when we look at them in the chart $(U_2, f_2)$. Let's write down our definition formally:
Definition 5.4. Fix a point $x$ in a manifold $X$. We say that two curves $\sigma$, $\tau$ through $x$ are tangent at $x$ iff for any co-ordinate chart $(U, f)$ containing $x$, we have
\[ D(f \circ \sigma)|_0 = D(f \circ \tau)|_0 \]
As we have shown, if this holds in one chart then it holds in all charts. Obviously being tangent at $x$ is an equivalence relation on the set of all curves through $x$.
Definition 5.5 (Geometer's definition). Fix a point $x$ in a manifold $X$. A tangent vector to $x$ is an equivalence class of curves through $x$. We denote the set of all tangent vectors to $x$ by
\[ T_x X = \{\text{curves through } x\} / (\text{tangency at } x) \]
and call it the tangent space to $X$ at $x$.
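The chart-independence of tangency can be checked numerically in a simple case. The sketch below is only an illustration under stated assumptions: the two curves, the base point $(1, 1)$ and the helper names (`derivative_at_zero`, `polar_chart`, `close`) are all hypothetical choices, and derivatives are approximated by central differences rather than computed exactly. It takes two curves in $\mathbb{R}^2$ that agree to first order in the identity chart, and confirms that they also agree to first order in polar co-ordinates.

```python
import math

def derivative_at_zero(curve, h=1e-6):
    """Central-difference approximation to the tangent vector D(curve)|_0."""
    plus, minus = curve(h), curve(-h)
    return tuple((p - m) / (2 * h) for p, m in zip(plus, minus))

# Two curves through (1, 1) with the same tangent vector (1, 2) in the
# identity chart; they differ only at second order in t.
def sigma(t):
    return (1 + t, 1 + 2 * t)

def tau(t):
    return (1 + t + t ** 2, 1 + 2 * t - 3 * t ** 2)

# A second chart defined near (1, 1): polar co-ordinates (r, theta).
def polar_chart(p):
    x, y = p
    return (math.hypot(x, y), math.atan2(y, x))

def in_polar(curve):
    """The same curve, viewed in the polar chart."""
    return lambda t: polar_chart(curve(t))

def close(a, b, tol=1e-5):
    return all(abs(p - q) < tol for p, q in zip(a, b))

v_sigma = derivative_at_zero(sigma)
v_tau = derivative_at_zero(tau)
w_sigma = derivative_at_zero(in_polar(sigma))
w_tau = derivative_at_zero(in_polar(tau))

# Tangent in the first chart, and also tangent in the second chart.
print(close(v_sigma, v_tau), close(w_sigma, w_tau))
```

The polar tangent vector differs from $(1, 2)$, as expected: it is the image of $(1, 2)$ under the derivative of the transition function, exactly as in (5.3).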

Proposition 5.6. If $x$ is a point in an $n$-dimensional manifold $X$, then the tangent space $T_x X$ is an $n$-dimensional vector space.
Proof. Pick a co-ordinate chart $(U, f)$ containing $x$. Then by the definition of $T_x X$, we have a well-defined injective function:
\[ \Delta_f : T_x X \to \mathbb{R}^n, \qquad [\sigma] \mapsto D(f \circ \sigma)|_0 \tag{5.7} \]
In fact $\Delta_f$ is also a surjection, because given any $v \in \mathbb{R}^n$ we have a straight-line curve
\[ \tilde{\sigma}_v : (-\epsilon, \epsilon) \to \tilde{U}, \qquad t \mapsto f(x) + vt \]
for small enough $\epsilon$ (as in (5.1)), and then $\sigma_v = f^{-1} \circ \tilde{\sigma}_v$ is a curve through $x$ such that $\Delta_f(\sigma_v) = v$. Hence $\Delta_f$ is a bijection of sets.
We can use the bijection $\Delta_f$ to put a vector space structure on $T_x X$, i.e. we can define an addition operation
\[ [\sigma] + [\tau] = \Delta_f^{-1}\big( \Delta_f(\sigma) + \Delta_f(\tau) \big) \tag{5.8} \]
and a scalar multiplication
\[ \lambda [\sigma] = \Delta_f^{-1}\big( \lambda \, \Delta_f(\sigma) \big) \tag{5.9} \]
(and these are guaranteed to satisfy the vector space axioms). However, once again we need to check that these definitions don't depend on our choice of co-ordinates.
So, pick two co-ordinate charts $(U_1, f_1)$ and $(U_2, f_2)$, both containing $x$. This gives two different bijections
\[ \Delta_{f_1} : T_x X \to \mathbb{R}^n, \qquad [\sigma] \mapsto D(f_1 \circ \sigma)|_0 \]
and
\[ \Delta_{f_2} : T_x X \to \mathbb{R}^n, \qquad [\sigma] \mapsto D(f_2 \circ \sigma)|_0 \]
given by calculating tangent vectors in the two different charts. By the chain rule (5.3), we have that
\[ D(f_2 \circ \sigma)|_0 = D\phi_{21}|_{f_1(x)} \circ D(f_1 \circ \sigma)|_0 \]
where $\phi_{21}$ is the transition function between our two charts. So the two bijections $\Delta_{f_1}$ and $\Delta_{f_2}$ are related by:
\[ \Delta_{f_2} = D\phi_{21}|_{f_1(x)} \circ \Delta_{f_1} \tag{5.10} \]
Since $D\phi_{21}|_{f_1(x)}$ is a linear isomorphism, this implies that the operations (5.8) and (5.9) give the same result, independent of which chart we used to define them.
So we have achieved our first aim for this section: to any point $x$ in a manifold $X$ we have attached a vector space $T_x X$. If we choose any co-ordinate chart $(U, f)$ containing $x$, then we get a linear isomorphism
\[ \Delta_f : T_x X \xrightarrow{\ \sim\ } \mathbb{R}^n \]
as in (5.7). However, there is no canonical way to identify $T_x X$ with $\mathbb{R}^n$, and indeed if we have two different charts then we know that $\Delta_{f_1}$ and $\Delta_{f_2}$ are related by the equation (5.10).
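Now we move on to our second aim: defining the derivative of a smooth function between two manifolds.
Firstly, suppose that $\tilde{U}$ and $\tilde{V}$ are open sets in $\mathbb{R}^n$ and $\mathbb{R}^m$ respectively, and that $\tilde{F}$ is a smooth function:
\[ \tilde{F} : \tilde{U} \to \tilde{V} \]
Pick a point $\tilde{x} \in \tilde{U}$ and let $\tilde{y} = \tilde{F}(\tilde{x})$. If we have a curve $\tilde{\sigma}$ through $\tilde{x}$, then the composition $\tilde{F} \circ \tilde{\sigma}$ is a curve through $\tilde{y}$ (since the composition of two smooth functions is smooth). Furthermore, using the chain rule again tells us that:
\[ D(\tilde{F} \circ \tilde{\sigma})|_0 = D\tilde{F}|_{\tilde{x}} \circ D\tilde{\sigma}|_0 \tag{5.11} \]
In particular, the tangent vector to $\tilde{F} \circ \tilde{\sigma}$ only depends on the tangent vector to $\tilde{\sigma}$. So if we wish, we could view the derivative $D\tilde{F}|_{\tilde{x}}$ as a function
\[ D\tilde{F}|_{\tilde{x}} : \frac{\{\text{curves through } \tilde{x}\}}{(\text{tangency at } \tilde{x})} \to \frac{\{\text{curves through } \tilde{y}\}}{(\text{tangency at } \tilde{y})}, \qquad [\tilde{\sigma}] \mapsto [\tilde{F} \circ \tilde{\sigma}] \]
(this is the composition of $D\tilde{F}|_{\tilde{x}}$ with the bijections $\Delta_{\tilde{U}}$ and $\Delta_{\tilde{V}}$). Now we can generalize this to any smooth function between manifolds.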
Proposition 5.12. Let $X$ and $Y$ be manifolds of dimensions $n$ and $m$, and let $F : X \to Y$ be a smooth function. Fix a point $x \in X$, and let $y = F(x)$. Then there is an associated linear map
\[ DF|_x : T_x X \to T_y Y \]
defined by:
\[ DF|_x([\sigma]) = [F \circ \sigma] \in T_y Y \]
We call this linear map the derivative of $F$ at $x$.
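Proof. Pick a chart $(U, f)$ on $X$ containing $x$, and a chart $(V, g)$ on $Y$ containing $y$. In these charts, $F$ becomes the function:
\[ \tilde{F} = g \circ F \circ f^{-1} : \tilde{U} \to \tilde{V} \]
Now choose a curve $\sigma$ through $x$. Using our chart, this becomes a curve $\tilde{\sigma} = f \circ \sigma$ through the point $\tilde{x} = f(x) \in \tilde{U}$, and its associated tangent vector is:
\[ \Delta_f(\sigma) = D\tilde{\sigma}|_0 \in \mathbb{R}^n \]
Now form the composition $F \circ \sigma$; this is a curve through the point $y \in Y$. Using our chart on $Y$, it becomes a curve
\[ g \circ (F \circ \sigma) = \tilde{F} \circ \tilde{\sigma} \]
through the point $\tilde{y} = g(y) \in \tilde{V}$. The tangent vector associated to this curve is
\[ \Delta_g(F \circ \sigma) = D(\tilde{F} \circ \tilde{\sigma})|_0 \in \mathbb{R}^m \]
which by the chain rule (5.11) is equal to:
\[ D\tilde{F}|_{\tilde{x}} \circ D\tilde{\sigma}|_0 = D\tilde{F}|_{\tilde{x}}\big( \Delta_f(\sigma) \big) \]
So the tangent vector $\Delta_g(F \circ \sigma)$ only depends on the tangent vector $\Delta_f(\sigma)$. This means that the equivalence class of the curve $F \circ \sigma$ in the space $T_y Y$ only depends on the equivalence class of the curve $\sigma$ in the space $T_x X$, and thus we have a well-defined function $DF|_x : T_x X \to T_y Y$ that sends $[\sigma] \mapsto [F \circ \sigma]$. Furthermore the square
\[ \begin{array}{ccc} T_x X & \xrightarrow{\ DF|_x\ } & T_y Y \\ \Big\downarrow {\scriptstyle \Delta_f} & & \Big\downarrow {\scriptstyle \Delta_g} \\ \mathbb{R}^n & \xrightarrow{\ D\tilde{F}|_{\tilde{x}}\ } & \mathbb{R}^m \end{array} \tag{5.13} \]
commutes, i.e.
\[ DF|_x = \Delta_g^{-1} \circ D\tilde{F}|_{\tilde{x}} \circ \Delta_f \]
and therefore $DF|_x$ is a linear map, since it's the composition of three linear maps.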

So now we have a way to talk about the derivative of a smooth map abstractly, without reference to any co-ordinate charts. If we do decide to pick co-ordinates, it reduces to the ordinary notion of the derivative of a smooth map, via the square (5.13).
In particular, it's immediate that the rank of $F$ at $x$ (Definition 4.9) is exactly the rank of the linear map $DF|_x$.

5.2 Tangent spaces to submanifolds

If we have a submanifold $Z$ of $\mathbb{R}^n$, and we choose a point $z \in Z$, then we have an intuitive idea of what it means for a vector $v \in \mathbb{R}^n$ to be tangent to $Z$ at the point $z$. This means that for submanifolds of $\mathbb{R}^n$ there should be a much more basic definition of the tangent space $T_z Z$: it should be the subspace of $\mathbb{R}^n$ consisting of vectors that are tangent to $Z$ at $z$. Fortunately, our fancy definition agrees with this more basic definition in this special case.
For any point $z \in \mathbb{R}^n$, the tangent space $T_z \mathbb{R}^n$ can be canonically identified with $\mathbb{R}^n$, via the map $[\sigma] \mapsto D\sigma|_0$. This is another way in which $\mathbb{R}^n$ is a special manifold! Now let $Z \subset \mathbb{R}^n$ be a submanifold, with $z \in Z$. Any curve in $Z$ (through $z$) is also a curve in $\mathbb{R}^n$ (through $z$), and we have a map
\[ T_z Z \to T_z \mathbb{R}^n \cong \mathbb{R}^n \]
sending $[\sigma]$ to the associated vector $D\sigma|_0$, viewed as a vector in $\mathbb{R}^n$. Clearly this map is an injection, and by definition its image is the subspace of vectors which are tangent to curves in $Z$. This is exactly our intuitive idea of a tangent space.
Example 5.14. Consider the submanifold $Z_1 = \{(x, \sin x)\} \subset \mathbb{R}^2$ from Example 3.1. For any point $y = (x, \sin x) \in Z_1$, we can define a curve in $\mathbb{R}^2$ through the point $y$ by
\[ \sigma : (-\epsilon, \epsilon) \to \mathbb{R}^2, \qquad t \mapsto (t + x, \sin(t + x)) \]
(and $\epsilon$ can be any positive real number). The image of $\sigma$ lies in $Z_1$, so $\sigma$ is automatically a smooth function from $(-\epsilon, \epsilon)$ to $Z_1$ (see problem sheets). Hence $[\sigma]$ is a vector in $T_y Z_1$. If we want to view $[\sigma]$ as a vector in $T_y \mathbb{R}^2 \cong \mathbb{R}^2$ we compute:
\[ D\sigma|_0 = \begin{pmatrix} 1 \\ \cos x \end{pmatrix} \in \mathbb{R}^2 \]
Since $Z_1$ is only 1-dimensional, the tangent space $T_y Z_1$ is the line in $\mathbb{R}^2$ spanned by this vector (see Figure 8).

Figure 8: The tangent space to a point in $Z_1$.

This is exactly what the tangent line ought to be, so our complicated definitions reduce to a sensible answer!
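The computation in Example 5.14 can be sanity-checked numerically. The following is a minimal sketch, assuming a central-difference approximation to the derivative; the sample point `x = 0.7` and the function names are arbitrary illustrative choices, not part of the notes.

```python
import math

def sigma(t, x):
    """The curve t -> (t + x, sin(t + x)) through the point (x, sin x)."""
    return (t + x, math.sin(t + x))

def tangent_vector(x, h=1e-6):
    """Central-difference approximation to D(sigma)|_0."""
    p, m = sigma(h, x), sigma(-h, x)
    return ((p[0] - m[0]) / (2 * h), (p[1] - m[1]) / (2 * h))

x = 0.7  # arbitrary sample point on the x-axis
approx = tangent_vector(x)
exact = (1.0, math.cos(x))  # the vector (1, cos x) from the example
print(approx, exact)
```

The two printed vectors should agree up to the finite-difference error.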
We can generalize this picture from $\mathbb{R}^n$ to arbitrary manifolds. Suppose $X$ is a manifold, and $Z \subset X$ is a submanifold. The inclusion map
\[ \iota : Z \hookrightarrow X \]
is a smooth immersion (Lemma 4.15), so for any $z \in Z$ we have a linear injection:
\[ D\iota|_z : T_z Z \hookrightarrow T_z X \]
So we can always view $T_z Z$ as a subspace of $T_z X$. We can see this very explicitly in co-ordinates: we know we can choose a chart $(U, f)$ containing $z$ such that
\[ f(U \cap Z) = \tilde{U} \cap \mathbb{R}^m \]
for the standard subspace $\mathbb{R}^m \subset \mathbb{R}^n$. This fixes an isomorphism $\Delta_f : T_z X \xrightarrow{\ \sim\ } \mathbb{R}^n$, under which $T_z Z$ becomes the subspace $\mathbb{R}^m \subset \mathbb{R}^n$.
We saw in Proposition 4.11 that a good way to produce submanifolds is as the level sets of smooth functions.
Lemma 5.15. Let $F : X \to Y$ be a smooth function, let $y \in Y$ be a regular value of $F$, and let $Z = F^{-1}(y)$ be the corresponding submanifold of $X$. For any $z \in Z$, the tangent space $T_z Z$ is the kernel of the linear map:
\[ DF|_z : T_z X \to T_y Y \]
Proof. Let the dimensions of $X$ and $Y$ be $n$ and $k$. If we look at the proof of Proposition 4.11 again, we can see that what it actually proves is that it's possible to find a chart $(U, f)$ on $X$ containing $z$, and a chart $(V, g)$ on $Y$ containing $y$, such that the function
\[ \tilde{F} = g \circ F \circ f^{-1} : \tilde{U} \to \tilde{V} \]
is simply the restriction of the linear projection map:
\[ \pi : \mathbb{R}^n \to \mathbb{R}^k \]
In particular, $f(Z \cap U)$ is the intersection of $\tilde{U}$ with the subspace $\mathbb{R}^{n-k} \subset \mathbb{R}^n$. Since $\pi$ is linear, if we look at $DF|_z$ in these charts we get
\[ D\tilde{F}|_{f(z)} = \pi : \mathbb{R}^n \to \mathbb{R}^k \]
and the kernel of this is the subspace $\mathbb{R}^{n-k}$, which is exactly the tangent space $T_z Z$ (in this chart).
Example 5.16. Consider the submanifold $S^n \subset \mathbb{R}^{n+1}$. This is the level set of the function
\[ h : (x_0, \ldots, x_n) \mapsto x_0^2 + \cdots + x_n^2 \]
at the regular value $h = 1$, as we saw in Example 3.14. At a point $x = (x_0, \ldots, x_n) \in S^n$, the derivative of $h$ is the 1-by-$(n+1)$ matrix:
\[ (2x_0, \ldots, 2x_n) : \mathbb{R}^{n+1} \to \mathbb{R} \]
So by Lemma 5.15 the tangent space $T_x S^n$ is the subspace
\[ T_x S^n = \{ v \;;\; x \cdot v = 0 \} \subset \mathbb{R}^{n+1} \]
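Example 5.17. If we specialize the previous example to $S^1 \subset \mathbb{R}^2$, we see that the tangent space to a point $(x, y) \in S^1$ is the subspace:
\[ \left\langle \begin{pmatrix} -y \\ x \end{pmatrix} \right\rangle \subset \mathbb{R}^2 \]
Now let's derive this again using polar co-ordinates
\[ f^{-1} : \tilde{U} = \mathbb{R}_{>0} \times (-\pi, \pi) \to U = \mathbb{R}^2 \setminus \{(x, 0),\ x \le 0\}, \qquad (r, \theta) \mapsto (r\cos\theta, r\sin\theta) \]
as in Example 3.3. In this chart, $S^1$ becomes the subspace $\{r = 1\} \subset \tilde{U}$, so if we pick a point $(1, \theta) \in f(S^1)$ then the tangent space is:
\[ T_{(1,\theta)} f(S^1) = \left\langle \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\rangle \subset \mathbb{R}^2 \]
The derivative of $f^{-1}$ at this point is
\[ D(f^{-1})|_{(1,\theta)} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \]
so the tangent space to the point $(\cos\theta, \sin\theta) \in S^1$ is the line spanned by the vector $(-\sin\theta, \cos\theta)^\top \in \mathbb{R}^2$.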
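The two descriptions of the tangent line to $S^1$ can be compared numerically. This is a small sketch under stated assumptions: the angle `theta = 1.1` is an arbitrary sample, and `jacobian_polar_inverse` and `apply` are hypothetical helper names. It pushes the chart vector $(0, 1)^\top$ forward by the Jacobian of $f^{-1}$ and checks that the result equals $(-y, x)^\top$ and lies in the kernel of $Dh = (2x, 2y)$ from Example 5.16.

```python
import math

def jacobian_polar_inverse(r, theta):
    """Jacobian matrix of (r, theta) -> (r cos theta, r sin theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def apply(matrix, vec):
    """Matrix-vector product for small dense matrices."""
    return tuple(sum(a * b for a, b in zip(row, vec)) for row in matrix)

theta = 1.1  # arbitrary sample angle in (-pi, pi)
point = (math.cos(theta), math.sin(theta))

# Push the chart tangent vector (0, 1)^T forward to R^2.
pushed = apply(jacobian_polar_inverse(1.0, theta), (0.0, 1.0))

# Compare with (-y, x)^T, and evaluate Dh = (2x, 2y) on the pushed vector.
expected = (-point[1], point[0])
dh_of_pushed = 2 * point[0] * pushed[0] + 2 * point[1] * pushed[1]
print(pushed, expected, dh_of_pushed)
```

The last printed number should be zero up to rounding, which is the statement that the pushed-forward vector is tangent to the level set.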

5.3 A second definition of tangent vectors

We're now going to discuss a second way to define tangent vectors, and later on we'll introduce a third definition. These other definitions are precisely equivalent to the definition we've already introduced, but each one has its own advantages and disadvantages.
Fix a point $x$ in a manifold $X$, and fix a tangent vector $[\sigma] \in T_x X$, the equivalence class of some curve $\sigma$ through $x$. If we now choose a co-ordinate chart $(U, f)$ containing $x$, we can turn $[\sigma]$ into an ordinary column vector $\Delta_f(\sigma) \in \mathbb{R}^n$. Furthermore, if we change co-ordinates between $(U_1, f_1)$ and $(U_2, f_2)$, we know that the transformation law
\[ \Delta_{f_2}(\sigma) = D\phi_{21}|_{f_1(x)}\big( \Delta_{f_1}(\sigma) \big) \tag{5.18} \]
holds (this was equation (5.10)). If we wish, we can take these properties to be the definition of a tangent vector.
Definition 5.19 (Physicist's definition). Fix a point $x$ in a manifold $X$. Let $A_x$ denote the set of all co-ordinate charts on $X$ that contain the point $x$. A tangent vector to $x$ is a function
\[ \delta : A_x \to \mathbb{R}^n \]
which we write
\[ \delta : (U, f) \mapsto \delta_f \]
and which has the following property: for any two charts $(U_1, f_1)$ and $(U_2, f_2)$ in $A_x$, the equation
\[ \delta_{f_2} = D\phi_{21}|_{f_1(x)}(\delta_{f_1}) \tag{5.20} \]
holds.
Let's (temporarily) use the notation $T_x^{\mathrm{phys}} X$ to denote the set of tangent vectors in the sense of Definition 5.19, although as we shall see in a moment the two definitions are equivalent and $T_x^{\mathrm{phys}} X$ is the same thing as $T_x X$.
Lemma 5.21. For any chart $(U, f) \in A_x$, the function 'evaluate in $(U, f)$'
\[ \mathrm{ev}_f : T_x^{\mathrm{phys}} X \to \mathbb{R}^n, \qquad \delta \mapsto \delta_f \]
is a bijection.

Proof. Clearly if two tangent vectors look the same in $(U, f)$ then they must look the same in all charts by the rule (5.20), so $\mathrm{ev}_f$ is an injection. Now pick any vector $v \in \mathbb{R}^n$. Let $(U_0, f_0) = (U, f)$, and define a function $\delta : A_x \to \mathbb{R}^n$ by:
\[ \delta : (U_i, f_i) \mapsto \delta_{f_i} = D\phi_{i0}|_{f_0(x)}(v) \]
If we pick any two charts $(U_1, f_1), (U_2, f_2) \in A_x$, the transition functions obey the equation
\[ \phi_{20} = \phi_{21} \circ \phi_{10} \]
(this makes sense in some neighbourhood of $f_0(x)$), so by the chain rule:
\[ \delta_{f_2} = D\phi_{20}|_{f_0(x)}(v) = D\phi_{21}|_{f_1(x)}\big( D\phi_{10}|_{f_0(x)}(v) \big) = D\phi_{21}|_{f_1(x)}(\delta_{f_1}) \]
So our function $\delta$ obeys the rule (5.20), hence it's an element of $T_x^{\mathrm{phys}} X$. Since $\mathrm{ev}_f(\delta) = v$ this shows that $\mathrm{ev}_f$ is a surjection.
If we have two different charts in $A_x$ then the bijections $\mathrm{ev}_{f_1}$ and $\mathrm{ev}_{f_2}$ are related by:
\[ \mathrm{ev}_{f_2} = D\phi_{21}|_{f_1(x)} \circ \mathrm{ev}_{f_1} \tag{5.22} \]
The vector space structure on $T_x^{\mathrm{phys}} X$ is much more obvious than the one on $T_x X$. For two elements $\delta, \delta' \in T_x^{\mathrm{phys}} X$, we can define $\delta + \delta'$ to be the function
\[ \delta + \delta' : (U, f) \mapsto \delta_f + \delta'_f \in \mathbb{R}^n \]
and this still obeys the rule (5.20), because $D\phi_{21}|_{f_1(x)}$ is linear, so it is an element of $T_x^{\mathrm{phys}} X$. Scalar multiplication is similar, and it follows immediately that the function $\mathrm{ev}_f$ is a linear isomorphism, for any chart $(U, f)$.
Now we can prove that our two definitions of tangent vectors, Definition 5.5 and Definition 5.19, are equivalent.
Proposition 5.23. There is a canonical linear isomorphism between $T_x X$ and $T_x^{\mathrm{phys}} X$.
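Proof. If we have a tangent vector $[\sigma] \in T_x X$ in the geometer's sense, then we can get a tangent vector $\delta_\sigma \in T_x^{\mathrm{phys}} X$ in the physicist's sense by considering the function:
\[ \delta_\sigma : A_x \to \mathbb{R}^n, \qquad (U, f) \mapsto \Delta_f(\sigma) \]
This obeys the rule (5.20) because of the transformation law (5.18). So there is a natural function:
\[ T_x X \to T_x^{\mathrm{phys}} X \]
If we fix any chart $(U, f) \in A_x$, this function factors as the composition
\[ T_x X \xrightarrow{\ \Delta_f\ } \mathbb{R}^n \xrightarrow{\ \mathrm{ev}_f^{-1}\ } T_x^{\mathrm{phys}} X \]
and since both factors are linear isomorphisms, so is their composition.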

5.4 A third definition of tangent vectors

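Our third definition of tangent vectors is perhaps the most difficult, but in practice it is the most useful. It takes the point of view that a tangent vector is a direction in which we can take a partial derivative of a function. Again we start by explaining how this works when our manifold is just an open subset of $\mathbb{R}^n$.
Let $\tilde{U} \subset \mathbb{R}^n$ be an open set. We let $C^\infty(\tilde{U})$ denote the set of all smooth functions from $\tilde{U}$ to $\mathbb{R}$; this is an infinite-dimensional vector space under point-wise addition and scalar multiplication of functions.
Now fix a point $\tilde{x} \in \tilde{U}$, and choose a vector $v \in \mathbb{R}^n$. We have an operation 'take the partial derivative at $\tilde{x}$ in the direction $v$'; this is an operator
\[ \partial_{\tilde{x},v} : C^\infty(\tilde{U}) \to \mathbb{R} \]
that sends a smooth function $h \in C^\infty(\tilde{U})$ to the real number:
\[ \partial_{\tilde{x},v}(h) = Dh|_{\tilde{x}}(v) \in \mathbb{R} \]
This operation $\partial_{\tilde{x},v}$ is a linear map, i.e. we have:
\[ \partial_{\tilde{x},v}(h_1 + h_2) = \partial_{\tilde{x},v}(h_1) + \partial_{\tilde{x},v}(h_2) \qquad\text{and}\qquad \partial_{\tilde{x},v}(\lambda h) = \lambda\, \partial_{\tilde{x},v}(h) \]
It also obeys the product rule
\[ \partial_{\tilde{x},v}(h_1 h_2) = h_1(\tilde{x})\, \partial_{\tilde{x},v}(h_2) + h_2(\tilde{x})\, \partial_{\tilde{x},v}(h_1) \]
for any two functions $h_1, h_2 \in C^\infty(\tilde{U})$.
Definition 5.24. Fix an open set $\tilde{U} \subset \mathbb{R}^n$ and a point $\tilde{x} \in \tilde{U}$. A derivation at $\tilde{x}$ is a linear map
\[ d : C^\infty(\tilde{U}) \to \mathbb{R} \]
that obeys the product rule
\[ d(h_1 h_2) = h_1(\tilde{x})\, d(h_2) + h_2(\tilde{x})\, d(h_1) \tag{5.25} \]
for any two functions $h_1, h_2 \in C^\infty(\tilde{U})$.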
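The product rule for the directional derivative operator can be illustrated numerically. The sketch below is only a sanity check under stated assumptions: the functions `h1` and `h2`, the point `x` and the direction `v` are arbitrary illustrative choices, and the derivative is approximated by a central difference, so the two sides agree only up to a small numerical error.

```python
import math

def directional_derivative(h, x, v, eps=1e-6):
    """Central-difference approximation to Dh|_x(v)."""
    plus = h([xi + eps * vi for xi, vi in zip(x, v)])
    minus = h([xi - eps * vi for xi, vi in zip(x, v)])
    return (plus - minus) / (2 * eps)

# Two sample smooth functions on R^2 (hypothetical choices).
def h1(p):
    return p[0] ** 2 + math.sin(p[1])

def h2(p):
    return math.exp(p[0]) * p[1]

def product(p):
    return h1(p) * h2(p)

x = [0.3, -1.2]  # the fixed point
v = [2.0, 0.5]   # the direction

# Left- and right-hand sides of the product rule.
lhs = directional_derivative(product, x, v)
rhs = (h1(x) * directional_derivative(h2, x, v)
       + h2(x) * directional_derivative(h1, x, v))
print(lhs, rhs)
```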

We denote the set of all derivations at $\tilde{x}$ by:
\[ \mathrm{Der}_{\tilde{x}}(\tilde{U}) \]
It's easy to check that $\mathrm{Der}_{\tilde{x}}(\tilde{U})$ forms a subspace of the vector space of all linear maps from $C^\infty(\tilde{U})$ to $\mathbb{R}$, so $\mathrm{Der}_{\tilde{x}}(\tilde{U})$ is a vector space.
For any $v \in \mathbb{R}^n$, our partial differentiation operation $\partial_{\tilde{x},v}$ is an element of $\mathrm{Der}_{\tilde{x}}(\tilde{U})$. The next lemma says that in fact these are all the derivations at $\tilde{x}$.
Lemma 5.26. If $d$ is a derivation at $\tilde{x}$ then $d = \partial_{\tilde{x},v}$ for some $v \in \mathbb{R}^n$.
Proof. Pick any $d \in \mathrm{Der}_{\tilde{x}}(\tilde{U})$. Firstly we show that $d$ vanishes on constant functions. Let $1 \in C^\infty(\tilde{U})$ denote the constant function with the value $1 \in \mathbb{R}$. Let $h \in C^\infty(\tilde{U})$ be any function such that $h(\tilde{x}) \neq 0$; then the product rule implies that
\[ d(h) = d(h \cdot 1) = h(\tilde{x})\, d(1) + d(h) \]
and hence $d(1) = 0$. Since $d$ is linear it must send any constant function to zero.
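Now we choose any function $h \in C^\infty(\tilde{U})$ and apply Taylor's theorem. Let $\tilde{y}_1, \ldots, \tilde{y}_n$ be the co-ordinate functions on $\tilde{U}$, so for each $i \in [1, n]$ we have a function
\[ \tilde{y}_i - \tilde{x}_i \in C^\infty(\tilde{U}) \]
where $(\tilde{x}_1, \ldots, \tilde{x}_n)$ are the co-ordinates of our fixed point $\tilde{x}$. Also let $e_1, \ldots, e_n \in \mathbb{R}^n$ denote the standard basis. Then Taylor's theorem says that we can write
\[ h(\tilde{y}) = h(\tilde{x}) + \sum_{i=1}^n (\tilde{y}_i - \tilde{x}_i) \big( \partial_{\tilde{x},e_i}(h) + H_i(\tilde{y}) \big) \]
where $H_1, \ldots, H_n \in C^\infty(\tilde{U})$ are functions such that $H_i(\tilde{x}) = 0$ for each $i$. Applying $d$, and using the product rule together with the fact that $d$ vanishes on constants, yields:
\[ d(h) = \sum_{i=1}^n \partial_{\tilde{x},e_i}(h)\, d(\tilde{y}_i - \tilde{x}_i) \]
If we set $v_i$ to be the real number $d(\tilde{y}_i - \tilde{x}_i)$, then we see that $d$ is exactly the partial derivative operator $\partial_{\tilde{x},v}$ for the vector $v = (v_1, \ldots, v_n)$.
So we have a bijection
\[ \mathbb{R}^n \xrightarrow{\ \sim\ } \mathrm{Der}_{\tilde{x}}(\tilde{U}) \]
by sending $v \in \mathbb{R}^n$ to the derivation $\partial_{\tilde{x},v}$, and this is evidently linear so it's an isomorphism of vector spaces.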
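Now let's relate derivations at $\tilde{x}$ to curves through $\tilde{x}$. If we have a curve $\tilde{\sigma}$ through $\tilde{x}$, and a function $h \in C^\infty(\tilde{U})$, then $h \circ \tilde{\sigma}$ is a smooth function from some open interval $(-\epsilon, \epsilon) \subset \mathbb{R}$ to $\mathbb{R}$. We define an operator
\[ \partial_{\tilde{\sigma}} : C^\infty(\tilde{U}) \to \mathbb{R} \]
by taking the derivative along $\tilde{\sigma}$, i.e. for $h \in C^\infty(\tilde{U})$ we define
\[ \partial_{\tilde{\sigma}}(h) = \left. \frac{d(h \circ \tilde{\sigma})}{dt} \right|_0 \]
By the chain rule, this is nothing but the partial derivative of $h$ along the tangent vector $D\tilde{\sigma}|_0$ associated to $\tilde{\sigma}$. In particular, $\partial_{\tilde{\sigma}}$ only depends on the tangency class of $\tilde{\sigma}$, and it's a derivation at $\tilde{x}$.
So we have a commuting triangle of linear isomorphisms:
\[ \begin{array}{ccc} \dfrac{\{\text{curves through } \tilde{x}\}}{(\text{tangency at } \tilde{x})} & \xrightarrow{\ [\tilde{\sigma}] \mapsto \partial_{\tilde{\sigma}}\ } & \mathrm{Der}_{\tilde{x}}(\tilde{U}) \\ {\scriptstyle [\tilde{\sigma}] \mapsto D\tilde{\sigma}|_0} \searrow & & \nearrow {\scriptstyle v \mapsto \partial_{\tilde{x},v}} \\ & \mathbb{R}^n & \end{array} \]
Now let's generalize all of this to an arbitrary manifold.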
Definition 5.27. Let $X$ be a manifold and let $x$ be a point in $X$. Let $C^\infty(X)$ denote the vector space of all smooth functions from $X$ to $\mathbb{R}$. A derivation at $x$ is a linear map
\[ d : C^\infty(X) \to \mathbb{R} \]
obeying the product rule
\[ d(h_1 h_2) = h_1(x)\, d(h_2) + h_2(x)\, d(h_1) \]
for any two functions $h_1, h_2 \in C^\infty(X)$. We denote the set of all derivations at $x$ by $\mathrm{Der}_x(X)$.
Again it's easy to check that $\mathrm{Der}_x(X)$ is a vector space under the usual addition and scalar multiplication operations on linear maps.
If $\sigma$ is a curve through $x$, then we can define an operator:
\[ \partial_\sigma : C^\infty(X) \to \mathbb{R}, \qquad h \mapsto \left. \frac{d(h \circ \sigma)}{dt} \right|_0 \]
This is a derivation at $x$, by the ordinary product rule for differentiation.


Clearly this operator $\partial_\sigma$ is, in some sense, the partial derivative operator along the tangent vector $[\sigma]$. We can make this more precise if we work in co-ordinates.
Pick a chart $(U, f)$ containing $x$, and assume that $\sigma$ lies in $U$ by shrinking its domain if necessary (note that this will not affect the operator $\partial_\sigma$). Then in these co-ordinates $\sigma$ becomes a curve $\tilde{\sigma} = f \circ \sigma$ through the point $\tilde{x} = f(x)$. Any smooth function $h \in C^\infty(X)$ restricts to give a smooth function $\tilde{h} = h \circ f^{-1} \in C^\infty(\tilde{U})$, and we have
\[ \partial_\sigma(h) = \partial_{\tilde{\sigma}}(\tilde{h}) \]
since $h \circ \sigma = \tilde{h} \circ \tilde{\sigma}$. So the operator $\partial_\sigma$ can be described as 'restrict to the chart $(U, f)$, then take the partial derivative along the tangent vector $\Delta_f(\sigma) = D\tilde{\sigma}|_0 \in \mathbb{R}^n$' (note that we don't need to check if this is co-ordinate independent, because we defined $\partial_\sigma$ in a way that didn't need co-ordinates).
It follows from this description that $\partial_\sigma$ only depends on the vector $D\tilde{\sigma}|_0$, so it only depends on the equivalence class of $\sigma$ in the tangent space $T_x X$. So we have a well-defined function:
\[ T_x X \to \mathrm{Der}_x(X), \qquad [\sigma] \mapsto \partial_\sigma \]
Looking at it in our chart $(U, f)$, it's clear that this function is a linear injection.
Proposition 5.28. The function $[\sigma] \mapsto \partial_\sigma$ is a linear isomorphism between $T_x X$ and $\mathrm{Der}_x(X)$.
The hard part of this proposition is proving surjectivity, i.e. that every derivation at $x$ arises in this way. We'll need to prove a couple of lemmas first, and introduce some useful gadgets called bump functions.
You may recall the remarkable function:
\[ \alpha : \mathbb{R} \to \mathbb{R}, \qquad x \mapsto \begin{cases} e^{-1/x}, & \text{for } x > 0 \\ 0, & \text{for } x \le 0 \end{cases} \]
This function is smooth, i.e. it can be differentiated to arbitrary order, because all derivatives of $e^{-1/x}$ tend to zero as $x$ tends to zero from above. By messing around with $\alpha$ we can create some other nice functions, for example:
\[ \beta(x) = \frac{\alpha(x)}{\alpha(x) + \alpha(1 - x)} \]

Figure 9: The function $\beta$.

This function is smooth, identically equal to zero for $x \le 0$, and identically equal to 1 for $x \ge 1$ (see Figure 9).
With some further modifications, it should be clear that we can create a smooth function of $n$ variables
\[ \tilde{\psi} : \mathbb{R}^n \to \mathbb{R} \]
which is constantly equal to $a$ inside the ball $B(0, r)$ and constantly equal to $b$ outside the ball $B(0, r')$, for any constants $a, b \in \mathbb{R}$ and any $r' > r > 0$. This is called a bump function.
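One way to carry out the 'further modifications' is sketched below. This is an illustrative construction under stated assumptions, not necessarily the one intended in the notes: the names `alpha`, `beta` and `bump` are hypothetical, and the bump is built by feeding a rescaled squared radius into the smooth step function, which keeps the composite smooth at the origin.

```python
import math

def alpha(x):
    """exp(-1/x) for x > 0, and 0 for x <= 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

def beta(x):
    """Smooth step: identically 0 for x <= 0 and identically 1 for x >= 1."""
    return alpha(x) / (alpha(x) + alpha(1.0 - x))

def bump(point, r, r_prime, a, b):
    """Equal to a on B(0, r) and to b outside B(0, r_prime), for r_prime > r > 0."""
    radius_sq = sum(c * c for c in point)
    # s is 0 when |point| <= r and 1 when |point| >= r_prime.
    s = beta((radius_sq - r ** 2) / (r_prime ** 2 - r ** 2))
    return a + (b - a) * s

# Sample values: inside the small ball, in between, and outside the large ball.
print(bump((0.1, 0.0), 1.0, 2.0, 5.0, -3.0))
print(bump((1.5, 0.0), 1.0, 2.0, 5.0, -3.0))
print(bump((3.0, 0.0), 1.0, 2.0, 5.0, -3.0))
```

The denominator in `beta` never vanishes, since at least one of `x` and `1 - x` is positive for every real `x`.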
We can also create bump functions on arbitrary manifolds. Let $X$ be a manifold, and let $x$ be a point in $X$. Pick a chart $(U, f)$ containing $x$, and for simplicity assume that $f(x) = 0 \in \mathbb{R}^n$. Let $\tilde{\psi}$ be a bump function on $\mathbb{R}^n$ as above, chosen such that the closure of the larger ball $B(0, r')$ is contained in $\tilde{U}$. Then we can define a smooth[1] function on the whole of $X$ by
\[ \psi(y) = \begin{cases} (\tilde{\psi} \circ f)(y), & \text{for } y \in U \\ b, & \text{for } y \notin U \end{cases} \]
[1] Actually, it is not obvious that this function is smooth. In fact it isn't even true in general! We need a technical condition on our manifold: we must assume that $X$ is Hausdorff. This is explained in Appendix B; from now on we will assume our manifolds are Hausdorff without further comment.

So $\psi$ is a function in $C^\infty(X)$ which has constant value $a$ in an open neighbourhood $W = f^{-1}(B(0, r))$ of $x$, and constant value $b$ outside a larger open neighbourhood $W' = f^{-1}(B(0, r'))$.
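Now we can prove the two lemmas that we need for Proposition 5.28.
Lemma 5.29. Let $d$ be a derivation at $x$. If $h \in C^\infty(X)$ is identically zero on some open neighbourhood of $x$ then $d(h) = 0$.
Proof. Suppose that $h$ is identically zero on a neighbourhood $U$ of $x$. Let $\psi \in C^\infty(X)$ be a bump function such that we have neighbourhoods
\[ x \in W \subset W' \subset U \]
with $\psi|_W \equiv 0$ and $\psi|_{X \setminus W'} \equiv 1$. Then $\psi h = h$, so $d(h) = d(\psi h) = \psi(x)\, d(h) + h(x)\, d(\psi) = 0$ by the product rule, since $\psi(x) = 0$ and $h(x) = 0$.
The subset
\[ \{ h \in C^\infty(X) \;;\; \exists \text{ an open neighbourhood } U \ni x \text{ with } h|_U \equiv 0 \} \tag{5.30} \]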

is a subspace of $C^\infty(X)$, and quotienting by this subspace produces a vector space which we'll denote
\[ G^\infty_x(X) \]
Elements of $G^\infty_x(X)$ are called germs of smooth functions at $x$. A germ is an equivalence class of functions, where we declare that two functions are equivalent if they agree in some open neighbourhood of $x$.
By Lemma 5.29, any derivation $d \in \mathrm{Der}_x(X)$ gives a well-defined linear map:
\[ d : G^\infty_x(X) \to \mathbb{R} \]
Notice that multiplication of functions is still well-defined in $G^\infty_x(X)$, because if $h_1$ vanishes in a neighbourhood of $x$, and $h_2$ is any function, then $h_1 h_2$ vanishes in a neighbourhood of $x$ (i.e. the subspace (5.30) is an ideal in $C^\infty(X)$). So we can identify $\mathrm{Der}_x(X)$ with the space of linear maps from $G^\infty_x(X)$ to $\mathbb{R}$ that obey the product rule.
Lemma 5.31. If $U \subset X$ is an open neighbourhood of $x$ then we have a linear isomorphism
\[ G^\infty_x(X) \xrightarrow{\ \sim\ } G^\infty_x(U) \]
given by sending $[h]$ to $[h|_U]$, for any $h \in C^\infty(X)$.

Proof. Firstly note that if $h \in C^\infty(X)$ vanishes in an open neighbourhood of $x$ then so does the function $h|_U \in C^\infty(U)$, so this map is well defined.
Now let's find an inverse map. Take a bump function $\psi \in C^\infty(X)$ which is constant with value 1 on some open neighbourhood $W$ of $x$, and vanishes outside some larger open neighbourhood $W' \subset U$. If $g \in C^\infty(U)$, then we can extend $\psi g$ to a function on the whole of $X$ by defining:
\[ \hat{g} = \begin{cases} \psi g & \text{inside } U \\ 0 & \text{outside } U \end{cases} \]
Note that $\hat{g}$ is a smooth function, so we have a linear map:
\[ C^\infty(U) \to C^\infty(X), \qquad g \mapsto \hat{g} \]
Furthermore if $g$ vanishes on some neighbourhood of $x$ then so does $\hat{g}$, so we get an induced map:
\[ G^\infty_x(U) \to G^\infty_x(X) \]
If $h$ is a function on $X$ then $\widehat{(h|_U)}$ agrees with $h$ on the neighbourhood $W$. Also if $g$ is a function on $U$ then $(\hat{g})|_U$ agrees with $g$ on the neighbourhood $W$. This proves that the function $[g] \mapsto [\hat{g}]$ is the inverse to the function $[h] \mapsto [h|_U]$.
The isomorphism in Lemma 5.31 respects multiplication of functions, so linear maps from $G^\infty_x(X)$ to $\mathbb{R}$ that obey the product rule are the same thing as linear maps from $G^\infty_x(U)$ to $\mathbb{R}$ that obey the product rule. Using Lemma 5.29, this implies that the map
\[ \mathrm{Der}_x(U) \to \mathrm{Der}_x(X) \]
(induced by the map $h \mapsto h|_U$) is an isomorphism.
Finally we can prove that $T_x X$ and $\mathrm{Der}_x(X)$ are the same thing.
Proof of Proposition 5.28. Recall that we have a linear map
\[ T_x X \to \mathrm{Der}_x(X), \qquad [\sigma] \mapsto \partial_\sigma \]
sending a tangent vector to the partial derivative operator along that vector. Pick a chart $(U, f)$ containing $x$. We know that $\mathrm{Der}_x(X)$ is isomorphic to $\mathrm{Der}_x(U)$, and using our co-ordinates $f$ this is isomorphic to $\mathrm{Der}_{\tilde{x}}(\tilde{U})$. By Lemma 5.26, the only derivations on $\tilde{U}$ at $\tilde{x}$ are the partial derivative operators $\partial_{\tilde{x},v}$, and each of these comes from the tangent vector $\Delta_f^{-1}(v) \in T_x X$. So the map $T_x X \to \mathrm{Der}_x(X)$ is an isomorphism.

In light of Proposition 5.28, we can give our third definition:
Definition 5.32 (Algebraist's definition). Let $x$ be a point in a manifold $X$. A tangent vector to $x$ is a derivation at $x$.
This definition of tangent vectors makes it easy to define the derivative of a smooth function. Suppose $F : X \to Y$ is a smooth function between two manifolds, sending $x \in X$ to $y \in Y$. If we have a derivation $d \in \mathrm{Der}_x(X)$, we can define an operator
\[ DF|_x(d) : C^\infty(Y) \to \mathbb{R} \]
by:
\[ DF|_x(d) : h \mapsto d(h \circ F) \]
It's easy to check that this is a derivation at $y$, so we have a map
\[ DF|_x : \mathrm{Der}_x(X) \to \mathrm{Der}_y(Y), \qquad d \mapsto DF|_x(d) \]
and this is obviously linear. However, we need to check that this agrees with our earlier definition of the derivative.
Lemma 5.33. Let $F : X \to Y$ be a smooth function between two manifolds, sending $x \in X$ to $y \in Y$. Then the linear map $DF|_x : \mathrm{Der}_x(X) \to \mathrm{Der}_y(Y)$ agrees with the linear map $DF|_x : T_x X \to T_y Y$, under the canonical isomorphisms $T_x X \cong \mathrm{Der}_x(X)$ and $T_y Y \cong \mathrm{Der}_y(Y)$.
Proof. Pick a curve $\sigma$ through $x$, so we have a tangent vector $[\sigma] \in T_x X$ and an associated derivation $\partial_\sigma \in \mathrm{Der}_x(X)$. Using our previous definition, $DF|_x$ sends $[\sigma]$ to the class of the curve $F \circ \sigma$ in $T_y Y$. Under our new definition, $DF|_x$ sends the derivation $\partial_\sigma$ to the operator
\[ DF|_x(\partial_\sigma) : C^\infty(Y) \to \mathbb{R}, \qquad h \mapsto \left. \frac{d(h \circ F \circ \sigma)}{dt} \right|_0 \]
but this is precisely the derivation $\partial_{F \circ \sigma}$ associated to the curve $F \circ \sigma$.
It's worthwhile seeing what these definitions mean for a smooth function:
\[ F = (F_1, \ldots, F_k) : \mathbb{R}^n \to \mathbb{R}^k \]
Given Lemma 5.33, we know that our definition of $DF|_x$ must reduce to our ordinary definition of the derivative in terms of the Jacobian matrix, but it's still nice to see explicitly how this happens.

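Pick a point $x \in \mathbb{R}^n$ and let $y = F(x)$. Under these definitions, a tangent vector to a point $x \in \mathbb{R}^n$ is a partial derivative operator
\[ \partial_{x,v} : h \mapsto \sum_{i=1}^n v_i \left. \frac{\partial h}{\partial x_i} \right|_x \]
for some vector $v = (v_1, \ldots, v_n) \in \mathbb{R}^n$. Then $DF|_x(\partial_{x,v})$ is the operator on $C^\infty(\mathbb{R}^k)$ given by
\[ DF|_x(\partial_{x,v}) : g \mapsto \partial_{x,v}(g \circ F) = \sum_{i=1}^n \sum_{j=1}^k v_i \left. \frac{\partial F_j}{\partial x_i} \right|_x \left. \frac{\partial g}{\partial y_j} \right|_y \]
using the chain rule. So $DF|_x(\partial_{x,v})$ is the partial derivative operator $\partial_{y,u}$ where $u$ is the vector we get by multiplying $v$ by the Jacobian matrix.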
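The identity $DF|_x(\partial_{x,v}) = \partial_{y,u}$ can be checked numerically on a concrete map. The following is a rough sketch under stated assumptions: the map `F`, its hand-computed Jacobian `jacobian_F`, the test function `g`, and the sample point and direction are all hypothetical choices, and derivatives are approximated by central differences.

```python
import math

def directional_derivative(func, point, direction, eps=1e-6):
    """Central-difference approximation to the derivative of func at point along direction."""
    plus = func([p + eps * d for p, d in zip(point, direction)])
    minus = func([p - eps * d for p, d in zip(point, direction)])
    return (plus - minus) / (2 * eps)

def F(p):
    """A sample smooth map R^2 -> R^2."""
    return [p[0] * p[1], p[0] + p[1] ** 2]

def jacobian_F(p):
    """Jacobian matrix of F, computed by hand."""
    return [[p[1], p[0]],
            [1.0, 2 * p[1]]]

def g(q):
    """A sample smooth function on the target R^2."""
    return math.sin(q[0]) + q[0] * q[1]

x = [0.5, -0.4]
v = [1.0, 2.0]
y = F(x)
# u is the Jacobian matrix of F at x applied to v.
u = [sum(a * b for a, b in zip(row, v)) for row in jacobian_F(x)]

lhs = directional_derivative(lambda p: g(F(p)), x, v)  # d_{x,v}(g o F)
rhs = directional_derivative(g, y, u)                  # d_{y,u}(g)
print(lhs, rhs)
```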

6 Vector fields

6.1 The tangent bundle

We have shown (three times over!) that to any point $x$ in a manifold $X$ we can attach a vector space $T_x X$. If we assemble all of these vector spaces together then we get a set
\[ TX = \bigcup_{x \in X} T_x X \]
called the tangent bundle to $X$. By definition, an element of $TX$ is a pair $(x, v)$ with $x \in X$ and $v$ a vector in $T_x X$, so we have a projection function
\[ \pi : TX \to X \]
sending $(x, v)$ to $x$.
In fact $TX$ is not just a set, it's actually a manifold, and its dimension is twice the dimension of $X$. Let's see why this is.
As usual we start with the simplest kind of manifold, an open set $\tilde{U} \subset \mathbb{R}^n$. Recall that for any point $\tilde{x} \in \tilde{U}$ we have a canonical identification
\[ T_{\tilde{x}} \tilde{U} \cong \mathbb{R}^n \]
so the tangent bundle to $\tilde{U}$ is just:
\[ T\tilde{U} = \tilde{U} \times \mathbb{R}^n \]

This is an open set inside $\mathbb{R}^{2n}$, so $T\tilde{U}$ is a manifold of dimension $2n$.
Now let $X$ be any $n$-dimensional manifold. If we pick a chart $(U, f)$ on $X$, then for any point $x \in U$ we have a linear isomorphism:
\[ \Delta_f : T_x X \xrightarrow{\ \sim\ } \mathbb{R}^n \]
Associated to our open set $U \subset X$ there is a subset of the tangent bundle
\[ TU = \pi^{-1}(U) = \bigcup_{x \in U} T_x X \subset TX \]
and our co-ordinates $f$ give us a bijection:
\[ F : TU \xrightarrow{\ \sim\ } \tilde{U} \times \mathbb{R}^n, \qquad (x, v) \mapsto \big( f(x), \Delta_f(v) \big) \]
These pairs $(TU, F)$ are going to be our co-ordinate charts on $TX$, but first we need to define a topology on $TX$.
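Suppose now we pick two charts $(U_1, f_1)$ and $(U_2, f_2)$, and let $W = U_1 \cap U_2$ denote their intersection. If $x$ is a point in $W$ then we have two isomorphisms
\[ \Delta_{f_1} : T_x X \xrightarrow{\ \sim\ } \mathbb{R}^n \qquad\text{and}\qquad \Delta_{f_2} : T_x X \xrightarrow{\ \sim\ } \mathbb{R}^n \]
and the two are related by the derivative of the transition function $\phi_{21}$ at $f_1(x)$:
\[ \Delta_{f_2} = D\phi_{21}|_{f_1(x)} \circ \Delta_{f_1} \]
So the first chart gives us a bijection
\[ F_1 : TW \xrightarrow{\ \sim\ } f_1(W) \times \mathbb{R}^n \]
and the second chart gives us a bijection
\[ F_2 : TW \xrightarrow{\ \sim\ } f_2(W) \times \mathbb{R}^n \]
and these two are related by the bijection
\[ \Phi_{21} : f_1(W) \times \mathbb{R}^n \xrightarrow{\ \sim\ } f_2(W) \times \mathbb{R}^n, \qquad (\tilde{x}, v) \mapsto \big( \phi_{21}(\tilde{x}),\, D\phi_{21}|_{\tilde{x}}(v) \big) \]
This function $\Phi_{21}$ is smooth, since the entries in the Jacobian matrix $D\phi_{21}|_{\tilde{x}}$ depend smoothly on $\tilde{x}$. The inverse function is just $\Phi_{12}$, so this is also smooth. In particular both $\Phi_{21}$ and its inverse are continuous, so it's a homeomorphism.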
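To make the formula for the transition function on the tangent bundle concrete, here is a toy one-dimensional sketch. Everything in it is an assumption made for illustration: the transition function is taken to be `exp` (with inverse `log`), and the names `phi21`, `phi12`, `Phi21`, `Phi12` simply mirror the notation above. The check is that applying the map for one direction and then the other returns the original pair (point, vector).

```python
import math

def phi21(x):
    """Hypothetical transition function from chart 1 to chart 2."""
    return math.exp(x)

def phi12(y):
    """Its inverse, the transition function from chart 2 to chart 1."""
    return math.log(y)

def Phi21(x, v):
    """(x, v) -> (phi21(x), D phi21|_x (v)); here D phi21|_x is multiplication by exp(x)."""
    return (phi21(x), math.exp(x) * v)

def Phi12(y, w):
    """(y, w) -> (phi12(y), D phi12|_y (w)); here D phi12|_y is multiplication by 1/y."""
    return (phi12(y), w / y)

x, v = 0.3, -2.0
back = Phi12(*Phi21(x, v))  # should recover (x, v) up to rounding
print(back)
```

The second component round-trips because the two derivatives are inverse to each other, which is the chain rule applied to the composition of `phi12` with `phi21`.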

Now we can define the topology on $TX$. If we have any subset $V \subset TX$, and we pick a chart $(U, f)$ on $X$, we can look at the subset:
\[ F(V \cap TU) \subset \tilde{U} \times \mathbb{R}^n \]
We declare that $V$ is open in $TX$ iff for any choice of chart $(U, f)$ the subset $F(V \cap TU)$ is open in $\tilde{U} \times \mathbb{R}^n$. Because the functions $\Phi_{21}$ are homeomorphisms this definition of openness is actually co-ordinate independent, meaning that if we fix a single chart $(U_0, f_0)$ then the open subsets of $TU_0$ correspond exactly to the open subsets of $\tilde{U}_0 \times \mathbb{R}^n$. It's easy to check that this defines a topology on $TX$.
With this topology, each pair $(TU, F)$ is a co-ordinate chart, and all of these charts together cover the whole of $TX$, which proves that $TX$ is a topological manifold of dimension $2n$. Furthermore, the transition functions $\Phi_{21}$ are all smooth, so this is a smooth atlas and it specifies a smooth structure on $TX$. So as promised, we have shown that the tangent bundle is a manifold, of twice the dimension of $X$.
The atlas we've just described has uncountably many charts, one for every chart on $X$. However we can easily find a smaller atlas for $TX$: just pick any atlas $\{(U_i, f_i)\}$ for $X$, then the corresponding set of charts $\{(TU_i, F_i)\}$ will be an atlas for $TX$.
Example 6.1. Let $X$ be the manifold $T^1 = \mathbb{R}^1/\mathbb{Z}$ from Example 2.10. We claim that the tangent bundle to $T^1$ is the infinite cylinder:
\[ T(T^1) \cong T^1 \times \mathbb{R} \]
This is clearly a 2-dimensional manifold. We'll prove this claim carefully later on using some general theory, but it's also quite easy to see using an atlas.
Recall that we have an atlas for $T^1$ with two charts,
\[ f_1 : U_1 = T^1 \setminus [0] \xrightarrow{\ \sim\ } \tilde{U}_1 = (0, 1) \]
and
\[ f_2 : U_2 = T^1 \setminus [\tfrac{1}{2}] \xrightarrow{\ \sim\ } \tilde{U}_2 = (-\tfrac{1}{2}, \tfrac{1}{2}) \]
both of which simply lift an equivalence class to its representative in the given interval. The transition function between these two charts is:
\[ \phi_{21} : (0, \tfrac{1}{2}) \sqcup (\tfrac{1}{2}, 1) \to (-\tfrac{1}{2}, 0) \sqcup (0, \tfrac{1}{2}), \qquad x \mapsto \begin{cases} x, & \text{for } x < \tfrac{1}{2} \\ x - 1, & \text{for } x > \tfrac{1}{2} \end{cases} \]
So we have an atlas for the tangent bundle $T(T^1)$ which has two charts
\[ T\tilde{U}_1 = (0, 1) \times \mathbb{R} \]
and:
\[ T\tilde{U}_2 = (-\tfrac{1}{2}, \tfrac{1}{2}) \times \mathbb{R} \]
The derivative of the transition function $\phi_{21}$ at any point is just the identity map from $\mathbb{R}$ to $\mathbb{R}$, so the transition function between our two charts on $T(T^1)$ is just:
\[ \Phi_{21} = (\phi_{21}, 1_{\mathbb{R}}) : \big( (0, \tfrac{1}{2}) \sqcup (\tfrac{1}{2}, 1) \big) \times \mathbb{R} \to \big( (-\tfrac{1}{2}, 0) \sqcup (0, \tfrac{1}{2}) \big) \times \mathbb{R} \]
A manifold which has this atlas must be $T^1 \times \mathbb{R}$.
This example is rather unusual: it's not normally true that the tangent bundle to $X$ is simply $X \times \mathbb{R}^n$. Manifolds like this are called parallelizable; we'll come back to this later on.
Also, this is unfortunately the only example for which it's easy to visualise the tangent bundle, because if $X$ has dimension 2 then $TX$ has dimension 4!
The projection function $\pi : TX \to X$ is automatically smooth, since if we choose a chart $(U, f)$ on $X$ and the corresponding chart $(TU, F)$ on $TX$ then the function $\pi$ in these charts becomes the projection map from $\tilde{U} \times \mathbb{R}^n$ to $\tilde{U}$. Since this is (the restriction of) a linear map, its derivative at any point is the same projection map again, which proves that $\pi$ has rank $n$ at every point and hence is a submersion.
The level sets of $\pi$ are the individual tangent spaces $T_x X$. Since $\pi$ has no critical points, these must all be $n$-dimensional submanifolds of $TX$. This fact is also obvious if we look in one of our charts $(TU, F)$.
There's another obvious $n$-dimensional submanifold of $TX$, given by the inclusion:
\[ \zeta : X \hookrightarrow TX, \qquad x \mapsto (x, 0) \]
This function is called the zero section of $TX$. Looking in a chart $(TU, F)$, it's obvious that $\zeta$ is a smooth immersion, and that the image of $\zeta$ is a submanifold which is diffeomorphic to $X$.

6.2 Vector fields and flows

A vector field on $X$ is a function which assigns to any point $x \in X$ a vector in the corresponding tangent space $T_x X$.

Figure 10: A vector field on $\mathbb{R}^2$.

If $X = \tilde{U}$ is just an open set in $\mathbb{R}^n$, then for any point $\tilde{x} \in \tilde{U}$ the tangent space to $\tilde{x}$ is just $\mathbb{R}^n$. This means that a vector field on $\tilde{U}$ is nothing more than a function:
\[ \tilde{\xi} : \tilde{U} \to \mathbb{R}^n \]
However, it's helpful to visualize vector fields differently from other such functions: you should imagine that at every point $\tilde{x} \in \tilde{U}$ the function attaches a vector $\tilde{\xi}|_{\tilde{x}}$ to the point $\tilde{x}$ (see Figure 10).
A formal definition that fits better with this intuition is to define a vector field to be a function
\[ \xi : \tilde{U} \to T\tilde{U} = \tilde{U} \times \mathbb{R}^n \]
such that $\pi \circ \xi = 1_{\tilde{U}}$. This just says that $\xi$ is a function of the form $(1_{\tilde{U}}, \tilde{\xi})$, so the data of $\xi$ and $\tilde{\xi}$ are exactly the same. However, this second definition generalizes to an arbitrary manifold.
Definition 6.2. A vector field on $X$ is a function
\[ \xi : X \to TX \]
such that $\pi \circ \xi = 1_X$.
So for each point $x \in X$, the function $\xi$ selects a vector $\xi|_x \in T_x X$.
Example 6.3. If $X = S^1$, then we saw in Example 5.17 that the tangent space $T_{(x,y)} S^1$ at a point $(x, y) \in S^1$ can be identified with the subspace of $\mathbb{R}^2$ spanned by the vector $(-y, x)^\top$. So we can define a vector field on $S^1$ by:
$$\xi : S^1 \to TS^1, \quad (x, y) \mapsto (-y, x)^\top \in \mathbb{R}^2$$
This sends any point in $S^1$ to the corresponding (anti-clockwise) unit angular vector (see Figure 11).

Figure 11: The vector field $\xi$ on $S^1$.
Of course, we are mostly interested in smooth vector fields. We've shown that $TX$ is a manifold, so we know what it means for $\xi$ to be a smooth function. Let's unpack this definition explicitly, since it turns out to be something fairly simple.
Let $\xi : X \to TX$ be a vector field. If we choose a chart $(U, f)$ on $X$, then we have a corresponding chart $(TU, F)$ on $TX$. Since $\xi|_x \in T_x X$ for any point $x \in X$, its restriction to $U$ gives a function:
$$\xi|_U : U \to TU$$
We can look at this in our co-ordinates, which gives a function:
$$F \circ \xi|_U \circ f^{-1} : \tilde{U} \to T\tilde{U} \cong \tilde{U} \times \mathbb{R}^n$$
This function is still a vector field (since the co-ordinates $F$ respect the projection maps), so we must have
$$F \circ \xi|_U \circ f^{-1} = (1_{\tilde{U}}, \tilde{\xi})$$
for some function $\tilde{\xi} : \tilde{U} \to \mathbb{R}^n$.
By definition, the function $\xi$ is smooth iff, for any choice of chart $(U, f)$, the corresponding function $F \circ \xi|_U \circ f^{-1}$ is smooth. This will be the case iff the function $\tilde{\xi}$ is smooth, so in any chart, a smooth vector field reduces to a smooth function:
$$\tilde{\xi} : \tilde{U} \to \mathbb{R}^n$$
If we want to check if a vector field is smooth we must pick an atlas for $X$, and compute the functions $\tilde{\xi}$ in every chart.
Example 6.4. Let's check that the vector field $\xi$ on $S^1$ from Example 6.3 is smooth. If we use polar co-ordinates on $\mathbb{R}^2$ then we get an induced chart on the submanifold $S^1 \subset \mathbb{R}^2$, as in Proposition 3.4. Explicitly, set $U = S^1 \setminus (-1, 0)$ and $\tilde{U} = (-\pi, \pi) \subset \mathbb{R}$, and:
$$f^{-1} : \theta \mapsto (\cos\theta, \sin\theta)$$
We computed in Example 5.17 that this chart identifies the tangent space $T_{(\cos\theta, \sin\theta)} S^1$ with $T_\theta \tilde{U} \cong \mathbb{R}$ via the linear isomorphism:
$$\mathbb{R} \to T_{(\cos\theta, \sin\theta)} S^1, \quad 1 \mapsto (-\sin\theta, \cos\theta)^\top$$
So in this chart, the vector field $\xi$ is just a constant function:
$$\tilde{\xi} \equiv 1 : \tilde{U} \to \mathbb{R}$$
This is certainly smooth, which proves that $\xi$ is smooth at every point in $S^1$ apart from $(-1, 0)$, and we can use another polar co-ordinate chart to check that $\xi$ is smooth at that point too.
From now on we'll assume that all our vector fields are smooth, unless we need to specifically state otherwise.
It's easy to find (smooth) vector fields on any manifold $X$. For example, pick any chart $(U, f)$ on $X$, and choose a bump function $\psi$ on $X$ which vanishes outside of some open set $W \subset U$. A vector field on the open set $\tilde{U}$ is the same thing as a smooth function $\tilde{\xi}$ from $\tilde{U}$ to $\mathbb{R}^n$, and if we choose one then we can extend it to a (smooth) vector field on $X$ by defining:
$$\xi = \begin{cases} \psi\,\tilde{\xi}, & \text{inside } U \\ 0, & \text{outside } U \end{cases}$$
(inside $U$ we use the chart to regard $\tilde{\xi}$ as a vector field on $U$). So the set of all vector fields on $X$ is very large indeed. Also notice that we can make this set into an (infinite-dimensional) vector space, using point-wise addition and scalar multiplication.

Vector fields are closely related to a nice geometric idea called a flow, as we'll now explain.
If $X$ and $Y$ are two manifolds then we've defined a diffeomorphism between $X$ and $Y$ to be a smooth function $F : X \to Y$ with a smooth inverse (Definition 4.17). In particular, it makes sense to talk about diffeomorphisms
$$F : X \to X$$
from a manifold to itself. These are the symmetries of a manifold.
Example 6.5. Let $X = T^1 = \mathbb{R}/\mathbb{Z}$. For any constant $s \in \mathbb{R}$, we can define a bijection
$$F_s : T^1 \to T^1$$
by:
$$F_s : [t] \mapsto [t + s]$$
It's easy to check that $F_s$ is smooth for any $s$ (we can use our atlas on $T^1$ from Example 2.10), and since the inverse of $F_s$ is $F_{-s}$ this shows that each $F_s$ is a diffeomorphism.
We checked in Example 4.19 that the function
$$H : T^1 \to S^1, \quad [t] \mapsto (\cos 2\pi t, \sin 2\pi t)$$
is a diffeomorphism. This implies that for any $s$ the function "rotate by $2\pi s$"
$$G_s = H \circ F_s \circ H^{-1} : S^1 \to S^1, \quad (\cos\theta, \sin\theta) \mapsto \big(\cos(\theta + 2\pi s), \sin(\theta + 2\pi s)\big)$$
is a diffeomorphism of $S^1$.
In the previous example, we didn't just write down one diffeomorphism, we wrote down a whole family of them, indexed by the parameter $s \in \mathbb{R}$, and in the middle we have the identity function $F_0 = 1_{T^1}$. Moreover, the diffeomorphism $F_s$ depends smoothly on $s$, in the following sense. Put all of them together to form the function:
$$F : \mathbb{R} \times T^1 \to T^1, \quad (s, [t]) \mapsto F_s([t])$$
The set $\mathbb{R} \times T^1$ is a 2-dimensional manifold (we wrote down an atlas in Example 6.1), and it's easy to check that this "total" function $F$ is smooth.
Let's abstract this example.

Definition 6.6. Let $X$ be a manifold. A 1-parameter family of diffeomorphisms, or flow, on $X$ is a smooth map
$$F : (-\epsilon, \epsilon) \times X \to X$$
for some positive real number $\epsilon$, such that for each $s \in (-\epsilon, \epsilon)$ the function
$$F_s : X \to X, \quad x \mapsto F(s, x)$$
is a diffeomorphism, and such that the map $F_0$ is the identity on $X$.
It should be obvious that if we take a smooth atlas for $X$ then we can produce a smooth atlas for $(-\epsilon, \epsilon) \times X$, making it into a manifold of dimension $\dim X + 1$. So it does make sense to ask for $F$ to be smooth.
Instead of fixing the parameter $s$, we can choose to fix a point $x \in X$. Then we get a smooth function:
$$F_x : (-\epsilon, \epsilon) \to X, \quad s \mapsto F(s, x)$$
Since $F_x(0) = F_0(x) = x$, this is a curve through $x$, so it determines a vector in the tangent space $T_x X$. If we do this at all points $x \in X$ simultaneously then we produce a vector field on $X$, which we'll call $\xi^F$. This vector field is the infinitesimal version of the flow $F$: it tells us the direction that every point will move in if we start to apply the flow.
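Let's look at this procedure in co-ordinates. If we pick a chart on $X$ with codomain $\tilde{U} \subset \mathbb{R}^n$ then $F$ will become a smooth function
$$\tilde{F} = (\tilde{F}_1, ..., \tilde{F}_n) : (-\epsilon, \epsilon) \times \tilde{U} \to \tilde{U}$$
such that $\tilde{F}_0 = 1_{\tilde{U}}$ (in fact $\tilde{F}$ might only be defined on an open neighbourhood of the subset $\{0\} \times \tilde{U}$). The associated vector field $\widetilde{\xi^F}$ on $\tilde{U}$ is:
$$\widetilde{\xi^F} = \left.\frac{\partial \tilde{F}}{\partial s}\right|_{s=0} = \left( \left.\frac{\partial \tilde{F}_1}{\partial s}\right|_{s=0}, \; ... \; , \left.\frac{\partial \tilde{F}_n}{\partial s}\right|_{s=0} \right) : \tilde{U} \to \mathbb{R}^n$$
In particular, it's clear that the vector field $\xi^F$ is smooth, since $\widetilde{\xi^F}$ is a smooth function.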

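Example 6.7. In Example 6.5 we defined the following flow on $S^1$:
$$G : (-\epsilon, \epsilon) \times S^1 \to S^1, \quad \big(s, (\cos\theta, \sin\theta)\big) \mapsto \big(\cos(\theta + 2\pi s), \sin(\theta + 2\pi s)\big)$$
(here $\epsilon$ can be any positive real number). The vector field associated to this flow takes the value
$$\xi^G|_{(\cos\theta, \sin\theta)} = \begin{pmatrix} -2\pi \sin\theta \\ 2\pi \cos\theta \end{pmatrix} \in T_{(\cos\theta, \sin\theta)} S^1 \subset \mathbb{R}^2$$
at the point $(\cos\theta, \sin\theta) \in S^1$. Up to the overall scale factor of $2\pi$, this is the vector field that we saw in Example 6.3.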
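As a quick sanity check on Example 6.7, the derivative of the rotation flow at s = 0 can be estimated numerically and compared with 2π times the unit angular vector. The sketch below is only illustrative: the function names, the sample angle and the step size are assumptions made here, not part of the notes.

```python
import math

def G(s, theta):
    """Rotation flow on the unit circle: move the point at angle theta by 2*pi*s."""
    angle = theta + 2 * math.pi * s
    return (math.cos(angle), math.sin(angle))

def flow_vector_field(theta, step=1e-6):
    """Central-difference estimate of d/ds G(s, theta) at s = 0."""
    forward = G(step, theta)
    backward = G(-step, theta)
    return tuple((a - b) / (2 * step) for a, b in zip(forward, backward))

def expected_field(theta):
    """Closed form: 2*pi times the anti-clockwise unit angular vector."""
    return (-2 * math.pi * math.sin(theta), 2 * math.pi * math.cos(theta))

theta = 0.7  # arbitrary sample angle (an assumption for this check)
numeric = flow_vector_field(theta)
exact = expected_field(theta)
max_error = max(abs(a - b) for a, b in zip(numeric, exact))
print(max_error)
```

The printed error should be tiny compared with the size of the vector, which is consistent with the scale factor of 2π in the example.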
It's an interesting question to ask whether this process can be reversed: if we have a vector field $\xi$ on $X$, can we construct a flow on $X$ whose associated vector field is $\xi$?
This is a question about constructing solutions to ordinary differential equations, and dealing with it properly requires more analysis than we wish to introduce here. However, the answer is yes, provided that we assume that $X$ is compact. If we fix a small neighbourhood $U \subset X$ then we can always construct a flow
$$F : (-\epsilon, \epsilon) \times U \to X$$
whose infinitesimal version is $\xi|_U$, for some value of $\epsilon$. If $X$ is non-compact then these values for $\epsilon$ might not be bounded away from zero over the whole of $X$, meaning that we cannot find a global flow $F$ for any positive value of $\epsilon$. However if $X$ is compact then there must be some minimal $\epsilon_0 > 0$, and we have a global flow with $\epsilon = \epsilon_0$.

6.3 Other definitions of vector fields

We'll now see what vector fields look like if we adopt the viewpoints of our other two definitions of tangent vectors, Definition 5.19 and Definition 5.32.
If $\xi$ is a vector field on $X$, then for any chart $(U, f)$ we have a smooth function $\tilde{\xi} : \tilde{U} \to \mathbb{R}^n$. If we have two charts $(U_1, f_1)$ and $(U_2, f_2)$, then the two functions
$$\tilde{\xi}_1 : \tilde{U}_1 \to \mathbb{R}^n \quad \text{and} \quad \tilde{\xi}_2 : \tilde{U}_2 \to \mathbb{R}^n$$
are related by the derivatives of the transition function $\phi_{21}$. For any point $x \in U_1 \cap U_2$, we must have the transformation law:
$$\tilde{\xi}_2|_{f_2(x)} = D\phi_{21}|_{f_1(x)} \big( \tilde{\xi}_1|_{f_1(x)} \big)$$
In the style of Definition 5.19, we can take these properties to be the definition of a vector field.

Proposition 6.8. Let $\xi$ be a rule which assigns to any chart $(U, f)$ on $X$ a smooth function:
$$\tilde{\xi} : \tilde{U} \to \mathbb{R}^n$$
Assume that $\xi$ obeys the following property: for any two charts $(U_1, f_1)$ and $(U_2, f_2)$, and any point $x \in U_1 \cap U_2$, the corresponding functions $\tilde{\xi}_1$ and $\tilde{\xi}_2$ satisfy
$$\tilde{\xi}_2|_{f_2(x)} = D\phi_{21}|_{f_1(x)} \big( \tilde{\xi}_1|_{f_1(x)} \big)$$
where $\phi_{21}$ is the transition function between the two charts. Then $\xi$ is a (smooth) vector field on $X$.
Proof. Fix a point $x \in X$ and let $A_x$ denote the set of charts containing $x$. Define a function
$$\xi|_x : A_x \to \mathbb{R}^n$$
by:
$$\xi|_x : (U, f) \mapsto \tilde{\xi}|_{f(x)} \in \mathbb{R}^n$$
Then $\xi|_x$ is a tangent vector in the sense of Definition 5.19, so it defines an element of $T_x X$ by Proposition 5.23. Hence we have a vector field:
$$\xi : X \to TX, \quad x \mapsto \xi|_x$$
In any single chart $(U, f)$ this vector field becomes the corresponding function $\tilde{\xi} : \tilde{U} \to \mathbb{R}^n$, so $\xi$ is smooth.
We can specify a vector field by choosing an atlas for $X$ and then specifying the functions $\tilde{\xi}$ for every chart in the atlas. The values of the vector field in any other chart will then be determined by the transformation law.
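Example 6.9. Let $X = S^1$, and recall our stereographic projection atlas from Example 2.3. So we have two charts whose codomains are
$$\tilde{U}_1 = \mathbb{R} \quad \text{and} \quad \tilde{U}_2 = \mathbb{R}$$
and the transition function is:
$$\phi_{21} : \mathbb{R} \setminus 0 \to \mathbb{R} \setminus 0, \quad \tilde{x} \mapsto \frac{1}{\tilde{x}}$$
You can think of this data as an alternative definition of the manifold $S^1$; it says we take two copies of $\mathbb{R}$ and glue them together using $\phi_{21}$.
Note that if $f_1(x) = \tilde{x}$, then $f_2(x) = \phi_{21}(\tilde{x}) = 1/\tilde{x}$. Consequently, a vector field on $S^1$ is equivalent to the data of two smooth functions
$$\tilde{\xi}_1 : \mathbb{R} \to \mathbb{R} \quad \text{and} \quad \tilde{\xi}_2 : \mathbb{R} \to \mathbb{R}$$
such that
$$\tilde{\xi}_2\!\left(\tfrac{1}{\tilde{x}}\right) = -\frac{1}{\tilde{x}^2}\, \tilde{\xi}_1(\tilde{x})$$
for all $\tilde{x} \neq 0$. It's very easy to construct vector fields on $S^1$ using this definition: just take any smooth function $\tilde{\xi}_1 : \mathbb{R} \to \mathbb{R}$ and define:
$$\tilde{\xi}_2 : \mathbb{R} \setminus 0 \to \mathbb{R}, \quad \tilde{x} \mapsto -\tilde{x}^2\, \tilde{\xi}_1\!\left(\tfrac{1}{\tilde{x}}\right)$$
Provided that $\tilde{\xi}_1$ behaves sufficiently well as $|\tilde{x}| \to \infty$, we will be able to extend $\tilde{\xi}_2$ to a smooth function on $\mathbb{R}$. For example, the pairs
$$\tilde{\xi}_1 : \tilde{x} \mapsto 1, \quad \tilde{\xi}_2 : \tilde{x} \mapsto -\tilde{x}^2$$
$$\tilde{\xi}_1 : \tilde{x} \mapsto \tilde{x}, \quad \tilde{\xi}_2 : \tilde{x} \mapsto -\tilde{x}$$
$$\tilde{\xi}_1 : \tilde{x} \mapsto \tilde{x}^2, \quad \tilde{\xi}_2 : \tilde{x} \mapsto -1$$
all define vector fields on $S^1$.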
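The three pairs in Example 6.9 can be checked against the gluing rule for the transition function x ↦ 1/x, whose derivative is −1/x². The following sketch assumes that rule in the form ξ̃₂(y) = −y² ξ̃₁(1/y); the helper names and sample points are illustrative choices, not taken from the notes.

```python
def second_chart_component(xi1):
    """Given xi1 in the first chart, return xi2(y) = -y**2 * xi1(1/y) for y != 0."""
    return lambda y: -(y ** 2) * xi1(1.0 / y)

# The three (xi1, xi2) pairs listed in the example.
pairs = [
    (lambda x: 1.0, lambda y: -(y ** 2)),
    (lambda x: x, lambda y: -y),
    (lambda x: x ** 2, lambda y: -1.0),
]

def max_mismatch(samples=(-3.0, -0.5, 0.25, 2.0)):
    """Largest gap between the derived xi2 and the stated xi2 over the sample points."""
    worst = 0.0
    for xi1, xi2 in pairs:
        derived = second_chart_component(xi1)
        for y in samples:
            worst = max(worst, abs(derived(y) - xi2(y)))
    return worst

print(max_mismatch())
```

Each stated second component is a polynomial, so it extends smoothly over 0, which is exactly the condition needed for the pair to define a vector field on the whole circle.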


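Now let's see what vector fields look like if we adopt our third definition of tangent vectors (Definition 5.32), in terms of derivations. We'll begin by working on open sets in $\mathbb{R}^n$.
Suppose $\tilde{U}$ is an open set in $\mathbb{R}^n$, and suppose
$$\tilde{\xi} : \tilde{U} \to \mathbb{R}^n$$
is a smooth vector field. At any point $\tilde{x} \in \tilde{U}$, we have a partial derivative operator
$$\partial_{\tilde{x}, \tilde{\xi}|_{\tilde{x}}} : C^\infty(\tilde{U}) \to \mathbb{R}$$
which differentiates along the vector $\tilde{\xi}|_{\tilde{x}}$ at the point $\tilde{x}$. This means that if $\tilde{h}$ is a smooth function in $C^\infty(\tilde{U})$, then we can define a new function
$$\tilde{\xi}(\tilde{h}) : \tilde{U} \to \mathbb{R}$$
by:
$$\tilde{\xi}(\tilde{h}) : \tilde{x} \mapsto \partial_{\tilde{x}, \tilde{\xi}|_{\tilde{x}}}(\tilde{h})$$
Explicitly, if the components of $\tilde{\xi}$ are $\tilde{\xi} = (\tilde{\xi}_1, ..., \tilde{\xi}_n)$ then:
$$\tilde{\xi}(\tilde{h}) = \sum_{i=1}^n \tilde{\xi}_i \frac{\partial \tilde{h}}{\partial \tilde{x}_i}$$
Since $\tilde{\xi}$ and $\tilde{h}$ are both smooth functions, the function $\tilde{\xi}(\tilde{h})$ is also smooth. This means we can view our vector field as an operator:
$$\tilde{\xi} : C^\infty(\tilde{U}) \to C^\infty(\tilde{U}), \quad \tilde{h} \mapsto \tilde{\xi}(\tilde{h})$$
When we want to think in this way it's common to write vector fields in the form:
$$\tilde{\xi} = \sum_{i=1}^n \tilde{\xi}_i \frac{\partial}{\partial \tilde{x}_i}$$
Recall that every operator $\partial_{\tilde{x}, \tilde{\xi}|_{\tilde{x}}}$ is a derivation at $\tilde{x}$, i.e. it satisfies the product rule (5.25). This implies that the operator $\tilde{\xi}$ satisfies the following version of the product rule:
$$\tilde{\xi}(\tilde{h}_1 \tilde{h}_2) = \tilde{h}_1\, \tilde{\xi}(\tilde{h}_2) + \tilde{h}_2\, \tilde{\xi}(\tilde{h}_1) \qquad (6.10)$$
for any two functions $\tilde{h}_1, \tilde{h}_2 \in C^\infty(\tilde{U})$. An operator like this is called a derivation. Note the difference between a "derivation" and a "derivation at $\tilde{x}$".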
Now let's repeat this on an arbitrary manifold $X$. Suppose we have a smooth vector field:
$$\xi : X \to TX$$
At every point we have a tangent vector $\xi|_x \in T_x X$, and recall that we have an associated partial derivative operator:
$$\partial_{\xi|_x} : C^\infty(X) \to \mathbb{R}$$
We can evaluate this operator by picking a chart $(U, f)$ containing $x$, so $x$ becomes a point $\tilde{x} \in \tilde{U}$ and $\xi|_x$ becomes a vector $\tilde{\xi}|_{\tilde{x}} \in \mathbb{R}^n$. A function $h \in C^\infty(X)$ becomes a function $\tilde{h} = h \circ f^{-1} \in C^\infty(\tilde{U})$, and:
$$\partial_{\xi|_x} : h \mapsto \partial_{\tilde{x}, \tilde{\xi}|_{\tilde{x}}}(\tilde{h}) \in \mathbb{R}$$
The resulting real number is independent of which chart we chose.
If we take a function $h \in C^\infty(X)$ then we can define a new function
$$\xi(h) : X \to \mathbb{R}$$
by:
$$\xi(h) : x \mapsto \partial_{\xi|_x}(h)$$
This operator is local, meaning that the restriction of the function $\xi(h)$ to some open subset $U \subset X$ only depends on $h|_U$; it doesn't depend on the behaviour of $h$ outside $U$. This means we can pick a chart $(U, f)$ on $X$ and translate everything over to the open set $\tilde{U} \subset \mathbb{R}^n$. The vector field $\xi$ becomes a smooth function $\tilde{\xi} : \tilde{U} \to \mathbb{R}^n$, and the function $\xi(h)$, looked at in these co-ordinates, is just the function $\tilde{\xi}(\tilde{h})$ as we defined before. This shows that $\xi(h)$ is a smooth function, so $\xi$ is an operator:
$$\xi : C^\infty(X) \to C^\infty(X)$$
Furthermore the product rule (6.10) holds, since we can check it by working in co-ordinate charts.
Definition 6.11. A derivation on a manifold $X$ is a linear map
$$D : C^\infty(X) \to C^\infty(X)$$
such that the product rule
$$D(h_1 h_2) = h_1 D(h_2) + h_2 D(h_1)$$
holds for all $h_1, h_2 \in C^\infty(X)$. The set of all derivations on $X$ is denoted by $\mathrm{Der}(X)$.
We've seen that any smooth vector field defines a derivation in $\mathrm{Der}(X)$. The converse is also true:
Proposition 6.12. Any derivation $D \in \mathrm{Der}(X)$ defines a smooth vector field.
Proof. Pick a $D \in \mathrm{Der}(X)$. For a fixed point $x \in X$, we can define a linear operator
$$D|_x : C^\infty(X) \to \mathbb{R}$$
by:
$$D|_x : h \mapsto D(h)|_x$$
The product rule (6.10) implies that $D|_x$ is a derivation at $x$, so by Proposition 5.28 it must be the partial derivative operator associated to some tangent vector in $T_x X$. Hence the function $\xi : x \mapsto D|_x$ is a vector field on $X$.
It remains to show that $\xi$ is smooth. Pick any chart $(U, f)$, and let $\tilde{x}_1, ..., \tilde{x}_n$ be the co-ordinate functions on $\tilde{U}$. In these co-ordinates, we can write the vector field $\xi$ as
$$\sum_{i=1}^n \tilde{\xi}_i \frac{\partial}{\partial \tilde{x}_i}$$
for some functions $\tilde{\xi}_1, ..., \tilde{\xi}_n : \tilde{U} \to \mathbb{R}$, and what we need to show is that each of these functions $\tilde{\xi}_i$ is smooth. More specifically, let's fix a point $y \in U$, set $\tilde{y} = f(y) \in \tilde{U}$, and prove that each $\tilde{\xi}_i$ is smooth at $\tilde{y}$.
Let $\tilde{\psi}$ be a bump function on $\tilde{U}$ which is constantly equal to 1 inside some neighbourhood $\tilde{W}$ of $\tilde{y}$, and constantly equal to zero outside some larger neighbourhood. Then we can use this bump function $\tilde{\psi}$ to extend each co-ordinate function $\tilde{x}_i \in C^\infty(\tilde{U})$ to a smooth function $\chi_i \in C^\infty(X)$, by defining
$$\chi_i = \begin{cases} (\tilde{\psi}\, \tilde{x}_i) \circ f & \text{inside } U \\ 0 & \text{outside } U \end{cases}$$
(this is the same trick we used to prove Lemma 5.31). Then if we write $\chi_i$ in the chart $(U, f)$, we get a function $\tilde{\chi}_i \in C^\infty(\tilde{U})$ which agrees with $\tilde{x}_i$ inside the open neighbourhood $\tilde{W}$ of $\tilde{y}$.
By definition, applying our derivation $D$ must send each $\chi_i$ to a smooth function $D(\chi_i) \in C^\infty(X)$. Let's compute this function $D(\chi_i)$ inside the open neighbourhood $W = f^{-1}(\tilde{W})$ of $y$. Its value at any point $x \in W$ is given by applying the operator $D|_x$ to $\chi_i$, and we can compute this in the chart $(U, f)$ and see that
$$D(\chi_i)|_x = \sum_{j=1}^n \tilde{\xi}_j|_{f(x)} \left.\frac{\partial \tilde{\chi}_i}{\partial \tilde{x}_j}\right|_{f(x)} = \tilde{\xi}_i|_{f(x)}$$
since $\tilde{\chi}_i \equiv \tilde{x}_i$ in the open set $\tilde{W}$. Since $D(\chi_i)$ is smooth, $\tilde{\xi}_i$ must be smooth inside $\tilde{W}$, and in particular smooth at $\tilde{y}$.
So the set of all vector fields on $X$ is exactly $\mathrm{Der}(X)$. It's easy to check that $\mathrm{Der}(X)$ is a vector space under the usual addition and scalar multiplication of linear maps, and it's clear that this vector space structure is the same as the one we saw in our previous definitions.
This characterization of vector fields reveals an additional interesting structure, the Lie bracket.
Suppose we have two derivations $D, E \in \mathrm{Der}(X)$. The composition $D \circ E$ is a linear map from $C^\infty(X)$ to $C^\infty(X)$, but it might not be a derivation. However, it is easy to compute that the commutator of $D$ and $E$
$$[D, E] : C^\infty(X) \to C^\infty(X), \quad h \mapsto D \circ E(h) - E \circ D(h)$$
does obey the product rule, and hence is also an element of $\mathrm{Der}(X)$.

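Definition 6.13. The Lie bracket on $\mathrm{Der}(X)$ is the bilinear map
$$[-, -] : \mathrm{Der}(X) \times \mathrm{Der}(X) \to \mathrm{Der}(X), \quad (D, E) \mapsto [D, E]$$
It's obvious that the Lie bracket is antisymmetric, i.e. $[D, E] = -[E, D]$, and it's easy to check that it obeys the Jacobi identity:
$$\big[D, [E, F]\big] + \big[E, [F, D]\big] + \big[F, [D, E]\big] = 0$$
for any $D, E, F \in \mathrm{Der}(X)$. If you know what a Lie algebra is, this shows that $\mathrm{Der}(X)$ is an (infinite-dimensional) Lie algebra.
We can see what the Lie bracket does more explicitly if we work in co-ordinates. Suppose $\tilde{U}$ is an open set in $\mathbb{R}^n$, and we have two vector fields
$$\tilde{\xi} = \sum_{i=1}^n \tilde{\xi}_i \frac{\partial}{\partial \tilde{x}_i} \quad \text{and} \quad \tilde{\zeta} = \sum_{i=1}^n \tilde{\zeta}_i \frac{\partial}{\partial \tilde{x}_i}$$
on $\tilde{U}$. Then it's straightforward to compute that their Lie bracket is the vector field:
$$[\tilde{\xi}, \tilde{\zeta}] = \sum_{i=1}^n \sum_{j=1}^n \left( \tilde{\xi}_j \frac{\partial \tilde{\zeta}_i}{\partial \tilde{x}_j} - \tilde{\zeta}_j \frac{\partial \tilde{\xi}_i}{\partial \tilde{x}_j} \right) \frac{\partial}{\partial \tilde{x}_i}$$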
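To get a feel for the co-ordinate expression of the Lie bracket, here is a small numerical sketch that evaluates the components sum_j (xi_j d(zeta_i)/dx_j − zeta_j d(xi_i)/dx_j) at a point using central differences. The function name `lie_bracket`, the two sample fields on R² and the sample point are all assumptions chosen for illustration.

```python
def lie_bracket(xi, zeta, point, step=1e-5):
    """Components of [xi, zeta] at `point`, estimated with central differences.

    xi and zeta map a list of coordinates to a list of components, and
    [xi, zeta]_i = sum_j (xi_j * d zeta_i/dx_j - zeta_j * d xi_i/dx_j).
    """
    n = len(point)

    def partial(field, j):
        # Derivative of every component of `field` in the j-th coordinate direction.
        plus = list(point)
        minus = list(point)
        plus[j] += step
        minus[j] -= step
        return [(a - b) / (2 * step) for a, b in zip(field(plus), field(minus))]

    xi_p, zeta_p = xi(point), zeta(point)
    bracket = [0.0] * n
    for j in range(n):
        d_zeta = partial(zeta, j)
        d_xi = partial(xi, j)
        for i in range(n):
            bracket[i] += xi_p[j] * d_zeta[i] - zeta_p[j] * d_xi[i]
    return bracket

# Sample fields on R^2: xi = d/dx and zeta = x d/dy. By hand, [xi, zeta] = d/dy.
xi = lambda p: [1.0, 0.0]
zeta = lambda p: [0.0, p[0]]
result = lie_bracket(xi, zeta, [0.3, -1.2])
print(result)
```

Swapping the two arguments should flip the sign of every component, which matches the antisymmetry of the bracket.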
The Lie bracket also has a geometric interpretation in terms of flows, which we're going to quickly sketch. Firstly, suppose that we have a single diffeomorphism:
$$F : X \to X$$
Then for any point $x \in X$, we have an isomorphism:
$$DF|_x : T_x X \to T_{F(x)} X$$
Now suppose we also have a vector field $\xi$ on $X$. We can define a new vector field $F^\star \xi$ by pulling back along $F$, meaning that we define:
$$(F^\star \xi)|_x = (DF|_x)^{-1} \big( \xi|_{F(x)} \big)$$
Notice that this trick only works if $F$ is a diffeomorphism; it won't work for more general smooth functions.
Now suppose we have a flow
$$F : (-\epsilon, \epsilon) \times X \to X$$
which we recall means a smooth family of diffeomorphisms $F_s : X \to X$, with $F_0 = 1_X$. Then we can use $F$ to turn our vector field $\xi$ into a 1-parameter family of vector fields, since for every $s \in (-\epsilon, \epsilon)$ we have a vector field $F_s^\star \xi$ on $X$. For $s = 0$ we get the original vector field $F_0^\star(\xi) = \xi$. If we fix a single point $x \in X$ then we have a 1-parameter family of vectors in the tangent space $T_x X$, given by
$$(F_s^\star \xi)|_x \in T_x X$$
for each $s \in (-\epsilon, \epsilon)$. The derivative of this family at $s = 0$ is a new vector
$$\left.\frac{\partial (F_s^\star \xi)|_x}{\partial s}\right|_{s=0} \in T_x X$$
so if we evaluate this at all points $x \in X$ then we obtain a new vector field. This new vector field measures the infinitesimal change in the vector field $\xi$ when we apply the flow $F$.
Now the flow $F$ has its own vector field $\xi^F$ associated to it, which is the infinitesimal version of $F$. We claim that the infinitesimal change in $\xi$ when we apply the flow $F$ is given by the vector field:
$$[\xi^F, \xi]$$
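We'll give a sketch proof of this claim. The statement can be checked in co-ordinates, so let
$$\tilde{F} : (-\epsilon, \epsilon) \times \tilde{U} \to \tilde{U}$$
be a flow and let
$$\tilde{\xi} : \tilde{U} \to \mathbb{R}^n$$
be a vector field. For any point $\tilde{x} \in \tilde{U}$, we have
$$\tilde{F}(s, \tilde{x}) = \tilde{x} + s\, \tilde{\zeta}|_{\tilde{x}} + O(s^2)$$
where $\tilde{\zeta}$ is the infinitesimal version of $\tilde{F}$. Hence the Jacobian matrix of $\tilde{F}$ (with respect to $\tilde{x}$) at $(s, \tilde{x})$ is
$$D\tilde{F}|_{(s, \tilde{x})} = I + sJ + O(s^2)$$
where $I$ is the identity matrix and $J$ is the matrix whose entries are:
$$J_{ij} = \left.\frac{\partial \tilde{\zeta}_i}{\partial \tilde{x}_j}\right|_{\tilde{x}}$$
Inverting the Jacobian, we get:
$$\big( D\tilde{F}|_{(s, \tilde{x})} \big)^{-1} = I - sJ + O(s^2)$$
On the other hand, the value of the vector field $\tilde{\xi}$ at the point $\tilde{F}(s, \tilde{x})$ is
$$\tilde{\xi}|_{\tilde{F}(s, \tilde{x})} = \tilde{\xi}|_{\tilde{x}} + sK\, \tilde{\zeta}|_{\tilde{x}} + O(s^2)$$
where $K$ is the matrix whose entries are:
$$K_{ij} = \left.\frac{\partial \tilde{\xi}_i}{\partial \tilde{x}_j}\right|_{\tilde{x}}$$
So applying our flow $\tilde{F}$ to the vector field $\tilde{\xi}$ gives a family of vector fields $\tilde{F}_s^\star \tilde{\xi}$ whose values are:
$$(\tilde{F}_s^\star \tilde{\xi})|_{\tilde{x}} = \big( D\tilde{F}|_{(s, \tilde{x})} \big)^{-1} \tilde{\xi}|_{\tilde{F}(s, \tilde{x})} = \tilde{\xi}|_{\tilde{x}} + sK\, \tilde{\zeta}|_{\tilde{x}} - sJ\, \tilde{\xi}|_{\tilde{x}} + O(s^2)$$
If we take the partial derivative with respect to $s$, and then set $s$ to zero, we get the vector field
$$K\, \tilde{\zeta}|_{\tilde{x}} - J\, \tilde{\xi}|_{\tilde{x}} = [\tilde{\zeta}, \tilde{\xi}]\,|_{\tilde{x}}$$
as claimed.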
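The claim can be tested on one concrete case. The sketch below takes the rotation flow on R², whose infinitesimal version is the field (−y, x), pulls back the constant field (1, 0), and differentiates at s = 0. Working the bracket out by hand from its co-ordinate formula gives (0, −1) at every point for this pair. The choice of flow, field, sample point and function names are assumptions made only for this illustration.

```python
import math

def flow(s, p):
    """Rotation of R^2 by angle s; its infinitesimal version is the field (-y, x)."""
    x, y = p
    return (x * math.cos(s) - y * math.sin(s), x * math.sin(s) + y * math.cos(s))

def xi(p):
    """The vector field being pulled back; a constant field keeps the check short."""
    return (1.0, 0.0)

def pullback(s, p):
    """(F_s^* xi)|_p = (DF_s|_p)^{-1} xi|_{F_s(p)}.

    For a rotation, DF_s is the rotation matrix R(s), so its inverse is R(-s).
    """
    v = xi(flow(s, p))
    c, sn = math.cos(s), math.sin(s)
    # R(-s) = [[c, sn], [-sn, c]]
    return (c * v[0] + sn * v[1], -sn * v[0] + c * v[1])

def rate_of_change(p, step=1e-6):
    """Central-difference estimate of d/ds (F_s^* xi)|_p at s = 0."""
    a, b = pullback(step, p), pullback(-step, p)
    return tuple((u - w) / (2 * step) for u, w in zip(a, b))

p = (0.4, 1.1)                # arbitrary sample point
numeric = rate_of_change(p)
expected = (0.0, -1.0)        # the bracket of (-y, x) with (1, 0), computed by hand
gap = max(abs(u - w) for u, w in zip(numeric, expected))
print(gap)
```

A single example is of course not a proof, but it does catch sign mistakes in the order of the two fields inside the bracket.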

6.4 Vector bundles

We've seen that for every manifold $X$ there's an associated tangent bundle, which is a smooth manifold $TX$ coming with a smooth surjection
$$\pi : TX \to X$$
such that every level set $\pi^{-1}(x) = T_x X$ is a vector space. This is a very rich mathematical structure; it's an example of something called a vector bundle.
Let $X$ be a manifold of dimension $n$. Informally, a vector bundle over $X$ is a collection of vector spaces $\{E_x\}$, indexed by the points $x$ of $X$. These vector spaces have to fit together to give a smooth manifold $E$, equipped with a smooth map $\pi : E \to X$ whose level set over $x \in X$ is the associated vector space $\pi^{-1}(x) = E_x$. For example, for any manifold $X$, and any integer $r$, there is a vector bundle
$$\pi : E = X \times \mathbb{R}^r \to X$$
where $\pi$ is the projection map $\pi : (x, v) \mapsto x$. It's easy to show that there's a smooth structure on $E$ making it into a manifold (of dimension $n + r$), and that $\pi$ is smooth. Obviously the level set of $\pi$ at any point $x \in X$ is the vector space $\mathbb{R}^r$. This is called the trivial vector bundle of rank $r$.
The tangent bundle is not usually of this form: in general we can't canonically identify $T_x X$ with $\mathbb{R}^n$, so it's not usually true that $TX = X \times \mathbb{R}^n$. However if we pick a chart $(U, f)$, then within the open set $U \subset X$ it is true that $TU \cong U \times \mathbb{R}^n$, since our co-ordinates give us this bijection. So within small neighbourhoods in $X$, the tangent bundle looks like the trivial bundle of rank $n$. This condition, of being locally trivial, is the defining property of a vector bundle.
Definition 6.14. Let $X$ be a manifold of dimension $n$, let $E$ be a manifold of dimension $n + r$, and let
$$\pi : E \to X$$
be a smooth surjection. For each $x \in X$ we denote the level set of $\pi$ by $E_x = \pi^{-1}(x)$.
Suppose that for every point $x \in X$ we have specified the structure of an $r$-dimensional vector space on $E_x$. We call this structure a vector bundle if there exist atlases $\{(U_i, f_i), i \in I\}$ for $X$ and $\{(V_i, g_i), i \in I\}$ for $E$ (indexed by the same set $I$) with the following properties:
• $V_i = \pi^{-1}(U_i)$, for each $i \in I$.
• $\tilde{V}_i = \tilde{U}_i \times \mathbb{R}^r \subset \mathbb{R}^{n+r}$, for each $i \in I$.
• For each $i \in I$, the square
$$\begin{array}{ccc} V_i & \xrightarrow{\;g_i\;} & \tilde{U}_i \times \mathbb{R}^r \\ \downarrow{\scriptstyle \pi} & & \downarrow \\ U_i & \xrightarrow{\;f_i\;} & \tilde{U}_i \end{array}$$
commutes, where the right-hand vertical arrow is the obvious projection map.
• For any $i \in I$ and any $x \in U_i$, the map
$$g_i|_{E_x} : E_x \to \mathbb{R}^r$$
is an isomorphism of vector spaces.
The integer $r$ is called the rank of the vector bundle, and the vector spaces $E_x$ are called the fibres of the vector bundle.

A vector bundle is a lot of data: we need to specify $E$, $\pi$, and the vector space structure on each fibre $E_x$. However it's common to just write it as the map $\pi : E \to X$, or sometimes just as $E$.
When we defined the tangent bundle $TX \to X$ we used an atlas of exactly this form, so the tangent bundle is a vector bundle of rank $n$. If we take $E$ to be the trivial vector bundle of rank $r$, so $E = X \times \mathbb{R}^r$, then any atlas $\{(U_i, f_i), i \in I\}$ for $X$ will produce an atlas for $E$ of the required form, just by setting $V_i = U_i \times \mathbb{R}^r$ and $\tilde{V}_i = \tilde{U}_i \times \mathbb{R}^r$. Hence the trivial vector bundle of rank $r$ is indeed an example of a vector bundle, of rank $r$. This is fortunate!
Example 6.15. Let $E$ be the quotient of $\mathbb{R}^2$ by the equivalence relation
$$(x, y) \sim (x + n, (-1)^n y), \quad n \in \mathbb{Z}$$
(these are the orbits of a group action generated by a horizontal glide reflection). This is an infinite Möbius strip. We have a well-defined map:
$$\pi : E \to T^1, \quad [(x, y)] \mapsto [x]$$
Notice that the usual vector space structure on $\mathbb{R}^2$ does give a well-defined vector space structure on each fibre $E_{[x]}$. Furthermore if we take the atlas $\{(U_1, f_1), (U_2, f_2)\}$ on $T^1$ from Example 2.10 then it's easy to find a corresponding atlas $\{(V_1, g_1), (V_2, g_2)\}$ on $E$ with all the properties required by Definition 6.14. For example, $\tilde{V}_1 = (0, 1) \times \mathbb{R} \subset \mathbb{R}^2$, and $V_1$ is the image of $\tilde{V}_1$ in $E$.
The concept of a vector field can be easily generalized to other vector bundles:
Definition 6.16. Let $\pi : E \to X$ be a vector bundle. A section of $E$ is a smooth map
$$\sigma : X \to E$$
such that $\pi \circ \sigma = 1_X$.
So a section sends any point $x \in X$ to a vector $\sigma|_x \in E_x$ lying in the fibre over $x$, and this vector varies smoothly with $x$. A section of the tangent bundle $TX$ is precisely a vector field.
Any vector bundle has one obvious section, the zero section, which maps any point $x \in X$ to the zero vector $0 \in E_x$. By looking at this in a chart it's clear that it is smooth, and furthermore it gives us an injective immersion
$$X \hookrightarrow E$$
whose image is a submanifold which is diffeomorphic to $X$. We observed this fact earlier in the case that $E$ is the tangent bundle.
We saw earlier (Example 6.3) that the manifold $T^1$ has the rather special property that its tangent bundle looks globally like the trivial bundle $T^1 \times \mathbb{R}$. We now want to say this precisely, but first we need to say what it means for two vector bundles over $X$ to be isomorphic.
Definition 6.17. Let $\pi_1 : E_1 \to X$ and $\pi_2 : E_2 \to X$ be two vector bundles over $X$. An isomorphism between $E_1$ and $E_2$ is a diffeomorphism
$$F : E_1 \to E_2$$
such that $\pi_2 \circ F = \pi_1$, and such that the induced function
$$F_x : (E_1)_x \to (E_2)_x$$
is a linear isomorphism, for each $x \in X$.
So an isomorphism of vector bundles is a bijection that preserves all the structure of a vector bundle. In particular if two vector bundles over $X$ are isomorphic they must obviously have the same rank.
Definition 6.18. A rank $r$ vector bundle $\pi : E \to X$ is called trivial if it is isomorphic to the trivial vector bundle $X \times \mathbb{R}^r$.
Here is one way to tell if a vector bundle is trivial:
Proposition 6.19. Let $\pi : E \to X$ be a vector bundle of rank $r$. Then $E$ is trivial iff there exist $r$ sections $\sigma_1, ..., \sigma_r$ of $E$ such that, for every point $x \in X$, the vectors
$$\sigma_1|_x, ..., \sigma_r|_x \in E_x$$
form a basis of $E_x$.
Proof. If $E$ is the trivial vector bundle $X \times \mathbb{R}^r$ then we can just pick any basis $e_1, ..., e_r$ for $\mathbb{R}^r$ and consider the constant sections $\tilde{\sigma}_i : x \mapsto e_i$ for each $i$, which are evidently smooth. More generally if $F : X \times \mathbb{R}^r \to E$ is an isomorphism of vector bundles then we can define $r$ sections of $E$ by:
$$\sigma_i = F \circ \tilde{\sigma}_i : x \mapsto F_x(e_i) \in E_x$$
These are smooth since both $F$ and $\tilde{\sigma}_i$ are smooth, and give a basis of $E_x$ since $F_x$ is an isomorphism of vector spaces.
Conversely, suppose that we have such a set of sections $\sigma_1, ..., \sigma_r$. Define a function
$$F : X \times \mathbb{R}^r \to E$$
by:
$$F : \big( x, (v_1, ..., v_r) \big) \mapsto \left( x, \sum_{i=1}^r v_i\, \sigma_i|_x \right)$$
Obviously $F$ commutes with the projection maps, and for each $x \in X$ the map $F_x$ is linear and sends the standard basis of $\mathbb{R}^r$ to the basis $\sigma_1|_x, ..., \sigma_r|_x$ of $E_x$. Hence each $F_x$ is an isomorphism of vector spaces, and it follows that $F$ is a bijection.
Now pick a chart $(U, f)$ on $X$ and a chart $(V, g)$ on $E$ of the form specified in Definition 6.14, and choose the corresponding chart $U \times \mathbb{R}^r$ on $X \times \mathbb{R}^r$. In these charts, each section $\sigma_i$ is a smooth function
$$\tilde{\sigma}_i = (\tilde{\sigma}_{1i}, ..., \tilde{\sigma}_{ri}) : \tilde{U} \to \mathbb{R}^r$$
and $F$ is the function
$$\tilde{F} : \tilde{U} \times \mathbb{R}^r \to \tilde{U} \times \mathbb{R}^r$$
given by the smooth family of invertible $r$-by-$r$ matrices $M_{\tilde{x}}$ whose entries are $\tilde{\sigma}_{ji}|_{\tilde{x}}$. The inverse function $\tilde{F}^{-1}$ is given by the family of matrices $M_{\tilde{x}}^{-1}$, whose entries will also vary smoothly with $\tilde{x}$ since they are rational functions of the entries in $M_{\tilde{x}}$. Hence both $F$ and $F^{-1}$ are smooth, so we have shown that $F$ is an isomorphism of vector bundles.
Definition 6.20. A manifold $X$ is said to be parallelizable iff its tangent bundle $TX$ is trivial.
Example 6.21. Let $X = S^1$. In Example 6.3 we found a smooth vector field $\xi$ on $S^1$ which was not equal to zero at any point. Since $S^1$ is 1-dimensional, this means that $\xi$ gives a basis of the tangent space at every point. So by Proposition 6.19 the bundle $TS^1$ is trivial, and $S^1$ is parallelizable.
The manifold $S^2$ is not parallelizable, because of the following fact:
Theorem 6.22 (Hairy ball theorem). Any vector field on $S^2$ must be equal to zero at some point.
Consequently it is impossible to find a pair of vector fields $\xi_1, \xi_2$ on $S^2$ that form a basis of the tangent space at every point.
Theorem 6.22 is a very nice result. It implies for example that at any moment in time there must be a point on the Earth where the wind speed is zero, and also that you cannot groom a spherical dog without leaving a protruding tuft of hair at one point. The proof is not very difficult, but unfortunately it requires some algebraic topology that doesn't form a part of this course.
Theorem 6.22 is true for any even-dimensional sphere $S^{2n}$, so no even-dimensional sphere is parallelizable. In fact the only parallelizable spheres are $S^1$, $S^3$ and $S^7$, but this is rather harder to prove.

7 Covectors and one-forms

7.1 Covectors

In this section we're going to look at the dual objects to tangent vectors, which are called covectors. We'll begin by recalling some basic facts about dual vector spaces.
Let $V$ be a vector space, of dimension $n$. Recall that the dual vector space to $V$ is the space
$$V^\star = \mathrm{Hom}(V, \mathbb{R})$$
of all linear maps from $V$ to $\mathbb{R}$. The dimension of $V^\star$ is also $n$: if we pick a basis $\{e_1, ..., e_n\}$ for $V$ there is a corresponding dual basis $\{\epsilon_1, ..., \epsilon_n\}$, where $\epsilon_i \in V^\star$ is the linear map defined by:
$$\epsilon_i : V \to \mathbb{R}, \quad e_j \mapsto \begin{cases} 1, & j = i \\ 0, & j \neq i \end{cases}$$
If $V$ is just $\mathbb{R}^n$, the space of length-$n$ column vectors, then the dual space $V^\star$ is the space of length-$n$ row vectors, which means that there's a canonical way to identify $(\mathbb{R}^n)^\star$ with $\mathbb{R}^n$. Under this identification, the standard basis becomes its own dual basis, and the process of evaluating a row vector $u \in (\mathbb{R}^n)^\star$ on a column vector $v \in \mathbb{R}^n$ becomes the dot product of vectors.
If $W$ is a second vector space (of dimension $m$) and
$$F : V \to W$$
is a linear map, then there is a corresponding dual linear map
$$F^\star : W^\star \to V^\star$$
which sends a vector $u \in W^\star$ to a vector $F^\star(u) \in V^\star$ defined by:
$$F^\star(u) : V \to \mathbb{R}, \quad v \mapsto u(F(v))$$
If we compose two linear maps $F$ and $G$ then it's easy to see that the dual of the composed map is:
$$(G \circ F)^\star = F^\star \circ G^\star$$
If we pick a basis for both $V$ and $W$ then the linear map $F$ can be expressed as an $m$-by-$n$ matrix:
$$F : \mathbb{R}^n \to \mathbb{R}^m$$
The dual map $F^\star$ can be expressed as an $n$-by-$m$ matrix, using the corresponding dual bases of $V^\star$ and $W^\star$, and it's easy to calculate that it becomes the transpose matrix:
$$F^\top : \mathbb{R}^m \to \mathbb{R}^n$$
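The statement that the dual map becomes the transpose matrix amounts to the identity u(F(v)) = (Fᵀu)·v, with covectors evaluated by the dot product. Here is a minimal sketch checking it on one hand-picked example; the particular matrix and vectors are arbitrary choices made for illustration.

```python
def mat_vec(matrix, vector):
    """Apply a matrix (a list of rows) to a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def transpose(matrix):
    """Swap the rows and columns of a matrix."""
    return [list(col) for col in zip(*matrix)]

def dot(u, v):
    """Evaluate the covector u on the vector v using the dot product."""
    return sum(a * b for a, b in zip(u, v))

F = [[1, 2, 0], [0, -1, 3]]   # a linear map R^3 -> R^2, so m = 2 and n = 3
u = [4, -2]                   # a covector on R^2
v = [1, 5, -1]                # a vector in R^3

lhs = dot(u, mat_vec(F, v))             # u(F(v))
rhs = dot(mat_vec(transpose(F), u), v)  # (F^T u)(v)
print(lhs, rhs)
```

Both numbers agree, and the shapes also match the text: F is 2-by-3 while its transpose is 3-by-2.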
Now we return to manifolds.
Definition 7.1. Let $x$ be a point in a manifold $X$. The cotangent space to $X$ at $x$ is the dual vector space to $T_x X$. Elements of the cotangent space are called covectors.
It's conventional to denote the cotangent space to $X$ at $x$ by
$$T_x^\star X$$
(rather than $(T_x X)^\star$). We know that if we pick a chart $(U, f)$ containing $x$ then we get a linear isomorphism:
$$\Delta_f : T_x X \to \mathbb{R}^n$$
The dual linear map to $\Delta_f$ gives a linear isomorphism
$$\Delta_f^\star : \mathbb{R}^n \to T_x^\star X$$
which we can invert to give:
$$(\Delta_f^\star)^{-1} : T_x^\star X \to \mathbb{R}^n$$
So if we work in co-ordinates then we can identify both the tangent space and the cotangent space at $x$ with the vector space $\mathbb{R}^n$. It's only when we change co-ordinates that we see the difference between them.
Recall that if we have two charts $(U_1, f_1)$ and $(U_2, f_2)$ containing $x$ then tangent vectors change according to the transformation law
$$\Delta_{f_2} = D\phi_{21}|_{f_1(x)} \circ \Delta_{f_1}$$
where $\phi_{21}$ is the transition function between the two charts. We can also write this as
$$\Delta_{f_1} = \big( D\phi_{21}|_{f_1(x)} \big)^{-1} \circ \Delta_{f_2} = D\phi_{12}|_{f_2(x)} \circ \Delta_{f_2}$$
using the inverse transition function $\phi_{12}$. Taking the dual of this equation tells us that
$$\Delta_{f_1}^\star = \Delta_{f_2}^\star \circ \big( D\phi_{12}|_{f_2(x)} \big)^\star$$
and then inverting gives
$$(\Delta_{f_1}^\star)^{-1} = \Big( \big( D\phi_{12}|_{f_2(x)} \big)^\star \Big)^{-1} \circ (\Delta_{f_2}^\star)^{-1}$$
so:
$$(\Delta_{f_2}^\star)^{-1} = \big( D\phi_{12}|_{f_2(x)} \big)^\star \circ (\Delta_{f_1}^\star)^{-1} \qquad (7.2)$$
What does this equation mean? If we take an abstract covector $u \in T_x^\star X$, we can translate it into an ordinary column vector
$$(\Delta_{f_1}^\star)^{-1}(u) \in \mathbb{R}^n$$
using our first chart. Or we can use our second chart, and it becomes the column vector:
$$(\Delta_{f_2}^\star)^{-1}(u) \in \mathbb{R}^n$$
These two column vectors are related by the matrix:
$$\big( D\phi_{12}|_{f_2(x)} \big)^\top : \mathbb{R}^n \to \mathbb{R}^n$$
This is the transpose of the Jacobian matrix for the transition function $\phi_{12}$, at the point $f_2(x)$.
In the style of the physicist's definition of tangent vectors (Definition 5.19), we can use the transformation law (7.2) as an alternative definition of a covector.
Proposition 7.3. Let $x$ be a point in a manifold $X$, and let $A_x$ denote the set of all charts containing $x$. A covector in $T_x^\star X$ is the same thing as a function
$$u : A_x \to \mathbb{R}^n, \quad (U, f) \mapsto \tilde{u}_f$$
such that, for any two charts $(U_1, f_1), (U_2, f_2) \in A_x$, we have:
$$\tilde{u}_{f_2} = \big( D\phi_{12}|_{f_2(x)} \big)^\top (\tilde{u}_{f_1})$$
Proof. This is proved in exactly the same way as Proposition 5.23. Let $\widehat{T}_x^\star X$ denote the set of all such functions $u$; this has an obvious vector space structure. For any chart $(U, f)$, the evaluation map
$$\mathrm{ev}_f : \widehat{T}_x^\star X \to \mathbb{R}^n, \quad u \mapsto \tilde{u}_f$$
is a linear isomorphism, by exactly the same argument that proved Lemma 5.21. Then the evident function from $T_x^\star X$ to $\widehat{T}_x^\star X$ must be a linear isomorphism, since if we pick any chart then we can factor it as:
$$T_x^\star X \xrightarrow{\;(\Delta_f^\star)^{-1}\;} \mathbb{R}^n \xrightarrow{\;\mathrm{ev}_f^{-1}\;} \widehat{T}_x^\star X$$
The analogue for covectors of the algebraist's definition of tangent vectors (Definition 5.32) is less obvious. It will take us a little while to describe, but in the end it is quite simple.
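Recall that we may think of a tangent vector $v \in T_x X$ as a partial derivative operator
$$\partial_v : C^\infty(X) \to \mathbb{R}$$
which is an element of $\mathrm{Der}_x(X)$. This operator can be evaluated either by differentiating functions along some curve $\sigma$ such that $[\sigma] = v \in T_x X$, or by choosing co-ordinates and then performing partial differentiation along the corresponding vector in $\mathbb{R}^n$.
Now let's turn this definition around. If we fix a smooth function $h \in C^\infty(X)$ then we can produce a linear map from $T_x X$ to $\mathbb{R}$, which we will denote by $dh|_x$, by defining:
$$dh|_x : T_x X \to \mathbb{R}, \quad v \mapsto \partial_v(h)$$
So to every smooth function $h \in C^\infty(X)$ there is an associated covector $dh|_x \in T_x^\star X$.
Example 7.4. Let $\tilde{h} : \mathbb{R}^2 \to \mathbb{R}$ be the function $\tilde{h} : (x, y) \mapsto x^2 + y$. The tangent space at a point $(x, y) \in \mathbb{R}^2$ is spanned by the two partial derivative operators $\frac{\partial}{\partial x}$ and $\frac{\partial}{\partial y}$ in $\mathrm{Der}_{(x,y)}(\mathbb{R}^2)$. So $d\tilde{h}|_{(x,y)}$ is the covector
$$d\tilde{h}|_{(x,y)} : T_{(x,y)} \mathbb{R}^2 \to \mathbb{R}$$
which sends $\frac{\partial}{\partial x}$ to $2x$ and $\frac{\partial}{\partial y}$ to $1$.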
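Example 7.4 can be reproduced numerically by treating the covector as "take the directional derivative of x² + y along a given vector". The sketch below uses central differences; the function names, the sample point and the step size are illustrative assumptions.

```python
def h(x, y):
    """The function from the example: h(x, y) = x**2 + y."""
    return x ** 2 + y

def dh(point, vector, step=1e-6):
    """Directional derivative of h at `point` along `vector` (central difference)."""
    (x, y), (a, b) = point, vector
    forward = h(x + step * a, y + step * b)
    backward = h(x - step * a, y - step * b)
    return (forward - backward) / (2 * step)

point = (1.5, -2.0)                 # arbitrary sample point
along_x = dh(point, (1.0, 0.0))     # should be close to 2 * 1.5 = 3
along_y = dh(point, (0.0, 1.0))     # should be close to 1
combined = dh(point, (2.0, 3.0))    # linearity suggests 2 * 3 + 3 * 1 = 9
print(along_x, along_y, combined)
```

The third value illustrates that the covector is a linear map on the tangent space: its value on 2 ∂/∂x + 3 ∂/∂y is the same combination of its values on the two basis vectors.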
85

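The previous example generalizes immediately to any open set $\tilde U \subset \mathbb{R}^n$. If we fix a point $\tilde y \in \tilde U$ then we have canonical identifications $T_{\tilde y}\tilde U \cong \mathrm{Der}_{\tilde y}(\tilde U) \cong \mathbb{R}^n$, and the standard basis vectors $e_1, ..., e_n \in \mathbb{R}^n$ correspond to the partial differentiation operators:
\[ \partial_{\tilde y, e_1}, \; ... \;, \partial_{\tilde y, e_n} \in \mathrm{Der}_{\tilde y}(\tilde U) \]
Now choose a function $\tilde h \in C^\infty(\tilde U)$. If we evaluate the associated covector $d\tilde h|_{\tilde y}$ against our basis vectors for $T_{\tilde y}\tilde U$ we get:
\[ d\tilde h|_{\tilde y} : \partial_{\tilde y, e_i} \mapsto \partial_{\tilde y, e_i}(\tilde h) = \left.\frac{\partial \tilde h}{\partial \tilde x_i}\right|_{\tilde y} \]
In particular, if we set $\tilde h$ to be the co-ordinate function $\tilde x_i \in C^\infty(\tilde U)$ then we get that
\[ d\tilde x_i|_{\tilde y} : \partial_{\tilde y, e_j} \mapsto \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases} \]
So $d\tilde x_1|_{\tilde y}, ..., d\tilde x_n|_{\tilde y}$ is the basis of $T_{\tilde y}^\star \tilde U$ dual to our standard basis. We can identify $T_{\tilde y}\tilde U$ with $\mathbb{R}^n$, so we can also identify $T_{\tilde y}^\star \tilde U$ with $\mathbb{R}^n$, and then each $d\tilde x_i|_{\tilde y}$ becomes the standard basis vector $e_i$.
For a general function $\tilde h \in C^\infty(\tilde U)$, we can think of $d\tilde h|_{\tilde y}$ as a vector in $\mathbb{R}^n$ whose entries are the partial derivatives of $\tilde h$ at $\tilde y$, or we may also write:
\[ d\tilde h|_{\tilde y} = \left.\frac{\partial \tilde h}{\partial \tilde x_1}\right|_{\tilde y} d\tilde x_1|_{\tilde y} + ... + \left.\frac{\partial \tilde h}{\partial \tilde x_n}\right|_{\tilde y} d\tilde x_n|_{\tilde y} \;\in T_{\tilde y}^\star \tilde U \tag{7.5} \]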
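As a quick numerical sanity check of Example 7.4, the sketch below approximates the values of $d\tilde h$ on the two partial derivative operators using central finite differences. The helper name `covector_dh` and the step size are illustrative assumptions, not part of these notes, and the results are only approximate.

```python
def h(x, y):
    """The function from Example 7.4."""
    return x**2 + y

def covector_dh(fn, point, eps=1e-6):
    """Approximate the values of d(fn) at `point` on d/dx and d/dy.

    Central finite differences are used, so this is an approximation.
    """
    x, y = point
    on_ddx = (fn(x + eps, y) - fn(x - eps, y)) / (2 * eps)
    on_ddy = (fn(x, y + eps) - fn(x, y - eps)) / (2 * eps)
    return (on_ddx, on_ddy)

# At the point (3, 5) the example predicts the values 2 * 3 and 1.
values = covector_dh(h, (3.0, 5.0))
```

The two entries of `values` are exactly the coefficients of $d\tilde x_1$ and $d\tilde x_2$ in the expansion (7.5).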

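If we now let $X$ be any manifold, and $h$ be a function in $C^\infty(X)$, the calculation we've just done tells us what the covector $dh|_x$ looks like in any co-ordinate chart. From the point-of-view of Proposition 7.3, the covector $dh|_x$ is the function
\[ dh|_x : \mathcal{A}_x \to \mathbb{R}^n \]
which sends a chart $(U, f)$ to the column vector
\[ \left( \left.\frac{\partial \tilde h}{\partial \tilde x_1}\right|_{f(x)}, \; ... \;, \left.\frac{\partial \tilde h}{\partial \tilde x_n}\right|_{f(x)} \right)^{\top} \in \mathbb{R}^n \]
where $\tilde h = h \circ f^{-1}$. Notice that this is just the transpose of the Jacobian matrix $D\tilde h|_{f(x)}$ for the smooth function:
\[ h : X \to \mathbb{R} \]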

We know that $dh|_x$ is a covector, so we know that the column vector $(D\tilde h|_{f(x)})^\top$ must obey the transformation law (7.2), but for the sake of completeness let's verify this explicitly. In fact we've already done the calculation, because we know that when we change charts the Jacobian matrix changes by the rule (4.8), so
\[ D\tilde h_2|_{f_2(x)} = D\tilde h_1|_{f_1(x)} \circ D\phi_{12}|_{f_2(x)} \]
and transposing this gives
\[ \left(D\tilde h_2|_{f_2(x)}\right)^\top = \left(D\phi_{12}|_{f_2(x)}\right)^\top \left(D\tilde h_1|_{f_1(x)}\right)^\top \]
as required.
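The transposed chain rule can also be checked numerically. The sketch below is only an illustration under assumed choices: chart 1 is the Cartesian chart on the plane, chart 2 is the polar chart, the test function is $x^2 + y$, and derivatives are approximated by central finite differences. None of these names come from the notes.

```python
import math

def grad(fn, p, eps=1e-6):
    """Central-difference gradient of fn at the point p (a tuple)."""
    out = []
    for i in range(len(p)):
        hi, lo = list(p), list(p)
        hi[i] += eps
        lo[i] -= eps
        out.append((fn(*hi) - fn(*lo)) / (2 * eps))
    return out

def h1(x, y):
    """The function written in the Cartesian chart."""
    return x**2 + y

def phi12(r, theta):
    """Transition function from polar to Cartesian co-ordinates."""
    return (r * math.cos(theta), r * math.sin(theta))

def h2(r, theta):
    """The same function written in the polar chart."""
    return h1(*phi12(r, theta))

def transformed(r, theta):
    """Apply the transpose of D(phi12) at (r, theta) to the Cartesian gradient."""
    gx, gy = grad(h1, phi12(r, theta))
    c, s = math.cos(theta), math.sin(theta)
    # D(phi12) = [[c, -r s], [s, r c]]; its transpose has rows (c, s) and (-r s, r c).
    return [c * gx + s * gy, -r * s * gx + r * c * gy]

lhs = grad(h2, (2.0, 0.7))      # gradient computed directly in the polar chart
rhs = transformed(2.0, 0.7)     # Cartesian gradient pushed through the law
```

Up to finite-difference error, `lhs` and `rhs` agree, which is the transformation law for this particular pair of charts.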
Now we give the algebraist's definition of a covector. As we've just seen, to any function $h \in C^\infty(X)$ there's an associated covector $dh|_x$. Furthermore, it's easy to check that this function
\[ C^\infty(X) \to T_x^\star X, \qquad h \mapsto dh|_x \]
is itself a linear map. However, the vector space $T_x^\star X$ is finite-dimensional, whereas the vector space $C^\infty(X)$ is (uncountably) infinite-dimensional, so this linear map must have an infinite-dimensional kernel. Using our expression for $dh|_x$ in co-ordinates, we can see that $dh|_x = 0$ iff in any chart $(U, f)$ containing $x$, the corresponding Jacobian matrix
\[ D\tilde h|_{f(x)} : \mathbb{R}^n \to \mathbb{R} \]
is the zero row vector. This is precisely the statement that the function $h$ has rank zero at the point $x$, because the rank of $h$ at $x$ must be either one or zero (since it is a smooth function from $X$ to $\mathbb{R}$), and if $D\tilde h|_{f(x)} \neq 0$ then it has rank one. So the kernel of our linear map $C^\infty(X) \to T_x^\star X$ is the subset
\[ R_x(X) \subset C^\infty(X) \]
of functions which have rank zero at $x$. In particular $R_x(X)$ must be a subspace.
Proposition 7.6. We have an isomorphism
\[ C^\infty(X)/R_x(X) \xrightarrow{\;\sim\;} T_x^\star X \]
given by $h \mapsto dh|_x$.

Proof. We've seen that $h \mapsto dh|_x$ defines a linear map from $C^\infty(X)$ to $T_x^\star X$, and that the kernel of this map is $R_x(X)$. It only remains to show that this map is a surjection, since then the Proposition follows by the First Isomorphism Theorem.
Pick a chart $(U, f)$ containing $x$, and let $\tilde y = f(x)$. Then we can identify $T_x^\star X$ with $T_{\tilde y}^\star \tilde U$, and this has a basis $d\tilde x_1|_{\tilde y}, ..., d\tilde x_n|_{\tilde y}$ dual to the standard basis $\partial_{\tilde y, e_1}, ..., \partial_{\tilde y, e_n}$ of $\mathrm{Der}_{\tilde y}(\tilde U)$. It's enough to show that each basis vector $d\tilde x_i|_{\tilde y}$ can be obtained from some smooth function on $X$.
Using a bump function (as in the proof of Proposition 6.12), for each $i$ we can find a smooth function $x_i \in C^\infty(X)$ such that when we write $x_i$ in our chart, it agrees with the co-ordinate function $\tilde x_i$ in some neighbourhood of $\tilde y$. This means that if we evaluate the covector $dx_i|_x$ against any tangent vector $\partial_{\tilde y, v} \in \mathrm{Der}_{\tilde y}(\tilde U)$ we get the answer:
\[ \partial_{\tilde y, v}(\tilde x_i) = d\tilde x_i|_{\tilde y}(\partial_{\tilde y, v}) \]
So in this chart, $dx_i|_x$ is exactly $d\tilde x_i|_{\tilde y}$.
So from this point-of-view, a covector at $x$ is just an equivalence class of smooth functions on $X$, where we regard two functions as equivalent if their difference has rank zero at $x$.
Incidentally, this shows yet another possible way to define the tangent space $T_x X$. We could define $T_x^\star X$ to be the quotient of $C^\infty(X)$ by $R_x(X)$, and then define $T_x X$ to be the dual space to $T_x^\star X$. This is certainly the quickest definition to write down!

7.2 The cotangent bundle and one-forms

Just as we did for the tangent bundle, we can take all the cotangent spaces $T_x^\star X$ for each $x \in X$, and assemble them together to get a set
\[ T^\star X = \bigcup_{x \in X} T_x^\star X \]
called the cotangent bundle. This comes with a projection function
\[ \pi : T^\star X \to X \]
which sends a covector $u \in T_x^\star X$ to the corresponding point $x \in X$.
As the name suggests, the cotangent bundle is actually a vector bundle over $X$, of rank $n$. The argument is essentially identical to the case of the tangent bundle. For any chart $(U, f)$ on $X$, we get a bijection
\[ T^\star U \to \tilde U \times \mathbb{R}^n, \qquad (x, u) \mapsto \left( f(x), \; (\Delta_f^\star)^{-1}(u) \right) \]

and if $W = U_1 \cap U_2$ is the intersection of two charts then the two bijections are related by the diffeomorphism:
\[ f_1(W) \times \mathbb{R}^n \to f_2(W) \times \mathbb{R}^n, \qquad (\tilde x, \tilde u) \mapsto \left( \phi_{21}(\tilde x), \; (D\phi_{12}|_{f_2(x)})^\top (\tilde u) \right) \]
where $x = f_1^{-1}(\tilde x)$. This means we can put a topology on $T^\star X$ such that these bijections are co-ordinate charts, and then we have a smooth atlas of exactly the form required by Definition 6.14.
Definition 7.7. Let $X$ be a manifold. A covector field, or one-form, on $X$ is a section of the cotangent bundle, i.e. a smooth function
\[ \alpha : X \to T^\star X \]
such that $\pi \circ \alpha = 1_X$.
The reason for the name one-form is that this is the first case of a more general object called a p-form, where $p$ can be any natural number. We will meet p-forms later on.
If our manifold is simply an open set $\tilde U \subset \mathbb{R}^n$ then the cotangent bundle (like the tangent bundle) is just $\tilde U \times \mathbb{R}^n$, and a one-form is the same thing as a smooth function:
\[ \tilde\alpha : \tilde U \to \mathbb{R}^n \]
So on $\tilde U$, it's difficult to see the difference between one-forms and vector fields. On more complicated manifolds we do see the difference, because the way that one-forms change when we change co-ordinates is different from the way that vector fields change. If $\alpha$ is a one-form on $X$, and we have two charts $(U_1, f_1)$ and $(U_2, f_2)$, then in each chart $\alpha$ becomes a smooth function:
\[ \tilde\alpha_1 : \tilde U_1 \to \mathbb{R}^n, \qquad \tilde\alpha_2 : \tilde U_2 \to \mathbb{R}^n \]
For points $x \in U_1 \cap U_2$, we have the transformation law
\[ \tilde\alpha_2|_{f_2(x)} = (D\phi_{12}|_{f_2(x)})^\top \, \tilde\alpha_1|_{f_1(x)} \tag{7.8} \]
and this is different from the transformation law for vector fields.
As we did for vector fields (Proposition 6.8), we could take these properties as an alternative definition of a one-form. Also note that we can specify a one-form on $X$ by choosing an atlas $\mathcal{A} = \{(U_i, f_i)\}$ for $X$ and choosing functions
\[ \tilde\alpha_i : \tilde U_i \to \mathbb{R}^n \]
such that the correct transformation law (7.8) holds.


In many ways one-forms are more important than vector fields. One reason for this is the following: for any smooth function $h \in C^\infty(X)$, there is an associated one-form on $X$.
Recall that for any fixed point $x \in X$, our smooth function $h$ determines a covector $dh|_x \in T_x^\star X$. Immediately this defines a function
\[ dh : X \to T^\star X, \qquad x \mapsto dh|_x \]
with $\pi \circ dh = 1_X$, but before we can declare that this is a one-form we need to check that it is smooth.
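To check smoothness we need to work in co-ordinates, so let's assume that $X$ is just an open set $\tilde U \subset \mathbb{R}^n$. If we take a smooth function $\tilde h \in C^\infty(\tilde U)$, then at any point $\tilde y \in \tilde U$ we've already calculated that $d\tilde h|_{\tilde y} \in T_{\tilde y}^\star \tilde U \cong \mathbb{R}^n$ is the vector of all partial derivatives of $\tilde h$ at $\tilde y$, so $d\tilde h$ is the function:
\[ d\tilde h : \tilde U \to \mathbb{R}^n, \qquad \tilde y \mapsto \left( \left.\frac{\partial \tilde h}{\partial \tilde x_1}\right|_{\tilde y}, \; ... \;, \left.\frac{\partial \tilde h}{\partial \tilde x_n}\right|_{\tilde y} \right)^{\top} \]
This is a smooth function, since the partial derivatives of $\tilde h$ vary smoothly with $\tilde y$. In particular, notice that if we set $\tilde h$ to be one of the co-ordinate functions $\tilde x_i$ on $\tilde U$, then $d\tilde x_i$ is the constant function sending every point in $\tilde U$ to the standard basis vector $e_i \in \mathbb{R}^n$. This means that
\[ d\tilde h = \frac{\partial \tilde h}{\partial \tilde x_1} \, d\tilde x_1 + ... + \frac{\partial \tilde h}{\partial \tilde x_n} \, d\tilde x_n \]
which is a very attractive equation.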

If we return to a general manifold $X$, the previous discussion shows that for any smooth function $h \in C^\infty(X)$ the function $dh$ is indeed a (smooth) one-form, since in a co-ordinate chart $(U, f)$ the function $dh$ becomes the smooth function:
\[ d(h \circ f^{-1}) : \tilde U \to \mathbb{R}^n \]
Example 7.9. Let $X = S^1$, and take the polar co-ordinate chart where:
\[ f^{-1} : (-\pi, \pi) \to S^1 \setminus (-1, 0), \qquad \theta \mapsto (\cos\theta, \sin\theta) \]
Let $h \in C^\infty(S^1)$ be the function $h : (x, y) \mapsto x^2$. Then in this chart $h$ becomes the function $\tilde h : \theta \mapsto \cos^2\theta$ and the one-form $dh$ becomes the function:
\[ d\tilde h : (-\pi, \pi) \to \mathbb{R}, \qquad \theta \mapsto -2\sin\theta\cos\theta = -\sin 2\theta \]
We can also write $d\tilde h = -\sin 2\theta \, d\theta$, since $d\theta : (-\pi, \pi) \to \mathbb{R}$ is the constant function with value 1.
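The coefficient in Example 7.9 is just the derivative of $\cos^2\theta$, namely $-\sin 2\theta$, and this is easy to confirm numerically. The following is a minimal sketch using a central finite difference; the function names and sample angles are illustrative assumptions.

```python
import math

def h_chart(theta):
    """h = x^2 restricted to the circle, written in the angle chart."""
    return math.cos(theta) ** 2

def dh_coefficient(theta, eps=1e-6):
    """Finite-difference estimate of the coefficient of d(theta) in dh."""
    return (h_chart(theta + eps) - h_chart(theta - eps)) / (2 * eps)

# Compare against -sin(2 theta) at a few sample angles in (-pi, pi).
samples = [-2.0, -0.5, 0.3, 1.2, 3.0]
errors = [abs(dh_coefficient(t) + math.sin(2 * t)) for t in samples]
```

Each entry of `errors` should be tiny, limited only by the finite-difference approximation.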
The second reason that one-forms are important is that they behave nicely with respect to smooth functions between manifolds. Suppose
\[ F : X \to Y \]
is a smooth function between two manifolds, and we have a one-form $\alpha$ on the manifold $Y$. For any point $x \in X$ we have a linear map
\[ DF|_x : T_x X \to T_{F(x)} Y \]
and dually we have a linear map:
\[ DF|_x^\star : T_{F(x)}^\star Y \to T_x^\star X \]
This means that for any point $x$ in $X$ we can get a covector in $T_x^\star X$ by defining:
\[ (F^\star \alpha)|_x = DF|_x^\star \left( \alpha|_{F(x)} \right) \]
Unsurprisingly, we claim that this covector $(F^\star\alpha)|_x$ varies smoothly with $x$, so we have a smooth one-form
\[ F^\star \alpha : x \mapsto (F^\star\alpha)|_x \in T_x^\star X \]
on the manifold $X$. This is called the pull-back of $\alpha$ along $F$. We showed earlier that it's possible to pull back vector fields, but only if the function $F$ is a diffeomorphism. One-forms can be pulled back along any smooth function.
To check the smoothness of $F^\star\alpha$ we of course have to work in co-ordinates, so let's consider the case where $X$ is an open set $\tilde U \subset \mathbb{R}^n$ and $Y$ is an open set $\tilde V \subset \mathbb{R}^k$, and we have a smooth function:
\[ F = (F_1, ..., F_k) : \tilde U \to \tilde V \]
Let $\tilde x_1, ..., \tilde x_n$ be the co-ordinate functions on $\tilde U$, and $\tilde y_1, ..., \tilde y_k$ be the co-ordinate functions on $\tilde V$. Let's start by calculating the pull-back of the constant one-form:
\[ d\tilde y_j : \tilde V \to \mathbb{R}^k \]

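By definition, $F^\star d\tilde y_j$ is the function
\[ F^\star d\tilde y_j : \tilde U \to \mathbb{R}^n \]
which sends a point $\tilde x$ to the column vector
\[ \left( \left.\frac{\partial F_j}{\partial \tilde x_1}\right|_{\tilde x}, \; ... \;, \left.\frac{\partial F_j}{\partial \tilde x_n}\right|_{\tilde x} \right)^{\top} \in \mathbb{R}^n \]
that we get by applying the matrix $(DF|_{\tilde x})^\top$ to the $j$-th standard basis vector in $\mathbb{R}^k$. This is a smooth function, since the partial derivatives of $F_j$ vary smoothly with $\tilde x$. Also, we have that
\[ F^\star d\tilde y_j = \frac{\partial F_j}{\partial \tilde x_1} \, d\tilde x_1 + ... + \frac{\partial F_j}{\partial \tilde x_n} \, d\tilde x_n \tag{7.10} \]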
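which is another nice equation. A general one-form $\tilde\alpha$ on $\tilde V$ can be written as
\[ \tilde\alpha = \tilde\alpha_1 \, d\tilde y_1 + ... + \tilde\alpha_k \, d\tilde y_k \]
for $k$ smooth functions $\tilde\alpha_1, ..., \tilde\alpha_k \in C^\infty(\tilde V)$. Then
\[ F^\star\tilde\alpha|_{\tilde x} = (DF|_{\tilde x})^\star \left( \tilde\alpha_1|_{F(\tilde x)} \, d\tilde y_1|_{F(\tilde x)} + ... + \tilde\alpha_k|_{F(\tilde x)} \, d\tilde y_k|_{F(\tilde x)} \right) \]
so
\[ F^\star\tilde\alpha = (\tilde\alpha_1 \circ F) \, F^\star d\tilde y_1 + ... + (\tilde\alpha_k \circ F) \, F^\star d\tilde y_k \]
since each $(DF|_{\tilde x})^\star$ is a linear map. Each $F^\star d\tilde y_j$ is a smooth function from $\tilde U$ to $\mathbb{R}^n$, and each $\tilde\alpha_j \circ F$ is a smooth function from $\tilde U$ to $\mathbb{R}$, so $F^\star\tilde\alpha$ is smooth.
This local calculation proves that on general manifolds, the pull-back of a one-form is indeed a (smooth) one-form.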
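In co-ordinates, the pull-back of a one-form at a point amounts to multiplying its coefficient vector by the transposed Jacobian. The sketch below illustrates this under assumptions of our own: the Jacobian is approximated by central finite differences, and the sample map is the polar-to-Cartesian map, chosen only as a test case. The names `jacobian` and `pullback` are hypothetical helpers.

```python
import math

def jacobian(F, x, eps=1e-6):
    """Central-difference Jacobian of F at x; J[j][i] approximates dF_j/dx_i."""
    n = len(x)
    k = len(F(*x))
    J = [[0.0] * n for _ in range(k)]
    for i in range(n):
        hi, lo = list(x), list(x)
        hi[i] += eps
        lo[i] -= eps
        F_hi, F_lo = F(*hi), F(*lo)
        for j in range(k):
            J[j][i] = (F_hi[j] - F_lo[j]) / (2 * eps)
    return J

def pullback(F, alpha, x):
    """Coefficients of the pulled-back one-form at x.

    `alpha(y_1, ..., y_k)` returns the k coefficients of dy_1, ..., dy_k.
    The result is the transposed Jacobian applied to alpha at F(x).
    """
    J = jacobian(F, x)
    a = alpha(*F(*x))
    return [sum(J[j][i] * a[j] for j in range(len(a))) for i in range(len(x))]

def polar(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))

def dx(x, y):
    """The constant one-form dx on the plane."""
    return (1.0, 0.0)

result = pullback(polar, dx, (2.0, 0.5))
```

For this sample map, `result` approximates the pair $(\cos\theta, -r\sin\theta)$ at $(r, \theta) = (2, 0.5)$, in agreement with formula (7.10) applied to the first component of the map.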
Example 7.11. Consider the inclusion map:
\[ \iota : S^1 \to \mathbb{R}^2 \]
On $\mathbb{R}^2$ we have two (constant) one-forms $dx$ and $dy$. Let's calculate the one-forms $\iota^\star dx$ and $\iota^\star dy$ on $S^1$. If we think of the tangent space to a point $(x, y) \in S^1$ as the subspace
\[ T_{(x,y)} S^1 \subset \mathbb{R}^2 \]
spanned by $(-y, x)^\top$, then $D\iota|_{(x,y)}$ is just the inclusion of this subspace, and hence
\[ \iota^\star dx|_{(x,y)} : T_{(x,y)} S^1 \to \mathbb{R} \]
is the linear map which sends $(-y, x)$ to $-y$. Similarly $\iota^\star dy|_{(x,y)}$ sends $(-y, x)$ to $x$.
Now let's look at these one-forms on $S^1$ in co-ordinates. We've seen (Examples 5.17 and 6.4) that if we use polar co-ordinates $(x, y) = (\cos\theta, \sin\theta)$ then the tangent vector $(-y, x)$ corresponds to the unit tangent vector $\frac{\partial}{\partial\theta}$. This means that in this chart, $\iota^\star dx$ becomes the one-form $-\sin\theta \, d\theta$ and $\iota^\star dy$ becomes the one-form $\cos\theta \, d\theta$.
If we choose our one-form $\alpha$ on $Y$ to be of the form $\alpha = dh$ for some $h \in C^\infty(Y)$, then there is another way to say what the pull-back of $\alpha$ is.
Proposition 7.12. If $F : X \to Y$ is a smooth function and $h \in C^\infty(Y)$ then:
\[ F^\star dh = d(h \circ F) \]
Proof. Fix a point $x \in X$ and let $y = F(x)$. By definition, $(F^\star dh)|_x$ is the covector $(DF|_x)^\star(dh|_y) \in T_x^\star X$. Now choose any derivation $d \in \mathrm{Der}_x(X) = T_x X$. By the definition of a dual linear map, we have that $(DF|_x)^\star(dh|_y)$ sends $d$ to the real number $dh|_y(DF|_x(d))$. By the definition of the covector $dh|_y$ this number is $(DF|_x(d))(h)$, and this equals $d(h \circ F)$ by Lemma 5.33. So we've shown, for any point $x \in X$, that $(F^\star dh)|_x$ is the covector:
\[ (F^\star dh)|_x : T_x X \to \mathbb{R}, \qquad d \mapsto d(h \circ F) \]
This proves that $F^\star dh = d(h \circ F)$.
In particular, suppose that $\tilde U \subset \mathbb{R}^n$ and $\tilde V \subset \mathbb{R}^k$ are open sets, and $F = (F_1, ..., F_k) : \tilde U \to \tilde V$ is a smooth function. If we take one of the co-ordinate functions $\tilde y_j \in C^\infty(\tilde V)$, then $\tilde y_j \circ F$ is just $F_j$, and the proposition we've just proved says that
\[ F^\star d\tilde y_j = dF_j \]
which is exactly what we saw in the equation (7.10).
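A small worked instance of Proposition 7.12 may help. The map $F$ and the function $h$ below are chosen purely for illustration and do not appear elsewhere in these notes.

```latex
% Illustrative choice: F : \mathbb{R} \to \mathbb{R}^2 and h \in C^\infty(\mathbb{R}^2)
\begin{align*}
F(t) &= (t, t^2), \qquad h(x, y) = xy, \qquad dh = y\,dx + x\,dy \\
F^\star dx &= dt, \qquad F^\star dy = 2t\,dt \\
F^\star dh &= (t^2)\,dt + (t)(2t)\,dt = 3t^2\,dt \\
d(h \circ F) &= d(t^3) = 3t^2\,dt
\end{align*}
```

Both routes give $3t^2\,dt$, as the proposition predicts.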
Example 7.13. Let $\iota : S^1 \hookrightarrow \mathbb{R}^2$ be the inclusion map, and let $\hat h \in C^\infty(\mathbb{R}^2)$ be the function $\hat h : (x, y) \mapsto x^2$. Then $h = \hat h \circ \iota \in C^\infty(S^1)$ is the function considered in Example 7.9, and we calculated there that if we look at $dh$ in polar co-ordinates we get $-2\sin\theta\cos\theta \, d\theta$. On the other hand, we have that:
\[ d\hat h = 2x \, dx \]
In Example 7.11 we saw that if we look at $\iota^\star dx$ in polar co-ordinates we will get $-\sin\theta \, d\theta$, and therefore if we look at $\iota^\star d\hat h$ in polar co-ordinates we must get $-2\cos\theta\sin\theta \, d\theta$. These two calculations had to give the same answer, since Proposition 7.12 says that $\iota^\star d\hat h = dh$.

It's worth noticing that the transformation law for one-forms (7.8) is actually a special case of this pull-back operation. Suppose we have two charts $(U_1, f_1)$ and $(U_2, f_2)$ on $X$, so we have a transition function:
\[ \phi_{21} : f_1(U_1 \cap U_2) \to f_2(U_1 \cap U_2) \]
Now pick a one-form $\alpha$ on $X$, which in our charts becomes:
\[ \tilde\alpha_1 : \tilde U_1 \to \mathbb{R}^n \qquad \text{and} \qquad \tilde\alpha_2 : \tilde U_2 \to \mathbb{R}^n \]
Then on the overlap, the one-forms $\tilde\alpha_1$ and $\tilde\alpha_2$ are related by pull-back along the inverse transition function $\phi_{12}$, i.e. we have
\[ \tilde\alpha_2 = \phi_{12}^\star \tilde\alpha_1 \]
on the open set $f_2(U_1 \cap U_2)$.

8 Differential forms

In this section we're going to generalize one-forms to p-forms, where $p$ can be any natural number. We have to start by discussing quite a lot of multi-linear algebra for vector spaces, but once we've got that out of the way the definition of p-forms on manifolds will be quite easy.

8.1 Antisymmetric multi-linear maps and the wedge product

Let $V$ be a vector space, of dimension $n$. We've previously considered the dual space of all linear maps $u : V \to \mathbb{R}$; now we're going to consider bilinear maps
\[ b : V \times V \to \mathbb{R} \]
i.e. maps which are linear in each argument. If we choose a basis $e_1, ..., e_n$ for $V$ then $b$ is determined by its values on each pair of basis elements, and we can view this data as an n-by-n matrix $B$ with entries:
\[ B_{ij} = b(e_i, e_j) \]
Conversely any such matrix $B$ specifies a bilinear map $b$, by extending linearly in each argument. The space of all such bilinear maps forms a vector space under point-wise addition and scalar multiplication, and once we choose a basis for $V$ this vector space becomes identified with the vector space $\mathrm{Mat}_{n \times n}(\mathbb{R})$. So its dimension is $n^2$.

In fact we're only going to be interested in antisymmetric bilinear maps, which means that
\[ b(v, \hat v) = -b(\hat v, v) \]
for all $v, \hat v \in V$. If we choose a basis for $V$, these correspond to antisymmetric matrices. It's easy to check that the antisymmetric maps form a subspace of the space of all bilinear maps, so the set of all antisymmetric bilinear maps is a vector space. We denote it by:
\[ \wedge^2 V^\star \]
Now suppose we have two elements $u, \hat u$ of the space $V^\star$. We can combine them to form an element of $\wedge^2 V^\star$, by setting:
\[ u \wedge \hat u : V \times V \to \mathbb{R}, \qquad (v, \hat v) \mapsto u(v)\,\hat u(\hat v) - u(\hat v)\,\hat u(v) \]
Clearly $u \wedge \hat u$ is an antisymmetric bilinear map. We call it the wedge product of $u$ and $\hat u$. This wedge product is an important structure; we can think of it as a kind of multiplication:
\[ \wedge : V^\star \times V^\star \to \wedge^2 V^\star, \qquad (u, \hat u) \mapsto u \wedge \hat u \]
It's a straight-forward exercise to check that this product is itself bilinear. It's also antisymmetric, since it's clear from the definition that:
\[ u \wedge \hat u = -\hat u \wedge u \]
Now pick a basis $e_1, ..., e_n$ for $V$, and let $\epsilon_1, ..., \epsilon_n$ be the dual basis for $V^\star$. Choose a pair $i, j \in [1, n]$ with $i < j$, and form the bilinear map $\epsilon_i \wedge \epsilon_j \in \wedge^2 V^\star$. Applying this to pairs of basis vectors in $V$ we get:
\[ \epsilon_i \wedge \epsilon_j : (e_s, e_t) \mapsto \begin{cases} 1, & s = i \text{ and } t = j \\ -1, & s = j \text{ and } t = i \\ 0, & \text{otherwise} \end{cases} \]
So $\epsilon_i \wedge \epsilon_j$ corresponds to the matrix with a $1$ in the $(i, j)$ position (which is above the diagonal), a $-1$ in the $(j, i)$ position (which is below the diagonal), and zeroes everywhere else. Clearly this set of matrices forms a basis for the space of all antisymmetric n-by-n matrices, so the set
\[ \{ \epsilon_i \wedge \epsilon_j \; ; \; i < j \} \subset \wedge^2 V^\star \]
is a basis. In particular, we have:
\[ \dim \wedge^2 V^\star = \binom{n}{2} \]
We can use these bases to describe the wedge product explicitly. If we take two elements
\[ u = \lambda_1 \epsilon_1 + ... + \lambda_n \epsilon_n \qquad \text{and} \qquad \hat u = \mu_1 \epsilon_1 + ... + \mu_n \epsilon_n \quad \in V^\star \]
(here $\lambda_1, ..., \lambda_n$ and $\mu_1, ..., \mu_n$ are just real numbers) then their wedge product is:
\[ u \wedge \hat u = (\lambda_1\mu_2 - \lambda_2\mu_1)\, \epsilon_1 \wedge \epsilon_2 + (\lambda_1\mu_3 - \lambda_3\mu_1)\, \epsilon_1 \wedge \epsilon_3 + ... + (\lambda_{n-1}\mu_n - \lambda_n\mu_{n-1})\, \epsilon_{n-1} \wedge \epsilon_n \]
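Example 8.1. Let $V = \mathbb{R}^3$, and $e_1, e_2, e_3$ be the standard basis. Then $V^\star$ is also $\mathbb{R}^3$, and $\epsilon_1, \epsilon_2, \epsilon_3$ is again the standard basis. The dimension of $\wedge^2 V^\star$ is $\binom{3}{2} = 3$, and it has a basis:
\[ \{ \epsilon_2 \wedge \epsilon_3, \; \epsilon_1 \wedge \epsilon_3, \; \epsilon_1 \wedge \epsilon_2 \} \]
Using these bases, the wedge product of two vectors $(\lambda_1, \lambda_2, \lambda_3)$ and $(\mu_1, \mu_2, \mu_3)$ is:
\[ (\lambda_2\mu_3 - \lambda_3\mu_2, \; \lambda_1\mu_3 - \lambda_3\mu_1, \; \lambda_1\mu_2 - \lambda_2\mu_1) \]
If we flip the sign of the basis vector $\epsilon_1 \wedge \epsilon_3$, then the formula above becomes the usual cross-product of vectors in $\mathbb{R}^3$. This explains why there is no direct analogue of the cross-product in other dimensions, since if $n \neq 3$ then $\binom{n}{2} \neq n$. In other dimensions, the cross-product of two vectors is really the wedge product, and it lands in $\mathbb{R}^{\binom{n}{2}}$.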
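The coefficient formula for the wedge product of two covectors, and its relation to the cross-product in dimension three, can be sketched in a few lines. The function names below are illustrative, and covectors are represented simply by their coefficient tuples in the dual basis.

```python
from itertools import combinations

def wedge2(u, v):
    """Coefficients of the wedge product of two covectors.

    The result is keyed by pairs (i, j) with i < j (1-based), and gives the
    coefficient of the basis element indexed by that pair.
    """
    n = len(u)
    return {(i + 1, j + 1): u[i] * v[j] - u[j] * v[i]
            for i, j in combinations(range(n), 2)}

def cross(u, v):
    """The usual cross-product on R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

u, v = (1, 2, 3), (4, 5, 6)
w = wedge2(u, v)
# Order the basis as (2,3), (1,3), (1,2) and flip the sign of the middle entry.
as_cross = (w[(2, 3)], -w[(1, 3)], w[(1, 2)])
```

In dimension four the same function returns six coefficients, which is why the result can no longer be identified with a vector in the original space.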
We know that a linear map $F : V \to W$ induces a dual linear map $F^\star : W^\star \to V^\star$. It also induces a linear map
\[ \wedge^2 F^\star : \wedge^2 W^\star \to \wedge^2 V^\star \]
defined by
\[ \wedge^2 F^\star(b) : (v, \hat v) \mapsto b\left( F(v), F(\hat v) \right) \in \mathbb{R} \]
for $b \in \wedge^2 W^\star$ and $v, \hat v \in V$. It's easy to check that $\wedge^2 F^\star(b)$ really is an element of $\wedge^2 V^\star$, and that $\wedge^2 F^\star$ really is linear in $b$. In terms of matrices, moving from $F$ to $\wedge^2 F^\star$ is a rather complicated operation that turns a k-by-n matrix into an $\binom{n}{2}$-by-$\binom{k}{2}$ matrix, and it's generally easier to work with the abstract definition.

It follows immediately from the definition that if $F : V \to W$ and $G : W \to U$ are two linear maps then we have:
\[ \wedge^2 (G \circ F)^\star = \wedge^2 F^\star \circ \wedge^2 G^\star \]
In particular, if $F$ is an isomorphism, then so is $\wedge^2 F^\star$. If you know what a functor is, this says that the operation which sends $V$ to $\wedge^2 V^\star$ is a contravariant functor (as is the operation which sends $V$ to $V^\star$).

Now fix a natural number $p$. We're going to generalize from $\wedge^2 V^\star$ to $\wedge^p V^\star$.
For any $p$ we can consider p-linear maps from $V$ to $\mathbb{R}$:
\[ c : V^{\times p} \to \mathbb{R} \]
where $V^{\times p}$ means $V \times ...(p \text{ times})... \times V$. If we choose a basis $e_1, ..., e_n$ for $V$ then $c$ gives us a p-dimensional array of numbers
\[ C_{i_1, ..., i_p} = c(e_{i_1}, ..., e_{i_p}) \]
by evaluating $c$ on each p-tuple of basis vectors, and conversely any such array of numbers determines a p-linear map by extending linearly in each argument. Hopefully it's clear that the set of all p-linear maps from $V$ to $\mathbb{R}$ is a vector space, of dimension $n^p$.
We say that $c$ is antisymmetric if $c$ flips sign when we swap any two of its arguments, i.e.
\[ c(v_1, ..., v_p) = -c(v_{\sigma(1)}, ..., v_{\sigma(p)}) \]
for any transposition $\sigma$ acting on the set $\{1, ..., p\}$. In particular this means that if we set any two of its arguments to be the same vector, then $c$ must give the answer zero. If we apply a more general permutation $\sigma \in S_p$, we must have:
\[ c(v_1, ..., v_p) = (-1)^\sigma c(v_{\sigma(1)}, ..., v_{\sigma(p)}) \]
where $(-1)^\sigma$ is our notation for the sign of the permutation $\sigma$.
The antisymmetric maps form a subspace of the space of all p-linear maps from $V$ to $\mathbb{R}$, and we denote this vector space by:
\[ \wedge^p V^\star \]
We've seen that two elements of $V^\star$ can be wedged together to get an element of $\wedge^2 V^\star$. Similarly, if we have $p$ elements $u_1, ..., u_p$ of $V^\star$, then we can combine them to get an element of $\wedge^p V^\star$, which we denote by $u_1 \wedge ... \wedge u_p$. We define it to be the map
\[ u_1 \wedge ... \wedge u_p : V^{\times p} \to \mathbb{R} \]
which sends a p-tuple $(v_1, ..., v_p)$ to the real number:
\[ \sum_{\sigma \in S_p} (-1)^\sigma \, u_1(v_{\sigma(1)}) \, ... \, u_p(v_{\sigma(p)}) \in \mathbb{R} \tag{8.2} \]
This map is clearly linear in each argument, and by construction it's antisymmetric, so it is indeed an element of $\wedge^p V^\star$.
Hence we've defined a p-fold wedge product:
\[ (V^\star)^{\times p} \to \wedge^p V^\star, \qquad (u_1, ..., u_p) \mapsto u_1 \wedge ... \wedge u_p \]
This product is p-linear and antisymmetric, since the expression (8.2) is linear in each $u_i$, and changes sign if we switch any $u_i$ and $u_j$. In particular if any two $u_i$ and $u_j$ are equal then we get the zero element of $\wedge^p V^\star$.
Now suppose we choose a basis $e_1, ..., e_n$ for $V$, so we get a dual basis $\epsilon_1, ..., \epsilon_n$ for $V^\star$. We can produce elements in $\wedge^p V^\star$ by picking a p-tuple $(i_1, ..., i_p)$ of integers in $[1, n]$ and then forming the wedge product $\epsilon_{i_1} \wedge ... \wedge \epsilon_{i_p}$. If any entries in our p-tuple are repeated then this product must be zero.
If our p-tuple contains no repeated entries, then it must be of the form $(j_{\sigma(1)}, ..., j_{\sigma(p)})$ for some correctly-ordered p-tuple $j_1 < ... < j_p$ and some permutation $\sigma \in S_p$. Then antisymmetry implies that:
\[ \epsilon_{j_{\sigma(1)}} \wedge ... \wedge \epsilon_{j_{\sigma(p)}} = (-1)^\sigma \, \epsilon_{j_1} \wedge ... \wedge \epsilon_{j_p} \]
So up to sign, this procedure creates one element of $\wedge^p V^\star$ for each subset of $[1, n]$ of size $p$.
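The defining sum (8.2) translates directly into code. The sketch below evaluates a p-fold wedge product of covectors on a p-tuple of vectors, with covectors and vectors both stored as coefficient tuples; the function names are illustrative assumptions, and the brute-force sum over all permutations is only practical for small $p$.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation of (0, ..., p-1), computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def apply_covector(u, v):
    """Evaluate the covector u on the vector v, both given in co-ordinates."""
    return sum(a * b for a, b in zip(u, v))

def wedge(covectors, vectors):
    """Evaluate the wedge of the covectors on the tuple of vectors via (8.2)."""
    p = len(covectors)
    total = 0
    for perm in permutations(range(p)):
        term = sign(perm)
        for k in range(p):
            term *= apply_covector(covectors[k], vectors[perm[k]])
        total += term
    return total

us = [(1, 0, 2), (0, 1, 1), (3, 1, 0)]
vs = [(1, 2, 0), (0, 1, 4), (2, 0, 1)]
value = wedge(us, vs)
swapped = wedge(us, [vs[1], vs[0], vs[2]])    # swap two arguments
repeated = wedge(us, [vs[0], vs[0], vs[2]])   # repeat an argument
```

Swapping two arguments flips the sign of the result, and repeating an argument gives zero, which are the two antisymmetry properties stated above.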

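Proposition 8.3. Let $e_1, ..., e_n$ be a basis for $V$, and let $\epsilon_1, ..., \epsilon_n$ be the dual basis for $V^\star$. Then the set of elements
\[ \left\{ \epsilon_{i_1} \wedge \epsilon_{i_2} \wedge ... \wedge \epsilon_{i_p} \; \middle| \; 1 \leq i_1 < i_2 < ... < i_p \leq n \right\} \subset \wedge^p V^\star \]
is a basis. In particular:
\[ \dim \wedge^p V^\star = \binom{n}{p} \]
If $V = \mathbb{R}^n$ and $e_1, ..., e_n$ is the standard basis, then this proposition provides us with a basis for the space $\wedge^p (\mathbb{R}^n)^\star$. This means that we can identify $\wedge^p (\mathbb{R}^n)^\star$ with $\mathbb{R}^{\binom{n}{p}}$ if we wish, but this is not quite canonical, since there's no preferred way to order the basis vectors in $\wedge^p (\mathbb{R}^n)^\star$.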

Proof. Choose a correctly-ordered p-tuple $i_1 < ... < i_p$ with each entry in $[1, n]$, and form the p-linear map $\epsilon_{i_1} \wedge ... \wedge \epsilon_{i_p} \in \wedge^p V^\star$. Now take an arbitrary p-tuple $(j_1, ..., j_p)$ of numbers from the set $[1, n]$, and consider evaluating the map $\epsilon_{i_1} \wedge ... \wedge \epsilon_{i_p}$ on the p-tuple of basis vectors
\[ (e_{j_1}, ..., e_{j_p}) \in V^{\times p} \]
using the defining formula (8.2). We can only get a non-zero result if the p-tuple $(j_1, ..., j_p)$ is a permutation of the p-tuple $(i_1, ..., i_p)$; in particular there must be no repetitions in the first p-tuple. So we have
\[ \epsilon_{i_1} \wedge ... \wedge \epsilon_{i_p} : (e_{i_{\sigma(1)}}, ..., e_{i_{\sigma(p)}}) \mapsto (-1)^\sigma \]
for each $\sigma \in S_p$, and it vanishes on every other p-tuple of basis vectors.
A general antisymmetric p-linear map $c$ is determined by its values on each p-tuple of basis vectors for $V$. It must vanish on p-tuples containing any repetition, and if $j_1 < ... < j_p$ is a correctly-ordered p-tuple then we must have
\[ c : (e_{j_{\sigma(1)}}, ..., e_{j_{\sigma(p)}}) \mapsto (-1)^\sigma c(e_{j_1}, ..., e_{j_p}) \]
for each $\sigma \in S_p$. This means that $c$ can be written as a linear combination
\[ c = \sum_{i_1 < ... < i_p} c(e_{i_1}, ..., e_{i_p}) \, \epsilon_{i_1} \wedge ... \wedge \epsilon_{i_p} \]
since both sides agree on any p-tuple of basis vectors for $V$.
This shows that this set of elements spans $\wedge^p V^\star$. Furthermore if some linear combination of them gives the zero map then each coefficient must be zero, so they're linearly independent.
So the spaces $\wedge^p V^\star$ initially get larger as we increase $p$, but once $p > n/2$ they start to get smaller again, and indeed we have a symmetry
\[ \dim \wedge^p V^\star = \dim \wedge^{n-p} V^\star \]
for $1 \leq p \leq n - 1$. The space $\wedge^n V^\star$ is only 1-dimensional, so there is a unique antisymmetric n-linear map from $V$ to $\mathbb{R}$, up to scale. For $p > n$ the space $\wedge^p V^\star$ is zero-dimensional, so there are no antisymmetric p-linear maps at all, except for the zero map. We can extend down to $p = 0$ by declaring that $\wedge^0 V^\star = \mathbb{R}$. This is sensible for many reasons; for example it means that $\dim \wedge^0 V^\star = 1 = \binom{n}{0}$.
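The basis in Proposition 8.3 is indexed by correctly-ordered p-tuples, so the dimension counts above can be reproduced by enumerating those tuples. This is a minimal sketch; the function name `basis_indices` and the choice $n = 5$ are illustrative.

```python
from itertools import combinations
from math import comb

def basis_indices(n, p):
    """All correctly-ordered p-tuples i_1 < ... < i_p with entries in [1, n]."""
    return list(combinations(range(1, n + 1), p))

n = 5
# Dimensions for p = 0, 1, ..., n + 1 (the last one should be zero).
dims = [len(basis_indices(n, p)) for p in range(n + 2)]
```

The list `dims` is symmetric about the middle for $0 \leq p \leq n$ and ends in a zero for $p = n + 1$.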
An element of $\wedge^p V^\star$ is called decomposable if it lies in the image of the p-fold wedge product $(V^\star)^{\times p} \to \wedge^p V^\star$, i.e. if it can be written in the form $u_1 \wedge ... \wedge u_p$. Proposition 8.3 implies that any element of $\wedge^p V^\star$ can be written as a linear combination of decomposable elements. However, it is not true that every element is decomposable (see Problem Sheets).
If $F : V \to W$ is a linear map, then we get an induced linear map
\[ \wedge^p F^\star : \wedge^p W^\star \to \wedge^p V^\star \]
just as we did in the case $p = 2$, by defining:
\[ \wedge^p F^\star(c) : (v_1, ..., v_p) \mapsto c\left( F(v_1), ..., F(v_p) \right) \]
Again it's immediate that $\wedge^p (G \circ F)^\star = \wedge^p F^\star \circ \wedge^p G^\star$. For a decomposable element $c = u_1 \wedge ... \wedge u_p$, it's easy to check that
\[ \wedge^p F^\star (u_1 \wedge ... \wedge u_p) = F^\star(u_1) \wedge ... \wedge F^\star(u_p) \tag{8.4} \]
(apply both sides to any p-tuple $(v_1, ..., v_p) \in V^{\times p}$ and check that they give the same number). We extend down to the case $p = 0$ by declaring that for any $F$, the map $\wedge^0 F^\star$ is just the identity map from $\wedge^0 W^\star = \mathbb{R}$ to $\wedge^0 V^\star = \mathbb{R}$.
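If we try to understand $\wedge^p F^\star$ in terms of matrices then it gets rather complicated, but there is one simple special case which we'll now explain. Assume that $\dim V = \dim W = n$, and consider $\wedge^n F^\star$. Pick bases $e_1, ..., e_n$ for $V$ and $f_1, ..., f_n$ for $W$, so we can express $F$ as a matrix $M \in \mathrm{Mat}_{n \times n}(\mathbb{R})$ where:
\[ F : e_i \mapsto M_{1,i} f_1 + ... + M_{n,i} f_n \]
Now let $\epsilon_1, ..., \epsilon_n \in V^\star$ and $\eta_1, ..., \eta_n \in W^\star$ be the dual bases. We know that $\wedge^n V^\star$ and $\wedge^n W^\star$ are both one-dimensional, with basis vectors $\epsilon_1 \wedge ... \wedge \epsilon_n$ and $\eta_1 \wedge ... \wedge \eta_n$, so the matrix describing $\wedge^n F^\star$ is a single real number. To compute this number, we observe that
\begin{align*}
\wedge^n F^\star(\eta_1 \wedge ... \wedge \eta_n) : (e_1, ..., e_n) &\mapsto (\eta_1 \wedge ... \wedge \eta_n)\left( F(e_1), ..., F(e_n) \right) \\
&= \sum_{\sigma \in S_n} (-1)^\sigma \, \eta_1\left( F(e_{\sigma(1)}) \right) ... \, \eta_n\left( F(e_{\sigma(n)}) \right) \\
&= \sum_{\sigma \in S_n} (-1)^\sigma M_{1,\sigma(1)} ... M_{n,\sigma(n)} \\
&= \det(M)
\end{align*}
so:
\[ \wedge^n F^\star(\eta_1 \wedge ... \wedge \eta_n) = \det(M) \, \epsilon_1 \wedge ... \wedge \epsilon_n \tag{8.5} \]
This is an important observation that we will use later on.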
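The signed sum over permutations appearing in this calculation is the Leibniz formula for the determinant. As a sketch, the code below evaluates that sum for a sample 3-by-3 matrix (chosen arbitrarily for illustration) and compares it with a cofactor expansion; the helper names are hypothetical.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation of (0, ..., n-1), computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def permutation_sum(M):
    """Sum over all permutations s of sign(s) * M[0][s(0)] * ... * M[n-1][s(n-1)]."""
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for row in range(n):
            term *= M[row][perm[row]]
        total += term
    return total

def det3(M):
    """Determinant of a 3-by-3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

M = [[2, 0, 1], [1, 3, 2], [0, 1, 4]]
leibniz = permutation_sum(M)
cofactor = det3(M)
```

Both computations give the same number, which is the scalar by which the top wedge power of the dual map acts.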

The next thing we want to do is extend the wedge product to give a bilinear product
\[ \wedge^p V^\star \times \wedge^q V^\star \to \wedge^{p+q} V^\star \tag{8.6} \]
for any $p$ and $q$. It is possible to write down an explicit formula for this product, but it's not very enlightening, so instead we're going to approach it indirectly.
If we fix a basis for $V$ as before then it's clear what we should do. If we take a basis vector $\epsilon_{i_1} \wedge ... \wedge \epsilon_{i_p}$ for $\wedge^p V^\star$, and a basis vector $\epsilon_{j_1} \wedge ... \wedge \epsilon_{j_q}$ for $\wedge^q V^\star$, then their wedge product should be:
\[ \epsilon_{i_1} \wedge ... \wedge \epsilon_{i_p} \wedge \epsilon_{j_1} \wedge ... \wedge \epsilon_{j_q} \in \wedge^{p+q} V^\star \tag{8.7} \]
This product is zero if the two subsets $\{i_1, ..., i_p\}$ and $\{j_1, ..., j_q\} \subset [1, n]$ are not disjoint. If they are disjoint, there's some shuffle permutation $\sigma \in S_{p+q}$ that returns the (p + q)-tuple $(i_1, ..., i_p, j_1, ..., j_q)$ to its correct order, and the product (8.7) is equal to $(-1)^\sigma$ times the corresponding basis vector in $\wedge^{p+q} V^\star$.
This defines the wedge product (8.6) on each pair of basis vectors, and then we can extend bilinearly. However, we need to check that this definition doesn't depend on our choice of basis, and this is guaranteed by the next lemma.
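Lemma 8.8. For each $p, q$ there is a unique bilinear map from $\wedge^p V^\star \times \wedge^q V^\star$ to $\wedge^{p+q} V^\star$ which makes the following triangle commute:
\[
\begin{array}{ccc}
(V^\star)^{\times p} \times (V^\star)^{\times q} & \longrightarrow & \wedge^p V^\star \times \wedge^q V^\star \\
 & \searrow & \downarrow \\
 & & \wedge^{p+q} V^\star
\end{array}
\]
Here the horizontal arrow is the product of the p-fold and q-fold wedge products, and the diagonal arrow is the (p + q)-fold wedge product.
What this lemma says is that if we have two decomposable elements
\[ u_1 \wedge ... \wedge u_p \in \wedge^p V^\star \qquad \text{and} \qquad \hat u_1 \wedge ... \wedge \hat u_q \in \wedge^q V^\star \]
then their wedge product is the obvious thing:
\[ \left( u_1 \wedge ... \wedge u_p \right) \wedge \left( \hat u_1 \wedge ... \wedge \hat u_q \right) = u_1 \wedge ... \wedge u_p \wedge \hat u_1 \wedge ... \wedge \hat u_q \]
The point of the lemma is that this product is well-defined, and extends to all of $\wedge^p V^\star \times \wedge^q V^\star$. These facts are not instantly obvious because not all elements of $\wedge^p V^\star$ or $\wedge^q V^\star$ are decomposable, and even when they are decomposable their decompositions are not unique.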

Proof of Lemma 8.8. Suppose we've found such a bilinear map (the vertical arrow in the triangle). Choose a basis $\epsilon_1, ..., \epsilon_n$ of $V^\star$. If we take any two basis vectors $\epsilon_{i_1} \wedge ... \wedge \epsilon_{i_p} \in \wedge^p V^\star$ and $\epsilon_{j_1} \wedge ... \wedge \epsilon_{j_q} \in \wedge^q V^\star$ then commutativity of the triangle forces the product of these two basis vectors to be given by the expression (8.7), and then bilinearity determines all other products. This proves uniqueness.
For existence, we have to check that this product we've defined using our basis really does make the triangle commute, and by multi-linearity it's sufficient to check this on any (p + q)-tuple $(\epsilon_{i_1}, ..., \epsilon_{i_p}, \epsilon_{j_1}, ..., \epsilon_{j_q})$ of basis vectors for $V^\star$. If any entries are repeated in this (p + q)-tuple then going either way around the triangle gives the answer zero. If no entries are repeated then going either way around the triangle gives the corresponding basis vector in $\wedge^{p+q} V^\star$, so we just need to check that the signs match. If we go diagonally across the triangle then we get the sign of the permutation $\sigma \in S_{p+q}$ that restores this (p + q)-tuple to its correct order. We can factor $\sigma$ as: first correctly order $(i_1, ..., i_p)$, then correctly order $(j_1, ..., j_q)$, then shuffle them together, and this corresponds exactly to the sign that we pick up by going the other way around the triangle.
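This extended version of the wedge product behaves very nicely, as the next proposition shows.
Proposition 8.9.
(i) For any $c \in \wedge^p V^\star$, $\hat c \in \wedge^q V^\star$ and $\tilde c \in \wedge^r V^\star$, we have:
\[ c \wedge (\hat c \wedge \tilde c) = (c \wedge \hat c) \wedge \tilde c \in \wedge^{p+q+r} V^\star \]
(ii) For any $c \in \wedge^p V^\star$ and $\hat c \in \wedge^q V^\star$ we have:
\[ c \wedge \hat c = (-1)^{pq} \, \hat c \wedge c \in \wedge^{p+q} V^\star \]
(iii) If we have a linear map $F : U \to V$ then for any $c \in \wedge^p V^\star$ and $\hat c \in \wedge^q V^\star$ we have:
\[ \wedge^{p+q} F^\star(c \wedge \hat c) = \wedge^p F^\star(c) \wedge \wedge^q F^\star(\hat c) \]
Proof. By bilinearity it's sufficient to check all properties on decomposable elements:
\[ c = u_1 \wedge ... \wedge u_p, \qquad \hat c = \hat u_1 \wedge ... \wedge \hat u_q, \qquad \tilde c = \tilde u_1 \wedge ... \wedge \tilde u_r \]
Property (i) is obvious. Property (ii) just says that the sign of the permutation $(p+1) \, ... \, (p+q) \, 1 \, ... \, p \in S_{p+q}$ is $(-1)^{pq}$. Property (iii) follows immediately from the observation (8.4).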
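The sign claim used for property (ii) can be checked by brute force for small values of $p$ and $q$. In the sketch below the permutation is written in one-line notation and its sign is computed by counting inversions; the helper names are illustrative.

```python
def sign(perm):
    """Sign of a permutation in one-line notation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def block_swap(p, q):
    """The permutation (p+1, ..., p+q, 1, ..., p) in one-line notation."""
    return tuple(range(p + 1, p + q + 1)) + tuple(range(1, p + 1))

# Compare the sign with (-1)^(pq) for all 0 <= p, q < 5.
checks = {(p, q): sign(block_swap(p, q)) == (-1) ** (p * q)
          for p in range(5) for q in range(5)}
```

The reason behind the agreement is that each of the first $q$ entries is larger than each of the last $p$ entries, so the permutation has exactly $pq$ inversions.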

We can also extend the wedge product down to the case when $p = 0$ (or $q = 0$), by declaring that if $\lambda \in \wedge^0 V^\star = \mathbb{R}$ and $c \in \wedge^q V^\star$ then $\lambda \wedge c$ is just $\lambda c$, the scalar multiple. It's trivial to check that the properties in Proposition 8.9 continue to hold in this case.
If we take the direct sum of all our $\wedge^p V^\star$'s we get a single vector space
\[ \wedge^\bullet V^\star = \bigoplus_{p=0}^{n} \wedge^p V^\star \]
called the exterior algebra of $V$. We can give $\wedge^\bullet V^\star$ a (bilinear) multiplication by using our wedge product for each component. Property (i) in Proposition 8.9 says that this structure is an associative algebra, and it has a unit $1 \in \wedge^0 V^\star = \mathbb{R}$. Property (ii) in the proposition says that this algebra is supercommutative. Property (iii) says that $\wedge^\bullet$ is a contravariant functor from vector spaces to algebras.
If these words are unfamiliar, don't worry, you may safely ignore the last paragraph!

8.2 p-forms

Finally we can return to manifolds!
If $X$ is a manifold of dimension $n$, and $x$ is a point in $X$, then we have a vector space $T_x X$. Hence for any $p \geq 0$ we may form the vector space of all p-linear antisymmetric maps from $T_x X$ to $\mathbb{R}$, and this is conventionally denoted by:
\[ \wedge^p T_x^\star X \]
(rather than $\wedge^p (T_x X)^\star$). If we let $x$ vary, we can form the set
\[ \wedge^p T^\star X = \bigcup_{x \in X} \wedge^p T_x^\star X \]
which is called the p-th wedge power of the cotangent bundle. This set is in fact a vector bundle over $X$, of rank $\binom{n}{p}$. To see this, recall that choosing a chart $(U, f)$ around a point $x$ in $X$ gives an isomorphism $\Delta_f : T_x X \xrightarrow{\sim} \mathbb{R}^n$, so it also gives an isomorphism
\[ \wedge^p \left( \Delta_f^{-1} \right)^\star : \wedge^p T_x^\star X \xrightarrow{\;\sim\;} \wedge^p (\mathbb{R}^n)^\star \]
We can further identify $\wedge^p (\mathbb{R}^n)^\star$ with $\mathbb{R}^{\binom{n}{p}}$, once we've chosen an ordering of the standard basis of $\wedge^p (\mathbb{R}^n)^\star$ provided by Proposition 8.3. If we do this over the whole chart, we get a bijection:
\[ \wedge^p T^\star U \xrightarrow{\;\sim\;} \tilde U \times \mathbb{R}^{\binom{n}{p}} \]

Then we can proceed by exactly the same argument that we used to show that $TX$ and $T^\star X$ were vector bundles.
Definition 8.10. A p-form on $X$ is a section of the vector bundle $\wedge^p T^\star X$.
If we don't wish to specify $p$ we can use the phrase differential form, which means a p-form, for some $p$. Notice that by definition, we have $\wedge^0 T_x^\star X = \mathbb{R}$ for any $x \in X$. This means that $\wedge^0 T^\star X$ is the trivial rank 1 bundle $X \times \mathbb{R}$, and so a 0-form on $X$ is nothing but a smooth function from $X$ to $\mathbb{R}$.
If we have a p-form $\alpha$, and a q-form $\beta$, then we can wedge them together by forming their wedge-product at every point. This gives us a function:
\[ \alpha \wedge \beta : X \to \wedge^{p+q} T^\star X, \qquad x \mapsto \alpha|_x \wedge \beta|_x \in \wedge^{p+q} T_x^\star X \]
If we can convince ourselves that this is a smooth function, then $\alpha \wedge \beta$ is a (p + q)-form. So we need to look at this process in co-ordinates, which means that we first have to understand what p-forms look like in co-ordinates.
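If $\tilde U \subset \mathbb{R}^n$ is an open set, then a p-form on $\tilde U$ is a smooth function:
\[ \tilde\alpha : \tilde U \to \wedge^p (\mathbb{R}^n)^\star \]
Now recall that the co-ordinate functions $\tilde x_1, ..., \tilde x_n$ on $\tilde U$ give us a set of one-forms on $\tilde U$
\[ d\tilde x_1, ..., d\tilde x_n \]
and if we evaluate this set of one-forms at any point $\tilde y$ then we get the standard basis for the cotangent space $T_{\tilde y}^\star \tilde U \cong \mathbb{R}^n$. This means that if we take a correctly-ordered p-tuple $i_1 < ... < i_p$, and form the function
\[ d\tilde x_{i_1} \wedge ... \wedge d\tilde x_{i_p} : \tilde U \to \wedge^p (\mathbb{R}^n)^\star \]
then at any point $\tilde y \in \tilde U$ this function just gives us one of the standard basis vectors in $\wedge^p (\mathbb{R}^n)^\star$. This is a constant function, so it's certainly smooth, and so this is a p-form.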
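A general p-form on $\tilde U$ is given by some smooth function $\tilde\alpha : \tilde U \to \wedge^p (\mathbb{R}^n)^\star$, and we can write it as
\[ \tilde\alpha = \sum_{i} \tilde\alpha_i \, d\tilde x_{i_1} \wedge ... \wedge d\tilde x_{i_p} \]
where $i$ runs over all correctly-ordered p-tuples $i = \{ i_1 < ... < i_p \}$ and each coefficient $\tilde\alpha_i$ is a smooth function from $\tilde U$ to $\mathbb{R}$. If we wanted to we could choose an ordering on the set of correctly-ordered p-tuples, then we could identify $\wedge^p (\mathbb{R}^n)^\star$ with $\mathbb{R}^{\binom{n}{p}}$, and we could write $\tilde\alpha$ as a length-$\binom{n}{p}$ column vector whose entries are the coefficient functions $\tilde\alpha_i \in C^\infty(\tilde U)$.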
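This tells us what p-forms look like in co-ordinates, so now we can look at the wedge product of two differential forms in co-ordinates. For example, if we have two 1-forms
\[ \tilde\alpha = \tilde\alpha_1 \, d\tilde x_1 + ... + \tilde\alpha_n \, d\tilde x_n \qquad \text{and} \qquad \tilde\beta = \tilde\beta_1 \, d\tilde x_1 + ... + \tilde\beta_n \, d\tilde x_n \]
then their wedge-product is the two-form:
\[ \tilde\alpha \wedge \tilde\beta = \sum_{i < j} \left( \tilde\alpha_i \tilde\beta_j - \tilde\alpha_j \tilde\beta_i \right) d\tilde x_i \wedge d\tilde x_j \]
All the coefficient functions are obviously smooth. In general, if we wedge together a p-form $\tilde\alpha$ and a q-form $\tilde\beta$ then the coefficient functions for $\tilde\alpha \wedge \tilde\beta$ will be linear combinations of products of the coefficient functions for $\tilde\alpha$ and $\tilde\beta$, so this will always be smooth.
This proves that on a general manifold $X$, the wedge product $\alpha \wedge \beta$ of a p-form and a q-form is indeed a (smooth) (p + q)-form. Also, it follows instantly from parts (i) and (ii) of Proposition 8.9 that we have
\[ \alpha \wedge \beta = (-1)^{pq} \, \beta \wedge \alpha \]
and
\[ \alpha \wedge (\beta \wedge \gamma) = (\alpha \wedge \beta) \wedge \gamma \]
for any p-form $\alpha$, any q-form $\beta$, and any r-form $\gamma$. Notice that it makes sense to multiply any p-form $\alpha$ by a smooth function $h \in C^\infty(X)$, and in fact this is a special case of the wedge product, since we can view $h$ as a zero-form.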
If we have a smooth function $F : X \to Y$ between two manifolds, then we've already seen that we can pull-back 1-forms along $F$. This is also true for p-forms, since if $\alpha$ is a p-form on $Y$ then we can define a p-form on $X$ by:
$$F^\star\alpha : X \to \Lambda^p T^\star X, \qquad x \mapsto \Lambda^p(DF|_x)^\star \, \alpha|_{F(x)}$$
We need to check that $F^\star\alpha$ is smooth, but we should first note that it follows immediately from Proposition 8.9(iii) that
$$F^\star(\alpha \wedge \beta) = F^\star(\alpha) \wedge F^\star(\beta) \qquad (8.11)$$
for any two differential forms $\alpha$ and $\beta$ on $Y$. Now it is easy to check that $F^\star\alpha$ is smooth, because in co-ordinates we can write $\alpha$ as a linear combination of wedge products of one-forms, but we know that the pull-back of a one-form is smooth, and that the wedge-product of any differential forms is always smooth.
Notice that if $\alpha$ is just a zero-form, i.e. an element of $C^\infty(Y)$, then $F^\star\alpha$ is just $\alpha \circ F \in C^\infty(X)$. This is because $\Lambda^0(DF|_x)^\star$ is always the identity map on $\mathbb{R}$, by definition.
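Example 8.12. Let $U = \mathbb{R}_{>0} \times (-\pi, \pi) \subset \mathbb{R}^2$, and consider the smooth function
$$F : U \to \mathbb{R}^2, \qquad (r, \theta) \mapsto (r\cos\theta, \, r\sin\theta)$$
(the inverse to polar co-ordinates). The one-forms $dx$ and $dy$ on $\mathbb{R}^2$ pull-back via $F$ to give one-forms
$$F^\star dx = \cos\theta \, dr - r\sin\theta \, d\theta \qquad \text{and} \qquad F^\star dy = \sin\theta \, dr + r\cos\theta \, d\theta$$
on $U$, so the 2-form $dx \wedge dy$ must pull-back to give the 2-form
$$F^\star(dx \wedge dy) = (\cos\theta \, dr - r\sin\theta \, d\theta) \wedge (\sin\theta \, dr + r\cos\theta \, d\theta)$$
$$= r\cos^2\theta \, dr \wedge d\theta - r\sin^2\theta \, d\theta \wedge dr$$
$$= r \, dr \wedge d\theta$$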
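Since a top-degree form pulls back by the determinant of the Jacobian, the coefficient r found in Example 8.12 should agree with det DF. Here is a small numerical sketch in Python that checks this at one point using central finite differences; the helper names `polar_map` and `jacobian_det` and the step size are our own assumptions, not part of the notes.

```python
import math


def polar_map(r, theta):
    """The map F(r, theta) = (r cos(theta), r sin(theta))."""
    return (r * math.cos(theta), r * math.sin(theta))


def jacobian_det(F, r, theta, eps=1e-6):
    """Approximate det DF at (r, theta) by central differences."""
    x_rp, y_rp = F(r + eps, theta)
    x_rm, y_rm = F(r - eps, theta)
    x_tp, y_tp = F(r, theta + eps)
    x_tm, y_tm = F(r, theta - eps)
    dx_dr = (x_rp - x_rm) / (2 * eps)
    dy_dr = (y_rp - y_rm) / (2 * eps)
    dx_dt = (x_tp - x_tm) / (2 * eps)
    dy_dt = (y_tp - y_tm) / (2 * eps)
    # Coefficient of dr ^ dtheta in F*(dx ^ dy).
    return dx_dr * dy_dt - dx_dt * dy_dr


r, theta = 2.0, 0.7
print(jacobian_det(polar_map, r, theta), "should be close to", r)
```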
Now let's briefly examine the transformation law for p-forms. If we have two charts $(U_1, f_1)$ and $(U_2, f_2)$, and we have a p-form $\alpha$ on $X$, then in each chart $\alpha$ becomes a smooth function:
$$\tilde\alpha_1 : \tilde U_1 \to \Lambda^p(\mathbb{R}^n)^\star \qquad \text{and} \qquad \tilde\alpha_2 : \tilde U_2 \to \Lambda^p(\mathbb{R}^n)^\star$$
For points $x$ in the intersection $U_1 \cap U_2$, the values of these functions are related by the linear map:
$$\Lambda^p\big(D\phi_{12}|_{f_2(x)}\big)^\star : \Lambda^p(\mathbb{R}^n)^\star \to \Lambda^p(\mathbb{R}^n)^\star$$
As we already spotted for one-forms, this is a special case of the pull-back operation, and it says that
$$\tilde\alpha_2 = \phi_{12}^\star \tilde\alpha_1$$
on the open set $f_2(U_1 \cap U_2) \subset \tilde U_2$. As for vector fields and one-forms, we could take this transformation law as the definition of a p-form if we wished.
We could write this transformation law explicitly in terms of the Jacobian matrix $D\phi_{12}|_{f_2(x)}$, but it gets rather complicated. However, in the special case of n-forms, it's very easy. If $\alpha$ is an n-form, then in each chart $\alpha$ just becomes a smooth function:
$$\tilde\alpha_1 : \tilde U_1 \to \mathbb{R} \qquad \text{and} \qquad \tilde\alpha_2 : \tilde U_2 \to \mathbb{R}$$
By our observation (8.5), on the overlap these functions are related by:
$$\tilde\alpha_2|_{f_2(x)} = \det\big(D\phi_{12}|_{f_2(x)}\big) \, \tilde\alpha_1|_{f_1(x)} \qquad (8.13)$$
So n-forms transform by the determinant of the Jacobian matrix of the transition function.
There is an extremely important operation that can be performed on
differential forms, called the exterior derivative. This is our next topic.
Let's denote the set of all p-forms on $X$ by:
$$\Omega^p(X)$$
This is an infinite-dimensional vector space. In the case p = 0, we have that $\Omega^0(X) = C^\infty(X)$, since 0-forms are just smooth functions from $X$ to $\mathbb{R}$. Now recall that for any smooth function $h \in C^\infty(X)$ we produced a one-form $dh$, and in co-ordinates this is given by:
$$\widetilde{dh} = \frac{\partial \tilde h}{\partial \tilde x_1} \, d\tilde x_1 + ... + \frac{\partial \tilde h}{\partial \tilde x_n} \, d\tilde x_n$$
So we have an operator:
$$d : \Omega^0(X) \to \Omega^1(X), \qquad h \mapsto dh$$
What we want to do is extend this to an operator
$$d : \Omega^p(X) \to \Omega^{p+1}(X)$$
for all p.
We'll begin by assuming $X$ is an open set $\tilde U \subset \mathbb{R}^n$, and we'll let $\tilde x_1, ..., \tilde x_n$ be the co-ordinate functions on $\tilde U$. Suppose we have a p-form $\tilde\alpha$ which has only one non-zero component, so
$$\tilde\alpha = \tilde\alpha_i \, d\tilde x_{i_1} \wedge ... \wedge d\tilde x_{i_p}$$
where $i = (i_1 < ... < i_p)$ is a single correctly-ordered p-tuple, and $\tilde\alpha_i \in C^\infty(\tilde U)$. Then we define:
$$d\tilde\alpha = d\tilde\alpha_i \wedge d\tilde x_{i_1} \wedge ... \wedge d\tilde x_{i_p} = \sum_{j=1}^n \frac{\partial \tilde\alpha_i}{\partial \tilde x_j} \, d\tilde x_j \wedge d\tilde x_{i_1} \wedge ... \wedge d\tilde x_{i_p} \qquad (8.14)$$
Some of the terms in this sum will be zero, since when $j$ is equal to one of the $i_t$ then that wedge product of one-forms is zero. So there will be one (potentially) non-zero component of $d\tilde\alpha$ for each $j$ which does not appear in the p-tuple $i$, and if we want to write it in terms of our standard basis then we pick up a $\pm 1$ when we apply the permutation which returns the (p + 1)-tuple $(j, i_1, ..., i_p)$ to its correct order.
The expression (8.14) is linear in $\tilde\alpha_i$, so we can extend it to a linear operator:
$$d : \Omega^p(\tilde U) \to \Omega^{p+1}(\tilde U)$$
Together, these operators are called the exterior derivative, or the de Rham differential.
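Example 8.15. Let $X = \mathbb{R}^3$, with co-ordinates x, y and z. If we have a one-form $\alpha = \alpha_2 \, dy$ for some $\alpha_2 \in C^\infty(\mathbb{R}^3)$, then:
$$d\alpha = \frac{\partial \alpha_2}{\partial x} \, dx \wedge dy - \frac{\partial \alpha_2}{\partial z} \, dy \wedge dz$$
More generally, if
$$\alpha = \alpha_1 \, dx + \alpha_2 \, dy + \alpha_3 \, dz$$
then:
$$d\alpha = \left(\frac{\partial \alpha_2}{\partial x} - \frac{\partial \alpha_1}{\partial y}\right) dx \wedge dy + \left(\frac{\partial \alpha_3}{\partial x} - \frac{\partial \alpha_1}{\partial z}\right) dx \wedge dz + \left(\frac{\partial \alpha_3}{\partial y} - \frac{\partial \alpha_2}{\partial z}\right) dy \wedge dz$$
You might recognise this formula - if we flip the sign of the middle term, which we can do by deciding to write things in terms of $dz \wedge dx$ instead of $dx \wedge dz$, then this is the formula for the curl operator which turns a vector field on $\mathbb{R}^3$ into another vector field on $\mathbb{R}^3$. However it's more natural to interpret it as an operator that turns one-forms into 2-forms.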
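The formula for d of a one-form on R^3 can be tested numerically. The sketch below, in Python, approximates the three coefficients by central differences; the helper names `partial` and `d_one_form` and the two test forms are our own choices, made only for illustration.

```python
def partial(f, point, axis, eps=1e-4):
    """Central-difference approximation to a partial derivative of f."""
    plus = list(point)
    minus = list(point)
    plus[axis] += eps
    minus[axis] -= eps
    return (f(*plus) - f(*minus)) / (2 * eps)


def d_one_form(a1, a2, a3, point):
    """Coefficients of d(a1 dx + a2 dy + a3 dz) at a point.

    Returned in the order (dx^dy, dx^dz, dy^dz), matching the
    formula in Example 8.15.
    """
    X, Y, Z = 0, 1, 2
    return (
        partial(a2, point, X) - partial(a1, point, Y),
        partial(a3, point, X) - partial(a1, point, Z),
        partial(a3, point, Y) - partial(a2, point, Z),
    )


point = (0.3, -1.1, 2.0)

# alpha = -y dx + x dy, for which d(alpha) = 2 dx^dy.
rotation = d_one_form(lambda x, y, z: -y,
                      lambda x, y, z: x,
                      lambda x, y, z: 0.0,
                      point)

# alpha = dh with h = x*y*z, for which d(alpha) should vanish.
exact = d_one_form(lambda x, y, z: y * z,
                   lambda x, y, z: x * z,
                   lambda x, y, z: x * y,
                   point)

print(rotation, exact)
```

The second test form is the differential of a function, so its exterior derivative vanishes; in vector-calculus language this is the statement that the curl of a gradient is zero.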
The exterior derivative has many nice properties.
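Proposition 8.16.
(i) For any $\tilde\alpha \in \Omega^p(\tilde U)$ we have:
$$d(d\tilde\alpha) = 0 \in \Omega^{p+2}(\tilde U)$$
(ii) For any $\tilde\alpha \in \Omega^p(\tilde U)$ and $\tilde\beta \in \Omega^q(\tilde U)$ we have:
$$d(\tilde\alpha \wedge \tilde\beta) = d\tilde\alpha \wedge \tilde\beta + (-1)^p \, \tilde\alpha \wedge d\tilde\beta \in \Omega^{p+q+1}(\tilde U) \qquad (8.17)$$
(iii) If $\tilde V$ is an open set in $\mathbb{R}^k$ and $F : \tilde U \to \tilde V$ is a smooth function, then for any $\tilde\alpha \in \Omega^p(\tilde V)$ we have:
$$d(F^\star\tilde\alpha) = F^\star d\tilde\alpha \in \Omega^{p+1}(\tilde U)$$
A good way to remember the sign in part (ii) is to pretend that the symbol $d$ behaves a bit like a one-form, so if we want to permute it past the p-form $\tilde\alpha$ then we pick up a sign $(-1)^p$. Notice that (8.17) is formally similar to the product rule for derivations, and in fact $d$ is indeed a derivation in a more general sense.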
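Proof. (i) By linearity it's enough to prove the result for an $\tilde\alpha$ which has a single component $\tilde\alpha_i$ for some $i$. Applying the formula (8.14) twice, we get the (p + 2)-form:
$$d(d\tilde\alpha) = \sum_{m=1}^n \sum_{j=1}^n \frac{\partial^2 \tilde\alpha_i}{\partial \tilde x_m \partial \tilde x_j} \, d\tilde x_m \wedge d\tilde x_j \wedge d\tilde x_{i_1} \wedge ... \wedge d\tilde x_{i_p}$$
This sum has a (potentially) non-zero term for every pair $m, j$ such that $m \neq j$ and neither $m$ nor $j$ appear in $i$. However, the double partial derivative $\frac{\partial^2 \tilde\alpha_i}{\partial \tilde x_m \partial \tilde x_j}$ is symmetric in $m$ and $j$, and the wedge product $d\tilde x_m \wedge d\tilde x_j$ is antisymmetric in $m$ and $j$, so these terms cancel in pairs.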
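(ii) Firstly suppose that $\tilde\alpha$ and $\tilde\beta$ are just zero-forms, i.e. elements of $C^\infty(\tilde U)$. Then (8.17) says that
$$d(\tilde\alpha\tilde\beta) = \tilde\beta \, d\tilde\alpha + \tilde\alpha \, d\tilde\beta$$
(since for zero-forms the wedge-product is ordinary point-wise multiplication), and this is true by the product rule for partial differentiation.
Now let $\tilde\alpha \in C^\infty(\tilde U)$ be a zero-form, and let $\tilde\beta$ be the constant q-form $d\tilde x_{i_1} \wedge ... \wedge d\tilde x_{i_q}$. Then the formula (8.14) says that $d\tilde\beta = 0$, and it also says that
$$d(\tilde\alpha \, d\tilde x_{i_1} \wedge ... \wedge d\tilde x_{i_q}) = d\tilde\alpha \wedge d\tilde x_{i_1} \wedge ... \wedge d\tilde x_{i_q}$$
so this special case of (8.17) is also true.
Now let $\tilde\alpha$ be a p-form with a single component, and $\tilde\beta$ be a q-form with a single component, so:
$$\tilde\alpha = \tilde\alpha_i \, d\tilde x_{i_1} \wedge ... \wedge d\tilde x_{i_p} \qquad \text{and} \qquad \tilde\beta = \tilde\beta_j \, d\tilde x_{j_1} \wedge ... \wedge d\tilde x_{j_q}$$
with $\tilde\alpha_i, \tilde\beta_j \in C^\infty(\tilde U)$. Then:
$$d(\tilde\alpha \wedge \tilde\beta) = d\big(\tilde\alpha_i \tilde\beta_j \, d\tilde x_{i_1} \wedge ... \wedge d\tilde x_{i_p} \wedge d\tilde x_{j_1} \wedge ... \wedge d\tilde x_{j_q}\big)$$
$$= \big(\tilde\beta_j \, d\tilde\alpha_i + \tilde\alpha_i \, d\tilde\beta_j\big) \wedge d\tilde x_{i_1} \wedge ... \wedge d\tilde x_{i_p} \wedge d\tilde x_{j_1} \wedge ... \wedge d\tilde x_{j_q}$$
$$= (d\tilde\alpha_i \wedge d\tilde x_{i_1} \wedge ... \wedge d\tilde x_{i_p}) \wedge (\tilde\beta_j \, d\tilde x_{j_1} \wedge ... \wedge d\tilde x_{j_q}) + (-1)^p (\tilde\alpha_i \, d\tilde x_{i_1} \wedge ... \wedge d\tilde x_{i_p}) \wedge (d\tilde\beta_j \wedge d\tilde x_{j_1} \wedge ... \wedge d\tilde x_{j_q})$$
$$= d\tilde\alpha \wedge \tilde\beta + (-1)^p \, \tilde\alpha \wedge d\tilde\beta$$
The sign $(-1)^p$ in the third line appears because the one-form $d\tilde\beta_j$ has been moved past the p one-forms $d\tilde x_{i_1}, ..., d\tilde x_{i_p}$. So by linearity (8.17) holds for any p-form and any q-form.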


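(iii) We've already proved the case p = 0 in Proposition 7.12, since if $\tilde\alpha$ is a zero-form then $F^\star\tilde\alpha$ is just $\tilde\alpha \circ F$. Now suppose
$$\tilde\alpha = \tilde\alpha_i \, d\tilde x_{i_1} \wedge ... \wedge d\tilde x_{i_p}$$
is a p-form with a single component. Then
$$F^\star\tilde\alpha = (\tilde\alpha_i \circ F) \, (F^\star d\tilde x_{i_1}) \wedge ... \wedge (F^\star d\tilde x_{i_p})$$
by (8.11). Now for any $k$, we have $F^\star d\tilde x_k = d(\tilde x_k \circ F)$ by Proposition 7.12, so $d(F^\star d\tilde x_k) = 0$ by part (i) of this proposition. Then repeatedly applying part (ii) of this proposition shows that:
$$d(F^\star\tilde\alpha) = d(\tilde\alpha_i \circ F) \wedge (F^\star d\tilde x_{i_1}) \wedge ... \wedge (F^\star d\tilde x_{i_p})$$
Hence
$$d(F^\star\tilde\alpha) = (F^\star d\tilde\alpha_i) \wedge (F^\star d\tilde x_{i_1}) \wedge ... \wedge (F^\star d\tilde x_{i_p})$$
$$= F^\star\big(d\tilde\alpha_i \wedge d\tilde x_{i_1} \wedge ... \wedge d\tilde x_{i_p}\big)$$
$$= F^\star d\tilde\alpha$$
as required.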
Now we want to define the exterior derivative on an arbitrary manifold $X$. Obviously, when we work in co-ordinates it should reduce to the operations that we've just defined. In fact, this requirement is a valid way to define $d$ on $X$.
Fix a p-form $\alpha \in \Omega^p(X)$. In any chart $(U, f)$, it turns into a p-form
$$\tilde\alpha : \tilde U \to \Lambda^p(\mathbb{R}^n)^\star$$
on $\tilde U$. So in any chart, we can form a (p + 1)-form:
$$d\tilde\alpha : \tilde U \to \Lambda^{p+1}(\mathbb{R}^n)^\star$$
So we have a rule which produces, for any chart $(U, f)$, a (p + 1)-form on $\tilde U$. We claim that this object is actually a (p + 1)-form on $X$, in the 'physicist's sense', i.e. we claim that it obeys the correct transformation law when we change co-ordinates. This means that there really is a (p + 1)-form
$$d\alpha \in \Omega^{p+1}(X)$$
such that when we write $d\alpha$ in a chart $(U, f)$ we get the corresponding $d\tilde\alpha$.

So we just need to prove our claim that $d\tilde\alpha$ transforms correctly. If we have two charts $(U_1, f_1)$ and $(U_2, f_2)$, then in the two charts the p-form $\alpha$ turns into two smooth functions $\tilde\alpha_1$ and $\tilde\alpha_2$, from $\tilde U_1$ and $\tilde U_2$ respectively to $\Lambda^p(\mathbb{R}^n)^\star$. These functions are related by the transformation law
$$\tilde\alpha_2 = \phi_{12}^\star \tilde\alpha_1$$
where the two charts overlap. But by Proposition 8.16(iii), we have
$$d\tilde\alpha_2 = \phi_{12}^\star \, d\tilde\alpha_1$$
and this is the correct transformation law for a (p + 1)-form on $X$. This proves our claim.
This means that on any manifold $X$, and any p, we have an exterior derivative:
$$d : \Omega^p(X) \to \Omega^{p+1}(X)$$
Furthermore it's immediate that the three properties listed in Proposition 8.16 all hold, since each of them can be checked in co-ordinates.
The main use for the exterior derivative is to define the de Rham cohomology of a manifold, which unfortunately is not a part of this course. However,
we will see one application of d in the next section.

Integration

If we have an open set $\tilde U \subset \mathbb{R}^n$, and a smooth function $\tilde h \in C^\infty(\tilde U)$, we can try to compute the multiple integral:
$$\int_{\tilde U} \tilde h \, d\tilde x_1 ... d\tilde x_n$$
As long as the function $\tilde h$ doesn't grow too big as we approach the edges of $\tilde U$, this integral will converge. For example, if we know that $\tilde h$ is identically zero outside some closed ball $B(0, r) \subset \tilde U$ then the integral certainly converges.
Now suppose we have a diffeomorphism $F : \tilde V \to \tilde U$ where $\tilde V$ is some other open set in $\mathbb{R}^n$. This is just a change of co-ordinates, so we can compute the integral in the new co-ordinates if we wish. You should recall the change-of-variables formula:
$$\int_{\tilde U} \tilde h \, d\tilde x_1 ... d\tilde x_n = \int_{\tilde V} (\tilde h \circ F) \, |\det DF| \, d\tilde y_1 ... d\tilde y_n \qquad (9.1)$$

The factor | det DF |, the absolute value of the determinant of the Jacobian
matrix of F , keeps track of how volume gets distorted when we change variables.
This is strikingly similar to the way that n-forms behave. If we decide that the symbols after the integral sign $\int_{\tilde U}$ are really the n-form
$$\tilde\alpha = \tilde h \, d\tilde x_1 \wedge ... \wedge d\tilde x_n \in \Omega^n(\tilde U)$$
then we have
$$F^\star\tilde\alpha = (\tilde h \circ F) \, \det(DF) \, d\tilde y_1 \wedge ... \wedge d\tilde y_n$$
which is almost the same as the symbols after $\int_{\tilde V}$. The only difference is the occurrence of $\det(DF)$ instead of $|\det(DF)|$, and we'll deal with this shortly. This is very strong evidence that the correct thing to integrate over $\tilde U$ is not functions but n-forms, and we should define:
$$\int_{\tilde U} \tilde\alpha = \int_{\tilde U} \tilde h \, d\tilde x_1 ... d\tilde x_n$$
More generally, suppose $X$ is any manifold, and $\alpha \in \Omega^n(X)$ is an n-form on $X$. Can we define the integral of $\alpha$ over $X$?
If we pick a chart $(U, f)$ on $X$, we could at least try to define the integral of $\alpha$ over the region $U$ by writing $\alpha$ in co-ordinates and then integrating it over $\tilde U$. There are two problems with this definition that we must address:
(a) Does this integral converge?
(b) Is this definition co-ordinate independent?
We will deal with these two problems (in reverse order), and then go on
to try to understand the integral of an n-form over the whole manifold X.

9.1 Orientations

In this section we're going to solve problem (b), the problem of whether the integral of an n-form over a co-ordinate chart is co-ordinate independent.
Let $X$ be a manifold, and let $\alpha \in \Omega^n(X)$ be an n-form on $X$. Suppose we pick two charts $U_1$ and $U_2$, and let $U = U_1 \cap U_2$. On $U$ we have two co-ordinate systems, with co-domains $f_1(U) \subset \tilde U_1$ and $f_2(U) \subset \tilde U_2$, and transition function $\phi_{12} : f_2(U) \to f_1(U)$. Write $\alpha$ in the first co-ordinates, so it becomes some n-form:
$$\tilde\alpha = \tilde h \, d\tilde x_1 \wedge ... \wedge d\tilde x_n \in \Omega^n(f_1(U))$$
Then we know from the transition law for n-forms that when we write $\alpha$ in the second co-ordinates we will get:
$$\phi_{12}^\star\tilde\alpha = (\tilde h \circ \phi_{12}) \, \det(D\phi_{12}) \, d\tilde y_1 \wedge ... \wedge d\tilde y_n \in \Omega^n(f_2(U))$$
We would like to be able to define the integral $\int_U \alpha$, by evaluating it in co-ordinates. In the first co-ordinates this gives $\int_{f_1(U)} \tilde\alpha$, and let's assume that this integral converges. Then by the change-of-variables formula (9.1) we have that:
$$\int_{f_1(U)} \tilde\alpha = \int_{f_2(U)} (\tilde h \circ \phi_{12}) \, |\det(D\phi_{12})| \, d\tilde y_1 ... d\tilde y_n \;\overset{?}{=}\; \int_{f_2(U)} \phi_{12}^\star\tilde\alpha \qquad (9.2)$$
If we knew that $\det D\phi_{12}$ was always positive then this would be an equality. In fact, since $\phi_{12}$ is a diffeomorphism, we know that $\det D\phi_{12}$ is never zero. So if we assume that $U$ is connected then $\det D\phi_{12}$ must be either always positive or always negative, and we have an equality up to an overall factor of $\pm 1$.
If we can solve this sign issue, then we can evaluate $\int_U \alpha$ in either co-ordinates and get the same answer. The solution to the signs is provided by the idea of an orientation.
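The sign issue is already visible in one dimension. The following Python sketch compares three integrals over (0, 1) using a simple midpoint rule: the original integral, the change-of-variables integral with |det DF|, and the integral of the pulled-back form with det DF. The coefficient h(x) = x^2 and the orientation-reversing map F(y) = 1 - y are our own illustrative choices, as are the function names.

```python
def midpoint_integral(g, a, b, steps=20000):
    """Midpoint-rule approximation to the integral of g over (a, b)."""
    width = (b - a) / steps
    return width * sum(g(a + (k + 0.5) * width) for k in range(steps))


def h(x):
    """Coefficient of the one-form h dx on (0, 1)."""
    return x ** 2


def F(y):
    """An orientation-reversing diffeomorphism of (0, 1)."""
    return 1.0 - y


def dF(y):
    """The 1x1 Jacobian 'determinant' of F, which is constantly -1."""
    return -1.0


original = midpoint_integral(h, 0.0, 1.0)
with_abs = midpoint_integral(lambda y: h(F(y)) * abs(dF(y)), 0.0, 1.0)
pulled_back = midpoint_integral(lambda y: h(F(y)) * dF(y), 0.0, 1.0)

print(original, with_abs, pulled_back)
```

The first two numbers agree, as (9.1) predicts, while the third differs from them by exactly the overall factor of -1 discussed above.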
Definition 9.3. A volume form on a manifold $X$ is an n-form $\omega \in \Omega^n(X)$ such that $\omega$ is not zero at any point. If there exists a volume form on $X$ then we say that $X$ is orientable.
Example 9.4. If $\tilde U \subset \mathbb{R}^n$ is an open set, then
$$\tilde\omega_0 = d\tilde x_1 \wedge ... \wedge d\tilde x_n \in \Omega^n(\tilde U)$$
is a volume form, so $\tilde U$ (and in particular $\mathbb{R}^n$) is orientable. We'll call $\tilde\omega_0$ the standard volume form on $\tilde U$. Every other volume form on $\tilde U$ is of the form $\tilde h \tilde\omega_0$ for some $\tilde h \in C^\infty(\tilde U)$ which is never zero.
Since the vector bundle $\Lambda^n T^\star X$ has rank 1, asking for $X$ to be orientable is equivalent to asking for $\Lambda^n T^\star X$ to be a trivial vector bundle, by Proposition 6.19. Also, if we've fixed a volume form $\omega$ on $X$, then any n-form $\alpha \in \Omega^n(X)$ can be written as $\alpha = h\omega$ for some $h \in C^\infty(X)$. If $h$ is never zero then $\alpha$ will be another volume form (and vice-versa).

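Example 9.5. Let $X = T^1$, and let's use our usual atlas with two charts where:
$$\tilde U_1 = (0, 1) \qquad \text{and} \qquad \tilde U_2 = (-\tfrac{1}{2}, \tfrac{1}{2})$$
Since $D\phi_{21} = 1$ at all points, a one-form $\alpha \in \Omega^1(T^1)$ must be expressed in this atlas as
$$\tilde\alpha_1 = \tilde h_1 \, d\tilde x \in \Omega^1(\tilde U_1) \qquad \text{and} \qquad \tilde\alpha_2 = \tilde h_2 \, d\tilde y \in \Omega^1(\tilde U_2)$$
where:
$$\tilde h_2(x) = \tilde h_1(x) \quad \forall x \in (0, \tfrac{1}{2}), \qquad \text{and} \qquad \tilde h_2(x) = \tilde h_1(x + 1) \quad \forall x \in (-\tfrac{1}{2}, 0)$$
In particular we may set $\tilde h_1 \equiv 1$ and $\tilde h_2 \equiv 1$, and this defines a volume form $\omega \in \Omega^1(T^1)$. Hence $T^1$ is orientable.
Notice that, for any one-form $\alpha \in \Omega^1(T^1)$, the data of $\tilde h_1$ and $\tilde h_2$ are exactly the data of a function $h \in C^\infty(T^1)$, and we have that $\alpha = h\omega$.
We saw in Example 6.21 that $S^1$ (and hence $T^1$) is parallelizable, and it's not hard to show in general that if $X$ is parallelizable then $X$ is also orientable. However, being orientable is a much weaker condition, and in practice most manifolds that we care about are orientable.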
Proposition 9.6. Let $X$ be an orientable manifold, and let
$$Z = h^{-1}(y) \subset X$$
be a level set of some function $h \in C^\infty(X)$ at a regular value $y \in \mathbb{R}$. Then $Z$ is orientable.
In particular if we set $X = \mathbb{R}^{n+1}$ and $Z = S^n$ then we see that $S^n$ is orientable, for any n.
Proof. Let $\dim X = n$, fix a volume form $\omega$ on $X$, and fix a point $z \in Z$, so the volume form at this point gives us a non-zero element $\omega|_z \in \Lambda^n T_z^\star X$. We saw in Lemma 5.15 that the tangent space $T_z Z$ is the subspace of $T_z X$ given by the kernel of the linear map $Dh|_z : T_z X \to T_y\mathbb{R} \cong \mathbb{R}$. Let $\mathbf{n} \in T_z X$ be any vector such that $Dh|_z(\mathbf{n}) = 1$. Then we can define an element $\omega'|_z \in \Lambda^{n-1} T_z^\star Z$ by declaring that
$$\omega'|_z : (v_1, ..., v_{n-1}) \mapsto \omega|_z(v_1, ..., v_{n-1}, \mathbf{n}) \in \mathbb{R}$$
for any vectors $v_1, ..., v_{n-1} \in T_z Z$. This map $\omega'|_z$ is automatically (n - 1)-linear and antisymmetric, so it is indeed an element of $\Lambda^{n-1} T_z^\star Z$. Furthermore we claim that it's independent of our choice of $\mathbf{n}$. To see this, recall that $\Lambda^{n-1} T_z^\star Z$ is only 1-dimensional, so if we pick a basis $e_1, ..., e_{n-1}$ for $T_z Z$ then $\omega'|_z$ is determined by the single real number:
$$\omega'|_z(e_1, ..., e_{n-1}) = \omega|_z(e_1, ..., e_{n-1}, \mathbf{n}) \qquad (9.7)$$
This number will not change if we change $\mathbf{n}$ by adding on any linear combination of the $e_i$'s, because $\omega|_z$ is anti-symmetric and linear in each argument. However, any vector in $Dh|_z^{-1}(1)$ must differ from $\mathbf{n}$ by some linear combination of the $e_i$'s, so $\omega'|_z$ is indeed independent of our choice of $\mathbf{n}$.
Also, the number (9.7) cannot be zero, because the vectors $e_1, ..., e_{n-1}, \mathbf{n}$ form a basis of $T_z X$, and we know that $\omega|_z$ is not zero. Therefore $\omega'|_z$ is not the zero element of $\Lambda^{n-1} T_z^\star Z$.
So for every point $z \in Z$, we have constructed a non-zero element $\omega'|_z \in \Lambda^{n-1} T_z^\star Z$. If we can show that these elements vary smoothly, then we have found a volume form $\omega'$ on $Z$. So we need to look at this construction in co-ordinates.
We can assume that $Z$ is the level set of $y = 0$, since we can always replace $h$ by $h - y$. Then for any point $z \in Z$, we know that we can find a chart $(U, f)$ containing $z$ such that when we write $h$ in this chart it is just the last co-ordinate $\tilde h = \tilde x_n$ on $\tilde U$. In such a chart, the submanifold $Z$ becomes $f(Z \cap U) = \mathbb{R}^{n-1} \cap \tilde U$, and we may choose our vector $\mathbf{n}$ to be the tangent vector $\frac{\partial}{\partial \tilde x_n}$ at any point in $f(Z \cap U)$. Now write the volume form $\omega$ in these co-ordinates, so it becomes
$$\tilde\omega = \tilde g \, d\tilde x_1 \wedge ... \wedge d\tilde x_n$$
for some $\tilde g \in C^\infty(\tilde U)$. It follows that $\omega'$, in these co-ordinates, is given by
$$\tilde\omega' = \tilde g|_{\tilde x_n = 0} \, d\tilde x_1 \wedge ... \wedge d\tilde x_{n-1}$$
which is indeed a smooth (n - 1)-form on $f(Z \cap U)$.
With a little more work, this proposition can be generalised to level sets (at regular values) of smooth functions $h : X \to Y$, where $Y$ is any other orientable manifold. However, it is not true that any submanifold of an orientable manifold is orientable.
Now let $X$ be an orientable manifold, and let's fix a volume form $\omega$ on $X$. Pick a chart $(U, f)$ on $X$. In this chart $\omega$ becomes a volume form
$$\tilde\omega \in \Omega^n(\tilde U)$$
so we must have $\tilde\omega = \tilde h \tilde\omega_0$ for some function $\tilde h \in C^\infty(\tilde U)$ which is never zero, where $\tilde\omega_0$ is the standard volume form on $\tilde U$. If this function $\tilde h$ is always positive, then we say that the chart $(U, f)$ is oriented (with respect to $\omega$).
It's not hard to find oriented charts. If $U$ is connected, then $\tilde h$ must be either always positive or always negative. If it's negative, just compose the co-ordinates $f$ with the reflection:
$$T : \mathbb{R}^n \to \mathbb{R}^n, \qquad (\tilde x_1, \tilde x_2, ..., \tilde x_n) \mapsto (-\tilde x_1, \tilde x_2, ..., \tilde x_n)$$
Since $T^\star\tilde\omega_0 = -\tilde\omega_0$, the chart $(U, T \circ f)$ will be oriented (we could replace $T$ here with any other diffeomorphism $T$ satisfying $\det DT < 0$ at all points). If $U$ is not connected, then we can split $U$ into its connected components, and perform this trick on each component where $\tilde h$ is negative. So for any chart $(U, f)$, we can find an oriented chart which has the same domain $U$.
If $g \in C^\infty(X)$ is positive at all points, then $g\omega$ is another volume form on $X$, and asking for a chart to be oriented with respect to $g\omega$ is exactly the same condition as asking for it to be oriented with respect to $\omega$. This leads us to the following definition:
Definition 9.8. An orientation on a manifold $X$ is an equivalence class of volume forms on $X$, where we declare that two volume forms $\omega_1$ and $\omega_2$ are equivalent iff $\omega_2 = g\omega_1$ for some $g \in C^\infty(X)$ which is positive everywhere. If we've fixed an orientation on $X$ we say that $X$ is oriented.
Obviously, we can find an orientation for $X$ iff $X$ is orientable. If we fix an orientation $[\omega]$ on $X$ then that determines which charts are oriented; it is not necessary to choose a specific volume form in the equivalence class $[\omega]$.
For our purposes, the reason for introducing oriented manifolds, and oriented charts, is the following easy observation:
Lemma 9.9. Let $X$ be an oriented manifold. Let $(U_1, f_1)$ and $(U_2, f_2)$ be two oriented charts, and let $\phi_{12}$ be the transition function between them. Then for any point $x \in f_2(U_1 \cap U_2)$ we have:
$$\det D\phi_{12}|_x > 0$$
Proof. Pick any volume form $\omega$ representing the given equivalence class. In our two charts, $\omega$ becomes
$$\tilde\omega_1 = \tilde h_1 \tilde\omega_0 \in \Omega^n(\tilde U_1) \qquad \text{and} \qquad \tilde\omega_2 = \tilde h_2 \tilde\omega_0 \in \Omega^n(\tilde U_2)$$
where both $\tilde h_1$ and $\tilde h_2$ are positive everywhere since both charts are oriented. For a point $x \in U_1 \cap U_2$ we have
$$\tilde h_2|_{f_2(x)} = \det\big(D\phi_{12}|_{f_2(x)}\big) \, \tilde h_1|_{f_1(x)}$$
by the transformation law for n-forms (8.13). Hence $\det\big(D\phi_{12}|_{f_2(x)}\big) > 0$.
This solves part of our problem of defining integration on manifolds. Let's return to the situation we discussed at the beginning of this section, where we have an n-form $\alpha \in \Omega^n(X)$, and a region $U \subset X$ which is the intersection of two charts, so it has two co-ordinate systems $f_1$ and $f_2$. If we assume that $X$ is oriented, and that both charts are oriented, then the transition function $\phi_{12}$ satisfies $\det D\phi_{12} > 0$ at all points. Consequently the formula (9.2) is an equality, so (assuming that the integral converges) we can define the integral $\int_U \alpha$ using either co-ordinate system and we will get the same answer.
The other part of the problem (part (a)) is about the convergence. We'll solve this in the next section.

9.2 Defining integration

For any (oriented) manifold $X$, and any n-form $\alpha$ on $X$, we would like to be able to define the integral
$$\int_X \alpha$$
as some real number. However this is not going to work in general, because integrals do not always converge. For example if we take $X = \mathbb{R}$, and $\alpha \in \Omega^1(\mathbb{R})$ to be the constant one-form $dx$, then we are trying to evaluate
$$\int_{-\infty}^{\infty} 1 \, dx$$
which doesn't converge to a finite answer. So we have to put some restrictions on either $X$ or $\alpha$.
To start with, we'll let $X$ be any oriented manifold, but we'll only consider a restricted class of n-forms.
Definition 9.10. Let $\alpha \in \Omega^p(X)$ for some p. We'll call $\alpha$ a bump form if there exists some chart $(U, f)$ on $X$, and some compact subset $W \subset U$, such that $\alpha$ is identically zero outside of $W$.
Warning: this is not standard terminology! But it will be a convenient definition for us.
If $\alpha \in \Omega^n(X)$ is a bump form, then we can define the integral $\int_X \alpha \in \mathbb{R}$ in the following way. By definition, there is some chart $(U, f)$ such that $\alpha$ vanishes outside of a compact subset $W \subset U$. We need to assume in addition that $(U, f)$ is an oriented chart, but we know that we can always achieve this. Now look at the n-form $\tilde\alpha \in \Omega^n(\tilde U)$ that we get by writing $\alpha$ in this chart. The subset $f(W) \subset \tilde U$ is compact, so it's contained in some closed ball $B(0, R) \subset \tilde U$, so $\tilde\alpha$ is identically zero outside this ball. Therefore we can define
$$\int_X \alpha = \int_{\tilde U} \tilde\alpha$$
since this integral converges. Now let's prove that this definition is independent of our choice of chart.
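Proposition 9.11. Let $(U_1, f_1)$ be an oriented chart such that $\alpha$ vanishes outside of some compact subset $W_1 \subset U_1$. Let $(U_2, f_2)$ be another oriented chart such that $\alpha$ vanishes outside of some compact subset $W_2 \subset U_2$. Let $\tilde\alpha_1 \in \Omega^n(\tilde U_1)$ and $\tilde\alpha_2 \in \Omega^n(\tilde U_2)$ be the n-forms that we get by writing $\alpha$ in the two charts. Then we have:
$$\int_{\tilde U_1} \tilde\alpha_1 = \int_{\tilde U_2} \tilde\alpha_2$$
So for any bump form $\alpha \in \Omega^n(X)$ we have a well-defined integral $\int_X \alpha \in \mathbb{R}$.
Proof. Let $U = U_1 \cap U_2$, and $W = W_1 \cap W_2$. Then $W$ is a compact subset of $U$, and $\alpha$ vanishes outside of $W$. Consequently
$$\int_{\tilde U_1} \tilde\alpha_1 = \int_{f_1(U)} \tilde\alpha_1 \qquad \text{and} \qquad \int_{\tilde U_2} \tilde\alpha_2 = \int_{f_2(U)} \tilde\alpha_2$$
since $\tilde\alpha_1$ vanishes outside $f_1(U)$ and $\tilde\alpha_2$ vanishes outside $f_2(U)$, and all these integrals converge. We've now reduced to the situation considered in the previous section, and the formula (9.2) shows that
$$\int_{f_1(U)} \tilde\alpha_1 = \int_{f_2(U)} \tilde\alpha_2$$
since $\tilde\alpha_2 = \phi_{12}^\star\tilde\alpha_1$ and both charts are oriented.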
We now want to consider integrating arbitrary n-forms. This means that
we have to put some restriction on X, and the correct restriction is to insist
that X itself is compact.
On a compact manifold, there is a way to chop-up an arbitrary n-form
into a finite number of bump forms. Then we can integrate each piece, and add the answers together. The chopping-up step is done with the following gadget:
Definition 9.12. Let $X$ be a manifold. A partition-of-unity on $X$ is a finite set of functions $\varphi_1, ..., \varphi_r \in C^\infty(X)$ with the following two properties:
(i) For each $i$ the function $\varphi_i \in \Omega^0(X)$ is a bump form, so there exists a chart $(U_i, f_i)$ and a compact subset $W_i \subset U_i$ such that $\varphi_i$ vanishes outside of $W_i$.
(ii) $\varphi_1 + ... + \varphi_r \equiv 1$
Again we are not quite using standard terminology here - everyone would agree that a partition-of-unity must satisfy property (ii), but some people might vary property (i), and/or allow the set of functions to be infinite.
Proposition 9.13. If $X$ is compact then a partition-of-unity exists on $X$.
In fact it's fairly easy to show that a (finite) partition-of-unity can only exist if $X$ is compact, so the proposition is really if-and-only-if.
Proof. For any point $x \in X$, we can find a bump function $\psi_x$ which is constantly equal to 1 on some open neighbourhood $V_x$ of $x$, never negative, and vanishes outside of some compact set which is contained within a chart. Choose this data of $\psi_x$ and $V_x$ for each point $x$. The open sets $\{V_x, x \in X\}$ form an open cover of $X$, so since $X$ is compact there is some finite subcover $\{V_{x_1}, ..., V_{x_r}\}$. Let $\psi_{x_1}, ..., \psi_{x_r}$ be the corresponding set of bump functions. Then the sum $\psi_{x_1} + ... + \psi_{x_r}$ is strictly positive at all points of $X$, since no term is negative and at all points at least one term is equal to 1. Hence we can define
$$\varphi_i = \frac{\psi_{x_i}}{\psi_{x_1} + ... + \psi_{x_r}} \in C^\infty(X)$$
and then $\varphi_1, ..., \varphi_r$ is a partition-of-unity.


Partitions-of-unity can be used for many things, one of which is defining integration. Suppose $X$ is oriented and compact, and we have found a partition-of-unity $\Phi = \{\varphi_1, ..., \varphi_r\}$. Then if $\alpha \in \Omega^n(X)$ is any n-form, we have that
$$\alpha = \varphi_1\alpha + ... + \varphi_r\alpha$$
and each $\varphi_i\alpha$ is a bump form. So it seems sensible to define the integral of $\alpha$ over $X$ to be
$$\int_{X,\Phi} \alpha = \int_X \varphi_1\alpha + ... + \int_X \varphi_r\alpha \in \mathbb{R}$$
We just need to check that this definition is independent of which partition-of-unity we chose.

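Proposition 9.14. Let $X$ be a compact oriented manifold, and let $\Phi = \{\varphi_1, ..., \varphi_r\}$ and $\hat\Phi = \{\hat\varphi_1, ..., \hat\varphi_s\}$ be two partitions-of-unity on $X$. Then for any n-form $\alpha \in \Omega^n(X)$, we have that:
$$\int_{X,\Phi} \alpha = \int_{X,\hat\Phi} \alpha$$
Proof. We claim that if $\beta$ is a bump-form, and $\Phi$ is any partition-of-unity, then $\int_{X,\Phi} \beta$ gives the same answer as $\int_X \beta$. It follows from this claim that if $\alpha$ is any n-form, and $\Phi, \hat\Phi$ are two partitions-of-unity, then:
$$\int_{X,\Phi} \alpha = \sum_i \int_X \varphi_i\alpha = \sum_i \int_{X,\hat\Phi} \varphi_i\alpha = \sum_{i,j} \int_X \hat\varphi_j\varphi_i\alpha$$
If we swap $\Phi$ and $\hat\Phi$ then we get the same answer, which proves the proposition.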
So we just need to prove the claim. Suppose $\beta$ is a bump-form, and $(U, f)$ is an oriented chart such that $\beta$ vanishes outside of a compact subset $W \subset U$. Now let $\Phi$ be any partition-of-unity. Each bump-form $\varphi_i\beta$ also vanishes outside of $W$, so we may evaluate each $\int_X \varphi_i\beta$ using the chart $(U, f)$. Writing everything in these co-ordinates, with $\tilde h$ denoting the coefficient function of $\beta$, we have that
$$\tilde h \, d\tilde x_1 \wedge ... \wedge d\tilde x_n = \sum_i \tilde\varphi_i \tilde h \, d\tilde x_1 \wedge ... \wedge d\tilde x_n$$
and since integrating over $\tilde U$ is a linear operation, we see that
$$\int_X \beta = \sum_i \int_X \varphi_i\beta = \int_{X,\Phi} \beta.$$
Consequently we may drop the partition-of-unity from the notation, and we have a well-defined integral $\int_X \alpha$, for any compact oriented manifold $X$ and any n-form $\alpha \in \Omega^n(X)$.
Example 9.15. Let $X = T^1$, and let's use our usual atlas $\{(U_1, f_1), (U_2, f_2)\}$. We observed in Example 9.5 that there is a volume form $\omega \in \Omega^1(T^1)$ which becomes the constant one-form $dx$ in either chart, and that any other one-form on $T^1$ can be written as $\alpha = h\omega$ for some $h \in C^\infty(T^1)$. Let's fix the orientation $[\omega]$ on $T^1$; this means that both of our charts are oriented. Now let's evaluate the integral:
$$\int_{T^1} h\omega$$
First we need a partition-of-unity. Let $\hat\varphi_1 \in C^\infty(0, 1)$ be some function which is never negative, constantly equal to 1 in some interval containing $\tfrac{1}{2}$, and vanishes outside some larger interval. Now extend $\hat\varphi_1$ to a periodic function on the whole of $\mathbb{R}$ by setting $\hat\varphi_1(0) = 0$ and insisting that $\hat\varphi_1(x) = \hat\varphi_1(x + 1)$ for all $x \in \mathbb{R}$. This defines a function $\varphi_1 \in C^\infty(T^1)$, which vanishes outside a compact subset of $U_1$. If we let $\hat\varphi_2 = 1 - \hat\varphi_1 \in C^\infty(\mathbb{R})$, then this defines a function $\varphi_2 \in C^\infty(T^1)$ which vanishes outside a compact subset of $U_2$, and the pair $(\varphi_1, \varphi_2)$ is a partition-of-unity on $T^1$. Then our integral is the sum of two terms:
$$\int_{T^1} h\omega = \int_{T^1} \varphi_1 h\omega + \int_{T^1} \varphi_2 h\omega$$
Lift the function $h \in C^\infty(T^1)$ to a periodic function $\hat h \in C^\infty(\mathbb{R})$, then the expression for $h$ in either chart is just given by restricting $\hat h$ to the corresponding interval. Now let's evaluate the first term in our integral, which we can do in the chart $(U_1, f_1)$. It gives the answer:
$$\int_{T^1} \varphi_1 h\omega = \int_0^1 \hat\varphi_1(x)\hat h(x) \, dx$$
The second term can be evaluated in $(U_2, f_2)$, and gives
$$\int_{T^1} \varphi_2 h\omega = \int_{-1/2}^{1/2} \hat\varphi_2(x)\hat h(x) \, dx = \int_0^1 \hat\varphi_2(x)\hat h(x) \, dx$$
since $\hat h$ and $\hat\varphi_2$ are periodic. So adding the two terms together gives:
$$\int_{T^1} h\omega = \int_0^1 \big(\hat\varphi_1(x) + \hat\varphi_2(x)\big)\hat h(x) \, dx = \int_0^1 \hat h(x) \, dx$$
In particular, we have that:
$$\int_{T^1} \omega = 1$$
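The computation in Example 9.15 can be reproduced numerically. The Python sketch below builds one possible bump function on (0, 1) from the standard smooth step based on exp(-1/t), takes the second function to be its complement, and integrates each piece in its own chart with a midpoint rule. The specific bump, the test function h(x) = 2 + cos(2 pi x), and all function names are our own assumptions for illustration.

```python
import math


def smooth_step(t):
    """A smooth function equal to 0 for t <= 0 and 1 for t >= 1."""
    def f(s):
        return math.exp(-1.0 / s) if s > 0 else 0.0
    return f(t) / (f(t) + f(1.0 - t))


def phi1_hat(x):
    """Periodic bump: 1 on [0.3, 0.7], 0 outside [0.1, 0.9] (mod 1)."""
    x = x % 1.0
    return smooth_step((x - 0.1) / 0.2) * smooth_step((0.9 - x) / 0.2)


def phi2_hat(x):
    """The complementary function, so that phi1_hat + phi2_hat = 1."""
    return 1.0 - phi1_hat(x)


def h_hat(x):
    """A periodic lift of a function h on the circle."""
    return 2.0 + math.cos(2.0 * math.pi * x)


def midpoint_integral(g, a, b, steps=2000):
    """Midpoint-rule approximation to the integral of g over (a, b)."""
    width = (b - a) / steps
    return width * sum(g(a + (k + 0.5) * width) for k in range(steps))


# First term: evaluated in the chart with co-domain (0, 1).
first = midpoint_integral(lambda x: phi1_hat(x) * h_hat(x), 0.0, 1.0)
# Second term: evaluated in the chart with co-domain (-1/2, 1/2).
second = midpoint_integral(lambda x: phi2_hat(x) * h_hat(x), -0.5, 0.5)

total = first + second
direct = midpoint_integral(h_hat, 0.0, 1.0)
print(total, direct)
```

Both numbers should agree with the exact value 2 of the integral of 2 + cos(2 pi x) over one period, whichever bump function is used.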

There's an easy general observation we can make here: if $X$ is any compact manifold and $\omega \in \Omega^n(X)$ is a volume form, then if we use the orientation $[\omega]$ on $X$ we must have
$$\int_X \omega > 0$$
since the integral will be a sum of non-negative terms, at least one of which is strictly positive. This means that integration over $X$ defines a surjective linear map
$$\int_X : \Omega^n(X) \to \mathbb{R}$$
since if we multiply $\omega$ by a scalar $\lambda \in \mathbb{R}$ then we can arrange $\int_X \lambda\omega$ to take any value we wish.

9.3 Stokes' Theorem

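Suppose we have a function $h \in C^\infty(\mathbb{R})$ that vanishes outside of some closed interval $[-r, r] \subset \mathbb{R}$. Then the one-form
$$dh = \frac{\partial h}{\partial x} \, dx$$
also vanishes outside of this interval, so the integral $\int_{\mathbb{R}} dh$ certainly converges, and in fact $\int_{\mathbb{R}} dh = \int_{-r}^{r} dh$. So by the fundamental theorem of calculus, we have
$$\int_{\mathbb{R}} dh = \int_{-r}^{r} \frac{\partial h}{\partial x} \, dx = h(r) - h(-r) = 0$$
since $h(r) = h(-r) = 0$.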
Let's generalise this observation to higher dimensions. If we work on $\mathbb{R}^n$, then the objects that we can integrate are n-forms, so we must replace $h$ by an (n - 1)-form. For example, let's work on $\mathbb{R}^3$, and consider a 2-form
$$\alpha = \alpha_1 \, dy \wedge dz$$
which only has one non-zero component. Then:
$$d\alpha = \frac{\partial \alpha_1}{\partial x} \, dx \wedge dy \wedge dz$$
Suppose that $\alpha$ vanishes outside some cube, i.e. the function $\alpha_1 \in C^\infty(\mathbb{R}^3)$ is zero unless $x, y, z \in [-r, r]$. Then we have that
$$\int_{\mathbb{R}^3} d\alpha = \int_{-r}^{r} \int_{-r}^{r} \left( \int_{-r}^{r} \frac{\partial \alpha_1}{\partial x} \, dx \right) dy \, dz$$
$$= \int_{-r}^{r} \int_{-r}^{r} \big( \alpha_1(r, y, z) - \alpha_1(-r, y, z) \big) \, dy \, dz$$
$$= 0$$
since $\alpha_1(r, y, z) = \alpha_1(-r, y, z) = 0$ for any values of $y$ and $z$.
If $\alpha$ has more than one component then it's still true that $\int_{\mathbb{R}^3} d\alpha = 0$, since we can evaluate each component individually (both $d$ and $\int$ are linear) and each piece will be zero. We can also perform this argument in exactly the same way in higher dimensions - the only problem is keeping track of the notation! Formally:
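Lemma 9.16. Let $\tilde U \subset \mathbb{R}^n$ be an open set, and let $\tilde\alpha \in \Omega^{n-1}(\tilde U)$ be an (n - 1)-form that vanishes outside some compact subset $W \subset \tilde U$. Then:
$$\int_{\tilde U} d\tilde\alpha = 0$$
Proof. We can extend $\tilde\alpha$ to a smooth (n - 1)-form on the whole of $\mathbb{R}^n$ by setting it to be zero on $\mathbb{R}^n \setminus \tilde U$, and if we pick a large enough $r$ then $W$ will be contained in the n-dimensional cube $[-r, r]^n$. Then we have
$$\int_{\tilde U} d\tilde\alpha = \int_{[-r,r]^n} d\tilde\alpha = 0$$
since $\tilde\alpha \equiv 0$ on each face of the cube.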

Furthermore, this statement generalizes very easily to more interesting manifolds.
Theorem 9.17 (Stokes' Theorem). Let $X$ be a compact oriented manifold of dimension n. Then for any $\alpha \in \Omega^{n-1}(X)$ we have:
$$\int_X d\alpha = 0$$
Proof. Choose a partition of unity on $X$, so $\alpha = (\varphi_1 + ... + \varphi_r)\alpha$ and $d\alpha = d(\varphi_1\alpha) + ... + d(\varphi_r\alpha)$. For each $i$ we know that $\varphi_i\alpha$ is a bump form vanishing outside some compact set $W_i$, hence $d(\varphi_i\alpha)$ is also a bump form, since it also vanishes outside $W_i$. Also, we can see that
$$\int_X d(\varphi_i\alpha) = 0$$
by passing to some oriented chart containing $W_i$ and applying Lemma 9.16. Since integrating over $X$ is linear, it follows that $\int_X d\alpha = 0$.
Example 9.18. Let $X = T^1$, and let $\omega \in \Omega^1(T^1)$ be the volume form we constructed in Example 9.5. We saw in Example 9.15 that (after fixing the orientation $[\omega]$ on $T^1$) the integral of a one-form $\alpha = h\omega \in \Omega^1(T^1)$ is given by
$$\int_{T^1} \alpha = \int_0^1 \hat h(x) \, dx$$
where $\hat h \in C^\infty(\mathbb{R})$ is the periodic function corresponding to $h \in C^\infty(T^1)$. If $\alpha = dg$ for some $g \in C^\infty(T^1)$, then Stokes' Theorem says that $\int_{T^1} \alpha = 0$. Therefore, if $\int_0^1 \hat h(x) \, dx \neq 0$, then there cannot exist a $g \in C^\infty(T^1)$ such that $\alpha = dg$. In particular there is no $g$ such that $\omega = dg$.
The converse to this statement is also true. If $\int_0^1 \hat h(x) \, dx = 0$, then the function
$$\hat g : \mathbb{R} \to \mathbb{R}, \qquad x \mapsto \int_0^x \hat h(y) \, dy$$
will satisfy $\hat g(x) = \hat g(x + 1)$ for all $x \in \mathbb{R}$. Hence it defines a corresponding function $g \in C^\infty(T^1)$, and clearly $dg = h\omega$.
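The dichotomy in Example 9.18 is easy to see numerically. The Python sketch below computes the primitive of a periodic function with a midpoint rule and measures how far it is from being periodic; the two test functions and the names `primitive` and `period_gap` are our own illustrative choices.

```python
import math


def midpoint_integral(g, a, b, steps=4000):
    """Midpoint-rule approximation to the integral of g over (a, b)."""
    width = (b - a) / steps
    return width * sum(g(a + (k + 0.5) * width) for k in range(steps))


def primitive(h_hat, x):
    """Approximate g_hat(x), the integral of h_hat from 0 to x."""
    return midpoint_integral(h_hat, 0.0, x)


def period_gap(h_hat, x):
    """g_hat(x + 1) - g_hat(x); zero exactly when g_hat is periodic."""
    return primitive(h_hat, x + 1.0) - primitive(h_hat, x)


def zero_mean(x):
    """Periodic with integral 0 over a period, so h*omega is exact."""
    return math.cos(2.0 * math.pi * x)


def nonzero_mean(x):
    """Periodic with integral 1 over a period, so h*omega is not exact."""
    return 1.0 + math.cos(2.0 * math.pi * x)


print(period_gap(zero_mean, 0.3), period_gap(nonzero_mean, 0.3))
```

In the first case the gap is zero up to numerical error, so the primitive descends to a function on the circle; in the second case the gap equals the integral over one period, which is 1, so no such function exists.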
The observation from the previous example is actually a general phenomenon. If $X$ is a connected compact oriented n-dimensional manifold, and $\alpha \in \Omega^n(X)$ is an n-form, then
$$\int_X \alpha = 0$$
if-and-only-if there exists some $\beta \in \Omega^{n-1}(X)$ such that $\alpha = d\beta$. In other words, if we consider the subspace $\mathrm{Im}(d) \subset \Omega^n(X)$ given by the image of the linear map
$$d : \Omega^{n-1}(X) \to \Omega^n(X)$$
then integration provides an isomorphism:
$$\int_X : \frac{\Omega^n(X)}{\mathrm{Im}(d)} \xrightarrow{\;\sim\;} \mathbb{R}$$
This result is quite hard to prove; it's a first glimpse of an extremely deep and important fact about manifolds called Poincaré duality.

Topological spaces

If $X$ is a set we let $\mathcal{P}X$ denote the power set of $X$, i.e. the set of all subsets of $X$.
Definition A.1. Let $X$ be a set. A topology on $X$ is a collection
$$\mathcal{T} \subset \mathcal{P}X$$
of subsets of $X$, satisfying the list of axioms below. We refer to elements of $\mathcal{T}$ as open sets. The axioms are:
(i) The empty subset is open, and the whole of $X$ is open.
(ii) The intersection of two open sets is open.
(iii) Given any collection of open sets, their union is also open.
Axiom (ii) implies that the intersection of any finite collection of open sets is open. Axiom (iii) applies to any collection of open sets, including infinite ones. If we have chosen a topology on $X$, then we call $X$ a topological space.
A subset V of a topological space X is called closed iff its complement
c
V = X \V is open. Note that most subsets of X are neither open nor closed.
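On a finite set the axioms can be verified mechanically. The sketch below is only an illustration (the function name `is_topology` and the sample collections are assumptions, not part of the notes). For a finite collection it is enough to check pairwise unions and intersections, since closure under arbitrary unions then follows by induction.

```python
from itertools import combinations

def is_topology(X, T):
    """Check the axioms of Definition A.1 for a finite set X and a
    finite collection T of subsets of X."""
    X = frozenset(X)
    opens = {frozenset(U) for U in T}
    # Axiom (i): the empty set and X itself are open.
    if frozenset() not in opens or X not in opens:
        return False
    # Every open set must actually be a subset of X.
    if any(not U <= X for U in opens):
        return False
    # Axioms (ii) and (iii): pairwise checks suffice for a finite collection.
    for U, V in combinations(opens, 2):
        if (U & V) not in opens or (U | V) not in opens:
            return False
    return True

# A two-point space in which only one of the points is open.
two_point_ok = is_topology({1, 2}, [set(), {2}, {1, 2}])
# Fails axiom (iii): the union {1, 2} of two open sets is missing.
three_point_bad = is_topology({1, 2, 3}, [set(), {1}, {2}, {1, 2, 3}])
```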
Example A.2. Let $X = \mathbb{R}^n$, equipped with the usual (Euclidean) norm. For a point $x \in \mathbb{R}^n$, and a real number $r \in \mathbb{R}_{>0}$, the open ball around $x$ of radius $r$ is the set:
\[ B(x, r) = \{ y \in \mathbb{R}^n \;;\; |y - x| < r \} \]
We declare that a subset $U \subset \mathbb{R}^n$ is open iff for any point $x \in U$ there exists some $\epsilon > 0$ such that:
\[ B(x, \epsilon) \subset U \]
Equivalently, we can say that a subset $U \subset X$ is open iff $U$ can be written as a union of some collection of open balls. It is easy to prove that this defines a topology on $\mathbb{R}^n$.
Definition A.3. Let $X$ and $Y$ be topological spaces, and let $f$ be a function:
\[ f : X \to Y \]
We say that $f$ is continuous iff whenever $U \subset Y$ is an open set then its pre-image
\[ f^{-1}(U) \subset X \]
is also open. Equivalently, we can require that the pre-image of every closed set is closed.
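For finite spaces this definition can be tested directly by computing pre-images. The following sketch is illustrative only: the helper names and the two topologies on $\{0, 1\}$ are assumptions chosen for the example. It shows that the identity map is continuous from the discrete topology to a coarser one, but not in the other direction.

```python
def preimage(f, U):
    """Pre-image of the set U under a function f given as a dict."""
    return frozenset(x for x in f if f[x] in U)

def is_continuous(f, T_X, T_Y):
    """f is continuous iff the pre-image of every open set of Y is open in X."""
    opens_X = {frozenset(U) for U in T_X}
    return all(preimage(f, U) in opens_X for U in T_Y)

# Two topologies on the same underlying set {0, 1}.
discrete = [set(), {0}, {1}, {0, 1}]
coarser = [set(), {1}, {0, 1}]
identity = {0: 0, 1: 1}

# Every subset is open in the discrete topology, so this is continuous.
discrete_to_coarser = is_continuous(identity, discrete, coarser)
# The pre-image of the open set {0} is {0}, which is not open in `coarser`.
coarser_to_discrete = is_continuous(identity, coarser, discrete)
```

So a continuous bijection need not have a continuous inverse, even on a two-point set.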
It's easy to show that the composition of two continuous functions is continuous.
Definition A.4. If $f : X \to Y$ is a continuous function between two topological spaces then we say that $f$ is a homeomorphism iff $f$ is a bijection and the inverse function
\[ f^{-1} : Y \to X \]
is also continuous. If there exists a homeomorphism between $X$ and $Y$ then we say that $X$ and $Y$ are homeomorphic.
Definition A.5. Let $X$ be a topological space, and let $Z \subset X$ be any subset. We define the subspace topology on $Z$ by declaring that a subset $U \subset Z$ is open iff there exists some open set $\tilde{U} \subset X$ such that:
\[ U = Z \cap \tilde{U} \]
It's easy to prove that this really does define a topology on $Z$, and that the inclusion map $Z \hookrightarrow X$ is continuous. It follows that if $f : X \to Y$ is continuous then the restriction $f|_Z : Z \to Y$ is also continuous. It's also easy to prove that a function $g : Y \to Z$ is continuous iff $g$ is continuous when viewed as a function $g : Y \to X$.
Definition A.6. Let $X$ be a topological space, let $Y$ be a set, and let
\[ q : X \to Y \]
be a surjective function. We define the quotient topology on $Y$ by declaring that $U \subset Y$ is open iff $q^{-1}(U)$ is open in $X$.
It's easy to check that this really is a topology on $Y$, and that it makes $q$ continuous.
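The quotient topology on a finite set can be computed by brute force: test every subset of $Y$ and keep those whose pre-image is open. This is a sketch under assumed example data (the three-point space `chain_topology` and the map `q_map` are hypothetical), not a practical algorithm, since it enumerates the whole power set.

```python
from itertools import combinations

def powerset(Y):
    """All subsets of a finite set Y, as frozensets."""
    items = list(Y)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def quotient_topology(q, T_X):
    """Open sets of Y = image(q): those U whose pre-image under q is open in X.
    The map q is given as a dict from points of X to points of Y."""
    opens_X = {frozenset(U) for U in T_X}
    Y = set(q.values())
    return {U for U in powerset(Y)
            if frozenset(x for x in q if q[x] in U) in opens_X}

# Hypothetical example: X = {0, 1, 2} with a nested chain of open sets,
# and a surjection that glues the points 0 and 1 together.
chain_topology = [set(), {0}, {0, 1}, {0, 1, 2}]
q_map = {0: "p", 1: "p", 2: "r"}

quotient_opens = quotient_topology(q_map, chain_topology)
```

Here the pre-image of `{"p"}` is `{0, 1}`, which is open, while the pre-image of `{"r"}` is `{2}`, which is not, so the quotient has exactly three open sets.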
Let $X$ and $Y$ be two topological spaces. We can put a topology on their Cartesian product
\[ X \times Y = \{ (x, y) \;;\; x \in X,\ y \in Y \} \]
by declaring that if $U_1$ is an open set in $X$ and $U_2$ is an open set in $Y$ then
\[ U_1 \times U_2 \subset X \times Y \]
is an open set, and further declaring that any union of sets of this form is also an open set. It's easy to check that this defines a topology, and that the projection map from $X \times Y$ to either $X$ or $Y$ is continuous.

We can also put a topology on the disjoint union
\[ X \sqcup Y \]
by declaring that a subset $U \subset X \sqcup Y$ is open iff $U \cap X$ is open in $X$ and $U \cap Y$ is open in $Y$ (again it's easy to check that this is a topology). This means that both $X$ and $Y$ are subspaces of $X \sqcup Y$.
Definition A.7. A topological space $X$ is called compact iff, whenever we have a collection of open sets $\{U_i, i \in I\}$ (indexed by some set $I$) such that
\[ \bigcup_{i \in I} U_i = X \]
then it is possible to find a finite subset $J \subset I$ such that we still have:
\[ \bigcup_{j \in J} U_j = X \]
A collection $\{U_i, i \in I\}$ like this is called an open cover of $X$, and the sub-collection $\{U_j, j \in J\}$ is called a finite sub-cover. A subset $Z \subset X$ is called compact iff $Z$ is compact in the subspace topology.
Example A.8. If $X = \mathbb{R}^n$ with the usual topology, then a subset $Z \subset \mathbb{R}^n$ is compact iff $Z$ is both closed and bounded, i.e.
\[ Z \subset B(0, R) \]
for some large-enough $R$. We won't give a proof of this fact; to find one, consult any first course on topological spaces.

B The Hausdorff condition and bump functions

Definition B.1. A topological space $X$ is called Hausdorff if for any two distinct points $x, y \in X$ we can find open sets $U$ and $V$ with $x \in U$ and $y \in V$ and $U \cap V = \emptyset$.
So $x$ and $y$ can be 'housed-off' from each other by these open neighbourhoods. For us, the important fact about Hausdorff spaces is:
Lemma B.2. If $X$ is Hausdorff, and $Z \subset X$ is compact, then $Z$ is closed in $X$.
Proof. Fix a point $y \in X \setminus Z$. For any point $z \in Z$ we can find an open set $U_z$ containing $z$ and an open set $V_z$ containing $y$ such that $U_z \cap V_z = \emptyset$. The collection of all the $U_z$'s is an open cover of $Z$, so it contains a finite subcover $\{U_{z_1}, ..., U_{z_t}\}$. The intersection $V_{z_1} \cap ... \cap V_{z_t}$ of the corresponding open neighbourhoods of $y$ is an open neighbourhood of $y$, and it does not intersect any $U_{z_i}$, so it is contained in $X \setminus Z$. Since $y$ was arbitrary, every point of $X \setminus Z$ has an open neighbourhood contained in $X \setminus Z$. Therefore $X \setminus Z$ is open.
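The Hausdorff hypothesis in Lemma B.2 cannot be dropped, and a two-point space already shows this. In the sketch below (the helper names and the example topologies are assumptions made for illustration) every subset is finite and hence automatically compact, so the only question is whether it is closed.

```python
from itertools import combinations

def is_hausdorff(X, T):
    """Check Definition B.1 on a finite space by searching for
    disjoint open neighbourhoods of each pair of distinct points."""
    opens = [frozenset(U) for U in T]
    for x, y in combinations(X, 2):
        separated = any(x in U and y in V and not (U & V)
                        for U in opens for V in opens)
        if not separated:
            return False
    return True

def is_closed(Z, X, T):
    """Z is closed iff its complement in X is open."""
    opens = {frozenset(U) for U in T}
    return (frozenset(X) - frozenset(Z)) in opens

points = {0, 1}
non_hausdorff = [set(), {1}, {0, 1}]    # the only open set containing 0 is X
discrete = [set(), {0}, {1}, {0, 1}]    # every subset is open

# The finite (hence compact) subset {1} fails to be closed here,
# because its complement {0} is not open.
compact_not_closed = not is_closed({1}, points, non_hausdorff)
```

In the discrete topology the same space is Hausdorff and every subset is closed, as the lemma predicts.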
If we have a manifold $X$, then since $X$ is in particular a topological space, we can ask if it is Hausdorff. In fact all reasonable manifolds are Hausdorff; for example any submanifold of $\mathbb{R}^n$ is Hausdorff, and any submanifold of a Hausdorff manifold is Hausdorff. However, it is possible to construct manifolds that are not Hausdorff.
Example B.3. Take the disjoint union $\mathbb{R} \sqcup \mathbb{R}$ of two copies of $\mathbb{R}$, and for any $x \in \mathbb{R}$ let's write $x_1$ or $x_2$ for the corresponding points in either component. Let $X$ be the quotient
\[ X = (\mathbb{R} \sqcup \mathbb{R}) \,/\, (x_1 \sim x_2 \text{ for } x \neq 0) \]
(with the quotient topology). Then $X$ looks a lot like $\mathbb{R}$, but the origin has been replaced with two points $0_1$ and $0_2$. If we let $U_1$ and $U_2$ be the images in $X$ of the two copies of $\mathbb{R}$, then it's easy to show that they are the domains of two co-ordinate charts, so $X$ is a (1-dimensional) topological manifold. In fact this is even a smooth atlas.
However, $X$ is not Hausdorff. If $U$ is any open set containing the first origin $0_1$, and $V$ is any open set containing the second origin $0_2$, then $U$ and $V$ must have a non-empty intersection, since each of them contains the image of $(-\epsilon, 0) \cup (0, \epsilon)$ for some $\epsilon > 0$.
To avoid weird examples like this, many people would insist that the definition of a topological manifold includes the condition that $X$ is Hausdorff.
We will only use the Hausdorff condition for one thing: bump functions. In Section 5.4 we observed that for any open set $\tilde{U} \subset \mathbb{R}^n$, any $r, r'$ such that $0 < r < r'$ and
\[ \overline{B}(0, r') \subset \tilde{U} \]
and any constants $a, b \in \mathbb{R}$, it is possible to find a smooth function $\tilde{\psi} \in C^\infty(\tilde{U})$ such that $\tilde{\psi}$ is constantly equal to $a$ inside the ball $B(0, r)$ and constantly equal to $b$ outside the larger ball $B(0, r')$.
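Then we considered bump functions on an arbitrary manifold $X$. We pick a chart $(U, f)$ on $X$, with $f : U \to \tilde{U}$, and a function $\tilde{\psi}$ on $\tilde{U}$ as above. Then we extend $\tilde{\psi}$ to a function on the whole of $X$ by defining:
\[ \psi(y) = \begin{cases} (\tilde{\psi} \circ f)(y), & \text{for } y \in U \\ b, & \text{for } y \notin U \end{cases} \]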
Proposition B.4. If $X$ is Hausdorff then this function $\psi$ is smooth.
Proof. The closed ball $\overline{B}(0, r')$ is a compact subset of $\tilde{U}$, so $W = f^{-1}(\overline{B}(0, r'))$ is a compact subset of $X$, contained in $U$. Since $X$ is Hausdorff, Lemma B.2 says that $W$ is closed in $X$. Then $\psi$ is smooth inside the open set $U$, and it is certainly smooth inside the open set $X \setminus W$ since it's constant in this locus. Since $U$ and $X \setminus W$ together cover $X$, it follows that $\psi$ is smooth.
To understand why the Hausdorff condition was necessary here, consider the following:
Example B.5. Let $X$ be the line with two origins from Example B.3. Let $\tilde{\psi}$ be a bump function on $\mathbb{R}$ which is constantly equal to $a$ in some open interval around $0$, and constantly equal to $b$ outside some larger open interval. View this as a bump function in the chart $U_1$, and extend it to a function $\psi$ on the whole of $X$ as we did above. Notice that $X \setminus U_1$ is a single point, the other origin $0_2$. If we restrict $\psi$ to the chart $U_2 \cong \mathbb{R}$ we get a function which is equal to $b$ at the origin, but constantly equal to $a$ inside the set $(-r, 0) \cup (0, r)$ for some $r > 0$. So if $a \neq b$, then $\psi$ is not even continuous.
