Differential Geometry I

Lecture by Tom Ilmanen

Maciej Swiatek

17.11.2023
Contents

Preface

I Curves and Surfaces

1 Curves
1.1 Definition and some Restrictions
1.2 Arc-length
1.3 Geometric quantities
1.3.1 The tangent vector
1.3.2 The curvature vector
1.4 Curves in R^2
1.4.1 The main interpretations of the curvature scalar
1.4.2 Curvature determines curve up to rigid motion
1.4.3 The interaction between global and local
1.5 Curves in R^3
1.5.1 First remarks
1.5.2 The curvature scalar in three dimensions
1.5.3 Torsion
1.5.4 How Curvature and Torsion determine curve
1.5.5 The Frenet Frame
1.5.6 Global theorems for curves in R^3

2 Surfaces
2.1 Some definitions and basic quantities
2.2 The curvature of surfaces in R^3
2.3 The Geometric Definition of Curvature on Surfaces
2.3.1 A bit about Q_p
2.3.2 The independence of Q_p(v) from the curve we choose
2.3.3 Proof of the theorem
2.4 The second fundamental form
2.4.1 Simplifying the second fundamental form
2.4.2 The mean and Gauss curvatures
2.5 Symmetry and Curvature
2.5.1 Examples: Determining A_p quickly through Symmetry
2.6 Interlude: A bit about Differentiation
2.6.1 Vector Fields on R^3
2.6.2 Vector fields on surfaces
2.7 Another characterisation of curvature: The Weingarten Map
2.7.1 The Weingarten Map
2.7.2 Proof of connection between the Weingarten map and A_p
2.8 *Other formulas for curvature
2.8.1 Curvature of a parametrized surface
2.8.2 The curvature of a graph
2.9 Intrinsic Geometry
2.9.1 The intrinsic metric of M
2.9.2 The first fundamental form
2.10 Intrinsic Isometries and Gaussian curvature
2.11 Two Important theorems
2.11.1 The Gauss-Bonnet-Theorem
2.12 The unification theorem

II Manifolds

3 Topology and topological manifolds
3.1 Humble beginnings
3.2 A topological space
3.3 Charts: Part I
3.4 The Hausdorff Condition
3.5 The ant
3.6 Interlude: Useful topology for the course

4 Smooth Manifolds
4.1 Charts II: Compatible charts
4.2 Examples of smooth manifolds
4.3 The maximal atlas
4.4 The final definition of a smooth manifold
4.5 Cartesian Products, Smoothness
4.5.1 The Cartesian product of two manifolds
4.5.2 Smooth functions between manifolds
4.6 Diffeomorphisms
4.6.1 What you can get from diffeomorphisms

5 Tangent Vectors
5.1 Tangent vectors from charts
5.1.1 Vectors and Maps in R^n
5.1.2 Back to manifolds
5.1.3 The definition
5.2 Tangent vectors as directional derivatives
5.2.1 The basis of T_p M with the second definition
5.2.2 The coefficients of a vector
5.2.3 Proving that {∂/∂x^i}, i = 1, ..., n, is a basis
5.3 Tangent vectors from curves
5.4 Tangent vectors as operators that satisfy the product rule
5.5 Change of coordinates
5.6 Differentiation of a function between manifolds
5.7 The chain rule
5.8 The coordinate expression for df(p)

6 Tangent Spaces and Tangent Bundles

Preface

Sections with an asterisk in front of them are not mandatory to read; you can skip
them as you like (“You should know these ideas exist, but don’t need to learn
them”). Big thanks go out to Anastasia Sandamirskaya and Ji Zhexian for helping
with writing the first lecture about surfaces.
Disclaimer: We generally stay true to the notation of the lecture. The only real
exception to this (so far) is the symbols we use for charts: ψ, χ were used
in the lecture for general charts, whereas we use Ch1 and Ch2 (literally: chart 1 and
chart 2) to make the equations a bit more direct. Other than that, the symbol
T_{1→2} for the transition/overlap map is also a product of this script; in the
lecture it was always written out with the charts (χ ∘ ψ^{-1}). We did this to make a
few equations a bit more readable.
Note: There are currently a few pages, particularly where there are many figures
in the text, that have very wide gaps. This changes every time one adds any text,
because LaTeX re-chooses where to put things like figures and definitions and then
tries to fit the text around this. I would need to fix this manually at every such
point and will do it after the script is done. For now I hope you can get past this.

Part I

Curves and Surfaces


We start with some of the most intuitive examples of the type of manifolds we
will be working with, that is, with curves and surfaces embedded in some form of
$\mathbb{R}^N$.
Chapter 1

Curves

In this chapter, we will deal with curves. We first define what we mean by a curve,
and impose some restrictions on the kind of curve we want to deal with. We won’t
prove all the things we claim in this chapter, as some of these things you should
have already seen in a Calculus class and this is only a quick overview.

1.1 Definition and some Restrictions


Given that this is Differential Geometry, we do not want to work with discontinuous
curves. We therefore choose to work with smooth curves.
Definition 1.1.1 (Smooth curve). We define a smooth curve in N dimensions
to be a function from some interval I to $\mathbb{R}^N$ which is smooth. Mathematically:

A smooth curve is a function $\gamma : I \to \mathbb{R}^N$ with $\gamma \in C^\infty(I)$.

The interval can be any sort of interval you want: open, closed, half-open, etc.
We also allow unbounded intervals like $(-\infty, 0]$. You can see an example in Figure 1.1.
What we are most interested in in Differential Geometry is not actually the
parametrization of the curve. What we usually mean by the curve is the actual line
in $\mathbb{R}^N$ that you can draw on a piece of paper: the real geometric line, not the
function that assigns it a parameter value.
That is why the parametrization is not the main player in Differential Geometry.
The curve exists independently of the parametrization. It is (mathematically) the image
of the function γ. We will use names like γ for the image of the function, not just
for the function, as that is what we care about the most.
Smoothness is not the only property we will (usually) want a curve we work
with to have, because smoothness in the sense above does not guarantee that the
image of the curve is a smooth object. (It only requires the parametrization to be
smooth.) We can see this with the example below.
Example 1.1.1 (Smooth parametrization doesn’t imply smooth image). Take
the curve $\gamma : t \in \mathbb{R} \mapsto (t^2, t^3) \in \mathbb{R}^2$. It is clear that this is a smooth curve. (The
components are polynomials in t, so all the derivatives exist and are continuous.)
But look at Figure 1.2. The image of the curve is obviously not smooth in $\mathbb{R}^2$
at t = 0, or equivalently at x = (0, 0). What is happening there? Well, it
resembles the graph of the absolute value function a bit, which also has a sharp bend
at a point. There, the problem was with the derivative: it simply did not
exist at that point, which gave the graph its weird behavior (the sharp bend). Similarly,
here the problem is also with the derivative. It exists, obviously, since this is a
smooth curve. But it becomes 0 at the problem point (t = 0, i.e. the origin). A
curve with such a bend is not something we really want to work with, therefore
we put another restriction on the curves we work with. We eliminate curves like
the one from this example simply by saying we don’t work with curves whose
derivative becomes 0 anywhere.

Figure 1.1: An example of a smooth curve. The interval I gets mapped onto a
curve in $\mathbb{R}^3$ by the function γ.

Figure 1.2: The curve $\gamma : t \in \mathbb{R} \mapsto (t^2, t^3) \in \mathbb{R}^2$. At t = 0 we see that the image
of the curve is not a smooth object.

As we saw in the previous example, we get into problem situations if the derivative
of the curve with respect to the parametrization parameter is zero. We therefore
define regular curves as those for which this doesn’t happen, or in other words, where
the velocity never vanishes.

Definition 1.1.2 (regular curves). A smooth curve is called a regular curve if:

\[ \frac{d\gamma}{dt}(t) \neq 0 \quad \text{for all } t \in I \tag{1.1} \]

where γ is the smooth curve and I is the interval it is defined on.
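
The regularity condition (1.1) can also be checked numerically. The following is a minimal Python sketch and not part of the lecture: the helper names `velocity`, `speed`, `cusp` and `circle` are hypothetical, and the derivative is only approximated by a central difference with an assumed step size h.

```python
import math

def velocity(gamma, t, h=1e-6):
    """Central-difference approximation of d(gamma)/dt at t (h is an assumed step size)."""
    p, q = gamma(t + h), gamma(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(p, q))

def speed(gamma, t):
    """Length of the approximate velocity vector."""
    return math.hypot(*velocity(gamma, t))

def cusp(t):
    """The curve (t^2, t^3) from Example 1.1.1."""
    return (t**2, t**3)

def circle(t):
    """A regular curve for comparison: the unit circle."""
    return (math.cos(t), math.sin(t))

print(speed(cusp, 0.0))    # practically zero: the curve is not regular at t = 0
print(speed(circle, 0.0))  # close to 1: no problem here
```

A vanishing speed at a single parameter value is enough to rule a curve out as regular, which is exactly what happens for the cusp at t = 0.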

Note 1. We will use various notations for the derivative of a curve. These
include:

\[ \frac{d\gamma}{dt} = \gamma_t = \dot{\gamma} \tag{1.2} \]

1.2 Arc-length
Now that we have said what we mean by a curve and restricted it so as to not run
into problems like the one in the example above, we can start with the geometry.
Undeniably, one of the most important quantities in geometry is the length. If you
know the lengths of a problem, you already know quite a bit of the geometry. What
is the length of a (piece of a) curve?
Well, we already restricted ourselves to regular curves, so our motivation
will be more on the intuitive side.
Imagine you have any curve, like the one in Figure 1.3. The idea is that we
divide the curve into very small almost-straight parts, calculate the length of each
part by approximating it as a straight line, and then sum all of those lengths
back up. We can do this for reasonable (i.e. regular) curves. Of course, in reality
what we do is go infinitesimal, at which point this becomes an integral.
For the small piece as seen in the figure, we have $\Delta s = \left|\frac{d\gamma}{dt}\right| \Delta t$, where by s we
mean the length and by ∆s the very small length of that very small part. Afterwards
we add all of these up, and in the continuum limit we get an integral:

\[ s(t) = \int_\gamma ds = \int_{t_0}^{t} \left|\frac{d\gamma}{dt}\right| dt \tag{1.3} \]

Figure 1.3: A curve and our intuitive way to understand the definition of the
arc-length. We zoom in on a very small part of the curve, between t′ and t′ +∆t.
There, if ∆t is small enough, the line will be approximately straight and we can
use the velocity vector to calculate the length of that piece approximately. Note
that the velocity vector is drawn in way smaller than it would actually be for
any reasonable ∆t, just so that the picture is clearer.

Definition 1.2.1 (Arc-length). We define the arc-length of a curve $\gamma : I \to \mathbb{R}^n$
by first choosing a specific point $t_0 \in I$ and its image $\gamma(t_0)$ as a reference point.
Then the arc-length s(t) between t and $t_0$ is:

\[ s(t) = \int_{t_0}^{t} \left|\frac{d\gamma}{dt}\right|_{\mathbb{R}^n} dt \tag{1.4} \]

Note that in the definition we did not assume that t > t0 , a negative arc-length
is possible, simply by going into the opposite direction of the parametrization of the
curve.
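
The "cut into small straight pieces and add up" idea behind (1.3) and (1.4) translates almost literally into code. Below is a rough Python sketch under stated assumptions: the function name `arc_length`, the number of pieces `n` and the test curve `circle2` are our own choices for illustration, and the chord sum only approximates the integral.

```python
import math

def arc_length(gamma, t0, t, n=10_000):
    """Approximate s(t) from (1.4) by summing the lengths of n short chords.

    The sign follows the definition: the result is negative if t < t0.
    """
    dt = (t - t0) / n
    total = 0.0
    prev = gamma(t0)
    for i in range(1, n + 1):
        cur = gamma(t0 + i * dt)
        total += math.dist(prev, cur)  # length of one almost-straight piece
        prev = cur
    return total if t >= t0 else -total

def circle2(t):
    """Hypothetical test curve: a circle of radius 2."""
    return (2 * math.cos(t), 2 * math.sin(t))

print(arc_length(circle2, 0.0, math.pi))   # close to 2*pi (half the circumference)
print(arc_length(circle2, 0.0, -math.pi))  # close to -2*pi (going the other way)
```

The second call illustrates the remark above: walking against the direction of the parametrization gives a negative arc-length.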
We already mentioned that the geometrically interesting object is the image of
γ, not the function γ (i.e. the parametrization) itself. We will care mostly about things
we can define on the image of γ that do not depend on that parametrization.
The arc-length is something independent of the parametrization.¹

¹ Of course, we can always choose to parameterize the curve in the other direction, which
changes the arc-length by a minus sign. We can also choose a reference point other than
$\gamma(t_0)$. But these choices are rather trivial and we won’t really mention them from
now on.

In line with this philosophy, we can define a very convenient, but also more
geometrically “real” parametrization. The idea is that the arc-length is a geometric
object independent of parametrization, and that, for regular curves, we can use the
arc-length to parameterize the curve.

Lemma 1.2.1 (Reparametrization of a regular curve). Let $\gamma : t \in I \to \mathbb{R}^N$ be
a regular curve. Then we can re-parameterize it to a new (regular) curve β(s),
so that

\[ \left|\frac{d\beta}{ds}\right| = 1 \tag{1.5} \]

In other words, we parameterize it by the arc-length s.

Proof. We will only sketch the proof, as this is something rather simple and you
very likely already saw the proof in a calculus class.²
The first step is to take the arc-length and see it as a function of t:

\[ s = f(t) = \int_{t_0}^{t} \left|\frac{d\gamma}{dt}\right| dt \tag{1.6} \]

Since γ is regular, $f'(t) = |\gamma_t(t)| > 0$, so f is strictly increasing and can be inverted.
We can then take the inverse of this function, call it $g = f^{-1}$, and express
t as a function of s, t = g(s). If we take $\beta = \gamma \circ g$, i.e. β(s) = γ(g(s)), we have found the right
parametrization. The only thing left that you need to convince yourself of is that
the velocity is really of unit length. (You can do this using the chain rule.)

Of course, the curve is still regular, and all the properties like smoothness are
still obeyed by the curve. The image of β and γ is of course exactly the same, i.e,
you can’t change the curve simply by re-parameterizing. You might find Figure 1.4
helpful in visualizing this.
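
The construction in the proof sketch (tabulate s = f(t), invert it, compose with γ) can be imitated numerically. The following Python sketch is only an illustration under assumptions: the names `arc_length_table`, `reparametrize` and `parabola` are hypothetical, f is approximated by a chord sum, and $g = f^{-1}$ is approximated by linear interpolation in a table.

```python
import bisect
import math

def arc_length_table(gamma, t0, t1, n=20_000):
    """Tabulate (t_i, s_i), where s_i is a chord-sum approximation of f(t_i)."""
    ts = [t0 + (t1 - t0) * i / n for i in range(n + 1)]
    ss = [0.0]
    for a, b in zip(ts, ts[1:]):
        ss.append(ss[-1] + math.dist(gamma(a), gamma(b)))
    return ts, ss

def reparametrize(gamma, t0, t1, n=20_000):
    """Return (beta, total_length) with beta = gamma o g and g approximating f^{-1}."""
    ts, ss = arc_length_table(gamma, t0, t1, n)

    def g(s):
        # Invert s = f(t) by linear interpolation between neighbouring table entries.
        i = min(max(bisect.bisect_right(ss, s) - 1, 0), n - 1)
        w = (s - ss[i]) / (ss[i + 1] - ss[i])
        return ts[i] + w * (ts[i + 1] - ts[i])

    return (lambda s: gamma(g(s))), ss[-1]

def parabola(t):
    """Hypothetical test curve: regular, but not of unit speed."""
    return (t, t * t)

beta, length = reparametrize(parabola, 0.0, 1.0)
h = 1e-3
s = 0.5 * length
# Approximate speed of beta halfway along the curve: it should be close to 1.
print(math.dist(beta(s + h), beta(s - h)) / (2 * h))
```

The division by `ss[i + 1] - ss[i]` is where regularity enters: for a regular curve consecutive table entries differ, so the inverse is well defined.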

1.3 Geometric quantities


We continue our search for geometric quantities (other than arc-length), that we can
find in connection to curves. We will find that the tangent vector and the curvature
vector (see definition below) are both independent of the specific parametrization
of our curve and that they tell us a lot of geometric information.

1.3.1 The tangent vector


You have, throughout your studies, seen many functions, and many curves. Your
intuition from Calculus about the tangent vector will probably make you very quickly
say that the tangent vector (here called τ) should look like this:

\[ \tau(t) \overset{?}{=} \frac{d\gamma}{dt} \tag{1.7} \]

² If not, try it yourself as an exercise.

Figure 1.4: Different parametrizations of the same curve. The curve is drawn in
green, the ticks are the points on the curve with the parameter values written
next to them. In (a) you see a typical non-special parametrization (i.e. the “t”);
in (b) you find the curve parameterized by the arc-length (s). It is intuitively
clear why the parametrization is not something geometrically interesting: the
real curve (green line) exists independently of the ticks. In (c) you find another
parametrization by the arc-length, except with a different choice of reference
point on the curve.

A bit of thought, however, reveals that this cannot be true. Why? Well, it is not
independent of parametrization. Imagine, for example, you were to go twice as fast
along the curve. Then your velocity vector (= tangent vector in this example) would
be twice as big at every point. But if we want the tangent vector to be something
fundamentally independent of the parametrization, then equation 1.7 cannot be the
correct definition of the tangent vector.
How can we fix this? Well, look at Figure 1.5. It shows the same curve,
parameterized in three different ways, with the “fake” tangent vectors (from equation 1.7)
drawn in. The thing that should jump out at you is that, while the length of the vectors
does change, the direction does not.³ The tangent vector as we defined it is not
the geometrically real thing; rather, the unit tangent vector is, which is exactly how
we choose to define it below.

Figure 1.5: The ”fake” tangent vectors from equation 1.7 for different
parametrizations of the same curve (green). In (a) you see a random
parametrization (black ticks) and its ”fake” tangent vector (blue) from equation
1.7. In (b) you have the same situation, only that this time you go twice as fast
along the curve. Notice that the vectors (the physically drawn arrows) change.
In (c) you have the same thing, but this time parameterized with arc-length.

Definition 1.3.1 (tangent vector of a curve). We define the tangent vector τ
of a curve $\gamma : t \in I \to \mathbb{R}^N$ to be the vector:

\[ \tau(t) = \frac{d\gamma/dt}{|d\gamma/dt|} \tag{1.8} \]

If we parameterize by the arc-length, then the formula for the tangent vector
becomes:

\[ \tau(s) = \frac{d\gamma}{ds} \tag{1.9} \]

since $\left|\frac{d\gamma}{ds}\right| = 1$.

³ At least to the precision of my drawing skills.

In this sense, we get something geometric, independent of the parametrization.
You can also think of this as us choosing the tangent vector to be the one from
Figure 1.5(c).
Note 2. We abused notation a bit. We write γ(s) instead of β(s), since, as we
already mentioned, we only really care about the image of those, and they are
the same, and it avoids cluttered notation. We also write $\frac{d\gamma}{ds}$ for $\frac{d\beta}{ds}$ for the same
reason. By the chain rule, we get $\frac{d\beta}{ds} = \frac{d\gamma}{dt}\frac{dg}{ds}$, which we also write as $\frac{d\gamma}{dt}\frac{dt}{ds}$, to
avoid introducing another symbol, g, since it is just the function that expresses
the parameter t in terms of s.
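
The "go twice as fast" thought experiment from before Definition 1.3.1 is easy to try out. The snippet below is a small Python sketch, not taken from the lecture: `derivative`, `unit_tangent`, `slow` and `fast` are hypothetical names, the ellipse is an arbitrary sample curve, and derivatives are approximated by central differences.

```python
import math

def derivative(gamma, t, h=1e-6):
    """Central-difference approximation of gamma_t (the 'fake' tangent vector of (1.7))."""
    p, q = gamma(t + h), gamma(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(p, q))

def unit_tangent(gamma, t):
    """tau(t) = gamma_t / |gamma_t|, equation (1.8)."""
    v = derivative(gamma, t)
    norm = math.hypot(*v)
    return tuple(c / norm for c in v)

def slow(t):
    """Hypothetical sample curve: an ellipse."""
    return (2 * math.cos(t), math.sin(t))

def fast(t):
    """The same ellipse, traversed twice as fast."""
    return slow(2 * t)

# slow(0.8) and fast(0.4) are the same point of the curve.
v_slow, v_fast = derivative(slow, 0.8), derivative(fast, 0.4)
print(math.hypot(*v_fast) / math.hypot(*v_slow))         # about 2: the velocity doubles
print(unit_tangent(slow, 0.8), unit_tangent(fast, 0.4))  # the same unit vector
```

The velocity depends on how fast we run along the curve, while the normalized vector from (1.8) does not.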

1.3.2 The curvature vector


We now come to the curvature vector. It is the fundamental object that describes
how much the tangent vector of the curve changes. Look at, for example, Figure
1.5(c). It depicts the tangent vector for the curve, which is parameterized by its
arc-length. This vector does not change its length (it is by definition of unit length),
but it does rotate as you follow along the curve. Notice also that the more the
tangent vector rotates, the more “curved”⁴ the curve is. That’s where the name
comes from.
Definition 1.3.2 (The Curvature Vector of a Curve). Let $\gamma : s \in I \to \mathbb{R}^N$ be
a regular curve, parameterized by the arc-length. Then we define the curvature
vector κ to be:

\[ \kappa(s) = \frac{d\tau}{ds} = \frac{d^2\gamma}{ds^2} \tag{1.10} \]

It is clear that, because s is independent of any sort of parametrization, $\frac{d\tau}{ds}$ is
too.
Figure 1.6 shows a curve with the curvature vector drawn in. Notice that the
curvature vector seems to be orthogonal to the tangent vector.⁵ This turns out to
be true, universally, as we will now prove.

⁴ At least in the intuitive sense, for well-behaved curves.
⁵ The physics students among you might find this very similar to how in physics we
sometimes separate acceleration into a parallel and a perpendicular part, the former changing the
speed, the latter curving the trajectory. Since here we don’t change the speed, only the curving
part is left.

Lemma 1.3.1 (κ ⊥ τ). The curvature vector κ is orthogonal to the tangent
vector τ.

Proof. The proof is strikingly simple. We know that the length of τ is set to
one. Therefore ⟨τ, τ⟩ = 1 and $\frac{d}{ds}\langle\tau,\tau\rangle = 0$, since the length (and therefore the
scalar product) doesn’t change along the trajectory. We can use the product
rule:

\[ 0 = \frac{d}{ds}\langle\tau,\tau\rangle = 2\left\langle\frac{d\tau}{ds},\tau\right\rangle = 2\langle\kappa,\tau\rangle \tag{1.11} \]

Therefore, the scalar product of the two vectors is 0, i.e. they are orthogonal,
as claimed.

Figure 1.6: A curve with τ and κ drawn in. Notice that κ is orthogonal to τ .

This is something physics students are very familiar with. The situation is very
analogous to the trajectory of a particle. The speed of the particle doesn’t change,
so the only direction the acceleration (= curvature vector) can have is perpendicular
to the curve.

Note 3. We want to make a quick check on the units of all the quantities that
we described so far. Let’s assume that our $\mathbb{R}^N$ carries some sort of length unit,
like the cm, which we will write as [L]. Let’s also assume the parameter of
our parametrization has the units of time, like sec, which we denote [T]. Then
both γ and s have units [L], so the tangent vector $\frac{d\gamma}{ds}$ has units of [L]/[L] = 1
and is unit-less. This is something we want explicitly, as the geometric object
should not depend on the parametrization, which means it should also be
independent of the unit of the parameter, [T]. The “fake” tangent vector
we defined before has, on the other hand, units of [L]/[T].
The curvature vector $\frac{d^2\gamma}{ds^2}$ has units of [L]/[L]² = 1/[L].

Until now, we have only given a formula for the curvature vector in the arc-
length-parametrization. We will now write down the formula for the curvature
vector with any parametrization.
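Lemma 1.3.2 (Curvature Vector in arbitrary Parametrization). Let $\gamma : t \in I \to \mathbb{R}^N$
be any regular curve and $\tau(t) = \frac{\gamma_t}{|\gamma_t|} = \frac{d\gamma/dt}{|d\gamma/dt|}$ the tangent vector. Then the
curvature vector κ(t) can be written as:⁶

\[ \kappa = \frac{1}{|\gamma_t|^2}\left(\gamma_{tt} - \left\langle\gamma_{tt},\frac{\gamma_t}{|\gamma_t|}\right\rangle\frac{\gamma_t}{|\gamma_t|}\right) \tag{1.12} \]

where $\gamma_{tt}$ is $\frac{d^2\gamma}{dt^2}$.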

Before we go on to prove this, we first want to talk about what each part of the
equation means.
We know that $\kappa = \frac{d^2\gamma}{ds^2}$ and therefore expect it to have something to do with
$\frac{d^2\gamma}{dt^2}$. This turns out to be the case: the first term is indeed $\gamma_{tt}$. But there is a
correction term $-\left\langle\gamma_{tt},\frac{\gamma_t}{|\gamma_t|}\right\rangle\frac{\gamma_t}{|\gamma_t|}$, which has a nice geometric explanation.
It removes the component of $\gamma_{tt}$ along the tangent vector, i.e. it projects $\gamma_{tt}$ onto
the normal plane of the tangent vector. See Figure 1.7 for
a visual example. After we have projected $\gamma_{tt}$ onto the normal plane, we still divide
it by $|\gamma_t|^2$. You can see it as just a factor that makes sure that the units work out.
We can see this simply by comparing units. The correction term has the same units
as $\gamma_{tt}$, and subtracting it doesn’t change the units, so we can just look at $\gamma_{tt}$.
The units of $\gamma_{tt} = \frac{d^2\gamma}{dt^2}$ are clearly [L]/[T]²,
while the units of κ are 1/[L], as we saw above. Therefore, to get a consistent formula,
we need a factor that has units of [T]²/[L]², and $1/|\gamma_t|^2$ is exactly a factor like that.
Note 4. We call the normal plane a plane, even though that is technically only
correct if we have a curve in $\mathbb{R}^3$. In $\mathbb{R}^2$ it is a line, in $\mathbb{R}^4$ a hyperplane and in
general an (N − 1)-dimensional vector space.

Proof. We now prove equation 1.12. The proof consists, in its most basic form, just
of taking the definition of κ in the arc-length parametrization and switching to
the t-parametrization, using the normal rules of derivatives (chain rule / product
rule). We start with the chain rule:

\[ \kappa = \frac{d\tau}{ds} = \frac{dt}{ds}\frac{d\tau}{dt} \tag{1.13} \]

where by $\frac{dt}{ds}$ we of course mean $\frac{dg}{ds}$, where $g = f^{-1}$ and $f(t) = \int_{t_0}^{t}\left|\frac{d\gamma}{dt}\right| dt$.
Therefore:

\[ \frac{dt}{ds} = \frac{dg}{ds} = \left(\frac{df}{dt}\right)^{-1} = \left|\frac{d\gamma}{dt}\right|^{-1} = \frac{1}{|d\gamma/dt|} = \frac{1}{|\gamma_t|} \tag{1.14} \]

If we insert the definition of τ in the t-parametrization we get:

\[ \kappa = \frac{dt}{ds}\frac{d\tau}{dt} = \frac{1}{|\gamma_t|}\frac{d}{dt}\frac{\gamma_t}{|\gamma_t|} \tag{1.15} \]

Now we need to use the quotient rule and the fact⁷ that $\frac{d|\gamma_t|}{dt} = \left\langle\gamma_{tt},\frac{\gamma_t}{|\gamma_t|}\right\rangle$.

\begin{align}
\kappa &= \frac{1}{|\gamma_t|}\frac{d}{dt}\frac{\gamma_t}{|\gamma_t|} \tag{1.16} \\
&= \frac{1}{|\gamma_t|}\,\frac{\gamma_{tt}|\gamma_t| - \gamma_t\frac{d|\gamma_t|}{dt}}{|\gamma_t|^2} \tag{1.17} \\
&= \frac{1}{|\gamma_t|}\,\frac{\gamma_{tt}|\gamma_t| - \gamma_t\langle\gamma_{tt},\gamma_t/|\gamma_t|\rangle}{|\gamma_t|^2} \tag{1.18} \\
&= \frac{1}{|\gamma_t|^2}\left(\gamma_{tt} - \left\langle\gamma_{tt},\frac{\gamma_t}{|\gamma_t|}\right\rangle\frac{\gamma_t}{|\gamma_t|}\right) \tag{1.19}
\end{align}

And we get the result as promised.

⁶ If you already have some experience of Differential Geometry or you are rereading this after
already learning further chapters, you might notice how this is the covariant derivative of
the tangent vector.

Figure 1.7: A curve with τ and κ drawn in, as well as $\gamma_{tt}$. Because we move
along the curve faster and faster (the ticks are more spread out), $\gamma_{tt}$ has a
component in the “forward” direction, which we cancel out in equation 1.12.
The vector is still too long though, which is why we need to divide by $|\gamma_t|^2$.
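
Formula (1.12) is also convenient for computing κ numerically, because it works in any parametrization. Here is a rough Python sketch under stated assumptions: the helper names `d1`, `d2`, `dot`, `curvature_vector` and the test curve `helix` are our own, derivatives are approximated by finite differences with an assumed step h, and the reference value 3/25 is the standard curvature a/(a² + b²) of a helix with radius a = 3 and pitch parameter b = 4.

```python
import math

def d1(gamma, t, h=1e-4):
    """Central-difference approximation of gamma_t."""
    return [(a - b) / (2 * h) for a, b in zip(gamma(t + h), gamma(t - h))]

def d2(gamma, t, h=1e-4):
    """Central-difference approximation of gamma_tt."""
    return [(a - 2 * b + c) / h**2
            for a, b, c in zip(gamma(t + h), gamma(t), gamma(t - h))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def curvature_vector(gamma, t):
    """Equation (1.12): project gamma_tt onto the normal plane, then divide by |gamma_t|^2."""
    gt, gtt = d1(gamma, t), d2(gamma, t)
    speed = math.sqrt(dot(gt, gt))
    tau = [c / speed for c in gt]
    along = dot(gtt, tau)  # the "forward" component of gamma_tt
    return [(a - along * b) / speed**2 for a, b in zip(gtt, tau)]

def helix(t):
    """Hypothetical test curve: a helix run through with increasing speed (u = t^2)."""
    u = t * t
    return (3 * math.cos(u), 3 * math.sin(u), 4 * u)

kappa = curvature_vector(helix, 1.0)
print(math.sqrt(dot(kappa, kappa)))  # close to 3/25 = 0.12
print(dot(kappa, d1(helix, 1.0)))    # close to 0: kappa is orthogonal to the tangent
```

Because the helix is traversed faster and faster, $\gamma_{tt}$ has a large tangential part here; the projection step is what removes it, as in Figure 1.7.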

1.4 Curves in R^2
We have, by now, defined exactly what we mean by a curve, seen what sort of
object counts as geometric, and defined a few of these, like the arc-length and the
tangent and curvature vectors. We will now use all of these concepts to describe
curves in the two-dimensional plane.
The main idea that makes this a lot simpler is that the curvature vector κ
reduces to a number. This is because the direction of the curvature vector is always
predetermined (up to sign) in two dimensions by the direction of the tangent vector.
To see this, we note that, as we showed before, the curvature vector κ lies in
the normal “plane” of the tangent vector, which in two dimensions means that it
lies on a straight line perpendicular to τ. Therefore, we only need to specify one
number⁸ to determine the curvature vector.
Let’s say we are at a point on a curve, like the one drawn in Figure 1.8. We
can construct a right-handed basis of $\mathbb{R}^2$ at that point by taking τ as our first
basis vector, and as the second the vector that one gets by rotating τ by 90° (in the
positive, i.e. counter-clockwise, sense), which we will call N. Since τ is of unit length
and we get N by rotating τ by 90°, this is an orthonormal basis (one that is
right-handed). Notice that it immediately follows that:

\[ \kappa = kN \tag{1.20} \]

for some $k \in \mathbb{R}$, because we know that κ and τ are orthogonal. We call this k the
curvature scalar. It is an important quantity in differential geometry, and we will
find its equivalents for different geometric objects throughout the subject.

⁷ You should recognize this from Calculus II, given maybe in a different notation:
$\frac{dr}{dt} = (\nabla r)\cdot\frac{d\vec{x}}{dt} = \frac{\vec{x}}{r}\cdot\frac{d\vec{x}}{dt}$.
⁸ At each point of the curve.

Figure 1.8: A curve with its tangent, curvature and normal vectors drawn in
at a point on the curve. As you see, the curvature vector is just some number
times the normal vector. Note however, that k is not just the absolute value of
κ, since it can also be the negative of its length, if it points in the other direction

Example 1.4.1. Our first example is the simplest curve that is not a straight
line (because a straight line, of course, has no curvature⁹), which is a circle of
radius R.
The curvature of that circle is

\[ k = 1/R \tag{1.21} \]

if the circle is parameterized in the counter-clockwise direction. Before you
dive into algebra, let’s consider why this result makes a lot of sense. Take the
few circles in Figure 1.9. It is intuitively clear that the bigger the radius of
the circle, the less curved it gets. As you get progressively bigger radii, the
circles look more and more “flat” at the top, or in other words, less curved.
The biggest circle (only partially drawn in) is almost flat, and if you were to
draw something like R = 100 you could probably not see the difference anymore
between a straight line and the circle. So the result that the curvature scalar
is the inverse of the radius makes a lot of sense.
The proof of this claim is a very good exercise in converting parametrizations
and getting geometric information out of coordinates, and we will therefore leave
it as an exercise.
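
Without spoiling the exercise, you can at least let a computer confirm the claim. The following Python sketch is a numerical sanity check, not a proof: `signed_curvature`, `ccw_circle` and `cw_circle` are hypothetical names, the radius R = 2 is an arbitrary choice, and k is computed as ⟨κ, N⟩ using (1.12) and (1.20) with finite-difference derivatives.

```python
import math

def signed_curvature(gamma, t, h=1e-4):
    """k = <kappa, N> for a plane curve, with N = tau rotated by +90 degrees."""
    (x0, y0), (x1, y1), (x2, y2) = gamma(t - h), gamma(t), gamma(t + h)
    gt = ((x2 - x0) / (2 * h), (y2 - y0) / (2 * h))
    gtt = ((x2 - 2 * x1 + x0) / h**2, (y2 - 2 * y1 + y0) / h**2)
    speed = math.hypot(*gt)
    tau = (gt[0] / speed, gt[1] / speed)
    normal = (-tau[1], tau[0])
    # The tangential part of gamma_tt is orthogonal to N, so it drops out here.
    return (gtt[0] * normal[0] + gtt[1] * normal[1]) / speed**2

R = 2.0  # assumed sample radius

def ccw_circle(t):
    """Circle of radius R, run through counter-clockwise."""
    return (R * math.cos(t), R * math.sin(t))

def cw_circle(t):
    """The same circle, run through clockwise."""
    return (R * math.cos(t), -R * math.sin(t))

print(signed_curvature(ccw_circle, 0.3))  # close to +1/R = 0.5
print(signed_curvature(cw_circle, 0.3))   # close to -1/R = -0.5
```

The sign flip for the clockwise circle shows why the orientation is part of the statement k = 1/R.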

⁹ If you don’t immediately believe this, convince yourself of it.


Figure 1.9: A few circles with different radii, with their respective τ, κ, N drawn
at the point (0, R) of each curve. The bigger the circle, the less curved it is, as
reflected by the formula k = 1/R.

1.4.1 The main interpretations of the curvature scalar

The curvature scalar has a lot of interpretations. Let’s first state them, then discuss
their consequences.

Proposition 1.4.1 (Interpretations of the curvature scalar). As always, let γ
be a two-dimensional curve, with all the properties we already discussed. Then
the curvature scalar has the following interpretations:

1. The curvature scalar is the rate of change of the angle the tangent vector
makes with the x-axis. Mathematically, let $\theta(s) = \arctan\frac{\tau_2(s)}{\tau_1(s)}$ be exactly
that angle (wherever $\tau_1(s) \neq 0$). Then:

\[ k = \frac{d\theta}{ds} \tag{1.22} \]

2. The absolute value of the curvature scalar k tells us the radius of the
osculating circle, which is the unique circle that agrees with the curve
up to order two:

\[ |k(s)| = \frac{1}{R(s)} \tag{1.23} \]

where R(s) is the radius of that circle at the point of the curve whose
parameter value is s.

The first interpretation of the curvature scalar should make a lot of sense
intuitively. We know that the tangent vector cannot change its length, since by
construction it is of unit length. Therefore the only thing that can really change is
the direction, i.e. the angle it makes with the x-axis. This, along with the fact that
the curvature vector describes how the tangent vector changes, makes the first part
of the proposition rather intuitive. See Figure 1.10 for a visualisation.
The proof is not too complicated: you just need to differentiate θ(s) and remember
that (1) the derivative of arctan(x) is $\frac{1}{1+x^2}$ and (2) the normal vector in terms of
the components of τ is $(-\tau_2, \tau_1)$.

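Proof of the first interpretation. As we said, you only need to differentiate θ. Let’s
start, writing $x = \tau_2/\tau_1$:

\begin{align}
\frac{d\theta}{ds} &= \frac{d}{ds}\arctan\left(\frac{\tau_2}{\tau_1}\right) \tag{1.24} \\
&= \frac{d(\arctan(x))}{dx}\,\frac{d}{ds}\left(\frac{\tau_2}{\tau_1}\right) \tag{1.25} \\
&= \frac{1}{1+x^2}\,\frac{\dot{\tau}_2\tau_1 - \tau_2\dot{\tau}_1}{\tau_1^2} \tag{1.26} \\
&= \frac{1}{1+\frac{\tau_2^2}{\tau_1^2}}\,\frac{\dot{\tau}_2\tau_1 - \tau_2\dot{\tau}_1}{\tau_1^2} \tag{1.27} \\
&= \frac{1}{\tau_1^2+\tau_2^2}\,\langle(\dot{\tau}_1,\dot{\tau}_2),(-\tau_2,\tau_1)\rangle \tag{1.28} \\
&= \frac{1}{1}\,\langle\kappa,N\rangle \tag{1.29} \\
&= k \tag{1.30}
\end{align}

The denominator of the fraction is one because it’s the square of the length of τ, which
is one. The last step uses κ = kN and ⟨N, N⟩ = 1.

Figure 1.10: A curve, with its tangent vectors drawn in, and a table that shows
how the tangent vector rotates.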
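
The statement k = dθ/ds can be tested numerically as well. The Python sketch below makes a few assumptions that are not in the lecture: it uses `atan2` instead of arctan(τ₂/τ₁) so that vertical tangents (τ₁ = 0) cause no trouble, it computes dθ/ds as (dθ/dt)/|γ_t|, and the names `tangent_angle`, `speed`, `k_from_angle`, `circle3` and `ellipse` are hypothetical.

```python
import math

def tangent_angle(gamma, t, h=1e-5):
    """Angle theta between the (approximate) velocity and the x-axis."""
    (x0, y0), (x1, y1) = gamma(t - h), gamma(t + h)
    return math.atan2(y1 - y0, x1 - x0)

def speed(gamma, t, h=1e-5):
    """Approximate |gamma_t|."""
    return math.dist(gamma(t + h), gamma(t - h)) / (2 * h)

def k_from_angle(gamma, t, h=1e-4):
    """k = d(theta)/ds = (d(theta)/dt) / |gamma_t|, equation (1.22)."""
    dtheta = tangent_angle(gamma, t + h) - tangent_angle(gamma, t - h)
    # Undo a possible jump of 2*pi where atan2 wraps around.
    dtheta = (dtheta + math.pi) % (2 * math.pi) - math.pi
    return dtheta / (2 * h) / speed(gamma, t)

def circle3(t):
    """Hypothetical test curve: counter-clockwise circle of radius 3."""
    return (3 * math.cos(t), 3 * math.sin(t))

def ellipse(t):
    """Hypothetical test curve: counter-clockwise ellipse with half-axes 2 and 1."""
    return (2 * math.cos(t), math.sin(t))

print(k_from_angle(circle3, 1.0))  # close to 1/3
print(k_from_angle(ellipse, 0.7))  # varies along the ellipse
```

For the circle the tangent angle grows at a constant rate, which is the picture behind Figure 1.10; for the ellipse the rate changes from point to point.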

Now, what about the second interpretation? Well, you can imagine a circle,
going along the curve, that locally looks like the curve. (The curve tries to be
as similar to the circle as possible, but because the radius of the osculating circle
changes with s, it doesn’t become a circle.)
Figure 1.11 gives a picture of a curve and its osculating circles at different
points of the curve. As you will hopefully agree, the bigger the radius of the circle,
the straighter the curve will be at that point (as both of them agree to order
two, they locally behave quite similarly). Therefore we expect that the second
interpretation is correct, that is, the absolute value of the curvature scalar is the
inverse of the radius of the osculating circle.

Figure 1.11: A curve with its osculating circle drawn in at a few places along the
curve. (The biggest one only partially drawn in) It is clear that the bigger the
osculating circle is, the straighter the curve will be, which gives the connection
to the curvature scalar.

Proof of the second interpretation. We will not prove this in full, as it is quite simple, but we will sketch a proof. The osculating circle agrees with γ up to order two. Therefore, we can expect the second derivatives (i.e. the k's) of the curve and of the osculating circle (which we can regard as a second curve) to agree at that point. We know that the circle has kcircle = 1/Rcircle at that point, and therefore this should also be true for the original curve. The only missing parts of the proof are (1) the proof that an osculating circle exists, which it does[10], and (2) a more rigorous way of presenting the above argument.
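To get a feeling for the radius 1/k, you can approximate the osculating circle by the circle through three nearby points of the curve. This is a hedged sketch and not the construction used in the lecture: the parabola y = x² and the function names are our own choices. For the parabola, the graph formula for curvature gives k = 2 at the origin, so we expect a radius close to 1/2.

```python
import math

def circumradius(p, q, r):
    """Radius of the circle through three points in the plane (R = abc / (4 * area))."""
    a = math.dist(q, r)
    b = math.dist(p, r)
    c = math.dist(p, q)
    area = abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2.0
    return a * b * c / (4.0 * area)

def parabola(x):
    """Illustrative curve (an assumption): the graph of u(x) = x^2."""
    return (x, x * x)

def approx_osculating_radius(h):
    """Circle through the points of the parabola at x = -h, 0, h."""
    return circumradius(parabola(-h), parabola(0.0), parabola(h))

if __name__ == "__main__":
    for h in (0.5, 0.1, 0.01, 0.001):
        print(h, approx_osculating_radius(h))
```

As h shrinks, the printed radius should settle down near 0.5, the reciprocal of the curvature at the origin.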

[10] In many aspects of geometry, a straight line is a circle of infinite radius. This is also true here: if the curve is locally straight at a point, the radius of its osculating circle blows up and the circle becomes a straight line, but the theorem still holds. For the mathematicians: 1/∞ = 0 in this case.

Figure 1.12: A graph of a function as a curve.

There is actually also a third interpretation of the curvature scalar, for a special kind of curve. Let's say that the curve is the graph of a function y = u(x) that assigns a y-value to every x-value, like the one in figure 1.12. Consider the second derivative of u. Can we connect it to k, which is also a second derivative? Yes. In fact, this is a very common theme that will accompany you throughout differential geometry. Curvature is a second derivative and a second derivative is curvature in some sense[11]. The relationship between k and uxx is not trivial, however: k ≠ uxx! The actual relationship is:

\[ k = \frac{u_{xx}}{(1+u_x^2)^{3/2}} \tag{1.31} \]
We have to compensate, because x is not s. The proof of this is quite simple, if
quite long. The basic strategy is, as with many of these proofs, differentiate until
you get to where you want to be.

Proof. Let's start by collecting different terms that might be useful. Firstly, γ(x) = (x, u(x)) and therefore:

\begin{align}
\gamma_x &= (1, u_x) \tag{1.32}\\
|\gamma_x| &= (1+u_x^2)^{1/2} \tag{1.33}\\
\tau &= \frac{\gamma_x}{|\gamma_x|} = \frac{(1,u_x)}{(1+u_x^2)^{1/2}} \tag{1.34}\\
N &= \frac{(-u_x, 1)}{(1+u_x^2)^{1/2}} \tag{1.35}
\end{align}

The first three should be rather clear, coming straight from the definitions. The last one comes from the fact that N is just τ rotated by 90°, which means we switch the two entries of the vector and put a minus in front of the new first one[12].

[11] Conditions apply, as always.
[12] If this is not clear to you, try it out with the rotation matrix for a rotation by +90°. You'll see that this is correct.

We can now just use the definition of κ and the chain rule and calculate until we get there.

\begin{align}
\kappa &= \frac{dx}{ds}\,\frac{d\tau}{dx} \tag{1.36}\\
\frac{dx}{ds} &= \left(\frac{ds}{dx}\right)^{-1} = |\gamma_x|^{-1} \tag{1.37}\\
\Rightarrow\ \kappa &= \frac{1}{|\gamma_x|}\,\frac{d\tau}{dx} \tag{1.38}\\
&= \frac{1}{|\gamma_x|}\,\frac{d}{dx}\,\frac{\gamma_x}{|\gamma_x|} \tag{1.39}
\end{align}

Before we continue, there is something to note about what we already found. Inside the derivative, we already normalize once, and then again outside of the derivative. In this sense, κ is a normalized version of a second derivative.

\begin{align}
\dots &= \frac{1}{|\gamma_x|}\,\frac{d}{dx}\,\frac{(1,u_x)}{|\gamma_x|} \tag{1.40}\\
&= \frac{1}{|\gamma_x|}\left(\frac{(0,u_{xx})}{|\gamma_x|} + (1,u_x)\,\frac{d}{dx}\frac{1}{|\gamma_x|}\right) \tag{1.41}
\end{align}

(We used the product rule.) Now, we know that k = ⟨κ, N⟩. The last term in the equation above for κ is proportional to (1, ux), which is proportional to τ. This means that when we form the scalar product to get k, it drops out, since τ is orthogonal (by construction) to N. We get:

\begin{align}
k &= \langle\kappa, N\rangle \tag{1.42}\\
&= \left\langle \frac{1}{|\gamma_x|}\,\frac{(0,u_{xx})}{|\gamma_x|},\ \frac{(-u_x,1)}{|\gamma_x|}\right\rangle \tag{1.43}\\
&= \frac{u_{xx}}{|\gamma_x|^3} = \frac{u_{xx}}{(1+u_x^2)^{3/2}} \tag{1.44}
\end{align}

Here, we used that the aforementioned second term is orthogonal to N and left it out. At the end we just collected terms.
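A quick sanity check of the graph formula is to apply it to a curve whose curvature we already know. The sketch below is only an illustration: the function names and the finite-difference step are our own choices. It evaluates the formula on the upper half of a circle of radius 2, written as a graph. Since a graph is traversed from left to right, the upper half-circle is run through clockwise, so we expect k = −1/R rather than +1/R.

```python
import math

def graph_curvature(u, x, h=1e-4):
    """k = u_xx / (1 + u_x^2)^(3/2), with both derivatives taken by central differences."""
    u_x = (u(x + h) - u(x - h)) / (2.0 * h)
    u_xx = (u(x + h) - 2.0 * u(x) + u(x - h)) / (h * h)
    return u_xx / (1.0 + u_x * u_x) ** 1.5

RADIUS = 2.0  # illustrative value

def upper_half_circle(x):
    """Upper half of the circle of radius RADIUS, written as a graph y = u(x)."""
    return math.sqrt(RADIUS * RADIUS - x * x)

def parabola(x):
    return x * x

if __name__ == "__main__":
    for x in (-1.0, 0.0, 0.5, 1.5):
        print(x, graph_curvature(upper_half_circle, x))  # close to -1/RADIUS everywhere
    print(graph_curvature(parabola, 0.0))                # close to u_xx = 2, since u_x = 0 there
```

Note how uxx alone varies a lot along the half-circle, while the combination in the formula stays constant. That is the compensation for x not being s.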
Now, after seeing how much manual computation this took, you might be a bit astounded and wonder why. The reason is the same reason why, any time you actually want to compute something in differential geometry, it usually turns into a mess of derivatives. We are turning something fundamentally coordinate-based[13] (uxx) into something geometric (k). Coordinate-based objects usually carry, as you might imagine, a lot of information that is only related to our choice of coordinates, and we have to filter that information out when we do the conversion. This is the reason why there is so much to compute, even if the steps aren't too complicated.

[13] To make this discussion more general, we write coordinate-based, even though right now it's just a parametrization. You can see a parametrization as coordinates on the curve.

1.4.2 Curvature determines curve up to rigid motion


There is one more thing we want to discuss about the curvature scalar before we move on. We want to talk about how much the curvature (scalar) actually tells us about a curve, or to what degree it determines the curve: suppose you are given a function k(s) which you declare to be the curvature scalar of a curve, and you ask yourself how much freedom you still have left. It will turn out that the curvature determines the curve up to its position at s = 0 and the angle of the tangent vector at that point. This mirrors Newton's law a lot, the reason being that both are differential equations of second order[14]. You can also see this as having the freedom to perform any rigid motion (a rotation or translation, but no mirroring or stretching) and still getting a curve with the same curvature scalar. You can look at figure 1.13 for an example.

Figure 1.13: You can perform a rigid motion and not change anything about the curvature of the curve.

Theorem 1.4.1 (Curvature determines curve up to a rigid motion). The curvature k(s) of a curve determines that curve up to a rigid motion.
[14] The only difference from Newton's law is that the speed (with respect to s) can't change for a curve. That is why we don't get to pick an arbitrary tangent vector, only its angle.

Figure 1.14: A picture of the theorem.

Proof. We will, again, not prove this rigorously. We will give you a sketch, from which it should be clear that a proof can be constructed.
The basic idea is to integrate twice.
Firstly, integrate the equation:

\[ k(s) = \frac{d\theta}{ds} \tag{1.45} \]

to get:

\[ \theta(s) = \theta(s_0) + \int_{s_0}^{s} k(s')\,ds' \tag{1.46} \]

where θ(s0) is a constant we can choose freely. The next step is to integrate the equation:

\[ \frac{d\gamma}{ds}(s) = \tau(s) = e^{i\theta(s)} \tag{1.47} \]

and get:

\[ \gamma(s) = \gamma_0 + \int_{s_0}^{s} e^{i\theta(s')}\,ds' \tag{1.48} \]

giving us yet another constant γ0, which we can choose freely. To get a more rigorous proof, you would need to show that these are solutions (just differentiate them) and that these are the only solutions (use a uniqueness theorem from calculus).
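The "integrate twice" idea translates almost literally into a few lines of code. The following is a minimal numerical sketch under our own assumptions: a simple midpoint rule, and the function name `reconstruct_curve`, which does not come from the lecture. It rebuilds a curve from a given k(s), with the starting angle and starting point as the two freely chosen constants.

```python
import math

def reconstruct_curve(k, length, n_steps, theta0=0.0, start=(0.0, 0.0)):
    """Integrate d(theta)/ds = k(s), then d(gamma)/ds = (cos theta, sin theta)."""
    ds = length / n_steps
    theta = theta0
    x, y = start
    points = [(x, y)]
    for i in range(n_steps):
        k_mid = k((i + 0.5) * ds)             # curvature in the middle of the step
        theta_mid = theta + 0.5 * ds * k_mid  # tangent angle in the middle of the step
        x += ds * math.cos(theta_mid)
        y += ds * math.sin(theta_mid)
        theta += ds * k_mid
        points.append((x, y))
    return points

def unit_circle_curvature(s):
    """k(s) = 1 for all s, which should give back a circle of radius 1."""
    return 1.0

# Same curvature function, two different choices of the integration constants.
first = reconstruct_curve(unit_circle_curvature, 2.0 * math.pi, 1000)
second = reconstruct_curve(unit_circle_curvature, 2.0 * math.pi, 1000,
                           theta0=1.2, start=(3.0, -1.0))

if __name__ == "__main__":
    print(math.dist(first[0], first[-1]))      # the curve closes up
    print(math.dist(first[0], first[250]))     # chord over a quarter turn
    print(math.dist(second[0], second[250]))   # same chord after the rigid motion
```

Changing `theta0` and `start` rotates and translates the result but leaves all distances between points of the curve untouched, which is the content of the theorem.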

1.4.3 The interaction between global and local


Before we close the subject of curves in R², we want to talk about a general theme of differential geometry that shows up throughout the subject, and apply it to curves in R². The theme is the interaction between global and local properties of geometric objects. The idea is that local properties (like curvature), which only feel a tiny piece of the object (around any point), give constraints on (or even determine) global properties. Local things are things that only need a small neighbourhood of a point to be defined at that point, like the curvature vector, or later the metric. Global things are typically integral quantities, which are often related to topology. We will give an example (without proof) of a theorem that follows this idea, but before we do that, we need to define a few further restrictions (which we will from now on assume our curves to obey).

Figure 1.15: A problem case of a closed curve, for which the tangent vector at the beginning is not the same as the tangent vector at the end. We want to avoid this, so we just take these kinds of curves (and ones where higher derivatives don't match up) out of the set of curves we consider.

Definition 1.4.1 (more restrictions, simple curves).

1. An N-dimensional smooth curve γ is called simple if it has no self-intersections, i.e. if γ(s) = γ(t) then s = t. The one exception we make is the endpoints, as we don't want to call a closed curve, like a circle, self-intersecting just because it returns to its beginning.

2. Similarly, a curve is closed if it is defined on an interval [a, b] and γ(a) = γ(b).

3. If we are working with closed curves, we will want them to have the (nice) property that, if we extend them periodically to a curve R → R^N, they are smooth. This is to avoid annoying situations like the one in figure 1.15.

With these restrictions, we can state the theorem.

Theorem 1.4.2. Let γ be a two-dimensional regular closed curve that obeys the above restrictions. Then:

\[ \int_\gamma k\,ds = 2\pi n \tag{1.49} \]

for some n ∈ Z.

The integral is the global quantity we mentioned before, while k is the curvature scalar, which is a local quantity. Since ∫_γ k ds = ∫_a^b (dθ/ds) ds, the global quantity is just the total angle (with signs) by which the tangent vector rotates, a profoundly global thing.
It makes sense that this would be so. If the curve is closed (and is smooth at the endpoints, if made periodic), then the angle by which τ rotates must be some multiple of a whole rotation, since τ ends up where it started.
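You can watch the integer n appear numerically. The sketch below is an illustration under our own assumptions: the three test curves and all function names are ours, and the integral is approximated by a simple sum over one period. It uses the fact that k ds = (x'y'' − y'x'')/(x'² + y'²) dt for a curve with an arbitrary parameter t.

```python
import math

def rotation_number(derivatives, n_samples=4000):
    """Approximate (1 / 2 pi) * integral of k ds over one period t in [0, 2 pi)."""
    dt = 2.0 * math.pi / n_samples
    total = 0.0
    for i in range(n_samples):
        vx, vy, ax, ay = derivatives(i * dt)
        total += (vx * ay - vy * ax) / (vx * vx + vy * vy) * dt
    return total / (2.0 * math.pi)

# Each function returns (x', y', x'', y'') of an illustrative closed curve.
def ellipse(t):        # (2 cos t, sin t)
    return (-2.0 * math.sin(t), math.cos(t), -2.0 * math.cos(t), -math.sin(t))

def figure_eight(t):   # (sin t, 0.5 sin 2t), crosses itself once
    return (math.cos(t), math.cos(2.0 * t), -math.sin(t), -2.0 * math.sin(2.0 * t))

def limacon(t):        # ((1 + 2 cos t) cos t, (1 + 2 cos t) sin t), has an inner loop
    return (-math.sin(t) - 2.0 * math.sin(2.0 * t),
            math.cos(t) + 2.0 * math.cos(2.0 * t),
            -math.cos(t) - 4.0 * math.cos(2.0 * t),
            -math.sin(t) - 4.0 * math.sin(2.0 * t))

if __name__ == "__main__":
    for name, curve in (("ellipse", ellipse), ("figure eight", figure_eight),
                        ("limacon", limacon)):
        print(name, rotation_number(curve))
```

The ellipse should give a value close to 1, the figure eight close to 0 and the limaçon with its inner loop close to 2. Only the first of these curves is free of self-intersections.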

Proposition 1.4.2. If the curve is simple, n = ±1.

This proposition tells us that the tangent vector τ of a closed curve without self-intersections turns exactly once in total. See the examples in figure 1.16.

Figure 1.16: Two simple curves and the graph that shows by how much the tangent vector rotated.

With this we conclude the topic of two-dimensional curves, and move on to three-dimensional curves.

1.5 Curves in R³
1.5.1 First remarks
Very early in our discussion of two-dimensional curves we figured out that in two dimensions, the curvature vector is not really necessary for describing how the curve curves. More precisely, the curvature scalar (the signed length of the curvature vector) held all the information about curvature; the direction of the curvature vector was always predetermined. This is very different for curves in three dimensions. Here, the vector character of the curvature vector really stands out.
We saw at the beginning of this chapter that the curvature vector lies in the
normal plane of the tangent vector. In two dimensions this helped us, by letting us
forget about the direction of the curvature vector and only consider the curvature
scalar. This time, we cannot do this, as we have an entire plane that the curvature
vector could lie in.
In the two dimensional case, we defined a moving frame composed of τ and
N , which was a right handed orthogonal basis. This is the idea we will use to get
further in three dimensions.
Definition 1.5.1 (The moving frame). Let, as always, γ be a curve, this time in
R3 with all the properties we already mentioned before (smoothness, regularity
etc.) We define three vectors at each point, that will compose the moving frame
we will usually use.

1. The first is the tangent vector τ . Remember that it is normalized.


2. The second one is the normalized curvature vector, which we will call N: N = κ/|κ|.

3. The third vector will be defined by the equation β = N × τ. Because both τ and N are of unit length and orthogonal to each other, β is of unit length too, and all three together form a right handed orthonormal basis.

The second vector will be called the normal vector[15], while the third vector is called the bi-normal vector.

Together, N and β span the normal plane of τ. For a visualisation of the moving frame, see figure 1.17.

Figure 1.17: A three dimensional curve, with the moving frame drawn in. N is
κ, but normalized, β is the cross product of N and τ .

1.5.2 The curvature scalar in three dimensions


Because the curvature in three dimensions is really a vector, unlike in two dimensions, we will find that we cannot describe the entire curve with the curvature scalar alone. We still introduce it, as |κ|. But this time it cannot change sign, which means the two-dimensional version is not exactly the same thing as the three-dimensional one.

Definition 1.5.2 (Curvature scalar for curves in three dimensions). The curvature scalar is simply the length of the curvature vector, defined as k = |κ|.

Now, you might have noted that to define N, we divided by the curvature scalar, and this becomes a problem if k is zero. This is an actual problem and happens for any curve that is (to second order) straight at some point. We will exclude this, simply by adding another restriction to our definition of a curve.

[15] It is not the only vector normal to τ, but it is a very special one, since it points in the direction of κ. Therefore we call it the normal vector.

Definition 1.5.3 (ordinary). A curve γ is called ordinary if the curvature vector never vanishes, which means that |κ| ≠ 0 at all points of the curve.

With this aside, let’s go back to the curvature vector. We saw that the curvature
scalar will simply not provide enough information about our curve that we can paint
a complete picture. We will need another agent, which will be called the torsion.

1.5.3 Torsion
Here we will define what we mean by the new agent we said we needed in the previous section, called torsion. Torsion will be another geometric object[16]. It will tell us, in a sense, how much the direction of the curvature vector changes.

Definition 1.5.4. Let γ be an ordinary three-dimensional curve. We define the torsion vector λ to be the projection of the derivative of the normal vector onto the bi-normal:

\[ \lambda = \left\langle \frac{dN}{ds}, \beta \right\rangle \beta \tag{1.50} \]

and the torsion scalar l to be:

\[ l = \left\langle \frac{dN}{ds}, \beta \right\rangle \tag{1.51} \]

Why do we do this? Why don't we just define the torsion vector to be dN/ds? Well, it turns out that this has a lot of unnecessary information.
Consider its τ and N components. For the τ component we get (product rule):

\[ \left\langle \frac{dN}{ds}, \tau \right\rangle = \frac{d}{ds}\langle N, \tau\rangle - \left\langle N, \frac{d\tau}{ds} \right\rangle = 0 - \langle N, \kappa\rangle = -k \tag{1.52} \]

which is just (minus) the curvature scalar, which we already know, and for the other component we get (product rule again):

\[ 2\left\langle \frac{dN}{ds}, N \right\rangle = \frac{d\langle N, N\rangle}{ds} = \frac{d}{ds}\,1 = 0 \tag{1.53} \]

since a unit vector can't change in its own direction (otherwise its length would change).
That is why we take the projection. It provides us with the only new information. The information we want is how the normal plane changes, but only in the direction of the bi-normal vector.
We can form a table with all the objects we have introduced so far and a few things to note about them. See table 1.1.
One thing that you might find surprising at first is that the units of λ are not 1/[L]². The reason is that we normalize κ before differentiating, which means we multiplied the units by [L].

[16] As a reminder, something is a geometric object or geometrically invariant when it does not change if we change the parametrization, or rotate R^N.

Figure 1.18: A curve, and its normal vector and normal plane changing along the curve.

object | formula   | name/interpretation             | derivatives | units
γ      | -         | position                        | 0           | [L]
τ      | τ = dγ/ds | tangent vector / velocity       | 1           | [ ]
κ      | κ = kN    | curvature vector / acceleration | 2           | 1/[L]
λ      | λ = lβ    | torsion / "jerk"                | 3           | 1/[L]

Table 1.1: The main geometric objects we have defined up until now.

We can see what we are doing as a Taylor expansion, which we stop after 4 terms (3 derivatives).
We can draw a parallel between k and l. k measures how much γ deviates from being a straight line. Look at figure 1.19(a): k is the value that measures how γ1 deviates from a straight line. In the same figure, in (b), you see a curve γ2 that lives entirely in a plane (even though it might have been defined in three dimensions). Its l value is zero[17]. In (c), you see a curve γ3 which lives almost in a plane close to a particular point on the curve, but which deviates from that plane. For it, l is not zero[18].

[17] Think about why.
[18] It might be zero at some particular points of an arbitrary curve, where to high order the curve locally really almost lives in a plane, and only deviates a bit "further".

Figure 1.19: The parallel between k and l. k measures how close a curve is to a straight line (part (a)). In part (b) you see a curve that lives entirely in a plane, while in (c) you see a curve that deviates from the plane it is almost in, at least in the direct vicinity of the thick point with the moving frame drawn in. l is the quantity that measures this.

Imagine the red plane in the figure is the xy-plane. Then the one curve would look like γ2(s) = (x2(s), y2(s), 0) and the other curve would look like γ3(s) = (x3(s), y3(s), z3(s)), i.e. it would have a non-zero third component.
We can expand z3(s) (which we will call z(s)) around s = 0, which we can assume to be the parameter of the thick point in the picture. Then, since the xy-plane is the one that the curve "almost" lives in, z(0) = 0, of course, but also z′(0) = 0 and z′′(0) = 0. This means that both the tangent and the curvature vector lie in the xy-plane. The first non-zero derivative will be the third, and if we Taylor-expand z around 0, we will get something like:

\[ z(s) = 0 + 0\,s + 0\,s^2 + c\,l\,s^3 + \dots \tag{1.54} \]

where l is the torsion scalar, and c is a coefficient that does not involve l (it depends only on the curvature at that point). This is why the torsion scalar measures how much the curve deviates from living in a plane. As an exercise, you should prove all the claims we did not prove in this discussion and find c. (You can take the curve γ = (s, as², bs³) as a first example and see where in the Taylor series around 0 you find k and l.)
You can also prove that if k = 0 for all s, then the curve is a straight line, and if l = 0 for all s, the curve lies in a plane. (You can do this as an exercise, or decide that the above discussion convinced you of this.)
In summary, torsion measures how much a curve twists away from a plane and into the third dimension.

Example 1.5.1. What is the curve with constant curvature and torsion? We
won’t show it here, but it can be shown that it is a helix. A helix, by the way,
can be parameterized by (R cos(t), R sin(t), mt) where m is some constant.

Figure 1.20: A helix

Why does it make sense that it’s a helix? Well, we want constant curvature,
which means a circle is involved, but we also want it to twist out of the plane
it lives in, at a constant rate, which is why we get a helix.
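One direction of this claim is easy to check numerically: a helix does have constant k and l. The sketch below is a hedged illustration, not part of the lecture. The values of R and m, the helper names and the finite-difference step are our own choices. It follows the definitions of this chapter literally, in particular β = N × τ, so the sign of l comes out opposite to what you get in textbooks that use τ × N for the bi-normal.

```python
import math

R, M = 2.0, 1.0              # illustrative radius and climb rate of (R cos t, R sin t, M t)
SPEED = math.hypot(R, M)     # |d(gamma)/dt|, which is constant along the helix

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def d_ds(vector_field, t, h=1e-4):
    """Derivative with respect to arc length: d/ds = (1 / SPEED) d/dt (central difference)."""
    plus, minus = vector_field(t + h), vector_field(t - h)
    return tuple((a - b) / (2.0 * h * SPEED) for a, b in zip(plus, minus))

def tangent(t):
    return (-R * math.sin(t) / SPEED, R * math.cos(t) / SPEED, M / SPEED)

def curvature(t):
    kappa = d_ds(tangent, t)
    return math.sqrt(dot(kappa, kappa))

def normal(t):
    kappa = d_ds(tangent, t)
    k = math.sqrt(dot(kappa, kappa))
    return tuple(c / k for c in kappa)

def torsion(t):
    beta = cross(normal(t), tangent(t))   # beta = N x tau, the convention of this chapter
    return dot(d_ds(normal, t), beta)

if __name__ == "__main__":
    for t in (0.0, 0.7, 2.5):
        print(t, curvature(t), torsion(t))
```

For these values you should see k close to R/(R² + m²) = 0.4 and l close to −m/(R² + m²) = −0.2 at every sample point.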

1.5.4 How Curvature and Torsion determine curve


We have gotten a good intuition about curvature and torsion by now. Are these all the things we need to know about a curve in R³? It turns out that the answer is, pretty much, yes, very similarly to how in two dimensions we only really need the curvature scalar.
This is stated in the following theorem, which can be proven similarly to its equivalent in two dimensions.
Theorem 1.5.1 (k and l determine curve up to a rigid motion). Let k(s) > 0 and l(s) be any smooth functions. If we set the curvature scalar and torsion scalar to these functions, respectively, we determine the curve uniquely in R³ up to a rigid motion in R³.
As we said before, we won't prove this. But we hope you see how interesting the result is. We need only two real functions to describe a curve in three dimensions, and only one real function in two dimensions. In both of these cases, we reduced our description by an entire function.

1.5.5 The Frenet Frame


We want to talk a bit more about τ, N and β. We already said that these form a right-handed orthonormal moving frame, basically a coordinate system that moves with the curve. Now, you can define arbitrary moving frames, but this one is somewhat special, not least because two of its vectors are normalized versions of geometrically important vectors (κ and λ). That is why this choice has a special name: it's called the Frenet frame (sometimes the Frenet–Serret frame).
Before we talk about what makes this choice of moving frame special, we want to talk a bit about general moving frames.
Let's say we have some curve γ in three-dimensional space, and some orthonormal moving frame (not necessarily the Frenet frame) attached to it. See figure 1.21.
We will abuse notation a bit and write the three basis vectors of the frame e1(s), e2(s), e3(s) as a vector like this:

\[ \begin{pmatrix} e_1(s) \\ e_2(s) \\ e_3(s) \end{pmatrix} \tag{1.55} \]

Then, as we will show in a second, for a general moving frame we can write:

\[ \frac{d}{ds}\begin{pmatrix} e_1(s) \\ e_2(s) \\ e_3(s) \end{pmatrix} = A(s)\begin{pmatrix} e_1(s) \\ e_2(s) \\ e_3(s) \end{pmatrix} \tag{1.56} \]

where A(s) is a 3 × 3 matrix.
This matrix has a special property, generally, for any moving frame.
Proposition 1.5.1. Let γ be any curve and (e1 (s), e2 (s), e3 (s)) an orthonormal
moving frame. Then the matrix A(s) from equation 1.56 is anti-symmetric.
Where does the anti-symmetry come from? Well, there are two parts to it. Firstly, the diagonal is zero, which comes simply from the fact that a vector of unit length cannot change in its own direction, otherwise the length would change. Then there is the anti-symmetry of the other components, which stems from the orthogonality. If one vector changes, the others have to change in such a way that they all stay orthogonal to each other.

Figure 1.21: A curve with its Frenet frame and a second picture of the same curve with a random moving frame.

Proof. We know that ⟨ei(s), ej(s)⟩ = δij, independent of s. Therefore:

\[ 0 = \frac{d}{ds}\langle e_i(s), e_j(s)\rangle = \left\langle \frac{de_i(s)}{ds}, e_j(s) \right\rangle + \left\langle e_i(s), \frac{de_j(s)}{ds} \right\rangle = A_{ij} + A_{ji} \tag{1.57} \]

Aha, this is exactly the condition for anti-symmetry.
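The anti-symmetry holds for any orthonormal moving frame, and you can test that on a frame that has nothing to do with a curve. In this sketch (our own construction, not from the lecture) the frame consists of the rows of a product of two rotation matrices with arbitrarily chosen angles s and s². The entries A_ij = ⟨de_i/ds, e_j⟩ are approximated by central differences.

```python
import math

def frame(s):
    """Rows e1, e2, e3 of a rotating orthonormal frame (a product of two rotations)."""
    a, b = s, s * s   # arbitrary smooth rotation angles, chosen only for illustration
    ca, sa, cb, sb = math.cos(a), math.sin(a), math.cos(b), math.sin(b)
    rot_z = [[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]]
    rot_x = [[1.0, 0.0, 0.0], [0.0, cb, -sb], [0.0, sb, cb]]
    return [[sum(rot_z[i][m] * rot_x[m][j] for m in range(3)) for j in range(3)]
            for i in range(3)]

def frame_derivative_matrix(s, h=1e-5):
    """A[i][j] = <d(e_i)/ds, e_j>, with the derivative taken by a central difference."""
    plus, minus, here = frame(s + h), frame(s - h), frame(s)
    d_frame = [[(plus[i][c] - minus[i][c]) / (2.0 * h) for c in range(3)]
               for i in range(3)]
    return [[sum(d_frame[i][c] * here[j][c] for c in range(3)) for j in range(3)]
            for i in range(3)]

def max_symmetric_part(matrix):
    """Largest entry of A + A^T in absolute value; zero means A is anti-symmetric."""
    return max(abs(matrix[i][j] + matrix[j][i]) for i in range(3) for j in range(3))

if __name__ == "__main__":
    A = frame_derivative_matrix(0.8)
    for row in A:
        print(row)
    print(max_symmetric_part(A))
```

The printed matrix should have a (numerically) vanishing diagonal and off-diagonal entries that come in pairs of opposite sign.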

Now we will be able to see why the Frenet frame is special.

Theorem 1.5.2 (Frenet frame derivatives). For a Frenet frame, we get:

\[ \frac{d}{ds}\begin{pmatrix} \tau \\ N \\ \beta \end{pmatrix} = \begin{pmatrix} 0 & k & 0 \\ -k & 0 & l \\ 0 & -l & 0 \end{pmatrix}\begin{pmatrix} \tau \\ N \\ \beta \end{pmatrix} \tag{1.58} \]

where k = k(s), l = l(s) are the curvature and torsion scalars, respectively.
This is a particularly simple matrix. The special thing about a Frenet frame is that it reduces the three independent matrix components of A to two, specifically two components we already know.

Proof. Many of these are definitions (for example, the first equation is just the definition of N), the rest aren't too hard to check and we leave them to you as an exercise.

1.5.6 Global theorems for curves in R³


We already introduced the principle that local and global quantities influence each other when we discussed the topic for curves in R². The same principle, of course, applies to curves in R³.
We will not fully prove any of the statements below, just state them. The first one will be familiar from curves in two dimensions.

Theorem 1.5.3 (Fenchel's theorem). Let γ be a closed curve in R³ (R^N works as well). Then:

\[ \int_\gamma |k|\,ds \ge 2\pi \tag{1.59} \]

The main steps in proving Fenchel's theorem are the following:

(i) The curve τ has image in S² not contained in any open hemisphere. It is contained in a closed hemisphere iff γ is a plane curve.
(ii) Any closed curve of length ≤ 2π in S² is contained in a closed hemisphere, and any closed curve of length < 2π is contained in an open hemisphere.

Proof. (see exercise sheet 1, nr 1c)

(i) Suppose the image of τ is contained in an open hemisphere. By rotating R³ we can assume that the last coordinate of τ(s) is > 0 for all s. Then (as τ is the derivative of γ) the last coordinate of γ(s) is strictly increasing in s, so γ(0) = γ(L) is not possible, which contradicts γ being a closed curve. If the last coordinate of τ(s) is only ≥ 0 for all s, then it actually must be = 0 for all s: if it were > 0 for some s, there would need to be an s̃ at which the last coordinate of τ(s̃) is < 0 for the curve to close up (we need to lose height again). But if the last coordinate of τ(s) is 0 everywhere, the curve stays in a plane with constant last coordinate.
(ii) is left as an exercise to the reader.

The next theorem says something about the same quantity as above, for knotted
curves. Knotted curves are curves that cannot smoothly be transformed into a circle,
without crossing themselves. You can see a few examples in figure 1.22

Figure 1.22: The trefoil, figure-8-knot and unknot. The first two are knotted,
the last one is not, even though it might look like it at first.

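Theorem 1.5.4 (Milnor's theorem). Let γ be any three-dimensional curve that is knotted, closed and simple. Then:

\[ \int_\gamma |k|\,ds \ge 4\pi \tag{1.60} \]

The bound in Milnor's theorem is sharp in the sense that there are knotted curves, for example suitable trefoil knots, whose total absolute curvature is arbitrarily close to 4π. The value 4π itself is not attained by any knotted curve.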
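Both global bounds can be probed numerically. The following sketch is an illustration under our own assumptions: the function names are ours, and the trefoil parametrisation (sin t + 2 sin 2t, cos t − 2 cos 2t, −sin 3t) is a standard one that does not appear in the lecture. It approximates the total absolute curvature of a planar circle and of that trefoil, using |k| ds = |γ′ × γ″| / |γ′|² dt for an arbitrary parameter t.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def total_curvature(first_derivative, second_derivative, n_samples=5000):
    """Approximate the integral of |k| ds over one period t in [0, 2 pi)."""
    dt = 2.0 * math.pi / n_samples
    total = 0.0
    for i in range(n_samples):
        d1, d2 = first_derivative(i * dt), second_derivative(i * dt)
        total += norm(cross(d1, d2)) / norm(d1) ** 2 * dt
    return total

# Unit circle in the xy-plane.
def circle_d1(t):
    return (-math.sin(t), math.cos(t), 0.0)

def circle_d2(t):
    return (-math.cos(t), -math.sin(t), 0.0)

# Trefoil (sin t + 2 sin 2t, cos t - 2 cos 2t, -sin 3t).
def trefoil_d1(t):
    return (math.cos(t) + 4.0 * math.cos(2.0 * t),
            -math.sin(t) + 4.0 * math.sin(2.0 * t),
            -3.0 * math.cos(3.0 * t))

def trefoil_d2(t):
    return (-math.sin(t) - 8.0 * math.sin(2.0 * t),
            -math.cos(t) + 8.0 * math.cos(2.0 * t),
            9.0 * math.sin(3.0 * t))

if __name__ == "__main__":
    print(total_curvature(circle_d1, circle_d2) / math.pi)    # in units of pi
    print(total_curvature(trefoil_d1, trefoil_d2) / math.pi)  # in units of pi
```

The circle should give 2 (in units of π), the equality case of Fenchel's theorem, while the trefoil should land above 4, as Milnor's theorem demands.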

With these two examples, we conclude our discussion of curves and move on to the second type of object we want to discuss before talking about differential geometry in a general manner: surfaces in R³.
Chapter 2

Surfaces

In this chapter, we deal with surfaces, which are the obvious next step after curves in our discussion. We will not treat surfaces in the more general R^N, but just surfaces in R³, as they are more intuitive and are enough for the purposes of this lecture.
We start with the definition and then define a few basic properties, before moving on to a discussion of the geometry and curvature.

2.1 Some definitions and basic quantities


The easiest type of surface in R³ is simply the graph of a function of x and y (like the one in figure 2.1(a)). We have a function f(x, y) and we set this to be the z-coordinate. You can have a look at the specific example below.

Example: The surface z = f(x, y) = x² + y²

In figure 2.1(b), you can see the graph of the function z = f(x, y) = x² + y². It is, of course, a surface (in every intuitive way, but also according to the more general definition we will give below).

It is clear that the surface in the example should be a surface. The idea that a
surface should always be the graph of a function is however, not a good one. There
are two simple examples that should definitely be surfaces, but wouldn’t be, if that
was our definition. The first is the xz-plane. Obviously, it should be a surface. If
anything should be a surface, the xz-plane should be. And yet, it is quite easy to
see that you can’t write it as a function of x and y. Well, you might say, that there
is nothing special about x and y in R3 and that we should be allowed to choose
any plane to describe our surface. For example, we could take y = f (x, z) = 0 to
describe the xz-plane. Yes, that is a possibility, but still we find a problem. Take
the sphere. It should definitely be included in our definition of a surface. But I dare
you to find any plane for which you can write the whole sphere as a graph of a
function. It should be very clear, that this is not possible.

Figure 2.1: A picture of several surfaces that explain our definition of a surface. In (a) you can see the graph of some random function of x and y. The picture in (b) is simply the graph of f(x, y) = x² + y². In (c) you see the first problem with the naive definition, because we cannot write the xz-plane as the graph of a function of x and y. In (d) you see the further problem that, even if we allow the function to be defined on an arbitrary plane, the sphere can never be of such a form globally, but it can be made such locally (e).

Can we repair this situation? Yes, if we recognize that, while we cannot write surfaces like the sphere as the graph of a function, at every point on the surface, in some (possibly very small) region of the surface, we can write that region as the graph of a function (on some plane in R³). That is, we can write the surface locally as the graph of a function[1]. This leads us to our full definition.

Definition 2.1.1: Smooth surface in R³

A smooth surface M in R³ is a subset of R³ such that for every point, there is a neighbourhood of that point which can be written as the graph of some smooth function in some direction.

Now that we have settled on what we mean by surfaces mathematically, we need to be able to describe derivatives, since we are, after all, doing differential geometry. You can imagine that you are a small ant, living on this surface, and you, as an ant, can only move on the surface, never out of it[2]. Then everything you can see and feel is two-dimensional, but the usual calculus we know in R³ is obviously three-dimensional. We somehow need to reduce the dimensions. An intuitive idea is that smooth surfaces locally (that might be very locally) look like a plane, the same way that a smooth curve locally looks like a straight line. We can take this plane, called the tangent plane, as the space of directions, and do our differentiation there, because in the limit (of a very small area around the point of interest), the surface and plane will converge to the same thing. See figure 2.2.
We define the tangent plane a bit differently however, through curves that live in M. This is mostly because it will be this definition that we will use to define the tangent space when talking of manifolds generally, in later chapters.

Definition 2.1.2: Tangent plane

The tangent plane to M at p is the set of all vectors based at p and tangent to M at the point p. We can formally define it as:

\[ T_pM = \{\, v \in \mathbb{R}^3 : v = \gamma_t(0) \text{ for some smooth curve } \gamma \text{ on } M \text{ with } \gamma(0) = p \,\} \tag{2.1} \]

Apart from tangent vectors, we also have normal vectors. These are vectors that are normal to M at p, which, of course, means that they are also normal to T_pM.

Definition 2.1.3: Unit normal vector

The unit normal vector to M at p, denoted N(p), is a normalised vector perpendicular to the tangent plane at p.

You can look at figure 2.3 for an example.
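For a surface that is the graph of a function, both the tangent plane and the unit normal can be written down explicitly. The sketch below uses our own function names, with the paraboloid z = x² + y² from the earlier example. It takes the velocity vectors of the two coordinate curves t ↦ (x + t, y, f(x + t, y)) and t ↦ (x, y + t, f(x, y + t)) as a basis of the tangent plane, and picks the upward-pointing one of the two possible unit normals (that choice is an assumption made only for this illustration).

```python
import math

def f(x, y):
    """The example surface z = x^2 + y^2."""
    return x * x + y * y

def grad_f(x, y):
    """Partial derivatives (f_x, f_y) of the example surface."""
    return (2.0 * x, 2.0 * y)

def tangent_basis(x, y):
    """Velocities at t = 0 of the coordinate curves through (x, y, f(x, y))."""
    fx, fy = grad_f(x, y)
    return (1.0, 0.0, fx), (0.0, 1.0, fy)

def unit_normal(x, y):
    """Upward-pointing unit vector perpendicular to both tangent basis vectors."""
    fx, fy = grad_f(x, y)
    length = math.sqrt(fx * fx + fy * fy + 1.0)
    return (-fx / length, -fy / length, 1.0 / length)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

if __name__ == "__main__":
    e1, e2 = tangent_basis(0.5, -1.0)
    n = unit_normal(0.5, -1.0)
    print(e1, e2, n)
    print(dot(n, e1), dot(n, e2), dot(n, n))
```

The last line should print two values close to zero (N is perpendicular to the tangent plane) and one close to one (N is normalised). Flipping the sign of all three components gives the other possible unit normal.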


[1] A function that depends on the point!
[2] You are an ant. You cannot fly or jump or anything like that.

Figure 2.2: A surface M, with a point p and the tangent plane T_pM drawn in, at different levels of "zoom". Globally, M and T_pM are very different, but the more we zoom in, the more they coincide, until they become "almost" the same.

Figure 2.3: A surface, its tangent plane, and a normal vector.

We have, of course, at every point, two choices of normal vector: we can choose N to point in one direction or the other. When we were discussing curves, we also had this choice, and simply solved it by saying that we would always take N in such a way that we get a right-handed coordinate system with the other important vectors[3]. Here, without any further important objects, we cannot really do this. There isn't an obvious choice of preferred directions in the tangent plane, so that we could just choose N's orientation to get a right-handed coordinate system. With closed surfaces (like a sphere), we can at least talk of N pointing "inwards" or "outwards", but for something that isn't closed (like a plane), we don't even have this luxury. What we will do is simply choose some direction for N whenever we need it, and use that one consistently.
But we don't need to talk only of a normal vector at a point, we can talk of the normal vectors at all the points of the surface, see figure 2.4.

Figure 2.4: The normal field N of some manifold M

Definition 2.1.4: Unit normal field

The unit normal field of M is the function N : M → R³ which maps each point p in M to the unit normal vector at p. We require N to be continuous (or sometimes smooth).
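The continuity requirement is a real restriction, and a Möbius strip shows why. The following hedged sketch uses a standard parametrisation of the strip that is not taken from the lecture, x(u, v) = ((1 + (v/2) cos(u/2)) cos u, (1 + (v/2) cos(u/2)) sin u, (v/2) sin(u/2)), and follows the unit normal x_u × x_v once around the centre line v = 0. The function names are our own.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normal_on_centre_line(u):
    """Unit normal x_u x x_v of the Moebius strip at the point (u, v = 0)."""
    x_u = (-math.sin(u), math.cos(u), 0.0)
    x_v = (0.5 * math.cos(u / 2.0) * math.cos(u),
           0.5 * math.cos(u / 2.0) * math.sin(u),
           0.5 * math.sin(u / 2.0))
    n = cross(x_u, x_v)
    length = math.sqrt(dot(n, n))
    return tuple(c / length for c in n)

def smallest_step_agreement(n_samples=1000):
    """Smallest dot product between normals at neighbouring sample points."""
    normals = [normal_on_centre_line(2.0 * math.pi * i / n_samples)
               for i in range(n_samples + 1)]
    return min(dot(a, b) for a, b in zip(normals, normals[1:]))

if __name__ == "__main__":
    print(normal_on_centre_line(0.0))            # normal at the starting point
    print(normal_on_centre_line(2.0 * math.pi))  # same point of the strip, one lap later
    print(smallest_step_agreement())             # close to 1: no jump along the way
```

The normal changes only a little from one sample to the next, yet after one full lap it points in the opposite direction at the very same point of the strip. So no single continuous choice of N can exist on the whole strip.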

Some remarks to the unit normal field:


• We have two possible choices for a unit normal vector at a point (”inwards”
or ”outwards”). Since we require N to be continuous it must be in the same
direction on a connected set, so overall we get a total of 2k possible unit
normal fields, where k is the number of connected components in M .
• In some cases we can only define N locally. A classical example being the
Möbius-strip4 , which you can see in figure 2.1.
3 In 2d: τ , In 3d τ, β
4 You can easily build a Möbius-strip, from a strip of paper. You just need to tape the ends
52 CHAPTER 2. SURFACES

• If M is the boundary of an open set in R3 , then N can be defined globally.

Figure 2.5: The Möbius strip. You can try to draw a normal field by starting
at some point and drawing the next normal vector and then the next until
you get back to where you started, but you will find that when you come
back, your normal points in the direction opposite to the one you started
with (so the field is not even continuous).

Now that we have developed the most basic tools we could use to do differential
geometry on surfaces (The analogues of τ and N for curves) we can proceed to
talk about curvature.

2.2 The curvature of surfaces in R3


We will find that the curvature of a surface at a point will be described by an
object called the second fundamental form5 . There are three equivalent ways to
define the curvature of a surface, which are summarized below.
A Through the curvature of curves through p (geometric definition, Qp )
B By using the Hessian of M , regarded as a graph over its own tangent plane
Tp M (2nd fundamental form, Ap (x, y))
C By defining the Weingarten map Wp
Furthermore, we will prove two useful formulas for explicitly computing curvature:
D Formula via appropriate parametrisation
E Formula for a graph of a function
of the strip of paper together, but in a way that the strip turns once. If you have never done this,
you should try.
5 You might wonder whether there is a first fundamental form. Yes, there is, but for pedagogical
reasons we will introduce it a bit later.


2.3. THE GEOMETRIC DEFINITION OF CURVATURE ON SURFACES53

We shall start with the geometric definition of curvature, since it is the most intuitive
one.

2.3 The Geometric Definition of Curvature on Surfaces

The first way to define curvature on a surface is using curves that lie on the surface.
If the surface is curved, then curves on it passing through the point of interest p
generally have to be curved as well. We cannot just take the curvature of some curve on
M going through p, simply because this will give us different values for different
curves. (There are infinitely many curves on M going through p,
all curved differently, see figure 2.6.) There is however a specific amount of curving
that a curve has to do at the point p to stay on M ; otherwise it would leave the surface.
This amount of curvature will be in the direction of N . Why? Well, in the tangent
directions we can have pretty much any curvature we want; the curve can be as
curved as it wants within M . That pretty much leaves only the direction of N . From this
simple thought follows the next definition.

Figure 2.6: A surface M and a point p. There are many curves going through
p, some less, some more curved. (They all have to live on M though.)

Definition 2.3.1: Geometric definition of curvature


Let M be a surface and N the unit normal vector field. Let p ∈ M , v ∈ Tp M
with |v| = 1, and let γ(t) be any curve in M with some parametrisation t, such that
γ(0) = p, γt (0) = v. Then define:

\[ Q_p^N(v) = Q_p(v) = \langle \kappa_\gamma(0), N(p) \rangle \]

What we basically do is define a function that gives us the normal component
of the curvature of a curve on M going through p whose tangent vector at p is
v. See figure 2.7.

Figure 2.7: The idea of the previous definition. The function Qp takes a
(normalized) v from the tangent plane, takes a curve through p with a
matching tangent vector, and puts out the normal component of κγ .

There are two questions you might have already asked yourself. Firstly, why do
we only define this when |v| = 1? That is an easy question to answer. Mostly
convenience. It will simplify the proof and calculations, while not leaving out any
real information. Let’s say you want to know Qp (v) for some vector with length
2. Then you can do this entire procedure for the same vector, but normalized, and
simply parameterize the curve γ in such a way that you move twice as fast along it.
The second question is whether this definition makes any sense, whether it
is well-defined, because we have, after all, many curves going through p with their
tangent-vector equaling v and it is not obvious that we always get the same number
for Qp (v) for any choice of appropriate curve. This will turn out to be true however,
as we will prove in a short time. For now, we will quickly assume that this is true
and talk about the object Qp a bit.

2.3.1 A bit about Qp

We want to take a few notes and intuitive ideas about what Qp is and describes at
this point.

• Firstly, Qp (v) is not standard notation, because it will turn out that Qp will
simply be the second fundamental form in a slightly different manner, which
will be denoted as Ap (X, Y ), and which we will introduce shortly.

• Qp (v) will turn out to be a quadratic form in the components of v, i.e. it will
be of the form $Q_p(v) = \sum_{i,j=1}^{2} a_{ij} v^i v^j$, where the $a_{ij}$ are some numbers that
come from the surface’s geometry and $v^i$ is the i-th component6 of v.

• Qp (v) = Qp (−v). This follows from the fact that for an appropriate curve
which we use to calculate Qp (v), we can simply take the same curve, but
reverse its parametrization. Then the tangent vector changes direction (or
sign), but the curvature vector stays the same, and therefore by definition
Qp (v) does too7 .

• Qp (v) can be either positive or negative; there are no general restrictions on
its sign. The sign holds geometric information however. Also, if we use −N
as our normal field, the sign of Qp (v) flips, simply because N is in the scalar
product.

• The largest and smallest values of Qp belong to two normalized vectors e1 , e2
that are orthogonal to each other. Knowing these two directions and their
Qp values is also enough to specify Qp (v) for all v in Tp M with unit length.
The two values are most often denoted kmax = k1 and kmin = k2 . The
quantities k1 and k2 are called the principal curvatures at p, while the two
directions e1 and e2 are called the principal directions/axes of curvature.

We will prove many of these claims soon, for now we just wanted to familiarize you
with these properties.

6 Notice the index is upstairs. There is a reason for this sort of notation, where upstairs and
downstairs indices have different meanings. For now, however, there really is no difference,
but we will use this notation to get you accustomed.
7 Prove that the tangent-vector changes sign, but the curvature doesn’t.

Figure 2.8: A sphere, the north pole, a random unit tangent vector and a
great circle.

Proposition: A sphere of radius R

As always, a sphere is a good way to find an intuitive understanding of
what an object like Qp (v) describes. Let us assume that for all v in Tp M
of unit length it really is enough to calculate Qp (v) for a single curve and
that this number will be the same for all other appropriate curves. Then
we can calculate Qp (v) very easily. Because the sphere is very symmetric,
we can get away with just calculating Qp for the north pole, and expect
it to be the same (geometrically) at all other pointsa . The (Cartesian)
coordinates of the north pole are (0, 0, R) and the normal vector (we choose
the inward normal vector) is N = (0, 0, −1). Let’s say we are given a unit vector
v = (v 1 , v 2 , 0) and want to calculate Qp (v). For this v, we can choose the curve to
be the appropriately parameterized great circle lying in the plane spanned
by N and v. The curvature vector of this great circle at p must, of course,
be (1/R) N , so that

\[ Q_p(v) = \langle \kappa_\gamma, N \rangle = \left\langle \tfrac{1}{R} N, N \right\rangle = \frac{1}{R} \tag{2.2} \]

Also, this is independent of v, which really just says something about the
(local) symmetry of the sphere.
a You can alternatively see this as us using special (Cartesian) coordinates, for
which the point on the sphere that is of interest p has coordinates (0, 0, R). Then,
since Qp is defined by a scalar product of two fundamentally geometric vectors, it
should transform geometrically.

Proposition: Independence of Qp from the curve on S^2

We have calculated Qp for the sphere. We will do the same thing
again, but with a different curve, and observe how it is that we get the same
result. We have the same situation as in the last example, but this time we
pick a different curve β that is not a great circle. Instead, let’s take a circle
on the sphere that passes through p and has the tangent vector v, but is
smaller than a great circle. See figure 2.9. Because the circle is smaller,
κβ is longer than κγ from the last example. But κβ also points in a
different direction, not in the direction of the normal vector of the sphere
at the north pole. When we projecta κβ back onto N , we get the same
result (prove this). In this way, the projection onto N ensures that we get
the same result, no matter the curve.
a Because we are calculating a scalar product

Figure 2.9: Here we have the same situation as in the previous figure, but
instead of a great circle, we take a smaller circle to compute Qp (v). We
have however rotated the sphere (v is pointing towards you), so that it is
simpler to see what is going on, namely that the curvature vector of the
smaller circle is longer, but that its projection onto N is still the same.
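The "prove this" above can be made plausible with a few lines of arithmetic. The following is only a numerical sketch, not a proof: we cut the sphere with a plane through the north pole that contains v = (1, 0, 0) and is tilted by an angle alpha (a parameter we introduce just for this illustration), and use that the curvature vector of a circle points to its centre and has length 1/r. The function names are our own.

```python
import math

def small_circle_curvature(R, alpha):
    """Curvature vector at the north pole p = (0, 0, R) of the circle cut out
    of the sphere of radius R by the plane through p that contains
    v = (1, 0, 0) and is tilted by alpha (alpha = 0 gives a great circle)."""
    p = (0.0, 0.0, R)
    m = (0.0, math.cos(alpha), math.sin(alpha))   # unit normal of the cutting plane
    dist = sum(a * b for a, b in zip(p, m))       # distance of the plane from the origin
    center = tuple(dist * c for c in m)           # centre of the circle
    r = math.sqrt(R * R - dist * dist)            # radius of the circle
    # The curvature vector of a circle points to its centre and has length 1/r.
    return tuple((c - q) / r**2 for c, q in zip(center, p))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

R = 2.0
N = (0.0, 0.0, -1.0)                              # inward normal at the north pole
for alpha in (0.0, 0.5, 1.0, 1.4):
    kappa = small_circle_curvature(R, alpha)
    # length of kappa grows with alpha, its normal component stays at 1/R
    print(alpha, math.sqrt(dot(kappa, kappa)), dot(kappa, N))
```

The second printed column grows as the circle shrinks, while the third column stays at 1/R, which is exactly the picture of figure 2.9.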

2.3.2 The independence of Qp (v) from the curve we choose


Now that we have a bit more intuition/information about Qp (v), we want to prove
that it is independent of our choice of curve, so long as the curve lies on M and
has the tangent vector v.

Theorem 2.3.1: Qp is well-defined

Let M be a surface, p some point on the surface and v some unit vector
in Tp M . Then Qp (v) = ⟨κγ (0), N (p)⟩ is independent of our choice of the
appropriate curve γ. (Appropriate means that γ lies on M , γ(0) = p and
that the tangent vector is γt (0) = v.)

To prove this theorem, we will need the lemma below.

Lemma 2.3.1
Let everything be the same as in the theorem above. Then

⟨κγ (0), N (p)⟩ = ⟨γtt (0), N (p)⟩ (2.3)

What the lemma says, is basically that instead of the geometric curvature κ of
the curve, we can instead use the second derivative of γ.
The proof is quite simple.
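Proof. Let γ be any appropriate curve. We simply compute ⟨κγ (0), N (p)⟩, using
the fact that

\[ \kappa_\gamma = \frac{1}{|\gamma_t|^2} \left( \gamma_{tt} - \langle \gamma_{tt}, \gamma_t \rangle \frac{\gamma_t}{|\gamma_t|^2} \right) \]

At t = 0 we can use the fact that |γt | = |v| = 1, which simplifies the formula for κ:

\[ \langle \kappa_\gamma(0), N(p) \rangle = \langle \gamma_{tt}(0) - \langle \gamma_{tt}(0), \gamma_t(0) \rangle \gamma_t(0), N(p) \rangle = \langle \gamma_{tt}(0), N(p) \rangle \tag{2.4} \]

where in the last step we have used the geometric fact that γt (0) = v lies in the
tangent plane, and is therefore orthogonal to N (p).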
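If you like to see a lemma at work before believing it, here is a small numerical sketch. The surface z = x^2 + 3xy − y^2 and the curve on it are arbitrary choices of ours (any graph with horizontal tangent plane at the origin and any curve with unit tangent there would do), and the derivatives are approximated by central finite differences, so the two numbers agree only up to a small discretisation error.

```python
def gamma(t):
    """A sample curve on the graph z = f(x, y) with gamma(0) = 0 and
    unit tangent vector (0.6, 0.8, 0) at t = 0."""
    x = 0.6 * t + 0.5 * t * t
    y = 0.8 * t - 0.3 * t * t
    return (x, y, x * x + 3.0 * x * y - y * y)   # f(x, y) = x^2 + 3xy - y^2

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def derivatives_at_zero(curve, h=1e-3):
    """Central finite differences for the first and second derivative at t = 0."""
    a, b, c = curve(-h), curve(0.0), curve(h)
    first = tuple((ci - ai) / (2 * h) for ai, ci in zip(a, c))
    second = tuple((ci - 2 * bi + ai) / (h * h) for ai, bi, ci in zip(a, b, c))
    return first, second

def curvature_vector(g_t, g_tt):
    """kappa = (g_tt - <g_tt, g_t> g_t / |g_t|^2) / |g_t|^2."""
    s = dot(g_t, g_t)
    return tuple((a - dot(g_tt, g_t) * b / s) / s for a, b in zip(g_tt, g_t))

N = (0.0, 0.0, 1.0)                       # normal of the tangent plane z = 0
g_t, g_tt = derivatives_at_zero(gamma)
lhs = dot(curvature_vector(g_t, g_tt), N)  # <kappa(0), N>
rhs = dot(g_tt, N)                         # <gamma_tt(0), N>
print(lhs, rhs)
```

Note that the curve is not parametrised by arc-length away from t = 0; the lemma only needs |γt (0)| = 1.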

Curvature is a second derivative


Before we go on to prove the theorem, we will discuss a big idea the proof relies on,
which is an overarching theme in differential geometry. It’s the idea that curvature is
a second derivative. This is something that is fully universal throughout differential
geometry. This fact, however, shows up only if you either use some smart ”natural”
coordinates or differentiate with respect to a geometric quantity. In general coordinates,
it is usually an equation containing a second derivative but with some other
non-geometric ”decorations”.
For a curve, we defined the curvature vector to be:

\[ \kappa = \frac{d^2 \gamma}{ds^2} \tag{2.5} \]

It is manifestly a second derivative, and because we are differentiating with respect to a
geometric quantity (the arc-length s), it is simply just the second derivative. Later, while discussing
curves, we saw that, if you can write the curve as a graph of a function y = u(x),
then the curvature becomes:

\[ k = \frac{u_{xx}}{(1 + u_x^2)^{3/2}} \tag{2.6} \]

At any point whose tangent is parallel to the x-axis, ux = 0 and:

\[ k = u_{xx} \tag{2.7} \]

It is clear that we can also (locally) write the curve as a graph over its tangent, as
seen in figure 2.10, where we also find that the curvature scalar becomes a second
derivative.
We will see that the curvature of a surface will also become a second derivative,
when you write the surface as the graph of a function over its tangent plane (locally).
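To see the "decorations" of (2.6) in action, here is a quick sketch with the lower arc of a circle of radius R that touches the x-axis at the origin, u(x) = R − sqrt(R^2 − x^2). The example and the function names are our own choice. The full formula gives 1/R everywhere, while the bare second derivative u_xx equals the curvature only at x = 0, where the tangent is horizontal.

```python
import math

R = 2.0

def u_x(x):
    """First derivative of u(x) = R - sqrt(R^2 - x^2)."""
    return x / math.sqrt(R * R - x * x)

def u_xx(x):
    """Second derivative of u(x) = R - sqrt(R^2 - x^2)."""
    return R * R / (R * R - x * x) ** 1.5

def graph_curvature(x):
    """Curvature of the graph of u via formula (2.6)."""
    return u_xx(x) / (1.0 + u_x(x) ** 2) ** 1.5

for x in (0.0, 0.5, 1.0, 1.5):
    # columns: x, bare second derivative, curvature from (2.6)
    print(x, u_xx(x), graph_curvature(x))
```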

2.3.3 Proof of the theorem


We will now, finally, get to the proof.
Proof.

Step 1 We start by choosing an orthonormal coordinate system of R3 with origin at p,
so that p = (0, 0, 0), Tp M is the xy-plane, and N = (0, 0, 1). This is mostly a matter
of convenience, and certainly doesn’t change ⟨κγ (0), N (p)⟩. Then we can
write the surface (locally) as the graph z = f (x, y) of a function f .
For a picture see figure 2.11.
Step 2 Now let v ∈ Tp M . Then it will be of the form v = (v 1 , v 2 , 0) ∼ (v 1 , v 2 )
(We associate the xy-plane in R3 with R2 in the usual way)
Step 3 We now let γ be any curve on M with γ(0) = p and γt (0) = v and compute
⟨κγ (0), N (p)⟩. To do this, we write γ(t) = (x(t), y(t), z(t)), so that
γtt (t) = (x′′ (t), y ′′ (t), z ′′ (t)). Then:

\[ \langle \kappa_\gamma(0), N(p) \rangle = \langle \gamma_{tt}(0), N \rangle = \langle (x''(0), y''(0), z''(0)), (0, 0, 1) \rangle = z''(0) \tag{2.8} \]

where we used the lemma from above. We found the theme of the last
subsection. Curvature is a second derivative. What we have basically
done is write (a local part of) the surface as a graph over its tangent plane
at p, and found that, similarly to curves, curvature became the second
derivative in those coordinates.
Step 4 Somehow, we need to use the fact that γ lies on the surface, which we have
not yet used. Because γ does lie on the surface, its coordinates
have to be γ(t) = (x(t), y(t), f (x(t), y(t))); the z component is (for
a local part, again) not independent of x and y. If we now differentiate
z(t) = f (x(t), y(t)) twice with the chain rule, we get:

\[ z_t(t) = f_x(x(t), y(t))\, x_t(t) + f_y(x(t), y(t))\, y_t(t) \tag{2.9} \]

\[ z_{tt}(t) = f_{xx} x_t x_t + f_{xy} x_t y_t + f_x x_{tt} + f_{yx} x_t y_t + f_{yy} y_t y_t + f_y y_{tt} \tag{2.10} \]

But we only need it at t = 0, where it simplifies a lot. Firstly, xt (0) =
v 1 , yt (0) = v 2 because we are using a curve whose tangent vector is v. But
also fx (0, 0) = fx (x(0), y(0)) and fy (0, 0) have to be 0, because the xy-plane is Tp M , which is
tangent to the surface that we describe with the function f . The terms containing
xtt and ytt , the only ones that depend on more than the tangent vector of the curve,
therefore drop out. Using fxy = fyx , what we get is:

\[ z_{tt}(0) = f_{xx} v^1 v^1 + 2 f_{xy} v^1 v^2 + f_{yy} v^2 v^2 \tag{2.11} \]

Firstly, we have just proven that Qp (v) is a quadratic form, but we have
also proven that Qp (v) is independent of the curve we choose. The quantities
fxx , fxy and fyy (evaluated at the origin) have nothing to do with our choice of curve, only
with the underlying surface.

Figure 2.10: A curve that is just the graph of the function u(x). When
written as the graph of a function over its tangent, k becomes the second
derivative.

Figure 2.11: The coordinates we are using for the proof of the theorem.
The point p is at the origin and the tangent plane is the xy-plane.

We can rewrite our result quite a bit and see another way of looking at it.

\[ Q_p(v) = f_{xx} v^1 v^1 + 2 f_{xy} v^1 v^2 + f_{yy} v^2 v^2 \tag{2.12} \]

\[ = \begin{pmatrix} v^1 & v^2 \end{pmatrix} \begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix} \begin{pmatrix} v^1 \\ v^2 \end{pmatrix} = v^T H v \tag{2.13} \]

where H is the Hessian matrix of f evaluated at the point p. If we write x =
x^1 , y = x^2 , we can write it in one sum:

\[ Q_p(v) = \sum_{i,j=1}^{2} \frac{\partial^2 f}{\partial x^i \partial x^j} v^i v^j \tag{2.14} \]

which makes the quadratic-form-ness of Qp a bit more visible. (The partials are of
course evaluated at p.)
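The whole point of the theorem can be watched numerically. In the sketch below (our own illustration, with an arbitrarily chosen height function f and arbitrarily bent curves), three different curves with the same tangent vector v at the origin all give the same z''(0), and this value agrees with v^T H v. The second derivative is approximated by a central finite difference, so agreement is only up to a small error.

```python
def f(x, y):
    # Sample height function with f = f_x = f_y = 0 at the origin,
    # so that the tangent plane at p = 0 is the xy-plane.
    return x * x + 3.0 * x * y - y * y + x ** 3

HESSIAN = ((2.0, 3.0), (3.0, -2.0))   # second partials of f at the origin

def z_tt_at_zero(v, bend, h=1e-3):
    """Finite-difference z''(0) along the curve
    x = v1 t + a t^2, y = v2 t + b t^2, z = f(x, y), with bend = (a, b)."""
    def z(t):
        return f(v[0] * t + bend[0] * t * t, v[1] * t + bend[1] * t * t)
    return (z(h) - 2.0 * z(0.0) + z(-h)) / (h * h)

def quadratic_form(v):
    """v^T H v as in (2.13)."""
    return sum(HESSIAN[i][j] * v[i] * v[j] for i in range(2) for j in range(2))

v = (0.6, 0.8)                         # unit tangent vector in the xy-plane
for bend in ((0.0, 0.0), (2.0, -1.0), (-5.0, 4.0)):
    print(bend, z_tt_at_zero(v, bend), quadratic_form(v))
```

The parameter bend changes x''(0) and y''(0), which is exactly the freedom that drops out in step 4 because fx and fy vanish at the origin.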

2.4 The second fundamental form


Now that we have seen the geometric definition, we can manipulate it a tiny bit
and define the second fundamental form.

Definition 2.4.1
Let M and p be, as always, a surface and a point on it. Let us also use
coordinates where locally we write the surface as the graph of a function f
over its tangent plane Tp M at p. Then the second fundamental form is defined as:

\[ A_p(X, Y) = \sum_{i,j=1}^{2} \frac{\partial^2 f}{\partial x^i \partial x^j} X^i Y^j \tag{2.15} \]

as a function Ap : Tp M × Tp M → R. The vectors X, Y are both in the
tangent space, and the partial derivatives are evaluated at p.

We can very quickly note that the second fundamental form is symmetric, in
the sense that Ap (X, Y ) = Ap (Y, X), simply because the Hessian is a symmetric
matrix. You can also see the second fundamental form as the first Taylor coefficient
of f that actually tells us something geometric. The zeroth coefficient just tells
us where the surface is, the first only how the tangent plane is oriented; neither
of these says anything about the shape of the surface. Only the second one tells us something geometric,
that is, the curvature at the point p.
Note 5. The formula above can only work if we use coordinates where p =
(0, 0, 0) and the xy-plane is the tangent space, because there fx and fy vanish.
We need to carefully pick this coordinate system to extract the geometric information.
There is a general formula in any coordinates, but it is long and tedious,
which is the reason why we won’t do it here.
To connect the second fundamental form to curvature, we note that:

⟨κγ , N (p)⟩ = Ap (γt (0), γt (0)) (2.16)

2.4.1 Simplifying the second fundamental form


The second fundamental form is a bilinear form, whose matrix is the Hessian of
the surface in our special choice of coordinates. It is also symmetric. This means
that we can diagonalize it by choosing good coordinates of Tp M . We can do that
since our choice of coordinates has some arbitrariness: we only really
fixed the xy-plane to be Tp M and the z-axis to be along N , but we can still rotate
about the z-axis, or reflect across a plane containing it, and get another coordinate system that works exactly as
well as the original one. In linear algebra terms, we can perform an orthogonal
transformation (rotation or flip or both) on Tp M (the xy-plane). We will use this
to our advantage. If you had a good course on linear algebra, you probably know
the singular value decomposition (for a symmetric form like ours, the statement below is
usually called the spectral theorem or principal axis theorem), which you can see summarized below.
2.4. THE SECOND FUNDAMENTAL FORM 63

Proposition 2.4.1: Singular Value Decomposition

Let’s say you have a scalar product (not necessarily the standard Euclidean
one) and a bilinear form. That is, you have the maps

⟨,⟩ : V × V → R (2.17)
B :V ×V →R (2.18)

where the first has the usual properties of a scalar product and the second one
is linear in both entries and symmetrica . Then there exists an orthonormal
basis e1 , e2 , . . . , en of the vector space V , so that Bij = B(ei , ej ) = λi δij
for some numbers λ1 , λ2 , . . . , λn . In other words, the matrix Bij is diagonal
with the λ’s on the diagonal. The λ’s are called the singular or principal
values.
a B(X, Y ) = B(Y, X)

If you have not seen the singular value decomposition before, it simply states
that you can rotate and flip your coordinate system so as to turn a particular symmetric bilinear
form into the form:

\[ B(v, w) = \lambda_1 v^1 w^1 + \lambda_2 v^2 w^2 + \cdots + \lambda_n v^n w^n \tag{2.20} \]

where $v^i$ and $w^i$ are the components of v, w in the new orthonormal system.

This is what we want to use on the second fundamental form, since it is a
symmetric bilinear form. If we do find that special orthonormal coordinate system
of Tp M , we find that the matrix of Ap becomes:

\[ A_p = \begin{pmatrix} k_1 & 0 \\ 0 & k_2 \end{pmatrix} \tag{2.21} \]

where k1 and k2 are some numbers that only depend on the surface. They are the
principal curvatures we mentioned when discussing Qp (v), which you can see quite
simply. If we work in that coordinate system, the second fundamental form applied to
the same vector twice becomes:

Qp (v) = Ap (v, v) = k1 v 1 v 1 + k2 v 2 v 2 (2.22)


or, if we use that |v| = 1 (as we did while first introducing the principal curvatures),
we can write v 1 = cos(θ), v 2 = sin(θ) and get:

\[ Q_p(v) = A_p(v, v) = k_1 \cos^2(\theta) + k_2 \sin^2(\theta) \tag{2.23} \]

You can verify (exercise; hint: write the right-hand side as k2 + (k1 − k2 ) cos2 (θ)) that the maxima and minima happen when v = ±e1 , ±e2
in those special coordinates, which verifies that (1) k1 , k2 are the extreme values of
Qp (v) and that (2) the directions of the principal axes of curvature are orthogonal
to each other (e1 , e2 are orthogonal by construction).
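For a 2×2 symmetric matrix the diagonalisation can be written down in closed form, which gives a cheap way to experiment with the exercise above. The following is a sketch with an arbitrarily chosen sample matrix; the function names are ours. It computes k1, k2 and the angle of e1, and compares them with a brute-force scan of Qp over the unit circle.

```python
import math

def principal_curvatures(a, b, c):
    """Diagonalise the symmetric matrix [[a, b], [b, c]].
    Returns (k1, k2, theta) with k1 >= k2 and e1 = (cos theta, sin theta)."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    return mean + radius, mean - radius, theta

def Q(a, b, c, phi):
    """Q_p(v) for the unit vector v = (cos phi, sin phi)."""
    v1, v2 = math.cos(phi), math.sin(phi)
    return a * v1 * v1 + 2.0 * b * v1 * v2 + c * v2 * v2

a, b, c = 2.0, 3.0, -2.0               # sample entries f_xx, f_xy, f_yy
k1, k2, theta = principal_curvatures(a, b, c)
samples = [Q(a, b, c, 2.0 * math.pi * n / 3600) for n in range(3600)]
print(k1, max(samples))                # sampled maximum is close to k1
print(k2, min(samples))                # sampled minimum is close to k2
print(Q(a, b, c, theta), Q(a, b, c, theta + math.pi / 2))
```

The last line evaluates Qp in the direction e1 and in the direction orthogonal to it, which reproduces k1 and k2.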

2.4.2 The mean and Gauss curvatures


The last two sections introduced the second fundamental form and its connection
to the principal curvatures. We will now define two new curvatures that we construct
from k1 and k2 and give all four numbers some more meaning.
In figure 2.12 you can see the geometric meaning of k1 and k2 . The first one
tells us the maximal curvature (and e1 its direction), while the second tells us the
minimal curvature (and e2 its direction). From these two we can construct
two other numbers:

H = k1 + k2 (2.24)
K = k1 k2 (2.25)

Figure 2.12: The principal curvatures and axes of curvature on a rather
standard surface. We have k1 , which is positive, and k2 , which is negative.
Note that which is which (max/min) depends on the direction of N we
chose; for this picture, since k1 is the positive one, N points up. Also, the
principal axes are orthogonal to each other, even if it might not look like it
in the picture because of perspective.

The first, H, is called the mean curvature, while the second, K, is called the
Gauss curvature. You might wonder why H is called the mean curvature when the mean
should be (k1 + k2 )/2. That used to be the old definition, but over time, people got sick
of writing that 2 everywhere, and redefined H. (Some texts still use the version with
the factor 1/2, so check the convention when comparing formulas.)



What is the meaning of H? To understand it, we hope that by now you are
convinced that the matrix form of A in our special coordinates is just the Hessian
of the function that describes our surface, and that this does not change
when we rotate Tp M . What does change is that the Hessian becomes diagonal in the rotated
coordinates (which we will denote x′ , y ′ ). So what we get for that matrix is:

\[ A_p = \begin{pmatrix} \frac{\partial^2 f}{\partial x'^2} & 0 \\ 0 & \frac{\partial^2 f}{\partial y'^2} \end{pmatrix} = \begin{pmatrix} k_1 & 0 \\ 0 & k_2 \end{pmatrix} \tag{2.26} \]

So that:

\[ H = k_1 + k_2 = \frac{\partial^2 f}{\partial x'^2} + \frac{\partial^2 f}{\partial y'^2} \tag{2.27} \]

which is the two dimensional Laplacian of f . Now, the Laplacian tells us quite a bit
about average quantities, which also makes it more clear why H is called the mean
curvature. It tells us how much the function f , averaged over a small circle around (0, 0),
deviates from its value at (0, 0) in
our coordinates, which is, of course, 0. It makes sense that this would be a good
value to describe a part of the curvature. Because the Laplacian is inevitably tied
up in (physical) problems like drums and soap films, it first came up way before
curvature in this sense was discussed. Surfaces with H = 0 are called minimal
surfaces, not only because of the history of the Laplacian, but also because they
are usually surfaces of minimal area for some boundary problem.
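The averaging interpretation can be tried out directly. The sketch below uses a model height function with Hessian diag(k1, k2) at the origin (the values of k1, k2 and eps are arbitrary choices of ours) and relies on the standard fact that, in two dimensions, the average of f over a circle of radius eps differs from f(0, 0) by approximately eps^2/4 times the Laplacian.

```python
import math

def circle_average(g, eps, samples=720):
    """Average of g over the circle of radius eps around the origin."""
    total = 0.0
    for n in range(samples):
        phi = 2.0 * math.pi * n / samples
        total += g(eps * math.cos(phi), eps * math.sin(phi))
    return total / samples

k1, k2 = 1.5, -0.5                     # sample principal curvatures

def f(x, y):
    # Height function whose Hessian at the origin is diag(k1, k2).
    return 0.5 * (k1 * x * x + k2 * y * y)

eps = 1e-2
# In two dimensions: average over the circle - f(0, 0) ~ (eps^2 / 4) * Laplacian.
estimate = 4.0 * (circle_average(f, eps) - f(0.0, 0.0)) / eps**2
print(estimate, k1 + k2)
```

So H really measures how far the surface lies, on average, above or below its tangent plane near p.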

Figure 2.13: The signs of k1 and k2 change with the other choice of N .
If we choose N1 as in the picture, k1 is positive, while if we use N2 , it is
negative. The absolute value stays the same however.

Figure 2.14: The Laplacian locally is just (up to scaling by ϵ2 ) the average difference of f (x0 )
and f (x0 + ϵe), where e ranges over all possible directions and ϵ is a parameter that goes
to 0.

Now what about Gauss’s curvature? Well, it has a very nice property. Its sign
is independent of our choice of N , unlike the signs of k1 , k2 , and H. First regard k1 and k2 .
Look at figure 2.13. There you should see how k1 changes depending on the choice
of our normal. If we choose N1 , then the normal and the curvature vector of the
curve belonging to k1 point in the same direction; if we choose the other direction,
they point in opposite directions.
The fact that the sign of H tells us nothing geometric either can be seen algebraically
quite easily.
H = k1 + k2 (2.28)
Therefore if we change the signs of k1 and k2 , it is clear that the sign of H changes.
You can also look at this through the meaning of H. H is the two dimensional
Laplacian of the function that the surface is a graph of (in good coordinates, that is).
If we reverse the direction of N , we practically just take −f as the function M is a
graph of (locally). But the Laplacian is linear, which means it changes sign too, and
since H is just the Laplacian, so does H. The Gauss curvature, on the other hand,
does not change sign. The important thing to note is that if you change the signs
of both k1 and k2 , their product, which is the Gauss curvature, does not change. The sign of the
Gauss curvature captures the relationship of the signs of the principal curvatures. If
they have the same sign (++ or −−), then K will be positive. If they have opposite signs, it will be
negative. The sign relationship of the principal curvatures is a very geometric thing.
In particular, if both principal curvatures have the same sign, then the curves that
they belong to curve in the same direction, and if they have opposite signs, in
opposite directions. In the first case you have something that looks like a bulge, in the
other something that looks like a saddle, as you can see in figure 2.15.

Figure 2.15: Typical example for each of the possible signs of the Gauss
curvature. If it is positive, both curves curve in the same direction, creating
a bulge. Otherwise, they curve in opposite directions and create a saddle.

There is another way to express H and K that might be more enlightening. If,
as always, we call the matrix of the second fundamental form (so the Hessian of f )
A, then we get the result:

H = tr(A) (2.29)
K = det(A) (2.30)

as you should check yourself (exercise). These formulas are important, as they
illustrate the geometric character of H and K.
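As a partial check of the exercise, and of the sign discussion from before, here is a small sketch with an arbitrarily chosen symmetric matrix A (our own example). It compares trace and determinant with the sum and product of the eigenvalues, and then replaces A by −A, which is what the other choice of normal does.

```python
import math

def mean_and_gauss(A):
    """H = tr(A) and K = det(A) for a 2x2 matrix given as nested tuples."""
    (a, b), (c, d) = A
    return a + d, a * d - b * c

def eigenvalues(A):
    """Eigenvalues (k1, k2), k1 >= k2, of a symmetric 2x2 matrix."""
    (a, b), (_, d) = A
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean + radius, mean - radius

A = ((2.0, 1.0), (1.0, -1.0))                          # sample second fundamental form
minus_A = tuple(tuple(-x for x in row) for row in A)    # the same surface with normal -N

H, K = mean_and_gauss(A)
k1, k2 = eigenvalues(A)
print(H, k1 + k2)                  # trace versus sum of principal curvatures
print(K, k1 * k2)                  # determinant versus product
print(mean_and_gauss(minus_A))     # H flips its sign, K does not
```

Trace and determinant do not depend on the orthonormal basis of Tp M we use to write down A, which is one way to read the "geometric character" of H and K.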

Proposition: The Sphere

We start with the standard example of a surface, a sphere. Let it have
radius R and a normal N = −x/R that points into the sphere. Then, for
any direction we pick, a great circle will have the curvature vector (1/R) N , which
means that the second fundamental form will be Ap (X, Y ) = (1/R) ⟨X, Y ⟩
and the principal curvatures will be k1 = k2 = 1/R. The mean curvature
will be H = k1 + k2 = 2/R and the Gauss curvature will be K = 1/R^2 .
From the sign of K we would expect a (locally) bulge-like surface, which
the sphere of course is. To illustrate the power of the geometric approach
to this, we invite you to also derive the results above purely algebraically,
which will take a bit longer. You can restrict yourself to the north pole
p = (0, 0, R) and N = (0, 0, −1). There you can write the sphere locally as the graph of a
function f over the tangent plane and calculate the Hessian of f . It is
a simple calculation, but quite error prone and rather annoying. You can do
it yourself to see it, and probably already did quite a few times throughout
your studies. The result is that:

\[ D^2 f = \begin{pmatrix} 1/R & 0 \\ 0 & 1/R \end{pmatrix} \tag{2.31} \]

from which our results from above follow. If you actually did the calculation,
you are probably happy about the power of geometry.
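If you would rather let a computer do the annoying part, the following sketch approximates the Hessian by central finite differences. We assume the graph function f(x, y) = R − sqrt(R^2 − x^2 − y^2), which is the height of the sphere over its tangent plane at the north pole measured along N = (0, 0, −1); the step size h and the helper names are our own choices, and the result is only accurate up to discretisation error.

```python
import math

R = 2.0

def f(x, y):
    """Height of the sphere over its tangent plane at the north pole,
    measured along the inward normal N = (0, 0, -1)."""
    return R - math.sqrt(R * R - x * x - y * y)

def hessian_at_origin(g, h=1e-3):
    """Central finite-difference Hessian of g at (0, 0)."""
    g_xx = (g(h, 0.0) - 2.0 * g(0.0, 0.0) + g(-h, 0.0)) / (h * h)
    g_yy = (g(0.0, h) - 2.0 * g(0.0, 0.0) + g(0.0, -h)) / (h * h)
    g_xy = (g(h, h) - g(h, -h) - g(-h, h) + g(-h, -h)) / (4.0 * h * h)
    return ((g_xx, g_xy), (g_xy, g_yy))

A = hessian_at_origin(f)
print(A)                                                    # close to diag(1/R, 1/R)
print(A[0][0] + A[1][1], A[0][0] * A[1][1] - A[0][1] ** 2)  # close to 2/R and 1/R^2
```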

2.5 Symmetry and Curvature


Now that we have found the right tool to understand the curvature of a surface, that
is, the second fundamental form, we want to make our computations as quick as
possible. A way to do this is to exploit the symmetry that a surface has. Take the
sphere, for example. It is the most symmetric surface you can think of: any rotation
about its centre and any reflection across a plane through its centre preserves the sphere. We used symmetry in
the previous section to very quickly calculate the second fundamental form and the
different curvatures. We now want to state and prove a theorem that will allow us
to use a specific symmetry to make the problem almost trivial, if the surface has
that kind of symmetry.
2.5. SYMMETRY AND CURVATURE 69

Theorem 2.5.1: Symmetry and Curvature

Let M and p, as always, be a surface and a point on the surface. Let S be
some isometry of R3 which preserves M . In particular, let S be a reflection across a plane P .
There are a few requirements for the idea to work:

• The point p for which we are calculating the second fundamental form
is a fixed point: S(p) = p.
• The tangent plane Tp M is preserved by S, which means S(Tp M ) =
Tp M . In other words, the plane P across which we reflect our space is
perpendicular to Tp M .

Then, as I hope you can agree, S acts on the tangent plane like a reflection
through a line L, which passes through the origin. Then the direction of
this line as well as the direction orthogonal to it are the principal axes of
curvature.

You can see the setup of the situation in figure 2.16 and the way the symmetry
acts on the tangent plane in figure 2.17. I hope you can appreciate how useful this
theorem is. It is very easy to recognize a reflective symmetry like the one we need,
and if we do find it, we have pretty much immediately found the axes of curvature.
We then only need to calculate Ap (e∥ , e∥ ) and Ap (e⊥ , e⊥ ) to specify
Ap completely, and often these are quite easy to find (like with the sphere).

Figure 2.16: A surface that is symmetric under a reflection across the plane
P . The plane P is perpendicular to the tangent plane Tp M , and the line of
intersection of the two planes (L) is also drawn in.

Figure 2.17: Here we drew the tangent plane and the principal axes of curvature.
The symmetry acts as a reflection along the line L and the two
directions e∥ , e⊥ turn out to be the axes of curvature (per the theorem we
will prove in this section.)

Proof. Let e⊥ and e∥ be an orthonormal system so that the first vector is orthogonal
to L and the second is parallel. We need to prove that in these two axes,
A is diagonal. First, note that if k1 = k2 ,
then A = k1 I, which already is diagonal in every orthonormal basis, so we can assume k1 ≠ k2 . We will
show that any orthonormal basis that diagonalizes A must consist of these two directions
(up to sign); since a diagonalizing basis exists, this proves the claim8 .

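Step 1 Our first claim is that Ap cannot change under the symmetry as it is a
geometric object. As a formula, we claim

Ap (S(X), S(Y )) = Ap (X, Y ) (2.32)

This is easy to see geometrically: if γ is a curve on M through p with tangent
vector v, then S ◦ γ is again a curve on M through p, now with tangent vector S(v).
Because S is an isometry, the curvature vector of S ◦ γ is the image of κγ under S,
and S fixes N (p), since the plane P contains the normal line at p. So the normal
component of the curvature is the same for both curves, which gives Qp (S(v)) = Qp (v).
The statement for Ap follows because a symmetric bilinear form is determined by
its quadratic form.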
Step 2 Let’s assume there is an orthonormal basis of Tp M which diagonalizes
A, and let’s call these basis vectors e1 , e2 . Then S(e1 ), S(e2 ) also diagonalize
A. This is easy to see because of the previous step: A(S(e1 ), S(e2 )) =
A(e1 , e2 ) = 0 and A(S(e1 ), S(e1 )) = A(e1 , e1 ) = k1 , and similarly for the
other combinations. A in that basis has to have the form

\[ A = \begin{pmatrix} k_1 & 0 \\ 0 & k_2 \end{pmatrix} \tag{2.33} \]

Step 3 We now want to show that S(e1 ) cannot lie in the direction of e2 and vice
versa. Well, we know that the maximal and minimal values of curvature
lie in directions that are perpendicular to each other, see figure 2.18 for
a picture of an Ap with k1 ≠ k2 . In the basis e1 , e2 we have
Qp (v) = k1 cos2 (θ) + k2 sin2 (θ), so the maximum
(k1 ) cannot occur for a vector between e1 and e2 , and the same goes for
the minimum. Since A(S(e1 ), S(e1 )) = k1 by the previous step, S(e1 ) can therefore
not have a component in the e2 direction.

8 This is why we excluded the case k1 = k2 , because any orthonormal basis diagonalizes A
in that case.

Step 4 But then S(e1 ) has to lie along e1 and be of unit length (since S is an
isometry) and the same goes for e2 . So we get the possibilities:

S(e1 ) = ±e1 (2.34)
S(e2 ) = ±e2 (2.35)

Step 5 But which vectors have this property? Well, S is a reflection along L, so
the only vectors with this property are the ones that lie along L, which
get mapped onto themselves, and the ones lying on the line perpendicular
to L (L⊥ ), which get mapped onto their negatives, as you can see in figure 2.19. Therefore:

\[ e_1, e_2 \in \{ \pm e_\parallel, \pm e_\perp \} \tag{2.37} \]

But this is exactly what we wanted to prove9 .
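Here is the theorem at work in the simplest possible setting, as a sketch with a made-up surface. We take a graph z = f(x, y) with f(x, −y) = f(x, y), so that the reflection across the xz-plane (our plane P) preserves the surface, fixes the origin and preserves the tangent plane z = 0. The line L is then the x-axis, and the theorem predicts that the Hessian at the origin has no off-diagonal entry. The derivatives are again finite-difference approximations.

```python
def f(x, y):
    # Sample height function with f(x, -y) = f(x, y): the surface z = f(x, y)
    # is preserved by the reflection across the xz-plane (our plane P).
    return 1.5 * x * x - 0.5 * y * y + x ** 3 + 2.0 * x * y * y

def hessian_at_origin(g, h=1e-3):
    """Central finite-difference Hessian of g at (0, 0)."""
    g_xx = (g(h, 0.0) - 2.0 * g(0.0, 0.0) + g(-h, 0.0)) / (h * h)
    g_yy = (g(0.0, h) - 2.0 * g(0.0, 0.0) + g(0.0, -h)) / (h * h)
    g_xy = (g(h, h) - g(h, -h) - g(-h, h) + g(-h, -h)) / (4.0 * h * h)
    return ((g_xx, g_xy), (g_xy, g_yy))

A = hessian_at_origin(f)
# Off-diagonal entries vanish: e_parallel = (1, 0) and e_perp = (0, 1)
# are the principal axes, and the diagonal entries are k1 and k2.
print(A)
```

The cubic terms in f are only there to show that the argument does not need the surface to be a quadric; the symmetry alone forces the mixed derivative to vanish.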

This theorem is very useful when computing curvatures and we will now provide
a few examples.

2.5.1 Examples: Determining Ap quickly through Symmetry

We want to discuss a few examples of surfaces and their curvatures.

9 The fact that we can also have a ± before the vectors in the set is not important, as
all combinations will diagonalize A.



Figure 2.18: A typical graph of Ap (v, v) over the
circle of unit length in Tp M if k1 ≠ k2 . Here we took one where both curvatures are
positive to see it better, but you can easily visualize the other possibilities
by moving the curve up and down.

Figure 2.19: The tangent plane (drawn twice) and the way vectors get
mapped onto other vectors when you reflect along L. The vectors drawn
in purple are the only ones that get mapped onto the same line that they
already lay on.

Proposition: The cylinder

We start with an example slightly more complicated than a sphere: a cylinder of radius R and a
point p on it, like in figure ??, for which we want to calculate the curvature. Unlike
the sphere, we cannot take any plane as a symmetry, but the cylinder does
have two planes (for the point p) that can be used. Firstly we have the
xy-plane. It is clear that it is perpendicular to Tp M and that p stays where
it is, and that the horizontal line drawn in purple also gets mapped to itself.
Therefore we know that this is our first principal axis. The other one has
to be orthogonal to it in Tp M , therefore it has to point up. We can then very
easily calculate the principal curvatures. In the vertical direction we get the curvature of
a straight line, which is zero. In the horizontal direction we get the curvature of a circle
of radius R, so k2 = 1/R (for the inward normal). We also have H = 1/R and K = 0 (verify
all these). Because K = 0, the cylinder turns out to be flat! An intuitive
explanation is that you can use scissors to cut along the cylinder and roll it
out onto a flat piece of paper.

Proposition: The Catenoid

The catenoid is the surface of revolution of the function cosh(z). You take
the graph of cosh(z) as a function of z and rotate it around the z-axis to get
a surface that looks a bit like a sci-fi picture of a wormhole, as you can see
in figure 2.5.1. We have drawn in a point p lying on the "original" graph of
cosh(z), the one in the xz-plane. Its position on that graph is arbitrary
though. For that point, we can use the xz-plane exactly like with the cylinder
(although the orientation is different) and get that the line tangent to the
graph is the one the reflection preserves. So we know that one of the principal
directions is tangent to the graph. The other one is the direction of the y-axis,
since it needs to be orthogonal and lie in the tangent plane. We leave it as an
exercise to you to show that the principal curvatures have opposite signs and that
their values are k1 = 1/cosh^2 z and k2 = −1/cosh^2 z, that H = 0 (the surface is
a minimal surface) and that K = −1/cosh^4 z is negative, which means that the
surface looks like a saddle locally.
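If you want to check your answer to this exercise, here is a minimal sketch in Python. It assumes the parameterization F(u, v) = (cosh v cos u, cosh v sin u, v) of the catenoid (our choice for the illustration, with the normal pointing away from the z-axis) and uses the standard fact that for orthogonal coordinate directions with vanishing mixed term the principal curvatures are ⟨F_ii, n⟩/⟨F_i, F_i⟩. The function name is hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def catenoid_curvatures(z):
    """Return (k_graph, k_circle, H, K) for the catenoid at height z.

    Assumed parameterization (an illustration, not taken from the notes):
    F(u, v) = (cosh v cos u, cosh v sin u, v), evaluated at u = 0, v = z,
    i.e. at a point p on the graph of cosh in the xz-plane.
    """
    c, s = math.cosh(z), math.sinh(z)
    F_u = (0.0, c, 0.0)      # y-direction, tangent to the circle of latitude
    F_v = (s, 0.0, 1.0)      # tangent to the graph of cosh
    F_uu = (-c, 0.0, 0.0)    # second derivatives at u = 0
    F_vv = (c, 0.0, 0.0)
    n = cross(F_u, F_v)
    length = math.sqrt(dot(n, n))
    n = tuple(x / length for x in n)   # unit normal pointing away from the z-axis
    # F_u and F_v are orthogonal and <F_uv, n> = 0 here, so both coordinate
    # directions are principal and k = <F_ii, n> / <F_i, F_i>.
    k_circle = dot(F_uu, n) / dot(F_u, F_u)
    k_graph = dot(F_vv, n) / dot(F_v, F_v)
    return k_graph, k_circle, k_graph + k_circle, k_graph * k_circle
```

With this choice of normal the curvature along the graph is positive and the one along the circle is negative; flipping the normal flips both signs, but H = 0 and K < 0 stay the same.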

The catenoid is historically quite a special surface, because it is based on the
function cosh(x). One of the first contexts in which this function was used was as
the solution to the chain problem. The chain problem asks, quite simply: what shape
does a chain take if you hold it at two places and let it hang there?

Proposition: The helicoid

The helicoid is another surface we want to introduce, where the symmetry
argument does not work.

2.6 Interlude: A bit about Differentiation

Before continuing our discussion of curvature, we want to stop for a minute and
discuss how one defines derivatives on surfaces and fix the notation. We start by
introducing standard calculus notation and then defining derivatives on surfaces.

Definition 2.6.1: Differential of f : Rn → R

Let f : Rn → R be any smooth function that assigns each point x =
(x1 , . . . , xn ) ∈ Rn a value in R. We define the differential of f to be:

\[ df(x) = Df(x) := \left( \frac{\partial f}{\partial x^1}, \dots, \frac{\partial f}{\partial x^n} \right)\bigg|_x : \mathbb{R}^n \to \mathbb{R} \tag{2.38} \]

Definition 2.6.2: Differential of f : Rn → Rm

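Let f : Rn → Rm be a smooth function, where we write f = (f^α)_{α=1}^m =
(f^1 , . . . , f^m ). Similarly to the previous definition we define the differential
of f at x to be:

\[ df(x) = Df(x) = \begin{pmatrix} \frac{\partial f^1}{\partial x^1} & \dots & \frac{\partial f^1}{\partial x^n} \\ \vdots & & \vdots \\ \frac{\partial f^m}{\partial x^1} & \dots & \frac{\partial f^m}{\partial x^n} \end{pmatrix}\bigg|_x : \mathbb{R}^n \to \mathbb{R}^m \tag{2.39} \]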

In both cases we defined the differential to be a map of the same kind as f ,
that locally agrees with f (to first order). We now want to define the directional
derivative.

Definition 2.6.3: Directional derivative


Let f : Rn → Rm be a smooth function like in the above definition,
x ∈ Rn the point at which we want to evaluate the directional derivative
and X ∈ Rn the direction in which we want to evaluate it. (The first is a
point, the second is a vector.) Then we define the directional derivative as:

\[ (D_X f)(x) = Df(x)(X) = \begin{pmatrix} \frac{\partial f^1}{\partial x^1} & \dots & \frac{\partial f^1}{\partial x^n} \\ \vdots & & \vdots \\ \frac{\partial f^m}{\partial x^1} & \dots & \frac{\partial f^m}{\partial x^n} \end{pmatrix}\bigg|_x \begin{pmatrix} X^1 \\ \vdots \\ X^n \end{pmatrix} \in \mathbb{R}^m \tag{2.40} \]

which we can rewrite like this:

\[ D_X f(x) = \sum_{\alpha=1}^{m} \sum_{j=1}^{n} \frac{\partial f^\alpha}{\partial x^j}(x)\, X^j e_\alpha \tag{2.41} \]

where eα is the standard basis in Rm .

Proposition

Let γ(t) ∈ Rn be any smooth curve, so that γ(0) = x and γt (0) = X.
Then:

\[ D_X f(x) = \frac{d f(\gamma(t))}{dt}\bigg|_{t=0} \tag{2.42} \]

This equation is very useful while calculating derivatives but also quite intuitive.
Simply put, to differentiate a function in the direction of X, take any curve that
has (at the point of interest) its tangent-vector equal to X and differentiate along
that curve. Locally, they will look the same, after all. You can look at figure 2.6
for an example.

Proof. The proof of this Proposition is very simple. You basically use the chain
rule once, and get exactly what you need.

\[ \frac{d f(\gamma(t))}{dt}\bigg|_{t=0} = Df(\gamma(0))\,\gamma_t(0) = Df(x)X = D_X f(x) \tag{2.43} \]

The last equality is just the definition of DX f (x). Notice that the multiplication
is a matrix multiplication.
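You can also convince yourself of the Proposition numerically. The following is a small sketch in Python, under our own assumptions: the map f : R2 → R2, the bent curve γ and all function names are made up for the illustration, and the derivative along the curve is approximated by a central difference.

```python
import math

def f(x):
    """A sample smooth map f : R^2 -> R^2 (hypothetical, for illustration)."""
    x1, x2 = x
    return (x1 * x1 * x2, math.sin(x1) + x2)

def jacobian_times(x, X):
    """Df(x) X, with the Jacobian of the sample f written out by hand."""
    x1, x2 = x
    J = ((2 * x1 * x2, x1 * x1),
         (math.cos(x1), 1.0))
    return tuple(row[0] * X[0] + row[1] * X[1] for row in J)

def derivative_along_curve(x, X, h=1e-5):
    """d/dt f(gamma(t)) at t = 0 by a central difference, for a bent curve gamma."""
    def gamma(t):
        # gamma(0) = x and gamma'(0) = X; the t^2 terms bend the curve on purpose
        return (x[0] + t * X[0] + t * t, x[1] + t * X[1] - 3 * t * t)
    fp, fm = f(gamma(h)), f(gamma(-h))
    return tuple((a - b) / (2 * h) for a, b in zip(fp, fm))
```

For instance, `jacobian_times((0.5, -1.2), (2.0, 1.0))` and `derivative_along_curve((0.5, -1.2), (2.0, 1.0))` should agree up to the error of the finite difference, even though γ is far from a straight line.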

2.6.1 Vector Fields on R3


Now that we have fixed the notation of derivatives in Rn , we want to talk about
vector fields, restricting ourselves to the case of R3 .
Figure 2.21: The catenoid. It is the surface of revolution of the
function cosh(z).

Definition 2.6.4: Vector field on R3

A vector field on R3 is a function X : U ⊂ R3 → R3 , where U is some
open subset of R3 . (We choose U to be open so that we can differentiate at
every point.)

We can also define the derivative of one vector field in the direction of another
vectorfield, which we do point wise.

Definition 2.6.5: Derivative along a vector-field

Let X, Y be two smooth vectorfields in R3 . Then we define the derivative
of Y along X as:

\[ D_X Y(p) = D_{X(p)} Y = (DY)(p)(X(p)) = \sum_{i,j=1}^{3} X^i(p)\, \frac{\partial Y^j}{\partial x^i}(p)\, e_j \tag{2.44} \]

where ej is the standard basis of R3 .

What we get is another vectorfield. Note that the derivative at p depends on the
value of X only at p, but that the same is not true for Y . This is simply because
we are interested in the change of Y along X(p). For Y we need its values in a
whole neighborhood of p so that we can calculate the change, while for X we only
need X(p), which gives the direction.

2.6.2 Vectorfields on surfaces


Now that we have fixed the notation of vectorfields in R3 , we want to see how
we can apply the ideas of vectorfields to surfaces. (In this section M is again a
surface and p a typical point on M .)
The difference between R3 and a surface is that, unlike in R3 , there are two
kinds of vectorfields, that behave quite differently from one another and have
different geometric meanings.

Definition 2.6.6: Vectorfields on M


We can define two kinds of vectorfields. Let’s start with the simple generalization:
• We call a smooth function X : M → R3 a vectorfield along M.
• If X(p) ∈ Tp M for all p ∈ M , then we call the vectorfield tangent to M.

The first kind of vectorfields of course contains the second, but the second is
a bit special. It is special in so far as geometrically small vectors (|v| < ϵ for
some ϵ) lie in the surface, because locally the surface looks like the tangent plane.
You can see the difference between the two types in figure 2.6.2.

Figure 2.22: The catenoid and the lines we need to calculate the
principal axes of curvature using symmetry, for some point p.

How can we differentiate a vectorfield along a surface M ? Let’s say we have
vectorfields X, Y defined on M and want to differentiate Y with respect to X,
where X of course has to be tangent to M . How can we proceed? There are a
few possibilities. We could for example take a curve that at t = 0 is at p and has
the tangent vector X(p), use the proposition from two sections ago, and repeat this
for all the points. We will however use a different (but equivalent) way. We will
extend X, Y to smooth vectorfields X̃, Ỹ on an open set U ⊂ R3 , so that X̃, Ỹ
restricted to M match X, Y , and then define the derivative to be DX Y = DX̃ Ỹ .

Definition 2.6.7: Derivative of Y along X on M

Let X be a smooth vectorfield tangent to M , and Y be a smooth vectorfield
along M . Then we define the derivative DX Y of Y along X by first extending
X and Y smoothly to an open subset U ⊂ R3 containing p, X̃ : U → R3 and
Ỹ : U → R3 , and then setting:

DX Y (p) = DX̃ Ỹ (p) (2.45)

This definition is independent of the extensions X̃, Ỹ , which we prove now:



Figure 2.23: An example for the intuition behind the Proposition, for an
f : R2 → R. If you move along X and the curve γ with a very tiny step (h
is very small) the values you get will be very similar, which is why you get
the same result.

Figure 2.24: A typical vectorfield in R3



Proof. We know that DX̃ Ỹ (p) depends only on the value X̃(p) = X(p) and on the
values of Ỹ in a small neighborhood of p (a neighborhood in R3 ). Therefore the
independence of the extension of X is given. We also know that:

\[ D_{\tilde X} \tilde Y(p) = \frac{d \tilde Y(\gamma(t))}{dt}\bigg|_{t=0} \tag{2.46} \]

for any smooth curve γ(t) with γ(0) = p and γt (0) = X(p). Since X(p) is tangent
to M , we can in particular take a curve in M , so that the expression becomes:

\[ D_{\tilde X} \tilde Y(p) = \frac{d \tilde Y(\gamma(t))}{dt}\bigg|_{t=0} = \frac{d Y(\gamma(t))}{dt}\bigg|_{t=0} \tag{2.47} \]

since we are using a curve that lies on M and Ỹ |M = Y . The right-hand side no
longer involves the extension Ỹ , so the derivative does not depend on it.

With this, we are done with the interlude and return to curvature.
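The independence of the extension is easy to test on an example. The sketch below is only an illustration under our own assumptions: M is the unit sphere, Y(x) = (−x2, x1, 0) is a tangent field on it, and we compare two different extensions to R3 (both hypothetical choices) using central differences along straight lines.

```python
def Y_plain(x):
    """One extension of Y(x) = (-x2, x1, 0) from the unit sphere to R^3."""
    return (-x[1], x[0], 0.0)

def Y_modified(x):
    """Another extension: it agrees with Y_plain wherever |x| = 1."""
    off = x[0] ** 2 + x[1] ** 2 + x[2] ** 2 - 1.0   # vanishes on the unit sphere
    return (-x[1] + off * 1.0, x[0] + off * 2.0, off * 3.0)

def derivative(field, p, X, h=1e-4):
    """Central-difference approximation of D_X(field) at the point p."""
    plus = field(tuple(a + h * b for a, b in zip(p, X)))
    minus = field(tuple(a - h * b for a, b in zip(p, X)))
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))

p = (1.0, 0.0, 0.0)            # a point on the unit sphere
X_tangent = (0.0, 1.0, 2.0)    # orthogonal to p, hence tangent to the sphere
X_normal = (1.0, 0.0, 0.0)     # not tangent to the sphere
```

Calling `derivative(Y_plain, p, X_tangent)` and `derivative(Y_modified, p, X_tangent)` should give the same vector, while the two extensions give different results for `X_normal`. This is exactly why X has to be tangent to M.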

2.7 Another characterisation of curvature: The Weingarten Map

We have already seen two different ways to look at curvature, firstly a very geometric
definition (what we called Qp ) and then as the second fundamental form (Ap ). We
now turn to the last major way to describe curvature we will cover, called the
Weingarten map. The idea of this way of looking at curvature is that the way
tangent vectorfields have to change (locally) to stay tangent to M is another way
to detect curvature. Let X, Y be two vectorfields, tangent to M . Then, because
they are tangent, there is an amount they have to change to stay in the tangent-
plane, similarly to how curves on M have to curve to stay on M in our geometric
definition. You can see an example in figure 2.7. From this thought emerges the
following theorem.

Figure 2.25: Two vectorfields X, Y and the derivative DX Y .

Theorem 2.7.1: Vectorfield characterization of Ap

Let X, Y be two tangent vectorfields on a surface M , and N a unit normal
field (the same one as used in the definition of Ap ). Then we can rewrite
Ap in the following way:

Ap (X, Y ) = ⟨DX Y, N ⟩ = −⟨DX N, Y ⟩ (2.48)

The first equality says that we can characterize curvature by how vectorfields
change along M , specifically by the component of that change normal to the surface.
The second one tells us that we can characterize curvature by how the normal
vectorfield changes on M .
We will prove this theorem, but we first want to discuss a few ways to see
this intuitively. We already saw that a tangent vectorfield has to change to stay
tangent to M . We can also see this not as Y changing, but as the tangent-plane
rotating as you move on M , which you can see in figure 2.7. This should be clear,
since the tangent-plane is in one sense the first linear approximation of M , and if
M curves then the tangent-plane also has to rotate. You can also see this as all
the possible tangent-vectors changing to accommodate M ’s curving, and since the
tangent-plane is built out of all of these vectors, it has to change as well. You can
also describe the change of the tangent-plane by describing how the normal vector
changes, since it has to change with the tangent-plane. This is how you get the
second equality. It is clear that DX N has all the information about how N , and
therefore Tp M , changes, and this is why this map gets a special name: it’s called
the Weingarten map.

2.7.1 The Weingarten Map


For clarity, we explicitly write down the definition of the Weingarten map, which
we motivated just now.

Definition 2.7.1: The Weingarten Map

Let M be a surface and N a choice of unit normal field. Then we define
the Weingarten map to be:

Wp : Tp M → Tp M : X(p) ↦ DX N (p) (2.49)

By the theorem we then get that:

Ap (X, Y ) = −⟨Wp (X), Y ⟩ (2.50)

There are a few things to note about the Weingarten map, that one can see
quite easily. Firstly, Wp is self-adjoint, because Ap is symmetric:

⟨Wp (X), Y ⟩ = −Ap (X, Y ) = −Ap (Y, X) = ⟨Wp (Y ), X⟩ = ⟨X, Wp (Y )⟩ (2.51)


Figure 2.26: A vectorfield along M in (a), and a vectorfield tangent to M in (b).
The first type sticks out of the surface, while the second type lies along the
surface.

Figure 2.27: Vectors tangent to M live in the tangent plane at the point they
come out of, which locally you can imagine as them living in the surface.

Secondly, we can once again use that a unit normal vector cannot change in its own
direction, and conclude that Wp really maps into Tp M , because ⟨Wp (X), N ⟩ = 0.
You can see this by applying the same calculation we already did quite often. We
know that ⟨N, N ⟩ = 1, since the unit normal is of unit length. We can therefore
calculate:
0 = DX ⟨N, N ⟩ = 2⟨DX N, N ⟩ (2.52)
which is exactly the claim we just described.

Proposition: The Sphere

We turn to our standard example of a surface, the sphere. What is the
Weingarten map of the sphere? Let’s again choose an inner normal field
N = −x/R, where R is the radius of the sphere. Then we can compute Ap
rather quickly using the theorem:

\[ A_p(X, Y) = -\langle D_X N, Y \rangle = \left\langle D_X \frac{x}{R}, Y \right\rangle = \frac{1}{R} \langle D_X(x), Y \rangle \tag{2.53} \]

We need to compute DX (x) = DX ((x1 , x2 , x3 )). We get:

\begin{align}
D_X(x) &= D_X((x^1, x^2, x^3)) \tag{2.54} \\
&= D((x^1, x^2, x^3))(X) \tag{2.55} \\
&= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} X^1 \\ X^2 \\ X^3 \end{pmatrix} = X \tag{2.56}
\end{align}

and therefore, by inserting:

\[ A_p(X, Y) = \frac{1}{R} \langle X, Y \rangle \tag{2.57} \]

The Weingarten map is just Wp (X) = DX N = −(1/R) X, in agreement with
Ap (X, Y ) = −⟨Wp (X), Y ⟩.

The example showed the way in which you can sometimes use the Weingarten
map to calculate something really fast. Imagine doing the calculation of Ap by
using the Hessian Matrix. There are so many derivatives that you’d probably make
a small mistake, and at the very least it would take way longer.
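If you would like to see the Weingarten map of the sphere come out of a computer, here is a rough numerical sketch in Python. Everything in it is an assumption made for the illustration: the radius, the point p, the tangent vectors X and Y, and the extension −x/|x| of the inner normal (which agrees with −x/R on the sphere). DX N is approximated by a central difference.

```python
import math

R = 2.0   # radius of the sphere (an arbitrary choice for the example)

def inner_normal(x):
    """Extension of the inner unit normal of the sphere to R^3 minus the origin."""
    r = math.sqrt(sum(c * c for c in x))
    return tuple(-c / r for c in x)

def weingarten(p, X, h=1e-5):
    """W_p(X) = D_X N(p), approximated by a central difference."""
    plus = inner_normal(tuple(a + h * b for a, b in zip(p, X)))
    minus = inner_normal(tuple(a - h * b for a, b in zip(p, X)))
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))

def second_fundamental_form(p, X, Y):
    """A_p(X, Y) = -<D_X N, Y>, as in the theorem."""
    return -sum(w * y for w, y in zip(weingarten(p, X), Y))

p = (R * 2 / 3, R * 1 / 3, R * 2 / 3)   # a point on the sphere of radius R
X = (1.0, -2.0, 0.0)                    # tangent at p, since <p, X> = 0
Y = (0.0, 2.0, -1.0)                    # tangent at p, since <p, Y> = 0
```

Here `weingarten(p, X)` should come out close to −X/R, so the Weingarten map of the sphere is just a rescaling of the tangent plane.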

2.7.2 Proof of connection between the Weingarten map and Ap

Now that we have gotten some intuition about the Weingarten map, and have seen a
first example, we want to prove the theorem. There are two things we need to prove.

• We need to show that Ap (X, Y ) = ⟨DX Y, N ⟩.

• We also need to show that ⟨DX Y, N ⟩ = −⟨DX N, Y ⟩



We will actually prove the second part first, as it is both easier and we need it to
prove the first part.

Proof. Let us first prove the second claim. It is not too hard to prove. It’s
the same calculation we already did very often. We know that ⟨Y, N ⟩ = 0
everywhere, because Y is a tangent vector field, and N is a unit normal field
and they are, by construction, orthogonal to each other. Then we get:

0 = DX (⟨Y, N ⟩) = ⟨DX Y, N ⟩ + ⟨DX N, Y ⟩ (2.58)

by the product rule, and therefore:

⟨DX Y, N ⟩ = −⟨DX N, Y ⟩ (2.59)

as promised.

There is an immediate consequence of this. We know that DX Y (p) depends
only on the value of X at the point p, but for Y it depends on the values in a
neighborhood of p, since you are calculating changes of Y . But because of the
equality, one can clearly see that the expression ⟨DX Y, N ⟩ = −⟨DX N, Y ⟩ only
depends on X(p) and Y (p), and definitely not on Y in a small neighborhood. This
is quite surprising at first, because ⟨DX Y, N ⟩ looks, at first sight, like it would
depend on more than just Y (p). We will use this in the second part of the proof.
The second part of the proof involves a lot of calculation, but has one basic
idea. Ap is, in essence, just the Hessian, D2 f , if you write M as the graph of
f . So on the one side, we have a second derivative, and on the other, a term with
DX Y . In a vague sort of sense, we will see that Y is roughly like DY f , at least
when it comes to this specific case.

Proof. The setup of our proof is drawn in (a), which is the same as in the proof
that Qp is the same as Ap . This time we want to extend Y to a vector that is
in the tangent-plane of q. In (b) we find that our first step will be extending Y
to be constant on the tangent-plane.

Step 1 We begin by fixing p ∈ M , and using coordinates of R3 where p = (0, 0, 0),
Tp M = R2 × {0} and x3 = f (x1 , x2 ) describes the surface. This is the
exact same construction we used when we proved that Qp and Ap are the
same thing, in section 2.3.3.

Step 2 We let X = (X 1 , X 2 , 0) and Y = (Y 1 , Y 2 , 0) be two vectors in the tangent-
plane, and want to calculate ⟨DX Y, N ⟩ at the point p. We want to do this
algebraically, in the coordinate system we chose, so we will actually need
to extend X, Y to some extensions X̃, Ỹ (on M ). We note already here
that we showed that ⟨DX Y, N ⟩ depends only on X(p), Y (p), and
therefore any suitable extension we pick will be good enough, and the
whole thing won’t depend on the extension.

Figure 2.28: Even the most boring tangent vectorfield like Y has to change
a minimal specific amount to stay on M .

Step 3 We now think about which extension might be good to work with. We
want to extend X, Y to a neighborhood of p on M that contains the typical
point q ∈ M . The choice we will make is to ensure that the extensions
X̃(q), Ỹ (q) are both tangent to M at q, or in other words, that they are
elements of Tq M .
We can construct our vectors10 by first extending them to constant vec-
tors on Tp M . So we take Ỹ (q) = (Y 1 , Y 2 , ?), where we don’t know the
third component yet. We know that q = (x1 , x2 , f (x1 , x2 )) if it has coor-
dinates x1 , x2 on Tp M . When is a vector in q’s tangent space? Well, if we
have a graph, like that of f , we can see what we have to do by quickly looking
at a one-dimensional example, which you can see in figure 2.7.2. Very simi-
larly to the figure, a vector (Y 1 , Y 2 , Y 3 ) will be in the tangent-space
Tq M of q if Y 3 = df(x1 ,x2 ) (Y 1 , Y 2 ). So our extension becomes:

\[ \tilde Y(x^1, x^2, f(x^1, x^2)) = \left( Y^1, Y^2, df_{(x^1, x^2)}(Y^1, Y^2) \right) \tag{2.60} \]

This is what we meant by Y being somehow analogous to DY f . It contains
df , which will turn into the second derivative in the next steps.
Step 4 Now, extend Ỹ to a vectorfield on the whole of R3 , and in particular, make
it independent11 of x3 .
Step 5 Do the above procedure for X.
10 We will only explain the extension of Y ; the one of X is exactly the same.
11 We can do this and we do it for convenience.

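Step 6 Calculate DX Y |p . We get:

\begin{align}
D_X Y\big|_p = D_{\tilde X} \tilde Y\big|_p &= \sum_{i,j=1}^{3} \tilde X^i \frac{\partial \tilde Y^j}{\partial x^i}\bigg|_p e_j \tag{2.61} \\
&= \sum_{i=1}^{3} X^i \frac{\partial}{\partial x^i}\bigg|_p (\tilde Y^1, \tilde Y^2, \tilde Y^3) \tag{2.62} \\
&= \sum_{i=1}^{3} X^i \frac{\partial}{\partial x^i}\bigg|_p \Big( Y^1, Y^2, \sum_{j=1}^{2} \frac{\partial f}{\partial x^j}(x^1, x^2)\, Y^j \Big) \tag{2.63} \\
&= \Big( 0, 0, \sum_{i=1}^{3} \sum_{j=1}^{2} X^i \frac{\partial}{\partial x^i} \frac{\partial f}{\partial x^j}(x^1, x^2)\bigg|_p Y^j \Big) \tag{2.64} \\
&= (0, 0, XHY) \tag{2.65}
\end{align}

where H is the Hessian of f at p (the term with i = 3 vanishes, since X 3 = 0).
Because N = (0, 0, 1) at p, we get:

⟨DX Y, N ⟩ = XHY = Ap (X, Y ) (2.66)

which is exactly the result we wanted.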
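The calculation of Step 6 can be replayed on a concrete graph. The following Python sketch is only an illustration under our own assumptions: the sample function f (chosen so that df = 0 at the origin), its coefficients and all function names are hypothetical, and the derivative of the extension (2.60) is taken by a central difference at p = (0, 0, 0).

```python
ca, cb, cc = 1.0, -0.5, 2.0   # hypothetical coefficients of the sample graph

def grad_f(x1, x2):
    """Gradient of f(x1, x2) = ca x1^2 + cb x1 x2 + cc x2^2 + x1^3."""
    return (2 * ca * x1 + cb * x2 + 3 * x1 * x1, cb * x1 + 2 * cc * x2)

def Y_tilde(x, Y):
    """The extension (2.60): third component df_(x1,x2)(Y), independent of x3."""
    fx, fy = grad_f(x[0], x[1])
    return (Y[0], Y[1], fx * Y[0] + fy * Y[1])

def DXY_at_p(X, Y, h=1e-4):
    """D_X Y at p = (0, 0, 0) for tangent vectors X = (X1, X2, 0), Y = (Y1, Y2, 0)."""
    plus = Y_tilde((h * X[0], h * X[1], 0.0), Y)
    minus = Y_tilde((-h * X[0], -h * X[1], 0.0), Y)
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))

def hessian_form(X, Y):
    """X H Y, where H = [[2 ca, cb], [cb, 2 cc]] is the Hessian of f at the origin."""
    return (2 * ca * X[0] * Y[0] + cb * (X[0] * Y[1] + X[1] * Y[0])
            + 2 * cc * X[1] * Y[1])
```

For any X and Y in the tangent plane, `DXY_at_p(X, Y)` should have vanishing first and second components, and its third component should agree with `hessian_form(X, Y)`. The cubic term in f does not contribute, as only the second derivatives at p matter.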

Figure 2.29: The tangent-plane of M changing as M curves.



Figure 2.30: The unit normal vector has to change as M curves.

2.8 *Other formulas for curvature


We have promised you five different ways to describe the curvature of surfaces. We
have given you three already, all based on different ways of thinking about curvature.
The other two, we claimed, are simply formulas for computation. We present them
here. You don’t need to know them by heart but it is worth knowing that they
exist, should you need them sometime. Both of these formulas are analogous to the
two formulas for the curvature of a curve we derived before. For clarity, we write
them down here:

\[ \kappa = \frac{1}{|\gamma_t|^2} \left( \gamma_{tt} - \left\langle \gamma_{tt}, \frac{\gamma_t}{|\gamma_t|} \right\rangle \frac{\gamma_t}{|\gamma_t|} \right) \tag{2.67} \]

\[ k = \frac{u_{xx}}{(1 + u_x^2)^{3/2}} \tag{2.68} \]

The first, in principle, just describes the curvature vector of a parameterized curve.
The second one describes the curvature of a curve as a graph. We will start with
the first.
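As a reminder of how the two curve formulas are used in practice, here is a minimal Python sketch of (2.67) and (2.68). The function names are our own, and you have to supply the derivatives γt , γtt (or ux , uxx ) yourself.

```python
import math

def curvature_vector(gt, gtt):
    """Curvature vector (2.67) from the first and second derivative of gamma."""
    speed2 = sum(c * c for c in gt)
    speed = math.sqrt(speed2)
    T = [c / speed for c in gt]                       # unit tangent
    proj = sum(a * b for a, b in zip(gtt, T))         # <gamma_tt, T>
    return tuple((a - proj * t) / speed2 for a, t in zip(gtt, T))

def graph_curvature(ux, uxx):
    """Curvature (2.68) of the graph of u at a point with slope ux."""
    return uxx / (1 + ux * ux) ** 1.5
```

For a circle of radius R, traversed at any speed, the length of `curvature_vector` should be 1/R. For the parabola u = x^2, seen as the curve (t, t^2), both formulas should give the same absolute value at every point.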

2.8.1 Curvature of a parametrized surface


Here, we will mention the formula for parameterized surfaces analogous to formula
2.67.
Firstly, we need to describe what exactly we mean by a parameterized surface,
which is not too difficult:

Definition 2.8.1: Parameterized Surface


We define a parameterized surface, which we call M , in the usual way, as a
map F : U ⊂ R2 → R3 , where U is called the coordinate space. We also want
its Jacobian to have full rank (rank 2) at every point. In equations:

F (x1 , x2 ) = (F 1 (x1 , x2 ), F 2 (x1 , x2 ), F 3 (x1 , x2 )) (2.69)

We can get a basis of Tp M at each point p, by taking the basis vectors e1 , e2 of
the coordinate space and applying dF |p to them. That way we get two new
vectors (at each p):

Xi = dF |p (ei ) (2.70)

which (you should check) turn out to be basis-vectors of Tp M . We can also write
the Xi ’s out:

\[ X_i = \left( \frac{\partial F^\alpha}{\partial x^i} \right)_{\alpha=1}^{3} \tag{2.71} \]

We change notation a bit (to make the formula we will get readable in the end) so
that for \partial F^\alpha / \partial x^i we write F_i^\alpha . Similarly we use
F_{ij}^\alpha for second derivatives etc.

We can then define the metric, which is a tool that tells us how to transform
from changes of coordinates to changes in lengths.

Definition 2.8.2: The metric of a parameterized surface

We define the metric of a parameterized surface (as a map from the
coordinate space or M , whichever is more convenient) as:

\[ g_{ij} = g(X_i, X_j) = \langle X_i, X_j \rangle = \sum_{\alpha} X_i^\alpha X_j^\alpha \tag{2.72} \]

In actuality, gij is a collection of four maps, but it will turn out to be a tensor
(in later chapters), and a mighty object in differential geometry, so we call it one
map. Let us also define g^{ij} to be the inverse of gij , if you see gij as a matrix.
Then for a parameterized surface we get the following formulas for some curvatures:

Figure 2.31: The setup of our proof is drawn in (a), which is the same
as in the proof that Qp is the same as Ap . This time we want to extend
Y to a vector that is in the tangent-plane of q. In (b) we find that our
first step will be extending Y to be constant on the tangent-plane.

Proposition 2.8.1: Curvature of a parameterized surface

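The Weingarten map is:

\[ W = g^{-1} D^2 F \cdot n \tag{2.73} \]

where n is the unit normal, which we can write as
n = \frac{X_1 \times X_2}{|X_1 \times X_2|} (pointwise definition).
Written out we get:

\[ W_k^i = \sum_{j=1}^{2} \sum_{\alpha=1}^{3} g^{ij} \frac{\partial^2 F^\alpha}{\partial x^j \partial x^k} n^\alpha \tag{2.74} \]

For the mean curvature we get:

H = tr(W ) (2.75)

or written out:

\[ H = \sum_{i,j} \sum_{\alpha} g^{ij} F_{ij}^\alpha n^\alpha \tag{2.76} \]

or, fully written out:

\[ H = n^\alpha \left( \frac{F_2^\beta F_2^\beta F_{11}^\alpha - 2 F_1^\beta F_2^\beta F_{12}^\alpha + F_1^\beta F_1^\beta F_{22}^\alpha}{F_1^\gamma F_1^\gamma F_2^\delta F_2^\delta - (F_1^\gamma F_2^\gamma)^2} \right) \tag{2.77} \]

where all indices repeated twice are summed over (Einstein convention).
For the Gauss-curvature K we get a similar result:

\[ K = \det(W) = \frac{F_{11}^\alpha n^\alpha F_{22}^\beta n^\beta - (F_{12}^\alpha n^\alpha)^2}{F_1^\gamma F_1^\gamma F_2^\delta F_2^\delta - (F_1^\gamma F_2^\gamma)^2} \tag{2.78} \]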
γ γ

The formulas might look horrible, which in a way they are. But they are certainly
useful, because often you know F and can simply plug the derivatives in and get H
and K. It’s a formula which is very easy to implement on a computer.
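To show how little is needed, here is one possible sketch of such an implementation in Python. It is an assumption-laden illustration, not a definitive method: the derivatives of F are approximated by central differences with a hand-picked step h, the normal is taken as n = X1 × X2 /|X1 × X2 |, and the function names and the two sample parameterizations are our own.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def mean_and_gauss_curvature(F, u, v, h=1e-3):
    """Approximate (H, K) of the parameterized surface F at (u, v).

    F maps (u, v) to a 3-tuple. The sign of H follows the choice of normal
    n = X1 x X2 / |X1 x X2|.
    """
    def combo(terms):
        # componentwise linear combination of sample points of F
        return tuple(sum(w * F(u + du, v + dv)[k] for w, du, dv in terms)
                     for k in range(3))

    F1 = combo([(1 / (2 * h), h, 0), (-1 / (2 * h), -h, 0)])
    F2 = combo([(1 / (2 * h), 0, h), (-1 / (2 * h), 0, -h)])
    F11 = combo([(1 / h**2, h, 0), (-2 / h**2, 0, 0), (1 / h**2, -h, 0)])
    F22 = combo([(1 / h**2, 0, h), (-2 / h**2, 0, 0), (1 / h**2, 0, -h)])
    q = 1 / (4 * h**2)
    F12 = combo([(q, h, h), (-q, h, -h), (-q, -h, h), (q, -h, -h)])

    n = _cross(F1, F2)
    norm = math.sqrt(_dot(n, n))
    n = tuple(c / norm for c in n)

    g11, g12, g22 = _dot(F1, F1), _dot(F1, F2), _dot(F2, F2)
    a11, a12, a22 = _dot(F11, n), _dot(F12, n), _dot(F22, n)
    det_g = g11 * g22 - g12 * g12
    H = (g22 * a11 - 2 * g12 * a12 + g11 * a22) / det_g   # formula (2.77)
    K = (a11 * a22 - a12 * a12) / det_g                   # formula (2.78)
    return H, K

def sphere(u, v, R=2.0):
    return (R * math.sin(u) * math.cos(v), R * math.sin(u) * math.sin(v),
            R * math.cos(u))

def cylinder(u, v, R=2.0):
    return (R * math.cos(u), R * math.sin(u), v)
```

For both sample parameterizations X1 × X2 points outward, so `mean_and_gauss_curvature(sphere, 1.0, 0.5)` should be close to (−2/R, 1/R^2) and `mean_and_gauss_curvature(cylinder, 0.3, -1.0)` close to (−1/R, 0). With the inner normal the sign of H flips, while K stays the same.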
By the way, if you know H and K, you can figure out the principal curvatures
quite easily. Since H = k1 + k2 and K = k1 k2 , the principal curvatures k1 and k2
are exactly the two roots of the polynomial:

0 = (x − k1 )(x − k2 ) = x2 − (k1 + k2 )x + k1 k2 = x2 − Hx + K (2.79)

which you can of course solve easily for k1 , k2 .
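Solving equation (2.79) is just the quadratic formula. As a small sketch in Python (the function name is our own, and we clamp the discriminant at zero only to guard against rounding errors):

```python
import math

def principal_curvatures(H, K):
    """Return (k1, k2) with k1 <= k2, the roots of x^2 - H x + K = 0."""
    disc = H * H - 4 * K
    # For a genuine surface disc = (k1 - k2)^2 >= 0; the clamp only absorbs
    # tiny negative values caused by rounding.
    root = math.sqrt(max(disc, 0.0))
    return ((H - root) / 2, (H + root) / 2)
```

For the cylinder of radius 2 (H = 1/2, K = 0) this gives (0, 1/2), and for the catenoid at a point with cosh z = 2 (H = 0, K = −1/16) it gives (−1/4, 1/4).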

2.8.2 The curvature of a graph


We will now provide you with the formulas for the mean and Gauss-curvatures for
a graph.

Proposition 2.8.2: Curvature of a graph

Let’s say you have a function z = f (x, y), so that the graph of f is the
surface M for which you want to calculate H and K. Let’s denote the
partial derivative with respect to x by fx , as usual. Then you can write H and K as:

\[ H = \frac{(1 + f_y^2) f_{xx} - 2 f_x f_y f_{xy} + (1 + f_x^2) f_{yy}}{(1 + f_x^2 + f_y^2)^{3/2}} \tag{2.80} \]

\[ K = \frac{f_{xx} f_{yy} - f_{xy}^2}{(1 + f_x^2 + f_y^2)^2} \tag{2.81} \]

As in the previous case, if you know H and K, you can solve for the principal
curvatures by the same calculation.
Notice the similarity of both cases to the formulas for a curve. Specifically, notice
how both of the formulas for the case of a graph have a correction term (in the
denominator) similar to the case of curves.
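The graph formulas (2.80) and (2.81) translate directly into code. Here is a short Python sketch; the function names and the test surface (the lower half of a sphere of radius R, written as a graph, with derivatives worked out by hand) are our own choices for the illustration.

```python
def graph_curvatures(fx, fy, fxx, fxy, fyy):
    """H and K of the graph z = f(x, y) from formulas (2.80) and (2.81)."""
    w = 1.0 + fx * fx + fy * fy
    H = ((1 + fy * fy) * fxx - 2 * fx * fy * fxy + (1 + fx * fx) * fyy) / w ** 1.5
    K = (fxx * fyy - fxy * fxy) / w ** 2
    return H, K

def lower_hemisphere_derivatives(x, y, R):
    """(fx, fy, fxx, fxy, fyy) of f(x, y) = -sqrt(R^2 - x^2 - y^2)."""
    s = (R * R - x * x - y * y) ** 0.5
    return (x / s, y / s, 1 / s + x * x / s**3, x * y / s**3, 1 / s + y * y / s**3)
```

At any point of the lower hemisphere, `graph_curvatures(*lower_hemisphere_derivatives(x, y, R))` should return (2/R, 1/R^2), in agreement with the sphere example. For the saddle f = (x^2 − y^2)/2 at the origin you get H = 0 and K = −1.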

2.9 Intrinsic Geometry


Up until now, all our considerations of surfaces were based on the assumption that
we are a three-dimensional creature, who has knowledge about all of R3 and can
find information about a surface from the three-dimensional ambient space. A lot
of what we did were extrinsic things, that you can only figure out from the ambient
space. We want to amend this now.

Figure 2.32: The idea of the third Step of the proof, but in the one-
dimensional case. A vector (x^1 , x^2 ) is in the tangent-space (lies in the
tangent of f at p), if x^2 = df (x^1 ) = \frac{df}{dx} x^1 .

Imagine that you are an ant on some surface, like the one in figure 2.9, and you
cannot see "outside", into the third dimension. You stay entirely confined to the
surface, which is your world. Imagine also that you are a curious, very smart ant.
You also have a measuring tape, with which you can measure lengths (infinitely
precisely), and have access to infinite computational power. As we said, you are
very, very smart. In this world, light can obviously not go in straight lines (unless
the surface is flat), so we say it goes as straight as it can, and what the ant sees is a
result of this. You can see all of this drawn in figure 2.9. From this idea, that is,
what the ant can understand about the world it lives in from measuring lengths and
looking around, we can define what we mean by an intrinsic quantity on the surface.
Intrinsic quantities, simply said, are those that you can infer from measuring lengths
only.

2.9.1 The intrinsic metric of M


We promised, some time ago, that we would come to the first fundamental form
after covering the second fundamental form. We fulfill this promise in this section.
Let’s start simply. Let us say you are the ant, and you go around measuring the
length of a curve γ on the surface, that connects two points p and q on the
surface12 .
Then the thing we, as three dimensional creatures, would call the length would
simply be:

\[ L(\gamma) = \int_a^b |\gamma_t| \, dt \tag{2.82} \]

where |γt | uses the norm in three dimensional space. We know from our treatment
of curves that this is independent of parametrization, and only depends on the
image of γ. It is something intrinsic, that the ant can measure.
From this we can define the distance between two points on the surface M .
We simply take the length of the shortest curve connecting the two points (if it
exists), otherwise the infimum:

dM (p, q) = LM (p, q) = inf L(γ) (2.83)

for curves γ that connect p and q. Usually, a curve connecting the two with the
property that its length is the distance between the points exists, but sometimes it
does not. That is why we define it as an infimum13 . We call a curve with the above
property a geodesic.
The surface, equipped with these lengths as distances, is a metric space, as is
quite easy to show (exercise).
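To get a feeling for the length L(γ), you can approximate it on a computer by the length of an inscribed polygon. The following Python sketch does this; the function name, the number of subdivisions and the example curve (a quarter of a great circle on the unit sphere) are our own assumptions.

```python
import math

def curve_length(gamma, a, b, n=10000):
    """Approximate L(gamma) on [a, b] by the length of an inscribed polygon."""
    total = 0.0
    prev = gamma(a)
    for k in range(1, n + 1):
        cur = gamma(a + (b - a) * k / n)
        total += math.dist(prev, cur)   # ambient (R^3) length of one small segment
        prev = cur
    return total

def quarter_circle(t):
    """A curve on the unit sphere from (1, 0, 0) to (0, 1, 0), for t in [0, pi/2]."""
    return (math.cos(t), math.sin(t), 0.0)
```

Here `curve_length(quarter_circle, 0.0, math.pi / 2)` should be close to π/2, and the same value should come out if you reparametrize the curve, for example by t = s^2. The straight-line distance between the two endpoints in R3 is only √2, which is shorter than any curve on the sphere connecting them.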

The two metrics


Here we are really talking about two different metrics, and it is important that you
keep them distinct. There is the metric that we get just from the surface, defined by
the lengths our ant can measure, and then there is the metric that comes from the
external structure of the R3 that the surface might live in, which is not something
the ant can ever experience. You can see the difference in the following figure. We
can now go on to define the first fundamental form.

12 By this we of course mean that γ : [a, b] → M with γ(a) = p, γ(b) = q
13 Exercise: Find a surface and a pair of points, for which no such curve exists.

Figure 2.33: The way we extend Y , is by using the same (but two-
dimensional) construction from figure 2.7.2.

2.9.2 The first fundamental form


The first fundamental form, also called the Riemannian metric of M or just the
metric, is (at each point on M ) the restriction of the scalar-product onto Tp M .

Definition 2.9.1: The metric


The metric g is defined as the map Tp M × Tp M → R which is the
restriction of the three-dimensional scalar product ⟨·, ·⟩R3 onto Tp M × Tp M :

\[ g(p)(X, Y) = \langle X, Y \rangle_{\mathbb{R}^3} \Big|_{T_p M} \tag{2.84} \]

This g in turn defines both dM and L(γ) on the surface, i.e. it is the only thing
you need to calculate L(γ) on the surface for any curve γ. Conversely, it is also
determined by L(γ) or dM . This can be proven, but we will abstain from doing so
until later chapters. Our ant can therefore find it; it exists without any reference
to the ambient space R3 . We call anything that can be deduced from g in a sensible
way intrinsic.

Note 6. Notice how both g(p) and A(p) are bilinear forms on Tp M . This is why
the first is called the first fundamental form, and the second is called the second
fundamental form. If you are wondering, there is also a third fundamental form,
which was introduced by Gauss, but it is not used much nowadays,
because it doesn’t really add any new information, and can be calculated from
the first two.

2.10 Intrinsic Isometries and Gaussian curvature

We continue with our discussion of an ant, and intrinsic geometry. In particular,
we want to talk about when the ant can make out the difference between two
different surfaces it lives on. Take a very simple example, two spheres of the same
radius. One is centered at the origin, the other is centered around some point far
away from the origin. These two are different surfaces, in the sense that they are
different sets of points, mathematically. It is clear, however, that this difference
only arises because of the ambient space R3 , specifically because of the choice of
coordinates we made, and that an ant living on either of these spheres could not
tell them apart in any way. They are, geometrically, the same thing. You can just
translate one to the other in the ambient space; their position in the ambient space,
in other words, is certainly not something intrinsic. The idea we want to develop in
this section goes beyond this simple case. We want to abstract away the effects of
the ambient space, take away anything our tiny smart ant cannot perceive and see
what is left. In particular we want to see whether two surfaces, that we perceive as
different in the ambient space, can be told apart by the ant. The tool we will
develop is called an intrinsic (local) isometry.

Figure 2.34: The parameterization of the sphere. The lines we usually draw
on the sphere, are, as you know, the lines of constant longitude and latitude,
which are lines inherited from the coordinate space. We can also get basis-
vectors of Tp M for all p from the coordinate space, using the vectors Xi =
dF |p (ei ) as basis vectors of Tp M at p.

Figure 2.35: A surface, on which our very smart ant lives. It has a measuring
tape, and (somehow) access to infinitely much computational power, so that
it can figure out as much about the world it lives in, as possible.

Definition 2.10.1: Intrinsic Isometry

Let (M, gM ) and (N, gN ) be two surfaces, with their own respective metrics.
We call a function ϕ : M → N an (intrinsic) isometry, if it is bijective,
smooth, and preserves distances. That is, for any two p, q ∈ M , if p̃ = ϕ(p)
and q̃ = ϕ(q), then
dM (p, q) = dN (p̃, q̃) (2.85)
or equivalently, if for any curve γ in M :

LM (γ) = LN (ϕ(γ)) (2.86)

or equivalently, if the metric is preserved:

gM (p)(X, Y ) = gN (p̃)(X̃, Ỹ ) (2.87)

where X̃ = dϕp (X) is the vector in N corresponding to X (and similarly for Ỹ ).
Similarly, a local isometry is one that is an isometry from an open subset of M to
an open subset of N . (Here, open means open in M , not in R3 , that is, a
two-dimensional set.)

We can make the meaning of something being (geometrically) intrinsic a bit
more clear now. Something is intrinsic, simply, when it is invariant under (local)
isometries. Here the power of isometries shows itself quite clearly. Isometries pre-
serve the metric, but forget about all the unnecessary information that comes from
the ambient space. They only see what the ant can see, and nothing else.

Proposition: The cylinder

Take a piece of paper. It is clearly flat; it could be nothing else. Roll it
up. This is a (local) isometry. You can see this by taking a very tiny
part of the paper and noticing that you don't stretch or distort it. The local
lengths, i.e. the metric, stay the same. Now take a look at the different
kinds of curvature we defined in this chapter. Because the piece of paper
is flat, k1 , k2 , H and K are all zero. But for the cylinder of radius R, which is
locally isometric to the piece of paper, k1 = 0, k2 = 1/R, H = 1/R and K = 0.
So k2 and H are not preserved by this isometry. k1 happens to be the
same here, but you can roll in the other direction, finding another isometry,
where it is 1/R. So the first three cannot be intrinsic quantities. This is a
very important point. Three out of the four curvature quantities depend on
the ambient space. What about the Gaussian curvature? It is zero on both
surfaces, so we have not ruled out its intrinsic nature. Is it intrinsic?
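
If you have no paper at hand, the claim that rolling up does not change lengths can be checked numerically. The following is only a sketch: the radius R, the rolling map `roll` and the test curve are our own choices for illustration, and the length of a curve is approximated by a fine polyline.

```python
import math

R = 2.0  # radius of the cylinder (an arbitrary choice for this sketch)

def roll(u, v):
    """Roll the point (u, v) of the flat sheet onto a cylinder of radius R."""
    return (R * math.cos(u / R), R * math.sin(u / R), v)

def polyline_length(points):
    """Sum of the straight-line distances between consecutive points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def sample_curve(n):
    """A hypothetical test curve on the sheet: (u, v) = (3t, sin(2t)), 0 <= t <= 1."""
    return [(3.0 * k / n, math.sin(2.0 * k / n)) for k in range(n + 1)]

n = 20000
flat = sample_curve(n)                   # the curve drawn on the flat paper
rolled = [roll(u, v) for u, v in flat]   # the same curve on the cylinder

length_flat = polyline_length(flat)
length_rolled = polyline_length(rolled)
print(length_flat, length_rolled)        # the two lengths agree up to discretization error
```

The ant, who only has a measuring tape, gets the same number on the paper and on the cylinder, which is exactly what equation (2.86) asks for.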

Figure 2.36: A surface and the two different distances we can define. The
first one in (a) comes from the surface itself, and can at least in principle
be measured by our ant. The second one in (b) comes from the structure
of R3 and is, in general, not equal to the one in (a), and the ant can never
feel this length.

Figure 2.37: Two spheres at different places in R3 are indistinguishable from


each other to the ant. This is the simplest example of an intrinsic isometry.

Proposition: The cone

The cone, similarly to the cylinder, can, locally, be unfurled into a flat piece
of paper. Exercise: Try both examples with paper.

In general, any surface that can be unrolled into a piece of paper is called
developable. Through any point of such a surface there is a line that is straight
(in the ambient space sense), so that k1 = 0 and hence K = 0.
This hints at the following result:

Theorem 2.10.1: Theorema Egregium (Gauss)

The Gaussian curvature K is intrinsic. In other words, it does not change


under (local) isometries.

This is an extremely important result for surfaces. An ant cannot measure k1 , k2


or H, but what it can measure is K!
In fact, there is an extremely intuitive way for the ant to measure curvature. Let
the ant make the closest thing to a triangle it can in the space it lives in. By that
we mean it takes the lines, which are as straight as they can possibly be (geodesic)
and crosses them to make a triangle. Then let it measure the angles. It will find
that, if the space it lives in is not flat, in general, the sum of angles won’t be 180◦ !

Figure 2.38: An isometry is a map from one surface M to another surface


N , so that distance is preserved. A local isometry is one that only works
locally.

Theorem 2.10.2: Gauss-Bonnet - Triangle Version

Let T be a triangle on a surface, that is, a triangle constructed out of three
geodesics (straightest possible lines on the surface). Then:

\alpha + \beta + \gamma = \pi + \int_T K \, dA    (2.88)

where α, β, γ are the angles of the triangle, K is the Gaussian curvature,
and the integral is an area integral (not an integral over the boundary of the triangle).
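
As a quick sanity check of (2.88), here is a minimal sketch for the triangle on the unit sphere that cuts out one octant. The helper names are hypothetical, and the area of the triangle is taken from symmetry (one eighth of the sphere) rather than computed by integration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def tangent_towards(p, q):
    """Unit tangent vector at p of the great circle from p to q (unit sphere)."""
    t = [q_i - dot(p, q) * p_i for p_i, q_i in zip(p, q)]
    norm = math.sqrt(dot(t, t))
    return [t_i / norm for t_i in t]

def angle_at(p, q, r):
    """Interior angle at the vertex p of the geodesic triangle p, q, r."""
    return math.acos(dot(tangent_towards(p, q), tangent_towards(p, r)))

# The three vertices of the octant triangle.
A, B, C = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
angle_sum = angle_at(A, B, C) + angle_at(B, C, A) + angle_at(C, A, B)

K = 1.0                      # Gaussian curvature of the unit sphere
area = 4.0 * math.pi / 8.0   # the triangle is one of eight congruent octants
print(angle_sum, math.pi + K * area)   # both sides of (2.88)
```

Both numbers come out as 3π/2, so the angle excess π/2 is exactly the integral of K over the triangle.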

So by measuring angles, the ant can figure out if the space she lives in is curved
or flat! Another way she can do this is by using circles, not triangles. A circle, in
her world, is the set of points equally distant from some point ("radius r"), which
she of course calls the centre of the circle. If she lives in flat space, she will find
that the area of the circle is πr2 . But if she lives in a curved space, then this
no longer has to be true; take a look at figure 2.42.
One can show that the area will be:

A_p(r) = \pi r^2 - \frac{\pi}{12} K(p) r^4 + \cdots    (2.89)

which is not πr2 !
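
On the unit sphere the ant's circle is a spherical cap, and its area is known in closed form, so (2.89) can be compared against it. This is a sketch under that assumption; the closed formula 2π(1 − cos r) is standard but not derived in these notes, and the function names are our own.

```python
import math

def disc_area_sphere(r):
    """Exact area of a geodesic disc of radius r on the unit sphere (K = 1)."""
    return 2.0 * math.pi * (1.0 - math.cos(r))

def disc_area_expansion(r, K):
    """The first two terms of the expansion (2.89)."""
    return math.pi * r**2 - math.pi / 12.0 * K * r**4

for r in (0.5, 0.1, 0.01):
    exact = disc_area_sphere(r)
    approx = disc_area_expansion(r, K=1.0)
    flat = math.pi * r**2
    # exact and approx agree better and better as r shrinks;
    # flat - exact is the area the ant is "missing" compared to flat space.
    print(r, exact, approx, flat - exact)
```

Since K > 0 on the sphere, the ant always measures a little less area than πr2, and the deficit shrinks like r4.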
Note 7. There is a formula for K in terms of the metric, but it is rather long
and not very useful for our purposes right now.

K = g^{ij} g^{kl} \frac{\partial^2 g_{ij}}{\partial x^k \partial x^l} + \ldots    (2.90)
2.11. TWO IMPORTANT THEOREMS 103

Figure 2.39: A piece of paper can be rolled up into a cylinder, without


disturbing local lengths. The principal curvatures, and the mean curvature
all change, but the Gauss curvature does not.

2.11 Two Important theorems


Before finishing with surfaces, we want to mention two very important theorems about
surfaces that connect differential geometry with topology. A simplified version of the
first one, the Gauss-Bonnet Theorem, was already presented in the last section.
It describes a very specific way in which curvature and topology interact. The other
theorem is the Uniformization Theorem, which describes the way in which you can
bring topological surfaces into differential geometry. Both are global theorems.

2.11.1 The Gauss-Bonnet-Theorem


We want to connect the Gauss curvature to a result from topology. Topology tells
us that we can characterize every (orientable compact) surface by a number called
the genus. You can see a few examples in figure 2.43. The genus is the number of
tunnels the surface has. A sphere has no tunnels, so its genus is zero. A torus has
one, so its genus is one. A double torus has two, so its genus is two.
Topology also has the idea of the Euler characteristic, which measures the genus
through triangles. The idea of the Euler characteristic is to triangulate the surface
and compare the number of triangles with the number of edges and vertices. If we
call the Euler characteristic of a surface M , χ(M ), then:

χ(M ) = # Triangles − #Edges + #Vertices (2.91)

This number is, incidentally, independent of triangulation, which we won’t prove


here. It also turns out that:

χ(M ) = 2 − 2genus(M ) (2.92)
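
Counting triangles, edges and vertices is easy to automate. The sketch below assumes a triangulation is given as a list of vertex triples; the two triangulations (the boundary of a tetrahedron for the sphere, a glued 3 x 3 grid for the torus) are our own illustrative choices.

```python
def euler_characteristic(triangles):
    """Compute #Triangles - #Edges + #Vertices from a list of vertex triples."""
    vertices = {v for tri in triangles for v in tri}
    edges = {frozenset(e) for a, b, c in triangles for e in ((a, b), (b, c), (c, a))}
    return len(triangles) - len(edges) + len(vertices)

# Boundary of a tetrahedron: a triangulation of the sphere (genus 0).
sphere = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

def torus_triangulation(n, m):
    """Triangulate an n x m grid with opposite sides glued (a torus, genus 1)."""
    def idx(i, j):
        return (i % n) * m + (j % m)
    triangles = []
    for i in range(n):
        for j in range(m):
            a, b = idx(i, j), idx(i + 1, j)
            c, d = idx(i, j + 1), idx(i + 1, j + 1)
            triangles.append((a, b, d))   # lower triangle of the grid cell
            triangles.append((a, d, c))   # upper triangle of the grid cell
    return triangles

print(euler_characteristic(sphere))                     # 2 = 2 - 2*0
print(euler_characteristic(torus_triangulation(3, 3)))  # 0 = 2 - 2*1
```

Both results match (2.92) with genus 0 and genus 1 respectively.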



Figure 2.40: A cone can also be unrolled to a flat piece of paper

Figure 2.41: An example of the Gauss-Bonnet theorem. If you form a
geodesic triangle on the sphere like in the figure, you can create a triangle
with three angles of π/2, totaling 270◦ . If you calculate the (area) integral
of the Gauss curvature over the triangle, you get the angle excess!
2.12. THE UNIFICATION THEOREM 105

The Gauss-Bonnet theorem connects the Euler-characteristic to curvature in the


following way:

Theorem 2.11.1: Gauss-Bonnet

Let (M, g) be a compact 2-dimensional Riemannian manifold (so a surface that
is either an abstract surface or a normal surface in R3 ). Then:

\int_M K \, dA = 2\pi \chi(M)    (2.93)

So you can get the Euler-characteristic of a surface through the Gauss-curvature!


This is an extremely important bridge between topology and differential geometry.
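
To see (2.93) at work, one can integrate K numerically. The following sketch uses a simple midpoint rule; the curvature and area formulas for the torus of revolution are the standard ones and are not derived in these notes, and the radii R, r are arbitrary choices.

```python
import math

def total_curvature(K, area_density, u_range, v_range, n=400):
    """Midpoint-rule approximation of the integral of K dA over a coordinate rectangle."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            total += K(u, v) * area_density(u, v) * du * dv
    return total

# Unit sphere in longitude/latitude coordinates: K = 1, dA = sin(theta) dphi dtheta.
sphere = total_curvature(lambda phi, theta: 1.0,
                         lambda phi, theta: math.sin(theta),
                         (0.0, 2.0 * math.pi), (0.0, math.pi))

# Torus of revolution with radii R > r (assumed standard formulas).
R, r = 2.0, 1.0
torus = total_curvature(lambda u, v: math.cos(v) / (r * (R + r * math.cos(v))),
                        lambda u, v: r * (R + r * math.cos(v)),
                        (0.0, 2.0 * math.pi), (0.0, 2.0 * math.pi))

# Dividing by 2*pi should give numbers close to chi = 2 and chi = 0.
print(sphere / (2.0 * math.pi), torus / (2.0 * math.pi))
```

On the torus the positive curvature on the outside is cancelled exactly by the negative curvature on the inside, which is what χ = 0 demands.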

2.12 The Uniformization Theorem


The Uniformization Theorem deals with the way one can geometrize a topological surface
by putting a metric on it. It is a bridge between topology and differential geometry
even deeper than the Gauss-Bonnet theorem.

Theorem 2.12.1: Uniformization Theorem


Let M be any compact topological surface. Then one can construct a metric
g on M that has constant Gauss curvature, specifically:

K_M = \begin{cases} +1 & \text{if } \chi > 0 \\ 0 & \text{if } \chi = 0 \\ -1 & \text{if } \chi < 0 \end{cases}    (2.94)

This is obvious for the case of the sphere (it has genus zero, so χ(S) = 2):
you can just give it the usual metric of the sphere. More surprising is the result that
one can do this with the torus, where χ = 0 and the metric therefore has to be flat.
You cannot do it by embedding the torus in three
dimensions, but you can do it by embedding it in four.
The idea is presented in figure ?? and ??.
With this last theorem, we move away from curves and surfaces, and move
towards the theory of differential geometry on manifolds, making a short stop along
the way to catch up on important concepts from topology.

Figure 2.42: On a sphere, the area of what the ant would call a circle is not
πr2 .

Figure 2.43: Examples for surfaces with genus 0, 1, 2. The genus is the
number of tunnels the surface has.
Part II

Manifolds


We have discussed the geometry of curves and surfaces extensively in the first
part of the lecture. We saw many ideas, like curvature and intrinsicness, that became
very big themes and useful tools. The goal of this lecture is to extend these ideas
to general (geometric) spaces and see some of the fantastic results and tools that
this approach brings. Before we can do that, however, we need a bit of technical
knowledge. For curves and surfaces, to define our tools, we needed to use certain
concepts constantly. Continuity, open sets, neighbourhoods, vectors, tangent vectors
and coordinates are only some of these. Usually, they were rather trivial to handle,
because we stuck to the ambient space Rn and we know all of these concepts in
Rn quite well from courses like calculus. A vector is, in its simplest form, just
an arrow in Rn and is easy to understand. But we want to go beyond this simple
idea of our geometric things sitting in Rn . As a particular example, our world,
according to general relativity, is a four-dimensional space that is not R4 but also
does not live in any Rn . We want to extend ideas like surroundings and vectors
to abstract spaces which do not necessarily sit in some embedding space. This is
the topic of this part of the lecture. Our first step will be open sets, since we need
them for basic things like continuity. The topic of open sets belongs to a broad
field in mathematics called topology, which we will explore briefly[14]. Afterwards we
will handle coordinates and charts and define exactly what we mean by a smooth
manifold. Finally, we will talk about vectors in the setting of manifolds.

[14] Any mathematician who has had a good lecture on topology can skip that part safely.
Chapter 3

Topology and topological manifolds

We want to start our discussion of manifolds with a discussion of topology. A
topology gives a space some spatial structure, which will definitely be useful in our
discussion of geometry.

3.1 Humble beginnings


What is the least we can definitely say about a geometric space (be it a curve or
surface or something more complicated)?
Well, at the very very least, at the most basic level, it will be a set of all points
that belong to that space. The thing we start out with is a set. There is not a lot
of geometry you can do with just a set, so our description needs to go further. We
need some sort of sense of spatiality to describe where things are in relation to each
other. How do we do this in Rn ? Well, in Rn we have a metric, which is something
that tells us how far away points are. We don’t have that here yet and we don’t
want to introduce it just yet. But with that metric, we can define balls centred at
a point p of some radius and from balls, we can define sets that are open or closed
and work from there with where things are spatially.
Let us write down the definitions of open and closed sets in Rn .

Definition 3.1.1: open set

Let X ⊂ Rn be a set. We call it open, if for any point p ∈ X, there is a


ball centred around p of some radius, which is entirely contained in X.

Definition 3.1.2: closed set

A set Y ⊂ Rn is closed, if its complement Rn \ Y is open.

112 CHAPTER 3. TOPOLOGY AND TOPOLOGICAL MANIFOLDS

Where do these definitions come from? Well, the first idea is that an open set,
intuitively speaking, is one that contains none of its border, while a closed one
contains all of it. You can clearly see that for simple sets like the ones in
figures 3.1 - 3.3, this is true.

Figure 3.1: A set without a border. No matter where, no matter how close
to the border, you can always find a small enough radius so that the ball
around that point is still entirely contained in X.

We can therefore see, that at least in these examples, the definition matches
our intuition and we can therefore accept them and see what other sets are open
and closed in that case.

We can easily see that a set consisting of a single point is closed (it is its own
border) and that often (but not always) the question of open/closed comes down
to whether we include a border in our set or not.

We want to point out a few ideas that follow from our definitions below.
3.1. HUMBLE BEGINNINGS 113

Figure 3.2: If you have a piece of the border, then you cannot do the same
thing as we did in the previous example. A ball around a point on the border will, by
definition, always contain a bit of X and a bit of the rest of Rn . So if
the set contains a piece of its border, it cannot be open, which matches our
intuition.

Figure 3.3: If you have a set with a border, you can quite easily convince
yourself that its complement is open, since no part of the border is in the
complement.

Proposition 3.1.1: Some consequences

Here are some of the consequences of our definitions.

• The empty set ∅ and the whole of Rn are open sets.

• Any union of open sets is an open set. This is simple to see
(and prove). If you have a point p in the union, it must have
come from one of the open sets that you were uniting, and if you can
find a ball there that works, then it also works in the union.
Adding other material does not change this simple fact, whether we
are uniting two, three, countably many, or uncountably
many open sets.

• However, only finite intersections are guaranteed to be open. This is simple
to see. If you take infinitely many open balls (centred at 0) with radii
converging to zero, then the intersection will be the set containing
only the point 0, which is not open anymore.

• Similarly, only finite unions of closed sets are guaranteed to be
closed. The reason for this is the last example, really. You can do the
same thing with the complements of the balls, and in the end, if
you unite all of these, you get all of Rn without the origin, which
is not closed.

• Any intersection of closed sets is closed. This follows from the second
item and the definition.
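
The shrinking-balls example from the third item can be played through in one dimension, where the balls are the intervals (−1/n, 1/n). This is only a finite sketch, with hypothetical helper names; a computer can of course only test finitely many balls.

```python
from fractions import Fraction

def in_ball(x, n):
    """Is x in the open ball (-1/n, 1/n) around 0 in R?"""
    return abs(x) < Fraction(1, n)

def first_ball_missing(x, limit=10**4):
    """Smallest n <= limit whose ball does not contain x, or None if there is none."""
    for n in range(1, limit + 1):
        if not in_ball(x, n):
            return n
    return None

# Any point other than 0 is eventually thrown out of the intersection ...
print(first_ball_missing(Fraction(1, 1000)))   # 1000
# ... while 0 survives every ball we test.
print(first_ball_missing(Fraction(0)))         # None
```

So the intersection of all the balls is {0}, and no ball around 0, however small, fits inside {0}.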

You can see the two examples given in the proposition in figure 3.4.
We can also express continuity of a function through open sets.

Proposition 3.1.2: Continuity through open sets

A function f : Rn → Rm is continuous if and only if the pre-image of any
open set in Rm is open. In a formula:

V ⊂ Rm open ⇒ f −1 (V ) ⊂ Rn open (3.1)

This can be proven quite easily to be equivalent to the usual ϵ − δ definition of
continuity, and is left for the reader as an exercise.
Continuity is something very local, and the usual definition (ϵ − δ) makes this
very clear and uses the spatial relations between points a lot.
This means that in Rn we can define continuity (and in the end spatiality)
using only open (and/or closed) sets. This is what we will generalize.

Figure 3.4: On the left you have the union of a few open sets. Any
point in the combined set is still surrounded by a ball (circle) containing only
points of the combined set, because the ball (circle) from the original set the
point comes from works for this. On the right, you have the intersection of
many open balls (circles) whose radii get smaller and smaller.
After intersecting all of them, you get a set containing only the origin, which
is not open anymore.
3.2. A TOPOLOGICAL SPACE 117

3.2 A topological space


Now we can take these concepts from Rn and use them to give us some more
structure for our space, more than the only piece of information we had until now:
"it's a set".
We can encode spatiality by having open sets on the space. Now for a general
set, we cannot construct the open sets; they need to be given. In the case of Rn ,
we constructed them using balls. In the general case we do not have balls we can
use to construct open sets with, because they are geometrical objects!
We need to choose some axioms for how we want open sets to behave; otherwise
there is no difference between open sets and arbitrary subsets of a space X. We mirror
the case of Rn completely here, by simply taking the first three items from the
proposition of the last section as the axioms. This we turn into a definition.

Definition 3.2.1: Topological space and topology

A set X is called a topological space if it is equipped with a set of "open"
sets, called a topology T . The topology has to have the following properties:

• \emptyset, X \in T

• If U_i \in T for every i in some index set I, then \bigcup_{i \in I} U_i \in T .

• If U_i \in T for every i in a finite index set I, then \bigcap_{i \in I} U_i \in T .

This definition might look slightly different, but it is exactly the three ideas we
wanted to steal from Rn . An index set here simply means a set (for
example {1, 2, 3, . . . }) that we use as indices; finite means that the index set has
finitely many elements.
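
For a finite set X the three axioms can be checked mechanically. The sketch below is an illustration on a three-point set of our own choosing; the function name `is_topology` is hypothetical.

```python
from itertools import combinations

def is_topology(X, T):
    """Check the three axioms for a family T of subsets of a finite set X."""
    X = frozenset(X)
    T = {frozenset(U) for U in T}
    if frozenset() not in T or X not in T:
        return False
    # On a finite set, closure under pairwise unions and intersections
    # already gives closure under all unions and all finite intersections.
    return all(U | V in T and U & V in T for U, V in combinations(T, 2))

X = {1, 2, 3}
print(is_topology(X, [set(), {1}, {1, 2}, X]))   # True
print(is_topology(X, [set(), {1}, {2}, X]))      # False: the union {1, 2} is missing
```

Note that the same set X carries several different topologies, which is why the open sets have to be given as extra data.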
We can immediately define continuity of a function between two topological
spaces.

Definition 3.2.2: Continuity for topological spaces

Let X with TX and Y with TY be two topological spaces and f : X → Y


a function between the two. Then we call f continuous, if for any open
set V ∈ TY , f −1 (V ) ∈ TX , that is if the preimage of any open set (in the
Y -sense) is open (in the X-sense).
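
On finite spaces this definition can be tested directly by computing preimages. A minimal sketch, with maps stored as dictionaries and all names chosen by us for illustration:

```python
def preimage(f, V, X):
    """All points of X that f sends into V."""
    return frozenset(x for x in X if f[x] in V)

def is_continuous(f, X, TX, TY):
    """f is continuous if the preimage of every open set of Y is open in X."""
    TX = {frozenset(U) for U in TX}
    return all(preimage(f, V, X) in TX for V in TY)

X = {1, 2}
discrete = [set(), {1}, {2}, {1, 2}]   # every subset is open
indiscrete = [set(), {1, 2}]           # only the two trivial open sets
identity = {1: 1, 2: 2}

print(is_continuous(identity, X, discrete, indiscrete))  # True
print(is_continuous(identity, X, indiscrete, discrete))  # False
```

The same map (the identity) is continuous in one direction and not in the other: continuity depends on the topologies, not only on the map.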

This definition is of course, exactly as the axioms, stolen from Rn .


This is the first structure we add to our space. We want it to be a topological
space, that is, we want to be able to speak about spatial relations, like continuity.
There is a lot you can do with a topological space without adding any more
structure to it. We will talk about these in the interlude at the end of the chapter,
which is recommended for anyone not familiar with topology and includes concepts
ranging from convergence, paths and connectedness to separation axioms.

But before we go into all that detail, which, while interesting, is not the direct
subject of the lecture, we want to talk about a few more ideas that will be more
directly relevant to us.

3.3 Charts: Part I


Our ultimate goal will be to do differential geometry, or very simply said, geometry
with derivatives. But the concept of a derivative needs numbers somewhere in the
problem. We need some descriptions of the space we work with that uses numbers,
otherwise known as coordinates, parametrizations or charts. These three concepts
are very similar. Let’s start with charts.
What is a chart? Very simply put, it is a map (in the geography sense, not in
the math sense). Let us take a two-dimensional space as an example. Then a chart
is a map in the intuitive way. It is a piece of paper which represents the space we
describe. We have points on the piece of paper and some way of knowing which
points these correspond to in the space we are describing. Like with real-life maps,
the map does not need to describe all of the space at once, but can simply represent
a part of the space. Like with real-life maps, the map does not need to represent
the geometry of the space well, often it is even impossible to construct a map that
represents, for example, both the lengths and angles accurately.
How do we transform this into a functional definition?

Figure 3.5: The earth and a map of the earth. We can have a map of only
part of the earth, and as we know, none of the maps we have can represent
the geometry of the earth well. A common example is that Greenland appears
similar in size to Africa on the Mercator map, even though in
reality it is about one-fifteenth of Africa's size.

Well, mathematically, what we need is (1) the piece of the space the chart
describes, (2) the piece of Rn you draw the map in and (3) a way to assign every
point in (1) a point in (2). The latter is of course just the description of a function.
3.3. CHARTS: PART I 119

Definition 3.3.1: A chart (topological definition)

Let X be a topological space as defined above and U ⊂ X open. A function
Ch is called a chart if it is a function Ch : U → Rn for some n, and as
a function Ch : U → Ch(U ) is continuous and bijective. We also require
Ch(U ) ⊂ Rn to be open.

We sometimes call Ch the chart and sometimes (U, Ch) the chart (as a tuple),
depending on which one is most convenient[1].
With this definition, we have fulfilled both (1) and (3), and (2) is just Ch(U ).
As you can see in the definition, we require bijectivity, that is, we don’t want our
map to have two points on the map corresponding to the same point in the space,
nor do we want two points on the space being shown as one on the map.
We can also see this more topologically, by introducing the notion of a homeo-
morphism.

Definition 3.3.2: Homeomorphism

Let X, Y be two topological spaces with topologies TX , TY . We define a
function f : X → Y to be a homeomorphism if:

• it is bijective,

• it is continuous (if V ∈ TY , then f −1 (V ) ∈ TX ),

• its inverse f −1 is also continuous (if U ∈ TX , then f (U ) ∈ TY ).

Continuity of course refers to the topological continuity.
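
The third condition is not automatic. Here is a small sketch on a two-point set (all names are our own) showing a continuous bijection whose inverse is not continuous, so that it fails to be a homeomorphism.

```python
def image(f, U):
    """The set of all f(x) for x in U."""
    return frozenset(f[x] for x in U)

def is_homeomorphism(f, X, TX, Y, TY):
    """Check bijectivity, continuity of f and continuity of its inverse."""
    TX = {frozenset(U) for U in TX}
    TY = {frozenset(V) for V in TY}
    bijective = len(X) == len(Y) and image(f, X) == frozenset(Y)
    continuous = all(frozenset(x for x in X if f[x] in V) in TX for V in TY)
    inverse_continuous = all(image(f, U) in TY for U in TX)
    return bijective and continuous and inverse_continuous

X = {1, 2}
discrete = [set(), {1}, {2}, {1, 2}]
indiscrete = [set(), {1, 2}]
identity = {1: 1, 2: 2}

print(is_homeomorphism(identity, X, discrete, X, indiscrete))  # False
print(is_homeomorphism(identity, X, discrete, X, discrete))    # True
```

In the first case the open set {1} is sent to a set that is not open in the target, so the two spaces are genuinely different as topological spaces even though the underlying sets agree.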

In this sense, a chart is a homeomorphism between an open subset of X and an
open subset of Rn .
A chart gives us (local) coordinates for X. You can see this in figure 3.6.
These coordinates are simply the numbers that the chart maps points to.

Definition 3.3.3: Coordinates

We can write a chart as a function that takes points p ∈ X as Ch(p) =


(x1 (p), x2 (p), . . . , xn (p)). The functions x1 (p), x2 (p), . . . , xn (p) are called
the coordinates associated with the chart, and evaluated at a specific point
p they are called the coordinates of p.

We can construct the parametrization P : Ch(U ) → U simply by taking the


inverse of Ch, which has to exist since we require bijectivity. So we get P = Ch−1 :
Ch(U ) → U .
[1] Formally, a chart is defined as the tuple, so if you need to show something formally, use
the tuple.

Figure 3.6: A topological space X with an open subset U and its map/chart
onto a portion Ch(U ) of Rn . The coordinate lines from the map can be
projected back onto the space, giving us coordinates for the space.

Definition 3.3.4: Parametrization

We can construct a parametrization of U from a chart (U, Ch) simply by
taking the inverse of Ch, restricted to Ch(U ).

P = Ch^{-1} : Ch(U) \to U    (3.2)

Then P is of course a function P(x_1, x_2, \ldots, x_n) = p \in U and parameterizes
U in the usual sense.



Example 3.3.1: The Sphere, again

The sphere, as a subset of R3 and a two-dimensional surface, is a topological
space if you equip it with the usual open sets that come from R3 .
Specifically, you can write the topology as:

T_{S^2} = \{ U \subset S^2 \mid \exists V \subset \mathbb{R}^3 : V \in T_{\mathbb{R}^3} \text{ and } V \cap S^2 = U \}    (3.3)

Or, simply put, the open sets of S 2 are exactly the intersections of open
subsets of R3 with the sphere.
There exists a standard chart, which is simply the chart that describes the
sphere by longitude and latitude. It's easier to write the parametrization
down first, so we will start with that.

P : (0, 2\pi) \times (0, \pi) \to S^2 \setminus \left( S^2 \cap \{ (x, 0, z) \mid x \geq 0, \ z \in \mathbb{R} \} \right)    (3.4)

(\phi, \theta) \mapsto \begin{pmatrix} \cos(\phi) \sin(\theta) \\ \sin(\phi) \sin(\theta) \\ \cos(\theta) \end{pmatrix}    (3.5)

From this, we can construct a chart by taking the inverse. You can work this
out quite easily. If we call the subset of the sphere that the parametrization
maps onto U , then we get:

Ch : U \to (0, 2\pi) \times (0, \pi)    (3.6)

(x, y, z) \mapsto (\arctan(y/x), \arccos(z))    (3.7)

where arctan(y/x) is understood as the angle of the point (x, y), taken in
(0, 2π), which also extends the arctan to the case where x = 0. You can check
(or should know) that this is a bijective continuous function so (U, Ch) is
a chart. There are, however, a myriad of other charts you could use for the
sphere, some of which we will introduce later.
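
The chart and the parametrization of the sphere can be tried out numerically. In this sketch the "extended arctan" is realized with `math.atan2`, shifted into (0, 2π); that is our choice of implementation, not something fixed by the notes.

```python
import math

def P(phi, theta):
    """Parametrization (3.5): (phi, theta) in (0, 2*pi) x (0, pi) -> S^2."""
    return (math.cos(phi) * math.sin(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(theta))

def Ch(x, y, z):
    """Chart (3.7): atan2, shifted into (0, 2*pi), plays the role of the extended arctan."""
    phi = math.atan2(y, x) % (2.0 * math.pi)
    theta = math.acos(z)
    return (phi, theta)

phi, theta = 4.0, 1.2     # an arbitrary point of the coordinate rectangle
p = P(phi, theta)         # the corresponding point on the sphere
print(p)
print(Ch(*p))             # recovers (4.0, 1.2) up to rounding
```

A plain arctan(y/x) would return an angle in (−π/2, π/2) and confuse opposite points of the sphere, which is why the extension matters.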

We have developed the tools to add another condition for what we want to work
with, calling the new type of space a topological manifold. The new requirement
will be quite simple. We will want to be able to use charts. We want to require
that the whole space can be covered by charts, i.e., that there is no point in the
space for which you cannot construct a chart on any of its surroundings[2]. We do not
require, however, that there is one chart that covers the entire space. You can see
easily why in examples like example 3.3.1.
Before we add this extra structure, however, we will want to talk about another
condition that we will want to have, which eliminates (geometrically) "pathological"
examples that we will definitely not want to work with, at least in this lecture.

[2] A surrounding is an open set containing the point it surrounds.



3.4 The Hausdorff Condition

We will want our new space to have a condition called the Hausdorff condition to
eliminate some weird examples we don’t want to have. Consider the example below
for a pathological case of a topological space that we won’t want to work with.

Figure 3.7: We can create the real line with two origins, by glueing together
two real lines everywhere, except at the origins of the two lines. This is
shown in (a). Sets that would be open on a real line (and have the origin
in them) are open if they contain at least one of the origins.
3.5. THE ANT 123

Example 3.4.1: The real line with two origins

We can construct a pathological example we want to avoid by taking two real
lines and gluing them together[a] everywhere except for the origin. What we
get is the real line with two origins, as shown in figure 3.7. It is a topological
space and you can even show that you can cover it with charts[b], but it is
not something we want to work with. That is why we build the Hausdorff
condition into our new definition.
[a] In jargon: we take the quotient space.
[b] You can just project onto the real axis, always picking one or the other origin.

What is the problem with cases such as the real line with two origins? Well, in
some sense, there are two points where there should be one. You can’t separate
them. In a topological way of speaking, there aren’t two open sets that each contain
one of the points, but that don’t intersect. The way we will avoid this is simply by
requiring it and throwing away all other cases.

Definition 3.4.1: The Hausdorff condition


Let X be a topological space. X is called a Hausdorff space if any two distinct
points can be separated by two open sets. More precisely, X is a Hausdorff
space if for any two distinct points p, q ∈ X there exist open sets O1 , O2 ⊂ X
so that p ∈ O1 , q ∈ O2 and O1 ∩ O2 = ∅.
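
The real line with two origins is infinite, so we cannot feed it to a computer directly. As a finite stand-in (our own choice, not an example from the lecture), the sketch below tests the Hausdorff condition on a two-point set with two different topologies.

```python
from itertools import combinations

def is_hausdorff(X, T):
    """Check that any two distinct points have disjoint open sets around them."""
    T = [frozenset(U) for U in T]
    def separable(p, q):
        return any(p in U and q in V and not (U & V) for U in T for V in T)
    return all(separable(p, q) for p, q in combinations(X, 2))

X = {0, 1}
two_point = [set(), {1}, {0, 1}]          # the only open set containing 0 is all of X
discrete = [set(), {0}, {1}, {0, 1}]      # every subset is open

print(is_hausdorff(X, two_point))  # False
print(is_hausdorff(X, discrete))   # True
```

In the first topology the two points cannot be told apart by disjoint open sets, which is the same defect the two origins suffer from.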

We will require this of any space we work with and build this into our new
definition. We will call the new kind of space a topological manifold.

Definition 3.4.2: Topological Manifold

A topological manifold M is a topological space (M, TM ), which obeys the


Hausdorff condition and can be covered by charts.

There are many examples of topological manifolds; any curve or surface will do.
The circle is a great example, so is the torus.

3.5 The ant


We have now added many conditions, some rather technical and we want to conclude
this chapter by coming back to our old friend, the ant. You can imagine our
progressive addition of requirements onto our space as additional abilities the ant
has. In the first stage, we had a simple set and our ant couldn’t do anything really,
it just knew the places that exist in the world it lives in. In a topological space,
the ant can go a bit further, it can ”see”. It can tell where things are spatially in
relationship to each other, but it cannot (yet) tell anything as intricate as distances.
In a topological manifold, it can draw charts and it has enough of these to cover its

entire space, but it cannot yet do much with these charts, in particular things like
derivatives are still out of its reach. Our next goal will be to enable the ant to tell
how things change when she changes the coordinates. This will be the topic of the
next chapter.

3.6 Interlude: Useful topology for the course


Will be updated throughout the course.
Chapter 4

Smooth Manifolds

We have taught the ant how to recognize where things are spatially. We now want
to prepare to teach it calculus. We won’t teach it calculus just yet, but we will
prepare it to do so, by making sure the charts it uses are compatible with each
other, which is the main topic of this chapter.

4.1 Charts II: Compatible charts

We have seen that a topological manifold can be covered by continuous (in the
topological sense) charts. To differentiate, we need something a bit better:
some differentiable, or preferably smooth, structure. Right now, there does not seem
to be an obvious, sensible, geometric way to define a derivative
on the space without making grand ad hoc assumptions that we are not prepared
to make, since we want to stay quite general. But let us, for the sake of
argument, say that we have found a way to do it. There is an immediate way that
differentiating could go wrong if we do not impose further restrictions on our charts.

Imagine we have two charts, (U1 , Ch1 ) and (U2 , Ch2 ), either covering the same
region or covering regions that overlap somewhere (U1 ∩ U2 ̸= ∅) and we have some
function, let’s call it f : M → R. Whatever our derivative should be, the one
thing we will want is that if f is differentiable in the new sense, and if the charts
are sensible, then f should be differentiable as a function of the coordinates. But
this should hold for any chart we want, so in particular it should hold for Ch1 and
Ch2 . We can guarantee this if the transition map from one chart to another is
differentiable. This will be the topic of this chapter.

126 CHAPTER 4. SMOOTH MANIFOLDS

Definition 4.1.1: The transition map

Let (U1 , Ch1 ) and (U2 , Ch2 ) be two charts, with U1 ∩ U2 not empty, that
is, there is a region on M that both charts cover. Call this region U . Then
we define the transition map from the first to the second chart as:

T_{1 \to 2} = Ch_2 \circ Ch_1^{-1} = Ch_2 \circ P_1    (4.1)

More specifically, we define it as T1→2 : Ch1 (U ) → Ch2 (U ), since anything
else is senseless[a]. The proper definition is: T_{1 \to 2} = Ch_2|_U \circ P_1|_{Ch_1(U)} .
[a] We cannot define it on any bigger set, obviously.
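
A concrete transition map may help. The sketch below uses two charts on the unit circle that we pick for illustration: the angle chart Ch1 on the circle without the point (1, 0), and the graph chart Ch2 on the upper half circle. Their overlap U is the upper half circle, with Ch1(U) = (0, π) and Ch2(U) = (−1, 1).

```python
import math

def P1(theta):
    """Inverse of the angle chart Ch1, for theta in (0, 2*pi)."""
    return (math.cos(theta), math.sin(theta))

def Ch1(x, y):
    """Angle chart on the circle without the point (1, 0)."""
    return math.atan2(y, x) % (2.0 * math.pi)

def Ch2(x, y):
    """Graph chart on the upper half circle y > 0: project onto x."""
    return x

def P2(x):
    """Inverse of the graph chart, for x in (-1, 1)."""
    return (x, math.sqrt(1.0 - x * x))

def T12(theta):
    """Transition map Ch2 o P1, defined on Ch1(U) = (0, pi)."""
    return Ch2(*P1(theta))

def T21(x):
    """Transition map Ch1 o P2, defined on Ch2(U) = (-1, 1)."""
    return Ch1(*P2(x))

theta = 0.7
print(T12(theta), math.cos(theta))   # the two numbers agree: T12 is just cos
print(T21(T12(theta)))               # recovers 0.7 up to rounding
```

Here T12 is cos restricted to (0, π) and T21 is arccos on (−1, 1); both are ordinary functions between open subsets of R, so asking whether they are differentiable makes sense.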

You can see a picture with all the objects we are using right now in figure 4.1.

Figure 4.1: A picture showing all the main players of this chapter. We have
(two) charts Ch1 , Ch2 each covering a region U1 , U2 of the manifold. Each
chart has its own inverse, P1 , P2 associated with it, which are parametriza-
tions of the manifolds. On the region where U1 and U2 overlap (called U )
we can define transition maps T1→2 , T2→1 , which change charts.
4.1. CHARTS II: COMPATIBLE CHARTS 127

From now on, since we will be working with charts a lot and want to declutter
notation, we will not write on which sets each chart is defined, these are implied
to be the ”reasonable” ones. It is, however, a good exercise, to keep track of
these, especially for the exam. We will also call U1 and Ch1 (U1 ) the same thing,
even though they are definitely not. Our reasoning is that Ch1 (U1 ) is our chart
representation of U1 and in the same way you can point at a map and say ”Here
is America”, even though you are pointing at a chart of America, you can call
Ch1 (U1 ), U1 .
We can now formalize our idea by calling two charts compatible if their
transition maps are differentiable. The only thing we will change is to require
smoothness, for convenience.

Definition 4.1.2: Compatible charts

Let (U1 , Ch1 ) and (U2 , Ch2 ) be two charts of M . We call the two charts
(smoothly) compatible if either U1 and U2 don’t overlap, or otherwise, if
their transition functions T1→2 and T2→1 are smooth.

With this, we can define an atlas and a preliminary definition of a smooth


manifold. An atlas, simply put, is a collection of charts that cover M and are all
compatible with each other. A smooth manifold (for now) is a topological manifold
with an atlas.

Definition 4.1.3: Atlas

An atlas is a set A of charts (Ui , Chi ) which cover M and are all smoothly
compatible with each other (all the transition maps between the charts are
smooth).

Definition 4.1.4: Smooth Manifold: Preliminary Definition

A smooth manifold M is a topological manifold equipped with an atlas. We
summarize all the requirements in the following list:

• The set M has a topology TM , which is the set of all open subsets of
M.
• The space fulfils the Hausdorff condition, that is, every pair of two
different points can be separated by open sets.
• The space is equipped with an atlas A of charts, which are smoothly
compatible.
• The charts are all homeomorphisms. (This comes from the definition
of a topological manifold.)

4.2 Examples of smooth manifolds


There are many examples of smooth manifolds, and now that we have a preliminary
definition we can give some examples in this chapter.

Example 4.2.1: Smooth manifolds

• Given that we started our journey with Rn as an inspiration and
generalized some of its properties, it is no wonder that Rn is an
n-dimensional smooth manifold.
• Any open subset of Rn is also a smooth manifold.

• The graph of a smooth function f : U ⊂ Rn → Rm with U open is a
smooth n-dimensional manifold.

Figure 4.2: The graph of a smooth function is a smooth manifold.


We can create a smooth chart Ch that covers all of the manifold
quite easily, by projecting (x, f (x)) down onto x.

• Any subset of RN that can be written locally as the graph of a
smooth function (in some orthogonal coordinate system) is, of course,
a smooth manifold.
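
The single chart for the graph of a smooth function can be sketched in a few lines. The function f below is an arbitrary choice made only for this illustration; any smooth function would do.

```python
import math

def f(x):
    """An arbitrary smooth function R^2 -> R, chosen only for illustration."""
    return math.sin(x[0]) + x[1] ** 2

def parametrization(x):
    """P(x) = (x, f(x)): lifts a point of U onto the graph."""
    return (x[0], x[1], f(x))

def chart(point):
    """Ch(x, f(x)) = x: projects a point of the graph back down to U."""
    return (point[0], point[1])
```

The chart simply forgets the last coordinate, so chart(parametrization(x)) gives back x for every x in U.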

The last two examples aren’t too surprising. They are completely analogous to
their equivalents we saw while discussing surfaces and since spaces like surfaces are
what we want to generalize, it looks like we are on the right track here.
Now that we have the general examples done, we give a few more specific ones.

Example 4.2.2: The n-dimensional sphere S n

• The n-dimensional spheres S n are all manifolds. (S n is defined as
the set of all points in Rn+1 with Euclidean distance one from the
origin.) You might ask yourself how many charts we need to cover S n
completely. Your first idea might be to do something similar to how
we covered S 2 in chapter 2 and describe half the sphere as a graph.
If you do this, you get (2n + 2) charts (exercise: verify this). You can
see the two examples S 1 and S 2 in the following figure.

Figure 4.3: The six different charts (only the open sets are shown)
that you need to map S 2 with the first method.

It turns out that you don’t actually need this many charts to cover
S n ; two are enough. What you need for this is the
stereographic projection. You can see the stereographic projection
of S 2 onto the plane in figure 4.4. The way you project a point
p = (x1 , . . . , xn , xn+1 ) from the sphere onto the (hyper)-plane is by
drawing a straight line from the north pole N = (0, . . . , 0, 1) through
p. The point at which the line hits Rn is the point p gets mapped to.
Notice that in two dimensions it is easy to see that the southern
hemisphere gets mapped onto the disc inside the sphere, while the northern
one takes up all of the rest of the plane. This way you can create a
chart that covers the entire sphere except for the north pole. This is
the first chart. The second one is also the stereographic projection,
but this time you do it from the south pole, and that covers the entire
sphere except for the south pole. Together you have an atlas of size
two.

The formulas for the two projections (ChN from the north pole, ChS from
the south pole) are:

\[ \mathrm{Ch}_N(x_1, \dots, x_n, x_{n+1}) = \frac{(x_1, \dots, x_n, 0)}{1 - x_{n+1}} \tag{4.2} \]

\[ \mathrm{Ch}_S(x_1, \dots, x_n, x_{n+1}) = \frac{(x_1, \dots, x_n, 0)}{1 + x_{n+1}} \tag{4.3} \]

As an exercise, you should derive these formulas and show that they
are compatible.

Figure 4.4: The stereographic projection. To produce the stereographic
projection of a point p on S n , you draw a line between the
north pole N and the point p. You then follow the line to the point
where it crosses Rn × {0}. This point is then the projection of the
point p.

You can also check that the charts from the first method (2n + 2
charts) are compatible with the charts obtained from the stereographic
projection.
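
The two stereographic charts can also be checked numerically. The sketch below assumes the convention that the projection from the north pole divides by 1 − xn+1 and the one from the south pole by 1 + xn+1 , and it drops the trailing zero, so the charts land in Rn rather than Rn × {0}. The function names are ours.

```python
def ch_north(p):
    """Project p in S^n (p not the north pole) from N = (0, ..., 0, 1) onto R^n."""
    *head, last = p
    return tuple(x / (1.0 - last) for x in head)

def ch_south(p):
    """Project p in S^n (p not the south pole) from S = (0, ..., 0, -1) onto R^n."""
    *head, last = p
    return tuple(x / (1.0 + last) for x in head)

def p_north(u):
    """Parametrization P_N, the inverse of ch_north."""
    r2 = sum(x * x for x in u)
    return tuple(2.0 * x / (r2 + 1.0) for x in u) + ((r2 - 1.0) / (r2 + 1.0),)

def transition_north_to_south(u):
    """T_{N->S} = Ch_S o P_N, defined for u != 0."""
    return ch_south(p_north(u))
```

With these conventions the transition map works out to u ↦ u/|u|2 , which is smooth away from u = 0. This is exactly the compatibility that the exercise asks for.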

Example 4.2.3: Mobius Strip

The Mobius strip is another example of a smooth manifold. You can see
the Mobius strip in a bit of a different form in the figure below.

Figure 4.5: The Mobius strip.

4.3 The maximal atlas

We have given a preliminary definition of a smooth manifold, by saying that a
manifold is a space with a topology and an atlas, composed of charts which are
compatible with each other and cover the whole space. We now come to a practical
problem with our definition. In example 4.2.2, we saw that we can construct two
atlases for the sphere, both of which are good enough in the sense that the charts
of each individual atlas cover the sphere and are compatible with each other (within
each atlas, individually).

But which atlas should we choose to "define" the sphere as a manifold? And
do we get different geometries from different atlases like that? In the case of the
sphere, it certainly would be weird if we got different geometric results depending
on whether we used the stereographic atlas or the graph atlas. On top of this, working
in just one or the other is not really too comfortable. For example, if you choose the
graph atlas, there is no single chart on which you can see both the north and south
pole; you need to use at least two charts, which is more complicated
and seems like an unnecessary problem. There is an easy solution to
this. Throw both of these atlases together. We already asked you to show that the
charts you get from the graphs and the ones from the stereographic projections are
compatible, so if you throw all these together, you still get a fully functional atlas.

In fact, while we are at it, why not just throw all possible compatible charts
together into one atlas, call it a maximal atlas and be done with it? This is exactly
what we choose to do and will be the modification to our definition.

Definition 4.3.1: The maximal atlas

Let A be an atlas of some smooth manifold (as per our preliminary defini-
tion). We can construct a maximal atlas by collecting together all possible
compatible charts with A. The maximal atlas is defined as:

Ā = {all charts (U, Ch) that are compatible with all (UA , ChA ) ∈ A }
(4.5)

Of course, to construct the maximal atlas we need some atlas to start with, and
the maximal atlas will depend on this. If you have two atlases that are compatible
with each other, however, they of course produce the same maximal atlas. In that
case we call the two atlases equivalent.

It should make sense, of course, that Ā is an atlas in its own right, that is, all charts in
Ā are compatible with each other. We will give the proof, because it is similar
to many other proofs of this kind in differential geometry and it is good
to have seen one of them once.

We will need two ideas for the proof, which seem quite trivial and are not too
hard to prove.

Lemma 4.3.1: Smoothness is a local property

Let U be an open region of some manifold M and (Ui )i∈I open sets that
cover U (which means that ⋃i∈I Ui = U ). Then a function f is smooth if
and only if f |Ui is smooth for all i.

Figure 4.6: It should be clear that smoothness is a local property.
A function should be smooth on U exactly when it is smooth on a
covering of U (a splitting of U into smaller parts).

This idea is quite easy to accept, since smoothness should be a local property,
after all, it is a generalization of the ϵ − δ kind of continuity and differentiation.
The second idea is even simpler.

Lemma 4.3.2: Compositions of smooth functions are smooth.

Let U, V, W be open sets of some manifolds N, M, P and f : U → V and
g : V → W two smooth functions. Then g ◦ f is a smooth function.

Again this should feel obvious and the proof is not hard, it’s another one of the
typical chain rule proofs of differential geometry.
We will leave these two claims unproven, since their proofs are neither hard nor
illuminating and focus on the original idea we want to prove.

Proof. We want to show that Ā, which is a maximal atlas constructed from A,
is an atlas in its own right, that is, every two charts in Ā are compatible with
each other.

We start with choosing two charts in Ā, and we call them (V, ChV ) and
(W, ChW ). We want to show that they are compatible.
Let Z = V ∩ W . We can assume that it is not empty; otherwise, we are
done. We want to show that TV→W (on Z) is a smooth map. The basic idea of
the proof is that we do not transition from the first chart to the other directly,
but go over the charts from A (see figure 4.7). This is why we need the second
lemma. The first we need because, in general, A need not contain
a chart that works on all of Z, so we have to cut Z into parts and prove it for each part,
which is where our lemma will come in handy. Let’s start.

Figure 4.7: The charts we use for the proof.
Firstly, cut Z up into pieces, each of which is covered by a chart in A
(which may also cover more beyond Z). That is, for each (Ui , Chi ) ∈ A define the set
Zi = Z ∩ Ui . These sets are open and cover Z. By construction of the maximal
atlas, ChV and ChW are compatible with each (Ui , Chi ) ∈ A. Then
ChV (Z) = ⋃i∈I ChV (Zi ) and all
of these are open as well, since ChV is a homeomorphism.

Take TV→W and split it up into smaller maps over the ChV (Zi ). Because
of the first lemma, we only need to show that TV→W |ChV (Zi ) is smooth for all
i; the smoothness of TV→W as a whole map then follows from the lemma.
But we can use that:

\begin{align}
T_{V \to W}|_{\mathrm{Ch}_V(Z_i)} &= \mathrm{Ch}_W \circ P_V \tag{4.6} \\
&= \mathrm{Ch}_W \circ (P_i \circ \mathrm{Ch}_i) \circ P_V \tag{4.7} \\
&= (\mathrm{Ch}_W \circ P_i) \circ (\mathrm{Ch}_i \circ P_V) \tag{4.8} \\
&= T_{U_i \to W} \circ T_{V \to U_i} \tag{4.9}
\end{align}

where all the functions are, of course, restricted to either Zi (all the charts) or
the matching chart image of Zi (all the parameterizations and transition maps);
these restrictions have been left out for readability.
ChV and ChW have to be compatible with all the Chi , so the two transition
maps in the last line are smooth. Since the composition
of two smooth functions is smooth, TV→W |ChV (Zi ) is smooth for every i,
and since the Zi cover Z, the whole transition map TV→W is
smooth.
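
The key step of the proof, going from one chart to another through a third chart, can be tried out on the circle S 1 . In this sketch (our own example, not from the lecture) the two stereographic charts play the roles of ChV and ChW , and an angle chart plays the role of a chart Chi from the atlas A.

```python
import math

def ch_v(p):
    """Stereographic chart from the north pole (0, 1)."""
    x, y = p
    return x / (1.0 - y)

def p_v(u):
    """Parametrization belonging to ch_v."""
    return (2.0 * u / (u * u + 1.0), (u * u - 1.0) / (u * u + 1.0))

def ch_w(p):
    """Stereographic chart from the south pole (0, -1)."""
    x, y = p
    return x / (1.0 + y)

def ch_i(p):
    """Angle chart on S^1 without (-1, 0), with values in (-pi, pi)."""
    x, y = p
    return math.atan2(y, x)

def p_i(theta):
    """Parametrization belonging to ch_i."""
    return (math.cos(theta), math.sin(theta))

def t_v_to_w(u):
    return ch_w(p_v(u))          # direct transition V -> W

def t_v_to_i(u):
    return ch_i(p_v(u))          # first leg: V -> U_i

def t_i_to_w(theta):
    return ch_w(p_i(theta))      # second leg: U_i -> W
```

Wherever all three charts are defined (here: u different from 0 and −1), the detour t_i_to_w(t_v_to_i(u)) agrees with the direct transition t_v_to_w(u), which for these charts is 1/u.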

4.4 The final definition of a smooth manifold


We can now tweak our preliminary definition to get the full, proper, definition of a
smooth manifold. The only change we make is that the atlas has to be maximal.

Definition 4.4.1: Smooth Manifold: Final Definition


A smooth manifold M is a topological manifold equipped with a maximal
atlas. We summarize all the requirements in the following list:

• The set M has a topology TM , which is the set of all open subsets of
M.
• The space fulfils the Hausdorff condition, that is, every pair of two
different points can be separated by open sets.
• The space is equipped with a maximal atlas Ā of charts, which are
smoothly compatible.
• The charts are all homeomorphisms. (This comes from the definition
of a topological manifold.)

4.5 Cartesian Products, Smoothness


In this section, we want to talk about a few more ideas that we need to understand
smooth functions. In particular, we want to discuss what happens if you take the

cartesian product of smooth manifolds and also the relationship between smooth
functions between manifolds and atlases.

4.5.1 The Cartesian product of two manifolds


One thing we would like to do is take small manifolds and create larger ones out
of these. A very natural way to do this is by using the Cartesian product. Going
back to Rn , if you take a copy of Rn and a copy of Rm and combine them using a
Cartesian product, you get Rn+m , which is, of course, a smooth manifold.
It turns out that the same happens when you take the Cartesian product of two
smooth manifolds. The resulting space is a smooth manifold as well, in a very
natural way.
You can, for example, take two circles (two S 1 ) and combine them by the Cartesian
product. The result is the torus.
The main question is how you can construct an atlas for a Cartesian product of
two smooth manifolds. It turns out to be quite simple: you just take the Cartesian
product of the charts of each atlas.

Definition 4.5.1: Atlas for a cartesian product

Let M and N be two manifolds, with dimensions m and n. Then you can
construct an atlas for M × N out of the two (maximal) atlases ĀM and ĀN
that belong to M and N . If (UM , ChM ) and (UN , ChN ) are two charts
from the individual atlases, then you can create a chart for UM × UN by
taking the Cartesian product of the two chart maps:

ChM ×N = ChM × ChN , that is, ChM ×N (p, q) = (ChM (p), ChN (q)) (4.10)

You can check that with this atlas (or the maximal version of it) you get a
smooth manifold, by checking all the conditions for a smooth manifold.
This definition also forces the dimension of M × N to be m + n as you would
expect, which you should convince yourself of.
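
For the torus S 1 × S 1 , the product chart can be sketched as follows. The angle chart ch_circle and the helper product_chart are names we introduce for this illustration only.

```python
import math

def ch_circle(point):
    """Angle chart on S^1 without (-1, 0), with values in (-pi, pi)."""
    x, y = point
    return math.atan2(y, x)

def product_chart(ch_m, ch_n):
    """Build Ch_{M x N}(p, q) = (Ch_M(p), Ch_N(q)) from two chart maps."""
    def ch(pair):
        p, q = pair
        return (ch_m(p), ch_n(q))
    return ch

# A chart for (part of) the torus S^1 x S^1.
ch_torus = product_chart(ch_circle, ch_circle)
```

Each factor contributes one coordinate, so a point of the torus gets 1 + 1 = 2 coordinates, in line with dim(M × N ) = m + n.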

4.5.2 Smooth functions between manifolds


We want to talk about smooth functions between two manifolds, say M and N , with
dimensions m and n.
The first question is how we want to define smoothness on manifolds. We use
charts of course! But we need a bit more detail. Let’s say we have f : M → N ,
and a point p ∈ M so that f (p) = q ∈ N . Since smoothness is a local property,
we only need to check it locally. The general idea is to pick two charts, one of M
that charts a neighbourhood of p, and a second one of N that charts a neighbourhood of
q, and see if f is smooth as a function of the coordinates of the charts.
Firstly, however, we need a small condition on the pair of charts we use.

Definition 4.5.2: Admissible pairs

Let (U, ChU ) ∈ ĀM and (V, ChV ) ∈ ĀN be two charts and f : M → N
the function whose smoothness we want to check. The two charts are called
an admissible pair if f (U ) ⊂ V , that is if the whole set U gets mapped into
V.

Definition 4.5.3: Smoothness of f : M → N

Let f : M → N be a function between two manifolds M and N . We say it
is smooth if for every point p ∈ M we can find at least one admissible pair
(U, ChU ) ∈ ĀM and (V, ChV ) ∈ ĀN with p ∈ U so that the function in
coordinates is smooth. In other words, fc = ChV ◦ f |U ◦ PU : ChU (U ) →
ChV (V ) is smooth, where the c in fc stands for coordinates.

Notice first that we did not define f to have to be smooth in coordinates for all
admissible pairs, only that one exists. You might ask yourself if this means that a
smooth f might not be smooth for an admissible pair that wasn’t used to check its
smoothness. It turns out, that the answer is no. You only need to check smoothness
in one atlas, not the maximal atlas, and the definition then forces f to be smooth
as a function of coordinates for all other admissible pairs. The proof of this claim
is very similar to the proof that all charts in the maximal atlas are compatible with
each other. You take two charts that are an admissible pair, break U up into small
pieces for which there is a chart where it is smooth (as per the definition), and use
the fact that smoothness of functions between two Rn is local. We therefore choose
not to give it here.
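
To see what "the function in coordinates" looks like in practice, here is a sketch for the height function on S 2 , which sends a point (x, y, z) to z. We use the stereographic chart from the north pole on S 2 (with the convention that it divides by 1 − z) and the identity chart on R; both choices, and all names, are assumptions made for this example.

```python
def p_north(u):
    """Parametrization of S^2 without the north pole (inverse stereographic chart)."""
    r2 = u[0] ** 2 + u[1] ** 2
    return (2.0 * u[0] / (r2 + 1.0), 2.0 * u[1] / (r2 + 1.0), (r2 - 1.0) / (r2 + 1.0))

def height(p):
    """The function f : S^2 -> R, f(x, y, z) = z."""
    return p[2]

def height_in_coordinates(u):
    """f_c = Ch_V o f o P_U, where Ch_V is the identity chart on R."""
    return height(p_north(u))

def closed_form(u):
    """The same map written out: a rational function of the coordinates."""
    r2 = u[0] ** 2 + u[1] ** 2
    return (r2 - 1.0) / (r2 + 1.0)
```

In coordinates the height function is (|u|2 − 1)/(|u|2 + 1), a rational function whose denominator never vanishes, so it is smooth on all of R2 . Together with the analogous computation in the chart from the south pole, this shows that the height function is smooth in the sense of the definition.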

From this definition, you can immediately see that the following corollary has to
be true.

Corollary 4.5.1: Smoothness is a local property

The smoothness of a function f : M → N is a local property.



Example 4.5.1: Smooth functions between manifolds

We want to give a few examples of smooth functions between manifolds
which you already know, but might not have thought about in this way
before (except for the last two). The first three rely on the fact that Rn is
a manifold.

• Smooth curves on a manifold γ : (a, b) → M are in this sense smooth
functions, since (a, b) is a smooth manifold.

• Charts and parameterizations are smooth functions in the above sense,
which should convince you that our definition of a smooth function is
sensible.
• Functions u : M → R can be smooth.

• The definition of smooth maps for surfaces that we gave in Chapter
2 is equivalent to the new one.

The last two points should make it reasonable that our definition makes
sense.

Another thing we definitely would like to have is that the composition of two
smooth functions is a smooth function and that smooth functions in the above sense
are also continuous in the topological sense, both of which turn out to be true.

Proposition 4.5.1: Composition of smooth functions is smooth

Let f : M → N and g : N → P be two smooth functions and M, N, P
smooth manifolds. Then g ◦ f : M → P is also smooth.

Proposition 4.5.2: Smooth functions are continuous

Let f : M → N be a smooth function in the manifold sense. Then it is
continuous in the topological sense.

Since both of these propositions seem very natural and their proofs aren’t too
interesting we leave these out. You can show the first one easily by using charts,
similar to other proofs in this chapter, and the second by using that continuity is a
local property and charts are homeomorphisms.

4.6 Diffeomorphisms
Now that we have done a lot of technical detail, we want to talk about when (at
this stage) you cannot tell the difference between two manifolds, similar to how
two topological spaces are pretty much the same thing (equivalent, homeomorphic)

if there is a homeomorphism between them. The main point back then was that
we needed a bijective continuous map between the two spaces whose inverse is
also continuous. The idea was that
we have a structure on the space, and if the structure is the same between two
spaces, we cannot tell the difference with any tool we have that comes from these
structures. The only new structure we have at this stage is a smooth atlas, so you
are probably not too surprised that the bijection will need to be smooth and its
inverse as well. When we have such a map, we call it a diffeomorphism and the two
spaces are diffeomorphic.

Definition 4.6.1: diffeomorphism

Let f : M → N be a function. It is called a diffeomorphism if it satisfies the
following conditions.

• f is a bijection.
• f is a homeomorphism. If the two spaces are supposed to be "the
same space", then we shouldn’t be able to tell them apart by their
topologies, so this condition makes sense.
• f is smooth, and its inverse f −1 is also smooth. This is to make sure
that the maximal atlases are "pretty much the same" and we cannot
tell M and N apart by their smooth structure.

If such an f exists, we call M and N diffeomorphic.

There are a few things to note about the definition. Firstly, it forces dim(M ) =
dim(N ), which should make sense. It should not be possible for the circle and the
sphere to be the same thing, and they are not. Secondly, the second requirement is
not actually necessary: because charts are homeomorphisms, the second condition
follows from the other two. A diffeomorphism is automatically a homeomorphism
without the second condition; we just added it so that you could see very clearly
how a diffeomorphism respects the entire structure, not just the atlas.
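
The requirement that the inverse is smooth as well is not automatic. A standard example (added here by us, it is not in the lecture) is f (x) = x3 on R: it is a smooth homeomorphism, but its inverse, the cube root, is not differentiable at 0. The sketch below makes this visible through difference quotients.

```python
import math

def f(x):
    """A smooth bijection R -> R."""
    return x ** 3

def f_inverse(y):
    """The real cube root, the continuous inverse of f."""
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

def difference_quotient(g, x, h):
    """(g(x + h) - g(x)) / h, which tends to g'(x) if g is differentiable at x."""
    return (g(x + h) - g(x)) / h
```

The quotient difference_quotient(f_inverse, 0.0, h) behaves like h to the power −2/3 and grows without bound as h shrinks, so f −1 has no derivative at 0 and f is not a diffeomorphism of R onto itself.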

4.6.1 What you can get from diffeomorphisms


We want to just mention a few results that you can get from studying
diffeomorphisms, without going into too much detail.
The set of self-diffeomorphisms Diff(M ) of a manifold M forms a group, and
this group is huge, usually infinite-dimensional.
Starting in dimension four, not all topological manifolds possess a smooth
structure (a maximal atlas). Other topological manifolds have more than one maximal
atlas, and the resulting smooth manifolds are not diffeomorphic to each other. The last one might seem like
something that happens only in some weird edge cases, but it actually even happens
for R4 . (Result of Freedman/Donaldson in the 1980s.)
Chapter 5

Tangent Vectors

Now that we have defined what a smooth manifold is, we want to be able to do
something inside of it. It is all well and good to talk about diffeomorphisms and
charts, but we need some objects in the manifold to have interesting results about.
The obvious first candidate for something interesting on a smooth manifold is a
vector. After all, Rn , curves and surfaces all have vectors associated with them and
a lot of the nice results from the first part of the lecture had something to do with
vectors.
There is a small problem with just "lifting" the definition of vectors from Rn to
manifolds directly, without any thought. The problem is that a manifold does not, in
general, have a natural vector space structure. You can’t "add" points on a general
manifold in a very sensible way. What is the north pole plus the south pole on a
sphere which is not embedded in R3 ? This question does not even seem sensible,
and any addition you would define at this stage would seem very arbitrary and definitely
not natural. Now, there are many ways to look at the idea of what a vector in Rn
is. You have the obvious one, a vector is "an arrow", or the mathematical one of it
being an element of a set with an addition and scalar multiplication which obey the
following axioms...
But neither of these helps us right now. Of course, we always want to draw
vectors as arrows, and mathematically, the objects we want to work with should be
vectors in the vector-space sense, but these viewpoints are not helpful yet.
One idea is clear even at this stage. Whatever concept of a vector we will choose
to generalize, we will only generalize tangent vectors (for example of curves/sur-
faces). The reason is simply that all other vectors on curves and surfaces (normal
vectors) did not come from the curve/surface itself but the Rn that we embedded
the curve/surface in. So only tangent-vectors are appropriate if we don’t want to
have the influence of some ambient space that our manifold lives in.
There are four different ideas we can generalize out of Rn that end up being
equivalent.

• The first one is very simple. We treat vectors in charts. Vectors are vectors
in charts, where vectors make sense (since charts go to Rn ) and we can point


at a vector in a chart and look to another chart and through the transition
map check which vector it is there. This definition defines tangent-vectors
through equivalence classes of vectors on charts. We don’t know what they
are on M , but we can chart them.

• A bit more on the computational side, we can define vectors through
directional derivatives of smooth functions. We know that you can differentiate a
smooth function f : Rn → R in the direction of X and get

\[ D_X f = X^1 \frac{\partial f}{\partial x^1} + X^2 \frac{\partial f}{\partial x^2} + \dots + X^n \frac{\partial f}{\partial x^n} \tag{5.1} \]

This derivative contains the same information as what we would usually call
the vector, that is the quantities (X 1 , X 2 , . . . , X n ).

• More on the geometric side, we can use curves to define tangent vectors. We
can use the fact that from a geometric standpoint, a vector tangent to a
curve (multiplied with a small ϵ) looks like a small piece of the curve itself, in
the usual Rn cases. This way we define a tangent vector as a ”small piece of
a curve”. Since many curves have the same tangent vector, we will also use
equivalence classes here.

• On the more abstract side, we can define tangent vectors as linear operators
on smooth functions to R from the manifold, which satisfy the product rule.

X op (f g) = X op (f )g + f X op (g) (5.2)

These four ways are all equivalent to each other and can all be used to define
tangent vectors. They all represent different ways to think about tangent vectors
(charts, computation, geometric (curves), abstract) and this variety gives you a lot
of ways to tackle a mathematical problem. Sometimes the geometric picture will
be more applicable, sometimes the computational etc.
Of particular note is the second definition, which, because of its relationship
with partial derivatives, has fostered a notation in differential geometry which can
be confusing at first, but which one gets used to quite fast.
For our purposes, the first two definitions will be most useful, and we will take
the most time discussing them.

5.1 Tangent vectors from charts


The first possibility of defining tangent vectors on a manifold is by using charts.
We know how vectors work in Rn and can therefore use this knowledge to define
something resembling a vector on a manifold. We will recapitulate how vectors
behave with maps in Rn , and then go on to define them properly on manifolds.

5.1.1 Vectors and Maps in Rn


Imagine you have two copies of Rn (or open subsets) that are connected via a
(smooth) map f : Rn → Rn , which you can see as a coordinate change for the
underlying map. Then pick a point p and a vector X ∈ Tp Rn ≈ Rn . Where does
the vector X get mapped to? That is a simple idea from calculus. It gets mapped
to dfp (X), of course, because the Jacobian (dfp ) tells us how the map f behaves
locally, and a vector is a local thing (it is at the point p).
You can see the situation in figure 5.1.

Figure 5.1: A vector X in Rn gets mapped to dfp (X) when f is applied to
Rn .
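
The rule that X gets mapped to dfp (X) can be sketched numerically. The map f below (polar to Cartesian coordinates) is only an illustrative choice, and the Jacobian is approximated by central differences instead of being computed symbolically.

```python
import math

def f(x):
    """Polar to Cartesian coordinates: an illustrative smooth map."""
    r, phi = x
    return (r * math.cos(phi), r * math.sin(phi))

def jacobian(func, p, h=1e-6):
    """Approximate df_p by central differences; J[i][j] is dF^i/dx^j."""
    columns = []
    for j in range(len(p)):
        forward, backward = list(p), list(p)
        forward[j] += h
        backward[j] -= h
        f_plus, f_minus = func(forward), func(backward)
        columns.append([(a - b) / (2 * h) for a, b in zip(f_plus, f_minus)])
    # columns[j][i] holds dF^i/dx^j, so transpose to get the usual rows.
    return [list(row) for row in zip(*columns)]

def push_forward(func, p, X):
    """Apply the Jacobian of func at p to the vector X."""
    J = jacobian(func, p)
    return tuple(sum(J[i][j] * X[j] for j in range(len(X))) for i in range(len(J)))
```

For example, push_forward(f, (2.0, math.pi / 2), (0.0, 1.0)) moves the "increase the angle" vector at radius 2 to approximately (−2, 0), the velocity of a point running counter-clockwise through (0, 2).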

5.1.2 Back to manifolds


Now what can we say when working with manifolds? Well, we can use the situation
from the last section, by saying that the first Rn (or more precisely an open subset
U1 ⊂ Rn ) is the coordinate space for a chart Ch1 and the second one for Ch2 ,

which are connected by T1→2 . Then if you are working in the first chart, which
means you are working in the first Rn , you know what a vector is in the chart:
it’s just a tuple (X 1 , . . . , X n ) ∈ Rn situated at p, and you can work with it in the
chart. What happens when you use the second chart? Well, the coordinates of the
vector X, which previously lived on the first chart, will just be the Jacobian of the
transition map at p applied to X.

\[ (X'^1, \dots, X'^n) = dT_{1 \to 2}\bigl((X^1, \dots, X^n)\bigr) \tag{5.3} \]

or written out:

\[ X'^i = \sum_{j=1}^{n} \frac{\partial T^i}{\partial x^j} X^j \tag{5.4} \]

where we wrote T = (T 1 (x), . . . , T n (x)) instead of T1→2 , and when clear will
continue to do so.
By taking all other charts you can find which vector in the other charts X
corresponds to, and what you get is a working definition of a vector, without having
really talked about the manifold at all. Figure 5.2 might make this clearer.
Let’s say we have a plane flying over the point p with a physically real velocity
described by the vector X on the Mercator chart (square). We can describe the
path of the plane perfectly well in that chart without any reference to the actual
Earth. If we want to switch to a new chart, maybe because it is easier to see
something or represent our country as bigger than others, we can easily do that
with the transition map and bring the velocity vector to the new chart using the
Jacobian of the transition map. This is the idea of this definition: we take vectors
in charts and use them only in charts, really, with no mention of the manifold.
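
As a small worked example of this transformation rule (our own, not from the lecture), take the circle S 1 with the two stereographic charts. Both charts take values in R, and a short computation with the projection formulas gives the transition map T (u) = 1/u for u ≠ 0. A vector with component X in the first chart, sitting at the point with coordinate u, then has the following component in the second chart:

```latex
\[
T(u) = \frac{1}{u}, \qquad
X' = \frac{\partial T}{\partial u}(u)\, X = -\frac{X}{u^{2}} .
\]
```

So at u = 2 the vector X = 1 becomes X ′ = −1/4. The sign flip reflects that moving to the right in one chart means moving to the left in the other.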

5.1.3 The definition

We want to pack the idea of using vectors in charts into a working definition. The
way is quite simple. We say a vector is simply the collection of all vectors in charts
that transform into each other, and we can take one representative (in one chart)
as the example of the vector we talk about.

Figure 5.2: The Mercator and stereographic projection of the earth (/parts
of the earth). Imagine at a real point p on the earth, there is a plane flying
with a velocity described by the (blue) tangent vector X. We can talk of
its velocity vector as an arrow on both of these pictures (charts) without
making any reference to the actual manifold, that is the earth. We can
describe the path of the plane and all the information about it we would like
using only the charts and nothing else.

Definition 5.1.1: Tangent vector definition 1: Through charts

Our first definition is through charts. Let M be a manifold and p a point
on it, as always. A tangent vector X at p is defined as an equivalence class
of vectors in charts identified with each other through transition maps. Two
vectors X ′ , X ′′ in different charts Ch1 , Ch2 are equivalent in this sense if:
• The point p is covered by both charts, p ∈ U1 ∩ U2 .
• The second vector is just the Jacobian of the transition map acting
on the first: X ′′ = dT1→2,p (X ′ ).

We always have the point p be a part of the vector; a tangent vector never
exists without a point p it sits at.

It should not surprise you that the equivalence relation we presented is an
actual equivalence relation in the mathematical sense (reflexivity, symmetry,
transitivity). If you want, you can prove it; the proof is another one of those "chain
rule" proofs we see quite often.
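
For orientation, here is a sketch of how transitivity follows from the chain rule. We assume a third chart Ch3 around p with a vector X ′′′ in it; this notation is ours and only extends the notation of the definition.

```latex
\begin{align*}
T_{1 \to 3} &= T_{2 \to 3} \circ T_{1 \to 2}
  && \text{(on the overlap of all three charts)} \\
dT_{1 \to 3,\,p} &= dT_{2 \to 3,\,p} \circ dT_{1 \to 2,\,p}
  && \text{(chain rule)} \\
X''' &= dT_{2 \to 3,\,p}(X'')
      = dT_{2 \to 3,\,p}\bigl(dT_{1 \to 2,\,p}(X')\bigr)
      = dT_{1 \to 3,\,p}(X')
\end{align*}
```

Reflexivity follows in the same spirit because T1→1 is the identity, and symmetry because T2→1 is the inverse of T1→2 , so its Jacobian is the inverse matrix.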
You might wonder why we require the point p to always be included with the
vector. The reason is that we don’t really have a good way of saying that two vectors
sitting at two different points p, q are the same vector but shifted. In Rn , this is
simple. We can simply translate the arrows to get from one to another, and in this way
we have a very natural way of determining if two vectors are the same. But
have a look at the sphere (figure 5.3). This clearly doesn’t work here. Simply
translating, even in a way where locally the vectors look parallel¹, you can get back
to the original point and get a different vector! Transporting vectors on manifolds
is clearly not as simple as on Rn and requires more thought. In fact, this is the first
hint at curvature.
This is why we always carry the point around with the vector: we cannot
yet do anything with vectors between two different points.
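
The effect can be reproduced numerically. The sketch below is our own illustration and cheats a little: it uses the embedding of S 2 in R3 and approximates "keeping the vector locally parallel" by repeatedly projecting it onto the next tangent plane while keeping its length. The loop runs from the north pole down to the equator, a quarter of the way along the equator, and back up to the north pole.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def great_circle(a, b, steps):
    """Points on the quarter great circle from a to b (orthogonal unit vectors)."""
    angles = (0.5 * math.pi * k / steps for k in range(steps + 1))
    return [tuple(math.cos(t) * ai + math.sin(t) * bi for ai, bi in zip(a, b))
            for t in angles]

def transport(v, path):
    """Approximate parallel transport: project onto each new tangent plane, keep the length."""
    length = math.sqrt(sum(c * c for c in v))
    for point in path[1:]:
        dot = sum(vi * pc for vi, pc in zip(v, point))
        v = tuple(vi - dot * pc for vi, pc in zip(v, point))
        v = tuple(length * c for c in normalize(v))
    return v

north, a, b = (0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
steps = 1000
loop = (great_circle(north, a, steps)
        + great_circle(a, b, steps)[1:]
        + great_circle(b, north, steps)[1:])
start = (1.0, 0.0, 0.0)        # a tangent vector at the north pole
end = transport(start, loop)   # the transported vector, back at the north pole
```

The vector starts as (1, 0, 0) and comes back as (approximately) (0, 1, 0): it has been rotated by a quarter turn, even though it was never turned relative to the sphere along the way.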
We can also, along the lines of our discussion of surfaces, define the tangent
space, which we have already mentioned a few times in this chapter (specifically
the one of Rn ), but have not yet properly introduced. The definition is quite simple
and doesn’t require much talk.

Definition 5.1.2: Tangent-space

We define the tangent space Tp M at a point p of the manifold M to be the


set of all of its tangent vectors.

You really shouldn’t be too surprised that Tp M is a vector space of dimension n.
A heuristic argument would be that we have glued together all the tangent spaces
of Rn at the points Chi (p) for all the charts, so we are really just left with
one glued-together copy of the tangent space of Rn , which is just {p} × Rn , and all
the rules from there get copied over to Tp M .

¹ After all, the sphere locally looks like R2 , as you should know from living on the earth.

Figure 5.3: In Rn , you can transport vectors and compare them without
problems, we therefore don’t need to always say which point the vector is
situated at. But if we move onto the sphere, this is not the case anymore.
You can transport a vector on the sphere so that it locally stays parallel to
itself, and after performing a loop, come back and get a different vector!

5.2 Tangent vectors as directional derivatives

In the last section, we introduced tangent vectors as things that we know how to
describe in charts, but don’t really understand on the manifold. We know how to
map them, but we don’t know what they are. The definition in this section gives a more
computational view of things. We take inspiration from the fact that we can differentiate
functions along the direction of a vector in Rn .
We know that the derivative of a (smooth) function u : Rn → R in the direction
of X (vector) at p is:

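\begin{align}
D_X u(p) = Du(p)(X) &= \sum_{i=1}^{n} \frac{\partial u}{\partial x^i} X^i \tag{5.5} \\
&= X^1 \frac{\partial u}{\partial x^1} + X^2 \frac{\partial u}{\partial x^2} + \dots + X^n \frac{\partial u}{\partial x^n} \tag{5.6}
\end{align}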

We can use this. The components of X are very explicit in this equation. Notice also
that if you view this over all possible smooth functions, the directional derivatives
are operators on smooth functions. Even more so, every tangent vector produces
its own derivative operator, and two different tangent vectors produce two different
operators.
We can also rewrite the above equation using curves. If γ is a curve so that
γ(0) = p, γ ′ (0) = X, then:

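\begin{align}
\left.\frac{d\,u(\gamma(t))}{dt}\right|_{t=0} &= \gamma'(0)^1 \frac{\partial u}{\partial x^1} + \gamma'(0)^2 \frac{\partial u}{\partial x^2} + \dots + \gamma'(0)^n \frac{\partial u}{\partial x^n} \tag{5.8} \\
&= X^1 \frac{\partial u}{\partial x^1} + X^2 \frac{\partial u}{\partial x^2} + \dots + X^n \frac{\partial u}{\partial x^n} = D_X u(p) \tag{5.9}
\end{align}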

Again, notice that if we take all possible smooth functions, different vectors produce different operators. The last equation needs nothing beyond curves and smooth functions, so we can use it to define vectors on manifolds without any other structure. We just take the last equation as the definition.
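To make the curve picture concrete, here is a small numerical sketch in $\mathbb{R}^2$. The test function `u`, the point `p`, the vector `X` and the two curves are hypothetical choices made only for this illustration. It checks that differentiating $u$ along two different curves with the same position and velocity at $t = 0$ reproduces $D_X u(p)$ from equation (5.5), up to finite-difference error.

```python
import math

def u(x, y):
    # An arbitrary smooth test function on R^2 (hypothetical choice).
    return math.sin(x) * y + x * x

def derivative_along(curve, h=1e-5):
    # Central difference approximation of d/dt u(curve(t)) at t = 0.
    return (u(*curve(h)) - u(*curve(-h))) / (2 * h)

p = (0.5, -1.0)
X = (2.0, 3.0)

def straight(t):
    # The straight line through p with velocity X.
    return (p[0] + t * X[0], p[1] + t * X[1])

def wiggly(t):
    # A different curve with the same position and velocity at t = 0.
    return (p[0] + t * X[0] + t * t, p[1] + t * X[1] - 5 * t ** 3)

# D_X u(p) computed from the partial derivatives, as in equation (5.5).
du_dx = math.cos(p[0]) * p[1] + 2 * p[0]
du_dy = math.sin(p[0])
exact = X[0] * du_dx + X[1] * du_dy

print(exact, derivative_along(straight), derivative_along(wiggly))
```

All three printed numbers should agree to several digits: the operator only sees the curve to first order at $t = 0$.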

Definition 5.2.1: Tangent vector definition 2: Through derivatives

Let $p$ be a point on $M$. We can define a tangent vector as an operator:

$$X = (p, X^{op}) \tag{5.11}$$

where $X^{op}$ is a linear operator on the smooth functions on $M$, $C^\infty(M)$, which arises as the derivative of functions along a smooth curve:

\begin{align}
X^{op} &: C^\infty(M) \to \mathbb{R} \tag{5.12} \\
X^{op}(u) &= \frac{d u(\gamma(t))}{dt}\Big|_{t=0} \tag{5.13}
\end{align}

where $\gamma$ is some curve on $M$ with $\gamma(0) = p$, but always the same curve for all the functions $u$.

Notice how, as with the first definition, we use external objects (there charts, here smooth functions) to define things on $M$. This is quite common in differential geometry. From now on we will drop the op from $X^{op}$ and just call it $X$ as well, but always keep in mind that we need a point $p$ where the vector sits, for the same reason as with the first definition. As a reminder, we will also sometimes write $X_p$ instead of $X$.
We can define the tangent space again.

Definition 5.2.2: Tangent space 2

The tangent space $T_pM$ at $p$ according to the second definition is the set of all vectors $X = (p, X^{op})$ in the sense of the above definition.

Notice that this time $T_pM \subset \{p\} \times \mathrm{Hom}(C^\infty(M), \mathbb{R})$, which is a vector space. This time, however, it is not at all obvious that $T_pM$ is itself a vector space. For example, it is not clear that if $X, Y$ are in $T_pM$, then $X + Y$ is in it too, because we do not know how to add curves on $M$. In $\mathbb{R}^n$ this is obvious, but certainly not on a general manifold. Imagine even just the earth and two curves, for example, the routes of a plane flying from Zurich to London and from Zurich to New York. There is no reasonable way of adding them. What would that even mean? Would you end up in Greenland?
Where we can do this, however, is in charts. The one important point here is that the curve we get depends on the chart: depending on the chart, if you add the two curves from before, you can get anywhere from Greenland to Brazil. Locally at $p$, however, the resulting curves will be the same, in the sense that they will have the same tangent vector.
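The chart dependence of "adding curves" can be tried out numerically. The sketch below is only an illustration under stated assumptions: the manifold is a piece of the plane, the two charts are Cartesian and polar coordinates, and the curves `gamma1`, `gamma2` and the test function `u` are hypothetical choices. Since neither chart sends $p$ to the origin, the chart image of $p$ is subtracted once when adding.

```python
import math

def to_polar(x, y):
    # Second chart: polar coordinates (valid away from the origin).
    return (math.hypot(x, y), math.atan2(y, x))

def from_polar(r, theta):
    # Parametrization belonging to the polar chart.
    return (r * math.cos(theta), r * math.sin(theta))

p = (1.0, 1.0)

def gamma1(t):
    return (1.0 + t, 1.0)

def gamma2(t):
    return (1.0, 1.0 + 2.0 * t)

def sum_in_cartesian_chart(t):
    # Add the curves componentwise in the Cartesian chart.
    a, b = gamma1(t), gamma2(t)
    return (a[0] + b[0] - p[0], a[1] + b[1] - p[1])

def sum_in_polar_chart(t):
    # Add the curves componentwise in the polar chart, then map back.
    a, b, c = to_polar(*gamma1(t)), to_polar(*gamma2(t)), to_polar(*p)
    return from_polar(a[0] + b[0] - c[0], a[1] + b[1] - c[1])

def u(x, y):
    # Arbitrary smooth test function.
    return x * x * y + math.exp(y)

def operator(curve, h=1e-5):
    # The operator induced by a curve: d/dt u(curve(t)) at t = 0.
    return (u(*curve(h)) - u(*curve(-h))) / (2 * h)

# The two "sums" are different curves away from t = 0 ...
gap_at_half = math.dist(sum_in_cartesian_chart(0.5), sum_in_polar_chart(0.5))
# ... but they induce the same operator at t = 0.
operator_gap = abs(operator(sum_in_cartesian_chart) - operator(sum_in_polar_chart))
print(gap_at_half, operator_gap)
```

The first number is clearly nonzero, while the second vanishes up to finite-difference error. Checking one test function is of course not a proof, only a plausibility check.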
We will not prove that this version of $T_pM$ satisfies all the axioms of a vector space. Rather, and more interestingly, we will show that it is a whole vector space in the sense that it has a basis and that this basis spans exactly $T_pM$, leaving none of its vectors out and spanning no non-vector entity².

Figure 5.4: We can define tangent vectors by differentiating smooth functions along curves.

Figure 5.5: We don’t really know how to add curves sensibly on a manifold directly. We can do it in charts, but depending on which chart we use, the resulting curves will look different. Here, in this example of the earth, the addition of the two paths of our plane lands you in Cuba if you use the Mercator projection, or somewhere in Antarctica if you use the stereographic projection. That is not an ambiguity we want if we don’t want our vacation ruined. Notice, however, that both curves result in the same tangent vector.

5.2.1 The basis of $T_pM$ with the second definition


It should be clear that there is no natural (canonical) basis of $T_pM$: we don’t have preferred directions in any way, at least not with the structures we have now. So we are forced to use charts. Let us fix a point $p$ and a chart $Ch$ whose domain contains $p$.

² That is, no linear operator which does not come from a curve.
We can construct a basis of $T_pM$ by using the basis of $\mathbb{R}^n$. For simplicity, let us say that $p$ gets mapped to $(0, 0, \dots, 0)$. Then we have the standard basis vectors of $\mathbb{R}^n$ at the origin ($= \tilde p = Ch(p)$), which we can call $e_1, e_2, \dots, e_n$. Which curves could we use to construct the basis of $T_pM$? Well, the coordinate lines of course! The coordinate lines are defined as follows:
$$\tilde\beta_i(t) = t e_i \tag{5.14}$$
where $\tilde\beta_i$ is the $i$-th coordinate line. Then we can project them back onto the manifold, $\beta_i(t) = P_{Ch}(\tilde\beta_i(t))$, as you can see in Figure 5.6.
We can then define the $i$-th basis vector of $T_pM$ as the one that one gets from the $i$-th coordinate axis, projected back onto the manifold. In the coordinate space, this vector has the components $X = e_i$, so it simply belongs to the operator:
$$X^1 \frac{\partial}{\partial x^1} + X^2 \frac{\partial}{\partial x^2} + \dots + X^n \frac{\partial}{\partial x^n} = \frac{\partial}{\partial x^i} \tag{5.15}$$
This motivates a new notation for this basis. We can now write:
$$\left(\frac{\partial}{\partial x^i}\right)^{op}_{p,Ch} = \text{the vector obtained from the $i$-th coordinate line} \tag{5.16}$$
where we remind ourselves that it sits at $p$, that it definitely comes from $Ch$, and that it is an operator. (In future, we will leave out all these little reminders.)
We can turn this into a definition.

Definition 5.2.3: Standard basis of Tp M for a chart Ch

We define the standard basis of $T_pM$ with respect to a chart $Ch$ as:

$$\left(\frac{\partial}{\partial x^i}\right)^{op}_{p,Ch}(u) = \frac{d}{dt}\Big|_{t=0} u(\beta_i(t)) \tag{5.17}$$

where $\beta_i$ is the $i$-th coordinate line of the coordinate space $Ch(U)$, projected back onto the manifold.
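As a hedged illustration of this definition, take the punctured plane with the polar chart $(r, \theta)$ as a hypothetical example. The coordinate lines through $\tilde p = (r_0, \theta_0)$ are $\tilde p + t e_i$ (we do not assume $Ch(p) = 0$ here), and projecting them back gives radial lines and circles. The sketch compares the operators from (5.17) with the familiar formulas $\partial_r = \cos\theta\,\partial_x + \sin\theta\,\partial_y$ and $\partial_\theta = -r\sin\theta\,\partial_x + r\cos\theta\,\partial_y$; the function `u` and the point are arbitrary choices.

```python
import math

def parametrization(r, theta):
    # Inverse of the polar chart: coordinates -> point of the plane.
    return (r * math.cos(theta), r * math.sin(theta))

def u(x, y):
    # Arbitrary smooth test function on the manifold.
    return x * x * y

p_tilde = (2.0, math.pi / 6)  # chart image of the point p

def basis_vector(i, func, h=1e-5):
    # (d/dx^i)_{p,Ch} applied to func, following (5.17):
    # differentiate func along the i-th coordinate line projected back to M.
    def beta(t):
        coords = list(p_tilde)
        coords[i] += t
        return parametrization(*coords)
    return (func(*beta(h)) - func(*beta(-h))) / (2 * h)

# Comparison values from the Cartesian partial derivatives of u.
r0, th0 = p_tilde
x0, y0 = parametrization(r0, th0)
ux, uy = 2 * x0 * y0, x0 * x0
d_r_exact = math.cos(th0) * ux + math.sin(th0) * uy
d_theta_exact = -r0 * math.sin(th0) * ux + r0 * math.cos(th0) * uy

print(basis_vector(0, u), d_r_exact)
print(basis_vector(1, u), d_theta_exact)
```

Each printed pair should agree up to finite-difference error.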

What is left to do now is to prove that this is a basis. For this, we would first like to find the coefficients of our vectors, which makes the proof a bit easier.

5.2.2 The coefficients of a vector


If we ever want to compute with vectors, we need their coefficients. We need some tuples to work with. Getting to the coefficients is not hard, however: we just need a chart $Ch$ and the standard basis of $T_pM$ induced by that chart. First, we can write $X$ as:
$$X \cdot u = \frac{d u(\gamma(t))}{dt}\Big|_{t=0} \tag{5.18}$$
for some curve $\gamma$ which is appropriate for $X$. Now, $t \mapsto u(\gamma(t))$ is a map from $\mathbb{R}$ to $\mathbb{R}$, going over the manifold. We can eliminate the manifold by going to the coordinate space and back with a chart and a parametrization:

$$X \cdot u = \frac{d}{dt}\Big|_{t=0} (u \circ Ch^{-1}) \circ (Ch \circ \gamma)(t) \tag{5.19}$$

The first factor (from the right), $Ch \circ \gamma$, is simply the curve $\gamma$ drawn in the coordinate space instead of on the manifold, which we will call $\tilde\gamma$. The second one, $u \circ Ch^{-1}$, is simply the function $u$ as a function of the coordinates instead of the points on $M$, which we will call $\tilde u$. We can now use the chain rule:

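\begin{align}
X^{op} \cdot u &= \frac{d}{dt}\Big|_{t=0} \tilde u \circ \tilde\gamma(t) \tag{5.20} \\
&= \sum_{i=1}^{n} \frac{\partial \tilde u}{\partial x^i} \frac{d\tilde\gamma^i}{dt}(0) = \sum_{i=1}^{n} \frac{d\tilde\gamma^i}{dt} \frac{\partial \tilde u}{\partial x^i} \tag{5.21} \\
&= \sum_{i=1}^{n} \frac{d\tilde\gamma^i}{dt} \left(\frac{\partial}{\partial x^i}\right)^{op}(u) \tag{5.22}
\end{align}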

where we have used the fact that $\frac{\partial \tilde u}{\partial x^i}$ is simply the derivative of $u$ in the $i$-th direction in coordinate space, which allowed us to introduce the operator in the last line. We have clearly found the coefficients of $X$, and they simply turn out to be the components of the tangent vector of $\gamma$ in coordinate space, which should show that what we are doing is not nonsense. What else could they even have been?
We can summarize this in a box.

Definition 5.2.4: Coefficients of vector in standard basis

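Let $X$ be a vector (in the sense of the second definition) at $p$ on $M$ and $\gamma$ a curve that $X$ belongs to, so that $X \cdot u = \frac{d u(\gamma(t))}{dt}\big|_{t=0}$. Then the coefficients of $X$ are:

$$(X^1, \dots, X^n) = \left(\frac{d\tilde\gamma^1}{dt}(0), \dots, \frac{d\tilde\gamma^n}{dt}(0)\right) \tag{5.23}$$
where $\tilde\gamma = Ch \circ \gamma$ is the curve $\gamma$ in coordinate space instead of on the manifold.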
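The box above can be checked numerically. The following sketch assumes, purely for illustration, the polar chart on a piece of the plane together with a hypothetical curve `gamma` and test function `u`. It computes the coefficients (5.23) by finite differences and then compares $X \cdot u$ computed directly from the curve with the expansion $\sum_i X^i\, \partial \tilde u / \partial x^i$ from (5.21).

```python
import math

def chart(x, y):
    # Polar chart Ch on a piece of the plane away from the origin.
    return (math.hypot(x, y), math.atan2(y, x))

def parametrization(r, theta):
    # Ch^{-1}
    return (r * math.cos(theta), r * math.sin(theta))

def gamma(t):
    # A hypothetical curve on M with gamma(0) = p = (1, 1).
    return (1.0 + t, 1.0 + 2.0 * t + t * t)

def u(x, y):
    # Arbitrary smooth test function.
    return math.sin(x * y) + y

H = 1e-5

def ddt0(g):
    # Central difference approximation of g'(0).
    return (g(H) - g(-H)) / (2 * H)

# Coefficients of X: velocity of gamma_tilde = Ch o gamma, equation (5.23).
coefficients = [ddt0(lambda t, i=i: chart(*gamma(t))[i]) for i in range(2)]

p_tilde = chart(*gamma(0.0))

def partial_u_tilde(i):
    # Partial derivative of u_tilde = u o Ch^{-1} in the i-th coordinate at p_tilde.
    def shifted(t):
        coords = list(p_tilde)
        coords[i] += t
        return u(*parametrization(*coords))
    return ddt0(shifted)

direct = ddt0(lambda t: u(*gamma(t)))                                      # (5.18)
expanded = sum(coefficients[i] * partial_u_tilde(i) for i in range(2))   # (5.21)
print(direct, expanded)
```

The two printed values should agree up to finite-difference error.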

 

5.2.3 Proving that $\left(\frac{\partial}{\partial x^i}\right)_{i=1,\dots,n}$ is a basis


We now want to show that $\left(\frac{\partial}{\partial x^i}\right)_{i=1,\dots,n}$ is a basis. This means we need to show three things: firstly, that all vectors in $T_pM$ are spanned by the basis; secondly, that all vectors spanned by the basis are in $T_pM$; and thirdly, that the basis is linearly independent.

Figure 5.6: We can use the coordinate lines from a chart Ch to define our
basis-vectors of Tp M , by projecting them onto the manifold.



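Proposition 5.2.1: The vectors $\left(\frac{\partial}{\partial x^i}\right)_{i=1,\dots,n}$ form a basis of $T_pM$

The vectors $\left(\frac{\partial}{\partial x^i}\right)_{i=1,\dots,n}$ form a basis of $T_pM$, that is:

• $T_pM \subset \mathrm{span}\left(\frac{\partial}{\partial x^1}, \dots, \frac{\partial}{\partial x^n}\right)$

• $\mathrm{span}\left(\frac{\partial}{\partial x^1}, \dots, \frac{\partial}{\partial x^n}\right) \subset T_pM$

• The basis is linearly independent.

We note that $\dim(T_pM) = n = \dim(M)$, which is not too surprising since $T_pM$ consists of all the directions in which you can vary (differentiate), locally, on $M$.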

We have already shown the first claim in the last section, when we found the coefficients of a general vector in $T_pM$, because we wrote $X$ out as a linear combination of this basis. So we only have the second and third claims to prove.
The proof of the second claim is easy to understand. We need to show that any linear combination $(X^1, \dots, X^n) = \sum_{i=1}^{n} X^i \frac{\partial}{\partial x^i}$ is in $T_pM$, which means we need to find a curve that generates these coefficients. But the curve (in coordinate space) $\tilde\gamma(t) = t(X^1, \dots, X^n)$ will do³. Figure 5.7 should make this almost obvious.

Figure 5.7: We can use the curve that just ”extends” X in coordinate space
for our proof

The only thing left to show is that the basis is linearly independent, which we leave to you as an exercise. It’s not too hard.

³ Here, we are still using the simplification mentioned above, namely that $Ch(p) = 0$. If you don’t want this, the curve would just be $\tilde p + t(X^1, \dots, X^n)$.



5.3 Tangent vectors from curves


The third definition we introduce is, intuitively, that tangent vectors are ”small” pieces of curves. You can understand this geometrically quite easily; have a look at Figure 5.8, for example.

Figure 5.8: We can see vectors as small pieces of curves, locally, if we think
of them geometrically.

Mathematically, however, it is not as simple as that. Specifically, which part of a curve we call small or large is not something clear, so we need to attack this with a different approach. Instead of finding some measure of small/local and having a headache with that, we can do the exact opposite. Let’s take all the curves through $p$, no matter their ”length”, which we have not even defined yet. That definitely leads to the problem of us having too many curves: there will be different curves that locally look the same. How can two curves locally look the same? Well, we can recycle definition two, and say that they locally look the same if they induce the same differential operator $X^{op}$. With that, we can get the ”small” part of the curve by using an equivalence relation, which tears away anything that is non-local.

Definition 5.3.1: Tangent vector definition 3: Through curves

Let $\gamma$ and $\beta$ be two curves on $M$, so that $\gamma(0) = \beta(0) = p$. Then we call these equivalent if they induce the same operator, as per the second definition of tangent vectors:
$$\gamma \sim \beta \iff X^{op}_\gamma = X^{op}_\beta \tag{5.24}$$
Then we can define vectors as the equivalence classes of this equivalence relation.

This definition is a lot more geometric and is useful to think about in very pictorial settings. You can see the equivalence relation in Figure 5.9.

Figure 5.9: A lot of curves, even very wild ones, are equivalent in our definition. These wild behaviours are cut off by the equivalence relation, so we have a way of talking about ”small” parts of the curve.

5.4 Tangent vectors as operators that satisfy the product rule
There is one last definition of tangent vectors, which we want to mention but not go into in much detail. This one is much more on the abstract side compared to the last few, and we only really want to show that it exists. The idea we generalize is the Leibniz or product rule, which should be very familiar to you. For first derivatives, in general, an equation of this form is correct:

$$D(uv) = uD(v) + D(u)v \tag{5.25}$$

for some functions $u, v$. It turns out that, like the directional derivative, we can also generalize this. We can take the space of all linear maps that take smooth functions on $M$ as the input and output a real number, $\mathrm{Hom}(C^\infty(M), \mathbb{R})$, which contains things like tangent vectors (as derivatives), but also things like multiplying functions by the number two (followed by evaluation at $p$, so that the output is a number), which is clearly not a derivative operator.
We can then notice that derivative operators should satisfy the product rule, while non-derivative operators do not. For example, with multiplying by two, in general, we have:
$$2 \cdot (uv) \neq u(2v) + v(2u) = 4uv \tag{5.26}$$
With this we get to our definition:

Definition 5.4.1: Tangent vector definition 4: The Leibniz-rule

Let $X^{op}$ be an element of $\mathrm{Hom}(C^\infty(M), \mathbb{R})$. We call $X = (p, X^{op})$ a tangent vector at $p$ if $X^{op}$ satisfies the Leibniz rule at $p$, that is, if:

$$X^{op}(uv) = X^{op}(u)v(p) + u(p)X^{op}(v) \tag{5.27}$$

for all $u, v \in C^\infty(M)$.

It is quite tricky to work with this definition, because we don’t have charts or derivatives in it, so one needs to be very algebraic and it all quickly turns into a tricky, indigestible mess, so we won’t pursue this further.
What we do want to mention is that the vectors from the previous definitions satisfy the Leibniz rule, so you can use it comfortably. The proof is similar to what we did before, using lots of chain rules, working in charts and applying the $\mathbb{R}^n$ Leibniz rule.
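A quick numerical sanity check of the Leibniz rule (5.27), as a sketch under assumptions: we work in the plane, the curve `gamma` and the functions `u`, `v` are hypothetical choices, and "multiplying by two" is read as the linear map $u \mapsto 2u(p)$, which is one way to make it land in $\mathbb{R}$.

```python
import math

p = (0.3, 0.7)

def gamma(t):
    # A hypothetical curve through p at t = 0.
    return (p[0] + t, p[1] - 2.0 * t + t * t)

def X(func, h=1e-5):
    # Derivative operator induced by gamma (second definition).
    return (func(*gamma(h)) - func(*gamma(-h))) / (2 * h)

def double_at_p(func):
    # A linear map C^infty(M) -> R that is not a derivative: u -> 2 u(p).
    return 2.0 * func(*p)

def u(x, y):
    return math.exp(x) * y

def v(x, y):
    return x + math.cos(y)

def uv(x, y):
    return u(x, y) * v(x, y)

def leibniz_defect(op):
    # op(uv) - (op(u) v(p) + u(p) op(v)); zero exactly when (5.27) holds for u, v.
    return op(uv) - (op(u) * v(*p) + u(*p) * op(v))

print(leibniz_defect(X))            # close to zero
print(leibniz_defect(double_at_p))  # equals -2 u(p) v(p), not zero
```

The derivative operator passes the test up to finite-difference error, while the "times two" map fails it by $2uv - 4uv = -2uv$ evaluated at $p$, in line with (5.26).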

5.5 Change of coordinates


We want to see what happens to tangent vectors when you change from one chart to another. This is quite useful, since often one thing might be easy to calculate in one chart, while another thing that uses the first might be easier to calculate in another chart. You are probably familiar with the situation from calculus or physics, where, for example, something is often easier in polar coordinates, and sometimes in Cartesian ones. We want to generalize this to any coordinates.
More specifically, we want to look at how tangent vectors transform. For this, fix $M$, $p$, two charts $Ch_1, Ch_2$ and a tangent vector $X$:
$$X = \sum_{i=1}^{n} X^i \left(\frac{\partial}{\partial x^i}\right)_{Ch_1} \tag{5.28}$$

where $X^1, \dots, X^n$ are the components of $X$ in the basis of the first chart. What happens if we change the charts? That is, what are the components in the other chart, $Ch_2$?
We can start by answering first how we can express the basis vectors of the old chart in terms of the basis vectors of the new chart.
We do this by unravelling the definition of the differential operator $\left(\frac{\partial}{\partial x^i}\right)_{Ch_1}$. We know that:
$$\left(\frac{\partial}{\partial x^i}\right)_{Ch_1} \cdot u = \frac{\partial}{\partial x^i}(u \circ Ch_1^{-1}) \tag{5.29}$$
in the first coordinate space. But we know how to transform coordinates. We call the coordinates in the second chart $y^1, \dots, y^n$. We can change to the second chart by inserting an identity, that is, by going to the second coordinate space and back, and then using the $\mathbb{R}^n$ chain rule:
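\begin{align}
\left(\frac{\partial}{\partial x^i}\right)_{Ch_1} \cdot u &= \frac{\partial}{\partial x^i}(u \circ Ch_1^{-1}) \tag{5.30} \\
&= \frac{\partial}{\partial x^i}\left(u \circ (Ch_2^{-1} \circ Ch_2) \circ Ch_1^{-1}\right) \tag{5.31} \\
&= \frac{\partial}{\partial x^i}\left((u \circ Ch_2^{-1}) \circ (Ch_2 \circ Ch_1^{-1})\right) \tag{5.32} \\
&= \sum_{j=1}^{n} \frac{\partial (u \circ Ch_2^{-1})}{\partial y^j} \frac{\partial (Ch_2 \circ Ch_1^{-1})^j}{\partial x^i} \tag{5.33} \\
&= \sum_{j=1}^{n} \frac{\partial (Ch_2 \circ Ch_1^{-1})^j}{\partial x^i} \left(\frac{\partial}{\partial y^j}\right)_{Ch_2} \cdot u \tag{5.34}
\end{align}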

We have found how the basis vectors of one chart are expressed through the basis vectors of another. We can thus see how a vector $X$, which is of course nothing more than a linear combination of the basis vectors in the first chart, transforms.
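\begin{align}
X = \sum_{i=1}^{n} X^i \left(\frac{\partial}{\partial x^i}\right)_{Ch_1} &= \sum_{i=1}^{n} \sum_{j=1}^{n} X^i \frac{\partial (Ch_2 \circ Ch_1^{-1})^j}{\partial x^i} \left(\frac{\partial}{\partial y^j}\right)_{Ch_2} \tag{5.35} \\
&= \sum_{j=1}^{n} \left( \sum_{i=1}^{n} X^i \frac{\partial (Ch_2 \circ Ch_1^{-1})^j}{\partial x^i} \right) \left(\frac{\partial}{\partial y^j}\right)_{Ch_2} \tag{5.36}
\end{align}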

So we have found how the coefficients change. If we call the coefficients in the second basis (in the second chart) $\hat X^1, \dots, \hat X^n$, then we can write them as:
$$\hat X^j = \sum_{i=1}^{n} \frac{\partial (Ch_2 \circ Ch_1^{-1})^j}{\partial x^i} X^i \tag{5.38}$$

Note that this is just a matrix multiplication, and we can write it more simply using our transformation notation, since $Ch_2 \circ Ch_1^{-1} = T_{1\to 2}$:

$$X_{Ch_2} = dT_{1\to 2}\, X_{Ch_1} \tag{5.39}$$

where $X_{Ch_2}$ is the vector $X$ as a column in the second basis, and $X_{Ch_1}$ is the same in the first basis.
At this point, we want to introduce a new notation, which is very common in much of the differential geometry literature, and understandably so. We can think of the coordinates of the first chart as functions:

$$x^1(p), x^2(p), \dots, x^n(p) \tag{5.40}$$

which assign a real number to every point on the covered subset of the manifold. That is, $x^i : U_1 \subset M \to \mathbb{R}$.
We do the same thing with the other chart and call its coordinates

$$y^1(p), y^2(p), \dots, y^n(p) \tag{5.41}$$

We can then do the above calculations with these as actual partials. But what is the transition map, then? Well, as you can imagine, it takes the coordinates of a point in the first chart and spits out the corresponding coordinates in the second chart:
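$$T = \begin{pmatrix} T^1(x^1, \dots, x^n) \\ T^2(x^1, \dots, x^n) \\ \vdots \\ T^n(x^1, \dots, x^n) \end{pmatrix} = \begin{pmatrix} y^1(x^1, \dots, x^n) \\ y^2(x^1, \dots, x^n) \\ \vdots \\ y^n(x^1, \dots, x^n) \end{pmatrix} \tag{5.42}$$

so we can write:
$$\frac{\partial y^j}{\partial x^i} \quad \text{for} \quad \frac{\partial T_{1\to 2}^j}{\partial x^i} = \frac{\partial (Ch_2 \circ Ch_1^{-1})^j}{\partial x^i} \tag{5.43}$$
and the formula as:
$$X^j_{Ch_2} = \sum_{i=1}^{n} \frac{\partial y^j}{\partial x^i} X^i_{Ch_1} \tag{5.44}$$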

which looks very similar to the chain rule for real functions. Note, however, that there is a bit of an abuse of notation here, and that it is certainly not mathematically rigorous to apply the chain rule to these symbols without thought; there is more behind them.
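The transformation rule (5.44) can be tried on the Cartesian-to-polar example mentioned at the start of this section. This is only a sketch under assumptions: $Ch_1$ is taken to be the Cartesian chart and $Ch_2$ the polar chart on a piece of the plane, and the point `p1` and components `X1` are hypothetical values. The Jacobian entries $\partial y^j / \partial x^i$ are the standard ones for $r = \sqrt{x^2 + y^2}$, $\theta = \operatorname{atan2}(y, x)$.

```python
import math

def transition(x, y):
    # T_{1->2} = Ch2 o Ch1^{-1}: Cartesian coordinates -> polar coordinates.
    return (math.hypot(x, y), math.atan2(y, x))

def jacobian(x, y):
    # Matrix of partials dy^j/dx^i of the transition map, rows j = (r, theta).
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def transform(X, point):
    # Equation (5.44): components in chart 2 from components in chart 1.
    J = jacobian(*point)
    return [sum(J[j][i] * X[i] for i in range(2)) for j in range(2)]

p1 = (3.0, 4.0)   # coordinates of p in the Cartesian chart
X1 = (1.0, 2.0)   # components of X in the Cartesian basis
X2 = transform(X1, p1)

# Cross-check with the curve picture: push the straight line p1 + t X1
# through the transition map and differentiate at t = 0.
h = 1e-6
plus = transition(p1[0] + h * X1[0], p1[1] + h * X1[1])
minus = transition(p1[0] - h * X1[0], p1[1] - h * X1[1])
X2_numeric = [(plus[j] - minus[j]) / (2 * h) for j in range(2)]

print(X2, X2_numeric)  # both approximately (2.2, 0.08)
```

Here the vector $(1, 2)$ at the point $(3, 4)$ has a radial component of about $2.2$ and an angular component of about $0.08$ in the polar basis.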

5.6 Differentiation of a function between manifolds

Finally, we want to address the question of a proper derivative of a function between two manifolds. We have used many derivatives up till now, but they were all either inside a chart or of a parametrization of a curve.

Let us now go to the case where we have two manifolds $M, N$ and a smooth function $f : M \to N$ between them. What is the derivative of $f$, $df(p)$? We expect it to be similar to a Jacobian in $\mathbb{R}^n$, taking tangent vectors to tangent vectors.

Figure 5.10: You can see two manifolds, M and N and a function between
them. We want a derivative df , that is a function that takes tangent vectors
of M (X) to tangent vectors of N (Y ).

How can we do this? The geometric idea is quite simple. We know $f$ and want to know what happens with a vector $X \in T_pM$. That vector is created by some curve on $M$, call it $\alpha$, through the second definition. We know what happens with the curve if we apply $f$: it gets mapped to some other curve $\beta = f \circ \alpha$ on $N$. But then the vector associated with $\beta$ at $\tilde p = f(p)$ should be the vector the derivative maps $X$ to!

Definition 5.6.1: The derivative df (p) of a function at p

Let $f : M \to N$ be a smooth function and $p \in M$ a fixed point. Then we define the derivative of $f$ at the point $p$, $df(p)$, as a function from the tangent space at $p$, $T_pM$, to the corresponding tangent space of $N$ at $\tilde p = f(p)$, $T_{\tilde p}N$, by the following relationship. If $X \in T_pM$ is created by the curve $\alpha$, that is, if for all $u \in C^\infty(M)$:

$$X \cdot u = \frac{d}{dt}\Big|_{0} u(\alpha(t)) \tag{5.45}$$

then
$$Y = df(p)(X) \in T_{\tilde p}N \tag{5.46}$$
is defined by:
$$Y \cdot v = \frac{d}{dt}\Big|_{0} v(\beta(t)) \tag{5.47}$$
for all $v \in C^\infty(N)$, where $\beta$ is the curve $\beta = f \circ \alpha$.

Notice that nowhere in the definition did we use any charts whatsoever! This is a purely geometric, chart-independent object.
There are, however, a few things we need to make sure of if this definition is to make sense.

• Firstly and most obviously, $df(p)(X)$ needs to be independent of $\alpha$; it needs to be well defined.
• Secondly, since it is a derivative, we want it to be linear.

The second claim is easy to prove if you do it in charts, so we leave it to you as an exercise.
The first one is a bit more interesting and introduces a new idea.
Let us see what happens if we evaluate $Y \cdot v$ for some $v \in C^\infty(N)$ and a $Y$ obtained from the definition for a particular choice of $\alpha$.

\begin{align}
Y \cdot v &= \frac{d}{dt}\Big|_{0} v(\beta(t)) \tag{5.48} \\
&= \frac{d}{dt}\Big|_{0} v(f(\alpha(t))) \tag{5.49} \\
&= X \cdot (v \circ f) \tag{5.50}
\end{align}

So $Y \cdot v = X \cdot (v \circ f)$, the right side of which depends only on the operator $X$ and not on the particular curve $\alpha$ that creates it, which means that $Y$ does not either.
The new idea we get is that:

$$Y \cdot v = X \cdot (v \circ f) \tag{5.51}$$

which we will call proposition X from now on, whenever we use it, which will be in the next section when we use it to prove the chain rule for functions between manifolds.
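The independence of the curve can also be observed numerically. In the following sketch, everything concrete is a hypothetical choice for illustration: $M = \mathbb{R}^2$, $N = \mathbb{R}^3$, the map `f`, the test function `v`, and two different curves `alpha1`, `alpha2` that share position and velocity at $t = 0$ and hence create the same $X$.

```python
import math

p = (0.5, -0.2)

def f(x, y):
    # A hypothetical smooth map M = R^2 -> N = R^3.
    return (x * y, x + math.sin(y), x * x)

def v(a, b, c):
    # Arbitrary smooth test function on N.
    return a * b + math.exp(c)

def alpha1(t):
    return (p[0] + t, p[1] + 3.0 * t)

def alpha2(t):
    # Different curve, same position and velocity at t = 0.
    return (p[0] + math.sin(t), p[1] + 3.0 * t + 4.0 * t * t)

def ddt0(g, h=1e-5):
    # Central difference approximation of g'(0).
    return (g(h) - g(-h)) / (2 * h)

def Y_dot_v(alpha):
    # Equation (5.47): differentiate v along beta = f o alpha.
    return ddt0(lambda t: v(*f(*alpha(t))))

def v_after_f(x, y):
    # The pulled-back function v o f on M.
    return v(*f(x, y))

def X_dot(func, alpha):
    # Equation (5.45): differentiate a function on M along alpha.
    return ddt0(lambda t: func(*alpha(t)))

print(Y_dot_v(alpha1), Y_dot_v(alpha2), X_dot(v_after_f, alpha2))
```

All three numbers should agree up to finite-difference error: the first two illustrate that $Y$ does not depend on the chosen curve, the third is the right side of (5.51).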

5.7 The chain rule


To remind you, if you have $\mathbb{R}^m, \mathbb{R}^n, \mathbb{R}^p$ and $f : \mathbb{R}^m \to \mathbb{R}^n$ and $g : \mathbb{R}^n \to \mathbb{R}^p$, both smooth, then the chain rule says that:
$$d(g \circ f)_p = dg_{f(p)} \circ df_p \tag{5.52}$$
where $d$ is the Jacobian.
We will show that this result is true in the case of manifolds as well.

Theorem 5.7.1: The chain rule


Let $M, N, P$ be manifolds and $f : M \to N$ and $g : N \to P$ be smooth functions. Then for $p \in M$ the chain rule holds:

$$d(g \circ f)_p = dg_{f(p)} \circ df_p \tag{5.53}$$

You can show this in two ways. You can either express this whole thing in coordinates and ”inherit” the chain rule from $\mathbb{R}^n$, or you can do it directly and abstractly. Neither is better, but since many of our proofs until now have been leaning more toward the first type, we will do it with the second method instead.
Proof. We need a bit of setup since we have a lot of players in this proof. We have the manifolds, the maps, the vectors and the general smooth functions we need for the vectors to act on. We show all the players in Figure 5.11.
Our strategy is to write each of the parts of the chain rule equation using proposition X and then collect them together. For $X \in T_pM$, $Y \in T_qN$, $v \in C^\infty(N)$ and $w \in C^\infty(P)$:
\begin{align}
df(p)(X) \cdot v &= X \cdot (v \circ f) \tag{5.54} \\
dg(q)(Y) \cdot w &= Y \cdot (w \circ g) \tag{5.55} \\
d(g \circ f)(p)(X) \cdot w &= X \cdot (w \circ (g \circ f)) \tag{5.56}
\end{align}
We can now set $Y = df(p)(X)$ and $q = f(p)$ and insert into the right side of the chain rule, using (5.55) with this $Y$ and then (5.54) with $v = w \circ g$:
\begin{align}
dg(f(p))(df(p)(X)) \cdot w &= df(p)(X) \cdot (w \circ g) \tag{5.57} \\
&= X \cdot ((w \circ g) \circ f) \tag{5.58} \\
&= d(g \circ f)(p)(X) \cdot w \tag{5.59}
\end{align}
So we get the desired equation:
$$d(g \circ f)_p = dg_{f(p)} \circ df_p \tag{5.60}$$
since $w$, $X$ and $p$ were all general.

Figure 5.11: You can see all the actors we need in the proof in this figure.

5.8 The coordinate expression for df (p)


Until now we have been on a completely abstract level, not using coordinates whatsoever. We now want to complete our discussion of the derivative of a function by deriving a coordinate expression.
Let us say we have two manifolds again, $M$ of dimension $m$ and $N$ of dimension $n$, a general vector $X$ in $T_pM$, and two charts $Ch_M$ and $Ch_N$ which map at least the relevant parts of $M$ and $N$. What is the coordinate representation of $df(p) : T_pM \to T_qN$, where $q = f(p)$?
Let us write $X = (X^1, \dots, X^m)$ and $Y = df(p)(X) = (Y^1, \dots, Y^n)$ for the vectors $X$ and $Y$ in the standard bases at $p$ and $q$ of the charts $Ch_M$ and $Ch_N$, respectively.
We need to compute $Y = df(p)(X)$ in the coordinates. Let’s start by writing out the definition and using proposition X (for $v \in C^\infty(N)$):

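\begin{align}
Y \cdot v &= X \cdot (v \circ f) \tag{5.61} \\
&= \sum_{i=1}^{m} X^i \left(\frac{\partial}{\partial x^i}\right)_{p,Ch_M} (v \circ f) \tag{5.62} \\
&= \sum_{i=1}^{m} X^i \frac{\partial}{\partial x^i}\Big|_{\tilde p,Ch_M} (v \circ f \circ Ch_M^{-1}) \tag{5.63} \\
&= \sum_{i=1}^{m} X^i \frac{\partial}{\partial x^i}\Big|_{\tilde p,Ch_M} (\tilde v \circ \tilde f) \tag{5.64}
\end{align}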

where in the last expression, the tilde means that the functions are replaced by their representations in charts, $\tilde v = v \circ Ch_N^{-1}$ and $\tilde f = Ch_N \circ f \circ Ch_M^{-1}$, and the partial derivative becomes the simple Euclidean partial we all know and love. We can then use the Euclidean chain rule:

$$Y \cdot v = \sum_{i=1}^{m} \sum_{j=1}^{n} X^i \frac{\partial \tilde v}{\partial y^j}\Big|_{\tilde q} \frac{\partial \tilde f^j}{\partial x^i}\Big|_{\tilde p} \tag{5.65}$$

Figure 5.12: We present the standard picture with functions between manifolds again, with the charts $Ch_M$ and $Ch_N$. Our goal is to find the coordinate representation of $df(p)$.

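where the partials in $y$ are in the chart of $N$ and the ones in $x$ are the ones belonging to the chart of $M$. If we rearrange a bit, and realize that we can turn $\frac{\partial \tilde v}{\partial y^j}$ back into the operator applied to $v$, and that these operators are simply the standard basis vectors at $q$ in $Ch_N$, we get:

$$Y \cdot v = \sum_{j=1}^{n} \left( \sum_{i=1}^{m} \frac{\partial \tilde f^j}{\partial x^i}\Big|_{\tilde p} X^i \right) \frac{\partial \tilde v}{\partial y^j}\Big|_{\tilde q} \tag{5.67}$$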

So the coefficients of $Y$ are then:

$$Y^j = \sum_{i=1}^{m} \frac{\partial \tilde f^j}{\partial x^i} X^i \tag{5.68}$$

Again, we find a result that is parallel to the case of $\mathbb{R}^n$, since if $f$ were a map from some $\mathbb{R}^m$ to some $\mathbb{R}^n$, we would get the exact same result. The (column) vector $Y$ we get is simply the Jacobian (in the charts) applied to $X$ (in the chart)!
We can introduce a new matrix notation for the chart Jacobian of $df(p)$. We can write:
$$df(p)^j_i = (df(p)_{Ch_M,Ch_N})^j_i = \frac{\partial \tilde f^j}{\partial x^i} \tag{5.69}$$
Then we can write the above result as:

$$Y^j = \sum_{i=1}^{m} df(p)^j_i X^i \tag{5.70}$$
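A small worked example of the coordinate expression, as a sketch under assumptions: take $M = N = \mathbb{R}^2 \setminus \{0\}$, use the polar chart on both, and let $f$ be complex squaring $z \mapsto z^2$ (all hypothetical choices for illustration). In these charts $\tilde f(r, \theta) = (r^2, 2\theta)$ as long as $2\theta$ stays inside the range of the angle coordinate, so the chart Jacobian at $(r, \theta)$ is $\mathrm{diag}(2r, 2)$. The code computes the Jacobian by finite differences instead of using this closed form.

```python
import math

def chart(x, y):
    # Polar chart, used here on both M and N.
    return (math.hypot(x, y), math.atan2(y, x))

def parametrization(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))

def f(x, y):
    # Complex squaring z -> z^2, written in real coordinates.
    return (x * x - y * y, 2.0 * x * y)

def f_tilde(r, theta):
    # Chart representation Ch_N o f o Ch_M^{-1}.
    return chart(*f(*parametrization(r, theta)))

def chart_jacobian(point, h=1e-6):
    # J[j][i] = d f_tilde^j / d x^i at the given chart point, equation (5.69).
    n = len(point)
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        fp, fm = f_tilde(*plus), f_tilde(*minus)
        for j in range(n):
            J[j][i] = (fp[j] - fm[j]) / (2 * h)
    return J

p_tilde = (1.5, 0.4)   # chart coordinates of p
X = (0.5, -1.0)        # components of X in the polar basis at p
J = chart_jacobian(p_tilde)
Y = [sum(J[j][i] * X[i] for i in range(2)) for j in range(2)]  # (5.70)
print(Y)  # approximately (1.5, -2.0)
```

The result matches $\mathrm{diag}(2 \cdot 1.5,\, 2)$ applied to $(0.5, -1)$.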

We can also rewrite the chain rule in this notation. It becomes:

$$d(g \circ f)(p)^k_i = \sum_{j=1}^{n} dg(f(p))^k_j \, df(p)^j_i \tag{5.71}$$
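In this matrix form, the chain rule is easy to test numerically. The sketch below uses hypothetical maps $f : \mathbb{R}^2 \to \mathbb{R}^3$ and $g : \mathbb{R}^3 \to \mathbb{R}^2$, with the identity charts assumed for simplicity, and compares the finite-difference Jacobian of $g \circ f$ with the product of the two Jacobians.

```python
import math

def f(x, y):
    # Hypothetical smooth map R^2 -> R^3.
    return (x * y, math.sin(x), x + y * y)

def g(a, b, c):
    # Hypothetical smooth map R^3 -> R^2.
    return (a * c + b, math.exp(a) - c)

def jacobian(func, point, h=1e-6):
    # Returns J with J[j][i] = d func^j / d x^i, by central differences.
    columns = []
    for i in range(len(point)):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        fp, fm = func(*plus), func(*minus)
        columns.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    return [list(row) for row in zip(*columns)]  # transpose columns into rows

def matmul(A, B):
    # Plain matrix product: (A B)[k][i] = sum_j A[k][j] B[j][i].
    return [[sum(A[k][j] * B[j][i] for j in range(len(B)))
             for i in range(len(B[0]))] for k in range(len(A))]

p = (0.4, -1.1)
df = jacobian(f, p)                                # 3 x 2
dg = jacobian(g, f(*p))                            # 2 x 3
d_gf = jacobian(lambda x, y: g(*f(x, y)), p)       # 2 x 2
product = matmul(dg, df)                           # right side of (5.71)

max_error = max(abs(d_gf[k][i] - product[k][i])
                for k in range(2) for i in range(2))
print(max_error)  # small, limited by finite-difference error
```

Note that the sum in (5.71) runs over the middle index $j$, which is exactly the index contracted in `matmul`.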
Chapter 6

Tangent Spaces and Tangent Bundles
