DiffGeo 1 17.11
Maciej Swiatek
17.11.2023
Contents
Preface 7
2 Surfaces 47
2.1 Some definitions and basic quantities . . . . . . . . . . . . . . . . . 47
2.2 The curvature of surfaces in R3 . . . . . . . . . . . . . . . . . . . . 52
2.3 The Geometric Definition of Curvature on Surfaces . . . . . . . . . 53
2.3.1 A bit about Qp . . . . . . . . . . . . . . . . . . . . . . . . 55
2.3.2 The independence of Qp (v) from the curve we choose . . . 57
2.3.3 Proof of the theorem . . . . . . . . . . . . . . . . . . . . . 59
2.4 The second fundamental form . . . . . . . . . . . . . . . . . . . . 62
2.4.1 Simplifying the second fundamental form . . . . . . . . . . 62
2.4.2 The mean and Gauss curvatures . . . . . . . . . . . . . . . 64
2.5 Symmetry and Curvature . . . . . . . . . . . . . . . . . . . . . . . 68
II Manifolds 107
3 Topology and topological manifolds 111
3.1 Humble beginnings . . . . . . . . . . . . . . . . . . . . . . . . . . 111
3.2 A topological space . . . . . . . . . . . . . . . . . . . . . . . . . . 117
3.3 Charts: Part I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
3.4 The Hausdorff Condition . . . . . . . . . . . . . . . . . . . . . . . 122
3.5 The ant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
3.6 Interlude: Useful topology for the course . . . . . . . . . . . . . . . 124
Sections with an asterisk in front of them are not mandatory to read; you can skip
them as you like (”You should know these ideas exist, but don't need to learn
them”). Big thanks go out to Anastasia Sandamirskaya and Ji Zhexian for helping
with writing the first lecture about surfaces.
Disclaimer: We generally stay true to the notation of the lecture. The only real
exception to this (so far) is the symbols we use for charts, for which ψ, χ were used
in the lecture as general charts, whereas we use Ch1 and Ch2 (literally: chart 1 and
chart 2) to make the equations a bit more direct. Other than that, the symbol
T1→2 is also a product of this script, for the transition/overlap map, which in the
lecture was always written out with the charts (χ ◦ ψ−1). We did this to make a
few equations a bit more readable.
Note: There are currently a few pages, particularly where there are many figures
in the text, that have very wide gaps. This changes every time one adds any text,
because LaTeX re-chooses where to put things like figures and definitions and then
tries to fit the text around this. I would need to fix this manually at every such
point and will do it after the script is done. For now I hope you can get past this.
Part I
We start with some of the most intuitive examples of the type of manifolds we
will be working with, that is, with curves and surfaces embedded in some form of
RN .
Chapter 1
Curves
In this chapter, we will deal with curves. We first define what we mean by a curve,
and impose some restrictions on the kind of curve we want to deal with. We won’t
prove all the things we claim in this chapter, as some of these things you should
have already seen in a Calculus class and this is only a quick overview.
Figure 1.1: An example of a smooth curve. The interval I gets mapped onto a
curve in R3 by the function γ.
coefficients are polynomials in t, all the derivatives exist and are continuous.)
But look at Figure 1.2. The image of the curve is obviously not smooth in R2
at t = 0, or equivalently at x = (0, 0). What is happening there? Well, it
resembles the absolute value function a bit. That function also had a sort of sharp
bend at a point. The problem back then was with the derivative: it simply did not
exist, which gave the curve its weird behavior (the sharp bend). Similarly,
here the problem is also with the derivative. It exists, obviously, since this is a
smooth curve. But it becomes 0 at the problem point (t = 0, or the origin). A
curve with such a bend is not something we really want to work with, therefore
we put another restriction on the curves we work with. We eliminate curves like
the one from this example simply by saying we don't work with curves whose
derivative becomes 0 anywhere.
As we saw in the previous example, we get into problem situations if the deriva-
tive of the curve with respect to the parametrization parameter is zero. We therefore
define regular curves as those for which this doesn’t happen, or in other words, where
the velocity never vanishes.
Definition 1.1.2 (regular curves). A smooth curve is called a regular curve if:

dγ/dt ≠ 0 for all t ∈ I    (1.1)

where γ is the smooth curve and I is the interval it is defined on.
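As a quick illustration of the definition (an example added here, assuming the cusp curve of Figure 1.2 is γ(t) = (t², t³)): that curve is smooth, but γt(t) = (2t, 3t²) vanishes at t = 0, so it is not regular. By contrast, the circle γ(t) = (R cos t, R sin t) has γt(t) = R(−sin t, cos t), so |γt| = R ≠ 0 for every t, and the circle is a regular curve.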
Note 1. We will use various notations for the derivative of a curve. These
include:

dγ/dt = γt = γ̇    (1.2)
1.2 Arc-length
Now that we have said what we mean by a curve and restricted it so as to not run
into problems like the one in the example above, we can start with the geometry.
Undeniably, one of the most important quantities in geometry is the length. If you
know the lengths of a problem, you already know quite a bit of the geometry. What
is the length of a (piece of a) curve?
Well, we already restricted ourselves to work with regular curves, so our moti-
vation will be more on the intuitive side.
Imagine you have any curve, like the one in figure 1.3. The idea is that we
divide the curve into very small almost-straight parts, calculate the length of these
parts by approximating each part as a straight line, and then sum all of those
back together. We can do this for reasonable (i.e. regular) curves. Of course, in reality
what we do is go infinitesimal, at which point this becomes an integral.
For the small piece seen in the figure, we have ∆s = |dγ/dt| ∆t, where by s we
mean the length and by ∆s the very small length of that very small part. Afterwards
we add all of these up, and in the continuum limit we get an integral:
s(t) = ∫_γ ds = ∫_{t0}^{t} |dγ/dt| dt    (1.3)
Figure 1.3: A curve and our intuitive way to understand the definition of the
arc-length. We zoom in on a very small part of the curve, between t′ and t′ +∆t.
There, if ∆t is small enough, the line will be approximately straight and we can
use the velocity vector to calculate the length of that piece approximately. Note
that the velocity vector is drawn in way smaller than it would actually be for
any reasonable ∆t, just so that the picture is clearer.
Note that in the definition we did not assume that t > t0; a negative arc-length
is possible, simply by going in the opposite direction of the parametrization of the
curve.
We already mentioned that the geometrically interesting object is the image of
γ, not the function γ (i.e. the parametrization) itself. We will care mostly about things
we can define on the image of γ that are not dependent on that parametrization.
The arc-length is something independent of the parametrization1.
1 Of course, we can always choose to parameterize the curve in the other direction, which
changes the arc-length by a minus sign. We can also choose a different reference point other than
γ(t0). But these are choices that are rather trivial and we won't really mention them from
now on.
In line with this philosophy, we can define a very convenient, but also more
geometrically ”real” parametrization. The idea is that the arc-length is a geometric
object independent of parametrization, and that, for regular curves, we can use the
arc-length to parameterize the curve.
Proof. We will only sketch the proof, as this is something rather simple and you
very likely already saw the proof in a calculus class2
The first step is to take the arc-length and see it as a function of t:
s = f(t) = ∫_{t0}^{t} |dγ/dt| dt    (1.6)
We can then take the inverse of this function, call it g = f−1, and express
t as a function of s. If we take β = γ ◦ g, i.e. β(s) = γ(g(s)), we have found the right
parametrization. The only thing left is to convince yourself that
the velocity is really of unit length. (You can do this using the chain rule.)
Of course, the curve is still regular, and all the properties like smoothness are
still obeyed by the curve. The image of β and γ is of course exactly the same, i.e,
you can’t change the curve simply by re-parameterizing. You might find Figure 1.4
helpful in visualizing this.
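As a concrete sanity check (an example, not from the lecture): take the circle γ(t) = (R cos t, R sin t) with t0 = 0. Then |γt| = R, so s = f(t) = Rt and g(s) = s/R. The reparametrized curve is β(s) = γ(g(s)) = (R cos(s/R), R sin(s/R)), and indeed |dβ/ds| = |(−sin(s/R), cos(s/R))| = 1, so β is parameterized by the arc-length.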
The obvious first guess for the tangent vector is simply the velocity:

τ(t) =? dγ/dt    (1.7)
2 If not, try it yourself as an exercise.
Figure 1.4: Different parametrizations of the same curve. The curve is drawn in
green, the ticks are the points on the curve with the parameter-values written
next to them. In (a) you see a typical non-special parametrization (i.e. the ”t”),
in (b) you find the curve parameterized by the arc-length (s). It is intuitively
clear why the parametrization is not something geometrically interesting. The
real curve (green line) exists independently of the ticks. In (c) you find another
parametrization by the arc-length, except with a different choice of reference
point on the curve.
A bit of thought, however, reveals that this cannot be true. Why? Well, it is not
independent of parametrization. Imagine, for example, you were to go twice as fast
along the curve. Then your velocity vector (= tangent vector in this example) would
be twice as big at every point. But if we want the tangent vector to be something
fundamentally independent of the parametrization, then equation 1.7 cannot be the
correct definition of the tangent vector.
How can we fix this? Well, look at figure 1.5. It shows the same curve, param-
eterized in three different ways, with the ”fake” tangent vectors (from equation 1.7)
drawn in. The thing that should jump out at you is that, while the length of the vectors
does change, the direction does not 3. The tangent vector as we defined it is not
the geometrically real thing; rather, the unit tangent vector is, which is exactly how
we choose to define it below.
Figure 1.5: The ”fake” tangent vectors from equation 1.7 for different
parametrizations of the same curve (green). In (a) you see a random
parametrization (black ticks) and its ”fake” tangent vector (blue) from equation
1.7. In (b) you have the same situation, only that this time you go twice as fast
along the curve. Notice that the vectors (the physically drawn arrows) change.
In (c) you have the same thing, but this time parameterized with arc-length.
τ(t) = (dγ/dt) / |dγ/dt|    (1.8)

If we parameterize by the arc-length, then the formula for the tangent vector
becomes:

τ(s) = dγ/ds    (1.9)

since |dγ/ds| = 1.
avoid introducing another symbol,g, since it is just the function that expresses
the parameter t in terms of s.
Proof. The proof is strikingly simple. We know that the length of τ is set to
one. Therefore ⟨τ, τ⟩ = 1 and d/ds ⟨τ, τ⟩ = 0, since the length (and therefore the
scalar product) doesn't change along the trajectory. We can use the product
rule:

0 = d/ds ⟨τ, τ⟩ = 2⟨dτ/ds, τ⟩ = 2⟨κ, τ⟩    (1.11)

Therefore the scalar product of the two vectors is 0, i.e. they are orthogonal,
as claimed.
Figure 1.6: A curve with τ and κ drawn in. Notice that κ is orthogonal to τ .
This is something physics students are very familiar with. The situation is very
analogous to the trajectory of a particle. The speed of the particle doesn’t change,
so the only direction the acceleration (= curvature vector) can have is perpendicular
to the curve.
Note 3. We want to make a quick check on the units of all the quantities that
we described so far. Let's assume that our RN carries some sort of length unit,
like the cm, which we will write as [L]. Let's also assume the parameter of
our parametrization has units of time, like sec, which we denote [T]. Then
both γ and s have units [L], so the tangent vector dγ/ds has units of [L]/[L] = 1
and is unit-less. This is something we want explicitly, as the geometric object
should not depend on the parametrization, which means it should also be
independent of the unit of the parametrization [T]. The ”fake” tangent vector
we defined before has, on the other hand, units of [L]/[T].
The curvature vector d²γ/ds² has units of [L]/[L]² = 1/[L].
Until now, we have only given a formula for the curvature vector in the arc-
length-parametrization. We will now write down the formula for the curvature
vector with any parametrization.
Lemma 1.3.2 (Curvature vector in arbitrary parametrization). Let γ : I → RN
be any curve and τ(t) = γt/|γt| = (dγ/dt)/|dγ/dt| the tangent vector. Then the
curvature vector κ(t) can be written as:6

κ = (1/|γt|²) ( γtt − ⟨γtt, γt/|γt|⟩ γt/|γt| )    (1.12)

where γtt is d²γ/dt².
Before we go on to prove this, we first want to talk about what each part of the
equation means.
We know that κ = d²γ/ds² and therefore expect it to have something to do with
d²γ/dt². This turns out to be the case: the first term is indeed γtt. But there is a
correction term of −⟨γtt, γt/|γt|⟩ γt/|γt|, which has a nice geometric explanation.
It projects γtt onto the normal plane of the tangent vector. See Figure 1.7 for
a visual example. After we have projected γtt onto the normal plane, we still divide
it by |γt|². You can see it as just a factor that makes sure that the units work out.
We can see this simply by comparing units. The part that projects γtt onto the
normal plane has the same units as γtt, so we can just look at γtt. (Because we add
them, and that doesn't change the units.) The units of γtt = d²γ/dt² are clearly [L]/[T]²,
while the unit of κ is 1/[L], as we saw above. Therefore, to get a consistent formula,
we need something that has units of [T]²/[L]². 1/|γt|² is exactly a factor like that.
Note 4. We call the normal plane a plane, even though that is technically only
correct if we have a curve in R3 . In R2 it is a line, in R4 a hyperplane and in
general an (N − 1)-dimensional vector-space.
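Before the proof, here is a quick sanity check of formula 1.12 (an added illustration): take the circle traversed with angular speed ω, γ(t) = (R cos(ωt), R sin(ωt)). Then γt = Rω(−sin(ωt), cos(ωt)), |γt| = Rω and γtt = −Rω²(cos(ωt), sin(ωt)). Here γtt is already orthogonal to γt, so the projection term vanishes and κ = γtt/|γt|² = −(1/R)(cos(ωt), sin(ωt)), which has length 1/R and does not depend on ω, exactly as a geometric quantity should.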
Proof. We now prove equation 1.12. In its most basic form the proof consists of just
taking the definition of κ in the arc-length-parametrization and switching to
the t-parametrization, using the normal rules of derivatives (chain rule / product
rule). We start with the chain rule:

κ = dτ/ds = (dt/ds) (dτ/dt)    (1.13)

where by dt/ds we of course mean dg/ds, where g = f−1 and f(t) = ∫_{t0}^{t} |dγ/dt| dt.
Therefore:

dt/ds = dg/ds = (df/dt)−1 = |dγ/dt|−1 = 1/|γt|    (1.14)

6 If you already have some experience with differential geometry, or you are rereading this after
learning the later chapters, you might notice that this is the covariant derivative of
the tangent vector.
Figure 1.7: A curve with τ and κ drawn in, as well as γtt. Because we move
along the curve faster and faster (the ticks are more spread out), γtt has a
component in the ”forward” direction, which we cancel out in equation 1.12.
The vector is still too long though, which is why we need to divide by |γt|².
1.4 Curves in R2
We have, by now, defined exactly what we mean by a curve, seen the concept of
what sort of object is geometric, and defined a few of these, like the arc-length,
tangent and curvature vectors. We will now use all of these concepts to describe
curves in the two dimensional plane.
The main idea that makes this a lot simpler, is that the curvature vector κ
reduces to a number. This is because the direction of the curvature is always
predetermined in two dimensions by the direction of the tangent vector.
To see this, we note that, as we showed before, the curvature vector κ lies in
the normal ”plane” of the tangent vector, which in two dimensions means that it
lies on a straight line perpendicular to τ . Therefore, we only need to specify one
number8 to determine the curvature vector.
Let's say we are at a point on a curve, like the one drawn in figure 1.8. We
can construct a right-handed basis of R2 at that point by taking τ as our first
basis vector, and the vector that one gets by rotating τ by 90 deg (in the positive
sense), which we will call N. Since τ is of unit length and we get N by rotating τ
by 90 deg, this is an orthonormal basis (one that is right-handed). Notice that
it immediately follows that:
κ = kN (1.20)
for some k ∈ R, because we know that κ and τ are orthogonal. We call this k the
curvature scalar. It is an important quantity in differential geometry, and we will
find its equivalents for different geometric objects throughout the subject.
7 You should recognize this from Calculus II, maybe given in a different notation: dr/dt = (∇r) · d⃗x/dt.
8 On each point of the curve
Figure 1.8: A curve with its tangent, curvature and normal vectors drawn in
at a point on the curve. As you see, the curvature vector is just some number
times the normal vector. Note however, that k is not just the absolute value of
κ, since it can also be the negative of its length, if it points in the other direction
Example 1.4.1. Our first example is the simplest curve that is not a straight
line (because a straight line, of course, has no curvature9), which is a circle of
radius R.
The curvature of that circle is

k = 1/R    (1.21)
Figure 1.9: A few circles with different radii, with their respective τ, κ, N drawn
at the point (0, R) of each curve. The bigger the circle, the less curved it is, as
reflected by the formula k = 1/R.
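For completeness, a short derivation of k = 1/R (a sketch, not spelled out in the lecture): parameterize the circle by arc-length as γ(s) = (R cos(s/R), R sin(s/R)). Then τ(s) = (−sin(s/R), cos(s/R)) and κ(s) = dτ/ds = −(1/R)(cos(s/R), sin(s/R)). Rotating τ by 90 deg gives N(s) = (−cos(s/R), −sin(s/R)), so κ = (1/R)N and therefore k = 1/R.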
The curvature scalar has a lot of interpretations. Let's first state them, then discuss
their consequences.
1. The curvature scalar is the rate of change of the angle the tangent vector
makes with the x-axis. Mathematically, let θ(s) = arctan(τ2(s)/τ1(s)) be exactly
that angle. Then:

k = dθ/ds    (1.22)

2. The absolute value of the curvature scalar k tells us the radius of the
osculating circle, which is the unique circle that agrees with the curve
up to order two:

|k(s)| = 1/R(s)    (1.23)

where R(s) is the radius of that circle at the point of the curve whose
parameter-value is s.
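Both interpretations are easy to check on the circle (an added illustration): with γ(s) = (R cos(s/R), R sin(s/R)) we have τ(s) = (−sin(s/R), cos(s/R)), so the angle with the x-axis is θ(s) = s/R + π/2 and dθ/ds = 1/R = k. The osculating circle of a circle is the circle itself, so its radius is R = 1/|k|, consistent with the second interpretation.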
The first interpretation of the curvature scalar should make a lot of sense in-
tuitively. We know that the tangent vector cannot change its length, since per
construction it is of unit length. Therefore the only thing that can really change is
its direction, i.e. the angle it makes with the x-axis. This, along with the fact that
the curvature vector describes how the tangent vector changes, makes the first part
of the proposition rather intuitive. See figure 1.10 for a visualisation.
The proof is not too complicated: you just need to differentiate θ(s) and remember
that (1) the derivative of arctan(x) is 1/(1 + x²) and (2) the normal vector in terms of
the tangent vector is N = (−τ2, τ1).
Proof of the first interpretation. As we said, you only need to differentiate θ. Let's
Figure 1.10: A curve, with its tangent vectors drawn in, and a table that shows
how the tangent vector rotates.
start:

dθ/ds = d/ds arctan(τ2/τ1)    (1.24)
      = (d(arctan(x))/dx) · d/ds (τ2/τ1)    (1.25)
      = 1/(1 + x²) · (τ̇2 τ1 − τ2 τ̇1)/τ1²    (1.26)
      = 1/(1 + τ2²/τ1²) · (τ̇2 τ1 − τ2 τ̇1)/τ1²    (1.27)
      = 1/(τ1² + τ2²) · ⟨(τ̇1, τ̇2), (−τ2, τ1)⟩    (1.28)
      = (1/1) ⟨κ, N⟩    (1.29)
      = k    (1.30)

The prefactor 1/(τ1² + τ2²) is one because it is the square of the length of τ, which
is one.
Now, what about the second interpretation? Well, you can imagine a circle,
moving along the curve, that locally looks like the curve. (The curve tries to be
as similar to the circle as possible, but because the radius of the osculating circle
changes with s, it doesn't become a circle.)
Figure 1.11 gives a picture of a curve and its osculating circles at different
points of the curve. As you will hopefully agree, the bigger the radius of the circle,
the straighter the curve will be at that point (as the two agree to order
two, so they locally behave quite similarly). Therefore we expect that the second
interpretation is correct, that is, the curvature scalar is the inverse of the radius of the
osculating circle.
Figure 1.11: A curve with its osculating circle drawn in at a few places along the
curve. (The biggest one only partially drawn in) It is clear that the bigger the
osculating circle is, the straighter the curve will be, which gives the connection
to the curvature scalar.
Proof of the second interpretation. We will not prove this rigorously, but we will
sketch a proof, as it is quite simple. The osculating circle agrees with γ up to order
two. Therefore, we can expect that the second derivatives (i.e. the k's) agree
for the curve and the osculating circle (which we can see as a second curve) at
that point. We know that at that point the circle has kcircle = 1/Rcircle, and
therefore this should also be true for the first curve. The only missing parts of
the proof are (1) the proof that an osculating circle exists, which it does10, and
(2) a more rigorous way of presenting the above argument.
There is actually also a third interpretation of the curvature scalar, for a special
kind of curve. Let's say that the curve is the graph of a function y = u(x) that
assigns a y-value to every x-value, like the one in figure 1.12. Consider the second
10 A straight line is a circle of infinite radius in many aspects of geometry. This is also
true here: if the curve is locally straight at a point, the radius of its osculating circle will
blow up and the circle will become a straight line, but the theorem will still hold. For the
mathematicians: 1/∞ = 0 in this case.
Proof. Let's start by collecting different terms that might be useful. Firstly,
γ(x) = (x, u(x)) and therefore:

γx = (1, ux)    (1.32)
|γx| = (1 + ux²)^{1/2}    (1.33)
τ = γx/|γx| = (1, ux)/(1 + ux²)^{1/2}    (1.34)
N = (−ux, 1)/(1 + ux²)^{1/2}    (1.35)

The first three should be rather clear, coming straight from the definitions. The
last one comes from the fact that N is just τ rotated by 90 deg, which means
we switch the two entries of the vector and put a minus in front of the first one12.
11 Conditions apply, as always.
12 If this is not clear to you, try it out with the rotation matrix of positive 90 deg. You'll
see that this is correct.
We can now just use the definition of κ and the chain rule and calculate until
we get there.

κ = (dx/ds) (dτ/dx)    (1.36)
dx/ds = (ds/dx)−1 = |γx|−1    (1.37)
→ κ = (1/|γx|) dτ/dx    (1.38)
     = (1/|γx|) d/dx ( γx/|γx| )    (1.39)

Before we continue, there is something to note about what we have already found.
Inside the derivative we normalize once, and then again outside of the
derivative. In this sense, κ is a normalized version of a second derivative.

... = (1/|γx|) d/dx ( (1, ux)/|γx| )    (1.40)
    = (1/|γx|) [ (0, uxx)/|γx| + (1, ux) d/dx( 1/|γx| ) ]    (1.41)

(We used the product rule.) Now, we know that k = ⟨κ, N⟩. The last term in
the equation above is proportional to (1, ux), which is proportional to τ,
which means that when we form the scalar product to get k, it drops out, since
τ is orthogonal (per construction) to N. We get:

k = ⟨κ, N⟩    (1.42)
  = ⟨ (0, uxx)/|γx|² , (−ux, 1)/|γx| ⟩    (1.43)
  = uxx/|γx|³ = uxx/(1 + ux²)^{3/2}    (1.44)

Here we used that the aforementioned second term is orthogonal to N and left
it out. At the end we just collected terms.
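As a small worked example (an illustration, not part of the lecture): for the parabola u(x) = x²/2 we have ux = x and uxx = 1, so k(x) = 1/(1 + x²)^{3/2}. At the vertex x = 0 this gives k = 1, i.e. an osculating circle of radius 1, and k decreases as |x| grows and the parabola flattens out.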
Now, after seeing how much manual computation the proof above took, you might be a
bit astounded as to why. The reason is the same reason why anytime you actually
want to compute something in differential geometry it usually turns into a mess
of derivatives. We are turning something fundamentally coordinate-based13 (uxx )
into something geometric (k). Coordinate-based objects usually have, as you might
imagine, a lot of information in them that is only related to the choice of our
coordinates and we have to filter that information out when we do the conversion.
This is the reason why there is so much to compute, even if the steps aren’t too
complicated.
13 To make this discussion more general, we write coordinate-based, even though right now
it’s just a parametrization. You can see a parametrization as coordinates on the curve.
Figure 1.13: You can perform a rigid motion and not change anything about
the curvature of the curve.
where θ(0) is a constant we can choose freely. The next step is to integrate the
equation:
dγ/ds (s) = τ(s) = e^{iθ(s)}    (1.47)

and get:

γ(s) = γ0 + ∫_{s0}^{s} e^{iθ(s′)} ds′    (1.48)
giving us yet another constant γ0 , which we can choose freely. To get a more
rigorous proof, you would need to show that these are the solutions (just dif-
ferentiate them) and that these are the only solutions (use a theorem from
calculus.)
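As an illustration of the reconstruction (a sketch using the formulas above): take constant curvature k(s) = 1/R, so θ(s) = θ(0) + s/R. Choosing θ(0) = 0, γ0 = 0 and s0 = 0, the integral gives γ(s) = ∫_0^s e^{is′/R} ds′ = −iR(e^{is/R} − 1) = iR + R e^{i(s/R − π/2)}, which (identifying C with R²) is a circle of radius R traversed at unit speed. The free constants θ(0) and γ0 correspond exactly to rotating and translating this circle, i.e. to a rigid motion.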
Figure 1.15: A problem case of a closed curve, for which the tangent vector at
the beginning is not the same as the tangent vector at the end. We want to avoid
this, so we just take these kinds of curves (and ones where higher derivatives
don’t match up) out of the set of curves we consider
3. If we are working with closed curves, we will want them to have the (nice)
property that, if we extend them periodically to a curve from R → RN ,
they are smooth. This is to avoid annoying situations like the one in figure
1.15
Theorem 1.4.2. Let γ be a two-dimensional regular closed curve that obeys
the above restriction. Then:

∫_γ k ds = 2πn    (1.49)

for some n ∈ Z.
The integral quantity is the global quantity we mentioned before, while k is the
curvature scalar, which is a local quantity. Since ∫_γ k ds = ∫_a^b (dθ/ds) ds, the global
quantity is just the total angle (with signs) by which the tangent vector rotated, a
profoundly global thing.
It makes sense that this would be so. If the curve is closed (and is smooth
at the edge, if made periodic), then the angle τ rotates by must be some
multiple of a whole rotation, since it ends up where it started.
This proposition tells us that a non-intersecting curve's τ can only turn once in
total. See the examples in figure 1.16.
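A quick check (added illustration): for the circle of radius R traversed once counterclockwise, ∫_γ k ds = (1/R) · 2πR = 2π, so n = 1; traversed clockwise one gets n = −1. A curve that winds around twice before closing up has n = 2, while for a figure-eight the two loops cancel and n = 0.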
With this we conclude the topic of two-dimensional curves, and move on to
three-dimensional curves.
1.5 Curves in R3
1.5.1 First remarks
Very early in our discussion of two-dimensional curves we figured out that in two
dimensions the curvature vector is not really necessary for the description of how the
curve curves. Better said, the curvature scalar (the signed length of the curvature
vector) held all the information about curvature; the direction of the curvature vector
Figure 1.16: Two simple curves and the graph that shows by how much the
tangent vector rotated.
was always predetermined. This is very different for curves in three dimensions.
Here, the vector character of the curvature vector really stands out.
We saw at the beginning of this chapter that the curvature vector lies in the
normal plane of the tangent vector. In two dimensions this helped us, by letting us
forget about the direction of the curvature vector and only consider the curvature
scalar. This time, we cannot do this, as we have an entire plane that the curvature
vector could lie in.
In the two dimensional case, we defined a moving frame composed of τ and
N , which was a right handed orthogonal basis. This is the idea we will use to get
further in three dimensions.
Definition 1.5.1 (The moving frame). Let, as always, γ be a curve, this time in
R3, with all the properties we already mentioned before (smoothness, regularity
etc.). We define three vectors at each point, which together compose the moving frame
we will usually use: the tangent vector τ, the normal vector15 N = κ/|κ| (the second
vector, i.e. the normalized curvature vector), and the bi-normal vector β = τ × N
(the third vector).
Together, N and β span the normal plane of τ. For a visualisation of the moving
frame, look at figure 1.17.
Figure 1.17: A three dimensional curve, with the moving frame drawn in. N is
κ, but normalized; β is the cross product of τ and N.
Definition 1.5.2 (Curvature scalar for curves in three dimensions). The cur-
vature scalar is simply the length of the curvature vector, defined as
k = |κ|.
Now, you might have noted that to define N , we divided by the curvature scalar
and this becomes a problem, if k is zero. This is an actual problem and happens
for any curve that is (to second order) straight at some point. We will exclude this,
simply by adding another restriction on our definition of a curve.
15 Even though it is not the only normal vector to τ , but it is a very special normal vector,
With this aside, let’s go back to the curvature vector. We saw that the curvature
scalar will simply not provide enough information about our curve that we can paint
a complete picture. We will need another agent, which will be called the torsion.
1.5.3 Torsion
Here we will define what we mean by the new agent we said we needed in the
previous section, called torsion. Torsion will also be another geometric object16. It
will tell us, in a sense, how much the curvature vector (or rather the normal vector N)
changes. Let us look at the components of dN/ds in the moving frame. For the
component along τ we get (using the product rule):

⟨dN/ds, τ⟩ = d/ds ⟨N, τ⟩ − ⟨N, dτ/ds⟩ = 0 − ⟨N, κ⟩ = −k    (1.52)

which is just (minus) the curvature scalar, which we already know, and for the
component along N itself we get (product rule again):

2⟨dN/ds, N⟩ = d⟨N, N⟩/ds = d(1)/ds = 0    (1.53)

since a unit vector can't change in its own direction (otherwise its length would
change).
That is why we take the projection onto β: it provides us with the only new information.
The information we want is how the normal plane changes, but only in the direction
of the bi-normal vector.
We can form a table with all the objects we have so-far introduced and a few
things to note on them. See table 1.1.
One thing that you might find surprising at first is that the units of l are 1/[L] and
not 1/[L]². The reason is that we normalize κ before differentiating, which means
we multiplied the units by [L].
16 As a reminder, something is a geometric object or geometrically invariant when it does not depend on the choice of parametrization.
Figure 1.18: A curve, and its normal vector and plane changing along the curve.
Table 1.1: The main geometric objects we have defined up to now.
Figure 1.19: The parallel between k and l. k measures how close a curve is to
a straight line (part (a)). In part (b) you see a curve that lies entirely in a
plane, while in (c) you see a curve that deviates from the plane it is almost in,
at least in the direct vicinity of the thick point with the moving frame drawn
in. l is the thing that measures this.
where l is the torsion scalar and c is some universal constant. This is why the
torsion scalar measures how much the curve deviates from living in a plane. As an
exercise, you should prove all the claims we did not prove in this discussion and find
c. (You can take the curve γ(s) = (s, as², bs³) as a first example and see where in the
Taylor series around 0 you find k and l.)
You can also prove that if k = 0 for all s, then the curve is a straight line, and
if l = 0 for all s, the curve lies in a plane. (You can do this as an exercise, or decide
that the above discussion convinced you of this.)
In summary, torsion measures how much a curve twists away from a plane and
into the third dimension.
Example 1.5.1. What is the curve with constant curvature and torsion? We
won’t show it here, but it can be shown that it is a helix. A helix, by the way,
can be parameterized by (R cos(t), R sin(t), mt) where m is some constant.
Why does it make sense that it’s a helix? Well, we want constant curvature,
which means a circle is involved, but we also want it to twist out of the plane
it lives in, at a constant rate, which is why we get a helix.
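For the curious, a sketch of the computation (not carried out in the lecture): for γ(t) = (R cos(t), R sin(t), mt) one has |γt| = √(R² + m²), which is constant, and working through the formulas of this chapter (or the standard formulas k = |γt × γtt|/|γt|³ and l = ⟨γt × γtt, γttt⟩/|γt × γtt|²) gives k = R/(R² + m²) and l = m/(R² + m²), both constant; the overall sign of l depends on the orientation convention for β. For m = 0 we recover the planar circle, with l = 0 and k = 1/R.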
That the curvature and torsion scalars together determine the curve is stated in the
following theorem, which can be proven similarly to its equivalent in two dimensions.
Theorem 1.5.1 (k and l determine curve up to a rigid motion). Let k(s) ≥ 0
and l(s) be any smooth functions. If we set the curvature scalar and torsion
scalar to these functions, respectively, we determine the curve uniquely in R3
up to a rigid motion in R3 .
As we said before, we won't prove this. But we hope you see how interesting the
result is. We need only two real functions to describe a curve in three dimensions,
and only one real function in two dimensions. In both cases, we reduced
our description by an entire function.
Consider a moving frame (e1(s), e2(s), e3(s)) along the curve:

e1(s), e2(s), e3(s)    (1.55)

Then, as we will show in a second, for a general moving frame we can write:

d/ds (e1(s), e2(s), e3(s))ᵀ = A(s) (e1(s), e2(s), e3(s))ᵀ    (1.56)

where A(s) is a 3 × 3 matrix.
This matrix has a special property, generally, for any moving frame.
Proposition 1.5.1. Let γ be any curve and (e1 (s), e2 (s), e3 (s)) an orthonormal
moving frame. Then the matrix A(s) from equation 1.56 is anti-symmetric.
Where does the anti-symmetry come from? Well, there are two parts of the
anti-symmetry. Firstly, the diagonal is zero, which comes simply from the fact that
a vector of unit length cannot change in its own direction, otherwise the length
would change. Then there is the anti-symmetry of the other components, which
Figure 1.21: A curve with its Frenet frame and a second picture of the same
curve with a random moving frame.
stems from the orthogonality. If one vector changes, the others have to change in
a way that they all stay orthogonal to each other.
For the Frenet frame (τ, N, β) the matrix takes a particularly simple form:

d/ds (τ, N, β)ᵀ = A(s) (τ, N, β)ᵀ,   A(s) = [ 0  k  0 ;  −k  0  l ;  0  −l  0 ]    (1.58)

where k = k(s) and l = l(s) are the curvature and torsion scalars, respectively.
The special thing about the Frenet frame is that it reduces the three independent
matrix components of A to two, specifically two components we already know.
Proof. Many of these are definitions (for example, the first equation is just the
definition of N ), the rest aren’t too hard to check and we leave them to you as
an exercise.
The next theorem says something about the same quantity as above, for knotted
curves. Knotted curves are curves that cannot smoothly be transformed into a circle,
without crossing themselves. You can see a few examples in figure 1.22
Figure 1.22: The trefoil, figure-8-knot and unknot. The first two are knotted,
the last one is not, even though it might look like it at first.
The bound in Milnor's theorem is sharp in the sense that the total absolute
curvature of trefoil knots can be made arbitrarily close to 4π (though it is never exactly attained).
With these two examples, we conclude our discussion of curves and move on to
the second type of object we want to discuss before talking about differential geometry
in a general manner: surfaces in R3.
Chapter 2
Surfaces
In this chapter, we deal with surfaces, which are the obvious next step after curves in
our discussion. We will not treat surfaces in the more general RN , but just surfaces
in R3 , as they are more intuitive and are enough for the purposes of this lecture.
We start with the definition and then define a few basic properties, before
moving on to a discussion of the geometry and curvature.
In figure 2.1(b), you can see the graph of the function z = f(x, y) = x² + y².
It is, of course, a surface (in every intuitive way, but also in the
more general definition we will give below.)
It is clear that the surface in the example should be a surface. The idea that a
surface should always be the graph of a function is however, not a good one. There
are two simple examples that should definitely be surfaces, but wouldn’t be, if that
was our definition. The first is the xz-plane. Obviously, it should be a surface. If
anything should be a surface, the xz-plane should be. And yet, it is quite easy to
see that you can’t write it as a function of x and y. Well, you might say, that there
is nothing special about x and y in R3 and that we should be allowed to choose
any plane to describe our surface. For example, we could take y = f (x, z) = 0 to
describe the xz-plane. Yes, that is a possibility, but still we find a problem. Take
the sphere. It should definitely be included in our definition of a surface. But I dare
you to find any plane for which you can write the whole sphere as a graph of a
function. It should be very clear, that this is not possible.
Figure 2.1: A picture of several surfaces that explain our definition of a surface. In
(a) you can see a graph of some random function of x and y. The picture in (b) is
simply the graph of f(x, y) = x² + y². In (c) you see the first problem with the
naive definition, because we cannot write the xz-plane as the graph of a function
of x and y. In (d) you see the further problem that, even if we allow the
function to be defined on an arbitrary plane, the sphere can simply never be of
such form globally, but it can be made so locally (e).
Can we repair this situation? Yes, if we recognize that, while we cannot write
surfaces like the sphere as a graph of a function, at every point on the surface, in
some (possibly very small) region of the surface, we can write that region as the
graph of a function (on some plane in R3). That is, we can write the surface locally
as the graph of a function1. This leads us to our full definition.
The tangent plane to M at p is the set of all vectors based at p and tangent
to M at the point p. We can formally define it as:

Tp M = { γt(0) : γ : I → M is a smooth curve with γ(0) = p }    (2.1)
Apart from tangent vectors, we also have normal vectors. These are vectors that
are normal to M, which, of course, means that they are also normal to Tp M.
Figure 2.2: A surface M, with a point p and the tangent plane Tp M drawn
in, at different levels of ”zoom”. Globally, M and Tp M are very different,
but the more we zoom in, the more they coincide, until they become
”almost” the same.
Figure 2.5: The Möbius strip. You can try to draw a normal field by starting
at some point and drawing the next normal vector and then the next, until
you get back to where you started, but you will find that when you come
back, your normal points in the opposite direction to the one you started
with (= not smooth).
Now that we have developed the most basic tools we could use to do differential
geometry on surfaces (The analogues of τ and N for curves) we can proceed to
talk about curvature.
We shall start with the geometric definition of curvature, since it is the most intuitive
one.
The first way to define curvature on a surface is using curves that lie on the surface.
If the surface is curved, then any curve passing through the point of interest p
has to be curved as well. We cannot just take the curvature of some curve on
M going through p, simply because this will give us different values for different
curves. (There is an infinite number of curves going through p that live on M,
all curved differently, see figure 2.3.) There is, however, a specific amount of curving
that a curve has to do at the point p to stay on M; otherwise it would leave the surface.
This amount of curvature will be in the direction of N. Why? Well, in the tangent
directions we can have pretty much any curvature we want; the curve can be as
curved as it wants within M. That pretty much leaves only the direction of N. From this
simple thought follows the next definition.
Figure 2.6: A surface M and a point p. There are a lot of curves going through
p, some less, some more curved. (They all have to live on M though.)
Figure 2.7: The idea of the previous definition. The function Qp takes a
(normalized) v from the tangent plane, takes a curve through p with a
matching tangent vector, and puts out the normal component of κγ .
There are two questions you might have already asked yourself. Firstly, why do
we only define this when |v| = 1? That is an easy question to answer. Mostly
convenience. It will simplify the proof and calculations, while not leaving out any
real information. Let’s say you want to know Qp (v) for some vector with length
2. Then you can do this entire procedure for the same vector, but normalized, and
simply parameterize the curve γ in such a way that you move twice as fast along it.
The second question is whether this definition makes any sense, whether it
is well-defined, because we have, after all, many curves going through p with their
tangent-vector equaling v and it is not obvious that we always get the same number
for Qp (v) for any choice of appropriate curve. This will turn out to be true however,
as we will prove in a short time. For now, we will quickly assume that this is true
and talk about the object Qp a bit.
We want to take a few notes and intuitive ideas about what Qp is and describes at
this point.
• Firstly, Qp(v) is not standard notation, because it will turn out that Qp is
simply the second fundamental form in a slightly different guise, which
will be denoted by Ap(X, Y), and which we will introduce shortly.
• Qp(v) will turn out to be a quadratic form in the components of v, i.e. it will
be of the form Qp(v) = Σ_{i,j=1}^{2} aij v^i v^j, where the aij are some numbers that
come from the surface's geometry and v^i is the i-th component6 of v.
• Qp(v) = Qp(−v). This follows from the fact that for an appropriate curve
which we use to calculate Qp(v), we can simply take the same curve, but
reverse its parametrization. Then the tangent vector changes direction (or
sign), but the curvature vector stays the same, and therefore by definition
Qp(v) does too7.
We will prove many of these claims soon, for now we just wanted to familiarize you
with these properties.
6 Notice the index is upstairs. There is a reason for this sort of notation, where upstairs and
downstairs indexes have different meanings. For now, however, there really is no difference,
but we will use this notation to get you accustomed.
7 Prove that the tangent-vector changes sign, but the curvature doesn’t.
Figure 2.8: A sphere, the north pole, a random unit tangent vector and a
great circle.
Qp(v) = ⟨κγ, N⟩ = ⟨(1/R) N, N⟩ = 1/R    (2.2)

Also, this is independent of v, which really just says something about the
(local) symmetry of the sphere.
a You can alternatively see this as us using special (Cartesian) coordinates, for
which the point on the sphere that is of interest, p, has coordinates (0, 0, R). Then,
since Qp is defined by a scalar product of two fundamentally geometric vectors, it
should transform geometrically.
Figure 2.9: Here we have the same situation as in the previous figure, but
instead of a great circle, we take a smaller circle to compute Qp(v). We
have however rotated the sphere (v is pointing towards you), so that it is
simpler to see what is going on, namely that the curvature vector of the
smaller circle is far larger, but that its projection onto N is still the same.
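A less symmetric example (an added illustration, with an explicitly chosen setup): take the cylinder x² + y² = R² at the point p = (R, 0, 0), with the inward-pointing normal N(p) = (−1, 0, 0). For v = (0, 0, 1), pointing along the axis, the straight line γ(t) = p + t(0, 0, 1) lies on the cylinder and has κγ = 0, so Qp(v) = 0. For v = (0, 1, 0), the horizontal circle γ(t) = (R cos(t/R), R sin(t/R), 0) lies on the cylinder, has γt(0) = v and κγ(0) = (−1/R, 0, 0), so Qp(v) = ⟨κγ(0), N(p)⟩ = 1/R. So on the cylinder Qp ranges between 0 (along the axis) and 1/R (around it); these will turn out to be the two principal curvatures.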
Lemma 2.3.1
Let everything be the same as in the theorem above. Then

⟨κγ(0), N(p)⟩ = ⟨γtt(0), N(p)⟩    (2.3)

What the lemma says is basically that instead of the geometric curvature κ of
the curve, we can use the second derivative of γ.
The proof is quite simple.
Proof. Let γ be any appropriate curve. We simply compute ⟨κγ(0), N(p)⟩, using
the fact that κγ = (1/|γt|²) ( γtt − ⟨γtt, γt/|γt|⟩ γt/|γt| ). Since |γt| = |v| = 1, the
formula for κ simplifies, and we get:

⟨κγ(0), N(p)⟩ = ⟨γtt − ⟨γtt, γt⟩γt, N(p)⟩ = ⟨γtt(0), N(p)⟩    (2.4)

where in the last step we have used the geometric fact that γt = v is in the
tangent plane, and therefore normal to N.
k = uxx    (2.7)

It is clear that we can also (locally) write the curve as a graph over its tangent, as
seen in figure 2.10, where we also find that the curvature scalar becomes a second
derivative.
We will see that the curvature of a surface will also become a second derivative,
when we write the surface as the graph of a function over its tangent plane (locally).
⟨κγ(0), N(p)⟩ = ⟨γtt(0), N⟩ = ⟨(x′′(0), y′′(0), z′′(0)), (0, 0, 1)⟩ = z′′(0)    (2.8)
where we used the lemma from above. We have found the theme of the last
subsection: curvature is a second derivative. What we have basically
done is write (a local part of) the surface as a graph over its tangent plane
at p, and found that, similarly to curves, curvature becomes the second
derivative in those coordinates.
Step 4 Somehow, we need to use the fact that γ lies on the surface, which we have
not yet used. Of course, because γ does lie on the surface, its coordinates
have to be γ(t) = (x(t), y(t), f(x(t), y(t))), because the z-component (for
a local part, again) is not independent of x and y. If we now evaluate
z′′(0) with the chain rule we get:

z′′(0) = fxx (x′)² + 2 fxy x′y′ + fyy (y′)² + fx x′′ + fy y′′ = fxx v¹v¹ + 2 fxy v¹v² + fyy v²v²

where everything is evaluated at t = 0 (respectively at p), the last step uses that
fx = fy = 0 at p (the tangent plane at p is the xy-plane), and (x′(0), y′(0)) = (v¹, v²).
Figure 2.10: A curve that is just the graph of the function u(x). When
written as the graph of a function over its tangent, k becomes the second
derivative.
Figure 2.11: The coordinates we are using for the proof of the theorem.
The point p is at the origin and the tangent plane is the xy-plane
Firstly, we have just proven that Qp(v) is a quadratic form, but we have
also proven that Qp(v) is independent of the curve we choose. The quan-
tities fxx, fxy and fyy have nothing to do with our choice of curve, only
with the underlying surface.
We can rewrite our result a bit and see another way of looking at it:

Qp(v) = ⟨ v , D²f(p) v ⟩ = (v¹, v²) [ fxx  fxy ;  fxy  fyy ] (v¹, v²)ᵀ

which makes the quadratic-form-ness of Qp a bit more visible. (The partials are of
course evaluated at p.)
Definition 2.4.1
Let M and p be, as always, a surface and a point on it. Let us also use
coordinates where locally we write the surface as the graph of a function f
over its tangent plane Tp M at p. Then the second fundamental form is defined as:

Ap(X, Y) = Σ_{i,j=1}^{2} (∂²f / ∂x^i ∂x^j) X^i Y^j    (2.15)
We can very quickly note that the second fundamental form is symmetric, in
the sense that Ap(X, Y) = Ap(Y, X), simply because the Hessian is a symmetric
matrix. You can also see the second fundamental form as the first Taylor coefficient
of f that actually tells us anything geometric. The zero-th coefficient just tells
us where the surface is, the first only how the tangent plane is oriented, both of
which are not geometric things. Only the second one tells us anything geometric,
that is, the curvature at the point p.
Note 5. The formula above can only work if we use coordinates where p =
(0, 0, 0) and the xy-plane is the tangent space, because there fx and fy vanish.
We need to carefully pick this coordinate system to extract the geometric infor-
mation. There is a general formula in arbitrary coordinates, but it is long and tedious,
which is the reason why we won't write it down here.
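A tiny example (illustrative, with an assumed surface): if the surface is locally the graph of f(x, y) = (ax² + by²)/2 over its tangent plane at the origin, then the Hessian is diag(a, b), so Ap(X, Y) = a X¹Y¹ + b X²Y², and in particular Ap(v, v) = a(v¹)² + b(v²)². The numbers a and b will turn out to be the principal curvatures k1, k2 at that point.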
To connect the second fundamental form to curvature, we note that Qp(v) = Ap(v, v) for unit tangent vectors v; this is exactly the quadratic form we found in the proof above. To diagonalize it, we recall the following fact from linear algebra.
Let's say you have a scalar product (not necessarily the standard euclidean
one) and a bilinear form. That is, you have the maps

⟨ , ⟩ : V × V → R    (2.17)
B : V × V → R    (2.18)

where the first has the usual properties of a scalar product and the second one
is linear in both entries and symmetrica. Then there exists an orthonormal
basis e1, e2, . . . , en of the vector space V, so that Bij = B(ei, ej) = λi δij
for some numbers λ1, λ2, . . . , λn. In other words, the matrix Bij is diagonal
with the λ's on the diagonal. The λ's are called the singular or principal
values.
a B(X, Y) = B(Y, X)
If you have not seen the singular value decomposition before, it simply states
that you can rotate and flip your coordinate system so as to turn a particular bilinear
form into the form:

B(v, w) = λ1 v¹w¹ + λ2 v²w² + · · · + λn vⁿwⁿ    (2.20)
Applying this to the second fundamental form, there is an orthonormal basis (e1, e2)
of Tp M in which

Ap = [ k1  0 ;  0  k2 ]    (2.21)

where k1 and k2 are some numbers that only depend on the surface. They are the
principal curvatures we mentioned when discussing Qp(v), which you can see quite
simply. If we work in that coordinate system, the second fundamental form applied to
the same vector twice becomes:

Qp(v) = Ap(v, v) = k1 (v¹)² + k2 (v²)²

You can verify (exercise) that the maxima and minima occur at v = ±e1, ±e2
in those special coordinates, which verifies that (1) k1, k2 are the extreme values of
Qp(v) and that (2) the directions of the principal axes of curvature are orthogonal
to each other (e1, e2 are orthogonal per construction).
We define two combinations of the principal curvatures:

H = k1 + k2    (2.24)
K = k1 k2    (2.25)

The first, H, is called the mean curvature, while the second, K, is called the
Gauss curvature. You might wonder why H is called the mean curvature when it
should really be (k1 + k2)/2. That used to be the old definition, but over time people got
sick of carrying the factor 1/2 around and dropped it.
What is the meaning of H? To understand it, we hope that by now you are
convinced that the matrix form of Ap in our special coordinates is just the Hessian
of the function that describes our surface, and that when we rotate Tp M, this
does not change. What does change is that the Hessian becomes diagonal in these
coordinates (which we will denote x′, y′). So what we get for that matrix is:

Ap = [ ∂²f/∂x′²  0 ;  0  ∂²f/∂y′² ] = [ k1  0 ;  0  k2 ]    (2.26)

So that:

H = k1 + k2 = ∂²f/∂x′² + ∂²f/∂y′²    (2.27)
which is the two-dimensional Laplacian of f. Now, the Laplacian tells us quite a bit
about average quantities, which also makes it clearer why H is called the mean
curvature. It tells us how much the function f deviates, on average, from its value at (0, 0) in
our coordinates, which is, of course, 0. It makes sense that this would be a good
value to describe a part of the curvature. Because the Laplacian is inevitably tied
up in (physical) problems like drums and soap films, it first came up way before
curvature in this sense was discussed. Surfaces with H = 0 are called minimal
surfaces, not only because of the history of the Laplacian, but also because they
are usually surfaces of minimal area for some boundary problem.
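A standard example (added for illustration): for the sphere of radius R both principal curvatures are equal to 1/R (or both equal to −1/R, depending on the choice of normal), so H = ±2/R and K = 1/R². The sign of H flips with the normal while K does not, which is exactly the point of the next paragraphs.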
Figure 2.13: The signs of k1 and k2 change with the other choice of N .
If we choose N1 as in the picture, k1 is positive, while if we use N2 , it is
negative. The absolute value stays the same however.
Now what about the Gauss curvature? Well, it has a very nice property: its sign
is independent of our choice of N, unlike the signs of k1, k2, and H. Consider first k1 and k2.
Figure 2.14: The Laplacian is locally just the average difference between f(x0 + ϵe)
and f(x0), where e ranges over all possible directions and ϵ is a parameter that goes
to 0.
Look at figure 2.13. There you should see how the sign of k1 changes depending on the choice
of our normal. If we choose N1, then the normal and the curvature vector of the
curve belonging to k1 point in the same direction; if we choose the other direction,
they point in opposite directions.
The fact that the sign of H tells us nothing geometric either can be seen alge-
braically quite easily:

H = k1 + k2    (2.28)

Therefore if we change the signs of k1 and k2, it is clear that the sign of H changes.
You can also look at this through the meaning of H. H is the two-dimensional
Laplacian of the function that the surface is a graph of (in good coordinates, that is).
If we reverse the direction of N, we practically just take −f as the function that M is a
graph of (locally). But the Laplacian is linear, which means it changes sign too, and
since H is just the Laplacian, it does too. The Gauss curvature, on the other hand,
does not change sign. The important thing to note is that if you change the signs
of both k1 and k2, their product, which is the Gauss curvature, does not change. The sign of the
Gauss curvature captures the relationship between the signs of the principal curvatures. If
they have the same sign (++ or −−), then K will be positive. Otherwise it will be
negative. The sign relationship of the principal curvatures is a very geometric thing.
In particular, if both principal curvatures have the same sign, then the curves that
they belong to curve in the same direction, and if they have opposite signs, in
opposite directions. In the first case you have something that looks like a bulge, in the
other something that looks like a saddle point, as you can see in figure 2.15.
Figure 2.15: Typical example for each of the possible signs of the Gauss
curvature. If it is positive, both curves curve in the same direction, creating
a bulge. Otherwise, they curve in opposite directions and create a saddle.
There is another way to express H and K, that might be more enlightening. If,
as always, we call the matrix of the second fundamental form (so the hessian of f )
A, then we get the result:
H = tr(A) (2.29)
K = det(A) (2.30)
as you should check yourself (exercise). These formulas are important, as they
illustrate the geometric character of H and K.
D²(f) = [ 1/R  0 ;  0  1/R ]    (2.31)
from which our results from above follow. If you actually did the calculation,
you are probably happy about the power of geometry.
• The point p for which we are calculating the second fundamental form
is a fixed point: S(p) = p.
• The tangent plane Tp M is preserved by S, which means S(Tp M) =
Tp M. In other words, the plane P that we reflect our space through is
perpendicular to Tp M.
Then, as I hope you can agree, S acts on the tangent plane like a reflection
through a line L which passes through the origin. Then the direction of
this line, as well as the direction orthogonal to it, are the principal axes of
curvature.
You can see the setup of the situation in figure 2.16 and the way the symmetry
acts on the tangent plane in figure 2.17. I hope you can appreciate how useful this
Figure 2.16: A surface that is symmetric under a reflection across the plane
P . The plane P is perpendicular to the tangent plane Tp M , and the line of
intersection of the two planes (L) is also drawn in.
theorem is. It is very easy to recognize a reflective symmetry like the one we need,
and if we do find it, we have pretty much immediately found the axes of curvature.
We then only need to calculate Ap for two combinations of e∥ and e⊥ to specify
Ap completely, and often these are quite easy to find (like with the sphere).
Proof. Let e⊥ and e∥ be an orthonormal system so that the first vector is
Figure 2.17: Here we drew the tangent plane and the principal axes of cur-
vature. The symmetry acts as a reflection along the line L and the two
directions e∥ , e⊥ turn out to be the axes of curvature (per the theorem we
will prove in this section.)
orthogonal to L and the second is parallel. We need to prove that A is diagonal in these
two axes. We will do this by contradiction. First, note that if k1 = k2,
then A = k1 I, which is already diagonal, so we can assume k1 ≠ k2. We will
show that any other orthonormal basis cannot diagonalize A8.
Step 1 Our first claim is that Ap cannot change under the symmetry, as it is a
geometric object. As a formula, we claim

Ap(S(X), S(Y)) = Ap(X, Y)    (2.32)

This is easy to see geometrically, because the curves flip under the sym-
metry.
Step 2 Let's assume there is another orthonormal basis of Tp M which diagonal-
izes A, and let's call these basis vectors e1, e2. Then S(e1), S(e2) also diago-
nalize A. This is easy to see because of the previous step: A(S(e1), S(e2)) =
A(e1, e2) = 0, and A(S(e1), S(e1)) = A(e1, e1) = k1, and similarly for the
other combinations. A in that basis has to have the form

A = [ k1  0 ;  0  k2 ]    (2.33)
Step 3 We now want to show that S(e1 ) cannot lie in the direction of e2 and vice
versa. Well, we know that the maximal and minimal values of curvature
8 This is why we excluded the case k1 = k2, because any orthonormal basis diagonalizes A
in that case.
lie in directions that are perpendicular to each other, see figure 2.5 for
a picture of an Ap with k1 ̸= k2 . It should be clear that the maximum
(k1 ) cannot occur for a vector between e1 and e2 and the same goes for
the minimum. Therefore S(e1 ) can also not have a component in the e2
direction.
Step 4 But then S(e1) has to lie along e1 and be of unit length (since S is an
isometry), and the same goes for e2. So we get the possibilities:

S(e1) = ±e1,   S(e2) = ±e2

Step 5 But which vectors have this property? Well, S is a reflection along L, so
the only vectors with this property are the ones that lie along L, which
get mapped onto themselves, and the ones lying on the line perpendicular
to L (L⊥), as you can see in figure 2.19. Therefore:

e1, e2 ∈ { ±e∥, ±e⊥ }    (2.37)

which is exactly what we wanted to show.
This theorem is very useful when computing curvatures and we will now provide
a few examples.
9 The fact that we can also have a ± in front of the vectors in the set is not important, as we can always flip the sign of a basis vector without changing whether A is diagonal (A(−e, −e) = A(e, e)).
Figure 2.19: The tangent plane (drawn twice) and the way vectors get
mapped onto other vectors when you reflect along L. The vectors drawn
in purple are the only ones that get mapped onto the same line that they
already lay on.
The catenoid is the surface of revolution of the function cosh(z). You take
the graph of cosh(z) as a function of z and rotate it around the z-axis to get
a surface that looks a bit like a sci-fi picture of a wormhole, as you can see
in figure 2.22. We have drawn in a point p lying on the ”original” graph of
cosh(z) that points in the x direction; its position on that line is arbitrary
though. For that point, we can use the xz-plane exactly as with the cylinder
(although the orientation is different) and get that the line tangent to the
graph is the one the reflection preserves. So we know one of the principal
directions: it is tangent to the graph. The other one is parallel to the y-axis, since
it needs to be orthogonal to the first and lie in the tangent plane. We leave it as
an exercise to you to show that the principal curvatures have opposite signs, namely
k1 = 1/cosh²z and k2 = −1/cosh²z, so that H = 0 (the surface is a minimal surface)
while K = −1/cosh⁴z < 0, meaning the catenoid is saddle-like everywhere and never
bulges locally.
Before continuing our discussion of curvature, we want to stop for a minute and
discuss how one defines derivatives on surfaces and fix the notation. We start by
introducing standard calculus notation and then defining derivatives on surfaces.
Proposition. Let f be a smooth function, x a point and X a vector at x. Then for any curve γ with γ(0) = x and γt(0) = X:
\[ D_X f(x) = \left.\frac{d}{dt}\, f(\gamma(t))\right|_{t=0} \]
This equation is very useful while calculating derivatives, but it is also quite intuitive. Simply put, to differentiate a function in the direction of X, take any curve that has (at the point of interest) its tangent vector equal to X and differentiate along that curve. Locally, they will look the same, after all. You can look at figure 2.6 for an example.
Proof. The proof of this Proposition is very simple. You basically use the chain
rule once, and get exactly what you need.
\[ \left.\frac{d f(\gamma(t))}{dt}\right|_{t=0} = Df(\gamma(0))\,\gamma_t(0) = Df(x)\, X = D_X f(x) \tag{2.43} \]
The last equation is just the definition of DX f (x). Notice that the multiplication
is a matrix multiplication.
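To see the proposition in action, here is a small numerical sketch (our own illustration, not from the lecture): we pick an arbitrary smooth f on R2, a point x, a direction X, and two different curves with the same tangent vector X at x, and check that differentiating f along either curve gives the same number as Df(x)X.

```python
import numpy as np

def f(p):
    # an arbitrary smooth test function on R^2
    return np.sin(p[0]) * p[1] ** 2

x = np.array([0.3, 1.2])       # base point
X = np.array([2.0, -1.0])      # direction vector

# two different curves through x with tangent vector X at t = 0
curves = [
    lambda t: x + t * X,                                   # straight line
    lambda t: x + t * X + t**2 * np.array([5.0, 3.0]),     # bends away for t != 0
]

h = 1e-6
for gamma in curves:
    # derivative of f along the curve at t = 0 (central difference)
    print((f(gamma(h)) - f(gamma(-h))) / (2 * h))

# compare with Df(x) X = grad f(x) . X
grad = np.array([np.cos(x[0]) * x[1] ** 2, 2 * np.sin(x[0]) * x[1]])
print(grad @ X)   # all three printed numbers agree to high precision
```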
We can also define the derivative of one vector field in the direction of another vector field, which we do pointwise. What we get is another vector field. Note that the derivative at p depends on the value of X only at p, but that the same is not true for Y. This is simply because we are interested in the change of Y along X(p). From Y we need values in a whole surrounding of p, so that we can calculate the change, while from X we only need X(p) for the direction.
Figure 2.22: The catenoid and the lines we need to calculate the principal axes of curvature using symmetry, for some point p.

The first kind of vector field of course contains the second, but the second is a bit special. It is special insofar as, geometrically, small tangent vectors (|v| < ε for some ε) essentially lie in the surface, because locally the surface looks like the tangent plane. You can see the difference between the two types in figure 2.6.2.
How can we differentiate a vector field along a surface M? Let's say we have vector fields X, Y defined on M and want to differentiate Y with respect to X, where X of course has to be tangent to M. How can we proceed? There are a few possibilities. We could, for example, take a curve that at t = 0 is at p and has the tangent vector X(p), use the proposition from two sections ago, and repeat this for all points. We will however use a different (but equivalent) way. We will extend X, Y to smooth vector fields X̃, Ỹ on an open set U ⊂ R3, so that X̃, Ỹ restricted to M match X, Y, and then define the derivative to be D_X Y = D_X̃ Ỹ on M.

Figure 2.23: An example of the intuition behind the Proposition, for an f : R2 → R. If you move along X and along the curve γ with a very tiny step (h is very small), the values you get will be very similar, which is why you get the same result.
Proof. We know that D_X̃ Ỹ depends only on the value X̃(p) = X(p) and on the values of Ỹ in a small surrounding of p (a surrounding in R3). Therefore the independence from the extension of X is given. We also know that:
\[ D_{\tilde X}\tilde Y = \left.\frac{d\, \tilde Y(\gamma(t))}{dt}\right|_{t=0} \tag{2.46} \]
for some γ(t) with γ(0) = p and γt(0) = X(p). We can use any smooth curve that satisfies the two conditions, in particular we can take a curve in M, so that the expression becomes:
\[ D_{\tilde X}\tilde Y = \left.\frac{d\, Y(\gamma(t))}{dt}\right|_{t=0} \]
which only uses the values of Y on M itself, so the result is independent of the extension of Y as well.
With this, we are done with the interlude and return to curvature. The main result of this section is the following theorem.
The first equality says that we can characterize curvature by how vector fields change along M, specifically by the component of that change normal to the surface. The second one tells us that we can characterize curvature by how the normal vector field changes on M.
We will prove this theorem, but we want to first discuss a few ways to see
this intuitively. We already saw that a tangent vectorfield has to change to stay
tangent to M . We can also see this, not as Y changing, but the tangent-plane
rotating as you move on M , which you can see in figure 2.7. This should be clear,
since the tangent-plane is in one sense the first linear approximation of M , and if
M curves then the tangent-plane also has to rotate. You can also see this as all the possible tangent-vectors changing to accommodate M's curving, and since the tangent-plane is built out of all of these vectors, it has to change as well. You can also describe the change of the tangent-plane by describing how the normal vector changes, since it has to change with the tangent-plane. This is how you get the second equality. It is clear that DX N has all the information about how N, and therefore Tp M, changes, and this is why this map gets a special name: it's called the Weingarten map.
There are a few things to note about the Weingarten map, that one can see
quite easily. Firstly, Wp is self-adjoint, because Ap is symmetric.
Figure 2.27: Vectors tangent to M live in the tangent plane at the point they come out of, which locally you can imagine as them living in the surface.
Secondly, we can once again use that a unit normal vector cannot change in its own direction, and conclude that Wp really maps into Tp M, because ⟨Wp(X), N⟩ = 0. You can see this by applying the same calculation we already did quite often. We know that ⟨N, N⟩ = 1, since the unit normal is of unit length. We can therefore calculate:
\[ 0 = D_X\langle N, N\rangle = 2\,\langle D_X N, N\rangle \tag{2.52} \]
which is exactly the claim we just described.
The example showed the way in which you can sometimes use the Weingarten
map to calculate something really fast. Imagine doing the calculation of Ap by
using the Hessian matrix. There are so many derivatives that you'd probably make a small mistake, and at the very least it would take way longer.
We will actually prove the second part first, as it is both easier and we need it to
prove the first part.
Proof. Let us first prove the second claim. It is not too hard to prove. It's the same calculation we already did very often. We know that ⟨Y, N⟩ = 0 everywhere, because Y is a tangent vector field, and N is a unit normal field and they are, per construction, orthogonal to each other. Then we get:
\[ 0 = D_X\langle Y, N\rangle = \langle D_X Y, N\rangle + \langle Y, D_X N\rangle \]
so that ⟨D_X Y, N⟩ = −⟨D_X N, Y⟩, as promised.
Proof. The setup of our proof is drawn in (a), which is the same as in the proof
that Qp is the same as Ap . This time we want to extend Y to a vector that is
in the tangent-plane of q. In (b) we find that our first step will be extending Y
to be constant on the tangent-plane.
Figure 2.28: Even the most boring tangent vector field, like Y here, has to change by a minimal, specific amount to stay tangent to M.
Step 3 We now think about which extension might be good to work with. We want to extend X, Y to a surrounding of p on M that contains the typical point q ∈ M. The choice we will make is to ensure that the extensions X̃(q), Ỹ(q) are both tangent to M at q, or in other words, that they are elements of Tq M.
We can construct our vectors10 first by extending them to constant vectors on Tp M. So we take Ỹ(q) = (Y^1, Y^2, ?), where we don't know the third component yet. We know that q = (x^1, x^2, f(x^1, x^2)) if it has coordinates x^1, x^2 on Tp M. When is a vector in q's tangent space? Well, if we have a graph, like f, we can see what we have to do by quickly looking at a one-dimensional example, which you can see in figure 2.7.2. Very similarly to the figure, a vector (Y^1, Y^2, Y^3) will be in the tangent space Tq M of q if Y^3 = df(Y^1, Y^2). So our extension becomes:
\[ \tilde Y(q) = \big(Y^1,\, Y^2,\, df\big|_q(Y^1, Y^2)\big) \]
Step 6 Calculate D_X Y|_p. We get:
\[
\begin{aligned}
D_X Y\big|_p = D_{\tilde X}\tilde Y\big|_p
&= \sum_{i,j=1}^{3} \tilde X^i \,\frac{\partial \tilde Y^j}{\partial x^i}\bigg|_p e_j && (2.61)\\
&= \sum_{i=1}^{3} X^i \,\frac{\partial}{\partial x^i}\bigg|_p \big(\tilde Y^1, \tilde Y^2, \tilde Y^3\big) && (2.62)\\
&= \sum_{i=1}^{3} X^i \,\frac{\partial}{\partial x^i}\bigg|_p \Big(Y^1,\, Y^2,\, \sum_{j=1}^{2}\frac{\partial f}{\partial x^j}(x^1,x^2)\,Y^j\Big) && (2.63)\\
&= \sum_{i=1}^{3}\sum_{j=1}^{2} \Big(0,\,0,\; X^i\,\frac{\partial}{\partial x^i}\frac{\partial f}{\partial x^j}\Big|_p Y^j\Big) && (2.64)\\
&= (0,\,0,\; X H Y) && (2.65)
\end{aligned}
\]
\[ k = \frac{u_{xx}}{\big(1 + u_x^2\big)^{3/2}} \tag{2.68} \]
The first, in principle, just describes the curvature vector of a parameterized curve.
The second one describes the curvature of a curve as a graph. We will start with
the first.
We can get a basis of Tp M at each point p, by taking the basis vectors of the
coordinate space e1 , e2 and applying dF |p onto them. That way we get two new
vectors (at each p):
Xi = dF |p(ei ) (2.70)
which (you should check) turn out to be basis-vectors of Tp M . We can also write
the Xi ’s out:
\[ X_i = \left(\frac{\partial F^\alpha}{\partial x^i}\right)_{\alpha=1}^{3} \tag{2.71} \]
We change notation a bit (to make the formula we will get readable in the end), so that for ∂F^α/∂x^i we write F^α_i. Similarly, we use F^α_{ij} for second derivatives, etc.
We can then define the metric g_{ij} = ⟨X_i, X_j⟩ = F^α_i F^α_j, which is a tool that tells us how to transform from changes of coordinates to changes in lengths. In actuality, g_{ij} is a collection of four maps, but it will turn out to be a tensor (in later chapters), and a mighty object in differential geometry, so we call it one map. Let us also define g^{ij} to be the inverse of g_{ij}, if you see g_{ij} as a matrix. Then for a parameterized surface we get the following formulas for some curvatures:
Figure 2.31: The setup of our proof is drawn in (a), which is the same
as in the proof that Qp is the same as Ap . This time we want to extend
Y to a vector that is in the tangent-plane of q. In (b) we find that our
first step will be extending Y to be constant on the tangent-plane.
\[ W = g^{-1}\, D^2F \cdot n \tag{2.73} \]
\[ H = \operatorname{tr}(W) \tag{2.75} \]
or written out:
\[ H = \sum_{i,j}\sum_{\alpha} g^{ij}\, F^\alpha_{ij}\, n^\alpha \tag{2.76} \]
or written out:
\[ H = n^\alpha\, \frac{F^\beta_2 F^\beta_2\, F^\alpha_{11} \;-\; 2\, F^\beta_1 F^\beta_2\, F^\alpha_{12} \;+\; F^\beta_1 F^\beta_1\, F^\alpha_{22}}{F^\gamma_1 F^\gamma_1\, F^\delta_2 F^\delta_2 \;-\; \big(F^\gamma_1 F^\gamma_2\big)^2} \tag{2.77} \]
where all indices repeated twice are summed over (Einstein convention).
For the Gauss-curvature K we get a similar result:
\[ K = \det(W) = \frac{F^\alpha_{11} n^\alpha\, F^\beta_{22} n^\beta \;-\; \big(F^\alpha_{12} n^\alpha\big)^2}{F^\gamma_1 F^\gamma_1\, F^\delta_2 F^\delta_2 \;-\; \big(F^\gamma_1 F^\gamma_2\big)^2} \tag{2.78} \]
The formulas might look horrible, which in a way they are. But they are certainly
useful, because often you know F and can simply plug the derivatives in and get H
and K. It’s a formula which is very easy to implement on a computer.
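To illustrate how mechanical this is, here is a small symbolic sketch (our own, not from the lecture) that builds W = g^{-1} D^2F · n for a given parameterization and reads off H = tr(W) and K = det(W). As a test surface we use the catenoid from the symmetry example, and we indeed recover H = 0 and K = −1/cosh^4(v).

```python
import sympy as sp

u, v = sp.symbols('u v', real=True)

# Parameterization of the catenoid (our test surface for this sketch)
F = sp.Matrix([sp.cosh(v) * sp.cos(u), sp.cosh(v) * sp.sin(u), v])

F1, F2 = F.diff(u), F.diff(v)            # tangent basis vectors X_1, X_2
nvec = F1.cross(F2)
n = nvec / sp.sqrt(nvec.dot(nvec))       # unit normal

# first fundamental form g_ij = <F_i, F_j>
g = sp.Matrix([[F1.dot(F1), F1.dot(F2)],
               [F2.dot(F1), F2.dot(F2)]])

# second fundamental form entries <F_ij, n>
h = sp.Matrix([[F.diff(u, u).dot(n), F.diff(u, v).dot(n)],
               [F.diff(v, u).dot(n), F.diff(v, v).dot(n)]])

W = g.inv() * h                          # Weingarten map  W = g^{-1} D^2F . n  (2.73)
H = sp.simplify(W.trace())               # H = tr(W)   (2.75)
K = sp.simplify(W.det())                 # K = det(W)  (2.78)

print(H)   # 0
print(K)   # -1/cosh(v)**4  (sympy may print an equivalent form)
```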
By the way, if you know H and K, you can figure out the principal curvatures quite easily. Let's say you want to find k1 and k2. Then you just have to realize the following.
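Spelled out, in the conventions of this script (H = tr(W) and K = det(W)):
\[ H = k_1 + k_2, \qquad K = k_1\, k_2, \]
so k_1 and k_2 are the two roots of the quadratic \( \lambda^2 - H\lambda + K = 0 \), that is
\[ k_{1,2} = \frac{H \pm \sqrt{H^2 - 4K}}{2}. \]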
Let's say you have a function z = f(x, y), so that the graph of f is the surface M for which you want to calculate H and K. Let's denote the partial derivative with respect to x by fx, as usual. Then you can write H and K as:
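For reference, the standard graph formulas, written in the same convention as above (H = tr(W), so no factor 1/2), are:
\[ H = \frac{(1 + f_y^2)\, f_{xx} - 2\, f_x f_y\, f_{xy} + (1 + f_x^2)\, f_{yy}}{\big(1 + f_x^2 + f_y^2\big)^{3/2}}, \qquad
K = \frac{f_{xx}\, f_{yy} - f_{xy}^2}{\big(1 + f_x^2 + f_y^2\big)^{2}}. \]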
As in the previous case, if you know H and K, you can solve for the principal curvatures by the same calculation. Notice the similarity of both cases to the formulas for a curve. Specifically, notice how both of the formulas for the case of a graph have a correction term (in the denominator) similar to the case of curves.
Figure 2.32: The idea of the third step of the proof, but in the one-dimensional case. A vector (x^1, x^2) is in the tangent space (lies in the tangent of f at p) if \( x^2 = df(x^1) = \frac{df}{dx}\, x^1 \).
Imagine that you are an ant on some surface, like the one in figure 2.9, and you cannot see "outside", into the third dimension. You stay entirely confined to the surface, which is your world. Imagine also that you are a curious, very smart ant. You also have a measuring tape, with which you can measure lengths (infinitely precisely), and you have access to infinite computational power; as we said, you are very, very smart. In this world, light can obviously not go in straight lines (unless the surface is flat), so we say it goes as straight as it can, and what the ant sees is a result of this. You can see all of this drawn in figure 2.9. From this idea, that is, what the ant can understand about the world it lives in from measuring lengths and looking around, we can define what we mean by an intrinsic quantity on the surface. Intrinsic quantities, simply said, are those that you can infer from measuring lengths only.
\[ L(\gamma) = \int_I |\gamma_t(t)|\, dt \]
where |γt| uses the norm in three-dimensional space. We know from our treatment of curves that this is independent of parametrization and only depends on the image of γ. It is something intrinsic, that the ant can measure.
From this we can define the distance between two points on the surface M. We simply pick the curve connecting the two points that has the smallest length (if it exists), otherwise the infimum:
\[ d_M(p, q) = \inf_{\gamma}\, L(\gamma) \]
for curves γ that connect p and q. Usually, a curve connecting the two with the property that its length is the distance between the points exists, but sometimes it does not. That is why we define it as an infimum13. We call a curve with the above property a geodesic.
The surface, equipped with these lengths as distances, is a metric space, as is
quite easy to show (exercise).
external structure of the R3 that the surface might live in, which is not something the ant can ever experience. You can see the difference in the following figure. We can package all of this intrinsic information into one object: the metric g, which at every point p is the bilinear form on Tp M obtained by restricting the R3 scalar product, g(p)(v, w) = ⟨v, w⟩.

Figure 2.33: The way we extend Y is by using the same (but two-dimensional) construction from figure 2.7.2.
This g in turn defines both dM and Lγ on the surface, i.e. it is the only thing you need to calculate L(γ) on the surface for any curve γ. It is also determined by L(γ) or dM. This can be proven, but we will abstain from doing so until later chapters. Our ant can therefore find it; it exists without any reference to the ambient space R3. We call anything that can be deduced from g in a sensible way intrinsic.
Note 6. Notice how both g(p) and A(p) are bilinear forms on Tp M . This is why
the first is called the first fundamental form, and the second is called the second
fundamental form. If you are wondering, there is also a third fundamental form,
which was introduced by Gauss, but it is not used much nowadays anymore,
because it doesn’t really add any new information, and can be calculated from
the first two.
Figure 2.34: The parameterization of the sphere. The lines we usually draw
on the sphere, are, as you know, the lines of constant longitude and latitude,
which are lines inherited from the coordinate space. We can also get basis-
vectors of Tp M for all p from the coordinate space, using the vectors Xi =
dF |p (ei ) as basis vectors of Tp M at p.
Figure 2.35: A surface, on which our very smart ant lives. It has a measuring
tape, and (somehow) access to infinitely much computational power, so that
it can figure out as much about the world it lives in, as possible.
Let (M, gM ) and (N, gN ) be two surfaces, with their own respective metrics.
We call a function ϕ : M → N an (intrinsic) isometry, if it is bijective,
smooth, and preserves distances. That is, for any two p, q ∈ M , if p̃ = ϕ(p)
and q̃ = ϕ(q), then
dM (p, q) = dN (p̃, q̃) (2.85)
or equivalently, if for any curve γ in M:
\[ L_M(\gamma) = L_N(\phi \circ \gamma) \]
Figure 2.36: A surface and the two different distances we can define. The
first one in (a) comes from the surface itself, and can at least in principle
be measured by our ant. The second one in (b) comes from the structure
of R3 and is, in general, not equal to the one in (a), and the ant can never
feel this length.
The cone, similarly to the cylinder, can, locally, be unfurled into a flat piece
of paper. Exercise: Try both examples with paper.
In general, any surface that can be unrolled into a piece of paper is called developable, and through any point of such a surface there will be a line that is straight (in the ambient space sense), so that k1 = 0 and K = 0.
This hints at the following result:
So by measuring angles, the ant can figure out if the space she lives in is curved or flat! Another way she can do this is by using circles, not triangles. A circle, in her world, is the set of points equally distant (at some "radius" r) from some point, which she of course calls the centre of the circle. If she lives in flat space, she will find that the area of the circle is πr². But if she lives in a curved space, then this fact also doesn't have to be true; you can take a look at figure 2.10.
One can show that the area will be:
\[ A_p(r) = \pi r^2 - \frac{\pi}{12}\, K(p)\, r^4 + \cdots \tag{2.89} \]
which is not πr²!
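As a quick sanity check of (2.89) (our own, not from the lecture): on the unit sphere K ≡ 1, and the geodesic circle of radius r is a spherical cap whose area is 2π(1 − cos r); expanding this in r reproduces the formula exactly.

```python
import sympy as sp

r = sp.symbols('r', positive=True)

# On the unit sphere (K = 1), the geodesic circle of radius r is a spherical
# cap with area 2*pi*(1 - cos r).
cap_area = 2 * sp.pi * (1 - sp.cos(r))

print(sp.series(cap_area, r, 0, 6))
# pi*r**2 - pi*r**4/12 + O(r**6)   -- matching (2.89) with K(p) = 1
```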
Note 7. There is a formula for K in terms of the metric, but it is rather long
and not very useful for our purposes right now.
\[ K = g^{ij} g^{kl}\, \frac{\partial^2 g_{ij}}{\partial x^k\, \partial x^l} + \ldots \tag{2.90} \]
This is obvious for the case of the sphere (it has genus zero, so χ(S) = 2): you can just give it the usual metric of the sphere. Very surprising is the result that one can do this with the torus. You cannot do it by embedding the torus in three dimensions, but you can do it by embedding it in four. The idea is presented in the figures below.
With this last theorem, we move away from curves and surfaces, and move
towards the theory of differential geometry on manifolds, making a short stop along
the way to catch up on important concepts from topology.
Figure 2.42: On a sphere, the area of what the ant would call a circle is not
πr2 .
Figure 2.43: Examples for surfaces with genus 0, 1, 2. The genus is the
number of tunnels the surface has.
Part II
Manifolds
We have discussed the geometry of curves and surfaces extensively in the first
part of the lecture. We saw many ideas, like curvature and intrinsicness that became
very big themes and useful tools. The goal of this lecture is to extend these ideas
to general (geometric) spaces and see some of the fantastic results and tools that
this approach brings. Before we can do that however, we need a bit of technical
knowledge. For curves and surfaces, to define our tools, we needed to use certain
concepts constantly. Continuity, open sets, neighbourhoods, vectors, tangent vec-
tors and coordinates are only some of these. Usually, they were rather trivial to handle, because we stuck to the ambient space Rn and we know all of these concepts in Rn quite well from courses like calculus. A vector is, in its most simple form, just an arrow in Rn and is easy to understand. But we want to go beyond this simple idea of our geometric objects sitting in Rn. As a particular example, our world, according to general relativity, is a four-dimensional space that is not R4 but also does not live in any Rn. We want to extend ideas like surroundings and vectors
to abstract spaces which do not necessarily sit in some embedding space. This is
the topic of this part of the lecture. Our first step will be open sets, since we need
them for basic things like continuity. The topic of open sets belongs to a broad
field in mathematics, called topology, which we will explore briefly14 . Afterwards we
will handle coordinates and charts and define exactly what we mean by a smooth
manifold. Finally, we will talk about vectors in the settings of manifolds.
14 Any mathematician who has had a good lecture on topology can skip that part safely.
Chapter 3

Topology and topological manifolds
Where do these definitions come from? Well, the first idea is that an open set, intuitively speaking, is one that has no border, while a closed one does. You can clearly see that for simple sets like the ones in figures 3.1–3.3, this is true.
Figure 3.1: A set without a border. No matter where, no matter how close
to the border, you can always find a small enough radius so that the ball
around that point is still entirely contained in X.
We can see that, at least in these examples, the definition matches our intuition, so we can accept it and see what other sets are open and closed in that case.
We can easily see that a set consisting of a single point is closed (it is its own
border) and that often (but not always) the question of open/closed comes down
to whether we include a border in our set or not.
We want to point out a few ideas that follow from our definitions below.
Figure 3.2: If you have a piece of the border, then you cannot do the same thing as we did in the previous example. Any ball around a point on the border will, by definition, always contain a bit of X and a bit of the rest of Rn. So if you have a piece of the border, the set cannot be open, which matches our intuition.
Figure 3.3: If you have a set with a border, you can quite easily convince yourself that its complement is open, since no part of the border is in the complement.
• Any intersection of closed sets is closed. This follows from the second
item and the definition.
You can see the two examples given in the proposition in figure 3.4.
We can also express continuity of a function through open sets.
Figure 3.4: On the left you have the union of a few open sets. Any point in the combined set is still surrounded by a ball (circle) containing only points of the combined set, because the ball (circle) from the original set the point comes from works for this. On the right, you have the intersection of many balls (circles) whose radii are getting smaller and smaller (all open). After intersecting all of them, you get a set containing only the origin, which is not open anymore.
• ∅, X ∈ T
• If Ui ∈ T for some indexation I, then ⋃_{i∈I} Ui ∈ T.
• If U1, . . . , Un ∈ T for some finite indexation, then ⋂_{i=1}^{n} Ui ∈ T.
This definition might look slightly different, but it is exactly the three ideas we wanted to steal from Rn. Indexation here simply means that we have a set (for example {1, 2, 3, . . . }) that we use as indexes; finite means that this set has a finite number of elements.
We can immediately define continuity of a function between two topological
spaces.
But before we go into all that detail, which, while interesting, is not the direct
subject of the lecture, we want to talk about a few more ideas that will be more
directly relevant to us.
Figure 3.5: The earth and a map of the earth. We can have a map of only part of the earth, and as we know, none of the maps we have can represent the geometry of the earth well. A common example is that Greenland on the Mercator map appears similar in size to Africa, even though in reality it is about one-fifteenth of Africa's size.
Well, mathematically, what we need is (1) the piece of the space the chart
describes, (2) the piece of Rn you draw the map in and (3) a way to assign every
point in (1) a point in (2). The latter is of course just the description of a function.
We sometimes call Ch the chart and sometimes (U, Ch) the chart (as a tuple),
depending on which one is most convenient1 .
With this definition, we have fulfilled both (1) and (3), and (2) is just Ch(U ).
As you can see in the definition, we require bijectivity, that is, we don’t want our
map to have two points on the map corresponding to the same point in the space,
nor do we want two points on the space being shown as one on the map.
We can also see this more topologically, by introducing the notion of a homeo-
morphism.
Figure 3.6: A topological space X with an open subset U and its map/chart onto a portion Ch(U) of Rn. The coordinate lines from the map can be projected back onto the space, giving us coordinates for the space.
Or, simply put, every open set on S^2 is the piece of a bigger open subset of R^3 that it matches with on the sphere.
There exists a standard chart, which is simply the chart that describes the sphere by longitude and latitude. It's easier to write the parametrization down first, so we will start with that:
\[ P(\theta, \phi) = \begin{pmatrix} \cos(\phi)\sin(\theta)\\ \sin(\phi)\sin(\theta)\\ \cos(\theta) \end{pmatrix} \]
From this, we can construct a chart by taking the inverse. You can work this out quite easily. If we call the subset of the sphere that the parametrization points to U, then we get:
\[ Ch(x, y, z) = \big(\arccos(z),\ \arctan(y/x)\big) \]
with the extension of the arctan to the case where x = 0. You can check (or should know) that this is a bijective continuous function, so (U, Ch) is a chart. There are, however, a myriad of other charts you could use for the sphere, some of which we will introduce later.
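As a small illustration (our own sketch; we assume the convention that θ ∈ (0, π) is measured from the north pole and φ is the azimuth, and we use atan2 for the extended arctan), the parametrization and the chart are a few lines of code, and one can check numerically that they are inverse to each other on U:

```python
import numpy as np

def P(theta, phi):
    # parametrization of the sphere by polar angle theta and azimuth phi
    return np.array([np.cos(phi) * np.sin(theta),
                     np.sin(phi) * np.sin(theta),
                     np.cos(theta)])

def Ch(p):
    # the chart: invert the parametrization on its image
    x, y, z = p
    theta = np.arccos(z)
    phi = np.arctan2(y, x)   # arctan(y/x), extended to x = 0 (and all quadrants)
    return np.array([theta, phi])

theta, phi = 1.1, 0.4
print(Ch(P(theta, phi)))     # [1.1, 0.4] -- chart and parametrization are inverse
```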
We have developed the tools to add another condition for what we want to work
with, calling the new type of space a topological manifold. The new requirement
will be quite simple. We will want to be able to use charts. We want to require
that the whole space can be covered by charts, i.e., that there is no point in the
space, where you cannot construct a chart for any of its surroundings2 . We do not
require, however, that there is one chart that covers the entire space. You can easily see why in examples like example 3.3.
Before we add this extra structure, however, we will want to talk about another
condition that we will want to have, which eliminates (geometrically) ”pathological”
examples that we will definitely not want to work with, at least in this lecture.
We will want our new space to have a condition called the Hausdorff condition to
eliminate some weird examples we don’t want to have. Consider the example below
for a pathological case of a topological space that we won’t want to work with.
Figure 3.7: We can create the real line with two origins, by glueing together
two real lines everywhere, except at the origins of the two lines. This is
shown in (a). Sets that would be open on a real line (and have the origin
in them) are open if they contain at least one of the origins.
What is the problem with cases such as the real line with two origins? Well, in
some sense, there are two points where there should be one. You can’t separate
them. In a topological way of speaking, there aren’t two open sets that each contain
one of the points, but that don’t intersect. The way we will avoid this is simply by
requiring it and throwing away all other cases.
We will require this of any space we work with and build this into our new
definition. We will call the new kind of space a topological manifold.
There are many examples of topological manifolds; any curve or surface will do. The circle is a great example, and so is the torus.
The ant can now chart its entire space, but it cannot yet do much with these charts; in particular, things like derivatives are still out of its reach. Our next goal will be to enable the ant to tell how things change when she changes the coordinates. This will be the topic of the next chapter.
Chapter 4

Smooth Manifolds
We have taught the ant how to recognize where things are spatially. We now want
to prepare to teach it calculus. We won’t teach it calculus just yet, but we will
prepare it to do so, by making sure the charts it uses are compatible with each
other, which is the main topic of this chapter.
We have seen that a topological manifold can be covered by continuous (in the topological sense) charts. To differentiate, we would need something a bit better: some differentiable, or preferably smooth, structure. Right now, there does not seem to be an obvious, sensible, geometric way to define a derivative on the space without making grand ad hoc assumptions that we are not prepared to make, since we want to stay quite general. But let us, for the sake of the argument, say that we have found a way to do it. There is an immediate way that differentiating could go wrong if we do not pose further restrictions on our charts.
Imagine we have two charts, (U1 , Ch1 ) and (U2 , Ch2 ), either covering the same
region or covering regions that overlap somewhere (U1 ∩ U2 ̸= ∅) and we have some
function, let’s call it f : M → R. Whatever our derivative should be, the one
thing we will want is that if f is differentiable in the new sense, and if the charts
are sensible, then f should be differentiable as a function of the coordinates. But
this should hold for any chart we want, so in particular it should hold for Ch1 and
Ch2 . We can guarantee this if the transition map from one chart to another is
differentiable. This will be the topic of this chapter.
Let (U1, Ch1) and (U2, Ch2) be two charts, with U1 ∩ U2 not empty, that is, there is a region on M that both charts cover. Call this region U. Then we define the transition map from the first to the second chart as:
\[ T_{1\to 2} := Ch_2 \circ Ch_1^{-1} : Ch_1(U) \to Ch_2(U) \]
You can see a picture with all the objects we are using right now in figure 4.1.
Figure 4.1: A picture showing all the main players of this chapter. We have
(two) charts Ch1 , Ch2 each covering a region U1 , U2 of the manifold. Each
chart has its own inverse, P1 , P2 associated with it, which are parametriza-
tions of the manifolds. On the region where U1 and U2 overlap (called U )
we can define transition maps T1→2 , T2→1 , which change charts.
From now on, since we will be working with charts a lot and want to declutter
notation, we will not write on which sets each chart is defined, these are implied
to be the ”reasonable” ones. It is, however, a good exercise, to keep track of
these, especially for the exam. We will also call U1 and Ch1 (U1 ) the same thing,
even though they are definitely not. Our reasoning is that Ch1 (U1 ) is our chart
representation of U1 and in the same way you can point at a map and say ”Here
is America”, even though you are pointing at a chart of America, you can call
Ch1 (U1 ), U1 .
We can now formalize our idea, by calling two charts compatible if their transition maps are differentiable. The only thing we will change is to require smoothness, for convenience.
Let (U1 , Ch1 ) and (U2 , Ch2 ) be two charts of M . We call the two charts
(smoothly) compatible if either U1 and U2 don’t overlap, or otherwise, if
their transition functions T1→2 and T2→1 are smooth.
An atlas is a set A of charts (Ui, Chi) which cover M and are all smoothly compatible with each other (all the transition maps between the charts are smooth).
• The set M has a topology TM , which is the set of all open subsets of
M.
• The space fulfils the Hausdorff condition, that is, every pair of two
different points can be separated by open sets.
The last two examples aren’t too surprising. They are completely analogous to
their equivalents we saw while discussing surfaces and since spaces like surfaces are
what we want to generalize, it looks like we are on the right track here.
Now that we have the general examples done, we give a few more specific ones.
4.2. EXAMPLES OF SMOOTH MANIFOLDS 129
Figure 4.3: The six different charts (only the open sets are shown)
that you need to map S 2 with the first method.
It turns out that you don't actually need this many charts to cover S^n; you can do with just two. What you need for this is the stereographic projection. You can see the stereographic projection of S^2 onto the plane in figure 4.2. The way you project a point p = (x^1, . . . , x^n, x^{n+1}) from the sphere onto the (hyper)plane is by drawing a straight line from the north pole N = (0, . . . , 0, 1) through p. The point at which the line hits Rn is the point p gets mapped onto.
Notice that in two dimensions it is easy to see that the southern hemi-
sphere gets mapped onto the disc inside the sphere, while the northern
one takes up all of the rest of the plane. This way you can create a
chart that covers the entire sphere except for the north pole. This is
the first chart. The second one is also the stereographic projection,
but this time you do it from the south pole, and that covers the entire
sphere except for the south pole. Together you have an atlas of size
two.
As an exercise, you should derive these formulas and show that they
are compatible.
You can also check that the charts from the first method (2n + 2
charts) are compatible with the charts obtained from the stereographic
projection.
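Here is a small sketch of the two-chart atlas (our own, using the common convention of projecting onto the equatorial plane x3 = 0; the lecture's exact formulas are the exercise above, so treat these as an illustration). In these conventions the transition map between the two stereographic charts is u ↦ u/|u|², which is clearly smooth away from u = 0, i.e. on the overlap.

```python
import numpy as np

def ch_N(p):
    # stereographic projection from the north pole (defined away from N)
    x, y, z = p
    return np.array([x, y]) / (1 - z)

def ch_S(p):
    # stereographic projection from the south pole (defined away from S)
    x, y, z = p
    return np.array([x, y]) / (1 + z)

def transition_N_to_S(u):
    # T_{N->S} = ch_S o ch_N^{-1}; with these conventions it is u / |u|^2,
    # which is smooth wherever it is defined (u != 0)
    return u / np.dot(u, u)

# check on a point of S^2 away from both poles
p = np.array([0.3, -0.5, 0.2])
p = p / np.linalg.norm(p)
print(ch_S(p))
print(transition_N_to_S(ch_N(p)))   # same result: the transition map works
```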
The Mobius strip is another example of a smooth manifold. You can see
the Mobius strip in a bit of a different form in the figure below.
But which atlas should we choose to "define" the sphere as a manifold? And do we get different geometries from different atlases like that? In the case of the sphere, it certainly would be weird if we got different geometric results depending on whether we used the stereographic atlas or the graph atlas. On top of that, working in only one or the other is not really comfortable. For example, if you choose the graph atlas, there is no chart on which you can see both the north and the south pole on a single map; you need to use at least two charts, which is both more complicated and somehow seems like an unnecessary problem. There is an easy solution to this: throw both of these atlases together. We already asked you to show that the charts you get from the graphs and the ones from the stereographic projections are compatible, so if you throw all of these together, you still get a fully functional atlas.
In fact, while we are at it, why not just throw all possible compatible charts together into one atlas, call it a maximal atlas and be done with it? This is exactly what we choose to do, and it will be the modification to our definition.
Let A be an atlas of some smooth manifold (as per our preliminary defini-
tion). We can construct a maximal atlas by collecting together all possible
compatible charts with A. The maximal atlas is defined as:
Ā = {all charts (U, Ch) that are compatible with all (UA , ChA ) ∈ A }
(4.5)
Of course, to construct the maximal atlas we need some atlas to start with, and
the maximal atlas will depend on this. If you have two atlases that are compatible
with each other, however, they of course produce the same maximal atlas. In that
case we call the two atlases equivalent.
It should make sense, of course, that Ā is an atlas in its own right, i.e. that all charts in Ā are compatible with each other. We will give the proof, because it is similar to many other proofs of this kind in differential geometry, and it is good to have seen its kind once.
We will need two ideas for the proof, which seem quite trivial and are not too
hard to prove.
This idea is quite easy to accept, since smoothness should be a local property,
after all, it is a generalization of the ϵ − δ kind of continuity and differentiation.
The second idea is even simpler.
Again this should feel obvious and the proof is not hard, it’s another one of the
typical chain rule proofs of differential geometry.
We will leave these two claims unproven, since their proofs are neither hard nor
illuminating and focus on the original idea we want to prove.
Proof. We want to show that Ā, which is a maximal atlas constructed from A,
is an atlas in its own right, that is, every two charts in Ā are compatible with
each other.
We start with choosing two charts in Ā, and we call them (V, ChV ) and
(W, ChW ). We want to show that they are compatible.
Let Z = V ∩ W and we can assume that it is not empty, otherwise, we are
done. We want to show that TV →W (on Z) is a smooth map. The basic idea of
the proof is that we do not transition from the first chart to the other directly,
but go over the charts from A (see figure 4.3). This is why we needed the second lemma. The first we need because, in general, it is not necessary that A contains a chart that works on all of Z, so we have to cut Z into parts and prove it for each part, which is where our lemma will come in handy. Let's start.
Firstly, cut Z up into all the pieces where a chart in A exists that covers that portion of Z (and maybe more beyond Z). By construction of the maximal atlas, ChV and ChW are compatible with each (Ui, Chi) ∈ A. That is, define the sets Zi = Z ∩ Ui, which are open and cover Z. Then ChV(Z) = ⋃_{i∈I} ChV(Zi), and all of these sets are open as well, since ChV has to be continuous.
Take TV→W and split it up into smaller maps over the ChV(Zi). Because of the first lemma, we only need to show that TV→W |ChV(Zi) is smooth for each i; the smoothness of TV→W as a whole map then follows from the lemma. But we can use that:
\[ T_{V\to W} = Ch_W \circ Ch_V^{-1} = \big(Ch_W \circ Ch_i^{-1}\big)\circ\big(Ch_i \circ Ch_V^{-1}\big) \]
where all the functions are, of course, restricted to either Zi (all the charts) or Chi(Zi) (all the parameterizations and transition maps), which has been left out for readability.
But ChV and ChW have to be compatible with all the Chi, so the two transition maps in the above equation are smooth. Since the composition of two smooth functions is smooth, TV→W |ChV(Zi) has to be smooth for all charts in the atlas, and since the Zi cover Z, the whole transition map TV→W has to be smooth.
• The set M has a topology TM , which is the set of all open subsets of
M.
• The space fulfils the Hausdorff condition, that is, every pair of two
different points can be separated by open sets.
• The space is equipped with a maximal atlas Ā of charts, which are
smoothly compatible.
• The charts are all homeomorphisms. (This comes from the definition of a topological manifold.)
Next, we will look at the cartesian product of smooth manifolds and also at the relationship between smooth functions between manifolds and atlases.
Let M and N be two manifolds, with dimensions m and n. Then you can
construct an atlas for M × N out of the two (maximal) atlases ĀM and ĀN
that belong to M and N . If (UM , ChM ) and (UN , ChN ) are two charts
from the individual atlases, then you can create a chart for UM × UN by
taking the cross product:
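Written out (in our notation), the product chart is simply
\[ Ch_{M\times N}(p, q) = \big(Ch_M(p),\ Ch_N(q)\big) \in \mathbb{R}^{m+n}, \qquad (p, q) \in U_M \times U_N. \]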
You can check that with this atlas (or the maximal version of it) you get a
smooth manifold, by checking all the conditions for a smooth manifold.
This definition also forces the dimension of M × N to be m + n as you would
expect, which you should convince yourself of.
Let (U, ChU ) ∈ ĀM and (V, ChV ) ∈ ĀN be two charts and f : M → N
the function whose smoothness we want to check. The two charts are called
an admissible pair if f (U ) ⊂ V , that is if the whole set U gets mapped into
V.
Notice first that we did not define f to have to be smooth in coordinates for all
admissible pairs, only that one exists. You might ask yourself if this means that a
smooth f might not be smooth for an admissible pair that wasn’t used to check its
smoothness. It turns out that the answer is no. You only need to check smoothness
in one atlas, not the maximal atlas, and the definition then forces f to be smooth
as a function of coordinates for all other admissible pairs. The proof of this claim
is very similar to the proof that all charts in the maximal atlas are compatible with
each other. You take two charts that are an admissible pair, break U up into small
pieces for which there is a chart where it is smooth (as per the definition), and use
the fact that smoothness of functions between two Rn is local. We therefore choose
not to give it here.
From this definition, you can immediately see that the following corollary has to
be true.
The last two points should make it reasonable that our definition makes
sense.
Another thing we definitely would like to have is that the composition of two
smooth functions is a smooth function and that smooth functions in the above sense
are also continuous in the topological sense, both of which turn out to be true.
Since both of these propositions seem very natural and their proofs aren't too interesting, we leave these out. You can show the first one easily by using charts,
similar to other proofs in this chapter and the second by using that continuity is a
local property and charts are homeomorphic.
4.6 Diffeomorphisms
Now that we have done a lot of technical detail, we want to talk about when (at
this stage) you cannot tell the difference between two manifolds, similar to how
two topological spaces are pretty much the same thing (equivalent, homeomorphic) if there is a homeomorphism between them. The main point back then was that we needed a bijection between the two spaces that is continuous in both directions. The idea was that
we have a structure on the space, and if the structure is the same between two
spaces, we cannot tell the difference with any tool we have that comes from these
structures. The only new structure we have at this stage is a smooth atlas, so you
are probably not too surprised that the bijection will need to be smooth and its
inverse as well. When we have such a map, we call it a diffeomorphism and the two
spaces are diffeomorphic.
• f is a bijection.
• f is a homeomorphism. If the two spaces are supposed to be "the same space", then we shouldn't be able to tell them apart by their topologies, so this condition makes sense.
• f is smooth, and its inverse f −1 is also smooth. This is to make sure
that the maximal atlases are ”pretty much the same” and we cannot
tell M and N apart from their smooth structure.
There are a few things to note about the definition. Firstly, it forces dim(M ) =
dim(N ), which should make sense. It should not be possible for the circle and the
sphere to be the same thing, and they are not. Secondly, the second requirement is
not actually necessary, because charts are homeomorphic and the second condition
follows from the other two. A diffeomorphism is automatically a homeomorphism without the second condition; we just added it so that you could see very clearly how a diffeomorphism respects the entire structure, not just the atlas.
Chapter 5

Tangent Vectors
Now that we have defined what a smooth manifold is, we want to be able to do
something inside of it. It is all well and good to talk about diffeomorphisms and
charts, but we need some objects in the manifold to have interesting results about.
The obvious first candidate for something interesting on a smooth manifold is a
vector. After all, Rn , curves and surfaces all have vectors associated with them and
a lot of the nice results from the first part of the lecture had something to do with
vectors.
There is a small problem with just "lifting" the definition of vectors from Rn to manifolds directly, without any thought. The problem is that a manifold does not, in general, have a natural vectorspace structure. You can't "add" points on a general manifold in a very sensible way. What is the north pole plus the south pole on a sphere which is not embedded in R3? This question does not even seem sensible, and any addition you would define at this stage would seem very arbitrary and definitely not natural. Now, there are many ways to look at the idea of what a vector in Rn is. You have the obvious one, a vector is "an arrow", or the mathematical one of it being an element of a set with an addition and scalar multiplication which obey the following axioms...
But neither of these helps us right now. Of course, we always want to draw vectors as arrows, and mathematically, the objects we want to work with should be vectors in the vector-space sense, but these viewpoints are not helpful yet.
One idea is clear even at this stage. Whatever concept of a vector we choose to generalize, we will only generalize tangent vectors (for example of curves/surfaces). The reason is simply that all other vectors on curves and surfaces (normal
vectors) did not come from the curve/surface itself but the Rn that we embedded
the curve/surface in. So only tangent-vectors are appropriate if we don’t want to
have the influence of some ambient space that our manifold lives in.
There are four different ideas we can generalize out of Rn that end up being
equivalent.
• The first one is very simple. We treat vectors in charts. Vectors are vectors in charts, where vectors make sense (since charts go to Rn), and we can point at a vector in a chart, look at another chart, and through the transition map check which vector it is there. This definition defines tangent-vectors through equivalence classes of vectors on charts. We don't know what they are on M, but we can chart them.
• A bit more on the computational side, we can define vectors through directional derivatives of smooth functions. We know that you can differentiate a smooth function f : Rn → R in the direction of X and get
\[ D_X f = X^1 \frac{\partial f}{\partial x^1} + X^2 \frac{\partial f}{\partial x^2} + \cdots + X^n \frac{\partial f}{\partial x^n} \tag{5.1} \]
This derivative contains the same information as what we would usually call the vector, that is, the quantities (X^1, X^2, . . . , X^n).
• More on the geometric side, we can use curves to define tangent vectors. We
can use the fact that from a geometric standpoint, a vector tangent to a
curve (multiplied with a small ϵ) looks like a small piece of the curve itself, in
the usual Rn cases. This way we define a tangent vector as a ”small piece of
a curve”. Since many curves have the same tangent vector, we will also use
equivalence classes here.
• On the more abstract side, we can define tangent vectors as linear operators
on smooth functions to R from the manifold, which satisfy the product rule.
X op (f g) = X op (f )g + f X op (g) (5.2)
These four ways are all equivalent to each other and can all be used to define
tangent vectors. They all represent different ways to think about tangent vectors
(charts, computation, geometric (curves), abstract) and this variety gives you a lot
of ways to tackle a mathematical problem. Sometimes the geometric picture will
be more applicable, sometimes the computational etc.
Of particular note is the second definition, which, because of its relationship with partial derivatives, has fostered a notation in differential geometry which can be confusing at first, but to which one gets used quite fast.
For our purposes, the first two definitions will be most useful, and we will take
the most time discussing them.
Let's say we have two charts of a region around p, which are connected by T1→2. Then if you are working in the first chart, which means you are working in the first Rn, you know what a vector is in the chart: it's just a tuple (X^1, . . . , X^n) ∈ Rn situated at p, and you can work with it in the chart. What happens when you use the second chart? Well, the coordinates of the vector X, which previously lived on the first chart, will just be the Jacobian of the transition map at p applied to X.
or written out:
\[ X'^i = \sum_{j=1}^{n} \frac{\partial T^i}{\partial x^j}\, X^j \tag{5.4} \]
where we wrote T = (T 1 (x), . . . , T n (x)) instead of T1→2 , and when clear will
continue to do so.
By taking all other charts you can find which vector in the other charts X
corresponds to and what you get is a working definition of a vector, without having
really talked about the manifold, at all. Figure 5.1.2 might make this clearer.
Let’s say we have a plane flying over the point p with a physically real velocity
described by the vector X on the Mercator chart (square). We can describe the
path of the plane perfectly well in that chart without any reference to the actual
Earth. If we want to switch to a new chart, maybe because it is easier to see
something or represent our country as bigger than others, we can easily do that
with the transition map and bring the velocity vector to the new chart using the
Jacobian of the transition map. This is the idea of this definition: we take vectors in charts and use them only in charts, really, with no mention of the manifold.
We want to pack the idea of using vectors in charts into a working definition. The
way is quite simple. We say a vector is simply the collection of all vectors in charts
that transform into each other, and we can take one representative (in one chart)
as the example of the vector we talk about.
Figure 5.2: The Mercator and stereographic projection of the earth (/parts
of the earth). Imagine at a real point p on the earth, there is a plane flying
with a velocity described by the (blue) tangent vector X. We can talk of
its velocity vector as an arrow on both of these pictures (charts) without
making any reference to the actual manifold, that is the earth. We can
describe the path of the plane and all the information about it we would like
using only the charts and nothing else.
We always have the point p be a part of the vector, a tangent vector never
exists without a point p it sits at.
Figure 5.3: In Rn , you can transport vectors and compare them without
problems, we therefore don’t need to always say which point the vector is
situated at. But if we move onto the sphere, this is not the case anymore.
You can transport a vector on the sphere so that it locally stays parallel to
itself, and after performing a loop, come back and get a different vector!
\[
\begin{aligned}
D_X u(p) = Du(p)(X) &= \sum_{i=1}^{n} \frac{\partial u}{\partial x^i}\, X^i && (5.5)\\
&= X^1 \frac{\partial u}{\partial x^1} + X^2 \frac{\partial u}{\partial x^2} + \cdots + X^n \frac{\partial u}{\partial x^n} && (5.6)
\end{aligned}
\]
We can use this. The components of X are very explicit in this equation. Notice also
that if you view this over all possible smooth functions, the directional derivatives
are operators on smooth functions. Even more so, every tangent vector produces
its own derivative operator, and two different tangent vectors produce two different
operators.
We can also rewrite the above equation using curves. If γ is a curve so that
γ(0) = p, γ ′ (0) = X, then:
\[
\begin{aligned}
\left.\frac{d\, u(\gamma(t))}{dt}\right|_{t=0} &= \gamma'(0)^1 \frac{\partial u}{\partial x^1} + \gamma'(0)^2 \frac{\partial u}{\partial x^2} + \cdots + \gamma'(0)^n \frac{\partial u}{\partial x^n} && (5.8)\\
&= X^1 \frac{\partial u}{\partial x^1} + X^2 \frac{\partial u}{\partial x^2} + \cdots + X^n \frac{\partial u}{\partial x^n} = D_X u(p) && (5.9)
\end{aligned}
\]
Again, notice that if we take all possible smooth functions, different vectors produce
different operators. We can use the last equation very easily to define vectors on
manifolds, we don’t need any other structure. We just take the last equation as the
definition.
\[ X = (p, X^{op}) \tag{5.11} \]
\[ X^{op} : C^\infty(M) \to \mathbb{R} \tag{5.12} \]
\[ X^{op}(u) = \left.\frac{d\, u(\gamma(t))}{dt}\right|_{t=0} \tag{5.13} \]
where γ is some curve on M, but always the same curve for all the functions u.
Notice how, as with the first definition, we use external objects (there charts,
here smooth functions) to define things on M . This is quite common in differential
geometry. We will drop the op from X op from now on and just call it X as well,
but always have it in the back of our head that we need a point p where the vector
sits, for the same reason as with the first definition. As a reminder we will also
sometimes write Xp instead of X.
We can define the tangent space again.
Notice that this time Tp M ⊂ {p} × Hom(C∞(M), R), which is a vector space. This time, however, it is totally not obvious that Tp M is a vector space. It is in no way obvious that, for example, if X, Y are in Tp M, then X + Y is in it too, because we do not know how to add curves on M. In Rn it is obvious, but certainly not on a general manifold. Imagine even just the earth and two curves, for example the ones of a plane flying from Zurich to London and from Zurich to New York. There is no reasonable way of adding them. What would that even mean? Would you end up in Greenland?
Where we can do this is in charts, however. The one important point here, though, is that the curve we get is dependent on the chart, and depending on the chart, if you add the two curves from before, you can get anywhere from Greenland to Brazil; but locally the two curves will be the same, in the sense that they will have the same tangent vector.
We will not prove that this version of Tp M satisfies all the axioms of a vectorspace; rather, and more interestingly, we will show that it is a whole vectorspace in the sense that it has a basis and that basis spans the entire vectorspace, leaving the remaining axioms to you as an exercise.
Figure 5.5: We don't know how to really add curves sensibly on a manifold directly. We can do it in charts, but depending on which chart we use, the curves will look different. Here, on this example of the earth, the addition of the two paths of our plane lands you in Cuba if you use the Mercator projection, or somewhere in Antarctica if you use the stereographic projection. That is not an ambiguity we want if we don't want our vacation ruined. Notice, however, that both curves result in the same tangent vector.
So we are forced to use charts. Let us fix a point p and a chart Ch, which also
charts p.
We can construct a basis of Tp M by using the basis of Rn . For simplicity, let
us say that p gets mapped to (0, 0, . . . , 0). Then we have the basis vectors of Rn
at the origin (=p̃ = Ch(p)), which we can call e1 , e2 , . . . , en . Which curves could
we use to construct the basis of Tp M ? Well, the coordinate lines of course! The
coordinate lines are defined as follows:
\[ \tilde\beta_i(t) = t\, e_i \tag{5.14} \]
where β̃i is the i-th coordinate line. Then we can project them back onto the manifold, βi(t) = P_Ch(β̃i(t)), as you can see in figure 5.2.2.
We can then define the i-th basis vector of Tp M as the one that one gets from the i-th coordinate axis, projected back onto the manifold. In the coordinate space, this vector would simply belong to the operator (plug the components of e_i, that is X = e_i, into the directional derivative):
\[ X^1\frac{\partial}{\partial x^1} + X^2\frac{\partial}{\partial x^2} + \cdots + X^n\frac{\partial}{\partial x^n} = \frac{\partial}{\partial x^i} \tag{5.15} \]
This motivates a new notation for this basis. We can now write:
\[ \left.\frac{\partial}{\partial x^i}\right|^{op}_{p,\,Ch} = \text{the vector gotten from the } i\text{-th coordinate line} \tag{5.16} \]
where we remind ourselves that it sits at p and that it is definitely something that
comes from Ch and that it is an operator. (In future, we will leave out all these
little reminders.)
We can turn this into a definition:
\[ \left.\frac{\partial}{\partial x^i}\right|_{p}(u) = \left.\frac{d}{dt}\, u(\beta_i(t))\right|_{t=0} \]
where βi is the i-th coordinate line of the coordinate space Ch(U), projected back onto the manifold. What is left to do now is to prove that this is a basis. For this, we would first like to work out the coefficients of our vectors a bit more explicitly.
\[ X \cdot u = \left.\frac{d}{dt}\, u(\gamma(t))\right|_{t=0} \]
for some curve γ which is appropriate for X. Now, this is a map from R to R, going over the manifold. We can eliminate the manifold by going to the coordinate space and back with a chart and a parametrization.
\[ X \cdot u = \left.\frac{d}{dt}\, \big((u \circ Ch^{-1}) \circ (Ch \circ \gamma)\big)(t)\right|_{t=0} \tag{5.19} \]
The first (from the right) is simply the curve γ, drawn into the coordinate space,
not onto the manifold, which we will also call γ̃. The second one is simply the
function u, but as a function of the coordinates, not of the points on M , which we
will call ũ. We can now use the chain rule:
\[
\begin{aligned}
X^{op}\cdot u &= \left.\frac{d}{dt}\, \tilde u \circ \tilde\gamma(t)\right|_{t=0} && (5.20)\\
&= \sum_{i=1}^{n} \frac{\partial \tilde u}{\partial x^i}\, \frac{d\tilde\gamma^i}{dt}(0) = \sum_{i=1}^{n} \frac{d\tilde\gamma^i}{dt}\, \frac{\partial \tilde u}{\partial x^i} && (5.21)\\
&= \sum_{i=1}^{n} \frac{d\tilde\gamma^i}{dt}\, \left(\frac{\partial}{\partial x^i}\right)^{op}(u) && (5.22)
\end{aligned}
\]
where we have used the fact that ∂ũ/∂x^i is simply the derivative of u in the i-th coordinate direction, which is exactly the operator ∂/∂x^i applied to u. Comparing this with the general form of a vector in our basis, we can read off the coefficients of X:
\[ (X^1, \ldots, X^n) = \left(\frac{d\tilde\gamma^1}{dt}, \ldots, \frac{d\tilde\gamma^n}{dt}\right) \tag{5.23} \]
where γ̃ = Ch ◦ γ is the curve γ, but in coordinate space, not on the manifold.
5.2.3 Proving that { ∂/∂x^i }_{i=1,...,n} is a basis
We now want to show that { ∂/∂x^i }_{i=1,...,n} is a basis. This means we need to show three things: firstly, that all vectors spanned by the basis are in Tp M; secondly, that all vectors in Tp M are spanned by the basis; and thirdly, that the basis is linearly independent.
Figure 5.6: We can use the coordinate lines from a chart Ch to define our
basis-vectors of Tp M , by projecting them onto the manifold.
Proposition 5.2.1: The vectors { ∂/∂x^i }_{i=1,...,n} form a basis of Tp M
The vectors { ∂/∂x^i }_{i=1,...,n} form a basis of Tp M, that is:
• Tp M ⊂ span(∂/∂x^1, . . . , ∂/∂x^n)
• span(∂/∂x^1, . . . , ∂/∂x^n) ⊂ Tp M
• the vectors ∂/∂x^1, . . . , ∂/∂x^n are linearly independent.
But we have already shown the first claim in the last section, when we found the coefficients of a general vector in Tp M, because we wrote X out as a linear combination of this basis. So we only have the second and third claims to prove.
The proof of the second claim is easy to understand. We need to show that any X = Σ_{i=1}^{n} X^i ∂/∂x^i is in Tp M, which means we need to find a curve that generates these coefficients. But the curve (in coordinate space) γ̃(t) = t(X^1, . . . , X^n) will do3. Figure 5.2.3 should make this almost obvious.
Figure 5.7: We can use the curve that just ”extends” X in coordinate space
for our proof
The only thing left to show is that the basis is linearly independent, which we leave to you as an exercise; it's not too hard.
3 Here, we are still using the mentioned simplification, which is that Ch(p) = 0. If you drop it, simply use the shifted curve γ̃(t) = Ch(p) + t(X^1, . . . , X^n) instead.
Figure 5.8: We can see vectors as small pieces of curves, locally, if we think
of them geometrically.
This definition is a lot more geometric and is useful to think about in very
pictorial settings. You can see the equivalence relationship in figure 5.3
Figure 5.9: A lot of curves, even very wild ones, are equivalent in our definition. The wild behaviour is cut off by the equivalence relationship, so we have a way of talking about "small" parts of the curve.
\[ D_X(u\, v) = (D_X u)\, v + u\, (D_X v) \]
for some functions u, v. It turns out that, like the directional derivative, we can also
generalize this. We can take the space of all linear transformations that take smooth
functions on M as the input and output a real number, Hom(C ∞ (M ), R), which
contains things like tangent vectors (as derivatives), but also things like multiplying
functions with the number two, which is clearly not a derivative-operator.
We can then notice, that derivative operators should satisfy the product rule,
while non-derivative operators do not. For example, with multiplying by two, in
general, we have:
2 · (uv) ̸= u(2v) + v(2u) = 4uv (5.26)
With this we get to our definition:
\[ X^{op}(u\, v) = X^{op}(u)\, v + u\, X^{op}(v) \qquad \text{for all } u, v \in C^\infty(M) \]
It is quite tricky to work with this definition, because we don't have charts or derivatives in it, so one needs to argue very algebraically, and it quickly turns into an indigestible mess, so we won't pursue this further.
What we do want to mention is that the vectors from our previous definitions do satisfy the Leibniz rule, so you can use it comfortably. The proof is similar to what we did before: lots of chain rules, working in charts and applying the $\mathbb{R}^n$ Leibniz rule.
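Here is a rough sketch of that computation (ours, under the usual simplification $Ch(p) = 0$), using that $\widetilde{uv} = \tilde u\,\tilde v$ and $\tilde u(0) = u(p)$:
$$X \cdot (uv) = \sum_{i=1}^n X^i\, \frac{\partial (\tilde u\,\tilde v)}{\partial x^i}\Big|_0 = \sum_{i=1}^n X^i \left( \tilde u(0)\, \frac{\partial \tilde v}{\partial x^i}\Big|_0 + \tilde v(0)\, \frac{\partial \tilde u}{\partial x^i}\Big|_0 \right) = u(p)\,(X \cdot v) + v(p)\,(X \cdot u).$$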
Recall that in $\mathbb{R}^n$ we often change coordinates, because, for example, something is easier in polar coordinates and sometimes in Cartesian ones. We want to generalize this to arbitrary coordinates.
More specifically, we want to look at how tangent vectors transform. For this,
fix M, p and two charts Ch1 , Ch2 and a tangent vector X.
$$X = \sum_{i=1}^n X^i\, \frac{\partial}{\partial x^i}\Big|_{Ch_1} \qquad (5.28)$$
where X 1 , . . . , X n are the components of X in the basis of the first chart. What
happens if we change the charts? That is, what are the components in the other
chart, Ch2 ?
We can start by working out how to express the basis vectors of the old chart in terms of the basis vectors of the new chart.
We do this by unravelling the definition of the differential operator $\frac{\partial}{\partial x^i}\big|_{Ch_1}$. We know that:
$$\frac{\partial}{\partial x^i}\Big|_{Ch_1} \cdot u = \frac{\partial}{\partial x^i}\,(u \circ Ch_1^{-1}) \qquad (5.29)$$
where the right-hand side is an ordinary partial derivative in the first coordinate space. But we know how to transform coordinates. We call the coordinates in the second chart $y^1, \dots, y^n$. We can change to the second chart by inserting an identity, that is, by going to the second coordinate space and back, and then using the $\mathbb{R}^n$ chain rule:
$$\frac{\partial}{\partial x^i}\Big|_{Ch_1} \cdot u = \frac{\partial}{\partial x^i}\,(u \circ Ch_1^{-1}) \qquad (5.30)$$
$$= \frac{\partial}{\partial x^i}\,\big(u \circ (Ch_2^{-1} \circ Ch_2) \circ Ch_1^{-1}\big) \qquad (5.31)$$
$$= \frac{\partial}{\partial x^i}\,\big((u \circ Ch_2^{-1}) \circ (Ch_2 \circ Ch_1^{-1})\big) \qquad (5.32)$$
$$= \sum_{j=1}^n \frac{\partial (u \circ Ch_2^{-1})}{\partial y^j}\, \frac{\partial (Ch_2 \circ Ch_1^{-1})^j}{\partial x^i} \qquad (5.33)$$
$$= \sum_{j=1}^n \frac{\partial (Ch_2 \circ Ch_1^{-1})^j}{\partial x^i}\, \frac{\partial}{\partial y^j}\Big|_{Ch_2} \cdot u \qquad (5.34)$$
We have thus found the transformation of the basis vectors of one chart into those of the other. We can now see how a vector $X$, which is of course nothing but a linear combination of the basis vectors of the first chart, transforms.
$$X = \sum_{i=1}^n X^i\, \frac{\partial}{\partial x^i}\Big|_{Ch_1} = \sum_{i=1}^n \sum_{j=1}^n X^i\, \frac{\partial (Ch_2 \circ Ch_1^{-1})^j}{\partial x^i}\, \frac{\partial}{\partial y^j}\Big|_{Ch_2} \qquad (5.35)$$
$$= \sum_{j=1}^n \left( \sum_{i=1}^n X^i\, \frac{\partial (Ch_2 \circ Ch_1^{-1})^j}{\partial x^i} \right) \frac{\partial}{\partial y^j}\Big|_{Ch_2} \qquad (5.36)$$
So we have found how the coefficients change. If we call the coefficients in the
second basis (in the second chart) X̂ 1 , . . . , X̂ n , then we can write them as:
$$\hat X^j = \sum_{i=1}^n X^i\, \frac{\partial (Ch_2 \circ Ch_1^{-1})^j}{\partial x^i} \qquad (5.38)$$
Note that this is just a matrix multiplication, and we can write it more simply using our transformation notation, since $Ch_2 \circ Ch_1^{-1} = T_{1\to 2}$:
$$X_{Ch_2} = \left(\frac{\partial T_{1\to 2}^j}{\partial x^i}\right)_{\!j,i} X_{Ch_1}$$
where $X_{Ch_2}$ is the vector $X$ written as a column in the second basis, and $X_{Ch_1}$ is, equivalently, $X$ written as a column in the first basis.
At this point, we want to introduce a new notation, which is very common in much of the differential geometry literature, and understandably so. We can think of the coordinates of the first chart as functions
$$x^i := (Ch_1)^i : U_1 \subset M \to \mathbb{R},$$
each of which gives every point on the covered subset of the manifold a real number. We do the same thing with the other chart and call the resulting coordinate functions $y^1, \dots, y^n$.
We can then do the above calculations with these as actual partials. But what is
the transition map, then? Well, as you can imagine, it takes the coordinates of a
point in the first chart and spits out the corresponding coordinates in the second
chart:
$$T_{1\to 2} = \begin{pmatrix} T^1(x^1, \dots, x^n) \\ T^2(x^1, \dots, x^n) \\ \vdots \\ T^n(x^1, \dots, x^n) \end{pmatrix} = \begin{pmatrix} y^1(x^1, \dots, x^n) \\ y^2(x^1, \dots, x^n) \\ \vdots \\ y^n(x^1, \dots, x^n) \end{pmatrix} \qquad (5.42)$$
so we can write:
$$\frac{\partial y^j}{\partial x^i} \quad \text{for} \quad \frac{\partial T_{1\to 2}^j}{\partial x^i} = \frac{\partial (Ch_2 \circ Ch_1^{-1})^j}{\partial x^i} \qquad (5.43)$$
and the formula as:
$$X_{Ch_2}^j = \sum_{i=1}^n \frac{\partial y^j}{\partial x^i}\, X_{Ch_1}^i \qquad (5.44)$$
which looks very similar to the chain rule for real functions. Note, however, that there is a bit of an abuse of notation here: it is certainly not mathematically rigorous to apply the chain rule to these symbols without thought; there is more behind them.
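As a concrete illustration (our own example, not from the lecture), take $M = \mathbb{R}^2 \setminus \{0\}$, let $Ch_1$ be the Cartesian coordinates $(x, y)$ and $Ch_2$ the polar coordinates $(r, \varphi)$, and look at the point $p = (1, 0)$, where $r = 1$, $\varphi = 0$. On the right half-plane the transition map is $r(x, y) = \sqrt{x^2 + y^2}$, $\varphi(x, y) = \arctan(y/x)$, so
$$\frac{\partial r}{\partial x}\Big|_p = \frac{x}{\sqrt{x^2+y^2}}\Big|_p = 1, \quad \frac{\partial r}{\partial y}\Big|_p = 0, \quad \frac{\partial \varphi}{\partial x}\Big|_p = \frac{-y}{x^2+y^2}\Big|_p = 0, \quad \frac{\partial \varphi}{\partial y}\Big|_p = \frac{x}{x^2+y^2}\Big|_p = 1.$$
Formula (5.44) then says that the vector with Cartesian components $(X^x, X^y) = (0, 1)$ has polar components $(X^r, X^\varphi) = (0, 1)$, i.e. $\frac{\partial}{\partial y}\big|_p = \frac{\partial}{\partial \varphi}\big|_p$ at this particular point.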
5.6 Differentiation of a function between manifolds
Let us now go to the case where we have two manifolds $M, N$ and a smooth function $f : M \to N$ between them. What is the derivative of $f$, $df(p)$? We expect it to be similar to a Jacobian in $\mathbb{R}^n$, taking tangent vectors to tangent vectors.
Figure 5.10: You can see two manifolds, M and N and a function between
them. We want a derivative df , that is a function that takes tangent vectors
of M (X) to tangent vectors of N (Y ).
How can we do this? The geometric idea is quite simple. We know $f$ and want to know what happens to a vector $X \in T_pM$. That vector is created by some curve on $M$, call it $\alpha$, through the second definition. We know what happens to the curve if we apply $f$ on the manifold: it gets mapped to some other curve $\beta = f \circ \alpha$. But then the vector associated with $\beta$ at $\tilde p = f(p)$ should be the vector the derivative maps $X$ to!
If $X$ is generated by the curve $\alpha$, that is,
$$X \cdot u = \frac{d}{dt}\Big|_{0}\, u(\alpha(t)) \qquad (5.45)$$
then
$$Y = df(p)(X) \in T_{\tilde p}N \qquad (5.46)$$
is defined by:
$$Y \cdot v = \frac{d}{dt}\Big|_{0}\, v(\beta(t)) \qquad (5.47)$$
for all $v \in C^\infty(N)$, where $\beta$ is the curve $\beta = f \circ \alpha$.
Notice that nowhere in the definition did we use any charts whatsoever! This is
a purely geometric, chart-independent object we have.
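A tiny worked example (ours, not from the lecture): take $M = \mathbb{R}^2$, $N = \mathbb{R}$, $f(x, y) = x^2 + y^2$, $p = (1, 2)$ and $X = \frac{\partial}{\partial x}\big|_p$, generated by the curve $\alpha(t) = (1 + t, 2)$. Then $\beta(t) = f(\alpha(t)) = (1+t)^2 + 4$, and for any $v \in C^\infty(\mathbb{R})$:
$$Y \cdot v = \frac{d}{dt}\Big|_0\, v\big((1+t)^2 + 4\big) = 2\, v'(5),$$
so $df(p)(X) = 2\,\frac{d}{ds}\big|_{s=5} \in T_5\mathbb{R}$, exactly the directional derivative you would expect from the gradient $(2x, 2y)|_p = (2, 4)$ paired with the direction $(1, 0)$.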
There are, however, a few things we need to make sure work if this definition is to make sense.
Most importantly, $Y$ should not depend on which curve $\alpha$ we chose to generate $X$. But notice:
$$Y \cdot v = \frac{d}{dt}\Big|_0\, v(\beta(t)) \qquad (5.48)$$
$$= \frac{d}{dt}\Big|_0\, v(f(\alpha(t))) \qquad (5.49)$$
$$= X \cdot (v \circ f) \qquad (5.50)$$
so the right-hand side depends only on $X$ and $f$, and we obtain
$$Y \cdot v = X \cdot (v \circ f) \qquad (5.51)$$
which we will call proposition X from now on. We will use it in the next section to prove the chain rule for functions between manifolds.
We now want to prove the chain rule for maps between manifolds: for smooth maps $f : M \to N$ and $g : N \to P$ (a third manifold), we claim that $d(g \circ f)(p) = dg(f(p)) \circ df(p)$. You can show this in two ways: you can either express the whole thing in coordinates and ”inherit” the chain rule from $\mathbb{R}^n$, or you can do it directly and abstractly. Neither way is better, but since many of our proofs until now have leaned more toward the first type, we will use the second method instead.
Proof. We need a bit of setup, since we have a lot of players in this proof. We have the manifolds, the maps, the vectors and the general smooth functions we need for the vectors to act on. We show all the players in Figure 5.11.
Our strategy is to write each of the parts of the chain rule equation using
proposition X and then collect them together.
$$df(p)(X) \cdot v = X \cdot (v \circ f) \qquad (5.54)$$
$$dg(q)(Y) \cdot w = Y \cdot (w \circ g) \qquad (5.55)$$
$$d(g \circ f)(p)(X) \cdot w = X \cdot (w \circ (g \circ f)) \qquad (5.56)$$
We can now set Y = df (p)(X) and q = f (p) and insert into the right side of the
chain rule.
$$dg(f(p))(df(p)(X)) \cdot w = df(p)(X) \cdot (w \circ g) \qquad (5.57)$$
$$= X \cdot ((w \circ g) \circ f) \qquad (5.58)$$
$$= d(g \circ f)(p)(X) \cdot w \qquad (5.59)$$
So we get the desired equation:
$$d(g \circ f)_p = dg_{f(p)} \circ df_p \qquad (5.60)$$
since w, X and p were all general.
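A quick concrete check (our own example, using the ordinary calculus Jacobians in the identity charts): take $f : \mathbb{R} \to \mathbb{R}^2$, $f(t) = (\cos t, \sin t)$ and $g : \mathbb{R}^2 \to \mathbb{R}$, $g(x, y) = x^2 + y^2$. Then $g \circ f \equiv 1$, so $d(g \circ f)_t = 0$, and indeed
$$dg_{f(t)} \circ df_t = \begin{pmatrix} 2\cos t & 2\sin t \end{pmatrix} \begin{pmatrix} -\sin t \\ \cos t \end{pmatrix} = -2\cos t \sin t + 2\sin t \cos t = 0,$$
as the chain rule demands.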
Figure 5.11: You can see all the actors we need in the proof in this figure.
5.8 The coordinate expression for df(p)

Let us now also express $df(p)$ in charts. Pick a chart $Ch_M$ around $p$ on the $m$-dimensional manifold $M$ (with coordinates $x^1, \dots, x^m$) and a chart $Ch_N$ around $q = f(p)$ on the $n$-dimensional manifold $N$ (with coordinates $y^1, \dots, y^n$), and write $\tilde p = Ch_M(p)$, $\tilde q = Ch_N(q)$. Using proposition X we compute:
$$Y \cdot v = X \cdot (v \circ f) \qquad (5.61)$$
$$= \sum_{i=1}^m X^i\, \frac{\partial}{\partial x^i}\Big|_{p,\,Ch_M} (v \circ f) \qquad (5.62)$$
$$= \sum_{i=1}^m X^i\, \frac{\partial}{\partial x^i}\Big|_{\tilde p,\,Ch_M} (v \circ f \circ Ch_M^{-1}) \qquad (5.63)$$
$$= \sum_{i=1}^m X^i\, \frac{\partial}{\partial x^i}\Big|_{\tilde p,\,Ch_M} (\tilde v \circ \tilde f) \qquad (5.64)$$
where in the last expression, the tilde means that the functions are their represen-
tations in charts, and the partial derivative becomes the simple Rn partial we all
know and love. We can then use the Rn chain rule.
Figure 5.12: We present the standard picture with functions between man-
ifolds again, with the charts ChM and ChN . Our goal is to find the
coordinate representation of df (p)
$$= \sum_{i=1}^m \sum_{j=1}^n X^i\, \frac{\partial \tilde v}{\partial y^j}\Big|_{\tilde q}\, \frac{\partial \tilde f^j}{\partial x^i}\Big|_{\tilde p} \qquad (5.65)$$
where the partials in $y$ are taken in the chart of $N$ and the ones in $x$ in the chart of $M$. If we rearrange a bit, and realize that we can turn $\frac{\partial \tilde v}{\partial y^j}\big|_{\tilde q}$ back into the operator, and that these are simply the standard basis vectors at $\tilde q$ in $Ch_N$, we get:
$$= \sum_{j=1}^n \left( \sum_{i=1}^m \frac{\partial \tilde f^j}{\partial x^i}\Big|_{\tilde p}\, X^i \right) \frac{\partial \tilde v}{\partial y^j}\Big|_{\tilde q} \qquad (5.67)$$
Comparing this with $Y \cdot v = \sum_{j=1}^n Y^j\, \frac{\partial \tilde v}{\partial y^j}\big|_{\tilde q}$, we can read off the components of $Y$:
$$Y^j = \sum_{i=1}^m \frac{\partial \tilde f^j}{\partial x^i}\, X^i \qquad (5.68)$$
Again, we find a result that is parallel to the case of $\mathbb{R}^n$, since if $f$ were a map from some $\mathbb{R}^m$ to some $\mathbb{R}^n$, we would get the exact same result. The (column) vector $Y$ we get is simply the Jacobian (in the charts) applied to $X$ (in the chart)! We can introduce a new matrix notation for the chart Jacobian of $df(p)$. We can write:
$$df(p)^j{}_i = \big(df(p)_{Ch_M, Ch_N}\big)^j{}_i = \frac{\partial \tilde f^j}{\partial x^i} \qquad (5.69)$$
Then we can write the above result as:
$$Y^j = df(p)^j{}_i\, X^i \qquad (5.70)$$
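For instance (our own example): if, in charts, $\tilde f(x^1, x^2) = \big((x^1)^2,\; x^1 x^2\big)$, then at $\tilde p = (1, 2)$ the chart Jacobian is
$$df(p) = \begin{pmatrix} 2x^1 & 0 \\ x^2 & x^1 \end{pmatrix}\bigg|_{(1,2)} = \begin{pmatrix} 2 & 0 \\ 2 & 1 \end{pmatrix},$$
so a tangent vector with components $X = (1, 1)$ in $Ch_M$ gets mapped to $Y = (2, 3)$ in $Ch_N$.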