Differential Geometry Class Notes
Richard Koch
Preface
1 Curves
1.1 Introduction
1.2 Arc Length
1.3 Reparameterization
1.4 Curvature
1.5 The Moving Frame
1.6 The Frenet-Serret Formulas
1.7 The Goal of Curve Theory
1.8 The Fundamental Theorem
1.9 Computers
1.10 The Possibility that κ = 0
1.11 Formulas for γ(t)
1.12 Higher Dimensions
2 Surfaces
2.1 Parameterized Surfaces
2.2 Tangent Vectors
2.3 Directional Derivatives
2.4 Vector Fields
2.5 The Lie Bracket
2.6 The Metric Tensor
2.7 Geodesics
2.8 Energy
2.9 The Geodesic Equation
2.10 An Example
2.11 Geodesics on a Sphere
2.12 Geodesics on a Surface of Revolution
2.13 Geodesics on a Poincare Disk
3 Extrinsic Theory
3.1 Differentiating the Normal Vector
3.2 Vector Fields in R^3
3.3 Differentiating Vector Fields
3.4 Basic Differentiation Facts
3.5 The Normal Field
3.6 Decomposing Vectors
3.7 The Fundamental Decomposition
3.8 The Crucial Results
3.9 The Principal Axis Theorem
3.10 Principal Curvatures
3.11 A Formula for b
3.12 A Formula for B
3.13 Justification of the Definition
3.14 Gaussian Curvature and Mean Curvature
3.15 Examples
3.16 Algebraic Postscript
Differential geometry began in 1827 with a paper of Gauss titled General Investigations of
Curved Surfaces. In the paper, Gauss recalled Euler’s definition of the curvature of such
a surface at a point p. Then he completely transformed the subject by asking a profound
question.
Here is Euler’s definition of curvature. Suppose p is a point on a surface S. We can translate
and rotate this surface without changing the curvature at p. Let us translate and rotate
until p lies at the origin in the xy-plane and its tangent plane is this xy-plane.
After this motion, we can imagine that the new surface is given by z = f (x, y). Expanding
in a Taylor series about the origin, we have
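\[ z = f(0,0) + \frac{\partial f}{\partial x}\,x + \frac{\partial f}{\partial y}\,y + \frac{1}{2!}\left( \frac{\partial^2 f}{\partial x^2}\,x^2 + 2\,\frac{\partial^2 f}{\partial x\,\partial y}\,xy + \frac{\partial^2 f}{\partial y^2}\,y^2 \right) + \dots \]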
where all partial derivatives are computed at the origin. But since the tangent plane at the
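origin is the xy-plane, both partials ∂f/∂x and ∂f/∂y vanish there, and f(0, 0) = 0 because p lies at the origin. The Taylor expansion therefore reduces
to
\[ f(x,y) = \frac{1}{2!}\left( \frac{\partial^2 f}{\partial x^2}\,x^2 + 2\,\frac{\partial^2 f}{\partial x\,\partial y}\,xy + \frac{\partial^2 f}{\partial y^2}\,y^2 \right) + \dots \]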
It is convenient to call these three second partials A, B, and C and thus write
\[ f(x,y) = \frac{1}{2!}\left( A x^2 + 2Bxy + C y^2 \right) + \dots \]
We are still free to rotate the surface about the z-axis, since such rotation leaves the tangent
plane unchanged. A famous algebraic result asserts that we can eliminate the xy quadratic
term by rotating the function appropriately. Many people are familiar with this result in
a different context. The solution set of the equation Ax^2 + 2Bxy + Cy^2 = D is an ellipse in
general position. If we rotate this ellipse until its axes are parallel to the x and y axes, its equation
changes to the form x^2/a^2 + y^2/b^2 = 1.
Imagine that we have eliminated the xy term in our formula. The resulting numbers A
and C are then called κ1 and κ2 in the literature, and the surface is given by the Taylor
expansion
\[ f(x,y) = \frac{\kappa_1}{2}\,x^2 + \frac{\kappa_2}{2}\,y^2 + \dots \]
All of this was done by Euler. The numbers κ1 and κ2 describe the curvature of the
surface at p. Euler could compute these numbers directly from the formula z = F (x, y) of
the original surface without physically rotating and translating.
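The rotation that removes the xy term can be checked numerically. The following is a minimal sketch in Python, assuming A, B, C are the second partials at the origin; the function names are illustrative and do not come from the notes. It uses the standard fact that rotating by θ with tan 2θ = 2B/(A − C) kills the cross term, and that the two remaining coefficients are the eigenvalues of the symmetric matrix with rows (A, B) and (B, C).

```python
import math

def principal_curvatures(A, B, C):
    """Eigenvalues of the symmetric matrix [[A, B], [B, C]] in closed form."""
    mean = (A + C) / 2.0
    radius = math.hypot((A - C) / 2.0, B)
    return mean + radius, mean - radius

def rotation_angle(A, B, C):
    """An angle theta that eliminates the xy term: tan(2 theta) = 2B / (A - C)."""
    return 0.5 * math.atan2(2.0 * B, A - C)

def rotated_coefficients(A, B, C, theta):
    """Coefficients (A', B', C') of A x^2 + 2B xy + C y^2 after substituting
    x = c x' - s y', y = s x' + c y' with c = cos(theta), s = sin(theta)."""
    c, s = math.cos(theta), math.sin(theta)
    A2 = A * c * c + 2 * B * s * c + C * s * s
    B2 = (C - A) * s * c + B * (c * c - s * s)
    C2 = A * s * s - 2 * B * s * c + C * c * c
    return A2, B2, C2

# Hypothetical example: f(x, y) = (x^2 + 4xy + y^2) / 2, so A = 1, B = 2, C = 1.
k1, k2 = principal_curvatures(1.0, 2.0, 1.0)
A2, B2, C2 = rotated_coefficients(1.0, 2.0, 1.0, rotation_angle(1.0, 2.0, 1.0))
print(k1, k2)        # principal curvatures
print(A2, B2, C2)    # B2 should vanish up to rounding
```

In this example the product κ1 κ2 equals AC − B^2, which is negative, so the surface is saddle-shaped at the origin.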
Notice that the numbers κ1 and κ2 depend on p and thus become functions on the surface.
They are called the principal curvatures of the surface. The pictures below show the
surfaces
\[ z = \frac{\kappa_1}{2}\,x^2 + \frac{\kappa_2}{2}\,y^2 \]
for both κ’s positive, for one positive and one negative, for one positive and one zero, and
for both zero.
[Figure: the four surfaces, in the order listed above.]
Consider the special case of a torus, shown on the next page. Along the outside rim, we
will have κ1 > 0 and κ2 > 0 and the surface looks like a paraboloid, while along the inside
rim we have κ1 > 0 and κ2 < 0 and the surface looks like a saddle.
[Figure: a torus.]
Can a two-dimensional person living on the surface determine κ1 and κ2 , if that person is
unable to see into the third dimension?
Let us provide our two-dimensional person with tools: an infinitesimal ruler and an in-
finitesimal protractor. Using the ruler, our worker can determine the lengths of curves on
the surface. Rigorously, this means that our worker can integrate to determine the length
of curves. Using the protractor, our worker can determine the angle between two curves.
Rigorously this means that our worker can compute the angle between the tangent vectors
to any two curves. Can κ1 and κ2 be computed with this information?
The surprising answer to Gauss’ question is no! To see this, consider the surfaces f(x, y) = (1/2)x^2
and g(x, y) = 0, shown on the next page. For the first we have κ1 = 1 and κ2 = 0,
while for the second we have κ1 = κ2 = 0. However, it is possible to bend the first surface
until it is flat without changing the lengths of curves in the surface or the angles of curves
at intersection points. So a two-dimensional worker could not tell the difference between
these surfaces.
[Figure: the surfaces f(x, y) = (1/2)x^2 and g(x, y) = 0.]
But Gauss did not give up. Before Gauss, mathematicians often wrote in an expansive
way. Numerical experiments were mixed with conjectures and half-proved theorems. Gauss
introduced a more austere style, which has been adopted by mathematicians ever since.
He calmly stated theorems and proofs, letting his results speak for themselves. But when
Gauss came to the main result in his paper on surfaces, he allowed himself to write “a
remarkable theorem.” And the theorem has been known as the theorema egregium ever
since:
Theorema Egregium A two-dimensional person can compute the product κ1 · κ2.
The number κ = κ1 · κ2 is called the Gaussian curvature of the surface at p. Consider the
four surfaces below again. The Gaussian curvature is positive in the first case, negative in
the second case, and zero in the third and fourth cases. So a two-dimensional person can
distinguish between the first two surfaces, but not between the last two.
[Figure: the same four surfaces as before.]
Gauss gave several proofs of the theorema egregium. The first is computational, but the
computation is very interesting. In ordinary Euclidean geometry, the distance ds between
two infinitesimally close points p = (p1 , . . . , pn ) and p + ∆p = (p1 + dx1 , . . . , pn + dxn ) is
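given by the Pythagorean theorem
\[ ds^2 = dx_1^2 + \dots + dx_n^2. \]
On a curved surface, the corresponding formula takes the more general form
\[ ds^2 = \sum_{i,j} g_{ij}\, dx_i\, dx_j \]
for functions gij defined on the surface.
The gij are exactly the quantities which can be measured by our two-dimensional worker.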
Thus our worker can determine geometrical quantities exactly when they can be expressed
in terms of the gij . Gauss found formulas for several important quantities in terms of the gij
and ultimately found a formula for κ in terms of gij and their first and second derivatives.
Thus a two-dimensional worker can compute κ.
I’d like to give the flavor of these computations without giving details. Since a two-
dimensional worker can find the lengths of curves, the worker can determine the shortest
curves between two points. Such curves are called geodesics. If γ(t) = (γ1 (t), . . . , γn (t)) is
such a geodesic, Gauss proved that γ solves a differential equation
\[ \frac{d^2\gamma_i}{dt^2} + \sum_{j,k} \Gamma^i_{jk}\,\frac{d\gamma_j}{dt}\,\frac{d\gamma_k}{dt} = 0 \]
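where the Γ^i_{jk}, known as the Christoffel symbols, are given in terms of the gij. Indeed,
\[ \Gamma^i_{jk} = \frac{1}{2}\sum_l g^{-1}_{il}\left( \frac{\partial g_{lk}}{\partial x_j} + \frac{\partial g_{lj}}{\partial x_k} - \frac{\partial g_{jk}}{\partial x_l} \right) \]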
We will deduce these equations later in the course. These same equations occur in Einstein’s
general relativity theory, where now γ(t) is the curve traversed by a particle acted on by
gravitation, and gij describes the gravitational field.
Gauss’ formula for κ is a fairly simple expression involving the Γ^i_{jk} and their derivatives.
However, Gauss later found a more conceptual proof of the theorema egregium. Draw a
triangle on the surface whose sides are geodesics, as illustrated on the next page.
[Figure: a triangle with geodesic sides on a sphere.]
Suppose we are two-dimensional people and wish to determine κ. Lay out a triangle with
geodesic sides. The number κ is a function, but assume the triangle is so small that κ is
essentially constant on it. Then
\[ \kappa = \frac{\alpha + \beta + \gamma - \pi}{\text{area of triangle}} \]
Example: On the spherical triangle illustrated above, all three angles are π/2 and the area
of the triangle is one eighth the area of a sphere, so 4π/8 = π/2. Thus
\[ \kappa = \frac{3\cdot\frac{\pi}{2} - \pi}{\frac{\pi}{2}} = 1. \]
This is
correct because on a sphere of radius one we have κ1 = κ2 = 1.
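As a further check, the same computation can be repeated on a sphere of radius R, which is an assumption added here for illustration and not an example from the notes. The octant triangle still has three right angles, but its area scales with R^2:

```latex
\[
\kappa
  = \frac{\alpha + \beta + \gamma - \pi}{\text{area of triangle}}
  = \frac{3\cdot\frac{\pi}{2} - \pi}{\frac{1}{8}\cdot 4\pi R^2}
  = \frac{\pi/2}{\pi R^2/2}
  = \frac{1}{R^2}.
\]
```

This agrees with κ1 = κ2 = 1/R on a sphere of radius R, so the angle excess formula sees the size of the sphere even though every octant triangle has the same angles.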
One of Gauss’ unusual jobs was to head a geodetic survey of Germany. During the survey,
he instructed workers to climb three mountains and measure the angles between them very
accurately. I’m sorry to report that the sum was π to within experimental error.
In letters to his closest friends, Gauss admitted that he was interested in the three di-
mensional case of his theorem. Suppose we are three-dimensional people, living in a world
which curves into the fourth dimension. Can we determine that the world is curved without
stepping into the fourth dimension? Gauss warned his friends not to reveal this thought,
for fear that he would be thought crazy.
Chapter 1
Curves
1.1 Introduction
Definition 1 We say that γ(t) is a regular curve if the coordinate functions x(t), y(t), z(t)
are C^∞ and if γ'(t) = (dx/dt, dy/dt, dz/dt) is never zero.
Example 1: The straight line through p in the direction v can be written γ(t) = p + tv and
thus is a parameterized curve whenever v ≠ 0. A picture of the line γ(t) = (1 + 2t, 1 − t) =
(1, 1) + t(2, −1) is shown on the next page.
Example 2: The circle about the origin of radius r can be written γ(t) = (r cos t, r sin t).
Example 3: The helix at the top of the next page can be written γ(t) = (r cos t, r sin t, at)
for constants r and a.
[Figures: the curves from the examples above.]
Remark: Recall that a function f(t) is C^∞ if it has derivatives of all orders. Most common
functions have this property. For instance, f(t) = t^3 + 3t + 5 is C^∞ and after a while all of
its derivatives are zero. The function f(x) = arctan √((x + 1)/(x − 3)) is C^∞ on its domain, although
it would not be pleasant to write down formulas for higher derivatives. We require that
the coordinate functions be C^∞ because we intend to differentiate these functions many
times.
Remark: Consider the curve γ(t) = (t^3, t^2), whose graph is shown below. This curve has
a kink at the origin even though its coordinate functions are C^∞. However, γ'(0) = (0, 0).
We require that γ'(t) be nonzero to eliminate this sort of example.
[Figure: graph of the curve γ(t) = (t^3, t^2).]
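1.2 Arc Length
Definition 2 The length of a regular curve γ(t) for a ≤ t ≤ b is
\[ \int_a^b ||\gamma'(t)||\, dt. \]
Remark: In this definition, ||γ'(t)|| is the length of the derivative. It is easy to make the
definition plausible. Replacing the integral with a Riemann sum, we have that the length
is approximately
\[ \sum_i \left|\left| \frac{\gamma(t_i + \Delta t) - \gamma(t_i)}{\Delta t} \right|\right| \Delta t = \sum_i ||\gamma(t_i + \Delta t) - \gamma(t_i)|| \]
which equals the length of a polygonal approximation to the curve; in the limit we get the
length of the curve.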
Remark: Sadly, it is usually difficult to compute curve lengths. The circle of radius R is
given by γ(t) = (R cos t, R sin t) and so ||γ'(t)|| = ||(−R sin t, R cos t)|| = R. Thus we obtain
the satisfying result below for length:
\[ \int_0^{2\pi} R\, dt = 2\pi R. \]
But the integral for the length of an ellipse is already beyond elementary calculation (yielding
an elliptic integral), and the length of the parabola y = x^2 for 0 ≤ x ≤ b is just barely
computable:
\[ \int_0^b \sqrt{1 + (2x)^2}\, dx = \frac{b\sqrt{1 + 4b^2}}{2} + \frac{1}{4}\ln\left( 2b + \sqrt{1 + 4b^2} \right) \]
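The polygonal approximation described above is easy to try out. The sketch below, in Python, sums chord lengths along a curve and compares the result with the closed forms 2πR for the circle and (b/2)√(1 + 4b^2) + (1/4) ln(2b + √(1 + 4b^2)) for the parabola. The helper name `polygonal_length`, the sample count, and the values of R and b are assumptions chosen for illustration.

```python
import math

def polygonal_length(curve, a, b, n=20000):
    """Length of the inscribed polygon with n equal parameter steps on [a, b]."""
    total = 0.0
    prev = curve(a)
    for i in range(1, n + 1):
        cur = curve(a + (b - a) * i / n)
        total += math.dist(prev, cur)   # Euclidean length of one chord
        prev = cur
    return total

R = 2.0
b = 1.5

def circle(t):
    return (R * math.cos(t), R * math.sin(t))

def parabola(x):
    return (x, x * x)

circle_len = polygonal_length(circle, 0.0, 2 * math.pi)
parabola_len = polygonal_length(parabola, 0.0, b)

# Closed form for the parabola y = x^2 on 0 <= x <= b.
root = math.sqrt(1 + 4 * b * b)
parabola_exact = b * root / 2 + math.log(2 * b + root) / 4

print(circle_len, 2 * math.pi * R)
print(parabola_len, parabola_exact)
```

The chord sum always slightly underestimates the true length, since each chord is no longer than the arc it spans; increasing n shrinks the gap.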
1.3 Reparameterization
Undeterred by the practical impossibility of computing arc length, we now insist that curves
be traversed at unit speed, so that the length of the curve for 0 ≤ t ≤ s is precisely s. We
do this because we are interested only in the geometry of our curves, and not in how they
are traversed in time.
Suppose γ(t) : I → R^3 is a regular curve. Fix t0 and define
\[ s(t) = \int_{t_0}^{t} ||\gamma'(u)||\, du \]
Thus s(t) is the length of the curve from t0 to t and is defined on the open interval I where
γ is defined.
Theorem 1 The image s(I) is an open interval and s : I → s(I) is one-to-one and onto.
The map s is C ∞ and its inverse map t = ϕ(s) is also C ∞ .
Proof: By the fundamental theorem of calculus, ds/dt = ||γ'(t)||; since γ is a regular curve,
γ'(t) is not zero, so this derivative is positive. Hence s is a strictly increasing function and
thus one-to-one.
Since s is continuous, the image s(I) is a connected subset of R and thus an interval
(possibly infinite). The interval is open, because if s1 = s(t1 ) is in the image, then we
can find a < t1 < b in the open interval I and so s(a) < s(t1 ) = s1 < s(b) in the image
interval.
Finally we will prove that ϕ(s) is differentiable. By definition, the derivative of ϕ at s1
is
\[ \lim_{s \to s_1} \frac{\varphi(s) - \varphi(s_1)}{s - s_1} \]
Notice that we have abused the notation by letting the letter ‘s’ stand for the function s(t)
and also an arbitrary point in the image interval s(I). We will continue this abuse (!!) by
writing s = s(ϕ(s)) and s1 = s(ϕ(s1 )) so the limit becomes
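\[ \frac{d\varphi}{ds} = \lim_{s \to s_1} \frac{1}{\dfrac{s(\varphi(s)) - s(\varphi(s_1))}{\varphi(s) - \varphi(s_1)}} \]
However, ϕ(s) is some number t and ϕ(s1) is a number t1 and the denominator thus equals
\[ \lim_{t \to t_1} \frac{s(t) - s(t_1)}{t - t_1} = \left. \frac{ds}{dt} \right|_{t_1} = ||\gamma'(t_1)||. \]
So
\[ \left. \frac{d\varphi}{ds} \right|_{s_1} = \left. \frac{1}{||\gamma'(t)||} \right|_{t_1} = \frac{1}{||\gamma'(\varphi(s_1))||} \]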
and in particular this derivative exists. (Here we have assumed the subtle point that s → s1
implies t → t1 . We leave the verification to the reader.)
It follows rapidly that ϕ(s) is C^∞. Indeed γ(t) is C^∞ and we have just proved that ϕ is
differentiable, so 1/||γ'(ϕ(s))|| is differentiable. But this equals dϕ/ds, so this function is
differentiable and ϕ is C^2. We can continue to bootstrap in this manner and show that ϕ has
derivatives of all orders. QED.
Definition 3 Suppose γ(t) is a regular curve. Fix t0 and define s(t) and ϕ(s) as above.
Then γ(ϕ(s)) is called the reparameterization of γ by arc-length.
Remark: Notice that the new curve is defined on an interval which contains zero (because
s(t0 ) = 0). The derivative vector of the new curve has length one because
\[ \frac{d}{ds}\,\gamma(\varphi(s)) = \gamma'(\varphi(s))\,\frac{d\varphi}{ds} = \gamma'(\varphi(s))\,\frac{1}{||\gamma'(\varphi(s))||}, \]
which is a vector of length one. It follows immediately that the length of the new curve
from s = 0 to s is exactly s.
We now introduce a final abuse of notation. Rather than writing γ(ϕ(s)), we shall write
γ(s) for the curve parameterized by arclength. We are committing a fairly serious sin,
because we do not get from γ(t) to γ(s) by changing the letter t to s, as is the convention
in all other mathematics. Instead, we compute s(t), find its inverse ϕ(s), and substitute
ϕ(s) for t in the original formula.
In the rest of the course, γ(t) denotes an arbitrary regular curve and γ(s) denotes a curve
parameterized by arclength.
Example: Consider the helix γ(t) = (r cos t, r sin t, at). Then γ'(t) = (−r sin t, r cos t, a)
and ||γ'(t)|| = √(r^2 + a^2). So
\[ s(t) = \int_0^t \sqrt{r^2 + a^2}\, dt = t\sqrt{r^2 + a^2}. \]
We obtain the inverse ϕ(s) by solving for t, so ϕ(s) = s/√(r^2 + a^2).
So
\[ \gamma(s) = \left( r\cos\frac{s}{\sqrt{r^2 + a^2}},\ r\sin\frac{s}{\sqrt{r^2 + a^2}},\ \frac{as}{\sqrt{r^2 + a^2}} \right) \]
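When s(t) has no convenient formula, the same recipe can be carried out numerically: tabulate s(t), then invert the table. The following Python sketch does this with chord sums and linear interpolation and compares the result with the exact inverse ϕ(s) = s/√(r^2 + a^2) for the helix. The function names, the grid size, and the values of r and a are assumptions made for illustration.

```python
import math
from bisect import bisect_left

def arc_length_table(curve, t0, t1, n=20000):
    """Parameter samples ts and cumulative chord lengths ss, with ss[0] = 0."""
    ts = [t0 + (t1 - t0) * i / n for i in range(n + 1)]
    ss = [0.0]
    for i in range(1, n + 1):
        ss.append(ss[-1] + math.dist(curve(ts[i - 1]), curve(ts[i])))
    return ts, ss

def phi(s, ts, ss):
    """Approximate inverse t = phi(s) by linear interpolation in the table."""
    i = bisect_left(ss, s)          # first index with ss[i] >= s
    if i == 0:
        return ts[0]
    if i >= len(ss):
        return ts[-1]
    w = (s - ss[i - 1]) / (ss[i] - ss[i - 1])
    return ts[i - 1] + w * (ts[i] - ts[i - 1])

r, a = 2.0, 1.0

def helix(t):
    return (r * math.cos(t), r * math.sin(t), a * t)

ts, ss = arc_length_table(helix, 0.0, 10.0)

s = 3.0
t_numeric = phi(s, ts, ss)
t_exact = s / math.sqrt(r * r + a * a)
print(t_numeric, t_exact)
```

The table is strictly increasing because the curve is regular, which is exactly the property used in Theorem 1 to show that s has an inverse.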
1.4 Curvature
\[ \frac{d}{ds}\big( T(s) \cdot T(s) \big) = \frac{dT}{ds} \cdot T(s) + T(s) \cdot \frac{dT}{ds} = 0 \]
and so dT/ds · T = 0. QED.
Definition 4 If γ(s) is parameterized by arclength, we define the curvature κ(s) by
\[ \kappa(s) = \left|\left| \frac{dT}{ds} \right|\right| \]
If this curvature is not zero, we define the normal vector and binormal vector by
\[ N(s) = \frac{dT/ds}{||dT/ds||}, \qquad B(s) = T(s) \times N(s) \]
The main point of this section is to convince you that κ(s) is a reasonable measure of the
curvature of γ(s). It is easy to see that κ is a rough measure of curvature: if our curve
always goes in the same direction, T (s) points constantly in this direction and its derivative
is zero. So when κ(s) = ||dT/ds|| is not zero, the curve must be changing direction.
However, we will prove something more precise. Approximate γ near s by a circle. Clas-
sically this circle was called the osculating circle or kissing circle. We will prove that κ
equals one over the radius of this circle. If γ is curving rapidly, then R is small and so κ
is large, as we’d wish. If γ is curving very slowly, then R is enormous and so κ is small, as
we’d wish. If γ is a straight line, then R = ∞ and so κ = 0.
Theorem 3 The curve γ(s) is a straight line if and only if κ(s) is always zero.
Theorem 4 Let γ(s) be a curve parameterized by arclength and fix a point s0 . There is a
unique circle C(s) in R3 parameterized by arclength such that C and γ agree at s0 up to
derivatives of order two, so C(s0 ) = γ(s0 ), C 0 (s0 ) = γ 0 (s0 ), C 00 (s0 ) = γ 00 (s0 ). If this circle
has radius R, then
\[ \kappa(s_0) = \frac{1}{R}. \]
\[ C(s) = \vec{p} + \left( R\cos\frac{s - s_0}{R} \right)\vec{v}_1 + \left( R\sin\frac{s - s_0}{R} \right)\vec{v}_2. \]
These equations have a unique solution:
\[ R = \frac{1}{\kappa}, \qquad \vec{v}_1 = -N, \qquad \vec{v}_2 = T, \qquad \vec{p} = \gamma(s_0) + \frac{1}{\kappa}\,N \]
Therefore the osculating circle is completely
determined and in particular κ = 1/R.
1.5 The Moving Frame
In this section we shall suppose that γ(s) is parameterized by arc length and κ(s) is never
zero. In this situation, we have attached three orthonormal vectors T (s), N (s), B(s) to
each point of the curve. These three vectors form a basis of R3 , which moves along the
curve. Geometers use the word frame instead of the word basis, so T, N, B is called the
moving frame.
Imagine an isolated point traveling along the curve at constant speed. The moving frame
allows us to replace this single point with an airplane. The nose of the airplane should
point along T , the left wing along N , and the tail along B. As we travel along the curve,
the airplane pitches and rolls as T, N, and B move. At each point, the orientation of the
plane is completely determined by T, N, and B.
As we’ll see, the introduction of T, N, and B is decisive in the theory.
1.6 The Frenet-Serret Formulas
We want to measure the change of the moving frame as we travel along the curve. So
we differentiate all three vectors. Since T, N, and B form a basis, we can write these
derivatives as linear combinations of T, N, and B, and it turns out to be very important to
do this instead of expressing them as linear combinations of the standard basis.
Let us temporarily define X1 = T, X2 = N, X3 = B. Then the expression of the derivatives
of the frame vectors as a linear combination of these vectors is given by coefficients aij
such that
\[ \frac{dX_i}{ds} = \sum_j a_{ij} X_j \]
Theorem 5 The matrix A = (aij ) is skew symmetric, so aji = −aij . In particular, aii = 0.
Proof: Since the basis T, N, B is orthonormal, we have Xi · Xj = δij, where δij = 1 if i = j
and δij = 0 otherwise. Differentiating with respect to s yields
\[ \frac{dX_i}{ds} \cdot X_j + X_i \cdot \frac{dX_j}{ds} = \frac{d}{ds}\,\delta_{ij} = 0 \]
and so
\[ \sum_k a_{ik}\, X_k \cdot X_j + \sum_k a_{jk}\, X_i \cdot X_k = 0 \]
\[ a_{ij} + a_{ji} = 0. \]
QED
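Returning to the notation T, N, B, we can write
\[ \frac{dT}{ds} = a_{11} T + a_{12} N + a_{13} B \]
\[ \frac{dN}{ds} = a_{21} T + a_{22} N + a_{23} B \]
\[ \frac{dB}{ds} = a_{31} T + a_{32} N + a_{33} B \]
By the previous theorem the coefficient matrix is skew symmetric, so these equations reduce to
\[ \frac{dT}{ds} = a_{12} N + a_{13} B \]
\[ \frac{dN}{ds} = -a_{12} T + a_{23} B \]
\[ \frac{dB}{ds} = -a_{13} T - a_{23} N \]
However, we already know that dT/ds = κN, so a12 = κ and a13 = 0, and the equations simplify further to
\[ \frac{dT}{ds} = \kappa N \]
\[ \frac{dN}{ds} = -\kappa T + a_{23} B \]
\[ \frac{dB}{ds} = -a_{23} N \]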
Definition 5 The quantity a23 is called the torsion of the curve and written τ (s). It is a
function of s.
We have proved the Frenet-Serret formulas, first discovered in 1851. We’ll soon see that
these formulas hold the key to the theory of curves:
Theorem 6 (Frenet-Serret) Let γ(s) be a curve parameterized by arc length and suppose
the curvature κ(s) is never zero. Then
\[ \frac{dT}{ds} = \kappa(s)\, N \]
\[ \frac{dN}{ds} = -\kappa(s)\, T + \tau(s)\, B \]
\[ \frac{dB}{ds} = -\tau(s)\, N \]
We end this section with a description of the geometrical meaning of torsion. Fix s and
recall that our curve is moving in the direction T (s) and is approximated by a circle
with center in the direction N (s). These two vectors define a plane in which the curve is
momentarily trapped. Classical geometers called this plane the osculating plane, which
means kissing plane. Since B(s) is perpendicular to this plane, it forms a handle for the
osculating plane exactly like the handle on the tool used by carpenters to smooth plaster
walls.
If the curve remains in this osculating plane forever, then B(s) should be constant and its
derivative −τ (s)N (s) should be zero, so τ = 0. Otherwise, τ measures the curve’s attempt
to twist out of the plane in which it finds itself momentarily trapped.
Let us convert this informal discussion into a theorem:
Theorem 7 A curve γ(s) lies in a plane if and only if τ (s) is identically zero.
Proof: Suppose γ lies in a plane. Then
\[ T(s) = \lim_{h \to 0} \frac{\gamma(s + h) - \gamma(s)}{h} \]
is parallel to the plane, and consequently
\[ \kappa N = \lim_{h \to 0} \frac{T(s + h) - T(s)}{h} \]
is also parallel to the plane. It follows that B = T × N is perpendicular to the plane for
all s. Since B has unit length and varies continuously with s, it must be constant, so its
derivative −τ (s) N is zero.
Conversely, suppose τ = 0. Then B is constant. Fix s0 and consider the plane
P = { x | (x − γ(s0 )) · B = 0 }
We claim that γ is in this plane. To see this, substitute γ(s) for x. The expression
(γ(s) − γ(s0 )) · B is constant because its derivative is T · B = 0. When s = s0 , this constant
is zero, so it is always zero and γ(s) always satisfies the equation of the plane. QED.
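Example: Finally, we will compute the curvature and torsion of a helix. We earlier
discovered that the equation of a helix parameterized by arc length is
\[ \gamma(s) = \left( r\cos\frac{s}{\sqrt{r^2 + a^2}},\ r\sin\frac{s}{\sqrt{r^2 + a^2}},\ \frac{as}{\sqrt{r^2 + a^2}} \right) \]
Differentiating gives the unit tangent vector
\[ T(s) = \frac{1}{\sqrt{r^2 + a^2}} \left( -r\sin\frac{s}{\sqrt{r^2 + a^2}},\ r\cos\frac{s}{\sqrt{r^2 + a^2}},\ a \right) \]
and so
\[ \frac{dT}{ds} = \frac{1}{r^2 + a^2} \left( -r\cos\frac{s}{\sqrt{r^2 + a^2}},\ -r\sin\frac{s}{\sqrt{r^2 + a^2}},\ 0 \right) \]
The length of this derivative is the curvature, so
\[ \kappa(s) = \frac{r}{r^2 + a^2} \]
The normal vector is the unit vector in the direction of dT/ds, so
\[ N(s) = \left( -\cos\frac{s}{\sqrt{r^2 + a^2}},\ -\sin\frac{s}{\sqrt{r^2 + a^2}},\ 0 \right) \]
The binormal vector is B = T × N, which equals
\[ \frac{1}{\sqrt{r^2 + a^2}} \left( -r\sin\frac{s}{\sqrt{r^2 + a^2}},\ r\cos\frac{s}{\sqrt{r^2 + a^2}},\ a \right) \times \left( -\cos\frac{s}{\sqrt{r^2 + a^2}},\ -\sin\frac{s}{\sqrt{r^2 + a^2}},\ 0 \right) \]
or
\[ B(s) = \frac{1}{\sqrt{r^2 + a^2}} \left( a\sin\frac{s}{\sqrt{r^2 + a^2}},\ -a\cos\frac{s}{\sqrt{r^2 + a^2}},\ r \right) \]
It is supposed to be true that dB/ds = −τ(s)N(s), and indeed
\[ \frac{dB}{ds} = \frac{a}{r^2 + a^2} \left( \cos\frac{s}{\sqrt{r^2 + a^2}},\ \sin\frac{s}{\sqrt{r^2 + a^2}},\ 0 \right) \]
so we have
\[ \tau(s) = \frac{a}{r^2 + a^2}. \]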
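The helix computation can be double-checked without any symbolic work. The sketch below, in Python, estimates T, N, B and κ from central differences of the arc-length parameterization, then estimates τ from the relation dB/ds = −τN. The step size h, the sample point s = 1, and the helper names are assumptions chosen for illustration; finite differences give only approximate values.

```python
import math

r, a = 2.0, 1.0
c = math.sqrt(r * r + a * a)

def gamma(s):
    """Helix parameterized by arc length."""
    return (r * math.cos(s / c), r * math.sin(s / c), a * s / c)

def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))

def scale(k, u):
    return tuple(k * x for x in u)

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def frame(s, h=1e-3):
    """Central-difference estimates of T, N, B and kappa at s."""
    T = sub(gamma(s + h), gamma(s - h))
    T = scale(1.0 / norm(T), T)
    # Second difference approximates gamma''(s) = dT/ds.
    dT = scale(1.0 / (h * h),
               sub(sub(gamma(s + h), gamma(s)), sub(gamma(s), gamma(s - h))))
    kappa = norm(dT)
    N = scale(1.0 / kappa, dT)
    return T, N, cross(T, N), kappa

def torsion(s, h=1e-3):
    """tau = -(dB/ds) . N, with dB/ds from a central difference of B."""
    N = frame(s)[1]
    dB = scale(1.0 / (2 * h), sub(frame(s + h)[2], frame(s - h)[2]))
    return -dot(dB, N)

kappa_num = frame(1.0)[3]
tau_num = torsion(1.0)
print(kappa_num, r / (r * r + a * a))
print(tau_num, a / (r * r + a * a))
```

Both estimates are independent of the sample point, as expected for a helix, whose curvature and torsion are constant.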
1.7 The Goal of Curve Theory
Let’s step back and develop some goals. We want to understand the geometry of curves.
Around 1870, Felix Klein gave a famous lecture in Erlangen, Germany, defining geometry.
In the lecture, Klein called attention to the key role played in geometry by Euclidean
motions, that is, maps preserving distances and angles. Among these maps are translations,
rotations, and reflections. According to Klein, geometry is the study of properties of figures
which are invariant under Euclidean motions. For example, suppose we have a line segment
from (1, 1) to (4, 5). Then the x coordinate of the starting point is not an interesting
quantity because it changes if we translate the picture. But the length of this segment,
√(3^2 + 4^2) = 5, is interesting because it remains unchanged under translation and rotation
of the segment.
Let us apply this philosophy to curve theory. Unlike the physicists, we are not interested
in how the curve is traced in time, but only in the curve itself. And we are not interested in
properties which change when the curve is picked up and moved somewhere else, but only
in quantities which do not change under such motions. Clearly the curvature and torsion, κ
and τ , are invariant under such motions and therefore genuine geometric quantities.
Astonishingly, they are the only geometric quantities. Every other interesting quantity can
be written as an expression in curvature and torsion. A better way to say this is that κ
and τ completely determine the curve up to Euclidean motions. If two curves have the
same curvature and torsion functions, then one can be rotated and translated until it lies
right on top of the other.
1.8 The Fundamental Theorem
• Conversely, suppose κ(s) and τ(s) are C^∞ functions with κ(s) > 0. Then there is a
curve γ(s) parameterized by arclength with κ and τ as curvature and torsion.
\[ \frac{d\gamma}{ds} = T \]
\[ \frac{dT}{ds} = \kappa(s)\, N \]
\[ \frac{dN}{ds} = -\kappa(s)\, T + \tau(s)\, B \]
\[ \frac{dB}{ds} = -\tau(s)\, N \]
We’ll fill in details in a moment, but the essence of the proof is easy. These equations are
linear, so by the existence and uniqueness theorem for differential equations, they have a
unique solution once initial values are given. These initial values are γ(0), which is the
starting point of the curve, and T (0), N (0), B(0), which is the starting orientation. Thus
the differential equations completely determine the curve up to a Euclidean motion.
Let’s expand the argument. We’ll first prove existence. Solve the equations with initial
values γ(0) = (0, 0, 0), T(0) = (1, 0, 0), N(0) = (0, 1, 0), and B(0) = (0, 0, 1). We claim that
T(s), N(s), B(s) are orthonormal for all s. For a moment, assume this. Then dγ/ds = T is
always a unit vector, so γ(s) is a curve parameterized by arc length. Moreover dT/ds = κN,
so κ must be the curvature and N the vector normal to the curve. Since B has unit length,
is perpendicular to T and N, and equals T × N when s = 0, it must equal T × N always
by continuity. Since dB/ds = −τN, τ is the torsion.
Temporarily let Lij = Xi · Xj and notice that we obtain a system of differential equations
for these functions:
\[ \frac{d}{ds} L_{ij} = \sum_k a_{ik} L_{kj} + \sum_k a_{jk} L_{ik} \]
From differential equation theory, the solution of this equation is unique once initial
conditions are given. From the initial choice of T(0), N(0), B(0) we see that Lij(0) = δij.
But if we define Lij(s) = δij for all s, we obtain one solution of the differential equations,
since then
\[ \frac{d}{ds}\,\delta_{ij} = 0 \quad \text{and} \quad \sum_k a_{ik}\delta_{kj} + \sum_k a_{jk}\delta_{ik} = a_{ij} + a_{ji} = 0 \]
by the Frenet-Serret formulas. So our solution Xi · Xj must equal this solution δij and in particular the Xi are
orthonormal.
Finally we prove uniqueness. This time, suppose that γ is given in advance. Call the
curve obtained in the previous paragraph using special initial properties σ(s). Translate γ
to make its starting point the origin instead of γ(0). Then rotate γ so the three vectors
T (0), N (0), B(0) associated with γ rotate into the vectors (1, 0, 0), (0, 1, 0), (0, 0, 1). This is
possible because both collections are right handed orthonormal coordinate systems. The
combination of the translation and rotation just used is a Euclidean motion M. Clearly
Euclidean motions do not change curvature and torsion. So M ◦ γ and σ both satisfy
the differential equations and have the same initial conditions, and therefore are equal. It
follows that any γ is just σ up to a Euclidean motion. QED.
1.9 Computers
In the previous section, we proved the fundamental theorem using fancy differential equation
theory. But I prefer to think of the proof as a Kansas farmer would. Suppose we are
given functions κ(s) and τ (s). Pick a small interval ∆s, say ∆s = .01, and let si = i ∆s.
We will show how to compute γ(si ) numerically using a computer. Start by choosing initial
conditions, say γ(0) = (0, 0, 0), T (0) = (1, 0, 0), N (0) = (0, 1, 0), and B(0) = (0, 0, 1).
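Consider the differential equation dγ/ds = T(s). Replace the derivative on the left side by a
difference quotient to obtain
\[ \gamma(s_i + \Delta s) = \gamma(s_i) + \Delta s\, \big( T(s_i) \big) \]
Treating the Frenet-Serret formulas in the same way gives
\[ T(s_i + \Delta s) = T(s_i) + \Delta s\, \big( \kappa(s_i)\, N(s_i) \big) \]
\[ N(s_i + \Delta s) = N(s_i) + \Delta s\, \big( -\kappa(s_i)\, T(s_i) + \tau(s_i)\, B(s_i) \big) \]
\[ B(s_i + \Delta s) = B(s_i) + \Delta s\, \big( -\tau(s_i)\, N(s_i) \big) \]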
The initial conditions determine γ(s0 ), T (s0 ), N (s0 ), B(s0 ) and these formulas then deter-
mine γ(s1 ), T (s1 ), N (s1 ), B(s1 ). Applying the formulas again gives γ(s2 ), T (s2 ), N (s2 ), B(s2 ).
Etc.
This calculation is very easy on a computer. However, numerical errors lead to T, N, and B
whose lengths gradually drift away from one. To improve accuracy, renormalize at each
stage to get unit vectors. To improve accuracy still more, use the Gram-Schmidt process
at each stage to ensure that the frame remains orthonormal.
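The stepping scheme just described might be sketched as follows in Python. This is one possible reading of the method, not the author's program: the function names, the default step Δs = 0.001, and the test inputs κ = 1, τ = 0 are assumptions. Each step advances γ, T, N, B by the difference-quotient form of the Frenet-Serret equations and then applies Gram-Schmidt to the frame.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def axpy(u, k, v):
    """Return u + k * v componentwise."""
    return tuple(x + k * y for x, y in zip(u, v))

def unit(u):
    n = math.sqrt(dot(u, u))
    return tuple(x / n for x in u)

def gram_schmidt(T, N, B):
    """Orthonormalize the frame, keeping the direction of T."""
    T = unit(T)
    N = unit(axpy(N, -dot(N, T), T))
    B = axpy(axpy(B, -dot(B, T), T), -dot(B, N), N)
    return T, N, unit(B)

def integrate(kappa, tau, length, ds=0.001):
    """Euler steps for the Frenet-Serret system; returns the points gamma(s_i)."""
    g = (0.0, 0.0, 0.0)
    T, N, B = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
    points = [g]
    for i in range(int(round(length / ds))):
        k, t = kappa(i * ds), tau(i * ds)
        g = axpy(g, ds, T)                               # gamma' = T
        T_new = axpy(T, ds * k, N)                       # T' = kappa N
        N_new = axpy(axpy(N, -ds * k, T), ds * t, B)     # N' = -kappa T + tau B
        B_new = axpy(B, -ds * t, N)                      # B' = -tau N
        T, N, B = gram_schmidt(T_new, N_new, B_new)
        points.append(g)
    return points

# kappa = 1 and tau = 0 should trace (approximately) a unit circle in the xy-plane.
points = integrate(lambda s: 1.0, lambda s: 0.0, 2 * math.pi)
print(points[len(points) // 2], points[-1])
```

With constant κ = 1 and τ = 0 the computed curve stays in the xy-plane, passes close to (0, 2, 0) halfway round, and returns close to the origin, which is the behavior of a circle of radius 1/κ.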
My favorite way to do this calculation is to allow the user to enter κ and τ in real time with
a mouse. Draw a two-dimensional coordinate plane at the bottom of the screen. Input κ
by moving the mouse left or right. Input τ by moving the mouse up or down. While the
mouse moves, let the computer draw the curve at the top of the screen.
Below is a picture of this program in action, together with a three dimensional picture of
a curve in red and green. Use red and green glasses to see the curve in 3D.
1.10 The Possibility that κ = 0
There is something strange about the proof of the fundamental theorem. It seems to work
fine even if κ is sometimes zero or sometimes negative. What is going on?
Notice that the fundamental theorem is really a theorem about parameterized curves with
associated moving frames. We can capture this distinction with a formal definition:
Definition 6 A framed curve is a C^∞ curve γ(s) parameterized by arclength, together
with a moving frame T(s), N(s), B(s) of orthonormal vectors, such that dγ/ds = T(s) and dT/ds
is a multiple of N(s).
According to the fundamental theorem, such framed curves are completely determined by
κ(s) and τ (s) up to Euclidean motion, and every κ and τ can occur. Here κ is allowed to
be positive, negative, or zero.
There is a natural question to ask. If γ(s) is a curve, how many framed curves can be
constructed using γ? If the answer were “exactly one,” then our theory would apply
without change to all γ(s).
Example 1: We are going to show that some C ∞ curves γ have no associated framed
curve at all. Start with the function
\[ f(x) = \begin{cases} e^{-1/x^2} & \text{if } x \neq 0 \\ 0 & \text{if } x = 0. \end{cases} \]
This function is pictured below. It is easy to prove that f is C ∞ and all derivatives at the
origin are zero.
[Figure: graph of f.]
Let
\[ \gamma(t) = \begin{cases} (t, f(t), 0) & \text{if } t \leq 0 \\ (t, 0, f(t)) & \text{if } t > 0 \end{cases} \]
This curve is $C^\infty$ because all derivatives of f vanish at the origin. Its derivative is never
zero, so it is a regular curve and can be reparameterized by arc length. A short calculation
shows that the curvature is zero only at the origin. The curve lies in the xy-plane for t < 0
and in the xz-plane for t > 0.
[Figure: the curve γ, which lies in the xy-plane for t < 0 and in the xz-plane for t > 0.]
When t = 0, the tangent vector is (1, 0, 0). For t just slightly smaller, the vector N must be
a unit vector perpendicular to (1, 0, 0) in the xy-plane and so ±(0, 1, 0). For t just slightly
larger, the vector N must be a unit vector perpendicular to (1, 0, 0) in the xz-plane, and
so ±(0, 0, 1). Clearly there is no continuous extension of N to t = 0.
Example 2: Next we shall construct a $C^\infty$ curve with infinitely many different framed
extensions. We do this by gluing an interval of length one along the x-axis into the previous
example. Thus let
\[ \gamma(t) = \begin{cases} (t, f(t), 0) & \text{if } t \leq 0 \\ (t, 0, 0) & \text{if } 0 < t < 1 \\ (t, 0, f(t - 1)) & \text{if } t \geq 1 \end{cases} \]
[Figure: the curve γ, with a straight segment along the x-axis for 0 ≤ t ≤ 1.]
Using the airplane example, our plane is moving straight ahead for 0 ≤ t ≤ 1. During this
time, we can perform as many rolls as we wish.
What does this example say in terms of κ and τ ? Notice that during the crucial interval, κ =
0. We perform rolls by changing τ. So the conclusion is that many different combinations
of κ and τ give the same curve γ.
Remark: On the other hand, during intervals when κ ≠ 0, we have only two choices for
frames. We can choose as in our previous theory making κ > 0, or we can change the
signs of κ, N, and B. Notice that τ does not change sign because $\frac{dB}{ds} = -\tau N$, so changing
the signs of N and B leaves τ unchanged. So there are two framed curves for each γ and
exactly one of the two has positive curvature.
Remark: The examples above show that when κ ≠ 0, there are only two possible extensions
of γ to a framed curve. But as soon as κ = 0, there suddenly may be no extensions,
or infinitely many. However, there is an important case when these difficulties do not
occur. Recall that a function is analytic if it can be expanded in a power series. All
standard functions from calculus are analytic. If γ(t) is analytic, it is easy to show that
the renormalized curve γ(s) is analytic.
Suppose that γ(s) is analytic. We are going to examine the theory near a particular s0 ; we
\[ \gamma(s) = a_0 + a_1 s + a_2 s^2 + \dots \]
There is an air of impracticality about our theory, because renormalization from γ(t) to
γ(s) is usually impossible to compute explicitly, and yet we need this renormalization to
compute κ and τ. We are going to remove that impracticality by producing formulas for
κ and τ which do not depend on initial renormalization.
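Theorem 10 Let γ(t) be a regular curve. Then
\[ T = \frac{\gamma'(t)}{\|\gamma'(t)\|} \]
\[ B = \frac{\gamma'(t) \times \gamma''(t)}{\|\gamma'(t) \times \gamma''(t)\|} \]
\[ N = B \times T \]
\[ \kappa = \frac{\|\gamma' \times \gamma''\|}{\|\gamma'\|^3} \]
\[ \tau = \frac{(\gamma' \times \gamma'') \cdot \gamma'''}{\|\gamma' \times \gamma''\|^2} \]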
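As a quick check of these formulas, here is a small Python sketch that evaluates κ and τ for a helix which is not parameterized by arc length. The helper names (cross, dot, norm, curvature_torsion) and the sample values of a, b, and t are illustrative choices, not part of the notes. For the helix (a cos t, a sin t, bt) one expects κ = a/(a² + b²) and τ = b/(a² + b²).

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def curvature_torsion(d1, d2, d3):
    """kappa and tau from gamma', gamma'', gamma''' at one parameter value."""
    c = cross(d1, d2)
    kappa = norm(c) / norm(d1) ** 3
    tau = dot(c, d3) / dot(c, c)
    return kappa, tau

# Helix gamma(t) = (a cos t, a sin t, b t) and its first three derivatives.
a, b, t = 2.0, 1.0, 0.7
d1 = (-a * math.sin(t), a * math.cos(t), b)
d2 = (-a * math.cos(t), -a * math.sin(t), 0.0)
d3 = (a * math.sin(t), -a * math.cos(t), 0.0)

kappa, tau = curvature_torsion(d1, d2, d3)
print(kappa, tau)
```

For the sample values a = 2, b = 1 the two printed numbers should agree with 2/5 and 1/5 up to rounding.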
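Proof: The vector $\gamma'(t)$ is tangent to the curve, so T(t) is the normalized form $\frac{\gamma'(t)}{\|\gamma'(t)\|}$.
Differentiating this expression with respect to t, and using $\frac{dT}{dt} = \|\gamma'\| \frac{dT}{ds} = \|\gamma'\| \kappa N$, gives
\[ \kappa N = \frac{1}{\|\gamma'\|^2} \left( \gamma'' - \left( \frac{\gamma'}{\|\gamma'\|} \cdot \gamma'' \right) \frac{\gamma'}{\|\gamma'\|} \right). \]
Taking the squared length of both sides gives
\[ \kappa^2 = \frac{\|\gamma'\|^2 \|\gamma''\|^2 - (\gamma' \cdot \gamma'')^2}{\|\gamma'\|^6}. \]
But the numerator of this fraction is $\|\gamma' \times \gamma''\|^2$, so we obtain the formula
\[ \kappa = \frac{\|\gamma' \times \gamma''\|}{\|\gamma'\|^3}. \]
Dividing out this κ from the previously displayed formula involving N gives
\[ N = \frac{\|\gamma'\|}{\|\gamma' \times \gamma''\|} \left( \gamma'' - \left( \frac{\gamma'}{\|\gamma'\|} \cdot \gamma'' \right) \frac{\gamma'}{\|\gamma'\|} \right). \]
We get a simpler expression by crossing this expression with $T = \frac{\gamma'}{\|\gamma'\|}$, since $\gamma' \times \gamma' = 0$.
But N × T = −T × N = −B. So
\[ B(t) = \frac{\gamma' \times \gamma''}{\|\gamma' \times \gamma''\|}. \]
Next, differentiate the identity $\|\gamma' \times \gamma''\| B = \gamma' \times \gamma''$ with respect to t, using $\frac{dB}{dt} = -\tau \|\gamma'\| N$. We get
\[ -\tau N \|\gamma'\| \|\gamma' \times \gamma''\| + B \, \frac{(\gamma' \times \gamma''') \cdot (\gamma' \times \gamma'')}{\|\gamma' \times \gamma''\|} = \gamma' \times \gamma'''. \]
Taking the dot product of both sides with N gives
\[ -\tau \|\gamma'\| \|\gamma' \times \gamma''\| = N \cdot (\gamma' \times \gamma'''). \]
Substituting the formula for N, and noting that $\gamma' \cdot (\gamma' \times \gamma''') = 0$, we obtain
\[ -\tau \|\gamma'\| \|\gamma' \times \gamma''\| = \frac{\|\gamma'\|}{\|\gamma' \times \gamma''\|} \, \gamma'' \cdot (\gamma' \times \gamma''') \]
and so
\[ \tau = -\frac{(\gamma' \times \gamma''') \cdot \gamma''}{\|\gamma' \times \gamma''\|^2} = \frac{(\gamma' \times \gamma'') \cdot \gamma'''}{\|\gamma' \times \gamma''\|^2}. \]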
Finally, I’d like to sketch the theory of curves in higher dimensions. Suppose γ(s) is a curve
parameterized by arclength in $\mathbb{R}^n$. Consider the n-dimensional vectors $X_1 = \frac{d\gamma}{ds}, X_2 = \frac{d^2\gamma}{ds^2}, \dots, X_n = \frac{d^n\gamma}{ds^n}$.
Just as we assumed that κ ≠ 0 in the three-dimensional case, we now
assume that these vectors are always linearly independent.
Recall the Gram-Schmidt process, which replaces n linearly independent vectors by n orthonormal
vectors. Apply this process to $X_1, X_2, \dots, X_n$, obtaining $Y_1, Y_2, \dots, Y_n$. Recall
from the three-dimensional case that, because the $Y_i$ are orthonormal, their derivatives can be written $\frac{dY_i}{ds} = \sum_j a_{ij} Y_j$ with
\[ a_{ij} = -a_{ji}. \]
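In matrix notation, therefore, we have Frenet-Serret like formulas which look like
\[ \frac{d}{ds} \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \\ \vdots \\ Y_n \end{pmatrix} = \begin{pmatrix} 0 & a_{12} & a_{13} & \dots & a_{1n} \\ -a_{12} & 0 & a_{23} & \dots & a_{2n} \\ -a_{13} & -a_{23} & 0 & \dots & a_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -a_{1n} & -a_{2n} & -a_{3n} & \dots & 0 \end{pmatrix} \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \\ \vdots \\ Y_n \end{pmatrix} \]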
Now notice that the derivative of $X_i$ is just $X_{i+1}$ because $X_i = \frac{d^i\gamma}{ds^i}$. Therefore the derivative
of $X_1$ is $X_2$ and thus a linear combination of $Y_1$ and $Y_2$. Since $Y_1 = X_1$, we conclude that
the derivative of $Y_1$ is a linear combination of $Y_1$ and $Y_2$. But that means that the entire
top row of the above matrix equation is zero except for $a_{12}$. We define $\kappa_1$ to be $a_{12}$.
In exactly the same way, we conclude that the derivative of $Y_i$ is a linear combination of
$Y_1, Y_2, \dots, Y_{i+1}$. For example, the derivative of $Y_2$ is a linear combination of $Y_1$, $Y_2$, and $Y_3$.
But then every entry on the second row of the above matrix is zero except $-a_{12} = -\kappa_1$
and $a_{23}$. We define $\kappa_2 = a_{23}$. Etc.
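In this way we kill two birds with one stone. We define the $\kappa_i$ and simultaneously deduce
that the Frenet-Serret formula becomes
\[ \frac{d}{ds} \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \\ Y_4 \\ \vdots \\ Y_n \end{pmatrix} = \begin{pmatrix} 0 & \kappa_1 & 0 & 0 & \dots & 0 \\ -\kappa_1 & 0 & \kappa_2 & 0 & \dots & 0 \\ 0 & -\kappa_2 & 0 & \kappa_3 & \dots & 0 \\ 0 & 0 & -\kappa_3 & 0 & \dots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & 0 & \dots & 0 \end{pmatrix} \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \\ Y_4 \\ \vdots \\ Y_n \end{pmatrix} \]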
Surfaces
Suppose we have a surface S inside $\mathbb{R}^3$. Examples include the graph of $y^2 - x^2$, the surface
of a sphere, the surface of a doughnut, and the Möbius band. A key feature of each of these
surfaces is that in small pieces it looks like a piece of the plane. Thus whenever p ∈ S,
there is an open set U in the plane and a map s from U to the surface whose image is an
open neighborhood of p in the surface.
There are two aspects to surface theory. In the local theory, we work with small pieces
of a surface without worrying about global topology. In the global theory, we glue the
information from the local theory together to get information about the entire surface.
This chapter is about the local theory, so we will not worry when our symbolism does not
describe every point on a surface.
Example 2: Let $s(r, \theta) = (r \cos\theta, r \sin\theta, r)$. Clearly the resulting surface is $z = \sqrt{x^2 + y^2}$
and so the cone below. Notice that this cone has a sharp singularity at the origin and yet
the map s is $C^\infty$. The partial derivatives of s are
\[ \frac{\partial s}{\partial r} = (\cos\theta, \sin\theta, 1) \qquad \frac{\partial s}{\partial \theta} = (-r \sin\theta, r \cos\theta, 0) \]
and at the origin, where r = 0, these vectors are $(\cos\theta, \sin\theta, 1)$ and (0, 0, 0) and thus not linearly independent.
This shows why we require that these two vectors be linearly independent.
[Figure: the cone $z = \sqrt{x^2 + y^2}$.]
Example 3: Standard spherical coordinates make the sphere into a parameterized surface.
\[ s(\varphi, \theta) = (\sin\varphi \cos\theta, \sin\varphi \sin\theta, \cos\varphi) \]
Recall that ϕ is the angle from the north pole down to the point in question, and θ is the
angle which the projection of this point to the xy-plane makes with the x-axis. This map
is not globally one-to-one on 0 ≤ ϕ ≤ π, 0 ≤ θ ≤ 2π but is one-to-one on the inside of this
rectangle. The inside maps to every point on the sphere except the north and south poles
and the Greenwich meridian. Except at the poles, we have $\frac{\partial s}{\partial \varphi} \times \frac{\partial s}{\partial \theta} \neq 0$. Since we are only
interested in local theory, we are happy to work with this coordinate map.
[Figure: the sphere.]
Example 4: Consider the doughnut shown on the next page. Let the radius of the circle
through the center of the doughnut be R and let the radius of the smaller circle through a
piece of doughnut be r. If p is a point on the doughnut, project p to the xy-plane and let θ
be the angle made with the x-axis. Cut the doughnut along the plane through the center
and p and let the angle made from the center of the smaller circle to p be ϕ. It is easy to
[Figure: the doughnut.]
Example 5: Consider the Möbius band shown below. Let the center of this band trace
out a circle $(R \cos\theta, R \sin\theta, 0)$. At each point of this circle, imagine a small arrow perpendicular
to the circle and let this arrow rotate half as fast as the circle, so that when we
go completely around the band, the arrow has twisted through 180 degrees. Clearly the
arrow is $\left( (R \cos\theta) \cos\tfrac{\theta}{2}, (R \sin\theta) \cos\tfrac{\theta}{2}, \sin\tfrac{\theta}{2} \right)$. We can get a band by going along the
arrow a distance t, where −1 ≤ t ≤ 1. This gives the parameterization below.
\[ s(\theta, t) = \left( R \cos\theta + tR \cos\theta \cos\tfrac{\theta}{2}, \; R \sin\theta + tR \sin\theta \cos\tfrac{\theta}{2}, \; t \sin\tfrac{\theta}{2} \right) \]
[Figure: the Möbius band.]
γ(t) = (u(t), v(t)) in local coordinates, move the curve to the surface by writing s(u(t), v(t)),
and then differentiate. By the chain rule we get
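\[ \frac{\partial \vec{s}}{\partial u} \frac{du}{dt} + \frac{\partial \vec{s}}{\partial v} \frac{dv}{dt}. \]
Here we have written small arrows over s to indicate that $\frac{\partial \vec{s}}{\partial u}$ and $\frac{\partial \vec{s}}{\partial v}$ are three-dimensional
vectors, but we will usually omit these arrows.
Notice that the expression just written is a linear combination of the linearly independent
vectors $\frac{\partial \vec{s}}{\partial u}$ and $\frac{\partial \vec{s}}{\partial v}$. This motivates the following definition: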
Definition 8 Let S be a parameterized surface, p ∈ S. A tangent vector at p is a vector
in $\mathbb{R}^3$ of the form
\[ X = X_1 \frac{\partial \vec{s}}{\partial u} + X_2 \frac{\partial \vec{s}}{\partial v} \]
where $X_1$ and $X_2$ are real numbers and the derivatives of s are evaluated at the point
$(u_0, v_0)$ corresponding to p. The corresponding vector in local coordinates is the vector in
$\mathbb{R}^2$ given by
\[ X = (X_1, X_2). \]
Example: Consider the saddle $z = y^2 - x^2$ and the point p = (1, 2, 3) on this saddle.
The saddle can be parameterized by $s(u, v) = (u, v, v^2 - u^2)$ and the point corresponds to
u = 1, v = 2. We have $\frac{\partial s}{\partial u} = (1, 0, -2u)$ and $\frac{\partial s}{\partial v} = (0, 1, 2v)$ and at the point p these vectors
are (1, 0, −2) and (0, 1, 4). So a tangent vector is any linear combination of these vectors.
Such a vector has the form $X_1 (1, 0, -2) + X_2 (0, 1, 4)$, and in local coordinates it is written
\[ X = (X_1, X_2). \]
Remark: Recall that we are interested in surfaces, but usually work in local coordinates.
Consequently, when we think of a tangent vector X, we visualize a vector in 3-space
tangent to the surface. But when we write the vector symbolically, we usually just write
$X = (X_1, X_2)$. At rare moments, we need to remember that the corresponding vector
tangent to the surface is $X_1 \frac{\partial \vec{s}}{\partial u} + X_2 \frac{\partial \vec{s}}{\partial v}$.
Tangent vectors usually appear in this course in one of two ways. We may have a curve
in the surface and want to compute the curve’s tangent vector, which will be tangent to
the surface. Or we may be given a tangent vector X and want to compute the directional
derivative X(f ) of a function f on the surface in the direction X. Here are more details
about the first of these ideas.
Definition 9 By a curve on a parameterized surface S, we mean a curve α(t) in $\mathbb{R}^3$ which
has the form α(t) = s(γ(t)) for a $C^\infty$ curve γ(t) in the coordinate uv-plane. This coordinate
curve is often written γ(t) = (u(t), v(t)) and the corresponding surface curve is written
α(t) = s(u(t), v(t)).
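Suppose we want the tangent vector to such a curve. We claim that we can perform the
obvious calculation in either coordinate space and get the same answer. Indeed, in $\mathbb{R}^2$ we
will compute
\[ \gamma'(t) = \left( \frac{du}{dt}, \frac{dv}{dt} \right) \]
and in $\mathbb{R}^3$ we will compute
\[ \alpha'(t) = \left( \frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt} \right). \]
But α(t) = s(γ(t)), so by the chain rule we have $\alpha'(t) = \frac{\partial s}{\partial u} \frac{du}{dt} + \frac{\partial s}{\partial v} \frac{dv}{dt}$. Thus the derivative
in coordinate space is $\left( \frac{du}{dt}, \frac{dv}{dt} \right)$ and the derivative in $\mathbb{R}^3$ is $\frac{du}{dt} \frac{\partial \vec{s}}{\partial u} + \frac{dv}{dt} \frac{\partial \vec{s}}{\partial v}$. These are different
descriptions of the same tangent vector because according to our earlier rule the vector
$(X_1, X_2)$ in $\mathbb{R}^2$ corresponds to the vector $X_1 \frac{\partial \vec{s}}{\partial u} + X_2 \frac{\partial \vec{s}}{\partial v}$ in $\mathbb{R}^3$.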
Example: Suppose we parameterize the surface $z = x^2 + y^2$ using polar coordinates.
Then $s(r, \theta) = (r \cos\theta, r \sin\theta, r^2)$. Consider the curve γ(t) = (1, t) in local coordinates.
Then α(t) = s(γ(t)) = (cos t, sin t, 1). At the special time t = π/4, $\gamma' = (0, 1)$ and $\alpha' = (-1/\sqrt{2}, 1/\sqrt{2}, 0)$.
These vectors are equivalent because (0, 1) corresponds to
\[ 0 \cdot \frac{\partial \vec{s}}{\partial r} + 1 \cdot \frac{\partial \vec{s}}{\partial \theta} = (-r \sin\theta, r \cos\theta, 0) \Big|_{r = 1, \, \theta = \pi/4} = (-1/\sqrt{2}, 1/\sqrt{2}, 0). \]
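The correspondence in this example can be checked numerically. The sketch below is an illustration only: it approximates all derivatives by central differences, and the helper name derivative and the step size are hypothetical choices. It compares the vector obtained by pushing γ′ forward with ∂s/∂r and ∂s/∂θ against the direct derivative of α.

```python
import math

def s(r, theta):
    """Polar parameterization of the surface z = x^2 + y^2."""
    return (r * math.cos(theta), r * math.sin(theta), r * r)

def gamma(t):
    """The curve in local coordinates (r, theta) = (1, t)."""
    return (1.0, t)

def alpha(t):
    """The corresponding curve on the surface."""
    return s(*gamma(t))

def derivative(f, t, h=1e-6):
    """Central difference of a vector-valued function of one variable."""
    return tuple((p - m) / (2 * h) for p, m in zip(f(t + h), f(t - h)))

t0 = math.pi / 4
X1, X2 = derivative(gamma, t0)                  # gamma'(t0), close to (0, 1)
r0, th0 = gamma(t0)
s_r = derivative(lambda r: s(r, th0), r0)       # ds/dr at (r0, th0)
s_th = derivative(lambda th: s(r0, th), th0)    # ds/dtheta at (r0, th0)

pushed = tuple(X1 * a + X2 * b for a, b in zip(s_r, s_th))
direct = derivative(alpha, t0)                  # alpha'(t0)
print(pushed)
print(direct)
```

Both printed vectors should be close to (−1/√2, 1/√2, 0).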
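\[ X(g) = X_1 \frac{\partial g}{\partial u} + X_2 \frac{\partial g}{\partial v} \]
Here $\frac{\partial}{\partial u}$ has no independent meaning; it is just a formal symbol. But obviously the notation
has been chosen so
\[ X(f) = \left( X_1 \frac{\partial}{\partial u} + X_2 \frac{\partial}{\partial v} \right) f = X_1 \frac{\partial f}{\partial u} + X_2 \frac{\partial f}{\partial v} \]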
Many people teaching calculus for the first time think it is very easy and straightforward
and then get caught on tricky points. The first such point occurs very early in the term.
Students are carefully taught that the derivative is the slope of the curve and thus a
number. Then the lecturer must give an example. A typical first example is f (x) = x3 and
the derivative is 3x2 . Wait. That’s a function.
The point is that mathematicians usually differentiate functions at many points simultaneously,
producing the function $\frac{df}{dx}$. Similarly in several variable calculus we differentiate
at many places, producing the functions $\frac{\partial f}{\partial u}$ and $\frac{\partial f}{\partial v}$.
The material in this section will not be used until the next chapter, so you may skip it and
come back later if you wish.
In several variable calculus, it is very significant that mixed partial derivatives are equal,
so $\frac{\partial^2 f}{\partial u \, \partial v} = \frac{\partial^2 f}{\partial v \, \partial u}$. On a surface there are no preferred directions, so we should work with
arbitrary vector fields X and Y. But X and Y need not commute as operators. That is,
X(Y(g)) need not equal Y(X(g)).
Non-commutative operations are one of mathematicians’ greatest discoveries. When operators
do not commute, it is important to measure their noncommutativity. Rubik’s cube is
solved by operations of the form $aba^{-1}b^{-1}$, which measure the noncommutativity of simple
twists a and b. In physics, measurement of noncommutativity leads to the Heisenberg
uncertainty relations.
Also in this course, the deepest results will ultimately depend on measuring the noncom-
mutativity of various operators. We have come to the first of these situations. If X and Y
are vector fields, we define the Lie bracket (or Poisson bracket) of X and Y by the following
formula.
Definition 11
[X, Y ]g = X(Y (g)) − Y (X(g))
Notice that [X, Y ] makes no sense for vectors at a point; to define it, X and Y must be
vector fields. The equality of mixed partial derivatives survives in the following remarkable
theorem:
Theorem 12 The operator [X, Y ] corresponds to a unique vector field.
Proof: This theorem is a surprise because we expect that second derivatives would be
involved. But they cancel out.
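We have
\[ X(Y(g)) = \left( X_1 \frac{\partial}{\partial u} + X_2 \frac{\partial}{\partial v} \right) \left( Y_1 \frac{\partial g}{\partial u} + Y_2 \frac{\partial g}{\partial v} \right) \]
Since the $Y_i$ are functions, we must use the product rule. We obtain
\[ \left( X_1 \frac{\partial Y_1}{\partial u} + X_2 \frac{\partial Y_1}{\partial v} \right) \frac{\partial g}{\partial u} + \left( X_1 \frac{\partial Y_2}{\partial u} + X_2 \frac{\partial Y_2}{\partial v} \right) \frac{\partial g}{\partial v} + \sum_{ij} X_i Y_j \frac{\partial^2 g}{\partial u_i \, \partial u_j} \]
Now subtract the same result with X and Y interchanged. Since mixed partial derivatives
are equal, the last term cancels out and the final result is
\[ [X, Y]g = \left( X_1 \frac{\partial Y_1}{\partial u} + X_2 \frac{\partial Y_1}{\partial v} - Y_1 \frac{\partial X_1}{\partial u} - Y_2 \frac{\partial X_1}{\partial v} \right) \frac{\partial g}{\partial u} + \left( X_1 \frac{\partial Y_2}{\partial u} + X_2 \frac{\partial Y_2}{\partial v} - Y_1 \frac{\partial X_2}{\partial u} - Y_2 \frac{\partial X_2}{\partial v} \right) \frac{\partial g}{\partial v} \]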
Thus [X, Y ] is obtained by applying another vector field, whose coefficients are given inside
the large round brackets in the previous formula.
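The cancellation of second derivatives can be tested numerically. The following Python sketch is an illustration under stated assumptions: the vector fields X = v ∂/∂u and Y = u ∂/∂v and the test function g are arbitrary choices, all derivatives are central differences, and the helper names (partial, apply_field, bracket) are hypothetical. It compares X(Y(g)) − Y(X(g)) with the result of applying the vector field whose coefficients are given by the formula in the proof.

```python
def partial(f, i, p, h=1e-4):
    """Central-difference partial derivative of f in coordinate i at the point p."""
    hi, lo = list(p), list(p)
    hi[i] += h
    lo[i] -= h
    return (f(tuple(hi)) - f(tuple(lo))) / (2 * h)

def apply_field(X, g):
    """Return the function X(g) = X1 dg/du + X2 dg/dv."""
    return lambda p: sum(X(p)[i] * partial(g, i, p) for i in range(2))

def bracket(X, Y):
    """Coefficients of [X, Y], taken from the formula in the proof."""
    def Z(p):
        return tuple(
            sum(X(p)[i] * partial(lambda q: Y(q)[k], i, p)
                - Y(p)[i] * partial(lambda q: X(q)[k], i, p)
                for i in range(2))
            for k in range(2))
    return Z

X = lambda p: (p[1], 0.0)                    # X = v d/du
Y = lambda p: (0.0, p[0])                    # Y = u d/dv
g = lambda p: p[0] ** 2 * p[1] + p[1] ** 3   # g(u, v) = u^2 v + v^3

p = (0.5, -1.5)
lhs = apply_field(X, apply_field(Y, g))(p) - apply_field(Y, apply_field(X, g))(p)
rhs = apply_field(bracket(X, Y), g)(p)
print(lhs, rhs, bracket(X, Y)(p))
```

For these fields a hand calculation gives [X, Y] = (−u, v), so the two printed numbers should agree and the printed coefficients should be close to (−0.5, −1.5).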
The exciting ideas begin here. Suppose that X and Y are vectors in $\mathbb{R}^3$. We can compute
the dot product and length of these vectors. From now on, we will use the notation $\langle X, Y \rangle$
for dot product. Recall that $\|X\|^2 = \langle X, X \rangle$.
The basic philosophy of the course is to work in local coordinates. Suppose X and Y are
vectors in the uv-plane. Then they correspond to tangent vectors to the surface X and Y,
and we now define
Definition 12 Suppose $X = (X_1, X_2)$ and $Y = (Y_1, Y_2)$ are vectors in $\mathbb{R}^2$. Then $\langle X, Y \rangle$
denotes the dot product of the corresponding vectors in $\mathbb{R}^3$, and $\|X\| = \sqrt{\langle X, X \rangle}$ is the
length of the corresponding vector in $\mathbb{R}^3$.
WARNING: These symbols do not mean ordinary dot product or length in two space.
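Let us compute the formula for $\langle X, Y \rangle$. If $X = (X_1, X_2)$, then X corresponds to the three
dimensional vector $X_1 \frac{\partial \vec{s}}{\partial u} + X_2 \frac{\partial \vec{s}}{\partial v}$. Similarly, Y corresponds to the vector $Y_1 \frac{\partial \vec{s}}{\partial u} + Y_2 \frac{\partial \vec{s}}{\partial v}$ and
consequently, $\langle X, Y \rangle$ is the ordinary dot product of these vectors in $\mathbb{R}^3$, which equals
\[ X_1 Y_1 \left( \frac{\partial \vec{s}}{\partial u} \cdot \frac{\partial \vec{s}}{\partial u} \right) + (X_1 Y_2 + X_2 Y_1) \left( \frac{\partial \vec{s}}{\partial u} \cdot \frac{\partial \vec{s}}{\partial v} \right) + X_2 Y_2 \left( \frac{\partial \vec{s}}{\partial v} \cdot \frac{\partial \vec{s}}{\partial v} \right). \]
Definition 13
\[ g_{11}(u, v) = \frac{\partial \vec{s}}{\partial u} \cdot \frac{\partial \vec{s}}{\partial u} \]
\[ g_{12}(u, v) = \frac{\partial \vec{s}}{\partial u} \cdot \frac{\partial \vec{s}}{\partial v} \]
\[ g_{22}(u, v) = \frac{\partial \vec{s}}{\partial v} \cdot \frac{\partial \vec{s}}{\partial v} \]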
Because the gij are so important, we’ll spend some time trying to get an intuitive feel for
their significance. Suppose we are in the flat plane at a point p = (u, v) and we make
infinitesimal changes du and dv in u and v. Let ds be the distance we moved, that is, the
hypotenuse. By Pythagoras,
\[ ds^2 = du^2 + dv^2 \]
Now suppose that we are on a curved surface. We still draw coordinates u and v, but
these coordinates no longer preserve lengths and angles. So the Pythagorean theorem fails.
Gauss and Riemann’s great idea was to replace the Pythagorean theorem with
\[ ds^2 = \sum_{ij} g_{ij} \, du_i \, du_j \]
Here and in the future, we use u1 and u2 rather than u and v when we want to sum over
various indices.
In the next section, we will use the gij to compute the lengths of curves on the surface
in local coordinates. As we’ll see, the infinitesimals above are replaced by calculations
involving vectors.
Let’s try a concrete example. Consider the saddle $z = y^2 - x^2$. Then $s(u, v) = (u, v, v^2 - u^2)$.
Local coordinates are obtained by projecting points on the saddle straight down to the
plane, but when arrows tangent to the saddle are projected down, their lengths change
and angles between vectors change. We are going to use the mathematics above to find
two vector fields X and Y in the plane which come from unit length perpendicular vector
fields X and Y on the saddle. See the picture below.
Here are details. Since $\frac{\partial s}{\partial u} = (1, 0, -2u)$ and $\frac{\partial s}{\partial v} = (0, 1, 2v)$, we have $g_{11} = 1 + 4u^2$, $g_{12} = -4uv$,
$g_{22} = 1 + 4v^2$. Consider the vector field X = (1, 0). While these vectors have length
1 in Euclidean space, their length on the surface is
\[ \sqrt{\sum_{ij} g_{ij} X_i X_j} = \sqrt{g_{11}} \]
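One way to carry out the construction announced above is to run the Gram-Schmidt process on the coordinate vectors (1, 0) and (0, 1), using the surface inner product in place of the ordinary dot product. The Python sketch below does this for the saddle; it is an illustration rather than the construction used in these notes, and the function names (metric, inner, orthonormal_fields) are hypothetical.

```python
import math

def metric(u, v):
    """Return (g11, g12, g22) for the saddle s(u, v) = (u, v, v^2 - u^2)."""
    s_u = (1.0, 0.0, -2.0 * u)
    s_v = (0.0, 1.0, 2.0 * v)
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return dot(s_u, s_u), dot(s_u, s_v), dot(s_v, s_v)

def inner(X, Y, u, v):
    """Surface inner product <X, Y> of two vectors given in local coordinates."""
    g11, g12, g22 = metric(u, v)
    return (g11 * X[0] * Y[0]
            + g12 * (X[0] * Y[1] + X[1] * Y[0])
            + g22 * X[1] * Y[1])

def orthonormal_fields(u, v):
    """Gram-Schmidt on (1, 0) and (0, 1) with respect to the surface inner product."""
    e1, e2 = (1.0, 0.0), (0.0, 1.0)
    n1 = math.sqrt(inner(e1, e1, u, v))
    X = (e1[0] / n1, e1[1] / n1)
    c = inner(e2, X, u, v)
    w = (e2[0] - c * X[0], e2[1] - c * X[1])
    n2 = math.sqrt(inner(w, w, u, v))
    Y = (w[0] / n2, w[1] / n2)
    return X, Y

X, Y = orthonormal_fields(1.0, 2.0)
print(metric(1.0, 2.0))
print(X, Y)
```

At the point u = 1, v = 2 the first line printed should be the values of g11, g12, g22 given by the formulas above, and X and Y should have surface length one and surface inner product zero.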
Remark: At the start of this course, we talked about a two-dimensional worker on the
surface with an infinitesimal straightedge and compass, who could not see into the third
dimension. Notice that such a worker could construct vector fields exactly as we have done.
The worker would first draw coordinate lines on the surface. Experience would convince
the worker that it is hopeless to try to draw perpendicular coordinates, or to be fussy about
lengths, so the coordinates would look curved to us. Call the coordinate numbers u and v.
The worker could use the ruler to draw unit vectors at each point, and the protractor and
ruler to draw perpendicular unit vectors at each point.
A little thought shows that the worker could compute gij . Conversely, any calculation
which the worker could perform with compass and ruler could be done by us if we knew
gij .
So the gij contain exactly the information which could be discovered by a two-
dimensional person who could not see into the third dimension.
There is one final remark to be made. We are assuming that there is a surface “out
there.” The gij describe ordinary Euclidean geometry on this surface. Our worker lives in
coordinate space u, v and measures gij rather than computing them from the surface. If
you wish to think this way, our worker lives in Plato’s cave and only imagines the actual
surface in the sunlight outside the cave.
But what if this surface didn’t exist? Could we imagine that we measure gij , work in local
coordinates, and do the rest of the calculations in this chapter without ever referring to
the surface? Absolutely. That’s exactly what we’re going to do. And then might we later
discover that there is no corresponding surface? Again, absolutely, this can happen. When
it happens, we will have discovered a new geometry rather than just mundanely studying
known surfaces.
Here is an example. Define
\[ ds^2 = \frac{4 (dx^2 + dy^2)}{(1 - (x^2 + y^2))^2} \]
on the unit disk. The four is included for historical reasons. You can immediately deduce
the corresponding $g_{ij}$. When we are at the center of the disk, this is just the ordinary
Pythagorean formula (except for a factor of 2 in ds). But near the boundary, the denominator is almost
zero and distances are much larger than they appear.
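To get a feel for how quickly distances grow near the boundary, here is a small Python sketch that measures the segment from the center of the disk to the point (r, 0). Along this segment dy = 0, so ds = 2 dx / (1 − x²). The function name and the use of the midpoint rule are illustrative assumptions; the closed form 2 artanh(r) = ln((1 + r)/(1 − r)) used for comparison is the elementary integral of this expression.

```python
import math

def hyperbolic_radial_length(r, steps=100000):
    """Midpoint-rule length of the segment from (0, 0) to (r, 0) in the metric above."""
    h = r / steps
    return sum(2.0 * h / (1.0 - ((i + 0.5) * h) ** 2) for i in range(steps))

for r in (0.5, 0.9, 0.99):
    numeric = hyperbolic_radial_length(r)
    exact = math.log((1 + r) / (1 - r))
    print(r, numeric, exact)
```

The Euclidean length of each segment is less than one, while the length measured with this ds grows without bound as r approaches 1.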
It turns out that this example is exactly the non-Euclidean geometry discovered by Bolyai
and Lobachevsky. It also turns out that no surface in R3 can give these gij (this is a
theorem of Hilbert). So ds2 determines a new geometry.
However, many people know this geometry, because it is the geometry behind Escher’s
picture of angels and devils. In the picture, each angel has the same non-Euclidean size,
although they shrink in the Euclidean world. See the next page.
After these extensive preparations, we are ready to discuss a truly great result. Let γ(t) =
(u(t), v(t)) be a curve from a point p to a point q, written in local coordinates. We call this
curve a geodesic if it is the shortest curve from p to q. Of course we are really measuring
lengths on the surface, so geodesics will not be straight lines.
We are going to prove that each geodesic satisfies a differential equation. To make efficient
use of summation notation, we will call the coordinates u1 and u2 rather than u and v. In
the special case when the surface is a flat plane, the differential equation will turn out to
be
\[ \frac{d^2 u_k}{dt^2} = 0 \]
and geodesics will be straight lines. This is no surprise. The key question is how to
generalize the above simple equation; after much work, mathematicians in the nineteenth
century discovered that the appropriate generalization is
\[ \frac{d^2 u_k}{dt^2} + \sum_{ij} \Gamma^k_{ij} \frac{du_i}{dt} \frac{du_j}{dt} = 0 \]
where the $\Gamma^k_{ij}$, the so-called Christoffel symbols, are given by the formula
\[ \Gamma^l_{ij} = \frac{1}{2} \sum_k \left( g^{-1} \right)_{lk} \left( \frac{\partial g_{jk}}{\partial u_i} + \frac{\partial g_{ik}}{\partial u_j} - \frac{\partial g_{ij}}{\partial u_k} \right) \]
Notice in particular that the differential equation ultimately depends on gij , so geodesics
can be determined by a two-dimensional worker.
Let us now begin the rigorous development of the theory. Suppose γ(t) = (u(t), v(t)) is a
curve on the surface in local coordinates. We define the length of this curve from a to b to
be
\[ L_a^b(\gamma) = \int_a^b \|\gamma'(t)\| \, dt. \]
Recall that when we are in local coordinates, the expression $\|\gamma'(t)\|$ always means the
special inner product using $g_{ij}$. Thus this definition when written out gives
\[ L_a^b(\gamma) = \int_a^b \sqrt{\sum_{ij} g_{ij} \frac{du_i}{dt} \frac{du_j}{dt}} \, dt \]
We claim that this expression is equal to the actual Euclidean length of the curve α(t) =
s(γ(t)) on the surface. To see this, suppose that $X = X_1 \frac{\partial}{\partial u} + X_2 \frac{\partial}{\partial v}$ and $Y = Y_1 \frac{\partial}{\partial u} + Y_2 \frac{\partial}{\partial v}$
are vectors in local coordinates. By definition, the dot product $\langle X, Y \rangle$ equals the ordinary
dot product of the corresponding vectors in three space given by $X = X_1 \frac{\partial s}{\partial u} + X_2 \frac{\partial s}{\partial v}$ and
$Y = Y_1 \frac{\partial s}{\partial u} + Y_2 \frac{\partial s}{\partial v}$. Therefore, the local coordinate expression $\|\gamma'(t)\| = \left\| \frac{du}{dt} \frac{\partial}{\partial u} + \frac{dv}{dt} \frac{\partial}{\partial v} \right\|$ is
the ordinary Euclidean length of $\frac{du}{dt} \frac{\partial s}{\partial u} + \frac{dv}{dt} \frac{\partial s}{\partial v}$, a vector which equals $\frac{d}{dt} s(u(t), v(t))$ by the
chain rule. Since s(u(t), v(t)) = α(t), we have
\[ \int_a^b \|\gamma'(t)\| \, dt = \int_a^b \|\alpha'(t)\| \, dt \]
as desired. Notice that this is a deceptive formula. The length on the left is computed in
a fancy way using the $g_{ij}$, while the length on the right side is computed in three space
using the Pythagorean theorem.
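The equality of the two lengths can be checked numerically. In the Python sketch below, the surface is the saddle s(u, v) = (u, v, v² − u²) with the metric coefficients computed earlier, and the curve γ(t) = (cos t, sin t) for 0 ≤ t ≤ 1 is an arbitrary illustrative choice. One function integrates the speed computed from the g_ij; the other adds up straight-line distances between closely spaced points of α in three space. The function names and step counts are hypothetical.

```python
import math

def s(u, v):
    """The saddle z = y^2 - x^2 as a parameterized surface."""
    return (u, v, v * v - u * u)

def gamma(t):
    return (math.cos(t), math.sin(t))

def gamma_prime(t):
    return (-math.sin(t), math.cos(t))

def metric_speed(t):
    """Length of gamma'(t) measured with g11, g12, g22 of the saddle."""
    u, v = gamma(t)
    du, dv = gamma_prime(t)
    g11, g12, g22 = 1 + 4 * u * u, -4 * u * v, 1 + 4 * v * v
    return math.sqrt(g11 * du * du + 2 * g12 * du * dv + g22 * dv * dv)

def length_from_metric(a, b, n=20000):
    """Midpoint-rule approximation of the integral of the metric speed."""
    h = (b - a) / n
    return sum(metric_speed(a + (i + 0.5) * h) for i in range(n)) * h

def length_in_space(a, b, n=20000):
    """Length of a fine polygonal approximation to alpha(t) = s(gamma(t)) in R^3."""
    h = (b - a) / n
    pts = [s(*gamma(a + i * h)) for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

print(length_from_metric(0.0, 1.0), length_in_space(0.0, 1.0))
```

The two printed numbers should agree to many decimal places, even though the first never refers to the third coordinate.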
2.8 Energy
Suppose that the particle is smart and wants to spend as little kinetic energy as possible
getting from p to q. It turns out that the particle should do this by going along a geodesic
at constant speed. If instead the particle travels along a longer curve, or travels along a
geodesic but goes fast part of the time and slow part of the time, then it will use up more
kinetic energy.
Since we are not physicists, constants do not matter to us.
Definition 14 Let γ(t) be a curve in local coordinates for a ≤ t ≤ b. The energy of the
curve, E(γ), is defined to be
\[ E(\gamma) = \int_a^b \|\gamma'(t)\|^2 \, dt \]
Theorem 14 We have
\[ L_a^b(\gamma)^2 \leq (b - a) \, E(\gamma) \]
This inequality is an equality exactly when the speed of the curve is constant.
Proof: Recall that the dot product of two vectors in $\mathbb{R}^n$ equals the length of one times
the length of the other times the cosine of the angle between them. So we have
\[ \langle X, Y \rangle = \|X\| \, \|Y\| \cos\theta \]
Since cos θ is between −1 and 1, the above formula implies the Schwarz inequality
\[ \langle X, Y \rangle^2 \leq \|X\|^2 \, \|Y\|^2 \]
Moreover, this will be an equality exactly when cos θ = ±1 and so when X and Y point in
the same or opposite directions, that is, when one is a scalar multiple of the other.
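We now work by analogy. Let V be the vector space of all continuous functions on the
interval [a, b]. Imagine that each f ∈ V is a sort of infinite dimensional vector, with one
coordinate f(t) for each t ∈ [a, b]. Then it is reasonable to compute the dot product $\langle f, g \rangle$ by
multiplying coordinates f(t)g(t) and then summing up over all t. Since there are infinitely
many t, we get
\[ \langle f, g \rangle = \int_a^b f(t) g(t) \, dt. \]
The Schwarz inequality continues to hold for this inner product. Applying it to $f(t) = \|\gamma'(t)\|$ and g(t) = 1 gives
\[ \left( \int_a^b \|\gamma'(t)\| \cdot 1 \, dt \right)^2 \leq \left( \int_a^b \|\gamma'(t)\|^2 \, dt \right) \left( \int_a^b 1^2 \, dt \right) \]
and so $L(\gamma)^2 \leq (b - a) \, E(\gamma)$. Moreover, this is an equality exactly if $\|\gamma'(t)\|$ is a multiple of
1 so γ(t) has constant speed. QED.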
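The inequality is easy to test numerically, since both L and E depend only on the speed of the curve. The Python sketch below is an illustration with hypothetical names: it approximates both integrals by the midpoint rule for a speed function supplied by the caller. The two sample speeds, 2t and 1 on the interval [0, 1], both describe curves of length one.

```python
def length_and_energy(speed, a, b, n=100000):
    """Midpoint-rule approximations of L = integral of speed and E = integral of speed^2."""
    h = (b - a) / n
    mids = [a + (i + 0.5) * h for i in range(n)]
    L = sum(speed(t) for t in mids) * h
    E = sum(speed(t) ** 2 for t in mids) * h
    return L, E

# Same length, two different parameterizations on [0, 1].
L_var, E_var = length_and_energy(lambda t: 2.0 * t, 0.0, 1.0)   # speed varies
L_const, E_const = length_and_energy(lambda t: 1.0, 0.0, 1.0)   # constant speed
print(L_var ** 2, E_var)
print(L_const ** 2, E_const)
```

For the varying speed the energy is strictly larger than the squared length (4/3 against 1), while for constant speed the two agree.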
Remark: If we have a curve γ(t) defined for a ≤ t ≤ b, we can reparameterize the curve
so it has constant speed. The length will not change. By changing this constant speed, we
can assume that the curve is defined on the interval 0 ≤ t ≤ 1. From now on, we work only
with such curves.
Theorem 15 Consider curves in U defined for 0 ≤ t ≤ 1. Such a curve minimizes energy
if and only if it minimizes length and has constant speed.
Proof: Suppose that γ(t) has constant speed and the smallest possible length. Since the
speed is constant, the previous theorem gives $L(\gamma)^2 = E(\gamma)$. If τ(t) is another curve, then
we have
\[ E(\gamma) = L(\gamma)^2 \leq L(\tau)^2 \leq E(\tau) \]
and so γ has smallest energy.
Conversely, suppose γ has smallest energy. We have $L(\gamma)^2 \leq E(\gamma)$ with equality exactly
when γ has constant speed. Therefore if γ does not have constant speed, we could reparameterize
to get a curve with the same length but constant speed and so smaller energy.
If there were a shorter curve τ, then we could reparameterize τ to have constant speed,
and then
\[ E(\gamma) = L(\gamma)^2 > L(\tau)^2 = E(\tau) \]
contradicting the assumption that γ has smallest energy. QED.
We now have enough background to deduce the differential equation of a geodesic. The
calculation will be slightly messy, but after all we are deducing one of the great results in
mathematics!
To avoid getting bogged down, we’ll sketch the idea of the calculation first. Suppose we
have a curve γ(t) defined on 0 ≤ t ≤ 1. Suppose the curve travels from p to q, so γ(0) = p
and γ(1) = q. Finally, suppose γ minimizes energy among such curves.
Imagine that we vary γ through a family of similar curves from p to q as in the picture
below.
Call the family of curves γ(t, s) where −ε < s < ε. For each fixed s we get a curve from p
to q; when s = 0 we get our original curve γ.
Let E(s) be the energy of the sth curve. Then E is a function of one variable which has a
minimum at s = 0. By ordinary calculus, the derivative of E should be zero at s = 0. We
are going to calculate this derivative by differentiating the formula for E with respect to s
under the integral sign. It will turn out that the derivative at s = 0 has the form
\[ \int_0^1 \{ \text{an expression in } t \text{ involving } g_{ij} \text{ and } u_i \} \, \frac{\partial u}{\partial s} \, dt \]
The expression $\frac{\partial u}{\partial s}(t, 0)$ is the derivative of the variation, and thus a series of arrows along
the original curve explaining how to begin the variation. See the picture below.
However, $\frac{dE}{ds} = 0$ for all possible variations. Whenever we have a candidate $\frac{\partial u}{\partial s}(t, 0)$ for
a variation with the property that the arrows are zero at t = 0 and t = 1 so the varied
curves still go from p to q, we can clearly find a variation γ(t, s) with the appropriate
derivative. So the integral must be zero for any function $\frac{\partial u}{\partial s}(t, 0)$ which vanishes at the
endpoints.
It follows that the expression “{ an expression in t involving gij and ui }” inside the integral
must be identically zero in t. Why? You might think that it could be nonzero but have
positive and negative parts which cancel when integrated from 0 to 1. But since we can
choose $\frac{\partial u}{\partial s}(t, 0)$ arbitrarily, we could defeat this cancellation by choosing a function with a
small bump, as illustrated below.
Finally, the expression “{ an expression in t involving gij and ui }” will turn out to be our
differential equation.
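Fine. Here are the details. The energy of the variation γ(t, s) is given by
\[ E(s) = \int_0^1 \left\| \frac{\partial \gamma}{\partial t}(t, s) \right\|^2 dt = \int_0^1 \sum_{ij} g_{ij}(u(t, s), v(t, s)) \, \frac{\partial u_i}{\partial t}(t, s) \, \frac{\partial u_j}{\partial t}(t, s) \, dt \]
We are assuming that the curve when s = 0 has smallest energy, so by calculus we have
$\frac{d}{ds} E(s) = 0$ when s = 0. Let us compute this derivative by differentiating with respect to s
under the integral sign. By the product rule, the integrand becomes
\[ \sum_{ij} \frac{\partial g_{ij}}{\partial s} \frac{\partial u_i}{\partial t} \frac{\partial u_j}{\partial t} + \sum_{ij} g_{ij} \frac{\partial^2 u_i}{\partial s \, \partial t} \frac{\partial u_j}{\partial t} + \sum_{ij} g_{ij} \frac{\partial u_i}{\partial t} \frac{\partial^2 u_j}{\partial s \, \partial t}. \]
The first term can be expanded via the chain rule, so the entire expression becomes
\[ \frac{dE}{ds} = \int_0^1 \left( \sum_{ijk} \frac{\partial g_{ij}}{\partial u_k} \frac{\partial u_k}{\partial s} \frac{\partial u_i}{\partial t} \frac{\partial u_j}{\partial t} + \sum_{ij} g_{ij} \frac{\partial^2 u_i}{\partial s \, \partial t} \frac{\partial u_j}{\partial t} + \sum_{ij} g_{ij} \frac{\partial u_i}{\partial t} \frac{\partial^2 u_j}{\partial s \, \partial t} \right) dt \]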
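We now come to the decisive step. Notice that the first term in the previous integral is
indeed a complicated expression multiplied by $\frac{\partial u_k}{\partial s}$ as promised. But the other two terms
don’t have this form. Instead they contain terms $\frac{\partial^2 u_i}{\partial s \, \partial t}$. We are going to integrate the last
two terms by parts to convert to the required form. Recall that integration by parts is the
formula $\int_a^b f(x) \frac{dg}{dx} \, dx = f(x) g(x) \big|_a^b - \int_a^b \frac{df}{dx} g(x) \, dx$. Here is the calculation for just these last two
terms:
\[ \int_0^1 \sum_{ij} \left[ g_{ij} \frac{\partial u_j}{\partial t} \right] \frac{\partial^2 u_i}{\partial t \, \partial s} \, dt + \int_0^1 \sum_{ij} \left[ g_{ij} \frac{\partial u_i}{\partial t} \right] \frac{\partial^2 u_j}{\partial t \, \partial s} \, dt = \]
\[ \text{(boundary terms)} - \int_0^1 \sum_{ij} \frac{\partial}{\partial t} \left[ g_{ij} \frac{\partial u_j}{\partial t} \right] \frac{\partial u_i}{\partial s} \, dt - \int_0^1 \sum_{ij} \frac{\partial}{\partial t} \left[ g_{ij} \frac{\partial u_i}{\partial t} \right] \frac{\partial u_j}{\partial s} \, dt \]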
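Let us stop to investigate the boundary terms. The boundary term for the first integral
is
\[ \left[ \sum_{ij} g_{ij} \frac{\partial u_j}{\partial t} \frac{\partial u_i}{\partial s} \right]_0^1 \]
However, $\gamma(t, s) = (u_1(t, s), u_2(t, s))$ is constantly p at t = 0 because all of our curves begin
at p. So $\frac{\partial u_i}{\partial s} = 0$ when t = 0. Similarly this expression equals zero when t = 1. We
rename a few indices and use the fact that $g_{ij} = g_{ji}$. We get
\[ \frac{dE}{ds} = \int_0^1 \sum_k \left[ \sum_{ij} \frac{\partial g_{ij}}{\partial u_k} \frac{\partial u_i}{\partial t} \frac{\partial u_j}{\partial t} - \sum_j \frac{\partial}{\partial t} \left[ g_{jk} \frac{\partial u_j}{\partial t} \right] - \sum_i \frac{\partial}{\partial t} \left[ g_{ik} \frac{\partial u_i}{\partial t} \right] \right] \frac{\partial u_k}{\partial s} \, dt \]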
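At this point, we apply the second main idea of the proof. Since this expression must equal
zero for all choices of $\frac{\partial u_k}{\partial s}$ which vanish at the endpoints, the integrand must be identically
zero. Expand out the derivative of the terms inside the square brackets, using the product rule
and the chain rule. We get
\[ \sum_{ij} \frac{\partial g_{ij}}{\partial u_k} \frac{\partial u_i}{\partial t} \frac{\partial u_j}{\partial t} - \sum_{ij} \frac{\partial g_{jk}}{\partial u_i} \frac{\partial u_i}{\partial t} \frac{\partial u_j}{\partial t} - \sum_j g_{jk} \frac{\partial^2 u_j}{\partial t^2} - \sum_{ij} \frac{\partial g_{ik}}{\partial u_j} \frac{\partial u_i}{\partial t} \frac{\partial u_j}{\partial t} - \sum_i g_{ik} \frac{\partial^2 u_i}{\partial t^2} = 0. \]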
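Notice that the two terms with second derivatives are really the same. They just use
different indices. We write these terms together and use the fact that $g_{ik} = g_{ki}$ to get
\[ \sum_{ij} \left( \frac{\partial g_{ij}}{\partial u_k} - \frac{\partial g_{jk}}{\partial u_i} - \frac{\partial g_{ik}}{\partial u_j} \right) \frac{\partial u_i}{\partial t} \frac{\partial u_j}{\partial t} - 2 \sum_i g_{ki} \frac{\partial^2 u_i}{\partial t^2} = 0 \]
Multiplying by $-\frac{1}{2} (g^{-1})_{lk}$ and summing over k gives
\[ \sum_{ik} (g^{-1})_{lk} \, g_{ki} \frac{\partial^2 u_i}{\partial t^2} + \frac{1}{2} \sum_{ijk} (g^{-1})_{lk} \left( \frac{\partial g_{jk}}{\partial u_i} + \frac{\partial g_{ki}}{\partial u_j} - \frac{\partial g_{ij}}{\partial u_k} \right) \frac{\partial u_i}{\partial t} \frac{\partial u_j}{\partial t} = 0. \]
Notice that $\sum_k (g^{-1})_{lk} \, g_{ki}$ is one if l = i and zero otherwise. Consequently the above
expression simplifies to
\[ \frac{\partial^2 u_l}{\partial t^2} + \frac{1}{2} \sum_{ijk} (g^{-1})_{lk} \left( \frac{\partial g_{jk}}{\partial u_i} + \frac{\partial g_{ki}}{\partial u_j} - \frac{\partial g_{ij}}{\partial u_k} \right) \frac{\partial u_i}{\partial t} \frac{\partial u_j}{\partial t} = 0. \]
Call the coefficient of $\frac{\partial u_i}{\partial t} \frac{\partial u_j}{\partial t}$ in the second term $\Gamma^l_{ij}$. Also notice that we are interested only in the curve γ(t, 0),
so we can convert partial derivatives to total derivatives. We obtain
\[ \frac{d^2 u_l}{dt^2} + \sum_{ij} \Gamma^l_{ij} \frac{du_i}{dt} \frac{du_j}{dt} = 0 \]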
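We have proved the following theorem, where we change one index for convenience when
we apply the result:
Theorem 16 Define
\[ \Gamma^l_{ij} = \frac{1}{2} \sum_k (g^{-1})_{lk} \left( \frac{\partial g_{jk}}{\partial u_i} + \frac{\partial g_{ik}}{\partial u_j} - \frac{\partial g_{ij}}{\partial u_k} \right) \]
Then a curve which minimizes energy satisfies
\[ \frac{d^2 u_k}{dt^2} + \sum_{ij} \Gamma^k_{ij} \frac{du_i}{dt} \frac{du_j}{dt} = 0. \]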
When I learn something new, I first like to try an example where I already know the answer
to see if the new method really works.
Suppose we were to use polar coordinates in the plane rather than rectangular coordinates.
Would the geodesic equation give the correct geodesics? Let’s try.
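We have a map s(r, θ) = (r cos θ, r sin θ, 0), illustrated below. Then $\frac{\partial s}{\partial r} = (\cos\theta, \sin\theta, 0)$ and
$\frac{\partial s}{\partial \theta} = (-r \sin\theta, r \cos\theta, 0)$. So $g_{11} = \frac{\partial s}{\partial r} \cdot \frac{\partial s}{\partial r} = 1$, $g_{12} = \frac{\partial s}{\partial r} \cdot \frac{\partial s}{\partial \theta} = 0$ and $g_{22} = \frac{\partial s}{\partial \theta} \cdot \frac{\partial s}{\partial \theta} = r^2$.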
[Figure: the polar coordinate map s(r, θ).]
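Notice that
\[
g = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix} \quad \text{so} \quad g^{-1} = \begin{pmatrix} 1 & 0 \\ 0 & \frac{1}{r^2} \end{pmatrix}
\]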
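We must compute $\Gamma^k_{ij}$. Clearly $\Gamma^k_{ij} = \Gamma^k_{ji}$, so it suffices to compute the following expressions:
\[
\Gamma^1_{11} = 0, \quad \Gamma^2_{11} = 0, \quad \Gamma^1_{12} = 0, \quad \Gamma^2_{12} = \frac{1}{r}, \quad \Gamma^1_{22} = -r, \quad \Gamma^2_{22} = 0
\]
The geodesic equations are therefore
\[
\frac{d^2 r}{dt^2} - r \left( \frac{d\theta}{dt} \right)^2 = 0
\]
\[
\frac{d^2 \theta}{dt^2} + \frac{2}{r} \frac{dr}{dt} \frac{d\theta}{dt} = 0.
\]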
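The Christoffel symbols above can be checked numerically. The following is a minimal sketch, not part of the original notes: it evaluates the formula of Theorem 16 for a two-dimensional metric, approximating the partial derivatives of the metric by central differences. The names `christoffel` and `polar_metric` are hypothetical helpers introduced only for this illustration.

```python
def polar_metric(u):
    """Metric tensor g_ij in polar coordinates u = (r, theta)."""
    r = u[0]
    return [[1.0, 0.0], [0.0, r * r]]

def christoffel(metric, u, h=1e-5):
    """Return gamma[l][i][j], an approximation of Gamma^l_ij at the point u."""
    n = 2
    # dg[k][i][j] approximates the partial derivative of g_ij with respect to u_k
    dg = []
    for k in range(n):
        up, dn = list(u), list(u)
        up[k] += h
        dn[k] -= h
        gp, gm = metric(up), metric(dn)
        dg.append([[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)]
                   for i in range(n)])
    g = metric(u)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    ginv = [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for l in range(n):
        for i in range(n):
            for j in range(n):
                # (1/2) sum_k ginv_lk (d_i g_jk + d_j g_ik - d_k g_ij)
                gamma[l][i][j] = 0.5 * sum(
                    ginv[l][k] * (dg[i][j][k] + dg[j][i][k] - dg[k][i][j])
                    for k in range(n))
    return gamma

gamma = christoffel(polar_metric, [2.0, 0.3])
print(gamma[0][1][1], gamma[1][0][1])  # approximately -r and 1/r at r = 2
```

Indices in the code start at 0, so `gamma[0][1][1]` corresponds to $\Gamma^1_{22}$ and `gamma[1][0][1]` to $\Gamma^2_{12}$.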
Suppose for a moment that a geodesic has constant θ. Then the two equations reduce to
$\frac{d^2 r}{dt^2} = 0$ and so r = at + b. This gives a radial line through the origin, clearly one kind of
geodesic.
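Otherwise θ varies and our geodesic can be given by an equation for r in terms of θ. We
have
\[
\frac{dr}{dt} = \frac{dr}{d\theta} \frac{d\theta}{dt}
\]
\[
\frac{d^2 r}{dt^2} = \frac{d^2 r}{d\theta^2} \left( \frac{d\theta}{dt} \right)^2 + \frac{dr}{d\theta} \frac{d^2 \theta}{dt^2}
\]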
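and when these equations are substituted into the two earlier geodesic equations, we obtain
\[
\frac{d^2 r}{d\theta^2} \left( \frac{d\theta}{dt} \right)^2 + \frac{dr}{d\theta} \frac{d^2 \theta}{dt^2} - r \left( \frac{d\theta}{dt} \right)^2 = 0
\]
\[
\frac{d^2 \theta}{dt^2} + \frac{2}{r} \frac{dr}{d\theta} \left( \frac{d\theta}{dt} \right)^2 = 0
\]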
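The second equation can be solved for $\frac{d^2 \theta}{dt^2}$ and this result can be inserted into the first
equation, yielding a single equation
\[
\frac{d^2 r}{d\theta^2} \left( \frac{d\theta}{dt} \right)^2 - \frac{2}{r} \left( \frac{dr}{d\theta} \right)^2 \left( \frac{d\theta}{dt} \right)^2 - r \left( \frac{d\theta}{dt} \right)^2 = 0
\]
Dividing by $\left( \frac{d\theta}{dt} \right)^2$ gives
\[
\frac{d^2 r}{d\theta^2} - \frac{2}{r} \left( \frac{dr}{d\theta} \right)^2 - r = 0
\]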
Incidentally, the reason our original two equations have become only a single equation is
that we have lost track of how the curve is traced in time, and only kept information about
the shape r(θ).
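This final differential equation can be solved by an ingenious trick which also comes up in
the theory of Newtonian orbits for the two body problem. Let $r(\theta) = \frac{1}{u(\theta)}$. Then $\frac{dr}{d\theta} = -\frac{1}{u^2} \frac{du}{d\theta}$
and $\frac{d^2 r}{d\theta^2} = \frac{2}{u^3} \left( \frac{du}{d\theta} \right)^2 - \frac{1}{u^2} \frac{d^2 u}{d\theta^2}$. So the previous equation becomes
\[
\frac{2}{u^3} \left( \frac{du}{d\theta} \right)^2 - \frac{1}{u^2} \frac{d^2 u}{d\theta^2} - 2u \left( -\frac{1}{u^2} \frac{du}{d\theta} \right)^2 - \frac{1}{u} = 0
\]
or
\[
\frac{d^2 u}{d\theta^2} + u = 0.
\]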
The solutions of this equation are u(θ) = A sin(θ + δ) and so
\[
r(\theta) = \frac{1}{A \sin(\theta + \delta)}
\]
Luckily, this is the polar form of a straight line which misses the origin. Consider the
pictures on the next page. The picture on the left shows that a horizontal line of distance
$\frac{1}{A}$ from the origin has polar equation $r(\theta) = \frac{1}{A \sin\theta}$. The picture on the right shows that
every straight line which misses the origin can be obtained by rotating the picture on the
2.11. GEODESICS ON A SPHERE 49
left by −δ. When a polar curve r(θ) is rotated by −δ, the new equation has the form
r(θ + δ).
Putting these results together, we find that the solutions of the geodesic equation in polar
coordinates are exactly the polar forms of straight lines in the plane. Whew.
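The same conclusion can be tested numerically. The sketch below is an illustration that is not in the original notes: it integrates the two polar geodesic equations with a classical fourth-order Runge-Kutta step and converts the result back to rectangular coordinates. The function names and the particular initial conditions are assumptions chosen for the example. The curve starts at (1, 0) with rectangular velocity (0.5, 1), so a straight line would reach (1.5, 1) at time t = 1.

```python
import math

def geodesic_rhs(state):
    """Polar geodesic equations as a first-order system in (r, theta, r', theta')."""
    r, th, dr, dth = state
    return (dr, dth, r * dth * dth, -2.0 / r * dr * dth)

def rk4_step(f, y, h):
    """One classical Runge-Kutta step of size h for y' = f(y)."""
    k1 = f(y)
    k2 = f(tuple(a + 0.5 * h * b for a, b in zip(y, k1)))
    k3 = f(tuple(a + 0.5 * h * b for a, b in zip(y, k2)))
    k4 = f(tuple(a + h * b for a, b in zip(y, k3)))
    return tuple(a + h / 6.0 * (p + 2 * q + 2 * s + w)
                 for a, p, q, s, w in zip(y, k1, k2, k3, k4))

def integrate_polar_geodesic(r0, th0, dr0, dth0, t_end=1.0, steps=1000):
    """Return the rectangular points (x, y) visited by the geodesic."""
    y = (r0, th0, dr0, dth0)
    h = t_end / steps
    points = []
    for _ in range(steps + 1):
        points.append((y[0] * math.cos(y[1]), y[0] * math.sin(y[1])))
        y = rk4_step(geodesic_rhs, y, h)
    return points

points = integrate_polar_geodesic(1.0, 0.0, 0.5, 1.0)
print(points[-1])  # close to (1.5, 1.0)
```

Every computed point should satisfy x = 1 + 0.5 y up to integration error, which is the rectangular equation of the expected line.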
We end this chapter with a description of geodesics on several important surfaces. The
geodesic equation is a nonlinear differential equation, and it can seldom be explicitly solved.
Moreover, solutions of the geodesic equation have constant speed, and we have discovered
that renormalizing a curve so it has constant speed is rarely possible explicitly. The equa-
tion can be solved numerically, however. It is not difficult to teach Mathematica how to
do so; Mathematica has built-in routines to solve differential equations.
There are also tricks which can be useful!
Trick 1: The geodesic equation is a second order equation. Consequently, solutions γ(t)
are uniquely determined by the boundary conditions γ(0) and γ′(0). If a geodesic is already known
with these boundary conditions, it must be the one solving the geodesic equations.
Trick 2: An isometry of a surface is a $C^\infty$ map M : S → S which is one-to-one and
onto with $C^\infty$ inverse and preserves lengths. This final condition means that M ◦ α(t) and
α(t) always have the same length for any curve α. Such isometries automatically preserve
geodesics (why?).
These tricks can be used to dramatically simplify the determination of geodesics on a
sphere. One approach is to use spherical coordinates, write down the geodesic equation
as in the previous section, and then solve in the special case that ϕ is a constant. The
equations simplify dramatically and we discover that the solutions are circles which traverse
the equator at constant speed. We call these solutions great circles along the equator.
Now suppose that we have a more general geodesic γ(t) with boundary conditions γ(0) = p
and γ′(0) = X. Rotate the sphere so the equator rotates to a great circle through p in the
direction X. By Trick 2, this new great circle must be a geodesic. By Trick 1, it must be
the geodesic γ(t). This proves
Theorem 17 The geodesics on a sphere are exactly the great circles, traversed at constant speed.
Remark: This result is known to the general public. You’ll see it dramatically confirmed if
you fly to Europe and discover yourself over Greenland in the middle of the journey.
Remark: However, even the special calculation described above can be eliminated if we
notice that the sphere also has isometries which reflect across the equator: (x, y, z) →
(x, y, −z). Call this isometry M . Let γ(t) be a geodesic which begins at the equator, so
γ(0) = p is on the equator, and initially moves tangent to the equator, so γ′(0) = X is
tangent to the equator. Then M ◦ γ is again a geodesic. Clearly this geodesic starts at p
with initial velocity X. By Trick 1, it equals γ. But M ◦ γ(t) can equal γ(t) only if γ(t)
has zero z-component. So γ is a great circle along the equator. QED.
Remark: It is very important to notice that the curve which follows the equator three
fourths of the way around the sphere is a geodesic. It is certainly not the shortest curve
connecting its endpoints, because it is shorter to go one fourth of the equator the other
way. We proved that shortest curves are geodesics, but did not prove the converse.
It turns out that geodesics locally minimize distance. Given t, there is an ε > 0 such that
γ is the shortest curve from γ(t) to all γ(u) satisfying |t − u| < ε.
Consider the surface formed by rotating the graph of a function y = f(x) about the x-axis.
The resulting surface can be parameterized by
\[
s(x, \theta) = (x, f(x) \cos\theta, f(x) \sin\theta)
\]
Although we will not be able to solve the geodesic equation completely, we get a remarkably
complete and beautiful description of the geodesics on a surface of revolution by using a
combination of symbolic calculation and geometric interpretation.
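The tangent vectors are
\[
\frac{\partial s}{\partial x} = (1, f'(x) \cos\theta, f'(x) \sin\theta)
\]
\[
\frac{\partial s}{\partial \theta} = (0, -f(x) \sin\theta, f(x) \cos\theta)
\]
so
\[
g_{11} = 1 + f'(x)^2, \quad g_{12} = 0, \quad g_{22} = (f(x))^2
\]
The Christoffel symbols are
\[
\Gamma^1_{11} = \frac{f' f''}{1 + (f')^2}, \quad \Gamma^1_{12} = 0, \quad \Gamma^1_{22} = -\frac{f f'}{1 + (f')^2}
\]
\[
\Gamma^2_{11} = 0, \quad \Gamma^2_{12} = \frac{f'}{f}, \quad \Gamma^2_{22} = 0
\]
and the geodesic equations are
\[
\frac{d^2 x}{dt^2} + \frac{f' f''}{1 + (f')^2} \left( \frac{dx}{dt} \right)^2 - \frac{f f'}{1 + (f')^2} \left( \frac{d\theta}{dt} \right)^2 = 0
\]
\[
\frac{d^2 \theta}{dt^2} + \frac{2 f'}{f} \frac{dx}{dt} \frac{d\theta}{dt} = 0
\]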
Example 1: Suppose x is constant, so only θ varies. We will call such a curve a meridian.
The equations reduce to
\[
-\frac{f f'}{1 + (f')^2} \left( \frac{d\theta}{dt} \right)^2 = 0
\]
\[
\frac{d^2 \theta}{dt^2} = 0
\]
The solution of the second equation is θ(t) = ct + d, so the geodesic goes around the surface
of revolution at constant angular velocity. If the geodesic is not constant, the first equation
implies that f′(x) = 0. So meridians are geodesics if and only if x is a local min, local max,
or other critical point of f.
It may seem strange that meridians for a local maximum of f are geodesics. But recall
that geodesics minimize length only locally. The point is that we should move around the
rim to get to nearby points as fast as possible.
Example 2: Suppose that θ is constant. I like to call such curves latitudes. The geodesic
equations reduce to
\[
\frac{d^2 x}{dt^2} + \frac{f' f''}{1 + (f')^2} \left( \frac{dx}{dt} \right)^2 = 0
\]
2.12. GEODESICS ON A SURFACE OF REVOLUTION 53
At first this seems difficult to solve. However, geodesics have constant speed. Our particular
geodesic is (x(t), f(x(t)), 0), possibly rotated around the surface by a fixed θ. So its
tangent vector is $\left( \frac{dx}{dt}, f'(x) \frac{dx}{dt}, 0 \right)$ and the square of the length of this vector is
\[
\left( 1 + (f')^2 \right) \left( \frac{dx}{dt} \right)^2
\]
We have proved
Theorem 18 Meridians are geodesics exactly at critical points of f (x). All latitudes are
geodesics.
We must now find the remaining geodesics, which wind around the surface. We begin by
solving the second of the two geodesic equations:
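\[
\frac{d^2 \theta}{dt^2} + \frac{2 f'}{f} \frac{dx}{dt} \frac{d\theta}{dt} = 0
\]
Multiply this equation by $f^2$ and replace $\frac{df}{dx} \frac{dx}{dt}$ by $\frac{d}{dt} f(x(t))$. The equation becomes
\[
f^2 \frac{d^2 \theta}{dt^2} + 2 f \left( \frac{d}{dt} f(x(t)) \right) \frac{d\theta}{dt} = 0.
\]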
Equivalently, then
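\[
\frac{d}{dt} \left( f^2 \frac{d\theta}{dt} \right) = 0.
\]
So the second equation just asserts that $f^2 \frac{d\theta}{dt}$ is constant! Call this constant $c_1$. Then
\[
\frac{d\theta}{dt} = \frac{c_1}{f(x)^2} \quad \text{or} \quad f(x) \frac{d\theta}{dt} = \frac{c_1}{f(x)}
\]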
Remark: Notice that $\frac{d\theta}{dt}$ is always positive or always negative or always zero. So once a
curve starts winding around the surface of revolution clockwise, it always winds in that
direction. It cannot stop winding, or start winding in the other direction.
The number f (x) is the radius of the surface of revolution at x. If we watch our curve
over a small interval of time dt, it moves along this radius a distance f (x)dθ. It also
moves perpendicularly to the radius along the curve y = f (x) (or rather, along this curve
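rotated by θ). In the time dt, x will change by dx. But the curve y = f(x) has slope
f′, so the distance our geodesic travels perpendicular to the radius is not dx but rather
$\sqrt{1 + (f')^2} \, dx$. By the Pythagorean theorem, the speed of the geodesic is therefore
\[
\sqrt{ \left( f \frac{d\theta}{dt} \right)^2 + \left( 1 + (f')^2 \right) \left( \frac{dx}{dt} \right)^2 }
\]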
Since the curve has constant speed, the expression inside the square root must be a constant
independent of t. Call this constant $c_2^2$. Our conclusions up to this point can be summarized
in two equations:
\[
f(x) \frac{d\theta}{dt} = \frac{c_1}{f(x)}
\]
\[
\left( f \frac{d\theta}{dt} \right)^2 + \left( 1 + (f')^2 \right) \left( \frac{dx}{dt} \right)^2 = c_2^2.
\]
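These two conservation laws are easy to test numerically. The sketch below is an illustration that is not in the original notes: it integrates the geodesic equations for a surface of revolution and records $f^2 \, d\theta/dt$ and the squared speed along the way. The profile f(x) = 2 + cos x, the initial conditions, and all function names are assumptions made only for this example.

```python
import math

def f(x):
    return 2.0 + math.cos(x)   # assumed profile curve, always positive

def df(x):
    return -math.sin(x)

def d2f(x):
    return -math.cos(x)

def rhs(state):
    """Geodesic equations on the surface of revolution as a first-order system."""
    x, th, dx, dth = state
    w = 1.0 + df(x) ** 2
    ddx = -(df(x) * d2f(x) / w) * dx * dx + (f(x) * df(x) / w) * dth * dth
    ddth = -2.0 * df(x) / f(x) * dx * dth
    return (dx, dth, ddx, ddth)

def rk4_step(g, y, h):
    """One classical Runge-Kutta step of size h for y' = g(y)."""
    k1 = g(y)
    k2 = g(tuple(a + 0.5 * h * b for a, b in zip(y, k1)))
    k3 = g(tuple(a + 0.5 * h * b for a, b in zip(y, k2)))
    k4 = g(tuple(a + h * b for a, b in zip(y, k3)))
    return tuple(a + h / 6.0 * (p + 2 * q + 2 * s + w)
                 for a, p, q, s, w in zip(y, k1, k2, k3, k4))

def conserved(state):
    """Return (f^2 theta', squared speed), the candidates for c1 and c2^2."""
    x, th, dx, dth = state
    c1 = f(x) ** 2 * dth
    speed2 = (f(x) * dth) ** 2 + (1.0 + df(x) ** 2) * dx * dx
    return c1, speed2

def run(y0=(0.5, 0.0, 0.3, 0.4), t_end=5.0, steps=5000):
    """Integrate from y0 and return the conserved quantities at every step."""
    y, h = y0, t_end / steps
    history = [conserved(y)]
    for _ in range(steps):
        y = rk4_step(rhs, y, h)
        history.append(conserved(y))
    return history

history = run()
print(history[0], history[-1])  # the two pairs should agree closely
```

Both recorded quantities should stay constant up to integration error, in agreement with the two summary equations.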
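Remark: We now claim that this second equation is equivalent to the remaining unexamined
geodesic equation. To see this, multiply that unexamined geodesic equation by
$2 \left( 1 + (f')^2 \right) \frac{dx}{dt}$ to get
\[
2 \left( 1 + (f')^2 \right) \frac{dx}{dt} \left\{ \frac{d^2 x}{dt^2} + \frac{f' f''}{1 + (f')^2} \left( \frac{dx}{dt} \right)^2 \right\} - 2 f f' \frac{dx}{dt} \left( \frac{d\theta}{dt} \right)^2 = 0.
\]
We know from example 2 that the first of these two terms is the derivative with respect
to t of $\left( 1 + (f')^2 \right) \left( \frac{dx}{dt} \right)^2$. Replace $\frac{d\theta}{dt}$ by $\frac{c_1}{f^2}$ in the second term to obtain
\[
-2 f \frac{df}{dt} \left( \frac{c_1}{f^2} \right)^2 = \frac{d}{dt} \left( \frac{c_1}{f} \right)^2.
\]
So the unexamined geodesic equation states that the derivative of
\[
\left( \frac{c_1}{f} \right)^2 + \left( 1 + (f')^2 \right) \left( \frac{dx}{dt} \right)^2
\]
is zero. Since $\frac{c_1}{f} = f \frac{d\theta}{dt}$, this says exactly that the left side of the second equation is constant. The geodesics are therefore described by
\[
f(x) \frac{d\theta}{dt} = \frac{c_1}{f(x)}
\]
\[
\left( f \frac{d\theta}{dt} \right)^2 + \left( 1 + (f')^2 \right) \left( \frac{dx}{dt} \right)^2 = c_2^2.
\]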
According to the second equation, the motion of the geodesic is split into a component in
the direction of the meridian circle through γ(t) and a component in the direction of the
latitude curve through γ(t). As the radius decreases, the meridian component increases and
the latitude component decreases, so the geodesic becomes vertical, winding tightly around
the surface. As the radius increases, the meridian component decreases and the latitude
component increases, so the geodesic becomes horizontal, winding very slowly around the
surface.
On the other hand, suppose the meridian of critical radius is a geodesic. If our geodesic
approaching this critical radius were to touch the meridian, it would be vertical there.
By uniqueness of boundary conditions, our geodesic would have to equal the meridian.
What happens instead is that our geodesic winds more and more vertically, approaching
the meridian infinitely closely without ever touching or bouncing back.
Let us suppose that our surface of revolution extends infinitely far along the x-axis. Our
analysis gives a complete picture of geodesics on the surface. These geodesics travel almost
horizontally when the radius is large, and then pinch together as the radius decreases. If
the radius does not decrease too much, the geodesic escapes through the narrow spot and
continues on. If it finds a spot which is too narrow, it bounces back and retreats toward
the left. And if it finds a spot which is just right, it approaches the spot infinitely closely.
Some geodesics bounce back and forth forever. Others continue on, almost managing to
squeeze through narrow spots.
Finally, we will sketch the theory of geodesics on the Poincare disk described at the end of
section 2.6. Since this geometry is exactly Lobachevsky’s non-Euclidean geometry, it plays
an important role in mathematics. We will not give complete details, but an interested
reader can easily complete the theory from our sketch.
Recall that
\[
ds^2 = \frac{4 \left( dx^2 + dy^2 \right)}{\left( 1 - (x^2 + y^2) \right)^2}
\]
Notice first that reflection across a line through the origin preserves distance. Consequently
the tricks deduced earlier allow us to conclude that geodesics through the origin remain on
a radial line.
By rotational symmetry, it suffices to understand the curve γ(t) = (x(t), 0). The same
analysis will then apply to r(t) for any radial curve.
Since geodesics travel at constant speed, the length of the derivative of our curve must
remain constant. The resulting derivative is γ′(t) = (x′(t), 0), and its Poincare length
is
\[
\frac{2}{1 - x^2} \frac{dx}{dt}.
\]
This must be a constant c, so $\frac{2 \, dx}{1 - x^2} = c \, dt$ and we discover by integration that $\ln \frac{1 + x}{1 - x} = ct + d$. By starting time at a different moment we can eliminate d. Exponentiating both
sides gives $\frac{1 + x}{1 - x} = e^{ct}$ and so
\[
x(t) = \frac{e^{ct} - 1}{e^{ct} + 1}.
\]
Next consider a map of the form
\[
f(z) = \frac{az + b}{cz + d}
\]
which maps complex numbers to complex numbers; here a, b, c, and d are fixed complex
numbers. Of course this map is not defined when cz + d = 0.
Lemma 2 This map preserves angles in the sense that if α(t) and β(t) are two curves in
the plane which meet at an angle α, then f (α(t)) and f (β(t)) meet at the same angle.
Proof: This lemma can be proved by brute force and also follows from general principles
of complex variable theory.
Lemma 3 The map f preserves straight lines and circles in the sense that the image of a
straight line under f is either a straight line or a circle, and the image of a circle is either
a straight line or a circle.
Proof: Again this can be proved by brute force.
We now claim that certain of these maps send the unit circle centered at the origin back
to itself, not necessarily fixing the circle pointwise. A brief calculation reveals that
Theorem 20 Suppose z0 is a fixed complex number satisfying |z0 | < 1, and θ is an arbi-
trary real number. The map f (z) below sends the unit circle back to itself. Consequently
this map sends the interior of the unit circle to itself. Indeed, this map is a one-to-one
and onto map from the Poincare disk to itself, and its inverse has the same form for some
other choice of z0 and θ.
\[
f(z) = e^{i\theta} \frac{z - z_0}{1 - z \bar{z}_0}.
\]
Finally, we come to the most important point, which can again be proved by brute
force:
Theorem 21 The map
\[
f(z) = e^{i\theta} \frac{z - z_0}{1 - z \bar{z}_0}
\]
from the Poincare disk to itself preserves Poincare length and so is an isometry of the disk.
Turn back to Escher’s picture of the Poincare disk in section 2.6. These isometries are
clearly visible, mapping one angel to another and one devil to another.
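The brute force calculations behind Theorems 20 and 21 can at least be spot-checked numerically. The sketch below is an illustration that is not in the original notes. It uses the derivative $f'(z) = e^{i\theta} (1 - |z_0|^2) / (1 - z \bar{z}_0)^2$, obtained from the quotient rule, and the fact that the Poincare metric multiplies Euclidean lengths at z by $2 / (1 - |z|^2)$. The function names and sample values are assumptions chosen for the example.

```python
import cmath

def mobius(z, z0, theta):
    """The map f(z) = e^{i theta} (z - z0) / (1 - z conj(z0))."""
    return cmath.exp(1j * theta) * (z - z0) / (1 - z * z0.conjugate())

def mobius_derivative(z, z0, theta):
    """Complex derivative of the map above, from the quotient rule."""
    return (cmath.exp(1j * theta) * (1 - abs(z0) ** 2)
            / (1 - z * z0.conjugate()) ** 2)

def poincare_scale(z):
    """Factor by which the Poincare metric multiplies Euclidean length at z."""
    return 2.0 / (1.0 - abs(z) ** 2)

z0, theta = 0.3 + 0.4j, 0.7

# Points on the unit circle should map to points on the unit circle.
boundary_moduli = [abs(mobius(cmath.exp(1j * a), z0, theta))
                   for a in (0.0, 1.0, 2.5, 4.0)]

# A tangent vector at z is stretched by |f'(z)| in Euclidean length, so
# Poincare length is preserved when the two numbers below agree.
z = 0.1 - 0.5j
scale_before = poincare_scale(z)
scale_after = poincare_scale(mobius(z, z0, theta)) * abs(mobius_derivative(z, z0, theta))
print(boundary_moduli, scale_before, scale_after)
```

A check at a few sample points is of course not a proof; it only guards against algebra slips when carrying out the brute force argument by hand.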
Remark: We now have enough information to determine the geodesics of the Poincare
disk completely. On the next page, these geodesics can be seen in Escher’s working notes
used to construct the angel and devil picture.
Theorem 22 The geodesics on the Poincare disk are exactly radial lines through the origin
or circles which meet the boundary at ninety degrees.
Proof: Suppose a geodesic γ(t) goes through the point z0 . Apply the isometry f given
above. This isometry maps γ to a geodesic through the origin, and consequently to a
radial line. This radial line meets the boundary of the disk at ninety degrees. Now apply
the inverse of f , which is another such map. This map sends lines and circles to lines
and circles, so γ itself must be a straight line or a circle. Moreover, the inverse isometry
preserves Euclidean angles, and sends the boundary of the disk back to itself. So γ (or
rather, its extension to a full line or full circle) must hit the boundary of the disk at ninety
degrees. If γ is a line, then it must go through the origin because these are the only lines
which hit the boundary at ninety degrees. So γ is a line through the origin or else a circle
hitting the boundary at ninety degrees.
Conversely, all such circles are geodesics. Indeed, let γ be a circle through z0 hitting the
boundary at ninety degrees. There must be a geodesic through z0 starting in the same
direction as γ. This geodesic is a radial line or a circle hitting the boundary at ninety
degrees. But a little thought shows that there is only one such curve, so γ itself is a
geodesic.
Chapter 3
The previous chapter was about intrinsic surface theory — the part of the theory that
could be understood by a two-dimensional worker living on the surface. We used the
surface parameterization s(u, v) to calculate gij , but our worker would instead measure the
gij directly using a small ruler. All of the remaining mathematics depended only on the
gij without further reference to the surface.
We are now going to study extrinsic surface theory — the part of the theory which requires
looking at the surface from the third dimension. Our main goal is to calculate the curvature
of the surface. Although we’ll continue to calculate in local coordinates, the quantities we
study will depend directly on s(u, v); indeed we know from an example in the preface that
curvature cannot be determined intrinsically by a two-dimensional worker.
In that preface, we sketched a method to determine the curvature of S at a point p.
Rotate the surface until its tangent plane at p is parallel to the xy-plane; find the equation
z = f (x, y) for this rotated surface near p, and expand f in a power series. In practice this
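method is awkward and we’ll use a different technique. Let $\vec{n}$ be a unit normal vector field
on the surface. Since $\frac{\partial s}{\partial u}$ and $\frac{\partial s}{\partial v}$ are tangent to the surface and linearly independent, their
cross product is perpendicular to S and we can take
\[
\vec{n} = \frac{ \frac{\partial s}{\partial u} \times \frac{\partial s}{\partial v} }{ \left\| \frac{\partial s}{\partial u} \times \frac{\partial s}{\partial v} \right\| }
\]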
62 CHAPTER 3. EXTRINSIC THEORY
If the surface lies in a plane, this normal vector is constant. Consequently, curvature is
related to changes in n; we can study these changes by differentiation. When X is a tangent
vector at p, let X(n) be the derivative of n in the direction X. (This derivative will be
defined rigorously in the next section.) Since n has length one, n · n = 1 and the product
rule gives X(n) · n + n · X(n) = 0 or X(n) · n = 0. Consequently, the derivative X(n) is
another vector tangent to the surface.
The picture of the saddle below shows how this works in practice. Pay attention to the
origin and move in the x direction. Notice that the change of the normal vectors is also in
the positive x-direction. If our saddle has standard form $z = y^2 - x^2$, we will later calculate
that the derivative of n in the direction e1 is 2e1 .
Repeat this argument, but this time move in the y direction, as in the left picture on
the following page. (To avoid interference, the normal vectors have been drawn shorter
than they should be.) Notice that the change of the normal vectors is now in the negative
y-direction. For a standard saddle, we will later calculate that the derivative of n in the
direction e2 is −2e2 .
3.2. VECTOR FIELDS IN R3 63
Finally, move in a diagonal direction X, as in the picture on the right. Notice that the
change of the normal vectors is no longer a multiple of X. We will later prove that the
derivative of n in the direction e1 + e2 is 2e1 − 2e2 .
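The three derivatives quoted for the standard saddle can be previewed numerically. The sketch below is an illustration that is not in the original notes. It assumes the parameterization s(u, v) = (u, v, v² − u²), for which the cross product of the two coordinate tangent vectors is (2u, −2v, 1), and it approximates the derivative of the unit normal at the origin by central differences. The function names are hypothetical.

```python
import math

def unit_normal(u, v):
    """Unit normal of the saddle s(u, v) = (u, v, v^2 - u^2)."""
    nx, ny, nz = 2.0 * u, -2.0 * v, 1.0   # s_u x s_v for this parameterization
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / length, ny / length, nz / length)

def normal_derivative(X, u=0.0, v=0.0, h=1e-5):
    """Central-difference derivative of n at (u, v) in the direction X = (X1, X2)."""
    a = unit_normal(u + h * X[0], v + h * X[1])
    b = unit_normal(u - h * X[0], v - h * X[1])
    return tuple((p - q) / (2.0 * h) for p, q in zip(a, b))

print(normal_derivative((1.0, 0.0)))  # approximately (2, 0, 0)
print(normal_derivative((0.0, 1.0)))  # approximately (0, -2, 0)
print(normal_derivative((1.0, 1.0)))  # approximately (2, -2, 0)
```

At the origin the tangent plane is the xy-plane, so the local coordinates (X1, X2) agree with the first two rectangular components and the results can be compared directly with 2e1, −2e2, and 2e1 − 2e2.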
These results are supposed to remind you of eigenvectors. The derivative of n in the
direction X is another tangent vector which we will denote B(X). This B is a linear trans-
formation from the tangent space at p to itself. In the example, e1 and e2 are eigenvectors
of B because B multiplies each of these vectors by a scalar. In general, the eigenvalues of
B (up to a sign) will be the principal curvatures described in the preface.
Recall that a tangent vector at a point p looks like X = (X1 , X2 ) in local coordinates. The
numbers $X_1$ and $X_2$ are not the coordinates of the vector in $R^3$; these are found from the
expression
\[
X = X_1 \frac{\partial \vec{s}}{\partial u} + X_2 \frac{\partial \vec{s}}{\partial v}
\]
We now want to discuss vector fields on a surface S which point in arbitrary directions in
R3 , not just in tangent directions. The normal vector field below is such a field. We will
use the letters X, Y, Z to denote tangent vector fields, and the letters U, V, W to denote
vector fields in arbitrary directions.
A typical vector field V assigns a vector V (p) = (V1 , V2 , V3 ) to each point p in the surface.
In local coordinates, p is given by a pair (u, v). Consequently we have
\[
V(u, v) = (V_1(u, v), V_2(u, v), V_3(u, v))
\]
Notice carefully that when we give a tangent vector (X1 , X2 ), the Xi are not the three-
dimensional coordinates of the vector. But when we give a vector (V1 , V2 , V3 ), the Vi are
just the three-dimensional coordinates of the vector.
Example 1: Consider the sphere, parameterized in the usual way via spherical coordinates
s(θ, φ) = (sin φ cos θ, sin φ sin θ, cos φ). Let V be the vector field which assigns to
each point (x, y, z) the vector (0, y, 0). This vector field is pictured on the next page. Since
y = sin φ sin θ, this vector field equals
\[
V = (0, \sin\varphi \sin\theta, 0)
\]
Example 2: Consider the surface $z = x^2 + y^2$. This surface can be parameterized via the
map $s(u, v) = (u, v, u^2 + v^2)$. Let V be the vector field which assigns to each point (x, y, z)
the vector z(x, y, 0). Then
\[
V = (u(u^2 + v^2), v(u^2 + v^2), 0)
\]
We are about to define the derivative of the vector field V in a tangent direction X. The
object V must be a vector field, and not just a vector at a single point p, since differentiation
requires that we compare values of V at nearby points. But X can be a single vector at
p, because it merely gives the direction we want to move. However, X must be a tangent
vector, because if we move in a direction that is not tangent to the surface, we’ll leave the
surface and V will no longer be defined.
We cannot define the derivative of V in the direction X using the usual definition
\[
\frac{d}{dt} V(p + tX)
\]
because the line p + tX also leaves the surface and so V is not defined at p + tX. But a
slight modification works. Let α(t) be a curve on the surface which goes through p at time
0 with direction α′(0) = X. Then we can compute
\[
\frac{d}{dt} V(\alpha(t))
\]
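Let us deduce a formula for this derivative. Let γ(t) = (u(t), v(t)) be a local coordinate
representation of α. Then V at α(t) is just
\[
(V_1(u(t), v(t)), V_2(u(t), v(t)), V_3(u(t), v(t)))
\]
and the desired derivative is
\[
\frac{dV(\alpha(t))}{dt} = \left( \frac{\partial V_1}{\partial u} \frac{du}{dt} + \frac{\partial V_1}{\partial v} \frac{dv}{dt}, \; \frac{\partial V_2}{\partial u} \frac{du}{dt} + \frac{\partial V_2}{\partial v} \frac{dv}{dt}, \; \frac{\partial V_3}{\partial u} \frac{du}{dt} + \frac{\partial V_3}{\partial v} \frac{dv}{dt} \right)
\]
But the local coordinate expression for X = α′(0) is $(X_1, X_2) = \left( \frac{du}{dt}, \frac{dv}{dt} \right)$, so the above
expression is
\[
\frac{dV(\alpha(t))}{dt} = \left( \frac{\partial V_1}{\partial u} X_1 + \frac{\partial V_1}{\partial v} X_2, \; \frac{\partial V_2}{\partial u} X_1 + \frac{\partial V_2}{\partial v} X_2, \; \frac{\partial V_3}{\partial u} X_1 + \frac{\partial V_3}{\partial v} X_2 \right)
\]
which is just
\[
\frac{\partial \vec{V}}{\partial u} X_1 + \frac{\partial \vec{V}}{\partial v} X_2.
\]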
Notice that this expression depends only on X and not on the particular curve α(t).
Consequently,
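Definition 16 Let $X = (X_1, X_2) = X_1 \frac{\partial}{\partial u} + X_2 \frac{\partial}{\partial v}$ be a tangent vector to a surface S and
let $V = (V_1, V_2, V_3)$ be a three-dimensional vector field. Then
\[
X(V) = X_1 \frac{\partial \vec{V}}{\partial u} + X_2 \frac{\partial \vec{V}}{\partial v}
\]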
Example: Let S be the surface $z = x^2 + y^2$ of the previous example and let V be the
vector field $z(x, y, 0) = (u(u^2 + v^2), v(u^2 + v^2), 0)$ pictured there. Suppose X = (2, 3).
Then
\[
X(V) = 2 \frac{\partial \vec{V}}{\partial u} + 3 \frac{\partial \vec{V}}{\partial v} = (6u^2 + 6uv + 2v^2, \; 3u^2 + 4uv + 9v^2, \; 0).
\]
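As a sanity check on this example, the derivative X(V) can be approximated by differencing V along the coordinate line through (u, v) in the direction X, which is what Definition 16 amounts to. The sketch below is an illustration that is not in the original notes; the function names are hypothetical and the sample point is arbitrary.

```python
def V(u, v):
    """The vector field z(x, y, 0) on the surface z = x^2 + y^2, in local coordinates."""
    return (u * (u * u + v * v), v * (u * u + v * v), 0.0)

def directional_derivative(field, X, u, v, h=1e-5):
    """Central-difference approximation of X1 dV/du + X2 dV/dv at (u, v)."""
    a = field(u + h * X[0], v + h * X[1])
    b = field(u - h * X[0], v - h * X[1])
    return tuple((p - q) / (2.0 * h) for p, q in zip(a, b))

def closed_form(u, v):
    """The answer computed by hand for X = (2, 3)."""
    return (6 * u * u + 6 * u * v + 2 * v * v,
            3 * u * u + 4 * u * v + 9 * v * v,
            0.0)

numeric = directional_derivative(V, (2.0, 3.0), 0.7, -1.2)
exact = closed_form(0.7, -1.2)
print(numeric, exact)  # the two triples should agree closely
```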
3.4. BASIC DIFFERENTIATION FACTS 67
The rest of our course depends on the straightforward differentiation formula introduced
in the previous section. In this section we collect together all of the facts about this
differentiation used in the future. These facts are so simple that they are boring. But wait
until you see how they are used.
Proof: Most of these results can be left to the reader. Look back at the fourth result.
Since V and W are three-dimensional vectors, the inner product in question is the standard
dot product in R3 rather than the sophisticated two-dimensional dot product defined by
the gij . So this result is just the ordinary product rule
The fifth result is our first use of the Lie bracket defined in an earlier section. Its most
important consequence is point six. If X and Y are tangent vectors, X(Y ) need no longer
be tangent to the surface. For example, look at the tangent field shown in the picture below
on the sphere, and differentiate it in the direction of the equator; the derivative vectors
point inward toward the center of the sphere.
But X(Y ) − Y (X) will be tangent to the surface, since the Lie bracket of tangent vector
fields is again a tangent field.
Let us prove the fifth result. This time Y is a vector field rather than just an isolated
vector. In local coordinates Y = (Y1 (u, v), Y2 (u, v)). But to differentiate, we must think of
this as the three-dimensional vector
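\[
Y = Y_1(u, v) \frac{\partial s}{\partial u} + Y_2(u, v) \frac{\partial s}{\partial v}
\]
Consequently, writing $u_1 = u$ and $u_2 = v$,
\[
X(Y) = X(Y_1) \frac{\partial s}{\partial u} + X(Y_2) \frac{\partial s}{\partial v} + \sum_{ij} X_i Y_j \frac{\partial^2 s}{\partial u_i \partial u_j}
\]
Similarly
\[
Y(X) = Y(X_1) \frac{\partial s}{\partial u} + Y(X_2) \frac{\partial s}{\partial v} + \sum_{ij} Y_i X_j \frac{\partial^2 s}{\partial u_i \partial u_j}
\]
Notice that when we subtract the second expression from the first, the terms involving
second partials of s cancel, and we obtain
\[
X(Y) - Y(X) = \left( X(Y_1) - Y(X_1) \right) \frac{\partial s}{\partial u} + \left( X(Y_2) - Y(X_2) \right) \frac{\partial s}{\partial v}
\]
This is a three dimensional tangent vector whose expression in local coordinates is
\[
\left( X(Y_1) - Y(X_1), \; X(Y_2) - Y(X_2) \right)
\]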
A brief final calculation shows that this is exactly the expression for [X, Y ] obtained at the
end of section 2.5. QED
3.5. THE NORMAL FIELD 69
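We apply this theory to the normal vector field $\vec{n}$ consisting of unit vectors perpendicular
to the surface. Since $\frac{\partial s}{\partial u}$ and $\frac{\partial s}{\partial v}$ are linearly independent tangent vectors, we have
Definition 17 Let S be a surface parameterized by s(u, v). The normal vector field is the
three-dimensional field
\[
\vec{n} = \frac{ \frac{\partial s}{\partial u} \times \frac{\partial s}{\partial v} }{ \left\| \frac{\partial s}{\partial u} \times \frac{\partial s}{\partial v} \right\| }
\]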
As a matter of fact, there are two choices for this n, differing in sign. By definition, an
orientation of a surface is a choice of one or the other unit normal. Often this orientation
is given geometrically. For instance, closed objects like spheres and doughnuts are usually
oriented using outward pointing normals. Graphs like $z = u^2 + v^2$ are usually oriented
using upward pointing normals.
If an orientation is given geometrically, it is necessary to check that the previous formula
gives the correct choice. When it gives the incorrect orientation, switch u and v to obtain
the correct one, or just change the sign of n.
Theorem 24 If X is a tangent vector, then X(~n) is again tangent to the surface.
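Proof: The normal field has unit vectors, so ⟨n, n⟩ = 1. Since the derivative of a constant
is zero, rule four from section 3.3 gives
\[
\langle X(n), n \rangle + \langle n, X(n) \rangle = 0
\]
so ⟨X(n), n⟩ = 0 and X(n) is perpendicular to n. QED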
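Example: Consider the surface parameterized by $s(r, \theta) = (r \cos\theta, r \sin\theta, r^2)$. Then
\[
\frac{\partial s}{\partial r} \times \frac{\partial s}{\partial \theta} = (\cos\theta, \sin\theta, 2r) \times (-r \sin\theta, r \cos\theta, 0) = \left( -2r^2 \cos\theta, -2r^2 \sin\theta, r \right)
\]
Since the length of this vector is $r \sqrt{1 + 4r^2}$,
\[
\vec{n} = \frac{1}{r \sqrt{1 + 4r^2}} \left( -2r^2 \cos\theta, -2r^2 \sin\theta, r \right)
\]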
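The derivative of $\vec{n}$ in the direction $\frac{\partial}{\partial \theta}$ is
\[
\frac{\partial \vec{n}}{\partial \theta} = \frac{1}{\sqrt{1 + 4r^2}} \left( 2r \sin\theta, -2r \cos\theta, 0 \right)
\]
According to the previous theorem, this vector is a tangent vector and thus a linear combination of
\[
\frac{\partial s}{\partial r} = (\cos\theta, \sin\theta, 2r)
\]
and
\[
\frac{\partial s}{\partial \theta} = (-r \sin\theta, r \cos\theta, 0)
\]
In fact, it is $\frac{-2}{\sqrt{1 + 4r^2}}$ times the second vector. We usually write tangent vectors in local
coordinates as $(X_1, X_2)$. In this language, the derivative just computed equals
\[
\left( 0, \frac{-2}{\sqrt{1 + 4r^2}} \right)
\]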
Then the first term in this expression is tangent to the surface and the second term is
normal to the surface.
Proof: The second term is a multiple of n, so it is certainly normal. To show that the
first term is tangent to the surface, it suffices to show that the dot product of this term
and n is zero. But a glance shows that this is true. QED
Apology: In these notes, I usually omit writing small vectors over V , n, and other ex-
pressions. Occasionally I write small vectors to emphasize a point. Thus in the previous
theorem, n sometimes has a vector and sometimes does not. It is always the same n.
Remark: It follows from the previous theorem that every three-dimensional vector field
V on the surface can be written in the form
\[
\vec{V} = \vec{Y} + f \vec{n}
\]
where Y is a tangent vector field and f is a function on the surface. By the product rule,
\[
X(V) = X(Y) + X(f) \, \vec{n} + f \, X(\vec{n})
\]
In this expression, X(f) is the directional derivative of the function f in the direction X,
which we already understand from the discussion in section 2.3. Consequently, we can
understand the derivatives of arbitrary three-dimensional vector fields by studying two
special cases:
• X(Y), where Y is a tangent vector field
• X(n)
If X and Y are tangent to the surface, the derivative X(Y ) may no longer be tangent.
See the picture in section 3.4. But the vector X(Y ) can be decomposed into a tangential
and a normal component. This tangential component is denoted ∇X Y and the normal
component, which is a multiple of n, is denoted b(X, Y )n.
We have already proved that X(n) is tangent to the surface. The resulting tangent vector
is denoted B(X).
We have proved the following decomposition result, which is fundamental for everything
which follows:
Theorem 26 If X is a tangent vector and Y is a tangent vector field, we have the following
decomposition of derivatives into tangential and normal components:
\[
X(Y) = \nabla_X Y + b(X, Y) \, \vec{n}
\]
\[
X(\vec{n}) = B(X)
\]
Remark: This chapter is about b(X, Y ) and B(X). We will discover that these objects
contain exactly the curvature information about the surface, nothing less and nothing
more.
The next chapter is about ∇X Y. According to the current definition, this object can only be
calculated by a three-dimensional worker able to differentiate and then separate the result-
ing vector into a tangential and normal component. However, we will discover that a two-
dimensional worker could compute ∇X Y , using the Christoffel symbols and all that.
Finally, Gauss’ theorema egregium will arise because the quantities ∇X Y and b(X, Y ) are
not completely independent.
The object b(X, Y ) is called the second fundamental form on the surface. And yes, the
first fundamental form is the metric tensor gij . More about all that in a moment. Notice
carefully that b(X, Y ) assigns a number to each pair of tangent vectors X and Y , while
B(X) assigns a tangent vector to each tangent vector. In the next section, we will prove that
knowing one of these objects automatically gives the other. We will also prove that
Here we have used the fact that n and $\nabla_X Y$ are perpendicular and n has length one. So
$b(X, Y) = -\langle B(X), Y \rangle$. QED.
I’d like to remind you of a little linear algebra. Suppose A = (aij ) is a matrix. A vector v
is an eigenvector of A if A just stretches v without rotation, so Av = λv. In that case v is
We are about to use this theorem when the dimension of the vector space is two. For
completeness, we’ll recall how to calculate eigenvalues and eigenvectors in this case, and
prove the theorem.
3.9. THE PRINCIPAL AXIS THEOREM 75
Suppose $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is a matrix. An eigenvector is a nonzero vector $v = (v_1, v_2)$
such that v is merely stretched by A, so Av = λv for some real number λ. This λ is the
corresponding eigenvalue of A.
If Av = λv for a nonzero v, then λI − A takes the nonzero vector v to zero and conversely.
It is easy to check that a 2 × 2 matrix takes some nonzero vector to zero exactly when its
determinant is zero. So the eigenvalues of A are precisely the solutions of
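\[
P(\lambda) = \det(\lambda I - A) = \det \begin{pmatrix} \lambda - a & -b \\ -c & \lambda - d \end{pmatrix} = 0.
\]
For example, the eigenvalues of the matrix $A = \begin{pmatrix} 5 & -3 \\ -3 & 5 \end{pmatrix}$ discussed earlier are the
roots of
\[
P(\lambda) = \det \begin{pmatrix} \lambda - 5 & 3 \\ 3 & \lambda - 5 \end{pmatrix} = \lambda^2 - 10\lambda + 16 = (\lambda - 2)(\lambda - 8)
\]
and so λ = 2, 8.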
Once we know λ, the corresponding eigenvectors are solutions of Av = λv, or equivalently
(λI − A)v = 0. When written out, this will yield two equations, but the equations will
be redundant. Thus the solutions form a line through the origin. Any nonzero vector on
this line is an eigenvector. For example, in the previous example suppose λ = 2. Then
(λI − A)v = 0 becomes
\[
\begin{pmatrix} 2 - 5 & 3 \\ 3 & 2 - 5 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\]
which gives the two redundant equations $-3v_1 + 3v_2 = 0$ and $3v_1 - 3v_2 = 0$. The solutions are
vectors with $v_1 = v_2$. One such solution is v = (1, 1), as described earlier.
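The recipe just described translates directly into a few lines of code. The sketch below is an illustration that is not in the original notes: it finds the real eigenvalues of a 2 × 2 matrix from the characteristic polynomial and then reads an eigenvector off a nonzero row of λI − A. The function names are hypothetical.

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Real roots of det(lambda I - A) = lambda^2 - (a + d) lambda + (ad - bc)."""
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4.0 * det
    if disc < 0:
        raise ValueError("no real eigenvalues")
    root = math.sqrt(disc)
    return ((trace - root) / 2.0, (trace + root) / 2.0)

def eigenvector_2x2(a, b, c, d, lam):
    """A nonzero solution v of (lambda I - A) v = 0, for an eigenvalue lam."""
    # First row: (lam - a) v1 - b v2 = 0 is solved by v = (b, lam - a).
    if b != 0 or lam != a:
        return (b, lam - a)
    # Second row: -c v1 + (lam - d) v2 = 0 is solved by v = (lam - d, c).
    if c != 0 or lam != d:
        return (lam - d, c)
    return (1.0, 0.0)  # A = lam I, so every nonzero vector is an eigenvector

lams = eigenvalues_2x2(5, -3, -3, 5)
vec = eigenvector_2x2(5, -3, -3, 5, lams[0])
print(lams, vec)  # vec is a multiple of (1, 1)
```

Because the two rows of λI − A are redundant when λ is an eigenvalue, any nonzero row determines the whole line of eigenvectors, which is why the code only needs to look at one row.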
Finally, we prove the principal axis theorem in the two-dimensional case. The first step is
to show that B has at least one eigenvector. Choose an orthonormal basis $f_1, f_2$
arbitrarily.
Write $B(f_1) = a f_1 + c f_2$ and $B(f_2) = b f_1 + d f_2$ so that the matrix of B will be $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$.
Since the $f_i$ are orthonormal, $\langle B f_1, f_2 \rangle = c$ and $\langle f_1, B f_2 \rangle = b$. By hypothesis these are
equal, so b = c.
The eigenvalue equation is thus
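\[
P(\lambda) = \det \begin{pmatrix} \lambda - a & -b \\ -b & \lambda - d \end{pmatrix} = \lambda^2 - (a + d)\lambda + (ad - b^2) = 0.
\]
By the quadratic formula,
\[
\lambda = \frac{(a + d) \pm \sqrt{(a - d)^2 + 4b^2}}{2}.
\]
Since the expression under the square root sign is nonnegative, this equation has real roots.
So at least one eigenvector exists.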
Let e1 be an eigenvector. We may multiply e1 by a constant, so assume it has length one.
Choose a perpendicular vector $e_2$ of length one. Then by hypothesis $\langle e_1, B e_2 \rangle = \langle B e_1, e_2 \rangle = \langle \kappa_1 e_1, e_2 \rangle = 0$. Hence $B e_2$ is perpendicular to $e_1$ and therefore must be a multiple of $e_2$. So
$e_2$ is also an eigenvector of B. QED.
This B is a symmetric map from the two-dimensional tangent space to itself. By the
principal axis theorem, it has an orthonormal basis of eigenvectors.
Definition 18 Let S be a surface with a fixed orientation, p ∈ S.
1. The eigenvalues of −B are denoted κ1 and κ2 and called the principal curvatures of
the surface at p
2. The corresponding eigenvectors are called the principal directions at p. These direc-
tions are defined unless κ1 = κ2 . When κ1 = κ2 , all vectors are eigenvectors and the
principal directions are not well-defined.
3. The product κ = κ1 κ2 is called the Gaussian curvature of the surface at p
4. The sum m = κ1 + κ2 is called the mean curvature of the surface at p.
Remark: We choose the eigenvalues of −B rather than B to follow the historical con-
vention. In the rest of this chapter, we will explain how to compute b, B, κ1 , and κ2 . In
particular, we will justify these definitions by showing that they agree with the numbers
defined in the preface.
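We are going to find a formula for the second fundamental form b(X, Y). Write X in
coordinates as
\[ X = (X_1, X_2) = X_1 \frac{\partial}{\partial u} + X_2 \frac{\partial}{\partial v} = \sum_i X_i \frac{\partial}{\partial u_i}. \]
Similarly $Y = \sum_j Y_j \frac{\partial}{\partial u_j}$. Since b is bilinear,
\[ b(X, Y) = \sum_{ij} b\left( \frac{\partial}{\partial u_i}, \frac{\partial}{\partial u_j} \right) X_i Y_j = \sum_{ij} b_{ij} X_i Y_j \]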
3.11. A FORMULA FOR B 77
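where $b_{ij} = b\left( \frac{\partial}{\partial u_i}, \frac{\partial}{\partial u_j} \right)$. In language which used to be popular years ago, the $b_{ij}$ form a
tensor of rank two. When this formula is written out in detail for X = Y, we get an expression similar
to our earlier formula for $ds^2$ in terms of the $g_{ij}$, namely
\[ b(X, X) = b_{11} X_1^2 + 2 b_{12} X_1 X_2 + b_{22} X_2^2. \]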
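It remains to compute $b_{ij} = b\left( \frac{\partial}{\partial u_i}, \frac{\partial}{\partial u_j} \right)$. Recall that b(X, Y) is the normal component of
the vector X(Y). In our case, $Y = \frac{\partial}{\partial u_j}$, which corresponds to the three-dimensional vector
$\frac{\partial s}{\partial u_j}$. The derivative of this vector with respect to $X = \frac{\partial}{\partial u_i}$ is then $\frac{\partial^2 s}{\partial u_i \partial u_j}$. This vector need
not be normal, but its normal component is
\[ b_{ij} = \frac{\partial^2 s}{\partial u_i \partial u_j} \cdot n \]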
We have proved
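Theorem 30 The second fundamental form b(X, Y) is given by
\[ b(X, Y) = b\left( \sum_i X_i \frac{\partial}{\partial u_i}, \sum_j Y_j \frac{\partial}{\partial u_j} \right) = \sum_{ij} b_{ij} X_i Y_j \]
\[ b_{ij} = \frac{\partial^2 s}{\partial u_i \partial u_j} \cdot n \]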
Remark: We can write this result more concretely if our surface has the form z = f(x, y)
so that s(x, y) = (x, y, f(x, y)). Then it is useful to write $\frac{\partial f}{\partial x} = f_x$, etc., and we have
\[ \frac{\partial s}{\partial x} \times \frac{\partial s}{\partial y} = (1, 0, f_x) \times (0, 1, f_y) = (-f_x, -f_y, 1) \]
Moreover, $\frac{\partial^2 s}{\partial x^2} = (0, 0, f_{xx})$, with similar results for other second partials. The dot product
$\frac{\partial^2 s}{\partial x^2} \cdot n$ is then $f_{xx}$ divided by the length of $(-f_x, -f_y, 1)$, and we ultimately obtain
Theorem 31 If a surface is given by the equation z = f(x, y), then $b_{ij}$ is the matrix
\[ b = \frac{1}{\sqrt{1 + f_x^2 + f_y^2}} \begin{pmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{pmatrix} \]
78 CHAPTER 3. EXTRINSIC THEORY
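The map B is a linear transformation from the set of tangent vectors to itself. In coordinates
\[ B(X) = B(X_1, X_2) = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \]
where $B_{ij}$ are the entries of the matrix B. We easily deduce that
\[ B\left( \frac{\partial}{\partial u_i} \right) = \sum_j B_{ji} \frac{\partial}{\partial u_j}. \]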
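Now apply the formula $\langle B(X), Y \rangle = -b(X, Y)$ to $X = \frac{\partial}{\partial u_i}$ and $Y = \frac{\partial}{\partial u_k}$. We obtain
\[ \left\langle B\left( \frac{\partial}{\partial u_i} \right), \frac{\partial}{\partial u_k} \right\rangle = -b\left( \frac{\partial}{\partial u_i}, \frac{\partial}{\partial u_k} \right) = -b_{ik} \]
or
\[ \left\langle \sum_j B_{ji} \frac{\partial}{\partial u_j}, \frac{\partial}{\partial u_k} \right\rangle = -b_{ik}. \]
The expression on the left is $\sum_j B_{ji} g_{jk}$, but it is better to recall that $g_{ij} = g_{ji}$ and $b_{ik} = b_{ki}$
and write our equation in the form
\[ \sum_j g_{kj} B_{ji} = -b_{ki}. \]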
We have proved
Theorem 32 The matrix B satisfies the equation
g B = −b
and consequently can be computed using the formula
B = −(g −1 ) b
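Remark: We obtain a more concrete formula when z = f(x, y). Then s(x, y) =
(x, y, f(x, y)) and so $\frac{\partial s}{\partial x} = (1, 0, f_x)$ and $\frac{\partial s}{\partial y} = (0, 1, f_y)$. Therefore $g_{11} = 1 + f_x^2$, $g_{12} = f_x f_y$,
and $g_{22} = 1 + f_y^2$. The determinant of the matrix g is then $1 + f_x^2 + f_y^2$. Inverting g and combining $B = -g^{-1} b$ with the formula for b when z = f(x, y) gives
\[ B = -\frac{1}{(1 + f_x^2 + f_y^2)^{3/2}} \begin{pmatrix} 1 + f_y^2 & -f_x f_y \\ -f_x f_y & 1 + f_x^2 \end{pmatrix} \begin{pmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{pmatrix} \]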
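As a rough numerical companion to the formula $B = -g^{-1} b$ for a graph z = f(x, y), the sketch below multiplies out the two matrices at a single point. The function name `graph_shape_operator` and its argument order are our own assumptions; the caller supplies the first and second partial derivatives of f at the point of interest.

```python
def graph_shape_operator(fx, fy, fxx, fxy, fyy):
    """Matrix B = -g^{-1} b for the graph z = f(x, y) at one point (illustrative)."""
    w = 1.0 + fx * fx + fy * fy            # determinant of g
    scale = -1.0 / w ** 1.5                # -1 / (1 + fx^2 + fy^2)^(3/2)
    # det(g) * g^{-1}, i.e. the adjugate of g
    adj = ((1 + fy * fy, -fx * fy),
           (-fx * fy, 1 + fx * fx))
    hess = ((fxx, fxy),
            (fxy, fyy))
    # 2x2 matrix product adj * hess, scaled
    return tuple(
        tuple(scale * sum(adj[i][k] * hess[k][j] for k in range(2))
              for j in range(2))
        for i in range(2)
    )

# A point with horizontal tangent plane and second partials a = 2, b = 1, c = 3.
B = graph_shape_operator(0.0, 0.0, 2.0, 1.0, 3.0)
```

At a point where $f_x = f_y = 0$ the result is just the negative of the matrix of second partials, so here `B` is ((-2, -1), (-1, -3)).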
Suppose S is a surface containing the point p and we wish to compute the principal curva-
tures at p. According to the definition, these curvatures are the negatives of the eigenvalues
of the matrix B defined by B(X) = X(n). Thus we can find linearly independent tangent
vectors e1 and e2 at p which satisfy e1 (n) = −κ1 e1 and e2 (n) = −κ2 e2 .
Suppose we were to rotate the surface in three-space. We claim that the principal curvatures
will not change. Indeed, n will rotate and the ei will rotate and the derivative contraption
will rotate, and after this rotation we will still have e1 (n) = −κ1 e1 and e2 (n) = −κ2 e2 at
the rotated image of p.
Clearly we can rotate the surface so p lies over the origin and its tangent plane is parallel
to the xy-plane. After this rotation, the surface will be given near p by
\[ z = f(x, y) = f(0, 0) + \frac{1}{2}\left( ax^2 + 2bxy + cy^2 \right) + \dots \]
Since the tangent plane at the origin is parallel to the xy-plane, we have fx = fy = 0 at
the origin, and fxx = a, fxy = b, fyy = c. Therefore the matrix B at the origin is
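\[ -\frac{1}{(1 + 0 + 0)^{3/2}} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ b & c \end{pmatrix} = \begin{pmatrix} -a & -b \\ -b & -c \end{pmatrix} \]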
The principal curvatures are the negatives of the eigenvalues of this B. So the principal
curvatures and principal directions are the eigenvalues and eigenvectors of
\[ \begin{pmatrix} a & b \\ b & c \end{pmatrix} \]
In the special case when b = 0, these eigenvalues are a and c and the principal directions
are along the x and y axes. Therefore we recover the description of curvature from the
preface:
\[ f(x, y) = f(0, 0) + \frac{\kappa_1}{2} x^2 + \frac{\kappa_2}{2} y^2 + \dots \]
In the general case when b may not be zero, write p = (x, y) and notice that
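\[ \langle B(p), p \rangle = \left\langle \begin{pmatrix} a & b \\ b & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}, \begin{pmatrix} x \\ y \end{pmatrix} \right\rangle = ax^2 + 2bxy + cy^2 \]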
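This is the quadratic term of the Taylor expansion, up to a factor of 1/2. Choose orthonormal
eigenvectors $e_1$ and $e_2$, with eigenvalues $\kappa_1$ and $\kappa_2$. Then p = (x, y) can be written in the form $ue_1 + ve_2$,
where u and v are new coordinates of p in the coordinate system defined by $e_1$ and $e_2$. We
have
\[ ax^2 + 2bxy + cy^2 = \langle B(p), p \rangle = \langle B(ue_1 + ve_2), ue_1 + ve_2 \rangle. \]
Since the $e_i$ are eigenvectors, this equals
\[ \langle \kappa_1 u e_1 + \kappa_2 v e_2, ue_1 + ve_2 \rangle = \kappa_1 u^2 + \kappa_2 v^2. \]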
If κ1 and κ2 are the principal curvatures, recall that we have defined the mean curvature
m = κ1 + κ2 and the Gaussian curvature κ = κ1 κ2 . Since the principal curvatures are the
negatives of the eigenvalues of B, we have
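\[ \det(\lambda I - B) = (\lambda + \kappa_1)(\lambda + \kappa_2) = \lambda^2 + m\lambda + \kappa. \]
On the other hand, for any 2 × 2 matrix B,
\[ \det(\lambda I - B) = \lambda^2 - \mathrm{tr}(B)\lambda + \det(B), \]
where tr(B), the trace of B, is the sum of the diagonal elements of B, and det(B) is the
determinant of B. Consequently, we have proved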
Theorem 35 The mean curvature and Gaussian curvature are given by
m = −tr(B) κ = det(B).
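The two descriptions of the curvatures (via eigenvalues of B and via trace and determinant) can be compared numerically. The sketch below is only illustrative; the function names are ours, and the sample matrix is the matrix $B = -g^{-1} b$ of the graph of $f(x, y) = (y^2 - x^2)/2$ at the point (1, 1), entered by hand.

```python
import math

def mean_and_gaussian(B):
    """Return (m, K) = (-tr(B), det(B)) for a 2x2 matrix B."""
    (p, q), (r, s) = B
    return -(p + s), p * s - q * r

def principal_curvatures(B):
    """Negatives of the eigenvalues of B, assuming the eigenvalues are real."""
    (p, q), (r, s) = B
    tr = p + s
    det = p * s - q * r
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))  # clamp tiny negative rounding
    eig_small, eig_large = (tr - disc) / 2, (tr + disc) / 2
    return -eig_small, -eig_large

c = 3 ** 1.5
B = ((2 / c, -1 / c), (1 / c, -2 / c))   # sample shape operator matrix

m, K = mean_and_gaussian(B)
k1, k2 = principal_curvatures(B)
```

Here `k1 + k2` agrees with `m` and `k1 * k2` agrees with `K`, as the theorem predicts. Note that the matrix of B need not be symmetric in a non-orthonormal basis, even though its eigenvalues are real.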
Remark: If B is an arbitrary linear transformation, the numbers tr(B) and det(B) are
invariants which do not depend on the choice of basis. It can be proved that they are the
only such invariants in the sense that any continuous invariant is just a function of tr(B)
and det(B). Therefore from an algebraic point of view, the Gaussian curvature and the
mean curvature are natural invariants of the operator B.
As explained in the preface, we will soon prove that the Gaussian curvature can be com-
puted intrinsically by a two-dimensional worker. It will interest us greatly. The mean
curvature is also interesting, although we do not have time to study it in this course. For
instance, if a wire is bent and dipped into a soap film, a surface forms supported by the
wire. The chosen surface has minimal area among all surfaces bounded by the wire, and
it is then easy to prove that it has mean curvature zero using the variational techniques
introduced during the discussion of geodesics. So soap films bounding a wire are always
shaped like a saddle. The curvatures of this saddle vary from point to point, but they are
always of equal magnitude and opposite sign.
3.15 Examples
Notice that this normal points inward. Let us change the sign of the normal and use the
more standard outward-pointing normal.
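We have
\[ \frac{\partial^2 s}{\partial x^2} = (0, f'' \cos\theta, f'' \sin\theta) \]
\[ \frac{\partial^2 s}{\partial x \partial\theta} = (0, -f' \sin\theta, f' \cos\theta) \]
\[ \frac{\partial^2 s}{\partial\theta^2} = (0, -f \cos\theta, -f \sin\theta) \]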
The matrix for b is obtained by dotting these terms with n; recall that we have changed
the sign of n.
\[ b = \frac{1}{\sqrt{1 + (f')^2}} \begin{pmatrix} f'' & 0 \\ 0 & -f \end{pmatrix} \]
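Consequently
\[ B = -g^{-1} b = -\begin{pmatrix} \frac{1}{1 + (f')^2} & 0 \\ 0 & \frac{1}{f^2} \end{pmatrix} \frac{1}{\sqrt{1 + (f')^2}} \begin{pmatrix} f'' & 0 \\ 0 & -f \end{pmatrix} \]
and so
\[ B = \begin{pmatrix} \frac{-f''}{(1 + (f')^2)^{3/2}} & 0 \\ 0 & \frac{1}{f (1 + (f')^2)^{1/2}} \end{pmatrix} \]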
This matrix is already diagonal, and the principal curvatures are the negatives of its diagonal
entries. Therefore the principal directions are along the profile curves with varying x and
constant θ, giving curvature
\[ \kappa_1 = \frac{f''}{(1 + (f')^2)^{3/2}} \]
and along the circles of revolution with varying θ and constant x, giving curvature
\[ \kappa_2 = \frac{-1}{f (1 + (f')^2)^{1/2}}. \]
The Gaussian curvature is therefore
\[ \kappa = \frac{-f''}{f (1 + (f')^2)^2} \]
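The formulas for $\kappa_1$, $\kappa_2$ and $\kappa$ are easy to evaluate numerically. The following sketch (the function name and the sample values a = 2, x = 0.5 are our own choices) takes the values of f, f′ and f″ at one x, assumes f > 0 and the outward normal used above, and tries the profile $f(x) = \sqrt{a^2 - x^2}$, whose surface of revolution is a sphere of radius a.

```python
import math

def revolution_curvatures(f, fp, fpp):
    """Return (kappa1, kappa2, Gaussian curvature) from f, f', f'' at one x."""
    w = 1.0 + fp * fp
    k1 = fpp / w ** 1.5               # along the profile curve
    k2 = -1.0 / (f * math.sqrt(w))    # along the circle of revolution
    return k1, k2, k1 * k2

# Profile of a sphere of radius a, evaluated at a sample x.
a, x = 2.0, 0.5
f = math.sqrt(a * a - x * x)
fp = -x / f                # f'(x)
fpp = -a * a / f ** 3      # f''(x)
k1, k2, K = revolution_curvatures(f, fp, fpp)
```

Both principal curvatures come out as −1/a and the Gaussian curvature as 1/a², independent of the sample x, as one expects for a sphere oriented by its outward normal.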
Remark: A little thought shows that these results are reasonable. Consider the picture
below. Notice that when f is concave up so that $f'' > 0$, the two curvatures have opposite
signs and the surface of revolution looks like a saddle. But when f is concave down so that
$f'' < 0$, the two curvatures have the same sign and the surface looks like a paraboloid.
Since we can obtain a doughnut by rotating a circle, these results imply that the doughnut
looks like a saddle along the inside half, and like a paraboloid along the outside half.
An interesting special case occurs when $f''(x) = 0$ so that f(x) = ax + b. In this case
$\kappa_1 = 0$ and the Gaussian curvature is zero. The corresponding surfaces look like cylinders
or cones.
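Consider the case $f(x) = \sqrt{a^2 - x^2}$, which yields a sphere of radius a. A brief calculation
with the formulas above gives $\kappa_1 = \kappa_2 = -1/a$, so the Gaussian curvature is $1/a^2$.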
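\[ B = \frac{-1}{(1 + x^2 + y^2)^{3/2}} \begin{pmatrix} 1 + y^2 & xy \\ xy & 1 + x^2 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \]
\[ B = \frac{1}{(1 + x^2 + y^2)^{3/2}} \begin{pmatrix} 1 + y^2 & -xy \\ xy & -1 - x^2 \end{pmatrix} \]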
At the origin, these values are ±1, confirming earlier results. Because the expression
1 + x2 + y 2 inside the square root is always positive, one of these terms is positive and one
is negative. The two curvatures approach zero when x and y are large.
The theory of b and B developed above can be formulated purely algebraically. From this
standpoint, differential geometry enters the picture only in the initial definition b(X, Y ) =
X(Y )·n. In linear algebra courses, the theory is known as the theory of quadratic forms.
Here is a sketch. Let V be a finite dimensional real vector space. A quadratic form on V
is a symmetric bilinear map b(X, Y ) from pairs of vectors X and Y to the real numbers.
The associated quadratic map is the map b(X, X). In coordinates, this quadratic map has
the form
\[ (r_1, \dots, r_n) \to \sum_{ij} b_{ij} r_i r_j \]
Knowing b(X, Y) determines the quadratic map, but conversely the quadratic map determines
b because $b(X, Y) = \frac{1}{2}\left( b(X + Y, X + Y) - b(X, X) - b(Y, Y) \right)$. The fundamental
theorem of quadratic form theory states that we can always choose new coordinates so that
the quadratic map becomes
\[ (r_1, \dots, r_n) \to r_1^2 + \dots + r_p^2 - r_{p+1}^2 - \dots - r_{p+q}^2 \]
Moreover, the number of terms with a plus sign, the number with a minus sign, and the
number which do not appear at all, are invariants of b.
Now suppose that V has an inner product. A deeper theory emerges if we pay attention
to this inner product. Then the fundamental theorem states that we can always choose a
new orthonormal basis so the quadratic map becomes
\[ (r_1, \dots, r_n) \to \lambda_1 r_1^2 + \dots + \lambda_n r_n^2 \]
4.1 Introduction
Euclid’s geometry book has no introduction. Instead the book starts with the following
definitions:
4. A straight line is a line that lies evenly with the points on itself.
7. A plane surface is a surface which lies evenly with the straight lines on itself.
This first page of Euclid is almost a summary of our course. Euclid’s lines are our curves.
His straight lines are our geodesics. His surfaces are our surfaces, and his plane surfaces
are our zero curvature surfaces.
Let us examine carefully the implications of the statement that a two-dimensional worker
will think that geodesics are straight lines.
Suppose a two-dimensional worker is living on a surface and examines a path γ(t) starting
in the direction $\gamma'(0)$. The worker can construct the geodesic g(t) which starts in the same
direction $g'(0) = \gamma'(0)$. To this worker, γ will curve if it bends away from this geodesic.
88 CHAPTER 4. THE COVARIANT DERIVATIVE
4.2 Review
If X and Y are tangent vector fields, the derivative X(Y) need not be tangent to the
surface. In the previous chapter, we decomposed X(Y) into a tangential component and a normal component as
\[ X(Y) = \nabla_X Y + b(X, Y)\, n \]
In the early part of this century, differential geometry was intimately connected to ten-
sor calculus, an unpleasant subject involving manipulation of complicated multi-indexed
symbols. In tensor analysis, we associate indexed symbols with the local coordinate ver-
sions of geometric objects on the surface. New objects are then created using various
algebraic combinations of derivatives of the original symbols. These new expressions can
be complicated, leading to what Spivak called “the debauch of indices.”
But there is a catch. If we are not careful, the new objects will depend on the coordinate
system chosen.
4.4. FUNDAMENTAL PROPERTIES 89
I’d like to show you an example. Suppose X and Y are tangent vector fields. In local
coordinates X = (X1 , X2 ) and Y = (Y1 , Y2 ), so X and Y are tensors of a simple type.
Suppose we want to define the derivative of Y in the direction X. It is natural to mimic
the definition in section 3.3 and write
\[ X(Y) = X_1 \frac{\partial Y}{\partial u} + X_2 \frac{\partial Y}{\partial v} \]
This is exactly the formula we used to define X(V) except that V was a three-dimensional
vector in our earlier work. But there is an extremely important difference. We deduced
our earlier formula from an invariant expression
\[ \frac{d}{dt} V(\alpha(t)) \]
that makes direct sense on the surface. In the present case, we have written down an
expression algebraically without providing a corresponding surface calculation.
It turns out that the proposed definition of X(Y ) is nonsense because X(Y ) gives different
results in different local coordinate systems.
Here is an example. Consider the surface s(x, y) = (x, y, 0). It is just the xy-plane. Let X
be the local coordinate vector (1, 0) and let Y be the local coordinate vector (−y, x). Our
definition of X(Y) gives $\frac{d}{dx}(-y, x) = (0, 1)$. In particular, this derivative is not zero.
However, watch what happens when we do the same calculation in polar coordinates
s(r, θ) = (r cos θ, r sin θ, 0). Then the local coordinate vector (0, 1) corresponds to the
three-dimensional vector $\frac{\partial s}{\partial\theta} = (-r \sin\theta, r \cos\theta, 0) = (-y, x, 0)$. This is exactly the Y we used
earlier. But in polar coordinates X(Y) will be zero regardless of the coordinate expression
for X because Y = (0, 1) and the coefficients of Y are constant.
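The discrepancy is easy to reproduce numerically. The sketch below applies the naive formula $X_1 \partial Y/\partial u + X_2 \partial Y/\partial v$ to the coefficient functions of the same field in both coordinate systems, using central differences. The helper name `naive_derivative`, the step size, and the sample point are our own assumptions. At the point (x, y) = (2, 0), which is (r, θ) = (2, 0), the vector ∂/∂x has polar coefficients (1, 0), so X = (1, 0) in both systems.

```python
def naive_derivative(Y, X, point, h=1e-6):
    """Naive X(Y) = X1 dY/du + X2 dY/dv applied to coefficient functions."""
    u, v = point
    # central differences of each coefficient of Y in the two coordinate directions
    dY_du = [(p - m) / (2 * h) for p, m in zip(Y(u + h, v), Y(u - h, v))]
    dY_dv = [(p - m) / (2 * h) for p, m in zip(Y(u, v + h), Y(u, v - h))]
    return [X[0] * du + X[1] * dv for du, dv in zip(dY_du, dY_dv)]

def Y_cartesian(x, y):
    return (-y, x)          # coefficients of Y in coordinates (x, y)

def Y_polar(r, theta):
    return (0.0, 1.0)       # the same field in coordinates (r, theta)

cart = naive_derivative(Y_cartesian, (1.0, 0.0), (2.0, 0.0))
polar = naive_derivative(Y_polar, (1.0, 0.0), (2.0, 0.0))
```

The Cartesian computation returns approximately (0, 1) while the polar computation returns exactly (0, 0). These coefficient pairs describe different vectors, which is the coordinate dependence described above.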
Theorem 37 Let X be a tangent vector at p, and let Y be a tangent vector field. Then
vector differentiation $\nabla_X Y$ has the following properties:
1. $\nabla_X (r_1 Y + r_2 Z) = r_1 \nabla_X Y + r_2 \nabla_X Z$
2. If f is a $C^\infty$ function, $\nabla_X (fY) = X(f)\, Y + f\, \nabla_X Y$
3. $\nabla_{r_1 X + r_2 Y} Z = r_1 \nabla_X Z + r_2 \nabla_Y Z$
and the third result follows by taking the tangential component of both sides.
But Z and n are orthogonal since Z is a tangent vector. Similarly Y and n are orthogonal.
Therefore the above result becomes
Finally we prove the last result. According to the earlier theorem, X(Y ) − Y (X) = [X, Y ] .
Decomposing into tangential and normal components gives
and the result follows by equating the tangential components of both sides.
4.5. A FORMULA FOR ∇X Y 91
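Suppose $X = \sum_i X_i \frac{\partial}{\partial u_i}$ is a vector and $Y = \sum_j Y_j(u, v) \frac{\partial}{\partial u_j}$ is a vector field. Applying the
previous theorem,
\[ \nabla_X Y = \sum_{ij} X_i \nabla_{\frac{\partial}{\partial u_i}} \left( Y_j \frac{\partial}{\partial u_j} \right) \]
So
\[ \nabla_X Y = \sum_{ij} X_i \left( \frac{\partial Y_j}{\partial u_i} \frac{\partial}{\partial u_j} + Y_j \nabla_{\frac{\partial}{\partial u_i}} \frac{\partial}{\partial u_j} \right) \]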
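Remark: We now need a formula for $\nabla_{\frac{\partial}{\partial u_i}} \frac{\partial}{\partial u_j}$. This expression is another vector field, and
so a linear combination of the $\frac{\partial}{\partial u_k}$. It turns out that the coefficients of this linear expression
are the Christoffel symbols. Let us forget for a moment that we have worked with these
symbols earlier and give a fresh definition of the $\Gamma^k_{ij}$:
\[ \nabla_{\frac{\partial}{\partial u_i}} \frac{\partial}{\partial u_j} = \sum_k \Gamma^k_{ij} \frac{\partial}{\partial u_k}. \]
For example,
\[ \nabla_{\frac{\partial}{\partial u}} \frac{\partial}{\partial u} = \Gamma^1_{11} \frac{\partial}{\partial u} + \Gamma^2_{11} \frac{\partial}{\partial v}. \]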
Remark: In this expression, the outer sum gives the derivative as a linear combination of
the basis vectors. The above formula is more palatable if we look at the special case when
$X = \frac{\partial}{\partial u}$. According to the formula, the derivative of $(Y_1, Y_2)$ in this direction is
\[ \sum_k \left\{ \frac{\partial Y_k}{\partial u} + \sum_j \Gamma^k_{1j} Y_j \right\} \frac{\partial}{\partial u_k} \]
The first term inside the curly bracket is the naive term we wrote down in section 4.2.
The second term is a correction which yields a coordinate invariant derivative.
To finish the theory, we must find a formula for Γkij . Luckily, the correct formula is the
formula we obtained in chapter three:
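Theorem 39 The $\Gamma^k_{ij}$ are given by the following formula. Consequently, they are exactly
equal to the Christoffel symbols which occurred earlier in our course.
\[ \Gamma^l_{ij} = \frac{1}{2} \sum_k (g^{-1})_{lk} \left\{ \frac{\partial g_{jk}}{\partial u_i} + \frac{\partial g_{ik}}{\partial u_j} - \frac{\partial g_{ij}}{\partial u_k} \right\} \]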
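Since the formula for $\Gamma^l_{ij}$ involves only the $g_{ij}$ and their first derivatives, it can be evaluated mechanically. The sketch below does so with central differences; the function names, the zero-based index layout `gamma[l][i][j]`, and the step size are our own assumptions. As a check we use the sphere metric $g_{11} = \sin^2\varphi$, $g_{12} = 0$, $g_{22} = 1$ in coordinates (θ, φ).

```python
import math

def christoffel(metric, point, h=1e-5):
    """Return gamma[l][i][j] at `point`; metric(u1, u2) gives ((g11, g12), (g12, g22))."""
    def dg(k):
        # central difference of every g_ab with respect to coordinate k
        plus, minus = list(point), list(point)
        plus[k] += h
        minus[k] -= h
        gp, gm = metric(*plus), metric(*minus)
        return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(2)]
                for a in range(2)]

    d = [dg(0), dg(1)]                     # d[k][a][b] = d g_ab / d u_k
    g = metric(*point)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    ginv = [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    for l in range(2):
        for i in range(2):
            for j in range(2):
                gamma[l][i][j] = 0.5 * sum(
                    ginv[l][k] * (d[i][j][k] + d[j][i][k] - d[k][i][j])
                    for k in range(2))
    return gamma

def sphere_metric(theta, phi):
    s = math.sin(phi)
    return ((s * s, 0.0), (0.0, 1.0))

phi = 1.0
gamma = christoffel(sphere_metric, (0.3, phi))
```

With zero-based indices, `gamma[0][0][1]` approximates $\Gamma^1_{12} = \cos\varphi / \sin\varphi$ and `gamma[1][0][0]` approximates $\Gamma^2_{11} = -\sin\varphi \cos\varphi$.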
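Proof: By the fourth item in the theorem from section 4.3, we have $X \langle Y, Z \rangle = \langle \nabla_X Y, Z \rangle + \langle Y, \nabla_X Z \rangle$, with similar formulas for other letters. Therefore
\[ X \langle Y, Z \rangle + Y \langle X, Z \rangle - Z \langle X, Y \rangle \]
equals
\[ \langle \nabla_X Y + \nabla_Y X, Z \rangle + \langle \nabla_X Z - \nabla_Z X, Y \rangle + \langle \nabla_Y Z - \nabla_Z Y, X \rangle \]
Apply the formula $\nabla_X Y - \nabla_Y X = [X, Y]$ and its counterparts using other letters. The last
two terms simplify immediately, while the left half of the first term becomes $\nabla_X Y + \nabla_Y X = \nabla_X Y + \{\nabla_X Y - [X, Y]\}$, or $2\nabla_X Y - [X, Y]$. Consequently, the previously displayed formula
becomes
\[ 2 \langle \nabla_X Y, Z \rangle - \langle [X, Y], Z \rangle + \langle [X, Z], Y \rangle + \langle [Y, Z], X \rangle \]
Putting this all together, we obtain the following important formula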
4.6. TANGENT FIELDS ALONG CURVES 93
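Important Formula:
\[ 2 \langle \nabla_X Y, Z \rangle = X \langle Y, Z \rangle + Y \langle X, Z \rangle - Z \langle X, Y \rangle - \langle X, [Y, Z] \rangle - \langle Y, [X, Z] \rangle + \langle Z, [X, Y] \rangle \]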
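Continuation of proof: Apply this formula in the special case that $X = \frac{\partial}{\partial u_i}$ and $Y = \frac{\partial}{\partial u_j}$
and $Z = \frac{\partial}{\partial u_k}$. Notice that the Lie bracket of any two such fields is zero. We obtain
\[ 2 \left\langle \sum_l \Gamma^l_{ij} \frac{\partial}{\partial u_l}, \frac{\partial}{\partial u_k} \right\rangle = \frac{\partial g_{jk}}{\partial u_i} + \frac{\partial g_{ik}}{\partial u_j} - \frac{\partial g_{ij}}{\partial u_k} \]
or equivalently
\[ \sum_l \Gamma^l_{ij} g_{lk} = \frac{1}{2} \left\{ \frac{\partial g_{jk}}{\partial u_i} + \frac{\partial g_{ik}}{\partial u_j} - \frac{\partial g_{ij}}{\partial u_k} \right\} \]
and the result clearly follows after multiplying both sides by $g^{-1}$ and summing appropriately. QED.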
We have now proved the following extremely important result:
Theorem 40 Let X and Y be tangent vector fields. The expression ∇X Y can be computed
intrinsically by a two-dimensional worker living on the surface.
Suppose Y is a tangent field on a surface and γ(t) is a curve. We want to compute the
derivative of Y in the direction of the curve. To do so, define $X = \gamma'(t)$. Then the derivative
of the three-dimensional vector field in the direction of the curve is X(Y) and the tangential
component of this derivative is $\nabla_X Y$. It is convenient to call the first of these derivatives
$\frac{dY}{dt}$ and the second $\frac{DY}{dt}$.
Let us deduce formulas for these derivatives. The formulas will reveal something interesting,
namely that the derivatives depend only on Y along the curve, and not on the extension
of Y to a global vector field.
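We first compute X(Y). Write $Y = (Y_1, Y_2) = Y_1 \frac{\partial s}{\partial u} + Y_2 \frac{\partial s}{\partial v}$ as a three-dimensional vector
V(u, v). Write $\gamma'(t) = \left( \frac{du}{dt}, \frac{dv}{dt} \right)$. Then $X(V) = \frac{du}{dt} \frac{\partial V}{\partial u} + \frac{dv}{dt} \frac{\partial V}{\partial v}$. By the chain rule, this equals
$\frac{d}{dt} V(u(t), v(t)) = \frac{d}{dt} V(\gamma(t))$. The answer depends only on V along γ(t).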
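By the chain rule, $\frac{d}{dt} Y_k(\gamma(t)) = \sum_j \frac{\partial Y_k}{\partial u_j} \frac{d\gamma_j}{dt}$, so the result can be rewritten as
\[ \sum_k \left\{ \frac{dY_k}{dt} + \sum_{ij} \Gamma^k_{ij} \frac{d\gamma_i}{dt} Y_j \right\} \frac{\partial}{\partial u_k} \]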
This answer also depends only on the values of Y on the curve γ(t) and not on the extension
of these values to the rest of the surface.
We use this observation to extend our theory to a slightly more general situation. Suppose
then that γ(t) is a curve and Y (t) is a function which assigns to each t a tangent vector
Y (t) starting at γ(t). We call Y a tangent field along the curve. See the picture below.
In coordinates Y (t) = (Y1 (t), Y2 (t)). This tangent vector can be converted to a three-
dimensional vector (V1 (t), V2 (t), V3 (t)).
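Definition 20 The ordinary derivative of Y(t) is a three-dimensional field along the curve,
and is defined by
\[ \frac{dY}{dt} = \left( \frac{dV_1}{dt}, \frac{dV_2}{dt}, \frac{dV_3}{dt} \right). \]
The covariant derivative of Y(t) is a tangent field along the curve, and is defined by
\[ \frac{DY}{dt} = \sum_k \left\{ \frac{dY_k}{dt} + \sum_{ij} \Gamma^k_{ij} \frac{d\gamma_i}{dt} Y_j \right\} \frac{\partial}{\partial u_k} \]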
Theorem 41 Let γ(t) be a curve on the surface and let Y(t) be a tangent field along γ.
1. The three-dimensional vector $\frac{dY}{dt}$ can be decomposed as follows into a tangential
component and a normal component:
\[ \frac{dY}{dt} = \frac{DY}{dt} + b(\gamma'(t), Y(t))\, n \]
2. $\frac{D}{dt} \{ r_1 Y(t) + r_2 Z(t) \} = r_1 \frac{DY}{dt} + r_2 \frac{DZ}{dt}$
3. $\frac{d}{dt} \langle Y(t), Z(t) \rangle = \left\langle \frac{DY}{dt}, Z \right\rangle + \left\langle Y, \frac{DZ}{dt} \right\rangle$
Proof: This follows immediately from the corresponding facts for vector fields. QED.
When γ(t) is a curve, the derivative $\gamma'(t)$ is a tangent field along the curve. The remaining
sections of this chapter are about $\frac{DY}{dt}$ for this special case $Y = \gamma'(t)$.
4.7 Acceleration
Forget our course for a moment and consider the path of a particle moving in $R^3$. The
position of the particle at time t will be γ(t) for some curve γ. The velocity of the particle
is then $\gamma'(t)$. We want to talk of the speed of the particle at time t, but we cannot call it
s(t) because s has a different meaning in this course. We will call the speed w(t). Then
$\gamma'(t) = w(t) T(t)$ where T(t) is the unit tangent vector.
Differentiating again, we find that the acceleration is given by
\[ \gamma''(t) = \frac{dw}{dt} T(t) + w \frac{dT}{dt} \]
It is convenient to find a more geometrical description of the second term. Write T(t) =
T(s(t)) as in chapter one, where T(s) is the tangent vector along the curve parameterized
by arc length. Then
\[ \frac{dT}{dt} = \frac{dT}{ds} \frac{ds}{dt} = (\kappa N)\, w(t). \]
Putting this together with the previous formula, we conclude that
Theorem 42 If a particle travels along a curve γ(t) and has speed w(t) at time t, then its
acceleration is given by
\[ \gamma''(t) = \frac{dw}{dt} T + w^2 \kappa N \]
where T is the unit tangent vector, N is the unit normal vector, and κ is the curvature of
the curve.
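The decomposition can be run in reverse to recover dw/dt and κ from a velocity and an acceleration vector at one instant. The sketch below is only an illustration; the function name and the test curve $\gamma(t) = (R \cos t^2, R \sin t^2, 0)$, a circle of radius R traversed at increasing speed, are our own choices, and the velocity is assumed to be nonzero.

```python
import math

def decompose_acceleration(vel, acc):
    """Return (dw/dt, kappa) from velocity and acceleration at one instant."""
    w = math.sqrt(sum(v * v for v in vel))             # speed
    T = [v / w for v in vel]                           # unit tangent
    dwdt = sum(a * t for a, t in zip(acc, T))          # component of acc along T
    normal_part = [a - dwdt * t for a, t in zip(acc, T)]   # equals w^2 kappa N
    kappa = math.sqrt(sum(c * c for c in normal_part)) / (w * w)
    return dwdt, kappa

# gamma(t) = (R cos t^2, R sin t^2, 0), differentiated by hand.
R, t = 2.0, 1.0
vel = (-2 * R * t * math.sin(t * t), 2 * R * t * math.cos(t * t), 0.0)
acc = (-2 * R * math.sin(t * t) - 4 * R * t * t * math.cos(t * t),
       2 * R * math.cos(t * t) - 4 * R * t * t * math.sin(t * t),
       0.0)
dwdt, kappa = decompose_acceleration(vel, acc)
```

For this curve the speed is w = 2Rt, so dw/dt = 2R = 4, and the curvature is 1/R = 0.5 regardless of how fast the circle is traversed.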
Remark: This theorem is well known to anyone who has driven a car. Passengers usually
sit calmly. But sometimes they are pressed into the seat or thrown forward in the direction
T. The force in this direction is independent of the speed of the car and depends only on
acceleration or deceleration (that is, on $\frac{dw}{dt}$).
Drivers are also thrown sidewise. The sidewise force is not caused by accelerating the car,
but instead depends on cornering. The magnitude of the sidewise force is proportional to κ,
the curvature of the road as determined by the engineer who designed it, and proportional
to speed squared. So driving around a corner slowly is a piece of cake, but driving around
it fast is dangerous.
We are going to refine the previous theorem when the car is driven on a surface. In that
case, the driver will feel a force in the direction T , caused as before by acceleration or
deceleration. The force perpendicular to T can be further decomposed into a component
in the direction n and a component in the direction n × T. The component in the direction
n presses the car down onto the surface. We will see that it is proportional to speed squared
and to the curvature of the surface in the direction T . The component in the direction
n × T pushes the car back and forth across the surface. We will see that it is proportional
to the square of the speed and to a number κg called the geodesic curvature of the curve.
This geodesic curvature, which will be computed using the covariant derivative, measures
the curvature of the path γ(t) from the point of view of a two-dimensional worker in the
surface. When the path follows a geodesic, this curvature is zero; otherwise it measures
the divergence of the path from a geodesic in the same direction.
Theorem 43 Let γ(t) be a $C^\infty$ curve on a surface. Suppose it has unit tangent vector
T(t) and speed w(t).
1. The decomposition
\[ \gamma''(t) = \frac{dw}{dt} T + w^2 \kappa N \]
obtained earlier can be refined to a decomposition into three components: one along
T, one along n, and one along n × T, as follows:
\[ \gamma''(t) = \frac{dw}{dt} T + w^2 \kappa_g (n \times T) + w^2 b(T, T)\, n \]
This expression defines a real-valued quantity $\kappa_g$, called the geodesic curvature of γ.
2. We have the following decomposition of $\gamma''$ into a component tangent to the surface
and a component normal to the surface:
4.8. SURFACE DECOMPOSITION OF ACCELERATION 97
\[ \gamma''(t) = \frac{D\gamma'}{dt} + w^2 b(T, T)\, n \]
3. The tangential component decomposes further as
\[ \frac{D\gamma'}{dt} = \frac{dw}{dt} T + w^2 \kappa_g (n \times T). \]
4. The curvature κ of the curve is thus decomposed into the geodesic curvature $\kappa_g$ of γ
within the surface, and the curvature b(T, T) of the surface itself. Moreover,
\[ \kappa^2 = \kappa_g^2 + b(T, T)^2 \]
5. The normal to the curve N(t) is decomposed into a vector tangent to the surface and
a vector normal to the surface. Moreover:
\[ N = \frac{\kappa_g}{\kappa} (n \times T) + \frac{b(T, T)}{\kappa}\, n. \]
Proof: Let γ(t) be an arbitrary $C^\infty$ curve on the surface. Then $\gamma'(t)$ is a tangent field
along the curve. The expression $\frac{d}{dt} \gamma'(t)$ defined in section 4.6 is just the acceleration $\gamma''(t)$.
According to results in that section, the tangential component of this acceleration vector
is
\[ \frac{D\gamma'(t)}{dt} = \sum_k \left\{ \frac{d^2 \gamma_k(t)}{dt^2} + \sum_{ij} \Gamma^k_{ij} \frac{d\gamma_i}{dt} \frac{d\gamma_j}{dt} \right\} \frac{\partial}{\partial u_k} \]
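Since $w^2 = \langle \gamma'(t), \gamma'(t) \rangle$, differentiating gives
\[ 2w \frac{dw}{dt} = \frac{d}{dt} \langle \gamma'(t), \gamma'(t) \rangle = \left\langle \frac{D\gamma'}{dt}, \gamma'(t) \right\rangle + \left\langle \gamma'(t), \frac{D\gamma'}{dt} \right\rangle = 2 \left\langle \frac{D\gamma'}{dt}, \gamma'(t) \right\rangle \]
or $2w \frac{dw}{dt} = 2 \left\langle \frac{D\gamma'}{dt}, wT \right\rangle$, and so
\[ \frac{dw}{dt} = \left\langle \frac{D\gamma'}{dt}, T \right\rangle \]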
Consequently, the tangential component of this decomposition is $\frac{dw}{dt} T$. Comparing the two
decompositions in the first part of the theorem gives
\[ \kappa N = \kappa_g (n \times T) + b(T, T)\, n \]
4.10 Example
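Consider the sphere parameterized by spherical coordinates: $s(\theta, \varphi) = (\sin\varphi \cos\theta, \sin\varphi \sin\theta, \cos\varphi)$.
Then
\[ \frac{\partial s}{\partial\theta} = (-\sin\varphi \sin\theta, \sin\varphi \cos\theta, 0) \]
\[ \frac{\partial s}{\partial\varphi} = (\cos\varphi \cos\theta, \cos\varphi \sin\theta, -\sin\varphi) \]
\[ g_{11} = \sin^2\varphi \qquad g_{12} = 0 \qquad g_{22} = 1. \]
Consequently,
\[ \frac{\partial s}{\partial\theta} \times \frac{\partial s}{\partial\varphi} = -\sin\varphi\, (\sin\varphi \cos\theta, \sin\varphi \sin\theta, \cos\varphi), \]
which points inward. We prefer to orient the sphere with outward pointing normal, so we
will change the sign of n. After normalizing we have
\[ n = (\sin\varphi \cos\theta, \sin\varphi \sin\theta, \cos\varphi). \]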
Consider the curves on the sphere of constant latitude, γ(t) = (sin φ cos t, sin φ sin t, cos φ).
Let us compute the geodesic curvature of these curves extrinsically.
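We have $\gamma' = (-\sin\varphi \sin t, \sin\varphi \cos t, 0)$ and $\gamma'' = (-\sin\varphi \cos t, -\sin\varphi \sin t, 0)$. So $\gamma' \cdot \gamma' = \sin^2\varphi$.
Along the curve, $n = (\sin\varphi \cos t, \sin\varphi \sin t, \cos\varphi)$ and so
\[ n \times \gamma' = (-\sin\varphi \cos\varphi \cos t, -\sin\varphi \cos\varphi \sin t, \sin^2\varphi). \]
We want to compute n × T along the curve, so we divide by the length of $\gamma'$, that is,
sin φ:
\[ n \times T = (-\cos\varphi \cos t, -\cos\varphi \sin t, \sin\varphi) \]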
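Thus
\[ \kappa_g = \frac{\frac{d^2\gamma}{dt^2} \cdot (n \times T)}{\frac{d\gamma}{dt} \cdot \frac{d\gamma}{dt}} = \frac{\sin\varphi \cos\varphi}{\sin^2\varphi} = \frac{\cos\varphi}{\sin\varphi}. \]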
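The extrinsic recipe used here, $\kappa_g = \gamma'' \cdot (n \times T) / w^2$, is short enough to check numerically. In the sketch below the helper names and the sample values of φ and t are our own assumptions; the velocity, acceleration and outward normal of the latitude circle are entered from the formulas above.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def geodesic_curvature(vel, acc, n):
    """kappa_g = acc . (n x T) / w^2 for a curve on a surface with unit normal n."""
    w2 = dot(vel, vel)
    w = math.sqrt(w2)
    T = tuple(v / w for v in vel)
    return dot(acc, cross(n, T)) / w2

def latitude_kappa_g(phi, t):
    s, c = math.sin(phi), math.cos(phi)
    vel = (-s * math.sin(t), s * math.cos(t), 0.0)
    acc = (-s * math.cos(t), -s * math.sin(t), 0.0)
    n = (s * math.cos(t), s * math.sin(t), c)      # outward unit normal
    return geodesic_curvature(vel, acc, n)

kg = latitude_kappa_g(math.pi / 3, 0.7)
kg_equator = latitude_kappa_g(math.pi / 2, 0.7)
```

The value `kg` agrees with cos φ / sin φ at φ = π/3, and `kg_equator` is zero up to rounding, reflecting the fact that the equator is a geodesic.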
\[ \Gamma^1_{12} = \frac{\cos\varphi}{\sin\varphi} \qquad \Gamma^2_{11} = -\sin\varphi \cos\varphi \]
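In local coordinates, our curve is γ(t) = (t, φ) and $\gamma'(t) = (1, 0)$. The first component of
$\frac{D\gamma'}{dt}$ is
\[ \frac{d^2\gamma_1}{dt^2} + 2\Gamma^1_{12} \frac{d\gamma_1}{dt} \frac{d\gamma_2}{dt} = \frac{d}{dt}(1) + 2\, \frac{\cos\varphi}{\sin\varphi} \cdot 0 = 0. \]
The second component of $\frac{D\gamma'}{dt}$ is
\[ \frac{d^2\gamma_2}{dt^2} + \Gamma^2_{11} \frac{d\gamma_1}{dt} \frac{d\gamma_1}{dt} = \frac{d}{dt}(0) - \sin\varphi \cos\varphi\, (1)^2 = -\sin\varphi \cos\varphi. \]
Since the speed is constant, $\frac{D\gamma'}{dt} = w^2 \kappa_g (n \times T)$ with $w^2 = \sin^2\varphi$, so in local coordinates
$\kappa_g (n \times T) = \left( 0, -\frac{\cos\varphi}{\sin\varphi} \right)$. The squared length of this vector is
\[ g_{11} \cdot 0 + 2 g_{12} \cdot 0 + g_{22} \frac{\cos^2\varphi}{\sin^2\varphi} = \frac{\cos^2\varphi}{\sin^2\varphi} \]
and so the geodesic curvature is
\[ \kappa_g = \pm \frac{\cos\varphi}{\sin\varphi}. \]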
We must determine the correct sign. In the two-dimensional coordinate space (θ, φ), our
curve has tangent (1, 0) and the vector computed above, $\left( 0, -\frac{\cos\varphi}{\sin\varphi} \right)$, points downward.
Hence we get to it by rotating clockwise. But we are supposed to get to n × T by rotating
counterclockwise, so we should choose the minus sign.
But wait. At the start of this section, we oriented the sphere using outward pointing
normals, warning that the expression $\frac{\partial s}{\partial\theta} \times \frac{\partial s}{\partial\varphi}$ points in the wrong direction. So the
orientation of our coordinate space is opposite the orientation chosen on the sphere, and we
must compensate for this change. Consequently, we should choose the plus sign.
Chapter 5. The Theorema Egregium
5.1 Introduction
We have divided the theory of surfaces into two pieces. The intrinsic piece can be
understood by a two-dimensional worker on the surface, and is determined by the first
fundamental form $g_{ij}$. Intrinsic theory is very rich, allowing two-dimensional workers to find
geodesics, to compute derivatives of vector fields $\nabla_X Y$, to determine acceleration $\frac{D\gamma'}{dt}$, and
to compute geodesic curvature $\kappa_g$.
The extrinsic piece of surface theory describes the curvature of the surface into the third
dimension. This curvature is determined by the second fundamental form bij .
Let us compare this theory with the theory of curves in chapter one. Suppose a one-
dimensional worker lives on a curve. In one-dimension, there is only one possible geometry,
the geometry of the real line. So instead of constructing an elaborate intrinsic theory, we
parameterized the curve by arc length to directly reflect this geometry.
The numbers κ and τ are the curve theoretic analogues of the bij in surface theory.
According to the fundamental theorem of curve theory, a curve is completely determined
by κ and τ up to Euclidean motion. This theorem was proved using the existence theorem
for ordinary differential equations.
There is an analogous theorem for surfaces, although we probably will not prove it. The
theorem asserts that the gij and bij completely determine the surface up to Euclidean
motion. This time, the theorem is proved using the existence theorem for partial differential
equations.
The theory of ordinary differential equations and the theory of partial differential equations
differ in an important respect. In ordinary theory, all reasonable equations have
solutions. But most partial differential equations do not have solutions unless they satisfy
an extra integrability condition. For example, the simplest partial differential equation is
the following system for an unknown f (x, y):
\[ \frac{\partial f}{\partial x} = E_x(x, y) \]
\[ \frac{\partial f}{\partial y} = E_y(x, y) \]
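For this system, a necessary condition for a solution is that $\partial E_x / \partial y = \partial E_y / \partial x$, since the mixed partial derivatives of f must agree. The sketch below estimates the gap between the two sides by central differences; the function name `mixed_partial_gap`, the step size, and the two sample fields are our own illustrative choices.

```python
def mixed_partial_gap(Ex, Ey, x, y, h=1e-5):
    """Estimate dEx/dy - dEy/dx at (x, y) by central differences."""
    dEx_dy = (Ex(x, y + h) - Ex(x, y - h)) / (2 * h)
    dEy_dx = (Ey(x + h, y) - Ey(x - h, y)) / (2 * h)
    return dEx_dy - dEy_dx

# E = (y, x) comes from f(x, y) = x*y, so the gap should vanish.
gap_ok = mixed_partial_gap(lambda x, y: y, lambda x, y: x, 0.3, -1.2)

# E = (-y, x) has gap -2, so no function f can have these partials.
gap_bad = mixed_partial_gap(lambda x, y: -y, lambda x, y: x, 0.3, -1.2)
```

A gap of zero at every point is what an integrability condition looks like in this simplest case; a nonzero gap anywhere rules out a solution.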
This chapter is about the integrability conditions in surface theory — conditions relating
the gij and bij which must hold before a surface exists with these fundamental forms. We
will determine all of these conditions (there are two). One of them will give us Gauss’s
theorema egregium.
On surfaces, there are no natural directions and we are forced to replace the two partials
with vector fields X and Y . These vector fields usually have non-constant coefficients, so the
operators X and Y do not commute. Instead, the non-commutativity X(Y (f ))−Y (X(f )) is
measured by another vector field [X, Y ]. We have seen this field appear several times:
∇X Y − ∇Y X = [X, Y ]
∇X ∇Y Z − ∇Y ∇X Z = ∇[X,Y ] Z
5.2. THE FUNDAMENTAL THEOREM 105
However, this formula is not quite correct. The correct formula will lead us to Gauss’s
great theorem.
Theorem 45 Let X and Y be tangent vector fields on a surface, and let V be an arbitrary
three-dimensional field on the surface. Then
\[ X(Y(V)) - Y(X(V)) = [X, Y](V). \]
Proof: Write $V = (V_1, V_2, V_3)$. The definition of X(V) states that we should differentiate
each coordinate separately:
\[ X(V) = (X(V_1), X(V_2), X(V_3)). \]
But then our theorem is just the assertion that for functions f we have X(Y(f)) −
Y(X(f)) = [X, Y]f. QED.
Remark: We reach the inner sanctum of surface theory by decomposing this equation
into normal and tangential components. The resulting equations are given next. You may
wish to deduce the equations yourself without looking.
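Proof: We know that X(Y(Z)) − Y(X(Z)) = [X, Y]Z. Let us compute the tangential
component of this equation. We have
\[ X(Y(Z)) = X(\nabla_Y Z + b(Y, Z)\, n) = X(\nabla_Y Z) + X(b(Y, Z)\, n). \]
But
\[ X(\nabla_Y Z) = \nabla_X \nabla_Y Z + b(X, \nabla_Y Z)\, n \]
and
\[ X(b(Y, Z)\, n) = X(b(Y, Z))\, n + b(Y, Z) X(n) = X(b(Y, Z))\, n + b(Y, Z) B(X). \]
So we have
\[ X(Y(Z)) = \left\{ \nabla_X \nabla_Y Z + b(Y, Z) B(X) \right\} + \left\{ b(X, \nabla_Y Z) + X(b(Y, Z)) \right\} n. \]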
Subtract the same expression with X and Y interchanged. The tangential component of
the result is
∇X ∇Y Z + b(Y, Z)B(X) − ∇Y ∇X Z − b(X, Z)B(Y )
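and must equal the tangential component of [X, Y]Z, which is $\nabla_{[X,Y]} Z$. Therefore
\[ \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z = b(X, Z) B(Y) - b(Y, Z) B(X). \]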
Take the inner product of this equation with W to obtain the first part of the theorem.
The second part of the theorem follows similarly by equating normal components of
X(Y (Z)) − Y (X(Z)) − [X, Y ]Z.
To obtain the third part of the theorem, decompose the equation X(Y (n)) − Y (X(n)) =
[X, Y ](n) into tangential and normal components. Notice that
and the result should equal [X, Y ](n) = B([X, Y ]). The normal component of this equation
is zero by the symmetry of the operator B, and the tangential component gives the third
result of the theorem. QED.
Remark: The above equations are formal and algebraic. It took genius to realize that
they hide important geometrical facts. We’ll reveal these facts in the next sections.
Incidentally, our convention that X, Y, and Z are tangent vectors and U, V, and W are
arbitrary three-dimensional vectors will not work in this section and the sections which
follow because we need four tangent vectors. From now on, W is a tangent vector!
In this section, we will study the right side of the first equation of the previous theorem
and try to disentangle the algebra. It will turn out that the information in the right side
is essentially the single number κ1 κ2 .
It is temporarily convenient to introduce the notation
\[ b(X, Y, Z, W) = b(X, Z)\,b(Y, W) - b(X, W)\,b(Y, Z). \]
Notice that the expression b(X, Y, Z, W ) is linear in each variable separately if we hold the
others fixed. So if we define
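\[ b_{ijkl} = b\left( \frac{\partial}{\partial u_i}, \frac{\partial}{\partial u_j}, \frac{\partial}{\partial u_k}, \frac{\partial}{\partial u_l} \right) \]
then
\[ b(X, Y, Z, W) = \sum_{ijkl} b_{ijkl}\, X_i Y_j Z_k W_l. \]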
\[ b_{1212} = \det(g_{ij})\, \kappa_1 \kappa_2. \]
Proof: These facts follow from earlier results stating that det B = κ1 κ2 and B = −g −1 b.
We’ll give a direct proof for completeness. Choose e1 and e2 principal directions. Then
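B(e1) = κ1 e1 and B(e2) = κ2 e2. So b(e1, e1) = ⟨B(e1), e1⟩ = κ1 ⟨e1, e1⟩ = κ1, etc. Write
X = a11 e1 + a21 e2 and Y = a12 e1 + a22 e2. Then by linearity
\[ b(X, X) = a_{11}^2\, b(e_1, e_1) + 2 a_{11} a_{21}\, b(e_1, e_2) + a_{21}^2\, b(e_2, e_2) = a_{11}^2 \kappa_1 + a_{21}^2 \kappa_2. \]
Similarly
\[ b(X, Y) = a_{11} a_{12} \kappa_1 + a_{21} a_{22} \kappa_2 \quad \text{and} \quad b(Y, Y) = a_{12}^2 \kappa_1 + a_{22}^2 \kappa_2. \]
A short calculation then gives
\[ b(X, X)\, b(Y, Y) - b(X, Y)^2 = (a_{11} a_{22} - a_{12} a_{21})^2\, \kappa_1 \kappa_2. \]
However,
\[ \langle X, X \rangle = a_{11}^2 \langle e_1, e_1 \rangle + 2 a_{11} a_{21} \langle e_1, e_2 \rangle + a_{21}^2 \langle e_2, e_2 \rangle = a_{11}^2 + a_{21}^2. \]
Similarly
\[ \langle X, Y \rangle = a_{11} a_{12} + a_{21} a_{22} \quad \text{and} \quad \langle Y, Y \rangle = a_{12}^2 + a_{22}^2, \]
and another short calculation gives
\[ \langle X, X \rangle \langle Y, Y \rangle - \langle X, Y \rangle^2 = (a_{11} a_{22} - a_{12} a_{21})^2. \]
Consequently
\[ \kappa_1 \kappa_2 = \frac{b(X, X)\, b(Y, Y) - b(X, Y)^2}{\langle X, X \rangle \langle Y, Y \rangle - \langle X, Y \rangle^2}. \]
If X and Y are orthonormal, the denominator is one and we have
\[ \kappa_1 \kappa_2 = b(X, X)\, b(Y, Y) - b(X, Y)^2. \]
If X = ∂/∂u and Y = ∂/∂v then the denominator is $g_{11} g_{22} - g_{12}^2$ and
\[ b_{1212} = b\left( \frac{\partial}{\partial u}, \frac{\partial}{\partial v}, \frac{\partial}{\partial u}, \frac{\partial}{\partial v} \right) = b(X, Y, X, Y), \]
which equals
\[ b(X, X)\, b(Y, Y) - b(X, Y)^2 = (g_{11} g_{22} - g_{12}^2)\, \kappa_1 \kappa_2. \]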
QED.
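The identity in this proof can be spot-checked numerically. The Python sketch below is an illustration only and is not part of the original notes: the function name and the sample values of κ1, κ2 and the coefficients a_ij are arbitrary assumptions. It builds both quadratic expressions from the formulas in the proof and compares their ratio with κ1 κ2.

```python
import math

def curvature_ratio(k1, k2, a11, a21, a12, a22):
    """Ratio (b(X,X) b(Y,Y) - b(X,Y)^2) / (<X,X><Y,Y> - <X,Y>^2)
    for X = a11 e1 + a21 e2 and Y = a12 e1 + a22 e2, where e1, e2 are
    orthonormal principal directions with principal curvatures k1, k2."""
    b_xx = a11**2 * k1 + a21**2 * k2
    b_xy = a11 * a12 * k1 + a21 * a22 * k2
    b_yy = a12**2 * k1 + a22**2 * k2
    g_xx = a11**2 + a21**2
    g_xy = a11 * a12 + a21 * a22
    g_yy = a12**2 + a22**2
    return (b_xx * b_yy - b_xy**2) / (g_xx * g_yy - g_xy**2)

# Sample principal curvatures and coefficient choices (arbitrary, X and Y independent).
k1, k2 = 0.7, -1.3
for coeffs in [(1, 0, 0, 1), (1, 2, 3, 4), (0.5, -1.5, 2.0, 0.25)]:
    ratio = curvature_ratio(k1, k2, *coeffs)
    print(coeffs, ratio, math.isclose(ratio, k1 * k2))
```

The ratio does not depend on the choice of X and Y as long as they are linearly independent, which is the content of the formula for κ1 κ2.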
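For tangent vector fields X, Y, Z, and W, define
\[ R(X, Y, Z, W) = \langle \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z,\ W \rangle. \]
Then the map R, which assigns a number to any combination of four tangent vectors, is
called the Riemann curvature tensor.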
Theorem 49 Let X, Y, Z, and W be tangent vectors at p.
1. The number R(X, Y, Z, W ) depends only on the values of X, Y, Z, and W at p, and
not on their extension to vector fields.
2. The map R(X, Y, Z, W ) is linear in each variable separately if the others are held
fixed.
3. There are numbers Rijkl so that
\[ R(X, Y, Z, W) = \sum_{ijkl} R_{ijkl}\, X_i Y_j Z_k W_l \]
\[ R_{1212} = -\det(g_{ij})\, \kappa_1 \kappa_2. \]
Proof: Let us start with vector fields X, Y, Z, and W. The expression R(X, Y, Z, W) is
linear over the reals in each variable separately. We are going to prove that R(fX, Y, Z, W) =
f R(X, Y, Z, W) for any C∞ function f, and that a similar result holds if we multiply any
of Y, Z, or W by f. It immediately follows that
\[ R(X, Y, Z, W) = \sum_{ijkl} R\left( \frac{\partial}{\partial u_i}, \frac{\partial}{\partial u_j}, \frac{\partial}{\partial u_k}, \frac{\partial}{\partial u_l} \right) X_i Y_j Z_k W_l, \]
and consequently that R(X, Y, Z, W ) at a point p depends only on the values of X, Y, Z, and
W at p, since no derivatives of the coefficient functions appear in the final expression.
As an initial step, notice that for any C ∞ function g we have
Now use the product rule for the covariant derivative, proved as point 2 of the first theorem
in section 4.4, to conclude that
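\[ \nabla_{fX} \nabla_Y Z - \nabla_Y \nabla_{fX} Z - \nabla_{[fX, Y]} Z \]
equals
\[ f \nabla_X \nabla_Y Z - Y(f) \nabla_X Z - f \nabla_Y \nabla_X Z - \nabla_{f[X,Y] - Y(f)X} Z \]
which in turn equals
\[ f \left( \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z \right). \]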
is equal to
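\[ \langle \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z,\ W \rangle + \langle Z,\ \nabla_X \nabla_Y W - \nabla_Y \nabla_X W \rangle \]
is equal to
\[ \langle \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z,\ W \rangle + \langle Z,\ \nabla_X \nabla_Y W - \nabla_Y \nabla_X W - \nabla_{[X,Y]} W \rangle \]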
However, the top equation is zero because g = hZ, W i is a function and for any function
whatever, X(Y (g)) − Y (X(g)) − [X, Y ](g) = 0. So the bottom equation, which equals
R(X, Y, Z, W ) + R(X, Y, W, Z), is zero.
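The remaining parts of the theorem now follow easily. We’ll prove only the formula for
Rijkl. Notice that the Lie bracket [∂/∂ui, ∂/∂uj] is zero. So
\[ \nabla_{\frac{\partial}{\partial u_i}} \nabla_{\frac{\partial}{\partial u_j}} \frac{\partial}{\partial u_k} - \nabla_{\frac{\partial}{\partial u_j}} \nabla_{\frac{\partial}{\partial u_i}} \frac{\partial}{\partial u_k} - \nabla_{\left[ \frac{\partial}{\partial u_i}, \frac{\partial}{\partial u_j} \right]} \frac{\partial}{\partial u_k} \]
equals
\[ \nabla_{\frac{\partial}{\partial u_i}} \left( \sum_s \Gamma^s_{jk} \frac{\partial}{\partial u_s} \right) - \nabla_{\frac{\partial}{\partial u_j}} \left( \sum_s \Gamma^s_{ik} \frac{\partial}{\partial u_s} \right) \]
which equals
\[ \sum_s \left\{ \frac{\partial \Gamma^s_{jk}}{\partial u_i} - \frac{\partial \Gamma^s_{ik}}{\partial u_j} + \sum_m \Gamma^m_{jk} \Gamma^s_{im} - \sum_m \Gamma^m_{ik} \Gamma^s_{jm} \right\} \frac{\partial}{\partial u_s}. \]
The theorem follows by dotting this final result with ∂/∂ul.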
The results of the previous sections allow us to compute the Gaussian curvature of the
Poincare disk. Recall that we defined gij directly on this disk, without embedding the disk
as a surface in R3 . So we cannot calculate κ1 κ2 extrinsically.
We parameterize the disk using polar coordinates s(r, θ) = (r cos θ, r sin θ). Earlier we gave
the metric in rectangular coordinates. These convert to polar coordinates as follows:
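\[ ds^2 = \frac{4\,(dx^2 + dy^2)}{(1 - (x^2 + y^2))^2} = \frac{4\,(dr^2 + (r\,d\theta)^2)}{(1 - r^2)^2} \]
Consequently
\[ g_{11} = \frac{4}{(1 - r^2)^2}, \qquad g_{12} = 0, \qquad g_{22} = \frac{4 r^2}{(1 - r^2)^2}. \]
A brief calculation shows that the only nonzero Christoffel symbols are
\[ \Gamma^1_{11} = \frac{2r}{1 - r^2}, \qquad \Gamma^1_{22} = -r - \frac{2 r^3}{1 - r^2}, \qquad \Gamma^2_{12} = \frac{1}{r} + \frac{2r}{1 - r^2}. \]
We must compute $R_{1212} = \sum_s \{\text{fancy expression with } s\}\, g_{s2}$. The only term that matters
occurs when s = 2 and we obtain
\[ R_{1212} = \left\{ \frac{\partial \Gamma^2_{21}}{\partial u_1} - \frac{\partial \Gamma^2_{11}}{\partial u_2} + \sum_m \Gamma^m_{21} \Gamma^2_{1m} - \sum_m \Gamma^m_{11} \Gamma^2_{2m} \right\} g_{22} \]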
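The expression for R1212 can be evaluated numerically as a sanity check. The Python sketch below is an illustration under stated assumptions, not the notes' own computation: it takes the Christoffel symbols and metric coefficients listed above with u1 = r and u2 = θ, approximates the r-derivative by a central difference, and uses the convention R1212 = − det(gij) κ1 κ2 from the theorem on the curvature tensor. All function names are hypothetical.

```python
import math

def gamma_1_11(r):
    return 2 * r / (1 - r * r)

def gamma_2_12(r):
    return 1 / r + 2 * r / (1 - r * r)

def g11(r):
    return 4 / (1 - r * r) ** 2

def g22(r):
    return 4 * r * r / (1 - r * r) ** 2

def derivative(f, r, h=1e-5):
    """Central-difference approximation of f'(r)."""
    return (f(r + h) - f(r - h)) / (2 * h)

def r_1212(r):
    """Braced expression times g22. Gamma^2_11 = 0, so only three terms survive:
    d(Gamma^2_21)/dr + Gamma^2_21 Gamma^2_12 - Gamma^1_11 Gamma^2_21."""
    a = gamma_2_12(r)
    braces = derivative(gamma_2_12, r) + a * a - gamma_1_11(r) * a
    return braces * g22(r)

def gaussian_curvature(r):
    """kappa_1 kappa_2 from R_1212 = -det(g_ij) kappa_1 kappa_2 (g12 = 0)."""
    return -r_1212(r) / (g11(r) * g22(r))

for r in (0.2, 0.5, 0.8):
    print(r, gaussian_curvature(r))
```

Under these assumptions the printed values are close to −1 at every radius tried, in agreement with the claim that the Poincare disk has Gaussian curvature −1.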
This optional section will not be used in the future. We’d like to simplify the remaining
equations of surface theory. Recall that these equations are
The left side of the first surface equation contains this term and the negative of the term
with X and Y interchanged, so the first surface equation can be rewritten
QED.
Definition 22 Let X and Y be vector fields. Then S(X, Y ) is the new vector field defined
by
\[ S(X, Y) = \nabla_X B(Y) - \nabla_Y B(X) - B([X, Y]) \]
4. We have
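\[ S_{12} = \nabla_{\frac{\partial}{\partial u}} B\left( \frac{\partial}{\partial v} \right) - \nabla_{\frac{\partial}{\partial v}} B\left( \frac{\partial}{\partial u} \right) \]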
Using formulas established in the section on the curvature tensor, this becomes
which simplifies to
QED.
Remark: We are interested in the equation S12 = 0. This equation can still be simplified
a little more.
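Theorem 53 The vector S12 is zero if and only if the equation below is true for Z = ∂/∂u
and for Z = ∂/∂v:
\[ \frac{\partial}{\partial u}\, b\left( \frac{\partial}{\partial v}, Z \right) - b\left( \frac{\partial}{\partial v}, \nabla_{\frac{\partial}{\partial u}} Z \right) = \frac{\partial}{\partial v}\, b\left( \frac{\partial}{\partial u}, Z \right) - b\left( \frac{\partial}{\partial u}, \nabla_{\frac{\partial}{\partial v}} Z \right) \]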
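Proof: We claim that these equations are equivalent to the statement ⟨S12, Z⟩ = 0 for
these two Z. To prove the assertion, notice that
\[ \langle S_{12}, Z \rangle = \left\langle \nabla_{\frac{\partial}{\partial u}} B\left( \frac{\partial}{\partial v} \right) - \nabla_{\frac{\partial}{\partial v}} B\left( \frac{\partial}{\partial u} \right),\ Z \right\rangle. \]
But
\[ \frac{\partial}{\partial u}\, b\left( \frac{\partial}{\partial v}, Z \right) = -\frac{\partial}{\partial u} \left\langle B\left( \frac{\partial}{\partial v} \right), Z \right\rangle = -\left\langle \nabla_{\frac{\partial}{\partial u}} B\left( \frac{\partial}{\partial v} \right), Z \right\rangle - \left\langle B\left( \frac{\partial}{\partial v} \right), \nabla_{\frac{\partial}{\partial u}} Z \right\rangle \]
and so
\[ \frac{\partial}{\partial u}\, b\left( \frac{\partial}{\partial v}, Z \right) - b\left( \frac{\partial}{\partial v}, \nabla_{\frac{\partial}{\partial u}} Z \right) = -\left\langle \nabla_{\frac{\partial}{\partial u}} B\left( \frac{\partial}{\partial v} \right), Z \right\rangle. \]
There is a similar formula obtained by interchanging u and v, and the result clearly follows.
QED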
Remark: From here, it is easy to get the classical equations. We have been writing the
second fundamental form as $\sum_{ij} b_{ij} X_i Y_j$. Let us adopt the classical language.
Definition 23 Define
e = b11     f = b12     g = b22
Then
\[ b(X, X) = e X_1^2 + 2 f X_1 X_2 + g X_2^2. \]
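Theorem 54 The second and third fundamental surface equations are equivalent to the
two equations below, known as the equations of Codazzi-Mainardi.
\[ \frac{\partial e}{\partial v} - \frac{\partial f}{\partial u} = e \Gamma^1_{12} + f \left( \Gamma^2_{12} - \Gamma^1_{11} \right) - g \Gamma^2_{11} \]
\[ \frac{\partial f}{\partial v} - \frac{\partial g}{\partial u} = e \Gamma^1_{22} + f \left( \Gamma^2_{22} - \Gamma^1_{12} \right) - g \Gamma^2_{12} \]
Proof: Let Z = ∂/∂u in the previous theorem. Then
\[ \frac{\partial}{\partial u}\, b\left( \frac{\partial}{\partial v}, Z \right) = \frac{\partial f}{\partial u} \]
and
\[ \frac{\partial}{\partial v}\, b\left( \frac{\partial}{\partial u}, Z \right) = \frac{\partial e}{\partial v}. \]
Also
\[ b\left( \frac{\partial}{\partial v}, \nabla_{\frac{\partial}{\partial u}} Z \right) = b\left( \frac{\partial}{\partial v},\ \Gamma^1_{11} \frac{\partial}{\partial u} + \Gamma^2_{11} \frac{\partial}{\partial v} \right) = f \Gamma^1_{11} + g \Gamma^2_{11} \]
and
\[ b\left( \frac{\partial}{\partial u}, \nabla_{\frac{\partial}{\partial v}} Z \right) = b\left( \frac{\partial}{\partial u},\ \Gamma^1_{12} \frac{\partial}{\partial u} + \Gamma^2_{12} \frac{\partial}{\partial v} \right) = e \Gamma^1_{12} + f \Gamma^2_{12}. \]
So
\[ \frac{\partial}{\partial u}\, b\left( \frac{\partial}{\partial v}, Z \right) - b\left( \frac{\partial}{\partial v}, \nabla_{\frac{\partial}{\partial u}} Z \right) = \frac{\partial}{\partial v}\, b\left( \frac{\partial}{\partial u}, Z \right) - b\left( \frac{\partial}{\partial u}, \nabla_{\frac{\partial}{\partial v}} Z \right) \]
becomes
\[ \frac{\partial f}{\partial u} - \left( f \Gamma^1_{11} + g \Gamma^2_{11} \right) = \frac{\partial e}{\partial v} - \left( e \Gamma^1_{12} + f \Gamma^2_{12} \right), \]
which is equivalent to the first equation in the theorem. The second equation can be
obtained by setting Z = ∂/∂v. QED.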
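As a concrete check of the Codazzi-Mainardi equations, the following Python sketch evaluates both sides on the unit sphere. It is an illustration under assumptions that are not spelled out in this section: coordinates u = θ, v = φ with g11 = sin²φ, g12 = 0, g22 = 1, a second fundamental form proportional to the metric (bij = c gij with c = ±1, depending on the choice of normal), and the Christoffel symbols of that metric computed by hand. The function name is hypothetical.

```python
import math

def codazzi_residuals(v, c=1.0):
    """Left side minus right side of both Codazzi-Mainardi equations on the
    unit sphere with u = theta, v = phi, g11 = sin^2 v, g12 = 0, g22 = 1."""
    # Second fundamental form, assumed proportional to the metric: b_ij = c g_ij.
    e, f, g = c * math.sin(v) ** 2, 0.0, c
    de_dv = 2 * c * math.sin(v) * math.cos(v)
    df_du = df_dv = dg_du = 0.0
    # Christoffel symbols of this metric; all others vanish.
    G1_12 = math.cos(v) / math.sin(v)
    G2_11 = -math.sin(v) * math.cos(v)
    G1_11 = G2_12 = G1_22 = G2_22 = 0.0
    first = (de_dv - df_du) - (e * G1_12 + f * (G2_12 - G1_11) - g * G2_11)
    second = (df_dv - dg_du) - (e * G1_22 + f * (G2_22 - G1_12) - g * G2_12)
    return first, second

for v in (0.3, 1.0, 2.5):
    print(v, codazzi_residuals(v))
```

Both residuals vanish up to rounding, so the sphere's second fundamental form is compatible with its metric in the sense of the theorem.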
There are two possible directions to proceed from the complicated surface equations of this
chapter. Mathematicians interested in embedding problems will study the possible choices
of bij for a known geometry gij. For example, suppose the geometry is Euclidean. We’ll
later prove that this happens exactly when κ1 κ2 = 0. Then we can choose coordinates such
that gij = δij, and the interesting questions are: How can this geometry be embedded in R3?
What are the surfaces in R3 with Gaussian curvature zero? What are the possible bij
satisfying the surface equations for gij = δij? By folding a piece of paper without tearing,
you’ll discover that there are surfaces with Gaussian curvature zero which are not planes,
cylinders, or cones.
We will not pursue these questions.
A second direction is to understand the role of Gaussian curvature in the intrinsic geometry
of two-dimensional objects. This is the direction Gauss recommended and the direction
we will pursue. The surface equations tell us that there is an additional deep geometric
invariant, the Gaussian curvature κ, which can be measured by two-dimensional workers.
Gauss believed that geometers ignore this invariant at their peril. When the invariant is
taken into account, classical theorems going back to Euclid generalize and provide insight
into modern developments in mathematics.
Gauss wrote his paper at exactly the moment that Bolyai and Lobachevsky were inventing
non-Euclidean geometry. These mathematicians discovered a geometry unlike conventional
Euclidean geometry, and yet a geometry which might, for quite plausible reasons, be the
correct geometry of the universe. The new geometry had bizarre features:
1. the area of a triangle is completely determined by the three angles of the triangle
2. the area of the entire plane is infinite, but there is a finite bound B such that no
triangle has area greater than B
3. the length of a circle is not 2πr
4. the area of a circle is not πr2
5. perfect rectangles do not exist
6. if two figures are similar, they are congruent.
Much later, Poincare discovered that the Poincare disk is a model for the new geometry.
But even before that, Gauss had observed that the features of the new geometry become
completely natural once one calculates that its Gaussian curvature is −1. We’ll study
Gauss’s arguments in the remaining chapters of the course.
Chapter 6
6.1 Introduction
Euclid divided The Elements into thirteen books. Each book is a collection of propositions
and proofs, with no intermediate explanations. But careful reading reveals a plot.
Book 1 has 48 propositions, divided into five major sections. The first contains preliminary
material, including various straightedge and compass constructions. The second, from
proposition 8 through proposition 26, is about congruent triangles.
The third section, from proposition 27 through proposition 32, introduces the parallel
postulate for the first time, uses this postulate to show that parallel lines cut by a
transversal produce equal alternate angles, and culminates in the proof from this result
that the sum of the angles of a triangle is 180 degrees.
In the fourth section, Euclid discusses area. He defines figures to be “equal” if they can
be decomposed into congruent pieces and uses this notion to prove results which effec-
tively compute the areas of triangles, rectangles, and parallelograms. The section includes
propositions from 33 through 43.
The last propositions, from 44 on, lead to the Pythagorean theorem (number 47) and its
converse (number 48).
Remark: The intrinsic geometry of surfaces starts with the metric tensor gij. In some
sense, this metric tensor is the Pythagorean theorem in disguise, because we can always
choose an orthonormal basis and then the length of X = X1 e1 + X2 e2 is exactly
\[ \sqrt{X_1^2 + X_2^2}. \]
118 CHAPTER 6. THE GAUSS-BONNET THEOREM
Intrinsic geometry provides an answer to the question: if the Pythagorean theorem holds
infinitesimally, what are the consequences for geometry?
Remark: If we read Book 1 of the Elements backward starting with the Pythagorean
theorem, we come next to the notion of area. To find the area of a region R on a surface, we
divide the region into parallelograms of size du by dv, find the area of each parallelogram,
and add.
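Theorem 55 The area of a region R on a surface is
\[ \iint_R \sqrt{g_{11} g_{22} - g_{12}^2}\ du\, dv \]
Proof: It suffices to show the area of the parallelogram spanned by X = ∂/∂u and Y = ∂/∂v
is $\sqrt{g_{11} g_{22} - g_{12}^2}$. But $g_{11} g_{22} - g_{12}^2$ equals
\[ ||X||^2 ||Y||^2 - \langle X, Y \rangle^2 = ||X||^2 ||Y||^2 (1 - \cos^2 \theta) = ||X||^2 ||Y||^2 \sin^2 \theta, \]
which is the square of the area of a parallelogram with sides X and Y meeting at angle θ.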
QED.
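The area formula is easy to try numerically. The sketch below is an illustration only: it approximates the double integral by a midpoint rule and applies it to the unit sphere with the metric g11 = sin²φ, g12 = 0, g22 = 1 (with u = θ and v = φ), for which the area should be 4π. The function names and the grid size are arbitrary assumptions.

```python
import math

def surface_area(g11, g12, g22, u_range, v_range, n=200):
    """Midpoint-rule approximation of the double integral of
    sqrt(g11*g22 - g12^2) over a coordinate rectangle."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            det = g11(u, v) * g22(u, v) - g12(u, v) ** 2
            total += math.sqrt(det) * du * dv
    return total

def unit_sphere_area(n=200):
    """Area of the unit sphere, u = theta in [0, 2 pi], v = phi in [0, pi]."""
    return surface_area(lambda u, v: math.sin(v) ** 2,
                        lambda u, v: 0.0,
                        lambda u, v: 1.0,
                        (0.0, 2 * math.pi), (0.0, math.pi), n)

print(unit_sphere_area(), 4 * math.pi)
```

The same function accepts any metric coefficients given as functions of (u, v), so it can be reused for other coordinate patches.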
More generally, we can integrate an arbitrary function f over R by dividing the region into
parallelograms, multiplying the value of f on each parallelogram by its area, and adding.
Proof: This follows from the previous theorem and the intuitive definition of the integral.
Remark: The two previous theorems can be taken as definitions if the reader desires.
Continuing to read Euclid’s book backward, we come next to the section on the sum of the
angles of a triangle. Euclid’s main result takes a spectacular form in differential geometry,
as we’ll show.
6.2. POLYGONS IN THE PLANE 119
It is easy to see that this result is reasonable. Travel along the curve counterclockwise. At
corners, turn to face the next portion of the curve. You’ll have turned completely around
by the end of the trip, so the total amount of turning will be 2π.
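The turning argument can be tried on explicit polygons. The Python sketch below is an illustration, not part of the notes: it computes the signed exterior angle at each corner of a polygon whose vertices are listed counterclockwise, counting counterclockwise turns as positive and clockwise turns as negative. The function name and the sample polygons are assumptions made for the example.

```python
import math

def exterior_angles(vertices):
    """Signed exterior (turning) angles at the corners of a polygon whose
    vertices are listed in counterclockwise order. Counterclockwise turns
    are positive and clockwise turns are negative."""
    n = len(vertices)
    angles = []
    for i in range(n):
        x0, y0 = vertices[i - 1]
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        ax, ay = x1 - x0, y1 - y0      # incoming edge direction
        bx, by = x2 - x1, y2 - y1      # outgoing edge direction
        cross = ax * by - ay * bx      # sign gives the direction of the turn
        dot = ax * bx + ay * by
        angles.append(math.atan2(cross, dot))
    return angles

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]  # one clockwise turn at (1, 1)
print(sum(exterior_angles(square)), sum(exterior_angles(l_shape)), 2 * math.pi)
```

In the L-shaped example one corner contributes a negative angle, yet the total is still 2π, which is why orientation has to be tracked when exterior angles are measured.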
When we measure exterior angles, we must keep track of orientation. In the above figure,
notice that most turns are counterclockwise, but one is clockwise. From now on, we
assume that the plane has been given the standard right-handed orientation in which angles
are measured counterclockwise. Later when we consider surfaces with a predetermined
orientation, we insist that local coordinates be chosen so the uv-plane is oriented in this
standard manner.
Next imagine that our region has curved sides, as in the picture below. At each point on one
of these sides, choose a unit tangent vector T and a unit normal vector N . We insist that N
be obtained from T by rotating counterclockwise. Each boundary piece has curvature κ(s);
this curvature can be positive or negative because we have chosen a particular direction for
N . We will later prove the following wonderful generalization of Euler’s theorem:
Theorem 58 Suppose R is a region in the plane whose boundary consists of one counterclockwise
curve, possibly with curved sides and corners. Then
\[ \sum_{\text{corners}} \theta_i + \sum_{\text{sides}} \int_{\gamma_i} \kappa(s) = 2\pi. \]
Example 1: Consider a circle of radius R. This circle has no corners and its curvature is
1/R. So
\[ \int_{\gamma} \kappa(s) = \frac{1}{R}\, (\text{length of circle}) = 2\pi \]
and so
\[ (\text{length of circle}) = 2\pi R. \]
Remark: Consequently, the formula for the length of a circle and the theorem that the
sum of the angles of a triangle is π are really special cases of a common result!
Example 2: Consider the semicircle of radius R pictured below. This semicircle has
two corners, each of angle π/2. The previous theorem then yields the following correct
result:
\[ \sum_{\text{corners}} \theta_i + \sum_{\text{sides}} \int_{\gamma_i} \kappa(s) = \frac{\pi}{2} + \frac{\pi}{2} + \frac{1}{R}\, (\pi R) = 2\pi. \]
Example 3: Consider the figure below, whose sides are portions of circles of radius R.
The boundary of this figure contains three angles. The exterior angles at the sides are
π/2. The exterior angle at the bottom is π; indeed if the angle at the bottom were slightly
less sharp then the exterior angle would clearly be measured counterclockwise and equal
slightly less than π.
The semicircle at the top has curvature 1/R, but the two bottom pieces have curvature −1/R
because of our orientation convention. So
\[ \sum_{\text{corners}} \theta_i + \sum_{\text{sides}} \int_{\gamma_i} \kappa(s) = \frac{\pi}{2} + \frac{\pi}{2} + \pi + \frac{1}{R}\, (\pi R) - \frac{1}{R} \cdot \frac{\pi R}{2} - \frac{1}{R} \cdot \frac{\pi R}{2} = 2\pi. \]
The previous theorem is certainly false if we replace the plane with a curved surface and
replace the curvature of the sides with geodesic curvature. For example, the sides of the
region shown below on the sphere are geodesics with geodesic curvature zero, but the sum
of the three corner exterior angles is 3π/2 rather than 2π.
[Figure: a region on the sphere bounded by three geodesics.]
Gauss, and later Bonnet, discovered that we can correct the error by adding a single term to
the equation — a term containing the Gaussian curvature κ1 κ2 . Gauss proved this theorem
when the sides are geodesics, and Bonnet extended it to regions with curved sides. By any
measure, the resulting theorem is among the greatest results in mathematics:
Theorem 59 (Gauss-Bonnet) Let R be a simply-connected region on a surface, whose
boundary consists of a finite number of curves meeting at corners. Then
\[ \sum_{\text{corners}} \theta_i + \sum_{\text{sides}} \int_{\gamma_i} \kappa_g(s) + \iint_R \kappa_1 \kappa_2 = 2\pi. \]
6.4. INITIAL APPLICATIONS 123
Example: Suppose the sphere in the previous example has radius R. The Gaussian
curvature is (1/R) · (1/R) and the area of the triangle is 4πR²/8, so
\[ \iint_R \kappa_1 \kappa_2 = \frac{1}{R^2} \cdot \frac{4\pi R^2}{8} = \frac{\pi}{2} \]
and the sum of the exterior angles plus this extra term is
\[ \frac{\pi}{2} + \frac{\pi}{2} + \frac{\pi}{2} + \frac{\pi}{2} = 2\pi. \]
Remark: The proof of the Gauss-Bonnet theorem is very beautiful. But contrary to
our usual custom, we will place it at the end of this chapter after discussing some of the
remarkable consequences of the theorem.
We’ll discuss some results here whose proofs are immediate and can be left to the reader.
The Gauss-Bonnet theorem explains how two dimensional workers might discover that the
Gaussian curvature is nonzero, and compute its value. Namely:
Theorem 60 Let T be a small triangle on which κ1 κ2 is essentially constant. Assume the
sides of this triangle are geodesics and the angles of the triangle are α, β, and γ. Then
\[ \kappa_1 \kappa_2 \sim \frac{(\alpha + \beta + \gamma) - \pi}{(\text{area of the triangle})} \]
In particular, the sphere of radius one has Gaussian curvature one, so we obtain
Theorem 61 Let T be a triangle on the sphere of radius one whose sides are great circles.
Call the angles of the triangle α, β, and γ. Then
area of triangle = α + β + γ − π.
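Theorem 61 can be tested on triangles whose area is known independently. The Python sketch below is an illustration under its own assumptions: vertices are given as unit vectors in R3, the angle at a vertex is measured between the tangent directions of the two great-circle sides leaving it, and the helper names are hypothetical. The octant triangle should have area 4π/8 = π/2, and a triangle with one vertex at the pole and two on the equator separated by longitude λ should have area λ.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def corner_angle(a, b, c):
    """Interior angle at vertex a of the spherical triangle abc on the unit sphere."""
    # Tangent directions at a along the great circles toward b and toward c.
    tb = [bi - _dot(a, b) * ai for ai, bi in zip(a, b)]
    tc = [ci - _dot(a, c) * ai for ai, ci in zip(a, c)]
    cosine = _dot(tb, tc) / math.sqrt(_dot(tb, tb) * _dot(tc, tc))
    return math.acos(max(-1.0, min(1.0, cosine)))

def triangle_area(a, b, c):
    """Area predicted by Theorem 61: alpha + beta + gamma - pi."""
    return (corner_angle(a, b, c) + corner_angle(b, c, a)
            + corner_angle(c, a, b) - math.pi)

octant = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
lam = math.pi / 3
wedge = ((0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (math.cos(lam), math.sin(lam), 0.0))
print(triangle_area(*octant), math.pi / 2)
print(triangle_area(*wedge), lam)
```

Both comparisons agree up to rounding, which is the behavior the theorem predicts for geodesic triangles on the sphere of radius one.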
[Figure: a triangle on the sphere of radius one whose sides are great circles.]
Remark: At the end of chapter four, we computed the geodesic curvature of a latitude
line on the sphere of radius one. The Gauss-Bonnet theorem gives us an alternate way to
do the calculation.
[Figure: the sphere of radius one with a latitude line bounding a cap at the top.]
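Apply the Gauss-Bonnet theorem to the cap at the top of the sphere. All exterior angles
are zero, so
\[ \kappa_g\, (\text{length of latitude}) + \kappa_1 \kappa_2\, (\text{area of cap}) = 2\pi, \]
so that
\[ \kappa_g = \frac{2\pi - (\text{area of cap})}{2\pi \sin \varphi}. \]
In section 4.10 we discovered that g11 = sin² φ, g12 = 0, and g22 = 1. So $\sqrt{g_{11} g_{22} - g_{12}^2}$
equals sin φ and the area of the cap is
\[ \int_0^{2\pi} \int_0^{\varphi} \sin \varphi \ d\varphi\, d\theta = 2\pi\, (1 - \cos \varphi). \]
Therefore
\[ \kappa_g = \frac{2\pi - 2\pi (1 - \cos \varphi)}{2\pi \sin \varphi} = \frac{\cos \varphi}{\sin \varphi}. \]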
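The latitude calculation can be repeated numerically without doing the integral by hand. The Python sketch below is an illustration, with a hypothetical function name and an arbitrary step count: it approximates the cap area by a midpoint rule, inserts it into the Gauss-Bonnet relation for the cap (with κ1 κ2 = 1 and latitude length 2π sin φ), and compares the result with cos φ / sin φ.

```python
import math

def latitude_geodesic_curvature(phi, n=2000):
    """kappa_g of the latitude at polar angle phi on the unit sphere, from
    Gauss-Bonnet: kappa_g * (length of latitude) + (area of cap) = 2 pi."""
    d = phi / n
    # Midpoint rule for the cap area: 2 pi times the integral of sin over [0, phi].
    cap_area = 2 * math.pi * sum(math.sin((i + 0.5) * d) for i in range(n)) * d
    length = 2 * math.pi * math.sin(phi)
    return (2 * math.pi - cap_area) / length

for phi in (0.5, 1.0, math.pi / 2, 2.0):
    print(phi, latitude_geodesic_curvature(phi), math.cos(phi) / math.sin(phi))
```

At φ = π/2 the value is zero up to discretization error, consistent with the equator being a geodesic, and below the equator the sign becomes negative.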
In the early part of the nineteenth century, Bolyai and Lobachevsky independently invented
non-Euclidean geometry. They replaced Euclid’s parallel postulate with a postulate which
asserts that more than one line can be drawn through a point parallel to a given line, and
investigated the consequences. Bolyai and Lobachevsky worked geometrically; their proofs
look very much like Euclidean proofs but their theorems are quite different.
It is unfortunate that the terminology “non-Euclidean geometry” was chosen to describe
the new geometry, since these words imply that any geometry diverging from Euclid is non-
Euclidean. Actually, Bolyai and Lobachevsky discovered that there is only one geometric
object which satisfies their axiom and the remaining Euclidean axioms.
Gauss seems to have independently studied the geometry, and he rapidly realized that
the Gaussian curvature of such a surface would be −1. However, no surface studied in
the nineteenth century modeled the complete geometric object. Each candidate surface
described only a small piece of the non-Euclidean plane.
The situation changed dramatically when Poincare invented the Poincare disk at the end
of the century. This disk is not a surface, but it is a model for the complete non-Euclidean
plane, and permits us to study non-Euclidean geometry using analytic techniques instead
of geometric techniques.
Look back at sections 2.13 and 5.5. These sections sketch proofs of several important
non-Euclidean results which we will use here. We summarize them as follows
Theorem 62 The Poincare disk satisfies:
1. Geodesics in the disk are straight lines through the origin or circles which meet the
boundary at ninety degrees.
3. All points of the disk look the same. If p and q are points, there is a one-to-one, onto
map from the disk to itself which preserves all distances, angles, and geodesics, and
maps p to q.
4. All directions on the disk look the same. If {e1 , e2 } and {f1 , f2 } are oriented bases of
tangent vectors at a point p, there is a one-to-one, onto map from the disk to itself
which preserves all distances, angles, and geodesics, and maps e1 to f1 and e2 to f2 .
5. A non-Euclidean circle looks like a Euclidean circle in the model, except that its center
is at an unexpected place.
6.5. NON-EUCLIDEAN GEOMETRY 127
Proof: The proofs for most of these results were sketched earlier. The last result can
be proved in the following way. Non-Euclidean circles centered at the origin are clearly
Euclidean circles (although the non-Euclidean radius is not the same as the Euclidean
radius). Suppose p is not the origin and we wish to study circles centered at p. There
is an isometry mapping the origin to p. This map preserves distances, and so maps non-
Euclidean circles to other non-Euclidean circles. Formulas for these isometries are given
in section 2.13. By Lemma 3 in that section, these isometries map Euclidean lines and
circles to Euclidean lines and circles. A non-Euclidean circle at the origin is a Euclidean
circle, so its image under the isometry, which is known to be a non-Euclidean circle, is also
a Euclidean circle. The result follows. QED.
Theorem 63 Suppose a worker stands at a point p in the non-Euclidean plane. Near p the
geometry is almost Euclidean; discrepancies arise only when the worker moves away from
p. The geodesic starting at p in the direction X continues forever and reaches distances
arbitrarily far away. Every point of the plane can be seen by looking out in exactly one
direction. The topology of the non-Euclidean plane is exactly the same as the topology of
the Euclidean plane. In all of these respects, non-Euclidean geometry works exactly like
Euclidean geometry, with no surprises.
Proof: The theorem follows from the philosophy of differential geometry, from the obvious
homeomorphism carrying the open unit disk to the plane, and from the fact proved earlier
that distances to the boundary along straight lines from the origin are infinite. QED.
We come next to the surprising results of the subject. Below are analytic proofs of some
of these amazing results, proved synthetically by Bolyai and Lobachevsky:
π − (α + β + γ)
2π − (α + β + γ + δ)
(n − 2)π − (α1 + . . . + αn )
2. The area of a figure with n sides can never be larger than (n − 2)π.
3. There are figures with n sides which have area arbitrarily close to (n − 2)π.
Proof: As before. QED.
\[ C = 2\pi \sinh R = 2\pi \left( R + \frac{R^3}{3!} + \frac{R^5}{5!} + \ldots \right) \]
\[ A = 4\pi \sinh^2 (R/2) = 2\pi\, (\cosh R - 1) = 2\pi \left( \frac{R^2}{2!} + \frac{R^4}{4!} + \ldots \right) \]
\[ \kappa_g = \frac{\cosh R}{\sinh R} = \frac{1}{R} \left( 1 + \frac{R^2}{3} + \ldots \right) \]
This number is approximately 1/R for small R.
It suffices to prove the last three results. Since any circle can be mapped by an isometry
to a circle about the origin, it suffices to study circles about the origin. Such a circle looks
like a Euclidean circle of Euclidean radius r. We shall compute its non-Euclidean radius R.
The curve γ(t) = (t, 0) for 0 ≤ t ≤ r is a radial curve and its non-Euclidean length is
\[ R = \int_0^r \frac{2\, dt}{1 - t^2} = \int_0^r \left( \frac{1}{1 - t} + \frac{1}{1 + t} \right) dt = \ln \frac{1 + r}{1 - r} \]
Solving this equation for r gives
\[ r = \frac{e^R - 1}{e^R + 1}. \]
Next we find the non-Euclidean circumference of this circle. The circle can be parameterized
as (r cos t, r sin t), so its length is
\[ \int_0^{2\pi} \frac{2 \sqrt{\left( \frac{dx}{dt} \right)^2 + \left( \frac{dy}{dt} \right)^2}}{1 - (x^2 + y^2)}\, dt = \int_0^{2\pi} \frac{2r}{1 - r^2}\, dt = \frac{2r}{1 - r^2}\, 2\pi. \]
Substitution of our previous formula for r in terms of R gives
\[ \text{length} = 2\pi \cdot 2\, \frac{e^R - 1}{e^R + 1} \Big/ \left( 1 - \left( \frac{e^R - 1}{e^R + 1} \right)^2 \right) = 2\pi\, \frac{e^R - e^{-R}}{2} = 2\pi \sinh R. \]
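By the area theorem, the non-Euclidean area of the circle is
\[ \int_0^{2\pi} \int_0^r \frac{4\rho}{(1 - \rho^2)^2}\ d\rho\, d\theta. \]
This simplifies to
\[ 2\pi\, \frac{2 r^2}{1 - r^2}. \]
Substituting r = (e^R − 1)/(e^R + 1) turns this expression into 2π (cosh R − 1).
Recall that cosh R = cosh(R/2 + R/2) = cosh²(R/2) + sinh²(R/2). Since cosh² x − sinh² x =
1, we obtain cosh R = 1 + 2 sinh²(R/2) and the formula for area follows.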
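To obtain the geodesic curvature of a circle of radius R, we apply the Gauss-Bonnet theorem
to this circle. The circle has no exterior angles, so the theorem states that
\[ \kappa_g\, (\text{length of circle}) + \kappa_1 \kappa_2\, (\text{area of circle}) = 2\pi \]
or, since κ1 κ2 = −1,
\[ \kappa_g\, (2\pi \sinh R) - 2\pi\, (\cosh R - 1) = 2\pi. \]
So
\[ \kappa_g = \frac{\cosh R}{\sinh R}. \]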
Compare this result with the analogous result on the sphere. QED.
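The three circle formulas can be cross-checked numerically from the Euclidean radius r. The Python sketch below is an illustration only, with a hypothetical function name: it computes R, the circumference, and the area from the expressions in r used in the proof, obtains κg from the Gauss-Bonnet relation with curvature −1, and the results can then be compared with 2π sinh R, 2π (cosh R − 1), and cosh R / sinh R.

```python
import math

def hyperbolic_circle(r):
    """For the circle of Euclidean radius r (0 < r < 1) about the origin of the
    Poincare disk, return (R, circumference, area, kappa_g)."""
    R = math.log((1 + r) / (1 - r))                   # non-Euclidean radius
    circumference = 2 * math.pi * 2 * r / (1 - r * r)
    area = 2 * math.pi * 2 * r * r / (1 - r * r)
    # Gauss-Bonnet with curvature -1 and no corners: kappa_g * C - A = 2 pi.
    kappa_g = (2 * math.pi + area) / circumference
    return R, circumference, area, kappa_g

for r in (0.1, 0.5, 0.9):
    R, C, A, k = hyperbolic_circle(r)
    print(R, C - 2 * math.pi * math.sinh(R),
          A - 2 * math.pi * (math.cosh(R) - 1),
          k - math.cosh(R) / math.sinh(R))
```

The printed differences are zero up to rounding. For small R the curvature is close to 1/R, as for a Euclidean circle, while for large R it approaches 1 instead of 0.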
Remark: Finally let us study parallel lines.
In Euclidean geometry, we often apply the following construction: Given a line l and a point
q ∈ l, draw a perpendicular to l at q and extend it to a point p. Draw a perpendicular to
this line segment at p, obtaining a new line parallel to l. See the left picture below.
This construction still works in non-Euclidean geometry. Since the Poincare disk has a large
number of isometries, it is enough to study the situation when q is the origin and l is the
x-axis. The right picture below shows the non-Euclidean version of the construction.
[Figure: the perpendicular construction in the Euclidean plane (left) and in the Poincare disk (right).]
But in non-Euclidean geometry there are infinitely many other lines through p parallel to
l. See the picture below, and notice that there are two particular lines that are just barely
parallel. Lobachevsky called these lines the limiting parallels.
At p, there is an angle Φ such that all lines drawn with angle 0 ≤ θ < Φ meet l, and all
lines drawn with angle Φ ≤ θ ≤ π/2 are parallel to l. The angle Φ depends on the distance d
from p to q. If this distance is small, the angle Φ is almost π/2 and two-dimensional workers
may not even notice that more than one parallel can be drawn. But when the distance is
larger, the angle Φ approaches zero and it becomes evident that there are many possible
parallel lines.
Theorem 68 The angle Φ is given in terms of the non-Euclidean distance d from q to p
by
\[ \tan \frac{\Phi}{2} = e^{-d}. \]
For small d we have
\[ \Phi = \frac{\pi}{2} - d + \frac{1}{6} d^3 - \frac{1}{24} d^5 + \frac{61}{5040} d^7 + \ldots \]
For distances smaller than 1/10, the divergence of Φ from π/2 is negligible, as shown by the
plot below. The divergence becomes significant for d close to 1, and the angle Φ is so small
for d ≥ 6 that non-parallel lines are rare.
[Plot: the angle Φ as a function of the distance d.]
Proof: The picture on the left below shows that the center of the circle which gives the
limiting parallel in the Poincare model has x-coordinate one (since it meets the boundary
of the model perpendicularly at (1, 0)). Call the radius of this circle a.
[Figure: the limiting parallel through p, with the angle Φ at p and the point q (left); the same picture with two additional dotted lines (right).]
Draw the two additional dotted lines indicated on the right, and notice that these lines
meet at an angle Φ because each dotted line is perpendicular to a side of the original angle
Φ. The side opposite this new angle is the radius of the large circle minus the Euclidean
distance from q to p. If this Euclidean distance is called r, then
\[ \sin \Phi = \frac{a - r}{a}. \]
By the Pythagorean theorem applied to the triangle with dotted sides, we have (a − r)² + 1² =
a² and so a = (r² + 1)/(2r). Consequently
\[ \sin \Phi = \frac{1 - r^2}{1 + r^2}. \]
Then
\[ \cos^2 \Phi = 1 - \sin^2 \Phi = \frac{(1 + r^2)^2 - (1 - r^2)^2}{(1 + r^2)^2} = \frac{4 r^2}{(1 + r^2)^2} = \left( \frac{2r}{1 + r^2} \right)^2 \]
and so
\[ \tan \frac{\Phi}{2} = \frac{\sin \Phi}{1 + \cos \Phi} = \frac{(1 - r^2)/(1 + r^2)}{1 + 2r/(1 + r^2)} = \frac{(1 - r)(1 + r)}{(1 + r)(1 + r)} = \frac{1 - r}{1 + r} \]
Earlier in this section we discovered that d = ln((1 + r)/(1 − r)). Therefore
\[ \tan \frac{\Phi}{2} = \frac{1 - r}{1 + r} = e^{-d}. \]
QED.
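The construction in this proof translates directly into a short computation. The Python sketch below is an illustration with hypothetical function names: it converts d to the Euclidean distance r = (e^d − 1)/(e^d + 1), which equals tanh(d/2), forms the radius a = (r² + 1)/(2r), recovers Φ from sin Φ = (a − r)/a, and the result can be compared with the closed form tan(Φ/2) = e^(−d) and with the truncated series of Theorem 68.

```python
import math

def angle_of_parallelism(d):
    """Angle Phi for non-Euclidean distance d > 0, following the construction
    in the proof of Theorem 68."""
    r = math.tanh(d / 2)             # Euclidean distance: (e^d - 1)/(e^d + 1)
    a = (r * r + 1) / (2 * r)        # radius of the circle giving the limiting parallel
    return math.asin((a - r) / a)    # sin(Phi) = (a - r)/a, with 0 < Phi < pi/2

def series_approximation(d):
    """Truncated series for Phi, valid for small d."""
    return math.pi / 2 - d + d**3 / 6 - d**5 / 24 + 61 * d**7 / 5040

for d in (0.1, 1.0, 3.0):
    phi = angle_of_parallelism(d)
    print(d, phi, math.tan(phi / 2), math.exp(-d))
```

For d = 0.1 the series and the exact value agree to many digits, while for d near 1 the truncation error of the series becomes visible and the closed form should be used instead.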
In the first section of this chapter, we compared differential geometry to Book I of Euclid.
There is one portion of Euclid which we did not discuss. That is the section on congruent
triangles, starting with proposition 8.
The first proposition in this section is the familiar side-angle-side congruence theorem.
Euclid proves this proposition by superposition. He tells us to move the first triangle until
the two sides and included angle of the moved triangle lie on top of the corresponding sides
and angle of the second triangle. Then, he says, the remaining side and angles clearly also
match.
It is easy to see that this proof fails for surfaces. For instance, draw a right angle on the
outside rim of a doughnut where the Gaussian curvature is positive. Extend each side of the
angle by a small distance d, forming a small triangle. Since the curvature is positive, the sum of
the remaining angles will be greater than π/2. Draw the same right angle and sides on the
inside rim of the doughnut where the Gaussian curvature is negative. Then the sum of the
remaining angles will be smaller than π/2. So side-angle-side fails on a doughnut.
The problem is that there is no isometry from a doughnut to itself carrying the outside
triangle to the inside one. So Euclid’s superposition proof does not work.
Curiously, Euclid doesn’t use superposition again, although many other congruence theo-
rems could be proved that way. Euclid probably was unhappy with his proof of Proposition
8, and later authors took Proposition 8 on side-angle-side to be another axiom. The mod-
ern approach is to replace this axiom with the principle of superposition itself. From this
point of view, propositions 8 through 26 of Euclid follow from requiring that the surface
has many isometries, where
Definition 24 We say a surface has sufficiently many isometries if
1. whenever p and q are points of the surface, there is an isometry of the surface which
carries p to q
2. whenever {e1, e2} and {f1, f2} are orthonormal bases of tangent vectors at p, there
is an isometry carrying e1 to f1 and e2 to f2 .
6.7. COMPACT SURFACES 135
Remark: It is possible to prove that there are only four surfaces with sufficiently many
isometries, up to magnification:
2. the sphere
3. the sphere with opposite points identified, usually called projective space
Thus differential geometry is a completely natural complement to Euclid, and the results
we have proved recently are natural theorems in a completion of Euclid’s work.
The topologists have proved a glorious theorem about the surfaces of doughnuts with g
holes. The theorem was first proved by Euler for surfaces which can be deformed to spheres,
and later extended by Poincare to the general case.
Each of these objects looks like a sphere, so the Euler-Poincare number should equal two.
On a cube, there are 8 vertices, 12 edges, and 6 faces, and 8 − 12 + 6 = 2. On a tetrahedron,
there are 4 vertices, 6 edges, and 4 faces, and 4 − 6 + 4 = 2. Etc.
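Counts of this kind are easy to automate. The Python sketch below is an illustration with hypothetical names: a dissection is given as a list of faces, each face being a cyclic list of vertex labels, and the function returns (vertices) − (edges) + (faces). Besides the cube and the tetrahedron it builds a grid of quadrilaterals with opposite sides identified, a doughnut with one hole, for which the count should be 2 − 2g = 0.

```python
def euler_number(faces):
    """V - E + F for a dissection given as a list of faces, each face a
    cyclic list of vertex labels."""
    vertices = {v for face in faces for v in face}
    edges = set()
    for face in faces:
        for i, v in enumerate(face):
            w = face[(i + 1) % len(face)]
            edges.add(frozenset((v, w)))   # an edge is an unordered pair of vertices
    return len(vertices) - len(edges) + len(faces)

# Cube vertices are numbered 0..7 by their binary coordinates.
CUBE = [[0, 1, 3, 2], [4, 5, 7, 6], [0, 1, 5, 4],
        [2, 3, 7, 6], [0, 2, 6, 4], [1, 3, 7, 5]]
TETRAHEDRON = [[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]]

def torus_grid(n=3):
    """Faces of an n-by-n grid of quadrilaterals with opposite sides
    identified (n >= 3), which dissects a doughnut with one hole."""
    return [[(i, j), ((i + 1) % n, j), ((i + 1) % n, (j + 1) % n), (i, (j + 1) % n)]
            for i in range(n) for j in range(n)]

print(euler_number(CUBE), euler_number(TETRAHEDRON), euler_number(torus_grid()))
```

Refining the grid changes the individual counts but not the alternating sum, which is the invariance the theorem asserts.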
Proof: Cut the doughnut across each hole, as illustrated on the next page. The cuts
should follow edges of the dissection, which is not shown in the pictures. Notice that
around the hole there are n vertices and n edges, and after the cut the number of such
vertices and such edges doubles. But these terms cancel and the Euler-Poincare number
does not change. When we are done, we have an object which looks like a sphere with 2g
disks removed.
Each hole in the sphere is surrounded by n edges. Fill in each of these holes with a disk
cut into n triangles as illustrated below. Notice that each such disk adds 1 vertex, n edges,
and n faces to the dissection of the surface, and thus increases the Euler-Poincare number
by one. Since there are 2g holes to be filled in, the Euler-Poincare number will increase by
2g. Thus it was originally 2 − 2g if and only if it is 2 after filling in the holes.
The object now looks like a sphere. So it suffices to prove the theorem in the special case
when we have a sphere.
Without loss of generality, we can suppose that each of the faces is a triangle. For if a face
has n edges, subdivide it as illustrated below, and notice that the subdivision added n edges,
n − 1 new faces, and 1 new vertex, so the Euler-Poincare number did not change.
Remove one face and notice that the resulting object can be flattened down into the
plane, possibly by greatly distorting the shapes of the faces and edges. The original Euler-
Poincare number will have been 2 exactly if the new number is 1, so we want to prove that
the Euler-Poincare number of the resulting mass of triangles is one.
Begin removing triangles one by one from the new object. There are three ways to do
this, as illustrated on the next page. Note that none of these methods changes the Euler-
Poincare number. In the end, we will have one triangle left, and the Euler-Poincare number
will be 3 − 3 + 1 = 1, as desired. QED.
Remark: There is a variant of the Gauss-Bonnet theorem for surfaces. The variant is a
consequence of the original Gauss-Bonnet theorem and the above result of Euler-Poincare,
as we shall see.
Theorem 70 (Gauss-Bonnet) Let S be a surface in the shape of a doughnut with g
holes. Then
\[ \frac{1}{2\pi} \iint_S \kappa_1 \kappa_2 = 2 - 2g. \]
However, the geodesic curvature must be computed using normals which point into the
triangle. Each edge of the above decomposition appears twice, bounding two triangles. In
one appearance the normal points one way and in the other it points the other way, so the
integrals involving κg cancel.
We are left with
\[ \frac{1}{2\pi} \iint_S \kappa_1 \kappa_2 = \frac{1}{2\pi} \sum_{\text{triangles}} (\alpha + \beta + \gamma - \pi) \]
The sum over triangle angles is supposed to be computed by summing the angles of each
triangle, and then summing these numbers over triangles. But it could also be computed
by summing the angles which meet in a particular vertex, and then summing over vertices.
When computed this way, the sum is 2π(number of vertices). The sum over triangles of π
gives π(number of triangles). So the above number equals
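\[ (\text{number of vertices}) - \frac{1}{2}(\text{number of triangles}). \]
Each triangle has three edges, so the number of edges appears to be
3 (number of triangles).
But this counts each edge twice because an edge bounds two triangles. So
\[ \frac{3}{2}(\text{number of triangles}) = (\text{number of edges}). \]
Hence
\[ \frac{1}{2}(\text{number of triangles}) = (\text{number of edges}) - (\text{number of triangles}). \]
So $\frac{1}{2\pi} \iint_S \kappa_1 \kappa_2$, which was earlier proved to equal
\[ (\text{number of vertices}) - \frac{1}{2}(\text{number of triangles}), \]
also equals
\[ (\text{number of vertices}) - (\text{number of edges}) + (\text{number of triangles}) = 2 - 2g. \]
QED.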
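The counting in this proof can be tried on concrete triangulations. The sketch below is only an illustration: the helper names (`euler_counts`, `octahedron`, `torus`) and the particular triangulations are assumptions made for this example, not constructions from the notes. It counts vertices, edges, and triangles for an octahedron (a sphere, g = 0) and for a grid torus (g = 1), and checks both the relation 3(triangles) = 2(edges) and the value 2 − 2g.

```python
from itertools import combinations


def euler_counts(triangles):
    """Return (V, E, F) for a closed surface given as a list of vertex triples."""
    vertices = {v for tri in triangles for v in tri}
    # Each triangle contributes its three sides; the set removes double counting.
    edges = {frozenset(pair) for tri in triangles for pair in combinations(tri, 2)}
    return len(vertices), len(edges), len(triangles)


def octahedron():
    """Triangulated sphere: poles 0 and 5, equatorial vertices 1..4."""
    equator = [1, 2, 3, 4]
    triangles = []
    for i in range(4):
        a, b = equator[i], equator[(i + 1) % 4]
        triangles.append((0, a, b))
        triangles.append((5, a, b))
    return triangles


def torus(n=4, m=5):
    """Triangulated torus: an n-by-m grid of squares with opposite sides glued,
    each square cut into two triangles (n and m should be at least 3)."""
    def vid(i, j):
        return (i % n) * m + (j % m)

    triangles = []
    for i in range(n):
        for j in range(m):
            a, b = vid(i, j), vid(i + 1, j)
            c, d = vid(i, j + 1), vid(i + 1, j + 1)
            triangles.append((a, b, d))
            triangles.append((a, d, c))
    return triangles


def euler_number(triangles):
    v, e, f = euler_counts(triangles)
    return v - e + f


if __name__ == "__main__":
    for name, surface in [("octahedron", octahedron()), ("torus", torus())]:
        v, e, f = euler_counts(surface)
        print(name, "V, E, F =", (v, e, f), " 3F == 2E:", 3 * f == 2 * e,
              " V - E + F =", v - e + f)
```

In both cases V − F/2 agrees with V − E + F, which is the identity the proof uses.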
Theorem 71 Every compact surface in R3 contains points where the Gaussian curvature is
positive. Every compact surface in R3 except the sphere contains points where the Gaussian
curvature is negative.
Proof: If a surface is compact, there is a point p on the surface whose distance from
the origin is a maximum. Consequently the entire surface is inside the sphere of radius
R = ||p||, and this sphere touches the surface at p. It follows that the principal curvatures
κ1 and κ2 at p must be at least 1/R in absolute value and have the same sign. So the
Gaussian curvature is positive at p.
By the Gauss-Bonnet theorem, $\frac{1}{2\pi} \iint_S \kappa_1 \kappa_2 = 2 - 2g$; this number is zero or negative unless
g = 0 and the surface is topologically a sphere. Since κ1 κ2 is continuous and sometimes
positive, it must also be sometimes negative. QED.
We return to Euclid for a final time. Euclid built geometry on a small number of axioms.
Once we have the machinery of differential geometry, we can replace his axioms by a series
of equivalent but more precise axioms. For instance, Euclid assumes that we can draw
straight lines and can determine whether line segments are congruent and whether angles
are congruent. The analogous modern assumption is that a geometry is a two-dimensional
surface (not necessarily in R3 ) and a metric tensor gij on this surface. Given this informa-
tion, we can measure lengths of curves and angles, and thus determine congruence for such
objects. We can draw geodesics, and thus speak of straight lines in Euclid’s sense.
A long section of Book 1 of Euclid is about congruent triangles. In section 6.6, we argued
that Euclid’s proof of the congruence theorems ultimately relies on a superposition argu-
ment whose modern equivalent is the existence of a large number of surface isometries. We
want to expand on that idea.
In some sense, the congruence theorems (and analogous axioms requiring a large number
of isometries) are at heart assertions that space is homogeneous — that geometry near one
point p is the same as geometry near another point q. This axiom is the opposite of the
ancient belief that the earth is the center of the universe, or that a sacred city (Rome,
Jerusalem, or Mecca) is the center of the world. According to the congruence-isometry
axiom, all points are on an equal footing.
We’d like to convert this philosophy into mathematics. We are going to ignore the un-
fortunate discovery that space is curved due to gravity by different amounts at different
locations and thus is not homogeneous.
According to the isometry axiom, whenever p and q are points, there is a one-to-one and
onto map from the surface to itself preserving all distances, angles, and geodesics, and
carrying p to q. This requirement is very restrictive and somewhat implausible as an
axiom. For instance, if we believe that geometry is the same near the earth and near
the star Alpha Centauri, the axiom requires us to move the earth to Alpha Centauri and
simultaneously move all of the stars in the universe to new positions!
A better axiom would require local motion only, along the lines of the following defini-
tion:
Definition 25 We say that a surface is locally homogeneous if whenever p and q are
points, there are open neighborhoods U and V of p and q and a one-to-one, onto $C^\infty$ map
from U to V, with $C^\infty$ inverse, preserving the metric gij .
Remark: In particular, all geometric notions are preserved by the local map, so the
Gaussian curvature will be the same at p and q. Since p and q are arbitrary, the Gaussian
curvature will be constant on a locally homogeneous surface.
In the final chapter of these notes, we will prove that conversely if the Gaussian curvature
is constant, then any pair of points p and q have neighborhoods which are isometric. So a
useful replacement for Euclid’s congruence axiom is the requirement that the surface have
constant curvature.
So much for “philosophy.” In the next section, we convert this philosophy into interesting
mathematics.
Suppose S is a surface with constant curvature and metric tensor gij . Let us magnify
this surface by a factor m > 0. To do so, multiply all lengths by m and leave all angles
unchanged. It is easy to see that this can be done by multiplying gij by $m^2$.
Theorem 72 If all distances on a surface are multiplied by m, then the Gaussian curvature
is multiplied by $\frac{1}{m^2}$.
Proof: Turn back to the formula for $\Gamma^k_{ij}$ on page 52. Notice that the terms of $g^{-1}$ are
multiplied by $\frac{1}{m^2}$ while the derivatives of the gij are multiplied by $m^2$, and thus $\Gamma^k_{ij}$ is unchanged.
Turn to the formula for the curvature tensor on page 126. Notice that this tensor is multi-
plied by $m^2$. Turn finally to the formula for Gaussian curvature in terms of the curvature
tensor on page 127. Notice that $R_{1212}$ is multiplied by $m^2$ and det(gij ) is multiplied by $m^4$
and consequently κ1 κ2 is multiplied by $\frac{1}{m^2}$. QED.
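As a quick sanity check of Theorem 72, consider the unit sphere; this example is an illustration added here, not part of the argument above. In spherical coordinates the unit sphere has metric $d\varphi^2 + \sin^2\varphi\, d\theta^2$ and Gaussian curvature 1. Multiplying all lengths by m multiplies the metric by $m^2$ and produces the sphere of radius m, whose principal curvatures both have absolute value 1/m:

```latex
\[
  ds^2 = m^2 \left( d\varphi^2 + \sin^2\varphi \, d\theta^2 \right),
  \qquad
  \kappa_1 \kappa_2 = \frac{1}{m} \cdot \frac{1}{m} = \frac{1}{m^2}.
\]
```

So the curvature drops from 1 to $1/m^2$, as the theorem predicts.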
Theorem 73 Suppose S is a compact oriented surface, and thus a doughnut with g holes.
We do not assume that this surface is embedded in R3 .
1. If the surface has a metric of positive constant curvature, then g = 0 and the surface
is a sphere. If the surface is magnified to make its curvature 1, then its area is 4π.
2. If the surface has a metric of zero constant curvature, then g = 1 and the surface is
a doughnut.
3. If the surface has a metric of constant negative curvature, then g ≥ 2 and the surface
is a doughnut with at least two holes. If the surface is magnified to make its curvature
-1, then its area is 4π(g − 1).
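Proof: This follows immediately from the Gauss-Bonnet theorem. For instance, if the
curvature is 1, then
\[ \frac{1}{2\pi} \iint_S \kappa_1 \kappa_2 = \frac{1}{2\pi} (\text{area of surface}) = 2 - 2g, \]
so 2 − 2g is positive, which forces g = 0 and makes the area 4π. The other cases are
similar. QED.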
Proof: The first result is clear since the unit sphere has curvature 1.
We give two arguments for the torus. Here is the first. Form a torus from a unit square by
glueing the top and bottom together to form a cylinder, and then glueing the left and right
together to form a doughnut. Give the square the standard flat Euclidean metric.
We must make sure that each point has a Euclidean neighborhood after the glueing is
complete. This is clear for interior points. It is clear for boundary points from the following
picture.
Finally, all four corners of the square glue to become a single point, and this point has a
Euclidean neighborhood because the four corner angles, each of size π/2, glue together to
form a neighborhood with total angle 2π as illustrated below.
Here is a second argument for the torus. The torus can be embedded in R3 by the map
illustrated on page 30:
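Here 0 ≤ ϕ ≤ 2π and 0 ≤ θ ≤ 2π. However, we can also embed the torus in R4 , and in
a considerably easier manner:
\[ s(\theta, \varphi) = (\cos\theta, \sin\theta, \cos\varphi, \sin\varphi). \]
Then
\[ \frac{\partial s}{\partial \theta} = (-\sin\theta, \cos\theta, 0, 0) \qquad \frac{\partial s}{\partial \varphi} = (0, 0, -\sin\varphi, \cos\varphi) \]
and so
\[ g_{11} = \frac{\partial s}{\partial \theta} \cdot \frac{\partial s}{\partial \theta} = 1 \qquad g_{12} = \frac{\partial s}{\partial \theta} \cdot \frac{\partial s}{\partial \varphi} = 0 \qquad g_{22} = \frac{\partial s}{\partial \varphi} \cdot \frac{\partial s}{\partial \varphi} = 1. \]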
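The flatness of this torus in R4 can also be checked numerically. The following is a minimal sketch, assuming the embedding s(θ, ϕ) = (cos θ, sin θ, cos ϕ, sin ϕ); the function names and the finite-difference step are choices made for this illustration. It approximates the two partial derivatives by central differences and forms the three dot products.

```python
import math


def s(theta, phi):
    """Assumed embedding of the torus in R^4."""
    return (math.cos(theta), math.sin(theta), math.cos(phi), math.sin(phi))


def partial(theta, phi, variable, h=1e-6):
    """Central-difference approximation of ds/dtheta or ds/dphi."""
    if variable == "theta":
        plus, minus = s(theta + h, phi), s(theta - h, phi)
    else:
        plus, minus = s(theta, phi + h), s(theta, phi - h)
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def metric(theta, phi):
    """Return (g11, g12, g22) at the point with parameters (theta, phi)."""
    s_theta = partial(theta, phi, "theta")
    s_phi = partial(theta, phi, "phi")
    return dot(s_theta, s_theta), dot(s_theta, s_phi), dot(s_phi, s_phi)


if __name__ == "__main__":
    for point in [(0.0, 0.0), (1.0, 2.5), (4.0, 0.3)]:
        print(point, metric(*point))
```

At every sample point the output is (1, 0, 1) up to rounding, so the induced metric is the flat Euclidean one.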
Finally, here is an argument for surfaces with g ≥ 2 holes. The topologists have proved that
every such surface can be constructed from a regular polygon with 4g sides by identifying
corresponding sides as illustrated below. The pictures on the next page show this process
for a doughnut with g = 2. In particular, all vertices of the polygon glue to the same point
in the surface.
[Figure: octagon whose sides are labeled a1, b1, a2, b2, each label appearing twice.]
Flatten this object so the remaining doughnut tube sticks out toward us:
We will attempt to glue the sides of our polygon together so the resulting surface inher-
its a geometry with constant curvature. Let’s first try to do that using flat Euclidean
geometry.
[Figure: octagon whose sides are labeled a1, b1, a2, b2, each label appearing twice.]
After glueing, every point inside the polygon or on the sides has a Euclidean neighborhood
as in the earlier torus example. But there is a problem at the point obtained by glueing all
vertices of the polygon together because the sum of the interior angles of the polygon at
the vertices adds up to more than 2π. To fix this problem, we will construct the polygon
in non-Euclidean geometry rather than in the plane. Consider the polygon at the center
of the picture on the next page. Each side of this polygon is a geodesic. We construct the
polygon so these sides all have the same length. Then after gluing each point which comes
from the inside of the polygon and each point which comes from a boundary line has a
non-Euclidean neighborhood.
We must check that the vertex angles sum to 2π. We do this by adjusting the size of the
polygon. If the polygon is very small as on the left below, the sum of the angles of the
polygon will be close to the corresponding sum for Euclidean polygons, which is larger than
2π. If the polygon touches the boundary as in the middle below, then all angles will be zero
and the angle sum is zero. Somewhere between these extremes, there is a non-Euclidean
polygon whose vertex angle sum is exactly 2π. QED.
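The intermediate value argument can be made concrete for regular polygons. The sketch below relies on a standard identity of hyperbolic trigonometry that is not derived in these notes and is used here as an assumption: in a hyperbolic right triangle (curvature −1) with hypotenuse c and acute angles A and B, cosh c = cot A cot B. Cutting a regular n-gon with circumradius r into 2n such triangles (angle π/n at the center, half an interior angle at a vertex) gives the interior angle, and bisection then finds the radius at which the angle sum is exactly 2π. The function names are hypothetical.

```python
import math


def angle_sum(n, r):
    """Interior angle sum of a regular hyperbolic n-gon with circumradius r.

    Uses cosh(r) = cot(pi/n) * cot(alpha/2) for the right triangle formed by
    the center, a vertex, and the midpoint of an adjacent side.
    """
    alpha = 2.0 * math.atan(1.0 / (math.tan(math.pi / n) * math.cosh(r)))
    return n * alpha


def radius_for_angle_sum(n, target=2.0 * math.pi, lo=1e-9, hi=50.0, steps=200):
    """Bisection: the angle sum decreases from (n - 2) * pi toward 0 as r grows."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if angle_sum(n, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    g = 2
    n = 4 * g
    r = radius_for_angle_sum(n)
    print("circumradius:", r, " angle sum / pi:", angle_sum(n, r) / math.pi)
```

For g = 2 the small-polygon limit of the angle sum is 6π and the large-polygon limit is 0, so a radius with angle sum exactly 2π lies in between, as the proof asserts.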
6.10 Frames
This section and the remaining sections of the chapter yield a proof of the Gauss-Bonnet
theorem. Several interesting ideas arise along the way.
Frames on surfaces are the analogues of the moving frame on a curve. But in surface
theory, frames are not unique, so some mathematicians avoid them. We will see that
frames are extremely useful.
On the surface of the earth, there is a natural frame at each point except the north and
south poles: e1 points toward the rising sun in the east, and e2 points toward the axis
of rotation at the north pole. This frame yields infinitesimal orthonormal coordinates, so
inhabitants of the earth learn the Pythagorean theorem in school. But in the large, the
frame behaves in an unexpected way due to the curvature of the earth. If occupants follow
the e1 ’s, their journeys are longer than necessary because paths along the e1 ’s are not
geodesics. And if two occupants start one horizontal mile apart and follow e2 ’s north, they
find themselves closer and closer together over time.
Imagine that the south pole were inhabitable. There is no natural framing there, but
it would be inevitable that the government would establish a framing so farms could be
divided into rectangular plots and efficient roads could be established.
Remark: Fix a framing. Suppose that γ(s) is a path parameterized by arc length. At
each time s, γ′(s) is a unit vector and thus
\[ \gamma'(s) = \cos\theta(s)\, e_1 + \sin\theta(s)\, e_2 \]
for an angle θ(s) determined up to a multiple of 2π. If we pick a starting angle, there is
clearly a unique way to extend θ(s) continuously to the entire curve.
The pictures below show this process in action. The orthonormal basis {e1 , e2 } may not
look orthonormal to us, so we need the gij to compute θ(s).
Choose angles θ1 (s) along γ1 (s) so γ1′(s) = cos θ1 e1 + sin θ1 e2 . At s1 , γ1′ makes an angle
θ1 (s1 ) with the frame and γ2′ makes an angle θ2 (s1 ) with the frame. Call the exterior angle
at this point θ1 . Clearly we can uniquely choose θ2 (s) so θ2 (s1 ) − θ1 (s1 ) equals the exterior
angle θ1 .
Continue this process completely around the boundary. The frames allow us to assign
tangent angles at each point along the boundary curves. This is a great aid in the proof
of the Gauss-Bonnet theorem.
Remark: Define ∆θi (s) = θi (si ) − θi (si−1 ). This number is the amount that the tangent
angle of the curve γi increases from beginning to end. Notice that this increase is partly
due to the curvature of the curve γi and partly due to turning of the frame {e1 , e2 }.
Disentangling these two causes will occupy us in the next few sections. The following theorem
shows that, modulo an analysis of ∆θi (s), we are close to the Gauss-Bonnet theorem:
Theorem 75 The angle changes ∆θi (s) and exterior angles θi satisfy the equation
\[ \sum_i \theta_i + \sum_i \Delta\theta_i(s) = 2\pi. \]
Proof: The total change in θ must be an integer multiple of 2π because the curve returns
to its starting point.
Gradually shrink the boundary curve to a circle. The total change in θ will change contin-
uously under this deformation. Since the total change is always an integer multiple of 2π,
it cannot change continuously unless it is constant. So the total change in θ must equal the
total change for a small circle. But when we shrink small enough, the frame will become
essentially constant, and the theorem reduces to the assertion that the angle increases by
2π as we go around a classical Euclidean circle. QED.
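In the Euclidean plane with a constant frame, this statement can be tested directly on polygons, where the whole change in θ is concentrated in the exterior angles. The sketch below is an illustration under that assumption; the function name and the two sample curves (an ellipse and a simple non-convex closed curve, both traversed counterclockwise) are choices made here.

```python
import math


def total_turning(points):
    """Sum of the exterior angles of a closed polygon given as (x, y) vertices."""
    n = len(points)
    total = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        x2, y2 = points[(i + 2) % n]
        incoming = math.atan2(y1 - y0, x1 - x0)
        outgoing = math.atan2(y2 - y1, x2 - x1)
        # Wrap the turn into [-pi, pi) so each exterior angle is the small one.
        total += (outgoing - incoming + math.pi) % (2.0 * math.pi) - math.pi
    return total


def ellipse(n=400):
    return [(2.0 * math.cos(2.0 * math.pi * k / n),
             math.sin(2.0 * math.pi * k / n)) for k in range(n)]


def wavy_curve(n=600):
    """Simple closed curve r = 2 + cos(3t), which has concave stretches."""
    pts = []
    for k in range(n):
        t = 2.0 * math.pi * k / n
        r = 2.0 + math.cos(3.0 * t)
        pts.append((r * math.cos(t), r * math.sin(t)))
    return pts


if __name__ == "__main__":
    print(total_turning(ellipse()) / math.pi)      # total turning in units of pi
    print(total_turning(wavy_curve()) / math.pi)
```

Both totals equal 2π even though the second curve turns left and right along the way, which is the deformation invariance used in the proof.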
This will be proved using Green’s theorem. The trick is to find an appropriate vector
field so the integral of the field over the boundary gives $\sum \int_{\gamma_i} \kappa_g$ and the integral over the
interior gives $\iint_R \kappa_1 \kappa_2$. This vector field (which we will call a one-form) will be defined
in future sections. First we’d like to review Green’s theorem and introduce the notation
we will be using.
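Definition 27 A one-form ω on a coordinate system is an expression
\[ \omega = \omega_1(u, v)\, du + \omega_2(u, v)\, dv \]
Definition 28 A two-form Ω on a coordinate system is an expression
\[ \Omega = \Omega_{12}(u, v)\, du \wedge dv \]
More explicitly, for a curve γ(t) = (u(t), v(t)) with a ≤ t ≤ b,
\[ \int_\gamma \omega = \int_a^b \left( \omega_1(u(t), v(t)) \frac{du}{dt} + \omega_2(u(t), v(t)) \frac{dv}{dt} \right) dt. \]
Definition 30 Let Ω be a two-form and let R be a region in the plane. The surface integral
of Ω over R is defined to be
\[ \iint_R \Omega = \iint_R \Omega_{12}(u, v)\, du\, dv. \]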
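These definitions are easy to try numerically. The sketch below is an illustration, not part of the notes: the one-form ω = (−v + uv) du + (u + v²) dv is a hypothetical example, the region is the unit disk, and the two-form is taken to be $\Omega_{12} = \partial\omega_2/\partial u - \partial\omega_1/\partial v$, the coefficient that appears in Green’s theorem. The line integral over the boundary circle and the surface integral over the disk are both computed with midpoint sums.

```python
import math


def omega(u, v):
    """Coefficients (omega_1, omega_2) of a hypothetical one-form."""
    return (-v + u * v, u + v * v)


def omega_12(u, v):
    """d(omega_2)/du - d(omega_1)/dv for the one-form above: 1 - (-1 + u)."""
    return 2.0 - u


def line_integral_unit_circle(n=2000):
    """Integral of omega over the counterclockwise unit circle."""
    dt = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        u, v = math.cos(t), math.sin(t)
        du_dt, dv_dt = -math.sin(t), math.cos(t)
        w1, w2 = omega(u, v)
        total += (w1 * du_dt + w2 * dv_dt) * dt
    return total


def surface_integral_unit_disk(n=200):
    """Integral of omega_12 du dv over the unit disk, using polar cells."""
    dr, dth = 1.0 / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            th = (j + 0.5) * dth
            total += omega_12(r * math.cos(th), r * math.sin(th)) * r * dr * dth
    return total


if __name__ == "__main__":
    print(line_integral_unit_circle(), surface_integral_unit_disk(), 2.0 * math.pi)
```

Both numbers agree with 2π, the exact value of each integral for this choice of ω.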
In advanced calculus, you probably computed line integrals of vector fields, and wrote
Green’s theorem using different notation. In that case, the notation and formulas of the
previous section may seem puzzling. This optional section explains the new point of view
and motivates definitions in the next section.
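The line integral of a vector field E = (Ex , Ey ) is defined in vector calculus as
\[ \int \vec{E} \cdot \frac{d\vec{\gamma}}{dt}\, dt. \]
Green’s theorem is then written
\[ \int_{\partial R} \vec{E} \cdot \frac{d\vec{\gamma}}{dt}\, dt = \iint_R \left( \frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y} \right) dx\, dy. \]
If these theorems were taken as a model, we would expect the one-form ω above to just be
a vector field
\[ \omega = (E_x, E_y). \]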
But then the line integral should have been defined using the gij inner product, which
would give a much more complicated formula
\[ \int_a^b \sum_{ij} g_{ij}\, E_i(u(t), v(t))\, \frac{du_j}{dt}\, dt \]
The double integral in our version of Green’s theorem is also puzzling, since the integral of
a function over the surface always contains an extra factor $\sqrt{g_{11} g_{22} - g_{12}^2}$ by theorem 56.
We would expect the double integral in Green’s theorem to have the form
\[ \iint \left( \frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y} \right) \sqrt{g_{11} g_{22} - g_{12}^2}\; du\, dv \]
So ω should be an object which gives a number when evaluated on a small piece of the
curve. If γ(t) is our curve, the small piece from t to t + ∆t is approximately γ′(t) ∆t. We
expect the world to linearize when we make very small approximations, so
We conclude that ω should be an object which maps tangent vectors to real numbers. For
this reason, the coefficients of our ω should be thought of as entries (ω1 , ω2 ) in a matrix
defining a transformation from vectors to numbers, and the expression in our line integral
The expression on the right is a surface integral. Suppose we want to integrate an object Ω
over a surface, but we do not yet know what kind of object Ω should be. To integrate, we
divide the surface into small pieces, compute Ω on each piece, add, and take a limit.
So Ω should be an object which gives a number when evaluated on a small piece of surface.
Small parallelograms on the surface are defined by pairs of vectors X and Y . We expect
the world to linearize when we make very small approximations, so Ω should be a real
valued function defined on pairs of vectors: Ω(X, Y ).
In this case, there is more to be said. Surface integrals depend on an orientation of the
surface; changing the orientation changes the sign of the integral. So we expect that
Ω(Y, X) = −Ω(X, Y ). By definition, a two-form is an assignment to each point of space of
a map Ω(X, Y ) defined on pairs of vectors, such that Ω(Y, X) = −Ω(X, Y ). In the special
two-dimensional case of interest to us,
\[ \Omega\left( X_1 \frac{\partial}{\partial u} + X_2 \frac{\partial}{\partial v},\; Y_1 \frac{\partial}{\partial u} + Y_2 \frac{\partial}{\partial v} \right) = \Omega\left( \frac{\partial}{\partial u}, \frac{\partial}{\partial v} \right) (X_1 Y_2 - X_2 Y_1) \]
and so our Ω has only one coefficient.
In classical three-dimensional calculus, such maps Ω also arise from vectors E using the
formula
\[ \Omega(X, Y) = E \cdot (X \times Y) \]
This explains the appearance of E · n in the surface integral formula. However, in advanced
mathematics, Ω more often appears directly as a map on pairs of vectors, and associating
each such map with a vector E is more confusing than illuminating.
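Definition 32 Let ω be a one-form and let X be a tangent vector field. Then ω(X) is the
function defined by
\[ \omega(X) = (\omega_1\, du + \omega_2\, dv)\left( X_1 \frac{\partial}{\partial u} + X_2 \frac{\partial}{\partial v} \right) = \omega_1 X_1 + \omega_2 X_2. \]
Definition 33 Let Ω be a two-form and let X and Y be tangent vector fields. Then
Ω(X, Y ) is the function defined by
\[ \Omega(X, Y) = (\Omega_{12}\, du \wedge dv)\left( X_1 \frac{\partial}{\partial u} + X_2 \frac{\partial}{\partial v},\; Y_1 \frac{\partial}{\partial u} + Y_2 \frac{\partial}{\partial v} \right) = \Omega_{12} (X_1 Y_2 - X_2 Y_1). \]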
Remark: If you skipped section 6.12, then these definitions may seem mysterious. We
are interested in them for precisely one reason, given by the next theorem. This theorem
shows that the curl of ω from classical calculus is closely related to the Lie bracket and
directional derivatives we have often used in this course.
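Theorem 77 Let ω be a one-form and suppose X and Y are tangent vector fields. Then
\[ d\omega(X, Y) = X\omega(Y) - Y\omega(X) - \omega([X, Y]). \]
Proof: By definition 33, the left side is linear over functions in the first variable:
dω(f X, Y ) = f dω(X, Y ).
We will prove that the expression on the right also has this linearity property. Indeed
\[ (fX)\omega(Y) - Y\omega(fX) - \omega([fX, Y]) = fX\omega(Y) - Y(f\omega(X)) - \omega(f[X, Y] - Y(f)X) \]
where we have used the identity [f X, Y ] = f [X, Y ] − Y (f )X proved on page 127. Using
the product rule, the previous equation becomes
\[ fX\omega(Y) - Y(f)\omega(X) - fY\omega(X) - f\omega([X, Y]) + Y(f)\omega(X) \]
which simplifies to
\[ f\left( X\omega(Y) - Y\omega(X) - \omega([X, Y]) \right) \]
as desired. The expression Xω(Y ) − Y ω(X) − ω([X, Y ]) changes sign when X and Y are
interchanged, so it follows that it is also linear in the second variable.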
Since both sides of the equation we wish to establish are linear over functions, we need
only check this equation on basis vectors. Both sides are skew symmetric, so it suffices to
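check the equation when $X = \frac{\partial}{\partial u}$ and $Y = \frac{\partial}{\partial v}$. In this case
\[ d\omega(X, Y) = \Omega_{12} = \frac{\partial \omega_2}{\partial u} - \frac{\partial \omega_1}{\partial v} \]
and, using the fact that [X, Y ] = 0,
\[ X(\omega(Y)) - Y(\omega(X)) - \omega([X, Y]) = \frac{\partial}{\partial u} \omega_2 - \frac{\partial}{\partial v} \omega_1. \]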
QED.
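The identity of Theorem 77 can be spot-checked with finite differences for vector fields that are not coordinate fields. In the sketch below, the one-form coefficients `w1`, `w2` and the fields `X`, `Y` are hypothetical choices made for this illustration; the left side uses $\Omega_{12} = \partial\omega_2/\partial u - \partial\omega_1/\partial v$ together with definition 33, and the right side uses the coordinate formula for the Lie bracket, $[X, Y]_i = X(Y_i) - Y(X_i)$.

```python
import math

H = 1e-4  # finite-difference step


def d_du(f, u, v):
    return (f(u + H, v) - f(u - H, v)) / (2 * H)


def d_dv(f, u, v):
    return (f(u, v + H) - f(u, v - H)) / (2 * H)


# Hypothetical one-form omega = w1 du + w2 dv and vector fields X, Y.
def w1(u, v):
    return u * v + math.sin(v)


def w2(u, v):
    return u * u - v


X = (lambda u, v: 1.0 + v, lambda u, v: u * u)
Y = (lambda u, v: math.cos(u), lambda u, v: u + v)


def apply_field(Z, f):
    """Return the function Z(f) = Z1 df/du + Z2 df/dv."""
    return lambda u, v: Z[0](u, v) * d_du(f, u, v) + Z[1](u, v) * d_dv(f, u, v)


def form_on(Z):
    """Return the function omega(Z) = w1 Z1 + w2 Z2."""
    return lambda u, v: w1(u, v) * Z[0](u, v) + w2(u, v) * Z[1](u, v)


def bracket(A, B):
    """Components of the Lie bracket [A, B]."""
    return tuple(
        (lambda u, v, i=i: apply_field(A, B[i])(u, v) - apply_field(B, A[i])(u, v))
        for i in range(2)
    )


def left_side(u, v):
    omega_12 = d_du(w2, u, v) - d_dv(w1, u, v)
    return omega_12 * (X[0](u, v) * Y[1](u, v) - X[1](u, v) * Y[0](u, v))


def right_side(u, v):
    return (apply_field(X, form_on(Y))(u, v)
            - apply_field(Y, form_on(X))(u, v)
            - form_on(bracket(X, Y))(u, v))


if __name__ == "__main__":
    for point in [(0.3, -0.7), (1.2, 0.4)]:
        print(point, left_side(*point), right_side(*point))
```

The two sides agree to within the finite-difference error at each sample point.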
In the rest of the proof we never need to refer to the coefficients of the one-forms or two-
forms we introduce. So we will call these one forms ω and two forms Ω. Unfortunately,
we need to consider several one-forms. These various one-forms will be called ωij . Notice
carefully that the indices determine which form we are using and have nothing to do with
the coefficients of these forms.
We have come to the decisive moment in the proof. We want to understand how our
frame turns as we travel from point to point. The change in the frame will depend on the
direction we move. So we want to study ∇X e1 and ∇X e2 .
In chapter one, the Frenet-Serret formulas were obtained by expressing the derivatives of
the canonical frame as a linear combination of vectors in this frame. Similarly in surface
theory, it turns out that we should express the derivatives of the ei in terms of the basis
{e1 , e2 }.
Definition 34 Write
\[ \nabla_X e_j = \sum_i \omega_{ij}(X)\, e_i. \]
The coefficients of these linear combinations define one-forms ωij , called the connection
one-forms of the surface.
Remark: The following theorem is an analogue of the Frenet-Serret formulas, and
expresses the fact that the frame consists of orthonormal vectors.
Theorem 78 ωij = −ωji
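Proof: Since the ei are orthonormal, each ⟨ei , ej ⟩ is constant and so
\[ 0 = X \langle e_i, e_j \rangle = \langle \nabla_X e_i, e_j \rangle + \langle e_i, \nabla_X e_j \rangle. \]
Thus
\[ 0 = \left\langle \sum_k \omega_{ki}(X)\, e_k,\; e_j \right\rangle + \left\langle e_i,\; \sum_k \omega_{kj}(X)\, e_k \right\rangle \]
and so
\[ 0 = \sum_k \omega_{ki}(X) \langle e_k, e_j \rangle + \sum_k \omega_{kj}(X) \langle e_i, e_k \rangle = \omega_{ji}(X) + \omega_{ij}(X) \]
QED.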
Remark: It follows that ω11 = ω22 = 0 and ω21 = −ω12 . Consequently, there is only one
interesting one-form, ω12 . Notice that
∇X e2 = ω12 (X)e1 .
A little thought shows that this equation should indeed be true. Since the ei remain
orthonormal, the change of e2 should be in the e1 direction. Moreover, if we know how e2
changes, we can determine how e1 changes.
Remark: Let γ(s) be a curve parameterized by arc length, and let θ(s) be the angle
which the tangent to this curve makes with the frame. Now that we understand how the
frame changes from point to point, we can decompose the change in θ(s) into two pieces,
one describing the curvature of γ(s) and one describing the turning of the frame.
Theorem 79 Suppose γ(s) is a curve parameterized by arc length, and θ(s) is the angle
which the tangent to this curve makes with the frame. Let κg be the geodesic curvature of
this curve. Then
\[ \frac{d\theta}{ds} = \kappa_g + \omega_{12}(\gamma'). \]
Proof: We have
\[ \gamma'(s) = \cos\theta(s)\, e_1 + \sin\theta(s)\, e_2. \]
Hence
\[ \frac{D\gamma'}{ds} = -\sin\theta(s) \frac{d\theta}{ds}\, e_1 + \cos\theta(s) \frac{d\theta}{ds}\, e_2 + \cos\theta(s)\, \nabla_{\gamma'} e_1 + \sin\theta(s)\, \nabla_{\gamma'} e_2 \]
The last two terms can be rewritten using $\nabla_{\gamma'} e_1 = -\omega_{12}(\gamma')\, e_2$ and $\nabla_{\gamma'} e_2 = \omega_{12}(\gamma')\, e_1$, giving
\[ \frac{D\gamma'}{ds} = \left( \frac{d\theta}{ds} - \omega_{12}(\gamma') \right) (-\sin\theta\, e_1 + \cos\theta\, e_2). \]
But − sin θ e1 + cos θ e2 is the vector cos θ e1 + sin θ e2 rotated ninety degrees counter-
clockwise, and thus points normal to the curve with the correct orientation. By definition
of geodesic curvature, $\frac{D\gamma'}{ds}$ is κg times this normal for curves parameterized by arc length.
So
\[ \kappa_g = \frac{d\theta}{ds} - \omega_{12}(\gamma') \]
and the theorem follows. QED.
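A flat example shows the two contributions cancelling. In the sketch below, which is an illustration with hypothetical names and a setting chosen here, the surface is the Euclidean plane minus the origin, the frame is the polar frame (e1 radial, e2 counterclockwise), and the curve is a counterclockwise circle of radius r about the origin. In the flat plane the covariant derivative is the ordinary derivative, so ω12(γ′) can be computed as the e1 component of the derivative of e2 along the curve.

```python
import math

H = 1e-4  # finite-difference step


def gamma(s, radius):
    """Counterclockwise circle about the origin, parameterized by arc length."""
    return (radius * math.cos(s / radius), radius * math.sin(s / radius))


def frame(point):
    """Polar frame: e1 points away from the origin, e2 is e1 rotated 90 degrees."""
    x, y = point
    r = math.hypot(x, y)
    return (x / r, y / r), (-y / r, x / r)


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def tangent(s, radius):
    p, q = gamma(s + H, radius), gamma(s - H, radius)
    return ((p[0] - q[0]) / (2 * H), (p[1] - q[1]) / (2 * H))


def theta(s, radius):
    """Angle between the tangent and the frame."""
    e1, e2 = frame(gamma(s, radius))
    t = tangent(s, radius)
    return math.atan2(dot(t, e2), dot(t, e1))


def dtheta_ds(s, radius):
    return (theta(s + H, radius) - theta(s - H, radius)) / (2 * H)


def omega12(s, radius):
    """e1 component of the derivative of e2 along the curve (flat plane only)."""
    _, e2_plus = frame(gamma(s + H, radius))
    _, e2_minus = frame(gamma(s - H, radius))
    de2 = ((e2_plus[0] - e2_minus[0]) / (2 * H), (e2_plus[1] - e2_minus[1]) / (2 * H))
    e1, _ = frame(gamma(s, radius))
    return dot(de2, e1)


def geodesic_curvature(s, radius):
    """Acceleration dotted with the tangent rotated 90 degrees counterclockwise."""
    p, c, q = gamma(s + H, radius), gamma(s, radius), gamma(s - H, radius)
    acc = ((p[0] - 2 * c[0] + q[0]) / H ** 2, (p[1] - 2 * c[1] + q[1]) / H ** 2)
    t = tangent(s, radius)
    return dot(acc, (-t[1], t[0]))


if __name__ == "__main__":
    r, s = 2.0, 0.7
    print(dtheta_ds(s, r), geodesic_curvature(s, r), omega12(s, r))
```

Here the tangent always equals e2, so θ is constant; the geodesic curvature is 1/r, the frame term ω12(γ′) is −1/r, and the two add to dθ/ds = 0 as the theorem requires.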
This section contains only a single equation, known as Cartan’s structural equation.
It is obviously very important.
Theorem 80 The two-form dω12 = Ω has the following properties:
1. If e1 , e2 is any oriented basis,
Ω(e1 , e2 ) = κ1 κ2 .
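\[ \nabla_Y e_2 = \omega_{12}(Y)\, e_1. \]
Hence
\[ \nabla_X \nabla_Y e_2 = X(\omega_{12}(Y))\, e_1 + \omega_{12}(Y)\, \nabla_X e_1 = X(\omega_{12}(Y))\, e_1 - \omega_{12}(X)\, \omega_{12}(Y)\, e_2. \]
Subtract the same equation with X and Y interchanged, and subtract the equation for
∇[X,Y ] e2 , to obtain
\[ \left( \nabla_X \nabla_Y - \nabla_Y \nabla_X - \nabla_{[X,Y]} \right) e_2 = \left( X(\omega_{12}(Y)) - Y(\omega_{12}(X)) - \omega_{12}([X, Y]) \right) e_1 \]
or
\[ \left( \nabla_X \nabla_Y - \nabla_Y \nabla_X - \nabla_{[X,Y]} \right) e_2 = (d\omega_{12})(X, Y)\, e_1 = \Omega(X, Y)\, e_1. \]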
R(X, Y, e2 , e1 ) = Ω(X, Y ).
If X = e1 and Y = e2 , we have
which equals
(α2 + β 2 ) (γ 2 + δ 2 ) − (αγ + βδ)2 = (αδ − βγ)2 .
QED.
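Notice that
\[ \Delta\theta_i(s) = \int_{s_{i-1}}^{s_i} \frac{d\theta_i}{ds}\, ds = \int_{s_{i-1}}^{s_i} \left( \kappa_g + \omega_{12}(\gamma'(s)) \right) ds \]
Since our curve is parameterized by arc length, $\int_{s_{i-1}}^{s_i} \kappa_g$ is just the integral of the function
κg over the curve. So we have
\[ \sum_{\text{exterior angles}} \theta_i + \sum_{\text{boundary curves}} \int_{\gamma_i} \kappa_g + \sum_{\text{boundary curves}} \int_{s_{i-1}}^{s_i} \omega_{12}(\gamma') = 2\pi \]
We claim that $\int_{s_{i-1}}^{s_i} \omega(\gamma')$ is the line integral of ω over γ. Indeed in coordinates ω =
ω1 du + ω2 dv and $\omega(\gamma') = \omega_1 \frac{du}{ds} + \omega_2 \frac{dv}{ds}$ and so
\[ \int_{s_{i-1}}^{s_i} \omega(\gamma')\, ds = \int_{s_{i-1}}^{s_i} \left( \omega_1 \frac{du}{ds} + \omega_2 \frac{dv}{ds} \right) ds = \int_\gamma \omega. \]
Consequently,
\[ \sum_{\text{boundary curves}} \int_{s_{i-1}}^{s_i} \omega_{12}(\gamma') = \int_{\partial R} \omega_{12} \]
We can apply Green’s theorem to convert this to
\[ \iint_R d\omega_{12} = \iint_R \Omega = \iint_R \Omega_{12}\, du \wedge dv = \iint_R \kappa_1 \kappa_2 \sqrt{g_{11} g_{22} - g_{12}^2}\; du\, dv \]
The last integral is the integral of the function κ1 κ2 over the interior of the region, so we
obtain
\[ \sum_{\text{exterior angles}} \theta_i + \sum_{\text{boundary curves}} \int_{\gamma_i} \kappa_g + \iint_R \kappa_1 \kappa_2 = 2\pi \]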
QED.
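The final formula can be checked on the unit sphere, where κ1 κ2 = 1. The sketch below is an illustration that relies on two standard facts not derived in this section, stated here as assumptions: the three great-circle arcs bounding an octant are geodesics meeting at right angles, and the circle of colatitude φ0 has length 2π sin φ0 and geodesic curvature cot φ0 when it is traversed as the boundary of the polar cap of area 2π(1 − cos φ0). The function names are hypothetical.

```python
import math


def gauss_bonnet_total(exterior_angles, boundary_kg_integrals, curvature_integral):
    """Left side of the formula: exterior angles + boundary terms + interior term."""
    return sum(exterior_angles) + sum(boundary_kg_integrals) + curvature_integral


def octant_total():
    """Geodesic triangle covering one eighth of the unit sphere."""
    exterior_angles = [math.pi / 2] * 3      # interior angles are all pi/2
    boundary_terms = [0.0, 0.0, 0.0]         # great-circle sides have kappa_g = 0
    area = 4.0 * math.pi / 8.0               # integral of kappa_1 kappa_2 = 1
    return gauss_bonnet_total(exterior_angles, boundary_terms, area)


def polar_cap_total(phi0):
    """Cap of the unit sphere bounded by the circle of colatitude phi0."""
    kappa_g = math.cos(phi0) / math.sin(phi0)
    length = 2.0 * math.pi * math.sin(phi0)
    area = 2.0 * math.pi * (1.0 - math.cos(phi0))
    return gauss_bonnet_total([], [kappa_g * length], area)


if __name__ == "__main__":
    print(octant_total() / math.pi)
    for phi0 in (0.3, 1.0, math.pi / 2, 2.5):
        print(phi0, polar_cap_total(phi0) / math.pi)
```

Every total equals 2π. For the cap, the boundary term 2π cos φ0 shrinks exactly as the area term 2π(1 − cos φ0) grows.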
Chapter 7
Riemann’s thesis was on complex variable theory. He introduced Riemann surfaces, proved
the Riemann mapping theorem, and proved the first half of the Riemann-Roch theorem.
Not bad.
Riemann’s scholarly paper was on Fourier series. In this paper, Riemann introduced the
Riemann integral and used it to investigate convergence of series. OK.
For the probationary lecture, the candidate was asked to provide three topics from which
the examining committee would pick one. It was customary for the candidate to list the
thesis and the scholarly paper as the first two topics, and customary for the examining
committee to pick one of these. Riemann listed his thesis and his scholarly paper, and
listed as a third topic the foundations of geometry. Riemann’s examining committee included
Gauss, who convinced the committee to abandon tradition and choose the third topic. “So
I am in a quandary,” Riemann wrote his father, “since I have to work out this one.”
The resulting lecture is one of the most famous mathematical talks in history. Riemann
had to proceed without elaborate equations because the lecture was for a general audience.
In the lecture, Riemann explained how to generalize Gauss’ theory of surfaces to higher
dimensions, and gave the following wonderful argument:
Some choices of gij give new geometries — non-Euclidean geometry, spherical geometry,
and geometry on a surface of revolution. Other choices give familiar geometries written in
unusual coordinates.
Riemann gave the following counting argument to determine the number of new geometries
which could be obtained. The form gij is determined by three functions. On the other
hand, coordinate changes
r = φ(u, v)
s = ψ(u, v)
Since gij is symmetric, there are $n + (n-1) + \ldots + 1 = \frac{n(n+1)}{2}$ functions involved. A
coordinate change
y1 = ψ1 (x1 , . . . , xn )
...
yn = ψn (x1 , . . . , xn )
is determined by n functions. Thus there should be $\frac{n(n+1)}{2} - n = \frac{n(n-1)}{2}$ pieces of purely
geometric information.
Riemann conjectured that these $\frac{n(n-1)}{2}$ pieces of geometric information could be found as
follows. At each point in n-dimensional space, choose a two-dimensional subspace of the
tangent space by choosing two basis vectors ei , ej from the full basis e1 , . . . , en . Follow
geodesics out along the plane spanned by ei and ej . These geodesics will form a two-
dimensional surface in n-dimensions, and this surface will have a Gaussian curvature κ.
Nowadays, we call this Gaussian curvature the sectional curvature of space in the direction
spanned by ei and ej .
How many choices do we have for our two-dimensional plane? The vector ei can be chosen
in n ways, and then the vector ej can be chosen in n − 1 ways. But the plane spanned
by ei and ej does not depend on the order of these vectors, so in reality there are $\frac{n(n-1)}{2}$
such choices. Riemann asserted that the sectional curvatures in these $\frac{n(n-1)}{2}$ directions are
exactly the $\frac{n(n-1)}{2}$ pieces of geometric information hidden in the gij .
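Riemann’s count is simple enough to tabulate. The short sketch below uses hypothetical function names and compares the number of functions in the metric minus the number of coordinate functions with the number of coordinate planes, for the first few dimensions.

```python
from math import comb


def metric_functions(n):
    """Independent entries of a symmetric n-by-n matrix g_ij."""
    return n * (n + 1) // 2


def geometric_invariants(n):
    """Functions in the metric minus the n functions of a coordinate change."""
    return metric_functions(n) - n


def coordinate_planes(n):
    """Number of unordered pairs {e_i, e_j} chosen from a basis of n vectors."""
    return comb(n, 2)


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, metric_functions(n), geometric_invariants(n), coordinate_planes(n))
```

For surfaces (n = 2) the count is 1, the Gaussian curvature; for n = 3 it is 3, and for n = 4 it is 6.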
Traces of this argument can be found in the mathematics of this course. On page 127, we
proved that when R is the Riemann curvature tensor and e1 , e2 is an orthonormal basis,
the number R(e1 , e2 , e1 , e2 ) is independent of the choice of this basis. In higher dimensions,
our proof works without change to prove that whenever ei and ej is an orthonormal basis,
the number
R(ei , ej , ei , ej )
depends only on the two-dimensional space spanned by ei and ej , and not on the basis
used to compute it. This number equals the sectional curvature up to a sign.
Moreover, it is possible to prove that two candidates for a curvature tensor, R(X, Y, Z, W )
and S(X, Y, Z, W ), are the same if and only if they give the same sectional curvature for
each two-dimensional plane spanned by basis vectors ei and ej . So Riemann’s information
is exactly the information hidden in the curvature tensor.
The following theorems provide two cases in which Riemann’s insight is verified by formal
theorems.
Theorem 81 Let S be a surface whose Gaussian curvature is zero. Then it is possible to
define local coordinates s(u, v) near any point p so that in these local coordinates, the gij
equal δij and
ds2 = du2 + dv 2 .
curve γ. Choose ε > 0 so all of these geodesics are defined for −ε < u < ε.
Use this construction to produce a coordinate system (u, v) near p. To get to an arbitrary
point (u, v), follow γ from p to γ(v), and then follow the particular τ which starts at γ(v)
from τ (0) to τ (u).
Below are pictures of this coordinate system in the plane, on the sphere, and on the Poincare
disk.
Step 2: If we have two surfaces S and T containing points p and q, we can introduce the
above coordinate systems in both surfaces, and then map the point (u, v) in S to the point
(u, v) in T . We are going to prove that this map preserves the metric if both surfaces have
the same constant curvature. From now on, we work with one surface and one coordinate
system. We will discover that we can completely determine the metric gij in our special
coordinate system if the Gaussian curvature is constant. If so, the proof is complete.
Step 3: Let $e_1(u, v) = \frac{\partial s}{\partial u}$ and $e_2(u, v) = \frac{\partial s}{\partial v}$. Then e1 and e2 are vector fields in our
coordinate system. Since τ (u) is a geodesic parameterized by arc length, e1 has length one
and $\frac{De_1}{du} = 0$ at every point (u, v). Since γ(v) is a geodesic parameterized by arc length,
e2 has length one and $\frac{De_2}{dv} = 0$ at points of the form (0, v). Since τ is perpendicular to γ,
⟨e1 , e2 ⟩ = 0 at points of the form (0, v).
However, according to the fifth item in the main theorem of section 4.4 we have
\[ \nabla_{\frac{\partial}{\partial u}} \frac{\partial}{\partial v} - \nabla_{\frac{\partial}{\partial v}} \frac{\partial}{\partial u} = \left[ \frac{\partial}{\partial u}, \frac{\partial}{\partial v} \right] = 0 \]
and so
\[ \frac{De_2}{du} = \frac{De_1}{dv}. \]
Substituting in the second formula of this step, we obtain
\[ \frac{\partial}{\partial u} \langle e_1, e_2 \rangle = \left\langle e_1, \frac{De_2}{du} \right\rangle = \left\langle e_1, \frac{De_1}{dv} \right\rangle = \frac{1}{2} \frac{d}{dv} \langle e_1, e_1 \rangle \]
Since e1 is always a unit vector, this is zero and consequently ⟨e1 , e2 ⟩ is independent of u.
But ⟨e1 , e2 ⟩ = 0 on (0, v). Consequently, ⟨e1 , e2 ⟩ is always zero.
Step 5: It follows that g11 = 1 and g12 = 0. Let h(u, v) = ||e2 ||. Then $g_{22} = h^2$.
Notice that up to this point, we have made no assumptions about curvature. Therefore, on
any surface, we can introduce new coordinates so the coordinate curves are perpendicular
and g11 = 1, g12 = 0, $g_{22} = h^2(u, v)$.
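\[ R_{1212} = -h^2 \kappa_1 \kappa_2. \]
We have
\[ R_{1212} = \left( \frac{\partial \Gamma^2_{12}}{\partial u} - \frac{\partial \Gamma^2_{11}}{\partial v} + \sum_m \Gamma^m_{12} \Gamma^2_{1m} - \sum_m \Gamma^m_{11} \Gamma^2_{2m} \right) h^2 \]
and so
\[ R_{1212} = \left( \frac{\partial}{\partial u} \left( \frac{1}{h} \frac{\partial h}{\partial u} \right) + \Gamma^2_{12} \Gamma^2_{12} \right) h^2 = \left( \frac{\partial}{\partial u} \left( \frac{1}{h} \frac{\partial h}{\partial u} \right) + \left( \frac{1}{h} \frac{\partial h}{\partial u} \right)^2 \right) h^2 \]
This simplifies to
\[ R_{1212} = h \frac{\partial^2 h}{\partial u^2} = -h^2 \kappa_1 \kappa_2. \]
We conclude that $\kappa_1 \kappa_2 = \frac{-1}{h} \frac{\partial^2 h}{\partial u^2}$. Since we still have made no assumptions about Gaussian
curvature, we have proved
Lemma Any surface can be given a coordinate system in which g11 = 1, g12 = 0, $g_{22} = h^2(u, v)$
and in this coordinate system the Gaussian curvature is given by
\[ \frac{-1}{h} \frac{\partial^2 h}{\partial u^2}. \]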
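The lemma can be tried on the three model geometries. The sketch below is an illustration with assumed choices of h, each depending on u only: h = 1 (the plane in Cartesian coordinates), h = u (the plane in polar coordinates), h = cos u (the unit sphere, with u measured along geodesics leaving the equator at right angles), and h = cosh u (the analogous coordinates in non-Euclidean geometry). The second derivative is approximated by a central difference.

```python
import math


def gaussian_curvature(h, u, step=1e-4):
    """Evaluate -(1/h) d^2h/du^2 using a central second difference."""
    h_uu = (h(u + step) - 2.0 * h(u) + h(u - step)) / step ** 2
    return -h_uu / h(u)


if __name__ == "__main__":
    examples = [
        ("plane, h = 1", lambda u: 1.0),
        ("plane in polar coordinates, h = u", lambda u: u),
        ("unit sphere, h = cos u", math.cos),
        ("hyperbolic plane, h = cosh u", math.cosh),
    ]
    for name, h in examples:
        print(name, gaussian_curvature(h, 0.4))
```

The computed curvatures are 0, 0, 1, and −1, matching the constant curvature of each model.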
We end this course by determining those surfaces of revolution which have constant Gaus-
sian curvature.
If a surface is obtained by rotating the function y = f (x) about the x-axis, its Gaussian
curvature is
\[ \kappa = \frac{-f''}{f\, (1 + (f')^2)^2} \]
by a calculation on page 96. This will be constant just in case f satisfies the differential
equation
\[ \frac{d^2 f}{dx^2} = -\kappa\, f(x) \left( 1 + \left( \frac{df}{dx} \right)^2 \right)^2 \]
In the special case κ = 0, we have f (x) = ax + b and we obtain cones and cylinders.
Remark: Otherwise we simplify the equation when we parameterize the curve y = f (x)
by arc length. The graph of y = f (x) is given as a parameterized curve by γ(x) = (x, f (x))
and its arc length is
\[ s(x) = \int_{x_0}^{x} \sqrt{1 + \left( \frac{df}{dx} \right)^2}\; dx \]
This function can be solved for x in terms of s, giving x = ϕ(s). Let g(s) be the function
f written in terms of s instead of x, so that g(s) = f (x) = f (ϕ(s)). Notice that f (x) =
g(s(x)).
Theorem 84 We have
\[ \frac{d^2 g}{ds^2} = -\kappa\, g. \]
Consequently when κ = 1 we have
\[ g(s) = A \sin(s + \delta) \]
df dg ds
Proof: By the chain rule, dx = ds dx and so
s 2
df dg df
= 1+
dx ds dx
7.4. SURFACES OF REVOLUTION 169
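Differentiating again and using the product and chain rules, we have
\[ \frac{d^2 f}{dx^2} = \frac{d^2 g}{ds^2} \left( 1 + \left( \frac{df}{dx} \right)^2 \right) + \frac{dg}{ds} \, (1/2) \left( 1 + \left( \frac{df}{dx} \right)^2 \right)^{-1/2} 2 \, \frac{df}{dx} \frac{d^2 f}{dx^2}. \]
Replace $\frac{df}{dx}$ in the last term with $\frac{dg}{ds} \sqrt{1 + \left( \frac{df}{dx} \right)^2}$ to obtain
\[ \frac{d^2 f}{dx^2} = \frac{d^2 g}{ds^2} \left( 1 + \left( \frac{df}{dx} \right)^2 \right) + \left( \frac{dg}{ds} \right)^2 \frac{d^2 f}{dx^2} \]
or
\[ \frac{d^2 f}{dx^2} \left( 1 - \left( \frac{dg}{ds} \right)^2 \right) = \frac{d^2 g}{ds^2} \left( 1 + \left( \frac{df}{dx} \right)^2 \right). \]
Substitution of the differential equation for $f$ at the start of this section gives
\[ -\kappa f(x) \left( 1 + \left( \frac{df}{dx} \right)^2 \right)^2 \left( 1 - \left( \frac{dg}{ds} \right)^2 \right) = \frac{d^2 g}{ds^2} \left( 1 + \left( \frac{df}{dx} \right)^2 \right). \]
The equation $\frac{df}{dx} = \frac{dg}{ds} \sqrt{1 + \left( \frac{df}{dx} \right)^2}$ implies that
$1 + \left( \frac{df}{dx} \right)^2 = 1 + \left( \frac{dg}{ds} \right)^2 \left( 1 + \left( \frac{df}{dx} \right)^2 \right)$ and
so $\left( 1 + \left( \frac{df}{dx} \right)^2 \right) \left( 1 - \left( \frac{dg}{ds} \right)^2 \right) = 1$. So the previous equation simplifies to
\[ -\kappa g(s) = \frac{d^2 g}{ds^2}. \]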
QED.
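Theorem 84 can also be tested numerically. The sketch below is an illustration rather than part of the notes: it assumes a classical fourth-order Runge-Kutta integrator (the function name, step count and initial data are arbitrary choices), solves the equation for f in the variable x while accumulating the arc length s, and then compares f with the solution of g'' = -κg having the same initial data.

```python
import math


def integrate_profile(kappa, f0, p0, x_end, steps=2000):
    """Integrate f'' = -kappa f (1 + f'^2)^2 from x = 0 to x_end with RK4,
    tracking arc length via ds/dx = sqrt(1 + f'^2) and s(0) = 0.
    Returns (f(x_end), s(x_end))."""

    def rhs(state):
        f, p, _s = state  # p stands for df/dx
        return (p, -kappa * f * (1 + p * p) ** 2, math.sqrt(1 + p * p))

    def shifted(state, slope, scale):
        return tuple(y + scale * k for y, k in zip(state, slope))

    h = x_end / steps
    state = (f0, p0, 0.0)
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(shifted(state, k1, h / 2))
        k3 = rhs(shifted(state, k2, h / 2))
        k4 = rhs(shifted(state, k3, h))
        state = tuple(
            y + h / 6 * (a + 2 * b + 2 * c + d)
            for y, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state[0], state[2]


# kappa = 1 with f(0) = 0.5, f'(0) = 0: g'' = -g predicts g(s) = 0.5 cos(s).
f_pos, s_pos = integrate_profile(1.0, 0.5, 0.0, 1.0)
gap_pos = abs(f_pos - 0.5 * math.cos(s_pos))

# kappa = -1 with the same initial data: g'' = g predicts g(s) = 0.5 cosh(s).
f_neg, s_neg = integrate_profile(-1.0, 0.5, 0.0, 0.5)
gap_neg = abs(f_neg - 0.5 * math.cosh(s_neg))

print(gap_pos, gap_neg)
```

Both gaps should be at the level of the integrator's error. The end points x = 1 and x = 0.5 are chosen so that f stays positive and |dg/ds| stays below 1 along the way.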
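Remark: This does not finish the problem. Our curve has the form $\gamma(s) = (x(s), g(s))$
and is parameterized by arc length, so $1 = \left( \frac{dx}{ds} \right)^2 + \left( \frac{dg}{ds} \right)^2$. Therefore $\frac{dx}{ds} = \sqrt{1 - \left( \frac{dg}{ds} \right)^2}$
and
\[ x(s) = \int \sqrt{1 - \left( \frac{dg}{ds} \right)^2} \, ds. \]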
Theorem 85 The surfaces of revolution with Gaussian curvature 1 are obtained by rotating
$\gamma(s) = (x(s), A \sin(s + \delta))$ around the x-axis, where
\[ x(s) = \pm \int \sqrt{1 - A^2 \cos^2(s + \delta)} \, ds. \]
The surfaces of revolution with Gaussian curvature $-1$ are obtained by rotating
$\gamma(s) = (x(s), Ae^s + Be^{-s})$ around the x-axis, where
\[ x(s) = \pm \int \sqrt{1 - (Ae^s - Be^{-s})^2} \, ds. \]
Remark: Some of these are elliptic integrals and must be calculated numerically.
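Suppose the Gaussian curvature is one. Notice that $A = 1$ gives
\[ x(s) = \pm \int \sqrt{1 - \cos^2(s + \delta)} \, ds = \pm \int \sin(s + \delta) \, ds = \pm \cos(s + \delta), \]
so the profile curve is a unit circle and the surface is a unit sphere.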
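When A is not 1 the integral for x(s) in Theorem 85 has to be evaluated numerically. The following sketch is one way to do that, under assumptions that are not in the notes: composite Simpson's rule, the + sign, and the normalization x(0) = 0. The function names are hypothetical.

```python
import math


def simpson(func, a, b, n=200):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def profile_point(A, s, delta=0.0):
    """Point (x(s), g(s)) on the curvature-1 profile with g(s) = A sin(s + delta),
    where x(s) is the integral of sqrt(1 - A^2 cos^2(t + delta)) from 0 to s."""

    def integrand(t):
        value = 1.0 - (A * math.cos(t + delta)) ** 2
        if value < -1e-12:
            raise ValueError("profile undefined here: |dg/ds| exceeds 1")
        return math.sqrt(max(value, 0.0))

    return simpson(integrand, 0.0, s), A * math.sin(s + delta)


# A = 1: the profile should lie on the unit circle centered at (1, 0).
x1, y1 = profile_point(1.0, 2.0)
circle_defect = abs((x1 - 1.0) ** 2 + y1 ** 2 - 1.0)

# A = 0.5: an elliptic integral; the profile returns to the axis at s = pi.
x_half, y_half = profile_point(0.5, math.pi)
print(circle_defect, x_half, y_half)
```

For A = 1 the computed points satisfy the circle equation to within the quadrature error, in agreement with the closed form above. For A greater than 1 the square root is real only on part of the s-range, which is why the integrand raises an error instead of silently clamping.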
[Figures: profile curves and the corresponding surfaces of revolution with Gaussian curvature 1.]
Remark: We will let the reader use Mathematica to provide corresponding surfaces of
negative Gaussian curvature. We want to discuss one special case of historical interest.
Consider $g(s) = e^s$. Then
\[ x(s) = \pm \int_0^s \sqrt{1 - e^{2s}} \, ds. \]
The substitution $e^s = \sin\phi$ converts this integral to
\[ \int \sqrt{1 - \sin^2\phi} \; \frac{\cos\phi \, d\phi}{\sin\phi} = \int \left( \frac{1}{\sin\phi} - \sin\phi \right) d\phi = -\ln(\csc\phi + \cot\phi) + \cos\phi \]
and thus to
\[ -\ln\left( \frac{1 + \sqrt{1 - e^{2s}}}{e^s} \right) + \sqrt{1 - e^{2s}} = s - \ln\left( 1 + \sqrt{1 - e^{2s}} \right) + \sqrt{1 - e^{2s}}. \]
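The closed form can be checked by differentiating it numerically and comparing with the integrand. This is a minimal sketch, assuming a central-difference derivative and three arbitrarily chosen sample points with s < 0; the function names are illustrative and not from the notes.

```python
import math


def closed_form_x(s):
    """s - ln(1 + sqrt(1 - e^{2s})) + sqrt(1 - e^{2s}), defined for s <= 0."""
    w = math.sqrt(1.0 - math.exp(2.0 * s))
    return s - math.log(1.0 + w) + w


def central_difference(func, s, h=1e-5):
    """Symmetric finite-difference estimate of func'(s)."""
    return (func(s + h) - func(s - h)) / (2.0 * h)


sample_points = [-3.0, -1.0, -0.25]

# The derivative of the closed form should equal the integrand sqrt(1 - e^{2s}).
errors = [
    abs(central_difference(closed_form_x, s) - math.sqrt(1.0 - math.exp(2.0 * s)))
    for s in sample_points
]

# Points (x(s), g(s)) on the profile curve, taking the + sign.
profile = [(closed_form_x(s), math.exp(s)) for s in sample_points]
print(max(errors), closed_form_x(0.0), profile)
```

The closed form vanishes at s = 0, so it equals the integral from 0 to s with no extra constant. With the + sign the x values are negative because s < 0.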
The result is a famous surface, the pseudosphere, often used to model non-Euclidean geometry. This surface
has a singularity at x = 0 and has the wrong topology, so it is only a local model for
non-Euclidean geometry. But it was the standard model during most of the 19th century
before Poincaré's discovery of the disk model, so the surface is often found on the cover of
non-Euclidean geometry books.
[Figure: the surface of revolution obtained from the profile curve with g(s) = e^s.]