
General Relativity: Matthias Bartelmann

This document provides lecture notes on general relativity. It begins with an introduction to the fundamental ideas and principles behind general relativity, including: 1) Einstein developed general relativity based on the equivalence principle, which states that the effects of gravity are indistinguishable from acceleration. 2) Previous theories like Newtonian gravity were unable to incorporate the principle of relativity that the laws of physics are the same in all inertial frames. 3) General relativity describes gravity not as a force but as a consequence of the curvature of spacetime caused by the uneven distribution of mass and energy in the universe.


Lecture Notes

Physik LN

General Relativity
MATTHIAS BARTELMANN

HEIDELBERG
UNIVERSITY PUBLISHING
Matthias Bartelmann

Institut für Theoretische Astrophysik


Universität Heidelberg
About the Author

Matthias Bartelmann is professor of theoretical astrophysics at Heidelberg University.
He mostly addresses cosmological questions, concerning in particular the formation
and evolution of cosmic structures.

Bibliographic information published by the Deutsche Nationalbibliothek


The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie;
detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

This book is published under the Creative Commons Attribution 4.0
License (CC BY-SA 4.0). The cover is subject to the Creative Commons
License CC BY-ND 4.0.

The electronic open access version of this work is permanently available
on Heidelberg University Publishing’s website: https://heiup.uni-heidelberg.de
urn: urn:nbn:de:bsz:16-heiup-book-534-9
doi: https://doi.org/10.17885/heiup.534

Text © 2019, Matthias Bartelmann

ISSN 2566-4816 (PDF)


ISSN 2512-4455 (Print)

ISBN 978-3-947732-59-3 (PDF)


ISBN 978-3-947732-60-9 (Softcover)

Instead of a preface

Before we begin, some words are in order on the purpose and the limits
of these notes, on the notation used, and on some of the many people I
am indebted to.
Each of the chapters of these notes is meant for a week of two lectures
of two hours each. Much more could be said about all of their topics in
all directions, in terms of mathematics, physics, and experimental tests
of general relativity. These notes are meant as an introduction which
can in no way be considered complete. They may serve as a first guide
through the subject, not a comprehensive one. These lectures are part of
a curriculum in which cosmology, gravitational lensing, and theoretical
astrophysics are regularly taught separately. In these areas, they are thus
only meant to lay the foundation.
We use index-free notation where possible and convenient. Then, the
curvature, the curvature tensor, the Ricci tensor and the Ricci scalar,
often denoted with an R with different numbers of indices, need different
symbols. We denote the curvature and the curvature tensor, closely
related as they are, with R̄, the Ricci tensor with R and the Ricci scalar
with R. Since the symbol G is then reserved for the Einstein tensor, we
write Newton’s gravitational constant as G.
Indices referring to coordinates on general, d-dimensional manifolds are
written as Latin characters. On 4-dimensional spacetime manifolds,
Greek indices run from 0 to 3, while Latin indices refer to spatial
coordinates and run from 1 to 3.
These lecture notes grew over several years. Many students were
exposed to this lecture and contributed corrections and suggestions that
greatly helped to improve it. In particular, Dr. Christian Angrick and
Dr. Francesco Pace were kind and patient enough to meticulously work
through the entire notes and point out many mistakes. Thank you all
very much!
Particular thanks are due to the wonderful and inspiring teachers I myself
had on general relativity. Jürgen Ehlers always impressed me with his
depth and clarity of thinking, and Norbert Straumann introduced me to
the elegance and beauty of the theory.
Contents

1 Introduction
  1.1 The idea behind general relativity
  1.2 Fundamental properties of gravity
  1.3 Consequences of the equivalence principle
  1.4 Futile attempts

2 Differential Geometry I
  2.1 Differentiable manifolds
  2.2 The tangent space
  2.3 Dual vectors and tensors
  2.4 The metric

3 Differential Geometry II
  3.1 Connections and covariant derivatives
  3.2 Geodesics
  3.3 Curvature
  3.4 Riemannian connections

4 Physics in Gravitational Fields
  4.1 Motion of particles
  4.2 Motion of light
  4.3 Energy-momentum (non-)conservation
  4.4 The Newtonian limit

5 Differential Geometry III
  5.1 The Lie derivative
  5.2 Killing vector fields
  5.3 Differential forms
  5.4 Integration

6 Einstein’s Field Equations
  6.1 The physical meaning of curvature
  6.2 Einstein’s field equations
  6.3 Lagrangian formulation
  6.4 The energy-momentum tensor

7 Weak Gravitational Fields
  7.1 Linearised theory of gravity
  7.2 Gauge transformations
  7.3 Nearly Newtonian gravity
  7.4 Gravitational waves

8 The Schwarzschild Solution
  8.1 Cartan’s structure equations
  8.2 Stationary and static spacetimes
  8.3 The Schwarzschild solution
  8.4 Solution of the field equations

9 The Schwarzschild Spacetime
  9.1 Orbits in the Schwarzschild spacetime
  9.2 Comparison to the Kepler problem
  9.3 Perihelion shift and light deflection
  9.4 Spins in the Schwarzschild spacetime

10 Schwarzschild Black Holes
  10.1 The singularity at r = 2m
  10.2 The Kruskal continuation
  10.3 Physical meaning of the Kruskal continuation
  10.4 Redshift approaching the Schwarzschild radius

11 Charged, Rotating Black Holes
  11.1 The Reissner-Nordström solution
  11.2 The Kerr-Newman solution
  11.3 Motion near a Kerr black hole
  11.4 Entropy and temperature of a black hole

12 Homogeneous, Isotropic Cosmology
  12.1 Spherically-symmetric spacetimes
  12.2 Homogeneous and isotropic spacetimes
  12.3 Friedmann’s equations
  12.4 Density evolution and redshift

13 Relativistic Astrophysics
  13.1 Light bundles
  13.2 Gravitational lensing
  13.3 The Tolman-Oppenheimer-Volkoff solution
  13.4 The mass of non-rotating neutron stars

A Electrodynamics
  A.1 Electromagnetic field tensor
  A.2 Maxwell’s equations
  A.3 Lagrange density and energy-momentum tensor

B Differential Geometry
  B.1 Manifold
  B.2 Tangent and dual spaces
  B.3 Tensors
  B.4 Covariant derivative
  B.5 Parallel transport and geodesics
  B.6 Torsion and curvature
  B.7 Pull-back, Lie derivative, Killing fields
  B.8 Differential forms
  B.9 Cartan’s structure equations
  B.10 Differential operators and integration

C Penrose-Carter diagrams


Chapter 1

Introduction

1.1 The idea behind general relativity

There was no need for general relativity when Einstein started working
on it. There were no experimental data signalling any failure of the
Newtonian theory of gravity, except perhaps for the minute advance of
the perihelion of Mercury’s orbit by 43″ per century, which researchers
at the time tried to explain by perturbations not yet included in the
calculations of celestial mechanics in the Solar System.
Essentially, Einstein found general relativity because he was deeply
dissatisfied with some of the concepts of the Newtonian theory, in par-
ticular the concept of an inertial system, for which no experimental
demonstration could be given.
After special relativity, he was convinced quite quickly that trying to
build a relativistic theory of gravitation led to conclusions which were in
conflict with experiments. Action at a distance is impossible in special
relativity because the absolute meaning of space and time had to be
given up. The most straightforward way to combine special relativity
with Newtonian gravity seemed to start from Poisson’s equation for the
gravitational potential and to add time derivatives to it so as to make it
relativistically invariant.
However, it was then unclear how the law of motion should be modified
because, according to special relativity, energy and mass are equiva-
lent and thus the mass of a body should depend on its position in a
gravitational field.
This led Einstein to a result which raised his suspicion. In Newtonian
theory, the vertical acceleration of a body in a vertical gravitational
field is independent of its horizontal motion. In a special-relativistic
extension of Newton’s theory, this would no longer be the case: the
vertical gravitational acceleration would depend on the kinetic energy of
a body, and thus not be independent of its horizontal motion.

Figure 1.1 Albert Einstein (1879–1955) during a lecture in Vienna, 1921.
Source: Wikipedia

This was in striking conflict with experiment, which says that all bodies
experience the same gravitational acceleration. At this point, the equiva-
lence of inertial and gravitational mass struck Einstein as a law of deep
significance. It became the heuristic guiding principle in the construction
of general relativity.
Freely falling frames of reference
This line of thought leads to the fundamental concept of general rela-
tivity. It says that it must be possible to introduce local, non-rotating,
freely-falling frames of reference in which gravity is locally “trans-
formed away”.
The directions of motion of different freely-falling reference frames will
generally not be parallel: Einstein elevators released at the same height
above the Earth’s surface but over different locations will fall towards
the Earth’s centre and thus approach each other.
Space-time as a manifold
Replacing inertial frames by freely falling, non-rotating frames of
reference leads to the idea that spacetime is a four-dimensional manifold
instead of the “rigid”, four-dimensional Euclidean space.
Figure 1.2 Einstein elevators: The left elevator is thought to be placed
outside a gravitational field, but accelerated upwards with an acceleration
$-\vec g$; the right elevator is placed at rest in a gravitational field with gravitational
acceleration $\vec g$ directed downwards. According to the equivalence principle,
their occupants cannot distinguish these situations from each other.

As will be explained in the following two chapters, manifolds can locally
be mapped onto Euclidean space. In a freely-falling reference frame,
special relativity must hold, which implies that the Minkowskian metric
of special relativity must locally be valid. The same operation must
be possible in all freely-falling reference frames individually, but not
globally, as is illustrated by the example of the Einstein elevators falling
towards the Earth.
Thus, general relativity considers the metric of the spacetime manifold as
a dynamical field. The necessity to match it with the Minkowski metric
in freely-falling reference frames means that the signature of the metric
must be (−, +, +, +) or (+, −, −, −). A manifold with a metric which is
not positive definite is called pseudo-Riemannian, or Lorentzian if the
metric has the signature of the Minkowski metric.
The lecture starts with an introductory chapter describing the fundamental
characteristics of gravity, their immediate consequences and the
failure of a specially-relativistic theory of gravity. It then introduces in
two chapters the mathematical apparatus necessary for general relativity,
namely the basics of differential geometry, i.e. the geometry on manifolds.
After these necessary mathematical digressions, we shall return
to physics when we discuss the motion of test particles in given gravitational
fields in Chap. 4 and later introduce Einstein’s field equations in
Chap. 6.

1.2 Fundamental properties of gravity

1.2.1 Scales

The first remarkable property of gravity is its weakness. It is by far
the weakest of the four known fundamental interactions. To see this,
compare the gravitational and electrostatic forces acting between two
protons at a distance r. We have

$$ \frac{\text{gravity}}{\text{electrostatic force}} = \left(\frac{G m_\mathrm{p}^2}{r^2}\right)\left(\frac{e^2}{r^2}\right)^{-1} = \frac{G m_\mathrm{p}^2}{e^2} = 8.1\cdot10^{-37}\;! \tag{1.1} $$

Caution We are using Gaussian cgs units throughout, in which the
electrostatic potential of a charge q is simply $\Phi(r) = -q/r$. In these
units, the elementary charge is $e = 4.80\cdot10^{-10}\,\mathrm{g^{1/2}\,cm^{3/2}\,s^{-1}}$.

This leads to an interesting comparison of scales. In quantum physics, a
particle of mass m can be assigned the Compton wavelength

$$ \lambda = \frac{\hbar}{mc}\,, \tag{1.2} $$

where Planck’s constant h is replaced by ħ merely for conventional
reasons. We ask what the mass of the particle must be such that its
gravitational potential energy equals its rest mass energy mc², and set

$$ \frac{G m^2}{\lambda} \overset{!}{=} mc^2\,. \tag{1.3} $$

The result is the Planck mass,

$$ m = M_\mathrm{Pl} = \sqrt{\frac{\hbar c}{G}} = 2.2\cdot10^{-5}\,\mathrm{g} = 1.2\cdot10^{19}\,\frac{\mathrm{GeV}}{c^2}\,, \tag{1.4} $$

which, inserted into (1.2), yields the Planck length

$$ \lambda_\mathrm{Pl} = \sqrt{\frac{\hbar G}{c^3}} = 1.6\cdot10^{-33}\,\mathrm{cm} \tag{1.5} $$

and the Planck time

$$ t_\mathrm{Pl} = \frac{\lambda_\mathrm{Pl}}{c} = \sqrt{\frac{\hbar G}{c^5}} = 5.3\cdot10^{-44}\,\mathrm{s}\,. \tag{1.6} $$

As Max Planck noted already in 1900¹, these are the only scales for
mass, length and time that can be assigned an objective meaning.
¹ Über irreversible Strahlungsvorgänge, Annalen der Physik 306 (1900) 69

The Planck mass is huge in comparison to the mass scales of elementary
particle physics. The Planck length and time are commonly interpreted
as the scales where our “classical” description of spacetime is expected
to break down and must be replaced by an unknown theory combining
relativity and quantum physics.
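
The numbers quoted in (1.1) and (1.4)–(1.6) are easy to reproduce. The following is a minimal sketch, assuming rounded values for G, c, ħ, the proton mass and the erg-per-GeV conversion factor (these inputs and all function names are illustrative choices of this sketch, not taken from the notes; only the value of e is the one quoted above).

```python
import math

# Rounded constants in Gaussian cgs units (assumed inputs of this sketch).
G = 6.674e-8          # Newton's constant [cm^3 g^-1 s^-2]
C = 2.998e10          # speed of light [cm s^-1]
HBAR = 1.0546e-27     # reduced Planck constant [erg s]
M_P = 1.6726e-24      # proton mass [g]
E_CHARGE = 4.80e-10   # elementary charge [g^1/2 cm^3/2 s^-1], as quoted in the text
ERG_PER_GEV = 1.602e-3


def force_ratio():
    """Gravitational over electrostatic force between two protons, Eq. (1.1).

    The distance r cancels, so the ratio is simply G m_p^2 / e^2.
    """
    return G * M_P**2 / E_CHARGE**2


def planck_scales():
    """Return Planck mass [g], length [cm] and time [s], Eqs. (1.4)-(1.6)."""
    m_pl = math.sqrt(HBAR * C / G)
    l_pl = math.sqrt(HBAR * G / C**3)
    t_pl = l_pl / C
    return m_pl, l_pl, t_pl


def planck_mass_gev():
    """Planck mass expressed as a rest energy in GeV."""
    m_pl, _, _ = planck_scales()
    return m_pl * C**2 / ERG_PER_GEV


if __name__ == "__main__":
    m_pl, l_pl, t_pl = planck_scales()
    print(f"F_grav / F_el = {force_ratio():.2e}")
    print(f"M_Pl = {m_pl:.2e} g = {planck_mass_gev():.2e} GeV/c^2")
    print(f"lambda_Pl = {l_pl:.2e} cm, t_Pl = {t_pl:.2e} s")
```

With these inputs the results agree with the values in the text to the two significant digits given there.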
Using the Planck mass, the ratio from (1.1) can be written as

$$ \frac{G m_\mathrm{p}^2}{e^2} = \frac{1}{\alpha}\,\frac{m_\mathrm{p}^2}{M_\mathrm{Pl}^2}\,, \tag{1.7} $$

where $\alpha = e^2/(\hbar c) \approx 1/137$ is the fine-structure constant.

Dominance of gravity
These comparisons suggest that gravity will dominate all other
interactions once the mass of an object is sufficiently large. A mass scale
important for the astrophysics of stars is set by the ratio

$$ \frac{M_\mathrm{Pl}^2}{m_\mathrm{p}^2}\,M_\mathrm{Pl} = 1.7\cdot10^{38}\,M_\mathrm{Pl} = 3.7\cdot10^{33}\,\mathrm{g}\,, \tag{1.8} $$

which is almost two solar masses.

We shall see at the end of this lecture that stellar cores of this mass
cannot be stabilised against gravitational collapse.
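
As a quick consistency check of (1.7) and (1.8), one may evaluate both sides numerically. This is a small sketch under the same kind of assumptions as before: the constants are rounded values chosen for illustration, the solar mass of 1.989·10³³ g is an assumed input, and the function names are hypothetical.

```python
import math

# Rounded constants in Gaussian cgs units (assumed inputs of this sketch).
G = 6.674e-8          # [cm^3 g^-1 s^-2]
C = 2.998e10          # [cm s^-1]
HBAR = 1.0546e-27     # [erg s]
M_P = 1.6726e-24      # proton mass [g]
E_CHARGE = 4.80e-10   # elementary charge [g^1/2 cm^3/2 s^-1]
M_SUN = 1.989e33      # solar mass [g], assumed value


def planck_mass():
    """Planck mass in grams, Eq. (1.4)."""
    return math.sqrt(HBAR * C / G)


def fine_structure_constant():
    """alpha = e^2 / (hbar c) in Gaussian units."""
    return E_CHARGE**2 / (HBAR * C)


def force_ratio_via_planck_mass():
    """Right-hand side of Eq. (1.7): (1/alpha) (m_p / M_Pl)^2."""
    return (M_P / planck_mass())**2 / fine_structure_constant()


def stellar_mass_scale():
    """Mass scale of Eq. (1.8), (M_Pl / m_p)^2 M_Pl, in grams."""
    m_pl = planck_mass()
    return (m_pl / M_P)**2 * m_pl


if __name__ == "__main__":
    print(f"1/alpha            = {1.0 / fine_structure_constant():.1f}")
    print(f"G m_p^2 / e^2      = {G * M_P**2 / E_CHARGE**2:.3e}")
    print(f"(m_p/M_Pl)^2/alpha = {force_ratio_via_planck_mass():.3e}")
    print(f"mass scale (1.8)   = {stellar_mass_scale():.2e} g "
          f"= {stellar_mass_scale() / M_SUN:.2f} M_sun")
```

Both forms of the force ratio agree identically, since (1.7) is an algebraic rewriting of (1.1) rather than an approximation.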

1.2.2 The Equivalence Principle

The observation that inertial and gravitational mass cannot be
experimentally distinguished is a highly remarkable finding. It is by no means
obvious that the ratio between any force exerted on a body and its conse-
quential acceleration should have anything to do with the ratio between
the gravitational force and the body’s acceleration in a gravitational field.
The experimentally well-established fact that inertial and gravitational
mass are the same at least within our measurement accuracy was raised
to a guiding principle by Einstein, the principle of equivalence, which
can be formulated in several different ways.
Principle of equivalence
The weaker and less precise statement is that the motion of a test
body in a gravitational field is independent of its mass and composition,
which can be cast into the more precise form that in an arbitrary
gravitational field, no local non-gravitational experiment can distinguish a
freely falling, non-rotating system from a uniformly moving system in
absence of the gravitational field.

The latter is Einstein’s Equivalence Principle, which is the heuristic
guiding principle for the construction of general relativity.
It is important to note the following remarkable conceptual advance:
Newtonian mechanics starts from Newton’s axioms, which introduce
the concept of an inertial reference frame, saying that force-free bodies
in inertial systems remain at rest or move at constant velocity, and that
bodies in inertial systems experience an acceleration which is given by
the force acting on them, divided by their mass.
Firstly, inertial systems are a deeply unsatisfactory concept because
they cannot be realised in any strict sense. Approximations to inertial
systems are possible, but the degree to which a reference frame will
approximate an inertial system will depend on the precise circumstances
of the experiment or the observation made.
Secondly, Newton’s second axiom is, strictly speaking, circular in the
sense that it defines forces if one is willing to accept inertial systems,
while it defines inertial systems if one is willing to accept the relation
between force and acceleration. A satisfactory, non-circular definition of
force is not given in Newton’s theory. The existence of inertial frames is
postulated.

Caution Note that Newton assumed the existence of absolute space and
time. Strictly speaking, therefore, the problem of inertial frames did
not exist when he founded classical mechanics.

Special relativity replaces the rigid Newtonian concept of absolute space
and time by a spacetime which carries the peculiar light-cone structure
imprinted by the universality of the speed of light demanded by
Maxwell’s electrodynamics. Newtonian spacetime can be considered as
the Cartesian product $\mathbb{R}\times\mathbb{R}^3$. An instant $t\in\mathbb{R}$ in time uniquely identifies
the three-dimensional Euclidean space of all simultaneous events.
Of course, it remains possible in special relativity to define simultane-
ous events, but the three-dimensional hypersurface in four-dimensional
Euclidean space $\mathbb{R}^4$ identified in this way depends on the motion of the
observer relative to another observer. Independent of their relative mo-
tion, however, is the light-cone structure of Minkowskian spacetime. The
future light cone encloses events in the future of a point p in spacetime
which can be reached by material particles, and its boundary is defined
by events which can be reached from p by light signals. The past light
cone encloses events in the past of p from which material particles can
reach p, and its boundary is defined by events from which light signals
can reach p.
Yet, special relativity still makes use of the concept of inertial reference
frames. Physical laws are required to be invariant under transformations
from the Poincaré group, which translate from one inertial system to
another.

Flexible light-cone structure


General relativity keeps the light-cone structure of special relativity,
even though its rigidity is given up: the orientation of the light cones
can vary across spacetime.
Thus, the relativity of distances in space and time remains within the
theory. However, it is one of the great achievements of general relativ-
ity that it finally replaces the concept of inertial systems by something
else which can be experimentally demonstrated: the principle of equiva-
lence replaces inertial systems by non-rotating, freely-falling frames of
reference.

1.3 Consequences of the equivalence principle

Without any specific form of the theory, the equivalence principle imme-
diately allows us to draw conclusions on some of the consequences any
theory must have which is built upon it. We discuss two here to illustrate
its general power, namely the gravitational redshift and gravitational
light deflection.

1.3.1 Gravitational Redshift

We enter an Einstein elevator which is at rest in a gravitational field at
t = 0. The elevator is assumed to be small enough for the gravitational field
to be considered as homogeneous within it, and we let the (local) gravitational
acceleration be g.
According to the equivalence principle, the downward gravitational
acceleration felt in the elevator cannot locally be distinguished from a
constant upward acceleration of the elevator with the same acceleration g.
Adopting the equivalence principle, we thus assume that the gravitational
field is absent and that the elevator is constantly accelerated upward
instead.
At t = 0, a photon is emitted by a light source at the bottom of the
elevator, and received some time Δt later by a detector at the ceiling. The
time interval Δt is determined by

$$ h + \frac{g}{2}\Delta t^2 = c\,\Delta t\,, \tag{1.9} $$

where h is the height of the elevator. This equation has the solutions

$$ \Delta t_\pm = \frac{1}{g}\left[c \pm \sqrt{c^2 - 2gh}\right]\,,\qquad \Delta t_- = \frac{c}{g}\left[1 - \sqrt{1 - \frac{2gh}{c^2}}\right] \approx \frac{h}{c}\,; \tag{1.10} $$

the other branch makes no physical sense.
When the photon is received at the ceiling, the ceiling moves with the
velocity

$$ \Delta v = g\Delta t \approx \frac{gh}{c} \tag{1.11} $$

compared to the floor when the photon was emitted. The photon is thus
Doppler shifted with respect to its emission, and is received with the
longer wavelength

$$ \lambda' \approx \left(1 + \frac{\Delta v}{c}\right)\lambda \approx \left(1 + \frac{gh}{c^2}\right)\lambda\,. \tag{1.12} $$

The gravitational acceleration is given by the gravitational potential Φ
through

$$ g = |\vec\nabla\Phi| \quad\Rightarrow\quad gh \approx \Delta\Phi\,, \tag{1.13} $$

where $\Delta\Phi \approx |\vec\nabla\Phi|\,h$ is the change in Φ from the floor to the ceiling of the
elevator.

Gravitational redshift
The equivalence principle demands a gravitational redshift of

$$ z \equiv \frac{\lambda' - \lambda}{\lambda} \approx \frac{\Delta\Phi}{c^2} \tag{1.14} $$

of a light ray passing the potential difference ΔΦ.
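
To get a feeling for the size of the effect, the chain (1.9)–(1.14) can be evaluated for a laboratory-sized elevator. The sketch below is an illustration under assumed inputs: a cabin height of 22.5 m and g = 981 cm s⁻² are hypothetical example values, and the function names are not from the notes. The physical branch of (1.10) is evaluated in the algebraically equivalent form 2h/(c + √(c² − 2gh)), because subtracting two nearly equal numbers of order c would lose most significant digits in floating-point arithmetic.

```python
import math

C = 2.998e10  # speed of light [cm/s], rounded (assumed input)


def photon_travel_time(h, g, c=C):
    """Physical (minus) branch of Eq. (1.10).

    (c - sqrt(c^2 - 2gh)) / g is rewritten as 2h / (c + sqrt(c^2 - 2gh)),
    which is the same number but avoids catastrophic cancellation.
    """
    disc = c * c - 2.0 * g * h
    if disc < 0.0:
        raise ValueError("no real solution: the ceiling outruns the photon")
    return 2.0 * h / (c + math.sqrt(disc))


def redshift(h, g, c=C):
    """Redshift z = Delta v / c with Delta v = g * Delta t, Eqs. (1.11)-(1.14)."""
    return g * photon_travel_time(h, g, c) / c


if __name__ == "__main__":
    h = 2250.0   # cabin height [cm], hypothetical example value
    g = 981.0    # gravitational acceleration [cm/s^2]
    print(f"Delta t = {photon_travel_time(h, g):.6e} s  (h/c = {h / C:.6e} s)")
    print(f"z       = {redshift(h, g):.3e}  (g h / c^2 = {g * h / C**2:.3e})")
```

For such heights the exact travel time and the approximation h/c are indistinguishable at double precision, which is why the text can safely use Δt ≈ h/c.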

Figure 1.3 Two Einstein elevators, both outside a gravitational field and
accelerated upwards with acceleration g. When the photon reaches the
top of the elevator (left) or while the light ray crosses it (right), the elevator
is accelerated to the velocity $\Delta v = g\Delta t$. This leads to redshift (left) and
aberration (right).

1.3.2 Gravitational Light Deflection

Similarly, it can be concluded from the equivalence principle that light
rays should be curved in gravitational fields. To see this, consider again
the Einstein elevator from above which is at rest in a gravitational field
$g = |\vec\nabla\Phi|$ at t = 0.
As before, the equivalence principle asserts that we can consider the
elevator as being accelerated upwards with the acceleration g.
Suppose now that a horizontal light ray enters the elevator at t = 0 from
the left and leaves it at a time Δt = w/c to the right, if w is the horizontal
width of the elevator.
As the light ray leaves the elevator, the elevator’s velocity has increased
to

$$ \Delta v = g\Delta t = \frac{|\vec\nabla\Phi|\,w}{c} \tag{1.15} $$

such that, in the rest frame of the elevator, it leaves at an angle

$$ \Delta\alpha = \frac{\Delta v}{c} = \frac{|\vec\nabla\Phi|\,w}{c^2} \tag{1.16} $$

downward from the horizontal because of the aberration due to the finite
light speed.
Light deflection by gravitational fields
Since the upward accelerated elevator corresponds to an elevator at
rest in a downward gravitational field, this leads to the expectation that
light will be deflected towards gravitational fields.
Although it is possible to construct theories of gravity which obey the
equivalence principle and do not lead to gravitational light deflection,
the bending of light in gravitational fields is by now a well-established
experimental fact.
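
The aberration angle (1.16) is tiny on laboratory scales, as a short numerical sketch shows. The cabin width of 3 m and g = 981 cm s⁻² are hypothetical example values, and the helper names are illustrative only.

```python
import math

C = 2.998e10  # speed of light [cm/s], rounded (assumed input)


def deflection_angle(g, w, c=C):
    """Aberration angle of Eq. (1.16) in radians for acceleration g and cabin width w."""
    crossing_time = w / c          # time the ray needs to cross the cabin
    delta_v = g * crossing_time    # velocity gained by the cabin, Eq. (1.15)
    return delta_v / c


def apparent_drop(g, w, c=C):
    """Distance the cabin rises relative to the ray during the crossing, g t^2 / 2."""
    return 0.5 * g * (w / c) ** 2


if __name__ == "__main__":
    g, w = 981.0, 300.0  # [cm/s^2], [cm]; example values
    print(f"Delta alpha = {deflection_angle(g, w):.3e} rad")
    print(f"drop        = {apparent_drop(g, w):.3e} cm")
```

The apparent drop equals half the exit angle times the cabin width, as expected for a parabolic path whose slope grows linearly from zero.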

1.4 Futile attempts

1.4.1 Gravitational Redshift

We have seen before that the equivalence principle implies a gravitational
redshift, which has been demonstrated experimentally. We must thus
require from a theory of gravity that it does lead to gravitational redshift.
Suppose we wish to construct a theory of gravity which retains the
Minkowski metric $\eta_{\mu\nu}$. In such a theory, however it may look in detail,
the proper time measured by observers moving along a world line $x^\mu(\lambda)$
from $\lambda_1$ to $\lambda_2$ is

$$ \Delta\tau = \int_{\lambda_1}^{\lambda_2}\mathrm{d}\lambda\,\sqrt{-\eta_{\mu\nu}\dot x^\mu\dot x^\nu}\,, \tag{1.17} $$

where the minus sign under the square root appears because we choose
the signature of $\eta_{\mu\nu}$ to be (−1, 1, 1, 1).
Now, let a light ray propagate from the floor to the ceiling of the elevator
in which we have measured gravitational redshift before. Specifically, let
the light source shine between coordinate times t₁ and t₂. The emitted
photons will propagate to the receiver at the ceiling along world lines
which may be curved, but must be parallel because the metric is constant.
The time interval within which the photons arrive at the receiver must
thus equal the time interval t2 − t1 within which they left the emitter.
Thus there cannot be gravitational redshift in a theory of gravity in flat
spacetime.

1.4.2 A Scalar Theory of Gravity and the Perihelion Shift

Let us now try and construct a scalar theory of gravity starting from the
field equation

$$ \Box\phi = -4\pi G\,T\,, \tag{1.18} $$

where φ is the gravitational potential and $T = T^\mu{}_\mu$ is the trace of the
energy-momentum tensor. Note that φ is made dimensionless here by
dividing the Newtonian gravitational potential Φ by c².
In the limit of weak fields and non-relativistic matter, this reduces to
Poisson’s equation

$$ \vec\nabla^2\Phi = 4\pi G\rho\,, \tag{1.19} $$

since then the time derivatives in d’Alembert’s operator and the pressure
contributions to T can be neglected.
Let us further adopt the Lagrangian

$$ L(x^\mu, \dot x^\mu) = -mc\,\sqrt{-\eta_{\mu\nu}\dot x^\mu\dot x^\nu}\,(1+\phi)\,, \tag{1.20} $$

which is the ordinary Lagrangian of a free particle in special relativity,
multiplied by the factor (1 + φ). This is the only possible Lagrangian
that yields the right weak-field (Newtonian) limit.
We can write the square root in (1.20) as

$$ \sqrt{-\eta_{\mu\nu}\dot x^\mu\dot x^\nu} = \sqrt{c^2 - v^2} = c\,\sqrt{1-\beta^2}\,, \tag{1.21} $$

where β = v/c is the velocity in units of c. The weak-field limit of (1.20)
for non-relativistic particles is thus

$$ L(x^\mu, \dot x^\mu) \approx -mc^2\left(1 - \frac{v^2}{2c^2}\right)(1+\phi) \approx -mc^2 + \frac{m}{2}v^2 - mc^2\phi\,, \tag{1.22} $$

which is the right expression in Newtonian gravity.
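
The two approximation steps in (1.22) can be spelled out as follows. This is merely a worked sketch of the expansion, assuming β ≪ 1 and |φ| ≪ 1 and keeping terms up to first order in β² and φ separately.

```latex
\begin{align*}
  L &= -mc^2\sqrt{1-\beta^2}\,(1+\phi)
     \approx -mc^2\left(1-\frac{\beta^2}{2}\right)(1+\phi) \\
    &= -mc^2 + \frac{m}{2}v^2 - mc^2\phi + \frac{m}{2}v^2\phi
     \approx -mc^2 + \frac{m}{2}v^2 - mc^2\phi\,.
\end{align*}
```

The dropped term $\frac{m}{2}v^2\phi$ is a product of two small quantities. With $mc^2\phi = m\Phi$, the remaining terms are the constant rest energy, the Newtonian kinetic energy and the Newtonian potential energy.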


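The equations of motion can now be calculated inserting (1.20) into
Euler’s equations,

$$ \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial\dot x^\alpha} = \frac{\partial L}{\partial x^\alpha}\,. \tag{1.23} $$

On the right-hand side, we find

$$ \frac{\partial L}{\partial x^\alpha} = -mc^2\sqrt{1-\beta^2}\,\frac{\partial\phi}{\partial x^\alpha}\,. \tag{1.24} $$

On the left-hand side, we first have

$$ \frac{\partial L}{\partial\dot x^\alpha} = \frac{1}{c}\frac{\partial L}{\partial\beta^\alpha} = \frac{mc\,\beta_\alpha}{\sqrt{1-\beta^2}}\,(1+\phi)\,, \tag{1.25} $$

and thus

$$ \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial\dot x^\alpha} = mc\,(1+\phi)\left(\frac{\dot\beta_\alpha}{\sqrt{1-\beta^2}} + \frac{\beta_\alpha\,\vec\beta\cdot\dot{\vec\beta}}{(1-\beta^2)^{3/2}}\right) + \frac{mc\,\beta_\alpha}{\sqrt{1-\beta^2}}\,\dot\phi\,. \tag{1.26} $$

We shall now simplify these equations assuming that the potential is
static, $\dot\phi = 0$, and that the motion is non-relativistic, β ≪ 1. Then, (1.26)
becomes

$$ \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial\vec v} \approx mc\,(1+\phi)\,\dot{\vec\beta}\left(1 + \frac{\beta^2}{2}\right) \approx m(1+\phi)\,\ddot{\vec x}\,, \tag{1.27} $$

and (1.24) turns into

$$ \frac{\partial L}{\partial\vec x} \approx -mc^2\left(1 - \frac{\beta^2}{2}\right)\vec\nabla\phi \approx -mc^2\,\vec\nabla\phi\,. \tag{1.28} $$

The equation of motion thus reads, in this approximation,

$$ (1+\phi)\,\ddot{\vec x} = -c^2\,\vec\nabla\phi \tag{1.29} $$

or

$$ \ddot{\vec x} \approx -c^2\,\vec\nabla\phi\,(1-\phi) = -c^2\,\vec\nabla\left(\phi - \frac{\phi^2}{2}\right)\,. \tag{1.30} $$

Compared to the equation of motion in Newtonian gravity, therefore, the
potential is augmented by a quadratic perturbation.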
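
The step from (1.29) to (1.30) uses only a first-order expansion in φ and the chain rule. As a brief sketch, assuming |φ| ≪ 1:

```latex
\begin{align*}
  \ddot{\vec x} &= -\frac{c^2\,\vec\nabla\phi}{1+\phi}
                 \approx -c^2\,(1-\phi)\,\vec\nabla\phi\,, \\
  \vec\nabla\left(\phi - \frac{\phi^2}{2}\right)
                &= \vec\nabla\phi - \phi\,\vec\nabla\phi
                 = (1-\phi)\,\vec\nabla\phi\,.
\end{align*}
```

The first line is approximate, with corrections of order φ², while the second is an identity.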
For a static potential and non-relativistic matter, the potential is given by
Poisson’s equation.
We now proceed to work out the perihelion shift expected for planetary
orbits around the Sun in such a theory of gravity. As we know from the
discussion of Kepler’s problem in classical mechanics, the radius r and
the polar angle ϕ of such orbits are characterised by

$$ \frac{\mathrm{d}r}{\mathrm{d}\varphi} = \frac{mr^2}{L}\sqrt{\frac{2}{m}\left(E - V_L\right)}\,, \tag{1.31} $$

where $V_L$ is the effective potential energy

$$ V_L = V + \frac{L^2}{2mr^2}\,, \tag{1.32} $$

and L is the (orbital) angular momentum. Thus,

$$ \frac{\mathrm{d}r}{\mathrm{d}\varphi} = \frac{r^2}{L}\sqrt{2m(E - V) - \frac{L^2}{r^2}}\,. \tag{1.33} $$

Caution Recall that the equation of motion (1.31) follows from the
conservation laws of angular momentum, $\dot\varphi = L/(mr^2)$, and energy,
$\dot r^2 = \frac{2}{m}\left(E - V_L(r)\right)$.

The perihelion shift is the change in ϕ upon integrating once around the
orbit, or integrating twice from the perihelion radius r₀ to the aphelion
radius r₁,

$$ \Delta\varphi = 2\int_{r_0}^{r_1}\frac{\mathrm{d}\varphi}{\mathrm{d}r}\,\mathrm{d}r\,. \tag{1.34} $$

Inverting (1.33), it is easily seen that (1.34) can be written as

$$ \Delta\varphi = -2\,\frac{\partial}{\partial L}\int_{r_0}^{r_1}\mathrm{d}r\,\sqrt{2m(E - V) - \frac{L^2}{r^2}}\,. \tag{1.35} $$

? Confirm (1.35) by carrying out the calculation yourself.

Now, we split the potential energy V into the Newtonian contribution V₀
and a perturbation δV ≪ V and expand the integrand to lowest order in
δV, which yields

$$ \Delta\varphi \approx -2\,\frac{\partial}{\partial L}\int_{r_0}^{r_1}\mathrm{d}r\,\sqrt{A_0}\left(1 - \frac{m\,\delta V}{A_0}\right)\,, \tag{1.36} $$

where the abbreviation

$$ A_0 \equiv 2m(E - V_0) - \frac{L^2}{r^2} \tag{1.37} $$

was inserted for convenience.
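
For readers who want to compare their own calculation, here is one possible route to (1.35). It is a sketch, with the abbreviation A introduced here only for this purpose (it is the analogue of A₀ with the full potential V).

```latex
\begin{align*}
  A &\equiv 2m(E-V) - \frac{L^2}{r^2}\,, \qquad
  \frac{\partial\sqrt{A}}{\partial L}
    = -\frac{L}{r^2\sqrt{A}}
    = -\frac{\mathrm{d}\varphi}{\mathrm{d}r}\,, \\
  \Delta\varphi
    &= 2\int_{r_0}^{r_1}\frac{\mathrm{d}\varphi}{\mathrm{d}r}\,\mathrm{d}r
     = -2\int_{r_0}^{r_1}\frac{\partial\sqrt{A}}{\partial L}\,\mathrm{d}r
     = -2\,\frac{\partial}{\partial L}\int_{r_0}^{r_1}\sqrt{A}\,\mathrm{d}r\,.
\end{align*}
```

The derivative is taken at fixed energy E. It can be pulled in front of the integral even though the turning points r₀ and r₁ depend on L, because the integrand √A vanishes there, so the boundary terms do not contribute.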
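We know that orbits in Newtonian gravity are closed, so that the first
term in the integrand of (1.36) must vanish. Thus, we can write

$$ \Delta\varphi \approx 2\,\frac{\partial}{\partial L}\int_{r_0}^{r_1}\mathrm{d}r\,\frac{m\,\delta V}{\sqrt{A_0}}\,. \tag{1.38} $$

Next, we transform the integration variable from r to ϕ, using that

$$ \frac{\mathrm{d}r}{\mathrm{d}\varphi} \approx \frac{r^2}{L}\sqrt{A_0} \tag{1.39} $$

to leading order in δV, according to (1.31). Thus, (1.38) can be written
as

$$ \Delta\varphi \approx \frac{\partial}{\partial L}\,\frac{2m}{L}\int_0^\pi\mathrm{d}\varphi\,r^2\,\delta V\,. \tag{1.40} $$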

Finally, we specialise the potential energy. Since Poisson’s equation for


the gravitational potential remains valid, we have
GM
m
V0 = mc2 φ = − , (1.41)
r
where M
is the Sun’s mass, and following (1.30), the potential perturba-
tion is
φ2 mc2 V02 G2 M
2 m
δV = mc2 = = . (1.42)
2 2 m2 c4 2c2 r2
Inserting this into (1.40) yields the perihelion shift
∂ m πG2 M
2 m πG2 M
2 m2
Δϕ = = − . (1.43)
∂L L c2 c2 L2

The angular momentum L can be expressed by the semi-major axis a and the eccentricity e of the orbit,
\[ L^2 = GM_\odot m^2 a(1 - e^2) , \tag{1.44} \]
which allows us to write (1.43) in the form
\[ \Delta\phi = -\frac{\pi GM_\odot}{ac^2(1 - e^2)} . \tag{1.45} \]
For the Sun, M_⊙ = 2 · 10^33 g, thus
\[ \frac{GM_\odot}{c^2} = 1.5\cdot10^5\,\mathrm{cm} . \tag{1.46} \]
For Mercury, a = 5.8 · 10^12 cm, and the eccentricity e = 0.2 can be neglected because it appears quadratically in (1.45). Thus, we find
\[ \Delta\phi = -8.1\cdot10^{-8}\,\mathrm{radian} = -0.017'' \tag{1.47} \]
per orbit. Mercury’s orbital time is 88 d, i.e. it completes about 415 orbits per century, so that the perihelion shift predicted by the scalar theory of gravity is
\[ \Delta\phi_{100} = -7'' \tag{1.48} \]
per century.
This turns out to be wrong: Mercury’s perihelion shift is six times as
large, and not even the sign is right. Therefore, our scalar theory of
gravity fails in its first comparison with observations, showing that we
have to walk along a different route.
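
The arithmetic leading from (1.45) to (1.48) can be checked with a few lines of Python. This is only a sketch using the rounded input values quoted in the text; the function and variable names are our own choices for illustration.

```python
import math

# Rounded values quoted in the text (cgs units).
GM_OVER_C2 = 1.5e5         # G M_sun / c^2 in cm, eq. (1.46)
A_MERCURY = 5.8e12         # semi-major axis of Mercury in cm
E_MERCURY = 0.2            # eccentricity of Mercury
ORBITS_PER_CENTURY = 415   # from an orbital time of 88 d

RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0


def scalar_theory_shift(gm_over_c2, a, e=0.0):
    """Perihelion shift per orbit in radians according to eq. (1.45)."""
    return -math.pi * gm_over_c2 / (a * (1.0 - e**2))


# Eccentricity neglected, as in the text.
shift_rad = scalar_theory_shift(GM_OVER_C2, A_MERCURY)
shift_arcsec = shift_rad * RAD_TO_ARCSEC
shift_century = shift_arcsec * ORBITS_PER_CENTURY

# Keeping e = 0.2 changes the result only by a few per cent.
shift_century_with_e = (scalar_theory_shift(GM_OVER_C2, A_MERCURY, E_MERCURY)
                        * RAD_TO_ARCSEC * ORBITS_PER_CENTURY)

print(f"{shift_rad:.2e} rad per orbit")
print(f"{shift_arcsec:.3f} arcsec per orbit")
print(f"{shift_century:.1f} arcsec per century (e neglected)")
print(f"{shift_century_with_e:.1f} arcsec per century (e = 0.2)")
```

Keeping the eccentricity changes the result only through the factor 1/(1 − e²) ≈ 1.04, which confirms that neglecting it is harmless at this level of accuracy.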
Chapter 2

Differential Geometry I

2.1 Differentiable manifolds

By the preceding discussion of how a theory of gravity may be constructed which is compatible with special relativity, we are led to the concept of a spacetime which “looks like” Minkowskian spacetime locally, but may globally be curved. This concept is cast into a mathematically precise form by the introduction of a manifold.

Caution A topological space is a set M together with a collection T of open subsets T_i ⊂ M with the properties (i) ∅ ∈ T and M ∈ T; (ii) ∩_{i=1}^n T_i ∈ T for any finite n; (iii) ∪_{i=1}^n T_i ∈ T for any n. In a Hausdorff space, any two points x, y ∈ M with x ≠ y can be surrounded by disjoint neighbourhoods.

Manifolds
An n-dimensional manifold M is a topological Hausdorff space with a countable base, which is locally homeomorphic to R^n. This means that for every point p ∈ M, an open neighbourhood U of p exists together with a homeomorphism h which maps U onto an open subset U′ of R^n,
\[ h : U \to U' . \tag{2.1} \]

A trivial example for an n-dimensional manifold is R^n itself, on which h may be the identity map id. Thus, h is a specialisation of a map φ from one manifold M to another manifold N, φ : M → N.
The homeomorphism h is called a chart or a coordinate system in the language of physics. U is the domain or the coordinate neighbourhood of the chart. The image h(p) of a point p ∈ M under the chart h is expressed by the n real numbers (x^1, . . . , x^n), the coordinates of p in the chart h.

Caution A homeomorphism (not to be confused with a homomorphism) is a bijective, continuous map whose inverse is also continuous.

A set of charts h_α is called an atlas of M if the domains of the charts cover M completely.

Charts and atlases
Charts are homeomorphisms from open subsets of an n-dimensional manifold M onto open subsets of R^n. An atlas is a collection of charts whose domains cover M completely.


Example: The sphere as a manifold

An example for a manifold is the n-sphere S^n, for which the two-sphere S^2 is a particular specialisation. It cannot be mapped homeomorphically onto R^2 as a whole, but pieces of it can.
We can embed the two-sphere into R^3 and describe it as the point set
\[ S^2 = \left\{ (x^1, x^2, x^3) \in \mathbb{R}^3 \,\middle|\, (x^1)^2 + (x^2)^2 + (x^3)^2 = 1 \right\} ; \tag{2.2} \]
then, the six half-spheres U_i^± defined by
\[ U_i^\pm = \left\{ (x^1, x^2, x^3) \in S^2 \,\middle|\, \pm x^i > 0 \right\} \tag{2.3} \]
can be considered as domains of maps whose union covers S^2 completely, and the charts can be the projections of the half-spheres onto open disks
\[ D_{ij} = \left\{ (x^i, x^j) \in \mathbb{R}^2 \,\middle|\, (x^i)^2 + (x^j)^2 < 1 \right\} , \tag{2.4} \]
such as
\[ f_1^+ : U_1^+ \to D_{23} , \quad f_1^+(x^1, x^2, x^3) = (x^2, x^3) . \tag{2.5} \]
Thus, the six charts f_i^±, i ∈ {1, 2, 3}, together form an atlas of the two-sphere. See Fig. 2.1 for an illustration.

Let now h_α and h_β be two charts, and U_αβ ≡ U_α ∩ U_β ≠ ∅ be the intersection of their domains. Then, the composition of charts h_β ◦ h_α^{-1} exists and defines a map between two open sets in R^n which describes the change of coordinates or a coordinate transform on the intersection of the domains U_α and U_β. An atlas of a manifold is called differentiable if the coordinate changes between all its charts are differentiable. A manifold, combined with a differentiable atlas, is called a differentiable manifold.

Using charts, it is possible to define differentiable maps between manifolds. Let M and N be differentiable manifolds of dimension m and n, respectively, and φ : M → N be a map from one manifold to the other. Introduce further two charts h : M → M′ ⊂ R^m and k : N → N′ ⊂ R^n whose domains cover a point p ∈ M and its image φ(p) ∈ N. Then, the combination k ◦ φ ◦ h^{-1} is a map from the domain M′ to the domain N′, for which it is clear from advanced calculus what differentiability means. Unless stated otherwise, we shall generally assume that coordinate changes and maps between manifolds are C^∞, i.e. their derivatives of all orders exist and are continuous.

Differentiable atlases and maps
An atlas is differentiable if all of its coordinate changes are differentiable. Differentiable maps between manifolds are defined by means of differentiable charts.

Example: A differentiable atlas for the 2-sphere

To construct an example for a differentiable atlas, we return to the two-sphere S^2 and the atlas of the six projection charts A = { f_1^±, f_2^±, f_3^± } described above and investigate whether it is differentiable. For doing so, we arbitrarily pick the charts f_3^+ and f_1^+, whose domains are the “northern” and “eastern” half-spheres, respectively, which overlap on the “north-eastern” quarter-sphere. Let therefore p = (p^1, p^2, p^3) be a point in the domain overlap, then
\[ \begin{aligned}
f_3^+(p) &= (p^1, p^2) , \quad f_1^+(p) = (p^2, p^3) , \\
(f_3^+)^{-1}(p^1, p^2) &= \left( p^1, p^2, \sqrt{1 - (p^1)^2 - (p^2)^2} \right) , \\
f_1^+ \circ (f_3^+)^{-1}(p^1, p^2) &= \left( p^2, \sqrt{1 - (p^1)^2 - (p^2)^2} \right) ,
\end{aligned} \tag{2.6} \]
which is obviously differentiable. The same applies to all other coordinate changes between charts of A, and thus S^2 is a differentiable manifold.
As an example for a differentiable map, let φ : S^2 → S^2 be a map which rotates the sphere by 45° around its z axis. Let us further choose a point p on the positive quadrant of S^2 in which all coordinates are positive. We can combine φ with the charts f_3^+ and f_1^+ to define the map
\[ f_1^+ \circ \varphi \circ (f_3^+)^{-1}(p^1, p^2) = \left( \frac{p^1 + p^2}{\sqrt{2}}, \sqrt{1 - (p^1)^2 - (p^2)^2} \right) , \tag{2.7} \]
which is also evidently differentiable.
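
The projection charts and the coordinate change (2.6) are simple enough to be written down directly. The following Python sketch does so for the charts f_3^+ and f_1^+ and for the rotation example (2.7); the function names and the sample point are our own choices, and the rotation is assumed to be counter-clockwise about the x^3 axis.

```python
import math


def f3_plus(p):
    """Chart on the northern half-sphere U_3^+: drop the third coordinate."""
    return (p[0], p[1])


def f1_plus(p):
    """Chart on the eastern half-sphere U_1^+: drop the first coordinate."""
    return (p[1], p[2])


def f3_plus_inv(x):
    """Inverse of f3_plus: lift a point of the open disk D_12 back to S^2."""
    a, b = x
    return (a, b, math.sqrt(1.0 - a * a - b * b))


def transition(x):
    """Coordinate change f1_plus o f3_plus^{-1}, eq. (2.6)."""
    return f1_plus(f3_plus_inv(x))


def rotate_z(p, angle=math.pi / 4):
    """Rotation of the sphere about the x^3 axis (45 degrees by default)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1], p[2])


def rotated_in_charts(x):
    """The map f1_plus o phi o f3_plus^{-1}, eq. (2.7)."""
    return f1_plus(rotate_z(f3_plus_inv(x)))


x = (0.5, 0.3)                      # chart coordinates of a point in the overlap
p = f3_plus_inv(x)
print(p, sum(c * c for c in p))     # the lifted point lies on the unit sphere
print(transition(x))                # (p2, sqrt(1 - p1^2 - p2^2))
print(rotated_in_charts(x))         # ((p1 + p2)/sqrt(2), sqrt(1 - p1^2 - p2^2))
```

The sample point is chosen with p^1 > p^2 > 0 so that both p and its rotated image lie in the domain U_1^+ of the second chart.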
Figure 2.1 Example for a chart, explained in the text: the point p on the half-sphere U_3^+ of the two-sphere is projected to the point (p^1, p^2) in the domain D_12 ⊂ R^2.

Finally, we introduce product manifolds in a straightforward way. Given two differentiable manifolds M and N of dimension m and n, respectively, we can turn the product space M × N consisting of all pairs (p, q) with p ∈ M and q ∈ N into an (m + n)-dimensional manifold as follows: if h : M → M′ and k : N → N′ are charts of M and N, a chart h × k can be defined on M × N such that
\[ h \times k : M \times N \to M' \times N' , \quad (h \times k)(p, q) = \left( h(p), k(q) \right) . \tag{2.8} \]
In other words, pairs of points from the product manifold are mapped to pairs of points from the two open subsets M′ ⊂ R^m and N′ ⊂ R^n.

2.2 The tangent space

2.2.1 Tangent vectors

We have now seen how to construct local coordinate systems, or charts, on a manifold, how to change between them, and how to use charts to define what differentiable functions on the manifold are. We now proceed to see how vectors can be introduced on a manifold.

Example: Product manifold

Many manifolds which are relevant in General Relativity can be expressed as product manifolds of the Euclidean space R^m with spheres S^n. For example, we can construct the product manifold R × S^2 composed of the real line and the two-sphere. Points on this product manifold can be mapped onto R × R^2, for instance using the identical chart id on R and the chart f_3^+ on the “northern” half-sphere U_3^+ of S^2,
\[ (\mathrm{id} \times f_3^+) : \mathbb{R} \times U_3^+ \to \mathbb{R} \times D_{12} , \quad (p, q) \mapsto (p, q^1, q^2) . \tag{2.9} \]

Recall the definition of a vector space: a set V, combined with a field (Körper in German) F, an addition,
\[ + : V \times V \to V , \quad (v, w) \mapsto v + w , \tag{2.10} \]
and a multiplication,
\[ \cdot : F \times V \to V , \quad (\lambda, v) \mapsto \lambda v , \tag{2.11} \]
is an F-vector space if V is an Abelian group under the addition + and the multiplication is distributive and associative. In other words, a vector space is a set of elements which can be added and multiplied with scalars (i.e. numbers from the field F).

? What are the defining properties of a field?
On a curved manifold, this vector space structure is lost because it is not clear how vectors at different points on the manifold should be added. However, it still makes sense to define vectors locally in terms of infinitesimal displacements within a sufficiently small neighbourhood of a point p, which are “tangential” to the manifold at p.
This leads to the concept of the tangent space of a manifold, whose elements are tangent vectors, or directional derivatives of functions. We denote by F the set of C^∞ functions f from the manifold into R.

Example: Functions on a manifold


Examples for functions S^2 → R on the manifold S^2 could be the average temperature on Earth or the height of the Earth’s surface above sea level.

Tangent space
Generally, the tangent space T_pM of a differentiable manifold M at a point p is the set of derivations of F(p). A derivation v is a map from F(p) into R,
\[ v : \mathcal{F}(p) \to \mathbb{R} , \tag{2.12} \]
which is linear,
\[ v(\lambda f + \mu g) = \lambda v(f) + \mu v(g) \tag{2.13} \]
for f, g ∈ F(p) and λ, μ ∈ R, and satisfies the product rule (or Leibniz rule)
\[ v(fg) = v(f)\,g + f\,v(g) . \tag{2.14} \]
See Fig. 2.2 for an illustration of the tangent space to a 2-sphere.

Figure 2.2 Illustration of the tangent space T_pM at point p on the 2-sphere.

Note that this definition immediately implies that the derivation of a constant vanishes: let h ∈ F be a constant function, h(p) = c for all p ∈ M, then v(h^2) = 2c v(h) from (2.14) and v(h^2) = v(ch) = c v(h) from (2.13), which is possible only if v(h) = 0.
Together with the real numbers R and their addition and multiplication laws, T_pM does indeed have the structure of a vector space, with
\[ (v + w)(f) = v(f) + w(f) \quad\text{and}\quad (\lambda v)(f) = \lambda v(f) \tag{2.15} \]
for v, w ∈ T_pM, f ∈ F and λ ∈ R.

2.2.2 Coordinate basis

We now construct a basis for the vector space T_pM, i.e. we provide a complete set {e_i} of linearly independent basis vectors. For doing so, let h : U → U′ ⊂ R^n be a chart with p ∈ U and f ∈ F(p) a function. Then, f ◦ h^{-1} : U′ → R is C^∞ by definition, and we introduce n vectors e_i ∈ T_pM, 1 ≤ i ≤ n, by
\[ e_i(f) := \left. \frac{\partial}{\partial x^i}\left( f \circ h^{-1} \right) \right|_{h(p)} , \tag{2.16} \]
where x^i are the usual cartesian coordinates of R^n.
The function (f ◦ h^{-1}) is applied to the image h(p) ∈ R^n of p under the chart h, i.e. (f ◦ h^{-1}) “carries” the function f from the manifold M to the locally homeomorphic manifold R^n.
To show that these vectors span T_pM, we first state that for any C^∞ function F : U′ → R defined on an open neighbourhood U′ of the origin of R^n, there exist n C^∞ functions H_i : U′ → R such that
\[ F(x) = F(0) + \sum_{i=1}^n x^i H_i(x) . \tag{2.17} \]
Note the equality! This is not a Taylor expansion. This is easily seen using the identity
\[ \begin{aligned}
F(x) - F(0) &= \int_0^1 \frac{\mathrm{d}}{\mathrm{d}t} F(tx^1, \ldots, tx^n)\,\mathrm{d}t \\
&= \sum_{i=1}^n x^i \int_0^1 D_iF(tx^1, \ldots, tx^n)\,\mathrm{d}t ,
\end{aligned} \tag{2.18} \]
where D_i is the partial derivative with respect to the i-th argument of F. Thus, it suffices to set
\[ H_i(x) = \int_0^1 D_iF(tx^1, \ldots, tx^n)\,\mathrm{d}t \tag{2.19} \]
to prove (2.17). For x = 0 in particular, we find
\[ H_i(0) = \int_0^1 \left. \frac{\partial F}{\partial x^i} \right|_0 \mathrm{d}t = \left. \frac{\partial F}{\partial x^i} \right|_0 . \tag{2.20} \]
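
The exact decomposition (2.17) with the functions H_i from (2.19) can be verified numerically. The sketch below uses an arbitrarily chosen smooth test function on R^2, central finite differences for D_i F and the midpoint rule for the t integral; these numerical choices are ours and only serve as an illustration.

```python
import math


def F(x):
    """A smooth test function on R^2 (an arbitrary choice for illustration)."""
    return math.exp(x[0]) * math.cos(x[1]) + x[0] * x[1]


def partial(func, x, i, eps=1e-6):
    """Central finite-difference approximation of D_i func at x."""
    xp = list(x)
    xm = list(x)
    xp[i] += eps
    xm[i] -= eps
    return (func(xp) - func(xm)) / (2.0 * eps)


def H(i, x, steps=200):
    """H_i(x) = int_0^1 D_i F(t x) dt, eq. (2.19), by the midpoint rule."""
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) / steps
        total += partial(F, [t * c for c in x], i)
    return total / steps


x = [0.7, -0.4]
lhs = F(x)
rhs = F([0.0, 0.0]) + sum(x[i] * H(i, x) for i in range(2))
print(lhs, rhs)                                      # agree up to numerical error
print(H(0, [0.0, 0.0]), partial(F, [0.0, 0.0], 0))   # eq. (2.20)
```

Both sides of (2.17) agree up to the discretisation error of the quadrature, although x is not small: the relation is an identity, not an approximation.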

Now we substitute F = f ◦ h^{-1} and choose a chart h : U → U′ such that h(q) = x and h(p) = 0, i.e. q = h^{-1}(x). Then, we first obtain from (2.17)
\[ f(q) = f(p) + \sum_{i=1}^n (x^i \circ h)(q)\,(H_i \circ h)(q) , \tag{2.21} \]
and from (2.20)
\[ H_i(0) = (H_i \circ h)(p) = \left. \frac{\partial}{\partial x^i}\left( f \circ h^{-1} \right) \right|_{h(p)} = e_i(f) . \tag{2.22} \]
Next, we apply a tangent vector v ∈ T_pM to (2.21),
\[ \begin{aligned}
v(f) &= v[f(p)] + \sum_{i=1}^n \left[ v(x^i \circ h)\,(H_i \circ h)|_p + (x^i \circ h)|_p\,v(H_i \circ h) \right] \\
&= \sum_{i=1}^n v(x^i \circ h)\,e_i(f) ,
\end{aligned} \tag{2.23} \]
where we have used that v applied to the constant f(p) vanishes, that (x^i ◦ h)(p) = 0 and that H_i(0) = e_i(f) according to (2.22). Thus, setting v^i = v(x^i ◦ h), we find that any v ∈ T_pM can be written as a linear combination of the basis vectors e_i. This also demonstrates that the dimension of the tangent space T_pM equals that of the manifold itself.

Coordinate basis of T_pM
The basis {e_i}, which is often simply denoted as {∂/∂x^i} or {∂_i}, is called a coordinate basis of T_pM. Vectors v ∈ T_pM can thus be written as
\[ v = v^i e_i = v^i \partial_i . \tag{2.24} \]

If we choose a different chart h′ instead of h, we obtain of course a different coordinate basis {e′_i}. Denoting the i-th coordinate of the map h′ ◦ h^{-1} with x′^i, the chain rule applied to f ◦ h^{-1} = (f ◦ h′^{-1}) ◦ (h′ ◦ h^{-1}) yields
\[ e_i = \sum_{j=1}^n \frac{\partial x'^j}{\partial x^i}\,e'_j =: J'^j_i\,e'_j , \tag{2.25} \]
which shows that the two different coordinate bases are related by the Jacobian matrix of the coordinate change, which has the elements J′^j_i = ∂x′^j/∂x^i. Its inverse has the elements J^i_j = ∂x^i/∂x′^j.

This relates the present definition of a tangent vector to the traditional definition of a vector as a quantity whose components transform as
\[ v'^i = v(x^i \circ h') = v^j e_j(x^i \circ h') = \sum_{j=1}^n \frac{\partial x'^i}{\partial x^j}\,v^j = J'^i_j\,v^j . \tag{2.26} \]
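
As an illustration of the transformation law for vector components, we can take Cartesian coordinates (x, y) and polar coordinates (r, θ) on the plane as the two charts. The Python sketch below, with names of our own choosing, approximates the Jacobian ∂x′^i/∂x^j by central finite differences and applies it to the components of the vector field generating rotations about the origin.

```python
import math


def to_polar(x):
    """Coordinate change h' o h^{-1}: Cartesian (x, y) -> polar (r, theta)."""
    return (math.hypot(x[0], x[1]), math.atan2(x[1], x[0]))


def jacobian(change, x, eps=1e-6):
    """Matrix with entries [i][j] = d x'^i / d x^j, by central differences."""
    n = len(x)
    cols = []
    for j in range(n):
        xp = list(x)
        xm = list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = change(xp), change(xm)
        cols.append([(fp[i] - fm[i]) / (2.0 * eps) for i in range(n)])
    # cols[j][i] holds d x'^i / d x^j; transpose to get the index order [i][j].
    return [[cols[j][i] for j in range(n)] for i in range(n)]


def transform_components(jac, v):
    """New components v'^i = (d x'^i / d x^j) v^j, summed over j."""
    return [sum(jac[i][j] * v[j] for j in range(len(v)))
            for i in range(len(jac))]


point = (1.2, 0.9)
v_cartesian = (-point[1], point[0])   # generator of rotations about the origin
v_polar = transform_components(jacobian(to_polar, point), v_cartesian)
print(v_polar)   # approximately [0, 1]: no radial part, unit angular rate
```

In polar coordinates, the rotation generator has the components (0, 1) everywhere, which the numerical transformation reproduces.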

Repeating the construction of a tangent space at another point q ∈ M, we obtain a tangent space T_qM which cannot be identified in any way with the tangent space T_pM given only the structure of a differentiable manifold that we have so far.
Consequently, a vector field is defined as a map v : p ↦ v_p which assigns a tangent vector v_p ∈ T_pM to every point p ∈ M. If we apply a vector field v to a C^∞ function f, its result (v(f))(p) is a number for each point p. The vector field is called smooth if the function v(f) is smooth for every smooth function f. Since we can write v = v^i ∂_i with components v^i in a local coordinate neighbourhood, the function v(f) is
\[ (v(f))(p) = v^i(p)\,\partial_i f(p) , \tag{2.27} \]
and thus it is called the derivative of f with respect to the vector field v.

2.2.3 Curves and infinitesimal transformations

We can give a geometrical meaning to tangent vectors as “infinitesimal displacements” on the manifold. First, we define a curve on M through p ∈ M as a map from an open interval I ⊂ R with 0 ∈ I into M,
\[ \gamma : I \to M , \tag{2.28} \]
such that γ(0) = p.
Next, we introduce a one-parameter group of diffeomorphisms γ_t as a C^∞ map,
\[ \gamma_t : \mathbb{R} \times M \to M , \tag{2.29} \]
such that for a fixed t ∈ R, γ_t : M → M is a diffeomorphism and, for all t, s ∈ R, γ_t ◦ γ_s = γ_{t+s}. Note that the latter requirement implies that γ_0 is the identity map.

Caution A diffeomorphism is a continuously differentiable, bijective map with a continuously differentiable inverse.

For a fixed t, γ_t maps points p ∈ M to other points q ∈ M in a differentiable way. As an example on the two-sphere S^2, γ_t could be the map which rotates the sphere about an (arbitrary) z axis by an angle parameterised by t, such that γ_0 is the rotation by zero degrees.
We can now associate a vector field v to γ_t as follows: For a fixed point p ∈ M, the map t ↦ γ_t(p) from R into M is a curve as defined above which passes through p at t = 0. This curve is called an orbit of γ_t. Then, we assign to p the tangent vector v_p to this curve at t = 0. Repeating this operation for all points p ∈ M defines a vector field v on M which is associated with γ_t and can be considered as the infinitesimal generator of the transformations γ_t.

Example: Transformation of S^2
In our example on S^2, we fix a point p on the sphere whose orbit under the map γ_t is a part of the “latitude circle” through p. The tangent vector to this curve in p defines the local “direction of motion” under the rotation expressed by γ_t. Applying this to all points p ∈ S^2 defines a vector field v on S^2.

Conversely, given a vector field v on M, we can construct curves through all points p ∈ M whose tangent vectors are v_p. This is most easily seen in a local coordinate neighbourhood, h(p) = (x^1, . . . , x^n), in which the curves are the unique solutions of the system
\[ \frac{\mathrm{d}x^i}{\mathrm{d}t} = v^i(x^1, \ldots, x^n) \tag{2.30} \]
of ordinary, first-order differential equations. Thus, tangent vectors can be identified with infinitesimal transformations of the manifold.
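
The construction of orbits from a vector field via (2.30) can be sketched numerically. The example below integrates the system for the rotation generator v = (−x^2, x^1) in a chart on R^2 with a classical Runge-Kutta scheme and checks the group property γ_t ◦ γ_s = γ_{t+s}; the choice of field, integrator and step number is ours.

```python
import math


def v(x):
    """Components v^i(x) of the rotation generator in a chart on R^2."""
    return (-x[1], x[0])


def rk4_step(field, x, dt):
    """One classical Runge-Kutta step for dx^i/dt = v^i(x), eq. (2.30)."""
    n = len(x)
    k1 = field(x)
    k2 = field([x[i] + 0.5 * dt * k1[i] for i in range(n)])
    k3 = field([x[i] + 0.5 * dt * k2[i] for i in range(n)])
    k4 = field([x[i] + dt * k3[i] for i in range(n)])
    return [x[i] + dt / 6.0 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
            for i in range(n)]


def flow(field, x0, t, steps=1000):
    """Approximate gamma_t(x0) by integrating the orbit through x0."""
    x = list(x0)
    dt = t / steps
    for _ in range(steps):
        x = rk4_step(field, x, dt)
    return x


x0 = [1.0, 0.0]
quarter = flow(v, x0, math.pi / 2)

# Group property gamma_t o gamma_s = gamma_{t+s}, checked numerically.
two_steps = flow(v, flow(v, x0, 0.3), 0.5)
one_step = flow(v, x0, 0.8)

print(quarter)               # close to [0, 1]: a quarter turn
print(two_steps, one_step)   # agree up to integration error
```

The orbit through (1, 0) is the unit circle, and flowing for t = π/2 carries the point to (0, 1), as expected for a rotation by 90 degrees.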

Given two vector fields v, w and a function f on M, we can define the commutator of the two fields as
\[ [v, w](f) = vw(f) - wv(f) . \tag{2.31} \]
In coordinates, we can write v = v^i ∂_i and w = w^j ∂_j, and the commutator can be written as
\[ [v, w] = \left( v^i \partial_i w^j - w^i \partial_i v^j \right) \partial_j . \tag{2.32} \]
It can easily be shown to have the following properties (where v, w, x are vector fields and f, g are functions on M):
\[ \begin{aligned}
[v + w, x] &= [v, x] + [w, x] \\
[v, w] &= -[w, v] \\
[fv, gw] &= fg\,[v, w] + f\,v(g)\,w - g\,w(f)\,v \\
[v, [w, x]] + [x, [v, w]] + [w, [x, v]] &= 0 ,
\end{aligned} \tag{2.33} \]
where the latter equation is called the Jacobi identity.
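
The component formula (2.32) lends itself to a direct numerical check. The following sketch evaluates the commutator of two simple vector fields on R^2 using central finite differences for the partial derivatives; the two fields and all names are arbitrary choices made for illustration.

```python
def partial(func, x, i, eps=1e-5):
    """Central finite-difference approximation of d func / d x^i at x."""
    xp = list(x)
    xm = list(x)
    xp[i] += eps
    xm[i] -= eps
    return (func(xp) - func(xm)) / (2.0 * eps)


def commutator(v, w, x):
    """Components [v, w]^j = v^i d_i w^j - w^i d_i v^j at x, eq. (2.32)."""
    n = len(x)
    vx, wx = v(x), w(x)
    result = []
    for j in range(n):
        total = 0.0
        for i in range(n):
            total += vx[i] * partial(lambda y: w(y)[j], x, i)
            total -= wx[i] * partial(lambda y: v(y)[j], x, i)
        result.append(total)
    return result


def v(x):
    """The field v = y d_x in coordinates (x, y)."""
    return (x[1], 0.0)


def w(x):
    """The field w = x d_y in coordinates (x, y)."""
    return (0.0, x[0])


x = [0.8, -0.3]
vw = commutator(v, w, x)
wv = commutator(w, v, x)
print(vw)   # approximately [-x, y] = [-0.8, -0.3]
print(wv)   # opposite sign, illustrating [v, w] = -[w, v]
```

For these two fields, (2.32) gives [v, w] = −x ∂_x + y ∂_y, which the numerical result reproduces together with the antisymmetry in (2.33).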

2.3 Dual vectors and tensors

2.3.1 Dual space

We had introduced the tangent space TM as the set of derivations of functions F on M, which were certain linear maps from F into R. We now introduce the dual vector space T*M to TM as the set of linear maps
\[ T^*M : TM \to \mathbb{R} \tag{2.34} \]
from TM into R. Defining addition of elements of T*M and their multiplication with scalars in the obvious way, T*M obtains the structure of a vector space; the elements of T*M are called dual vectors.
Let now f be a C^∞ function on M and v ∈ TM an arbitrary tangent vector. Then, we define the differential of f by
\[ \mathrm{d}f : TM \to \mathbb{R} , \quad \mathrm{d}f(v) = v(f) . \tag{2.35} \]
It is obvious that, by definition of the dual space T*M, df is an element of T*M and thus a dual vector. Choosing a coordinate representation, we see that
\[ \mathrm{d}f(v) = v^i \partial_i f . \tag{2.36} \]
Specifically letting f = x^i be the i-th coordinate function, we see that
\[ \mathrm{d}x^i(\partial_j) = \partial_j x^i = \delta^i_j , \tag{2.37} \]
which shows that the n-tuple {e*^i} = {dx^i} forms a basis of T*M, which is called the dual basis to the basis {e_i} = {∂_i} of the tangent space TM.

Dual vectors
Dual vectors map vectors to the real numbers. If {∂_i} is a coordinate basis of TM, the dual basis of T*M is given by the coordinate differentials {dx^i}. Dual vectors can thus be written as
\[ w = w_i\,\mathrm{d}x^i . \tag{2.38} \]

Repeating the operation leading from TM to the dual space T*M with T*M instead, we arrive at the double-dual vector space T**M as the vector space of all linear maps T*M → R. It can be shown that T**M is isomorphic to TM and can thus be identified with TM.

2.3.2 Tensors

Tensors T of rank (r, s) can now be defined as multilinear maps
\[ T : \underbrace{T^*M \times \ldots \times T^*M}_{r} \times \underbrace{TM \times \ldots \times TM}_{s} \to \mathbb{R} , \tag{2.39} \]
in other words, given r dual vectors and s tangent vectors, T returns a real number, and if all but one vector or dual vector are fixed, the map is linear in the remaining argument. If a tensor of rank (r, s) is assigned to every point p ∈ M, we have a tensor field of rank (r, s) on M.

Tensors
Tensors of rank (r, s) are multilinear maps of r dual vectors and s vectors into the real numbers.

According to this definition, tensors of rank (0, 1) are simply dual vectors, and tensors of rank (1, 0) are elements of T**M and can thus be identified with tangent vectors.

Example: Tensor field of rank (1, 1)

For one specific example, a tensor of rank (1, 1) is a bilinear map from T*M × TM → R. If we fix a vector v ∈ TM, T(·, v) is a linear map T*M → R and thus an element of T**M, which can be identified with a vector. In this way, given a vector v ∈ TM, a tensor of rank (1, 1) produces another vector ∈ TM, and vice versa for dual vectors. Thus, tensors of rank (1, 1) can be seen as linear maps from TM → TM, or from T*M → T*M.

With the obvious rules for adding linear maps and multiplying them with scalars, the set of tensors T^r_s of rank (r, s) attains the structure of a vector space of dimension n^{r+s}.

Given a tensor t of rank (r, s) and another tensor t′ of rank (r′, s′), we can construct a tensor of rank (r + r′, s + s′) called the outer product t ⊗ t′ of t and t′ by simply multiplying their results on the r + r′ dual vectors w^i and the s + s′ vectors v_j, thus
\[ \begin{aligned}
&(t \otimes t')(w^1, \ldots, w^{r+r'}, v_1, \ldots, v_{s+s'}) = \\
&\quad t(w^1, \ldots, w^r, v_1, \ldots, v_s)\;t'(w^{r+1}, \ldots, w^{r+r'}, v_{s+1}, \ldots, v_{s+s'}) .
\end{aligned} \tag{2.40} \]
In particular, it is thus possible to construct a basis for tensors of rank (r, s) out of the bases {e_i} of the tangent space and {e*^j} of the dual space by taking the tensor products. Thus, a tensor of rank (r, s) can be written in the form
\[ t = t^{i_1 \ldots i_r}_{j_1 \ldots j_s} \left( \partial_{i_1} \otimes \ldots \otimes \partial_{i_r} \right) \otimes \left( \mathrm{d}x^{j_1} \otimes \ldots \otimes \mathrm{d}x^{j_s} \right) , \tag{2.41} \]
where the numbers t^{i_1...i_r}_{j_1...j_s} are its components with respect to the coordinate system h.
The transformation law (2.25) for the basis vectors under coordinate changes implies that the tensor components transform as
\[ t'^{\,i_1 \ldots i_r}_{j_1 \ldots j_s} = J'^{i_1}_{k_1} \ldots J'^{i_r}_{k_r}\,J^{l_1}_{j_1} \ldots J^{l_s}_{j_s}\,t^{k_1 \ldots k_r}_{l_1 \ldots l_s} , \tag{2.42} \]
with J′^i_k = ∂x′^i/∂x^k and J^l_j = ∂x^l/∂x′^j, a property which is often used to define tensors in the first place.


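Contraction
The contraction C^i_j t of a tensor of rank (r, s) is a map which reduces both r and s by unity,
\[ C^i_j t : \mathcal{T}^r_s \to \mathcal{T}^{r-1}_{s-1} , \quad C^i_j t = t(\ldots, e^{*k}, \ldots, e_k, \ldots) , \tag{2.43} \]
where {e_k} and {e*^k} are bases of the tangent and dual spaces, as before, and the summation over all 1 ≤ k ≤ n is implied. The basis vectors e*^k and e_k are inserted as the i-th and j-th arguments of the tensor t.
Expressing the tensor in a coordinate basis, we can write the tensor in the form (2.41), and thus its contraction with respect to the i_a-th and j_b-th arguments reads
\[ \begin{aligned}
C^{i_a}_{j_b} t &= t^{i_1 \ldots i_r}_{j_1 \ldots j_s}\,\mathrm{d}x^{j_b}(\partial_{i_a}) \\
&\quad \left( \partial_{i_1} \otimes \ldots \otimes \partial_{i_{a-1}} \otimes \partial_{i_{a+1}} \otimes \ldots \otimes \partial_{i_r} \right) \otimes
\left( \mathrm{d}x^{j_1} \otimes \ldots \otimes \mathrm{d}x^{j_{b-1}} \otimes \mathrm{d}x^{j_{b+1}} \otimes \ldots \otimes \mathrm{d}x^{j_s} \right) \\
&= t^{i_1 \ldots i_{a-1}\,k\,i_{a+1} \ldots i_r}_{j_1 \ldots j_{b-1}\,k\,j_{b+1} \ldots j_s} \\
&\quad \left( \partial_{i_1} \otimes \ldots \otimes \partial_{i_{a-1}} \otimes \partial_{i_{a+1}} \otimes \ldots \otimes \partial_{i_r} \right) \otimes
\left( \mathrm{d}x^{j_1} \otimes \ldots \otimes \mathrm{d}x^{j_{b-1}} \otimes \mathrm{d}x^{j_{b+1}} \otimes \ldots \otimes \mathrm{d}x^{j_s} \right) ,
\end{aligned} \tag{2.44} \]
where dx^{j_b}(∂_{i_a}) = δ^{j_b}_{i_a} was used and the repeated index k is summed over.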

Example: Tensor contraction

For a simple example, let v ∈ TM be a tangent vector and w ∈ T*M a dual vector, and t = v ⊗ w a tensor of rank (1, 1). Its contraction results in a tensor of rank (0, 0), i.e. a real number, which is
\[ Ct = C(v \otimes w) = \mathrm{d}x^k(v)\,w(\partial_k) = v^k w_k . \tag{2.45} \]
At the same time, this can be written as
\[ \begin{aligned}
Ct = C(v \otimes w) = w(v) &= (w_j\,\mathrm{d}x^j)(v^i \partial_i) = w_j v^i\,\mathrm{d}x^j(\partial_i) \\
&= w_j v^i\,\partial_i x^j = w_j v^i \delta^j_i = w_i v^i .
\end{aligned} \tag{2.46} \]
In this sense, the contraction amounts to applying the tensor (partially) “on itself”.
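
In components, the contraction of a rank-(1, 1) tensor is simply the trace of its component matrix. The sketch below, with arbitrarily chosen numbers and helper names of our own, checks the example t = v ⊗ w and illustrates that the contraction does not depend on the basis when the components are transformed with a matrix and its inverse as in (2.42).

```python
def outer(v, w):
    """Components t^i_j = v^i w_j of the rank-(1,1) tensor v (x) w."""
    return [[vi * wj for wj in w] for vi in v]


def contract(t):
    """Contraction C t = t^k_k of a rank-(1,1) tensor, cf. eq. (2.45)."""
    return sum(t[k][k] for k in range(len(t)))


def apply_dual(w, v):
    """The number w(v) = w_i v^i, cf. eq. (2.46)."""
    return sum(wi * vi for wi, vi in zip(w, v))


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


v = [1.0, 2.0, -1.0]          # vector components v^i
w = [0.5, -1.0, 3.0]          # dual-vector components w_j
t = outer(v, w)
print(contract(t), apply_dual(w, v))    # both equal v^k w_k

# Basis change with an invertible matrix J' and its inverse J:
# t'^i_j = J'^i_k J^l_j t^k_l, cf. eq. (2.42). The contraction is unchanged.
J_prime = [[2.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
J = [[0.5, -0.5, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_prime = matmul(matmul(J_prime, t), J)
print(contract(t_prime))
```

The contraction is a tensor of rank (0, 0), i.e. a scalar, so its value must be the same in every basis.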

2.4 The metric

We need some way to define and measure the “distance” between two points on a manifold. A metric is introduced via the infinitesimal squared distance between two neighbouring points on the manifold.
We have seen above that tangent vectors v ∈ T_pM are closely related to infinitesimal displacements around a point p on the manifold. Moreover, the infinitesimal squared distance between two neighbouring points p and q should be quadratic in the displacement leading from one point to the other. Thus, we construct the metric g as a bi-linear map
\[ g : TM \times TM \to \mathbb{R} , \tag{2.47} \]
which means that g is a tensor of rank (0, 2). The metric thus assigns a number to two elements of the tangent space TM. In this way, the metric g defines the scalar product of two vectors, which is not necessarily positive. We abbreviate the scalar product of two vectors v, w ∈ TM by
\[ g(v, w) \equiv \langle v, w \rangle . \tag{2.48} \]
In addition, we require that the metric be symmetric and non-degenerate, which means
\[ \begin{aligned}
g(v, w) &= g(w, v) \quad \forall\; v, w \in T_pM , \\
g(v, w) &= 0 \quad \forall\; v \in T_pM \;\Leftrightarrow\; w = 0 .
\end{aligned} \tag{2.49} \]

Metric
A metric is a rank-(0, 2) tensor field which is symmetric and non-degenerate.

In a coordinate basis, the metric can be written in components as
\[ g = g_{ij}\,\mathrm{d}x^i \otimes \mathrm{d}x^j . \tag{2.50} \]

The line element ds is the metric applied to an infinitesimal distance vector dx with components dx^i,
\[ \mathrm{d}s^2 = g(\mathrm{d}x, \mathrm{d}x) = g_{ij}\,\mathrm{d}x^i\,\mathrm{d}x^j . \tag{2.51} \]
Given a metric g, a basis {e_i} of the tangent space at a point can always be chosen such that
\[ g(e_i, e_j) = \langle e_i, e_j \rangle = \pm\delta_{ij} , \tag{2.52} \]
where the number of positive and negative signs is independent of the choice of basis and is called the signature of the metric. Positive-definite metrics, which have only positive signs, are called Riemannian, and pseudo-Riemannian metrics have positive and negative signs.

Figure 2.3 Georg Friedrich Bernhard Riemann (1826–1866), German mathematician. Source: Wikipedia

Example: Minkowski metric

Perhaps the most common pseudo-Riemannian metric is the Minkowski metric known from special relativity, which can be chosen to have the signature (−, +, +, +) and has the line element
\[ \mathrm{d}s^2 = -c^2\mathrm{d}t^2 + (\mathrm{d}x^1)^2 + (\mathrm{d}x^2)^2 + (\mathrm{d}x^3)^2 . \tag{2.53} \]
A metric with the same signature as the Minkowski metric of spacetime is called Lorentzian.

Given a tangent vector v, the metric can also be seen as a linear map from TM into T*M,
\[ g : TM \to T^*M , \quad v \mapsto g(\cdot, v) . \tag{2.54} \]
The image g(·, v) is an element of T*M because it linearly maps vectors into R. Since the metric is non-degenerate, the inverse map g^{-1} also exists, and the metric can be used to establish a one-to-one correspondence between vectors and dual vectors, and thus between the tangent space TM and its dual space T*M.
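
The correspondence between vectors and dual vectors established by the metric can be made concrete with the Minkowski metric. The Python sketch below works in units with c = 1, which is an assumption made here for brevity, and uses helper names of our own choosing.

```python
C = 1.0   # units with c = 1 (an assumption made for brevity)

# Components g_ij of the Minkowski metric with signature (-, +, +, +).
ETA = [[-C * C, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

# Inverse metric; for a diagonal metric, invert the diagonal entries.
ETA_INV = [[(1.0 / ETA[i][i] if i == j else 0.0) for j in range(4)]
           for i in range(4)]


def scalar_product(g, v, w):
    """The scalar product <v, w> = g_ij v^i w^j."""
    return sum(g[i][j] * v[i] * w[j] for i in range(4) for j in range(4))


def lower(g, v):
    """Components g_ij v^j of the dual vector g(., v) belonging to v."""
    return [sum(g[i][j] * v[j] for j in range(4)) for i in range(4)]


def raise_index(g_inv, w):
    """Inverse map: components g^ij w_j of the vector belonging to w."""
    return [sum(g_inv[i][j] * w[j] for j in range(4)) for i in range(4)]


timelike = [1.0, 0.2, 0.0, 0.0]
lightlike = [1.0, 1.0, 0.0, 0.0]
print(scalar_product(ETA, timelike, timelike))       # negative
print(scalar_product(ETA, lightlike, lightlike))     # zero
print(raise_index(ETA_INV, lower(ETA, timelike)))    # the original components
```

The example also shows that the scalar product defined by a pseudo-Riemannian metric can be negative or zero for non-vanishing vectors, while the map between vectors and dual vectors remains invertible.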
Chapter 3

Differential Geometry II

3.1 Connections and covariant derivatives

3.1.1 Linear Connections

The curvature of the two-dimensional sphere S^2 can be described by embedding the sphere into a Euclidean space of the next-higher dimension, R^3. However, (as far as we know) there is no natural embedding of our four-dimensional curved spacetime into R^5, and thus we need a description of curvature which is intrinsic to the manifold.

Caution Whitney’s (strong) embedding theorem states that any smooth n-dimensional manifold (n > 0) can be smoothly embedded in the 2n-dimensional Euclidean space R^{2n}. Embeddings into lower-dimensional Euclidean spaces may exist, but not necessarily so. An embedding f : M → N of a manifold M into a manifold N is an injective map such that f(M) is a submanifold of N and M → f(M) is differentiable.

There is a close correspondence between the curvature of a manifold and the transport of vectors along curves.
As we have seen before, the structure of a manifold does not trivially allow us to compare vectors which are elements of tangent spaces at two different points. We will thus have to introduce an additional structure which allows us to meaningfully shift vectors from one point to another on the manifold.
Even before we do so, it is intuitively clear how vectors can be trans-
ported along closed paths in flat Euclidean space, say R3 . There, the
vector arriving at the starting point after the transport will be identical to
the vector before the transport.
However, this will no longer be so on the two-sphere: starting on the
equator with a vector pointing north, we can shift it along a meridian to
the north pole, then back to the equator along a different meridian, and
finally back to its starting point on the equator. There, it will point into a
different direction than the original vector.
Curvature can thus be defined from this misalignment of vectors after
transport along closed curves. In order to work this out, we thus first
need some way for transporting vectors along curves.


We start by generalising the concept of a directional derivative from R^n by defining a linear or affine connection or covariant differentiation on a manifold as a mapping ∇ which assigns to every pair v, y of C^∞ vector fields another vector field ∇_v y which is bilinear in v and y and satisfies
\[ \begin{aligned}
\nabla_{fv}\,y &= f\,\nabla_v y , \\
\nabla_v(fy) &= f\,\nabla_v y + v(f)\,y ,
\end{aligned} \tag{3.1} \]
where f ∈ F is a C^∞ function on M.

Figure 3.1 Elwin Bruno Christoffel (1829–1900), German mathematician. Source: Wikipedia

In a local coordinate basis {e_i}, we can describe the linear connection by its action on the basis vectors,
\[ \nabla_{\partial_i}(\partial_j) \equiv \Gamma^k_{ij}\,\partial_k , \tag{3.2} \]
where the n^3 numbers Γ^k_ij are called the Christoffel symbols or connection coefficients of the connection ∇ in the given chart.

? Why do the Christoffel symbols suffice to specify the connection completely?

Connection
A connection ∇ generalises the directional derivative of objects on a manifold. The directional derivative of a vector y in the direction of the vector v is the vector ∇_v y. The connection is linear and satisfies the product rule.
The Christoffel symbols are not the components of a tensor, which is seen from their transformation under coordinate changes. Let x^i and x′^i be two different coordinate systems, then we have on the one hand, by definition,
\[ \nabla_{\partial'_a}(\partial'_b) = \Gamma'^c_{ab}\,\partial'_c = \Gamma'^c_{ab}\,\frac{\partial x^k}{\partial x'^c}\,\partial_k = \Gamma'^c_{ab}\,J^k_c\,\partial_k , \tag{3.3} \]
where J^k_c = ∂x^k/∂x′^c is the Jacobian matrix of the coordinate transform, cf. (2.25) and (2.26). On the other hand, the axioms (3.1) imply, with f represented by the elements J^k_i of the Jacobian matrix,
\[ \begin{aligned}
\nabla_{\partial'_a}(\partial'_b) &= \nabla_{J^i_a \partial_i}(J^j_b\,\partial_j) = J^i_a\,\nabla_{\partial_i}(J^j_b\,\partial_j) \\
&= J^i_a \left[ J^j_b\,\nabla_{\partial_i}\partial_j + (\partial_i J^j_b)\,\partial_j \right] \\
&= J^i_a J^j_b\,\Gamma^k_{ij}\,\partial_k + J^i_a\,(\partial_i J^k_b)\,\partial_k .
\end{aligned} \tag{3.4} \]
Comparison of the two results (3.3) and (3.4) shows that
\[ \Gamma'^c_{ab}\,J^k_c = J^i_a J^j_b\,\Gamma^k_{ij} + J^i_a\,\partial_i J^k_b , \tag{3.5} \]
or, after multiplying with the inverse Jacobian matrix J′^c_k = ∂x′^c/∂x^k,
\[ \Gamma'^c_{ab} = J^i_a J^j_b J'^c_k\,\Gamma^k_{ij} + J'^c_k J^i_a\,\partial_i J^k_b . \tag{3.6} \]
While the first term on the right-hand side reflects the tensor transformation law (2.42), the second term differs from it.

3.1.2 Covariant derivative

Caution Indices separated by a comma denote ordinary partial differentiations with respect to coordinates, y_{,i} ≡ ∂_i y.

Let now y and v be vector fields on M and w a dual vector field, then the covariant derivative ∇y is a tensor field of rank (1, 1) which is defined by
\[ \nabla y(v, w) \equiv w[\nabla_v(y)] . \tag{3.7} \]
In a coordinate basis {∂_i}, we write
\[ y = y^i \partial_i \quad\text{and}\quad \nabla y \equiv y^i_{;j}\,\mathrm{d}x^j \otimes \partial_i , \tag{3.8} \]
and obtain the tensor components
\[ \begin{aligned}
y^i_{;j} &= \nabla y(\partial_j, \mathrm{d}x^i) = \mathrm{d}x^i\left[ \nabla_{\partial_j}(y^k \partial_k) \right] \\
&= \mathrm{d}x^i\left[ y^k_{,j}\,\partial_k + y^k\,\Gamma^l_{jk}\,\partial_l \right] \\
&= y^k_{,j}\,\delta^i_k + y^k\,\Gamma^l_{jk}\,\delta^i_l \\
&= y^i_{,j} + y^k\,\Gamma^i_{jk} .
\end{aligned} \tag{3.9} \]

? Carry out the calculation in (3.9) yourself and verify (3.11) for a symmetric connection. How many Christoffel symbols do you need for a symmetric connection on S^2?

An affine connection is symmetric if
\[ \nabla_v w - \nabla_w v = [v, w] , \tag{3.10} \]
which a short calculation shows to be equivalent to the symmetry property
\[ \Gamma^k_{ij} = \Gamma^k_{ji} \tag{3.11} \]
of the Christoffel symbols in a coordinate basis.
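
The component formula (3.9) can be tried out on a simple case. The sketch below assumes the flat plane in polar coordinates (r, θ) with the connection coefficients Γ^r_θθ = −r and Γ^θ_rθ = Γ^θ_θr = 1/r; these coefficients are not derived in the text so far and are taken as given here. The constant Cartesian vector field ∂/∂x has the polar components (cos θ, −sin θ / r), and its covariant derivative should vanish.

```python
import math


def christoffel_polar(x):
    """Coefficients gamma[k][i][j] = Gamma^k_ij in polar coordinates (r, theta).

    Assumption: these are the coefficients of the usual flat connection of
    the plane expressed in polar coordinates; they are taken as given here.
    """
    r = x[0]
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gamma[0][1][1] = -r          # Gamma^r_{theta theta}
    gamma[1][0][1] = 1.0 / r     # Gamma^theta_{r theta}
    gamma[1][1][0] = 1.0 / r     # Gamma^theta_{theta r}
    return gamma


def y(x):
    """Polar components of the constant Cartesian field d/dx."""
    r, theta = x
    return (math.cos(theta), -math.sin(theta) / r)


def partial(func, x, j, eps=1e-6):
    """Central finite-difference approximation of d func / d x^j at x."""
    xp = list(x)
    xm = list(x)
    xp[j] += eps
    xm[j] -= eps
    return (func(xp) - func(xm)) / (2.0 * eps)


def covariant_derivative(field, gamma_of, x):
    """Matrix [i][j] of components y^i_{;j} = y^i_{,j} + y^k Gamma^i_{jk}."""
    n = len(x)
    gamma = gamma_of(x)
    yx = field(x)
    return [[partial(lambda z: field(z)[i], x, j)
             + sum(yx[k] * gamma[i][j][k] for k in range(n))
             for j in range(n)] for i in range(n)]


x = [1.5, 0.7]
nabla_y = covariant_derivative(y, christoffel_polar, x)
print(nabla_y)   # all entries close to zero
```

The partial derivatives of the components do not vanish individually; only together with the connection terms do they cancel, which is the purpose of the second term in (3.9).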

3.2 Geodesics

3.2.1 Parallel transport and geodesics

Given a linear connection, it is now straightforward to introduce parallel transport. To begin, let γ : I → M with I ⊂ R be a curve in M with tangent vector γ̇(t). A vector field v is called parallel along γ if
\[ \nabla_{\dot\gamma} v = 0 . \tag{3.12} \]
The vector ∇_γ̇ v is the covariant derivative of v along γ, and it is often denoted by
\[ \nabla_{\dot\gamma} v = \frac{Dv}{\mathrm{d}t} = \frac{\nabla v}{\mathrm{d}t} . \tag{3.13} \]
In the coordinate basis {∂_i}, the covariant derivative along γ reads
\[ \begin{aligned}
\nabla_{\dot\gamma} v &= \nabla_{\dot x^i \partial_i}(v^j \partial_j) = \dot x^i\,\nabla_{\partial_i}(v^j \partial_j) \\
&= \dot x^i \left[ v^j\,\nabla_{\partial_i}(\partial_j) + (\partial_i v^j)\,\partial_j \right] = \left( \dot v^k + \Gamma^k_{ij}\,\dot x^i v^j \right) \partial_k ,
\end{aligned} \tag{3.14} \]
and if this is to vanish identically, (3.12) and (3.14) imply for the components
\[ \dot v^k + \Gamma^k_{ij}\,\dot x^i v^j = 0 . \tag{3.15} \]

? Convince yourself of the results (3.14) and (3.15).

The existence and uniqueness theorems for ordinary differential equations imply that (3.15) has a unique solution once v is given at one point along the curve γ(t). The parallel transport of a vector along a curve is then uniquely defined.
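
The misalignment of vectors after transport along a closed curve on the two-sphere can be reproduced by integrating (3.15) numerically. The sketch below assumes the connection coefficients of the round unit sphere in coordinates (θ, φ), Γ^θ_φφ = −sin θ cos θ and Γ^φ_θφ = Γ^φ_φθ = cot θ, which are not derived in the text so far, and transports a vector once around the latitude circle at colatitude θ = π/3. Integrator and names are our own choices.

```python
import math


def christoffel_sphere(x):
    """Coefficients gamma[k][i][j] = Gamma^k_ij on the unit two-sphere.

    Assumption: coordinates are (theta, phi), and the coefficients are those
    of the standard symmetric connection of the round sphere, taken as given.
    """
    theta = x[0]
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gamma[0][1][1] = -math.sin(theta) * math.cos(theta)   # Gamma^theta_{phi phi}
    gamma[1][0][1] = math.cos(theta) / math.sin(theta)    # Gamma^phi_{theta phi}
    gamma[1][1][0] = gamma[1][0][1]                       # Gamma^phi_{phi theta}
    return gamma


def transport_rhs(curve, velocity, t, v):
    """Right-hand side of eq. (3.15): dv^k/dt = -Gamma^k_ij xdot^i v^j."""
    gamma = christoffel_sphere(curve(t))
    xdot = velocity(t)
    return [-sum(gamma[k][i][j] * xdot[i] * v[j]
                 for i in range(2) for j in range(2)) for k in range(2)]


def parallel_transport(curve, velocity, v0, t_end, steps=2000):
    """Integrate eq. (3.15) along the curve with classical Runge-Kutta steps."""
    v = list(v0)
    dt = t_end / steps
    for n in range(steps):
        t = n * dt
        k1 = transport_rhs(curve, velocity, t, v)
        k2 = transport_rhs(curve, velocity, t + dt / 2,
                           [v[a] + dt / 2 * k1[a] for a in range(2)])
        k3 = transport_rhs(curve, velocity, t + dt / 2,
                           [v[a] + dt / 2 * k2[a] for a in range(2)])
        k4 = transport_rhs(curve, velocity, t + dt,
                           [v[a] + dt * k3[a] for a in range(2)])
        v = [v[a] + dt / 6 * (k1[a] + 2 * k2[a] + 2 * k3[a] + k4[a])
             for a in range(2)]
    return v


THETA0 = math.pi / 3   # colatitude of the latitude circle


def circle(t):
    """Latitude circle, parameterised by the azimuth phi = t."""
    return (THETA0, t)


def circle_velocity(t):
    """Tangent vector components (theta-dot, phi-dot) of the circle."""
    return (0.0, 1.0)


v_final = parallel_transport(circle, circle_velocity, [1.0, 0.0], 2 * math.pi)
print(v_final)   # close to [-1, 0]: the vector returns reversed
```

For this particular circle the transported vector returns pointing in the opposite direction, a drastic example of the misalignment that is used to define curvature.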
If the tangent vector γ̇ of a curve γ is autoparallel along γ,
\[ \nabla_{\dot\gamma}\dot\gamma = 0 , \tag{3.16} \]
the curve is called a geodesic. In a local coordinate system, this condition reads
\[ \ddot x^k + \Gamma^k_{ij}\,\dot x^i \dot x^j = 0 . \tag{3.17} \]
In flat Euclidean space, geodesics are straight lines. Quite intuitively, the condition (3.16) generalises the concept of straight lines to manifolds.

Parallel transport and geodesics
A vector v is parallel transported along a curve γ if the equation
\[ \nabla_{\dot\gamma} v = 0 \tag{3.18} \]
holds. Geodesics are autoparallel curves,
\[ \nabla_{\dot\gamma}\dot\gamma = 0 . \tag{3.19} \]
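
The geodesic equation in coordinates, (3.17), is a system of second-order ordinary differential equations and can be integrated numerically. The sketch below does so on the unit two-sphere in coordinates (θ, φ), again assuming the connection coefficients Γ^θ_φφ = −sin θ cos θ and Γ^φ_θφ = cot θ of the round sphere as given, and checks that the resulting curve stays in a plane through the centre of the sphere, i.e. that it is a great circle. The embedding into R^3 is used only for this check.

```python
import math


def geodesic_rhs(state):
    """First-order form of eq. (3.17) on the unit sphere.

    Assumption: coordinates are (theta, phi) and the connection coefficients
    Gamma^theta_{phi phi} = -sin(theta) cos(theta) and
    Gamma^phi_{theta phi} = cot(theta) of the round sphere are taken as given.
    """
    theta, phi, dtheta, dphi = state
    ddtheta = math.sin(theta) * math.cos(theta) * dphi * dphi
    ddphi = -2.0 * math.cos(theta) / math.sin(theta) * dtheta * dphi
    return [dtheta, dphi, ddtheta, ddphi]


def rk4(rhs, state, dt):
    """One classical Runge-Kutta step."""
    k1 = rhs(state)
    k2 = rhs([s + dt / 2 * k for s, k in zip(state, k1)])
    k3 = rhs([s + dt / 2 * k for s, k in zip(state, k2)])
    k4 = rhs([s + dt * k for s, k in zip(state, k3)])
    return [s + dt / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def embed(theta, phi):
    """Position in R^3 of the point with coordinates (theta, phi)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


# Start on the equator, heading north-east at 45 degrees with unit speed.
state = [math.pi / 2, 0.0, -math.sqrt(0.5), math.sqrt(0.5)]
# Unit normal of the plane spanned by the initial position and velocity.
normal = (0.0, -math.sqrt(0.5), math.sqrt(0.5))

max_offset = 0.0
steps = 2000
dt = 2 * math.pi / steps
for _ in range(steps):
    state = rk4(geodesic_rhs, state, dt)
    p = embed(state[0], state[1])
    offset = abs(sum(a * b for a, b in zip(p, normal)))
    max_offset = max(max_offset, offset)

end_point = embed(state[0], state[1])
print(max_offset)   # stays close to zero: the geodesic is a great circle
print(end_point)    # after parameter length 2*pi, back near the start (1, 0, 0)
```

Great circles are thus the analogue of straight lines on the sphere, in line with the intuitive meaning of (3.16).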

3.2.2 Normal Coordinates

Geodesics allow the introduction of a special coordinate system in the neighbourhood of a point $p \in M$. First, given a point $p = \gamma(0)$ and a vector $\dot\gamma(0) \in T_pM$ from the tangent space in $p$, the existence and uniqueness theorems for ordinary differential equations ensure that (3.17) has a unique solution, which implies that a unique geodesic exists through $p$ into the direction $\dot\gamma(0)$.
Obviously, if $\gamma_v(t)$ is a geodesic with “initial velocity” $v = \dot\gamma(0)$, then $\gamma_v(at)$ is also a geodesic with initial velocity $av = a\dot\gamma(0)$, or

$$\gamma_{av}(t) = \gamma_v(at) .$$ (3.20)

Thus, given some neighbourhood $U \subset T_pM$ of the origin of the tangent space at $p = \gamma(0)$, unique geodesics $\gamma(t)$ with $t \in [0, 1]$ can be constructed through $p$ into any direction $v \in U$, i.e. such that $\gamma(0) = p$ and $\dot\gamma(0) = v \in U$.
Using this, we define the exponential map at $p$,

$$\exp_p: T_pM \supset U \to M , \quad v \mapsto \exp_p(v) = \gamma_v(1) ,$$ (3.21)

which maps any vector $v$ from $U \subset T_pM$ into the point reached along the geodesic through $p$ into direction $v$ at the parameter value $t = 1$.
Now, we choose a basis $\{e_i\}$ of $T_pM$ and use the $n$ basis vectors in the exponential mapping (3.21). Then, the neighbourhood of $p$ can uniquely be represented by the exponential mapping along the basis vectors, $\exp_p(x^i e_i)$, and the $x^i$ are called normal coordinates.
Since $\exp_p(tv) = \gamma_{tv}(1) = \gamma_v(t)$, the curve $\gamma_v(t)$ has the normal coordinates $x^i = t v^i$, with $v = v^i e_i$. In these coordinates, $x^i$ is linear in $t$, thus $\ddot x^i = 0$, and (3.17) implies

$$\Gamma^k_{ij}\, v^i v^j = 0$$ (3.22)

at $p$. Since the direction $v$ is arbitrary, this requires

$$\Gamma^k_{ij} + \Gamma^k_{ji} = 0 .$$ (3.23)

If the connection is symmetric as defined in (3.11), the connection coefficients must vanish at $p$,

$$\Gamma^k_{ij} = 0 .$$ (3.24)

? What could the exponential map have to do with physics, in view of the equivalence principle?

Normal coordinates
Thus, at every point $p \in M$, local coordinates can uniquely be introduced by means of the exponential map, the normal coordinates, in which the coefficients of a symmetric connection vanish at $p$. This will turn out to be important shortly.

3.2.3 Covariant derivative of tensor fields

Extending the concept of the covariant derivative to tensor fields, we


start with a simple tensor of rank (1, 1) which is the tensor product of a
vector field v and a dual vector field w,

t =v⊗w, (3.25)

and we require that ∇ x satisfy the Leibniz rule,

∇ x (v ⊗ w) = ∇ x v ⊗ w + v ⊗ ∇ x w , (3.26)

and commute with the contraction,

C [∇ x (v ⊗ w)] = ∇ x [w(v)] . (3.27)

We now contract (3.26) and use (3.27) to find

C [∇ x (v ⊗ w)] = C (∇ x v ⊗ w) + C (v ⊗ ∇ x w)
= w(∇ x v) + (∇ x w)(v)
= ∇ x [w(v)] = xw(v) , (3.28)

where (3.1) was used in the final step (note that w(v) is a real-valued
function). Thus, we find an expression for the covariant derivative of a
dual vector,
(∇ x w)(v) = xw(v) − w(∇ x v) . (3.29)

Introducing the coordinate basis $\{\partial_i\}$, it is straightforward to show (and a useful exercise!) that this result can be expressed as

$$(\nabla_x w)(v) = \left( w_{j,i} - \Gamma^k_{ij}\, w_k \right) x^i v^j .$$ (3.30)

Specialising $x = \partial_i$, $w = dx^j$ and $v = \partial_k$, hence $x^a = \delta^a_i$, $w_b = \delta^j_b$ and $v^c = \delta^c_k$, we see that this implies for the covariant derivatives of the dual basis vectors $dx^j$

$$(\nabla_{\partial_i} dx^j)(\partial_k) = -\Gamma^j_{ik} \quad\text{or}\quad \nabla_{\partial_i} dx^j = -\Gamma^j_{ik}\, dx^k .$$ (3.31)

? Verify equations (3.30) and (3.31) yourself.

As before, we now define the covariant derivative $\nabla t$ of a tensor field as a map from the tensor fields of rank $(r, s)$ to the tensor fields of rank $(r, s+1)$,

$$\nabla: \mathcal{T}^r_s \to \mathcal{T}^r_{s+1}$$ (3.32)

by setting

$$(\nabla t)(w_1, \ldots, w_r, v_1, \ldots, v_s, v_{s+1}) \equiv (\nabla_{v_{s+1}} t)(w_1, \ldots, w_r, v_1, \ldots, v_s) ,$$ (3.33)

where the $v_i$ are vector fields and the $w_j$ dual vector fields.

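We find a general expression for $\nabla t$, with $t \in \mathcal{T}^r_s$, by taking the tensor product of $t$ with $s$ vector fields $v_i$ and $r$ dual vector fields $w_j$ and applying $\nabla_x$ to the result, using the Leibniz rule,

$$\begin{aligned}
\nabla_x (w_1 \otimes \ldots \otimes w_r \otimes v_1 \otimes \ldots \otimes v_s \otimes t)
&= (\nabla_x w_1) \otimes \ldots \otimes t + \ldots + w_1 \otimes \ldots \otimes (\nabla_x v_1) \otimes \ldots \otimes t \\
&\quad + \ldots + w_1 \otimes \ldots \otimes (\nabla_x t) ,
\end{aligned}$$ (3.34)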

and then taking the total contraction, using that it commutes with the
covariant derivative, which yields

∇ x [t(w1 , . . . , wr , v1 , . . . , v s )]
= t(∇ x w1 , . . . , wr , v1 , . . . , v s ) + . . . + t(w1 , . . . , wr , v1 , . . . , ∇ x v s )
+ (∇ x t)(w1 , . . . , wr , v1 , . . . , v s ) . (3.35)

Therefore, the covariant derivative ∇ x t of t is

(∇ x t)(w1 , . . . , wr , v1 , . . . , v s )
= xt(w1 , . . . , wr , v1 , . . . , v s )
− t(∇ x w1 , . . . , v s ) − . . . − t(w1 , . . . , ∇ x v s ) . (3.36)

We now work out the last expression for the covariant derivative of a tensor field in a local coordinate basis $\{\partial_i\}$ and its dual basis $\{dx^j\}$ for the special case of a tensor field $t$ of rank $(1, 1)$. The results for tensor fields of higher rank are then easily found by induction.
We can write the tensor field $t$ as

$$t = t^i{}_j\, (\partial_i \otimes dx^j) ,$$ (3.37)

and the result of its application to $w_1 = dx^a$ and $v_1 = \partial_b$ is

$$t(dx^a, \partial_b) = t^i{}_j\, dx^a(\partial_i)\, dx^j(\partial_b) = t^a{}_b .$$ (3.38)

Therefore, we can write (3.36) as

$$(\nabla_x t)(dx^a, \partial_b) = x^c \partial_c t^a{}_b - t^i{}_j\, (\nabla_x dx^a)(\partial_i)\, dx^j(\partial_b) - t^i{}_j\, dx^a(\partial_i)\, dx^j(\nabla_x \partial_b) .$$ (3.39)

According to (3.31), the second term on the right-hand side, without its leading minus sign, is

$$t^i{}_j\, \delta^j_b\, x^c (\nabla_{\partial_c} dx^a)(\partial_i) = -x^c\, t^i{}_b\, \Gamma^a_{ci} ,$$ (3.40)

while the third term, again without its sign, is

$$t^i{}_j\, \delta^a_i\, x^c\, dx^j(\nabla_{\partial_c} \partial_b) = x^c\, t^a{}_j\, \Gamma^k_{cb}\, dx^j(\partial_k) = x^c\, t^a{}_j\, \Gamma^j_{cb} .$$ (3.41)

Summarising, the components of $\nabla t$ are

$$t^a{}_{b;c} = t^a{}_{b,c} + \Gamma^a_{ci}\, t^i{}_b - \Gamma^j_{cb}\, t^a{}_j ,$$ (3.42)



showing that the covariant indices are transformed with the negative, the contravariant indices with the positive Christoffel symbols.
In particular, the covariant derivatives of tensors of rank $(0, 1)$ (dual vectors $w$) and of tensors of rank $(1, 0)$ (vectors $v$) have components

$$w_{i;k} = w_{i,k} - \Gamma^j_{ki}\, w_j , \quad v^i{}_{;k} = v^i{}_{,k} + \Gamma^i_{kj}\, v^j .$$ (3.43)

? Convince yourself of the results (3.42) and (3.43).

3.3 Curvature

3.3.1 The Torsion and Curvature Tensors

Torsion
The torsion T maps two vector fields x and y into another vector field,

T : TM × TM → TM , (3.44)

such that
T (x, y) = ∇ x y − ∇y x − [x, y] . (3.45)

Figure 3.2 Torsion quantifies by how much parallelograms do not close.

Obviously, the torsion vanishes if and only if the connection is symmetric, cf. (3.10).
The torsion is antisymmetric,

$$T(x, y) = -T(y, x) ,$$ (3.46)

and satisfies

$$T(fx, gy) = fg\, T(x, y)$$ (3.47)

with arbitrary $C^\infty$ functions $f$ and $g$.

? Confirm the statement (3.47).

The map

$$T^*M \times TM \times TM \to \mathbb{R} , \quad (w, x, y) \mapsto w[T(x, y)]$$ (3.48)

with $w \in T^*M$ and $x, y \in TM$ is a tensor of rank $(1, 2)$ called the torsion tensor.
According to (3.48), the components of the torsion tensor in the coordinate basis $\{\partial_i\}$ and its dual basis $\{dx^i\}$ are

$$T^k{}_{ij} = dx^k\left[ T(\partial_i, \partial_j) \right] = \Gamma^k_{ij} - \Gamma^k_{ji} .$$ (3.49)

Caution In alternative, but equivalent representations of general relativity, the torsion does not vanish, but the curvature does. This is the teleparallel version of general relativity.

Curvature
The curvature R̄ maps three vector fields x, y and v into a vector field,

R̄ : T M × T M × T M → T M , (3.50)

such that
R̄(x, y)v = ∇ x (∇y v) − ∇y (∇ x v) − ∇[x,y] v . (3.51)

Figure 3.3 Curvature quantifies by how much second covariant derivatives do not commute.

Since the covariant derivatives ∇ x and ∇y represent the infinitesimal


parallel transports along the integral curves of the vector fields x and y,
the curvature R̄ directly quantifies the change of the vector v when it is
parallel-transported around an infinitesimal, closed loop.
Exchanging x and y and using the antisymmetry of the commutator [x, y],
we see that R̄ is antisymmetric in x and y,

R̄(x, y) = −R̄(y, x) . (3.52)

Also, if f , g and h are C ∞ functions on M,

R̄( f x, gy)hv = f gh R̄(x, y)v , (3.53)

which follows immediately from the defining properties (3.1) of the


connection.

Curvature or Riemann tensor


Obviously, the map

$$T^*M \times TM \times TM \times TM \to \mathbb{R} , \quad (w, x, y, v) \mapsto w[\bar R(x, y)v]$$ (3.54)

with $w \in T^*M$ and $x, y, v \in TM$ defines a tensor of rank $(1, 3)$. It is called the curvature tensor or Riemann tensor.
To work out the components of $\bar R$ in a local coordinate basis $\{\partial_i\}$, we first note that

$$\nabla_{\partial_i}(\nabla_{\partial_j} \partial_k) = \nabla_{\partial_i}\left( \Gamma^l_{jk}\, \partial_l \right) = \Gamma^l_{jk,i}\, \partial_l + \Gamma^l_{jk}\, \Gamma^m_{il}\, \partial_m .$$ (3.55)

Interchanging $i$ and $j$ yields the coordinate expression for $\nabla_{\partial_j}(\nabla_{\partial_i} \partial_k)$. Since the commutator of the basis vectors vanishes, $[\partial_i, \partial_j] = 0$, the components of the curvature tensor are

$$\bar R^i{}_{jkl} = dx^i\left[ \bar R(\partial_k, \partial_l)\, \partial_j \right] = \Gamma^i_{lj,k} - \Gamma^i_{kj,l} + \Gamma^m_{lj}\, \Gamma^i_{km} - \Gamma^m_{kj}\, \Gamma^i_{lm} .$$ (3.56)

? Verify the statement (3.53) and the coordinate representation (3.56).

Ricci tensor
The Ricci tensor $R$ is the contraction $C^1_3 \bar R$ of the curvature tensor $\bar R$. Its components are

$$R_{jl} = \bar R^i{}_{jil} = \Gamma^i_{lj,i} - \Gamma^i_{ij,l} + \Gamma^m_{lj}\, \Gamma^i_{im} - \Gamma^m_{ij}\, \Gamma^i_{lm} .$$ (3.57)
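To see (3.57) at work, the following sketch evaluates it term by term for the unit sphere. The Christoffel symbols of the sphere and their derivatives are entered by hand and are assumptions of this example, as is the function name `ricci_sphere`; the index 0 stands for $\theta$ and the index 1 for $\varphi$.

```python
import math

def ricci_sphere(theta):
    """Ricci tensor components R_{jl} of the unit sphere, evaluated from (3.57)."""
    n = 2
    s, c = math.sin(theta), math.cos(theta)
    # G[k][i][j] = Gamma^k_{ij} (assumed known for the unit sphere)
    G = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    G[0][1][1] = -s * c
    G[1][0][1] = G[1][1][0] = c / s
    # dG[k][i][j][l] = partial_l Gamma^k_{ij}; only theta-derivatives are non-zero
    dG = [[[[0.0] * n for _ in range(n)] for _ in range(n)] for _ in range(n)]
    dG[0][1][1][0] = -(c * c - s * s)
    dG[1][0][1][0] = dG[1][1][0][0] = -1.0 / (s * s)
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for l in range(n):
            for i in range(n):
                # Gamma^i_{lj,i} - Gamma^i_{ij,l}
                R[j][l] += dG[i][l][j][i] - dG[i][i][j][l]
                for m in range(n):
                    # Gamma^m_{lj} Gamma^i_{im} - Gamma^m_{ij} Gamma^i_{lm}
                    R[j][l] += G[m][l][j] * G[i][i][m] - G[m][i][j] * G[i][l][m]
    return R

theta = 1.0
R = ricci_sphere(theta)
# Contract with the inverse metric diag(1, 1/sin^2 theta) to obtain the scalar curvature.
ricci_scalar = R[0][0] + R[1][1] / math.sin(theta) ** 2
```

For the unit sphere this gives $R_{\theta\theta} = 1$, $R_{\varphi\varphi} = \sin^2\theta$ and the scalar curvature 2, independent of $\theta$.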

3.3.2 The Bianchi Identities

The curvature and the torsion together satisfy the two Bianchi identities.
Bianchi identities
The first Bianchi identity is

$$\sum_{\text{cyclic}} \bar R(x, y)z = \sum_{\text{cyclic}} \left\{ T[T(x, y), z] + (\nabla_x T)(y, z) \right\} ,$$ (3.58)

where the sums extend over all cyclic permutations of the vectors $x$, $y$ and $z$. The second Bianchi identity is

$$\sum_{\text{cyclic}} \left\{ (\nabla_x \bar R)(y, z) + \bar R[T(x, y), z] \right\} = 0 .$$ (3.59)

They are important because they define symmetry relations of the curvature and the curvature tensor. In particular, for a symmetric connection, $T = 0$ and the Bianchi identities reduce to

$$\sum_{\text{cyclic}} \bar R(x, y)z = 0 , \quad \sum_{\text{cyclic}} (\nabla_x \bar R)(y, z) = 0 .$$ (3.60)

Figure 3.4 Gregorio Ricci-Curbastro (1853–1925), Italian mathematician.


Source: Wikipedia

Before we go on, we have to clarify the meaning of the covariant derivatives of the torsion and the curvature. We have seen that $T$ defines a tensor field $\tilde T$ of rank $(1, 2)$. Given a dual vector field $w \in T^*M$, we define the covariant derivative of the torsion $T$ such that

$$w\left[ (\nabla_v T)(x, y) \right] = (\nabla_v \tilde T)(w, x, y) .$$ (3.61)

Using (3.36), we can write the right-hand side as

$$(\nabla_v \tilde T)(w, x, y) = v \tilde T(w, x, y) - \tilde T(\nabla_v w, x, y) - \tilde T(w, \nabla_v x, y) - \tilde T(w, x, \nabla_v y) .$$ (3.62)

The first two terms on the right-hand side can be combined using (3.29),

$$\begin{aligned}
\tilde T(\nabla_v w, x, y) &= (\nabla_v w)[T(x, y)] = v\, w[T(x, y)] - w\left[ \nabla_v (T(x, y)) \right] \\
&= v \tilde T(w, x, y) - w\left[ \nabla_v (T(x, y)) \right] ,
\end{aligned}$$ (3.63)

which yields

$$(\nabla_v \tilde T)(w, x, y) = w\left[ \nabla_v (T(x, y)) \right] - \tilde T(w, \nabla_v x, y) - \tilde T(w, x, \nabla_v y)$$ (3.64)

or, dropping the common argument $w$ from all terms,

$$(\nabla_v T)(x, y) = \nabla_v [T(x, y)] - T(\nabla_v x, y) - T(x, \nabla_v y) .$$ (3.65)

Similarly, we find that

$$(\nabla_v \bar R)(x, y) = \nabla_v [\bar R(x, y)] - \bar R(\nabla_v x, y) - \bar R(x, \nabla_v y) - \bar R(x, y) \nabla_v .$$ (3.66)

For symmetric connections, $T = 0$, the first Bianchi identity is easily proven. Its left-hand side reads

$$\begin{aligned}
&\nabla_x \nabla_y z - \nabla_y \nabla_x z + \nabla_y \nabla_z x - \nabla_z \nabla_y x + \nabla_z \nabla_x y - \nabla_x \nabla_z y - \nabla_{[x,y]} z - \nabla_{[y,z]} x - \nabla_{[z,x]} y \\
&= \nabla_x (\nabla_y z - \nabla_z y) + \nabla_y (\nabla_z x - \nabla_x z) + \nabla_z (\nabla_x y - \nabla_y x) - \nabla_{[x,y]} z - \nabla_{[y,z]} x - \nabla_{[z,x]} y \\
&= \nabla_x [y, z] - \nabla_{[y,z]} x + \nabla_y [z, x] - \nabla_{[z,x]} y + \nabla_z [x, y] - \nabla_{[x,y]} z \\
&= [x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0 ,
\end{aligned}$$ (3.67)

where we have used the relation (3.10) and the Jacobi identity (2.33).

? Convince yourself of the result (3.67).

3.4 Riemannian connections

3.4.1 Definition and Uniqueness

Up to now, the affine connection ∇ has not yet been uniquely defined.
We shall now see that a unique connection can be introduced on each
pseudo-Riemannian manifold (M, g).
A connection is called metric if the parallel transport along any smooth curve $\gamma$ in $M$ leaves the inner product of two vector fields $x$ and $y$ that are parallel along $\gamma$ unchanged. This is the case if and only if the covariant derivative $\nabla$ of $g$ vanishes,

$$\nabla g = 0 .$$ (3.68)

Because of (3.36), this condition is equivalent to the Ricci identity

$$x g(y, z) = g(\nabla_x y, z) + g(y, \nabla_x z) ,$$ (3.69)

where $x$, $y$, $z$ are vector fields.

Caution In a third, equivalent representation of general relativity, curvature and torsion both vanish, but the metricity (3.68) is given up.

It can now be shown that a unique connection $\nabla$ can be introduced on each pseudo-Riemannian manifold such that $\nabla$ is symmetric or torsion-free, and metric, i.e. $\nabla g = 0$. Such a connection is called the Riemannian or Levi-Civita connection.
Suppose first that such a connection exists, then (3.69) and the symmetry
of ∇ allow us to write

xg(y, z) = g(∇y x, z) + g([x, y], z) + g(y, ∇ x z) . (3.70)

Taking the cyclic permutations of this equation, summing the second and
the third and subtracting the first (3.70), we obtain the Koszul formula

2g(∇z y, x) = −xg(y, z) + yg(z, x) + zg(x, y) (3.71)


+ g([x, y], z) − g([y, z], x) − g([z, x], y) .

Figure 3.5 Tullio Levi-Civita (1873–1941), Italian mathematician. Source:


Wikipedia

Since the right-hand side is independent of ∇, and g is non-degenerate,


this result implies the uniqueness of ∇. The existence of an affine,
symmetric and metric connection can be proven by explicit construction.
The Christoffel symbols for a Riemannian connection can now be determined by specialising the Koszul formula (3.71) to the basis vectors $\{\partial_i\}$ of a local coordinate system. We choose $x = \partial_k$, $y = \partial_j$ and $z = \partial_i$ and use that their commutator vanishes, $[\partial_i, \partial_j] = 0$, and that $g(\partial_i, \partial_j) = g_{ij}$. Then, (3.71) implies

$$2 g(\nabla_{\partial_i} \partial_j, \partial_k) = -\partial_k g_{ij} + \partial_j g_{ik} + \partial_i g_{jk} ,$$ (3.72)

thus

$$g_{mk}\, \Gamma^m_{ij} = \frac{1}{2} \left( g_{ik,j} + g_{jk,i} - g_{ij,k} \right) .$$ (3.73)

If $(g^{ij})$ denotes the matrix inverse to $(g_{ij})$, we can write

$$\Gamma^l_{ij} = \frac{1}{2}\, g^{lk} \left( g_{ik,j} + g_{jk,i} - g_{ij,k} \right) .$$ (3.74)
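Equation (3.74) translates directly into a numerical recipe. The sketch below approximates the derivatives of the metric by central finite differences and contracts them with the inverse metric; the helper names `inverse`, `christoffel` and `sphere_metric`, the step size `h`, and the unit sphere as test case are assumptions of this example.

```python
import math

def inverse(a):
    """Gauss-Jordan inverse of a small square matrix given as a list of lists."""
    n = len(a)
    m = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]

def christoffel(metric, x, h=1e-5):
    """Return gamma[l][i][j] = Gamma^l_{ij} at the point x, following (3.74)."""
    n = len(x)
    ginv = inverse(metric(x))
    dg = []  # dg[k][i][j] approximates g_{ij,k}
    for k in range(n):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)]
                   for i in range(n)])
    return [[[0.5 * sum(ginv[l][k] * (dg[j][i][k] + dg[i][j][k] - dg[k][i][j])
                        for k in range(n))
              for j in range(n)] for i in range(n)] for l in range(n)]

def sphere_metric(x):
    """Metric of the unit sphere in coordinates (theta, phi)."""
    return [[1.0, 0.0], [0.0, math.sin(x[0]) ** 2]]

gamma = christoffel(sphere_metric, [1.0, 0.3])
```

For the sphere, the only non-vanishing results are $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta$, up to the finite-difference error.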

Levi-Civita connection
On a pseudo-Riemannian manifold $(M, g)$ with metric $g$, a unique connection exists which is symmetric and metric, $\nabla g = 0$. It is called the Levi-Civita connection.

3.4.2 Symmetries. The Einstein Tensor

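In addition to (3.52), the curvature tensor of a Riemannian connection has the following symmetry properties:

$$\langle \bar R(x, y)v, w \rangle = -\langle \bar R(x, y)w, v \rangle , \quad \langle \bar R(x, y)v, w \rangle = \langle \bar R(v, w)x, y \rangle .$$ (3.75)

The first of these relations is easily seen noting that the antisymmetry is equivalent to

$$\langle \bar R(x, y)v, v \rangle = 0 .$$ (3.76)

From the definition of $\bar R$ and the antisymmetry (3.52), we first have

$$\langle v, \bar R(x, y)v \rangle = \langle v, \nabla_x \nabla_y v - \nabla_y \nabla_x v - \nabla_{[x,y]} v \rangle .$$ (3.77)

Replacing $y$ by $\nabla_y v$ and $z$ by $v$, the Ricci identity (3.69) allows us to write

$$\langle v, \nabla_x \nabla_y v \rangle = x \langle \nabla_y v, v \rangle - \langle \nabla_y v, \nabla_x v \rangle$$ (3.78)

and, replacing $x$ by $y$ and both $y$ and $z$ by $v$,

$$\langle \nabla_y v, v \rangle = \frac{1}{2}\, y \langle v, v \rangle .$$ (3.79)

Hence, the first two terms on the right-hand side of (3.77) yield

$$\begin{aligned}
\langle v, \nabla_x \nabla_y v - \nabla_y \nabla_x v \rangle &= x \langle \nabla_y v, v \rangle - y \langle \nabla_x v, v \rangle \\
&= \frac{1}{2} \left( x y \langle v, v \rangle - y x \langle v, v \rangle \right) \\
&= \frac{1}{2}\, [x, y] \langle v, v \rangle .
\end{aligned}$$ (3.80)

By (3.79), this is the negative of the third term on the right-hand side of (3.77), which proves (3.76).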
The symmetries (3.52) and (3.75) imply

$$\bar R_{ijkl} = -\bar R_{jikl} = -\bar R_{ijlk} , \quad \bar R_{ijkl} = \bar R_{klij} ,$$ (3.81)

where $\bar R_{ijkl} \equiv g_{im} \bar R^m{}_{jkl}$. Of the $4^4 = 256$ components of the Riemann tensor in four dimensions, the first symmetry relation (3.81) leaves $6 \times 6 = 36$ independent components, while the second symmetry relation (3.81) reduces their number to $6 + 5 + 4 + 3 + 2 + 1 = 21$.
In a coordinate basis, the Bianchi identities (3.60) for the curvature tensor of a Riemannian connection read

$$\sum_{(jkl)} \bar R^i{}_{jkl} = 0 , \quad \sum_{(klm)} \bar R^i{}_{jkl;m} = 0 ,$$ (3.82)

where $(jkl)$ denotes the cyclic permutations of the indices enclosed in parentheses. In four dimensions, the first Bianchi identity establishes one further relation between the components of the Riemann tensor which is not covered yet by the symmetry relations (3.81), namely

$$\bar R_{0123} + \bar R_{0231} + \bar R_{0312} = 0 ,$$ (3.83)

and thus leaves 20 independent components of the Riemann tensor. These are

$$\begin{pmatrix}
\bar R_{0101} & \bar R_{0102} & \bar R_{0103} & \bar R_{0112} & \bar R_{0113} & \\
 & \bar R_{0202} & \bar R_{0203} & \bar R_{0212} & \bar R_{0213} & \bar R_{0223} \\
 & & \bar R_{0303} & \bar R_{0312} & \bar R_{0313} & \bar R_{0323} \\
 & & & \bar R_{1212} & \bar R_{1213} & \bar R_{1223} \\
 & & & & \bar R_{1313} & \bar R_{1323} \\
 & & & & & \bar R_{2323}
\end{pmatrix} ,$$ (3.84)

where $\bar R_{0123}$ is determined by (3.83).
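The counting above can be repeated for a manifold of arbitrary dimension $n$. The following short calculation is a sketch of that generalisation; the symbol $N$ for the number of antisymmetric index pairs is introduced here for illustration only.

```latex
% Number of antisymmetric index pairs (ij) in n dimensions
N = \frac{n(n-1)}{2}
% A symmetric N x N array in the pairs (ij), (kl) has N(N+1)/2 entries;
% the first Bianchi identity removes one further component for each
% choice of four distinct indices, i.e. \binom{n}{4} of them:
\frac{N(N+1)}{2} - \binom{n}{4} = \frac{n^2 (n^2 - 1)}{12}
% n = 4:  N = 6,  21 - 1 = 20
% n = 3:  N = 3,   6 - 0 = 6
% n = 2:  N = 1,   1 - 0 = 1
```

In two dimensions, a single component thus remains, and in three dimensions six components remain, as many as a symmetric rank-2 tensor such as the Ricci tensor has.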


Using the symmetries (3.81) and the second Bianchi identity from (3.82), we can obtain an important result. We first contract

$$\bar R^i{}_{jkl;m} + \bar R^i{}_{jlm;k} + \bar R^i{}_{jmk;l} = 0$$ (3.85)

by multiplying with $\delta^k_i$ and use the symmetry relations (3.81) to find

$$R_{jl;m} + \bar R^i{}_{jlm;i} - R_{jm;l} = 0$$ (3.86)

for the Ricci tensor. Next, we contract again by multiplying with $g^{jm}$, which yields

$$R^m{}_{l;m} + R^i{}_{l;i} - R_{;l} = 0 ,$$ (3.87)

where $R_{ij}$ are the components of the Ricci tensor and $R = R^i{}_i$ is the Ricci scalar or the scalar curvature. Renaming dummy indices, the last equation can be brought into the form

$$\left( R^i{}_j - \frac{R}{2}\, \delta^i_j \right)_{;i} = 0 ,$$ (3.88)

which is the contracted Bianchi identity. Moreover, the Ricci tensor can easily be shown to be symmetric,

$$R_{ij} = R_{ji} .$$ (3.89)

We finally introduce the symmetric Einstein tensor by

$$G_{ij} \equiv R_{ij} - \frac{R}{2}\, g_{ij} ,$$ (3.90)

which has vanishing divergence because of the contracted Bianchi identity,

$$G^i{}_{j;i} = 0 .$$ (3.91)

Riemann, Ricci, and Einstein tensors


The Ricci tensor is the only non-vanishing contraction of the Riemann tensor. Its components are

$$R_{ij} = \bar R^a{}_{iaj} .$$ (3.92)

The Ricci scalar is the only contraction (the trace) of the Ricci tensor, $R = \mathrm{Tr}\, R$. The Ricci tensor, the Ricci scalar and the metric together define the Einstein tensor, which has the components

$$G_{ij} = R_{ij} - \frac{R}{2}\, g_{ij}$$ (3.93)

and is divergence-free, $G^i{}_{j;i} = 0$.
Chapter 4

Physical Laws in External


Gravitational Fields

4.1 Motion of particles

4.1.1 Action for particles in gravitational fields

In special relativity, the action of a free particle was

$$S = -mc^2 \int_a^b d\tau = -mc \int_a^b ds = -mc \int_a^b \sqrt{-\eta_{\mu\nu}\, dx^\mu dx^\nu} ,$$ (4.1)

where we have introduced the Minkowski metric $\eta_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$.

? Verify that the action (4.1) implies the correct, specially-relativistic equations of motion.

This can be rewritten as follows: first, we parameterise the trajectory of the particle as a curve $\gamma(\tau)$ and write the four-vector $dx = u\, d\tau$ with the four-velocity $u = \dot\gamma$. Second, we use the notation (2.48)

$$\eta(u, u) = \langle u, u \rangle$$ (4.2)

to cast the action into the form

$$S = -mc \int_a^b \sqrt{-\langle u, u \rangle}\, d\tau .$$ (4.3)

In general relativity, the metric $\eta$ is replaced by the dynamic metric $g$. We thus expect that the motion of a free particle will be described by the action

$$S = -mc \int_a^b \sqrt{-\langle u, u \rangle}\, d\tau = -mc \int_a^b \sqrt{-g(u, u)}\, d\tau .$$ (4.4)

Caution Note that the actions (4.1) and (4.4) contain an interpretation of geometry in terms of physics: the line element of the metric is identified with proper time.

48 4 Physics in Gravitational Fields

4.1.2 Equations of motion

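To see what this equation implies, we now carry out the variation of $S$ and set it to zero,

$$\delta S = -mc\, \delta \int_a^b \sqrt{-g(u, u)}\, d\tau = 0 .$$ (4.5)

Since the curve is assumed to be parameterised by the proper time $\tau$, we must have

$$c\, d\tau = ds = \sqrt{-\langle u, u \rangle}\, d\tau ,$$ (4.6)

which implies that the four-velocity $u$ must satisfy

$$\langle u, u \rangle = -c^2 .$$ (4.7)

With $\delta \sqrt{-g(u, u)} = -\delta[g(u, u)]/(2c)$, this allows us to write the variation (4.5) as

$$\delta S = \frac{m}{2} \int_a^b d\tau \left( \partial_\lambda g_{\mu\nu}\, \delta x^\lambda\, \dot x^\mu \dot x^\nu + 2 g_{\mu\nu}\, \delta \dot x^\mu\, \dot x^\nu \right) = 0 .$$ (4.8)

We can integrate the second term by parts, using that the variation $\delta x^\mu$ vanishes at the end points, to find

$$\begin{aligned}
2 \int_a^b d\tau\, g_{\mu\nu}\, \delta \dot x^\mu\, \dot x^\nu &= -2 \int_a^b d\tau\, \frac{d}{d\tau} \left( g_{\mu\nu}\, \dot x^\nu \right) \delta x^\mu \\
&= -2 \int_a^b d\tau \left( \partial_\lambda g_{\mu\nu}\, \dot x^\lambda \dot x^\nu + g_{\mu\nu}\, \ddot x^\nu \right) \delta x^\mu .
\end{aligned}$$ (4.9)

Interchanging the summation indices $\lambda$ and $\mu$ and inserting the result into (4.8) yields

$$\left( \partial_\lambda g_{\mu\nu} - 2 \partial_\mu g_{\lambda\nu} \right) \dot x^\mu \dot x^\nu - 2 g_{\lambda\nu}\, \ddot x^\nu = 0$$ (4.10)

or, after multiplication with $g^{\alpha\lambda}$,

$$\ddot x^\alpha + \frac{1}{2}\, g^{\alpha\lambda} \left( 2 \partial_\mu g_{\lambda\nu} - \partial_\lambda g_{\mu\nu} \right) \dot x^\mu \dot x^\nu = 0 .$$ (4.11)

? Convince yourself that (4.11) is correct and agrees with the geodesic equation.

Comparing the result (4.11) to (3.17) and recalling the symmetry of the Christoffel symbols (3.74), we arrive at the following important conclusion: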
Motion of freely falling particles
The trajectories extremising the action (4.4) are geodesic curves. Freely
falling particles thus follow the geodesics of the spacetime.
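The step from (4.11) to the geodesic equation (3.17) relies on the symmetry of the product $\dot x^\mu \dot x^\nu$. Written out as a short sketch, using only (3.74) and (4.11):

```latex
% Only the part of \partial_\mu g_{\lambda\nu} symmetric in (\mu,\nu)
% survives the contraction with \dot x^\mu \dot x^\nu:
2\,\partial_\mu g_{\lambda\nu}\,\dot x^\mu \dot x^\nu
  = \left( g_{\lambda\nu,\mu} + g_{\lambda\mu,\nu} \right) \dot x^\mu \dot x^\nu
% Inserting this into (4.11) and comparing with (3.74):
\frac{1}{2}\, g^{\alpha\lambda}
  \left( g_{\lambda\nu,\mu} + g_{\lambda\mu,\nu} - g_{\mu\nu,\lambda} \right)
  \dot x^\mu \dot x^\nu
  = \Gamma^\alpha_{\mu\nu}\, \dot x^\mu \dot x^\nu
% so that (4.11) becomes
\ddot x^\alpha + \Gamma^\alpha_{\mu\nu}\, \dot x^\mu \dot x^\nu = 0
```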

4.2 Motion of light

4.2.1 Maxwell’s Equations in a Gravitational Field

As an example for how physical laws can be carried from special to gen-
eral relativity, we now formulate the equations of classical electrodynam-
ics in a gravitational field. For a summary of classical electrodynamics,
see Appendix A.
In terms of the field tensor $F$, Maxwell’s equations read

$$\partial_\lambda F_{\mu\nu} + \partial_\mu F_{\nu\lambda} + \partial_\nu F_{\lambda\mu} = 0 , \quad \partial_\nu F^{\mu\nu} = \frac{4\pi}{c}\, j^\mu ,$$ (4.12)

where $j^\mu$ is the current four-vector. The homogeneous equations are identically satisfied introducing the potentials $A^\mu$, in terms of which the field tensor is

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu .$$ (4.13)

We can impose a gauge condition, such as the Lorenz gauge

$$\partial_\mu A^\mu = 0 ,$$ (4.14)

which allows us to write the inhomogeneous Maxwell equation in the form

$$\Box A^\mu = -\frac{4\pi}{c}\, j^\mu$$ (4.15)

of the d’Alembert equation.
Indices are raised with the (inverse) Minkowski metric,

$$F^{\mu\nu} = \eta^{\mu\alpha} \eta^{\nu\beta} F_{\alpha\beta} .$$ (4.16)

Finally, the equation for the Lorentz force can be written as

$$m\, \frac{du^\mu}{d\tau} = \frac{q}{c}\, F^\mu{}_\nu\, u^\nu ,$$ (4.17)

where $u^\mu = dx^\mu/d\tau$ is the four-velocity.
Moving to general relativity, we first replace the partial by covariant derivatives in Maxwell’s equations and find

$$\nabla_\lambda F_{\mu\nu} + \nabla_\mu F_{\nu\lambda} + \nabla_\nu F_{\lambda\mu} = 0 , \quad \nabla_\nu F^{\mu\nu} = \frac{4\pi}{c}\, j^\mu .$$ (4.18)

However, it is easy to see that the identity

$$\nabla_\lambda F_{\mu\nu} + \text{cyclic} \equiv \partial_\lambda F_{\mu\nu} + \text{cyclic}$$ (4.19)



holds because of the antisymmetry of the field tensor F and the symmetry
of the connection ∇.
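Indices have to be raised with the inverse metric $g^{-1}$ now,

$$F^{\mu\nu} = g^{\mu\alpha} g^{\nu\beta} F_{\alpha\beta} .$$ (4.20)

Equation (4.17) for the Lorentz force has to be replaced by

$$m \left( \frac{du^\mu}{d\tau} + \Gamma^\mu_{\alpha\beta}\, u^\alpha u^\beta \right) = \frac{q}{c}\, F^\mu{}_\nu\, u^\nu .$$ (4.21)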

We thus arrive at the following general rule:


Porting physical laws into general relativity
In the presence of a gravitational field, the physical laws of special relativity are changed simply by substituting the covariant derivative for the partial derivative, $\partial \to \nabla$, by raising indices with $g^{\mu\nu}$ instead of $\eta^{\mu\nu}$ and by lowering them with $g_{\mu\nu}$ instead of $\eta_{\mu\nu}$, and by replacing the motion of free particles along straight lines by the motion along geodesics.
Note that this is a rule, not a law, because ambiguities may occur in
presence of second derivatives, as we shall see shortly.
We can impose a gauge condition such as the generalised Lorenz gauge

$$\nabla_\mu A^\mu = 0 ,$$ (4.22)

but now the inhomogeneous wave equation (4.15) becomes more complicated. We first note that

$$F_{\mu\nu} = \nabla_\mu A_\nu - \nabla_\nu A_\mu \equiv \partial_\mu A_\nu - \partial_\nu A_\mu$$ (4.23)

identically. Inserting (4.23) into the inhomogeneous Maxwell equation first yields

$$\nabla_\nu \left( \nabla^\mu A^\nu - \nabla^\nu A^\mu \right) = \frac{4\pi}{c}\, j^\mu ,$$ (4.24)

but now the term $\nabla_\nu \nabla^\mu A^\nu$ does not vanish despite the Lorenz gauge condition because the covariant derivatives do not commute.

Caution Applying the rule given above, it has to be taken into account that covariant derivatives do not generally commute.

Instead, we have to use

$$\left( \nabla_\mu \nabla_\nu - \nabla_\nu \nabla_\mu \right) A^\alpha = \bar R^\alpha{}_{\beta\mu\nu}\, A^\beta$$ (4.25)

by definition of the curvature tensor, and thus

$$\nabla_\nu \nabla^\mu A^\nu = \nabla^\mu \nabla_\nu A^\nu + R^\mu{}_\beta\, A^\beta = R^\mu{}_\beta\, A^\beta$$ (4.26)

inserting the Lorenz gauge condition.



Electromagnetic wave equation in a curved spacetime


Thus, the inhomogeneous wave equation for an electromagnetic field in general relativity reads

$$\nabla_\nu \nabla^\nu A^\mu - R^\mu{}_\nu\, A^\nu = -\frac{4\pi}{c}\, j^\mu .$$ (4.27)

Had we started directly from the wave equation (4.15) from special
relativity, we would have missed the curvature term! This illustrates
the ambiguities that may occur applying the rule ∂ → ∇ when second
derivatives are involved.

4.2.2 Geometrical Optics

We now study how light rays propagate in a gravitational field. As usual in geometrical optics, we assume that the wavelength $\lambda$ of the electromagnetic field is very much smaller than the scale $L$ of the space within which we study light propagation. In a gravitational field, which causes spacetime to curve on another scale $R$, we have to further assume that $\lambda$ is also very small compared to $R$, thus

$$\lambda \ll L \quad\text{and}\quad \lambda \ll R .$$ (4.28)

Example: Geometrical optics in curved space

An example could be an astronomical source at a distance of several million light-years from which light with optical wavelengths travels to the observer. The scale $L$ would then be of order $10^{24}$ cm or larger, the scale $R$ would be the curvature radius of the Universe, of order $10^{28}$ cm, while the light would have wavelengths of order $10^{-6}$ cm.

Consequently, we introduce an expansion of the four-potential in terms of a small parameter $\varepsilon \equiv \lambda/\min(L, R) \ll 1$ and write the four-potential as a product of a slowly varying amplitude and a quickly varying phase,

$$A^\mu = \mathrm{Re} \left\{ (a^\mu + \varepsilon b^\mu)\, e^{i\psi/\varepsilon} \right\} ,$$ (4.29)

where the amplitude is understood as the two leading-order terms in the expansion, and the phase $\psi$ carries the factor $\varepsilon^{-1}$ because it is inversely proportional to the wavelength. The real part is introduced because the amplitude is complex.
As in ordinary geometrical optics, the wave vector is the gradient of the phase, thus $k_\mu = \partial_\mu \psi$. We further introduce the scalar amplitude $a \equiv (a_\mu a^{*\mu})^{1/2}$, where the asterisk denotes complex conjugation, and the polarisation vector $e^\mu \equiv a^\mu/a$.

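We first impose the Lorenz gauge and find the condition

$$\mathrm{Re} \left\{ \left[ \nabla_\mu (a^\mu + \varepsilon b^\mu) + \frac{i}{\varepsilon}\, (a^\mu + \varepsilon b^\mu)\, k_\mu \right] e^{i\psi/\varepsilon} \right\} = 0 .$$ (4.30)

To leading order ($\varepsilon^{-1}$), this implies

$$k_\mu a^\mu = 0 ,$$ (4.31)

which shows that the wave vector is perpendicular to the polarisation vector. The next-higher order yields

$$\nabla_\mu a^\mu + i k_\mu b^\mu = 0 .$$ (4.32)

Next, we insert the ansatz (4.29) into Maxwell’s equation (4.27) in vacuum, i.e. setting the right-hand side to zero. This yields

$$\begin{aligned}
\mathrm{Re} \Big\{ \Big[ & \nabla_\nu \nabla^\nu (a^\mu + \varepsilon b^\mu) + \frac{2i}{\varepsilon}\, k^\nu \nabla_\nu (a^\mu + \varepsilon b^\mu) \\
& + \frac{i}{\varepsilon}\, (a^\mu + \varepsilon b^\mu)\, \nabla_\nu k^\nu - \frac{1}{\varepsilon^2}\, k_\nu k^\nu (a^\mu + \varepsilon b^\mu) \\
& - R^\mu{}_\nu (a^\nu + \varepsilon b^\nu) \Big] e^{i\psi/\varepsilon} \Big\} = 0 .
\end{aligned}$$ (4.33)

To leading order ($\varepsilon^{-2}$), this implies

$$k_\nu k^\nu = 0 ,$$ (4.34)

which yields the general-relativistic eikonal equation

$$g^{\mu\nu}\, \partial_\mu \psi\, \partial_\nu \psi = 0 .$$ (4.35)

Trivially, (4.34) implies

$$0 = \nabla_\mu (k_\nu k^\nu) = 2 k^\nu \nabla_\mu k_\nu .$$ (4.36)

Recall that the wave vector is the gradient of the scalar phase $\psi$. The second covariant derivatives of $\psi$ commute,

$$\nabla_\mu \nabla_\nu \psi = \nabla_\nu \nabla_\mu \psi$$ (4.37)

as is easily seen by direct calculation, using the symmetry of the connection. Thus,

$$\nabla_\mu k_\nu = \nabla_\nu k_\mu ,$$ (4.38)

which, inserted into (4.36), leads to

$$k^\nu \nabla_\nu k_\mu = 0 \quad\text{or}\quad \nabla_k k = 0 .$$ (4.39)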

In other words, we arrive at the following important result:



Light rays in curved spacetime


In the limit of geometrical optics, Maxwell’s equations imply that light
rays follow null geodesics.

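The next-higher order ($\varepsilon^{-1}$) gives

$$2i \left( k^\nu \nabla_\nu a^\mu + \frac{1}{2}\, a^\mu \nabla_\nu k^\nu \right) - k_\nu k^\nu b^\mu = 0$$ (4.40)

and, with (4.34), this becomes

$$k^\nu \nabla_\nu a^\mu + \frac{1}{2}\, a^\mu \nabla_\nu k^\nu = 0 .$$ (4.41)

We use this to derive a propagation law for the amplitude $a$. Obviously, we can write

$$2a\, k^\nu \partial_\nu a = 2a\, k^\nu \nabla_\nu a = k^\nu \nabla_\nu (a^2) = k^\nu \left( a^*_\mu \nabla_\nu a^\mu + a^\mu \nabla_\nu a^*_\mu \right) .$$ (4.42)

By (4.41), this can be transformed to

$$k^\nu \left( a^*_\mu \nabla_\nu a^\mu + a^\mu \nabla_\nu a^*_\mu \right) = -\frac{1}{2}\, \nabla_\nu k^\nu \left( a^*_\mu a^\mu + a^\mu a^*_\mu \right) = -a^2\, \nabla_\nu k^\nu .$$ (4.43)

Combining (4.43) with (4.42) yields

$$k^\nu \partial_\nu a = -\frac{a}{2}\, \nabla_\nu k^\nu ,$$ (4.44)

which shows how the amplitude is transported along light rays: the change of the amplitude in the direction of the wave vector is proportional to the negative divergence of the wave vector, which is a very intuitive result.

? Why and in what sense is the result (4.44) called intuitive here? What does it mean?

Finally, we obtain a law for the propagation of the polarisation. Using $a^\mu = a e^\mu$ in (4.41) gives

$$\begin{aligned}
0 &= k^\nu \nabla_\nu (a e^\mu) + \frac{1}{2}\, a e^\mu\, \nabla_\nu k^\nu \\
&= a\, k^\nu \nabla_\nu e^\mu + e^\mu \left( k^\nu \partial_\nu a + \frac{a}{2}\, \nabla_\nu k^\nu \right) = a\, k^\nu \nabla_\nu e^\mu ,
\end{aligned}$$ (4.45)

where (4.44) was used in the last step. This shows that

$$k^\nu \nabla_\nu e^\mu = 0 \quad\text{or}\quad \nabla_k e = 0 ,$$ (4.46)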

or in other words:
Transport of polarisation
The polarisation of electromagnetic waves is parallel-transported along
light rays.

4.2.3 Redshift

Suppose now that a light source moving with four-velocity $u_s$ is sending a light ray to an observer moving with four-velocity $u_o$, and another light ray after a proper-time interval $\delta\tau_s$. Let the phases of the first and second light rays be $\psi_1$ and $\psi_2 = \psi_1 + \delta\psi$, respectively.
Clearly, the phase differences measured at the source and at the observer must be equal, thus

$$u_s^\mu\, (\partial_\mu \psi)_s\, \delta\tau_s = \delta\psi = u_o^\mu\, (\partial_\mu \psi)_o\, \delta\tau_o .$$ (4.47)

Using $k_\mu = \partial_\mu \psi$, and assigning frequencies $\nu_s$ and $\nu_o$ to the light rays which are inversely proportional to the time intervals $\delta\tau_s$ and $\delta\tau_o$, we find

$$\frac{\nu_o}{\nu_s} = \frac{\delta\tau_s}{\delta\tau_o} = \frac{\langle k, u \rangle_o}{\langle k, u \rangle_s} ,$$ (4.48)

which gives the combined gravitational redshift and Doppler shift of the light rays. Any distinction between Doppler shift and gravitational redshift has no invariant meaning in general relativity.

? Beginning with (4.48), can you derive the specially-relativistic Doppler formula?
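As a numerical plausibility check of (4.48) in flat spacetime, the sketch below evaluates the ratio of the two scalar products for an observer at rest and a source moving along the line of sight. The units with $c = 1$, the function names `minkowski`, `four_velocity` and `frequency_ratio`, and the choice of a light ray travelling in the positive $x$ direction are assumptions of this example.

```python
import math

def minkowski(a, b):
    """Scalar product <a, b> with the Minkowski metric diag(-1, 1, 1, 1)."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

def four_velocity(beta):
    """Four-velocity (in units with c = 1) for the velocity beta along the x axis."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return (gamma, gamma * beta, 0.0, 0.0)

def frequency_ratio(k, u_obs, u_src):
    """Ratio nu_o / nu_s according to (4.48); k is constant along rays in flat space."""
    return minkowski(k, u_obs) / minkowski(k, u_src)

# Null wave vector of a ray travelling in the +x direction towards the observer.
k = (1.0, 1.0, 0.0, 0.0)
approaching = frequency_ratio(k, four_velocity(0.0), four_velocity(0.6))
receding = frequency_ratio(k, four_velocity(0.0), four_velocity(-0.6))
```

For a source approaching with $\beta = 0.6$, the ratio is $\sqrt{(1+\beta)/(1-\beta)} = 2$, and for a source receding with the same speed it is $1/2$.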

4.3 Energy-momentum (non-)conservation

4.3.1 Contracted Christoffel Symbols

From (3.74), we see that the contracted Christoffel symbol can be written as

$$\Gamma^\mu_{\mu\nu} = \frac{1}{2}\, g^{\mu\alpha} \left( g_{\alpha\nu,\mu} + g_{\mu\alpha,\nu} - g_{\mu\nu,\alpha} \right) .$$ (4.49)

Exchanging the arbitrary dummy indices $\alpha$ and $\mu$ and using the symmetry of the metric, we can simplify this to

$$\Gamma^\mu_{\mu\nu} = \frac{1}{2}\, g^{\mu\alpha}\, g_{\mu\alpha,\nu} .$$ (4.50)

We continue by using Cramer’s rule from linear algebra, which states that the inverse of a matrix $A$ has the components

$$(A^{-1})_{ij} = \frac{C_{ji}}{\det A} ,$$ (4.51)

where the $C_{ji}$ are the cofactors (signed minors) of the matrix $A$. Thus, the cofactors are

$$C_{ji} = \det A\, (A^{-1})_{ij} .$$ (4.52)

The determinant of $A$ can be expressed using the cofactors as

$$\det A = \sum_{j=1}^n C_{ji} A_{ji}$$ (4.53)

for any fixed $i$, where $n$ is the dimension of the (square) matrix $A$. This so-called Laplace expansion of the determinant follows after multiplying (4.52) with the matrix $A_{jk}$.

? Using (4.51) and (4.53), calculate the inverse and the determinant of 2 × 2 and 3 × 3 matrices.

By definition of the cofactors, any cofactor $C_{ji}$ does not contain the element $A_{ji}$ of the matrix $A$. Therefore, we can use (4.52) and the Laplace expansion (4.53) to conclude

$$\frac{\partial \det A}{\partial A_{ji}} = C_{ji} = \det A\, (A^{-1})_{ij} .$$ (4.54)

The metric is represented by the matrix $g_{\mu\nu}$, its inverse by $g^{\mu\nu}$. We abbreviate its determinant by $g$ here. Cramer’s rule (4.52) then implies that the cofactors of $g_{\mu\nu}$ are $C^{\mu\nu} = g\, g^{\mu\nu}$, and we can immediately conclude from (4.54) that

$$\frac{\partial g}{\partial g_{\mu\nu}} = g\, g^{\mu\nu}$$ (4.55)

and thus

$$\partial_\lambda g = \frac{\partial g}{\partial g_{\mu\nu}}\, \partial_\lambda g_{\mu\nu} = g\, g^{\mu\nu}\, \partial_\lambda g_{\mu\nu} .$$ (4.56)

Contracted Christoffel symbols

Comparing this with the expression (4.50) for the contracted Christoffel symbol, we see that

$$g\, g^{\mu\nu} g_{\mu\nu,\lambda} = 2 g\, \Gamma^\mu_{\mu\lambda} , \quad \Gamma^\mu_{\mu\lambda} = \frac{1}{2}\, g^{\mu\nu} g_{\mu\nu,\lambda} = \frac{1}{2g}\, g_{,\lambda} = \frac{1}{\sqrt{-g}}\, \partial_\lambda \sqrt{-g} ,$$ (4.57)

which is a very convenient expression for the contracted Christoffel symbol, as we shall see.

4.3.2 Covariant Divergences

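The covariant derivative of a vector with components $v^\mu$ has the components

$$\nabla_\nu v^\mu = \partial_\nu v^\mu + \Gamma^\mu_{\nu\alpha}\, v^\alpha .$$ (4.58)

Using (4.57), the covariant divergence of this vector can thus be written

$$\nabla_\mu v^\mu = \partial_\mu v^\mu + \frac{1}{\sqrt{-g}}\, v^\mu\, \partial_\mu \sqrt{-g} = \frac{1}{\sqrt{-g}}\, \partial_\mu \left( \sqrt{-g}\, v^\mu \right) .$$ (4.59)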
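A familiar special case may help to see what (4.59) expresses. The following sketch applies it to flat three-dimensional space in spherical polar coordinates; since this metric is positive definite, $\sqrt{-g}$ is replaced by $\sqrt{g}$ there, and the components $v^r$, $v^\theta$, $v^\varphi$ refer to the coordinate basis, not to an orthonormal one. These choices are assumptions of the example.

```latex
% Flat space in spherical polar coordinates (r, \theta, \varphi):
g_{ij} = \mathrm{diag}\left( 1,\; r^2,\; r^2 \sin^2\theta \right) , \qquad
\sqrt{g} = r^2 \sin\theta
% Contracted Christoffel symbols from (4.57):
\Gamma^i_{ir} = \frac{2}{r} , \qquad
\Gamma^i_{i\theta} = \cot\theta , \qquad
\Gamma^i_{i\varphi} = 0
% Divergence from (4.59):
\nabla_i v^i
  = \frac{1}{r^2 \sin\theta}\, \partial_i \left( r^2 \sin\theta\, v^i \right)
  = \frac{1}{r^2}\, \partial_r \left( r^2 v^r \right)
  + \frac{1}{\sin\theta}\, \partial_\theta \left( \sin\theta\, v^\theta \right)
  + \partial_\varphi v^\varphi
```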

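Similarly, for a tensor $A$ of rank $(2, 0)$ with components $A^{\mu\nu}$, we have

$$\nabla_\nu A^{\mu\nu} = \partial_\nu A^{\mu\nu} + \Gamma^\mu_{\alpha\nu}\, A^{\alpha\nu} + \Gamma^\nu_{\nu\alpha}\, A^{\mu\alpha} .$$ (4.60)

Again, by means of (4.57), we can combine the first and third terms on the right-hand side to write

$$\nabla_\nu A^{\mu\nu} = \frac{1}{\sqrt{-g}}\, \partial_\nu \left( \sqrt{-g}\, A^{\mu\nu} \right) + \Gamma^\mu_{\alpha\nu}\, A^{\alpha\nu} .$$ (4.61)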

Tensor divergences
If the tensor $A^{\mu\nu}$ is antisymmetric, the second term on the right-hand side of the divergence (4.61) vanishes because then the symmetric Christoffel symbol $\Gamma^\mu_{\alpha\nu}$ is contracted with the antisymmetric tensor $A^{\alpha\nu}$. If $A^{\mu\nu}$ is symmetric, however, this final term remains, with important consequences.

4.3.3 Charge Conservation

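Since the electromagnetic field tensor $F^{\mu\nu}$ is antisymmetric, (4.61) implies

$$\nabla_\nu F^{\mu\nu} = \frac{1}{\sqrt{-g}}\, \partial_\nu \left( \sqrt{-g}\, F^{\mu\nu} \right) .$$ (4.62)

On the other hand, replacing the vector $v^\mu$ by $\nabla_\nu F^{\mu\nu}$ in (4.59), we see that

$$\nabla_\mu \nabla_\nu F^{\mu\nu} = \frac{1}{\sqrt{-g}}\, \partial_\mu \left( \sqrt{-g}\, \nabla_\nu F^{\mu\nu} \right) = \frac{1}{\sqrt{-g}}\, \partial_\mu \partial_\nu \left( \sqrt{-g}\, F^{\mu\nu} \right) ,$$ (4.63)

where we have used (4.62) in the final step. But the partial derivatives commute, so that once more the antisymmetric tensor $F^{\mu\nu}$ is contracted with the symmetric symbol $\partial_\mu \partial_\nu$. Thus, the result must vanish, allowing us to conclude

$$\nabla_\mu \nabla_\nu F^{\mu\nu} = 0 .$$ (4.64)

However, by Maxwell’s equation (4.18),

$$\nabla_\mu \nabla_\nu F^{\mu\nu} = \frac{4\pi}{c}\, \nabla_\mu j^\mu ,$$ (4.65)

which implies, by (4.59),

$$\partial_\mu \left( \sqrt{-g}\, j^\mu \right) = 0 .$$ (4.66)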

Charge conservation
Equation (4.66) is the continuity equation of the electric four-current, implying charge conservation. We thus see that the antisymmetry of the electromagnetic field tensor guarantees charge conservation.

4.3.4 Energy-Momentum “Conservation”

In special relativity, energy-momentum conservation can be expressed by the vanishing four-divergence of the energy-momentum tensor $T$,

$$\partial_\nu T^{\mu\nu} = 0 .$$ (4.67)

Example: Energy conservation in an electromagnetic field


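For example, the energy-momentum tensor of the electromagnetic field is, in special relativity and with the signature $(-, +, +, +)$ of the Minkowski metric used here,

$$T^{\mu\nu} = \frac{1}{4\pi} \left[ F^{\mu\lambda} F^\nu{}_\lambda - \frac{1}{4}\, \eta^{\mu\nu} F^{\alpha\beta} F_{\alpha\beta} \right] ,$$ (4.68)

and for $\mu = 0$, the vanishing divergence (4.67) yields the energy conservation equation

$$\frac{\partial}{\partial t} \left( \frac{\vec E^2 + \vec B^2}{8\pi} \right) + \vec\nabla \cdot \left[ \frac{c}{4\pi} \left( \vec E \times \vec B \right) \right] = 0 ,$$ (4.69)

in which the Poynting vector

$$\vec S = \frac{c}{4\pi}\, \vec E \times \vec B$$ (4.70)

represents the energy current density.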

According to our general rule for moving results from special relativity
to general relativity, we can replace the partial derivative in (4.67) by the
covariant derivative,
∇ν T μν = 0 , (4.71)

and obtain an equation which is covariant and thus valid in all reference
frames. Moreover, we would have to replace the Minkowski metric η in
(4.68) by the metric g if we wanted to consider the energy-momentum
tensor of the electromagnetic field.
From our general result (4.61), we know that we can rephrase (4.71) as

\[ \frac{1}{\sqrt{-g}}\, \partial_\nu \left( \sqrt{-g}\, T^{\mu\nu} \right) + \Gamma^\mu_{\lambda\nu} T^{\lambda\nu} = 0 . \tag{4.72} \]

If the second term on the left-hand side were absent, this equation would
imply a conservation law. It remains there, however, because the
energy-momentum tensor is symmetric. In the presence of this term, we can
no longer convert (4.72) into a conservation law. This result expresses the
following important fact:

Energy non-conservation
Energy is not generally conserved in general relativity. This is not
surprising because energy can now be exchanged with the gravitational
field.

4.4 The Newtonian limit

4.4.1 Metric and Gravitational Potential

Finally, we want to see under which conditions on the metric the equation
of motion in a gravitational field reproduces its Newtonian limit,
\[ \ddot{\vec x} = -\vec\nabla \Phi , \tag{4.73} \]
which holds to very high precision in the Solar System.
We first restrict the gravitational field to be weak and to vary slowly with
time. This implies that the Minkowski metric of flat space is perturbed
by a small amount,
\[ g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} , \tag{4.74} \]
with $|h_{\mu\nu}| \ll 1$.
Moreover, we restrict the consideration to bodies moving much slower
than the speed of light, such that
\[ \frac{dx^i}{d\tau} \ll \frac{dx^0}{d\tau} \approx 1 . \tag{4.75} \]
? Does it matter with respect to which coordinate frame the velocity is
assumed to be much less than the speed of light?
Under these conditions, the geodesic equation for the i-th spatial
coordinate reduces to
\[ \frac{d^2 x^i}{c^2\, dt^2} \approx \frac{d^2 x^i}{d\tau^2} = -\Gamma^i_{\alpha\beta}\, \frac{dx^\alpha}{d\tau} \frac{dx^\beta}{d\tau} \approx -\Gamma^i_{00} . \tag{4.76} \]

By definition (3.74), the remaining Christoffel symbols read
\[ \Gamma^i_{00} = h_{0i,0} - \frac{1}{2}\, h_{00,i} \approx -\frac{1}{2}\, h_{00,i} \tag{4.77} \]
due to the assumption that the metric changes slowly in time so that
its time derivative can be ignored compared to its spatial derivatives.
Equation (4.76) can thus be reduced to
\[ \frac{d^2 \vec x}{dt^2} \approx \frac{c^2}{2}\, \vec\nabla h_{00} , \tag{4.78} \]
which agrees with the Newtonian equation of motion (4.73) if we identify
\[ h_{00} \approx -\frac{2\Phi}{c^2} + \mathrm{const.} \tag{4.79} \]

The constant can be set to zero because both the deviation from the
Minkowski metric and the gravitational potential vanish at large distance
from the source of gravity. Therefore, the metric in the Newtonian limit
has the 0-0 element

\[ g_{00} \approx -1 - \frac{2\Phi}{c^2} . \tag{4.80} \]
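To get a feeling for how small the perturbation is in the Solar System, the following sketch evaluates h_00 = -2Φ/c^2 from (4.79) for a point-mass potential Φ = -GM/r. The rounded CGS constants and the function names are assumptions made for illustration only.

```python
G = 6.674e-8      # gravitational constant in cm^3 g^-1 s^-2 (rounded)
C = 2.998e10      # speed of light in cm/s (rounded)
M_SUN = 2.0e33    # solar mass in g (rounded)
R_SUN = 7.0e10    # solar radius in cm (rounded)
AU = 1.496e13     # mean Earth-Sun distance in cm (rounded)

def newtonian_potential(mass, r):
    """Point-mass potential Phi = -G M / r, which vanishes at infinity."""
    return -G * mass / r

def h00(mass, r):
    """Metric perturbation h_00 = -2 Phi / c^2, i.e. (4.79) with const = 0."""
    return -2.0 * newtonian_potential(mass, r) / C**2

h00_rim = h00(M_SUN, R_SUN)      # at the solar rim
h00_earth = h00(M_SUN, AU)       # at the Earth's orbit
print(h00_rim, h00_earth)
```

Both values are many orders of magnitude below unity, which is what the weak-field assumption behind (4.74) requires.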

4.4.2 Gravitational Light Deflection

Based on this result, we might speculate that the metric in Newtonian
approximation could be written as
\[ g = \mathrm{diag} \left[ -\left( 1 + \frac{2\Phi}{c^2} \right), 1, 1, 1 \right] . \tag{4.81} \]
We shall now work out the gravitational light deflection by the Sun in
this metric, which was one of the first observational tests of general
relativity.
Since light rays propagate along null geodesics, we have

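\[ \nabla_k k = 0 \quad\text{or}\quad k^\nu \partial_\nu k^\mu + \Gamma^\mu_{\nu\lambda} k^\nu k^\lambda = 0 , \tag{4.82} \]
where $k = (\omega/c, \vec k)$ is the wave four-vector which satisfies
\[ \langle k, k \rangle = 0 \quad\text{thus}\quad \omega = c\,|\vec k| , \tag{4.83} \]
which is the ordinary dispersion relation for electromagnetic waves
in vacuum. We introduce the unit vector $\vec e$ in the direction of $\vec k$ by
$\vec k = |\vec k|\, \vec e = \omega\, \vec e / c$.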

Assuming that the gravitational potential Φ does not vary with time,
∂0 Φ = 0, the only non-vanishing Christoffel symbols of the metric (4.81)
are
\[ \Gamma^0_{0i} \approx \frac{1}{c^2}\, \partial_i \Phi \approx \Gamma^i_{00} . \tag{4.84} \]
For μ = 0, (4.82) yields
\[ \left( \frac{1}{c}\, \partial_t + \vec e \cdot \vec\nabla \right) \omega + \frac{\omega}{c^2}\, \vec e \cdot \vec\nabla \Phi = 0 , \tag{4.85} \]
which shows that the frequency changes with time only because the light
path can run through a spatially varying gravitational potential. Thus, if
the potential is constant in time, the frequencies of the incoming and the
outgoing light must be equal.
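Using this result, the spatial components of (4.82) read
\[ \left( \frac{1}{c}\, \partial_t + \vec e \cdot \vec\nabla \right) \vec e = \frac{d\vec e}{c\, dt} = -\frac{1}{c^2} \left[ \vec\nabla - \vec e\, (\vec e \cdot \vec\nabla) \right] \Phi = -\frac{\vec\nabla_\perp \Phi}{c^2} ; \tag{4.86} \]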

in other words, the total time derivative of the unit vector in the direc-
tion of the light ray equals the negative perpendicular gradient of the
gravitational potential.
For calculating the light deflection, we need to know the total change
of $\vec e$ as the light ray passes the Sun. This is obtained by integrating
(4.86) along the actual (curved) light path, which is quite complicated.
However, due to the weakness of the gravitational field, the deflection
will be very small, and we can evaluate the integral along the unperturbed
(straight) light path.
Caution Note that this approximation is conceptually identical to Born’s
approximation in quantum-mechanical scattering problems.
We choose a coordinate system centred on the Sun and rotated such that
the light ray propagates parallel to the z axis from −∞ to ∞ at an impact
parameter b. Outside the Sun, its gravitational potential is

\[ \frac{\Phi}{c^2} = -\frac{G M_\odot}{c^2 r} = -\frac{G M_\odot}{c^2 \sqrt{b^2 + z^2}} . \tag{4.87} \]

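The perpendicular gradient of $\Phi/c^2$ is
\[ \vec\nabla_\perp \frac{\Phi}{c^2} = \frac{1}{c^2} \frac{\partial \Phi}{\partial b}\, \vec e_b = \frac{G M_\odot\, b}{c^2 \left( b^2 + z^2 \right)^{3/2}}\, \vec e_b , \tag{4.88} \]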

where eb is the radial unit vector in the x-y plane from the Sun to the
light ray.
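Light deflection in (incomplete) Newtonian approximation
Thus, under the present assumptions, the deflection angle is
\[ \delta \vec e = -\vec e_b \int_{-\infty}^{\infty} dz\, \frac{G M_\odot\, b}{c^2 \left( b^2 + z^2 \right)^{3/2}} = -\frac{2 G M_\odot}{c^2 b}\, \vec e_b . \tag{4.89} \]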

Evaluating (4.89) at the rim of the Sun, we insert $M_\odot = 2 \cdot 10^{33}$ g and
$R_\odot = 7 \cdot 10^{10}$ cm to find
\[ |\delta \vec e| = 0.87'' . \tag{4.90} \]
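The estimate (4.90) can be reproduced with a few lines of code. The sketch below evaluates the integral in (4.89) numerically along the unperturbed path and converts the result to arcseconds. The rounded constants, the substitution z = b tan(phi) and the function names are assumptions made for illustration; the result should land close to the value quoted in (4.90), with small differences depending on how the constants are rounded.

```python
import math

G = 6.674e-8      # gravitational constant in cm^3 g^-1 s^-2 (rounded)
C = 2.998e10      # speed of light in cm/s (rounded)
M_SUN = 2.0e33    # g, value quoted in the text
R_SUN = 7.0e10    # cm, value quoted in the text

def path_integral(steps=1000):
    """Dimensionless integral b * int dz b / (b^2 + z^2)^(3/2) along the straight path.

    With z = b tan(phi) the integrand becomes cos(phi) on (-pi/2, pi/2);
    the midpoint rule then avoids the infinite integration limits.
    """
    width = math.pi / steps
    return sum(math.cos(-math.pi / 2 + (k + 0.5) * width) for k in range(steps)) * width

def deflection_angle(mass, b):
    """|delta e| in radians: G M / (c^2 b) times the path integral, cf. (4.89)."""
    return G * mass / (C**2 * b) * path_integral()

def to_arcsec(angle_rad):
    """Convert an angle from radians to arcseconds."""
    return math.degrees(angle_rad) * 3600.0

integral = path_integral()
alpha = to_arcsec(deflection_angle(M_SUN, R_SUN))
print(integral, alpha)
```

The numerical integral reproduces the factor 2 of the closed-form result, and doubling the angle gives the value found with the complete Newtonian line element.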


For several reasons, this is a remarkable result. First, it had already been
derived by the German astronomer Soldner in the 19th century who had
assumed that light was a stream of material particles to which celestial
mechanics could be applied just as well as to planets. Before general
relativity, a strict physical meaning could not be given to the trajectory
of light in the presence of a gravitational field because the interaction
between electromagnetic fields and gravity was entirely unclear. The
statement of general relativity that light propagates along null geodesics
for the first time provided a physical law for the propagation of light rays
in gravitational fields.
Second, the result (4.90) is experimentally found to be incorrect. In
fact, the measured value is twice as large. This is a consequence of our
assumption that the metric in the Newtonian limit is given by (4.81),
while the line element in the complete Newtonian limit is
\[ ds^2 = -\left( 1 + \frac{2\Phi}{c^2} \right) c^2 dt^2 + \left( 1 - \frac{2\Phi}{c^2} \right) d\vec x^{\,2} . \tag{4.91} \]
Chapter 5

Differential Geometry III

5.1 The Lie derivative

5.1.1 The Pull-Back

Following (2.28), we considered one-parameter groups of diffeomorphisms
γt : R × M → M (5.1)
such that points p ∈ M can be considered as being transported along
curves
γ:R→M (5.2)
with γ(0) = p. Similarly, the diffeomorphism γt can be taken at fixed
t ∈ R, defining a diffeomorphism

γt : M → M (5.3)

which maps the manifold onto itself and satisfies γt ◦ γ s = γ s+t .


We have seen the relationship between vector fields and one-parameter
groups of diffeomorphisms before. Let now v be a vector field on M and
γ from (5.2) be chosen such that the tangent vector γ̇(t) defined by
d
(γ̇(t))( f ) = ( f ◦ γ)(t) (5.4)
dt
is identical with v, γ̇ = v. Then γ is called an integral curve of v.
If this is true for all curves γ obtained from γt by specifying initial points
γ(0), the result is called the flow of v.
The domain of definition D of γt can be a subset of R× M. If D = R× M,
the vector field is said to be complete and γt is called the global flow of v.
If D is restricted to open intervals I ⊂ R and open neighbourhoods
U ⊂ M, thus D = I × U ⊂ R × M, the flow is called local.
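A concrete flow may help to fix ideas. The sketch below uses the rotation field v = -y ∂_x + x ∂_y on the plane, whose global flow is a rotation by the angle t, checks the group property γ_t ∘ γ_s = γ_{s+t}, and compares the closed-form flow with a numerically integrated integral curve. The choice of vector field, the Runge-Kutta integrator and all names are illustrative assumptions.

```python
import math

def v(point):
    """Illustrative vector field v = -y d/dx + x d/dy generating rotations of the plane."""
    x, y = point
    return (-y, x)

def flow_exact(t, point):
    """Global flow gamma_t of v: rotation by the angle t."""
    x, y = point
    return (x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t))

def flow_rk4(t, point, steps=1000):
    """Integral curve through `point`, integrated with the classical Runge-Kutta scheme."""
    h = t / steps
    x, y = point
    for _ in range(steps):
        k1 = v((x, y))
        k2 = v((x + 0.5 * h * k1[0], y + 0.5 * h * k1[1]))
        k3 = v((x + 0.5 * h * k2[0], y + 0.5 * h * k2[1]))
        k4 = v((x + h * k3[0], y + h * k3[1]))
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return (x, y)

p = (1.0, 0.5)
s, t = 0.7, 1.1
composed = flow_exact(t, flow_exact(s, p))   # gamma_t after gamma_s
direct = flow_exact(s + t, p)                # gamma_{s+t}
numeric = flow_rk4(s + t, p)                 # integral curve of v starting at p
print(composed, direct, numeric)
```

Because this flow is defined for all t and all points of the plane, the field is complete in the sense described above.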


Pull-back
Let now M and N be two manifolds and φ : M → N a map from M
onto N. A function f defined at a point q ∈ N can be defined at a point
p ∈ M with q = φ(p) by

\[ \phi^* f : M \to \mathbb{R} , \quad (\phi^* f)(p) := (f \circ \phi)(p) = f[\phi(p)] . \tag{5.5} \]
The map $\phi^*$ “pulls” functions f on N “back” to M and is thus called
the pull-back.
Similarly, the pull-back allows us to map vectors v from the tangent space
$T_p M$ of M in p onto vectors from the tangent space $T_q N$ of N in q. We
can first pull back the function f defined in q ∈ N to p ∈ M and then
apply v to it, and identify the result as a vector $\phi_* v$ applied to f ,
\[ \phi_* : T_p M \to T_q N , \quad v \mapsto \phi_* v = v \circ \phi^* , \tag{5.6} \]
such that $(\phi_* v)(f) = v(\phi^* f) = v(f \circ \phi)$. This defines a vector from the
tangent space of N in q = φ(p).
Push-forward
The map $\phi_*$ “pushes” vectors from the tangent space of M in p to the
tangent space of N in q and is thus called the push-forward.
In a natural generalisation to dual vectors, we define their pull-back $\phi^*$
by
\[ \phi^* : T_q^* N \to T_p^* M , \quad w \mapsto \phi^* w = w \circ \phi_* , \tag{5.7} \]
such that $(\phi^* w)(v) = w(\phi_* v) = w(v \circ \phi^*)$, where $w \in T_q^* N$ is an element of
the dual space of N in q. This operation “pulls back” the dual vector w
from the dual space in q = φ(p) ∈ N to p ∈ M.
The pull-back $\phi^*$ and the push-forward $\phi_*$ can now be extended to tensors.
Let T be a tensor field of rank (0, r) on N, then its pull-back is defined
by
\[ \phi^* : \mathcal{T}^0_r(N) \to \mathcal{T}^0_r(M) , \quad T \mapsto \phi^* T = T \circ \phi_* , \tag{5.8} \]
such that $(\phi^* T)(v_1, \ldots, v_r) = T(\phi_* v_1, \ldots, \phi_* v_r)$. Similarly, we can define
the pull-back of a tensor field of rank (r, 0) on N by
\[ \phi^* : \mathcal{T}^r_0(N) \to \mathcal{T}^r_0(M) , \quad T \mapsto \phi^* T \tag{5.9} \]
such that $(\phi^* T)(\phi^* w_1, \ldots, \phi^* w_r) = T(w_1, \ldots, w_r)$.


If the pull-back $\phi^*$ is a diffeomorphism, which implies in particular that
the dimensions of M and N are equal, the pull-back and the push-forward
are each other’s inverses,
\[ \phi_* = (\phi^*)^{-1} . \tag{5.10} \]
Irrespective of the rank of a tensor, we now denote by $\phi^*$ the pull-back
of the tensor and by $\phi_*$ its inverse, i.e.
\[ \phi^* : \mathcal{T}^r_s(N) \to \mathcal{T}^r_s(M) , \quad \phi_* : \mathcal{T}^r_s(M) \to \mathcal{T}^r_s(N) . \tag{5.11} \]
The important point is that if $\phi^* : M \to M$ is a diffeomorphism and T is
a tensor field on M, then $\phi^* T$ can be compared to T .
Symmetry transformations
If $\phi^* T = T$, $\phi^*$ is a symmetry transformation of T because T stays the
same even though it was “moved” by $\phi^*$. If the tensor field is the metric
g, such a symmetry transformation of g is called an isometry.

5.1.2 The Lie Derivative

Lie derivative
Let now v be a vector field on M and $\gamma_t$ be the flow of v. Then, for an
arbitrary tensor $T \in \mathcal{T}^r_s$, the expression
\[ \mathcal{L}_v T := \lim_{t \to 0} \frac{\gamma_t^* T - T}{t} \tag{5.12} \]
is called the Lie derivative of the tensor T with respect to v.
Caution While the covariant derivative determines how vectors and tensors
change when moved across a given manifold, the Lie derivative determines
how these objects change upon transformations of the manifold itself.
Note that this definition naturally generalises the ordinary derivative
with respect to “time” t. The manifold M is infinitesimally transformed
by one element γt of a one-parameter group of diffeomorphisms. This
could, for instance, represent an infinitesimal rotation of the two-sphere
S 2 . The tensor T on the manifold after the transformation is pulled back
to the manifold before the transformation, where it can be compared to
the original tensor T before the transformation.
Obviously, the Lie derivative of a rank-(r, s) tensor is itself a rank-(r, s)
tensor. It is linear,

Lv (t1 + t2 ) = Lv (t1 ) + Lv (t2 ) , (5.13)

satisfies the Leibniz rule

Lv (t1 ⊗ t2 ) = Lv (t1 ) ⊗ t2 + t1 ⊗ Lv (t2 ) , (5.14)

and it commutes with contractions. So far, these properties are easy to
verify, in particular after choosing local coordinates.

The application of the Lie derivative to a function f follows directly
from the definition (5.4) of the tangent vector γ̇,
\[ \mathcal{L}_v f = \lim_{t \to 0} \frac{\gamma_t^* f - f}{t} = \lim_{t \to 0} \frac{(f \circ \gamma_t) - (f \circ \gamma_0)}{t} = \frac{d}{dt} (f \circ \gamma) = \dot\gamma f = v f = df(v) . \tag{5.15} \]

The additional convenient property


L x y = [x, y] (5.16)
for vector fields y is non-trivial to prove.
Given two vector fields x and y, the Lie derivative further satisfies the
linearity relations
L x+y = L x + Ly , Lλx = λL x , (5.17)
with λ ∈ R, and the commutation relation
L[x,y] = [L x , Ly ] = L x ◦ Ly − Ly ◦ L x . (5.18)

If and only if two vector fields x and y commute, so do the respective


Lie derivatives,
[x, y] = 0 ⇔ L x ◦ L y = Ly ◦ L x . (5.19)
If φ and ψ are the flows of x and y, the following commutation relation
is equivalent to (5.19),
φ s ◦ ψt = ψt ◦ φ s . (5.20)

Let $t \in \mathcal{T}^0_r$ be a rank-(0, r) tensor field and $v_1, \ldots, v_r$ be vector fields,
then
\[ (\mathcal{L}_x t)(v_1, \ldots, v_r) = x\big( t(v_1, \ldots, v_r) \big) - \sum_{i=1}^{r} t(v_1, \ldots, [x, v_i], \ldots, v_r) . \tag{5.21} \]

To demonstrate this, we apply the Lie derivative to the tensor product of
t and all $v_i$ and use the Leibniz rule (5.14),
\[ \mathcal{L}_x (t \otimes v_1 \otimes \ldots \otimes v_r) = \mathcal{L}_x t \otimes v_1 \otimes \ldots \otimes v_r + t \otimes \mathcal{L}_x v_1 \otimes \ldots \otimes v_r + \ldots + t \otimes v_1 \otimes \ldots \otimes \mathcal{L}_x v_r . \tag{5.22} \]
? Compute the Lie derivative of a rank-(1, 0) tensor field.
Then, we take the complete contraction and use the fact that the Lie
derivative commutes with contractions, which yields
\[ \mathcal{L}_x \big( t(v_1, \ldots, v_r) \big) = (\mathcal{L}_x t)(v_1, \ldots, v_r) + t(\mathcal{L}_x v_1, \ldots, v_r) + \ldots + t(v_1, \ldots, \mathcal{L}_x v_r) . \tag{5.23} \]
Inserting (5.16), we now obtain (5.21).


As an example, we apply (5.21) to a tensor of rank (0, 1), i.e. a dual
vector w:
(L x w)(y) = xw(y) − w([x, y]) . (5.24)
One particular dual vector is the differential of a function f , defined in
(2.35). Inserting d f for w in (5.24) yields the useful relation
(L x d f )(y) = xd f (y) − d f ([x, y])
= xy( f ) − [x, y]( f ) = yx( f )
= yL x f = dL x f (y) , (5.25)
and since this holds for any vector field y, we find
L x d f = dL x f . (5.26)

Using the latter expression, we can derive coordinate expressions for the
Lie derivative. We introduce the coordinate basis {∂i } and its dual basis
{dxi } and apply (5.26) to dxi ,
Lv dxi = dLv xi = dv(xi ) = dv j ∂ j xi = dvi = ∂ j vi dx j . (5.27)
The Lie derivative of the basis vectors ∂i is
Lv ∂i = [v, ∂i ] = −(∂i v j )∂ j , (5.28)
where (2.32) was used in the second step.

Example: Lie derivative of a rank-(1, 1) tensor field


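To illustrate the components of the Lie derivative of a tensor, we take a
tensor t of rank (1, 1) and apply the Lie derivative to the tensor product
$t \otimes dx^i \otimes \partial_j$,
\[ \mathcal{L}_v (t \otimes dx^i \otimes \partial_j) = (\mathcal{L}_v t) \otimes dx^i \otimes \partial_j + t \otimes \mathcal{L}_v dx^i \otimes \partial_j + t \otimes dx^i \otimes \mathcal{L}_v \partial_j , \tag{5.29} \]
and now contract completely. This yields
\[ \mathcal{L}_v t^i{}_j = (\mathcal{L}_v t)^i{}_j + t(\partial_k v^i\, dx^k, \partial_j) - t(dx^i, \partial_j v^k\, \partial_k) = (\mathcal{L}_v t)^i{}_j + t^k{}_j\, \partial_k v^i - t^i{}_k\, \partial_j v^k . \tag{5.30} \]
Solving for the components of the Lie derivative of t, we thus obtain
\[ (\mathcal{L}_v t)^i{}_j = v^k \partial_k t^i{}_j - t^k{}_j\, \partial_k v^i + t^i{}_k\, \partial_j v^k , \tag{5.31} \]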

and similarly for tensors of higher ranks. 

In particular, for a tensor of rank (0, 1), i.e. a dual vector w,


(Lv w)i = vk ∂k wi + wk ∂i vk . (5.32)

5.2 Killing vector fields

Killing vector fields


A Killing vector field K is a vector field along which the Lie derivative
of the metric vanishes,
LK g = 0 . (5.33)
This implies that the flow of a Killing vector field defines a symmetry
transformation of the metric, i.e. an isometry.
To find a coordinate expression, we use (5.31) to write

\[ (\mathcal{L}_K g)_{ij} = K^k \partial_k g_{ij} + g_{kj}\, \partial_i K^k + g_{ik}\, \partial_j K^k = K^k \left( \partial_k g_{ij} - \partial_i g_{kj} - \partial_j g_{ik} \right) + \partial_i (g_{kj} K^k) + \partial_j (g_{ik} K^k) = \nabla_i K_j + \nabla_j K_i = 0 , \tag{5.34} \]
where we have identified the Christoffel symbols (3.74) in the last step.
This is the Killing equation.
? Derive the Killing equation (5.34) yourself.
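The first line of the coordinate expression above can be tested numerically. The sketch below evaluates K^k ∂_k g_ij + g_kj ∂_i K^k + g_ik ∂_j K^k on the unit two-sphere with central finite differences, once for a rotation generator, which should be a Killing field, and once for the field ∂_θ, which should not. The choice of manifold, the two test fields, the step size and the function names are illustrative assumptions.

```python
import math

def metric(x):
    """Unit two-sphere, ds^2 = dtheta^2 + sin^2(theta) dphi^2, with x = (theta, phi)."""
    theta, _ = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def rotation_field(x):
    """Components (K^theta, K^phi) of a rotation about an axis in the equatorial plane."""
    theta, phi = x
    return [math.sin(phi), math.cos(phi) / math.tan(theta)]

def meridian_field(x):
    """The field d/dtheta, which does not generate an isometry of the sphere."""
    return [1.0, 0.0]

def partial(f, x, k, h=1e-5):
    """Central-difference derivative of the scalar function f along coordinate k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    return (f(xp) - f(xm)) / (2 * h)

def lie_derivative_of_metric(K, x):
    """(L_K g)_ij = K^k d_k g_ij + g_kj d_i K^k + g_ik d_j K^k at the point x."""
    n = len(x)
    g, Kx = metric(x), K(x)
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                result[i][j] += (
                    Kx[k] * partial(lambda y: metric(y)[i][j], x, k)
                    + g[k][j] * partial(lambda y: K(y)[k], x, i)
                    + g[i][k] * partial(lambda y: K(y)[k], x, j)
                )
    return result

point = (1.0, 0.7)
lie_rotation = lie_derivative_of_metric(rotation_field, point)
lie_meridian = lie_derivative_of_metric(meridian_field, point)
print(lie_rotation)
print(lie_meridian)
```

For the rotation field all components vanish up to the finite-difference error, while for ∂_θ the φφ component equals sin(2θ), so moving points along meridians changes the metric.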
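Let γ be a geodesic, i.e. a curve satisfying
\[ \nabla_{\dot\gamma} \dot\gamma = 0 , \tag{5.35} \]
then the projection of a Killing vector K on the tangent to the geodesic γ̇
is constant along the geodesic,
\[ \nabla_{\dot\gamma} \langle \dot\gamma, K \rangle = 0 . \tag{5.36} \]
This is easily seen as follows. First,
\[ \nabla_{\dot\gamma} \langle \dot\gamma, K \rangle = \langle \nabla_{\dot\gamma} \dot\gamma, K \rangle + \langle \dot\gamma, \nabla_{\dot\gamma} K \rangle = \langle \dot\gamma, \nabla_{\dot\gamma} K \rangle \tag{5.37} \]
because of the geodesic equation (5.35).
Writing the last expression explicitly in components yields
\[ \langle \dot\gamma, \nabla_{\dot\gamma} K \rangle = g_{ik}\, \dot\gamma^i \dot\gamma^j \nabla_j K^k = \dot\gamma^i \dot\gamma^j \nabla_j K_i , \tag{5.38} \]
changing indices and using the symmetry of the metric, we can also
write it as
\[ \langle \dot\gamma, \nabla_{\dot\gamma} K \rangle = g_{jk}\, \dot\gamma^j \dot\gamma^i \nabla_i K^k = \dot\gamma^j \dot\gamma^i \nabla_i K_j . \tag{5.39} \]
Adding the latter two equations and using the Killing equation (5.34)
shows
\[ 2 \langle \dot\gamma, \nabla_{\dot\gamma} K \rangle = \dot\gamma^i \dot\gamma^j \left( \nabla_i K_j + \nabla_j K_i \right) = 0 , \tag{5.40} \]
which proves (5.36). More elegantly, we have contracted the symmetric
tensor $\dot\gamma^i \dot\gamma^j$ with the tensor $\nabla_i K_j$ which is antisymmetric because of the
Killing equation, thus the result must vanish.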
Equation (5.36) has a profound meaning:

Conservation laws from Killing vector fields


Freely-falling particles and light rays both follow geodesics. The
constancy of $\langle \dot\gamma, K \rangle$ along geodesics means that each Killing vector
field gives rise to a conserved quantity for freely-falling particles and
light rays. Since a Killing vector field generates an isometry, this shows
that symmetry transformations of the metric give rise to conservation
laws.

5.3 Differential forms

5.3.1 Definition

Differential p-forms are totally antisymmetric tensors of rank (0, p). The
simplest examples are dual vectors $w \in T_p^* M$ since they are tensors
of rank (0, 1). A general tensor t of rank (0, 2) is not antisymmetric, but
can be antisymmetrised, defining the two-form
\[ \tau(v_1, v_2) \equiv \frac{1}{2} \left[ t(v_1, v_2) - t(v_2, v_1) \right] , \tag{5.41} \]
with two vectors $v_1, v_2 \in V$.
To generalise this operation for tensors of arbitrary ranks (0, r), we first
define the alternation operator by

\[ (At)(v_1, \ldots, v_r) := \frac{1}{r!} \sum_\pi \mathrm{sgn}(\pi)\, t(v_{\pi(1)}, \ldots, v_{\pi(r)}) , \tag{5.42} \]
where the sum extends over all permutations π of the integer numbers
from 1 to r. The sign of a permutation, sgn(π), is negative if the
permutation is odd and positive otherwise.
? As an exercise, explicitly apply the alternation operator to a tensor
field of rank (0, 3).
In components, we briefly write

(At)i1 ...ir = t[i1 ...ir ] (5.43)

so that p-forms ω are defined by the relation

ωi1 ...i p = ω[i1 ...i p ] (5.44)

between their components. For example, for a 2-form ω we have
\[ \omega_{ij} = \omega_{[ij]} = \frac{1}{2} \left( \omega_{ij} - \omega_{ji} \right) . \tag{5.45} \]
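The alternation operator (5.42) translates directly into a short routine acting on tensor components. The sketch below stores a tensor as a dictionary from index tuples to numbers, which is an assumption made for brevity, as are the function names and the sample tensor. Exact rational arithmetic is used so that antisymmetry and idempotence can be checked without rounding errors.

```python
from fractions import Fraction
from itertools import permutations, product

def permutation_sign(perm):
    """+1 for even, -1 for odd permutations of (0, ..., r-1), via the inversion count."""
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1

def alternate(t, rank, dim):
    """Components (At)_{i1...ir} = t_{[i1...ir]} of a tensor stored as a dict."""
    perms = list(permutations(range(rank)))
    result = {}
    for idx in product(range(dim), repeat=rank):
        total = sum(
            permutation_sign(p) * t[tuple(idx[k] for k in p)] for p in perms
        )
        result[idx] = Fraction(total, len(perms))   # division by r!
    return result

# Hypothetical rank-(0, 3) tensor in three dimensions with integer components.
dim, rank = 3, 3
t = {
    idx: (idx[0] + 1) * (idx[1] + 2) ** 2 * (idx[2] + 3) ** 3
    for idx in product(range(dim), repeat=rank)
}

a = alternate(t, rank, dim)
print(a[(0, 1, 2)], a[(1, 0, 2)], a[(0, 0, 1)])
```

Swapping two indices flips the sign, components with a repeated index vanish, and applying the operator a second time leaves the result unchanged, as expected for a projection onto the antisymmetric part.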
The vector space of p-forms is denoted by $\bigwedge^p$. Taking the product of two
differential forms $\omega \in \bigwedge^p$ and $\eta \in \bigwedge^q$ yields a tensor of rank (0, p + q)
which is not antisymmetric, but can be antisymmetrised by means of the
alternation operator. The result
\[ \omega \wedge \eta \equiv \frac{(p + q)!}{p!\, q!}\, A(\omega \otimes \eta) \tag{5.46} \]
is called the exterior product. Evidently, it turns the tensor $\omega \otimes \eta \in \mathcal{T}^0_{p+q}$
into a (p + q)-form.
The definition of the exterior product implies that it is bilinear,
associative, and satisfies
\[ \omega \wedge \eta = (-1)^{pq}\, \eta \wedge \omega . \tag{5.47} \]
A basis for the vector space $\bigwedge^p$ can be constructed from the basis $\{dx^i\}$,
1 ≤ i ≤ n, of the dual space $V^*$ by taking
\[ dx^{i_1} \wedge \ldots \wedge dx^{i_p} \quad\text{with}\quad 1 \le i_1 < \ldots < i_p \le n , \tag{5.48} \]
which shows that the dimension of $\bigwedge^p$ is
\[ \binom{n}{p} \equiv \frac{n!}{p!\, (n - p)!} \tag{5.49} \]
for p ≤ n and zero otherwise. The skewed commutation relation (5.47)
implies
\[ dx^i \wedge dx^j = -dx^j \wedge dx^i . \tag{5.50} \]

Given two vector spaces V and W over the same field F, the Cartesian
product V × W of the two spaces can be turned into a vector space by
defining the vector-space operations component-wise. Let v, v1 , v2 ∈ V
and w, w1 , w2 ∈ W, then the operations

(v1 , w1 ) + (v2 , w2 ) = (v1 + v2 , w1 + w2 ) , λ(v, w) = (λv, λw) (5.51)

with λ ∈ F give V × W the structure of a vector space V ⊕ W which is


called the direct sum of V and W.
Vector space of differential forms
Similarly, we define the vector space of differential forms
\[ \bigwedge \equiv \bigoplus_{p=0}^{n} \bigwedge\nolimits^{p} \tag{5.52} \]
as the direct sum of the vector spaces of p-forms with arbitrary p ≤ n.


Recalling that a vector space V attains the structure of an algebra by
defining a vector-valued product between two vectors,
\[ \times : V \times V \to V , \quad (v, w) \mapsto v \times w , \tag{5.53} \]
we see that the exterior product ∧ gives the vector space $\bigwedge$ of differential
forms the structure of a Grassmann algebra,
\[ \wedge : \bigwedge \times \bigwedge \to \bigwedge , \quad (\omega, \eta) \mapsto \omega \wedge \eta . \tag{5.54} \]

The interior product of a p-form ω with a vector v ∈ V is a mapping
\[ V \times \bigwedge\nolimits^{p} \to \bigwedge\nolimits^{p-1} , \quad (v, \omega) \mapsto i_v \omega \tag{5.55} \]
defined by
\[ (i_v \omega)(v_1, \ldots, v_{p-1}) \equiv \omega(v, v_1, \ldots, v_{p-1}) \tag{5.56} \]
and $i_v \omega = 0$ if ω is a 0-form (a number or a function).
Caution A Grassmann algebra (named after Hermann Graßmann, 1809–1877)
is an associative, skew-symmetric, graded algebra with an identity element.

5.3.2 The Exterior Derivative

For p-forms ω, we now define the exterior derivative as a map d,
\[ d : \bigwedge\nolimits^{p} \to \bigwedge\nolimits^{p+1} , \quad \omega \mapsto d\omega , \tag{5.57} \]
with the following three properties:

(i) d is an antiderivation of degree 1 on $\bigwedge$, i.e. it satisfies
\[ d(\omega \wedge \eta) = d\omega \wedge \eta + (-1)^p\, \omega \wedge d\eta \tag{5.58} \]
for $\omega \in \bigwedge^p$ and $\eta \in \bigwedge$.

(ii) d ◦ d = 0.

(iii) For every function $f \in \mathcal{F}$, d f is the differential of f , i.e. d f (v) =
v( f ) for v ∈ T M.

The exterior derivative is unique. By properties (i) and (ii), we directly
find
\[ d\omega = \sum_{i_1 < \ldots < i_p} d\omega_{i_1 \ldots i_p} \wedge dx^{i_1} \wedge \ldots \wedge dx^{i_p} \tag{5.59} \]
for any p-form
\[ \omega = \sum_{i_1 < \ldots < i_p} \omega_{i_1 \ldots i_p}\, dx^{i_1} \wedge \ldots \wedge dx^{i_p} . \tag{5.60} \]
According to (5.59), the components of the exterior derivative of a
p-form ω can be written as
\[ (d\omega)_{i_1 \ldots i_{p+1}} = (p + 1)\, \partial_{[i_1} \omega_{i_2 \ldots i_{p+1}]} . \tag{5.61} \]
Since $\omega_{i_2 \ldots i_{p+1}}$ is itself antisymmetric, this last expression can be brought
into the form
\[ (d\omega)_{i_1 \ldots i_{p+1}} = \sum_{k=1}^{p+1} (-1)^{k+1}\, \partial_{i_k} \omega_{i_1 \ldots \hat{i}_k \ldots i_{p+1}} , \tag{5.62} \]
with $1 \le i_1 < \ldots < i_p < i_{p+1} \le n$. Indices marked with a hat are left out.
The Lie derivative, the interior product and the exterior derivative are
related by Cartan’s equation
\[ \mathcal{L}_v = d \circ i_v + i_v \circ d . \tag{5.63} \]
? Verify the expressions (5.61) and (5.62).
Cartan’s equation implies the convenient formula for the exterior
derivative of a p-form ω
\[ d\omega(v_1, \ldots, v_{p+1}) = \sum_{i=1}^{p+1} (-1)^{i+1}\, v_i\, \omega(v_1, \ldots, \hat v_i, \ldots, v_{p+1}) + \sum_{i<j} (-1)^{i+j}\, \omega([v_i, v_j], v_1, \ldots, \hat v_i, \ldots, \hat v_j, \ldots, v_{p+1}) , \tag{5.64} \]
where the hat over a symbol means that this object is to be left out.

Example: Exterior derivative of a 1-form


For an example, let us apply these relations to a 1-form ω = ωi dxi . For
it, equation (5.59) implies

dω = dωi ∧ dxi = ∂ j ωi dx j ∧ dxi (5.65)

while (5.64) specialises to

dω(v1 , v2 ) = v1 ω(v2 ) − v2 ω(v1 ) − ω([v1 , v2 ]) . (5.66)

With (5.61) or (5.62), we find the components

dωi j = ∂i ω j − ∂ j ωi (5.67)

of the exterior derivative of the 1-form. 

In R3 , the expression (5.65) turns into

dω = (∂1 ω2 − ∂2 ω1 ) dx1 ∧ dx2 + (∂1 ω3 − ∂3 ω1 ) dx1 ∧ dx3


+ (∂2 ω3 − ∂3 ω2 ) dx2 ∧ dx3 . (5.68)

Closed and exact forms


A differential p-form α is called exact if a (p − 1)-form β exists such
that α = dβ. If dα = 0, the p-form α is called closed. Obviously, an
exact form is closed because of d ◦ d = 0.

5.4 Integration

5.4.1 The Volume Form and the Codifferential

An atlas of a differentiable manifold is called oriented if for every pair
of charts $h_1$ on $U_1 \subset M$ and $h_2$ on $U_2 \subset M$ with $U_1 \cap U_2 \ne \emptyset$, the Jacobi
determinant of the coordinate change $h_2 \circ h_1^{-1}$ is positive.

Volume form
An n-dimensional, paracompact manifold M is orientable if and only
if a $C^\infty$ n-form exists on M which vanishes nowhere. This is called a
volume form.
The canonical volume form on a pseudo-Riemannian manifold (M, g)
is defined by
\[ \eta \equiv \sqrt{|g|}\, dx^1 \wedge \ldots \wedge dx^n . \tag{5.69} \]
This definition is independent of the coordinate system because it
transforms proportional to the Jacobian determinant upon coordinate
changes.
Equation (5.69) implies that the components of the canonical volume
form in n dimensions are proportional to the n-dimensional Levi-Civita
symbol,
\[ \eta_{i_1 \ldots i_n} = \sqrt{|g|}\, \varepsilon_{i_1 \ldots i_n} , \tag{5.70} \]
which is defined such that it is +1 for even permutations of the $i_1, \ldots, i_n$,
−1 for odd permutations, and vanishes if any two of its indices are equal.
A very useful relation is
\[ \varepsilon_{j_1 \ldots j_q k_1 \ldots k_p}\, \varepsilon^{j_1 \ldots j_q i_1 \ldots i_p} = p!\, q!\; \delta^{[i_1}_{k_1} \delta^{i_2}_{k_2} \ldots \delta^{i_p]}_{k_p} , \tag{5.71} \]
where the square brackets again denote the complete antisymmetrisation.
In three dimensions, one specific example for (5.71) is the familiar
formula
\[ \varepsilon_{ijk}\, \varepsilon^{klm} = \varepsilon_{kij}\, \varepsilon^{klm} = \delta_i^l \delta_j^m - \delta_i^m \delta_j^l . \tag{5.72} \]
Note that p = 2 and q = 1 here, since one index is contracted and two
remain free, but the factor 2! = 2 is cancelled by the antisymmetrisation.
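The three-dimensional identity (5.72) involves only finitely many index combinations and can therefore be checked exhaustively. The sketch below does this for Euclidean space, where upper and lower indices need not be distinguished. That simplification and the function names are assumptions made for illustration.

```python
from itertools import product

def levi_civita(*indices):
    """Levi-Civita symbol: sign of the permutation, 0 if any index repeats."""
    if len(set(indices)) != len(indices):
        return 0
    inversions = sum(
        1
        for a in range(len(indices))
        for b in range(a + 1, len(indices))
        if indices[a] > indices[b]
    )
    return -1 if inversions % 2 else 1

def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0

def identity_holds():
    """Check eps_ijk eps_klm = delta_il delta_jm - delta_im delta_jl for all index values."""
    for i, j, l, m in product(range(3), repeat=4):
        lhs = sum(levi_civita(i, j, k) * levi_civita(k, l, m) for k in range(3))
        rhs = delta(i, l) * delta(j, m) - delta(i, m) * delta(j, l)
        if lhs != rhs:
            return False
    return True

print(identity_holds())
```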
Hodge star operator
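The Hodge star operator (∗-operation) turns a p-form ω into an (n − p)-form
(∗ω),
\[ * : \bigwedge\nolimits^{p} \to \bigwedge\nolimits^{n-p} , \quad \omega \mapsto *\omega , \tag{5.73} \]
which is uniquely defined by its application to the dual basis.

For the basis $\{dx^i\}$ of the dual space $T_p^* M$,
\[ *(dx^{i_1} \wedge \ldots \wedge dx^{i_p}) := \frac{\sqrt{|g|}}{(n - p)!}\, \varepsilon^{i_1 \ldots i_p}{}_{i_{p+1} \ldots i_n}\, dx^{i_{p+1}} \wedge \ldots \wedge dx^{i_n} . \tag{5.74} \]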
(n − p)!
If the dual basis $\{e^i\}$ is orthonormal, this simplifies to
\[ *(e^{i_1} \wedge \ldots \wedge e^{i_p}) = e^{i_{p+1}} \wedge \ldots \wedge e^{i_n} . \tag{5.75} \]
In components, we can write
\[ (*\omega)_{i_{p+1} \ldots i_n} = \frac{1}{p!}\, \eta_{i_1 \ldots i_n}\, \omega^{i_1 \ldots i_p} , \tag{5.76} \]
i.e. (∗ω) is the volume form η contracted with the p-form ω. A
straightforward calculation shows that
\[ *(*\omega) = \mathrm{sgn}(g)\, (-1)^{p(n-p)}\, \omega . \tag{5.77} \]
? Verify the statement (5.77).
Example: Hodge dual in three dimensions
For a 1-form ω = ωi dxi in R3 , we can use

∗dx1 = dx2 ∧ dx3 , ∗dx2 = dx3 ∧ dx1 , ∗dx3 = dx1 ∧ dx2 (5.78)

to find the Hodge-dual 2-form

∗ω = ω1 dx2 ∧ dx3 − ω2 dx1 ∧ dx3 + ω3 dx1 ∧ dx2 , (5.79)

while the 2-form dω (5.68) has the Hodge dual 1-form

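\[ *d\omega = (\partial_2 \omega_3 - \partial_3 \omega_2)\, dx^1 - (\partial_1 \omega_3 - \partial_3 \omega_1)\, dx^2 + (\partial_1 \omega_2 - \partial_2 \omega_1)\, dx^3 = \varepsilon_i{}^{jk}\, \partial_j \omega_k\, dx^i . \tag{5.80} \]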
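The last expression says that ∗dω reproduces the curl of the vector field with components ω_i. The following sketch checks this for the illustrative 1-form ω = −y dx + x dy, whose curl is (0, 0, 2). It uses the components (5.67) of dω, the contraction (5.76) with the Euclidean volume form, and central finite differences; the sample form, the evaluation point and the function names are assumptions.

```python
def omega(x):
    """Illustrative 1-form omega = -y dx + x dy on R^3, as a list of components."""
    return [-x[1], x[0], 0.0]

def partial(f, x, k, h=1e-5):
    """Central-difference derivative of the scalar function f along coordinate k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    return (f(xp) - f(xm)) / (2 * h)

def exterior_derivative(form, x):
    """Components (d omega)_ij = d_i omega_j - d_j omega_i, cf. (5.67)."""
    n = len(x)
    return [
        [
            partial(lambda y: form(y)[j], x, i) - partial(lambda y: form(y)[i], x, j)
            for j in range(n)
        ]
        for i in range(n)
    ]

def levi_civita3(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def hodge_dual_of_2form(alpha):
    """(*alpha)_k = (1/2) eps_ijk alpha_ij in Euclidean R^3, cf. (5.76) with p = 2."""
    return [
        0.5 * sum(levi_civita3(i, j, k) * alpha[i][j] for i in range(3) for j in range(3))
        for k in range(3)
    ]

point = [0.3, -1.2, 0.5]
curl = hodge_dual_of_2form(exterior_derivative(omega, point))
print(curl)
```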

Codifferential
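The codifferential is a map
\[ \delta : \bigwedge\nolimits^{p} \to \bigwedge\nolimits^{p-1} , \quad \omega \mapsto \delta\omega \tag{5.81} \]
defined by
\[ \delta\omega \equiv \mathrm{sgn}(g)\, (-1)^{n(p+1)}\, (*d*)\, \omega . \tag{5.82} \]
d ◦ d = 0 immediately implies δ ◦ δ = 0.
By successive application of (5.71) and (5.62), it can be shown that the
coordinate expression for the codifferential is
\[ (\delta\omega)^{i_1 \ldots i_{p-1}} = \frac{1}{\sqrt{|g|}}\, \partial_k \left( \sqrt{|g|}\, \omega^{k i_1 \ldots i_{p-1}} \right) . \tag{5.83} \]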

Comparing this with (4.59), we see that this generalises the divergence
of ω. To see this more explicitly, let us work out the codifferential of a
1-form in R3 by first taking the exterior derivative of ∗ω from (5.79),
d∗ω = (∂1 ω1 + ∂2 ω2 + ∂3 ω3 ) dx1 ∧ dx2 ∧ dx3 , (5.84)

whose Hodge dual is

δω = ∂1 ω1 + ∂2 ω2 + ∂3 ω3 . (5.85)

Example: Maxwell’s equations


The Faraday 2-form is defined by
\[ F \equiv \frac{1}{2}\, F_{\mu\nu}\, dx^\mu \wedge dx^\nu . \tag{5.86} \]
Application of (5.62) shows that

(dF)λμν = ∂λ Fμν − ∂μ Fλν + ∂ν Fλμ


= ∂λ Fμν + ∂μ Fνλ + ∂ν Fλμ = 0 , (5.87)

i.e. the homogeneous Maxwell equations can simply be expressed by

dF = 0 . (5.88)

Similarly, the components of the codifferential of the Faraday form are,
according to (5.83) and (4.62),
\[ (\delta F)^\mu = \frac{1}{\sqrt{-g}}\, \partial_\nu \left( \sqrt{-g}\, F^{\nu\mu} \right) = \nabla_\nu F^{\nu\mu} = -\frac{4\pi}{c}\, j^\mu . \tag{5.89} \]
Introducing further the current 1-form by $j = j_\mu\, dx^\mu$, we can thus write
the inhomogeneous Maxwell equations as
\[ \delta F = -\frac{4\pi}{c}\, j . \tag{5.90} \]


5.4.2 Integrals and Integral Theorems

The integral over an n-form ω,
\[ \int_M \omega , \tag{5.91} \]
is defined in the following way: Suppose first that the support U ⊂ M
of ω is contained in a single chart which defines positive coordinates
$(x^1, \ldots, x^n)$ on U. Then, if $\omega = f\, dx^1 \wedge \ldots \wedge dx^n$ with a function
$f \in \mathcal{F}(U)$,
\[ \int_M \omega = \int_U f(x^1, \ldots, x^n)\, dx^1 \ldots dx^n . \tag{5.92} \]

Note that this definition is independent of the coordinate system because
upon changes of the coordinate system, both f and the volume
element $dx^1 \ldots dx^n$ change in proportion to the Jacobian determinant of
the coordinate change.
If the domain of the n-form ω is contained in multiple charts, the integral
(5.92) needs to be defined piece-wise, but the principle remains the same.
The integration of functions $f \in \mathcal{F}(M)$ is achieved using the canonical
volume form η,
\[ \int_M f \equiv \int_M f\, \eta . \tag{5.93} \]

Integral theorems
Stokes’ theorem can now be formulated as follows: let M be an
n-dimensional manifold and the region D ⊂ M have a smooth boundary
∂D such that D̄ ≡ D ∪ ∂D is compact. Then, for every (n − 1)-form ω,
we have
\[ \int_D d\omega = \int_{\partial D} \omega . \tag{5.94} \]
Likewise, Gauss’ theorem can be brought into the form
\[ \int_D \delta x^\flat\, \eta = \int_{\partial D} * x^\flat , \tag{5.95} \]
where x ∈ T M is a vector field on M and $x^\flat$ is the 1-form belonging to
this vector field.
Caution Like ♭ lowers a note by a semitone in music, the ♭ operator lowers
the index of vector components and thus turns them into dual-vector
components. Analogously, ♯ raises notes by semitones in music, and
indices of dual-vector components.

Musical operators
Generally, the musical operators ♭ and ♯ are isomorphisms between the
tangent spaces of a manifold and their dual spaces given by the metric,
\[ \flat : TM \to T^*M , \quad v \mapsto v^\flat , \quad (v^\flat)_i = g_{ij} v^j \tag{5.96} \]
and similarly by the inverse of the metric,
\[ \sharp : T^*M \to TM , \quad w \mapsto w^\sharp , \quad (w^\sharp)^i = g^{ij} w_j . \tag{5.97} \]
The essence of the differential-geometric concepts introduced here is
summarised in Appendix B.
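In components, the musical operators are plain matrix-vector products with the metric and its inverse. The sketch below illustrates (5.96) and (5.97) for the Minkowski metric with signature (−, +, +, +); the function names flat and sharp and the sample vector are assumptions.

```python
ETA = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
ETA_INV = ETA  # the Minkowski metric is its own inverse

def flat(v, g=ETA):
    """Lower the index: (v^flat)_i = g_ij v^j, cf. (5.96)."""
    return [sum(g[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]

def sharp(w, g_inv=ETA_INV):
    """Raise the index: (w^sharp)^i = g^ij w_j, cf. (5.97)."""
    return [sum(g_inv[i][j] * w[j] for j in range(len(w))) for i in range(len(w))]

v = [2.0, 1.0, 0.0, -3.0]
v_flat = flat(v)
print(v_flat, sharp(v_flat))
```

Lowering the index flips the sign of the time component only, and raising it again recovers the original vector, which reflects that ♭ and ♯ are inverse to each other.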
Chapter 6

Einstein’s Field Equations

6.1 The physical meaning of curvature

6.1.1 Congruences of time-like geodesics

Having walked through the introductory chapters, we are now ready
to introduce Einstein’s field equations, i.e. the equations describing the
dynamics of the gravitational field. Einstein searched for these equations
essentially between early August 1912, when he moved back from
Prague to Zurich, and November 25, 1915, when he published them in
their final form, meanwhile in Berlin. We shall give a heuristic argument
for the form of the field equations, which should not be mistaken for a
derivation, and later show that these equations follow from a suitable
Lagrangian.
First, however, we shall investigate the physical role of the curvature
tensor. As we have seen, gravitational fields can locally be transformed
away by choosing normal coordinates, in which the Christoffel symbols
(the connection coefficients) all vanish. By its nature, this does not
hold for the curvature tensor which, as we shall see, is related to the
gravitational tidal field. Thus, in this sense, the gravitational tidal field
has a more profound physical significance than the gravitational field itself.
Let us begin with a congruence of geodesics. This is a bundle of time-like
geodesics imagined to run through every point of a small environment
U ⊂ M of a point p ∈ U.
Let the geodesics be parameterised by the proper time τ along them,
and introduce a curve γ transversal to the congruence, parameterised
by a curve parameter λ. Transversal means that the curve γ is nowhere
parallel to the congruence.
Caution Note that the proper time cannot be used for parameterising
light rays. In Chapter 13, an affine parameter will be introduced instead.


When normalised, the tangent vector to one of the time-like geodesics
can be written as
\[ u = \partial_\tau \quad\text{with}\quad \langle u, u \rangle = -1 . \tag{6.1} \]
Since it is tangent to a geodesic, it is parallel-transported along the
geodesic,
\[ \nabla_u u = 0 . \tag{6.2} \]

Similarly, we introduce a unit tangent vector v along the curve γ,

v = γ̇ = ∂λ . (6.3)

Since the partial derivatives with respect to the curve parameters τ and
λ commute, so do the vectors u and v, and thus v is Lie-transported (or
Lie-invariant) along u,

0 = [u, v] = Lu v . (6.4)

Figure 6.1 Geodesic bundle with tangent vector u of the fiducial geodesic,
the curve γ towards a neighbouring geodesic, and tangent vector v = γ̇.

Now, we project v on u and define a vector n which is perpendicular to u,
\[ n = v + \langle v, u \rangle\, u , \tag{6.5} \]
which does indeed satisfy $\langle n, u \rangle = 0$ because of $\langle u, u \rangle = -1$. This vector
is also Lie-transported along u, as we shall verify now.
First, we have
\[ \mathcal{L}_u n = [u, n] = [u, v] + [u, \langle v, u \rangle u] = u(\langle v, u \rangle)\, u = (\partial_\tau \langle v, u \rangle)\, u , \tag{6.6} \]
where (6.4) was used in the first step. Since $\langle u, u \rangle = -1$, we have
\[ 0 = \partial_\lambda \langle u, u \rangle = v \langle u, u \rangle = 2 \langle \nabla_v u, u \rangle , \tag{6.7} \]
if we use the Ricci identity (3.69).
? Go through all steps leading from (6.6) to (6.9) and convince yourself
of them.
But the vanishing commutator between u and v and the symmetry of the
connection imply $\nabla_v u = \nabla_u v$, and thus
\[ \partial_\tau \langle u, v \rangle = u \langle u, v \rangle = \langle \nabla_u u, v \rangle + \langle u, \nabla_u v \rangle = \langle u, \nabla_v u \rangle = 0 , \tag{6.8} \]
where the Ricci identity was used in the second step, the geodesic
property (6.2) in the third, and (6.7) in the last. Returning to (6.6), this proves
that n is Lie-transported,
\[ \mathcal{L}_u n = 0 . \tag{6.9} \]
The perpendicular separation vector between neighbouring geodesics of
the congruence is thus Lie-invariant along the congruence.

6.1.2 The curvature tensor and the tidal field

Now, we take the second derivative of v along u,

$$ \nabla_u^2 v = \nabla_u \nabla_u v = \nabla_u \nabla_v u = (\nabla_u \nabla_v - \nabla_v \nabla_u)\, u , \tag{6.10} $$

where we have used again that u and v commute and that u is tangent to a geodesic. With [u, v] = 0, the curvature (3.51) applied to u and v reads

$$ \bar R(u, v)\, u = (\nabla_u \nabla_v - \nabla_v \nabla_u)\, u . \tag{6.11} $$

Jacobi equation
In this way, we see that the second derivative of v along u is determined by the curvature tensor through the Jacobi equation

$$ \nabla_u^2 v = \bar R(u, v)\, u . \tag{6.12} $$


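Let us now use this result to find a similar equation for n. First, we observe that

$$ \nabla_u n = \nabla_u v + \nabla_u (\langle v, u\rangle u) = \nabla_u v + (\partial_\tau \langle v, u\rangle)\, u = \nabla_u v \tag{6.13} $$

because of (6.2) and (6.8). Thus $\nabla_u^2 n = \nabla_u^2 v$ and

$$ \nabla_u^2 n = \bar R(u, v)\, u . \tag{6.14} $$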



We then use

$$ \bar R(u, n) = \bar R(u, v + \langle u, v\rangle u) = \bar R(u, v) + \langle u, v\rangle \bar R(u, u) = \bar R(u, v) \tag{6.15} $$

to arrive at the desired result:

Equation of geodesic deviation
The separation vector n between neighbouring geodesics obeys the equation

$$ \nabla_u^2 n = \bar R(u, n)\, u . \tag{6.16} $$

This is called the equation of geodesic deviation because it describes directly how the separation between neighbouring geodesics evolves along the geodesics according to the curvature.
Finally, let us introduce a coordinate basis {e_i} in the subspace perpendicular to u which is parallel-transported along u. Since n is confined to this subspace, we can write

$$ n = n^i e_i \tag{6.17} $$

and thus

$$ \nabla_u n = (u n^i)\, e_i + n^i\, \nabla_u e_i = \frac{\mathrm{d}n^i}{\mathrm{d}\tau}\, e_i . \tag{6.18} $$

Since u is normalised and perpendicular to the space spanned by the triad {e_i}, we can form a tetrad from the e_i and e_0 = u. The equation of geodesic deviation (6.16) then implies

$$ \frac{\mathrm{d}^2 n^i}{\mathrm{d}\tau^2}\, e_i = \bar R(e_0, n^j e_j)\, e_0 = n^j \bar R(e_0, e_j)\, e_0 = n^j \bar R^i{}_{00j}\, e_i . \tag{6.19} $$

Thus, defining a matrix K by

$$ \frac{\mathrm{d}^2 n^i}{\mathrm{d}\tau^2} = \bar R^i{}_{00j}\, n^j \equiv K^i{}_j\, n^j , \tag{6.20} $$

we can write (6.19) in matrix form

$$ \frac{\mathrm{d}^2 n}{\mathrm{d}\tau^2} = K n . \tag{6.21} $$
Note that K is symmetric because of the symmetries (3.81) of the curvature tensor.
Moreover, the trace of K is

$$ \operatorname{Tr} K = \bar R^i{}_{00i} = \bar R^\mu{}_{00\mu} = -R_{00} = -R_{\mu\nu} u^\mu u^\nu , \tag{6.22} $$

where we have inserted $\bar R^0{}_{000} = 0$ and the definition of the Ricci tensor (3.57).

Let us now compare this result to the motion of test bodies in Newtonian theory. At two neighbouring points x and x + n, we have the equations of motion

$$ \ddot x^i = -(\partial^i \Phi)\big|_{x} \tag{6.23} $$

and, to first order in a Taylor expansion,

$$ \ddot x^i + \ddot n^i = -(\partial^i \Phi)\big|_{x+n} \approx -(\partial^i \Phi)\big|_{x} - (\partial^i \partial_j \Phi)\big|_{x}\, n^j . \tag{6.24} $$

Subtracting (6.23) from (6.24) yields the evolution equation for the separation vector.

Relative acceleration in Newtonian gravity
In Newtonian gravity, the separation vector between any two particle trajectories changes due to the tidal field according to

$$ \ddot n^i = -(\partial^i \partial_j \Phi)\, n^j . \tag{6.25} $$

This equation can now be compared to the result (6.21).


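Taking into account that

$$ \frac{\mathrm{d}^2 n^i}{\mathrm{d}\tau^2} = \frac{\ddot n^i}{c^2} = -\frac{\partial^i \partial_j \Phi}{c^2}\, n^j , \tag{6.26} $$

we see that the matrix K in Newton’s theory is

$$ K^{(N)\,i}{}_j = -\frac{\partial^i \partial_j \Phi}{c^2} , \tag{6.27} $$

and its trace is

$$ \operatorname{Tr} K^{(N)} = -\frac{\vec\nabla^2 \Phi}{c^2} = -\frac{\Delta\Phi}{c^2} , \tag{6.28} $$

i.e. the negative Laplacian of the Newtonian potential, scaled by the squared light speed.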
Tidal field and curvature
The essential results of this discussion are the correspondences

$$ \bar R^i{}_{0j0} \leftrightarrow \frac{\partial^i \partial_j \Phi}{c^2} \tag{6.29} $$

and

$$ R_{\mu\nu} u^\mu u^\nu \leftrightarrow \frac{\vec\nabla^2 \Phi}{c^2} . \tag{6.30} $$

These confirm the assertion that the curvature represents the gravitational tidal field, describing the relative accelerations of freely-falling test bodies; (6.29) and (6.30) will provide useful guidance in guessing the field equations.
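The Newtonian tidal matrix (6.27) can be made concrete with a small numerical sketch. The block below evaluates it for a point mass by central finite differences of the potential, and checks that its trace vanishes outside the source, as (6.28) requires where ΔΦ = 0. The function names, the step size and the choice of Earth-like numbers are illustrative assumptions, not part of the notes.

```python
import math

G_NEWTON = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C_LIGHT = 299_792_458.0   # speed of light, m/s


def potential(pos, mass):
    """Newtonian potential Phi = -G M / r of a point mass at the origin."""
    r = math.sqrt(sum(x * x for x in pos))
    return -G_NEWTON * mass / r


def tidal_matrix(pos, mass, step=1.0e3):
    """Return K^(N)_ij = -(d_i d_j Phi) / c^2 via central finite differences."""
    def phi_shifted(i, di, j, dj):
        p = list(pos)
        p[i] += di
        p[j] += dj
        return potential(p, mass)

    k = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            hessian_ij = (
                phi_shifted(i, step, j, step) - phi_shifted(i, step, j, -step)
                - phi_shifted(i, -step, j, step) + phi_shifted(i, -step, j, -step)
            ) / (4.0 * step**2)
            k[i][j] = -hessian_ij / C_LIGHT**2
    return k


# Illustrative numbers: Earth's mass, a point 7000 km from its centre on the x axis.
M_EARTH = 5.972e24
K = tidal_matrix((7.0e6, 0.0, 0.0), M_EARTH)
trace_K = K[0][0] + K[1][1] + K[2][2]
print("K_xx =", K[0][0], " K_yy =", K[1][1], " Tr K =", trace_K)
```

In this sketch the radial component K_xx comes out positive (neighbouring trajectories are stretched apart along the radial direction), the two transverse components are negative (squeezing), and the three add up to zero within the accuracy of the finite differences.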

6.2 Einstein’s field equations

6.2.1 Heuristic “derivation”

We start from the field equation from Newtonian gravity, i.e. the Poisson equation

$$ 4\pi G\rho = \vec\nabla^2 \Phi = -c^2\, \operatorname{Tr} K^{(N)} . \tag{6.31} $$

The density ρ can be expressed by the energy-momentum tensor T. For an ideal fluid, we have

$$ T = (\rho c^2 + p)\, u \otimes u + p\, g , \tag{6.32} $$

from which we find because of ⟨u, u⟩ = −1

$$ T(u, u) = \rho c^2 . \tag{6.33} $$

Moreover, its trace is

$$ \operatorname{Tr} T = -\rho c^2 + 3p \approx -\rho c^2 \tag{6.34} $$


because p ≪ ρc² under Newtonian conditions (the pressure is much less than the energy density).

? Write equations (6.32) and (6.33) in components. At what level are the indices?

Hence, let us take a constant λ ∈ ℝ, put

$$ \rho c^2 = \lambda\, T(u, u) + (1 - \lambda)\, \operatorname{Tr} T\; g(u, u) \tag{6.35} $$

and insert this into the field equation (6.31), using (6.22) for the trace of K. We thus obtain

$$ R(u, u) = \frac{4\pi G}{c^4} \left[ \lambda T + (1 - \lambda)\, \operatorname{Tr} T\; g \right](u, u) . \tag{6.36} $$

Since this equation should hold for any observer and thus for arbitrary four-velocities u, we find

$$ R = \frac{4\pi G}{c^4} \left[ \lambda T + (1 - \lambda)\, \operatorname{Tr} T\; g \right] , \tag{6.37} $$

where λ ∈ ℝ remains to be determined.
We take the trace of (6.37), obtain the Ricci scalar

$$ \operatorname{Tr} R = \mathcal{R} = \frac{4\pi G}{c^4} \left[ \lambda + 4(1 - \lambda) \right] \operatorname{Tr} T \tag{6.38} $$

and combine this with (6.37) to assemble the Einstein tensor (3.90),

$$ G = R - \frac{\mathcal{R}}{2}\, g = \frac{4\pi G}{c^4} \left( \lambda T - \frac{2 - \lambda}{2}\, \operatorname{Tr} T\; g \right) . \tag{6.39} $$

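We have seen in (3.91) that the Einstein tensor G satisfies the contracted Bianchi identity

$$ \nabla \cdot G = 0 . \tag{6.40} $$

Likewise, the divergence of the energy-momentum tensor must vanish in order to guarantee local energy-momentum conservation,

$$ \nabla \cdot T = 0 . \tag{6.41} $$

These two conditions are generally compatible with (6.39) only if we choose λ = 2: with (6.41), the divergence of the right-hand side of (6.39) reduces to a term proportional to (2 − λ) times the gradient of Tr T, which does not vanish in general. This choice specifies the field equations.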
Einstein’s field equations
Einstein’s field equations, published on November 25th, 1915, relate the Einstein tensor G to the energy-momentum tensor T as

$$ G = \frac{8\pi G}{c^4}\, T . \tag{6.42} $$

An equivalent form follows from (6.37),

$$ R = \frac{8\pi G}{c^4} \left( T - \frac{1}{2}\, \operatorname{Tr} T\; g \right) . \tag{6.43} $$

? Convince yourself that equations (6.42) and (6.43) are equivalent.
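The equivalence of (6.42) and (6.43) is a purely algebraic statement and can be checked numerically. The following sketch builds the right-hand side of (6.43) from an arbitrary symmetric tensor T, forms the Einstein tensor from it, and compares the result with κT. It assumes a local inertial frame in which the metric takes Minkowski form, and it sets κ = 1 as a stand-in for 8πG/c⁴; the helper names and the sample components of T are made up for illustration.

```python
# Minkowski metric with signature (-,+,+,+); it equals its own inverse.
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]


def trace(tensor):
    """Tr A = eta^{mu nu} A_{mu nu} for a rank-(0,2) tensor."""
    return sum(ETA[m][n] * tensor[m][n] for m in range(4) for n in range(4))


def ricci_from_T(T, kappa):
    """Right-hand side of (6.43): kappa * (T - Tr T * g / 2)."""
    tr = trace(T)
    return [[kappa * (T[m][n] - 0.5 * tr * ETA[m][n]) for n in range(4)]
            for m in range(4)]


def einstein_from_ricci(R):
    """Einstein tensor G = R - (Tr R / 2) * g."""
    tr = trace(R)
    return [[R[m][n] - 0.5 * tr * ETA[m][n] for n in range(4)]
            for m in range(4)]


KAPPA = 1.0  # stands in for 8 pi G / c^4; only the algebra is checked here
T = [[2.0, 0.3, 0.0, -0.1],
     [0.3, 0.5, 0.2, 0.0],
     [0.0, 0.2, 0.7, 0.4],
     [-0.1, 0.0, 0.4, 0.1]]

R = ricci_from_T(T, KAPPA)
G_tensor = einstein_from_ricci(R)
max_dev = max(abs(G_tensor[m][n] - KAPPA * T[m][n])
              for m in range(4) for n in range(4))
print("largest deviation of G from kappa*T:", max_dev)
```

The check works because the trace of (6.43) gives Tr R = −κ Tr T in four dimensions, so subtracting half the trace times the metric restores κT.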

6.2.2 Uniqueness

In the appropriate limit, Einstein’s equations reproduce Newton’s theory by construction and are thus one possible set of gravitational field equations.
A remarkable theorem due to David Lovelock (1938–) states that they are
the only possible field equations under certain very general conditions.
It is reasonable to assume that the gravitational field equations can be
written in the form

D[g] = T , (6.44)

where the tensor D[g] is a functional of the metric tensor g and T is


the energy-momentum tensor. This equation says that the source of the
gravitational field is assumed to be expressed by the energy-momentum
tensor of all matter and energy contained in spacetime. Now, Lovelock’s
theorem states:

Lovelock’s theorem
If D[g] depends on g and its derivatives only up to second order, then it must be a linear combination of the Einstein and metric tensors,

$$ D[g] = \alpha G + \beta g , \tag{6.45} $$

with α, β ∈ ℝ.

This absolutely remarkable theorem says that G must be of the form

$$ G = \kappa T - \Lambda g , \tag{6.46} $$

where κ and Λ are constants. The correct Newtonian limit then requires that κ = 8πGc⁻⁴, and Λ is the “cosmological constant” introduced by Einstein for reasons which will become clear later.

? Express the coefficients α and β in terms of κ and Λ.

6.3 Lagrangian formulation

6.3.1 The action of general relativity

The remarkable uniqueness of the tensor D shown by Lovelock’s theo-


rem lets us suspect that a Lagrangian formulation of general relativity
should be possible starting from a scalar constructed from D, most nat-
urally its contraction Tr D, which is simply proportional to the Ricci
scalar R if we ignore the cosmological term proportional to Λ for now.
Writing down the action, we have to take into account that we require an invariant volume element, which we obtain from the canonical volume form η introduced in (5.69). Then, according to (5.92) and (5.93), we can represent volume integrals as

$$ \int_M \eta = \int_U \sqrt{-g}\; \mathrm{d}^4 x , \tag{6.47} $$

where $\sqrt{-g}$ is the square root of the negative determinant of g, and U ⊂ M admits a single chart. Recall that, if we need to integrate over a domain covered by multiple charts, a sum over the domains of the individual charts is understood.
Thus, we suppose that the action of general relativity in a compact region D ⊂ M with smooth boundary ∂D is

$$ S_\mathrm{GR}[g] = \int_D R[g]\, \eta = \int_D R[g]\, \sqrt{-g}\; \mathrm{d}^4 x . \tag{6.48} $$

6.3.2 Variation of the action

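Working out the variation of this action with respect to the metric components $g_{\mu\nu}$, we write explicitly

$$ R = g^{\mu\nu} R_{\mu\nu} \tag{6.49} $$

and thus

$$ \delta S_\mathrm{GR} = \delta \int_D g^{\mu\nu} R_{\mu\nu} \sqrt{-g}\; \mathrm{d}^4 x = \int_D \delta R_{\mu\nu}\, g^{\mu\nu} \sqrt{-g}\; \mathrm{d}^4 x + \int_D R_{\mu\nu}\, \delta\!\left( g^{\mu\nu} \sqrt{-g} \right) \mathrm{d}^4 x . \tag{6.50} $$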

We evaluate the variation of the Ricci tensor first, using its expression (3.57) in terms of the Christoffel symbols. Matters simplify considerably if we introduce normal coordinates, which allow us to ignore the terms in (3.57) which are quadratic in the Christoffel symbols. Then, the Ricci tensor specialises to

$$ R_{\mu\nu} = \partial_\alpha \Gamma^\alpha{}_{\mu\nu} - \partial_\nu \Gamma^\alpha{}_{\mu\alpha} , \tag{6.51} $$

and its variation is

$$ \delta R_{\mu\nu} = \partial_\alpha (\delta\Gamma^\alpha{}_{\mu\nu}) - \partial_\nu (\delta\Gamma^\alpha{}_{\mu\alpha}) . \tag{6.52} $$

Although the Christoffel symbols do not transform as tensors, their variation does, as the transformation law (3.6) shows. Thus, we can locally replace the partial by the covariant derivatives and write

$$ \delta R_{\mu\nu} = \nabla_\alpha (\delta\Gamma^\alpha{}_{\mu\nu}) - \nabla_\nu (\delta\Gamma^\alpha{}_{\mu\alpha}) , \tag{6.53} $$

which is a tensor identity, called the Palatini identity, and thus holds in all coordinate systems everywhere. It implies

$$ g^{\mu\nu} \delta R_{\mu\nu} = \nabla_\alpha \left( g^{\mu\nu} \delta\Gamma^\alpha{}_{\mu\nu} - g^{\mu\alpha} \delta\Gamma^\nu{}_{\mu\nu} \right) , \tag{6.54} $$

where the indices α and ν were swapped in the last term. Thus, the variation of the Ricci tensor, contracted with the metric, can be expressed by the divergence of a vector W,

$$ g^{\mu\nu} \delta R_{\mu\nu} = \nabla_\alpha W^\alpha , \tag{6.55} $$

whose components $W^\alpha$ are defined by the term in parentheses on the right-hand side of (6.54).
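From Cramer’s rule in the form (4.55), we see that

$$ \delta g = \frac{\partial g}{\partial g_{\mu\nu}}\, \delta g_{\mu\nu} = g\, g^{\mu\nu}\, \delta g_{\mu\nu} . \tag{6.56} $$

Moreover, since $g^{\mu\nu} g_{\mu\nu} = \text{const.} = 4$, we conclude

$$ g^{\mu\nu}\, \delta g_{\mu\nu} = -g_{\mu\nu}\, \delta g^{\mu\nu} . \tag{6.57} $$

Using these expressions, we obtain for the variation of $\sqrt{-g}$

$$ \delta \sqrt{-g} = -\frac{\delta g}{2\sqrt{-g}} = \frac{g\, g_{\mu\nu}\, \delta g^{\mu\nu}}{2\sqrt{-g}} = -\frac{\sqrt{-g}}{2}\, g_{\mu\nu}\, \delta g^{\mu\nu} , \tag{6.58} $$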

or, in terms of the canonical volume form η,

$$ \delta\eta = -\frac{1}{2}\, g_{\mu\nu}\, \delta g^{\mu\nu}\, \eta = \frac{1}{2}\, g^{\mu\nu}\, \delta g_{\mu\nu}\, \eta . \tag{6.59} $$
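The variation of the volume element can be checked numerically without any tensor calculus. The sketch below perturbs a sample metric by a small symmetric δg_{μν}, evaluates the change of √−g directly from the determinants, and compares it with the first-order prediction ½√−g g^{μν}δg_{μν} that follows from (6.56) to (6.59). The sample metric, the perturbation pattern and the helper names are illustrative assumptions.

```python
import math


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for col in range(len(m)):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * m[0][col] * det(minor)
    return total


def inverse(m):
    """Matrix inverse from cofactors (adequate for a 4x4 example)."""
    n = len(m)
    d = det(m)
    inv = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
            inv[j][i] = (-1) ** (i + j) * det(minor) / d
    return inv


# Sample metric components g_{mu nu} with Lorentzian signature (made up).
g = [[-1.2, 0.1, 0.0, 0.0],
     [0.1, 0.9, 0.05, 0.0],
     [0.0, 0.05, 1.1, 0.0],
     [0.0, 0.0, 0.0, 1.0]]

# Small symmetric perturbation delta g_{mu nu}.
eps = 1.0e-6
pattern = [[1.0, 0.5, -0.3, 0.2],
           [0.5, -0.7, 0.1, 0.0],
           [-0.3, 0.1, 0.4, 0.6],
           [0.2, 0.0, 0.6, -0.2]]
dg = [[eps * pattern[m][n] for n in range(4)] for m in range(4)]
g_new = [[g[m][n] + dg[m][n] for n in range(4)] for m in range(4)]

# Direct change of sqrt(-g) versus the first-order formula.
direct = math.sqrt(-det(g_new)) - math.sqrt(-det(g))
g_inv = inverse(g)
first_order = 0.5 * math.sqrt(-det(g)) * sum(
    g_inv[m][n] * dg[m][n] for m in range(4) for n in range(4))
print("direct:", direct, " first order:", first_order)
```

The two numbers agree up to terms of second order in the perturbation, which is the content of Cramer's rule as used in (6.56).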
2 2

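Now, we put (6.58) and (6.55) back into (6.50) and obtain

$$ \delta S_\mathrm{GR} = \int_D \nabla_\alpha W^\alpha\, \eta + \int_D \left( R_{\mu\nu} - \frac{R}{2}\, g_{\mu\nu} \right) \delta g^{\mu\nu}\, \eta = \int_D G_{\mu\nu}\, \delta g^{\mu\nu}\, \eta + \int_D \nabla_\alpha W^\alpha\, \eta \overset{!}{=} 0 . \tag{6.60} $$

Varying $g^{\mu\nu}$ only in the interior of D, the divergence term vanishes by Gauß’ theorem, and admitting arbitrary variations $\delta g^{\mu\nu}$ implies

$$ G_{\mu\nu} = 0 . \tag{6.61} $$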

Vacuum field equations from a variational principle
Including the cosmological constant and using (6.58) once more, we see that Einstein’s vacuum equations, G + Λg = 0, follow from the variational principle

$$ \delta \int_D (R - 2\Lambda)\, \eta = 0 . \tag{6.62} $$

Caution: Notice that we have ignored a possible boundary term here which needs to be taken into account if the manifold has a boundary. It has become known as the Gibbons-Hawking-York boundary term, which plays a central role e.g. in calculations of black-hole entropy.

The complete Einstein equations including the energy-momentum tensor cannot yet be obtained here because no matter or energy contribution to the Lagrange density has been included in the action yet.

6.4 The energy-momentum tensor

6.4.1 Matter fields in the action

In order to include matter (where “matter” summarises all kinds of matter and non-gravitational energy) into the field equations, we assume that the matter fields ψ are described by a Lagrangian L depending on ψ, its gradient ∇ψ and the metric g,

$$ L(\psi, \nabla\psi, g) , \tag{6.63} $$

where ψ may be a scalar or tensor field.

The field equations are determined by the variational principle

$$ \delta \int_D L\, \eta = 0 , \tag{6.64} $$

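where the Lagrangian is varied with respect to the fields ψ and their derivatives ∇ψ. Thus,

$$ \delta \int_D L\, \eta = \int_D \left( \frac{\partial L}{\partial\psi}\, \delta\psi + \frac{\partial L}{\partial\nabla\psi}\, \delta\nabla\psi \right) \eta = 0 . \tag{6.65} $$

As usual, we can express the second term by the difference

$$ \frac{\partial L}{\partial\nabla\psi}\, \delta\nabla\psi = \nabla \cdot \left( \frac{\partial L}{\partial\nabla\psi}\, \delta\psi \right) - \left( \nabla \cdot \frac{\partial L}{\partial\nabla\psi} \right) \delta\psi , \tag{6.66} $$

of which the first term is a divergence which vanishes according to Gauß’ theorem upon volume integration. Combining (6.66) with (6.65), and allowing arbitrary variations δψ of the matter fields, then yields the Euler-Lagrange equations for the matter fields,

$$ \frac{\partial L}{\partial\psi} - \nabla \cdot \frac{\partial L}{\partial\nabla\psi} = 0 . \tag{6.67} $$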

Example: Lagrangian of a scalar field

To give an example, suppose we describe a neutral scalar field ψ with the Lagrangian

$$ L = -\frac{1}{2} \langle \nabla\psi, \nabla\psi \rangle - \frac{m^2}{2}\, \psi^2 = -\frac{1}{2}\, \nabla_\mu \psi\, \nabla^\mu \psi - \frac{m^2}{2}\, \psi^2 , \tag{6.68} $$

where m²ψ²/2 is a mass term with constant parameter m. The Euler-Lagrange equations then imply the field equations

$$ \left( -\Box + m^2 \right) \psi = \left( -\nabla_\mu \nabla^\mu + m^2 \right) \psi = 0 , \tag{6.69} $$

which can be interpreted as the Klein-Gordon equation for a particle with mass m.

Similarly, we can vary the action with respect to the metric, which requires care because the Lagrangian may depend on the metric explicitly and implicitly through the covariant derivatives ∇ψ of the fields, and the canonical volume form η depends on the metric as well because of (6.59). Thus,

$$ \delta \int_D L\, \eta = \int_D (\delta L)\, \eta + \int_D L\, \delta\eta = \int_D \left( \delta L - \frac{1}{2}\, g_{\mu\nu}\, L\, \delta g^{\mu\nu} \right) \eta . \tag{6.70} $$
D D D 2

6.4.2 Field equations with matter

If the Lagrangian does not implicitly depend on the metric, we can write

$$ \delta L = \frac{\partial L}{\partial g^{\mu\nu}}\, \delta g^{\mu\nu} . \tag{6.71} $$

If there is an implicit dependence on the metric, we can introduce normal coordinates to evaluate the variation of the Christoffel symbols,

$$ \delta\Gamma^\alpha{}_{\mu\nu} = \frac{1}{2}\, g^{\alpha\sigma} \left( \nabla_\nu\, \delta g_{\sigma\mu} + \nabla_\mu\, \delta g_{\sigma\nu} - \nabla_\sigma\, \delta g_{\mu\nu} \right) , \tag{6.72} $$

which is a tensor, as remarked above, whence (6.72) holds in all coordinate frames everywhere. The derivatives can then be moved away from the variations of the metric by partial integration, and expressions proportional to $\delta g^{\mu\nu}$ remain.
Energy-momentum tensor
Thus, it is possible to write the variation of the action with respect to the metric in the form

$$ \delta \int_D L\, \eta = -\frac{1}{2} \int_D T_{\mu\nu}\, \delta g^{\mu\nu}\, \eta , \tag{6.73} $$

in which the tensor T is the energy-momentum tensor. If there are no implicit dependences on the metric, its components are

$$ T_{\mu\nu} = -2\, \frac{\partial L}{\partial g^{\mu\nu}} + L\, g_{\mu\nu} . \tag{6.74} $$

? Derive the energy-momentum tensor for the matter field described by the Lagrangian (6.68).

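Example: Energy-momentum tensor of the electromagnetic field

Let us show by an example that this identification does indeed make sense. We start from the Lagrangian of the free electromagnetic field,

$$ L = -\frac{1}{16\pi}\, F^{\alpha\beta} F_{\alpha\beta} = -\frac{1}{16\pi}\, F_{\alpha\beta} F_{\gamma\delta}\, g^{\alpha\gamma} g^{\beta\delta} . \tag{6.75} $$

We know from (4.23) that the covariant derivatives in the field tensor F can be replaced by partial derivatives, thus there is no implicit dependence on the metric. Then, the variation δL is

$$ \delta L = -\frac{1}{8\pi}\, F_{\mu\alpha} F_{\nu\beta}\, g^{\alpha\beta}\, \delta g^{\mu\nu} . \tag{6.76} $$

With (6.70), this implies

$$ \delta \int_D L\, \eta = \frac{1}{8\pi} \int_D \left( -F_{\mu\alpha} F_\nu{}^\alpha + \frac{1}{4}\, F_{\alpha\beta} F^{\alpha\beta}\, g_{\mu\nu} \right) \delta g^{\mu\nu}\, \eta \tag{6.77} $$

and, from (6.73), the familiar energy-momentum tensor

$$ T^{\mu\nu} = \frac{1}{4\pi} \left( F^{\mu\lambda} F^\nu{}_\lambda - \frac{1}{4}\, g^{\mu\nu} F_{\alpha\beta} F^{\alpha\beta} \right) \tag{6.78} $$

of the electromagnetic field.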
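To see that (6.78) reproduces familiar results, one can evaluate it in flat space for given electric and magnetic fields. The sketch below assumes Gaussian units with c = 1, the Minkowski metric with signature (−,+,+,+), and the convention F^{0i} = E_i, F^{ij} = ε_{ijk}B_k for the field tensor; this convention, the function names and the sample field values are assumptions made for illustration. With them, T^{00} should equal the energy density (E² + B²)/8π, T^{0i} the Poynting vector (E × B)_i/4π, and the trace should vanish.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal of the Minkowski metric, signature (-,+,+,+)


def field_tensor(E, B):
    """Contravariant F^{mu nu} with F^{0i} = E_i and F^{ij} = eps_{ijk} B_k."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = E[i]
        F[i + 1][0] = -E[i]
    F[1][2], F[2][1] = B[2], -B[2]
    F[2][3], F[3][2] = B[0], -B[0]
    F[3][1], F[1][3] = B[1], -B[1]
    return F


def energy_momentum(F):
    """T^{mu nu} from (6.78), evaluated with the Minkowski metric."""
    # F_{ab} F^{ab}: lowering both indices multiplies by eta_aa * eta_bb.
    invariant = sum(ETA[a] * ETA[b] * F[a][b] ** 2
                    for a in range(4) for b in range(4))
    T = [[0.0] * 4 for _ in range(4)]
    for m in range(4):
        for n in range(4):
            # F^{m l} F^{n}_{l} = sum_l F^{m l} F^{n l} eta_ll
            contraction = sum(F[m][l] * F[n][l] * ETA[l] for l in range(4))
            g_inv_mn = ETA[m] if m == n else 0.0
            T[m][n] = (contraction - 0.25 * g_inv_mn * invariant) / (4.0 * math.pi)
    return T


E = (1.0, 2.0, 3.0)
B = (0.5, -1.0, 2.0)
T = energy_momentum(field_tensor(E, B))
energy_density = T[0][0]
trace_T = sum(ETA[m] * T[m][m] for m in range(4))
print("T^00 =", energy_density, " T^01 =", T[0][1], " trace =", trace_T)
```

The vanishing trace reflects the fact that the electromagnetic field has no intrinsic mass scale; it holds for any choice of E and B in this sketch.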



Therefore, Einstein’s field equations and the matter equations follow from the variational principle

$$ \delta \int_D \left( R - 2\Lambda + \frac{16\pi G}{c^4}\, L \right) \eta = 0 \tag{6.79} $$

because, as we have seen before, the variation of the first two terms yields G + Λg, and the variation of the third term yields minus one-half of the energy-momentum tensor, −T/2. In components, the variation yields

$$ G_{\mu\nu} = \frac{8\pi G}{c^4}\, T_{\mu\nu} - \Lambda\, g_{\mu\nu} . \tag{6.80} $$

This shows that the cosmological constant can be considered as part of the energy-momentum tensor,

$$ T_{\mu\nu} \to T_{\mu\nu} + T^\Lambda_{\mu\nu} , \qquad T^\Lambda_{\mu\nu} \equiv -\frac{\Lambda c^4}{8\pi G}\, g_{\mu\nu} . \tag{6.81} $$

6.4.3 Equations of motion

Suppose spacetime is filled with an ideal fluid whose pressure p can be neglected compared to the energy density ρc². Then, the energy-momentum tensor (6.32) can be reduced to

$$ T = \rho c^2\, u \otimes u . \tag{6.82} $$

Conservation of the fluid can be expressed in the following way: the amount of matter contained in a domain D of spacetime must remain the same, even if the domain is mapped into another domain φ_t(D) by the flow φ_t of the vector field u with the time t. Thus

$$ \int_D \rho\, \eta = \int_{\varphi_t(D)} \rho\, \eta . \tag{6.83} $$

This expression just says that, if the domain D is mapped along the flow lines of the fluid flow, it will encompass a constant amount of material independent of time t.
Now, we can use the pull-back to write

$$ \int_{\varphi_t(D)} \rho\, \eta = \int_D \varphi_t^* (\rho\, \eta) , \tag{6.84} $$

and take the limit t → 0 to see the equivalence of (6.83) and (6.84) with the vanishing Lie derivative of ρη along u,

$$ \mathcal{L}_u (\rho\, \eta) = 0 . \tag{6.85} $$

? Convince yourself recalling the definition of the pull-back that (6.84) is correct.

The Leibniz rule (5.14) yields

$$ (\mathcal{L}_u \rho)\, \eta + \rho\, \mathcal{L}_u \eta = 0 . \tag{6.86} $$

Due to (5.15), the first term yields

$$ (\mathcal{L}_u \rho)\, \eta = (u\rho)\, \eta = (u^i \partial_i \rho)\, \eta = (u^i \nabla_i \rho)\, \eta = (\nabla_u \rho)\, \eta . \tag{6.87} $$

For the second term, we can apply equation (5.30) for the components of the Lie derivative of a rank-(0, 4) tensor, and use the antisymmetry of η to see that

$$ \mathcal{L}_u \eta = (\nabla_\mu u^\mu)\, \eta = (\nabla \cdot u)\, \eta . \tag{6.88} $$

Accordingly, (6.86) can be written as

$$ 0 = (\nabla_u \rho + \rho\, \nabla \cdot u)\, \eta = \nabla \cdot (\rho u)\, \eta , \tag{6.89} $$

or

$$ \nabla_\mu (\rho u^\mu) = 0 . \tag{6.90} $$

? Give a physical interpretation of equation (6.90). What does it mean?

At the same time, the divergence of T must vanish, hence

$$ 0 = \nabla_\nu T^{\mu\nu} = \nabla_\nu (\rho u^\mu u^\nu) = \nabla_\nu (\rho u^\nu)\, u^\mu + \rho u^\nu \nabla_\nu u^\mu . \tag{6.91} $$

The first term vanishes because of (6.90), and the second implies

$$ u^\nu \nabla_\nu u^\mu = 0 \quad\Leftrightarrow\quad \nabla_u u = 0 . \tag{6.92} $$

In other words, the flow lines have to be geodesics. For a pressure-less ideal fluid, the equation of motion thus follows directly from the vanishing divergence of the energy-momentum tensor, which is required in general relativity by the contracted Bianchi identity (3.91).
Chapter 7

Weak Gravitational Fields

7.1 Linearised theory of gravity

7.1.1 Linearised field equations

We begin our study of solutions for the field equations with situations in which the metric is almost Minkowskian, writing

$$ g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} , \tag{7.1} $$

where $h_{\mu\nu}$ is considered as a perturbation of the Minkowski metric $\eta_{\mu\nu}$ such that

$$ |h_{\mu\nu}| \ll 1 . \tag{7.2} $$

This condition is excellently satisfied e.g. in the Solar System, where

$$ |h_{\mu\nu}| \approx \frac{\Phi}{c^2} \approx 10^{-6} . \tag{7.3} $$

? How can you most easily confirm the estimate (7.3) for the Solar System and other astronomical objects?

Note that small perturbations of the metric do not necessarily imply small perturbations of the matter density, as the Solar System illustrates. Also, the metric perturbations may change rapidly in time.
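The order of magnitude in (7.3) can be reproduced with a few lines of arithmetic. The sketch below evaluates the dimensionless depth of the potential, |Φ|/c² = GM/(rc²), at the surface of the Sun and at the Earth's orbit. The constants are standard reference values, and the function name is an illustrative choice.

```python
import math

GM_SUN = 1.32712440018e20   # gravitational parameter of the Sun, m^3 s^-2
C_LIGHT = 299_792_458.0     # speed of light, m/s
R_SUN = 6.957e8             # solar radius, m
AU = 1.495978707e11         # astronomical unit, m


def potential_depth(gm, r):
    """Dimensionless potential |Phi| / c^2 = G M / (r c^2) of a point mass."""
    return gm / (r * C_LIGHT**2)


at_solar_surface = potential_depth(GM_SUN, R_SUN)
at_earth_orbit = potential_depth(GM_SUN, AU)

# For a circular orbit, G M / r = v^2, so the same number is (v/c)^2.
v_earth = math.sqrt(GM_SUN / AU)
print("solar surface:", at_solar_surface)
print("Earth's orbit:", at_earth_orbit, " from orbital speed:", (v_earth / C_LIGHT) ** 2)
```

The value at the solar surface is about 2 × 10⁻⁶, which is the largest perturbation in the Solar System; at the Earth's orbit it drops to roughly 10⁻⁸.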
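First, we write down the Christoffel symbols for this kind of metric. Starting from (3.74) and ignoring quadratic terms in $h_{\mu\nu}$, we can write

$$ \Gamma^\alpha{}_{\mu\nu} = \frac{1}{2}\, \eta^{\alpha\beta} \left( \partial_\nu h_{\mu\beta} + \partial_\mu h_{\beta\nu} - \partial_\beta h_{\mu\nu} \right) = \frac{1}{2} \left( \partial_\nu h^\alpha{}_\mu + \partial_\mu h^\alpha{}_\nu - \partial^\alpha h_{\mu\nu} \right) . \tag{7.4} $$

Next, we can ignore the terms quadratic in the Christoffel symbols in the components of the Ricci tensor (3.56) and find

$$ R_{\mu\nu} = \partial_\lambda \Gamma^\lambda{}_{\mu\nu} - \partial_\nu \Gamma^\lambda{}_{\lambda\mu} . \tag{7.5} $$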


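Inserting (7.4) yields

$$ R_{\mu\nu} = \frac{1}{2} \left( \partial_\lambda \partial_\nu h^\lambda{}_\mu + \partial_\lambda \partial_\mu h^\lambda{}_\nu - \partial_\lambda \partial^\lambda h_{\mu\nu} - \partial_\mu \partial_\nu h^\lambda{}_\lambda \right) = \frac{1}{2} \left( \partial_\lambda \partial_\nu h^\lambda{}_\mu + \partial_\lambda \partial_\mu h^\lambda{}_\nu - \Box h_{\mu\nu} - \partial_\mu \partial_\nu h \right) , \tag{7.6} $$

where we have introduced the d’Alembert operator and abbreviated the trace of the metric perturbation,

$$ \Box = \partial_\lambda \partial^\lambda , \qquad h \equiv h^\lambda{}_\lambda . \tag{7.7} $$

Caution: Note that the d’Alembert operator is the square of the ordinary partial derivative here, not the covariant derivative. Is this appropriate, and if so, why?

The Ricci scalar is the contraction of $R_{\mu\nu}$,

$$ R = \partial_\lambda \partial_\mu h^{\lambda\mu} - \Box h , \tag{7.8} $$

and the Einstein tensor is

$$ G_{\mu\nu} = \frac{1}{2} \left( \partial_\lambda \partial_\nu h^\lambda{}_\mu + \partial_\lambda \partial_\mu h^\lambda{}_\nu - \eta_{\mu\nu}\, \partial_\lambda \partial_\sigma h^{\lambda\sigma} - \partial_\mu \partial_\nu h - \Box h_{\mu\nu} + \eta_{\mu\nu}\, \Box h \right) . \tag{7.9} $$

Neglecting terms of order $|h_{\mu\nu}|^2$, the contracted Bianchi identity reduces to

$$ \partial_\nu G^{\mu\nu} = 0 , \tag{7.10} $$

which, together with the field equations, implies

$$ \partial_\nu T^{\mu\nu} = 0 . \tag{7.11} $$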

One could now insert the Minkowski metric in $T^{\mu\nu}$, search for a first solution $h^{(0)}_{\mu\nu}$ of the linearised field equations and iterate, replacing $\eta_{\mu\nu}$ by $\eta_{\mu\nu} + h^{(0)}_{\mu\nu}$ in $T^{\mu\nu}$ to find a corrected solution $h^{(1)}_{\mu\nu}$, and so forth. This procedure is useful as long as the back-reaction of the metric on the energy-momentum tensor is small.
If we specialise (7.11) for pressure-less dust and insert (6.82), we find the equation of motion

$$ u^\nu \partial_\nu u^\mu = 0 , \tag{7.12} $$

which means that the fluid elements follow straight lines.

? Compare (7.12) to the geodesic equation for the motion of fluid particles.
7.1.2 Wave equation for metric fluctuations

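The field equations simplify considerably when we substitute

$$ \gamma_{\mu\nu} \equiv h_{\mu\nu} - \frac{1}{2}\, \eta_{\mu\nu}\, h \tag{7.13} $$

for $h_{\mu\nu}$. Since $\gamma \equiv \gamma^\mu{}_\mu = -h$, we can solve (7.13) for $h_{\mu\nu}$ and insert

$$ h_{\mu\nu} = \gamma_{\mu\nu} - \frac{1}{2}\, \eta_{\mu\nu}\, \gamma \tag{7.14} $$

into (7.9) to obtain the linearised field equations

$$ \partial_\lambda \partial_\nu \gamma^\lambda{}_\mu + \partial_\lambda \partial_\mu \gamma^\lambda{}_\nu - \eta_{\mu\nu}\, \partial_\lambda \partial_\sigma \gamma^{\lambda\sigma} - \Box \gamma_{\mu\nu} = \frac{16\pi G}{c^4}\, T_{\mu\nu} . \tag{7.15} $$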

7.2 Gauge transformations

7.2.1 Diffeomorphism invariance

Diffeomorphism invariance
Let φ : M → N be a diffeomorphism. Since φ is then bijective and smoothly differentiable and has a smoothly differentiable inverse, M and N can be considered as indistinguishable abstract manifolds. The manifolds M and N then represent the same physical spacetime. In particular, the metric g on M is then physically equivalent to the pulled-back metric φ*g. This diffeomorphism invariance is a fundamental property of general relativity.

? In particular, the diffeomorphism invariance of general relativity implies that coordinate systems can have no physical significance. Use your own words to explain why this is so.

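In particular, this holds for a one-parameter group φ_t of diffeomorphisms which represents the (local) flow of some vector field v. By the definition of the Lie derivative, we have, to first order in t,

$$ \varphi_t^* g = g + t\, \mathcal{L}_v g . \tag{7.16} $$

Now, set g = η + h and define the infinitesimal vector ξ ≡ tv. Then, the transformation (7.16) implies

$$ h \to \varphi_t^* h = h + t\, \mathcal{L}_v \eta + t\, \mathcal{L}_v h = h + \mathcal{L}_\xi \eta + \mathcal{L}_\xi h . \tag{7.17} $$

For weak fields, the third term on the right-hand side can be neglected. Using (5.31), we see that

$$ (\mathcal{L}_\xi \eta)_{\mu\nu} = \eta_{\lambda\nu}\, \partial_\mu \xi^\lambda + \eta_{\mu\lambda}\, \partial_\nu \xi^\lambda = \partial_\mu \xi_\nu + \partial_\nu \xi_\mu . \tag{7.18} $$

We thus find the following important result:

Gauge transformations of weak metric perturbations
The weak metric perturbation $h_{\mu\nu}$ admits the gauge transformation

$$ h_{\mu\nu} \to h_{\mu\nu} + \partial_\mu \xi_\nu + \partial_\nu \xi_\mu . \tag{7.19} $$

This gauge transformation changes the tensor $\gamma_{\mu\nu}$ as

$$ \gamma_{\mu\nu} \to \gamma_{\mu\nu} + \partial_\mu \xi_\nu + \partial_\nu \xi_\mu - \eta_{\mu\nu}\, \partial_\lambda \xi^\lambda . \tag{7.20} $$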

7.2.2 Hilbert gauge

We can now arrange matters to enforce the Hilbert gauge

$$ \partial_\nu \gamma^{\mu\nu} = 0 . \tag{7.21} $$

The gauge transformation (7.20) implies that the divergence of $\gamma^{\mu\nu}$ is transformed as

$$ \partial_\nu \gamma^{\mu\nu} \to \partial_\nu \gamma^{\mu\nu} + \partial_\nu \partial^\mu \xi^\nu + \Box \xi^\mu - \partial^\mu \partial_\lambda \xi^\lambda = \partial_\nu \gamma^{\mu\nu} + \Box \xi^\mu , \tag{7.22} $$

such that, if (7.21) is not satisfied yet, it can be achieved by choosing for $\xi^\mu$ a solution of the inhomogeneous wave equation

$$ \Box \xi^\mu = -\partial_\nu \gamma^{\mu\nu} , \tag{7.23} $$

which, as we know from electrodynamics, can be obtained by means of the retarded Green’s function of the d’Alembert operator.

? How are the retarded and the advanced Green’s functions constructed in electrodynamics? Remind yourself of the essential steps.

Wave equation for metric perturbations
Enforcing the Hilbert gauge in this way simplifies the linearised field equation (7.15) dramatically,

$$ \Box \gamma^{\mu\nu} = -\frac{16\pi G}{c^4}\, T^{\mu\nu} . \tag{7.24} $$

These equations are formally identical to Maxwell’s equations in Lorenz gauge, and therefore admit the same solutions. Defining the Green’s function of the d’Alembert operator □ by

$$ \Box G(x, x') = \Box G(t, t', \vec x, \vec x\,') = -4\pi\, \delta_D(t - t', \vec x - \vec x\,') \tag{7.25} $$

and using x⁰ = ct instead of t, we find the retarded Green’s function

$$ G(x, x') = \frac{1}{|\vec x - \vec x\,'|}\, \delta_D\!\left( x^0 - x'^0 - |\vec x - \vec x\,'| \right) . \tag{7.26} $$

Using it, we arrive at the particular solution

$$ \gamma^{\mu\nu}(x) = \frac{4G}{c^4} \int \frac{T^{\mu\nu}\!\left( x^0 - |\vec x - \vec x\,'|, \vec x\,' \right)}{|\vec x - \vec x\,'|}\; \mathrm{d}^3 x' \tag{7.27} $$

for the linearised field equation. Of course, arbitrary solutions of the homogeneous (vacuum) wave equation can be added.
Thus, similar to electrodynamics, the metric perturbation consists of the
field generated by the source plus wave-like vacuum solutions propagat-
ing at the speed of light.

7.3 Nearly Newtonian gravity

7.3.1 Newtonian approximation of the metric

A nearly Newtonian source of gravity can be described by the approximations $T^{00} \gg |T^{0j}|$ and $T^{00} \gg |T^{ij}|$, which express that mean velocities are small, and the rest-mass energy dominates the kinetic energy. Then, we can also neglect retardation effects and write

$$ \gamma_{00}(\vec x) = \frac{4G}{c^2} \int \frac{\rho(\vec x\,')\; \mathrm{d}^3 x'}{|\vec x - \vec x\,'|} = -\frac{4\Phi(\vec x)}{c^2} , \tag{7.28} $$

where Φ(x) is the ordinary Newtonian gravitational potential. All other components of the metric perturbation $\gamma_{\mu\nu}$ vanish,

$$ \gamma_{0j} = 0 = \gamma_{ij} . \tag{7.29} $$

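Then, the full metric

$$ g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} = \eta_{\mu\nu} + \gamma_{\mu\nu} - \frac{1}{2}\, \eta_{\mu\nu}\, \gamma \tag{7.30} $$

has the components

$$ g_{00} = -\left( 1 + \frac{2\Phi}{c^2} \right) , \qquad g_{0j} = 0 , \qquad g_{ij} = \left( 1 - \frac{2\Phi}{c^2} \right) \delta_{ij} , \tag{7.31} $$

creating the line element

$$ \mathrm{d}s^2 = -\left( 1 + \frac{2\Phi}{c^2} \right) c^2\, \mathrm{d}t^2 + \left( 1 - \frac{2\Phi}{c^2} \right) \left( \mathrm{d}x^2 + \mathrm{d}y^2 + \mathrm{d}z^2 \right) . \tag{7.32} $$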
Far away from a source with mass M, the monopole term −GM/r dominates the gravitational potential Φ in (7.28). Thus, we find:

Metric in the Newtonian limit
In the Newtonian limit, the weakly perturbed metric of a mass M has the line element

$$ \mathrm{d}s^2 = -\left( 1 - \frac{2GM}{rc^2} \right) c^2\, \mathrm{d}t^2 + \left( 1 + \frac{2GM}{rc^2} \right) \left( \mathrm{d}x^2 + \mathrm{d}y^2 + \mathrm{d}z^2 \right) . \tag{7.33} $$

? What does it mean to say that the monopole term dominates the potential? Which other terms could contribute?

7.3.2 Gravitational lensing and the Shapiro delay

Two interesting conclusions can be drawn directly from (7.32). Since light follows null geodesics, light propagation is characterised by ds² = 0 or

$$ \left( 1 + \frac{2\Phi}{c^2} \right) c^2\, \mathrm{d}t^2 = \left( 1 - \frac{2\Phi}{c^2} \right) \mathrm{d}\vec x^{\,2} , \tag{7.34} $$

which implies that the light speed in a (weak) gravitational field is

$$ c' = \frac{|\mathrm{d}\vec x|}{\mathrm{d}t} = \left( 1 + \frac{2\Phi}{c^2} \right) c \tag{7.35} $$

to first order in Φ.
Since Φ ≤ 0 if normalised such that Φ → 0 at infinity, c′ ≤ c, which we can express by the index of refraction for a weak gravitational field,

$$ n = \frac{c}{c'} = 1 - \frac{2\Phi}{c^2} . \tag{7.36} $$
c c

Index of refraction of a gravitational field

A weak gravitational field with Newtonian gravitational potential Φ has the effective index of refraction

$$ n = 1 - \frac{2\Phi}{c^2} \ge 1 . \tag{7.37} $$
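This can be used to calculate light deflection using Fermat’s principle, which asserts that light follows a path along which the light-travel time between a fixed source and a fixed observer is extremal, thus

$$ \delta \int \mathrm{d}t = \delta \int \frac{|\mathrm{d}\vec x|}{c'} \quad\Rightarrow\quad \delta \int n(\vec x)\, |\mathrm{d}\vec x| = 0 . \tag{7.38} $$

Introducing a curve parameter λ, we can write $\vec x = \vec x(\lambda)$, thus $|\mathrm{d}\vec x| = (\dot{\vec x}^{\,2})^{1/2}\, \mathrm{d}\lambda$ and

$$ \delta \int n(\vec x) \left( \dot{\vec x}^{\,2} \right)^{1/2} \mathrm{d}\lambda = 0 , \tag{7.39} $$

where the overdot denotes derivation with respect to λ.

The variation leads to the Euler-Lagrange equation

$$ \frac{\partial L}{\partial \vec x} - \frac{\mathrm{d}}{\mathrm{d}\lambda} \frac{\partial L}{\partial \dot{\vec x}} = 0 \quad\text{with}\quad L \equiv n(\vec x) \left( \dot{\vec x}^{\,2} \right)^{1/2} . \tag{7.40} $$

Thus, we find

$$ \left( \dot{\vec x}^{\,2} \right)^{1/2} \vec\nabla n - \frac{\mathrm{d}}{\mathrm{d}\lambda} \left[ n\, \dot{\vec x} \left( \dot{\vec x}^{\,2} \right)^{-1/2} \right] = 0 . \tag{7.41} $$

We can simplify this expression by choosing the curve parameter such that $\dot{\vec x}$ is a unit vector $\vec e$, hence

$$ \vec\nabla n - \left( \vec e \cdot \vec\nabla n \right) \vec e - n\, \dot{\vec e} = 0 . \tag{7.42} $$

? Perform the calculation yourself leading from (7.39) to (7.42).

The first two terms are the component of $\vec\nabla n$ perpendicular to $\vec e$, and $\dot{\vec e}$ is the change of direction of the tangent vector along the light ray. Thus,

$$ \dot{\vec e} = \vec\nabla_\perp \ln n = -\frac{2}{c^2}\, \vec\nabla_\perp \Phi \tag{7.43} $$

to first order in Φ. The total deflection angle is obtained by integrating $\dot{\vec e}$ along the light path.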
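As an illustration of this integration, the following sketch evaluates the total deflection angle for a light ray grazing the Sun. It assumes a point-mass potential Φ = −GM/r and integrates (7.43) along the unperturbed straight path with impact parameter b (the usual first-order treatment), where the perpendicular gradient of Φ has magnitude GMb/(l² + b²)^{3/2} at path position l. The function name, the finite integration range and the number of steps are assumptions of this sketch; the closed-form result of the same integral over the infinite line is 4GM/(c²b).

```python
import math

GM_SUN = 1.32712440018e20   # gravitational parameter of the Sun, m^3 s^-2
C_LIGHT = 299_792_458.0     # speed of light, m/s
R_SUN = 6.957e8             # solar radius, m


def deflection_angle(gm, impact, half_length, steps):
    """Midpoint-rule integral of (2/c^2) |grad_perp Phi| along a straight path.

    For Phi = -GM/r, the perpendicular gradient at path position l has the
    magnitude GM * b / (l^2 + b^2)^(3/2) and points towards the mass.
    """
    dl = 2.0 * half_length / steps
    total = 0.0
    for k in range(steps):
        l = -half_length + (k + 0.5) * dl
        total += gm * impact / (l * l + impact * impact) ** 1.5 * dl
    return 2.0 * total / C_LIGHT**2


alpha = deflection_angle(GM_SUN, R_SUN, 1000.0 * R_SUN, 200_000)
alpha_analytic = 4.0 * GM_SUN / (C_LIGHT**2 * R_SUN)
alpha_arcsec = math.degrees(alpha) * 3600.0
print("numerical:", alpha, " analytic:", alpha_analytic, " arcsec:", alpha_arcsec)
```

For a ray grazing the solar limb, this gives a deflection of about 1.75 arc seconds, twice the value obtained if only the time-time component of the metric perturbation were taken into account.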
As a second consequence, we see that the light travel time along an infinitesimal path length dl is

$$ \mathrm{d}t = \frac{\mathrm{d}l}{c'} = n\, \frac{\mathrm{d}l}{c} = \left( 1 - \frac{2\Phi}{c^2} \right) \frac{\mathrm{d}l}{c} . \tag{7.44} $$

Shapiro delay
Compared to light propagation in vacuum, there is thus a time delay

$$ \Delta(\mathrm{d}t) = \mathrm{d}t - \frac{\mathrm{d}l}{c} = -\frac{2\Phi}{c^3}\, \mathrm{d}l , \tag{7.45} $$

which is called the Shapiro delay.

? How could the Shapiro delay be measured?
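To get a feeling for the size of the effect, (7.45) can be integrated along a light path passing the Sun. The sketch below assumes a point-mass potential and a straight path with impact parameter equal to the solar radius, running from the Earth (1 au on one side of the Sun) to Mars (1.524 au on the other side); these geometric choices, the function name and the step number are illustrative assumptions. The numerical result is compared with the closed form (2GM/c³)[asinh(l₂/b) − asinh(l₁/b)], which follows from the elementary integral of 1/√(l² + b²).

```python
import math

GM_SUN = 1.32712440018e20   # gravitational parameter of the Sun, m^3 s^-2
C_LIGHT = 299_792_458.0     # speed of light, m/s
R_SUN = 6.957e8             # solar radius, m
AU = 1.495978707e11         # astronomical unit, m


def shapiro_delay(gm, impact, l_start, l_end, steps):
    """Midpoint-rule integral of -2 Phi / c^3 along a straight path.

    The path runs parallel to the l axis at distance `impact` from the mass,
    so that Phi = -GM / sqrt(l^2 + b^2).
    """
    dl = (l_end - l_start) / steps
    total = 0.0
    for k in range(steps):
        l = l_start + (k + 0.5) * dl
        phi = -gm / math.sqrt(l * l + impact * impact)
        total += -2.0 * phi / C_LIGHT**3 * dl
    return total


l_earth, l_mars = -AU, 1.524 * AU
delay = shapiro_delay(GM_SUN, R_SUN, l_earth, l_mars, 400_000)
delay_analytic = 2.0 * GM_SUN / C_LIGHT**3 * (
    math.asinh(l_mars / R_SUN) - math.asinh(l_earth / R_SUN))
print("one-way delay in microseconds:", delay * 1e6, delay_analytic * 1e6)
```

In this configuration the one-way delay is somewhat above 120 microseconds, so a radar signal reflected back to the Earth would be delayed by roughly a quarter of a millisecond.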

Example: Time delay in a gravitationally-lensed quasar

Gravitational bending of light can lead to multiple light paths, or null geodesics, leading from a single source to the observer. Then, the observer sees the source multiply imaged. If the source is variable, the Shapiro delay, together with the different geometrical lengths of the light paths, leads to a measurable time shift between the images: shifted copies of the light curves are then seen in the individual images. Many such time delays caused by gravitational lensing have been observed. A recent example is the time delay of (111.3 ± 3) days measured in the doubly-imaged quasar SDSS 1206+4332. Such measurements are important for cosmology because they allow determinations of the Hubble constant, i.e. the relative expansion rate of the Universe.

7.3.3 The gravitomagnetic field

At next order in powers of c⁻¹, the current terms in the energy-momentum tensor appear, but no stresses yet. That is, we now approximate $T_{ij} = 0$ and use the field equations

$$ \Box \gamma_{ij} = 0 , \qquad \Box \gamma_{0\mu} = -\frac{16\pi G}{c^4}\, T_{0\mu} . \tag{7.46} $$

? How could boundary conditions be taken into account in solving equation (7.46)? Which boundary conditions could be appropriate? Compare with electrodynamics.

Now, we set $A_\mu \equiv \gamma_{0\mu}/4$ and obtain the Maxwell-type equations

$$ \Box A_\mu = -\frac{4\pi}{c^2}\, j_\mu , \tag{7.47} $$

where the current density $j_\mu \equiv G T_{0\mu}/c^2$ was introduced. According to our earlier result (7.28), $A_0 = -\Phi/c^2$. This similarity to electromagnetic theory naturally leads to the introduction of “electric” and “magnetic” components of the gravitational field.
Suppose now that the field is quasi-stationary, so that time derivatives of the metric $\gamma_{\mu\nu}$ can be neglected. Then, $\vec\nabla^2 \gamma_{ij} = 0$ everywhere because $T_{ij} = 0$ was assumed, thus $\gamma_{ij} = 0$, and the potentials $A_\mu$ determine the field completely. They are

$$ A_0 = -\frac{\Phi}{c^2} , \qquad A_i = \frac{G}{c^4} \int \frac{T_{0i}(\vec x\,')\; \mathrm{d}^3 x'}{|\vec x - \vec x\,'|} , \tag{7.48} $$

and the components of the metric g are, according to (7.30),

$$ g_{00} = -1 + 2A_0 , \qquad g_{0i} = \gamma_{0i} = 4A_i , \qquad g_{ij} = (1 + 2A_0)\, \delta_{ij} . \tag{7.49} $$

Gravitomagnetic potential
Matter currents create a magnetic gravitational potential similar to the
electromagnetic vector potential.
The most direct approach to the equations of motion starts from the variational principle (4.5), or

$$ \delta \int \left( -g_{\mu\nu}\, \dot x^\mu \dot x^\nu \right)^{1/2} \mathrm{d}t = 0 , \tag{7.50} $$

where the dot now denotes the derivative with respect to the coordinate time t. The radicand is

$$ c^2 - 2c^2 A_0 - 8c\, \vec A \cdot \vec v - \vec v^{\,2} , \tag{7.51} $$

where we have neglected terms of order $\Phi \vec v^{\,2}$ since the velocities are assumed to be small compared to the speed of light.
Using (7.51), we can reduce the least-action principle (7.50) to the Euler-Lagrange equations with the effective Lagrangian

$$ L = \frac{\vec v^{\,2}}{2} + A_0 c^2 + 4c\, \vec A \cdot \vec v . \tag{7.52} $$
2
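We first find

$$ \frac{\partial L}{\partial \vec v} = \vec v + 4c \vec A \quad\Rightarrow\quad \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial \vec v} = \frac{\mathrm{d}\vec v}{\mathrm{d}t} + 4c\, (\vec v \cdot \vec\nabla) \vec A . \tag{7.53} $$

Using the vector identity

$$ \vec\nabla (\vec a \cdot \vec b) = (\vec a \cdot \vec\nabla) \vec b + (\vec b \cdot \vec\nabla) \vec a + \vec a \times (\vec\nabla \times \vec b) + \vec b \times (\vec\nabla \times \vec a) , \tag{7.54} $$

we further obtain

$$ \frac{\partial L}{\partial \vec x} = c^2\, \vec\nabla A_0 + 4c \left[ (\vec v \cdot \vec\nabla) \vec A + \vec v \times (\vec\nabla \times \vec A) \right] , \tag{7.55} $$

from which we obtain the equations of motion

$$ \frac{\mathrm{d}\vec v}{\mathrm{d}t} \equiv \vec f = c^2\, \vec\nabla A_0 + 4c\, \vec v \times (\vec\nabla \times \vec A) , \tag{7.56} $$

in which the (specific) force term on the right-hand side corresponds to the Lorentz force in electrodynamics.

? Convince yourself of the vector identity (7.54).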
Let us consider now a small body characterised by its density suspended in a gravitomagnetic field; “small” means that the field can be considered constant across it. It experiences the torque about its centre-of-mass

$$ \vec M = \int \mathrm{d}^3 x\; \vec x \times \rho \vec f = -c^2\, \vec\nabla A_0 \times \int \mathrm{d}^3 x\; \vec x \rho + 4c \int \mathrm{d}^3 x\; \vec x \times \left( \vec j \times \vec B \right) , \tag{7.57} $$

where $\vec j = \rho \vec v$ is the matter current density and $\vec B = \vec\nabla \times \vec A$ is the gravitomagnetic field. With the coordinates being centred on the centre-of-mass, the first term vanishes. A non-trivial calculation carried out in the In-depth box “Spin in a gravitomagnetic field” shows that the second term gives

$$ \vec M = 2c \int \mathrm{d}^3 x \left( \vec x \times \vec j \right) \times \vec B = 2c\, \vec s \times \vec B , \tag{7.58} $$

where $\vec s$ is the intrinsic angular momentum of the body, i.e. its spin.

? Can you prove (7.64) in the In-depth box “Spin in a gravitomagnetic field” with partial integration in one of the two terms? Do you need any further conditions for doing so?

Thus, the body’s spin changes according to

$$ \dot{\vec s} = \vec M = 2c\, \vec s \times \vec B . \tag{7.59} $$

Let us now orient the coordinate frame such that $\vec B = B \vec e_3$, i.e. $B_1 = 0 = B_2$. Then,

$$ \dot s_1 = 2cB s_2 , \qquad \dot s_2 = -2c s_1 B . \tag{7.60} $$

Introducing σ = s₁ + i s₂ turns this into the single equation

$$ \dot\sigma = -2cB\, \mathrm{i}\, \sigma , \tag{7.61} $$

which is solved by the ansatz σ = σ₀ exp(iωt) if ω = −2cB. This shows that:

Lense-Thirring effect
A spinning body in a gravitomagnetic field will experience spin precession with the angular frequency

$$ \vec\omega = -2c \vec B = -2c\, \vec\nabla \times \vec A , \tag{7.62} $$

which is called the Lense-Thirring effect.
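The precession described by (7.59) to (7.62) can be reproduced by integrating the spin equation numerically. The sketch below uses a classical fourth-order Runge-Kutta scheme (a generic integrator chosen here for illustration, not a method taken from the notes) for ds/dt = s × Ω, where the vector Ω stands for 2cB. With Ω along the third axis, the complex combination σ = s₁ + i s₂ should follow σ₀ exp(−i|Ω|t), in agreement with ω = −2cB. The rate, initial spin and step number are arbitrary illustrative values.

```python
import cmath


def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def precess(spin, rate_vec, t_end, steps):
    """RK4 integration of ds/dt = s x rate_vec, where rate_vec stands for 2 c B."""
    dt = t_end / steps
    s = tuple(spin)

    def shifted(base, slope, factor):
        return tuple(x + factor * y for x, y in zip(base, slope))

    for _ in range(steps):
        k1 = cross(s, rate_vec)
        k2 = cross(shifted(s, k1, dt / 2.0), rate_vec)
        k3 = cross(shifted(s, k2, dt / 2.0), rate_vec)
        k4 = cross(shifted(s, k3, dt), rate_vec)
        s = tuple(x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                  for x, a, b, c, d in zip(s, k1, k2, k3, k4))
    return s


RATE = 0.3              # |2 c B| in arbitrary inverse-time units
s0 = (1.0, 0.0, 0.5)    # initial spin components (arbitrary units)
t_end = 10.0
s = precess(s0, (0.0, 0.0, RATE), t_end, 2000)

sigma_numeric = complex(s[0], s[1])
sigma_expected = complex(s0[0], s0[1]) * cmath.exp(-1j * RATE * t_end)
print("numerical sigma:", sigma_numeric, " expected:", sigma_expected)
```

The component of the spin along the field stays constant, while the perpendicular components rotate with constant modulus, which is the defining signature of a precession.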



Example: Measurement of spin precession near the Earth


On April 20th, 2004, the satellite Gravity Probe B was launched in order to measure the combined geodetic and Lense-Thirring precessions of four spinning quartz spheres. For the orbit of the satellite, general relativity predicts a geodetic precession of $-6606.1$ mas yr$^{-1}$ and a Lense-Thirring precession of $-39.2$ mas yr$^{-1}$. The data taken between August 28th, 2004, and August 14th, 2005, were analysed until mid-2011 and resulted in a geodetic precession of $(-6601.8\pm18.3)$ mas yr$^{-1}$ (cf. Eq. 9.94) and a Lense-Thirring precession of $(-37.2\pm7.2)$ mas yr$^{-1}$, confirming the predictions, albeit less precisely than planned (Phys. Rev. Lett. 106 (2011) 221101).
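How well the measurements agree with the predictions can be expressed in units of the quoted uncertainties. The following sketch uses only the numbers given in the example; the function name `deviation_in_sigma` is a hypothetical helper.

```python
def deviation_in_sigma(measured, uncertainty, predicted):
    """Absolute difference between measurement and prediction in units of the uncertainty."""
    return abs(measured - predicted) / uncertainty

# Values in mas/yr, taken from the example above
geodetic = deviation_in_sigma(-6601.8, 18.3, -6606.1)
lense_thirring = deviation_in_sigma(-37.2, 7.2, -39.2)

if __name__ == "__main__":
    print(f"geodetic: {geodetic:.2f} sigma, Lense-Thirring: {lense_thirring:.2f} sigma")
```

Both deviations come out well below one standard deviation, while the relative uncertainty of the Lense-Thirring measurement (7.2 out of 37.2 mas yr$^{-1}$) is far larger than that of the geodetic one.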

7.4 Gravitational waves

7.4.1 Polarisation states

As shown in (7.24), the linearised field equations in vacuum are

$$\Box\gamma_{\mu\nu} = 0 , \tag{7.69}$$

if the Hilbert gauge condition (7.21) is enforced,

$$\partial_\nu\gamma^{\mu\nu} = 0 . \tag{7.70}$$

Within the Hilbert gauge class, we can further require that the trace of $\gamma_{\mu\nu}$ vanish,

$$\gamma = \gamma^\mu_\mu = 0 . \tag{7.71}$$

To see this, we return to the gauge transformation (7.20), which implies

$$\gamma \to \gamma + 2\partial_\mu\xi^\mu - 4\partial_\mu\xi^\mu = \gamma - 2\partial_\mu\xi^\mu , \tag{7.72}$$

i.e. if $\gamma \ne 0$, we can choose the vector $\xi^\mu$ such that

$$2\partial_\mu\xi^\mu = \gamma . \tag{7.73}$$

Moreover, (7.22) shows that the Hilbert gauge condition remains preserved if $\xi^\mu$ satisfies the d’Alembert equation

$$\Box\xi^\mu = 0 \tag{7.74}$$

at the same time. It can be generally shown that vector fields $\xi^\mu$ can be constructed which indeed satisfy (7.73) and (7.74) at the same time. If we arrange things in this way, (7.14) shows that then $h_{\mu\nu} = \gamma_{\mu\nu}$.

? How would you construct a solution to both equations (7.73) and (7.74)?

All functions propagating with the light speed satisfy the d’Alembert equation (7.69). In particular, we can describe them as superpositions of plane waves

$$\gamma_{\mu\nu} = h_{\mu\nu} = \mathrm{Re}\left(\varepsilon_{\mu\nu}\,\mathrm{e}^{\mathrm{i}\langle k,x\rangle}\right) \tag{7.75}$$

In depth: Spin in a gravitomagnetic field


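On the precession frequency of angular momentum

We begin by noting that $\vec x\times(\vec j\times\vec B) \ne (\vec x\times\vec j)\times\vec B$ because the vector product is not associative, but rather satisfies the Jacobi identity (2.33). The double vector product can be expressed by two scalar products,

$$\vec x\times\left(\vec j\times\vec B\right) = \left(\vec x\cdot\vec B\right)\vec j - \left(\vec x\cdot\vec j\right)\vec B , \tag{7.63}$$

which is also known as the Grassmann identity. For a body rotating with an angular frequency $\vec\omega$, the matter-current density is $\vec j = \rho\,\vec\omega\times\vec x$, thus $\vec x\perp\vec j$, making the second term on the right-hand side of (7.63) vanish. For evaluating the first term, it is important to realise that $\vec\nabla\cdot\vec j = 0$, which is guaranteed here by the continuity equation. Then, for arbitrary functions $f$ and $g$,

$$\int\mathrm{d}^3x\left(f\,\vec j\cdot\vec\nabla g + g\,\vec j\cdot\vec\nabla f\right) = 0 . \tag{7.64}$$

The proof is straightforward, integrating the second term by parts.

Setting $f = x_i$ and $g = x_k$ in (7.64) gives

$$\int\mathrm{d}^3x\left(x_ij_k + x_kj_i\right) = 0 \tag{7.65}$$

and thus allows us to write

$$\int\mathrm{d}^3x\left(\vec x\cdot\vec B\right)j_i = B_k\int\mathrm{d}^3x\,x_kj_i = \frac{1}{2}B_k\int\mathrm{d}^3x\left(x_kj_i - x_ij_k\right) \tag{7.66}$$

or

$$\int\mathrm{d}^3x\left(\vec x\cdot\vec B\right)\vec j = \frac{1}{2}\int\mathrm{d}^3x\left[\left(\vec B\cdot\vec x\right)\vec j - \left(\vec B\cdot\vec j\right)\vec x\right] . \tag{7.67}$$

Reading the Grassmann identity (7.63) backwards finally enables us to bring the right-hand side of (7.67) into the form

$$\int\mathrm{d}^3x\left(\vec x\cdot\vec B\right)\vec j = \frac{1}{2}\vec B\times\underbrace{\int\mathrm{d}^3x\,\vec j\times\vec x}_{=-\vec s} = \frac{1}{2}\vec s\times\vec B , \tag{7.68}$$

as used in (7.58).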

with amplitudes given by the so-called polarisation tensor $\varepsilon_{\mu\nu}$. They satisfy the d’Alembert equation if

$$k^2 = \langle k,k\rangle = k_\mu k^\mu = 0 . \tag{7.76}$$

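The Hilbert gauge condition then requires

$$0 = \partial_\nu h^{\mu\nu} \quad\Rightarrow\quad k_\nu\varepsilon^{\mu\nu} = 0 , \tag{7.77}$$

and (7.71) is satisfied if the trace of $\varepsilon^{\mu\nu}$ vanishes,

$$\varepsilon^\mu_\mu = 0 . \tag{7.78}$$

The five conditions (7.77) and (7.78) imposed on the originally ten independent components of $\varepsilon^{\mu\nu}$ leave five independent components. Without loss of generality, suppose the wave propagates into the positive $z$ direction, then

$$k^\mu = (k, 0, 0, k) , \tag{7.79}$$

and (7.77) implies

$$\varepsilon^{0\mu} = \varepsilon^{3\mu} ; \tag{7.80}$$

specifically,

$$\varepsilon^{00} = \varepsilon^{30} = \varepsilon^{03} = \varepsilon^{33} \quad\text{and}\quad \varepsilon^{01} = \varepsilon^{31} , \quad \varepsilon^{02} = \varepsilon^{32} , \tag{7.81}$$

while (7.78) means

$$-\varepsilon^{00} + \varepsilon^{11} + \varepsilon^{22} + \varepsilon^{33} = 0 . \tag{7.82}$$

Since $\varepsilon^{33} = \varepsilon^{00}$, this last equation means

$$\varepsilon^{11} + \varepsilon^{22} = 0 . \tag{7.83}$$

Therefore, all components of $\varepsilon^{\mu\nu}$ can be expressed by five of them, as follows:

$$\varepsilon^{\mu\nu} = \begin{pmatrix} \varepsilon^{00} & \varepsilon^{01} & \varepsilon^{02} & \varepsilon^{00} \\ \varepsilon^{01} & \varepsilon^{11} & \varepsilon^{12} & \varepsilon^{01} \\ \varepsilon^{02} & \varepsilon^{12} & -\varepsilon^{11} & \varepsilon^{02} \\ \varepsilon^{00} & \varepsilon^{01} & \varepsilon^{02} & \varepsilon^{00} \end{pmatrix} . \tag{7.84}$$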

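Now, a gauge transformation belonging to a vector field

$$\xi^\mu = \mathrm{Re}\left(\mathrm{i}\varepsilon^\mu\,\mathrm{e}^{\mathrm{i}\langle k,x\rangle}\right) \tag{7.85}$$

which keeps the metric perturbation $h_{\mu\nu}$ trace-less,

$$\partial_\mu\xi^\mu = 0 , \tag{7.86}$$

changes the polarisation tensor according to

$$\varepsilon^{\mu\nu} \to \varepsilon^{\mu\nu} + k^\mu\varepsilon^\nu + k^\nu\varepsilon^\mu . \tag{7.87}$$

For the $k$ vector specified in (7.79), we thus have

$$\varepsilon^{00} \to \varepsilon^{00} + 2k\varepsilon^0 , \quad \varepsilon^{01} \to \varepsilon^{01} + k\varepsilon^1 , \quad \varepsilon^{02} \to \varepsilon^{02} + k\varepsilon^2 , \quad \varepsilon^{11} \to \varepsilon^{11} , \quad \varepsilon^{12} \to \varepsilon^{12} . \tag{7.88}$$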

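The condition (7.86) implies that $k_\mu\varepsilon^\mu = 0$, hence $\varepsilon^0 = \varepsilon^3$. We can then use (7.88) to make $\varepsilon^{00}$, $\varepsilon^{01}$ and $\varepsilon^{02}$ vanish, and only the gauge-invariant components $\varepsilon^{11}$ and $\varepsilon^{12}$ are left. Then

$$\varepsilon^\mu = \frac{1}{2k}\left(-\varepsilon^{00}, -2\varepsilon^{01}, -2\varepsilon^{02}, -\varepsilon^{00}\right) \tag{7.89}$$

fixes the gauge transformation, and the polarisation tensor is reduced to

$$\varepsilon^{\mu\nu} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & \varepsilon^{11} & \varepsilon^{12} & 0 \\ 0 & \varepsilon^{12} & -\varepsilon^{11} & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} . \tag{7.90}$$

? Carry out all steps yourself leading from (7.77) to (7.90).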
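The reduction from (7.84) to (7.90) can be checked numerically. The sketch below builds the polarisation tensor from its five free components, applies the transformation (7.87) with the gauge vector (7.89) for $k^\mu = (k, 0, 0, k)$, and returns the result; the function names `wave_polarisation` and `gauge_fix` and the sample numbers are illustrative assumptions.

```python
def wave_polarisation(e00, e01, e02, e11, e12):
    """Polarisation tensor of the form (7.84) as a 4x4 nested list."""
    return [
        [e00, e01, e02, e00],
        [e01, e11, e12, e01],
        [e02, e12, -e11, e02],
        [e00, e01, e02, e00],
    ]

def gauge_fix(eps, k):
    """Apply (7.87) with the gauge vector (7.89) for the wave vector (k, 0, 0, k)."""
    kvec = [k, 0.0, 0.0, k]
    # Gauge vector (7.89); its 0- and 3-components agree, as required by (7.86)
    g = [-eps[0][0] / (2 * k), -eps[0][1] / k, -eps[0][2] / k, -eps[0][0] / (2 * k)]
    return [
        [eps[m][n] + kvec[m] * g[n] + kvec[n] * g[m] for n in range(4)]
        for m in range(4)
    ]

if __name__ == "__main__":
    reduced = gauge_fix(wave_polarisation(0.3, -0.2, 0.7, 1.1, 0.4), k=2.0)
    for row in reduced:
        print(row)
```

Only the central 2×2 block built from $\varepsilon^{11}$ and $\varepsilon^{12}$ survives, which is the statement of (7.90).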
Gauge-invariant polarisation states
As for electromagnetic waves, there are only two gauge-invariant, linearly independent polarisation states for gravitational waves.

? How are polarisation states of electromagnetic waves being described? Recall the Stokes parameters and their meaning.

Under rotations about the $z$ axis by an arbitrary angle $\phi$, the polarisation tensor changes according to

$$\varepsilon'^{\mu\nu} = R^\mu_\alpha R^\nu_\beta\,\varepsilon^{\alpha\beta} , \tag{7.91}$$

where $R$ is the rotation matrix with the components

$$R(\phi) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\phi & \sin\phi & 0 \\ 0 & -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} . \tag{7.92}$$

Carrying out the matrix multiplication yields

$$\varepsilon'^{11} = \varepsilon^{11}\cos2\phi + \varepsilon^{12}\sin2\phi , \quad \varepsilon'^{12} = -\varepsilon^{11}\sin2\phi + \varepsilon^{12}\cos2\phi . \tag{7.93}$$

Defining $\varepsilon_\pm \equiv \varepsilon^{11} \mp \mathrm{i}\varepsilon^{12}$, this can be written as

$$\varepsilon'_\pm = \mathrm{e}^{\pm2\mathrm{i}\phi}\,\varepsilon_\pm , \tag{7.94}$$

which shows that the two polarisation states $\varepsilon_\pm$ have helicity $\pm2$, and thus that they correspond to left and right-handed circular polarisation.
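The helicity statement can be verified with a few lines of code. This sketch carries out the rotation (7.91) on the transverse 2×2 block of the reduced polarisation tensor (7.90) and compares the combinations $\varepsilon_\pm$ before and after; the helper names `rotate_polarisation` and `circular_states` and the sample values are assumptions made for illustration.

```python
import cmath
import math

def rotate_polarisation(e11, e12, phi):
    """Rotate the transverse block [[e11, e12], [e12, -e11]] by phi about the z axis."""
    c, s = math.cos(phi), math.sin(phi)
    rot = [[c, s], [-s, c]]          # transverse part of R(phi) from (7.92)
    eps = [[e11, e12], [e12, -e11]]
    out = [
        [sum(rot[m][a] * rot[n][b] * eps[a][b] for a in range(2) for b in range(2))
         for n in range(2)]
        for m in range(2)
    ]
    return out[0][0], out[0][1]

def circular_states(e11, e12):
    """Return (eps_plus, eps_minus) with eps_pm = e11 -/+ i*e12."""
    return e11 - 1j * e12, e11 + 1j * e12

if __name__ == "__main__":
    phi = 0.37
    plus, minus = circular_states(0.8, -0.3)
    r_plus, r_minus = circular_states(*rotate_polarisation(0.8, -0.3, phi))
    print(r_plus / plus, cmath.exp(2j * phi))    # the two numbers should agree
    print(r_minus / minus, cmath.exp(-2j * phi))
```

A rotation by $\phi$ changes the phase of $\varepsilon_\pm$ by $\pm2\phi$, twice the rotation angle, in contrast to the helicity $\pm1$ of electromagnetic waves.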

7.4.2 Generation of gravitational waves

We return to (7.27) to see how gravitational waves can be emitted. From the start, we introduce the two simplifications that the source is far away and changing with a velocity small compared to the speed of light. Then, we can replace the distance $|\vec x - \vec x'|$ by

$$|\vec x - \vec x'| \approx |\vec x| = r \tag{7.95}$$

because “far away” means that the source is small compared to its distance. Moreover, we can approximate the retarded time coordinate $x^0$ as follows:

$$x^0 - |\vec x - \vec x'| = x^0 - \sqrt{(\vec x - \vec x')^2} = x^0 - \sqrt{\vec x^{\,2} + \vec x'^{\,2} - 2\vec x\cdot\vec x'} \approx x^0 - r + \vec x'\cdot\vec e_r , \tag{7.96}$$

where $\vec e_r$ is the unit vector in radial direction. Then, we obtain

$$\gamma^{\mu\nu}(t, \vec x) = -\frac{4G}{rc^4}\int T^{\mu\nu}\left(t - \frac{r - \vec x'\cdot\vec e_r}{c}, \vec x'\right)\mathrm{d}^3x' . \tag{7.97}$$

Under the assumption of slow motion, we can further ignore the directional dependence of the retarded time, thus approximate $\vec x'\cdot\vec e_r = 0$, and write

$$\gamma^{\mu\nu}(t, \vec x) = -\frac{4G}{rc^4}\int T^{\mu\nu}\left(t - \frac{r}{c}, \vec x'\right)\mathrm{d}^3x' . \tag{7.98}$$

While this is already the essential result, a sequence of transformations of the right-hand side now leads to further important insight.
By means of the local conservation law $\partial_\nu T^{\mu\nu} = 0$, we can begin by simplifying the integral on the right-hand side of (7.98):

$$0 = \int x^k\,\partial_\nu T^{\mu\nu}\,\mathrm{d}^3x = \frac{1}{c}\partial_t\int x^kT^{0\mu}\,\mathrm{d}^3x + \int x^k\,\partial_lT^{l\mu}\,\mathrm{d}^3x = \frac{1}{c}\partial_t\int x^kT^{0\mu}\,\mathrm{d}^3x - \int T^{l\mu}\delta^k_l\,\mathrm{d}^3x , \tag{7.99}$$

where the second term on the right-hand side was integrated by parts, assuming that boundary terms vanish (i.e. enclosing the source completely in the integration boundary). Thus, we see that the volume integral over the energy-momentum tensor can be written as a time derivative,

$$\int T^{k\mu}\,\mathrm{d}^3x = \frac{1}{c}\partial_t\int x^kT^{0\mu}\,\mathrm{d}^3x . \tag{7.100}$$

From Gauß’ theorem, we infer that the volume integral over the divergence of a vector field equals the integral of the vector field over the boundary of the volume and must vanish if the field disappears on the surface,

$$\int\partial_j\left(T^{j0}x^lx^k\right)\mathrm{d}^3x = 0 . \tag{7.101}$$

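This result, together with $\partial_\nu T^{\mu\nu} = 0$, enables us to write

$$\frac{1}{c}\partial_t\int\left(T^{00}x^lx^k\right)\mathrm{d}^3x = \int\partial_\nu\left(T^{\nu0}x^lx^k\right)\mathrm{d}^3x = \int T^{\nu0}\,\partial_\nu\left(x^lx^k\right)\mathrm{d}^3x = \int T^{\nu0}\left(\delta^k_\nu x^l + x^k\delta^l_\nu\right)\mathrm{d}^3x = \int\left(T^{k0}x^l + T^{l0}x^k\right)\mathrm{d}^3x . \tag{7.102}$$

Taking a further partial time derivative of (7.102) and using (7.100) results in

$$\frac{1}{2c^2}\partial_t^2\int\left(T^{00}x^kx^l\right)\mathrm{d}^3x = \frac{1}{2c}\partial_t\int\left(T^{k0}x^l + T^{l0}x^k\right)\mathrm{d}^3x = \frac{1}{2}\int\left(T^{kl} + T^{lk}\right)\mathrm{d}^3x = \int T^{kl}\,\mathrm{d}^3x . \tag{7.103}$$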
The spatial components of the metric perturbation $\gamma^{\mu\nu}$ thus turn out to be given by the second time derivative

$$\gamma^{jk}(t, \vec x) = -\frac{2G}{rc^6}\partial_t^2\int T^{00}\left(t - \frac{r}{c}, \vec x'\right)x'^jx'^k\,\mathrm{d}^3x' . \tag{7.104}$$

? Convince yourself of all the steps leading from (7.99) to (7.103). Verify that the expression (7.104) for the metric perturbation has the correct units.

If we further use that the $T^{00}$ component of the energy-momentum tensor is well approximated by the matter density if the source’s material is moving slowly, we arrive at the main result of this sequence of transformations:

Source of gravitational waves
Wave-like metric perturbations in vacuum are created by the second time derivative of a matter distribution with density $\rho$,

$$\gamma^{jk}(t, \vec x) = -\frac{2G}{rc^4}\partial_t^2\int\rho\left(t - \frac{r}{c}, \vec x'\right)x'^jx'^k\,\mathrm{d}^3x' . \tag{7.105}$$

? Why can electromagnetic waves be created by a time-dependent dipole moment instead?
Finally, we can further simplify the physical interpretation of this result by introducing the source’s quadrupole tensor, which is defined by

$$Q^{jk} = \int\left(3x'^jx'^k - r'^2\delta^{jk}\right)\rho(\vec x')\,\mathrm{d}^3x' . \tag{7.106}$$

It allows us to rewrite the metric perturbation from (7.105) as

$$\gamma^{jk} = -\frac{2G}{3rc^4}\left[\partial_t^2Q^{jk}\left(t - \frac{r}{c}\right) + \delta^{jk}\,\partial_t^2\int r'^2\rho\left(t - \frac{r}{c}, \vec x'\right)\mathrm{d}^3x'\right] . \tag{7.107}$$

? Calculate the quadrupole moment of a binary star with components having masses $M_1$ and $M_2$, orbiting each other on a circular orbit with radius $r$.

Generation of gravitational waves
In order to generate gravitational waves, a mass distribution needs to have a quadrupole moment with a non-vanishing second time derivative.

7.4.3 Energy transport by gravitational waves

The energy current density of gravitational waves is given by the time-space components $T^{0i}_\mathrm{GW}$ of their energy-momentum tensor. The 01-component, i.e. the energy current density propagating into the $x^1$ direction, can be shown to be

$$T^{01}_\mathrm{GW} = \frac{c^3}{32\pi G}\left[2\dot\gamma_{23}^2 + \frac{1}{2}\left(\dot\gamma_{22} - \dot\gamma_{33}\right)^2\right] , \tag{7.108}$$

which can be written with the help of (7.107) as

$$T^{01}_\mathrm{GW} = \frac{G}{72\pi r^2c^5}\left[2\dddot Q_{23}^2 + \frac{1}{2}\left(\dddot Q_{22} - \dddot Q_{33}\right)^2\right] , \tag{7.109}$$

showing one of the rare cases of a third time derivative in physics.

? Does the expression (7.109) have the correct units?
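The transversal quadrupole tensor is

$$Q^\mathrm{T} = \begin{pmatrix} Q_{22} & Q_{23} \\ Q_{32} & Q_{33} \end{pmatrix} \tag{7.110}$$

because the direction of propagation was chosen as the $x^1$ axis. Defining the transversal trace-free quadrupole tensor by

$$Q^\mathrm{TT} := Q^\mathrm{T} - \frac{I}{2}\mathrm{Tr}\,Q^\mathrm{T} = \frac{1}{2}\begin{pmatrix} Q_{22} - Q_{33} & 2Q_{23} \\ 2Q_{23} & -(Q_{22} - Q_{33}) \end{pmatrix} , \tag{7.111}$$

we see that an invariant expression for the bracket on the right-hand side of (7.109) is obtained from

$$\mathrm{Tr}\left(Q^\mathrm{TT}Q^\mathrm{TT}\right) = \frac{1}{2}\left(Q_{22} - Q_{33}\right)^2 + 2Q_{23}^2 \tag{7.112}$$

when $Q$ is replaced by its third time derivative, and thus the energy current density in gravitational waves has the components

$$T^{0i}_\mathrm{GW} = \frac{G}{72\pi r^2c^5}\,\mathrm{Tr}\left(\dddot Q^\mathrm{TT}\dddot Q^\mathrm{TT}\right) . \tag{7.113}$$

Einstein’s quadrupole formula
A final integration over a sphere with radius $r$ yields Einstein’s famous quadrupole formula for the gravitational-wave “luminosity”,

$$L_\mathrm{GW} = \frac{G}{5c^5}\,\mathrm{Tr}\left(\dddot Q^2\right) . \tag{7.114}$$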

Example: First direct detection of gravitational waves


On September 14th, 2015, the LIGO interferometers at Hanford (Washington, USA) and Livingston (Louisiana, USA) registered the gravitational-wave signal summarised in Fig. 7.1. The figure shows that the frequency $f$ increased from ≈ 50 Hz to ≈ 100 Hz within ≈ 40 ms. Inserting $f \approx 75$ Hz and $\dot f \approx 50\,\mathrm{Hz}/0.04\,\mathrm{s} \approx 1250$ Hz s$^{-1}$ into the formula (7.128) derived in the In-depth box “The chirp mass of a binary star” for the chirp mass $\mathcal{M}$ gives

$$\mathcal{M} \approx 30\,M_\odot . \tag{7.115}$$

For two equal masses $m_1 = m_2 =: m$, $M = 2m$ and $\mu = m/2$, thus

$$\mathcal{M} = \frac{m}{2^{1/5}} , \quad m \approx 1.15\,\mathcal{M} \approx 35\,M_\odot . \tag{7.116}$$

At an orbital frequency of $\omega = \pi f \approx 240$ Hz, Kepler’s third law (7.121) requires the two masses to be separated by

$$R \approx \left(\frac{2Gm}{\omega^2}\right)^{1/3} \approx 550\ \mathrm{km} , \tag{7.117}$$

less than a thousandth of the Solar radius. No ordinary stars could ever come as close. Objects of mass $m$ closer than $R$ must be black holes. The merging black-hole binary became known as GW150914.

Inserting the quadrupole tensor (7.122) into (7.105) leads to

$$\left|\gamma^{jk}\right| \le 4\,\frac{GM}{rc^2}\,\frac{G\mu}{Rc^2} \tag{7.118}$$

which, for equal masses, turns into the intuitive expression

$$\left|\gamma^{jk}\right| \le \frac{R_\mathrm{s}^2}{Rr} \tag{7.119}$$

in terms of the Schwarzschild radius $R_\mathrm{s} = 2Gm/c^2$. With $R_\mathrm{s} \approx 100$ km for $m \approx 35\,M_\odot$ and $|\gamma^{jk}| \approx 10^{-21}$, the distance of the merging black holes can be estimated to be

$$r \approx 2\cdot10^{27}\ \mathrm{cm} \approx 600\ \mathrm{Mpc} . \tag{7.120}$$
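The chain of estimates in this example can be reproduced with a short script. It is a sketch under the same assumptions as the text (equal masses, Newtonian orbit, strain bound (7.119) taken as an equality); the constants are rounded values, and the function name `equal_mass_estimates` is a hypothetical helper.

```python
import math

G = 6.674e-11      # gravitational constant in m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light in m/s (rounded)
M_SUN = 1.989e30   # solar mass in kg (rounded)
MPC = 3.086e22     # one megaparsec in m (rounded)

def equal_mass_estimates(chirp_mass_solar, f_gw, strain):
    """Return (m in solar masses, separation in m, R_s in m, distance in Mpc)."""
    m = 2 ** 0.2 * chirp_mass_solar * M_SUN          # (7.116)
    omega = math.pi * f_gw                           # orbital angular frequency
    separation = (2 * G * m / omega ** 2) ** (1 / 3)  # (7.117)
    r_s = 2 * G * m / C ** 2                         # Schwarzschild radius of one component
    distance = r_s ** 2 / (separation * strain)      # (7.119) solved for r
    return m / M_SUN, separation, r_s, distance / MPC

if __name__ == "__main__":
    m, sep, r_s, dist = equal_mass_estimates(30.0, 75.0, 1e-21)
    print(f"m = {m:.1f} M_sun, R = {sep / 1e3:.0f} km, "
          f"R_s = {r_s / 1e3:.0f} km, r = {dist:.0f} Mpc")
```

With the input values of the example, the script returns numbers close to those quoted in (7.116), (7.117) and (7.120); small differences come from rounding in the text.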



In depth: The chirp mass of a binary star


A Newtonian estimate
Two stars of masses $m_{1,2}$ separated by a distance $R$ orbit their centre-of-mass at distances $R_{1,2}$ with an angular frequency $\omega$. They obey Kepler’s third law,

$$\omega^2 = \frac{GM}{R^3} , \quad M := m_1 + m_2 . \tag{7.121}$$

Assuming circular orbits with radii $R_i = m_iR/M$ according to the definition of the centre-of-mass, their quadrupole tensor is

$$Q = \mu R^2\left[\begin{pmatrix} \cos^2\omega t & \sin\omega t\cos\omega t & 0 \\ \sin\omega t\cos\omega t & \sin^2\omega t & 0 \\ 0 & 0 & 0 \end{pmatrix} - \frac{1}{3}\mathbb{1}_3\right] , \tag{7.122}$$

where $\mu := m_1m_2/M$ is the reduced mass. Straightforward calculation gives

$$\mathrm{Tr}\left(\dddot Q^2\right) = 32\,\omega^6\mu^2R^4 , \tag{7.123}$$

leading us with (7.114) to the gravitational-wave luminosity

$$L_\mathrm{GW} = \frac{32}{5}\frac{G}{c^5}\,\omega^6\mu^2R^4 . \tag{7.124}$$
According to the virial theorem, the total energy of the binary star is

$$E = \frac{1}{2}E_\mathrm{pot} = -\frac{1}{2}\frac{Gm_1m_2}{R} = -\frac{1}{2}\frac{G\mu M}{R} . \tag{7.125}$$

Its absolute time derivative must equal the gravitational-wave luminosity, $|\dot E| = L_\mathrm{GW}$. Since $R \propto \omega^{-2/3}$ from (7.121),

$$\frac{\dot R}{R} = -\frac{2}{3}\frac{\dot\omega}{\omega} \tag{7.126}$$

and thus

$$|\dot E| = \frac{1}{3}\frac{G\mu M}{R}\frac{\dot\omega}{\omega} . \tag{7.127}$$

Equating this to (7.124), using (7.121) to eliminate the radius via the angular frequency, taking into account that the frequency $f$ of the gravitational waves emitted by the binary is $f = \omega/\pi$, and sorting terms leads to the chirp mass

$$\mathcal{M} := \left(M^{2/3}\mu\right)^{3/5} = \frac{c^3}{8G}\left(\frac{5\dot f}{3\pi^{8/3}f^{11/3}}\right)^{3/5} . \tag{7.128}$$

Although this estimate is based on three grossly simplifying assumptions (Newtonian gravity, circular orbits, and negligible energy loss per orbit), the expression for the chirp mass and its numerical value are close to the relativistic result in leading-order calculation.
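Evaluating (7.128) by hand is error-prone because of the fractional exponents, so a small script is convenient. The following is a sketch with rounded constants; the function name `chirp_mass` is an illustrative choice, and the input values are those read off the GW150914 signal in the example above.

```python
import math

G = 6.674e-11      # gravitational constant in m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light in m/s (rounded)
M_SUN = 1.989e30   # solar mass in kg (rounded)

def chirp_mass(f, f_dot):
    """Chirp mass in kg from the gravitational-wave frequency f (Hz) and its rate f_dot (Hz/s), Eq. (7.128)."""
    bracket = 5 * f_dot / (3 * math.pi ** (8 / 3) * f ** (11 / 3))
    return C ** 3 / (8 * G) * bracket ** (3 / 5)

if __name__ == "__main__":
    print(chirp_mass(75.0, 1250.0) / M_SUN)   # chirp mass in solar masses
```

For $f \approx 75$ Hz and $\dot f \approx 1250$ Hz s$^{-1}$, the result is close to the 30 solar masses quoted in (7.115).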

Figure 7.1 Wave forms and frequency diagrams of the gravitational-wave signals registered on September 14th, 2015, by the LIGO interferometers at Hanford and Livingston (USA). This was the first direct detection of a gravitational wave. The figure shows the strain $|\gamma^{jk}|$ measured by the two interferometers, the reconstruction of the signal by comparison to the signal expected from a merging black-hole binary, and the frequency of the gravitational waves as a function of time. Since the frequency is increasing during the event, an acoustic representation resembles a chirp. Source: Wikipedia
Chapter 8

The Schwarzschild Solution

8.1 Cartan’s structure equations

8.1.1 Curvature forms

This section deals with a generalisation of the connection coefficients,


and the torsion and curvature tensor components, to arbitrary bases.
This will prove enormously efficient in our further discussion of the
Schwarzschild solution.
Let M be a differentiable manifold, {ei } an arbitrary basis for vector fields
and {θi } an arbitrary basis for dual vector fields, or 1-forms.
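Connection forms
In analogy to the Christoffel symbols, we introduce the connection forms by

$$\nabla_ve_i = \omega^j_i(v)\,e_j . \tag{8.1}$$

Since $\nabla_ve_i$ is a vector, $\omega^j_i(v) \in \mathbb{R}$ is a real number, and thus $\omega^j_i \in \bigwedge^1$ is a dual vector, or a one-form.

Since, by definition (3.2) of the Christoffel symbols

$$\nabla_{\partial_k}\partial_j = \Gamma^i_{kj}\,\partial_i = \omega^i_j(\partial_k)\,\partial_i \tag{8.2}$$

in the coordinate basis $\{\partial_i\}$, we have in that particular basis,

$$\omega^i_j = \Gamma^i_{kj}\,\mathrm{d}x^k . \tag{8.3}$$

Since $\langle\theta^i, e_j\rangle$ is a constant (which is either zero or unity if the basis is orthonormal), we must have

$$0 = \nabla_v\langle\theta^i, e_j\rangle = \langle\nabla_v\theta^i, e_j\rangle + \langle\theta^i, \nabla_ve_j\rangle = \langle\nabla_v\theta^i, e_j\rangle + \langle\theta^i, \omega^k_j(v)e_k\rangle = \langle\nabla_v\theta^i, e_j\rangle + \omega^i_j(v) . \tag{8.4}$$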


From this result, we can conclude

$$\nabla_v\theta^i = -\omega^i_j(v)\,\theta^j \tag{8.5}$$

for the covariant derivative of $\theta^i$ in the direction of $v$. Without specifying the vector $v$, we find the covariant derivative

$$\nabla\theta^i = -\theta^j\otimes\omega^i_j . \tag{8.6}$$

Let now $\alpha \in \bigwedge^1$ be a one-form such that $\alpha = \alpha_i\theta^i$ with arbitrary functions $\alpha_i$. Then, the equations we have derived so far imply

$$\nabla_v\alpha = v(\alpha_i)\theta^i + \alpha_i\nabla_v\theta^i = \langle\mathrm{d}\alpha_i - \alpha_k\omega^k_i, v\rangle\,\theta^i , \tag{8.7}$$

where we have used the differential of the function $\alpha_i$, defined in (2.35) by $\mathrm{d}\alpha_i(v) = v(\alpha_i)$, together with the notation $\langle w, v\rangle = w(v)$ for a vector $v$ and a dual vector $w$. More generally, this expression can be written as the covariant derivative

$$\nabla\alpha = \theta^i\otimes\left(\mathrm{d}\alpha_i - \alpha_k\omega^k_i\right) . \tag{8.8}$$

Similarly, for a vector field $x = x^ie_i$, we find

$$\nabla_vx = \langle\mathrm{d}x^i + x^k\omega^i_k, v\rangle\,e_i \tag{8.9}$$

or

$$\nabla x = e_i\otimes\left(\mathrm{d}x^i + \omega^i_kx^k\right) \tag{8.10}$$

for the covariant derivative of the vector $x$.

? Derive the expressions (8.9) and (8.10) yourself, beginning with (8.1).
8.1.2 Torsion and curvature forms

We are now in a position to use the connection forms for defining the
torsion and curvature forms.
Torsion and curvature forms
By definition, the torsion $T(x, y)$ is a vector, which can be written in terms of the torsion forms $\Theta^i$ as

$$T(x, y) = \Theta^i(x, y)\,e_i . \tag{8.11}$$

Obviously, $\Theta^i \in \bigwedge^2$ is a two-form, such that $\Theta^i(x, y) \in \mathbb{R}$ is a real number.
In the same manner, we express the curvature by the curvature forms $\Omega^i_j \in \bigwedge^2$,

$$\bar R(x, y)e_j = \Omega^i_j(x, y)\,e_i . \tag{8.12}$$

The next important step is now to realise that the torsion and curvature
2-forms satisfy Cartan’s structure equations:

Figure 8.1 Élie Cartan (1869–1951), French mathematician. Source:


Wikipedia

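Cartan’s structure equations
In terms of the connection forms $\omega^i_j$, the torsion forms $\Theta^i$ and the curvature forms $\Omega^i_j$ are determined by Cartan’s structure equations,

$$\Theta^i = \mathrm{d}\theta^i + \omega^i_j\wedge\theta^j , \qquad \Omega^i_j = \mathrm{d}\omega^i_j + \omega^i_k\wedge\omega^k_j . \tag{8.13}$$

Their proof is straightforward. To prove the first structure equation, we insert the definition (3.45) of the torsion to obtain as a first step

$$\Theta^i(x, y)\,e_i = \nabla_xy - \nabla_yx - [x, y] = \nabla_x\left(\theta^i(y)e_i\right) - \nabla_y\left(\theta^i(x)e_i\right) - \theta^i([x, y])\,e_i , \tag{8.14}$$

where we have expanded the vectors $x$, $y$ and $[x, y]$ in the basis $\{e_i\}$ according to $x = \langle\theta^i, x\rangle e_i = \theta^i(x)e_i$. Then, we continue by using the connection forms,

$$\begin{aligned}
\Theta^i(x, y)\,e_i &= \nabla_x\left(\theta^i(y)e_i\right) - \nabla_y\left(\theta^i(x)e_i\right) - \theta^i([x, y])\,e_i \\
&= x\theta^i(y)\,e_i + \theta^i(y)\omega^j_i(x)\,e_j - y\theta^i(x)\,e_i - \theta^i(x)\omega^j_i(y)\,e_j - \theta^i([x, y])\,e_i \\
&= \left[x\theta^i(y) - y\theta^i(x) - \theta^i([x, y])\right]e_i + \left[\theta^i(y)\omega^j_i(x) - \theta^i(x)\omega^j_i(y)\right]e_j .
\end{aligned} \tag{8.15}$$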

According to (5.66), the first term can be expressed by the exterior derivative of the $\theta^i$, and since the second term is antisymmetric in $x$ and $y$, we can write this as

$$\Theta^i(x, y)\,e_i = \mathrm{d}\theta^i(x, y)\,e_i + \left(\omega^i_j\wedge\theta^j\right)(x, y)\,e_i , \tag{8.16}$$

from which the first structure equation follows immediately.


The proof of the second structure equation proceeds similarly, using the definition (3.51) of the curvature. Thus,

$$\begin{aligned}
\Omega^i_j(x, y)\,e_i &= \nabla_x\nabla_ye_j - \nabla_y\nabla_xe_j - \nabla_{[x,y]}e_j \\
&= \nabla_x\left(\omega^i_j(y)e_i\right) - \nabla_y\left(\omega^i_j(x)e_i\right) - \omega^i_j([x, y])\,e_i \\
&= x\omega^i_j(y)\,e_i + \omega^i_j(y)\nabla_xe_i - y\omega^i_j(x)\,e_i - \omega^i_j(x)\nabla_ye_i - \omega^i_j([x, y])\,e_i \\
&= \left[x\omega^i_j(y) - y\omega^i_j(x) - \omega^i_j([x, y])\right]e_i + \left[\omega^i_j(y)\omega^k_i(x) - \omega^i_j(x)\omega^k_i(y)\right]e_k \\
&= \mathrm{d}\omega^i_j(x, y)\,e_i + \left(\omega^k_i\wedge\omega^i_j\right)(x, y)\,e_k ,
\end{aligned} \tag{8.17}$$

which proves the second structure equation.

? Carry out all steps of the derivations (8.15) and (8.17) yourself and convince yourself that they are correct.

Now, let us use the curvature forms $\Omega^i_j$ to define tensor components $\bar R^i_{\ jkl}$ by

$$\Omega^i_j \equiv \frac{1}{2}\bar R^i_{\ jkl}\,\theta^k\wedge\theta^l , \tag{8.18}$$

whose antisymmetry in the last two indices is obvious by definition,

$$\bar R^i_{\ jkl} = -\bar R^i_{\ jlk} . \tag{8.19}$$

In an arbitrary basis $\{e_i\}$, we then have

$$\langle\theta^i, \bar R(e_k, e_l)e_j\rangle = \langle\theta^i, \Omega^s_j(e_k, e_l)e_s\rangle = \Omega^i_j(e_k, e_l) = \bar R^i_{\ jkl} . \tag{8.20}$$

Comparing this to the components of the curvature tensor in the coordinate basis $\{\partial_i\}$ given by (3.56) shows that the functions $\bar R^i_{\ jkl}$ are indeed the components of the curvature tensor in the arbitrary basis $\{e_i\}$.

A similar operation shows that the functions $T^i_{jk}$ defined by

$$\Theta^i \equiv \frac{1}{2}T^i_{jk}\,\theta^j\wedge\theta^k \tag{8.21}$$

are the elements of the torsion tensor in the basis $\{e_i\}$, since

$$\langle\theta^i, T(e_j, e_k)\rangle = \langle\theta^i, \Theta^s(e_j, e_k)e_s\rangle = \Theta^i(e_j, e_k) = T^i_{jk} . \tag{8.22}$$

Thus, Cartan’s structure equations allow us to considerably simplify the computation of curvature and torsion for an arbitrary metric, provided we find a basis in which the metric appears simple (e.g. diagonal and constant).

Symmetry of the connection forms


We mention without proof that the connection $\nabla$ is metric if and only if

$$\omega_{ij} + \omega_{ji} = \mathrm{d}g_{ij} , \tag{8.23}$$

where the definitions

$$\omega_{ij} \equiv g_{ik}\,\omega^k_j \quad\text{and}\quad g_{ij} \equiv g(e_i, e_j) \tag{8.24}$$

were used, i.e. the $g_{ij}$ are the components of the metric in the arbitrary basis $\{e_i\}$.

8.2 Stationary and static spacetimes

Stationary spacetimes $(M, g)$ are defined to be spacetimes which have a time-like Killing vector field $K$. This means that observers moving along the integral curves of $K$ do not notice any change.

? What exactly are Killing vector fields? How are they defined, and what do they mean?

This definition implies that we can introduce coordinates in which the components $g_{\mu\nu}$ of the metric do not depend on time. To see this, suppose we choose a space-like hypersurface $\Sigma \subset M$ and construct the integral curves of $K$ through $\Sigma$.

We further introduce arbitrary coordinates on $\Sigma$ and carry them into $M$ as follows: let $\phi_t$ be the flow of $K$, $p_0 \in \Sigma$ and $p = \phi_t(p_0)$, then the coordinates of $p$ are chosen as $(t, x^1(p_0), x^2(p_0), x^3(p_0))$. These are the so-called Lagrange coordinates of $p$.

In these coordinates, $K = \partial_0$, i.e. $K^\mu = \delta^\mu_0$. From the derivation of the Killing equation (5.34), we further have that the components of the Lie derivative of the metric are

$$(\mathcal{L}_Kg)_{\mu\nu} = K^\lambda\partial_\lambda g_{\mu\nu} + g_{\lambda\nu}\partial_\mu K^\lambda + g_{\mu\lambda}\partial_\nu K^\lambda = \partial_0g_{\mu\nu} = 0 , \tag{8.25}$$

which proves that the $g_{\mu\nu}$ do not depend on time in these so-called adapted coordinates.

We can straightforwardly introduce a one-form $\omega$ corresponding to the Killing vector $K$ by $\omega = K^\flat$. This one-form obviously satisfies

$$\omega(K) = \langle K, K\rangle \ne 0 . \tag{8.26}$$

Suppose that we now have a stationary spacetime in which we have introduced adapted coordinates and in which also $g_{0i} = 0$. Then, the Killing vector field is orthogonal to the spatial sections, for which $t = \text{const}$. Then, the one-form $\omega$ is quite obviously

$$\omega = g_{00}\,c\,\mathrm{d}t = \langle K, K\rangle\,c\,\mathrm{d}t , \tag{8.27}$$

because $K = \partial_0$. This then trivially implies the Frobenius condition

$$\omega\wedge\mathrm{d}\omega = 0 \tag{8.28}$$

because the exterior derivative $\mathrm{d}$ satisfies $\mathrm{d}\circ\mathrm{d} \equiv 0$.

Conversely, it can be shown that if the Frobenius condition holds, the one-form $\omega$ can be written in the form (8.27). For a vector field $v$ tangential to a spacelike section defined by $t = \text{const.}$, we have

$$\langle K, v\rangle = \omega(v) = \langle K, K\rangle\,c\,\mathrm{d}t(v) = \langle K, K\rangle\,c\,v(t) = 0 \tag{8.29}$$

because $t = \text{const.}$, and thus $K$ is then perpendicular to the spatial section. Thus, $K = \partial_0$ and

$$g_{0i} = \langle\partial_0, \partial_i\rangle = \langle K, \partial_i\rangle = 0 . \tag{8.30}$$

Stationary and static spacetimes
Thus, in a stationary spacetime with time-like Killing vector field $K$, the Frobenius condition (8.28) for the one-form $\omega = K^\flat$ is equivalent to the condition $g_{0i} = 0$ in adapted coordinates. Such spacetimes are called static. In other words, stationary spacetimes are static if and only if the Frobenius condition holds.
In static spacetimes, the metric can thus be written in the form

$$g = g_{00}(\vec x)\,c^2\mathrm{d}t^2 + g_{ij}(\vec x)\,\mathrm{d}x^i\mathrm{d}x^j . \tag{8.31}$$

8.3 The Schwarzschild solution

8.3.1 Form of the metric

Formally speaking, the Schwarzschild solution is a static, spherically symmetric solution of Einstein’s field equations for vacuum spacetime. From our earlier considerations, we know that a static spacetime is a stationary spacetime whose (time-like) Killing vector field satisfies the Frobenius condition (8.28).

As the spacetime is (globally) stationary, we know that we can introduce spatial hypersurfaces $\Sigma$ perpendicular to the Killing vector field which, in adapted coordinates, is $K = \partial_0$. The manifold $(M, g)$ can thus be foliated as $M = \mathbb{R}\times\Sigma$.

? How does a product space composed of two manifolds attain the structure of a product manifold?

From (8.31), we then know that, also in adapted coordinates, the metric acquires the form

$$g = -\phi^2c^2\mathrm{d}t^2 + h , \tag{8.32}$$

where $\phi$ is a smoothly varying function on $\Sigma$ and $h$ is the metric of the spatial sections $\Sigma$. Under the assumption that $K$ is the only time-like Killing vector field which the spacetime admits, $t$ is a uniquely distinguished time coordinate, and we can write

$$-\phi^2 = \langle K, K\rangle . \tag{8.33}$$

The stationarity of the spacetime, expressed by the existence of the single Killing vector field $K$, thus allows a convenient foliation of the spacetime into spatial hypersurfaces or leaves $\Sigma$ and a time coordinate.

Furthermore, the spatial hypersurfaces $\Sigma$ are expected to be spherically symmetric. This means that the group $SO(3)$ (i.e. the group of rotations in three dimensions) must be an isometry group of the metric $h$. The orbits of $SO(3)$ are two-dimensional, space-like surfaces in $\Sigma$. Thus, $SO(3)$ foliates the space $(\Sigma, h)$ into invariant two-spheres.

Let the surface area of these two-spheres be $A$, then we define a radial coordinate for the Schwarzschild metric requiring

$$4\pi r^2 = A \tag{8.34}$$

as in Euclidean geometry. Moreover, the spherical symmetry implies that we can introduce spherical polar coordinates $(\vartheta, \varphi)$ on one particular orbit of $SO(3)$ which can then be transported along geodesic lines perpendicular to the orbits. Then, the spatial metric $h$ can be written in the form

$$h = \mathrm{e}^{2b(r)}\mathrm{d}r^2 + r^2\left(\mathrm{d}\vartheta^2 + \sin^2\vartheta\,\mathrm{d}\varphi^2\right) , \tag{8.35}$$

where the exponential factor was introduced to allow a scaling of the radial coordinate.

? Why could it be useful to represent the coefficient of $\mathrm{d}r^2$ in $h$ by an exponential?

Due to the stationarity of the metric and the spherical symmetry of the spatial sections, $\langle K, K\rangle$ can only depend on $r$. We set

$$\phi^2 = -\langle K, K\rangle = \mathrm{e}^{2a(r)} . \tag{8.36}$$

The full metric $g$ is thus characterised by two radial functions $a(r)$ and $b(r)$ which we need to determine. The exponential functions in (8.35) and (8.36) are chosen to ensure that the prefactors $\mathrm{e}^a$ and $\mathrm{e}^b$ are always positive.

The spatial sections $\Sigma$ are now foliated according to

$$\Sigma = I\times S^2 , \quad I \subset \mathbb{R}^+ , \tag{8.37}$$

with coordinates $r \in I$ and $(\vartheta, \varphi) \in S^2$.



Metric for static, spherically-symmetric spacetimes


In the Schwarzschild coordinates $(t, r, \vartheta, \varphi)$, the metric of a static, spherically-symmetric spacetime has the form

$$g = -\mathrm{e}^{2a(r)}c^2\mathrm{d}t^2 + \mathrm{e}^{2b(r)}\mathrm{d}r^2 + r^2\left(\mathrm{d}\vartheta^2 + \sin^2\vartheta\,\mathrm{d}\varphi^2\right) . \tag{8.38}$$

The functions $a(r)$ and $b(r)$ are constrained by the requirement that the metric should asymptotically turn flat, which means

$$a(r) \to 0 , \quad b(r) \to 0 \quad\text{for}\quad r \to \infty . \tag{8.39}$$

They must be determined by inserting the metric (8.38) into the vacuum field equations, $G = 0$.

8.3.2 Connection and curvature forms

In order to evaluate Einstein’s field equations for the Schwarzschild metric, we now need to compute the Riemann, Ricci, and Einstein tensors. Traditionally, one would begin this step with computing all Christoffel symbols of the metric (8.38). This very lengthy and error-prone procedure can be considerably shortened using Cartan’s structure equations (8.13) for the torsion and curvature forms $\Theta^i$ and $\Omega^i_j$.

To do so, we need to introduce a suitable basis, or tetrad $\{e_i\}$, or alternatively a dual tetrad $\{\theta^i\}$. Guided by the form of the metric (8.38), we choose

$$\theta^0 = \mathrm{e}^a\,c\,\mathrm{d}t , \quad \theta^1 = \mathrm{e}^b\,\mathrm{d}r , \quad \theta^2 = r\,\mathrm{d}\vartheta , \quad \theta^3 = r\sin\vartheta\,\mathrm{d}\varphi . \tag{8.40}$$

In terms of these, the metric attains the simple diagonal, Minkowskian form

$$g = g_{\mu\nu}\,\theta^\mu\otimes\theta^\nu , \quad g_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1) . \tag{8.41}$$

Obviously, $\mathrm{d}g = 0$, and thus (8.23) implies that the connection forms $\omega_{\mu\nu}$ need to be antisymmetric,

$$\omega_{\mu\nu} = -\omega_{\nu\mu} . \tag{8.42}$$

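Given the dual tetrad $\{\theta^\mu\}$, we must take their exterior derivatives. For this purpose, we apply the expression (??) and find, for $\mathrm{d}\theta^0$,

$$\mathrm{d}\theta^0 = \mathrm{d}\mathrm{e}^a\wedge c\,\mathrm{d}t = -a'\mathrm{e}^a\,c\,\mathrm{d}t\wedge\mathrm{d}r \tag{8.43}$$

because $\mathrm{d}\mathrm{e}^a = a'\mathrm{e}^a\,\mathrm{d}r$. Similarly, we find

$$\mathrm{d}\theta^1 = 0 \tag{8.44}$$

because $\mathrm{d}r\wedge\mathrm{d}r = 0$, further

$$\mathrm{d}\theta^2 = \mathrm{d}r\wedge\mathrm{d}\vartheta \tag{8.45}$$

and

$$\mathrm{d}\theta^3 = \sin\vartheta\,\mathrm{d}r\wedge\mathrm{d}\varphi + r\cos\vartheta\,\mathrm{d}\vartheta\wedge\mathrm{d}\varphi . \tag{8.46}$$

Using (8.40), we can also express the coordinate differentials by the dual tetrad,

$$c\,\mathrm{d}t = \mathrm{e}^{-a}\theta^0 , \quad \mathrm{d}r = \mathrm{e}^{-b}\theta^1 , \quad \mathrm{d}\vartheta = \frac{\theta^2}{r} , \quad \mathrm{d}\varphi = \frac{\theta^3}{r\sin\vartheta} , \tag{8.47}$$

so that we can write the exterior derivatives of the dual tetrad as

$$\mathrm{d}\theta^0 = a'\mathrm{e}^{-b}\,\theta^1\wedge\theta^0 , \quad \mathrm{d}\theta^1 = 0 , \quad \mathrm{d}\theta^2 = \frac{\mathrm{e}^{-b}}{r}\,\theta^1\wedge\theta^2 , \quad \mathrm{d}\theta^3 = \frac{\mathrm{e}^{-b}}{r}\,\theta^1\wedge\theta^3 + \frac{\cot\vartheta}{r}\,\theta^2\wedge\theta^3 . \tag{8.48}$$

? Test by independent calculation whether you can confirm the differentials (8.48).

Since the torsion must vanish, $\Theta^i = 0$, Cartan’s first structure equation from (8.13) implies

$$\mathrm{d}\theta^\mu = -\omega^\mu_\nu\wedge\theta^\nu . \tag{8.49}$$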

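Connection forms
With (8.48), this suggests that the connection forms of a static, spherically-symmetric metric are

$$\begin{aligned}
\omega^0_1 &= \omega^1_0 = \frac{a'\theta^0}{\mathrm{e}^b} , \quad \omega^0_2 = \omega^2_0 = 0 , \quad \omega^0_3 = \omega^3_0 = 0 , \\
\omega^2_1 &= -\omega^1_2 = \frac{\theta^2}{r\mathrm{e}^b} , \quad \omega^3_1 = -\omega^1_3 = \frac{\theta^3}{r\mathrm{e}^b} , \\
\omega^3_2 &= -\omega^2_3 = \frac{\cot\vartheta\,\theta^3}{r} .
\end{aligned} \tag{8.50}$$

They satisfy the antisymmetry condition (8.42) and Cartan’s first structure equation (8.49) for a torsion-free connection.

? Why can none of the connection forms in (8.50) depend on $\varphi$?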

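For evaluating the curvature forms $\Omega^\mu_\nu$, we first recall that the exterior derivative of a one-form $\omega$ multiplied by a function $f$ is

$$\mathrm{d}(f\omega) = \mathrm{d}f\wedge\omega + f\,\mathrm{d}\omega = (\partial_if)\,\mathrm{d}x^i\wedge\omega + f\,\mathrm{d}\omega \tag{8.51}$$

according to the (anti-)Leibniz rule (??).

Thus, we have for $\mathrm{d}\omega^0_1$

$$\begin{aligned}
\mathrm{d}\omega^0_1 &= \left(a'\mathrm{e}^{-b}\right)'\mathrm{d}r\wedge\theta^0 + a'\mathrm{e}^{-b}\,\mathrm{d}\theta^0 \\
&= \left(a''\mathrm{e}^{-b} - a'b'\mathrm{e}^{-b}\right)\mathrm{e}^{-b}\,\theta^1\wedge\theta^0 + \left(a'\mathrm{e}^{-b}\right)^2\theta^1\wedge\theta^0 \\
&=: -A\,\theta^0\wedge\theta^1 ,
\end{aligned} \tag{8.52}$$

where we have used (8.47) and (8.48) and abbreviated $A := (a'' - a'b' + a'^2)E$ with $E := \exp(-2b)$.

In much the same way and using this definition of $E$, we find

$$\begin{aligned}
\mathrm{d}\omega^2_1 &= -\frac{b'E}{r}\,\theta^1\wedge\theta^2 , \\
\mathrm{d}\omega^3_1 &= -\frac{b'E}{r}\,\theta^1\wedge\theta^3 + \frac{\cot\vartheta}{r^2\mathrm{e}^b}\,\theta^2\wedge\theta^3 , \\
\mathrm{d}\omega^3_2 &= -\frac{1}{r^2}\,\theta^2\wedge\theta^3 .
\end{aligned} \tag{8.53}$$

This yields the curvature two-forms according to (8.13).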
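Curvature forms of a static, spherically-symmetric metric
The curvature forms of a static, spherically-symmetric metric are

$$\begin{aligned}
\Omega^0_1 &= \mathrm{d}\omega^0_1 = -A\,\theta^0\wedge\theta^1 = \Omega^1_0 \\
\Omega^0_2 &= \omega^0_1\wedge\omega^1_2 = -\frac{a'E}{r}\,\theta^0\wedge\theta^2 = \Omega^2_0 \\
\Omega^0_3 &= \omega^0_1\wedge\omega^1_3 = -\frac{a'E}{r}\,\theta^0\wedge\theta^3 = \Omega^3_0 \\
\Omega^1_2 &= \mathrm{d}\omega^1_2 + \omega^1_3\wedge\omega^3_2 = \frac{b'E}{r}\,\theta^1\wedge\theta^2 = -\Omega^2_1 \\
\Omega^1_3 &= \mathrm{d}\omega^1_3 + \omega^1_2\wedge\omega^2_3 = \frac{b'E}{r}\,\theta^1\wedge\theta^3 = -\Omega^3_1 \\
\Omega^2_3 &= \mathrm{d}\omega^2_3 + \omega^2_1\wedge\omega^1_3 = \frac{1-E}{r^2}\,\theta^2\wedge\theta^3 = -\Omega^3_2 .
\end{aligned} \tag{8.54}$$

The remaining curvature two-forms follow from antisymmetry since

$$\Omega_{\mu\nu} = g_{\mu\lambda}\,\Omega^\lambda_\nu = -\Omega_{\nu\mu} , \tag{8.55}$$

because of the (anti-)symmetries of the curvature.

? Perform the calculations leading to the curvature forms (8.54) yourself and see whether you can confirm them.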

8.4 Solution of the field equations

8.4.1 Components of the Ricci and Einstein tensors

The components of the curvature tensor are given by (8.20), and thus the components of the Ricci tensor in the tetrad $\{e_\alpha\}$ are

$$R_{\mu\nu} = \bar R^\lambda_{\ \mu\lambda\nu} = \Omega^\lambda_\mu(e_\lambda, e_\nu) . \tag{8.56}$$

Thus, the components of the Ricci tensor in the Schwarzschild tetrad are

$$\begin{aligned}
R_{00} &= \Omega^1_0(e_1, e_0) + \Omega^2_0(e_2, e_0) + \Omega^3_0(e_3, e_0) = A + \frac{2a'E}{r} , \\
R_{11} &= \Omega^0_1(e_0, e_1) + \Omega^2_1(e_2, e_1) + \Omega^3_1(e_3, e_1) = -A + \frac{2b'E}{r}
\end{aligned} \tag{8.57}$$

and, with $B := (b' - a')E/r$,

$$\begin{aligned}
R_{22} &= \Omega^0_2(e_0, e_2) + \Omega^1_2(e_1, e_2) + \Omega^3_2(e_3, e_2) = B + \frac{1-E}{r^2} , \\
R_{33} &= \Omega^0_3(e_0, e_3) + \Omega^1_3(e_1, e_3) + \Omega^2_3(e_2, e_3) = R_{22} .
\end{aligned} \tag{8.58}$$

The Ricci scalar becomes

$$R = -2A + 4B + 2\,\frac{1-E}{r^2} , \tag{8.59}$$
such that we can now determine the components of the Einstein tensor in the tetrad $\{e_\alpha\}$:

Einstein tensor for a static, spherically-symmetric metric
The Einstein tensor of a static, spherically-symmetric metric has the components

$$\begin{aligned}
G_{00} &= R_{00} - \frac{R}{2}g_{00} = \frac{1}{r^2} - E\left(\frac{1}{r^2} - \frac{2b'}{r}\right) \\
G_{11} &= -\frac{1}{r^2} + E\left(\frac{1}{r^2} + \frac{2a'}{r}\right) \\
G_{22} &= A - B = G_{33} .
\end{aligned} \tag{8.60}$$

All off-diagonal components of $G_{\mu\nu}$ vanish identically.

? Convince yourself of the components (8.60) of the Einstein tensor.

8.4.2 The Schwarzschild metric

The vacuum field equations now require that all components of the Einstein tensor vanish. In particular, then,

$$0 = G_{00} + G_{11} = \frac{2E}{r}\left(a' + b'\right) \tag{8.61}$$

shows that $a' + b' = 0$. Since $a + b \to 0$ asymptotically for $r \to \infty$, integrating $a' + b'$ from $r \to \infty$ indicates that $a + b = 0$ everywhere, or $b = -a$.

After multiplying with $r^2$, equation $G_{00} = 0$ itself implies that

$$E\left(1 - 2rb'\right) = 1 \quad\Leftrightarrow\quad (rE)' = 1 . \tag{8.62}$$

Therefore, (8.62) is equivalent to

$$rE = r + C \quad\Leftrightarrow\quad E = 1 + \frac{C}{r} , \tag{8.63}$$

with an integration constant $C$ to be determined.
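That the solution $E = \mathrm{e}^{-2b} = 1 + C/r$ with $b = -a$ indeed makes the diagonal components of the Einstein tensor vanish, for any value of $C$, can be checked numerically. The sketch below evaluates $G_{00}$ and $G_{11}$ from (8.60) together with the combination $A - B$, to which $G_{22}$ is proportional, using finite differences for the derivatives of $a$ and $b$. The function name `einstein_components`, the step size `h` and the test values are illustrative assumptions.

```python
import math

def einstein_components(a, b, r, h=1e-4):
    """Return (G00, G11, A - B) for metric functions a(r), b(r); derivatives by central differences."""
    da = (a(r + h) - a(r - h)) / (2 * h)
    db = (b(r + h) - b(r - h)) / (2 * h)
    d2a = (a(r + h) - 2 * a(r) + a(r - h)) / h ** 2
    E = math.exp(-2 * b(r))
    A = (d2a - da * db + da ** 2) * E      # abbreviation introduced with (8.52)
    B = (db - da) * E / r                  # abbreviation introduced with (8.58)
    g00 = 1 / r ** 2 - E * (1 / r ** 2 - 2 * db / r)
    g11 = -1 / r ** 2 + E * (1 / r ** 2 + 2 * da / r)
    return g00, g11, A - B

def solution(C):
    """Metric functions a, b for E = 1 + C/r and b = -a."""
    a = lambda r: 0.5 * math.log(1 + C / r)
    b = lambda r: -a(r)
    return a, b

if __name__ == "__main__":
    a, b = solution(C=-2.0)
    print(einstein_components(a, b, r=10.0))   # all three entries should be close to zero
```

Replacing $1 + C/r$ by a different radial dependence, for example $1 + C/r^2$, leaves a non-zero $G_{00}$, which shows that the $1/r$ form is singled out by the vacuum field equations.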

Figure 8.2 Karl Schwarzschild (1873–1916), German astronomer and


physicist. Source: Wikipedia

Since $a = -b$, this also allows us to conclude that
\[ e^{2a} = E = 1 + \frac{C}{r} . \tag{8.64} \]
The integration constant $C$ is finally determined by the Newtonian limit. We have seen before in (4.80) that the 0-0 element of the metric must be related to the Newtonian gravitational potential as $g_{00} = -(1 + 2\Phi/c^2)$ in order to meet the Newtonian limit. The Newtonian potential of a point mass $M$ at a distance $r$ is
\[ \Phi = -\frac{GM}{r} . \tag{8.65} \]
Together with (8.64), this shows that the Newtonian limit is reached by the Schwarzschild solution if the integration constant $C$ is set to
\[ C = -\frac{2GM}{c^2} =: -2m \quad\text{with}\quad m = \frac{GM}{c^2} \approx 1.5\,\mathrm{km}\,\frac{M}{M_\odot} . \tag{8.66} \]

Schwarzschild metric
We thus obtain the Schwarzschild solution for the metric,
\[ ds^2 = -\left(1 - \frac{2m}{r}\right) c^2 dt^2 + \frac{dr^2}{1 - \dfrac{2m}{r}} + r^2\left(d\vartheta^2 + \sin^2\vartheta\, d\varphi^2\right) . \tag{8.67} \]
The Schwarzschild metric (8.67) has an (apparent) singularity at $r = 2m$ or
\[ r = R_s \equiv \frac{2GM}{c^2} , \tag{8.68} \]
the so-called Schwarzschild radius. We shall clarify the meaning of this singularity later.
In order to illustrate the geometrical meaning of the spatial part of the Schwarzschild metric, we need to find a geometrical interpretation for its radial dependence. Specialising to the equatorial plane of the Schwarzschild solution, $\vartheta = \pi/2$ and $t = 0$, we find the induced spatial line element
\[ dl^2 = \frac{dr^2}{1 - 2m/r} + r^2 d\varphi^2 \tag{8.69} \]
on that plane.
On the other hand, consider a surface of rotation in the three-dimensional Euclidean space $E^3$. If we introduce cylindrical coordinates $(r, \varphi, z)$ on $E^3$ and rotate a curve $z(r)$ about the $z$ axis, we find the induced line element
\[ dl^2 = dz^2 + dr^2 + r^2 d\varphi^2 = \left(\frac{dz}{dr}\right)^2 dr^2 + dr^2 + r^2 d\varphi^2 = (1 + z'^2)\,dr^2 + r^2 d\varphi^2 . \tag{8.70} \]

We can now try to identify the two induced line elements from (8.69) and (8.70) and find that this is possible if
\[ z' = \left(\frac{1}{1 - 2m/r} - 1\right)^{1/2} = \left(\frac{2m}{r - 2m}\right)^{1/2} , \tag{8.71} \]
which is readily integrated to yield
\[ z = \sqrt{8m(r - 2m)} + \mathrm{const.} \quad\text{or}\quad z^2 = 8m(r - 2m) , \tag{8.72} \]
if we set the integration constant to zero.


This shows that the geometry on the equatorial plane of the spatial
section of the Schwarzschild solution can be identified with a rotational
paraboloid in E 3 . In other words, the dependence of radial distances on
the radius r is equivalent to that on a rotational paraboloid (cf. Fig. 8.3).
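The identification of (8.69) with (8.70) can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names and the choice m = 1 (lengths in units of m) are illustrative assumptions, and z' is approximated by a central difference.

```python
import math

def z_paraboloid(r, m):
    """Height z(r) of the surface of rotation from (8.72), integration constant zero."""
    return math.sqrt(8.0 * m * (r - 2.0 * m))

def embedded_grr(r, m, h=1e-6):
    """Coefficient 1 + z'^2 of dr^2 in (8.70), with z' from a central difference."""
    dz = (z_paraboloid(r + h, m) - z_paraboloid(r - h, m)) / (2.0 * h)
    return 1.0 + dz * dz

def schwarzschild_grr(r, m):
    """Coefficient of dr^2 in the induced line element (8.69)."""
    return 1.0 / (1.0 - 2.0 * m / r)

m = 1.0  # illustrative value; all lengths are then measured in units of m
for r in (3.0, 6.0, 20.0):
    print(r, embedded_grr(r, m), schwarzschild_grr(r, m))
```

Both coefficients should agree up to the discretisation error of the finite difference for any r > 2m.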

8.4.3 Birkhoff’s theorem

Suppose now we had started from a spherically symmetric vacuum spacetime, but with explicit time dependence of the functions $a$ and $b$, such that the spacetime could either expand or contract. Then, repeating the derivation of the connection and curvature forms, and of the components of the Einstein tensor following from them, would have resulted in the new components $\bar G_{\mu\nu}$,
\[ \bar G_{00} = G_{00} , \quad \bar G_{11} = G_{11} , \]
\[ \bar G_{22} = G_{22} - e^{-2a}\left(\dot b^2 - \dot a \dot b - \ddot b\right) = \bar G_{33} , \]
\[ \bar G_{10} = \frac{2\dot b}{r}\, e^{-a-b} \tag{8.73} \]
and $\bar G_{\mu\nu} = 0$ for all other components.

Figure 8.3 Surface of rotation illustrating the spatial part of the Schwarzschild metric.
The vacuum field equations imply $\bar G_{10} = 0$ and thus $\dot b = 0$, hence $b$ must be independent of time. From $\bar G_{00} = 0$, we can again conclude (8.63), i.e. $b$ retains the same form as before. Similarly, since $\bar G_{00} + \bar G_{11} = G_{00} + G_{11}$, the requirement $a' + b' = 0$ must continue to hold, but now the time dependence of $a$ allows us to conclude only that
\[ a = -b + f(t) , \tag{8.74} \]
where $f(t)$ is an otherwise unconstrained function of time only. Thus, the line element then reads
\[ ds^2 = -e^{2f}\left(1 - \frac{2m}{r}\right) c^2 dt^2 + dl^2 , \tag{8.75} \]
where $dl^2$ is the unchanged line element of the spatial sections. Introducing the new time coordinate $t'$ by
\[ t' = \int e^{f}\, dt \tag{8.76} \]
converts (8.75) back to the original form (8.67) of the Schwarzschild metric.

Birkhoff’s theorem
This is Birkhoff’s theorem, which states that a spherically symmetric
solution of Einstein’s vacuum equations is necessarily static for r > 2m.
Chapter 9

Physics in the Schwarzschild Spacetime

9.1 Orbits in the Schwarzschild spacetime

9.1.1 Lagrange function

According to (4.3), the motion of a particle in the Schwarzschild spacetime is determined by the Lagrangian
\[ L = -\sqrt{-\langle u, u\rangle} , \tag{9.1} \]
where $u = dx/d\tau$ is the four-velocity. The proper-time differential $d\tau$ is defined by (4.6) to satisfy
\[ ds = c\,d\tau = \sqrt{-\langle u, u\rangle}\, d\tau . \tag{9.2} \]
This choice thus requires that the four-velocity $u$ be normalised,
\[ \langle u, u\rangle = -c^2 . \tag{9.3} \]
Note that we have to differentiate and integrate with respect to the proper time $\tau$ rather than the coordinate time $t$ because the latter has no invariant physical meaning. In the Newtonian limit, $\tau = t$.

? Recall the essential arguments for the Lagrange function (9.1) and its physical interpretation.

Since $\langle u, u\rangle$ is constant, instead of varying the action
\[ S = -mc \int_a^b \sqrt{-\langle u, u\rangle}\, d\tau , \tag{9.4} \]
we can just as well require that the variation of
\[ \bar S = \frac{1}{2} \int_a^b \langle u, u\rangle\, d\tau \tag{9.5} \]
vanish. In fact, from $\delta S = 0$, we have
\[ 0 = -\delta \int_a^b \sqrt{-\langle u, u\rangle}\, d\tau = \frac{1}{2} \int_a^b \frac{\delta\langle u, u\rangle}{\sqrt{-\langle u, u\rangle}}\, d\tau = \delta\left[\frac{1}{2c} \int_a^b \langle u, u\rangle\, d\tau\right] \tag{9.6} \]
because of the normalisation condition (9.3).
Thus, we can obtain the equation of motion just as well from the Lagrangian
\[ L = \frac{1}{2}\langle u, u\rangle = \frac{1}{2}\, g_{\mu\nu}\, \dot x^\mu \dot x^\nu = \frac{1}{2}\left[-(1 - 2m/r)\,c^2 \dot t^2 + \frac{\dot r^2}{1 - 2m/r} + r^2\left(\dot\vartheta^2 + \sin^2\vartheta\, \dot\varphi^2\right)\right] , \tag{9.7} \]
where it is important to recall that the overdot denotes differentiation with respect to proper time $\tau$. In addition, (9.3) immediately implies that $2L = -c^2$ for material particles, but $2L = 0$ for light, which will be discussed later.
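The Euler-Lagrange equation for $\vartheta$ is
\[ \frac{d}{d\tau}\frac{\partial L}{\partial \dot\vartheta} - \frac{\partial L}{\partial \vartheta} = 0 = \frac{d}{d\tau}\left(r^2 \dot\vartheta\right) - r^2 \dot\varphi^2 \sin\vartheta \cos\vartheta . \tag{9.8} \]
Suppose the motion starts in the equatorial plane, $\vartheta = \pi/2$ and $\dot\vartheta = 0$. Should this not be the case, we can always rotate the coordinate frame so that this is satisfied. Then, (9.8) shows that
\[ r^2 \dot\vartheta = \mathrm{const.} = 0 . \tag{9.9} \]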

Effective Lagrangian
Without loss of generality, we can thus restrict the discussion to motion in the equatorial plane, which simplifies the Lagrangian to
\[ L = \frac{1}{2}\left[-(1 - 2m/r)\,c^2 \dot t^2 + \frac{\dot r^2}{1 - 2m/r} + r^2 \dot\varphi^2\right] . \tag{9.10} \]

? Derive the Lagrangian (9.10) yourself and convince yourself of all steps taken.

9.1.2 Cyclic coordinates and equation of motion

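Obviously, $t$ and $\varphi$ are cyclic, thus angular momentum
\[ \frac{\partial L}{\partial \dot\varphi} = r^2 \dot\varphi \equiv L = \mathrm{const.} \tag{9.11} \]
and energy
\[ \frac{\partial L}{\partial \dot t} = -(1 - 2m/r)\, c\dot t \equiv E = \mathrm{const.} \tag{9.12} \]
are conserved. We exploit these conservation laws to eliminate
\[ \dot\varphi = \frac{L}{r^2} \quad\text{and}\quad c\dot t = -\frac{E}{1 - 2m/r} \tag{9.13} \]
from the Lagrangian (9.10), use $2L = -c^2$ and find
\[ -c^2 = -(1 - 2m/r)\,c^2 \dot t^2 + \frac{\dot r^2}{1 - 2m/r} + r^2 \dot\varphi^2 = \frac{\dot r^2 - E^2}{1 - 2m/r} + \frac{L^2}{r^2} . \tag{9.14} \]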

Radial equation of motion
This first integral of the radial equation of motion can be cast into the form
\[ \dot r^2 + V(r) = E^2 , \tag{9.15} \]
where $V(r)$ is the effective potential
\[ V(r) \equiv \left(1 - \frac{2m}{r}\right)\left(c^2 + \frac{L^2}{r^2}\right) . \tag{9.16} \]

? What form does the effective potential have in Newtonian gravity?

Note that the effective potential has (and must have) the dimension of a squared velocity.
Since it is our primary goal to find the orbit $r(\varphi)$, we use $r' = dr/d\varphi = \dot r/\dot\varphi$ to transform (9.15) to
\[ \dot r^2 + V(r) = \dot\varphi^2 r'^2 + V(r) = \frac{L^2}{r^4}\, r'^2 + V(r) = E^2 . \tag{9.17} \]
Now, we substitute $u \equiv 1/r$ and $u' = -r'/r^2 = -u^2 r'$ and find
\[ L^2 u^4\, \frac{u'^2}{u^4} + V(1/u) = L^2 u'^2 + (1 - 2mu)\left(c^2 + L^2 u^2\right) = E^2 \tag{9.18} \]
or, after dividing by $L^2$ and rearranging terms,
\[ u'^2 + u^2 = \frac{E^2 - c^2}{L^2} + \frac{2mc^2}{L^2}\, u + 2mu^3 . \tag{9.19} \]
Differentiation with respect to $\varphi$ cancels the constant first term on the right-hand side and yields
\[ 2u'u'' + 2uu' = \frac{2mc^2}{L^2}\, u' + 6mu^2 u' . \tag{9.20} \]

? Convince yourself by your own calculation that you agree with the result (9.20).

Orbital equation
The trivial solution of this orbital equation is $u' = 0$, which implies a circular orbit. If $u' \ne 0$, this equation can be simplified to read
\[ u'' + u = \frac{mc^2}{L^2} + 3mu^2 . \tag{9.21} \]
Note that this is the equation of a driven harmonic oscillator.
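The orbital equation (9.21) is easily integrated numerically, and (9.19) then serves as a conserved quantity to monitor the accuracy. The sketch below is an illustration under stated assumptions rather than the code behind the figures in these notes: it uses a hand-written classical Runge-Kutta step, scales lengths by the orbital parameter p so that mc²/L² = 1, and takes m = 0.025 and the start values u = 1 + e, u' = 0 from the caption of Figure 9.1. All function names are hypothetical.

```python
import math

M = 0.025    # mass parameter m in units of p (value quoted for Figure 9.1)
INV_P = 1.0  # m c^2 / L^2 = 1/p, equal to 1 when lengths are scaled by p

def u_second(u):
    """Right-hand side of (9.21) solved for u''."""
    return INV_P + 3.0 * M * u * u - u

def first_integral(u, du):
    """u'^2 + u^2 - 2 (m c^2/L^2) u - 2 m u^3, constant along solutions by (9.19)."""
    return du * du + u * u - 2.0 * INV_P * u - 2.0 * M * u ** 3

def integrate(u, du, phi_max, steps):
    """Integrate (9.21) from phi = 0 to phi_max with classical RK4 steps."""
    h = phi_max / steps
    for _ in range(steps):
        k1u, k1v = du, u_second(u)
        k2u, k2v = du + 0.5 * h * k1v, u_second(u + 0.5 * h * k1u)
        k3u, k3v = du + 0.5 * h * k2v, u_second(u + 0.5 * h * k2u)
        k4u, k4v = du + h * k3v, u_second(u + h * k3u)
        u += h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        du += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return u, du

e = 0.2
u_start, du_start = 1.0 + e, 0.0
u_end, du_end = integrate(u_start, du_start, 4.0 * math.pi, 20000)
drift = first_integral(u_end, du_end) - first_integral(u_start, du_start)
print(u_end, du_end, drift)
```

A drift of the first integral close to rounding error indicates that the step size is adequate; the orbit itself follows from r = 1/u.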
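The fact that $t$ and $\varphi$ are cyclic coordinates in the Schwarzschild spacetime can be studied from a more general point of view. Let $\gamma(\tau)$ be a geodesic curve with tangent vector $\dot\gamma(\tau)$, and let further $\xi$ be a Killing vector field of the metric. Then, we know from (5.36) that the projection of the Killing vector field on the geodesic is constant along the geodesic,
\[ \nabla_{\dot\gamma} \langle \dot\gamma, \xi\rangle = 0 \quad\Rightarrow\quad \langle \dot\gamma, \xi\rangle = \text{constant along } \gamma . \tag{9.22} \]
Due to its stationarity and the spherical symmetry, the Schwarzschild spacetime has the Killing vector fields $\partial_t$ and $\partial_\varphi$. Thus,
\[ \langle \dot\gamma, \partial_t\rangle = \langle \dot\gamma^t \partial_t, \partial_t\rangle = \dot\gamma^t \langle \partial_t, \partial_t\rangle = g_{00}\, \dot\gamma^t = -\left(1 - \frac{2m}{r}\right) c\dot t = \mathrm{const.} \tag{9.23} \]
and
\[ \langle \dot\gamma, \partial_\varphi\rangle = \dot\gamma^\varphi \langle \partial_\varphi, \partial_\varphi\rangle = g_{\varphi\varphi}\, \dot\gamma^\varphi = r^2 \sin^2\vartheta\, \dot\varphi = r^2 \dot\varphi = \mathrm{const.} , \tag{9.24} \]
where we have used $\vartheta = \pi/2$ without loss of generality. This reproduces (9.11) and (9.12).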

9.2 Comparison to the Kepler problem

9.2.1 Differences in the equation of motion

It is instructive to compare this to the Newtonian case. There, the Lagrangian is
\[ L = \frac{1}{2}\left(\dot r^2 + r^2 \dot\varphi^2\right) - \Phi(r) , \tag{9.25} \]
where $\Phi(r)$ is some centrally-symmetric potential and the dots denote the derivative with respect to the coordinate time $t$ now instead of the proper time $\tau$. In the Newtonian limit, $\tau = t$. For later comparison of results obtained in this and the previous sections, the overdots can here also be interpreted as derivatives with respect to $\tau$, as in the previous section.

? Why does the angle $\vartheta$ not appear in the Lagrange function (9.25)? Why can it be ignored here?

Since $\varphi$ is cyclic,
\[ \frac{\partial L}{\partial \dot\varphi} = r^2 \dot\varphi \equiv L = \mathrm{const.} \tag{9.26} \]
The Euler-Lagrange equation for $r$ is
\[ \frac{d}{dt}\frac{\partial L}{\partial \dot r} - \frac{\partial L}{\partial r} = 0 = \ddot r - r\dot\varphi^2 + \frac{d\Phi}{dr} . \tag{9.27} \]
Since
\[ \dot r = \frac{dr}{dt} = r' \dot\varphi = r'\, \frac{L}{r^2} = -L u' , \tag{9.28} \]
we can write the second time derivative of $r$ as
\[ \ddot r = -L\, \frac{du'}{dt} = -L\, \frac{du'}{d\varphi}\, \dot\varphi = -L u''\, \frac{L}{r^2} = -L^2 u^2 u'' . \tag{9.29} \]
Thus, the equation of motion (9.27) can be written as
\[ -L^2 u^2 u'' - r\, \frac{L^2}{r^4} + \frac{d\Phi}{dr} = 0 \tag{9.30} \]
or, after dividing by $-u^2 L^2$,
\[ u'' + u = \frac{1}{L^2 u^2}\, \frac{d\Phi}{dr} . \tag{9.31} \]

? Can you agree with the result (9.31)?
Orbital equation in Newtonian gravity
In the Newtonian limit of the Schwarzschild solution, the potential and its radial derivative are
\[ \Phi = -\frac{GM}{r} , \quad \frac{d\Phi}{dr} = \frac{GM}{r^2} = GMu^2 = mc^2 u^2 , \tag{9.32} \]
so that the orbital equation becomes
\[ u'' + u = \frac{mc^2}{L^2} . \tag{9.33} \]
Compared to this, the equation of motion in the Schwarzschild case (9.21) has the additional term $3mu^2$. We have seen in (8.66) that $m \approx 1.5\,\mathrm{km}$ in the Solar System. There, the ratio of the two terms on the right-hand side of (9.21) is
\[ \frac{3mu^2}{mc^2/L^2} = \frac{3u^2 L^2}{c^2} = \frac{3 r^4 \dot\varphi^2}{r^2 c^2} = \frac{3}{c^2}\,(r\dot\varphi)^2 = \frac{3 v_\perp^2}{c^2} \approx 7.7 \cdot 10^{-8} \tag{9.34} \]
for the innermost planet Mercury. Here, $v_\perp$ is the tangential velocity along the orbit, $v_\perp = r\dot\varphi$.
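The order of magnitude in (9.34) is quickly reproduced. The short sketch below assumes a round value of about 47.9 km/s for Mercury's mean orbital speed, which is not quoted in these notes; the constant names are illustrative.

```python
C_LIGHT = 2.998e8   # speed of light in m/s
V_MERCURY = 4.79e4  # assumed mean orbital speed of Mercury in m/s

# Ratio of the relativistic term 3 m u^2 to the Newtonian term m c^2 / L^2, cf. (9.34)
ratio = 3.0 * (V_MERCURY / C_LIGHT) ** 2
print(f"{ratio:.2e}")
```

The result depends on the speed adopted along the eccentric orbit, so only the order of magnitude, a few times 10⁻⁸, should be read off.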

9.2.2 Effective potential

The equation of motion (9.21) in the Schwarzschild spacetime can thus be reduced to a Kepler problem with a potential which, according to (9.31), is given by
\[ \frac{1}{L^2 u^2}\, \frac{d\Phi(r)}{dr} = \frac{mc^2}{L^2} + 3mu^2 \tag{9.35} \]
or
\[ \frac{d\Phi(r)}{dr} = mc^2 u^2 + 3mL^2 u^4 = \frac{mc^2}{r^2} + \frac{3mL^2}{r^4} , \tag{9.36} \]
which leads to
\[ \Phi(r) = -\frac{mc^2}{r} - \frac{mL^2}{r^3} \tag{9.37} \]
if we set the integration constant such that $\Phi(r) \to 0$ for $r \to \infty$.

Figure 9.1 Numerical solutions of the orbital equation (9.21) for test particles, for different values of the orbital eccentricity $e$. All lengths, including the mass $m = 0.025$, are scaled by the orbital parameter $p$. The orbits shown begin at $u = 1 + e$ with $u' = 0$. For $e = 0.2$ (left), two orbits are shown, and twelve orbits for $e = 0.9$ (right).
[Panels: e = 0.2 (left), e = 0.9 (right); axes: x and y in units of p.]
As a function of $x \equiv r/R_s = r/2m$, the effective potential $V(r)$ from (9.16) depends in an interesting way on $L/(cR_s) = L/(2mc) \equiv \lambda$. The dimensionless function
\[ v(x) := \frac{V(x)}{c^2} = \left(1 - \frac{1}{x}\right)\left(1 + \frac{\lambda^2}{x^2}\right) \tag{9.38} \]
corresponding to the effective potential (9.16) asymptotically behaves as $v(x) \to 1$ for $x \to \infty$ and $v(x) \to -\infty$ for $x \to 0$.
For the potential to have a minimum, $v(x)$ must have a vanishing derivative, $v'(x) = 0$. This is the case where
\[ 0 = v'(x) = \frac{1}{x^2}\left(1 + \frac{\lambda^2}{x^2}\right) - \left(1 - \frac{1}{x}\right)\frac{2\lambda^2}{x^3} \tag{9.39} \]
or, after multiplication with $x^4$,
\[ x^2 - 2\lambda^2 x + 3\lambda^2 = 0 \quad\Rightarrow\quad x_\pm = \lambda^2 \pm \lambda\sqrt{\lambda^2 - 3} . \tag{9.40} \]
Real solutions require $\lambda \ge \sqrt{3}$. If $\lambda < \sqrt{3}$, particles with $E^2 < 1$ will crash directly towards $r = R_s$.
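The extrema (9.40) and the special values of the potential that follow from them can be evaluated directly. This is a minimal sketch with hypothetical function names; it takes λ² rather than λ as its argument so that the limiting case λ² = 3 is represented exactly in floating point.

```python
import math

def v_eff(x, lam2):
    """Dimensionless effective potential (9.38) for scaled radius x and lambda^2."""
    return (1.0 - 1.0 / x) * (1.0 + lam2 / (x * x))

def extrema(lam2):
    """Roots x_-, x_+ of (9.40), or None if no real extrema exist."""
    disc = lam2 * (lam2 - 3.0)  # equals lambda^2 (lambda^2 - 3)
    if disc < 0.0:
        return None
    root = math.sqrt(disc)      # lambda * sqrt(lambda^2 - 3) for lambda > 0
    return lam2 - root, lam2 + root

for lam2 in (2.0, 3.0, 4.0, 9.0):
    xs = extrema(lam2)
    if xs is None:
        print(lam2, "no extrema")
    else:
        print(lam2, xs, v_eff(xs[0], lam2), v_eff(xs[1], lam2))
```

For λ² = 3 the two roots coincide at x = 3, and for λ² = 4 the maximum sits at x₋ = 2 with height v = 1.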
Figure 9.2 Dimensionless effective potential $v(x)$ for a test particle in the Schwarzschild spacetime for various scaled angular momenta $\lambda$.
[Curves: λ = 1, √3, 2, 3, 4; axes: radius x = r/R_s, effective potential v(x) := V(x)/c².]

Last stable orbit in the Schwarzschild metric
The last stable orbit, or more precisely the innermost stable circular orbit or ISCO, must thus have $\lambda = \sqrt{3}$ and is therefore located at $x_\pm = 3$, i.e. at $r = 6m = 3R_s$, or three Schwarzschild radii. There, the dimensionless effective potential is
\[ v(x = 3) = \frac{2}{3}\left(1 + \frac{3}{9}\right) = \frac{8}{9} . \tag{9.41} \]
For $\lambda > \sqrt{3}$, the effective potential has a minimum at $x_+$ and a maximum at $x_-$, which reaches the height $v = 1$ for $\lambda = 2$ at $x_- = 2$ and is higher for larger $\lambda$. This means that particles with $E \ge 1$ and $L < 2cR_s$ will fall unimpeded towards $r = R_s$.

9.3 Perihelion shift and light deflection

9.3.1 The perihelion shift

The treatment of the Kepler problem in classical mechanics shows that closed orbits in the Newtonian limit are described by
\[ u_0(\varphi) = \frac{1}{p}\,(1 + e\cos\varphi) , \tag{9.42} \]
where the parameter $p$ is related to the angular momentum $L$ by
\[ p = a\left(1 - e^2\right) = \frac{L^2}{mc^2} \tag{9.43} \]
in terms of the semi-major axis $a$ and the eccentricity $e$ of the orbit.
Assuming that the perturbation $3mu^2$ in the equation of motion (9.21) is small, we can approximate it by $3mu_0^2$, thus
\[ u'' + u = \frac{mc^2}{L^2} + \frac{3m}{p^2}\,(1 + e\cos\varphi)^2 . \tag{9.44} \]
The solution of this equation turns out to be simple because differential equations of the sort
\[ u'' + u = \begin{cases} A \\ B\cos\varphi \\ C\cos^2\varphi \end{cases} , \tag{9.45} \]
which are driven harmonic-oscillator equations, have the particular solutions
\[ u_1 = A , \quad u_2 = \frac{B}{2}\,\varphi\sin\varphi , \quad u_3 = \frac{C}{2}\left(1 - \frac{1}{3}\cos 2\varphi\right) . \tag{9.46} \]

? Verify the particular solutions (9.46) of the driven harmonic oscillator equations (9.45).

Orbits in the Schwarzschild spacetime
Since the unperturbed equation $u'' + u = mc^2/L^2$ has the Keplerian solution $u = u_0$, the complete solution is thus the sum
\[ u = u_0 + u_1 + u_2 + u_3 = \frac{1}{p}\,(1 + e\cos\varphi) + \frac{3m}{p^2}\left[1 + e\varphi\sin\varphi + \frac{e^2}{2}\left(1 - \frac{1}{3}\cos 2\varphi\right)\right] . \tag{9.47} \]
This solution of (9.44) has its perihelion at $\varphi = 0$ because the unperturbed solution $u_0$ was chosen to have it there. This can be seen by taking the derivative with respect to $\varphi$,
\[ u' = -\frac{e}{p}\sin\varphi + \frac{3me}{p^2}\left[\sin\varphi + \varphi\cos\varphi + \frac{e}{3}\sin 2\varphi\right] \tag{9.48} \]
and verifying that $u' = 0$ at $\varphi = 0$, i.e. the orbital radius $r = 1/u$ still has an extremum at $\varphi = 0$.
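That (9.47) indeed solves the linearised equation (9.44) can be confirmed numerically by inserting it and approximating u'' with a central second difference. The sketch below is only a check under stated assumptions: it uses mc²/L² = 1/p from (9.43), and the parameter values and function names are arbitrary illustrations.

```python
import math

def u_pert(phi, p, m, e):
    """Perturbative solution (9.47)."""
    bracket = (1.0 + e * phi * math.sin(phi)
               + 0.5 * e * e * (1.0 - math.cos(2.0 * phi) / 3.0))
    return (1.0 + e * math.cos(phi)) / p + 3.0 * m / p ** 2 * bracket

def residual(phi, p, m, e, h=1e-4):
    """Left-hand side minus right-hand side of (9.44), with 1/p for m c^2/L^2."""
    u_mid = u_pert(phi, p, m, e)
    u_dd = (u_pert(phi + h, p, m, e) - 2.0 * u_mid + u_pert(phi - h, p, m, e)) / h ** 2
    rhs = 1.0 / p + 3.0 * m / p ** 2 * (1.0 + e * math.cos(phi)) ** 2
    return u_dd + u_mid - rhs

for phi in (0.0, 1.0, 2.5, 6.0):
    print(phi, residual(phi, p=1.0, m=0.025, e=0.2))
```

The residuals should vanish up to the discretisation and rounding error of the finite difference.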
We now use equation (9.48) in the following way. Starting at the perihelion at $\varphi = 0$, we wait for approximately one revolution until $\varphi = 2\pi + \delta\varphi$ and see what $\delta\varphi$ has to be for $u'$ to vanish again. Thus, the condition for the next perihelion is
\[ 0 = -\sin\delta\varphi + \frac{3m}{p}\left[\sin\delta\varphi + (2\pi + \delta\varphi)\cos\delta\varphi + \frac{e}{3}\sin 2\delta\varphi\right] \tag{9.49} \]
or, to first order in the small angle $\delta\varphi$,
\[ \delta\varphi \approx \frac{3m}{p}\left[2\delta\varphi + 2\pi + \frac{2e}{3}\,\delta\varphi\right] . \tag{9.50} \]
Sorting terms, we find
\[ \delta\varphi\left[1 - \frac{6m}{p}\left(1 + \frac{e}{3}\right)\right] \approx \frac{6\pi m}{p} = \frac{6\pi m}{a(1 - e^2)} \tag{9.51} \]
for the perihelion shift $\delta\varphi$.

? Can you confirm (9.51) beginning with (9.49)?

Perihelion shift
Substituting the Schwarzschild radius from (8.68), we can write this result as
\[ \delta\varphi \approx \frac{3\pi R_s}{a(1 - e^2)} . \tag{9.52} \]
This turns out to be $-6$ times the result (1.45) from the scalar theory of gravity discussed in § 1.4.2, or
\[ \delta\varphi \approx 43'' \tag{9.53} \]
per century for Mercury's orbit, which reproduces the measurement extremely well.
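The number in (9.53) follows from (9.52) once orbital data for Mercury are inserted. The values below (semi-major axis, eccentricity, orbital period and the Sun's Schwarzschild radius) are standard rounded figures that are assumed here and are not quoted in these notes.

```python
import math

R_S_SUN = 2.953e3         # Schwarzschild radius of the Sun in m (assumed value)
A_MERCURY = 5.791e10      # semi-major axis of Mercury's orbit in m (assumed value)
E_MERCURY = 0.2056        # orbital eccentricity of Mercury (assumed value)
T_MERCURY_DAYS = 87.969   # orbital period of Mercury in days (assumed value)
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

# Perihelion shift per orbit in radians, equation (9.52)
shift_per_orbit = 3.0 * math.pi * R_S_SUN / (A_MERCURY * (1.0 - E_MERCURY ** 2))

# Accumulated shift over a Julian century of 36525 days, in arc seconds
orbits_per_century = 36525.0 / T_MERCURY_DAYS
shift_per_century = shift_per_orbit * orbits_per_century * ARCSEC_PER_RAD
print(shift_per_century)
```

The shift per orbit is only about half a microradian; it is the accumulation over some 415 orbits per century that makes the effect measurable.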

9.3.2 Light deflection

For light rays, the condition $2L = -c^2$ that we had before for material particles is replaced by $2L = 0$. Then, (9.14) changes to
\[ \frac{\dot r^2 - E^2}{1 - 2m/r} + \frac{L^2}{r^2} = 0 \tag{9.54} \]
or
\[ \dot r^2 + \frac{L^2}{r^2}\left(1 - \frac{2m}{r}\right) = E^2 . \tag{9.55} \]
Changing again the independent variable to $\varphi$ and substituting $u = 1/r$ leads to the equation of motion for light rays in the Schwarzschild spacetime
\[ u'^2 + u^2 = \frac{E^2}{L^2} + 2mu^3 , \tag{9.56} \]
which should be compared to the equation of motion for material particles, (9.19). Differentiation finally yields the orbital equation for light rays in the Schwarzschild spacetime.

? Derive the orbital equation (9.56) yourself.

Light rays in the Schwarzschild spacetime
Light rays (null geodesics) in the Schwarzschild spacetime follow the orbital equation
\[ u'' + u = 3mu^2 . \tag{9.57} \]
Compared to $u$ on the left-hand side, the term $3mu^2$ is very small. In the Solar System,
\[ \frac{3mu^2}{u} = 3mu = \frac{3R_s}{2r} \le \frac{R_s}{R_\odot} \approx 10^{-6} . \tag{9.58} \]

Figure 9.3 Numerical solutions of the orbital equation (9.57) for light rays, compared to the Keplerian straight line, for different values of $m$. All lengths, including the mass $m$, are scaled by the orbital parameter $p$. The orbits shown begin at $u = 1$ with $u' = 0$.
[Curves: Keplerian, m = 0.025, m = 0.050, m = 0.075; axes: x and y in units of p.]

Thus, the light ray is almost given by the homogeneous solution of the harmonic-oscillator equation $u'' + u = 0$, which is $u_0 = A\sin\varphi + B\cos\varphi$. We require that the closest impact at $u_0 = 1/b$ be reached when $\varphi = \pi/2$, which implies $B = 0$ and $A = 1/b$, or
\[ u_0 = \frac{\sin\varphi}{b} \quad\Rightarrow\quad r_0 = \frac{b}{\sin\varphi} . \tag{9.59} \]
Note that this is a straight line in plane polar coordinates, as it should be! Inserting this lowest-order solution as a perturbation into the right-hand side of (9.57) gives
\[ u'' + u = \frac{3m}{b^2}\sin^2\varphi = \frac{3m}{b^2}\left(1 - \cos^2\varphi\right) , \tag{9.60} \]
for which particular solutions can be found using (9.45) and (9.46). Combining this with the unperturbed solution (9.59) gives
\[ u = \frac{\sin\varphi}{b} + \frac{3m}{b^2} - \frac{3m}{2b^2}\left(1 - \frac{1}{3}\cos 2\varphi\right) . \tag{9.61} \]

? Beginning with (9.60), confirm the deflection angle (9.63).

Given the orientation of our coordinate system, i.e. with the closest approach at $\varphi = \pi/2$, we have $\varphi \approx 0$ for a ray incoming from the left at large distances. Then, $\sin\varphi \approx \varphi$ and $\cos 2\varphi \approx 1$, and (9.61) yields
\[ u \approx \frac{\varphi}{b} + \frac{2m}{b^2} . \tag{9.62} \]
In the asymptotic limit $r \to \infty$, or $u \to 0$, this gives the angle
\[ |\varphi| \approx \frac{2m}{b} . \tag{9.63} \]

Deflection angle for light rays
The total deflection angle of light rays is then
\[ \alpha = 2|\varphi| \approx \frac{4m}{b} = 2\,\frac{R_s}{b} \approx 1.74'' . \tag{9.64} \]
This is twice the result from our simple consideration leading to (4.90), which did not take the field equations into account yet.
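The numerical value in (9.64) corresponds to a light ray grazing the Sun, i.e. to an impact parameter equal to the solar radius. The sketch below assumes standard rounded values for the solar radius and the Sun's Schwarzschild radius, neither of which is quoted in these notes.

```python
import math

R_S_SUN = 2.953e3   # Schwarzschild radius of the Sun in m (assumed value)
R_SUN = 6.957e8     # solar radius in m, used as impact parameter b (assumed value)
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def deflection_arcsec(b):
    """Total deflection angle alpha = 2 R_s / b from (9.64), in arc seconds."""
    return 2.0 * R_S_SUN / b * ARCSEC_PER_RAD

alpha_limb = deflection_arcsec(R_SUN)
print(alpha_limb)
```

Because the angle scales as 1/b, a ray passing at two solar radii is deflected by half as much.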

9.4 Spins in the Schwarzschild spacetime

9.4.1 Equations of motion

Let us now finally study how a gyroscope with spin $s$ is moving along a geodesic $\gamma$ in the Schwarzschild spacetime. Without loss of generality, we assume that the orbit falls into the equatorial plane $\vartheta = \pi/2$, and we restrict the motion to circular orbits.
Then, the four-velocity of the gyroscope is characterised by $u^1 = 0 = u^2$ because both $r = x^1$ and $\vartheta = x^2$ are constant.
The equations that the spin $s$ and the tangent vector $u = \dot\gamma$ of the orbit have to satisfy are
\[ \langle s, u\rangle = 0 , \quad \nabla_u s = 0 , \quad \nabla_u u = 0 . \tag{9.65} \]
The first is because $s$ falls into a spatial hypersurface perpendicular to the time-like four-velocity $u$, the second because the spin is parallel transported, and the third because the gyroscope is moving along a geodesic curve.
We work in the same tetrad $\{\theta^\mu\}$ introduced in (8.40) that we used to derive the Schwarzschild solution. From (8.9), we know that
\[ (\nabla_u s)^\mu = \langle ds^\mu + s^\nu \omega^\mu{}_\nu, u\rangle = u(s^\mu) + \omega^\mu{}_\nu(u)\, s^\nu = \dot s^\mu + \omega^\mu{}_\nu(u)\, s^\nu = 0 , \tag{9.66} \]
where the overdot marks the derivative with respect to the proper time $\tau$. With the connection forms in the Schwarzschild tetrad given in (8.50), and taking into account that $a = -b$ and $\cot\vartheta = 0$, we find for the components of $\dot s$
\[ \dot s^0 = -\omega^0{}_1(u)\, s^1 = b' e^{-b} u^0 s^1 , \]
\[ \dot s^1 = -\omega^1{}_0(u)\, s^0 - \omega^1{}_2(u)\, s^2 - \omega^1{}_3(u)\, s^3 = b' e^{-b} u^0 s^0 + \frac{e^{-b}}{r}\, u^3 s^3 , \]
\[ \dot s^2 = -\omega^2{}_1(u)\, s^1 - \omega^2{}_3(u)\, s^3 = 0 , \]
\[ \dot s^3 = -\omega^3{}_1(u)\, s^1 - \omega^3{}_2(u)\, s^2 = -\frac{e^{-b}}{r}\, u^3 s^1 , \tag{9.67} \]
where we have repeatedly used that
\[ \theta^1(u) = u^1 = 0 = u^2 = \theta^2(u) \tag{9.68} \]
and $\omega^2{}_3 = 0$ because $\cot\vartheta = 0$.
Similarly, the geodesic equation $\nabla_u u = 0$, specialised to $u^1 = 0 = u^2$, leads to
\[ \dot u^0 = b' e^{-b} u^0 u^1 = 0 , \]
\[ \dot u^1 = -b' e^{-b} (u^0)^2 - \frac{e^{-b}}{r}\, (u^3)^2 = 0 , \]
\[ \dot u^2 = 0 , \]
\[ \dot u^3 = -\frac{e^{-b}}{r}\, u^1 u^3 = 0 . \tag{9.69} \]
The second of these equations implies
\[ \left(\frac{u^0}{u^3}\right)^2 = -\frac{1}{b' r} . \tag{9.70} \]

? What is the physical meaning of equation (9.70)?
9.4.2 Spin precession

We now introduce a set of basis vectors orthogonal to $u$, namely
\[ \bar e_1 = e_1 , \quad \bar e_2 = e_2 , \quad \bar e_3 = \frac{u^3}{c}\, e_0 + \frac{u^0}{c}\, e_3 . \tag{9.71} \]
The orthogonality of $\bar e_1$ and $\bar e_2$ to $u$ is obvious because of $u^1 = 0 = u^2$, and
\[ \langle u, \bar e_3\rangle = \frac{1}{c}\left(u^3 u_0 + u^0 u_3\right) = 0 \tag{9.72} \]
shows the orthogonality of $u$ and $\bar e_3$. Recall that $u_0 = -u^0$, but $u_3 = u^3$ because the metric is $g = \mathrm{diag}(-1, 1, 1, 1)$ in this basis.

? Carry out the calculations leading to equations (9.74) and (9.75) yourself.

Since the basis $\{\bar e_i\}$ spans the three-space orthogonal to $u$, the spin $s$ of the gyroscope can be expanded into this basis as $s = \bar s^i \bar e_i$. We find
\[ s^0 = \frac{u^3}{c}\, \bar s^3 , \quad s^1 = \bar s^1 , \quad s^2 = \bar s^2 , \quad s^3 = \frac{u^0}{c}\, \bar s^3 , \tag{9.73} \]
which we can insert into (9.67) to find
\[ \dot u^3 \bar s^3 + u^3 \dot{\bar s}^3 = u^3 \dot{\bar s}^3 = c\, b' e^{-b} u^0 \bar s^1 , \]
\[ \dot{\bar s}^1 = \left(b' e^{-b} + \frac{e^{-b}}{r}\right) \frac{u^0 u^3}{c}\, \bar s^3 , \]
\[ \dot{\bar s}^2 = 0 , \]
\[ \dot u^0 \bar s^3 + u^0 \dot{\bar s}^3 = u^0 \dot{\bar s}^3 = -\frac{c\, e^{-b}}{r}\, u^3 \bar s^1 . \tag{9.74} \]
Note that $\dot u^\mu = 0$ for all $\mu$ according to (9.69). Using (9.69) and the normalisation relation $(u^0)^2 - (u^3)^2 = c^2$, we obtain
\[ \dot{\bar s}^1 = b' e^{-b}\left[1 - \frac{(u^0)^2}{(u^3)^2}\right] \frac{u^0 u^3}{c}\, \bar s^3 = -c\, b' e^{-b}\, \frac{u^0}{u^3}\, \bar s^3 , \]
\[ \dot{\bar s}^2 = 0 , \]
\[ \dot{\bar s}^3 = c\, b' e^{-b}\, \frac{u^0}{u^3}\, \bar s^1 . \tag{9.75} \]
From now on, we shall drop the overbar, understanding that the $s^i$ denote the components of the spin with respect to the basis $\bar e_i$.
Next, we transform the time derivative from the proper time $\tau$ to the coordinate time $t$. Since
\[ u^0 = \theta^0(u) = e^a c\, dt(u) = c\dot t\, e^a , \tag{9.76} \]
we have
\[ \dot t = \frac{u^0}{c}\, e^{-a} = \frac{u^0}{c}\, e^{b} , \tag{9.77} \]
or
\[ \frac{ds^i}{dt} = \frac{\dot s^i}{\dot t} = \frac{c\, \dot s^i}{u^0}\, e^{-b} . \tag{9.78} \]
Inserting this into (9.75) yields
\[ \frac{ds^1}{dt} = -\frac{c^2 b'}{u^3}\, e^{-2b} s^3 , \quad \frac{ds^2}{dt} = 0 , \quad \frac{ds^3}{dt} = \frac{c^2 b'}{u^3}\, e^{-2b} s^1 . \tag{9.79} \]

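Finally, using (8.40), we have
\[ u^3 = \theta^3(u) = r\sin\vartheta\, d\varphi(u) = r u^\varphi = r\dot\varphi \tag{9.80} \]
at $\vartheta = \pi/2$, which yields the angular frequency
\[ \omega \equiv \frac{d\varphi}{dt} = \frac{\dot\varphi}{\dot t} = \frac{c\, e^{-b}}{r}\, \frac{u^3}{u^0} , \tag{9.81} \]
which can be rewritten by means of (9.70),
\[ \omega^2 = \left(\frac{u^3}{u^0}\right)^2 \frac{e^{-2b} c^2}{r^2} = -\frac{c^2 b'}{r}\, e^{-2b} = \frac{c^2}{2r}\left(e^{-2b}\right)' . \tag{9.82} \]
Now, since the exponential factor was
\[ e^{-2b} = 1 - \frac{2m}{r} \quad\Rightarrow\quad \left(e^{-2b}\right)' = \frac{2m}{r^2} , \tag{9.83} \]
we obtain the well-known intermediate result
\[ \omega^2 = \frac{mc^2}{r^3} = \frac{GM}{r^3} , \tag{9.84} \]
which is Kepler's third law.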


Taking another time derivative of (9.79), we can use $\dot r = 0$ for circular orbits and $\dot u^3 = 0$ from (9.69). Thus,
\[ \frac{d^2 s^1}{dt^2} = -\frac{c^2 b'}{u^3}\, e^{-2b}\, \frac{ds^3}{dt} = -\frac{c^4 b'^2 e^{-4b}}{(u^3)^2}\, s^1 \tag{9.85} \]
and likewise for $s^3$. This is an oscillator equation for $s^1$ with the squared angular frequency
\[ \Omega^2 = \frac{c^4 b'^2 e^{-4b}}{(u^3)^2} = c^2 b'^2 e^{-4b}\, \frac{(u^0)^2 - (u^3)^2}{(u^3)^2} = c^2 b'^2 e^{-4b}\left(-\frac{1}{b' r} - 1\right) = -\frac{c^2 b' e^{-4b}}{r}\left(1 + b' r\right) . \tag{9.86} \]

? Verify the calculation leading to the squared angular frequency $\Omega^2$ in (9.86).

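Now, we use (9.82) to substitute the factor in front of the final expression and find the relation
\[ \Omega^2 = \omega^2 e^{-2b}\left(1 + b' r\right) \tag{9.87} \]
between the angular frequencies $\Omega$ and $\omega$. From (8.62), we further know that
\[ r b' = \frac{1}{2}\left(1 - e^{2b}\right) = \frac{1}{2}\left(1 - \frac{1}{1 - 2m/r}\right) = -\frac{m}{r}\, \frac{1}{1 - 2m/r} , \tag{9.88} \]
thus
\[ r b' + 1 = \frac{r - 3m}{r - 2m} = \frac{1 - 3m/r}{1 - 2m/r} \tag{9.89} \]
and
\[ \Omega^2 = \omega^2 e^{-2b}\, \frac{1 - 3m/r}{1 - 2m/r} = \omega^2\left(1 - \frac{3m}{r}\right) . \tag{9.90} \]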

In vector notation, we can write (9.79) as
\[ \frac{d\vec s}{dt} = \vec\Omega \times \vec s , \quad \vec\Omega = \begin{pmatrix} 0 \\ \Omega \\ 0 \end{pmatrix} . \tag{9.91} \]
Recall that we have projected the spin into the three-dimensional space perpendicular to the direction of motion. Thus, the result (9.91) shows that $\vec s$ precesses retrograde in that space about an axis perpendicular to the plane of the orbit, since $u^2 = 0$.
After a complete orbit, i.e. after the orbital time $\tau = 2\pi/\omega$, the projection of $\vec s$ onto the plane of the orbit has advanced by an angle
\[ \phi = \Omega\tau = 2\pi\, \frac{\Omega}{\omega} = 2\pi\sqrt{1 - \frac{3m}{r}} < 2\pi , \tag{9.92} \]
according to (9.90). The spin thus falls behind the orbital motion; its precession is retrograde. The geodetic precession frequency is
\[ \omega_s = \frac{\phi - 2\pi}{\tau} = \omega\left(\sqrt{1 - \frac{3m}{r}} - 1\right) \approx -\left(\frac{GM}{r^3}\right)^{1/2} \frac{3GM}{2rc^2} = -\frac{3}{2}\, \frac{(GM)^{3/2}}{c^2 r^{5/2}} \tag{9.93} \]
to first-order Taylor approximation in $m/r$, with $\omega$ from Kepler's third law (9.84).
Geodetic precession near the Earth
If we insert the Earth's mass and radius here, $M_\mathrm{Earth} = 5.97 \cdot 10^{27}\,\mathrm{g}$ and $R_\mathrm{Earth} = 6.38 \cdot 10^{8}\,\mathrm{cm}$, we find a geodetic precession near the Earth of
\[ \omega_s \approx -2.66 \cdot 10^{-7}\,{}''\,\mathrm{s}^{-1} \left(\frac{R_\mathrm{Earth}}{r}\right)^{5/2} = -8.4''\,\mathrm{year}^{-1} \left(\frac{R_\mathrm{Earth}}{r}\right)^{5/2} . \tag{9.94} \]
In this context, see also the Example box “Measurement of spin precession near the Earth” following the discussion of the Lense-Thirring effect leading to (7.62).
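The prefactors in (9.94) can be reproduced from (9.93) with the Earth's mass and radius quoted above, converted to SI units. The values of the gravitational constant, the speed of light and the length of the year are standard rounded figures assumed for this sketch; the function name is hypothetical.

```python
import math

G_NEWTON = 6.674e-11      # gravitational constant in m^3 kg^-1 s^-2 (assumed value)
C_LIGHT = 2.998e8         # speed of light in m/s (assumed value)
M_EARTH = 5.97e24         # Earth's mass in kg (5.97e27 g in the text)
R_EARTH = 6.38e6          # Earth's radius in m (6.38e8 cm in the text)
SECONDS_PER_YEAR = 3.156e7
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def geodetic_rate(r):
    """Geodetic precession frequency (9.93) in rad/s for orbital radius r in m."""
    gm = G_NEWTON * M_EARTH
    return -1.5 * gm ** 1.5 / (C_LIGHT ** 2 * r ** 2.5)

rate_arcsec_per_s = geodetic_rate(R_EARTH) * ARCSEC_PER_RAD
rate_arcsec_per_year = rate_arcsec_per_s * SECONDS_PER_YEAR
print(rate_arcsec_per_s, rate_arcsec_per_year)
```

Evaluating `geodetic_rate` at a larger radius shows the steep fall-off with the power 5/2 of the orbital radius.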
Chapter 10

Schwarzschild Black Holes

10.1 The singularity at r = 2m

10.1.1 Free fall towards the centre

Before we can continue discussing the physical meaning of the Schwarzschild metric, we need to clarify the nature of the singularity at the Schwarzschild radius, $r = 2m$. Upon closer inspection, it seems to lead to contradictory conclusions.
Let us begin with an astronaut falling freely towards the centre of the Schwarzschild spacetime along a radial orbit. Since $\dot\varphi = 0$, the angular momentum vanishes, $L = 0$, and the equation of motion (9.15) reads
\[ \dot r^2 + c^2\left(1 - \frac{2m}{r}\right) = E^2 . \tag{10.1} \]
Suppose the astronaut was at rest at $r = R$, then $E^2 = c^2(1 - 2m/R)$ and $E^2 < c^2$, and we have
\[ \frac{\dot r^2}{c^2} = \left(1 - \frac{2m}{R}\right) - \left(1 - \frac{2m}{r}\right) = 2m\left(\frac{1}{r} - \frac{1}{R}\right) , \tag{10.2} \]
which yields
\[ \left[2m\left(\frac{1}{r} - \frac{1}{R}\right)\right]^{-1/2} dr = c\,d\tau , \tag{10.3} \]
where $\tau$ is the proper time.
This equation admits a parametric solution. Starting from
\[ r = \frac{R}{2}\,(1 + \cos\eta) , \quad dr = -\frac{R}{2}\sin\eta\, d\eta , \tag{10.4} \]
we first see that
\[ \frac{1}{r} - \frac{1}{R} = \frac{2}{R(1 + \cos\eta)} - \frac{1}{R} = \frac{1}{R}\, \frac{1 - \cos\eta}{1 + \cos\eta} = \frac{1}{R}\, \frac{(1 - \cos\eta)^2}{\sin^2\eta} , \tag{10.5} \]
where we have used in the last step that $1 - \cos^2\eta = \sin^2\eta$. This result allows us to translate the left-hand side of (10.3) into
\[ \frac{\sqrt{R}\,\sin\eta\, dr}{\sqrt{2m}\,(1 - \cos\eta)} = -\frac{R\sqrt{R}}{2\sqrt{2m}}\, \frac{\sin^2\eta\, d\eta}{1 - \cos\eta} = -\sqrt{\frac{R^3}{8m}}\,(1 + \cos\eta)\, d\eta . \tag{10.6} \]
Since $r$ decreases while $\tau$ increases for the infalling astronaut, the negative root of (10.2) applies. Integrating, we find that this solves (10.3) if
\[ c\tau = \sqrt{\frac{R^3}{8m}}\,(\eta + \sin\eta) , \quad c\,d\tau = \sqrt{\frac{R^3}{8m}}\,(1 + \cos\eta)\, d\eta . \tag{10.7} \]

At $\eta = 0$, the proper time is $\tau = 0$ and $r = R$, i.e. the proper time starts running when the free fall begins. Figure 10.1 shows the radial distance $r$ as a function of the proper time $\tau$ for $R = 6m$, i.e. for an astronaut starting at rest at the innermost stable circular orbit.

? Confirm the solution (10.7) for the proper time by your own calculation.

Figure 10.1 Radial distance $r$ as a function of proper time $\tau$ for an astronaut falling towards the singularity of the Schwarzschild spacetime beginning at rest at the innermost stable circular orbit, $R = 6m$.
[Axes: proper time cτ/m, radial distance r/m.]

Free-fall time to the centre of a black hole
The centre $r = 0$ is reached when $\eta = \pi$, i.e. after the proper time
\[ \tau_0 = \frac{\pi}{c}\sqrt{\frac{R^3}{8m}} . \tag{10.8} \]
This indicates that the observer falls freely within finite time “through” the singularity at $r = 2m$ without encountering any (kinematic) problem.
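The parametric solution (10.4) and (10.7) can be checked against the equation of motion (10.2), and the free-fall time (10.8) evaluated for the case R = 6m shown in Figure 10.1. The following is a minimal sketch with hypothetical function names, working in units with m = 1 and using central differences for the derivatives with respect to η.

```python
import math

def r_of_eta(eta, R):
    """Radius along the infall trajectory, equation (10.4)."""
    return 0.5 * R * (1.0 + math.cos(eta))

def ctau_of_eta(eta, R, m):
    """Proper time times c along the trajectory, equation (10.7)."""
    return math.sqrt(R ** 3 / (8.0 * m)) * (eta + math.sin(eta))

def eom_residual(eta, R, m, h=1e-6):
    """(dr/d(c tau))^2 minus 2m(1/r - 1/R); vanishes if (10.2) is satisfied."""
    dr = (r_of_eta(eta + h, R) - r_of_eta(eta - h, R)) / (2.0 * h)
    dctau = (ctau_of_eta(eta + h, R, m) - ctau_of_eta(eta - h, R, m)) / (2.0 * h)
    return (dr / dctau) ** 2 - 2.0 * m * (1.0 / r_of_eta(eta, R) - 1.0 / R)

m, R = 1.0, 6.0  # units with m = 1; start at rest at R = 6m
for eta in (0.5, 1.0, 2.0):
    print(eta, r_of_eta(eta, R), ctau_of_eta(eta, R, m), eom_residual(eta, R, m))

ctau_centre = ctau_of_eta(math.pi, R, m)  # c * tau_0 from (10.8), in units of m
print(ctau_centre)
```

For R = 6m, the proper time to the centre is cτ₀ = π√27 m, a little more than 16 m, consistent with the range of the time axis in Figure 10.1.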

10.1.2 Problems with the Schwarzschild coordinates

However, let us now describe the radial coordinate $r$ as a function of the coordinate time $t$. Using (9.13), we first find
\[ \dot r = \frac{dr}{dt}\, \dot t = -\frac{dr}{dt}\, \frac{E/c}{1 - 2m/r} . \tag{10.9} \]
Next, we introduce a new, convenient radial coordinate $\bar r$ such that
\[ d\bar r = \frac{dr}{1 - 2m/r} . \tag{10.10} \]

? Before you read on, find the function $\bar r(r)$ yourself, given (10.10).

This condition can be integrated as follows,
\[ \frac{dr}{1 - 2m/r} = \frac{r/2m - 1 + 1}{r/2m - 1}\, dr = dr + \frac{dr}{r/2m - 1} = dr + 2m\, d\ln\left(\frac{r}{2m} - 1\right) , \tag{10.11} \]
giving
\[ \bar r = r + 2m\ln\left(\frac{r}{2m} - 1\right) . \tag{10.12} \]
With this, we find
\[ \dot r = -\frac{E/c}{1 - 2m/r}\, \frac{dr}{dt} = -\frac{E}{c}\, \frac{d\bar r}{dt} , \tag{10.13} \]

and thus, from the equation of motion (10.1),
\[ \frac{E^2}{c^2}\left(\frac{d\bar r}{dt}\right)^2 = E^2 - c^2\left(1 - \frac{2m}{r}\right) . \tag{10.14} \]
Approaching the Schwarzschild radius from outside, i.e. in the limit $r \to 2m+$, we have from (10.12)
\[ \lim_{r \to 2m+} \bar r = \lim_{r \to 2m+} 2m\left[1 + \ln\left(\frac{r}{2m} - 1\right)\right] = -\infty . \tag{10.15} \]
However, in the same limit, the equation of motion says
\[ \frac{E^2}{c^2}\left(\frac{d\bar r}{dt}\right)^2 \to E^2 , \tag{10.16} \]
and thus
\[ \frac{d\bar r}{dt} \to \pm c . \tag{10.17} \]
Of the two signs, we have to select the negative because of $\bar r \to -\infty$, as (10.15) shows. Therefore, an approximate solution of the equation of motion near the singularity is $\bar r \approx -c(t - t_0)$ with an arbitrary constant $t_0$. To be specific, we set $t = 0$ when $r = 6m$, the radius of the innermost stable circular orbit defined in Sect. 9.2. There, $\bar r_0 = 2m(3 + \ln 2)$ according to (10.12) and thus
\[ \bar r \approx -ct + 2m(3 + \ln 2) . \tag{10.18} \]

Substituting $r$ for $\bar r$,
\[ -ct + 2m(3 + \ln 2) = r + 2m\ln\left(\frac{r}{2m} - 1\right) \approx 2m\left[1 + \ln\left(\frac{r}{2m} - 1\right)\right] . \tag{10.19} \]

Free-fall coordinate time to the centre of a black hole
Solving the approximate equation (10.19) for $r$, we find
\[ \ln\left(\frac{r}{2m} - 1\right) \approx -\frac{ct}{2m} + 2 + \ln 2 \tag{10.20} \]
or
\[ r \approx 2m\left(1 + 2e^{2 - ct/2m}\right) > 2m , \tag{10.21} \]
showing that the orbital radius remains larger than the Schwarzschild radius even for $t \to \infty$. Thus, in coordinate time, the Schwarzschild radius is never even reached!
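The exponential approach to the Schwarzschild radius in (10.21) can be made concrete with a few numbers. The sketch below uses hypothetical function names and units with m = 1; note that (10.21) is an approximation valid only close to r = 2m, so it is evaluated at late coordinate times and is not expected to reproduce r = 6m at t = 0.

```python
import math

def r_bar(r, m):
    """Radial coordinate r-bar from (10.12), defined for r > 2m."""
    return r + 2.0 * m * math.log(r / (2.0 * m) - 1.0)

def r_late(ct, m):
    """Approximate radius (10.21) at late coordinate time, given c*t."""
    return 2.0 * m * (1.0 + 2.0 * math.exp(2.0 - ct / (2.0 * m)))

m = 1.0
print(r_bar(6.0 * m, m), 2.0 * m * (3.0 + math.log(2.0)))  # both give r-bar_0
for ct in (10.0, 20.0, 40.0):
    r = r_late(ct, m)
    print(ct, r, r / (2.0 * m) - 1.0)  # relative distance to the Schwarzschild radius
```

Each additional interval c Δt = 2m reduces the remaining relative distance to r = 2m by a factor e, so the distance shrinks rapidly but never vanishes at finite t.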
Finally, radial light rays are described by radial null geodesics, thus satisfying
\[ 0 = ds^2 = -\left(1 - \frac{2m}{r}\right) c^2 dt^2 + \frac{dr^2}{1 - 2m/r} \tag{10.22} \]
or
\[ \frac{dr}{dt} = \pm c\left(1 - \frac{2m}{r}\right) , \tag{10.23} \]
suggesting that the light cones become infinitely narrow as $r \to 2m+$.

Problems on the horizon
These results appear quite dissatisfactory or confusing: while a freely falling observer reaches the Schwarzschild radius and even the centre of the Schwarzschild spacetime after finite proper time, the coordinate time becomes infinite even for reaching the Schwarzschild radius, and the flattening of the light cones as one approaches the Schwarzschild radius is entirely unwanted because causality cannot be assessed when the light cone degenerates to a line.

10.1.3 Curvature at r = 2m

Moreover, consider the components of the Ricci tensor given in (8.57) and (8.58) near the Schwarzschild radius. Since $a = -b$ and
\[ b = -\frac{1}{2}\ln\left(1 - \frac{2m}{r}\right) = -a \tag{10.24} \]
from (8.64), the required derivatives are
\[ a' = \frac{m}{r(r - 2m)} = -b' , \quad a'' = -\frac{2m(r - m)}{r^2 (r - 2m)^2} = -b'' . \tag{10.25} \]
Thus,
\[ R_{00} = -\left(a'' + 2a'^2 + \frac{2a'}{r}\right)\left(1 - \frac{2m}{r}\right) = 0 = -R_{11} \tag{10.26} \]
and
\[ R_{22} = -\frac{2a'}{r}\left(1 - \frac{2m}{r}\right) + \frac{1}{r^2}\left(1 - e^{-2b}\right) = -\frac{2m(r - 2m)}{r^3 (r - 2m)} + \frac{2m}{r^3} = 0 = R_{33} , \tag{10.27} \]
i.e. the components of the Ricci tensor in the Schwarzschild tetrad remain perfectly regular at the Schwarzschild radius!
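The cancellations in (10.26) and (10.27) can be confirmed numerically without using the closed-form derivatives (10.25). The sketch below is an illustration with hypothetical function names: it differentiates a(r) = ½ ln(1 − 2m/r) by central differences and inserts the result into the two expressions, in units with m = 1.

```python
import math

def a_func(r, m):
    """Metric function a(r) = (1/2) ln(1 - 2m/r), cf. (10.24)."""
    return 0.5 * math.log(1.0 - 2.0 * m / r)

def a_derivatives(r, m, h=1e-4):
    """First and second derivative of a(r) from central differences."""
    a_plus, a_mid, a_minus = a_func(r + h, m), a_func(r, m), a_func(r - h, m)
    return (a_plus - a_minus) / (2.0 * h), (a_plus - 2.0 * a_mid + a_minus) / h ** 2

def ricci_components(r, m):
    """R_00 and R_22 as written in (10.26) and (10.27), using e^{-2b} = 1 - 2m/r."""
    a1, a2 = a_derivatives(r, m)
    big_e = 1.0 - 2.0 * m / r
    r00 = -(a2 + 2.0 * a1 ** 2 + 2.0 * a1 / r) * big_e
    r22 = -(2.0 * a1 / r) * big_e + (1.0 - big_e) / r ** 2
    return r00, r22

for r in (3.0, 5.0, 10.0):
    print(r, ricci_components(r, 1.0))
```

Both components vanish up to the error of the finite differences, while the individual terms grow as r approaches 2m.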

10.2 The Kruskal continuation

10.2.1 Construction principle

We shall now try to remove the obvious problems with the Schwarzschild coordinates by transforming $(ct, r)$ to new coordinates $(u, v)$, leaving $\vartheta$ and $\varphi$ unchanged and requiring that the metric can be written as
\[ g = -f^2(u, v)\left(dv^2 - du^2\right) + r^2\left(d\vartheta^2 + \sin^2\vartheta\, d\varphi^2\right) \tag{10.28} \]
with a function $f(u, v)$ to be determined.
Provided $f(u, v) \ne 0$, radial light rays propagate as in a two-dimensional Minkowski metric according to
\[ dv^2 = du^2 , \quad \left(\frac{du}{dv}\right)^2 = 1 , \tag{10.29} \]
which shows that the light cones remain undeformed in the new coordinates.
Light cones in Kruskal coordinates
The Kruskal coordinates are constructed such that the light cones remain the same everywhere.
The Jacobian matrix of the transformation from the Schwarzschild coordinates $(ct, r, \vartheta, \varphi)$ to the new coordinates $(v, u, \vartheta, \varphi)$ is
\[ J^\alpha{}_\beta = \begin{pmatrix} v_t & u_t & 0 & 0 \\ v_r & u_r & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} , \tag{10.30} \]
where subscripts denote derivatives here,
\[ v_t = \partial_{ct} v , \quad v_r = \partial_r v \tag{10.31} \]
and likewise for $u$. The metric $\bar g$ in the new coordinates,
\[ \bar g = \mathrm{diag}\left(-f^2, f^2, r^2, r^2\sin^2\vartheta\right) , \tag{10.32} \]
is transformed into the original Schwarzschild coordinates by
\[ g = J \bar g J^T = \begin{pmatrix} -f^2\left(v_t^2 - u_t^2\right) & -f^2\left(v_t v_r - u_t u_r\right) & 0 & 0 \\ -f^2\left(v_t v_r - u_t u_r\right) & -f^2\left(v_r^2 - u_r^2\right) & 0 & 0 \\ 0 & 0 & r^2 & 0 \\ 0 & 0 & 0 & r^2\sin^2\vartheta \end{pmatrix} , \tag{10.33} \]

? Beginning with $\bar g$, find the matrix representation of the metric $g$ yourself and thus confirm the result (10.33).

which, by comparison with our requirement (10.28), yields the three equations
\[ -\left(1 - \frac{2m}{r}\right) = -f^2\left(v_t^2 - u_t^2\right) , \]
\[ \frac{1}{1 - 2m/r} = -f^2\left(v_r^2 - u_r^2\right) , \]
\[ 0 = v_t v_r - u_t u_r . \tag{10.34} \]
For convenience, we now fall back to the radial coordinate $\bar r$ from (10.12) and introduce the function
\[ F(\bar r) \equiv \frac{1 - 2m/r}{f^2(r)} , \tag{10.35} \]
assuming that $f$ will turn out to depend on $r$ only since any dependence on time and on the angles $\vartheta$ and $\varphi$ is forbidden in a static, spherically-symmetric spacetime. Then,
\[ v_{\bar r} = \frac{dr}{d\bar r}\, v_r = \left(1 - \frac{2m}{r}\right) v_r \tag{10.36} \]
and the same for $u$.

? Repeat the calculation in (10.36) with the coordinate $u$.
10.2.2 Transformation to Kruskal coordinates

The equations (10.34) then transform to

$$F(\bar r) = v_t^2 - u_t^2\,,\quad -F(\bar r) = v_{\bar r}^2 - u_{\bar r}^2\,,\quad v_t v_{\bar r} - u_t u_{\bar r} = 0\,. \tag{10.37}$$

Now, we add the two equations containing F(r̄) and then add and subtract
from the result twice the third equation from (10.37). This yields

$$\left(v_t \pm v_{\bar r}\right)^2 = \left(u_t \pm u_{\bar r}\right)^2\,. \tag{10.38}$$

Taking the square root of this equation, we can choose the signs. The
choice
$$v_t + v_{\bar r} = u_t + u_{\bar r}\,,\quad v_t - v_{\bar r} = -\left(u_t - u_{\bar r}\right) \tag{10.39}$$
prevents the Jacobian matrix from becoming singular, det J = 0.
?
Can you confirm that det J ≠ 0 for the choice of sign in (10.39)?
Adding and subtracting the equations (10.39), we find

$$v_t = u_{\bar r}\,,\quad u_t = v_{\bar r}\,. \tag{10.40}$$

Taking partial derivatives once with respect to t and once with respect to
r̄ allows us to combine these equations to find the wave equations

$$v_{tt} - v_{\bar r\bar r} = 0\,,\quad u_{tt} - u_{\bar r\bar r} = 0\,, \tag{10.41}$$

which are solved by any two functions h± propagating with unit velocity,

$$v = h_+(\bar r + ct) + h_-(\bar r - ct)\,,\quad u = h_+(\bar r + ct) - h_-(\bar r - ct)\,, \tag{10.42}$$

where the signs were chosen such as to satisfy the sign choice in (10.39).
Now, since
$$v_t = h_+' - h_-'\,,\quad u_t = h_+' + h_-'\,,\quad v_{\bar r} = h_+' + h_-'\,,\quad u_{\bar r} = h_+' - h_-'\,, \tag{10.43}$$
where the primes denote derivatives with respect to the functions’ arguments,
we find from (10.37)
$$F(\bar r) = \left(h_+' - h_-'\right)^2 - \left(h_+' + h_-'\right)^2 = -4h_+'h_-'\,. \tag{10.44}$$

We start from outside the Schwarzschild radius, assuming r > 2m, where
also F(r̄) > 0 according to (10.35). The derivative of (10.44) with respect
to r̄ yields
$$F'(\bar r) = -4\left(h_+''h_-' + h_+'h_-''\right) \tag{10.45}$$
or, with (10.44),
$$\frac{F'}{F} = \frac{h_+''}{h_+'} + \frac{h_-''}{h_-'}\,. \tag{10.46}$$
The derivative of (10.44) with respect to time yields
$$0 = -4\left(h_+''h_-' - h_+'h_-''\right) \quad\Rightarrow\quad \frac{h_+''}{h_+'} - \frac{h_-''}{h_-'} = 0\,. \tag{10.47}$$

The sum of these two equations gives

$$(\ln F)' = 2\left(\ln h_+'\right)'\,. \tag{10.48}$$

Now, the left-hand side depends on r̄, the right-hand side on the independent
variable r̄ + ct. Thus, the two sides of this equation must equal the
same constant, which we call 2C:
$$(\ln F)' = 2C = 2\left(\ln h_+'\right)'\,. \tag{10.49}$$

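The left of these equations yields
$$\ln F = 2C\bar r + \mathrm{const.} \quad\Rightarrow\quad F = \mathrm{const.}\,e^{2C\bar r}\,, \tag{10.50}$$
while the right equation gives
$$\ln h_+' = C(\bar r + ct) + \mathrm{const.} \tag{10.51}$$
or
$$h_+ = \mathrm{const.}\,e^{C(\bar r + ct)}\,. \tag{10.52}$$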

For later convenience, we choose the remaining constants in (10.50) and
(10.52) such that
$$F(\bar r) = C^2 e^{2C\bar r}\,,\quad h_+(\bar r + ct) = \frac{1}{2}\,e^{C(\bar r + ct)}\,, \tag{10.53}$$
and (10.47) gives
$$h_-(\bar r - ct) = -\frac{1}{2}\,e^{C(\bar r - ct)}\,, \tag{10.54}$$
where the negative sign must be chosen to satisfy both (10.44) and F > 0.
Working our way back, we find
$$u = h_+(\bar r + ct) - h_-(\bar r - ct) = \frac{1}{2}\left(e^{C(\bar r + ct)} + e^{C(\bar r - ct)}\right) = e^{C\bar r}\cosh(Cct) = \left(\frac{r}{2m} - 1\right)^{2mC} e^{Cr}\cosh(Cct)\,, \tag{10.55}$$
using (10.12) for r̄. Similarly, we find
$$v = \left(\frac{r}{2m} - 1\right)^{2mC} e^{Cr}\sinh(Cct)\,, \tag{10.56}$$

and the function f follows from (10.35),
$$\begin{aligned} f^2 &= \frac{1 - 2m/r}{F} = \frac{1 - 2m/r}{C^2}\,e^{-2C\bar r} \\ &= \frac{1 - 2m/r}{C^2}\,e^{-2Cr}\exp\left[-4mC\ln\left(\frac{r}{2m} - 1\right)\right] \\ &= \frac{2m}{rC^2}\left(\frac{r}{2m} - 1\right)^{1-4mC} e^{-2Cr}\,. \end{aligned} \tag{10.57}$$
Caution: Note that we could equally well choose h+ < 0 and
h− > 0 in (10.53) and (10.54). This possible alternative choice is
important for our later discussion.


Figure 10.2 Martin D. Kruskal (1925–2006), US-American mathematician
and physicist. Source: Wikipedia

Now, since we want f to be non-zero and regular at r = 2m, we must
require 4mC = 1, which finally fixes the Kruskal transformation of the
Schwarzschild metric, found by Martin Kruskal in 1960. The coordinates
(v, u, ϑ, ϕ) are also called Kruskal-Szekeres coordinates, including
George (György) Szekeres (1911–2005), who found them independently
in 1961.

Kruskal-Szekeres coordinates
The Kruskal-Szekeres coordinates (u, v) are related to the Schwarzschild
coordinates (ct, r) by
$$u = \sqrt{\frac{r}{2m} - 1}\;e^{r/4m}\cosh\left(\frac{ct}{4m}\right),\quad v = \sqrt{\frac{r}{2m} - 1}\;e^{r/4m}\sinh\left(\frac{ct}{4m}\right), \tag{10.58}$$
and the scale function f is
$$f^2 = \frac{32m^3}{r}\,e^{-r/2m}\,. \tag{10.59}$$
We have (or rather, Martin Kruskal has) thus achieved our goal to replace
the Schwarzschild coordinates by others in which the Schwarzschild
metric remains perfectly regular at r = 2m. Appendix C shows how
space-times can be compactly represented in Penrose-Carter diagrams.
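The transformation (10.58) and the scale function (10.59) can be tried out numerically. The following is a minimal sketch for the exterior region r > 2m only; the function names are illustrative choices and not part of the lecture notes, and lengths are measured in whatever unit m is given in.

```python
import math

def kruskal_from_schwarzschild(ct, r, m):
    """Map Schwarzschild (ct, r) with r > 2m to Kruskal-Szekeres (u, v), eq. (10.58)."""
    if r <= 2.0 * m:
        raise ValueError("this sketch covers the exterior region r > 2m only")
    amplitude = math.sqrt(r / (2.0 * m) - 1.0) * math.exp(r / (4.0 * m))
    u = amplitude * math.cosh(ct / (4.0 * m))
    v = amplitude * math.sinh(ct / (4.0 * m))
    return u, v

def scale_function_squared(r, m):
    """Scale function f^2 from eq. (10.59); finite and non-zero at r = 2m."""
    return 32.0 * m**3 / r * math.exp(-r / (2.0 * m))

# Example point outside the Schwarzschild radius (m = 1, r = 5, ct = 3)
u, v = kruskal_from_schwarzschild(ct=3.0, r=5.0, m=1.0)
print(u, v, scale_function_squared(2.0, 1.0))
```

Evaluating `scale_function_squared` at r = 2m illustrates the point of the construction: the scale function stays regular there.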

10.3 Physical meaning of the Kruskal continuation

10.3.1 Regions in the Kruskal spacetime

Since cosh²(x) − sinh²(x) = 1, eqs. (10.58) imply
$$u^2 - v^2 = \left(\frac{r}{2m} - 1\right)e^{r/2m}\,,\quad \frac{v}{u} = \tanh\left(\frac{ct}{4m}\right). \tag{10.60}$$
This means u = |v| for r = 2m, which is reached for t → ±∞. Lines of
constant coordinate time t are straight lines through the origin in the (u, v)
plane with slope tanh(ct/4m), and lines of constant radial coordinate r
are hyperbolae.
The metric in Kruskal coordinates (10.28) is regular as long as r(u, v) > 0,
which is the case for
$$u^2 - v^2 > -1\,, \tag{10.61}$$
as (10.60) shows. The hyperbola limiting the regular domain in the
Kruskal manifold is thus given by v² − u² = 1. If (10.61) is satisfied, r is
uniquely defined, because the function
$$\rho(x) \equiv (x - 1)\,e^{x} = u^2 - v^2 > -1 \tag{10.62}$$
of x ≡ r/2m is monotonic for x > 0:
$$\rho'(x) = e^{x}(x - 1) + e^{x} = x\,e^{x} > 0 \quad (x > 0)\,. \tag{10.63}$$
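Because ρ(x) is monotonic for x > 0, the radius r(u, v) can be recovered numerically from u² − v². The sketch below uses plain bisection with x = r/2m; the function names and the tolerance are illustrative assumptions, not part of the lecture notes.

```python
import math

def rho(x):
    """rho(x) = (x - 1) e^x from eq. (10.62), with x = r / 2m."""
    return (x - 1.0) * math.exp(x)

def radius_from_kruskal(u, v, m, tol=1e-12):
    """Solve rho(r / 2m) = u^2 - v^2 for r by bisection (valid if u^2 - v^2 > -1)."""
    target = u * u - v * v
    if target <= -1.0:
        raise ValueError("u^2 - v^2 must exceed -1, see eq. (10.61)")
    lo, hi = 0.0, 1.0            # rho(0) = -1 and rho(1) = 0
    while rho(hi) < target:      # enlarge the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if rho(mid) < target:
            lo = mid
        else:
            hi = mid
    return 2.0 * m * 0.5 * (lo + hi)

# A point on the u axis that belongs to r = 5 for m = 1
print(radius_from_kruskal(math.sqrt(1.5 * math.exp(2.5)), 0.0, 1.0))
```

Bisection is chosen here only because monotonicity guarantees a unique root inside the bracket; any other root finder would do.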


[Figure 10.3: plot of the (u, v) plane with horizontal axis u and vertical axis v; the regions I to IV are marked.]
Figure 10.3 Illustration of the Kruskal continuation in the u-v plane. The
Schwarzschild domain r > 2m is shaded in red and marked with I, the
forbidden region r < 0 is shaded in gray.

The domain of the original Schwarzschild solution is restricted to u > 0
and |v| < u (i.e. to the shaded area I in Fig. 10.3), but this is a consequence of
our choice for the relative signs of h± in (10.54). We could as well have
chosen h+ < 0 and h− > 0, which would correspond to the replacement
(u, v) → (−u, −v).
The original Schwarzschild solution for r < 2m also satisfies Einstein’s
vacuum field equations. There, the Schwarzschild metric shows that r
then behaves like a time coordinate because grr < 0, and t behaves like a
spatial coordinate.
Looking at the definition of F(r̄) in (10.35), we see that r < 2m cor-
responds to F < 0, which implies that h+ and h− must have the same
(rather than opposite) signs because of (10.44). This interchanges the
functions u and v from (10.58), i.e. u → v and v → u. Then, the condition
|v| < u derived for r > 2m changes to |v| > u.

Domains in the Kruskal spacetime


In summary, the exterior of the Schwarzschild radius corresponds to
the domain u > 0, |v| < u, and its interior is bounded in the (u, v) plane
by the lines u > 0, |v| = u and v2 − u2 = 1.
Radial light rays propagate according to ds² = 0 or dv = ±du, i.e. they
are straight diagonal lines in the (u, v) plane. This shows that light
rays can propagate freely into the region r < 2m, but there is no causal
connection from within r < 2m to the outside.

Non-static interior of the Schwarzschild horizon


The Killing vector field K = ∂t for the Schwarzschild spacetime outside
r = 2m becomes space-like for r < 2m, which means that the spacetime
cannot be static any more inside the Schwarzschild radius.

10.3.2 Eddington-Finkelstein coordinates

We now want to study the collapse of an object, e.g. a star. For this
purpose, coordinates originally introduced by Arthur S. Eddington and
re-discovered by David R. Finkelstein are convenient, which are defined
by
$$\begin{aligned} r' &= r\,,\quad \vartheta' = \vartheta\,,\quad \varphi' = \varphi\,, \\ ct &= ct' - 2m\ln\left(\pm\frac{r}{2m} \mp 1\right) \end{aligned} \tag{10.64}$$
in analogy to the radial coordinate r̄ from (10.12), where the upper
and lower signs in the second line are valid for r > 2m and r < 2m,
respectively.
?
What do the light cones look like in the coordinates given in (10.64)?
Since
$$e^{\pm ct/4m} = e^{\pm ct'/4m}\begin{cases} \left(\dfrac{r}{2m} - 1\right)^{\mp 1/2} & (r > 2m) \\[2ex] \left(-\dfrac{r}{2m} + 1\right)^{\mp 1/2} & (r < 2m) \end{cases}\,, \tag{10.65}$$
inserting these expressions into the Kruskal-Szekeres coordinates (10.58)
shows that they are related to the Eddington-Finkelstein coordinates by
$$\begin{aligned} u &= \frac{e^{r/4m}}{2}\left(e^{ct'/4m} + \frac{r - 2m}{2m}\,e^{-ct'/4m}\right), \\ v &= \frac{e^{r/4m}}{2}\left(e^{ct'/4m} - \frac{r - 2m}{2m}\,e^{-ct'/4m}\right), \end{aligned} \tag{10.66}$$
such that
$$\frac{r - 2m}{2m}\,e^{r/2m} = u^2 - v^2\,,\quad e^{ct'/2m} = \frac{r - 2m}{2m}\,\frac{u + v}{u - v}\,. \tag{10.67}$$

The first of these equations shows again that r can be uniquely determined
from u and v if u² − v² > −1. The second equation determines t′ uniquely
provided r > 2m and (u + v)/(u − v) > 0, or r < 2m and (u + v)/(u − v) < 0.
This is possible if v > −u.

Figure 10.4 Sir Arthur Stanley Eddington (1882–1944), British astrophysicist.
Source: Wikipedia
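The relations (10.66) and (10.67) between Eddington-Finkelstein and Kruskal-Szekeres coordinates can be checked numerically on both sides of r = 2m. This is a small sketch; the function name `kruskal_from_ef` is an illustrative choice, and `ct_prime` stands for ct′.

```python
import math

def kruskal_from_ef(ct_prime, r, m):
    """Kruskal-Szekeres (u, v) from Eddington-Finkelstein (ct', r), eq. (10.66)."""
    prefactor = 0.5 * math.exp(r / (4.0 * m))
    growing = math.exp(ct_prime / (4.0 * m))
    decaying = (r - 2.0 * m) / (2.0 * m) * math.exp(-ct_prime / (4.0 * m))
    return prefactor * (growing + decaying), prefactor * (growing - decaying)

for r in (3.0, 1.0):  # one point outside and one inside r = 2m, with m = 1
    u, v = kruskal_from_ef(1.0, r, 1.0)
    # first relation of (10.67): both printed numbers should agree
    print(r, u * u - v * v, (r - 2.0) / 2.0 * math.exp(r / 2.0))
```

Note that u + v is positive for every input of this function, which is the condition v > −u stated above.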
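Using
$$c\,\mathrm{d}t = c\,\mathrm{d}t' - 2m\,\frac{1}{\pm r/2m \mp 1}\,\frac{\pm\mathrm{d}r}{2m} = c\,\mathrm{d}t' - \frac{2m}{r}\,\frac{\mathrm{d}r}{1 - 2m/r}\,, \tag{10.68}$$
we find
$$-\left(1 - \frac{2m}{r}\right)c^2\mathrm{d}t^2 = -\left(1 - \frac{2m}{r}\right)c^2\mathrm{d}t'^2 + \frac{4mc}{r}\,\mathrm{d}t'\,\mathrm{d}r - \frac{4m^2}{r^2}\,\frac{\mathrm{d}r^2}{1 - 2m/r}\,, \tag{10.69}$$
and thus the line element of the metric in Eddington-Finkelstein coordinates
reads
$$\begin{aligned} \mathrm{d}s^2 &= -\left(1 - \frac{2m}{r}\right)c^2\mathrm{d}t'^2 + \left(1 - \frac{4m^2}{r^2}\right)\frac{\mathrm{d}r^2}{1 - 2m/r} + \frac{4mc}{r}\,\mathrm{d}t'\,\mathrm{d}r + r^2\mathrm{d}\Omega^2 \\ &= -\left(1 - \frac{2m}{r}\right)c^2\mathrm{d}t'^2 + \left(1 + \frac{2m}{r}\right)\mathrm{d}r^2 + \frac{4mc}{r}\,\mathrm{d}t'\,\mathrm{d}r + r^2\mathrm{d}\Omega^2\,. \end{aligned} \tag{10.70}$$
Thus, the metric acquires off-diagonal elements such that it no longer
depends on t′ and r separately.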
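For radial light rays, dΩ = 0 and ds² = 0, which implies from (10.70)
$$\left(1 - \frac{2m}{r}\right)c^2\mathrm{d}t'^2 - \left(1 + \frac{2m}{r}\right)\mathrm{d}r^2 - \frac{4mc}{r}\,\mathrm{d}t'\,\mathrm{d}r = 0\,, \tag{10.71}$$
which can be factorised as
$$\left[\left(1 - \frac{2m}{r}\right)c\,\mathrm{d}t' - \left(1 + \frac{2m}{r}\right)\mathrm{d}r\right]\left(c\,\mathrm{d}t' + \mathrm{d}r\right) = 0\,. \tag{10.72}$$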

Light cones in Eddington-Finkelstein coordinates


Light cones in Eddington-Finkelstein coordinates are defined either by
$$\frac{\mathrm{d}r}{\mathrm{d}t'} = -c \quad\Rightarrow\quad r = -ct' + \mathrm{const.} \tag{10.73}$$
or by
$$\frac{\mathrm{d}r}{\mathrm{d}t'} = c\,\frac{r - 2m}{r + 2m}\,. \tag{10.74}$$
This shows that dr/dt′ → −c for r → 0, dr/dt′ = 0 for r = 2m, and
dr/dt′ → c for r → ∞. Due to the vanishing derivative of r with respect
to t′ at r = 2m, geodesics cannot cross the Schwarzschild radius from
inside, but they can from outside because of (10.73).
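The two families of radial light rays can be verified against the null condition (10.71). The sketch below sets c = 1 by default and uses illustrative function names that do not appear in the lecture notes.

```python
def outgoing_slope(r, m, c=1.0):
    """dr/dt' for the second family of radial light rays, eq. (10.74)."""
    return c * (r - 2.0 * m) / (r + 2.0 * m)

def null_form(c_dt, dr, r, m):
    """Left-hand side of eq. (10.71) for a displacement (c dt', dr)."""
    return ((1.0 - 2.0 * m / r) * c_dt**2
            - (1.0 + 2.0 * m / r) * dr**2
            - 4.0 * m / r * c_dt * dr)

m = 1.0
for r in (0.5, 2.0, 3.0, 50.0):
    s = outgoing_slope(r, m)
    # both families, dr = s c dt' and dr = -c dt', should give zero
    print(r, s, null_form(1.0, s, r, m), null_form(1.0, -1.0, r, m))
```

The sign of `outgoing_slope` changes at r = 2m, which is how the tilting of the light cones shows up in these coordinates.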

[Figure 10.5: horizontal axis radius r′/2m, vertical axis time t′/2m.]
Figure 10.5 Light cones in the Schwarzschild spacetime in Eddington-
Finkelstein coordinates. The blue lines mark outgoing, the red lines incoming
radial light rays. The blue ellipses emphasise the light cones.

10.4 Redshift approaching the Schwarzschild radius

Suppose a light-emitting source (e.g. an astronaut with a torch) is falling
towards a (Schwarzschild) black hole. What does a distant observer see?
Let v and u be the four-velocities of the astronaut and the observer,
respectively. Then according to (4.48) the redshift of the light from the
torch as seen by the observer is
$$1 + z = \frac{\nu_\mathrm{em}}{\nu_\mathrm{obs}} = \frac{\langle k, v\rangle}{\langle k, u\rangle}\,, \tag{10.75}$$
where k is the wave vector of the light.


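We transform to the retarded time $ct_\mathrm{ret} \equiv ct - \bar r$, with r̄ given by (10.12).
Then, (10.10) implies that
$$c\,\mathrm{d}t_\mathrm{ret} = c\,\mathrm{d}t - \frac{\mathrm{d}r}{1 - 2m/r}\,, \tag{10.76}$$
thus
$$c^2\mathrm{d}t^2 = c^2\mathrm{d}t_\mathrm{ret}^2 + \frac{\mathrm{d}r^2}{(1 - 2m/r)^2} + \frac{2c\,\mathrm{d}t_\mathrm{ret}\,\mathrm{d}r}{1 - 2m/r} \tag{10.77}$$
and the line element of the Schwarzschild metric transforms to
$$\mathrm{d}s^2 = -\left(1 - \frac{2m}{r}\right)c^2\mathrm{d}t_\mathrm{ret}^2 - 2c\,\mathrm{d}t_\mathrm{ret}\,\mathrm{d}r + r^2\mathrm{d}\Omega^2\,. \tag{10.78}$$
For radial light rays, dΩ = 0, this means
$$0 = -\left(1 - \frac{2m}{r}\right)c^2\mathrm{d}t_\mathrm{ret}^2 - 2c\,\mathrm{d}t_\mathrm{ret}\,\mathrm{d}r\,, \tag{10.79}$$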

which is possible for outgoing light rays only if $\mathrm{d}t_\mathrm{ret} = 0$. This shows
that such light rays must propagate along r, or k ∝ ∂r, which is of course
a consequence of our using the retarded time $t_\mathrm{ret}$. We set the amplitude
of k such that k = κ∂r. Since ⟨∂r, ∂r⟩ = 0 in the coordinates of the line
element (10.78), the null condition on k is satisfied for any κ.
For a distant observer at a fixed distance r ≫ 2m, the line element (10.78)
simplifies to
$$\mathrm{d}s^2 \approx -c^2\mathrm{d}t_\mathrm{ret}^2\,, \tag{10.80}$$
which shows that the retarded time $t_\mathrm{ret}$ is also the distant observer’s
proper time.
Expanding now the astronaut’s velocity as
$$v = \dot t_\mathrm{ret}\,\partial_{t_\mathrm{ret}} + \dot r\,\partial_r\,, \tag{10.81}$$
we find
$$\langle k, v\rangle = \kappa\,\langle\partial_r, v\rangle = \kappa\,g_{t_\mathrm{ret}r}\,\dot t_\mathrm{ret} = -\kappa\,\dot t_\mathrm{ret} \tag{10.82}$$
because ⟨∂r, ∂r⟩ = 0 according to the metric with the line element (10.78).
The dots in these equations indicate derivatives with respect to the astronaut’s
proper time.

Far away from the black hole, the metric can be assumed to be Minkowskian.
For a distant observer at rest, the four-velocity is u = ∂t. In
Minkowski coordinates, the wave vector of the light ray must be
$$k = \kappa\left(\partial_t + \partial_r\right), \tag{10.83}$$
which is required by $c\,\mathrm{d}t_\mathrm{ret}(k) = (c\,\mathrm{d}t - \mathrm{d}r)(k) = 0$, valid for r ≫ 2m,
together with k = κ∂r. Thus,
$$\langle k, u\rangle = \kappa\,\langle\partial_t + \partial_r, \partial_t\rangle = -\kappa\,. \tag{10.84}$$
This gives the redshift
$$1 + z \approx \dot t_\mathrm{ret} = \dot t - \frac{\dot r/c}{1 - 2m/r}\,. \tag{10.85}$$
1 − 2m/r

When restricted to radial orbits, ϕ̇ = 0 = L, the equation of motion (9.15)
is
$$\dot r^2 + c^2\left(1 - \frac{2m}{r}\right) = E^2\,, \tag{10.86}$$
where E was defined as
$$E = -c\,\dot t\left(1 - \frac{2m}{r}\right), \tag{10.87}$$
see (9.12). To be specific, we set the constant E such that the astronaut
is at rest at infinite radius, E² = c². Requiring that the astronaut’s proper
time increases with the coordinate time, ṫ > 0, implies E < 0, hence we must
set E = −c. Since ṙ < 0 for the infalling astronaut,
$$\dot r = -c\,\sqrt{1 - \delta}\,, \tag{10.88}$$
with δ ≡ 1 − 2m/r.
The redshift (10.85) can now be written
$$1 + z = \frac{1}{\delta}\left(1 + \sqrt{1 - \delta}\right) \approx \frac{2}{\delta} \tag{10.89}$$
to leading order close to the Schwarzschild radius, where δ → 0+. We
have seen in (10.21) that the radial coordinate of the falling astronaut
is well approximated by $r \approx 2m\left(1 + 2e^{2 - ct/2m}\right)$ near the Schwarzschild
radius if the coordinate clock is set to zero at r = 6m. This enables us to
approximate δ by
$$\delta = 1 - \frac{2m}{r} \approx \frac{r - 2m}{2m} = 2e^{2 - ct/2m} \tag{10.90}$$
and the redshift by
$$1 + z \approx e^{ct/2m - 2}\,. \tag{10.91}$$

Redshift approaching the Schwarzschild horizon


Summarizing, this calculation shows that the astronaut’s redshift
$$1 + z \approx \frac{2}{\delta} = e^{ct/2m - 2} \tag{10.92}$$
grows exponentially to infinity as he approaches the Schwarzschild
radius.
This resolves the apparent contradiction that, while the astronaut has
long reached the singularity as measured by his own watch, the distant
observer never even sees him reach the Schwarzschild radius: The signal
of the astronaut’s passing the Schwarzschild radius is infinitely delayed
and thus never reaches the distant observer.
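To get a feeling for the size of the effect, the exact expression (10.89) can be compared with its leading-order form 2/δ and with the exponential law (10.91). This is a minimal sketch with illustrative function names; it assumes the astronaut starts at rest at infinity, as in the derivation above.

```python
import math

def redshift_exact(r, m):
    """1 + z from eq. (10.89) for an astronaut falling from rest at infinity."""
    delta = 1.0 - 2.0 * m / r
    return (1.0 + math.sqrt(1.0 - delta)) / delta

def redshift_near_horizon(ct, m):
    """Leading-order 1 + z as a function of coordinate time, eq. (10.91)."""
    return math.exp(ct / (2.0 * m) - 2.0)

m = 1.0
for r in (2.5, 2.1, 2.01, 2.001):
    delta = 1.0 - 2.0 * m / r
    print(r, redshift_exact(r, m), 2.0 / delta)
```

The two printed columns approach each other as r → 2m, where both diverge.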
Chapter 11

Charged, Rotating Black Holes

11.1 The Reissner-Nordström solution

11.1.1 Energy-momentum tensor of electric charge

The Schwarzschild solution is a very important exact solution of Einstein’s
vacuum equations, but we expect that real objects collapsing
to become black holes may be charged and rotating. We shall now
generalise the Schwarzschild solution into these two directions.
First, we consider a static, axially-symmetric solution of Einstein’s
equations in the presence of an electromagnetic charge q at the origin of
the Schwarzschild coordinates, i.e. at r = 0. The electromagnetic field
will then also be static and axially symmetric.
Expressing the field tensor in the Schwarzschild tetrad (8.40), we thus
expect the Faraday 2-form (5.86) to be
$$F = -\frac{q}{r^2}\,c\,\mathrm{d}t\wedge\mathrm{d}r = -\frac{q}{r^2}\,e^{-a-b}\,\theta^0\wedge\theta^1\,. \tag{11.1}$$
?
Why would the Faraday 2-form be given by (11.1)? Recall the
meaning of the components of the electromagnetic field tensor.
We shall verify below that a = −b also for a Schwarzschild solution with
charge, so that the exponential factor will become unity later.
The electromagnetic energy-momentum tensor
$$T^{\mu\nu} = \frac{1}{4\pi}\left[F^{\mu\lambda}F^{\nu}{}_{\lambda} - \frac{1}{4}\,g^{\mu\nu}F^{\alpha\beta}F_{\alpha\beta}\right] \tag{11.2}$$
is now easily evaluated. Since the only non-vanishing component of
$F_{\mu\nu}$ is $F_{01}$ and the metric is diagonal in the Schwarzschild tetrad, g =
diag(−1, 1, 1, 1), we have
$$F_{\alpha\beta}F^{\alpha\beta} = F_{01}F^{01} + F_{10}F^{10} = -2F_{01}^2\,. \tag{11.3}$$

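Using this, we find the components of the energy-momentum tensor
$$\begin{aligned} T^{00} &= \frac{1}{4\pi}\left[F^{01}F^{0}{}_{1} - \frac{1}{2}F_{01}^2\right] = \frac{1}{8\pi}F_{01}^2 = \frac{q^2}{8\pi r^4}\,e^{-2(a+b)}\,, \\ T^{11} &= \frac{1}{4\pi}\left[F^{10}F^{1}{}_{0} + \frac{1}{2}F_{01}^2\right] = -\frac{q^2}{8\pi r^4}\,e^{-2(a+b)} = -T^{00}\,, \\ T^{22} &= \frac{1}{8\pi}F_{01}^2 = \frac{q^2}{8\pi r^4}\,e^{-2(a+b)} = T^{33}\,. \end{aligned} \tag{11.4}$$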
8π 8πr4

11.1.2 The Reissner-Nordström metric

Inserting these expressions instead of zero into the right-hand side of
Einstein’s field equations yields, with (8.60),
$$\begin{aligned} G_{00} &= \frac{1}{r^2} - e^{-2b}\left(\frac{1}{r^2} - \frac{2b'}{r}\right) = \frac{8\pi G}{c^4}\,T^{00} = \frac{Gq^2}{c^4r^4}\,e^{-2(a+b)}\,, \\ G_{11} &= -\frac{1}{r^2} + e^{-2b}\left(\frac{1}{r^2} + \frac{2a'}{r}\right) = -\frac{8\pi G}{c^4}\,T^{00} = -G_{00}\,. \end{aligned} \tag{11.5}$$
Adding these two equations, we find a′ + b′ = 0, which implies a + b = 0
because the functions have to tend to zero at infinity. This confirms that
we can identify cdt ∧ dr = θ⁰ ∧ θ¹ and write $F_{01} = q/r^2$.
Analogous to (8.62), we note that the first of equations (11.5) with
a = −b is equivalent to
$$\left(r\,e^{-2b}\right)' = 1 - \frac{Gq^2}{c^4r^2}\,, \tag{11.6}$$
which gives
$$e^{-2b} = e^{2a} = 1 - \frac{2m}{r} + \frac{Gq^2}{c^4r^2}\,, \tag{11.7}$$
if we use −2m as the integration constant as for the neutral Schwarzschild
solution.
?
Verify that the term Gq²/c⁴ has the unit of a squared length in
Gaussian cgs units. Recall that the unit of the electric charge in
the cgs system is g^{1/2} cm^{3/2} s^{−1}.
Reissner-Nordström solution
Defining
$$\Delta \equiv r^2 - 2mr + \frac{Gq^2}{c^4}\,, \tag{11.8}$$
we thus obtain the line element for the metric of a charged Schwarzschild
black hole,
$$\mathrm{d}s^2 = -\frac{\Delta}{r^2}\,c^2\mathrm{d}t^2 + \frac{r^2}{\Delta}\,\mathrm{d}r^2 + r^2\left(\mathrm{d}\vartheta^2 + \sin^2\vartheta\,\mathrm{d}\varphi^2\right). \tag{11.9}$$
This is the Reissner-Nordström solution.
Of course, for q = 0, the Reissner-Nordström solution returns to the
Schwarzschild solution.
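That (11.7) indeed solves (11.6) can be confirmed with a finite-difference derivative. In the sketch below, `Q2` abbreviates the squared length Gq²/c⁴; this shorthand, the function names and the step size `h` are assumptions made for the example only.

```python
def exp_minus_2b(r, m, Q2):
    """e^{-2b} = e^{2a} from eq. (11.7), with Q2 standing for G q^2 / c^4."""
    return 1.0 - 2.0 * m / r + Q2 / r**2

def delta_rn(r, m, Q2):
    """Delta from eq. (11.8)."""
    return r * r - 2.0 * m * r + Q2

def lhs_11_6(r, m, Q2, h=1e-5):
    """Central-difference estimate of (r e^{-2b})', the left-hand side of (11.6)."""
    g = lambda x: x * exp_minus_2b(x, m, Q2)
    return (g(r + h) - g(r - h)) / (2.0 * h)

r, m, Q2 = 3.0, 1.0, 0.25
print(lhs_11_6(r, m, Q2), 1.0 - Q2 / r**2)           # both sides of (11.6)
print(delta_rn(r, m, Q2) / r**2, exp_minus_2b(r, m, Q2))  # Delta / r^2 = e^{-2b}
```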
Figure 11.1 Hans J. Reissner (right; 1874–1967), German engineer, mathematician
and physicist. Source: Wikipedia

Before we proceed, we should verify that Maxwell’s equations are indeed
satisfied. First, we note that the Faraday 2-form (11.1) is exact because
it is the exterior derivative of the 1-form
$$A = -\frac{q}{r}\,c\,\mathrm{d}t\,,\quad \mathrm{d}A = \frac{q}{r^2}\,\mathrm{d}r\wedge c\,\mathrm{d}t = -\frac{q}{r^2}\,c\,\mathrm{d}t\wedge\mathrm{d}r = -\frac{q}{r^2}\,\theta^0\wedge\theta^1\,. \tag{11.10}$$
Thus, since d ◦ d = 0, dF = d²A = 0, so that the homogeneous Maxwell
equations are satisfied.
Moreover, we notice that
$$*F = \frac{q}{r^2}\,\theta^2\wedge\theta^3\,, \tag{11.11}$$
which is easily verified using (5.75),
$$*\left(\theta^0\wedge\theta^1\right) = \frac{1}{2}\,g^{00}g^{11}\,\varepsilon_{01\alpha\beta}\,\theta^\alpha\wedge\theta^\beta = -\frac{1}{2}\left(\theta^2\wedge\theta^3 - \theta^3\wedge\theta^2\right) = -\theta^2\wedge\theta^3\,. \tag{11.12}$$
Inserting the Schwarzschild tetrad from (8.40) yields
$$*F = q\,\sin\vartheta\,\mathrm{d}\vartheta\wedge\mathrm{d}\varphi = -\mathrm{d}\left(q\cos\vartheta\,\mathrm{d}\varphi\right), \tag{11.13}$$
which shows, again by d ◦ d = 0, that d(∗F) = 0, hence also (∗d∗)F = 0
and δF = 0, so that also the inhomogeneous Maxwell equations (in
vacuum!) are satisfied.

Figure 11.2 Gunnar Nordström (1881–1923), Finnish physicist. Source:
Wikipedia

11.2 The Kerr-Newman solution

11.2.1 The Kerr-Newman metric

The formal derivation of the metric of a rotating black hole is a formidable
task which we cannot possibly demonstrate during this lecture. We thus
start with general remarks on the expected form of the metric and then
immediately quote the metric coefficients without deriving them.
In presence of angular momentum, we expect the spherical symmetry
of the Schwarzschild solution to be broken. Instead, we expect that
the solution must be axisymmetric, with the axis fixed by the angular
momentum. Moreover, we seek to find a stationary solution.
Then, the group R × SO(2) must be an isometry of the metric, where
R represents the stationarity and SO(2) the (two-dimensional) rotations
about the symmetry axis. Expressing these symmetries, there must be a
time-like Killing vector field k and another Killing vector field m which
is tangential to the orbits of SO(2).
These two Killing vector fields span the tangent spaces of the two-
dimensional submanifolds which are the orbits of R × SO(2), i.e. cylin-
ders.

We can choose adapted coordinates t and ϕ such that k = ∂t and m = ∂ϕ.
Then, the metric ⁽⁴⁾g of four-dimensional spacetime can be decomposed
as
$${}^{(4)}g = g_{ab}(x^i)\,\mathrm{d}x^a\otimes\mathrm{d}x^b + g_{ij}(x^k)\,\mathrm{d}x^i\otimes\mathrm{d}x^j\,, \tag{11.14}$$
where indices a, b = 0, 1 indicate the coordinates on the orbits of
R × SO(2), and indices i, j, k = 2, 3 the others. Note that, due to the
symmetry imposed, the remaining metric coefficients can only depend
on the coordinates $x^i$.
A stationary, axi-symmetric spacetime (M, g) can thus be foliated into
M = Σ × Γ, where Σ is diffeomorphic to the orbits of R × SO(2), and
the metric coefficients in adapted coordinates can only depend on the
coordinates of Γ. We write
$${}^{(4)}g = \sigma + g \tag{11.15}$$
and have
$$\sigma = \sigma_{ab}(x^i)\,\mathrm{d}x^a\otimes\mathrm{d}x^b\,. \tag{11.16}$$

The coefficients $\sigma_{ab}$ are scalar products of the two Killing vector fields k
and m,
$$\left(\sigma_{ab}\right) = \begin{pmatrix} -\langle k, k\rangle & \langle k, m\rangle \\ \langle k, m\rangle & \langle m, m\rangle \end{pmatrix}, \tag{11.17}$$
and we abbreviate the determinant of σ by
$$\rho \equiv \sqrt{-\det\sigma} = \sqrt{\langle k, k\rangle\langle m, m\rangle + \langle k, m\rangle^2}\,. \tag{11.18}$$
Without proof, we now give the metric of a stationary, axially-symmetric
solution of Einstein’s field equations for either vacuum or an electromagnetic
field. We first define the auxiliary quantities
$$\Delta := r^2 - 2mr + Q^2 + a^2\,,\quad \rho^2 := r^2 + a^2\cos^2\vartheta\,,\quad \Sigma^2 := \left(r^2 + a^2\right)^2 - a^2\Delta\sin^2\vartheta\,. \tag{11.19}$$
Moreover, we need appropriately scaled expressions Q and a for the
charge q and the angular momentum L of the black hole, which are given
by
$$Q^2 := \frac{Gq^2}{c^4}\,,\quad a := \frac{L}{Mc} = \frac{GL}{mc^3} \tag{11.20}$$
and both Q and a have the dimension of a length.
?
Verify that a also has the dimension of a length, like Q.

Kerr-Newman solution
With these definitions, we can write the coefficients of the metric for a
charged, rotating black hole in the form
$$\begin{aligned} g_{tt} &= -1 + \frac{2mr - Q^2}{\rho^2} = \frac{a^2\sin^2\vartheta - \Delta}{\rho^2}\,, \\ g_{t\varphi} &= -\frac{2mr - Q^2}{\rho^2}\,a\sin^2\vartheta = -\frac{r^2 + a^2 - \Delta}{\rho^2}\,a\sin^2\vartheta\,, \\ g_{rr} &= \frac{\rho^2}{\Delta}\,,\quad g_{\vartheta\vartheta} = \rho^2\,,\quad g_{\varphi\varphi} = \frac{\Sigma^2}{\rho^2}\,\sin^2\vartheta\,. \end{aligned} \tag{11.21}$$

Evidently, for a = 0 = Q, ρ = r, Δ = r² − 2mr and Σ = r² and we
return to the Schwarzschild solution (8.67). For a = 0, we still have
ρ = r and Σ = r², but Δ = r² − 2mr + Q² as in (11.8), and we return to
the Reissner-Nordström solution (11.9). For Q = 0, we obtain the Kerr
solution for a rotating, uncharged black hole, and for a ≠ 0 and Q ≠ 0,
the solution is called Kerr-Newman solution, named after Roy Kerr and
Ezra Newman.
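The limits described above can be checked by evaluating the coefficients (11.21) numerically. The following sketch returns them in a dictionary; the function name, the dictionary keys and the sample values are illustrative assumptions, and all lengths are in the same (arbitrary) unit.

```python
import math

def kerr_newman_metric(r, theta, m, a, Q):
    """Metric coefficients of eq. (11.21) with the auxiliary quantities (11.19)."""
    s2 = math.sin(theta) ** 2
    delta = r * r - 2.0 * m * r + Q * Q + a * a
    rho2 = r * r + a * a * math.cos(theta) ** 2
    sigma2 = (r * r + a * a) ** 2 - a * a * delta * s2
    return {
        "tt": -1.0 + (2.0 * m * r - Q * Q) / rho2,
        "tphi": -(2.0 * m * r - Q * Q) / rho2 * a * s2,
        "rr": rho2 / delta,
        "thth": rho2,
        "phiphi": sigma2 / rho2 * s2,
    }

# Schwarzschild limit a = 0 = Q at r = 5 m: g_tt = -(1 - 2m/r), g_rr = 1/(1 - 2m/r)
g = kerr_newman_metric(5.0, 1.0, 1.0, 0.0, 0.0)
print(g["tt"], g["rr"], g["tphi"])
```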

Figure 11.3 Roy Kerr (born 1934), New Zealand mathematician. Source:
Wikimedia Commons

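Also without derivation, we quote that the vector potential of the rotating,
charged black hole is given by the 1-form
$$A = -\frac{qr}{\rho^2}\left(c\,\mathrm{d}t - a\sin^2\vartheta\,\mathrm{d}\varphi\right), \tag{11.22}$$
from which we obtain the Faraday 2-form
$$F = \mathrm{d}A = \frac{q}{\rho^4}\left(r^2 - a^2\cos^2\vartheta\right)\mathrm{d}r\wedge\left(c\,\mathrm{d}t - a\sin^2\vartheta\,\mathrm{d}\varphi\right) + \frac{2qra}{\rho^4}\,\sin\vartheta\cos\vartheta\,\mathrm{d}\vartheta\wedge\left[\left(r^2 + a^2\right)\mathrm{d}\varphi - a\,c\,\mathrm{d}t\right]. \tag{11.23}$$
?
Why is it plausible for A to have the form (11.22)? Beginning
there, derive the Faraday 2-form (11.23) yourself.
For a = 0, this trivially returns to the field (11.1) for the Reissner-
Nordström solution. Sufficiently far away from the black hole, such that
a ≪ r, we can approximate to first order in a/r and write
$$F = \frac{q}{r^2}\,\mathrm{d}r\wedge\left(c\,\mathrm{d}t - a\sin^2\vartheta\,\mathrm{d}\varphi\right) + \frac{2qa}{r}\,\sin\vartheta\cos\vartheta\,\mathrm{d}\vartheta\wedge\mathrm{d}\varphi\,. \tag{11.24}$$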

The field components far away from the black hole can now be read off
the result (11.24). Using the orthonormal basis
$$e_t = \partial_{ct}\,,\quad e_r = \partial_r\,,\quad e_\vartheta = \frac{1}{r}\,\partial_\vartheta\,,\quad e_\varphi = \frac{1}{r\sin\vartheta}\,\partial_\varphi\,, \tag{11.25}$$
we find in particular for the radial component $B_r$ of the magnetic field
$$B_r = F(e_\vartheta, e_\varphi) = \frac{2qa}{r^3}\,\cos\vartheta\,. \tag{11.26}$$
In the limit of large r, the electric field thus becomes that of a point
charge q at the origin, and the magnetic field attains a characteristic
dipolar structure.
?
Find the remaining components of the electromagnetic field of the
Kerr-Newman solution.
The Biot-Savart law of electrodynamics implies that a charge q with
mass M on a circular orbit with angular momentum $\vec L$ has the magnetic
dipole moment
$$\vec\mu = g\,\frac{q\vec L}{2Mc}\,, \tag{11.27}$$
where g is the gyromagnetic moment.
A magnetic dipole moment $\vec\mu$ creates the dipole field
$$\vec B = \frac{3\left(\vec\mu\cdot\vec e_r\right)\vec e_r - \vec\mu}{r^3}\,, \tag{11.28}$$
whose radial component is $B_r = \vec B\cdot\vec e_r = 2\vec\mu\cdot\vec e_r/r^3$. A comparison of
the radial magnetic field from (11.26) with this expression reveals the
following interesting result:
Magnetic dipole moment of a charged, rotating black hole
The magnetic dipole moment of a charged, rotating black hole is
$$\vec\mu = q\vec a = \frac{q\vec L}{Mc} = 2\,\frac{q\vec L}{2Mc}\,, \tag{11.29}$$
showing that charged, rotating black holes have a gyromagnetic moment
of g = 2.
?
What is the gyromagnetic moment of an electron?

11.2.2 Schwarzschild horizon, ergosphere and Killing horizon

By construction, the Kerr-Newman metric (11.21) has the two Killing
vector fields k = ∂t, expressing the stationarity of the solution, and
m = ∂ϕ, which expresses its axial symmetry.
Since the metric coefficients in adapted coordinates satisfy
$$g_{tt} = \langle k, k\rangle\,,\quad g_{\varphi\varphi} = \langle m, m\rangle\,,\quad g_{t\varphi} = \langle k, m\rangle\,, \tag{11.30}$$
they have an invariant meaning which will now be clarified.


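Let us consider an observer moving with r = const. and ϑ = const. with
uniform angular velocity ω. If her four-velocity is u, then
$$\omega = \frac{\mathrm{d}\varphi}{\mathrm{d}t} = \frac{\dot\varphi}{\dot t} = \frac{u^\varphi}{u^t} \tag{11.31}$$
for a static observer at infinity, whose proper time can be identified with
the coordinate time t. Correspondingly, we can expand the four-velocity
as
$$u = u^t\,\partial_t + u^\varphi\,\partial_\varphi = u^t\left(\partial_t + \omega\,\partial_\varphi\right) = u^t\left(k + \omega m\right), \tag{11.32}$$
inserting the Killing vector fields. Let
$$|k + \omega m| \equiv \left(-\langle k + \omega m, k + \omega m\rangle\right)^{1/2} \tag{11.33}$$
define the norm of k + ωm, then the four-velocity is
$$u = \frac{k + \omega m}{|k + \omega m|}\,. \tag{11.34}$$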

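Obviously, k + ωm is a time-like Killing vector field, at least at sufficiently
large distances from the black hole. Since then
$$\langle k + \omega m, k + \omega m\rangle = \langle k, k\rangle + \omega^2\langle m, m\rangle + 2\omega\langle k, m\rangle = g_{tt} + \omega^2 g_{\varphi\varphi} + 2\omega g_{t\varphi} < 0\,, \tag{11.35}$$
k + ωm becomes light-like for angular velocities
$$\omega_\pm = \frac{-g_{t\varphi} \pm \sqrt{g_{t\varphi}^2 - g_{tt}\,g_{\varphi\varphi}}}{g_{\varphi\varphi}}\,. \tag{11.36}$$
If we define
$$\Omega \equiv -\frac{g_{t\varphi}}{g_{\varphi\varphi}} = -\frac{\langle k, m\rangle}{\langle m, m\rangle}\,, \tag{11.37}$$
we can write (11.36) as
$$\omega_\pm = \Omega \pm \sqrt{\Omega^2 - \frac{g_{tt}}{g_{\varphi\varphi}}}\,. \tag{11.38}$$
For an interpretation of Ω, we note that freely-falling test particles on
radial orbits have zero angular momentum and thus ⟨u, m⟩ = 0. By
(11.34), this implies
$$0 = \langle k + \omega m, m\rangle = g_{t\varphi} + \omega\,g_{\varphi\varphi} \tag{11.39}$$
and thus
$$\omega = -\frac{g_{t\varphi}}{g_{\varphi\varphi}} = \Omega \tag{11.40}$$
according to the definition (11.37). This shows that Ω is the angular
velocity of a test particle falling freely towards the black hole on a radial
orbit.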
The minimum angular velocity ω− from (11.38) vanishes if and only if
$g_{tt} = \langle k, k\rangle = 0$, i.e. if the Killing vector field k turns light-like. With
(11.21), this is so where
$$0 = a^2\sin^2\vartheta - \Delta = 2mr - r^2 - Q^2 - a^2\cos^2\vartheta\,, \tag{11.41}$$
i.e. at the radius
$$r_0 = m + \sqrt{m^2 - Q^2 - a^2\cos^2\vartheta}\,. \tag{11.42}$$

Static limit in Kerr spacetime


The radius r0 marks the static limit of Kerr spacetime: for an observer at
this radius to remain static with respect to observers at infinity (i.e. with
respect to the “fixed stars”), she would have to move with the speed of
light. At smaller radii, observers cannot remain static against the drag
of the rotating black hole.
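The static limit (11.42) is easy to evaluate as a function of the polar angle. The sketch below checks that the metric coefficient g_tt from (11.21) vanishes there; the function names and the sample parameters (m = 1, a = 0.75, Q = 0) are illustrative assumptions.

```python
import math

def static_limit_radius(theta, m, a, Q=0.0):
    """Radius r_0 of the static limit, eq. (11.42)."""
    return m + math.sqrt(m * m - Q * Q - (a * math.cos(theta)) ** 2)

def g_tt(r, theta, m, a, Q=0.0):
    """Metric coefficient g_tt from eq. (11.21)."""
    rho2 = r * r + (a * math.cos(theta)) ** 2
    return -1.0 + (2.0 * m * r - Q * Q) / rho2

m, a = 1.0, 0.75
for theta in (0.0, math.pi / 3.0, math.pi / 2.0):
    r0 = static_limit_radius(theta, m, a)
    print(theta, r0, g_tt(r0, theta, m, a))  # g_tt should vanish at r0
```

For Q = 0 the static limit reaches r₀ = 2m in the equatorial plane and is smallest on the rotation axis.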
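We have seen in (4.48) that the light emitted by a source with four-velocity
$u_\mathrm{s}$ is seen by an observer with four-velocity $u_\mathrm{o}$ with a redshift
$$\frac{\nu_\mathrm{o}}{\nu_\mathrm{s}} = \frac{\langle\tilde k, u_\mathrm{o}\rangle}{\langle\tilde k, u_\mathrm{s}\rangle}\,, \tag{11.43}$$
where k̃ is the wave vector of the light.
Observers at rest in a stationary spacetime have four-velocities proportional
to the Killing vector field k,
$$u = \frac{k}{\sqrt{-\langle k, k\rangle}}\,,\quad\text{hence}\quad k = \sqrt{-\langle k, k\rangle}\;u\,. \tag{11.44}$$
We have seen in (5.36) that the projection of a Killing vector K on a
geodesic γ is constant along that geodesic, $\nabla_{\dot\gamma}\langle\dot\gamma, K\rangle = 0$. The light ray
propagating from the source to the observer is a null geodesic with γ̇ = k̃,
hence
$$\nabla_{\tilde k}\langle\tilde k, k\rangle = 0 \tag{11.45}$$
and $\langle\tilde k, k\rangle_\mathrm{s} = \langle\tilde k, k\rangle_\mathrm{o}$. Using this in a combination of (11.43) and (11.44),
we obtain
$$\frac{\nu_\mathrm{o}}{\nu_\mathrm{s}} = \frac{\langle\tilde k, k\rangle_\mathrm{o}}{\langle\tilde k, k\rangle_\mathrm{s}}\,\frac{\sqrt{-\langle k, k\rangle_\mathrm{s}}}{\sqrt{-\langle k, k\rangle_\mathrm{o}}} = \frac{\sqrt{-\langle k, k\rangle_\mathrm{s}}}{\sqrt{-\langle k, k\rangle_\mathrm{o}}}\,. \tag{11.46}$$
For an observer at rest far away from the black hole, $\langle k, k\rangle_\mathrm{o} \approx -1$, and
the redshift becomes
$$1 + z = \frac{\nu_\mathrm{s}}{\nu_\mathrm{o}} \approx \frac{1}{\sqrt{-\langle k, k\rangle_\mathrm{s}}} = \left(-g_{tt}\right)^{-1/2}\,, \tag{11.47}$$
which tends to infinity as the source approaches the static limit.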
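The minimum and maximum angular velocities ω± from (11.38) both
become equal to Ω for
$$\Omega^2 = \left(\frac{g_{t\varphi}}{g_{\varphi\varphi}}\right)^2 = \frac{g_{tt}}{g_{\varphi\varphi}} \quad\Rightarrow\quad g_{t\varphi}^2 - g_{tt}\,g_{\varphi\varphi} = 0\,. \tag{11.48}$$
This equation means that the Killing field ξ ≡ k + Ωm turns light-like,
$$\begin{aligned} \langle\xi, \xi\rangle &= \langle k, k\rangle + 2\Omega\langle k, m\rangle + \Omega^2\langle m, m\rangle \\ &= g_{tt} + 2\Omega\,g_{t\varphi} + \Omega^2 g_{\varphi\varphi} = g_{tt} - 2\,\frac{g_{t\varphi}^2}{g_{\varphi\varphi}} + \frac{g_{t\varphi}^2}{g_{\varphi\varphi}} \\ &= \frac{g_{tt}\,g_{\varphi\varphi} - g_{t\varphi}^2}{g_{\varphi\varphi}} = 0\,. \end{aligned} \tag{11.49}$$
Interestingly, writing the expression from (11.48) with the metric coefficients
(11.21) leads to the simple result
$$g_{t\varphi}^2 - g_{tt}\,g_{\varphi\varphi} = \Delta\sin^2\vartheta\,, \tag{11.50}$$
so that the condition (11.48) is equivalent to
$$0 = \Delta = r^2 - 2mr + Q^2 + a^2\,, \tag{11.51}$$
which describes a spherical hypersurface with radius
$$r_H = m + \sqrt{m^2 - Q^2 - a^2}\,, \tag{11.52}$$
for which we choose the larger of the two solutions of (11.51).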


By its definition (11.37), the angular velocity Ω on this hypersurface H
can be written as
$$\Omega_H = -\left.\frac{g_{t\varphi}}{g_{\varphi\varphi}}\right|_H = \left.\frac{a\left(2mr - Q^2\right)}{\Sigma^2}\right|_H = \frac{a\left(2mr_H - Q^2\right)}{\left(r_H^2 + a^2\right)^2}\,, \tag{11.53}$$
since Σ² = (r² + a²)² because of Δ = 0 at $r_H$. Because of (11.51), the
numerator is $a\left(r_H^2 + a^2\right)$, and we find the following remarkable result:
Angular frequency of H
The hypersurface H is rotating with the constant angular velocity
$$\Omega_H = \frac{a}{r_H^2 + a^2}\,, \tag{11.54}$$
like a solid body.
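The step from (11.53) to (11.54) relies on Δ = 0 at the horizon radius (11.52). A short numerical sketch makes this explicit; the function names and the sample values m = 1, a = 0.75, Q = 0.3 are illustrative assumptions.

```python
import math

def horizon_radius(m, a, Q=0.0):
    """Larger root r_H of Delta = 0, eq. (11.52)."""
    return m + math.sqrt(m * m - Q * Q - a * a)

def omega_from_metric(r, m, a, Q=0.0):
    """Right-hand side of eq. (11.53), valid where Sigma^2 = (r^2 + a^2)^2."""
    return a * (2.0 * m * r - Q * Q) / (r * r + a * a) ** 2

def omega_horizon(m, a, Q=0.0):
    """Angular velocity of the horizon, eq. (11.54)."""
    r_h = horizon_radius(m, a, Q)
    return a / (r_h * r_h + a * a)

m, a, Q = 1.0, 0.75, 0.3
r_h = horizon_radius(m, a, Q)
print(r_h, omega_from_metric(r_h, m, a, Q), omega_horizon(m, a, Q))
```

Both expressions agree at r_H, and neither depends on ϑ, which is the sense in which the horizon rotates like a solid body.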


Since the hypersurface H is defined by the condition Δ = 0, its normal
vectors are given by
$$\mathrm{grad}\,\Delta = \mathrm{d}\Delta^\sharp\,,\quad \mathrm{d}\Delta = 2(r - m)\,\mathrm{d}r\,. \tag{11.55}$$
Thus, the norm of the normal vectors is
$$\langle\mathrm{grad}\,\Delta, \mathrm{grad}\,\Delta\rangle = 4\,g^{rr}\,(r - m)^2\,. \tag{11.56}$$
Now, according to (11.21), $g^{rr} \propto \Delta = 0$ on the hypersurface, showing
that H is a null hypersurface. Because of this fact, the tangent space to
the null hypersurface H at any of its points is orthogonal to a null vector,
and hence it does not contain time-like vectors.
Killing horizon and ergosphere
The surface H is called a Killing horizon. The hypersurface defined
by the static limit is time-like, which means that it can be crossed in
both directions, in contrast to the horizon H. The region in between
the static limit and the Killing horizon is the ergosphere, in which k is
space-like and no observer can be prevented from following the rotation
of the black hole.

[Figure 11.4: cross-section with horizontal axis x/m and vertical axis z/m; the curves are labelled ergosphere, static limit and horizon.]
Figure 11.4 Static limit, horizon, and ergosphere for a Kerr black hole with
a = 0.75.

Formally, the Kerr solution is singular where Δ = 0, but this singularity
can be lifted by a transformation to coordinates similar to the Eddington-
Finkelstein coordinates for a Schwarzschild black hole.

11.3 Motion near a Kerr black hole

11.3.1 Kepler’s third law

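We shall now assume q = 0 and consider motion on a circular orbit in
the equatorial plane. Thus ṙ = 0 and ϑ = π/2, and
$$\Delta = r^2 - 2mr + a^2 \quad\text{and}\quad \rho = r\,, \tag{11.57}$$
further
$$\Sigma^2 = \left(r^2 + a^2\right)^2 - a^2\Delta = r^4 + a^2r^2 + 2ma^2r\,, \tag{11.58}$$
and the coefficients of the metric (11.21) become
$$\begin{aligned} g_{tt} &= -1 + \frac{2m}{r}\,,\quad g_{t\varphi} = -\frac{2ma}{r}\,, \\ g_{rr} &= \frac{r^2}{\Delta}\,,\quad g_{\vartheta\vartheta} = r^2\,, \\ g_{\varphi\varphi} &= \frac{\Sigma^2}{r^2} = r^2 + a^2 + \frac{2ma^2}{r}\,. \end{aligned} \tag{11.59}$$
?
If a particle starts orbiting in the equatorial plane, does it stay
there?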

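Since ϑ̇ = 0 and ṙ = 0, the Lagrangian reduces to
$$2\mathcal{L} = -\left(1 - \frac{2m}{r}\right)c^2\dot t^2 - \frac{4mac}{r}\,\dot t\dot\varphi + \left(r^2 + a^2 + \frac{2ma^2}{r}\right)\dot\varphi^2\,. \tag{11.60}$$
By the Euler-Lagrange equation for r and due to ṙ = 0, we have
$$\frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial\mathcal{L}}{\partial\dot r} = 0 = \frac{\partial\mathcal{L}}{\partial r}\,, \tag{11.61}$$
which yields, after multiplying with r²/ṫ²,
$$-mc^2 + 2mac\,\omega + \left(r^3 - ma^2\right)\omega^2 = 0\,, \tag{11.62}$$
where we have introduced the angular frequency ω according to (11.31).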

Kepler’s third law
Noticing that
$$r^3 - ma^2 = \left(r^{3/2} - m^{1/2}a\right)\left(r^{3/2} + m^{1/2}a\right), \tag{11.63}$$
we can write the solutions as
$$\omega_\pm = \pm\frac{c\,m^{1/2}}{r^{3/2} \pm m^{1/2}a}\,. \tag{11.64}$$
This is Kepler’s third law for a Kerr black hole: The angular velocity
of a test particle depends on whether it is co-rotating with or counter-
rotating against the black hole.
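The two branches of (11.64) can be inserted back into the quadratic condition (11.62) as a check. The sketch below sets c = 1 by default; the function names and the keyword `corotating` are illustrative assumptions.

```python
import math

def kepler_omega(r, m, a, c=1.0, corotating=True):
    """Angular velocity of a circular equatorial orbit, eq. (11.64)."""
    sign = 1.0 if corotating else -1.0
    return sign * c * math.sqrt(m) / (r**1.5 + sign * math.sqrt(m) * a)

def orbit_condition(omega, r, m, a, c=1.0):
    """Left-hand side of eq. (11.62); zero for a circular orbit."""
    return -m * c * c + 2.0 * m * a * c * omega + (r**3 - m * a * a) * omega**2

r, m, a = 6.0, 1.0, 0.5
for corotating in (True, False):
    w = kepler_omega(r, m, a, corotating=corotating)
    print(corotating, w, orbit_condition(w, r, m, a))
```

For a = 0 both branches reduce to ω² = mc²/r³, the familiar Newtonian form of Kepler's third law with m = GM/c².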
[Figure 11.5: two panels showing orbits in the equatorial plane, with lengths in units of r/m.]
Figure 11.5 Trajectories of test particles in the equatorial plane of the Kerr
metric. All orbits begin at r = 10m and ϕ = 0. Top: Orbits with angular
momentum L = 0 for a = 0.5 and a = 0.9. Bottom: orbits with angular
momenta L = ±2 for a = 0.99.

11.3.2 Accretion flow onto a Kerr black hole

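We now consider a stationary, axially-symmetric flow of a perfect fluid
onto a Kerr black hole. Because of the symmetry constraints, the Lie
derivatives of all physical quantities in the direction of the Killing vector
fields k = ∂t and m = ∂ϕ need to vanish.
As in (11.32), the four-velocity of the flow is
$$u = u^t\left(k + \omega m\right). \tag{11.65}$$
We introduce
$$e \equiv -\langle u, k\rangle = -u_t\,,\quad j \equiv \langle u, m\rangle = u_\varphi\,,\quad l \equiv \frac{j}{e} = -\frac{u_\varphi}{u_t} \tag{11.66}$$
and use
$$\begin{aligned} u_\varphi &= g_{t\varphi}u^t + g_{\varphi\varphi}u^\varphi = u^t\left(g_{t\varphi} + g_{\varphi\varphi}\,\omega\right), \\ u_t &= g_{tt}u^t + g_{t\varphi}u^\varphi = u^t\left(g_{tt} + g_{t\varphi}\,\omega\right) \end{aligned} \tag{11.67}$$
to see that
$$l = -\frac{g_{t\varphi} + \omega\,g_{\varphi\varphi}}{g_{tt} + \omega\,g_{t\varphi}} \quad\Leftrightarrow\quad \omega = -\frac{g_{t\varphi} + l\,g_{tt}}{g_{\varphi\varphi} + l\,g_{t\varphi}}\,. \tag{11.68}$$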
Moreover, by the definition of l in (11.66) and ω in (11.31), and using
$\langle u, u\rangle = u_t u^t + u_\varphi u^\varphi = -c^2$, we see that
\[ \omega l = -\frac{u^\varphi u_\varphi}{u^t u_t} \quad\Rightarrow\quad u_t u^t = \frac{c^2}{\omega l - 1} . \tag{11.69} \]
Finally, using (11.67) and (11.69), we have
\[ -u_t^2 = -u_t u^t\left(g_{tt} + g_{t\varphi}\,\omega\right) = c^2\,\frac{g_{tt} + g_{t\varphi}\,\omega}{1 - \omega l} . \tag{11.70} \]
If we substitute ω from (11.68) here, we obtain after a short calculation
\[ e^2 = u_t^2 = c^2\,\frac{g_{t\varphi}^2 - g_{tt}g_{\varphi\varphi}}{g_{\varphi\varphi} + 2l g_{t\varphi} + l^2 g_{tt}} . \tag{11.71} \]

? Repeat the calculations leading to (11.76) and (11.78) in components. Why can (11.77) be called a perpendicular projector? Can you confirm the non-relativistic limit (11.79)?

It is shown in the In-depth box “Ideal hydrodynamics in general relativity”
on page 175 that the relativistic Euler equation reads
\[ \left(\rho c^2 + p\right)\nabla_u u = -c^2\,dp^\sharp - u(p)\,u , \tag{11.80} \]

where ρc2 and p are the density and the pressure of the ideal fluid.
Applying this equation to the present case of a stationary flow, we first
observe that
0 = Lu p = u(p) , (11.81)
thus the second term on the right-hand side of (11.80) vanishes.
Next, we introduce the dual vector $u^\flat$ belonging to the four-velocity u.
In components, $(u^\flat)_\mu = g_{\mu\nu}u^\nu = u_\mu$. Then, from (5.32),
\[ \left(L_u u^\flat\right)_\mu = u^\nu\partial_\nu u_\mu + u_\nu\partial_\mu u^\nu = u^\nu\nabla_\nu u_\mu + u_\nu\nabla_\mu u^\nu , \tag{11.82} \]
where we have employed the symmetry of the connection ∇. The last term
vanishes because $2u_\nu\nabla_\mu u^\nu = \partial_\mu\langle u, u\rangle = 0$. This shows that
\[ L_u u^\flat = \nabla_u u^\flat . \tag{11.83} \]

Now, we introduce $f \equiv 1/u^t$ and compute $L_{fu}u^\flat$ in two different ways.
First, a straightforward calculation beginning with (5.24) shows that
\[ L_{fx}w = f L_x w + w(x)\,df . \tag{11.84} \]
Specialising this result to x = u and $w = u^\flat$ gives
\[ L_{fu}u^\flat = f L_u u^\flat - c^2 df = f\nabla_u u^\flat - c^2 df , \tag{11.85} \]
making use of (11.83) in the last step.

? Can you confirm (11.84)?

On the other hand, $fu = u/u^t = k + \omega m$ because of (11.65), which allows
us to write
\[ L_{fu}u^\flat = \underbrace{L_k u^\flat}_{=0} + L_{\omega m}u^\flat . \tag{11.86} \]

In depth: Ideal hydrodynamics in general relativity

The relativistic continuity and Euler equations
Relativistic hydrodynamics begins with the vanishing divergence of the
energy-momentum tensor, ∇ · T = 0, demanded by Einstein’s equations.
Specialising the energy-momentum tensor to that of an ideal fluid with
energy density ρc², pressure p and four-velocity u,
\[ T = \left(\rho + \frac{p}{c^2}\right) u\otimes u + p\,g^{-1} , \tag{11.72} \]
we first find
\[ 0 = \left[u\!\left(\rho + \frac{p}{c^2}\right) + \left(\rho + \frac{p}{c^2}\right)\nabla\cdot u\right] u + \left(\rho + \frac{p}{c^2}\right)\nabla_u u + dp^\sharp . \tag{11.73} \]
The first terms in brackets are proportional to the four-velocity u.
Projecting ∇ · T on u, and taking $\langle u, u\rangle = -c^2$ into account, leads to
\[ 0 = -\left[u\!\left(\rho c^2 + p\right) + \left(\rho c^2 + p\right)\nabla\cdot u\right] + \left(\rho + \frac{p}{c^2}\right)\langle u, \nabla_u u\rangle + u(p) . \tag{11.74} \]
Now, since the connection is metric,
\[ \nabla_u\langle u, u\rangle = 0 = 2\langle\nabla_u u, u\rangle , \tag{11.75} \]
and (11.74) turns into the relativistic continuity equation
\[ u\!\left(\rho c^2\right) + \left(\rho c^2 + p\right)\nabla\cdot u = 0 . \tag{11.76} \]
If we project (11.73) instead into the three-space perpendicular to u by
applying the perpendicular projector
\[ \pi_\perp := 1_4 + c^{-2}\,u\otimes u^\flat , \tag{11.77} \]
the terms proportional to u drop out by construction. Further using
(11.75) once more, we retain the relativistic Euler equation
\[ \left(\rho c^2 + p\right)\nabla_u u + c^2\,dp^\sharp + u(p)\,u = 0 . \tag{11.78} \]
In the non-relativistic limit, equations (11.76) and (11.78) simplify to
the familiar expressions
\[ \begin{aligned}
\partial_t\rho + \vec\nabla\cdot\left(\rho\vec v\right) &= 0 , \\
\partial_t\vec v + \left(\vec v\cdot\vec\nabla\right)\vec v + \frac{\vec\nabla P}{\rho} &= 0 .
\end{aligned} \tag{11.79} \]

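Applying (11.84) once more, now with f = ω, x = m and $w = u^\flat$, gives
\[ L_{\omega m}u^\flat = \omega L_m u^\flat + u^\flat(m)\,d\omega . \tag{11.87} \]
Since the Lie derivative of $u^\flat$ in the direction m must vanish because of
the axisymmetry, this means
\[ L_{fu}u^\flat = L_{\omega m}u^\flat = u^\flat(m)\,d\omega = \langle u, m\rangle\,d\omega = j\,d\omega . \tag{11.88} \]
Equating this to (11.85) gives
\[ f\nabla_u u^\flat = c^2 df + j\,d\omega . \tag{11.89} \]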

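However, we know from (11.69) that
\[ f = \left(u^t\right)^{-1} = \frac{u_t(\omega l - 1)}{c^2} = \frac{e(1 - \omega l)}{c^2} . \tag{11.90} \]
Inserting this into (11.89) yields
\[ \begin{aligned}
\frac{e(1 - \omega l)}{c^2}\,\nabla_u u^\flat &= (1 - \omega l)\,de - el\,d\omega - e\omega\,dl + j\,d\omega \\
&= (1 - \omega l)\,de - e\omega\,dl ,
\end{aligned} \tag{11.91} \]
where we have used el = j in the final step. Thus,
\[ \nabla_u u^\flat = c^2\left(d\ln e - \frac{\omega\,dl}{1 - \omega l}\right) . \tag{11.92} \]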

Returning with this result to Euler’s equation (11.80), we obtain
\[ \frac{dp}{\rho c^2 + p} = -d\ln e + \frac{\omega\,dl}{1 - \omega l} , \tag{11.93} \]
which shows that surfaces of constant pressure are given by
\[ \ln e - \int\frac{\omega\,dl}{1 - \omega l} = \text{const.} \tag{11.94} \]
Setting dl = 0, i.e. defining a surface of constant l, makes the second
term on the left-hand side vanish. In this case, we find from (11.71)
\[ \frac{g_{\varphi\varphi} + 2l g_{t\varphi} + l^2 g_{tt}}{g_{t\varphi}^2 - g_{tt}g_{\varphi\varphi}} = \frac{c^2}{e^2} = \text{const.} \tag{11.95} \]

Accretion tori
We now insert the metric coefficients (11.21) for the Kerr-Newman
solution to obtain the surfaces of constant pressure and constant l.
Assuming further a = 0, we obtain the isobaric surfaces of the accretion
flow onto a Schwarzschild black hole. With
\[ g_{tt} = -1 + \frac{2m}{r} , \quad g_{\varphi\varphi} = r^2\sin^2\vartheta , \quad g_{t\varphi} = 0 , \tag{11.96} \]
we find
\[ \frac{r}{r - 2m} - \frac{l^2}{r^2\sin^2\vartheta} = \text{const.} \tag{11.97} \]
This describes toroidal surfaces around black holes, the so-called
accretion tori.
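
The step from (11.95) to (11.97) can be checked numerically. The sketch below evaluates the general expression (11.95) with the Schwarzschild coefficients (11.96) and compares it with the closed form (11.97) at a few points. The function names and the sample values for m, l, r and ϑ are illustrative assumptions.

```python
import math

def isobar_general(g_tt, g_tphi, g_phiphi, l):
    """Left-hand side of (11.95) for given metric coefficients."""
    return ((g_phiphi + 2.0 * l * g_tphi + l**2 * g_tt)
            / (g_tphi**2 - g_tt * g_phiphi))

def schwarzschild_coefficients(r, theta, m):
    """Metric coefficients (11.96): returns (g_tt, g_tphi, g_phiphi)."""
    return -1.0 + 2.0 * m / r, 0.0, (r * math.sin(theta))**2

def isobar_schwarzschild(r, theta, m, l):
    """Left-hand side of (11.97)."""
    return r / (r - 2.0 * m) - l**2 / (r * math.sin(theta))**2

# Hypothetical sample values; any r > 2m and 0 < theta < pi would do.
m, l = 1.0, 3.0
points = [(6.0, math.pi / 2.0), (10.0, 1.0), (4.0, 0.7)]
differences = [
    isobar_general(*schwarzschild_coefficients(r, th, m), l)
    - isobar_schwarzschild(r, th, m, l)
    for r, th in points
]
```

Tracing a contour of `isobar_schwarzschild` in the (r, ϑ) plane then gives the cross-section of one isobaric surface.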

[Figure 11.6: three-dimensional plot with axes x/2m, y/2m and z/2m, each ranging from -10 to 10.]

Figure 11.6 Accretion torus around a Schwarzschild black hole. The
constants l and e were set to l = 0.45 and e = 0.95 c here.

11.4 Entropy and temperature of a black hole

It was realised by Stephen Hawking, Roger Penrose and Demetrios
Christodoulou that the area of a possibly charged and rotating black hole,
defined by
\[ A := 4\pi\alpha := 4\pi\left(r_+^2 + a^2\right) , \tag{11.98} \]
cannot shrink. Here, $r_+$ is the positive branch of the two solutions of
(11.51),
\[ r_\pm = m \pm \sqrt{m^2 - Q^2 - a^2} . \tag{11.99} \]

This led Jacob Bekenstein (1973) to the following consideration. If A
cannot shrink, it is reminiscent of the entropy, the only other quantity
known in physics that cannot shrink. Could the area A have anything to do
with an entropy that could be assigned to a black hole? In fact, this is
much more plausible than it may appear at first sight. Suppose radiation
disappears in a black hole. Without accounting for a possible entropy of
the black hole, its entropy would be gone, violating the second law of
thermodynamics. The same holds for gas accreted by the black hole: Its
entropy would be removed from the outside world, leaving the entropy
there lower than before.
If, however, the increased mass of the black hole led to a suitably
increased entropy of the black hole itself, this violation of the second law
could be remedied.

Figure 11.7 Jacob D. Bekenstein (1947–2015), Israeli-US-American
physicist. Source: Wikipedia
Analogy between area and entropy
Any mass and angular momentum swallowed by a black hole leads to
an increase of the area (11.98), which makes it appear plausible that
the area of a black hole might be related to its entropy.
Following Bekenstein (1973), we shall now work out this relation.
Beginning with the scaled area $\alpha = r_+^2 + a^2$ from (11.98), we have
\[ d\alpha = 2\left(r_+ dr_+ + \vec a\cdot d\vec a\right) . \tag{11.100} \]
Inserting $r_+$ from (11.99) and using that
\[ r_+ - r_- =: \delta r = 2\sqrt{m^2 - Q^2 - a^2} , \tag{11.101} \]
we find directly
\[ d\alpha = 2\left[\frac{r_+\delta r + 2r_+ m}{\delta r}\,dm - \frac{2r_+ Q}{\delta r}\,dQ + \left(1 - \frac{2r_+}{\delta r}\right)\vec a\cdot d\vec a\right] . \tag{11.102} \]
The coefficients of dm and da can be further simplified. Noting that
\[ \delta r + 2m = r_+ - r_- + (r_+ + r_-) = 2r_+ \tag{11.103} \]
and
\[ \delta r - 2r_+ = -(r_+ + r_-) = -2m , \tag{11.104} \]
we can bring (11.102) into the form
\[ d\alpha = \frac{4r_+^2}{\delta r}\,dm - \frac{4r_+ Q}{\delta r}\,dQ - \frac{4m}{\delta r}\,\vec a\cdot d\vec a . \tag{11.105} \]
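Now, we need to take into account that the scaled angular momentum a
can change by changing the angular momentum L or the mass m. From
the definition (11.20), we have
\[ d\vec a = \frac{G}{c^3}\left(\frac{d\vec L}{m} - \frac{\vec L}{m^2}\,dm\right) = \frac{G}{c^3}\frac{d\vec L}{m} - \vec a\,\frac{dm}{m} . \tag{11.106} \]
Substituting this expression for da in (11.105), we find
\[ d\alpha = \frac{4\alpha}{\delta r}\,dm - \frac{4r_+ Q}{\delta r}\,dQ - \frac{4G}{c^3}\frac{\vec a\cdot d\vec L}{\delta r} . \tag{11.107} \]

? Confirm equation (11.107) by your own calculation.

Solving equation (11.107) for dm yields
\[ dm = \Theta\,d\alpha + \Phi\,dQ + \vec\Omega\cdot d\vec L \tag{11.108} \]
with the definitions
\[ \Theta := \frac{\delta r}{4\alpha} , \quad \Phi := \frac{r_+ Q}{\alpha} , \quad \vec\Omega := \frac{G}{c^3}\frac{\vec a}{\alpha} . \tag{11.109} \]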
This is reminiscent of the first law of thermodynamics if we tentatively
associate m with the internal energy, α with the entropy and the remaining
terms with external work.
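
The relation (11.108) can be tested numerically by finite differences. The sketch below works in units with G = c = 1, treats the angular momentum as a single component so that a = L/m, and checks that the partial derivatives of α with respect to m, Q and L reproduce the coefficients (11.109). The helper names and the sample state are assumptions made for illustration.

```python
import math

def horizon_quantities(m, Q, L):
    """Return (alpha, r_plus, delta_r, a), assuming G = c = 1 and a = L/m."""
    a = L / m
    root = math.sqrt(m * m - Q * Q - a * a)
    r_plus = m + root               # outer horizon radius, (11.99)
    delta_r = 2.0 * root            # r_plus - r_minus, (11.101)
    alpha = r_plus**2 + a * a       # scaled area, (11.98)
    return alpha, r_plus, delta_r, a

def first_law_coefficients(m, Q, L):
    """Theta, Phi and Omega from (11.109) with G = c = 1."""
    alpha, r_plus, delta_r, a = horizon_quantities(m, Q, L)
    return delta_r / (4.0 * alpha), r_plus * Q / alpha, a / alpha

def alpha_of(m, Q, L):
    return horizon_quantities(m, Q, L)[0]

def partial(f, args, index, h=1e-6):
    """Central finite difference of f with respect to args[index]."""
    up, down = list(args), list(args)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2.0 * h)

state = (1.0, 0.3, 0.4)             # hypothetical sub-extremal (m, Q, L)
theta, phi, omega = first_law_coefficients(*state)
d_alpha = [partial(alpha_of, state, i) for i in range(3)]

# dm = Theta dalpha + Phi dQ + Omega dL implies that all three vanish:
checks = [theta * d_alpha[0] - 1.0,
          theta * d_alpha[1] + phi,
          theta * d_alpha[2] + omega]
```

The three entries of `checks` correspond to variations of m, Q and L with the other two held fixed.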
Let us now see whether a linear relation between the entropy S and the
area α will lead to consistent results. Thus, assume S = γα with some
constant γ to be determined. Then, a change δα will lead to a change
δS = γδα in the entropy.
Bekenstein showed that the minimal change of the effective area is twice
the squared Planck length (1.5), thus
\[ \delta\alpha = \frac{2\hbar G}{c^3} . \tag{11.110} \]
On the other hand, he identified the minimal entropy change of the black
hole with the minimal change of the Shannon entropy, which is derived
from information theory and is
\[ \delta S = k_B\ln 2 , \tag{11.111} \]
where the Boltzmann constant $k_B$ was inserted to arrive at conventional
units for the entropy. This could e.g. correspond to the minimal
information loss when a single particle disappears in a black hole. Requiring
\[ k_B\ln 2 = \delta S = \gamma\,\delta\alpha = \gamma\,\frac{2\hbar G}{c^3} \tag{11.112} \]
fixes the constant γ to
\[ \gamma = \frac{\ln 2}{2}\frac{k_B c^3}{\hbar G} . \tag{11.113} \]

? Look up Bekenstein’s arguments leading to the normalisation (11.113) of the black-hole entropy (Bekenstein, J., Black Holes and Entropy. Phys. Rev. D 7 (1973) 2333).

Bekenstein entropy
The Bekenstein entropy of a black hole is
\[ S = \frac{\ln 2}{8\pi}\frac{c^3 k_B}{\hbar G}\,A , \tag{11.114} \]
where A is the area of the black hole.
The quantity Θ defined in (11.109) must then correspond to the
temperature of the black hole. From (11.108), we have on the one hand
\[ \Theta = \left(\frac{\partial m}{\partial\alpha}\right)_{Q,L} . \tag{11.115} \]
If the association of a temperature should be consistent, it must on the
other hand agree with the thermodynamic definition of temperature,
\[ \frac{1}{T} = \left(\frac{\partial S}{\partial E}\right)_V . \tag{11.116} \]
For E, we can use the mass or rather
\[ E = Mc^2 = \frac{mc^4}{G} . \tag{11.117} \]
Then,
\[ \left(\frac{\partial S}{\partial E}\right)_V = \frac{G}{c^4}\left(\frac{\partial S}{\partial m}\right)_{Q,L} = \frac{\ln 2}{2}\frac{k_B}{\hbar c}\left(\frac{\partial\alpha}{\partial m}\right)_{Q,L} . \tag{11.118} \]
Inserting (11.115) now leads to an expression for the temperature.

Black-hole temperature
The analogy between the area of a black hole and entropy implies that
black holes can be assigned the temperature
\[ T = \frac{2}{\ln 2}\frac{\hbar c}{k_B}\,\Theta = \frac{2\pi}{\ln 2}\frac{\hbar c}{k_B}\frac{\delta r}{A} . \tag{11.119} \]

This result leads to a remarkable conclusion. If black holes have a
temperature, they will radiate and thus lose energy or its mass equivalent.
They can therefore evaporate. By the Stefan-Boltzmann law, the
luminosity radiated by a black body of area A and temperature T is
\[ L = \sigma A T^4 , \quad \sigma = \frac{\pi^2 k_B^4}{60\hbar^3 c^2} . \tag{11.120} \]
For an uncharged and non-rotating black hole, δr = 2m and A = 16πm²,
thus its temperature is
\[ T = \frac{1}{4\ln 2}\frac{\hbar c}{k_B m} = \frac{1}{4\ln 2}\frac{\hbar c^3}{k_B G M} . \tag{11.121} \]
Defining the Planck temperature by
\[ T_{\rm Pl} := \frac{M_{\rm Pl}c^2}{k_B} = 1.42\cdot 10^{32}\,\mathrm{K} \tag{11.122} \]
in terms of the Planck mass $M_{\rm Pl} = 2.2\cdot 10^{-5}\,\mathrm{g}$ from (1.4), we can write
\[ T = \frac{T_{\rm Pl}}{4\ln 2}\frac{M_{\rm Pl}}{M} . \tag{11.123} \]
For a black hole of solar mass, $M = M_\odot = 2.0\cdot 10^{33}\,\mathrm{g}$, the temperature is
\[ T = 5.6\cdot 10^{-7}\,\mathrm{K} . \tag{11.124} \]
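
The number in (11.124) can be reproduced directly from (11.121). The sketch below uses SI values of the constants (CODATA values, rounded; the variable names are illustrative) and the solar mass quoted in the text.

```python
import math

# Physical constants in SI units (rounded CODATA values).
HBAR = 1.054571817e-34     # J s
C = 2.99792458e8           # m / s
K_B = 1.380649e-23         # J / K
G = 6.67430e-11            # m^3 / (kg s^2)

def bekenstein_temperature(mass_kg):
    """Temperature (11.121) of an uncharged, non-rotating black hole."""
    return HBAR * C**3 / (4.0 * math.log(2.0) * K_B * G * mass_kg)

M_SUN = 2.0e30             # kg, i.e. 2.0e33 g as quoted in the text
T_sun = bekenstein_temperature(M_SUN)
```

Since T scales as 1/M according to (11.123), a black hole of ten solar masses is ten times colder.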


Chapter 12

Homogeneous, Isotropic
Cosmology

12.1 Spherically-symmetric spacetimes

Physical cosmology aims at studying the structure and evolution of the


universe as a whole. Of the four fundamental interactions of physics,
only gravity is relevant on the largest scales because the strong and
the weak interactions are confined to sub-atomic length scales, and the
electromagnetic force is shielded on large scales by opposite charges.
We thus expect that the spacetime of the Universe can be idealised as
a solution of Einstein’s field equations, satisfying certain simplicity re-
quirements expressed by symmetries imposed on the form of the solution.
In this chapter, we shall therefore first discuss spherically-symmetric
spacetimes in general and then specialise them to cosmological solutions
in particular.

12.1.1 Form of the metric

Generally, a spacetime (M, g) is called spherically symmetric if it admits


the group SO(3) as an isometry such that the group’s orbits are two-
dimensional, space-like surfaces.
For any point p ∈ M, we can then select the orbit Ω(p) of SO(3) through
p. In other words, we construct the spatial two-sphere containing p
which is compatible with the spherical symmetry.
Next, we construct the set of all geodesics N(p) through p which are
orthogonal to Ω(p). Locally, N(p) forms a two-dimensional surface
which we also call N(p). Repeating this construction for all p ∈ M
yields the surfaces N.


We can now introduce coordinates (r, t) on N and (ϑ, ϕ) on Ω, i.e. such


that the group orbits Ω of SO(3) are given by (r, t) = const. and the
surfaces N by (ϑ, ϕ) = const. This allows the following intermediate
conclusion.
Metric of a spherically-symmetric spacetime
The line element of the metric of a spherically-symmetric spacetime M
can be written in the form
\[ ds^2 = d\tilde s^2 + R^2(t, r)\left(d\vartheta^2 + \sin^2\vartheta\,d\varphi^2\right) , \tag{12.1} \]
where $d\tilde s^2$ is the line element of a yet unspecified metric $\tilde g$ in the
coordinates (t, r) on the surfaces N.

? Following similar arguments as presented here, how would you construct the metric for a cylindrically-symmetric spacetime?

Without loss of generality, we can now choose t and r such that the
metric $\tilde g$ is diagonal, which allows us to write its line element as
\[ d\tilde s^2 = -e^{2a(t,r)}c^2dt^2 + e^{2b(t,r)}dr^2 , \tag{12.2} \]
with functions a(t, r) and b(t, r) to be determined.


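As suggested by the line elements (12.1) and (12.2), we introduce the
dual basis
\[ \theta^0 = e^a c\,dt , \quad \theta^1 = e^b dr , \quad \theta^2 = R\,d\vartheta , \quad \theta^3 = R\sin\vartheta\,d\varphi \tag{12.3} \]
and find its exterior derivatives
\[ \begin{aligned}
d\theta^0 &= -a' e^{-b}\,\theta^0\wedge\theta^1 , \\
d\theta^1 &= \dot b\,e^{-a}\,\theta^0\wedge\theta^1 , \\
d\theta^2 &= \frac{\dot R}{R}e^{-a}\,\theta^0\wedge\theta^2 + \frac{R'}{R}e^{-b}\,\theta^1\wedge\theta^2 , \\
d\theta^3 &= \frac{\dot R}{R}e^{-a}\,\theta^0\wedge\theta^3 + \frac{R'}{R}e^{-b}\,\theta^1\wedge\theta^3 + \frac{\cot\vartheta}{R}\,\theta^2\wedge\theta^3 ,
\end{aligned} \tag{12.4} \]
where the overdots and primes denote derivatives with respect to ct and
r, respectively.

? Carry out the calculations leading to (12.4) yourself. Can you confirm the results?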

12.1.2 Connection and curvature forms

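In the dual basis (12.3), the metric is Minkowskian, g = diag(−1, 1, 1, 1),
thus dg = 0, and Cartan’s first structure equation (8.13) implies
\[ \omega^\mu{}_\nu\wedge\theta^\nu = -d\theta^\mu \tag{12.5} \]
for the connection 1-forms $\omega^\mu{}_\nu$. From (12.5) and the results (12.4), we
can read off
\[ \begin{aligned}
\omega^0{}_1 &= \omega^1{}_0 = a' e^{-b}\,\theta^0 + \dot b\,e^{-a}\,\theta^1 , & \omega^1{}_2 &= -\omega^2{}_1 = -\frac{R'}{R}e^{-b}\,\theta^2 , \\
\omega^0{}_2 &= \omega^2{}_0 = \frac{\dot R}{R}e^{-a}\,\theta^2 , & \omega^1{}_3 &= -\omega^3{}_1 = -\frac{R'}{R}e^{-b}\,\theta^3 , \\
\omega^0{}_3 &= \omega^3{}_0 = \frac{\dot R}{R}e^{-a}\,\theta^3 , & \omega^2{}_3 &= -\omega^3{}_2 = -\frac{\cot\vartheta}{R}\,\theta^3 .
\end{aligned} \tag{12.6} \]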

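Cartan’s second structure equation (8.13) then yields the curvature
2-forms $\Omega^i{}_j$,
\[ \begin{aligned}
\Omega^0{}_1 &= d\omega^0{}_1 \equiv E\,\theta^0\wedge\theta^1 , \\
\Omega^0{}_2 &= d\omega^0{}_2 + \omega^0{}_1\wedge\omega^1{}_2 \equiv \tilde E\,\theta^0\wedge\theta^2 + H\,\theta^1\wedge\theta^2 , \\
\Omega^0{}_3 &= d\omega^0{}_3 + \omega^0{}_1\wedge\omega^1{}_3 + \omega^0{}_2\wedge\omega^2{}_3 = \tilde E\,\theta^0\wedge\theta^3 + H\,\theta^1\wedge\theta^3 , \\
\Omega^1{}_2 &= d\omega^1{}_2 + \omega^1{}_0\wedge\omega^0{}_2 \equiv -H\,\theta^0\wedge\theta^2 + \tilde F\,\theta^1\wedge\theta^2 , \\
\Omega^1{}_3 &= d\omega^1{}_3 + \omega^1{}_0\wedge\omega^0{}_3 + \omega^1{}_2\wedge\omega^2{}_3 = -H\,\theta^0\wedge\theta^3 + \tilde F\,\theta^1\wedge\theta^3 , \\
\Omega^2{}_3 &= d\omega^2{}_3 + \omega^2{}_0\wedge\omega^0{}_3 + \omega^2{}_1\wedge\omega^1{}_3 \equiv F\,\theta^2\wedge\theta^3 ,
\end{aligned} \tag{12.7} \]
where the functions
\[ \begin{aligned}
E &= e^{-2a}\left(\ddot b - \dot a\dot b + \dot b^2\right) - e^{-2b}\left(a'' - a'b' + a'^2\right) , \\
\tilde E &= \frac{e^{-2a}}{R}\left(\ddot R - \dot a\dot R\right) - \frac{e^{-2b}}{R}\,a'R' , \\
H &= \frac{e^{-a-b}}{R}\left(\dot R' - a'\dot R - \dot b R'\right) , \\
F &= \frac{1}{R^2}\left(1 - R'^2 e^{-2b} + \dot R^2 e^{-2a}\right) , \\
\tilde F &= \frac{e^{-2a}}{R}\,\dot b\dot R + \frac{e^{-2b}}{R}\left(b'R' - R''\right)
\end{aligned} \tag{12.8} \]
were defined for brevity.

? Repeat the calculations leading to (12.8) yourself, beginning with reading off the connection forms (12.6).

According to (8.20), the curvature forms imply the components
\[ R_{\alpha\beta} = \Omega^\lambda{}_\alpha(e_\lambda, e_\beta) \tag{12.9} \]
of the Ricci tensor, for which we obtain
\[ \begin{aligned}
R_{00} &= -E - 2\tilde E , & R_{01} &= -2H , & R_{02} &= 0 = R_{03} , \\
R_{11} &= E + 2\tilde F , & R_{12} &= 0 = R_{13} , \\
R_{22} &= \tilde E + \tilde F + F = R_{33} , & R_{23} &= 0 ,
\end{aligned} \tag{12.10} \]
the Ricci scalar
\[ \begin{aligned}
R &= (E + 2\tilde E) + (E + 2\tilde F) + 2(\tilde E + \tilde F + F) \\
&= 2(E + F) + 4(\tilde E + \tilde F) ,
\end{aligned} \tag{12.11} \]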

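and finally the components
\[ G_{\alpha\beta} = \begin{pmatrix}
F + 2\tilde F & -2H & 0 & 0 \\
-2H & -2\tilde E - F & 0 & 0 \\
0 & 0 & -E - \tilde E - \tilde F & 0 \\
0 & 0 & 0 & -E - \tilde E - \tilde F
\end{pmatrix} \tag{12.12} \]
of the Einstein tensor
\[ G = R - \frac{R}{2}\,g . \tag{12.13} \]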

12.1.3 Generalised Birkhoff’s theorem

We can now state and prove Birkhoff’s theorem in its general form:
Birkhoff’s generalised theorem
Every $C^2$ solution of Einstein’s vacuum equations which is spherically
symmetric in an open subset U ⊂ M is locally isometric to a domain
of the Schwarzschild-Kruskal solution.
The proof proceeds in four steps:

1. If the surfaces {R(t, r) = const.} are time-like in U and dR ≠ 0,
we can choose R(t, r) = r, thus $\dot R = 0$ and $R' = 1$. Since
$H = -\dot b\,e^{-a-b}/R$ then, the requirement $G_{01} = 0$ implies $\dot b = 0$.
The sum $G_{00} + G_{11} = 2(\tilde F - \tilde E)$ must also vanish, thus
\[ \frac{e^{-2b}}{R}\left(b' + a'\right) = 0 , \tag{12.14} \]
which means a(t, r) = −b(r) + f(t). By a suitable choice of a new
time coordinate, a can therefore be made time-independent as well.
Moreover, we see that
\[ 0 = G_{00} = F + 2\tilde F = \frac{1 - e^{-2b}}{R^2} + \frac{2b'e^{-2b}}{R} \tag{12.15} \]
is identical to the condition (8.62) for the function b in the
Schwarzschild spacetime. Thus, we have $e^{-2b} = 1 - 2m/r$ as there, further
a(r) = −b(r), and the metric turns into the Schwarzschild metric.

? Construct yourself the new time coordinate implied by the condition a(t, r) = −b(r) + f(t).

2. If the surfaces {R(t, r) = const.} are space-like in U and dR ≠ 0,
we can choose R(t, r) = t and proceed in an analogous way. Then,
$\dot R = 1$ and $R' = 0$, thus $H = -a'e^{-a-b}/R$, hence $G_{01} = 0$ implies
$a' = 0$ and, again through $G_{00} + G_{11} = 0$, the condition $\dot a + \dot b = 0$
or b(t, r) = −a(t) + f(r). This allows us to change the radial
coordinate appropriately so that b(t, r) also becomes independent
of r. Then, $G_{00} = 0$ implies
\[ 0 = G_{00} = \frac{1}{R^2}\left(1 + e^{-2a}\right) - \frac{2\dot a\,e^{-2a}}{R} , \tag{12.16} \]
where $\dot b = -\dot a$ was used. Since R = t, this is equivalent to
\[ \partial_t\left(t\,e^{-2a}\right) = -1 \quad\Rightarrow\quad e^{-2a} = e^{2b} = \frac{2m}{t} - 1 , \tag{12.17} \]
with t < 2m. This is the Schwarzschild solution for r < 2m
because r and t change roles inside the Schwarzschild horizon.

3. If {R(t, r) = const.} are space-like in some part of U and time-


like in another, we obtain the respective different domains of the
Schwarzschild spacetime.

4. Assume finally $\langle dR, dR\rangle = 0$ on U. If R is constant in U,
$G_{00} = R^{-2} = 0$ implies R = ∞. Therefore, suppose dR is not zero, but
light-like. Then, r and t can be chosen such that R = t − r and
dR = dt − dr. For dR to be light-like,
\[ \langle dR, dR\rangle = \tilde g(dR, dR) = -e^{2a} + e^{2b} = 0 , \tag{12.18} \]
we require a = b. Then, $G_{00} + G_{11} = 0$ or
\[ -\frac{e^{-2a}}{R}\left(\dot a + \dot b - a' - b'\right) = 0 \tag{12.19} \]
implies $\dot a = a'$, which again leads to R = ∞ through $G_{00} = 0$.

This shows that the metric reduces to the Schwarzschild metric in all
relevant cases.
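
Steps 1 and 2 of the proof can be illustrated numerically. The sketch below checks by central finite differences that $e^{-2b} = 1 - 2m/r$ satisfies the condition (12.15) with R = r, and that $e^{-2a} = 2m/t - 1$ satisfies the differential equation in (12.17). The function names, step size and sample points are assumptions made for illustration.

```python
import math

def b_exterior(r, m):
    """b(r) defined by exp(-2b) = 1 - 2m/r, valid for r > 2m."""
    return -0.5 * math.log(1.0 - 2.0 * m / r)

def g00_exterior(r, m, h=1e-6):
    """Right-hand side of (12.15) with R = r and a numerical b'."""
    b_prime = (b_exterior(r + h, m) - b_exterior(r - h, m)) / (2.0 * h)
    exp_m2b = math.exp(-2.0 * b_exterior(r, m))
    return (1.0 - exp_m2b) / r**2 + 2.0 * b_prime * exp_m2b / r

def interior_derivative(t, m, h=1e-6):
    """Numerical d/dt of t * exp(-2a) with exp(-2a) = 2m/t - 1, t < 2m."""
    def t_exp_m2a(s):
        return s * (2.0 * m / s - 1.0)
    return (t_exp_m2a(t + h) - t_exp_m2a(t - h)) / (2.0 * h)

m = 1.0
exterior_residuals = [g00_exterior(r, m) for r in (3.0, 10.0)]
interior_residuals = [interior_derivative(t, m) + 1.0 for t in (0.5, 1.5)]
```

Both lists of residuals should vanish up to the finite-difference error.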
Cavity in spherically-symmetric spacetime
It is a corollary to Birkhoff’s theorem that a spherical cavity in a
spherically-symmetric spacetime has the Minkowski metric. Indeed,
Birkhoff’s theorem says that the cavity must have a Schwarzschild
metric with mass zero, which is the Minkowski metric.

? Compare Birkhoff’s to Newton’s theorem.

12.2 Homogeneous and isotropic spacetimes

12.2.1 Homogeneity and isotropy

There are good reasons to believe that the Universe at large is isotropic
around our position. The most convincing observational data are pro-
vided by the cosmic microwave background, which is a sea of blackbody
radiation at a temperature of (2.725 ± 0.001) K whose intensity is almost
exactly independent of the direction into which it is observed.
There is furthermore no good reason to believe that our position in the
Universe is in any sense preferred compared to others. We must therefore
conclude that any observer sees the cosmic microwave background as
an isotropic source, just as we do. Then, the Universe must also be
homogeneous.
We are thus led to the expectation that our Universe at large may be
described by a homogeneous and isotropic spacetime. Let us now give
these terms a precise mathematical meaning.

Caution: While isotropy about our position in spacetime can be tested and is confirmed by observations, homogeneity is essentially impossible to test.

Spatially homogeneous spacetime
A spacetime (M, g) is called spatially homogeneous if there exists a
one-parameter family of space-like hypersurfaces Σt that foliate the
spacetime such that for each t and any two points p, q ∈ Σt , there exists
an isometry φ of g which takes p into q.
Before we can define isotropy, we have to note that the state of motion
of the observer needs to be specified first because two observers moving
with different velocities through a given point in spacetime will generally
observe different redshifts in different directions.
Spatially isotropic spacetime
Therefore, we define a spacetime (M, g) as spatially isotropic about a
point p if there exists a congruence of time-like geodesics through p
with tangents u such that for any two vectors v1 , v2 ∈ T p M orthogonal
to u, there exists an isometry of g taking v1 into v2 but leaving u and
p invariant. In other words, if the spacetime is spatially isotropic, no
preferred spatial direction orthogonal to u can be identified.
Isotropy thus identifies a special class of observers, with four-velocities
u, who cannot identify a preferred spatial direction. The spatial
hypersurfaces Σt must then be orthogonal to u because otherwise a preferred
direction could be identified through the misalignment of the normal
direction to Σt and u, breaking isotropy.
We thus arrive at the following conclusions: a homogeneous and isotropic
spacetime (M, g) is foliated into space-like hypersurfaces Σt on which g
induces a metric h. There must be isometries of h carrying any point p ∈
Σt into any other point q ∈ Σt . Because of isotropy, it must furthermore
be impossible to identify preferred spatial directions on Σt . These are
very restrictive requirements which we shall now exploit.

12.2.2 Spaces of constant curvature

Consider now the curvature tensor ${}^{(3)}\bar R$ induced on Σt (i.e. the curvature
tensor belonging to the metric h induced on Σt ). We shall write it in
components with its first two indices lowered and the following two
indices raised,
\[ {}^{(3)}\bar R = {}^{(3)}\bar R_{ij}{}^{kl} . \tag{12.20} \]
In this way, ${}^{(3)}\bar R$ represents a linear map from the vector space of
2-forms $\bigwedge^2$ into $\bigwedge^2$, because of the antisymmetry of ${}^{(3)}\bar R$ with respect to
permutations of the first and the second pairs of indices. Thus, it defines
an endomorphism
\[ L: \bigwedge\nolimits^2 \to \bigwedge\nolimits^2 , \quad (L\omega)_{ij} = {}^{(3)}\bar R_{ij}{}^{kl}\,\omega_{kl} . \tag{12.21} \]

Caution: Recall that an endomorphism is a linear map of a vector space into itself.

Due to the symmetry (3.81) of ${}^{(3)}\bar R$ upon swapping the first with the
second pair of indices, the endomorphism L is self-adjoint. In fact, for
any pair of 2-forms $\alpha, \beta \in \bigwedge^2$,
\[ \begin{aligned}
\langle\alpha, L\beta\rangle &= {}^{(3)}\bar R_{ij}{}^{kl}\,\alpha^{ij}\beta_{kl} = {}^{(3)}\bar R_{ijkl}\,\alpha^{ij}\beta^{kl} = {}^{(3)}\bar R_{klij}\,\alpha^{ij}\beta^{kl} \\
&= {}^{(3)}\bar R_{kl}{}^{ij}\,\alpha_{ij}\beta^{kl} = \langle\beta, L\alpha\rangle ,
\end{aligned} \tag{12.22} \]
which defines a self-adjoint endomorphism.


We can now use the theorem stating that the eigenvectors of a self-adjoint
endomorphism provide an orthonormal basis for the vector space it is
operating on. Isotropy now requires us to conclude that the eigenvalues
belonging to these eigenvectors need to be equal because we could
otherwise define a preferred direction (e.g. by the eigenvector belonging to
the largest eigenvalue). Then, however, the endomorphism L must be
proportional to the identity map,
\[ L = 2k\,\mathrm{id} , \tag{12.23} \]
with some k ∈ R.
By the definition (12.21) of L, this implies for the coefficients of the
curvature tensor
\[ {}^{(3)}\bar R_{ij}{}^{kl} = k\left(\delta^k_i\delta^l_j - \delta^k_j\delta^l_i\right) \tag{12.24} \]
because ${}^{(3)}\bar R$ must be antisymmetrised. Lowering the indices by means
of the induced metric h yields
\[ {}^{(3)}\bar R_{ijkl} = k\left(h_{ik}h_{jl} - h_{jk}h_{il}\right) . \tag{12.25} \]
The Ricci tensor is
\[ {}^{(3)}R_{jl} = {}^{(3)}\bar R^i{}_{jil} = k h^{is}\left(h_{si}h_{jl} - h_{ji}h_{sl}\right) = k\left(3h_{jl} - h_{jl}\right) = 2k h_{jl} , \tag{12.26} \]
and the Ricci scalar becomes
\[ {}^{(3)}R = {}^{(3)}R^j{}_j = 6k . \tag{12.27} \]

? Summarise the arguments leading to the Ricci tensor (12.26) and the Ricci scalar (12.27) in your own words.

In the coordinate-free representation, the curvature is
\[ \bar R(x, y)v = k\left(\langle x, v\rangle y - \langle y, v\rangle x\right) . \tag{12.28} \]
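
The contractions in (12.26) and (12.27) can be verified with a few lines of code. The sketch below builds the components (12.25) in an orthonormal basis, where h is the 3×3 unit matrix so that raising and lowering indices is trivial, and contracts them. The function names and the choice k = 2 are illustrative assumptions.

```python
def constant_curvature_riemann(h, k):
    """Components R_ijkl = k (h_ik h_jl - h_jk h_il) from (12.25)."""
    n = len(h)
    return [[[[k * (h[i][p] * h[j][q] - h[j][p] * h[i][q])
               for q in range(n)] for p in range(n)]
             for j in range(n)] for i in range(n)]

def ricci_tensor(riemann):
    """R_jl as the sum over i of R_ijil, valid in an orthonormal basis."""
    n = len(riemann)
    return [[sum(riemann[i][j][i][l] for i in range(n))
             for l in range(n)] for j in range(n)]

# Orthonormal basis: the induced metric is the 3x3 unit matrix.
h = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
k = 2                                   # hypothetical curvature parameter
ricci = ricci_tensor(constant_curvature_riemann(h, k))
ricci_scalar = sum(ricci[j][j] for j in range(3))
```

With integer k the arithmetic is exact, so `ricci` equals 2k times the unit matrix and `ricci_scalar` equals 6k.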

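From (8.18) and (12.25), we find the curvature forms
\[ \Omega^i{}_j = \frac{1}{2}\,{}^{(3)}\bar R^i{}_{jkl}\,\theta^k\wedge\theta^l = \frac{k}{2}\,h^{is}\left(h_{sk}h_{jl} - h_{jk}h_{sl}\right)\theta^k\wedge\theta^l = k\,\theta^i\wedge\theta_j \tag{12.29} \]
in a so far arbitrary dual basis $\theta^i$.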

The curvature parameter k must be (spatially) constant because of
homogeneity. Space-times with constant curvature can be shown to be
conformally flat, which means that coordinates can be introduced in
which the line element dl² of the metric h reads
\[ dl^2 = \frac{1}{\psi^2}\sum_{i=1}^3\left(dx^i\right)^2 , \tag{12.30} \]
with a yet unknown arbitrary function $\psi = \psi(x^j)$. This leads us to
introduce the dual basis
\[ \theta^i \equiv \frac{1}{\psi}\,dx^i , \tag{12.31} \]
from which we find
\[ d\theta^i = -\frac{\partial_j\psi}{\psi^2}\,dx^j\wedge dx^i = \psi_j\,\theta^i\wedge\theta^j , \tag{12.32} \]
where $\psi_j = \partial_j\psi$ abbreviates the partial derivative of ψ with respect to $x^j$.
In this basis, the metric h is represented by h = diag(1, 1, 1). Therefore,
we do not need to distinguish between raised and lowered indices, and
dh = 0. Hence Cartan’s first structure equation (8.13) implies the
connection forms
\[ \omega_{ij} = \psi_i\theta_j - \psi_j\theta_i . \tag{12.33} \]

According to Cartan’s second structure equation, the curvature forms are
\[ \begin{aligned}
\Omega_{ij} &= d\omega_{ij} + \omega_{ik}\wedge\omega^k{}_j \\
&= \psi\left(\psi_{ik}\,\theta^k\wedge\theta_j - \psi_{jk}\,\theta^k\wedge\theta_i\right) - \psi_k\psi^k\,\theta_i\wedge\theta_j ,
\end{aligned} \tag{12.34} \]
but at the same time we must satisfy (12.29). This immediately implies
\[ \psi_{ik} = 0 \quad (i \ne k) , \tag{12.35} \]
thus ψ has to be of the form
\[ \psi = \sum_{k=1}^3 f_k(x^k) \tag{12.36} \]
because otherwise the mixed derivatives could not vanish.
Inserting this result into (12.34) shows
\[ \Omega_{ij} = \psi\left(f_i'' + f_j'' - \frac{f_k' f'^k}{\psi}\right)\theta_i\wedge\theta_j . \tag{12.37} \]

In order to satisfy (12.29), we must have
\[ f_i'' + f_j'' = \frac{k + f_k' f'^k}{\psi} . \tag{12.38} \]
Since the two sides of these equations (one for each combination of i
and j) depend on different sets of variables, the second derivatives $f_i''$
and $f_j''$ must all be equal and constant, and thus the $f_i$ must be quadratic
in $x^i$ with a coefficient of $(x^i)^2$ which is independent of $x^i$. Therefore, we
can write
\[ \psi = 1 + \frac{k}{4}\sum_{i=1}^3\left(x^i\right)^2 \tag{12.39} \]
because, if the linear term is non-zero, it can be made zero by translating
the coordinate origin, and a constant factor on ψ is irrelevant because it
simply scales the coordinates.
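
That the solution (12.39) indeed satisfies the condition (12.38) can be confirmed at an arbitrary point. The sketch below assumes the decomposition $f_i = 1/3 + k (x^i)^2/4$, which is one possible way of writing (12.39) in the form (12.36); the function names and the sample point are illustrative.

```python
def psi(x, k):
    """Conformal factor (12.39) at the point x = (x1, x2, x3)."""
    return 1.0 + 0.25 * k * sum(xi * xi for xi in x)

def condition_residual(x, k):
    """f_i'' + f_j'' - (k + f_k' f'^k) / psi for f_i = 1/3 + k (x^i)^2 / 4."""
    second_derivative = 0.5 * k                     # f_i'' for every i
    gradient_squared = sum((0.5 * k * xi)**2 for xi in x)
    return 2.0 * second_derivative - (k + gradient_squared) / psi(x, k)

point = (0.3, -0.5, 0.8)                            # hypothetical sample point
residuals = [condition_residual(point, k) for k in (-1.0, 0.0, 1.0)]
```

The residual vanishes for negative, zero and positive curvature alike, as long as ψ does not vanish at the chosen point.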

12.3 Friedmann’s equations

12.3.1 Connection and curvature forms

Robertson-Walker metric
According to the preceding discussion, the homogeneous and isotropic
spatial hypersurfaces Σt must have a metric h with a line element of the
form
\[ dl^2 = \frac{\sum_{i=1}^3\left(dx^i\right)^2}{\left(1 + kr^2/4\right)^2} , \quad r^2 \equiv \sum_{i=1}^3\left(x^i\right)^2 . \tag{12.40} \]

By a suitable choice of the time coordinate t, the line element of the


metric of a spatially homogeneous and isotropic spacetime can then be
written as
ds2 = −c2 dt2 + a2 (t)dl2 , (12.41)
because the scaling function a(t) must not depend on the xi in order
to preserve isotropy and homogeneity. The metric (12.41) of a spa-
tially homogeneous and isotropic spacetime is called Robertson-Walker
metric.
Correspondingly, we choose the appropriate dual basis
\[ \theta^0 = c\,dt , \quad \theta^i = \frac{a(t)\,dx^i}{1 + kr^2/4} , \tag{12.42} \]
in terms of which the metric coefficients are g = diag(−1, 1, 1, 1).
The exterior derivatives of the dual basis are
\[ \begin{aligned}
d\theta^0 &= 0 , \\
d\theta^i &= \frac{\dot a\,dt\wedge dx^i}{1 + kr^2/4} - \frac{a}{\left(1 + kr^2/4\right)^2}\frac{k}{2}\,x_j\,dx^j\wedge dx^i \\
&= \frac{\dot a}{ca}\,\theta^0\wedge\theta^i + \frac{kx_j}{2a}\,\theta^i\wedge\theta^j .
\end{aligned} \tag{12.43} \]
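Since the exterior derivative of the metric is dg = 0, Cartan’s first
structure equation (8.13) implies
\[ \omega^i{}_j\wedge\theta^j = -d\theta^i , \tag{12.44} \]
suggesting the connection forms
\[ \begin{aligned}
\omega^0{}_i &= \omega^i{}_0 = \frac{\dot a}{ca}\,\theta^i , \\
\omega^i{}_j &= -\omega^j{}_i = \frac{k}{2a}\left(x^i\theta^j - x^j\theta^i\right) ,
\end{aligned} \tag{12.45} \]
which evidently satisfy (12.44).
Their exterior derivatives are
\[ \begin{aligned}
d\omega^0{}_i &= \frac{\ddot a a - \dot a^2}{c^2a^2}\,\theta^0\wedge\theta^i + \frac{\dot a}{ca}\,d\theta^i \\
&= \frac{\ddot a}{c^2a}\,\theta^0\wedge\theta^i + \frac{k\dot a x_j}{2ca^2}\,\theta^i\wedge\theta^j
\end{aligned} \tag{12.46} \]
and
\[ \begin{aligned}
d\omega^i{}_j &= -\frac{k\dot a}{2ca^2}\,\theta^0\wedge\left(x^i\theta^j - x^j\theta^i\right) + \frac{k}{2a}\left(dx^i\wedge\theta^j - dx^j\wedge\theta^i + x^i d\theta^j - x^j d\theta^i\right) \\
&= \frac{k}{a^2}\left(1 + \frac{k}{4}r^2\right)\theta^i\wedge\theta^j + \frac{k^2}{4a^2}\left(x^i x_k\,\theta^j\wedge\theta^k - x^j x_k\,\theta^i\wedge\theta^k\right) .
\end{aligned} \tag{12.47} \]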

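Cartan’s second structure equation (8.13) then gives the curvature forms
\[ \begin{aligned}
\Omega^0{}_i &= d\omega^0{}_i + \omega^0{}_k\wedge\omega^k{}_i = \frac{\ddot a}{c^2a}\,\theta^0\wedge\theta^i , \\
\Omega^i{}_j &= d\omega^i{}_j + \omega^i{}_0\wedge\omega^0{}_j + \omega^i{}_k\wedge\omega^k{}_j = \frac{k + \dot a^2/c^2}{a^2}\,\theta^i\wedge\theta^j ,
\end{aligned} \tag{12.48} \]
from which we obtain the components of the Ricci tensor
\[ R_{\mu\nu} = \bar R^\alpha{}_{\mu\alpha\nu} = \Omega^\alpha{}_\mu(e_\alpha, e_\nu) \tag{12.49} \]
as
\[ R_{00} = -\frac{3\ddot a}{c^2a} , \quad R_{11} = R_{22} = R_{33} = \frac{\ddot a}{c^2a} + 2\,\frac{k + \dot a^2/c^2}{a^2} . \tag{12.50} \]
The Ricci scalar is then
\[ R = R^\mu{}_\mu = 6\left(\frac{\ddot a}{c^2a} + \frac{k + \dot a^2/c^2}{a^2}\right) . \tag{12.51} \]

? Beginning with the dual basis (12.43), carry out all calculations leading to (12.50) and (12.51) yourself.

Einstein tensor for a spatially homogeneous and isotropic spacetime
The Einstein tensor of a spatially homogeneous and isotropic spacetime
has the components
\[ G_{00} = 3\,\frac{k + \dot a^2/c^2}{a^2} , \quad G_{11} = G_{22} = G_{33} = -\frac{2\ddot a}{c^2a} - \frac{k + \dot a^2/c^2}{a^2} . \tag{12.52} \]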

12.3.2 From Einstein to Friedmann

For Einstein’s field equations to be satisfied, the energy-momentum
tensor must be diagonal, and its components must not depend on the
spatial coordinates in order to preserve isotropy and homogeneity. We
set $T_{00} = \rho c^2$, which is the total energy density, and $T_{ij} = p\,\delta_{ij}$, where p
is the pressure.
This corresponds to the energy-momentum tensor of an ideal fluid,
\[ T = \left(\rho + \frac{p}{c^2}\right)u^\flat\otimes u^\flat + p\,g , \tag{12.53} \]
as seen by a fundamental observer (i.e. an observer for whom the spatial
hypersurfaces are isotropic). For such an observer, $u = c\,\partial_t$, and since the
metric is Minkowskian in the tetrad (12.42), the components of T are
simply $T_{00} = \rho c^2$ and $T_{ii} = p$.
Then, Einstein’s field equations in the form (6.80) with the cosmological
constant Λ reduce to
\[ \begin{aligned}
3\,\frac{k + \dot a^2/c^2}{a^2} &= \frac{8\pi G}{c^2}\,\rho + \Lambda , \\
-\frac{2\ddot a}{c^2a} - \frac{k + \dot a^2/c^2}{a^2} &= \frac{8\pi G}{c^4}\,p - \Lambda .
\end{aligned} \tag{12.54} \]
Adding a third of the first equation to the second, and re-writing the first
equation, we find Friedmann’s equations.

Friedmann’s equations
For a spatially homogeneous and isotropic spacetime with the
Robertson-Walker metric (12.41), Einstein’s field equations reduce
to Friedmann’s equations,
\[ \begin{aligned}
\frac{\dot a^2}{a^2} &= \frac{8\pi G}{3}\,\rho + \frac{\Lambda c^2}{3} - \frac{kc^2}{a^2} , \\
\frac{\ddot a}{a} &= -\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^2}\right) + \frac{\Lambda c^2}{3} .
\end{aligned} \tag{12.55} \]
A Robertson-Walker metric whose scale factor satisfies Friedmann’s
equations is called Friedmann-Lemaître-Robertson-Walker metric.
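
As an illustration of (12.55), the sketch below checks that the scale factor $a(t) = (6\pi G\rho_0)^{1/3}t^{2/3}$ solves both equations for pressureless matter with ρ = ρ0 a⁻³, k = 0 and Λ = 0. This special case (often called the Einstein-de Sitter model) is not discussed in the text and is chosen here only as a test; function names and units (G = c = ρ0 = 1) are assumptions.

```python
import math

def friedmann_rhs(a, rho, p, G=1.0, c=1.0, lam=0.0, k=0.0):
    """Right-hand sides of the two equations (12.55)."""
    first = 8.0 * math.pi * G * rho / 3.0 + lam * c**2 / 3.0 - k * c**2 / a**2
    second = (-4.0 * math.pi * G / 3.0 * (rho + 3.0 * p / c**2)
              + lam * c**2 / 3.0)
    return first, second

def matter_only_solution(t, rho0=1.0, G=1.0):
    """a, da/dt and d2a/dt2 for a(t) = (6 pi G rho0)^(1/3) t^(2/3)."""
    amplitude = (6.0 * math.pi * G * rho0) ** (1.0 / 3.0)
    a = amplitude * t ** (2.0 / 3.0)
    a_dot = (2.0 / 3.0) * amplitude * t ** (-1.0 / 3.0)
    a_ddot = -(2.0 / 9.0) * amplitude * t ** (-4.0 / 3.0)
    return a, a_dot, a_ddot

residuals = []
for t in (0.5, 1.0, 3.0):
    a, a_dot, a_ddot = matter_only_solution(t)
    rho = 1.0 / a**3                    # pressureless matter with rho0 = 1
    first, second = friedmann_rhs(a, rho, p=0.0)
    residuals.append((a_dot / a) ** 2 - first)
    residuals.append(a_ddot / a - second)
```

Other combinations of ρ, p, k and Λ can be tested in the same way once a candidate a(t) is available.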

Figure 12.1 Alexander A. Friedmann (1888–1925), Russian physicist and


mathematician. Source: Wikipedia

12.4 Density evolution and redshift

12.4.1 Density evolution

After multiplication with 3a² and differentiation with respect to t,
Friedmann’s first equation gives
\[ 6\dot a\ddot a = 8\pi G\left(\dot\rho a^2 + 2\rho a\dot a\right) + 2\Lambda c^2 a\dot a . \tag{12.56} \]
If we eliminate
\[ 6\dot a\ddot a = -8\pi G a\dot a\left(\rho + \frac{3p}{c^2}\right) + 2\Lambda c^2 a\dot a \tag{12.57} \]
by means of Friedmann’s second equation, we find
\[ \dot\rho + 3\,\frac{\dot a}{a}\left(\rho + \frac{p}{c^2}\right) = 0 \tag{12.58} \]
for the evolution of the density ρ with time.
This equation has a very intuitive meaning. To see it, let us consider the
energy contained in a volume V0 , which changes over time in proportion
to V0 a³, and employ the first law of thermodynamics,
\[ d\left(\rho c^2 V_0 a^3\right) + p\,d\left(V_0 a^3\right) = 0 \quad\Rightarrow\quad d\left(\rho c^2 a^3\right) + p\,d\left(a^3\right) = 0 . \tag{12.59} \]
We can use the first law of thermodynamics here because isotropy forbids
any energy currents, thus no energy can flow into or out of the volume
a³.
Equation (12.59) yields
\[ a^3\dot\rho + 3\rho a^2\dot a + \frac{3p}{c^2}\,a^2\dot a = 0 , \tag{12.60} \]
which is identical to (12.58). This demonstrates that (12.58) simply
expresses energy-momentum conservation. Consequently, one can show
that it also follows from the contracted second Bianchi identity, ∇ · T = 0.

? Why are energy and momentum conserved here, but not in general?

Two limits are typically considered for (12.58). First, if matter moves
non-relativistically, $p \ll \rho c^2$, and we can assume p ≈ 0. Then,
\[ \frac{\dot\rho}{\rho} = -3\,\frac{\dot a}{a} , \tag{12.61} \]
which implies
\[ \rho = \rho_0 a^{-3} \tag{12.62} \]
if ρ0 is the density when a = 1.
Second, relativistic matter has p = ρc²/3, with which we obtain
\[ \frac{\dot\rho}{\rho} = -4\,\frac{\dot a}{a} \tag{12.63} \]
and thus
\[ \rho = \rho_0 a^{-4} . \tag{12.64} \]
This shows that the density of non-relativistic matter drops as expected
in proportion to the inverse volume, but the density of relativistic matter
drops faster by one order of the scale factor. An explanation will be
given below.
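
Both limits can be reproduced by integrating (12.58) numerically. Writing p = wρc² with a constant w (a parametrisation introduced here for illustration only), (12.58) becomes dρ/da = −3(1 + w)ρ/a, which the sketch below integrates from a = 1 to a = 2 with a classical fourth-order Runge-Kutta scheme.

```python
def evolve_density(w, a_end, rho_start=1.0, steps=1000):
    """Integrate d(rho)/da = -3 (1 + w) rho / a from a = 1 to a_end (RK4)."""
    def slope(a, rho):
        return -3.0 * (1.0 + w) * rho / a

    a, rho = 1.0, rho_start
    h = (a_end - 1.0) / steps
    for _ in range(steps):
        k1 = slope(a, rho)
        k2 = slope(a + h / 2.0, rho + h * k1 / 2.0)
        k3 = slope(a + h / 2.0, rho + h * k2 / 2.0)
        k4 = slope(a + h, rho + h * k3)
        rho += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        a += h
    return rho

rho_matter = evolve_density(w=0.0, a_end=2.0)            # expect 2**-3
rho_radiation = evolve_density(w=1.0 / 3.0, a_end=2.0)   # expect 2**-4
```

Doubling the scale factor thus dilutes non-relativistic matter by a factor of 8 and relativistic matter by a factor of 16, in line with (12.62) and (12.64).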

12.4.2 Cosmological redshift

We can write the line element (12.41) in the form
\[ ds^2 = -c^2dt^2 + a^2(t)\,dl^2 , \tag{12.65} \]
where dl² is the line element of a three-space with constant curvature k.
Since light propagates on null geodesics, (12.65) implies
\[ c\,dt = \pm a(t)\,dl . \tag{12.66} \]

? What do the two different signs in (12.66) mean or imply?

Suppose a light signal leaves the source at the coordinate time t0 and
reaches the observer at t1 , then (12.66) shows that the coordinate time
satisfies the equation
\[ \int_{t_0}^{t_1}\frac{c\,dt}{a(t)} = \int_{\text{source}}^{\text{observer}} dl , \tag{12.67} \]

whose right-hand side is time-independent. Thus, for another light signal


leaving the source at t0 + dt0 and reaching the observer at t1 + dt1 , we
have  t1  t1 +dt1
cdt cdt
= . (12.68)
t0 a(t) t0 +dt0 a(t)

Since this implies


 t0 +dt0  t1 +dt1
dt dt
= , (12.69)
t0 a(t) t1 a(t)

we find for sufficiently small dt0,1 that

dt0 dt1
= . (12.70)
a(t0 ) a(t1 )

We can now identify the time intervals $dt_{0,1}$ with the inverse frequencies
of the emitted and observed light, $dt_i = \nu_i^{-1}$ for $i = 0, 1$. This shows that
the emitted and observed frequencies are related by
\[ \frac{\nu_0}{\nu_1} = \frac{a(t_1)}{a(t_0)} \,. \tag{12.71} \]
Since the redshift $z$ is defined in terms of the wavelengths as
\[ z = \frac{\lambda_1 - \lambda_0}{\lambda_0} = \frac{\nu_0 - \nu_1}{\nu_1} \,, \tag{12.72} \]
we find that light emitted at $t_0$ and observed at $t_1$ is redshifted by
\[ 1 + z = \frac{\lambda_1}{\lambda_0} = \frac{a(t_1)}{a(t_0)} \,. \tag{12.73} \]

Cosmological redshift
The expansion or contraction of spacetime according to Friedmann’s
equations causes the wavelength of light to be increased or decreased
in the same proportion as the universe itself expands or contracts.
We can now interpret the result (12.64) that the density of relativistic
matter drops by one power of a more than expected by mere dilution:
as the universe expands, relativistic particles are redshifted by another
factor a and thus lose energy in addition to their dilution.
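As a small illustration of (12.73) and of the interpretation of (12.64) just given, the sketch below computes the redshift from the scale factors at emission and observation, and assembles the radiation density from the dilution factor a⁻³ and the energy loss per particle a⁻¹. The function names are hypothetical; the default a = 1 at observation is an assumption of the sketch.

```python
def redshift(a_emit, a_obs=1.0):
    """Redshift z from (12.73): 1 + z = a(t1) / a(t0)."""
    return a_obs / a_emit - 1.0


def observed_wavelength(lambda_emit, a_emit, a_obs=1.0):
    """Wavelength stretched in proportion to the scale factor, cf. (12.73)."""
    return lambda_emit * a_obs / a_emit


def radiation_density(a, rho0=1.0):
    """Density of relativistic matter as dilution times redshift, cf. (12.64)."""
    dilution = a ** -3             # number density drops with the inverse volume
    energy_per_particle = a ** -1  # each particle is redshifted by one factor of a
    return rho0 * dilution * energy_per_particle


if __name__ == "__main__":
    print(redshift(0.5))                     # light emitted when a was half its present value
    print(observed_wavelength(500.0, 0.25))  # wavelength in the same units as the input
    print(radiation_density(2.0))
```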

12.4.3 Alternative forms of the metric

Before we proceed, we bring the spatial line element $dl$ from (12.40) into
a different form. We first write it in terms of spherical polar coordinates
as
\[ dl^2 = \frac{dr^2 + r^2\left(d\vartheta^2 + \sin^2\vartheta\,d\varphi^2\right)}{\left(1 + kr^2/4\right)^2} \tag{12.74} \]
and introduce a new radial coordinate $u$ defined by
\[ u = \frac{r}{1 + kr^2/4} \,. \tag{12.75} \]
Requiring that $r \approx u$ for small $r$ and $u$, we can uniquely solve (12.75) to
find
\[ r = \frac{2}{ku}\left(1 - \sqrt{1 - ku^2}\right) , \tag{12.76} \]
which implies the differential
\[ d(ru) = \frac{2u\,du}{\sqrt{1 - ku^2}} \,. \tag{12.77} \]
At the same time, (12.75) requires
\[ d(ru) = d\left(\frac{r^2}{1 + kr^2/4}\right) = \frac{2r\,dr}{\left(1 + kr^2/4\right)^2} = \frac{2u\,dr}{1 + kr^2/4} \tag{12.78} \]
and thus
\[ \frac{dr}{1 + kr^2/4} = \frac{du}{\sqrt{1 - ku^2}} \,. \tag{12.79} \]

In terms of the new radial coordinate $u$, we can thus write the spatial line
element of the metric in the frequently used form
\[ dl^2 = \frac{du^2}{1 - ku^2} + u^2\,d\Omega^2 \,, \tag{12.80} \]
where $d\Omega$ abbreviates the solid-angle element. The constant $k$ can be
positive, negative or zero, but its absolute value does not matter since it
merely scales the coordinates. Therefore, we can normalise the coordinates
such that $k = 0, \pm 1$.
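The coordinate change (12.75) and its inverse (12.76) are easy to check numerically. The sketch below is only an illustration under the assumption kr²/4 < 1, i.e. on the branch with r ≈ u for small r; the function names are our own. It also compares both sides of (12.79) with a central finite difference.

```python
import math


def u_of_r(r, k):
    """New radial coordinate u from (12.75)."""
    return r / (1.0 + k * r * r / 4.0)


def r_of_u(u, k):
    """Inverse transformation (12.76) on the branch with r ~ u for small u."""
    if k == 0.0 or u == 0.0:
        return u  # for k = 0 the two coordinates coincide
    return 2.0 / (k * u) * (1.0 - math.sqrt(1.0 - k * u * u))


def sides_of_12_79(r, k, h=1e-6):
    """Both sides of (12.79) for the small interval [r - h, r + h]."""
    dr = 2.0 * h
    du = u_of_r(r + h, k) - u_of_r(r - h, k)
    u = u_of_r(r, k)
    lhs = dr / (1.0 + k * r * r / 4.0)
    rhs = du / math.sqrt(1.0 - k * u * u)
    return lhs, rhs


if __name__ == "__main__":
    for k in (1.0, 0.0, -1.0):
        r = 0.5
        print(k, r_of_u(u_of_r(r, k), k), sides_of_12_79(r, k))
```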
Yet another form of the metric is found by introducing a radial coordinate
$w$ such that
\[ dw = \frac{du}{\sqrt{1 - ku^2}} \,. \tag{12.81} \]
Integrating both sides, we find that this is satisfied if
\[ u = f_k(w) \equiv \begin{cases} k^{-1/2}\sin\left(k^{1/2}w\right) & (k > 0) \\ w & (k = 0) \\ |k|^{-1/2}\sinh\left(|k|^{1/2}w\right) & (k < 0) \end{cases} \,. \tag{12.82} \]
? What advantages or disadvantages could alternative forms of the
Friedmann-Lemaître-Robertson-Walker (or any other) metric have?

Equivalent forms of the Robertson-Walker metric
We thus find that the homogeneous and isotropic class of cosmological
models based on Einstein’s field equations is characterised by the line
element
\[ ds^2 = -c^2dt^2 + a^2(t)\left[dw^2 + f_k^2(w)\left(d\vartheta^2 + \sin^2\vartheta\,d\varphi^2\right)\right] \tag{12.83} \]
which is equivalent to
\[ ds^2 = -c^2dt^2 + a^2(t)\left[\frac{du^2}{1 - ku^2} + u^2\left(d\vartheta^2 + \sin^2\vartheta\,d\varphi^2\right)\right] \tag{12.84} \]
with $u$ related to $w$ by (12.82), and the scale factor $a(t)$ satisfies the
Friedmann equations (12.55).
Metrics with line elements of the form (12.83) or (12.84) are called
Robertson-Walker metrics, and Friedmann-Lemaître-Robertson-Walker
metrics if their scale factor satisfies Friedmann’s equations.
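For later use it may help to have the function f_k(w) from (12.82) in executable form. This is a minimal sketch (the name `f_k` simply mirrors the notation of the notes); the helper `dw_check` is a hypothetical addition that compares a finite-difference derivative du/dw with the factor √(1 − ku²) required by (12.81).

```python
import math


def f_k(w, k):
    """Radial function u = f_k(w) from (12.82) for arbitrary curvature k."""
    if k > 0.0:
        s = math.sqrt(k)
        return math.sin(s * w) / s
    if k < 0.0:
        s = math.sqrt(-k)
        return math.sinh(s * w) / s
    return w


def dw_check(w, k, h=1e-6):
    """Return (du/dw, sqrt(1 - k u^2)); (12.81) demands that both agree."""
    dudw = (f_k(w + h, k) - f_k(w - h, k)) / (2.0 * h)
    u = f_k(w, k)
    return dudw, math.sqrt(1.0 - k * u * u)


if __name__ == "__main__":
    for k in (1.0, 0.0, -1.0):
        print(k, f_k(0.3, k), dw_check(0.3, k))
```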
Chapter 13

Two Examples of Relativistic Astrophysics

13.1 Light bundles

13.1.1 Geodesic deviation

We saw in (6.16) that the separation vector $n$ between two geodesics
out of a congruence evolves in a way determined by the equation of
geodesic deviation or Jacobi equation
\[ \nabla_u^2 n = \bar R(u, n)u \,, \tag{13.1} \]
where $u$ is the tangent vector to the geodesics and $\bar R(u, n)u$ is the curvature
as defined in (3.51).
Caution Recall that, as introduced in Chapter 6, a congruence is a bundle
of world lines in this context.
We apply this now to a light bundle, i.e. a congruence of light rays or
null geodesics propagating from a source to an observer moving with
a four-velocity $u_{\rm o}$. Let $k$ be the wave vector of the light rays, then the
frequency of the light at the observer is
\[ \omega_{\rm o} = \langle k, u_{\rm o}\rangle \,, \tag{13.2} \]
and we introduce the normalised wave vector $\tilde k = k/\omega_{\rm o}$ which satisfies
$\langle\tilde k, u_{\rm o}\rangle = 1$. Since $k$ is a null vector, so is $\tilde k$.
Next, we introduce a screen perpendicular to $k$ and to $u_{\rm o}$. It thus falls
into the local three-space of the observer, where it is perpendicular to
the light rays. Since it is two-dimensional, it can be spanned by two
orthonormal vectors $E_{1,2}$, which are parallel-transported along the light
bundle such that
\[ \nabla_k E_i = 0 = \nabla_{\tilde k} E_i \quad (i = 1, 2) \,. \tag{13.3} \]


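Notice that the parallel transport along a null geodesic implies that the
$E_i$ remain perpendicular to $\tilde k$,
\[ \nabla_{\tilde k}\langle\tilde k, E_i\rangle = \langle\nabla_{\tilde k}\tilde k, E_i\rangle + \langle\tilde k, \nabla_{\tilde k}E_i\rangle = 0 \,. \tag{13.4} \]
In a coordinate basis $\{e_\alpha\}$ and its conjugate dual basis $\{\theta^\alpha\}$, they can be
written as
\[ E_i = E_i^\alpha e_\alpha \quad\text{with}\quad E_i^\alpha = \theta^\alpha(E_i) \,. \tag{13.5} \]
The separation vector $n$ between rays of the bundle can now be expanded
into the basis $E_{1,2}$,
\[ n = n^\alpha e_\alpha = N^i E_i \,, \tag{13.6} \]
showing that its components $n^\alpha$ in the basis $\{e_\alpha\}$ are
\[ n^\alpha = \theta^\alpha(n) = \theta^\alpha(N^i E_i) = N^i E_i^\alpha \,. \tag{13.7} \]
Substituting the normalised wave vector $\tilde k$ for the four-velocity $u$ in the
Jacobi equation (13.1), we first have
\[ \nabla_{\tilde k}^2 n = \bar R(\tilde k, n)\tilde k \,. \tag{13.8} \]
Writing $n = N^i E_i$ and using (13.3), we find
\[ \nabla_{\tilde k} n = E_i\nabla_{\tilde k}N^i \,, \quad \nabla_{\tilde k}^2 n = E_i\nabla_{\tilde k}^2 N^i \,, \tag{13.9} \]
and thus
\[ E_i\nabla_{\tilde k}^2 N^i = \bar R(\tilde k, E_j)\tilde k\,N^j \,. \tag{13.10} \]
Finally, we multiply equation (13.10) with $E^i$ from the left and use the
orthonormality of the vectors $E_{1,2}$,
\[ \langle E^i, E_j\rangle = \delta^i_j \,, \tag{13.11} \]
to find the equation
\[ \nabla_{\tilde k}^2 N^i = \left\langle E^i, \bar R(\tilde k, E_j)\tilde k\right\rangle N^j \,, \tag{13.12} \]
describing how the perpendicular cross section of a light bundle changes
along the bundle.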

13.1.2 Ricci and Weyl contributions

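It will turn out convenient to introduce the Weyl tensor $\bar C$, whose
components are determined by
\[ \bar R_{\alpha\beta\gamma\delta} = \bar C_{\alpha\beta\gamma\delta} + g_{\alpha[\gamma}R_{\delta]\beta} - g_{\beta[\gamma}R_{\delta]\alpha} - \frac{R}{3}\,g_{\alpha[\gamma}g_{\delta]\beta} \,. \tag{13.13} \]
In contrast to the Riemann tensor, the Weyl tensor (representing the
Weyl curvature) is trace-free, $\bar C^\alpha{}_{\beta\alpha\delta} = 0$, but otherwise has the same
symmetries,
\[ \bar C_{\alpha\beta\gamma\delta} = -\bar C_{\beta\alpha\gamma\delta} = -\bar C_{\alpha\beta\delta\gamma} = \bar C_{\gamma\delta\alpha\beta} \,. \tag{13.14} \]
Inserting the second, third, and fourth terms on the right-hand side of
(13.13) into (13.12), we see that
\[ \begin{aligned}
g_{\alpha[\gamma}R_{\delta]\beta}\,E^{i\alpha}\tilde k^\beta\tilde k^\gamma E^\delta_j &= \frac{1}{2}\left[\langle E^i, \tilde k\rangle R(E_j, \tilde k) - \langle E^i, E_j\rangle R(\tilde k, \tilde k)\right] , \\
g_{\beta[\gamma}R_{\delta]\alpha}\,E^{i\alpha}\tilde k^\beta\tilde k^\gamma E^\delta_j &= \frac{1}{2}\left[\langle\tilde k, \tilde k\rangle R(E^i, E_j) - \langle E_j, \tilde k\rangle R(E^i, \tilde k)\right] , \\
g_{\alpha[\gamma}g_{\delta]\beta}\,E^{i\alpha}\tilde k^\beta\tilde k^\gamma E^\delta_j &= \frac{1}{2}\left[\langle E^i, \tilde k\rangle\langle E_j, \tilde k\rangle - \langle E^i, E_j\rangle\langle\tilde k, \tilde k\rangle\right] .
\end{aligned} \tag{13.15} \]
Defining further the 2 × 2 matrix $C$ with the components
\[ C^i_j := \bar C_{\alpha\beta\gamma\delta}\,E^{i\alpha}\tilde k^\beta\tilde k^\gamma E^\delta_j \,, \tag{13.16} \]
and using $\langle E^i, \tilde k\rangle = 0 = \langle\tilde k, \tilde k\rangle$ together with (13.11), we find that we can
write (13.12) as
\[ \nabla_{\tilde k}^2 N^i = \left(-\frac{1}{2}\,\delta^i_j\,R(\tilde k, \tilde k) + C^i_j\right) N^j \,. \tag{13.17} \]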

13.2 Gravitational lensing

13.2.1 The optical tidal matrix

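The evolution of the bundle’s perpendicular cross section can thus be
described by a matrix $T$,
\[ \nabla_{\tilde k}^2 \begin{pmatrix} N^1 \\ N^2 \end{pmatrix} = T \begin{pmatrix} N^1 \\ N^2 \end{pmatrix} , \tag{13.18} \]
which, according to (13.17), can be written in the form
\[ T = -\frac{1}{2}\,R(\tilde k, \tilde k)\,\mathbb{1}_2 + C \,. \tag{13.19} \]
Some further insight can be gained by extracting the trace-free part from
$T$. Since the trace is
\[ \mathrm{Tr}\,T = -R(\tilde k, \tilde k) + \mathrm{Tr}\,C \,, \tag{13.20} \]
the trace-free part of $T$ is
\[ T - \frac{1}{2}\,\mathrm{Tr}\,T\;\mathbb{1}_2 = C - \frac{1}{2}\,\mathrm{Tr}\,C\;\mathbb{1}_2 =: \Gamma \,, \tag{13.21} \]
? Why is there a factor of 1/2 in front of the trace on the left-hand
side of (13.21)?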

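where we have defined the shear matrix $\Gamma$. Notice that the symmetries
(13.14) imply that $C$ is symmetric,
\[ C_{ij} = \bar C_{\alpha\beta\gamma\delta}\,E_i^\alpha\tilde k^\beta\tilde k^\gamma E_j^\delta = \bar C_{\delta\gamma\beta\alpha}\,E_j^\delta\tilde k^\gamma\tilde k^\beta E_i^\alpha = C_{ji} \,. \tag{13.22} \]
Thus, $\Gamma$ is also symmetric and has only the two independent components
\[ \gamma_1 = \bar C_{\alpha\beta\gamma\delta}\,\tilde k^\beta\tilde k^\gamma\left(E_1^\alpha E_1^\delta - E_2^\alpha E_2^\delta\right) , \quad \gamma_2 = \bar C_{\alpha\beta\gamma\delta}\,\tilde k^\beta\tilde k^\gamma E_1^\alpha E_2^\delta \,. \tag{13.23} \]

Optical tidal matrix
Summarising, we define three scalars, the shear components $\gamma_{1,2}$ from
(13.23) and the convergence
\[ \kappa := -\frac{1}{2}\left[R(\tilde k, \tilde k) - \mathrm{Tr}\,C\right] , \tag{13.24} \]
in terms of which the matrix $T$ can be brought into the form
\[ T = \begin{pmatrix} \kappa + \gamma_1 & \gamma_2 \\ \gamma_2 & \kappa - \gamma_1 \end{pmatrix} . \tag{13.25} \]
This is called the optical tidal matrix.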


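The effect of the optical tidal matrix becomes obvious if we start with a
light bundle with circular cross section, for which the components $N^i$ of
the distance vector can be written as
\[ \begin{pmatrix} N^1 \\ N^2 \end{pmatrix} = \begin{pmatrix} \cos\varphi \\ \sin\varphi \end{pmatrix} , \tag{13.26} \]
where $\varphi$ is the polar angle on the screen spanned by the vectors $E_{1,2}$.
Before we apply the optical tidal matrix, we rotate it into its principal-
axis frame,
\[ \begin{pmatrix} \kappa + \gamma_1 & \gamma_2 \\ \gamma_2 & \kappa - \gamma_1 \end{pmatrix} \to \begin{pmatrix} \kappa + \gamma & 0 \\ 0 & \kappa - \gamma \end{pmatrix} \tag{13.27} \]
with $\gamma^2 = \gamma_1^2 + \gamma_2^2$, which shows that it maps the circle onto a curve
outlined by the vector
\[ \begin{pmatrix} x \\ y \end{pmatrix} \equiv \begin{pmatrix} (\kappa + \gamma)\cos\varphi \\ (\kappa - \gamma)\sin\varphi \end{pmatrix} . \tag{13.28} \]
This is an ellipse with semi-major axis $\kappa + \gamma$ and semi-minor axis $\kappa - \gamma$,
because obviously
\[ \frac{x^2}{(\kappa + \gamma)^2} + \frac{y^2}{(\kappa - \gamma)^2} = \cos^2\varphi + \sin^2\varphi = 1 \,. \tag{13.29} \]
Thus, for $\gamma = 0$, the originally circular cross section remains circular,
with $\kappa$ being responsible for isotropically expanding or shrinking it, while
the light bundle is elliptically deformed if $\gamma \neq 0$.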
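The decomposition into convergence and shear can be made concrete with a few lines of code. The following sketch, with illustrative function names of our own, builds the matrix (13.25), recovers κ and γ from it, and maps a point of the unit circle (13.26) onto the ellipse (13.28) in the principal-axis frame.

```python
import math


def optical_tidal_matrix(kappa, gamma1, gamma2):
    """Matrix (13.25) built from the convergence and the two shear components."""
    return [[kappa + gamma1, gamma2], [gamma2, kappa - gamma1]]


def convergence_and_shear(T):
    """Recover kappa (half the trace) and gamma = sqrt(gamma1^2 + gamma2^2)."""
    kappa = 0.5 * (T[0][0] + T[1][1])
    gamma = math.hypot(0.5 * (T[0][0] - T[1][1]), T[0][1])
    return kappa, gamma


def principal_values(T):
    """Eigenvalues kappa + gamma and kappa - gamma of the symmetric matrix, cf. (13.27)."""
    kappa, gamma = convergence_and_shear(T)
    return kappa + gamma, kappa - gamma


def circle_image(kappa, gamma, phi):
    """Image (13.28) of the point (cos phi, sin phi) in the principal-axis frame."""
    return (kappa + gamma) * math.cos(phi), (kappa - gamma) * math.sin(phi)


if __name__ == "__main__":
    T = optical_tidal_matrix(1.0, 0.3, 0.4)
    print(principal_values(T))
    x, y = circle_image(1.0, 0.5, 0.7)
    print(x**2 / 1.5**2 + y**2 / 0.5**2)  # left-hand side of (13.29)
```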

13.2.2 Homogeneous and isotropic spacetimes

In an isotropic spacetime, it must be impossible to single out preferred
directions. This implies that $\gamma = 0$ then, because otherwise the principal-
axis frame of the optical tidal matrix would break isotropy. If the
spacetime is homogeneous, this must hold everywhere, so that we can
specialise
\[ T = \kappa\,\mathbb{1}_2 \,, \tag{13.30} \]
with $\kappa$ defined in (13.24). Since the propagation equation (13.18) for the
light bundle is then isotropic, we replace $N^i = D$ for $i = 1, 2$ and write
\[ \nabla_{\tilde k}^2 D = \kappa D \,. \tag{13.31} \]
Moreover, we see that
\[ G(\tilde k, \tilde k) = \left(R - \frac{R}{2}\,g\right)(\tilde k, \tilde k) = R(\tilde k, \tilde k) \tag{13.32} \]
because $\tilde k$ is a null vector. Thus, we can put
\[ \kappa = -\frac{1}{2}\,G(\tilde k, \tilde k) = -\frac{4\pi G}{c^4}\,T(\tilde k, \tilde k) \,, \tag{13.33} \]
using Einstein’s field equations in the second step.
? Why does the cosmological constant Λ not appear in the expression
(13.33) for the convergence?
Next, we can insert the energy-momentum tensor (12.53) for a perfect
fluid,
\[ T = \left(\rho + \frac{p}{c^2}\right) u \otimes u + p\,g \,, \tag{13.34} \]
and use the fact that fundamental observers (i.e. observers for whom the
universe appears isotropic) have $u = c\,\partial_t$ and $u^\flat = -c\,dt$.
The frequency of the light measured by a hypothetical fundamental
observer moving with four-velocity $u$ and placed between the source
and the final observer is $\langle k, u\rangle$. Due to our definition of $\tilde k = k/\omega_{\rm o}$, and
because of the cosmological redshift (12.73), we can write
\[ \langle\tilde k, u\rangle = \frac{\langle k, u\rangle}{\omega_{\rm o}} = \frac{\omega}{\omega_{\rm o}} = 1 + z \,, \tag{13.35} \]
where $\omega_{\rm o}$ is the frequency measured by the final observer, and $z$ is the
redshift relative to the final observer.
Thus, we find
\[ \kappa = -\frac{4\pi G}{c^2}\left(\rho + \frac{p}{c^2}\right)(1 + z)^2 = -\frac{4\pi G}{c^2}\,\frac{\rho + p/c^2}{a^2} \,, \tag{13.36} \]
where $a$ is the scale factor of the metric inserted according to (12.73),
setting $a = 1$ at the time of observation.

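If $p \ll \rho c^2$, the density scales like $a^{-3}$ as shown in (12.62), and then
\[ \kappa = -\frac{4\pi G}{c^2}\,\rho_0\,a^{-5} \,. \tag{13.37} \]
We still need to choose a suitable affine curve parameter $\lambda$ along the
fiducial light ray. In terms of $\lambda$, the tangent vector $\tilde k$ is given by
\[ \tilde k = \frac{dx}{d\lambda} \,. \tag{13.38} \]
Since we have normalised $\tilde k$ such that $\langle\tilde k, u\rangle = 1 + z$ as shown in (13.35),
we must have
\[ \left\langle\frac{dx}{d\lambda}, u\right\rangle = \frac{dx^0}{d\lambda} = \frac{c\,dt}{d\lambda} = 1 + z = a^{-1} \,, \tag{13.39} \]
where $u = \partial_t$ was used in the second step. Thus, the curve parameter
must be related to the cosmic time by $d\lambda = ca\,dt$. Then, observing that
$da = \dot a\,dt$, we find
\[ d\lambda = ca\,dt = \frac{ca\,da}{\dot a} = \frac{c\,da}{\dot a/a} \,. \tag{13.40} \]
With this result, we can rewrite
\[ \nabla_{\tilde k} D = \tilde k^\alpha\nabla_\alpha D = \frac{dx^\alpha}{d\lambda}\,\partial_\alpha D = \frac{dD}{d\lambda} \,, \tag{13.41} \]
and the propagation equation (13.31) becomes
\[ D'' = \kappa D \,, \tag{13.42} \]
where the prime indicates the derivative with respect to the affine
parameter $\lambda$.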
Equation (13.42) can be further simplified to reveal its very intuitive
meaning. From the metric in the form (12.83), we see that radially
propagating light rays must satisfy
\[ c\,dt = \pm a\,dw \,, \tag{13.43} \]
where $w$ is the radial distance coordinate defined in (12.81). The sign
can be chosen depending on whether the distance should grow with
increasing time (i.e. into the future) or with decreasing time (i.e. into the
past), but it is irrelevant for our consideration. We choose $c\,dt = a\,dw$ and
therefore, with (13.40),
\[ d\lambda = ca\,dt = a^2\,dw \,. \tag{13.44} \]
In addition to replacing the affine parameter $\lambda$ by the comoving radial
coordinate $w$, we consider now the propagation of the comoving diameter
$D/a$, i.e. the diameter with the expansion of the universe divided out.
Substituting $dw$ for $d\lambda$, we first see that
\[ \begin{aligned}
\frac{d^2}{dw^2}\left(\frac{D}{a}\right) &= a^2\frac{d}{d\lambda}\left[a^2\frac{d}{d\lambda}\left(\frac{D}{a}\right)\right] = a^2\frac{d}{d\lambda}\left(aD' - a'D\right) \\
&= a^2\left(aD'' - a''D\right) .
\end{aligned} \tag{13.45} \]
Next, we use (13.40) to write
\[ a'' = \frac{da'}{d\lambda} = \frac{\dot a}{ca}\,\frac{da'}{da} = \frac{\dot a}{c^2 a}\,\frac{d}{da}\left(\frac{\dot a}{a}\right) = \frac{1}{2c^2}\,\frac{d}{da}\left(\frac{\dot a}{a}\right)^2 , \tag{13.46} \]
which enables us to insert Friedmann’s equation (12.55) in the form
\[ \left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G}{3}\,\frac{\rho_0}{a^3} + \frac{\Lambda c^2}{3} - \frac{kc^2}{a^2} \tag{13.47} \]
to find
\[ a'' = \frac{da'}{d\lambda} = -\frac{4\pi G}{c^2}\,\rho_0\,a^{-4} + ka^{-3} = \kappa a + ka^{-3} \,, \tag{13.48} \]
inserting $\kappa$ from (13.37).
Finally, we substitute $D'' = \kappa D$ from (13.31) and $a''$ from (13.48) into
(13.45) and obtain an intuitive result.
Propagation equation for the bundle diameter
In a spatially homogeneous and isotropic spacetime, the comoving
diameter $D/a$ of a light bundle obeys the equation
\[ \begin{aligned}
\frac{d^2}{dw^2}\left(\frac{D}{a}\right) &= a^3\kappa D - a^2 D\left(\kappa a + ka^{-3}\right) \\
&= -k\left(\frac{D}{a}\right) ,
\end{aligned} \tag{13.49} \]
which is a simple oscillator equation.
? The derivation of the behaviour of light bundles from Friedmann’s
equation suggests that it should be possible to derive Friedmann’s
equation from the behaviour of light bundles. Is it?
Equation (13.49) is now easily solved. We set the boundary conditions
such that the bundle emerges from a source point, hence $D = 0$ at the
source, and that it initially expands linearly with the radial distance $w$,
hence $d(D/a)/dw = 1$ there. Then, the solution of (13.49) is
\[ D = a\,f_k(w) = a \begin{cases} k^{-1/2}\sin\left(k^{1/2}w\right) & (k > 0) \\ w & (k = 0) \\ |k|^{-1/2}\sinh\left(|k|^{1/2}w\right) & (k < 0) \end{cases} \,, \tag{13.50} \]
with $f_k(w)$ defined in (12.82).
This shows that the diameter of the bundle increases linearly if space is
flat, diverges hyperbolically if space is negatively curved, and expands
and shrinks as a sine if space is positively curved.
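The solution (13.50) can be confirmed by integrating the oscillator equation (13.49) directly. The sketch below is an illustration with hypothetical function names: it writes y = D/a, starts from y = 0 and dy/dw = 1 at the source as stated above, and advances y'' = −k y with a classical Runge-Kutta scheme.

```python
import math


def f_k(w, k):
    """Closed-form solution (13.50) divided by a, i.e. f_k(w) from (12.82)."""
    if k > 0.0:
        s = math.sqrt(k)
        return math.sin(s * w) / s
    if k < 0.0:
        s = math.sqrt(-k)
        return math.sinh(s * w) / s
    return w


def comoving_diameter(w_end, k, steps=2000):
    """Integrate y'' = -k y with y(0) = 0, y'(0) = 1 from w = 0 to w_end (RK4)."""
    h = w_end / steps
    y, v = 0.0, 1.0  # y = D/a and v = dy/dw at the source
    for _ in range(steps):
        k1y, k1v = v, -k * y
        k2y, k2v = v + 0.5 * h * k1v, -k * (y + 0.5 * h * k1y)
        k3y, k3v = v + 0.5 * h * k2v, -k * (y + 0.5 * h * k2y)
        k4y, k4v = v + h * k3v, -k * (y + h * k3y)
        y += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return y


if __name__ == "__main__":
    for k in (1.0, 0.0, -1.0):
        print(k, comoving_diameter(1.5, k), f_k(1.5, k))
```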

13.3 The Tolman-Oppenheimer-Volkoff solution

13.3.1 Relativistic hydrostatics

We now consider a spherically symmetric, static solution of Einstein’s
field equations in presence of matter. As usual for a spherically symmetric
solution, we can work in the Schwarzschild tetrad (8.40), in which the
energy-momentum tensor of a perfect fluid,
\[ T = T_{\mu\nu}\,\theta^\mu \otimes \theta^\nu \quad\text{with}\quad T_{\mu\nu} = \left(\rho + \frac{p}{c^2}\right) u_\mu u_\nu + p\,g_{\mu\nu} \,, \tag{13.51} \]
simplifies to
\[ T_{\mu\nu} = \mathrm{diag}(\rho c^2, p, p, p) \tag{13.52} \]
because $u = u^0 e_0 = e_0$ in the static situation we are considering.
It has been shown in the In-depth box “Ideal hydrodynamics in general
relativity” on page 175 that the relativistic Euler equation is
\[ (\rho c^2 + p)\nabla_u u = -c^2\,dp^\sharp - u\nabla_u p \,, \tag{13.53} \]
which had been derived by contracting the local conservation equation
\[ \nabla \cdot T = 0 \tag{13.54} \]
with the perpendicular projection tensor $\pi_\perp = \mathbb{1}_4 + u \otimes u^\flat$.
Specialising (13.53) to our situation, we use (8.9) to see that
\[ \nabla_u u = \langle du^\alpha + u^\beta\omega^\alpha{}_\beta, u\rangle\,e_\alpha = u^0\langle\omega^1{}_0, u\rangle\,e_1 = c^2 a' e^{-b}\,e_1 \,, \tag{13.55} \]
because the only non-vanishing of the connection forms $\omega^\alpha{}_0$ for a static,
spherically symmetric spacetime is
\[ \omega^1{}_0 = a' e^{-b}\,\theta^0 \,, \tag{13.56} \]
as shown in (8.50). Moreover, in the static situation, $\nabla_u p = 0$.
The pressure gradient $\mathrm{grad}\,p = dp^\sharp$ is
\[ \mathrm{grad}\,p = dp^\sharp = p'\,dr^\sharp = p' e^{-b}\,(\theta^1)^\sharp = p' e^{-b}\,e_1 \,. \tag{13.57} \]
? Compare the relativistic hydrostatic equation to its non-relativistic
counterpart in Newtonian gravity.
Relativistic hydrostatic equation
Substituting (13.55) and (13.57) into (13.53) yields
\[ (\rho c^2 + p)\,a' = -p' \quad\Rightarrow\quad a' = -\frac{p'}{\rho c^2 + p} \,, \tag{13.58} \]
which is the relativistic hydrostatic equation.

Figure 13.1 Richard C. Tolman (1881–1948), US-American physicist.
Source: Wikipedia

13.3.2 The Tolman-Oppenheimer-Volkoff equation

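With the components of the Einstein tensor given in (8.60) and the
energy-momentum tensor (13.52), the two independent field equations
read
\[ \begin{aligned}
-\frac{1}{r^2} + e^{-2b}\left(\frac{1}{r^2} - \frac{2b'}{r}\right) &= -\frac{8\pi G}{c^2}\,\rho \\
-\frac{1}{r^2} + e^{-2b}\left(\frac{1}{r^2} + \frac{2a'}{r}\right) &= \frac{8\pi G}{c^4}\,p \,.
\end{aligned} \tag{13.59} \]
The first of these equations is equivalent to
\[ \left(r e^{-2b}\right)' = 1 - \frac{8\pi G}{c^2}\,\rho r^2 \,. \tag{13.60} \]
Integrating, and using the mass
\[ M(r) = 4\pi\int_0^r \rho(r')\,r'^2\,dr' \,, \tag{13.61} \]
shows that the function $b$ is determined by
\[ e^{-2b} = 1 - \frac{2m}{r} \,, \quad m := \frac{GM(r)}{c^2} \,. \tag{13.62} \]
If we subtract the first from the second field equation (13.59), we find
\[ \frac{2e^{-2b}}{r}\,(a' + b') = \frac{8\pi G}{c^4}\,(\rho c^2 + p) \tag{13.63} \]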

Figure 13.2 J. Robert Oppenheimer (1904–1967), US-American physicist.
Source: Wikimedia Commons

or
\[ a' = -b' + \frac{4\pi G}{c^4}\,e^{2b}\,(\rho c^2 + p)\,r \,. \tag{13.64} \]
On the other hand, (13.62) gives
\[ -2b' e^{-2b} = \frac{2m}{r^2} - \frac{2m'}{r} = \frac{2m}{r^2} - \frac{8\pi G}{c^2}\,\rho r \,, \tag{13.65} \]
or
\[ b' = \left(\frac{4\pi G}{c^2}\,\rho r - \frac{m}{r^2}\right) e^{2b} \,, \tag{13.66} \]
which allows us to write (13.64) as
\[ a' = \left(\frac{m}{r^2} + \frac{4\pi G}{c^4}\,pr\right) e^{2b} = \frac{m + 4\pi G p r^3/c^4}{r(r - 2m)} \,. \tag{13.67} \]

Tolman-Oppenheimer-Volkoff equation
But the hydrostatic equation demands (13.58), which we combine with
(13.67) to find
\[ -p' = \frac{(\rho c^2 + p)\left(m + 4\pi G p r^3/c^4\right)}{r(r - 2m)} \,. \tag{13.68} \]
This is the Tolman-Oppenheimer-Volkoff equation for the pressure
gradient in a relativistic star.

Figure 13.3 George M. Volkoff (1914–2000), Canadian physicist. Source:
Wikipedia

This equation generalises the hydrostatic Euler equation in Newtonian
physics, which reads for a spherically-symmetric configuration
\[ -p' = \frac{GM\rho}{r^2} = \frac{m\rho c^2}{r^2} \,. \tag{13.69} \]
This shows that gravity acts on $\rho c^2 + p$ instead of $\rho$ alone, the pressure
itself adds to the source of gravity, and gravity increases more strongly
than $\propto r^{-2}$ towards the centre of the star.
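The three relativistic modifications just listed can be read off by dividing (13.68) by (13.69), which factorises the ratio into (1 + p/ρc²), (1 + 4πpr³/Mc²) and (1 − 2m/r)⁻¹. The sketch below evaluates both pressure gradients and the three factors in CGS units; the rounded constants and the function names are assumptions of this illustration.

```python
import math

G = 6.674e-8  # gravitational constant in cm^3 g^-1 s^-2 (rounded, assumed value)
C = 2.998e10  # speed of light in cm s^-1 (rounded, assumed value)


def newtonian_gradient(rho, M, r):
    """-p' from the Newtonian hydrostatic equation (13.69)."""
    return G * M * rho / r**2


def tov_gradient(rho, p, M, r):
    """-p' from the Tolman-Oppenheimer-Volkoff equation (13.68), m = G M / c^2."""
    m = G * M / C**2
    return (rho * C**2 + p) * (m + 4.0 * math.pi * G * p * r**3 / C**4) / (r * (r - 2.0 * m))


def correction_factors(rho, p, M, r):
    """Factors by which (13.68) exceeds (13.69): inertia, pressure source, geometry."""
    m = G * M / C**2
    inertia = 1.0 + p / (rho * C**2)
    pressure_source = 1.0 + 4.0 * math.pi * p * r**3 / (M * C**2)
    geometry = 1.0 / (1.0 - 2.0 * m / r)
    return inertia, pressure_source, geometry


if __name__ == "__main__":
    rho, M, r = 5e14, 2e33, 1.3e6  # illustrative neutron-star-like values
    p = 0.1 * rho * C**2
    print(correction_factors(rho, p, M, r))
    print(tov_gradient(rho, p, M, r) / newtonian_gradient(rho, M, r))
```

For p → 0 and r ≫ 2m all three factors approach unity, and (13.68) reduces to (13.69).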

13.4 The mass of non-rotating neutron stars

Neutron stars are a possible end product of the evolution of massive stars.
When such stars explode as supernovae, they may leave behind an object
with a density so high that protons and electrons combine to form neutrons
in the process of inverse β decay. Objects thus form which consist of
matter with nuclear density
\[ \rho_0 \approx 5 \cdot 10^{14}\ \mathrm{g\,cm^{-3}} \,. \tag{13.70} \]

A greatly simplified, yet instructive solution to the Tolman-Oppenheimer-
Volkoff equation can be found assuming a constant density
\[ \rho(r) = \begin{cases} \rho_0 & (r \le r_0) \\ 0 & (r > r_0) \end{cases} \tag{13.71} \]
throughout the star, with $r_0$ representing the stellar radius. Introducing
the length scale
\[ \lambda_0 := \left(\frac{4\pi G}{c^2}\,\rho_0\right)^{-1/2} , \tag{13.72} \]
scaling the pressure $p$ with the central energy density
\[ q := \frac{p}{\rho_0 c^2} \tag{13.73} \]
and introducing $x := r/\lambda_0$, we can transform the Tolman-Oppenheimer-
Volkoff equation (13.68) to
\[ -\frac{dq}{dx} = \frac{(1 + q)(1 + 3q)\,x}{3 - 2x^2} \,. \tag{13.74} \]
Separating variables, setting the scaled pressure to $q = q_0$ at $x = 0$, and
adopting $q_0 = 1/3$ as appropriate for an ultrarelativistic gas, we can
integrate (13.74) to find
\[ q(x) = \frac{2 - \sqrt{9 - 6x^2}}{\sqrt{9 - 6x^2} - 6} \,. \tag{13.75} \]
? Introducing the length scale $\lambda_0$ from (13.72) and the dimension-less
pressure $q$ from (13.73) into the Tolman-Oppenheimer-Volkoff equation
(13.68), confirm (13.74) and solve it to arrive at (13.75).

Figure 13.4 Pressure profile obtained from the Tolman-Oppenheimer-
Volkoff equation for a homogeneous star. Horizontal axis: dimension-less
radius $x = r/\lambda_0$; vertical axis: dimension-less pressure $q = p/\rho_0 c^2$;
the zero of the pressure at $x_* = \sqrt{5/6}$ is marked.
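The closed-form pressure profile (13.75) can be cross-checked against a direct numerical integration of (13.74). This is a minimal sketch with function names of our own; it starts from q₀ = 1/3 at x = 0 as adopted above and uses a classical Runge-Kutta scheme, staying below x = √(3/2), where the denominator of (13.74) vanishes.

```python
import math


def q_closed_form(x):
    """Scaled pressure profile (13.75) for the central value q0 = 1/3."""
    s = math.sqrt(9.0 - 6.0 * x * x)
    return (2.0 - s) / (s - 6.0)


def dq_dx(x, q):
    """Right-hand side of the scaled Tolman-Oppenheimer-Volkoff equation (13.74)."""
    return -(1.0 + q) * (1.0 + 3.0 * q) * x / (3.0 - 2.0 * x * x)


def q_numerical(x_end, q0=1.0 / 3.0, steps=2000):
    """Integrate (13.74) from x = 0 to x_end with a fourth-order Runge-Kutta scheme."""
    h = x_end / steps
    x, q = 0.0, q0
    for _ in range(steps):
        k1 = dq_dx(x, q)
        k2 = dq_dx(x + 0.5 * h, q + 0.5 * h * k1)
        k3 = dq_dx(x + 0.5 * h, q + 0.5 * h * k2)
        k4 = dq_dx(x + h, q + h * k3)
        q += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
    return q


if __name__ == "__main__":
    for x in (0.0, 0.3, 0.6, 0.9):
        print(x, q_numerical(x) if x > 0.0 else 1.0 / 3.0, q_closed_form(x))
    print(q_closed_form(math.sqrt(5.0 / 6.0)))  # pressure at the stellar surface
```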

The pressure falls to zero at $x_* = \sqrt{5/6} \approx 0.91$, which defines the stellar
radius $r_* = x_*\lambda_0$ and a stellar mass $M_*$ of
\[ M_* = \frac{4\pi}{3}\,r_*^3\,\rho_0 \,. \tag{13.76} \]
With the nuclear density (13.70), we find
\[ \lambda_0 = 14.7\ \mathrm{km} \,, \quad r_* = 13.4\ \mathrm{km} \quad\text{and}\quad M_* = 2.5\,M_\odot \,. \tag{13.77} \]
These are approximate results obtained under simplifying assumptions,
which show however that at most a few solar masses can be stabilised by
a relativistic gas with nuclear density. Masses exceeding this limit will
collapse into black holes.
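The numbers in (13.77) follow from (13.70), (13.72) and (13.76) by straightforward arithmetic, which the following sketch reproduces in CGS units. The rounded values of G, c and the solar mass are assumptions of this illustration, so the results agree with (13.77) only to the quoted precision.

```python
import math

G = 6.674e-8      # gravitational constant in cm^3 g^-1 s^-2 (rounded, assumed value)
C = 2.998e10      # speed of light in cm s^-1 (rounded, assumed value)
M_SUN = 1.989e33  # solar mass in g (rounded, assumed value)
KM = 1.0e5        # one kilometre in cm


def tov_scales(rho0):
    """Length scale (13.72), stellar radius r* = x* lambda0 and mass (13.76)."""
    lambda0 = (4.0 * math.pi * G * rho0 / C**2) ** -0.5
    r_star = math.sqrt(5.0 / 6.0) * lambda0
    m_star = 4.0 * math.pi / 3.0 * r_star**3 * rho0
    return lambda0, r_star, m_star


if __name__ == "__main__":
    lambda0, r_star, m_star = tov_scales(5e14)  # nuclear density (13.70)
    print(lambda0 / KM, r_star / KM, m_star / M_SUN)
```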

Instead of a postface

As mentioned instead of a preface, these lectures aim at introducing the
theory of general relativity, but cannot replace a comprehensive textbook.
They can be summarised as follows:

• The main concern of the introduction in Chap. 1 is the equivalence
principle and the consequence drawn from it that the light-cone
structure, commonly expressed by the metric, needs to be flexible.
Locally, in a freely-falling reference frame, special relativity must
hold with its light-cone defined by the Minkowski metric. Since
the directions of free fall will generally differ at different locations
in spacetime, the metric needs to vary from place to place.
Sufficiently flexible spacetimes are represented by differentiable
manifolds.

• The mathematics on differentiable manifolds, i.e. differential
geometry, is thus the adequate mathematical language for general
relativity. Tangent and dual spaces provide vectors and dual vectors.
Connections, or covariant derivatives, define how vectors
can be moved along curves from one tangent space to another.
Having chosen a connection, torsion and curvature can be defined.
Chapters 2 and 3 serve this purpose.

• With these tools at hand, concepts of physics can be ported from
Minkowskian spacetime to manifolds. The essential choice here is
the identification of the line element of the metric with the proper
time interval measured by an observer. Motion of test particles
and light rays on geodesics follows from this choice, as described
in Chap. 4.

• The Lie derivative defines how objects on a manifold change
as the manifold itself is transformed. It is most important for
specifying symmetry transformations of manifolds, generated by
Killing vector fields. Differential forms allow coordinate-free
differentiation and integration on manifolds. Chapter 5 introduces
these concepts.

• Einstein’s field equations are then motivated in two ways in Chap. 6,
first via the gravitational tidal field and its relation to curvature,
second via an action principle. Lovelock’s theorems reveal the
remarkable uniqueness of the field equations derived therefrom.

• In the remainder of the lecture, several classes of solutions of
Einstein’s field equations are discussed. In Chap. 7, the field
equations are linearised, leading to the various effects of weak
gravitational fields, among them gravitational light deflection,
gravitomagnetic frame-dragging and gravitational waves. The
diffeomorphism invariance of general relativity and the gauge
freedom following from it are an important mathematical side-line
of this chapter.

• The Schwarzschild solution, its derivation, properties, its maximal
continuation, and its causal structure are the subjects of Chapters
8, 9, and 10. Chapter 11 adds charge and angular momentum to
the solution and offers a first look into the consequences.
Thermodynamics of black holes is briefly introduced there.

• Chapter 12 shows how the Friedmann equations of spatially
homogeneous and isotropic cosmology follow from Einstein’s field
equations. It thus describes the root of a specialised cosmology
lecture which typically begins with these equations and their premises.
Similarly, Chap. 13 begins with light propagation through general
space-times and later focuses on the evolution of light bundles in
Friedmann cosmologies. Cosmic gravitational lensing by large-scale
structures begins with the optical tidal matrix defined there
and is typically again the subject of more specialised lectures.
Finally, the Tolman-Oppenheimer-Volkoff equation is derived as
the general-relativistic analogue of the hydrostatic equation of
stellar models in Newtonian gravity.

If these lectures lay the foundation for studying more detailed textbooks
and reading the research literature, they serve their intended purpose.
Appendix A

Electrodynamics

A.1 Electromagnetic field tensor

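Electric and magnetic fields are components of the antisymmetric field
tensor
\[ F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu \tag{A.1} \]
formed from the four-potential
\[ A^\mu = \begin{pmatrix} \Phi \\ \vec A \end{pmatrix} . \tag{A.2} \]
The field tensor can be conveniently summarised as
\[ (F^{\mu\nu}) = \begin{pmatrix} 0 & \vec E^\top \\ -\vec E & B \end{pmatrix} \tag{A.3} \]
with
\[ B_{ij} = \epsilon_{ija} B_a \,. \tag{A.4} \]
Given the signature (−, +, +, +) of the Minkowski metric, its associated
rank-(0, 2) tensor has the components
\[ (F_{\mu\nu}) = \begin{pmatrix} 0 & -\vec E^\top \\ \vec E & B \end{pmatrix} . \tag{A.5} \]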
A.2 Maxwell’s equations

The homogeneous Maxwell equations read
\[ \partial_{[\alpha}F_{\beta\gamma]} = 0 \,. \tag{A.6} \]
For α = 0, (β, γ) = (1, 2), (1, 3) and (2, 3), this gives
\[ \dot{\vec B} + c\,\vec\nabla \times \vec E = 0 \,, \tag{A.7} \]
and for α = 1, (β, γ) = (2, 3), we find
\[ \vec\nabla \cdot \vec B = 0 \,. \tag{A.8} \]

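The inhomogeneous Maxwell equations are
\[ \partial_\nu F^{\mu\nu} = \frac{4\pi}{c}\,j^\mu \,, \tag{A.9} \]
where
\[ j^\mu = \begin{pmatrix} \rho c \\ \vec j \end{pmatrix} \tag{A.10} \]
is the four-current density. For μ = 0 and μ = i, (A.9) gives
\[ \vec\nabla \cdot \vec E = 4\pi\rho \,, \quad c\,\vec\nabla \times \vec B - \dot{\vec E} = 4\pi\vec j \,, \tag{A.11} \]
respectively.
With the definition (A.1) and the Lorenz gauge condition $\partial_\mu A^\mu = 0$, the
inhomogeneous equations (A.9) can be written as
\[ \Box A^\mu = -\frac{4\pi}{c}\,j^\mu \,, \tag{A.12} \]
where $\Box = -\partial_0^2 + \vec\nabla^2$ is the d’Alembert operator. The particular solution
of the inhomogeneous equation is given by the convolution of the source
with the retarded Green’s function
\[ G(t, t', \vec x, \vec x\,') = \frac{1}{|\vec x - \vec x\,'|}\,\delta\left(t - t' - \frac{|\vec x - \vec x\,'|}{c}\right) , \tag{A.13} \]
i.e. by
\[ A^\mu(t, \vec x) = \frac{1}{c}\int d^3x' \int dt'\,G(t, t', \vec x, \vec x\,')\,j^\mu(t', \vec x\,') \,. \tag{A.14} \]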
c

A.3 Lagrange density and energy-momentum tensor

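The Lagrange density of the electromagnetic field coupled to matter is
\[ \mathcal{L} = -\frac{1}{16\pi}\,F^{\mu\nu}F_{\mu\nu} - \frac{1}{c}\,A_\mu j^\mu \,, \tag{A.15} \]
from which Maxwell’s equations follow by the Euler-Lagrange equations,
\[ \partial^\nu\frac{\partial\mathcal{L}}{\partial(\partial^\nu A^\mu)} - \frac{\partial\mathcal{L}}{\partial A^\mu} = 0 \,. \tag{A.16} \]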

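From the Lagrange density of the free electromagnetic field,
\[ \mathcal{L} = -\frac{1}{16\pi}\,F^{\mu\nu}F_{\mu\nu} \,, \tag{A.17} \]
we find the energy-momentum tensor
\[ T^{\mu\nu} = -2\,\frac{\partial\mathcal{L}}{\partial g_{\mu\nu}} + g^{\mu\nu}\mathcal{L} = \frac{1}{4\pi}\left[F^{\mu\lambda}F^\nu{}_\lambda - \frac{1}{4}\,g^{\mu\nu}F^{\alpha\beta}F_{\alpha\beta}\right] . \tag{A.18} \]
From (A.3), we find first
\[ F^{\alpha\beta}F_{\alpha\beta} = -2\left(\vec E^2 - \vec B^2\right) , \tag{A.19} \]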

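and the energy-momentum tensor can be written as
\[ T^{\mu\nu} = \frac{1}{8\pi}\begin{pmatrix} \vec E^2 + \vec B^2 & 2\,(\vec E \times \vec B)^\top \\ 2\,\vec E \times \vec B & (\vec E^2 + \vec B^2)\,\mathbb{1}_3 - 2\left(\vec E\vec E^\top + \vec B\vec B^\top\right) \end{pmatrix} . \tag{A.20} \]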

This yields the energy density
\[ T^{00} = \frac{\vec E^2 + \vec B^2}{8\pi} \tag{A.21} \]
of the electromagnetic field and the Poynting vector
\[ c\,T^{0i} = \frac{c}{4\pi}\,\vec E \times \vec B \,. \tag{A.22} \]
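The component expressions of this appendix can be verified with a short numerical experiment. The sketch below, whose function names are illustrative, fills F^{μν} according to (A.3) and (A.4), lowers indices with the Minkowski metric of signature (−, +, +, +), and evaluates the invariant (A.19) as well as the energy-momentum tensor (A.18), from which the energy density (A.21) and the Poynting vector (A.22) can be read off.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal of the Minkowski metric, signature (-,+,+,+)


def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices taken from {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def field_tensor(E, B):
    """Contravariant components F^{mu nu} arranged as in (A.3) with (A.4)."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = E[i]
        F[i + 1][0] = -E[i]
        for j in range(3):
            F[i + 1][j + 1] = sum(levi_civita(i, j, a) * B[a] for a in range(3))
    return F


def invariant(F):
    """F^{alpha beta} F_{alpha beta}, lowering both indices with the diagonal metric."""
    return sum(F[a][b] * ETA[a] * ETA[b] * F[a][b] for a in range(4) for b in range(4))


def energy_momentum(F):
    """Energy-momentum tensor T^{mu nu} from (A.18) in flat spacetime."""
    inv = invariant(F)
    T = [[0.0] * 4 for _ in range(4)]
    for m in range(4):
        for n in range(4):
            ff = sum(F[m][l] * ETA[l] * F[n][l] for l in range(4))
            g_mn = ETA[m] if m == n else 0.0
            T[m][n] = (ff - 0.25 * g_mn * inv) / (4.0 * math.pi)
    return T


if __name__ == "__main__":
    F = field_tensor([1.0, 2.0, 3.0], [0.5, -1.0, 2.0])
    print(invariant(F))              # compare with -2 (E^2 - B^2)
    print(energy_momentum(F)[0][0])  # compare with (E^2 + B^2) / (8 pi)
```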

Appendix B

Summary of Differential Geometry

B.1 Manifold

An n-dimensional manifold $M$ is a suitably well-behaved space that is
locally homeomorphic to $\mathbb{R}^n$, i.e. that locally “looks like” $\mathbb{R}^n$.
A chart $h$, or a coordinate system, is a homeomorphism from $D \subset M$ to
$U \subset \mathbb{R}^n$,
\[ h : D \to U \,, \quad p \mapsto h(p) = (x^1, \ldots, x^n) \,, \tag{B.1} \]
i.e. it assigns an n-tuple of coordinates $\{x^i\}$ to a point $p \in D$.
An atlas is a collection of charts whose domains cover the entire manifold.
If all coordinate changes between charts of the atlas with overlapping
domains are differentiable, the manifold and the atlas themselves are
called differentiable.

B.2 Tangent and dual spaces

The tangent space $T_pM$ at a point $p \in M$ is the vector space of all
derivations. A derivation $v$ is a map from the space $\mathcal{F}_p$ of $C^\infty$ functions
in $p$ into the real numbers,
\[ v : \mathcal{F}_p \to \mathbb{R} \,, \quad f \mapsto v(f) \,. \tag{B.2} \]
A derivation is a linear map which satisfies the Leibniz rule,
\[ v(\lambda f + \mu g) = \lambda v(f) + \mu v(g) \,, \quad v(fg) = v(f)\,g + f\,v(g) \,. \tag{B.3} \]
Tangent vectors generalise directional derivatives of functions.


A coordinate basis of the tangent vector space is given by the partial
derivatives $\{\partial_i\}$. Tangent vectors can then be expanded in this basis,
\[ v = v^i\partial_i \,, \quad v(f) = v^i\partial_i f \,. \tag{B.4} \]
A dual vector $w$ is a linear map assigning a real number to a vector,
\[ w : TM \to \mathbb{R} \,, \quad v \mapsto w(v) \,. \tag{B.5} \]
The space of dual vectors to a tangent vector space $TM$ is the dual space
$T^*M$.
Specifically, the differential of a function $f \in \mathcal{F}$ is a dual vector defined
by
\[ df : TM \to \mathbb{R} \,, \quad v \mapsto df(v) = v(f) \,. \tag{B.6} \]
Accordingly, the differentials of the coordinate functions $x^i$ form a basis
$\{dx^i\}$ of the dual space which is orthonormal to the coordinate basis $\{\partial_i\}$
of the tangent space,
\[ dx^i(\partial_j) = \partial_j(x^i) = \delta^i_j \,. \tag{B.7} \]

B.3 Tensors

A tensor $t \in \mathcal{T}^r_s$ of rank $(r, s)$ is a multilinear mapping of $r$ dual vectors
and $s$ vectors into the real numbers. For example, a tensor of rank (0, 2)
is a bilinear mapping of 2 vectors into the real numbers,
\[ t : TM \times TM \to \mathbb{R} \,, \quad (x, y) \mapsto t(x, y) \,. \tag{B.8} \]
The tensor product is defined component-wise. For example, two dual
vectors $w_{1,2}$ can be multiplied to form a rank-(0, 2) tensor $w_1 \otimes w_2$
\[ (v_1, v_2) \mapsto (w_1 \otimes w_2)(v_1, v_2) = w_1(v_1)\,w_2(v_2) \,. \tag{B.9} \]
A basis for tensors of arbitrary rank is obtained by the tensor product of
suitably many elements of the bases $\{\partial_i\}$ of the tangent space and $\{dx^j\}$
of the dual space. For example, a tensor $t \in \mathcal{T}^0_2$ can be expanded as
\[ t = t_{ij}\,dx^i \otimes dx^j \,. \tag{B.10} \]
If applied to two vectors $x = x^k\partial_k$ and $y = y^l\partial_l$, the result is
\[ t(x, y) = t_{ij}\,dx^i(x^k\partial_k)\,dx^j(y^l\partial_l) = t_{ij}\,x^k y^l\,\delta^i_k\delta^j_l = t_{ij}\,x^i y^j \,. \tag{B.11} \]
The contraction of a tensor $t \in \mathcal{T}^r_s$ is defined by
\[ C : \mathcal{T}^r_s \to \mathcal{T}^{r-1}_{s-1} \,, \quad t \mapsto Ct \tag{B.12} \]
such that one of the dual vector arguments and one of the vector
arguments are filled with pairs of basis elements and summed over all pairs.
For example, the contraction of a tensor $t \in \mathcal{T}^1_1$ is
\[ Ct = t(dx^k, \partial_k) = (t^i_j\,\partial_i \otimes dx^j)(dx^k, \partial_k) = t^i_j\,\delta^k_i\delta^j_k = t^k_k \,. \tag{B.13} \]
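In a fixed coordinate basis, the operations (B.9), (B.11) and (B.13) reduce to simple index sums. The following sketch represents tensors by nested lists of components; this representation and the function names are our own illustrative choices.

```python
def tensor_product(w1, w2):
    """Components (w1 ⊗ w2)_{ij} = w1_i w2_j of the rank-(0, 2) tensor in (B.9)."""
    return [[a * b for b in w2] for a in w1]


def apply_rank02(t, x, y):
    """Evaluate t(x, y) = t_{ij} x^i y^j as in (B.11)."""
    n = len(x)
    return sum(t[i][j] * x[i] * y[j] for i in range(n) for j in range(n))


def contract_rank11(t):
    """Contraction Ct = t^k_k of a rank-(1, 1) tensor as in (B.13)."""
    return sum(t[k][k] for k in range(len(t)))


if __name__ == "__main__":
    w1, w2 = [1.0, 2.0], [3.0, 4.0]
    v1, v2 = [5.0, 6.0], [7.0, 8.0]
    # Both numbers agree: (w1 ⊗ w2)(v1, v2) = w1(v1) w2(v2)
    print(apply_rank02(tensor_product(w1, w2), v1, v2))
    print(sum(a * b for a, b in zip(w1, v1)) * sum(a * b for a, b in zip(w2, v2)))
    print(contract_rank11([[1.0, 2.0], [3.0, 4.0]]))
```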

The metric $g \in \mathcal{T}^0_2$ is a symmetric, non-degenerate tensor field of rank
(0, 2), i.e. it satisfies
\[ g(x, y) = g(y, x) \,, \quad g(x, y) = 0 \;\;\forall\, y \;\Rightarrow\; x = 0 \,. \tag{B.14} \]
The metric defines the scalar product between two vectors,
\[ \langle x, y\rangle = g(x, y) \,. \tag{B.15} \]

B.4 Covariant derivative

The covariant derivative or a connection linearly maps a pair of vectors
to a vector,
\[ \nabla : TM \times TM \to TM \,, \quad (x, y) \mapsto \nabla_x y \tag{B.16} \]
such that for a function $f \in \mathcal{F}$
\[ \nabla_{fx}\,y = f\,\nabla_x y \,, \quad \nabla_x(fy) = f\,\nabla_x y + x(f)\,y \,. \tag{B.17} \]
The covariant derivative of a function $f$ is its differential,
\[ \nabla_v f = vf = df(v) \,. \tag{B.18} \]
Due to the linearity, it is completely specified by the covariant derivatives
of the basis vectors,
\[ \nabla_{\partial_i}\partial_j = \Gamma^k_{ij}\,\partial_k \,. \tag{B.19} \]
The functions $\Gamma^k_{ij}$ are called connection coefficients or Christoffel
symbols. They are not tensors.
By means of the exponential map, so-called normal coordinates can
always be introduced locally in which the Christoffel symbols all vanish
at a given point.
The covariant derivative $\nabla y$ of a vector $y$ is a rank-(1, 1) tensor field
defined by
\[ \nabla y : T^*M \times TM \to \mathbb{R} \,, \quad \nabla y(w, v) = w(\nabla_v y) \,. \tag{B.20} \]
In components,
\[ (\nabla y)^i_j = \nabla y(dx^i, \partial_j) = dx^i(\nabla_{\partial_j}\,y^k\partial_k) = \partial_j y^i + \Gamma^i_{jk}\,y^k \,. \tag{B.21} \]
The covariant derivative of a tensor field is defined to obey the Leibniz
rule and to commute with contractions. Specifically, the covariant
derivative of a dual vector field $w \in T^*M$ is a tensor of rank (0, 2) with
components
\[ (\nabla w)_{ij} = \partial_j w_i - \Gamma^k_{ij}\,w_k \,. \tag{B.22} \]

B.5 Parallel transport and geodesics

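A curve $\gamma$ is defined as a map from some interval $I \subset \mathbb{R}$ to the manifold,
\[ \gamma : I \to M \,, \quad t \mapsto \gamma(t) \,. \tag{B.23} \]
Its tangent vector is $\dot\gamma(t)$.
A vector $v$ is said to be parallel transported along $\gamma$ if
\[ \nabla_{\dot\gamma}\,v = 0 \,. \tag{B.24} \]
A geodesic curve is defined as a curve whose tangent vector is parallel
transported along $\gamma$,
\[ \nabla_{\dot\gamma}\,\dot\gamma = 0 \,. \tag{B.25} \]
In coordinates, let $u = \dot\gamma$ be the tangent vector to $\gamma$ with components
$\dot x^i = u^i$, then
\[ \ddot x^k + \Gamma^k_{ij}\,\dot x^i\dot x^j = 0 \,. \tag{B.26} \]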

B.6 Torsion and curvature

The torsion of a connection is defined by
\[ T : TM \times TM \to TM \,, \quad (x, y) \mapsto T(x, y) = \nabla_x y - \nabla_y x - [x, y] \,. \tag{B.27} \]
It vanishes if and only if the connection is symmetric.
On a manifold $M$ with a metric $g$, a symmetric connection can always
be uniquely defined by requiring that $\nabla g = 0$. This is the Levi-Civita
connection, whose Christoffel symbols are
\[ \Gamma^i_{jk} = \frac{1}{2}\,g^{ia}\left(\partial_j g_{ak} + \partial_k g_{ja} - \partial_a g_{jk}\right) . \tag{B.28} \]
From now on, we shall assume that we are working with the Levi-Civita
connection whose torsion vanishes.
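Formula (B.28) lends itself to a direct numerical evaluation. As a hedged illustration, the sketch below approximates the partial derivatives of the metric by central finite differences and applies it to the unit two-sphere with coordinates (ϑ, φ) and metric diag(1, sin²ϑ), an example chosen here and not taken from the notes. For this metric one expects Γ^ϑ_φφ = −sin ϑ cos ϑ and Γ^φ_ϑφ = cot ϑ. The helper `inverse_2x2` restricts the sketch to two dimensions.

```python
import math


def metric_sphere(x):
    """Metric of the unit two-sphere in coordinates x = (theta, phi)."""
    theta = x[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_2x2(g):
    """Inverse of a 2 x 2 matrix (sufficient for this two-dimensional example)."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]


def christoffel(metric, x, h=1e-5):
    """Christoffel symbols gamma[i][j][k] = Gamma^i_{jk} from (B.28).

    The derivatives dg[a][j][k] = partial_a g_{jk} are central finite differences.
    """
    n = len(x)
    dg = []
    for a in range(n):
        xp, xm = list(x), list(x)
        xp[a] += h
        xm[a] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[j][k] - gm[j][k]) / (2.0 * h) for k in range(n)] for j in range(n)])
    ginv = inverse_2x2(metric(x))
    return [[[0.5 * sum(ginv[i][a] * (dg[j][a][k] + dg[k][j][a] - dg[a][j][k])
                        for a in range(n))
              for k in range(n)] for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    gamma = christoffel(metric_sphere, [1.0, 0.3])
    print(gamma[0][1][1], -math.sin(1.0) * math.cos(1.0))
    print(gamma[1][0][1], math.cos(1.0) / math.sin(1.0))
```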
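The curvature is defined by
\[ \bar R : TM \times TM \times TM \to TM \,, \quad (x, y, z) \mapsto \bar R(x, y)z = \left(\nabla_x\nabla_y - \nabla_y\nabla_x - \nabla_{[x,y]}\right) z \,. \tag{B.29} \]
The curvature or Riemann tensor $\bar R \in \mathcal{T}^1_3$ is given by
\[ \bar R : T^*M \times TM \times TM \times TM \to \mathbb{R} \,, \quad (w, x, y, z) \mapsto w[\bar R(x, y)z] \,. \tag{B.30} \]
Its components are
\[ \bar R^i{}_{jkl} = dx^i[\bar R(\partial_k, \partial_l)\partial_j] = \partial_k\Gamma^i_{jl} - \partial_l\Gamma^i_{jk} + \Gamma^a_{jl}\Gamma^i_{ak} - \Gamma^a_{jk}\Gamma^i_{al} \,. \tag{B.31} \]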

The Riemann tensor obeys three important symmetries,

$$ \bar R_{ijkl} = -\bar R_{jikl} = \bar R_{jilk} , \quad \bar R_{ijkl} = \bar R_{klij} , \tag{B.32} $$

which reduce its $4^4 = 256$ components in four dimensions to 21.


In addition, the Bianchi identities hold,
$$ \sum_{(x,y,z)} \bar R(x, y) z = 0 , \quad \sum_{(x,y,z)} \nabla_x \bar R(y, z) = 0 , \tag{B.33} $$
where the sums extend over all cyclic permutations of x, y, z. The first
Bianchi identity reduces the number of independent components of the
Riemann tensor to 20. In components, the second Bianchi identity can
be written
$$ \bar R^i{}_{j[kl;m]} = 0 , \tag{B.34} $$
where the indices in brackets need to be antisymmetrised.
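
The component counts quoted above (256, 21 and 20) can be reproduced with a short counting sketch. The function name and the way the count is organised are assumptions of this example. It uses the standard argument that the symmetries (B.32) turn the Riemann tensor into a symmetric matrix in the antisymmetric index pairs, and that the first Bianchi identity then removes one further condition for each choice of four distinct indices.

```python
from math import comb

def riemann_component_counts(n):
    """Count Riemann-tensor components in n dimensions.

    Returns (all components, after the symmetries (B.32),
             after the first Bianchi identity as well).
    """
    total = n ** 4
    pairs = comb(n, 2)                           # independent antisymmetric index pairs
    after_symmetries = pairs * (pairs + 1) // 2  # symmetric matrix in these pairs
    after_first_bianchi = after_symmetries - comb(n, 4)
    return total, after_symmetries, after_first_bianchi

print(riemann_component_counts(4))  # (256, 21, 20)
```

For general n, the last number agrees with the closed form n²(n² − 1)/12.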
The Ricci tensor is the contraction of the Riemann tensor over its first
and third indices, thus its components are

$$ R_{jl} = \bar R^i{}_{jil} = R_{lj} . \tag{B.35} $$

A further contraction yields the Ricci scalar,

$$ R = R^i{}_i . \tag{B.36} $$

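The Einstein tensor is the combination
$$ G = R - \frac{R}{2}\, g , \tag{B.37} $$
where the first R on the right-hand side is the Ricci tensor and the second
is the Ricci scalar; in components, $G_{ij} = R_{ij} - \frac{R}{2}\, g_{ij}$.
Contracting the second Bianchi identity, we find the contracted Bianchi
identity,
$$ \nabla \cdot G = 0 . \tag{B.38} $$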
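
The contraction leading to (B.38) can be sketched in components as follows. This is a reconstruction under the conventions (B.32), (B.34) and (B.35) given above, with the semicolon denoting the covariant derivative; it is meant as a reading aid rather than as the derivation used in the lecture.

```latex
\begin{align}
  0 &= \bar R^i{}_{jkl;m} + \bar R^i{}_{jlm;k} + \bar R^i{}_{jmk;l}
     && \text{(B.34) written as a cyclic sum} \\
  0 &= R_{jl;m} + \bar R^i{}_{jlm;i} - R_{jm;l}
     && \text{contract } i \text{ with } k \text{ and use (B.35)} \\
  0 &= R_{;m} - R^i{}_{m;i} - R^l{}_{m;l}
     && \text{contract with } g^{jl} \text{, using } \nabla g = 0
        \text{ and } \bar R_{ajlm} = -\bar R_{jalm} \\
  0 &= \left( R^i{}_m - \tfrac{1}{2}\, \delta^i_m R \right)_{;i}
     = G^i{}_{m;i} .
\end{align}
```

The last line is the component form of the statement that the divergence of the Einstein tensor vanishes.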

B.7 Pull-back, Lie derivative and Killing vector fields

A differentiable curve $\gamma_t(p)$ defined at every point p ∈ M defines a
diffeomorphic map $\phi_t : M \to M$. If $\dot\gamma_t = v$ for a vector field
v ∈ TM, $\phi_t$ is called the flow of v.
The pull-back of a function f defined on the target manifold at $\phi_t(p)$ is
given by
$$ (\phi_t^* f)(p) = (f \circ \phi_t)(p) . \tag{B.39} $$
This allows vectors defined at p to be pushed forward to $\phi_t(p)$ by

$$ (\phi_{t*} v)(f) = v(\phi_t^* f) = v(f \circ \phi_t) . \tag{B.40} $$



Dual vectors w can then be pulled back by

$$ (\phi_t^* w)(v) = w(\phi_{t*} v) . \tag{B.41} $$

For diffeomorphisms $\phi_t$, the pull-back and the push-forward are inverse
to each other,
$$ \phi_t^* = \phi_{t*}^{-1} . $$

As for vectors and dual vectors, the pull-back and the push-forward can
also be defined for tensors of arbitrary rank.
The Lie derivative of a tensor field T in the direction of v is given by the limit
$$ \mathcal{L}_v T = \lim_{t \to 0} \frac{\phi_t^* T - T}{t} , \tag{B.42} $$
where φt is the flow of v. The Lie derivative quantifies how a tensor
changes as the manifold is transformed by the flow of a vector field.
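The Lie derivative is linear and obeys the Leibniz rule,

$$ \mathcal{L}_x (y + z) = \mathcal{L}_x y + \mathcal{L}_x z , \quad \mathcal{L}_x (y \otimes z) = \mathcal{L}_x y \otimes z + y \otimes \mathcal{L}_x z . \tag{B.43} $$

It commutes with the contraction. Further important properties are

$$ \mathcal{L}_{x+y} = \mathcal{L}_x + \mathcal{L}_y , \quad \mathcal{L}_{\lambda x} = \lambda \mathcal{L}_x , \quad \mathcal{L}_{[x,y]} = [\mathcal{L}_x, \mathcal{L}_y] . \tag{B.44} $$

The Lie derivative of a function f is the ordinary differential,

$$ \mathcal{L}_v f = v(f) = \mathrm{d}f(v) . \tag{B.45} $$

The Lie derivative and the differential commute,

$$ \mathcal{L}_v \mathrm{d}f = \mathrm{d}\mathcal{L}_v f . \tag{B.46} $$

The Lie derivative of a vector x is the commutator

$$ \mathcal{L}_v x = [v, x] . \tag{B.47} $$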

By its commutation with contractions and the Leibniz rule, the Lie
derivative of a dual vector w turns out to be

$$ (\mathcal{L}_x w)(v) = x[w(v)] - w([x, v]) . \tag{B.48} $$

Lie derivatives of arbitrary tensors can be similarly derived. For example,
if $g \in \mathcal{T}^0_2$, we find

$$ (\mathcal{L}_x g)(v_1, v_2) = x[g(v_1, v_2)] - g([x, v_1], v_2) - g(v_1, [x, v_2]) \tag{B.49} $$

with $v_{1,2} \in TM$.
Killing vector fields K define isometries of the metric, i.e. the metric
does not change under the flow of K. This implies the Killing equation

$$ \mathcal{L}_K g = 0 \quad \Rightarrow \quad \nabla_i K_j + \nabla_j K_i = 0 . \tag{B.50} $$
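
The Killing condition can be checked numerically for a concrete metric. The sketch below evaluates the coordinate form of (B.49) for coordinate basis vectors, (L_K g)_ij = K^a ∂_a g_ij + g_aj ∂_i K^a + g_ia ∂_j K^a, with central finite differences on the unit two-sphere. The choice of manifold, the three test fields, the step size and all function names are assumptions made for this illustration.

```python
import math

def sphere_metric(x):
    """Metric components of the unit 2-sphere in coordinates x = (theta, phi)."""
    theta, _ = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def partial(func, x, a, h=1e-5):
    """Central finite difference of a scalar function along coordinate a."""
    forward = list(x)
    backward = list(x)
    forward[a] += h
    backward[a] -= h
    return (func(forward) - func(backward)) / (2.0 * h)

def lie_derivative_of_metric(metric, field, x):
    """Components (L_K g)_ij at the point x, from (B.49) with v1, v2 coordinate basis vectors."""
    n = len(x)
    g = metric(x)
    K = field(x)
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            value = 0.0
            for a in range(n):
                value += K[a] * partial(lambda y: metric(y)[i][j], x, a)
                value += g[a][j] * partial(lambda y: field(y)[a], x, i)
                value += g[i][a] * partial(lambda y: field(y)[a], x, j)
            result[i][j] = value
    return result

def rotation_z(x):
    """Generator of rotations about the polar axis: K = d/dphi."""
    return [0.0, 1.0]

def rotation_x(x):
    """Generator of rotations about an axis in the equatorial plane."""
    theta, phi = x
    return [-math.sin(phi), -math.cos(theta) / math.sin(theta) * math.cos(phi)]

def polar_shift(x):
    """K = d/dtheta, which is not an isometry of the sphere."""
    return [1.0, 0.0]

point = [1.1, 0.7]
for name, field in [("rotation_z", rotation_z),
                    ("rotation_x", rotation_x),
                    ("polar_shift", polar_shift)]:
    print(name, lie_derivative_of_metric(sphere_metric, field, point))
```

For the two rotation generators all components should vanish up to the finite-difference error, while the shift along theta leaves a non-zero (phi, phi) component and is therefore not a Killing field.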

B.8 Differential forms


Differential p-forms $\omega \in \bigwedge^p$ are totally antisymmetric tensor fields of
rank (0, p). Their components satisfy
$$ \omega_{i_1 \ldots i_p} = \omega_{[i_1 \ldots i_p]} . \tag{B.51} $$

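The exterior product ∧ is defined by
$$ \wedge : \bigwedge^p \times \bigwedge^q \to \bigwedge^{p+q} , \quad (\omega, \eta) \mapsto \omega \wedge \eta = \frac{(p+q)!}{p!\, q!}\, \mathcal{A}(\omega \otimes \eta) , \tag{B.52} $$
where $\mathcal{A}$ is the alternation operator
$$ (\mathcal{A} t)(v_1, \ldots, v_p) = \frac{1}{p!} \sum_\pi \mathrm{sgn}(\pi)\, t(v_{\pi(1)}, \ldots, v_{\pi(p)}) . \tag{B.53} $$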
On the vector space $\bigwedge$ of differential forms, the wedge product defines
an associative, skew-commutative Grassmann algebra,
$$ \omega \wedge \eta = (-1)^{pq}\, \eta \wedge \omega , \tag{B.54} $$
with $\omega \in \bigwedge^p$ and $\eta \in \bigwedge^q$.
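
The definitions (B.52) and (B.53) translate almost literally into code if forms are represented as functions of vectors. The following sketch does this for forms on R³. The representation of forms as Python callables, the helper names and the test vectors are assumptions made for this example.

```python
from itertools import permutations
from math import factorial

def permutation_sign(perm):
    """Sign of a permutation given as a tuple of positions (counts inversions)."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

def alternate(t, rank):
    """Alternation operator (B.53) acting on a multilinear function t of `rank` vectors."""
    def alternated(*vectors):
        total = 0.0
        for perm in permutations(range(rank)):
            total += permutation_sign(perm) * t(*(vectors[k] for k in perm))
        return total / factorial(rank)
    return alternated

def wedge(omega, p, eta, q):
    """Exterior product (B.52) of a p-form omega and a q-form eta."""
    def tensor_product(*vectors):
        return omega(*vectors[:p]) * eta(*vectors[p:])
    prefactor = factorial(p + q) / (factorial(p) * factorial(q))
    alternated = alternate(tensor_product, p + q)
    return lambda *vectors: prefactor * alternated(*vectors)

def one_form(components):
    """1-form with the given components, acting on a vector by contraction."""
    return lambda v: sum(c * vi for c, vi in zip(components, v))

dx1, dx2, dx3 = one_form((1, 0, 0)), one_form((0, 1, 0)), one_form((0, 0, 1))
u, v, w = (1.0, 2.0, 0.5), (0.0, 1.0, -1.0), (2.0, 0.0, 1.0)

dx1_dx2 = wedge(dx1, 1, dx2, 1)
dx2_dx1 = wedge(dx2, 1, dx1, 1)
volume = wedge(dx1_dx2, 2, dx3, 1)

print(dx1_dx2(u, v), dx2_dx1(u, v))  # equal magnitude, opposite sign, cf. (B.54)
print(volume(u, v, w))               # determinant of the matrix with rows u, v, w
```

With the prefactor chosen as in (B.52), the wedge product of the three coordinate differentials evaluates to the determinant of its arguments, which is the usual normalisation of the volume element.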
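A basis for the p-forms is
$$ \mathrm{d}x^{i_1} \wedge \ldots \wedge \mathrm{d}x^{i_p} \quad \text{with} \quad i_1 < \ldots < i_p , \tag{B.55} $$
which shows that the dimension of $\bigwedge^p$ is
$$ \dim \bigwedge^p = \binom{n}{p} . \tag{B.56} $$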

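The interior product $i_v$ is defined by

$$ i : TM \times \bigwedge^p \to \bigwedge^{p-1} , \quad (v, \omega) \mapsto i_v(\omega) = \omega(v, \ldots) . \tag{B.57} $$
In components, the interior product is given by
$$ (i_v \omega)_{i_2 \ldots i_p} = v^j \omega_{j i_2 \ldots i_p} . \tag{B.58} $$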

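The exterior derivative turns p-forms ω into (p + 1)-forms dω,

$$ \mathrm{d} : \bigwedge^p \to \bigwedge^{p+1} , \quad \omega \mapsto \mathrm{d}\omega = \sum_{i_1 < \ldots < i_p} \mathrm{d}\omega_{i_1 \ldots i_p} \wedge \mathrm{d}x^{i_1} \wedge \ldots \wedge \mathrm{d}x^{i_p} . \tag{B.59} $$
Accordingly, the components of the exterior derivative are given by
partial derivatives,
$$ (\mathrm{d}\omega)_{i_1 \ldots i_{p+1}} = (p + 1)\, \partial_{[i_1} \omega_{i_2 \ldots i_{p+1}]} . \tag{B.60} $$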

A differential form α is called exact if a differential form β exists such
that α = dβ. It is called closed if dα = 0.
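
A short worked case may help to connect (B.60) with these two notions. The example below, for p = 1, is an addition for illustration; it assumes a smooth function f so that second partial derivatives commute.

```latex
% p = 1 in (B.60): components of the exterior derivative of a 1-form
\begin{align}
  \omega &= \omega_j\, \mathrm{d}x^j , &
  (\mathrm{d}\omega)_{ij} &= 2\, \partial_{[i}\, \omega_{j]}
    = \partial_i \omega_j - \partial_j \omega_i . \\
% Exact 1-form \omega = \mathrm{d}f with components \omega_j = \partial_j f
  \omega &= \mathrm{d}f , &
  (\mathrm{d}\,\mathrm{d}f)_{ij} &= \partial_i \partial_j f - \partial_j \partial_i f = 0 .
\end{align}
```

Every exact 1-form is thus closed. The same holds for forms of any degree, since applying the exterior derivative twice always gives zero.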

B.9 Cartan’s structure equations

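Let $\{e_i\}$ be an arbitrary basis and $\{\theta^i\}$ its dual basis such that

$$ \langle \theta^i, e_j \rangle = \delta^i_j . \tag{B.61} $$

The connection forms $\omega^i{}_j \in \bigwedge^1$ are defined by

$$ \nabla_v e_i = \omega^j{}_i(v)\, e_j . \tag{B.62} $$

In terms of Christoffel symbols, they can be expressed as

$$ \omega^i{}_j = \Gamma^i_{kj}\, \theta^k . \tag{B.63} $$

They satisfy the antisymmetry relation

$$ \mathrm{d}g_{ij} = \omega_{ij} + \omega_{ji} , \tag{B.64} $$

which reduces to $\omega_{ij} = -\omega_{ji}$ in any basis in which the metric
components $g_{ij}$ are constant, for example an orthonormal one.

The covariant derivative of a dual basis vector is

$$ \nabla_v \theta^i = -\omega^i{}_j(v)\, \theta^j . \tag{B.65} $$

Covariant derivatives of arbitrary vectors x and dual vectors α are then
given by

$$ \nabla_v x = \langle \mathrm{d}x^i + x^j \omega^i{}_j , v \rangle\, e_i , \quad \nabla_v \alpha = \langle \mathrm{d}\alpha_i - \alpha_j \omega^j{}_i , v \rangle\, \theta^i \tag{B.66} $$

or

$$ \nabla x = e_i \otimes (\mathrm{d}x^i + x^j \omega^i{}_j) , \quad \nabla \alpha = \theta^i \otimes (\mathrm{d}\alpha_i - \alpha_j \omega^j{}_i) . \tag{B.67} $$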

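Torsion and curvature are expressed by the torsion 2-form $\Theta^i \in \bigwedge^2$ and
the curvature 2-form $\Omega^i{}_j \in \bigwedge^2$ as

$$ T(x, y) = \Theta^i(x, y)\, e_i , \quad \bar R(x, y) e_i = \Omega^j{}_i(x, y)\, e_j . \tag{B.68} $$

The torsion and curvature forms are related to the connection forms and
the dual basis vectors by Cartan's structure equations

$$ \Theta^i = \mathrm{d}\theta^i + \omega^i{}_k \wedge \theta^k , \quad \Omega^i{}_j = \mathrm{d}\omega^i{}_j + \omega^i{}_k \wedge \omega^k{}_j . \tag{B.69} $$

The components of the torsion and curvature tensors are determined by

$$ \Theta^i = T^i{}_{jk}\, \theta^j \wedge \theta^k , \quad \Omega^i{}_j = \bar R^i{}_{jkl}\, \theta^k \wedge \theta^l . \tag{B.70} $$

Here the sums run over j < k and k < l only, as required for consistency
with (B.52) and (B.68); an unrestricted sum over both indices would need an
additional factor 1/2.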
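
As a compact worked example of the structure equations (B.69), consider again the unit two-sphere, now with an orthonormal dual basis. The example is an addition for illustration. It assumes the torsion-free, metric (Levi-Civita) connection, so that the connection forms are antisymmetric in an orthonormal basis, and it writes the coordinates as $(\vartheta, \varphi)$ to avoid a clash with the basis forms $\theta^i$.

```latex
% Orthonormal dual basis on the unit two-sphere and its exterior derivatives
\begin{align}
  \theta^1 &= \mathrm{d}\vartheta , &
  \theta^2 &= \sin\vartheta\, \mathrm{d}\varphi , \\
  \mathrm{d}\theta^1 &= 0 , &
  \mathrm{d}\theta^2 &= \cos\vartheta\, \mathrm{d}\vartheta \wedge \mathrm{d}\varphi .
\end{align}
% Vanishing torsion, \Theta^i = 0, requires
% \mathrm{d}\theta^i = -\omega^i{}_k \wedge \theta^k, which is solved by
\begin{equation}
  \omega^2{}_1 = \cos\vartheta\, \mathrm{d}\varphi = -\omega^1{}_2 .
\end{equation}
% The second structure equation then gives the curvature form
\begin{equation}
  \Omega^1{}_2 = \mathrm{d}\omega^1{}_2 + \omega^1{}_k \wedge \omega^k{}_2
               = \sin\vartheta\, \mathrm{d}\vartheta \wedge \mathrm{d}\varphi
               = \theta^1 \wedge \theta^2 .
\end{equation}
```

Evaluating $\Omega^1{}_2$ on $(e_1, e_2)$ gives $\bar R^1{}_{212} = 1$ in this basis, the expected constant curvature of the unit sphere.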



B.10 Differential operators and integration

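The Hodge star operator turns a p-form into an (n − p)-form,

$$ * : \bigwedge^p \to \bigwedge^{n-p} , \quad \omega \mapsto *\omega . \tag{B.71} $$
If $\{e^i\}$ is an orthonormal basis of the dual space, the Hodge star operator
is uniquely defined by
$$ *(e^{i_1} \wedge \ldots \wedge e^{i_p}) = e^{i_{p+1}} \wedge \ldots \wedge e^{i_n} , \tag{B.72} $$
where the indices $i_1 \ldots i_n$ appear in their natural order or an even
permutation thereof (in three dimensions, these are the cyclic
permutations). For example, the coordinate differentials $\{\mathrm{d}x^i\}$ are an
orthonormal dual basis in $\mathbb{R}^3$, and
$$ *\mathrm{d}x^1 = \mathrm{d}x^2 \wedge \mathrm{d}x^3 , \quad *\mathrm{d}x^2 = \mathrm{d}x^3 \wedge \mathrm{d}x^1 , \quad *\mathrm{d}x^3 = \mathrm{d}x^1 \wedge \mathrm{d}x^2 . \tag{B.73} $$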
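
The rule (B.72) can be turned into a small routine for basis forms. The sketch below assumes Euclidean signature and labels the basis forms by index tuples; the function names and the return convention (a sign together with the complementary indices in ascending order) are choices made for this example.

```python
def permutation_sign(perm):
    """Sign of a permutation given as a tuple of distinct integers (counts inversions)."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

def hodge_basis(indices, n):
    """Hodge dual of dx^{i_1} ^ ... ^ dx^{i_p} in Euclidean R^n.

    Returns (sign, complement), meaning sign * dx^{j_1} ^ ... ^ dx^{j_(n-p)}
    with the complementary indices j in ascending order.
    """
    complement = tuple(k for k in range(1, n + 1) if k not in indices)
    sign = permutation_sign(tuple(indices) + complement)
    return sign, complement

for k in (1, 2, 3):
    print(k, hodge_basis((k,), 3))
```

For k = 2 the routine returns (-1, (1, 3)), that is −dx¹ ∧ dx³ = dx³ ∧ dx¹, in agreement with (B.73).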

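The codifferential is a differentiation lowering the order of a p-form by
one,
$$ \delta : \bigwedge^p \to \bigwedge^{p-1} , \quad \omega \mapsto \delta\omega , \tag{B.74} $$
which is defined by
$$ \delta\omega = \mathrm{sgn}(g)\, (-1)^{n(p+1)}\, (*\mathrm{d}*)\, \omega . \tag{B.75} $$
It generalises the divergence of a vector field and thus has the components
$$ (\delta\omega)^{i_2 \ldots i_p} = \frac{1}{\sqrt{|g|}}\, \partial_{i_1} \left( \sqrt{|g|}\, \omega^{i_1 i_2 \ldots i_p} \right) . \tag{B.76} $$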

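The Laplace-de Rham operator
$$ \mathrm{d} \circ \delta + \delta \circ \mathrm{d} \tag{B.77} $$
generalises the Laplace operator.
The canonical volume form is an n-form given by

$$ \eta = \sqrt{|g|}\, \mathrm{d}x^1 \wedge \ldots \wedge \mathrm{d}x^n . \tag{B.78} $$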

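The integration of n-forms $\omega = f(x^1, \ldots, x^n)\, \mathrm{d}x^1 \wedge \ldots \wedge \mathrm{d}x^n$ over domains
D ⊂ M is defined by
$$ \int_D \omega = \int_D f(x^1, \ldots, x^n)\, \mathrm{d}x^1 \ldots \mathrm{d}x^n . \tag{B.79} $$
Functions f are integrated by means of the canonical volume form,
$$ \int_D f \eta = \int_D f \sqrt{|g|}\, \mathrm{d}x^1 \ldots \mathrm{d}x^n . \tag{B.80} $$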
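
As a simple numerical check of (B.80), the following sketch integrates the function f = 1 over the unit two-sphere, where the square root of |g| equals sin(theta), and compares the result with the known area 4π. The midpoint rule, the grid resolution and the function names are assumptions of this example.

```python
import math

def integrate_with_volume_form(f, sqrt_abs_g, bounds, cells=400):
    """Midpoint-rule approximation of (B.80) on a two-dimensional coordinate rectangle."""
    (a1, b1), (a2, b2) = bounds
    h1 = (b1 - a1) / cells
    h2 = (b2 - a2) / cells
    total = 0.0
    for i in range(cells):
        x1 = a1 + (i + 0.5) * h1
        for j in range(cells):
            x2 = a2 + (j + 0.5) * h2
            total += f(x1, x2) * sqrt_abs_g(x1, x2)
    return total * h1 * h2

area = integrate_with_volume_form(
    lambda theta, phi: 1.0,              # function to integrate
    lambda theta, phi: math.sin(theta),  # sqrt(|g|) for the metric diag(1, sin^2 theta)
    ((0.0, math.pi), (0.0, 2.0 * math.pi)),
)
print(area, 4.0 * math.pi)
```

The two printed numbers should agree up to the discretisation error of the midpoint rule.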

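The theorems of Stokes and Gauss can be expressed as
$$ \int_D \mathrm{d}\alpha = \int_{\partial D} \alpha , \quad \int_D \delta v\, \eta = \int_{\partial D} *v , \tag{B.81} $$
where $\alpha \in \bigwedge^{n-1}$ is an (n − 1)-form and v ∈ TM is a vector field.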
Appendix C

Penrose-Carter diagrams

For studying the causal structure of spacetimes in particular, a
compactification has proven useful whose graphical representation has
become known as the Penrose-Carter diagram. We first demonstrate its
construction using the example of Minkowski spacetime. It proceeds in
three steps:

1. We assume a spherically-symmetric spacetime and consider the
time t and the radial coordinate r only. We transform (ct, r) to null
coordinates (ũ, ṽ) by

$$ (ct, r) \to (\tilde u, \tilde v) , \quad \tilde u := ct - r , \quad \tilde v := ct + r . \tag{C.1} $$

With t ∈ (−∞, ∞) and r ∈ [0, ∞), we have ũ, ṽ ∈ (−∞, ∞) and
ũ ≤ ṽ. The line element of the metric then transforms to
$$ \mathrm{d}s^2 = -c^2 \mathrm{d}t^2 + \mathrm{d}r^2 + r^2 \mathrm{d}\Omega^2 = -\mathrm{d}\tilde u\, \mathrm{d}\tilde v + \frac{1}{4} (\tilde v - \tilde u)^2\, \mathrm{d}\Omega^2 . \tag{C.2} $$

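2. Next, we map the null coordinates to compact intervals by the
transform

$$ (\tilde u, \tilde v) \to (U, V) , \quad U := \arctan \tilde u , \quad V := \arctan \tilde v . \tag{C.3} $$

With
$$ \mathrm{d}\tilde u = \frac{\mathrm{d}U}{\cos^2 U} , \quad \mathrm{d}\tilde v = \frac{\mathrm{d}V}{\cos^2 V} , \quad \tilde v - \tilde u = \frac{\sin(V - U)}{\cos U \cos V} , \tag{C.4} $$
the line element turns into
$$ \mathrm{d}s^2 = \frac{1}{\cos^2 U \cos^2 V} \left[ -\mathrm{d}U \mathrm{d}V + \frac{1}{4} \sin^2(V - U)\, \mathrm{d}\Omega^2 \right] . \tag{C.5} $$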

3. Finally, we return to time- and space-like coordinates (T, R)
defined by
$$ T := V + U , \quad R := V - U . \tag{C.6} $$

Using
$$ U = \frac{1}{2} (T - R) , \quad V = \frac{1}{2} (T + R) , \tag{C.7} $$
we find
$$ \mathrm{d}U \mathrm{d}V = \frac{1}{4} \left( \mathrm{d}T^2 - \mathrm{d}R^2 \right) \tag{C.8} $$
and thus the line element
$$ \mathrm{d}s^2 = \omega^{-2}(T, R) \left( -\mathrm{d}T^2 + \mathrm{d}R^2 + \sin^2 R\, \mathrm{d}\Omega^2 \right) \tag{C.9} $$

with the conformal factor

$$ \omega(T, R) = 2 \cos U \cos V = \cos T + \cos R . \tag{C.10} $$

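With ũ, ṽ ∈ R and ũ ≤ ṽ, the compactified coordinates (U, V) obey
$$ U \in \left( -\frac{\pi}{2}, \frac{\pi}{2} \right) , \quad V \in \left( -\frac{\pi}{2}, \frac{\pi}{2} \right) , \quad U \le V , \tag{C.11} $$
which implies
$$ R \in [0, \pi) , \quad |T| + R \in [0, \pi) . \tag{C.12} $$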
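
The three steps above can be sketched as a short routine that maps (ct, r) to the compactified coordinates (T, R) and evaluates the conformal factor (C.10). The function names and the sample points are assumptions of this example, and ct and r are taken to be measured in one common, arbitrary unit of length.

```python
import math

def compactify(ct, r):
    """Map (ct, r) to (T, R) following (C.1), (C.3) and (C.6)."""
    u_tilde = ct - r          # null coordinates, (C.1)
    v_tilde = ct + r
    U = math.atan(u_tilde)    # compactification, (C.3)
    V = math.atan(v_tilde)
    return V + U, V - U       # time- and space-like coordinates, (C.6)

def conformal_factor(T, R):
    """Conformal factor omega(T, R) from (C.10)."""
    return math.cos(T) + math.cos(R)

samples = [(0.0, 0.0), (0.0, 1.0), (5.0, 2.0), (-3.0, 10.0), (1e3, 1e6)]
for ct, r in samples:
    T, R = compactify(ct, r)
    print(f"ct={ct:g}, r={r:g} -> T={T:.6f}, R={R:.6f}, "
          f"|T|+R={abs(T) + R:.6f}, omega={conformal_factor(T, R):.3e}")
```

All sample points should satisfy the bounds (C.12), and points at large radius and fixed time approach (T, R) = (0, π), where the conformal factor tends to zero.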

Figure C.1 Penrose-Carter diagram of Minkowski spacetime, drawn in the
(T, R) plane with the boundary points and lines i⁺, J⁺, i⁰, J⁻ and i⁻
marked. The light-gray curves are lines of constant radius (running from
i⁻ to i⁺) and lines of constant time (emerging from i⁰).

The following points and lines are particularly important for the causal
structure of the spacetime:

• i⁺: Future time-like infinity, i.e. (T, R) = (π, 0);

• i⁰: Spatial infinity, i.e. (T, R) = (0, π);

• i⁻: Past time-like infinity, i.e. (T, R) = (−π, 0);

• J⁺: Future null infinity, i.e. T = π − R, 0 < R < π;

• J⁻: Past null infinity, i.e. T = R − π, 0 < R < π.

Lines of constant radius emerge from i⁻ and end in i⁺, while lines of
constant time all end in i⁰.
The Penrose-Carter diagram of Minkowski spacetime is shown in Fig. C.1.
In addition, Fig. C.2 shows the Penrose-Carter diagram for the Kruskal
extension of the Schwarzschild spacetime. There, the four domains
of this extension are marked with Roman numerals, as discussed in
Chap. 10 and indicated in Fig. 10.3.

Figure C.2 Penrose-Carter diagram for the Kruskal extension of the
Schwarzschild spacetime. The four different domains of the Kruskal
extension are indicated by Roman numerals (I to IV) as in Fig. 10.3. The
diagram also marks the lines r = 0 at the top and bottom and the
infinities i⁺, i⁰, i⁻, J⁺ and J⁻ on both sides.
Einstein's theory of general relativity is still the valid theory of gravity
and has been confirmed by numerous tests and measurements. It
is built upon simple principles and relates the geometry of spacetime
to its matter-energy content. These lecture notes begin by
introducing the physical principles and by preparing the necessary
mathematical tools taken from differential geometry. Beginning with
Einstein's field equations, which are introduced in two different
ways in the lecture, the motion of test particles in a gravitational
field is then discussed, and it is shown how the properties of weak
gravitational fields follow from the field equations. Solutions for
compact objects and black holes are derived and discussed as well as
cosmological models. Two applications of general relativity to
astrophysics conclude the lecture notes.

About the Author


Matthias Bartelmann is professor of theoretical astrophysics at
Heidelberg University. He mostly addresses cosmological questions,
in particular those concerning the formation and evolution of cosmic
structures.

ISBN 978-3-947732-60-9
