A Student’s Guide to General Relativity
This compact guide presents the key features of General Relativity, to support and
supplement the presentation in mainstream, more comprehensive undergraduate
textbooks, or as a recap of essentials for graduate students pursuing more advanced
studies. It helps students plot a careful path to understanding the core ideas and basic
techniques of differential geometry, as applied to General Relativity, without
overwhelming them. While the guide doesn’t shy away from necessary technicalities,
it emphasizes the essential simplicity of the main physical arguments. Presuming a
familiarity with Special Relativity (with a brief account in an appendix), it describes
how general covariance and the equivalence principle motivate Einstein’s theory of
gravitation. It then introduces differential geometry and the covariant derivative as the
mathematical technology which allows us to understand Einstein’s equations of
General Relativity. The book is supported by numerous worked examples and
exercises, and important applications of General Relativity are described in an
appendix.
NORMAN GRAY
University of Glasgow
University Printing House, Cambridge CB2 8BS, United Kingdom
One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi – 110025, India
79 Anson Road, #06–04/06, Singapore 079906
www.cambridge.org
Information on this title: www.cambridge.org/9781107183469
DOI: 10.1017/9781316869659
© Norman Gray 2019
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2019
Printed in the United Kingdom by TJ International Ltd., Padstow, Cornwall
A catalogue record for this publication is available from the British Library.
Library of Congress Cataloging-in-Publication Data
Names: Gray, Norman, 1964– author.
Title: A student’s guide to general relativity / Norman Gray (University of Glasgow).
Description: Cambridge, United Kingdom ; New York, NY : Cambridge University
Press, 2018. | Includes bibliographical references and index.
Identifiers: LCCN 2018016126 | ISBN 9781107183469 (hardback ; alk. paper) |
ISBN 1107183464 (hardback ; alk. paper) | ISBN 9781316634790 (pbk. ; alk. paper) |
ISBN 1316634795 (pbk.; alk. paper)
Subjects: LCSH: General relativity (Physics)
Classification: LCC QC173.6 .G732 2018 | DDC 530.11–dc23
LC record available at https://lccn.loc.gov/2018016126
ISBN 978-1-107-18346-9 Hardback
ISBN 978-1-316-63479-0 Paperback
Additional resources for this publication at www.cambridge.org/9781107183469
Cambridge University Press has no responsibility for the persistence or accuracy
of URLs for external or third-party internet websites referred to in this publication
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
Before thir eyes in sudden view appear
The secrets of the hoarie deep, a dark
Illimitable Ocean without bound,
Without dimension, where length, breadth, & highth,
And time and place are lost;
[. . . ]
Into this wilde Abyss,
The Womb of nature and perhaps her Grave,
Of neither Sea, nor Shore, nor Air, nor Fire,
But all these in thir pregnant causes mixt
Confus’dly, and which thus must ever fight,
Unless th’ Almighty Maker them ordain
His dark materials to create more Worlds,
Into this wild Abyss the warie fiend
Stood on the brink of Hell and look’d a while,
Pondering his Voyage: for no narrow frith
He had to cross.
John Milton, Paradise Lost, II, 890–920
Preface page ix
Acknowledgements xii
1 Introduction 1
1.1 Three Principles 1
1.2 Some Thought Experiments on Gravitation 6
1.3 Covariant Differentiation 11
1.4 A Few Further Remarks 12
Exercises 16
2 Vectors, Tensors, and Functions 18
2.1 Linear Algebra 18
2.2 Tensors, Vectors, and One-Forms 20
2.3 Examples of Bases and Transformations 36
2.4 Coordinates and Spaces 41
Exercises 42
3 Manifolds, Vectors, and Differentiation 45
3.1 The Tangent Vector 45
3.2 Covariant Differentiation in Flat Spaces 52
3.3 Covariant Differentiation in Curved Spaces 59
3.4 Geodesics 64
3.5 Curvature 67
Exercises 75
4 Energy, Momentum, and Einstein’s Equations 84
4.1 The Energy-Momentum Tensor 85
4.2 The Laws of Physics in Curved Space-time 93
4.3 The Newtonian Limit 102
Exercises 108
of courses: this one was ‘the maths half’, which provided most of the maths
required for its partner, which focused on various applications of Einstein’s
equations to the study of gravity. The course was a compulsory one for most of
its audience: with a smaller, self-selecting class, it might be possible to cover
the material in less time, by compressing the middle chapters, or assigning
readings; with a larger class and a more leisurely pace, we could happily
spend a lot more time at the beginning and end, discussing the motivation and
applications.
In adapting this course into a book, I have resisted the temptation to expand
the text at each end. There are already many excellent but heavy tomes
on GR – I discuss a few of them in Section 1.4.2 – and I think I would
add little to the sum of world happiness by adding another. There are also
shorter treatments, but they are typically highly mathematical ones, which
don’t amuse everyone. Relativity, more than most topics, benefits from your
reading multiple introductions, and I hope that this book, in combination with
one or other of the mentioned texts, will form one of the building blocks in
your eventual understanding of the subject.
As readers of any book like this will know, a lecture course has a point,
which is either the exam at the end, or another course that depends on it. This
book doesn’t have an exam, but in adapting it I have chosen to act as if it
did: the book (minus appendices) has the same material as the course, in both
selection and exclusion, and has the same practical goal, which is to lead the
reader as straightforwardly as is feasible to a working understanding of the
core mathematical machinery of GR. Graduate work in relativity will of course
require mining of those heavier tomes, but I hope it will be easier to explore
the territory after a first brisk march through it. The book is not designed to
be dipped into, or selected from; it should be read straight through. Enjoy the
journey.
Another feature of lecture courses and of Cambridge University Press’s
Student’s Guides, which I have carried over to this book, is that they are
bounded: they do not have to be complete, but can freely refer students to
other texts, for details of supporting or corroborating interest. I have taken
full advantage of this freedom here, and draw in particular on Schutz’s A
First Course in General Relativity (2009), and to a somewhat lesser extent
on Carroll’s Spacetime and Geometry (2004), aligning myself with Schutz’s
approach except where I have a positive reason to explain things differently.
This book is not a ‘companion’ to Schutz, and does not assume you have a
copy, but it is deliberately highly compatible with it. I am greatly indebted
both to these and to the other texts of Section 1.4.2.
These notes have benefitted from very thoughtful comments, criticism, and
error checking, received from both colleagues and students, over the years this
book’s precursor course has been presented. The balance of time on different
topics is in part a function of these students’ comments and questions. Without
downplaying many other contributions, Craig Stark, Liam Moore, and Holly
Waller were helpfully relentless in finding ambiguities and errors.
The book would not exist without the patience and precision of Róisín
Munnelly and Jared Wright of CUP. Some of the exercises and some of the
motivation are taken, with thanks, from an earlier GR course also delivered at
the University of Glasgow by Martin Hendry. I am also indebted to various
colleagues for comments and encouragement of many types, in particular
Richard Barrett, Graham Woan, Steve Draper, and Susan Stuart. For their
precision and public-spiritedness in reporting errors, the author would like to
thank Charles Michael Cruickshank, David Spaughton and Graham Woan.
1 Introduction
What is the problem that General Relativity (GR) is trying to solve? Section 1.1
introduces the principle of general covariance, the relativity principle, and the
equivalence principle, which between them provide the physical underpinnings
of Einstein’s theory of gravitation.
We can examine some of these points a second time, at the risk of a little
repetition, in Section 1.2, through a sequence of three thought experiments,
which additionally bring out some immediate consequences of the ideas.
It’s rather a matter of taste whether you regard the thought experiments as
motivation for the principles, or as illustrations of them.
The remaining sections in this chapter are other prefatory remarks, about
‘natural units’ (in which the speed of light c and the gravitational constant G
are both set to 1), and pointers to a selection of the many textbooks you may
wish to consult for further details.
from which we can deduce the constant-acceleration equations and, from that,
all the fun and games of Applied Maths 1.
Alternatively, we could describe a coordinate system S′ rotating about the
origin of our rectilinear one with angular speed ω, in which

F′ = ma′ = −mω × (ω × r′) − 2mω × dr′/dt,    (1.3)
and then derive the equations of constant acceleration from that. Doing so
would not be wrong, but it would be perverse, because the underlying physical
statement is the same in both cases, but the expression of it is more complicated
in one frame than in the other. Put another way, Eq. (1.1) is physics, but the
distinction between Eqs. (1.2) and (1.3) is merely mathematics.
This is a more profound statement than it may at first appear, and it can be
dignified as
The principle of general covariance: All physical laws must be invariant under
all coordinate transformations.
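The rotating-frame example is easy to check numerically. The sketch below is my own illustration, not from the text; the angular speed and initial conditions are arbitrary. It integrates Eq. (1.3) with F′ = 0 for a free particle, and confirms that the result is just the inertial frame’s straight-line motion viewed from the rotating frame:

```python
import numpy as np

w = 0.7                                  # angular speed of the rotating frame (arbitrary)
r0 = np.array([1.0, 0.0])                # inertial-frame initial position
v0 = np.array([0.0, 0.5])                # inertial-frame (constant) velocity

def rot(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def wcross(u):
    # omega x u for rotation about the z-axis, restricted to the plane
    return w * np.array([-u[1], u[0]])

# Initial conditions in the rotating frame (the frames coincide at t = 0)
rp = r0.copy()
vp = v0 - wcross(r0)

# Integrate a' = -omega x (omega x r') - 2 omega x v', i.e. Eq. (1.3) with F' = 0
dt, n = 1e-4, 20000
for _ in range(n):
    ap = -wcross(wcross(rp)) - 2 * wcross(vp)
    vp = vp + ap * dt
    rp = rp + vp * dt

# The same motion computed trivially in the inertial frame, then rotated
t = n * dt
expected = rot(-w * t) @ (r0 + v0 * t)
print(rp, expected)                      # agree to integration accuracy
```

The physics is the straight line of Eq. (1.2); the centrifugal and Coriolis terms are merely the price of describing it in awkward coordinates.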
that are moving with respect to S with a constant velocity v, and we call each
of the members of this class an inertial frame . In each inertial frame, motion
is simple and, moreover, each inertial frame is related to another in a simple
way: namely the galilean transformation in the case of pre-relativistic physics,
and the Lorentz transformation in the case of Special Relativity (SR).
The fact that the observational effects of Newton’s laws are the same in each
inertial frame means that we cannot tell, from observation only of dynamical
phenomena within the frame, which frame we are in. Put less abstractly, you
can’t tell whether you’re moving or stationary, without looking outside the
window and detecting movement relative to some other frame. Inertial frames
thus have, or at least can be taken to have, a special status. This special status
turns out, as a matter of observational fact, to be true not only of dynamical
phenomena dependent on Newton’s laws, but of all physical laws, and this
also can be elevated to a principle.
The principle of relativity (RP): (a) All true equations in physics (i.e., all ‘laws
of nature’, and not only Newton’s first law) assume the same mathematical form
relative to all local inertial frames. Equivalently, (b) no experiment performed
wholly within one local inertial frame can detect its motion relative to any other
local inertial frame.
If we add to this principle the axiom that the speed of light is infinite, we
deduce the galilean transformation; if we instead add the axiom that the speed
of light is a frame-independent constant (an axiom that turns out to be amply
confirmed by observation), we deduce the Lorentz transformation and Special
Relativity. In SR, remember, we are obliged to talk of a four-dimensional
coordinate frame, with one time and three space dimensions.
General Relativity – Einstein’s theory of gravitation – adds further signif-
icance to the idea of the inertial frame. Here, an inertial frame is a frame
in which SR applies, and thus the frame in which the laws of nature take
their corresponding simple form. This definition, crucially, applies even in the
presence of large masses where (in newtonian terms) we would expect to find
a gravitational force. The frames thus picked out are those which are in free
fall, either because they are in deep space far from any masses, or because they
are (attached to something that is) moving under the influence of ‘gravitation’
alone. I put ‘gravitation’ in scare quotes because it is part of the point of GR
to demote gravitation from its newtonian status as a distinct physical force to a
status as a mathematical fiction – a conceptual convenience – which is no more
real than centrifugal force.
The first step of that demotion is to observe that the force of gravitation
(I’ll omit the scare quotes from now on) is strangely independent of the
nature of the things that it acts upon. Imagine a frame sitting on the surface
of the Earth, and in it a person, a bowl of petunias, and a radio, at some
height above the ground: we discover that, when they are released, each of
them will accelerate at the same rate towards the floor (Galileo is supposed
to have demonstrated this same thing using the Tower of Pisa, careless of
the health and safety of passers-by). Newton explains this by saying that the
force of gravitation on each object is proportional to its gravitational mass
(the gravitational ‘charge’, if you like); and the acceleration of each object, in
response to that force, is proportional to its inertia, which is proportional to its
inertial mass. Newton doesn’t put it in those terms, of course, but he also fails to
explain why the gravitational and inertial masses, which a priori have nothing
to do with each other, turn out experimentally to be exactly proportional
to each other, even though the person, the plant, the plantpot, and the
radio broadcasting electromagnetic waves all exhibit very different physical
properties.
Now imagine this same frame – or, for the sake of concreteness and the
containment of a breathable atmosphere, a spacecraft – floating in space. Since
spacecraft, observer, petunias, and radio are all equally floating in space, none
will move with respect to another (or, if they are initially moving, they will
continue to move with constant relative velocity). That is, Newton’s laws work
in their simple form in this frame, which we can therefore identify as an inertial
frame.
If, now, we turn on the spacecraft’s engines, then the spacecraft will
accelerate, but the objects within it will not, until the spacecraft collides with
them, and starts to accelerate them by pushing them with what we will at
that point decide to call the cabin floor. Crucially – and, from this point of
view, obviously – the sequence of events here is independent of the details
of the structure of the ceramic plantpot, the biology of the observer and the
petunias, and the electronic intricacies of the radio. If the spacecraft continues
to accelerate at, say, 9.81 m s⁻², then the objects now firmly on the cabin floor
will experience a continuous force of one standard Earth gravity, and observers
within the cabin will find it difficult to tell whether they are in an accelerating
spacecraft or in a uniform gravitational field.
In fact we can make the stronger statement – and this is another physical
statement which has been verified to considerable precision in, for example,
the Eötvös experiments – that the observers will find it impossible to tell the
difference between acceleration and uniform gravitation; and this is a third
remark that we can elevate to a physical principle.
The EP is closely related to the observation that gravitational and inertial mass
are strictly proportional; Rindler, for example, refers to this as the ‘weak’
equivalence principle (see Section 4.2.2).
We can summarise where we have got to as follows: (i) the principle of
general covariance thus constrains the possible forms of statements of physical
law, (ii) the EP and RP point to a privileged status of inertial frames in our
search for further such laws, (iii) the RP gives us a link to the physics that we
already know at this stage, and (iv) the EP gives us a link to the ‘gravitational
fields’ that we want to learn more about.
These three principles make a variety of physical and mathematical points.
1 By ‘point of view’ I mean ‘as measured with respect to a reference frame fixed to the box’, but
such circumlocution can distract from the point that this is an observation we’re talking about –
we can see this happening.
Figure 1.3 The Pound-Rebka experiment.
[Figures: a space-time diagram (axes t, z) showing signals of frequency ν passing between observers A and B; and two freely falling particles A and B, separated by ξ(t), at height z(t)]
d²ξ/dt² = k d²z/dt² = k F/m = −k GM/z² = −ξ GM/z³.
This tells us that the inertial frames attached to these freely falling particles
approach each other at an increasing speed (that is, they ‘accelerate’ towards
each other in the sense that the second derivative of their separation is non-
zero, but since they are in free fall, there is no physical acceleration that an
observer in the frame would feel as a push).
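This convergence is easy to see numerically. The following sketch is my own illustration, with Earth-like values assumed for G, M, and the height; it drops two particles, initially 1 m apart, towards a newtonian point mass and checks that the second derivative of their separation ξ is −GMξ/z³:

```python
import numpy as np

G, M = 6.674e-11, 5.972e24          # SI values for the Earth (assumed for illustration)
z0, xi0 = 6.371e6, 1.0              # starting height (~ Earth's radius) and separation

def accel(r):
    # Newtonian acceleration towards a point mass at the origin
    return -G * M * r / np.linalg.norm(r) ** 3

# Two particles, side by side, released from rest
rA = np.array([-xi0 / 2, z0]); vA = np.zeros(2)
rB = np.array([+xi0 / 2, z0]); vB = np.zeros(2)

dt = 0.5
seps = []
for _ in range(3):
    seps.append(np.linalg.norm(rB - rA))
    vA = vA + accel(rA) * dt; rA = rA + vA * dt
    vB = vB + accel(rB) * dt; rB = rB + vB * dt

# Second derivative of the separation xi(t), by finite differences
d2xi = (seps[2] - 2 * seps[1] + seps[0]) / dt ** 2
print(d2xi, -G * M * xi0 / z0 ** 3)  # the two agree: the particles converge
```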
If A and B are two observers in inertial frames (or inertial spacecraft), then
we have said that they cannot distinguish between being in space far from any
gravitating masses, and being in free fall near a large mass. If instead they
found themselves at opposite ends of a giant free-falling spacecraft, then they
would find themselves drifting closer to each other as the spacecraft fell, in
apparent violation of Newton’s laws. Is there a contradiction here?
No. The EP as quoted in Section 1.1 talked of uniform gravitational fields,
which this is not. Also, both the RP of that section, and the discussion in
Section 1.2.1, talked of local inertial frames. A lot of SR depends on inertial
frames having infinite extent: if I am an inertial observer, then any other
inertial observer must be moving at a constant velocity with respect to me.
In GR, in contrast, an inertial frame is a local approximation (indeed it is fully
accurate only at a point, an important issue we will return to later), and if your
measurement or experiment is sufficiently extended in space or time, or if your
instruments are sufficiently accurate, then you will be able to detect tidal forces
in the way that A and B have done in this thought experiment.
If A and B are plummeting down lift shafts, in free fall, on opposite sides
of the earth, then they are inertial observers, but they are ‘accelerating’ with
respect to one another. This means that, if I am one of these inertial observers,
then (presuming I do not have more pressing things to worry about) I cannot
use SR to calculate what the other inertial observer would measure in their
frame, nor calculate what I would measure if I observed a bit of physics that
I understand, which is happening in the other inertial observer’s frame.
But this is precisely what I do want to do, supposing that the bit of physics in
question is happening in free fall in the accretion disk surrounding a black hole,
and I want to interpret what I am seeing through my telescope. Gravitational
redshift of spectral lines is just the beginning of it.
It is GR that tells us how we must patch together such disparate inertial
frames. [Exercise 1.2]
df/dx = lim_{h→0} [f(x + h) − f(x)]/h.
Figure 1.6 The sequence of ideas. In Chapters 2 and 3 we examine the mathe-
matical technology that we will need to turn the principles of Chapter 1 into the
physics of Chapter 4.
There are several advantages to this: (i) In relativity, space and time are not
really distinct, and having different units for the two ‘directions’ can obscure
this; (ii) In these units, light travels a distance of one metre in a time of one
metre, giving the speed of light as an easy-to-remember, and dimensionless,
c = 1; (iii) If we measure time in metres, then we no longer need the
conversion factor c in our equations, which are consequently simpler. We also
quote other speeds in these units of metres per metre, so that all speeds are
dimensionless and less than one.
Of these three points, the first is by far the most important.
Writing c = 1 = 3 × 10⁸ m s⁻¹ (dimensionless) looks rather odd, until
we read ‘seconds’ as units of length. In the same sense, the inch is defined
to be precisely 25.4 mm long, and this figure of 25.4 is merely a conversion
factor between two different, and only historically distinct, units of length.
We write this as 1 in = 25.4 mm or, equivalently but unconventionally, as
1 = 25.4 mm in⁻¹.
Consider converting 10 J = 10 kg m² s⁻² to natural units. Since c = 1, we
have 1 s = 3 × 10⁸ m, and so 1 s⁻² = (9 × 10¹⁶)⁻¹ m⁻². So 10 kg m² s⁻² =
10 kg m² × (9 × 10¹⁶)⁻¹ m⁻² = 1.1 × 10⁻¹⁶ kg. Recalling SR’s E = γmc² =
γm, it should be unsurprising that, in the ‘right’ units, mass has the same units
as other forms of energy.
In GR it is also usual to use units in which the gravitational constant is
G = 1. That means that the expression 1 = G = 6.673 × 10⁻¹¹ m³ kg⁻¹ s⁻² =
7.414 × 10⁻²⁸ m kg⁻¹ becomes a conversion factor between kilogrammes and
the other units. This, for example, gives the mass of the Sun, in these units, as
M⊙ ≈ 1.5 km.
It is easy, once you have a little practice, to convert values and equations
between the different systems of units. Throughout the rest of this book, I will
quote equations in units where c = 1, and, when we come to that, G = 1, so
that the factors c and G disappear from the equations. [Exercise 1.3]
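The conversions in this section can be scripted in a few lines; this sketch (using the rounded constants quoted above, plus an assumed solar mass of 1.989 × 10³⁰ kg) reproduces the numbers in the text:

```python
c = 3e8          # m/s, so 1 s = 3e8 m (rounded value used in the text)
G = 6.673e-11    # m^3 kg^-1 s^-2

# 10 J = 10 kg m^2 s^-2, converted to kg by dividing by c^2
E_joules = 10.0
E_natural = E_joules / c**2
print(E_natural)                  # ~1.1e-16 kg: mass and energy share units

# G/c^2 is the conversion factor from kilogrammes to metres
kg_to_m = G / c**2
print(kg_to_m)                    # ~7.414e-28 m/kg

# The mass of the Sun (1.989e30 kg, an assumed standard value) in metres
M_sun = 1.989e30 * kg_to_m
print(M_sun)                      # ~1475 m, i.e. about 1.5 km
```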
• Carroll (2004) is very good. Although it’s mathematically similar, the order
of the material, and the things it stresses, are sufficiently different from this
book and Schutz that it might be confusing. However, that difference is also
a major virtue: the book introduces topics clearly, and in a way that usefully
contrasts with my way. Also, Carroll’s relativity lecture notes from a few
years ago, which are a precursor of the book, are easily findable on the
Internet.
• Rindler (2006) always explains the physics clearly, distinguishing
successively strong variants of the EP, and the motivation for GR (his first
two chapters are, incidentally, notably excellent in their careful explanation
of the conceptual basis of SR). However Rindler is now rather
old-fashioned in many respects, in particular in its treatment of differential
geometry, which it introduces from the point of view of coordinate
transformations, rather than the geometrical approach we use later in the
book. Earlier editions of this book are equally valuable for their insight.
• Similarly, again, Narlikar (2010) is worth looking at, to see if it suits
you. The mathematical approach is one which introduces vectors and
tensors via components (like Rindler), rather than the more functional
approach we’ll use here. Narlikar is good at transmitting mathematical and
physical insights.
• Misner, Thorne, and Wheeler (1973) is a glorious, comprehensive, doorstop
of a book. Its distinctive prose style and typographical oddities have fans
and detractors in roughly equal numbers. Chapter 1 in particular is worth
reading for an overview of the subject. MTW is, incidentally, highly
compatible in style with the introduction to SR found in Taylor and
Wheeler’s excellent Spacetime Physics (1992).
• Wald (1984) is comprehensive and has long been a standby of
undergraduate- and graduate-level GR courses.
• Hartle (2003) is more recent and similarly popular, with a practical focus.
• Another Schutz book, Gravity from the Ground Up (Schutz, 2003), aims to
cover all of gravitational physics from falling apples to black holes using
the minimum of maths. It won’t help with the differential geometry, but it’ll
supply lots of insight.
• Longair (2003) is excellent. The section on GR (only a smallish part of the
book) is concerned with motivating the subject rather than doing a lot of
maths, and is in a seat-of-the-pants style that might be to your taste.
There are also many more advanced texts. The following are graduate-level
texts, and so reach well beyond the level of this book. They are mathematically
very sophisticated. If, however, your tastes and experience run that way, then
the introductory chapters of these books might be instructive, and give you a
taste of the vast wonderland of beautiful maths that can be found in this subject.
They can also be useful as a way of compactly summarising material you have
come to understand by a more indirect route.
• Chapter 1 of Stewart (1991) covers more than the content of this course in
its first 60 laconic pages.
• Geometrical Methods of Mathematical Physics (Schutz, 1980) is a
delightful book, which explains the differential geometry clearly and
sparsely, including applications beyond relativity and cosmology. However,
it appeals only to those with a strong mathematical background; it may
cause alarm and despondency in others.
• Hawking and Ellis (1973), chapter 2, covers more than all the differential
geometry of this book.
Table 1.1 Sign conventions in various texts. This text also matches Hawking
and Ellis (1973) and MTW. References are to equation numbers in the
corresponding texts, except where indicated. For explanations, see Eq. (1.5).
a few texts (imitating MTW’s corresponding table) in Table 1.1. In this table,
the signs are

±R^i_{jkl} = Γ^i_{jl,k} − Γ^i_{jk,l} + Γ^i_{σk} Γ^σ_{jl} − Γ^i_{σl} Γ^σ_{jk}
±R_{βν} = R^μ_{βμν}                                    (1.5)
G_{μν} = R_{μν} − ½ R g_{μν} = ±8π T_{μν}
±η_{μν} = diag(−1, +1, +1, +1)
Exercises
Here and in the following chapters, the notations d+, d−, u+, and so on,
indicate questions that are slightly more or less difficult, or more useful, than
others.
Exercise 1.2 (§1.2.4) If two 1 kg balls, 1 m apart, fall down a lift shaft
near the surface of the earth, how much is their tidal acceleration towards each
other? How much is their acceleration towards each other as a result of their
mutual gravitational attraction?
line graph’, or might refer to it as linear in other contexts, in this formal sense
it is not a linear function, because f(2x) ≠ 2f(x)).
The obvious example of a vector space is the set of vectors that you learned
about in school, but crucially, anything that satisfies these axioms is also a
vector space.
Vectors A_1, . . . , A_n are linearly independent (LI) if a_1 A_1 + a_2 A_2 + · · · +
a_n A_n = 0 implies a_i = 0, ∀i. The dimension of a vector space, n, is the largest
number of LI vectors that can be found. A set of n LI vectors A_i in an n-
dimensional space is said to span the space, and is termed a basis for the space.
It is then a theorem that, for every vector B ∈ V, there exists a set of numbers
{b_i} such that B = Σ_{i=1}^{n} b_i A_i; these numbers {b_i} are the components of the
vector B with respect to the basis {A_i}.
One can (but need not) define an inner product on a vector space: the inner
product between two vectors A and B is written A · B (yes, the dot-product
that you know about is indeed an example of an inner product; also note
that the inner product is sometimes written ⟨A, B⟩, but we will reserve that
notation, here, to the contraction between a vector and a one-form, defined in
Section 2.2.1). This is a symmetric, linear, operator that maps pairs of vectors
to the real line. That is (i) A · B = B · A , and (ii) (aA + bB ) · C = aA · C + bB · C .
Two vectors, A and B, are orthogonal if A · B = 0. An inner product is positive-
definite if A · A > 0 for all A ≠ 0, or indefinite otherwise. The norm of a
vector A is |A| = |A · A|^{1/2}. The symbol δ_ij is the Kronecker delta symbol,
defined as
δ_ij ≡ 1 if i = j, and 0 otherwise    (2.1)
(throughout the book, we will use variants of this symbol with indexes raised
or lowered – they mean the same: δ_ij = δ^i_j = δ^ij; see the remarks about this
object at the end of Section 2.2.6). A set of vectors {e_i} such that e_i · e_j = δ_ij
(that is, all orthogonal and with unit norm) is an orthonormal basis. It is a
theorem that, if {b_i} are the components of an arbitrary vector B in this basis,
then b_i = B · e_i. [Exercises 2.1 and 2.2]
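These definitions can be experimented with directly; this short sketch (my own illustration, using NumPy) checks linear independence via matrix rank, finds the components of B in a non-orthogonal basis by solving b_1 A_1 + b_2 A_2 = B, and confirms that in an orthonormal basis the components are simply b_i = B · e_i:

```python
import numpy as np

# An oblique (linearly independent, but not orthogonal) basis for R^2
A1, A2 = np.array([1.0, 0.0]), np.array([1.0, 1.0])
B = np.array([3.0, 2.0])

# Linear independence: the matrix with A_1, A_2 as columns has full rank
M = np.column_stack([A1, A2])
print(np.linalg.matrix_rank(M))   # -> 2

# Components of B in the {A_i} basis: solve b_1 A_1 + b_2 A_2 = B
b = np.linalg.solve(M, B)
print(b)                          # -> [1. 2.], i.e. B = 1*A1 + 2*A2

# For an orthonormal basis {e_i}, the components come directly from b_i = B . e_i
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(B @ e1, B @ e2)             # -> 3.0 2.0
```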
T(·̃, ·̃, ·),

to emphasise that the function has two ‘slots’ for one-forms and one ‘slot’ for
a vector. When we insert one-forms p̃ and q̃, and vector A, we get T(p̃, q̃, A),
which, by our definition of a tensor, we see must be a pure number, in ℝ.
Note that this ‘dots’ notation is an informal one, and though I have chosen to
write this in the following discussion with one-form arguments all to the left of
vector ones, this is just for the sake of clarity: in general, the (1,1) tensor T(·, ·̃)
is a perfectly good tensor, and distinct from the (1,1) tensor T(·̃, ·).
Note firstly that there is nothing in the definition of a tensor that states that
the arguments are interchangeable; thus, in the case of a (0,2) tensor U(·, ·),
U(A, B) ≠ U(B, A) in general: if in fact U(A, B) = U(B, A), ∀A, B, then U is
T(ω̃, ·̃, ·),

then we obtain an object that can take a single one-form and a single vector,
and map them into a number; in other words, we have a (1,1) tensor. If we fill in
a further argument

V = T(ω̃, ·̃, A)
p̃ = (p_1, p_2),

A = (A^1, A^2)^T,

⟨p̃, A⟩ = (p_1, p_2) (A^1, A^2)^T
       = p_1 A^1 + p_2 A^2.    (2.3)
[Figure: a vector A = A^1 e_1 + A^2 e_2, with components A^1 and A^2 along basis vectors e_1 and e_2]
Figure 2.1 A vector.
function over vectors, mapping them to numbers, and similarly A, a real-valued
function over one-forms. In this equation, we have chosen to define ⟨p̃, A⟩ using
the familiar mechanism of matrix multiplication; the definitions of A(p̃) and
p̃(A) then come for free, using the equivalences of Eq. (2.2) (I have written
the vector components with raised indexes in order to be consistent with the
notation introduced in Section 2.2.5; note, by the way, that the vector illustrated
in Figure 2.1 is not anchored to the origin – it is not a ‘position vector’, since
that is a thing that would change on any change of origin).
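In components, then, the contraction of Eq. (2.3) is nothing more than a row-times-column matrix product; a minimal sketch with made-up numbers:

```python
import numpy as np

p = np.array([2.0, 3.0])   # components (p_1, p_2) of the one-form
A = np.array([4.0, 5.0])   # components (A^1, A^2) of the vector

# <p, A> = p_1 A^1 + p_2 A^2
print(p @ A)               # -> 23.0
```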
How about tensors of higher rank? Easy: matching the row and column
vectors from this section, a square matrix

T = ( a_11  a_12 ; a_21  a_22 )
1 See for example Goldstein (2001). Note that here we have elided the distinction between
vectors and one-forms, since the distinction does not matter in the euclidean space where we
normally care about the inertia tensor.
2 There are multiple ways of describing forces and displacements in terms of vectors and
one-forms (supposing that we are careful enough to care about the distinction between them),
and the consequent rank of σ. Every account of continuum mechanics seems to make its own
choices here: this variety of ‘accents’ serves to remind us that mathematics is a way that we
have of describing nature, and not the same thing as nature itself.
If we have vectors V and W, then we can form a (2,0) tensor written V ⊗ W,
the value of which on the one-forms p̃ and q̃ is defined to be

(V ⊗ W)(p̃, q̃) ≡ V(p̃) × W(q̃).
This object V ⊗ W is known as the outer product of the two vectors; see Schutz,
section 3.4. For example, given two column vectors A and B, the object

A ⊗ B = (A^1, A^2)^T ⊗ (B^1, B^2)^T

is a (2,0) tensor whose value when applied to the two one-forms p̃ and q̃ is

(A ⊗ B)(p̃, q̃) = A(p̃) × B(q̃)
              = (p_1, p_2)(A^1, A^2)^T × (q_1, q_2)(B^1, B^2)^T
              = (p_1 A^1 + p_2 A^2) × (q_1 B^1 + q_2 B^2).
In a similar way, we can use the outer product to form objects of other ranks
from suitable combinations of vectors and one-forms. Not all tensors are
necessarily outer products, though all tensors can be represented as a sum of
outer products.
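Since the components here are just numbers, the outer product is easy to experiment with numerically. The following Python fragment is my own illustration (the component values are arbitrary, not from the text): it forms $A \otimes B$ and checks the contraction identity displayed above.

```python
# Components of two vectors (raised indexes) and two one-forms (lowered).
A = [1.0, 2.0]
B = [3.0, 5.0]
p = [0.5, -1.0]
q = [2.0, 0.25]

# The outer product A (x) B has components T^{ij} = A^i B^j.
T = [[A[i] * B[j] for j in range(2)] for i in range(2)]

# Applying the tensor to (p, q) sums p_i q_j T^{ij} ...
value = sum(p[i] * q[j] * T[i][j] for i in range(2) for j in range(2))

# ... which equals (p_i A^i) x (q_j B^j), as in the display above.
assert value == sum(p[i] * A[i] for i in range(2)) * sum(q[j] * B[j] for j in range(2))
```

Note that $\mathbf T$ here is built from a single pair of vectors; a general $\binom{2}{0}$ tensor would need a sum of such outer products, as remarked above.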
2.2.3 Fields
We will often want to refer to a scalar-, vector- or tensor-field. A field is just
a function, in the sense that it maps one space to another, but in this book
we restrict the term ‘field’ to the case of a tensor-valued function, where the
domain is a physical space or space-time. That is, a field is a rule that associates
a number, or some higher-rank tensor, with each point in space or in space-
time. Air pressure is an example of a scalar field (each point in 3-d space has a
number associated with it), and the electric and magnetic fields, E and B, are
vector fields (associating a vector with each point in 3-d space).
Figure 2.2 Contraction of 2-d vectors and one-form.
Figure 2.5 An oblique basis: a vector $A = A^1 e_1 + A^2 e_2$ expressed with respect to oblique basis vectors $e_1$ and $e_2$.
a point, rather than joining two separated points in space, you should think of
a one-form as having a direction and magnitude at a point, and not consisting
of actually separate planes.
With this visualisation, it is natural to talk of $A$ and $\tilde p$ as geometrical objects. When we do so, we are stressing the distinction between, firstly, $A$ and $\tilde p$ as abstract objects and, secondly, their numerical components with respect to a basis. This is what we meant when we talked, in Section 1.1, about physical laws depending only on geometrical objects, and not on their components with respect to a set of basis vectors that we introduce only for our mensural convenience.
2.2.5 Components
I said, above, that the set of $\binom{M}{N}$ tensors formed a vector space. Specifically, that includes the sets of vectors and one-forms. From Section 2.1.1, this means that we can find a set of $n$ basis vectors $\{e_i\}$ and basis one-forms $\{\tilde\omega^i\}$ (this is supposing that the domains of the arguments to our tensors all have the same dimensionality, $n$; this is not a fundamental property of tensors, but it is true in all the use we make of them, and so this avoids unnecessary complication).
Armed with a set of basis vectors and one-forms, we can write a vector $A$ and one-form $\tilde p$ in components as
$$A = \sum_i A^i e_i; \qquad \tilde p = \sum_i p_i \tilde\omega^i.$$
See Figure 2.1 and Figure 2.5. Crucially, these components are not intrinsic to the geometrical objects that $A$ and $\tilde p$ represent, but instead depend on the vector or one-form basis that we select. It is absolutely vital that you fully appreciate that if you change the basis, you change the components of a vector or one-form (or any tensor) with respect to that basis, but the underlying geometrical object, $A$ or $\tilde p$ or $\mathbf T$, does not change. Though this remark seems obvious now, dealing with it in general is what much of the complication of differential geometry is about.
Note the (purely conventional) positions of the indexes for these basis
vectors and one-forms, and for the components: the components of vectors
have raised indexes, and the components of one-forms have lowered indexes.
This convention allows us to define an extremely useful notational shortcut,
which allows us in turn to avoid writing hundreds of summation signs:
Einstein summation convention: whenever we see an index repeated in an
expression, once raised and once lowered, we are to understand a summation
over that index.
Thus:
$$A^i e_i \equiv \sum_i A^i e_i; \qquad p_i \tilde\omega^i \equiv \sum_i p_i \tilde\omega^i.$$
We have illustrated this for components and vectors here, but it will apply quite generally. Here are the rules for working with components:
1. In any expression, there must be at most two of each index, one raised and
one lowered. If you have more than two, or have both raised or lowered,
you’ve made a mistake. Any indexes ‘left over’ after contraction tell you
the rank of the object of which this is the component.
2. The components are just numbers, and so, as you learned in primary
school, it doesn’t matter what order you multiply them (they don’t
commute with differential signs, though). If they are the components of a
field, then the components, as well as the basis vectors, may vary across
the space.
3. The indexes are arbitrary – you can always replace an index letter with another one, as long as you do it consistently. That is, $p_i A^i = A^j p_j$, and $p_i q_j T^{ij} = p_j q_i T^{ji} = p_k q_i T^{ki}$ (though $p_k q_i T^{ki} \neq p_k q_i T^{ik}$ in general, unless the tensor $\mathbf T$ is symmetric).
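The relabelling rule in particular is easy to misread, so here is a small numerical illustration in Python (my own, with arbitrary component values), with the suppressed summation signs restored as explicit sums:

```python
n = 2
p = [1.0, 2.0]      # one-form components p_i
q = [0.5, -3.0]     # one-form components q_j
T = [[4.0, 1.0],    # tensor components T^{ij} (deliberately not symmetric)
     [0.0, 2.0]]

# Rule 3: the summed index letter is arbitrary, so p_i q_j T^{ij}
# and p_j q_i T^{ji} are the same number ...
s1 = sum(p[i] * q[j] * T[i][j] for i in range(n) for j in range(n))
s2 = sum(p[j] * q[i] * T[j][i] for i in range(n) for j in range(n))
assert s1 == s2

# ... but swapping which slot each one-form contracts with,
# p_k q_i T^{ik}, is a different number unless T is symmetric.
s3 = sum(p[k] * q[i] * T[i][k] for i in range(n) for k in range(n))
assert s1 != s3
```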
What happens if we apply $\tilde p$, say, to one of the basis vectors? We have so far placed no restriction on the one-form basis, so let us now choose the basis $\{\tilde\omega^i\}$ that satisfies
$$\tilde\omega^i(e_j) = \delta^i{}_j. \qquad (2.5)$$
A one-form basis with this property is said to be dual to the vector basis.
Returning to Eq. (2.4), therefore, we find
$$\tilde p(e_j) = p_i\,\tilde\omega^i(e_j) = p_i\,\delta^i{}_j = p_j. \qquad (2.6)$$
Thus in the one-form basis that is dual to the vector basis $\{e_j\}$, the arbitrary one-form $\tilde p$ has components $p_j = \tilde p(e_j)$.
Similarly, we can apply the vector $A$ to the one-form basis $\{\tilde\omega^i\}$, and obtain
$$A(\tilde\omega^i) = A^j\, e_j(\tilde\omega^i) = A^j\, \delta_j{}^i = A^i.$$
In exactly the same way, we can apply the tensor $\mathbf T$ to the basis vectors and one-forms, and obtain
$$\mathbf T(\tilde\omega^i, \tilde\omega^j, e_k) = T^{ij}{}_k. \qquad (2.7)$$
The set of $n \times n \times n$ numbers $\{T^{ij}{}_k\}$ are the components of the tensor $\mathbf T$ in the basis $\{e_i\}$ and its dual $\{\tilde\omega^j\}$. We will generally denote the vector $A$ by simply writing ‘$A^i$’, denote $\tilde p$ by ‘$p_i$’, and the $\binom{2}{1}$ tensor $\mathbf T$ by ‘$T^{ij}{}_k$’. Because of the index convention, we will always know what sort of object we are referring to by whether the indexes are raised or lowered: the components of vectors always have their indexes raised, and the components of one-forms always have their indexes lowered.
Notice the pattern: the components of vectors have raised indexes, but the basis vectors themselves are in contrast written with lowered indexes, and vice versa for one-forms. It is this notational convention that allows us to take advantage of the Einstein summation convention when writing a vector as $A = A^i e_i$ or $\tilde p = p_i \tilde\omega^i$.
We can, obviously, find the components of the basis vectors and one-forms by exactly this method, and find
$$e_1 \to (1, 0, \ldots, 0), \quad e_2 \to (0, 1, \ldots, 0), \quad \ldots, \quad e_n \to (0, 0, \ldots, 1), \qquad (2.8)$$
where the numbers on the right-hand side are the components in the vector basis, and
$$\tilde\omega^1 \to (1, 0, \ldots, 0), \quad \tilde\omega^2 \to (0, 1, \ldots, 0), \quad \ldots, \quad \tilde\omega^n \to (0, 0, \ldots, 1), \qquad (2.9)$$
where the components are in the one-form basis. Make sure you understand why Eqs. (2.8) and (2.9) are ‘obvious’.
So what is the value of the expression $\tilde p(A)$ in components? By linearity,
$$\tilde p(A) = p_i\,\tilde\omega^i(A^j e_j) = p_i A^j\, \tilde\omega^i(e_j) = p_i A^j\, \delta^i{}_j = p_i A^i.$$
This is the contraction of $\tilde p$ with $A$. Note particularly that, since $\tilde p$ and $A$ are basis-independent, geometrical objects – or quite separately, since $\tilde p(A)$ is a pure number – the number $p_i A^i$ is basis-independent also, even though the numbers $p_i$ and $A^i$ are separately basis-dependent.
Similarly, contracting the $\binom{2}{1}$ tensor $\mathbf T$ with one-forms $\tilde p$, $\tilde q$ and vector $A$, we obtain the number
$$\mathbf T(\tilde p, \tilde q, A) = p_i\, q_j\, A^k\, T^{ij}{}_k.$$
If we instead supply only the two one-forms, we obtain the object with components $p_i q_j T^{ij}{}_k$, and the solitary unmatched lower index $k$ on the right-hand side indicates (or rather confirms) that this object is a one-form. The indexes are staggered so that we keep track of which argument they correspond to. I noted before that the two tensors $\mathbf T(\,\cdot\,, \tilde\cdot\,)$ and $\mathbf T(\tilde\cdot\,, \,\cdot\,)$ are different tensors: if the tensor is symmetric, then $\mathbf T(e_i, \tilde\omega^j) = \mathbf T(\tilde\omega^j, e_i)$, and thus $T_i{}^j = T^j{}_i$, but we cannot simply assume this.
We can also form the contraction of a tensor, by pairing up a vector- and a one-form-shaped argument, to make a tensor of rank two smaller. Considering the $\binom{2}{1}$ tensor $\mathbf T$ as before, we can define a new tensor $\mathbf S$ as
$$\mathbf S(\tilde\cdot\,) = \mathbf T(\tilde\cdot\,, \tilde\omega^j, e_j), \qquad (2.10)$$
where the indexes $j$ are summed over as usual. Pairing up different slots in $\mathbf T$
One thing we do not have yet is any notion of distance, but we can supply that very easily, by picking a symmetric $\binom{0}{2}$ tensor $\mathbf g$, and calling that the metric tensor, or just ‘the metric’.
The metric allows us to define an inner product between vectors (which in other contexts we might call a scalar product or dot product). The inner product between two vectors $A$ and $B$ is the scalar
$$A \cdot B = \mathbf g(A, B).$$
We can define the length of a vector as the square root of its inner product with itself: $|A|^2 = \mathbf g(A, A)$. We can also use this to define an angle $\theta$ between two vectors via
$$A \cdot B = |A|\,|B| \cos\theta.$$
Note that since $\mathbf g$ is a tensor, it is frame independent, so that the length $|A|$ and angle $\theta$ must be frame-independent quantities also.
We can find the components of the metric tensor in the same way we can find the components of any earlier tensor:
$$\mathbf g(e_i, e_j) = g_{ij}. \qquad (2.12)$$
As well as giving us a notion of length, the metric tensor allows us to define a mapping between vectors and one-forms. Since it is a $\binom{0}{2}$ tensor, it is a thing which takes two vectors and turns them into a number. If instead we supply only a single vector $A = A^i e_i$ to the metric, we have a thing which takes one further vector and turns it into a number; but this is just a one-form, which we will write as $\tilde A$:
$$\tilde A = \mathbf g(A, \,\cdot\,) = \mathbf g(\,\cdot\,, A). \qquad (2.13)$$
That is, for any vector $A$, we have found a way of picking out a single associated one-form, written $\tilde A$. What are the components of this one-form? Easy:
$$A_i = \tilde A(e_i) = \mathbf g(e_i, A) = \mathbf g(e_i, A^j e_j) = A^j\, \mathbf g(e_i, e_j) = g_{ij} A^j, \qquad (2.14)$$
from Eq. (2.12) above. That is, the metric tensor can also be regarded as an ‘index lowering’ operator.
Can we do this trick in the other direction, defining a $\binom{2}{0}$ tensor that takes two one-forms as arguments and turns them into a number? Yes we can, and the natural way to do it is via the tensor’s components.
The set of numbers $g_{ij}$ is, at one level, just a matrix. Thus if it is non-singular (and we will always assume that the metric is non-singular), this matrix has an inverse, and we can take the components of the tensor we’re looking for, $g^{ij}$, to be the components of this inverse. That means nothing other than
$$g^{ij} g_{jk} = g^i{}_k = \delta^i{}_k. \qquad (2.15)$$
We will refer to the tensors corresponding to $g_{ij}$, $g^i{}_j$ and $g^{ij}$ indiscriminately as ‘the metric’.
What happens if we apply $g^{ij}$ to the one-form components $A_j$?
$$g^{ij} A_j = g^{ij} g_{jk} A^k = \delta^i{}_k A^k = A^i, \qquad (2.16)$$
so that the metric can raise components as well as lower them.
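As a concrete sketch (my own illustration, using the flat plane metric in polar coordinates, $g_{ij} = \mathrm{diag}(1, r^2)$), the raising and lowering machinery of Eqs. (2.14)–(2.16) looks like this in Python:

```python
# Lower and raise an index with a diagonal metric: the flat metric in
# plane polar coordinates at r = 2, where g_ij = diag(1, r^2).
r = 2.0
g = [[1.0, 0.0], [0.0, r * r]]              # g_ij
g_inv = [[1.0, 0.0], [0.0, 1.0 / (r * r)]]  # g^ij, the matrix inverse

V_up = [3.0, 0.5]   # components V^k

# Lowering: V_i = g_ik V^k, as in Eq. (2.14).
V_down = [sum(g[i][k] * V_up[k] for k in range(2)) for i in range(2)]

# Raising with g^ij takes us back, as in Eq. (2.16).
V_back = [sum(g_inv[i][j] * V_down[j] for j in range(2)) for i in range(2)]
assert V_back == V_up

# And g^ij g_jk = delta^i_k, Eq. (2.15).
delta = [[sum(g_inv[i][j] * g[j][k] for j in range(2)) for k in range(2)]
         for i in range(2)]
assert delta == [[1.0, 0.0], [0.0, 1.0]]
```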
There is nothing in the discussion above that says that the tensor g has the
same value at each point in space-time. In general, g is a tensor field, and the
different values of the metric at different points in spacetime are associated
with the curvature of that space-time. This is where the physics comes in.
[Exercises 2.6 and 2.7]
But there is nothing special about this basis, and we could equally well have chosen a completely different set $\{e_{\bar\jmath}\}$ with dual $\{\tilde\omega^{\bar\jmath}\}$. With respect to this basis, the same vector $A$ can be written
$$A = A^{\bar\imath}\, e_{\bar\imath}.$$
It’s important to remember that A i and A ı̄ are different (sets of) numbers
because they refer to different bases { e i } and { e ı̄ } , but that they correspond
to the same underlying vector A (this is why we distinguish the symbols by
putting a bar on the index i rather than on the base symbol e or A – this
does look odd, I know, but it ends up being notationally tidier than the various
alternatives). 3
Since both these sets of components represent the same underlying object $A$, we naturally expect that they are related to each other, and it is easy to write down that relation. From before,
$$A^{\bar\imath} = A(\tilde\omega^{\bar\imath}) = A^i\, e_i(\tilde\omega^{\bar\imath}) = \Lambda^{\bar\imath}_i\, A^i, \qquad (2.17)$$
where we have written the transformation matrix $\Lambda$ as
$$\Lambda^{\bar\imath}_i \equiv e_i(\tilde\omega^{\bar\imath}) \equiv \tilde\omega^{\bar\imath}(e_i). \qquad (2.18)$$
Note that $\Lambda$ is a matrix, not a tensor – there’s no underlying geometrical object, and we have consequently not staggered its indexes (see also the remarks on this at the end of Section 2.3.2). Also, note that indexes $i$ and $\bar\imath$ are completely distinct from each other, and arbitrary, and we are using the similarity of symbols just to emphasise the symmetry of the operation. Exactly analogously, the components of a one-form $\tilde p$ transform as
$$p_{\bar\imath} = \Lambda^i_{\bar\imath}\, p_i, \qquad (2.19)$$
where the transformation matrix is
$$\Lambda^i_{\bar\imath} \equiv \tilde\omega^i(e_{\bar\imath}). \qquad (2.20)$$
3 Notation: it is slightly more common to distinguish the bases by a prime on the index, as in $e_{i'}$, or even sometimes a hat, $e_{\hat\imath}$. I prefer the overbar on the practical grounds that it seems easier to distinguish in handwriting – try writing ‘$i'$’ three times quickly.
Since the vector $A$ is the same in both coordinate systems, we must have
$$A = A^i e_i = A^{\bar\imath} e_{\bar\imath}.$$
Suppose the basis vectors are related by $e_{\bar\imath} = E^i_{\bar\imath}\, e_i$, for some matrix $E$ to be determined. We can check the consistency of the dual bases in the two coordinate systems:
$$\delta^i{}_j = \tilde\omega^i(e_j) = \Lambda^i_{\bar\imath}\,\tilde\omega^{\bar\imath}\bigl(\Lambda^{\bar\jmath}_j\, e_{\bar\jmath}\bigr) = \Lambda^i_{\bar\imath}\Lambda^{\bar\jmath}_j\,\tilde\omega^{\bar\imath}(e_{\bar\jmath}) = \Lambda^i_{\bar\imath}\Lambda^{\bar\jmath}_j\,\delta^{\bar\imath}{}_{\bar\jmath} = \Lambda^i_{\bar\imath}\Lambda^{\bar\imath}_j. \qquad (2.22)$$
Thus $\Lambda^i_{\bar\imath}$ and $\Lambda^{\bar\imath}_j$ are matrix inverses of each other; so we must have $E^i_{\bar\imath} = \Lambda^i_{\bar\imath}$, and
$$e_{\bar\imath} = \Lambda^i_{\bar\imath}\, e_i. \qquad (2.23)$$
The results of Exercise 2.10 amplify this point.
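Eq. (2.22) is also easy to check numerically. Here is a quick sketch of my own, taking $\Lambda$ to be a simple rotation, for which the inverse matrix is the transpose:

```python
import math

# Take the barred basis to be the unbarred one rotated by an angle phi,
# so that Lambda^ibar_i is a rotation matrix and Lambda^i_ibar its inverse.
phi = 0.3
L_bar = [[math.cos(phi), math.sin(phi)],
         [-math.sin(phi), math.cos(phi)]]   # Lambda^ibar_i
L_inv = [[math.cos(phi), -math.sin(phi)],
         [math.sin(phi), math.cos(phi)]]    # Lambda^i_ibar

# Lambda^i_ibar Lambda^ibar_j should be the identity delta^i_j, Eq. (2.22).
prod = [[sum(L_inv[i][k] * L_bar[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]
for i in range(2):
    for j in range(2):
        assert abs(prod[i][j] - (1.0 if i == j else 0.0)) < 1e-12
```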
Here, it has been convenient to introduce basis transformations by
focusing on the transformation of the components of vectors and one-forms,
in Eq. (2.17). We could alternatively introduce them by focusing on the
transformation of the basis vectors and one-forms themselves, and this is
the approach used in the discussion of basis transformations in Section 3.1.4.
Finally, it is easy to show that the components of other tensors transform in what might be the expected pattern, with each unbarred index in one coordinate system matched by a suitable $\Lambda$ term. For example, the components of our $\binom{2}{1}$ tensor $\mathbf T$ will transform as
$$T^{\bar\imath\bar\jmath}{}_{\bar k} = \Lambda^{\bar\imath}_i\, \Lambda^{\bar\jmath}_j\, \Lambda^k_{\bar k}\, T^{ij}{}_k. \qquad (2.24)$$
of our intuitions, or the flat space of Special Relativity, the difference between
them disappears or, to put it another way, there is a one-to-one correspondence
between ‘a vector in the space’ and ‘the difference between two positions’
(which is what a difference vector is). In a curved space, it’s useful to talk
about the former (and we do, at length), but the latter won’t often have much
physical meaning. It is because of this correspondence that we can easily
‘parallel transport’ vectors everywhere in a flat space (see Section 3.3.2), which
means we have been able, at earlier stages in our education, to define vector
differentiation without having to think about it very hard.
If you think of a vector field – that is, a field of vectors, such as you tend to
imagine in the case of the electric field – then the things you imagine existing at
each point in space-time are straightforwardly vectors. That is, they’re a thing
with magnitude and direction (but not spatial extent), defined at each point.
recovers Eq. (2.23), and in this section alone staggered the indexes of $\Lambda$ to help keep track of the elements of $\Lambda$ and its inverse, written as matrix expressions. Therefore, if we require $\Lambda^{\bar\imath}{}_i \Lambda^i{}_{\bar\jmath} = \delta^{\bar\imath}{}_{\bar\jmath}$, then we must have
$$\Lambda^{\bar\imath}{}_i = \begin{pmatrix} \Lambda^{\bar 1}{}_1 & \Lambda^{\bar 1}{}_2 \\ \Lambda^{\bar 2}{}_1 & \Lambda^{\bar 2}{}_2 \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta/r & \cos\theta/r \end{pmatrix} \qquad (2.30)$$
(you can confirm that if you matrix-multiply Eq. (2.30) by Eq. (2.28), then you retrieve the unit matrix, as Eq. (2.22) says you should). Symmetrically with Eq. (2.29), we can now write
$$\begin{pmatrix} A^{\bar 1} \\ A^{\bar 2} \end{pmatrix} = \begin{pmatrix} \Lambda^{\bar 1}{}_1 & \Lambda^{\bar 1}{}_2 \\ \Lambda^{\bar 2}{}_1 & \Lambda^{\bar 2}{}_2 \end{pmatrix} \begin{pmatrix} A^1 \\ A^2 \end{pmatrix}, \qquad (2.31)$$
We see that, even though coordinates (x , y ) and (r , θ ) are describing the same
flat space, the metric looks a little more complicated in the polar-coordinate
coordinate system than it does in the plain cartesian one, and looking at this
out of context, we would have difficulty identifying the space described by
Eq. (2.32) as flat euclidean space.
We will see a lot more of this in the parts to come.
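The transformation of the metric components can be sketched numerically (my own illustration; the transformation law $g_{\bar\imath\bar\jmath} = \Lambda^i_{\bar\imath}\Lambda^j_{\bar\jmath}\, g_{ij}$ is the pattern of Eq. (2.24) applied to a $\binom{0}{2}$ tensor):

```python
import math

# Transform the cartesian metric g_ij = delta_ij into polar components
# g_{ibar jbar} = L^i_ibar L^j_jbar g_ij, where L^i_ibar is the Jacobian
# (dx/dr, dx/dtheta; dy/dr, dy/dtheta) at a chosen point.
r, theta = 2.0, 0.7
L = [[math.cos(theta), -r * math.sin(theta)],
     [math.sin(theta), r * math.cos(theta)]]

g_cart = [[1.0, 0.0], [0.0, 1.0]]
g_polar = [[sum(L[i][a] * L[j][b] * g_cart[i][j]
                for i in range(2) for j in range(2))
            for b in range(2)] for a in range(2)]

# The result is diag(1, r^2): the same flat space, just less obviously flat.
assert abs(g_polar[0][0] - 1.0) < 1e-12
assert abs(g_polar[1][1] - r * r) < 1e-12
assert abs(g_polar[0][1]) < 1e-12
```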
$$e_\theta \equiv \Lambda^x_\theta\, e_x + \Lambda^y_\theta\, e_y = \frac{\partial x}{\partial\theta}\, e_x + \frac{\partial y}{\partial\theta}\, e_y.$$
a matrix with a column vector (by the usual means of right-multiplying the
matrix by the column vector) you get a column vector, and if you contract the
row-vector/one-form with a matrix (by the usual means of left-multiplying the
matrix by the row-vector), you get another row-vector/one-form. If you both
left-multiply a matrix by a row vector, and right-multiply it by a column vector,
then you end up, of course, with just a number, in R.
What we have done here is to regard column vectors, row vectors, and square matrices as representations of the abstract structures of respectively $\binom{1}{0}$, $\binom{0}{1}$, and $\binom{1}{1}$ tensors (in this approach we can’t conveniently find representations of higher-rank tensors). We have done two non-trivial things here: (i) we have selected three vector spaces to play with, and (ii) we have defined ‘function application’, as required by the definition of a tensor in Section 2.2.1, to be the familiar matrix multiplication. Thus in this representation, any square matrix is a $\binom{1}{1}$ tensor.
This last expression should be very familiar to you, since it is exactly the
definition of the norm of two vectors which was so fundamental, and which
seemed so peculiar, in SR.
Exercises
Exercises 1–12 of Schutz’s chapter 3 are also useful.
Exercise 2.1 (§ 2.1.1) Demonstrate that the set of ‘ordinary vectors’ does
indeed satisfy these axioms. Demonstrate that the set of all functions also
satisfies them and is thus a vector space. Demonstrate that the subset of
functions { eax : a ∈ R} is a vector space (hint: think about what the ‘vector
addition’ operator should be in this case). Can you think of other examples?
[u + ]
Exercise 2.4 (§ 2.2.5) So, why are Eq. (2.8) and Eq. (2.9) obvious?
[d − u + ]
Exercise 2.5 (§ 2.2.5) Show that the contraction in Eq. (2.11) is indeed a
tensor. You will need to show that S is linear in its arguments, and independent
of the choice of basis vectors { e i } (you will need Section 2.2.7).
Figure 2.6 Two coordinate systems: a vector $A$ at angle $\theta$ to $e_1$, with a second basis $\{e_{\bar 1}, e_{\bar 2}\}$ rotated from $\{e_1, e_2\}$ by angle $\phi$.
1. $g^i{}_i$ 2. $g_{ij} T^j{}_{kl}$ 3. $g^{ik} T^j{}_{kl}$ 4. $T^i{}_{ij}$ 5. $g_{ij} A^i A^j$ 6. $g_{ij} A^k A_k$
If you’re looking at this after studying Chapter 3, then how about (7) $A^i{}_{,j}$ and (8) $A^i{}_{;j}$?
What about $A^i = A(\tilde\omega^i)$? [d−]
Exercise 2.8 (§ 2.2.7) Figure 2.6 shows a vector $A$ in two different coordinate systems. We have $A = \cos\theta\, e_1 + \sin\theta\, e_2 = \cos(\theta-\phi)\, e_{\bar 1} + \sin(\theta-\phi)\, e_{\bar 2}$. Obtain $e_{1,2}$ in terms of $e_{\bar 1,\bar 2}$ and vice versa, and obtain $A^{\bar\imath}$ in terms of $\cos\theta$, $\cos\phi$, $\sin\theta$ and $\sin\phi$. Thus identify the components of the matrix $\Lambda^i_{\bar\imath}$ (see also Eq. (2.23)).
Exercise 2.9 (§ 2.2.7) Show that the tensor S in Eq. (2.10) is independent
of the basis vectors used to define it.
Exercise 2.11 (§ 2.2.7) By repeating the logic that led to Eq. (2.17), prove Eq. (2.19) and Eq. (2.24) [use $\mathbf T = T^{ij}{}_k\, (e_i \otimes e_j \otimes \tilde\omega^k)$]. Alternatively, and more directly, use the results of Exercise 2.10. [d+]
In the previous chapter, we carefully worked out the various things we can do
with a set of vectors, one-forms, and tensors, once we have identified those
objects. Identifying those objects on a curved surface is precisely what we are
about to do now. We discover that we have to take a rather roundabout route
to them.
After defining them suitably, we next want to differentiate them. But as soon
as we can do that, we have all the mathematical technology we need to return
to physics in the next chapter, and start describing gravity.
Figure 3.1 A manifold $M$, with a curve $\lambda: \mathbb R \to M$ (dashed), a point on the manifold $P = \lambda(t)$, and a coordinate function $x^i: M \to \mathbb R$. Curves that go through the point $P$ have tangent vectors, at $P$ (shaded arrow), which lie in the space $T_P(M)$ (shaded). The function $x^i(t) = (x^i \circ \lambda)(t)$ is therefore $\mathbb R \to \mathbb R$.
structures are there, the removal of which would make it impossible to define
a derivative? For our present purposes we can do the traditional physicists’
thing and ignore such niceties, and assume that our spaces of interest are well
enough behaved that they can support coordinates and curves as discussed in
this section. Later in your study of GR, you may have to become aware of these
conditions again, in a detailed study of black holes (the point of singularity is
not in the manifold, or more specifically in any open subset of it) or the large-
scale structure of space-time (where the overall topology of the space becomes
important).
That said, for completeness, I’ll mention a few details about what
structure we already have at this point. A manifold, M , is a set (which
we naturally think of here as a set of points) in which we can identify open
subsets; we can identify such a subset (that is, a ‘neighbourhood’) for every
point in the manifold ( ∀ p ∈ M , ∃ S ⊂ M such that p ∈ S). We can smoothly
patch together multiple such subsets to cover the manifold, even though there
might not be any single subset that covers the entire manifold. At this point
we have a ‘topological space’. We then may or may not be able to define maps
from each of these subsets to Rn (these are the aforementioned ‘charts’); if we
can, and with the same n in all cases, we have an n -dimensional manifold (that
is, each such subset is homeomorphic to Rn; two spaces are homeomorphic
if there is a continuous bijection with a continuous inverse, which maps one
to the other; each such subset ‘looks like’ Rn). That is, the dimension of the
manifold, n, is a property of the manifold, and not a consequence of any
arbitrariness in the number of coordinate functions we use. These maps must be
continuous, but we will also assume that they are as differentiable as we need
them to be. Carroll (2004, § 2.2) has an excellent description of the sequence
of ideas, along with examples of spaces that are and are not manifolds. It’s also
possible to say quite a lot more about the precise relationship between charts,
coordinates, and frames, but we already have as much detail as we need.
$$f = f(\lambda(t)) = f\bigl(x^1(\lambda(t)), \ldots, x^n(\lambda(t))\bigr),$$
which we can write as just ($\mathbb R \to \mathbb R$)
$$f = f\bigl(x^1(t), \ldots, x^n(t)\bigr).$$
Figure 3.2 Three curves $\lambda(t)$, $\mu(s)$, and $\tau(r)$ through the point $P$, with tangent vectors $d/dt$, $d/ds$, and $d/dr$.
Be aware that, strictly, we are talking about three different functions here,
namely f (P ) : M → R, and f (x 1 , . . ., x n ) : Rn → R, and f (x 1 (t ), . . ., x n (t )) :
R → R. Giving all three functions the same name, f , is a sort of pun.
Similarly, we will think of x i as interchangeably a function x i (P ) : M → R,
or x i (λ(t)) : R → R, or as the number that is one of the arguments of the
function f (x1 , . . ., x n). When we manipulate these objects in the rest of the
book, it should be clear which interpretation we mean: when we write ∂ f /∂ x i ,
for example, we are thinking of f as f (x 1 , . . ., xn ), and thinking of the x i as
numbers; when we write ∂ x i/∂ x j = δ i j we simply mean that the coordinate
function x i is independent of the value of the coordinate x j .
So how does $f$ vary as we move along the curve? Easy:
$$\frac{df}{dt} = \sum_{i=1}^{n} \frac{\partial x^i}{\partial t}\frac{\partial f}{\partial x^i}.$$
However, since this is true of any function $f$, we can write instead
$$\frac{d}{dt} = \sum_i \frac{\partial x^i}{\partial t}\frac{\partial}{\partial x^i}. \qquad (3.1)$$
We can now derive two important properties of this derivative. Consider the same path parameterised by $t_a = t/a$. We have
$$\frac{df}{dt_a} = \sum_i \frac{\partial x^i}{\partial t_a}\frac{\partial f}{\partial x^i} = a \sum_i \frac{\partial x^i}{\partial t}\frac{\partial f}{\partial x^i} = a\,\frac{df}{dt}. \qquad (3.2)$$
Next, consider another curve $\mu(s)$, which crosses curve $\lambda(t)$ at point $P$. We can therefore write, at $P$,
$$a\frac{df}{ds} + b\frac{df}{dt} = \sum_i \left(a\frac{\partial x^i}{\partial s} + b\frac{\partial x^i}{\partial t}\right)\frac{\partial f}{\partial x^i} = \sum_i \frac{\partial x^i}{\partial r}\frac{\partial f}{\partial x^i} = \frac{df}{dr}, \qquad (3.3)$$
for some further curve $\tau(r)$ that also passes through point $P$ (see Figure 3.2). But now look what we have discovered. Whatever sort of thing $d/dt$ is, $a\,d/dt$ is the same type of thing (from Eq. (3.2)), and so is $a\,d/ds + b\,d/dt$. But now we can look at Section 2.1.1, and realise that these derivative-things defined at $P$, which we’ll write $(d/dt)_P$, satisfy the axioms of a vector space. Thus the things $(d/dt)_P$ are another example of things that can be regarded as vectors, or $\binom{1}{0}$ tensors. The thing $(d/dt)_P$ is referred to as a tangent vector. When we talk of ‘vectors’ from here on, it is these tangent vectors that we mean.
A vector $V = (d/dt)_P$ has rather a double life. Viewed as a derivative, $V$ is just an operator, which acts on a function $f$ to give
$$Vf = \left(\frac{d}{dt}\right)_P f = \left.\frac{df}{dt}\right|_{t(P)},$$
the rate of change of $f$ along the curve $\lambda(t)$, evaluated at $P$. There’s nothing particularly exotic there. What we have just discovered, however, is that this object $V = (d/dt)_P$ can also, separately, be regarded as an element of a vector space associated with the point $P$, and as such is a $\binom{1}{0}$ tensor, which is to say a thing that takes a one-form as an argument, to produce a number that we will write as $\langle\tilde\omega, V\rangle$, for some one-form $\tilde\omega$ (we will see in a moment what this one-form is; it is not the function $f$). This dual aspect does seem confusing, and makes the object $V$ seem more exotic than it really is, but it will (or should be!) always be clear from context which facet of the vector is being referred to at any point.
We’ll denote the set of these directional derivatives as T P (M ), the tangent
plane of the manifold M at the point P. It is very important to note that T P(M )
and, say, T Q (M ) – the tangent planes at two different points, P and Q , of
the manifold – are different spaces , and have nothing to do with one another
a priori (though we want them to be related, and this is ultimately why we
introduce the connection in Section 3.2).
With this in mind, we can reread Eq. (3.1) as a vector equation, identifying the vectors
$$e_i = \left(\frac{\partial}{\partial x^i}\right)_P \qquad (3.4)$$
as a basis for the tangent-plane, and the numbers $\partial x^i/\partial t$ as the components of the vector $V = (d/dt)_P$ in this basis, or
$$\left(\frac{d}{dt}\right)_P = \sum_i \frac{\partial x^i}{\partial t}\left(\frac{\partial}{\partial x^i}\right)_P, \qquad V = V^i e_i.$$
So, I’ve shown you that we can regard the (d/dt )P as vectors; the rest of this
part of the book should convince you that this is additionally a useful thing
to do.
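To see the tangent vector at work concretely, here is a numerical sketch of my own (the curve and the function $f$ are arbitrary choices, not from the text) of Eq. (3.1), checking that $V^i\,\partial f/\partial x^i$ agrees with $df/dt$ along a curve:

```python
import math

# Along the curve lambda(t) = (x^1, x^2) = (cos t, sin t), the
# tangent-vector components are V^i = dx^i/dt, and V f = V^i df/dx^i
# should reproduce df/dt.
def f(x1, x2):          # an arbitrary function on the 'manifold'
    return x1 * x1 * x2

t = 0.4
x = (math.cos(t), math.sin(t))
V = (-math.sin(t), math.cos(t))     # components dx^i/dt

# df/dx^i by central finite differences
h = 1e-6
df_dx1 = (f(x[0] + h, x[1]) - f(x[0] - h, x[1])) / (2 * h)
df_dx2 = (f(x[0], x[1] + h) - f(x[0], x[1] - h)) / (2 * h)

Vf = V[0] * df_dx1 + V[1] * df_dx2

# Compare against differentiating f(lambda(t)) directly.
df_dt = (f(math.cos(t + h), math.sin(t + h))
         - f(math.cos(t - h), math.sin(t - h))) / (2 * h)
assert abs(Vf - df_dt) < 1e-6
```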
Now consider the gradient one-form associated with, not $f$, but one of the coordinate functions $x^i$ (from Section 3.1.1, recall that the coordinates are just a set of functions on the manifold, and in this sense not importantly different from an arbitrary function $f$). We write these one-forms as simply $\tilde d x^i$: what is their action on the basis vectors $e_i = \partial/\partial x^i$ (from Eq. (3.4))? Directly from Eq. (3.5),
$$\tilde d x^i\left(\frac{\partial}{\partial x^j}\right) = \frac{\partial x^i}{\partial x^j} = \delta^i{}_j, \qquad (3.7)$$
so that, comparing this with Eq. (2.5), we see that the set $\tilde\omega^i = \tilde d x^i$ forms a basis for the one-forms, which is dual to the vector basis $e_i = \partial/\partial x^i$.
[Exercises 3.1 and 3.2]
functions $x^{\bar\imath}$, how does this appear in our formalism, and how does it compare to Section 2.2.7?
The new coordinates will generate a set of basis vectors
$$e_{\bar\imath} = \frac{\partial}{\partial x^{\bar\imath}}. \qquad (3.8)$$
This new basis will be related to the old one by a linear transformation
$$e_{\bar\imath} = \Lambda^j_{\bar\imath}\, e_j,$$
and the corresponding one-form basis will be related via the inverse transformation
$$\tilde\omega^{\bar\imath} = \Lambda^{\bar\imath}_j\, \tilde\omega^j.$$
Note that Eq. (3.8) does the right thing if we, for example, double the value
of the coordinate function x i . If x i doubles, then ∂ f /∂ x i halves, but Eq. (3.8)
then implies that ei halves, which means that the corresponding component
of V , namely V i , doubles, so that V i ∂ f /∂ x i is unchanged, as expected.
[Exercises 3.3–3.5]
Figure 3.3 The polar-coordinate basis vectors $e_r$ and $e_\theta$ at nearby points in the plane, illustrating the derivatives $d e_r/d\theta$, $d e_\theta/d\theta$, and $d e_\theta/dr$.
We are now aiming for much the same destination, but by a slightly different
route. This follows Schutz §§5.3–5.5 quite closely.
In order to illustrate the process, we will examine the basis vectors of
(plane) polar coordinates, as expressed in terms of the cartesian basis vectors e x
and e y . We will promptly see that our formalism is not restricted to this route.
The basis vectors of polar coordinates are
$$e_r = \cos\theta\, e_x + \sin\theta\, e_y, \qquad (3.10a)$$
$$e_\theta = -r\sin\theta\, e_x + r\cos\theta\, e_y \qquad (3.10b)$$
(compare the ‘dangerous bend’ discussion of coordinate bases in Section
2.3.2). A little algebra shows that
$$\frac{\partial}{\partial r}\, e_r = 0, \qquad (3.11a)$$
$$\frac{\partial}{\partial\theta}\, e_r = \frac{1}{r}\, e_\theta, \qquad (3.11b)$$
$$\frac{\partial}{\partial r}\, e_\theta = \frac{1}{r}\, e_\theta, \qquad (3.11c)$$
$$\frac{\partial}{\partial\theta}\, e_\theta = -r\, e_r, \qquad (3.11d)$$
so that we can see how the basis vectors change as we move to different points
in the plane (Figure 3.3), unlike the cartesian basis vectors.
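Equations (3.11) are easy to confirm numerically; the following Python fragment (my own check, not the book's) differentiates the basis vectors of Eq. (3.10) by central finite differences:

```python
import math

# e_r = (cos theta, sin theta) and e_theta = (-r sin theta, r cos theta)
# in cartesian components, as in Eq. (3.10).
def e_r(r, th):
    return (math.cos(th), math.sin(th))

def e_theta(r, th):
    return (-r * math.sin(th), r * math.cos(th))

r, th, h = 2.0, 0.9, 1e-6

# Eq. (3.11b): d(e_r)/d(theta) = (1/r) e_theta.
d_er_dth = tuple((a - b) / (2 * h)
                 for a, b in zip(e_r(r, th + h), e_r(r, th - h)))
expected = tuple(c / r for c in e_theta(r, th))
assert all(abs(a - b) < 1e-8 for a, b in zip(d_er_dth, expected))

# Eq. (3.11d): d(e_theta)/d(theta) = -r e_r.
d_eth_dth = tuple((a - b) / (2 * h)
                  for a, b in zip(e_theta(r, th + h), e_theta(r, th - h)))
assert all(abs(a + r * b) < 1e-7 for a, b in zip(d_eth_dth, e_r(r, th)))
```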
At any point in the plane, a vector $V$ has components $(V^r, V^\theta)$ in the polar basis at that point. We can differentiate this vector with respect to, say, $r$, in the obvious way
$$\frac{\partial V}{\partial r} = \frac{\partial}{\partial r}\bigl(V^r e_r + V^\theta e_\theta\bigr) = \frac{\partial V^r}{\partial r}\, e_r + V^r\frac{\partial e_r}{\partial r} + \frac{\partial V^\theta}{\partial r}\, e_\theta + V^\theta\frac{\partial e_\theta}{\partial r},$$
or, in index notation, with the summation index $i$ running over the ‘indexes’ $r$ and $\theta$,
$$\frac{\partial V}{\partial r} = \frac{\partial}{\partial r}\bigl(V^i e_i\bigr) = \frac{\partial V^i}{\partial r}\, e_i + V^i\, \frac{\partial e_i}{\partial r}. \qquad (3.12)$$
These coefficients $\Gamma^i_{jk}$ are known as the Christoffel symbols, and this set of $n \times n \times n$ numbers encodes all the information we need about how the coordinates, and their associated basis vectors, change within the space.¹ The object $\Gamma$ is not a tensor – it is merely a collection of numbers – so its indexes are not staggered (just like the transformation matrix $\Lambda$).
Returning to polar coordinates, we discover that we have already done all the work required to calculate the relevant Christoffel symbols. If we compare Eq. (3.14) with Eq. (3.11) (replacing $e_r \mapsto e_1$ and $e_\theta \mapsto e_2$), we see, for example, that
$$\frac{\partial e_1}{\partial x^2} = \frac{\partial e_r}{\partial\theta} = \Gamma^1_{12}\, e_1 + \Gamma^2_{12}\, e_2 = 0\, e_1 + \frac{1}{r}\, e_2,$$
so that $\Gamma^1_{12} = 0$ and $\Gamma^2_{12} = 1/r$. We will sometimes write this, slightly slangily, as $\Gamma^r_{r\theta} = 0$ and $\Gamma^\theta_{r\theta} = 1/r$. By continuing to match Eq. (3.11) with Eq. (3.14), we find
$$\Gamma^2_{12} = \Gamma^2_{21} = \frac{1}{r}, \qquad \Gamma^1_{22} = -r, \qquad \text{others zero}, \qquad (3.15)$$
or
$$\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = \frac{1}{r}, \qquad \Gamma^r_{\theta\theta} = -r, \qquad \text{others zero}. \qquad (3.16)$$
[Exercise 3.6]
For each j this is a vector at each point in the space – that is to say, it is
a vector field – with components given by the term in brackets. We denote
these components of the vector field by the notation V i ; j , where the semicolon
denotes covariant differentiation. We can further denote the derivative of the
component with a comma: ∂ V i/∂ x j ≡ V i , j . Then we can write
$$\frac{\partial V}{\partial x^j} = V^i{}_{;j}\, e_i, \qquad (3.18a)$$
$$V^i{}_{;j} = V^i{}_{,j} + V^k\, \Gamma^i_{kj}. \qquad (3.18b)$$
It is important to be clear about what you are looking at, here. The objects V i ; j
are numbers, which are the components , indexed by i, of a set of vectors ,
indexed by j. They look rather like tensor components, however, because of
how we have chosen to write them; and we are about to deduce that that is
exactly what they are in fact.2 But components of which tensor?
Final step: looking back at Eq. (3.8), we see that the differential operator $\partial/\partial x^j$ in Eq. (3.17) is associated with the basis vector $e_j$, and this is consistent with what we saw in Section 3.1.4: that $e_j$ is proportional to $\partial/\partial x^j$, and thus that $\partial V/\partial x^j$, in Eq. (3.18a), is proportional to $e_j$ also. That linearity permits us to define a $\binom{1}{1}$ tensor, which we shall call $\nabla V$, which we shall define by saying
2 It’s also unfortunate that the notation includes common punctuation characters: at the risk of stating the obvious, note that any commas or semicolons following such notation are part of the surrounding text.
that the action of it on the vector $e_j$ is the vector $\partial V/\partial x^j$ in Eq. (3.17). That is, using the notation of Chapter 2, we could write
$$(\nabla V)(\tilde\cdot\,, e_j) \equiv \frac{\partial V}{\partial x^j}(\tilde\cdot\,). \qquad (3.19)$$
can find them easily, via Eq. (3.18), or by transforming the components from
a system where we already know them (such as cartesian coordinates) into the
system { xk } – we know we can do this because we know that ∇ V is a tensor,
so we know how its components transform.
Finally, here, note that a scalar is independent of any coordinate system,
therefore all the complications of this section, which essentially involve
dealing with the fact that basis vectors are different at different points on the
manifold, disappear, and we discover that we have already obtained a covariant
derivative of a scalar, in Eq. (3.5). Thus
$$\nabla f \equiv \tilde d f \qquad (3.23)$$
and (where $V$ is tangent to a curve with coordinate $t$)
$$\nabla_V f = \tilde d f(V) = \frac{\partial f}{\partial t}. \qquad (3.24)$$
(cf. Schutz Eq. (5.53)). From this we can deduce the expression for the
covariant derivative of a one-form, which we shall simply quote as:
(∇_j p̃)_i ≡ (∇p̃)_{ij} ≡ p_{i;j} = p_{i,j} − p_k Γ^k_{ij}.  (3.26)
∇_j Ṽ = g(∇_j V, · ).  (3.32)
In components (and in all coordinate systems),
V_i = g_{ik} V^k  (3.33)

V_{i;j} = g_{ik} V^k_{;j}.  (3.34)
The first equation here is just the component form of Eq. (3.29) (compare
Eq. (2.14)). Note that the latter equation (which we obtained by comparing
Eqs. (3.32) and (3.26)) is not trivial. From the properties of the metric we
know that there exists some tensor that has components A_{ij} = g_{ik} V^k_{;j}: what
this expression tells us is the nontrivial statement that this A_{ij} is exactly V_{i;j}.
That is to say that we did not get Eq. (3.34) by differentiating Eq. (3.33),
though it looks rather as if we did. What do we get by differentiating
Eq. (3.33)? By the Leibniz rule, Eq. (3.28),
V_{i;j} = g_{ik;j} V^k + g_{ik} V^k_{;j}.
But comparing this with Eq. (3.34), we see that the first term on the right-
hand side must be zero, for arbitrary V . Thus, in all coordinate systems (and
relabelling),
g_{ij;k} = 0.  (3.35)
We have not exhausted the link between covariant differentiation and the
metric. The two are related via
Γ^i_{jk} = ½ g^{il} (g_{jl,k} + g_{kl,j} − g_{jk,l}).  (3.36)
The proof is in Schutz §5.4, leading up to his equation (5.75); it is not long
but involves, in Schutz’s words, some ‘advanced index gymnastics’. It depends
on first proving that
Γ^k_{ij} = Γ^k_{ji},  in all coordinate systems.  (3.37)
Equation (3.36) completely cuts the link between the Christoffel symbols
and cartesian coordinates, which might have lingered in your mind after
Section 3.2.2 – once we have a metric, we can work out the Christoffel
symbols’ components immediately. [Exercises 3.12–3.14]
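Equation (3.36) is entirely mechanical, and easy to evaluate by program. As a sketch (the unit two-sphere metric g = diag(1, sin²θ) and the finite-difference derivatives are my own illustrative choices, not from the text), the following computes the Christoffel symbols from the metric alone:

```python
import math

def metric(theta, phi):
    # Illustrative metric: the unit 2-sphere in (theta, phi); g = diag(1, sin^2 theta)
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def christoffel(x, h=1e-5):
    """Gamma^i_{jk} = (1/2) g^{il} (g_{jl,k} + g_{kl,j} - g_{jk,l}), Eq. (3.36),
    with the partial derivatives of the metric taken by central differences."""
    def dg(j, l, k):  # partial of g_{jl} with respect to x^k
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        return (metric(*xp)[j][l] - metric(*xm)[j][l]) / (2 * h)

    g = metric(*x)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    ginv = [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]
    return [[[0.5 * sum(ginv[i][l] * (dg(j, l, k) + dg(k, l, j) - dg(j, k, l))
                        for l in range(2))
              for k in range(2)]
             for j in range(2)]
            for i in range(2)]

# At theta = 1 the only nonzero symbols should be Gamma^theta_{phi phi}
# = -sin(theta)cos(theta), and Gamma^phi_{theta phi} = Gamma^phi_{phi theta} = cot(theta)
G = christoffel([1.0, 0.0])
```

With indices 0 = θ and 1 = φ, `G[i][j][k]` is Γ^i_{jk}; once the metric is given, no further input is needed, which is exactly the point made above.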
There is actually rather little to do to bring this over to the most general case
of curved spaces. See Schutz §§6.2–6.4.
The first step is to define carefully the notion of a ‘local inertial frame’.
This is the local flatness theorem , and the coordinates xk̄ represent a local
inertial frame , or LIF.
These coordinates are also known as ‘normal’ or ‘geodesic’ coordinates,
and geodesics expressed in these coordinates have a particularly simple form.
Also, in these coordinates, we can see from Eq. (3.36) that Γ^i_{jk} = 0 at P, which
is just another way of saying that this space is locally flat.
3.3 Covariant Differentiation in Curved Spaces 61
Schutz proves the theorem at the end of his §6.2, and Carroll (2004) in §2.5;
both are very illuminating.
df/dx = lim_{h→0} [f(x + h) − f(x)]/h.  (3.39)
That’s straightforward because it’s obvious what f (x + h )− f (x ) means, and how
we divide that by a number. Surely we can do a similar thing with vectors on a
manifold. Not trivially, because remember that the vectors at P are not defined
on the manifold but on the tangent plane T P (M ) associated with that point, so
the vectors at a different point Q are in a completely different space T Q (M ), and
so in turn it’s not obvious how to ‘subtract’ one vector from the other. Differen-
tiation on the manifold consists of finding ways to define just that ‘subtraction’.
There are several ways to do this. One produces the ‘Lie derivative’, which
is important in many respects, but which we will not examine.
Figure 3.5 Pulling a vector from one tangent plane, attached to Q, to another,
attached to P.
The crucial thing here is that nowhere in this account of the covariant
derivative have we mentioned coordinates at all.
We’ve actually said rather little, here, because although this passage has, I
hope, made clear how closely linked are the ideas of the covariant derivative
and parallel transport, we haven’t said how we go about choosing a definition
of parallelism, and we haven’t seen how this links to the covariant derivative
we introduced in Section 3.2. The link is the locally flat LIF. Although the
general idea of parallel transport, and in particular the definition I am about
to introduce, may seem obvious or intuitive, do remember that there is an
important element of arbitrariness in its actual definition.
Consider the coordinates representing the LIF at the point P . These are
cartesian coordinates describing a flat space (but not euclidean, remember,
since it does not have a euclidean metric). That means that the basis vectors
are constant – their derivatives are zero. A definition of parallelism now jumps
out at us: two nearby vectors are parallel if their components in the LIF are
the same . But this is the definition of parallelism that was implicit in the
differentiations we used in Sections 3.2.1 and 3.2.2, leading up to Eq. (3.18),
and so the covariant derivative we end up with is the same one: the tensor ∇ V
as defined in this section is the same as the covariant derivative of V in the LIF,
by our choice of parallelism; and the covariant derivative in the (flat) LIF is the
tensor ∇ V of Eq. (3.21).
Possibly rather surprisingly, we’re now finished: we’ve already done all of
the work required to define the covariant derivative in a curved space.
There are two further remarks remaining. Firstly, we can see that, in this
cartesian frame, covariant differentiation is the same as ordinary differentia-
tion, and so
V^i_{;j} = V^i_{,j}  in the LIF.

g_{ij;k} = g_{ij,k} = 0  at P,  (3.41)
Secondly, as mentioned at the end of Section 3.2.3, from Eq. (3.41) we can
deduce Eq. (3.36), since the conditions for that are still true in this more general
case.
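Metric compatibility, g_{ij;k} = 0, can be checked mechanically for any metric connection, since g_{ij;k} = g_{ij,k} − Γ^l_{ik} g_{lj} − Γ^l_{jk} g_{il}. A sketch (the two-sphere metric and its standard Christoffel symbols are my own illustrative choices):

```python
import math

def metric(th):
    # Illustrative example: the unit 2-sphere, coordinates x^0 = theta, x^1 = phi
    return [[1.0, 0.0], [0.0, math.sin(th) ** 2]]

def gamma(th):
    # Its Christoffel symbols (the standard results for this metric)
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -math.sin(th) * math.cos(th)              # Gamma^theta_{phi phi}
    G[1][0][1] = G[1][1][0] = math.cos(th) / math.sin(th)  # Gamma^phi_{theta phi}
    return G

def cov_deriv_metric(th, h=1e-6):
    """All components of g_{ij;k} = g_{ij,k} - Gamma^l_{ik} g_{lj} - Gamma^l_{jk} g_{il}."""
    G, g = gamma(th), metric(th)
    out = []
    for i in range(2):
        for j in range(2):
            for k in range(2):
                # only the x^0 = theta derivative is nonzero for this metric
                dgdk = ((metric(th + h)[i][j] - metric(th - h)[i][j]) / (2 * h)
                        if k == 0 else 0.0)
                out.append(dgdk
                           - sum(G[l][i][k] * g[l][j] for l in range(2))
                           - sum(G[l][j][k] * g[i][l] for l in range(2)))
    return out

# Every one of the eight components should vanish, Eq. (3.35)
components = cov_deriv_metric(1.0)
```

The cancellation is not coordinate luck: it happens (for this, the metric connection) at every point and in every coordinate system.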
The discussion in this section used less algebra than you may have
expected for such a crucial part of the argument. Writing down the
details of the construction of this derivative would be notationally intricate
and take us a little too far afield. If you want details, there are more formal
discussions of this mechanism via the notion of a ‘pull-back map’ in Schutz
(1980, §6.3) or Carroll (2004, appendix A), and the covariant derivative is
introduced in an axiomatic way in both Schutz (1980) and Stewart (1991).
Also, the definition of parallelism via the LIF is not the only one possible, but
picks out a particular derivative and set of connection coefficients, called the
‘metric connection’. Only with this connection are Eq. (3.36) and Eq. (3.41)
true. See also Schutz’s discussion of geodesics on his pages 156–157, which
elaborates the idea of parallelism introduced here.
[Exercise 3.15]
3.4 Geodesics
Consider a curve λ(t) and its tangent vectors U (that is, the set of vectors U
is a field that is defined at least at all the points along the curve λ). If we
have another vector field V , then the vector ∇ U V tells us how much V changes
as we move along the curve λ to which U is the tangent. What happens if,
instead of the arbitrary vector field V , we take the covariant derivative of U
itself? In general, ∇ U U will not be zero – if the curve ‘turns a corner’, then the
tangent vector after the corner will no longer be parallel to the tangent before
the corner. The meaning of ‘parallel’ here is exactly the same as the meaning
of ‘parallel’ that was built in to the definition of the covariant derivative in the
passage after Eq. (3.40). Curves that do not do this – that is, curves such that
all the tangent vectors are parallel to each other – are the nearest thing to a
straight line in the space, and indeed are a straight line in a flat space. A curve
such as this is called a geodesic, more formally defined as follows:
∇_U U = 0  ⇔  (U is the tangent to a geodesic)  (3.42)
U = U^j e_j,
(recalling Eq. (3.21)). The i-component of this equation is, using Eq. (3.18b),
U^j U^i_{;j} = U^j U^i_{,j} + U^j U^k Γ^i_{jk} = 0.
Let t be the parameter along the geodesic (that is, there is a parameterisation
of the geodesic, λ(t ), with parameter t , which U is the tangent to). Then, using
Eq. (3.5),
U^j = U(d̃x^j) = dx^j/dt
and
U^i_{,j} = ∂/∂x^j (dx^i/dt),
Another way of saying this is that an affine parameter is the time coordinate of
some inertial system, and all that means is that an affine parameter is the time
shown on some free-falling ‘good’ clock; it also means that a ‘good’ time is an
affine transformation of another ‘good’ time. There are further remarks about
affine parameters in Section 3.4.1.
l = ∫_curve ds = ∫_curve (g_{ij} dx^i dx^j)^{1/2} = ∫_{λ₀}^{λ₁} (g_{ij} ẋ^i ẋ^j)^{1/2} dλ ≡ ∫_{λ₀}^{λ₁} ṡ dλ,

where

ṡ = (g_{ij} ẋ^i ẋ^j)^{1/2}
expresses the relationship between parameter distance and proper distance, and
where dots indicate d/dλ. We wish to find a curve that is extremal, in the
sense that its length l is unchanged under first-order variations in the curve,
for fixed λ 0 and λ1. The calculus of variations (which as physicists you are
most likely to have met in the context of classical mechanics) tells us that such
an extremal curve xk (λ) is the solution of the Euler–Lagrange equations
d/dλ (∂ṡ/∂ẋ^k) − ∂ṡ/∂x^k = 0.
Have a go, yourself, at deriving the geodesic equation from this before reading
the rest of this section (at an appropriate point, you will need to restrict the
argument to parameterisations of s (λ) that are such that s̈ = 0.)
For ṡ as given in the above equation, we find fairly directly that
− (1/2)(s̈/ṡ²) 2g_{kj} ẋ^j + (1/2ṡ) (d/dλ)(2g_{kj} ẋ^j) − (1/2ṡ) g_{ij,k} ẋ^i ẋ^j = 0.  (3.44)
To simplify this, we can choose at this point to restrict ourselves to parameteri-
sations of the curve that are such that ds/dλ is constant along the curve, so that
s̈ = 0; this λ is an affine parameter as described in the previous section. With
this choice, and multiplying overall by ṡ , we find
g_{kj,l} ẋ^j ẋ^l + g_{kj} ẍ^j − ½ g_{ij,k} ẋ^i ẋ^j = 0
which, after relabelling and contracting with gkl , and comparing with
Eq. (3.36), reduces to
ẍ^k + Γ^k_{ij} ẋ^i ẋ^j = 0,  (3.45)
the geodesic equation of Eq. (3.43).
As well as showing the direct connection between the geodesic equation and
this deep variational principle, and thus making clear the idea that a geodesic
is an extremal distance, this also confirms the significance of affine parameters
that we touched on in Section 3.4. There is a ‘geodesic equation’ for non-affine
parameters (namely Eq. (3.44)), but only when we choose an affine parameter λ
does this equation take the relatively simple form of Eq. (3.43) or Eq. (3.45).
The general solution of Eq. (3.44) is the same path as the geodesic, but because
of the non-affine parameterisation it is not the same curve, and is not, formally,
a geodesic. As we have discussed before, the affine parameter of the geodesic
is chosen so that motion looks simple.
Schutz discusses this at the very end of his §6.4, and the exercises
corresponding to it.
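The geodesic equation Eq. (3.45) can also be integrated numerically, and the constancy of ṡ along an affinely parameterised geodesic checked directly. A sketch (the unit-sphere Christoffel symbols, the RK4 stepper, and the initial conditions are all my own illustrative choices):

```python
import math

def geodesic_rhs(state):
    # state = (theta, phi, thetadot, phidot); Eq. (3.45) on the unit sphere:
    # theta'' = sin(theta)cos(theta) phi'^2,   phi'' = -2 cot(theta) theta' phi'
    th, ph, dth, dph = state
    return (dth, dph,
            math.sin(th) * math.cos(th) * dph ** 2,
            -2.0 * (math.cos(th) / math.sin(th)) * dth * dph)

def rk4_step(state, dt):
    def add(s, k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(add(state, k1, dt / 2))
    k3 = geodesic_rhs(add(state, k2, dt / 2))
    k4 = geodesic_rhs(add(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def speed(state):  # s-dot = (g_ij xdot^i xdot^j)^(1/2)
    th, _, dth, dph = state
    return math.sqrt(dth ** 2 + math.sin(th) ** 2 * dph ** 2)

state = (1.0, 0.0, 0.3, 1.0)   # an arbitrary starting point and direction
s0 = speed(state)
for _ in range(1000):
    state = rk4_step(state, 0.005)
# s-dot should be unchanged: the parameter is affine, i.e. s-double-dot = 0
```

That ṡ comes out constant is exactly the condition s̈ = 0 imposed in the variational argument above.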
3.5 Curvature
We now come, finally, to the coordinate-independent description of curvature.
We approach it through the idea of parallel transport, as described in
Section 3.3.2, and specifically through the idea of transporting a vector round
a closed path. This section follows Schutz §6.5; MTW (1973) chapter 11 is
illuminating on this.
First, it’s important to have a clear picture of the way in which a vector will
change as it is moved around a curved surface. In Figure 3.7 we see a view of
a sphere and a vector that starts off on the equator, pointing ‘north’ (at 1). If
we parallel transport this along the line of longitude until we get to the North
Pole (2), then transport it south along a different line of longitude until we
get back to the equator (3), then back along the equator to the point we started
at (4), then we discover that the vector at the end does not end up parallel to the
vector as it started. The vector on its round trip picks up information about the
curvature of the surface; crucially this information is intrinsic to the surface,
and does not depend on looking at the sphere from ‘outside’. We now attempt
to quantify this change, as a measure of the curvature of the surface.
Figure 3.8 Taking a vector for a walk: the closed path ABCDA is bounded by
the coordinate lines x^σ = a, x^σ = a + δa, x^λ = b, and x^λ = b + δb.
V^i(B) = V^i(A) + ∫_A^B (∂V^i/∂x^σ) dx^σ
       = V^i(A) − ∫_A^B Γ^i_{kσ} V^k dx^σ
       = V^i(A) − ∫_{x^σ=a}^{a+δa} Γ^i_{kσ} V^k |_{x^λ=b} dx^σ,

δV^i = V^i(A_final) − V^i(A_init)
     = − ∫_{x^σ=a}^{a+δa} Γ^i_{jσ} V^j |_{x^λ=b} dx^σ
       − ∫_{x^λ=b}^{b+δb} Γ^i_{jλ} V^j |_{x^σ=a+δa} dx^λ
       + ∫_{x^σ=a}^{a+δa} Γ^i_{jσ} V^j |_{x^λ=b+δb} dx^σ
       + ∫_{x^λ=b}^{b+δb} Γ^i_{jλ} V^j |_{x^σ=a} dx^λ.  (3.47)
At this point we can take advantage of the fact that δa and δb are small by
construction, ignore terms in δa² and δb², and thus take the integrands to
be constant along the interval of integration (by expanding the integrand in
a Taylor series, convince yourself that ∫_a^{a+δa} f(x) dx = δa f(a) + O(δa²)).
We don’t know what Γ^i_{jλ} V^j |_{x^σ=a+δa} and Γ^i_{jσ} V^j |_{x^λ=b+δb} are (of course, since
we are doing this calculation for perfectly general Γ), but since δa is small, we
can estimate them using Taylor’s theorem, finding

Γ^i_{jλ} V^j |_{x^σ=a+δa} = Γ^i_{jλ} V^j |_{x^σ=a} + δa (∂/∂x^σ)(Γ^i_{jλ} V^j) |_{x^σ=a} + O(δa²)
Substituting these into Eq. (3.47), we find

δV^i ≈ ∫_{x^σ=a}^{a+δa} δb (∂/∂x^λ)(Γ^i_{jσ} V^j) |_{x^σ=a, x^λ=b} dx^σ
     − ∫_{x^λ=b}^{b+δb} δa (∂/∂x^σ)(Γ^i_{jλ} V^j) |_{x^σ=a, x^λ=b} dx^λ.
However, the integrands here are now constant with respect to the variable of
integration, so the integrals are easy:
δV^i ≈ δa δb [ (∂/∂x^λ)(Γ^i_{jσ} V^j) − (∂/∂x^σ)(Γ^i_{jλ} V^j) ],
with all quantities evaluated at the point A. If we now use Eq. (3.46) to get
rid of the differentials of V^j, we find, to first order,
δV^i = δx^σ δx^λ (Γ^i_{jσ,λ} − Γ^i_{jλ,σ} − Γ^i_{kσ} Γ^k_{jλ} + Γ^i_{kλ} Γ^k_{jσ}) V^j,  (3.48)
R^i_{jkl} = Γ^i_{jl,k} − Γ^i_{jk,l} + Γ^i_{σk} Γ^σ_{jl} − Γ^i_{σl} Γ^σ_{jk}  (3.49)
(after some relabelling) called the Riemann curvature tensor (this notation,
and in particular the overall sign, is consistent with Schutz; numerous other
conventions exist – see the discussion in Section 1.4.3). Thus Eq. (3.48)
becomes
δV^i = R^i_{jσλ} V^j δx^σ δx^λ = R(ω̃^i, V, δx^σ e_σ, δx^λ e_λ).  (3.50)
This tensor tells us how the vector V varies after it is parallel transported on
an arbitrary excursion in the area local to point A (‘local’, here, indicating
small δ a and δ b); that is, it encodes all the information about the local shape
of the manifold.
Another way to see the significance of the Riemann tensor is to consider
the effect of taking the covariant derivative of a vector with respect to first one
then another of the coordinates, ∇_i∇_j V. Defining the commutator

[∇_i, ∇_j] V^k ≡ ∇_i∇_j V^k − ∇_j∇_i V^k,  (3.51)
it turns out that
[∇_i, ∇_j] V^k = R^k_{lij} V^l  (3.52)
(see the ‘dangerous-bend’ paragraph in this section for details). This, or
something like it, might not be a surprise. We discovered the Riemann
tensor by taking a vector for a walk round the circuit ABCDA in Figure 3.8
and working out how it changed as a result. The commutator Eq. (3.51) is
effectively the result of taking a vector from A to C via B and via D , and asking
how the two resulting vectors are different.
The Riemann tensor has a number of symmetries. In a locally inertial frame,
R^i_{jkl} = ½ g^{im} (g_{ml,jk} − g_{mk,jl} + g_{jk,ml} − g_{jl,mk}),  (3.53)
and so
R_{ijkl} ≡ g_{im} R^m_{jkl} = ½ (g_{il,jk} − g_{ik,jl} + g_{jk,il} − g_{jl,ik}).  (3.54)
Note that this is not a tensor equation, even in these coordinates: in such inertial
coordinates V i ,j = V i ;j and so an expression involving single partial derivatives
of inertial coordinates can be trivially rewritten as a (covariant) tensor equation
by simply rewriting the commas as semicolons; however the same is not true
of second derivatives, so that Eq. (3.54) does not trivially correspond to a
covariant expression. 3
3 To see this, consider differentiating V twice, and note that V^k_{;ln} includes a term in Γ^k_{ml,n}. This
term is non-zero (in general) even in locally flat coordinates, which means that V^k_{;ln} does not
reduce to V^k_{,ln} in the LIF. That means, in turn, that Eq. (3.53) cannot be taken to be the LIF
version of a covariant equation, and thus that we cannot obtain a covariant expression for the
Riemann tensor by swapping these commas with semicolons. Compare also Eq. (3.38c), and the
argument leading up to Eq. (3.36).
Notice that the definition of the Riemann tensor in Eq. (3.52) does not
involve the metric; this is not a coincidence. The development of the
Riemann tensor up to Eq. (3.50) is (I think) usefully explicit, but also rather
laborious, and manifestly involves the Christoffel symbols, which you are
possibly used to thinking of as being very closely related to the metric. That’s
not false, but another way of getting to the Riemann tensor is to define the
covariant derivative ∇ U V as an almost immediate consequence of the definition
of parallel transport, and then notice that the operator
[∇_U, ∇_V] − ∇_{[U,V]} = R(U, V)

maps vectors to vectors, defining the (1,3) Riemann tensor R (this is not a
derivation; see Schutz (1980, §6.8)). The point here is that R is not just a
function of the metric, but is a completely separate tensor, even though, in
the case of a metric connection, it is calculable in terms of the metric via
expressions such as Eq. (3.53). This is why it is not a contradiction (though
I agree it is surprising at first thought) that the Riemann tensor starts off with
considerably more degrees of freedom than the metric tensor. There are some
further remarks about the number of degrees of freedom in the Riemann tensor
in Section 4.2.5.
[Exercises 3.17–3.19]
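Equation (3.49) is mechanical enough to be evaluated by a short program once the Christoffel symbols are known. A sketch (the unit-sphere symbols and the finite-difference derivatives are my own illustrative choices); it recovers the standard result R^θ_{φθφ} = sin²θ for the unit sphere:

```python
import math

def Gamma(x):
    # Christoffel symbols of the unit sphere, x = (theta, phi); indices 0=theta, 1=phi
    th = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -math.sin(th) * math.cos(th)              # Gamma^theta_{phi phi}
    G[1][0][1] = G[1][1][0] = math.cos(th) / math.sin(th)  # Gamma^phi_{theta phi}
    return G

def riemann(x, h=1e-5):
    """R^i_{jkl} = Gamma^i_{jl,k} - Gamma^i_{jk,l}
                 + Gamma^i_{sk} Gamma^s_{jl} - Gamma^i_{sl} Gamma^s_{jk},  Eq. (3.49)."""
    def dG(i, j, k, m):  # partial of Gamma^i_{jk} w.r.t. x^m, by central differences
        xp, xm = list(x), list(x)
        xp[m] += h
        xm[m] -= h
        return (Gamma(xp)[i][j][k] - Gamma(xm)[i][j][k]) / (2 * h)

    G = Gamma(x)
    R = [[[[0.0] * 2 for _ in range(2)] for _ in range(2)] for _ in range(2)]
    for i in range(2):
        for j in range(2):
            for k in range(2):
                for l in range(2):
                    R[i][j][k][l] = (dG(i, j, l, k) - dG(i, j, k, l)
                                     + sum(G[i][s][k] * G[s][j][l]
                                           - G[i][s][l] * G[s][j][k]
                                           for s in range(2)))
    return R

th = 1.2
R = riemann([th, 0.0])
# Expect R^theta_{phi theta phi} = sin^2(theta), antisymmetric in the last index pair
```

The antisymmetry in the last two indices is exact by construction; the value sin²θ is the curvature information that survives however we choose coordinates.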
Figure: a family of curves λ(t) crossed by connecting curves µ(s), with X the
tangent vector to the curves λ and ξ the tangent vector to the curves µ.
Schutz covers this at the end of his section 6.5, using a different argument
from the following. I plan to describe it in a different style here, partly because
a more geometrically minded explanation makes a possibly welcome change
from relentless components.
First, some useful formulae. (i) Marginally rewriting Eq. (3.52), we find
[∇_X, ∇_Y] V = X^j Y^l [∇_j, ∇_l] V^k e_k = R^k_{ijl} V^i X^j Y^l e_k.  (3.56)

(ii) Using the commutator [A, B] ≡ AB − BA, we find

∇_A B − ∇_B A = [A, B],  (3.57)
have. Because of this construction, it does not matter in which order we take
the derivatives d/dt and d/ds , so that
(d/dt)(d/ds) = (d/ds)(d/dt)  ⇔  [d/dt, d/ds] = 0,

so that, writing X = d/dt and ξ = d/ds, Eq. (3.57) gives

∇_X ξ = ∇_ξ X.  (3.58)
Now suppose particularly that the curves λ(t) are geodesics, which means
that ∇ X X = 0. Then the vector ξ joins points on the two geodesics that have
the same affine parameter.
That means that the second derivative of ξ carries information about how
quickly the two geodesics are accelerating apart (note that this is ‘acceleration’
in the sense of ‘second derivative of position coordinate’, and not anything
that would be measured by an accelerometer – observers on the two geodesics
would of course experience zero physical acceleration). Using Eq. (3.58) the
calculation is easy. The second derivative is
∇_X ∇_X ξ = ∇_X ∇_ξ X = ∇_ξ ∇_X X + R^k_{ijl} X^i X^j ξ^l e_k,  (3.59)
where the first equality comes from Eq. (3.58) and the second from Eq. (3.56).
The first term on the right-hand side disappears since ∇ X X = 0 along a
geodesic. Now, the covariant derivative with respect to the vector X is just
the derivative with respect to the geodesic’s parameter t (since λ is part of a
coordinate system; see Section 3.2.2), so that this equation turns into
(d²ξ/dt²)^k = R^k_{ijl} X^i X^j ξ^l.  (3.60)
Thus the amount by which two geodesics diverge depends on the curvature
of the space they are passing through. Note that the left-hand side here is the
k-component of the second derivative of the vector ξ, and is a conventional
shortcut for ∇_X∇_X ξ; it is not the second derivative of the ξ^k component,
d²ξ^k/dt², though some books (e.g., Stewart (1991, §1.9)) rather confusingly
write it this way. [Exercises 3.20–3.22]
Exercises
Most of the exercises in Schutz’s §6.8 should be accessible.
u = ½ (x² − y²),    v = xy.  (i)
You might also want to look back at the ‘dangerous bend’ paragraphs below
Eq. (2.23).
(a) Write x¹ = x, x² = y, x^1̄ = u, x^2̄ = v, and thus, referring to
Eq. (3.9), calculate the matrices Λ^ı̄_j and Λ^i_j̄. (The easiest way of doing the latter
calculation is to calculate ∂u/∂x, ∂u/∂y, . . . , and solve for ∂x/∂u, ∂x/∂v, . . . ,
ending up with expressions in terms of x, y, and r² = x² + y².)
(b) From Eq. (2.24),

g_{ı̄j̄} = Λ^i_ı̄ Λ^j_j̄ g_{ij}.

Thus calculate the components g_{ı̄j̄} of the metric in terms of the coordinates
{u, v} (you can end up with expressions in terms of u and v, via 4(u² + v²) = r⁴).
Exercise 3.4 (§ 3.1.4) (a) Write down the expressions for cartesian coordinates
{x, y} as functions of polar coordinates {r, θ}, thus calculate ∂x/∂r, ∂x/∂θ,
∂y/∂r and ∂y/∂θ, and thus find the components of the transformation matrix
from cartesian to polar coordinates, Eq. (3.9b).
(b) The inverse transformation is

r² = x² + y²,    θ = arctan(y/x).
Differentiate these, and thus obtain the inverse transformation matrix
Eq. (3.9a). Verify that the product of these two matrices is indeed the identity
matrix. Compare Section 2.3.2.
(c) Let V be a vector with cartesian coordinates { x , y } , so that
V = xe x + ye y .
Show that V̇ and V̈ have components {ẋ, ẏ} and {ẍ, ÿ} in this basis.
(d) Using the relations x = r cos θ and y = r sin θ , write down expressions
for ẋ , ẏ , ẍ and ÿ in terms of polar coordinates r and θ and their time derivatives.
(e) Now use the general transformation law Eq. (3.9a),

V^ı̄ = Λ^ı̄_j V^j = (∂x^ı̄/∂x^j) V^j,
to transform the components of the vectors V̇ and V̈ , which you obtained in (c),
into the polar basis { e r , e θ } , and show that
V̇ = ṙ e_r + θ̇ e_θ

V̈ = (r̈ − r θ̇²) e_r + (θ̈ + (2/r) ṙ θ̇) e_θ.
[u ++ ]
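The result of this exercise can be spot-checked numerically. A sketch (the test trajectory r(t) = 2 + t/2, θ(t) = 3t/10 is my own arbitrary choice): differentiate the cartesian components twice, transform with Λ^ı̄_j, and compare with the closed forms r̈ − rθ̇² and θ̈ + (2/r)ṙθ̇.

```python
import math

def r(t): return 2.0 + 0.5 * t     # an arbitrary test trajectory (my choice)
def theta(t): return 0.3 * t

def x(t): return r(t) * math.cos(theta(t))
def y(t): return r(t) * math.sin(theta(t))

def d2(f, t, h=1e-4):              # second derivative by central differences
    return (f(t + h) - 2 * f(t) + f(t - h)) / h ** 2

t = 1.0
ax, ay = d2(x, t), d2(y, t)        # cartesian components of the acceleration

# transform with Lambda^ibar_j = d x^ibar / d x^j, Eq. (3.9a):
# A^r = cos(theta) ax + sin(theta) ay,  A^theta = (-sin(theta) ax + cos(theta) ay)/r
th, rr = theta(t), r(t)
Ar = math.cos(th) * ax + math.sin(th) * ay
Ath = (-math.sin(th) * ax + math.cos(th) * ay) / rr

# for this trajectory rdot = 0.5, rddot = 0, thetadot = 0.3, thetaddot = 0
expected_Ar = 0.0 - rr * 0.3 ** 2            # rddot - r thetadot^2
expected_Ath = 0.0 + (2.0 / rr) * 0.5 * 0.3  # thetaddot + (2/r) rdot thetadot
```

Note that the components are taken in the coordinate basis {e_r, e_θ}, not the orthonormal polar basis, matching the conventions of the exercise.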
φ (x , y ) = x 2 + y 2 + 2xy ,
(a) From Eq. (3.5), the ith component of the gradient one-form d̃φ is
obtained by taking the contraction of the gradient with the basis vector
e_i = ∂/∂x^i. Thus write down the components of the gradient one-form with
respect to the cartesian basis.
(b) The result of Exercise 2.10 says that the transformation law for the
components of a one-form is
A_ı̄ = Λ^j_ı̄ A_j = (∂x^j/∂x^ı̄) A_j.
Exercise 3.6 (§ 3.2.1) In Eq. (3.16) we see, for example, two lowered θs
on the left-hand side with no θ on the right-hand side. Why isn’t this an
Einstein summation convention error?
Exercise 3.8 (§ 3.2.2) Do Exercise 3.7 again, but this time working with
the one-form field Ã, with cartesian components {x² + 3y, y² + 3x}.
Exercise 3.9 (§ 3.2.2) Comparing Exercises 3.7 and 3.8, verify that in both
cartesian and polar coordinates g_{ik} V^k_{;j} = A_{i;j}. [d − ]
Exercise 3.11 (§ 3.2.2) Deduce Eq. (3.26). Use the Leibniz rule
∇_V⟨p̃, A⟩ = ⟨∇_V p̃, A⟩ + ⟨p̃, ∇_V A⟩, and Eq. (3.24) acting on f = ⟨p̃, A⟩. [u + ]
Exercise 3.12 (§ 3.2.3) Derive Eq. (3.33) from Eq. (3.29) (one-liner).
Derive Eq. (3.34) from Eq. (3.32) (few lines). [d − ]
Exercise 3.13 (§ 3.2.3) Show that the Christoffel symbols transform as

Γ^ı̄_{j̄k̄} = (∂x^ı̄/∂x^i)(∂x^j/∂x^j̄)(∂x^k/∂x^k̄) Γ^i_{jk} + (∂x^ı̄/∂x^l)(∂²x^l/∂x^j̄∂x^k̄).
The first term in this looks like Eq. (2.24); but the second term will not be zero
in general, demonstrating that Christoffel symbols are not the components of
any tensor. [d + ]
Exercise 3.14 (§ 3.2.3) Suppose that in one coordinate system the Christof-
fel symbols are symmetric in their lower indexes, ²jki = ²ikj . By considering
the transformation law for the Christoffel symbols, obtained in Exercise 3.13,
show that they will be symmetric in any coordinate system.
Exercise 3.15 (§ 3.3.2) Things to think about: Why have you never had to
learn about covariant differentiation before now? The glib answer is, of course,
that you weren’t learning GR; but what was it about the vector calculus that
you did learn that meant you never had to know about connection coefficients?
Or, given that you did effectively learn about them, but didn’t know that was
what they were called, why do we have to go into so much more detail about
them now? There are a variety of answers to these questions, at different levels.
Exercise 3.16 (§ 3.4) (a) On the surface of a sphere, we can pick coordi-
nates θ and φ, where θ is the colatitude, and φ is the azimuthal coordinate. The
components of the metric in these coordinates are

g_{θθ} = 1,   g_{φφ} = sin²θ,   others zero.

Show that the components of the metric with raised indexes are

g^{θθ} = 1,   g^{φφ} = 1/sin²θ,   others zero.
(b) The Christoffel symbols are defined as

Γ^i_{kl} = ½ g^{ij} (g_{jk,l} + g_{jl,k} − g_{kl,j}),

and the geodesic equation is

d/dt (dx^i/dt) + Γ^i_{jk} (dx^j/dt)(dx^k/dt) = 0,
for a geodesic with parameter t. Using these, find the Christoffel symbols for
these coordinates (i.e., Γ^θ_{θθ}, Γ^θ_{θφ} and so on), and thus show that the geodesic
equations for these coordinates are

θ̈ − sin θ cos θ φ̇² = 0,    φ̈ + 2 cot θ θ̇ φ̇ = 0.

(c) Which of the following curves satisfy these geodesic equations?
1. φ = t , θ = π/2 2. φ = t, θ = π/4 3. φ = t , θ = 0
4. φ = t , θ = t 5. φ = φ0, θ = t 6. φ = φ0 , θ = 2t − 1
7. φ = φ0 , θ = t2
(d) The surface of the sphere can also be described using the coordinates
( x , y ) of the Mercator projection , as used on some maps of the world to
represent the surface in the form of a rectangular grid. The coordinates (x , y)
can be given as a function of the coordinates (θ , φ) listed in this problem.
If you were to perform the same calculation as above using these coordi-
nates, would you obtain the same Christoffel symbols? Explain your answer.
Comment on the relationship between the curves in part (c) and the geodesics
obtained using the Mercator coordinates. [d + u++ ]
Exercise 3.18 (§ 3.5.1) Prove Eq. (3.53). Expand the definition of the
Riemann tensor in Eq. (3.49) in the local inertial frame, in which g_{kl,m} = 0
(Eq. (3.38)). Recall that partial derivatives always commute.
(compare Section 3.1.2), and you verified in Exercise 3.16 that this does indeed
satisfy the geodesic equation.
X^θ = 1,   X^φ = 0,
and, by using the components of the curvature tensor that you worked out in
Exercise 3.19, show that
(∇_X∇_X ξ)^θ = 0  (iia)

(∇_X∇_X ξ)^φ + ξ^φ = 0.  (iib)
This tells us that the connecting vector – the tangent vector to the family of
curves µ(s ), connecting points of equal affine parameter along the geodes-
ics λ(t ) – does not change its θ component, but does change its φ component.
Which isn’t much of a surprise.
(c) Can we get more out of this? Yes, but to do that we have to calculate
∇ X ∇ X ξ , which isn’t quite as challenging as it might look. From Eq. (3.21) we
write
∇_X ξ = X^i ∇_i ξ = X^i e_j ξ^j_{;i} = X^i e_j (∂ξ^j/∂x^i + Γ^j_{iγ} ξ^γ).  (iii)
You have worked out the Christoffel symbols for these coordinates in
Exercise 3.16, so we could trundle on through this calculation, and find
expressions for the components of the connecting vector ξ from Eq. (ii). In
order to illustrate something useful in a reasonable amount of time, however,
we will short-circuit that by using our previous knowledge of this coordinate
system.
The curve
µ(s ) : θ (s ) = θ0 , φ (s ) = s
is not a geodesic (it is a small circle at colatitude θ0 ), but it does connect points
on the geodesics λ(t) with equal affine parameter t ; it is a connecting curve for
this family of geodesics. Convince yourself of this and, as in part (a), satisfy
yourself that the tangent vector to this curve, ξ = d/ds , has components ξ θ = 0
and ξ φ = 1; and use this together with the components of the tangent vector X
and the expression Eq. (iii) to deduce that
ξ̇ ≡ ∇_X ξ = 0 e_θ + cot θ e_φ,
(d) So far so good. In exactly the same way, take the covariant derivative
of ξ˙ , and discover that
∇_X ξ̇ = ∇_X∇_X ξ = 0 e_θ − 1 e_φ = −ξ,
and note that this ξ does in fact accord with the geodesic deviation equation of
Eq. (ii).
Note that this example is somewhat fake, in that, in (c), we set up the
curve µ( s ) as a connecting curve, and all we have done here is verify that
this was consistent. If we were doing this for real, we would not know (all of)
the components of ξ beforehand, but would carry on differentiating ξ as we
started to do in (c), put the result into the differential equation Eq. (ii) and thus
deduce expressions for the components ξ k .
As a final point, note that the length of the connecting vector ξ is just
Exercise 3.22 (§ 3.5.2) In the newtonian limit, the metric can be written as
g_{ij} = η_{ij} + h_{ij}, where

η_{ij} = diag(−1, 1, 1, 1)

and

h_{ij} = −2φ if i = j, and 0 if i ≠ j.
4.1 The Energy-Momentum Tensor 85
Ptolemy is right, here (though some of the details of his cosmology have been
adjusted since he wrote this, and what he refers to as ‘theology’ is now more
often referred to as ‘Quantum Gravity’): mathematics we can know all about,
with certainty; for physics we have to make guesses. He is also correct about
the contribution of mathematics, and we’ll discover that our first insights in
this section do indeed come from considering the peculiarities of the material
world’s motion from place to place.
To match our return to physics, here, we’re now going to specialise to
working in a four-dimensional manifold, and to metrics with a signature of
+ 2, so that the LIF has the same metric as Minkowski space. To further
match Section 2.3.4, we will also introduce a slight, and traditional, notational
change. We will now index components with greek letters, µ , ν , α, β, . . . ,
which we take to run from 0 to 3; we will sometimes write indexes with latin
letters i, j,. . . , taking these to run over the spacelike directions, 1, 2, and 3.
Figure: a volume element of dust, of dimensions Δx, Δy, Δz.
Consider a collection of dust particles that are not moving relative to each
other, so that the collection has zero
pressure – the dust’s only physical property is mass-density. That is to say that
there is a frame, called the momentarily comoving reference frame (MCRF),
with respect to which all the particles in a given volume have zero velocity. 1
We can suppose for the moment that all the dust particles have the same
(rest) mass m , but that different parts of the dust cloud may have different
number densities n . Just as the particle mass m is the mass in the particle’s rest
frame, the number density n is always that measured in the MCRF.
If we Lorentz-transform to a frame that is moving with velocity v with
respect to the MCRF, a (stationary) volume element of size Δx Δy Δz
will be Lorentz-contracted into a (moving) element of size Δx′ Δy′ Δz′ =
(Δx/γ) Δy Δz, where γ is the familiar Lorentz factor γ = (1 − v²)^{−1/2},
supposing that the frames are chosen such that the relative motion is along
the x-axis. That means that the number density of particles, as measured in
the frame relative to which the dust is moving, goes up to γn. What, then, is the
flux of particles through an area Δy Δz in the y′–z′ plane? The particles in the
volume all pass through the area Δy′ Δz′ in a time Δt′, where Δx′ = v Δt′, and
so this total number of particles is (γn)(v Δt′) Δy′ Δz′. Thus the total number
of particles per unit time and per unit area, which is the flux in the x′-direction,
is γnv. Writing N^x for this x-directed flux, and v^x for the velocity along the
x-axis, v, this is

N^x = γ n v^x.  (4.1)
We can generalise this, and guess that we can reasonably define a flux vector
N = nU , (4.2)
where again n is the dust number density in its MCRF, and U is the
4-velocity vector (γ , γ v x , γ v y , γ v z ). Since the velocity vector has the property
g( U , U ) = − 1 (remember your SR, and that the 4-velocity vector U = ( 1, 0)
1 This is also, interchangeably, sometimes called the Instantaneously Comoving Reference Frame
(ICRF).
(where the last expression denotes the x -component of N , rather than the whole
set of components). That is, contracting the flux vector with a gradient one-
form produces the flux across the corresponding surface; this is true in general,
so that N(ñ) produces the flux across the surface φ = constant, where
ñ = d̃φ/|d̃φ|. The vector N = nU is manifestly geometrical; it is our ability to
recover the flux in this way that justifies our naming this the ‘flux vector’.
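Equations (4.1) and (4.2) are quickly checked with numbers. A minimal sketch (the sample values v = 0.6 and n = 2 are my own choices, with c = 1):

```python
import math

v = (0.6, 0.0, 0.0)   # dust 3-velocity in this frame (assumed sample value), c = 1
n = 2.0               # number density in the MCRF

gamma = 1.0 / math.sqrt(1.0 - sum(vi ** 2 for vi in v))
U = (gamma, gamma * v[0], gamma * v[1], gamma * v[2])   # 4-velocity
N = tuple(n * u for u in U)                             # flux vector N = nU, Eq. (4.2)

# time component of N: the boosted density gamma*n seen in this frame;
# x component: the flux gamma*n*v of Eq. (4.1)
eta = (-1.0, 1.0, 1.0, 1.0)
norm = sum(e * u * u for e, u in zip(eta, U))           # g(U, U) should be -1
```

Here γ = 1.25, so N = (2.5, 1.5, 0, 0): the density goes up by γ and the x-flux is γnv, while g(U, U) = −1 independently of v.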
    σ_µ = ε_µαβγ A^α B^β C^γ,                       (4.11)
where A = (A⁰, a), B = (B⁰, b), and C = (C⁰, c) are three linearly independent
vectors, and ε_µαβγ is the Levi-Civita symbol, which is such that

    ε_µαβγ = { +1 if µαβγ is an even permutation of 0123
             { −1 if it is an odd permutation                (4.12)
             {  0 otherwise.
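This definition can be checked numerically. The sketch below (the helper function is our own, not from the text) builds σ_µ from Eq. (4.11) and confirms that the zeroth component reproduces the vector triple product of the spatial parts:

```python
import itertools
import numpy as np

def levi_civita(*indices):
    """Sign of the permutation `indices` of (0, 1, 2, 3); 0 on a repeated index."""
    if len(set(indices)) != len(indices):
        return 0
    sign = 1
    idx = list(indices)
    for i in range(len(idx)):
        for j in range(i + 1, len(idx)):
            if idx[i] > idx[j]:
                sign = -sign          # count inversions
    return sign

# Three linearly independent 4-vectors, A = (A^0, a) and so on
A = np.array([1.0, 1.0, 0.0, 0.0])
B = np.array([2.0, 0.0, 1.0, 0.0])
C = np.array([3.0, 0.0, 0.0, 1.0])

# sigma_mu = eps_{mu alpha beta gamma} A^alpha B^beta C^gamma   (Eq. 4.11)
sigma = np.zeros(4)
for mu, a, b, c in itertools.product(range(4), repeat=4):
    sigma[mu] += levi_civita(mu, a, b, c) * A[a] * B[b] * C[c]

# sigma_0 should equal the triple product a . (b x c) of the spatial parts
triple = np.dot(A[1:], np.cross(B[1:], C[1:]))
print(sigma[0], triple)
```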
    σ_0 = ε_0αβγ A^α B^β C^γ,
and you may recall from linear algebra that this is the expression for a · (b × c),
the vector triple product, which gives the volume, V , of the parallelepiped
bounded by the three 3-vectors a, b , and c. Since each of these vectors is
2 The account here is closely compatible with MTW (1973, chapter 5). It covers the same material
as the previous section, but in a less heuristic way, at the expense of introducing some new
maths. It doesn’t have a ‘dangerous bend’ marker but you should feel free to skip it if the
previous section was adequately satisfying to your mathematical sensibilities.
[Figure 4.2: a box spanned by the vectors A, B, and C, with associated momentum p and volume one-form σ̃.]
orthogonal to σ̃ (that is, ⟨σ̃, A⟩ = 0, etc.), the 3-d volume that they span
is a hyperplane of the one-form σ̃. This volume (shown in Figure 4.2, with
the direction C suppressed) contains matter or other energy with associated
momenta p.
This (timelike) volume is moving through space-time with a velocity U ,
where U = (1, 0) in the volume's rest frame. The one-form dual to this velocity
is Ũ = g(U, · ), which has components Ũ = (−1, 0). Note that, promptly from
this definition,

    ⟨Ũ, U⟩ = g(U, U) = −1.
This (spacelike) volume, which is parallel to C (and thus in the direction of the
basis one-form ω̃³), clearly represents the amount of space-time swept out by
the top of the box, AB, in time Δτ. The one-form σ̃, therefore, represents a
92 4 Energy, Momentum, and Einstein’s Equations
which (you will not at this point be surprised to learn) we can call the ‘energy-
momentum tensor’, the action of which, when given a volume one-form ± σ , is
to produce the quantity of energy-momentum contained within that volume:
    p_box = T( · , σ̃).                              (4.13)
We can see that this makes sense, in a number of ways, using the examples
of σ̃ mentioned previously.
Contracting T with the timelike σ̃, in Eq. (4.13), gives us the 4-momentum
contained within that (spacelike) volume, and directed into the future. The
energy instantaneously contained within the box, which is the zeroth component
of p_box in its rest frame, can be extracted by contracting it with the basis
one-form ω̃⁰ which, in this frame, is just −Ũ, to give

    E = p_box(ω̃⁰) = −p_box(Ũ) = −T(Ũ, σ̃) = +V T(Ũ, Ũ),
as defined by Eq. (4.13), and comparison with Eq. (4.4) shows that this tensor
is exactly the energy-momentum tensor we obtained earlier.
The expression for the volume form, Eq. (4.11), may appear to be
pulled from a hat, but in fact it emerges fairly naturally from a larger
theory of differential forms; the one-forms we have been using are the simplest
objects in a sequence of n-forms. This ‘exterior calculus’ can be used, amongst
other things, for providing yet another way of discussing differentiation (and
integration) in a curved manifold, alongside the covariant derivative we have
extensively used, and the Lie derivative we have mentioned in passing. MTW
discuss this approach in their chapter 4; Carroll describes them, very lucidly,
in his section 2.9; Schutz (1980) gives an extensive treatment in chapter 4; they
are well covered in other advanced GR textbooks.
[Exercise 4.2]
4.2 The Laws of Physics in Curved Space-time 93
    F^µν_,ν = 4π J^µ                                (4.15a)
    F_µν,λ + F_νλ,µ + F_λµ,ν = 0.                   (4.15b)
The Faraday tensor F and the energy-momentum tensor T together form the
source for the gravitational field. Notwithstanding that, we shall not explicitly
include the Faraday tensor in the discussion that follows.
The Faraday tensor, and Maxwell’s equations, take a particularly compact
and elegant form when expressed in terms of the exterior derivatives mentioned
in passing at the end of Section 4.1.3.
It is possibly worth highlighting that the components of Eq. (4.14) are
manifestly frame-dependent – you can pick a frame where either E or B
disappears:
It is known that Maxwell’s electrodynamics – as usually understood at the present
time – when applied to moving bodies, leads to asymmetries which do not appear
to be inherent in the phenomena. . . . Examples of this sort, together with the
unsuccessful attempts to discover any motion of the earth relatively to the “light
medium,” suggest that the phenomena of electrodynamics as well as of mechanics
possess no properties corresponding to the idea of absolute rest.
(Einstein, 1905, paragraphs 1 and 2).
    R_βν ≡ g^αµ R_αβµν = R^µ_βµν,                   (4.16)
and the Ricci scalar obtained by further contracting the Ricci tensor,
    R ≡ g^βν R_βν = g^βν g^αµ R_αβµν.               (4.17)
Recall that Eq. (3.54) was evaluated in LIF coordinates; however, since in these
coordinates Γ^µ_αβ = 0 (though Γ^µ_αβ,σ need not be zero), partial differentiation
and covariant differentiation are equivalent, and Eq. (4.19) can be rewritten
    G^αβ_;β = 0,                                    (4.22)

    G^αβ ≡ R^αβ − (1/2) g^αβ R.                     (4.23)
From its name, and the alluring property Eq. (4.22), you can guess that this
tensor turns out to be particularly important for us. There are some further
remarks about the Ricci tensor in Section 4.2.5. Some texts define the Einstein
and Ricci tensors with a different overall sign; see Section 1.4.3.
Anyway, back to the physics.
Rindler refers to this as the ‘strong’ EP, and discusses it under that title with
characteristic care, distinguishing it from the ‘semistrong’ EP, and the ‘weak’
EP, which is the statement that inertial and gravitational mass are the same.
The EP gives us a route from the physics we understand to the physics
we don’t (yet). That is, given that we understand how to do physics in the
inertial frames of SR, we can import this understanding into the apparently
very different world of curved – possibly completely round the twist – space-
times, since the EP tells us that physics works locally in exactly the same way
in any LIF, free-falling in a curved space-time.
So that tells us that an electric motor, say, will work just as happily while we
free-fall into a black hole as it would in any less doomed SR frame. It
does immediately constrain the general form of physical laws, since it requires
that, whatever their form in general, they must reduce to the SR version when
expressed in the coordinates of a LIF. For example, whatever form Maxwell’s
equations take in a curved space-time, they must reduce to the SR form,
Eq. (4.15), when expressed in the coordinates of any LIF. The same goes for
conservation laws such as Eq. (4.9) or Eq. (4.10). This form of the EP doesn’t,
however, rule out the possibility that the curved space-time law is (much) more
complicated in general, and simply (and even magically) reduces to a simple
SR form when in a LIF. Specifically, it doesn’t rule out the possibility of cur-
vature coupling , where the general form of a conservation law such as Eq. (4.9)
has some dependence on the local curvature, which disappears in a LIF.
For that, we need a slightly stronger wording of the EP as quoted earlier in
the section (see Schutz §7.1; Rindler §8.9 quotes this as a ‘reformulation’ of
the EP):
The Strong Equivalence Principle: Any physical law that can be expressed in
tensor notation in SR has exactly the same form in a locally inertial frame of a
curved space-time. (4.24)
The difference here is that this says, in effect, that only geometrical statements
count (this is why we’ve been making such a fuss about the primacy of
geometrical objects, and the relative unimportance of their components,
throughout the book). That is, it says that a SR conservation law such as
Eq. (4.9), T^µν_,ν = 0, has the same form in a LIF, and as a result, because
covariant differentiation reduces to partial differentiation in the LIF, the partial
derivative here is really just the LIF form of a covariant derivative, and so the
general form of this law is
    T^µν_;ν = 0,                                    (4.25)
with the comma turning straight into a semicolon, and no extra curvature terms
appearing on the right-hand side . That is why this form of the EP is sometimes
referred to as the ‘ comma-goes-to-semicolon ’ rule.
Note that this comma-goes-to-semicolon is emphatically not what happened
in the step between, for example, Eq. (4.19) and Eq. (4.20), and in various
similar manoeuvres throughout Chapter 3 (such as before Eq. (3.41) and after
Eq. (3.54)). What was happening there was a mathematical step: covariant
differentiation of a geometrical object is equivalent to partial differentiation
when in a LIF. We have a true statement about partial differentiation in
Eq. (4.19), so the same statement must be true of covariant differentiation; such
a statement in one frame is true in any frame, hence the generality. The Strong
EP comma-goes-to-semicolon rule, on the other hand, is making a physical
statement, namely that the statement of a physical law in a LIF directly implies
a fully covariant law, which is no more complicated .
It is possibly not obvious, but the Strong EP also tells us how matter is
affected by space-time. In SR, a particle at rest in an inertial frame moves
along the time axis of the Minkowski diagram – that is, along the timelike
coordinate direction of the LIF, which is a geodesic. The Strong EP tells us
that the same must be true in GR, so that this picks out the curves generated by
the timelike coordinate of a LIF, which is to say:
This, like the Strong EP, is a physical statement about our universe, rather than
a mathematical one. We will return to this very important point in the sections
to follow. [Exercises 4.3 and 4.4]
steps. First, you work out which geodesic it will travel along: this involves
solving Einstein’s equations, and working out from the initial conditions of
the motion which geodesic your particle is actually on, amongst the large
number of possible geodesics going through the initial point in space-time.
Secondly, you work out how to translate from the simple motion in the inertial
coordinates attached to the particle, to the coordinates of interest (presumably
attached to you).
The key thing on the way to the important insight here, is to note that
if you’re moving along a geodesic – if you’re in free fall – you are not
being accelerated , in the very practical sense that if you were carrying an
accelerometer, it would register no acceleration. If instead you stand still on
earth, and drop a ball from your hand, the ball is showing the path you would
have taken, were it not for the floor. That is, it is the force exerted by the floor
on your feet that is accelerating you away from your ‘natural’ free-fall path. If
you hold an accelerometer in your hand – for example, a mass on a spring –
you can see your acceleration register as the spring extends beyond the length
it would have in free fall.
In other words, we’ve been thinking of this situation backwards. We’re
used to standing on the ground being the normal state, and falling being the
exceptional one (we’re primates, after all, and not falling out of trees has long
been regarded as a key skill). But GR says that we’ve got that inside out:
inertial motion, which in the presence of masses we recognise as free fall,
is the simplest, or normal, or ‘natural’ motion, requiring no explanation, and it
is not-falling that has to be explained. 3 The EP says that the force of gravity
doesn’t just feel like being forced upwards by the floor, it is being accelerated
upwards by the floor.
3 This term ‘natural motion’ is clearly not being used in a technical sense. The history of physics
might be said to consist of a sequence of attempts – by Aristotle, Ptolemy, Kepler, Galileo,
Newton, and Einstein – to identify a successively refined idea of ‘natural motion’ which
adequately and fundamentally explains the observed behaviour of the cosmos. Currently ‘move
along your locally-minkowskian t-axis’ is it.
Einstein’s equations plausible . Schutz does this in his §§8.1–8.2; Rindler does
it very well in his §§8.2 and 8.10; essentially every textbook on GR does it in
one way or another, either heuristically or axiomatically.
Newton’s theory of gravity can be expressed in terms of a gravitational
field φ. The gravitational force f on a test particle of mass m is a three-vector
with components f_i = −m φ_,i, and the source of the field is mass density ρ,
with the field equation connecting the two being

    φ^,i_,i = 4π G ρ                                (4.27)

(with the sum being taken over the three space indexes, and where φ^,i_,i =
g^ij φ_,ij = g^ij ∂²φ/∂x^i ∂x^j). This is Poisson's equation. In a region that does not
contain any matter – for example an area of space that is not inside a star or a
planet or a person – the mass density ρ = 0, and the vacuum field equations are

    φ^,i_,i = 0.                                    (4.28)
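The vacuum equation is easy to verify symbolically for the familiar point-mass potential φ = −GM/r. A short check (sympy, with G = M = 1; a sketch of ours, not from the text):

```python
import sympy as sp

x, y, z = sp.symbols('x y z', positive=True)
r = sp.sqrt(x**2 + y**2 + z**2)

# Newtonian potential of a point mass (G = M = 1), valid away from the origin
phi = -1 / r

# Eq. (4.28): the Laplacian of phi vanishes wherever r != 0
laplacian = sp.diff(phi, x, 2) + sp.diff(phi, y, 2) + sp.diff(phi, z, 2)
print(sp.simplify(laplacian))   # -> 0
```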
Now cast your mind back to Chapter 1, and the expression in the notes
there for the acceleration towards each other of two free-falling particles. This
expression can be slightly generalised and rewritten here as
    d²ξ^i/dt² = −φ^,i_,j ξ^j.                       (4.29)
But compare this with Eq. (3.60): they are both equations of geodesic
deviation, suggesting that the tensor represented by R^µ_ανβ U^α U^β is analogous
to φ^,i_,j (we've used the symmetries of the curvature tensor to swap two indexes,
note, and used U rather than X to refer to the free-falling particle velocity).
Since the particle velocities are arbitrary, that means, in turn, that the φ^,i_,i
appearing in Poisson's equation is analogous to R_αβ = R^µ_αµβ, and so a good
guess at the relativistic analogue of Eq. (4.28) is
    R_µν = 0.                                       (4.30)
This guess turns out to have ample physical support, and Eq. (4.30) is known
as Einstein’s vacuum field equation for GR.
If R_µν = 0, then R = g^µν R_µν = 0 and therefore

    G_µν = R_µν − (1/2) R g_µν = 0.
So much for the vacuum equations, but we want to know how space-time is
affected by matter. We can’t relate it simply to ρ , since Section 4.1.2 made it
clear that this was a frame-dependent quantity; the field is much more likely to
be somehow bound to the E-M tensor T instead. Looking back at Eq. (4.27),
we might guess
    R^µν = κ T^µν                                   (4.31)
as the field equations in the presence of matter, where κ is some coupling
constant, analogous to the newtonian gravitational constant G. This looks plausible,
but the conservation law Eq. (4.25) immediately implies that R^µν_;ν = 0,
which, using the Bianchi identity Eq. (4.22), in turn implies that R_;ν = 0.
But if we use Eq. (4.31) again, this means that (g_αβ T^αβ)_;ν = 0 also. If we
look back to, for example, Eq. (4.8), we see that this field equation, Eq. (4.31),
would imply that the universe has a constant density. Which is not the case. So
Eq. (4.31) cannot be true.4
So how about
    G^µν = κ T^µν                                   (4.32)
as an alternative? The Bianchi identity Eq. (4.22) tells us that the conservation
equation T µν ;ν = 0 is satisfied identically. Additionally – and this is the key
part of the argument – numerous experiments tell us that Eq. (4.32) has so-far
undisputed physical validity: it has not been shown to be incompatible with our
universe. It is known as the Einstein field equation , and allows us to complete
the other half of the famous slogan
Space tells matter how to move – the statement (4.26) plus
equations (3.42) or (3.43). And matter tells space how to curve –
equation (4.32).
Einstein first published these equations in a series of papers delivered to the
Prussian Academy of Sciences in November 1915; there is a detailed account
of Einstein’s actual sequence of ideas, which is slightly (but, remarkably, only
slightly) more tentative than the description in this section may suggest, in
Janssen and Renn (2015).
There are two further points to make, both relating to the arbitrariness that
is evident in our justification of Eq. (4.32).
The first is to acknowledge that, although we were forced to go from
Eq. (4.31) to Eq. (4.32) by the observation that the universe is in fact lumpy,
there is nothing other than Occam’s razor that forces us to stop adding
complication when we arrive at Einstein’s equations. There have been various
attempts to play with more elaborate theories of gravity, but almost none so
far that have acquired experimental support. Chandrasekhar’s words on this,
quoted in Schutz §8.1, are good:
4 This argument comes from §8.10 of Rindler (2006); Schutz has a more mathematical argument
in his §8.1. Which you prefer is a matter of taste, but in keeping with our attempt to talk about
physics in this chapter, we’ll prefer the Rindler version for now.
(the calculation is not long, but is somewhat tricky, and is described in Carroll
[2004, §4.3], and in MTW [1973, box 17.2 and chapter 21]). You will recognise
the term in square brackets from Eq. (4.23); requiring that δS = 0 for all
variations of g_μν therefore implies that

    G^μν = 0,
In gravitational physics, we use natural units, for much the same reason. In SI
units, Newton's gravitational constant has the dimensions [G] = kg⁻¹ m³ s⁻²,
but it is convenient in GR to have G dimensionless, and to this end we choose
4.3 The Newtonian Limit 103
our unit of mass to be the metre, with the conversion factor between this and
the other mass unit, kg, obtained by:
    1 = G/c² = 7.425 × 10⁻²⁸ m kg⁻¹.
See Schutz’s §8.1 and Exercise 1.3 for a table of physical values in these
units. Measuring masses in metres turns out to be unexpectedly intuitive: when you
learn about (Schwarzschild) black holes you discover that the radius of the
event horizon of an object is twice the value of the object’s mass expressed
in metres. Also, within the solar system, the mass of the sun is less precisely
measurable than the value of the 'heliocentric gravitational constant', GM⊙,
which has units of m³ s⁻² in SI units, and thus units of metres in natural units
(the 'gravitational radius' of the sun GM⊙ is known to one part in 10¹⁰, but
since G is known only to one part in 10⁴ or so, the value of M⊙ in kg has the
same uncertainty).
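The conversion is easy to check numerically; a sketch, with the constants quoted from memory to four significant figures, so treat the values as illustrative:

```python
# Converting masses to metres using the factor G/c^2
G = 6.674e-11        # m^3 kg^-1 s^-2
c = 2.998e8          # m / s

kg_to_m = G / c**2   # ~7.43e-28 metres per kilogram

M_sun_kg = 1.989e30            # mass of the sun in kg
M_sun_m = M_sun_kg * kg_to_m   # the same mass in metres: about 1.48 km

# The Schwarzschild radius is twice the mass in metres: about 2.95 km
print(M_sun_m, 2 * M_sun_m)
```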
In the weak-field approximation, we take the space-time round a small
object to be nearly minkowskian, with
    g_αβ = η_αβ + h_αβ,                             (4.36)

where |h_αβ| ≪ 1, and the matrix η_αβ is the matrix of components of the
metric in Minkowski space. Note that Eq. (4.36), defining hαβ , is a matrix
equation, rather than a tensor one: we are choosing coordinates in which
the matrix of components g αβ of the metric tensor g is approximately equal
to η_αβ. If we Lorentz-transform Eq. (4.36) – using the Λ^α_ᾱ of SR, for which
η_ᾱβ̄ = Λ^α_ᾱ Λ^β_β̄ η_αβ – we get an equation of the same form as Eq. (4.36), but
in the new coordinates; that is, the components h_αβ transform as if they were
the components of a tensor in SR. This allows us to express R^α_βμν, R_αβ and
G_αβ, and thus Einstein's equation itself, in terms of h_αβ plus corrections of
order |h_αβ|². The picture here is that g_αβ is the result of a perturbation on
flat (Minkowski) space-time, and that h (which encodes that perturbation) is
a tensor in Minkowski space: expressing Einstein’s equations in terms of h
(accurate to first order in h αβ ) gives us a mathematically tractable problem to
solve.
The next step is to observe that in the newtonian limit, which is the limit
where Newton’s gravity works, the gravitational potential | φ | ³ 1 and speeds
| v| ³ 1. This implies that | T 00| ´ | T 0i | ´ | T ij | (because T 00 ∝ m , T 0i ∝ v i
and T ij ∝ vi v j , with v earth ≈ 10− 4 ). We then identify T 00 = ρ + O (ρ v 2 ). By
matching the resulting form of Einstein’s equation with Newton’s equation for
gravity, we fix the constant κ in Eq. (4.32), so that, in geometrical units,
    G^μν = 8π T^μν.                                 (4.37)
See Schutz §§8.3–8.4 for the slightly intricate details of this deriva-
tion to Eq. (4.39), and see his §7.2 for the derivation of the newtonian
geodesics of Section 4.3.2. Carroll (2004) gives an overlapping account of the
same material in §4.1 and (very usefully, with more technical background)
§7.1. We return to this approximation in Appendix B.
Symmetries
We said, in the discussion after Eq. (4.36), that h αβ is not a tensor, even though
it looks like one, and in leading up to Eq. (4.39) we have treated it as one.
Although Eq. (4.36) is a tensor equation (there does exist a tensor g − η) this
is only a useful thing to do in a coordinate system in which |h_αβ| ≪ 1. In that
coordinate system, the approximations Eq. (4.38) and Eq. (4.39) are true to
first order in h αβ . That is, the components of h αβ , as approximated, cannot be
transformed into another coordinate system with an arbitrary transformation
matrix ´ , and result in correct expressions.
There is a set of transformations that preserves the approximation, however,
and it’s useful to think a little more about this. Intuitively, a transformation to
any other coordinate system in which |h_ᾱβ̄| ≪ 1 would produce an equivalent
result. If we restrict Λ to the Lorentz transformations of SR, then the metric η_αβ
will be invariant, and the components h_ᾱβ̄ = Λ^α_ᾱ Λ^β_β̄ h_αβ will be small for small
velocities: the approximation is still good in these new coordinates. We can
think of h_αβ as being a tensor within a background (Minkowski) space.
In slightly more formal terms – and see also Carroll (2004, section 7.1) –
we can consider a vector field ξ μ (x ) on the background Minkowski space, and
use this to generate a change from one coordinate system to another:
    x^ᾱ = x^α + ξ^α(x),                             (4.40)
giving

    ∂x^α/∂x^ᾱ = δ^α_ᾱ − ξ^α_,ᾱ.

Since ξ^α_,ᾱ ≪ 1, this has the same form as Eq. (4.36), meaning that vectors ξ,
which are 'small' in the sense discussed in this section, generate a family of
coordinate systems, in all of which the metric is a perturbation (|h_αβ| ≪ 1) on
a background minkowskian space.6
You may also see this written using a 'symmetrisation' notation,

    A_(ij) ≡ (A_ij + A_ji)/2,                       (4.42)

which lets us write

    g_ᾱβ̄ = η_ᾱβ̄ + h_ᾱβ̄ − 2ξ_(ᾱ,β̄).

There is a corresponding antisymmetrisation notation A_[ij] ≡ (A_ij − A_ji)/2.
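This notation is just the familiar decomposition of a matrix into symmetric and antisymmetric parts, which always reassemble the original; a quick numerical illustration:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [5.0, 3.0]])

A_sym  = (A + A.T) / 2   # A_(ij): the symmetric part
A_anti = (A - A.T) / 2   # A_[ij]: the antisymmetric part

# The two parts sum back to the original matrix
print(np.allclose(A_sym + A_anti, A))   # -> True
```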
From Eq. (3.54), we promptly find

    2R_αβγδ = g_αδ,βγ − g_αγ,βδ + g_βγ,αδ − g_βδ,αγ        (4.43)

and (as you can fairly straightforwardly confirm) this does not change under
the transformation h_αβ ↦ h_αβ − 2ξ_(α,β).
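That invariance can be confirmed symbolically. Here is a sketch in two dimensions (the cancellation is identical in four, since it relies only on partial derivatives commuting); the function names are ours:

```python
import sympy as sp

n = 2                      # two dimensions keep the check small
X = sp.symbols('x0 x1')
xi = [sp.Function(f'xi{i}')(*X) for i in range(n)]
h = [[sp.Function(f'h{i}{j}')(*X) for j in range(n)] for i in range(n)]

def riemann_lin(hh, a, b, c, d):
    # 2 R_{abcd} = h_{ad,bc} - h_{ac,bd} + h_{bc,ad} - h_{bd,ac}  (linearised Eq. 4.43)
    return (sp.diff(hh(a, d), X[b], X[c]) - sp.diff(hh(a, c), X[b], X[d])
            + sp.diff(hh(b, c), X[a], X[d]) - sp.diff(hh(b, d), X[a], X[c]))

h_old = lambda i, j: h[i][j]
# The gauge change h_{ab} -> h_{ab} - 2 xi_(a,b)
h_new = lambda i, j: h[i][j] - sp.diff(xi[i], X[j]) - sp.diff(xi[j], X[i])

# The linearised Riemann tensor is unchanged by the gauge transformation
delta = sp.simplify(riemann_lin(h_new, 0, 1, 0, 1) - riemann_lin(h_old, 0, 1, 0, 1))
print(delta)   # -> 0
```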
This situation – in which we have identified a subspace of the general
problem, in which the calculations are simpler, and physical quantities such
as the Riemann tensor are invariant – is characteristic of a problem with a
gauge invariance . If I describe and then solve a problem in classical newtonian
mechanics, the dynamics of my solution will not change if I move the origin
of my coordinates, or change units from metres to feet – that is, if I ‘re-gauge’
the solution. A more mathematical way of putting this is that the Lagrangian
is symmetric under the corresponding coordinate transformation, or that the
6 The diffeomorphism in Eq. (4.41) is related to the Lie derivative mentioned in passing in
section Section 3.3.2, which is in turn related to the idea of moving along the integral curves of
the vector field ξ μ ( x) .
7 Note the spelling: the Lorenz gauge is named after the Danish physicist Ludvig Lorenz, who is
different from the Dutch physicist Hendrik Antoon Lorentz, after whom the (Poincaré–
Larmor–FitzGerald–)Lorentz transformation is named. Your confusion about this is widely
shared, possibly even by Lorentz (Nevels & Shin 2001). It doesn’t help that there is also a
Lorenz–Lorentz relation in optics, associated with both of them, in one order or another.
the momentum p = mU . This has the advantage that the resulting geodesic
equation
    ∇_p p = 0                                       (4.44)
is also valid for photons, which have a well-defined momentum even though
they have no mass m . We shall now solve this equation, to find the path of a
free-falling particle through this space-time.
The component form of Eq. (4.44) is
    p^α p^µ_;α = 0,                                 (4.45)

or

    p^α p^µ_,α + Γ^µ_αβ p^α p^β = 0.                (4.46)
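As a concrete (if non-relativistic) illustration of the structure of Eq. (4.46), the following sketch integrates the geodesic equation on the unit sphere, whose nonzero Christoffel symbols are the standard Γ^θ_φφ = −sin θ cos θ and Γ^φ_θφ = cot θ; the step sizes and names are our own:

```python
import math

# Geodesic equation on the unit sphere, ds^2 = dθ^2 + sin^2 θ dφ^2:
#   θ'' = + sin θ cos θ (φ')^2,   φ'' = -2 cot θ θ' φ'
def accel(theta, dtheta, dphi):
    return (math.sin(theta) * math.cos(theta) * dphi**2,
            -2.0 * (math.cos(theta) / math.sin(theta)) * dtheta * dphi)

# Launch along the equator, moving in φ; integrate with small Euler steps
theta, phi = math.pi / 2, 0.0
dtheta, dphi = 0.0, 1.0
dt = 1e-4
for _ in range(10000):             # one unit of affine parameter
    a_theta, a_phi = accel(theta, dtheta, dphi)
    theta += dtheta * dt; phi += dphi * dt
    dtheta += a_theta * dt; dphi += a_phi * dt

# A geodesic launched along the equator stays on it (a great circle):
# θ remains π/2 while φ advances by one unit
print(theta, phi)
```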
I do not know what I may appear to the world, but to myself I seem
to have been only like a boy playing on the seashore, and diverting
myself in now and then finding a smoother pebble or a prettier shell than
ordinary, whilst the great ocean of truth lay all undiscovered before me.
Isaac Newton, as quoted in Brewster, Memoirs of the Life, Writings,
and Discoveries of Sir Isaac Newton .
Exercises
Exercise 4.1 (§ 4.1.2) Deduce Eq. (4.7), given that you have only the
tensors U ⊗ U and g = η to work with, that the result must be proportional
to both ρ and p , and that it must be consistent with both Eq. (4.6) and
Eq. (4.4) in the limit p = 0. Thus write down the general expression T =
(aρ + bp) U ⊗ U + (cρ + dp) g and apply the various constraints. Recall that
U = (1, 0) in the MCRF. [u+]
boost of speed v along a. Recall that A^μ̄ = Λ^μ̄_μ A^μ, where Λ^μ̄_μ is given by
Eq. (2.34). Verify that ⟨σ̃, A⟩ = 0 in this frame, too. [d−]
Exercise 4.3 (§ 4.2.2) What would happen to an electric motor in free fall
across the event horizon of a black hole (ignore any tidal effects)? [d−]
Exercise 4.5 (§ 4.2.5) Prove that the curvature tensor has only 20 independent
components for a 4-dimensional manifold, when you take
Eqs. (3.55a) and (3.55b) into account.
    p^α p^β_;α = 0.
I presume you have studied Special Relativity at some point. This appendix is intended
to remind you of what you learned there, in a way that is notationally and conceptually
homogeneous with the rest of the text here.
This appendix is intended to be standalone, but because it is necessarily rather
compact, you might want to back it up with other reading. Taylor and Wheeler (1992)
is an excellent account of Special Relativity (hereafter ‘SR’), written in a style that
is simultaneously conversational and rigorous (Wheeler, here, is the Wheeler of
MTW (1973)). Rindler (2006) is, as mentioned elsewhere, now slightly old-fashioned
in its treatment of GR, but is extremely thoughtful about the conceptual underpinnings
of SR. In contrast, the first chapter of Landau and Lifschitz (1975) gives an admirably
compact account of SR, which would be hard to learn from, but which could consolidate
an understanding otherwise obtained.
There are several popular science books that are about, or that mention, relativity –
these aren’t to be despised just because you’re now doing the subject ‘properly’. These
books tend to ignore any maths, and skip more pedantic detail (so they won’t get you
through an exam), but in exchange they spend their efforts on the underlying ideas.
Those underlying ideas, and developing your intuition about relativity, are things that
can sometimes be forgotten in more formal courses. I’ve always liked Schwartz and
McGuinness (2003), which is a cartoon book but very clear (and I declare a sentimental
attachment, since this is the book where I first learned about relativity); these books, like
this appendix, and like many other introductions to relativity, partly follow Einstein's
own popular account, Einstein (1920).
A.1 The Basic Ideas 111
We must now (a) understand what these two postulates really mean and (b) examine
both their direct consequences, and the way that we have to adjust the physics we
already know.
A.1.1 Events
An ‘event’ in SR is something that happens at a particular place, at a particular instant
of time. The standard examples of events are a flashbulb going off, or an explosion, or
two things colliding.
Note that it is events , and not the reference frames that we are about to mention,
that are primary. Events are real things that happen in the real world; the separations
between events are also real; reference frames are a construct we add to events to allow
us to give them numbers, and to allow us to manipulate and understand them. That is,
events are not ‘relative to an observer’ or ‘frame dependent’ – everyone agrees that an
event happens. SR is about how we reconcile the different measurements of an event,
that different, relatively moving, observers make.
1 By restricting ourselves to only horizontal motion, we evade any consideration of gravity. With
that constraint, the definition of ‘inertial frame’ here is consistent with the broader definition
appropriate to GR, which refers to a frame attached to a body in free fall, moving only under the
influence of gravity.
112 A Special Relativity – A Brief Introduction
surrounding a point of interest; and they may be ‘accelerating’ with respect to each
other in the sense that the second derivative of position is non-zero, even though there
is no acceleration measurable in the frame (think of two people in free fall on opposite
sides of the earth).
[Exercise A.1]
2 An educational experience, in many ways, but one probably best kept as a thought experiment.
3 Ditto.
A.2 The Postulates 113
[Figure A.1 shows frames S and S′, with axes (x, y, z) and (x′, y′, z′) aligned, and S′ moving with velocity v along the shared x-axis.]
Figure A.1 Standard configuration.
1. they are aligned so that the (x, y, z) and (x′, y′, z′) axes are parallel;
2. the frame S′ is moving along the x axis with velocity V;
3. we set the zero of the time coordinates so that the origins coincide at t = t′ = 0
   (which means that the origin of the S′ frame is always at position x = Vt).
When we refer to 'frame S' and 'frame S′', we will interchangeably be referring either
to the frames themselves, or to the sets of coordinates (t, x, y, z) or (t′, x′, y′, z′).
Frame S′ will often be termed the rest frame; however, it should always be the rest
frame of something. Yes, it does seem a little counterintuitive that it's the 'moving
frame' that's the rest frame, but it's called the rest frame because it's the frame in which
the thing we're interested in – be it a train carriage or an electron – is at rest. It's in the
rest frame of the carriage that the carriage is measured to have its rest length or proper
length.
The Principle of Relativity: All inertial frames are equivalent for the
performance of all physical experiments.
That is, there is no place for the idea of a standard of absolute rest.
From the Relativity Principle (RP), one can show that, with certain obvious (but, as
we shall discover, wrong) assumptions about the nature of space and time, one could
derive the (apparently also rather obvious) Galilean transformation (GT)

    x′ = x − Vt,    y′ = y,    z′ = z,    t′ = t,   (A.1)

between two frames in the standard configuration of Section A.1.4. This transformation
relates the coordinates of an event (t, x, y, z), measured in frame S, to the coordinates of
the same event (t′, x′, y′, z′) in frame S′. Differentiating these, we find that
– that is, we find exactly the same relation, as if we had simply put primes on each of the
quantities. This is known as ‘form invariance’, or sometimes ‘covariance’, and indicates
that (in this example) the expressions for x and x′ have exactly the same form, with the
only difference being that we have different numerical values for the coefficients and
coordinates.
Maxwell’s equations, however, are not invariant under a GT. The wave equation,
and Maxwell’s equations, do not transform into themselves under a GT, and take their
simplest form (that is, their well-known form) only in a ‘stationary’ frame. Einstein
noted that electrodynamics appeared to be concerned only with relative motion, and
did not take a different form when viewed in a moving frame. His famous 1905 paper
is very clear on this point (‘On the Electrodynamics of Moving Bodies’), and the very
first words of it are:
It is known that Maxwell’s electrodynamics – as usually understood at the
present time – when applied to moving bodies, leads to asymmetries which do
not appear to be inherent in the phenomena. Take, for example, the reciprocal
electrodynamic action of a magnet and a conductor . . . Einstein (1905)
This paper then briskly elevates the principle of relativity to the status of a postulate, and
adds to it a second one, stating that the speed of light has the same value ‘independent
of the state of motion of the emitting body’: no matter what sort of experiment you
are doing, whether you are directly observing the travel time of a flash of light, or
doing some interferometric experiment, the speed of light relative to your apparatus
will always have the same numerical value. This is perfectly independent of how fast
you are moving: it is independent of whichever inertial frame you are in, so that another
observer, measuring the same flash of light from their moving laboratory, will measure
the speed of light relative to their detectors to have exactly the same value.
There is no real way of justifying either of these postulates: they are simply truths of
our universe, and we can do nothing more than demonstrate their truth through
experiment.
[Exercise A.2]
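The failure of form invariance claimed above can be made concrete numerically. The sketch below is not from the book (the field, speed, and step size are arbitrary choices): it takes f(x, t) = sin(x − ct), which satisfies the wave equation in frame S, re-expresses the same field in Galilean coordinates x′ = x − Vt, t′ = t, and applies a finite-difference wave operator in each frame.

```python
import math

c, V, h = 1.0, 0.5, 1e-3   # wave speed, Galilean frame speed, step size (arbitrary)

def f(x, t):
    # A rightward-moving wave: satisfies f_xx - (1/c^2) f_tt = 0 in frame S.
    return math.sin(x - c * t)

def f_primed(xp, tp):
    # The same field written as a function of the Galilean coordinates
    # x' = x - V t, t' = t, i.e. f'(x', t') = f(x' + V t', t').
    return f(xp + V * tp, tp)

def wave_operator(g, x, t):
    # Central-difference estimate of g_xx - (1/c^2) g_tt.
    gxx = (g(x + h, t) - 2 * g(x, t) + g(x - h, t)) / h**2
    gtt = (g(x, t + h) - 2 * g(x, t) + g(x, t - h)) / h**2
    return gxx - gtt / c**2

print(wave_operator(f, 0.3, 0.7))         # ~0: the wave equation holds in S
print(wave_operator(f_primed, 0.3, 0.7))  # clearly non-zero in the boosted frame
```

The operator applied to f′ does not vanish, so the wave equation is not form-invariant under a GT (this is essentially Exercise A.2 done numerically rather than analytically).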
Textbooks on electromagnetic theory also tend to have sections on SR, which make this
point more or less emphatically.
The aether drift experiments are discussed in most relativity textbooks. The
sci.physics.relativity FAQ (Roberts and Schleif 2007) provides a large list
of references to experimental corroboration of SR. For an interesting sociological and
historical take on the Michelson-Morley experiments, and the context in which they
were interpreted, see also Collins and Pinch (1993, chapter 2). Barton (1999, §§3.1
& 3.4) presents the underlying ideas clearly and at length, discusses experimental
corroboration, and provides ample further references.
The constancy of the speed of light is not the only second postulate you could have.
You could take alternatives such as ‘Maxwell’s Equations are true’, or ‘Moving clocks
run slow according to . . . ’, or any other statement that picked out the phenomena of SR,
and you could still derive the results of SR, including, for an encore, the constancy of
the speed of light. However, this particular second postulate is a particularly simple
and fundamental one, which is why it is much the best choice. Alternatively, you
could choose as a second postulate something like ‘c is infinite’ or ‘The Galilean
Transformation is true’, and derive from the pair of postulates the rest of the laws of
classical mechanics. The point here is that each pair of postulates would give you a
perfectly consistent theory – a perfectly possible world – but the galilean transformation
is one that does not happen to match our world other than as a low-speed approximation.
Taking a more mathematical tack, Rindler (2006, §2.17), and Barton less abstractly
(1999, §4.3), show that the only linear transformations consistent with the Euclidicity
and isotropy of inertial frames are the Galilean and Lorentz Transformations. A
second postulate consisting of ‘there is no upper limit to the speed of propagation of
interactions’ picks out the GT; the statement ‘there is an upper speed limit’ (which
the first postulate implies is the same in all frames) instead picks out the Lorentz
Transformation with a dependence on that constant speed, and saying ‘. . . and light
moves at that speed’ sets the value of the constant. See also Rindler’s other remarks
(2006, §2.7) on the properties of the Lorentz Transformation; Landau and Lifshitz
(1975, §1) take this tack, and are as lucidly compact as ever.
Taking a more historical tack, I have quoted Einstein’s own (translated) words,
not because the argument depends on his authority (it doesn’t), but firstly because he
introduces the key arguments with admirable compactness, and secondly because it
is a very rare example of the first introduction of a core physical theory still being
intelligible after it has been absorbed into the bedrock of physics. [Exercise A.3]
4 This argument ultimately originates from Einstein’s popular book about relativity (Einstein,
1920), first published in English in 1920. It reappears in multiple variants, in planes, trains,
automobiles, and rockets, in many popular and professional accounts of relativity. The variant
described here is most directly descended from Rindler’s version (2006).
Figure A.2 Passing trains: (a, left) flash reaches rear of carriage; (b, right) rear
observers coincide.
A flashbulb fires at the centre of the carriage and the observers record the time the flash
reaches them. Since Fred and Barbara are equidistant from the bulb, their times must
be the same: time ‘3’ units, say. In other words, Fred’s watch reading ‘3’ and Barbara’s
watch reading ‘3’ are simultaneous events in the frame of the carriage.
Observing from the platform, we would see the light from the flash move both
forward towards Fred and backwards towards Barbara, but at the same speed c , as
measured on the platform. Consequently, the flash would naturally get to Barbara first.
If, standing on the platform, you were to take a photograph at this point, you would
get something like the upper part of Figure A.2a. Barbara’s watch must read ‘3’, since
the flash meeting her and her watch reading ‘3’ are simultaneous at the same point in
space, and so must be simultaneous for observers in any frame. But at this point, the
light moving towards Fred cannot yet have caught up with him: since the light reaches
Fred when his watch reads ‘3’, his watch must still be reading something less than
that, ‘1’, say. In other words, Barbara’s watch reading ‘3’ and Fred’s watch reading ‘1’
are simultaneous events in the inertial frame of the platform.
Now imagine observing two such trains go past, timetabled such that we can obtain
the observations in Figure A.2a, where the light has reached both rear observers and
neither front one. Now pause a moment, and take another photograph when the two
rear observers are beside each other, this time getting Figure A.2b.
Barbara can report observing the front of the other carriage passing at time ‘3’,
whereas Fred reports the back of that carriage passing earlier, at time ‘1’. They can
therefore conclude that they have measured the length of the other carriage and found
it to be shorter than their own one. This is length contraction.
Similarly, Fred observes the rear clock in Figure A.2a as being two units fast,
compared to his own. But Barbara can later, in Figure A.2b, observe that same clock to
be reading ‘11’ at the same time as hers, no longer fast. They know their own clocks
were synchronised, so they can conclude that the rear clock in the other carriage was
going more slowly than their clocks. This is time dilation.
Notably, Barbara and Fred’s counterparts in the other carriage would come to
precisely the same conclusions. Because this setup is perfectly symmetrical, they would
measure Barbara and Fred’s clocks to be moving slowly, and their carriage to be shorter.
There is no sense in which one of the carriages is absolutely shorter than the other.
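The readings in this thought experiment can be reproduced directly from the Lorentz transformation introduced in Section A.3. In the sketch below (with hypothetical values for the speed v and the carriage half-length L, and c = 1), the flash-arrival events are simultaneous in the carriage frame, but transform to platform-frame times with the rear event first.

```python
import math

v = 0.6   # carriage speed relative to the platform (arbitrary value, c = 1)
L = 1.0   # half-length of the carriage in its own frame (arbitrary value)
g = 1 / math.sqrt(1 - v * v)

# Carriage-frame events: the flash reaches Fred (front, x' = +L) and
# Barbara (rear, x' = -L) at the same carriage time t' = L.
flash_fred = (L, +L)
flash_barbara = (L, -L)

def to_platform(tp, xp):
    # Inverse Lorentz transformation: carriage frame -> platform frame.
    return g * (tp + v * xp), g * (xp + v * tp)

t_fred, _ = to_platform(*flash_fred)
t_barbara, _ = to_platform(*flash_barbara)
print(t_barbara, t_fred)   # the rear event happens first on the platform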
ct/2
L L
vt/2
Figure A.3 The light clock, shown in its rest frame (left) and as observed in a
frame in which the clock is moving at speed v (right).
clock. If the mirror and the flashbulb are a distance L apart, then 2L = ct± , where
t± is the time on the watch of an observer standing by, and moving with, the clock,
and c is the frame-independent speed of light. Also note that the clock’s mirrors are
arranged perpendicular to the clock’s motion, and both the stationary and the moving
observer measure the same separation between them – there is no length contraction
perpendicular to the motion.5 Examining the same tick in a frame in which the clock is
moving, we find (ct/2)2 = L 2 + (vt /2)2 and thus, using the expression for L above,
t
t± = , (A.2)
γ
Now, the important thing about this equation is that it involves t± , the time for the clock
to ‘tick’ as measured by the person standing next to it on the train, and it involves t, the
time as measured by the person on the platform, and they are not the same .
5 If there were a perpendicular length contraction, then observers would see a passing relativistic
train’s axles contract so that they derailed inside the tracks; however observers on the train
would see the passing sleepers contract, so that the train would derail outside the tracks; the
contradiction implies there can be no such contraction.
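The light-clock algebra above can be checked directly. In this sketch (with arbitrary values for L and v, and c = 1), we solve (ct/2)² = L² + (vt/2)² for the platform-frame tick time t, and compare it with the rest-frame tick time t′ = 2L/c.

```python
import math

c = 1.0   # natural units
L = 2.0   # mirror separation (arbitrary value)
v = 0.6   # clock speed in the platform frame (arbitrary value)

t_rest = 2 * L / c   # t', the tick time measured in the clock's rest frame

# Platform frame: (c t / 2)^2 = L^2 + (v t / 2)^2  =>  t = 2L / sqrt(c^2 - v^2)
t_platform = 2 * L / math.sqrt(c**2 - v**2)

gamma = 1 / math.sqrt(1 - (v / c)**2)
print(t_platform / t_rest)   # the ratio is gamma: the moving clock is dilated
```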
or in Landau and Lifshitz (1975). Alternatively, assuming this in place of the second
postulate would allow us to deduce the constancy of the speed of light.
This quantity Δs² = Δx² − c²Δt² is referred to as the interval, or sometimes,
interchangeably, as the squared interval or the invariant interval.
From here on, we will handle coordinates only in natural units, in which c = 1,
with the result that we define
Δs² = Δx² − Δt².    (A.4)
Some authors define the interval with the opposite sign. The definition here is
compatible with Schutz but opposite to Rindler.
x² − t² = s² = x′² − t′².    (A.5)
Thus the relationship between (t, x) and (t′, x′) must be one for which Eq. (A.5) is true.
If we consider two frames on the xy plane related by a rotation, then their coordinates
will be related by
x′ = x cos θ + y sin θ
y′ = −x sin θ + y cos θ    (A.6)
and this will preserve the euclidean distance r² = x² + y². This is strongly reminiscent
of Eq. (A.5), and we can make it more so by writing l = it and l′ = it′, so that Eq. (A.5)
becomes
x² + l² = s² = x′² + l′².    (A.7)
This strongly suggests that the pairs (l, x) and (l′, x′) can be related by writing down the
analogue of Eq. (A.6), replacing y ↦ l and y′ ↦ l′, for some angle θ that depends on v,
the relative speed of frame S′ in frame S. That is, this specifies a linear relation for which
Eq. (A.5) is true. If we finally write θ = iφ (since l is pure imaginary, so is θ, so that φ
A.3 Spacetime and the Lorentz Transformation 119
is real), and recall the trigonometric identities sin iφ = i sinh φ and cos iφ = cosh φ ,
then this expression for x′ and l′ becomes
x′ = x cosh φ − t sinh φ    (A.8a)
t′ = t cosh φ − x sinh φ.    (A.8b)
Now consider an event at x′ = 0 for some unknown t′. This happens at time t in
the unprimed frame and thus happens at position x = vt in that frame, in which case
Eq. (A.8a) can be rewritten as
tanh φ = v.    (A.9)
Since we now have φ as a function of v, we have, in Eq. (A.8), the full transformation
between the two frames; combining these with a little hyperbolic trigonometry (remember
cosh²φ − sinh²φ = 1), we can rewrite Eq. (A.8) in the more usual form (for c = 1)
t′ = γ(t − vx)    (A.10a)
x′ = γ(x − vt),    (A.10b)
where the trivial transformations for y and z complete the LT, and (as in Eq. (A.3) but
now with c = 1)
γ = (1 − v²)^(−1/2).    (A.11)
If frame S′ is moving with speed v relative to S, then S must have a speed −v relative
to S′. Swapping the roles of the primed and unprimed frames, the transformation from
frame S′ to frame S is exactly the same as Eq. (A.10), but with the opposite sign for v:
t = γ(t′ + vx′)    (A.12a)
x = γ(x′ + vt′),    (A.12b)
which can be verified by direct solution of Eq. (A.10) for the unprimed coordinates.
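That verification is also easy to do numerically. This sketch (with an arbitrarily chosen event and speed) applies Eq. (A.10), undoes it with the opposite-sign boost of Eq. (A.12), and checks the invariance of the interval, Eq. (A.5).

```python
import math

def lorentz(t, x, v):
    """Boost (t, x) from frame S to S', Eq. (A.10), with c = 1."""
    gamma = 1 / math.sqrt(1 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)

v = 0.8
t, x = 2.0, 1.5             # an arbitrary event in frame S
tp, xp = lorentz(t, x, v)   # the same event in S'

# The inverse transformation, Eq. (A.12), is just a boost with velocity -v:
t2, x2 = lorentz(tp, xp, -v)
print(t2, x2)               # recovers the original (2.0, 1.5)

# The interval, Eq. (A.5), is the same in both frames:
print(x * x - t * t, xp * xp - tp * tp)
```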
The form of the LT shown in Eq. (A.13), and the addition law in Eq. (A.14),
conveniently indicate three interesting things about the LT: (i) for any two
transformations performed one after the other, there exists a third with the same net
effect (i.e., the set of LTs is closed under composition); (ii) there exists a transformation
(with φ = 0) that maps (t, x) to themselves (i.e., there exists an identity transformation);
(iii) for every transformation (with φ = φ1, say) there exists another transformation
(with φ = −φ1) that results in the identity transformation (i.e., there exists an inverse).
These properties, together with the associativity of composition, are enough to indicate
that the LT is an example of a mathematical ‘group’, known as the Lorentz group.
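The closure property is easiest to see using rapidities: writing v = tanh φ, composing two boosts simply adds their rapidities, which also yields the familiar relativistic velocity-addition law. A numerical sketch (speeds and the test event chosen arbitrarily):

```python
import math

def boost(t, x, v):
    # The LT of Eq. (A.10), c = 1.
    g = 1 / math.sqrt(1 - v * v)
    return g * (t - v * x), g * (x - v * t)

v1, v2 = 0.5, 0.6
# Composing two boosts gives a single boost whose rapidity is the sum:
phi = math.atanh(v1) + math.atanh(v2)
v_net = math.tanh(phi)   # equals (v1 + v2) / (1 + v1 * v2)

t, x = 1.0, 0.25
stepwise = boost(*boost(t, x, v1), v2)   # two successive boosts
direct = boost(t, x, v_net)              # one boost with the net velocity
print(stepwise, direct)                  # the two agree
```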
Any two IFs, not just those in standard configuration, may be related via a
sequence of transformations, namely a translation to move the origin, a rota-
tion to align the axes along the direction of motion, a LT, another rotation, and another
translation. The transformation that augments the LT with translations and rotations is
known as a Poincaré transformation, and it is a member of the Poincaré group.
This calculated proper time would be agreed on by all the observers who could not agree
on the spatial or temporal separations of the two events. In other words, this number τ
is invariant under a LT – it is a Lorentz scalar.
In the clock’s frame, the interval separating these two events is just s² = −τ², so
that the invariance of the proper time is just another manifestation of the invariance of
the interval, Eq. (A.4).
A.4.1 Kinematics
Consider a prototype displacement vector (Δx, Δy, Δz). These are the components
of a vector with respect to the usual axes e_x, e_y, and e_z. We can rotate these into
new axes e_x′, e_y′, and e_z′ using one or more rotation matrices, and obtain coordinates
(Δx′, Δy′, Δz′) for the same displacement vector, with respect to the new axes. These
primed coordinates are different from the unprimed ones, but systematically related
to them.
In four-dimensional spacetime, the prototype displacement 4-vector is ΔR =
(Δt, Δx, Δy, Δz), relative to the space axes and wristwatch of a specific observer,
and the transformation that takes one 4-vector into another is the familiar LT of
Eq. (A.10), or
\begin{pmatrix} \Delta t' \\ \Delta x' \\ \Delta y' \\ \Delta z' \end{pmatrix}
= \begin{pmatrix} \gamma & -\gamma v & 0 & 0 \\ -\gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \Delta t \\ \Delta x \\ \Delta y \\ \Delta z \end{pmatrix}.    (A.17)
U · U = − 1. (A.23)
You can confirm that this is indeed true by applying Eq. (A.18) to Eq. (A.22).
Here, we defined the 4-velocity by differentiating the displacement 4-vector, and
deduced its value in a frame co-moving with a particle. We can now turn this on its
head, and define the 4-velocity as a vector that has norm − 1 and that points along
the t-axis of a co-moving frame (this is known as a ‘tangent vector’, and is effectively a
vector ‘pointing along’ the world line). We have thus defined the 4-velocity of a particle
as the vector that has components (1, 0) in the particle’s rest frame. Note that the norm
of the vector is always the same; the particle’s speed relative to a frame S is indicated
not by the ‘length’ of the velocity vector – its norm – but by its direction in S. We can
then deduce the form in Eq. (A.22) as the Lorentz-transformed version of ( 1, 0).
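A quick numerical check of Eq. (A.23), using the 1+1-dimensional form U = γ(1, v) and the −+++ sign convention used here (the speeds are arbitrary choices):

```python
import math

def four_velocity(v):
    # U = gamma * (1, v): the 1+1-dimensional form of the 4-velocity, c = 1.
    g = 1 / math.sqrt(1 - v * v)
    return (g, g * v)

def dot(a, b):
    # Minkowski dot product with signature (-, +).
    return -a[0] * b[0] + a[1] * b[1]

for v in (0.0, 0.5, 0.9, 0.999):
    U = four_velocity(v)
    print(v, dot(U, U))   # always -1, whatever the speed
```

The norm never changes; only the direction of U in the given frame records the particle's speed, exactly as the text says.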
Equations A.21 can lead us to some intuition about what the velocity vector
is telling us. When we say that the velocity vector in the particle’s rest frame
is (1, 0), we are saying that, for each unit proper time τ , the particle moves the same
amount through coordinate time t, and not at all through space x; the particle ‘moves
into the future’ directly along the t-axis. When we are talking instead about a particle
that is moving with respect to some frame, the equation U⁰ = dt/dτ = γ tells us that
the particle moves through a greater amount of this frame’s coordinate time t, per unit
proper time (where, again, the ‘proper time’ is the time showing on a clock attached to
the particle). This is another appearance of ‘time dilation’.
By further differentiating the components of the velocity U, we can obtain the
components of the acceleration 4-vector
A = γ(γ̇, γ̇v + γa).    (A.24)
This is useful less often than the velocity vector, but it is fairly straightforward to deduce
that U · A = 0, and that A · A = a², defining the proper acceleration a as the magnitude
of the acceleration in the instantaneously co-moving inertial frame.
Finally, given two particles with velocities U and V, and given that the second has
velocity v with respect to the first, then in the first particle’s rest frame the velocity
vectors have components U = (1, 0) and V = γ(v)(1, v). Thus
U · V = −γ(v),
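With the sign convention of Eq. (A.23) (U · U = −1), this dot product gives a frame-independent way of extracting the relative speed of two particles: −U · V equals γ of their relative velocity. A sketch (lab-frame speeds chosen arbitrarily), using the relativistic velocity-subtraction formula for the relative speed:

```python
import math

def gamma(v):
    return 1 / math.sqrt(1 - v * v)

def dot(a, b):
    # Minkowski dot product with signature (-, +).
    return -a[0] * b[0] + a[1] * b[1]

v1, v2 = 0.3, 0.8
U = (gamma(v1), gamma(v1) * v1)
V = (gamma(v2), gamma(v2) * v2)

# Speed of the second particle as seen in the first particle's rest frame:
v_rel = (v2 - v1) / (1 - v1 * v2)
print(-dot(U, V), gamma(v_rel))   # the two values agree
```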
Since m is a scalar, and U is a 4-vector, P must be a 4-vector also. Remember also that γ
is a function of v: γ (v).
In the rest frame of the particle, this becomes P = m(1, 0): it is a 4-vector whose
norm (P · P) is −m², and which points along the particle’s world line. That is, it points
in the direction of the particle’s movement in spacetime. Since this is a vector, its
norm and its direction are frame-independent quantities, so a particle’s 4-momentum
vector always points in the direction of the particle’s world line, and the 4-momentum
vector’s norm is always −m². We’ll call this vector the momentum (4-)vector, but it’s
also called the energy-momentum vector, and Taylor and Wheeler (1992) call it the
momenergy vector (coining the word in an excellent chapter on it) in order to stress that
it is not the same thing as the energy or momentum (or mass) that you are used to.
Note that here, and throughout, the symbol m denotes the mass as measured
in a particle’s rest frame. The reason I mention this is that some treatments
of relativity, particularly older ones, introduce the concept of the ‘relativistic mass’,
distinct from the ‘rest mass’. The only (dubious) benefit of this is that it makes a
factor of γ disappear from a few equations, making them look a little more like their
newtonian counterparts; the cost is that of introducing one more new concept to worry
about, which doesn’t help much in the long term, and which can obscure aspects of the
energy-momentum vector. Rindler introduces the relativistic mass; Taylor and Wheeler
and Schutz don’t.
Now consider a pair of incoming particles P1 and P2, which collide and produce a
set of outgoing particles P3 and P4. Suppose that the total momentum is conserved:
P1 + P2 = P3 + P4. (A.26a)
This is an equation between 4-vectors. Equating the time and space coordinates
separately, recalling Eq. (A.25), and writing p ≡ γ mv, we have
m1γ(v1) + m2γ(v2) = m3γ(v3) + m4γ(v4)    (A.26b)
p1 + p2 = p3 + p4.    (A.26c)
Now recall that, as v → 0, we have γ (v) → 1, so that, from Eq. (A.25), the low-
speed limit of the spatial part of the vector P is just mv, so that the spatial part of the
conservation equation, Eq. (A.26c), reduces to the statement that mv is conserved. Both
of these prompt us to identify the spatial part of the 4-vector P as the familiar linear
3-momentum, and to justify giving P the name 4-momentum.
What, then, of the time component of Eq. (A.25)? Let us (with, admittedly, a little
foreknowledge) write this as P⁰ = E, so that
E = γm.    (A.27)
If we now expand γ into a Taylor series, then
E = m + ½mv² + O(v⁴).    (A.28)
Now ½mv² is the expression for the kinetic energy in newtonian mechanics, and
Eq. (A.26b), compared with Eq. (A.27), tells us that this quantity E is conserved in
collisions, so we have persuasive support for identifying the quantity E in Eq. (A.27)
A.4 Vectors, Kinematics, and Dynamics 125
as the relativistic energy of a particle with mass m and velocity v. If, finally, we rewrite
Eq. (A.27) in physical units, we find
E = γmc²,    (A.29)
the low-speed limit of which (remember γ (0) = 1) recovers what has been called the
most famous equation of the twentieth century.
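The expansion in Eq. (A.28) can also be checked numerically: the residual E − (m + ½mv²) should shrink like v⁴, with coefficient 3m/8 from the next term of the Taylor series. A sketch in natural units (the mass is an arbitrary choice):

```python
import math

m = 1.0   # mass in natural units (arbitrary value)

for v in (0.1, 0.01, 0.001):
    E = m / math.sqrt(1 - v * v)       # E = gamma * m, Eq. (A.27)
    newtonian = m + 0.5 * m * v * v    # rest energy + newtonian kinetic energy
    print(v, E - newtonian)            # residual is O(v^4), roughly (3/8) m v^4
```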
The argument presented here after Eq. (A.26) has been concerned with giving names
to quantities, and, reassuringly for us, linking those newly named things with quantities
we already know about from newtonian mechanics. This may seem arbitrary, and it is
certainly not any sort of proof that the 4-momentum is conserved as Eq. (A.26) says it
might be. No proof is necessary, however: it turns out from experiment that Eq. (A.26) is
a law of nature, so that we could simply include it as a postulate of relativistic dynamics
and proceed to use it without bothering to identify its components with anything we are
familiar with.
In case you are worried that we are pulling some sort of fast one here, note that we
have to do a similar thing in newtonian mechanics. There, we postulate Newton’s third
law (action equals reaction), and from this we can deduce the conservation of momentum;
here, we postulate the conservation of 4-momentum, and this would allow us to deduce a
relativistic analogue of Newton’s third law (I don’t discuss relativistic force here, but it
is easy to define). The postulational burden is the same in both cases.
We can see from Eq. (A.27) that, even when a particle is stationary and v = 0,
the energy E is non-zero. In other words, a particle of mass m has an energy γ m
associated with it simply by virtue of its mass. The low-speed limit of Eq. (A.26b)
simply expresses the conservation of mass, but we see from Eq. (A.27) that it is
actually expressing the conservation of energy. In SR there is no real distinction
between mass and energy – mass is, like kinetic, thermal, and strain energy, merely
another form into which energy can be transmuted – albeit a particularly dense store
of energy, as can be seen by calculating the energy equivalent, in Joules, of a mass
of one kilogramme. It turns out from GR that it is not mass that gravitates, but
energy-momentum (most typically, however, in the particularly dense form of mass),
so that thermal and electromagnetic energy, for example, and even the energy in the
gravitational field itself, all gravitate. (It is the non-linearity implicit in the last remark
that is part of the explanation for the mathematical difficulty of GR.)
Let us now consider the norm of the 4-momentum vector. Like any such norm, it
will be frame invariant, and so will express something fundamental about the vector,
analogous to its length. Since this is the momentum vector we are talking about, this
norm will be some important invariant of the motion, indicating something like the
‘quantity of motion’. From the definition of the momentum, Eq. (A.25), and its norm,
Eq. (A.23), we have
P · P = m²U · U = −m²,    (A.30)
and we find that this important invariant is the mass of the moving particle.
Now using the definition of energy, Eq. (A.27), we can write P = (E, p), and find
P · P = −E² + p · p.    (A.31)
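Comparing Eq. (A.30) with Eq. (A.31) gives the familiar relation E² = p² + m², valid in every frame. A numerical sketch (arbitrary mass and speeds) that also re-checks the norm after a boost:

```python
import math

def gamma(v):
    return 1 / math.sqrt(1 - v * v)

m, v = 2.0, 0.6
E, p = m * gamma(v), m * gamma(v) * v   # P = m U = (E, p), c = 1
print(E * E - p * p)                    # equals m^2 = 4

# Boost P to a frame moving at speed w; the norm is unchanged:
w = -0.5
E2 = gamma(w) * (E - w * p)
p2 = gamma(w) * (p - w * E)
print(E2 * E2 - p2 * p2)                # still m^2 = 4
```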
A.4.4 Photons
For a photon, the interval represented by dR · dR is always zero (dR · dR = −dt² +
dx² + dy² + dz² = 0 for photons). But this means that the proper time dτ² is also zero
for photons. This means, in turn, that we cannot define a 4-velocity vector for a photon
by the same route that led us to Eq. (A.20), and therefore cannot define a 4-momentum
as in Eq. (A.25).
We can do so, however, by a different route. Recall that we defined (in the paragraph
following Eq. (A.23)) the 4-velocity as a vector pointing along the world line, which
resulted in the 4-momentum being in the same direction. From the discussion of the
momentum of massive particles in the previous section, we see that the P 0 component
is related to the energy, so we can use this to define a 4-momentum for a massless
particle, and again write
Pγ = (E, pγ).
Since the photon’s 4-momentum points along its (null) world line, it must itself be null.
Thus we must have Pγ · Pγ = 0, and so pγ · pγ = E², recovering the m = 0 version of
Eq. (A.32).
[Exercise A.4]
Exercises
Exercise A.1 (§1.1.2) Which of these are inertial frames? (i) A motorway
bridge, (ii) a stationary car, (iii) a car moving in a straight line at a constant speed,
(iv) a car cornering at a constant speed, (v) a stationary lift, (vi) a free-falling lift. (The
last one is rather subtle.)
Take
∂/∂x = (∂x′/∂x) ∂/∂x′ + (∂t′/∂x) ∂/∂t′
∂/∂t = (∂x′/∂t) ∂/∂x′ + (∂t′/∂t) ∂/∂t′
and so on, and using the GT, show that Eq. (i) does not transform into the same form
under a GT. [d+]
Figure A.4 Compton scattering.
1 This chapter is notationally harmonious with Carroll (2004, chapter 5), which provides very
useful expansion of the details. The selection and sequence of ideas, however, follows that in
my colleague Martin Hendry’s lecture notes, and can thus be claimed as pedagogically verified
in the same way as the rest of the book.
2 Schwarzschild did this work during his service as an army officer in World War I, and described
the solution in a letter to Einstein in December 1915, written under the sound of shell-fire,
probably in Alsace; Schwarzschild died early the next year, possibly from complications arising
from exposure to his own side’s experimental chemical weapons. The historical details here are
an aside in chapter 12 of Snygg (2012). The book is entertaining in a variety of ways,
mathematical and historical, but it is not an introductory text.
130 B Solutions to Einstein’s Equations
Here the choice of signs is to retain compatibility with the Minkowski metric
ds² = −dt² + dr² + r²dΩ², and we have assumed that the coefficients of dt² and dr²
cannot change sign from this. Also, we have chosen the coordinate r to be such that the
metric on surfaces of constant (t, r) is r²dΩ² (and if we hadn’t, for some reason, then
we could change radial variables r ↦ r̄ so that the coefficient ρ(r̄) was indeed just r̄²).
Notice that this illustrates the principle of general covariance, of Section 1.1. Given
a space-time – such as the space-time around a single point mass – we are allowed
to choose how we wish to label points within it, using a set of four independent
coordinates (four, because the space-time we are interested in is homeomorphic to R4
– see Section 3.1.1). We can explore the length of intervals in that space-time using a
clock, for timelike intervals, or a piece of string, for spacelike intervals, and summarise
the structure we thus discover, by writing down the coefficients of the metric. Those
coefficients depend on our instantaneous position within the space-time, and how we
have chosen to label them using our coordinates. The covariance principle declares that
the physical conclusions we come to must not change with a change in coordinates –
for example that a dropped object accelerates radially downwards or that the curl of
a magnetic field is proportional to a current; the numerical value of the acceleration
might depend on the coordinate choice, or the constant of proportionality, but not the
underlying geometrical statements. Similarly, it is the covariant derivative we defined
in Section 3.3.2 that allows us to define differentiation in such a way that the derivative
depends on the metric but, again, not the choice of coordinates.
B.1 The Schwarzschild Solution 131
The above account may seem somewhat hand-waving, but it is the intuitive
content of a more formal account, which depends on identifying the geometri-
cal structures that generate the symmetries contained within a particular manifold. This
discussion leads to Birkhoff’s theorem, which asserts that the Schwarzschild metric,
which we are leading up to, is the only spherically symmetric vacuum solution (static
or not). See Carroll (2004, §5.2).
With this metric, the Christoffel symbols are:
Γ^t_{tr} = α′    Γ^r_{tt} = e^{2(α−β)} α′    Γ^r_{rr} = β′
Γ^θ_{rθ} = 1/r    Γ^r_{θθ} = −r e^{−2β}    Γ^φ_{rφ} = 1/r    (B.3)
Γ^r_{φφ} = −r e^{−2β} sin²θ    Γ^θ_{φφ} = −sin θ cos θ    Γ^φ_{θφ} = cos θ / sin θ
(others zero), where primes denote differentiation with respect to r. The corresponding
Ricci tensor components are
R_tt = e^{2(α−β)} (α″ + α′² − α′β′ + 2α′/r)
R_rr = −α″ − α′² + α′β′ + 2β′/r    (B.4)
R_θθ = e^{−2β} (r(β′ − α′) − 1) + 1
R_φφ = sin²θ R_θθ
1 = e^{2α}(2rα′ + 1) = ∂/∂r (r e^{2α}),
dt/dτ = e^{−α},
so that the r-component of the geodesic equation is
d²r/dτ² + Γ^r_{tt} (dt/dτ)² = d²r/dτ² + e^{2α} α′ = 0,
or, differentiating Eq. (B.6),
d²r/dτ² = −R/(2r²).    (B.7)
In the non-relativistic, or weak field, limit, the acceleration of a test particle due to
gravity is (compare Section 4.3.2)
d²r/dt² = −GM/r²,
where M is the mass of the body at r = 0. From this we can see that R, the
Schwarzschild radius, is
R = 2GM , (B.8)
and putting this back into the prototype metric Eq. (B.2), we have
ds² = −(1 − 2GM/r) dt² + (1 − 2GM/r)^(−1) dr² + r² dΩ²    (B.9a)
≡ g_tt dt² + g_rr dr² + g_θθ dθ² + g_φφ dφ²,    (B.9b)
the metric for the Schwarzschild solution to Einstein’s equations. We will also refer to
(t, r, θ, φ) as (x⁰, x¹, x², x³) below.
Differentiating e^{2α} = e^{−2β} = 1 − R/r to obtain
α′ = −β′ = R / (2r(r − R)),
we can specialise Eq. (B.3) to
Γ^t_{tr} = R / (2r(r − R))    Γ^r_{tt} = R(r − R) / (2r³)    Γ^r_{rr} = −R / (2r(r − R))
Γ^θ_{rθ} = 1/r    Γ^r_{θθ} = −(r − R)    Γ^φ_{rφ} = 1/r    (B.10)
Γ^r_{φφ} = −(r − R) sin²θ    Γ^θ_{φφ} = −sin θ cos θ    Γ^φ_{θφ} = cos θ / sin θ.
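Entries of Eq. (B.10) can be spot-checked from the metric itself: for this diagonal, static metric, Γ^t_{tr} = ½ g^{tt} ∂_r g_tt. The sketch below is not from the book (R and r are arbitrary values with r > R), and uses a central finite difference for the derivative.

```python
R = 2.0    # Schwarzschild radius (arbitrary value)
r = 5.0    # radius at which to evaluate, chosen with r > R
h = 1e-6   # finite-difference step

def g_tt(r):
    # The tt metric coefficient of Eq. (B.9), with R = 2GM.
    return -(1 - R / r)

# Gamma^t_{tr} = (1/2) g^{tt} d(g_tt)/dr for a diagonal, static metric
dgtt = (g_tt(r + h) - g_tt(r - h)) / (2 * h)   # central difference
gamma_t_tr = 0.5 * (1 / g_tt(r)) * dgtt

print(gamma_t_tr, R / (2 * r * (r - R)))   # the two values agree, Eq. (B.10)
```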
The sun has a mass of approximately 2 × 10³⁰ kg in physical units. Taking
G = 2/3 × 10⁻¹⁰ m³ kg⁻¹ s⁻², this gives R = 8/3 × 10²⁰ m³ s⁻²; if we then divide
by the conversion factor 1 = c² = 9 × 10¹⁶ m² s⁻², we end up with R ≈ 3 × 10³ m.
Equivalently, we may recall the natural units of Section 4.3, in which the choice of
G = 1 creates a conversion factor between units of kilogrammes and units of metres,
exactly as we did when writing c = 1 in Section 1.4.1. In these units, the mass of the
sun is approximately 1.5 × 103 m giving, again, R = 2M = 3 × 103 m. From here on,
we will work in units where G = 1.
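The same arithmetic can be redone in SI units. A sketch using standard (rounded) values of the constants, so the results are approximate:

```python
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8        # speed of light, m s^-1
M_sun = 1.989e30   # mass of the sun, kg

M_geom = G * M_sun / c**2   # the sun's mass expressed in metres (natural units)
R = 2 * M_geom              # Schwarzschild radius, Eq. (B.8)
print(M_geom, R)            # roughly 1.5e3 m and 3e3 m, as in the text
```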
As you can see from Eq. (B.9), the coefficients have a singularity at r = 2M .
These coordinates are ill-behaved at r = 2M but, it turns out, there is no physical
B.2 The Perihelion of Mercury 133
singularity there. It is not trivial to demonstrate this, but it is possible to find alternative
coordinates – Kruskal–Szekeres coordinates – that are perfectly well-behaved at that
point. Although the space-time is not singular at r = 2M , that radius is nonetheless
special. The radius r = 2M is known as the ‘event horizon’, and demarcates two
causally separate regions: there are no world lines that go from inside the event horizon
to the outside, so no events within it can affect the space-time outside.
[Exercise B.1]
d/dτ ( g_μμ ∂x^μ/∂τ ) − ½ Σ_α g_αα,μ (∂x^α/∂τ)² = 0    (no sum on μ).    (B.11)
This will simplify some of the calculations below. See Exercise B.2.
We can see that gµν is independent of both t and φ , so that the second term in
Eq. (B.11) disappears for both µ = 0 and µ = 3, and thus that the geodesic equations
for t and φ are solvable as
g_tt dt/dτ = k, constant    (B.12)
g_φφ dφ/dτ = h, constant.    (B.13)
Similarly, the θ -equation (µ = 2) becomes
r² d²θ/dτ² + 2r (dr/dτ)(dθ/dτ) − r² sin θ cos θ (dφ/dτ)² = 0
(using intermediate results for the metric, from Exercise B.1), which has θ = π/ 2 as a
particular integral (a test particle which starts in that plane stays in that plane).
Trying to do the same thing for r, in Eq. (B.11), gives us a second order differential
equation to solve, which we’d prefer to avoid. Instead, we can use the fact that
d/dτ ( g_αβ (∂x^α/∂τ)(∂x^β/∂τ) ) = 0    (B.14)
along a geodesic (see Exercise B.3). Integrating this, we can fix the constant of
integration by recalling that U^μ U_μ = −1 in Minkowski space, to give
g_αβ (∂x^α/∂τ)(∂x^β/∂τ) = −1,
d²u/dφ² = 3Mu² − u + M/h²    (B.16)
(switching to M = R /2, here, avoids a few unsightly powers of 2 below). If you trace
this calculation back, you see that the first term, in u2 , is associated with the factor of
1 − R /r in dt/dτ , Eq. (B.12); it is, loosely speaking, the relativistic term, and it is the
term that is not present in the corresponding part of the non-relativistic, or newtonian,
calculation. And sure enough, this term is much smaller than the other two terms, for
cases such as the earth’s orbit around the sun.
What that means in turn is that the solution to this equation will be the solution to
the Kepler problem, u0, plus a small correction, which we can write as
u = u0 + u1,    (u1 ≪ u0).
You can confirm that, in these units, the solution to the newtonian problem (that is,
Eq. (B.16) without the relativistic term) is
u0(φ) = (M/h²)(1 + e cos φ),
and thus
d²u1/dφ² + u1 = 3(M³/h⁴)(1 + e²/2 + 2e cos φ + (e²/2) cos 2φ),    (B.17)
where we have suppressed terms in u0u1 and u1², and recalled that cos²φ =
(1 + cos 2φ)/2. If you stared at this for long enough, you would doubtless spot that
d2
(φ sin φ) + φ sin φ = 2 cos φ
dφ 2
d2
(cos 2φ) + cos 2φ = − 3 cos 2φ ,
dφ 2
which prompts us to write
B C
u1 (φ) = A + φ sin φ − cos 2φ , (B.18)
2 3
which, on substutition into Eq. (B.17), gives
¶ ·
M3 e2 e2
A + B cos φ + C cos 2φ = 3 4 1+ + 2e cos φ + cos 2 φ ,
h 2 2
giving values for $A$, $B$, and $C$, which are all numerically small.

Examining the terms in Eq. (B.18), we see that the first term is a (small) constant, and the third oscillates between $\pm C/3$; both, therefore, have negligible effects. The middle term, however, has an oscillatory factor, but also a secular factor, which grows linearly with $\phi$.
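The two particular-integral identities above are also easy to check numerically; a minimal sketch (the helper function and sample points are mine):

```python
import math

def second_derivative(f, x, h=1e-4):
    """Central-difference approximation to f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

f1 = lambda phi: phi * math.sin(phi)
f2 = lambda phi: math.cos(2 * phi)

for phi in (0.3, 1.0, 2.5):
    # d^2/dphi^2 (phi sin phi) + phi sin phi = 2 cos phi
    assert abs(second_derivative(f1, phi) + f1(phi) - 2 * math.cos(phi)) < 1e-5
    # d^2/dphi^2 (cos 2phi) + cos 2phi = -3 cos 2phi
    assert abs(second_derivative(f2, phi) + f2(phi) + 3 * math.cos(2 * phi)) < 1e-5
```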
If we finally add $u_0 + u_1$ again, but discard the negligible terms in $A$ and $C$, then we obtain
$$ u = \frac{M}{h^2}\left(1 + e\cos(1-\alpha)\phi\right), \qquad \alpha = \frac{3M^2}{h^2} \ll 1 \qquad (B.19) $$
(using a Taylor expansion which shows that $\cos(1-\alpha)\phi = \cos\phi + \alpha\phi\sin\phi + O(\alpha^2)$). This is very nearly an ellipse, but with a perihelion that advances by an angle $2\pi\alpha$ per orbit.
We can describe a newtonian orbital ellipse with
$$ u(\phi) = \frac{1 + e\cos\phi}{a(1 - e^2)}, $$
where $e$ is the orbital eccentricity, and $a$ the semi-major axis. By comparison with the non-relativistic part of Eq. (B.16), we find that
$$ \frac{M}{h^2} = \frac{1}{a(1 - e^2)}, $$
and thus that
$$ \Delta\phi \equiv 2\pi\alpha = 6\pi\,\frac{M^2}{h^2} = \frac{6\pi M}{a(1 - e^2)}. $$
The nearest planet to the sun is Mercury. The actual orbit of Mercury is not quite a keplerian ellipse, but instead precesses at 574 arcsec/century (relative to the ICRF, the relevant inertial frame). This is almost all explicable by newtonian perturbations arising from the presence of the other planets in the solar system, and over the course of the nineteenth century much of this anomalous precession had been accounted for in detail.
The process of identifying the corresponding perturbations had also been carried out
for Uranus, with the anomalies in that case resolved by predicting the existence of, and
then finding, Neptune.4 At one point it was suspected that there was an additional planet
near Mercury which could explain the anomaly, but this was ruled out on other grounds,
and a residual precession of 43 arcsec/century remained stubbornly unexplained.
Mercury has semi-major axis $a = 5.79\times10^{10}\,\mathrm{m} = 193\,\mathrm{s}$, and eccentricity $e = 0.2056$. Taking the sun's mass to be $M_\odot = 1.48\,\mathrm{km} = 4.93\times10^{-6}\,\mathrm{s}$, we find $\Delta\phi = 5.03\times10^{-7}\,\mathrm{rad\ orbit^{-1}}$. The orbital period of Mercury is 88 days, so that, converting $\Delta\phi$ for Mercury to units of arcseconds per century, we find
$$ \Delta\phi = 43.0\ \mathrm{arcsec/century}. $$
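The arithmetic of the last few paragraphs is easy to reproduce; a sketch in geometric units ($G = c = 1$, with mass and distance in metres; the constant values and function names here are mine):

```python
import math

M_SUN = 1476.6        # sun's mass in metres (GM/c^2)
A_MERCURY = 5.79e10   # semi-major axis, metres
E_MERCURY = 0.2056    # eccentricity
PERIOD_DAYS = 87.97   # orbital period of Mercury, days

def perihelion_advance_per_orbit(M, a, e):
    """Relativistic perihelion advance, Delta_phi = 6*pi*M/(a*(1 - e^2)), rad/orbit."""
    return 6 * math.pi * M / (a * (1 - e**2))

def advance_arcsec_per_century(M, a, e, period_days):
    orbits_per_century = 36525.0 / period_days
    rad = perihelion_advance_per_orbit(M, a, e) * orbits_per_century
    return math.degrees(rad) * 3600

dphi = perihelion_advance_per_orbit(M_SUN, A_MERCURY, E_MERCURY)
print(f"per orbit: {dphi:.3e} rad")        # about 5.02e-07 rad
print(f"per century: {advance_arcsec_per_century(M_SUN, A_MERCURY, E_MERCURY, PERIOD_DAYS):.1f} arcsec")
```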
Einstein published this calculation in 1916. It is the first of the classic tests of GR,
which also include the deflection of light (or other EM radiation) in its passage near the
limb of the sun, and the measurement of gravitational redshift. Numerous further tests
have been made, at increasing precision, and no deviations from GR’s predictions have
been found. The history of such tests is discussed at some length in Will (2014).
[Exercise B.2–B.4]
where $s_{ij}$ is traceless, so that the trace of $h_{\alpha\beta}$ appears only in $\Psi = -\delta^{ij}h_{ij}/6$. The tensor $s_{ij}$ is referred to as the strain. We have done nothing here other than change the notation; in particular, note that there are ten degrees of freedom here, just as in the original $h_{\alpha\beta}$. The significance of the different partitions of the matrix is that each of the 00, 0i, and ij components transforms into itself under a spatial rotation.
4 This took place in a few years leading up to 1846, and led to a Franco-British priority dispute
over which country’s astronomer had made and published the crucial prediction (Kollerstrom
2006).
B.3 Gravitational Waves
From this metric it is straightforward but tedious⁵ to obtain first the Christoffel symbols, and then the Riemann tensor, whose components include
$$ R_{0j0l} = \Phi_{,lj} + w_{(l,j)0} - h_{jl,00}. $$
In reading these expressions, recall the definition
$$ \Box = \eta^{\mu\nu}\frac{\partial}{\partial x^\mu}\frac{\partial}{\partial x^\nu} = -\frac{\partial^2}{\partial t^2} + \nabla^2, $$
the four-dimensional version of the Laplacian; the definition of the symmetrisation notation in Eq. (4.42); and that these expressions are calculated using $h_{\mu\nu} \ll \eta_{\mu\nu}$.
Having quoted the Einstein tensor, our next task is to find a way of throwing most of it away.

The Einstein tensor is governed by Einstein's equation, $G_{\mu\nu} = 8\pi T_{\mu\nu}$. Examining the $G_{00}$ term, we can see that the corresponding term in Einstein's equation specifies $\Psi$ in terms of $s_{ij}$ and $T_{00}$. There is no time derivative of $\Psi$, and so no propagation of it. Similarly, the $G_{0i}$ term specifies the $w_j$ in terms of $\Psi$, $s_{ij}$ and the $T_{0i}$, and the $G_{ij}$ term specifies $\Phi$, in each case, again, without a time derivative. So although, after Eq. (B.20), we counted ten degrees of freedom, they are not all independent. The propagating terms – the terms that will shortly lead to a wave equation – are all in the $s_{ij}$ component of the metric, which, you may recall, is both symmetric and trace-free.
In the discussion under Eq. (4.41) we discussed the family of gauge transformations generated by $h_{\alpha\beta} \to h_{\alpha\beta} + 2\xi_{(\alpha,\beta)}$ (this is mildly adjusted from Eq. (4.40), to make the signs below nicer). What does this look like, when applied to the metric parameterised by Eq. (B.20)? We obtain
$$ \Phi \to \Phi + \xi^0{}_{,0} $$
$$ w_i \to w_i + \xi^i{}_{,0} - \xi^0{}_{,i} $$
$$ \Psi \to \Psi - \tfrac{1}{3}\xi^k{}_{,k} \qquad (B.23) $$
$$ s_{ij} \to s_{ij} + \xi_{(j,i)} - \tfrac{1}{3}\xi^k{}_{,k}\,\delta_{ij}. $$
Doing the same thing with the $w_i$ expression, and choosing $\xi^0$ so that $\nabla^2\xi^0 - \xi^i{}_{,0i} = w^i{}_{,i}$, gives
$$ w^i{}_{,i} = 0. $$
This choice of coordinates – that is, this choice of ξ μ when applied to xμ – is known as
the transverse gauge. This gauge choice significantly simplifies Einstein’s equations in
Eq. (B.22). In this gauge, we can finally solve Einstein’s equations to find propagating
gravitational wave solutions.
We are looking for solutions propagating in free space, in which $T_{\mu\nu} = 0$. Looking at the simplified Eq. (B.22), the $G_{00} = 0$ equation implies $\nabla^2\Psi = 0$, which, in an asymptotically flat space-time (remember $h_{\mu\nu} \ll \eta_{\mu\nu}$), implies $\Psi = 0$. The $G_{0i}$ term then similarly implies $w_i = 0$. Finally, $G_{ij} = 0$ implies that $\operatorname{Tr} G_{ij} = \eta^{ij}G_{ij} = 0$ and thus (since $s_{ij}$ is traceless) that $\nabla^2\Phi = 0$ and $\Phi = 0$. The only term that survives this carnage is the traceless part of the $G_{ij} = 0$ equation, or
$$ \Box s_{ij} = 0. \qquad (B.25) $$
This is usually referred to as the transverse-traceless gauge. Looking back to our re-notation of the metric perturbation $h_{\mu\nu}$ in Eq. (B.20), and recalling that in this context $\Phi$, $w_i$, and $\Psi$ are all zero, we write the metric perturbation in this gauge as $h^{TT}$, and rewrite the above reduction of Einstein's equations as
$$ \Box h^{TT}_{\mu\nu} = 0. \qquad (B.26) $$
Comparing this to the definitions of $\Phi$, $w_i$, and $\Psi$ in Eq. (B.20), we can see that $h^{TT}_{00} = h^{TT}_{0i} = 0$. Eq. (B.26) has as solution a matrix of metric perturbations $h^{TT}(x^\alpha) = h^{TT}(t, \mathbf{x})$, giving the small deviations from a Minkowski metric as a function of position and time. Recall that this matrix gives expressions for the components of this perturbation in this gauge, and that (recalling Section 4.3.1) we can regard this as a tensor in a background Minkowski space.
$$ -k_\alpha k^\alpha\, h^{TT}_{\mu\nu} = 0, $$
which is a solution only if $k_\alpha k^\alpha = 0$, so that the wave-vector $k^\alpha$ is null.
Looking back at the solution in Eq. (B.27), we see that $h^{TT}_{\mu\nu}$ is constant if $k_\alpha x^\alpha$ is constant. This is true for the curve $x^\alpha(\lambda) = k^\alpha f(\lambda) + l^\alpha$ for any scalar function $f$ and constant vector $l^\alpha$. Here, the function $f(\lambda) = \lambda$, or some affine transformation of that, gives the worldline of a photon, indicating that a given perturbation – that is, a given value of $h^{TT}_{\mu\nu}$ – propagates in the same way as a photon, rectilinearly at the speed of light.
Imposing the gauge condition $h^{TT}_{\mu\nu}{}^{,\nu} = 0$, Eq. (B.24), we deduce that $k^\mu C_{\mu\nu} = 0$: the wave vector is orthogonal to the constant tensor $C_{\mu\nu}$, and the oscillations of the solution are transverse to the direction of propagation. This is why this is called the transverse gauge.
Our last observation is that we can, without loss of generality, orient our coordinates so that the wave vector is pointing along the z-axis, and $k^\mu = (\omega, 0, 0, \omega)$, writing $\omega$ for the time-component of the wave vector. Thus $0 = k^\mu C_{\mu\nu} = k^0 C_{0\nu} + k^3 C_{3\nu}$, so that $C_{3\nu} = 0$ as well. Writing (with, again, a little foreknowledge) $C_{11} = h_+$ and $C_{12} = h_\times$, this means that the tensor components $C_{\mu\nu}$ simplify, in this gauge, to
$$ C_{\mu\nu} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & h_+ & h_\times & 0 \\ 0 & h_\times & -h_+ & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \qquad (B.28) $$
After identifying ten degrees of freedom in Eq. (B.20), in this gauge there are only two degrees of freedom left.
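These algebraic properties – null wave vector, transversality, and tracelessness – can be verified directly; a sketch with illustrative, made-up amplitudes:

```python
# C_{mu nu} of Eq. (B.28), for a wave along z with k^mu = (omega, 0, 0, omega)
h_plus, h_cross = 0.3, 0.7   # arbitrary illustrative amplitudes
C = [[0, 0, 0, 0],
     [0, h_plus, h_cross, 0],
     [0, h_cross, -h_plus, 0],
     [0, 0, 0, 0]]
eta = [-1, 1, 1, 1]          # Minkowski metric, diag(-1, 1, 1, 1)

omega = 2.0
k_up = [omega, 0.0, 0.0, omega]                  # k^mu
k_down = [eta[m] * k_up[m] for m in range(4)]    # k_mu = (-omega, 0, 0, omega)

assert sum(k_down[m] * k_up[m] for m in range(4)) == 0.0      # null wave vector
for nu in range(4):
    assert sum(k_up[m] * C[m][nu] for m in range(4)) == 0.0   # transverse
assert sum(eta[m] * C[m][m] for m in range(4)) == 0.0         # traceless
```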
$$ \frac{dU^\mu}{d\tau} + \Gamma^\mu{}_{\alpha\beta}\,U^\alpha U^\beta = 0. $$
Given the initial condition, this gives an initial deflection of
$$ \frac{dU^\mu}{d\tau} = -\Gamma^\mu{}_{00} = -\tfrac{1}{2}\eta^{\mu\nu}\left(h^{TT}_{0\nu,0} + h^{TT}_{\nu 0,0} - h^{TT}_{00,\nu}\right). $$
Since $h^{TT}$ is proportional to $C_{\mu\nu}$, and $C_{\mu\nu}$ has components as in Eq. (B.28) in these coordinates, we see that the right-hand side is zero, so that the particle's velocity vector is unchanged, and it remains at rest as the gravitational wave passes.
Consider now a second test particle, a distance $\epsilon$ from the first, along the x-axis. The proper distance between these two particles is
$$ \Delta l = \int |ds^2|^{1/2} = \int |g_{\alpha\beta}\,dx^\alpha dx^\beta|^{1/2} = \int_0^\epsilon |g_{xx}|^{1/2}\,dx \approx |g_{xx}|^{1/2}\Big|_{x=0}\,\epsilon = \left[1 + \tfrac{1}{2}h^{TT}_{xx}\Big|_{x=0} + O\big((h^{TT}_{xx})^2\big)\right]\epsilon. \qquad (B.29) $$
Since hTT has an oscillatory factor, the proper distance between the test particles
oscillates as the wave passes, even though both particles are following geodesics, and
are thus ‘at rest’ in their corresponding Lorentz frames. They are ‘at rest’ in the separate
senses that they remain at the same coordinate positions (in these coordinates) and that
they remain on geodesics. This is an example of the distinction between ‘acceleration’
as a second derivative of coordinate position – which is a frame-dependent thing –
and acceleration as the thing you directly perceive when pushed – which is a frame-
independent thing.
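To get a feel for the size of the effect Eq. (B.29) describes, consider some illustrative numbers (the amplitude and arm length here are my own round figures, not the book's):

```python
# Fractional change in proper separation is delta_l / l ~ h/2, from Eq. (B.29).
h = 1e-21          # round-number wave amplitude, typical of detected signals
arm_length = 4e3   # metres: roughly the arm of a large ground-based interferometer
delta_l = 0.5 * h * arm_length
print(delta_l)     # about 2e-18 m, far smaller than an atomic nucleus
```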
The two test particles are both following geodesics, so we should be able to analyse this same situation using the geodesic deviation equation, Eq. (3.60), which directly describes the way in which the proper distance between two test particles changes, as they move along their mutual geodesics. To first order, we can take both tangent vectors to be $U^\mu = (1, 0, 0, 0)$, as above, so that
$$ \frac{d^2\xi^\mu}{dt^2} = R^\mu{}_{\alpha\beta\nu}\,U^\alpha U^\beta\,\xi^\nu. $$
Evaluating $R^\mu{}_{\alpha\beta\nu} = \eta^{\mu\sigma}R_{\sigma\alpha\beta\nu}$ in our transverse-traceless gauge,
$$ R_{\sigma 00\nu} = \tfrac{1}{2}\left(h^{TT}_{\sigma\nu,00} - h^{TT}_{\sigma 0,\nu 0} - h^{TT}_{0\nu,0\sigma} + h^{TT}_{00,\nu\sigma}\right) = \tfrac{1}{2}h^{TT}_{\sigma\nu,00}, $$
since $h^{TT}_{00} = h^{TT}_{0i} = 0$ in this gauge. For these test particles, $x^0 = \tau = t$, so that the geodesic deviation equation becomes
$$ \frac{d^2\xi^\mu}{dt^2} = \tfrac{1}{2}\,\eta^{\mu\sigma}\,\xi^\nu\,\frac{d^2 h^{TT}_{\sigma\nu}}{dt^2}. \qquad (B.30) $$
Notice that only ξ 1 and ξ 2 are affected – the directions transverse to the direction the
wave is travelling in.
Figure B.1 Polarised gravitational waves. The successive figures show the positions of a ring of test particles at successive times. The upper figure shows the effect of +-polarised gravitational waves, and the lower the effect of ×-polarised ones.
For a +-polarised wave, Eq. (B.30) gives, for the $\xi^2$ component,
$$ \frac{\partial^2\xi^2}{\partial t^2} = -\frac{1}{2}\,\xi^2\,\frac{\partial^2}{\partial t^2}\left(h_+\exp ik_\mu x^\mu\right), $$
which has solution
$$ \xi^2 = \left(1 - \tfrac{1}{2}h_+\exp ik_\mu x^\mu\right)\xi^2(0). \qquad (B.31) $$
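Figure B.1's pattern can be reproduced from the real parts of these solutions; a sketch for the + polarisation, with a wildly exaggerated amplitude (the function name and sample values are mine):

```python
import math

def ring(h_plus, phase, n=8):
    """Positions of n test particles, initially on a unit circle, under a
    +-polarised wave: xi^1 scales by (1 + h+/2 cos), xi^2 by (1 - h+/2 cos)."""
    pts = []
    for k in range(n):
        x0 = math.cos(2 * math.pi * k / n)
        y0 = math.sin(2 * math.pi * k / n)
        c = 0.5 * h_plus * math.cos(phase)
        pts.append(((1 + c) * x0, (1 - c) * y0))
    return pts

stretched = ring(0.2, 0.0)      # stretched along x, squeezed along y
squeezed = ring(0.2, math.pi)   # half a period later: the reverse
```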
If a pair of test masses – either two of the particles in Figure B.1 or the two in Eq. (B.29) – are not in free fall but instead carefully suspended so that they are stationary in a lab, then their motion is governed both by the suspending force, which is accelerating them away from a free-fall geodesic, but which is constant, and by the changes to space-time caused by any passing gravitational waves. By carefully monitoring the changes in proper separation between the masses, using the light-travel time between them, we should be able to detect the passage of any gravitational waves in the vicinity.
The size of these fluctuations is given by the size of the terms in $h^{TT}$, compared to the terms in the metric they are perturbing, which are equal to 1. For the Hulse–Taylor binary mentioned above, $h^{TT} \sim 10^{-23}$. Detecting the changes in proper separation requires a rather intricate measurement.
One way of doing this was to develop resonant bar detectors: large bars, typically of aluminium, with a mechanical resonance that could be excited by a sufficiently strong and sufficiently long gravitational wave of the matching frequency. These proved insufficiently sensitive, however, and the gravitational wave community instead moved on to develop interferometric detectors. These use an optical arrangement very similar in principle to the Michelson–Morley apparatus which failed to detect the luminiferous aether, and which was so important in the history of SR. The test masses here are mirrors at the ends of two arms at right angles to each other: changes in the lengths of the arms, as measured by lasers within the arms, are, with suitably heroic efforts, detectable interferometrically.
There are detailed discussions of the calculation in Schutz (2009, chapter 9) and
Carroll (2004, chapter 7), a summary calculation and a discussion of astrophysical
implications in Sathyaprakash and Schutz (2009), and an account of the experimental
design in Pitkin et al. (2011). The LISA design for a future space-based interferometer,
at a much larger scale and with higher sensitivities, and the astronomy it may unlock,
is described in Amaro-Seoane et al. (2013 & 2017). Pitkin et al. (2011) also provides
a very brief history of the development of gravitational wave detectors, and there is a
much more extensive history, from the point of view of the sociology of science, in
Collins (2004).
The first direct detection of gravitational waves was made by the LIGO experiment
on 2015 September 14, and jointly announced by the LIGO and Virgo collaborations on
2016 February 11 (Abbott et al., 2016). Subsequent detections confirm the expectation
that such measurements will become routine, and will allow gravitational wave physics
to change from being an end in itself, to becoming a non-electromagnetic messenger
for the exploration of the universe on the largest scales.
Exercises
Exercise B.1 (§ 2.1) Derive Eqs. (B.3) and (B.4). You might as well obtain
Eq. (B.10) as an encore. [d− ]
Exercise B.2 (§2.2) Prove Eq. (B.11). Suspend the summation convention for this exercise. First, obtain the Christoffel symbols for an orthogonal metric:
$$ \Gamma^\alpha{}_{\alpha\alpha} = \frac{g_{\alpha\alpha,\alpha}}{2g_{\alpha\alpha}} $$
$$ \Gamma^\alpha{}_{\beta\beta} = \frac{-g_{\beta\beta,\alpha}}{2g_{\alpha\alpha}} \qquad \alpha\neq\beta $$
$$ \Gamma^\alpha{}_{\alpha\beta} = \frac{g_{\alpha\alpha,\beta}}{2g_{\alpha\alpha}} \qquad \alpha\neq\beta $$
$$ \Gamma^\alpha{}_{\beta\gamma} = 0 \qquad \alpha, \beta, \gamma\ \text{all different.} $$
Then start from one or other version of the geodesic equation, such as Eq. (4.46), expand $d/d\tau\,(g_{\mu\mu}\,\partial x^\mu/\partial\tau)$, use the geodesic equation, and expand the sum carefully. [u+]
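The orthogonal-metric formulas in Exercise B.2 can be spot-checked numerically. A sketch using the unit 2-sphere, $ds^2 = d\theta^2 + \sin^2\theta\,d\phi^2$, where the only non-zero symbols should be $\Gamma^\theta{}_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi{}_{\theta\phi} = \Gamma^\phi{}_{\phi\theta} = \cot\theta$ (the helper names are mine):

```python
import math

def christoffel_orthogonal(g_diag, x, a, b, c, h=1e-6):
    """Gamma^a_{bc} for a diagonal metric, using the formulas of Exercise B.2.
    g_diag(x) returns the list of diagonal metric components at point x."""
    def d(idx, comp):  # partial derivative of g_{comp comp} w.r.t. x^idx
        xp, xm = list(x), list(x)
        xp[idx] += h
        xm[idx] -= h
        return (g_diag(xp)[comp] - g_diag(xm)[comp]) / (2 * h)
    gaa = g_diag(x)[a]
    if a == b == c:
        return d(a, a) / (2 * gaa)
    if b == c:                       # Gamma^a_{bb}, a != b
        return -d(a, b) / (2 * gaa)
    if a == b or a == c:             # Gamma^a_{a beta}
        other = c if a == b else b
        return d(other, a) / (2 * gaa)
    return 0.0                       # all indexes different

# Unit sphere, coordinates (theta, phi): g = diag(1, sin^2 theta)
sphere = lambda x: [1.0, math.sin(x[0])**2]
theta = 0.7
print(christoffel_orthogonal(sphere, [theta, 0.0], 0, 1, 1))  # -sin(theta)cos(theta)
print(christoffel_orthogonal(sphere, [theta, 0.0], 1, 0, 1))  # cot(theta)
```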
Exercise B.3 (§ 2.2) Prove Eq. (B.14). Expand the left-hand side, use the
geodesic equation, and use symmetry.
Exercise B.4 (§ 2.2) Estimate the relative numerical values of the terms on the
right-hand side of Eq. (B.16), for the orbit of the earth. Confirm that the first term is
indeed small compared to the third. [ u+ ]
Appendix C Notation
Chapters 2 and 3 introduce a great deal of sometimes confusing notation. The best way to get used to this is: to get used to it, by working through problems; but while you're slogging through them, this crib might be useful.
C.1 Tensors
A $\binom{M}{N}$ tensor is a linear function of $M$ one-forms and $N$ vectors, which turns them into a number. A $\binom{1}{0}$ tensor is called a vector, and a $\binom{0}{1}$ tensor is a one-form. Vectors are written with a bar over them, $\bar V$, and one-forms with a tilde, $\tilde p$ (§2.2.1). In my (informal) notation in the text, $\mathbf T(\tilde\cdot\,, \bar\cdot\,)$ is a $\binom{1}{1}$ tensor – a machine with a single one-form-shaped slot and a single vector-shaped slot. Note that this is a different beast from $\mathbf T(\bar\cdot\,, \tilde\cdot\,)$, which is also a $\binom{1}{1}$ tensor, but with the slots differently arranged.
$T_i{}^j = \mathbf T(\bar e_i, \tilde\omega^j)$ – a different beast (note the arrangement of indexes)

The object $T^i{}_j$ is a number – a component of a tensor in a particular basis. However we also (loosely, wickedly) use this same notation to refer to the corresponding matrix of numbers, and even to the corresponding $\binom{1}{1}$ tensor $\mathbf T$.
The vector space in which these objects live is the tangent plane to the manifold $M$ at the point $P$, $T_P(M)$ (§3.1.1). In this space, the basis vectors are $\bar e_i = \partial/\partial x^i$ (§3.1.2), and the basis one-forms $\widetilde{dx}^i$, where $x^i$ is the $i$-th coordinate (more precisely, coordinate function; note that $x^i$ is not a component of any vector, though the notation makes it look like one). These bases are dual: $\bar e_i(\tilde\omega^j) = \partial/\partial x^i(\widetilde{dx}^j) = \delta_i{}^j$ (cf. (3.7)).
C.3 Contractions
A contraction is a tensor with some or all of its arguments filled in.

$\bar V(\tilde p) = \tilde p(\bar V)$ – by choice (§2.2.5)
$\tilde p(\bar V) \equiv \langle\tilde p, \bar V\rangle$ – special notation for vectors contracted with one-forms (2.2)
$\tilde p(\bar V) = p_i V^i$ – basis independent (§2.2.5)
$\mathbf T(\tilde p, \tilde\cdot\,, \bar V)^j = p_i V^k\,T^{ij}{}_k$ – partially contracted $\binom{2}{1}$ tensor (a vector) (§2.2.5)
$\bar p = \mathbf g(\tilde p, \tilde\cdot\,)$ – a vector, with components… (§3.2.3)
$\mathbf g(\tilde p, \tilde\omega^j) = p_i\,\mathbf g(\tilde\omega^i, \tilde\omega^j) \equiv p^j$ – definition of the vector $\bar p$, with raised indexes, dual to the one-form $\tilde p$, written with lowered indexes
$g_{ij} = \mathbf g(\bar e_i, \bar e_j)$ – components of the metric (2.12)
C.4 Differentiation

$V^i{}_{,j} \equiv \partial V^i/\partial x^j$ – (non-covariant) derivative of a vector component
$p_{i,j} \equiv \partial p_i/\partial x^j$ – …and of a one-form
$\nabla\bar V$ – covariant derivative of $\bar V$ (a $\binom{1}{1}$ tensor) (§3.2.2)
$\nabla_{\bar U}\bar V$ – covariant derivative of $\bar V$ in the direction $\bar U$ (a vector) (cf. (3.40))
$\nabla_{\bar e_i}\bar V \equiv \nabla_i\bar V$ – shorthand
$(\nabla\bar V)^i{}_j = V^i{}_{;j}$ – components of the $\binom{1}{1}$ tensor $\nabla\bar V$ (3.21)
$(\nabla\tilde p)_{ij} = p_{i;j}$ – components of the $\binom{0}{2}$ tensor $\nabla\tilde p$
$\bar e_{\bar\imath} = \Lambda^i{}_{\bar\imath}\,\bar e_i$, $\quad\tilde\omega^{\bar\imath} = \Lambda^{\bar\imath}{}_i\,\tilde\omega^i$ – transformation of basis vectors and one-forms (§3.1.4)
$V^{\bar\imath} = \Lambda^{\bar\imath}{}_i\,V^i$, $\quad p_{\bar\imath} = \Lambda^i{}_{\bar\imath}\,p_i$ – transformation of components (2.17)
$\bar e_i = \Lambda^{\bar\jmath}{}_i\,\bar e_{\bar\jmath} = \Lambda^{\bar\jmath}{}_i\,\Lambda^j{}_{\bar\jmath}\,\bar e_j$ – the transformation matrix goes in both directions
$\Rightarrow\ \Lambda^{\bar\jmath}{}_i\,\Lambda^j{}_{\bar\jmath} = \delta^j{}_i$ – matrix inverses (2.22)

Notes: (1) these look complicated to remember, but as long as each $\Lambda$ has one barred and one unbarred index, you'll find there's only one place to put each index, consistent with the summation convention. (2) This shows why it's useful to have the bars on the indexes. (3) Some books use hats for the alternate bases: $\bar e_{\hat\imath}$.
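The 'matrix inverses' relation can be checked concretely for, say, a rotation of the basis in the plane (a choice made purely for illustration; the transformation matrix is general):

```python
import math

a = 0.6
# Lambda^{ibar}_i for a rotation of the basis by angle a, and its matrix inverse
L = [[math.cos(a), math.sin(a)],
     [-math.sin(a), math.cos(a)]]
Linv = [[math.cos(a), -math.sin(a)],
        [math.sin(a), math.cos(a)]]

# Lambda^{jbar}_i Lambda^j_{jbar} should be delta^j_i
for i in range(2):
    for j in range(2):
        s = sum(L[jb][i] * Linv[j][jb] for jb in range(2))
        assert abs(s - (1.0 if i == j else 0.0)) < 1e-12
```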
with raised ones; the other is to distinguish the components $p_i$ of the one-form $\tilde p$ from the components $p^i = g^{ij}p_j$ of the related vector $\bar p = \mathbf g(\tilde p, \tilde\cdot\,)$. Points to watch:
• A term should have at most two duplicate indexes: one up, one down; if you find something like $U_iV_i$ or $U_iT^{ii}$, you've made a mistake.
• All the terms in an expression should have the same unmatched index(es): $A^i = B_jT^{ij} + C^i$ is all right, $A^i = B_jT^{kj}$ is a mistake (typo or thinko).
• You can change or swap repeated indexes: $A^{ij}T_{ij}$, $A^{ik}T_{ik}$, and $A^{ji}T_{ji}$ all mean exactly the same thing, but all are different from $A^{ij}T_{ji}$ (unless $\mathbf T$ happens to be symmetric).
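The last point is easy to demonstrate with concrete, made-up component matrices:

```python
# Illustrative 2x2 component matrices for A^{ij} and T_{ij} (T not symmetric)
A = [[1.0, 2.0], [3.0, 4.0]]
T = [[5.0, 6.0], [7.0, 8.0]]

s_ij = sum(A[i][j] * T[i][j] for i in range(2) for j in range(2))       # A^{ij} T_{ij}
s_swap = sum(A[j][i] * T[j][i] for i in range(2) for j in range(2))     # dummies relabelled
s_ji = sum(A[i][j] * T[j][i] for i in range(2) for j in range(2))       # A^{ij} T_{ji}

assert s_ij == s_swap   # renaming repeated indexes changes nothing
assert s_ij != s_ji     # a genuinely different contraction
```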
C.7 Miscellaneous

$\delta^i{}_j = \delta^{ij} = \delta_{ij} = \begin{cases}1 & i = j\\ 0 & \text{otherwise}\end{cases}$ – Kronecker delta symbol (§2.1.1)
$\eta_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$ – the metric (tensor) of Minkowski space (2.33)
$[\mathbf A, \mathbf B] = \mathbf{AB} - \mathbf{BA}$ – the 'commutator' (before (3.57))
$A_{[ij]} = (A_{ij} - A_{ji})/2$ – anti-symmetrisation
$A_{(ij)} = (A_{ij} + A_{ji})/2$ – symmetrisation (4.42)

For the definitions – and specifically the sign conventions – of the Riemann, Ricci, and Einstein tensors, see Eqs. (3.49), (4.16), and (4.23). For a table of contrasting sign conventions in different texts, see Table 1.1 in §1.4.3.
Note: $\Lambda^i{}_{\bar\jmath}$ and $\Gamma^i{}_{jk}$ are matrices, not the components of tensors, so the indexes don't correspond to arguments, and so don't have to be staggered.

In general, component indexes are Roman letters: $i$, $j$, and so on. When discussing specifically space-time, it is traditional but not universal to use Greek letters such as $\mu$, $\nu$, $\alpha$, $\beta$, and so on, for component indexes ranging over $\{0, 1, 2, 3\}$, and Roman letters for indexes ranging over the spacelike components $\{1, 2, 3\}$. Here, we use the latter convention only in Chapter 4.
Components usually stand in for numbers; however, we'll sometimes replace them with letters when a particular coordinate system suggests them: for example, we write $\bar e_x$ rather than $\bar e_1$ in the context of cartesian coordinates, or $\Gamma^\theta{}_{\phi\phi}$ rather than, say, $\Gamma^1{}_{22}$, when writing the Christoffel symbols for coordinates $(\theta, \phi)$. There shouldn't be confusion (context, again), because $x$, $y$, $\theta$, and $\phi$ are never used as (variable) component indexes; see e.g., Eq. (3.15).
References
Abbott, B. P. et al. (2016), 'Observation of gravitational waves from a binary black hole merger', Phys. Rev. Lett. 116, 061102. https://doi.org/10.1103/PhysRevLett.116.061102.
Amaro-Seoane, P. et al. (2013), ‘Doing science with eLISA: Astrophysics and
cosmology in the millihertz regime’, Preprint. https://arxiv.org/abs/1201.3621
Amaro-Seoane, P. et al. (2017), 'LISA mission L3 proposal', web page. www.lisamission.org/articles/lisa-mission/lisa-mission-proposal-l3. Last accessed July 2018.
Barton, G. (1999), Introduction to the Relativity Principle , John Wiley and Sons.
Bell, J. S. (1987), Speakable and Unspeakable in Quantum Mechanics , Cambridge
University Press.
Carroll, S. M. (2004), Spacetime and Geometry, Pearson Education. See also http://spacetimeandgeometry.net.
Collins, H. (2004), Gravity’s Shadow: The Search for Gravitational Waves , University
of Chicago Press.
Collins, H. and Pinch, T. (1993), The Golem: What Everyone Should Know about
Science , Cambridge University Press.
Einstein, A. (1905), 'Zur Elektrodynamik bewegter Körper (On the electrodynamics of moving bodies)', Annalen der Physik 17, 891. https://doi.org/10.1002/andp.19053221004. Reprinted, in translation, in Lorentz et al. (1952).
Einstein, A. (1920), Relativity: The Special and the General Theory , Methuen. Orig-
inally published in book form in German, in 1917; first published in English in
1920, in an authorised translation by Robert W Lawson; available in multiple
editions and formats.
Goldstein, H. (2001), Classical Mechanics, 3rd edn, Pearson Education.
Hartle, J. B. (2003), Gravity: An Introduction to Einstein’s General Relativity , Pearson
Education.
Hawking, S. W. and Ellis, G. F. R. (1973), The Large Scale Structure of Space-Time ,
Cambridge University Press.
Janssen, M. and Renn, J. (2015), ‘Arch and scaffold: How Einstein found his field
equations’, Physics Today 68(11), 30–36. https://doi.org/10.1063/PT.3.2979 .
Landau, L. D. and Lifshitz, E. M. (1975), Course of Theoretical Physics, Vol. 2: The
Classical Theory of Fields , 4th edn, Butterworth-Heinemann.