Justin Corvino, Pengzi Miao - Lectures On Mathematical Relativity-Cambridge University Press (2025)
66
Lectures on
Mathematical Relativity
Mathematical Sciences Research Institute Publications
This series is based on work undertaken at the Simons Laufer Mathematical Sciences Institute
(SLMath), formerly the Mathematical Sciences Research Institute (MSRI), in Berkeley,
California. It publishes surveys and workshop proceedings of long-lasting value, as well as
lecture notes and monographs by visitors to the Institute. The volumes below are published
by Cambridge University Press; earlier ones may be available from Springer-Verlag.
Justin Corvino
Lafayette College
Pengzi Miao
University of Miami
Lan-Hsuan Huang
University of Connecticut, Storrs
Brian Allen
Lehman College, CUNY
Fernando Schwartz
University of Tennessee, Knoxville
Justin Corvino
Lafayette College
Easton, PA 18042
United States
[email protected]
Pengzi Miao
University of Miami
Coral Gables, FL 33146
United States
[email protected]
The Simons Laufer Mathematical Sciences Institute wishes to acknowledge support by the National
Science Foundation and the Pacific Journal of Mathematics for the publication of this series.
Preface xiii
Notation and conventions xix
Chapter 1. Special relativity and Minkowski spacetime 1
1.1. Lorentz transformations 1
1.2. Kinematics in Minkowski spacetime 15
1.3. Energy and momentum 24
1.4. Some geometric aspects of Minkowski spacetime 32
Exercises 36
Chapter 2. The Einstein equation 47
2.1. Newtonian gravity 47
2.2. From the equivalence principle to general relativity 49
2.3. The Einstein equation 58
2.4. Spacetime examples 82
Exercises 100
Chapter 3. Basics of Lorentzian causality 107
3.1. Preliminaries from Lorentzian geometry 107
3.2. Causality relations 109
3.3. Causality conditions 110
3.4. Achronal sets 112
3.5. Cauchy hypersurfaces 114
3.6. Domains of dependence 117
3.7. Cauchy horizons 119
Exercises 120
Chapter 4. The Penrose singularity theorem 123
4.1. Jacobi fields and focal points 123
4.2. Riccati and Raychaudhuri equations 124
4.3. Proof of Penrose’s singularity theorem 129
Chapter 5. The Einstein constraint equations 133
5.1. Introduction 133
5.2. The Einstein constraint equations 139
5.3. The initial value formulation for the vacuum Einstein equation 144
Exercises 161
Chapter 6. Scalar curvature deformation and the Einstein constraint equations 167
6.1. A primer on elliptic PDE 168
6.2. Solving the constraint equations: the conformal method 194
6.3. Scalar curvature deformation on closed manifolds 205
Exercises 214
Excursus: First and second variation of area 219
Exercises 225
Chapter 7. Asymptotically flat solutions of the Einstein constraint equations 231
7.1. Harmonically flat solutions of the constraint equations 232
7.2. Asymptotically flat initial data 245
7.3. Harmonically flat asymptotics 282
7.4. On the positive mass theorem 285
7.5. Localized scalar curvature deformation and asymptotics 290
Exercises 304
Chapter 8. On the center of mass and constant mean curvature surfaces of asymptotically flat initial data sets 319
8.1. Introduction 319
8.2. Uniqueness of embedded CMC surfaces 324
8.3. Stable CMC surfaces 327
8.4. Existence of CMC surfaces in asymptotically flat initial data sets 331
8.5. Stability and foliations 340
8.6. Density theorems 348
Chapter 9. On the Riemannian Penrose inequality 357
9.1. Introduction 357
9.2. Preliminaries 358
9.3. Lam’s proof of the RPI (and PMT) for graphs, in arbitrary dimensions 360
9.4. Huisken and Ilmanen’s proof of the RPI using IMCF 364
9.5. Bray’s proof 375
References 383
Index 395
Preface
This volume arose from the Summer Graduate Workshop in Mathematical General
Relativity at the Mathematical Sciences Research Institute (MSRI, now
renamed the Simons Laufer Mathematical Sciences Institute) in Berkeley, CA
in 2012, and the subsequent summer school in Cortona, Italy, in 2013. The
editors of the volume served as scientific organizers for the summer schools.
The contributions to the volume grew out of lectures given at one or both of the
schools.
We have endeavored to enhance the presentation of the material covered in the
two-week summer schools to make it suitable for reading in book form, while at
the same time remaining faithful to the spirit of those schools.
The advertised prerequisites for the schools, and hence this volume, included a
standard first-year graduate analysis course, with elements of real and functional
analysis as might be found in Real analysis by H. Royden [193] and Real and
complex analysis by W. Rudin [194]. We also assumed introductory graduate
courses in differential and Riemannian geometry, at the level of the following
texts: An introduction to smooth manifolds and Riemannian manifolds, by J. M.
Lee [141; 140]; Riemannian geometry by M. P. do Carmo [41]; Riemannian
geometry by P. Peterson [182]; and of particular relevance to the summer schools,
Semi-Riemannian geometry by B. O’Neill [174]. A graduate course in partial
differential equations (PDE), at the level of Partial differential equations by L. C.
Evans [86] and the first half of Elliptic partial differential equations of second
order by D. Gilbarg and N. Trudinger [107], was not a requirement for the
schools, and although some of the lectures needed to draw on some PDE results,
students without such background could profit from the bulk of the material
discussed at the schools. For this volume, however, certain PDE details in some
of the presentations have been fleshed out, so those sections would be better
approached with this background in hand. We have endeavored to bridge the gap
by including a section introducing and motivating some of the PDE tools.
The students came to the schools with a wide range of backgrounds in mathematics,
from those who had nearly completed their doctoral dissertations to those
who came without the prerequisite geometry background. Through exercises and
tutorial sessions, students were able to build enough intuition and computational
skills to understand much of the material presented. While we decided a primer
section on elliptic PDE was essential for the flow of this book, we resisted the
temptation to add further sections on background geometry. That said, we do
recall or develop some foundational material where needed, and some startup
notations and conventions are reviewed starting on p. xix. We include exercises
that were assigned before and during the schools, both to give readers a feel for
the tutorials and to help focus those who are learning the topics for the first time
or reviewing on the fly. We added many exercises as well, some collected at the
end of chapters, some interspersed in the text. Of particular note, many exercises
in Chapters 1 and 2 serve to review and extend background in geometry.
Strictly speaking, no physics background is required. We assume, as we did
at the schools, a nodding acquaintance with pre-relativity physics, enough so
that students can approach the development of the theory of special and general
relativity with context from which to appreciate the rudiments of spacetime
structure and the line of thought from Galileo to Newton to Einstein, and to
motivate why the Einstein equations and the initial value constraints were to
receive so much of their attention. In part for this reason, the first chapter
contains some very basic material that would be included in an undergraduate
course in special relativity, but we found it to be a fun way to start each school,
engendering some interesting discussion amongst participants without needing
much in the way of background. The first two chapters on special and general
relativity may seem somewhat chatty, including some discussion of physics
without always being mathematically efficient or fastidious, but we hope it helps
to frame the mathematical theory. We could have cut the physics discussion short
by formulating the mathematical postulates from the start with a small amount
of motivation, but we decided, given the audience, to put some more time into
developing these ideas from their genesis in physics. Even giving ourselves
some leeway, the presentation is not too leisurely, and the lecture schedule at the
schools called for covering the physics background reasonably efficiently at the
beginning of the first week.
The mathematical and physical foundations of relativity have been an active
topic of discussion and research for over a hundred years, and we have not tried
to approach the scope of the debate (for instance, we chose not to discuss Mach’s
principle in depth), nor have we tried to use too fine a brush in painting the
logical and philosophical distinctions, nor strained to give a serious historical
account of the development of the theory. Interested readers can follow up with
references such as [82; 83; 85; 161; 169; 170; 171], and with a wealth of material
available online.
Even while starting off in an elementary fashion, and keeping in mind the
range of student backgrounds represented, we were able to cover a reasonable
amount of ground at each school. During the MSRI workshop, Pengzi Miao
covered sufficient elements of causal theory to present the proof of the Penrose
singularity theorem. Justin Corvino developed enough background in scalar
curvature and asymptotically flat solutions of the constraint equations to be able
to present a proof of the Riemannian positive mass theorem in three dimensions,
while Lan-Hsuan Huang and Fernando Schwartz were able to build on this to
discuss advanced aspects of the geometry of initial data sets, with Lan-Hsuan
discussing constant mean curvature surfaces and the notion of center of mass,
and with Fernando outlining multiple approaches to the Riemannian Penrose
inequality. This volume reflects essentially the material covered during the MSRI
workshop.
At Cortona, in lieu of Pengzi’s lectures, Mauro Carfora (Università di Pavia)
presented an engaging and marvelously illustrated development connecting the
constraint equations (elliptic PDE governing initial values for the Einstein
evolution) and the Ricci flow,¹ while Michael Eichmair (ETH Zürich, now at the
University of Vienna) developed connections between the positive mass theorem
and the geometry of initial data sets (including isoperimetry of large spheres),²
which dovetailed beautifully with the lectures of Huang and Schwartz.
¹ A full treatment of the topic in Mauro’s lectures can be found in the recent monograph [40].
² For this material see [32; 33; 79; 80; 81].
With all this background to present, the organizers decided to focus the topics
lectures on the Einstein constraint equations, which govern the initial data for
the Einstein evolution, at the expense of not including advanced and/or current
topics on the evolution problem. While this is a reasonable basis for criticism
(of the schools and hence this volume), the field has developed to a point where
there is room for multiple programs on each of these topics, and the relations
between them; articles such as [64] and volumes such as [11; 51] indicate the
considerable breadth and depth of the field.
The years just after the workshops witnessed a flurry of activity in general
relativity. The centennial year of 2015 marked the hundredth anniversary of
Einstein’s formulation of a geometric theory of gravity governed by the Einstein
equation, and was capped off with the excitement over the detection by LIGO
of gravitational waves generated from black hole mergers, the discovery of
which led to the 2017 Nobel Prize in physics. Roger Penrose shared the 2020
Nobel Prize in physics for his work on singularity formation and black holes,
some of which we discuss. We hope the field will continue to develop in a robust
manner, and that this work will be of some value in introducing graduate students
to the field, and showing them some aspects of more advanced topics. Along
these lines, we enthusiastically point the reader to the graduate text Geometric
Relativity by Dan A. Lee (Queens College, CUNY), which has appeared recently
[142], and would surely have been a recommended text for the schools.
The first two chapters of this volume present the basic background, from
Minkowski spacetime and special relativity, to Einstein’s equation and general
relativity. Chapters 3 and 4 treat causality and the Penrose singularity theorem.
Chapter 5 on the Einstein constraint equations rounds out the basic background
from general relativity. Starting from Chapter 6 the text takes a sharp turn in the
direction of geometric analysis. Chapter 6 includes some background motivation
on elliptic PDE, with some applications to the constraint equations and scalar
curvature; of note, there is an excursus on the first and second variations of area,
which will appear throughout the rest of the text. Chapters 7–9 are written as
topical chapters and are largely independent of each other, though one might
find utility in referring to Chapter 7 for some properties of asymptotically flat
spaces. That said, on a first pass, some readers might find themselves giving
some of the more technical discussions in Chapter 7 a light read.
We would like to thank the graduate students for their hard work and enthusiasm
at the summer schools, and in particular Alan Parry and Xin Zhou, as
well as Peter McGrath and Andrea Santi, for their work as graduate assistants
at the MSRI and Cortona schools, respectively. During one tutorial session,
Alan introduced us to his research area, by presenting work of his thesis advisor
Hubert Bray (Duke University), which modifies the Einstein–Hilbert action of
general relativity with a goal to model dark matter; while we do not treat this
topic in the text, we refer the interested reader to [27]; see also [30]. It has been
inspiring to the scientific organizers to see so many of the students producing a
staggering number of interesting theses and papers in the years since the summer
schools were held, and many have moved on to postdocs and faculty positions. In
particular, Brian Allen, currently in the Department of Mathematics at Lehman
College, CUNY, attended the MSRI summer school as a graduate student, and is
a coauthor on Chapter 9 in this volume.
There are many people to thank for helping this project along. Giorgio
Patrizio (Università di Firenze) first broached the idea of a volume after the
Cortona summer school. We thank Hélène Barcelo (MSRI) for her enthusiastic
support throughout the process. We also thank all the great staff at MSRI, and in
particular Chris Marshall, for their support before, during and after the school,
and likewise at Cortona, in particular Silvana Boscherini and Cinzia Benedetti.
Funding for the schools was provided in part by the National Science Foundation,
the Clay Foundation, and INdAM (Istituto Nazionale di Alta Matematica), and
we thank them for their generous support. Likewise we thank our respective
home institutions, Lafayette College and the University of Miami. The editors
shaped the book in part during their invited mini-course at the 2013 Taiwan
International Conference on Geometry, at the National Taiwan University, and we
would like to extend our thanks to Yng-Ing Lee for that opportunity. JC thanks
Lehigh University, and especially Huai-Dong Cao, for inviting him to teach a
graduate course in mathematical relativity in 2011, an experience that helped
frame the approach to some of the material. JC would also like to acknowledge
invitations from the Park City Math Institute, the Erwin Schrödinger Institute in
Vienna, as well as from the Ravello Summer School, where he delivered mini-
courses in the summers of 2013, 2014 and 2015, respectively, at which some
of the presentation was honed. Of particular note is the support of Tommaso
Ruggeri (Università di Bologna) for both the Cortona and Ravello summer
schools. We thank Greg Galloway for reading Chapters 3 and 4 and offering
some helpful feedback. JC thanks former student Kevin Manogue (Lafayette
College) for feedback on Chapters 1 and 2, David Maxwell (University of Alaska,
Fairbanks) for discussions on the conformal method, Farhan Abedin (Lafayette
College) for reading parts of several chapters and offering critical feedback that led
to a reorganization of Chapters 5–7, and finally John D. Norton (University
of Pittsburgh) for several enlightening email exchanges on the foundations of
general relativity. In addition to lecturing in Cortona, Mauro Carfora read several
chapters in detail and offered critical advice from a physics perspective; in
addition, his beautiful sketch of the palace at which the school was held adorns
this volume. A huge thank you goes out to the editor Silvio Levy not only for
his advice and encouragement, but for his calm patience while this project took
longer than anticipated.
This book is dedicated to our friend and colleague Sergio Dain, who passed
away in February 2016 at the age of 46. Sergio was an inspiration — through
his work and his talks, he shared his deep insights into mathematical relativity
and inspired you to be a better mathematician, while through his friendly and
generous personality, interacting with him inspired you to be a better person. We
lack the words to express how much he is missed.
Notation and conventions
We will often indicate conventions when they appear in the text (sometimes
repeatedly), but we will mention a few here, just to get started.
While we generally use the term smooth to mean $C^\infty$ (partly for definiteness),
we note that often it will be obvious that a certain $C^k$-smoothness level is
sufficiently smooth for the context under consideration. Subset notation $A \subset B$
also allows for $A = B$. Vectors will be denoted in various ways; standard basis
vectors in coordinates $x^i$ will often be written as partial derivative operators
$\partial/\partial x^i$, so that a vector $V$ can be written as a linear combination $V = V^i\,\partial/\partial x^i$.
Here we have used the Einstein summation convention of summing over repeated
upper and lower indices. While this convention will be in force unless otherwise
noted, we will repeat it on occasion for the sake of clarity.
The term manifold will generally refer to a smooth manifold without boundary.
A closed manifold will refer to a compact manifold (again, without boundary).
While we assume the standard topological conditions that manifolds are Hausdorff
and second countable, we are ambivalent about whether to restrict to connected
manifolds: many results will not require connectedness, and for certain results
that do, it is rather obvious that a statement as written would only hold on
each component separately. We will try to point out where connectedness is
assumed, but we trust the reader can discern if we have missed such an instance.
A submanifold of codimension one is a hypersurface, which will generally be
taken to be smoothly embedded, though we will try to point out when we allow
it to be immersed, or weaken the regularity assumption (as in Chapter 3).
We will work with semi-Riemannian (also called pseudo-Riemannian) metrics
on $M$, mostly Lorentzian or Riemannian; our signature for Lorentzian metrics
is $(-, +, +, \ldots, +)$. When the spacetime is the focus, it may be given as
a Lorentzian manifold $(M, g)$, whereas at some point, the focus in the book
will shift primarily to Riemannian manifolds, often construed as Riemannian
hypersurfaces in a spacetime, so that the Riemannian manifold might then be
given as $(M, g)$, and the corresponding spacetime (if referenced) by $(S, \bar{g})$, for
example. Pay close attention to this, and also to the dimension of the spacetime.
This will be made clear in each situation, but just keep it in mind when cross-
referencing formulae across chapters and sections.
When dealing with tensors, we sometimes just need the value of the tensor at
a point, and sometimes we are referring to a tensor field; this will not always be
explicitly stated, but should be clear in context. If a formula refers to derivatives
of the tensor field, we will assume, unless stated otherwise, that the tensor field
is smooth, or, at least smooth enough to do the indicated computations. For
example, “consider a one-form θ ” might really mean “consider a smooth one-
form field θ”. At various points we will consider fields that have less regularity
(e.g., Sobolev spaces of tensor fields), and that will be made clear when needed;
in particular, we will be more deliberate about emphasizing the regularity when
it comes to the fore starting with the PDE discussion in Chapter 6.
Recall that a connection on the tangent bundle $TM$ (an affine connection)
assigns to vector fields $X$ and $Y$ a vector field $\nabla_X Y$, which is $C^\infty(M)$-linear
(and hence tensorial) in $X$ and $\mathbb{R}$-linear in $Y$, and satisfies the product rule
$\nabla_X(fY) = (\nabla_X f)Y + f\,\nabla_X Y$ for $f \in C^\infty(M)$, where $\nabla_X f = X[f]$ is the
directional derivative of $f$; the value of $(\nabla_X Y)|_p$ depends only on $X|_p$ and the
values of $Y$ along a curve tangent to $X|_p$. One can extend the connection to
tensor fields $T$, defining $\nabla_X T$ by applying a product rule; e.g., if $T$ is a one-form,
$\nabla_X(T(Y)) = (\nabla_X T)(Y) + T(\nabla_X Y)$. In general, $\nabla_X T$ is a tensor of the same rank
as $T$, and it follows easily from the definition that $\nabla_X T$ is tensorial in $X$. Hence
we can construe $\nabla T$ as a tensor with rank higher by one: if $T$ is an $(r, s)$-tensor,
producing a scalar from a tuple of $r$ one-forms and $s$ vectors, then $\nabla T$ is an
$(r, s+1)$-tensor. On a semi-Riemannian $(M, g)$, there is a unique connection,
called the Levi-Civita connection and denoted by $\nabla$ (among other notations you
might see in the text), which is torsion-free ($\nabla_X Y - \nabla_Y X = [X, Y]$) and satisfies
$\nabla g = 0$; this will generally be the connection employed unless stated otherwise.
A metric $g$ will often be written in bracket notation: $g(X, Y) = \langle X, Y \rangle$.
In coordinates, $g$ is given by a symmetric matrix of components $g_{ij}$, so that
locally $g = g_{ij}\,dx^i \otimes dx^j = g_{ij}\,dx^i\,dx^j$, where for one-forms $\theta$ and $\eta$ we define
$\theta\eta = \frac{1}{2}(\theta \otimes \eta + \eta \otimes \theta)$ (whereas the wedge product is given by $\theta \wedge \eta = \theta \otimes \eta - \eta \otimes \theta$).
Thus the Euclidean metric $g_{\mathbb{E}^n}$ on $\mathbb{R}^n$, for which the component functions $x^i$
are Cartesian coordinates, is then expressed as $g_{\mathbb{E}^n} = \delta_{ij}\,dx^i\,dx^j$, for example.
The nondegeneracy of $g$ corresponds in components to the invertibility of the
matrix $(g_{ij})$, and we write $(g^{ij}) = (g_{ij})^{-1}$, i.e., $g^{ij} g_{jk} = \delta^i_k$. There is a natural
volume measure $dv_g$ associated to $g$, which in local coordinates takes the form
$dv_g = \sqrt{|\det(g_{ij})|}\,dx$, where $dx$ is the Euclidean (Lebesgue) volume measure in
the coordinates. We sometimes let $\det g = \det(g_{ij})$ for abbreviation, and we often let $d\sigma$ or $d\sigma_g$ be
the volume measure induced on a semi-Riemannian submanifold.
Since at each point on $M$ the metric $g$ is nondegenerate, it can be used to
change the tensor type, e.g., a vector $X$ is associated to a dual form $X^\flat$ by
$g(X, Y) = X^\flat(Y)$, and likewise a one-form $\alpha$ can be associated to its vector dual
$\alpha^\sharp$ by $g(\alpha^\sharp, Y) = \alpha(Y)$. It is easy to check in a basis $v_j$ for $T_pM$ with dual basis
$\theta^i$ for $T_p^*M$ (so $\theta^i(v_j) = \delta^i_j$) that if $X = X^j v_j$ then $X^\flat = X_i \theta^i$ with $X_i = g_{ij} X^j$,
where $g_{ij} = g(v_i, v_j)$; similarly, if $\alpha = \alpha_i \theta^i$, then $\alpha^\sharp = \alpha^j v_j$ with $\alpha^j = g^{ij} \alpha_i$.
This kind of operation, known as raising and lowering of indices from the way
the notation is arranged, can be performed on more general tensors $T$, with the
positions of the indices generally indicating tensor type in lieu of the musical $\sharp$
and $\flat$ notation.
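As a small numerical illustration of raising and lowering indices, the following sketch applies the component formulas $X_i = g_{ij} X^j$ and $\alpha^j = g^{ij} \alpha_i$ with the matrix diag(−1, 1, 1, 1) playing the role of $g$. The helper names `lower` and `raise_index` and the sample components are hypothetical, chosen only for illustration.

```python
# Sketch: lowering and raising indices with a diagonal metric.
# The helper names and the sample vector are illustrative assumptions.

def lower(g, X):
    """Return the components X_i = g_ij X^j for a metric matrix g (list of rows)."""
    n = len(X)
    return [sum(g[i][j] * X[j] for j in range(n)) for i in range(n)]

def raise_index(g_inv, alpha):
    """Return the components alpha^j = g^ij alpha_i for the inverse matrix g_inv."""
    n = len(alpha)
    return [sum(g_inv[i][j] * alpha[i] for i in range(n)) for j in range(n)]

# Lorentzian signature (-, +, +, +); this diagonal matrix is its own inverse.
eta = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
eta_inv = eta

X = [2, 1, 0, 3]                        # components X^j
X_flat = lower(eta, X)                  # components X_i
X_back = raise_index(eta_inv, X_flat)   # raising again recovers X^j
g_XX = sum(X_flat[i] * X[i] for i in range(4))  # g(X, X) = X_i X^i
```

Lowering only flips the sign of the time component here, and raising the result returns the original components, which is the consistency of the notation in the simplest case.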
We remark on the consistency of the raising/lowering notation: if $T$ is a
$(0, 2)$-tensor with components $T_{ij}$, then $T^{ij} = g^{ik} g^{j\ell} T_{k\ell}$ give the components of
the tensor obtained by type-changing using $g$, so that if $T = g$, then in fact we see
$T^{ij} = g^{ij}$ (the components of the inverse matrix). Furthermore, we can extend $g$
as a bilinear form on more general tensors, defining $\langle S, T \rangle$ to be an appropriate
metric contraction of $S \otimes T$; e.g., if $S$ and $T$ are $(1, 2)$-tensors, then in a local
basis $\langle S, T \rangle = g_{i\ell}\, g^{js} g^{km} S^i_{jk} T^\ell_{sm}$. We may write this in various ways, depending
tensor, while the components of the corresponding $(0, 4)$-tensor are given by
$R_{ijk\ell} = g_{\ell m} R^m_{ijk} = \big\langle R\big(\tfrac{\partial}{\partial x^i}, \tfrac{\partial}{\partial x^j}, \tfrac{\partial}{\partial x^k}\big), \tfrac{\partial}{\partial x^\ell} \big\rangle$. Different books use different con-
The scalar curvature is the metric trace of the Ricci tensor, and is given in
components by $R(g) = g^{ij} R_{ij}$.
A comma is used to denote a partial derivative, whereas a semicolon is used
to denote components of the covariant derivative of a tensor. For example, with
$T_{ijk} = g_{km} T^m_{ij}$, we have $(\nabla T)_{ijk\ell} = T_{ijk;\ell} = (g_{km} T^m_{ij})_{;\ell} = g_{km} T^m_{ij;\ell}$, since $\nabla g = 0$.
While the covariant derivative of a function $f$ is naturally a one-form $df$, i.e.,
$\nabla f(X) = \nabla_X f = X[f] = df(X)$, sometimes $\nabla f$ is instead taken to be the vector
$(df)^\sharp = \operatorname{grad}_g f$ dual to $df$, i.e., the gradient of $f$ with respect to the metric $g$,
so that $df(X) = g(X, \operatorname{grad}_g f)$; the meaning should be clear in context.
The Christoffel symbols $\Gamma^k_{ij}$ for a coordinate frame are defined by
$$\nabla_{\frac{\partial}{\partial x^i}} \frac{\partial}{\partial x^j} = \Gamma^k_{ij}\, \frac{\partial}{\partial x^k},$$
and can be computed in terms of the metric as $\Gamma^k_{ij} = \frac{1}{2} g^{km} (g_{mj,i} + g_{im,j} - g_{ij,m})$.
$$\Delta_g u = \operatorname{tr}_g(\operatorname{Hess}_g u) = g^{ij} u_{;ij}.$$
In some texts, the term Laplacian is reserved for the case (M, g) is Riemannian,
and may be defined as the negative of our definition. When (M, g) is Lorentzian,
the trace of the Hessian is often called (again, up to a sign) the wave operator $\Box_g$.
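To see the formula for the Christoffel symbols in terms of the metric in action, here is a minimal numerical sketch. It assumes a concrete example not taken from the text, the Euclidean plane in polar coordinates $(r, \theta)$ with $g = \mathrm{diag}(1, r^2)$, and it approximates the partial derivatives $g_{ij,m}$ by central differences; all function names are hypothetical.

```python
# Sketch: Christoffel symbols from the metric by central finite differences.
# Assumed example: Euclidean plane in polar coordinates (r, theta), g = diag(1, r^2).

def metric(x):
    """Components g_ij at the point x = (r, theta)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix, giving the components g^ij."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

def d_metric(x, m, h=1e-5):
    """Central-difference approximation of g_{ij,m} at the point x."""
    xp, xm = list(x), list(x)
    xp[m] += h
    xm[m] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def christoffel(x):
    """Return G with G[k][i][j] ~ (1/2) g^{km} (g_{mj,i} + g_{im,j} - g_{ij,m})."""
    g_inv = inverse_2x2(metric(x))
    dg = [d_metric(x, m) for m in range(2)]  # dg[m][i][j] approximates g_{ij,m}
    return [[[0.5 * sum(g_inv[k][m] * (dg[i][m][j] + dg[j][i][m] - dg[m][i][j])
                        for m in range(2))
              for j in range(2)] for i in range(2)] for k in range(2)]

G = christoffel([2.0, 0.3])
```

With index 0 for $r$ and 1 for $\theta$, the entries `G[0][1][1]` and `G[1][0][1]` should approximate the familiar polar-coordinate values $-r$ and $1/r$ at $r = 2$.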
Chapter 1. Special relativity and Minkowski spacetime
1.1. Lorentz transformations
$$\tilde{t} = t, \quad \tilde{\mathbf{x}} = \mathbf{x} - \mathbf{v} t.$$
If we also arrange the relative velocity to lie along the x-axis, then the transformation
becomes, with $\mathbf{v} = v\,\partial/\partial x$,
$$\tilde{t} = t, \quad \tilde{x} = x - vt, \quad \tilde{y} = y, \quad \tilde{z} = z. \tag{1.1.1}$$
$$x = \tilde{x} + vt = \hat{x} + wt + vt = \hat{x} + (v + w)t. \tag{1.1.2}$$
and $\tilde{x}''(t) = x''(t)$. Hence the acceleration $a(t) = x''(t)$ of the path $\gamma$ is the same
as measured in either frame.
Suppose $\gamma$ is the path of an object of mass $m$, an observer-independent quantity
that we take to be independent of $t$, so that the momentum $p(t) = m x'(t)$ in $O$
differs from $\tilde{p}(t) = m \tilde{x}'(t)$ in $\widetilde{O}$ by a constant. Newton’s second law of motion
states that the net force F on the object due to physical interactions equals the
time rate of change of its momentum, as measured in an inertial frame, which
becomes the familiar F = ma; to obtain an analogous equation in a non-inertial
frame, a fictitious force must be added to balance the frame acceleration. It
seems reasonable that the net interaction forces should be observer-independent,
as would be the case when the force between objects is a function of their relative
separation and relative velocity. Therefore Newton’s second law of motion holds
in all inertial frames if it holds in one inertial frame.
Einstein’s foundational 1905 paper “On the electrodynamics of moving bodies”
[82] emphasizes how the incompatibility of electromagnetism and the Galilean
transformations led to the reformulation of mechanics. There are of course
very fundamental issues in interpreting electromagnetism. Nineteenth-century
experiments revealed that a magnetic field is generated by charges in motion.
The Lorentz force law $\mathbf{F} = q\big(\mathbf{E} + \frac{\mathbf{v}}{c} \times \mathbf{B}\big)$ (written in Gaussian or cgs units,
with $c$ the speed of light in vacuum) determines the force on a charge $q$ moving
at velocity $\mathbf{v}$ in an electromagnetic field. For example, consider a charge $q$
which moves across the field lines of a stationary magnet. The moving charge
experiences a magnetic force from the Lorentz force law. On the other hand, if
we switch to a frame moving with the charge, then the charge q to which we
apply the Lorentz force law is stationary and the magnet is moving. As such,
the charge experiences a force from an electric field induced (Faraday’s law) by
the changing magnetic field moving past q. In the end, of course, the physical
predictions are the same in each case; it is only the interpretation that differs.
Einstein looked for a fundamental explanation of this in terms of relativity, that
the laws of electromagnetism should have the same form in all inertial frames.
A consequence of Maxwell’s equations for electromagnetism is that light
travels according to a wave equation, the speed of which can be determined. Of
course, this raises the question: the speed relative to what? And what would be
the medium capable of transmitting electromagnetic disturbances at such a great
speed, while seeming transparent to the motion of the earth through it? Attempts
such as the Michelson–Morley experiment in the late nineteenth century failed to
find the medium, a preferred reference frame (which was called the ether frame),
with respect to which light in vacuum travels at speed $c$, roughly $3 \times 10^8$ meters
per second. Under the Galilean transformations, inertial observers in relative
motion with respect to the ether frame would have different measurements of the
value of the speed of light. That this was not observed in experiments caused
quite a quandary. Classical results on stellar aberration along with the Fizeau
experiment supplied evidence against an ether drag theory that the ether moves
along with massive bodies. Other experiments ruled out theories that were
consistent with the null result of Michelson–Morley, for example the Lorentz
contraction hypothesis, which on its own cannot account for the result of the
Kennedy–Thorndike experiment; see [96; 137; 188], for instance, for more details
on these experiments. The ether theory embraced the notion that the principle of
relativity did not apply to electromagnetism, in the sense that the ether frame is
a preferred frame of reference for the theory. That there were problems with this
led Einstein to postulate that relativity does apply to electromagnetism. Thus,
although Newtonian dynamics works well with the Galilean transformations, for
relativity to apply to electromagnetism, the Galilean transformations required
modification.
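For some foreshadowing, consider frames of reference $O$ and $\widetilde{O}$ with constant
relative velocity, with respective coordinates related as in (1.1.1). Suppose a
function $\psi : \mathbb{R}^4 \to \mathbb{R}$ satisfies the wave equation (with wave speed $c$) in $O$, in
the sense that
$$\frac{1}{c^2} \frac{\partial^2 \psi}{\partial t^2} = \frac{\partial^2 \psi}{\partial x^2} + \frac{\partial^2 \psi}{\partial y^2} + \frac{\partial^2 \psi}{\partial z^2}.$$
Then since
$$\frac{\partial}{\partial t} = \frac{\partial \tilde{t}}{\partial t} \frac{\partial}{\partial \tilde{t}} + \frac{\partial \tilde{x}}{\partial t} \frac{\partial}{\partial \tilde{x}} = \frac{\partial}{\partial \tilde{t}} - v \frac{\partial}{\partial \tilde{x}}$$
and
$$\frac{\partial}{\partial x} = \frac{\partial \tilde{t}}{\partial x} \frac{\partial}{\partial \tilde{t}} + \frac{\partial \tilde{x}}{\partial x} \frac{\partial}{\partial \tilde{x}} = \frac{\partial}{\partial \tilde{x}},$$
we see that $\Psi(\tilde{t}, \tilde{\mathbf{x}}) := \psi(t, \mathbf{x})$ satisfies
$$\frac{1}{c^2} \Big( \frac{\partial}{\partial \tilde{t}} - v \frac{\partial}{\partial \tilde{x}} \Big)^2 (\Psi) = \frac{\partial^2 \Psi}{\partial \tilde{x}^2} + \frac{\partial^2 \Psi}{\partial \tilde{y}^2} + \frac{\partial^2 \Psi}{\partial \tilde{z}^2}.$$
This can be rewritten as (assuming $\psi$ is $C^2$)
$$\frac{1}{c^2} \frac{\partial^2 \Psi}{\partial \tilde{t}^2} - 2 \frac{v}{c^2} \frac{\partial^2 \Psi}{\partial \tilde{t}\, \partial \tilde{x}} = \Big( 1 - \frac{v^2}{c^2} \Big) \frac{\partial^2 \Psi}{\partial \tilde{x}^2} + \frac{\partial^2 \Psi}{\partial \tilde{y}^2} + \frac{\partial^2 \Psi}{\partial \tilde{z}^2}.$$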
We see that the wave equation is not invariant under Galilean transformations.
This is perfectly reasonable for mechanical waves, but the homogeneous wave
equation governing light propagation in vacuum is a consequence of Maxwell’s
equations, and we have seen that experiments indicate that light travels at the
same speed in vacuum for all inertial observers.
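The failure of invariance can be checked numerically on a single example. The sketch below assumes a one-dimensional traveling wave $\psi(t, x) = \sin(x - ct)$ with the illustrative values $c = 1$ and $v = 0.5$ (neither the test function nor the numbers come from the text), expresses it in the Galilean coordinates of (1.1.1), and uses central differences to evaluate both the standard wave equation and the transformed equation displayed above; all names are hypothetical.

```python
import math

# Sketch: a 1D traveling wave psi(t, x) = sin(x - c t), viewed in Galilean
# coordinates t~ = t, x~ = x - v t.  The values c = 1, v = 0.5 are illustrative.
c, v = 1.0, 0.5

def psi(t, x):
    return math.sin(x - c * t)

def Psi(tt, xt):
    """psi expressed in the Galilean coordinates: t = t~, x = x~ + v t~."""
    return psi(tt, xt + v * tt)

def second_derivatives(F, t, x, h=1e-3):
    """Central-difference approximations (F_tt, F_tx, F_xx) at (t, x)."""
    F_tt = (F(t + h, x) - 2 * F(t, x) + F(t - h, x)) / h**2
    F_xx = (F(t, x + h) - 2 * F(t, x) + F(t, x - h)) / h**2
    F_tx = (F(t + h, x + h) - F(t + h, x - h)
            - F(t - h, x + h) + F(t - h, x - h)) / (4 * h**2)
    return F_tt, F_tx, F_xx

def residuals(t, x):
    """Residuals of the standard and of the transformed wave equation for Psi."""
    F_tt, F_tx, F_xx = second_derivatives(Psi, t, x)
    standard = F_tt / c**2 - F_xx
    transformed = F_tt / c**2 - 2 * (v / c**2) * F_tx - (1 - v**2 / c**2) * F_xx
    return standard, transformed

standard_res, transformed_res = residuals(0.3, 0.7)
```

At the sample point the transformed residual should vanish up to discretization error, while the standard residual does not, reflecting that in the moving frame this wave travels at speed $c - v$ rather than $c$.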
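However, it is not too hard to play around with the transformation so as to
coax the preceding equation into the standard wave equation form. Namely, let
$$\tilde{t} = \frac{1}{\sqrt{1 - (v/c)^2}} \Big( t - \frac{v}{c^2} x \Big), \tag{1.1.3}$$
$$\tilde{x} = \frac{1}{\sqrt{1 - (v/c)^2}} (x - vt) \tag{1.1.4}$$
replace the Galilean coordinate change. Then, as we did above, we obtain
$$\frac{\partial}{\partial t} = \frac{\partial \tilde{t}}{\partial t} \frac{\partial}{\partial \tilde{t}} + \frac{\partial \tilde{x}}{\partial t} \frac{\partial}{\partial \tilde{x}} = \frac{1}{\sqrt{1 - (v/c)^2}} \Big( \frac{\partial}{\partial \tilde{t}} - v \frac{\partial}{\partial \tilde{x}} \Big),$$
$$\frac{\partial}{\partial x} = \frac{\partial \tilde{t}}{\partial x} \frac{\partial}{\partial \tilde{t}} + \frac{\partial \tilde{x}}{\partial x} \frac{\partial}{\partial \tilde{x}} = \frac{1}{\sqrt{1 - (v/c)^2}} \Big( \frac{\partial}{\partial \tilde{x}} - \frac{v}{c^2} \frac{\partial}{\partial \tilde{t}} \Big).$$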
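Thus we find
$$\frac{1}{c^2} \frac{\partial^2}{\partial t^2} = \frac{1}{c^2} \cdot \frac{1}{1 - (v/c)^2} \Big( \frac{\partial^2}{\partial \tilde{t}^2} - 2v \frac{\partial^2}{\partial \tilde{t}\, \partial \tilde{x}} + v^2 \frac{\partial^2}{\partial \tilde{x}^2} \Big),$$
$$\frac{\partial^2}{\partial x^2} = \frac{1}{1 - (v/c)^2} \Big( \frac{\partial^2}{\partial \tilde{x}^2} - 2 \frac{v}{c^2} \frac{\partial^2}{\partial \tilde{t}\, \partial \tilde{x}} + \frac{1}{c^2} \Big( \frac{v}{c} \Big)^2 \frac{\partial^2}{\partial \tilde{t}^2} \Big),$$
and so
$$\frac{1}{c^2} \frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} = \frac{1}{c^2} \frac{\partial^2}{\partial \tilde{t}^2} - \frac{\partial^2}{\partial \tilde{x}^2}.$$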
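A related numerical sanity check, offered as a sketch rather than as part of the derivation: the coordinate change (1.1.3)–(1.1.4) leaves the quadratic form $-c^2 t^2 + x^2$ unchanged and sends a light ray $x = ct$ to $\tilde{x} = c\tilde{t}$. The units with $c = 1$, the sample values, and the function names below are assumptions made for illustration.

```python
import math

# Sketch: the boost (1.1.3)-(1.1.4) in one space dimension.
# Units with c = 1 and the sample values below are illustrative assumptions.
c = 1.0

def boost(t, x, v):
    """Return (t~, x~) from (1.1.3)-(1.1.4) for relative speed |v| < c."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t - v * x / c**2), gamma * (x - v * t)

def interval(t, x):
    """The quadratic form -c^2 t^2 + x^2."""
    return -(c * t) ** 2 + x**2

t, x, v = 2.0, 0.5, 0.6
tt, xt = boost(t, x, v)

# A point on the light ray x = c t, seen in the new coordinates.
light_t, light_x = boost(1.0, 1.0, v)
```

Comparing `interval(t, x)` with `interval(tt, xt)` shows agreement up to rounding, and `light_x` equals `c * light_t`, so the light ray still has speed $c$ in the new coordinates. Under the Galilean change (1.1.1) the same ray would instead satisfy $\tilde{x} = (c - v)\tilde{t}$.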
We have found a coordinate change for which the wave equation is preserved,
but at what cost? Especially disturbing is that $\tilde{t}$ depends not only on $t$, but
on $x$ and $v$ as well! In fact, we will derive these equations by applying a
few simple fundamental principles, as Einstein did. Namely, we combine the
principle of relativity (the Galilean/Newtonian principle for mechanics, extended
by Einstein to encompass electromagnetism), that physical laws should have
the same form in all inertial frames, along with Einstein’s postulate that the
speed of light in a vacuum is a physical law, and thus must be the same for all
inertial observers. This immediately obviates the need for an ether, and gives
rise to striking predictions that are consistent with experimental results. For
example, Lorentz contraction can account for the result of Michelson–Morley,
and together with time dilation can explain the result of Kennedy–Thorndike;
we will derive both of these predictions in Section 1.2.3. (For a derivation of the
Lorentz transformation based directly on experimental results, see [192].)
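The contrast between the Galilean and the Lorentz coordinate changes can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes units with $c = 1$, an illustrative relative speed $v = 0.6$, and a sample solution $\psi(t, x) = \sin(x - ct)$, and it estimates $\frac{1}{c^2}\Psi_{\tilde t\tilde t} - \Psi_{\tilde x\tilde x}$ by central differences. All function names are hypothetical.

```python
import math

C = 1.0   # wave speed (illustrative choice of units, an assumption)
V = 0.6   # relative velocity with |V| < C (illustrative)

def psi(t, x):
    """A solution of the 1+1 wave equation in frame O: a right-moving wave."""
    return math.sin(x - C * t)

def galilean_inverse(tt, xt):
    """(t~, x~) -> (t, x) for the Galilean change t~ = t, x~ = x - V t."""
    return tt, xt + V * tt

def lorentz_inverse(tt, xt):
    """(t~, x~) -> (t, x), the inverse of (1.1.3)-(1.1.4)."""
    g = 1.0 / math.sqrt(1.0 - (V / C) ** 2)
    return g * (tt + V * xt / C**2), g * (xt + V * tt)

def wave_residual(to_unprimed, tt, xt, h=1e-3):
    """Central-difference estimate of (1/c^2) Psi_tt - Psi_xx at (t~, x~)."""
    Psi = lambda a, b: psi(*to_unprimed(a, b))
    d2t = (Psi(tt + h, xt) - 2 * Psi(tt, xt) + Psi(tt - h, xt)) / h**2
    d2x = (Psi(tt, xt + h) - 2 * Psi(tt, xt) + Psi(tt, xt - h)) / h**2
    return d2t / C**2 - d2x

galilean_res = wave_residual(galilean_inverse, 0.3, 0.7)  # clearly nonzero
lorentz_res = wave_residual(lorentz_inverse, 0.3, 0.7)    # zero up to discretization error
```

In the Galilean coordinates the residual is of order one, while in the coordinates (1.1.3)-(1.1.4) it vanishes up to the finite-difference error, consistent with the computation above.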
\[ 0 = \Delta s^2 = \tilde\eta_{00}(\Delta\tilde x^0)^2 + \tilde\eta_{ij}\,\Delta\tilde x^i\,\Delta\tilde x^j = \tilde\eta_{00}\bigl((\Delta\tilde x^0)^2 - |\Delta\tilde x|^2\bigr) + \sum_{i\neq j}\tilde\eta_{ij}\,\Delta\tilde x^i\,\Delta\tilde x^j, \]
so that $0 = \sum_{i\neq j}\tilde\eta_{ij}\,\Delta\tilde x^i\,\Delta\tilde x^j$. By choosing $\Delta\tilde x^1 = 1 = \Delta\tilde x^2$ and $\Delta\tilde x^3 = 0$, and
$\Delta\tilde x^0 = |\Delta\tilde x|$, we infer that $\tilde\eta_{12} = 0$, and likewise that $\tilde\eta_{ij} = 0$ for $i \neq j$. We thus
conclude in general that $\Delta\tilde s^2 = -\tilde\eta_{00}\,\Delta s^2$. In other words, that lightcones must
be preserved implies that the spacetime intervals $\Delta s^2$ in two inertial coordinate
systems must agree up to a multiplicative constant. By the principle of relativity
this multiplicative constant must equal $\pm 1$ (as it must be the same in going from
frame $O$ to $\tilde O$ or going in reverse: only the relative motion should matter). On the
other hand, we know that certain non-vanishing spacetime intervals are preserved
from one observer to another (such as simultaneous events orthogonal to the
direction of motion, as discussed above), so that we must have $-\tilde\eta_{00} = 1$. We see
that $\Delta s^2 = \Delta\tilde s^2$, and $\tilde\eta_{\mu\nu} = \eta_{\mu\nu}$: the coordinate change is a linear transformation
that preserves $\eta = -(dx^0)^2 + \sum_{i=1}^{3}(dx^i)^2 = -c^2\,dt^2 + dx^2 + dy^2 + dz^2$. Such
linear maps are Lorentz transformations, and we want to get our hands on them
by writing some down explicitly.
We can set up axes with the direction of relative motion along the $x$-axis in
each coordinate system, oriented so that a photon path along the $x$-axis (fixed $y$
and $z$) given by $x = ct$ corresponds to $\tilde x = c\tilde t$. Based on the assumption that there
is no preferred direction orthogonal to the direction of motion, and recalling the
above argument regarding such orthogonal spatial displacements, it follows that
we can arrange $\tilde y = y$ and $\tilde z = z$. Similar symmetry considerations show that
the coordinates $\tilde t$ and $\tilde x$ of an event should be independent of $y$ and $z$, and thus
depend on $t$ and $x$ only. Arguing from the principle of relativity, if $\tilde O$ is moving
with velocity $\mathbf{v} = v\,\partial/\partial x$ with respect to $O$, then $O$ is moving with velocity $-\mathbf{v}$
with respect to $\tilde O$ (which can also be readily seen by letting $x = 0$ and using
(1.1.5) below). As the transformation between coordinates should depend only
on the relative velocity, if $T_v$ maps the coordinates of $\tilde O$ to those of $O$, then we
should have $T_v^{-1} = T_{-v}$ (compare Einstein's derivation in [82]).
We have reduced the problem to a two-dimensional one, with relative motion
in the $x$-direction, and with the change of coordinates giving a map between the
$(\tilde t, \tilde x)$ and $(t, x)$ coordinates of events in spacetime. Moreover, a vector along the
lightcone in the $(\tilde t, \tilde x)$-plane should map to the lightcone in the $(t, x)$-plane. We
let the reduced two-dimensional mapping with $\mathbf{v} = v\,\partial/\partial x$ be $T_v$. Using standard
bases we represent the linear transformation by a matrix
$[T_v] = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$, so that
\[ \begin{pmatrix} t \\ x \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\begin{pmatrix} \tilde t \\ \tilde x \end{pmatrix}. \]
From here, we could use that this reduced map must preserve η (a corollary of
the fact shown above that the full mapping preserves η) to get relations between
the coefficients; see [188], for instance, and compare [84, Appendix 1]. Instead
we will follow [38], proceeding to highlight the preservation of lightcones, and
in the process showing directly that η is preserved in spacetime dimension two.
Now $\tilde x = 0$ is the path of the observer moving with respect to $O$ with velocity
$v$, and is given by $t = a_{11}\tilde t$, $x = a_{21}\tilde t$, so that $v = a_{21}/a_{11}$. Now we apply the
invariance of the lightcone: the path $x = \pm ct$ should map to $\tilde x = \pm c\tilde t$ (with
respective signs corresponding by orientation), which implies that the vectors
$\begin{pmatrix} 1 \\ c \end{pmatrix}$ and $\begin{pmatrix} 1 \\ -c \end{pmatrix}$ are eigenvectors of $T_v$; by orientation again, the eigenvalues should
be positive. This implies there are respective eigenvalues $\lambda_\pm > 0$ such that
\[ a_{11} \pm c\,a_{12} = \lambda_\pm, \qquad a_{21} \pm c\,a_{22} = \pm c\,\lambda_\pm. \]
Thus $c\,a_{11} + c^2 a_{12} = a_{21} + c\,a_{22}$, and $c\,a_{11} - c^2 a_{12} = -a_{21} + c\,a_{22}$. Adding and
subtracting these two equations yields $a_{11} = a_{22}$ and $a_{21} = c^2 a_{12}$, from which we
deduce $v/c^2 = a_{12}/a_{11}$. We then see the matrix for $T_v$ has the following form,
with $\alpha(v) := a_{11}$:
\[ [T_v] = \begin{pmatrix} a_{11} & a_{12} \\ c^2 a_{12} & a_{11} \end{pmatrix} = \alpha(v)\begin{pmatrix} 1 & v/c^2 \\ v & 1 \end{pmatrix}. \tag{1.1.5} \]
Now $0 < \lambda_+\lambda_- = \det[T_v] = (\alpha(v))^2\bigl(1 - (v/c)^2\bigr)$, so $|v| < c$: the relative speed
of two inertial observers is less than that of light. We also see $\alpha(v) > 0$, because
$2\alpha(v) = \operatorname{tr}[T_v] = \lambda_+ + \lambda_- > 0$.
We now apply a symmetry argument similar to one we used earlier. Start with
adapted coordinates $(t, x)$ and $(\tilde t, \tilde x)$ as above for two inertial frames of reference
$O$ and $\tilde O$, respectively. We also consider adapted coordinates $(t', x') = (t, -x)$
and $(\hat t, \hat x) = (\tilde t, -\tilde x)$ related to the original coordinates by the parity operator
$P(\tau, \xi) = (\tau, -\xi)$. The axes in these coordinates are consistently aligned, and
frame $\tilde O$ moves with velocity $\mathbf{v} = v\,\partial/\partial x = -v\,\partial/\partial x'$ with respect to $O$. So
the transformation from $(\hat t, \hat x)$ to $(t', x')$, which like $T_v$ represents the identity
map on spacetime, should be given by $T_{-v}$ (consistent with isotropy and the
principle of relativity). Thus if $[T_{-v}]$ is the matrix representing this coordinate
transformation in the respective standard coordinate bases, then with $[P] = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$,
we see $[P][T_{-v}][P] = [T_v]$. From this we conclude $\det[T_{-v}] = \det[T_v]$, but from
$[T_v]^{-1} = [T_{-v}]$, we see $\det[T_v] = \pm 1$, and thus $\det[T_v] = 1$ since we know the
determinant is positive.
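Together with the result of the last paragraph, we get
\[ \alpha(v) = \frac{1}{\sqrt{1-(v/c)^2}}, \]
and we note then that $\alpha(v) = \alpha(-v)$. One could have argued $\alpha(v) = \alpha(-v)$
from the principle of relativity, from which the value of $\alpha(v)$ then follows using
$[T_v]^{-1} = [T_{-v}]$ and (1.1.5). In any case, we have
\[ [T_v] = \frac{1}{\sqrt{1-(v/c)^2}}\begin{pmatrix} 1 & v/c^2 \\ v & 1 \end{pmatrix}, \qquad [T_{-v}] = \frac{1}{\sqrt{1-(v/c)^2}}\begin{pmatrix} 1 & -v/c^2 \\ -v & 1 \end{pmatrix}. \]
Hence we have arrived at the Lorentz transformation
\[ \begin{aligned}
t &= \frac{1}{\sqrt{1-(v/c)^2}}\Bigl(\tilde t + \frac{v}{c^2}\,\tilde x\Bigr), & x &= \frac{1}{\sqrt{1-(v/c)^2}}\,(v\tilde t + \tilde x), \\
\tilde t &= \frac{1}{\sqrt{1-(v/c)^2}}\Bigl(t - \frac{v}{c^2}\,x\Bigr), & \tilde x &= \frac{1}{\sqrt{1-(v/c)^2}}\,(-vt + x).
\end{aligned} \tag{1.1.6} \]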
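Note that if we change the first variable to $x^0 := ct$, so that the coordinates have
the same units, we have
\[ \begin{aligned}
x^0 &= \frac{1}{\sqrt{1-(v/c)^2}}\Bigl(\tilde x^0 + \frac{v}{c}\,\tilde x\Bigr), & x &= \frac{1}{\sqrt{1-(v/c)^2}}\Bigl(\frac{v}{c}\,\tilde x^0 + \tilde x\Bigr), \\
\tilde x^0 &= \frac{1}{\sqrt{1-(v/c)^2}}\Bigl(x^0 - \frac{v}{c}\,x\Bigr), & \tilde x &= \frac{1}{\sqrt{1-(v/c)^2}}\Bigl(-\frac{v}{c}\,x^0 + x\Bigr).
\end{aligned} \]
Let $\beta = v/c$, so that the matrix for $T_v$ relative to the bases $\bigl\{\tfrac{1}{c}\tfrac{\partial}{\partial\tilde t}, \tfrac{\partial}{\partial\tilde x}\bigr\}$ and
$\bigl\{\tfrac{1}{c}\tfrac{\partial}{\partial t}, \tfrac{\partial}{\partial x}\bigr\}$ is then just
\[ [T_v] = \frac{1}{\sqrt{1-\beta^2}}\begin{pmatrix} 1 & \beta \\ \beta & 1 \end{pmatrix}, \quad\text{and likewise}\quad [T_{-v}] = \frac{1}{\sqrt{1-\beta^2}}\begin{pmatrix} 1 & -\beta \\ -\beta & 1 \end{pmatrix}. \]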
which governs the propagation of the electric and magnetic fields in vacuum, is
invariant under Lorentz transformations. Moreover, it is a fundamental fact of
physics that the electromagnetic fields behave in such a way that the laws which
govern electromagnetism, Maxwell’s equations, are invariant under Lorentz
transformations (sometimes referred to as special covariance of the equations).
Thus the laws of electromagnetism take the same form in all inertial frames, in
accordance with the principle of relativity.
While we will not develop the foundations of electromagnetic theory in a
relativistic framework, we should at least indicate how the electromagnetic fields
as measured in two inertial frames compare, i.e., how they transform under a
Lorentz transformation. As we saw earlier, we expect the electric and magnetic
fields to somehow transform together, since charges in motion induce and are
affected by a magnetic field, but motion is relative to the frame of reference.
In fact, the fields do transform together as an anti-symmetric two-tensor, called
the Faraday tensor. In an inertial frame $O$ (and using $x^0 = ct$), where the electric
field has components $E^i$ and the magnetic field $B^i$, the Faraday tensor has a
component matrix
\[ [F^{\mu\nu}] = \begin{pmatrix} 0 & E^1 & E^2 & E^3 \\ -E^1 & 0 & B^3 & -B^2 \\ -E^2 & -B^3 & 0 & B^1 \\ -E^3 & B^2 & -B^1 & 0 \end{pmatrix}. \tag{1.1.7} \]
\[ F^\flat = (E_1\,dx^1 + E_2\,dx^2 + E_3\,dx^3)\wedge dx^0 + (B_1\,dx^2\wedge dx^3 + B_2\,dx^3\wedge dx^1 + B_3\,dx^1\wedge dx^2). \]
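How do the fields transform exactly? For example, let us compute the magnetic
field component $\tilde B^3$ in another inertial frame $\tilde O$ moving along the $x$-axis at
velocity $v$ relative to $O$. The associated Lorentz transformation has matrix
elements $\Lambda^\mu{}_\nu$ given by
\[ [\Lambda^\mu{}_\nu] = \begin{pmatrix}
\dfrac{1}{\sqrt{1-(v/c)^2}} & \dfrac{-v/c}{\sqrt{1-(v/c)^2}} & 0 & 0 \\
\dfrac{-v/c}{\sqrt{1-(v/c)^2}} & \dfrac{1}{\sqrt{1-(v/c)^2}} & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}. \]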
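\[ \begin{aligned}
\tilde B^1 &= B^1, \\
\tilde B^2 &= \frac{1}{\sqrt{1-(v/c)^2}}\Bigl(B^2 + \frac{v}{c}\,E^3\Bigr), \\
\tilde E^2 &= \frac{1}{\sqrt{1-(v/c)^2}}\Bigl(E^2 - \frac{v}{c}\,B^3\Bigr), \\
\tilde E^3 &= \frac{1}{\sqrt{1-(v/c)^2}}\Bigl(E^3 + \frac{v}{c}\,B^2\Bigr).
\end{aligned} \]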
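These transformation rules can be reproduced by applying the boost to the component matrix (1.1.7) through $\tilde F^{\mu\nu} = \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta F^{\alpha\beta}$. The following sketch is illustrative only: the field values, $\beta = v/c = 0.6$, and the helper names are assumptions made for the example.

```python
import math

def faraday(E, B):
    """Component matrix [F^{mu nu}] of (1.1.7) for fields E and B."""
    (E1, E2, E3), (B1, B2, B3) = E, B
    return [[0.0,  E1,  E2,  E3],
            [-E1, 0.0,  B3, -B2],
            [-E2, -B3, 0.0,  B1],
            [-E3,  B2, -B1, 0.0]]

def boost_x(beta):
    """Matrix [Lambda^mu_nu] for motion along the x-axis with v/c = beta."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def transform(L, F):
    """F~^{mu nu} = L^mu_a L^nu_b F^{ab}."""
    return [[sum(L[m][a] * L[n][b] * F[a][b]
                 for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]

def fields(F):
    """Read E and B back off a component matrix laid out as in (1.1.7)."""
    E = (F[0][1], F[0][2], F[0][3])
    B = (F[2][3], F[3][1], F[1][2])
    return E, B

beta = 0.6                                  # illustrative value of v/c
E, B = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)     # illustrative field components
E_t, B_t = fields(transform(boost_x(beta), faraday(E, B)))
```

With these inputs the components along the direction of motion are unchanged, while the transverse components mix $E$ and $B$ exactly as in the displayed formulas.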
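In a region with electromagnetic fields but free of charges and hence current
(for simplicity), Maxwell's equations are as follows (in inertial coordinates, using
spatial divergence and curl):
\[ \operatorname{div} E = 0, \qquad \frac{1}{c}\frac{\partial E}{\partial t} = \operatorname{curl} B, \tag{1.1.8} \]
\[ \operatorname{div} B = 0, \qquad \frac{1}{c}\frac{\partial B}{\partial t} = -\operatorname{curl} E. \tag{1.1.9} \]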
Maxwell's equations in any inertial frame will take the same familiar form,
consistent with the principle of relativity; indeed, one can easily check, using the
above transformation rules for the fields, that the form of Maxwell's equations
in inertial coordinates is preserved by Lorentz transformations.
Having said that, it is instructive to write Maxwell's equations directly in
terms of $F$. Equations (1.1.8) are equivalent to $F^{\mu\nu}{}_{,\nu} = 0$, with Gauss's law
$\operatorname{div} E = 0$ for $\mu = 0$, and with $\frac{1}{c}\frac{\partial E}{\partial t} = \operatorname{curl} B$ corresponding to the remaining
three components (recall that $\frac{\partial}{\partial x^0} = \frac{1}{c}\frac{\partial}{\partial t}$). Of course, $F^{\mu\nu}{}_{,\nu}$ gives the
components of $\operatorname{div}_\eta F$ in inertial coordinates, so that (1.1.8) can be expressed as
$\operatorname{div}_\eta F = 0$. Similarly, expressing $dF^\flat = 0$ in inertial coordinates is equivalent to
(1.1.9).
In the remainder of the chapter, we will use Roman indices such as i, j, k to label
spatial components, whereas Greek indices will continue to be used for space and
time components. The Einstein convention will remain in force, summing over
the relevant index range, as we will remind the reader sporadically throughout.
1.2.1. The Minkowski metric. Consider the matrix $\Lambda = \begin{pmatrix} \cosh\theta & \sinh\theta \\ \sinh\theta & \cosh\theta \end{pmatrix}$. Since
\[ \Lambda^T \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \Lambda = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \]
we again see that each element of the group of Lorentz transformations in one
space dimension preserves the bilinear form $-(dx^0)^2 + dx^2 = -c^2\,dt^2 + dx^2$.
Returning to three spatial dimensions, the relevant bilinear form is
\[ \eta = -(dx^0)^2 + dx^2 + dy^2 + dz^2 = -c^2\,dt^2 + dx^2 + dy^2 + dz^2, \]
which as we have seen is preserved by the (four-dimensional) transformations
between two sets of inertial coordinates (corresponding to rigid reference frames
for two inertial observers with the same origin of spacetime coordinates). The
set of linear maps that preserve the quadratic form $\eta$ (the set of Lorentz
transformations) forms a group, the Lorentz group; the proper Lorentz group is
the component of the identity map, chosen to ensure observers use a consistent
spatiotemporal orientation (and in particular keep the same arrow of time). The
Lorentz group is a subgroup of the Poincaré group of maps which preserve $\eta$,
which also includes translations.
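The matrix identity for $\Lambda$ and the group property in one space dimension can be illustrated numerically. The sketch below is not from the original text; the rapidity values and function names are hypothetical. It checks $\Lambda^T \operatorname{diag}(-1, 1)\,\Lambda = \operatorname{diag}(-1, 1)$ and that composing two such matrices adds the parameters $\theta$, which follows from the addition formulas for $\cosh$ and $\sinh$.

```python
import math

def boost(theta):
    """The matrix Lambda with entries cosh(theta) and sinh(theta)."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return [[ch, sh], [sh, ch]]

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def preserves_eta(L, tol=1e-12):
    """Check L^T diag(-1, 1) L = diag(-1, 1) entrywise, up to tol."""
    eta = [[-1.0, 0.0], [0.0, 1.0]]
    Lt = [[L[j][i] for j in range(2)] for i in range(2)]
    M = matmul(Lt, matmul(eta, L))
    return all(abs(M[i][j] - eta[i][j]) < tol
               for i in range(2) for j in range(2))

a, b = 0.4, 1.1                       # illustrative parameters
composed = matmul(boost(a), boost(b))
direct = boost(a + b)                 # composition adds the parameters
```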
We let $\langle v, w\rangle := \eta(v, w)$. The manifold $\mathbb{R}^4$ together with the semi-Riemannian
metric $\eta$ is four-dimensional Minkowski spacetime $\mathbb{M}^4$, or $\mathbb{R}^4_1$, indicating the
signature of the metric, and similarly for $\mathbb{M}^n = \mathbb{R}^n_1$, for $n \geq 2$.
1.2.2. Causal nature of vectors. (See Figure 1.) Vectors $w = (c\,\Delta t, \Delta x)$ with
$\langle w, w\rangle < 0$ are called timelike, since $|\Delta x| < c\,|\Delta t|$. Then $|v| = |\Delta x|/|\Delta t| < c$.
Such vectors may thus represent the tangents to spacetime paths of material
particles. We note in (1.2.1) below how such vectors represent the spacetime
displacements between pairs of events that in some inertial frame share the same
spatial location. Similarly, null vectors w ̸= 0 satisfy ⟨w, w⟩ = 0; these are
tangent to the lightcone and represent paths of light rays. Finally, vectors with
⟨w, w⟩ > 0 are called spacelike, and represent displacements between pairs of
events which are simultaneous (i.e., share the same time coordinate) in some
inertial frame; see (1.2.2) below. The zero vector is also defined to be spacelike.
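Suppose that $w = (c\,\Delta t, \Delta x, 0, 0)$ is timelike, and let $v = \Delta x/\Delta t$, so that
$|v| < c$. The Lorentz transformation $T_{-v}$ satisfies
\[ [T_{-v}]\begin{pmatrix} c\,\Delta t \\ \Delta x \end{pmatrix} = \frac{1}{\sqrt{1-(v/c)^2}}\begin{pmatrix} 1 & -v/c \\ -v/c & 1 \end{pmatrix}\begin{pmatrix} c\,\Delta t \\ \Delta x \end{pmatrix} = \begin{pmatrix} * \\ 0 \end{pmatrix}. \tag{1.2.1} \]
Similarly if $|\Delta x| > c\,|\Delta t|$, then let $v/c = c\,\Delta t/\Delta x$, so that $|v| < c$. Then
\[ [T_{-v}]\begin{pmatrix} c\,\Delta t \\ \Delta x \end{pmatrix} = \frac{1}{\sqrt{1-(v/c)^2}}\begin{pmatrix} 1 & -v/c \\ -v/c & 1 \end{pmatrix}\begin{pmatrix} c\,\Delta t \\ \Delta x \end{pmatrix} = \begin{pmatrix} 0 \\ * \end{pmatrix}. \tag{1.2.2} \]
Thus in either case, there is a Lorentz transformation which maps the timelike
or spacelike vector to align with a timelike or spacelike axis for an appropriate
observer.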
1.2.2.1. Twin paradox. It turns out that the familiar triangle inequality for vectors
in Euclidean geometry is reversed for timelike vectors in Minkowski spacetime.
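[Figure 1. Spacetime diagram with axes $x$ and $ct$ through the origin $O$, showing the lightcone (null) and the timelike and spacelike regions.]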
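Given a smooth timelike curve $\gamma(\lambda)$ (meaning that $\gamma'(\lambda)$ is timelike for all $\lambda$),
we define the proper time $\Delta\tau$ along a portion of $\gamma$ as
\[ \Delta\tau = c^{-1}\int_{\lambda_0}^{\lambda_1}\sqrt{-\langle\gamma'(\lambda), \gamma'(\lambda)\rangle}\;d\lambda, \]
Thus we see the tangent vector has constant length $\sqrt{-\langle\tilde\gamma'(\tau), \tilde\gamma'(\tau)\rangle} = c$. This
means that the parameter $\tau - \tau(\lambda_0)$ is indeed the proper time elapsed along $\gamma$
from $\lambda_0$ to $\lambda(\tau)$.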
We now consider the reversed triangle inequality. If the displacement vector
$\overrightarrow{OB}$ from $O$ to $B$ is timelike, let
\[ |\overrightarrow{OB}| = \sqrt{-\langle\overrightarrow{OB}, \overrightarrow{OB}\rangle}, \]
which is, up to a factor of $c$, the proper time along a straight-line path from $O$ to
$B$ (the elapsed time measured by the inertial observer passing through $O$ and $B$).
Definition 1-1. A vector is causal if it is timelike or null. A path is causal if
its tangent vector at each point is causal. A causal vector $w$ is future-pointing if
$\langle w, \partial/\partial t\rangle < 0$, and analogously for past-pointing.
Proposition 1-2. If $\overrightarrow{OB}$ is future-pointing and timelike, and $\overrightarrow{OA}$ and $\overrightarrow{AB}$ are
future-pointing and causal, then $|\overrightarrow{OB}| \geq |\overrightarrow{OA}| + |\overrightarrow{AB}|$, with equality only in case
$O$, $A$ and $B$ are collinear.
Proof. By applying a Lorentz transformation as in (1.2.1), we may arrange
that $\overrightarrow{OB}$ has components $(t_B, 0, 0, 0)$, for $t_B > 0$, and $\overrightarrow{OA}$ has components
$(t_A, x_A, 0, 0)$, with $|x_A| \leq c\,t_A$. Then $\overrightarrow{AB}$ has components $(t_B - t_A, -x_A, 0, 0)$,
with $|x_A| \leq c\,(t_B - t_A)$. Therefore,
\[ |\overrightarrow{OA}|^2 = (c\,t_A)^2 - x_A^2 \quad\text{and}\quad |\overrightarrow{AB}|^2 = c^2(t_B - t_A)^2 - x_A^2, \]
so that
\[ |\overrightarrow{OA}| + |\overrightarrow{AB}| = \sqrt{(c\,t_A)^2 - x_A^2} + \sqrt{c^2(t_B - t_A)^2 - x_A^2} \leq c\,t_A + c\,(t_B - t_A) = c\,t_B = |\overrightarrow{OB}|. \]
Similarly the signal sent after $\ell$ years of proper time have elapsed on the outward
journey from $O$ to $A$ will arrive at $x = 0$ at
\[ t = \frac{\sqrt{1 + v/c}}{\sqrt{1 - v/c}}\;\ell. \tag{1.2.3} \]
The coefficient of $\ell$ in (1.2.3) gives the time interval between the reception of
successive signals which were emitted one unit in time apart, and thus it is the
ratio between the frequency of emission of signals, as measured by the emitter,
and the frequency of reception of signals, as measured by the receiver. As such,
it gives a measure of the relativistic Doppler shift. When $v/c = 0.8$, this Doppler
factor equals 3. The numerology works out so that the third signal sent along
the path from $O$ to $A$ occurs at $t = 3/\sqrt{1 - (0.8)^2} = 5$, so three signals are sent
along the outward journey, and the third signal arrives at $x = 0$ at $t = 9$. The twin
at $A$ has sent three signals back to his twin, but has received only one signal. On
the "homeward" journey, this twin will send three more signals (the last just as
the twins are back together again), and will receive a total of nine signals from
the twin with frame $O$, the last upon arrival.
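The numbers quoted for $v/c = 0.8$ follow directly from (1.2.3) and time dilation. A brief sketch, assuming units with $c = 1$ and one signal per unit of the traveller's proper time (the variable names are illustrative):

```python
import math

def doppler_factor(beta):
    """Coefficient of ell in (1.2.3), with beta = v/c."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))

beta = 0.8
gamma = 1.0 / math.sqrt(1.0 - beta**2)
k = doppler_factor(beta)

# Signals emitted at proper times 1, 2, 3 on the outward leg.
emission_t = [n * gamma for n in (1, 2, 3)]   # emission times measured in O
arrival_t = [n * k for n in (1, 2, 3)]        # reception times at x = 0
```

The arrival time of each signal is its emission time in $O$ plus the light travel time back to $x = 0$, that is, $n\gamma(1 + \beta)$, which agrees with $n$ times the Doppler factor.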
1.2.3. Simultaneity. The worldlines of inertial observers are special paths, namely
timelike geodesics. Moreover, inertial observers correspond to certain coordinate
charts on Minkowski spacetime M. Mathematically it is not a big deal when a
point in M has two different sets of coordinates in two different charts. However,
interpreting the coordinates in the physical model yields some interesting results:
the coordinates are not merely labels, but rather they are supposed to be the
results of physical measurements.
The first observation is that simultaneity is relative: two different observers
adapted to respective inertial frames $O$ and $\tilde O$ moving relative to each other will
not agree in general on whether two events occur at the same time. Imagine
we synchronize the observers at a common origin $O$, and that they are moving
along their respective $x$-axes. The Lorentz transformations tell us that the events
which $\tilde O$ charts as occurring simultaneously at $\tilde t = 0$ correspond to $t = vx/c^2$
in $O$. Different points in this set have different $t$-coordinates, and so $O$ will not
agree that they occur at the same time; cf. (1.2.2).
Though simultaneity is relative, both observers will calculate the same value of
$-(ct)^2 + x^2 = -(x^0)^2 + x^2 = -(\tilde x^0)^2 + (\tilde x)^2 = -(c\tilde t)^2 + (\tilde x)^2$. This observation can
be used to obtain two interesting conclusions that can be verified experimentally:
time dilation and Lorentz contraction.
Consider the event $A$ which has coordinates $\tilde t_A = 1$, $\tilde x_A = 0$ (we suppress
the other spatial dimensions, whose coordinates we take to be 0). The Lorentz
transformation gives us the coordinates $(t_A, x_A)$ for $A$ in $O$, and in particular
\[ t_A = \frac{\tilde t_A + v\,\tilde x_A/c^2}{\sqrt{1-(v/c)^2}} = \frac{1}{\sqrt{1-(v/c)^2}} > 1. \]
$O$ measures more time to have elapsed from $O$ to $A$, and so concludes that the
moving clock in $\tilde O$'s frame runs slow. Another way to see this from the invariant
hyperbola is to note that since $-(c\tilde t_A)^2 + \tilde x_A^2 = -c^2$, $A$ is on the hyperbola
$-(ct)^2 + x^2 = -c^2$; but $x_A = v\,t_A \neq 0$, so that we must have $t_A > 1 = \tilde t_A$! You
can see this with a simple picture: the invariant hyperbola through $A$ hits the
$t$-axis at the point $(t, x) = (1, 0) = (\tilde t_A, 0)$, with $t$-coordinate clearly lower than
$t_A$. (This point is labeled $B$ in Figure 2, left.) In general, if time $\Delta\tilde t$ is measured
between events at a fixed $\tilde x$ value, the time between the events as measured in $O$
will be $\Delta t = \Delta\tilde t/\sqrt{1-(v/c)^2} > \Delta\tilde t$; cf. (1.2.1).
Similarly, moving objects contract along the direction of motion. To be precise,
consider a rod along the $\tilde x$-axis, whose rest length measured in $\tilde O$ is $L$, and which
is moving with velocity $v > 0$ along the $x$-axis. This means that the ends of the
rod are measured simultaneously in $\tilde O$ at, say, $O$ given by $(\tilde t, \tilde x) = (0, 0)$ and $A$
given by $(\tilde t, \tilde x) = (0, L)$. The ends of the rod make paths in spacetime, one given
by $\tilde x = 0$, the other by $\tilde x = L$. (See Figure 2, right.)
[Figure 2. Two spacetime diagrams with axes $x$, $ct$ and $\tilde x$, $c\tilde t$. Left: points $O$, $A$, $B$. Right: points $O$, $A$, $B$, $C$.]
We need to find the coordinates of the point $B$ where $\tilde x = L$ intersects $t = 0$,
since then both $O$ and $B$ will be simultaneous with respect to $O$. By the
Lorentz transformation (1.1.6), the event $B$ will have $\tilde t = -vL/c^2$, so that
$x = \sqrt{1-(v/c)^2}\,L < L$. In $O$, the rod is measured to have length $\sqrt{1-(v/c)^2}\,L$,
since determining the length of the rod amounts to finding the spatial separation
between the ends at the same time. It is this simultaneity that is relative. This
length contraction is a necessary consequence of time dilation and the agreement
by the observers on their relative speed, and can readily be seen geometrically
in terms of the invariant hyperbola. Indeed, the hyperbola $-(c\tilde t)^2 + \tilde x^2 = L^2$
through the point A lies to the right of the line x̃ = L, touching only at the point
of tangency at A. Therefore, if C is the event given by the intersection of the
hyperbola and the line t = 0, the event B which occurs on x̃ = L is between the
origin O and the point C along the line t = 0. Thus C must have coordinate
(t, x) = (0, L), since it lies on the hyperbola. Hence the point B must have
x-coordinate less than L.
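Both effects can be read off numerically from (1.1.6). The sketch below is illustrative: it assumes units with $c = 1$, the sample values $v = 0.6$ and $L = 2$, and a hypothetical helper `to_O` implementing the first line of (1.1.6).

```python
import math

def to_O(t_tilde, x_tilde, v, c=1.0):
    """First line of (1.1.6): coordinates (t, x) in O from (t~, x~)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (t_tilde + v * x_tilde / c**2), g * (v * t_tilde + x_tilde)

v, c, L = 0.6, 1.0, 2.0               # illustrative values
contraction = math.sqrt(1.0 - (v / c) ** 2)

# Time dilation: the clock event A with (t~, x~) = (1, 0).
t_A, x_A = to_O(1.0, 0.0, v, c)

# Lorentz contraction: the event B lies on x~ = L and has t = 0 in O,
# which forces t~ = -v L / c^2.
t_B, x_B = to_O(-v * L / c**2, L, v, c)
```

Here `t_A` exceeds 1 by the factor $1/\sqrt{1-(v/c)^2}$, while `x_B` is the rest length multiplied by $\sqrt{1-(v/c)^2}$, and `t_B` vanishes as required for a length measurement made at one time in $O$.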
1.2.3.1. Pole and barn paradox. Consider a barn of rest length 10 and a pole of
rest length 20. Suppose they are at rest in respective inertial frames moving along
the $x$-axis with respect to each other with relative velocity $v$ given by $v/c = \frac{\sqrt{3}}{2}$,
so that $\sqrt{1-(v/c)^2} = \frac{1}{2}$. From the point of view of the barn, the pole contracts
along the direction of motion to half its rest length, so that it can fit entirely in
the barn as it moves through. From the point of view of the pole, the barn is
moving toward it, and thus it contracts to length 5; thus in the rest frame of the
pole, it can never fit entirely inside the barn. Can both viewpoints of reality be
correct?
The answer of course is yes! We analyze this first from the point of view of the
barn frame O, in which the one end of the barn has worldline x = 0, and the other
end x = 10. The pole is moving in the positive x-direction, and at time t = 0 in
$O$, the ends of the pole are at $x = 0$ (front) and $x = -10$ (back). The worldlines
for the front and back ends of the pole are respectively $x = vt/c = \frac{\sqrt{3}}{2}\,t$ and
$x = -10 + \frac{\sqrt{3}}{2}\,t$. At time $t = \frac{20}{\sqrt{3}}$, the front end of the pole is at $x = 10$ (let this
be spacetime point $A$), and the back end at $x = 0$ (at spacetime point $B$, say).
From the pole frame $\tilde O$, the barn never contains the pole. That the two
observers disagree is not a paradox, since simultaneity is relative. The issue is
simply that in $\tilde O$, the events $A$ and $B$ are not simultaneous as they are in $O$. This
is easy to compute by using the Lorentz transformation (1.1.6) on the points with
coordinates $(t, x) = \bigl(\tfrac{20}{\sqrt{3}}, 10\bigr)$ (point $A$) and $(t, x) = \bigl(\tfrac{20}{\sqrt{3}}, 0\bigr)$ (point $B$).
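The computation suggested here can be sketched as follows. This is an illustration under the assumption of units with $c = 1$ (which matches the worldline $x = \frac{\sqrt{3}}{2}t$ used in the text); the function name is hypothetical.

```python
import math

def to_pole_frame(t, x, beta):
    """Second line of (1.1.6) with c = 1: pole-frame (t~, x~) from barn-frame (t, x)."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return g * (t - beta * x), g * (x - beta * t)

beta = math.sqrt(3) / 2               # v/c for the pole relative to the barn
t0 = 20 / math.sqrt(3)                # barn-frame time at which the pole just fits

A = to_pole_frame(t0, 10.0, beta)     # front of pole at the far end of the barn
B = to_pole_frame(t0, 0.0, beta)      # back of pole at the near end of the barn
```

In the pole frame the two events have different time coordinates, with $A$ occurring before $B$: the front of the pole reaches the far end of the barn before the back of the pole reaches the near end. Their spatial coordinates differ by the pole's rest length 20.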
\[ \frac{c}{b} + c\,\frac{h(\tau)}{b}\,\sinh(b\tau) = h(\tau)\,\frac{c}{b}\,\cosh(b\tau), \]
from which we see $h(\tau) = e^{b\tau}$. The displacement vector from $\gamma(\tau)$ to $A_\tau$
is spacelike, and its length is the distance from $\gamma(\tau)$ to the photon after $\tau$
units of time have elapsed as measured along $\gamma$. This displacement has length
$|e^{b\tau} - 1|\cdot|\overrightarrow{O\gamma(\tau)}| = \frac{c}{b}\,(e^{b\tau} - 1)$. This is the distance from $\gamma(\tau)$ to the photon, the
derivative of which with respect to $\tau$ is $c\,e^{b\tau}$. So, not only is the acceleration not
helping the observer $\gamma$ make any progress on catching the photon, but as measured
along $\gamma$, the photon is accelerating away from $\gamma$. This is counterintuitive, but is a
consequence of how $\gamma$ makes measurements along its worldline; furthermore,
keep in mind that in any local rest frame, the photon has speed $c$.
We begin with a thought experiment due to Einstein. Imagine a box (and its
contents) of total mass M and length L, at rest. Suddenly from inside the left
side of the box some photons of total energy E are emitted in the direction
toward the right side of the box. The formula for photon momentum p is E = cp.
By conservation of momentum, then, the box should acquire a net momentum
$-p = Mv$ (to the left). When the photons reach the right side of the box and stop,
the motion ceases, with the box having moved to the left with a net displacement
$\Delta x < 0$. (You might argue that by causality, what happens at one end cannot
instantaneously affect the other end; this can be taken into account, still arriving
at the conclusion below; see [96, p. 27–28] or [188, p. 138–143].)
Einstein argues that there is no reason why in this closed system the center of
mass should have changed from the start to the end of the process. He suggests
that the photons must have carried a mass $m$ to the right side of the box to
balance out the center, i.e., $m(L + \Delta x) + (M - m)\,\Delta x = 0$. The velocity of the
box is determined by $(M - m)v = -p$ (conservation of momentum), and the
elapsed time during the photon motion is $(L + \Delta x)/c = \Delta t = \Delta x/v$. Putting
these together we obtain $m = p/c = E/c^2$, or
\[ E = mc^2. \]
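The algebra in this thought experiment can be checked with exact rational arithmetic. The sketch below is illustrative only: the numerical values of $M$, $L$, $E$, $c$ and the function name are assumptions. It sets $m = E/c^2$ and confirms that the three relations in the text are then mutually consistent.

```python
from fractions import Fraction as Fr

def box_experiment(M, L, E, c):
    """Quantities in Einstein's box argument, taking m = E / c^2."""
    p = E / c             # photon momentum, from E = c p
    m = E / c**2          # mass carried by the photons (the claim being tested)
    v = -p / (M - m)      # box velocity, from (M - m) v = -p
    dx = -m * L / M       # displacement, from m (L + dx) + (M - m) dx = 0
    dt = (L + dx) / c     # photon travel time
    return m, v, dx, dt

M, L, E, c = Fr(10), Fr(3), Fr(8), Fr(2)   # illustrative values
m, v, dx, dt = box_experiment(M, L, E, c)
```

With $m = E/c^2$ the displacement obtained from the center of mass condition equals $v\,\Delta t$ exactly, which is the consistency the argument requires.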
of the inertial frame $O$ with respect to which the particle is moving, the mass
might be construed to be
\[ m(v) = \frac{m_0}{\sqrt{1-(v/c)^2}}. \]
Note that $\lim_{v \nearrow c} m(v) = +\infty$, which indicates that inertia increases with speed;
this is consistent with the fact that a force cannot accelerate any particle to or
above the speed of light. Note also that by Taylor (binomial) expansion,
The first two terms are the rest energy and the kinetic energy. Furthermore if
$U_{\mathrm{obs}}/c$ is a timelike unit vector tangent to the path of an observer, the observed
energy of the particle with momentum $P$ is just $E_{\mathrm{obs}} = -\langle P, U_{\mathrm{obs}}\rangle$. Finally, we
note that
\[ P = \frac{E_0}{c^2}\frac{\partial}{\partial\tilde t} = \frac{E}{c^2}\frac{\partial}{\partial t} + p = \frac{E}{c}\frac{\partial}{\partial x^0} + p, \]
\[ E_0^2 = -c^2\langle P, P\rangle = E^2 - c^2|p|^2 = (m(v)c^2)^2 - c^2|p|^2. \]
By conservation of energy, this must balance with the rate of flux of energy into
the region across the spatial boundary $\partial R$ with outward unit normal $n$:
\[ \int_R c\,T^{00}{}_{,0}\;dx = \int_{\partial R}\Bigl\langle c\,T^{i0}\frac{\partial}{\partial x^i}, -n\Bigr\rangle\,d\sigma = -\int_R c\,T^{i0}{}_{,i}\;dx, \]
where we used the divergence theorem. Since the region $R$ may be made
arbitrarily small, we obtain the 0-component of $\operatorname{div}_\eta T = 0$. The other components
may be derived similarly using conservation of momentum.
We now introduce several standard examples of the stress-energy tensor. Of
course, when in vacuum (free of fields or particles), T = 0. The next simplest
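example is that of dust. Consider a collection of particles which are all at rest
in some inertial frame $\tilde O$. In this frame, the rest energy density $\rho$ of the dust
determines the stress-energy tensor:
\[ T = c^{-2}\rho\,\frac{\partial}{\partial\tilde t}\otimes\frac{\partial}{\partial\tilde t} = \rho\,\frac{\partial}{\partial\tilde x^0}\otimes\frac{\partial}{\partial\tilde x^0}. \]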
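This can be written invariantly as $T = c^{-2}\rho\,U\otimes U$, where $U$ is the four-velocity
of the dust. Using a Lorentz transformation, one can determine the energy density
in a frame $O$ with respect to which the dust moves with velocity $v$ aligned along
the $x$-axis. Indeed, in such a frame, (1.2.4) gives
\[ U = \frac{\partial}{\partial\tilde t} = \frac{1}{\sqrt{1-(v/c)^2}}\frac{\partial}{\partial t} + \frac{v}{\sqrt{1-(v/c)^2}}\frac{\partial}{\partial x} = \frac{c}{\sqrt{1-(v/c)^2}}\frac{\partial}{\partial x^0} + \frac{v}{\sqrt{1-(v/c)^2}}\frac{\partial}{\partial x}, \]
so that
\[ \begin{aligned}
T &= c^{-2}\rho\,U\otimes U \\
&= \frac{\rho}{1-(v/c)^2}\,\frac{\partial}{\partial x^0}\otimes\frac{\partial}{\partial x^0}
+ \frac{\rho\,v/c}{1-(v/c)^2}\left(\frac{\partial}{\partial x^0}\otimes\frac{\partial}{\partial x} + \frac{\partial}{\partial x}\otimes\frac{\partial}{\partial x^0}\right)
+ \frac{\rho\,(v/c)^2}{1-(v/c)^2}\,\frac{\partial}{\partial x}\otimes\frac{\partial}{\partial x}.
\end{aligned} \tag{1.3.2} \]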
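The components in (1.3.2) can be generated directly from $T^{\mu\nu} = c^{-2}\rho\,U^\mu U^\nu$. The following sketch is illustrative, with hypothetical names and sample values, and it restricts to the $(x^0, x)$ components.

```python
import math

def dust_stress_energy(rho, beta, c=1.0):
    """Components T^{mu nu} in the basis (d/dx^0, d/dx) for dust with v/c = beta."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    U = (g * c, g * beta * c)          # U = g c d/dx^0 + g v d/dx, with v = beta c
    return [[rho * U[m] * U[n] / c**2 for n in range(2)] for m in range(2)]

rho, beta, c = 2.0, 0.6, 3.0           # illustrative values
T = dust_stress_energy(rho, beta, c)
trace_eta = -T[0][0] + T[1][1]         # contraction with eta = diag(-1, 1)
```

The energy density measured in $O$ is `T[0][0]`, which exceeds $\rho$ by the factor $1/(1-(v/c)^2)$. The contraction with $\eta$ equals $-\rho$ in every frame, since $\langle U, U\rangle = -c^2$.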
the classical measure of the rate at which the force f applied with velocity v is
doing work, i.e., the (mechanical) power developed by the force. As $E = m(v)c^2$,
we see that the measured rate of change of energy $dE/dt$ in $O$ is the sum of
two terms, the first of which we might interpret as the rate of change of kinetic
energy, and the second as the rate of change of heat energy, as it arises from the
rate of change of the internal energy as measured in O (cf. [189, Chapter V]).
This should not be surprising, as the energy-momentum four-vector delineates
relations between energy and momentum (hence mass) in different frames.
Given the discussion of the stress-energy tensor above in terms of energy and
momentum fluxes, if one measures the stress-energy fluxes for a system upon
which external forces or fields act (a non-closed system, then), the divergence
$(\operatorname{div}_\eta T)^\mu = T^{\mu\nu}{}_{;\nu}$ of the stress-energy tensor $T$ for the system is then a vector
field whose components encode the density of the net power and force applied
to the system (see [162, p. 188], for instance).
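Before we move on, we give one example, that of the force on a particle
of charge $q$ moving with four-velocity $U$ in an electromagnetic field given by
the Faraday tensor $F^{\mu\nu}$. The four-force is given by $F^\mu = (q/c)\,U_\nu F^{\mu\nu}$. By
skew-symmetry, $F^\mu U_\mu$ vanishes, so that the force is purely mechanical and does
not change the rest mass of the charged particle. We compute the components
in an inertial frame with respect to which the particle has velocity $v = v^i\,\partial/\partial x^i$.
Then with $v_i = v^i$, we have
\[ U_\nu\,dx^\nu = \frac{1}{\sqrt{1-(v/c)^2}}\,\bigl(-c\,dx^0 + v_i\,dx^i\bigr), \]
so that
\[ \begin{aligned}
F &= \frac{1}{\sqrt{1-(v/c)^2}}\left(\frac{q}{c}\,\langle v, E\rangle\,\frac{\partial}{\partial x^0} + q\Bigl(E + \frac{v}{c}\times B\Bigr)^{i}\frac{\partial}{\partial x^i}\right) \\
&= \frac{1}{\sqrt{1-(v/c)^2}}\left(\frac{1}{c^2}\,\langle v, qE\rangle\,\frac{\partial}{\partial t} + q\Bigl(E + \frac{v}{c}\times B\Bigr)\right).
\end{aligned} \]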
We plainly see here the Lorentz force in the spatial components, as well as
(c−2 times) the power developed by the electric field on the moving charge in
the time component.
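The component formula can be checked against the definition $F^\mu = (q/c)\,U_\nu F^{\mu\nu}$ with the matrix (1.1.7). The sketch below is illustrative; the sample values and the helper names are assumptions.

```python
import math

def faraday(E, B):
    """Component matrix [F^{mu nu}] of (1.1.7)."""
    (E1, E2, E3), (B1, B2, B3) = E, B
    return [[0.0,  E1,  E2,  E3],
            [-E1, 0.0,  B3, -B2],
            [-E2, -B3, 0.0,  B1],
            [-E3,  B2, -B1, 0.0]]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def four_force(q, v, E, B, c=1.0):
    """Components F^mu = (q/c) U_nu F^{mu nu} and the lowered four-velocity U_nu."""
    g = 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v) / c**2)
    U_lower = [-g * c] + [g * vi for vi in v]    # U_nu in the coframe (dx^0, dx^i)
    F = faraday(E, B)
    force = [(q / c) * sum(U_lower[n] * F[m][n] for n in range(4))
             for m in range(4)]
    return force, U_lower, g

q, c = 2.0, 1.0                                   # illustrative values
v, E, B = (0.3, -0.2, 0.5), (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
force, U_lower, g = four_force(q, v, E, B, c)

vxB = cross(v, B)
expected_time = g * (q / c) * sum(vi * Ei for vi, Ei in zip(v, E))
expected_space = [g * q * (E[i] + vxB[i] / c) for i in range(3)]
contraction = sum(force[m] * U_lower[m] for m in range(4))   # F^mu U_mu
```

The spatial components reproduce $q(E + \frac{v}{c}\times B)$ up to the factor $1/\sqrt{1-(v/c)^2}$, and the contraction $F^\mu U_\mu$ vanishes by the skew-symmetry of $F^{\mu\nu}$.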
\[ F(x) = -(x^0)^2 + (x^1)^2 + (x^2)^2 + \cdots + (x^k)^2. \]
We start with the zero level set, $\Sigma = F^{-1}(0)$, which is precisely the set of all
points whose position vector from the origin $O$ is null; in other words, $\Sigma$ is the
lightcone from the origin $O$. The complement $\Sigma \setminus \{O\} = \Sigma^+ \cup \Sigma^-$ is a smooth
null hypersurface: along $\Sigma \setminus \{O\}$, the null vector field $x^\mu\,\partial/\partial x^\mu$ is both tangent
and normal to $\Sigma$.
We move on to discuss the other level sets $\Sigma$, which are regular hypersurfaces.
Recall (or see 5.1.2) that for $Y$ and $Z$ tangent to $\Sigma$, if $D^\Sigma_Y Z$ is the tangential
component of $D_Y Z$, then $D^\Sigma$ is the Levi-Civita connection for $\Sigma$ in the induced
metric. We write $D_Y Z = D^\Sigma_Y Z + \mathrm{II}(Y, Z)$, where $\mathrm{II}$ is the second fundamental
form of $\Sigma$.
We can compute its curvature via the Gauss equation (see Proposition 5-5): for
$X, Y, Z, W$ tangent to $\Sigma$,
\[ \langle R^\Sigma(X, Y, Z), W\rangle = \langle R(X, Y, Z), W\rangle - \langle\mathrm{II}(X, Z), \mathrm{II}(Y, W)\rangle + \langle\mathrm{II}(X, W), \mathrm{II}(Y, Z)\rangle. \tag{1.4.1} \]
Thus with $R = 0$ on Minkowski spacetime, we insert the formula for the second
fundamental form to obtain
\[ \langle R^\Sigma(X, Y, Z), W\rangle = r^{-2}\bigl(-\langle X, Z\rangle\langle Y, W\rangle\langle n, n\rangle + \langle X, W\rangle\langle Y, Z\rangle\langle n, n\rangle\bigr). \tag{1.4.2} \]
spacelike normal vector field. The induced metric on $S^k_1(r)$ is Lorentzian: the
vector field
\[ \bigl(r^2 + (x^0)^2\bigr)\frac{\partial}{\partial x^0} + x^0 x^i\,\frac{\partial}{\partial x^i} \]
where $du\,dv = \frac{1}{2}(du\otimes dv + dv\otimes du)$, and $\mathring g_{S^2}$ is the metric on the round unit
sphere. Note that the level sets of $u$ and $v$ are null. If one holds $u$ and a point
$\omega \in S^2$ fixed, then varying $v \to +\infty$ corresponds to going forward in time along
a null geodesic, to infinity. Likewise, with v fixed, u → −∞ corresponds to
a path of light going to past infinity. The goal is to represent where these null
paths “are” at infinity. One way to do this is to use the inverse tangent function
to define new coordinates
Since $dT = (1+v^2)^{-1}\,dv + (1+u^2)^{-1}\,du$ and $dR = (1+v^2)^{-1}\,dv - (1+u^2)^{-1}\,du$,
we can easily derive the Jacobian determinant
\[ \frac{\partial(T, R)}{\partial(u, v)} = \frac{2}{(1+v^2)(1+u^2)} > 0. \]
Moreover,
\[ -dT^2 + dR^2 = \frac{4}{(1+v^2)(1+u^2)}\,(-du\,dv) =: \Omega^2\,(-du\,dv). \]
Note that $\Omega^2 = \dfrac{4}{(1+t^2+r^2)^2 - 4t^2r^2}$ is smooth on all of $\mathbb{M}^4$, and that
\[ \sin R = \sin(\tan^{-1} v)\cos(\tan^{-1} u) - \sin(\tan^{-1} u)\cos(\tan^{-1} v) = \frac{v - u}{\sqrt{1+v^2}\,\sqrt{1+u^2}}. \]
Thus $\sin^2 R = \frac{1}{4}\,\Omega^2\,(v-u)^2$. This implies
The metric on the left is manifestly conformal to the Minkowski metric, and is
readily identified as a Lorentzian product metric $(\mathbb{R}\times S^3, -dt^2 + \mathring g_{S^3})$, which
is in fact the Einstein static universe (Section 2.4.2). In other words, we have
produced an embedding of Minkowski spacetime into the Einstein static universe,
which is not an isometry, but a conformal isometry. Thus it preserves the causal
nature of vectors, in particular the null structure. The image of the embedding is
a bounded set, since $-\pi < T < \pi$ and $0 \leq R < \pi$ on Minkowski spacetime. The
boundary of the set is the union of two smooth null hypersurfaces $\mathscr{I}^\pm$, "scri-plus"
and "scri-minus", where "scri" is short for "script I." We note that $\Omega$ is a defining
function for $\mathscr{I}^\pm$, since $\Omega = 0$ here, with $d\Omega \neq 0$.
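The identities for the conformal factor can be spot-checked numerically. The sketch below rests on assumptions not visible in this excerpt: it takes $u = t - r$, $v = t + r$, and $T = \tan^{-1}v + \tan^{-1}u$, $R = \tan^{-1}v - \tan^{-1}u$, which are consistent with the stated differentials $dT$, $dR$ and with $T + R = 2\tan^{-1}v$. The sample point and names are illustrative.

```python
import math

def conformal_data(t, r):
    """Null coordinates, compactified coordinates, and Omega^2 at (t, r)."""
    u, v = t - r, t + r                       # assumed null coordinates
    T = math.atan(v) + math.atan(u)           # assumed definition of T
    R = math.atan(v) - math.atan(u)           # assumed definition of R
    omega2 = 4.0 / ((1 + v * v) * (1 + u * u))
    return u, v, T, R, omega2

t, r = 0.7, 2.5                               # illustrative point with r >= 0
u, v, T, R, omega2 = conformal_data(t, r)

# The same conformal factor written in terms of t and r.
omega2_tr = 4.0 / ((1 + t * t + r * r) ** 2 - 4 * t * t * r * r)
```

At the sample point the two expressions for $\Omega^2$ agree, $\sin^2 R$ equals $\frac{1}{4}\Omega^2(v-u)^2$, and $(T, R)$ lies in the bounded range described above.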
We now briefly describe some of the features of the boundary (Figure 3).
Since $T + R = 2\tan^{-1} v$, the null rays to the future end up ($v \to +\infty$) at
$T + R = \pi$, which for $0 < R < \pi$ gives $\mathscr{I}^+$. Similarly for $u \to -\infty$ we get
$T - R = 2\tan^{-1} u = -\pi$, which gives $\mathscr{I}^-$. The null vector $\partial/\partial T \mp \partial/\partial R$ is both
tangent and normal to $\mathscr{I}^\pm$. The closure of the image of Minkowski spacetime
can be represented by a $T$-$R$ triangle, bounded by $R = 0$ and $T \pm R = \pm\pi$.
Every point in this region represents a two-sphere, except where $R = 0$ or $R = \pi$,
each point of which represents a point. One can argue that timelike geodesics
must start at $i^-$ in the past, corresponding to $(T, R) = (-\pi, 0)$ and must end
at $i^+$ corresponding to $(T, R) = (\pi, 0)$. We let $i^0$ be the point corresponding
to $(T, R) = (0, \pi)$, which is called spacelike infinity. Spacelike curves with
$r \to +\infty$ have a limit at $i^0$ in the conformal picture.
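[Figure 3: diagram in the $T$-$R$ plane with the points $i^+$ (at $T = \pi$), $i^0$ (at $T = 0$) and $i^-$ (at $T = -\pi$) marked; the horizontal axis is labeled from $R = \pi$ to $R = 0$.]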
Exercises
b. If $V$ is a smooth vector field along $\gamma$, show (perhaps using part a.) that the
covariant derivative satisfies
$$\left.\frac{DV}{dt}\right|_{t=0} = \left.\frac{d}{dt}\right|_{t=0} P_t^{-1}(V(t)).$$
Exercise 1-9 (the Hessian). Suppose $(M, g)$ is semi-Riemannian, and $u$ is a
smooth function on $M$. The Hessian of $u$ is defined by $\operatorname{Hess}_g u = \nabla(du)$. It is a
$(0, 2)$-tensor; in local coordinates, $(\operatorname{Hess}_g u)_{ij} = u_{;ij}$ (and recall $u_{,i} = u_{;i}$, since
$du = u_{,i}\,dx^i$).
a. Show that $\operatorname{Hess}_g u(X, Y) = Y X[u] - (\nabla_Y X)[u]$, where we recall that $X[u] =$
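$$\begin{aligned}
\nabla_X Y &= \nabla^B_X Y,\\
\nabla_X V &= f^{-1}(\nabla_X f)\,V = \nabla_V X,\\
\nabla_V W &= -f^{-1}\langle V, W\rangle\,\operatorname{grad}_{g_B} f + \nabla^F_V W\\
&= -f\,g_F(V, W)\,\operatorname{grad}_{g_B} f + \nabla^F_V W.
\end{aligned}$$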
b. Show that for any smooth function $\varphi : B \to \mathbb{R}$, the gradient $\operatorname{grad}_g(\pi_B^*\varphi)$ agrees
(under the identification above) with $\operatorname{grad}_{g_B}(\varphi)$. Show that $\operatorname{Hess}_{g_B}\varphi(X, Y) =
\operatorname{Hess}_g(\pi_B^*\varphi)(X, Y)$ if $X$ and $Y$ are vectors tangent to $B$. Can you give an example
of a warped product metric, along with a function $\varphi$ and a vector field $Z$ tangent
to $M$, for which $\operatorname{Hess}_{g_B}\varphi((\pi_B)_*(Z), (\pi_B)_*(Z)) \neq \operatorname{Hess}_g(\pi_B^*\varphi)(Z, Z)$?
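$$\begin{aligned}
R(X, Y, Z) &= R^B(X, Y, Z),\\
R(X, V, Y) &= f^{-1}\bigl(\operatorname{Hess}_{g_B} f(X, Y)\bigr)V,\\
R(X, Y, V) &= 0 = R(V, W, X),\\
R(V, X, W) &= f^{-1}\langle V, W\rangle\,\nabla^B_X(\operatorname{grad}_{g_B} f) = f\,g_F(V, W)\,\nabla^B_X(\operatorname{grad}_{g_B} f),\\
R(U, V, W) &= R^F(U, V, W) - f^{-2}\langle\operatorname{grad}_{g_B} f, \operatorname{grad}_{g_B} f\rangle\bigl(\langle V, W\rangle U - \langle U, W\rangle V\bigr)\\
&= R^F(U, V, W) - \langle\operatorname{grad}_{g_B} f, \operatorname{grad}_{g_B} f\rangle\bigl(g_F(V, W)U - g_F(U, W)V\bigr).
\end{aligned}$$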
signature):
Exercise 1-17 (curvature and parallel transport). The curvature tensor can be
computed via parallel transport, and in fact it measures the failure of path
independence of parallel transport. Some geometry texts skip this, whereas
some general relativity texts point this out, to varying degrees of mathematical
precision. See, for example, [207; 218].
Let $g$ be a semi-Riemannian metric on $M$, and let $p \in M$. Consider a coordinate
chart $\varphi : U \subset \mathbb{R}^n \to M$ centered at $p$, and let $B \subset U$ be a closed rectangle in
a two-dimensional coordinate plane around the origin, say $(x^1, x^2) \in B$ for
$\max(|x^1|, |x^2|) \le \varepsilon_0$. Given a vector $V \in T_pM$, define a vector field along the
coordinate mapping from $B$ to $M$ as follows: for $(x^1, x^2) \in B$, let $V(x^1, x^2) \in
T_{\varphi(x^1, x^2)}M$ be the vector obtained by the parallel transport of $V \in T_pM$ along the
image under $\varphi$ of the segment from $(0, 0)$ to $(x^1, 0)$, and then along the image
of the segment from $(x^1, 0)$ to $(x^1, x^2)$. By smooth dependence of solutions to
ODE, this depends smoothly on $(x^1, x^2)$. Similarly, let $\widetilde{V}(x^1, x^2)$ be the vector
field obtained by the parallel transport of $V \in T_pM$ along the image of the
segment from $(0, 0)$ to $(0, x^2)$, and then along the image of the segment from
$(0, x^2)$ to $(x^1, x^2)$. We remark that we can compute in coordinates, i.e., compute
in $(U, \varphi^*g)$, which we will do without further comment, pushing forward or
pulling back quantities between $U$ and $M$ as needed.
be the value of $\widetilde{W}_{(x^1, x^2)}$ at the final point along the map $\gamma_{(x^1, x^2)}$; in other words,
$W(x^1, x^2)$ is the vector obtained by parallel transport of $V(x^1, x^2)$ first along
the line from $(x^1, x^2)$ to $(0, x^2)$, and then along the line from $(0, x^2)$ to $(0, 0)$.
Given $V$, we can choose a uniform constant for the “O”-estimates below, or
likewise, given $K$, we have uniform “O”-estimates for all $|V|_g \le K$.
a. Note that $V(x^1, x^2) = \widetilde{V}(x^1, x^2)$ along the axes, where $x^1x^2 = 0$. Show that
$V(x^1, x^2) - \widetilde{V}(x^1, x^2) = O(|x^1x^2|)$. (You might write the parallel transport
system in coordinates along the sides of the rectangle and estimate the change
in the vector fields along pairs of parallel sides by estimating an appropriate
integral.)
b. Note that $W = V \in T_pM$ for $x^1x^2 = 0$. Argue that $|W - V| = O(|x^1x^2|)$.
Remark. You can make the corresponding construction on any rectangle
$$[x^1, x^1 + \Delta x^1] \times [x^2, x^2 + \Delta x^2] \subset B,$$
starting from $V(x^1, x^2)$, say, and the analogous difference is $O(|\Delta x^1\,\Delta x^2|)$.
c. Show that
$$\lim_{\substack{(x^1, x^2)\to(0,0)\\ x^1x^2 \neq 0}} \frac{W(x^1, x^2) - V}{x^1x^2} = R\left(\frac{\partial}{\partial x^2}, \frac{\partial}{\partial x^1}, V\right).$$
(Hint: Write the numerator above in terms of integrals along the sides of $\gamma_{(x^1, x^2)}$.
You again want to estimate the integrals in parallel pairs. The answer will drop out
once you argue that up to an acceptable error, you can replace the $\widetilde{W}_{(x^1, x^2)}$-term
in the integrals by the appropriate $V$-term or $\widetilde{V}$-term.)
CHAPTER 2
Newton’s law of gravity can be formulated as follows. If two objects are separated
by a spatial distance $r$, then the magnitude of the gravitational force between
them is given by $F = G m_g M_g / r^2$, where the direction of the force is along the
line from one mass to the other. Here $m_g$ and $M_g$ are the gravitational masses
associated to the two objects, and $G$ is Newton’s gravitational constant. If $\hat r$ is the
unit vector from the object of mass $M_g$ to the other object, then the force on the
object of mass $m_g$ is $F = -(G m_g M_g / r^2)\,\hat r$. If the object of mass $M_g$ is located
at the origin, and $x \in \mathbb{R}^3$ is the position of the other object, then $r = |x|$, $\hat r = x/r$,
and we can write the force as follows, where $\nabla$ is the Euclidean gradient:
$$F = -\frac{G m_g M_g}{|x|^2}\,\hat r = m_g \nabla\left(\frac{G M_g}{|x|}\right) = -m_g \nabla\Phi, \tag{2.1.1}$$
where $\Phi(x) := -GM_g/|x|$ is the gravitational potential associated to the mass
$M_g$. In analogy with Coulomb’s law of electrostatics, namely that the electric
force between stationary charged particles (with charges $q_1$ and $q_2$) is given by
$F = q_1 q_2 / r^2$ (in cgs units), $m_g$ and $M_g$ play the role of gravitational charges.
Of course, there is already a notion of mass embodied in Newton’s second
law: in this context we write it as $F = m_i a$, and call $m_i$ the inertial mass. By
equating forces, we solve for the acceleration of an object of gravitational mass
$m_g$ and inertial mass $m_i$ due to the gravitational force of an object of gravitational
mass $M_g$:
$$a = -\frac{m_g}{m_i}\,\nabla\Phi.$$
We could in principle use this equation to discern the ratio of the inertial to the
gravitational mass for various objects. It turns out that the acceleration is the
same for all bodies, and hence the mass ratio is constant, a result epitomized
by the apocryphal experiments of Galileo dropping objects of different masses
from the tower in Pisa. By adjusting $G$, we may assume, then, that $m_i = m_g$:
the inertial and gravitational masses agree. The effect of gravity is universal:
it accelerates all objects the same way, independent of what precisely comprises
the mass. In this way gravity is decidedly different from electromagnetism.
Before we move on, we note that the potential function for Newtonian gravity
satisfies a simple partial differential equation. Indeed, away from $x = 0$, the
function $\Phi(x) = -GM/|x|$ is harmonic (with respect to the Euclidean metric),
i.e., $\Delta\Phi = 0$, as you can easily check. Of course, $\Delta\Phi$ can be interpreted globally
as a distribution, say $T = \Delta\Phi$, and we obtain the equation
$$\Delta\Phi = 4\pi G M \delta_0, \tag{2.1.2}$$
where $\delta_0$ is the Dirac measure at the origin (the “location” of the mass $M$).
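Indeed, suppose $\psi \in C_c^\infty(\mathbb{R}^3)$ is any smooth function supported in the ball of
radius $r_0$ around the origin. There exists a $C > 0$ such that $|(\Phi\nabla\psi)(x)| \le C/|x|$
for all $|x| \neq 0$; hence, for all $\varepsilon > 0$, we have $\left|\int_{\{|x|=\varepsilon\}} \Phi\nabla\psi\cdot\hat r\,d\sigma\right| \le 4\pi C\varepsilon$. Then,
since $1/|x| \in L^1_{\mathrm{loc}}(\mathbb{R}^3)$, an application of Gauss’s divergence theorem, along with
Green’s identity $\operatorname{div}(\Phi\nabla\psi - \psi\nabla\Phi) = \Phi\Delta\psi - \psi\Delta\Phi$ (compare (1-11b)) and the
vanishing of $\Delta\Phi(x)$ for $|x| \neq 0$, yields
$$\begin{aligned}
T(\psi) := \int_{\mathbb{R}^3} \Phi\,\Delta\psi\,dx &= \lim_{\varepsilon\searrow 0}\int_{\{\varepsilon\le|x|\le r_0\}} \Phi\,\Delta\psi\,dx\\
&= -\lim_{\varepsilon\searrow 0}\int_{\{|x|=\varepsilon\}} (\Phi\nabla\psi - \psi\nabla\Phi)\cdot\hat r\,d\sigma\\
&= \lim_{\varepsilon\searrow 0}\int_{\{|x|=\varepsilon\}} \psi\,\frac{GM}{|x|^2}\,\hat r\cdot\hat r\,d\sigma\\
&= 4\pi GM \lim_{\varepsilon\searrow 0}\frac{1}{4\pi\varepsilon^2}\int_{\{|x|=\varepsilon\}} \psi\,d\sigma\\
&= 4\pi GM\,\psi(0),
\end{aligned}$$
which is (2.1.2) in the sense of distributions.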
In this case the matter density is $\sigma = M\delta_0$. For a more general matter
distribution of density $\sigma$, the gravitational potential solves Poisson’s equation
$$\Delta\Phi = 4\pi G\sigma. \tag{2.1.3}$$
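The distributional identity for the point mass can be checked numerically on a single test function. The sketch below is illustrative only: it takes $G = M = 1$ and uses the radial Gaussian $\psi(x) = e^{-|x|^2}$, which is rapidly decaying rather than compactly supported (an assumption made to keep the derivatives explicit). For radial $\psi$ the pairing reduces to a one-dimensional integral in $r$.

```python
import math

G, M = 1.0, 1.0   # illustrative units, not physical values

def psi(r):
    """Radial test function exp(-r^2); stands in for a compactly supported one."""
    return math.exp(-r * r)

def laplacian_psi(r):
    """For radial psi: psi'' + (2/r) psi' = (4 r^2 - 6) exp(-r^2)."""
    return (4.0 * r * r - 6.0) * math.exp(-r * r)

def pairing(r_max=10.0, n=200_000):
    """Midpoint-rule approximation of the integral of Phi * Laplacian(psi) over R^3."""
    h = r_max / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        phi = -G * M / r                       # Newtonian potential
        total += phi * laplacian_psi(r) * 4.0 * math.pi * r * r * h
    return total

value = pairing()
expected = 4.0 * math.pi * G * M * psi(0.0)
print(value, expected)
```

The two printed numbers should agree to within the quadrature error; the truncation at `r_max` is harmless here because the Gaussian is negligible beyond that radius.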
2.2.1. The equivalence principle. We can bring to bear upon the question of the
existence of inertial frames a famous thought experiment of Einstein. Suppose
there were an inertial frame of reference, say a small lab room isolated from
other forces or fields. In such a frame, if one lets go of a ball (or rather a test
particle, say), it would tend to stay at rest. Now suppose a rocket is attached to
the top of the room, and then accelerates the room “upward” at a uniform rate.
If one lets go of a ball now, it will “fall” toward the floor, just as it would in a
uniform gravitational field.³ In this case, one cannot distinguish, by making local
measurements of the paths of objects upon which no forces (other than possibly
gravity) act, whether the frame is non-inertial, or whether there is a uniform
gravitational field present. Note that this depends on the universality of gravity:
it imparts the same acceleration to all objects. This version of the equivalence
principle basically comes down to the equality of gravitational and inertial masses:
if they were different, then one could distinguish between the two situations by
making appropriate measurements. The equality of the gravitational and inertial
masses is thus a reflection of the equivalence of a uniformly accelerating frame
with an inertial frame in which a uniform gravitational field is present. Other
versions of the equivalence principle involve other laws of physics, asserting
that measurements involving those laws cannot distinguish between a uniformly
accelerating frame and a frame in which a uniform gravitational field is present.
We have just seen how to “create” a uniform gravitational field via acceleration.
Conversely, consider now the lab room as a spaceship in orbit, or closer to home,
a compartment of one of those amusement park “free fall” rides. In free fall,
if one lets go of a small object (not safe to try this on the free fall ride!), its
position with respect to the room is constant, because a uniform gravitational
field accelerates all objects the same. The law of inertia appears to hold, and
the observer in free fall will then not detect a uniform gravitational field. We on
the earth claim to detect such a field precisely because we are not in free fall:
the contact forces with the earth keep us from a free fall path, and thus if we
drop an apple, we observe it “fall” to the earth under the force of gravity. This is
equivalent to the accelerated room: we feel the contact force between the floor
and our legs, and we observe objects falling toward the ground.
Even light cannot escape “gravity’s” pull: if we imagine a light ray entering
the room at one end moving in a straight line in an inertial frame, the path of the
light is curved in the accelerated frame of reference. Einstein reasoned that by
equivalence, a gravitational field should bend the paths of light rays too. Thus,
were there to exist an inertial reference frame, where a nonzero gravitational
field accounts for acceleration not attributed to other forces, light rays would
apparently not move along straight lines in this frame. (While Newton had
actually anticipated the deflection of light by a massive object, Einstein was able
to use his theory of gravitation to give a much more accurate prediction than that
of classical physics.)
³ Strictly speaking, we are considering a lab frame which is limited in extent in both space
and time, so that we can reliably expect a gravitational field to be roughly uniform, causing an
approximately constant acceleration. In regions more extended in spacetime, the non-uniformity
of the gravitational field will give rise to tidal forces that could be used to distinguish between
uniform acceleration and a gravitational field, and in fact will be directly related to the curvature
of spacetime, as we will see below.
A natural question is how to proceed with these notions and their ramifications
for the question of the existence of inertial frames. Before doing this, we introduce
the physical phenomenon of gravitational redshift.
2.2.2. Gravitational redshift. We recall the Doppler shift formula, as seen earlier
in the twin paradox example. Suppose light with wavelength $\lambda_0$ emanates from
an emitter at rest in an inertial frame $O$; the frequency of the light $\nu_0$ satisfies
$\lambda_0\nu_0 = c$, and the time between the start and end of the emission of one full
wavelength is $\Delta t_0 = 1/\nu_0$. The light is absorbed by a detector moving at velocity
$v$ with respect to $O$, along the same line as the propagation of light. We let $\widetilde{O}$ be
the rest frame of the detector, and let $\nu_1$ be the frequency of the light as measured
by the detector in $\widetilde{O}$, so that $\Delta t_1 = 1/\nu_1$ is the time for the absorption of one
wavelength as measured in $\widetilde{O}$. As shown in Exercise 1-5, we have
$$\frac{\Delta t_0}{\Delta t_1} = \frac{\nu_1}{\nu_0} = \frac{\sqrt{1 - v/c}}{\sqrt{1 + v/c}}.$$
We next show that the gravitational redshift, which is derived in concert with
and provides evidence for the equivalence between uniform acceleration and a
uniform gravitational field, places a roadblock in the way of the existence of an
inertial observer, and thus calls into question whether one can mesh gravity with
the Minkowskian geometry of special relativity. Indeed, imagine two rockets
moving along the $y$-axis, one following the other at a fixed distance $\Delta y$, and with
a uniform acceleration $a > 0$ with respect to some inertial frame $O$. Suppose the
frame $O$ is momentarily comoving with the rockets at the instant a photon of
wavelength $\lambda_0 = c/\nu_0$ is emitted from the trailing rocket to the lead rocket. In the
time $\Delta t$ (measured in $O$) the photon travels to the lead rocket, the rockets have
a change in velocity $\Delta v = a\,\Delta t$, so there will be a Doppler shift in the reported
frequency of the received photon, from the difference in velocity at reception and
emission. The Doppler shift can be computed using the formula recalled above,
since the shift will be the same as that computed by inertial frames with relative
velocity $\Delta v$ (momentarily comoving with the respective rocket at emission and
absorption, respectively; see Section 1.2.4). Thus if $\nu_1$ is the frequency of the
received photon (as measured by the lead rocket), then
$$\frac{\nu_1}{\nu_0} = \frac{\sqrt{1 - \Delta v/c}}{\sqrt{1 + \Delta v/c}} = \frac{1 - \Delta v/c}{\sqrt{1 - (\Delta v/c)^2}}.$$
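As a quick sanity check on the two forms of the Doppler ratio above, the sketch below evaluates both for a few values of $\beta = \Delta v/c$ and compares them with the first-order Taylor approximation $1 - \beta$. The function names and the sample values of $\beta$ are illustrative assumptions, not taken from the text.

```python
import math

def doppler_ratio(beta):
    """nu_1 / nu_0 = sqrt(1 - beta) / sqrt(1 + beta), with beta = dv / c."""
    return math.sqrt(1.0 - beta) / math.sqrt(1.0 + beta)

def doppler_ratio_alt(beta):
    """Equivalent form (1 - beta) / sqrt(1 - beta^2)."""
    return (1.0 - beta) / math.sqrt(1.0 - beta * beta)

for beta in (1e-4, 1e-2, 0.3, 0.9):
    exact = doppler_ratio(beta)
    # The two closed forms agree for every 0 <= beta < 1.
    assert math.isclose(exact, doppler_ratio_alt(beta), rel_tol=1e-12)
    # The first-order approximation 1 - beta has an error of order beta^2.
    print(beta, exact, 1.0 - beta, abs(exact - (1.0 - beta)))
```

For small $\beta$ the last column shrinks roughly like $\beta^2/2$, which is why the frequency shift is well approximated by $-\Delta v/c$ when the velocity change during the photon's flight is small.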
other, so should they be inertial observers, then since the spacetime paths of the
beginning and tail ends of the photon wavelength should be related by a simple
time-translation (the gravitational field is time-independent, and only depends
on $y$), the $\Delta t$ measurements should be the same at the two different heights.
The fact that experiment shows otherwise indicates an incompatibility between
gravitation and the existence of such inertial observers.
2.2.3. Towards a geometric solution. Einstein made an argument [85; 84] that
the spacetime continuum in the presence of a gravitational field (related to
acceleration via the equivalence principle) should be “non-Euclidean” (i.e., non-
flat, so non-Minkowskian). As we will see, the argument might be construed
as more heuristic than precise. We present it briefly for historical reasons, and
hopefully the reader will get some utility from it without getting the wrong
impression.
Consider a frame of reference O which represents uniform rotation with
angular velocity ω > 0 with respect to an inertial frame of reference. We pick
the origin for coordinate charts adapted to the frames as the center of rotation,
synchronize the clocks for the two observers at the origin of coordinates, and
align the axes at t = 0, so that the x and y axes are in the plane of rotation.
One can conceive ways to make measurements, and thus build coordinates for
spacetime, adapted in some way to O. There is not a canonical way to do this,
though one might initially be inclined to think the following is such a way: relate
the coordinates $(t, x, y, z)$ in $O$ to the inertial coordinates $(\mathring{t}, \mathring{x}, \mathring{y}, \mathring{z})$ of an event
by the transformation (where $(x, y) = (r\cos\theta, r\sin\theta)$ for $r \ge 0$), $\mathring{t} = t$, $\mathring{z} = z$,
$$\mathring{x} = x\cos\omega t - y\sin\omega t = r\cos(\theta + \omega t), \qquad \mathring{y} = x\sin\omega t + y\cos\omega t = r\sin(\theta + \omega t).$$
For example, one might use radar (by sending and receiving signals to and from
an event), using the clock at the origin to read off the time, which then agrees in
both frames because there is no time dilation. One could also build coordinates
for spacetime by making measurements of events using measuring devices (a grid
of rods and clocks) associated to the observers at rest relative to O. See [161;
38; 162] for further discussion of coordinates adapted to accelerated observers.
Let $C_r$ be a circle centered at the origin, of radius $r$ as measured in the inertial
frame, with respect to which the length is $2\pi r$. An observer on $C_r$ at rest in $O$ is
moving along the circle $C_r$ with angular velocity $\omega$ relative to the inertial frame; of
course we require $\omega r < c$. We proceed following Einstein: relative to the inertial
frame, the length of a unit measuring stick in the frame $O$ oriented tangentially
along $C_r$ is shorter than one unit, by the factor $\sqrt{1 - (\omega r/c)^2}$. Thus, the length
$L(C_r)$ measured using rods adapted to $O$ is $2\pi r/\sqrt{1 - (\omega r/c)^2}$, which is more
than $2\pi r$: if $\omega r/c$ is small, then in $O$, $L(C_r) \approx 2\pi r\left(1 + \tfrac12(\omega r/c)^2\right)$. Another
way to think of this is that the observers at rest in O located along Cr place
non-overlapping measuring sticks along the circle to mark its circumference,
and one just counts how many of these are needed to go around the perimeter to
determine the length of the circle as measured in O. On the other hand, distances
perpendicular to the direction of motion agree in both frames. Thus both frames
agree on the radius r . It appears that from the point of view of the frame O, one
might conclude that the spatial geometry is curved. Indeed, recall the classical
formula for the Gauss curvature (Exercise 2-55), which applied to the above
analysis would yield nonzero (negative) curvature:
$$K(p) = \lim_{r\to 0} \frac{3}{\pi}\cdot\frac{2\pi r - L(C_r)}{r^3}.$$
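To see what value the curvature formula produces for the rotating-frame circumference, one can evaluate the difference quotient at small $r$. This is a minimal numerical sketch under the heuristic length formula $L(C_r) = 2\pi r/\sqrt{1 - (\omega r/c)^2}$ from the discussion above; the units ($c = 1$) and the value of $\omega$ are arbitrary choices for illustration. Expanding $L(C_r)$ to second order in $\omega r/c$ suggests the limit $-3\omega^2/c^2$.

```python
import math

def rotating_circumference(r, omega, c=1.0):
    """L(C_r) = 2 pi r / sqrt(1 - (omega r / c)^2), as measured with co-rotating rods."""
    return 2.0 * math.pi * r / math.sqrt(1.0 - (omega * r / c) ** 2)

def curvature_estimate(r, omega, c=1.0):
    """Difference quotient (3 / pi) * (2 pi r - L(C_r)) / r^3 at finite r."""
    L = rotating_circumference(r, omega, c)
    return (3.0 / math.pi) * (2.0 * math.pi * r - L) / r ** 3

omega = 0.5                      # illustrative angular velocity, with c = 1
for r in (0.5, 0.1, 0.01):
    print(r, curvature_estimate(r, omega), -3.0 * omega ** 2)
```

The estimates approach $-3\omega^2/c^2 = -0.75$ as $r$ decreases, consistent with the negative curvature mentioned in the text.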
While Einstein cited this intriguing argument as motivation for the introduction
of non-Euclidean geometry into the theory of gravitation, one must critique it
in various ways. Bearing in mind the relativity of simultaneity, for instance,
has the argument above really succeeded in showing there is an observer who
will measure the circumference and radius of a circle to be out of step with
Euclidean geometry? Or does the analysis just yield some sort of non-Euclidean
geometry on the set of worldlines of rotating observers? While the spacetime
interval between two events is invariant, one needs to consider carefully to what
extent one can define and compare the spatial length and radius of a circle in
the two frames, keeping in mind that the observers will not necessarily agree on
simultaneity. Clock rates will vary for rotating clocks depending on the location
relative to the center; the clock rates depend on r , so while the rates are the same
for observers at rest in O on each Cr (each of which we note has a different local
rest frame), we might ask to what extent the observers can agree on a set of events
to be deemed the disk spanned by Cr . Imagine too if the rotating frame is slowly
brought to rest relative to the inertial frame: what happens as it slows and the
length contraction factor goes to 1? There are in principle too many measuring
rods positioned around the circumference, so something would have to give.
This discussion is related to the breakdown of standard notions of rigidity in
relativity, and the thought experiment of Einstein is related to similar arguments
in discussions of Ehrenfest’s paradox concerning the fate of a rotating cylinder
in relativity. For more on this topic, see [110], for example, and for Kaluza’s
argument relating Ehrenfest’s paradox to hyperbolic geometry, see [132].
From a geometric point of view, if spacetime is a manifold equipped with
a Lorentzian metric, then spacetime geometry is either Minkowskian or not.
If spacetime were Minkowskian, then the events which comprise a circle at a
fixed time in an inertial frame would yield a spacelike curve ξ embedded in a
Euclidean three-space (and hence a Euclidean plane), so that the length of the
curve would follow the well-known formula for the circumference of a circle.
The length of the curve ξ is a geometric invariant; if another frame of reference
were used to build spacetime coordinates (e.g., coordinates somehow adapted to
an accelerating frame), the spacelike curve ξ might not be comprised of events
simultaneous in this frame, but its length would be invariant. The coordinate
measurements might need to be converted using metric components to obtain
truly invariant geometric distances; this is akin to using curvilinear coordinates
for the Euclidean plane. If on the other hand there is a region where the spacetime
geometry is curved, there is no frame from which one could build coordinates
for which the metric is everywhere identical to that of Minkowski spacetime in
inertial coordinates, a situation analogous to the familiar fact that one cannot
make maps (charts) of the Earth’s surface which are isometries (up to a constant
scale factor) between the geometry on the surface and the Euclidean geometry
of the planar map.
Similar comments apply to the gravitational redshift scenario above. For
the observers stationary in a gravitational field with time-independent potential
varying in y as above, we maintain that the paths of successive photon crests
should be congruent in a coordinate chart associated to either observer. The
conundrum about clock rates discussed above would persist if this observer were
an inertial observer, say, using the Minkowski metric in inertial coordinates to
compute spacetime intervals. On the other hand, if the metric components are
not the Minkowskian components in inertial coordinates, but rather components
for Minkowski spacetime in a non-inertial frame, or components for a metric
for which the geometry of spacetime is curved, then it is generally expected
that coordinate measurements do not give geometric invariants, and that one
would need to use the metric to compute the relevant invariant, and thus give the
physical quantity of interest (in this case a proper time interval).
2.2.4. Free fall and geodesics. Beyond the issues we have seen above regarding
the obstruction to the existence of observers which are at once inertial and
stationary in a gravitational field, another fundamental issue arises from the
universality of gravity. An inertial frame of reference is one in which test
particles upon which no net force acts move with constant velocity. Consider a
region where the only forces are electromagnetic in nature: neutral particles and
charged particles could in principle be distinguished by their relative motions.
In an inertial frame, the neutral particles would move with constant velocity
according to the law of inertia, while charged particles would move according to
the Lorentz force law.
identically, and as we will soon see, spacetime curvature produces tidal effects
ascribed to a gravitational field.
The universality of gravity has led to its incorporation into the structure of
spacetime at a fundamental level, determining inertial properties (motion of test
particles which experience no non-gravitational forces) through the geometry
of spacetime. Furthermore, as experiments continue to confirm that signals in
vacuum cannot travel at speeds faster than the speed of light, gravity plays a
distinguished role in determining the causal structure of spacetime through its
effect on paths of light rays. For a concrete geometric consequence, consider a
spacetime modeled on a Lorentzian manifold, and assert that light rays move
along null paths. The collection of lightcones determines the conformal class of
the spacetime metric g (and recall for comparison, the conformal compactification
of Minkowski spacetime as presented in the first chapter indeed preserves the
lightcone structure). Indeed, if $X$ is timelike and $Y \neq 0$ is spacelike at a point $p$,
then $g(X + aY, X + aY) = a^2 g(Y, Y) + 2a\,g(X, Y) + g(X, X)$, a quadratic
polynomial in $a$ with a positive leading coefficient and a negative constant term.
Hence there are exactly two real roots of this quadratic, which in principle we can
glean from the lightcone at $p$. The product of these roots gives $g(X, X)/g(Y, Y)$.
If V and W are any tangent vectors at p, then
Knowing the lightcone at p, then, allows us to find the ratio g(V, W )/g(X, X ),
since any of the terms on the right of the preceding equation, if nonzero, can be
gleaned in ratio with either g(X, X ) or g(Y, Y ).
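The root computation described above can be illustrated concretely. The sketch below works in Minkowski space with signature $(-,+,+,+)$ and $c = 1$; the particular vectors $X$ and $Y$ are hypothetical sample data. It finds the two values of $a$ for which $X + aY$ is null and confirms that their product is $g(X, X)/g(Y, Y)$.

```python
import math

def g(v, w):
    """Minkowski inner product with signature (-, +, +, +), c = 1."""
    return -v[0] * w[0] + sum(a * b for a, b in zip(v[1:], w[1:]))

def null_roots(X, Y):
    """Real roots a of g(X + aY, X + aY) = 0 for X timelike and Y spacelike."""
    A, B, C = g(Y, Y), 2.0 * g(X, Y), g(X, X)
    disc = math.sqrt(B * B - 4.0 * A * C)     # real because A > 0 > C
    return (-B - disc) / (2.0 * A), (-B + disc) / (2.0 * A)

X = (2.0, 0.5, 0.0, 1.0)    # timelike: g(X, X) = -4 + 0.25 + 1 = -2.75
Y = (0.5, 1.0, 2.0, 0.0)    # spacelike: g(Y, Y) = -0.25 + 1 + 4 = 4.75
a1, a2 = null_roots(X, Y)

for a in (a1, a2):
    Z = tuple(x + a * y for x, y in zip(X, Y))
    assert abs(g(Z, Z)) < 1e-12               # X + aY lies on the lightcone

# The product of the roots recovers the ratio g(X, X) / g(Y, Y).
assert math.isclose(a1 * a2, g(X, X) / g(Y, Y), rel_tol=1e-12)
print(a1, a2, a1 * a2)
```

Only the null directions enter the computation of `a1` and `a2`, which is the sense in which the lightcone alone determines the ratio.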
The question remains how to connect the geometry of spacetime to the distri-
bution of matter and energy, whose motion should be in part determined by the
curvature of spacetime and whose gravitational effects should in turn influence
the geometry of spacetime. An answer lies in the Einstein equation, to which we
now turn.
We begin with a quote from Einstein [83, p. 113]: “The laws of physics must
be of such a nature that they apply to systems of reference in any kind of
motion.” This principle of general relativity puts all frames of reference on
an equal footing, in contrast to the privileged inertial frames of reference of
special relativity. Together with the equivalence principle, that an accelerating
frame ought to be locally (and approximately) equivalent from the point of view
of physics to a frame in which there is a uniform gravitational field, we see
$g^{i\ell} R_{ijk\ell} = R_{kj}$, and the metric trace of the Ricci tensor is the scalar curvature
$R(g) = g^{ij} R_{ij}$.
for all vectors X , Y , and Z . The curvature tensor also satisfies a differential
Bianchi identity: for all vectors X , Y , Z , V, and W ,
On the last line we used the Jacobi identity, and in the line above that we used
the torsion-free property of the Levi-Civita connection,
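By symmetry-by-pairs, it remains only to prove the second differential Bianchi
identity above, for which it suffices to verify on a coordinate frame $\frac{\partial}{\partial x^i}$. In
$$\begin{aligned}
R_{ijk\ell;m}(p) &= \left.\frac{\partial}{\partial x^m}\right|_p \left\langle R\left(\frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}, \frac{\partial}{\partial x^k}\right), \frac{\partial}{\partial x^\ell}\right\rangle\\
&= \left\langle \nabla_{\frac{\partial}{\partial x^m}}\left(\nabla_{\frac{\partial}{\partial x^i}}\nabla_{\frac{\partial}{\partial x^j}}\frac{\partial}{\partial x^k} - \nabla_{\frac{\partial}{\partial x^j}}\nabla_{\frac{\partial}{\partial x^i}}\frac{\partial}{\partial x^k}\right), \frac{\partial}{\partial x^\ell}\right\rangle_p.
\end{aligned}$$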
By combining terms in pairs and using that $\left.\nabla_{\frac{\partial}{\partial x^i}}\frac{\partial}{\partial x^j}\right|_p = 0$ for all $i, j$, we find
= 0. □
Proof. We use the differential Bianchi identity, along with symmetries of the
curvature tensor, and the fact that $\nabla g = 0$, so that $g^{ij}{}_{;k} = 0$ for all $i, j, k$:
where $\Lambda$ is a constant, called the cosmological constant. Note that sometimes the
Einstein tensor refers only to the case $\Lambda = 0$ above, i.e., $G = \operatorname{Ric}(g) - \tfrac12 R(g)\,g$.
The Einstein tensor is divergence-free, as is any constant scalar multiple, and
thus provides a candidate for the stress-energy tensor of spacetime. In fact, it
is known that up to scalar multiple, $G_\Lambda$ is the only divergence-free symmetric
tensor whose coordinate expression is a function of the components $g_{\mu\nu}$ of the
metric tensor, along with their first and second partial derivatives. This result
was known to Cartan and Weyl in the special case that the tensor is quasilinear,
and the more general result was proved by Lovelock [148, p. 322].
From this result, if the Einstein equation should be as simple as possible, and
thus be second-order in the metric components, then it must take the form
$$G_\Lambda = \kappa T \tag{2.3.2}$$
for some constant κ that will be determined by the Newtonian limit, as we now
show.
2.3.1. The Newtonian limit. Consider a spacetime metric $g$ that is close to the
Minkowski metric $\eta$, in the sense that there are coordinates in which $g_{\mu\nu} =
\eta_{\mu\nu} + h_{\mu\nu}$, where $h_{\mu\nu}$ and its derivatives can be taken to be “small”, and $\eta_{\mu\nu}$
takes the standard inertial form. We assume that $g_{\mu\nu,0} = 0$ (or at least $g_{\mu\nu,0} \sim 0$;
see below), so that with $x^0 = ct$, the field is (approximately) time-independent
in these coordinates. We let $i$ and $j$ run over spatial indices ($i, j \neq 0$). Let the
trajectory of a slowly moving particle be modeled by a geodesic with coordinates
$x^\mu(\tau)$, parametrized by proper time $\tau$, so that $|dx^i/d\tau| \ll c\,dt/d\tau \approx c$. We
will expand to first order in $h$ (and derivatives of $h$) and $c^{-1}\,dx^i/d\tau$, and denote
expressions that are equal up to terms quadratic in these quantities (with bounded
coefficients) using “$\sim$”. Thus $dt/d\tau \sim 1$, so that $c^{-1}\,dx^i/dt \sim c^{-1}\,dx^i/d\tau$. Since
$g^{\mu\nu} = \eta^{\mu\nu} + O(h)$, we have
$$\Gamma^\mu_{00} = \tfrac12 g^{\mu\nu}(g_{\nu 0,0} + g_{0\nu,0} - g_{00,\nu}) \sim -\tfrac12 \eta^{\mu\nu} h_{00,\nu}. \tag{2.3.3}$$
Since $\Gamma^\mu_{\rho\sigma} = \tfrac12 g^{\mu\nu}(h_{\nu\sigma,\rho} + h_{\rho\nu,\sigma} - h_{\rho\sigma,\nu})$, the geodesic equation becomes
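$$\begin{aligned}
0 &= c^{-2}\left(\frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu_{\rho\sigma}\frac{dx^\rho}{d\tau}\frac{dx^\sigma}{d\tau}\right)\\
&\sim c^{-2}\left(\frac{d^2x^\mu}{d\tau^2} + \Gamma^\mu_{00}\left(\frac{dx^0}{d\tau}\right)^2\right) \sim c^{-2}\left(\frac{d^2x^\mu}{d\tau^2} + c^2\,\Gamma^\mu_{00}\right).
\end{aligned}$$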
The time component gives, using (2.3.3) and $x^0 = ct$,
$$c^{-1}\frac{d^2t}{d\tau^2} = c^{-2}\frac{d^2x^0}{d\tau^2} \sim -\Gamma^0_{00} \sim 0.$$
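For the spatial components we have
$$c^{-2}\frac{d^2x^i}{d\tau^2} = c^{-2}\frac{d}{d\tau}\left(\frac{dx^i}{dt}\frac{dt}{d\tau}\right) = c^{-2}\left(\frac{d^2x^i}{dt^2}\left(\frac{dt}{d\tau}\right)^2 + \frac{dx^i}{dt}\frac{d^2t}{d\tau^2}\right) \sim c^{-2}\frac{d^2x^i}{dt^2},$$
so that
$$c^{-2}\frac{d^2x^i}{dt^2} \sim c^{-2}\frac{d^2x^i}{d\tau^2} \sim -\Gamma^i_{00} = \tfrac12 h_{00,i}.$$
If we let $\Phi = -\tfrac12 c^2 h_{00}$, so that $g_{00} = -(1 + 2\Phi c^{-2})$, we recover the Newtonian
relation between acceleration and the gradient of the gravitational potential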
(2.1.1). We remark that this analysis can be interpreted in the case where $g$
is the Minkowski metric, and $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ gives the components of the
Minkowski metric in a (weakly) accelerating coordinate system, consistent with
the equivalence principle.
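To get a feel for how small $h_{00}$ is in a familiar setting, and to see the relation $d^2x^i/dt^2 \sim \tfrac12 c^2 h_{00,i}$ reproduce Newtonian acceleration, here is a minimal sketch. It assumes the point-mass potential $\Phi = -GM/r$ outside the Earth and uses rounded values of $G$, $c$, and the Earth's mass and radius; these constants and the function names are illustrative inputs, not quantities from the text.

```python
import math

G = 6.674e-11          # m^3 kg^-1 s^-2 (rounded)
C = 299_792_458.0      # m/s
M_EARTH = 5.972e24     # kg (rounded)
R_EARTH = 6.371e6      # m (rounded mean radius)

def phi(r):
    """Newtonian potential of a point mass, Phi = -G M / r."""
    return -G * M_EARTH / r

def h00(r):
    """Metric perturbation from Phi = -(1/2) c^2 h_00, i.e. h_00 = -2 Phi / c^2."""
    return -2.0 * phi(r) / C ** 2

def radial_acceleration(r, dr=1.0):
    """(1/2) c^2 d(h_00)/dr by central differences; this should equal -dPhi/dr."""
    return 0.5 * C ** 2 * (h00(r + dr) - h00(r - dr)) / (2.0 * dr)

print("h_00 at the surface:", h00(R_EARTH))
print("acceleration from h_00:", radial_acceleration(R_EARTH))
print("Newtonian -G M / R^2:", -G * M_EARTH / R_EARTH ** 2)
```

With these inputs $h_{00}$ comes out of order $10^{-9}$, which indicates how good the first-order expansion is near the Earth, and the acceleration computed from $h_{00}$ matches $-GM/R^2$ (about $-9.8\ \mathrm{m/s^2}$, directed inward).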
We now determine κ. To do this, we consider the stress-energy for a dust model.
For a perfect fluid, the pressure becomes important due to high random motion
of the particles, and we are assuming our particles are slowly moving. So we are
just considering the dust particles at rest in a given frame, without any pressure
forces between them, each with four-velocity $U$. Hence $T^{\mu\nu} = c^{-2}\rho\,U^\mu U^\nu$. We
again consider $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, where $g_{\mu\nu,0} = 0$ (or at least $g_{\mu\nu,0} \sim 0$). We
expand to first order in $h$ (and its derivatives), $U^i$ and $\rho$. Since $g_{\mu\nu}U^\mu U^\nu = -c^2$,
we have $\operatorname{tr}_g T = g_{\mu\nu}T^{\mu\nu} = -\rho$ and $g_{00}U^0U^0 \sim -c^2$, as well as
and so
$$\tfrac12\kappa\rho = R_{00} = R^{i}_{i00} \sim -\tfrac12\Delta(h_{00}) = c^{-2}\Delta\Phi.$$
To compare with the Newtonian limit, we convert the energy density to mass
density, $\sigma = c^{-2}\rho$, to obtain $\Delta\Phi = \tfrac12\kappa c^4\sigma$. Thus from (2.1.3) we get (in
spacetime dimension four) $\tfrac12\kappa c^4 = 4\pi G$, or
$$\kappa = \frac{8\pi G}{c^4}. \tag{2.3.4}$$
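For orientation, the coupling constant in (2.3.4) can be evaluated in SI units. The sketch below uses rounded values of $G$ and $c$ (assumed inputs) and also confirms the intermediate relation $\tfrac12\kappa c^4 = 4\pi G$ used to fix $\kappa$.

```python
import math

G = 6.674e-11          # m^3 kg^-1 s^-2 (rounded)
C = 299_792_458.0      # m/s

# Einstein's constant kappa = 8 pi G / c^4, in s^2 kg^-1 m^-1.
kappa = 8.0 * math.pi * G / C ** 4

# Consistency with Poisson's equation: (1/2) kappa c^4 should equal 4 pi G.
assert math.isclose(0.5 * kappa * C ** 4, 4.0 * math.pi * G, rel_tol=1e-12)
print(kappa)
```

The result is of order $10^{-43}$ in these units, which reflects how much energy density is needed to produce appreciable curvature.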
2.3.2. Energy conditions. Without the imposition of additional conditions on T ,
the Einstein equation does not impose any restrictions on a metric g, since
the Einstein tensor is always symmetric and divergence-free. We note here
some conditions often imposed on T based on physically reasonable energy
considerations.
We rewrite the Einstein equation (2.3.2) as follows. From (2.3.1) and (2.3.4)
we have $\operatorname{Ric}(g) - \tfrac12 R(g)\,g + \Lambda g = \kappa T$; take the trace to obtain (in spacetime
In the vacuum case (T = 0) the Einstein equation becomes Ric(g) = 3g; a metric
satisfying an equation of this form is an Einstein metric. The vacuum Einstein
equation commonly refers to Ric(g) = 0, which holds when T = 0 and 3 = 0.
We now introduce several conditions coming from physical notions that are
sometimes imposed on T , some of which will appear in later chapters. The
weak energy condition is that T (ξ, ξ ) ≥ 0 for all timelike ξ . If c = 1, say, then
unit timelike vectors U correspond to (instantaneous) physical observers, and
$T(U, U)$ is the energy density as measured by such an observer. The strong
energy condition is that for all unit timelike $U$, $T(U, U) \ge -\tfrac12 \operatorname{tr}_g T$. From (2.3.5),
we see this is equivalent when $\Lambda = 0$ to the timelike convergence condition
$\operatorname{Ric}(\xi, \xi) \ge 0$ for all timelike $\xi$; replacing timelike $\xi$ by null $\xi$, we have the null
energy condition. In a time-oriented Lorentz manifold (i.e., if the manifold admits
a smooth timelike vector field that can be used to give a smooth assignment of a
future timecone in the tangent space at each point), we define the dominant energy
condition that for all future-directed timelike $\xi$, the vector given by $-T^{a}{}_{b}\,\xi^{b}$ is
future-directed causal, or in other words, for all future-directed timelike (causal)
$\xi$ and $\chi$, we have $T(\xi, \chi) \ge 0$. The dominant energy condition clearly implies
the weak energy condition.
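The weak and strong conditions can be probed numerically for a simple matter model. The sketch below is an illustration under stated assumptions, not part of the text: it takes $c = 1$, a perfect fluid whose stress-energy tensor is $\operatorname{diag}(\rho, p, p, p)$ in an orthonormal rest frame, and it only samples unit timelike vectors $U = (\cosh\chi, \sinh\chi, 0, 0)$ over a finite set of rapidities $\chi$, so a `True` result is evidence rather than proof. All function names are hypothetical.

```python
import math


def fluid_energy(rho: float, p: float, chi: float) -> float:
    """T(U, U) for U = (cosh chi, sinh chi, 0, 0), with T = diag(rho, p, p, p) and c = 1."""
    return rho * math.cosh(chi) ** 2 + p * math.sinh(chi) ** 2


def trace_T(rho: float, p: float) -> float:
    """tr_g T = -rho + 3p in signature (-, +, +, +)."""
    return -rho + 3.0 * p


def weak_ok(rho: float, p: float, rapidities) -> bool:
    """Sampled check of T(U, U) >= 0."""
    return all(fluid_energy(rho, p, chi) >= 0.0 for chi in rapidities)


def strong_ok(rho: float, p: float, rapidities) -> bool:
    """Sampled check of T(U, U) >= -(1/2) tr_g T."""
    return all(
        fluid_energy(rho, p, chi) >= -0.5 * trace_T(rho, p) for chi in rapidities
    )


rapidities = [0.25 * k for k in range(-20, 21)]

print(weak_ok(1.0, 0.0, rapidities), strong_ok(1.0, 0.0, rapidities))    # dust
print(weak_ok(1.0, -1.0, rapidities), strong_ok(1.0, -1.0, rapidities))  # p = -rho
```

Dust passes both sampled checks, while the choice $p = -\rho$ passes the weak check and fails the strong one already at $\chi = 0$.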
possibly V (t) is defined along an immersed curve γ (t), so that V can be locally
extended and the identity holds (compare to the case γ (t) = p0 is constant, while
V (t) ∈ Tp0 M has a nonzero derivative).
With this in mind, we find (using the symmetry of the Christoffel symbols at
the last step, and with the summation convention running over all indices)
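\[
\begin{aligned}
\frac{D}{\partial t}\frac{\partial f}{\partial s}
&= \frac{D}{\partial t}\Big(\frac{\partial f^k}{\partial s}\frac{\partial}{\partial x^k}\Big) \\
&= \frac{\partial^2 f^k}{\partial t\,\partial s}\frac{\partial}{\partial x^k}
 + \frac{\partial f^k}{\partial s}\,\nabla_{\frac{\partial f}{\partial t}}\frac{\partial}{\partial x^k} \\
&= \frac{\partial^2 f^k}{\partial t\,\partial s}\frac{\partial}{\partial x^k}
 + \frac{\partial f^k}{\partial s}\frac{\partial f^\ell}{\partial t}\,\nabla_{\frac{\partial}{\partial x^\ell}}\frac{\partial}{\partial x^k} \\
&= \Big(\frac{\partial^2 f^m}{\partial t\,\partial s}
 + \frac{\partial f^k}{\partial s}\frac{\partial f^\ell}{\partial t}\,\Gamma^m_{\ell k}\Big)\frac{\partial}{\partial x^m} \\
&= \frac{D}{\partial s}\frac{\partial f}{\partial t}.
\end{aligned}
\tag{2.3.8}
\]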
As we did not use the geodesic equation, this identity holds for general f (t, s).
For a smooth vector field W (t, s) along the map f , we have similarly,
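\[
\begin{aligned}
\frac{D}{\partial t}\frac{DW}{\partial s}
&= \frac{D}{\partial t}\Big(\frac{\partial W^k}{\partial s}\frac{\partial}{\partial x^k}
 + W^k\,\nabla_{\frac{\partial f}{\partial s}}\frac{\partial}{\partial x^k}\Big) \\
&= \frac{D}{\partial t}\Big(\frac{\partial W^k}{\partial s}\frac{\partial}{\partial x^k}
 + W^k\frac{\partial f^\ell}{\partial s}\,\nabla_{\frac{\partial}{\partial x^\ell}}\frac{\partial}{\partial x^k}\Big) \\
&= \frac{\partial^2 W^k}{\partial t\,\partial s}\frac{\partial}{\partial x^k}
 + \frac{\partial W^k}{\partial s}\,\nabla_{\frac{\partial f}{\partial t}}\frac{\partial}{\partial x^k}
 + \frac{\partial W^k}{\partial t}\,\nabla_{\frac{\partial f}{\partial s}}\frac{\partial}{\partial x^k}
 + W^k\frac{\partial^2 f^\ell}{\partial t\,\partial s}\,\nabla_{\frac{\partial}{\partial x^\ell}}\frac{\partial}{\partial x^k} \\
&\quad + W^k\frac{\partial f^\ell}{\partial s}\frac{\partial f^j}{\partial t}\,
 \nabla_{\frac{\partial}{\partial x^j}}\nabla_{\frac{\partial}{\partial x^\ell}}\frac{\partial}{\partial x^k}.
\end{aligned}
\]
Thus we see that
\[
\frac{D}{\partial t}\frac{DW}{\partial s} - \frac{D}{\partial s}\frac{DW}{\partial t}
= W^k\frac{\partial f^\ell}{\partial s}\frac{\partial f^j}{\partial t}
\Big(\nabla_{\frac{\partial}{\partial x^j}}\nabla_{\frac{\partial}{\partial x^\ell}}\frac{\partial}{\partial x^k}
 - \nabla_{\frac{\partial}{\partial x^\ell}}\nabla_{\frac{\partial}{\partial x^j}}\frac{\partial}{\partial x^k}\Big).
\]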
Proof. For any (τ, b), with b = (b1 , b2 , b3 ), consider the curve defined via the
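exponential map as $\beta(s) = \exp_{\gamma(\tau)}(s b^i e_i(\tau)) = \varphi(\tau, sb)$. Let $b^0 = 0$. Then
$\beta^0(s) = c\tau$ and $\beta^k(s) = s b^k$ for $k = 1, 2, 3$. By definition, $\beta$ is a geodesic. Since
$d^2\beta^\mu/ds^2 = 0$ for $\mu = 0, 1, 2, 3$, we have
\[
0 = \Gamma^\mu_{\nu\sigma}(c\tau, sb)\,\frac{d\beta^\nu}{ds}\frac{d\beta^\sigma}{ds}
 = \Gamma^\mu_{\nu\sigma}(c\tau, sb)\, b^\nu b^\sigma
 = \Gamma^\mu_{ij}(c\tau, sb)\, b^i b^j.
\]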
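\[
0 = \frac{d}{d\tau}\big(e_\ell^\mu(\tau)\big) + \Gamma^\mu_{\nu\sigma}(c\tau, 0)\, e_\ell^\nu(\tau)\,\frac{d\gamma^\sigma}{d\tau}
 = \Gamma^\mu_{\nu\sigma}(c\tau, 0)\,\delta^\nu_\ell\,\frac{d\gamma^\sigma}{d\tau}
 = c\,\Gamma^\mu_{\ell 0}(c\tau, 0),
\]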
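Proof. Along $\gamma$ the Christoffel symbols vanish and hence $\partial\Gamma^\mu_{\nu\sigma}/\partial\tau = 0$, so that
$R^\mu{}_{000} = 0 = \partial\Gamma^\mu_{00}/\partial x^0$ along $\gamma$, while for $j = 1, 2, 3$,
\[
\begin{aligned}
R^\mu{}_{j00}\,\frac{\partial}{\partial x^\mu}
&= \nabla_{e_j(\tau)}\nabla_{\frac{\partial}{\partial x^0}}\frac{\partial}{\partial x^0}
 - \nabla_{c^{-1}\gamma'(\tau)}\nabla_{\frac{\partial}{\partial x^j}}\frac{\partial}{\partial x^0} \\
&= \nabla_{e_j(\tau)}\Big(\Gamma^\mu_{00}\frac{\partial}{\partial x^\mu}\Big)
 - c^{-1}\,\nabla_{\gamma'(\tau)}\Big(\Gamma^\mu_{j0}\frac{\partial}{\partial x^\mu}\Big) \\
&= \frac{\partial\Gamma^\mu_{00}}{\partial x^j}\frac{\partial}{\partial x^\mu}.
\end{aligned}
\]
Moreover, since the Christoffel symbols vanish along $\gamma$, so do the first partials
of $g_{\mu\nu}$ and $g^{\mu\nu}$. Thus, along $\gamma$, we have
\[
\frac{\partial\Gamma^k_{00}}{\partial x^j}
 = \frac{\partial}{\partial x^j}\Big(\tfrac12 g^{k\sigma}\big(2g_{0\sigma,0} - g_{00,\sigma}\big)\Big)
 = -\tfrac12\,\delta^{km} g_{00,mj} = -\tfrac12\, g_{00,kj}.
\]
Similarly, $\partial\Gamma^0_{00}/\partial x^j = -\tfrac12\, g_{00,0j} = 0$. □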
Now, as we noted above, the analogue of the matrix $\dfrac{\partial^2\Phi}{\partial x^j\,\partial x^k}$ is
\[
c^2 R^{k}{}_{j00} = -\tfrac12\, c^2 g_{00,jk}.
\]
thus includes both the Lorentzian and Riemannian cases.) We assume that M is
compact, or more generally that $R(g) \in L^1(M, dv_g)$. We want to compute the
first variation of R and the associated Euler–Lagrange equation.
We will need Cramer’s rule. If A = (Ai j ) is an n × n matrix, we let Mi j be
the determinant of the (n −1) × (n −1) minor obtained by deleting row i and
indeed, we can interpret this sum as the determinant of the matrix $\tilde A$ obtained
by replacing column $i$ of $A$ by column $j$ of $A$. The minors $M_{ki}$ are obtained by
crossing out column $i$, so $M_{ki}$ are the same for $A$ and $\tilde A$. But $\det \tilde A = 0$ since
$\tilde A$ has two equal columns. In summary we arrive at Cramer's rule: If $I_n$ is the
$n \times n$ identity matrix, then $A^{\mathrm{adj}} \cdot A = (\det A) I_n$. If $A$ is invertible, then
\[
A^{-1} = \frac{1}{\det A}\, A^{\mathrm{adj}}.
\]
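Cramer's rule in this form is easy to check on a small example. The following sketch builds the adjugate from signed minors, using the convention $A^{\mathrm{adj}}_{ji} = (-1)^{i+j} M_{ij}$ from the text, and verifies $A^{\mathrm{adj}} \cdot A = (\det A) I_n$ exactly with integer arithmetic. The helper names and the sample matrix are illustrative choices, and the recursive Laplace expansion is only sensible for small $n$.

```python
from fractions import Fraction


def minor(A, i, j):
    """Matrix obtained from A by deleting row i and column j."""
    n = len(A)
    return [[A[r][c] for c in range(n) if c != j] for r in range(n) if r != i]


def det(A):
    """Determinant by Laplace expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))


def adjugate(A):
    """Adjugate: entry (j, i) equals (-1)^(i+j) times the minor M_ij."""
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, i, j)) for i in range(n)] for j in range(n)]


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
d = det(A)
adjA = adjugate(A)
product = matmul(adjA, A)            # should equal d times the identity
inverse = [[Fraction(x, d) for x in row] for row in adjA]

print(d)
print(product)
```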
Now we turn to variational formulae.
Lemma 2-6. If A(t) is a smooth path of n × n matrices,
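\[
\frac{d}{dt}\det A(t) = \operatorname{tr}\big(A^{\mathrm{adj}}(t) \cdot A'(t)\big).
\]
If $A(t)$ is invertible,
\[
\frac{d}{dt}\log|\det A(t)| = \operatorname{tr}\big(A^{-1}(t) \cdot A'(t)\big)
\]
and
\[
\frac{d}{dt}\sqrt{|\det A(t)|} = \tfrac12\sqrt{|\det A(t)|}\;\operatorname{tr}\big(A^{-1}(t) \cdot A'(t)\big). \tag{2.3.10}
\]
Proof. If we consider $\det A$ as a function of $(A_{ij}) \in \mathbb{R}^{n^2}$, then
\[
\frac{\partial \det A}{\partial A_{ij}} = (-1)^{i+j} M_{ij} = A^{\mathrm{adj}}_{ji}.
\]
By the chain rule,
\[
\frac{d}{dt}\det A(t) = \sum_{i,j=1}^{n} \frac{\partial \det A}{\partial A_{ij}} \cdot \frac{\partial A_{ij}}{\partial t}.
\]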
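The first formula of Lemma 2-6 can be sanity-checked against a finite difference. The sketch below uses a hypothetical smooth path of $2 \times 2$ matrices chosen only for illustration, together with the explicit $2 \times 2$ adjugate; it compares a central difference quotient of $\det A(t)$ with $\operatorname{tr}(A^{\mathrm{adj}}(t) \cdot A'(t))$.

```python
import math


def A(t):
    """A hypothetical smooth path of 2x2 matrices (illustrative choice)."""
    return [[1.0 + t, t * t], [math.sin(t), 2.0 + t]]


def dA(t):
    """Entrywise derivative of A(t)."""
    return [[1.0, 2.0 * t], [math.cos(t), 1.0]]


def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def adj2(M):
    """Adjugate of a 2x2 matrix."""
    return [[M[1][1], -M[0][1]], [-M[1][0], M[0][0]]]


def trace_of_product(X, Y):
    """tr(X . Y) for 2x2 matrices."""
    return sum(X[i][k] * Y[k][i] for i in range(2) for k in range(2))


t0, eps = 0.7, 1e-6
finite_difference = (det2(A(t0 + eps)) - det2(A(t0 - eps))) / (2.0 * eps)
jacobi = trace_of_product(adj2(A(t0)), dA(t0))

print(finite_difference, jacobi)
```

The two printed numbers should agree up to the truncation and rounding error of the difference quotient.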
You might then argue that $\delta\Gamma^k_{ij} = \tfrac12 g^{km}(h_{mj;i} + h_{im;j} - h_{ij;m})$, and observe that
the variation of the Ricci tensor is given by $\frac{d}{dt}\big|_{t=0} R_{ij} = (\delta\Gamma)^k_{ij;k} - (\delta\Gamma)^k_{ik;j}$.
We now derive the Euler–Lagrange equation for the Einstein–Hilbert action.
We will vary the metric g in the direction of a symmetric (0, 2)-tensor h. We
will take h to be compactly supported, so we can make sense out of the variation
for a given h even in case R(g) fails to be integrable, by integrating only over
the support of h, where the metric g is actually changing.
Theorem 2-9. The first variation of the Einstein–Hilbert action (total scalar
curvature functional) $\mathcal{R}$ is given by
\[
\frac{d}{dt}\Big|_{t=0} \mathcal{R}(g + th) = -\int_M h \cdot \big(\operatorname{Ric}(g) - \tfrac12 R(g)\, g\big)\, dv_g
\]
for all compactly supported tensors $h$ (vanishing near the boundary $\partial M$ if $\partial M$
is nonempty). Thus the Euler–Lagrange equation is $\operatorname{Ric}(g) - \tfrac12 R(g) g = 0$. This
equation is satisfied on all two-dimensional manifolds $(M, g)$. For $n = \dim M \ge 3$,
the Euler–Lagrange equation is equivalent to $\operatorname{Ric}(g) = 0$.
Integrating by parts and observing that boundary terms vanish by the choice of h,
we obtain
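\[
\begin{aligned}
\frac{d}{dt}\Big|_{t=0} \mathcal{R}(g + th)
&= \int_M L_g h \, dv_g + \int_M R(g) \cdot \tfrac12 (\operatorname{tr}_g h)\, dv_g \\
&= \int_M \Big[\big(-\Delta_g(\operatorname{tr}_g h) + \operatorname{div}_g \operatorname{div}_g h - h \cdot \operatorname{Ric}(g)\big) + \tfrac12 R(g)\, g \cdot h\Big]\, dv_g \\
&= -\int_M h \cdot \big(\operatorname{Ric}(g) - \tfrac12 R(g)\, g\big)\, dv_g.
\end{aligned}
\]
If $M$ is closed, we let $h = \operatorname{Ric}(g) - \tfrac12 R(g)\, g$ to finish the proof. In any case, the
preceding equation holds for all $h$ in a dense subset of $L^2(M, dv_g)$, so we see
that we must have $\operatorname{Ric}(g) - \tfrac12 R(g)\, g = 0$.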
If M is two-dimensional and {e1 , e2 } is an orthonormal basis of T p M, say
⟨e1 , e2 ⟩ = 0, ⟨e1 , e1 ⟩ = ϵ1 = ±1, and ⟨e2 , e2 ⟩ = ϵ2 = 1, then
Exercise 2-13. Prove the lemma. You might recall the Bianchi identity; compare
Corollary 2-2.
For any constant $\Lambda$, we can also consider $\mathcal{R}_\Lambda(g) = \int_M (R(g) - 2\Lambda)\, dv_g$. The
formula (2.3.12) for the variation of the volume element $dv_g$ easily gives this:
\[
\operatorname{Ric}(g) - \tfrac12 R(g) g + \Lambda g = 0.
\]
From here we can derive field equations (sometimes called equations of motion)
for the matter fields. Instead of introducing new notation, we illustrate with
examples.
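Example 2-16. Consider $g = \eta$, the Minkowski metric on $\mathbb{M}^{1+k} = \mathbb{R}^{1+k}_1$, and
work with global inertial coordinates $(t, x^i)$. Then for any smooth function $\psi$,
\[
\eta(d\psi, d\psi) = -c^{-2}\Big(\frac{\partial\psi}{\partial t}\Big)^2 + \sum_{i=1}^{k}\Big(\frac{\partial\psi}{\partial x^i}\Big)^2.
\]
If the action integrand is $L_m = -\tfrac12 \eta(d\psi, d\psi)$, then for any smooth compactly
supported $\phi$, the field equation is obtained from
\[
\begin{aligned}
0 &= \frac{d}{dt}\Big|_{t=0} \int_M -\tfrac12\, \eta(d\psi + t\,d\phi,\, d\psi + t\,d\phi)\, dt\, dx^1 \cdots dx^k \\
&= \int_M -\eta(d\psi, d\phi)\, dt\, dx^1 \cdots dx^k = \int_M \phi\, \Box\psi\, dt\, dx^1 \cdots dx^k
\end{aligned}
\]
where
\[
\Box = -c^{-2}\Big(\frac{\partial}{\partial t}\Big)^2 + \sum_{i=1}^{k}\Big(\frac{\partial}{\partial x^i}\Big)^2
\]
is the wave operator in Minkowski spacetime. Since this must hold for all
appropriate $\phi$, the field equation is the wave equation $\Box\psi = 0$.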
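A travelling wave gives a concrete solution of this field equation. As a small numerical illustration in one space dimension ($k = 1$), the sketch below applies a second-order finite-difference approximation of the wave operator to $\psi(t, x) = \sin(K(x - Ct))$. The values of the wave speed `C` and wave number `K`, the sample point, and the step size are arbitrary choices made here for illustration.

```python
import math

C = 2.0   # wave speed, playing the role of c (illustrative value)
K = 1.5   # wave number (illustrative value)


def psi(t: float, x: float) -> float:
    """A right-moving plane wave."""
    return math.sin(K * (x - C * t))


def box(f, t: float, x: float, h: float = 1e-3) -> float:
    """Finite-difference approximation of -c^-2 f_tt + f_xx at (t, x)."""
    f_tt = (f(t + h, x) - 2.0 * f(t, x) + f(t - h, x)) / h**2
    f_xx = (f(t, x + h) - 2.0 * f(t, x) + f(t, x - h)) / h**2
    return -f_tt / C**2 + f_xx


residual = box(psi, 0.3, 0.8)
print(residual)  # small, limited by the O(h^2) truncation error
```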
We remark that the Lagrangian density in the Einstein–Hilbert action for the
gravitational field involves second-order derivatives of the metric, as does the
corresponding Euler–Lagrange equation (the Einstein equation), whereas for the
preceding two examples, the Lagrangian density is first-order in ψ, while the
Euler–Lagrange equation is second-order in ψ.
Thus we see $T_{ab} = \psi_{;a}\psi_{;b} - \tfrac12\big(\eta(d\psi, d\psi) + m^2\psi^2\big)\eta_{ab}$. There is an analogue
on any spacetime $(M, g)$: $T_{ab} = \psi_{;a}\psi_{;b} - \tfrac12\big(\psi_{;c}\psi_{;d}\, g^{cd} + m^2\psi^2\big) g_{ab}$.
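Example 2-20. For the source-free Maxwell field $L_m = -\dfrac{1}{16\pi} F_{ab} F_{cd}\, g^{ac} g^{bd}$,
the method above yields
\[
\begin{aligned}
\int_M \langle h, T\rangle_g \, dv_g
&= -\frac{1}{8\pi}\int_M F_{ab}F_{cd}\Big(-g^{ai}h_{ij}g^{jc}g^{bd} - g^{ac}g^{bi}h_{ij}g^{jd} + \tfrac12 h_{ij}g^{ij}g^{ac}g^{bd}\Big)\, dv_g \\
&= \frac{1}{8\pi}\int_M h_{ij}\,F_{ab}F_{cd}\Big(g^{ai}g^{jc}g^{bd} + g^{ac}g^{bi}g^{jd} - \tfrac12 g^{ac}g^{bd}g^{ij}\Big)\, dv_g \\
&= \frac{1}{4\pi}\int_M h_{ij}\,F_{ab}F_{cd}\Big(g^{bd}g^{ai}g^{jc} - \tfrac14 g^{ac}g^{bd}g^{ij}\Big)\, dv_g
\end{aligned}
\]
where in the last step we used the antisymmetry of $F_{ab}$. Thus
\[
T_{ab} = \frac{1}{4\pi}\Big(F_{ac}F_{bd}\, g^{cd} - \tfrac14 F_{ij}F_{s\ell}\, g^{is}g^{j\ell}\, g_{ab}\Big).
\]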
One can derive the Einstein equation starting from the ansatz that the action
for the combination of gravitation with the fields is simply the sum of a constant
multiple of $\mathcal{R}_\Lambda$ with the action for the matter. In particular, we are assuming that
the fields do not themselves appear in the Lagrangian for the gravitational field.
With $\kappa$ a positive constant as above (where $\kappa = 8\pi G/c^4$ in spacetime dimension
four by the Newtonian limit), we consider the action
\[
\mathcal{S}(g, \Psi) = \int_M \Big[\frac{1}{2\kappa}\big(R(g) - 2\Lambda\big) + L_m(g, \Psi)\Big]\, dv_g. \tag{2.3.15}
\]
Since this holds for any compactly supported X , we see that T must be divergence-
free as desired.
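Proof. Let $g_t = g + th$, and let $\bar g_t = (\operatorname{Vol} g_t)^{-2/n} g_t \in \mathcal{M}_1$. Then $\mathcal{R}(\bar g_t) =
\mathcal{R}(g_t)/(\operatorname{Vol} g_t)^{1-\frac2n}$. We compute, using $\frac{d}{dt}\big|_{t=0}\operatorname{Vol} g_t = \int_M \tfrac12(\operatorname{tr}_g h)\, dv_g$,
\[
\begin{aligned}
\frac{d}{dt}\Big|_{t=0}\mathcal{R}(\bar g_t)
&= -\frac{\int_M h \cdot \big(\operatorname{Ric}(g) - \tfrac12 R(g)\, g\big)\, dv_g}{(\operatorname{Vol} g)^{1-\frac2n}}
 - \Big(1 - \frac{2}{n}\Big)\frac{\mathcal{R}(g)}{(\operatorname{Vol} g)^{2-\frac2n}} \int_M \tfrac12 (\operatorname{tr}_g h)\, dv_g \\
&= -\frac{1}{(\operatorname{Vol} g)^{1-\frac2n}} \int_M h \cdot \Big(\operatorname{Ric}(g) - \tfrac12 R(g)\, g + \frac{n-2}{2n}(\operatorname{Vol} g)^{-1}\mathcal{R}(g)\, g\Big)\, dv_g.
\end{aligned}
\]
For this to vanish for all symmetric $(0, 2)$-tensors $h$, we must have
\[
\operatorname{Ric}(g) - \frac12 R(g)\, g + \frac{n-2}{2n}(\operatorname{Vol} g)^{-1}\mathcal{R}(g)\, g = 0.
\]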
We could further constrain the variation so that the metrics under consideration
are not only of unit volume, but pointwise conformal to g, that is, expressible as
f · g, for f > 0 a smooth function on M. One can partition M into equivalence
classes [g] of pointwise conformal metrics. As an immediate corollary of the
proof of Proposition 2-22 we have the following.
Proposition 2-25. A metric g is critical for R amongst variations gt ∈ [g] if and
only if R(g) is constant. A metric g ∈ M1 is critical for R amongst variations
gt ∈ M1 ∩ [g] if and only if R(g) is constant.
Proof. Let gt = f t g, t ∈ I , be smooth in I × M, with f 0 = 1. Then f t is smooth,
and h = dgt /dt = (d f t /dt)g. Let ψ = d f t /dt |t=0 , which can be any smooth
1 n −2
Z
−1
= · ψ R(g) − (Vol g) (g) dvg .
R
2
(Vol g)1− n n M
Since this must vanish for all $\psi$, we must have $R(g) = (\operatorname{Vol} g)^{-1}\mathcal{R}(g)$, and
conversely. □
[Figure 4. Connected sum construction for the proof of the Gauss–Bonnet theorem. Labels in the figure: $D'$, $D$, $C'$, $C$, $\Sigma \setminus U$, $\setminus V$, and a cylindrical region.]
For a sphere, we take the metric of the unit (round) sphere, K = 1 and
Vol g = 4π. The Euler characteristic of the sphere is 2, as mentioned.
For a torus, we have Euler characteristic 0, and we can use a flat metric on
the torus to compute R(g).
To progress to the higher-genus case, we use an auxiliary construction. By
smoothly capping off an end of a circular cylinder, we obtain a surface D which
is topologically a closed disk, with a cylindrical collar neighborhood of its
boundary, which is a circular geodesic on the cylinder. Reflect the surface D
across the plane of the geodesic circle to produce a congruent surface D ′ , and
let S = D ∪ D ′ , which is smooth with spherical topology, and induced metric g.
Thus
\[
\int_D K(g)\, dv_g = \frac12 \int_S K(g)\, dv_g = 2\pi. \tag{2.3.16}
\]
Observe that (2.3.16) is independent of the precise geometry of the smooth cap,
as well as the size of the cylindrical portion ($C$ in Figure 4), where $K = 0$.
Given a closed surface $\Sigma$, let $\Sigma'$ be the connected sum of $\Sigma$ and a torus $T$. We
can readily put a metric $g$ on $\Sigma$ and $g_0$ on $T$, so that there are neighborhoods $U \subset
\Sigma$ and $V \subset T$, each isometric to the interior of the surface $D$ as above, and whose
complements $\Sigma \setminus U$ and $T \setminus V$ each possess a cylindrical collar neighborhood of
their boundary circle. We can represent $\Sigma'$ as the union $(\Sigma \setminus U) \cup (T \setminus V)$ along
the common boundary circle, and take the metric $g'$ coinciding with the metrics
on each piece, noting that $g'$ is smooth by construction. If the Gauss–Bonnet
theorem holds for $\Sigma$, then by (2.3.16)
\[
\int_{\Sigma'} K(g')\, dv_{g'} = \int_\Sigma K(g)\, dv_g + \int_T K(g_0)\, dv_{g_0} - 4\pi = 2\pi\big(\chi(\Sigma) - 2\big).
\]
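The inductive step can be turned into a short bookkeeping routine. The sketch below starts from the round sphere (total curvature $4\pi$) and repeatedly forms the connected sum with a flat torus, each time removing two caps that carry $2\pi$ apiece by (2.3.16). It then compares the result with $2\pi\chi$, where the formula $\chi = 2 - 2\gamma$ for a closed orientable surface of genus $\gamma$ is a standard fact assumed here rather than taken from the text. The function names are illustrative.

```python
import math


def total_curvature(genus: int) -> float:
    """Integral of K over a closed orientable surface of the given genus,
    built from a sphere by successive connected sums with flat tori."""
    total = 4.0 * math.pi                    # unit round sphere: K = 1, area 4 pi
    for _ in range(genus):
        torus_contribution = 0.0             # flat torus: K = 0
        removed_caps = 2.0 * (2.0 * math.pi)  # two copies of D, each 2 pi by (2.3.16)
        total = total + torus_contribution - removed_caps
    return total


def euler_characteristic(genus: int) -> int:
    """Assumed standard formula: chi = 2 - 2 * genus."""
    return 2 - 2 * genus


for genus in range(4):
    print(genus, total_curvature(genus), 2.0 * math.pi * euler_characteristic(genus))
```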
In this section we consider some examples of spacetimes and the form of the
Einstein equation they satisfy.
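2.4.1. Constant curvature spacetimes. Let $\mathbb{R}^n_\nu$, for $0 \le \nu \le n$, denote the semi-
Riemannian manifold $\mathbb{R}^n$ with the metric $\langle \partial/\partial x^i, \partial/\partial x^j \rangle = \epsilon_i \delta_{ij}$, where $\epsilon_i = -1$
for $1 \le i \le \nu$ and $\epsilon_i = 1$ for $\nu + 1 \le i \le n$. We define the following manifolds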
as level sets of certain quadratic polynomials, with the metric induced from the
indicated inclusions: for n ≥ 2 and r > 0, we let
equation with $T = 0$ and $\Lambda = \tfrac14 R(g)$. Thus $\operatorname{Ric}(g) = \Lambda g$, and in the examples
2.4.2. The Einstein static universe. Consider the space $\mathbb{R} \times \mathbb{S}^3$ with the product
metric $g = -c^2 dt^2 + \mathring{g}_{\mathbb{S}^3}$, where $\mathring{g}_{\mathbb{S}^3}$ is the unit round metric on the three-sphere.
Exercise 2-28. Show that the Ricci curvature of this metric is $\operatorname{Ric}(g) = 2\mathring{g}_{\mathbb{S}^3}$.
From this we see that $R(g) = 6$, so the Einstein tensor is $\operatorname{Ric}(g) - \tfrac12 R(g)\, g =
3c^2 dt^2 - \mathring{g}_{\mathbb{S}^3}$, and $G_\Lambda(g) = (3 - \Lambda)c^2 dt^2 + (\Lambda - 1)\mathring{g}_{\mathbb{S}^3}$. We identify this with the
stress-energy tensor of a perfect fluid, with fluid velocity $U = \partial/\partial t$, density $\rho$ and
pressure $p$. If we write $T$ as a $(0, 2)$-tensor, we have $T = (\rho + p)c^2 dt^2 + p\,g =
\rho c^2 dt^2 + p\,\mathring{g}_{\mathbb{S}^3}$. The Einstein equation $G_\Lambda(g) = \kappa T$ is then equivalent to
\[
3 - \Lambda = \kappa\rho, \qquad \Lambda - 1 = \kappa p.
\]
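These two relations determine $\rho$ and $p$ once $\Lambda$ is chosen. The sketch below simply solves them, in the normalization of the text (unit round sphere); the function name and the default value $\kappa = 1$ are illustrative assumptions. It highlights the pressureless case $p = 0$, which forces $\Lambda = 1$ and $\kappa\rho = 2$.

```python
def static_universe_fluid(Lambda: float, kappa: float = 1.0):
    """Solve 3 - Lambda = kappa * rho and Lambda - 1 = kappa * p for (rho, p)."""
    rho = (3.0 - Lambda) / kappa
    p = (Lambda - 1.0) / kappa
    return rho, p


rho_dust, p_dust = static_universe_fluid(1.0)
print(rho_dust, p_dust)   # dust case: kappa * rho = 2, p = 0

# Both rho and p are nonnegative exactly when 1 <= Lambda <= 3.
for Lambda in (0.5, 1.0, 2.0, 3.0, 3.5):
    print(Lambda, static_universe_fluid(Lambda))
```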
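\[
\begin{aligned}
\operatorname{Ric}(U, U) &= -3\,\frac{f''(t)}{f(t)} = 3\,\frac{f''(t)}{f(t)}\,\langle U, U\rangle c^{-2}, \\
\operatorname{Ric}(U, X) &= 0, \\
\operatorname{Ric}(X, Y) &= \Big(2\Big(\frac{f'(t)}{f(t)}\Big)^2 + 2\,\frac{k_0}{(f(t))^2} + \frac{f''(t)}{f(t)}\Big)\langle X, Y\rangle c^{-2}, \\
R(g) &= 6\Big(\Big(\frac{f'(t)}{f(t)}\Big)^2 + \frac{k_0}{(f(t))^2} + \frac{f''(t)}{f(t)}\Big) c^{-2}.
\end{aligned}
\]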
The stress-energy tensor that corresponds to this metric can be found using
the Einstein equation:
\[
T = \frac{c^4}{8\pi G}\Big(\operatorname{Ric}(g) - \frac12 R(g)\, g + \Lambda g\Big).
\]
Exercise 2-30. Verify the following formulas.
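\[
\begin{aligned}
T(U, X) &= 0, \\
T(X, Y) &= -\frac{c^2}{8\pi G}\Big(\Big(\frac{f'(t)}{f(t)}\Big)^2 + \frac{k_0}{(f(t))^2} + 2\,\frac{f''(t)}{f(t)} - \Lambda c^2\Big)\langle X, Y\rangle, \\
T(U, U) &= \frac{c^4}{8\pi G}\Big(3\Big(\frac{f'(t)}{f(t)}\Big)^2 + 3\,\frac{k_0}{(f(t))^2} - \Lambda c^2\Big).
\end{aligned}
\]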
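We can try to identify this with a perfect fluid, which has stress tensor $T =
c^{-2}(\rho + p)\, U^\flat \otimes U^\flat + p\,g$. For instance, $T(U, U) = c^2\rho$, $T(U, X) = 0$ and
$T(X, Y) = p\langle X, Y\rangle$. We can identify the stress-energy tensor of the warped
product as that of a perfect fluid, with
\[
\begin{aligned}
\rho &= \frac{c^2}{8\pi G}\Big(3\Big(\frac{f'(t)}{f(t)}\Big)^2 + 3\,\frac{k_0}{(f(t))^2} - \Lambda c^2\Big) \\
&= \frac{c^4}{8\pi G}\Big(3\Big(\frac{f'(t)}{c f(t)}\Big)^2 + 3\,\frac{k_0}{(c f(t))^2} - \Lambda\Big), \\
p &= -\frac{c^2}{8\pi G}\Big(\Big(\frac{f'(t)}{f(t)}\Big)^2 + \frac{k_0}{(f(t))^2} + 2\,\frac{f''(t)}{f(t)} - \Lambda c^2\Big) \\
&= \frac{c^4}{8\pi G}\Big(-\Big(\frac{f'(t)}{c f(t)}\Big)^2 - \frac{k_0}{(c f(t))^2} - 2\,\frac{f''(t)}{c^2 f(t)} + \Lambda\Big).
\end{aligned}
\]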
Note that $(4\pi G/c^4)(\rho + 3p) - \Lambda = -3 f''(t)/(c^2 f(t))$. It is also easy to derive
the following equation from the above: $\rho'(t) = -3(\rho(t) + p(t))\, f'(t)/f(t)$;
\[
\frac{8\pi G m}{3c^2} \cdot \frac{1}{f(t)} = (f'(t))^2 + k_0 - \tfrac13 \Lambda c^2 (f(t))^2.
\]
Note that there is a critical value of $\Lambda$ for which one can achieve a static model
(constant $f(t)$) when $p = 0$, given by $\Lambda c^2 = k_0^3\, (4\pi G m/c^2)^{-2}$.
We let $\Lambda = 0$ and $A = 8\pi G m/(3c^2)$, so that the Friedmann equation becomes
\[
\frac{A}{f(t)} = (f'(t))^2 + k_0.
\]
One can solve this for each possible sign of $k_0$. For $k_0 = 0$, we get, assuming
$f(t) > 0$ and $f'(t) \ne 0$, $\sqrt{f(t)}\, f'(t) = \pm\sqrt{A}$, so if we let $f(0) = 0$, then
$f(t) = C t^{2/3}$, where $C > 0$ is constant. This describes a universe that expands
from an initial “Big Bang” singularity (note that $f'(t) \to +\infty$ as $t \searrow 0$). If $k_0 = 1$,
the solution graph can be written in parametrized form as $t = \tfrac12 A(u - \sin u)$,
$f = \tfrac12 A(1 - \cos u)$, which describes a cycloid. The geometry here expands from
$f(0) = 0$ to a maximum value, then recollapses at $t = \pi A$ ($u = 2\pi$), the “Big
Crunch”. Similarly, if $k_0 = -1$ and $\rho > 0$, then $(f'(t))^2 = 1 + A/f(t) > 1$, so
that $f(t)$ keeps growing without bound, and the spatial slices expand in size over
time. We note that you can parametrize the solution graph as $t = \tfrac12 A(\sinh u - u)$,
$f = \tfrac12 A(\cosh u - 1)$.
where g̊Sn−1 is the unit round metric on the sphere. (We take n ≥ 3, since any
Ricci-flat surface or three-manifold is flat; see Exercise 2-11.) So a spherically
symmetric metric is also static (see Section 2.4.5, and compare [218, Chapter 6;
174, Chapter 12]), where the vector field ∂/∂t is a Killing field, timelike for
u(r ) > 0, which is orthogonal to the orbits of the isometry group.
We impose the Einstein equation Ric(g) = 0 to determine g. We will sketch
this here, referring to the works just cited, as well as, e.g., [182, Chapter 3].
We will use results from Exercises 1-14 and 1-15, namely the formulas for the
curvature of a warped product, with the $tr$-plane as the base $B$ with metric
$g_B = -u(r)\,dt^2 + v(r)\,dr^2$ and with fiber $F$ the round sphere $(\mathbb{S}^{n-1}, \mathring{g}_{\mathbb{S}^{n-1}})$, with
warping function $r$.
We apply Exercises 1-14 and 1-15 to find $\operatorname{Ric}_g(X, Y)$ for vectors $X$ and $Y$
tangent to the base, which yields
\[
\Big(K\big(-u(r)\,dt^2 + v(r)\,dr^2\big) - \frac{n-1}{r}\cdot\Big(-\frac12\frac{u'(r)}{v(r)}\,dt^2 - \frac12\frac{v'(r)}{v(r)}\,dr^2\Big)\Big)(X, Y)
\]
where
\[
K = -\frac12\frac{u''(r)}{u(r)v(r)} + \frac14\frac{(u'(r))^2}{(u(r))^2 v(r)} + \frac14\frac{u'(r)v'(r)}{u(r)(v(r))^2}.
\]
For this to vanish for all $X$ and $Y$ is equivalent to
\[
-K u(r) + \frac{n-1}{2r}\frac{u'(r)}{v(r)} = 0 = K v(r) + \frac{n-1}{2r}\frac{v'(r)}{v(r)}. \tag{2.4.1}
\]
Thus a necessary condition for $\operatorname{Ric}(g) = 0$ is that $\dfrac{u'(r)}{u(r)} = -\dfrac{v'(r)}{v(r)}$, i.e., $u(r)v(r)$
is constant.
Exercise 2-31. Use the formula for $K$, as well as $u(r)v(r) = C$, to show that the
condition (2.4.1) above reduces to $r u''(r) + (n-1)u'(r) = 0$. Show the general
solution is given by $u(r) = c_1 + c_2/r^{n-2}$, where $c_1$ and $c_2$ are constants.
Thus we can achieve the Ricci-flat condition along base directions with $u(r) =
c_1 + c_2/r^{n-2}$ and $v(r) = C/u(r)$. From Exercise 1-15, it follows that we just
have to consider the fiber directions, for which we note $\operatorname{Ric}(g_F) = (n-2)\mathring{g}_{\mathbb{S}^{n-1}}$.
Furthermore, as in Exercise 1-15, the meaning of $\langle \nabla r, \nabla r\rangle = (v(r))^{-1}$ is the
same computed in $g_B$ or $g$. Moreover from Exercise 1-14 we have
\[
\operatorname{Hess}_{g_B} r = -\frac12\frac{u'(r)}{v(r)}\,dt^2 - \frac12\frac{v'(r)}{v(r)}\,dr^2,
\]
so
\[
\Delta_{g_B} r = \operatorname{tr}_{g_B}(\operatorname{Hess}_{g_B} r) = \frac12\Big(\frac{u'(r)}{u(r)v(r)} - \frac{v'(r)}{(v(r))^2}\Big).
\]
Therefore, for vectors $V$ and $W$ tangent to the fiber, we find $\operatorname{Ric}_g(V, W)$ is
precisely
\[
\mathring{g}_{\mathbb{S}^{n-1}}(V, W)\Big((n-2) - \Big[\frac{r}{2}\Big(\frac{u'(r)}{u(r)v(r)} - \frac{v'(r)}{(v(r))^2}\Big) + \frac{n-2}{v(r)}\Big]\Big).
\]
Exercise 2-32. Show that under the condition $u(r)v(r) = C$, the vanishing of
$\operatorname{Ric}_g(V, W)$ for all $V$ and $W$ tangent to the fiber is tantamount to
\[
r u'(r) + (n-2)u(r) = C(n-2).
\]
Solve this to obtain $u(r) = C + c_3/r^{n-2}$ for some constant $c_3$.
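Without replacing the derivation asked for in the exercises, one can at least confirm numerically that the stated family solves both ordinary differential equations. The sketch below evaluates $u(r) = C + c_3/r^{n-2}$ and its first two derivatives in closed form, then prints the residuals of $r u'' + (n-1)u' = 0$ and $r u' + (n-2)u = C(n-2)$. The constants and sample radii are arbitrary illustrative choices.

```python
def u_and_derivatives(r: float, n: int, C: float, c3: float):
    """Return (u, u', u'') for u(r) = C + c3 / r^(n-2)."""
    u = C + c3 * r ** (-(n - 2))
    du = -(n - 2) * c3 * r ** (-(n - 1))
    d2u = (n - 2) * (n - 1) * c3 * r ** (-n)
    return u, du, d2u


def base_residual(r: float, n: int, C: float, c3: float) -> float:
    """Residual of r u'' + (n - 1) u' = 0 (Exercise 2-31)."""
    _, du, d2u = u_and_derivatives(r, n, C, c3)
    return r * d2u + (n - 1) * du


def fiber_residual(r: float, n: int, C: float, c3: float) -> float:
    """Residual of r u' + (n - 2) u = C (n - 2) (Exercise 2-32)."""
    u, du, _ = u_and_derivatives(r, n, C, c3)
    return r * du + (n - 2) * u - C * (n - 2)


for n in (3, 4, 5):
    for r in (0.7, 1.0, 2.5, 10.0):
        print(n, r, base_residual(r, n, 1.0, -2.0), fiber_residual(r, n, 1.0, -2.0))
```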
Together with the result of Exercise 2-31, we see that $\operatorname{Ric}(g) = 0$ boils down
to (letting $\alpha$ satisfy $c_3 = -\alpha C$)
\[
u(r) = C + \frac{c_3}{r^{n-2}} = C\Big(1 - \frac{\alpha}{r^{n-2}}\Big) \quad\text{and}\quad v(r) = \Big(1 - \frac{\alpha}{r^{n-2}}\Big)^{-1},
\]
for $C > 0$ and $\alpha$ arbitrary constants. We remark that we can rescale the time
variable by a constant factor (changing time units effectively adjusts the numerical
value of the speed of light $c$), to normalize the constant value of $u(r)v(r)$. If
we take $C = c^2$, we can write the Schwarzschild spacetime metric $\bar g_S$, for any
constant $\alpha$, as
\[
\bar g_S = -\Big(1 - \frac{\alpha}{r^{n-2}}\Big)c^2\,dt^2 + \Big(1 - \frac{\alpha}{r^{n-2}}\Big)^{-1}dr^2 + r^2\,\mathring{g}_{\mathbb{S}^{n-1}}.
\]
Note that for large $r$, $\bar g_S \approx -c^2 dt^2 + g_{\mathbb{E}^n} = -(dx^0)^2 + \delta_{ij}\,dx^i dx^j$, the Minkowski
metric. So not only must the spacetime be static, which follows directly from the
ansatz of spherical symmetry, but in the far field it must approach Minkowski
spacetime, a consequence of the symmetry together with the Ricci-flat condition
from the vacuum Einstein equation.
We want to identify α as proportional to the mass m of the spacetime, even-
tually settling on units for which α = 2m. We motivate this in dimension four,
i.e., n = 3, and we restrict the discussion to this dimension until further notice.
The Schwarzschild metric can represent the gravitational field in the exterior
of a non-rotating spherically symmetric massive body, and in the weak field
regime (large r ), the effect of the gravitational field on test particles (ascer-
tained by finding the geodesics in the Schwarzschild metric) is roughly that of a
Newtonian gravitational field for a point mass m. In fact, note that if we write
$(\bar g_S)_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, with $h_{00} = 2Gm/(c^2 r)$ and $dx^0 = c\,dt$, we can as on p. 63
For m < 0, the metric g S on a constant t-slice is defined for all r > 0 and is
Riemannian. There are radial geodesics for this metric g S , which are then easily
seen to be spacelike geodesics for $\bar g_S$, since $\mathrm{II} = 0$. If $r_0 > 0$, then for $m < 0$,
$\int_0^{r_0} (1 - 2m/r)^{-1/2}\, dr < +\infty$. Thus a radial geodesic going from $r_0$ toward $r = 0$
has finite length, so that both $(\mathbb{R}^3 \setminus \{0\}, g_S)$ and $(\mathbb{R} \times (\mathbb{R}^3 \setminus \{0\}), \bar g_S)$ are geodesically
incomplete. Neither metric can be smoothly extended in order to extend the
geodesic, due to the curvature blowup as r ↘ 0. As we will see below, for m > 0,
while the geometry of the spacetime metric ḡ S is well-behaved near r = 2m, the
metric is causally geodesically incomplete: there are timelike geodesics which
approach r = 0 in finite proper time (as well as null geodesics which approach
r = 0 at finite affine parameter), and along which there is curvature blowup.
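The finiteness of the radial length for $m < 0$ can be made concrete. The sketch below approximates $\int_0^{r_0}(1 - 2m/r)^{-1/2}\,dr$ with a midpoint rule and compares it with the antiderivative $\sqrt{r(r+a)} - a\,\operatorname{arsinh}\sqrt{r/a}$, where $a = -2m > 0$. This closed form is a calculation added here for illustration (it can be verified by differentiation) and is not quoted from the text; the function names and sample values are likewise illustrative.

```python
import math


def radial_length(r0: float, m: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of (1 - 2m/r)^(-1/2) over (0, r0), m < 0."""
    h = r0 / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        total += (1.0 - 2.0 * m / r) ** -0.5
    return total * h


def radial_length_exact(r0: float, m: float) -> float:
    """Closed form with a = -2m > 0: sqrt(r0 (r0 + a)) - a * asinh(sqrt(r0 / a))."""
    a = -2.0 * m
    return math.sqrt(r0 * (r0 + a)) - a * math.asinh(math.sqrt(r0 / a))


numeric = radial_length(1.0, -1.0)
exact = radial_length_exact(1.0, -1.0)
print(numeric, exact)
```

The integrand behaves like $\sqrt{r/a}$ near $r = 0$, which is why the integral converges even though the metric coefficient degenerates there.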
We collect a few formulas that will be useful here and in the next section, and
make one more observation before moving on.
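Exercise 2-36. By direct calculation, either computing the Christoffel symbols
$\Gamma^\mu_{\rho\sigma} = \tfrac12 g^{\mu\nu}(g_{\nu\sigma,\rho} + g_{\rho\nu,\sigma} - g_{\rho\sigma,\nu})$ or by using the Cartan equations (Exercise
1-13), show that in the Schwarzschild metric $\bar g_S$ we have
\[
\begin{aligned}
\nabla_{\frac{\partial}{\partial t}}\frac{\partial}{\partial t} &= \frac{m}{r^2}\Big(1 - \frac{2m}{r}\Big)\frac{\partial}{\partial r}, \\
\nabla_{\frac{\partial}{\partial t}}\frac{\partial}{\partial r} &= \frac{m}{r^2}\Big(1 - \frac{2m}{r}\Big)^{-1}\frac{\partial}{\partial t} = \nabla_{\frac{\partial}{\partial r}}\frac{\partial}{\partial t}, \\
\nabla_{\frac{\partial}{\partial r}}\frac{\partial}{\partial r} &= -\frac{m}{r^2}\Big(1 - \frac{2m}{r}\Big)^{-1}\frac{\partial}{\partial r}.
\end{aligned}
\]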
Consider the unit timelike vector field U = (1 − 2m/r )−1/2 ∂/∂t, for r > 2m.
Integral curves γ (τ ) of this vector field are Schwarzschild observers. In these
coordinates, their spatial positions are fixed. From the preceding exercise, we
note that $D\gamma'(\tau)/d\tau = \nabla_U U = (m/r^2)\,\partial/\partial r$. Thus, for $m \ne 0$, such observers
are not in free fall. In fact, as we have seen earlier (cf. Section 1.3.2.1), if
$m_0$ is the rest mass of the observer, one might interpret $f = (m_0 m/r^2)\,\partial/\partial r$
as the force on the observer. As the observer has fixed spatial coordinates in
this chart, one might interpret this as the force required to oppose the effect of
gravitation, to keep the observer from geodesic motion (free fall); moreover,
∂/∂r is approximately a unit vector if r is large, so the interpretation is consistent
with Newtonian gravity (with G = 1), at least in the far field (weak field limit).
2.4.4.1. Kruskal extension. We have seen that the null structure plays a key role
in the geometry of spacetime, so we study the behavior of null geodesics near
r = 2m > 0. Fix a point ω0 on the sphere, and consider a curve γ (s) which is
given in coordinates by (t (s), r (s), ω0 ). Thus γ ′ (s) = t ′ (s)∂/∂t + r ′ (s)∂/∂r .
We observe that the $\frac{\partial}{\partial t}$-component of the geodesic equation $\gamma''(s) = 0$ is first-
order linear in $t'(s)$, and in fact can be written $\frac{d}{ds}\big(t'(s)(1 - 2m/r)\big) = 0$, which
The metric is now adapted to the structure of null geodesics, but it still has
singular components at $r = 2m$. Note that $r_* = \tfrac12(v - u)$, so that
\[
e^{\frac{v-u}{4m}} = |F(r)|, \quad\text{with}\quad F(r) := \Big(\frac{r}{2m} - 1\Big)e^{\frac{r}{2m}}.
\]
It is a simple exercise to show that $F$ is a diffeomorphism between $(0, +\infty)$ and
$(-1, +\infty)$, mapping $(0, 2m)$ to $(-1, 0)$ and $(2m, +\infty)$ to $(0, +\infty)$.
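Since $F$ is strictly increasing on $(0, +\infty)$, its inverse can be computed by bisection, which is a convenient way to recover $r$ from a value of $F$. The following is a minimal sketch of that idea, not a method taken from the text; the function names, tolerance, and bracketing strategy are illustrative choices.

```python
import math


def F(r: float, m: float) -> float:
    """F(r) = (r / 2m - 1) * exp(r / 2m)."""
    x = r / (2.0 * m)
    return (x - 1.0) * math.exp(x)


def F_inverse(y: float, m: float, tol: float = 1e-12) -> float:
    """Solve F(r) = y for r > 0 by bisection, for y > -1 (F is increasing)."""
    if y <= -1.0:
        raise ValueError("F takes values in (-1, +infinity) only")
    lo, hi = 0.0, 2.0 * m
    while F(hi, m) < y:          # enlarge the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if F(mid, m) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


m = 1.0
print(F(0.0, m), F(2.0 * m, m))                  # endpoint value -1 and horizon value 0
for r in (0.5, 2.0, 3.7):
    print(r, F_inverse(F(r, m), m))
```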
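With this in mind, we let $U = e^{-\frac{u}{4m}}$ and $V = e^{\frac{v}{4m}}$, so that for $r > 2m$,
$UV = F(r)$. The portion of the Schwarzschild spacetime for $r > 2m$ corresponds,
with $r_*$ taking on all values in $\mathbb{R}$, to $\{(U, V) : U > 0, V > 0\}$. Moreover, we
compute, since $dU = -\frac{1}{4m}e^{-\frac{u}{4m}}\,du$ and $dV = \frac{1}{4m}e^{\frac{v}{4m}}\,dv$, that
\[
\begin{aligned}
-\Big(1 - \frac{2m}{r}\Big)dt^2 + \Big(1 - \frac{2m}{r}\Big)^{-1}dr^2
&= -\Big(1 - \frac{2m}{r}\Big)du\,dv \\
&= 16m^2\,\frac{1 - \frac{2m}{r}}{e^{\frac{r}{2m}}\big(\frac{r}{2m} - 1\big)}\,dU\,dV \\
&= \frac{32m^3}{r}\,e^{-\frac{r}{2m}}\,dU\,dV.
\end{aligned}
\]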
Let $F^{-1}$ be the inverse function for $F$. We can construe $(U, V) \mapsto F^{-1}(UV)$
as a smooth function on the set $\{(U, V) : UV > -1\}$, for which we note that
$\{(U, V) : F^{-1}(UV) > 2m\} = \{(U, V) : UV > 0\}$, one component of which
we just used above to map to $r > 2m$ in Schwarzschild. Likewise we have
$\{(U, V) : 0 < F^{-1}(UV) < 2m\} = \{(U, V) : -1 < UV < 0\}$, one component
of which we will map to the region $0 < r < 2m$ in Schwarzschild, namely
$\{(U, V) : UV > -1, U < 0, V > 0\}$. Note that as $r$ increases in the interval
$(0, 2m)$, $r_*$ decreases over the interval $(-\infty, 0)$; if we now let $U = -e^{-\frac{u}{4m}}$ and
$V = e^{\frac{v}{4m}}$, we again have in this region $UV = F(r)$, and also again we compute
with $dU = \frac{1}{4m}e^{-\frac{u}{4m}}\,du$ and $dV = \frac{1}{4m}e^{\frac{v}{4m}}\,dv$, that
\[
-\Big(1 - \frac{2m}{r}\Big)dt^2 + \Big(1 - \frac{2m}{r}\Big)^{-1}dr^2 = -\Big(1 - \frac{2m}{r}\Big)du\,dv = \frac{32m^3}{r}\,e^{-\frac{r}{2m}}\,dU\,dV.
\]
Again, r = F −1 (U V ) can be construed as a smooth function of (U, V )
in {(U, V ) : U V > −1}, taking on all values in (0, +∞): r = 2m does not
correspond to a singularity of this metric. We have the original two disconnected
regions r > 2m and 0 < r < 2m embedded isometrically into a smooth manifold,
connected along the axis U = 0, V > 0. Notice that the level sets of r , which
correspond to varying t, are hyperbolae in the U V -plane, with one exception:
r = 2m corresponds to the axes, U V = 0. In any case, we see that there is not a
singularity at r = 2m in the spacetime metric, which smoothly extends over this
set, not only connecting r > 2m with 0 < r < 2m, but also giving rise to two
other related regions as well. They are topologically connected to each other,
but not all regions are causally connected, as will soon be clear.
If we want to take the metric out of null form, we can let $T = \tfrac12(V - U)$, and
$R = \tfrac12(V + U)$, and on $\{(T, R) : R^2 - T^2 > -1\}$, we have
\[
\begin{aligned}
-\Big(1 - \frac{2m}{r}\Big)dt^2 + \Big(1 - \frac{2m}{r}\Big)^{-1}dr^2
&= \frac{32m^3}{r}\,e^{-\frac{r}{2m}}\,dU\,dV \\
&= \frac{32m^3}{r}\,e^{-\frac{r}{2m}}\,(-dT^2 + dR^2),
\end{aligned}
\]
where $r = F^{-1}(R^2 - T^2)$; the level sets of $r$ are hyperbolae, and $r = 2m$ now
corresponds to R = ±T , which is clearly a null hypersurface. The lightcones
behave uniformly in these coordinates, which of course is how the coordinates
were constructed.
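The original region $r > 2m$ corresponds to $R + T = V > 0$ and $R - T = U > 0$,
i.e., $R > |T|$, and in this region, we can relate the coordinates to the original
$(t, r)$ coordinates, as follows:
\[
\begin{aligned}
T &= \tfrac12(V - U) = \tfrac12\Big(e^{\frac{t + r_*}{4m}} - e^{-\frac{t - r_*}{4m}}\Big) = e^{\frac{r_*}{4m}}\sinh\frac{t}{4m}
 = \Big(\frac{r}{2m} - 1\Big)^{\frac12} e^{\frac{r}{4m}}\sinh\frac{t}{4m}, \\
R &= \tfrac12(V + U) = \Big(\frac{r}{2m} - 1\Big)^{\frac12} e^{\frac{r}{4m}}\cosh\frac{t}{4m}.
\end{aligned}
\]
The region $0 < r < 2m$ corresponds to $R + T > 0$, $R - T < 0$, i.e., $T > |R|$,
with $R^2 - T^2 > -1$, with coordinates related to the $(t, r)$ coordinates by (watch
the log term in $r_*$)
\[
\begin{aligned}
T &= \tfrac12(V - U) = \tfrac12\Big(e^{\frac{t + r_*}{4m}} + e^{-\frac{t - r_*}{4m}}\Big) = e^{\frac{r_*}{4m}}\cosh\frac{t}{4m}
 = \Big(1 - \frac{r}{2m}\Big)^{\frac12} e^{\frac{r}{4m}}\cosh\frac{t}{4m}, \\
R &= \tfrac12(V + U) = \Big(1 - \frac{r}{2m}\Big)^{\frac12} e^{\frac{r}{4m}}\sinh\frac{t}{4m}.
\end{aligned}
\]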
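The coordinate formulas for the two regions can be collected into one function and checked against the relation $R^2 - T^2 = UV = F(r)$. The sketch below does this numerically; the function names are hypothetical and the sample points are arbitrary.

```python
import math


def F(r: float, m: float) -> float:
    """F(r) = (r / 2m - 1) * exp(r / 2m)."""
    x = r / (2.0 * m)
    return (x - 1.0) * math.exp(x)


def kruskal(t: float, r: float, m: float):
    """Return (T, R) for a point (t, r) with r > 2m or 0 < r < 2m."""
    x = r / (2.0 * m)
    tau = t / (4.0 * m)
    if x > 1.0:                                   # exterior region, R > |T|
        s = math.sqrt(x - 1.0) * math.exp(0.5 * x)
        return s * math.sinh(tau), s * math.cosh(tau)
    if 0.0 < x < 1.0:                             # interior region, T > |R|
        s = math.sqrt(1.0 - x) * math.exp(0.5 * x)
        return s * math.cosh(tau), s * math.sinh(tau)
    raise ValueError("formulas apply only for r > 2m or 0 < r < 2m")


m = 1.0
for t, r in ((0.0, 3.0), (1.5, 5.0), (-2.0, 2.5), (0.7, 1.0), (-1.0, 0.3)):
    T, R = kruskal(t, r, m)
    print((t, r), (T, R), R * R - T * T, F(r, m))
```

In the exterior samples the output satisfies $R > |T|$, and in the interior samples $T > |R|$ with $R^2 - T^2 > -1$, matching the description of the two regions.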
We note several things about the causal structure (Figure 5). Spacetime has a
time orientation given by ∂/∂ T . Because the lightcones in the (T, R)-plane are
at 45◦ from the T -axis, we see that no causal curve from the region E 2 given
by R < −|T | can enter the region E 1 , where R > |T | (the original r > 2m
region). Moreover, any ingoing (to the future) radial null geodesic starting
from E 1 will cross R = |T | (r = 2m), entering into the region B1 (the original
0 < r < 2m region), where T > |R|, R 2 − T 2 > −1, and will inevitably “reach”
r = 0 (R 2 − T 2 = −1) at a finite affine parameter. In fact, any causal curve
emanating from B1 stays within this region to the future, and must reach r = 0
at finite proper time (or affine parameter). This is a simple consequence of the
structure of lightcones in the $(T, R)$-plane, and recall that a null geodesic has
affine parameter $s$ with $r'(s)$ constant. The final region $B_2$ is dual to $B_1$, and is
given by $T < -|R|$, $R^2 - T^2 > -1$; future-directed causal curves starting from
$B_2$ can enter $E_1$, say, but no future-directed causal curve can go from $E_1$ to $B_2$.
Interpreting causal curves in terms of signals, one can interpret $B_1$ as a black
hole and $B_2$ as a white hole.

[Figure 5: the $(T, R)$-plane, showing the regions $E_1$, $E_2$, $B_1$, $B_2$, the lines $r = 2m$, and the curves $r = 0$.]
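We can integrate this with the substitution $r - m = m\cosh w$ to get $w = \log\tilde r + \tilde C$,
or $e^w = C\tilde r$. Thus
\[
r = m + m\cosh w = m + \frac{m}{2}\Big(C\tilde r + \frac{1}{C\tilde r}\Big),
\]
and so $\dfrac{r}{\tilde r} = \dfrac{mC}{2} + \dfrac{m}{\tilde r} + \dfrac{m}{2C\tilde r^2}$. If we let $C = 2/m$, we get
\[
\frac{r}{\tilde r} = \Big(1 + \frac{m}{2\tilde r}\Big)^2.
\]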
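A few consequences of the relation $r = \tilde r\,(1 + m/(2\tilde r))^2$ are easy to test numerically for $m > 0$. The sketch below checks that $\tilde r = m/2$ corresponds to $r = 2m$, that $1 - 2m/r$ equals the square of $(1 - m/(2\tilde r))/(1 + m/(2\tilde r))$, and that $r$ is unchanged under $\tilde r \mapsto m^2/(4\tilde r)$. The explicit form of this inversion is a standard fact assumed here for illustration; the function names are hypothetical.

```python
import math


def area_radius(r_tilde: float, m: float) -> float:
    """r as a function of the isotropic radius: r = r~ (1 + m / (2 r~))^2."""
    return r_tilde * (1.0 + m / (2.0 * r_tilde)) ** 2


def lapse_isotropic(r_tilde: float, m: float) -> float:
    """(1 - m / (2 r~)) / (1 + m / (2 r~))."""
    return (1.0 - m / (2.0 * r_tilde)) / (1.0 + m / (2.0 * r_tilde))


def invert(r_tilde: float, m: float) -> float:
    """Radial inversion r~ -> m^2 / (4 r~), assumed here as the reflection map."""
    return m * m / (4.0 * r_tilde)


m = 1.0
print(area_radius(m / 2.0, m))                    # equals 2m
for r_tilde in (0.1, 0.5, 2.0, 10.0):
    r = area_radius(r_tilde, m)
    print(r_tilde, r, 1.0 - 2.0 * m / r, lapse_isotropic(r_tilde, m) ** 2,
          area_radius(invert(r_tilde, m), m))
```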
[Figure 6. Labels in the figure: $\tilde r \to \infty$, $\tilde r \to 0$, the $\tilde r$-constant spheres, and the minimal sphere $\tilde r = \frac{m}{2}$.]
gS = 1 + g E3 .
2|x|
For $m > 0$, the conformal metric is defined for all $x \ne 0$. For $m < 0$, $r \searrow 0$
corresponds to $\tilde r \searrow -\frac{m}{2}$. The manifold $(\mathbb{R}^3 \setminus \{x : |x| \le -\frac{m}{2}\}, g_S)$ is incomplete,
by the same argument given earlier, and cannot be extended over $|x| = -\frac{m}{2}$: the
$g_S$-areas of the spheres $\{|x| = r\}$ tend to $0$ as $r \searrow -\frac{m}{2}$ (Exercise 2-49), and radial
geodesics from $r = r_0 > -\frac{m}{2}$ have finite length as $r \searrow -\frac{m}{2}$, but the curvature
blowup precludes being able to add a point to complete the metric, as is done in
the case $m = 0$ for the Euclidean metric.
For $m > 0$, the values $r > 2m$ correspond to $\tilde r > \frac{m}{2}$, and the set where $0 < \tilde r \le \frac{m}{2}$
is in the extended Schwarzschild spacetime developed above. The set where
$t = 0$ and $\tilde r = \frac{m}{2}$ is a round two-sphere, which is actually totally geodesic inside
the three-slice (Figure 6). We note that as $\tilde r = |x| \to \infty$, $g_S$ approaches the
Euclidean metric. What is not immediately obvious is that the same can be said
of the geometry as $\tilde r \searrow 0$, so that the metric is complete. This follows from the
following exercise.
the set where $\tilde r > \frac{m}{2}$ isometrically onto the set where $0 < \tilde r < \frac{m}{2}$. Conclude that
the fixed two-sphere is totally geodesic.
We define $r = r(s)$ to be a function satisfying $dr/ds = r(s)/\sqrt{H(s)}$, i.e., $r =
r_0 e^{F(s)}$ where $dF/ds = 1/\sqrt{H(s)}$; we take $r_0 > 0$. Thus $r : I \to r(I) =: J$
is a smooth bijection, with smooth inverse $s = s(r)$, for which we have $g =
(s'(r))^2(dr^2 + r^2\mathring{g}_{\mathbb{S}^{n-1}}) = (s'(r))^2 g_{\mathbb{E}^n}$. Thus $g$ is conformally Euclidean. As $n \ge 3$,
we can write this in a standard form given in Exercise 2-51 as $g = u^{4/(n-2)} g_{\mathbb{E}^n}$,
where in this case $u = u(r)$ and $r = |x|$. Applying (2-51a) from that exercise,
we see that the vanishing of the scalar curvature amounts to $\Delta u = 0$, which for a
function of $r = |x|$ alone yields $u = c_1 + c_2/r^{n-2}$; cf. (7.1.3). If $c_2 = 0$, the metric
$g$ is flat; more generally, by a radial rescaling, it is easy to see that $g$ is isometric
to the Riemannian Schwarzschild metric $g_S$ of mass $2c_1c_2$. See Exercise 7-36.
We make one final remark here. Let $I = (0, +\infty)$. For $m > 0$, we consider the
Riemannian Schwarzschild metric $g_S$ as a metric on $I \times \mathbb{S}^{n-1}$. This metric admits
an isometric $\mathbb{Z}_2$-action, given by radial inversion (cf. Exercises 2-37 and 2-49)
along $I$, together with the antipodal map on the sphere $\mathbb{S}^{n-1}$. The Schwarzschild
metric then induces a complete Riemannian metric with zero scalar curvature on
the quotient space, a smooth noncompact manifold diffeomorphic to $\mathbb{RP}^n$ with a
point removed, known as the $\mathbb{RP}^n$-geon. The geometry can still be construed as
rotationally symmetric, with one asymptotically flat end instead of two.
Lemma 2-40. Let X and Y be (lifts of) vector fields tangent to the base M. Then
Thus the condition that $\bar g$ be Ricci-flat is given by the static vacuum equations
$\operatorname{Ric}(g) = f^{-1}\operatorname{Hess}_g f$ and $\Delta_g f = 0$. Note that from here we get $R(g) = 0$.
We can relate the static equations to scalar curvature deformation, by recalling
the linearization L g h of the scalar curvature from (2.3.11), written in local
coordinates as
\[
g_S = \Big(1 - \frac{2m}{r}\Big)^{-1}dr^2 + r^2\,\mathring{g}_{\mathbb{S}^2}
\]
induced on a constant time slice from the spacetime metric $\bar g_S$ in (2.4.2). From
the form of $\bar g_S$, we see that $f = \sqrt{1 - 2m/|x|}$ is a solution of $L^*_{g_S} f = 0$. If we
change to isotropic coordinates, $g_S = (1 + \tfrac12 m/|x|)^4 g_{\mathbb{E}^3}$, then using the form
(2.4.3) of the Schwarzschild metric $\bar g_S$ in these coordinates, we see that
\[
f = \frac{1 - \tfrac12 m/|x|}{1 + \tfrac12 m/|x|}
\]
is a solution to $L^*_{g_S} f = 0$.
gradgE x i = ei = ∂/∂ x i , and along the unit sphere centered at the origin, the
position vector x is a unit normal.)
Proposition 2-44. Let (M, g) be a connected Riemannian manifold with n =
dim M ≥ 2. The dimension of the space of solutions to L ∗g f = 0 is at most
(n+1). For any f : M → R which is a nontrivial solution to L ∗g f = 0, the
level set $\Sigma := \{p \in M : f(p) = 0\}$ is either empty or a smooth totally geodesic
hypersurface, along any component of which |d f |2g is a nonzero constant.
Proof. We first show that the set { f = 0} is a regular level set for f . Suppose
L ∗g f = 0, and assume { f = 0} is nonempty. Let p ∈ M with f ( p) = 0. We claim
that since f is nontrivial, d f p cannot vanish; equivalently, if d f p = 0, then f
would be trivial. To see this, observe that the system $L^*_g f = 0$ can be rewritten as
$\operatorname{Hess}_g f + f\big(\frac{1}{n-1}R(g)\, g - \operatorname{Ric}(g)\big) = 0$. If $\gamma(t)$ is a unit-speed geodesic emanating
from $p = \gamma(0)$, and if we let $h(t) = f(\gamma(t))$, then since $D\gamma'/dt = 0$, we have
$h''(t) = \operatorname{Hess}_g f(\gamma'(t), \gamma'(t))$, so that $h(t)$ satisfies the second-order linear ODE
\[
h''(t) + \Big(\frac{R(g)}{n-1} - \operatorname{Ric}_g(\gamma'(t), \gamma'(t))\Big)h(t) = 0.
\]
2.4.6. The Kerr metric. The Kerr metrics form a family of axisymmetric Ricci-
flat metrics, which includes as a subset the Schwarzschild metrics, with their full
spherical symmetry. Such metrics model the exterior of a rotating gravitational
object. We will write down the metric in a certain coordinate system, the Boyer–
Lindquist coordinates. Let r , φ and θ be spherical coordinates on three-space.
We will use the mathematical convention that θ is the polar angle, whereas φ
is the angle between the position vector and the z-axis: in most physics books,
the angle notation is reversed, so be careful when comparing. In any case, the
metric appears somewhat complicated, and depends on two parameters, m and a.
For simplicity, we use units where G = 1 and c = 1; in general, we can replace
“m” with “Gm/c2 ” and “dt” with “c dt” in what follows to obtain the formula
for general units. See [161, p. 36] for more on geometrized units.
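Let $\Delta = r^2 - 2mr + a^2$ and $\rho^2 = r^2 + a^2\cos^2\phi$. The Kerr metric is given by
\[
g = -\Big(1 - \frac{2mr}{\rho^2}\Big)\, dt^2 - \frac{2mr\sin^2\phi}{\rho^2}\, a\, (dt\otimes d\theta + d\theta\otimes dt)
 + \frac{(r^2+a^2)^2 - a^2\Delta\sin^2\phi}{\rho^2}\,\sin^2\phi\, d\theta^2 + \frac{\rho^2}{\Delta}\, dr^2 + \rho^2\, d\phi^2. \tag{2.4.6}
\]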
This spacetime metric is stationary: ∂/∂t is a timelike Killing vector, but unlike
the Schwarzschild situation, if a ≠ 0, this Killing field is not orthogonal to the
constant t-slices. The axisymmetry is apparent, as ∂/∂θ is also a Killing vector. One
can define suitably the angular momentum (see (7.2.6)) and show that it is J = am
in the z-direction. When a = 0, the metric reduces to the Schwarzschild metric.
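The reduction to Schwarzschild at a = 0 can be checked numerically from the components in (2.4.6). The following is a minimal sketch in units G = c = 1, using the text's angle convention (φ measured from the z-axis, θ the axial angle); the function names and the sample point are illustrative assumptions.

```python
import math


def kerr_components(r, phi, m, a):
    """Nonzero Boyer-Lindquist components of (2.4.6); 't_theta' is g_{t theta}."""
    delta = r * r - 2 * m * r + a * a
    rho2 = r * r + (a * math.cos(phi)) ** 2
    s2 = math.sin(phi) ** 2
    return {
        "tt": -(1 - 2 * m * r / rho2),
        "t_theta": -2 * m * r * a * s2 / rho2,
        "theta_theta": ((r * r + a * a) ** 2 - a * a * delta * s2) * s2 / rho2,
        "rr": rho2 / delta,
        "phi_phi": rho2,
    }


def schwarzschild_components(r, phi, m):
    """Schwarzschild metric in the same coordinates and angle convention."""
    return {
        "tt": -(1 - 2 * m / r),
        "t_theta": 0.0,
        "theta_theta": r * r * math.sin(phi) ** 2,
        "rr": 1 / (1 - 2 * m / r),
        "phi_phi": r * r,
    }


# Compare at one exterior sample point with a = 0.
kerr0 = kerr_components(5.0, 0.7, 1.0, 0.0)
schw = schwarzschild_components(5.0, 0.7, 1.0)
max_gap = max(abs(kerr0[key] - schw[key]) for key in schw)
print(max_gap)
```

For a ≠ 0 the same function exhibits the off-diagonal term g_{tθ}, which is what prevents ∂/∂t from being orthogonal to the constant t-slices.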
General relativity predicts that a rotating massive object will cause the space-
time around it to be “warped”. One such manifestation of this is the frame
dragging effect, which we illustrate in the Kerr metric. We use the following
lemma saying that a symmetry gives rise to a conserved quantity along geodesics.
Lemma 2-47. If X is a Killing field on (M, g), and if γ is a geodesic, then
⟨X, γ ′ ⟩ is constant.
Proof. Since γ(τ) has constant speed, ⟨∇_X γ′, γ′⟩ = 0. By the geodesic equation, then,
$d\langle X,\gamma'\rangle/d\tau = \langle\nabla_{\gamma'}X,\gamma'\rangle = g_{ij}(\gamma^k)' X^i{}_{;k}(\gamma^j)' = X_{j;k}(\gamma^k)'(\gamma^j)' = X_{k;j}(\gamma^k)'(\gamma^j)'$.
We see this must vanish by applying the Killing equation
$X_{j;k} + X_{k;j} = 0$. □
Consider in any spacetime a timelike geodesic parametrized by proper time τ ,
with velocity vector U (τ ). In the Kerr metric, X = ∂/∂θ is a Killing vector, so
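that by Lemma 2-47, $U_\theta = \langle U, \partial/\partial\theta\rangle$ is conserved along the geodesic. One can
see this directly from the geodesic equations $U^\alpha U_{\beta;\alpha} = 0$, which can be written
\[
\frac{dU_\beta}{d\tau} = \Gamma^\gamma_{\alpha\beta}\, U^\alpha U_\gamma
 = \tfrac12\, g^{\gamma\nu}(g_{\nu\beta,\alpha} + g_{\alpha\nu,\beta} - g_{\alpha\beta,\nu})\, U^\alpha U_\gamma
 = \tfrac12\,(g_{\nu\beta,\alpha} + g_{\alpha\nu,\beta} - g_{\alpha\beta,\nu})\, U^\alpha U^\nu
 = \tfrac12\, g_{\alpha\nu,\beta}\, U^\alpha U^\nu.
\]
\[
\frac{d\theta}{dt} = \frac{U^\theta}{U^t} = \frac{g^{\theta\alpha}U_\alpha}{g^{t\alpha}U_\alpha} = \frac{g^{\theta t}}{g^{tt}} = -\frac{g_{\theta t}}{g_{\theta\theta}} = \frac{2mra}{(r^2+a^2)^2 - a^2\Delta\sin^2\phi}.
\]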
Although in the metric the trajectory is orthogonal to the direction of the symmetry
given by the Killing field ∂/∂θ, the velocity has a nontrivial component in the
θ -direction. Thus a freely falling object seems to pick up some coordinate angular
momentum in the direction of rotation of the massive object, say, generating this
gravitational field (metric). For tests of the effect of the rotation of a massive
object on spacetime, see the results of the Gravity Probe B experiment [87].
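To get a feel for the size of the frame dragging effect, the coordinate angular velocity dθ/dt derived above can be evaluated directly. This is a small sketch in units G = c = 1; the function name and the sample values are illustrative assumptions. Far from the source the denominator behaves like r⁴, so the rate approaches 2ma/r³ = 2J/r³.

```python
import math


def frame_dragging_rate(r, phi, m, a):
    """d(theta)/dt = 2mra / ((r^2 + a^2)^2 - a^2 * Delta * sin^2(phi))."""
    delta = r * r - 2 * m * r + a * a
    denominator = (r * r + a * a) ** 2 - a * a * delta * math.sin(phi) ** 2
    return 2 * m * r * a / denominator


# No rotation means no dragging (the Schwarzschild case).
rate_static = frame_dragging_rate(10.0, 1.0, 1.0, 0.0)

# Far-field comparison with the leading-order value 2ma/r^3.
r_far, m, a = 1.0e4, 1.0, 0.5
far_ratio = frame_dragging_rate(r_far, 1.0, m, a) / (2 * m * a / r_far**3)
print(rate_static, far_ratio)
```

The sign of the rate follows the sign of a, matching the statement that the falling object is dragged in the direction of rotation.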
Exercises
\[ x^0 = \Big(\xi^1 + \frac{c^2}{g}\Big)\sinh\frac{g\xi^0}{c^2}, \qquad x^1 = \Big(\xi^1 + \frac{c^2}{g}\Big)\cosh\frac{g\xi^0}{c^2}. \]
b. Find the proper acceleration along the worldlines of constant ξ¹. Show that
for |gξ⁰/c²| small, the motion along any such worldline approximates that of
uniform Newtonian acceleration in the inertial coordinates.
c. Interpreting the metric in the (ξ⁰, ξ¹) coordinates for |gξ¹/c²| small in terms
of a Newtonian limit, estimate the Newtonian gravitational potential Φ, and show
it is consistent with that of a uniform gravitational field.
d. Let $C_\xi$ be the worldline ξ¹ = ξ. Suppose that light is emitted along $C_\xi$ with
coordinate frequency ν, i.e. with Δξ⁰ = c/ν between successive wave crests.
Argue that the light will have the same coordinate frequency of absorption as
measured along $C_{\tilde\xi}$ for any ξ̃. Find the relative change in proper frequency from
emission along $C_\xi$ to absorption at a detector carried along $C_0$, conclude that the
change in energy per unit mass is $c^2\big(\sqrt{-\bar g_{00}(\xi)} - 1\big)$, and relate this to Φ.
\[ M = \begin{cases} \mathbb{R}^3 & \text{if } m = 0,\\ \{x \in \mathbb{R}^3 : |x| > -\tfrac{m}{2}\} & \text{if } m < 0. \end{cases} \]
On M, we let $g_S = \big(1 + \frac{m}{2|x|}\big)^4 g_{E^3}$, so $(g_S)_{ij} = \big(1 + \frac{m}{2|x|}\big)^4 \delta_{ij}$ in Cartesian
coordinates, while in spherical coordinates we find $g_S = \big(1 + \frac{m}{2r}\big)^4 (dr^2 + r^2\, \mathring g_{S^2})$ with
r = |x|. For each r > 0, let A(r) be the $g_S$-area of $S_r := \{|x| = r\}$, and for m > 0,
let $\Sigma = S_{m/2}$ (cf. Exercise 2-37).
a. Calculate A(r). Observe that $\lim_{r\searrow -m/2} A(r) = 0$ for m < 0, while if m > 0,
A(r) has a global minimum at r = m/2, and that $A(r) = A\big(\frac{m^2}{4r}\big)$, which is consistent
with Exercise 2-37. Solve for m in terms of the area $A_\Sigma$ of Σ.
b. For m > 0, show that Σ is totally geodesic in M by direct computation in
coordinates (as opposed to using the inversion isometry, cf. Exercise 2-37).
c. Let m > 0. Find an isometric embedding of $(M, g_S)$ into $E^4$, identified in
Cartesian coordinates (x, y, z, w) with $(\mathbb{R}^4, dx^2 + dy^2 + dz^2 + dw^2)$. It might
be easiest to use the first set of coordinates we introduced for the Schwarzschild
metric, for which $g_S = \big(1 - \frac{2m}{r}\big)^{-1} dr^2 + r^2\, \mathring g_{S^2}$, r > 2m. For ω ∈ S², look for an
embedding of the form rω ↦ (rω, ξ(r)) ∈ R⁴, for r > 2m and ξ′(r) > 0. This
corresponds to half of $(M, g_S)$. The map you get will then extend by reflection
to the other half. Use this to sketch a representative picture of the Riemannian
Schwarzschild metric. (If you let w = ξ(r), with $\lim_{r\searrow 2m} \xi(r) = 0$, the solution
set in the wr-plane is (half of) a parabola. The image of the embedding in R⁴ is
called Flamm’s paraboloid.)
d. When m < 0 the above argument breaks down. Instead, look for an isometric
embedding into Minkowski spacetime $\mathbb{M}^4$, which is identified with R⁴ with the
metric $dx^2 + dy^2 + dz^2 - dw^2$.
e. Generalize the above to higher dimensions: consider the higher-dimensional
Riemannian Schwarzschild metric $g_S$, with α = 2m > 0, given in isotropic
coordinates in Exercise 2-38. Find the value of r that minimizes the area A(r)
of $S_r = \{x : |x| = r\}$. Answer: $r^{n-2} = \frac{m}{2}$. Find an isometric inversion through
the corresponding minimal sphere Σ. Show this sphere is totally geodesic, and
express m in terms of the area of Σ. Can you embed $(\mathbb{R}^n \setminus \{0\}, g_S)$ with m > 0
isometrically in Euclidean space $(\mathbb{R}^{n+1}, g_{E^{n+1}})$?
Exercise 2-50. In Euclidean space, the spheres minimize surface area for a given
enclosed volume V. In fact if a closed surface of area A encloses a volume V,
the isoperimetric inequality in three dimensions is $V \le A^{3/2}/(6\sqrt{\pi})$.
Let m > 0. In his Ph.D. thesis [25] (see also [29]), Hubert Bray showed
that the spheres $S_r = \{x : |x| = r\}$ in the Riemannian Schwarzschild metric
$\big(\mathbb{R}^3\setminus\{0\},\ (1 + \tfrac12 m/|x|)^4 g_{E^3}\big)$ are isoperimetric in the homology class of $\Sigma = S_{m/2}$.
In other words, amongst all surfaces homologous to the minimal surface Σ and
enclosing a certain volume V with Σ, the one with smallest area is the sphere $S_r$
of the correct r value to enclose volume V.
a. Show that the volume V(r) enclosed by Σ and $S_r$ (r ≥ m/2) has the expansion
\[ V(r) = \frac{4\pi r^3}{3}\Big(1 + \frac{9m}{2r} + O(m r^{-2})\Big). \]
b. Conclude that the volume V enclosed by Σ and the sphere $S_r$ of area A has
the expansion
\[ V(A) = \frac{A^{3/2}}{6\sqrt{\pi}}\Big(1 + \frac{(3\sqrt{\pi})\, m}{\sqrt{A}} + O(m A^{-1})\Big). \]
Exercise 2-51 (conformal deformation of scalar curvature). Suppose (M, g) is a
semi-Riemannian manifold of dimension n, and $\tilde g = e^{\varphi} g$ where ϕ is a smooth
function on M, and let u > 0 be a smooth function on M.
a. Show that, with $\Delta_g \varphi = \mathrm{tr}_g(\mathrm{Hess}_g \varphi)$ in any signature,
d. For $\tilde g = u^{4/(n-2)} g$ and any smooth f, prove that $L_g f = u^{(n+2)/(n-2)} L_{\tilde g}(f/u)$.
Exercise 2-52. Suppose (M, g) is a Riemannian manifold.
a. Use the Bianchi identity and the Ricci formula to show that $\mathrm{div}_g L^*_g f =
-\tfrac12 f\, dR(g)$. Use this and Proposition 2-44 to give another proof of Corollary 2-45.
b. Find the kernel of $L^*_g$ if (M, g) is closed and has negative scalar curvature.
(Hint: Exercise 1-11.)
c. Observe that if Ric(g) = 0, then $L^*_g$ has nontrivial kernel. Recall an example
of a Ricci-flat metric where the kernel of $L^*_g$ has dimension greater than one, and
an example of a metric with zero scalar curvature and which is not Ricci-flat, but
for which the kernel of $L^*_g$ is nontrivial. What can you say about the kernel of $L^*_g$ if (M, g) is
closed with zero scalar curvature?
d. Consider the metric $g = (n-2)^{-1}\, \mathring g_{S^1} \oplus \mathring g_{S^{n-1}}$ on $S^1 \times S^{n-1}$. Show that
f(t, ω) = sin t is in the kernel of $L^*_g$.
Exercise 2-53 (kernel of $L^*_{g_S}$). Let $g_S$ be a Riemannian Schwarzschild metric of
nonzero mass m. Show that there is a one-dimensional kernel for $L^*_{g_S}$. Outline:
Observe that $L^*_{g_S} f = 0$ implies $\mathrm{Hess}_{g_S} f = f\, \mathrm{Ric}(g_S)$. (Recall from Remark 2-35
that whereas $\mathrm{Ric}(\bar g_S) = 0$, $\mathrm{Ric}(g_S) \ne 0$.) In three dimensions, write this out in
coordinates for which $g_S = (1 - 2m/r)^{-1} dr^2 + r^2 (d\varphi^2 + \sin^2\varphi\, d\theta^2)$. Show
that $\partial_\theta f = 0$ and $\partial_\varphi f = 0$, and then solve the remaining ODE for f. Compare
your answer with Example 2-41; for m > 0, note where this potential vanishes
(cf. Exercise 2-49). Generalize the argument to higher dimensions.
Exercise 2-54 (metric expansion in normal coordinates). Suppose ∇ is the Levi-
Civita connection for a metric g = ⟨ · , · ⟩ on M. Suppose that γ (t) is a unit-speed
geodesic, and that J (t) is a Jacobi field along γ : J ′′ (t) = R(γ ′ (t), J (t), γ ′ (t))
(cf. Proposition 2-3).
c. Let L(r) be the length of a geodesic circle of radius r about p, and let A(r)
be the area enclosed by this circle, both computed using the metric g. Show that
\[ \lim_{r\searrow 0} \frac{3}{\pi}\, \frac{2\pi r - L(r)}{r^3} = K(p) = \lim_{r\searrow 0} \frac{12}{\pi}\, \frac{\pi r^2 - A(r)}{r^4}. \]
d. Let $D = \{(x, y) : x^2 + y^2 < 1\}$ be the unit disk in the plane, with polar
coordinates (ρ, θ), and consider the hyperbolic metric
\[ g_{\mathbb{H}} = \frac{4}{(1 - (x^2 + y^2))^2}\, (dx^2 + dy^2) = \frac{4}{(1 - \rho^2)^2}\, (d\rho^2 + \rho^2\, d\theta^2). \]
By solving the differential equation
\[ \frac{2\, d\rho}{1 - \rho^2} = dr, \]
show how to rewrite the hyperbolic metric as $g_{\mathbb{H}} = dr^2 + \sinh^2 r\, d\theta^2$. Use this
along with the formulas above to show K = −1 at the origin of coordinates (of
course, K = −1 everywhere).
Exercise 2-56 (volume expansion of geodesic balls). In this problem you will
generalize the relation of curvature and area from part c. of Exercise 2-55 to
higher dimensions, showing how the scalar curvature measures the top-order
deviation of the volume of small geodesic balls from that of Euclidean geometry.
a. Suppose (V, ⟨ , ⟩) is an n-dimensional real inner product space. Suppose that
T : V → V is a self-adjoint linear operator. If dσ is the Euclidean area measure,
Bn is the unit ball, and Sn−1 = {x ∈ V : ⟨x, x⟩ = 1} ⊂ V is the unit sphere in V,
then if Vol is the Euclidean volume,
\[ \int_{S^{n-1}} \langle T(x), x\rangle\, d\sigma = \mathrm{trace}(T)\, \mathrm{Vol}(B^n). \]
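A quick numerical check of this identity in the simplest case n = 2, where the unit sphere is the unit circle and Vol(B²) = π, may help fix ideas. The sketch below is illustrative only: the function name and the sample matrix are assumptions, and the periodic trapezoid rule is used because it integrates quadratic trigonometric polynomials exactly up to rounding.

```python
import math


def circle_integral_of_quadratic_form(T, samples=360):
    """Integrate <T x, x> over the unit circle with the periodic trapezoid rule."""
    (t11, t12), (t21, t22) = T
    step = 2 * math.pi / samples
    total = 0.0
    for k in range(samples):
        x, y = math.cos(k * step), math.sin(k * step)
        total += (t11 * x * x + (t12 + t21) * x * y + t22 * y * y) * step
    return total


# A sample self-adjoint (symmetric) operator on R^2.
T = ((2.0, 0.5), (0.5, -1.0))

lhs = circle_integral_of_quadratic_form(T)
rhs = (T[0][0] + T[1][1]) * math.pi  # trace(T) * Vol(B^2)
print(lhs, rhs)
```

The off-diagonal entries drop out of the integral, which is the mechanism behind the general statement: only the trace survives averaging over the sphere.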
3.1.3. Convex open sets. For the later purpose of understanding the causality
relations locally on S , we recall the concept of convex open sets in S .
Given p ∈ S , an open neighborhood U of p is called a normal neighborhood
of p provided U = exp p (Ũ ), where exp p ( · ) denotes the exponential map at p
and Ũ ⊂ T p S is a starshaped open set containing 0 such that exp p : Ũ → U is a
diffeomorphism (see [174, p. 71]).
I + ( p) = {q ∈ S : p ≪ q} and J + ( p) = {q ∈ S : p ≤ q}.
Here, the notation p ≤ q means either p = q or p < q. The timelike past and
causal past of p, denoted by I − ( p) and J − ( p) respectively, are defined in a
time-dual manner.
Naturally one wonders if I + ( p) is open in S . To answer this question, it is
convenient to restrict the causality relations to an open set U of S . Given p ∈ U ,
the timelike future of p within U , denoted by I + ( p, U ), consists of all points q
in U for which there exists a future-directed timelike curve within U from p
to q. Similarly, one defines J + ( p, U ).
When U is convex, the causality relations in U are as simple as those of the
Minkowski spacetime.
Lemma 3-7 (see [174, p. 403]). Let U be a convex open set in S , and let p, q ∈ U .
• q ∈ I + ( p, U ) if and only if the unique geodesic segment σ pq connecting p to q
within U is future-directed timelike. Consequently, I + ( p, U ) is open in U
(hence open in S ).
• If q ̸= p, then q ∈ J + ( p, U ) if and only if the unique geodesic segment σ pq
in U is future-directed causal. Consequently, J + ( p, U ) is the closure of
I + ( p, U ) in U .
Lemma 3-7 implies that ≪ is indeed an open relation on S .
Proof. Suppose $\mathcal{S}$ is compact. Since $I^+(p)$ is open for all p, there exist a finite
number of points $p_1, \dots, p_m$ such that $\mathcal{S} \subset \bigcup_{i=1}^{m} I^+(p_i)$. If $p_m \in I^+(p_m)$, then
there is a closed timelike curve through $p_m$. If $p_m \notin I^+(p_m)$, then $p_m \in I^+(p_j)$
for some j ≤ m − 1, hence $\mathcal{S} \subset \bigcup_{i=1}^{m-1} I^+(p_i)$. Repeating this argument, one
concludes that $\mathcal{S}$ must contain a closed timelike curve. □
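The combinatorial core of this argument can be sketched in a few lines. The code below is a restatement, not the reduction used in the proof: it assumes the finite cover has been encoded as a hypothetical list `pred`, where `pred[i] = j` records that p_i lies in I⁺(p_j), and it follows these pointers until an index repeats. Since ≪ is transitive, the resulting cycle of indices corresponds to a closed timelike curve.

```python
def find_timelike_cycle(pred):
    """Return indices c[0], c[1], ... with p_{c[0]} << p_{c[1]} << ... << p_{c[0]}.

    pred[i] = j encodes that p_i lies in I^+(p_j), i.e. p_j << p_i.
    Every index has an entry because the sets I^+(p_j) cover the spacetime.
    """
    first_seen = {}
    path = []
    i = 0
    # Walk backwards in time until some index is visited twice.
    while i not in first_seen:
        first_seen[i] = len(path)
        path.append(i)
        i = pred[i]
    past_directed_cycle = path[first_seen[i]:]
    # Reverse so that each point chronologically precedes the next one.
    return list(reversed(past_directed_cycle))


pred = [1, 2, 1]  # p_0 in I^+(p_1), p_1 in I^+(p_2), p_2 in I^+(p_1)
cycle = find_timelike_cycle(pred)
print(cycle)
```

The walk must terminate within m steps by the pigeonhole principle, which plays the role of the repeated elimination of points in the proof.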
A spacetime S is said to satisfy the causality condition if it contains no closed
causal curves. Despite being stronger than the chronology condition, the causality
condition itself in many cases is not well suited for doing analysis on S because
it does not rule out existence of “almost closed” causal curves. For this reason,
the following condition is often imposed.
Definition 3-12. The strong causality condition is said to hold at a point p ∈
S provided that given any open neighborhood U of p there exists an open
neighborhood V ⊂ U of p such that every causal curve σ : [0, 1] → S with
σ(0) ∈ V and σ(1) ∈ V lies entirely in U.
A spacetime S is said to be strongly causal if the strong causality condition
holds at every p ∈ S .
The following lemma gives a good illustration of the implication of the strong
causality condition.
Lemma 3-13. Suppose K ⊂ S is a compact set and the strong causality condition
holds at every point in K. Let σ : [0, b) → S, where b ≤ ∞, be a future-directed
causal curve such that σ(0) ∈ K. If σ is future-inextendible, i.e., if $\lim_{t\nearrow b} \sigma(t)$
does not exist, then σ eventually leaves K: that is, there exists a T ∈ (0, b) such
that σ(t) ∉ K, for all t > T.
Proof. If the conclusion is false, there exists an increasing sequence {ti } ⊂ (0, b)
such that σ (ti ) ∈ K and limi→∞ ti = b. As K is compact, passing to a subse-
quence, one may assume limi→∞ σ (ti ) = p for some p ∈ K . Applying the strong
causality condition at p, one can show limt↗b σ (t) = p, which is a contradiction.
□
The (strong) causality condition is used to define a globally hyperbolic space-
time.
Definition 3-14. A spacetime S is said to be globally hyperbolic if
(1) S is strongly causal, and
(2) the sets J + ( p) ∩ J − (q) are compact for all p, q ∈ S .
Remark 3-15. In [21], it was shown that the causality condition together with
condition (2) in fact implies the strong causality condition.
$\bar A \setminus A \subset \mathrm{edge}(A) \subset \bar A.$
map exp p ( · ) is a diffeomorphism from W̃δ onto its image. Denote this image
by Wδ .
Since p ∉ edge(A), there exists an open neighborhood V of p such that every
timelike curve, which is contained in V and is from I − ( p, V ) to I + ( p, V ), must
meet A. Fix a small δ > 0 such that W2δ ⊂ V . Given any x ∈ Dδ = {x ∈ Rn : |x| < δ},
consider the curve γx (t) = exp p (te0 + x1 e1 + · · · + xn en ), |t| < 2δ. By choosing
δ small, one can assume γx is timelike for all x ∈ Dδ . Now one restricts attention
within Wδ . As t increases within (−δ, δ), γx is from I − ( p, Wδ ) ⊂ I − ( p, V ) to
$\Phi_\Sigma : U \cap (\Sigma \times \mathbb{R}) \to \mathcal{S}$,
which is a continuous open map and satisfies the property $\psi_\Sigma(p) = p$ for all
p ∈ Σ. In particular, this shows that Σ is connected, since S is connected.
Definition 3-29. Given an achronal set A ⊂ S , the future and past domains of
dependence of A, D + (A) and D − (A), are defined as follows:
D + (A) = {q : every past-inextendible causal curve from q meets A},
D − (A) = { p : every future-inextendible causal curve from p meets A}.
The union of D + (A) and D − (A), denoted by D(A), is called the domain of
dependence of A.
Remark 3-30. D + (A) and D − (A) are also known as the future and past Cauchy
developments of A respectively. Similarly, D(A) = D + (A) ∪ D − (A) is also
called the Cauchy development of A.
Remark 3-31. It follows from Definition II on p. 114 that an achronal set Σ is a
Cauchy hypersurface if and only if D(Σ) = S.
The following facts relating D ± (A), I ± (A) and J ± (A) are easily checked:
A ⊂ D + (A) ⊂ A ∪ I + (A) ⊂ J + (A),
A ⊂ ∂ D + (A),
D + (A) ∩ I − (A) = ∅,
D + (A) ∩ D − (A) = A,
D(A) ∩ I + (A) = D + (A) \ A.
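A concrete model makes these relations easy to test. The sketch below works in two-dimensional Minkowski spacetime with coordinates (t, x) and c = 1, and takes A to be the closed segment {t = 0, |x| ≤ 1}; this choice of A and the function names are assumptions made for illustration. In this model a past-inextendible causal curve from (t, x) with t ≥ 0 reaches the slice t = 0 at some point within distance t of x, so D⁺(A) is the closed triangle |x| + t ≤ 1, t ≥ 0.

```python
def in_segment(t, x, half_width=1.0):
    """Membership in A = {t = 0, |x| <= half_width}."""
    return t == 0 and abs(x) <= half_width


def in_future_domain(t, x, half_width=1.0):
    """Membership in D^+(A) for the segment A in 1+1 Minkowski spacetime."""
    return t >= 0 and abs(x) + t <= half_width


def in_past_domain(t, x, half_width=1.0):
    """Membership in D^-(A), the time-dual of D^+(A)."""
    return t <= 0 and abs(x) - t <= half_width


# Check D^+(A) ∩ D^-(A) = A on a grid of exactly representable points.
grid = [k * 0.25 for k in range(-8, 9)]
intersection_matches_A = all(
    (in_future_domain(t, x) and in_past_domain(t, x)) == in_segment(t, x)
    for t in grid
    for x in grid
)
print(intersection_matches_A)
```

The full domain of dependence D(A) in this model is the closed diamond |x| + |t| ≤ 1.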
A basic feature about D + (A) is that information outside D + (A) traveling into
D + (A) must first pass through A.
Lemma 3-32. Suppose σ : [0, 1] → S is a past-directed causal curve with
σ(0) ∈ D⁺(A) and σ(1) ∉ D⁺(A). Then σ(t) ∈ A for some t ∈ [0, 1).
Proof. Since σ(1) ∉ D⁺(A), there is a past-inextendible causal curve β from
σ(1) that does not meet A. The union σ ∪ β is a past-inextendible causal curve
starting at σ(0) ∈ D⁺(A). Hence, it must meet A somewhere on $\sigma|_{[0,1)}$. □
One of the important aspects of D(A) = D + (A) ∪ D − (A) is that its interior
int D(A), if nonempty, is an open set with appealing properties.
Lemma 3-33. Suppose q ∈ int D(A). If q ∈ D⁺(A), then any past-inextendible
causal curve starting at q must meet I⁻(A); similarly, if q ∈ D⁻(A), any
future-inextendible causal curve starting at q must meet I⁺(A).
Proof. It suffices to consider the case q ∈ D + (A). Let β : [0, b) → S be a past-
inextendible causal curve with β(0) = q. The fact q ∈ int D(A) implies that there
is a nearby point p0 ∈ I + (q)∩ D(A). Let γ be a past-directed timelike curve from
p0 to β(0). Repeating the construction in the proof of Lemma 3-26(i), one obtains
a past-inextendible timelike curve γ̃ starting from p0 and a sequence of points
$\{p_k\}$ on γ̃ such that $\beta(k) \ll p_k$, for all k ≥ 1. Since $p_0 \in I^+(A) \cap D(A) \subset D^+(A)$,
γ̃ meets A somewhere. Therefore, β(k) ∈ I⁻(A) for large k. □
Remark 3-34. It follows from Lemma 3-33 that if p ∈ int D(A), then every
inextendible causal curve through p must meet both I − (A) and I + (A).
Lemma 3-35. The causality condition holds on int D(A), i.e., no causal loop
meets int D(A).
Proof. Suppose α is a causal loop passing some p ∈ int D(A). Traveling along α
infinitely many times, one gets an inextendible causal curve α̃. By Lemma 3-33,
α̃ meets both I + (A) and I − (A). On the other hand, α̃ meets A. This contradicts
the achronality of A. □
Lemma 3-36. If p, q ∈ int D(A) and p ≤ q, then J + ( p) ∩ J − (q) ⊂ int D(A).
Proof. When q = p, J + (q) ∩ J − (q) = {q} by Lemma 3-35. Hence it suffices to
assume p < q. There are a few cases to consider:
Case 1: q, p ∈ D + (A) \ A = D(A) ∩ I + (A). In this case, points that are
close to q are still in D(A) ∩ I + (A). Let q + ∈ I + (q) be chosen such that
q + ∈ D(A) ∩ I + (A). Consider the open set U = I − (q + ) ∩ I + (A). Given any
past-directed causal curve α : [0, 1] → S from q to p, one has α(t) ∈ U , for all
t ∈ [0, 1].
We proceed to show that U ⊂ D + (A). Suppose y ∈ U . Let σ : [0, 1] → S
be a past-directed timelike curve from q⁺ to y. If y ∉ D⁺(A), Lemma 3-32
implies σ (s) ∈ A for some s ∈ [0, 1). Since σ (1) = y ∈ I + (A), this contradicts
the achronality of A. Hence, U ⊂ D + (A).
Case 2: q ∈ D + (A) \ A = D(A) ∩ I + (A) and p ∈ D − (A) \ A = D(A) ∩ I − (A).
In this case, choose p − ∈ I − ( p) and q + ∈ I + (q) respectively such that p − ∈
D(A) ∩ I − (A) and q + ∈ D(A) ∩ I + (A). Any future-directed causal curve α
from p to q now is contained in the open set V = I + ( p − ) ∩ I − (q + ).
We conclude by showing that V ⊂ D(A). Suppose x ∈ V . Let γ : [0, 1] → S
and τ : [0, 1] → S be past-directed timelike curves from q + to x and from x
to p⁻ respectively. If x ∉ D(A) = D⁺(A) ∪ D⁻(A), then Lemma 3-32 implies
γ (s) ∈ A for some s ∈ [0, 1) and τ (t) ∈ A for some t ∈ (0, 1]. Again, this
contradicts the achronality of A. Hence, V ⊂ D(A).
Case 3: q ∈ D + (A) \ A = D(A) ∩ I + (A) and p ∈ A. The proof of this case is
identical to that of Case 2.
Case 4: q, p ∈ A. Again, this case can be proved in the same way as Case 2.
Any remaining case is dual to one of the cases above by reversing the time
orientation on S . This completes the proof. □
Results stronger than Lemmas 3-35 and 3-36 indeed hold on int D(A). Inter-
ested readers are referred to [174, Theorem 14.38] for a complete proof of the
following theorem.
This follows from Theorem 3-37 and Remark 3-31. The corollary’s converse
is also true; see [104].
We end this chapter with a brief introduction to Cauchy horizons. Although this
concept is not needed in the proof of the Penrose singularity theorem, it arises
naturally when an achronal set A is not a Cauchy hypersurface,
The past Cauchy horizon H − (A) is defined dually. The Cauchy horizon of A is
H (A) = H + (A) ∪ H − (A).
Exercises
Exercise 3-42. Let α : [0, 1] → S be a smooth causal curve segment. Let ε > 0
and x : (−ε, ε) × [0, 1] → S be a smooth variation of α, i.e., x(0, t) = α(t) for
all t ∈ [0, 1]. Let V = (∂ x/∂s)|s=0 be the variation vector field along α, with
covariant derivative V ′ (t) along α. Suppose for all t ∈ [0, 1], ⟨V ′ (t), α ′ (t)⟩ < 0.
Show that there is ε0 > 0 small enough so that for all 0 < s < ε0 , the curve
xs : [0, 1] → S with xs (t) = x(s, t) is timelike. Where could the strict inequality
be relaxed to ⟨V ′ (t), α ′ (t)⟩ ≤ 0?
This chapter presents the classical Penrose singularity theorem [177]. The main
ingredients in the proof concern, on the one hand, the causal structure of a
globally hyperbolic spacetime, discussed in the previous chapter, and on the
other, differential geometry techniques involving Jacobi fields together with the
Riccati and Raychaudhuri equations.
This chapter is organized as follows. In Section 4.1, we review the concept of
Jacobi fields and focal points of a spacelike submanifold. In Section 4.2, we give
a geometric formulation of the Riccati and Raychaudhuri equations along causal
geodesics. In Section 4.3, we state and prove the Penrose singularity theorem.
Throughout this chapter, S denotes an (n+1)-dimensional spacetime.
R(X, Y )Z = ∇ X ∇Y Z − ∇Y ∇ X Z − ∇[X,Y ] Z .
Thinking in the Newtonian regime, one tends to interpret $-R(V, \gamma_s')\gamma_s'$ as a
certain “force” acting on V. Thus the map $V \mapsto -R(V, \gamma_s')\gamma_s'$ is often known as
the tidal force operator.
A submanifold P ⊂ S is called spacelike if every tangent vector to P is
spacelike.
⟨V ′ (0), w⟩ = ⟨∇γs′ (0) V, w⟩ = ⟨∇V γs′ (0), w⟩ = −⟨γs′ (0), II(V (0), w)⟩, (4.1.2)
V ′′ + R(V, γ ′ )γ ′ = 0. (4.1.3)
The set Ṽ of Jacobi fields along γ satisfying the initial condition (4.1.4) is a
vector space of dimension n + 1. Consider the subspace V ⊂ Ṽ given by
V = {J ∈ Ṽ : J ⊥ γ ′ along γ }.
γ (t) is not a focal point, B(t) is a linear isomorphism and B(t)−1 (v) is the
unique Jacobi field J ∈ V such that J (t) = v. The map A(t) : γ ′ (t)⊥ → γ ′ (t)⊥
defined by
A(t)(v) = [B(t)−1 (v)]′ (t), (4.2.1)
A(t) is therefore a smooth section of the vector bundle over γ with fiber
End(γ ′ (t)⊥ , γ ′ (t)⊥ ), the space of linear maps from γ ′ (t)⊥ to itself.
Now let A′ (t) = ∇γ ′ (t) A(t). By definition,
for any smooth vector field W (t) along γ such that W (t) ∈ γ ′ (t)⊥ . Taking
W (t) = J (t) ∈ V and applying (4.2.2), we have
A′ (t)(J (t)) = ∇γ ′ (t) [A(t)(J (t))] − A(t)(∇γ ′ (t) J (t))
= J′′(t) − A(t)(J′(t))
= −R(J (t), γ ′ (t))γ ′ (t) − A(t) ◦ A(t)(J (t)). (4.2.3)
Proof. By (4.1.4),
\[ \lim_{t\searrow 0} \langle A(t)(J_i(t)), J_j(t)\rangle = \langle J_i'(0), J_j(0)\rangle = -\langle \gamma'(0), \mathrm{II}(J_i(0), J_j(0))\rangle \]
where trγ ′ (t)⊥ ( · ) denotes the trace on γ ′ (t)⊥ . It is easily checked that
\[ \frac{d}{dt}\, \mathrm{tr}_{\gamma'(t)^\perp} A(t) = \mathrm{tr}_{\gamma'(t)^\perp} A'(t). \tag{4.2.9} \]
With Proposition 4-2, this shows that θ (t) satisfies the following Raychaudhuri
equation.
Proof. Taking the trace of (4.2.4) and using (4.2.9), one has
where $\mathrm{tr}_{\gamma'(t)^\perp}(A(t) \circ A(t)) = |h(t)|^2 = \tfrac{1}{n}\theta(t)^2 + |\mathring h(t)|^2$. This proves (4.2.10). □
gi j (t) = ⟨ Ji (t), J j (t)⟩ and let (g i j (t))n×n be the inverse matrix of (gi j (t))n×n .
By definition and Proposition 4-3,
⟨ Ji′ (0), w⟩= −⟨γ ′ (0), II(ei , w)⟩, for all w ∈ Tγ (0) P. (4.2.12)
= −⟨γ ′ (0), H⃗ ⟩. □
4.2.3. Raychaudhuri equation along null geodesics. When γ ′ (t) is null, the
restriction of ⟨ · , · ⟩ to γ ′ (t)⊥ is degenerate, since γ ′ (t) ∈ γ ′ (t)⊥ . The next two
lemmas are left as exercises.
Lemma 4-6. If γ ′ (t) is null, every v ∈ γ ′ (t)⊥ which is not a scalar multiple of
γ ′ (t) is spacelike.
Lemma 4-7. $\mathrm{Ric}(\gamma'(t), \gamma'(t)) = \sum_{i=1}^{n-1} \langle R(e_i, \gamma'(t))\gamma'(t), e_i\rangle$ for any collection
of vectors $\{e_1, \dots, e_{n-1}\} \subset \gamma'(t)^\perp$ satisfying $\langle e_i, e_j\rangle = \delta_{ij}$.
As in the previous case, consider the associated symmetric bilinear form h̃(t) :
γ ′ (t)⊥ / ∼ × γ ′ (t)⊥ / ∼ → R where h̃(t)([v], [w]) = ⟨ Ã(t)([v]), [w]⟩∼ .
Define
θ̃ (t) = trγ ′ (t)⊥ /∼ Ã(t) = trγ ′ (t)⊥ /∼ h̃(t), (4.2.14)
where trγ ′ (t)⊥ /∼ ( · ) is the trace on γ ′ (t)⊥ / ∼ . Similar to (4.2.9), one has
\[ \frac{d}{dt}\, \tilde\theta(t) = \mathrm{tr}_{\gamma'(t)^\perp/\sim} \tilde A'(t). \tag{4.2.15} \]
The following Raychaudhuri equation for θ̃ (t) follows from Lemma 4-7, (4.2.15)
and taking the trace of (4.2.13).
⟨ Ji′ (0), w⟩= −⟨γ ′ (0), II(ei , w)⟩, for all w ∈ Tγ (0) P. (4.2.16)
for all future-directed null vectors ν that are normal to Σ. Here H⃗ denotes the
mean curvature vector of Σ in S.
It follows from (4.3.1) that there exists a constant δ > 0 such that
It follows from (4.3.3), (4.3.4) and elementary calculus that $L < \frac{n-1}{\delta}$. Hence,
defined like $W_i$ just above. Since W is compact and ∂I⁺(Σ) is closed, ∂I⁺(Σ)
must also be compact.
Now, by Lemma 3-19(ii) and Corollary 3-24, ∂I⁺(Σ) is an achronal, C⁰
hypersurface. Since ∂I⁺(Σ) is also compact, Proposition 3-27(ii) implies that
any Cauchy hypersurface in S is homeomorphic to ∂I⁺(Σ), therefore must be
compact. This contradicts the assumption that S has a noncompact Cauchy
hypersurface, completing the proof. □
CHAPTER 5
5.1. Introduction
write
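\[ \mathrm{div}_\eta F = 0 \quad\text{and}\quad dF^\flat = 0. \]
From the second equation and the Poincaré lemma (see, e.g., [141]), we can
write $F^\flat = dA$ for a one-form $A = A_\mu\, dx^\mu$, well-defined up to a gauge function
φ: $F^\flat = d(A + d\varphi)$ for any smooth φ. If $\vec\nabla$ is the spatial (Euclidean) gradient
operator, and if we let
\[ \mathbf{A} = A_1 \frac{\partial}{\partial x^1} + A_2 \frac{\partial}{\partial x^2} + A_3 \frac{\partial}{\partial x^3}, \]
then we can make the identifications
\[ \mathbf{B} = \vec\nabla \times \mathbf{A}, \qquad \mathbf{E} = \vec\nabla A_0 - \frac{\partial \mathbf{A}}{\partial t}. \]
Note that the spatial divergence vanishes automatically: $\vec\nabla \cdot \mathbf{B} = 0$. The spatial
divergence of E is given by
\[ \vec\nabla \cdot \mathbf{E} = \Delta A_0 - \vec\nabla \cdot \frac{\partial \mathbf{A}}{\partial t}, \]
where Δ is the Euclidean Laplacian on R³. A simple calculation shows that
$(\mathrm{div}_\eta F)^0 = F^{0\nu}{}_{;\nu} = \vec\nabla \cdot \mathbf{E}$, so we see that while Maxwell’s equations are second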
order in A, this component has only one time derivative of A in it. In terms
of formulating an evolution problem, this component of Maxwell’s equations,
equivalent to the vanishing of the spatial divergence of E, must be satisfied by
the initial data. This imposes a constraint on the data.
To formulate a second-order initial value problem for A, we can use the gauge
freedom in A. The Lorenz⁴ gauge condition imposes precisely that
\[ 0 = \mathrm{div}_\eta A = -\frac{\partial A_0}{\partial t} + \vec\nabla \cdot \mathbf{A}. \]
Under the Lorenz gauge condition, Maxwell’s equations are equivalent to
\[ \Box A_\mu = -\frac{\partial^2 A_\mu}{\partial t^2} + \Delta A_\mu = 0. \]
Exercise 5-1. Verify this last claim. Use that $F_{\mu\nu} = \dfrac{\partial A_\nu}{\partial x^\mu} - \dfrac{\partial A_\mu}{\partial x^\nu}$.
4 Note the spelling; this is not the eponym of Lorentz transformations!
\[ \vec\nabla \cdot \mathbf{E} = \Delta A_0 - \vec\nabla \cdot \frac{\partial \mathbf{A}}{\partial t} = 0. \tag{5.1.1} \]
We may also assume that the Lorenz gauge condition holds at t = 0. Indeed,
given initial data $\mathring A_\mu|_{t=0}$ and $(\partial \mathring A_\mu/\partial t)|_{t=0}$ satisfying the constraint, we can
arrange the Lorenz gauge condition at t = 0 by adding dφ to Å, for a function φ
depending only on (x¹, x², x³); this is a gauge transformation, and does not affect
the field F. In particular, it does not change E, which remains divergence-free.
Finding φ involves solving a Poisson equation $\Delta\varphi = (\partial \mathring A_0/\partial t - \vec\nabla \cdot \mathring{\mathbf{A}})|_{t=0}$ on
R³, where we note the right-hand side can be computed in terms of the initial
data. We assume we are working in function spaces where we can solve this
equation (for instance, where the fields decay sufficiently near infinity), and
similar comments apply in the remainder of this section. We note that this gauge
transformation A = Å + dφ does not change the time derivative at t = 0.
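The effect of this gauge adjustment can be written out explicitly. The following is a sketch in the notation of this section, with bold letters for the spatial part of the potential, and it uses only the assumption stated above that φ is independent of t (so that the time component of dφ vanishes and its spatial divergence is Δφ):

```latex
\[
\mathrm{div}_\eta(\mathring A + d\varphi)\Big|_{t=0}
 = \Big(-\frac{\partial \mathring A_0}{\partial t}
        + \vec\nabla\cdot\mathring{\mathbf A}\Big)\Big|_{t=0} + \Delta\varphi = 0
\quad\Longleftrightarrow\quad
\Delta\varphi = \Big(\frac{\partial \mathring A_0}{\partial t}
        - \vec\nabla\cdot\mathring{\mathbf A}\Big)\Big|_{t=0},
\]
\[
d(\mathring A + d\varphi) = d\mathring A,
\qquad
\frac{\partial}{\partial t}\big(\mathring A + d\varphi\big)
 = \frac{\partial \mathring A}{\partial t}.
\]
```

The first line is the Poisson equation for φ; the second records that the field F and the initial time derivative of the potential are unchanged.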
We now solve the wave equation □ Aµ = 0. We will have produced a solution
to Maxwell’s equations, provided we can show that the Lorenz gauge condition,
which we have satisfied at t = 0, is propagated in time. How do we do this? We
have not yet incorporated the constraint. In fact, the condition (5.1.1) at t = 0,
together with the wave equation $\Box A_0 = 0$, yields at t = 0
\[ \frac{\partial^2 A_0}{\partial t^2} = \Delta A_0 = \sum_{\ell=1}^{3} \frac{\partial^2 A_\ell}{\partial t\, \partial x^\ell}, \]
which is just $(\partial\, \mathrm{div}_\eta A/\partial t)|_{t=0} = 0$. Now, the wave equation for each $A_\mu$ also
implies $\Box(\mathrm{div}_\eta A) = 0$. In other words, we see that $\mathrm{div}_\eta A$ satisfies the wave
equation, and has vanishing initial data. Thus $\mathrm{div}_\eta A = 0$ for all t, and the gauge
condition propagates in time. In summary, the gauge term (which we arranged
to vanish at t = 0) is propagated with the wave equation, which, together with
the constraint $\vec\nabla \cdot \mathbf{E} = 0$, implies the vanishing of the first derivative of the gauge
term.
We could also formulate the initial value problem for the source-free Maxwell’s
equations in terms of E and B, with the constraints that the spatial divergences
vanish. We will solve for A using the wave equation as above, where we take
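the initial data for A as follows: $A_0 = 0$, and $\mathbf{A}$ is chosen so that $\vec\nabla \times \mathbf{A} = \mathbf{B}$
(this is possible since $\vec\nabla \cdot \mathbf{B} = 0$ at t = 0). We also set $(\partial \mathbf{A}/\partial t)|_{t=0} = -\mathbf{E}$, and
to arrange the Lorenz gauge condition initially, we take $\partial A_0/\partial t = \vec\nabla \cdot \mathbf{A}$ at t = 0.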
We can now solve for A as above to produce solutions to the Maxwell equations
with given initial electric and magnetic fields.
\[ \Sigma^+ = \Big\{(x^0, x^1, x^2, x^3) : 0 < x^0 = \sqrt{(x^1)^2 + (x^2)^2 + (x^3)^2}\Big\} \subset \mathbb{M}^4 \]
denote the forward lightcone minus the origin. Consider a point p in this submanifold
at which x² = 0 = x³, and x¹ > 0, so that x⁰ = x¹ at p. The tangent space
$T_p\Sigma^+$ is spanned by the vectors $(\partial/\partial x^0)|_p + (\partial/\partial x^1)|_p$, $(\partial/\partial x^2)|_p$, $(\partial/\partial x^3)|_p$.
Note that the first of these is orthogonal to all of $T_p\Sigma^+$, and so the Minkowski
metric does not induce a metric on Σ⁺. Σ⁺ is a null hypersurface.
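The degeneracy at the point p can be seen by computing the Gram matrix of the spanning vectors with respect to the Minkowski metric. This is a minimal sketch assuming the signature (−, +, +, +) used in this section; the function name is illustrative.

```python
def minkowski_product(u, v):
    """Minkowski inner product with signature (-, +, +, +)."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))


# Spanning vectors of the tangent space at p: d/dx^0 + d/dx^1, d/dx^2, d/dx^3.
tangent_basis = [(1, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
null_direction = tangent_basis[0]

# The null direction is orthogonal to every tangent vector, itself included.
products_with_null = [minkowski_product(null_direction, w) for w in tangent_basis]

# Gram matrix of the restricted bilinear form; a zero row means degeneracy.
gram = [[minkowski_product(u, v) for v in tangent_basis] for u in tangent_basis]
print(products_with_null, gram)
```

The zero first row and column of the Gram matrix show that the restricted form has a nontrivial kernel, spanned by the null direction.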
We begin by reviewing the proof of the Gauss equation, which relates the
curvature of the submanifold to the ambient curvature and the second fundamental
form.
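Proposition 5-5 (the Gauss equation). For any X, Y, Z, W ∈ $T_p\Sigma$, we have
\[ \langle R^\Sigma(X, Y, Z), W\rangle = \langle R(X, Y, Z), W\rangle - \langle \mathrm{II}(X, Z), \mathrm{II}(Y, W)\rangle + \langle \mathrm{II}(X, W), \mathrm{II}(Y, Z)\rangle. \tag{5.1.2} \]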
To derive the Einstein constraint equations, we will use the Einstein equation,
together with the Gauss equation, and the Codazzi equation, which we present
now. We first define the normal connection $\nabla^\perp$ in the normal bundle NΣ as
follows: for V tangent to Σ and Z normal along Σ, we define $\nabla^\perp_V Z$ to be the
normal component of $\nabla_V Z$. We can use this connection (and impose a product
rule) to differentiate tensors with values in the normal bundle, in particular the
second fundamental form: for V, X and Y tangent to Σ,
Proof. As in the proof of the Gauss equation, we decompose the curvature tensor:
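\[
\begin{aligned}
R(X, Y, Z) &= \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z \\
&= \nabla_X\big(\nabla^\Sigma_Y Z + \mathrm{II}(Y, Z)\big) - \nabla_Y\big(\nabla^\Sigma_X Z + \mathrm{II}(X, Z)\big) - \nabla_{[X,Y]} Z.
\end{aligned}
\]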
Before we move on, we record the Gauss and Codazzi equations in index form in local coordinates, for the hypersurface case. Recall the convention $R_{ij\ell m} = g_{ms} R^{s}_{ij\ell}$. In the following, the indices refer to components tangential to $\Sigma$, while the “n” index refers to the vector $n$ inserted in the corresponding slot in the tensor. The Gauss equation is easily seen to be
This readily follows from (5.1.4), $\mathrm{II}(X, Y) = \langle n, n\rangle K(X, Y)\, n$ and $\langle \nabla_X n, n\rangle = 0$.
[Figure: the hypersurface $\Sigma^k$ in $(M^{k+1}, \bar g)$ with unit normal $n$.]
density of the matter fields as measured by the observer with four-velocity cn,
and J is (c times) the corresponding momentum density one-form. If n is future-pointing, then the dominant energy condition (J is future-pointing causal) implies
\[
\rho \ge |J|_g = \sqrt{\sum_{i=1}^{k} J^i J_i}.
\]
We now come to the Einstein constraint equations, analogues of the divergence
constraint on the initial data for the Maxwell equations we saw earlier. We recall
that in spacetime dimension four, $\kappa = 8\pi G/c^4$.
Theorem 5-7 (Einstein constraint equations). Let $(M, \bar g)$ be Lorentzian. The following system of equations must hold on a Riemannian hypersurface $\Sigma \subset M$, where the Einstein equation $\mathrm{Ric}(\bar g) - \tfrac12 R(\bar g)\bar g + \Lambda \bar g = \kappa T$ holds on $M$:
\[
R(g) - |K|_g^2 + (\mathrm{tr}_g K)^2 = 2\kappa\rho + 2\Lambda, \tag{5.2.2}
\]
\[
\mathrm{div}_g K - d(\mathrm{tr}_g K) = \kappa J. \tag{5.2.3}
\]
Equation (5.2.2) is known as the Hamiltonian constraint, and (5.2.3) is the
momentum constraint; see Sections 5.3.3.2 and 7.2.1 for further connection to
the energy and momenta for gravitational systems. When we insert ρ = 0 and
J = 0 into the constraints, we obtain the vacuum constraint equations.
Proof. Let $E_1, \dots, E_k$ be a smooth local orthonormal frame field for $\Sigma$. We first apply the Gauss equation:
\[
\sum_{i,j=1}^{k} \langle R(E_i, E_j, E_j), E_i\rangle
= \sum_{i,j=1}^{k} \Big( \langle R^\Sigma(E_i, E_j, E_j), E_i\rangle - \langle \mathrm{II}(E_j, E_j), \mathrm{II}(E_i, E_i)\rangle + \langle \mathrm{II}(E_i, E_j), \mathrm{II}(E_i, E_j)\rangle \Big)
\]
Remark 5-8. The momentum constraint (5.2.3) appears on [218, p. 266] and
some other works with a sign difference where the second fundamental form has
the opposite sign to ours. In Chapter 8 of this volume, the momentum constraint
takes the same form as (5.2.3), with both the second fundamental form and the
one-form J having the opposite sign to what we have taken here.
Example 5-9. In Minkowski spacetime $\mathbb{M}^{1+k}$, the metric $\bar g = -dt^2 + \sum_{j=1}^{k} (dx^j)^2$
As $K$ involves one time derivative of the metric (see (5.3.4) below), the expressions of $G_{nn}$ and $G_{ni}$ in (5.2.7)–(5.2.8) do not involve second time derivatives of the metric. Thus in formulating the Einstein equation as a second-order evolution problem for the spacetime metric, as discussed in the next section, the values of $G_{n\mu}$ (and hence $(G_\Lambda)_{n\mu}$) can be expressed in terms of the initial data set $(\Sigma, g, K; \rho, J)$ (where for the non-vacuum case, we augment the geometric initial data $(g, K)$ with the energy-momentum density $(\rho, J)$ along $\Sigma$). Thus, for instance, a valid initial data set $(\Sigma, g, K)$ for the vacuum Einstein equation is constrained by the condition that these components must vanish along $\Sigma$: $(G_\Lambda)_{nn} = 0$, $(G_\Lambda)_{ni} = 0$ along $\Sigma$, i.e., (5.2.2)–(5.2.3) must hold with $(\rho, J) = (0, 0)$.
In the non-vacuum case, an initial data set may include initial data for the
physical fields being modeled, from which ρ and J can be computed, and
moreover the constraints may be augmented by equations that these fields must
also satisfy, as in the following example.
Example 5-13. We consider the coupled Einstein–Maxwell equations in spacetime dimension four, initial data for which will be taken to be $(\Sigma, g, K; E, B)$, though we might instead encode the electromagnetic fields on $\Sigma$ using a potential $A$, as seen in Section 5.1.1. As in (1.3.3)–(1.3.4), we have
\[
\rho = \frac{1}{8\pi}\big(|E|_g^2 + |B|_g^2\big) \quad\text{and}\quad J^i = \frac{1}{4\pi}(E \times_g B)^i.
\]
Along with using these in the constraint equations (5.2.2)–(5.2.3), we also impose constraints on the divergence of $E$ and $B$, namely in the absence of charge, $\mathrm{div}_g E$ and $\mathrm{div}_g B$ vanish.
Any spacelike hypersurface in any Lorentzian manifold gives rise to a solution of the constraints, defining $T$ so that the Einstein equation holds for a given $\Lambda$. Similarly, given $(\Sigma, g, K)$, one can simply define $\rho$ and $J$ by (5.2.2)–(5.2.3) to get a solution $(\Sigma, g, K; \rho, J)$ of the constraint equations. With this in mind, we often impose some form for $T$, such as $T = 0$ (vacuum), or at least impose an energy condition (cf. Section 2.3.2) on $T$, such as the dominant energy condition $\rho \ge |J|_g$. Under such restrictions, the equations (5.2.2)–(5.2.3) do constrain $g$ and $K$ on $\Sigma^k$ ($k \ge 2$) in some way. For instance, if $\Lambda = 0$, the dominant energy condition can be written
\[
\tfrac12\big(R(g) - |K|_g^2 + (\mathrm{tr}_g K)^2\big) \ge \big|\mathrm{div}_g K - d(\mathrm{tr}_g K)\big|_g.
\]
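As a small worked instance of the Hamiltonian constraint (5.2.2) and the dominant energy condition with Λ = 0, the following Python sketch evaluates the left-hand side of (5.2.2) for hypothetical data on a flat slice: g = a²δ and K = −a·ȧ·δ on ℝ³, with a and ȧ chosen arbitrarily. The helper names, the numerical values, and the normalization κ = 1 are assumptions made for illustration only. Since K is a constant multiple of g here, the momentum constraint gives J = 0.

```python
def trace(T, g_inv):
    """tr_g T = g^{ij} T_{ij} for a (0,2)-tensor T given as a nested list."""
    k = len(T)
    return sum(g_inv[i][j] * T[i][j] for i in range(k) for j in range(k))


def norm_sq(T, g_inv):
    """|T|_g^2 = g^{il} g^{jm} T_{ij} T_{lm}."""
    k = len(T)
    return sum(
        g_inv[i][l] * g_inv[j][m] * T[i][j] * T[l][m]
        for i in range(k) for j in range(k)
        for l in range(k) for m in range(k)
    )


def hamiltonian_lhs(scalar_curvature, K, g_inv):
    """Left-hand side R(g) - |K|_g^2 + (tr_g K)^2 of (5.2.2)."""
    return scalar_curvature - norm_sq(K, g_inv) + trace(K, g_inv) ** 2


def scaled_identity(c, k):
    """The k x k matrix c * identity."""
    return [[c if i == j else 0.0 for j in range(k)] for i in range(k)]


# Hypothetical data: g = a^2 * delta (flat, so R(g) = 0), K = -a * adot * delta.
a, adot, k = 2.0, 0.5, 3
g_inv = scaled_identity(1.0 / a**2, k)
K = scaled_identity(-a * adot, k)

lhs = hamiltonian_lhs(0.0, K, g_inv)
kappa = 1.0                 # illustrative normalization, not the physical value
rho = lhs / (2.0 * kappa)   # energy density defined through (5.2.2), Lambda = 0
J_norm = 0.0                # K is a constant multiple of g, so div_g K - d(tr_g K) = 0
dominant_energy_holds = rho >= J_norm
```

For this data the left-hand side equals 6(ȧ/a)², so ρ is nonnegative and the dominant energy condition holds trivially.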
tensors). Even when one accounts for diffeomorphism gauge equivalence, the
system is still underdetermined, and indeed there are lots of solutions to the
vacuum constraints.
The time-symmetric, or Riemannian, constraints are the case $K = 0$. Then (5.2.2) reduces to $R(g) = 2\kappa\rho + 2\Lambda$, so that if $\Lambda = 0$, the scalar curvature $R(g)$ is proportional to the energy density, and $R(g) \ge 0$ if and only if $\rho \ge 0$ (which holds under the weak, and hence under the dominant, energy condition). The maximal case is $H = 0$, so that $R(g) = |K|_g^2 + 2\kappa\rho + 2\Lambda \ge 2\kappa\rho + 2\Lambda$. In the vacuum case ($T = 0$), the time-symmetric constraints reduce to $R(g) = 2\Lambda$ (constant scalar curvature), and in the maximal case we have $R(g) \ge 2\Lambda$; we often consider $\Lambda = 0$, which highlights the condition of zero or nonnegative scalar curvature of $(\Sigma, g)$.
The constraints operator $\Phi$ is defined by
As we will see in Sections 5.3.3.2 and 7.2.1, we may sometimes want to rewrite the constraints in terms of the momentum tensor $\pi$, which is algebraically equivalent ($k \ge 2$) to $K$ via $\pi_{ij} = K_{ij} - (\mathrm{tr}_g K) g_{ij}$. It is easy to see that
\[
\begin{aligned}
R(g) - |K|_g^2 + (\mathrm{tr}_g K)^2 &= R(g) - |\pi|_g^2 + \frac{1}{k-1}(\mathrm{tr}_g \pi)^2, \\
\mathrm{div}_g K - d(\mathrm{tr}_g K) &= \mathrm{div}_g \pi.
\end{aligned}
\]
When we use $\pi$ instead of $K$, we may abuse notation and write the constraints operator as $\Phi(g, \pi) = \big(R(g) - |\pi|_g^2 + \frac{1}{k-1}(\mathrm{tr}_g \pi)^2,\ \mathrm{div}_g \pi\big)$. While this abuse of notation should not cause confusion, we note that below we might use the same notation for the operator where $\pi$ is treated as a (2, 0)-tensor, in which case $\mathrm{div}_g \pi$ is a vector field. In the literature, the constraints operator may appear as the above operator except with a factor of $\pm\tfrac12$ in the first component (or a factor of $\pm 2$ on the second component); see, e.g., (5.2.7) or Section 5.3.3.2.
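The algebraic equivalence between K and π can be checked numerically at a point. The Python sketch below works in a g-orthonormal frame (so g is the identity matrix), uses an arbitrarily chosen symmetric K with k = 3, and compares the two quadratic expressions; it also inverts the relation via K = π − (tr π)/(k − 1)·g. All function names and the sample matrix are illustrative assumptions.

```python
def trace(T):
    """Trace of a square matrix (components in an orthonormal frame)."""
    return sum(T[i][i] for i in range(len(T)))


def norm_sq(T):
    """Sum of squares of all components, i.e. |T|^2 in an orthonormal frame."""
    return sum(entry * entry for row in T for entry in row)


def momentum_tensor(K):
    """pi_ij = K_ij - (tr K) delta_ij."""
    k, tr = len(K), trace(K)
    return [[K[i][j] - (tr if i == j else 0.0) for j in range(k)] for i in range(k)]


def second_fundamental_form(pi):
    """Inverse relation K_ij = pi_ij - (tr pi)/(k-1) delta_ij, valid for k >= 2."""
    k = len(pi)
    c = trace(pi) / (k - 1)
    return [[pi[i][j] - (c if i == j else 0.0) for j in range(k)] for i in range(k)]


def quadratic_in_K(K):
    """-|K|^2 + (tr K)^2."""
    return -norm_sq(K) + trace(K) ** 2


def quadratic_in_pi(pi):
    """-|pi|^2 + (tr pi)^2 / (k-1)."""
    return -norm_sq(pi) + trace(pi) ** 2 / (len(pi) - 1)


# An arbitrary symmetric sample matrix (hypothetical data).
K = [[1.0, 2.0, 0.0],
     [2.0, -1.0, 3.0],
     [0.0, 3.0, 4.0]]
pi = momentum_tensor(K)
K_recovered = second_fundamental_form(pi)
```

Both quadratic expressions agree on this sample, and `K_recovered` reproduces `K`, which is the sense in which the two descriptions are algebraically equivalent.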
5.3. The initial value formulation for the vacuum Einstein equation
In this section we discuss aspects of the analysis and geometry of the initial value
formulation for Ric(ḡ) = 0, or more generally $G_\Lambda(\bar g) = 0$. We will follow the
approach of the foundational work of Choquet-Bruhat [49]. The purpose of this
section is to illustrate the ideas; to be mathematically precise, we should specify
function spaces and state carefully the partial differential equations results that
are in play. We will not do this, but refer the reader to [51; 190; 218].
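\[
\lambda^\alpha = \Box_{\bar g} x^\alpha = \bar g^{\mu\nu} x^\alpha_{;\mu\nu} = \bar g^{\mu\nu}\big(-\Gamma^\kappa_{\mu\nu} x^\alpha_{,\kappa}\big) = -\bar g^{\mu\nu}\Gamma^\alpha_{\mu\nu}.
\]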
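From (2-8b) (p. 72), the components of the Ricci curvature of $\bar g$ are given by
\[
R_{\mu\nu} = \Gamma^\kappa_{\mu\nu,\kappa} - \Gamma^\kappa_{\mu\kappa,\nu} + \Gamma^\kappa_{\kappa\gamma}\Gamma^\gamma_{\mu\nu} - \Gamma^\kappa_{\nu\gamma}\Gamma^\gamma_{\mu\kappa} \sim \Gamma^\kappa_{\mu\nu,\kappa} - \Gamma^\kappa_{\mu\kappa,\nu}.
\]
Moreover,
\[
\Gamma^\kappa_{\mu\nu,\kappa} - \Gamma^\kappa_{\mu\kappa,\nu} \sim \tfrac12 \bar g^{\kappa\gamma}\Big[\big(\bar g_{\mu\gamma,\nu\kappa} + \bar g_{\nu\gamma,\mu\kappa} - \bar g_{\mu\nu,\gamma\kappa}\big) - \big(\bar g_{\mu\gamma,\kappa\nu} + \bar g_{\kappa\gamma,\mu\nu} - \bar g_{\mu\kappa,\gamma\nu}\big)\Big]
\]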
By adding to the Ricci curvature operator the Lie derivative of the metric with respect to a suitable vector field, we obtain the reduced Ricci curvature $R^H_{\mu\nu}$, whose leading term plainly constitutes a nonlinear wave operator. (Recall the Lie derivative formula $(L_X \bar g)_{\mu\nu} = \bar g_{\alpha\mu} X^\alpha_{;\nu} + \bar g_{\alpha\nu} X^\alpha_{;\mu}$.)
We let $R^H = \bar g^{\mu\nu} R^H_{\mu\nu}$ and $(G^H_\Lambda(\bar g))_{\mu\nu} = R^H_{\mu\nu} - \tfrac12 R^H \bar g_{\mu\nu} + \Lambda \bar g_{\mu\nu}$, and we consider the reduced Einstein equation $G^H_\Lambda(\bar g) = 0$, which can be formulated as a system of quasilinear wave equations:
5.3.2. The Einstein constraints and the propagation of the gauge condition.
Suppose we are given a solution $(\Sigma, g, K)$ of the Einstein constraint equations for $G_\Lambda(\bar g) = 0$. We now incorporate this solution into initial data for (5.3.1). Choose local coordinates $x^i$ ($i \ge 1$) on $V \subset \Sigma$; we will obtain a solution of (5.3.1) on the product $I \times U$, where $I$ is an interval around 0 with coordinate $x^0 = ct$, and $U$ is a compactly contained open subset of $V$. We prescribe initial values of $\bar g_{\mu\nu}$ and $\bar g_{\mu\nu,0}$ at $t = 0$ as follows: for $i, j \ge 1$, let $\bar g_{ij} = g_{ij}$, and $\bar g_{ij,0} = -2K_{ij}$; let $\bar g_{00} = -1$, and for $j \ge 1$, let $\bar g_{0j} = 0$. For a spacetime metric $\bar g$ with these conditions, $\partial/\partial x^0$ is a unit normal field to $\{0\} \times U$, and so the components of the second fundamental form of $U$ are indeed given by
\[
K_{ij} = \bar g\Big(\nabla_{\frac{\partial}{\partial x^i}} \frac{\partial}{\partial x^j},\ \frac{\partial}{\partial x^0}\Big) = -\tfrac12 \bar g_{ij,0},
\]
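At $t = 0$ we get
\[
\lambda^0 = \bar g^{00}{}_{,0} + \tfrac12 \bar g^{00} \bar g^{\rho\sigma} \bar g_{\rho\sigma,0} = -\tfrac12 \bar g_{00,0} - \tfrac12 \sum_{i,j \ge 1} \bar g^{ij} \bar g_{ij,0}, \tag{5.3.2}
\]
and, for $i \ge 1$,
\[
\begin{aligned}
\lambda^i &= \bar g^{0i}{}_{,0} + \sum_{j \ge 1} \Big(\bar g^{ji}{}_{,j} + \tfrac12 \bar g^{ji} \bar g^{\rho\sigma} \bar g_{\rho\sigma,j}\Big) \\
&= \sum_{j \ge 1} \bar g_{0j,0}\, \bar g^{ji} + \sum_{j \ge 1} \Big(\bar g^{ji}{}_{,j} + \tfrac12 \bar g^{ji} \bar g^{\rho\sigma} \bar g_{\rho\sigma,j}\Big).
\end{aligned}
\tag{5.3.3}
\]
The final summation in (5.3.3) only involves spatial derivatives, so we can solve for $\bar g_{0j,0}$ at $t = 0$ ($j \ge 1$) to arrange $\lambda^i|_{t=0} = 0$ for $i \ge 1$. We can also clearly use (5.3.2) to determine $\bar g_{00,0}$ at $t = 0$ in order that $\lambda^0|_{t=0} = 0$ as well.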
Now that we have specified all the Cauchy data, we can use standard theory for
nonlinear wave equations (see [51; 190; 218]) to obtain a solution ḡµν to (5.3.1),
which is a Lorentzian metric on the product of U with an interval about 0, and
the induced geometry on the slice {0} × U is precisely (U, g, K ). The question
now is how to guarantee that λα = 0 propagates in time, so that ḡ solves the
Einstein equation. As we will see, a homogeneous linear wave equation for λα
is a consequence of the Bianchi identities, while the Einstein constraints will
show that the initial time derivative of λα vanishes. Together with the preceding
paragraph, this will allow us to conclude that the gauge conditions that we have
arranged at t = 0 propagate in time.
We begin with a simple exercise.
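Exercise 5-15. Assuming $G^H_\Lambda(\bar g) = 0$, show that
\[
(G_\Lambda(\bar g))_{\mu\nu} = R_{\mu\nu} - \tfrac12 R(\bar g)\bar g_{\mu\nu} + \Lambda \bar g_{\mu\nu} = -\tfrac12 \bar g_{\alpha\mu} \lambda^\alpha_{,\nu} - \tfrac12 \bar g_{\alpha\nu} \lambda^\alpha_{,\mu} + \tfrac12 \bar g_{\mu\nu} \lambda^\alpha_{,\alpha}.
\]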
The vacuum constraints that are satisfied by the induced geometry $(U, g, K)$ are precisely $(G_\Lambda(\bar g))_{\mu\nu} n^\nu = 0$ for $\mu \ge 0$, or in our setup, $(G_\Lambda(\bar g))_{\mu 0} = 0$ at $t = 0$; cf. (5.2.7)–(5.2.8). Note that by our arrangement of the gauge condition at $t = 0$, $\lambda^\mu_{,i} = 0$ for $i \ge 1$ and all $\mu$. The component of the vacuum constraint for $\mu = 0$ (at $t = 0$) is just
Since this is true for $i \ge 1$ and the matrix $(g_{ij})$ is invertible, $\lambda^i_{,0}$ must vanish as well at $t = 0$.
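Exercise 5-16. Use the preceding exercise to show that for a solution of (5.3.1), the vanishing of the divergence of $G_\Lambda(\bar g)$ is equivalent to
\[
0 = -\tfrac12 \bar g_{\alpha\nu} \Box_{\bar g} \lambda^\alpha + B^{\theta}_{\nu\gamma}\big((\bar g_{\rho\sigma}), (\bar g_{\rho\sigma,\beta})\big)\, \lambda^\gamma_{,\theta}.
\]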
From this exercise we see that the partials of λα satisfy a homogeneous linear
hyperbolic system with vanishing initial data. Thus the λα vanish identically (see
[190, Chapters 8 and 12]). Hence the gauge condition holds, and the solution
to the reduced Einstein equation yields a solution to the Einstein equation, as
desired.
5.3.2.1. More on the evolution problem. What we have discussed above is a local
construction. One would like to piece local solutions together, by patching along
a cover of 6, to obtain an existence result for a vacuum spacetime containing
(6, g, K ); see [51; 190; 218] for details. In order to formulate this, we sketch a
local uniqueness result for the evolution of initial data.
Suppose $(V_1, \bar g_1)$ and $(V_2, \bar g_2)$ are vacuum spacetimes containing respective coordinate neighborhoods $U_1$ and $U_2$ on $\Sigma$, with induced geometry $(g, K)$. Set $U = U_1 \cap U_2 \subset \Sigma$. There is a coordinate system in a neighborhood of $U_1$ in $V_1$ with respect to which the metric components of $\bar g_1$ along $U_1$ satisfy $(\bar g_1)_{ij} = g_{ij}$ and $(\bar g_1)_{0j} = 0$ for $i, j \ge 1$, and $(\bar g_1)_{00} = -1$; indeed, one can start from coordinates $(x^i)$ for $U_1$ and build adapted coordinates by $(x^\mu) \mapsto \exp^{\bar g_1}_{p(x^i)}(x^0 n_1)$, where $p(x^i) \in U_1$ corresponds to coordinates $(x^i)$ for $U_1$, and $n_1$ is a smooth timelike unit normal along $U_1$. We see that $x^0 = 0$ corresponds to $U_1$, and $n_1 = \partial/\partial x^0$ along $U_1$. We have then that for $i, j \ge 1$, $(\bar g_1)_{ij,0} = -2K_{ij}$ along $U_1$. Actually in these specific coordinates, the geodesic equations also yield $(\bar g_1)_{0\mu,0} = 0$ along $U_1$. The analogous construction can be made along $U_2$ with unit normal $n_2$ with respect to $\bar g_2$, and we can build such coordinate charts along $U$ from a common coordinate domain to respective neighborhoods $W_1 \subset V_1$ and $W_2 \subset V_2$ of $U$. If we pull back $\bar g_1$ and $\bar g_2$ by these coordinate charts, respectively, then along $x^0 = 0$ we will have both $(\bar g_1)_{\mu\nu} = (\bar g_2)_{\mu\nu}$ and $(\bar g_1)_{\mu\nu,0} = (\bar g_2)_{\mu\nu,0}$ (again, this comes from the constraint equations, along with the geodesic equation). Said another way, if we define the diffeomorphism $\varphi : W_1 \to W_2$ by $\varphi\big(\exp^{\bar g_1}_{p(x^i)}(x^0 n_1)\big) = \exp^{\bar g_2}_{p(x^i)}(x^0 n_2)$ so that expressed in these coordinates, $\varphi$ is the identity map, then $\varphi^* \bar g_2$ agrees with $\bar g_1$ along $U$, and $(\varphi^* \bar g_2)_{\mu\nu,0} = (\bar g_1)_{\mu\nu,0}$ on $x^0 = 0$ in these coordinates for $W_1$.
\[
\bar g = -N^2 dt^2 + g_{ij}(dx^i + X^i dt) \otimes (dx^j + X^j dt).
\]
The first and second fundamental forms of the slices form a family of solutions
to the Einstein constraint equations. In our discussion of solving the Einstein
equations from initial data, we chose N = 1 and X = 0 on the initial slice. One
can use the solution of the initial value problem to determine a lapse and shift
for a spacetime splitting. You might also consider suitably prescribing N and X ,
possibly solving an auxiliary set of equations which might for instance impose
gauge conditions, and solve for the induced metric and second fundamental form
of the slices; cf. [7]. Given N and X , we indicate below the evolution equations
of these geometric quantities on the slices.
It is useful to have the analogous formulas in the Riemannian setting as well.
So, with $\partial/\partial t = N n + X$, we let $\epsilon = \langle n, n\rangle = \pm 1$, and write the metric in the form
\[
\begin{aligned}
\bar g &= \epsilon N^2 dt^2 + g_{ij}(dx^i + X^i dt) \otimes (dx^j + X^j dt) \\
&= (\epsilon N^2 + |X|_g^2)\, dt^2 + X^\flat \otimes dt + dt \otimes X^\flat + g,
\end{aligned}
\]
where $N^2 > |X|_g^2$ in the Lorentzian case. With our convention on $K$, we have $\nabla_X Y = \nabla^\Sigma_X Y + \epsilon K(X, Y)\, n$, where $X$ and $Y$ are tangent to a slice (and, as here, we may suppress the subscript on $\Sigma_t$).
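5.3.3.1. The ADM equations. We compute the time derivative of the induced metric. Let $e_i = \partial/\partial x^i$, $1 \le i \le k$, be a coordinate frame for $\Sigma$, and let $e_0 = \partial/\partial t$. Using metric compatibility, the torsion-free property of the connection, and the fact that all the $e_\mu$ commute, we have
\[
\begin{aligned}
\frac{\partial g_{ij}}{\partial t} &= \bar g(\nabla_{e_i} e_0, e_j) + \bar g(e_i, \nabla_{e_j} e_0) \\
&= \bar g(\nabla_{e_i}(N n + X), e_j) + \bar g(e_i, \nabla_{e_j}(N n + X)) \\
&= N \bar g(\nabla_{e_i} n, e_j) + N \bar g(e_i, \nabla_{e_j} n) + \bar g(\nabla_{e_i} X, e_j) + \bar g(e_i, \nabla_{e_j} X) \\
&= -2N K_{ij} + g(\nabla^\Sigma_{e_i} X, e_j) + g(e_i, \nabla^\Sigma_{e_j} X) \\
&= -2N K_{ij} + (L_X g)_{ij}.
\end{aligned}
\tag{5.3.4}
\]
Equivalently,
\[
K_{ij} = -\tfrac12 N^{-1}\Big(\frac{\partial g_{ij}}{\partial t} - (L_X g)_{ij}\Big).
\]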
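To see the first ADM equation (5.3.4) in action, the following Python sketch recovers K from a one-parameter family of slice metrics by a central finite difference in t. It is only an illustration under stated assumptions: the family g(t) = a(t)²δ with the hypothetical scale factor a(t) = 1 + t², unit lapse, and zero shift (so the Lie derivative term vanishes); the function names are not from the text.

```python
def second_fundamental_form(g_of_t, t, lapse, lie_X_g=None, h=1e-4):
    """K_ij = -(1/(2N)) (dg_ij/dt - (L_X g)_ij), with dg/dt by central differences.

    g_of_t  : callable returning the slice metric at time t as a nested list
    lapse   : value of the lapse N at the point considered
    lie_X_g : components of L_X g (defaults to zero, i.e. vanishing shift)
    """
    g_plus, g_minus = g_of_t(t + h), g_of_t(t - h)
    k = len(g_plus)
    if lie_X_g is None:
        lie_X_g = [[0.0] * k for _ in range(k)]
    return [
        [-((g_plus[i][j] - g_minus[i][j]) / (2.0 * h) - lie_X_g[i][j]) / (2.0 * lapse)
         for j in range(k)]
        for i in range(k)
    ]


def scale_factor(t):
    """Hypothetical scale factor a(t) = 1 + t^2."""
    return 1.0 + t * t


def flat_slice_metric(t):
    """g(t) = a(t)^2 * delta on a three-dimensional slice."""
    a = scale_factor(t)
    return [[a * a if i == j else 0.0 for j in range(3)] for i in range(3)]


K = second_fundamental_form(flat_slice_metric, 1.0, lapse=1.0)
```

At t = 1 one has a = 2 and da/dt = 2, so the diagonal entries of `K` should be close to −a·(da/dt) = −4, up to the finite-difference error.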
A more laborious exercise determines the time evolution of $K$:
\[
\frac{\partial K_{ij}}{\partial t} = \epsilon N_{;ij} + (L_X K)_{ij} + N\Big(\epsilon\big(R_{ij} - R^\Sigma_{ij}\big) - 2K_{i\ell} K^{\ell}_{j} + K^{\ell}_{\ell} K_{ij}\Big), \tag{5.3.5}
\]
where $N_{;ij}$ are the components of $\mathrm{Hess}_g N$, $R_{ij}$ are components of $\mathrm{Ric}(\bar g)$, $R^\Sigma_{ij}$ are components of $\mathrm{Ric}(g)$, and $L_X K$ is the Lie derivative of $K$, $(L_X K)(Y, Z) = X[K(Y, Z)] - K([X, Y], Z) - K(Y, [X, Z])$. Note that there are no time derivatives of $N$ or $X$ in (5.3.4)–(5.3.5). We outline the proof of (5.3.5) in the remainder of the section.
Exercise 5-17. a. Show that if T is a (0, 2)-tensor, and W , Y and Z are vector
fields, then (L W T )(Y, Z ) = (∇W T )(Y, Z ) + T (∇Y W, Z ) + T (Y, ∇ Z W ).
b. With $e_i$ and $\partial/\partial t = N n + X$ as above, if $Y = Y^i e_i$, then
We will also make use of the following form of what is sometimes referred to
as the Mainardi equation.
obtain
Putting this together with Exercise 5-18 and Lemma 5-19, we obtain (5.3.5).
The system (5.3.4)–(5.3.5) is known as the ADM equations, after Arnowitt,
Deser and Misner [10]. If the spacetime (S , ḡ) as above satisfies the vacuum
Einstein equation, then Ri j = 0 in (5.3.5), with ϵ = −1. Turning this around,
suppose we have some lapse N and shift X , and we want to build a vacuum
spacetime metric ḡ by solving for g = g(t). The spacetime scalar curvature
the notion of quasilocal mass [158]. In fact, for isolated systems, the natural
decay rates are such that not all the boundary terms should be discarded in
the Hamiltonian analysis; as we will see in Section 7.2.1, the ADM energy-
momentum is defined in terms of flux integrals at infinity (i.e., as a limit of
flux integrals over large spheres in an asymptotic end), and these terms must be
included to get a well-defined Hamiltonian in the natural phase space [9; 16;
75; 187]. For simplicity on the first pass, we will have in mind the case when $\Sigma$ is compact with no boundary, but see Remark 5-22 and Exercise 5-30 for a brief discussion of boundary terms, and see Section 7.2.1 for a discussion in the asymptotically flat setting. Finally, we note that to get the units of energy, we want to consider the Hamiltonian related to the Lagrangian $\int_{\mathcal S} \frac{1}{2\kappa} R(\bar g)\, dv_{\bar g}$; see Remark 5-23. For simplicity, we work below in units where $2\kappa = 1$, though in later chapters we will use units where $G = 1$ and $c = 1$, so that the energy of the Schwarzschild spacetime is $m$; in spacetime dimension four, for instance, $2\kappa = 16\pi$.
We begin by finding expressions for the scalar curvature $R(\bar g)$ of $(\mathcal S, \bar g)$ with $\bar g = -N^2 dt^2 + g_{ij}(dx^i + X^i dt) \otimes (dx^j + X^j dt)$.
Exercise 5-21. Prove that $\nabla_n(\mathrm{tr}_g K) = \mathrm{tr}_g(\nabla_n K)$. Then use Lemma 5-19 to show
(Hint: You might use (5.3.7) to help show $N^{-1}\Delta_g N = \mathrm{div}_{\bar g}(\nabla_n n)$.)
To rewrite the Einstein–Hilbert action, we will use the ADM equations (5.3.4) and (5.3.5), and we also note the following fact: $\sqrt{|\det \bar g|} = N\sqrt{\det g}$. To derive this by direct expansion of the matrix for $\bar g$ in lapse-shift form, let $\xi^{(i)}$ be the matrix obtained by replacing column $i$ of the matrix for $g$ with the column whose entry in row $j$ is $X_j = g_{j\ell} X^\ell$, so that
\[
\begin{aligned}
\det \bar g &= (-N^2 + |X|_g^2) \det g - \sum_{i=1}^{k} X_i \det(\xi^{(i)}) \\
&= (-N^2 + |X|_g^2) \det g - \sum_{i,j=1}^{k} (-1)^{i+j} X_i X_j M_{ij},
\end{aligned}
\]
where $M_{ij}$ is the $(i, j)$-minor determinant of the matrix for $g$. Cramer’s rule for the inverse (Section 2.3.4) now can be employed to get $g^{ij} \det g = (-1)^{i+j} M_{ji}$, from which the determinant formula follows.
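The determinant identity can be sanity-checked numerically for a small example. The Python sketch below assembles the matrix of ḡ in lapse-shift form for k = 2 from hypothetical values of N, X and g, computes determinants by Laplace expansion, and compares det ḡ with −N² det g. The helper names and the sample data are assumptions made for illustration.

```python
import math


def det(m):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def spacetime_metric(N, X_up, g):
    """Matrix of g_bar = -N^2 dt^2 + g_ij (dx^i + X^i dt)(dx^j + X^j dt)."""
    k = len(g)
    X_low = [sum(g[j][l] * X_up[l] for l in range(k)) for j in range(k)]  # X_j = g_jl X^l
    X_sq = sum(X_up[j] * X_low[j] for j in range(k))                      # |X|_g^2
    top_row = [-N * N + X_sq] + X_low
    return [top_row] + [[X_low[i]] + list(g[i]) for i in range(k)]


# Hypothetical sample data: a positive definite g, a lapse and a shift.
g = [[2.0, 1.0],
     [1.0, 3.0]]
N = 2.0
X_up = [1.0, -1.0]

g_bar = spacetime_metric(N, X_up, g)
volume_density_spacetime = math.sqrt(abs(det(g_bar)))
volume_density_slice = N * math.sqrt(det(g))
```

For this data det g = 5 and det ḡ = −20 = −N² det g, so the two volume densities agree.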
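We can thus write the Einstein–Hilbert action over the spacetime product as $\int_{\mathcal S} R(\bar g)\, dv_{\bar g} = \int_I \int_\Sigma R(\bar g) N\, dv_g\, dt$. Before we insert the formula (5-21b) for $R(\bar g)$ and integrate, we note that with Hamilton’s equations, as well as the evolution equation (5.3.4) for $\partial g/\partial t$, in mind, we let $\pi^{ij} = K^{ij} - (\mathrm{tr}_g K) g^{ij}$, and $\hat\pi^{ij} = \hat K^{ij} - (\mathrm{tr}_g \hat K) g^{ij} = -\pi^{ij}$, and we consider the following expression:
\[
\begin{aligned}
\hat\pi^{ij} \frac{\partial g_{ij}}{\partial t} &= -\big(K^{ij} - (\mathrm{tr}_g K) g^{ij}\big)(-2N K_{ij}) + \hat\pi^{ij} (L_X g)_{ij} \\
&= -2N (\mathrm{tr}_g K)^2 + 2N |K|_g^2 + \hat\pi^{ij}(X_{i;j} + X_{j;i}) \\
&= -2N (\mathrm{tr}_g K)^2 + 2N |K|_g^2 + 2\hat\pi^{ij} X_{i;j}.
\end{aligned}
\tag{5.3.8}
\]
Thus the action $\int_{\mathcal S} R(\bar g)\, dv_{\bar g}$ (modulo any boundary terms) is equal to
\[
\int_I \int_\Sigma \Big(\hat\pi^{ij} \frac{\partial g_{ij}}{\partial t} + 2(\mathrm{div}_g \hat\pi)_i X^i + N\big(R(g) - |K|_g^2 + (\mathrm{tr}_g K)^2\big)\Big)\, dv_g\, dt.
\]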
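Remark 5-22. In case $\partial\Sigma$ is empty, the boundary terms are easily seen to be
\[
-\Big(\int_{\Sigma_{t_2}} 2\,\mathrm{tr}_g K\, dv_g - \int_{\Sigma_{t_1}} 2\,\mathrm{tr}_g K\, dv_g\Big) =: -\Big(\int_{\Sigma_{t_2}} - \int_{\Sigma_{t_1}}\Big) 2\,\mathrm{tr}_g K\, dv_g.
\]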
where
\[
\widehat\Phi(g, \hat\pi) = -\Big(R(g) - |\hat\pi|_g^2 + \frac{1}{k-1}(\mathrm{tr}_g \hat\pi)^2,\ 2(\mathrm{div}_g \hat\pi)^\flat\Big),
\]
with the second component a one-form field. We emphasize that the vacuum constraint equations (with $\Lambda = 0$) are just $\widehat\Phi(g, \hat\pi) = 0$.
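\[
\begin{aligned}
&\int_I \Big(\int_\Sigma \hat\pi^{ij} \frac{\partial g_{ij}}{\partial t}\, dv_g - H_{\mathrm{ADM}}\Big)\, dt \\
&\qquad = \int_I \int_\Sigma \Big(\hat\pi^{ij} \frac{\partial g_{ij}}{\partial t} - (N, X) \cdot \widehat\Phi(g, \hat\pi)\Big)\, dv_g\, dt.
\end{aligned}
\tag{5.3.11}
\]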
$N$ and $X$ appear as Lagrange multipliers for a constrained optimization problem. Indeed, stationarity of the action with respect to variations of lapse and shift yields the vacuum constraint equations.
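With an eye toward Hamilton’s equations, we choose a background volume element $d\mathring v$ on $\Sigma$, induced from a background metric $\mathring g$ on $\Sigma$. The function $\theta_g = \sqrt{\det g/\det \mathring g}$ encodes the ratio of the volume elements (and in particular is independent of coordinates). We let $\tilde\pi^{ij} = \theta_g \hat\pi^{ij}$, $\widetilde H(g, \tilde\pi) = \theta_g \widehat H(g, \hat\pi)$, and $\widetilde\Phi(g, \tilde\pi) = \theta_g \widehat\Phi(g, \hat\pi)$, so that the action can be written
\[
\begin{aligned}
&\int_I \Big(\int_\Sigma \hat\pi^{ij} \frac{\partial g_{ij}}{\partial t}\, dv_g - H_{\mathrm{ADM}}\Big)\, dt \\
&\qquad = \int_I \int_\Sigma \Big(\hat\pi^{ij} \frac{\partial g_{ij}}{\partial t} - \widehat H\Big)\, dv_g\, dt = \int_I \int_\Sigma \Big(\tilde\pi^{ij} \frac{\partial g_{ij}}{\partial t} - \widetilde H\Big)\, d\mathring v\, dt \\
&\qquad = \int_I \int_\Sigma \Big(\tilde\pi^{ij} \frac{\partial g_{ij}}{\partial t} - (N, X) \cdot \widetilde\Phi(g, \tilde\pi)\Big)\, d\mathring v\, dt.
\end{aligned}
\tag{5.3.12}
\]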
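Starting from the stationarity of the action in Hamiltonian form, we can obtain the ADM equations. Indeed consider a one-parameter variation (compactly supported away from the boundary) $\hat\pi^{ij} + \epsilon\hat\sigma^{ij}$, keeping $g$ fixed, or equivalently $\tilde\pi^{ij} + \epsilon\tilde\sigma^{ij} = \theta_g(\hat\pi^{ij} + \epsilon\hat\sigma^{ij})$, and define
\[
\hat\sigma^{ij} \frac{\delta \widehat H}{\delta \hat\pi^{ij}} = \frac{d}{d\epsilon}\Big|_{\epsilon=0} \widehat H(g, \hat\pi + \epsilon\hat\sigma), \qquad
\tilde\sigma^{ij} \frac{\delta \widetilde H}{\delta \tilde\pi^{ij}} = \frac{d}{d\epsilon}\Big|_{\epsilon=0} \widetilde H(g, \tilde\pi + \epsilon\tilde\sigma).
\]
We then observe that $\delta\widehat H/\delta\hat\pi^{ij} = \delta\widetilde H/\delta\tilde\pi^{ij}$, and moreover with $H_{\mathrm{ADM}}(g, \hat\pi) := \int_\Sigma \widehat H\, dv_g = \int_\Sigma \widetilde H\, d\mathring v =: H_{\mathrm{ADM}}(g, \tilde\pi)$,
\[
\begin{aligned}
\frac{d}{d\epsilon}\Big|_{\epsilon=0} H_{\mathrm{ADM}}(g, \hat\pi + \epsilon\hat\sigma) &= \int_\Sigma \hat\sigma^{ij} \frac{\delta \widehat H}{\delta \hat\pi^{ij}}\, dv_g \\
&= \int_\Sigma \tilde\sigma^{ij} \frac{\delta \widetilde H}{\delta \tilde\pi^{ij}}\, d\mathring v = \frac{d}{d\epsilon}\Big|_{\epsilon=0} H_{\mathrm{ADM}}(g, \tilde\pi + \epsilon\tilde\sigma).
\end{aligned}
\]
From the first two lines of (5.3.12), the corresponding variation of the action is
\[
\int_I \int_\Sigma \hat\sigma^{ij} \Big(\frac{\partial g_{ij}}{\partial t} - \frac{\delta \widehat H}{\delta \hat\pi^{ij}}\Big)\, dv_g\, dt = \int_I \int_\Sigma \tilde\sigma^{ij} \Big(\frac{\partial g_{ij}}{\partial t} - \frac{\delta \widetilde H}{\delta \tilde\pi^{ij}}\Big)\, d\mathring v\, dt. \tag{5.3.13}
\]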
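For this to vanish for all $\hat\sigma$ (compactly supported away from the boundary), we conclude one of Hamilton’s equations
\[
\frac{\partial g_{ij}}{\partial t} = \frac{\delta \widehat H}{\delta \hat\pi^{ij}} = \frac{\delta \widetilde H}{\delta \tilde\pi^{ij}}. \tag{5.3.14}
\]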
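If we now consider a (compactly supported) variation $g_{ij} + \epsilon h_{ij}$, with $\tilde\pi^{ij}$ fixed (so $\hat\pi$ may change with $g$), we define
\[
h_{ij} \frac{\delta \widetilde H}{\delta g_{ij}} = \frac{d}{d\epsilon}\Big|_{\epsilon=0} \widetilde H(g + \epsilon h, \tilde\pi),
\]
so that
\[
\frac{d}{d\epsilon}\Big|_{\epsilon=0} H_{\mathrm{ADM}}(g + \epsilon h, \tilde\pi) =: \int_\Sigma h_{ij} \frac{\delta \widetilde H}{\delta g_{ij}}\, d\mathring v.
\]
The variation of the action $\int_I \int_\Sigma \big(\tilde\pi^{ij} \frac{\partial g_{ij}}{\partial t} - \widetilde H\big)\, d\mathring v\, dt$ is then
\[
\int_I \int_\Sigma \Big(\tilde\pi^{ij} \frac{\partial h_{ij}}{\partial t} - h_{ij} \frac{\delta \widetilde H}{\delta g_{ij}}\Big)\, d\mathring v\, dt.
\]
We integrate by parts in $t$; since the expression above must vanish for all compactly supported $h$, we get another of Hamilton’s equations
\[
\frac{\partial \tilde\pi^{ij}}{\partial t} = -\frac{\delta \widetilde H}{\delta g_{ij}}. \tag{5.3.15}
\]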
We now determine what (5.3.14) means. We compute from (5.3.9)
\[
\hat\sigma^{ij} \frac{\delta \widehat H}{\delta \hat\pi^{ij}} = 2N \hat\sigma^{ij} \Big(\hat\pi_{ij} - \frac{1}{k-1}(\mathrm{tr}_g \hat\pi) g_{ij}\Big) - 2\hat\sigma^{ij}{}_{;j} X_i = 2N \hat\sigma^{ij} \hat K_{ij} - 2\hat\sigma^{ij}{}_{;j} X_i.
\]
Thus with an integration by parts we find
\[
\int_\Sigma \hat\sigma^{ij} \frac{\delta \widehat H}{\delta \hat\pi^{ij}}\, dv_g = \int_\Sigma \hat\sigma^{ij} \big(2N \hat K_{ij} + X_{i;j} + X_{j;i}\big)\, dv_g.
\]
By (5.3.13), we thus recover the first of the ADM equations, $\partial g_{ij}/\partial t = -2N K_{ij} + (L_X g)_{ij}$. A similar (but more laborious) analysis produces the second ADM equation, which we leave as an exercise; cf. [51] or [218, Appendix E].
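We can write the ADM equations in a symplectic form using the constraint map $\widehat\Phi$, as follows. If we let
\[
D\widehat\Phi_{(g,\hat\pi)}(h, \hat\sigma) = \frac{d}{d\epsilon}\Big|_{\epsilon=0} \widehat\Phi(g + \epsilon h, \hat\pi + \epsilon\hat\sigma),
\]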
for all compactly supported (vanishing near the boundary) $(h, \hat\sigma)$. We similarly define $D\widetilde\Phi_{(g,\tilde\pi)}(h, \tilde\sigma) = \frac{d}{d\epsilon}\big|_{\epsilon=0} \widetilde\Phi(g + \epsilon h, \tilde\pi + \epsilon\tilde\sigma)$ and
\[
\int_\Sigma (N, X) \cdot D\widetilde\Phi_{(g,\tilde\pi)}(h, \tilde\sigma)\, d\mathring v = \int_\Sigma (h, \tilde\sigma) \cdot_g D\widetilde\Phi^*_{(g,\tilde\pi)}(N, X)\, d\mathring v.
\]
One needs to take care when computing this adjoint, since the volume element for the integral is induced from $\mathring g$, but the derivative operators would naturally be taken with respect to $g$ (factors of $\theta_g$ are used to compensate; cf. Exercise 5-28).
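Using (5.3.11) and (5.3.12), we obtain the first variation of the action fixing $g$ and varying $\hat\pi^{ij} + \epsilon\hat\sigma^{ij}$ (equivalently $\tilde\pi^{ij} + \epsilon\tilde\sigma^{ij}$) to be
\[
\begin{aligned}
&\int_I \int_\Sigma \Big(\hat\sigma^{ij} \frac{\partial g_{ij}}{\partial t} - (N, X) \cdot D\widehat\Phi_{(g,\hat\pi)}(0, \hat\sigma)\Big)\, dv_g\, dt \\
&\qquad = \int_I \int_\Sigma \Big(\hat\sigma^{ij} \frac{\partial g_{ij}}{\partial t} - (0, \hat\sigma) \cdot_g D\widehat\Phi^*_{(g,\hat\pi)}(N, X)\Big)\, dv_g\, dt;
\end{aligned}
\]
equivalently,
\[
\begin{aligned}
&\int_I \int_\Sigma \Big(\tilde\sigma^{ij} \frac{\partial g_{ij}}{\partial t} - (N, X) \cdot D\widetilde\Phi_{(g,\tilde\pi)}(0, \tilde\sigma)\Big)\, d\mathring v\, dt \\
&\qquad = \int_I \int_\Sigma \Big(\tilde\sigma^{ij} \frac{\partial g_{ij}}{\partial t} - (0, \tilde\sigma) \cdot_g D\widetilde\Phi^*_{(g,\tilde\pi)}(N, X)\Big)\, d\mathring v\, dt.
\end{aligned}
\tag{5.3.16}
\]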
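\[
\begin{aligned}
&\int_I \int_\Sigma \Big(\tilde\pi^{ij} \frac{\partial h_{ij}}{\partial t} - (N, X) \cdot D\widetilde\Phi_{(g,\tilde\pi)}(h, 0)\Big)\, d\mathring v\, dt \\
&\qquad = \int_I \int_\Sigma \Big(-h_{ij} \frac{\partial \tilde\pi^{ij}}{\partial t} - (h, 0) \cdot_g D\widetilde\Phi^*_{(g,\tilde\pi)}(N, X)\Big)\, d\mathring v\, dt.
\end{aligned}
\tag{5.3.17}
\]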
By (5.3.17) and (5.3.16), for the action to be stationary for variations of $g_{ij}$, $\tilde\pi^{ij}$, as well as $N$ and $X$ (which yield the constraints), we have, with the components of the adjoint written vertically and with the indices raised or lowered by $g$ to match the tensor type,
\[
\begin{pmatrix} \partial g/\partial t \\ \partial \tilde\pi/\partial t \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} D\widetilde\Phi^*_{(g,\tilde\pi)}(N, X). \tag{5.3.18}
\]
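Remark 5-23. As noted, to get the units right, we consider the Hamiltonian related to the Lagrangian $\int_{\mathcal S} \frac{1}{2\kappa} R(\bar g)\, dv_{\bar g}$. We can readily modify the above to incorporate $\kappa$. Indeed, the canonical momentum would be $\hat\pi_\kappa^{ij} = \frac{1}{2\kappa}\hat\pi^{ij}$, with $\tilde\pi_\kappa^{ij} = \frac{1}{2\kappa}\tilde\pi^{ij}$. The Hamiltonian would change to $H^\kappa_{\mathrm{ADM}}(g, \hat\pi_\kappa) = \frac{1}{2\kappa} H_{\mathrm{ADM}}(g, \hat\pi)$ with the weighted Hamiltonian $\widetilde H_\kappa(g, \tilde\pi_\kappa) = \frac{1}{2\kappa}\widetilde H(g, \tilde\pi) = \frac{1}{2\kappa}\theta_g \widehat H(g, \hat\pi)$, so that $\widehat H_\kappa(g, \hat\pi_\kappa) = \frac{1}{2\kappa}\widehat H(g, \hat\pi)$, and finally $\widehat\Phi_\kappa(g, \hat\pi_\kappa) = \frac{1}{2\kappa}\widehat\Phi(g, \hat\pi)$, with the weighted form $\widetilde\Phi_\kappa(g, \tilde\pi_\kappa) = \frac{1}{2\kappa}\widetilde\Phi(g, \tilde\pi)$. Hamilton’s equations (5.3.14)–(5.3.15) retain the same form, since one can readily show that $\delta\widehat H_\kappa/\delta\hat\pi_\kappa^{ij} = \delta\widehat H/\delta\hat\pi^{ij} = \delta\widetilde H_\kappa/\delta\tilde\pi_\kappa^{ij}$, while $\partial\tilde\pi_\kappa^{ij}/\partial t = -\delta\widetilde H_\kappa/\delta g_{ij}$ is obtained simply by multiplying (5.3.15) by $\frac{1}{2\kappa}$. Finally, (5.3.18) would retain the same form, as it can be readily seen to be equivalent to the same equation with $\widetilde\Phi_\kappa$ and $\tilde\pi_\kappa$ in place of $\widetilde\Phi$ and $\tilde\pi$. See also Exercise 5-28.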
Exercises
Exercise 5-24. Let $g_{\mathbb{E}^3}$ be the Euclidean metric. Let $I \subset \mathbb{R}$ be an interval, and let $a : I \to (0, +\infty)$ be a smooth function. Let $\mathcal S = I \times \mathbb{R}^3$, with $\bar g = -dt^2 + (a(t))^2 g_{\mathbb{E}^3}$. On the slices $\{t\} \times \mathbb{R}^3$, $g(t) = (a(t))^2 g_{\mathbb{E}^3}$, and $n = \partial/\partial t$ is a global unit normal field along the slices. Let a zero index denote the $\partial/\partial t$-component direction.
a. Use the ADM equations to compute the second fundamental form $K$, as well as the components of $\mathrm{Ric}(\bar g)$ in directions tangent to the slices.
b. Use the relation between $g$ and $K$ and $G_{\mu 0}$ to compute $G_{\mu 0}$; then compute $G_{\mu\nu}$.
the first fundamental form matrix $\begin{pmatrix} E & F \\ F & G \end{pmatrix}$. These are the Weingarten equations.
b. What do you get if you decompose the vector equations $\mathbf{X}_{uuv} - \mathbf{X}_{uvu} = 0$ and
$\mathbf{X}_{vuv} - \mathbf{X}_{vvu} = 0$ into tangential and normal components? (Answer: The Gauss–
Codazzi equations. From the tangential components comes the Gauss equation,
which shows the Gauss curvature κ1 κ2 (derived from the second fundamental
form) is the sectional curvature of 6, which can be computed from the first
fundamental form. This is known as Gauss’s theorema egregium, or “remarkable
theorem”.)
Exercise 5-27. a. Prove that Euclidean space $(\mathbb{R}^3, g_{\mathbb{E}^3})$ does not admit a closed immersed minimal surface (one for which $H = 0$). To do this, show that there must be a point on the surface where the Gaussian curvature is strictly positive. Generalize this result to closed immersed hypersurfaces in $(\mathbb{R}^n, g_{\mathbb{E}^n})$ for $n \ge 3$.
b. Conclude that any embedding of a two-torus $T^2$ into Euclidean $\mathbb{R}^3$ is not flat. Show that there is an embedding of a two-torus into Euclidean $\mathbb{R}^4$ for which the induced metric is flat. Can you likewise embed a flat torus isometrically in the product $S^2 \times S^2$ of round spheres?
c. Consider the surface of revolution in Euclidean space given by $\sqrt{x^2 + y^2} = \cosh z$. The surface is a catenoid. Show the catenoid is minimal, and compute its Gaussian curvature.
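Exercise 5-28. Let $(\Sigma, g)$ be Riemannian, and let $\hat\pi$ be a symmetric (2, 0)-tensor on $\Sigma$. Recalling notation from Section 5.3.3.2, we let $\tilde\pi = \theta_g \hat\pi$ and $\widetilde\Phi(g, \tilde\pi) = \theta_g \widehat\Phi(g, \theta_g^{-1}\tilde\pi)$, from which we have
\[
H_{\mathrm{ADM}} = \int_\Sigma (N, X) \cdot \widetilde\Phi(g, \tilde\pi)\, d\mathring v = \int_\Sigma (N, X) \cdot \widehat\Phi(g, \theta_g^{-1}\tilde\pi)\, dv_g.
\]
\[
D\widetilde\Phi_{(g,\tilde\pi)}(h, \tilde\sigma) = \theta_g D\widehat\Phi_{(g,\hat\pi)}(h, \hat\sigma) + \tfrac12 \theta_g \Big((\mathrm{tr}_g h)\, \widehat\Phi(g, \hat\pi) - D\widehat\Phi_{(g,\hat\pi)}\big(0, (\mathrm{tr}_g h)\hat\pi\big)\Big). \tag{5-28a}
\]
From here you can conclude that if $(g, \hat\pi)$ solves the vacuum constraint equations, then $(N, X)$ is in the kernel of $D\widehat\Phi^*_{(g,\hat\pi)}$ if and only if it is in the kernel of $D\widetilde\Phi^*_{(g,\tilde\pi)}$.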
assuming the integrand has compact support (in the interior of M, should ∂ M
be nonempty).
a. Suppose $\Phi(g, 0) = 0$. Find $D\Phi_{(g,0)}(h, \sigma)$, where $h$ is a symmetric (0, 2)-tensor and $\sigma$ is a symmetric (2, 0)-tensor, and then find the operator $D\Phi^*_{(g,0)}$. Since $\pi = 0$, the linearization simplifies dramatically from the general case.
b. Find the kernel of $D\Phi^*_{(g_{\mathbb{E}^n},0)}$ at the Minkowski data on $M = \mathbb{R}^n$. (Hint: for $n = 3$, the kernel is ten-dimensional.)
c. Let $m \ne 0$. Find the kernel of $D\Phi^*_{(g_S,0)}$ at the data for a constant $t$-slice in Schwarzschild $g_S = \big(1 + \frac{m}{2|x|}\big)^4 g_{\mathbb{E}^3}$ on $M = \mathbb{R}^3 \setminus \{x : |x| \le \max(0, -\tfrac{m}{2})\}$. (Hint:
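\[
\begin{aligned}
D\Phi_{(g,\pi)}(h, \sigma) = \Big(\, & L_g h - 2h_{ij}\pi^{i\ell}\pi^{j}{}_{\ell} - 2\pi^{j}{}_{k}\sigma^{k}{}_{j} + \tfrac{2}{n-1}\,\mathrm{tr}_g\pi\,\big(h_{ij}\pi^{ij} + \mathrm{tr}_g\sigma\big), \\
& (\mathrm{div}_g\sigma)^i - \tfrac12 \pi^{jk} h_{jk;\ell}\, g^{\ell i} + \pi^{jk} h^{i}{}_{j;k} + \tfrac12 \pi^{ij} (\mathrm{tr}_g h)_{,j} \Big),
\end{aligned}
\tag{5-29b}
\]
\[
\begin{aligned}
D\Phi^*_{(g,\pi)}(N, X) = \Big(\, & (L_g^* N)_{ij} + N\big(\tfrac{2}{n-1}(\mathrm{tr}_g\pi)\pi_{ij} - 2\pi_{ik}\pi^{k}{}_{j}\big) + \tfrac12\big(g_{i\ell}\, g_{jm} (L_X\pi)^{\ell m} + (\mathrm{div}_g X)\pi_{ij}\big) \\
& \quad - \tfrac12\big(X_i \pi^{k}{}_{j;k} + X_j \pi^{k}{}_{i;k} + X_{k;m}\pi^{km} g_{ij} + X_k \pi^{km}{}_{;m}\, g_{ij}\big), \\
& -\tfrac12\big(X^{i}{}_{;\ell}\, g^{\ell j} + X^{j}{}_{;\ell}\, g^{\ell i}\big) + N\big(\tfrac{2}{n-1}(\mathrm{tr}_g\pi) g^{ij} - 2\pi^{ij}\big) \Big),
\end{aligned}
\tag{5-29c}
\]
where $L_g$ is the linearization of the scalar curvature operator (cf. (2.3.11)), and $L_X$ is the Lie derivative (cf. Exercise 5-17).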
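a. In general the outward unit conormal field $\nu$ to the slices $\Sigma_t$ does not agree with the outward unit normal of $\partial\mathcal S$. To analyze the boundary terms, you might rewrite the scalar curvature from (5-21a), and combine with (5.3.8), to obtain
\[
\begin{aligned}
R(\bar g) = {} & 2N^{-1}\Big(-\frac{\partial}{\partial t}(\mathrm{tr}_g K) + \nabla_X(\mathrm{tr}_g K)\Big) - 2N^{-1}\Delta_g N + N^{-1}\hat\pi^{ij}\frac{\partial g_{ij}}{\partial t} \\
& - 2N^{-1}\hat\pi^{ij} X_{i;j} + 2(\mathrm{tr}_g K)^2 + \big(R(g) - |K|_g^2 + (\mathrm{tr}_g K)^2\big).
\end{aligned}
\tag{5-30a}
\]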
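(i) We can write the integral over $\mathcal S$ as a product over $I \times \Sigma$. This allows us to apply integration by parts to the first term in the expression (5-30a) for the scalar curvature. Using the equality (2.3.10) in the form
\[
\frac{\partial}{\partial t}\sqrt{\det g} = \frac12\,\mathrm{tr}_g\Big(\frac{\partial g}{\partial t}\Big)\sqrt{\det g}
\]
and (5.3.4), show that
\[
\int_I \int_\Sigma -2N^{-1}\frac{\partial}{\partial t}(\mathrm{tr}_g K)\, dv_{\bar g}
= \int_I \int_\Sigma \big(-2N(\mathrm{tr}_g K)^2 + 2(\mathrm{tr}_g K)\,\mathrm{div}_g X\big)\, dv_g\, dt - \Big(\int_{\Sigma_{t_2}} - \int_{\Sigma_{t_1}}\Big) 2\,\mathrm{tr}_g K\, dv_g.
\]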
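(ii) Show that the modified action $\int_{\mathcal S} R(\bar g)\, dv_{\bar g} + \big(\int_{\Sigma_{t_2}} - \int_{\Sigma_{t_1}}\big) 2\,\mathrm{tr}_g K\, dv_g$ can be written
\[
\begin{aligned}
& \int_I \int_\Sigma \Big(\hat\pi^{ij}\frac{\partial g_{ij}}{\partial t} + 2(\mathrm{div}_g\hat\pi)_i X^i + N\big(R(g) - |K|_g^2 + (\mathrm{tr}_g K)^2\big)\Big)\, dv_g\, dt \\
& \qquad - 2\int_I \int_{\partial\Sigma} \Big(\frac{\partial N}{\partial\nu} + \hat\pi^{ij} X_i \nu_j\Big)\, d\sigma_g\, dt + 2\int_I \int_{\partial\Sigma} (\mathrm{tr}_g K) X^j \nu_j\, d\sigma_g\, dt.
\end{aligned}
\]
Note that we could use (5.3.7) to replace $\partial N/\partial\nu$ by $N\langle\nabla_n n, \nu\rangle$ on the last line.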
To interpret the boundary integrals from the divergence terms in (5-21b), it would be convenient for $n$ to be tangent to this timelike boundary, i.e., for the outward unit conormal $\nu$ to the slices $\Sigma_t$ to be the outward unit normal to $\partial\mathcal S$; however, as $\partial/\partial t$ is tangent to $\partial\mathcal S$, this would restrict $X$ to satisfy $\langle X, \nu\rangle = 0$ along $\partial\mathcal S$. So as in [113; 158], we restrict if necessary to a spacetime $(\mathcal S', \bar g)$, with $\mathcal S' = \bigcup_{t \in I} \Sigma'_t$, with smooth hypersurfaces $\Sigma'_t \subset \Sigma_t$, where the boundary of each $\Sigma'_t$ meets the timelike boundary $\partial\mathcal S'$ orthogonally, so that the outward pointing unit conormal $\nu$ of each $\Sigma'_t$ is normal to this timelike boundary. Then if we let $\gamma$ be the induced metric on $\partial\mathcal S'$, the orthogonality gives us that, if we have a local form field $\omega_g$ which restricts to a local volume form along each $\partial\Sigma'_t$ (in a suitable neighborhood), then a local volume form for $(\partial\mathcal S', \gamma)$ is $n^\flat \wedge \omega_g = N\, dt \wedge \omega_g$. If we extend $\nu$ as $\nu = -n$ on $\Sigma'_{t_2}$ and $\nu = n$ on $\Sigma'_{t_1}$, the term that is added to the action is then $\int_{\partial\mathcal S'} 2\,\mathrm{tr}_\gamma A\, dv_\gamma$, where if $\mathrm{II}$ is the second fundamental form of
follows.
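b. In the setting above, show the modified action $\int_{\mathcal S'} R(\bar g)\, dv_{\bar g} + \int_{\partial\mathcal S'} 2\,\mathrm{tr}_\gamma A\, dv_\gamma$ can be written
\[
\int_I \Big(\int_{\Sigma'_t} \hat\pi^{ij}\frac{\partial g_{ij}}{\partial t}\, dv_g - H_{\mathrm{ADM}} - 2\int_{\partial\Sigma'_t} \big(N H + \hat\pi^{ij} X_i \nu_j\big)\, d\sigma_g\Big)\, dt,
\]
where $H$ is the mean curvature of the boundary $\partial\Sigma'_t$ inside $\Sigma'_t$, with respect to the normal $\nu$ (i.e., if $\Pi$ is the vector-valued second fundamental form of $\partial\Sigma'_t$ inside $\Sigma'_t$, then $H$ is the trace along $\partial\Sigma'_t$ of $\kappa(X, Y) := \langle\Pi(X, Y), \nu\rangle$). Thus the modified Hamiltonian is $H_{\mathrm{ADM}} + 2\int_{\partial\Sigma'_t} \big(N H + \hat\pi^{ij} X_i \nu_j\big)\, d\sigma_g$.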
Remark. If you try to compare boundary terms from the analyses in parts a. and b., you have to take care; in part a. we obtained one such term from an integration by parts in the $t$-direction; if you tried to carry over the derivation from part (ii) for $\mathcal S'$, there may be an additional term in the $t$-derivative of the slice integrals, since the slices in $\mathcal S'$ are generally not just $\Sigma_t = \{t\} \times \Sigma$, which had allowed us to easily write the slice integrals in $\mathcal S$ over a fixed domain $\Sigma$.
CHAPTER 6
We have derived the Einstein constraint equations and pinpointed their place in
terms of the initial value formulation for Einstein’s equation. From one point
of view, to construct interesting spacetimes, one could construct interesting
solutions of the Einstein constraint equations, and then study their evolution.
If one could then effectively parametrize the space of solutions to the Einstein
constraint equations, one could identify the parameters as the true gravitational
degrees of freedom; cf. [226; 227]. A classical approach to this is the conformal
method to construct solutions of the constraint equations, and we introduce this
in Section 6.2.
Once we set up the conformal method, we will focus on the constant mean
curvature (CMC) case. In this case, the momentum constraint decouples, and
the problem will be reduced to the analysis of a single nonlinear equation, the
Lichnerowicz equation (6.2.8), to solve the Hamiltonian constraint, as we will see.
The Hamiltonian constraint brings the analysis of the scalar curvature operator
to the fore, and we will study the range of the scalar curvature operator, focusing
on scalar curvature deformation, sometimes within a conformal class of metrics.
As we have seen, in the time-symmetric case, the scalar curvature is proportional
to energy density, and the dominant energy condition then focuses our attention
on positive (nonnegative) scalar curvature metrics. There are in fact topological
obstructions to admitting positive scalar curvature metrics, and as developed by
R. M. Schoen and S.-T. Yau, this gives an approach to the positive mass theorem,
as we will see in Chapter 7.
The remaining chapters will make heavier use of analysis and PDE (partial
differential equations), such as properties of the Laplace operator on Euclidean
space or more general Riemannian manifolds. We will discuss some aspects
of elliptic PDE theory in various amounts of detail. Rather than relegating the
PDE aspects to an appendix or simply listing bullet points of facts, we prefer to
cover various aspects of the theory that we will use, giving many references to
possible sources for further reading, and sketching a number of missing details
168 6. S CALAR CURVATURE DEFORMATION AND THE E INSTEIN CONSTRAINT EQUATIONS
as exercises. We suggest the reader at least peruse the PDE sections to set some
notations and expectations. Hopefully someone who has had a course in elliptic
PDE will get a better appreciation for some basic applications of the theory,
while those without the background can get some feel for the place of elliptic
PDE in the study of the Einstein constraint equations.
This section contains some of the required fundamentals of elliptic PDE theory.
Given the fundamental role of the Laplace and Poisson equations in geometric
analysis and mathematical physics, the Laplace operator will be our primary,
though not exclusive, example of an elliptic operator. While we assume the reader
is familiar with basic tools of analysis, including basic functional analysis, found
in standard texts such as those mentioned in the preface, we start by collecting
some of the facts and notation about function spaces that we will use.
6.1.1. Sobolev and Hölder spaces. Many results on elliptic PDE are cast in
Sobolev and Hölder spaces. We define them briefly, and encourage the reader to
review their basic properties from references such as [2; 86; 107; 144].
We let $\Omega$ be an open subset of $\mathbb{R}^n$, sometimes called a domain in $\mathbb{R}^n$. For a
real number $p \ge 1$ and a nonnegative integer $k$, the space $W^{k,p}(\Omega)$ is the set of
all Lebesgue measurable functions (up to an equivalence relation for functions
which agree except for a set of measure zero) which are in $L^p(\Omega)$ along with
weak derivatives up to order $k$. The space has a Banach space structure with
norm given by
\[
\|u\|_{W^{k,p}(\Omega)}^p = \sum_{|\beta| \le k} \|\partial^\beta u\|_{L^p(\Omega)}^p ,
\]
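under the norm $\|u\|_{H^s(\mathbb{R}^n)}^2 = \int_{\mathbb{R}^n} (1 + |\xi|^2)^s\, |\hat u(\xi)|^2 \, d\xi$, where $\hat u$ is the Fourier
\[
\|u\|_{C^k(\Omega)} = \sum_{|\beta| \le k} \sup_{x \in \Omega} |\partial^\beta u(x)|
\]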
(which could be infinite). There are different conventions for $C^k(\overline\Omega)$. A natural
one, adopted in [107], is that $C^k(\overline\Omega)$ consists of all functions $u \in C^k(\Omega)$ such
that $u$ and all its partials through order $k$ possess continuous extensions to $\overline\Omega$; for
such functions, it is easy to see $\|u\|_{C^k(\Omega)} = \sum_{|\beta| \le k} \sup_{x \in \overline\Omega} |\partial^\beta u(x)| =: \|u\|_{C^k(\overline\Omega)}$,
and define $\|u\|_{C^{k,\alpha}(\Omega)} = \|u\|_{C^k(\Omega)} + [u]_{k,\alpha;\Omega}$, which may be infinite. We let
$C^{k,\alpha}(\Omega)$ (or $C^{k,\alpha}_{\mathrm{loc}}(\Omega)$ for emphasis) be the set of all functions $u$ in $C^k(\Omega)$ such
that for all compact $X \subset \Omega$, $[u]_{k,\alpha;X}$ is finite; this is equivalent to requiring
$[u]_{k,\alpha;\bar B}$ be finite for all closed balls $\bar B \subset \Omega$. Note that if $[u]_{0,\alpha;\Omega}$ is finite, then $u$
is uniformly continuous on $\Omega$, and hence extends continuously to $\overline\Omega$, and if $\bar u$
denotes the extension, $[\bar u]_{0,\alpha;\overline\Omega} = [u]_{0,\alpha;\Omega}$. In [107], $C^{k,\alpha}(\overline\Omega)$ is then defined to
be all $u \in C^k(\overline\Omega)$ such that $[u]_{k,\alpha;\Omega}$ is finite. To have a Banach space with norm
$\|u\|_{C^{k,\alpha}(\Omega)}$, one would naturally restrict to the subspace $C_B^{k,\alpha}(\Omega)$ of functions
$u \in C_B^k(\Omega)$ with $[u]_{k,\alpha;\Omega}$ finite; when $\Omega$ is bounded $C_B^{k,\alpha}(\Omega) = C^{k,\alpha}(\overline\Omega)$, and
we will generally work in bounded domains, or in unbounded domains with
weighted norms. We note that in [2], the definition of $C^{k,\alpha}(\overline\Omega)$ also includes the
condition that $[u]_{\ell,\alpha;\Omega}$ is finite for all $0 \le \ell \le k$, and the norm adds the maximum
of these seminorms to the $C^k$-norm. In the domains we will use, it is automatic
that $u \in C_B^{k,\alpha}(\Omega)$ will satisfy this requirement, and in particular the seminorms
$[u]_{\ell,\alpha;\Omega}$, $0 \le \ell \le k$, are bounded up to a constant factor by $\|u\|_{C^{k,\alpha}(\Omega)}$.
For example, if is convex, this follows from the mean value theorem; for
a smooth compact manifold-with-boundary, this again follows by the mean
value theorem, using a finite covering by coordinate charts, each of which is a
compactly contained restriction of a larger chart, and the image of which is a
ball or half-ball.
Remark 6-1. Whereas smooth functions are dense in Sobolev spaces, this fails
in Hölder spaces. For $0 < \alpha \le 1$, the function $f_\alpha(x) = |x|^\alpha$ is readily seen to be
in $C^{0,\alpha}([-1, 1])$. For $0 < \alpha < 1$ and $f$ Lipschitz, we have, for $0 < x \le 1$,
\[
[f_\alpha - f]_{0,\alpha;[-1,1]} \ge \frac{|x^\alpha - (f(x) - f(0))|}{x^\alpha},
\]
which approaches 1 as $x \searrow 0$. A similar two-sided argument shows the estimate
$[f_1 - f]_{0,1;[-1,1]} \ge 1$ for $f \in C^1([-1, 1])$. Furthermore, while by Weierstrass
approximation, $W^{k,p}((-1, 1))$ and $C^k([-1, 1])$ are separable, this fails for
$C^{0,\alpha}([-1, 1])$ for $0 < \alpha \le 1$: the functions $f_p(x) = |x - p|^\alpha$ are in $C^{0,\alpha}([-1, 1])$,
while for $p$ and $q$ distinct points in $[-1, 1]$, $[f_p - f_q]_{0,\alpha;[-1,1]} \ge 2$.
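The lower bound in the last sentence of the remark comes from evaluating the Hölder difference quotient of $f_p - f_q$ at the single pair of points $(p, q)$. The following is a minimal numerical sketch of that observation, not part of the text's argument; the helper names (`holder_quotient`, `sampled_seminorm`, `shifted_power`) and the sample values of $\alpha$, $p$, $q$ are illustrative assumptions, and a sampled maximum only bounds the true seminorm from below.

```python
def holder_quotient(f, x, y, alpha):
    """Difference quotient |f(x) - f(y)| / |x - y|**alpha for x != y."""
    return abs(f(x) - f(y)) / abs(x - y) ** alpha


def sampled_seminorm(f, alpha, points):
    """Maximum of the quotient over sample pairs: a lower bound for the seminorm."""
    return max(
        holder_quotient(f, x, y, alpha)
        for i, x in enumerate(points)
        for y in points[i + 1:]
    )


def shifted_power(p, alpha):
    """The function f_p(x) = |x - p|**alpha from the remark."""
    return lambda x: abs(x - p) ** alpha


alpha = 0.5                      # illustrative exponent in (0, 1]
p, q = -0.25, 0.5                # two distinct points of [-1, 1]
f_p, f_q = shifted_power(p, alpha), shifted_power(q, alpha)


def difference(x):
    return f_p(x) - f_q(x)


# Dyadic grid on [-1, 1]; it contains both p and q.
grid = [-1 + k / 64 for k in range(129)]

pair_quotient = holder_quotient(difference, p, q, alpha)
difference_bound = sampled_seminorm(difference, alpha, grid)
single_bound = sampled_seminorm(f_p, alpha, grid)

print("quotient of f_p - f_q at (p, q):", pair_quotient)
print("sampled seminorm of f_p - f_q:  ", difference_bound)
print("sampled seminorm of f_p:        ", single_bound)
```

Since the functions $f_p$ stay a fixed distance apart in the Hölder norm for every pair $p \ne q$, they form an uncountable uniformly separated family, which is what rules out separability.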
As we will see in this section, elliptic regularity theory is readily formulated
in terms of Sobolev and Hölder spaces, rather than C k spaces. (See also the
discussion of the Newtonian potential in Section 7.1.1.2.)
The above definitions can be extended naturally to tensor fields (for example
by taking the summation or maximum of the corresponding norms of Cartesian
components of the tensor field). To extend to functions or tensors (or more
generally to sections of vector bundles) on compact manifolds (smooth, and with
or without boundary), one builds a norm using the Sobolev or Hölder norms
coming from a covering by a finite number of appropriate coordinate charts
(say, charts which are compactly contained restrictions of larger charts, so that
overlap maps have uniformly bounded derivatives); any two such norms are
A PRIMER ON ELLIPTIC PDE 171
There are related notions which include elliptic operators, but which can
accommodate systems of differential equations involving bundles whose fibers
may have different dimensions.
Definition 6-8. A linear differential operator L with principal symbol σ is
overdetermined-elliptic if for each ξ ∈ T p∗ M \ {0}, σ (ξ ) is injective, while it is
underdetermined-elliptic if for each ξ ∈ T p∗ M \ {0}, σ (ξ ) is surjective.
Suppose L is a linear differential operator between two bundles E and F
each equipped with a smoothly varying fiber-wise inner product ⟨ · , · ⟩ (we use
the same notation for both if it will not cause confusion). We will in particular
focus on tensor bundles over a Riemannian manifold, and the inner product
will be the natural inner product on tensors induced by the metric. The formal
adjoint operator $L^*$ between $F$ and $E$ is defined by integration by parts against
compactly supported (away from the boundary if $\partial M$ is nonempty) sections of the
corresponding bundle: $\int_M \langle Lu, v\rangle \, dv_g = \int_M \langle u, L^* v\rangle \, dv_g$; note that $(L^*)^* = L$.
We say that $Lu = f$ weakly if for all smooth sections $v$ (compactly supported in the
interior of $M$), $\int_M \langle f, v\rangle \, dv_g = \int_M \langle u, L^* v\rangle \, dv_g$; this can be suitably interpreted
and defined for distributions $u$. It is not a hard exercise to show that the principal
symbol of $L^*$ is the conjugate transformation $\sigma(\xi)^*$: at each point $p \in M$ and
$\xi \in T_p^* M$, we have $\langle \sigma(\xi)(v), w\rangle = \langle v, \sigma(\xi)^*(w)\rangle$. (Without the factor $i^m$ in the
definition, which gets conjugated on one side of the Hermitian product, this
relation would hold only up to sign.)
Exercise 6-9. a. Prove the preceding claim, and conclude that L is underdeter-
mined-elliptic if and only if L ∗ is overdetermined-elliptic.
b. Suppose Q is a linear differential operator between (sections of vector bundles)
E and E 1 , and likewise P is a linear differential operator between E 1 and F,
so that P ◦ Q is a linear differential operator between E and F. Show that the
principal symbols satisfy σ P◦Q (ξ ) = σ P (ξ ) ◦ σ Q (ξ ).
c. Conclude that L ∗ is overdetermined-elliptic if and only if L L ∗ is elliptic.
We will now turn to two important examples; another will be met in the
discussion of the TT decomposition (Section 6.2.1).
Example 6-10. Recall the linearization $DR_g$ of the scalar curvature operator
(2.3.11), and its formal adjoint $(DR_g)^*$, given by
\[
(DR_g)^*(f) = -(\Delta_g f)\, g + \operatorname{Hess}_g f - f \operatorname{Ric}(g).
\]
The principal symbol of $(DR_g)^*$ is then given by $\sigma(\xi)(1) = |\xi|_g^2\, g - \xi \otimes \xi$,
mapping a one-dimensional fiber to the space of symmetric (0, 2)-tensors on
$T_p M$. If $\sigma(\xi)(1) = 0$, then upon taking a trace we get $(n-1)|\xi|_g^2 = 0$, which
cannot hold for $\xi \ne 0$. In this case, the symbol of $(DR_g)^*$ is injective, i.e., $(DR_g)^*$
is overdetermined-elliptic. Note that $DR_g (DR_g)^*$ is a fourth-order operator with
principal part $(n-1)\Delta_g^2$, which is indeed elliptic, in agreement with the preceding
exercise.
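The two symbol computations in this example can be checked by direct arithmetic. The sketch below works at a single point with the Euclidean metric (an illustrative assumption), and it obtains the symbol of $DR_g$ from the adjoint relation stated earlier, namely $\sigma_{DR_g}(\xi)(h) = \langle |\xi|^2 g - \xi \otimes \xi, h\rangle = |\xi|^2 \operatorname{tr} h - h(\xi, \xi)$. The function names and the sample covector are hypothetical.

```python
def dot(a, b):
    """Euclidean inner product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def adjoint_symbol(xi):
    """sigma(xi)(1) = |xi|^2 g - xi (x) xi, with g the identity matrix."""
    n = len(xi)
    norm2 = dot(xi, xi)
    return [[norm2 * (i == j) - xi[i] * xi[j] for j in range(n)] for i in range(n)]


def trace(h):
    return sum(h[i][i] for i in range(len(h)))


def linearization_symbol(xi, h):
    """Symbol of DR_g applied to h: |xi|^2 tr(h) - h(xi, xi)."""
    n = len(xi)
    h_xi_xi = sum(h[i][j] * xi[i] * xi[j] for i in range(n) for j in range(n))
    return dot(xi, xi) * trace(h) - h_xi_xi


xi = [1, -2, 3, 2]               # a nonzero covector in dimension n = 4
n, norm2 = len(xi), dot(xi, xi)
h = adjoint_symbol(xi)

# Trace of sigma(xi)(1) versus (n - 1)|xi|^2.
print(trace(h), (n - 1) * norm2)
# Composite symbol versus (n - 1)|xi|^4, the symbol of (n - 1) * Laplacian^2.
print(linearization_symbol(xi, h), (n - 1) * norm2 ** 2)
```

Integer entries keep the arithmetic exact, so the printed pairs can be compared directly.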
For nonlinear operators, one can define ellipticity (at any p ∈ M) at a given
section of the bundle on which the operator acts, using the linearization of the
operator at the given section. For example, the graphical mean curvature operator,
cf. Exercise 6-58, is elliptic. As another example, we now consider the Ricci
curvature operator, which does not have an elliptic linearization (see the next
example). We saw in Section 5.3.1 that in the Lorentzian setting, the Ricci
curvature operator can be put into hyperbolic form in wave coordinates. An
analogue holds in the Riemannian setting, using a derivation following that of
equation (5.3.1), with harmonic coordinates playing the role of wave coordinates.
Example 6-11. The operator $g \mapsto \operatorname{Ric}(g)$ is nonlinear, and we let $P = D\operatorname{Ric}_g$
be its linearization at a metric $g$. This operator acts between symmetric (0, 2)-tensor
fields on $M$, and it fails to be elliptic precisely because of diffeomorphism
invariance. We first note that if $\varphi$ is a diffeomorphism on $M$, then $\varphi^*(\operatorname{Ric}(g)) =
\operatorname{Ric}(\varphi^* g)$ (this is easy to compute in local coordinates $\psi_1 : U \subset \mathbb{R}^n \to V \subset M$
and $\psi_2 = \varphi \circ \psi_1 : U \to \varphi(V)$). If $X$ is a smooth vector field, generating a flow
$\varphi_t$, then differentiating $\varphi_t^*(\operatorname{Ric}(g)) = \operatorname{Ric}(\varphi_t^* g)$ with respect to $t$ at $t = 0$ gives
$\mathcal{L}_X(\operatorname{Ric}(g)) = P(\mathcal{L}_X g)$, where $\mathcal{L}_X$ is the Lie derivative. As differential operators
on $X$, $P(\mathcal{L}_X g)$ is of third order and $\mathcal{L}_X(\operatorname{Ric}(g))$ is of first order. Taking the
principal symbol yields $0 = \sigma_P(\xi) \circ \sigma_Q(\xi)$, where we let $Q(X) = \mathcal{L}_X g$. Since
$X \mapsto Q(X)$ maps vector fields to symmetric (0, 2)-tensor fields, this symbol
equation means that $0 = \sigma_P(\xi)(\sigma_Q(\xi)(v))$ for any $v \in T_p M$ and $\xi \in T_p^* M$.
From $(\mathcal{L}_X g)_{ij} = X_{i;j} + X_{j;i}$, we see that $\sigma_Q(\xi)(v) = i(\xi \otimes v^\flat + v^\flat \otimes \xi)$. Thus
for $\xi \ne 0$, we have a nontrivial kernel for $\sigma_P(\xi)$. In Exercise 6-59 we will
see that the kernel elements of $\sigma_P(\xi)$ for $\xi \ne 0$ are precisely those symmetric
(0, 2)-tensors of the form $\xi \otimes v^\flat + v^\flat \otimes \xi$.
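The vanishing of $\sigma_P(\xi)$ on tensors of the form $\xi \otimes v^\flat + v^\flat \otimes \xi$ can be seen concretely. The sketch below assumes the standard expression for the symbol of the linearized Ricci operator at the Euclidean metric, written up to an overall constant factor (which does not affect the kernel): $(\sigma_P(\xi)h)_{ij} = |\xi|^2 h_{ij} + \xi_i \xi_j \operatorname{tr} h - \xi_i (h\xi)_j - \xi_j (h\xi)_i$. This formula is not displayed in the text, and the function names and sample vectors are hypothetical.

```python
def dot(a, b):
    """Euclidean inner product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def ricci_symbol(xi, h):
    """Assumed symbol of D Ric_g at the Euclidean metric, up to a constant factor."""
    n = len(xi)
    norm2 = dot(xi, xi)
    tr = sum(h[i][i] for i in range(n))
    h_xi = [dot(h[i], xi) for i in range(n)]      # contraction of h with xi
    return [
        [norm2 * h[i][j] + xi[i] * xi[j] * tr - xi[i] * h_xi[j] - xi[j] * h_xi[i]
         for j in range(n)]
        for i in range(n)
    ]


def lie_symbol(xi, v):
    """xi (x) v + v (x) xi; the factor i in sigma_Q is dropped (kernel unaffected)."""
    n = len(xi)
    return [[xi[i] * v[j] + v[i] * xi[j] for j in range(n)] for i in range(n)]


xi = [1, 2, -1]
v = [3, 0, 5]
identity = [[int(i == j) for j in range(3)] for i in range(3)]

on_gauge_direction = ricci_symbol(xi, lie_symbol(xi, v))
on_identity = ricci_symbol(xi, identity)

print(on_gauge_direction)   # image of xi (x) v + v (x) xi
print(on_identity)          # image of the metric itself, for comparison
```

The second output shows that the symbol is not identically zero, so the kernel found in the first output is a genuine degeneracy rather than a trivial one.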
The constraints operator $\Phi(g, K) = \big(R(g) - |K|_g^2 + (\operatorname{tr}_g K)^2,\ \operatorname{div}_g K - d(\operatorname{tr}_g K)\big)$
Remark 6-13. We could have labeled the constant differently for each of the
estimates above, since in addition to depending on L and M (including the
background metric g, if used to define the norms), the constant in (6.1.2) depends
on p, and in (6.1.3) depends on α.
Inequalities (6.1.1)–(6.1.3) are basic versions of, respectively, the L 2 -estimates,
L p -estimates, and Schauder estimates. One often breaks out the case p = 2
separately, as it can be obtained by a more elementary treatment than the general
L p case. These are a priori estimates: we are assuming smoothness on u to begin
with, and then the estimates show that u and all its derivatives up through order
m can be estimated in the indicated norms in terms of the associated norm of
$Lu$, and a lower-order norm on $u$; for instance the first and second derivatives of
a function $u$ can be controlled in terms of $\Delta_g u$, along with a lower-order norm.
In fact, the estimates do not need $u$ to be $C^\infty$; it is sufficient to assume that $u$
lies in $H^m(M)$, $W^{m,p}(M)$ or $C^{m,\alpha}(M)$, as the case may be. For (6.1.1)–(6.1.2)
this can be easily derived from the smooth case by taking a sequence of smooth
u i converging to u in the relevant Sobolev norm.
We next indicate a few basic facts one can get by combining the estimates
with compactness results. These observations are used heavily; for instance they
can be employed in the proof of the Fredholm property (Proposition 6-25). For
the next proposition, the domain of L can be any of C m,α (M), 0 < α < 1, or
W m, p (M), 1 < p < ∞, for which the above elliptic estimates hold. We will see
shortly that the kernel is independent of which of these function spaces we take
for the domain.
Proposition 6-14. Let M be a closed manifold, on which L is a linear elliptic
operator of order m. The kernel of L is finite-dimensional.
Proof. The kernel of L is a closed subspace, and hence a Banach space. The claim
will be established by showing that the closed unit ball in ker L is compact. Given
any sequence in this unit ball, we can extract a subsequence u i that converges
appropriately, either in L p (M) (using Rellich–Kondrachov) or in C 0 (M) (using
Ascoli–Arzelà). Since Lu i = 0, the respective elliptic estimate may be applied
to the differences u i −u j to show that the sequence u i converges in ker L, from
which the claim follows. □
A second fundamental fact is an injectivity estimate transverse to the kernel;
thus should L have no kernel, it gives a sharper overall estimate.
Proposition 6-15. Let M be a closed manifold, on which L is a linear elliptic
operator of order m. Let 0 < α < 1, respectively 1 < p < ∞, and suppose S ⊂
C m,α (M), respectively S ⊂ W m, p (M), is a closed subspace with S ∩ ker L = {0}.
There is a C > 0 such that for all u ∈ S, ∥u∥C m,α (M) ≤ C∥Lu∥C 0,α (M) , respectively
∥u∥W m, p (M) ≤ C∥Lu∥ L p (M) .
Exercise 6-16. Prove this proposition. Proceed by contradiction, starting with a
sequence u i ∈ S with ∥u i ∥C m,α (M) = 1 but ∥Lu i ∥C 0,α (M) → 0. Use the basic elliptic
estimate, together with Ascoli–Arzelà, to get convergence of a subsequence in
C m,α (M). Find a contradiction. The respective statement in Sobolev spaces
follows the same way, via Rellich–Kondrachov.
We can get higher-order estimates by differentiation.
Proposition 6-17. Let M be a closed manifold, on which L is a linear ellip-
tic operator of order m. Let 0 < α < 1, 1 < p < ∞, and k ∈ Z+ . There
exists a constant C such that ∥u∥W m+k, p (M) ≤ C(∥Lu∥W k, p (M) + ∥u∥ L p (M) ) for
all u ∈ W m+k, p (M). Likewise there is a constant C such that ∥u∥C m+k,α (M) ≤
C(∥Lu∥C k,α (M) + ∥u∥C 0 (M) ) for all u ∈ C m+k,α (M).
Proof. Observe that L(∂ β u) − ∂ β (Lu) can be expressed as a linear combination
of ∂ γ u for |γ | < m + |β| (assuming the coefficients of L are smooth, or smooth
enough, else we have to restrict |β|). Applying (6.1.2), for example, to ∂ β u
for |β| ≤ k, we get ∥u∥W m+k, p (M) ≤ C(∥Lu∥W k, p (M) + ∥u∥W m+k−1, p (M) ) for all
u ∈ W m+k, p (M) (adjusting C as necessary). The lower-order term ∥u∥W m+k−1, p (M)
can be replaced by ∥u∥ L p (M) via interpolation (see Exercise 6-6). The higher-
order analogue of (6.1.3) follows likewise. □
6.1.4.1. On the proof of (6.1.1)–(6.1.3) and interior estimates. We make only
brief comments on the proofs of (6.1.1)–(6.1.3). One natural approach to proving
(6.1.1) is to use Fourier analysis; see, e.g., the proof from [139, p. 193], which
also uses a parametrix (for more on this notion see [114; 115]). For a beautiful
6.1.5. Elliptic regularity and bootstrapping. Suppose for the moment that the
coefficients of the linear elliptic operator L are smooth, and u is a locally
integrable function (or more generally a distribution) so that Lu = f weakly.
We would like to infer how smooth u is based on the regularity of f . A seminal
result along these lines is Weyl’s lemma, which we recall now.
show the product rule for differences: $\Delta_i^h(Lu) = L(\Delta_i^h u) + (\Delta_i^h L)(u_h)$. For
$0 < |h| < \operatorname{dist}(\Omega', \partial\Omega)$, the interior estimate for $L$ gives, since $\Delta_i^h u \in H^m(\Omega')$,
\[
\|\Delta_i^h u\|_{H^m(\Omega'')} \le C \big( \|L(\Delta_i^h u)\|_{L^2(\Omega')} + \|\Delta_i^h u\|_{H^{m-1}(\Omega')} \big)
\]
where we used a bound on the difference quotients $\Delta_i^h a_\alpha$, by the mean value
theorem, say. To complete the proof, we invoke a difference quotient lemma,
which roughly says that for $u \in H^m$, difference quotients of $u$ are uniformly
bounded if and only if $u \in H^{m+1}$. More precisely, for $v \in H^1(\Omega)$, and for any
$\Omega' \subset \Omega$ with $0 < |h| < d(\Omega', \partial\Omega)$, $\|\Delta_i^h v\|_{L^2(\Omega')} \le \|\partial v/\partial x^i\|_{L^2(\Omega)}$; conversely for
$v \in L^2(\Omega)$, if there is a $K > 0$ such that for any $\Omega' \subset \Omega$ with $0 < h < d(\Omega', \partial\Omega)$,
$\|\Delta_i^h v\|_{L^2(\Omega')} \le K$, then $\partial v/\partial x^i \in L^2(\Omega)$ and $\|\partial v/\partial x^i\|_{L^2(\Omega)} \le K$. For a proof
of the analogous statement for all $1 < p < \infty$, see [107, Chapter 7]. Applying
this to (6.1.4) yields a constant $C''$ such that (note where we use $Lu \in H^1(\Omega)$)
\[
\|\Delta_i^h u\|_{H^m(\Omega'')} \le C'' \big( \|Lu\|_{H^1(\Omega)} + \|u\|_{H^m(\Omega)} \big).
\]
By applying the difference quotient lemma again, we get $u \in H^{m+1}_{\mathrm{loc}}(\Omega)$ as
desired.
The proof for 1 < p < ∞ follows in the same way, and we can apply induction
to bootstrap regularity for higher k. A similar approach can be used for Hölder
regularity; see Exercise 6-60. □
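The difference quotient lemma used in this proof can be illustrated in one dimension. The sketch below is a numerical illustration under stated assumptions, not a proof: it takes $\Omega = (0, 1)$ and $\Omega' = (0.25, 0.75)$, approximates $L^2$ norms by a midpoint rule, and compares a function in $H^1$ (a kink) with a function in $L^2$ that is not in $H^1$ (a step). The function names and the choice of test functions are hypothetical.

```python
import math


def l2_norm(func, a, b, samples=20000):
    """Midpoint-rule approximation of the L^2 norm of func on (a, b)."""
    width = (b - a) / samples
    total = sum(func(a + (k + 0.5) * width) ** 2 for k in range(samples))
    return math.sqrt(total * width)


def difference_quotient(v, h):
    """The function x -> (v(x + h) - v(x)) / h."""
    return lambda x: (v(x + h) - v(x)) / h


def kink(x):
    """In H^1((0, 1)); the weak derivative is -1 or +1."""
    return abs(x - 0.5)


def step(x):
    """In L^2((0, 1)) but not in H^1((0, 1))."""
    return 1.0 if x > 0.5 else 0.0


inner = (0.25, 0.75)             # Omega' inside Omega = (0, 1)
steps = [0.1, 0.01, 0.001]       # each below dist(Omega', boundary of Omega) = 0.25
derivative_norm = 1.0            # L^2((0, 1)) norm of the weak derivative of kink

kink_norms = [l2_norm(difference_quotient(kink, h), *inner) for h in steps]
step_norms = [l2_norm(difference_quotient(step, h), *inner) for h in steps]

print("kink:", kink_norms)       # stays below derivative_norm
print("step:", step_norms)       # grows as h shrinks
```

For the step function the exact norm of the difference quotient on $\Omega'$ is $h^{-1/2}$, so no uniform bound $K$ exists, consistent with the converse direction of the lemma.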
From here we see that for L linear elliptic of order m, with smooth coefficients,
if u ∈ H m (M) and Lu = f ∈ C ∞ (M), then by induction we can prove u ∈ H k (M)
for all k ∈ Z+ , and analogously if instead u ∈ W m, p (M) and 1 < p < ∞, or
u ∈ C m,α (M) with 0 < α < 1. In the Hölder setting, then, we see directly that u is
$C^\infty$. In the case of Sobolev spaces, one invokes a Sobolev embedding: if $kp > n$,
any $u \in W^{k,p}(M)$ can be represented by a continuous function $u \in C^0(M)$; in fact
if $\ell$ is a nonnegative integer, $(k-1)p \le n < kp$, and $0 < \alpha < 1$ with $\alpha \le k - \frac{n}{p}$,
then if $u \in W^{k+\ell,p}(M)$, it follows that $u \in C^{\ell,\alpha}(M)$, and there is a constant
$C > 0$ so that for all $u \in W^{k+\ell,p}(M)$, $\|u\|_{C^{\ell,\alpha}(M)} \le C\|u\|_{W^{k+\ell,p}(M)}$. Section 7.2.2.1
discusses more such embeddings; see also [2; 86; 107], for example. We likewise
conclude that if $L$ is linear elliptic of order $m$ on $\Omega$, with smooth coefficients,
and $u \in W^{m,p}_{\mathrm{loc}}(\Omega)$ for some $1 < p < \infty$ with $Lu = f \in C^\infty(\Omega)$, then $u \in C^\infty(\Omega)$.
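The bookkeeping in the Sobolev embedding above (find $k$ with $(k-1)p \le n < kp$, then read off $\ell$ and the bound on $\alpha$) is easy to mechanize. The helper below is a small sketch of that arithmetic only; the name `holder_target` and its return convention are assumptions made for illustration.

```python
from fractions import Fraction


def holder_target(n, j, p):
    """For W^{j,p} on an n-manifold, return (l, bound) such that the embedding
    stated above gives W^{j,p} inside C^{l,alpha} for every 0 < alpha < 1 with
    alpha <= bound. Returns None when j is too small for the statement to apply.
    """
    n_over_p = Fraction(n) / Fraction(p)
    # k is the integer with (k - 1) p <= n < k p, i.e. k - 1 = floor(n / p).
    k = n_over_p.numerator // n_over_p.denominator + 1
    ell = j - k
    if ell < 0:
        return None
    return ell, k - n_over_p


# H^2 in dimension 3: k = 2, so l = 0 and alpha <= 1/2.
print(holder_target(3, 2, 2))
# H^4 in dimension 3: two more derivatives survive, with the same exponent bound.
print(holder_target(3, 4, 2))
# H^1 in dimension 3 is below the threshold kp > n.
print(holder_target(3, 1, 2))
```

Exact rational arithmetic avoids floating-point trouble in the borderline case $(k-1)p = n$, where the bound equals 1 and every $0 < \alpha < 1$ is allowed.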
Remark 6-20. We have been working with a priori regularity for $u$, say $u \in W^{m,p}_{\mathrm{loc}}$,
for instance. We often naturally run into weak solutions that do not have this
regularity a priori, and in fact the following elliptic regularity result holds: for a
linear elliptic operator $L$ of order $m$ on $\Omega \subset \mathbb{R}^n$ with smooth coefficients, if $u$ is
a distribution with $Lu = f \in C^\infty(\Omega)$, then $u \in C^\infty(\Omega)$. Since smoothness is a
local question, we can localize support near any point, and so consider a smooth
bump function $\zeta \in C_c^\infty(\Omega)$ that is identically 1 on $B_\varepsilon(p)$ compactly contained in
$\Omega$, and let $\psi_1$ be a smooth bump function identically 1 on $B_{\varepsilon/2}(p)$ supported in
$B_\varepsilon(p)$. By [195, Theorem 8.9], $\zeta u \in H^s(\mathbb{R}^n)$ for some $s$, so that since $\zeta u = u$
on the support of $\psi_1$, $L(\psi_1 u) - \psi_1 Lu \in H^{s-m+1}(\mathbb{R}^n)$. We can apply elliptic
regularity in these Sobolev spaces as in [93, Lemma 6.32] (cf. [139, Theorem
III.4.3] for a related approach) to conclude $\psi_1 u \in H^{s+1}(\mathbb{R}^n)$. The claim follows
by further localization and bootstrapping regularity.
We could compactify the localized problem to a closed manifold if we wish:
take an elliptic operator $\tilde L$ for which $\tilde L = L$ on $B_\varepsilon(p)$, while $\tilde L$ has constant
coefficients outside the support of $\zeta$; this can be done with a smooth convex
combination of $L$ with the constant-coefficient operator whose coefficients are
those of $L$ at $p$, by taking sufficiently small $\varepsilon$ if necessary. We can construe
$\tilde L(\psi_1 u) =: \tilde f$ on a domain compactified to an n-torus, at which point Fourier
series methods could be employed, as in [219, Chapter 6].
As we may want to apply linear theory in situations where the coefficients are
not necessarily smooth, it is very important to note the regularity required on the
by the assumption on f . Thus this integral identity holds for all v ∈ H m (M),
and hence PP ∗ u = f weakly. Since P is assumed to have smooth coefficients
and g is smooth, elliptic regularity will allow us to conclude the appropriate
regularity for any of the splittings above. For example, for f ∈ L 2 (M), we
conclude u ∈ H 2m (M), so h := P ∗ u ∈ H m (M) solves Ph = f as desired. See
Exercise 6-60 for a proof of the splitting for p > 1, and for the Hölder setting. □
Remark 6-22. We have observed that certain facts extend to overdetermined-
elliptic operators L on a closed manifold M, such as finite-dimensionality of the
kernel, which we gleaned by using a metric g to define P = L ∗ , and considering
the elliptic operator PP ∗ = L ∗ L. Of course, if we could establish the elliptic
estimates in Proposition 6-12 for overdetermined-elliptic operators, then the
proofs of various facts would follow from the estimates, such as Proposition 6-15,
which could then be used to prove the Fredholm/Hodge splitting directly for
L = P ∗ overdetermined-elliptic. In fact, the Fourier transform proof for the
L 2 -estimates works for injective symbol (cf. [114]); the scaling argument in [210]
for the Schauder estimates uses hypoellipticity, that the elements in the kernel
are smooth, which holds for overdetermined-elliptic operators. One could also
establish estimates for L overdetermined-elliptic via the corresponding estimates
for L ∗ L = PP ∗ , say in case p = 2 as a mapping L ∗ L : H m (M) → H −m (M),
for which suitable mapping properties and estimates are available via Fourier
analysis. For other spaces, see, e.g., [210] and [214, Chapters 13–14].
6.1.9. The maximum principle and the Harnack inequality. The last stop on
this brief tour of elliptic PDE brings us to several important results for second-
order scalar operators. We will review in the next chapter how these properties
can be proved for the Euclidean Laplace operator using the mean value property.
For now, consider an operator
\[
L = a^{ij}(x)\, \frac{\partial^2}{\partial x^i \partial x^j} + b^i(x)\, \frac{\partial}{\partial x^i} + c(x).
\]
We should impose some ellipticity and boundedness conditions on the coefficients
$a^{ij}$, $b^i$ and $c$. Assume that the coefficients are continuous on the closure of
a bounded domain $\Omega \subset \mathbb{R}^n$, and that $(a^{ij})$ satisfies an ellipticity condition:
there is $\lambda > 0$ so that for all $x \in \Omega$ and all $\xi \in \mathbb{R}^n$, $a^{ij}(x)\xi_i \xi_j \ge \lambda|\xi|^2$. (The
continuity assumption, plus ellipticity, is sufficient to give the required bounds,
but not necessary; see [107], for example. As noted earlier, we may arrange by
symmetrization that $a^{ij} = a^{ji}$.)
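Weak maximum principle. We state first a weak maximum principle for $L$, which
allows one to estimate certain $u$ on a bounded open set in terms of its boundary
values. Suppose $u \in C^2(\Omega) \cap C^0(\overline\Omega)$. We write $u = u^+ - u^-$, $|u| = u^+ + u^-$, where
$u^+ = \max(u, 0) \ge 0$, $u^- = -\min(u, 0) \ge 0$ (note that in [107], $u^- = \min(u, 0)$
is the opposite of what we have taken here).
(i) Suppose $c = 0$ on $\Omega$. If $Lu \ge 0$ on $\Omega$, then $\max_{\overline\Omega} u = \max_{\partial\Omega} u$, while if
$Lu \le 0$, then $\min_{\overline\Omega} u = \min_{\partial\Omega} u$.
(ii) Suppose $c \le 0$ in $\Omega$. If $Lu \ge 0$ on $\Omega$, then $\max_{\overline\Omega} u \le \max_{\partial\Omega} u^+$, while if
$Lu \le 0$ in $\Omega$, then $\min_{\overline\Omega} u \ge -\max_{\partial\Omega} u^-$. Thus $\max_{\overline\Omega} |u| = \max_{\partial\Omega} |u|$ if
$Lu = 0$.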
The idea behind these is simple. To illustrate, if $Lu > 0$, we observe that $u$
cannot have an interior maximum. If it does, say at $x_0 \in \Omega$, then each $\partial u/\partial x^i$
vanishes at $x_0$, and the Hessian matrix of $u$ is nonpositive definite at $x_0$. If $c = 0$,
we see, working in Cartesian coordinates diagonalizing $(a^{ij}(x_0))$ (if symmetric;
else diagonalize the Hessian at $x_0$), that $Lu(x_0) \le 0$. This observation also
works when $c \le 0$ and $u(x_0) \ge 0$, or in any case when $u(x_0) = 0$. This gives
a contradiction. The general case where the inequality satisfied by $Lu$ is not
necessarily strict uses a perturbation to achieve the result; see [86, Chapter 6] or
[107, Chapter 3].
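The argument just given has a direct discrete counterpart, which the following sketch illustrates in one dimension. It is an illustration under stated assumptions rather than the text's method: the operator is $Lu = a u'' + b u'$ with $c = 0$ on $(0, 1)$, discretized by central differences on a grid fine enough that the off-diagonal coefficients are nonnegative; the coefficients, right-hand side and boundary values are hypothetical choices with $Lu > 0$.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm (no pivoting; adequate for the diagonally dominant
    system assembled below). lower[0] and upper[-1] are not used."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


N = 99                                   # number of interior grid points
h = 1.0 / (N + 1)
xs = [(i + 1) * h for i in range(N)]


def a(x):
    return 1.0 + x                       # elliptic: a >= 1 > 0


def b(x):
    return 2.0


def f(x):
    return 1.0 + x * x                   # prescribed Lu, strictly positive


left, right = 0.3, -0.2                  # boundary values u(0), u(1)

# Central differences; both off-diagonals are nonnegative since a/h^2 >= |b|/(2h).
lower = [a(x) / h ** 2 - b(x) / (2 * h) for x in xs]
diag = [-2.0 * a(x) / h ** 2 for x in xs]
upper = [a(x) / h ** 2 + b(x) / (2 * h) for x in xs]
rhs = [f(x) for x in xs]
rhs[0] -= lower[0] * left                # move known boundary values to the right side
rhs[-1] -= upper[-1] * right

u = [left] + solve_tridiagonal(lower, diag, upper, rhs) + [right]
residual = max(
    abs(lower[i] * u[i] + diag[i] * u[i + 1] + upper[i] * u[i + 2] - f(xs[i]))
    for i in range(N)
)

print("max over the closed interval:", max(u))
print("max over the boundary:       ", max(left, right))
print("residual of discrete equation:", residual)
```

At an interior grid maximum the discrete operator would be a nonnegative combination of nonpositive differences, hence at most zero, which contradicts the positive right-hand side in the same way as in the continuous argument.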
The strong maximum principle and the Harnack inequality. We now state a
strong maximum principle for operators L as above, which is more subtle (see
the references just cited for proofs). It strengthens the weak maximum principle
One last fundamental result is the Harnack inequality [86, Theorem 6.5] (see
also [107, Corollary 9.25]): for any connected, compactly contained open subset
$W$ of $\Omega$, there is a $C > 0$ such that if $u \ge 0$ is a nonnegative $C^2$ solution of $Lu = 0$
in $\Omega$, then $\sup_W u \le C \inf_W u$. In particular, if the supremum is positive on $W$,
the function $u$ is strictly positive in the connected component of $\Omega$ containing $W$.
The Harnack inequality can be established with weaker a priori assumptions
on u, and for operators L with coefficients satisfying modest assumptions on
ellipticity and boundedness; see [107, Chapter 9], for instance. For nonnegative
weak solutions of analogous divergence form equations (where the leading
order part of the operator has the form ∂i (a i j ∂ j u)), and under very general
assumptions of ellipticity, boundedness and measurability on the coefficients,
Moser established a Harnack inequality, from which he recovered oscillation
and Hölder continuity estimates first established in the breakthrough discovery
made independently by De Giorgi and Nash (see, for example, [107, Chapter 8]
or [214, Chapter 14]). We will not go into the details here, but in light of the
discussion above (p. 182) about applications to nonlinear elliptic equations, the
reader should still be able to appreciate that this result is fundamental in going
from weak solutions to higher regularity. For applications to the mean curvature
operator one can consult [107; 108], for example.
6.1.10. The method of super- and subsolutions. As a final topic in this section,
we discuss a version of the method of super- and subsolutions that will suffice for
our purposes. We will want to solve semilinear elliptic PDE of the form $\Delta_g \phi =
f(x, \phi)$, for $f : M \times I \to \mathbb{R}$, where $I \subset \mathbb{R}$ is an open interval. A subsolution $\phi_-$
satisfies $\Delta_g \phi_- \ge f(x, \phi_-)$, and a supersolution $\phi_+$ satisfies $\Delta_g \phi_+ \le f(x, \phi_+)$.
We reiterate that we take $g$ to be smooth.
for $\Delta_g \phi = f(x, \phi)$, such that $[\inf_M \phi_-, \sup_M \phi_+] \subset I$. Then there is a smooth
function $\phi$, with $\phi_- \le \phi \le \phi_+$, such that $\Delta_g \phi = f(x, \phi)$.
For more general formulations of this method, see, e.g., [86; 127; 151]. We
use the proof as a means to illustrate the utility of some of the tools for elliptic
PDE that we have introduced above.
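Proof. We first use continuity and compactness to choose a constant $\rho > 0$
so that, if we write the function $f$ as $f(x, s)$, then $\rho - \partial f/\partial s$ is positive on
$M \times [\inf_M \phi_-, \sup_M \phi_+] = M \times [\min_M \phi_-, \max_M \phi_+]$. We let
\[
L = -\Delta_g + \rho, \qquad F(x, s) = \rho s - f(x, s),
\]
so that $\Delta_g \phi = f(x, \phi)$ is equivalent to $L(\phi) = F(x, \phi)$. Note that $\partial F/\partial s > 0$
on $M \times [\min_M \phi_-, \max_M \phi_+]$.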
Since we take $g$ to be smooth, the Fredholm alternative/Hodge decomposition
of the space of smooth functions for the linear, self-adjoint, elliptic operator $L$
has the form $C^\infty(M) = \ker L \oplus \operatorname{ran} L$. Now, by considering $\int_M u\, Lu \, dv_g$ and
integrating by parts, we see the operator $L$ has trivial kernel. Thus $L$ is surjective
as well. Since $\phi_\pm$ are smooth, we can thus define the following sequence of
smooth functions recursively: $\phi_1$ is the unique solution to $L\phi_1 = F(x, \phi_+)$, and
for any $k \in \mathbb{Z}_+$, $\phi_{k+1}$ is the unique solution to $L\phi_{k+1} = F(x, \phi_k)$. We remark that
for this to be well-defined, we have to make sure the range of each $\phi_k$ is inside
the interval $I$. In fact, the sequence $\phi_k$ is not only well-defined, but satisfies
\[
\phi_+ \ge \phi_1 \ge \phi_2 \ge \cdots \ge \phi_k \ge \phi_{k+1} \ge \cdots \ge \phi_- .
\]
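The monotone iteration can be watched in action on a toy problem. The following is a minimal sketch under several assumptions that are not in the text: $M$ is the circle discretized by a periodic grid, $\Delta_g$ is replaced by the periodic second difference, the nonlinearity is the hypothetical $f(x, s) = s^3 - (1 + \tfrac12 \sin x)$, the sub- and supersolutions are the constants 0.7 and 1.2, and $L = -\Delta + \rho$ with $F(x, s) = \rho s - f(x, s)$ and $\rho = 5$. The dense matrix inversion is only suitable for such small grids.

```python
import math


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small dense matrices only)."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [entry / scale for entry in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


N = 32
h = 2 * math.pi / N
source = [1.0 + 0.5 * math.sin(i * h) for i in range(N)]


def f(i, s):
    """Hypothetical nonlinearity f(x_i, s) = s^3 - (1 + 0.5 sin x_i)."""
    return s ** 3 - source[i]


def laplacian(phi):
    """Periodic second difference, a discrete stand-in for the Laplacian on the circle."""
    return [(phi[(i + 1) % N] - 2 * phi[i] + phi[i - 1]) / h ** 2 for i in range(N)]


low, high = 0.7, 1.2        # constant sub/supersolution: low^3 <= source <= high^3
rho = 5.0                   # rho > 3 s^2 = df/ds for all s in [low, high]

# Dense matrix of L = -Laplacian + rho on the periodic grid.
L = [[0.0] * N for _ in range(N)]
for i in range(N):
    L[i][i] = 2 / h ** 2 + rho
    L[i][(i + 1) % N] -= 1 / h ** 2
    L[i][(i - 1) % N] -= 1 / h ** 2
L_inv = invert(L)


def step(phi):
    """Solve L(phi_next) = F(x, phi) with F(x, s) = rho * s - f(x, s)."""
    rhs = [rho * phi[i] - f(i, phi[i]) for i in range(N)]
    return [sum(L_inv[i][j] * rhs[j] for j in range(N)) for i in range(N)]


iterates = [[high] * N]     # start from the supersolution
for _ in range(60):
    iterates.append(step(iterates[-1]))

phi = iterates[-1]
residual = max(abs(d - f(i, s)) for i, (d, s) in enumerate(zip(laplacian(phi), phi)))
print("range of the limit:", min(phi), max(phi))
print("residual of Laplacian(phi) - f(x, phi):", residual)
```

In this discrete setting the inverse of $L$ has nonnegative entries, which plays the role of the maximum principle: it is what forces each iterate to lie below the previous one and above the subsolution.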
all u ∈ C 2,α (M), ∥u∥C 2,α (M) ≤ C∥Lu∥C 0,α (M) , by Proposition 6-15. We have the
analogous Sobolev space estimates (6.1.2), but again since L has trivial kernel,
there is a C > 0 such that for all u ∈ W 2, p (M), ∥u∥W 2, p (M) ≤ C∥Lu∥ L p (M) .
With this in hand, we note that
C > 0 for which ∥u∥C 0,γ (M) ≤ C∥u∥W 2, p (M) . Thus from ∥φk ∥W 2, p (M) ≤ K , we
see that φk is bounded in C 0,γ (M), so that Ascoli–Arzelà yields a C 0 -convergent
subsequence, and so the limit φ is continuous; moreover, by monotonicity, the
full sequence converges uniformly.
By applying the elliptic estimate judiciously, we can obtain more regularity
on φ. For instance, for k, ℓ > 1, we have
of conformal classes. To do this, we begin by recalling the formula for the scalar
curvature under a conformal change of metric.
Proposition 6-27. If $\hat g$ and $g$ are conformally related metrics on an n-manifold
$M$ ($n \ge 3$), say $\hat g = u^{\frac{4}{n-2}} g$ with $u > 0$, then
\[
R(\hat g) = -\frac{4(n-1)}{n-2}\, u^{-\frac{n+2}{n-2}} \Big( \Delta_g u - \frac{n-2}{4(n-1)}\, R(g)\, u \Big). \tag{6.1.8}
\]
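Two quick sanity checks of (6.1.8) can be run pointwise. The sketch below is illustrative only; the function name and the sample values are hypothetical. The first check uses a constant conformal factor, for which $\hat g$ is a constant rescaling of $g$ and the scalar curvature must scale by the inverse factor. The second uses flat $\mathbb{R}^3$ with $u = 1 + m/(2r)$, which is harmonic away from the origin, so the formula should return zero scalar curvature there.

```python
def conformal_scalar_curvature(n, u, laplacian_u, scalar_curvature):
    """Pointwise value of R(g_hat) for g_hat = u^(4/(n-2)) g, following (6.1.8)."""
    c = 4 * (n - 1) / (n - 2)
    return -c * u ** (-(n + 2) / (n - 2)) * (laplacian_u - scalar_curvature * u / c)


# Check 1: constant factor. Then g_hat = scale^(4/(n-2)) g, and scalar curvature
# scales inversely to the metric.
n, scale, R = 3, 2.0, 6.0
rescaled = conformal_scalar_curvature(n, scale, 0.0, R)
expected = scale ** (-4 / (n - 2)) * R
print(rescaled, expected)

# Check 2: flat R^3 (R(g) = 0) with u = 1 + m / (2 r), evaluated at radius r.
m, r = 1.0, 2.0
u = 1 + m / (2 * r)
du = -m / (2 * r ** 2)           # first radial derivative of u
d2u = m / r ** 3                 # second radial derivative of u
radial_laplacian = d2u + 2 * du / r
flat_case = conformal_scalar_curvature(3, u, radial_laplacian, 0.0)
print(radial_laplacian, flat_case)
```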
which we then conclude the eigenvalues are bounded from below. Moreover, we
see there is a variational characterization of $\lambda_1$, as the infimum of
\[
G_g(u) := \frac{E_g(u)}{\int_M u^2 \, dv_g}
\]
over all nontrivial $u \in H^1(M, g)$; the lower bound on $E_g(u)$ shows that this
infimum is finite. As such, we will argue that $\inf_{u \in H^1 \setminus \{0\}} G_g(u)$ is achieved by
a minimizer $u \in H^1(M, g)$, and then that this minimizer is a smooth eigenfunction
which does not change sign. Given this, the minimizer has eigenvalue
$\lambda_1 = \inf_{u \in H^1 \setminus \{0\}} G_g(u)$, and furthermore this eigenvalue is simple: if there were
two linearly independent eigenfunctions for an eigenvalue, we could arrange
them to be orthogonal in $L^2(dv_g)$, but we will have shown that minimizers
(first eigenfunctions) cannot change sign, so that two such functions cannot be
orthogonal.
That a minimum is achieved can be proven using similar tools as in the proof
above of the Hodge decomposition, based on elementary functional analysis,
together with the fundamental compactness result from the Rellich lemma: the
inclusion of $H^1(M, g) \hookrightarrow L^2(dv_g)$ is compact, which we recall means that
given an H 1 -bounded sequence, then there is a subsequence which converges
in L 2 . In addition, by Riesz representation for Hilbert spaces, together with the
Banach–Alaoglu theorem, we can choose the subsequence to converge weakly
in H 1 (M, g) as well (to the same limit).
With this in mind, consider an infimizing sequence $v_i \in H^1(M, g) \setminus \{0\}$ for $G_g$.
Since $G_g(u) = G_g(cu)$ for a constant $c \neq 0$, we may take $\int_M v_i^2 \, dv_g = 1$. The fact
that $G_g(v_i)$ approaches the finite infimum then implies that $\|v_i\|_{H^1}$ is a bounded
sequence. Thus we may assume, passing to a subsequence and reindexing, that $v_i$ converges
strongly in $L^2$ and weakly in $H^1$ to a function $u \in H^1(M, g)$. The limit $u$ is nontrivial
because $\int_M u^2 \, dv_g = 1$. That $G_g(u)$ is the minimum of $G_g$ comes from the fact
that the norm is weakly lower semicontinuous, i.e., $\|u\|_{H^1} \le \liminf_{i\to\infty} \|v_i\|_{H^1}$
(cf. Exercise 6-21). Together with the $L^2$-convergence of $v_i$ to $u$, we can now
conclude that $G_g(u) \le \liminf_{i\to\infty} G_g(v_i)$, so that $G_g(u)$ realizes the minimum.
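The last step can be spelled out as follows. This is a sketch under two assumptions not restated in this passage: that the energy has the form $E_g(w) = \int_M (|\nabla w|_g^2 + c_n R(g) w^2)\,dv_g$ with $c_n = \frac{n-2}{4(n-1)}$, and that $\|w\|_{H^1}^2 = \|w\|_{L^2}^2 + \|\nabla w\|_{L^2}^2$.

```latex
% gradient term: weak lower semicontinuity of the H^1 norm plus L^2 convergence
\int_M |\nabla u|_g^2\,dv_g
  = \|u\|_{H^1}^2 - \|u\|_{L^2}^2
  \le \liminf_{i\to\infty}\|v_i\|_{H^1}^2 - \lim_{i\to\infty}\|v_i\|_{L^2}^2
  = \liminf_{i\to\infty}\int_M |\nabla v_i|_g^2\,dv_g,
% curvature term: R(g) is bounded on closed M and v_i \to u in L^2
\int_M R(g)\,u^2\,dv_g = \lim_{i\to\infty}\int_M R(g)\,v_i^2\,dv_g,
% combine, using \int_M u^2\,dv_g = \int_M v_i^2\,dv_g = 1
G_g(u) = E_g(u) \le \liminf_{i\to\infty} E_g(v_i) = \liminf_{i\to\infty} G_g(v_i).
```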
As with our variational formulation of the Einstein equation, for example, we
can compute the Euler–Lagrange equation by setting
\[ \frac{d}{dt}\Big|_{t=0} G_g(u + tv) = 0, \]
for any $v \in H^1(M, g)$ (for small enough $|t|$, $\|u + tv\|_{L^2} \neq 0$ since $\|u\|_{L^2} = 1$).
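For orientation, the derivative can be computed explicitly. The sketch below assumes, as a labeled hypothesis, that $E_g(w) = \int_M (|\nabla w|_g^2 + c_n R(g) w^2)\,dv_g$ with $c_n = \frac{n-2}{4(n-1)}$, and writes $B_g(u,v)$ (a name introduced here) for the associated symmetric bilinear form.

```latex
% B_g(u,v) := \int_M \bigl(\langle\nabla u,\nabla v\rangle_g + c_n R(g)\,u v\bigr)\,dv_g,
% so that E_g(u + tv) = E_g(u) + 2t\,B_g(u,v) + t^2 E_g(v).
\frac{d}{dt}\Big|_{t=0} G_g(u+tv)
  = \frac{2B_g(u,v)\,\|u\|_{L^2}^2 - 2E_g(u)\,\langle u,v\rangle_{L^2}}{\|u\|_{L^2}^4}
  = 2\bigl(B_g(u,v) - \lambda_1\langle u,v\rangle_{L^2}\bigr),
% using \|u\|_{L^2} = 1 and E_g(u) = G_g(u) = \lambda_1. Setting this to zero for all v:
\int_M \bigl(\langle\nabla u,\nabla v\rangle_g + c_n R(g)\,u v\bigr)\,dv_g
  = \lambda_1\int_M u v\,dv_g ,
% i.e. u is a weak solution of L_g u = \Delta_g u - c_n R(g)\,u = -\lambda_1 u.
```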
192 6. S CALAR CURVATURE DEFORMATION AND THE E INSTEIN CONSTRAINT EQUATIONS
i.e.,
\[ L_g u = \theta^{-\frac{n+2}{n-2}} L_{\mathring g}(u\theta). \tag{6.1.9} \]
This identity extends to all smooth $u$ by a simple continuity argument about
points where $u = 0$. Next we see that if $u > 0$ and $\hat g = u^{\frac{4}{n-2}} g = (u\theta)^{\frac{4}{n-2}} \mathring g$, then
\[ \frac{4(n-1)}{n-2} E_g(u) = R(\hat g) = \frac{4(n-1)}{n-2} E_{\mathring g}(u\theta). \tag{6.1.10} \]
We are now ready to summarize the above into a key proposition.
Proposition 6-30. Suppose $M^n$ (n ≥ 3) is a closed connected manifold. Let C
be a conformal class of Riemannian metrics. For $g \in C$, let $\lambda_1(g)$ be the lowest
eigenvalue of the conformal Laplacian $L_g$. Then the sign of $\lambda_1(g)$ is the same
for all $g \in C$: it is positive (zero, negative) if and only if there is a metric $\hat g \in C$
for which $R(\hat g)$ is positive (zero, negative).
Proof. We let $u > 0$ be a first eigenfunction of $L_g$ for a metric $g \in C$. If $\lambda_1(g) = 0$,
and if $\mathring g \in C$, then by (6.1.9) or (6.1.10), we see $\lambda_1(\mathring g) \le 0$; similarly, if $\lambda_1(g) < 0$,
then $\lambda_1(\mathring g) < 0$. Hence, if $\lambda_1(g) = 0$, then $\lambda_1(\mathring g) = 0$ too. Thus we can also
conclude that if $\lambda_1(g) > 0$, then $\lambda_1(\mathring g) > 0$ as well.
We again let $u > 0$ be a first eigenfunction of $L_g$. It follows from
\[ R\bigl(u^{\frac{4}{n-2}} g\bigr) = -\frac{4(n-1)}{n-2}\, u^{-\frac{n+2}{n-2}} L_g u = \frac{4(n-1)}{n-2}\, \lambda_1(g)\, u^{-\frac{4}{n-2}} \]
that for any metric $g$, there is a conformally related metric $\hat g = u^{\frac{4}{n-2}} g$ with scalar curvature whose sign agrees with
A PRIMER ON ELLIPTIC PDE 193
that of $\lambda_1(g)$. We have only left to show that if $\mathring g$ is conformal to $g$, such that $R(\mathring g)$
has a definite sign (or is identically zero), then the sign is the same as that of
$\lambda_1(g)$. To see this, if $g = \theta^{\frac{4}{n-2}} \mathring g$, and if $u > 0$ is a first eigenfunction of $L_g$, then
we have
\[ \Delta_{\mathring g}(u\theta) - \frac{n-2}{4(n-1)}\, u\theta\, R(\mathring g) = L_{\mathring g}(u\theta) = \theta^{\frac{n+2}{n-2}} L_g u = -\lambda_1(g)\, \theta^{\frac{n+2}{n-2}} u. \]
If $\lambda_1(g) > 0$, then by considering a point $p$ where $u\theta > 0$ has a minimum value
on M (and hence $\Delta_{\mathring g}(u\theta)|_p \ge 0$), we see $R(\mathring g)|_p > 0$. Similar consideration
applies if $\lambda_1(g) < 0$. If $\lambda_1(g) = 0$, then $\Delta_{\mathring g}(u\theta) = \frac{n-2}{4(n-1)}\, u\theta\, R(\mathring g)$. By considering
points where $u\theta$ attains a maximum and minimum on M, respectively, we see
that since $R(\mathring g)$ has a definite sign, then $R(\mathring g)$ must be identically zero. □
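A simple consistency check of the proposition is the case of constant scalar curvature. The sketch below assumes the Rayleigh quotient has the form $G_g(u) = \int_M (|\nabla u|_g^2 + c_n R(g) u^2)\,dv_g \big/ \int_M u^2\,dv_g$ with $c_n = \frac{n-2}{4(n-1)}$; the constant $R_0$ is notation introduced here.

```latex
% If R(g) \equiv R_0 is constant on the closed connected manifold M, then
G_g(u) = \frac{\int_M |\nabla u|_g^2\,dv_g}{\int_M u^2\,dv_g} + c_n R_0 \;\ge\; c_n R_0 ,
% with equality exactly for constant u. Hence constants are first eigenfunctions:
L_g 1 = -c_n R_0 = -\lambda_1(g), \qquad \lambda_1(g) = \frac{n-2}{4(n-1)}\,R_0 .
% Round unit sphere S^n, R_0 = n(n-1):  \lambda_1 = \tfrac{n(n-2)}{4} > 0.
% Flat torus T^n, R_0 = 0:              \lambda_1 = 0.
```

In both examples the sign of $\lambda_1(g)$ is the sign of the (constant) scalar curvature, as the proposition requires.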
As a direct corollary, the set of Riemannian metrics on a closed and connected
manifold can be written as a disjoint union $Y^+ \cup Y^0 \cup Y^-$ of Yamabe classes,
where $Y^+$ is the set of all metrics which are conformally related to a metric with
positive scalar curvature, and analogously for $Y^0$ and $Y^-$. Each of these three
sets is a union of conformal classes. We note that $Y^+$ and $Y^0$ might in fact be
empty, whereas $Y^-$ is always nonempty (see [22, p. 123-4], for example). For
example, if $M = T^3$ is the three-torus, $Y^+$ is empty [198] (Theorem 6-56 below),
and if M is a compact hyperbolic manifold, then $Y^+ \cup Y^0$ is empty (see [139,
Corollary IV.5.6], for instance), and similarly $Y^+ \cup Y^0$ is empty for $T^3 \# T^3$ (see
comment in Section 6.3.1).
Before we move back to the constraint equations, we note a simple application
of the method of super- and subsolutions.
Theorem 6-31. Suppose $(M, \mathring g)$ is a closed Riemannian n-manifold (n ≥ 3). Let
C be the conformal class of $\mathring g$, and suppose that $\mathring g \in Y^-$. For any constant $\xi < 0$,
there is a metric $g \in C$ such that $R(g) = \xi$.
Proof. By the preceding proposition, we may assume without loss of generality
that $R(\mathring g) < 0$. By (6.1.8), we want to show that the following PDE has a
positive solution:
\[ \Delta_{\mathring g} \varphi = \frac{n-2}{4(n-1)}\, \varphi \Bigl( R(\mathring g) - \varphi^{\frac{4}{n-2}}\, \xi \Bigr) =: F(x, \varphi). \]
(following work of Yamabe, Aubin and Trudinger, cf. [143]) each conformal
class does contain a metric of constant scalar curvature.
In this section we discuss one method for producing solutions to the vacuum
Einstein constraint equations, the conformal method, which dates back to Lich-
nerowicz, Choquet-Bruhat, York and Ó Murchadha [50; 52; 145; 172; 228]. The
method has proved very useful in both theory and applications, and remains an
active area of research. We will not try to give an up-to-date account of the latest
developments or even an exhaustive list of references, but rather we will develop
some of the basic formulation and results of the method, and the interested reader
can find further results and references, such as [17; 64; 153; 154].
In terms of parametrizing the space of solutions (g, K ) to the Einstein con-
straint equations, the conformal method has proved successful in the constant
mean curvature (CMC) case, i.e., in case $\operatorname{tr}_g K$ is constant. The non-CMC case
is not completely understood, with some near-CMC results, and a number of
recent results which cast doubts on how successful the conformal method can be
for parametrizing the moduli space of solutions to the constraints. We will set
up the conformal method, and then study the CMC regime, following Isenberg’s
treatment [127], which unified a number of earlier results and completed the
analysis in the CMC case.
The basic idea is to prescribe part of the initial data (g, K ) as free data, and
solve for the other components. For instance, the Riemannian metric g is not
prescribed, but rather its conformal class is, and the method will involve solving
for the metric in the class. If $\mathring g$ is a metric in the conformal class, the method will
determine a function $\varphi > 0$ such that $g = \varphi^{\frac{4}{n-2}} \mathring g$ will be the desired metric. While
this part was simple enough to describe, it takes more finesse to understand how
to assemble the symmetric tensor K , part of which will be freely prescribed, with
the remainder to be determined by the constraints. The construction of K will be
motivated below with a discussion of the transverse-traceless (TT) decomposition
of symmetric tensors. To establish rigorously the TT decomposition and to solve
the equations that arise from the conformal formulation of the Einstein constraints,
we will draw on our above tour of elliptic PDE.
We wish to commute the covariant derivatives on the second term, for which we
use the Ricci formula.
Lemma 6-32 (Ricci formula). If X is a vector field on a semi-Riemannian
manifold (M, g), then
Therefore we have
where in the last line we used that S is trace-free, and in the previous line we
used symmetry. Applying this with $S = L_g W$ we obtain $\langle Z, P_g W \rangle_{L^2(dv_g)} = -\tfrac{1}{2} \langle L_g Z, L_g W \rangle_{L^2(dv_g)}$, which shows we can switch the roles of Z and W,
S OLVING THE CONSTRAINT EQUATIONS : THE CONFORMAL METHOD 197
Proof. Given that we solved (6.2.1), this gives the existence of the decomposition.
By the preceding corollary, the vector field W giving $\Psi_L = L_g W$ is determined
up to addition of a CKV, which plainly does not affect the term $\Psi_L$. □
We note that we could have done the analysis for a symmetric (2, 0)-tensor,
and the resulting TT decomposition is obtained from that of the corresponding
metrically equivalent (0, 2)-tensor. Furthermore, while we assumed M to be
closed, there are analogous results for certain noncompact manifolds, such as in
suitable weighted spaces on (M, g) asymptotically flat (see [39], for example).
6.2.2. The conformal data. We now define the conformal data on an n-manifold
(n ≥ 3). The essential idea is to prescribe certain parts of g and K , using the idea
of the decomposition we have just seen above, and to solve for the remaining
parts to constitute g and K satisfying the constraint equations. In fact we specify
the metric up to a conformal factor, prescribing a Riemannian metric g̊ such
that g will be in the conformal class of metrics C containing g̊. In light of the
decomposition of symmetric tensors we have discussed, we also prescribe a
symmetric TT (with respect to g̊) tensor σ along with a scalar function τ , as
giving part of the tensor K (up to conformal rescaling). We will specify K in
terms of this data and derive the corresponding form of the constraint equations,
but first we note some useful conformal identities.
6.2.2.1. Some conformal identities. We collect here some identities enjoyed by
metrics $g$ and $\mathring g$ conformally related by $g = \varphi^q \mathring g$, where $\varphi > 0$, with Levi-Civita
connections $\nabla$ and $\mathring\nabla$. The computations collected in this section are relatively
straightforward, with some care. We start with a simple but useful formula.
Exercise 6-37. Recall that the difference $T := \nabla - \mathring\nabla$ in two connections is
tensorial, and show that $T^k_{ij} = \frac{q}{2} \bigl( \delta^k_j (d \log \varphi)_i + \delta^k_i (d \log \varphi)_j - \mathring g_{ij} (\operatorname{grad}_{\mathring g} \log \varphi)^k \bigr)$.
Use this to show that if X is a vector field, then $\operatorname{div}_g(\varphi^{-qn/2} X) = \varphi^{-qn/2} \operatorname{div}_{\mathring g} X$.
Recall that conformal Killing vector (CKV) fields are those fields whose flow
preserves the conformal class of the given metric. Thus a vector field W is a
CKV for g if and only if it is a CKV for g̊. This can also be seen through the
identity
which generally would be called “W ” with indices down when only one metric
is under consideration. For instance, if we write
As we have seen (Proposition 6-27), we find it convenient to use $q = \frac{4}{n-2}$.
With this choice of q, the following corollary is immediate.
Corollary 6-41. Let $g = \varphi^q \mathring g$ with $q = \frac{4}{n-2}$. For any trace-free symmetric (0, 2)-
tensor $\sigma$ on $(M, \mathring g)$, we have the one-form identity $\operatorname{div}_g(\varphi^{-2} \sigma) = \varphi^{-2-q} \operatorname{div}_{\mathring g} \sigma$.
Thus if $\sigma$ is TT for $\mathring g$, then $\varphi^{-2} \sigma$ is TT for $g$.
Exercise 6-42. Prove Lemma 6-40. You might find Exercise 6-37 helpful.
There is an analogous formula for trace-free symmetric (2, 0)-tensors: if $\Xi$ is
a trace-free symmetric (2, 0)-tensor, then the following vector equation (indices
up) holds, again with $q = \frac{4}{n-2}$: $\operatorname{div}_g(\varphi^{-2-2q} \Xi) = \varphi^{-2-2q} \operatorname{div}_{\mathring g} \Xi$.
Exercise 6-43. Prove the preceding formula, based on the corresponding (0, 2)-
formula. You might let $S_{cd} = \Xi^{ab} g_{ac} g_{bd}$ and $\mathring S_{cd} = \Xi^{ab} \mathring g_{ac} \mathring g_{bd}$ be the corresponding
trace-free (0, 2)-tensors for $\Xi$. Note that $S = \varphi^{2q} \mathring S$, and (as either form or
vector identities) $\operatorname{div}_g \Xi = \operatorname{div}_g S$ and $\operatorname{div}_{\mathring g} \Xi = \operatorname{div}_{\mathring g} \mathring S$. Use the preceding lemma
to compute $\operatorname{div}_g(\varphi^{-2-2q} \Xi)$ (take care when raising and lowering indices).
6.2.2.2. The constraint equations. We will now determine the data $g$ and $K$
in terms of the conformal data on M. Fix $q = \frac{4}{n-2}$ for the rest of the chapter.
We specify a metric $\mathring g$ to be in the conformal class of $g$, a symmetric TT (for $\mathring g$)
tensor $\sigma$, along with a scalar function $\tau$. We can construe $\sigma$ to be (0, 2) or (2, 0),
depending on how we want to present K (indices up or down). Given this data,
construct the physical data $(g, K)$ as follows: for $\varphi > 0$ and W a vector field, let
\[ g = \varphi^q \mathring g \quad \text{and} \quad K = \varphi^{-2} (\sigma + L_{\mathring g} W) + \frac{\tau}{n} \varphi^q \mathring g, \]
where $g$ and $K$ are (0, 2)-tensors (indices down). Note that $\operatorname{tr}_g K = \tau$. If we
let the indices on $K$ and $\sigma$ be up, the corresponding formulation would be
$K^{cd} = \varphi^{-2-2q} (\sigma^{cd} + (L_{\mathring g} W)^{cd}) + \frac{\tau}{n} \varphi^{-q} \mathring g^{cd}$.
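The claim $\operatorname{tr}_g K = \tau$ and the index-up formula can be verified in one line each. The sketch assumes, as the later remark about the "trace-free part" suggests, that both $\sigma$ and $L_{\mathring g} W$ are trace-free with respect to $\mathring g$; on the right-hand side of the second computation, indices are raised with $\mathring g$.

```latex
% Trace: g^{ab} = \varphi^{-q}\,\mathring g^{ab}
\operatorname{tr}_g K
  = \varphi^{-q}\mathring g^{ab}\Bigl(\varphi^{-2}\bigl(\sigma_{ab} + (L_{\mathring g}W)_{ab}\bigr)
      + \frac{\tau}{n}\varphi^{q}\mathring g_{ab}\Bigr)
  = \varphi^{-q-2}\cdot 0 + \frac{\tau}{n}\,\mathring g^{ab}\mathring g_{ab}
  = \tau .
% Raising both indices with g costs a factor \varphi^{-2q}:
K^{cd} = g^{ca}g^{db}K_{ab}
  = \varphi^{-2q}\Bigl(\varphi^{-2}\bigl(\sigma^{cd} + (L_{\mathring g}W)^{cd}\bigr)
      + \frac{\tau}{n}\varphi^{q}\mathring g^{cd}\Bigr)
  = \varphi^{-2-2q}\bigl(\sigma^{cd} + (L_{\mathring g}W)^{cd}\bigr)
      + \frac{\tau}{n}\varphi^{-q}\mathring g^{cd}.
```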
We now want to evaluate the constraint map (with $\Lambda = 0$)
for this conformally constituted data. Note that $|K|_g^2 = \varphi^{-4-2q} |\sigma + L_{\mathring g} W|_{\mathring g}^2 + \frac{\tau^2}{n}$
(the trace-free part is pointwise orthogonal to the pure trace part), so that using
(6.1.8) yields
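\[ \Delta_{\mathring g} \varphi = \frac{1}{q(n-1)} R(\mathring g) \varphi - \frac{1}{q(n-1)} |\sigma|_{\mathring g}^2\, \varphi^{-q-3} + \frac{1}{qn} \tau^2 \varphi^{q+1}. \tag{6.2.10} \]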
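For concreteness, here is (6.2.10) in the physically most common dimension. This is only a substitution of $n = 3$, so that $q = 4$, $q(n-1) = 8$ and $qn = 12$.

```latex
% n = 3, q = 4:
\Delta_{\mathring g}\varphi
  = \tfrac{1}{8}\,R(\mathring g)\,\varphi
  - \tfrac{1}{8}\,|\sigma|_{\mathring g}^2\,\varphi^{-7}
  + \tfrac{1}{12}\,\tau^2\,\varphi^{5}.
```

The exponents $-7$ and $5$ are $-q-3$ and $q+1$, respectively.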
Remark 6-45. For the analysis it can be useful for R(g̊) to have a definite sign,
and to arrange this we will change to another metric in the conformal class.
Bearing this in mind, we point out that while Method B (Remark 6-44 and
Exercise 6-61) enjoys a natural conformal invariance, the Lichnerowicz equation
for the CMC case of Method A also enjoys a conformal invariance property:
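namely, given $(\mathring g, \sigma, \tau)$ with $\tau$ constant and $\sigma$ a (0, 2)-tensor, for any function
$\theta > 0$, the Lichnerowicz equation for $(\mathring g, \sigma, \tau)$ admits a solution $\varphi > 0$ if and
only if the Lichnerowicz equation for $(\theta^q \mathring g, \theta^{-2} \sigma, \tau)$ admits a solution $\varphi\theta^{-1} > 0$,
and these solutions lead to the same data $(g, K)$. Indeed, by Corollary 6-41, $\theta^{-2} \sigma$
is TT for $\theta^q \mathring g$, so that from (6.2.8) and (6.2.9), we see $g = \varphi^q \mathring g = (\varphi\theta^{-1})^q (\theta^q \mathring g)$
and $K = (\varphi\theta^{-1})^{-2} (\theta^{-2} \sigma) + \frac{\tau}{n} (\varphi\theta^{-1})^q (\theta^q \mathring g)$.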
             σ = 0, τ = 0    σ = 0, τ ≠ 0    σ ≠ 0, τ = 0    σ ≠ 0, τ ≠ 0
g̊ ∈ Y^+      no              no              yes             yes
g̊ ∈ Y^0      yes             no              no              yes
g̊ ∈ Y^−      no              yes             no              yes
Proof. We present a fairly complete proof, except for the uniqueness statement,
for which we refer you to [127; 151]. By Remark 6-45, we can move g̊ within a
conformal class without changing the required solvability. In particular, we may
assume without further comment that a representative g̊ of the conformal class
has been chosen so that R(g̊) has a definite sign (or vanishes identically). We
then seek a positive solution to the Lichnerowicz equation (6.2.10).
The nonexistence cases are readily shown. Consider the case $\sigma = 0$, so that
the equation reduces to $\Delta_{\mathring g} \varphi = \frac{1}{q(n-1)} R(\mathring g) \varphi + \frac{1}{qn} \tau^2 \varphi^{q+1}$. If $\tau \neq 0$, then since by
Stokes' theorem $\int_M \Delta_{\mathring g} \varphi \, dv_{\mathring g} = 0$ for M closed, we see there can be no positive
solution $\varphi$ if $R(\mathring g) \ge 0$. If $\tau = 0$ as well, we get $\Delta_{\mathring g} \varphi = \frac{1}{q(n-1)} R(\mathring g) \varphi$, from
which we again see there can be no positive solution $\varphi > 0$ when $R(\mathring g) > 0$ or
$R(\mathring g) < 0$. Similar reasoning yields the other two nonexistence cases for $\sigma \neq 0$,
$\tau = 0$ and $\mathring g \notin Y^+$.
There is one very easy existence case, when $\sigma = 0$, $\tau = 0$ and $\mathring g \in Y^0$, for
which we can assume $R(\mathring g) = 0$. Then the equation reduces to $\Delta_{\mathring g} \varphi = 0$, whose
solutions are constants.
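Consider the case $\mathring g \in Y^-$, $\sigma = 0$, $\tau \neq 0$. By (6.1.8) and Theorem 6-31, we
can solve
\[ -\frac{4(n-1)}{n-2}\, \varphi^{-\frac{n+2}{n-2}} L_{\mathring g} \varphi = R(\varphi^q \mathring g) = -\frac{n-1}{n} \tau^2 \]
for $\varphi > 0$. This equation can easily be rewritten $\Delta_{\mathring g} \varphi - \frac{1}{q(n-1)} R(\mathring g) \varphi = \frac{1}{qn} \tau^2 \varphi^{q+1}$,
which is the Lichnerowicz equation in this case.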
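The rewriting amounts to clearing the prefactor. In the sketch below we use $q = \frac{4}{n-2}$, so that $\frac{4(n-1)}{n-2} = q(n-1)$ and $\frac{n+2}{n-2} = q+1$, together with the form of the conformal Laplacian read off from (6.1.8).

```latex
% L_{\mathring g}\varphi = \Delta_{\mathring g}\varphi - \tfrac{1}{q(n-1)}R(\mathring g)\varphi
-q(n-1)\,\varphi^{-(q+1)}\Bigl(\Delta_{\mathring g}\varphi
      - \frac{1}{q(n-1)}R(\mathring g)\varphi\Bigr) = -\frac{n-1}{n}\,\tau^2
% multiply both sides by  -\varphi^{q+1} / (q(n-1)):
\;\Longrightarrow\;
\Delta_{\mathring g}\varphi - \frac{1}{q(n-1)}R(\mathring g)\varphi
  = \frac{1}{qn}\,\tau^2\,\varphi^{q+1}.
```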
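For the case $\mathring g \in Y^-$, $\sigma \neq 0$, $\tau \neq 0$, we follow [151]. We assume we have applied
Theorem 6-31 to choose a conformal representative $\mathring g$ so that $R(\mathring g) = -\frac{n-1}{n} \tau^2$.
Hence the Lichnerowicz equation becomes
\[ \Delta_{\mathring g} \varphi = -\frac{\tau^2}{qn} \varphi - \frac{1}{q(n-1)} |\sigma|_{\mathring g}^2\, \varphi^{-q-3} + \frac{1}{qn} \tau^2 \varphi^{q+1}. \tag{6.2.11} \]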
compare the equation we solved to (6.2.11) to see how the preliminary step added
a term that allowed a simple choice of super- and subsolutions.
For the three cases with $\lambda_1(\mathring g) \ge 0$ and $\sigma \neq 0$, we again proceed following
Maxwell [151]. We arrange by conformal invariance to work with $\mathring g$ in the
conformal class with $R(\mathring g) > 0$ or $R(\mathring g) = 0$, and note that $R(\mathring g) + \frac{n-1}{n} \tau^2 > 0$ for
these three cases. We now define an operator $S_{\mathring g}$ analogous to that above:
\[ S_{\mathring g} := \Delta_{\mathring g} - \frac{n-2}{4(n-1)} \Bigl( R(\mathring g) + \frac{n-1}{n} \tau^2 \Bigr) = L_{\mathring g} - \frac{n-2}{4n} \tau^2. \]
Again $S_{\mathring g}$ is linear, elliptic, self-adjoint, and has trivial kernel. Thus, by the
Fredholm alternative and elliptic regularity, we can solve the following with $u$
smooth:
\[ S_{\mathring g} u = -\frac{n-2}{4(n-1)} |\sigma|_{\mathring g}^2. \]
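The trivial-kernel claim follows from an integration by parts. The following sketch uses only the positivity $R(\mathring g) + \frac{n-1}{n}\tau^2 > 0$ noted above; the abbreviation $c_n = \frac{n-2}{4(n-1)}$ is introduced here.

```latex
% Suppose S_{\mathring g} w = 0 on the closed manifold M. Multiply by w and integrate:
0 = \int_M w\,S_{\mathring g}w\,dv_{\mathring g}
  = -\int_M |\nabla w|_{\mathring g}^2\,dv_{\mathring g}
    - c_n\int_M \Bigl(R(\mathring g) + \frac{n-1}{n}\tau^2\Bigr)w^2\,dv_{\mathring g}.
% Both integrals on the right are nonnegative, and the second is strictly
% positive unless w \equiv 0, since R(\mathring g) + \tfrac{n-1}{n}\tau^2 > 0. Hence w \equiv 0.
% The two forms of S_{\mathring g} agree because
c_n\cdot\frac{n-1}{n} = \frac{n-2}{4n} = \frac{1}{qn}.
```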
\[ = \frac{1}{q(n-1)} \Bigl( |\tilde\sigma|_{\tilde g}^2\, (u^{q+3} \varphi - \varphi^{-q-3}) + \frac{n-1}{n} \tau^2 \varphi (\varphi^q - u^{-q}) \Bigr) =: F(x, \varphi). \]
In this section, we discuss two results on the image of the scalar curvature map on
closed manifolds. We will develop in detail how to prescribe perturbations of the
scalar curvature function on a closed manifold M using inverse function theorem
methods, following Fischer and Marsden [89]. First we note the following
theorem of Kazdan and Warner [135; 134] on the range of the scalar curvature
map.
approach via the Dirac operator. For case (ii), an example is provided by $T^n \# T^n$,
on which, as with the torus, there is no metric of positive scalar curvature, and
so by Corollary 6-53, any scalar-flat metric would be Ricci flat; however, by
Bochner's theorem (see [182], for instance), a closed n-manifold with nonnegative
Ricci curvature has first Betti number at most n.
The analogous result in dimension two also holds [135; 134]; in this case, the
three possibilities are governed by the Euler characteristic χ(M), consistent with
the Gauss–Bonnet formula.
Theorem 6-49 (Kazdan–Warner). Suppose M is a closed connected surface.
There are three possibilities for the set S of scalar curvatures of smooth Riemannian
metrics on M. Indeed, a function $f \in C^\infty(M)$ is a scalar curvature of a
smooth Riemannian metric on M according to one of the following three cases:
(i) $f \in S$ if and only if $\{f > 0\} \neq \emptyset$, in case $\chi(M) > 0$.
(ii) $f \in S$ if and only if either $f$ changes sign or $f$ is identically zero, in case
$\chi(M) = 0$.
(iii) $f \in S$ if and only if $\{f < 0\} \neq \emptyset$, in case $\chi(M) < 0$.
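The necessity of the three conditions can be read off from the Gauss–Bonnet formula. The sketch below uses the standard facts that on a surface the scalar curvature is twice the Gauss curvature, $R = 2K$, and that $\int_M K\,dA = 2\pi\chi(M)$.

```latex
\int_M R(g)\,dA_g = 2\int_M K_g\,dA_g = 4\pi\,\chi(M).
% (i)   \chi(M) > 0:  the integral is positive, so R(g) > 0 somewhere.
% (ii)  \chi(M) = 0:  the integral vanishes, so R(g) changes sign or R(g) \equiv 0.
% (iii) \chi(M) < 0:  the integral is negative, so R(g) < 0 somewhere.
```

The substance of the theorem is the converse: every function satisfying the relevant sign condition is realized.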
J.-P. Bourguignon [136], which we will utilize in the discussion of positive scalar
curvature and the positive energy theorem.
Corollary 6-53. Suppose (M, g) is a closed connected Riemannian manifold
with nonnegative scalar curvature. Either Ric(g) = 0 or M admits a metric with
positive scalar curvature.
As we noted above, the torus $T^n$ does not admit a metric with positive scalar
curvature, so that any zero scalar curvature metric on $T^n$ must be Ricci-flat. It is
not hard to show from here that the metric must actually be flat; see, e.g., [182,
Chapters 7 and 9]. Cf. [91] and [139, Chapter IV, Section 5] for related results.
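cf. [2; 86; 107; 144], for more on Sobolev embedding). Now if $n > p > \frac{n}{2}$,
then $\frac{n}{n-p} > 2$. Thus we can choose $2 \le s_1 < \frac{n}{n-p}$, so that $1 < s_2 < \frac{n}{n-p}$, with
$s_2 = \frac{s_1}{s_1 - 1}$. Using Hölder's inequality, we obtain, for $u, v \in W^{1,p}(M)$,
\[ \int_M |u|^p |v|^p \, dv_{\mathring g} \le \Bigl( \int_M |u|^{s_1 p} \, dv_{\mathring g} \Bigr)^{1/s_1} \Bigl( \int_M |v|^{s_2 p} \, dv_{\mathring g} \Bigr)^{1/s_2}. \]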
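As an illustrative instance of this choice of exponents (the specific values are picked here for the example), take $n = 3$ and $p = 2$, so that $n > p > \frac{n}{2}$ and $\frac{n}{n-p} = 3$.

```latex
% n = 3, p = 2: choose s_1 = 2, hence s_2 = s_1/(s_1 - 1) = 2 < 3.
\int_M |u|^2 |v|^2\,dv_{\mathring g}
  \le \Bigl(\int_M |u|^{4}\,dv_{\mathring g}\Bigr)^{1/2}
      \Bigl(\int_M |v|^{4}\,dv_{\mathring g}\Bigr)^{1/2}
  = \|u\|_{L^4}^2\,\|v\|_{L^4}^2 .
% In general s_i p < \frac{np}{n-p}, the Sobolev exponent for W^{1,p};
% here W^{1,2}(M) \hookrightarrow L^{6}(M) \subset L^{4}(M) on the closed 3-manifold M.
```

Both factors on the right are therefore controlled by the $W^{1,2}$ norms of $u$ and $v$.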
where we commuted derivatives to cancel out the last two terms, either replac-
ing the derivatives with partial derivatives (up to ∼), or keeping the covariant
derivatives and using the Ricci identity $\alpha_{i;jk} - \alpha_{i;kj} = \alpha_\ell R^\ell{}_{jki}$ applied to $\alpha = df$;
in either case, the principal order term in $L_g L_g^*$ is $(n-1) \Delta_g^2$. We note that the
symbol of this operator is $(n-1) |\xi|_g^4$.
With this in mind, to prove that f is smooth, we write the left-hand side of
$R(g + L_g^* f) - R(g) = S$ as a quasilinear operator on $f$. Indeed, if we let $\gamma$ be
the metric $\gamma = g + L_g^* f$, then with a short computation we leave as an exercise,
the leading-order deviation of the volumes of small geodesic balls from their
Euclidean counterparts (see also [142, Theorem 1.16] for a comparison result on
the conjugate radius), its influence on the geometry, and certainly the topology,
seems less direct. Thus it was a breakthrough when Schoen and Yau established
a celebrated fundamental group obstruction to positive scalar curvature [198],
which we will turn to presently. At the time of their work, the only known
topological restrictions on positive scalar curvature involved harmonic spinors,
from the Dirac operator on spin manifolds, dating from works of Lichnerowicz
and Hitchin (as discussed in [139], where one can also explore Gromov and
Lawson’s development of this line of research). Of interest for us here will be
the connection made by Schoen and Yau in relating the positivity of the mass
of an asymptotically flat metric of nonnegative scalar curvature to topological
obstructions to positive scalar curvature (PSC), a topic we will explore in the
next chapter. For now we state one of the main theorems from [198].
Theorem 6-56 (Schoen–Yau). Suppose M is a closed, connected, orientable
three-manifold. Suppose furthermore that either π1 (M) contains a finitely-
generated non-cyclic abelian subgroup, or that π1 (M) contains a subgroup
abstractly isomorphic to the fundamental group of a closed orientable surface of
positive genus. Then M admits no metric with positive scalar curvature, and any
metric on M having nonnegative scalar curvature is flat.
We discuss some elements of the proof. If $\Sigma$ is a smooth, closed two-sided
minimally immersed hypersurface in $(M, g)$ with unit normal $\nu$ along the immersion,
and second fundamental form $A = A_\nu$, then by (X.14) on p. 224, the
second variation of area for variation field $V = \phi\nu$ is
\[ -\int_\Sigma \phi\, L_\Sigma \phi \, d\sigma = \int_\Sigma \Bigl( |\nabla^\Sigma \phi|^2 - (|A|^2 + \operatorname{Ric}_g(\nu, \nu)) \phi^2 \Bigr) d\sigma, \tag{6.3.3} \]
on $\Sigma$. One of the key components of the proof of Theorem 6-56 is the following
beautiful observation [198], cf. [106], using the stability inequality.
Proposition 6-57. Let $(M, g)$ be a closed, oriented Riemannian three-manifold
with positive scalar curvature. Then $(M, g)$ admits no stable minimal immersion
from a closed orientable surface $\Sigma$ with positive genus.
Proof. Suppose $\Sigma$ is a closed oriented surface, with a stable minimal immersion
to M. Choose a local orthonormal frame $\{E_1, E_2, E_3\}$ adapted to $\Sigma$, with $E_1$
S CALAR CURVATURE DEFORMATION ON CLOSED MANIFOLDS 213
energy whose action on $\pi_1(T^2)$ is the same as the original (and in particular,
then, the map is nontrivial). The energy-minimizer can be shown to be a stable
minimal immersion (without branch points). We can now apply the preceding
proposition. See [198] for details. By (6.3.4), one may allow $R(g) = 0$, for
example in case $(T^3, g)$ were the flat torus. By scalar curvature deformation
(Theorem 6-52 and Corollary 6-53, cf. [198]), $g$ must be Ricci-flat (hence flat in
dimension three), else the scalar curvature may be made positive.
Exercises
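a. Find the components $g^{ij}$ explicitly, and use this to show that the mean curvature
$H = H(u) = g^{ij} K_{ij}$ satisfies
\[ (1 + |\operatorname{grad} u|^2)^{3/2} H(u) = (1 + |\operatorname{grad} u|^2) \Delta u - \sum_{i,j=1}^{n} \frac{\partial u}{\partial x^i} \frac{\partial u}{\partial x^j} \frac{\partial^2 u}{\partial x^i \partial x^j}. \]
(Hint: Let $G = (g_{ij})$ and $B = (\operatorname{grad} u)(\operatorname{grad} u)^T$, where $\operatorname{grad} u$ is a column vector
(which may be zero). Find $GB$, and noting $G = I + B$, infer $G^{-1}$ from here.)
b. Show that
\[ H = \operatorname{div} \Bigl( \frac{\operatorname{grad} u}{\sqrt{1 + |\operatorname{grad} u|^2}} \Bigr), \]
where div is the Euclidean divergence in $\mathbb{R}^n$, by comparing to the result of the
calculation in part a.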
c. Show how the form of the minimal surface equation $H = 0$ using part b.
follows naturally from the variational characterization of minimal surfaces as
critical points of the area functional, using the fact that the area $A(u)$ of the graph of
$u$ is $A(u) = \int_\Omega \sqrt{1 + |\operatorname{grad} u|^2} \, dx$. (If the area is infinite, you can still consider
the variation, via the regularization expressed schematically as $A(u + tv) - A(u)$
for some $v$ compactly supported in $\Omega$.)
d. Show that the mean curvature operator $u \mapsto H = H(u)$ is elliptic, or equivalently,
$u \mapsto M(u) = (1 + |\operatorname{grad} u|^2)^{3/2} H(u)$ is elliptic.
Exercise 6-59. Suppose (M n , g) is Riemannian.
a. Using the form (2-8a)–(2-8b) for the curvature tensor in coordinates, show
the symbol of the linearization P = DRicg is given by (note that the symbol as
given in some references differs from this up to sign)
b. Suppose $\xi \in T_p^* M \setminus \{0\}$, with $\sigma(\xi)(h) = 0$; this still holds if we rescale so that
$|\xi|_g = 1$. By using an orthonormal frame for $T_p M$ with the first vector $\xi^\sharp$, show
that $h$ vanishes on $(\xi^\sharp)^\perp \times (\xi^\sharp)^\perp$. Then show that there is a form $\eta \in T_p^* M$ such
that $h = \xi \otimes \eta + \eta \otimes \xi$. We know from Example 6-11 that indeed such $h$ are in
the kernel of $\sigma(\xi)$.
c. Adjust the preceding argument for the situation where g is a Lorentzian metric
on M; you might break up into cases depending on the causal character of ξ .
Exercise 6-60 (elliptic estimates and regularity). Suppose $\Omega \subset \mathbb{R}^n$ is open. Let
$0 \le \varphi \le 1$ be a smooth function supported on the unit ball in $\mathbb{R}^n$, satisfying
$\int_{\mathbb{R}^n} \varphi(x) \, dx = 1$, and for any $\sigma > 0$, let $\varphi_\sigma(x) = \sigma^{-n} \varphi\bigl(\frac{x}{\sigma}\bigr)$. If $u \in L^p_{\mathrm{loc}}$, and if we
define the convolution $u_\sigma = \varphi_\sigma * u$, then $u_\sigma$ is smooth, and $u_\sigma$ converges to $u$ as
$\sigma \searrow 0$ almost everywhere and in $L^p_{\mathrm{loc}}$, while for $u \in C^k(\Omega)$ with $k$ a nonnegative
integer, $u_\sigma$ converges to $u$ in $C^k_{\mathrm{loc}}(\Omega)$ [86]. Suppose L is linear elliptic of order
m, and the domain is $\Omega$ in the first two parts of the exercise.
a. Suppose the coefficients of L are constant, and that for some $p > 1$, $u \in L^p_{\mathrm{loc}}$
weakly solves $Lu = f \in L^p_{\mathrm{loc}}$. Consider a ball $B_1$ compactly contained in a ball
$B_2$, compactly contained in $\Omega$. Employ the interior elliptic estimate between $B_1$
and $B_2$ to show that the functions $u_\sigma$ form a bounded set in $W^{m,p}(B_1)$, and then
conclude that $u \in W^{m,p}_{\mathrm{loc}}$. If instead $f \in L^q_{\mathrm{loc}}$ for some $q > 1$, prove that $u \in W^{m,q}_{\mathrm{loc}}$
(using the above argument and Sobolev embedding to handle the case $q > p$).
Finally, if moreover $f \in C^{0,\alpha}_{\mathrm{loc}}$ for some $0 < \alpha < 1$, argue that $u \in C^{m,\alpha}_{\mathrm{loc}}$. To do
this, you might use the preceding to first show $u$ is continuous (in fact, you can
get $u \in C^{m-1,\alpha}_{\mathrm{loc}}$); argue that $f_\sigma$ is uniformly bounded in $C^{0,\alpha}$ on any compact
subset, and apply the interior Schauder estimate.
b. You may assume the Schauder estimate (6.1.3) holds for L with $C^{0,\alpha}$ coefficients.
Suppose for some $0 < \alpha < 1$, $u \in C^{m,\alpha}_{\mathrm{loc}}$ solves $Lu = f$. Suppose that for
some $k \in \mathbb{Z}_+$, the coefficients of L as well as the function $f$ are in $C^{k,\alpha}_{\mathrm{loc}}$. Show
that $u \in C^{m+k,\alpha}_{\mathrm{loc}}$. To do this for $k = 1$, bound $L(\Delta_i^h u)$ in $C^{0,\alpha}(B)$ for a ball B
compactly contained in $\Omega$. Use interior estimates to show that for any $B'$ compactly
contained in B, and for some $\delta > 0$, the set $\{\partial_x^\beta \Delta_i^h u : 0 < |h| < \delta, |\beta| \le m\}$
is bounded and equicontinuous on $B'$. From here, conclude $u \in C^{m+1,\alpha}_{\mathrm{loc}}$.
c. Suppose L is linear elliptic of order m with smooth coefficients on a closed
Riemannian manifold (M, g). The Fredholm alternative/Hodge decomposition
was established earlier in the L 2 and smooth cases. Prove the decomposition
$W^{k,p}(M) = L(W^{m+k,p}(M)) \oplus \ker L^*$ for $1 < p < \infty$ and $k$ a nonnegative
integer, as follows. Let ${}^\perp K := \{ f \in L^p(M) : \int_M f w \, dv_g = 0 \text{ for all } w \in K \}$,
where $K = \ker L^*$. Prove that the subspace $({}^\perp K) \cap C^\infty(M)$ is dense in the closed
subspace $({}^\perp K) \cap W^{k,p}(M)$ of $W^{k,p}(M)$. Apply an approximation argument, the
$C^\infty(M)$-splitting, and elliptic estimates to show that for $f \in ({}^\perp K) \cap W^{k,p}(M)$,
there is a solution $h \in W^{m+k,p}(M)$ of $Lh = f$; conclude that any distributional
solution $w$ of $Lw = f$ must lie in $W^{m+k,p}(M)$. By a similar method, show that
for $f \in ({}^\perp K) \cap C^{k,\alpha}(M)$, with $k$ a nonnegative integer and $0 < \alpha < 1$, there is a
solution $h \in C^{m+k,\alpha}(M)$ of $Lh = f$; conclude that any distributional solution $w$
of $Lw = f$ must lie in $C^{m+k,\alpha}(M)$. (Hint for the last part: show that there is a
sequence $f_j \in ({}^\perp K) \cap C^\infty(M)$ uniformly bounded in $C^{k,\alpha}(M)$ and converging
to $f$ in $C^k(M)$.)
(Note: Part a. holds more generally. Suppose the coefficients of L were merely
(sufficiently) smooth. Then while $L u_\sigma - f_\sigma$ goes to zero distributionally, one
needs an estimate in $L^p_{\mathrm{loc}}$ to apply the above argument, and in fact, K. O. Friedrichs
[98, (3.8)] showed that the limit is zero in $L^p_{\mathrm{loc}}$. For the Hölder case, to go from
$u \in C^{m-1,\alpha}_{\mathrm{loc}}$ to $u \in C^{m,\alpha}_{\mathrm{loc}}$, one could localize and compactify as in Remark 6-20,
and then apply part c.)
enjoys a natural conformal invariance in the CMC case, cf. Remark 6-45. There
is another splitting for the conformal method which enjoys a natural conformal
invariance more generally, and we introduce this method, Method B, now, cf.
[17; 65; 172]. In this method, the freely prescribed TT tensor gives, up to
conformal factor, the TT part of K , and the vector field W is used to generate
the longitudinal part of K itself.
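We start with a Riemannian metric $\mathring g$ on $M^n$, a symmetric TT tensor $\sigma$, and a
function $\tau$. We let $q = \frac{4}{n-2}$.
a. For $\varphi > 0$, we let $K_{ab} = \varphi^{-2} \sigma_{ab} + \varphi^q (L_{\mathring g} W)_{ab} + \frac{\tau}{n} \varphi^q \mathring g_{ab}$. This last term is
pure trace. Observe that the first two summands are TT and longitudinal for
$g = \varphi^q \mathring g$, respectively. Show that the vacuum constraints $\Phi(g, K) = 0$ can be
written
\[ \begin{aligned} \Delta_{\mathring g} \varphi &= \frac{1}{q(n-1)} R(\mathring g) \varphi + \frac{1}{qn} \tau^2 \varphi^{1+q} - \frac{1}{q(n-1)} |\sigma|_{\mathring g}^2\, \varphi^{-3-q} \\ &\quad - \frac{1}{q(n-1)} |L_{\mathring g} W|_{\mathring g}^2\, \varphi^{1+q} - \frac{2}{q(n-1)} \langle L_{\mathring g} W, \sigma \rangle_{\mathring g}\, \varphi^{-1}, \\ (\operatorname{div}_{\mathring g}(L_{\mathring g} W))_a &= \frac{n-1}{n}\, d\tau_a - (q+2) (L_{\mathring g} W)^{bc} (d \log \varphi)_c\, \mathring g_{ab}. \end{aligned} \]
b. Show that this method enjoys a natural conformal invariance: for $\theta > 0$,
$(\mathring g, \sigma, \tau)$ admits a solution $\varphi > 0$ and W to the above system if and only if
$(\theta^q \mathring g, \theta^{-2} \sigma, \tau)$ admits a solution $\varphi_\theta = \varphi\theta^{-1} > 0$ and $W_\theta = W$ to the corresponding
system.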
Exercise 6-62. Suppose that, in the setting of Proposition 6-57, we instead have
$R(g) \ge 0$. As remarked after the proof, in case there is a stable minimal immersion
from a torus $\Sigma$, then $\Sigma$ is also totally geodesic, and $R(g) = 0$ along $\Sigma$ as well, by
(6.3.4). Show that more is true: prove that $K_\Sigma = 0$, and conclude $\operatorname{Ric}_g(\nu, \nu) = 0$
along $\Sigma$. (Hint: Following [92], consider the functional $\int_\Sigma (|\nabla^\Sigma \phi|^2 + K_\Sigma \phi^2) \, d\sigma$.
220 E XCURSUS : F IRST AND SECOND VARIATION OF AREA
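\[ g^{ij} L(E_i, E_j) = g^{ij} M_i^m M_j^\ell L(\hat E_m, \hat E_\ell) = \hat g^{m\ell} L(\hat E_m, \hat E_\ell). \qquad \square \]
As a corollary, for any vector field W defined along $\Sigma$, we can define $\operatorname{div}_\Sigma W := g^{ij} \langle \nabla_{E_i} W, E_j \rangle$,
corresponding to the (0, 2)-tensor $L(X, Y) = \langle \nabla_X W, Y \rangle$ defined
along $\Sigma$. We note that if W is tangent to $\Sigma$, then $\operatorname{div}_\Sigma W = g^{ij} \langle \nabla^\Sigma_{E_i} W, E_j \rangle = \operatorname{div}_g W$
is the usual divergence of a tangential vector field. For another example,
if $\Sigma$ is a hypersurface with unit normal field $\nu$ and respective mean curvature H,
then
\[ \operatorname{div}_\Sigma \nu = g^{ij} \langle \nabla_{E_i} \nu, E_j \rangle = -g^{ij} \langle \mathrm{II}(E_i, E_j), \nu \rangle = -H, \]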
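First variation. We compute the variation of the area element, written in local
coordinates on $\Sigma$ as $\sqrt{|\det(g_{ij}(t))|} \, dx = \sqrt{|\det g(t)|} \, dx$, as we have done in
(2.3.10):
\[ \frac{d}{dt} \sqrt{|\det g(t)|} = \tfrac{1}{2} g^{ij} \frac{d}{dt} \Bigl\langle \frac{\partial F}{\partial x^i}, \frac{\partial F}{\partial x^j} \Bigr\rangle \sqrt{|\det g(t)|}, \]
with
\[ \begin{aligned} \frac{d}{dt} \Bigl\langle \frac{\partial F}{\partial x^i}, \frac{\partial F}{\partial x^j} \Bigr\rangle &= \Bigl\langle \frac{D}{\partial t} \frac{\partial F}{\partial x^i}, \frac{\partial F}{\partial x^j} \Bigr\rangle + \Bigl\langle \frac{\partial F}{\partial x^i}, \frac{D}{\partial t} \frac{\partial F}{\partial x^j} \Bigr\rangle \\ &= \Bigl\langle \frac{D}{\partial x^i} \frac{\partial F}{\partial t}, \frac{\partial F}{\partial x^j} \Bigr\rangle + \Bigl\langle \frac{\partial F}{\partial x^i}, \frac{D}{\partial x^j} \frac{\partial F}{\partial t} \Bigr\rangle, \end{aligned} \]
where we used (2.3.8). By symmetry we conclude that
\[ \frac{d}{dt} \sqrt{|\det g(t)|} = g^{ij} \Bigl\langle \frac{D}{\partial x^i} \frac{\partial F}{\partial t}, \frac{\partial F}{\partial x^j} \Bigr\rangle \sqrt{|\det g(t)|}. \tag{X.1} \]
We evaluate this at $t = 0$, where $V := \frac{\partial F}{\partial t} \big|_{t=0}$ is the variation field. Since
$F(0, \cdot) = f$ is an immersion, we can write at $t = 0$
\[ \frac{d}{dt} \Big|_{t=0} \sqrt{|\det g(t)|} = g^{ij} \langle \nabla_{E_i} V, E_j \rangle \sqrt{|\det g(0)|}. \]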
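Along $\Sigma$ we divide up the variation field V into tangential and normal components,
$V = V^T + V^N$. Then we have
\[ \begin{aligned} g^{ij} \langle \nabla_{E_i} V, E_j \rangle &= g^{ij} \langle \nabla_{E_i} (V^T), E_j \rangle + g^{ij} \langle \nabla_{E_i} (V^N), E_j \rangle \\ &= g^{ij} \langle \nabla^\Sigma_{E_i} (V^T), E_j \rangle - g^{ij} \langle V^N, \nabla_{E_i} E_j \rangle \\ &= \operatorname{div}_\Sigma (V^T) - g^{ij} \langle V^N, \mathrm{II}(E_i, E_j) \rangle \\ &= \operatorname{div}_\Sigma (V^T) - \langle V, H \rangle, \end{aligned} \tag{X.2} \]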
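where $d\xi_g$ is the induced area element of $\partial\Sigma$ and $\eta$ is the outward-pointing unit
conormal vector along $\partial\Sigma$, the tangent vector to $\Sigma$ which is a unit outward-pointing
normal to $\partial\Sigma$. Note that the boundary term in (X.3) will pick up
changes to the area obtained by pushing the boundary along $\Sigma$. Since each $f_t$ is
an immersion, the derivation above can be applied at any t-value, and yields
\[ \begin{aligned} A'(t) &= \int_\Sigma \Bigl( \operatorname{div}_\Sigma \Bigl( \frac{\partial F}{\partial t} \Bigr)^T - \Bigl\langle \frac{\partial F}{\partial t}, H_{\Sigma_t} \Bigr\rangle \Bigr) d\sigma_{g(t)} \\ &= -\int_\Sigma \Bigl\langle \frac{\partial F}{\partial t}, H_{\Sigma_t} \Bigr\rangle d\sigma_{g(t)} + \int_{\partial\Sigma} \Bigl\langle \Bigl( \frac{\partial F}{\partial t} \Bigr)^T, \eta_t \Bigr\rangle d\xi_{g(t)}, \end{aligned} \tag{X.4} \]
where $H_{\Sigma_t}$ is the mean curvature of the immersion $f_t$, and $\eta_t$ is the corresponding
conormal.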
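A round sphere gives a quick check of (X.4), including its sign. The sketch assumes the convention stated earlier in this excursus that $\operatorname{div}_\Sigma \nu = -\langle H, \nu \rangle$ for a unit normal $\nu$; the radius $r$, the dimension $k$, and the constant $\omega_k$ (the area of the unit $k$-sphere) are notation introduced here.

```latex
% \Sigma = sphere of radius r in \mathbb{R}^{k+1}, outward unit normal \nu(x) = x/r,
% variation F(t,x) = (1 + t/r)\,x, so V = \nu and V^T = 0; \partial\Sigma = \emptyset.
\nabla_{E_i}\nu = \tfrac{1}{r}E_i
  \;\Longrightarrow\;
  \operatorname{div}_\Sigma \nu = \tfrac{k}{r}
  \;\Longrightarrow\;
  \langle V, H\rangle = \langle \nu, H\rangle = -\tfrac{k}{r}.
% First variation formula (X.4) at t = 0:
A'(0) = -\int_\Sigma \langle V, H\rangle\,d\sigma_g = \tfrac{k}{r}\,A(0).
% Direct computation:
A(t) = \omega_k\,(r+t)^k, \qquad A'(0) = k\,\omega_k\,r^{k-1} = \tfrac{k}{r}\,A(0).
```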
By (X.3), we see that the first variation of area vanishes for tangential variations
that vanish along the boundary. This also follows from the following exercise,
which shows why one sometimes restricts to normal variations.
Exercise X-2. a. Suppose W is a compactly supported tangent vector field to $\Sigma$,
which vanishes along $\partial\Sigma$ (if nonempty). Show that W is the variation field for a
variation $\Phi : I \times \Sigma \to \Sigma$, where each $\varphi_t := \Phi(t, \cdot) : \Sigma \to \Sigma$ is a diffeomorphism.
b. Consider a variation of $\Sigma$ with area function $A(t)$ and variation field V which
is compactly supported on $\Sigma$ with $V^T = 0$ along $\partial\Sigma$ (if nonempty). Show that
there is a variation of $\Sigma$ with the same area function $A(t)$ and such that the
variation field is $V^N$.
Second variation. For the second variation we take the derivative of (X.1),
obtaining
d2 p
|det g(t)|
dt 2 E2
∂gmℓ ℓj D ∂ F ∂ F D ∂F ∂F
D E D
|det g(t)| −g im , + gi j ,
p
= g
∂t ∂ x i ∂t ∂ x j ∂ x i ∂t ∂ x j
E
D D ∂F ∂F D ∂F D ∂F
D E D
ij
+g , + , . (X.5)
∂t ∂ x i ∂t ∂ x j ∂ x i ∂t ∂t ∂ x j
We evaluate the three terms inside the bracket at t = 0. The second is just
D ∂F ∂F 2 2
D E
gi j , = div6 (V T ) − ⟨V, H⟩ . (X.6)
i ∂ x ∂t
j ∂x t=0
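For the term on the second line of (X.5) we apply (2.3.8) and (2.3.9):
\[
\begin{aligned}
& g^{ij}\Big( \Big\langle \frac{D}{\partial t}\frac{D}{\partial x^i}\frac{\partial F}{\partial t}, \frac{\partial F}{\partial x^j}\Big\rangle
+ \Big\langle \frac{D}{\partial x^i}\frac{\partial F}{\partial t}, \frac{D}{\partial t}\frac{\partial F}{\partial x^j}\Big\rangle \Big) \\
&\quad = g^{ij}\Big( \Big\langle \frac{D}{\partial x^i}\frac{D}{\partial t}\frac{\partial F}{\partial t}, \frac{\partial F}{\partial x^j}\Big\rangle
+ \Big\langle R\Big(\frac{\partial F}{\partial t}, \frac{\partial F}{\partial x^i}, \frac{\partial F}{\partial t}\Big), \frac{\partial F}{\partial x^j}\Big\rangle
+ \Big\langle \frac{D}{\partial x^i}\frac{\partial F}{\partial t}, \frac{D}{\partial x^j}\frac{\partial F}{\partial t}\Big\rangle \Big).
\end{aligned}
\]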
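Evaluation at t = 0 yields
\[
\operatorname{div}_\Sigma\Big(\frac{DV}{\partial t}\Big) - g^{ij}\langle R(E_i, V, V), E_j\rangle
+ g^{ij}\Big\langle \frac{DV}{\partial x^i}, \frac{DV}{\partial x^j}\Big\rangle.
\tag{X.7}
\]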
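Just as in (X.2) above, we can write
\[
\operatorname{div}_\Sigma\Big(\frac{DV}{\partial t}\Big)
= \operatorname{div}_\Sigma\Big(\Big(\frac{DV}{\partial t}\Big)^{T}\Big) + \operatorname{div}_\Sigma\Big(\Big(\frac{DV}{\partial t}\Big)^{N}\Big)
= \operatorname{div}_\Sigma\Big(\Big(\frac{DV}{\partial t}\Big)^{T}\Big) - \Big\langle \frac{DV}{dt}, H\Big\rangle.
\tag{X.8}
\]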
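As for the first term in brackets in (X.5), we get that it equals the negative of
\[
g^{im}\Big( \Big\langle \frac{D}{\partial x^m}\frac{\partial F}{\partial t}, \frac{\partial F}{\partial x^\ell}\Big\rangle
+ \Big\langle \frac{\partial F}{\partial x^m}, \frac{D}{\partial x^\ell}\frac{\partial F}{\partial t}\Big\rangle \Big)\,
g^{\ell j}\Big\langle \frac{D}{\partial x^i}\frac{\partial F}{\partial t}, \frac{\partial F}{\partial x^j}\Big\rangle,
\]
which at t = 0 is
\[
g^{im}\Big( \Big\langle \frac{DV}{\partial x^m}, E_\ell\Big\rangle + \Big\langle E_m, \frac{DV}{\partial x^\ell}\Big\rangle \Big)\,
g^{\ell j}\Big\langle \frac{DV}{\partial x^i}, E_j\Big\rangle.
\tag{X.9}
\]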
We can simplify the expressions in (X.7) and (X.9) when the variation field is
normal to Σ. To facilitate this, we introduce some terminology. The normal Ricci
curvature R is a linear operator on each normal space N_pΣ, defined for vectors
W normal to Σ by R(W) = g^{ij} R(W, E_i, E_j)^N, where {E_1, ..., E_k} is a basis
for T_pΣ. This is well-defined by Lemma X-1. Note that by symmetry-by-pairs,
if W is normal to Σ, ⟨R(W), W⟩ = g^{ij}⟨R(E_i, W, W), E_j⟩.
For each vector W normal to Σ, we define a scalar-valued (0, 2)-tensor A_W
on Σ by
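\[
g^{im}\big( \langle \nabla_{E_m} V, E_\ell\rangle + \langle E_m, \nabla_{E_\ell} V\rangle \big)\, g^{\ell j}\langle \nabla_{E_i} V, E_j\rangle
= 2\langle A_V, A_V\rangle_g.
\tag{X.11}
\]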
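Applying (X.11), (X.10) and (X.6) (with V^T = 0) to (X.5), we obtain for normal
variations V
\[
\begin{aligned}
A''(0) &= \int_\Sigma \Big[ -2\langle A_V, A_V\rangle_g + \operatorname{div}_\Sigma\Big(\Big(\frac{DV}{\partial t}\Big)^{T}\Big) - \Big\langle \frac{DV}{dt}, H\Big\rangle \\
&\qquad\quad - \langle R(V), V\rangle + \langle A_V, A_V\rangle_g + \langle \nabla^N V, \nabla^N V\rangle_g + \langle V, H\rangle^2 \Big]\, d\sigma_g \\
&= \int_\Sigma \big[ \langle \nabla^N V, \nabla^N V\rangle_g - \langle A_V, A_V\rangle_g - \langle R(V), V\rangle \big]\, d\sigma_g \\
&\quad - \int_\Sigma \Big[ \Big\langle \frac{DV}{dt}, H\Big\rangle - \langle V, H\rangle^2 \Big]\, d\sigma_g
+ \int_{\partial\Sigma} \Big\langle \frac{DV}{\partial t}, \eta\Big\rangle\, d\xi_g.
\end{aligned}
\tag{X.12}
\]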
We have not yet restricted the signatures of ḡ and g, and thus the first two
terms above do not have a (semi)-definite sign. In case g is Riemannian, then
|A_V|²_g := ⟨A_V, A_V⟩_g ≥ 0. If ḡ is Lorentzian and Σ is a spacelike hypersurface,
then ⟨∇^N V, ∇^N V⟩_g = g^{ij}⟨∇^N_{E_i} V, ∇^N_{E_j} V⟩ ≤ 0.
In the case (Σ, g) is a Riemannian hypersurface, we consider a smooth unit
normal field ν to Σ; either we assume Σ ⊂ M is two-sided and ν is defined
globally on Σ, or ν is a local unit normal field. We let A = A_ν, which is just
the scalar-valued second fundamental form K with respect to ν. Consider a
normal variation V = ϕν, where ϕ is a smooth function of compact support
on Σ, vanishing on the boundary. In this case, ⟨R(V), V⟩ = ϕ² Ric_ḡ(ν, ν),
|A_V|²_g = ϕ²|A|²_g, and ⟨∇^N V, ∇^N V⟩_g = ⟨ν, ν⟩|∇^Σ ϕ|²_g, so that we have
\[
A''(0) = \int_\Sigma \big( \langle \nu, \nu\rangle\, |\nabla^\Sigma \phi|_g^2 - \phi^2 |A|_g^2 - \phi^2\, \mathrm{Ric}_{\bar g}(\nu, \nu) \big)\, d\sigma_g.
\tag{X.14}
\]
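As a quick numerical sanity check of (X.14) in the simplest Riemannian setting, one can take Σ to be the equator of the unit round sphere S² (a closed geodesic, so |A|² = 0 and Ric(ν, ν) = 1) and vary it through latitude circles, which corresponds to ϕ ≡ 1. The following Python sketch is only an illustration under these assumptions; the function names are hypothetical and not taken from the text.

```python
import math

def latitude_length(t):
    """Length of the latitude circle at height angle t on the unit sphere."""
    return 2.0 * math.pi * math.cos(t)

def second_difference(f, t, h=1e-4):
    """Central finite-difference approximation of f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

def second_variation_equator(phi=1.0, dphi=0.0):
    """Right-hand side of (X.14) for the equator with constant phi.

    Assumptions made for this example: <nu, nu> = 1, |A|^2 = 0 (the equator
    is a geodesic), Ric(nu, nu) = 1 (unit round 2-sphere), length 2*pi.
    """
    a_squared = 0.0
    ricci_nu_nu = 1.0
    integrand = dphi ** 2 - phi ** 2 * a_squared - phi ** 2 * ricci_nu_nu
    return integrand * 2.0 * math.pi

numeric = second_difference(latitude_length, 0.0)
formula = second_variation_equator()
print(numeric, formula)
```

Both numbers should agree up to finite-difference error, reflecting that the equator is unstable as a minimal curve in the round sphere.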
Exercises
where X = ∂/∂r and dσ is the area measure induced by g_S, thus verifying the
first variation of area formula (X.3).
c. The Hawking mass of a surface Σ with area |Σ| is given by
\[
m_H(\Sigma) = \sqrt{\frac{|\Sigma|}{16\pi}}\,\Big( 1 - \frac{1}{16\pi}\int_\Sigma H^2\, d\sigma \Big).
\]
Show that m_H(S_r) = m.
d. Generalize the argument in part a. of Exercise 5-27 to show that if there is a
closed minimal surface Σ in (M, g_S), then m > 0 and Σ = S_{m/2}.
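Part c can be checked numerically before attempting a proof. The sketch below assumes n = 3 and the isotropic form g_S = u⁴ g_E with u = 1 + m/(2|x|) for the Riemannian Schwarzschild metric of (7.1.1), and uses the standard conformal transformation rule H = u⁻²(2/r + 4u′/u) for the mean curvature of a coordinate sphere; the function name is hypothetical.

```python
import math

def hawking_mass_schwarzschild(m, r):
    """Hawking mass of the coordinate sphere |x| = r in g_S = u^4 * g_E.

    Assumed setting: n = 3, u = 1 + m/(2r) (isotropic coordinates).
    """
    u = 1.0 + m / (2.0 * r)
    du = -m / (2.0 * r ** 2)                   # radial derivative of u
    area = 4.0 * math.pi * r ** 2 * u ** 4     # area of S_r in g_S
    # Mean curvature of S_r: Euclidean value 2/r corrected by the conformal factor.
    mean_curvature = (2.0 / r + 4.0 * du / u) / u ** 2
    willmore = mean_curvature ** 2 * area      # H is constant on S_r
    return math.sqrt(area / (16.0 * math.pi)) * (1.0 - willmore / (16.0 * math.pi))

for radius in (0.3, 0.5, 2.0, 50.0):
    print(radius, hawking_mass_schwarzschild(1.0, radius))
```

At r = m/2 the computed mean curvature vanishes, consistent with S_{m/2} being the minimal sphere in part d.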
Exercise X-4. Suppose (M, g) is Riemannian. Let I ⊂ R be an open interval
containing 0. Suppose F : I × M → M is smooth, with F(0, ·) : M → M the
identity map. Suppose Ω ⊂ M is an open set with compact closure and smooth
hypersurface boundary Σ, and outward unit normal field ν along Σ. Let V(t)
be the volume of Ω_t := F({t} × Ω), let A(t) be the area of F({t} × Σ), and let
dσ be the induced surface measure.
a. Suppose that along Σ we have
\[
\frac{\partial F}{\partial t}\Big|_{t=0} := F_*\frac{\partial}{\partial t}\Big|_{t=0} = X^T + \phi\nu,
\]
∇_ν S − S² = R(·, ν, ν), (X-6a)
the reference). We remark that the logarithmic cutoff is Lipschitz, which suffices
for the argument. If one feels the need to smooth it out, one can replace |x| with
ρ(x), where ρ(x) = θ for |x| ≤ θ + δ_0, ρ(x) = |x| for θ + δ_1 ≤ |x| ≤ θ² − δ_1, and
ρ(x) = θ² for |x| ≥ θ² − δ_0, where 0 < δ_0 < δ_1, and δ_0 and δ_1 are fixed, for all θ > 1
sufficiently large. For θ + δ_0 ≤ |x| ≤ θ + δ_1, we can let ρ interpolate smoothly
between θ and |x| by letting ρ′(r) = ψ(r − θ), where ψ ≥ 0 is smooth, ψ(t) = 0 for
t ≤ δ_0, ψ(t) = 1 for t ≥ δ_1, and ∫_{δ_0}^{δ_1} ψ(t) dt = δ_1. For θ² − δ_1 ≤ |x| ≤ θ² − δ_0, we
can interpolate analogously, with ρ′(r) = ψ(θ² − r) for θ² − δ_1 ≤ r ≤ θ² − δ_0.)
CHAPTER 7
This chapter develops the geometry of and analysis on initial data sets that arise in
models of isolated gravitational systems. While such models should behave in the
far field like Minkowski spacetime, there are important physical and mathematical
issues to consider when specifying precisely what the spacetime asymptotic
behavior should be, and the devil is in the details. A desirable, but rather strong,
requirement would be to have the spacetime admit a conformal compactification
in the spirit of that of Minkowski spacetime from Section 1.4.2. (See, for instance,
[218, Section 11.1], [112], and [176] for the notion of asymptotically simple
spacetimes, ideas from the analysis of which are often used to model gravitational
radiation.) One often considers weaker notions that still capture in some sense
how the spacetime approaches the Minkowski spacetime.
A natural question to ask is what kind of spacetime behavior is inherited
from the evolution of initial data which approaches, say, Euclidean geometry,
or, initial data which approaches that of a hyperboloid in Minkowski spacetime,
in the far field. With the conformal compactification of Minkowski spacetime
in mind, hypersurfaces in the former class might be thought of as tending to
spacelike infinity, and those in the latter class as tending to null infinity, though
such notions would come from the spacetime evolution of the data. There are
results, cf. [56; 55; 97], which establish properties about the evolution of such
initial data. We will not discuss such results here, but rather we will focus on the
analysis and geometry of the initial data, with particular emphasis on features
of the asymptotic structure imposed by the constraint equations (vacuum, or
under the dominant energy condition), leaving the interested reader to pursue
the fascinating questions and results for the evolution problem, which has seen
spectacular progress in recent years.
Throughout this chapter, n ≥ 3, g_{E^n} is the Euclidean metric on R^n, g̊_{S^{n−1}} is
the unit round metric on the sphere, and we take the cosmological constant Λ to
be 0. We let dx be Euclidean volume measure, and let |·| be the Euclidean norm,
while for a submanifold of Euclidean space, dσ is Euclidean surface measure,
232 7. A SYMPTOTICALLY FLAT SOLUTIONS OF THE E INSTEIN CONSTRAINT EQUATIONS
and ν is a Euclidean normal vector, whereas dv_g, |·|_g, dσ_g and ν_g will be the
analogous quantities with respect to a metric g.
The first two sections of the chapter contain some detailed discussion and
analysis involving the Laplace operator on Euclidean space and on asymptotically
flat manifolds. While it is not necessary to understand and fill in every detail on
a first pass, we hope that the reader can follow some of the discussion to get a
sense of what is important, and then be able to apply the results in later sections.
Indeed, rather than just state results and point to a panoply of references, we
have attempted to motivate the results and sketch proofs, hopefully providing a
reader’s guide to some of the references.
7.1.1.1. The fundamental solution. On a Riemannian manifold (M, g), the trace
of the covariant Hessian Δ_g u = div_g(∇_g u) = g^{ij} u_{;ij} can be expressed as (recall
the summation convention is in force) (cf. Exercise 1-10)
\[
\Delta_g u = \frac{1}{\sqrt{\det g}}\,\frac{\partial}{\partial x^i}\Big( \sqrt{\det g}\; g^{ij}\,\frac{\partial u}{\partial x^j} \Big).
\tag{7.1.2}
\]
\[
\Delta u = \frac{1}{r^{n-1}}\frac{d}{dr}\Big( r^{n-1}\frac{du}{dr} \Big) = u''(r) + \frac{n-1}{r}\,u'(r).
\]
For such u, to be harmonic is thus equivalent to d/dr (r^{n−1} du/dr) = 0, that is,
Ar + B for n = 1,
Let Ω_n be the volume of the unit ball B in R^n, and ω_{n−1} the surface area of
the unit round sphere S^{n−1} = ∂B. Note that 2Ω_2 = 2π = ω_1, 3Ω_3 = 4π = ω_2,
and in general nΩ_n = ω_{n−1}. The proof of the following is a generalization of
that of (2.1.2), and is left as an exercise.
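Proposition 7-1. Let
\[
\Gamma(x - y) =
\begin{cases}
\dfrac{1}{(2-n)\,n\Omega_n}\,|x - y|^{2-n} & \text{if } n > 2, \\[1.5ex]
\dfrac{1}{2\pi}\,\log|x - y| & \text{if } n = 2.
\end{cases}
\]
Then, for all u ∈ C_c²(R^n),
\[
u(x) = \int_{\mathbb{R}^n} \Gamma(x - y)\,\Delta u(y)\, dy.
\tag{7.1.4}
\]
We remark that ∂Γ(x)/∂x^i = (1/ω_{n−1}) x^i/|x|^n for all n ≥ 2.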
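The gradient formula in the preceding remark can be tested numerically in several dimensions at once. The sketch below is an illustration only: it uses the standard closed form ω_{n−1} = 2π^{n/2}/Γ_Euler(n/2) for the area of the unit sphere (with Γ_Euler the Euler gamma function, not the fundamental solution), and all function names are hypothetical.

```python
import math

def sphere_area(n):
    """omega_{n-1}: area of the unit sphere S^{n-1} in R^n (equals n * Omega_n)."""
    return 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)

def fundamental_solution(x):
    """Fundamental solution of the Euclidean Laplacian, evaluated at x != 0."""
    n = len(x)
    r = math.sqrt(sum(c * c for c in x))
    if n == 2:
        return math.log(r) / (2.0 * math.pi)
    return r ** (2 - n) / ((2 - n) * sphere_area(n))

def gradient_formula(x, i):
    """Claimed closed form x^i / (omega_{n-1} |x|^n) for the i-th partial derivative."""
    n = len(x)
    r = math.sqrt(sum(c * c for c in x))
    return x[i] / (sphere_area(n) * r ** n)

def gradient_numeric(x, i, h=1e-6):
    """Central finite-difference approximation of the i-th partial derivative."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (fundamental_solution(xp) - fundamental_solution(xm)) / (2.0 * h)

for point in [(0.3, -1.2), (0.3, -1.2, 0.7), (1.0, 0.5, -0.4, 2.0)]:
    for i in range(len(point)):
        print(len(point), i, gradient_numeric(point, i), gradient_formula(point, i))
```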
7.1.1.2. The Poisson equation and the Newtonian potential. For functions
f ∈ L^∞_loc(R^n) ∩ L^1(R^n), for instance f locally bounded, measurable and decaying
suitably at infinity, the integral N_f(x) = ∫_{R^n} Γ(x − y) f(y) dy converges, and
constitutes the Newtonian potential of f.
Exercise 7-2. Let n ≥ 3. Prove that the integral defining N_f converges for
functions f ∈ L^∞_loc(R^n) ∩ L^1(R^n). (Hint: Break up the integral into two regions and
exploit the fact that Γ is both locally integrable and bounded near infinity.) Using
Young's inequality for convolution, adapt the proof to show that if f ∈ L^1(R^n),
the integral for N_f converges a.e. to a locally integrable function.
For f integrable with compact support, a simple Fubini-type argument, along
with Proposition 7-1, yields that ΔN_f = f distributionally: for all ψ ∈ C_c^∞(R^n),
\[
\int_{\mathbb{R}^n} \Delta\psi(x)\, N_f(x)\, dx = \int_{\mathbb{R}^n} \psi(y)\, f(y)\, dy.
\]
One can then use uniform continuity of ∂^α f to conclude that N_f ∈ C^k(R^n), and of
course, if f ∈ C_c^∞(R^n), then N_f ∈ C^∞(R^n). Suppose we dial the regularity down
to f ∈ C_c^1(R^n), so that ∂_j N_f(x) = ∫_{R^n} Γ(x − y) ∂_j f(y) dy, as above. Both Γ and ∂_jΓ
are locally integrable, and a very simple limiting argument around the singularity
as in the derivation of (2.1.2) justifies integration by parts to yield
\[
\partial_j N_f(x) = \int_{\mathbb{R}^n} \partial_j\Gamma(x - y)\, f(y)\, dy = \int_{\mathbb{R}^n} \partial_j\Gamma(y)\, f(x - y)\, dy.
\]
that N_f ∈ C^1(R^n). Since ∂²_{ij}Γ is not locally integrable, we cannot run the
analogous argument again; however, if we have suitable control on the difference
|f(x) − f(y)|, such as from local Hölder continuity, say f ∈ C^{0,α}_loc(R^n) ∩ L^1(R^n),
then we can use the fact that ∂²_{ij}Γ(x − y)(f(y) − f(x)) is locally integrable to
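Exercise 7-7. Prove Corollary 7-6. (Hint: Use v = N_{Δv} and the expansion
\[
\begin{aligned}
|x - y|^{2-n} &= \big(|x|^2 - 2x\cdot y + |y|^2\big)^{\frac{2-n}{2}}
= |x|^{2-n}\Big( 1 - 2\,\frac{x}{|x|}\cdot\frac{y}{|x|} + \frac{|y|^2}{|x|^2} \Big)^{\frac{2-n}{2}} \\
&= |x|^{2-n}\Big( 1 + (n-2)\,\frac{x}{|x|}\cdot\frac{y}{|x|} + O_\infty(|x|^{-2}) \Big)
\end{aligned}
\]
for |x| large relative to |y|.)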
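The order of the remainder in the expansion from the hint can be observed numerically: after dividing by |x|^{2−n}, the error of the two-term approximation should shrink by a factor of about 100 when |x| grows by a factor of 10. The sketch below uses arbitrarily chosen sample points and hypothetical function names; it illustrates the expansion and is not a proof.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def exact_kernel(x, y):
    """|x - y|^(2 - n)."""
    n = len(x)
    return norm([p - q for p, q in zip(x, y)]) ** (2 - n)

def two_term_expansion(x, y):
    """|x|^(2 - n) * (1 + (n - 2) x.y / |x|^2)."""
    n = len(x)
    r = norm(x)
    return r ** (2 - n) * (1.0 + (n - 2) * dot(x, y) / r ** 2)

def scaled_remainder(x, y):
    """Remainder divided by |x|^(2 - n); expected to behave like |x|^(-2)."""
    n = len(x)
    return abs(exact_kernel(x, y) - two_term_expansion(x, y)) * norm(x) ** (n - 2)

y = (0.3, -0.2, 0.5)                                  # sample point (assumption)
direction = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)         # unit vector
near = [1.0e3 * c for c in direction]
far = [1.0e4 * c for c in direction]
ratio = scaled_remainder(near, y) / scaled_remainder(far, y)
print(ratio)   # expected to be close to 100 for an O(|x|^-2) remainder
```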
For such a function v as in the corollary, there is a neighborhood Ω of infinity
for which u = 1 + v > 0 and such that (Ω, u^{4/(n−2)} g_{E^n}) is harmonically flat at infinity.
In analogy with the Schwarzschild metric, we will identify m = 2A as a mass.
The vector B will be related to a center of mass in Exercise 7-28 below. At
this point it might prove instructive to compare the Newtonian analogue: recall
from (2.1.3) that the gravitational potential Φ satisfies the Poisson equation
ΔΦ = 4πσ = 4πρ/c², where ρ is the energy density, and σ = ρ/c² is the mass
density, and we have taken G = 1 for convenience. If we assume ρ is smooth and
compactly supported, and that lim_{|x|→∞} Φ(x) = 0, then Φ admits an expansion
of the above form, which we write Φ(x) = −m/|x| − β·x/|x|³ + O_∞(|x|^{−3}).
In fact, from (7.1.5) we see that
\[
m = \frac{1}{4\pi}\lim_{r\to\infty}\int_{\{|x|=r\}} \frac{\partial\Phi}{\partial r}\, d\sigma
= \frac{1}{4\pi}\int_{\mathbb{R}^3} \Delta\Phi\, dx = \int_{\mathbb{R}^3} \sigma\, dx.
\tag{7.1.7}
\]
Thus m is in fact the total mass of the matter distribution, and β^i = m c^i is the
first moment of the distribution, giving (for m ≠ 0) the center of mass c^i.
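To see the roles of m and β concretely, one can replace the smooth density by a few point masses (an idealization that is an assumption of this sketch, not part of the text) and compare the exact potential with the monopole and dipole terms far from the sources. The masses, positions, and names below are hypothetical sample data.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

# Hypothetical point masses (m_k, y_k) standing in for a compactly supported density.
MASSES = [1.0, 2.0, 0.5]
POSITIONS = [(1.0, 0.0, 0.0), (-0.5, 0.5, 0.0), (0.0, -1.0, 1.0)]

def potential(x):
    """Newtonian potential Phi(x) = -sum_k m_k / |x - y_k| (with G = 1)."""
    return -sum(m / norm([a - b for a, b in zip(x, y)])
                for m, y in zip(MASSES, POSITIONS))

total_mass = sum(MASSES)                                   # m
beta = [sum(m * y[i] for m, y in zip(MASSES, POSITIONS))   # first moment beta^i
        for i in range(3)]
center_of_mass = [b / total_mass for b in beta]            # c^i = beta^i / m

def monopole_dipole(x):
    """Two-term far-field approximation -m/|x| - beta.x/|x|^3."""
    r = norm(x)
    return -total_mass / r - sum(b * c for b, c in zip(beta, x)) / r ** 3

x = (300.0, -400.0, 1200.0)                                # far-field point, |x| = 1300
remainder = abs(potential(x) - monopole_dipole(x))
print(total_mass, center_of_mass, remainder * norm(x) ** 3)
```

The last printed number stays bounded as |x| grows, in line with the O(|x|⁻³) remainder in the expansion of Φ.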
7.1.2. Harmonic functions. We collect here some well-known facts about
Euclidean harmonic functions on domains in R^n, n ≥ 2, a reference for which is
[12]; see also [86; 107; 111]. We give some of the proofs, and refer others to
exercises or these references.
For a ∈ R^n and R > 0, let B_R(a) = {x ∈ R^n : |x − a| < R}, so that its closure is
B̄_R(a) = {x ∈ R^n : |x − a| ≤ R}, the boundary ∂B_R(a) of which is a sphere.
When a is the origin, we may abbreviate by omitting the center point.
If Ω is a bounded open set with smooth hypersurface boundary ∂Ω, ν will be
the outward unit normal, and for a differentiable function u, ∂u/∂ν is the directional
derivative in this normal direction. In this section, we may use a subscript on the
surface measure to denote the variable of integration.
Classically, one defines u to be harmonic in an open set Ω if u ∈ C²(Ω)
and Δu = 0 in Ω. As we mentioned for the more general Poisson equation
above, we could interpret Δu = 0 distributionally: if u is locally integrable, or
more generally a distribution, then Δu = 0 on Ω distributionally means for all
ψ ∈ C_c^∞(Ω), ∫_Ω uΔψ dx = 0. An important classical result is Weyl's Lemma:
further mention; we may, however, note any assumptions about behavior of the
function up to the boundary, as in the following proposition.
Proposition 7-8 (mean value property, MVP). If u ∈ C⁰(B̄_R(a)) is harmonic on
B_R(a), then
\[
u(a) = \frac{1}{n\Omega_n R^{n-1}}\int_{\{|x-a|=R\}} u(x)\, d\sigma = \frac{1}{\Omega_n R^{n}}\int_{B_R(a)} u(x)\, dx.
\]
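The spherical form of the mean value property is easy to test numerically for an explicit harmonic polynomial on R³. The sketch below uses a simple midpoint rule in spherical coordinates; the polynomial, center, radius, and function names are all choices made for this illustration.

```python
import math

def u(p):
    """A harmonic polynomial on R^3 (each term has zero Laplacian)."""
    x, y, z = p
    return x * x - y * y + 2.0 * x * y * z + z

def sphere_average(f, center, radius, n_theta=200, n_phi=64):
    """Midpoint-rule average of f over the sphere |x - center| = radius."""
    total = 0.0
    weight = 0.0
    for i in range(n_theta):
        theta = math.pi * (i + 0.5) / n_theta
        w = math.sin(theta)                     # area weight in spherical coordinates
        for j in range(n_phi):
            phi = 2.0 * math.pi * (j + 0.5) / n_phi
            p = (center[0] + radius * math.sin(theta) * math.cos(phi),
                 center[1] + radius * math.sin(theta) * math.sin(phi),
                 center[2] + radius * math.cos(theta))
            total += w * f(p)
            weight += w
    return total / weight

center = (0.5, -1.0, 2.0)
average = sphere_average(u, center, 1.0)
print(average, u(center))
```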
point ξ on the boundary, P_R(x, y) becomes sharply peaked about ξ, so that its unit
integral is concentrated around y = ξ, and thus ∫_{∂B_R} ϕ(y) P_R(x, y) dσ_y ≈ ϕ(ξ).
Exercise 7-12 (Green's function). For any x ∈ B_R, let h_x(y) solve Δh_x(y) = 0
on B_R with h_x(y) = −Γ(x − y) for y ∈ ∂B_R. Let G(x, y) = Γ(x − y) + h_x(y)
be the Green's function for B_R. Show that P_R(x, y) = ∂G/∂ν_y for y ∈ ∂B_R by
using Green's identity, applying what we know about the distributional Laplacian
of Γ (Proposition 7-1), along with the fact that G = 0 on ∂B_R.
One can compute a formula for G, using inversion in the sphere, and then
derive the above relation to the Poisson kernel [86; 107; 111]; we do not do this
here, but we recall inversion in the sphere in Section 7.1.2.3 below.
We now arrive at a key fact about the behavior of harmonic functions at an
isolated singularity.
Proof. By translation and scaling we can, without loss of generality, assume that
u is bounded and harmonic on B \ {0}, where B = B_1(0). Let v be harmonic on
B, continuous on B̄, with the same boundary values as u. For n > 2 and ε > 0, let
v_ε(x) = u(x) − v(x) + ε(|x|^{2−n} − 1); for n = 2, let v_ε(x) = u(x) − v(x) − ε log|x|.
Then v_ε is harmonic on the punctured ball, with v_ε = 0 on ∂B, while (since u and
v are bounded) v_ε(x) → +∞ as x → 0. By the maximum principle, v_ε(x) ≥ 0
on B \ {0}. Letting ε ↘ 0, we conclude u − v ≥ 0 on B \ {0}, and by replacing
u with (−u) in the argument, we can also conclude u − v ≤ 0 on this domain.
Hence u = v on B \ {0}, and v is harmonic in a neighborhood of the origin. □
Proof. Let u > 0 be harmonic on R^n. For any x ∈ R^n, let R > |x| and use the
MVP to write
\[
u(x) - u(0) = \frac{1}{\Omega_n R^{n}}\Big( \int_{B_R(x)} u(y)\, dy - \int_{B_R(0)} u(y)\, dy \Big).
\]
H ARMONICALLY FLAT SOLUTIONS OF THE CONSTRAINT EQUATIONS 241
Since u > 0 and (B_R(x) \ B_R(0)) ∪ (B_R(0) \ B_R(x)) ⊂ B_{R+|x|}(0) \ B_{R−|x|}(0),
Sketch of proof. We indicate two ways to prove a local inequality; the proof can
be completed by a covering argument [12; 107].
Suppose x ∈ B_R(y) and B̄_{3R}(y) ⊂ Ω. Then B_R(y) ⊂ B_{2R}(x) ⊂ B_{3R}(y), so
that since u ≥ 0, we obtain
\[
\frac{1}{\Omega_n R^{n}}\int_{B_R(y)} u\, dx \le \frac{2^n}{\Omega_n (2R)^{n}}\int_{B_{2R}(x)} u\, dx \le \frac{3^n}{\Omega_n (3R)^{n}}\int_{B_{3R}(y)} u\, dx.
\]
Applying the MVP we obtain u(y) ≤ 2^n u(x) ≤ 3^n u(y).
One could also use the Poisson kernel to get a local estimate. To illustrate,
suppose u ≥ 0 is harmonic on Ω ⊃ B̄, where B = B_1(0). From the Poisson
kernel, the MVP, and the sign of u, we obtain for x ∈ B the inequality
\[
\frac{1 - |x|^2}{(1 + |x|)^n}\, u(0) \le
\underbrace{\frac{1}{\omega_{n-1}}\int_{\partial B} \frac{1 - |x|^2}{|x - \xi|^n}\, u(\xi)\, d\sigma_\xi}_{u(x)}
\le \frac{1 - |x|^2}{(1 - |x|)^n}\, u(0). \qquad \square
\]
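The two-sided Poisson-kernel bound above can be tried on a concrete positive harmonic function. The sketch below takes n = 3 and u(x) = 1/|x − p| with a pole p outside the closed unit ball; the pole location, sample points, and function names are assumptions made only for this illustration.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

POLE = (0.0, 0.0, 2.0)   # hypothetical pole, outside the closed unit ball

def u(x):
    """Positive harmonic function 1/|x - POLE| on R^3 minus the pole."""
    return 1.0 / norm([a - b for a, b in zip(x, POLE)])

def harnack_bounds(x, n=3):
    """Lower and upper bounds for u(x), |x| < 1, from the Poisson-kernel estimate."""
    r = norm(x)
    u0 = u((0.0,) * n)
    lower = (1.0 - r * r) / (1.0 + r) ** n * u0
    upper = (1.0 - r * r) / (1.0 - r) ** n * u0
    return lower, upper

samples = [(0.9, 0.0, 0.0), (0.0, 0.0, 0.95), (0.0, 0.0, -0.95), (0.5, -0.5, 0.5)]
for x in samples:
    lower, upper = harnack_bounds(x)
    print(x, lower, u(x), upper)
```

The bounds degenerate as |x| → 1, which is why a covering argument is needed to pass from this local estimate to a Harnack inequality on compact subsets.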
For application to the asymptotics of harmonic functions, we consider positive
harmonic functions on Rn \ {0}. We start with a fundamental result, for proof of
which we refer to [12].
Exercise 7-21 (a key exercise). Prove that u is harmonic on Ω if and only if K[u]
is harmonic on Ω*. One way to do this is to show Δ(K[u])(x) = |x|^{−n−2} Δu(x*)
by direct computation; also see [12, Theorem 4.4].
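The identity in Exercise 7-21 can be explored numerically before computing by hand. The sketch below assumes the standard definition K[u](x) = |x|^{2−n} u(x*) with x* = x/|x|² (as in [12]), works in n = 3, and approximates the Laplacian by central differences; the test functions and names are hypothetical.

```python
import math

def norm_sq(x):
    return sum(c * c for c in x)

def kelvin(u):
    """Return K[u], with K[u](x) = |x|^(2-n) * u(x / |x|^2) (assumed definition)."""
    def transformed(x):
        n = len(x)
        s = norm_sq(x)
        return s ** ((2 - n) / 2.0) * u([c / s for c in x])
    return transformed

def laplacian(f, x, h=1e-3):
    """Second-order central finite-difference Laplacian of f at x."""
    fx = f(x)
    total = 0.0
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        total += (f(xp) - 2.0 * fx + f(xm)) / (h * h)
    return total

def harmonic_example(x):
    return x[0] * x[1] + x[2]        # harmonic polynomial on R^3

def nonharmonic_example(x):
    return norm_sq(x)                # Laplacian is constant, equal to 6, on R^3

point = [0.7, -0.4, 1.1]
lap_harmonic = laplacian(kelvin(harmonic_example), point)
lap_nonharmonic = laplacian(kelvin(nonharmonic_example), point)
predicted = norm_sq(point) ** (-5.0 / 2.0) * 6.0   # |x|^(-n-2) * (Laplacian of u)(x*)
print(lap_harmonic, lap_nonharmonic, predicted)
```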
The Kelvin transform preserves harmonicity, transforming a harmonic function
in a neighborhood of infinity to one defined in a neighborhood of the origin. We
analyze conditions on the harmonic function near infinity that are encoded in the
transform near the origin.
Definition 7-22. We say a function u is harmonic near infinity if there is a
compact set K such that u is harmonic on Rn \ K . If u is harmonic near infinity,
we say u is harmonic at infinity provided K [u] has a removable singularity at
the origin.
The function in Corollary 7-6 is harmonic at infinity:
Proposition 7-23. Let n ≥ 3 and suppose u is harmonic near infinity. Then u is
harmonic at infinity if and only if lim|x|→∞ u(x) = 0.
Proof. From u(x) = |x|^{2−n} K[u](x*), one direction above is trivial. For the
converse, assuming lim_{|x|→∞} u(x) = 0, we can adapt the proof of the removable
singularities theorem above to conclude. □
We remark that the Kelvin transform can be used to solve the exterior Dirichlet
problem, to find a harmonic function on the exterior of a ball with given values on
the boundary sphere, by inverting the solution of the interior Dirichlet problem on
the ball. The result is a function which is, by definition, harmonic at infinity; by
the preceding proposition, it decays to zero at infinity. Thus, given a continuous
function on the boundary of a ball, there is a unique function that goes to zero at
infinity with given boundary values. Compare the following simple example.
Example 7-24. Let n ≥ 3 and B = B_1(0). The function u(x) = 1 − |x|^{2−n} is
harmonic on |x| ≠ 0, and u = 0 on ∂B. The unique function harmonic at infinity
which solves the exterior Dirichlet problem with zero boundary data on ∂B is the
zero function. The harmonic function u is not harmonic at infinity, but u − 1 is.
Proposition 7-25. Let n ≥ 3, and let u be harmonic near infinity. The following
statements are equivalent:
• u is bounded near infinity.
• u is bounded above or below near infinity.
• There is a constant a such that u − a is harmonic at infinity.
• u has a finite limit as |x| → ∞.
Proof. Most of the steps are straightforward. One step involves using Bôcher's
theorem: if u > 0 near infinity, then K[u] > 0 near the origin, so by Bôcher, there
is an a such that K[u](x) − a|x|^{2−n} = v(x) is harmonic near the origin. So
K[v](x) = u(x) − a is harmonic near infinity. □
7.1.2.4. Spherical harmonic expansion. Let n ≥ 3. We note that for a function
harmonic near the origin, if you group the terms in the Taylor expansion into
homogeneous polynomials, each such is harmonic [12]. If v is harmonic at
infinity, by applying the Taylor expansion of K[v] about x* = 0 we have (recall
that B·x = Σ_{i=1}^{n} B^i x^i),
Thus
\[
v(x) = A|x^*|^{n-2} + B\cdot x^*\,|x^*|^{n-2} + \cdots = \frac{A}{|x|^{n-2}} + \frac{B\cdot x}{|x|^{n}} + O_\infty(|x|^{-n});
\]
cf. Corollary 7-6. One can compute higher-order terms in the expansion, which
come from the homogeneous harmonic polynomials of degree higher than one.
Exercise 7-26. In the case n = 3, compute the next term in the expansion of v.
What is the dimension of the space of homogeneous harmonic polynomials of
degree two in dimension three? (The answer is five; see [12].)
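The count in Exercise 7-26 is an instance of a standard dimension formula: since every homogeneous polynomial of degree k splits uniquely as a harmonic one plus |x|² times a polynomial of degree k − 2, the harmonic ones have dimension C(n+k−1, k) − C(n+k−3, k−2). A minimal sketch, with a hypothetical function name:

```python
from math import comb

def dim_harmonic_polynomials(n, k):
    """Dimension of the homogeneous harmonic polynomials of degree k on R^n, n >= 2."""
    all_degree_k = comb(n + k - 1, k)                       # all homogeneous of degree k
    multiples_of_r2 = comb(n + k - 3, k - 2) if k >= 2 else 0
    return all_degree_k - multiples_of_r2

for degree in range(5):
    print(degree, dim_harmonic_polynomials(3, degree))
```

In dimension three this reproduces the familiar multiplicities 2k + 1 of the spherical harmonics.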
We next explain why this is called the spherical harmonic expansion.
Remark 7-27. The eigenfunctions of Δ_g̊ on the round unit sphere (S^n, g̊) (i.e.,
nontrivial solutions of Δ_g̊ u + λu = 0 for some constant λ) are given by restrictions
of the homogeneous harmonic polynomials on R^{n+1}. The lowest eigenvalue
is λ_0 = 0 (constant eigenfunctions), and the next eigenvalue is λ_1 = n, with
eigenfunctions x^i, i = 1, ..., n + 1, restricted to the sphere [12], cf. Exercise
2-43.
A SYMPTOTICALLY FLAT INITIAL DATA 245
Suppose g = u^{4/(n−2)} g_{E^n} is harmonically flat at infinity. We may assume, rescaling
if necessary, that lim_{|x|→∞} u(x) = 1. Then u(x) admits a spherical harmonic
expansion near infinity, as u(x) = 1 + A/|x|^{n−2} + B·x/|x|^n + O_∞(|x|^{−n}), cf.
Corollary 7-6. As we remarked earlier, we let m = 2A in analogy with the
Schwarzschild metric, and the vector B is related to the center of mass, as we
point out in the next exercise. Note that metrics which are harmonically flat
at infinity asymptote to the Euclidean metric, with the difference on the order
O(|x|^{−(n−2)}). The leading order deviation in the metric g from the flat metric
comes from the mass term in the expansion. Harmonically flat metrics might be
thought of as asymptotic to Schwarzschild, since g − g_S, for g_S given in (7.1.1)
with mass m = 2A, decays as O(|x|^{−(n−1)}), one order better than the deviation
from Euclidean. Actually, if we recenter coordinates, we can make the decay
another order better, as in the next exercise.
Exercise 7-28. Suppose g = u^{4/(n−2)} g_{E^n} is harmonically flat at infinity, with
expansion u(x) = 1 + A/|x|^{n−2} + B·x/|x|^n + O_∞(|x|^{−n}) for |x| > r_0. Translate
coordinates to y = x − a, for a ∈ R^n. For |y + a| > r_0, find the asymptotic
expansion of u(x) = u(y + a) in terms of y. Show that in case A ≠ 0, there is a
unique c ∈ R^n for which u(y + c) = 1 + A/|y|^{n−2} + O(|y|^{−n}).
With this important class of initial data sets in mind, we move to more general
asymptotically flat initial data in the next section.
We will give a starting definition of asymptotically flat initial data (g, K ), and
remark later on natural generalizations of the definition in terms of how the
regularity and decay assumptions are imposed. For simplicity, we will assume
g and K are smooth; the interested reader can discern when a lower regularity
level is sufficient. Recall the notation O_ℓ(|x|^s; Ω) from Definition 7-5.
Definition 7-29. Let n ≥ 3, ℓ ∈ Z_+, and q > 0. An n-dimensional Riemannian
manifold (E, g) is called an asymptotically flat (or asymptotically Euclidean) end,
with rate q and order ℓ + 1, if E admits asymptotically flat coordinates x giving
a diffeomorphism of E and Ω := R^n \ {|x| ≤ 1}, in which, for i, j ∈ {1, ..., n},
g_{ij}(x) − δ_{ij} = O_{ℓ+1}(|x|^{−q}; Ω).
An asymptotically flat initial data set (g, K) on E will consist of an asymptotically
flat metric g on E as above, along with an associated symmetric (0, 2)-tensor
K (equivalently, π = K − (tr_g K)g) satisfying for i, j ∈ {1, ..., n}
K_{ij}(x) = O_ℓ(|x|^{−q−1}; Ω).
We remark that sometimes in the definition one also imposes some extra decay
requirement on Φ(g, K), i.e., on the energy-momentum density (ρ, J) of (5.2.2)–
(5.2.3), beyond that which comes from the conditions on g and K above.
We may refer to either (E, g), (E, g, K) or (E, g, π) as an asymptotically
flat end. (We emphasize that q here is different than in the conformal method
from last chapter.) Furthermore, while the definition makes sense for q > 0, we
will (unless otherwise noted) restrict to q > (n−2)/2, so as to have a well-defined
energy-momentum vector of an asymptotically flat end. We will often remind the
reader where computations in an end are done with respect to asymptotically flat
coordinates x, though we may carry on without explicit remark, and we note that
in the definition it is equivalent to employ asymptotic coordinates x for which
the end E corresponds to |x| > r_0 ≥ 1.
Definition 7-30. A connected complete n-dimensional Riemannian manifold
(M, g) is called asymptotically flat (or asymptotically Euclidean) if there is a
compact set C ⊂ M and a k ∈ Z_+ such that the components of M \ C are given
by the (open) sets E_1, ..., E_k, each of which is an asymptotically flat end (E_j, g).
An asymptotically flat initial data set (M, g, K) (or (M, g, π)) will consist of
an asymptotically flat (M, g) along with an associated symmetric (0, 2)-tensor
K (or π = K − (tr_g K)g) so that g and K satisfy the asymptotic conditions in
Definition 7-29 on each end.
Metrics on R^n that are harmonically flat at infinity are also asymptotically flat
with q = n − 2. For example, the Riemannian Schwarzschild metric, given on
R^n \ {0} by (7.1.1) with m > 0, is asymptotically flat with two ends (Exercise 2-37).
For m < 0, the Riemannian Schwarzschild metric g_S on R^n \ {|x| ≤ −m/2}
includes one asymptotically flat end, but as we discussed earlier, this space is
not complete. If we excise the singularity in g_S for m < 0 by restricting to a
manifold N = R^n \ {|x| ≤ δ − m/2} for any δ > 0, we see the mean curvature
vector H of ∂N points out of N (cf. Exercise X-3). On the other hand, in case
m > 0, if we let N = R^n \ {|x| ≤ m/2 − δ} for 0 < δ < m/2, then the mean curvature
vector of ∂N points into N, in contrast with the m < 0 case. Thus should
one seek to formulate natural conditions to guarantee a nonnegative mass, this
example clearly shows that if we allow boundary components in initial data sets, we
would need to impose some such condition on the boundary. For simplicity, we
will consider the case without boundary in this chapter (whereas asymptotically
flat metrics with minimal surface boundary components will play a role in the
formulation of the Riemannian Penrose inequality in Chapter 9). Even without
boundary, we also need to impose some condition on the constraints operator, i.e.,
on the energy-momentum density (ρ, J), else we could simply patch together any
7.2.1. The ADM energy and momenta. We now want to define an energy-
momentum vector for an isolated gravitational system, to capture the energy-
momentum of the matter fields along with the gravitational field. This will not
amount, then, to simply integrating the energy density ρ, since, for example, the
Schwarzschild spacetime is vacuum, ρ = 0, while we have seen that far field
observers will detect what they might describe as a gravitational field outside a
massive object. As we will see, the energy-momentum vector will be defined as a
limit of flux integrals, for which (7.1.7) gives a Newtonian analogue. We require
q > (n−2)/2 in various places in this section, with or without explicit mention.
That said, one might look to a Hamiltonian to define an appropriate notion
of energy of a system. We saw the ADM Hamiltonian in Section 5.3.3.2, cf.
(5.3.10), and how it generated the equations of motion (5.3.18); we also noted
(cf. Remark 5-22 and Exercise 5-30) that the Hamiltonian is modified when
boundary terms need to be taken into account. In the asymptotically flat case,
metrics we want to consider with finite nonzero energy have decay rates which
necessitate modifying the Hamiltonian by such boundary terms (at infinity) (see
[16; 187], and earlier works [9; 10; 75]).
Indeed, we consider g and π = −π̂ with decay rates in asymptotic coordinates
of the orders g_{ij} − δ_{ij} = O_{ℓ+1}(|x|^{−q}) and π^{ij} = O_ℓ(|x|^{−q−1}). Given a timelike
vector field V along M, we can extend these coordinates (x ↦ ϕ(x) ∈ M) to
spacetime coordinates (t, x) such that ∂/∂t = V along M, for example by using
the exponential map (t, x) ↦ exp_{ϕ(x)}(t V|_{ϕ(x)}). Doing this with a timelike unit
normal field along M, we get Fermi (Gaussian normal) coordinates (t̊, x), in
which the metric has the form ḡ = −dt̊² + γ_{ij} dx^i dx^j, with γ_{ij}|_{t̊=0} = g_{ij}. Given
such Fermi coordinates we let V = N ∂/∂t̊ + X^i ∂/∂x^i be a timelike vector field
along M, and let (t, x) be adapted coordinates with ∂/∂t = V. We extend N,
X^i and g_{ij} off of M (t = 0) so that N ∂/∂t̊ + X^i ∂/∂x^i = ∂/∂t and the metric ḡ
assumes the lapse-shift form ḡ = −N² dt² + g_{ij}(dx^i + X^i dt) ⊗ (dx^j + X^j dt);
Fermi coordinates correspond to N = 1, X = 0. More generally, we will
take N − X^0_∞ = O_{ℓ+1}(|x|^{−q}), X^i − X^i_∞ = O_{ℓ+1}(|x|^{−q}), for some constants
X^µ_∞; cf. [16]. We let g̊ be a background Riemannian metric as above, with
g̊_{ij} = δ_{ij} in asymptotically flat coordinates near infinity, and let dv̊ = dv_g̊. As in
Section 5.3.3.2, we let θ_g = √(det g)/√(det g̊), and π̃ = θ_g π̂.
From the derivation that led to equation (5.3.18), we see that what is needed
to generate the correct equations of motion in the asymptotically flat context
is a Hamiltonian whose first variation in the direction (h, σ̃) is equal to
∫_M (h, σ̃) ·_g DΦ̃*_{(g,π̃)}(N, X) dv̊. The variation δH_ADM of H_ADM is given by
other complication, since for such data as above the integral for HADM might not
converge; we can either restrict to data for which it does converge, or regularize
by subtracting off some background terms, as in [16].) To try to put δ HADM
into the desired form, we integrate by parts, but for bounded (N , X ), and for h
decaying at the same rate as (g − g̊) and σ̃ decaying at the same rate as π̃, this
does not necessarily give us the required identity in general: if we use a spherical
exhaustion of the asymptotic ends, say, then integration by parts leaves terms on
the boundary that may not limit to zero as the spheres tend to infinity.
We claim that the terms from δH_ADM = (DH_ADM)_{(g,π̃)}(h, σ̃) that require
analysis are ∫_M (−N L_g h + 2(div_g σ̂)_i X^i) dv_g, with the other terms either not
terms from the N-component of the integrand will come from −N L_g h, cf.
equation (5-29b). As for the X-component, the relevant terms arise from the
variation of −2(div_g π̂)_i X^i = −2g_{ij} X^i (π̂^{jk}_{,k} + π̂^{mk} Γ^j_{mk} + π̂^{jm} Γ^k_{km}), given by (see
(5.3.9)–(5.3.10))
\[
-2h_{ij}X^i(\operatorname{div}_g\hat\pi)^j - 2X^i(\operatorname{div}_g\hat\sigma)_i - 2g_{ij}X^i\big( \hat\pi^{mk}\,\delta\Gamma^j_{mk} + \hat\pi^{jm}\,\delta\Gamma^k_{km} \big).
\]
The first of these terms will not contribute to the boundary integral, while for
the last term we recall that δΓ^j_{mk} = ½ g^{js}(h_{sk;m} + h_{ms;k} − h_{mk;s}). Integrating a
term like 2g_{ij} X^i π̂^{mk} δΓ^j_{mk} by parts results in several boundary integral terms, the
first of which is X^i π̂^{mk} h_{ik}(ν_g)_m = O(|x|^{−2q−1}), which tends to zero in a limit
of integrals over large coordinate spheres {|x| = r} as r → ∞ with q > (n−2)/2,
where we swapped out the normal vector for the Euclidean normal (see Exercise
7-35 below). For the divergence term, the boundary integrand is simply
In order to get the desired linearization for the Hamiltonian, and subsequently
get the correct equations of motion, we need to add terms to H_ADM to cancel
out these boundary contributions, resulting in what we will call H_RT, so that the
variation δH_RT in the direction (h, σ̃) = (h, θ_g σ̂) is given by, using Exercise 7-35
to exchange the measure dσ_g with the Euclidean measure dσ without affecting
the existence and value of the limit,
\[
\delta H_{RT} = \lim_{r\to\infty}\int_{\{|x|=r\}} \sum_{i,j=1}^{n} \big( N(h_{ij,i} - h_{ii,j})\,\nu^j + 2\hat\sigma_{ij}X^i\nu^j \big)\, d\sigma + \delta H_{ADM}.
\]
We write
\[
H_{RT} = 2\kappa\, H^{\kappa}_{RT},
\]
so that H^κ_RT has units of energy (see Remark 5-23). For solutions of the vacuum
constraint equations, we have H_ADM = 0, and so should the boundary integrals
exist, the value of the Hamiltonian H^κ_RT would be
\[
X^0_\infty \cdot \frac{1}{2\kappa}\lim_{r\to\infty}\int_{\{|x|=r\}} \sum_{i,j=1}^{n} (g_{ij,i} - g_{ii,j})\,\nu^j\, d\sigma
\;-\; X^i_\infty \cdot \frac{1}{\kappa}\lim_{r\to\infty}\int_{\{|x|=r\}} \pi_{ij}\,\nu^j\, d\sigma.
\tag{7.2.1}
\]
Proof. The second formula follows directly from the first. In the asymptotic
coordinates, Γ^k_{ij} = O(|x|^{−q−1}). Thus for the scalar curvature we find
From g^{km}_{,ℓ} = −g^{ki} g_{ij,ℓ} g^{jm} = O(|x|^{−q−1}), we find
\[
\begin{aligned}
R(g) &= g^{ij}\big( \Gamma^k_{ij,k} - \Gamma^k_{ik,j} \big) + O(|x|^{-2q-2})
= \delta^{ij}\big( \Gamma^k_{ij,k} - \Gamma^k_{ik,j} \big) + O(|x|^{-2q-2}) \\
&= \tfrac12\,\delta^{ij}\delta^{km}\big[ (g_{mj,ik} + g_{im,jk} - g_{ij,mk}) - (g_{mk,ij} + g_{im,kj} - g_{ik,mj}) \big] + O(|x|^{-2q-2}),
\end{aligned}
\]
Thus we see that, with q > (n−2)/2, if ρ and J from the constraint equations are
integrable on E, then so are Σ_{i,j=1}^{n} (g_{ij,ij} − g_{ii,jj}) = L_{g_{E^n}} h and div_{g_{E^n}} π, which
immediately gives us the following proposition.
Proof. The proof is by application of the divergence theorem: for r > r_0 > 1,
\[
\int_{\{|x|=r\}} \sum_{i,j=1}^{n} (g_{ij,i} - g_{ii,j})\,\nu^j\, d\sigma - \int_{\{|x|=r_0\}} \sum_{i,j=1}^{n} (g_{ij,i} - g_{ii,j})\,\nu^j\, d\sigma
= \int_{\{r_0\le|x|\le r\}} \sum_{i,j=1}^{n} (g_{ij,ij} - g_{ii,jj})\, dx.
\]
Since, as remarked just above, the integrand is integrable, we see that the
difference in flux integrals can be made small by taking r_0 sufficiently large. Similarly,
the second limit follows from
\[
\int_{\{|x|=r\}} \sum_{j=1}^{n} \pi_{ij}\,\nu^j\, d\sigma - \int_{\{|x|=r_0\}} \sum_{j=1}^{n} \pi_{ij}\,\nu^j\, d\sigma
= \int_{\{r_0\le|x|\le r\}} (\operatorname{div}_{g_{E^n}}\pi)_i\, dx. \qquad \square
\]
Here ω_{n−1} is the area of the unit (n−1)-sphere, so that in case n = 3, say, the
constant in front of the E-integral is 1/(16π).
a corollary of the next simple exercise, the normal vector and the measure in
the above integrals (7.2.2)–(7.2.3) can be replaced by νg and dσg , respectively,
without altering the limit, for q > n−22 .
Exercise 7-35. a. Show that (as measures on $\{|x| = r\}$) $|d\sigma_g - d\sigma| \le O(r^{-q})\, d\sigma$.
b. Suppose that $\nu = x^j |x|^{-1}\, \partial/\partial x^j$, while $\nu_g$ is the $g$-normal unit vector field to
the coordinate spheres of constant $|x|$ in an asymptotic end $\mathcal{E}$, with $\langle \nu_g, \nu\rangle_g > 0$
(equivalently $\langle \nu_g, \nu\rangle_{g_{\mathbb{E}^n}} > 0$). Show that $|\nu_g - \nu| = O(|x|^{-q})$.
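As a numerical sanity check of the energy flux integral with the $n = 3$ normalization $\frac{1}{16\pi}$, the following Python sketch evaluates the flux over a large coordinate sphere for the time-symmetric Schwarzschild slice in isotropic coordinates, $g = (1 + \frac{m}{2|x|})^4 g_{\mathbb{E}^3}$. The choice of example metric, the function names, the finite-difference step and the midpoint quadrature are illustrative assumptions, not taken from the text; for this metric a direct computation gives the value $m(1 + \frac{m}{2r})^3$ on $\{|x| = r\}$, which tends to $m$ as $r \to \infty$.

```python
import math

def schwarzschild_metric(x, m=1.0):
    """Example metric (assumption): g_ij = (1 + m/(2|x|))^4 delta_ij on R^3 minus a ball."""
    r = math.sqrt(sum(c * c for c in x))
    u4 = (1.0 + m / (2.0 * r)) ** 4
    return [[u4 if i == j else 0.0 for j in range(3)] for i in range(3)]

def d_metric(metric, x, k, step=1e-3):
    """Central finite difference of all components g_ij in the coordinate direction x^k."""
    xp, xm = list(x), list(x)
    xp[k] += step
    xm[k] -= step
    gp, gm = metric(xp), metric(xm)
    return [[(gp[i][j] - gm[i][j]) / (2.0 * step) for j in range(3)] for i in range(3)]

def adm_energy_flux(metric, r, n_theta=40, n_phi=80):
    """(1/16 pi) * integral over {|x| = r} of sum_{i,j} (g_ij,i - g_ii,j) nu^j dsigma."""
    dth = math.pi / n_theta
    dph = 2.0 * math.pi / n_phi
    total = 0.0
    for a in range(n_theta):
        th = (a + 0.5) * dth
        for b in range(n_phi):
            ph = (b + 0.5) * dph
            nu = [math.sin(th) * math.cos(ph), math.sin(th) * math.sin(ph), math.cos(th)]
            x = [r * c for c in nu]
            dg = [d_metric(metric, x, k) for k in range(3)]  # dg[k][i][j] = g_ij,k
            integrand = sum((dg[i][i][j] - dg[j][i][i]) * nu[j]
                            for i in range(3) for j in range(3))
            total += integrand * r * r * math.sin(th) * dth * dph  # Euclidean area element
    return total / (16.0 * math.pi)

# With m = 1 the flux at r = 1000 should be close to (1 + 1/2000)^3, up to quadrature error.
print(adm_energy_flux(schwarzschild_metric, 1000.0))
```

The Euclidean normal and area element are used throughout, in line with the preceding remark that replacing them by $\nu_g$ and $d\sigma_g$ does not alter the limit.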
The ADM energy-momentum has been defined in terms of the initial data,
whereas one might also like to understand these quantities relative to the spacetime
picture. For instance, one could ask how these quantities transform under a
boost $y^\mu = \Lambda^\mu{}_\nu x^\nu$, or slightly more generally $y^\mu = \Lambda^\mu{}_\nu x^\nu + a^\mu$, where $\Lambda^\mu{}_\nu$ is a
proper (time-orientation preserving) Lorentz transformation, and $a^\mu$ is a constant
spacetime vector representing a shift in coordinates (e.g., a time translation).
Consider a suitably asymptotically flat (Minkowskian) spacetime with asymptotic
coordinates $x^\mu$, in which the initial data set corresponds to $x^0 = 0$ (at least near
infinity, say), and contains for some $\theta_0 > 0$ the spacelike hypersurfaces defined
near infinity by $y^0 = 0$, for $y^\mu = \Lambda^\mu{}_\nu x^\nu$, for proper Lorentz transformations with
boost angle $|\theta| < \theta_0$; cf. [56], e.g., for the existence of such boosted slices in the
spacetime evolution from asymptotically Euclidean initial data. One can then
show as in Exercise 7-98, following [211, Chapter 7], [161, Chapter 20], cf. [61,
Appendix E], that the energy-momentum vector transforms as a Minkowskian
vector under a Lorentz transformation: if $P^\mu$ are the components of the energy-momentum
computed in the $x$-coordinate chart, and $\tilde P^\mu$ are those computed in
the $y$-coordinate chart, then $\tilde P^\mu = \Lambda^\mu{}_\nu P^\nu$, i.e.,
\[
\tilde P^\mu \frac{\partial}{\partial y^\mu} = P^\nu \Lambda^\mu{}_\nu \frac{\partial}{\partial y^\mu} = P^\nu \frac{\partial y^\mu}{\partial x^\nu} \frac{\partial}{\partial y^\mu} = P^\nu \frac{\partial}{\partial x^\nu}.
\]
See Exercise 7-97 for explicit confirmation of this for a family of boosted slices
in the Schwarzschild spacetime. In terms of thinking of the spacetime geometry
and boosted slices, it might be a good time to recall that in the conformal
compactification of Minkowski spacetime (see Section 1.4.2), spacelike curves
with $r \to +\infty$ all “end” at a single point $i^0$, spacelike infinity. Finally, note
that if you simply switch the time orientation in the asymptotic regime, E does
not change, but since the second fundamental form changes sign, the linear
momentum changes sign.
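The transformation rule $\tilde P^\mu = \Lambda^\mu{}_\nu P^\nu$ can be illustrated with a short Python sketch. It applies a boost along the $x^1$-axis to an energy-momentum vector and checks that the Minkowski norm $-(P^0)^2 + \sum_i (P^i)^2$ is unchanged. The function names, the choice of boost axis and the sign convention for the boost angle are assumptions made for illustration only.

```python
import math

def boost_x(theta):
    """Proper, time-orientation preserving boost along the x^1-axis with boost angle theta."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return [[ch, -sh, 0.0, 0.0],
            [-sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def transform(L, P):
    """Components of P in the y-chart: (L P)^mu = sum_nu L[mu][nu] * P[nu]."""
    return [sum(L[mu][nu] * P[nu] for nu in range(4)) for mu in range(4)]

def minkowski_norm_sq(P):
    """-(P^0)^2 + (P^1)^2 + (P^2)^2 + (P^3)^2."""
    return -P[0] ** 2 + sum(c * c for c in P[1:])

# Energy-momentum (m, 0, 0, 0) in the x-chart, with m = 2, seen from a boosted chart.
P = [2.0, 0.0, 0.0, 0.0]
P_tilde = transform(boost_x(0.3), P)
print(P_tilde, minkowski_norm_sq(P), minkowski_norm_sq(P_tilde))
```

In the boosted chart the energy component becomes $m\cosh\theta$ and a linear momentum of size $m\sinh\theta$ appears, while the norm stays equal to $-m^2$.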
As we see from the form of the expansion of $R(g)$ above, and from the
Hamiltonian $H_{RT}$, the energy and momentum integrals are directly related to the
linearized constraint operator $D\Phi_{(g_{\mathbb{E}^n},0)}(h, \sigma) = (L_{g_{\mathbb{E}^n}} h, \operatorname{div}_{g_{\mathbb{E}^n}} \sigma)$. By integration
of the linearized constraints operator against elements of the kernel of the adjoint
operator $D\Phi^*_{(g_{\mathbb{E}^n},0)}$ at the Minkowski initial data $(g_{\mathbb{E}^n}, 0)$ one obtains asymptotic
conserved quantities. Recall that $D\Phi^*_{(g_{\mathbb{E}^n},0)}(N, X) = (0, 0)$ for $N$ a constant plus
a linear combination of Cartesian coordinate functions $x^i$, and $X$ a Euclidean
Killing field (a linear combination of generators of translations and rotations).
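Now, if $D\Phi^*_{(g_{\mathbb{E}^n},0)}(N, X) = (0, 0)$, then (with the dot product below induced by $g_{\mathbb{E}^n}$)
\[
0 = h \cdot L^*_{g_{\mathbb{E}^n}} N = h \cdot \big(-(\Delta_{g_{\mathbb{E}^n}} N)\, g_{\mathbb{E}^n} + \operatorname{Hess}_{g_{\mathbb{E}^n}} N\big) = \sum_{i,j=1}^n (-N_{,jj} h_{ii} + N_{,ij} h_{ij}).
\]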
The boundary integral for $E$ comes from $(N, X) = (1, 0)$, while the integral
for $P_i$ comes from $(N, X) = (0, \partial/\partial x^i)$. If $(g, \pi) = (g_{\mathbb{E}^n} + h, \pi)$ is asymptotically
flat, then in asymptotically flat coordinates $x$ on an asymptotic end,
\[
\Phi(g, \pi) = \Phi(g_{\mathbb{E}^n}, 0) + D\Phi_{(g_{\mathbb{E}^n},0)}(h, \pi) + O(|x|^{-2q-2}) = D\Phi_{(g_{\mathbb{E}^n},0)}(h, \pi) + O(|x|^{-2q-2}),
\]
where the error term comes from a sum of terms of the algebraic forms $h * \partial^2 h$,
$\partial h * \partial h$, $\pi * \pi$, $\partial h * \pi$ and $h * \partial\pi$, the latter term not needed when $\pi$ is treated
as a $(2, 0)$-tensor. Here “$*$” means a linear combination, with smooth bounded
coefficients, of terms quadratic in the components (or their partials) of $h$ and $\pi$,
as indicated. Thus if $(g, \pi)$ is a solution to the vacuum constraint $\Phi(g, \pi) = 0$,
then $D\Phi_{(g_{\mathbb{E}^n},0)}(h, \pi) = O(|x|^{-2q-2})$.
Exercise 7-38. Verify the stated form of the error term in the expansion of
$\Phi(g, \pi)$ above, and confirm the order estimate.
We now consider the other remaining basic kernel elements of $D\Phi^*_{(g_{\mathbb{E}^n},0)}$. For
instance, suppose $(N, X) = (x^k, 0)$, then
\[
B^k(r) = \int_{\{|x|=r\}} \Big[ x^k \sum_{i,j=1}^n (h_{ij,i} - h_{ii,j})\,\nu^j - \sum_{i,j=1}^n (\delta^k_i h_{ij} - \delta^k_j h_{ii})\, \nu^j \Big]\, d\sigma. \tag{7.2.4}
\]
This integral need not converge as r → ∞ for general asymptotically flat initial
data, even for solutions of the vacuum constraints. However, when there are
asymptotically flat coordinates for which the data enjoys sufficient asymptotic
parity, then the above surface integral will converge. We illustrate with an
example.
Exercise 7-39. Suppose $g = u^{\frac{4}{n-2}} g_{\mathbb{E}^n} = g_{\mathbb{E}^n} + h$ is harmonically flat at infinity,
with expansion $u(x) = 1 + A/|x|^{n-2} + B \cdot x/|x|^n + O_\infty(|x|^{-n})$ for $|x|$ sufficiently
large. Show that
\[
\frac{4(n-1)}{n-2}\, \omega_{n-1} B^k = \lim_{r\to\infty} \int_{\{|x|=r\}} \Big[ x^k \sum_{i,j=1}^n (h_{ij,i} - h_{ii,j})\,\nu^j - \sum_{i,j=1}^n (\delta^k_i h_{ij} - \delta^k_j h_{ii})\, \nu^j \Big]\, d\sigma. \tag{7-39a}
\]
Thus for $m = 2A \ne 0$, the limit above is equal to $2(n-1)\omega_{n-1} m c^k$, where
$c^k = \frac{B^k}{(n-2)A}$ gives the center of mass as in Exercise 7-28.
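The identity (7-39a) can be checked numerically for $n = 3$. The sketch below rests on a reduction that is not spelled out in the text: writing $h_{ij} = \varphi\,\delta_{ij}$ with $\varphi = u^4 - 1$, one computes that the integrand in (7-39a) equals $(n-1)(\varphi\,\nu^k - x^k\,\partial_r\varphi)$. The error term $O_\infty(|x|^{-n})$ is dropped from $u$, and the function name and quadrature are illustrative assumptions. For large $r$ the value should approach $\frac{4(n-1)}{n-2}\omega_{n-1}B^k = 32\pi B^k$.

```python
import math

def center_of_mass_flux(A, B, r, k, n_theta=60, n_phi=120):
    """Midpoint-rule value of the surface integral in (7-39a) over {|x| = r}, for n = 3.

    Assumes u = 1 + A/|x| + B.x/|x|^3 exactly (error term dropped) and uses the
    reduced integrand 2*(phi*nu^k - x^k * d(phi)/dr) with phi = u^4 - 1.
    """
    dth = math.pi / n_theta
    dph = 2.0 * math.pi / n_phi
    total = 0.0
    for a in range(n_theta):
        th = (a + 0.5) * dth
        for b in range(n_phi):
            ph = (b + 0.5) * dph
            nu = (math.sin(th) * math.cos(ph), math.sin(th) * math.sin(ph), math.cos(th))
            x = tuple(r * c for c in nu)
            b_dot_x = sum(Bj * xj for Bj, xj in zip(B, x))
            s = A / r + b_dot_x / r ** 3                      # s = u - 1
            ds = -A / r ** 2 - 2.0 * b_dot_x / r ** 4         # radial derivative of u
            phi = s * (4.0 + 6.0 * s + 4.0 * s * s + s ** 3)  # u^4 - 1, avoiding cancellation
            dphi = 4.0 * (1.0 + s) ** 3 * ds                  # radial derivative of u^4
            total += 2.0 * (phi * nu[k] - x[k] * dphi) * r * r * math.sin(th) * dth * dph
    return total

A, B = 0.5, (0.3, -0.2, 0.1)
m = 2.0 * A
# Center of mass estimate: flux / (2 (n-1) omega_{n-1} m) = flux / (16 pi m); compare with B^k / A.
print([center_of_mass_flux(A, B, 1.0e4, k) / (16.0 * math.pi * m) for k in range(3)])
```

The terms coming from $A/|x|^{n-2}$ are odd on the sphere and cancel in the integral, which is one way to see why only $B$ survives in the limit.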
Thus for an asymptotically flat metric $g$ with $m \ne 0$, with respect to an
asymptotically flat chart for which $\lim_{r\to\infty} B(r)$, with $B(r)$ from (7.2.4), exists,
we define the center of mass $c^k$ as follows (compare (7.1.8)):
\[
2(n-1)\,\omega_{n-1}\, m\, c^k = \lim_{r\to\infty} \int_{\{|x|=r\}} \Big[ x^k \sum_{i,j=1}^n (h_{ij,i} - h_{ii,j})\,\nu^j - \sum_{i,j=1}^n (\delta^k_i h_{ij} - \delta^k_j h_{ii})\, \nu^j \Big]\, d\sigma. \tag{7.2.5}
\]
Just as with the center of mass, when there is enough asymptotic parity symmetry
in the data, the surface integrals approach a limit for $r \to \infty$, and we define the
angular momentum
\[
J_i = \frac{1}{8\pi} \lim_{r\to\infty} \int_{\{|x|=r\}} \pi_{jk}\, Y^k_{(i)}\, \nu^j \, d\sigma. \tag{7.2.6}
\]
Exercise 7-40 (cf. [159]). From the proof of Proposition 7-33, the limit of
integrals defining the energy converges if and only if $\lim_{r\to\infty} \int_{\{r_0 \le |x| \le r\}} R(g)\, dx$
exists. Show likewise that under condition (7.2.7) (with $\ell = 1$), the limit of
integrals defining the center of mass as in (7.2.5) converges if and only if
$\lim_{r\to\infty} \int_{\{r_0 \le |x| \le r\}} x^k R(g)\, dx$ exists, for each $k$.
Exercise 7-41. a. Verify the above claim that under the Regge–Teitelboim
conditions, one can replace π jk by K jk in (7.2.6).
b. Consider the initial data on a constant $t$-slice in the Boyer–Lindquist coordinates
used for the Kerr spacetime (2.4.6). Introduce asymptotically flat
coordinates $x = r\cos\theta\sin\phi$, $y = r\sin\theta\sin\phi$, $z = r\cos\phi$, and show that the
shift vector is given by
\[
X^1 = \frac{2yma}{r\rho^2}, \qquad X^2 = -\frac{2xma}{r\rho^2}, \qquad X^3 = 0.
\]
c. Show that this initial data set satisfies the Regge–Teitelboim conditions (7.2.7)–(7.2.8),
so that the angular momentum flux integrals converge. To do this, you
might use the ADM equation (5.3.4). Show that the nonzero component of
angular momentum is given by $J_3 = am$.
Remark 7-42. The Kerr spacetime can also be expressed in Kerr–Schild form as
\[
g_{\mu\nu} = \eta_{\mu\nu} + \frac{2m\tilde r^3}{\tilde r^4 + a^2 z^2}\, \theta_\mu \theta_\nu,
\]
where $\eta = -dt^2 + dx^2 + dy^2 + dz^2$ is the Minkowski metric, and
\[
\theta_\mu\, dx^\mu = dx^0 - \frac{1}{\tilde r^2 + a^2}\big[\tilde r\,(x\, dx + y\, dy) + a\,(x\, dy - y\, dx)\big] - \frac{z}{\tilde r}\, dz,
\]
with $\tilde r$ defined implicitly as the solution of the equation
\[
\tilde r^4 - \tilde r^2 (x^2 + y^2 + z^2 - a^2) - a^2 z^2 = 0.
\]
See [23, (37)] for the second fundamental form of the constant t-slice in these
coordinates, where it is shown that the leading order terms possess a form which
allows the angular momentum flux integrals to converge.
7.2.2. Weighted spaces. We will now introduce weighted function spaces that
are designed to capture growth or decay rates of functions and tensors near
infinity. One can generalize the notion of asymptotically flat initial data sets in
terms of these spaces. Moreover, we will want to utilize the properties of the
Laplace operator between such weighted spaces in order to achieve certain kinds
of conformal deformations on asymptotically flat manifolds. In particular we
will construct certain deformations of the scalar curvature on asymptotically flat
manifolds, with sufficient control on the change in the ADM mass induced by
such conformal transformations. We remark that in their proof of the Riemannian
positive mass theorem, Schoen and Yau [199] prove the necessary existence and
expansion of solutions to the PDE giving the required deformations of scalar
curvature directly, via a Green’s function expansion. As the theory of weighted
spaces has now been sufficiently developed and applied to the study of ADM
mass (cf. e.g., [15]), we will discuss the definitions, fundamental results, and
basic ideas behind the proofs.
For τ > 0, the exponent on the weight factor is chosen to encourage decay of u
at infinity (and also to rescale the volume measure). The norm
\[
\|u\|_{W^{k,p}_{-\tau}} = \sum_{|\gamma| \le k} \|\partial^\gamma u\|_{L^p_{-\tau-|\gamma|}}
\]
determines the spaces $W^{k,p}_{-\tau}(\mathbb{R}^n)$, considered as the closure of the space of smooth
compactly supported functions in the given norm, or equivalently defined in terms
of weak derivatives. Note that $W^{0,p}_{-\tau}(\mathbb{R}^n) = L^p_{-\tau}(\mathbb{R}^n)$, while $L^p(\mathbb{R}^n) = L^p_{-n/p}(\mathbb{R}^n)$,
and moreover for $\ell := |\gamma| \le k$, the linear map $\partial^\gamma : W^{k,p}_{-\tau}(\mathbb{R}^n) \to W^{k-\ell,p}_{-\tau-\ell}(\mathbb{R}^n)$
is bounded. The spaces $W^{k,\infty}_{-\tau}$ can be defined in the natural way starting from
$\|u\|_{L^\infty_{-\tau}} = \|\sigma^\tau u\|_{L^\infty}$, so that $\|u\|_{W^{k,\infty}_{-\tau}} = \sum_{|\gamma| \le k} \|\partial^\gamma u\|_{L^\infty_{-\tau-|\gamma|}}$. Finally, the weighting
conventions are not universal; we have chosen to follow those in [15].
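To see how the weight index tracks decay, the following Python sketch computes the truncated norm of the radial function $u = \sigma^{-a}$ over balls of increasing radius. It assumes the convention $\|u\|^p_{L^p_{-\tau}} = \int |u|^p \sigma^{\tau p - n}\, dx$ with $\sigma = (1 + |x|^2)^{1/2}$, which is the convention of [15] as best we can tell and is consistent with the identities $L^p = L^p_{-n/p}$ and $\|u\|_{L^\infty_{-\tau}} = \|\sigma^\tau u\|_{L^\infty}$ above; the function name and discretization are hypothetical. Under this assumption $u \in L^p_{-\tau}$ exactly when $a > \tau$.

```python
import math

def truncated_weighted_norm_p(a, tau, p, R, n=3, dt=0.01):
    """p-th power of the (assumed) L^p_{-tau} norm of u = sigma^{-a} over {|x| <= R}.

    Radial integration in t = log r with the trapezoid rule; r^{n-1} dr = r^n dt.
    """
    omega = 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)  # area of the unit (n-1)-sphere
    t0, t1 = -12.0, math.log(R)
    steps = int(round((t1 - t0) / dt))
    h = (t1 - t0) / steps

    def integrand(t):
        r = math.exp(t)
        sigma = math.sqrt(1.0 + r * r)
        return sigma ** ((tau - a) * p - n) * r ** n

    total = 0.5 * (integrand(t0) + integrand(t1))
    total += sum(integrand(t0 + i * h) for i in range(1, steps))
    return omega * h * total

# a > tau: the truncated norms stabilize; a < tau: they grow without bound.
for a in (1.5, 0.5):
    print(a, [truncated_weighted_norm_p(a, 1.0, 2.0, R) for R in (1e3, 1e6)])
```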
There are a number of basic properties of these spaces that we will need, some
of which will be given as exercises, which may be interspersed at appropriate
points throughout the section. For example, the next exercise is basic but essential.
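Exercise 7-43. Show that if $1 \le p \le q$ and $\tau_1 < \tau_2$, then $L^q_{-\tau_2}(\mathbb{R}^n) \subset L^p_{-\tau_1}(\mathbb{R}^n)$,
and in fact there is a constant $C > 0$ such that $\|u\|_{L^p_{-\tau_1}} \le C \|u\|_{L^q_{-\tau_2}}$ for all
$u \in L^q_{-\tau_2}(\mathbb{R}^n)$. Prove the following weighted version of Hölder’s inequality:
if $p, q, r \ge 1$ and $\frac1p = \frac1q + \frac1r$, then if $u \in L^q_{-\tau_1}(\mathbb{R}^n)$ and $v \in L^r_{-\tau_2}(\mathbb{R}^n)$, then
$uv \in L^p_{-(\tau_1+\tau_2)}(\mathbb{R}^n)$ and
\[
\|uv\|_{L^p_{-(\tau_1+\tau_2)}} \le \|u\|_{L^q_{-\tau_1}} \|v\|_{L^r_{-\tau_2}}. \tag{7-43a}
\]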
We can identify the dual spaces to $L^p_{-\tau}(\mathbb{R}^n)$ via a dual pairing, as in the next
exercise.
Exercise 7-44. Suppose that $p > 1$ and $q > 1$ satisfy $\frac1p + \frac1q = 1$. Using the dual
pairing (Riesz representation) between $L^p(dx)$ and $L^q(dx)$, identify the dual
space of $L^p_{-\tau}(\mathbb{R}^n)$ as $L^q_{\tau-n}(\mathbb{R}^n)$ by considering a pairing
$(u, v) \in L^p_{-\tau}(\mathbb{R}^n) \times L^q_{\tau-n}(\mathbb{R}^n) \mapsto \int_{\mathbb{R}^n} uv \, dx$.
In contrast to integral control from the weighted Sobolev norms, one can use
weighted Hölder norms for direct pointwise control. Indeed, let $\tau \in \mathbb{R}$, $\alpha \in (0, 1]$,
and let $\sigma(x, y) = \min(\sigma(x), \sigma(y))$. We define a weighted Hölder seminorm by
\[
[f]_{\alpha,-\tau} = \sup_{x \ne y} \sigma(x, y)^{\alpha+\tau}\, \frac{|f(x) - f(y)|}{|x - y|^\alpha},
\]
The Banach space $C^k_{-\tau}(\mathbb{R}^n)$ is comprised of those $C^k$-functions for which the
above norm is finite, while the corresponding weighted Hölder spaces $C^{k,\alpha}_{-\tau}(\mathbb{R}^n)$
are given by those functions $f \in C^{k,\alpha}_{loc}(\mathbb{R}^n)$ such that the norm
\[
\|f\|_{C^{k,\alpha}_{-\tau}} := \|f\|_{C^k_{-\tau}} + \sum_{|\gamma| = k} [\partial^\gamma f]_{\alpha,-\tau-k}
\]
is finite.
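If in addition $0 < \beta \le \alpha$ and $\tau_1 \le \tau_2$, prove that $C^{k,\alpha}_{-\tau_2}(\mathbb{R}^n) \subset C^{k,\beta}_{-\tau_1}(\mathbb{R}^n)$ is a
continuous inclusion, i.e., there is a constant $C > 0$ such that for all $u \in C^{k,\alpha}_{-\tau_2}(\mathbb{R}^n)$,
\[
\|u\|_{C^{k,\beta}_{-\tau_1}} \le C \|u\|_{C^{k,\alpha}_{-\tau_2}}.
\]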
Before moving on, we note that for tensor fields on Rn , one may use Cartesian
components to extend the above definitions to weighted spaces of tensor fields,
as well as the embeddings between the various spaces to which we will turn next.
7.2.2.1. Sobolev embeddings. There are classical inequalities associated with
Sobolev embeddings (cf., e.g., [2; 86; 107; 144]), for which there are weighted
analogues. We introduced certain such embeddings in Section 6.1, and used
them in Section 6.1.10. We will recall a suite of such embeddings here, and
then sketch how to obtain weighted analogues. The domains to which these
theorems apply, and to which we restrict our discussion here, include balls and
annuli such as $A_R = \{x \in \mathbb{R}^n : R < |x| < 2R\}$; these suffice for our purposes, so
we may assume that $\Omega$ is a ball or an annulus. For more details, please refer to
the above references, to which we defer for the proofs of these embeddings.
To prove this estimate, we can consider for $R > 0$ the rescaled function $u_R(x) = u(Rx)$,
and note that if $A_R = \{x \in \mathbb{R}^n : R < |x| < 2R\}$, then $\|u\|_{C^{0,\alpha}_{-\tau}(A_R)}$ (defined
as above but using $x, y \in A_R$) and $R^\tau \|u_R\|_{C^{0,\alpha}(A_1)}$ are equivalent for $R \ge 1$: there
is a constant $C > 1$ such that for all $R \ge 1$ and all $u \in C^{0,\alpha}_{-\tau}(\mathbb{R}^n)$,
(where the weighted Sobolev norms on $A_R$ are defined as above but only integrating
over $A_R$).
Exercise 7-46. Verify the scaling relationships (7.2.10)–(7.2.11), and then deduce
the corresponding weighted estimate (7.2.9).
Exercise 7-47 (cf. [15]). If $u \in C^{0,\alpha}_{-\tau}(\mathbb{R}^n)$, then clearly $u(x) = O(|x|^{-\tau})$. Show
that if $p > n$ and $u \in W^{1,p}_{-\tau}(\mathbb{R}^n)$, then in fact $u(x) = o(|x|^{-\tau})$ as $|x| \to \infty$, i.e.,
$\lim_{|x|\to\infty} |x|^\tau u(x) = 0$.
We can likewise establish embeddings $W^{1,p}_{-\tau}(\mathbb{R}^n) \hookrightarrow L^q_{-\tau}(\mathbb{R}^n)$ for $1 \le p < n$
and $p \le q \le p^* = \frac{np}{n-p}$; in the borderline case $p = n$, there is a continuous
embedding $W^{1,n}_{-\tau}(\mathbb{R}^n) \hookrightarrow L^q_{-\tau}(\mathbb{R}^n)$ for any $n \le q < \infty$. One proves these as
in [15], by using the inequality $\|u\|_{L^q(A_1)} \le C \|u\|_{W^{1,p}(A_1)}$ associated to the
relevant embedding of $W^{1,p}(A_1)$ into $L^q(A_1)$ and rescaling as above to prove
$\|u\|_{L^q_{-\tau}} \le C \|u\|_{W^{1,p}_{-\tau}}$. Indeed there are constants $C_1, C_2, C_3$ such that for $R \ge 1$
and for $u \in W^{1,p}_{-\tau}(\mathbb{R}^n)$, defining $u_R(x) = u(Rx)$ as above,
Compact embeddings. Many of the above embeddings are compact, the definition
of which we recalled in Section 6.1.
For example, recall that we obtain some compact embeddings by employing
the Ascoli–Arzelà theorem, a direct corollary of which is the following: if $\Omega$ is
a bounded domain, then the space $C^{0,\alpha}(\Omega)$ ($0 < \alpha \le 1$) is compactly included
in $C^0(\Omega)$. The Hölder continuity yields the required equicontinuity criterion
for Ascoli–Arzelà. Combining this observation with just a little work, you can
obtain the following.
Exercise 7-49. Show that for a bounded domain $\Omega$, $C^{0,\alpha}(\Omega)$ is compactly included
in $C^{0,\beta}(\Omega)$, for $0 < \beta < \alpha \le 1$.
For $\Omega$ a convex, bounded domain, the mean value theorem can be used together
with Ascoli–Arzelà to conclude that $C^1(\Omega)$ is compactly included in $C^{0,\beta}(\Omega)$ for
$0 < \beta < 1$, and as well in $C^0(\Omega)$; likewise for $k$ a nonnegative integer, $C^{k+1}(\Omega)$
is compactly included in $C^k(\Omega)$ and in $C^{k,\beta}(\Omega)$ for $0 < \beta < 1$, and moreover,
$C^{k,\alpha}(\Omega)$ is compactly included in $C^k(\Omega)$, and in fact in $C^{k,\beta}(\Omega)$ for $0 < \beta < \alpha \le 1$.
These results extend to compact smooth manifolds-with-boundary, such as an
annular region, by a simple covering argument.
There are also compact embeddings between certain Sobolev spaces, given
by the Rellich–Kondrachov theorem [2; 86; 107; 144]. For an example of such
an embedding, for suitable bounded domains $\Omega$, such as a ball or an annulus,
the inclusion $W^{k,p}(\Omega) \hookrightarrow W^{j,p}(\Omega)$ for $0 \le j < k$, $1 \le p < \infty$, is compact. For
a proof in the $p = 2$ case (Rellich’s lemma) using the Fourier transform and
Ascoli–Arzelà, see, e.g., [139]. Some of the Sobolev embeddings mentioned in
the preceding section are compact for suitable such domains $\Omega$, for example, for
$kp > n$, with $(k-1)p \le n < kp$, and for $\alpha$ with $0 < \alpha < k - \frac{n}{p}$, then for $\ell \in \mathbb{Z}_+$,
$W^{k+\ell,p}(\Omega)$ embeds compactly into $C^{\ell,\alpha}(\Omega)$ (and hence compactly into $C^\ell(\Omega)$
and $W^{\ell,q}(\Omega)$ for $q \ge 1$).
Lemma 7-50. For $1 \le p < \infty$, $k > j \ge 1$ and $\mu > 0$, $W^{k,p}_{-\tau-\mu}(\mathbb{R}^n)$ is compactly
included in $W^{j,p}_{-\tau}(\mathbb{R}^n)$.
Exercise 7-51. Prove the preceding lemma, employing Rellich–Kondrachov and
a diagonal argument.
Exercise 7-52. In a similar manner as the preceding, prove that $C^{0,\alpha}_{-\tau-\mu}(\mathbb{R}^n)$
is compactly included in $C^{0,\beta}_{-\tau}(\mathbb{R}^n)$, for $\mu > 0$ and $0 < \beta < \alpha \le 1$, and that
$C^1_{-\tau-\mu}(\mathbb{R}^n)$ is compactly included in $C^{0,\beta}_{-\tau}(\mathbb{R}^n)$ for $0 < \beta < 1$. Formulate higher
regularity versions as well.
7.2.2.2. Weighted spaces and asymptotically flat manifolds. One can extend
the definitions of weighted Sobolev spaces to an asymptotically flat manifold
(M, g), using asymptotically flat charts for each of the finitely many ends whose
union is M \ C , and a suitable finite covering by charts of the compact set
C ⊂ M; equivalently, one could define these spaces using the induced connection
and volume measure from a smooth background metric g̊ with g̊i j = δi j in the
asymptotically flat charts. Similar considerations apply for defining weighted
Hölder spaces on asymptotically flat manifolds, by using a suitable covering
by charts, and computing Hölder quotients for components, or using parallel
transport. We often focus on an asymptotically flat end (E , g), and can consider
weighted spaces defined on the end. We will sometimes omit the domain in the
notation for a weighted function space, for example letting $W^{k,p}_{-\tau} = W^{k,p}_{-\tau}(M, g)$.
$\partial g \in L^2(\mathcal{E})$ (the same holds at the borderline, using the proof of the Sobolev
embedding). To complete the proof that the ADM mass definition extends to this
setting, proceed as in the following exercise.
Exercise 7-53. Let $\mu > \frac{n-2}{2}$. Use $(g_{ij} - \delta_{ij}) = o(|x|^{-\mu})$ along with $g_{ij,k\ell} \in L^{p'}_{-\mu-2}$
and Hölder’s inequality to deduce that $(g_{ij} - \delta_{ij})\, g_{k\ell,ms} \in L^1(\mathcal{E})$. From here,
conclude that $(g_{ij} - \delta_{ij})\, \Gamma^k_{\ell m,s} \in L^1(\mathcal{E})$; then, expanding as in the proof of
Proposition 7-31 and using that $R(g)$ is integrable, show that the mass is defined.
Laplacian on weighted spaces. We now turn to the Fredholm theory for the
Laplace operator on functions in weighted spaces. In preparation for this, we
note that since the operator $\Delta_g$ involves the metric $g$, the regularity and decay of
the components of $g$ in an asymptotic chart are reflected in the coefficients of
the operator $\Delta_g$ and the weighted spaces between which it operates. It is simple
to keep track of this in case the decay conditions are pointwise conditions; we
illustrate more generally with the following lemma.
Lemma 7-54. Suppose $(M, g)$ is asymptotically flat, with $(g_{ij} - \delta_{ij}) \in W^{1,p'}_{-\mu}$ for
$\mu > 0$ and $p' > n \ge 3$ on each asymptotic end. Then, for $1 \le p \le p'$,
\[
\Delta_g : W^{2,p}_{-\tau} \to W^{0,p}_{-\tau-2}
\]
is a bounded linear map. Moreover, $(\Delta - \Delta_g) : W^{2,p}_{-\tau} \to W^{0,p}_{-\tau-\mu-2}$ is also
bounded on each end, or equivalently, $(\mathring\Delta - \Delta_g) : W^{2,p}_{-\tau} \to W^{0,p}_{-\tau-\mu-2}$ is bounded,
where $\mathring\Delta$ is the Laplacian for a smooth background metric $\mathring g$ as above.
We remark that when $g$ is asymptotically flat of rate $q > 0$, $p'$ can be taken as
large as desired, so there would be no restriction on $p \in [1, \infty)$ in the above.
Proof. The main step is to show that for $w \in W^{2,p}_{-\tau}$, $\Gamma^k_{ij}\, \partial w/\partial x^k \in W^{0,p}_{-\tau-\mu-2}$, with
a suitable bound on its norm. Note that $1 \le p < p'$ implies $p < \frac{pp'}{p'-p}$; if $1 \le p < n$
as well, then $\frac{pp'}{p'-p} < p^* = \frac{np}{n-p}$. Thus when $1 \le p < p'$, we can apply Sobolev
embedding to conclude that for $w \in W^{2,p}_{-\tau}$, $\partial w/\partial x^j \in W^{1,p}_{-1-\tau} \hookrightarrow W^{0,q}_{-1-\tau}$ for
$q = \frac{pp'}{p'-p}$. Since metric components are uniformly bounded near infinity in any
end by Sobolev embedding, the Christoffel symbols of $g$ in these coordinates
satisfy $\Gamma^k_{ij} \in W^{0,p'}_{-\mu-1}$, and we can apply Hölder’s inequality (7-43a) to
conclude $\Gamma^k_{ij}\, \partial w/\partial x^k \in W^{0,p}_{-\tau-\mu-2}$. To handle the case $p = p'$, we use the Sobolev
embedding $\partial w/\partial x^j \in W^{1,p}_{-1-\tau} \hookrightarrow C^{0,\alpha}_{-1-\tau}$ for $p > n$, where $\alpha = 1 - \frac{n}{p}$, to conclude
$\Gamma^k_{ij}\, \partial w/\partial x^k \in W^{0,p}_{-\tau-\mu-2}$.
By the Sobolev embedding, $(g_{ij} - \delta_{ij}) = o(|x|^{-\mu})$, and so for $w \in W^{2,p}_{-\tau}$,
\begin{align*}
\Delta_g w &= g^{ij}(w_{,ij} - \Gamma^k_{ij} w_{,k}) \in W^{0,p}_{-\tau-2}, \\
(\Delta - \Delta_g)(w) &= (\delta^{ij} - g^{ij}) w_{,ij} + g^{ij}\Gamma^k_{ij} w_{,k} \in W^{0,p}_{-\tau-\mu-2}. \qquad \square
\end{align*}
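The exponent bookkeeping in the proof of Lemma 7-54 can be checked with exact rational arithmetic. The sketch below, with a hypothetical helper name and restricted to the case $1 \le p < n < p'$, computes $q = \frac{pp'}{p'-p}$ and $p^* = \frac{np}{n-p}$ so that one can confirm $p < q < p^*$ and the Hölder relation $\frac1p = \frac1{p'} + \frac1q$ used to multiply $\Gamma^k_{ij} \in W^{0,p'}_{-\mu-1}$ by $\partial w/\partial x^k \in W^{0,q}_{-1-\tau}$.

```python
from fractions import Fraction

def holder_exponents(n, p, p_prime):
    """Return (q, p_star) with q = p p'/(p' - p) and p* = n p/(n - p), assuming 1 <= p < n < p'."""
    n, p, p_prime = Fraction(n), Fraction(p), Fraction(p_prime)
    if not (1 <= p < n < p_prime):
        raise ValueError("this sketch assumes 1 <= p < n < p'")
    q = p * p_prime / (p_prime - p)
    p_star = n * p / (n - p)
    return q, p_star

q, p_star = holder_exponents(3, 2, 4)
print(q, p_star)
```

For $n = 3$, $p = 2$, $p' = 4$ this gives $q = 4$ and $p^* = 6$, and the weights add as $(-\mu-1) + (-1-\tau) = -\tau-\mu-2$, matching the target space in the lemma.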
Exercise 7-55. Let $k \ge 2$, and suppose $(g_{ij} - \delta_{ij}) \in W^{k-1,p'}_{-\mu}$ for $\mu > 0$ and
$p' > n \ge 3$. Prove that for $2 \le k' \le k$ and $1 \le p \le p'$, $\Delta_g : W^{k',p}_{-\tau} \to W^{k'-2,p}_{-\tau-2}$ and
$(\mathring\Delta - \Delta_g) : W^{k',p}_{-\tau} \to W^{k'-2,p}_{-\tau-\mu-2}$ are bounded linear maps. To analyze the term
$\Gamma^m_{ij}\, \partial w/\partial x^m$, you can employ the product rule together with methods used above.
If instead $(g_{ij} - \delta_{ij}) \in C^{k-1,\alpha}_{-\mu}$ for $\mu > 0$ and $\alpha \in (0, 1]$, prove the analogous
statements for $\Delta_g : C^{k',\beta}_{-\tau} \to C^{k'-2,\beta}_{-\tau-2}$ and $\mathring\Delta - \Delta_g : C^{k',\beta}_{-\tau} \to C^{k'-2,\beta}_{-\tau-\mu-2}$ in weighted
Hölder spaces, with $0 < \beta \le \alpha$ and $2 \le k' \le k$. Compare Exercise 7-45.
We will now recall from [15; 155; 211] some fundamental results regarding
the Laplacian acting on scalar-valued functions in weighted spaces; see also
[39; 44; 45; 168]. We will mostly use these when solving for conformal defor-
mations of asymptotically flat metrics in the next section. You might review the
definition of a Fredholm operator from Section 6.1.
polynomial of degree at most $-\tau$. For the rest of the section, we take $1 < p < \infty$
(with or without comment).
For $p > 1$ and any $\tau$, then, $\Delta : W^{2,p}_{-\tau}(\mathbb{R}^n) \to L^p_{-\tau-2}(\mathbb{R}^n)$ has finite-dimensional
kernel. Recalling what we have seen in Section 7.1.2.3, the set
and its index depends only on the interval inside the nonexceptional set that
contains $-\tau$. More explicitly, if $m$ is a nonnegative integer, the operator
$\Delta : W^{2,p}_{-\tau}(\mathbb{R}^n) \to L^p_{-\tau-2}(\mathbb{R}^n)$ is surjective with kernel $\mathcal{K}_m$ if $m < -\tau < m + 1$, while
for $(n-2) + m < \tau < (n-2) + m + 1$ the operator has trivial kernel, and closed
range
\[
\Delta(W^{2,p}_{-\tau}(\mathbb{R}^n)) = \Big\{ f \in L^p_{-\tau-2}(\mathbb{R}^n) : \int_{\mathbb{R}^n} f(x) v(x)\, dx = 0 \text{ for all } v \in \mathcal{K}_m \Big\}.
\]
For $0 < \tau < n - 2$, $\Delta : W^{2,p}_{-\tau}(\mathbb{R}^n) \to L^p_{-\tau-2}(\mathbb{R}^n)$ is an isomorphism.
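The case distinction above can be tabulated in a few lines of Python. This is only a sketch under two assumptions that are not restated in the surviving text: that $\mathcal{K}_m$ is the space of harmonic polynomials of degree at most $m$, and that the exceptional values are the integer values of $-\tau$ lying outside the isomorphism range. The function names are hypothetical. The dimension count uses the standard formula $\binom{n+k-1}{k} - \binom{n+k-3}{k-2}$ for homogeneous harmonic polynomials of degree $k$ on $\mathbb{R}^n$.

```python
from math import comb, floor

def dim_homogeneous_harmonic(n, k):
    """Dimension of the space of homogeneous harmonic polynomials of degree k on R^n."""
    return comb(n + k - 1, k) - (comb(n + k - 3, k - 2) if k >= 2 else 0)

def dim_K(n, m):
    """Dimension of harmonic polynomials of degree at most m (assumed meaning of K_m)."""
    return sum(dim_homogeneous_harmonic(n, k) for k in range(m + 1))

def laplacian_kernel_cokernel(n, tau):
    """(dim kernel, dim cokernel) of the Euclidean Laplacian from W^{2,p}_{-tau} to L^p_{-tau-2}.

    Returns None when -tau is (assumed) exceptional. Requires n >= 3.
    """
    if 0 < tau < n - 2:
        return (0, 0)                      # isomorphism range
    if tau == floor(tau):
        return None                        # integer weight outside (0, n-2)
    if tau < 0:
        return (dim_K(n, floor(-tau)), 0)  # m < -tau < m + 1: surjective, kernel K_m
    return (0, dim_K(n, floor(tau - (n - 2))))  # range is the annihilator of K_m

for tau in (-1.5, 0.5, 1.5, 2.5):
    print(tau, laplacian_kernel_cokernel(3, tau))
```

The index, kernel dimension minus cokernel dimension, is constant on each interval between consecutive exceptional values, in line with the statement above.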
The theorem is proved in [155]; see also [168; 15]. (We note that the weighting
conventions in [155] differ from [15]. For comparison, the weighted Sobolev
spaces $M^p_{s,\delta}(\mathbb{R}^n)$ as defined in [155] are such that $W^{k,p}_{-\tau}(\mathbb{R}^n) = M^p_{k,\delta}(\mathbb{R}^n)$ for
$\tau = \delta + \frac{n}{p}$, and so the Laplace operator $\Delta : M^p_{k,\delta}(\mathbb{R}^n) \to M^p_{k-2,\delta+2}(\mathbb{R}^n)$ is a
bounded linear map.) We will not go into full details of the proof of Theorem 7-56
here, but we will make some observations and comments.
Comments on the proof. We discussed the kernel above, so we turn to the range
$\Delta(W^{2,p}_{-\tau}(\mathbb{R}^n)) \subset L^p_{-\tau-2}(\mathbb{R}^n)$. Recall that the dual space of $L^p_{-\tau-2}(\mathbb{R}^n)$ can be
identified with $L^q_{\tau+2-n}(\mathbb{R}^n)$, where $\frac1p + \frac1q = 1$ (Exercise 7-44), as follows: for
$v \in L^q_{\tau+2-n}(\mathbb{R}^n)$, $\ell_v(w) = \int_{\mathbb{R}^n} v(x) w(x)\, dx$ defines a bounded linear functional
on $L^p_{-\tau-2}(\mathbb{R}^n)$; as such, $\ell_v$ has a closed nullspace, which is codimension-one for
nontrivial $v$. Now suppose $v \in L^q_{\tau+2-n}(\mathbb{R}^n)$ annihilates the range $\Delta(W^{2,p}_{-\tau}(\mathbb{R}^n))$,
i.e., for all $u \in W^{2,p}_{-\tau}(\mathbb{R}^n)$, $\int_{\mathbb{R}^n} v\, \Delta u \, dx = 0$; in particular this holds for all smooth,
closed. That said, to establish closed range for $-\tau < 0$ nonexceptional, one
may proceed directly by solving the Poisson equation $\Delta u = f$, for $f$ in the
appropriate space, either for $f \in L^p_{-\tau-2}(\mathbb{R}^n)$ in case $0 < \tau < n - 2$, or for
$f \in {}^\perp\mathcal{K}_m \subset L^p_{-\tau-2}(\mathbb{R}^n)$ when $\tau > n - 2$; we will say a few words about this now.
Once this is established, one may apply a duality argument to handle the case
$-\tau > 0$ nonexceptional, cf. [155].
For $p > 1$, let $p' = \frac{p}{p-1}$, and suppose $a$ and $b$ satisfy $a + b > 0$. Consider the integral
kernel $K'(x, y) = |x|^{-a} |x - y|^{a+b-n} |y|^{-b}$, and the related kernel
$\tilde K(x, y) = (1 + |x|)^{-a} |x - y|^{a+b-n} (1 + |y|)^{-b}$. A workhorse lemma for the analysis of the
Laplacian $\Delta$ on weighted spaces is that $T'(u)(x) = \int_{\mathbb{R}^n} K'(x, y) u(y)\, dy$ defines a
bounded linear operator $T'$ on $L^p(\mathbb{R}^n)$ if and only if $a < \frac{n}{p}$ and $b < \frac{n}{p'}$. For a proof,
see [168, Lemma 2.1]. See also [155, Lemma A] for other references; as noted
there, a straightforward reformulation is that $\tilde T(u)(x) = \int_{\mathbb{R}^n} \tilde K(x, y) u(y)\, dy$
defines a bounded linear operator on $L^p(\mathbb{R}^n)$ if and only if $a < \frac{n}{p}$ and $b < \frac{n}{p'}$.
and note that $f \in L^p_{-\tau-2}(\mathbb{R}^n)$ if and only if $\tilde f \in L^p(\mathbb{R}^n)$, with
$C^{-1} \|f\|_{L^p_{-\tau-2}} \le \|\tilde f\|_{L^p} \le C \|f\|_{L^p_{-\tau-2}}$ for some constant $C > 1$. Since
\[
N f(x) = (1 + |x|)^{a}\, \tilde T(\tilde f)(x) = (1 + |x|)^{-\tau + \frac{n}{p}}\, \tilde T(\tilde f)(x),
\]
we conclude that $N f \in L^p_{-\tau}(\mathbb{R}^n)$ and that $N$ is bounded as claimed. We will
recall a weighted analogue of elliptic regularity in Proposition 7-61, from which
we conclude $N f \in W^{2,p}_{-\tau}(\mathbb{R}^n)$, as desired. This gives us the theorem in the
isomorphism range $0 < \tau < n - 2$, which is the case we will need.
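It may help to see why the workhorse lemma singles out exactly the isomorphism range. The sketch below takes $a = \frac{n}{p} - \tau$, as in the display for $N f$, and assumes $b = 2 - a$ so that the middle factor $|x-y|^{a+b-n}$ matches the Newtonian kernel $|x-y|^{2-n}$; the value of $b$ is our inference, not a formula quoted from the text, and the function names are hypothetical. With these choices the conditions $a < \frac{n}{p}$ and $b < \frac{n}{p'}$ reduce to $\tau > 0$ and $\tau < n - 2$.

```python
from fractions import Fraction

def workhorse_exponents(n, p, tau):
    """Exponents (a, b) with a = n/p - tau and a + b - n = 2 - n (b is an assumed choice)."""
    n, p, tau = Fraction(n), Fraction(p), Fraction(tau)
    a = n / p - tau
    b = 2 - a
    return a, b

def lemma_applies(n, p, tau):
    """True when a + b > 0, a < n/p and b < n/p' with p' = p/(p - 1), for p > 1."""
    n, p = Fraction(n), Fraction(p)
    a, b = workhorse_exponents(n, p, tau)
    p_conj = p / (p - 1)
    return a + b > 0 and a < n / p and b < n / p_conj

print(lemma_applies(3, 2, Fraction(1, 2)), lemma_applies(3, 2, Fraction(3, 2)))
```

The first call lies in the range $0 < \tau < n-2$ for $n = 3$ and the second does not, which is the dichotomy the proof relies on.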
That said, it is instructive to make some comments about the other weight
ranges. We recall that for $\tau_1 < \tau_2$, $L^p_{-\tau_2}(\mathbb{R}^n) \subset L^p_{-\tau_1}(\mathbb{R}^n)$, so that in particular,
for $\tau \ge n - 2$, $L^p_{-\tau}(\mathbb{R}^n) \subset \bigcap_{0<\sigma<n-2} L^p_{-\sigma}(\mathbb{R}^n)$. For $\tau \ge n - 2$, it follows from
the preceding discussion that $f \in L^p_{-\tau-2}(\mathbb{R}^n)$ implies $f = \Delta u$ for
$u = N f \in \bigcap_{0<\sigma<n-2} W^{2,p}_{-\sigma}(\mathbb{R}^n)$. This space contains $W^{2,p}_{-\tau}(\mathbb{R}^n)$, but it is certainly possible
that $N f$ is not in $W^{2,p}_{-\tau}(\mathbb{R}^n)$, as in the following example.
Example 7-57. Let $\varphi$ be a smooth function which is zero in a neighborhood of
the origin, and is identically one outside a compact set. Suppose $v$ is a nontrivial
homogeneous harmonic polynomial with degree $m$ and Kelvin transform $K[v]$
(Section 7.1.2.3). We let $f = \Delta u$ with $u(x) = \varphi(x) K[v](x)$; for example if $m = 0$
and $v = 1$, we have $u(x) = \varphi(x)|x|^{2-n}$. We have $u \in \bigcap_{0<\sigma<m+n-2} W^{2,p}_{-\sigma}(\mathbb{R}^n)$
but $u \notin W^{2,p}_{-\tau}(\mathbb{R}^n)$ if $\tau \ge m + n - 2$. Clearly $f \in L^p_{-\tau-2}(\mathbb{R}^n)$ for any $\tau$ (as $f$
vanishes near infinity), and so $(u - N f)$ is harmonic and lies in $L^p_{-\sigma}(\mathbb{R}^n)$ for
some $\sigma > 0$. Hence $N f = u \notin W^{2,p}_{-(m+n-2)}(\mathbb{R}^n)$.
Exercise 7-58. Continuing the previous example, show that, whereas for
$\tau > m + n - 2$, $\Delta(W^{2,p}_{-\tau}(\mathbb{R}^n))$ is annihilated by $\mathcal{K}_m$ under the dual pairing, a simple
application of Green’s identity, together with the fact that $K[v](x) = |x|^{2-n-2m} v(x)$,
shows that $\int_{\mathbb{R}^n} v(x) f(x)\, dx \ne 0$.
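The formula $K[v](x) = |x|^{2-n-2m} v(x)$ can be sanity-checked numerically: for a homogeneous harmonic polynomial $v$ of degree $m$, the Kelvin transform should again be harmonic away from the origin. The Python sketch below does this for $n = 3$ and the sample polynomial $v = x^1 x^2$ (degree $m = 2$) using a central-difference Laplacian; the helper names, sample point and step size are illustrative choices, and $|x|^{-2}$ is included as a non-harmonic control, for which $\Delta |x|^{-2} = 2|x|^{-4}$ in dimension three.

```python
import math

def kelvin_transform(v, m, n=3):
    """K[v](x) = |x|^(2 - n - 2m) v(x) for v homogeneous of degree m."""
    def Kv(x):
        r = math.sqrt(sum(c * c for c in x))
        return r ** (2 - n - 2 * m) * v(x)
    return Kv

def laplacian_fd(f, x, h=1e-3):
    """Second-order central-difference approximation of the Euclidean Laplacian at x."""
    fx = f(x)
    total = 0.0
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        total += (f(xp) + f(xm) - 2.0 * fx) / (h * h)
    return total

point = [2.0, 1.0, 0.5]
Kv = kelvin_transform(lambda x: x[0] * x[1], m=2)
print(laplacian_fd(Kv, point))                                  # close to zero
print(laplacian_fd(lambda x: 1.0 / sum(c * c for c in x), point))  # close to 2/|x|^4
```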
To prove Theorem 7-56 it remains to show that $N f \in W^{2,p}_{-\tau}(\mathbb{R}^n)$ for all $f$ in
${}^\perp\mathcal{K}_m \subset L^p_{-\tau-2}(\mathbb{R}^n)$, with $m + (n-2) < \tau < m + 1 + (n-2)$ and $m$ a nonnegative
integer. As you might imagine by now, there is an elegant way to do this using the
spherical harmonic expansion of $c_n |x - y|^{2-n}$, which for $|x| > |y|$ and $\hat x = x |x|^{-1}$
we write as
\[
c_n |x - y|^{2-n} = c_n |x|^{2-n} \sum_{k=0}^{\infty} h_k(\hat x, y)\, |x|^{-k},
\]
\[
K_m(x, y) = c_n |x - y|^{2-n} - c_n (\psi(x))^{n-2} \sum_{k=0}^{m} h_k(\hat x, y)\, (\psi(x))^k.
\]
$K'_m$ is used in [15], which works with weighted spaces defined on punctured
Euclidean space, whereas [155] applies the integral kernel $K_m$ in weighted spaces
on $\mathbb{R}^n$, as the use of $\psi(x)$ removes the singularity at $x = 0$ in the sum. Using the
workhorse lemma on $\tilde T$ cited above, with elementary estimates of $K_m$, one shows
that $K_m : L^p_{-\tau-2}(\mathbb{R}^n) \to L^p_{-\tau}(\mathbb{R}^n)$ is bounded for $m + (n-2) < \tau < m + 1 + (n-2)$.
Of course, if $f \in {}^\perp\mathcal{K}_m \subset L^p_{-\tau-2}(\mathbb{R}^n)$, then $\int_{\mathbb{R}^n} K_m(x, y) f(y)\, dy = N f(x)$, so that
$N f \in L^p_{-\tau}(\mathbb{R}^n)$, with $\Delta N f = f \in L^p_{-\tau-2}(\mathbb{R}^n)$. Just as above, one can conclude
in fact $N f \in W^{2,p}_{-\tau}(\mathbb{R}^n)$, as desired.
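For $n = 3$ the expansion of $|x-y|^{2-n}$ used above is the classical Legendre (multipole) expansion $\frac{1}{|x-y|} = \sum_{k \ge 0} \frac{|y|^k P_k(\cos\gamma)}{|x|^{k+1}}$ for $|x| > |y|$, where $\gamma$ is the angle between $x$ and $y$. Identifying $h_k(\hat x, y)$ with $|y|^k P_k(\hat x \cdot \hat y)$ in this case is our reading of the notation, and the constant $c_n$ is omitted. The Python sketch below, with hypothetical function names, compares partial sums with the exact kernel.

```python
import math

def legendre(k_max, t):
    """Values P_0(t), ..., P_{k_max}(t) via the Bonnet three-term recurrence."""
    P = [1.0, t]
    for k in range(1, k_max):
        P.append(((2 * k + 1) * t * P[k] - k * P[k - 1]) / (k + 1))
    return P[:k_max + 1]

def multipole_partial_sum(x, y, k_max):
    """Sum over k <= k_max of |y|^k P_k(cos gamma) / |x|^(k+1), an approximation of 1/|x - y|."""
    rx = math.sqrt(sum(c * c for c in x))
    ry = math.sqrt(sum(c * c for c in y))
    if ry == 0.0:
        return 1.0 / rx
    cos_gamma = sum(a * b for a, b in zip(x, y)) / (rx * ry)
    P = legendre(k_max, cos_gamma)
    return sum(ry ** k * P[k] / rx ** (k + 1) for k in range(k_max + 1))

x, y = (2.0, 0.5, -1.0), (0.3, 0.4, 0.2)
exact = 1.0 / math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
for k_max in (0, 2, 5, 25):
    print(k_max, abs(multipole_partial_sum(x, y, k_max) - exact))
```

Subtracting the first $m+1$ terms leaves a remainder that decays like $|x|^{-(m+2)}$ for fixed $y$ when $n = 3$, which is the gain in decay that the kernel $K_m$ is designed to capture.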
That the range is not closed for −τ exceptional is addressed in Exercise 7-104.
For more details and the remainder of the proof, see [15; 155]. □
Remark 7-59. Using the regularity statement in Proposition 7-61 below, one can
conclude that $\Delta : W^{k,p}_{-\tau}(\mathbb{R}^n) \to W^{k-2,p}_{-\tau-2}(\mathbb{R}^n)$ is Fredholm if $-\tau$ is nonexceptional,
$p > 1$, and $k \ge 2$. The map is surjective for $-\tau$ nonexceptional and $\tau < n - 2$,
while for $m$ a nonnegative integer and $(n-2) + m < \tau < (n-2) + m + 1$,
$\Delta(W^{k,p}_{-\tau}(\mathbb{R}^n)) = {}^\perp\mathcal{K}_m \cap W^{k-2,p}_{-\tau-2}(\mathbb{R}^n)$; the kernel is the same as in the $k = 2$
case. The analogous Fredholm properties for $\Delta : C^{k,\alpha}_{-\tau}(\mathbb{R}^n) \to C^{k-2,\alpha}_{-\tau-2}(\mathbb{R}^n)$, with
$k \ge 2$, $0 < \alpha < 1$ and $-\tau$ nonexceptional, can be established directly using the
expansion of the Newtonian potential, along with weighted Schauder estimates
and regularity, as in [157].
Exercise 7-60. Combine the expansion result in [157] with the Fredholm properties
in weighted Sobolev spaces to obtain the Fredholm properties in weighted
Hölder spaces. Begin by letting $\tau' < \tau$ be slightly less than $\tau$, so that $-\tau'$ is nonexceptional,
and letting $p > 1$ be such that the Sobolev embedding $W^{2,p}_{-\tau'}(\mathbb{R}^n) \hookrightarrow C^{0,\alpha}_{-\tau'}(\mathbb{R}^n)$
holds; note that $C^{k,\alpha}_{-\tau}(\mathbb{R}^n) \hookrightarrow W^{k,p}_{-\tau'}(\mathbb{R}^n)$. Use the weighted Hölder
estimate and regularity as in Proposition 7-63 (cf. [157, Theorem 1]) to get that
for $f \in \Delta(W^{k,p}_{-\tau'}(\mathbb{R}^n)) \cap C^{k-2,\alpha}_{-\tau-2}(\mathbb{R}^n)$, there is $w \in C^{k,\alpha}_{-\tau'}(\mathbb{R}^n)$ with $\Delta w = f$. Use
[157, Theorem 2] to conclude that $w \in C^{k,\alpha}_{-\tau}(\mathbb{R}^n)$. From here one can easily find
the image of the map $\Delta : C^{k,\alpha}_{-\tau}(\mathbb{R}^n) \to C^{k-2,\alpha}_{-\tau-2}(\mathbb{R}^n)$, and conclude that the map is
Fredholm.
The Laplacian $\Delta_g$ on asymptotically flat $(M, g)$. The extension of the Fredholm
property from Theorem 7-56 to the Laplacian $\Delta_g$ on an asymptotically
flat manifold $(M, g)$, namely that $\Delta_g : W^{2,p}_{-\tau} \to L^p_{-\tau-2}$ is Fredholm for $-\tau$
nonexceptional (and more generally for operators which are asymptotic to $\Delta$
in a suitable sense), is established in [15, Proposition 1.14, Proposition 2.2];
see also [39]. For our purposes, the following fundamental weighted elliptic
estimate (7.2.12) and regularity result for $\Delta_g$, and the weighted Hölder version in
Proposition 7-63, will suffice; compare Lemma 7-54. We emphasize the stronger
estimate (7.2.13) that holds in the weight range for which $\Delta_g$ is an isomorphism.
Proposition 7-61. Let $\tau \in \mathbb{R}$, $\mu > 0$, $k \ge 2$, $p' > n \ge 3$ and $1 < p \le p'$. Suppose
$(M^n, g)$ is asymptotically flat, admitting coordinate charts for each end in which
$(g_{ij} - \delta_{ij}) \in W^{k-1,p'}_{-\mu}$. The bounded linear operator $\Delta_g : W^{k,p}_{-\tau} \to W^{k-2,p}_{-\tau-2}$ satisfies
the following a priori weighted elliptic estimate: for some $C > 0$ and for $w \in W^{k,p}_{-\tau}$,
Moreover, if $u \in L^p_{-\tau}$ and $\Delta_g u \in W^{k-2,p}_{-\tau-2}$, then $u \in W^{k,p}_{-\tau}$.
If in addition $\tau \in (0, n - 2)$, then $\Delta_g : W^{k,p}_{-\tau} \to W^{k-2,p}_{-\tau-2}$ is an isomorphism, so
in particular, there is a $C > 0$ such that for all $w \in W^{k,p}_{-\tau}$,
Thus, if $0 < \tau_1 < \tau_2 < n - 2$, with $u \in L^p_{-\tau_1}$ and $\Delta_g u \in W^{k-2,p}_{-\tau_2-2}$, then $u \in W^{k,p}_{-\tau_2}$.
Remark 7-62. Aside from the various parameters in the proposition, the constant
$C$ in (7.2.12) depends on the ellipticity constant $\lambda > 0$ such that $\lambda |\xi|^2 \le g^{ij}\xi_i\xi_j \le \lambda^{-1}|\xi|^2$
and on the norm $\|g_{ij} - \delta_{ij}\|_{W^{k-1,p'}_{-\mu}}$ (defined using a suitable covering),
Given any $f \in C^{k-2,\alpha}_{-n} \cap L^1$, there is a unique $w \in C^{k,\alpha}_{2-n}$ such that $\Delta_g w = f$.
Comments on the proof. As with the Sobolev case, (7.2.14) is established using
interior estimates and scaling. Given what we saw earlier about the role of the
relevant integral kernels in the analysis of $\Delta$ on weighted Sobolev spaces, it
should not be surprising that the injectivity estimates (7.2.15) and (7.2.16) can
be obtained using the fundamental solution, as we will indicate shortly for the
Euclidean case. Thus to establish these two estimates in general, one approach is
to show there is a fundamental solution for $\Delta_g$ with the same asymptotics as that
for $\Delta$, and compute with it as we will indicate below for the Euclidean metric,
cf. [211].
On the proof of (7.2.15) and (7.2.16) for $\Delta$. Recall that the Newtonian potential
is defined by $N f(x) = \int_{\mathbb{R}^n} c_n |x - y|^{2-n} f(y)\, dy$.
Lemma 7-64. For $0 < \tau < n - 2$, there is a constant $C > 0$ such that for
all $f \in C^0_{-\tau-2}(\mathbb{R}^n)$, it follows that $f(y)|x - y|^{2-n} \in L^1(\mathbb{R}^n, dy)$, and in fact
$N f \in C^0_{-\tau}(\mathbb{R}^n)$ with $\|N f\|_{C^0_{-\tau}} \le C \|f\|_{C^0_{-\tau-2}}$.
Exercise 7-65. Prove the preceding lemma. Note that this claim involves deriving
a supremum estimate and establishing continuity of N f . To show existence and
continuity of N f , you might break up the integral for N f (x) into two parts
centered around x; continuity will follow from an application of dominated
convergence. To show the C−τ 0 -estimate, for |x| > 0, you might break the integral
{y : |x − y| ≥ 2|x|}. Remark where you use the conditions 0 < τ and τ < n − 2.
The case τ = n − 2 is handled similarly, as in the next lemma.
Lemma 7-66. There is a constant $C > 0$ such that for all $f \in C^0_{-n}(\mathbb{R}^n) \cap L^1(\mathbb{R}^n)$,
it follows that $f(y)|x-y|^{2-n} \in L^1(\mathbb{R}^n, dy)$, and in fact $Nf \in C^0_{2-n}(\mathbb{R}^n)$ with
$\|Nf\|_{C^0_{2-n}} \le C\bigl(\|f\|_{C^0_{-n}} + \|f\|_{L^1}\bigr)$.
Exercise 7-67. Prove the preceding lemma. To do this, for $|x| > 0$, you might
break the integral into two parts: over $\{y : |x-y| \le \frac{|x|}{2}\}$ and $\{y : |x-y| \ge \frac{|x|}{2}\}$.
$f \in L^1(\mathbb{R}^n)$ as well, but we cannot conclude $Nf \in C^0_{-\tau}(\mathbb{R}^n)$. Where does the
obstruction appear in the proofs of the lemmas? We see that with $\tau > n-2$, for
general $f \in C^{0,\alpha}_{-\tau-2}(\mathbb{R}^n)$, one can conclude $Nf \in C^{2,\alpha}_{2-n}(\mathbb{R}^n)$ and $\Delta Nf = f$, but
we cannot conclude $Nf \in C^{2,\alpha}_{-\tau}(\mathbb{R}^n)$. The reader should keep in mind that $|x|^{2-n}$
is harmonic on $\mathbb{R}^n \setminus \{0\}$, and gives the leading order rate of decay for functions
which are harmonic at infinity in Euclidean space; as such, you can let $w$ be a
smooth function which is equal to $|x|^{2-n}$ outside a ball, and $f = \Delta w$ is in any of
the weighted spaces under discussion. If $\Delta v = f$ with $\lim_{|x|\to\infty} v(x) = 0$, then
$w = v$ by the maximum principle applied to the harmonic function $w - v$. For
more on the asymptotics of solutions of the Poisson equation in weighted Hölder
spaces, please see [157]. □
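The harmonicity of $|x|^{2-n}$ used in this discussion can be sanity-checked numerically. The following is only a minimal sketch and is not part of the original text: the helper names `radial_laplacian` and `power_decay` are hypothetical, and the check relies on the standard formula $\Delta f = f'' + \frac{n-1}{r} f'$ for a radial function on $\mathbb{R}^n$, approximated by central finite differences.

```python
def radial_laplacian(f, r, n, h=1e-3):
    """Approximate the Euclidean Laplacian of a radial function f(|x|) at
    radius r in dimension n, via f'' + (n - 1)/r * f' and central differences."""
    d1 = (f(r + h) - f(r - h)) / (2.0 * h)
    d2 = (f(r + h) - 2.0 * f(r) + f(r - h)) / (h * h)
    return d2 + (n - 1) / r * d1


def power_decay(tau):
    """Radial profile r -> r**(-tau) (hypothetical helper for illustration)."""
    return lambda r: r ** (-tau)


if __name__ == "__main__":
    for n in (3, 4, 5):
        # tau = n - 2 is the harmonic rate: the result should vanish up to
        # discretization error.
        harmonic = radial_laplacian(power_decay(n - 2), 2.0, n)
        # Any other rate tau gives tau * (tau + 2 - n) * r**(-tau - 2), which
        # is nonzero for tau not in {0, n - 2}.
        other = radial_laplacian(power_decay(n - 1.5), 2.0, n)
        print(n, harmonic, other)
```

For a rate $\tau \notin \{0, n-2\}$ the same computation gives $\Delta |x|^{-\tau} = \tau(\tau+2-n)|x|^{-\tau-2} \neq 0$, which is one way to see why $|x|^{2-n}$ is singled out as the leading harmonic rate.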
The isomorphism result is often applied as follows:
Corollary 7-68. Suppose $0 < \tau < n-2$, $\mu > 0$, $k \ge 2$, $p' > n \ge 3$ and $1 < p \le p'$.
Suppose $(M^n, g)$ is asymptotically flat, admitting coordinate charts for each end
in which $(g_{ij} - \delta_{ij}) \in W^{k-1,p'}_{-\mu}$. Furthermore, suppose $\nu > 2$ and $h \in W^{k-2,\infty}_{-\nu}$.
If the operator $\Delta_g - h : W^{k,p}_{-\tau} \to W^{k-2,p}_{-\tau-2}$ is injective, such as is the case for
$\|h\|_{W^{k-2,\infty}_{-\nu}}$ sufficiently small, then it is an isomorphism, and so there is a $C > 0$
such that, for all $w \in W^{k,p}_{-\tau}$,
\[
\|hu\|_{W^{k-2,p}_{-\tau-2}} \le C\|h\|_{W^{k-2,\infty}_{-\nu}} \|u\|_{W^{k-2,p}_{-\tau-2+\nu}} \le C\|h\|_{W^{k-2,\infty}_{-\nu}} \|u\|_{W^{k-2,p}_{-\tau}}. \tag{7.2.18}
\]
the operator $\Delta_g - h : C^{k,\alpha}_{-\tau} \to C^{k-2,\alpha}_{-\tau-2}$ is injective, as for example when $\|h\|_{C^{k-2,\alpha}_{-\nu}}$
Now suppose that the only solution $w$ of $(\Delta_g - h)w = 0$ with $w \in C^{k,\alpha}_{2-n}$ and
$\Delta_g w \in L^1$ is the trivial solution $w = 0$. Then for any $f \in C^{k-2,\alpha}_{-n} \cap L^1$, there is a
unique $w \in C^{k,\alpha}_{2-n}$ such that $(\Delta_g - h)w = f$. There is a constant $C > 0$ such that
for all $w \in C^{k,\alpha}_{2-n}$
$(\Delta - \Delta_g)(v) \in W^{0,p}_{-2\delta-\tau-2}$. Hence $\Delta v \in W^{0,p}_{-2\delta-\tau-2}$, and so if $\ell \ge 2$, we have
$v \in W^{2,p}_{-2\delta-\tau}$.
We can continue this way until we get to $\Delta v \in W^{0,p}_{-(\ell+1)\delta-\tau-2}$. By the Fredholm
property, $\Delta : W^{2,p}_{-(\ell+1)\delta-\tau} \to W^{0,p}_{-(\ell+1)\delta-\tau-2}$ has finite codimension, which together
with the density of smooth functions with compact support, implies there is a
finite-dimensional subspace $S \subset C_c^\infty$ with $\Delta\bigl(W^{2,p}_{-(\ell+1)\delta-\tau}\bigr) \oplus S = W^{0,p}_{-(\ell+1)\delta-\tau-2}$.
Thus we conclude there is an $R > 0$ and a $w \in W^{2,p}_{-(\ell+1)\delta-\tau}$ such that $\Delta(v-w) = 0$
on $E_R := \{|x| > R\}$. As $v$ and $w$ both decay, $v - w$ is harmonic at infinity and so
as in Section 7.1.2.4, $v - w$ admits a spherical harmonic expansion in $E_R$, with
the first term in the expansion of the form $A/|x|^{n-2}$ for some constant $A$.
Now we apply Sobolev embedding (using $p > \frac{n}{2}$) to complete the argument.
Note that since $A/|x|^{n-2} \in W^{2,p}_{-\ell\delta-\tau}(E_R) \setminus W^{2,p}_{-(\ell+1)\delta-\tau}(E_R)$ if $A \ne 0$, we cannot
in general go further due to the degree of the leading harmonic. □
Of particular importance in applying this below is the ability to differentiate
the expansion and improve decay by a power of |x|−1 for each successive partial
derivative. We can formulate the above proposition in appropriate higher regu-
larity weighted spaces, so that the error term and some number of its derivatives
enjoy suitable decay. The pointwise estimates will come via Sobolev embedding.
Exercise 7-71. a. Assume that $p > n$ in Proposition 7-70. Prove that $v(x) =
A/|x|^{n-2} + O_1(|x|^{-(n-2)-\gamma})$ for some $A$ and $\gamma > 0$.
b. Formulate and prove an analogue of Proposition 7-70 whose conclusion is
that $v(x) = A/|x|^{n-2} + O_2(|x|^{-(n-2)-\gamma})$ for some $A$ and $\gamma > 0$.
We can also get pointwise estimates using weighted Hölder spaces and
Schauder estimates from Proposition 7-63; see [157, Theorem 2].
Proposition 7-72. Suppose $E$ is an asymptotically flat end in $(M^n, g)$, admitting
asymptotic coordinates $x$ in which $(g_{ij} - \delta_{ij}) \in C^{k-1,\alpha}_{-\mu}(E)$ for some $\mu > 0$ and
$k \ge 2$. If $\beta > n \ge 3$ and $\tau \in (0, n-2)$, and if $v \in C^0_{-\tau}(E)$ and $\Delta_g v \in C^{k-2,\alpha}_{-\beta}(E)$,
then there is a constant $A$ and $\gamma > 0$ such that
\[
v(x) = \frac{A}{|x|^{n-2}} + O_{k,\alpha}\bigl(|x|^{-(n-2)-\gamma}\bigr).
\]
Sketch of proof. Argue as in the proof of Proposition 7-70, with $\delta$ as chosen there.
Then $\Delta v \in C^{k-2,\alpha}_{-\delta-\tau-2}$. Recall Remark 7-59. □
Thus it suffices to prove that the multiplication maps $W^{k,p}_{-\tau} \times W^{k,p}_{-\tau} \to W^{k,p}_{-\tau}$,
$W^{k,p}_{-\tau} \times W^{k-2,p}_{-\tau-2} \to W^{k-2,p}_{-\tau-2}$ and $W^{k-1,p}_{-\tau-1} \times W^{k-1,p}_{-\tau-1} \to W^{k-2,p}_{-\tau-2}$ are continuous
maps. We leave this as an exercise for the reader, as these are weighted versions
of the analogous statements analyzed in the proof of Proposition 6-54.
We now state an asymptotic version of Theorem 6-52; cf. [89].
The directions transverse to the kernel in the above proof are tangent to
conformal deformations. For a splitting analogous to (6.3.2), see Exercise 7-109.
Proposition 7-75. Let $p > 1$ and $k \ge 2$ be such that $kp > n \ge 3$, and consider a
(smooth) asymptotically flat manifold $(M^n, g)$ (of rate $q > 0$ and order $k$). For
$\tau \in (0, n-2)$ with $\tau < q$, there is an $\varepsilon > 0$ such that if $\|S\|_{W^{k-2,p}_{-\tau-2}} < \varepsilon$, there is
a function $u > 0$ with $u(x) \to 1$ as $|x| \to \infty$, and a smooth symmetric $(0,2)$-tensor
$h$ compactly supported in $M$, such that $\gamma := u^{\frac{4}{n-2}} g + h$ is a metric and
$R(\gamma) = R(g) + S$. If $q + \tau > n-2$, and $S \in W^{0,p'}_{-\delta'}$ for some $p' > \frac{n}{2}$ and $\delta' > n$,
then $u$ admits an expansion $u(x) = 1 + A/|x|^{n-2} + O(|x|^{-\beta})$ in any end, for
some $\beta > n-2$.
Proof. Let $c_n = \frac{n-2}{4(n-1)}$. Let $T$ be the nonlinear operator defined on the open
subset of $W^{k,p}_{-\tau}$ where $v > -1$ by
\[
T(v) = R\bigl((1+v)^{\frac{4}{n-2}} g\bigr) = (1+v)^{-\frac{n+2}{n-2}} \bigl( R(g)(1+v) - c_n^{-1}\Delta_g(1+v) \bigr).
\]
$T$ is a smooth map to $W^{k-2,p}_{-\tau-2}$, by arguments similar to those showing that the
scalar curvature map is smooth. Of course $T(0) = R(g)$, and the linearization
$DT|_0$ at $v = 0$ is given by
\[
DT|_0(w) = R(g)w - c_n^{-1}\Delta_g w - \tfrac{n+2}{n-2}\, w R(g) = -\bigl( c_n^{-1}\Delta_g w + \tfrac{4}{n-2}\, w R(g) \bigr).
\]
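The formula for $DT|_0$ is a one-line computation. As a sketch, differentiating $T(tw)$ at $t = 0$ with the product rule, and using $\Delta_g 1 = 0$, gives:

```latex
\begin{align*}
\frac{d}{dt}\Big|_{t=0} T(tw)
 &= -\tfrac{n+2}{n-2}\, w\,\bigl(R(g) - c_n^{-1}\Delta_g 1\bigr)
    + \bigl(R(g)\,w - c_n^{-1}\Delta_g w\bigr) \\
 &= \Bigl(1 - \tfrac{n+2}{n-2}\Bigr)\, w\, R(g) - c_n^{-1}\Delta_g w
  = -\Bigl(c_n^{-1}\Delta_g w + \tfrac{4}{n-2}\, w\, R(g)\Bigr).
\end{align*}
```

The first term comes from differentiating the prefactor $(1+tw)^{-\frac{n+2}{n-2}}$, the second from differentiating the bracket.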
For $S$ suitably small, we would like to solve $T(v) = R(g) + S$, for $v$ small,
using the inverse function theorem. Let $2 \le k' \le k$. By the asymptotics of $g$,
$w \mapsto wR(g)$ gives a bounded linear map from $W^{k',p}_{-\tau}$ to $W^{k'-2,p}_{-\tau-q-2}$, which is readily
seen to be compact as a map to $W^{k'-2,p}_{-\tau-2}$ by Rellich's lemma (Lemma 7-50), as
in the proof of Corollary 7-68. Thus, $DT|_0 : W^{k',p}_{-\tau} \to W^{k'-2,p}_{-\tau-2}$ is a Fredholm
operator of zero index, as a compact perturbation of the isomorphism $-c_n^{-1}\Delta_g$.
We use a method from [69] to find $h$. We first show that the linearization
$L_g := DR_g : W^{2,p}_{-\tau} \to W^{0,p}_{-\tau-2}$ of the scalar curvature operator is surjective,
for $\tau \in (0, n-2)$. The range of $L_g$ contains $DT|_0(W^{2,p}_{-\tau})$, and so it has finite
codimension, and hence is closed. Suppose a dual element $f \in W^{0,q'}_{\tau+2-n}$, $q' = \frac{p}{p-1}$,
annihilates the range of $L_g$. Then $0 = L_g^* f = -(\Delta_g f)g + \operatorname{Hess}_g f - f\operatorname{Ric}(g)$.
By elliptic regularity, $f$ is smooth and as we proved earlier (see Section 2.4.5),
if $f$ were nontrivial, $R(g)$ would be constant (since $M$ is connected), so that by
asymptotic flatness, $R(g) = 0$. However in case $R(g)$ vanishes identically, we
see $\Delta_g f = 0$, and so $f = 0$ (since $0 < n-2-\tau < n-2$). Thus in any case we
have $f = 0$, showing that $L_g$ is surjective as desired.
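The step from vanishing scalar curvature to harmonicity of $f$ in the argument above can be made explicit by taking the trace of the equation $L_g^* f = 0$. A short sketch, using only $\operatorname{tr}_g g = n$, $\operatorname{tr}_g \operatorname{Hess}_g f = \Delta_g f$ and $\operatorname{tr}_g \operatorname{Ric}(g) = R(g)$:

```latex
\begin{align*}
0 = \operatorname{tr}_g\bigl(L_g^* f\bigr)
  = -n\,\Delta_g f + \Delta_g f - f\,R(g)
  = -(n-1)\,\Delta_g f - f\,R(g).
\end{align*}
```

In particular, when $R(g) \equiv 0$ this reduces to $\Delta_g f = 0$.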
By the density of $C_c^\infty$, we let $\omega_1, \dots, \omega_\ell$ be compactly supported smooth
$(0,2)$-tensors for which $\operatorname{span}\{L_g\omega_1, \dots, L_g\omega_\ell\}$ is a complementary subspace
to $DT|_0(W^{2,p}_{-\tau})$ in $W^{0,p}_{-\tau-2}$. This span is easily seen to form a complementary
subspace to $DT|_0(W^{k,p}_{-\tau})$ in $W^{k-2,p}_{-\tau-2}$, by elliptic regularity (Proposition 7-61)
applied to $DT|_0(w) + \sum_{i=1}^{\ell} c_i L_g\omega_i \in W^{k-2,p}_{-\tau-2}$. (Alternatively, since a distributional
solution of $L_g^* f = 0$ is smooth, essentially the same argument as above
shows $L_g := DR_g : W^{k,p}_{-\tau} \to W^{k-2,p}_{-\tau-2}$ is surjective.)
We let $X$ be a complementary space to the kernel of $DT|_0$, and we define,
for suitably chosen neighborhoods $Y$ of $0 \in X$ and $Z$ of $0 \in \operatorname{span}\{\omega_1, \dots, \omega_\ell\}$,
the map $\mathcal{T} : Y \oplus Z \to W^{k-2,p}_{-\tau-2}$ by $\mathcal{T}(v, h) = R\bigl((1+v)^{\frac{4}{n-2}} g + h\bigr)$. Note that
$\mathcal{T}(v, 0) = T(v)$, so that $D\mathcal{T}|_{(0,0)}(w, 0) = DT|_0(w)$, while $D\mathcal{T}|_{(0,0)}(0, \omega) = L_g\omega$.
Thus by design, $\mathcal{T}$ is a smooth map for which $D\mathcal{T}|_{(0,0)}$ is an isomorphism. From
here, the result follows from the inverse function theorem, and the standard
asymptotic expansion follows as in Proposition 7-70 (see Exercise 7-108). □
This result gives us good control on u and h for small S; however, for our
purposes in studying the ADM energy, the above result does not suffice, since we
do not get control on the sign of A. As we will see, for suitably small R(g), or
for R(g) ≥ 0, we can avoid the difficulty encountered in the above proof, which
accounts for the possibility that the linearization is not invertible. Note that when
we work in the smooth setting and carry out constructions like that above, we
should analyze the regularity of the result. In the preceding, for example, one
can readily show that if S were smooth, then the solution v would be smooth
outside the support of h. To guarantee this inside the support of h, one could
increase k if needed and use Sobolev embedding (to control the nonlinear terms)
and elliptic bootstrapping.
We now consider what happens to the ADM energy if we conformally trans-
form away the matter density in a time-symmetric asymptotically flat initial data
set. In the time-symmetric setting, the ADM linear momentum vanishes, and we
will let m(g) be the ADM mass (energy) for a chosen end of an asymptotically
flat (M n , g).
Proposition 7-76. Suppose $(M^n, g)$, $n \ge 3$, is asymptotically flat at rate $q > \frac{n-2}{2}$
and order $k+1 \ge 3$, with nonnegative scalar curvature $R(g) \in L^1(M)$. There is
a conformally related metric $\bar g = u^{\frac{4}{n-2}} g$ which is asymptotically flat (of order $k$)
and has $R(\bar g) = 0$, with $m(\bar g) \le m(g)$ (in each end).
Proof. Recall the conformal Laplacian $\mathcal{L}_g = \Delta_g - \frac{n-2}{4(n-1)} R(g)$, a self-adjoint
linear operator. By (6.1.8), for $u > 0$ and $\bar g = u^{\frac{4}{n-2}} g$, $R(\bar g)$ vanishes if and only
if $\mathcal{L}_g u = 0$. This is a linear equation, and we study $\mathcal{L}_g$ acting on a weighted
function space.
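Consider $\tau \ge \frac{n-2}{2}$, and suppose that $v \in C^{k,\alpha}_{-\tau}$ solves $\mathcal{L}_g v = 0$. We have
$R(g)v \in C^{k-2,\alpha}_{-q-2-\tau}$ and $q + \tau + 2 > n$. From Proposition 7-72, we see that
$v\,\partial v/\partial x^j = O(|x|^{-2n+3})$ in asymptotic coordinates $x$, so that since $n \ge 3$,
$\lim_{r\to\infty} \int_{\{|x|=r\}} v\,(\partial v/\partial\nu)\,d\sigma = 0$, and hence $\lim_{r\to\infty} \int_{\{|x|=r\}} v\,(\partial v/\partial\nu_g)\,d\sigma_g = 0$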
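The vanishing of the boundary term is a matter of counting powers of $r$. As a sketch: Proposition 7-72 gives $v = O(|x|^{2-n})$ and $\partial v = O(|x|^{1-n})$, and the coordinate sphere $\{|x| = r\}$ has Euclidean area $\omega_{n-1} r^{n-1}$ (with $\omega_{n-1}$ the area of the unit sphere), so for some constant $C$,

```latex
\[
\Bigl|\int_{\{|x|=r\}} v\,\frac{\partial v}{\partial\nu}\,d\sigma\Bigr|
 \le C\,r^{-2n+3}\cdot\omega_{n-1}\,r^{n-1}
 = C\,\omega_{n-1}\,r^{2-n} \longrightarrow 0
 \qquad (r\to\infty,\ n\ge 3).
\]
```

The same count applies with $\nu_g$ and $d\sigma_g$, since these differ from their Euclidean counterparts by factors tending to $1$ at infinity.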
In the case of multiple ends, one could isolate one end at a time, by considering
a comparison principle on a manifold-with-boundary which contains only one
asymptotic end, the chosen asymptotic end, see Exercise 7-94. The exercise
invokes a solution to a certain Dirichlet boundary value problem with asymptotic
conditions, and we note that one can modify the theory of weighted spaces to
allow boundary, see e.g., [152]. In fact we remark that Schoen and Yau [199]
give a direct proof for the existence and expansion of solutions of the analogous
Neumann boundary value problem.
Note that the above integration by parts formula would hold for a conformal
factor u satisfying the PDE Lg u = 0 on a manifold with compact boundary
and one asymptotic end, such that u goes to 1 at infinity in the end, and with
∂u/∂ν = 0 on the boundary. One could apply this together with (7.2.21) to
handle the multiple end case; cf. [142, pp. 88–89].
Remark 7-78. We adapt an argument from [199] to show that the conformal
Laplacian $\mathcal{L}_g : W^{2,p}_{-\tau} \to W^{0,p}_{-\tau-2}$ where $\tau \in (0, n-2)$, $p > \frac{n}{2}$ is invertible for
where $2^* = \frac{2n}{n-2}$ is the Sobolev conjugate exponent to 2, in dimension $n > 2$.
For $R(g)_-$ small in $L^{n/2}$, we see that $w$ must be zero by the Sobolev inequality:
$\|w\|_{L^{2^*}(dv_g)} \le C\|\nabla w\|_{L^2(dv_g)}$, and therefore $\mathcal{L}_g : W^{2,p}_{-\tau} \to W^{0,p}_{-\tau-2}$ is an
isomorphism.
For analyzing the question of the sign of the ADM mass of time-symmetric
initial data sets with the dominant energy condition, it thus suffices to study the
vacuum case. In the next section we reduce further to the case of harmonically
flat asymptotics.
Schoen and Yau [202] showed that asymptotically flat metrics with zero scalar
curvature can be approximated by metrics which are harmonically flat on the
asymptotic ends. We sketch the proof of a modestly strengthened statement,
slightly modifying and improving the treatment from [68]. In this section and
the next, asymptotically flat metrics have a rate $q > \frac{n-2}{2}$.
For the next proposition, we let $X^2_{-\tau}$ be either $C^{2,\alpha}_{-\tau}(M)$ (only if the order is at
least 3) or $W^{2,p}_{-\tau}(M)$, with $p > \frac{n}{2}$, $\frac{n-2}{2} < \tau < \min(q, n-2)$, $0 < \alpha < 1$. Likewise
$X^0_{-\tau-2}$ stands for either $C^{0,\alpha}_{-\tau-2}(M)$ or $W^{0,p}_{-\tau-2}(M)$, as appropriate.
Proof. By Proposition 7-76, the second claim follows from the first, the proof of
which will in fact yield R(ḡ) = 0 in the case R(g) = 0 to start.
Let 0 ≤ ψ ≤ 1 be a smooth cutoff function so that ψ(t) = 1 for t < 1 and
ψ(t) = 0 for t > 2. On each end we choose asymptotically flat coordinates defined
for $|x| > 1$. For $\theta > 1$, let $\psi_\theta(x) = \psi(|x|\theta^{-1})$; $\psi_\theta$ extends smoothly from the ends
to all of $M$. Now consider the metric $g_\theta(x) = \psi_\theta(x)g(x) + (1 - \psi_\theta(x))g_{\mathbb{E}^n}(x)$.
This metric is identical to the Euclidean metric for $|x| > 2\theta$, and (as $q > \tau$)
$g_\theta$ can be made arbitrarily close to $g$ in $X^2_{-\tau}$, for $\theta$ sufficiently large. Note
that R(gθ ) ≥ 0 except possibly for θ < |x| < 2θ , and we will use a conformal
deformation to impose nonnegative scalar curvature.
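The cutoff interpolation between $g$ and the Euclidean metric can be illustrated concretely. The following is a minimal sketch under stated assumptions, not the authors' construction: the names `psi`, `psi_theta` and `glued_metric` are hypothetical, the particular cutoff (built from $e^{-1/t}$) is just one standard choice satisfying $\psi = 1$ for $t \le 1$ and $\psi = 0$ for $t \ge 2$, and a metric is represented by a callable returning its matrix of components in the asymptotic coordinates.

```python
import math


def _bump(t):
    """Smooth function that vanishes for t <= 0 and is positive for t > 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0


def psi(t):
    """Smooth cutoff with psi(t) = 1 for t <= 1 and psi(t) = 0 for t >= 2."""
    a = _bump(2.0 - t)
    b = _bump(t - 1.0)
    return a / (a + b)  # a + b > 0 for every t, since 2 - t > 0 or t - 1 > 0


def psi_theta(x, theta):
    """Rescaled cutoff psi(|x| / theta)."""
    r = math.sqrt(sum(c * c for c in x))
    return psi(r / theta)


def glued_metric(g, x, theta):
    """Components of psi_theta * g + (1 - psi_theta) * (Euclidean metric) at x.

    `g` is a callable x -> n-by-n nested list of metric components
    (an illustrative representation, assumed here)."""
    n = len(x)
    w = psi_theta(x, theta)
    gx = g(x)
    return [[w * gx[i][j] + (1.0 - w) * (1.0 if i == j else 0.0)
             for j in range(n)] for i in range(n)]
```

With this representation, `glued_metric(g, x, theta)` returns the components of `g` when $|x| \le \theta$ and the identity matrix when $|x| \ge 2\theta$; only the annulus $\theta < |x| < 2\theta$ sees a genuine interpolation.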
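From (6.1.8) we have
\[
R\bigl(u^{\frac{4}{n-2}} g_\theta\bigr) = -\frac{4(n-1)}{n-2}\, u^{-\frac{n+2}{n-2}} \Bigl( \Delta_{g_\theta} u - \frac{n-2}{4(n-1)} R(g_\theta)\, u \Bigr).
\]
We want to impose $R\bigl(u^{\frac{4}{n-2}} g_\theta\bigr) = \psi_\theta u^{-\frac{4}{n-2}} R(g)$, i.e.,
\[
\Delta_{g_\theta} u - \frac{n-2}{4(n-1)} \bigl( R(g_\theta) - \psi_\theta R(g) \bigr) u = 0.
\]
With $u = 1 + v$ this becomes
\[
\Delta_{g_\theta} v - \frac{n-2}{4(n-1)} \bigl( R(g_\theta) - \psi_\theta R(g) \bigr) v = \frac{n-2}{4(n-1)} \bigl( R(g_\theta) - \psi_\theta R(g) \bigr). \tag{7.3.1}
\]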
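For the reader checking the reduction to a linear equation, here is the intermediate step as a sketch. Equating the conformal transformation formula with the prescribed right-hand side and multiplying through by $-\frac{n-2}{4(n-1)}\, u^{\frac{n+2}{n-2}}$ gives:

```latex
\begin{align*}
-\tfrac{4(n-1)}{n-2}\,u^{-\frac{n+2}{n-2}}
  \Bigl(\Delta_{g_\theta}u - \tfrac{n-2}{4(n-1)}R(g_\theta)\,u\Bigr)
  &= \psi_\theta\,u^{-\frac{4}{n-2}}R(g) \\
\Longleftrightarrow\quad
\Delta_{g_\theta}u - \tfrac{n-2}{4(n-1)}R(g_\theta)\,u
  &= -\tfrac{n-2}{4(n-1)}\,\psi_\theta R(g)\,u^{\frac{n+2}{n-2}-\frac{4}{n-2}}
   = -\tfrac{n-2}{4(n-1)}\,\psi_\theta R(g)\,u .
\end{align*}
```

The exponents combine to $1$, which is why the factor $u^{-\frac{4}{n-2}}$ is included in the prescribed scalar curvature: it makes the equation for $u$ linear. Substituting $u = 1 + v$ and using $\Delta_{g_\theta} 1 = 0$ moves the zeroth-order term acting on the constant $1$ to the right-hand side.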
Note that $h_\theta := \frac{n-2}{4(n-1)} \bigl( R(g_\theta) - \psi_\theta R(g) \bigr)$ vanishes unless $\theta < |x| < 2\theta$, with
$|h_\theta| \le C\theta^{-2-q}$. In fact since $q > \tau$, for sufficiently large $\theta$, $h_\theta$ has small norm
in $X^0_{-\tau-2}$. Thus, $\Delta_{g_\theta} - h_\theta$ is a small perturbation of the invertible operator
$\Delta_g : X^2_{-\tau} \to X^0_{-\tau-2}$, and thus it is an isomorphism, with inverse bounded
uniformly in $\theta$, cf. Corollaries 7-68 and 7-69; in particular, the constant in
(7.2.17) or (7.2.19) can be chosen uniformly in $h$ suitably small. Thus for $\theta$
large enough, we can solve (7.3.1), and the solution $v_\theta$ will be smooth and close
to zero in $X^2_{-\tau}$, so that $u_\theta = 1 + v_\theta > 0$, while $\bar g_\theta = u_\theta^{4/(n-2)} g_\theta$ is close in the
weighted norm to $g_\theta$ and hence to $g$, and is harmonically flat near infinity in
each end.
Upon further restricting to $\tau > \frac{n-2}{2}$, the masses will also be close for large $\theta$.
We will sketch the idea of the proof following [202; 197] using the Hölder space
for $v_\theta$, cf. [78; 119] for the Sobolev case. From (7.2.2) and Propositions 7-33
and 7-31, we know that $m(g)$ equals
\[
\frac{1}{2(n-1)\omega_{n-1}} \Bigl[ \int_{\{|x|=r_1\}} \sum_{i,j=1}^n (g_{ij,i} - g_{ii,j})\,\nu^j\, d\sigma + \int_{\{r_1 \le |x|\}} \bigl( R(g) + Q_g \bigr)\, dx \Bigr],
\]
where we have set $\sum_{i,j=1}^n (g_{ij,ij} - g_{ii,jj}) = R(g) + Q_g$.
From the expansion of the scalar curvature (Proposition 7-31; cf. Exercise
7-38) we have $Q_g(x) = O(|x|^{-2q-2})$, so that $R(g) + Q_g$ is integrable on the
end. Given $\varepsilon > 0$, we can bound the second integral by $\frac{\varepsilon}{3}$ by choosing $r_1$
sufficiently large. We replace $g$ by $\bar g_\theta$ in the integrals above to compute $m(\bar g_\theta)$.
By our control on $v_\theta$ in $C^{2,\alpha}_{-\tau}(M)$, and hence on $\bar g_\theta$, we can uniformly estimate
$R(\bar g_\theta) = \psi_\theta u_\theta^{-4/(n-2)} R(g)$, and we can see that $|Q_{\bar g_\theta}(x)| \le C|x|^{-2\tau-2}$, where $C$
is uniform in $\theta$ large. Thus we can bound the corresponding integral for $\bar g_\theta$ by $\frac{\varepsilon}{3}$
by choosing an $r_1$ large enough, uniformly for $\theta \ge \theta_0$ for some sufficiently large
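$\theta_0$. By using (7.2.19), we see that for $\theta$ sufficiently large, $h_\theta$ and hence $v_\theta$ and
$\partial v_\theta$ are small enough that
\[
\Bigl| \int_{\{|x|=r_1\}} \Bigl[ \sum_{i,j=1}^n (g_{ij,i} - g_{ii,j}) - \sum_{i,j=1}^n \bigl( (\bar g_\theta)_{ij,i} - (\bar g_\theta)_{ii,j} \bigr) \Bigr] \nu^j\, d\sigma \Bigr| < \frac{\varepsilon}{3}. \qquad \square
\]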
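The smallness of the tail integrals in the proof above, and the role of the threshold $\frac{n-2}{2}$, can be seen from a radial computation. As a sketch, with $\omega_{n-1}$ the area of the unit sphere in $\mathbb{R}^n$ and $q > \frac{n-2}{2}$:

```latex
\[
\int_{\{|x|\ge r_1\}} |x|^{-2q-2}\,dx
 = \omega_{n-1}\int_{r_1}^{\infty} r^{\,n-3-2q}\,dr
 = \frac{\omega_{n-1}}{2q+2-n}\; r_1^{\,n-2-2q}.
\]
```

The integral converges exactly when $n - 3 - 2q < -1$, that is, when $q > \frac{n-2}{2}$, and it then tends to $0$ as $r_1 \to \infty$. The same computation with $\tau$ in place of $q$ handles the bound $|Q_{\bar g_\theta}(x)| \le C|x|^{-2\tau-2}$, uniformly in $\theta$ because $C$ is.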
We now state and prove Bray’s approximation. The closeness of the metrics g
and g̃ can be measured in a weighted norm as above, or, as Bray states it, as an
ε-quasi-isometry, meaning that for all nonzero $v \in TM$, the ratio $g(v,v)/\tilde g(v,v)$
lies in $(e^{-\varepsilon}, e^{\varepsilon})$.
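To make the ε-quasi-isometry condition concrete, here is a minimal sketch in a deliberately simplified setting, which is an assumption of this illustration and not of the text: at a single point, the two metrics are taken to be diagonal in a common basis with positive entries $a_i$ and $b_i$. Then $g(v,v)/\tilde g(v,v) = \sum a_i v_i^2 / \sum b_i v_i^2$ is a weighted average of the ratios $a_i/b_i$, so it ranges between their minimum and maximum. The function names are hypothetical.

```python
import math


def ratio_bounds(g_diag, g_tilde_diag):
    """(min, max) of g(v, v) / g_tilde(v, v) over nonzero v, for two metrics
    that are diagonal in the same basis with positive diagonal entries."""
    ratios = [a / b for a, b in zip(g_diag, g_tilde_diag)]
    return min(ratios), max(ratios)


def is_quasi_isometry(g_diag, g_tilde_diag, eps):
    """Check that every ratio g(v, v) / g_tilde(v, v) lies in (e^-eps, e^eps)."""
    lo, hi = ratio_bounds(g_diag, g_tilde_diag)
    return math.exp(-eps) < lo and hi < math.exp(eps)
```

For conformally related metrics $\tilde g = \varphi\, g$ all the ratios equal $1/\varphi$, so the condition reduces to $|\log \varphi| < \varepsilon$ pointwise.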
Proposition 7-80 (Bray). Suppose $(M, g)$ is asymptotically flat with nonnegative
$R(g) \in L^1(M)$. For any $\varepsilon > 0$, there is a metric $\tilde g$ with $R(\tilde g) \ge 0$, which
is ε-close to g, which is isometric to a Riemannian Schwarzschild metric near
infinity in each end of M, and for which |m(g) − m(g̃)| < ε.
Proof. By applying Proposition 7-79, we may modify the metric so that each end
is harmonically flat. For any chosen end E , we may choose asymptotic coordinates
along with an $r_0 > 0$, so that for $|x| > r_0$, the metric has the form $g_{ij}(x) =
u(x)^{\frac{4}{n-2}}\delta_{ij}$, with $\Delta u = 0$, and $u(x) = 1 + \tfrac12 m(g)/|x|^{n-2} + O_\infty(|x|^{-(n-1)})$. We
will handle each end individually, so we do not introduce notation to distinguish
the mass on different ends. Now for any $R > r_0$ and $\delta > 0$, consider the harmonic
function $\tilde u(x) = C_1 + C_2/|x|^{n-2}$, with $C_1$ and $C_2$ chosen so that
\[
C_1 + \frac{C_2}{R^{n-2}} = \max_{\{|x|=R\}} u(x) + \delta \quad\text{and}\quad C_1 + \frac{C_2}{(2R)^{n-2}} = \min_{\{|x|=2R\}} u(x) - \delta.
\]
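The two conditions on $C_1$ and $C_2$ form a $2 \times 2$ linear system, which can be solved in closed form. The following is a minimal sketch: the function name `bray_constants` and its argument names are hypothetical, and the caller is assumed to supply the values $\max_{\{|x|=R\}} u$ and $\min_{\{|x|=2R\}} u$.

```python
def bray_constants(u_max_R, u_min_2R, R, n, delta):
    """Solve  C1 + C2 / R**(n-2)     = u_max_R  + delta,
              C1 + C2 / (2R)**(n-2)  = u_min_2R - delta
    for (C1, C2), in dimension n >= 3."""
    a = u_max_R + delta
    b = u_min_2R - delta
    # Subtracting the equations: C2 * R**(2-n) * (1 - 2**(2-n)) = a - b.
    s = 1.0 - 2.0 ** (2 - n)
    C2 = (a - b) * R ** (n - 2) / s
    C1 = a - C2 / R ** (n - 2)
    return C1, C2
```

Since $1 - 2^{2-n} > 0$ for $n \ge 3$, the system is always uniquely solvable, and $C_2$ has the same sign as the difference of the two right-hand sides.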
By the choice of $C_1$ and $C_2$, the function $w$ defined by
\[
w(x) = \begin{cases} u(x) & \text{if } |x| < R, \\ \min(u(x), \tilde u(x)) & \text{if } R \le |x| \le 2R, \\ \tilde u(x) & \text{if } |x| > 2R \end{cases}
\]
As we saw earlier, to each end in an asymptotically flat initial data set with
integrable mass and momentum densities (ρ, J ), we can associate an energy
E and linear momentum vector P, which together form the ADM energy-
momentum vector for the asymptotic end. The constraint equations have the
form $\Phi(g, \pi) = (2\kappa\rho, \kappa J)$ (we take $\Lambda = 0$). Without a model for, or condition
on, $(\rho, J)$, any $(g, \pi)$ satisfies this system by definition, and indeed by simple
patching one can construct asymptotically flat initial data sets with negative
energy. We want a condition that will imply that the ADM energy-momentum
vector is future-pointing causal, i.e., $E \ge |P|$ (with $c = 1$), in which case we
define the ADM mass as $m = \sqrt{E^2 - |P|^2}$. The positive mass theorem asserts
that indeed E ≥ |P| for an asymptotically flat initial data set (M, g, π) with
the dominant energy condition in the form ρ ≥ |J |g (cf. Section 5.2), while the
positive energy theorem asserts that E ≥ 0.
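As a small illustration of the definition of the ADM mass from the energy-momentum vector, here is a minimal sketch; the function name `adm_mass` is hypothetical, units with $c = 1$ are assumed as in the text, and $|P|$ is taken to be the Euclidean norm of the components of $P$ in asymptotic coordinates.

```python
import math


def adm_mass(E, P):
    """Return m = sqrt(E^2 - |P|^2) for an energy-momentum vector (E, P)
    that is future-pointing causal, i.e. E >= |P|; raise otherwise."""
    norm_P = math.sqrt(sum(p * p for p in P))
    if E < norm_P:
        raise ValueError("(E, P) is not future-pointing causal: E < |P|")
    return math.sqrt(E * E - norm_P * norm_P)
```

For example, `adm_mass(5.0, (3.0, 0.0, 0.0))` returns `4.0`, and in the time-symmetric case $P = 0$ the function simply returns $E$.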
Sometimes these statements are called the spacetime positive mass theorem and
spacetime positive energy theorem, in contrast to the situation where the object of
interest is an asymptotically flat manifold (M, g). Indeed, in the time-symmetric
case (K = 0, i.e., π = 0), we have P = 0, and so we often write E = m, which
we note can be defined for any asymptotically flat (M, g) with R(g) ∈ L 1 (M).
The dominant energy condition in the time-symmetric case, or more generally
in the maximal (trg K = 0) case, reduces to R(g) ≥ 0. The Riemannian positive
mass theorem for an asymptotically flat (M, g) with R(g) ≥ 0 is then m ≥ 0.
In each situation above, the theorem is really comprised of an inequality,
together with a rigidity statement which characterizes the ground state. In the
positive energy theorem, if we find E = 0, we would like to conclude our initial
vanishing of the mass rules out nontrivial topology in the manifold. We compare
to a simple analogue in dimension $n \ge 2$ as remarked following Proposition 7-73,
by compactification to $\mathbb{T}^n$: any metric $g$ on $\mathbb{R}^n$ with nonnegative scalar curvature
and which outside a compact set is isometric to the Euclidean metric is actually
isometric to $g_{\mathbb{E}^n}$. (Thus, for $n \ge 3$, the argument of the preceding section implies
$m \ge 0$ for $(\mathbb{R}^n, g)$ asymptotically flat with nonnegative scalar curvature.)
We indicate a rigidity proof in the next exercise, given the Riemannian positive
mass theorem in higher dimensions [197; 205]. The proof below uses volume
comparison; for a proof using harmonic coordinates, see [196, Proposition 2],
and for further discussion, see [142].
Exercise 7-83. Suppose (M n , g), n ≥ 3, is asymptotically flat with nonnegative
scalar curvature and vanishing ADM mass. By Remark 7-77 (see Exercise 7-95),
g has vanishing scalar curvature. Suppose M has a single asymptotic end.
Suppose h is a smooth, symmetric (0, 2)-tensor with compact support in
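M. Let $\gamma_\epsilon = g + \epsilon h$, which is a metric for $|\epsilon|$ small. By Corollary 7-69
and Proposition 7-72, there is $u_\epsilon = 1 + v_\epsilon > 0$ so that $g_\epsilon := u_\epsilon^{4/(n-2)}\gamma_\epsilon$ has
$R(g_\epsilon) = 0$, with $v_\epsilon \in C^{2,\alpha}_{-\tau}(M)$, so that $g_\epsilon$ is asymptotically flat, and $u_\epsilon(x) =
1 + \tfrac12 m(\epsilon)/|x|^{n-2} + O_2(|x|^{-(n-2)-\gamma})$, for some $\gamma > 0$. Lemma 7-37 and the
hypothesis on $g$ imply that $m(g_\epsilon) = m(\epsilon)$, and as in Remark 7-77, we have
\[
m(\epsilon) = -\frac{1}{2(n-1)\omega_{n-1}} \int_M R(\gamma_\epsilon)\, u_\epsilon\, dv_{\gamma_\epsilon}.
\]
a. Argue that the map $\epsilon \mapsto v_\epsilon \in C^{2,\alpha}_{-\tau}(M)$ is differentiable in $\epsilon$, then obtain an
identity from computing $m'(0)$. (Differentiability at $\epsilon = 0$ suffices for this.)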
b. Let ζ be any compactly supported smooth bump function. Let h = ζ Ric(g)
in the above. Conclude that Ric(g) = 0.
c. Use the Bishop–Gromov volume comparison theorem to conclude that (M n , g)
is isometric to Euclidean space, cf. [182, Chapter 9, Exercise 5].
We remark that one way to handle multiple ends is to proceed following an
argument in [199]: given M as above with an end E with vanishing mass, namely,
let N ⊂ M be an asymptotically flat manifold-with-boundary containing one
asymptotic end E , and with boundary given by a union of large coordinate spheres
from the remaining ends, chosen so large that the mean curvature vector of these
spheres points into N . Now run the argument indicated in the exercise, solving
for the conformal factor $u_\epsilon$ which solves the same PDE but has the Neumann
boundary condition; see Remark 7-77. The conformal metric $g_\epsilon = u_\epsilon^{4/(n-2)}\gamma_\epsilon$ will
be asymptotically flat with vanishing scalar curvature, and with mean curvature
vector still pointing into N along ∂ N , by Exercise X-5. Thus by Schoen–Yau
[199], the mass of gϵ must be nonnegative. From here you can conclude N is
Ricci-flat, and then that M is Ricci-flat, and conclude as above.
We see that the decay rate to the Euclidean metric of an asymptotically flat
metric with nonnegative scalar curvature is constrained by the mass: any such
metric which admits asymptotically flat coordinates with rate q > n − 2 must
in fact be flat. We point out that this holds for complete asymptotically flat
manifolds. One can easily write down a nontrivial harmonically flat end with
zero mass, but just as with analogous negative mass examples, that end cannot be
completed to an asymptotically flat manifold with nonnegative scalar curvature.
In the final section of this chapter we will address the following question [204,
p. 371]: are there nontrivial asymptotically flat solutions of the vacuum Einstein
constraint equations on $\mathbb{R}^n$ which reduce to a Riemannian Schwarzschild metric
outside a compact set? Here we put in the condition nontrivial to rule out the
simple case of the Euclidean metric, with vanishing ADM mass. We saw in
Proposition 7-80 that if the vacuum condition is removed and only the dominant
energy condition in the form R(g) ≥ 0 is enforced, then the answer to the
question is yes, there are lots of such solutions; this sufficed for the proof of
the Riemannian positive mass theorem. In the purely Riemannian case of the
vacuum constraints (time-symmetric, K = 0), the Hamiltonian constraint is
simply R(g) = 0. If we consider conformal methods to construct solutions to
the constraints, unique continuation for harmonic functions suggests that maybe
the answer is no: maybe the condition of vanishing scalar curvature and being
identical to Schwarzschild near infinity is a rigid condition, cf. Example 7-19.
The strategy we will outline here is to investigate the question using a localized
version of the Fischer–Marsden scalar curvature deformation. The idea is to
patch together an asymptotically flat metric with zero scalar curvature to a
Schwarzschild end, using a smooth cutoff function. In an annular transition
region the scalar curvature may fail to vanish, and we seek to reimpose the
scalar curvature constraint by considering localized deformations whose support,
unlike in the case of conformal deformations in general, does not extend to the
Schwarzschild end.
Thus as a first step, we have to address to what extent the metric deformation
in the Fischer–Marsden theorem, which has been discussed earlier both in the
closed case (Theorem 6-52) and near the Euclidean metric in the asymptotically
flat case (Proposition 7-73), can be localized. We state a sufficient version of a
localized scalar curvature deformation (cf. [66; 70]). We introduce a term before
domain, the kernel of $L_g^*$ has dimension at most $n+1$, since the equation $L_g^* f = 0$
induces an ODE along geodesics emanating from a point; from this we also see
that on a connected manifold, no nontrivial static potential can vanish on an open
set (Proposition 2-44). We can even consider weak solutions to $L_g^* f = 0$, but
by elliptic regularity (assuming g is smooth), we will obtain that f is smooth,
and in fact, it is a simple exercise to use the induced ODE to show that a static
potential extends smoothly to the boundary [66; 70].
In the remaining sections, we will give an idea of the proof of Theorem 7-84
and then answer the question posed at the start of the section in the form of
Theorem 7-89. We will focus on the linear theory, while leaving a number of
technical details to the references. Also, the discussion will be somewhat breezy,
as the goal is to map out the big ideas of the proofs.
iteration). For the proof of Theorem 7-84, we run the iteration by hand and show
it converges to a solution.
To do this, we employ ideas from our analysis in the previous chapter. First,
we get the elliptic estimate used to make the variational method for establishing
linear surjectivity succeed. We then iterate the process of linear corrections,
using interior estimates to control the sequence of approximate solutions, and
establish convergence. At this point, this should sound reasonable and somewhat
unremarkable. That said, we have not yet indicated how we will impose the
decay of the deformation tensor h for which we solve. We will define weighted
spaces to accomplish this, and broach some of the technical hurdles involved
and how these are addressed in the analysis.
The linearized problem. We used elliptic estimates to serve as the foundation for
much of the elliptic PDE theory that we have employed. The elliptic estimates
in Section 6.1 were on closed manifolds, or the related interior estimates. If
you study, say, the Laplace operator on a bounded domain, the relevant elliptic
estimate will have to include a term for the boundary data. In our present setting,
we do not want to solve a boundary value problem per se, because we want
our deformation tensor to vanish on $\partial\Omega$, along with derivatives (a finite number,
or all derivatives, depending on how regular we want $h$ to be across $\partial\Omega$). The
operator $L_g^*$ is overdetermined-elliptic, and the structure of the operator allows
us to get an absolute elliptic estimate on any domain $\Omega$, without boundary terms.
This is easy for this operator, because $\operatorname{tr}_g(L_g^* f) = -(n-1)\Delta_g f - f R(g)$, so
that $\operatorname{Hess}_g f = L_g^* f - \frac{1}{n-1}\bigl( \operatorname{tr}_g L_g^* f + f R(g) \bigr) g + f \operatorname{Ric}(g)$. So on any domain
We say $u \in L^2_\rho(\Omega)$ if $\|u\|^2_{L^2_\rho(\Omega)} = \int_\Omega |u|^2 \rho\, dv_g$ is finite, and we define $H^k_\rho(\Omega)$
similarly for any nonnegative integer $k$, where $\|u\|^2_{H^k_\rho(\Omega)} = \sum_{\ell=0}^{k} \|\nabla^\ell u\|^2_{L^2_\rho(\Omega)}$;
we note $H^0_\rho(\Omega) = L^2_\rho(\Omega)$. It is a simple exercise to show that $H^k_\rho(\Omega)$ is a Hilbert
space. Furthermore, since $\rho$ is bounded, we have by [69, Lemma 2.1] that
$C^\infty(\overline{\Omega})$ is dense in $H^k_\rho(\Omega)$; this can be useful in establishing certain relations
or estimates, by proving on the dense subset and then taking limits. As a set,
$H^k_\rho(\Omega)$ is the same even if the metric $g$, and hence the norm, changes, and we will
usually suppress the metric from the notation. We remark that, considering the
behavior of the weight function near the boundary, it would be natural to define
the weighted spaces with a different weighting on different order derivatives, cf.
[67; 61].
We state the key estimate: with $\overline{\Omega}$ a compact smooth manifold-with-boundary,
assuming $L_g^*$ has no kernel on the manifold interior $\Omega$, there is a constant $C > 0$
such that for all $f \in H^2_\rho(\Omega)$
For the full constraint operator the analogous weighted estimates for $D\Phi^*_{(g,\pi)}$
hold, but are somewhat harder to establish, without having an analogue of the
absolute (unweighted) estimate (7.5.2), cf. [69; 67; 61].
We now see that (7.5.3) gives us a coercivity bound on the functional $G$:
assuming the kernel of $L_g^*$ on $\Omega$ is trivial and setting $\|S\|^2_{L^2_{\rho^{-1}}(\Omega)} = \int_\Omega |S|^2 \rho^{-1}\, dv_g$,
\[
G(u) \ge C'\|u\|^2_{H^2_\rho(\Omega)} - \|u\|_{L^2_\rho(\Omega)} \|S\|_{L^2_{\rho^{-1}}(\Omega)} \ge C'\|u\|^2_{H^2_\rho(\Omega)} - \|u\|_{H^2_\rho(\Omega)} \|S\|_{L^2_{\rho^{-1}}(\Omega)}.
\]
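One consequence of the coercivity bound can be sketched as follows, under the assumption (the definition of $G$ is not visible in this passage) that $G(0) = 0$. Any $u$ with $G(u) \le G(0)$, in particular any minimizer, is then controlled by $S$:

```latex
\[
0 \ge G(u) \ge C'\,\|u\|_{H^2_\rho(\Omega)}^2
   - \|u\|_{H^2_\rho(\Omega)}\,\|S\|_{L^2_{\rho^{-1}}(\Omega)}
\quad\Longrightarrow\quad
\|u\|_{H^2_\rho(\Omega)} \le \frac{1}{C'}\,\|S\|_{L^2_{\rho^{-1}}(\Omega)} .
\]
```

The same inequality shows that $G$ is bounded below and that minimizing sequences are bounded in $H^2_\rho(\Omega)$, which is what a direct variational argument needs.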
This can all be done, in the spirit of [66] (with correction to the norm needed
on S as noted in [69], and cf. [70] for a cleaner formulation). When S has
compact support, if we assume it is C 0,α -small, then it will also be small in the
required weighted norm, depending on the support (in particular, on a lower
bound of ρ on the support of S), which explains the way in which Theorem 7-84
is stated. We note here that given $k$, by taking a power weight $\rho(x) = (d(x))^N$
near the boundary, for $N$ sufficiently large we can make $h$ extend by zero in a
$C^k$ fashion across the boundary; an exponential weight can be used to produce a
$C^\infty$-solution. The solution just constructed is supported on $\overline{\Omega}$; we can arrange
the support to lie strictly in $\Omega$ by running the above construction, replacing $\Omega$ by
$\Omega' = \Omega_{\epsilon'}$, for $\epsilon' > 0$ sufficiently small.
There are numerous technical issues remaining in the proof of Theorem 7-84,
which we will not go into here, hoping we have at least given the spirit of the
argument. Instead we spend the remaining section in this chapter discussing
what modifications need to be made to address the question posed at the start
of the section: can we construct a zero scalar curvature metric by gluing an
asymptotically flat metric of zero scalar curvature to a Schwarzschild metric and
doing a localized perturbation to make the scalar curvature vanish?
7.5.2. Asymptotic gluing construction. We now state a result that answers the
question posed at the start of the section. For simplicity we continue to work
with metrics that are smooth, and for asymptotically flat metrics we require the
rate $q > \frac{n-2}{2}$, with order $\ell \ge 3$, though we can replace this with a weighted
$C^{2,\alpha}_{-q}$-assumption; in any case, we can use this weighted norm to measure the
closeness of two such metrics as in the next theorem.
Theorem 7-89. Suppose (E , g) is an asymptotically flat end with vanishing
scalar curvature, nonzero ADM mass m(g), and asymptotically flat coordinates
x. There is θ0 > 0 so that for all θ ≥ θ0 , there is a metric ḡ on E with R(ḡ) = 0,
ḡ = g for |x| ≤ θ , and ḡ is a Schwarzschild metric for |x| ≥ 2θ . Given ε > 0, for
large enough θ , ḡ is ε-close to g, and |m(ḡ) − m(g)| < ε.
The theorem applies to ends of negative or positive mass. Moreover, if $E \subset M$ is
an asymptotic end in $M$, then since the construction is local to the end, $\bar g$ extends
smoothly to all of $M$. The remainder of the chapter will be spent illustrating
the steps in the proof. The basic strategy is simple to lay out, but a number of
technical issues will make the discussion in parts a bit cumbersome.
With respect to the asymptotic coordinates $x$, we let
\[
\mathring g_{m,c}(x) = \Bigl( 1 + \frac{m}{2|x-c|^{n-2}} \Bigr)^{\frac{4}{n-2}} g_{\mathbb{E}^n}
\]
Even if $L^*_{\tilde g_\theta}$ has trivial kernel, $\mathring K$ can be thought of as an approximate kernel for
large $\theta$, an obstruction to uniform estimates. We proceed as in Proposition 6-15:
we get uniform estimates for $L^*_{\tilde g_\theta}$ if we work transverse to $\mathring K$. To this end, let
${}^{\perp}K = \bigl\{ u \in H^2_\rho(A_1) : \int_{A_1} u\zeta\, dx = 0 = \int_{A_1} u\zeta x^i\, dx,\ i = 1, \dots, n \bigr\}$. Since $\zeta$
is compactly supported away from the boundary of the annulus, integration
against an element of $K$ defines a continuous linear functional on $H^2_\rho(A_1)$, and
so ${}^{\perp}K$ is closed, which also follows from $H^2_\rho(A_1) = {}^{\perp}K \oplus K$. We remark that
$H^2_\rho(A_1) = {}^{\perp}K \oplus \mathring K$ as well, since $\mathring K \cap {}^{\perp}K = \{0\}$, and $\mathring K$ has the same dimension
as $H^2_\rho(A_1)/{}^{\perp}K \cong K$. To be explicit, let $x^0 := 1$; we can decompose $u \in H^2_\rho(A_1)$
as $u = R_0(u) + P_0(u)$, where
\[
P_0(u) = \sum_{j=0}^{n} \frac{\langle u, \zeta x^j \rangle_{L^2}}{\langle x^j, \zeta x^j \rangle_{L^2}}\, x^j \in \mathring K.
\]
We can also write $u = R(u) + P(u)$, with
\[
P(u) = \sum_{j=0}^{n} \frac{\langle u, \zeta x^j \rangle_{L^2}}{\langle \zeta x^j, \zeta x^j \rangle_{L^2}}\, \zeta x^j \in K.
\]
It is easy to see that $R_0(u)$ and $R(u)$ are in ${}^{\perp}K$.
Remark 7-91. We have not used anything about g̃θ except that it is near the
Euclidean metric, and so the results just obtained extend to g suitably near g En .
i, j=1
Note that we replaced $L_{\tilde g_\theta}$ with the operator $L$ at the Euclidean metric, up to
error terms, since $\tilde g_\theta$ is close to the Euclidean metric. Integrating by parts, using
$L^*(x^k) = 0$ and the fact that $h_\theta$ and $\partial h_\theta$ vanish at the boundary, we get
\[ \int_{A_1} x^k R(\tilde g_\theta + h_\theta)\,dx = \int_{A_1} x^k \sum_{i,j=1}^n \bigl((\tilde g_\theta)_{ij,ij} - (\tilde g_\theta)_{ii,jj}\bigr)\,dx + O(\theta^{-2q}). \tag{7.5.4} \]
We first consider $k = 0$. Recall that $\frac{n-2}{2} < q \le n-2$; the upper bound holds
since $m(g) \neq 0$ by assumption. From the fact that $g$ is asymptotically flat with
vanishing scalar curvature, we know that $\sum_{i,j=1}^n (g_{ij,ij} - g_{ii,jj})(x) = O(|x|^{-2-2q})$,
and similarly for $\mathring g_{m,c}$ (with $q = n-2$). Thus by (7.5.4) and (7.2.2) (see also
Proposition 7-33), we have
\begin{align*}
\int_{A_1} R(\tilde g_\theta + h_\theta)\,dx
&= \Bigl(\int_{\{|x|=2\}} - \int_{\{|x|=1\}}\Bigr) \sum_{i,j=1}^n \bigl((\tilde g_\theta)_{ij,i} - (\tilde g_\theta)_{ii,j}\bigr)\nu^j\,d\sigma + O(\theta^{-2q}) \\
&= \theta^{2-n}\Bigl[\int_{\{|x|=2\theta\}} \sum_{i,j=1}^n \bigl((\mathring g_{m,c})_{ij,i} - (\mathring g_{m,c})_{ii,j}\bigr)\nu^j\,d\sigma
 - \int_{\{|x|=\theta\}} \sum_{i,j=1}^n (g_{ij,i} - g_{ii,j})\nu^j\,d\sigma\Bigr] + O(\theta^{-2q})
\end{align*}
Using $\sum_{i,j=1}^n (g_{ij,ij} - g_{ii,jj})(x) = O(|x|^{-2-2q})$ again, we see the integrand
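We now compute, with $(\mathring h_{m,c})_{ij} = (\mathring g_{m,c})_{ij} - \delta_{ij}$ and $(\tilde h_\theta)_{ij} = (\tilde g_\theta)_{ij} - \delta_{ij}$, and
using (7.5.4),
\begin{align*}
\int_{A_1} x^k R(\tilde g_\theta + h_\theta)\,dx
&= \Bigl(\int_{\{|x|=2\}} - \int_{\{|x|=1\}}\Bigr) \sum_{i,j=1}^n \Bigl[x^k\bigl((\tilde h_\theta)_{ij,i} - (\tilde h_\theta)_{ii,j}\bigr) - \bigl(\delta^k_i (\tilde h_\theta)_{ij} - \delta^k_j (\tilde h_\theta)_{ii}\bigr)\Bigr]\nu^j\,d\sigma + O(\theta^{-2q}) \\
&= \theta^{1-n} \int_{\{|x|=2\theta\}} \sum_{i,j=1}^n \Bigl[x^k\bigl((\mathring h_{m,c})_{ij,i} - (\mathring h_{m,c})_{ii,j}\bigr) - \bigl(\delta^k_i (\mathring h_{m,c})_{ij} - \delta^k_j (\mathring h_{m,c})_{ii}\bigr)\Bigr]\nu^j\,d\sigma \\
&\quad - \theta^{1-n} \int_{\{|x|=\theta\}} \sum_{i,j=1}^n \Bigl[x^k (h_{ij,i} - h_{ii,j}) - (\delta^k_i h_{ij} - \delta^k_j h_{ii})\Bigr]\nu^j\,d\sigma
\end{align*}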
Since $R(\mathring g_{m,c}) = 0$, we have $\sum_{i,j=1}^n \bigl((\mathring g_{m,c})_{ij,ij} - (\mathring g_{m,c})_{ii,jj}\bigr) = Q_{\mathring g_{m,c}}$, where $Q_g$
is defined on page 283 (cf. Proposition 7-31). Using (7-39a) (p. 256) we then
L OCALIZED SCALAR CURVATURE DEFORMATION AND ASYMPTOTICS 303
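obtain
\begin{align*}
&\int_{\{|x|=2\theta\}} \sum_{i,j=1}^n \Bigl[x^k\bigl((\mathring h_{m,c})_{ij,i} - (\mathring h_{m,c})_{ii,j}\bigr) - \bigl(\delta^k_i (\mathring h_{m,c})_{ij} - \delta^k_j (\mathring h_{m,c})_{ii}\bigr)\Bigr]\nu^j\,d\sigma \\
&\qquad = 2(n-1)\omega_{n-1}\, m c^k - \lim_{r\to\infty} \int_{\{2\theta \le |x| \le r\}} x^k \sum_{i,j=1}^n \bigl((\mathring g_{m,c})_{ij,ij} - (\mathring g_{m,c})_{ii,jj}\bigr)\,dx \\
&\qquad = 2(n-1)\omega_{n-1}\, m c^k - \lim_{r\to\infty} \int_{\{2\theta \le |x| \le r\}} x^k Q_{\mathring g_{m,c}}(x)\,dx. \tag{7.5.8}
\end{align*}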
While $Q_{\mathring g_{m,c}}(x) = O(|x|^{2-2n})$, a closer analysis of $Q_{\mathring g_{m,c}}(x)$ shows that it takes
the form
which denotes a linear combination of terms of the form $(\mathring h_{m,c})_{ij}(\partial^2 \mathring h_{m,c})_{k\ell}$ and
$(\partial \mathring h_{m,c})_{ij}(\partial \mathring h_{m,c})_{k\ell}$, with bounded coefficients which are rational expressions in
the components of $\mathring g_{m,c}$. By explicit expansion, $Q_{\mathring g_{m,c}}(x) = \widetilde Q_{\mathring g_{m,c}}(x) + O(|x|^{1-2n})$,
where $\widetilde Q_{\mathring g_{m,c}}(-x) = \widetilde Q_{\mathring g_{m,c}}(x)$; cf. (7.2.7) and the discussion of approximate parity
symmetry.
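Thus we see for any $r > 2\theta$, $\int_{\{2\theta \le |x| \le r\}} x^k \widetilde Q_{\mathring g_{m,c}}(x)\,dx = 0$, while we also
have the estimate $\int_{\{|x| \ge 2\theta\}} x^k \bigl(Q_{\mathring g_{m,c}}(x) - \widetilde Q_{\mathring g_{m,c}}(x)\bigr)\,dx = O(\theta^{2-n})$. From (7.5.6)–(7.5.8), we have
\[ \int_{A_1} x^k R(\tilde g_\theta + h_\theta)\,dx = \theta^{1-n}\bigl(2(n-1)\omega_{n-1}\, m c^k + O(\gamma(\theta))\bigr). \]
We saw above that the rescaled Schwarzschild indeed has mass times center
equal to $(m/\theta^{n-2})(c/\theta) = \theta^{1-n} m c$. With $\hat c^k = c^k/\theta$, recalling that $\gamma(\theta) = o(\theta)$,
we obtain
\[ \int_{A_1} x^k R(\tilde g_\theta + h_\theta)\,dx = \theta^{2-n}\bigl(2(n-1)\omega_{n-1}\, m \hat c^k + o(1)\bigr). \]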
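The scaling of mass and center used here can be checked numerically. The following is a minimal sketch in plain Python, under the assumption that the rescaled metric is $\theta^{-2}\varphi_\theta^*\mathring g_{m,c}$ with $\varphi_\theta(x) = \theta x$, so that its conformal factor at $x$ is the original conformal factor at $\theta x$. The function names and the sample values of $n$, $m$, $c$, $\theta$ are illustrative and not taken from the text.

```python
import math

def conformal_factor(x, m, c, n):
    """u(x) = 1 + m / (2 |x - c|^(n-2)); the metric is u^(4/(n-2)) times Euclidean."""
    dist = math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, c)))
    return 1.0 + m / (2.0 * dist ** (n - 2))

def rescaled_parameters(m, c, theta, n):
    """Mass and center after pulling back by x -> theta*x (assumed rescaling)."""
    return m / theta ** (n - 2), [ci / theta for ci in c]

# Illustrative sample values (hypothetical).
n, m, c, theta = 4, 1.5, [0.3, -0.2, 0.1, 0.4], 50.0
m_hat, c_hat = rescaled_parameters(m, c, theta, n)

x = [1.2, -0.7, 0.5, 0.9]  # a point with 1 <= |x| <= 2
lhs = conformal_factor([theta * xi for xi in x], m, c, n)  # original factor at theta*x
rhs = conformal_factor(x, m_hat, c_hat, n)                 # rescaled Schwarzschild at x

# Mass times center of the rescaled metric versus theta^(1-n) * m * c.
mass_times_center = [m_hat * ci for ci in c_hat]
expected = [theta ** (1 - n) * m * ci for ci in c]
```

The two conformal factors agree to rounding error, and the product of the rescaled mass and center equals $\theta^{1-n} m c$ componentwise, which is the factor appearing in the displayed identity.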
Thus even though the metric g may not have had a convergent center of mass
integral, we still have the following vector identity (recall (7.5.5)):
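\[ \frac{\theta^{n-2}}{2(n-1)\omega_{n-1}} \Bigl(\int_{A_1} R(\tilde g_\theta + h_\theta)\,dx,\ \int_{A_1} x\,R(\tilde g_\theta + h_\theta)\,dx\Bigr)
 =: \bigl(m - m(g),\ m\hat c\bigr) + \bigl(\xi^0(m,\hat c),\ \xi(m,\hat c)\bigr) \]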
\[ F(m,\hat c) = \Bigl(m(g) - \xi^0(m,\hat c),\ -\frac{\xi(m,\hat c)}{m(g) - \xi^0(m,\hat c)}\Bigr) \]
defines a continuous map $F : \Theta \to \Theta$, since $\xi = o(1)$, uniformly for $\theta$ large
and $(m,\hat c) \in \Theta$. The Brouwer fixed point theorem yields an $(m,\hat c) \in \Theta$ such
that $F(m,\hat c) = (m,\hat c)$, which translates into $\int_{A_1} x^k R(\tilde g_\theta + h_\theta)\,dx = 0$ for $k \in
\{0, 1, \dots, n\}$ as desired.
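To see concretely why a fixed point of $F$ kills both components of the vector identity, one can experiment with a toy model. The sketch below is only an illustration: the error terms `xi` are a hypothetical stand-in (small, of size $1/\theta$), not the ones arising in the proof, and the fixed point is found by simple iteration, which works here because the toy map is a contraction. The Brouwer theorem used in the text needs only continuity and gives existence without any such iteration.

```python
def xi(m, c_hat, theta):
    """Hypothetical o(1) error terms (xi0, xi); NOT the ones from the proof."""
    xi0 = (0.3 * m + 0.1 * c_hat[0]) / theta
    vec = [(0.2 * ck - 0.05 * m) / theta for ck in c_hat]
    return xi0, vec

def F(m, c_hat, m_g, theta):
    """F(m, c_hat) = (m(g) - xi0, -xi / (m(g) - xi0)), as in the text."""
    xi0, vec = xi(m, c_hat, theta)
    new_m = m_g - xi0
    return new_m, [-v / new_m for v in vec]

def find_fixed_point(m_g, theta, n, iterations=200):
    """Plain iteration; converges for this toy xi because F is a contraction."""
    m, c_hat = m_g, [0.0] * n
    for _ in range(iterations):
        m, c_hat = F(m, c_hat, m_g, theta)
    return m, c_hat

m_g, theta, n = 1.0, 100.0, 3  # illustrative values
m, c_hat = find_fixed_point(m_g, theta, n)

# At a fixed point both components of the vector identity vanish:
xi0, vec = xi(m, c_hat, theta)
residual_mass = (m - m_g) + xi0
residual_center = [m * ck + v for ck, v in zip(c_hat, vec)]
```

Both residuals vanish up to rounding, reflecting that $m = m(g) - \xi^0$ and $m\hat c = -\xi$ at the fixed point.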
Upon rescaling the metric back out from A1 , the center of mass becomes
c = θ ĉ, which can be “large”, but which is relatively smaller than the scale θ
at which the gluing is happening; for instance in the case n = 3, q = 1, we
have c = O(log θ ). This might be expected, since we did not assume g had a
convergent center of mass. As in [66] (note there are some unfortunate differences
between the normalization there and here), if the original metric g satisfies better
asymptotics, we can work a bit harder to get the final center of mass c to be close
to the original center c(g). In [66], we assumed asymptotically Schwarzschild
asymptotics, but the same would hold under Regge–Teitelboim asymptotics
(7.2.7), cf. [69; 61].
Exercises
where each end Eα is diffeomorphic to {|x| > 1} in Rn , and that there is a C > 0
with |gi j (x)−δi j | ≤ C|x|−q (q > 0, |x| > 1) on each end. Let d : M × M → [0, ∞)
Before the next part, you might recall as an example the $\mathbb{RP}^n$-geon from
Remark 2-39. We also recall that if $M$ is nonorientable, there is a connected
orientable double cover $\pi : \widehat M \to M$.
d. Suppose $\pi : \widehat M \to M$ is a connected double cover, with the covering metric $\hat g$.
Show that $(\widehat M, \hat g)$ is asymptotically flat, with $2k$ asymptotic ends. Indeed, let
$C \subset M$ be as in part c., and let $\widehat C = \pi^{-1}(C) \subset \widehat M$. Use covering arguments (path
lifting properties) to prove that the path components of $\widehat M \setminus \widehat C$ give the ends of
$\widehat M$. It may be useful to prove the following: if $p \in M \setminus C$ is a point in an end of
$M$, and if $\pi^{-1}(p) = \{\hat p_1, \hat p_2\} \subset \widehat M \setminus \widehat C$, then since any asymptotic end is simply
connected, any path from $\hat p_1$ to $\hat p_2$ in $\widehat M$ must hit $\widehat C$.
Exercise 7-94. Refer to Remark 7-77 for the setting of the exercise. Suppose
(M, g) is asymptotically flat and let u = 1+v be smooth with 0 < u < 1 and u → 1
at infinity in each asymptotic end, with v subharmonic and lying in a suitable
weighted space, admitting an expansion $v(x) = A/|x|^{n-2} + O_1(|x|^{-(n-2)-\gamma})$
on each end, for some $\gamma > 0$. Recall that these conditions come from solving
$\Delta_g u - \frac{n-2}{4(n-1)} R(g) u = 0$, where $R(g) \ge 0$ is nontrivial and has suitable decay.
Let $E$ be the manifold-with-boundary corresponding to $\{|x| \ge r_0\}$ in asymptotic
coordinates on a chosen end. Suppose $w$ solves the boundary value problem
$\Delta_g w = 0$, $w|_{\partial E} = \max_{\partial E} v < 0$ and $w(x) \to 0$ as $|x| \to \infty$ in the end of $E$, with
$w(x) = B/|x|^{n-2} + O_1(|x|^{-(n-2)-\gamma'})$ for some $\gamma' > 0$. Argue that $A \le B$ and
$B < 0$. (Hint: use Hopf's lemma; see [107, Lemma 3.4] or [86, Chapter 6].)
Exercise 7-95. a. Complete the proof sketched in Remark 7-78. To check the
integration by parts argument, first give a reasonably direct argument for $p > n$;
to extend to $p > \frac{n}{2}$, one can use a suitable exhaustion of the ends, choosing
$r_i \nearrow \infty$ for which $r_i^{1-n} \int_{\{|x|=r_i\}} |\nabla w|^p\,d\sigma$ decays suitably.
Exercise 7-96. In this exercise you will develop some further properties of
Euclidean harmonic functions on punctured domains in Rn .
a. Show that if $u$ is harmonic in a punctured ball around the origin, then the
isolated singularity at $x = 0$ is in fact removable if $\lim_{x\to 0} |x|^{n-2} u(x) = 0$ in
case $n > 2$, and in case $n = 2$, if $\lim_{x\to 0} \frac{u(x)}{\log|x|} = 0$.
b. Suppose $n > 2$ and $\Omega$ is a domain containing the origin. If $u$ is harmonic
on $\Omega \setminus \{0\}$, and $\liminf_{x\to 0} |x|^{n-2} u(x) > -\infty$, there exist $v$ harmonic in $\Omega$ and
$b \in \mathbb{R}$ such that $u(x) = b|x|^{2-n} + v(x)$ on $\Omega \setminus \{0\}$. Formulate an analogue in case
$n = 2$.
c. What can you say about a positive harmonic function on $\mathbb{R}^n \setminus \{0, a\}$, $a \in \mathbb{R}^n \setminus \{0\}$?
Exercise 7-97. The goal of this exercise is to verify directly that the ADM energy-
momentum vector of Schwarzschild spacetime transforms correctly under boosts
as a Lorentz-invariant vector. We consider the four-dimensional Schwarzschild
metric ḡ S given by
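\begin{align*}
\bar g_S &= -\left(\frac{1 - \frac{m}{2r}}{1 + \frac{m}{2r}}\right)^2 dt^2 + \left(1 + \frac{m}{2r}\right)^4 (dx^2 + dy^2 + dz^2) \\
&= -\left(1 - \frac{2m}{r}\right) dt^2 + \left(1 + \frac{2m}{r}\right)(dx^2 + dy^2 + dz^2) + O_\infty(r^{-2}),
\end{align*}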
where $\psi(r) = \left(1 - \frac{m}{2r}\right)^{-2}\left(1 + \frac{m}{2r}\right)^{6} = 1 + \frac{4m}{r} + O(r^{-2})$.
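b. Note that
\[ \frac{\partial}{\partial\tau} = \frac{1}{\sqrt{1-\alpha^2}}\Bigl(\frac{\partial}{\partial t} + \alpha\frac{\partial}{\partial x}\Bigr) \quad\text{and}\quad \frac{\partial}{\partial\xi} = \frac{1}{\sqrt{1-\alpha^2}}\Bigl(\frac{\partial}{\partial x} + \alpha\frac{\partial}{\partial t}\Bigr). \]
Show that the metric $\bar g_S$ in the coordinates $(\tau, \xi, y, z)$ has the matrix representation
\[ [\bar g_S] = \begin{pmatrix}
-1 + \frac{1+\alpha^2}{1-\alpha^2}\cdot\frac{2m}{r} & \frac{\alpha}{1-\alpha^2}\cdot\frac{4m}{r} & 0 & 0 \\
\frac{\alpha}{1-\alpha^2}\cdot\frac{4m}{r} & 1 + \frac{1+\alpha^2}{1-\alpha^2}\cdot\frac{2m}{r} & 0 & 0 \\
0 & 0 & 1 + \frac{2m}{r} & 0 \\
0 & 0 & 0 & 1 + \frac{2m}{r}
\end{pmatrix} + O_\infty(r^{-2}). \]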
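The leading-order entries of this matrix can be sanity-checked by evaluating the truncated metric $-(1 - 2m/r)\,dt^2 + (1 + 2m/r)\,dx^2$ on the boosted frame. The sketch below is a plain Python illustration with hypothetical function names and sample values; it treats $r$ as a fixed number and so does not address the $O_\infty(r^{-2})$ terms or the dependence of $r$ on $(\tau, \xi)$.

```python
import math

def boosted_components(m, r, alpha):
    """(g_tautau, g_tauxi, g_xixi) from evaluating the truncated metric on the frame."""
    gamma = 1.0 / math.sqrt(1.0 - alpha ** 2)
    g_tt = -(1.0 - 2.0 * m / r)
    g_xx = 1.0 + 2.0 * m / r
    # d/dtau = gamma (d/dt + alpha d/dx), d/dxi = gamma (d/dx + alpha d/dt)
    e_tau = (gamma, gamma * alpha)  # (t-component, x-component)
    e_xi = (gamma * alpha, gamma)

    def pair(u, v):
        # The truncated metric is diagonal in (t, x).
        return g_tt * u[0] * v[0] + g_xx * u[1] * v[1]

    return pair(e_tau, e_tau), pair(e_tau, e_xi), pair(e_xi, e_xi)

def stated_components(m, r, alpha):
    """The corresponding entries of the matrix in part b."""
    k = (1.0 + alpha ** 2) / (1.0 - alpha ** 2) * 2.0 * m / r
    return -1.0 + k, alpha / (1.0 - alpha ** 2) * 4.0 * m / r, 1.0 + k

m, r, alpha = 1.0, 250.0, 0.6  # illustrative values
computed = boosted_components(m, r, alpha)
stated = stated_components(m, r, alpha)
```

The two triples agree to rounding error; the $y$ and $z$ entries are unchanged by the boost.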
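$\partial/\partial\xi$ when computing $\nabla_{\partial/\partial\xi}\,\partial/\partial x^j$; and (ii) use the ADM equation (5.3.4) and the
form of the metric above to compute $K_{ij}$ from the lapse and shift form of the
metric $\bar g_S$ in the $(\tau, \xi, y, z)$ coordinates; we again emphasize that $\partial r/\partial\tau \neq 0$. In
either case you should find that the matrix representation of $K$ in the $(\xi, y, z)$
coordinates is
\[ [K] = \begin{pmatrix}
\frac{\alpha(\alpha^2-3)}{(1-\alpha^2)^2}\frac{m\xi}{r^3} & -\frac{\alpha}{1-\alpha^2}\frac{2my}{r^3} & -\frac{\alpha}{1-\alpha^2}\frac{2mz}{r^3} \\
-\frac{\alpha}{1-\alpha^2}\frac{2my}{r^3} & \frac{\alpha}{1-\alpha^2}\frac{m\xi}{r^3} & 0 \\
-\frac{\alpha}{1-\alpha^2}\frac{2mz}{r^3} & 0 & \frac{\alpha}{1-\alpha^2}\frac{m\xi}{r^3}
\end{pmatrix} + O(r^{-3}), \]
so that the trace of $K$ is $\mathrm{tr}_g K = -\dfrac{\alpha(1+\alpha^2)}{(1-\alpha^2)^2}\dfrac{m\xi}{r^3} + O(r^{-3})$.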
e. Find the momentum tensor $\pi_{ij} = K_{ij} - (\mathrm{tr}_g K) g_{ij} = K_{ij} - (\mathrm{tr}_g K)\delta_{ij} + O(r^{-3})$
in the $(\xi, y, z)$ coordinates, noting in particular that
\[ \pi_{\xi\xi} = -\frac{2\alpha}{1-\alpha^2}\frac{m\xi}{r^3} + O(r^{-3}). \]
Now compute the ADM momentum, showing that $P_1 = -m\alpha/\sqrt{1-\alpha^2}$ and
$P_2 = P_3 = 0$. Note that this is consistent with the Lorentz transformation of
coordinates, and makes sense on physical grounds in terms of the relative motion
of the two observers with worldlines x = 0 and ξ = 0 in the asymptotic region.
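As a quick consistency check on the expected answer, one can verify that the boosted energy-momentum has the same Minkowski length as the rest-frame vector $(m, 0, 0, 0)$. The sketch below assumes that the ADM energy of the boosted slice is $E = m/\sqrt{1-\alpha^2}$; that value is what Lorentz covariance predicts and is taken here as an assumption, since the corresponding computation is not reproduced in this part of the exercise. The function names are illustrative.

```python
import math

def boosted_energy_momentum(m, alpha):
    """Assumed energy E and the stated momentum P for the boosted slice."""
    gamma = 1.0 / math.sqrt(1.0 - alpha ** 2)
    energy = gamma * m                          # assumption: E = m / sqrt(1 - alpha^2)
    momentum = (-gamma * m * alpha, 0.0, 0.0)   # P_1 as stated in part e.
    return energy, momentum

def minkowski_norm_squared(energy, momentum):
    """-E^2 + |P|^2 with signature (-, +, +, +)."""
    return -energy ** 2 + sum(p * p for p in momentum)

m, alpha = 2.0, 0.6  # illustrative values
E, P = boosted_energy_momentum(m, alpha)
norm_sq = minkowski_norm_squared(E, P)
```

Under the stated assumption, `norm_sq` equals $-m^2$ for every $|\alpha| < 1$, as it must for a vector obtained from $(m, 0, 0, 0)$ by a Lorentz transformation.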
Exercise 7-98 (Lorentz invariance of ADM energy-momentum). Consider a
Lorentzian manifold $(S, \bar g)$. We will be interested in the case when $\bar g$ is
asymptotically Minkowskian, so we will assume that $S$ is $\mathbb{R}^{n+1}$ (or a suitable open
subset), and let $\eta$ be a background Minkowski metric, so that the coordinates $x^\mu$
($x^0 = t$, $c = 1$) on $\mathbb{R}^{n+1}$ are inertial coordinates for $\eta$:
\[ \eta = \eta_{\mu\nu}\,dx^\mu\,dx^\nu = -(dx^0)^2 + (dx^1)^2 + \cdots + (dx^n)^2. \]
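thus $DR_\eta(h) = \eta^{\mu\nu}\eta^{\lambda\alpha}(h_{\alpha\mu,\lambda\nu} - h_{\lambda\alpha,\mu\nu})$. Note then that $DG_\eta(h) = D\mathrm{Ric}_\eta(h) -
\frac12 DR_\eta(h)\,\eta$.
Define $\tau$ by $\kappa\tau^{\mu\nu} = \eta^{\mu\alpha}\eta^{\nu\beta}(DG_\eta(h))_{\alpha\beta}$. Then $\kappa\tau$ is comprised of the terms
in the Einstein tensor that are linear in $h$.
b. Show that $\mathrm{div}_\eta \tau = 0$, i.e., in any inertial coordinate system, $\tau^{\mu\nu}{}_{,\nu} = 0$. You can
show this directly from the formula above, or use the fact that $\mathrm{div}_{\bar g} G(\bar g) = 0$.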
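c. Show that in any inertial coordinate system, for $i \in \{1, \dots, n\}$,
\begin{align*}
\kappa\tau^{00} &= \frac12 \sum_{i,j=1}^n (h_{ij,i} - h_{ii,j})_{,j}, \\
\kappa\tau^{0i} &= \frac12 \sum_{j=1}^n \Bigl(h_{0i,j} - h_{ij,0} - h_{0j,i} + \sum_{k=1}^n h_{kk,0}\,\delta_{ij}\Bigr)_{,j} \\
&= \frac12 \sum_{j=1}^n \Bigl(h_{0i,j} - h_{ij,0} + \sum_{k=1}^n (h_{kk,0} - h_{0k,k})\,\delta_{ij}\Bigr)_{,j}.
\end{align*}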
Let $y^\mu = \Lambda^\mu{}_\nu x^\nu + a^\mu$ for some proper Lorentz matrix $\Lambda^\mu{}_\nu$, and some vector
with components $a^\mu$. Then the volume form is $\omega = dx^0 \wedge dx^1 \wedge \cdots \wedge dx^n =
dy^0 \wedge dy^1 \wedge \cdots \wedge dy^n$. Consider spacelike hyperplanes $M$ and $\widetilde M$ in $\mathbb{R}^{n+1}$, where $M$
is given by $x^0 = 0$ and $\widetilde M$ by $y^0 = 0$. Let $C_R = \bigl\{x \in \mathbb{R}^{n+1} : \sum_{i=1}^n (x^i)^2 \le R^2\bigr\}$. Let
$M_R = M \cap C_R$, $\widetilde M_R = \widetilde M \cap C_R$; then $C_R$, $M_R$ and $\widetilde M_R$ bound a solid spacetime
region $W_R$, the boundary of which is the union of $M_R$, $\widetilde M_R$ and a timelike
hypersurface $\Sigma_R \subset \partial C_R$. For each $\mu$, let $I_\mu$ be the $n$-index increasing from 0 to
$n$, but omitting $\mu$, and let $dx^{I_\mu}$ be the corresponding $n$-form. For a vector field
$X = X^\mu\,\partial/\partial x^\mu$, note that $i_X\omega = \sum_{\mu=0}^n (-1)^{\mu+1} X^\mu\,dx^{I_\mu}$.
ḡ = −N 2 dt 2 + gi j (d x i + X i dt) ⊗ (d x j + X j dt),
where $\pi_{ij} = K_{ij} - (\mathrm{tr}_g K) g_{ij}$; cf. (5.3.4). Use these formulas to compute $\pi_{ij}$ for
a boosted slice $\Sigma_\alpha$ in Schwarzschild as in Exercise 7-97; in the notation of that
problem, $\partial r/\partial\tau \neq 0$.
g. Prove that the ADM momentum flux integral converges, and is given by
\[ P_i = \lim_{r\to\infty} \int_{\{x^0=0,\ |x|=r\}} \Theta_{ij}\,\nu^j\,d\sigma = \widehat P_i, \]
where $\kappa\Theta_{ij}$ can be taken to be either $\frac12\bigl(h_{0i,j} - h_{ij,0} - h_{0j,i} + \sum_{k=1}^n h_{kk,0}\,\delta_{ij}\bigr)$
or $\frac12\bigl(h_{0i,j} - h_{ij,0} + \sum_{k=1}^n (h_{kk,0} - h_{0k,k})\,\delta_{ij}\bigr)$. (You might note that in $\mathbb{R}^3$, if
Exercise 7-99. We use the notation from Exercise 7-98. Assume in addition to
the asymptotically flat assumptions in that exercise that there are asymptotically
flat coordinates x µ in which ḡ and T are even to one order better than their
b. Let $y^\mu = \Lambda^\mu{}_\nu x^\nu + a^\mu$, and let
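and define $M_R$ and $\widetilde M_R$ as in Exercise 7-98. Using the vector field $X^\mu\,\partial/\partial x^\mu$
given by $X^\mu = \bigl(x^\nu \tau(dx^\mu, dx^\lambda) - x^\lambda \tau(dx^\mu, dx^\nu)\bigr)$, prove that
\begin{align*}
J^{\nu\lambda} &= \lim_{r\to\infty} \int_{M_r} \bigl(x^\nu \tau(dx^0, dx^\lambda) - x^\lambda \tau(dx^0, dx^\nu)\bigr)\,dx \\
&= \lim_{r\to\infty} \int_{\widetilde M_r} \bigl(x^\nu \tau(dy^0, dx^\lambda) - x^\lambda \tau(dy^0, dx^\nu)\bigr)\,dy,
\end{align*}
so that
\begin{align*}
J^{\nu\lambda}\,\frac{\partial}{\partial x^\nu} \otimes \frac{\partial}{\partial x^\lambda}
&= \lim_{r\to\infty} \int_{\widetilde M_r} \bigl((y^\alpha - a^\alpha)\tau(dy^0, dy^\beta) - (y^\beta - a^\beta)\tau(dy^0, dy^\alpha)\bigr)\,dy\ \frac{\partial}{\partial y^\alpha} \otimes \frac{\partial}{\partial y^\beta} \\
&= \bigl(\tilde J^{\alpha\beta} - a^\alpha \tilde P^\beta + a^\beta \tilde P^\alpha\bigr)\,\frac{\partial}{\partial y^\alpha} \otimes \frac{\partial}{\partial y^\beta},
\end{align*}
i.e., $\tilde J^{\alpha\beta} = \Lambda^\alpha{}_\nu J^{\nu\lambda} \Lambda^\beta{}_\lambda + a^\alpha \Lambda^\beta{}_\lambda P^\lambda - a^\beta \Lambda^\alpha{}_\nu P^\nu$.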
c. For k ∈ {1, . . . , n}, show that J k0 gives the center of mass integral from
(7.2.5). Give a physical interpretation of the result of part b. in case the Lorentz
transformation is the identity, and a i = 0 for i > 0.
d. For $\ell, m \in \{1, \dots, n\}$, show that $J^{\ell m} = \lim_{r\to\infty} \int_{\{x^0=0,\ |x|=r\}} \Theta^{\ell m}_j\,\nu^j\,d\sigma$,
where
\begin{align*}
\kappa\Theta^{\ell m}_j &= \tfrac12\bigl(x^\ell h_{0m,j} - x^\ell h_{mj,0} - x^m h_{0\ell,j} + x^m h_{\ell j,0} + h_{0\ell}\,\delta_{mj} - h_{0m}\,\delta_{\ell j}\bigr) \\
&= (x^\ell K_{mj} - x^m K_{\ell j}) + D^{\ell m}_j + E^{\ell m}_j,
\end{align*}
where in terms of the lapse $N$ and shift $X$, $D^{\ell m}_j = -\frac12 x^\ell X_{j,m} + \frac12 x^m X_{j,\ell} +
\frac12 X_\ell\,\delta_{mj} - \frac12 X_m\,\delta_{\ell j}$, and $E^{\ell m}_j = x * X * \Gamma + (N-1) * x * K$, where $*$ denotes a linear
combination of products of components of the indicated factors. By the parity
assumptions, $N - 1$ and $X$ are each even up to respective terms of order $O(|x|^{-q-1})$,
and the Christoffel symbols and second fundamental form are odd up to respective
terms of order $O(|x|^{-q-2})$. Conclude that $\lim_{r\to\infty} \int_{\{x^0=0,\ |x|=r\}} E^{\ell m}_j\,\nu^j\,d\sigma = 0$.
For $R > 0$, let $\chi_R = \chi_{B_R(0)}$ be the characteristic function of the ball of radius $R$
about the origin.
a. Show that for $p > \frac{n}{2}$, there are functions $f \in L^p(\mathbb{R}^n)$ for which the integral
defining the Newtonian potential does not converge. What can you say about the
borderline case $p = \frac{n}{2}$?
b. Observe that for $f \in L^\infty_{\mathrm{loc}}(\mathbb{R}^n)$, the convolution integral $(\chi_1\Gamma) * f$ converges
everywhere, whereas by Young's inequality, for $1 \le p \le \infty$ and $f \in L^p(\mathbb{R}^n)$,
$(\chi_1\Gamma) * f \in L^p(\mathbb{R}^n)$, so that in particular the convolution integral converges
almost everywhere.
c. Prove that for $1 \le p < \frac{n}{2}$ and $f \in L^p(\mathbb{R}^n)$, the convolution $((1 - \chi_1)\Gamma) * f$
converges everywhere to a bounded, measurable function.
d. Generalize Proposition 7-3: prove that for $1 \le p < \frac{n}{2}$, $\Delta N f = f$ distributionally.
Exercise 7-101. Suppose $f \in L^\infty_{\mathrm{loc}}(\mathbb{R}^n) \cap L^1(\mathbb{R}^n)$, and suppose we have
established $\partial_j N f(x) = \int_{\mathbb{R}^n} \partial_j\Gamma(x - y) f(y)\,dy$. Let $\Omega \subset \mathbb{R}^n$ be a bounded open set;
Conclude that there is a $c$ depending only on $n$ so that for $\rho > 0$, $|I(B_\rho(x_1))| \le
c\,|x_1 - x_2| \sup_{x \in B_\rho(x_1)} |f(x)|$.
b. Let $x^\lambda = (1 - \lambda)x_1 + \lambda x_2$. Show that for any $y$ with $|y - x_1| \ge \rho$, and for all
$\lambda \in [0, 1]$, $|y - x^\lambda| \ge \frac12 |y - x_1| \ge |x_1 - x_2|$.
c. Conclude that there are constants $K$ and $L$, depending only on $n$, such that for
all $x_1, x_2 \in \Omega$, $|y - x_1| > \rho$ and $\lambda \in [0, 1]$, we have $|x^\lambda - y| > 0$, and moreover
$|x_1 - y|^n\,\bigl|\partial^2_{ij}\Gamma|_{(x^\lambda - y)}\bigr| \le K$, from which it follows for suitably chosen $L$ that
$|x_1 - y|^n\,|\partial_j\Gamma(x_1 - y) - \partial_j\Gamma(x_2 - y)| \le L|x_1 - x_2|$.
d. Show that $|I(\mathbb{R}^n \setminus B_R(x_1))| \le L R^{-n} |x_1 - x_2|\,\|f\|_{L^1(\mathbb{R}^n \setminus B_R(x_1))}$.
e. Finally, for $\rho > 0$, derive the following estimate for $C$ depending only on $n$:
$I(B_R(x_1) \setminus B_\rho(x_1)) \le C \sup_{x \in B_R(x_1)} |f(x)|\,|x_1 - x_2|\,(\log R - \log\rho)$. Conclude
that $N f \in C^{1,\alpha}_{\mathrm{loc}}(\mathbb{R}^n)$ for any $0 < \alpha < 1$.
Exercise 7-102. Suppose $1 \le p < \infty$.
a. Suppose $\Delta u = 0$ on $\mathbb{R}^n$, and that $u \in L^p_{-\tau}(\mathbb{R}^n)$. Show that if $\tau \ge 0$, then $u$ is
identically zero. (Hint: Use the mean value property.)
b. If $\tau < 0$, then if $u$ is a polynomial lying in $L^p_{-\tau}(\mathbb{R}^n)$, show that $\deg u < -\tau$.
Exercise 7-103. Let $K'(x, y) = |x|^{-a}|x - y|^{a+b-n}|y|^{-b}$, with $a + b > 0$. Let $p > 1$
and $p' = \frac{p}{p-1}$. Prove that if $T(u)(x) = \int_{\mathbb{R}^n} K'(x, y) u(y)\,dy$ defines a bounded
linear operator $T$ on $L^p(\mathbb{R}^n)$, then $a < \frac{n}{p}$ and $b < \frac{n}{p'}$. To do this, first observe that
$f(x) = \int_{\{|y| \le 1\}} K'(x, y)\,dy$ gives a function $f \in L^p(\mathbb{R}^n)$. Furthermore, argue that
Exercise 7-104. Let $m$ be a nonnegative integer and $1 < p < \infty$. Show that
$\Delta : W^{2,p}_m(\mathbb{R}^n) \to L^p_{m-2}(\mathbb{R}^n)$ does not have closed range. (A duality argument
then shows that $\Delta : W^{2,p}_{2-n-m}(\mathbb{R}^n) \to L^p_{-n-m}(\mathbb{R}^n)$ does not have closed range; cf.
[155].)
(Hint: Let $K_m$ be the space of homogeneous harmonic polynomials on $\mathbb{R}^n$ of degree
at most $m$, so that $K_{-1} = \{0\}$. The kernel of $\Delta$ in $W^{2,p}_m(\mathbb{R}^n)$ is precisely $K_{m-1}$.
Let $0 \le \phi \le 1$ be smooth with compact support in $\{x : |x| \le 2\}$, and with $\phi(x) = 1$
for all $|x| \le 1$, and let $u \in K_m \setminus K_{m-1}$, i.e., $u$ is a nontrivial homogeneous harmonic
polynomial of degree $m$. For $k \in \mathbb{Z}_+$, let $u_k(x) = (\log k)^{-1/p}\phi(x/k)u(x)$. Show
that $\inf_{k \in \mathbb{Z}_+} \inf_{h \in K_{m-1}} \|u_k - h\|_{W^{2,p}_m} > 0$. To do this, first argue that there are
positive constants $c_1$ and $c_2$ such that for all $k \in \mathbb{Z}_+$, $c_1 \le \|u_k\|_{L^p_m} \le c_2$. Show
that if $\inf_{k \in \mathbb{Z}_+} \inf_{h \in K_{m-1}} \|u_k - h\|_{W^{2,p}_m} = 0$, then for some subsequence $k_j \in \mathbb{Z}_+$,
and for some $v \in K_{m-1}$, $u_{k_j} \to v$ in $L^p_m(\mathbb{R}^n)$; then argue that $v = 0$, which gives
a contradiction. Argue carefully that $\Delta u_k \to 0$ in $L^p_{m-2}(\mathbb{R}^n)$. From here, an
elementary functional analysis argument shows that the range is not closed. After
you have finished the proof, note where you used that $u_k$ is harmonic, and check
directly that if you tried to run an analogous proof in $L^p_\mu(\mathbb{R}^n)$ for $m - 1 < \mu < m$,
you can renormalize $u_k$ in a reasonable way to get $c_1$ and $c_2$ as above, but the
analogous proof does not go through (the issue then must lie with the estimate
of $\Delta u_k$).)
Exercise 7-105. In parts a. and b. you will prove Corollary 7-69.
a. Prove the claim for the weight range 0 < τ < n − 2, which follows along the
lines of the proof of Corollary 7-68.
b. For the borderline case $\tau = n - 2$, you should first establish an estimate of the
form $\|hu\|_{L^1} \le C\|h\|_{C^0_{-\nu}}\|u\|_{C^0_{-\beta}}$ for $\nu > 2$, for some $n - \nu < \beta < n - 2$. Conclude
that $T_h(u) = hu$ gives a compact linear mapping $T_h : C^0_{2-n} \to L^1$. Following
[211], we let $D^{k-2,\alpha}_{-n} = C^{k-2,\alpha}_{-n} \cap L^1$, with $\|f\|_{D^{k-2,\alpha}_{-n}} = \|f\|_{C^{k-2,\alpha}_{-n}} + \|f\|_{L^1}$, and
we let $E^{k,\alpha}_{2-n} = \{u \in C^{k,\alpha}_{2-n} : \Delta_g u \in L^1\}$, with norm $\|u\|_{E^{k,\alpha}_{2-n}} = \|u\|_{C^{k,\alpha}_{2-n}} + \|\Delta_g u\|_{L^1}$.
Use (7.2.19) to show $(\Delta_g - h) : E^{k,\alpha}_{2-n} \to D^{k-2,\alpha}_{-n}$ is an isomorphism.
c. We note that in [211], a slightly different norm for $E^{k,\alpha}_{2-n}$ is used. Consider a
background metric $\mathring g$ as we have done before, where $\mathring g_{ij} = \delta_{ij}$ on the asymptotic
charts for $g$ on each end. Show that for $u \in C^2_{2-n}$, $\Delta_g u \in L^1$ if and only if
$\Delta_{\mathring g} u \in L^1$, by estimating $\|\Delta_g u - \Delta_{\mathring g} u\|_{L^1}$. (This shows that the functions in the
space $E^{k,\alpha}_{2-n}$ are the same for all metrics which satisfy the same asymptotically
flat condition as $g$ does in the asymptotic charts for $g$.) Conclude that one gets
an equivalent norm on $E^{k,\alpha}_{2-n}$ by replacing $\|\Delta_g u\|_{L^1}$ in $\|u\|_{E^{k,\alpha}_{2-n}}$ with $\|\Delta_{\mathring g} u\|_{L^1}$.
\[ \Delta X_i + 4u^{-1}u_{,j}(LX)^j{}_i - 2u^{-1}u_{,i}\,\mathrm{tr}(LX) = 0. \]
Show that
\[ \pi_{ij} = -\frac{B_i x^j + B_j x^i}{|x|^3} + \sum_{k=1}^3 \frac{B_k x^k}{|x|^3}\,\delta_{ij} + O(|x|^{-3}) \]
e. We saw in (7-39a) and (7.2.5) how to relate $\beta$ from part d. to the center of
mass. Show the components of the ADM angular momentum can be expressed as
linear combinations of components of $d_{(i)}$: $J^{23} = J_1 = \frac12(d^3_{(2)} - d^2_{(3)})$, $J^{31} = J_2 =
\frac12(d^1_{(3)} - d^3_{(1)})$ and $J^{12} = J_3 = \frac12(d^2_{(1)} - d^1_{(2)})$. Use this along with an expansion
of $X_i(y - a)$ to show that under translation of asymptotically flat coordinates
$y = x + a$, the angular momentum changes by $J_i \mapsto J_i + (a \times P)_i$.
Exercise 7-107 (DEC and PMT). This example from [130] illustrates a difference
between the dominant energy condition and the weak energy condition (recall
Section 2.3.2); at the level of the constraints operator on initial data sets, we are
comparing the condition ρ ≥ |J |g versus ρ ≥ 0.
Consider an initial data set $(\mathbb{R}^3, \mathring g, \pi)$, where $\mathring g$ is the Euclidean metric, and
where the momentum tensor $\pi$ is pure-trace, $\pi = \frac{p}{3}\mathring g$, where $\mathrm{tr}_{\mathring g}\pi = p$ is a chosen
(smooth) function on $\mathbb{R}^3$. As usual we let $(2\rho, J) = \Phi(g, \pi)$ be the constraints
operator.
a. Show that ρ ≥ 0 for such an initial data set.
b. The ADM energy E of this initial data set vanishes. Give an example of p for
which ρ ∈ L 1 (R3 ), and the ADM linear momentum P does not vanish; thus the
conclusion of the positive mass theorem does not hold for such an initial data set.
c. Show that if the DEC holds, and if p is compactly supported, then p is
identically zero. To do this, translate the DEC into a differential inequality in
the radial direction. Derive the same conclusion if ρ ∈ L 1 (R3 ).
Exercise 7-108. This exercise refers to the proof of Proposition 7-75.
a. Finish the proof of Proposition 7-75 by establishing the expansion. Recall
that q + τ + 2 > n, and outside a compact set, we have
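\[ R\bigl((1 + v)^{\frac{4}{n-2}} g\bigr) = R(g) + S, \quad \text{with } v \in W^{k,p}_{-\tau},\ S \in W^{0,p'}_{-\delta'}. \]
Show that $\Delta_g v \in W^{0,s}_{-\beta}$ for some $s > \frac{n}{2}$ and $\beta > n$. Note that $\frac{np}{n-(k-2)p} > \frac{n}{2}$ if
$(k - 2)p \le n$ and $p > \frac{n}{k}$.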
of the adjoint of div. For a weight function ρ as considered on page 294, let
$\mathcal{G}(u) = \int_B \bigl(\frac12|\nabla u|^2\rho - uf\bigr)\,dx$. Let $0 \le \zeta \le 1$ be a nontrivial smooth bump
8.1. Introduction
Many deep results in mathematical general relativity concern the interplay
between globally conserved quantities and the geometric structure of initial data sets.
Examples include the minimal surface approach by R. Schoen and S.-T. Yau [199;
203] and the spinor method by E. Witten [224] in the proof of the Riemannian
positive mass theorem; the inverse mean curvature flow by G. Huisken and
T. Ilmanen [125] and the conformal flow by H. Bray [26] in the proof of the
Penrose inequality; and the constant mean curvature foliation by G. Huisken and
S.-T. Yau [126] (cf. R. Ye [225]) in establishing a geometric notion of center of
mass.
In a broad sense, this chapter is intended to introduce some aspects of the
connections between globally conserved physical quantities, such as the center of
mass and angular momentum, and the geometric structure of the manifold, using
analysis of the scalar curvature, or more generally the full constraint equations
derived from the spacetime Einstein equation. The chapter focuses on constant
mean curvature foliations and the geometric center of mass of asymptotically
flat initial data sets. This research program was initiated by Huisken and Yau in
1996 and has drawn great interest in recent years; see, for example, [31; 32; 33;
79; 80; 81; 117; 156; 166; 184; 225]. This chapter begins with a partial survey
of the classical results of constant mean curvature surfaces and introduces the
now standard concept of stability. We then discuss some recent progress on the
constant mean curvature surfaces in asymptotically flat initial data sets and the
geometric center of mass. In the last part, we adopt a more analytic approach to
The author, Lan-Hsuan Huang, was partially supported by NSF through grants DMS-1308837
and DMS-1452477. This chapter is based upon two mini-courses presented in the 2012 Summer
School on Mathematical General Relativity at MSRI and the 2013 Summer School on Mathematical
General Relativity in Cortona, Italy. The author is very grateful to the organizers Justin Corvino
and Pengzi Miao for their warm hospitality, which made the summer schools memorable. Sincere
appreciation goes to Justin Corvino for valuable comments.
320 8. O N THE CENTER OF MASS AND CMC SURFACES
study the center of mass and angular momentum from the Einstein constraint
equations.
A spacetime is an (n + 1)-dimensional smooth manifold equipped with a
Lorentzian metric g of signature (− + · · · +). The Einstein equation is the tensor
equation (in appropriate units)
\[ \mathrm{Ric}(g) - \tfrac12 R(g)\,g = T, \]
where µ is the energy density and J is the momentum density. More specifically,
let T be the energy-momentum tensor and let ν be the future-directed timelike
normal to M. We define µ := T (ν, ν) and J := T (ν, ·). The dominant energy
condition on the tensor T reduces to the inequality µ ≥ |J |g at each point of M.
When k ≡ 0, (M, g) is called a time-symmetric (or Riemannian) initial data set. It
is simple to see that in the time-symmetric case the system of constraint equations
becomes a single equation R(g) = 2µ. Thus the dominant energy condition
coincides with the condition that the scalar curvature of g is nonnegative, which
is a condition that naturally appears in Riemannian geometry. (However, for
case, we may unambiguously use the ADM mass m to denote the energy E,
since |P| = 0.
There are also the notions of center of mass and angular momentum for an
asymptotically flat initial data set. T. Regge and C. Teitelboim [187] and R. Beig
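and N. Ó Murchadha [18] proposed the following definitions of the center of
mass $C_{\mathrm{BORT}}$ and the angular momentum $J$ (if $E \neq 0$), for $k, \ell = 1, 2, \dots, n$:
\[ C^\ell = \frac{1}{2(n-1)E\omega_{n-1}} \lim_{r\to\infty} \int_{\{|x|=r\}} \Bigl[ x^\ell \sum_{i,j=1}^n \Bigl(\frac{\partial g_{ij}}{\partial x^i} - \frac{\partial g_{ii}}{\partial x^j}\Bigr)\nu_0^j - \sum_{i=1}^n \bigl(g_{i\ell}\,\nu_0^i - g_{ii}\,\nu_0^\ell\bigr) \Bigr]\,d\mathcal{H}_0^{n-1}, \tag{8.1.1} \]
\[ J_{(k\ell)} = \frac{1}{(n-1)E\omega_{n-1}} \lim_{r\to\infty} \int_{\{|x|=r\}} \sum_{i,j=1}^n \pi_{ij}\,Y^i_{(k\ell)}\,\nu_0^j\,d\mathcal{H}_0^{n-1}, \tag{8.1.2} \]
where $Y_{(k\ell)} = x^k\,\partial/\partial x^\ell - x^\ell\,\partial/\partial x^k$ are the Euclidean rotational vector fields.$^7$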
To distinguish the above definitions from other notions of center of mass and
angular momentum (e.g. [126; 48]), we refer to the integrals (8.1.1) and (8.1.2)
as the BORT center of mass and the ADM angular momentum, respectively.
In contrast to the ADM energy-momentum, the integrals of CBORT and J are
less well understood and may not even converge in general. In fact, explicit
examples of asymptotically flat initial data sets such that the integrals diverge
have been constructed [18; 43; 47; 46; 118]. Nevertheless, the author shows
that if one assumes the following Regge–Teitelboim conditions, then (8.1.1) and
(8.1.2) converge and transform correctly with respect to different coordinate
charts [116].
An initial data set $(M, g, k)$ is said to satisfy the Regge–Teitelboim conditions
if it is asymptotically flat and, in the coordinate chart $M \setminus K \cong_x \mathbb{R}^n \setminus B$,
\[ g_{ij}(x) - g_{ij}(-x) = O(|x|^{-1-q}), \qquad k_{ij}(x) + k_{ij}(-x) = O(|x|^{-2-q}), \]
and
\[ \mu(x) - \mu(-x) = O(|x|^{-n-q_0-1}), \qquad J_i(x) - J_i(-x) = O(|x|^{-n-q_0-1}). \]
Example 8-1 (three-dimensional Schwarzschild manifolds). A fundamental
example in general relativity is the Schwarzschild spacetime, which describes
the exterior gravitational field of a static, spherically symmetric body. The totally
geodesic time-slice outside the apparent horizon of the Schwarzschild spacetime
of mass $m > 0$ can be expressed as a Riemannian manifold $M = (2m, \infty) \times S^2$
endowed with the metric
\[ (1 - 2ms^{-1})^{-1}\,ds^2 + s^2 g_{S^2}, \]
7 In the literature (see [69], for example), the BORT center of mass and angular momentum are
sometimes defined as E Cℓ and E J(kℓ) , respectively.
where gS2 is the round metric on the unit sphere. One can readily check that M
is the manifold interior of an asymptotically flat initial data set with a minimal
boundary and one end, and it has zero scalar curvature. Mathematically one
can extend M to a complete asymptotically flat initial data set of zero scalar
curvature by “doubling” M across its minimal boundary. The complete two-
ended asymptotically flat initial data set can be expressed as a conformally flat
metric $(\mathbb{R}^3 \setminus \{a\}, g_{m,a})$, where $g_{m,a} = u^4 g_E$ and
\[ u(x) = 1 + \frac{m}{2|x - a|}, \]
where $g_E$ is the Euclidean metric. We generally suppress "$a$" from the notation
and write $g_m = g_{m,a}$. One computes directly that $m$ is the ADM energy and $a$ is
the BORT center of mass. The asymptotic expansion of $g_m$ for $|x|$ large is
\[ g_m = \Bigl(1 + \frac{2m}{|x|} + \frac{2m\,a \cdot x}{|x|^3} + \frac{3m^2}{2|x|^2} + O(|x|^{-3})\Bigr) g_E. \]
It follows that m appears in the |x|−1 -term of the expansion and the BORT center
of mass appears in the odd part of the O(|x|−2 )-term. This demonstrates that
appropriate assumptions need to be imposed on the leading-order terms of the
data in order for the integrals (8.1.1) and (8.1.2) to converge. It explains the
motivation behind the definition of the Regge–Teitelboim conditions. We also
note that the BORT center of mass of a Schwarzschild manifold is not a point of
the manifold. □
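The expansion in Example 8-1 is easy to test numerically. The sketch below, in plain Python with illustrative names and sample values, compares $u^4$ with the stated truncation and checks that the difference, multiplied by $|x|^3$, stays bounded as $|x|$ grows. It also isolates the odd part $u^4(x) - u^4(-x)$, whose leading term $4m\,a\cdot x/|x|^3$ carries the center of mass.

```python
import math

def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))

def u4_exact(x, m, a):
    """Conformal factor u^4 with u = 1 + m / (2 |x - a|)."""
    d = norm([xi - ai for xi, ai in zip(x, a)])
    return (1.0 + m / (2.0 * d)) ** 4

def u4_expansion(x, m, a):
    """Stated expansion, truncated before the O(|x|^-3) remainder."""
    r = norm(x)
    a_dot_x = sum(ai * xi for ai, xi in zip(a, x))
    return 1.0 + 2.0 * m / r + 2.0 * m * a_dot_x / r ** 3 + 1.5 * m ** 2 / r ** 2

def scaled_remainder(radius, m, a, direction):
    """|exact - expansion| * |x|^3 at x = radius * direction (direction is a unit vector)."""
    x = [radius * di for di in direction]
    return abs(u4_exact(x, m, a) - u4_expansion(x, m, a)) * radius ** 3

def scaled_odd_part_error(radius, m, a, direction):
    """|u^4(x) - u^4(-x) - 4 m a.x / |x|^3| * |x|^3."""
    x = [radius * di for di in direction]
    minus_x = [-xi for xi in x]
    a_dot_x = sum(ai * xi for ai, xi in zip(a, x))
    odd = u4_exact(x, m, a) - u4_exact(minus_x, m, a)
    return abs(odd - 4.0 * m * a_dot_x / radius ** 3) * radius ** 3

m, a = 1.0, [0.3, -0.4, 0.2]                    # illustrative mass and center
direction = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]   # a unit vector
remainders = [scaled_remainder(R, m, a, direction) for R in (1e2, 1e3)]
odd_errors = [scaled_odd_part_error(R, m, a, direction) for R in (1e2, 1e3)]
```

Both lists stay bounded as the radius increases, consistent with the $O(|x|^{-3})$ remainder and with the odd part of $g_m$ being of order $|x|^{-2}$, which matches the rate $O(|x|^{-1-q})$ with $q = 1$ in the Regge–Teitelboim conditions.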
Define $h(r) = s(r)$. We then rewrite the Schwarzschild metric in the form
$g_m = dr^2 + h^2(r)\,g_{S^{n-1}}$. Define the vector field $X = h(r)\,\partial/\partial r$. By direct
computation,
\begin{align*}
L_X g_m &= L_X(dr) \otimes dr + dr \otimes L_X(dr) + X(h^2)\,g_{S^{n-1}} \\
&= 2h'(r)\,dr \otimes dr + 2(h(r))^2 h'(r)\,g_{S^{n-1}} \\
&= 2h'(r)\,g_m.
\end{align*}
Thus $X$ is a conformal vector field that satisfies $L_X g_m = 2f g_m$, where
\[ f(r) = h'(r) = \Bigl(\frac{dr}{ds}\Bigr)^{-1} = (1 - 2ms^{2-n})^{1/2}. \]
U NIQUENESS OF EMBEDDED CMC SURFACES 325
Proof. Recall that the Lie derivative L X g in a local frame {e1 , . . . , en+1 } has the
expression
(L X g)(ei , e j ) = g(∇ei X, e j ) + g(ei , ∇e j X )
Integrating on $\Sigma$ and applying the divergence theorem yields the desired integral
formula. □
Theorem 8-4 (The Heintze–Karcher inequality; see [31, Theorem 3.5]). Let
$\Sigma^n$ be a closed, embedded hypersurface in $\mathbb{R}^{n+1}$, with $\Sigma = \partial\Omega$, where $\Omega$ is a
bounded region in $\mathbb{R}^{n+1}$ with volume $\mathrm{Vol}\,\Omega$. Suppose that the mean curvature $H$
is positive with respect to the outward unit normal. Then
\[ n \int_\Sigma \frac{1}{H}\,d\mu \ge (n + 1)\,\mathrm{Vol}\,\Omega, \]
\[ F(x, t) = x - t\nu(x), \]
where $\nu$ is the outward unit normal on $\Sigma$. Let $\Sigma_t := F(\Sigma, t)$, with surface
measure $d\mu$ (suppressing the dependence on $t$). Let $d_\Sigma(p)$ denote the distance
of $p \in \mathbb{R}^{n+1}$ to $\Sigma$. For $t$ sufficiently small, $\Sigma_t = d_\Sigma^{-1}(t) \cap \Omega$ is smooth, but $\Sigma_t$
may begin to have self-intersection for some $t$. Hence, instead of working on $\Sigma_t$,
we consider
where $\nu(x, t)$ is a unit normal to $\Sigma_t := F(\Sigma, t)$. Define $F_t(x) = F(x, t)$. We
further suppose that $F_t : \Sigma \to M$ is an immersion. By direct computation, the
first variation formula says
\[ \frac{d}{dt}\Big|_{t=0} \mathcal{H}^n(\Sigma_t) = \int_\Sigma H\eta\,d\mu, \tag{8.3.1} \]
where $A$ is the second fundamental form of $\Sigma$ and Ric is the Ricci tensor of
$(M, g)$. Now define the stability operator
functions $\eta$.
If we restrict our attention to a smaller class of deformations on $\Sigma$,
hypersurfaces of constant mean curvature $H \neq 0$ can appear as the critical points of
the functional $\mathcal{H}^n(\Sigma_t)$. Consider the $(n+1)$-dimensional signed volume $V(t)$
Then
\[ \frac{d}{dt}\Big|_{t=0} V(t) = \int_\Sigma \eta\,d\mu. \]
A variation such that $V(t) = V(0)$ for all $t \in (-\epsilon, \epsilon)$ is called
volume-preserving. Proposition 8-7 shows that if a variation satisfies $\int_\Sigma \eta\,d\mu = 0$, then
\[ \mu_0 := \inf\Bigl\{ \int_\Sigma \eta L_\Sigma\eta\,d\mu : \eta \in C^\infty(\Sigma),\ \|\eta\|_{L^2(\Sigma)} = 1,\ \int_\Sigma \eta\,d\mu = 0 \Bigr\}. \tag{8.3.3} \]
where $\Delta_0$ is the Laplacian operator of the round sphere of radius $r$. The smallest
eigenvalue of $L_{S_r}$ is
\[ \lambda_0 = \frac{-4r^2 + 8rm - m^2}{2r^4u^6} = -\frac{2}{r^2} + \frac{10m}{r^3} + O(r^{-4}), \]
with the corresponding eigenspace spanned by constant functions. The next
eigenvalues are
\[ \lambda_1 = \lambda_2 = \lambda_3 = \frac{6m}{r^3u^6} = \frac{6m}{r^3} + O(r^{-4}), \]
with corresponding eigenspace spanned by the coordinate functions $\{x^1, x^2, x^3\}$
restricted to $S_r$. Thus, if $m > 0$, $S_r$ is a stable hypersurface of constant mean
curvature (with respect to volume-preserving variations). The spheres $\{S_r\}$ form a
smooth foliation of constant mean curvature spheres with common center $C_{\mathrm{BORT}}$.
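The two asymptotic expansions of the eigenvalues can be checked numerically. The sketch below uses plain Python and takes $u = 1 + \frac{m}{2r}$, the value of the conformal factor on $S_r$ (with center at the origin); the names and sample values are illustrative. The last comparison rests on an extra assumption not stated in this excerpt, namely that $\lambda_1 - \lambda_0$ should equal the first nonzero Laplace eigenvalue $2/(ru^2)^2$ of a round sphere of area radius $ru^2$.

```python
import math

def conformal_factor(r, m):
    """u on the coordinate sphere S_r, assuming center a = 0."""
    return 1.0 + m / (2.0 * r)

def lambda0(r, m):
    u = conformal_factor(r, m)
    return (-4.0 * r ** 2 + 8.0 * r * m - m ** 2) / (2.0 * r ** 4 * u ** 6)

def lambda1(r, m):
    u = conformal_factor(r, m)
    return 6.0 * m / (r ** 3 * u ** 6)

m, r = 1.0, 1.0e3  # illustrative values

# Remainders of the stated expansions, scaled by r^4; bounded if the remainder is O(r^-4).
err0 = (lambda0(r, m) - (-2.0 / r ** 2 + 10.0 * m / r ** 3)) * r ** 4
err1 = (lambda1(r, m) - 6.0 * m / r ** 3) * r ** 4

# Assumed interpretation: the gap equals the first Laplace eigenvalue of the round sphere.
gap = lambda1(r, m) - lambda0(r, m)
sphere_eigenvalue = 2.0 / (r ** 2 * conformal_factor(r, m) ** 4)
```

The scaled remainders remain of moderate size, and the gap matches `sphere_eigenvalue` to rounding error. In particular $\lambda_1 > 0$ exactly when $m > 0$, which is the stability statement.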
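a local orthonormal frame along $\Sigma$. Note that $\nabla_{e_i} X = e_i$, where $\nabla$ is the ambient
connection. For any point in $\Sigma$, we can choose the frame so that $\nabla^\Sigma_{e_i} e_j = 0$ at
the point, at which we find
\begin{align*}
\Delta_\Sigma\langle X, \nu\rangle &= \sum_{i=1}^n e_i e_i \langle X, \nu\rangle \\
&= \sum_{i=1}^n e_i\bigl(\langle \nabla_{e_i} X, \nu\rangle + \langle X, \nabla_{e_i}\nu\rangle\bigr) \\
&= \sum_{i=1}^n \bigl(\langle \nabla_{e_i} e_i, \nu\rangle + \langle e_i, \nabla_{e_i}\nu\rangle + \langle \nabla_{e_i} X, \nabla_{e_i}\nu\rangle + \langle X, \nabla_{e_i}\nabla_{e_i}\nu\rangle\bigr) \\
&= \sum_{i=1}^n \bigl(\langle e_i, \nabla_{e_i}\nu\rangle + \langle X, \nabla_{e_i}\nabla_{e_i}\nu\rangle\bigr) \\
&= H - |A|^2\langle X, \nu\rangle, \tag{8.3.6}
\end{align*}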
E XISTENCE OF CMC SURFACES IN ASYMPTOTICALLY FLAT INITIAL DATA SETS 331
where in the last step we have used the equality $\sum_{i=1}^n \langle e_k, \nabla_{e_i}\nabla_{e_i}\nu\rangle = 0$, which
holds for each $k$ because $H$ is constant.
Let $L_\Sigma$ be the stability operator on $\Sigma$. The computation above implies that
with $H$ constant and $\eta = n - H\langle X, \nu\rangle$,
where in the last equality we used $\int_\Sigma \bigl(H - |A|^2\langle X, \nu\rangle\bigr)\,d\mu = 0$, which is implied
We have seen in Example 8-9 that the BORT center of mass of a Schwarzschild
manifold of positive mass is the common geometric center of the (unique)
foliation of the stable constant mean curvature surfaces. In 1996, motivated
by the goal of finding a geometric description of the center of mass in general
relativity, Huisken and Yau [126] initiated a program to study stable constant
mean curvature surfaces in more general asymptotically flat initial data sets.
Throughout this section, we consider three-dimensional asymptotically flat
initial data sets; in addition, we impose the Regge–Teitelboim conditions where
C_BORT is used.
We recall that the three-dimensional Schwarzschild metric of mass m is
denoted by g_m = (1 + m/(2|x|))⁴ g_E. Here we are interested in the exterior region of
the manifold, so the metric is valid for all m ∈ ℝ, and not only for m > 0. For
most of the results presented here, we focus on an asymptotically flat manifold
that is close to some Schwarzschild manifold in the following sense.
|I |≤k
The proof by Huisken and Yau consists of two parts. For the existence part,
they use the volume-preserving mean curvature flow to evolve a sufficiently round
initial surface into a constant mean curvature surface. Next, using the estimates
obtained from the flow, they analyze the eigenvalues of the stability operator and
show that the constant mean curvature surfaces are stable and form a smooth
foliation. We sketch the method of volume-preserving mean curvature flow in
Section 8.4.1 and discuss the eigenvalue estimates in Section 8.5. A different
approach by Ye [225] uses the inverse function theorem for the existence part,
which we discuss in Section 8.4.2.
where H̄ = |Σ_t|⁻¹ ∫_{Σ_t} H dµ, Σ_t = F_r(S², t), and |Σ_t| is the area of Σ_t. By
Proposition 8-7, the flow keeps the signed volume between Σ_t and S_r the same.
in Euclidean space are round spheres. (We applied this fact earlier in the proof of
Theorem 8-10.) For surfaces that are almost umbilic, there are several quantitative
estimates that measure how far the surfaces are from being round; see, for
example, [72]. Below we provide a simple version of the quantitative estimates.
We define the area radius for a closed surface Σ in (M, g) by
\[ r_{\Sigma} := \sqrt{\frac{|\Sigma|}{4\pi}}. \]
Proposition 8-16 [126, Proposition 2.1]. There exists an absolute constant C > 0
such that the following holds. Let Σ be a closed surface in ℝ³ of genus zero. Let
B_1 and B_2 be positive numbers such that
\[ \left| \lambda_i - \frac{\bar{H}}{2} \right| \le C(B_1 + B_2)\, r_{\Sigma}^{-3}. \]
Proof. In the proof, C is assumed to be an absolute constant and may change
from line to line. As a consequence of the Codazzi equation [123, Lemma 2.2],
we have
\[ |\nabla A|^2 \ge \tfrac{3}{4} |\nabla H|^2. \]
\[ |\nabla \mathring{A}|^2 = |\nabla A|^2 - \tfrac{1}{2} |\nabla H|^2 \ge \tfrac{1}{4} |\nabla H|^2. \]
Let x_0 ∈ Σ be such that H(x_0) = H̄. By the mean value theorem and the above
bound on |∇H|, we have
\[ |H(x) - \bar{H}| \le C B_2\, r_{\Sigma}^{-2}\, |\bar{H}|. \]
\[ \sum_{|I| \le 2} r^{|I|}\, |\partial^I \varphi| + \sum_{|I| = 2} r^{2+\alpha}\, [\partial^I \varphi]_{\alpha} \le C r^{-1}, \]
Sketch of proof. We will suppress the subscript r in Σ_r when the context is clear.
Fix an asymptotically flat coordinate system in the exterior region of M. Let Σ
be a graph over the coordinate sphere: for φ ∈ C^{2,α}(S_r(a)) suitably small,
\[ \Sigma = \{ x + \varphi\, \nu_{S_r} : x \in S_r(a) \}. \]
where dH_r is the first Fréchet derivative with respect to the second component.
Specifically, dH_r(a, 0) is the stability operator on S_r(a) with respect to g:
The term H_r(a, 0) in (8.4.4) is the mean curvature of the coordinate sphere
computed as in (8.4.3). Thus, solving H_r(a, φ) = 2/r − 4m/r² for some (a, φ) is
equivalent to solving
\[ L_{S_r(a)} \varphi = -\frac{6m\,(x - a)\cdot a}{r^4} - \frac{9m^2}{r^3} - G_r(x, a) - \int_0^1 \big( dH_r(a, s\varphi) - dH_r(a, 0) \big)(\varphi)\, ds. \tag{8.4.5} \]
where G_r(x, a) is the remainder term in (8.4.3) and dµ_0 is the area measure of
the Euclidean round sphere.
Remark 8-19. Lemma 8-18 has been generalized to initial data sets with the
Regge–Teitelboim conditions. We prove this in Lemma 8-22 below.
By Lemma 8-18 and direct computation, we obtain
\[ \int_{S_r(a)} F_r(x, a, \varphi, \partial\varphi, \partial^2\varphi)\,(x^i - a^i)\, d\mu_0 = -8\pi m\,(a^i - C^i_{\mathrm{BORT}}) + O(r^{-1}\|\varphi\|_{C^2}), \]
where the constant C depends only on the metric g. Choose r such that r ≥ C.
It follows that T maps B into itself. Thus, by the Schauder fixed point theorem,
T has a fixed point φ. Then φ solves the desired equation
\[ H_r(a, \varphi) = \frac{2}{r} - \frac{4m}{r^2}, \]
where a = C_BORT + O(r⁻¹‖φ‖_{C²}). □
Let {Σ_r} be the family of constant mean curvature surfaces constructed in the
previous theorem, and let {x^1, x^2, x^3} be the coordinate functions. The geometric
center of mass of (M, g) proposed by Huisken–Yau is defined as follows, for
i = 1, 2, 3:
\[ C^i_{\mathrm{Geom}} = \lim_{r \to \infty} \frac{\int_{\Sigma_r} x^i \, d\mu_0}{\int_{\Sigma_r} d\mu_0}, \tag{8.4.8} \]
C_BORT = C_Geom.
This corollary has been generalized to the class of asymptotically flat initial
data sets that satisfy the Regge–Teitelboim conditions [117].
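The ratio in (8.4.8) is a Euclidean area-weighted average of the coordinate functions. As a minimal sketch of how such a quotient is evaluated, the code below discretizes it on a round coordinate sphere S_r(a), for which the quotient should simply return the center a. The quadrature rule and the function names are our own choices for illustration; nothing here models the actual constant mean curvature surfaces Σ_r.

```python
import math

def coordinate_sphere(a, r, n_theta=64, n_phi=64):
    """Yield (point, Euclidean area weight) on S_r(a) using a midpoint rule."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    for k in range(n_theta):
        theta = (k + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            rho = (math.sin(theta) * math.cos(phi),
                   math.sin(theta) * math.sin(phi),
                   math.cos(theta))
            point = [a[i] + r * rho[i] for i in range(3)]
            yield point, r * r * math.sin(theta) * d_theta * d_phi

def euclidean_center(samples):
    """Discrete version of (integral of x^i dmu_0) / (integral of dmu_0)."""
    total = 0.0
    moment = [0.0, 0.0, 0.0]
    for point, weight in samples:
        total += weight
        for i in range(3):
            moment[i] += weight * point[i]
    return [value / total for value in moment]

if __name__ == "__main__":
    center = (1.0, -2.0, 0.5)
    print(euclidean_center(coordinate_sphere(center, 10.0)))
```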
8.4.3. Another notion of center of mass. In this section we show the following
identity between the mean curvature of the coordinate spheres and the BORT
center of mass.
Lemma 8-22 (Huang [117]). Let (M, g, k) be an asymptotically flat initial
data set satisfying the Regge–Teitelboim conditions. Given a ∈ ℝ³, denote by
S_r(a) = {x ∈ M : |x − a| = r} the coordinate sphere centered at a of radius r.
Then we have
\[ \int_{S_r(a)} (x^{\alpha} - a^{\alpha}) \Big( H - \frac{2}{r} \Big)\, d\mu_0 = 8\pi E\,(a^{\alpha} - C^{\alpha}_{\mathrm{BORT}}) + O(r^{1-2q}) \tag{8.4.9} \]
for α = 1, 2, 3, where H is the mean curvature of S_r(a) with respect to g and
dµ_0 is the area measure of the Euclidean round sphere S_r(a).
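where E_0(x) = O(r^{−1−2q}) and E_0(x) − E_0(−x) = O(r^{−2−2q}). We will use the
key identity
\begin{align*}
\int_{S_r(a)} (x^{\alpha} - a^{\alpha})\, \tfrac{1}{2} h_{ij,k}(x)\, \rho^i \rho^j \rho^k \, d\mu_0
&= \int_{S_r(a)} (x^{\alpha} - a^{\alpha}) \Big( \tfrac{1}{2} h_{ij,i}(x)\, \rho^j - 2 h_{ij}(x)\, \frac{\rho^i \rho^j}{r} \Big)\, d\mu_0 \\
&\quad + \frac{1}{2} \int_{S_r(a)} \big( h_{ii}(x)\, \rho^{\alpha} + h_{i\alpha}(x)\, \rho^i \big)\, d\mu_0. \tag{8.4.10}
\end{align*}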
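\begin{align*}
\int_{S_r(a)} (x^{\alpha} - a^{\alpha}) \Big( H(x) - \frac{2}{r} \Big)\, d\mu_0
&= -\frac{1}{2} \int_{S_r(a)} (x^{\alpha} - a^{\alpha})(h_{ij,i} - h_{ii,j})\, \rho^j \, d\mu_0 + \frac{1}{2} \int_{S_r(a)} (h_{i\alpha}\rho^i - h_{ii}\rho^{\alpha})\, d\mu_0 + O(r^{1-2q}) \\
&= -\frac{1}{2} \int_{S_r(a)} \Big( x^{\alpha} (g_{ij,i} - g_{ii,j})\, \rho^j - (g_{i\alpha}\rho^i - g_{ii}\rho^{\alpha}) \Big)\, d\mu_0 \\
&\quad + \frac{1}{2} \int_{S_r(a)} a^{\alpha} (g_{ij,i} - g_{ii,j})\, \rho^j \, d\mu_0 + O(r^{1-2q}) \\
&= -\frac{1}{2} \int_{S_r(a)} \Big( x^{\alpha} (g_{ij,i} - g_{ii,j})\, \frac{x^j}{r} - \Big( g_{i\alpha}\frac{x^i}{r} - g_{ii}\frac{x^{\alpha}}{r} \Big) \Big)\, d\mu_0 \\
&\quad + \frac{1}{2} \int_{S_r(a)} a^{\alpha} (g_{ij,i} - g_{ii,j})\, \frac{x^j}{r} \, d\mu_0 + O(r^{1-2q}),
\end{align*}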
where we used the Regge–Teitelboim conditions in all the equalities. The desired
identity follows from the definitions of the ADM energy and the BORT center
of mass.
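As a sanity check on the form of the key identity (8.4.10), one can test it numerically. The sketch below reads the identity as: the integral over S_r(a) of (x^α − a^α)·½ h_{ij,k} ρ^i ρ^j ρ^k equals the integral of (x^α − a^α)(½ h_{ij,i} ρ^j − 2 h_{ij} ρ^i ρ^j / r) plus ½ times the integral of (h_{ii} ρ^α + h_{iα} ρ^i), with ρ = (x − a)/r. The test tensor h_{ij}(x) = S_{ij}(c·x) + T_{ij}, the matrices S and T, the vector c, and the quadrature rule are all hypothetical choices made only for this illustration; no decay of h is used.

```python
import math

R3 = range(3)

# Hypothetical test data (not from the text): h_ij(x) = S_ij * (c . x) + T_ij,
# so that h_ij,k = S_ij * c_k is available in closed form.
S = [[1.0, 0.3, -0.2], [0.3, 0.5, 0.1], [-0.2, 0.1, -0.7]]
T = [[0.2, -0.4, 0.0], [-0.4, 0.1, 0.6], [0.0, 0.6, -0.3]]
c = (0.4, -1.1, 0.6)

def h(x):
    s = sum(c[k] * x[k] for k in R3)
    return [[S[i][j] * s + T[i][j] for j in R3] for i in R3]

def dh(i, j, k):
    return S[i][j] * c[k]

def sphere_nodes(n_theta=200, n_phi=16):
    """Unit-sphere quadrature: Simpson rule in theta, uniform rule in phi."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    for k in range(n_theta + 1):
        simpson = 1.0 if k in (0, n_theta) else (4.0 if k % 2 else 2.0)
        theta = k * d_theta
        weight = simpson * d_theta / 3.0 * math.sin(theta) * d_phi
        for j in range(n_phi):
            phi = j * d_phi
            yield (math.sin(theta) * math.cos(phi),
                   math.sin(theta) * math.sin(phi),
                   math.cos(theta)), weight

def both_sides(alpha, a, r):
    """Return (left side, right side) of the identity on S_r(a)."""
    lhs = rhs = 0.0
    for rho, w in sphere_nodes():
        x = [a[i] + r * rho[i] for i in R3]
        hx = h(x)
        dmu = w * r * r
        xa = x[alpha] - a[alpha]
        cubic = sum(dh(i, j, k) * rho[i] * rho[j] * rho[k]
                    for i in R3 for j in R3 for k in R3)
        div = sum(dh(i, j, i) * rho[j] for i in R3 for j in R3)
        radial = sum(hx[i][j] * rho[i] * rho[j] for i in R3 for j in R3)
        trace = sum(hx[i][i] for i in R3)
        mixed = sum(hx[i][alpha] * rho[i] for i in R3)
        lhs += xa * 0.5 * cubic * dmu
        rhs += (xa * (0.5 * div - 2.0 * radial / r)
                + 0.5 * (trace * rho[alpha] + mixed)) * dmu
    return lhs, rhs

if __name__ == "__main__":
    for alpha in R3:
        print(alpha, both_sides(alpha, (0.3, -0.2, 0.5), 2.0))
```

The two sides agree up to quadrature error for each α, which is consistent with the identity being a consequence of the divergence theorem on the sphere rather than of any asymptotic assumption.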
It remains to prove (8.4.10). Our original proof uses a density theorem
(Theorem 8-34) which states that initial data sets with harmonic asymptotics
are dense among initial data sets with the Regge–Teitelboim conditions in a
suitable topology such that the ADM energy and the BORT center of mass vary
continuously. It is then straightforward to verify that (8.4.10) holds for initial data
sets with harmonic asymptotics. Eichmair and Metzger later gave the following
proof [81]. For each α, define the vector field X^{(α)} = (x^α − a^α) h_{ij} ρ^i ∂_j. By the
where H is the mean curvature of the coordinate sphere Sr (0) with respect to g,
and dµ0 is the area measure of a Euclidean round sphere.
Corollary 8-23. Let (M, g, k) be an asymptotically flat initial data set satisfying
the Regge–Teitelboim conditions. Then these notions of center of mass coincide:
C_H = C_BORT.
After having obtained a family of constant mean curvature surfaces in Section 8.4,
we now discuss their properties in this section. We continue to restrict our
discussions to three-dimensional asymptotically flat initial data sets throughout
this section, unless otherwise specified.
8.5.1. Analyzing the stability operator. We have shown in Section 8.4 the exis-
tence of a family of constant mean curvature surfaces in an initial data set (M, g)
asymptotic to Schwarzschild. From the construction, each member of the family
of constant mean curvature surfaces {Σ_r} can be expressed as a normal graph
over the corresponding coordinate sphere at a common center a:
\[ \Sigma_r = \{ x + \varphi\, \nu_{S_r} : x \in S_r(a) \}, \tag{$\star$} \]
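\[ \mathrm{Ric}(\xi, \xi) \ge (n - 1)\kappa |\xi|^2, \]
\begin{align*}
0 = \frac{1}{2} \int_{\Sigma} \Delta |\nabla u|^2 \, d\mu
&\ge \int_{\Sigma} \Big( \frac{(\Delta u)^2}{n} + \langle \nabla u, \nabla \Delta u \rangle + \mathrm{Ric}(\nabla u, \nabla u) \Big)\, d\mu \\
&\ge \int_{\Sigma} \Big( \frac{\lambda_{\mathrm{Lap}}^2}{n} + \big( (n - 1)\kappa - \lambda_{\mathrm{Lap}} \big) \lambda_{\mathrm{Lap}} \Big)\, u^2 \, d\mu.
\end{align*}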
where we used |∇w| ≥ 0 and applied (⋆⋆) to the term involving |A|² + Ric(ν, ν).
On the other hand, using a constant function in the Rayleigh quotient yields
\[ \lambda_0 \le -\frac{2}{r^2} + \frac{10m}{r^3} + C r^{-4}. \]
Thus we have shown that
\[ \lambda_0 = -\frac{2}{r^2} + \frac{10m}{r^3} + O(r^{-4}). \tag{8.5.2} \]
S TABILITY AND FOLIATIONS 343
The function w is almost constant in the L² sense. Indeed, let w̄ = |Σ_r|⁻¹ ∫_{Σ_r} w dµ
Theorem 8-28 (inverse function theorem). Let E and F be Banach spaces, and
let U be an open subset of E. Suppose f : U ⊂ E → F is of class C^k, k ≥ 1.
Let x_0 ∈ U. Suppose that Df(x_0) is a linear isomorphism. Then f is a C^k
diffeomorphism of some neighborhood of x_0 onto some neighborhood of f(x_0).
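In finite dimensions the local inverse promised by Theorem 8-28 can be computed by Newton iteration, which only requires that the differential be invertible near x_0. The sketch below does this for a hypothetical map f on ℝ² (chosen purely for illustration, with Df(0, 0) equal to an invertible matrix); it is not the map used in the foliation argument.

```python
import math

def f(x, y):
    # Hypothetical smooth map R^2 -> R^2 with invertible differential at (0, 0).
    return (x + 0.1 * y * y, y + 0.1 * math.sin(x))

def jacobian(x, y):
    # Differential Df(x, y), written as a 2x2 matrix of partial derivatives.
    return ((1.0, 0.2 * y), (0.1 * math.cos(x), 1.0))

def local_inverse(target, guess=(0.0, 0.0), tol=1e-12, max_iter=50):
    """Solve f(x, y) = target by Newton iteration started at `guess`."""
    x, y = guess
    for _ in range(max_iter):
        fx, fy = f(x, y)
        rx, ry = target[0] - fx, target[1] - fy
        if max(abs(rx), abs(ry)) < tol:
            return x, y
        (a, b), (c, d) = jacobian(x, y)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("differential is not invertible here")
        # Newton step: solve Df(x, y) (dx, dy) = (rx, ry) via the 2x2 inverse.
        x += (d * rx - b * ry) / det
        y += (-c * rx + a * ry) / det
    raise RuntimeError("Newton iteration did not converge")

if __name__ == "__main__":
    print(local_inverse((0.05, -0.02)))
```

The iteration converges only for targets close enough to f(x_0), mirroring the local nature of the theorem.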
from which it follows that members of the family of constant mean curvature
surfaces do not intersect, and in fact form a foliation.
We recall below a standard application of Moser iteration and include the
proof since our setting is slightly different from [107, Theorem 8.17].
As i tends to infinity, ∥w∥ L p(i+1) converges to ∥w∥ L ∞ and the factor multiplying
the product converges because κ > 1. This implies that, for κ fixed and for any
p0 ≥ 2, and enlarging C0 from one inequality to the next if necessary,
−v ≤ C0 (∥v∥ L p0 + k).
for some constant c. Then there exists r0 large enough so that, for each r ≥ r0 ,
This implies
\[ \|u - \bar{u}\|_{L^2(\Sigma_r)} \le C m^{-1} r^{-1}\, |\bar{u}|\, |\Sigma_r|^{1/2}. \]
By Proposition 8-30,
When p = ∞,
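Theorem 8-33. Let p > n ≥ 3, q ∈ ((n−2)/2, n−2), q_0 > 0. Let (g, k) and (ḡ, k̄)
be C²_loc × C¹_loc asymptotically flat initial data sets such that
\[ (g - g_0, k),\ (\bar{g} - g_0, \bar{k}) \in W^{2,p}_{-q} \times W^{1,p}_{-1-q}, \]
and
\[ \mu^{\mathrm{odd}},\ J_i^{\mathrm{odd}},\ \bar{\mu}^{\mathrm{odd}},\ \bar{J}_i^{\mathrm{odd}} \in W^{0,p}_{-n-q_0-1}(M \setminus K). \]
Let ϵ > 0. There exists δ > 0 such that if
\[ \|g^{\mathrm{odd}} - \bar{g}^{\mathrm{odd}}\|_{W^{2,p}_{-1-q}(M \setminus K)} \le \delta \quad \text{and} \quad \|k^{\mathrm{even}} - \bar{k}^{\mathrm{even}}\|_{W^{1,p}_{-2-q}(M \setminus K)} \le \delta \]
then
\[ |C_{\mathrm{BORT}} - \bar{C}_{\mathrm{BORT}}| < \epsilon \quad \text{and} \quad |J - \bar{J}| < \epsilon. \]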
8.6.2. Scalar curvature equation⁸. We discuss a density result for the scalar
curvature equation due to Schoen and Yau. The density argument is used in
the proof of the Riemannian positive mass theorem and enables them to reduce
the case of the general asymptotically flat metrics to the case that the metrics
are scalar flat and conformally flat at infinity. In what follows, we consider an
asymptotically flat manifold M of dimension n ≥ 3.
8 See p. 280–284 for more details on the results in this section, which we include in order to
keep the chapter self-contained.
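\begin{align*}
&= \lim_{r \to \infty} \int_{\{|x| = r\}} -\frac{(n - 2)A}{2}\, r^{1-n} \, d\mu \\
&= -\frac{n - 2}{2}\, \omega_{n-1} A. \qquad \square
\end{align*}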
Proof of Theorem 8-35. By Lemma 8-36, we may assume that g has zero scalar
curvature. For λ ≥ 1 large, we define the cutoff metric
ĝλ := χλ g + (1 − χλ )gE ,
8.6.3. Einstein constraint equations. Let (M, g, k) be an initial data set. Define
the momentum tensor
π = k − (trg k)g.
Proposition 8-37 (Corvino–Schoen [69]; see also [78, Proposition 24]). Let
p > n, q ∈ ((n−2)/2, n−2), q_0 > 1. Suppose that (M^n, g, π) is an asymptotically
Theorem 8-38 (Corvino–Schoen [69, Theorem 1]). Let p > n, q ∈ ((n−2)/2, n−2).
Let (g, π) and (ḡ, π̄) be vacuum asymptotically flat initial data sets satisfying
\[ (g - g_0, \pi) \in W^{2,p}_{-q} \times W^{1,p}_{-1-q}. \]
Let ϵ > 0. There exists a vacuum asymptotically flat initial data set (ḡ, π̄) with
D ENSITY THEOREMS 353
where indices are raised and covariant derivatives are taken with respect to ĝ.
Because q ∈ ((n−2)/2, n−2) and p > n, DT_{(ĝ,π̂)}|_{(1,0)} and DT_{(g,π)}|_{(1,0)} are Fredholm
operators of index 0 for λ sufficiently large [15]. Instead of proving the
linearization has a trivial kernel as in the proof of Theorem 8-35, which seems
difficult for the system, we use the following argument.
Let K_1 be a complementing subspace for ker(DT_{(g,π)}|_{(1,0)}) in W^{2,p}_{−q} × W^{1,p}_{−1−q}.
Since DT_{(g,π)}|_{(1,0)} is Fredholm and the linearization DΦ|_{(g,π)} : W^{2,p}_{−q} × W^{1,p}_{−1−q} → W^{0,p}_{−2−q}
is surjective (see [69, Proposition 3.1]; cf. [120, Lemma 2.10]), we can
find smooth compactly supported symmetric (0, 2)-tensors {(h_k, w_k)}_{k=1}^N whose
images {DΦ|_{(g,π)}(h_k, w_k)}_{k=1}^N form a basis for a complementing subspace of
ran(DT_{(g,π)}|_{(1,0)}) in W^{0,p}_{−2−q}. Let K_2 = span{(h_k, w_k)}_{k=1}^N. For (u − 1, X) ∈ K_1
and (h, w) ∈ K_2, define the maps T_{(ĝ,π̂)}, T_{(g,π)} by
\begin{align*}
T_{(\hat{g},\hat{\pi})}(u, X, h, w) &= \Phi\big( u^{\frac{4}{n-2}} \hat{g} + h,\ u^{\frac{2}{n-2}} (\hat{\pi} + \mathcal{L}_{\hat{g}} X) + w \big), \\
T_{(g,\pi)}(u, X, h, w) &= \Phi\big( u^{\frac{4}{n-2}} g + h,\ u^{\frac{2}{n-2}} (\pi + \mathcal{L}_{g} X) + w \big).
\end{align*}
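\[ \|g - \bar{g}\|_{W^{2,p}_{-q}} \le \epsilon, \qquad \|\pi - \bar{\pi}\|_{W^{1,p}_{-1-q}} \le \epsilon \]
and
\[ \bar{E} = E, \quad \bar{P} = P, \quad \bar{J} = J + \vec{\alpha}_0, \quad \bar{C}_{\mathrm{BORT}} = C_{\mathrm{BORT}} + \vec{\gamma}_0. \]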
\[ \sum_i \tau_{ij,i} = 0 \quad \text{for } j = 1, 2, 3. \]
Consider
ĝλ = g + σλ and π̂λ = π + τλ ,
where σλ = σ (x/λ), τλ = τ (x/λ). To specify the center of mass and angular
momentum, the proof centers on constructing the tensors σ, τ with certain desired
properties. For the angular momentum, we find σ, τ with components satisfying
σ_{ij}(x) = σ_{ij}(−x), τ_{ij}(x) = τ_{ij}(−x) and so that, for a given α⃗ = (α_1, α_2, α_3) ∈ ℝ³,
\[ \int_{\{1 \le |x| \le 2\}} \sum_{i,j,l} \Big( \tfrac{1}{2} \tau_{ij,l}\, Y^l_{(k)} + \tau_{il}\, (Y^l_{(k)})_{,j} \Big) \sigma^{ij} \, dx = \alpha_k \]
for each k = 1, 2, 3, where Y_{(k)} is the rotation vector field Y_{(k)} = ∂/∂x^k × x⃗.
To specify the center of mass, we construct a divergence-free and trace-free
tensor σ such that for a given γ⃗ = (γ1 , γ2 , γ3 ) ∈ R3 ,
\[ \int_{\{1 \le |x| \le 2\}} x_k \sum_{i,j,l} (\sigma_{ij,l})^2 \, dx = \gamma_k, \]
for each k = 1, 2, 3. For the construction of those tensors, see Theorems 2.1
and 2.2 of [122].
CHAPTER 9
9.1. Introduction
This chapter arose from class notes for the minicourse “On the Penrose Inequality” taught by
Fernando Schwartz in the 2012 MSRI Summer School on Mathematical Relativity and the 2013
Summer School in Cortona, Italy. Brian Allen attended the MSRI course and contributed to the
writing, having carefully typed the contents of the course and improved the exposition, particularly
in Section 9.4.
358 9. O N THE R IEMANNIAN P ENROSE INEQUALITY
the inverse mean curvature flow. Their proof has an unavoidable limitation: the
term A in (9.1.1) is the area of any single component of the outermost minimal
surface. In the same year, Bray [26] proved the general statement of the RPI
using a conformal flow and the positive mass theorem. (Bray’s proof was later
generalized by Bray and Lee to work up to dimension seven [28].) Finally, Lam
[138] proved the RPI for graphical manifolds, in arbitrary dimensions, in 2010.
Related results in different ambient spaces appear in [95].
In this chapter we present the main ideas of these three proofs of the RPI.
We start with Lam’s proof, which is the simplest; we move on to Huisken and
Ilmanen’s proof, which is the one that builds on Geroch’s original argument; and
we end with Bray’s proof, which gives the most general result to date.
9.2. Preliminaries
precisely:
Theorem 9-2 (Riemannian Penrose inequality). Let (M³, g) be an asymptotically
flat Riemannian manifold with nonnegative scalar curvature and ADM
mass m. Let Σ² ⊂ M be the outermost minimal hypersurface in M, and denote
by |Σ| its area. Then
\[ m \ge \sqrt{\frac{|\Sigma|}{16\pi}}. \]
Equality is achieved if and only if (M³, g) is isometric (outside Σ) to the
Riemannian Schwarzschild manifold of mass m.
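The equality case can be checked by hand for the Schwarzschild metric in its isotropic form g_m = (1 + m/(2|x|))⁴ g_E recalled in Chapter 8. The sketch below, with hypothetical function names, computes the area of the coordinate sphere |x| = ρ, confirms that ρ = m/2 gives the smallest such area among the nearby spheres sampled, and compares the resulting lower bound with m.

```python
import math

def schwarzschild_sphere_area(m, rho):
    """Area of {|x| = rho} in the metric (1 + m/(2|x|))^4 g_E on R^3 minus the origin."""
    u = 1.0 + m / (2.0 * rho)
    return 4.0 * math.pi * rho**2 * u**4

def penrose_lower_bound(area):
    """Right-hand side sqrt(|Sigma| / (16 pi)) of the inequality."""
    return math.sqrt(area / (16.0 * math.pi))

if __name__ == "__main__":
    m = 1.7
    horizon_area = schwarzschild_sphere_area(m, m / 2.0)
    print(horizon_area, 16.0 * math.pi * m**2)
    print(penrose_lower_bound(horizon_area), m)
```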
Throughout this chapter we will discuss three different proofs of the Rie-
mannian Penrose inequality that apply to three slightly different, precisely stated
versions of the inequality above. Each one of the proofs gives us a different
perspective on the problem, but only Bray’s proof fully establishes Theorem 9-2.
In Section 9.3 we discuss Lam’s proof of the PMT, as well as his argument
for proving the RPI for a manifold that is a graph over Rn . Both proofs work in
arbitrary dimensions.
In Section 9.4 we discuss Huisken and Ilmanen’s proof of the Riemannian
Penrose inequality. Their proof develops a weak setting in which Geroch’s
monotonicity formula for the Hawking mass under inverse mean curvature flow
can be applied. The techniques contain many interesting ideas of independent
interest, some of which have had a broader impact in geometric analysis and
general relativity (see e.g. [167; 213]).
Finally, in Section 9.5 we discuss Bray’s proof of the Riemannian Penrose
inequality. Bray’s argument fully proves Theorem 9-2. It uses the PMT together
with a novel conformal flow of the metric and a mass-capacity inequality.
Our goal in this chapter is to provide a gentle introduction to the main ideas,
central arguments, and motivating calculations that go into the aforementioned
proofs. In particular both Huisken and Ilmanen’s proof and Bray’s proof are quite
involved and require deep knowledge and lengthy calculations from elliptic PDE
and geometric measure theory, for instance. Thus, we shall not give a complete
account of their arguments here. Rather, we hope that the interested reader will
be intrigued enough by the sheer beauty of the results, ultimately visiting the
original works in search of their full exposition.
For this section we will adapt the definition of asymptotically flat manifolds to
the setting of graphs over Rn .
Definition 9-3. Let Ω ⊂ ℝ^n, n ≥ 3, be a bounded open set (possibly empty),
and let f : ℝ^n ∖ Ω → ℝ be a smooth function. Let M = {(x, f(x)) : x ∈ ℝ^n ∖ Ω}
denote the graph of f in ℝ^{n+1}, and we denote partial derivatives with subscripts.
We say that M is asymptotically flat if
Let us state now the positive mass theorem and the Riemannian Penrose
inequality for the case of graphs.
Theorem 9-4 (PMT for graphs). Let (M^n, g) be as in Definition 9-3, with Ω = ∅.
Then the ADM mass of M satisfies
\[ m = \frac{1}{2(n - 1)\omega_{n-1}} \int_M \frac{R(g)}{\sqrt{1 + |\nabla f|^2}} \, dV_g, \]
where R(g) and dV_g are the scalar curvature and volume measure of (M, g),
respectively, ω_{n−1} is the area of the (n − 1)-dimensional unit sphere and
|∇f|² = f_1² + · · · + f_n².
Remark 9-5. If R(g) ≥ 0 above, then m ≥ 0. Also, assuming R(g) ≥ 0, then
m = 0 if and only if f is constant. In other words, m = 0 if and only if M is flat
Euclidean space; see [121].
Lam’s formula is quite general and does not need an assumption on the scalar
curvature. This observation has been used for extending the PMT to different
settings (see [160], for example). The RPI for graphs is the following inequality.
Theorem 9-6 (Riemannian Penrose inequality for graphs). Let (M, g) be isometric
to the smooth asymptotically flat graph of f : ℝ^n ∖ Ω → ℝ, where Ω ⊂ ℝ^n
is a smooth bounded open set that is the union of its finitely many components,
the closure of each of which is a convex smooth compact set. Assume that f is
constant on each component of ∂Ω and |∇f(x)| → ∞ as x → Σ := ∂Ω. Then if
L AM ’ S PROOF OF THE RPI ( AND PMT) FOR GRAPHS , IN ARBITRARY DIMENSIONS 361
|Σ| is the area of Σ, and if R(g) and dV_g are the scalar curvature and volume
measure of g, the ADM mass of M satisfies
\[ m \ge \frac{1}{2} \left( \frac{|\Sigma|}{\omega_{n-1}} \right)^{\frac{n-2}{n-1}} + \frac{1}{2(n - 1)\omega_{n-1}} \int_M \frac{R(g)}{\sqrt{1 + |\nabla f|^2}} \, dV_g. \]
Hence, if R(g) ≥ 0, then m ≥ ½ (|Σ|/ω_{n−1})^{(n−2)/(n−1)}.
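To see how the n-dimensional lower bound relates to the three-dimensional statement of Theorem 9-2, one can evaluate it directly. The following sketch uses the standard formula ω_{n−1} = 2π^{n/2}/Γ(n/2) for the area of the unit sphere; the function names are hypothetical.

```python
import math

def unit_sphere_area(k):
    """Area of the unit k-dimensional sphere in R^(k+1)."""
    return 2.0 * math.pi ** ((k + 1) / 2.0) / math.gamma((k + 1) / 2.0)

def graph_penrose_bound(area, n):
    """Lower bound (1/2) * (|Sigma| / omega_{n-1})^((n-2)/(n-1)) for n >= 3."""
    omega = unit_sphere_area(n - 1)
    return 0.5 * (area / omega) ** ((n - 2.0) / (n - 1.0))

if __name__ == "__main__":
    area = 50.0
    # For n = 3 the bound reduces to sqrt(|Sigma| / (16 pi)).
    print(graph_penrose_bound(area, 3), math.sqrt(area / (16.0 * math.pi)))
    # For a round sphere of Euclidean radius rho in R^n the bound is rho^(n-2) / 2.
    n, rho = 5, 2.0
    print(graph_penrose_bound(unit_sphere_area(n - 1) * rho ** (n - 1), n))
```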
Remark 9-7. Lam’s proof of the RPI does not deal with the case of equality.
Nevertheless, Huang and Wu establish it in [121]. Theorem 9-6 can be generalized
to the case of 6 mean-convex and outer-minimizing. This follows from estimates
by Freire and Schwartz [95], as we will see at the end of this section.
We let the mean curvature H of a hypersurface Σ with respect to a smooth unit
normal field ν be the divergence of ν taken along Σ (note the sign convention). A
standard formula for the mean curvature of graphs [41] gives the mean curvature
H of Σ with respect to the induced metric as
\[ H = \frac{H_0}{\sqrt{1 + |\nabla f|^2}}, \]
where H_0 is the mean curvature of Σ inside Euclidean space. So the assumption
that |∇f| → ∞ on Σ guarantees that H = 0; i.e., Σ is a minimal surface.
Before we can prove the above theorems we will need some preliminary
results, the proofs of which can be found in [41; 138]. In what follows, we will
use the convention of summing over pairs of repeated indices.
Lemma 9-8. The induced metric of a manifold M^n that is the graph of a function
f : ℝ^n → ℝ is given by
\[ g_{ij} = \delta_{ij} + f_i f_j, \qquad g^{ij} = \delta^{ij} - \frac{f_i f_j}{1 + |\nabla f|^2}. \]
The Christoffel symbols of this metric are
\[ \Gamma^k_{ij} = \frac{f_{ij} f_k}{1 + |\nabla f|^2}, \]
\[ \Gamma^k_{ij,k} = \frac{f_{ijk} f_k}{1 + |\nabla f|^2} + \frac{f_{ij} f_{kk}}{1 + |\nabla f|^2} - \frac{2 f_{ij} f_{kl} f_k f_l}{(1 + |\nabla f|^2)^2}. \]
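The formula for the inverse metric in Lemma 9-8 is easy to confirm numerically at a single point. The sketch below takes an arbitrary (hypothetical) value of the gradient ∇f, builds both matrices, and multiplies them; the helper names are our own.

```python
def graph_metric(grad_f):
    """g_ij = delta_ij + f_i f_j at a point where the gradient of f is grad_f."""
    n = len(grad_f)
    return [[(1.0 if i == j else 0.0) + grad_f[i] * grad_f[j] for j in range(n)]
            for i in range(n)]

def graph_inverse_metric(grad_f):
    """g^ij = delta^ij - f_i f_j / (1 + |grad f|^2)."""
    n = len(grad_f)
    w = 1.0 + sum(c * c for c in grad_f)
    return [[(1.0 if i == j else 0.0) - grad_f[i] * grad_f[j] / w for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

if __name__ == "__main__":
    p = [0.3, -1.2, 2.0]  # hypothetical value of the gradient of f
    for row in matmul(graph_metric(p), graph_inverse_metric(p)):
        print(row)
```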
Combining the formula
for the scalar curvature R(g) in terms of the Christoffel symbols with Lemma 9-8
we obtain the following.
Lemma 9-9. The scalar curvature R(g) of a manifold M that is the graph of a
function f : ℝ^n → ℝ is given by
\[ R(g) = \frac{1}{1 + |\nabla f|^2} \left( f_{ii} f_{jj} - f_{ij} f_{ij} - \frac{2 f_j f_k}{1 + |\nabla f|^2} \big( f_{ii} f_{jk} - f_{ij} f_{ik} \big) \right). \]
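The formula of Lemma 9-9 can be tested against two cases where the scalar curvature is known independently. The sketch below, with hypothetical helper names, evaluates the formula from the gradient and Hessian of f at a point. For n = 2 it should equal twice the Gaussian curvature (f_11 f_22 − f_12²)/(1 + |∇f|²)² of the graph, and for f(x) = (1 − |x|²)^{1/2}, whose graph is part of the unit sphere, it should equal n(n − 1).

```python
import math

def graph_scalar_curvature(grad, hess):
    """Scalar curvature of the graph of f from Lemma 9-9, given f_i and f_ij at a point."""
    n = len(grad)
    w = 1.0 + sum(c * c for c in grad)
    laplacian = sum(hess[i][i] for i in range(n))
    hess_norm_sq = sum(hess[i][j] ** 2 for i in range(n) for j in range(n))
    # cross = f_j f_k (f_ii f_jk - f_ij f_ik), summed over repeated indices
    cross = 0.0
    for j in range(n):
        for k in range(n):
            hess_hess = sum(hess[i][j] * hess[i][k] for i in range(n))
            cross += grad[j] * grad[k] * (laplacian * hess[j][k] - hess_hess)
    return (laplacian**2 - hess_norm_sq - 2.0 * cross / w) / w

def unit_hemisphere_jets(x):
    """Gradient and Hessian of f(x) = sqrt(1 - |x|^2) at a point with |x| < 1."""
    n = len(x)
    f = math.sqrt(1.0 - sum(c * c for c in x))
    grad = [-c / f for c in x]
    hess = [[-(1.0 if i == j else 0.0) / f - x[i] * x[j] / f**3 for j in range(n)]
            for i in range(n)]
    return grad, hess

if __name__ == "__main__":
    grad, hess = [0.3, -1.2], [[2.0, 0.5], [0.5, -1.0]]
    w = 1.0 + sum(c * c for c in grad)
    gauss = (hess[0][0] * hess[1][1] - hess[0][1] ** 2) / w**2
    print(graph_scalar_curvature(grad, hess), 2.0 * gauss)
    print(graph_scalar_curvature(*unit_hemisphere_jets([0.1, 0.2, 0.3])))
```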
Lemma 9-10. The scalar curvature of a graph over ℝ^n is given by
\[ R(g) = \operatorname{div}_0 \left( \frac{1}{1 + |\nabla f|^2} \big( f_{ii} f_j - f_{ij} f_i \big) \frac{\partial}{\partial x_j} \right), \]
where div_0 is the divergence with respect to the Euclidean metric.
The proof, a simple calculation using Lemma 9-9, is left as an exercise.
We will need one more lemma relating the volume form and mean curvature
with respect to g to the volume form and mean curvature with respect to the
Euclidean metric.
Lemma 9-11. Let dV_g be the volume measure of (M, g), and let dV_0 be the
Euclidean volume measure on ℝ^n. Using pullback under graphical coordinates,
we have
\[ dV_g = \sqrt{1 + |\nabla f|^2}\, dV_0. \]
Proof of Theorems 9-4 and 9-6. We will prove both theorems at once, where we
note that for Theorem 9-4 we will use the assumption that M is a graph over all
of ℝ^n, whereas for Theorem 9-6 we will use the assumption that M is a graph
over ℝ^n ∖ Ω, and Ω ≠ ∅. The mass of (M, g) is given by the following, where
S_r = {x ∈ ℝ^n : |x| = r} and dS_r is the Euclidean surface measure associated to
this set:
\begin{align*}
m &= \lim_{r \to \infty} \frac{1}{2(n - 1)\omega_{n-1}} \int_{S_r} (g_{ij,i} - g_{ii,j})\, \nu^j \, dS_r \\
&= \lim_{r \to \infty} \frac{1}{2(n - 1)\omega_{n-1}} \int_{S_r} (f_{ii} f_j - f_{ij} f_i)\, \nu^j \, dS_r \\
&= \lim_{r \to \infty} \frac{1}{2(n - 1)\omega_{n-1}} \int_{S_r} \frac{f_{ii} f_j - f_{ij} f_i}{1 + |\nabla f|^2}\, \nu^j \, dS_r,
\end{align*}
where the last equality follows from the fact that
\[ \frac{1}{1 + |\nabla f(x)|^2} = 1 + O(|x|^{-p}) \quad \text{as } |x| \to \infty, \]
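by Definition 9-3. To prove Theorem 9-4 we use the divergence theorem. We
notice that when M is a graph over all of ℝ^n there is no boundary term. Applying
the divergence theorem gives
\begin{align*}
m &= \frac{1}{2(n - 1)\omega_{n-1}} \int_{\mathbb{R}^n} \operatorname{div}_0 \left( \frac{f_{ii} f_j - f_{ij} f_i}{1 + |\nabla f|^2} \frac{\partial}{\partial x_j} \right) dV_0 \\
&= \frac{1}{2(n - 1)\omega_{n-1}} \int_{\mathbb{R}^n} \frac{R(g)}{\sqrt{1 + |\nabla f|^2}} \, dV_g,
\end{align*}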
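as desired. For the case where the asymptotic end of M is a graph over ℝ^n ∖ Ω,
with Ω ≠ ∅, we apply the divergence theorem and this time we get a boundary
term. Since |∇f(x)| → ∞ as x → Σ, we must proceed carefully.
By virtue of Lemma 9-11 we see that
\[ \frac{f_{ii} f_j - f_{ij} f_i}{1 + |\nabla f|^2}\, \nu^j = H_0\, \frac{|\nabla f|^2}{1 + |\nabla f|^2}. \]
The behavior of |∇f(x)|² as x → Σ is no cause for concern since |∇f|²/(1 + |∇f|²) → 1.
With ν pointing into Ω along Σ, we find
\begin{align*}
m &= \frac{1}{2(n - 1)\omega_{n-1}} \left( \int_{\mathbb{R}^n \setminus \Omega} \operatorname{div}_0 \left( \frac{f_{ii} f_j - f_{ij} f_i}{1 + |\nabla f|^2} \frac{\partial}{\partial x_j} \right) dV_0 + \int_{\Sigma} \frac{f_{ii} f_j - f_{ij} f_i}{1 + |\nabla f|^2}\, \nu^j \, d\mu_0 \right) \\
&= \frac{1}{2(n - 1)\omega_{n-1}} \left( \int_{\mathbb{R}^n \setminus \Omega} \frac{R(g)}{\sqrt{1 + |\nabla f|^2}} \, dV_g + \int_{\Sigma} H_0 \, d\mu_0 \right),
\end{align*}
where dµ_0 is the surface measure of Σ with respect to the Euclidean metric.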
[Figure: recoverable labels "small H" and "large speed".]
Now we switch gears back to the Riemannian Penrose inequality, and define
the Hawking mass of a hypersurface. This quantity is a crucial ingredient in
Huisken and Ilmanen’s proof of the RPI.
The outline of the proof of the RPI envisioned by Geroch [105] and further
developed by Jang and Wald [131] is based on the key facts collected in the
following proposition, from which Theorem 9-14 follows.
Hence
\[ \sqrt{\frac{|\Sigma_0|}{16\pi}} = m_H(\Sigma_0) \le m_H(\Sigma_t) \le \lim_{t \to \infty} m_H(\Sigma_t) = m_{\mathrm{ADM}}(M) = m. \]
Conclusion (i) follows by Definition 9-18, and (ii) follows from the Geroch
monotonicity formula in Section 9.4.1 below. Underlying (iii) is a result (similar
to Theorem 9-17 above) on weak convergence to a round sphere, in which
Huisken and Ilmanen are able to relate the Hawking mass to a limit of integrals
over coordinate spheres. From these three key facts the proof of the RPI follows,
so long as the IMCF has a smooth solution for all time, which does not happen
in general. Huisken and Ilmanen’s contribution consists in the monumental task
of developing a weak existence setting for which the above facts still remain
true.
H UISKEN AND I LMANEN ’ S PROOF OF THE RPI USING IMCF 367
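\[ \frac{\partial H}{\partial t} = -\Delta \frac{1}{H} - \frac{|A|^2}{H} - \frac{\mathrm{Rc}(\nu, \nu)}{H}, \qquad \frac{\partial}{\partial t}\, d\mu_t = d\mu_t, \]
\begin{align*}
\frac{\partial}{\partial t} \int_{\Sigma_t} H^2 \, d\mu_t
&= \int_{\Sigma_t} \Big( 2H \frac{\partial H}{\partial t}\, d\mu_t + H^2 \frac{\partial}{\partial t}\, d\mu_t \Big) \\
&= \int_{\Sigma_t} \Big( -2H \Delta \frac{1}{H} - 2|A|^2 - 2\,\mathrm{Rc}(\nu, \nu) + H^2 \Big)\, d\mu_t \\
&= \int_{\Sigma_t} \Big( -2 \frac{|\nabla H|^2}{H^2} - |A|^2 - \bar{R} + 2\sigma_{\Sigma_t} \Big)\, d\mu_t \\
&= 4\pi \chi(\Sigma_t) + \int_{\Sigma_t} \Big( -2 \frac{|\nabla H|^2}{H^2} - \tfrac{1}{2} H^2 - \tfrac{1}{2} (\lambda_1 - \lambda_2)^2 - \bar{R} \Big)\, d\mu_t \\
&\le \frac{1}{2} \Big( 16\pi - \int_{\Sigma_t} H^2 \, d\mu_t \Big),
\end{align*}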
where we integrated by parts and used the Gauss equation in the third equality,
and then used Gauss–Bonnet and the fact that |A|² = ½H² + ½(λ_1 − λ_2)² in the
fourth. The last inequality follows from the assumption R̄ ≥ 0 and the fact that
if (smooth) IMCF exists for all t ∈ [0, ∞), the topology of Σ_t remains the same
for all t. Hence, if Σ_0 is a sphere (topologically), then Σ_t remains a topological
sphere for all t; in particular it remains connected, and χ(Σ_t) = 2 for all t; of
course if Σ_t is a closed, connected, orientable surface, then χ(Σ_t) ≤ 2 and the
inequality would also persist.
is nondecreasing in t. Using |Σ_t|^{1/2} = |Σ_0|^{1/2} e^{t/2} in the preceding equation
proves that the Hawking mass is nondecreasing along IMCF. This fact is known
as Geroch’s monotonicity formula.
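A concrete instance where the Hawking mass can be computed in closed form is the family of coordinate spheres in Schwarzschild. The sketch below assumes the standard expression m_H(Σ) = (|Σ|/16π)^{1/2} (1 − (1/16π) ∫_Σ H² dµ) for the Hawking mass (the definition itself is not restated in this passage, so this normalization is an assumption, chosen to agree with m_H(Σ_0) = (|Σ_0|/16π)^{1/2} for a minimal surface), together with the conformal transformation rule H = u⁻²(H_0 + 4 ∂_ν u / u) for the metric u⁴ g_E in dimension three. The function names are hypothetical.

```python
import math

def hawking_mass(area, willmore_energy):
    """Assumed form: sqrt(area / 16 pi) * (1 - willmore_energy / 16 pi)."""
    return math.sqrt(area / (16.0 * math.pi)) * (1.0 - willmore_energy / (16.0 * math.pi))

def schwarzschild_sphere(m, rho):
    """Area and integral of H^2 for {|x| = rho} in the metric (1 + m/(2|x|))^4 g_E."""
    u = 1.0 + m / (2.0 * rho)
    area = 4.0 * math.pi * rho**2 * u**4
    # H = u^-2 (2/rho + 4 u'/u) with u' = -m / (2 rho^2), simplified below.
    mean_curvature = 2.0 * (1.0 - m / (2.0 * rho)) / (rho * u**3)
    return area, mean_curvature**2 * area

if __name__ == "__main__":
    m = 1.3
    for rho in (m / 2.0, 1.0, 10.0, 1000.0):
        print(rho, hawking_mass(*schwarzschild_sphere(m, rho)))
```

Under these assumptions the Hawking mass of every such sphere equals m, so the monotone quantity is constant along this family, consistent with Schwarzschild being the equality case.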
Remark 9-20. If we could prove that a smooth solution of IMCF starting from
a minimal surface exists for all time, we would have a proof of the RPI. This
is the heuristic argument that originally motivated Geroch to pursue IMCF.
Unfortunately, this cannot be expected, as there are counterexamples where
smooth solutions of IMCF do not exist for all time. Indeed, if we start the flow
with initial condition consisting of two spheres far apart (far away black holes),
both surfaces will expand exponentially fast under IMCF, as we have seen. Thus,
they eventually meet, and this produces a “jump” in the topology. More precisely,
embeddedness of the flow no longer holds [125]. For another example, take for
initial condition a thin torus — one for which the inner radius is much smaller
than the outer radius. Then H > 0, so under IMCF the torus will expand, but it
cannot expand forever. Eventually H → 0 somewhere and |A|² → ∞ [125].
9.4.2. IMCF in the weak setting. The weak setting developed by Huisken and
Ilmanen in [125] produces a flow that exists for all time starting off with any
reasonable initial condition. Their weak solution is unique and has “jumps”, but
nevertheless associated quantities such as the area of the flowing hypersurface
remain continuous (except possibly at time zero). Other quantities, such as
the total mean curvature, remain monotonic along the weak flow. Here we
present the main ideas in the construction of weak solutions and mention some
of its consequences, while omitting details and heavy calculations, such as those
involving geometric measure theory.
The first step in defining weak solutions to IMCF is to rewrite the IMCF
equation as a (degenerate) elliptic PDE. This is accomplished by defining a level
set formulation for which Σ_t is the level set of a function. More precisely, we
let u : Ω ⊂ M → ℝ, Ω ⊂ M open, and define
\[ E_t := \{u < t\}, \qquad \Sigma_t := \partial E_t, \qquad E_t^+ := \operatorname{int}\{u \le t\}, \qquad \Sigma_t^+ := \partial E_t^+, \]
where ∂ and int represent the topological boundary and interior of a set.
If x(t) is a path with u(x(t)) = t (i.e., x(t) ∈ Σ_t), then as ν = ∇u/|∇u|, we see the
normal speed of a moving level set is given by
\[ \left( \frac{dx}{dt} \right)^{N} = \frac{1}{|\nabla u(x)|}. \]
If the level sets were to satisfy IMCF, so that dx/dt = ν/H with x(t) = F(p, t), we
would obtain
\[ \operatorname{div}_g \left( \frac{\nabla u}{|\nabla u|} \right) = |\nabla u|, \tag{9.4.2} \]
as the term on the left is the mean curvature of a level set.
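In flat ℝ^n the level set equation (9.4.2) has the explicit solution u(x) = (n − 1) log |x|, whose level sets are the exponentially expanding round spheres |x| = e^{t/(n−1)}. The sketch below checks this with central finite differences in the Euclidean metric; it is only an illustration in the flat case, and the helper names and step sizes are our own choices.

```python
import math

def log_radius_solution(n):
    """u(x) = (n - 1) log|x| on R^n minus the origin."""
    def u(x):
        return (n - 1) * math.log(math.sqrt(sum(c * c for c in x)))
    return u

def gradient(u, x, h=1e-5):
    grad = []
    for i in range(len(x)):
        forward, backward = list(x), list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((u(forward) - u(backward)) / (2.0 * h))
    return grad

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def unit_normal(u, x):
    grad = gradient(u, x)
    length = norm(grad)
    return [c / length for c in grad]

def level_set_residual(u, x, h=1e-3):
    """div(grad u / |grad u|) - |grad u| in the Euclidean metric, by central differences."""
    divergence = 0.0
    for i in range(len(x)):
        forward, backward = list(x), list(x)
        forward[i] += h
        backward[i] -= h
        divergence += (unit_normal(u, forward)[i] - unit_normal(u, backward)[i]) / (2.0 * h)
    return divergence - norm(gradient(u, x))

if __name__ == "__main__":
    print(level_set_residual(log_radius_solution(3), [1.0, 2.0, -0.5]))
    print(level_set_residual(log_radius_solution(4), [0.5, 1.0, 1.5, -1.0]))
```

The residual is zero up to discretization error, while replacing n − 1 by a different constant C leaves a residual of size (n − 1 − C)/|x|.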
Huisken and Ilmanen define weak solutions of IMCF by constructing a functional
whose Euler–Lagrange equation is (almost) (9.4.2). Actually, they want to
find a functional whose Euler–Lagrange equation “freezes” the right-hand side
of (9.4.2), so to speak. The idea is the following. We note that the difference
between Σ_t and Σ_t⁺ from above arises when u is constant on a set with nonempty
interior. Let us define the values t̂ for which Σ_t̂ ≠ Σ⁺_t̂, and call them jump times,
since Σ_t will not evolve smoothly into Σ_t̂; rather, Σ_t̂ will jump (instantly) to Σ⁺_t̂
in the weak setting. With this in mind, we define weak solutions to the IMCF
as follows. Fix a locally Lipschitz function u and a compact set K ⊂ Ω, and
consider the functional J_u^K defined as
\[ J_u^K(v) = \int_K \big( |\nabla v| + v\,|\nabla u| \big)\, dv_g, \]
finite perimeter, and coincides with the usual notion of boundary on smooth sets.
(See [108] for more on these geometric measure theoretical notions.)
Huisken and Ilmanen are able to prove using standard regularity results from
geometric measure theory that Σ_t is at least C^{1,α} [125] (ambient dimension less
than eight). Thus, we will assume in what remains of this section that all sets we
deal with are at least this regular. Now we will describe an intuitive, geometric
way of understanding weak solutions of IMCF using the definitions from above.
Since equation (9.4.2) may be degenerate when |∇u| = 0, we need to regularize
the PDE in order to prove existence of solutions. It turns out that this process
also holds the key for showing that the monotonicity of the Hawking mass can
be applied in the weak setting, and so it is important to understand how to
accomplish this.
To regularize the PDE (9.4.2) we take ϵ > 0, and we consider the equation
\[ \operatorname{div}_g \left( \frac{\nabla u_{\epsilon}}{\sqrt{|\nabla u_{\epsilon}|^2 + \epsilon^2}} \right) = \sqrt{|\nabla u_{\epsilon}|^2 + \epsilon^2}. \tag{9.4.3} \]
If u_ϵ is a smooth solution of (9.4.3), and if we define U_ϵ : Ω × ℝ → ℝ
by U_ϵ(x, z) = u_ϵ(x) − ϵz, then U_ϵ is a smooth solution to (9.4.2), but in one
dimension higher. In fact if we let Σ̃_t^ϵ := {U_ϵ = t}, then one can show that
\[ \tilde{\Sigma}_t^{\epsilon} = \operatorname{graph} \left( \frac{u_{\epsilon}}{\epsilon} - \frac{t}{\epsilon} \right), \]
and hence Σ̃_t^ϵ is a translating solution of IMCF inside Ω × ℝ (see Figure 9).
Huisken and Ilmanen show that existence of solutions to (9.4.3) is guaranteed
by the existence of a subsolution of (9.4.2) which can be taken to be an expo-
nentially expanding coordinate sphere, e.g., u(x) = C log |x|, for some C > 0.
[Figure 9: translating solution in one dimension higher. Recoverable labels: graph u − t₁, {u < t₁}, Σ^n, n + 1.]
(Since this is a strong subsolution it is defined in the usual way for (9.4.2).)
The estimates in [125] also give that for a suitable subsequence ϵ_i → 0⁺, we have
that U_{ϵ_i}(x, z) → u(x) as well as Σ̃_t^{ϵ_i} → Σ_t × ℝ, locally in C^{1,α}. This last fact is
key for proving the monotonicity of the Hawking mass under weak solutions, as
we see below.
Now we turn to developing an intuitive, geometric way of understanding weak
solutions of IMCF. In order to do that, we define the following notion.
Definition 9-21. Let Ω ⊂ M be open. We say that E ⊂ M is a minimizing hull
if it minimizes area on the outside (in Ω). That is, E satisfies
|∂*E ∩ K| ≤ |∂*F ∩ K|
for any F such that E ⊂ F and F ∖ E ⊂ Ω is compact, and any compact set
K ⊂ Ω such that F ∖ E ⊂ K. Additionally, we say that E is a strictly minimizing
hull if equality above implies that F ∩ Ω = E ∩ Ω almost everywhere.
Minimizing hulls minimize area amongst all competing sets that contain
them, and are thus called outer-minimizing sets. Finding the minimizing hull
containing some set is sometimes called the shrink wrap problem, since it amounts
to finding the least area enclosure of a given region. By the first variation formula
for the area, an outer-minimizing hull must have nonnegative (weak) mean
curvature everywhere. (Otherwise, we could choose a compactly supported,
outward-pointing deformation of the hull with lesser area.) In particular, an
outer-minimizing, closed minimal surface is a minimizing hull. This suggests
that we should be able to run the weak IMCF with such a surface as the initial
condition, as it will immediately flow to a surface with positive (weak) mean
curvature.
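The first-variation argument in the preceding paragraph can be written out as follows. This is only a sketch, under the assumption that $\partial E$ is smooth near the support of the deformation; the speed function $\varphi$ and the family $E_s$ are illustrative names that do not appear in the text.

```latex
% Assumptions: \partial E is smooth near supp(\varphi), and H denotes the mean
% curvature of \partial E with respect to the outward unit normal \nu.
% Deform E outward through sets E_s \supset E whose boundary moves with
% normal speed \varphi \ge 0 (compactly supported). The first variation of area is
\[
  \frac{d}{ds}\Big|_{s=0} |\partial E_s|
  \;=\; \int_{\partial E} H \, \varphi \, d\mu .
\]
% Suppose H < 0 on some open subset U of \partial E. Choose \varphi \ge 0,
% \varphi \not\equiv 0, supported in U. Then
\[
  \frac{d}{ds}\Big|_{s=0} |\partial E_s|
  \;=\; \int_{U} H \, \varphi \, d\mu \;<\; 0 ,
\]
% so |\partial E_s| < |\partial E| for small s > 0 while E \subset E_s.
% This contradicts the outer-minimizing property, hence H \ge 0.
```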
Proposition 9-22 (Huisken–Ilmanen [125, Proposition 1.4]). Consider a weak
solution to IMCF as above, and assume that $E_t$ is precompact. Then:
(i) $E_t$ is a minimizing hull in $M$ for all $t > 0$.
(ii) $E_t^+$ is a strictly minimizing hull in $M$ for all $t \ge 0$.
(iii) $|\Sigma_t| = |\Sigma_t^+|$ for all $t \ge 0$, provided that $E_0$ is a minimizing hull.
Using this, we can portray a (heuristic) geometric characterization of weak
solutions to the flow:
• $E_t$ flows by the smooth IMCF as long as $E_t$ is a strictly minimizing hull.
• $E_t$ jumps to its strictly minimizing hull $E_t^+$ when it is not a strictly minimizing hull.
[Figure 10. Labels: $u > t$; $u < t$ (two regions); $u = t$ (jump region).]
In Figure 10 we see two spheres that are flowing under weak IMCF until
the moment when they can be enclosed by a “peanut” — two spherical caps
joined by a catenoidal bridge — of the same area. The weak solution will then
instantly jump across the region between these two surfaces, and resume from
the outermost surface.
(There is an alternative notion of weak solutions to IMCF using viscosity
solutions [53]. Such solutions will agree with smooth IMCF until a singularity
is formed; but the problem arises that, after a singular time, $\Sigma_t$ may no longer
be a hypersurface. Therefore, a viscosity solution of the IMCF is less
desirable, since it is not clear how to preserve — or even make sense of — the
monotonicity of the Hawking mass.)
From this intuitive geometric picture we can expect at jump times under
weak IMCF that (1) the area of the flowing hypersurface remains continuous
($|\partial E_t^+| = |\partial E_t|$), and (2) the mean curvature does not increase. Therefore, the
following should hold at jump times:
\[ \int_{\partial E_t^+} H^2 \, d\mu_t \le \int_{\partial E_t} H^2 \, d\mu_t . \]
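To see why these two expectations are what monotonicity needs at a jump, one can substitute them into the Hawking mass. The sketch below assumes the standard definition $m_H(\Sigma) = \sqrt{|\Sigma|/16\pi}\,\bigl(1 - \frac{1}{16\pi}\int_\Sigma H^2\,d\mu\bigr)$, which we take to agree with the one used earlier in the chapter.

```latex
% Assumed definition (standard Hawking mass of a surface \Sigma in a 3-manifold):
%   m_H(\Sigma) = \sqrt{|\Sigma| / 16\pi} \, ( 1 - (1/16\pi) \int_\Sigma H^2 d\mu ).
% At a jump time t we expect
%   (1) |\partial E_t^+| = |\partial E_t|,
%   (2) \int_{\partial E_t^+} H^2 d\mu_t \le \int_{\partial E_t} H^2 d\mu_t.
\begin{align*}
  m_H(\partial E_t^+)
  &= \sqrt{\frac{|\partial E_t^+|}{16\pi}}
     \left( 1 - \frac{1}{16\pi} \int_{\partial E_t^+} H^2 \, d\mu_t \right) \\
  &= \sqrt{\frac{|\partial E_t|}{16\pi}}
     \left( 1 - \frac{1}{16\pi} \int_{\partial E_t^+} H^2 \, d\mu_t \right)
     && \text{by (1)} \\
  &\ge \sqrt{\frac{|\partial E_t|}{16\pi}}
     \left( 1 - \frac{1}{16\pi} \int_{\partial E_t} H^2 \, d\mu_t \right)
     && \text{by (2)} \\
  &= m_H(\partial E_t).
\end{align*}
```

Under these two assumptions the Hawking mass cannot decrease across a jump.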
Remark 9-23. To prove these statements one considers $\tilde{\Sigma}_t^{\epsilon}$, the smooth translating
solutions of IMCF in $M^3 \times \mathbb{R}$, for which the smooth monotonicity of the
Hawking mass calculation holds. With this, weak monotonicity of the Hawking
mass follows by taking a limit as $\epsilon \to 0^+$ so long as $\chi(\Sigma_t) \le 2$. This, in turn, is
accomplished by showing that if $\Sigma_0$ is connected, the weak solution of IMCF
will remain connected for all times. See Sections 4 and 5 of [125] for details.
The proof of part (iii) of Proposition 9-19, in the weak setting, follows from a
weak convergence result for weak solutions of IMCF.
Theorem 9-24 (Huisken–Ilmanen [125, p. 417]). Suppose $(M^3, g)$ is asymptotically
flat. Let $\Sigma_t \subset M$ be a weak solution to IMCF and let $r(t)$ be the area radius,
i.e., the quantity defined implicitly by $|\Sigma_t| = 4\pi r(t)^2$. Then $\frac{1}{r(t)} \Sigma_t$ converges in
$C^1$ to the round unit sphere as $t \to \infty$.
Note that since we are rescaling by the area radius, all rescaled solutions here
converge to the unit sphere. This is in contrast to the result of Theorem 9-17,
where we have rescaled to keep area constant to show that all star-shaped
hypersurfaces converge to a sphere with the same area as $\Sigma_0$.
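As a consistency check on this normalization, in the smooth case only: under smooth IMCF the area satisfies $\frac{d}{dt}|\Sigma_t| = |\Sigma_t|$, so the area radius grows like $e^{t/2}$. The round-sphere computation below is an illustration in Euclidean $\mathbb{R}^3$ (the radius function $R(t)$ is our own notation), not part of the proof of Theorem 9-24.

```latex
% Smooth IMCF: d/dt |\Sigma_t| = |\Sigma_t|  =>  |\Sigma_t| = e^{t} |\Sigma_0|.
% With |\Sigma_t| = 4\pi r(t)^2 this gives
\[
  r(t) = r(0)\, e^{t/2}.
\]
% Example (assumption: Euclidean R^3, round initial sphere).
% A round sphere of radius R has H = 2/R, so the outward speed 1/H = R/2 gives
\[
  R'(t) = \frac{R(t)}{2}
  \quad\Longrightarrow\quad
  R(t) = R(0)\, e^{t/2} = r(t),
\]
% and the rescaled surface \frac{1}{r(t)} \Sigma_t is the unit sphere for every t.
```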
Using Theorem 9-24 it can be shown that $r(t) H_{\Sigma_t}$ converges in $L^2$ to the
analogous quantity for the round sphere. This is helpful for showing that the
Hawking mass of the flowing hypersurface (in the weak setting) converges to the
ADM mass as $t \to \infty$, since the ADM mass is defined as the limit of integrals
over spheres of increasing radius [15]. Generally speaking, the fact that $m_H(\Sigma_t)$
converges to $m_{ADM}(M)$ as $t \to \infty$ is expected, since these two quantities are equal in
the case of the (Riemannian) Schwarzschild manifold. (Compare this to the case
of asymptotically hyperbolic manifolds [167] where this asymptotic property is
not true.) The proof of this statement is somewhat lengthy, though; we give only
a brief overview here.
Let us denote various geometric quantities of $\Sigma_t$ with respect to $(M, g)$ by
$H, A, \nu, d\mu_t, \nabla$, and the corresponding ones with respect to $(\mathbb{R}^3, \delta)$ by $H_0, A_0, \nu_0,
d\mu_t^0, \nabla^0$. If we recall that $|p(x)| \to 0$ as $|x| \to \infty$, where $p_{ij} = g_{ij} - \delta_{ij}$ (because
of our assumption of asymptotic flatness, which also gives us a specified rate of
convergence $|p| \le C/|x|$), we obtain [125, p. 418]
\[ H - H_0 = -h^{ik} p_{kl} h^{lj} A_{ij} + \tfrac12 H \nu^i \nu^j p_{ij} - h^{ij} \nabla_i p_{jl} \nu^l + \tfrac12 h^{ij} \nabla_l p_{ij} \nu^l \pm C|p||\nabla p| \pm C|p|^2 |A|, \]
\[ H_0^2 \, (d\mu_t - d\mu_t^0) = \left( \tfrac12 H^2 h^{ij} p_{ij} \pm |p|^2 |A|^2 \pm C|\nabla p|^2 \right) d\mu_t, \]
of the first term in terms of the 16π factor that we encounter in the definition
of the Hawking mass. After integrating by parts, the remaining terms cancel,
leaving error terms as well as, remarkably, terms in the definition of the ADM
mass. More specifically, using the above expressions the factor $h^{ij} \nabla_i p_{jl} \nu^l$ in
$H - H_0$ recovers terms in the definition of the ADM mass, plus error terms that
integrate to zero in the limit as $t \to \infty$, where we note that $\nabla_i^0 p_{jl} = \nabla_i^0 g_{jl}$ and
$\nabla = \nabla^0 \pm C|p||\nabla p|$. This completes our overview of the proof of asymptotic
convergence of the Hawking mass to the ADM mass in the weak setting. Using
it, we obtain a proof of Theorem 9-14. The interested reader is directed to [125]
for details.
Remark 9-25. One could try to run IMCF on a spacelike slice of spacetime with
initial condition a codimension-two surface, in order to prove the full Penrose
inequality. Huisken and Ilmanen point out in [125] that such a construction
produces a flow that is a forward-backward system of PDEs, and hence does not
possess a good existence theory.
Proof. The full proof of this theorem requires weak solutions to IMCF, so we
will only prove this theorem for the case where IMCF starting from $\Sigma$ is smooth
for all time, e.g. when $\Sigma$ is mean convex and star-shaped. For the rest of the
proof in the weak case see [95].
Claim: Let $\Sigma$ be as above, and consider a (weak) solution of IMCF in $\mathbb{R}^n$ with
initial condition $\Sigma_0 = \Sigma$. Then
\[ \int_{\Sigma_t} H \, d\mu_t \le \left( \int_{\Sigma} H \, d\mu \right) e^{\frac{n-2}{n-1} t} . \]
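Indeed, since $H > 0$ along the flow and $H^2 \le (n-1)|A|^2$ by the Cauchy–Schwarz inequality, we have
\[ H - \frac{|A|^2}{H} - \frac{n-2}{n-1} H = \frac{1}{(n-1)H} \left( H^2 - (n-1)|A|^2 \right) \le 0, \]
and therefore
\[ \frac{d}{dt} \int_{\Sigma_t} H \, d\mu_t = \int_{\Sigma_t} \left( H - \frac{|A|^2}{H} \right) d\mu_t \le \frac{n-2}{n-1} \int_{\Sigma_t} H \, d\mu_t . \]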
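Now if we let $f(t) = |\Sigma_t|^{-\frac{n-2}{n-1}} \int_{\Sigma_t} H \, d\mu_t$, then using $\frac{d}{dt} |\Sigma_t| = |\Sigma_t|$, we find
\[
\begin{aligned}
f'(t) &= \frac{d\, |\Sigma_t|^{-\frac{n-2}{n-1}}}{dt} \int_{\Sigma_t} H \, d\mu_t + |\Sigma_t|^{-\frac{n-2}{n-1}} \frac{d}{dt} \int_{\Sigma_t} H \, d\mu_t \\
&\le -\frac{n-2}{n-1} |\Sigma_t|^{-\frac{n-2}{n-1}} \int_{\Sigma_t} H \, d\mu_t + \frac{n-2}{n-1} |\Sigma_t|^{-\frac{n-2}{n-1}} \int_{\Sigma_t} H \, d\mu_t \\
&= 0.
\end{aligned}
\]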
Then we see that
\[ |\Sigma|^{-\frac{n-2}{n-1}} \int_{\Sigma} H \, d\mu = f(0) \ge \lim_{t \to \infty} f(t) = (n-1) \, \omega_{n-1}^{1/(n-1)}, \]
where the last equality follows from the fact that IMCF converges to a round
sphere. We are omitting the technical part of the argument, which is to show the
convergence in the weak setting for the above quantities. This is not automatic
since we are not integrating $H^2$ as in Huisken and Ilmanen’s work; rather, we are
integrating $H$. See [95] for details. □
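The limiting value can be checked directly on a round sphere, where the scale-invariant quantity $f$ is constant. The computation below is a sanity check under the assumption that $\omega_{n-1}$ denotes the area of the unit sphere $S^{n-1} \subset \mathbb{R}^n$; the symbol $S_R$ for the sphere of radius $R$ is our own notation.

```latex
% Round sphere S_R of radius R in R^n (a hypersurface of dimension n-1):
%   |S_R| = \omega_{n-1} R^{n-1},   H = (n-1)/R,
% where \omega_{n-1} is assumed to be the area of the unit sphere S^{n-1}.
\begin{align*}
  |S_R|^{-\frac{n-2}{n-1}} \int_{S_R} H \, d\mu
  &= \left( \omega_{n-1} R^{n-1} \right)^{-\frac{n-2}{n-1}}
     \cdot \frac{n-1}{R} \cdot \omega_{n-1} R^{n-1} \\
  &= (n-1)\, \omega_{n-1}^{\,1 - \frac{n-2}{n-1}} \, R^{-(n-2)} \, R^{\,n-2} \\
  &= (n-1)\, \omega_{n-1}^{1/(n-1)} .
\end{align*}
% The value is independent of R, consistent with equality holding for round spheres.
```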
The Freire–Schwartz theorem allows us to generalize Theorem 9-6 to the case
where $\partial\Omega$ is a union of smooth, mean-convex, outer-minimizing sets. Related
results in [95] include a generalized Pólya–Szegő inequality, as well as mass-
capacity and volumetric Penrose inequalities for the conformally flat case. The
proofs of these results work in arbitrary dimensions and are independent of the
positive mass theorem. It is worth pointing out that the proof of the mass-capacity
inequality is the only part of Bray’s work [26] where the positive mass theorem
is used; cf. Theorems 9-32 and 9-33 below.
Exercise 9-30. Prove that $d|\Sigma_t|/dt = 0$ in the smooth case; this calculation uses
and motivates the definition of $v_t$.
Step 3. We next sketch the proof that m(t) is nonincreasing. This involves
a generalization of the classical notion of electrostatic capacity of bodies in
Euclidean space (see [183]).
Definition 9-31. Let $(M^3, g)$ be an asymptotically flat manifold with boundary
$\Sigma$. The capacity of $\Sigma$ is
\[ C(\Sigma) = \inf_{\phi \in \mathcal{M}_0^1} \frac{1}{4\pi} \int_M |\nabla \phi|^2 \, dV_g, \]
where $\mathcal{M}_0^1$ denotes the space of smooth functions $\phi$ such that $\phi|_{\Sigma} = 0$ and $\phi \to 1$
as $|x| \to \infty$, and $\nabla$ is the gradient operator of the metric $g$.
Bray uses a mass-capacity inequality to prove that m(t) is nonincreasing, as
we see below. The precise statement is the following.
Theorem 9-32 (mass-capacity inequality [26]). Let $(M^3, g)$ be an asymptotically
flat manifold with nonnegative scalar curvature and boundary $\Sigma$, which is a
minimal surface. Then
\[ m \ge C(\Sigma), \]
where $m$ is the ADM mass of $M$. Equality is achieved if and only if $(M^3, g)$ is
isometric to the Riemannian Schwarzschild manifold.
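The equality case can be verified by hand on the model. The following sketch assumes the isotropic form of the Riemannian Schwarzschild metric, $g = (1 + \frac{m}{2|x|})^4 \delta$ on $\{|x| \ge m/2\}$, and uses the conformal transformation law of the Laplacian in dimension three; the letter $w$ for the conformal factor is our own notation.

```latex
% Assumed model: g = w^4 \delta on {|x| \ge m/2}, with w = 1 + m/(2|x|),
% and horizon \Sigma = {|x| = m/2}.
% Since w is Euclidean-harmonic, g is scalar-flat and, in dimension 3,
%   \Delta_g \phi = w^{-5} \Delta_\delta (w \phi).
% Candidate minimizer:
\[
  \phi(x) = \frac{1 - \frac{m}{2|x|}}{1 + \frac{m}{2|x|}},
  \qquad
  w\phi = 1 - \frac{m}{2|x|}
  \;\Longrightarrow\; \Delta_g \phi = 0 .
\]
% Boundary behaviour: \phi = 0 on |x| = m/2 and \phi \to 1 as |x| \to \infty, with
\[
  \phi(x) = 1 - \frac{m}{|x|} + O(|x|^{-2}).
\]
% Integrating by parts (the boundary term on \Sigma vanishes because \phi = 0 there):
\[
  C(\Sigma)
  = \frac{1}{4\pi} \int_M |\nabla \phi|^2 \, dV_g
  = \lim_{r \to \infty} \frac{1}{4\pi} \int_{|x| = r} \frac{\partial \phi}{\partial \nu} \, d\mu_g
  = \lim_{r \to \infty} \frac{1}{4\pi} \cdot \frac{m}{r^2} \cdot 4\pi r^2
  = m .
\]
```

So $m = C(\Sigma)$ on the Schwarzschild manifold, in agreement with the equality statement of Theorem 9-32.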
The idea behind the proof of Theorem 9-32 is to use a trick due to Bunting
and Masood-ul-Alam [36], which consists in reflecting the metric of M across
the boundary of M in order to obtain an asymptotically flat metric with two ends.
This reflection is nontrivial since, in general, the reflected metric is not smooth
across the boundary. Bray overcomes this difficulty by solving an ODE system,
as we see below. Once a suitable metric has been constructed in the reflected
manifold, Bray applies the following theorem:
Theorem 9-33. Let $(M^3, g)$ be a Riemannian manifold with nonnegative scalar
curvature, multiple asymptotically flat ends and no boundary. Let $E$ be one of its
ends. Then
\[ \tfrac12 m(E) \ge C(E) := \inf_{\phi \in \mathcal{M}_0^1} \frac{1}{4\pi} \int_M |\nabla \phi|^2 \, dV_g, \]
where $\mathcal{M}_0^1$ is the set of all smooth functions in $M$ that approach 1 at infinity in
the end $E$ and approach 0 in all the other ends.
The proof depends on the following observation:
Claim. The function $\varphi \in \mathcal{M}_0^1$ that realizes the infimum of $C(E)$ has the special
form $\varphi(x) = 1 - C(E)/|x| + O(|x|^{-2})$.
The Euler–Lagrange equation for the capacity is the Laplace equation, and
indeed the infimum $C(E)$ is realized by the harmonic function with the appropriate
end behavior (and analogously for $C(\Sigma)$). Since $\varphi$ is harmonic and tends
to 1 at infinity in the chosen end, we have
\[ C(E) = \frac{1}{4\pi} \int_{\Sigma} \frac{d\varphi}{d\nu} \, d\mu_g \]
after integrating by parts over the manifold. Here, $\Sigma$ is the smooth compact
boundary of a set containing all the ends of $M$ except for the chosen end $E$, and
$\nu$ is the unit normal vector to $\Sigma$. Letting $\Sigma$ be an arbitrarily large sphere in $E$,
the claim follows.
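The last step can be spelled out as a flux computation. The sketch below assumes that the minimizer has an expansion $\varphi = 1 - c/|x| + O(|x|^{-2})$ in the end $E$, with the remainder's derivative of order $|x|^{-3}$, and that $g$ is asymptotic to the Euclidean metric there; the constant $c$ and the coordinate spheres $S_r$ are our own notation.

```latex
% Assumptions: in the end E,
%   \varphi(x) = 1 - c/|x| + O(|x|^{-2}),  with  \partial_r(remainder) = O(|x|^{-3}),
% and g approaches \delta, so that d\mu_g = (1 + o(1)) d\mu_\delta on S_r = {|x| = r}.
\[
  \frac{\partial \varphi}{\partial \nu}\Big|_{S_r}
  = \frac{c}{r^2} + O(r^{-3}),
  \qquad
  |S_r|_g = 4\pi r^2 \,(1 + o(1)).
\]
% Taking \Sigma = S_r in the flux formula and letting r \to \infty:
\[
  C(E)
  = \lim_{r \to \infty} \frac{1}{4\pi} \int_{S_r} \frac{\partial \varphi}{\partial \nu} \, d\mu_g
  = \lim_{r \to \infty} \frac{1}{4\pi} \left( \frac{c}{r^2} + O(r^{-3}) \right) 4\pi r^2 \,(1 + o(1))
  = c .
\]
% Hence the coefficient of -1/|x| in the expansion of \varphi is exactly C(E).
```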
Proof of Theorem 9-33. Let $k$ be the number of ends of $M$ other than $E$. Without
loss of generality let us assume that $(M, g)$ is harmonically flat at infinity, and
consider the metric $\tilde g$ defined on $M$ as $\tilde g = \varphi(x)^4 g$, where $\varphi(x)$ is the function
that realizes the infimum in the calculation of $C(E)$ [26]. Since $\varphi$ goes to zero
in all the ends except for $E$, we can use the removable singularity theorem to
extend the metric $\tilde g$ to $\tilde M = M \cup \{\infty_1\} \cup \cdots \cup \{\infty_k\}$. Thus $\tilde M$ is the manifold
obtained from compactifying all ends of $M$ other than $E$. It follows that $\tilde g$ is a
harmonically flat metric on $\tilde M$ with one end, and nonnegative scalar curvature.
To keep track of the mass $\tilde m$ of $(\tilde M, \tilde g)$ in terms of $m = m(E)$, we use the
expansion of $u$ from above. More precisely, since $g$ is harmonically flat, we
know it has the form $g = u^4 \delta$, where $u(x) = 1 + \tfrac12 m/|x| + O(|x|^{-2})$. We can
similarly write $\tilde g = \tilde u^4 \delta = \varphi^4 u^4 \delta$, where $\tilde u(x) = 1 + \tfrac12 \tilde m/|x| + O(|x|^{-2})$. Using
the spherical harmonic expansion for $\varphi$ we find that
\[ m - 2C(E) = \tilde m \ge 0, \]
where the last inequality follows from the positive mass theorem. Putting this
together yields $m \ge 2C(E)$, as desired. See [26] for the case of equality. □
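The two computations behind this proof are short enough to display. The following is a sketch under the assumptions already made in the proof (dimension three, $g = u^4 \delta$ near infinity in $E$, and $\varphi$ harmonic with respect to $g$); it uses the standard conformal transformation law for scalar curvature, with $R_g$ and $R_{\tilde g}$ denoting the scalar curvatures of $g$ and $\tilde g = \varphi^4 g$.

```latex
% (a) Scalar curvature. In dimension 3, for \tilde g = \varphi^4 g,
\[
  R_{\tilde g} = \varphi^{-5} \left( -8 \Delta_g \varphi + R_g \, \varphi \right)
  = \varphi^{-4} R_g \;\ge\; 0 ,
\]
% using \Delta_g \varphi = 0 and R_g \ge 0.
%
% (b) Mass. In the end E, multiply the two expansions
%   \varphi = 1 - C(E)/|x| + O(|x|^{-2}),   u = 1 + m/(2|x|) + O(|x|^{-2}):
\begin{align*}
  \tilde u = \varphi \, u
  &= \left( 1 - \frac{C(E)}{|x|} \right) \left( 1 + \frac{m}{2|x|} \right) + O(|x|^{-2}) \\
  &= 1 + \frac{\tfrac12 m - C(E)}{|x|} + O(|x|^{-2}) .
\end{align*}
% Comparing with \tilde u = 1 + \tilde m/(2|x|) + O(|x|^{-2}) gives
\[
  \tfrac12 \tilde m = \tfrac12 m - C(E)
  \quad\Longleftrightarrow\quad
  \tilde m = m - 2 C(E) .
\]
```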
Proof of the mass-capacity inequality, Theorem 9-32. Bray uses a reflection
argument inspired by the work of Bunting and Masood-ul-Alam [36] in order to
double M and reflect the metric smoothly across the boundary. The process is
depicted in Figure 11.
The main idea is to double $(M, g)$ using a specific construction. We glue
$\Sigma$-cylindrical tubes of the form $(\Sigma \times [0, 2\delta], G)$ onto the two copies of the inner
boundaries. We define coordinates $(z, t)$ on the tube $\Sigma \times [0, 2\delta]$, so that points
of the form $(z, 0)$ are glued to one copy of $M$, and points of the form $(z, 2\delta)$
land on the other copy. The difficulty of this process is to be able to define the
metric $G$ in such a way that the doubled manifold $(\tilde M_\delta, \tilde g_\delta)$ not only has a smooth
metric, but also has nonnegative scalar curvature. Furthermore, we want
the mass of either end of $\tilde M_\delta$ to be arbitrarily close to $m$.
In order to do this, Bray solves an ODE, as we see below. We start by defining
the following components of G:
where $\{\partial_t, \partial_{z_1}, \partial_{z_2}\}$ are the coordinate vector fields. We still need to define the
metric over all slices $\Sigma \times \{t\}$. To do this, consider the unique solution of the
ODE system
\[ \frac{d}{dt} G_{ij}(z, t) = 2 G_{ik}(z, t) A^k_j(z, t), \]
with smooth initial conditions $G_{ij}(z, 0) = g_{ij}(z, 0)$, where $A_{ij}(z, 0)$ is the second
fundamental form of $\Sigma$ in $M$ for $i, j$ indexing $\{\partial_{z_1}, \partial_{z_2}\}$, and where $A^k_j(z, t) =
-A^k_j(z, 2\delta - t)$ extends smoothly to $\tilde M_\delta$. Then, using this definition of $G$, the
second fundamental form of each slice is automatically $A_{ij}(z, t)$, and $G$ is
symmetric about $t = \delta$; i.e., $G_{ij}(z, t) = G_{ij}(z, 2\delta - t)$. Bray then conformally
deforms the metric once more, and obtains a metric on $\tilde M_\delta$ whose scalar curvature
is nonnegative and whose mass $\tilde m_\delta$ converges to $m$ as $\delta \to 0^+$.
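The symmetry of $G$ about $t = \delta$ follows from the antisymmetry of $A$ together with uniqueness for linear ODE. The sketch below fixes a point $z$ and writes $\hat G_{ij}(t) := G_{ij}(z, 2\delta - t)$; the name $\hat G$ is our own, introduced only for this check.

```latex
% Fix z and set \hat G_{ij}(t) := G_{ij}(z, 2\delta - t).
% Using the ODE for G and the antisymmetry A^k_j(z, 2\delta - t) = -A^k_j(z, t):
\begin{align*}
  \frac{d}{dt} \hat G_{ij}(t)
  &= -\,(\partial_t G_{ij})(z, 2\delta - t) \\
  &= -\,2\, G_{ik}(z, 2\delta - t)\, A^k_j(z, 2\delta - t) \\
  &= 2\, \hat G_{ik}(t)\, A^k_j(z, t) .
\end{align*}
% So \hat G solves the same linear ODE as G. The two solutions agree at t = \delta:
\[
  \hat G_{ij}(\delta) = G_{ij}(z, \delta) .
\]
% By uniqueness of solutions of linear ODE, \hat G = G on [0, 2\delta], that is,
\[
  G_{ij}(z, t) = G_{ij}(z, 2\delta - t) .
\]
```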
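Finally, let $\phi$ be the function that realizes $C(\Sigma)$, so that
\[
\begin{aligned}
\lim_{x \to \infty} \phi(x) &= 1, \\
\Delta_g \phi &= 0 \quad \text{in } M, \\
\phi(x) &= 0 \quad \text{on } \Sigma.
\end{aligned}
\]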
and the fact that $u_0(x) \equiv 1$, we obtain $a(0) = 1$, $b(0) = 0$, $a'(0) = -1$, and
$b'(0) = C(\Sigma_0)$. From this, it follows that
\[ m'(0) = 2C(\Sigma_0) - 2m(0) \le 0, \]
as desired.
Step 4. The last step is to show that (M, g(t)) converges (in some sense) to
the Riemannian Schwarzschild metric as t → ∞. Bray is able to show that the
rescaled horizon converges to a coordinate sphere of radius $\frac{m}{2}$ in $\mathbb{R}^3$ and that the
rescaled solution of the flow $\tilde u_t$ converges to $1 + \frac{m}{2|x|}$ as $t \to \infty$. The interested
reader is directed to Bray’s paper for more details. □
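The limiting configuration in Step 4 is the Riemannian Schwarzschild metric in isotropic coordinates, and one can check directly that its horizon sits at $|x| = m/2$ and saturates the Penrose inequality. The computation below is an illustrative check assuming the form $g = (1 + \frac{m}{2|x|})^4 \delta$; the area function $\mathcal{A}(r)$ is our own notation.

```latex
% Assumed model: g = (1 + m/(2|x|))^4 \delta on R^3 \setminus {0}.
% Area of the coordinate sphere {|x| = r} with respect to g:
\[
  \mathcal{A}(r) = 4\pi r^2 \left( 1 + \frac{m}{2r} \right)^4 .
\]
% Differentiate in r:
\begin{align*}
  \mathcal{A}'(r)
  &= 4\pi \left( 1 + \frac{m}{2r} \right)^3
     \left[ 2r \left( 1 + \frac{m}{2r} \right) - 2m \right] \\
  &= 4\pi \left( 1 + \frac{m}{2r} \right)^3 (2r - m) .
\end{align*}
% By spherical symmetry the mean curvature of {|x| = r} is a positive multiple
% of \mathcal{A}'(r)/\mathcal{A}(r), so the sphere is minimal exactly when r = m/2. There,
\[
  \mathcal{A}\!\left( \tfrac{m}{2} \right)
  = 4\pi \cdot \frac{m^2}{4} \cdot 2^4 = 16\pi m^2 ,
  \qquad\text{so}\qquad
  m = \sqrt{\frac{\mathcal{A}(m/2)}{16\pi}} .
\]
```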
References
[1] R. Abraham, J. E. Marsden, and T. Ratiu, Manifolds, tensor analysis, and applications, 2nd
ed., Applied Math. Sciences 75, Springer, New York, 1988.
[2] R. A. Adams, Sobolev spaces, Pure and Applied Math. 65, Academic Press, San Diego, 1975.
[3] S. Agmon, A. Douglis, and L. Nirenberg, “Estimates near the boundary for solutions of elliptic
partial differential equations satisfying general boundary conditions, II”, Comm. Pure Appl. Math.
17 (1964), 35–92.
[4] A. D. Alexandrov, “On the theory of mixed volumes, II: New inequalities between mixed
volumes and their applications”, Mat. Sbornik (N.S.) 2 (1937), 1205–1238. In Russian.
[5] A. D. Alexandrov, “On the theory of mixed volumes, III: Extension of two theorems of
Minkowski on convex polyhedra to arbitrary convex surfaces”, Mat. Sbornik (N.S.) 3 (1938),
27–46. In Russian.
[6] J. Anderson, J. Corvino, and F. Pasqualotto, “Multi-localized time-symmetric initial data for
the Einstein vacuum equations”, J. Reine Angew. Math. 808 (2024), 67–110.
[7] L. Andersson and V. Moncrief, “Elliptic-hyperbolic systems and the Einstein equations”, Ann.
Henri Poincaré 4:1 (2003), 1–34.
[8] V. I. Arnold, Mathematical methods of classical mechanics, Graduate Texts in Math. 60,
Springer, New York, 1978.
[9] R. Arnowitt, S. Deser, and C. W. Misner, “Coordinate invariance and energy expressions in
general relativity”, Phys. Rev. (2) 122 (1961), 997–1006.
[10] R. Arnowitt, S. Deser, and C. W. Misner, “The dynamics of general relativity”, pp. 227–265
in Gravitation: an introduction to current research, edited by L. Witten, Wiley, New York, 1962.
[11] A. Ashtekar, B. K. Berger, J. Isenberg, and M. MacCallum (editors), General relativity and
gravitation: a centennial perspective, Cambridge Univ. Press, Cambridge, 2015.
[12] S. Axler, P. Bourdon, and W. Ramey, Harmonic function theory, Graduate Texts in Mathe-
matics 137, Springer, 1992.
[13] J. L. Barbosa and M. do Carmo, “Stability of hypersurfaces with constant mean curvature”,
Math. Z. 185:3 (1984), 339–353.
[14] J. L. Barbosa, M. do Carmo, and J. Eschenburg, “Stability of hypersurfaces of constant mean
curvature in Riemannian manifolds”, Math. Z. 197:1 (1988), 123–138.
[15] R. Bartnik, “The mass of an asymptotically flat manifold”, Comm. Pure Appl. Math. 39:5
(1986), 661–693.
[16] R. Bartnik, “Phase space for the Einstein equations”, Comm. Anal. Geom. 13:5 (2005),
845–885.
[17] R. Bartnik and J. Isenberg, “The constraint equations”, pp. 1–38 in The Einstein equations
and the large scale behavior of gravitational fields, edited by P. T. Chruściel and H. Friedrich,
Birkhäuser, Basel, 2004.
[18] R. Beig and N. Ó Murchadha, “The Poincaré group as the symmetry group of canonical
general relativity”, Ann. Physics 174:2 (1987), 463–498.
[19] R. Beig, P. T. Chruściel, and R. Schoen, “KIDs are non-generic”, Ann. Henri Poincaré 6:1
(2005), 155–194.
[20] M. Berger and D. Ebin, “Some decompositions of the space of symmetric tensors on a
Riemannian manifold”, J. Differential Geometry 3 (1969), 379–392.
[21] A. N. Bernal and M. Sánchez, “Globally hyperbolic spacetimes can be defined as ‘causal’
instead of ‘strongly causal’”, Classical Quantum Gravity 24:3 (2007), 745–749.
[22] A. L. Besse, Einstein manifolds, Ergebnisse der Math. (3) 10, Springer, Berlin, 1987.
[23] E. Bonning, P. Marronetti, D. Neilsen, and R. Matzner, “Physics and initial data for multiple
black hole spacetimes”, Phys. Rev. D (3) 68:4 (2003), 044019, 17.
[24] J.-P. Bourguignon, D. G. Ebin, and J. E. Marsden, “Sur le noyau des opérateurs pseudo-
différentiels à symbole surjectif et non injectif”, C. R. Acad. Sci. Paris Sér. A-B 282:16 (1976),
Aii, A867–A870.
[25] H. L. Bray, The Penrose inequality in general relativity and volume comparison theorems
involving scalar curvature, Ph.D. thesis, Stanford University, 1997, https://www.proquest.com/
docview/304386501. arXiv 0902.3241
[26] H. L. Bray, “Proof of the Riemannian Penrose inequality using the positive mass theorem”, J.
Differential Geom. 59:2 (2001), 177–267.
[27] H. L. Bray, “On dark matter, spiral galaxies, and the axioms of general relativity”, pp. 1–64
in Geometric analysis, mathematical relativity, and nonlinear partial differential equations,
Contemp. Math. 599, Amer. Math. Soc., Providence, RI, 2013.
[28] H. L. Bray and D. A. Lee, “On the Riemannian Penrose inequality in dimensions less than
eight”, Duke Math. J. 148:1 (2009), 81–106.
[29] H. Bray and F. Morgan, “An isoperimetric comparison theorem for Schwarzschild space and
other manifolds”, Proc. Amer. Math. Soc. 130:5 (2002), 1467–1472.
[30] H. L. Bray and A. R. Parry, “Modeling wave dark matter in dwarf spheroidal galaxies”, J.
Phy. Conf. Series 615 (2015), art. id. 012001.
[31] S. Brendle, “Constant mean curvature surfaces in warped product manifolds”, Publ. Math.
Inst. Hautes Études Sci. 117 (2013), 247–269.
[32] S. Brendle and M. Eichmair, “Isoperimetric and Weingarten surfaces in the Schwarzschild
manifold”, J. Differential Geom. 94:3 (2013), 387–407.
[33] S. Brendle and M. Eichmair, “Large outlying stable constant mean curvature spheres in initial
data sets”, Invent. Math. 197:3 (2014), 663–682.
[34] S. Brendle and F. C. Marques, “Scalar curvature rigidity of geodesic balls in $S^n$”, J. Differential
Geom. 88:3 (2011), 379–394.
[35] S. Brendle, F. C. Marques, and A. Neves, “Deformations of the hemisphere that increase
scalar curvature”, Invent. Math. 185:1 (2011), 175–197.
[36] G. L. Bunting and A. K. M. Masood-ul-Alam, “Nonexistence of multiple black holes in
asymptotically Euclidean static vacuum space-time”, Gen. Relativity Gravitation 19:2 (1987),
147–154.
[37] M. Cai and G. J. Galloway, “Rigidity of area minimizing tori in 3-manifolds of nonnegative
scalar curvature”, Comm. Anal. Geom. 8:3 (2000), 565–573.
[38] J. J. Callahan, The geometry of spacetime: an introduction to special and general relativity,
Springer, New York, 2000.
[39] M. Cantor, “Elliptic operators and the decomposition of tensor fields”, Bull. Amer. Math. Soc.
(N.S.) 5:3 (1981), 235–262.
[40] M. Carfora and A. Marzuoli, Einstein constraints and Ricci flow: a geometrical averaging of
initial data sets, Springer, Singapore, 2023.
[41] M. P. do Carmo, Riemannian geometry, Birkhäuser, Boston, 1992.
[42] S. Carroll, Spacetime and geometry: An introduction to general relativity, Addison Wesley,
San Francisco, 2004.
[43] C. Cederbaum and C. Nerz, “Explicit Riemannian manifolds with unexpectedly behaving
center of mass”, Ann. Henri Poincaré 16:7 (2015), 1609–1631.
[44] A. Chaljub-Simon, “Systèmes elliptiques linéaires dans des espaces de fonctions höldériennes
à poids”, Rend. Circ. Mat. Palermo (2) 30:2 (1981), 300–310.
[45] A. Chaljub-Simon and Y. Choquet-Bruhat, “Problèmes elliptiques du second ordre sur une
variété euclidienne à l’infini”, Ann. Fac. Sci. Toulouse Math. (5) 1:1 (1979), 9–25.
[46] P.-Y. Chan and L.-F. Tam, “A note on center of mass”, Comm. Anal. Geom. 24:3 (2016),
471–486.
[47] P.-N. Chen, L.-H. Huang, M.-T. Wang, and S.-T. Yau, “On the validity of the definition of
angular momentum in general relativity”, Ann. Henri Poincaré 17:2 (2016), 253–270.
[48] P.-N. Chen, M.-T. Wang, and S.-T. Yau, “Conserved quantities in general relativity: from the
quasi-local level to spatial infinity”, Comm. Math. Phys. 338:1 (2015), 31–80.
[49] Y. Choquet-Bruhat (as Fourès-Bruhat), “Théorème d’existence pour certains systèmes
d’équations aux dérivées partielles non linéaires”, Acta Math. 88 (1952), 141–225.
[50] Y. Choquet-Bruhat, “New elliptic system and global solutions for the constraints equations in
general relativity”, Comm. Math. Phys. 21 (1971), 211–218.
[51] Y. Choquet-Bruhat, General relativity and the Einstein equations, Oxford Univ. Press, 2009.
[52] Y. Choquet-Bruhat and J. W. York, Jr., “The Cauchy problem”, pp. 99–172 in General
relativity and gravitation, vol. 1, edited by A. Held, Plenum Press, New York, 1980.
[53] B. Chow and R. Gulliver, “Aleksandrov reflection and geometric evolution of hypersurfaces”,
Comm. Anal. Geom. 9:2 (2001), 261–280.
[54] D. Christodoulou, “Global solutions of nonlinear hyperbolic equations for small initial data”,
Comm. Pure Appl. Math. 39:2 (1986), 267–282.
[55] D. Christodoulou and S. Klainerman, The global nonlinear stability of the Minkowski space,
Princeton Mathematical Series 41, Princeton University Press, 1993.
[56] D. Christodoulou and N. Ó Murchadha, “The boost problem in general relativity”, Comm.
Math. Phys. 80:2 (1981), 271–300.
[57] P. T. Chruściel, “Boundary conditions at spatial infinity from a Hamiltonian point of view”,
pp. 49–59 in Topological properties and global structure of space-time (Erice, 1985), NATO Adv.
Sci. Inst. Ser. B Phys. 138, Plenum, New York, 1986.
[58] P. T. Chruściel, “Mass and angular-momentum inequalities for axi-symmetric initial data sets.
I. Positivity of mass”, Ann. Physics 323:10 (2008), 2566–2590.
[59] P. T. Chruściel, “Elements of causality theory”, preprint, 2011. arXiv 1110.6706v1
[60] P. T. Chruściel and J. L. Costa, “Mass, angular-momentum and charge inequalities for
axisymmetric initial data”, Classical Quantum Gravity 26:23 (2009), 235013, 7.
[61] P. T. Chruściel and E. Delay, On mapping properties of the general relativistic constraints
operator in weighted function spaces, with applications, Mém. Soc. Math. Fr. (N.S.) 94, 2003.
[62] P. T. Chruściel and J. D. E. Grant, “On Lorentzian causality with continuous metrics”,
Classical Quantum Gravity 29:14 (2012), art. id. 145001.
[63] P. T. Chruściel, Y. Li, and G. Weinstein, “Mass and angular-momentum inequalities for
axi-symmetric initial data sets. II. Angular momentum”, Ann. Physics 323:10 (2008), 2591–2613.
[64] P. T. Chruściel, G. J. Galloway, and D. Pollack, “Mathematical general relativity: a sampler”,
Bull. Amer. Math. Soc. (N.S.) 47:4 (2010), 567–638.
[65] G. B. Cook, “Initial data for numerical relativity”, Living Rev. Relativ. 3 (2000), art. id.
2000–5.
[66] J. Corvino, “Scalar curvature deformation and a gluing construction for the Einstein constraint
equations”, Comm. Math. Phys. 214:1 (2000), 137–189.
[67] J. Corvino and L.-H. Huang, “Localized deformation for initial data sets with the dominant
energy condition”, Calc. Var. Partial Differential Equations 59:1 (2020), Paper No. 42, 43.
[68] J. Corvino and D. Pollack, “Scalar curvature and the Einstein constraint equations”, pp. 145–
188 in Surveys in geometric analysis and relativity, Adv. Lect. Math. (ALM) 20, International
Press, Somerville, MA, 2011.
[69] J. Corvino and R. M. Schoen, “On the asymptotics for the vacuum Einstein constraint
equations”, J. Differential Geom. 73:2 (2006), 185–217.
[70] J. Corvino, M. Eichmair, and P. Miao, “Deformation of scalar curvature and volume”, Math.
Ann. 357:2 (2013), 551–584.
[71] S. Dain, “Proof of the angular momentum-mass inequality for axisymmetric black holes”, J.
Differential Geom. 79:1 (2008), 33–67.
[72] C. De Lellis and S. Müller, “Optimal rigidity estimates for nearly umbilical surfaces”, J.
Differential Geom. 69:1 (2005), 75–110.
[73] D. M. DeTurck, “Existence of metrics with prescribed Ricci curvature: local theory”, Invent.
Math. 65:1 (1981/82), 179–207.
[74] D. M. DeTurck, “Deforming metrics in the direction of their Ricci tensors”, J. Differential
Geom. 18:1 (1983), 157–162.
[75] B. S. DeWitt, “Quantum theory of gravity, I: The canonical theory”, Phys. Rev. 160:5 (1967),
1113–1148.
[76] A. Douglis and L. Nirenberg, “Interior estimates for elliptic systems of partial differential
equations”, Comm. Pure Appl. Math. 8 (1955), 503–538.
[77] M. Eichmair, “The Jang equation reduction of the spacetime positive energy theorem in
dimensions less than eight”, Comm. Math. Phys. 319:3 (2013), 575–593.
[78] M. Eichmair, L.-H. Huang, D. A. Lee, and R. Schoen, “The spacetime positive mass theorem
in dimensions less than eight”, J. Eur. Math. Soc. 18:1 (2016), 83–121.
[79] M. Eichmair and J. Metzger, “On large volume preserving stable CMC surfaces in initial data
sets”, J. Differential Geom. 91:1 (2012), 81–102.
[80] M. Eichmair and J. Metzger, “Large isoperimetric surfaces in initial data sets”, J. Differential
Geom. 94:1 (2013), 159–186.
[81] M. Eichmair and J. Metzger, “Unique isoperimetric foliations of asymptotically flat manifolds
in all dimensions”, Invent. Math. 194:3 (2013), 591–630.
[82] A. Einstein, “Zur Elektrodynamik bewegter Körper”, Ann. der Physik 17 (1905), 891–
921. Translated as “On the electrodynamics of moving bodies”, pp. 35–65 in The principle
of relativity: a collection of original memoirs on the special and general theory of relativity,
Methuen, London, 1923; reprinted Dover, New York, 1952. Essentially the same text is posted at
http://www.fourmilab.ch/etexts/einstein/specrel/www/. A new translation appeared in pp. 123–
160 of [212].
[83] A. Einstein, “Die Grundlage der allgemeinen Relativitätstheorie”, Ann. der Physik 49:7
(1916), 769–822. Translated as “The foundation of the general theory of relativity”, pp. 109–164
in The principle of relativity: a collection of original memoirs on the special and general theory
of relativity, Methuen, London, 1923; reprinted Dover, New York, 1952.
[84] A. Einstein, Relativity: the special and the general theory, Methuen, London, 1920. Reprinted
by Crown Publishers, New York, 1961.
[85] A. Einstein, The meaning of relativity, Princeton Univ. Press, Princeton, 1922.
[86] L. C. Evans, Partial differential equations, Graduate Studies in Math. 19, Amer. Math. Soc.,
Providence, RI, 1998.
[87] C. F. W. Everitt et al., “The Gravity Probe B test of general relativity”, Classical Quantum
Gravity 32:22 (2015), art. id. 224001.
[88] A. E. Fischer and J. E. Marsden, “Linearization stability of the Einstein equations”, Bull.
Amer. Math. Soc. 79 (1973), 997–1003.
[89] A. E. Fischer and J. E. Marsden, “Deformations of the scalar curvature”, Duke Math. J. 42:3
(1975), 519–547.
[90] A. E. Fischer and J. E. Marsden, The initial value problem and the dynamical formulation of
general relativity, edited by S. W. Hawking and W. Israel, Cambridge Univ. Press, Cambridge,
1979.
[91] A. E. Fischer and J. A. Wolf, “The structure of compact Ricci-flat Riemannian manifolds”, J.
Differential Geometry 10 (1975), 277–288.
[92] D. Fischer-Colbrie and R. Schoen, “The structure of complete stable minimal surfaces in
3-manifolds of nonnegative scalar curvature”, Comm. Pure Appl. Math. 33:2 (1980), 199–211.
[93] G. B. Folland, Introduction to partial differential equations, 2nd ed., Princeton University
Press, 1995.
[94] T. Frankel, Gravitational curvature: An introduction to Einstein’s theory, W. H. Freeman,
San Francisco, 1979.
[95] A. Freire and F. Schwartz, “Mass-capacity inequalities for conformally flat manifolds with
boundary”, Comm. PDE 39 (2014), 98–119.
[96] A. P. French, Special relativity, W. W. Norton, New York, 1968.
[97] H. Friedrich, “On the existence of n-geodesically complete or future complete solutions of
Einstein’s field equations with smooth asymptotic structure”, Comm. Math. Phys. 107:4 (1986),
587–609.
[98] K. O. Friedrichs, “The identity of weak and strong extensions of differential operators”,
Trans. Amer. Math. Soc. 55 (1944), 132–151.
[99] G. J. Galloway, “Least area tori, black holes and topological censorship”, pp. 113–123 in
Differential geometry and mathematical physics (Vancouver, 1993), edited by J. K. Beem and
K. L. Duggal, Contemp. Math. 170, Amer. Math. Soc., Providence, RI, 1994.
[100] G. J. Galloway, “Stability and rigidity of extremal surfaces in Riemannian geometry and
general relativity”, pp. 221–239 in Surveys in geometric analysis and relativity, edited by H. L.
Bray and W. P. Minicozzi, Adv. Lect. Math. (ALM) 20, International Press, Somerville, MA,
2011.
[101] G. J. Galloway, Notes on Lorentzian causality: ESI-EMS-IAMP Summer School on Mathe-
matical Relativity (Vienna, 2014), 2014. http://hdl.handle.net/10385/2167.
[102] C. Gerhardt, “Flow of nonconvex hypersurfaces into spheres”, J. Diff. Geom. 32 (1990),
299–314.
[103] R. Geroch, “What is a singularity in general relativity?”, Ann. Phys. 48:3 (1968), 526–540.
[104] R. Geroch, “Domain of dependence”, J. Mathematical Phys. 11 (1970), 437–449.
[105] R. Geroch, “Energy extraction”, Ann. New York Acad. Sci. 224 (1973), 108–117.
[106] G. W. Gibbons, “The time symmetric initial value problem for black holes”, Comm. Math.
Phys. 27 (1972), 87–102.
[107] D. Gilbarg and N. S. Trudinger, Elliptic partial differential equations of second order, 2nd
ed., Grundlehren der Math. Wiss. 224, Springer, Berlin, 1983.
[108] E. Giusti, Minimal surfaces and functions of bounded variation, Monographs in Math. 80,
Birkhäuser, Basel, 1984.
[109] A. Gray, Tubes, Addison-Wesley, Redwood City, CA, 1990.
[110] O. Grøn, “Space geometry in rotating frames: a historical appraisal”, pp. 285–334 in
Relativity in rotating frames: relativistic physics in rotating reference frames, edited by G. Rizzi
and M. L. Ruggiero, Kluwer, Dordrecht, 2004.
[111] Q. Han, A basic course in partial differential equations, Graduate Studies in Mathematics
120, American Mathematical Society, 2011.
[112] S. W. Hawking and G. F. R. Ellis, The large scale structure of space-time, Cambridge
Monog. Math. Phys. 1, Cambridge Univ. Press, London, 1973.
[113] S. W. Hawking and G. T. Horowitz, “The gravitational Hamiltonian, action, entropy and
surface terms”, Classical Quantum Gravity 13:6 (1996), 1487–1498.
[114] L. Hörmander, “Pseudo-differential operators and non-elliptic boundary problems”, Ann. of
Math. (2) 83 (1966), 129–209.
[115] L. Hörmander, The analysis of linear partial differential operators, I: Distribution theory
and Fourier analysis, Grundlehren der Math. Wiss. 256, Springer, 1990.
[116] L.-H. Huang, “On the center of mass of isolated systems with general asymptotics”, Classi-
cal Quantum Gravity 26:1 (2009), 015012, 25.
[117] L.-H. Huang, “Foliations by stable spheres with constant mean curvature for isolated systems
with general asymptotics”, Comm. Math. Phys. 300:2 (2010), 331–373.
[118] L.-H. Huang, “Solutions of special asymptotics to the Einstein constraint equations”,
Classical Quantum Gravity 27:24 (2010), 245002, 10.
[119] L.-H. Huang, “On the center of mass in general relativity”, pp. 575–591 in Fifth International
Congress of Chinese Mathematicians, AMS/IP Stud. Adv. Math. 51, pt. 1, Amer. Math. Soc.,
2012.
[120] L.-H. Huang and D. A. Lee, “Equality in the spacetime positive mass theorem”, Comm.
Math. Phys. 376:3 (2020), 2379–2407.
[121] L.-H. Huang and D. Wu, “The equality case of the Penrose inequality for asymptotically
flat graphs”, Trans. Amer. Math. Soc. 367:1 (2015), 31–47.
[122] L.-H. Huang, R. Schoen, and M.-T. Wang, “Specifying angular momentum and center of
mass for vacuum initial data sets”, Comm. Math. Phys. 306:3 (2011), 785–803.
[123] G. Huisken, “Flow by mean curvature of convex surfaces into spheres”, J. Differential Geom.
20:1 (1984), 237–266.
[124] G. Huisken, “The volume preserving mean curvature flow”, J. Reine Angew. Math. 382
(1987), 35–48.
[125] G. Huisken and T. Ilmanen, “The inverse mean curvature flow and the Riemannian Penrose
inequality”, J. Differential Geom. 59:3 (2001), 353–437.
[126] G. Huisken and S.-T. Yau, “Definition of center of mass for isolated physical systems and
unique foliations by stable spheres with constant mean curvature”, Invent. Math. 124:1-3 (1996),
281–311.
[127] J. Isenberg, “Constant mean curvature solutions of the Einstein constraint equations on
closed manifolds”, Classical Quantum Gravity 12:9 (1995), 2249–2274.
[128] J. Isenberg and V. Moncrief, “A set of nonconstant mean curvature solutions of the Einstein
constraint equations on closed manifolds”, Classical Quantum Gravity 13:7 (1996), 1819–1847.
[129] J. Isenberg, R. Mazzeo, and D. Pollack, “On the topology of vacuum spacetimes”, Ann.
Henri Poincaré 4:2 (2003), 369–383.
[130] P. S. Jang, “On the positive energy conjecture”, J. Mathematical Phys. 17:1 (1976), 141–145.
[131] P. S. Jang and R. M. Wald, “The positive energy conjecture and the cosmic censor hypothesis”,
J. Math. Phys. 18 (1977), 41–44.
[132] T. Kaluza, “Zur Relativitätstheorie”, Physikal. Z. 11 (1910), 977–978. Translated in
https://en.wikisource.org/wiki/Translation:On_the_Theory_of_Relativity_(Kaluza).
[133] N. Kapouleas, “Constant mean curvature surfaces constructed by fusing Wente tori”, Invent.
Math. 119:3 (1995), 443–518.
[134] J. L. Kazdan and F. W. Warner, “A direct approach to the determination of Gaussian and
scalar curvature functions”, Invent. Math. 28 (1975), 227–230.
[135] J. L. Kazdan and F. W. Warner, “Existence and conformal deformation of metrics with
prescribed Gaussian and scalar curvatures”, Ann. of Math. (2) 101 (1975), 317–331.
[136] J. L. Kazdan and F. W. Warner, “Prescribing curvatures”, pp. 309–319 in Differential
geometry (Stanford, 1973), vol. 2, Proc. Sympos. Pure Math. 27, 1975.
[137] D. Kleppner and R. J. Kolenkow, An introduction to mechanics, McGraw Hill, New York,
1973.
[138] M.-K. G. Lam, “The graph cases of the Riemannian positive mass and Penrose inequalities
in all dimensions”, preprint, 2010. arXiv 1010.4256v1
[139] H. B. Lawson, Jr. and M.-L. Michelsohn, Spin geometry, Princeton Math. Series 38,
Princeton Univ. Press, 1989.
[140] J. M. Lee, Riemannian manifolds: an introduction to curvature, Graduate Texts in Math.
176, Springer, New York, 1997.
[141] J. M. Lee, Introduction to smooth manifolds, Graduate Texts in Math. 218, Springer, New
York, 2003.
[142] D. A. Lee, Geometric relativity, Graduate Studies in Mathematics 201, Amer. Math. Soc.,
Providence, RI, 2019.
[143] J. M. Lee and T. H. Parker, “The Yamabe problem”, Bull. Amer. Math. Soc. (N.S.) 17:1
(1987), 37–91.
[144] G. Leoni, A first course in Sobolev spaces, Graduate Studies in Math. 105, Amer. Math.
Soc., Providence, RI, 2009.
[168] L. Nirenberg and H. F. Walker, “The null spaces of elliptic partial differential operators
in $\mathbb{R}^n$”, J. Math. Anal. Appl. 42 (1973), 271–301. Collection of articles dedicated to Salomon
Bochner.
[169] J. Norton, “What was Einstein’s principle of equivalence?”, Stud. Hist. Philos. Sci. 16:3
(1985), 203–246.
[170] J. D. Norton, “General covariance and the foundations of general relativity: eight decades
of dispute”, Rep. Progr. Phys. 56:7 (1993), 791–858.
[171] J. D. Norton, “Mach’s principle before Einstein”, pp. 9–57 in Mach’s principle: from
Newton’s bucket to quantum gravity, edited by J. Barbour and H. Pfister, Einstein Studies 6,
Birkhäuser, 1995.
[172] N. Ó Murchadha and J. W. York, Jr., “Initial-value problem of general relativity, I: General
formulation and physical interpretation”, Phys. Rev. D (3) 10 (1974), 428–436.
[173] M. Obata, “Certain conditions for a Riemannian manifold to be isometric with a sphere”, J.
Math. Soc. Japan 14 (1962), 333–340.
[174] B. O’Neill, Semi-Riemannian geometry, with applications to relativity, Pure and Applied
Math. 103, Academic Press, New York, 1983.
[175] T. Parker and C. H. Taubes, “On Witten’s proof of the positive energy theorem”, Comm.
Math. Phys. 84:2 (1982), 223–238.
[176] R. Penrose, “Asymptotic properties of fields and space-times”, Phys. Rev. Lett. 10 (1963),
66–68.
[177] R. Penrose, “Gravitational collapse and space-time singularities”, Phys. Rev. Lett. 14 (1965),
57–59.
[178] R. Penrose, “Gravitational collapse: the role of general relativity”, Nuovo Cimento 1:(numero
speciale) (1969), 252–276.
[179] R. Penrose, Techniques of differential topology in relativity, CBMS Regional Conf. Series
Appl. Math. 7, Society for Industrial and Applied Mathematics, Philadelphia, 1972.
[180] R. Penrose, “Naked singularities”, Ann. New York Acad. Sci. 224 (1973), 125–134.
[181] R. Penrose, “Some unresolved problems in classical general relativity”, pp. 631–668 in
Seminar on Differential Geometry, edited by S.-T. Yau, Annals of Math. Stud. 102, Princeton
University Press, Princeton, NJ, 1982.
[182] P. Petersen, Riemannian geometry, 2nd ed., Graduate Texts in Math. 171, Springer, 2006.
[183] G. Pólya and G. Szegö, Isoperimetric inequalities in mathematical physics, Annals of
Mathematics Studies 27, Princeton University Press, Princeton, NJ, 1951.
[184] J. Qing and G. Tian, “On the uniqueness of the foliation of spheres of constant mean
curvature in asymptotically flat 3-manifolds”, J. Amer. Math. Soc. 20:4 (2007), 1091–1110
(electronic).
[185] J. Qing and W. Yuan, “On scalar curvature rigidity of vacuum static spaces”, Math. Ann.
365:3-4 (2016), 1257–1277.
[186] W. Qiu, “Interior regularity of solutions to the isotropically constrained Plateau problem”,
Comm. Anal. Geom. 11:5 (2003), 945–986.
[187] T. Regge and C. Teitelboim, “Role of surface integrals in the Hamiltonian formulation of
general relativity”, Ann. Physics 88 (1974), 286–318.
[188] R. Resnick, Introduction to special relativity, Wiley, 1968.
[189] W. Rindler, Introduction to special relativity, 2nd ed., Oxford Univ. Press, New York, 1991.
[190] H. Ringström, The Cauchy problem in general relativity, European Mathematical Society,
Zürich, 2009. Errata at https://people.kth.se/~hansr/errata.html.
[191] H. Ringström, On the topology and future stability of the universe, Oxford Univ. Press,
2013.
[192] H. P. Robertson, “Postulate versus observation in the special theory of relativity”, Rev.
Modern Phys. 21 (1949), 378–382.
[193] H. L. Royden, Real analysis, 3rd ed., Macmillan, New York, 1988.
[194] W. Rudin, Real and complex analysis, 3rd ed., McGraw-Hill, New York, 1987.
[195] W. Rudin, Functional analysis, 2nd ed., McGraw-Hill, New York, 1991.
[196] R. Schoen, “Conformal deformation of a Riemannian metric to constant scalar curvature”, J.
Differential Geom. 20:2 (1984), 479–495.
[197] R. M. Schoen, “Variational theory for the total scalar curvature functional for Riemannian
metrics and related topics”, pp. 120–154 in Topics in calculus of variations (Montecatini Terme,
1987), Lecture Notes in Math. 1365, Springer, 1989.
[198] R. Schoen and S. T. Yau, “Existence of incompressible minimal surfaces and the topology of
three-dimensional manifolds with nonnegative scalar curvature”, Ann. of Math. (2) 110:1 (1979),
127–142.
[199] R. Schoen and S. T. Yau, “On the proof of the positive mass conjecture in general relativity”,
Comm. Math. Phys. 65:1 (1979), 45–76.
[200] R. Schoen and S. T. Yau, “On the structure of manifolds with positive scalar curvature”,
Manuscripta Math. 28:1-3 (1979), 159–183.
[201] R. M. Schoen and S. T. Yau, “Complete manifolds with nonnegative scalar curvature and
the positive action conjecture in general relativity”, Proc. Nat. Acad. Sci. U.S.A. 76:3 (1979),
1024–1025.
[202] R. Schoen and S. T. Yau, “The energy and the linear momentum of space-times in general
relativity”, Comm. Math. Phys. 79:1 (1981), 47–51.
[203] R. Schoen and S. T. Yau, “Proof of the positive mass theorem, II”, Comm. Math. Phys. 79:2
(1981), 231–260.
[204] R. Schoen and S.-T. Yau, Lectures on differential geometry, Conference Proceedings and
Lecture Notes in Geometry and Topology, I, International Press, Cambridge, MA, 1994.
[205] R. Schoen and S.-T. Yau, “Positive scalar curvature and minimal hypersurface singularities”,
pp. 441–480 in Surveys in differential geometry, 2019: Differential geometry, Calabi–Yau
theory, and general relativity, vol. 2, Surv. Differ. Geom. 24, International Press, Boston, 2022.
arXiv 1704.05490
[206] R. Schoen and X. Zhou, “Convexity of reduced energy and mass angular momentum
inequalities”, Ann. Henri Poincaré 14:7 (2013), 1747–1773.
[207] B. F. Schutz, A first course in general relativity, Cambridge Univ. Press, Cambridge, 1990.
[208] J. M. M. Senovilla and D. Garfinkle, “The 1965 Penrose singularity theorem”, Classical
Quantum Gravity 32:12 (2015), 124008, 45.
[209] L. Simon, Theorems on regularity and singularity of energy minimizing maps, Birkhäuser,
Basel, 1996.
[210] L. Simon, “Schauder estimates by scaling”, Calc. Var. Partial Differential Equations 5:5
(1997), 391–407.
[211] B. Smith and G. Weinstein, “Quasiconvex foliations and asymptotically flat metrics of
non-negative scalar curvature”, Comm. Anal. Geom. 12:3 (2004), 511–551.
[212] J. Stachel, Einstein’s miraculous year: five papers that changed the face of physics, Princeton
Univ. Press, 1998.
[213] J. D. Streets, “Quasi-local mass functionals and generalized inverse mean curvature flow”,
Comm. Anal. Geom. 16:3 (2008), 495–537.
[214] M. E. Taylor, Partial differential equations, III: Nonlinear equations, Applied Math. Sciences
117, Springer, New York, 1997.
[215] P. Topping, “Relating diameter and mean curvature for submanifolds of Euclidean space”,
Comment. Math. Helv. 83:3 (2008), 539–546.
[216] J. Urbas, “On the expansion of starshaped hypersurfaces by symmetric functions of their
principal curvatures”, Math. Z. 205 (1990), 355–372.
[217] J. W. Vick, Homology theory: an introduction to algebraic topology, Pure and Applied
Math. 53, Academic Press, New York, 1973.
[218] R. M. Wald, General relativity, University of Chicago Press, 1984.
[219] F. W. Warner, Foundations of differentiable manifolds and Lie groups, Graduate Texts in
Mathematics 94, Springer-Verlag, New York-Berlin, 1983. Corrected reprint of the 1971 edition.
[220] R. H. Wasserman, Tensors and manifolds, with applications to mechanics and relativity,
Oxford Univ. Press, New York, 1992.
[221] S. Weinberg, Gravitation and cosmology, Wiley, New York, 1972.
[222] H. C. Wente, “Counterexample to a conjecture of H. Hopf”, Pacific J. Math. 121:1 (1986),
193–243.
[223] T. Willmore, Total curvature in Riemannian geometry, Wiley, 1982.
[224] E. Witten, “A new proof of the positive energy theorem”, Comm. Math. Phys. 80:3 (1981),
381–402.
[225] R. Ye, “Foliation by constant mean curvature spheres on asymptotically flat manifolds”, pp.
369–383 in Geometric analysis and the calculus of variations, International Press, Cambridge,
MA, 1996.
[226] J. W. York, Jr., “Gravitational degrees of freedom and the initial-value problem”, Phys. Rev.
Lett. 26 (1971), 1656–1658.
[227] J. W. York, Jr., “Role of conformal three-geometry in the dynamics of gravitation”, Phys.
Rev. Lett. 28:16 (1972), 1082–1085.
[228] J. W. York, Jr., “Conformally invariant orthogonal decomposition of symmetric tensors on
Riemannian manifolds and the initial-value problem of general relativity”, J. Mathematical Phys.
14 (1973), 456–464.
[229] J. W. York, Jr., “Boundary terms in the action principles of general relativity”, Found. Phys.
16:3 (1986), 249–257.
[230] W. Yuan, “Brown-York mass and compactly supported conformal deformations of scalar
curvature”, J. Geom. Anal. 27:1 (2017), 797–816.
[231] X. Zhang, “Angular momentum and positive mass theorem”, Comm. Math. Phys. 206:1
(1999), 137–155.
[232] X. Zhou, “Mass angular momentum inequality for axisymmetric vacuum data with small
trace”, Comm. Anal. Geom. 22:3 (2014), 519–571.
Index
lightcone, 8
Liouville theorem, 240
local rest frame, 22
Lorentz
  contraction, 6, 20
  force, 3, 32
  group, 16
  transformation, 9
    proper, 16
Lorenz gauge, 134
Mach’s principle, 59
Mainardi equation, 152
manifold, convention on, xix
mass, 238
  ADM, 252
  decay conditions, 264
  gravitational, 47
  Hawking, 366
  inertial, 47
  rest, 25
mass-capacity inequality, 378, 379
maximal hypersurface, 225
maximum principle, 186, 187
  strong, 239
  weak, 239
Maxwell’s equations, 14
mean
  curvature, 137, 324
  value property, 239
metric
  Kerr, 99
  Lorentzian, xx
  Minkowski, 16
  Riemannian, xx
  Schwarzschild, 88, 96, 97, 101–103
  semi-Riemannian, xx
  static vacuum, 98
minimal surface, 212
  stable, 212, 327
minimizing hull, 371
Minkowski metric/spacetime, 16
  conformal compactification, 34
momentarily comoving inertial frame, 22
momentum
  constraint, 140
  tensor, 144
Newtonian potential, 235, 268
normal
  connection, 223
  coordinates, xxiii
  Ricci curvature, 223
null vector, 16
nullcone, 8
observer, 2, 7
operator
  conformal Killing, 195
  constraints, 144
  elliptic, 173, 174
  formal adjoint, 175
  Fredholm, 185
  Jacobi, 212, 225
  Laplace, xxii, 39
  overdetermined-elliptic, 175
  quasilinear, 182
  self-adjoint elliptic, 184
  semilinear, 182
  shape, 161, 162, 227
  tidal force, 123
  underdetermined-elliptic, 175
  wave, xxii, 39, 75
outermost minimal surface, 357
overdetermined-elliptic, 175
parallel transport, 38
past-directed, 108
Penrose
  inequality, see RPI
  singularity theorem, 130
PMT, see positive mass theorem, 360
Poincaré group, 16
Poisson
  equation, 235
  kernel, 239
positive
  energy theorem, 285
  mass theorem, 285, 358
    for graphs, 360
    Riemannian, 285
precompact smooth domain, 291
principal
  curvatures, 162
  symbol, 174
principle of relativity, 2
proper Lorentz group/transformation, 16